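Agents

Introduction

Agents are PydanticAI's primary interface for interacting with LLMs.

In some use cases a single agent will control an entire application or component, but multiple agents can also interact to embody more complex workflows.

The Agent class has full API documentation, but conceptually you can think of an agent as a container for: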

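Component	Description
System prompt(s)	A set of instructions for the LLM written by the developer.
Function tool(s)	Functions that the LLM may call to get information while generating a response.
Structured result type	The structured datatype the LLM must return at the end of a run, if specified.
Dependency type constraint	System prompt functions, tools, and result validators may all use dependencies when they're run.
LLM model	Optional default LLM model associated with the agent. Can also be specified when running the agent.
Model settings	Optional default model settings to help fine-tune requests. Can also be specified when running the agent.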

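In typing terms, agents are generic in their dependency and result types: an agent that requires dependencies of type Foobar and returns results of type list[str] has type Agent[Foobar, list[str]]. In practice you shouldn't need to care about this. It just means your IDE can tell you when you have the right type, and if you choose to use static type checking, it should work well with PydanticAI.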
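To make the "generic in two types" idea concrete, here is a minimal stdlib-only sketch. MiniAgent, its handler field, and split_words are hypothetical names invented for illustration; they are not part of PydanticAI, and the handler simply stands in for the LLM.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

DepsT = TypeVar('DepsT')
ResultT = TypeVar('ResultT')


@dataclass
class MiniAgent(Generic[DepsT, ResultT]):
    """Hypothetical stand-in for an agent; not part of PydanticAI."""

    # A function that turns the dependency into a result (stands in for the LLM).
    handler: Callable[[DepsT], ResultT]

    def run_sync(self, deps: DepsT) -> ResultT:
        return self.handler(deps)


@dataclass
class Foobar:
    words: str


def split_words(deps: Foobar) -> list[str]:
    return deps.words.split()


# A type checker can verify this annotation against the handler's signature.
agent: MiniAgent[Foobar, list[str]] = MiniAgent(split_words)
result = agent.run_sync(Foobar('red black green'))
print(result)
#> ['red', 'black', 'green']
```

With this shape, a static type checker would flag a call such as agent.run_sync('not a Foobar'), which is the kind of feedback the generic parameters are meant to provide.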

Here's a toy example of an agent that simulates a roulette wheel:

roulette_wheel.py

from pydantic_ai import Agent, RunContext

roulette_agent = Agent(  # (1)!
    'openai:gpt-4o',
    deps_type=int,
    result_type=bool,
    system_prompt=(
        'Use the `roulette_wheel` function to see if the '
        'customer has won based on the number they provide.'
    ),
)


@roulette_agent.tool
async def roulette_wheel(ctx: RunContext[int], square: int) -> str:  # (2)!
    """check if the square is a winner"""
    return 'winner' if square == ctx.deps else 'loser'


# Run the agent
success_number = 18  # (3)!
result = roulette_agent.run_sync('Put my money on square eighteen', deps=success_number)
print(result.data)  # (4)!
#> True

result = roulette_agent.run_sync('I bet five is the winner', deps=success_number)
print(result.data)
#> False

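1. Create an agent, which expects an integer dependency and returns a boolean result. This agent will have type Agent[int, bool].
2. Define a tool that checks if the square is a winner. Here RunContext is parameterized with the dependency type int; if you got the dependency type wrong you'd get a typing error.
3. In reality, you might want to use a random number here, e.g. random.randint(0, 36).
4. result.data will be a boolean indicating whether the square is a winner. Pydantic performs the result validation, and it will be typed as a bool since its type is derived from the result_type generic parameter of the agent.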

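Agents are designed for reuse, like FastAPI Apps

Agents are intended to be instantiated once (frequently as module globals) and reused throughout your application, similar to a small FastAPI app or an APIRouter.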

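Running Agents

There are four ways to run an agent:

agent.run(): a coroutine which returns a RunResult containing a completed response.
agent.run_sync(): a plain, synchronous function which returns a RunResult containing a completed response (internally, this just calls loop.run_until_complete(self.run())).
agent.run_stream(): a coroutine which returns a StreamedRunResult, which contains methods to stream a response as an async iterable.
agent.iter(): a context manager which returns an AgentRun, an async-iterable over the nodes of the agent's underlying Graph.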
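The relationship between the coroutine and its synchronous wrapper can be sketched with asyncio alone. The run and run_sync functions below are hypothetical stand-ins written for this illustration, not the library's implementation; they only show the run_until_complete pattern mentioned in the list above.

```python
import asyncio


async def run(prompt: str) -> str:
    """Hypothetical coroutine standing in for an asynchronous agent run."""
    await asyncio.sleep(0)  # pretend to wait on a network call
    return f'echo: {prompt}'


def run_sync(prompt: str) -> str:
    """Blocking wrapper: drive the coroutine to completion on an event loop."""
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(run(prompt))
    finally:
        loop.close()


print(run_sync('What is the capital of Italy?'))
#> echo: What is the capital of Italy?
```

A wrapper of this kind cannot be called from code that is already running inside an event loop, which is one reason to prefer the coroutine form in async applications.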

Here's a simple example demonstrating the first three:

run_agent.py

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')

result_sync = agent.run_sync('What is the capital of Italy?')
print(result_sync.data)
#> Rome


async def main():
    result = await agent.run('What is the capital of France?')
    print(result.data)
    #> Paris

    async with agent.run_stream('What is the capital of the UK?') as response:
        print(await response.get_data())
        #> London

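(This example is complete and can be run "as is"; you'll need to add asyncio.run(main()) to run main.)

You can also pass messages from previous runs to continue a conversation or provide context, as described in Messages and Chat History.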

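Iterating Over an Agent's Graph

Under the hood, each Agent in PydanticAI uses pydantic-graph to manage its execution flow. pydantic-graph is a generic, type-centric library for building and running finite state machines in Python. It doesn't actually depend on PydanticAI (you can use it standalone for workflows that have nothing to do with GenAI), but PydanticAI uses it to orchestrate the handling of model requests and model responses in an agent's run.

In many scenarios you don't need to worry about pydantic-graph at all; calling agent.run(...) simply traverses the underlying graph from start to finish. However, if you need deeper insight or control (for example, to capture each tool invocation or to inject your own logic at specific stages), PydanticAI exposes the lower-level iteration process via Agent.iter. This method returns an AgentRun, which you can async-iterate over, or drive manually node by node via the next method. Once the agent's graph returns an End, you have the final result along with a detailed history of all steps.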
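As a rough mental model of "nodes that run until an End is returned", here is a toy state machine built only from the standard library. The Request, Respond, and End classes and the drive function are hypothetical and far simpler than pydantic-graph's real node types; the sketch only shows the control flow.

```python
import asyncio
from dataclasses import dataclass
from typing import Union


@dataclass
class End:
    """Terminal marker carrying the final result."""

    data: str


@dataclass
class Respond:
    text: str

    async def run(self) -> End:
        return End(data=self.text.upper())


@dataclass
class Request:
    prompt: str

    async def run(self) -> Respond:
        # Stands in for a model call.
        return Respond(text=f'answer to {self.prompt}')


Node = Union[Request, Respond, End]


async def drive(start: Node) -> list[Node]:
    """Run nodes one at a time until an End is produced, recording each step."""
    history: list[Node] = [start]
    node = start
    while not isinstance(node, End):
        node = await node.run()
        history.append(node)
    return history


steps = asyncio.run(drive(Request('ping')))
print([type(step).__name__ for step in steps])
#> ['Request', 'Respond', 'End']
print(steps[-1].data)
#> ANSWER TO PING
```

Each node decides which node comes next, so the driver loop needs no knowledge of the overall workflow; it only has to stop when it sees an End.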

`async for` iteration

Here's an example of using async for with iter to record each node the agent executes:

agent_iter_async_for.py

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')


async def main():
    nodes = []
    # Begin an AgentRun, which is an async-iterable over the nodes of the agent's graph
    with agent.iter('What is the capital of France?') as agent_run:
        async for node in agent_run:
            # Each node represents a step in the agent's execution
            nodes.append(node)
    print(nodes)
    """
    [
        ModelRequestNode(
            request=ModelRequest(
                parts=[
                    UserPromptPart(
                        content='What is the capital of France?',
                        timestamp=datetime.datetime(...),
                        part_kind='user-prompt',
                    )
                ],
                kind='request',
            )
        ),
        HandleResponseNode(
            model_response=ModelResponse(
                parts=[TextPart(content='Paris', part_kind='text')],
                model_name='function:model_logic',
                timestamp=datetime.datetime(...),
                kind='response',
            )
        ),
        End(data=FinalResult(data='Paris', tool_name=None)),
    ]
    """
    print(agent_run.result.data)
    #> Paris

The AgentRun is an async iterator that yields each node (BaseNode or End) in the flow.
The run ends when an End node is returned.

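Using `.next(...)` manually

You can also drive the iteration manually by passing the node you want to run next to the AgentRun.next(...) method. This lets you inspect or modify a node before it is executed, skip nodes based on your own logic, and catch errors raised in next() more easily: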

agent_iter_next.py

from pydantic_ai import Agent
from pydantic_graph import End

agent = Agent('openai:gpt-4o')


async def main():
    with agent.iter('What is the capital of France?') as agent_run:
        node = agent_run.next_node  # (1)!

        all_nodes = [node]

        # Drive the iteration manually:
        while not isinstance(node, End):  # (2)!
            node = await agent_run.next(node)  # (3)!
            all_nodes.append(node)  # (4)!

        print(all_nodes)
        """
        [
            UserPromptNode(
                user_prompt='What is the capital of France?',
                system_prompts=(),
                system_prompt_functions=[],
                system_prompt_dynamic_functions={},
            ),
            ModelRequestNode(
                request=ModelRequest(
                    parts=[
                        UserPromptPart(
                            content='What is the capital of France?',
                            timestamp=datetime.datetime(...),
                            part_kind='user-prompt',
                        )
                    ],
                    kind='request',
                )
            ),
            HandleResponseNode(
                model_response=ModelResponse(
                    parts=[TextPart(content='Paris', part_kind='text')],
                    model_name='function:model_logic',
                    timestamp=datetime.datetime(...),
                    kind='response',
                )
            ),
            End(data=FinalResult(data='Paris', tool_name=None)),
        ]
        """

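1. We start by grabbing the first node that will be run in the agent's graph.
2. The agent run is finished once an End node has been produced; instances of End cannot be passed to next.
3. When you call await agent_run.next(node), it executes that node in the agent's graph, updates the run's history, and returns the next node to run.
4. You could also inspect or mutate the new node here as needed.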

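Accessing usage and the final result

You can retrieve usage statistics (tokens, requests, etc.) at any time from the AgentRun object via agent_run.usage(). This method returns a Usage object containing the usage data.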

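Once the run finishes, agent_run.result becomes an AgentRunResult object containing the final output (and related metadata); this is the attribute read as agent_run.result.data in the async for example.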

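Additional Configuration

Usage Limits

PydanticAI offers a UsageLimits structure to help you limit your usage (tokens and/or requests) on model runs.

You can apply these settings by passing the usage_limits argument to the run{_sync,_stream} functions.

Consider the following example, where we limit the number of response tokens: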

from pydantic_ai import Agent
from pydantic_ai.exceptions import UsageLimitExceeded
from pydantic_ai.usage import UsageLimits

agent = Agent('anthropic:claude-3-5-sonnet-latest')

result_sync = agent.run_sync(
    'What is the capital of Italy? Answer with just the city.',
    usage_limits=UsageLimits(response_tokens_limit=10),
)
print(result_sync.data)
#> Rome
print(result_sync.usage())
"""
Usage(requests=1, request_tokens=62, response_tokens=1, total_tokens=63, details=None)
"""

try:
    result_sync = agent.run_sync(
        'What is the capital of Italy? Answer with a paragraph.',
        usage_limits=UsageLimits(response_tokens_limit=10),
    )
except UsageLimitExceeded as e:
    print(e)
    #> Exceeded the response_tokens_limit of 10 (response_tokens=32)

Restricting the number of requests can be useful for preventing infinite loops or excessive tool calling:

from typing_extensions import TypedDict

from pydantic_ai import Agent, ModelRetry
from pydantic_ai.exceptions import UsageLimitExceeded
from pydantic_ai.usage import UsageLimits


class NeverResultType(TypedDict):
    """
    Never ever coerce data to this type.
    """

    never_use_this: str


agent = Agent(
    'anthropic:claude-3-5-sonnet-latest',
    retries=3,
    result_type=NeverResultType,
    system_prompt='Any time you get a response, call the `infinite_retry_tool` to produce another response.',
)


@agent.tool_plain(retries=5)  # (1)!
def infinite_retry_tool() -> int:
    raise ModelRetry('Please try again.')


try:
    result_sync = agent.run_sync(
        'Begin infinite retry loop!', usage_limits=UsageLimits(request_limit=3)  # (2)!
    )
except UsageLimitExceeded as e:
    print(e)
    #> The next request would exceed the request_limit of 3

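1. This tool has the ability to retry 5 times before erroring, simulating a tool that might get stuck in a loop.
2. This run will error after 3 requests, preventing the infinite tool calling.

Note

This is especially relevant if you've registered many tools. The request_limit can be used to prevent the model from calling them in a loop too many times.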
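The idea of checking a request budget before each model call can be sketched without the library. RequestBudget, LimitExceeded, and looping_run are hypothetical names made up for this illustration; the error text merely mirrors the message shown in the example above and is not taken from the library's source.

```python
from dataclasses import dataclass


class LimitExceeded(Exception):
    """Hypothetical stand-in for a usage-limit error."""


@dataclass
class RequestBudget:
    request_limit: int
    requests: int = 0

    def check_before_request(self) -> None:
        # Refuse to start a request that would push the count over the limit.
        if self.requests + 1 > self.request_limit:
            raise LimitExceeded(
                f'The next request would exceed the request_limit of {self.request_limit}'
            )
        self.requests += 1


def looping_run(budget: RequestBudget) -> None:
    """Simulate a run whose tool always asks for another try."""
    while True:
        budget.check_before_request()


budget = RequestBudget(request_limit=3)
try:
    looping_run(budget)
except LimitExceeded as exc:
    print(exc)
    #> The next request would exceed the request_limit of 3
print(budget.requests)
#> 3
```

Checking before the request rather than after it means the fourth call is never made, so the limit also bounds cost, not just the length of the loop.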

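Model (Run) Settings

PydanticAI offers a settings.ModelSettings structure to help you fine-tune your requests. This structure allows you to configure common parameters that influence the model's behavior, such as temperature, max_tokens, timeout, and more.

There are two ways to apply these settings:

1. Pass the model_settings argument to the run{_sync,_stream} functions. This allows fine-tuning on a per-request basis.
2. Set the model_settings argument when initializing the Agent. These settings are applied by default to all subsequent run calls that use the agent. However, model_settings provided for a specific run call override the agent's default settings.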
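The override rule can be pictured as a dictionary merge. The sketch below assumes the two sets of settings are combined key by key, which the text above does not spell out; merge_settings is a hypothetical helper, not a PydanticAI function.

```python
from typing import Optional


def merge_settings(agent_defaults: Optional[dict], run_settings: Optional[dict]) -> dict:
    """Combine two settings dicts; keys given for the run win over agent defaults."""
    return {**(agent_defaults or {}), **(run_settings or {})}


agent_defaults = {'temperature': 0.7, 'max_tokens': 256}
run_settings = {'temperature': 0.0}

print(merge_settings(agent_defaults, run_settings))
#> {'temperature': 0.0, 'max_tokens': 256}
print(merge_settings(agent_defaults, None))
#> {'temperature': 0.7, 'max_tokens': 256}
```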

For example, if you'd like to set the temperature setting to 0.0 to ensure less random behavior, you can do the following:

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')

result_sync = agent.run_sync(
    'What is the capital of Italy?', model_settings={'temperature': 0.0}
)
print(result_sync.data)
#> Rome

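Model specific settings

If you wish to further customize model behavior, you can use a subclass of ModelSettings associated with your model of choice, such as GeminiModelSettings.

For example: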

from pydantic_ai import Agent, UnexpectedModelBehavior
from pydantic_ai.models.gemini import GeminiModelSettings

agent = Agent('google-gla:gemini-1.5-flash')

try:
    result = agent.run_sync(
        'Write a list of 5 very rude things that I might say to the universe after stubbing my toe in the dark:',
        model_settings=GeminiModelSettings(
            temperature=0.0,  # general model settings can also be specified
            gemini_safety_settings=[
                {
                    'category': 'HARM_CATEGORY_HARASSMENT',
                    'threshold': 'BLOCK_LOW_AND_ABOVE',
                },
                {
                    'category': 'HARM_CATEGORY_HATE_SPEECH',
                    'threshold': 'BLOCK_LOW_AND_ABOVE',
                },
            ],
        ),
    )
except UnexpectedModelBehavior as e:
    print(e)  # (1)!
    """
    Safety settings triggered, body:
    <safety settings details>
    """

1. This error is raised because the safety thresholds were exceeded. Generally, result would contain a normal ModelResponse.

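Runs vs. Conversations

An agent run might represent an entire conversation; there's no limit to how many messages can be exchanged in a single run. However, a conversation might also be composed of multiple runs, especially if you need to maintain state between separate interactions or API calls.

Here's an example of a conversation made up of multiple runs: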

conversation_example.py

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')

# First run
result1 = agent.run_sync('Who was Albert Einstein?')
print(result1.data)
#> Albert Einstein was a German-born theoretical physicist.

# Second run, passing previous messages
result2 = agent.run_sync(
    'What was his most famous equation?',
    message_history=result1.new_messages(),  # (1)!
)
print(result2.data)
#> Albert Einstein's most famous equation is (E = mc^2).

1. Continue the conversation; without message_history the model would not know who "his" was referring to.

(This example is complete and can be run "as is")
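To see why passing earlier messages matters, the following stdlib-only sketch fakes a model that reports how much prior context it received. FakeResult and fake_run are hypothetical and only imitate the shape of the example above; no real model is called.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeResult:
    """Hypothetical result object; not the library's RunResult."""

    data: str
    messages: list[str] = field(default_factory=list)


def fake_run(prompt: str, message_history: Optional[list[str]] = None) -> FakeResult:
    """Pretend model call that reports how much earlier context it was given."""
    history = list(message_history or [])
    reply = f'(saw {len(history)} earlier messages) reply to: {prompt}'
    # The result carries the earlier history plus this run's prompt and reply.
    return FakeResult(data=reply, messages=history + [prompt, reply])


first = fake_run('Who was Albert Einstein?')
second = fake_run('What was his most famous equation?', message_history=first.messages)

print(first.data)
#> (saw 0 earlier messages) reply to: Who was Albert Einstein?
print(second.data)
#> (saw 2 earlier messages) reply to: What was his most famous equation?
print(len(second.messages))
#> 4
```

Each run is stateless on its own; the conversation exists only because the caller threads the accumulated messages into the next run.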

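Type safe by design

PydanticAI is designed to work well with static type checkers, like mypy and pyright.

Typing is (somewhat) optional

PydanticAI is designed to make type checking as useful as possible for you if you choose to use it, but you don't have to use types everywhere all the time.

That said, because PydanticAI uses Pydantic, and Pydantic uses type hints as the definition for schema and validation, some types (specifically type hints on parameters to tools, and the result_type argument to Agent) are used at runtime.

We (the library developers) have messed up if type hints are confusing you more than they are helping you. If you find this, please create an issue explaining what's annoying you!

In particular, agents are generic in both the type of their dependencies and the type of the results they return, so you can use type hints to ensure you're using the right types.

Consider the following script with type mistakes: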

type_mistakes.py

from dataclasses import dataclass

from pydantic_ai import Agent, RunContext


@dataclass
class User:
    name: str


agent = Agent(
    'test',
    deps_type=User,  # (1)!
    result_type=bool,
)


@agent.system_prompt
def add_user_name(ctx: RunContext[str]) -> str:  # (2)!
    return f"The user's name is {ctx.deps}."


def foobar(x: bytes) -> None:
    pass


result = agent.run_sync('Does their name start with "A"?', deps=User('Anne'))
foobar(result.data)  # (3)!

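1. The agent is defined as expecting an instance of User as deps.
2. But here add_user_name is defined as taking a str as the dependency, not a User.
3. Since the agent is defined as returning a bool, this will raise a type error because foobar expects bytes.

Running mypy on this will give the following output: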

➤ uv run mypy type_mistakes.py
type_mistakes.py:18: error: Argument 1 to "system_prompt" of "Agent" has incompatible type "Callable[[RunContext[str]], str]"; expected "Callable[[RunContext[User]], str]"  [arg-type]
type_mistakes.py:28: error: Argument 1 to "foobar" has incompatible type "bool"; expected "bytes"  [arg-type]
Found 2 errors in 1 file (checked 1 source file)

Running pyright would identify the same issues.

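System Prompts

System prompts might seem simple at first glance since they're just strings (or sequences of strings that are concatenated), but crafting the right system prompt is key to getting the model to behave as you want.

Generally, system prompts fall into two categories:

Static system prompts: These are known when writing the code and can be defined via the system_prompt parameter of the Agent constructor.
Dynamic system prompts: These depend in some way on context that isn't known until runtime, and should be defined via functions decorated with @agent.system_prompt.

You can add both to a single agent; they're appended at runtime in the order they're defined.

Here's an example using both types of system prompts: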

system_prompts.py

from datetime import date

from pydantic_ai import Agent, RunContext

agent = Agent(
    'openai:gpt-4o',
    deps_type=str,  # (1)!
    system_prompt="Use the customer's name while replying to them.",  # (2)!
)


@agent.system_prompt  # (3)!
def add_the_users_name(ctx: RunContext[str]) -> str:
    return f"The user's name is {ctx.deps}."


@agent.system_prompt
def add_the_date() -> str:  # (4)!
    return f'The date is {date.today()}.'


result = agent.run_sync('What is the date?', deps='Frank')
print(result.data)
#> Hello Frank, the date today is 2032-01-02.

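1. The agent expects a string dependency.
2. Static system prompt defined at agent creation time.
3. Dynamic system prompt defined via a decorator with RunContext. This is called just after run_sync, not when the agent is created, so it can benefit from runtime information such as the dependencies used on that run.
4. Another dynamic system prompt; system prompts don't have to take the RunContext parameter.

(This example is complete and can be run "as is")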
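The way static and dynamic prompts come together at run time can be sketched in plain Python. Everything here is hypothetical: the system_prompt decorator, the two module-level lists, and build_system_prompts are illustration only, the sketch assumes static prompts come first, and the date is fixed so the output is reproducible.

```python
import inspect
from datetime import date
from typing import Callable

static_prompts = ["Use the customer's name while replying to them."]
dynamic_prompts: list[Callable[..., str]] = []


def system_prompt(func: Callable[..., str]) -> Callable[..., str]:
    """Register a function to be called at run time (hypothetical decorator)."""
    dynamic_prompts.append(func)
    return func


@system_prompt
def add_the_users_name(deps: str) -> str:
    return f"The user's name is {deps}."


@system_prompt
def add_the_date() -> str:
    # Fixed date so the output below is reproducible.
    return f'The date is {date(2032, 1, 2)}.'


def build_system_prompts(deps: str) -> list[str]:
    """Static prompts first, then each dynamic prompt in registration order."""
    prompts = list(static_prompts)
    for func in dynamic_prompts:
        # Pass the dependency only to functions that declare a parameter.
        takes_arg = len(inspect.signature(func).parameters) > 0
        prompts.append(func(deps) if takes_arg else func())
    return prompts


for line in build_system_prompts('Frank'):
    print(line)
#> Use the customer's name while replying to them.
#> The user's name is Frank.
#> The date is 2032-01-02.
```

Because the dynamic functions are only called inside build_system_prompts, they see the dependency supplied for that particular run rather than anything known at definition time.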

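Reflection and self-correction

Validation errors from both function tool parameter validation and structured result validation can be passed back to the model with a request to retry.

You can also raise ModelRetry from within a tool or result validator function to tell the model it should retry generating a response.

The default retry count is 1, but it can be altered for the entire agent, a specific tool, or a result validator.
You can access the current retry count from within a tool or result validator via ctx.retry.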
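As a rough mental model of this retry behavior, the sketch below replays a scripted series of "model attempts" against a tool and stops once a retry budget is exhausted. RetryRequest, RetriesExceeded, run_tool, and the USERS table are all hypothetical; the real library feeds the retry message back to the model instead of reading from a script.

```python
class RetryRequest(Exception):
    """Hypothetical stand-in for an exception that asks the model to try again."""


class RetriesExceeded(Exception):
    """Hypothetical error raised once the retry budget is used up."""


USERS = {'John Doe': 123}


def get_user_by_name(name: str) -> int:
    if name not in USERS:
        raise RetryRequest(
            f'No user found with name {name!r}, remember to provide their full name'
        )
    return USERS[name]


def run_tool(scripted_attempts: list[str], max_retries: int) -> int:
    """Call the tool with each scripted model attempt until one succeeds."""
    retries = 0
    for name in scripted_attempts:
        try:
            return get_user_by_name(name)
        except RetryRequest as exc:
            retries += 1
            if retries > max_retries:
                raise RetriesExceeded(
                    f'Tool exceeded max retries count of {max_retries}'
                ) from exc
            # In a real run this message would be sent back to the model.
            print('retry prompt:', exc)
    raise RetriesExceeded('no attempts left in the script')


print(run_tool(['John', 'John Doe'], max_retries=2))
#> retry prompt: No user found with name 'John', remember to provide their full name
#> 123

try:
    run_tool(['John', 'Johnny'], max_retries=1)
except RetriesExceeded as exc:
    print(exc)
    #> retry prompt: No user found with name 'John', remember to provide their full name
    #> Tool exceeded max retries count of 1
```

Note that a retry count of 1 allows two calls in total: the original attempt plus one retry.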

Here's an example in which a tool raises ModelRetry:

tool_retry.py

from pydantic import BaseModel

from pydantic_ai import Agent, RunContext, ModelRetry

from fake_database import DatabaseConn


class ChatResult(BaseModel):
    user_id: int
    message: str


agent = Agent(
    'openai:gpt-4o',
    deps_type=DatabaseConn,
    result_type=ChatResult,
)


@agent.tool(retries=2)
def get_user_by_name(ctx: RunContext[DatabaseConn], name: str) -> int:
    """Get a user's ID from their full name."""
    print(name)
    #> John
    #> John Doe
    user_id = ctx.deps.users.get(name=name)
    if user_id is None:
        raise ModelRetry(
            f'No user found with name {name!r}, remember to provide their full name'
        )
    return user_id


result = agent.run_sync(
    'Send a message to John Doe asking for coffee next week', deps=DatabaseConn()
)
print(result.data)
"""
user_id=123 message='Hello John, would you be free for coffee sometime next week? Let me know what works for you!'
"""

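Model errors

If models behave unexpectedly (e.g., the retry limit is exceeded, or their API returns 503), agent runs will raise UnexpectedModelBehavior.

In these cases, capture_run_messages can be used to access the messages exchanged during the run to help diagnose the issue.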

agent_model_errors.py

from pydantic_ai import Agent, ModelRetry, UnexpectedModelBehavior, capture_run_messages

agent = Agent('openai:gpt-4o')


@agent.tool_plain
def calc_volume(size: int) -> int:  # (1)!
    if size == 42:
        return size**3
    else:
        raise ModelRetry('Please try again.')


with capture_run_messages() as messages:  # (2)!
    try:
        result = agent.run_sync('Please get me the volume of a box with size 6.')
    except UnexpectedModelBehavior as e:
        print('An error occurred:', e)
        #> An error occurred: Tool exceeded max retries count of 1
        print('cause:', repr(e.__cause__))
        #> cause: ModelRetry('Please try again.')
        print('messages:', messages)
        """
        messages:
        [
            ModelRequest(
                parts=[
                    UserPromptPart(
                        content='Please get me the volume of a box with size 6.',
                        timestamp=datetime.datetime(...),
                        part_kind='user-prompt',
                    )
                ],
                kind='request',
            ),
            ModelResponse(
                parts=[
                    ToolCallPart(
                        tool_name='calc_volume',
                        args={'size': 6},
                        tool_call_id=None,
                        part_kind='tool-call',
                    )
                ],
                model_name='function:model_logic',
                timestamp=datetime.datetime(...),
                kind='response',
            ),
            ModelRequest(
                parts=[
                    RetryPromptPart(
                        content='Please try again.',
                        tool_name='calc_volume',
                        tool_call_id=None,
                        timestamp=datetime.datetime(...),
                        part_kind='retry-prompt',
                    )
                ],
                kind='request',
            ),
            ModelResponse(
                parts=[
                    ToolCallPart(
                        tool_name='calc_volume',
                        args={'size': 6},
                        tool_call_id=None,
                        part_kind='tool-call',
                    )
                ],
                model_name='function:model_logic',
                timestamp=datetime.datetime(...),
                kind='response',
            ),
        ]
        """
    else:
        print(result.data)

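1. Define a tool that will raise ModelRetry repeatedly in this case.
2. capture_run_messages is used to capture the messages exchanged during the run.

(This example is complete and can be run "as is")

Note

If you call run, run_sync, or run_stream more than once within a single capture_run_messages context, messages will represent the messages exchanged during the first call only.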
