How to return structured output from a ReAct-style agent¶

Prerequisites

This guide assumes some familiarity with the following:

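You may want your agent to return its output in a structured format. For example, if the agent's output is consumed by other downstream software, you may want it returned in the same structured format on every invocation to ensure consistency.

This notebook walks through two options for forcing a tool calling agent to structure its output. We will use a basic ReAct agent (a model node and a tool calling node), plus a final node that formats the response for the user. Both options use the same graph structure, shown in the figure below, but differ in how they work internally.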

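Option 1

The first way to force your tool calling agent to produce structured output is to bind the desired output as an additional tool for the agent node to use. Unlike the basic ReAct agent, the agent node here does not choose between tools and END; it chooses which specific tool to call. The expected flow is that the LLM in the agent node first selects the action tool and, after receiving the action tool's output, calls the response tool, which routes to the respond node. That node simply structures the arguments of the agent node's tool call.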

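Pros and cons

The benefit of this format is that you need only one LLM, which saves money and latency. The downside is that there is no guarantee the single LLM will call the right tool when you want it to. Setting tool_choice to any when using bind_tools forces the LLM to select at least one tool on every turn, but this is far from foolproof. In addition, the agent may call multiple tools at once, so we need to check for this explicitly in the routing function (or, if we are using OpenAI, set parallel_tool_calls=False so that only one tool is called at a time).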
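The explicit check mentioned above can be sketched in plain Python, without any LLM call. This is a minimal illustration, assuming tool calls arrive as dictionaries with a `name` key (the shape LangChain uses for `AIMessage.tool_calls`); the `route` helper and `RESPONSE_TOOL` constant are hypothetical names introduced here.

```python
from typing import Dict, List

# Assumption: the response tool is named after the schema class used in this guide.
RESPONSE_TOOL = "WeatherResponse"


def route(tool_calls: List[Dict]) -> str:
    """Return "respond" only when the response tool is the sole tool call."""
    names = [call["name"] for call in tool_calls]
    if names == [RESPONSE_TOOL]:
        return "respond"
    # An action tool call, or several calls in one turn: run the tools node.
    return "continue"


print(route([{"name": "get_weather", "args": {"city": "sf"}}]))
print(route([{"name": "WeatherResponse", "args": {}}]))
print(route([{"name": "get_weather", "args": {}}, {"name": "WeatherResponse", "args": {}}]))
```

Requiring the response tool to be the only call means a turn that mixes an action tool with the response tool is not treated as final.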

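Option 2

The second way to force your tool calling agent to produce structured output is to use a second LLM (in this case model_with_structured_output) to respond to the user.

In this case, you define a basic ReAct agent as usual, but the agent node chooses between the tools node and the respond node rather than between the tools node and ending the conversation. The respond node contains a second LLM that uses structured output and, once called, returns directly to the user. You can think of this method as basic ReAct with one extra step before responding to the user.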

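Pros and cons

The benefit of this method is that it guarantees structured output (as long as .with_structured_output works as expected with the LLM). The downside is that it requires an additional LLM call before responding to the user, which can increase cost and latency. In addition, because the agent node LLM is given no information about the desired output schema, there is a risk that it will fail to call the tools needed to satisfy that schema.

Note that both options follow exactly the same graph structure (see the figure above): each is an exact replica of the basic ReAct architecture, with a respond node before the end.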
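The shared control flow can be traced with a small stand-alone sketch. It does not use LangGraph; the `walk` function, the `decide` callback, and the `END` placeholder string are assumptions made for illustration only.

```python
from typing import Callable, List

END = "__end__"  # placeholder for the terminal state in this sketch


def walk(decide: Callable[[int], str], max_steps: int = 10) -> List[str]:
    """Trace node visits: agent -> (tools -> agent)* -> respond -> END."""
    path = ["agent"]
    turn = 0
    while path[-1] != END and len(path) < max_steps:
        node = path[-1]
        if node == "agent":
            path.append(decide(turn))  # the agent picks "tools" or "respond"
            turn += 1
        elif node == "tools":
            path.append("agent")  # tool results always go back to the agent
        else:  # the respond node is always followed by the end
            path.append(END)
    return path


# Hypothetical agent policy: call a tool on the first turn, then respond.
print(walk(lambda turn: "tools" if turn == 0 else "respond"))
# ['agent', 'tools', 'agent', 'respond', '__end__']
```

The two options differ only in how the agent decides to move to respond and in what the respond node does once reached.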

Setup¶

First, let's install the required packages and set our API keys.

%%capture --no-stderr
%pip install -U langgraph langchain_anthropic

import getpass
import os


def _set_env(var: str):
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")


_set_env("ANTHROPIC_API_KEY")

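Set up LangSmith for LangGraph development

Sign up for LangSmith to quickly spot issues and improve the performance of your LangGraph projects. LangSmith lets you use trace data to debug, test, and monitor LLM apps built with LangGraph; see the LangSmith documentation for how to get started.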

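Define model, tools, and graph state¶

Now we can define how we want to structure our output, define our graph state, and define the tools and models we are going to use.

To use structured output, we will use the with_structured_output method from LangChain; see the LangChain documentation for details.

In this example, we will use a single tool that looks up the weather and return a structured weather response to the user.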

from pydantic import BaseModel, Field
from typing import Literal
from langchain_core.tools import tool
from langchain_anthropic import ChatAnthropic
from langgraph.graph import MessagesState


class WeatherResponse(BaseModel):
    """Respond to the user with this"""

    temperature: float = Field(description="The temperature in fahrenheit")
    wind_directon: str = Field(
        description="The direction of the wind in abbreviated form"
    )
    wind_speed: float = Field(description="The speed of the wind in km/h")


# Inherit 'messages' key from MessagesState, which is a list of chat messages
class AgentState(MessagesState):
    # Final structured response from the agent
    final_response: WeatherResponse


@tool
def get_weather(city: Literal["nyc", "sf"]):
    """Use this to get weather information."""
    if city == "nyc":
        return "It is cloudy in NYC, with 5 mph winds in the North-East direction and a temperature of 70 degrees"
    elif city == "sf":
        return "It is 75 degrees and sunny in SF, with 3 mph winds in the South-East direction"
    else:
        raise AssertionError("Unknown city")


tools = [get_weather]

model = ChatAnthropic(model="claude-3-opus-20240229")

model_with_tools = model.bind_tools(tools)
model_with_structured_output = model.with_structured_output(WeatherResponse)

API Reference: tool | ChatAnthropic

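Option 1: Bind output as tool¶

Let's now look at how to use the single-LLM option.

Define the graph¶

The graph follows the structure described above. Instead of calling an LLM in the respond node, we bind the WeatherResponse tool to the LLM that already has the get_weather tool.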

from langgraph.graph import StateGraph, END
from langgraph.prebuilt import ToolNode

tools = [get_weather, WeatherResponse]

# Force the model to use tools by passing tool_choice="any"
model_with_response_tool = model.bind_tools(tools, tool_choice="any")


# Define the function that calls the model
def call_model(state: AgentState):
    response = model_with_response_tool.invoke(state["messages"])
    # We return a list, because this will get added to the existing list
    return {"messages": [response]}


# Define the function that responds to the user
def respond(state: AgentState):
    # Construct the final answer from the arguments of the last tool call
    response = WeatherResponse(**state["messages"][-1].tool_calls[0]["args"])
    # We return the final answer
    return {"final_response": response}


# Define the function that determines whether to continue or not
def should_continue(state: AgentState):
    messages = state["messages"]
    last_message = messages[-1]
    # If there is only one tool call and it is the response tool call, we respond to the user
    if (
        len(last_message.tool_calls) == 1
        and last_message.tool_calls[0]["name"] == "WeatherResponse"
    ):
        return "respond"
    # Otherwise we use the tool node again
    else:
        return "continue"


# Define a new graph
workflow = StateGraph(AgentState)

# Define the nodes we will cycle between
workflow.add_node("agent", call_model)
workflow.add_node("respond", respond)
workflow.add_node("tools", ToolNode(tools))

# Set the entrypoint as `agent`
# This means that this node is the first one called
workflow.set_entry_point("agent")

# We now add a conditional edge
workflow.add_conditional_edges(
    "agent",
    should_continue,
    {
        "continue": "tools",
        "respond": "respond",
    },
)

workflow.add_edge("tools", "agent")
workflow.add_edge("respond", END)
graph = workflow.compile()

API Reference: StateGraph | END | ToolNode
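The respond node above builds the final answer by unpacking the arguments of a tool call into the response schema. The sketch below shows that step in isolation. It uses a standard-library dataclass as a stand-in for the Pydantic WeatherResponse model, and the tool-call values are illustrative rather than real model output.

```python
from dataclasses import dataclass


@dataclass
class WeatherResponseSketch:
    # Stand-in for the guide's Pydantic WeatherResponse model (same field names).
    temperature: float
    wind_directon: str
    wind_speed: float


# Shape of one entry in AIMessage.tool_calls; the values here are made up.
tool_call = {
    "name": "WeatherResponse",
    "args": {"temperature": 75.0, "wind_directon": "SE", "wind_speed": 3.0},
    "id": "call_1",
}

# Same unpacking pattern as respond(): keyword arguments come straight from "args".
final = WeatherResponseSketch(**tool_call["args"])
print(final)
```

Unlike this dataclass, the Pydantic model also validates field types, so malformed arguments from the LLM would raise an error in the respond node instead of passing through silently.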

Usage¶

Now we can run our graph to check that it works as intended:

answer = graph.invoke(input={"messages": [("human", "what's the weather in SF?")]})[
    "final_response"
]

answer

WeatherResponse(temperature=75.0, wind_directon='SE', wind_speed=3.0)

As expected, the agent returned a WeatherResponse object.

Option 2: Two LLMs¶

Let's now look at how to use a second LLM to force structured output.

Define the graph¶

We can now define our graph:

from langgraph.graph import StateGraph, END
from langgraph.prebuilt import ToolNode
from langchain_core.messages import HumanMessage


# Define the function that calls the model
def call_model(state: AgentState):
    response = model_with_tools.invoke(state["messages"])
    # We return a list, because this will get added to the existing list
    return {"messages": [response]}


# Define the function that responds to the user
def respond(state: AgentState):
    # We call the model with structured output so that the user gets the same format every time
    # state['messages'][-2] is the last ToolMessage in the conversation, which we convert to a HumanMessage for the model to use
    # We could also pass the entire chat history, but this saves tokens since all we care about is structuring the tool's output
    response = model_with_structured_output.invoke(
        [HumanMessage(content=state["messages"][-2].content)]
    )
    # We return the final answer
    return {"final_response": response}


# Define the function that determines whether to continue or not
def should_continue(state: AgentState):
    messages = state["messages"]
    last_message = messages[-1]
    # If there is no tool call, we respond to the user
    if not last_message.tool_calls:
        return "respond"
    # Otherwise, if there is one, we continue
    else:
        return "continue"


# Define a new graph
workflow = StateGraph(AgentState)

# Define the nodes we will cycle between
workflow.add_node("agent", call_model)
workflow.add_node("respond", respond)
workflow.add_node("tools", ToolNode(tools))

# Set the entrypoint as `agent`
# This means that this node is the first one called
workflow.set_entry_point("agent")

# We now add a conditional edge
workflow.add_conditional_edges(
    "agent",
    should_continue,
    {
        "continue": "tools",
        "respond": "respond",
    },
)

workflow.add_edge("tools", "agent")
workflow.add_edge("respond", END)
graph = workflow.compile()

API Reference: HumanMessage | StateGraph | END | ToolNode
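The respond function above relies on `state["messages"][-2]` being a ToolMessage, which holds only when the agent has called a tool just before answering. A more defensive lookup could scan backwards for the most recent tool message, as in the sketch below. The `last_tool_content` helper is a hypothetical name, and the messages are `SimpleNamespace` stand-ins that mimic the `type` and `content` attributes of LangChain message objects.

```python
from types import SimpleNamespace
from typing import Any, Optional, Sequence


def last_tool_content(messages: Sequence[Any]) -> Optional[str]:
    """Return the content of the most recent tool message, or None if there is none."""
    for message in reversed(messages):
        if getattr(message, "type", None) == "tool":
            return message.content
    return None


# Stand-in conversation: human question, AI tool call, tool result, AI answer.
history = [
    SimpleNamespace(type="human", content="what's the weather in SF?"),
    SimpleNamespace(type="ai", content=""),
    SimpleNamespace(type="tool", content="It is 75 degrees and sunny in SF"),
    SimpleNamespace(type="ai", content="It is 75 degrees and sunny."),
]
print(last_tool_content(history))
print(last_tool_content(history[:1]))  # no tool message yet
```

When the helper returns None, the respond node would need a fallback, for example passing the agent's last message to the structured-output model instead.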

Usage¶

We can now invoke our graph to verify that the output is structured as desired:

answer = graph.invoke(input={"messages": [("human", "what's the weather in SF?")]})[
    "final_response"
]

answer

WeatherResponse(temperature=75.0, wind_directon='SE', wind_speed=4.83)

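As we can see, the agent returned a WeatherResponse object, as expected. We can now use this agent in a more complex software stack without worrying that its output will not match the format expected by the next step in the stack.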
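One detail worth checking in the two outputs: the get_weather tool reports wind speed in mph, while the wind_speed field is described as km/h. Option 1 returned 3.0 and Option 2 returned 4.83. A plausible reading, not confirmed by the source, is that the second LLM converted the units and the first did not. The arithmetic can be verified with a short sketch (the `mph_to_kmh` helper is a hypothetical name):

```python
MPH_TO_KMH = 1.609344  # kilometres in one mile


def mph_to_kmh(mph: float) -> float:
    """Convert miles per hour to kilometres per hour, rounded to two decimals."""
    return round(mph * MPH_TO_KMH, 2)


print(mph_to_kmh(3))  # 4.83
```

If downstream code depends on a particular unit, it may be safer to have the tool return values in the unit the schema expects rather than rely on the LLM to convert.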