智能体

在LlamaIndex中，我们将“智能体”定义为使用LLM、记忆和工具来处理外部用户输入的特定系统。这与术语“智能体化”形成对比，后者通常指智能体的超类，即任何在流程中包含LLM决策的系统。

要在LlamaIndex中创建一个智能体，只需几行代码：

import asyncio
from llama_index.core.agent.workflow import FunctionAgent
from llama_index.llms.openai import OpenAI


# Define a simple calculator tool
def multiply(a: float, b: float) -> float:
    """Useful for multiplying two numbers."""
    return a * b


# Create an agent workflow with our calculator tool
agent = FunctionAgent(
    tools=[multiply],
    llm=OpenAI(model="gpt-4o-mini"),
    system_prompt="You are a helpful assistant that can multiply two numbers.",
)


async def main():
    # Run the agent
    response = await agent.run("What is 1234 * 4567?")
    print(str(response))


# Run the agent
if __name__ == "__main__":
    asyncio.run(main())

调用此智能体将启动特定的操作循环：

智能体获取最新消息 + 聊天记录
工具架构和聊天记录通过API发送
智能体要么直接回应，要么提供一系列工具调用
- 每个工具调用都会被执行
- 工具调用结果会被添加到聊天记录中
- 智能体会根据更新后的记录再次被调用，要么直接回应，要么选择更多调用

FunctionAgent 是一种利用LLM提供商的函数/工具调用能力来执行工具的智能体类型。其他类型的智能体，例如ReActAgent和CodeActAgent，则采用不同的提示策略来执行工具。

您可以访问智能体指南了解更多关于智能体及其功能的信息。

工具

工具可以简单地定义为Python函数，或者使用像FunctionTool和QueryEngineTool这样的类进行进一步定制。LlamaIndex还通过名为Tool Specs的机制为常见API提供预定义工具集。

你可以在工具指南中了解更多关于配置工具的信息

记忆

内存是构建智能体时的核心组件。默认情况下，所有LlamaIndex智能体都使用ChatMemoryBuffer作为内存。

要自定义它，您可以在智能体外部声明并将其传入：

from llama_index.core.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=40000)

response = await agent.run(..., memory=memory)

您可以在内存指南中了解更多关于配置内存的信息

某些大型语言模型将支持多种模态，例如图像和文本。通过使用包含内容块的聊天消息，我们可以将图像传递给智能体进行推理。

例如，假设您有一张本演示文稿中的幻灯片的截图。

你可以将此图像传递给一个智能体进行推理，并观察它读取图像并相应行动。

from llama_index.core.agent.workflow import FunctionAgent
from llama_index.core.llms import ChatMessage, ImageBlock, TextBlock
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o-mini", api_key="sk-...")


def add(a: int, b: int) -> int:
    """Useful for adding two numbers together."""
    return a + b


workflow = FunctionAgent(
    tools=[add],
    llm=llm,
)

msg = ChatMessage(
    role="user",
    blocks=[
        TextBlock(text="Follow what the image says."),
        ImageBlock(path="./screenshot.png"),
    ],
)

response = await workflow.run(msg)
print(str(response))

多智能体系统

您可以将多个智能体组合成一个多智能体系统，其中每个智能体能够在完成任务时移交控制权给另一个智能体进行协调。

from llama_index.core.agent.workflow import AgentWorkflow

multi_agent = AgentWorkflow(agents=[FunctionAgent(...), FunctionAgent(...)])

resp = await agent.run("query")

这只是构建多智能体系统的一种方式。继续阅读以了解更多关于多智能体系统的内容。

手动智能体

虽然像 FunctionAgent、ReActAgent、CodeActAgent 和 AgentWorkflow 这样的智能体类抽象了许多细节，但有时需要构建自己的底层智能体。

直接使用 LLM 对象，您可以快速实现一个基础的智能体循环，同时完全掌控工具调用和错误处理的工作方式。

from llama_index.core.llms import ChatMessage
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI


def select_song(song_name: str) -> str:
    """Useful for selecting a song."""
    return f"Song selected: {song_name}"


tools = [FunctionTool.from_defaults(select_song)]
tools_by_name = {t.metadata.name: t for t in [tool]}

# call llm with initial tools + chat history
chat_history = [ChatMessage(role="user", content="Pick a random song for me")]
resp = llm.chat_with_tools([tool], chat_history=chat_history)

# parse tool calls from response
tool_calls = llm.get_tool_calls_from_response(
    resp, error_on_no_tool_call=False
)

# loop while there are still more tools to call
while tool_calls:
    # add the LLM's response to the chat history
    chat_history.append(resp.message)

    # call every tool and add its result to chat_history
    for tool_call in tool_calls:
        tool_name = tool_call.tool_name
        tool_kwargs = tool_call.tool_kwargs

        print(f"Calling {tool_name} with {tool_kwargs}")
        tool_output = tool(**tool_kwargs)
        chat_history.append(
            ChatMessage(
                role="tool",
                content=str(tool_output),
                # most LLMs like OpenAI need to know the tool call id
                additional_kwargs={"tool_call_id": tool_call.tool_id},
            )
        )

        # check if the LLM can write a final response or calls more tools
        resp = llm.chat_with_tools([tool], chat_history=chat_history)
        tool_calls = llm.get_tool_calls_from_response(
            resp, error_on_no_tool_call=False
        )

# print the final response
print(resp.message.content)

示例 / 模块指南

你可以在模块指南页面中找到更完整的示例列表和模块指南。

智能体

工具

记忆

多模态智能体

多智能体系统

手动智能体

示例 / 模块指南