Agents#

"An autonomous agent is a system situated within and a part of an environment that senses that environment and acts on it, over time, in pursuit of its own agenda and so as to effect what it senses in the future."

— Franklin and Graesser (1997)

Alongside the well-known RAGs, agents [1] are another popular family of LLM applications. What makes agents stand out is their ability to reason, plan, and act via accessible tools. On the implementation side, AdalFlow has simplified this down to a generator that can use tools, taking multiple steps (sequentially or in parallel) to complete a user query.

Design#

We will first introduce ReAct [2], a general paradigm for building agents that works through an interleaved sequence of Thought, Action, and Observation steps; a minimal sketch of this loop follows the list below.

  • Thought: The reasoning behind taking an action.

  • Action: The action to take from a predefined set of actions. In particular, these are the tools/functional tools we introduced in Tools.

  • Observation: In the simplest scenario, this is the execution result of the action in string format. To be more robust, it can be defined in any way, as long as it provides enough information about the execution for the LLM to plan the next step.
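
To make the control flow concrete, here is a minimal, framework-agnostic sketch of the Thought-Action-Observation loop. It is not the AdalFlow implementation; plan_next_step and the tools registry are hypothetical placeholders.

def react_loop(query: str, tools: dict, plan_next_step, max_steps: int = 6):
    """Hedged sketch of a ReAct loop; not the AdalFlow implementation."""
    history = []
    for step in range(1, max_steps + 1):
        # Thought + Action: the LLM planner decides the next move given the history so far.
        thought, action_name, kwargs = plan_next_step(query, history)
        # Observation: the execution result of the chosen action.
        observation = tools[action_name](**kwargs)
        history.append(
            {"step": step, "thought": thought, "action": f"{action_name}({kwargs})", "observation": observation}
        )
        if action_name == "finish":  # a 'finish' action ends the task with the final answer
            return observation
    return None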

Prompt and Data Models#

DEFAULT_REACT_AGENT_SYSTEM_PROMPT is the default prompt for the ReAct agent's LLM planner. We can break the prompt template into four parts:

  1. Task description

This part is the overall role setup and task description for the agent.

task_desc = r"""You are a helpful assistant.
Answer the user's query using the tools provided below with minimal steps and maximum accuracy.

Each step you will read the previous Thought, Action, and Observation(execution result of the action) and then provide the next Thought and Action."""
  2. Tools, output format, and examples

This part of the template is exactly the same as how we call functions in Tools. The output_format_str is generated by FunctionExpression via JsonOutputParser. It includes the actual output format and examples of a list of FunctionExpression instances. We use the thought and action fields of FunctionExpression as the agent's response.

tools = r"""{% if tools %}
<TOOLS>
{% for tool in tools %}
{{ loop.index }}.
{{tool}}
------------------------
{% endfor %}
</TOOLS>
{% endif %}
{{output_format_str}}"""
  3. Task specification, to teach the planner how to "think".

We provide more detailed instructions to ensure the agent always ends the task with the 'finish' action. Additionally, we teach it how to handle both simple and complex queries.

  • For simple queries, we instruct the agent to finish in as few steps as possible.

  • For complex queries, we teach the agent a 'divide-and-conquer' strategy to solve the query step by step.

task_spec = r"""<TASK_SPEC>
- For simple queries: Directly call the ``finish`` action and provide the answer.
- For complex queries:
   - Step 1: Read the user query and potentially divide it into subqueries. And get started with the first subquery.
   - Call one available tool at a time to solve each subquery/subquestion. \
   - At step 'finish', join all subqueries answers and finish the task.
Remember:
- Action must call one of the above tools with name. It can not be empty.
- You will always end with 'finish' action to finish the task. The answer can be the final answer or failure message.
</TASK_SPEC>"""

We put all three of these parts together within a single enclosing tag in the template.

  4. Agent step history.

We use StepOutput to record the agent's step history (an illustrative instance is sketched after this list), including:

  • action: This will be the FunctionExpression instance predicted by the agent.

  • observation: The execution result of the action.
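
Purely for illustration, here is a hedged sketch of what one such record could look like. The field names follow the StepOutput entries in the printed logs later in this section; the import path of StepOutput is an assumption to verify against your AdalFlow version.

# Illustrative only: field names follow the logged StepOutput shown later; import path assumed.
from adalflow.core.types import FunctionExpression, StepOutput

step = StepOutput(
    step=1,
    action=FunctionExpression(
        thought="Let's start with the first subquery.",
        action='llm_tool(input="What is the capital of France?")',
    ),
    observation="The capital of France is Paris!",
)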

In particular, we format the step history after the user query as follows:

step_history = r"""User query:
{{ input_str }}
{# Step History #}
{% if step_history %}
<STEPS>
{% for history in step_history %}
Step {{ loop.index }}.
"Thought": "{{history.action.thought}}",
"Action": "{{history.action.action}}",
"Observation": "{{history.observation}}"
------------------------
{% endfor %}
</STEPS>
{% endif %}
You:"""

Tools#

In addition to the tools provided by the user, by default we add a new tool named finish to allow the agent to stop and return the final answer.

def finish(answer: str) -> str:
   """Finish the task with answer."""
   return answer

Simply returning a string might not fit all scenarios, and we may consider allowing users to define their own finish function for more complex cases in the future.

Additionally, since the provided tools cannot always solve the user query, we allow users to configure whether an LLM model should be used to solve subqueries via the add_llm_as_fallback parameter. This LLM will use the same model client and model arguments as the agent's planner. Here is our code for specifying the fallback LLM tool:

_additional_llm_tool = (
   Generator(model_client=model_client, model_kwargs=model_kwargs)
   if self.add_llm_as_fallback
   else None
)

def llm_tool(input: str) -> str:
   """I answer any input query with llm's world knowledge. Use me as a fallback tool or when the query is simple."""
   # use the generator to answer the query
   try:
         output: GeneratorOutput = _additional_llm_tool(
            prompt_kwargs={"input_str": input}
         )
         response = output.data if output else None
         return response
   except Exception as e:
         log.error(f"Error using the generator: {e}")
         print(f"Error using the generator: {e}")

   return None
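
If you want the agent to rely strictly on the tools you pass in, the fallback can simply be turned off at construction time. This is a usage note based on the constructor arguments documented in the next section; tools, model_client, and model_kwargs stand in for your own setup.

# Disable the fallback: only the user-provided tools plus ``finish`` will be registered.
react = ReActAgent(
    tools=[multiply, add, divide],
    add_llm_as_fallback=False,
    model_client=model_client,
    model_kwargs=model_kwargs,
)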

ReAct Agent#

We define the class ReActAgent to put everything together. It orchestrates two components:

  • planner: A Generator that works with a JsonOutputParser to parse the output format and the examples of the function calls, using FunctionExpression.

  • ToolManager: Manages the given list of tools, the finish function, and the LLM tool. It is responsible for parsing and executing the functions.

Additionally, it manages step_history as a list of StepOutput instances for the agent's internal state.

Its main methods are:

  • __init__(self, tools: List[Union[Callable, AsyncCallable, FunctionTool]] = [], max_steps: int = 10, add_llm_as_fallback: bool = True, examples: List[FunctionExpression] = [], *, model_client: ModelClient, model_kwargs: Dict = {}, template: Optional[str] = None)

    Initializes the ReActAgent with the specified tools, maximum steps, fallback option, examples, model client, model arguments, and template (if you want to customize the prompt).

  • call(self, input: str, prompt_kwargs: Optional[Dict] = {}, model_kwargs: Optional[Dict] = {}) -> Any

    Prompts the agent with an input query and processes the steps to generate a response.

Agent In Action#

We will set up two sets of models, llama3-70b-8192 from Groq and gpt-3.5-turbo from OpenAI, to test two queries. For comparison, we also generate vanilla LLM responses without using the agent. Here is the code snippet:

from adalflow.components.agent import ReActAgent
from adalflow.core import Generator, ModelClientType, ModelClient
from adalflow.utils import setup_env

setup_env()


# Define tools
def multiply(a: int, b: int) -> int:
   """
   Multiply two numbers.
   """
   return a * b

def add(a: int, b: int) -> int:
   """
   Add two numbers.
   """
   return a + b

def divide(a: float, b: float) -> float:
   """
   Divide two numbers.
   """
   return float(a) / b

llama3_model_kwargs = {
   "model": "llama3-70b-8192",  # llama3 70b works better than 8b here.
   "temperature": 0.0,
}
gpt_model_kwargs = {
   "model": "gpt-3.5-turbo",
   "temperature": 0.0,
}


def test_react_agent(model_client: ModelClient, model_kwargs: dict):
   tools = [multiply, add, divide]
   queries = [
      "What is the capital of France? and what is 465 times 321 then add 95297 and then divide by 13.2?",
      "Give me 5 words rhyming with cool, and make a 4-sentence poem using them",
   ]
   # define a generator without tools for comparison

   generator = Generator(
      model_client=model_client,
      model_kwargs=model_kwargs,
   )

   react = ReActAgent(
      max_steps=6,
      add_llm_as_fallback=True,
      tools=tools,
      model_client=model_client,
      model_kwargs=model_kwargs,
   )
   # print(react)

   for query in queries:
      print(f"Query: {query}")
      agent_response = react.call(query)
      llm_response = generator.call(prompt_kwargs={"input_str": query})
      print(f"Agent response: {agent_response}")
      print(f"LLM response: {llm_response}")
      print("")

The structure of the ReActAgent, including the initialization arguments and its two major components, tool_manager and planner, is shown below.

ReActAgent(
   max_steps=6, add_llm_as_fallback=True,
   (tool_manager): ToolManager(Tools: [FunctionTool(fn: , async: False, definition: FunctionDefinition(func_name='multiply', func_desc='multiply(a: int, b: int) -> int\n\n    Multiply two numbers.\n    ', func_parameters={'type': 'object', 'properties': {'a': {'type': 'int'}, 'b': {'type': 'int'}}, 'required': ['a', 'b']})), FunctionTool(fn: , async: False, definition: FunctionDefinition(func_name='add', func_desc='add(a: int, b: int) -> int\n\n    Add two numbers.\n    ', func_parameters={'type': 'object', 'properties': {'a': {'type': 'int'}, 'b': {'type': 'int'}}, 'required': ['a', 'b']})), FunctionTool(fn: , async: False, definition: FunctionDefinition(func_name='divide', func_desc='divide(a: float, b: float) -> float\n\n    Divide two numbers.\n    ', func_parameters={'type': 'object', 'properties': {'a': {'type': 'float'}, 'b': {'type': 'float'}}, 'required': ['a', 'b']})), FunctionTool(fn: .llm_tool at 0x11384b740>, async: False, definition: FunctionDefinition(func_name='llm_tool', func_desc="llm_tool(input: str) -> str\nI answer any input query with llm's world knowledge. Use me as a fallback tool or when the query is simple.", func_parameters={'type': 'object', 'properties': {'input': {'type': 'str'}}, 'required': ['input']})), FunctionTool(fn: .finish at 0x11382fa60>, async: False, definition: FunctionDefinition(func_name='finish', func_desc='finish(answer: str) -> str\nFinish the task with answer.', func_parameters={'type': 'object', 'properties': {'answer': {'type': 'str'}}, 'required': ['answer']}))], Additional Context: {})
   (planner): Generator(
      model_kwargs={'model': 'llama3-70b-8192', 'temperature': 0.0},
      (prompt): Prompt(
         template: 
         {# role/task description #}
         You are a helpful assistant.
         Answer the user's query using the tools provided below with minimal steps and maximum accuracy.
         {# REACT instructions #}
         Each step you will read the previous Thought, Action, and Observation(execution result of the action) and then provide the next Thought and Action.
         {# Tools #}
         {% if tools %}
         
         You available tools are:
         {# tools #}
         {% for tool in tools %}
         {{ loop.index }}.
         {{tool}}
         ------------------------
         {% endfor %}
         
         {% endif %}
         {# output format and examples #}
         
         {{output_format_str}}
         
         
         {# Task specification to teach the agent how to think using 'divide and conquer' strategy #}
         - For simple queries: Directly call the ``finish`` action and provide the answer.
         - For complex queries:
            - Step 1: Read the user query and potentially divide it into subqueries. And get started with the first subquery.
            - Call one available tool at a time to solve each subquery/subquestion. \
            - At step 'finish', join all subqueries answers and finish the task.
         Remember:
         - Action must call one of the above tools with name. It can not be empty.
         - You will always end with 'finish' action to finish the task. The answer can be the final answer or failure message.
         
         
         -----------------
         User query:
         {{ input_str }}
         {# Step History #}
         {% if step_history %}
         
         {% for history in step_history %}
         Step {{ loop.index }}.
         "Thought": "{{history.action.thought}}",
         "Action": "{{history.action.action}}",
         "Observation": "{{history.observation}}"
         ------------------------
         {% endfor %}
         
         {% endif %}
         You:, prompt_kwargs: {'tools': ['func_name: multiply\nfunc_desc: "multiply(a: int, b: int) -> int\\n\\n    Multiply two numbers.\\n    "\nfunc_parameters:\n  type: object\n  properties:\n    a:\n      type: int\n    b:\n      type: int\n  required:\n  - a\n  - b\n', 'func_name: add\nfunc_desc: "add(a: int, b: int) -> int\\n\\n    Add two numbers.\\n    "\nfunc_parameters:\n  type: object\n  properties:\n    a:\n      type: int\n    b:\n      type: int\n  required:\n  - a\n  - b\n', 'func_name: divide\nfunc_desc: "divide(a: float, b: float) -> float\\n\\n    Divide two numbers.\\n    "\nfunc_parameters:\n  type: object\n  properties:\n    a:\n      type: float\n    b:\n      type: float\n  required:\n  - a\n  - b\n', "func_name: llm_tool\nfunc_desc: 'llm_tool(input: str) -> str\n\n  I answer any input query with llm''s world knowledge. Use me as a fallback tool\n  or when the query is simple.'\nfunc_parameters:\n  type: object\n  properties:\n    input:\n      type: str\n  required:\n  - input\n", "func_name: finish\nfunc_desc: 'finish(answer: str) -> str\n\n  Finish the task with answer.'\nfunc_parameters:\n  type: object\n  properties:\n    answer:\n      type: str\n  required:\n  - answer\n"], 'output_format_str': 'Your output should be formatted as a standard JSON instance with the following schema:\n```\n{\n    "thought": "Why the function is called (Optional[str]) (optional)",\n    "action": "FuncName() Valid function call expression. Example: \\"FuncName(a=1, b=2)\\" Follow the data type specified in the function parameters.e.g. for Type object with x,y properties, use \\"ObjectType(x=1, y=2) (str) (required)"\n}\n```\nExamples:\n```\n{\n    "thought": "I have finished the task.",\n    "action": "finish(answer=\\"final answer: \'answer\'\\")"\n}\n________\n```\n-Make sure to always enclose the JSON output in triple backticks (```). Please do not add anything other than valid JSON output!\n-Use double quotes for the keys and string values.\n-DO NOT mistaken the "properties" and "type" in the schema as the actual fields in the JSON output.\n-Follow the JSON formatting conventions.'}, prompt_variables: ['input_str', 'tools', 'step_history', 'output_format_str']
      )
      (model_client): GroqAPIClient()
      (output_processors): JsonOutputParser(
         data_class=FunctionExpression, examples=[FunctionExpression(thought='I have finished the task.', action='finish(answer="final answer: \'answer\'")')], exclude_fields=None, return_data_class=True
         (output_format_prompt): Prompt(
         template: Your output should be formatted as a standard JSON instance with the following schema:
         ```
         {{schema}}
         ```
         {% if example %}
         Examples:
         ```
         {{example}}
         ```
         {% endif %}
         -Make sure to always enclose the JSON output in triple backticks (```). Please do not add anything other than valid JSON output!
         -Use double quotes for the keys and string values.
         -DO NOT mistaken the "properties" and "type" in the schema as the actual fields in the JSON output.
         -Follow the JSON formatting conventions., prompt_variables: ['example', 'schema']
         )
         (output_processors): JsonParser()
      )
   )
)

Now, let's run the test function to see the agent in action.

test_react_agent(ModelClientType.GROQ(), llama3_model_kwargs)
test_react_agent(ModelClientType.OPENAI(), gpt_model_kwargs)

Our agent will show developers the core steps via colored printouts, including the input query, the steps, and the final answer. Here is the printout for the first query with llama3 (colors omitted here):

2024-07-10 16:48:47 - [react.py:287:call] - input_query: What is the capital of France? and what is 465 times 321 then add 95297 and then divide by 13.2

2024-07-10 16:48:48 - [react.py:266:_run_one_step] - Step 1:
StepOutput(step=1, action=FunctionExpression(thought="Let's break down the query into subqueries and start with the first one.", action='llm_tool(input="What is the capital of France?")'), function=Function(thought=None, name='llm_tool', args=[], kwargs={'input': 'What is the capital of France?'}), observation='The capital of France is Paris!')
_______

2024-07-10 16:48:49 - [react.py:266:_run_one_step] - Step 2:
StepOutput(step=2, action=FunctionExpression(thought="Now, let's move on to the second subquery.", action='multiply(a=465, b=321)'), function=Function(thought=None, name='multiply', args=[], kwargs={'a': 465, 'b': 321}), observation=149265)
_______

2024-07-10 16:48:49 - [react.py:266:_run_one_step] - Step 3:
StepOutput(step=3, action=FunctionExpression(thought="Now, let's add 95297 to the result.", action='add(a=149265, b=95297)'), function=Function(thought=None, name='add', args=[], kwargs={'a': 149265, 'b': 95297}), observation=244562)
_______

2024-07-10 16:48:50 - [react.py:266:_run_one_step] - Step 4:
StepOutput(step=4, action=FunctionExpression(thought="Now, let's divide the result by 13.2.", action='divide(a=244562, b=13.2)'), function=Function(thought=None, name='divide', args=[], kwargs={'a': 244562, 'b': 13.2}), observation=18527.424242424244)
_______

2024-07-10 16:48:50 - [react.py:266:_run_one_step] - Step 5:
StepOutput(step=5, action=FunctionExpression(thought="Now, let's combine the answers of both subqueries.", action='finish(answer="The capital of France is Paris! and the result of the mathematical operation is 18527.424242424244.")'), function=Function(thought=None, name='finish', args=[], kwargs={'answer': 'The capital of France is Paris! and the result of the mathematical operation is 18527.424242424244.'}), observation='The capital of France is Paris! and the result of the mathematical operation is 18527.424242424244.')
_______
2024-07-10 16:48:50 - [react.py:301:call] - answer:
The capital of France is Paris! and the result of the mathematical operation is 18527.424242424244.

For the second query, the printout is:

2024-07-10 16:48:51 - [react.py:287:call] - input_query: Give me 5 words rhyming with cool, and make a 4-sentence poem using them
2024-07-10 16:48:52 - [react.py:266:_run_one_step] - Step 1:
StepOutput(step=1, action=FunctionExpression(thought="I need to find 5 words that rhyme with 'cool'.", action='llm_tool(input="What are 5 words that rhyme with \'cool\'?")'), function=Function(thought=None, name='llm_tool', args=[], kwargs={'input': "What are 5 words that rhyme with 'cool'?"}), observation='Here are 5 words that rhyme with "cool":\n\n1. Rule\n2. Tool\n3. Fool\n4. Pool\n5. School')
_______

2024-07-10 16:49:00 - [react.py:266:_run_one_step] - Step 2:
StepOutput(step=2, action=FunctionExpression(thought='Now that I have the rhyming words, I need to create a 4-sentence poem using them.', action='llm_tool(input="Create a 4-sentence poem using the words \'rule\', \'tool\', \'fool\', \'pool\', and \'school\'.")'), function=Function(thought=None, name='llm_tool', args=[], kwargs={'input': "Create a 4-sentence poem using the words 'rule', 'tool', 'fool', 'pool', and 'school'."}), observation="Here is a 4-sentence poem using the words 'rule', 'tool', 'fool', 'pool', and 'school':\n\nIn the classroom, we learn to rule,\nWith a pencil as our trusty tool.\nBut if we're not careful, we can be a fool,\nAnd end up swimming in the school pool.")
_______

2024-07-10 16:49:12 - [react.py:266:_run_one_step] - Step 3:
StepOutput(step=3, action=FunctionExpression(thought='I have the poem, now I need to finish the task.', action='finish(answer="Here are 5 words that rhyme with \'cool\': rule, tool, fool, pool, school. Here is a 4-sentence poem using the words: In the classroom, we learn to rule, With a pencil as our trusty tool. But if we\'re not careful, we can be a fool, And end up swimming in the school pool.")'), function=Function(thought=None, name='finish', args=[], kwargs={'answer': "Here are 5 words that rhyme with 'cool': rule, tool, fool, pool, school. Here is a 4-sentence poem using the words: In the classroom, we learn to rule, With a pencil as our trusty tool. But if we're not careful, we can be a fool, And end up swimming in the school pool."}), observation="Here are 5 words that rhyme with 'cool': rule, tool, fool, pool, school. Here is a 4-sentence poem using the words: In the classroom, we learn to rule, With a pencil as our trusty tool. But if we're not careful, we can be a fool, And end up swimming in the school pool.")
_______

2024-07-10 16:49:12 - [react.py:301:call] - answer:
Here are 5 words that rhyme with 'cool': rule, tool, fool, pool, school. Here is a 4-sentence poem using the words: In the classroom, we learn to rule, With a pencil as our trusty tool. But if we're not careful, we can be a fool, And end up swimming in the school pool.

The comparison between the agent and the vanilla LLM response is shown below:

Answer with agent: The capital of France is Paris! and the result of the mathematical operation is 18527.424242424244.
Answer without agent: GeneratorOutput(data="I'd be happy to help you with that!\n\nThe capital of France is Paris.\n\nNow, let's tackle the math problem:\n\n1. 465 × 321 = 149,485\n2. Add 95,297 to that result: 149,485 + 95,297 = 244,782\n3. Divide the result by 13.2: 244,782 ÷ 13.2 = 18,544.09\n\nSo, the answer is 18,544.09!", error=None, usage=None, raw_response="I'd be happy to help you with that!\n\nThe capital of France is Paris.\n\nNow, let's tackle the math problem:\n\n1. 465 × 321 = 149,485\n2. Add 95,297 to that result: 149,485 + 95,297 = 244,782\n3. Divide the result by 13.2: 244,782 ÷ 13.2 = 18,544.09\n\nSo, the answer is 18,544.09!", metadata=None)

For the second query, the comparison is shown below:

Answer with agent: Here are 5 words that rhyme with 'cool': rule, tool, fool, pool, school. Here is a 4-sentence poem using the words: In the classroom, we learn to rule, With a pencil as our trusty tool. But if we're not careful, we can be a fool, And end up swimming in the school pool.
Answer without agent: GeneratorOutput(data='Here are 5 words that rhyme with "cool":\n\n1. rule\n2. tool\n3. fool\n4. pool\n5. school\n\nAnd here\'s a 4-sentence poem using these words:\n\nIn the summer heat, I like to be cool,\nFollowing the rule, I take a dip in the pool.\nI\'m not a fool, I know just what to do,\nI grab my tool and head back to school.', error=None, usage=None, raw_response='Here are 5 words that rhyme with "cool":\n\n1. rule\n2. tool\n3. fool\n4. pool\n5. school\n\nAnd here\'s a 4-sentence poem using these words:\n\nIn the summer heat, I like to be cool,\nFollowing the rule, I take a dip in the pool.\nI\'m not a fool, I know just what to do,\nI grab my tool and head back to school.', metadata=None)

The ReAct agent is particularly helpful for queries that require computation or more complicated reasoning and planning. However, using it on general queries might be overkill, as it may take more steps than necessary to answer them.

Customization#

Template

The first thing you will likely want to customize is the template itself. You can do this by passing your own template to the agent's constructor. We suggest you modify our default template: DEFAULT_REACT_AGENT_SYSTEM_PROMPT.
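
As a hedged sketch, you could start from the default prompt string and pass a modified copy via the documented template parameter. The import path of DEFAULT_REACT_AGENT_SYSTEM_PROMPT is an assumption to verify against your AdalFlow version, and tools, model_client, and model_kwargs stand in for your own setup.

# Hypothetical sketch: start from the default template and tweak the role line.
from adalflow.components.agent.react import DEFAULT_REACT_AGENT_SYSTEM_PROMPT  # import path assumed

my_template = DEFAULT_REACT_AGENT_SYSTEM_PROMPT.replace(
    "You are a helpful assistant.",
    "You are a meticulous math assistant.",
)

react = ReActAgent(
    max_steps=6,
    tools=tools,
    template=my_template,          # documented constructor parameter
    model_client=model_client,
    model_kwargs=model_kwargs,
)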

Examples for a Better Output Format

Second, the examples argument in the constructor allows you to provide more examples to enforce the correct output format. For instance, if we want it to learn how to call multiply correctly, we can pass in a list of correctly formatted FunctionExpression instances. The class method from_function can be used to create a FunctionExpression instance from a function and its keyword arguments.

from adalflow.core.types import FunctionExpression

# generate an example of calling multiply with key-word arguments
example_using_multiply = FunctionExpression.from_function(
     func=multiply,
     thought="Now, let's multiply two numbers.",
     a=3,
     b=4,
 )
examples = [example_using_multiply]

# pass it to the agent
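# Assumed usage: pass them via the documented ``examples`` constructor parameter,
# reusing ``tools``, ``model_client``, and ``model_kwargs`` from the earlier snippet.
react = ReActAgent(
    max_steps=6,
    add_llm_as_fallback=True,
    tools=tools,
    examples=examples,
    model_client=model_client,
    model_kwargs=model_kwargs,
)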

We can visualize how this is passed to the planner prompt via:

react.planner.print_prompt()

The above example will be formatted as:

<OUTPUT_FORMAT>
Your output should be formatted as a standard JSON instance with the following schema:
```
{
   "thought": "Why the function is called (Optional[str]) (optional)",
   "action": "FuncName(<kwargs>) Valid function call expression. Example: \"FuncName(a=1, b=2)\" Follow the data type specified in the function parameters.e.g. for Type object with x,y properties, use \"ObjectType(x=1, y=2) (str) (required)"
}
```
Examples:
```
{
   "thought": "Now, let's multiply two numbers.",
   "action": "multiply(a=3, b=4)"
}
________
{
   "thought": "I have finished the task.",
   "action": "finish(answer=\"final answer: 'answer'\")"
}
________
```
-Make sure to always enclose the JSON output in triple backticks (```). Please do not add anything other than valid JSON output!
-Use double quotes for the keys and string values.
-DO NOT mistaken the "properties" and "type" in the schema as the actual fields in the JSON output.
-Follow the JSON formatting conventions.
</OUTPUT_FORMAT>

Subclass ReActAgent

If you want to customize the agent further, you can subclass ReActAgent and override the methods you want to change.
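
As a minimal, hedged sketch, a subclass could wrap the documented call method to add logging around each query. Any deeper override (for example of the internal step method that appears in the logs above) should be checked against the source of adalflow.components.agent.react.

from adalflow.components.agent import ReActAgent

class LoggingReActAgent(ReActAgent):
    """Sketch: log the query and the final answer around the documented ``call`` method."""

    def call(self, input: str, *args, **kwargs):
        print(f"[LoggingReActAgent] query: {input}")
        answer = super().call(input, *args, **kwargs)
        print(f"[LoggingReActAgent] answer: {answer}")
        return answer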

References