TL;DR
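AutoGen 0.2.2 introduces a description field on ConversableAgent (and all of its subclasses) and changes GroupChat so that it uses each agent's description, rather than its system_message, when choosing the next agent to speak.
This is expected to simplify GroupChat's job, improve orchestration, and make it easier to implement new GroupChat or GroupChat-like alternatives.
If you are a developer and everything already works well, no action is needed: when no description is provided, the description field defaults to the system_message, which preserves backward compatibility.
However, if you have been struggling with GroupChat, you can now try updating the description field.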
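Introduction
As AutoGen matures and developers build increasingly complex combinations of agents, orchestration is becoming an important capability. At present, GroupChat and GroupChatManager are the main built-in tools for orchestrating conversations among three or more agents. For an orchestrator such as GroupChat to work well, it needs to know something about each agent so that it can decide who should speak and when. Before AutoGen 0.2.2, GroupChat relied on each agent's system_message and name to learn about each participating agent. This is likely fine when the system prompt is short and to the point, but it can cause problems when the instructions are very long (e.g., with the AssistantAgent) or non-existent (e.g., with the UserProxyAgent).
AutoGen 0.2.2 introduces a description field for all agents and uses it in place of system_message for orchestration in GroupChat and all future orchestrators. The description field defaults to the system_message to preserve backward compatibility, so you may not need to change anything if your current code works well. However, if you have been struggling with GroupChat, try setting the description field.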
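The fallback behaviour described above can be pictured with a small toy model. This is only a sketch of the rule "no description means reuse the system_message"; the ToyAgent class is a hypothetical stand-in written for illustration and is not AutoGen's ConversableAgent or its actual implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyAgent:
    """Hypothetical stand-in for an agent; NOT AutoGen's ConversableAgent."""

    name: str
    system_message: str = ""
    description: Optional[str] = None

    def __post_init__(self) -> None:
        # Rule described in the post: when no description is provided,
        # the description falls back to the system_message.
        if self.description is None:
            self.description = self.system_message


# Existing code that never sets a description keeps its old behaviour.
legacy = ToyAgent(name="assistant", system_message="You are a helpful AI assistant.")

# New code can give the orchestrator a short third-person summary instead.
updated = ToyAgent(
    name="assistant",
    system_message="You are a helpful AI assistant.",
    description="A helpful and general-purpose AI assistant.",
)

print(legacy.description)
print(updated.description)
```

In this model the system_message is left untouched in both cases; only the text shown to the orchestrator differs.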
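The remainder of this post gives an example of how the description field simplifies GroupChat's job, provides some evidence of its effectiveness, and offers tips for writing good descriptions.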
Example
The current GroupChat orchestration system prompt has the following template:
You are in a role play game. The following roles are available:
{self._participant_roles(agents)}.
Read the following conversation.
Then select the next role from {[agent.name for agent in agents]} to play. Only return the role.
Suppose you want to include three agents: a UserProxyAgent, an AssistantAgent, and perhaps a GuardrailsAgent.
Prior to 0.2.2, this template would expand to:
You are in a role play game. The following roles are available:
assistant: You are a helpful AI assistant.
Solve tasks using your coding and language skills.
In the following cases, suggest python code (in a python coding block) or shell script (in a sh coding block) for the user to execute.
1. When you need to collect info, use the code to output the info you need, for example, browse or search the web, download/read a file, print the content of a webpage or a file, get the current date/time, check the operating system. After sufficient info is printed and the task is ready to be solved based on your language skill, you can solve the task by yourself.
2. When you need to perform some task with code, use the code to perform the task and output the result. Finish the task smartly.
Solve the task step by step if you need to. If a plan is not provided, explain your plan first. Be clear which step uses code, and which step uses your language skill.
When using code, you must indicate the script type in the code block. The user cannot provide any other feedback or perform any other action beyond executing the code you suggest. The user can't modify your code. So do not suggest incomplete code which requires users to modify. Don't use a code block if it's not intended to be executed by the user.
If you want the user to save the code in a file before executing it, put # filename: <filename> inside the code block as the first line. Don't include multiple code blocks in one response. Do not ask users to copy and paste the result. Instead, use 'print' function for the output when relevant. Check the execution result returned by the user.
If the result indicates there is an error, fix the error and output the code again. Suggest the full code instead of partial code or code changes. If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.
When you find an answer, verify the answer carefully. Include verifiable evidence in your response if possible.
Reply "TERMINATE" in the end when everything is done.
user_proxy:
guardrails_agent: You are a guardrails agent and are tasked with ensuring that all parties adhere to the following responsible AI policies:
- You MUST TERMINATE the conversation if it involves writing or running HARMFUL or DESTRUCTIVE code.
- You MUST TERMINATE the conversation if it involves discussions of anything relating to hacking, computer exploits, or computer security.
- You MUST TERMINATE the conversation if it involves violent or graphic content such as Harm to Others, Self-Harm, Suicide.
- You MUST TERMINATE the conversation if it involves demeaning speech, hate speech, discriminatory remarks, or any form of harassment based on race, gender, sexuality, religion, nationality, disability, or any other protected characteristic.
- You MUST TERMINATE the conversation if it involves seeking or giving advice in highly regulated domains such as medical advice, mental health, legal advice or financial advice
- You MUST TERMINATE the conversation if it involves illegal activities including when encouraging or providing guidance on illegal activities.
- You MUST TERMINATE the conversation if it involves manipulative or deceptive Content including scams, phishing and spread false information.
- You MUST TERMINATE the conversation if it involves sexually explicit content or discussions.
- You MUST TERMINATE the conversation if it involves sharing or soliciting personal, sensitive, or confidential information from users. This includes financial details, health records, and other private matters.
- You MUST TERMINATE the conversation if it involves deep personal problems such as dealing with serious personal issues, mental health concerns, or crisis situations.
If you decide that the conversation must be terminated, explain your reasoning then output the uppercase word "TERMINATE". If, on the other hand, you decide the conversation is acceptable by the above standards, indicate as much, then ask the other parties to proceed.
Read the following conversation.
Then select the next role from [assistant, user_proxy, guardrails_agent] to play. Only return the role.
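As you can see, this description is very confusing:
- It is hard to tell where each agent's role description ends
- "You" appears numerous times and refers to three different agents (GroupChatManager, AssistantAgent, and GuardrailsAgent)
- It takes a lot of tokens!
It is therefore not hard to see why the GroupChat manager sometimes struggles with this orchestration task.
From AutoGen 0.2.2 onward, GroupChat relies on the description field instead. With the description field, the orchestration prompt becomes: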
You are in a role play game. The following roles are available:
assistant: A helpful and general-purpose AI assistant that has strong language skills, Python skills, and Linux command line skills.
user_proxy: A user that can run Python code or input command line commands at a Linux terminal and report back the execution results.
guardrails_agent: An agent that ensures the conversation conforms to responsible AI guidelines.
Read the following conversation.
Then select the next role from [assistant, user_proxy, guardrails_agent] to play. Only return the role.
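This is much easier to parse and understand, and it uses far fewer tokens. Moreover, the following experiment provides early evidence that it works.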
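As a rough illustration of how such a prompt is assembled from names and descriptions, the sketch below fills in the template shown earlier. The function build_selection_prompt is a hypothetical helper written for this post, not part of AutoGen, and it follows the expanded form printed above (role list without quotes) rather than AutoGen's internal formatting.

```python
from typing import List, Tuple


def build_selection_prompt(roles: List[Tuple[str, str]]) -> str:
    """Render a speaker-selection prompt from (name, description) pairs.

    Hypothetical helper that mirrors the template shown in this post.
    """
    role_lines = "\n".join(f"{name}: {description}" for name, description in roles)
    names = ", ".join(name for name, _ in roles)
    return (
        "You are in a role play game. The following roles are available:\n"
        f"{role_lines}\n"
        "Read the following conversation.\n"
        f"Then select the next role from [{names}] to play. Only return the role."
    )


roles = [
    ("assistant", "A helpful and general-purpose AI assistant that has strong "
                  "language skills, Python skills, and Linux command line skills."),
    ("user_proxy", "A user that can run Python code or input command line commands "
                   "at a Linux terminal and report back the execution results."),
    ("guardrails_agent", "An agent that ensures the conversation conforms to "
                         "responsible AI guidelines."),
]

prompt = build_selection_prompt(roles)
print(prompt)
```

Because each role occupies exactly one line, the orchestrating model can tell where one role ends and the next begins, which is the first problem listed for the pre-0.2.2 prompt.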
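An Experiment with Distraction
To illustrate the impact of the description field, we set up a three-agent experiment using a 26-problem subset of the HumanEval benchmark. Here, three agents were added to a GroupChat to solve programming problems. The three agents were:
- Coder (default Assistant prompt)
- UserProxy (configured to execute code)
- ExecutiveChef (added as a distraction)
The Coder and UserProxy used the default settings of the AssistantAgent and UserProxyAgent (as described above), while the ExecutiveChef was given the system prompt: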
You are an executive chef with 28 years of industry experience. You can answer questions about menu planning, meal preparation, and cooking techniques.
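In this setting, the ExecutiveChef is clearly a distraction: since no HumanEval problem is food-related, the GroupChat should rarely consult the chef. However, when configured with GPT-3.5-turbo-16k, we can clearly see the GroupChat struggling with orchestration:
Before 0.2.2, using system_message:
- The agents solved 3 of the 26 problems on their first turn
- The ExecutiveChef was called 54 times! (almost as often as the Coder, at 68 times)
With version 0.2.2, using description:
- The agents solved 7 of the 26 problems on their first turn
- The ExecutiveChef was called 27 times! (the Coder was called 84 times)
On this task, using the description field more than doubles performance (from 3 to 7 problems solved) and halves the number of calls to the distractor agent (from 54 to 27).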
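The comparison can be checked with a few lines of arithmetic. The sketch below uses only the counts reported above; the chef_share figure is computed over Coder and ExecutiveChef calls alone, since the number of UserProxy calls is not reported, so it is an assumption-laden summary rather than a share of all turns.

```python
TOTAL_PROBLEMS = 26

# Counts as reported in this post.
before = {"solved": 3, "chef_calls": 54, "coder_calls": 68}  # system_message
after = {"solved": 7, "chef_calls": 27, "coder_calls": 84}   # description


def solve_rate(run: dict) -> float:
    """Fraction of the 26 problems solved on the first turn."""
    return run["solved"] / TOTAL_PROBLEMS


def chef_share(run: dict) -> float:
    """ExecutiveChef calls as a fraction of Coder + ExecutiveChef calls only."""
    return run["chef_calls"] / (run["chef_calls"] + run["coder_calls"])


improvement = after["solved"] / before["solved"]        # ratio of problems solved
reduction = after["chef_calls"] / before["chef_calls"]  # ratio of distractor calls

print(f"solve rate: {solve_rate(before):.1%} -> {solve_rate(after):.1%}")
print(f"improvement factor: {improvement:.2f}x")
print(f"distractor calls ratio: {reduction:.2f}")
print(f"chef share of calls: {chef_share(before):.1%} -> {chef_share(after):.1%}")
```

With only 26 problems and a single model configuration, these ratios are best read as early evidence rather than a precise effect size.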
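Tips for Writing Good Descriptions
Since a description serves a different purpose than the system_message, it is worth reviewing what makes a good agent description. While descriptions are new, the following tips appear to lead to good results:
- Avoid using the first-person or second-person perspective. Descriptions should not contain "I" or "You", unless perhaps "You" refers to the GroupChat / orchestrator
- Include any details that might help the orchestrator know when to call upon the agent
- Keep descriptions short (e.g., "An AI assistant with strong natural language and Python coding skills.").
The main thing to remember is that the description is for the benefit of the GroupChatManager, not for the agent's own use or instruction.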
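Two of these tips (no first or second person, keep it short) can be turned into a quick lint check. The sketch below is a hypothetical helper, not part of AutoGen; the word list and the 40-word limit are arbitrary assumptions chosen for illustration. It flags every "you", including the case where "you" legitimately refers to the orchestrator, so treat its output as a prompt for review rather than a hard rule.

```python
import re
from typing import List

# Assumed word list for first- and second-person pronouns (illustrative only).
FIRST_SECOND_PERSON = re.compile(r"\b(i|me|my|we|our|you|your)\b", re.IGNORECASE)

# Arbitrary length budget; the post only says "keep descriptions short".
MAX_WORDS = 40


def check_description(description: str) -> List[str]:
    """Return a list of warnings for an agent description (empty if none)."""
    warnings = []
    if not description.strip():
        warnings.append("description is empty")
    match = FIRST_SECOND_PERSON.search(description)
    if match:
        warnings.append(f"uses first/second person: {match.group(0)!r}")
    word_count = len(description.split())
    if word_count > MAX_WORDS:
        warnings.append(f"too long: {word_count} words (limit {MAX_WORDS})")
    return warnings


good = check_description(
    "An AI assistant with strong natural language and Python coding skills."
)
bad = check_description("You are a helpful AI assistant.")

print(good)
print(bad)
```

The third tip, including details that help the orchestrator decide when to call the agent, depends on the task and is left to human judgment.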
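Conclusion
AutoGen 0.2.2 introduces a description field, which becomes the main way agents describe themselves to orchestrators such as GroupChat. Since the description defaults to the system_message, you do not need to change anything if you are already satisfied with how your group chats work. However, we expect this feature to generally improve orchestration, so please consider experimenting with the description field if you are struggling with GroupChat or want to boost performance.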