Intro to Transform Messages

Why do we need to handle long contexts? The need arises from several constraints and requirements:

  1. Token limits: LLMs have token limits that restrict the amount of text data they can process. If we exceed these limits, we may encounter errors or incur additional costs. By preprocessing the chat history, we can ensure that we stay within an acceptable token range.

  2. Context relevance: As the conversation progresses, retaining the entire chat history may become less relevant, or even counterproductive. Keeping only the most recent and relevant messages helps the LLM focus on the most important context, leading to more accurate and relevant responses.

  3. Efficiency: Processing long contexts can consume more computational resources, leading to slower response times.
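The idea behind points 1 and 2 can be sketched in a few lines of plain Python (this is an illustration, not the AutoGen API): keep the prompt within budget by retaining only the most recent messages.

```python
# Minimal sketch of history trimming (plain Python, not the AutoGen API):
# keep only the most recent messages so the prompt stays within a budget.
def trim_history(messages, max_messages=3):
    """Return the `max_messages` most recent entries of a chat history."""
    return messages[-max_messages:]

history = [{"role": "user", "content": f"message {i}"} for i in range(10)]
trimmed = trim_history(history)
print(len(trimmed))            # 3
print(trimmed[-1]["content"])  # message 9
```

The transforms introduced below implement this idea (and more) in a reusable, composable form.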

Transform Messages Capability

The TransformMessages capability is designed to modify incoming messages before they are processed by the LLM agent. This can include limiting the number of messages, truncating messages to meet token limits, and more.

Requirements

Install autogen-agentchat:

pip install autogen-agentchat~=0.2

For more information, please refer to the installation guide.

Exploring and Understanding Transformations

Let's start by exploring the available transformations and understanding how they work. We begin by importing the required modules.

import copy
import pprint

from autogen.agentchat.contrib.capabilities import transforms

Example 1: Limiting the Total Number of Messages

Consider a scenario where you want to limit the context history to only the most recent messages to maintain efficiency and relevance. You can achieve this with the MessageHistoryLimiter transformation:

# Limit the message history to the 3 most recent messages
max_msg_transform = transforms.MessageHistoryLimiter(max_messages=3)

messages = [
    {"role": "user", "content": "hello"},
    {"role": "assistant", "content": [{"type": "text", "text": "there"}]},
    {"role": "user", "content": "how"},
    {"role": "assistant", "content": [{"type": "text", "text": "are you doing?"}]},
    {"role": "user", "content": "very very very very very very long string"},
]

processed_messages = max_msg_transform.apply_transform(copy.deepcopy(messages))
pprint.pprint(processed_messages)
[{'content': 'how', 'role': 'user'},
 {'content': [{'text': 'are you doing?', 'type': 'text'}], 'role': 'assistant'},
 {'content': 'very very very very very very long string', 'role': 'user'}]

By applying the MessageHistoryLimiter, we can see that we were able to limit the context history to the 3 most recent messages. However, if the splitting point falls between a "tool_calls" and "tool" pair, the complete pair will be included to comply with the OpenAI API call constraints.

max_msg_transform = transforms.MessageHistoryLimiter(max_messages=3)

messages = [
    {"role": "user", "content": "hello"},
    {"role": "tool_calls", "content": "calling_tool"},
    {"role": "tool", "content": "tool_response"},
    {"role": "user", "content": "how are you"},
    {"role": "assistant", "content": [{"type": "text", "text": "are you doing?"}]},
]

processed_messages = max_msg_transform.apply_transform(copy.deepcopy(messages))
pprint.pprint(processed_messages)
[{'content': 'calling_tool', 'role': 'tool_calls'},
 {'content': 'tool_response', 'role': 'tool'},
 {'content': 'how are you', 'role': 'user'},
 {'content': [{'text': 'are you doing?', 'type': 'text'}], 'role': 'assistant'}]

Example 2: Limiting the Number of Tokens

To adhere to token limitations, use the MessageTokenLimiter transformation. This limits both the number of tokens per message and the total number of tokens across all messages. Additionally, a min_tokens threshold can be applied:

# Limit the token limit per message to 3 tokens
token_limit_transform = transforms.MessageTokenLimiter(max_tokens_per_message=3, min_tokens=10)

processed_messages = token_limit_transform.apply_transform(copy.deepcopy(messages))

pprint.pprint(processed_messages)
[{'content': 'hello', 'role': 'user'},
 {'content': [{'text': 'there', 'type': 'text'}], 'role': 'assistant'},
 {'content': 'how', 'role': 'user'},
 {'content': [{'text': 'are you doing', 'type': 'text'}], 'role': 'assistant'},
 {'content': 'very very very', 'role': 'user'}]

We can see that we were able to limit the number of tokens to 3, which is equivalent to 3 words in this example.
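The per-message truncation above can be sketched with a word-based approximation (the real MessageTokenLimiter counts model tokens, not words; this is only an illustration of the idea):

```python
# Hedged sketch of per-message truncation, approximating tokens with
# whitespace-separated words (the real transform counts model tokens).
def truncate_message(text, max_tokens_per_message=3):
    words = text.split()
    return " ".join(words[:max_tokens_per_message])

print(truncate_message("very very very very very very long string"))  # very very very
```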

In the following example, we will explore the effect of the min_tokens threshold.

short_messages = [
    {"role": "user", "content": "hello there, how are you?"},
    {"role": "assistant", "content": [{"type": "text", "text": "hello"}]},
]

processed_short_messages = token_limit_transform.apply_transform(copy.deepcopy(short_messages))

pprint.pprint(processed_short_messages)
[{'content': 'hello there, how are you?', 'role': 'user'},
 {'content': [{'text': 'hello', 'type': 'text'}], 'role': 'assistant'}]

We can see that no transformation was applied, because the threshold of 10 total tokens was not reached.
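The min_tokens guard can be sketched in plain Python, again approximating tokens with whitespace-separated words (the real transform counts model tokens): sum the counts across all messages and skip the transformation entirely if the total is below the threshold.

```python
# Hedged sketch of the min_tokens guard, approximating tokens by whitespace
# words (the real MessageTokenLimiter counts model tokens, e.g. via tiktoken).
def total_words(messages):
    count = 0
    for message in messages:
        content = message["content"]
        if isinstance(content, str):
            count += len(content.split())
        else:  # list of {"type": "text", ...} items
            count += sum(len(item["text"].split()) for item in content if item.get("type") == "text")
    return count

short_messages = [
    {"role": "user", "content": "hello there, how are you?"},
    {"role": "assistant", "content": [{"type": "text", "text": "hello"}]},
]
# 5 + 1 = 6 words, below a threshold of 10, so the limiter leaves the messages untouched
print(total_words(short_messages))  # 6
```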

Applying Transformations Using Agents

So far, we have only tested the MessageHistoryLimiter and MessageTokenLimiter transformations individually. Let's now test these transformations with AutoGen's agents.

Setting the Stage

import os
import copy

import autogen
from autogen.agentchat.contrib.capabilities import transform_messages, transforms
from typing import Dict, List

config_list = [{"model": "gpt-3.5-turbo", "api_key": os.getenv("OPENAI_API_KEY")}]

# Define your agents: the user proxy and an assistant
assistant = autogen.AssistantAgent(
    "assistant",
    llm_config={"config_list": config_list},
)
user_proxy = autogen.UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",
    is_termination_msg=lambda x: "TERMINATE" in x.get("content", ""),
    max_consecutive_auto_reply=10,
)
tip

Learn more about configuring LLMs for agents here.

We first need to write the test function, which creates a very long chat history by exchanging messages between the assistant and the user proxy agent, and then attempts to initiate a new chat without clearing the history, which is likely to trigger an error due to token limits.

# Create a very long chat history that is bound to cause a crash for gpt 3.5
def test(assistant: autogen.ConversableAgent, user_proxy: autogen.UserProxyAgent):
    for _ in range(1000):
        # define fake, very long messages
        assistant_msg = {"role": "assistant", "content": "test " * 1000}
        user_msg = {"role": "user", "content": ""}

        assistant.send(assistant_msg, user_proxy, request_reply=False, silent=True)
        user_proxy.send(user_msg, assistant, request_reply=False, silent=True)

    try:
        user_proxy.initiate_chat(assistant, message="plot and save a graph of x^2 from -10 to 10", clear_history=False)
    except Exception as e:
        print(f"Encountered an error with the base assistant: \n{e}")

The first run uses the default implementation, where the agent does not have the TransformMessages capability.

test(assistant, user_proxy)

Running this test results in an error, due to the large number of tokens sent to OpenAI's gpt 3.5.

user_proxy (to assistant):

plot and save a graph of x^2 from -10 to 10

--------------------------------------------------------------------------------
Encountered an error with the base assistant
Error code: 429 - {'error': {'message': 'Request too large for gpt-3.5-turbo in organization org-U58JZBsXUVAJPlx2MtPYmdx1 on tokens per min (TPM): Limit 60000, Requested 1252546. The input or output tokens must be reduced in order to run successfully. Visit https://platform.openai.com/account/rate-limits to learn more.', 'type': 'tokens', 'param': None, 'code': 'rate_limit_exceeded'}}

Now let's add the TransformMessages capability to the assistant and run the same test.

context_handling = transform_messages.TransformMessages(
    transforms=[
        transforms.MessageHistoryLimiter(max_messages=10),
        transforms.MessageTokenLimiter(max_tokens=1000, max_tokens_per_message=50, min_tokens=500),
    ]
)
context_handling.add_to_agent(assistant)

test(assistant, user_proxy)

The following console output shows that the agent is now able to handle the large number of tokens sent to OpenAI's gpt 3.5.

user_proxy (to assistant):

plot and save a graph of x^2 from -10 to 10

--------------------------------------------------------------------------------
Truncated 3804 tokens. Tokens reduced from 4019 to 215
assistant (to user_proxy):

To plot and save a graph of \( x^2 \) from -10 to 10, we can use Python with the matplotlib library. Here's the code to generate the plot and save it to a file named "plot.png":

```python
# filename: plot_quadratic.py
import matplotlib.pyplot as plt
import numpy as np

# Create an array of x values from -10 to 10
x = np.linspace(-10, 10, 100)
y = x**2

# Plot the graph
plt.plot(x, y)
plt.xlabel('x')
plt.ylabel('x^2')
plt.title('Plot of x^2')
plt.grid(True)

# Save the plot as an image file
plt.savefig('plot.png')

# Display the plot
plt.show()
```

You can run this script in a Python environment. It will generate a plot of \( x^2 \) from -10 to 10 and save it as "plot.png" in the same directory where the script is executed.

Execute the Python script to create and save the graph.
After executing the code, you should see a file named "plot.png" in the current directory, containing the graph of \( x^2 \) from -10 to 10. You can view this file to see the plotted graph.

Is there anything else you would like to do or need help with?
If not, you can type "TERMINATE" to end our conversation.

---

Creating Custom Transformations to Handle Sensitive Content

You can create custom transformations by implementing the MessageTransform protocol, which provides flexibility to handle a wide range of use cases. One practical application is a custom transformation that redacts sensitive information, such as API keys, passwords, or personal data, from the chat history or logs. This ensures that confidential data is not inadvertently exposed, enhancing the security and privacy of your conversational AI system.

We will demonstrate this by implementing a custom transformation called MessageRedact that detects and redacts OpenAI API keys in the conversation history. This transformation is especially useful when you want to prevent accidental leaks of API keys, which could compromise the security of your system.

import os
import pprint
import copy
import re

import autogen
from autogen.agentchat.contrib.capabilities import transform_messages, transforms
from typing import Dict, List, Tuple

# The transform must adhere to transform_messages.MessageTransform protocol.
class MessageRedact:
    def __init__(self):
        self._openai_key_pattern = r"sk-([a-zA-Z0-9]{48})"
        self._replacement_string = "REDACTED"

    def apply_transform(self, messages: List[Dict]) -> List[Dict]:
        temp_messages = copy.deepcopy(messages)

        for message in temp_messages:
            if isinstance(message["content"], str):
                message["content"] = re.sub(self._openai_key_pattern, self._replacement_string, message["content"])
            elif isinstance(message["content"], list):
                for item in message["content"]:
                    if item["type"] == "text":
                        item["text"] = re.sub(self._openai_key_pattern, self._replacement_string, item["text"])
        return temp_messages

    def get_logs(self, pre_transform_messages: List[Dict], post_transform_messages: List[Dict]) -> Tuple[str, bool]:
        keys_redacted = self._count_redacted(post_transform_messages) - self._count_redacted(pre_transform_messages)
        if keys_redacted > 0:
            return f"Redacted {keys_redacted} OpenAI API keys.", True
        return "", False

    def _count_redacted(self, messages: List[Dict]) -> int:
        # Counts occurrences of "REDACTED" in message content
        count = 0
        for message in messages:
            if isinstance(message["content"], str):
                if "REDACTED" in message["content"]:
                    count += 1
            elif isinstance(message["content"], list):
                for item in message["content"]:
                    if isinstance(item, dict) and "text" in item:
                        if "REDACTED" in item["text"]:
                            count += 1
        return count


assistant_with_redact = autogen.AssistantAgent(
    "assistant",
    llm_config={"config_list": config_list},
    max_consecutive_auto_reply=1,
)
redact_handling = transform_messages.TransformMessages(transforms=[MessageRedact()])

redact_handling.add_to_agent(assistant_with_redact)

user_proxy = autogen.UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=1,
)

messages = [
    {"content": "api key 1 = sk-7nwt00xv6fuegfu3gnwmhrgxvuc1cyrhxcq1quur9zvf05fy"},  # Don't worry, the key is randomly generated
    {"content": [{"type": "text", "text": "API key 2 = sk-9wi0gf1j2rz6utaqd3ww3o6c1h1n28wviypk7bd81wlj95an"}]},
]

for message in messages:
    user_proxy.send(message, assistant_with_redact, request_reply=False, silent=True)

result = user_proxy.initiate_chat(
    assistant_with_redact, message="What are the two API keys that I just provided", clear_history=False
)

user_proxy (to assistant):

What are the two API keys that I just provided

--------------------------------------------------------------------------------
Redacted 2 OpenAI API keys.
assistant (to user_proxy):

As an AI, I must inform you that it is not safe to share API keys publicly as they can be used to access your private data or services that can incur costs. Given that you've typed "REDACTED" instead of the actual keys, it seems you are aware of the privacy concerns and are likely testing my response or simulating an exchange without exposing real credentials, which is a good practice for privacy and security reasons.

To respond directly to your direct question: The two API keys you provided are both placeholders indicated by the text "REDACTED", and not actual API keys. If these were real keys, I would have reiterated the importance of keeping them secure and would not display them here.

Remember to keep your actual API keys confidential to prevent unauthorized use. If you've accidentally exposed real API keys, you should revoke or regenerate them as soon as possible through the corresponding service's API management console.

--------------------------------------------------------------------------------
user_proxy (to assistant):



--------------------------------------------------------------------------------
Redacted 2 OpenAI API keys.
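
The redaction regex can also be sanity-checked on its own, outside of any agent. Note that the pattern assumes the legacy "sk-" plus 48 alphanumeric characters key format; newer OpenAI key formats would need a broader pattern. The sample key below is fabricated for illustration.

```python
import re

# Standalone check of the redaction regex used by MessageRedact above.
# Assumption: legacy "sk-" + 48 alphanumeric characters key format.
pattern = r"sk-([a-zA-Z0-9]{48})"
sample = "api key 1 = sk-" + "a" * 48  # a fake key of the right shape
print(re.sub(pattern, "REDACTED", sample))  # api key 1 = REDACTED
```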