Using Transform Messages during Speaker Selection
When using "auto" mode for speaker selection in a group chat, a nested chat is used to determine the next speaker. This nested chat includes all of the group chat's messages, which can result in a significant number of tokens that the LLM must process when determining the next speaker. As conversations progress, it can become challenging to keep the context length within the LLM's workable window. Furthermore, reducing the overall number of tokens will improve inference time and reduce token costs.

Using Transform Messages, you gain control over which messages are used for speaker selection, as well as the context length of each message and of the history overall.
All of the transforms available for Transform Messages, such as MessageHistoryLimiter, MessageTokenLimiter, and TextMessageCompressor, can be applied to the speaker selection nested chat.
How do you apply them?
When instantiating your GroupChat object, all you need to do is assign a TransformMessages object to the select_speaker_transform_messages parameter, and the transforms within that object will be applied to the nested speaker selection chats.

And, because you are passing in a TransformMessages object, multiple transforms can be applied to that nested chat.

As part of the nested chat, an agent called 'checking_agent' is used to direct the LLM in selecting the next speaker. It is best to avoid compressing or truncating the content from this agent. How to do this is shown in the text compression example below.
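To make the wiring concrete before building up the transforms step by step, here is a minimal sketch; the agent names, the llm_config=False setting, and the single MessageHistoryLimiter transform are purely illustrative placeholders.

# Minimal sketch: the TransformMessages object is assigned to the
# select_speaker_transform_messages parameter of GroupChat, so its
# transforms apply only to the nested speaker selection chat.
import autogen
from autogen.agentchat.contrib.capabilities import transform_messages, transforms

# Placeholder agents (no LLM configured) purely to make the sketch runnable
agent_a = autogen.ConversableAgent("agent_a", llm_config=False)
agent_b = autogen.ConversableAgent("agent_b", llm_config=False)

speaker_selection_transforms = transform_messages.TransformMessages(
    transforms=[transforms.MessageHistoryLimiter(max_messages=10)],
)

group_chat = autogen.GroupChat(
    agents=[agent_a, agent_b],
    messages=[],
    speaker_selection_method="auto",  # the nested chat is used in "auto" mode
    select_speaker_transform_messages=speaker_selection_transforms,
)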
Creating transforms for speaker selection in a GroupChat
We will progressively create a TransformMessages object to show how you can build up the transforms for speaker selection.

Each iteration will replace the previous one, enabling you to use the code in each cell as-is.

Importantly, the transforms are applied in the order in which they appear in the transforms list.
# Start by importing the transform capabilities
import autogen
from autogen.agentchat.contrib.capabilities import transform_messages, transforms
# Limit the number of messages
# Let's start by limiting the number of messages to consider for speaker selection using a
# MessageHistoryLimiter transform. This example will use the latest 10 messages.
select_speaker_transforms = transform_messages.TransformMessages(
    transforms=[
        transforms.MessageHistoryLimiter(max_messages=10),
    ]
)
# Compress messages through an LLM
# An interesting and very powerful method of reducing tokens is by "compressing" the text of
# a message by using an LLM that's specifically designed to do that. The default LLM used for
# this purpose is LLMLingua (https://github.com/microsoft/LLMLingua) and it aims to reduce the
# number of tokens without reducing the message's meaning. We use the TextMessageCompressor
# transform to compress messages.
# There are multiple LLMLingua models available and it defaults to the first version, LLMLingua.
# This example will use LLMLingua-2 (note the use_llmlingua2=True flag in the compression
# arguments below), a smaller, faster model designed for task-agnostic compression.
# Create the compression arguments, which allow us to specify the model and other related
# parameters, such as whether to use the CPU or GPU.
select_speaker_compression_args = dict(
    model_name="microsoft/llmlingua-2-xlm-roberta-large-meetingbank", use_llmlingua2=True, device_map="cpu"
)
# Now we can add the TextMessageCompressor as the second step
# Important notes on the parameters used:
# min_tokens - will only apply text compression if the message has at least 1,000 tokens
# cache - enables caching, if a message has previously been compressed it will use the
# cached version instead of recompressing it (making it much faster)
# filter_dict - to minimise the chance of compressing key information, we can include or
# exclude messages based on role and name.
# Here, we are excluding any 'system' messages as well as any messages from
# 'ceo' (just for example) and the 'checking_agent', which is an agent in the
# nested chat speaker selection chat. Change the 'ceo' name or add additional
# agent names for any agents that have critical content.
# exclude_filter - As we are setting this to True, the filter will be an exclusion filter.
# Import the cache functionality
from autogen.cache.in_memory_cache import InMemoryCache
select_speaker_transforms = transform_messages.TransformMessages(
    transforms=[
        transforms.MessageHistoryLimiter(max_messages=10),
        transforms.TextMessageCompressor(
            min_tokens=1000,
            text_compressor=transforms.LLMLingua(select_speaker_compression_args, structured_compression=True),
            cache=InMemoryCache(seed=43),
            filter_dict={"role": ["system"], "name": ["ceo", "checking_agent"]},
            exclude_filter=True,
        ),
    ]
)
# Limit the total number of tokens and tokens per message
# As a final example, we can manage the total tokens and individual message tokens. We have added a
# MessageTokenLimiter transform that will limit the total number of tokens for the messages to
# 3,000 with a maximum of 500 per individual message. Additionally, if a message is less than 300
# tokens it will not be truncated.
select_speaker_compression_args = dict(
    model_name="microsoft/llmlingua-2-xlm-roberta-large-meetingbank", use_llmlingua2=True, device_map="cpu"
)

select_speaker_transforms = transform_messages.TransformMessages(
    transforms=[
        transforms.MessageHistoryLimiter(max_messages=10),
        transforms.TextMessageCompressor(
            min_tokens=1000,
            text_compressor=transforms.LLMLingua(select_speaker_compression_args, structured_compression=True),
            cache=InMemoryCache(seed=43),
            filter_dict={"role": ["system"], "name": ["ceo", "checking_agent"]},
            exclude_filter=True,
        ),
        transforms.MessageTokenLimiter(max_tokens=3000, max_tokens_per_message=500, min_tokens=300),
    ]
)
# Now, we apply the transforms to a group chat. We do this by assigning the message
# transforms from above to the `select_speaker_transform_messages` parameter on the GroupChat.
import os
llm_config = {
    "config_list": [{"model": "gpt-4", "api_key": os.environ["OPENAI_API_KEY"]}],
}
# Define your agents
chief_executive_officer = autogen.ConversableAgent(
    "ceo",
    llm_config=llm_config,
    max_consecutive_auto_reply=1,
    system_message="You are leading this group chat, and the business, as the chief executive officer.",
)

general_manager = autogen.ConversableAgent(
    "gm",
    llm_config=llm_config,
    max_consecutive_auto_reply=1,
    system_message="You are the general manager of the business, running the day-to-day operations.",
)

financial_controller = autogen.ConversableAgent(
    "fin_controller",
    llm_config=llm_config,
    max_consecutive_auto_reply=1,
    system_message="You are the financial controller, ensuring all financial matters are managed accordingly.",
)
your_group_chat = autogen.GroupChat(
    agents=[chief_executive_officer, general_manager, financial_controller],
    messages=[],
    select_speaker_transform_messages=select_speaker_transforms,
)
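With the transforms assigned, the group chat runs as usual via a GroupChatManager; each time a next speaker is selected in "auto" mode, the transforms above are applied to the nested speaker selection chat. A brief sketch follows; the opening message is illustrative only.

# Sketch: run the group chat through a GroupChatManager.
# The speaker selection transforms are applied automatically whenever
# the manager needs to pick the next speaker.
group_chat_manager = autogen.GroupChatManager(
    groupchat=your_group_chat,
    llm_config=llm_config,
)

chat_result = chief_executive_officer.initiate_chat(
    group_chat_manager,
    message="Let's discuss the budget priorities for next quarter.",  # illustrative
)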