Using AutoGen AgentChat with LangChain-based Custom Clients and Hugging Face Models
Introduction
This notebook demonstrates how to use LangChain's broad support for LLMs to flexibly plug a variety of language models (LLMs) into agent-based conversations in AutoGen.
What we'll cover:
- Creating a custom model client that uses LangChain to load and interact with LLMs
- Configuring AutoGen to use our LangChain-based custom model
- Setting up AutoGen agents with the custom model
- Demonstrating a simple conversation using this setup
While we use a Hugging Face model in this example, the same approach applies to any LLM supported by LangChain, including models from OpenAI, Anthropic, or custom models. This integration opens up many possibilities for creating sophisticated multi-model conversational agents with AutoGen.
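As a quick illustration of that claim, here is a minimal sketch (not part of the original notebook) showing that any LangChain chat model exposes the same `.invoke()` interface the custom client below relies on. The `langchain-openai` package and an `OPENAI_API_KEY` environment variable are assumptions:

```python
# Sketch (assumption, not from the original notebook): any LangChain chat model
# exposes the same .invoke() interface, so it can back the custom client below.
# Assumes `pip install langchain-openai` and OPENAI_API_KEY set in the environment.
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini")  # hypothetical model choice
reply = model.invoke([HumanMessage(content="Say hello.")])
print(reply.content)
```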
Requirements
This notebook requires some extra dependencies, which can be installed via pip:

```
pip install pyautogen torch transformers sentencepiece langchain-huggingface
```

For more information, please refer to the installation guide.
NOTE: Depending on which model you use, you may need to adjust the Agent's default prompts.
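For example, a hedged sketch of supplying your own prompt via the `system_message` argument (the prompt wording here is illustrative only, not the library default):

```python
# Sketch: AssistantAgent accepts a `system_message` argument that replaces
# its built-in default prompt. Shorter prompts often suit small local models.
from autogen import AssistantAgent

assistant = AssistantAgent(
    "assistant",
    system_message="You are a helpful assistant. Reply with concise answers.",
    llm_config={"config_list": config_list_custom},  # defined later in this notebook
)
```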
Setup and Imports
First, let's import the necessary libraries and define our custom model client.
```python
import json
import os
from types import SimpleNamespace

from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
from langchain_huggingface import ChatHuggingFace, HuggingFacePipeline

from autogen import AssistantAgent, UserProxyAgent, config_list_from_json
```
Create and configure the custom model
A custom model class can be created in many ways, but it needs to adhere to the ModelClient protocol and response structure defined in client.py and shown below.
The response protocol has some minimum requirements, but it can be extended to include any additional information that is needed. Message retrieval can therefore be customized, but it needs to return a list of strings or a list of ModelClientResponseProtocol.Choice.Message objects.
```python
from typing import Dict, List, Optional, Protocol, Union

class ModelClient(Protocol):
    """
    A client class must implement the following methods:
    - create must return a response object that implements the ModelClientResponseProtocol
    - cost must return the cost of the response
    - get_usage must return a dict with the following keys:
        - prompt_tokens
        - completion_tokens
        - total_tokens
        - cost
        - model

    This class is used to create a client that can be used by OpenAIWrapper.
    The response returned from create must adhere to the ModelClientResponseProtocol but can be extended however needed.
    The message_retrieval method must be implemented to return a list of str or a list of messages from the response.
    """

    RESPONSE_USAGE_KEYS = ["prompt_tokens", "completion_tokens", "total_tokens", "cost", "model"]

    class ModelClientResponseProtocol(Protocol):
        class Choice(Protocol):
            class Message(Protocol):
                content: Optional[str]

            message: Message

        choices: List[Choice]
        model: str

    def create(self, params) -> ModelClientResponseProtocol: ...

    def message_retrieval(
        self, response: ModelClientResponseProtocol
    ) -> Union[List[str], List["ModelClient.ModelClientResponseProtocol.Choice.Message"]]:
        """
        Retrieve and return a list of strings or a list of Choice.Message from the response.

        NOTE: if a list of Choice.Message is returned, it currently needs to contain the fields of OpenAI's ChatCompletion Message object,
        since that is expected for function or tool calling in the rest of the codebase at the moment, unless a custom agent is being used.
        """
        ...

    def cost(self, response: ModelClientResponseProtocol) -> float: ...

    @staticmethod
    def get_usage(response: ModelClientResponseProtocol) -> Dict:
        """Return usage summary of the response using RESPONSE_USAGE_KEYS."""
        ...
```
Example of a simple custom client
Following the Hugging Face example for using Mistral's Open-Orca.
For the response object, Python's SimpleNamespace is used to create a simple object that can store the response data; however, any object that adheres to the ModelClientResponseProtocol can be used.
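As a quick standard-library illustration, a response-shaped object can be assembled from nested SimpleNamespace instances without defining any classes:

```python
# Standard-library illustration: SimpleNamespace builds the nested
# response shape expected by message_retrieval on the fly.
from types import SimpleNamespace

choice = SimpleNamespace(message=SimpleNamespace(content="hi", function_call=None))
response = SimpleNamespace(choices=[choice], model="demo-model")
print(response.choices[0].message.content)  # -> "hi"
```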
```python
# custom client with custom model loader
class CustomModelClient:
    """Custom model client implementation for LangChain integration with AutoGen."""

    def __init__(self, config, **kwargs):
        """Initialize the CustomModelClient."""
        print(f"CustomModelClient config: {config}")
        self.device = config.get("device", "cpu")
        gen_config_params = config.get("params", {})
        self.model_name = config["model"]
        pipeline = HuggingFacePipeline.from_model_id(
            model_id=self.model_name,
            task="text-generation",
            pipeline_kwargs=gen_config_params,
            device=self.device,
        )
        self.model = ChatHuggingFace(llm=pipeline)
        print(f"Loaded model {config['model']} to {self.device}")

    def _to_chatml_format(self, message):
        """Convert an OpenAI-style message dict to a LangChain message object."""
        if message["role"] == "system":
            return SystemMessage(content=message["content"])
        if message["role"] == "assistant":
            return AIMessage(content=message["content"])
        if message["role"] == "user":
            return HumanMessage(content=message["content"])
        raise ValueError(f"Unknown message role: {message['role']}")

    def create(self, params):
        """Create a response using the model."""
        if params.get("stream", False) and "messages" in params:
            raise NotImplementedError("Local models do not support streaming.")

        num_of_responses = params.get("n", 1)
        response = SimpleNamespace()

        inputs = [self._to_chatml_format(m) for m in params["messages"]]
        response.choices = []
        response.model = self.model_name

        for _ in range(num_of_responses):
            outputs = self.model.invoke(inputs)
            text = outputs.content
            choice = SimpleNamespace()
            choice.message = SimpleNamespace()
            choice.message.content = text
            choice.message.function_call = None
            response.choices.append(choice)

        return response

    def message_retrieval(self, response):
        """Retrieve the assistant messages from the response."""
        return [choice.message.content for choice in response.choices]

    def cost(self, response) -> float:
        """Local models are free to run, so report zero cost."""
        response.cost = 0
        return 0

    @staticmethod
    def get_usage(response):
        """Return an empty dict; token usage is not tracked for this client."""
        return {}
```
Set your API Endpoint
The config_list_from_json function loads a list of configurations from an environment variable or a JSON file.
It first looks for an environment variable with the specified name ("OAI_CONFIG_LIST" in this case), which needs to be a valid JSON string. If that variable is not found, it looks for a JSON file with the same name. It filters the configs by model (you can also filter by other keys).
The JSON looks like the following:
```json
[
    {
        "model": "gpt-4",
        "api_key": "<your OpenAI API key here>"
    },
    {
        "model": "gpt-4",
        "api_key": "<your Azure OpenAI API key here>",
        "base_url": "<your Azure OpenAI API base here>",
        "api_type": "azure",
        "api_version": "2024-02-01"
    },
    {
        "model": "gpt-4-32k",
        "api_key": "<your Azure OpenAI API key here>",
        "base_url": "<your Azure OpenAI API base here>",
        "api_type": "azure",
        "api_version": "2024-02-01"
    }
]
```
You can set the value of config_list in any way you prefer. Please refer to this notebook for full code examples of the different methods.
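For example, a minimal sketch of an alternative to the environment-variable route shown above, building the list directly in Python (the `OPENAI_API_KEY` variable name is an assumption):

```python
# Sketch (assumption): building the list inline is equivalent to loading it
# from the OAI_CONFIG_LIST environment variable or JSON file shown above.
import os

config_list = [
    {
        "model": "gpt-4",
        "api_key": os.environ.get("OPENAI_API_KEY", "<your OpenAI API key here>"),
    }
]
```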
Set the config for the custom model
You can add any parameters that are needed for loading the custom model in the same configuration list.
It is important to add the model_client_cls field and set it to a string that corresponds to the class name: "CustomModelClient".
os.environ["OAI_CONFIG_LIST"] = json.dumps(
[
{
"model": "mistralai/Mistral-7B-Instruct-v0.2",
"model_client_cls": "CustomModelClient",
"device": 0,
"n": 1,
"params": {
"max_new_tokens": 500,
"top_k": 50,
"temperature": 0.1,
"do_sample": True,
"return_full_text": False,
},
}
]
)
config_list_custom = config_list_from_json(
"OAI_CONFIG_LIST",
filter_dict={"model_client_cls": ["CustomModelClient"]},
)
```python
import getpass

from huggingface_hub import login

# Mistral-7B-Instruct-v0.2 is a gated model which requires an API token to access
login(token=getpass.getpass("Enter your HuggingFace API Token"))
```
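If you prefer not to enter the token interactively, here is a sketch that reads it from an environment variable instead (the `HF_TOKEN` name is an assumption; use whatever variable you exported):

```python
# Sketch: read the Hugging Face token from an environment variable instead of prompting.
# HF_TOKEN is an assumed variable name, not a requirement of huggingface_hub.
import os

from huggingface_hub import login

login(token=os.environ["HF_TOKEN"])
```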
Construct Agents
Construct a simple conversation between a User proxy and the Assistant agent.
```python
assistant = AssistantAgent("assistant", llm_config={"config_list": config_list_custom})
user_proxy = UserProxyAgent("user_proxy", code_execution_config=False)
```
```
[autogen.oai.client: 09-01 12:53:51] {484} INFO - Detected custom model client in config: CustomModelClient, model client can not be used until register_model_client is called.
```
Register the custom client class to the assistant agent
```python
assistant.register_model_client(model_client_cls=CustomModelClient)
```
```
CustomModelClient config: {'model': 'microsoft/Phi-3.5-mini-instruct', 'model_client_cls': 'CustomModelClient', 'device': 0, 'n': 1, 'params': {'max_new_tokens': 100, 'top_k': 50, 'temperature': 0.1, 'do_sample': True, 'return_full_text': False}}
Loaded model microsoft/Phi-3.5-mini-instruct to 0
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:07<00:00,  3.51s/it]
```
```python
user_proxy.initiate_chat(assistant, message="Write python code to print Hello World!")
```
Write python code to print Hello World!
--------------------------------------------------------------------------------
```python
# filename: hello_world.py
print("Hello World!")
```
To execute this code, save it in a file named `hello_world.py`. Then, open your terminal or command prompt, navigate to the directory containing the file, and run the following command:
```
python hello_world.py
```
The output should be:
```
Hello World!
```
If you encounter any errors,
--------------------------------------------------------------------------------
```
You are not running the flash-attention implementation, expect numerical differences.
ChatResult(chat_id=None, chat_history=[{'content': 'Write python code to print Hello World!', 'role': 'assistant', 'name': 'user_proxy'}, {'content': ' ```python\n# filename: hello_world.py\n\nprint("Hello World!")\n```\n\nTo execute this code, save it in a file named `hello_world.py`. Then, open your terminal or command prompt, navigate to the directory containing the file, and run the following command:\n\n```\npython hello_world.py\n```\n\nThe output should be:\n\n```\nHello World!\n```\n\nIf you encounter any errors,', 'role': 'user', 'name': 'assistant'}], summary=' ```python\n# filename: hello_world.py\n\nprint("Hello World!")\n```\n\nTo execute this code, save it in a file named `hello_world.py`. Then, open your terminal or command prompt, navigate to the directory containing the file, and run the following command:\n\n```\npython hello_world.py\n```\n\nThe output should be:\n\n```\nHello World!\n```\n\nIf you encounter any errors,', cost={'usage_including_cached_inference': {'total_cost': 0}, 'usage_excluding_cached_inference': {'total_cost': 0}}, human_input=['exit'])
```
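The returned ChatResult can also be inspected programmatically; here is a short sketch based only on the fields visible in the repr above:

```python
# Sketch: initiate_chat returns a ChatResult; assigning it lets you inspect
# the fields shown in the repr above.
chat_result = user_proxy.initiate_chat(assistant, message="Write python code to print Hello World!")

print(chat_result.summary)                       # summary of the conversation
print(chat_result.chat_history[-1]["content"])   # content of the last message
print(chat_result.cost)                          # cost bookkeeping (0 for local models)
```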