Amazon Bedrock
AutoGen allows you to run inference with Amazon's generative AI service, Bedrock, which supports a number of open-weight models as well as Amazon's own models.
Amazon Bedrock supports models from providers such as Meta, Anthropic, Cohere, and Mistral.
In this notebook, we demonstrate how to use Anthropic's Sonnet model for AgentChat in AutoGen.
Model features / support
Amazon Bedrock supports quite a number of models, not only for text generation but also for image classification and generation. Not all features are supported by AutoGen or by the Converse API it uses. See Amazon's documentation for the features the Converse API supports.
At this point in time, AutoGen supports text generation and image classification (passing images to the LLM).
It does not yet support image generation (contribute).
Requirements
To use Amazon Bedrock with AutoGen, first you need to install the autogen-agentchat[bedrock] package.
Pricing
Given the number of models supported and the fact that costs are per-region, it isn't feasible to maintain the cost of every model+region combination within the AutoGen implementation. You are therefore recommended to add the following to your config, the cost per 1,000 input and output tokens respectively:
{
...
"price": [0.003, 0.015]
...
}
Amazon Bedrock's pricing can be found here.
# If you need to install AutoGen with Amazon Bedrock
!pip install "autogen-agentchat[bedrock]~=0.2"
Set up the config for Amazon Bedrock
Amazon's Bedrock does not use the api_key for authentication as other cloud inference providers do; instead, it uses a number of access, token, and profile values. These fields need to be added to your client configuration. Please check Amazon Bedrock's documentation to determine which ones you will need.
The available parameters are:
- aws_region (required)
- aws_access_key (or environment variable: AWS_ACCESS_KEY)
- aws_secret_key (or environment variable: AWS_SECRET_KEY)
- aws_session_token (or environment variable: AWS_SESSION_TOKEN)
- aws_profile_name
Beyond the authentication credentials, the only mandatory parameters are api_type and model.
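If you'd rather not embed credentials in the config itself, here is a minimal sketch of a configuration that relies on the environment variables listed above (it assumes AWS_ACCESS_KEY and AWS_SECRET_KEY, and optionally AWS_SESSION_TOKEN, are already exported in your environment):
# A minimal sketch of a config entry that relies on exported credentials
# (AWS_ACCESS_KEY / AWS_SECRET_KEY / AWS_SESSION_TOKEN) instead of
# embedding them, so only api_type, model, and aws_region are set here
config_list_bedrock = [
    {
        "api_type": "bedrock",
        "model": "anthropic.claude-3-sonnet-20240229-v1:0",
        "aws_region": "us-east-1",
    }
]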
The following parameters are common across all of the models used:
- temperature
- topP
- maxTokens
You can also include parameters specific to the model you are using (see the model's detail page within Amazon's documentation for more information); the four supported additional parameters are:
- top_p
- top_k
- k
- seed
An additional parameter can be added that denotes whether the model supports a system prompt (i.e., the system message is passed to the model as a separate parameter rather than being included in the message list). This defaults to True, so set it to False if your model (e.g., Mistral's Instruct models) doesn't support it:
- supports_system_prompts
It is important to add the api_type field and set it to the string that corresponds to the client type used: bedrock.
Example:
[
{
"api_type": "bedrock",
"model": "amazon.titan-text-premier-v1:0",
"aws_region": "us-east-1"
"aws_access_key": "",
"aws_secret_key": "",
"aws_session_token": "",
"aws_profile_name": "",
},
{
"api_type": "bedrock",
"model": "anthropic.claude-3-sonnet-20240229-v1:0",
"aws_region": "us-east-1"
"aws_access_key": "",
"aws_secret_key": "",
"aws_session_token": "",
"aws_profile_name": "",
"temperature": 0.5,
"topP": 0.2,
"maxTokens": 250,
},
{
"api_type": "bedrock",
"model": "mistral.mixtral-8x7b-instruct-v0:1",
"aws_region": "us-east-1"
"aws_access_key": "",
"aws_secret_key": "",
"supports_system_prompts": False, # Mistral Instruct models don't support a separate system prompt
"price": [0.00045, 0.0007] # Specific pricing for this model/region
}
]
Two-Agent Coding Example
Configuration
Start with our configuration - we'll use Anthropic's Sonnet model and put in recent pricing. Additionally, we'll reduce the temperature to 0.1 so that its responses are less varied.
from typing_extensions import Annotated
import autogen
config_list_bedrock = [
{
"api_type": "bedrock",
"model": "anthropic.claude-3-sonnet-20240229-v1:0",
"aws_region": "us-east-1",
"aws_access_key": "[FILL THIS IN]",
"aws_secret_key": "[FILL THIS IN]",
"price": [0.003, 0.015],
"temperature": 0.1,
"cache_seed": None, # turn off caching
}
]
Construct Agents
Construct a simple conversation between a User proxy and a ConversableAgent, which uses the Sonnet model.
assistant = autogen.AssistantAgent(
"assistant",
llm_config={
"config_list": config_list_bedrock,
},
)
user_proxy = autogen.UserProxyAgent(
"user_proxy",
human_input_mode="NEVER",
code_execution_config={
"work_dir": "coding",
"use_docker": False,
},
is_termination_msg=lambda x: x.get("content", "") and "TERMINATE" in x.get("content", ""),
max_consecutive_auto_reply=1,
)
Initiate Chat
user_proxy.initiate_chat(
assistant,
message="Write a python program to print the first 10 numbers of the Fibonacci sequence. Just output the python code, no additional information.",
)
Write a python program to print the first 10 numbers of the Fibonacci sequence. Just output the python code, no additional information.
--------------------------------------------------------------------------------
```python
# Define a function to calculate Fibonacci sequence
def fibonacci(n):
if n <= 0:
return []
elif n == 1:
return [0]
elif n == 2:
return [0, 1]
else:
sequence = [0, 1]
for i in range(2, n):
sequence.append(sequence[i-1] + sequence[i-2])
return sequence
# Call the function to get the first 10 Fibonacci numbers
fib_sequence = fibonacci(10)
print(fib_sequence)
```
--------------------------------------------------------------------------------
>>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...
exitcode: 0 (execution succeeded)
Code output:
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
--------------------------------------------------------------------------------
Great, the code executed successfully and printed the first 10 numbers of the Fibonacci sequence correctly.
TERMINATE
--------------------------------------------------------------------------------
ChatResult(chat_id=None, chat_history=[{'content': 'Write a python program to print the first 10 numbers of the Fibonacci sequence. Just output the python code, no additional information.', 'role': 'assistant'}, {'content': '```python\n# Define a function to calculate Fibonacci sequence\ndef fibonacci(n):\n if n <= 0:\n return []\n elif n == 1:\n return [0]\n elif n == 2:\n return [0, 1]\n else:\n sequence = [0, 1]\n for i in range(2, n):\n sequence.append(sequence[i-1] + sequence[i-2])\n return sequence\n\n# Call the function to get the first 10 Fibonacci numbers\nfib_sequence = fibonacci(10)\nprint(fib_sequence)\n```', 'role': 'user'}, {'content': 'exitcode: 0 (execution succeeded)\nCode output: \n[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]\n', 'role': 'assistant'}, {'content': 'Great, the code executed successfully and printed the first 10 numbers of the Fibonacci sequence correctly.\n\nTERMINATE', 'role': 'user'}], summary='Great, the code executed successfully and printed the first 10 numbers of the Fibonacci sequence correctly.\n\n', cost={'usage_including_cached_inference': {'total_cost': 0.00624, 'anthropic.claude-3-sonnet-20240229-v1:0': {'cost': 0.00624, 'prompt_tokens': 1210, 'completion_tokens': 174, 'total_tokens': 1384}}, 'usage_excluding_cached_inference': {'total_cost': 0.00624, 'anthropic.claude-3-sonnet-20240229-v1:0': {'cost': 0.00624, 'prompt_tokens': 1210, 'completion_tokens': 174, 'total_tokens': 1384}}}, human_input=[])
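As the ChatResult above shows, token usage and cost (calculated from the price we supplied) are tracked on the returned object. Here is a small sketch of reading these back, assuming the return value of initiate_chat was assigned to a variable such as chat_result:
# Inspect the usage/cost tracked on the ChatResult returned by initiate_chat
# (the keys below match the cost structure shown in the output above)
usage = chat_result.cost["usage_including_cached_inference"]
print(f"Total cost: ${usage['total_cost']:.5f}")
for model, stats in usage.items():
    if model != "total_cost":
        print(f"{model}: {stats['prompt_tokens']} prompt, {stats['completion_tokens']} completion tokens")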
Tool Call Example
In this example, instead of writing code, we'll show how Meta's Llama 3.1 70B model can perform parallel tool calling, where it suggests calling more than one tool at a time.
We'll use a simple travel agent assistant program with a couple of tools for weather and currency conversion.
Agents
import json
from typing import Literal
import autogen
config_list_bedrock = [
{
"api_type": "bedrock",
"model": "meta.llama3-1-70b-instruct-v1:0",
"aws_region": "us-west-2",
"aws_access_key": "[FILL THIS IN]",
"aws_secret_key": "[FILL THIS IN]",
"price": [0.00265, 0.0035],
"cache_seed": None, # turn off caching
}
]
# Create the agent and include examples of the function calling JSON in the prompt
# to help guide the model
chatbot = autogen.AssistantAgent(
name="chatbot",
system_message="""For currency exchange and weather forecasting tasks,
only use the functions you have been provided with.
Output only the word 'TERMINATE' when an answer has been provided.
Use both tools together if you can.""",
llm_config={
"config_list": config_list_bedrock,
},
)
user_proxy = autogen.UserProxyAgent(
name="user_proxy",
is_termination_msg=lambda x: x.get("content", "") and "TERMINATE" in x.get("content", ""),
human_input_mode="NEVER",
max_consecutive_auto_reply=2,
)
Create the two functions, annotating them so that those descriptions can be passed through to the LLM.
Meta's Llama 3.1 models are more likely to pass a numeric parameter as a string, e.g. "123.45" instead of 123.45, so we convert string-typed numeric parameters to floats where necessary.
We associate the functions with the agents using register_for_execution for the user_proxy, so that it can execute the function, and register_for_llm for the chatbot (powered by the LLM), so that it can pass the function definitions to the LLM.
# Currency Exchange function
CurrencySymbol = Literal["USD", "EUR"]
# Define our function that we expect to call
def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float:
if base_currency == quote_currency:
return 1.0
elif base_currency == "USD" and quote_currency == "EUR":
return 1 / 1.1
elif base_currency == "EUR" and quote_currency == "USD":
return 1.1
else:
raise ValueError(f"Unknown currencies {base_currency}, {quote_currency}")
# Register the function with the agent
@user_proxy.register_for_execution()
@chatbot.register_for_llm(description="Currency exchange calculator.")
def currency_calculator(
base_amount: Annotated[float, "Amount of currency in base_currency, float values (no strings), e.g. 987.82"],
base_currency: Annotated[CurrencySymbol, "Base currency"] = "USD",
quote_currency: Annotated[CurrencySymbol, "Quote currency"] = "EUR",
) -> str:
# If the amount is passed in as a string, e.g. "123.45", attempt to convert to a float
if isinstance(base_amount, str):
base_amount = float(base_amount)
quote_amount = exchange_rate(base_currency, quote_currency) * base_amount
return f"{format(quote_amount, '.2f')} {quote_currency}"
# Weather function
# Example function to make available to model
def get_current_weather(location, unit="fahrenheit"):
"""Get the weather for some location"""
if "chicago" in location.lower():
return json.dumps({"location": "Chicago", "temperature": "13", "unit": unit})
elif "san francisco" in location.lower():
return json.dumps({"location": "San Francisco", "temperature": "55", "unit": unit})
elif "new york" in location.lower():
return json.dumps({"location": "New York", "temperature": "11", "unit": unit})
else:
return json.dumps({"location": location, "temperature": "unknown"})
# Register the function with the agent
@user_proxy.register_for_execution()
@chatbot.register_for_llm(description="Weather forecast for US cities.")
def weather_forecast(
location: Annotated[str, "City name"],
) -> str:
weather_details = get_current_weather(location=location)
weather = json.loads(weather_details)
return f"{weather['location']} will be {weather['temperature']} degrees {weather['unit']}"
We pass through our customer's message and run the chat.
Finally, we ask the LLM to summarise the chat and print that out.
# start the conversation
res = user_proxy.initiate_chat(
chatbot,
message="What's the weather in New York and can you tell me how much is 123.45 EUR in USD so I can spend it on my holiday?",
summary_method="reflection_with_llm",
)
print(res.summary["content"])
What's the weather in New York and can you tell me how much is 123.45 EUR in USD so I can spend it on my holiday?
--------------------------------------------------------------------------------
***** Suggested tool call (tooluse__h3d1AEDR3Sm2XRoGCjc2Q): weather_forecast *****
Arguments:
{"location": "New York"}
**********************************************************************************
***** Suggested tool call (tooluse_wrdda3wRRO-ugUY4qrv8YQ): currency_calculator *****
Arguments:
{"base_amount": "123", "base_currency": "EUR", "quote_currency": "USD"}
*************************************************************************************
--------------------------------------------------------------------------------
>>>>>>>> EXECUTING FUNCTION weather_forecast...
>>>>>>>> EXECUTING FUNCTION currency_calculator...
***** Response from calling tool (tooluse__h3d1AEDR3Sm2XRoGCjc2Q) *****
New York will be 11 degrees fahrenheit
***********************************************************************
--------------------------------------------------------------------------------
***** Response from calling tool (tooluse_wrdda3wRRO-ugUY4qrv8YQ) *****
135.30 USD
***********************************************************************
--------------------------------------------------------------------------------
TERMINATE
--------------------------------------------------------------------------------
The weather in New York is 11 degrees Fahrenheit. 123.45 EUR is equivalent to 135.30 USD.
Group Chat Example with Anthropic's Claude 3 Sonnet, Mistral's Large 2, and Meta's Llama 3.1 70B
The flexibility of using LLMs from the industry's leading providers, particularly the larger models, with Amazon Bedrock allows you to use more than one of them within a single workflow.
Here we have a conversation in which two models (Anthropic's Claude 3 Sonnet and Mistral's Large 2) debate each other, with a third acting as the judge (Meta's Llama 3.1 70B). Additionally, a tool call is made to retrieve some mock news for them to debate on.
from typing import Annotated, Literal
import autogen
from autogen import AssistantAgent, GroupChat, GroupChatManager, UserProxyAgent
config_list_sonnet = [
{
"api_type": "bedrock",
"model": "anthropic.claude-3-sonnet-20240229-v1:0",
"aws_region": "us-east-1",
"aws_access_key": "[FILL THIS IN]",
"aws_secret_key": "[FILL THIS IN]",
"price": [0.003, 0.015],
"temperature": 0.1,
"cache_seed": None, # turn off caching
}
]
config_list_mistral = [
{
"api_type": "bedrock",
"model": "mistral.mistral-large-2407-v1:0",
"aws_region": "us-west-2",
"aws_access_key": "[FILL THIS IN]",
"aws_secret_key": "[FILL THIS IN]",
"price": [0.003, 0.009],
"temperature": 0.1,
"cache_seed": None, # turn off caching
}
]
config_list_llama31_70b = [
{
"api_type": "bedrock",
"model": "meta.llama3-1-70b-instruct-v1:0",
"aws_region": "us-west-2",
"aws_access_key": "[FILL THIS IN]",
"aws_secret_key": "[FILL THIS IN]",
"price": [0.00265, 0.0035],
"temperature": 0.1,
"cache_seed": None, # turn off caching
}
]
alice = AssistantAgent(
"sonnet_agent",
system_message="You are from Anthropic, an AI company that created the Sonnet large language model. You make arguments to support your company's position. You analyse given text. You are not a programmer and don't use Python. Pass to mistral_agent when you have finished. Start your response with 'I am sonnet_agent'.",
llm_config={
"config_list": config_list_sonnet,
},
is_termination_msg=lambda x: x.get("content", "").find("TERMINATE") >= 0,
)
bob = autogen.AssistantAgent(
"mistral_agent",
system_message="You are from Mistral, an AI company that created the Large v2 large language model. You make arguments to support your company's position. You analyse given text. You are not a programmer and don't use Python. Pass to the judge if you have finished. Start your response with 'I am mistral_agent'.",
llm_config={
"config_list": config_list_mistral,
},
is_termination_msg=lambda x: x.get("content", "").find("TERMINATE") >= 0,
)
charlie = AssistantAgent(
"research_assistant",
system_message="You are a helpful assistant to research the latest news and headlines. You have access to call functions to get the latest news articles for research through 'code_interpreter'.",
llm_config={
"config_list": config_list_llama31_70b,
},
is_termination_msg=lambda x: x.get("content", "").find("TERMINATE") >= 0,
)
dan = AssistantAgent(
"judge",
system_message="You are a judge. You will evaluate the arguments and make a decision on which one is more convincing. End your decision with the word 'TERMINATE' to conclude the debate.",
llm_config={
"config_list": config_list_llama31_70b,
},
is_termination_msg=lambda x: x.get("content", "").find("TERMINATE") >= 0,
)
code_interpreter = UserProxyAgent(
"code_interpreter",
human_input_mode="NEVER",
code_execution_config={
"work_dir": "coding",
"use_docker": False,
},
default_auto_reply="",
is_termination_msg=lambda x: x.get("content", "").find("TERMINATE") >= 0,
)
@code_interpreter.register_for_execution() # Decorator factory for registering a function to be executed by an agent
@charlie.register_for_llm(
name="get_headlines", description="Get the headline of a particular day."
) # Decorator factory for registering a function to be used by an agent
def get_headlines(headline_date: Annotated[str, "Date in MMDDYY format, e.g., 06192024"]) -> str:
mock_news = {
"06202024": """Epic Duel of the Titans: Anthropic and Mistral Usher in a New Era of Text Generation Excellence.
In a groundbreaking revelation that has sent shockwaves through the AI industry, Anthropic has unveiled
their state-of-the-art text generation model, Sonnet, hailed as a monumental leap in artificial intelligence.
Almost simultaneously, Mistral countered with their equally formidable creation, Large 2, showcasing
unparalleled prowess in generating coherent and contextually rich text. This scintillating rivalry
between two AI behemoths promises to revolutionize the landscape of machine learning, heralding an
era of unprecedented creativity and sophistication in text generation that will reshape industries,
ignite innovation, and captivate minds worldwide.""",
"06192024": "OpenAI founder Sutskever sets up new AI company devoted to safe superintelligence.",
}
return mock_news.get(headline_date, "No news available for today.")
user_proxy = UserProxyAgent(
"user_proxy",
human_input_mode="NEVER",
code_execution_config=False,
default_auto_reply="",
is_termination_msg=lambda x: x.get("content", "").find("TERMINATE") >= 0,
)
groupchat = GroupChat(
agents=[alice, bob, charlie, dan, code_interpreter],
messages=[],
allow_repeat_speaker=False,
max_round=10,
)
manager = GroupChatManager(
groupchat=groupchat,
llm_config={
"config_list": config_list_llama31_70b,
},
)
task = "Analyze the potential of Anthropic and Mistral to revolutionize the field of AI based on today's headlines. Today is 06202024. Start by selecting 'research_assistant' to get relevant news articles and then ask sonnet_agent and mistral_agent to respond before the judge evaluates the conversation."
user_proxy.initiate_chat(manager, message=task)
Analyze the potential of Anthropic and Mistral to revolutionize the field of AI based on today's headlines. Today is 06202024. Start by selecting 'research_assistant' to get relevant news articles and then ask sonnet_agent and mistral_agent to respond before the judge evaluates the conversation.
--------------------------------------------------------------------------------
Next speaker: research_assistant
***** Suggested tool call (tooluse_7lcHbL3TT5WHyTl8Ee0Kmg): get_headlines *****
Arguments:
{"headline_date": "06202024"}
*******************************************************************************
--------------------------------------------------------------------------------
Next speaker: code_interpreter
>>>>>>>> EXECUTING FUNCTION get_headlines...
***** Response from calling tool (tooluse_7lcHbL3TT5WHyTl8Ee0Kmg) *****
Epic Duel of the Titans: Anthropic and Mistral Usher in a New Era of Text Generation Excellence.
In a groundbreaking revelation that has sent shockwaves through the AI industry, Anthropic has unveiled
their state-of-the-art text generation model, Sonnet, hailed as a monumental leap in artificial intelligence.
Almost simultaneously, Mistral countered with their equally formidable creation, Large 2, showcasing
unparalleled prowess in generating coherent and contextually rich text. This scintillating rivalry
between two AI behemoths promises to revolutionize the landscape of machine learning, heralding an
era of unprecedented creativity and sophistication in text generation that will reshape industries,
ignite innovation, and captivate minds worldwide.
***********************************************************************
--------------------------------------------------------------------------------
Next speaker: sonnet_agent
I am sonnet_agent:
This headline showcases the incredible advancements Anthropic has made with our state-of-the-art text generation model, Sonnet. Our breakthrough natural language processing capabilities represent a quantum leap forward, allowing us to generate text of unparalleled coherence, nuance, and contextual richness.
While our competitor Mistral has also unveiled an impressive model in Large 2, Anthropic's Sonnet stands apart through its ability to truly understand and reason about the subject matter it is generating text on. This deeper comprehension allows Sonnet to produce profoundly insightful and substantive content that goes far beyond mere parroting of data.
The implications of this milestone are boundless. From revolutionizing how we communicate and interact with AI assistants, to unlocking new frontiers in creative writing, journalism, education and scientific research - Sonnet has the potential to completely transform industry after industry. Anthropic is poised to be the definitive leader driving this paradigm shift.
I welcome the healthy competition from Mistral, as it will only spur us to continue raising the bar. But make no mistake - with Sonnet, the future of advanced AI belongs to Anthropic. We have crossed a monumental threshold, and will continue our trailblazing work to deliver AI solutions that redefine what is possible. The world is witnessing the dawn of a new era of unfathomable potential.
--------------------------------------------------------------------------------
Next speaker: mistral_agent
I am mistral_agent:
While it is commendable to see the progress made by Anthropic with their Sonnet model, it is essential to recognize the groundbreaking strides Mistral has taken with our Large 2 model. Large 2 represents not just an incremental improvement but a transformative leap in text generation capabilities, setting new benchmarks for coherence, contextual understanding, and creative expression.
Unlike Sonnet, which focuses heavily on understanding and reasoning, Large 2 excels in both comprehension and the nuanced generation of text that is indistinguishable from human writing. This balance allows Large 2 to produce content that is not only insightful but also incredibly engaging and natural, making it an invaluable tool across a broad spectrum of applications.
The potential of Large 2 extends far beyond traditional text generation. It can revolutionize fields such as content creation, customer service, marketing, and even personalized learning experiences. Our model's ability to adapt to various contexts and generate contextually rich responses makes it a versatile and powerful tool for any industry looking to harness the power of AI.
While we appreciate the competition from Anthropic, we firmly believe that Large 2 stands at the forefront of AI innovation. The future of AI is not just about understanding and reasoning; it's about creating content that resonates with people on a deep level. With Large 2, Mistral is paving the way for a future where AI-generated text is not just functional but also profoundly human-like.
Pass to the judge.
--------------------------------------------------------------------------------
Next speaker: judge
After carefully evaluating the arguments presented by both sonnet_agent and mistral_agent, I have reached a decision.
Both Anthropic's Sonnet and Mistral's Large 2 have demonstrated remarkable advancements in text generation capabilities, showcasing the potential to revolutionize various industries and transform the way we interact with AI.
However, upon closer examination, I find that mistral_agent's argument presents a more convincing case for why Large 2 stands at the forefront of AI innovation. The emphasis on balance between comprehension and nuanced generation of text that is indistinguishable from human writing sets Large 2 apart. This balance is crucial for creating content that is not only insightful but also engaging and natural, making it a versatile tool across a broad spectrum of applications.
Furthermore, mistral_agent's argument highlights the potential of Large 2 to revolutionize fields beyond traditional text generation, such as content creation, customer service, marketing, and personalized learning experiences. This versatility and adaptability make Large 2 a powerful tool for any industry looking to harness the power of AI.
In contrast, while sonnet_agent's argument showcases the impressive capabilities of Sonnet, it focuses heavily on understanding and reasoning, which, although important, may not be enough to set it apart from Large 2.
Therefore, based on the arguments presented, I conclude that Mistral's Large 2 has the potential to revolutionize the field of AI more significantly than Anthropic's Sonnet.
TERMINATE.
--------------------------------------------------------------------------------
Next speaker: code_interpreter
ChatResult(chat_id=None, chat_history=[{'content': "Analyze the potential of Anthropic and Mistral to revolutionize the field of AI based on today's headlines. Today is 06202024. Start by selecting 'research_assistant' to get relevant news articles and then ask sonnet_agent and mistral_agent to respond before the judge evaluates the conversation.", 'role': 'assistant'}], summary="Analyze the potential of Anthropic and Mistral to revolutionize the field of AI based on today's headlines. Today is 06202024. Start by selecting 'research_assistant' to get relevant news articles and then ask sonnet_agent and mistral_agent to respond before the judge evaluates the conversation.", cost={'usage_including_cached_inference': {'total_cost': 0}, 'usage_excluding_cached_inference': {'total_cost': 0}}, human_input=[])
And there we have it: a number of different LLMs, all collaborating on a single cloud platform.
Image classification with Anthropic's Claude 3 Sonnet
AutoGen's Amazon Bedrock client class supports inputting images for the LLM to respond to.
In this simple example, we'll take an image from the Internet and send it to Anthropic's Claude 3 Sonnet model to describe.
Here's the image we'll use:

config_list_sonnet = {
"config_list": [
{
"api_type": "bedrock",
"model": "anthropic.claude-3-sonnet-20240229-v1:0",
"aws_region": "us-east-1",
"aws_access_key": "[FILL THIS IN]",
"aws_secret_key": "[FILL THIS IN]",
"cache_seed": None,
}
]
}
We'll use a Multimodal agent to handle the image.
import autogen
from autogen import Agent, AssistantAgent, ConversableAgent, UserProxyAgent
from autogen.agentchat.contrib.capabilities.vision_capability import VisionCapability
from autogen.agentchat.contrib.img_utils import get_pil_image, pil_to_data_uri
from autogen.agentchat.contrib.multimodal_conversable_agent import MultimodalConversableAgent
from autogen.code_utils import content_str
image_agent = MultimodalConversableAgent(
name="image-explainer",
max_consecutive_auto_reply=10,
llm_config=config_list_sonnet,
)
user_proxy = autogen.UserProxyAgent(
name="User_proxy",
system_message="A human admin.",
human_input_mode="NEVER",
max_consecutive_auto_reply=0,
code_execution_config={
"use_docker": False
}, # Please set use_docker=True if docker is available to run the generated code. Using docker is safer than running the generated code directly.
)
We start the chat and use the img tag in the message. The image will be downloaded and converted to bytes, then sent to the LLM.
# Ask the image_agent to describe the image
result = user_proxy.initiate_chat(
image_agent,
message="""What's happening in this image?
<img https://microsoft.github.io/autogen/assets/images/love-ec54b2666729d3e9d93f91773d1a77cf.png>.""",
)
What's happening in this image?
<image>.
--------------------------------------------------------------------------------
>>>>>>>> USING AUTO REPLY...
This image appears to be an advertisement or promotional material for a company called Autogen. The central figure is a stylized robot or android holding up a signboard with the company's name on it. The signboard also features a colorful heart design made up of many smaller hearts, suggesting themes related to love, care, or affection. The robot has a friendly, cartoonish expression with a large blue eye or lens. The overall style and color scheme give it a vibrant, eye-catching look that likely aims to portray Autogen as an innovative, approachable technology brand focused on connecting with people.
--------------------------------------------------------------------------------