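Model
In this tutorial, we introduce the model APIs integrated in AgentScope, how to use them, and how to integrate new model APIs. The supported model APIs and providers are: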
| API | Class | Compatible with | Streaming | Tools | Vision | Reasoning |
|---|---|---|---|---|---|---|
| OpenAI | | vLLM, DeepSeek | ✅ | ✅ | ✅ | ✅ |
| DashScope | DashScopeChatModel | | ✅ | ✅ | ✅ | ✅ |
| Anthropic | | | ✅ | ✅ | ✅ | ✅ |
| Gemini | | | ✅ | ✅ | ✅ | ✅ |
| Ollama | | | ✅ | ✅ | ✅ | ✅ |
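Note
When using vLLM, you need to configure the appropriate tool-calling parameters for each model at deployment time, such as --enable-auto-tool-choice and --tool-call-parser. For more details, refer to the official vLLM documentation.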
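To provide a unified model interface, the model classes above share the following behavior:
The first three arguments of the __call__ method are messages, tools, and tool_choice: the input messages, the JSON schemas of the tool functions, and the tool selection mode, respectively. The return value is either a ChatResponse instance or, in streaming mode, an async generator of ChatResponse instances.
Note
The input message format differs between model APIs. For more details, refer to Prompt Formatter.
A ChatResponse instance contains the generated thinking, text, and tool-use content, along with its ID, creation time, and usage information.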
import asyncio
import json
import os

from agentscope.message import TextBlock, ToolUseBlock, ThinkingBlock, Msg
from agentscope.model import ChatResponse, DashScopeChatModel

response = ChatResponse(
    content=[
        ThinkingBlock(
            type="thinking",
            thinking="I should search for AgentScope on Google.",
        ),
        TextBlock(type="text", text="I'll search for AgentScope on Google."),
        ToolUseBlock(
            type="tool_use",
            id="642n298gjna",
            name="google_search",
            input={"query": "AgentScope?"},
        ),
    ],
)

print(response)
ChatResponse(content=[{'type': 'thinking', 'thinking': 'I should search for AgentScope on Google.'}, {'type': 'text', 'text': "I'll search for AgentScope on Google."}, {'type': 'tool_use', 'id': '642n298gjna', 'name': 'google_search', 'input': {'query': 'AgentScope?'}}], id='2025-09-08 07:55:13.861_165988', created_at='2025-09-08 07:55:13.861', type='chat', usage=None, metadata=None)
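Each entry of content prints as a plain dictionary with a type key, so downstream code can route blocks by that key. The following is a minimal sketch of that idea. It uses plain dictionaries copied from the output above instead of importing AgentScope, and group_blocks_by_type is a hypothetical helper, not part of the library.

```python
from collections import defaultdict
from typing import Any


def group_blocks_by_type(
    content: list[dict[str, Any]],
) -> dict[str, list[dict[str, Any]]]:
    """Group content blocks by their ``type`` field, preserving order."""
    grouped: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for block in content:
        grouped[block["type"]].append(block)
    return dict(grouped)


# Plain dicts copied from the printed ChatResponse above.
content = [
    {"type": "thinking", "thinking": "I should search for AgentScope on Google."},
    {"type": "text", "text": "I'll search for AgentScope on Google."},
    {
        "type": "tool_use",
        "id": "642n298gjna",
        "name": "google_search",
        "input": {"query": "AgentScope?"},
    },
]

grouped = group_blocks_by_type(content)

# Handle only the tool calls, ignoring thinking and text blocks.
for tool_call in grouped.get("tool_use", []):
    print(tool_call["name"], tool_call["input"])
```

Grouping by type keeps the handling of thinking, text, and tool-use content separate without depending on the order in which the blocks arrive.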
Taking DashScopeChatModel as an example, we can create a chat model instance and call it with messages and tools:
async def example_model_call() -> None:
    """An example of using the DashScopeChatModel."""
    model = DashScopeChatModel(
        model_name="qwen-max",
        api_key=os.environ["DASHSCOPE_API_KEY"],
        stream=False,
    )

    res = await model(
        messages=[
            {"role": "user", "content": "Hi!"},
        ],
    )

    # You can directly create a ``Msg`` object with the response content
    msg_res = Msg("Friday", res.content, "assistant")

    print("The response:", res)
    print("The response as Msg:", msg_res)


asyncio.run(example_model_call())
The response: ChatResponse(content=[{'type': 'text', 'text': 'Hello! How can I assist you today?'}], id='2025-09-08 07:55:15.115_fc7538', created_at='2025-09-08 07:55:15.115', type='chat', usage=ChatUsage(input_tokens=10, output_tokens=9, time=1.252716, type='chat'), metadata=None)
The response as Msg: Msg(id='LH5rT82NFpbavHb9Roz5iE', name='Friday', content=[{'type': 'text', 'text': 'Hello! How can I assist you today?'}], role='assistant', metadata=None, timestamp='2025-09-08 07:55:15.115', invocation_id='None')
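Because the call returns either a single response or an async generator depending on the stream setting, calling code sometimes wants one path that handles both. The sketch below shows one possible way to do that with inspect.isasyncgen from the standard library. The name final_response and the two fake_* coroutines are hypothetical stand-ins used only to exercise the helper without an API key; they are not AgentScope functions.

```python
import asyncio
import inspect
from typing import Any, AsyncGenerator


async def final_response(result: Any) -> Any:
    """Return the last response, whether ``result`` is a single object
    or an async generator of cumulative responses."""
    if inspect.isasyncgen(result):
        last = None
        async for chunk in result:
            last = chunk
        return last
    return result


# Hypothetical stand-in for a non-streaming model call.
async def fake_non_streaming_call() -> str:
    return "Hello!"


# Hypothetical stand-in for a streaming model call: awaiting it
# returns an async generator, as in ``generator = await model(...)``.
async def fake_streaming_call() -> AsyncGenerator[str, None]:
    async def _generate() -> AsyncGenerator[str, None]:
        for text in ("He", "Hello", "Hello!"):
            yield text

    return _generate()


async def main() -> tuple[str, str]:
    non_streamed = await final_response(await fake_non_streaming_call())
    streamed = await final_response(await fake_streaming_call())
    return non_streamed, streamed


results = asyncio.run(main())
print(results)
```

Taking the last chunk is sufficient here only under the assumption that chunks are cumulative; with delta-style streams the chunks would have to be concatenated instead.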
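Streaming
To enable streaming, set the stream parameter to True in the model constructor.
When streaming is enabled, the __call__ method returns an async generator that yields ChatResponse instances as the model generates output.
Note
Streaming in AgentScope is cumulative: the content of each chunk contains all previously generated content plus the newly generated content.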
async def example_streaming() -> None:
    """An example of using the streaming model."""
    model = DashScopeChatModel(
        model_name="qwen-max",
        api_key=os.environ["DASHSCOPE_API_KEY"],
        stream=True,
    )

    generator = await model(
        messages=[
            {
                "role": "user",
                "content": "Count from 1 to 20, and just report the number without any other information.",
            },
        ],
    )
    print("The type of the response:", type(generator))

    i = 0
    async for chunk in generator:
        print(f"Chunk {i}")
        print(f"\ttype: {type(chunk.content)}")
        print(f"\t{chunk}\n")
        i += 1


asyncio.run(example_streaming())
The type of the response: <class 'async_generator'>
Chunk 0
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1'}], id='2025-09-08 07:55:16.120_386c56', created_at='2025-09-08 07:55:16.120', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=1, time=1.003912, type='chat'), metadata=None)
Chunk 1
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n'}], id='2025-09-08 07:55:16.424_4e7f11', created_at='2025-09-08 07:55:16.424', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=4, time=1.307314, type='chat'), metadata=None)
Chunk 2
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4'}], id='2025-09-08 07:55:16.513_8eb731', created_at='2025-09-08 07:55:16.513', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=7, time=1.39666, type='chat'), metadata=None)
Chunk 3
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n'}], id='2025-09-08 07:55:16.602_636eb5', created_at='2025-09-08 07:55:16.602', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=10, time=1.485554, type='chat'), metadata=None)
Chunk 4
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n'}], id='2025-09-08 07:55:16.985_feb4fa', created_at='2025-09-08 07:55:16.985', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=16, time=1.868341, type='chat'), metadata=None)
Chunk 5
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n1'}], id='2025-09-08 07:55:17.370_e9e1ed', created_at='2025-09-08 07:55:17.370', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=22, time=2.25342, type='chat'), metadata=None)
Chunk 6
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n1'}], id='2025-09-08 07:55:17.536_7ba6d1', created_at='2025-09-08 07:55:17.536', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=28, time=2.419368, type='chat'), metadata=None)
Chunk 7
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n1'}], id='2025-09-08 07:55:17.716_150f14', created_at='2025-09-08 07:55:17.716', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=34, time=2.599737, type='chat'), metadata=None)
Chunk 8
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n15\n16\n1'}], id='2025-09-08 07:55:17.859_de1731', created_at='2025-09-08 07:55:17.859', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=40, time=2.742985, type='chat'), metadata=None)
Chunk 9
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n15\n16\n17\n18\n1'}], id='2025-09-08 07:55:18.033_6cea5f', created_at='2025-09-08 07:55:18.033', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=46, time=2.916369, type='chat'), metadata=None)
Chunk 10
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n15\n16\n17\n18\n19\n20'}], id='2025-09-08 07:55:18.229_25734c', created_at='2025-09-08 07:55:18.229', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=50, time=3.112902, type='chat'), metadata=None)
Chunk 11
type: <class 'list'>
ChatResponse(content=[{'type': 'text', 'text': '1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14\n15\n16\n17\n18\n19\n20'}], id='2025-09-08 07:55:18.248_f6c87b', created_at='2025-09-08 07:55:18.248', type='chat', usage=ChatUsage(input_tokens=27, output_tokens=50, time=3.131775, type='chat'), metadata=None)
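Since every chunk repeats the text generated so far, a consumer that wants to print only the new text (for example, a terminal UI) has to subtract the previous chunk. The following is a minimal sketch of that step on plain strings. incremental_deltas is a hypothetical helper, and the sample values are the text fields of the first four chunks in the output above.

```python
from typing import Iterable, Iterator


def incremental_deltas(cumulative_texts: Iterable[str]) -> Iterator[str]:
    """Yield only the newly generated suffix of each cumulative text."""
    previous = ""
    for text in cumulative_texts:
        if text.startswith(previous):
            delta = text[len(previous):]
        else:
            # Fall back to the whole text if a chunk does not extend
            # the previous one.
            delta = text
        previous = text
        # Skip repeated chunks that carry no new text.
        if delta:
            yield delta


# Text values taken from the first chunks of the output above.
chunks = ["1", "1\n2\n", "1\n2\n3\n4", "1\n2\n3\n4\n5\n"]
deltas = list(incremental_deltas(chunks))
print(deltas)
```

Joining the deltas reproduces the last cumulative text, and a repeated final chunk, like the last two chunks in the output above, yields nothing.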
Reasoning
AgentScope supports reasoning models through the ThinkingBlock.
async def example_reasoning() -> None:
    """An example of using the reasoning model."""
    model = DashScopeChatModel(
        model_name="qwen-turbo",
        api_key=os.environ["DASHSCOPE_API_KEY"],
        enable_thinking=True,
    )

    res = await model(
        messages=[
            {"role": "user", "content": "Who am I?"},
        ],
    )

    # The response is consumed as an async generator here, so streaming
    # must be active for this model instance. Chunks are cumulative, so
    # the last one holds the complete thinking and text content.
    last_chunk = None
    async for chunk in res:
        last_chunk = chunk
    print("The final response:")
    print(last_chunk)


asyncio.run(example_reasoning())
The final response:
ChatResponse(content=[{'type': 'thinking', 'thinking': 'Okay, the user asked "Who am I?" So, I need to figure out how to respond. First, I should consider that the user might be asking about my identity as an AI. But they could also be asking about their own identity, which is more personal.\n\nI should start by clarifying the question. Maybe the user wants to know what I am, so I can explain that I\'m Qwen, a large language model developed by Alibaba. But if they\'re asking about themselves, I need to be careful not to assume anything. \n\nI should make sure to cover both possibilities. Let me check the previous messages to see if there\'s any context. Wait, the user just asked "Who am I?" without any prior conversation. So, there\'s no context. \n\nIn that case, the best approach is to ask for clarification. I can mention that I\'m Qwen and offer to help with either their identity or other questions. That way, I\'m addressing both possibilities without making assumptions. \n\nAlso, I need to keep the response friendly and open-ended. Avoid any technical jargon. Make sure it\'s clear and helpful. Let me put that together in a natural way.'}, {'type': 'text', 'text': "You are asking about your identity! However, I don't have access to personal information about you. If you're asking about me, I am Qwen, a large language model developed by Alibaba Cloud. If you have a specific question or need help with something, feel free to let me know! 😊"}], id='2025-09-08 07:55:23.738_dced86', created_at='2025-09-08 07:55:23.738', type='chat', usage=ChatUsage(input_tokens=12, output_tokens=307, time=5.485331, type='chat'), metadata=None)
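Tools API
Tool APIs differ across model providers, for example in the tool JSON schema and in the tool call and response formats. To provide a unified interface, AgentScope addresses this as follows:
It provides a unified tool-call block, ToolUseBlock, and a unified tool-result block, ToolResultBlock.
It provides a unified tool interface in the __call__ method of the model classes, which accepts a list of tool JSON schemas such as the following: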
json_schemas = [
    {
        "type": "function",
        "function": {
            "name": "google_search",
            "description": "Search for a query on Google.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query.",
                    },
                },
                "required": ["query"],
            },
        },
    },
]
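Once a model returns a tool-use block, the application still has to map it back to a real function. The sketch below illustrates one way to check a tool-use dictionary against the schema list and then dispatch it. It is an illustration on plain dictionaries, not AgentScope's own tool-handling code: run_tool_call, TOOL_FUNCTIONS, and the local google_search function are hypothetical names introduced for this example.

```python
from typing import Any, Callable

json_schemas = [
    {
        "type": "function",
        "function": {
            "name": "google_search",
            "description": "Search for a query on Google.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "The search query."},
                },
                "required": ["query"],
            },
        },
    },
]


def google_search(query: str) -> str:
    """Hypothetical local stand-in for a real search tool."""
    return f"Results for {query!r}"


# Mapping from schema names to local callables (an assumption of this sketch).
TOOL_FUNCTIONS: dict[str, Callable[..., Any]] = {"google_search": google_search}


def run_tool_call(tool_use: dict[str, Any], schemas: list[dict[str, Any]]) -> Any:
    """Check a tool-use block against the schemas, then call the tool."""
    by_name = {s["function"]["name"]: s["function"] for s in schemas}
    spec = by_name.get(tool_use["name"])
    if spec is None:
        raise KeyError(f"Unknown tool: {tool_use['name']}")
    missing = [
        key
        for key in spec["parameters"].get("required", [])
        if key not in tool_use["input"]
    ]
    if missing:
        raise ValueError(f"Missing required arguments: {missing}")
    return TOOL_FUNCTIONS[tool_use["name"]](**tool_use["input"])


# A tool-use block shaped like the one in the ChatResponse example.
tool_use = {
    "type": "tool_use",
    "id": "642n298gjna",
    "name": "google_search",
    "input": {"query": "AgentScope?"},
}
result = run_tool_call(tool_use, json_schemas)
print(result)
```

Checking the required list before the call turns a malformed model output into a clear error instead of a TypeError raised from inside the tool function.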
Further Reading