Runtime Logging with AutoGen
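AutoGen provides utilities to log data for debugging and performance analysis. This notebook shows how to use them.
We log data in different modes:
- SQLite database
- File
In general, users can start logging by calling autogen.runtime_logging.start() and stop logging by calling autogen.runtime_logging.stop().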
import json

import pandas as pd

import autogen
from autogen import AssistantAgent, UserProxyAgent

# Setup API key. Add your own API key to config file or environment variable
llm_config = {
    "config_list": autogen.config_list_from_json(
        env_or_file="OAI_CONFIG_LIST",
    ),
    "temperature": 0.9,
}

# Start logging
logging_session_id = autogen.runtime_logging.start(config={"dbname": "logs.db"})
print("Logging session ID: " + str(logging_session_id))

# Create an agent workflow and run it
assistant = AssistantAgent(name="assistant", llm_config=llm_config)
user_proxy = UserProxyAgent(
    name="user_proxy",
    code_execution_config=False,
    human_input_mode="NEVER",
    is_termination_msg=lambda msg: "TERMINATE" in msg["content"],
)

user_proxy.initiate_chat(
    assistant, message="What is the height of the Eiffel Tower? Only respond with the answer and terminate"
)
autogen.runtime_logging.stop()
Logging session ID: 6e08f3e0-392b-434e-8b69-4ab36c4fcf99
What is the height of the Eiffel Tower? Only respond with the answer and terminate
--------------------------------------------------------------------------------
The height of the Eiffel Tower is approximately 330 meters.
TERMINATE
--------------------------------------------------------------------------------
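If the workflow raises an exception, the autogen.runtime_logging.stop() call in the example above is never reached. One way to guard against that is a small context manager. The sketch below is not part of AutoGen, and the helper name logging_session is hypothetical; it takes the start and stop functions as arguments so that it can be tried without an API key.

```python
from contextlib import contextmanager


@contextmanager
def logging_session(start, stop, **start_kwargs):
    """Call start(**start_kwargs), yield its return value, and always call stop().

    Hypothetical helper, not an AutoGen API. `start` and `stop` are meant to be
    autogen.runtime_logging.start and autogen.runtime_logging.stop.
    """
    session_id = start(**start_kwargs)
    try:
        yield session_id
    finally:
        # Runs even when the body of the `with` block raises.
        stop()


# Possible usage with AutoGen installed (assumes the agents defined above):
#
# with logging_session(
#     autogen.runtime_logging.start,
#     autogen.runtime_logging.stop,
#     config={"dbname": "logs.db"},
# ) as logging_session_id:
#     user_proxy.initiate_chat(assistant, message="What is the height of the Eiffel Tower?")
```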
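Getting Data from the SQLite Database
logs.db should be generated; by default, runtime logging uses a SQLite database. You can view the data with a GUI tool such as sqlitebrowser, with the SQLite command-line shell, or with a Python script: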
def get_log(dbname="logs.db", table="chat_completions"):
    import sqlite3

    con = sqlite3.connect(dbname)
    query = f"SELECT * from {table}"
    cursor = con.execute(query)
    rows = cursor.fetchall()
    column_names = [description[0] for description in cursor.description]
    data = [dict(zip(column_names, row)) for row in rows]
    con.close()
    return data


def str_to_dict(s):
    return json.loads(s)


log_data = get_log()
log_data_df = pd.DataFrame(log_data)

log_data_df["total_tokens"] = log_data_df.apply(
    lambda row: str_to_dict(row["response"])["usage"]["total_tokens"], axis=1
)

log_data_df["request"] = log_data_df.apply(lambda row: str_to_dict(row["request"])["messages"][0]["content"], axis=1)

log_data_df["response"] = log_data_df.apply(
    lambda row: str_to_dict(row["response"])["choices"][0]["message"]["content"], axis=1
)

log_data_df
| | id | invocation_id | client_id | wrapper_id | session_id | request | response | is_cached | cost | start_time | end_time | total_tokens |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | e8bb00d7-6da5-4407-a949-e19b55d53da8 | 139819167322784 | 139823225568704 | 8821a150-8c78-4d05-a858-8a64f1d18648 | You are a helpful AI assistant.\nSolve tasks... | The height of the Eiffel Tower is approximately... | 1 | 0.01572 | 2024-02-13 15:06:22.082896 | 2024-02-13 15:06:22.083169 | 507 |
| 1 | 2 | c8522790-0067-484b-bb37-d39ae80db98b | 139823225568656 | 139823225563040 | fb0ef547-a2ac-428b-8c20-a5e63263b8e1 | You are a helpful AI assistant.\nSolve tasks... | The height of the Eiffel Tower is approximately... | 1 | 0.01572 | 2024-02-13 15:06:23.498758 | 2024-02-13 15:06:23.499045 | 507 |
| 2 | 3 | 91c3f6c0-c6f7-4306-89cd-f304c9556de4 | 139823225449024 | 139819166072448 | 6e08f3e0-392b-434e-8b69-4ab36c4fcf99 | You are a helpful AI assistant.\nSolve tasks u... | The height of the Eiffel Tower is approximately... | 1 | 0.01572 | 2024-02-13 15:06:24.688990 | 2024-02-13 15:06:24.689238 | 507 |
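get_log() takes a table argument that defaults to "chat_completions". Which other tables a given logs.db contains is not stated here and may depend on the AutoGen version, so it can help to inspect the file first. A minimal sketch using only the standard sqlite3 module (the helper name list_tables is hypothetical):

```python
import sqlite3


def list_tables(dbname="logs.db"):
    """Return {table_name: [column names]} for every table in a SQLite file.

    Note: sqlite3.connect() creates an empty file if dbname does not exist.
    """
    con = sqlite3.connect(dbname)
    try:
        names = [
            row[0]
            for row in con.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # PRAGMA table_info returns one row per column; index 1 is the column name.
        return {
            name: [col[1] for col in con.execute(f'PRAGMA table_info("{name}")')]
            for name in names
        }
    finally:
        con.close()
```

Any table name returned this way can then be passed to get_log(table=...).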
Computing Cost
One use case of the logged data is to compute the cost of a session.
# Sum total tokens for all sessions
total_tokens = log_data_df["total_tokens"].sum()

# Sum total cost for all sessions
total_cost = log_data_df["cost"].sum()

# Total tokens and cost for a specific session
session_tokens = log_data_df[log_data_df["session_id"] == logging_session_id]["total_tokens"].sum()
session_cost = log_data_df[log_data_df["session_id"] == logging_session_id]["cost"].sum()

print("Total tokens for all sessions: " + str(total_tokens) + ", total cost: " + str(round(total_cost, 4)))
print(
    "Total tokens for session "
    + str(logging_session_id)
    + ": "
    + str(session_tokens)
    + ", cost: "
    + str(round(session_cost, 4))
)
Total tokens for all sessions: 1521, total cost: 0.0472
Total tokens for session 6e08f3e0-392b-434e-8b69-4ab36c4fcf99: 507, cost: 0.0157
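The same totals can be computed for every session at once without pandas. The sketch below reads the session_id, cost, and response columns shown in the table above and follows the same path into the response JSON as the DataFrame code. The helper name summarize_sessions is hypothetical, and it assumes every response contains a usage.total_tokens field.

```python
import json
import sqlite3
from collections import defaultdict


def summarize_sessions(dbname="logs.db", table="chat_completions"):
    """Return {session_id: {"calls", "cost", "total_tokens"}} from the log table."""
    con = sqlite3.connect(dbname)
    try:
        rows = con.execute(f"SELECT session_id, cost, response FROM {table}").fetchall()
    finally:
        con.close()

    summary = defaultdict(lambda: {"calls": 0, "cost": 0.0, "total_tokens": 0})
    for session_id, cost, response in rows:
        entry = summary[session_id]
        entry["calls"] += 1
        # Treat a NULL cost as zero.
        entry["cost"] += cost or 0.0
        # Assumes the response JSON has usage.total_tokens, as the code above does.
        entry["total_tokens"] += json.loads(response)["usage"]["total_tokens"]
    return dict(summary)
```

Calling summarize_sessions()[logging_session_id] would then give the figures for the session started in this notebook.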
Logging Data in File Mode
By default, the logger type is set to sqlite, as shown above, but autogen.runtime_logging.start() now accepts a new parameter, logger_type. When logger_type="file", data is logged in file mode.
import pandas as pd

import autogen
from autogen import AssistantAgent, UserProxyAgent

# Setup API key. Add your own API key to config file or environment variable
llm_config = {
    "config_list": autogen.config_list_from_json(
        env_or_file="OAI_CONFIG_LIST",
    ),
    "temperature": 0.9,
}

# Start logging with logger_type and the filename to log to
logging_session_id = autogen.runtime_logging.start(logger_type="file", config={"filename": "runtime.log"})
print("Logging session ID: " + str(logging_session_id))

# Create an agent workflow and run it
assistant = AssistantAgent(name="assistant", llm_config=llm_config)
user_proxy = UserProxyAgent(
    name="user_proxy",
    code_execution_config=False,
    human_input_mode="NEVER",
    is_termination_msg=lambda msg: "TERMINATE" in msg["content"],
)

user_proxy.initiate_chat(
    assistant, message="What is the height of the Eiffel Tower? Only respond with the answer and terminate"
)
autogen.runtime_logging.stop()
Logging session ID: ed493ebf-d78e-49f0-b832-69557276d557
What is the height of the Eiffel Tower? Only respond with the answer and terminate
--------------------------------------------------------------------------------
The height of the Eiffel Tower is 330 meters.
TERMINATE
--------------------------------------------------------------------------------
This should create a runtime.log file in your current directory.
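The layout of runtime.log is not described here. As a starting point for inspecting it, the sketch below assumes one record per line and that records may be JSON; that layout is an assumption, not something this notebook confirms. Lines that do not parse as JSON are returned as raw text, so the helper (hypothetical name read_log_lines) is still usable if the assumption is wrong.

```python
import json


def read_log_lines(filename="runtime.log"):
    """Yield one item per non-empty line of the log file.

    A line that is valid JSON is yielded as the parsed value; any other line is
    yielded as a stripped string. The JSON-per-line layout is an assumption.
    """
    with open(filename, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                yield json.loads(line)
            except json.JSONDecodeError:
                yield line
```

For example, list(read_log_lines())[:5] shows the first few records without loading them into a DataFrame.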