Build Your Own Code Interpreter - Dynamic Tool Generation and Execution With o3-mini

为LLM智能体提供与外界或其他智能体交互能力的核心在于“工具（或函数）调用”，即LLM可以通过参数调用函数（代码块）。通常情况下，这些函数及其预期输入输出都由开发者预先定义。但在本指南中，我们将探索更灵活的范式——使用LLM模型（本例中为o3-mini）动态生成工具，并具备通过代码解释器执行工具的能力。

动态生成工具调用与代码解释器

动态生成工具是由大语言模型在运行时根据用户提示自行创建的函数或代码块。这意味着您无需在代码库中预先定义所有可能的场景，从而实现更加开放、创新和自适应的问题解决能力。

动态生成的工具调用更进一步，赋予LLM即时生成工具并执行代码块的能力。这种动态方法特别适用于涉及以下场景的任务：

数据分析与可视化
数据操作与转换
机器学习工作流生成与执行
流程自动化与脚本编写
以及更多可能性，随着实验探索不断涌现

使用o3-mini进行动态工具生成

o3-mini模型于1月31日发布，具备卓越的STEM能力——尤其在科学、数学和编程方面表现突出——同时保持了小型模型的低成本与低延迟特性。在本指南中，我们将展示o3-mini生成Python代码来解析数据并获取洞察的能力。

推理模型特别擅长生成动态工具来分析数据，因为它们能够自主推理，无需显式的思维链提示。事实上，提供显式的思维链指令可能会干扰模型的内部推理，导致次优结果。您可以在此here.了解更多关于o3-mini的信息。

为什么要构建自己的代码解释器

许多API提供商——例如OpenAI的Assistants API——都提供内置的代码解释器功能。这些内置代码解释器可能非常强大，但在某些情况下，开发人员可能需要创建自己的自定义代码解释器。例如：

语言或库支持: 内置解释器可能不支持任务所需的特定编程语言(如C++、Java等)或库。
任务兼容性: 您的使用场景可能与提供商的内置解决方案不兼容。
模型限制: 你可能需要一个语言模型，但该提供商的解释器不支持。
成本考量: 代码执行或模型使用的成本结构可能不符合您的预算或限制条件。
文件大小: 输入数据的文件过大或不受提供商的解释器支持。
与内部系统集成: 该提供商的解释器可能无法与您的内部系统集成。

你将学习的内容

通过遵循本指南，您将学习如何：

使用Docker设置一个隔离的Python代码执行环境
为LLM智能体配置您自己的代码解释器工具
为安全和防护建立明确的“智能体”关注点分离
使用o3-mini模型动态生成数据分析代码
协调智能体以高效完成指定任务
设计一个能够动态生成并执行代码的智能体应用

你将学习如何从头开始构建一个自定义代码解释器工具，利用LLM的强大能力生成复杂代码，并在隔离环境中安全执行这些代码——这一切都是为了让你的人工智能应用更加灵活、强大且经济高效。

示例场景

我们将使用Key Factors Traffic Accidents提供的样本数据来回答一系列问题。这些问题不需要预先定义，我们将赋予LLM生成代码来回答此类问题的能力。

示例问题可能包括：

哪些因素对事故频率的影响最大？（特征重要性分析）
哪些区域的事故风险最高？（分类/聚类）
交通罚款金额如何影响事故数量？（回归/因果推断）
我们能否确定最优罚款金额以减少事故率？（优化模型）
更高的罚款是否与较低的平均车速或减少的事故相关？（相关性/回归分析）
等等...

使用传统的预定义工具调用方法时，开发者需要为每个问题预先定义函数。这限制了LLM回答预定义函数集之外问题的能力。我们通过采用动态工具调用方法克服了这一限制，该方法让LLM生成代码并使用代码解释器工具来执行代码。

概述

让我们深入了解如何使用动态生成工具调用来构建这个智能体应用程序。该应用包含三个主要组件：

步骤1：设置一个隔离的代码执行容器环境

我们需要一个安全的环境来执行LLM生成的函数调用。我们希望避免直接在主机上运行LLM生成的代码，因此将创建一个具有受限资源访问权限的Docker容器环境（例如无网络访问）。默认情况下，Docker容器无法访问主机的文件系统，这有助于确保LLM生成的任何代码都保持隔离状态。

⚠️ 重要提醒：为LLM生成的代码实施强有力的防护措施

LLMs可能会生成具有意外后果的有害代码。作为最佳实践，请将代码执行环境隔离，仅授予任务所需的资源访问权限。避免在主机或笔记本电脑上运行LLM生成的代码。

步骤2：定义并测试智能体

"什么是智能体？" 在本手册中，智能体是指：

供LLM遵循的指令集，即开发者提示词
一个LLM模型，以及通过API调用该模型的能力
工具调用访问函数，并执行该函数的能力

我们将定义两个智能体：

FileAccessAgent: 该智能体将读取文件并将上下文提供给PythonCodeExecAgent。
PythonCodeExecAgent: 该智能体会生成Python代码来回答用户的问题，并在Docker容器中执行代码。

步骤3：设置智能体编排以运行应用程序

根据应用需求，有多种方式来编排智能体。在本示例中，我们将采用简单的顺序编排方式：用户提供任务后，智能体将按顺序调用以完成任务。

整体编排如下所示：

开始使用

先决条件

在开始之前，请确保您的主机上已安装并配置好以下内容：

Docker：已在本地机器上安装并运行。您可以了解更多关于Docker的信息并从这里安装它。
Python: 安装在您的本地机器上。您可以了解更多关于Python的信息并从这里安装它。
OpenAI API密钥：在本地机器上设置为环境变量或在根目录的.env文件中设置。您可以了解更多关于OpenAI API密钥的信息并从这里进行设置。

步骤1：设置隔离的代码执行环境

让我们定义一个Docker化的容器环境，用于执行我们的代码。我已经在resources/docker目录下定义了dockerfile，它将用于创建具有以下规格的容器环境：

以 Python 3.10 为基础
非root用户
预安装requirements.txt中的软件包

docker镜像创建过程中包含的requirements.txt文件列出了我们的LLM生成代码完成任务可能需要用到的所有潜在软件包。考虑到我们将限制容器访问网络，因此需要预先安装任务所需的软件包。出于安全考虑，我们的LLM将不被允许安装任何额外的软件包。

您可以创建自己的Docker镜像，包含任务所需的语言环境（如Python 3.10）并预装必要的软件包，或者为特定语言（如Java、C++等）创建自定义Docker镜像并安装任务所需的软件包。

让我们用以下命令构建docker镜像。为了简洁起见，我已将输出重定向到grep以获取成功消息，并在构建失败时打印消息。

!docker build -t python_sandbox:latest ./resources/docker 2>&1 | grep -E "View build details|ERROR" || echo "Build failed."

View build details: docker-desktop://dashboard/build/desktop-linux/desktop-linux/kl8fo02q7rgbindi9b42pn1zr

让我们在受限模式下运行容器。该容器将在后台运行。这是我们为容器定义安全策略的机会。最佳实践是仅允许容器执行任务所需的最基本功能。默认情况下，容器无法从内部访问主机文件系统。我们还要限制其网络访问权限，使其无法访问互联网或任何其他网络资源。

# Run the container in restricted mode. The container will run in the background.
!docker run -d --name sandbox --network none --cap-drop all --pids-limit 64 --tmpfs /tmp:rw,size=64M   python_sandbox:latest sleep infinity

8446d1e9a7972f2e00a5d1799451c1979d34a2962aa6b4c35a9868af8d321b0e

让我们使用docker ps命令确认容器正在运行，该命令应该会列出我们的容器。

!docker ps

CONTAINER ID   IMAGE                   COMMAND            CREATED         STATUS         PORTS     NAMES
8446d1e9a797   python_sandbox:latest   "sleep infinity"   2 seconds ago   Up 2 seconds             sandbox

步骤2：定义并测试智能体

出于我们的目的，我们将定义两个智能体。

智能体 1: 文件访问智能体 (带预定义工具调用)

理解文件内容的说明，以作为智能体2的上下文提供。
可以访问主机机器的文件系统。
可以从主机读取文件并将其复制到Docker容器中。
无法访问代码解释器工具。
使用gpt-4o模型。

智能体 2: Python 代码生成器和执行器 (支持动态生成工具调用和代码执行)

从智能体1接收文件内容的上下文。
生成Python脚本以回答用户问题的指令。
可以在Docker容器内访问代码解释器，该解释器用于执行Python代码。
只能访问Docker容器内的文件系统（无法访问宿主机）。
无法访问主机文件系统或网络。
使用我们最新的o3-mini模型，该模型擅长代码生成。

文件访问（智能体1）与代码生成器和执行器（智能体2）的这种分离对于防止LLM直接访问或修改主机至关重要。

将智能体1限制为仅使用静态工具调用，因为它可以访问主机文件系统。

智能体	工具调用类型	访问主机文件系统	访问Docker容器文件系统	访问代码解释器
智能体 1: 文件访问	预定义工具	是	是	否
智能体2: Python代码生成与执行器	动态生成工具	否	是	是

为了保持智能体和工具的组织性，我们定义了一套核心类，这些类将用于通过面向对象编程原则创建两个智能体以确保一致性。

BaseAgent: 我们从一个抽象基类开始，它强制实现诸如task()等通用方法签名。基类还提供了用于调试的日志记录器、语言模型接口以及其他常见功能，例如add_context()用于向智能体添加上下文。
ChatMessages: 一个用于存储对话历史的类，因为ChatCompletions API是无状态的。
ToolManager: 用于管理智能体可调用工具的类。
ToolInterface: 一个抽象类，用于定义智能体可以调用的任何"工具"，以确保这些工具具有一致的接口。

这些类定义在object_oriented_agents/core_classes目录中。

核心类的UML类图

以下类图展示了核心类之间的关系。该UML(统一建模语言)图表是使用Mermaid工具生成的

定义智能体1：带有文件访问工具的FileAccessAgent

让我们从定义继承自ToolInterface类的FileAccessTool开始。FileAccessTool工具定义在resources/registry/tools目录下的file_access_tool.py文件中。

FileAccessTool实现了ToolInterface类，确保所有工具具有一致的接口。
在get_definition方法和工具的run方法中绑定OpenAI Function Calling API的工具定义，可确保可维护性、可扩展性和可重用性。

现在，让我们定义继承BaseAgent类的FileAccessAgent，并将FileAccessTool绑定到该智能体。FileAccessAgent定义在resources/registry/agents目录下的file_acess_agent.py文件中。FileAccessAgent的功能如下：

BaseAgent 类的具体实现。
初始化时包含开发者提示、模型名称、日志记录器和语言模型接口。开发者可根据需要覆盖这些值。
具有一个setup_tools方法，用于将FileAccessTool注册到工具管理器。
具有一个task方法，该方法调用FileAccessTool来读取文件并向PythonCodeExecAgent提供上下文。
model_name='gpt-4o' 为该任务提供足够的推理和工具调用能力。

定义智能体2：带有PythonExecTool的PythonExecAgent

类似地，PythonExecTool继承自ToolInterface类并实现了get_definition和run方法。get_definition方法以OpenAI Function Calling API预期的格式返回工具定义。run方法在Docker容器中执行Python代码并返回输出结果。该工具定义在resources/registry/tools目录下的python_code_interpreter_tool.py文件中。

同样地，PythonExecAgent是BaseAgent类的具体实现。它定义在resources/registry/agents目录下的python_code_exec_agent.py文件中。PythonExecAgent具有以下特性：

BaseAgent 类的具体实现。
初始化时包含开发者提示、模型名称、日志记录器和语言模型接口。开发者可根据需要覆盖这些值。
具有一个setup_tools方法，用于将PythonExecTool注册到工具管理器。
具有一个task方法，该方法调用OpenAI API来执行用户的任务，在本例中涉及生成一个Python脚本来回答用户的问题，并使用代码解释器工具运行它。
model_name='o3-mini' 擅长STEM任务，例如代码生成。
reasoning_effort='high' 允许针对任务的复杂性进行更完整的推理，但代价是生成更多token和响应速度变慢。默认值为medium，这是速度和推理准确性之间的平衡。

你可以了解更多关于reasoning_effort参数的信息这里。

步骤3：设置智能体编排以运行应用程序

定义好智能体后，我们现在可以定义运行应用程序的编排循环。该循环将提示用户输入问题或任务，然后调用FileAccessAgent读取文件并为PythonExecAgent提供上下文。PythonExecAgent将生成Python代码来回答用户的问题，并在Docker容器中执行该代码。代码执行的输出将显示给用户。

用户可以输入'exit'来停止应用程序。我们的问题是：哪些因素对事故频率的影响最大？ 请注意，我们没有预定义回答这个问题的函数。

# Import the agents from registry/agents

from resources.registry.agents.file_access_agent import FileAccessAgent
from resources.registry.agents.python_code_exec_agent import PythonExecAgent


prompt = """Use the file traffic_accidents.csv for your analysis. The column names are:
Variable	Description
accidents	Number of recorded accidents, as a positive integer.
traffic_fine_amount	Traffic fine amount, expressed in thousands of USD.
traffic_density	Traffic density index, scale from 0 (low) to 10 (high).
traffic_lights	Proportion of traffic lights in the area (0 to 1).
pavement_quality	Pavement quality, scale from 0 (very poor) to 5 (excellent).
urban_area	Urban area (1) or rural area (0), as an integer.
average_speed	Average speed of vehicles in km/h.
rain_intensity	Rain intensity, scale from 0 (no rain) to 3 (heavy rain).
vehicle_count	Estimated number of vehicles, in thousands, as an integer.
time_of_day	Time of day in 24-hour format (0 to 24).
accidents	traffic_fine_amount
"""


print("Setup: ")
print(prompt)

print("Setting up the agents... ")

# Instantiate the agents with the default constructor defined values
# Developer may override the default values - prompt, model, logger, and language model interface if needed

# This agent use gpt-4o by default
file_ingestion_agent = FileAccessAgent()

# Let's make sure agent uses o3-mini model and set the reasoning_effort to high
data_analysis_agent = PythonExecAgent(model_name='o3-mini', reasoning_effort='high')

print("Understanding the contents of the file...")
# Give a task to the file ingestion agent to read the file and provide the context to the data analysis agent 
file_ingestion_agent_output = file_ingestion_agent.task(prompt)

# Add the file content as context to the data analysis agent
# The context is added to the agent's tool manager so that the tool manager can use the context to generate the code 

data_analysis_agent.add_context(prompt)
data_analysis_agent.add_context(file_ingestion_agent_output)

while True:

    print("Type your question related to the data in the file. Type 'exit' to exit.")
    user_input = input("Type your question.")

    if user_input == "exit":
        print("Exiting the application.")
        break

    print(f"User question: {user_input}")

    print("Generating dynamic tools and using code interpreter...")
    data_analysis_agent_output = data_analysis_agent.task(user_input)

    print("Output...")
    print(data_analysis_agent_output)

Setup: 
Use the file traffic_accidents.csv for your analysis. The column names are:
Variable	Description
accidents	Number of recorded accidents, as a positive integer.
traffic_fine_amount	Traffic fine amount, expressed in thousands of USD.
traffic_density	Traffic density index, scale from 0 (low) to 10 (high).
traffic_lights	Proportion of traffic lights in the area (0 to 1).
pavement_quality	Pavement quality, scale from 0 (very poor) to 5 (excellent).
urban_area	Urban area (1) or rural area (0), as an integer.
average_speed	Average speed of vehicles in km/h.
rain_intensity	Rain intensity, scale from 0 (no rain) to 3 (heavy rain).
vehicle_count	Estimated number of vehicles, in thousands, as an integer.
time_of_day	Time of day in 24-hour format (0 to 24).
accidents	traffic_fine_amount

Setting up the agents... 
Understanding the contents of the file...

2025-02-03 13:03:54,066 - MyApp - INFO - Handling tool call: safe_file_access
2025-02-03 13:03:54,067 - MyApp - INFO - Tool arguments: {'filename': './resources/data/traffic_accidents.csv'}
2025-02-03 13:03:54,562 - MyApp - INFO - Tool 'safe_file_access' response: Copied ./resources/data/traffic_accidents.csv into sandbox:/home/sandboxuser/.
The file content for the first 15 rows is:
    accidents  traffic_fine_amount  traffic_density  traffic_lights  pavement_quality  urban_area  average_speed  rain_intensity  vehicle_count  time_of_day
0          20               4.3709           2.3049         753.000            0.7700           1        321.592          1.1944       290.8570     160.4320
1          11               9.5564           3.2757           5.452            4.0540           1        478.623          6.2960       931.8120       8.9108
2          19               7.5879           2.0989           6.697          345.0000           0        364.476          2.8584       830.0860       5.5727
3          23               6.3879           4.9188           9.412            4.7290           0         20.920          2.1065       813.1590     131.4520
4          23               2.4042           1.9610           7.393            1.7111           1         37.378          1.7028         1.4663       6.9610
5          31               2.4040           6.7137           5.411            5.9050           1        404.621          1.8936       689.0410       8.1801
6          29               1.5228           5.2316           9.326            2.3785           1         16.292          2.5213       237.9710      12.6622
7          18               8.7956           8.9864           4.784            1.9984           0        352.566          1.9072       968.0670       8.0602
8          15               6.4100           1.6439           5.612            3.6090           1        217.198          3.4380       535.4440       8.2904
9          22               7.3727           8.0411           5.961            4.7650           1        409.261          2.0919       569.0560     203.5910
10         28               1.1853           7.9196           0.410            3.7678           1        147.689          1.6946       362.9180     224.1580
11         17               9.7292           1.2718           8.385            8.9720           0         46.888          2.8990       541.3630     198.5740
12         14               8.4920           3.9856           1.852            4.6776           0        287.393          2.2012        75.2240       2.3728
13         21               2.9111           1.7015           5.548            1.9607           1        176.652          1.0320       566.3010       6.9538
14         22               2.6364           2.5472           7.222            2.3709           0        209.686          4.0620        64.4850     170.7110

Type your question related to the data in the file. Type 'exit' to exit.
User question: What factors contribute the most to accident frequency?
Generating dynamic tools and using code interpreter...

2025-02-03 13:04:39,427 - MyApp - INFO - Handling tool call: execute_python_code
2025-02-03 13:04:39,429 - MyApp - INFO - Tool arguments: {'python_code': "import pandas as pd\nimport numpy as np\nfrom sklearn.linear_model import LinearRegression\nfrom sklearn.preprocessing import StandardScaler\nimport matplotlib.pyplot as plt\nimport seaborn as sns\n\n# Load the dataset\nfile_path = '/home/sandboxuser/traffic_accidents.csv'\ndf = pd.read_csv(file_path)\n\n# Show basic information\nprint('Dataset shape:', df.shape)\nprint('First few rows:')\nprint(df.head(), '\\n')\nprint('Columns:', df.columns.tolist(), '\\n')\n\n# Correlation matrix analysis\ncorr_matrix = df.corr()\nprint('Correlation matrix:')\nprint(corr_matrix, '\\n')\n\n# Correlation of each feature with accidents\nacc_corr = corr_matrix['accidents'].drop('accidents').sort_values(key=lambda x: abs(x), ascending=False)\nprint('Correlation of other variables with accidents (sorted by absolute correlation):')\nprint(acc_corr, '\\n')\n\n# Visualize the correlation matrix\nplt.figure(figsize=(10, 8))\nsns.heatmap(corr_matrix, annot=True, cmap='coolwarm', fmt='.2f')\nplt.title('Correlation Matrix')\nplt.tight_layout()\nplt.savefig('correlation_matrix.png')\nplt.close()\n\n# Prepare data for regression analysis\n# Exclude target variable 'accidents'\nfeatures = [col for col in df.columns if col != 'accidents']\nX = df[features]\ny = df['accidents']\n\n# Standardize the features to compare the regression coefficients on the same scale\nscaler = StandardScaler()\nX_scaled = scaler.fit_transform(X)\n\n# Fit a linear regression model\nmodel = LinearRegression()\nmodel.fit(X_scaled, y)\n\n# Gather coefficients along with feature names\ncoef = model.coef_\ncoef_df = pd.DataFrame({'Feature': features, 'Coefficient': coef})\ncoef_df['AbsCoefficient'] = coef_df['Coefficient'].abs()\ncoef_df = coef_df.sort_values(by='AbsCoefficient', ascending=False)\nprint('Linear Regression Coefficients (using standardized features):')\nprint(coef_df[['Feature', 'Coefficient']], '\\n')\n\n# Additionally, compute feature importances using a Random Forest regressor\nfrom sklearn.ensemble import RandomForestRegressor\nrf = RandomForestRegressor(random_state=42)\nrf.fit(X, y)\nrf_importance = rf.feature_importances_\nrf_df = pd.DataFrame({'Feature': features, 'Importance': rf_importance})\nrf_df = rf_df.sort_values(by='Importance', ascending=False)\nprint('Random Forest Feature Importances:')\nprint(rf_df, '\\n')\n\n# The printed outputs will help in understanding which factors contribute most to accident frequency.\n\n# For clarity, save the coefficients and importances to CSV files (optional)\ncoef_df.to_csv('linear_regression_coefficients.csv', index=False)\nrf_df.to_csv('random_forest_importances.csv', index=False)\n\n# End of analysis\n"}
2025-02-03 13:04:43,123 - MyApp - INFO - Tool 'execute_python_code' response: Dataset shape: (8756, 10)
First few rows:
   accidents  traffic_fine_amount  ...  vehicle_count  time_of_day
0         20               4.3709  ...       290.8570     160.4320
1         11               9.5564  ...       931.8120       8.9108
2         19               7.5879  ...       830.0860       5.5727
3         23               6.3879  ...       813.1590     131.4520
4         23               2.4042  ...         1.4663       6.9610

[5 rows x 10 columns] 

Columns: ['accidents', 'traffic_fine_amount', 'traffic_density', 'traffic_lights', 'pavement_quality', 'urban_area', 'average_speed', 'rain_intensity', 'vehicle_count', 'time_of_day'] 

Correlation matrix:
                     accidents  traffic_fine_amount  ...  vehicle_count  time_of_day
accidents             1.000000            -0.745161  ...       0.068399     0.101995
traffic_fine_amount  -0.745161             1.000000  ...      -0.016610    -0.006236
traffic_density      -0.059265            -0.004365  ...      -0.014244     0.002806
traffic_lights       -0.026642             0.009056  ...       0.001373    -0.001971
pavement_quality      0.064694            -0.021229  ...       0.007840     0.000055
urban_area            0.145092            -0.005136  ...      -0.006053    -0.006320
average_speed         0.093923             0.009151  ...       0.000777    -0.005338
rain_intensity       -0.091673            -0.015302  ...      -0.025933    -0.013446
vehicle_count         0.068399            -0.016610  ...       1.000000    -0.009303
time_of_day           0.101995            -0.006236  ...      -0.009303     1.000000

[10 rows x 10 columns] 

Correlation of other variables with accidents (sorted by absolute correlation):
traffic_fine_amount   -0.745161
urban_area             0.145092
time_of_day            0.101995
average_speed          0.093923
rain_intensity        -0.091673
vehicle_count          0.068399
pavement_quality       0.064694
traffic_density       -0.059265
traffic_lights        -0.026642
Name: accidents, dtype: float64 

Linear Regression Coefficients (using standardized features):
               Feature  Coefficient
0  traffic_fine_amount    -3.891935
4           urban_area     0.739618
5        average_speed     0.533698
6       rain_intensity    -0.532251
8          time_of_day     0.512661
1      traffic_density    -0.331997
7        vehicle_count     0.281283
3     pavement_quality     0.264987
2       traffic_lights    -0.092800 

Random Forest Feature Importances:
               Feature  Importance
0  traffic_fine_amount    0.580838
1      traffic_density    0.165201
6       rain_intensity    0.095124
8          time_of_day    0.035814
5        average_speed    0.035590
3     pavement_quality    0.032177
2       traffic_lights    0.022613
7        vehicle_count    0.021006
4           urban_area    0.011637

Output...
The analysis shows that one variable stands out by far:

• Both the simple correlation analysis and regression results indicate that traffic_fine_amount is the dominant factor—its correlation with accidents is strong (about –0.75), and in the standardized linear regression its coefficient is the largest in magnitude (around –3.89). The negative sign suggests that, in this data, higher fine amounts are associated with fewer accidents (which might reflect more stringent enforcement or deterrence).

Other findings include:

• The Random Forest model also ranks traffic_fine_amount as most important (importance ≈ 0.58), with the next most influential factor being traffic_density (importance ≈ 0.17). Although its simple correlation with accidents is lower, traffic_density may contribute non‐linearly.

• Additional factors like urban_area, average_speed, rain_intensity, and time_of_day have moderate associations (with linear model coefficients ranging between about ±0.5 to +0.74). These suggest that accidents tend to be somewhat higher in urban areas and vary with time of day and weather conditions, but their overall impact is much less than that of traffic fine amounts.

In summary, the data analysis indicates that traffic_fine_amount contributes the most to accident frequency—with higher fines linked to fewer recorded accidents—while factors such as traffic density, urban area status, vehicle speed, rain intensity, and time of day also play secondary roles.
Type your question related to the data in the file. Type 'exit' to exit.
Exiting the application.

在这个示例中，o3-mini根据用户问题动态生成了一个工具（Python脚本）来分析数据。请注意o3-mini使用了多种方法来研究问题，如相关性分析、线性回归和随机森林模型。这种方法突出了以下几点：

reasoning_effort: 模型执行的推理深度，例如在本例中尝试的方法数量，通常随着该参数从低、中到高的增加而增加。您可以尝试不同的推理深度级别以观察差异。

动态生成工具调用: 用于分析数据的工具(Python脚本)并非由开发者手动编写或预先确定。相反，o3-mini模型在运行时创建了相关的数据探索和相关性分析代码。

隔离代码执行：为确保安全性并避免在主机上运行不受信任的代码，Python脚本通过execute_python_code工具在Docker容器内执行。该容器具有受限的资源访问权限（例如无网络和受限的文件系统访问），从而将任意代码执行带来的潜在风险降至最低。

结论

该手册提供了开发自定义代码解释器的指南，可根据特定应用需求进行定制，解决供应商提供解决方案中存在的局限性，例如语言限制、成本考量以及与不同LLM或模型配合所需的灵活性。

智能体与工具管理方法: 我们还定义了一组核心类来管理智能体和工具。这种方法确保智能体和工具具有一致的接口，并可在不同应用中重复使用。可以创建如registry文件夹这样的智能体与工具资源库来统一管理。

o3-mini模型：我们展示了o3-mini模型能够根据用户简短提示，在运行时生成复杂代码来分析数据的能力。随后o3-mini模型会对分析结果进行推理，并向用户解释这些结果。

最后，总结一下，使用动态工具调用构建智能体应用的三个步骤是：

设置一个隔离的代码执行容器环境
定义并测试智能体
设置智能体编排以运行应用程序

我们讨论了隔离代码执行环境的重要性，以确保安全性并避免在主机上运行不受信任的代码。通过CSV文件的用例，我们演示了如何动态生成工具（Python脚本）来分析数据并回答用户的问题。我们还展示了如何在Docker容器中执行代码并将输出返回给用户。

2025年2月3日

构建您自己的代码解释器 - 使用o3-mini实现动态工具生成与执行

动态生成工具调用与代码解释器

使用o3-mini进行动态工具生成

为什么要构建自己的代码解释器

你将学习的内容

示例场景

概述

步骤1：设置一个隔离的代码执行容器环境

⚠️ 重要提醒：为LLM生成的代码实施强有力的防护措施

步骤2：定义并测试智能体

步骤3：设置智能体编排以运行应用程序

开始使用

先决条件

步骤1：设置隔离的代码执行环境

步骤2：定义并测试智能体

核心类的UML类图

步骤3：设置智能体编排以运行应用程序

结论