
[Beta] Batches API

Covers Batches, Files

Supported providers:

  • Azure OpenAI
  • OpenAI

Quick Start

  • Create a file for batch completion

  • Create a batch request

  • List batches

  • Retrieve a specific batch and its file content

$ export OPENAI_API_KEY="sk-..."

$ litellm

# RUNNING on http://0.0.0.0:4000

Create a file for batch completion

curl http://localhost:4000/v1/files \
-H "Authorization: Bearer sk-1234" \
-F purpose="batch" \
-F file="@mydata.jsonl"

Create a batch request

curl http://localhost:4000/v1/batches \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"input_file_id": "file-abc123",
"endpoint": "/v1/chat/completions",
"completion_window": "24h"
}'

Retrieve a specific batch

curl http://localhost:4000/v1/batches/batch_abc123 \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json"

List batches

curl http://localhost:4000/v1/batches \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json"

Create a file for batch completion

import litellm
import os

os.environ["OPENAI_API_KEY"] = "sk-.."

file_name = "openai_batch_completions.jsonl"
_current_dir = os.path.dirname(os.path.abspath(__file__))
file_path = os.path.join(_current_dir, file_name)
file_obj = await litellm.acreate_file(
    file=open(file_path, "rb"),
    purpose="batch",
    custom_llm_provider="openai",
)
print("Response from creating file=", file_obj)

Create a batch request

import litellm
import os

# ID of the input file uploaded in the previous step
batch_input_file_id = file_obj.id

create_batch_response = await litellm.acreate_batch(
    completion_window="24h",
    endpoint="/v1/chat/completions",
    input_file_id=batch_input_file_id,
    custom_llm_provider="openai",
    metadata={"key1": "value1", "key2": "value2"},
)

print("Response from litellm.create_batch=", create_batch_response)

Retrieve a specific batch and its file content

retrieved_batch = await litellm.aretrieve_batch(
    batch_id=create_batch_response.id, custom_llm_provider="openai"
)
print("retrieved batch=", retrieved_batch)
# just assert that we retrieved a non-null batch

assert retrieved_batch.id == create_batch_response.id

# try to get the file content of our original file

file_content = await litellm.afile_content(
    file_id=batch_input_file_id, custom_llm_provider="openai"
)

print("file content = ", file_content)
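Once a batch reaches a completed status, its output file can be fetched with `litellm.afile_content` the same way, using the batch's `output_file_id`. A minimal sketch of parsing such an output file, assuming the standard OpenAI batch output line format (`custom_id` plus a chat completion under `response.body`); `parse_batch_output` is an illustrative helper name, not part of the litellm API:

```python
import json

def parse_batch_output(jsonl_text: str) -> dict:
    """Map each custom_id to the assistant's reply, assuming the
    standard OpenAI batch output line format."""
    results = {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        row = json.loads(line)
        body = row["response"]["body"]
        results[row["custom_id"]] = body["choices"][0]["message"]["content"]
    return results
```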

List batches

list_batches_response = litellm.list_batches(custom_llm_provider="openai", limit=2)
print("list_batches_response=", list_batches_response)
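Batches run asynchronously, so in practice retrieval happens in a polling loop. A sketch of a generic poller, assuming a terminal-status set matching the OpenAI batch lifecycle; `fetch` would wrap a call such as `litellm.aretrieve_batch` (both `wait_for_batch` and `fetch` are illustrative names):

```python
import asyncio

# Terminal states in the OpenAI batch lifecycle (assumption: no others occur)
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}

async def wait_for_batch(fetch, batch_id, poll_interval=1.0, max_polls=10):
    """Call `fetch(batch_id)` until the returned status is terminal."""
    for _ in range(max_polls):
        batch = await fetch(batch_id)
        if batch["status"] in TERMINAL_STATUSES:
            return batch
        await asyncio.sleep(poll_interval)
    raise TimeoutError(f"batch {batch_id} did not finish within {max_polls} polls")
```

With the real API, the 24h completion window means a much larger `poll_interval` (minutes, not seconds) is appropriate.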

👉 Proxy API Reference

Azure Batches API

Just add the Azure environment variables to your environment.

export AZURE_API_KEY=""
export AZURE_API_BASE=""

And use /azure/* for the Batches API calls

http://0.0.0.0:4000/azure/v1/batches

Usage

Setup

  • Add Azure API keys to your environment

1. Upload file

curl http://localhost:4000/azure/v1/files \
-H "Authorization: Bearer sk-1234" \
-F purpose="batch" \
-F file="@mydata.jsonl"

Example file

Note: model should be your Azure deployment name.

{"custom_id": "task-0", "method": "POST", "url": "/chat/completions", "body": {"model": "REPLACE-WITH-MODEL-DEPLOYMENT-NAME", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "When was Microsoft founded?"}]}}
{"custom_id": "task-1", "method": "POST", "url": "/chat/completions", "body": {"model": "REPLACE-WITH-MODEL-DEPLOYMENT-NAME", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "When was the first XBOX released?"}]}}
{"custom_id": "task-2", "method": "POST", "url": "/chat/completions", "body": {"model": "REPLACE-WITH-MODEL-DEPLOYMENT-NAME", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "What is Altair Basic?"}]}}
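The sample file above can also be generated programmatically. A minimal sketch; the questions and the mydata.jsonl filename mirror the example above, and the deployment-name placeholder is kept as-is for you to replace:

```python
import json

questions = [
    "When was Microsoft founded?",
    "When was the first XBOX released?",
    "What is Altair Basic?",
]

# Write one batch task per line in the OpenAI batch input format
with open("mydata.jsonl", "w", encoding="utf-8") as f:
    for i, question in enumerate(questions):
        task = {
            "custom_id": f"task-{i}",
            "method": "POST",
            "url": "/chat/completions",
            "body": {
                "model": "REPLACE-WITH-MODEL-DEPLOYMENT-NAME",
                "messages": [
                    {"role": "system", "content": "You are an AI assistant that helps people find information."},
                    {"role": "user", "content": question},
                ],
            },
        }
        f.write(json.dumps(task) + "\n")
```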

2. Create batch

curl http://0.0.0.0:4000/azure/v1/batches \
-H "Authorization: Bearer $LITELLM_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"input_file_id": "file-abc123",
"endpoint": "/v1/chat/completions",
"completion_window": "24h"
}'

3. Retrieve batch

curl http://0.0.0.0:4000/azure/v1/batches/batch_abc123 \
-H "Authorization: Bearer $LITELLM_API_KEY" \
-H "Content-Type: application/json"

4. Cancel batch

curl http://0.0.0.0:4000/azure/v1/batches/batch_abc123/cancel \
-H "Authorization: Bearer $LITELLM_API_KEY" \
-H "Content-Type: application/json" \
-X POST

5. List batches

curl http://0.0.0.0:4000/v1/batches?limit=2 \
-H "Authorization: Bearer $LITELLM_API_KEY" \
-H "Content-Type: application/json"

👉 Health check Azure Batches models

[Beta] Loadbalance Multiple Azure Deployments

In your config.yaml, set enable_loadbalancing_on_batch_endpoints: true

model_list:
  - model_name: "batch-gpt-4o-mini"
    litellm_params:
      model: "azure/gpt-4o-mini"
      api_key: os.environ/AZURE_API_KEY
      api_base: os.environ/AZURE_API_BASE
    model_info:
      mode: batch

litellm_settings:
  enable_loadbalancing_on_batch_endpoints: true # 👈 KEY CHANGE

Note: This works for {PROXY_BASE_URL}/v1/files and {PROXY_BASE_URL}/v1/batches. Note: The response is in the OpenAI format.

  1. Upload a File

Just set model: batch-gpt-4o-mini in your .jsonl file

curl http://localhost:4000/v1/files \
-H "Authorization: Bearer sk-1234" \
-F purpose="batch" \
-F file="@mydata.jsonl"

Example file

Note: model should be your Azure deployment name.

{"custom_id": "task-0", "method": "POST", "url": "/chat/completions", "body": {"model": "batch-gpt-4o-mini", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "When was Microsoft founded?"}]}}
{"custom_id": "task-1", "method": "POST", "url": "/chat/completions", "body": {"model": "batch-gpt-4o-mini", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "When was the first XBOX released?"}]}}
{"custom_id": "task-2", "method": "POST", "url": "/chat/completions", "body": {"model": "batch-gpt-4o-mini", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "What is Altair Basic?"}]}}

Expected Response (OpenAI-compatible)

{"id":"file-f0be81f654454113a922da60acb0eea6",...}
  2. Create the batch
curl http://0.0.0.0:4000/v1/batches \
-H "Authorization: Bearer $LITELLM_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"input_file_id": "file-f0be81f654454113a922da60acb0eea6",
"endpoint": "/v1/chat/completions",
"completion_window": "24h",
"model": "batch-gpt-4o-mini"
}'

Expected Response:

{"id":"batch_94e43f0a-d805-477d-adf9-bbb9c50910ed",...}
  3. Retrieve the batch
curl http://0.0.0.0:4000/v1/batches/batch_94e43f0a-d805-477d-adf9-bbb9c50910ed \
-H "Authorization: Bearer $LITELLM_API_KEY" \
-H "Content-Type: application/json"

Expected Response:

{"id":"batch_94e43f0a-d805-477d-adf9-bbb9c50910ed",...}
  4. List batches
curl http://0.0.0.0:4000/v1/batches?limit=2 \
-H "Authorization: Bearer $LITELLM_API_KEY" \
-H "Content-Type: application/json"

Expected Response:

{"data":[{"id":"batch_R3V...}