
[Beta] Batches API

Covers Batches, Files

Supported providers:

  • Azure OpenAI
  • OpenAI

Quick Start

  • Create a file for batch completion

  • Create a batch request

  • List batches

  • Retrieve a specific batch and its file content

$ export OPENAI_API_KEY="sk-..."

$ litellm

# RUNNING on http://0.0.0.0:4000

Create a file for batch completion

curl http://localhost:4000/v1/files \
-H "Authorization: Bearer sk-1234" \
-F purpose="batch" \
-F file="@mydata.jsonl"

Create a batch request

curl http://localhost:4000/v1/batches \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"input_file_id": "file-abc123",
"endpoint": "/v1/chat/completions",
"completion_window": "24h"
}'

Retrieve a specific batch

curl http://localhost:4000/v1/batches/batch_abc123 \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json"

List batches

curl http://localhost:4000/v1/batches \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json"

Create a file for batch completion

import litellm
import os

os.environ["OPENAI_API_KEY"] = "sk-.."

file_name = "openai_batch_completions.jsonl"
_current_dir = os.path.dirname(os.path.abspath(__file__))
file_path = os.path.join(_current_dir, file_name)
file_obj = await litellm.acreate_file(
    file=open(file_path, "rb"),
    purpose="batch",
    custom_llm_provider="openai",
)
print("Response from creating file=", file_obj)

Create a batch request

import litellm
import os

# ID of the input file uploaded in the previous step
batch_input_file_id = file_obj.id

create_batch_response = await litellm.acreate_batch(
    completion_window="24h",
    endpoint="/v1/chat/completions",
    input_file_id=batch_input_file_id,
    custom_llm_provider="openai",
    metadata={"key1": "value1", "key2": "value2"},
)

print("Response from litellm.create_batch=", create_batch_response)

Retrieve a specific batch and its file content

retrieved_batch = await litellm.aretrieve_batch(
    batch_id=create_batch_response.id, custom_llm_provider="openai"
)
print("retrieved batch=", retrieved_batch)
# just assert that we retrieved a non-null batch

assert retrieved_batch.id == create_batch_response.id

# try to get the file content of our original file

file_content = await litellm.afile_content(
    file_id=batch_input_file_id, custom_llm_provider="openai"
)

print("file content = ", file_content)
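Once a batch reaches a completed status, its output file can be fetched with `litellm.afile_content` the same way, using the batch's `output_file_id`. A minimal sketch of parsing such an output file, assuming the standard OpenAI batch output line format (`custom_id` plus a chat completion under `response.body`); `parse_batch_output` is an illustrative helper name, not part of the litellm API:

```python
import json

def parse_batch_output(jsonl_text: str) -> dict:
    """Map each custom_id to the assistant's reply, assuming the
    standard OpenAI batch output line format."""
    results = {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        row = json.loads(line)
        body = row["response"]["body"]
        results[row["custom_id"]] = body["choices"][0]["message"]["content"]
    return results
```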

List batches

list_batches_response = litellm.list_batches(custom_llm_provider="openai", limit=2)
print("list_batches_response=", list_batches_response)
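Batches run asynchronously, so in practice retrieval happens in a polling loop. A sketch of a generic poller, assuming a terminal-status set matching the OpenAI batch lifecycle; `fetch` would wrap a call such as `litellm.aretrieve_batch` (both `wait_for_batch` and `fetch` are illustrative names):

```python
import asyncio

# Terminal states in the OpenAI batch lifecycle (assumption: no others occur)
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}

async def wait_for_batch(fetch, batch_id, poll_interval=1.0, max_polls=10):
    """Call `fetch(batch_id)` until the returned status is terminal."""
    for _ in range(max_polls):
        batch = await fetch(batch_id)
        if batch["status"] in TERMINAL_STATUSES:
            return batch
        await asyncio.sleep(poll_interval)
    raise TimeoutError(f"batch {batch_id} did not finish within {max_polls} polls")
```

With the real API, the 24h completion window means a much larger `poll_interval` (minutes, not seconds) is appropriate.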

👉 Proxy API Reference

Azure Batches API

Just add the Azure environment variables to your environment.

export AZURE_API_KEY=""
export AZURE_API_BASE=""

And use /azure/* for the Batches API calls

http://0.0.0.0:4000/azure/v1/batches

Usage

Setup

  • Add Azure API keys to your environment

1. Upload file

curl http://localhost:4000/azure/v1/files \
-H "Authorization: Bearer sk-1234" \
-F purpose="batch" \
-F file="@mydata.jsonl"

Example file

Note: model should be your Azure deployment name.

{"custom_id": "task-0", "method": "POST", "url": "/chat/completions", "body": {"model": "REPLACE-WITH-MODEL-DEPLOYMENT-NAME", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "When was Microsoft founded?"}]}}
{"custom_id": "task-1", "method": "POST", "url": "/chat/completions", "body": {"model": "REPLACE-WITH-MODEL-DEPLOYMENT-NAME", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "When was the first XBOX released?"}]}}
{"custom_id": "task-2", "method": "POST", "url": "/chat/completions", "body": {"model": "REPLACE-WITH-MODEL-DEPLOYMENT-NAME", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "What is Altair Basic?"}]}}
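The sample file above can also be generated programmatically. A minimal sketch; the questions and the mydata.jsonl filename mirror the example above, and the deployment-name placeholder is kept as-is for you to replace:

```python
import json

questions = [
    "When was Microsoft founded?",
    "When was the first XBOX released?",
    "What is Altair Basic?",
]

# Write one batch task per line in the OpenAI batch input format
with open("mydata.jsonl", "w", encoding="utf-8") as f:
    for i, question in enumerate(questions):
        task = {
            "custom_id": f"task-{i}",
            "method": "POST",
            "url": "/chat/completions",
            "body": {
                "model": "REPLACE-WITH-MODEL-DEPLOYMENT-NAME",
                "messages": [
                    {"role": "system", "content": "You are an AI assistant that helps people find information."},
                    {"role": "user", "content": question},
                ],
            },
        }
        f.write(json.dumps(task) + "\n")
```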

2. Create batch

curl http://0.0.0.0:4000/azure/v1/batches \
-H "Authorization: Bearer $LITELLM_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"input_file_id": "file-abc123",
"endpoint": "/v1/chat/completions",
"completion_window": "24h"
}'

3. Retrieve batch

curl http://0.0.0.0:4000/azure/v1/batches/batch_abc123 \
-H "Authorization: Bearer $LITELLM_API_KEY" \
-H "Content-Type: application/json"

4. Cancel batch

curl http://0.0.0.0:4000/azure/v1/batches/batch_abc123/cancel \
-H "Authorization: Bearer $LITELLM_API_KEY" \
-H "Content-Type: application/json" \
-X POST

5. List batches

curl http://0.0.0.0:4000/v1/batches?limit=2 \
-H "Authorization: Bearer $LITELLM_API_KEY" \
-H "Content-Type: application/json"

👉 Health check Azure Batches models

[Beta] Loadbalance Multiple Azure Deployments

In your config.yaml, set enable_loadbalancing_on_batch_endpoints: true

model_list:
  - model_name: "batch-gpt-4o-mini"
    litellm_params:
      model: "azure/gpt-4o-mini"
      api_key: os.environ/AZURE_API_KEY
      api_base: os.environ/AZURE_API_BASE
    model_info:
      mode: batch

litellm_settings:
  enable_loadbalancing_on_batch_endpoints: true # 👈 KEY CHANGE

Note: This works for {PROXY_BASE_URL}/v1/files and {PROXY_BASE_URL}/v1/batches. Note: The response is in the OpenAI format.

  1. Upload a File

Just set model: batch-gpt-4o-mini in your .jsonl file

curl http://localhost:4000/v1/files \
-H "Authorization: Bearer sk-1234" \
-F purpose="batch" \
-F file="@mydata.jsonl"

Example file

Note: model should be your Azure deployment name.

{"custom_id": "task-0", "method": "POST", "url": "/chat/completions", "body": {"model": "batch-gpt-4o-mini", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "When was Microsoft founded?"}]}}
{"custom_id": "task-1", "method": "POST", "url": "/chat/completions", "body": {"model": "batch-gpt-4o-mini", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "When was the first XBOX released?"}]}}
{"custom_id": "task-2", "method": "POST", "url": "/chat/completions", "body": {"model": "batch-gpt-4o-mini", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "What is Altair Basic?"}]}}

Expected Response (OpenAI-compatible)

{"id":"file-f0be81f654454113a922da60acb0eea6",...}
  2. Create the batch
curl http://0.0.0.0:4000/v1/batches \
-H "Authorization: Bearer $LITELLM_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"input_file_id": "file-f0be81f654454113a922da60acb0eea6",
"endpoint": "/v1/chat/completions",
"completion_window": "24h",
"model": "batch-gpt-4o-mini"
}'

Expected Response:

{"id":"batch_94e43f0a-d805-477d-adf9-bbb9c50910ed",...}
  3. Retrieve the batch
curl http://0.0.0.0:4000/v1/batches/batch_94e43f0a-d805-477d-adf9-bbb9c50910ed \
-H "Authorization: Bearer $LITELLM_API_KEY" \
-H "Content-Type: application/json"

Expected Response:

{"id":"batch_94e43f0a-d805-477d-adf9-bbb9c50910ed",...}
  4. List batches
curl http://0.0.0.0:4000/v1/batches?limit=2 \
-H "Authorization: Bearer $LITELLM_API_KEY" \
-H "Content-Type: application/json"

Expected Response:

{"data":[{"id":"batch_R3V...}