➡️ Create Pass-Through Endpoints
Add pass-through routes to the LiteLLM Proxy.
Example: Add a route /v1/rerank that forwards requests to https://api.cohere.com/v1/rerank through the LiteLLM Proxy.
💡 This allows making the following request to the LiteLLM Proxy:
curl --request POST \
--url http://localhost:4000/v1/rerank \
--header 'accept: application/json' \
--header 'content-type: application/json' \
--data '{
"model": "rerank-english-v3.0",
"query": "What is the capital of the United States?",
"top_n": 3,
"documents": ["Carson City is the capital city of the American state of Nevada."]
}'
Tutorial - Pass through Cohere Re-Rank endpoint
Step 1: Define the pass-through routes in the litellm config.yaml
general_settings:
  master_key: sk-1234
  pass_through_endpoints:
    - path: "/v1/rerank"                                  # route you want to add to LiteLLM Proxy Server
      target: "https://api.cohere.com/v1/rerank"          # URL this route should forward requests to
      headers:                                            # headers to forward to this URL
        Authorization: "bearer os.environ/COHERE_API_KEY" # (Optional) Auth Header to forward to your endpoint
        content-type: application/json                    # (Optional) Extra Headers to pass to this endpoint
        accept: application/json
      forward_headers: True                               # (Optional) Forward all headers from the incoming request to the target endpoint
Step 2: Start the proxy server in detailed_debug mode
litellm --config config.yaml --detailed_debug
Step 3: Make a request to the pass-through endpoint
Here, http://localhost:4000 is your litellm proxy endpoint
curl --request POST \
--url http://localhost:4000/v1/rerank \
--header 'accept: application/json' \
--header 'content-type: application/json' \
--data '{
"model": "rerank-english-v3.0",
"query": "What is the capital of the United States?",
"top_n": 3,
"documents": ["Carson City is the capital city of the American state of Nevada.",
"The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.",
"Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district.",
"Capitalization or capitalisation in English grammar is the use of a capital letter at the start of a word. English usage varies from capitalization in other languages.",
"Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states."]
}'
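For readers who prefer Python over curl, the same request can be sketched with the standard library. This is a minimal sketch: the URL and payload mirror the curl command above, and the network call is left commented out since it requires a running proxy. The helper function names are ours, not part of LiteLLM.

```python
import json
import urllib.request


def build_rerank_request(query: str, documents: list[str], top_n: int = 3) -> dict:
    """Build the JSON body expected by Cohere's /v1/rerank endpoint."""
    return {
        "model": "rerank-english-v3.0",
        "query": query,
        "top_n": top_n,
        "documents": documents,
    }


def send_rerank_request(base_url: str, body: dict) -> dict:
    """POST the body to the pass-through route and return the parsed response."""
    req = urllib.request.Request(
        url=f"{base_url}/v1/rerank",
        data=json.dumps(body).encode("utf-8"),
        headers={"accept": "application/json", "content-type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    body = build_rerank_request(
        query="What is the capital of the United States?",
        documents=["Carson City is the capital city of the American state of Nevada."],
    )
    # Requires the proxy from Step 2 to be running:
    # print(send_rerank_request("http://localhost:4000", body))
```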
🎉 Expected Response
This request got forwarded from the LiteLLM Proxy to the defined target URL (with headers).
{
"id": "37103a5b-8cfb-48d3-87c7-da288bedd429",
"results": [
{
"index": 2,
"relevance_score": 0.999071
},
{
"index": 4,
"relevance_score": 0.7867867
},
{
"index": 0,
"relevance_score": 0.32713068
}
],
"meta": {
"api_version": {
"version": "1"
},
"billed_units": {
"search_units": 1
}
}
}
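Note that each entry in results references an input document by its index in the documents array, not by its text. A small sketch of mapping a response like the one above back to the submitted documents (the helper name is ours):

```python
def top_documents(rerank_response: dict, documents: list[str]) -> list[str]:
    """Return the input documents sorted by relevance, most relevant first."""
    results = sorted(
        rerank_response["results"],
        key=lambda r: r["relevance_score"],
        reverse=True,
    )
    # Each result's "index" points back into the submitted documents list.
    return [documents[r["index"]] for r in results]


if __name__ == "__main__":
    response = {
        "results": [
            {"index": 2, "relevance_score": 0.999071},
            {"index": 4, "relevance_score": 0.7867867},
            {"index": 0, "relevance_score": 0.32713068},
        ]
    }
    docs = ["doc0", "doc1", "doc2", "doc3", "doc4"]
    print(top_documents(response, docs))  # ['doc2', 'doc4', 'doc0']
```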
Tutorial - Pass through Langfuse requests
Step 1: Define the pass-through routes in the litellm config.yaml
general_settings:
  master_key: sk-1234
  pass_through_endpoints:
    - path: "/api/public/ingestion"                                # route you want to add to LiteLLM Proxy Server
      target: "https://us.cloud.langfuse.com/api/public/ingestion" # URL this route should forward requests to
      headers:
        LANGFUSE_PUBLIC_KEY: "os.environ/LANGFUSE_DEV_PUBLIC_KEY"  # your langfuse account public key
        LANGFUSE_SECRET_KEY: "os.environ/LANGFUSE_DEV_SK_KEY"      # your langfuse account secret key
Step 2: Start the proxy server in detailed_debug mode
litellm --config config.yaml --detailed_debug
Step 3: Make a request to the pass-through endpoint
Run this code to make a sample trace
from langfuse import Langfuse

langfuse = Langfuse(
    host="http://localhost:4000",  # your litellm proxy endpoint
    public_key="anything",         # no key required since this is a pass-through
    secret_key="anything",         # no key required since this is a pass-through
)

print("sending langfuse trace request")
trace = langfuse.trace(name="test-trace-litellm-proxy-passthrough")
print("flushing langfuse request")
langfuse.flush()
print("flushed langfuse request")
🎉 Expected Response
On success, expect to see this trace generated on your Langfuse dashboard.
You will see the following endpoint called in your litellm proxy server logs:
POST /api/public/ingestion HTTP/1.1" 207 Multi-Status
✨ [Enterprise] - Use LiteLLM Keys/Authentication on Pass-Through Endpoints
Use this if you want the pass-through endpoint to honor LiteLLM keys/authentication.
This also enforces the key's requests-per-minute (RPM) limits on the pass-through endpoints.
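As an illustration of what a per-key requests-per-minute limit means, here is a minimal sliding-window sketch. This is conceptual only, not LiteLLM's actual enforcement code, and the RpmLimiter class is hypothetical:

```python
import time
from collections import defaultdict, deque
from typing import Optional


class RpmLimiter:
    """Illustrative sliding-window RPM check (not LiteLLM's implementation)."""

    def __init__(self, rpm_limit: int):
        self.rpm_limit = rpm_limit
        self.requests: dict = defaultdict(deque)

    def allow(self, api_key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        window = self.requests[api_key]
        # Drop timestamps that fell out of the 60-second window.
        while window and now - window[0] >= 60:
            window.popleft()
        if len(window) >= self.rpm_limit:
            return False
        window.append(now)
        return True


limiter = RpmLimiter(rpm_limit=2)
assert limiter.allow("sk-1234", now=0.0)
assert limiter.allow("sk-1234", now=1.0)
assert not limiter.allow("sk-1234", now=2.0)  # over the limit
assert limiter.allow("sk-1234", now=61.0)     # window has slid forward
```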
Usage - set auth: true in the config
general_settings:
  master_key: sk-1234
  pass_through_endpoints:
    - path: "/v1/rerank"
      target: "https://api.cohere.com/v1/rerank"
      auth: true # 👈 Key change to use LiteLLM Auth / Keys
      headers:
        Authorization: "bearer os.environ/COHERE_API_KEY"
        content-type: application/json
        accept: application/json
Test request with a LiteLLM key
curl --request POST \
--url http://localhost:4000/v1/rerank \
--header 'accept: application/json' \
--header 'Authorization: Bearer sk-1234' \
--header 'content-type: application/json' \
--data '{
"model": "rerank-english-v3.0",
"query": "What is the capital of the United States?",
"top_n": 3,
"documents": ["Carson City is the capital city of the American state of Nevada.",
"The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.",
"Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district.",
"Capitalization or capitalisation in English grammar is the use of a capital letter at the start of a word. English usage varies from capitalization in other languages.",
"Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states."]
}'
Use the Langfuse client SDK with a LiteLLM key
Usage
- Set up the config.yaml to pass through langfuse's /api/public/ingestion endpoint
general_settings:
  master_key: sk-1234
  pass_through_endpoints:
    - path: "/api/public/ingestion"                                # route you want to add to LiteLLM Proxy Server
      target: "https://us.cloud.langfuse.com/api/public/ingestion" # URL this route should forward requests to
      auth: true                                                   # 👈 Key change
      custom_auth_parser: "langfuse"                               # 👈 Key change
      headers:
        LANGFUSE_PUBLIC_KEY: "os.environ/LANGFUSE_DEV_PUBLIC_KEY"  # your Langfuse account public key
        LANGFUSE_SECRET_KEY: "os.environ/LANGFUSE_DEV_SK_KEY"      # your Langfuse account secret key
- Start the proxy
litellm --config /path/to/config.yaml
- Test with the langfuse sdk
from langfuse import Langfuse

langfuse = Langfuse(
    host="http://localhost:4000",  # your litellm proxy endpoint
    public_key="sk-1234",          # your litellm proxy api key
    secret_key="anything",         # no key required since this is a pass-through
)

print("sending langfuse trace request")
trace = langfuse.trace(name="test-trace-litellm-proxy-passthrough")
print("flushing langfuse request")
langfuse.flush()
print("flushed langfuse request")
pass_through_endpoints Spec on config.yaml
All possible values for pass_through_endpoints and what they mean
Example config
general_settings:
  pass_through_endpoints:
    - path: "/v1/rerank"                                  # route you want to add to LiteLLM Proxy Server
      target: "https://api.cohere.com/v1/rerank"          # URL this route should forward requests to
      headers:                                            # headers to forward to this URL
        Authorization: "bearer os.environ/COHERE_API_KEY" # (Optional) Auth Header to forward to your endpoint
        content-type: application/json                    # (Optional) Extra Headers to pass to this endpoint
        accept: application/json
Spec
- pass_through_endpoints (list): A collection of endpoint configurations for request forwarding.
  - path (string): The route to be added to the LiteLLM Proxy Server.
  - target (string): The URL to which requests for this path should be forwarded.
  - headers (object): Key-value pairs of headers to be forwarded with the request. You can set any key-value pair here and it will be forwarded to your target endpoint.
    - Authorization (string): The authentication header for the target API.
    - content-type (string): The format specification for the request body.
    - accept (string): The expected response format from the server.
    - LANGFUSE_PUBLIC_KEY (string): Your Langfuse account public key - only set this when forwarding to Langfuse.
    - LANGFUSE_SECRET_KEY (string): Your Langfuse account secret key - only set this when forwarding to Langfuse.
    - <your-custom-header> (string): Pass any custom header key-value pair.
  - forward_headers (optional, boolean): If true, all headers from the incoming request will be forwarded to the target endpoint. Defaults to False.
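Header values like "bearer os.environ/COHERE_API_KEY" reference environment variables that are resolved at runtime. The sketch below illustrates the convention; it is not LiteLLM's internal resolver, and the function name is ours:

```python
import os
import re


def resolve_secret_refs(value: str) -> str:
    """Replace each 'os.environ/<VAR>' reference with the variable's value."""
    return re.sub(
        r"os\.environ/(\w+)",
        lambda m: os.environ[m.group(1)],
        value,
    )


os.environ["COHERE_API_KEY"] = "test-key"
assert resolve_secret_refs("bearer os.environ/COHERE_API_KEY") == "bearer test-key"
# Plain values pass through unchanged.
assert resolve_secret_refs("application/json") == "application/json"
```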
Custom Chat Endpoints (Anthropic/Bedrock/Vertex)
Allow developers to call the proxy with Anthropic/boto3/etc. client SDKs.
See our Anthropic Adapter code for a reference on how to test this.
1. Write an adapter
Translate the request/response from your custom API schema to the OpenAI schema (used by litellm.completion()) and back.
For provider-specific params 👉 Provider-Specific Params
from typing import Literal, Optional

import dotenv
import httpx
from pydantic import BaseModel

import litellm
from litellm import ChatCompletionRequest, adapter_completion, verbose_logger
from litellm.integrations.custom_logger import CustomLogger
from litellm.types.llms.anthropic import AnthropicMessagesRequest, AnthropicResponse

import json
import os
import traceback
import uuid

# What is this?
## Translates OpenAI calls to Anthropic's `/v1/messages` format

###################
# CUSTOM ADAPTER ##
###################

class AnthropicAdapter(CustomLogger):
    def __init__(self) -> None:
        super().__init__()

    def translate_completion_input_params(
        self, kwargs
    ) -> Optional[ChatCompletionRequest]:
        """
        - translate params, where needed
        - pass the rest, as is
        """
        request_body = AnthropicMessagesRequest(**kwargs)  # type: ignore
        translated_body = litellm.AnthropicConfig().translate_anthropic_to_openai(
            anthropic_message_request=request_body
        )
        return translated_body

    def translate_completion_output_params(
        self, response: litellm.ModelResponse
    ) -> Optional[AnthropicResponse]:
        return litellm.AnthropicConfig().translate_openai_response_to_anthropic(
            response=response
        )

    def translate_completion_output_params_streaming(self) -> Optional[BaseModel]:
        return super().translate_completion_output_params_streaming()

anthropic_adapter = AnthropicAdapter()

###########
# TEST IT #
###########

## register the CUSTOM ADAPTER
litellm.adapters = [{"id": "anthropic", "adapter": anthropic_adapter}]

## set ENV variables
os.environ["OPENAI_API_KEY"] = "your-openai-key"
os.environ["COHERE_API_KEY"] = "your-cohere-key"

messages = [{"content": "Hello, how are you?", "role": "user"}]

# openai call
response = adapter_completion(model="gpt-3.5-turbo", messages=messages, adapter_id="anthropic")

# cohere call
response = adapter_completion(model="command-nightly", messages=messages, adapter_id="anthropic")
print(response)
2. Create a new endpoint
Pass the custom callback class defined in Step 1 to the config.yaml. Set the callback as python_filename.logger_instance_name.
In the config below, we pass:
python_filename: custom_callbacks.py
logger_instance_name: anthropic_adapter. This was defined in Step 1.
target: custom_callbacks.anthropic_adapter
model_list:
  - model_name: my-fake-claude-endpoint
    litellm_params:
      model: gpt-3.5-turbo
      api_key: os.environ/OPENAI_API_KEY

general_settings:
  master_key: sk-1234
  pass_through_endpoints:
    - path: "/v1/messages"                       # route you want to add to LiteLLM Proxy Server
      target: custom_callbacks.anthropic_adapter # Adapter to use for this route
      headers:
        litellm_user_api_key: "x-api-key"        # field in headers that contains the LiteLLM key
3. Test it!
Start the proxy
litellm --config /path/to/config.yaml
Curl
curl --location 'http://0.0.0.0:4000/v1/messages' \
-H 'x-api-key: sk-1234' \
-H 'anthropic-version: 2023-06-01' \
-H 'content-type: application/json' \
-d '{
  "model": "my-fake-claude-endpoint",
  "max_tokens": 1024,
  "messages": [
    {"role": "user", "content": "Hello, world"}
  ]
}'
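The same call can be sketched in Python with the standard library. As before, this mirrors the curl command above, the helper names are ours, and the network call is commented out since it needs the proxy running:

```python
import json
import urllib.request


def build_messages_request(model: str, content: str, max_tokens: int = 1024) -> dict:
    """Build an Anthropic-style /v1/messages request body."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": content}],
    }


def send_messages_request(base_url: str, api_key: str, body: dict) -> dict:
    """POST the body to the proxy's /v1/messages route and return the parsed response."""
    req = urllib.request.Request(
        url=f"{base_url}/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "content-type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    body = build_messages_request("my-fake-claude-endpoint", "Hello, world")
    # Requires the proxy from step "Test it!" to be running:
    # print(send_messages_request("http://0.0.0.0:4000", "sk-1234", body))
```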