💸 Spend Tracking
Track spend for keys, users, and teams across 100+ LLMs.
How to Track Spend with LiteLLM
Step 1
Step 2 Send a /chat/completions request
- OpenAI Python v1.0.0+
- Curl Request
- Langchain
import openai

# Point the OpenAI client at the LiteLLM proxy
client = openai.OpenAI(
    api_key="sk-1234",
    base_url="http://0.0.0.0:4000"
)

response = client.chat.completions.create(
    model="llama3",
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ],
    user="palantir",  # recorded as the end user (customer) in the spend logs
    extra_body={
        "metadata": {
            # recorded as request_tags in the spend logs
            "tags": ["jobID:214590dsff09fds", "taskName:run_page_classification"]
        }
    }
)

print(response)
Pass metadata as part of the request body
curl --location 'http://0.0.0.0:4000/chat/completions' \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer sk-1234' \
    --data '{
    "model": "llama3",
    "messages": [
        {
            "role": "user",
            "content": "what llm are you"
        }
    ],
    "user": "palantir",
    "metadata": {
        "tags": ["jobID:214590dsff09fds", "taskName:run_page_classification"]
    }
}'
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage
import os

os.environ["OPENAI_API_KEY"] = "sk-1234"

# Point the Langchain chat model at the LiteLLM proxy
chat = ChatOpenAI(
    openai_api_base="http://0.0.0.0:4000",
    model="llama3",
    user="palantir",
    extra_body={
        "metadata": {
            "tags": ["jobID:214590dsff09fds", "taskName:run_page_classification"]
        }
    }
)

messages = [
    SystemMessage(
        content="You are a helpful assistant that I am using to make a test request to."
    ),
    HumanMessage(
        content="Test from litellm. Tell me why it's amazing in 1 sentence."
    ),
]
response = chat(messages)

print(response)
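Step 3 - Verify Spend Tracked
That's it. Now verify that your spend was tracked.
- Response Headers
- DB + UI
Expect to see x-litellm-response-cost in the response headers, containing the calculated cost.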
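As a rough illustration of the header check, the sketch below pulls the cost out of a plain mapping of response headers. The helper name `extract_response_cost` is hypothetical, and the code assumes the header value is a decimal number in USD, which the source implies but does not spell out.

```python
from typing import Mapping, Optional

COST_HEADER = "x-litellm-response-cost"


def extract_response_cost(headers: Mapping[str, str]) -> Optional[float]:
    """Return the cost from the response headers, or None if absent or unparseable."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get(COST_HEADER)
    if raw is None:
        return None
    try:
        return float(raw)
    except ValueError:
        return None


# Hand-written headers for illustration only; not captured from a real proxy.
example_headers = {
    "Content-Type": "application/json",
    "X-LiteLLM-Response-Cost": "0.000002",
}
print(extract_response_cost(example_headers))
```

With the OpenAI Python v1 client, the headers should be reachable through `client.chat.completions.with_raw_response.create(...)`, whose result exposes a `headers` attribute that can be passed to this helper.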
The following spend gets tracked in the LiteLLM_SpendLogs table:
{
    "api_key": "fe6b0cab4ff5a5a8df823196cc8a450*****",  # Hash of the API key used
    "user": "default_user",  # Internal user (LiteLLM_UserTable) that owns `api_key=sk-1234`
    "team_id": "e8d1460f-846c-45d7-9b43-55f3cc52ac32",  # Team (LiteLLM_TeamTable) that owns `api_key=sk-1234`
    "request_tags": ["jobID:214590dsff09fds", "taskName:run_page_classification"],  # Tags sent in the request
    "end_user": "palantir",  # Customer - the `user` sent in the request
    "model_group": "llama3",  # "model" passed to LiteLLM
    "api_base": "https://api.groq.com/openai/v1/",  # "api_base" of the model used by LiteLLM
    "spend": 0.000002,  # Spend in USD
    "total_tokens": 100,
    "completion_tokens": 80,
    "prompt_tokens": 20
}
Navigate to the Usage tab in the LiteLLM UI (found at https://your-proxy-endpoint/ui) and verify that your spend is logged under "Usage".
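Once rows shaped like the record above are available, per-tag spend can be derived from them. The following is a minimal sketch, assuming the rows have already been loaded as dictionaries by whatever database client you use; the function name `spend_by_tag` and the sample rows are illustrative, not part of LiteLLM.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def spend_by_tag(rows: Iterable[Mapping]) -> Dict[str, float]:
    """Sum the `spend` field of spend-log rows under each of their request tags."""
    totals = defaultdict(float)
    for row in rows:
        # A row with several tags contributes its full spend to every tag.
        for tag in row.get("request_tags") or []:
            totals[tag] += row.get("spend", 0.0)
    return dict(totals)


# Illustrative rows that mimic the LiteLLM_SpendLogs fields shown above.
sample_rows = [
    {"request_tags": ["jobID:214590dsff09fds", "taskName:run_page_classification"], "spend": 0.000002},
    {"request_tags": ["taskName:run_page_classification"], "spend": 0.000003},
    {"request_tags": [], "spend": 0.000001},
]

for tag, total in sorted(spend_by_tag(sample_rows).items()):
    print(f"{tag}: {total:.6f}")
```

Because a request with two tags is counted under both, the per-tag totals can add up to more than the overall spend.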
Getting Spend Reports - To Charge Other Teams, Customers, Users
Use the /global/spend/report endpoint to get spend reports
- Spend Per Team
- Spend Per Customer
- Spend for a Specific API Key
- Spend for an Internal User (Key Owner)
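The four report views listed above differ only in the query parameters sent to /global/spend/report (group_by=team, group_by=customer, api_key, or internal_user_id). A small sketch of building those URLs follows; the helper name `build_spend_report_url` is hypothetical, and only the parameter names that appear in this document are used.

```python
from urllib.parse import urlencode


def build_spend_report_url(base_url, start_date, end_date, **filters):
    """Build a /global/spend/report URL; filters with a value of None are skipped."""
    params = {"start_date": start_date, "end_date": end_date}
    params.update({key: value for key, value in filters.items() if value is not None})
    return f"{base_url.rstrip('/')}/global/spend/report?{urlencode(params)}"


base = "http://localhost:4000"
print(build_spend_report_url(base, "2024-04-01", "2024-06-30", group_by="team"))
print(build_spend_report_url(base, "2024-04-01", "2024-06-30", group_by="customer"))
print(build_spend_report_url(base, "2024-04-01", "2024-06-30", api_key="sk-1234"))
print(build_spend_report_url(base, "2024-04-01", "2024-12-30", internal_user_id="ishaan"))
```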
Example Request
👉 Key Change: Specify group_by=team
curl -X GET 'http://localhost:4000/global/spend/report?start_date=2024-04-01&end_date=2024-06-30&group_by=team' \
-H 'Authorization: Bearer sk-1234'
Example Response
- Expected Response
- Script to Parse Response (Python)
[
    {
        "group_by_day": "2024-04-30T00:00:00+00:00",
        "teams": [
            {
                "team_name": "Prod Team",
                "total_spend": 0.0015265,
                "metadata": [ # see the spend by unique (key + model)
                    {
                        "model": "gpt-4",
                        "spend": 0.00123,
                        "total_tokens": 28,
                        "api_key": "88dc28.." # hashed API key
                    },
                    {
                        "model": "gpt-4",
                        "spend": 0.00123,
                        "total_tokens": 28,
                        "api_key": "a73dc2.." # hashed API key
                    },
                    {
                        "model": "chatgpt-v-2",
                        "spend": 0.000214,
                        "total_tokens": 122,
                        "api_key": "898c28.." # hashed API key
                    },
                    {
                        "model": "gpt-3.5-turbo",
                        "spend": 0.0000825,
                        "total_tokens": 85,
                        "api_key": "84dc28.." # hashed API key
                    }
                ]
            }
        ]
    }
]
import requests

url = 'http://localhost:4000/global/spend/report'
params = {
    'start_date': '2023-04-01',
    'end_date': '2024-06-30',
    'group_by': 'team'  # the loop below reads the per-team report shape
}
headers = {
    'Authorization': 'Bearer sk-1234'
}

# Make the GET request
response = requests.get(url, headers=headers, params=params)
response.raise_for_status()  # fail early on a non-2xx response
spend_report = response.json()

# Print one record per (day, team) pair
for row in spend_report:
    date = row["group_by_day"]
    teams = row["teams"]
    for team in teams:
        team_name = team["team_name"]
        total_spend = team["total_spend"]
        metadata = team["metadata"]

        print(f"Date: {date}")
        print(f"Team: {team_name}")
        print(f"Total Spend: {total_spend}")
        print("Metadata: ", metadata)
        print()
Output from script
# Date: 2024-05-11T00:00:00+00:00
# Team: local_test_team
# Total Spend: 0.003675099999999999
# Metadata:  [{'model': 'gpt-3.5-turbo', 'spend': 0.003675099999999999, 'api_key': 'b94d5e0bc3a71a573917fe1335dc0c14728c7016337451af9714924ff3a729db', 'total_tokens': 3105}]
# Date: 2024-05-13T00:00:00+00:00
# Team: Unassigned Team
# Total Spend: 3.4e-05
# Metadata:  [{'model': 'gpt-3.5-turbo', 'spend': 3.4e-05, 'api_key': '9569d13c9777dba68096dea49b0b03e0aaf4d2b65d4030eda9e8a2733c3cd6e0', 'total_tokens': 50}]
# Date: 2024-05-13T00:00:00+00:00
# Team: central
# Total Spend: 0.000684
# Metadata:  [{'model': 'gpt-3.5-turbo', 'spend': 0.000684, 'api_key': '0323facdf3af551594017b9ef162434a9b9a8ca1bbd9ccbd9d6ce173b1015605', 'total_tokens': 498}]
# Date: 2024-05-13T00:00:00+00:00
# Team: local_test_team
# Total Spend: 0.0005715000000000001
# Metadata:  [{'model': 'gpt-3.5-turbo', 'spend': 0.0005715000000000001, 'api_key': 'b94d5e0bc3a71a573917fe1335dc0c14728c7016337451af9714924ff3a729db', 'total_tokens': 423}]
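For chargeback it is often more useful to have one total per team across the whole date range than one record per day. The sketch below sums a team report that has already been parsed from JSON; the function name `total_spend_per_team` is hypothetical and the sample data only mimics the response shape shown in this section.

```python
from collections import defaultdict


def total_spend_per_team(spend_report):
    """Sum `total_spend` per team name across every day in a team spend report."""
    totals = defaultdict(float)
    for row in spend_report:
        for team in row.get("teams", []):
            totals[team["team_name"]] += team["total_spend"]
    return dict(totals)


# Illustrative data in the group_by=team response shape.
sample_report = [
    {
        "group_by_day": "2024-05-11T00:00:00+00:00",
        "teams": [
            {"team_name": "local_test_team", "total_spend": 0.003675099999999999, "metadata": []},
        ],
    },
    {
        "group_by_day": "2024-05-13T00:00:00+00:00",
        "teams": [
            {"team_name": "central", "total_spend": 0.000684, "metadata": []},
            {"team_name": "local_test_team", "total_spend": 0.0005715000000000001, "metadata": []},
        ],
    },
]

for name, total in sorted(total_spend_per_team(sample_report).items()):
    print(f"{name}: {total:.7f}")
```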
Example Request
👉 Key Change: Specify group_by=customer
curl -X GET 'http://localhost:4000/global/spend/report?start_date=2024-04-01&end_date=2024-06-30&group_by=customer' \
    -H 'Authorization: Bearer sk-1234'
Example Response
[
    {
        "group_by_day": "2024-04-30T00:00:00+00:00",
        "customers": [
            {
                "customer": "palantir",
                "total_spend": 0.0015265,
                "metadata": [ # see the spend by unique (key + model)
                    {
                        "model": "gpt-4",
                        "spend": 0.00123,
                        "total_tokens": 28,
                        "api_key": "88dc28.." # hashed API key
                    },
                    {
                        "model": "gpt-4",
                        "spend": 0.00123,
                        "total_tokens": 28,
                        "api_key": "a73dc2.." # hashed API key
                    },
                    {
                        "model": "chatgpt-v-2",
                        "spend": 0.000214,
                        "total_tokens": 122,
                        "api_key": "898c28.." # hashed API key
                    },
                    {
                        "model": "gpt-3.5-turbo",
                        "spend": 0.0000825,
                        "total_tokens": 85,
                        "api_key": "84dc28.." # hashed API key
                    }
                ]
            }
        ]
    }
]
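To hand a customer report to a billing or spreadsheet tool, the nested response can be flattened into one CSV line per (day, customer, key, model). This is a sketch under the assumption that the response has the shape shown above; the function name `customer_report_to_csv` and the column order are illustrative choices.

```python
import csv
import io


def customer_report_to_csv(spend_report):
    """Flatten a group_by=customer spend report into CSV text, one line per metadata entry."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["date", "customer", "model", "api_key", "total_tokens", "spend"])
    for row in spend_report:
        day = row["group_by_day"][:10]  # keep only the YYYY-MM-DD part
        for customer in row.get("customers", []):
            for item in customer.get("metadata", []):
                writer.writerow([
                    day,
                    customer["customer"],
                    item["model"],
                    item["api_key"],
                    item["total_tokens"],
                    item["spend"],
                ])
    return buffer.getvalue()


# Two entries taken from the example response above.
sample_report = [
    {
        "group_by_day": "2024-04-30T00:00:00+00:00",
        "customers": [
            {
                "customer": "palantir",
                "total_spend": 0.0015265,
                "metadata": [
                    {"model": "gpt-4", "spend": 0.00123, "total_tokens": 28, "api_key": "88dc28.."},
                    {"model": "chatgpt-v-2", "spend": 0.000214, "total_tokens": 122, "api_key": "898c28.."},
                ],
            }
        ],
    }
]

print(customer_report_to_csv(sample_report))
```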
👉 Key Change: Specify api_key=sk-1234
curl -X GET 'http://localhost:4000/global/spend/report?start_date=2024-04-01&end_date=2024-06-30&api_key=sk-1234' \
    -H 'Authorization: Bearer sk-1234'
Example Response
[
    {
        "api_key": "88dc28d0f030c55ed4ab77ed8faf098196cb1c05df778539800c9f1243fe6b4b",
        "total_cost": 0.3201286305151999,
        "total_input_tokens": 36.0,
        "total_output_tokens": 1593.0,
        "model_details": [
            {
                "model": "dall-e-3",
                "total_cost": 0.31999939051519993,
                "total_input_tokens": 0,
                "total_output_tokens": 0
            },
            {
                "model": "llama3-8b-8192",
                "total_cost": 0.00012924,
                "total_input_tokens": 36,
                "total_output_tokens": 1593
            }
        ]
    }
]
Internal User (Key Owner): this is the value of user_id passed when calling /key/generate
👉 Key Change: Specify internal_user_id=ishaan
curl -X GET 'http://localhost:4000/global/spend/report?start_date=2024-04-01&end_date=2024-12-30&internal_user_id=ishaan' \
    -H 'Authorization: Bearer sk-1234'
Example Response
[
    {
        "api_key": "88dc28d0f030c55ed4ab77ed8faf098196cb1c05df778539800c9f1243fe6b4b",
        "total_cost": 0.00013132,
        "total_input_tokens": 105.0,
        "total_output_tokens": 872.0,
        "model_details": [
            {
                "model": "gpt-3.5-turbo-instruct",
                "total_cost": 5.85e-05,
                "total_input_tokens": 15,
                "total_output_tokens": 18
            },
            {
                "model": "llama3-8b-8192",
                "total_cost": 7.282000000000001e-05,
                "total_input_tokens": 90,
                "total_output_tokens": 854
            }
        ]
    },
    {
        "api_key": "151e85e46ab8c9c7fad090793e3fe87940213f6ae665b543ca633b0b85ba6dc6",
        "total_cost": 5.2699999999999993e-05,
        "total_input_tokens": 26.0,
        "total_output_tokens": 27.0,
        "model_details": [
            {
                "model": "gpt-3.5-turbo",
                "total_cost": 5.2499999999999995e-05,
                "total_input_tokens": 24,
                "total_output_tokens": 27
            },
            {
                "model": "text-embedding-ada-002",
                "total_cost": 2e-07,
                "total_input_tokens": 2,
                "total_output_tokens": 0
            }
        ]
    },
    {
        "api_key": "60cb83a2dcbf13531bd27a25f83546ecdb25a1a6deebe62d007999dc00e1e32a",
        "total_cost": 9.42e-06,
        "total_input_tokens": 30.0,
        "total_output_tokens": 99.0,
        "model_details": [
            {
                "model": "llama3-8b-8192",
                "total_cost": 9.42e-06,
                "total_input_tokens": 30,
                "total_output_tokens": 99
            }
        ]
    }
]
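The api_key and internal_user_id reports share one shape: a list of per-key entries, each with its own model_details breakdown. The sketch below totals such a report across keys and checks that each entry's per-model costs add up to its stated total. Both function names are hypothetical, and the tolerance is an arbitrary choice to absorb floating-point rounding.

```python
import math


def model_costs_reconcile(entry, abs_tol=1e-9):
    """Check that the per-model costs of one key entry sum to its `total_cost`."""
    detail_sum = sum(detail["total_cost"] for detail in entry["model_details"])
    return math.isclose(detail_sum, entry["total_cost"], abs_tol=abs_tol)


def summarize_key_report(report):
    """Add up cost and token counts across all key entries in the report."""
    return {
        "total_cost": sum(entry["total_cost"] for entry in report),
        "total_input_tokens": int(sum(entry["total_input_tokens"] for entry in report)),
        "total_output_tokens": int(sum(entry["total_output_tokens"] for entry in report)),
    }


# First two entries of the internal_user_id example response, with shortened keys.
sample_report = [
    {
        "api_key": "88dc28..",
        "total_cost": 0.00013132,
        "total_input_tokens": 105.0,
        "total_output_tokens": 872.0,
        "model_details": [
            {"model": "gpt-3.5-turbo-instruct", "total_cost": 5.85e-05, "total_input_tokens": 15, "total_output_tokens": 18},
            {"model": "llama3-8b-8192", "total_cost": 7.282000000000001e-05, "total_input_tokens": 90, "total_output_tokens": 854},
        ],
    },
    {
        "api_key": "151e85..",
        "total_cost": 5.2699999999999993e-05,
        "total_input_tokens": 26.0,
        "total_output_tokens": 27.0,
        "model_details": [
            {"model": "gpt-3.5-turbo", "total_cost": 5.2499999999999995e-05, "total_input_tokens": 24, "total_output_tokens": 27},
            {"model": "text-embedding-ada-002", "total_cost": 2e-07, "total_input_tokens": 2, "total_output_tokens": 0},
        ],
    },
]

print(summarize_key_report(sample_report))
print(all(model_costs_reconcile(entry) for entry in sample_report))
```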
Allow Non-Proxy Admins to Access /spend Endpoints
Use this when you want non-proxy admins to access the /spend endpoints
Schedule a meeting with us to get your Enterprise License
Create Key
Create a key with permissions={"get_spend_routes": true}
curl --location 'http://0.0.0.0:4000/key/generate' \
    --header 'Authorization: Bearer sk-1234' \
    --header 'Content-Type: application/json' \
    --data '{
    "permissions": {"get_spend_routes": true}
}'
Use the Generated Key on /spend Endpoints
Access the spend routes with the newly generated key
curl -X GET 'http://localhost:4000/global/spend/report?start_date=2024-04-01&end_date=2024-06-30' \
    -H 'Authorization: Bearer sk-H16BKvrSNConSsBYLGc_7A'
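For teams that script key creation in Python rather than curl, the key-generation request above can be assembled with the standard library. This sketch only builds the request object and does not send it; the helper name `build_key_generate_request` is hypothetical, and the URL and key are the placeholder values used in this document.

```python
import json
import urllib.request


def build_key_generate_request(proxy_base_url, admin_key):
    """Build (but do not send) a POST /key/generate request for a spend-read key."""
    body = {"permissions": {"get_spend_routes": True}}
    return urllib.request.Request(
        url=f"{proxy_base_url.rstrip('/')}/key/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {admin_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_key_generate_request("http://0.0.0.0:4000", "sk-1234")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` against a running proxy would send it. The shape of the response is not shown in this document, so the sketch does not assume any field names for the returned key.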
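Reset Team, API Key Spend - MASTER KEY ONLY
Use /global/spend/reset if you want to reset the spend for ALL API keys and teams:
- The spend for ALL teams and keys in LiteLLM_TeamTable and LiteLLM_VerificationToken will be set to spend=0.
- LiteLLM will retain all logs in LiteLLM_SpendLogs for auditing purposes.
Request
Only the LITELLM_MASTER_KEY you set can access this route
curl -X POST \
    'http://localhost:4000/global/spend/reset' \
    -H 'Authorization: Bearer sk-1234' \
    -H 'Content-Type: application/json'
Expected Response
{"message":"Spend for all API Keys and Teams reset successfully","status":"success"}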
Spend Tracking for Azure OpenAI Models
Set a base model to track the cost of Azure image generation calls
Image Generation
model_list:
  - model_name: dall-e-3
    litellm_params:
      model: azure/dall-e-3-test
      api_version: 2023-06-01-preview
      api_base: https://openai-gpt-4-test-v-1.openai.azure.com/
      api_key: os.environ/AZURE_API_KEY
      base_model: dall-e-3 # 👈 set dall-e-3 as the base model
    model_info:
      mode: image_generation
Chat Completions / Embeddings
Problem: Azure returns gpt-4 in the response when azure/gpt-4-1106-preview is used. This leads to inaccurate cost tracking.
Solution ✅ : Set base_model in your config so that LiteLLM uses the correct model to calculate the Azure cost
Get the base model name from here
Example config with base_model
model_list:
  - model_name: azure-gpt-3.5
    litellm_params:
      model: azure/chatgpt-v-2
      api_base: os.environ/AZURE_API_BASE
      api_key: os.environ/AZURE_API_KEY
      api_version: "2023-07-01-preview"
    model_info:
      base_model: azure/gpt-4-1106-preview
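To see why the model name used for pricing matters, consider the toy cost lookup below. It is not LiteLLM's implementation: the price table, its values, and the function `estimate_cost` are all hypothetical and exist only to show how pricing by the name returned in the response (gpt-4) can differ from pricing by the configured base_model.

```python
import math

# Hypothetical per-token prices in USD, for illustration only (not real pricing data).
PRICE_PER_TOKEN = {
    "gpt-4": {"input": 3e-05, "output": 6e-05},
    "azure/gpt-4-1106-preview": {"input": 1e-05, "output": 3e-05},
}


def estimate_cost(response_model, prompt_tokens, completion_tokens, base_model=None):
    """Price a call by `base_model` when it is set, else by the model named in the response."""
    pricing_key = base_model or response_model
    price = PRICE_PER_TOKEN[pricing_key]
    return prompt_tokens * price["input"] + completion_tokens * price["output"]


# Azure reports "gpt-4" in the response even though a different deployment served the call.
without_base = estimate_cost("gpt-4", prompt_tokens=1000, completion_tokens=500)
with_base = estimate_cost(
    "gpt-4", prompt_tokens=1000, completion_tokens=500,
    base_model="azure/gpt-4-1106-preview",
)
print(f"priced by response model: {without_base:.4f}")
print(f"priced by base_model:     {with_base:.4f}")
print(math.isclose(without_base, with_base))
```

Under these made-up prices the two estimates differ, which is the kind of mismatch that setting base_model is meant to avoid.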
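Custom Input/Output Pricing
👉 Head to Custom Input/Output Pricing to set custom pricing for your models
✨ Custom Spend Log Metadata
Log specific key-value pairs as part of the metadata for a spend log
Logging specific key-value pairs in spend log metadata is an enterprise feature. See here
✨ Custom Tags
Tracking spend with custom tags is an enterprise feature. See here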