Multimodal Inference (Vision Language Models)

The TensorZero Gateway supports multimodal inference (e.g. image inputs) with vision language models (VLMs).

See Integrations for a list of supported models.

Setup

Object Storage

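TensorZero uses object storage to store the images used during multimodal inference. It supports any S3-compatible object storage service, including AWS S3, GCP Cloud Storage, Cloudflare R2, and many more. You can configure the object storage service in the object_storage section of the configuration file.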

In this example, we'll use a local deployment of MinIO, an open-source S3-compatible object storage service.

[object_storage]
type = "s3_compatible"
endpoint = "http://minio:9000" # optional: defaults to AWS S3
# region = "us-east-1" # optional: depends on your S3-compatible storage provider
bucket_name = "tensorzero" # optional: depends on your S3-compatible storage provider
# IMPORTANT: for production environments, remove the following setting and use a secure method of authentication in
# combination with a production-grade object storage service.
allow_http = true

You can also store images in a local directory (type = "filesystem") or disable image storage (type = "disabled"). See the Configuration Reference for details.

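The TensorZero Gateway will attempt to retrieve credentials from the following sources, in order of priority:

  1. The S3_ACCESS_KEY_ID and S3_SECRET_ACCESS_KEY environment variables
  2. The AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables
  3. The default credential configuration of the AWS SDK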
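
The lookup order above can be modeled with a few lines of Python. This is only an illustrative sketch, not the gateway's actual implementation: the function name resolve_s3_credentials is hypothetical, and the rule that both variables of a pair must be set before that pair is used is an assumption.

```python
import os


def resolve_s3_credentials(env=None):
    """Hypothetical helper that mirrors the documented priority order.

    Returns (source, access_key_id, secret_access_key), or None when neither
    pair of environment variables is set, in which case the AWS SDK's default
    credential configuration would apply.
    """
    env = os.environ if env is None else env
    # Check the S3_* pair first, then the AWS_* pair.
    for prefix in ("S3", "AWS"):
        key_id = env.get(f"{prefix}_ACCESS_KEY_ID")
        secret = env.get(f"{prefix}_SECRET_ACCESS_KEY")
        # Assumption: a pair only counts when both variables are present.
        if key_id and secret:
            return (prefix, key_id, secret)
    return None
```

Calling resolve_s3_credentials() with no arguments reads the current process environment, which can help when checking which credentials a local setup is likely to pick up.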

Docker Compose

We'll use Docker Compose to deploy the TensorZero Gateway, ClickHouse, and MinIO.

docker-compose.yml
# This is a simplified example for learning purposes. Do not use this in production.
# For production-ready deployments, see: https://www.tensorzero.com/docs/gateway/deployment
services:
  clickhouse:
    image: clickhouse/clickhouse-server:24.12-alpine
    environment:
      - CLICKHOUSE_USER=chuser
      - CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT=1
      - CLICKHOUSE_PASSWORD=chpassword
    ports:
      - "8123:8123"
    healthcheck:
      test: wget --spider --tries 1 http://chuser:chpassword@clickhouse:8123/ping
      start_period: 30s
      start_interval: 1s
      timeout: 1s

  gateway:
    image: tensorzero/gateway
    volumes:
      # Mount our tensorzero.toml file into the container
      - ./config:/app/config:ro
    command: --config-file /app/config/tensorzero.toml
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY:?Environment variable OPENAI_API_KEY must be set.}
      - S3_ACCESS_KEY_ID=miniouser
      - S3_SECRET_ACCESS_KEY=miniopassword
      - TENSORZERO_CLICKHOUSE_URL=http://chuser:chpassword@clickhouse:8123/tensorzero
    ports:
      - "3000:3000"
    extra_hosts:
      - "host.docker.internal:host-gateway"
    depends_on:
      clickhouse:
        condition: service_healthy
      minio:
        condition: service_healthy

  # For a production deployment, you can use AWS S3, GCP Cloud Storage, Cloudflare R2, etc.
  minio:
    image: bitnami/minio
    ports:
      - "9000:9000" # API port
      - "9001:9001" # Console port
    environment:
      - MINIO_ROOT_USER=miniouser
      - MINIO_ROOT_PASSWORD=miniopassword
      - MINIO_DEFAULT_BUCKETS=tensorzero
    healthcheck:
      test: "mc ls local/tensorzero || exit 1"
      start_period: 30s
      start_interval: 1s
      timeout: 1s

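Inference

With the setup complete, you can now use the TensorZero Gateway to perform multimodal inference.

The TensorZero Gateway accepts both embedded images (encoded as base64 strings) and remote images (specified by a URL).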

from tensorzero import TensorZeroGateway

with TensorZeroGateway.build_http(
    gateway_url="http://localhost:3000",
) as client:
    response = client.inference(
        model_name="openai::gpt-4o-mini",
        input={
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "text",
                            "text": "Do the images share any common features?",
                        },
                        # Remote image of Ferris the crab
                        {
                            "type": "image",
                            "url": "https://raw.githubusercontent.com/tensorzero/tensorzero/ff3e17bbd3e32f483b027cf81b54404788c90dc1/tensorzero-internal/tests/e2e/providers/ferris.png",
                        },
                        # One-pixel orange image encoded as a base64 string
                        {
                            "type": "image",
                            "mime_type": "image/png",
                            "data": "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAAAXNSR0IArs4c6QAAAA1JREFUGFdj+O/P8B8ABe0CTsv8mHgAAAAASUVORK5CYII=",
                        },
                    ],
                }
            ],
        },
    )

    print(response)
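
To send an image from disk as an embedded image, the file has to be base64-encoded first. The helper below is a minimal sketch using only the Python standard library: image_content_block is a hypothetical name (not part of the TensorZero client), and it guesses the MIME type from the file extension, which is an assumption that may not hold for every file.

```python
import base64
import mimetypes
from pathlib import Path


def image_content_block(path):
    """Hypothetical helper: build an embedded-image content block from a file."""
    # Guess the MIME type from the file extension (e.g. ".png" -> "image/png").
    mime_type, _ = mimetypes.guess_type(str(path))
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image MIME type for {path!r}")
    # Read the raw bytes and encode them as an ASCII base64 string.
    data = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return {"type": "image", "mime_type": mime_type, "data": data}
```

The returned dictionary has the same keys as the base64 image block in the example above, so it can be placed in the "content" list next to the text block.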