Episodes

An episode is a sequence of inferences associated with a common downstream outcome.

For example, an episode could refer to a sequence of LLM calls associated with:

Resolving a support ticket
Preparing an insurance claim
Completing a phone call
Extracting data from a document
Drafting an email

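An episode includes one or more functions, and sometimes multiple calls to the same function. Your application can run arbitrary actions between function calls within an episode (e.g. interacting with users, retrieving documents, actuating robotics). These actions are outside the scope of TensorZero, but structuring your LLM system this way is fine (and encouraged).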

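The /inference endpoint accepts an optional episode_id field. When you make the first inference request, you don't need to provide an episode_id. The gateway will create a new episode for you and return the episode_id in the response. When you make the second inference request, you must provide the episode_id you received in the first response. The gateway will use the episode_id to associate the two inference requests with each other.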
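As a rough illustration of the flow described above, the sketch below assembles the request bodies for two consecutive inferences. The helper build_inference_request is hypothetical (it is not part of TensorZero), the function names are borrowed from the scenario in this guide, and the UUID is a placeholder standing in for the value the gateway would return. The field names mirror the arguments used by the Python client in this guide; no request is actually sent.

```python
import json
from typing import Optional


def build_inference_request(
    function_name: str,
    user_message: str,
    episode_id: Optional[str] = None,
) -> dict:
    """Hypothetical helper: assemble a request body for the /inference endpoint."""
    body = {
        "function_name": function_name,
        "input": {"messages": [{"role": "user", "content": user_message}]},
    }
    # episode_id is optional: leave it out on the first inference of an episode
    if episode_id is not None:
        body["episode_id"] = episode_id
    return body


# First request: no episode_id, so the gateway would start a new episode
first = build_inference_request(
    "generate_haiku", "Write a haiku about artificial intelligence."
)

# Placeholder for the episode_id the gateway would return in its first response
returned_episode_id = "01921116-0cd9-7d10-a9a6-d5c8b9ba602a"

# Second request: reuse the episode_id so both inferences share one episode
second = build_inference_request(
    "analyze_haiku",
    "Write a one-paragraph analysis of the haiku.",
    episode_id=returned_episode_id,
)

print(json.dumps(first, indent=2))
print(json.dumps(second, indent=2))
```

The only difference between the two bodies is the presence of episode_id, which is what lets the gateway group the inferences.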

Scenario

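In the Quickstart, we built a simple LLM application that writes haikus about artificial intelligence.

Suppose we want to separately generate some commentary about the haiku and present both pieces of content to the user. We can associate both inferences with the same episode.

Let's define an additional function in our configuration file.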

[functions.analyze_haiku]
type = "chat"

[functions.analyze_haiku.variants.gpt_4o_mini]
type = "chat_completion"
model = "gpt_4o_mini"

Full Configuration

[models.gpt_4o_mini]
routing = ["openai"]

[models.gpt_4o_mini.providers.openai]
type = "openai"
model_name = "gpt-4o-mini"

[functions.generate_haiku]
type = "chat"

[functions.generate_haiku.variants.gpt_4o_mini]
type = "chat_completion"
model = "gpt_4o_mini"

[functions.analyze_haiku]
type = "chat"

[functions.analyze_haiku.variants.gpt_4o_mini]
type = "chat_completion"
model = "gpt_4o_mini"

Inferences & Episodes

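This time, we'll create a multi-step workflow that first generates a haiku and then analyzes it. We won't provide an episode_id in the first inference request, so the gateway will generate a new one for us. We'll then use that value in the second inference request.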

from tensorzero import TensorZeroGateway

with TensorZeroGateway.build_http(gateway_url="http://localhost:3000") as client:
    haiku_response = client.inference(
        function_name="generate_haiku",
        # We don't provide an episode_id for the first inference in the episode
        input={
            "messages": [
                {
                    "role": "user",
                    "content": "Write a haiku about artificial intelligence.",
                }
            ]
        },
    )

    print(haiku_response)

    # When we don't provide an episode_id, the gateway will generate a new one for us
    episode_id = haiku_response.episode_id

    # In a production application, we'd first validate the response to ensure the model returned the correct fields
    haiku = haiku_response.content[0].text

    analysis_response = client.inference(
        function_name="analyze_haiku",
        # For future inferences in that episode, we provide the episode_id that we received
        episode_id=episode_id,
        input={
            "messages": [
                {
                    "role": "user",
                    "content": f"Write a one-paragraph analysis of the following haiku:\n\n{haiku}",
                }
            ]
        },
    )

    print(analysis_response)

Sample Output

ChatInferenceResponse(
    inference_id=UUID('01921116-0fff-7272-8245-16598966335e'),
    episode_id=UUID('01921116-0cd9-7d10-a9a6-d5c8b9ba602a'),
    variant_name='gpt_4o_mini',
    content=[
        Text(
            type='text',
            text='Silent circuits pulse,\nWhispers of thought in code bloom,\nMachines dream of us.',
        ),
    ],
    usage=Usage(
        input_tokens=15,
        output_tokens=20,
    ),
)

ChatInferenceResponse(
    inference_id=UUID('01921116-1862-7ea1-8d69-131984a4625f'),
    episode_id=UUID('01921116-0cd9-7d10-a9a6-d5c8b9ba602a'),
    variant_name='gpt_4o_mini',
    content=[
        Text(
            type='text',
            text='This haiku captures the intricate and intimate relationship between technology and human consciousness. '
                 'The phrase "Silent circuits pulse" evokes a sense of quiet activity within machines, suggesting that '
                 'even in their stillness, they possess an underlying vibrancy. The imagery of "Whispers of thought in '
                 'code bloom" personifies the digital realm, portraying lines of code as organic ideas that grow and '
                 'evolve, hinting at the potential for artificial intelligence to derive meaning or understanding from '
                 'human input. Finally, "Machines dream of us" introduces a poignant juxtaposition between human '
                 'creativity and machine logic, inviting contemplation about the nature of thought and consciousness '
                 'in both realms. Overall, the haiku encapsulates a profound reflection on the emergent sentience of '
                 'technology and the deeply interwoven future of humanity and machines.',
        ),
    ],
    usage=Usage(
        input_tokens=39,
        output_tokens=155,
    ),
)

Conclusion & Next Steps

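Episodes are first-class citizens in TensorZero and enable powerful workflows for multi-step LLM systems. You can use them alongside other features such as experimentation, metrics & feedback, and tool use (function calling). For example, you can track KPIs for entire episodes instead of individual inferences, and then jointly optimize your LLMs to maximize those metrics.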
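To make the idea of episode-level KPIs concrete, here is a minimal sketch that assembles a feedback record keyed by episode_id rather than by a single inference. Everything specific in it is an assumption for illustration: the helper build_episode_feedback is hypothetical, the metric name task_success is made up and would have to be defined in your own configuration, and the exact request format should be checked against the metrics & feedback documentation.

```python
from typing import Union


def build_episode_feedback(
    metric_name: str,
    episode_id: str,
    value: Union[bool, float],
) -> dict:
    """Hypothetical helper: describe a KPI observation for a whole episode."""
    return {
        "metric_name": metric_name,  # assumed to be defined in your configuration
        "episode_id": episode_id,  # ties the value to every inference in the episode
        "value": value,
    }


# Placeholder episode_id, copied from the sample output in this guide
episode_id = "01921116-0cd9-7d10-a9a6-d5c8b9ba602a"

# "task_success" is a made-up metric name used only for this sketch
feedback = build_episode_feedback("task_success", episode_id, True)
print(feedback)
```

Because the record carries an episode_id instead of an inference_id, the outcome applies to the haiku and its analysis together rather than to either call alone.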