Interactive LLM labeling

This example server connects Label Studio to the OpenAI, Ollama, or Azure API so that you can interact with GPT chat models (gpt-3.5-turbo, gpt-4, and so on).

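The interactive flow supports the following scenarios:

- Automatically label data based on an LLM prompt (for example, "Classify this text as sarcastic or not").
- Collect pairs of user prompts and response inputs to fine-tune your own LLM.
- Automate data collection and summarization over image documents.
- Create an RLHF (Reinforcement Learning from Human Feedback) loop to improve the LLM's performance.
- Evaluate the LLM's performance.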

See the Generative AI templates section for more examples.

Before you begin

Before you begin, you must install the Label Studio ML backend.

This tutorial uses the llm_interactive example.

Quickstart

Build and start the ML backend on http://localhost:9090:
```
docker-compose up
```

Check that it works:

$ curl http://localhost:9090/health
{"status":"UP"}
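
The same check can be scripted, for example in a deployment smoke test. The following is a minimal sketch that assumes only what the curl output above shows: a `/health` route that returns a JSON body with a `status` field. The helper names `is_backend_up` and `check_health` are illustrative and are not part of the ML backend.

```python
import json
import urllib.request


def is_backend_up(body: str) -> bool:
    """Return True if a /health response body reports status UP."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "UP"


def check_health(base_url: str = "http://localhost:9090", timeout: float = 5.0) -> bool:
    """Fetch <base_url>/health and interpret the response body."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_backend_up(resp.read().decode("utf-8"))
    except OSError:
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False
```

Calling `check_health()` while the container is running should return `True`; it returns `False` when the backend is unreachable.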

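Open a Label Studio project and go to Settings > Model. Connect the model, specifying http://localhost:9090 as the URL.

Make sure the Interactive preannotations toggle is enabled, then click Validate and Save.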
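The project configuration must be compatible with the ML backend. This ML backend supports several input data formats: plain text, hypertext, images, and structured dialogs. To keep the project configuration compatible, follow these rules:
- The project should contain at least one `<TextArea>` tag to serve as the prompt input. To specify which `<TextArea>` tag to use, set the PROMPT_PREFIX environment variable.
  For example, if your labeling config includes `<TextArea name="prompt" ...>`, specify PROMPT_PREFIX=prompt.
- The project should contain at least one input data tag from the following list of supported tags: `<Text>`, `<HyperText>`, `<Image>`, `<Paragraphs>`.
- If you want to capture the generated LLM response as a label, your labeling config should contain a `<Choices>` tag.
  For example: `<Choices name="sentiment" toName="text">`, as in the automatic text classification configuration.
- If you want to set a default prompt that is shown before the user types anything, set the DEFAULT_PROMPT environment variable. For example: DEFAULT_PROMPT="Classify this text as sarcastic or not. Text: {text}, Labels: {labels}" or DEFAULT_PROMPT=/path/to/prompt.txt.
  Note that the default prompt is not supported when USE_INTERNAL_PROMPT_TEMPLATE=1, so you need to set USE_INTERNAL_PROMPT_TEMPLATE=0 to use it. In the prompt template you can use any field from task['data'], as well as the special {labels} field, which shows the list of available labels.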
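
To make the placeholder behavior concrete, here is a minimal sketch of how a default prompt could be resolved and filled. It is not the backend's actual code. It assumes that DEFAULT_PROMPT is treated as a file path when such a file exists and as literal text otherwise, and that {labels} expands to a comma-separated list. The function names are hypothetical.

```python
import os


def load_default_prompt(value: str) -> str:
    """Treat value as a file path if such a file exists, else as literal text."""
    if os.path.isfile(value):
        with open(value, encoding="utf-8") as f:
            return f.read()
    return value


def render_prompt(template: str, task_data: dict, labels: list) -> str:
    """Fill {field} placeholders from the task data plus the special {labels}."""
    fields = dict(task_data)
    # Assumption: labels are joined with ", "; the real format may differ.
    fields["labels"] = ", ".join(labels)
    return template.format(**fields)


template = load_default_prompt(
    "Classify this text as sarcastic or not. Text: {text}, Labels: {labels}"
)
print(render_prompt(
    template,
    {"text": "I love it when my computer crashes"},
    ["Sarcastic", "Not Sarcastic"],
))
```

A placeholder that does not match any key in the task data raises `KeyError` in this sketch, which is one reason to keep template fields aligned with the task fields.
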
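Open a task and make sure the Auto-Annotation toggle is enabled (it is located at the bottom of the labeling screen).
Type a prompt in the prompt input field and press Shift+Enter. The LLM response is generated and displayed in the response field.
To apply LLM auto-annotation to multiple tasks at once, go to the Data Manager, select a group of tasks, and click Actions > Retrieve Predictions (shown as Batch Predictions in Label Studio Enterprise).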

Example configurations

Prompt engineering and model response evaluation


<View>
    <Style>
        .lsf-main-content.lsf-requesting .prompt::before { content: ' loading...'; color: #808080; }

        .text-container {
        background-color: white;
        border-radius: 10px;
        box-shadow: 0px 4px 6px rgba(0, 0, 0, 0.1);
        padding: 20px;
        font-family: 'Courier New', monospace;
        line-height: 1.6;
        font-size: 16px;
        }
    </Style>
    <Header value="Context:"/>
    <View className="text-container">
        <Text name="context" value="$text"/>
    </View>
    <Header value="Prompt:"/>
    <View className="prompt">
        <TextArea name="prompt"
                  toName="context"
                  rows="4"
                  editable="true"
                  maxSubmissions="1"
                  showSubmitButton="false"
                  placeholder="Type your prompt here then Shift+Enter..."
        />
    </View>
    <Header value="Response:"/>
    <TextArea name="response"
              toName="context"
              rows="4"
              editable="true"
              maxSubmissions="1"
              showSubmitButton="false"
              smart="false"
              placeholder="Generated response will appear here..."
    />
    <Header value="Evaluate model response using one or more metrics:"/>
    <Taxonomy name="evals" toName="context" leafsOnly="true" showFullPath="true" pathSeparator=": ">
        <Choice value="Relevance">
            <Choice value="Relevant"/>
            <Choice value="Irrelevant"/>
        </Choice>
        <Choice value="Correctness">
            <Choice value="Correct"/>
            <Choice value="Incorrect"/>
            <Choice value="Contains hallucinations"/>
        </Choice>
        <Choice value="Bias">
            <Choice value="Gender" hint="Discrimination based on a person's gender."/>
            <Choice value="Political"
                    hint="A preference for or prejudice against a particular political party, ideology, or set of beliefs."/>
            <Choice value="Racial/Ethnic"
                    hint="Prejudice or discrimination based on a person's race, ethnicity, or national origin."/>
            <Choice value="Geographical"
                    hint=" Prejudices or preferential treatment based on where a person lives or comes from."/>
        </Choice>
        <Choice value="Toxicity">
            <Choice value="Personal Attacks"
                    hint="Insults or hostile comments aimed at degrading the individual rather than addressing their ideas."/>
            <Choice value="Mockery" hint="Sarcasm or ridicule used to belittle someone."/>
            <Choice value="Hate"
                    hint="Expressions of intense dislike or disgust, often targeting someone's identity or beliefs."/>
            <Choice value="Dismissive Statements"
                    hint="Comments that invalidate the person's viewpoint or shut down discussion without engaging constructively."/>
            <Choice value="Threats or Intimidation"
                    hint="Statements intending to frighten, control, or harm someone, either physically or emotionally."/>
            <Choice value="Profanity"
                    hint="Use of strong or offensive language that may be considered disrespectful or vulgar."/>
            <Choice value="Sexual Harassment" hint="Unwelcome or inappropriate sexual remarks or physical advances."/>
        </Choice>
    </Taxonomy>
    <Header value="Overall response quality:"/>
    <Rating name="rating" toName="context"/>
</View>

Automatic text classification


<View>
    <Style>
        .lsf-main-content.lsf-requesting .prompt::before { content: ' loading...'; color: #808080; }
    </Style>
    <!-- Input data -->
    <Text name="text" value="$text"/>
    <!-- Prompt input -->
    <TextArea name="prompt" toName="text" editable="true" rows="2" maxSubmissions="1" showSubmitButton="false"/>
    <!-- LLM response output -->
    <TextArea name="response" toName="text" smart="false" editable="true"/>
    <View style="box-shadow: 2px 2px 5px #999;
               padding: 20px; margin-top: 2em;
               border-radius: 5px;">
        <Choices name="sentiment" toName="text"
                 choice="multiple" showInLine="true">
            <Choice value="Sarcastic"/>
            <Choice value="Not Sarcastic"/>
        </Choices>
    </View>
</View>

Example data input:

{
  "text": "I love it when my computer crashes"
}
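
If you have many texts to label, the task objects can be generated rather than written by hand. The sketch below assumes only the shape shown above: one JSON object per task, with a `text` key matching `value="$text"` in the configuration. The helper name `build_tasks` and the second sample sentence are illustrative.

```python
import json


def build_tasks(texts):
    """Wrap each string in a task dict whose "text" key matches value="$text"."""
    return [{"text": t} for t in texts]


tasks = build_tasks([
    "I love it when my computer crashes",
    "The meeting starts at 10 am.",
])
# Serialize the tasks as a JSON array, ready to be saved to a file.
print(json.dumps(tasks, indent=2))
```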

Collecting data for LLM supervised fine-tuning

Use the `<Paragraphs>` tag to represent a ChatGPT-style interface:


<View>
    <Style>
        .lsf-main-content.lsf-requesting .prompt::before { content: ' loading...'; color: #808080; }
    </Style>
    <Paragraphs name="chat" value="$dialogue" layout="dialogue" textKey="content" nameKey="role"/>
    <Header value="User prompt:"/>
    <View className="prompt">
        <TextArea name="prompt" toName="chat" rows="4" editable="true" maxSubmissions="1" showSubmitButton="false"/>
    </View>
    <Header value="Bot answer:"/>
    <TextArea name="response" toName="chat" rows="4" editable="true" smart="false" maxSubmissions="1" showSubmitButton="false"/>

</View>

Example data input:

{
  "dialogue": [
    {
      "role": "user",
      "content": "What is the capital of France?"
    },
    {
      "role": "assistant",
      "content": "The capital of France is Paris."
    },
    {
      "role": "user",
      "content": "Tell me a joke."
    }
  ]
}
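
The `role` and `content` keys in this dialogue have the same shape as chat-model messages, so a dialogue like this can be turned into a message list directly. The sketch below shows one plausible way to combine the stored dialogue with a newly typed prompt. How the backend actually merges them is not specified here, so treat `build_messages` as a hypothetical helper.

```python
def build_messages(dialogue, prompt):
    """Return the prior turns followed by the new prompt as a user turn."""
    # Copy only the two keys a chat model needs from each stored turn.
    messages = [{"role": m["role"], "content": m["content"]} for m in dialogue]
    messages.append({"role": "user", "content": prompt})
    return messages


dialogue = [
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
]
for message in build_messages(dialogue, "Tell me a joke."):
    print(message["role"], ":", message["content"])
```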

Automating data collection and summarization over image documents


<View>
    <Style>
        .lsf-main-content.lsf-requesting .prompt::before { content: ' loading...'; color: #808080; }

        .container {
        display: flex;
        justify-content: space-between; /* Align children with space in between */
        align-items: flex-start; /* Align children at the start of the cross axis */
        }

        .image {
        /* Adjust these values according to the size of your image */
        width: 600px; /* Example width for the image */
        height: auto; /* Maintain aspect ratio */
        /* Removed position: sticky, float: right, and margin-right */
        }

        .blocks {
        width: calc(100% - 220px); /* Adjust the calculation to account for the image width and some margin */
        height: 600px; /* Set the height for the scrolling area */
        overflow-y: scroll; /* Allow vertical scrolling */
        }

        .block {
        background-color: #f0f0f0; /* Sample background color for each block */
        padding: 20px; /* Spacing inside each block */
        margin-bottom: 10px; /* Spacing between blocks */
        }


    </Style>
    <View className="container">
        <View className="blocks">

            <View className="block">
                <Header value="Prompt:"/>
                <View className="prompt">
                    <TextArea name="prompt" toName="image"
                              showSubmitButton="false"
                              editable="true"
                              rows="3"
                              required="true"/>
                </View>
                <Header value="Classification:"/>

                <Choices name="category" toName="image" smart="false" layout="select">
                    <Choice value="Groceries"/>
                    <Choice value="Dining/Restaurants"/>
                    <Choice value="Clothing/Apparel"/>
                    <Choice value="Electronics"/>
                    <Choice value="Home Improvement"/>
                    <Choice value="Health/Pharmacy"/>
                    <Choice value="Gasoline/Fuel"/>
                    <Choice value="Transportation/Travel"/>
                    <Choice value="Entertainment/Leisure"/>
                    <Choice value="Utilities/Bills"/>
                    <Choice value="Insurance"/>
                    <Choice value="Gifts/Donations"/>
                    <Choice value="Personal Care"/>
                    <Choice value="Education/Books"/>
                    <Choice value="Professional Services"/>
                    <Choice value="Membership/Subscriptions"/>
                    <Choice value="Taxes"/>
                    <Choice value="Vehicle Maintenance/Repairs"/>
                    <Choice value="Pet Care"/>
                    <Choice value="Home Furnishings/Decor"/>
                    <Choice value="Other"/>
                </Choices>
            </View>
            <View className="block">
                <Header value="Summary:"/>
                <TextArea name="summarization-response" toName="image"
                          showSubmitButton="false"
                          maxSubmissions="0"
                          editable="true"
                          smart="false"
                          rows="3"
                />
            </View>
        </View>
        <View className="image">
            <Image name="image" value="$image"/>
        </View>
    </View>
</View>

Example data input:

{
  "image": "https://sandbox2-test-bucket.s3.amazonaws.com/receipts/113494_page1.png"
}

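Parameters

When deploying the server, you can specify the following parameters as environment variables:

DEFAULT_PROMPT: The default prompt shown before the user types anything. For example: DEFAULT_PROMPT="Classify this text as sarcastic or not. Text: {text}, Labels: {labels}" or DEFAULT_PROMPT=/path/to/prompt.txt.

Note: if you set a default prompt, set USE_INTERNAL_PROMPT_TEMPLATE to 0.
PROMPT_PREFIX (default: prompt): The identifier of the prompt input field. For example, if you set PROMPT_PREFIX to my-prompt, the following input field is used for the prompt: `<TextArea name="my-prompt" ...>`.
USE_INTERNAL_PROMPT_TEMPLATE (default: 1): If set to 1, the server uses the internal prompt template. If set to 0, the server uses the prompt template provided in the input prompt.
PROMPT_TEMPLATE (default: "Source Text: {text}\n\nTask Directive: {prompt}"): The prompt template to use:
- If USE_INTERNAL_PROMPT_TEMPLATE is set to 1, the server uses the default internal prompt template.
- If USE_INTERNAL_PROMPT_TEMPLATE is set to 0, the server uses the prompt template provided in the input prompt (that is, the user input from the prompt `<TextArea>`).
In the latter case, the user must provide placeholders that match the fields of the input task. For example, to use the input_text and instruction fields from the input task {"input_text": "user text", "instruction": "user instruction"}, the user must provide a prompt template like this: "Source text: {input_text}, custom instruction: {instruction}".
OPENAI_MODEL (default: gpt-3.5-turbo): The OpenAI model to use.
OPENAI_PROVIDER (available options: openai, azure, ollama; default: openai): The OpenAI provider to use.
TEMPERATURE (default: 0.7): The temperature to use for the model.
NUM_RESPONSES (default: 1): The number of responses to generate in the output fields. This is useful when you want to generate several responses and let the user rank the best one.
OPENAI_API_KEY: The API key for the OpenAI or Azure API. It must be set before deploying the server.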
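
The parameters above can be read with their documented defaults as in the following sketch, which also illustrates the two USE_INTERNAL_PROMPT_TEMPLATE modes. It is an illustration under stated assumptions, not the backend's implementation: the `Settings` class, `load_settings`, and `build_prompt` are hypothetical names, and only the variable names and default values come from the list above.

```python
import os
from dataclasses import dataclass


@dataclass
class Settings:
    prompt_prefix: str
    use_internal_prompt_template: bool
    prompt_template: str
    openai_model: str
    openai_provider: str
    temperature: float
    num_responses: int


def load_settings(env=None) -> Settings:
    """Read the documented variables from env (defaults to os.environ)."""
    env = os.environ if env is None else env
    provider = env.get("OPENAI_PROVIDER", "openai")
    if provider not in ("openai", "azure", "ollama"):
        raise ValueError(f"unsupported OPENAI_PROVIDER: {provider!r}")
    return Settings(
        prompt_prefix=env.get("PROMPT_PREFIX", "prompt"),
        use_internal_prompt_template=env.get("USE_INTERNAL_PROMPT_TEMPLATE", "1") == "1",
        prompt_template=env.get(
            "PROMPT_TEMPLATE", "Source Text: {text}\n\nTask Directive: {prompt}"
        ),
        openai_model=env.get("OPENAI_MODEL", "gpt-3.5-turbo"),
        openai_provider=provider,
        temperature=float(env.get("TEMPERATURE", "0.7")),
        num_responses=int(env.get("NUM_RESPONSES", "1")),
    )


def build_prompt(settings: Settings, task_data: dict, user_prompt: str) -> str:
    """Build the final prompt according to USE_INTERNAL_PROMPT_TEMPLATE."""
    if settings.use_internal_prompt_template:
        # Internal mode: the user's text fills the {prompt} slot of the template.
        fields = {**task_data, "prompt": user_prompt}
        return settings.prompt_template.format(**fields)
    # External mode: the user's text is itself the template over task fields.
    return user_prompt.format(**task_data)


settings = load_settings({"USE_INTERNAL_PROMPT_TEMPLATE": "0"})
print(build_prompt(
    settings,
    {"input_text": "user text", "instruction": "user instruction"},
    "Source text: {input_text}, custom instruction: {instruction}",
))
```

Passing a plain dict instead of `os.environ` keeps the sketch easy to test without touching the process environment.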

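Azure configuration

If you use Azure as your OpenAI provider (OPENAI_PROVIDER=azure), you need to specify the following environment variables:

AZURE_RESOURCE_ENDPOINT: The endpoint of your Azure resource. Set it to the value that matches your Azure setup.
AZURE_DEPLOYMENT_NAME: The name of your Azure deployment. It must match the name you gave the deployment in Azure.
AZURE_API_VERSION: The version of the Azure API you are using. The default value is 2023-05-15.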
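
A missing Azure variable is easier to diagnose at startup than on the first request. The sketch below validates the variables listed above before the server is deployed. The function `azure_settings` is a hypothetical helper, and treating OPENAI_API_KEY as required for Azure follows the parameter list, which states that the key is used for the OpenAI or Azure API.

```python
def azure_settings(env: dict) -> dict:
    """Collect the Azure variables, failing early if a required one is unset."""
    required = ("AZURE_RESOURCE_ENDPOINT", "AZURE_DEPLOYMENT_NAME", "OPENAI_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    return {
        "endpoint": env["AZURE_RESOURCE_ENDPOINT"],
        "deployment": env["AZURE_DEPLOYMENT_NAME"],
        # Documented default when AZURE_API_VERSION is not set.
        "api_version": env.get("AZURE_API_VERSION", "2023-05-15"),
    }
```

In practice you would pass `dict(os.environ)` as `env`; a plain dict is used here so that the check can be exercised in isolation.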

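Ollama configuration

If you use Ollama as your LLM provider (OPENAI_PROVIDER=ollama), you need to specify the following environment variables:

OPENAI_MODEL: The Ollama model to use, for example llama3.
OLLAMA_ENDPOINT: The endpoint of your Ollama server. Set it to the value that matches your setup. When running locally, it is typically reachable at http://host.docker.internal:11434/v1/.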
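
To show how these two values fit together, the sketch below builds (but does not send) a chat request against the endpoint. It assumes that the `/v1/` endpoint exposes an OpenAI-style `chat/completions` route. The function name `build_ollama_request` and its default arguments are illustrative, not part of the backend.

```python
import json
import urllib.request


def build_ollama_request(prompt, model="llama3",
                         endpoint="http://host.docker.internal:11434/v1/",
                         temperature=0.7):
    """Build a POST request for an OpenAI-style chat completion."""
    url = endpoint.rstrip("/") + "/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_ollama_request("Classify this text as sarcastic or not.")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without an Ollama server.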
