Gradio Chatbot + LiteLLM Tutorial

A simple tutorial on integrating LiteLLM completion calls with a streaming Gradio chatbot demo.

Install and Import Dependencies

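# Install from a shell (in a notebook, prefix the command with "!"):
#   pip install gradio litellm
import gradio as gr  # the interface code below refers to the package as "gr"
import litellm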

Define the Inference Function

Remember to set model and api_base to match the server that hosts your LLM.
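One way to avoid hard-coding these two values is to read them from the environment. The sketch below is only an illustration: the variable names LLM_MODEL and LLM_API_BASE and the helper load_llm_settings are hypothetical and are not part of LiteLLM or Gradio.

```python
import os

# Fallback model string, taken from the example in this tutorial.
DEFAULT_MODEL = "huggingface/meta-llama/Llama-2-7b-chat-hf"


def load_llm_settings(env=None):
    """Return the model and api_base to pass on to litellm.completion.

    LLM_MODEL and LLM_API_BASE are hypothetical variable names chosen
    for this sketch; use whatever fits your deployment.
    """
    env = os.environ if env is None else env
    model = env.get("LLM_MODEL", DEFAULT_MODEL)
    api_base = env.get("LLM_API_BASE")
    if not api_base:
        # Fail early instead of sending requests to an undefined server.
        raise ValueError("Set LLM_API_BASE to the address of your inference server")
    return {"model": model, "api_base": api_base}
```

The returned dictionary could then be unpacked into the completion call, for example litellm.completion(**load_llm_settings(), messages=..., stream=True).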

def inference(message, history):
    try:
        # Flatten the [user, assistant] pairs from Gradio into one list of strings
        flattened_history = [item for sublist in history for item in sublist]
        full_message = " ".join(flattened_history + [message])
        messages_litellm = [{"role": "user", "content": full_message}]  # litellm message format
        partial_message = ""
        for chunk in litellm.completion(model="huggingface/meta-llama/Llama-2-7b-chat-hf",
                                        api_base="x.x.x.x:xxxx",
                                        messages=messages_litellm,
                                        max_new_tokens=512,
                                        temperature=.7,
                                        top_k=100,
                                        top_p=.9,
                                        repetition_penalty=1.18,
                                        stream=True):
            content = chunk['choices'][0]['delta']['content']  # extract text from the streamed litellm chunk
            if content:  # some chunks (for example the last one) may carry no text
                partial_message += content
                yield partial_message
    except Exception as e:
        print("Exception encountered:", str(e))
        yield "An error occurred. Please clear the error and try your question again."
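The function above joins the whole conversation into a single user message. As a possible alternative, the sketch below keeps the speaker roles and isolates the chunk handling so it can be checked without a server. It assumes the pair-style history ([user, assistant] lists) that the flattening code implies, and it uses plain dictionaries as stand-ins for streamed chunks. Both helper names are hypothetical.

```python
def history_to_messages(message, history):
    """Convert pair-style Gradio history plus the new message into role-tagged messages."""
    messages = []
    for user_text, assistant_text in history:
        if user_text:
            messages.append({"role": "user", "content": user_text})
        if assistant_text:
            messages.append({"role": "assistant", "content": assistant_text})
    # The new user message always goes last.
    messages.append({"role": "user", "content": message})
    return messages


def accumulate_deltas(chunks):
    """Yield the growing reply text, skipping chunks whose delta has no content."""
    partial = ""
    for chunk in chunks:
        text = chunk["choices"][0]["delta"]["content"]
        if text:
            partial += text
            yield partial


# Stand-in chunks shaped like the ones accessed in the inference function.
fake_stream = [
    {"choices": [{"delta": {"content": "Hel"}}]},
    {"choices": [{"delta": {"content": None}}]},
    {"choices": [{"delta": {"content": "lo"}}]},
]
replies = list(accumulate_deltas(fake_stream))
```

Whether role-tagged messages work better than one concatenated prompt depends on the prompt template the target server applies, so treat this as an option to try, not a required change.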

Define the Chat Interface

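# model_name and theme are not defined earlier in this tutorial; the two
# assignments below are assumptions so that the snippet runs on its own.
model_name = "huggingface/meta-llama/Llama-2-7b-chat-hf"
theme = gr.themes.Default()

gr.ChatInterface(
    inference,
    chatbot=gr.Chatbot(height=400),
    textbox=gr.Textbox(placeholder="Enter text here...", container=False, scale=5),
    description=f"""
    Current prompt template: {model_name}.
    An incorrect prompt template will degrade performance.
    Check the API specification to make sure this format matches the target LLM.""",
    title="Simple Chatbot Test Application",
    examples=["Define 'deep learning' in one sentence."],
    # The three button arguments below may not be accepted by newer Gradio
    # releases; remove them if ChatInterface raises a TypeError.
    retry_btn="Retry",
    undo_btn="Undo",
    clear_btn="Clear",
    theme=theme,
).queue().launch()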

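Launch the Gradio App

From the command line: python app.py or gradio app.py (the latter reloads the app automatically when the file changes).
Open the link printed in the terminal in your browser.
Enjoy prompt-agnostic interaction with the remote LLM server.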
