LiteLLM with Ollama
LiteLLM is an open-source, locally run proxy server that provides an OpenAI-compatible API. It interfaces with a large number of providers that do inference. To handle the inference itself, a popular open-source inference engine is Ollama.
As not all proxy servers support OpenAI's Function Calling (usable with AutoGen), LiteLLM together with Ollama enable this useful feature.
Running this stack requires the installation of:
- AutoGen (installation instructions)
- LiteLLM
- Ollama
Note: We recommend using a virtual environment for your development environment; see this article for guidance. A minimal sketch is shown below.
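For example, a minimal setup using Python's built-in venv module might look like this (the directory name .venv is just a common convention; any virtual environment tool works equally well):
python3 -m venv .venv        # create the environment in ./.venv
source .venv/bin/activate    # activate it (on Windows, run inside WSL2 as noted below)
pip install --upgrade pip    # optional: start from a current pip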
Installing LiteLLM
Install LiteLLM with the proxy server functionality:
pip install 'litellm[proxy]'
Note: If using Windows, run LiteLLM and Ollama within WSL2.
For custom LiteLLM installation instructions, see their GitHub repository.
Installing Ollama
For Mac and Windows, download Ollama.
For Linux:
curl -fsSL https://ollama.com/install.sh | sh
Downloading models
Ollama has a library of models to choose from; see them here.
Before it can be used, the model must be downloaded (use the model's name from the library):
ollama pull llama3:instruct
To see the models you have downloaded and can use:
ollama list
Ollama supports the use of GGUF model files, which are readily available on Hugging Face. See Ollama's GitHub repository for an example.
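For illustration, a minimal sketch of loading a GGUF file into Ollama might look like the following; the GGUF file name and the model name my-llama3-gguf are hypothetical placeholders, not from the original guide:
# Point a Modelfile at a GGUF file downloaded from Hugging Face
echo 'FROM ./llama-3-8b-instruct.Q4_K_M.gguf' > Modelfile
# Build a local Ollama model from the Modelfile, then try it out
ollama create my-llama3-gguf -f Modelfile
ollama run my-llama3-gguf "Hello!"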
Running LiteLLM proxy server
To run LiteLLM with the model you have downloaded, in your terminal:
litellm --model ollama/llama3:instruct
INFO: Started server process [19040]
INFO: Waiting for application startup.
#------------------------------------------------------------#
# #
# 'This feature doesn't meet my needs because...' #
# https://github.com/BerriAI/litellm/issues/new #
# #
#------------------------------------------------------------#
Thank you for using LiteLLM! - Krrish & Ishaan
Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:4000 (Press CTRL+C to quit)
This will run the proxy server and it will be available at 'http://0.0.0.0:4000/'.
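Because the proxy exposes an OpenAI-compatible API, a quick way to verify it is working is a standard chat completion request, for example with curl (a sketch, not from the original guide; the model value is a dummy because the model was fixed when the proxy was started):
curl http://0.0.0.0:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "NotRequired",
        "messages": [{"role": "user", "content": "Say hello in one sentence."}]
      }'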
Using LiteLLM+Ollama with AutoGen
Now that we have the URL for the LiteLLM proxy server, you can use it within AutoGen in the same way as you would an OpenAI or cloud-based proxy server.
As you are running this proxy server locally, no API key is required. Additionally, as the model is set when running the LiteLLM command, there is no need to configure a model name in AutoGen. However, model and api_key are mandatory fields for configurations within AutoGen, so we put dummy values in them, as per the example below.
An additional setting in the configuration is price, which can be used to set the pricing of tokens. As we're running it locally, we'll put our cost as zero. Using this setting also avoids a prompt being shown when the price can't be determined.
from autogen import ConversableAgent, UserProxyAgent
local_llm_config = {
"config_list": [
{
"model": "NotRequired", # Loaded with LiteLLM command
"api_key": "NotRequired", # Not needed
"base_url": "http://0.0.0.0:4000", # Your LiteLLM URL
"price": [0, 0], # Put in price per 1K tokens [prompt, response] as free!
}
],
"cache_seed": None, # Turns off caching, useful for testing different models
}
# Create the agent that uses the LLM.
assistant = ConversableAgent("agent", llm_config=local_llm_config)
# Create the agent that represents the user in the conversation.
user_proxy = UserProxyAgent("user", code_execution_config=False)
# Let the assistant start the conversation. It will end when the user types exit.
res = assistant.initiate_chat(user_proxy, message="How can I help you today?")
print(assistant)
How can I help you today?
--------------------------------------------------------------------------------
Why is the sky blue?
--------------------------------------------------------------------------------
>>>>>>>> USING AUTO REPLY...
A classic question!
The sky appears blue because of a phenomenon called scattering. When sunlight enters Earth's atmosphere, it encounters tiny molecules of gases such as nitrogen (N2) and oxygen (O2). These molecules scatter the light in all directions, but they scatter shorter (blue) wavelengths more than longer (red) wavelengths.
This is known as Rayleigh scattering, named after the British physicist Lord Rayleigh, who first described the phenomenon in the late 19th century. As a result of this scattering, the blue light is distributed throughout the atmosphere, giving the sky its blue appearance.
Additionally, when sunlight passes through more dense atmospheric particles like water vapor, pollutants, and dust, it can also be scattered or absorbed, which affects the color we see. For example, during sunrise and sunset, the light has to travel longer distances through the atmosphere, which scatters the shorter wavelengths even more, making the sky appear more red.
So, there you have it! The blue sky is a result of the combination of sunlight, atmospheric gases, and the scattering of light.
How's that? Do you have any other questions or would you like to explore more topics?
--------------------------------------------------------------------------------
Why is it sometimes red, then?
--------------------------------------------------------------------------------
>>>>>>>> USING AUTO REPLY...
Excellent follow-up question!
As I mentioned earlier, the color we see in the sky can be affected by the amount and type of particles in the atmosphere. When the sunlight has to travel longer distances through the air, like during sunrise and sunset, it encounters more atmospheric particles that scatter the shorter blue wavelengths even more than the longer red wavelengths.
This is known as Mie scattering, named after the German physicist Gustav Mie. The larger particles, such as water droplets, pollen, and dust, are responsible for this type of scattering. They scatter the shorter blue wavelengths more efficiently than the longer red wavelengths, which is why we often see more red or orange hues during these times.
Additionally, during sunrise and sunset, the sun's rays have to travel through a thicker layer of atmosphere, which contains more particles like water vapor, pollutants, and aerosols. These particles can absorb or scatter certain wavelengths of light, making them appear redder or more orange.
The combination of Mie scattering and absorption by atmospheric particles can create the warm, golden hues we often see during sunrise and sunset. It's a beautiful reminder that the color of our sky is not just a result of the sun itself but also the complex interactions between sunlight, atmosphere, and particles!
Would you like to explore more about the Earth's atmosphere or perhaps learn about other fascinating topics?
--------------------------------------------------------------------------------
<autogen.agentchat.conversable_agent.ConversableAgent object at 0x7fe35da88dd0>
Example with Function Calling
Function calling (aka tool calling) is a feature of OpenAI's API that AutoGen, LiteLLM, and Ollama support.
Below is an example of using function calling with LiteLLM and Ollama. It is based on this currency conversion notebook.
LiteLLM is loaded in the same way as the previous example, and we will continue to use Meta's Llama3 model as it is good at constructing the function calling messages required.
Note: LiteLLM version 1.41.27, or later, is required (to support function calling natively through Ollama).
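One way to confirm the installed version meets that minimum, and to upgrade it if necessary (a sketch using standard pip commands):
pip show litellm | grep -i '^version'   # print the installed LiteLLM version
pip install --upgrade 'litellm[proxy]'  # upgrade if it is older than 1.41.27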
In your terminal:
litellm --model ollama/llama3
Then we run our program with function calling.
from typing import Literal
from typing_extensions import Annotated
import autogen
local_llm_config = {
"config_list": [
{
"model": "NotRequired", # Loaded with LiteLLM command
"api_key": "NotRequired", # Not needed
"base_url": "http://0.0.0.0:4000", # Your LiteLLM URL
"price": [0, 0], # Put in price per 1K tokens [prompt, response] as free!
}
],
"cache_seed": None, # Turns off caching, useful for testing different models
}
# Create the agent and include examples of the function calling JSON in the prompt
# to help guide the model
chatbot = autogen.AssistantAgent(
name="chatbot",
system_message="""For currency exchange tasks,
only use the functions you have been provided with.
If the function has been called previously,
return only the word 'TERMINATE'.""",
llm_config=local_llm_config,
)
user_proxy = autogen.UserProxyAgent(
name="user_proxy",
is_termination_msg=lambda x: x.get("content", "") and "TERMINATE" in x.get("content", ""),
human_input_mode="NEVER",
max_consecutive_auto_reply=1,
code_execution_config={"work_dir": "code", "use_docker": False},
)
CurrencySymbol = Literal["USD", "EUR"]
# Define our function that we expect to call
def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float:
if base_currency == quote_currency:
return 1.0
elif base_currency == "USD" and quote_currency == "EUR":
return 1 / 1.1
elif base_currency == "EUR" and quote_currency == "USD":
return 1.1
else:
raise ValueError(f"Unknown currencies {base_currency}, {quote_currency}")
# Register the function with the agent
@user_proxy.register_for_execution()
@chatbot.register_for_llm(description="Currency exchange calculator.")
def currency_calculator(
base_amount: Annotated[float, "Amount of currency in base_currency"],
base_currency: Annotated[CurrencySymbol, "Base currency"] = "USD",
quote_currency: Annotated[CurrencySymbol, "Quote currency"] = "EUR",
) -> str:
quote_amount = exchange_rate(base_currency, quote_currency) * base_amount
return f"{format(quote_amount, '.2f')} {quote_currency}"
# start the conversation
res = user_proxy.initiate_chat(
chatbot,
message="How much is 123.45 EUR in USD?",
summary_method="reflection_with_llm",
)
How much is 123.45 EUR in USD?
--------------------------------------------------------------------------------
***** Suggested tool call (call_d9584223-9af0-4526-ad09-856b03487fd5): currency_calculator *****
Arguments:
{"base_amount": 123.45, "base_currency": "EUR", "quote_currency": "USD"}
************************************************************************************************
--------------------------------------------------------------------------------
>>>>>>>> EXECUTING FUNCTION currency_calculator...
***** Response from calling tool (call_d9584223-9af0-4526-ad09-856b03487fd5) *****
135.80 USD
**********************************************************************************
--------------------------------------------------------------------------------
***** Suggested tool call (call_17b07b4d-629f-4314-8a04-97b1537fa486): currency_calculator *****
Arguments:
{"base_amount": 123.45, "base_currency": "EUR", "quote_currency": "USD"}
************************************************************************************************
--------------------------------------------------------------------------------
We can see that the currency conversion function was called with the correct values and a result was generated.
Once functions are included in the conversation, it is possible, when using LiteLLM and Ollama, that the model may continue to recommend tool calls (as shown above). This is an area of active development, and a native Ollama client for AutoGen is planned for a future release.