Python API#

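LLM provides a Python API for executing prompts, in addition to the command-line interface.

Understanding this API is also important for writing Plugins.

Basic prompt execution#

To run a prompt against the gpt-4o-mini model, run the following: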

import llm

model = llm.get_model("gpt-4o-mini")
# Optional, you can configure the key in other ways:
model.key = "sk-..."
response = model.prompt("Five surprising names for a pet pelican")
print(response.text())

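The llm.get_model() function accepts a model ID or an alias. You can also omit the argument to use the currently configured default model, which is gpt-4o-mini unless you have changed the default.

In this example the key is set in Python code. You can also provide the key using the OPENAI_API_KEY environment variable, or store it in the keys.json file with the llm keys set openai command; see API key management.

The response's __str__() method also returns the text of the response, so you can do this instead: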

print(llm.get_model().prompt("Five surprising names for a pet pelican"))

Run this command to see a list of available models and their aliases:

llm models

If you have set the OPENAI_API_KEY environment variable you can omit the model.key = line.

Calling llm.get_model() with an invalid model ID raises an llm.UnknownModelError exception.
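Because an unknown ID raises an exception, a caller may want to fall back to the default model. The following is a minimal sketch, not part of LLM itself: get_model_with_fallback is a hypothetical helper, and FakeLLM is a stand-in for the llm package so that the snippet runs without an installation or an API key. It relies only on llm.get_model() and llm.UnknownModelError as described above.

```python
def get_model_with_fallback(llm_module, model_id):
    """Return the model for model_id, or the default model if the ID is unknown.

    Hypothetical helper: llm_module is expected to be the imported llm package.
    """
    try:
        return llm_module.get_model(model_id)
    except llm_module.UnknownModelError:
        # Calling get_model() with no argument returns the configured default
        return llm_module.get_model()


class FakeLLM:
    """Stand-in for the llm package, used only to demonstrate the helper."""

    class UnknownModelError(KeyError):
        pass

    @staticmethod
    def get_model(model_id=None):
        known = {None: "default-model", "gpt-4o-mini": "gpt-4o-mini-model"}
        if model_id not in known:
            raise FakeLLM.UnknownModelError(model_id)
        return known[model_id]


found = get_model_with_fallback(FakeLLM, "gpt-4o-mini")
fallback = get_model_with_fallback(FakeLLM, "no-such-model")
```

With the real package, the call would be get_model_with_fallback(llm, "some-id") after import llm.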

System prompts#

For models that accept a system prompt, pass it as system="...":

response = model.prompt(
    "Five surprising names for a pet pelican",
    system="Answer like GlaDOS"
)

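Attachments#

Models that accept multi-modal input (images, audio, video, etc.) can be passed attachments using the attachments= keyword argument. This accepts a list of llm.Attachment() instances.

This example shows two attachments, one from a file path and one from a URL: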

import llm

model = llm.get_model("gpt-4o-mini")
response = model.prompt(
    "Describe these images",
    attachments=[
        llm.Attachment(path="pelican.jpg"),
        llm.Attachment(url="https://static.simonwillison.net/static/2024/pelicans.jpg"),
    ]
)

Use llm.Attachment(content=b"binary image content here") to pass binary content directly.

You can check which attachment types (if any) a model supports using the model.attachment_types set:

model = llm.get_model("gpt-4o-mini")
print(model.attachment_types)
# {'image/gif', 'image/png', 'image/jpeg', 'image/webp'}

if "image/jpeg" in model.attachment_types:
    # Use a JPEG attachment here
    ...
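The same check can be applied to a batch of files before building the attachments list. This is a rough sketch under stated assumptions: supported_paths is a hypothetical helper, MIME types are guessed from file extensions with the standard library's mimetypes module (not by inspecting file contents), and fake_model stands in for a real model object.

```python
import mimetypes
from types import SimpleNamespace


def supported_paths(model, paths):
    """Return the paths whose guessed MIME type is in model.attachment_types."""
    accepted = getattr(model, "attachment_types", set())
    supported = []
    for path in paths:
        # guess_type() works from the file extension only
        mime_type, _encoding = mimetypes.guess_type(path)
        if mime_type in accepted:
            supported.append(path)
    return supported


# Stand-in for a real model object; only attachment_types is needed here
fake_model = SimpleNamespace(attachment_types={"image/jpeg", "image/png"})
usable = supported_paths(fake_model, ["pelican.jpg", "notes.txt", "otter.png"])
```

Each surviving path could then be wrapped as llm.Attachment(path=...) and passed to model.prompt().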

Model options#

For models that support options (run llm models --options to list them), you can pass options as keyword arguments to the .prompt() method:

model = llm.get_model()
print(model.prompt("Names for otters", temperature=0.2))

Passing an API key#

Models that accept API keys should take an additional key= parameter to their model.prompt() method:

model = llm.get_model("gpt-4o-mini")
print(model.prompt("Names for beavers", key="sk-..."))

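If you do not provide this argument, LLM will attempt to find the key in an environment variable (OPENAI_API_KEY for OpenAI, other variables for other plugins) or among the keys saved using the llm keys set command.

Some model plugins may not yet have been upgraded to handle the key= parameter, in which case you will need to use one of the other mechanisms.

Models from plugins#

Any models you have installed as plugins are also available through this mechanism, for example to use Anthropic's Claude 3.5 Sonnet model with llm-anthropic: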

pip install llm-anthropic

Then in your Python code:

import llm

model = llm.get_model("claude-3.5-sonnet")
# Use this if you have not set the key using 'llm keys set claude':
model.key = 'YOUR_API_KEY_HERE'
response = model.prompt("Five surprising names for a pet pelican")
print(response.text())

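Some models do not use API keys at all.

Accessing the underlying JSON#

Most model plugins also make a JSON version of the prompt response available. Its structure differs between model plugins, so code built against it is likely to work only with that specific model provider.

You can access this JSON data as a Python dictionary using the response.json() method: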

import llm
from pprint import pprint

model = llm.get_model("gpt-4o-mini")
response = model.prompt("3 names for an otter")
json_data = response.json()
pprint(json_data)

Here is example output from GPT-4o mini:

{'content': 'Sure! Here are three fun names for an otter:\n'
            '\n'
            '1. **Splash**\n'
            '2. **Bubbles**\n'
            '3. **Otto** \n'
            '\n'
            'Feel free to mix and match or use these as inspiration!',
 'created': 1739291215,
 'finish_reason': 'stop',
 'id': 'chatcmpl-AznO31yxgBjZ4zrzBOwJvHEWgdTaf',
 'model': 'gpt-4o-mini-2024-07-18',
 'object': 'chat.completion.chunk',
 'usage': {'completion_tokens': 43,
           'completion_tokens_details': {'accepted_prediction_tokens': 0,
                                         'audio_tokens': 0,
                                         'reasoning_tokens': 0,
                                         'rejected_prediction_tokens': 0},
           'prompt_tokens': 13,
           'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0},
           'total_tokens': 56}}
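Since this structure is provider-specific, code that reads it is safer when it tolerates missing keys. The sketch below is illustrative only: summarize_json is a hypothetical helper, and the key names are taken from the GPT-4o mini output shown above, so other plugins may use different ones.

```python
def summarize_json(json_data):
    """Pull a few fields out of a response JSON dict, tolerating missing keys."""
    usage = json_data.get("usage") or {}
    return {
        "model": json_data.get("model"),
        "finish_reason": json_data.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
    }


# Trimmed copy of the example output above
sample = {
    "finish_reason": "stop",
    "model": "gpt-4o-mini-2024-07-18",
    "usage": {"completion_tokens": 43, "prompt_tokens": 13, "total_tokens": 56},
}
summary = summarize_json(sample)
empty_summary = summarize_json({})
```

In real use the argument would be the dictionary returned by response.json().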

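Token usage#

Many models can return a count of the number of tokens used while executing the prompt.

The response.usage() method provides an abstraction over this: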

pprint(response.usage())

Example output:

Usage(input=5,
      output=2,
      details={'candidatesTokensDetails': [{'modality': 'TEXT',
                                            'tokenCount': 2}],
               'promptTokensDetails': [{'modality': 'TEXT', 'tokenCount': 5}]})

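The .input and .output properties are integers representing the number of input and output tokens. The .details property may be a dictionary with additional custom values that vary by model.

Streaming responses#

For models that support it, you can stream a response as it is generated, like this: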

response = model.prompt("Five diabolical names for a pet goat")
for chunk in response:
    print(chunk, end="")

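The response.text() method described earlier does this for you: it loops through the iterator and gathers the results into a string.

If a response has already been evaluated, response.text() will continue to return the same string.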
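To show the chunks as they arrive and still keep the full text, the loop can collect each chunk while writing it out. This is a small sketch: stream_and_collect is a hypothetical helper, and a plain iterator of strings stands in for a real response object, which is iterated the same way.

```python
import sys


def stream_and_collect(response, out=sys.stdout):
    """Write each chunk as it arrives and return the complete text."""
    parts = []
    for chunk in response:
        out.write(chunk)
        out.flush()  # make partial output visible immediately
        parts.append(chunk)
    return "".join(parts)


# Stand-in for a streaming response: any iterable of string chunks
fake_response = iter(["Billy", " the", " Kid"])
collected = stream_and_collect(fake_response)
```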

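Async models#

Some plugins provide async versions of their supported models, suitable for use with Python asyncio.

To use an async model, use the llm.get_async_model() function instead of llm.get_model():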

import llm
model = llm.get_async_model("gpt-4o")

You can then run a prompt using await model.prompt(...):

response = await model.prompt(
    "Five surprising names for a pet pelican"
)
print(await response.text())

Or use async for chunk in ... to stream the response as it is generated:

async for chunk in model.prompt(
    "Five surprising names for a pet pelican"
):
    print(chunk, end="", flush=True)

This await model.prompt() method takes the same arguments as the synchronous model.prompt() method, for options, attachments, key= and so on.
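The snippets above use await and async for at the top level, which works in an async-aware REPL or notebook. In an ordinary script they need to sit inside a coroutine started with asyncio.run(). A minimal sketch of that wrapping follows; collect_chunks is a hypothetical helper, and fake_stream is an async generator standing in for the value returned by model.prompt() on an async model.

```python
import asyncio


async def collect_chunks(stream):
    """Consume an async iterable of string chunks and return the joined text."""
    parts = []
    async for chunk in stream:
        parts.append(chunk)
    return "".join(parts)


async def fake_stream():
    # Stand-in for an async response; yields chunks one at a time
    for chunk in ["Pel", "i", "can"]:
        await asyncio.sleep(0)  # hand control back to the event loop
        yield chunk


result = asyncio.run(collect_chunks(fake_stream()))
```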

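Conversations#

LLM supports conversations, where you ask follow-up questions of a model as part of an ongoing conversation.

To start a new conversation, use the model.conversation() method: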

model = llm.get_model()
conversation = model.conversation()

You can then use the conversation.prompt() method to execute prompts against this conversation:

response = conversation.prompt("Five fun facts about pelicans")
print(response.text())

This works exactly the same as the model.prompt() method, except that the conversation is maintained across multiple prompts. So if you run this next:

response2 = conversation.prompt("Now do skunks")
print(response2.text())

You will get back five fun facts about skunks.

The conversation.prompt() method also supports attachments:

response = conversation.prompt(
    "Describe these birds",
    attachments=[
        llm.Attachment(url="https://static.simonwillison.net/static/2024/pelicans.jpg")
    ]
)

Access conversation.responses for a list of all the responses returned so far during the conversation.
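A sequence of follow-up prompts can be driven from a list. The sketch below is illustrative: ask_in_sequence is a hypothetical helper that uses only conversation.prompt() and response.text(), and the two Fake classes are stand-ins so the snippet runs without a model.

```python
def ask_in_sequence(conversation, prompts):
    """Send each prompt to the same conversation and return the reply texts."""
    return [conversation.prompt(prompt).text() for prompt in prompts]


class FakeResponse:
    """Stand-in for a response object; only text() is implemented."""

    def __init__(self, text):
        self._text = text

    def text(self):
        return self._text


class FakeConversation:
    """Stand-in for a conversation; it records each response it returns."""

    def __init__(self):
        self.responses = []

    def prompt(self, prompt_text):
        reply = FakeResponse(f"reply {len(self.responses) + 1} to: {prompt_text}")
        self.responses.append(reply)
        return reply


conversation = FakeConversation()
answers = ask_in_sequence(
    conversation, ["Five fun facts about pelicans", "Now do skunks"]
)
```

With a real model, conversation would come from model.conversation().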

Listing models#

The llm.get_models() function returns a list of all available models, including those from plugins.

import llm

for model in llm.get_models():
    print(model.model_id)

Use llm.get_async_models() to list async models:

for model in llm.get_async_models():
    print(model.model_id)
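The returned list can be filtered like any other Python list. As a small sketch, model_ids_containing is a hypothetical helper that relies only on the model_id attribute used above, and fake_models stands in for the result of llm.get_models().

```python
from types import SimpleNamespace


def model_ids_containing(models, fragment):
    """Return the sorted model IDs that contain the given text fragment."""
    return sorted(m.model_id for m in models if fragment in m.model_id)


# Stand-in for llm.get_models(); only model_id is needed here
fake_models = [
    SimpleNamespace(model_id="gpt-4o-mini"),
    SimpleNamespace(model_id="claude-3.5-sonnet"),
    SimpleNamespace(model_id="gpt-4o"),
]
matches = model_ids_containing(fake_models, "gpt-4o")
```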

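Running code when a response has completed#

For some applications, such as tracking the tokens used by an application, it may be useful to execute code as soon as a response has finished being executed.

You can do this using the response.on_done(callback) method, which causes your callback function to be called as soon as the response has finished (all tokens have been returned).

The signature of the callback you provide is def callback(response). When working with async models it can optionally be an async def function.

Example usage: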

import llm

model = llm.get_model("gpt-4o-mini")
response = model.prompt("a poem about a hippo")
response.on_done(lambda response: print(response.usage()))
print(response.text())

Which outputs:

Usage(input=20, output=494, details={})
In a sunlit glade by a bubbling brook,
Lived a hefty hippo, with a curious look.
...

Or using an asyncio model, where you need to await response.on_done(done) to queue up the callback:

import asyncio, llm

async def run():
    model = llm.get_async_model("gpt-4o-mini")
    response = model.prompt("a short poem about a brick")
    async def done(response):
        print(await response.usage())
        print(await response.text())
    await response.on_done(done)
    print(await response.text())

asyncio.run(run())
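For the token-tracking use case mentioned above, one callback can accumulate usage across many responses. This is a sketch under stated assumptions: TokenTracker is a hypothetical class, FakeResponse is a stand-in that fires its callbacks the first time text() is called, and a missing count is treated as zero, which may not suit models that report no usage at all.

```python
from collections import namedtuple

FakeUsage = namedtuple("FakeUsage", ["input", "output", "details"])


class TokenTracker:
    """Accumulates input and output token counts across responses."""

    def __init__(self):
        self.input = 0
        self.output = 0

    def record(self, response):
        # Intended to be registered with response.on_done(tracker.record)
        usage = response.usage()
        self.input += usage.input or 0
        self.output += usage.output or 0


class FakeResponse:
    """Stand-in response: runs on_done callbacks when text() first completes."""

    def __init__(self, text, input_tokens, output_tokens):
        self._text = text
        self._usage = FakeUsage(input_tokens, output_tokens, {})
        self._callbacks = []
        self._done = False

    def on_done(self, callback):
        self._callbacks.append(callback)

    def usage(self):
        return self._usage

    def text(self):
        if not self._done:
            self._done = True
            for callback in self._callbacks:
                callback(self)
        return self._text


tracker = TokenTracker()
for response in (FakeResponse("a poem", 20, 494), FakeResponse("names", 5, 2)):
    response.on_done(tracker.record)
    response.text()
```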

Other functions#

The llm top-level package includes some useful utility functions.

set_alias(alias, model_id)#

The llm.set_alias() function can be used to define a new alias:

import llm

llm.set_alias("mini", "gpt-4o-mini")

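The second argument can be a model identifier or another alias, in which case that alias will be resolved.

If the aliases.json file does not exist or contains invalid JSON, it will be created or overwritten.

remove_alias(alias)#

Removes the alias with the given name from the aliases.json file.

Raises KeyError if the alias does not exist.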

import llm

llm.remove_alias("turbo")

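set_default_model(alias)#

This sets the default model to the given model ID or alias. Any changes to defaults are persisted in the LLM configuration folder and affect all programs using LLM on the system, including the llm CLI tool.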

import llm

llm.set_default_model("claude-3.5-sonnet")

get_default_model()#

This returns the currently configured default model, or gpt-4o-mini if no default has been set.

import llm

model_id = llm.get_default_model()

To detect whether no default has been set, you can use this pattern:

if llm.get_default_model(default=None) is None:
    print("No default has been set")

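Here the default= parameter specifies the value that should be returned if there is no configured default.

set_default_embedding_model(alias) and get_default_embedding_model()#

These two functions work the same as set_default_model() and get_default_model(), but for the default embedding model.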