Run Outlines using Modal
Modal is a serverless platform that lets you easily run code in the cloud, including on GPUs. It is convenient for those of us who do not have a powerful GPU at home, and it makes it quick and easy to provision, configure, and orchestrate cloud infrastructure.
In this guide we will show you how you can use Modal to run programs written with Outlines on GPUs in the cloud.
Requirements
We recommend installing modal and outlines in a virtual environment. You can create one with:
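For example, on a Unix-like system (the environment name `.venv` is just a common convention, not something Modal requires):

```shell
# Create a virtual environment in the .venv directory
python -m venv .venv

# Activate it (on Windows, use .venv\Scripts\activate instead)
source .venv/bin/activate
```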
Then install the required packages:
Build the image
First we need to define our container image. If you need access to a gated model, you will have to provide an access token. See the .env call below on how to provide a HuggingFace token.
Setting a token is best done by setting an environment variable HF_TOKEN with your token. If you do not want to do that, we provide a commented-out line in the code to set the token directly in the code.
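For example, in the shell from which you will run Modal (the value below is a placeholder, not a real token — substitute your own access token from your Hugging Face account settings):

```shell
# Placeholder value -- replace with your own Hugging Face access token
export HF_TOKEN="hf_your_token_here"
```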
```python
from modal import Image, App, gpu

import os

# This creates a modal App object. Here we set the name to "outlines-app".
# There are other optional parameters like modal secrets, schedules, etc.
# See the documentation here: https://modal.com/docs/reference/modal.App
app = App(name="outlines-app")

# Specify a language model to use.
# Another good model to use is "NousResearch/Hermes-2-Pro-Mistral-7B"
language_model = "mistral-community/Mistral-7B-v0.2"

# Please set an environment variable HF_TOKEN with your Hugging Face API token.
# The code below (the .env({...}) part) will copy the token from your local
# environment to the container.
# More info on Image here: https://modal.com/docs/reference/modal.Image
outlines_image = Image.debian_slim(python_version="3.11").pip_install(
    "outlines",
    "transformers",
    "datasets",
    "accelerate",
    "sentencepiece",
).env({
    # This will pull in your HF_TOKEN environment variable if you have one.
    'HF_TOKEN': os.environ['HF_TOKEN']

    # To set the token directly in the code, uncomment the line below and replace
    # 'YOUR_TOKEN' with the HuggingFace access token.
    # 'HF_TOKEN':'YOUR_TOKEN'
})
```
Setting the container up
When running longer Modal apps, it is recommended to download the language model when the container starts, rather than when the function is called. This will cache the model for future runs.
```python
# This function imports the model from Hugging Face. The modal container
# will call this function when it starts up. This is useful for
# downloading models, setting up environment variables, etc.
def import_model():
    import outlines
    outlines.models.transformers(language_model)

# This line tells the container to run the import_model function when it starts.
outlines_image = outlines_image.run_function(import_model)
```
Define a schema
We will run the JSON-structured generation example from the README, with the following schema:
```python
# Specify a schema for the character description. In this case,
# we want to generate a character with a name, age, armor, weapon, and strength.
schema = """{
    "title": "Character",
    "type": "object",
    "properties": {
        "name": {
            "title": "Name",
            "maxLength": 10,
            "type": "string"
        },
        "age": {
            "title": "Age",
            "type": "integer"
        },
        "armor": {"$ref": "#/definitions/Armor"},
        "weapon": {"$ref": "#/definitions/Weapon"},
        "strength": {
            "title": "Strength",
            "type": "integer"
        }
    },
    "required": ["name", "age", "armor", "weapon", "strength"],
    "definitions": {
        "Armor": {
            "title": "Armor",
            "description": "An enumeration.",
            "enum": ["leather", "chainmail", "plate"],
            "type": "string"
        },
        "Weapon": {
            "title": "Weapon",
            "description": "An enumeration.",
            "enum": ["sword", "axe", "mace", "spear", "bow", "crossbow"],
            "type": "string"
        }
    }
}"""
```
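Because the schema is passed to Outlines as a string, a typo in it would only surface at generation time, on the remote GPU. A quick local sanity check (not part of the original example) is to parse the string with the standard json module before deploying:

```python
import json

# The same schema string as above, reproduced so this check is self-contained.
schema = """{
    "title": "Character",
    "type": "object",
    "properties": {
        "name": {
            "title": "Name",
            "maxLength": 10,
            "type": "string"
        },
        "age": {
            "title": "Age",
            "type": "integer"
        },
        "armor": {"$ref": "#/definitions/Armor"},
        "weapon": {"$ref": "#/definitions/Weapon"},
        "strength": {
            "title": "Strength",
            "type": "integer"
        }
    },
    "required": ["name", "age", "armor", "weapon", "strength"],
    "definitions": {
        "Armor": {
            "title": "Armor",
            "description": "An enumeration.",
            "enum": ["leather", "chainmail", "plate"],
            "type": "string"
        },
        "Weapon": {
            "title": "Weapon",
            "description": "An enumeration.",
            "enum": ["sword", "axe", "mace", "spear", "bow", "crossbow"],
            "type": "string"
        }
    }
}"""

# Raises json.JSONDecodeError if the string is not valid JSON.
parsed = json.loads(schema)
print(parsed["required"])  # → ['name', 'age', 'armor', 'weapon', 'strength']
```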
To make the inference work on Modal, we need to wrap the corresponding function in an @app.function decorator. We pass to this decorator the image and the GPU on which we want the function to run.
Let's choose an A100 with 80GB memory. Valid GPUs can be found here.
```python
# Define a function that uses the image we chose, and specify the GPU
# and memory we want to use.
@app.function(image=outlines_image, gpu=gpu.A100(size='80GB'))
def generate(
    prompt: str = "Amiri, a 53 year old warrior woman with a sword and leather armor.",
):
    # Remember, this function is being executed in the container,
    # so we need to import the necessary libraries here. You should
    # do this with any other libraries you might need.
    import outlines

    # Load the model into memory. The import_model function above
    # should have already downloaded the model, so this call
    # only loads the model into GPU memory.
    model = outlines.models.transformers(
        language_model, device="cuda"
    )

    # Generate a character description based on the prompt.
    # We use the .json generation method -- we provide the
    # - model: the model we loaded above
    # - schema: the JSON schema we defined above
    generator = outlines.generate.json(model, schema)

    # Make sure you wrap your prompt in instruction tags ([INST] and [/INST])
    # to indicate that the prompt is an instruction. Instruction tags can vary
    # by models, so make sure to check the model's documentation.
    character = generator(
        f"<s>[INST]Give me a character description. Describe {prompt}.[/INST]"
    )

    # Print out the generated character.
    print(character)
```
We then need to define a local_entrypoint to call our function generate remotely.
```python
@app.local_entrypoint()
def main(
    prompt: str = "Amiri, a 53 year old warrior woman with a sword and leather armor.",
):
    # We use the "generate" function defined above -- note too that we are calling
    # .remote() on the function. This tells modal to run the function in our cloud
    # machine. If you want to run the function locally, you can call .local() instead,
    # though this will require additional setup.
    generate.remote(prompt)
```
Here the @app.local_entrypoint() decorator defines main as the function to start from locally when using the Modal CLI. You can save the above code in example.py (or use this implementation). Let's now see how to run the code on the cloud using the Modal CLI.
Run on the cloud
First install the Modal client from PyPi, if you have not already:
You then need to obtain a token from Modal. Run the following command:
Once that is set, you can run inference on the cloud using:
You should see the Modal app initialize, and soon after see the result of the print function in your terminal. That's it!