Tutorial: Math Reasoning
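Let's walk through a quick example of setting up a dspy.ChainOfThought module and optimizing it for answering algebra questions.
Install the latest DSPy via pip install -U dspy and follow along. You will also need to run pip install datasets.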
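Recommended: Set up MLflow Tracing to understand what's happening under the hood.
MLflow DSPy Integration
MLflow is an LLMOps tool that natively integrates with DSPy and offers explainability and experiment tracking. In this tutorial, you can use MLflow to visualize prompts and optimization progress as traces to better understand DSPy's behavior. You can set up MLflow by following the four steps below.
- Install MLflow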
%pip install "mlflow>=2.20"
- Start the MLflow UI in a separate terminal
mlflow ui --port 5000
- Connect the notebook to MLflow
import mlflow
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("DSPy")
- Enable tracing
mlflow.dspy.autolog()
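Once you have completed the steps above, you can see a trace for each program execution in the notebook. The traces give good visibility into the model's behavior and help you understand DSPy's concepts throughout the tutorial.
To learn more about the integration, visit the MLflow DSPy Documentation.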
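Let's tell DSPy that we will use OpenAI's gpt-4o-mini in our modules. To authenticate, DSPy looks for your OPENAI_API_KEY environment variable. You can easily swap this out for other providers or local models.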
import dspy
gpt4o_mini = dspy.LM('openai/gpt-4o-mini', max_tokens=2000)
gpt4o = dspy.LM('openai/gpt-4o', max_tokens=2000)
dspy.configure(lm=gpt4o_mini) # we'll use gpt-4o-mini as the default LM, unless otherwise specified
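Next, let's load some data examples from the MATH benchmark. We'll use the training split for optimization and evaluate on a held-out dev set.
Please note that the following step requires: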
%pip install git+https://github.com/hendrycks/math.git
from dspy.datasets import MATH
dataset = MATH(subset='algebra')
print(len(dataset.train), len(dataset.dev))
350 350
Let's inspect one example from the training set.
example = dataset.train[0]
print("Question:", example.question)
print("Answer:", example.answer)
Question: The doctor has told Cal O'Ree that during his ten weeks of working out at the gym, he can expect each week's weight loss to be $1\%$ of his weight at the end of the previous week. His weight at the beginning of the workouts is $244$ pounds. How many pounds does he expect to weigh at the end of the ten weeks? Express your answer to the nearest whole number.
Answer: 221
Now let's define our module. It's extremely simple: just a chain-of-thought step that takes a question and produces an answer.
module = dspy.ChainOfThought("question -> answer")
module(question=example.question)
Prediction( reasoning="Cal O'Ree's weight loss each week is $1\\%$ of his weight at the end of the previous week. This means that at the end of each week, he retains $99\\%$ of his weight from the previous week. \n\nIf we denote his weight at the beginning as \\( W_0 = 244 \\) pounds, then his weight at the end of week \\( n \\) can be expressed as:\n\\[\nW_n = W_{n-1} \\times 0.99\n\\]\nThis can be simplified to:\n\\[\nW_n = W_0 \\times (0.99)^n\n\\]\nAfter 10 weeks, his weight will be:\n\\[\nW_{10} = 244 \\times (0.99)^{10}\n\\]\n\nNow, we calculate \\( (0.99)^{10} \\):\n\\[\n(0.99)^{10} \\approx 0.904382\n\\]\n\nNow, we can calculate his expected weight after 10 weeks:\n\\[\nW_{10} \\approx 244 \\times 0.904382 \\approx 220.5\n\\]\n\nRounding to the nearest whole number, Cal O'Ree can expect to weigh approximately \\( 221 \\) pounds at the end of the ten weeks.", answer='221' )
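The arithmetic behind this example is easy to check by hand. The short sketch below is not part of the DSPy pipeline; it simply recomputes the compounding formula from the reasoning above. It shows that the unrounded product is closer to 220.67 than to the 220.5 quoted in the model's reasoning, although both round to the same final answer.

```python
# Sanity check of the worked example: 1% weight loss per week for ten weeks.
start_weight = 244.0      # pounds, from the question
weekly_retention = 0.99   # each week he keeps 99% of the previous week's weight
weeks = 10

final_weight = start_weight * weekly_retention ** weeks
print(f"{final_weight:.2f}")  # 220.67
print(round(final_weight))    # 221
```

Checks like this are a cheap way to spot intermediate slips in a chain of thought even when the final answer is correct.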
Next, before any prompt optimization, let's set up an evaluator for the zero-shot module above.
THREADS = 24
kwargs = dict(num_threads=THREADS, display_progress=True, display_table=5)
evaluate = dspy.Evaluate(devset=dataset.dev, metric=dataset.metric, **kwargs)
evaluate(module)
Average Metric: 259.00 / 350 (74.0%): 100%|██████████| 350/350 [01:30<00:00, 3.85it/s]
2024/11/28 18:41:55 INFO dspy.evaluate.evaluate: Average Metric: 259 / 350 (74.0%)
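| | question | example_reasoning | example_answer | pred_reasoning | pred_answer | method |
|---|---|---|---|---|---|---|
| 0 | What is the smallest integer value of $c$ such that the function $...$ | The given function has a domain of all real numbers if and only if... | 1 | To determine the smallest integer value of $c$ such that the function... | 1 | ✔️ [True] |
| 1 | What is the smallest value of $x$ that satisfies $|{-x+3}|=7$? | In order to have $|{-x+3}| = 7$, we must have $-x + 3 = 7$ or $-x ... | -4 | To solve the equation \( |{-x+3}|=7 \), we need to consider the... | -4 | ✔️ [True] |
| 2 | Evaluate $\left\lceil -\frac{7}{4}\right\rceil$. | $-\frac{7}{4}$ is between $-1$ and $-2$, so $\left\lceil -\frac{7}... | -1 | To evaluate \(\left\lceil -\frac{7}{4}\right\rceil\), we first need to... | -1 | ✔️ [True] |
| 3 | A triangle has vertices at coordinates $(11,1)$, $(2,3)$ and $(3,7... | We must find the distance between each pair of points by using... | 10 | To find the length of the longest side of the triangle with vertices... | 10 | ✔️ [True] |
| 4 | Let $f(x) = x + 2$ and $g(x) = 1/f(x)$. What is $g(f(-3))$? | First, we find that $f(-3) = (-3) + 2 = -1$. Then, $$g(f(-3)) = g(... | 1 | To find \( g(f(-3)) \), we first need to evaluate \( f(-3) \). The... | 1 | ✔️ [True] |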
74.0
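The reported score is simply the share of dev examples that the metric marks as correct. As a rough illustration, the sketch below uses a hypothetical exact-match metric (exact_match and average_score are names made up for this example; the real dataset.metric likely normalizes mathematical expressions more carefully) and shows how per-example results roll up into the percentage printed above.

```python
def exact_match(gold: str, predicted: str) -> bool:
    """Hypothetical stand-in metric: compare answers after trimming whitespace."""
    return gold.strip() == predicted.strip()


def average_score(pairs):
    """Return (num_correct, total, percentage rounded to two decimals)."""
    if not pairs:
        raise ValueError("need at least one (gold, predicted) pair")
    correct = sum(exact_match(gold, pred) for gold, pred in pairs)
    total = len(pairs)
    return correct, total, round(100 * correct / total, 2)


# Toy data for illustration only, not taken from the MATH dev set.
toy_pairs = [("1", "1"), ("-4", " -4"), ("10", "12"), ("\\frac{2}{3}", "\\frac{2}{3}")]
print(average_score(toy_pairs))   # (3, 4, 75.0)

# The zero-shot run above: 259 correct out of 350.
print(round(100 * 259 / 350, 2))  # 74.0
```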
Tracking Evaluation Results in an MLflow Experiment
To track and visualize evaluation results over time, you can record them in an MLflow Experiment.
import mlflow
# Start an MLflow Run to record the evaluation
with mlflow.start_run(run_name="math_evaluation"):
    kwargs = dict(num_threads=THREADS, display_progress=True)
    evaluate = dspy.Evaluate(devset=dataset.dev, metric=dataset.metric, **kwargs)
    # Evaluate the program as usual
    result = evaluate(module)
    # Log the aggregated score
    mlflow.log_metric("correctness", result.score)
    # Log the detailed evaluation results as a table
    mlflow.log_table(
        {
            "Question": [example.question for example in dataset.dev],
            "Gold Answer": [example.answer for example in dataset.dev],
            "Predicted Answer": [output[1] for output in result.results],
            "Correctness": [output[2] for output in result.results],
        },
        artifact_file="eval_results.json",
    )
To learn more about the integration, visit the MLflow DSPy Documentation.
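Finally, let's optimize our module. Since we want strong reasoning, we'll use the large GPT-4o as the teacher model (used to bootstrap reasoning for the small LM at optimization time), but not as the prompt model (used to craft instructions) or the task model (the one being trained).
GPT-4o will be invoked only a small number of times. The model involved directly in optimization and in the resulting (optimized) program will be GPT-4o-mini.
We will also specify max_bootstrapped_demos=4, which means we want at most four bootstrapped examples in the prompt, and max_labeled_demos=4, which means that, in total between bootstrapped and pre-labeled examples, we want at most four.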
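To make the two caps concrete, here is a toy sketch of the constraint as described in the paragraph above. It is an assumption-laden illustration, not MIPROv2's actual selection logic: select_demos is a hypothetical helper that takes bootstrapped demos first and then fills any remaining slots with labeled ones, up to the total cap.

```python
def select_demos(bootstrapped, labeled, max_bootstrapped_demos=4, max_labeled_demos=4):
    """Toy selection (hypothetical helper, not DSPy's implementation).

    Takes up to max_bootstrapped_demos bootstrapped demos, then fills the
    remaining slots with labeled demos so the total stays within
    max_labeled_demos, following the description in the text.
    """
    chosen = list(bootstrapped[:max_bootstrapped_demos])
    remaining = max(0, max_labeled_demos - len(chosen))
    chosen.extend(labeled[:remaining])
    return chosen


# Two bootstrapped demos leave room for two labeled ones.
print(select_demos(["b1", "b2"], ["l1", "l2", "l3"]))  # ['b1', 'b2', 'l1', 'l2']
```

With both caps set to 4, a prompt that already has four bootstrapped demos gets no labeled ones under this reading.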
kwargs = dict(num_threads=THREADS, teacher_settings=dict(lm=gpt4o), prompt_model=gpt4o_mini)
optimizer = dspy.MIPROv2(metric=dataset.metric, auto="medium", **kwargs)
kwargs = dict(max_bootstrapped_demos=4, max_labeled_demos=4)
optimized_module = optimizer.compile(module, trainset=dataset.train, **kwargs)
evaluate(optimized_module)
Average Metric: 310.00 / 350 (88.6%): 100%|██████████| 350/350 [01:31<00:00, 3.84it/s]
2024/11/28 18:59:19 INFO dspy.evaluate.evaluate: Average Metric: 310 / 350 (88.6%)
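| | question | example_reasoning | example_answer | pred_reasoning | pred_answer | method |
|---|---|---|---|---|---|---|
| 0 | What is the smallest integer value of $c$ such that the function $...$ | The given function has a domain of all real numbers if and only if... | 1 | The function \( f(x) = \frac{x^2 + 1}{x^2 - x + c} \) will have... | 1 | ✔️ [True] |
| 1 | What is the smallest value of $x$ that satisfies $|{-x+3}|=7$? | In order to have $|{-x+3}| = 7$, we must have $-x + 3 = 7$ or $-x ... | -4 | The equation \( |{-x+3}|=7 \) implies two possible cases: 1. \(-x ... | -4 | ✔️ [True] |
| 2 | Evaluate $\left\lceil -\frac{7}{4}\right\rceil$. | $-\frac{7}{4}$ is between $-1$ and $-2$, so $\left\lceil -\frac{7}... | -1 | To evaluate \(\left\lceil -\frac{7}{4}\right\rceil\), we first need to... | -1 | ✔️ [True] |
| 3 | A triangle has vertices at coordinates $(11,1)$, $(2,3)$ and $(3,7... | We must find the distance between each pair of points by using... | 10 | To find the lengths of the sides of the triangle formed by the vertices... | 10 | ✔️ [True] |
| 4 | Let $f(x) = x + 2$ and $g(x) = 1/f(x)$. What is $g(f(-3))$? | First, we find that $f(-3) = (-3) + 2 = -1$. Then, $$g(f(-3)) = g(... | 1 | To find \( g(f(-3)) \), we first need to evaluate \( f(-3) \). Using... | 1 | ✔️ [True] |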
88.57
Neat. Improving quality from 74% to over 88% on a held-out set was fairly straightforward here.
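For a sense of scale, the two evaluation runs can be compared directly. The short calculation below uses only the counts printed in the logs above (259 and 310 correct out of 350).

```python
total = 350
baseline_correct, optimized_correct = 259, 310  # from the two evaluation logs

baseline_acc = 100 * baseline_correct / total    # 74.0
optimized_acc = 100 * optimized_correct / total  # about 88.57

extra_correct = optimized_correct - baseline_correct  # additional questions solved
absolute_gain = optimized_acc - baseline_acc          # gain in percentage points
# Share of the baseline's errors that the optimized program fixes (net).
error_reduction = 100 * extra_correct / (total - baseline_correct)

print(extra_correct)              # 51
print(round(absolute_gain, 2))    # 14.57
print(round(error_reduction, 2))  # 56.04
```

In other words, the optimized prompt removes a bit more than half of the zero-shot module's errors on this dev set.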
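That said, for reasoning tasks like this, you will often want to consider more advanced strategies, such as:
- A dspy.ReAct module with access to a calculator function or dspy.PythonInterpreter
- Ensembling multiple optimized prompts with a majority vote (or an Aggregator module) on top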
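Both ideas can be prototyped in plain Python before wiring them into DSPy. The sketch below is a minimal illustration under stated assumptions: calculate is a hypothetical calculator function of the kind one might hand to a tool-using module (assuming the module accepts plain Python callables as tools), and majority_vote is a hypothetical helper that picks the most common answer from several runs. Neither is DSPy's own implementation.

```python
import ast
import operator
from collections import Counter

# Arithmetic operators the calculator is willing to evaluate.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def calculate(expression: str):
    """Evaluate a plain arithmetic expression without calling eval()."""

    def _eval(node):
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"unsupported expression: {expression!r}")

    return _eval(ast.parse(expression, mode="eval").body)


def majority_vote(answers):
    """Return the most common answer; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]


print(round(calculate("244 * 0.99 ** 10")))  # 221
print(majority_vote(["221", "220", "221"]))  # 221
```

Walking the syntax tree instead of calling eval() keeps the tool limited to arithmetic, which matters when the expression is written by a language model.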
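To understand what changed, let's inspect the optimized prompt. Alternatively, if you enabled MLflow tracing following the instructions above, you can compare the prompts before and after optimization in the rich trace UI.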
dspy.inspect_history()
[2024-11-28T18:59:19.176586] System message: Your input fields are: 1. `question` (str) Your output fields are: 1. `reasoning` (str) 2. `answer` (str) All interactions will be structured in the following way, with the appropriate values filled in. [[ ## question ## ]] {question} [[ ## reasoning ## ]] {reasoning} [[ ## answer ## ]] {answer} [[ ## completed ## ]] In adhering to this structure, your objective is: Analyze the `question` provided, and systematically apply mathematical reasoning to derive the `answer`. Ensure to articulate each step of your thought process in the `reasoning` field, detailing how you identify relationships and formulate equations to arrive at the solution. User message: [[ ## question ## ]] If $|4x+2|=10$ and $x<0$, what is the value of $x$? Respond with the corresponding output fields, starting with the field `[[ ## reasoning ## ]]`, then `[[ ## answer ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`. Assistant message: [[ ## reasoning ## ]] The equation $|4x+2|=10$ implies two possible equations: $4x+2=10$ or $4x+2=-10$. We solve each equation separately. 1. For $4x+2=10$: \[ 4x + 2 = 10 \\ 4x = 8 \\ x = 2 \] However, $x=2$ does not satisfy the condition $x<0$. 2. For $4x+2=-10$: \[ 4x + 2 = -10 \\ 4x = -12 \\ x = -3 \] The solution $x = -3$ satisfies the condition $x<0$. Therefore, the value of $x$ is $\boxed{-3}$. [[ ## answer ## ]] -3 [[ ## completed ## ]] User message: [[ ## question ## ]] Given that $-4$ is a solution to $x^2 + bx -36 = 0$, what is the value of $b$? Respond with the corresponding output fields, starting with the field `[[ ## reasoning ## ]]`, then `[[ ## answer ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`. Assistant message: [[ ## reasoning ## ]] The product of the roots of this quadratic is $-36/1=-36$, so the other solution must be $-36/-4=9$. That means that the sum of the solutions is $-4+9=5$. The sum of the solutions is also $-b/1=-b$. Thus, $-b=5$ and $b=\boxed{-5}$. 
[[ ## answer ## ]] -5 [[ ## completed ## ]] User message: [[ ## question ## ]] Richard is building a rectangular playground from 200 feet of fencing. The fencing must entirely enclose the playground. What is the maximum area of this playground? Respond with the corresponding output fields, starting with the field `[[ ## reasoning ## ]]`, then `[[ ## answer ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`. Assistant message: [[ ## reasoning ## ]] Let the length of the playground be $l$ and the width be $w$. We have the equation $2l+2w=200 \Rightarrow l + w = 100$. We want to maximize the area of this rectangular playground, which is given by $lw$. From our equation, we know that $l=100-w$. Substituting this in to our expression for area, we have \[(100-w)(w)=100w-w^2\]We will now complete the square to find the maximum value of this expression. Factoring a $-1$ out, we have \[-(w^2-100w)\]In order for the expression inside the parenthesis to be a perfect square, we need to add and subtract $(100/2)^2=2500$ inside the parenthesis. Doing this, we get \[-(w^2-100w+2500-2500) \Rightarrow -(w-50)^2+2500\]Since the maximum value of $-(w-50)^2$ is 0 (perfect squares are always nonnegative), the maximum value of the entire expression is 2500, which is achieved when $w=50$ and $l=100-w=50$ (the playground is a square). Thus, the maximum area of the playground is $\boxed{2500}$ square feet. [[ ## answer ## ]] 2500 [[ ## completed ## ]] User message: [[ ## question ## ]] Alice and Bob are playing a game. Alice starts first. On Alice's turn, she flips a coin. If she gets a heads, she wins. If not, it becomes Bob's turn. On Bob's turn, he flips a coin. If he gets a tails, he wins. If not, it becomes Alice's turn. What is the probability that Alice wins the game? Respond with the corresponding output fields, starting with the field `[[ ## reasoning ## ]]`, then `[[ ## answer ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`. 
Assistant message: [[ ## reasoning ## ]] Alice has a $1/2$ chance of winning the game on her first turn. If she doesn't, then the probability that she wins the game on her second turn is $1/8,$ since she must not win on her first flip ($1/2$ chance), Bob must not win on his first flip ($1/2$ chance), and then Alice must win on her second flip ($1/2$ chance). The probability that she wins the game on her third turn is $1/32,$ and in general, the probability that she wins the game on her $k^\text{th}$ turn is $(1/2)^{2k-1}.$ Thus, the probability that Alice wins is an infinite geometric series with first term $1/2$ and common ratio $1/4.$ So, the probability that Alice wins the game is $$\frac{\frac12}{1-\frac14} = \boxed{\frac{2}{3}}.$$OR Note that the only difference between the odds of Alice or Bob winning is who goes first. Because Bob goes second, the odds of him winning on his $k^\text{th}$ flip is half of the odds that Alice wins on her $k^\text{th}$ flip, since Alice must first get a tails before Bob gets a chance to win. Thus, if $a$ is Alice's chance of winning, and $b$ is Bob's chance of winning, then $a = 2b.$ Also, since someone must win, $a + b = 1.$ It follows that $a = 2/3$ and $b = 1/3,$ so Alice has a $\boxed{\frac{2}{3}}$ chance of winning the game. [[ ## answer ## ]] \frac{2}{3} [[ ## completed ## ]] User message: [[ ## question ## ]] Find the smallest possible value of $$\frac{(y-x)^2}{(y-z)(z-x)} + \frac{(z-y)^2}{(z-x)(x-y)} + \frac{(x-z)^2}{(x-y)(y-z)},$$ where $x,y,$ and $z$ are distinct real numbers. Respond with the corresponding output fields, starting with the field `[[ ## reasoning ## ]]`, then `[[ ## answer ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`. Response: [[ ## reasoning ## ]] Let \( S = \frac{(y-x)^2}{(y-z)(z-x)} + \frac{(z-y)^2}{(z-x)(x-y)} + \frac{(x-z)^2}{(x-y)(y-z)} \). To find the minimum value of \( S \), we can use the method of symmetry and the Cauchy-Schwarz inequality. 
By applying the Cauchy-Schwarz inequality, we have: \[ \left( \frac{(y-x)^2}{(y-z)(z-x)} + \frac{(z-y)^2}{(z-x)(x-y)} + \frac{(x-z)^2}{(x-y)(y-z)} \right) \left( (y-z)(z-x) + (z-x)(x-y) + (x-y)(y-z) \right) \geq \left( (y-x) + (z-y) + (x-z) \right)^2. \] The right-hand side simplifies to: \[ (y-x) + (z-y) + (x-z) = 0, \] which means the right-hand side is \( 0 \). Therefore, we have: \[ S \cdot \left( (y-z)(z-x) + (z-x)(x-y) + (x-y)(y-z) \right) \geq 0. \] This implies that \( S \) is non-negative. Next, we can test specific values for \( x, y, z \) to find the minimum. Let's set \( x = 0, y = 1, z = 2 \): \[ S = \frac{(1-0)^2}{(1-2)(2-0)} + \frac{(2-1)^2}{(2-0)(0-1)} + \frac{(0-2)^2}{(0-1)(1-2)}. \] Calculating each term: 1. The first term: \[ \frac{1^2}{(-1)(2)} = \frac{1}{-2} = -\frac{1}{2}. \] 2. The second term: \[ \frac{1^2}{(2)(-1)} = \frac{1}{-2} = -\frac{1}{2}. \] 3. The third term: \[ \frac{(-2)^2}{(-1)(-1)} = \frac{4}{1} = 4. \] Thus, \[ S = -\frac{1}{2} - \frac{1}{2} + 4 = 3. \] To check if this is indeed the minimum, we can also consider the case when \( x, y, z \) are the roots of a quadratic polynomial. By symmetry and the nature of the expression, we can conclude that the minimum value of \( S \) is indeed \( 3 \). Therefore, the smallest possible value of \( S \) is \( \boxed{3} \). [[ ## answer ## ]] 3 [[ ## completed ## ]]
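The history above shows that every completion is delimited by [[ ## field ## ]] markers. DSPy handles this parsing internally; the sketch below is only an illustration of how such a format can be split into fields, with parse_fields as a hypothetical helper and a shortened completion string written for this example.

```python
import re

# Matches markers such as "[[ ## answer ## ]]" and captures the field name.
FIELD_MARKER = re.compile(r"\[\[ ## (\w+) ## \]\]")


def parse_fields(text: str) -> dict:
    """Split a completion into {field_name: value} using the field markers."""
    # With one capture group, split() returns
    # [preamble, name1, value1, name2, value2, ...].
    parts = FIELD_MARKER.split(text)
    fields = {}
    for name, value in zip(parts[1::2], parts[2::2]):
        if name != "completed":  # the closing marker carries no value
            fields[name] = value.strip()
    return fields


completion = (
    "[[ ## reasoning ## ]] Solving $4x+2=-10$ gives $x=-3$. "
    "[[ ## answer ## ]] -3 [[ ## completed ## ]]"
)
print(parse_fields(completion))
# {'reasoning': 'Solving $4x+2=-10$ gives $x=-3$.', 'answer': '-3'}
```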