如何调试模型#

介绍#

调试模型的方法有多种层次。最简单的一种方法就是直接打印出不同变量的取值。

因为 PyMC 使用 PyTensor 表达式来构建模型,而不是函数,所以无法在似然函数中放置 print 语句。相反,您可以使用 pytensor.printing.Print 类来打印中间值。

import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pymc as pm
%matplotlib inline
%config InlineBackend.figure_format = "retina"

RANDOM_SEED = 8927

如何打印PyTensor函数的中间值#

由于 PyTensor 函数被编译为 C,您必须使用 pytensor.printing.Print 类来打印中间值(在下面导入为 Print)。Python 的 print 函数将不起作用。下面是一个使用 Print 的简单示例。更多信息,请参见 调试 PyTensor

import pytensor.tensor as pt

from pytensor import function
from pytensor.printing import Print
x = pt.dvector("x")
y = pt.dvector("y")
func = function([x, y], 1 / (x - y))
func([1, 2, 3], [1, 0, -1])
array([ inf, 0.5 , 0.25])

要查看输出中导致inf值的原因,我们可以使用Print打印\((x-y)\)的中间值。Print类只是将其调用者传递下去,但会在用户定义的消息中打印出其值:

z_with_print = Print("x - y = ")(x - y)
func_with_print = function([x, y], 1 / z_with_print)
func_with_print([1, 2, 3], [1, 0, -1])
x - y =  __str__ = [0. 2. 4.]
array([ inf, 0.5 , 0.25])

Print 揭示了根本原因:当 \(x=1, y=1\) 时,\((x-y)\) 取零值,导致 inf 输出。

如何捕获Print输出以进行进一步分析#

当我们期望从Print获得多行输出时,将输出重定向到字符串缓冲区并在之后访问这些值可能是可取的(感谢Lindley Lentati启发此示例)。以下是使用Python print函数的玩具示例:

import sys

from io import StringIO

old_stdout = sys.stdout
mystdout = sys.stdout = StringIO()

for i in range(5):
    print(f"Test values: {i}")

output = mystdout.getvalue().split("\n")
sys.stdout = old_stdout  # setting sys.stdout back
output
['Test values: 0',
 'Test values: 1',
 'Test values: 2',
 'Test values: 3',
 'Test values: 4',
 '']

排查一个简单的 PyMC 模型#

rng = np.random.default_rng(RANDOM_SEED)
x = rng.normal(size=100)

with pm.Model() as model:
    # priors
    mu = pm.Normal("mu", mu=0, sigma=1)
    sd = pm.Normal("sd", mu=0, sigma=1)

    # setting out printing for mu and sd
    mu_print = Print("mu")(mu)
    sd_print = Print("sd")(sd)

    # likelihood
    obs = pm.Normal("obs", mu=mu_print, sigma=sd_print, observed=x)
with model:
    step = pm.Metropolis()
    trace = pm.sample(5, step, tune=0, chains=1, progressbar=False, random_seed=RANDOM_SEED)
Only 5 samples in chain.
sd __str__ = 0.0
mu __str__ = 0.0
---------------------------------------------------------------------------
SamplingError                             Traceback (most recent call last)
Input In [9], in <cell line: 1>()
      1 with model:
      2     step = pm.Metropolis()
----> 3     trace = pm.sample(5, step, tune=0, chains=1, progressbar=False, random_seed=RANDOM_SEED)

File c:\users\igork\pycharmprojects\pymc\pymc\sampling.py:558, in sample(draws, step, init, n_init, initvals, trace, chain_idx, chains, cores, tune, progressbar, model, random_seed, discard_tuned_samples, compute_convergence_checks, callback, jitter_max_retries, return_inferencedata, idata_kwargs, mp_ctx, **kwargs)
    556 # One final check that shapes and logps at the starting points are okay.
    557 for ip in initial_points:
--> 558     model.check_start_vals(ip)
    559     _check_start_shape(model, ip)
    561 sample_args = {
    562     "draws": draws,
    563     "step": step,
   (...)
    573     "discard_tuned_samples": discard_tuned_samples,
    574 }

File c:\users\igork\pycharmprojects\pymc\pymc\model.py:1794, in Model.check_start_vals(self, start)
   1791 initial_eval = self.point_logps(point=elem)
   1793 if not all(np.isfinite(v) for v in initial_eval.values()):
-> 1794     raise SamplingError(
   1795         "Initial evaluation of model at starting point failed!\n"
   1796         f"Starting values:\n{elem}\n\n"
   1797         f"Initial evaluation results:\n{initial_eval}"
   1798     )

SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'mu': array(0.), 'sd': array(0.)}

Initial evaluation results:
{'mu': -0.92, 'sd': -0.92, 'obs': -inf}

PyMC v4 的异常处理已经改进,因此现在 SamplingError 异常会打印出导致似然值为 -infmusd 的中间值。然而,在更复杂的情况下,使用 aeasara.printing.Print 打印中间值的技术可能非常有价值。

综合运用#

rng = np.random.default_rng(RANDOM_SEED)
y = rng.normal(loc=5, size=20)

old_stdout = sys.stdout
mystdout = sys.stdout = StringIO()

with pm.Model() as model:
    mu = pm.Normal("mu", mu=0, sigma=10)
    a = pm.Normal("a", mu=0, sigma=10, initval=0.1)
    b = pm.Normal("b", mu=0, sigma=10, initval=0.1)
    sd_print = Print("Delta")(a / b)
    obs = pm.Normal("obs", mu=mu, sigma=sd_print, observed=y)

    # limiting number of samples and chains to simplify output
    trace = pm.sample(draws=10, tune=0, chains=1, progressbar=False, random_seed=RANDOM_SEED)

output = mystdout.getvalue()
sys.stdout = old_stdout  # setting sys.stdout back
Only 10 samples in chain.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Sequential sampling (1 chains in 1 job)
NUTS: [mu, a, b]
Sampling 1 chain for 0 tune and 10 draw iterations (0 + 10 draws total) took 1 seconds.
The chain contains only diverging samples. The model is probably misspecified.
The acceptance probability does not match the target. It is 0, but should be close to 0.8. Try to increase the number of tuning steps.
C:\Users\igork\AppData\Local\Temp\ipykernel_14804\1992602661.py:15: UserWarning: The number of samples is too small to check convergence reliably.
  trace = pm.sample(draws=10, tune=0, chains=1, progressbar=False, random_seed=RANDOM_SEED)
output
'Delta __str__ = -85.74093608165128\nDelta __str__ = -9.182002291671038\nDelta __str__ = 0.10737295473067673\nDelta __str__ = 0.10737295473067673\nDelta __str__ = 0.10737295473067673\nDelta __str__ = -9.315734173890055\nDelta __str__ = 0.10737295473067673\nDelta __str__ = -9.312485782438435\nDelta __str__ = 0.10737295473067673\nDelta __str__ = -9.314669656412736\nDelta __str__ = 0.10737295473067673\nDelta __str__ = -9.31581619157038\nDelta __str__ = 0.10737295473067673\nDelta __str__ = -9.315114719133609\nDelta __str__ = 0.10737295473067673\nDelta __str__ = -9.31511040479387\nDelta __str__ = 0.10737295473067673\nDelta __str__ = -9.314077394936474\nDelta __str__ = 0.10737295473067673\nDelta __str__ = -9.313673830463395\nDelta __str__ = 0.10737295473067673\nDelta __str__ = -9.31561025339713\nDelta __str__ = 0.10737295473067673\nDelta __str__ = -9.31526569370057\nDelta __str__ = 0.10737295473067673\nDelta __str__ = 0.10737295473067673\nDelta __str__ = 0.10737295473067673\nDelta __str__ = 0.10737295473067673\nDelta __str__ = 0.10737295473067673\nDelta __str__ = 0.10737295473067673\nDelta __str__ = 0.10737295473067673\nDelta __str__ = 0.10737295473067673\nDelta __str__ = 0.10737295473067673\nDelta __str__ = 0.10737295473067673\n'

原始输出有点混乱,需要一些清理和格式化才能转换为numpy.ndarray。在下面的示例中,使用正则表达式清理输出,然后使用eval进行评估,得到一个浮点数列表。下面的代码也适用于更高维度的输出(以防您想尝试不同的模型)。

import re

# output cleanup and conversion to numpy array
# this is code accepts more complicated inputs
pattern = re.compile("Delta __str__ = ")
output = re.sub(pattern, " ", output)
pattern = re.compile("\\s+")
output = re.sub(pattern, ",", output)
pattern = re.compile(r"\[,")
output = re.sub(pattern, "[", output)
output += "]"
output = "[" + output[1:]
output = eval(output)
output = np.array(output)
output
array([-85.74093608,  -9.18200229,   0.10737295,   0.10737295,
         0.10737295,  -9.31573417,   0.10737295,  -9.31248578,
         0.10737295,  -9.31466966,   0.10737295,  -9.31581619,
         0.10737295,  -9.31511472,   0.10737295,  -9.3151104 ,
         0.10737295,  -9.31407739,   0.10737295,  -9.31367383,
         0.10737295,  -9.31561025,   0.10737295,  -9.31526569,
         0.10737295,   0.10737295,   0.10737295,   0.10737295,
         0.10737295,   0.10737295,   0.10737295,   0.10737295,
         0.10737295,   0.10737295])

请注意,我们请求了5次抽取,但得到了34组\(a/b\)值。原因是对于每次迭代,所有提议的值都会被打印出来(不仅仅是被接受的值)。负值显然是有问题的。

output.shape
(34,)

作者#

  • 由Thomas Wiecki于2016年7月撰写

  • 由Igor Kuvychko于2022年8月更新([pymc#406](https://github.com/pymc-devs/pymc-examples/pull/406))

水印#

%load_ext watermark
%watermark -n -u -v -iv -w -p pytensor,xarray
Last updated: Tue Aug 02 2022

Python implementation: CPython
Python version       : 3.10.5
IPython version      : 8.4.0

pytensor: 2.7.5
xarray: 2022.3.0

matplotlib: 3.5.2
pytensor    : 2.7.5
numpy     : 1.23.0
arviz     : 0.12.1
sys       : 3.10.5 | packaged by conda-forge | (main, Jun 14 2022, 06:57:19) [MSC v.1929 64 bit (AMD64)]
re        : 2.2.1
pandas    : 1.4.3
pymc      : 4.1.2

Watermark: 2.3.1

许可证声明#

本示例库中的所有笔记本均在MIT许可证下提供,该许可证允许修改和重新分发,前提是保留版权和许可证声明。

引用 PyMC 示例#

要引用此笔记本,请使用Zenodo为pymc-examples仓库提供的DOI。

重要

许多笔记本是从其他来源改编的:博客、书籍……在这种情况下,您应该引用原始来源。

同时记得引用代码中使用的相关库。

这是一个BibTeX的引用模板:

@incollection{citekey,
  author    = "<notebook authors, see above>",
  title     = "<notebook title>",
  editor    = "PyMC Team",
  booktitle = "PyMC examples",
  doi       = "10.5281/zenodo.5654871"
}

渲染后可能看起来像: