{"cells": [{"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["# 分布的维度\n", "PyMC 提供了多种方式来指定其分布的维度。本文档提供了一个概述,并提供了一些用户提示。\n", "\n", "## 术语表\n", "在本文档中,我们将使用“维度”一词来指代维度的概念。以下每个术语在 PyMC 中都有特定的语义和计算定义。虽然我们在这里分享它们,但在下面的示例中查看时,它们会更有意义。\n", "\n", "+ *支持维度* → 分布的核心维度\n", "+ *批处理维度* → 超出分布支持维度的额外维度\n", "+ *隐含维度* → 源自分布参数的值或形状的维度\n", "+ *显式维度* → 由以下参数之一明确定义的维度:\n", " + *形状* → 从分布中抽样的数量\n", " + *维度* → 维度名称的数组\n", "+ *坐标* → 一个字典,将维度名称映射到坐标值\n"]}, {"cell_type": "code", "execution_count": 1, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [], "source": ["from functools import partial\n", "\n", "import pymc as pm\n", "import numpy as np\n", "import pytensor.tensor as pt\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["## 单变量分布示例\n", "我们可以从最简单的情况开始,单个正态分布。我们使用`.dist`来指定一个位于PyMC模型之外的分布。\n"]}, {"cell_type": "code", "execution_count": 2, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [], "source": ["normal_dist = pm.Normal.dist()\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["然后我们可以使用 {func}`~pymc.draw` 函数从该分布中进行随机抽样。\n"]}, {"cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": ["# 仅对绘图函数进行修补以确保可重复性\n", "rng = np.random.default_rng(seed=sum(map(ord, \"dimensionality\")))\n", "draw = partial(pm.draw, random_seed=rng)\n"]}, {"cell_type": "code", "execution_count": 4, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"data": {"text/plain": ["(array(0.80189558), 0)"]}, "execution_count": 4, "metadata": {}, "output_type": "execute_result"}], "source": ["normal_draw = draw(normal_dist)\n", "normal_draw, normal_draw.ndim\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["在这种情况下,我们最终得到一个单一的标量值。这意味着正态分布具有标量支撑维度,因为你可以进行的最小随机抽取是一个维度为零的标量。每个分布的支撑维度作为一个属性被硬编码。\n"]}, {"cell_type": "code", "execution_count": 5, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"data": {"text/plain": ["0"]}, "execution_count": 5, "metadata": {}, "output_type": "execute_result"}], "source": ["normal_dist.owner.op.ndim_supp\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["### 显式批处理维度\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["如果需要进行多次抽取,通常的倾向是创建多个相同变量的副本并将它们堆叠在一起。\n"]}, {"cell_type": "code", "execution_count": 6, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"data": {"text/plain": ["array([ 0.83636296, -0.33327414, 0.9434115 ])"]}, "execution_count": 6, "metadata": {}, "output_type": "execute_result"}], "source": ["normal_dists = pm.math.stack([pm.Normal.dist() for _ in range(3)])\n", "draw(normal_dists)\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["更简单地说,可以通过使用形状参数来创建来自同一分布族的*批量*独立抽样。\n"]}, {"cell_type": "code", "execution_count": 7, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"data": {"text/plain": ["array([ 0.98810294, -0.07003785, -0.37962748])"]}, "execution_count": 7, "metadata": {}, "output_type": "execute_result"}], "source": ["normal_dists = pm.Normal.dist(shape=(3,))\n", "draw(normal_dists)\n"]}, {"cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([[ 7.99932116e-04, -1.94407945e+00, 3.90451962e-01],\n", " [ 1.10657367e+00, 6.49042149e-01, -1.09450185e+00],\n", " [-2.96226305e-01, 1.41884595e+00, -1.31589441e+00],\n", " [ 1.53109449e+00, -7.73771737e-01, 2.37418367e+00]])"]}, "execution_count": 8, "metadata": {}, "output_type": "execute_result"}], "source": ["normal_dists = pm.Normal.dist(shape=(4, 3))\n", "draw(normal_dists)\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["这不仅更加简洁,而且生成的向量化代码效率更高。在PyMC中,我们很少使用堆栈方法,除非需要组合来自不同分布族的抽样。\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["### 隐式批次维度\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["还可以通过传递具有更高维度的参数来创建一批抽样,而无需指定形状。\n"]}, {"cell_type": "code", "execution_count": 9, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"data": {"text/plain": ["array([ 0.81833093, -0.2891973 , 1.2399946 ])"]}, "execution_count": 9, "metadata": {}, "output_type": "execute_result"}], "source": ["normal_dists = pm.Normal.dist(mu=np.array([0, 0, 0]), sigma=np.array([1, 1, 1]))\n", "draw(normal_dists)\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["这与之前的显式形状示例是等效的,我们本可以在这里显式地传递它。由于我们没有这样做,因此我们将这些批次维度称为 *隐式*。\n", "\n", "这在我们希望参数在批次维度上变化时变得非常有用。\n"]}, {"cell_type": "code", "execution_count": 10, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"data": {"text/plain": ["array([ 0.99989975, 10.00009874, 100.00004215])"]}, "execution_count": 10, "metadata": {}, "output_type": "execute_result"}], "source": ["draw(pm.Normal.dist(mu=[1, 10, 100], sigma=0.0001))\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["当参数的形状不同时,它们会被广播,类似于NumPy的工作方式。在这种情况下,`sigma`被广播以匹配`mu`的形状。\n"]}, {"cell_type": "code", "execution_count": 11, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"data": {"text/plain": ["[array([ 1, 10, 100]), array([0.0001, 0.0001, 0.0001])]"]}, "execution_count": 11, "metadata": {}, "output_type": "execute_result"}], "source": ["np.broadcast_arrays([1, 10, 100], 0.0001)\n"]}, {"cell_type": "markdown", "metadata": {}, "source": ["了解 NumPy {ref}`广播 ` 的工作原理是很重要的。当你做一些无效的操作时,容易遇到这种错误:\n"]}, {"cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["shape mismatch: objects cannot be broadcast to a single shape. Mismatch is between arg 0 with shape (3,) and arg 1 with shape (2,).\n", "Apply node that caused the error: normal_rv{0, (0, 0), floatX, True}(RandomGeneratorSharedVariable(), [], 11, [ 1 10 100], [0.1 0.1])\n", "Toposort index: 0\n", "Inputs types: [RandomGeneratorType, TensorType(int64, shape=(0,)), TensorType(int64, shape=()), TensorType(int64, shape=(3,)), TensorType(float64, shape=(2,))]\n", "Inputs shapes: ['No shapes', (0,), (), (3,), (2,)]\n", "Inputs strides: ['No strides', (0,), (), (8,), (8,)]\n", "Inputs values: [Generator(PCG64) at 0x7F6427F8CAC0, array([], dtype=int64), array(11), array([ 1, 10, 100]), array([0.1, 0.1])]\n", "Outputs clients: [['output'], ['output']]\n", "\n", "HINT: Re-running with most PyTensor optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the PyTensor flag 'optimizer=fast_compile'. If that does not work, PyTensor optimizations can be disabled with 'optimizer=None'.\n", "HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.\n"]}], "source": ["try:\n", " # 形状为 (3,) 和 (2,) 的数组无法一起广播\n", " draw(pm.Normal.dist(mu=[1, 10, 100], sigma=[0.1, 0.1]))\n", "except ValueError as error:\n", " print(error)\n"]}, {"cell_type": "markdown", "metadata": {}, "source": ["### 结合隐式和显式批次维度\n"]}, {"cell_type": "markdown", "metadata": {}, "source": ["您可以将显式形状维度与隐式批次维度结合使用。如上所提到的,它们可以提供相同的信息。\n"]}, {"cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([-0.49526775, -0.94608062, 1.66397913])"]}, "execution_count": 13, "metadata": {}, "output_type": "execute_result"}], "source": ["normal_dists = pm.Normal.dist(mu=np.array([0, 1, 2]), sigma=1, shape=(3,))\n", "draw(normal_dists)\n"]}, {"cell_type": "markdown", "metadata": {}, "source": ["但是形状也可以用来扩展超出任何隐式批次维度。\n"]}, {"cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([[ 2.22626513, 2.12938134, 0.49074886],\n", " [ 0.08312601, 1.05049093, 1.91718083],\n", " [-0.68191815, 1.43771096, 1.76780399],\n", " [-0.59883241, 0.26954893, 2.74319335]])"]}, "execution_count": 14, "metadata": {}, "output_type": "execute_result"}], "source": ["normal_dists = pm.Normal.dist(mu=np.array([0, 1, 2]), sigma=1, shape=(4, 3))\n", "draw(normal_dists)\n"]}, {"cell_type": "markdown", "metadata": {}, "source": ["请注意,由于广播规则,显式的批次维度必须始终“放在左侧”,在任何隐式维度之前。因此,在前面的例子中,`shape=(4, 3)` 是有效的,但 `shape=(3, 4)` 是无效的,因为 `mu` 参数可以广播到第一个形状,而不能广播到第二个形状。\n"]}, {"cell_type": "code", "execution_count": 15, "metadata": {"scrolled": true}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["shape mismatch: objects cannot be broadcast to a single shape. Mismatch is between arg 0 with shape (3, 4) and arg 1 with shape (3,).\n", "Apply node that caused the error: normal_rv{0, (0, 0), floatX, True}(RandomGeneratorSharedVariable(), [3 4], 11, [0 1 2], 1.0)\n", "Toposort index: 0\n", "Inputs types: [RandomGeneratorType, TensorType(int64, shape=(2,)), TensorType(int64, shape=()), TensorType(int64, shape=(3,)), TensorType(float64, shape=())]\n", "Inputs shapes: ['No shapes', (2,), (), (3,), ()]\n", "Inputs strides: ['No strides', (8,), (), (8,), ()]\n", "Inputs values: [Generator(PCG64) at 0x7F64280725E0, array([3, 4]), array(11), array([0, 1, 2]), array(1.)]\n", "Outputs clients: [['output'], ['output']]\n", "\n", "HINT: Re-running with most PyTensor optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the PyTensor flag 'optimizer=fast_compile'. If that does not work, PyTensor optimizations can be disabled with 'optimizer=None'.\n", "HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.\n"]}], "source": ["try:\n", " draw(pm.Normal.dist(mu=np.array([0, 1, 2]), sigma=1, shape=(3, 4)))\n", "except ValueError as error:\n", " print(error)\n"]}, {"cell_type": "markdown", "metadata": {}, "source": ["如果您需要正态变量的形状为 `shape=(3, 4)`,可以在定义后对其进行转置。\n"]}, {"cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([[-0.73397401, -0.18717845, -0.78548049, 1.64478883],\n", " [ 3.54543846, 1.22954216, 2.13674063, 1.94194106],\n", " [ 0.85294471, 3.52041332, 2.94428975, 3.25944187]])"]}, "execution_count": 16, "metadata": {}, "output_type": "execute_result"}], "source": ["transposed_normals = pm.Normal.dist(mu=np.array([0, 1, 2]), sigma=1, shape=(4, 3)).T\n", "draw(transposed_normals)\n"]}, {"cell_type": "markdown", "metadata": {}, "source": [":::{tip} 重要的是不要混淆在分布定义中设定的维度与在后续操作(如转置、索引或广播)中设定的维度。在使用PyMC进行采样时(无论是通过前向采样还是MCMC),随机抽样始终会来自于分布形状。请注意,在以下示例中,尽管两个变量具有相同的最终形状,但实际抽取的“随机”样本数量是不同的。\n", ":::\n"]}, {"cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": ["vector_normals = pm.Normal.dist(shape=(3,))\n", "broadcasted_normal = pt.broadcast_to(pm.Normal.dist(), (3,))\n"]}, {"cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [{"data": {"text/plain": ["(array([-0.45755879, 1.59975702, 0.20546749]),\n", " array([0.29866199, 0.29866199, 0.29866199]))"]}, "execution_count": 18, "metadata": {}, "output_type": "execute_result"}], "source": ["draw(vector_normals), draw(broadcasted_normal)\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["## 多元分布示例\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["某些分布根据定义在评估时返回多个值。这可能是一个值的向量,或一个矩阵,或一个任意的多维张量。一个例子是多元正态分布,它总是返回一个向量(具有一个维度的数组)。\n"]}, {"cell_type": "code", "execution_count": 19, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"data": {"text/plain": ["(array([0.55390975, 2.17440418, 1.83014764]), 1)"]}, "execution_count": 19, "metadata": {}, "output_type": "execute_result"}], "source": ["mvnormal_dist = pm.MvNormal.dist(mu=np.ones(3), cov=np.eye(3))\n", "mvnormal_draw = draw(mvnormal_dist)\n", "mvnormal_draw, mvnormal_draw.ndim\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["与任何分布一样,支持的维度被指定为一个固定的属性。\n"]}, {"cell_type": "code", "execution_count": 20, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"data": {"text/plain": ["1"]}, "execution_count": 20, "metadata": {}, "output_type": "execute_result"}], "source": ["mvnormal_dist.owner.op.ndim_supp\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["即使您指定一个单维的MvNormal,您也会得到一个向量!\n"]}, {"cell_type": "code", "execution_count": 21, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"data": {"text/plain": ["(array([-0.68893796]), 1)"]}, "execution_count": 21, "metadata": {}, "output_type": "execute_result"}], "source": ["smallest_mvnormal_dist = pm.MvNormal.dist(mu=[1], cov=[[1]])\n", "smallest_mvnormal_draw = draw(smallest_mvnormal_dist)\n", "smallest_mvnormal_draw, smallest_mvnormal_draw.ndim\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["### 隐式支持维度\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["在我们刚刚看到的MvNormal示例中,支持维度实际上是隐含的。我们并没有具体指定我们想要的抽样向量的维度是3还是1。这是从`mu`和`cov`的形状中推断出来的。因此,我们称其为*隐含支持维度*。我们可以通过使用形状来使其更加明确。\n"]}, {"cell_type": "code", "execution_count": 22, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"data": {"text/plain": ["array([0.57262853, 0.34230354, 1.96818163])"]}, "execution_count": 22, "metadata": {}, "output_type": "execute_result"}], "source": ["explicit_mvnormal = pm.MvNormal.dist(mu=np.ones(3), cov=np.eye(3), shape=(3,))\n", "draw(explicit_mvnormal)\n"]}, {"cell_type": "markdown", "metadata": {}, "source": [":::{warning} 然而,请注意,在撰写时,形状仅仅被忽略用于支持维度。它仅作为一种“类型提示”来标记预期的维度。\n", ":::\n"]}, {"cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([1.0623799 , 0.84622693, 0.34046237])"]}, "execution_count": 23, "metadata": {}, "output_type": "execute_result"}], "source": ["ignored_shape_mvnormal = pm.MvNormal.dist(mu=np.ones(3), cov=np.eye(3), shape=(4,))\n", "draw(ignored_shape_mvnormal)\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["### 显式批次维度\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["与单变量分布一样,我们可以添加显式的批量维度。我们将使用另一个向量分布来说明这一点:多项分布。下面的代码片段定义了一个包含五个独立多项分布的矩阵,每个分布都是大小为3的向量。\n"]}, {"cell_type": "code", "execution_count": 24, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"data": {"text/plain": ["array([[2, 0, 3],\n", " [1, 1, 3],\n", " [0, 2, 3],\n", " [0, 1, 4],\n", " [1, 0, 4]])"]}, "execution_count": 24, "metadata": {}, "output_type": "execute_result"}], "source": ["draw(pm.Multinomial.dist(n=5, p=[0.1, 0.3, 0.6], shape=(5, 3)))\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": [":::{warning} 再次注意,形状对支持的维度没有影响\n", ":::\n"]}, {"cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([[0, 1, 4],\n", " [0, 0, 5],\n", " [3, 1, 1],\n", " [0, 1, 4],\n", " [0, 2, 3]])"]}, "execution_count": 25, "metadata": {}, "output_type": "execute_result"}], "source": ["draw(pm.Multinomial.dist(n=5, p=[0.1, 0.3, 0.6], shape=(5, 4)))\n"]}, {"cell_type": "markdown", "metadata": {}, "source": ["出于同样的原因,您必须始终在支持维度的“左侧”明确定义显式的批量维度。以下内容将不会按预期行为。\n"]}, {"cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([[2, 0, 3],\n", " [1, 3, 1],\n", " [1, 1, 3]])"]}, "execution_count": 26, "metadata": {}, "output_type": "execute_result"}], "source": ["draw(pm.Multinomial.dist(n=5, p=[0.1, 0.3, 0.6], shape=(3, 5)))\n"]}, {"cell_type": "markdown", "metadata": {}, "source": ["如果你需要多项式变量的形状为 `shape=(3, 5)`,可以在定义后进行转置。\n"]}, {"cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([[0, 0, 0, 0, 0],\n", " [2, 2, 1, 0, 3],\n", " [3, 3, 4, 5, 2]])"]}, "execution_count": 27, "metadata": {}, "output_type": "execute_result"}], "source": ["transposed_multinomials = pm.Multinomial.dist(n=5, p=[0.1, 0.3, 0.6], shape=(5, 3)).T\n", "draw(transposed_multinomials)\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["### 隐式批量维度\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["与单变量分布一样,我们可以为每个批处理维度使用不同的参数。\n"]}, {"cell_type": "code", "execution_count": 28, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"data": {"text/plain": ["array([[1, 2, 2],\n", " [0, 3, 7]])"]}, "execution_count": 28, "metadata": {}, "output_type": "execute_result"}], "source": ["multinomial_dist = pm.Multinomial.dist(n=[5, 10], p=[0.1, 0.3, 0.6])\n", "draw(multinomial_dist)\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["哪一个等同于更冗长的内容\n"]}, {"cell_type": "code", "execution_count": 29, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"data": {"text/plain": ["array([[2, 2, 1],\n", " [0, 3, 7]])"]}, "execution_count": 29, "metadata": {}, "output_type": "execute_result"}], "source": ["draw(pm.Multinomial.dist(n=[5, 10], p=[[0.1, 0.3, 0.6], [0.1, 0.3, 0.6]]))\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["如果您熟悉NumPy的广播规则,您可能会好奇PyMC是如何使其工作的。简单的广播在这里是行不通的。\n"]}, {"cell_type": "code", "execution_count": 30, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["shape mismatch: objects cannot be broadcast to a single shape. Mismatch is between arg 0 with shape (2,) and arg 1 with shape (3,).\n"]}], "source": ["try:\n", " np.broadcast_arrays([5, 10], [0.1, 0.3, 0.6])\n", "except ValueError as exc:\n", " print(exc)\n"]}, {"cell_type": "markdown", "metadata": {}, "source": ["为了理解发生了什么,我们需要引入参数核心维度的概念。分布参数的核心维度是定义分布所需的参数最小维数。在多项式分布中,`n` 至少必须是一个标量整数,而 `p` 至少必须是一个表示每个类别结果概率的向量。因此,对于多项式分布,`n` 的核心维度是0,而 `p` 的核心维度是1。\n", "\n", "因此,如果我们有一个包含两个 `n` 的向量,我们实际上应该将 `p` 的向量广播成一个包含两个这样的向量的矩阵,并将每个 `n` 与每个广播的 `p` 的行配对。这的工作方式与 `np.vectorize` 完全相同。\n"]}, {"cell_type": "code", "execution_count": 31, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"name": "stdout", "output_type": "stream", "text": [">> 5 [0.1 0.3 0.6]\n", ">> 10 [0.1 0.3 0.6]\n"]}, {"data": {"text/plain": ["array([[1, 0, 4],\n", " [1, 2, 7]])"]}, "execution_count": 31, "metadata": {}, "output_type": "execute_result"}], "source": ["def core_multinomial(n, p):\n", " print(\">>\", n, p)\n", " return draw(pm.Multinomial.dist(n, p))\n", "\n", "\n", "vectorized_multinomial = np.vectorize(core_multinomial, signature=\"(),(p)->(p)\")\n", "vectorized_multinomial([5, 10], [0.1, 0.3, 0.6])\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["每个分布参数的核心维度也是硬编码为每个分布的一个属性。\n"]}, {"cell_type": "code", "execution_count": 32, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"data": {"text/plain": ["(0, 1)"]}, "execution_count": 32, "metadata": {}, "output_type": "execute_result"}], "source": ["multinomial_dist.owner.op.ndims_params\n"]}, {"cell_type": "markdown", "metadata": {}, "source": ["隐式批处理维度仍然必须遵循广播规则。以下示例无效,因为 `n` 的批处理维度为 `shape=(2,)` ,而 `p` 的批处理维度为 `shape=(3,)` ,这两个维度无法一起广播。\n"]}, {"cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["operands could not be broadcast together with remapped shapes [original->remapped]: (2,) and requested shape (3,)\n", "Apply node that caused the error: multinomial_rv{1, (0, 1), int64, True}(RandomGeneratorSharedVariable(), [], 4, [ 5 10], [[0.1 0.3 ... 0.3 0.6]])\n", "Toposort index: 0\n", "Inputs types: [RandomGeneratorType, TensorType(int64, shape=(0,)), TensorType(int64, shape=()), TensorType(int64, shape=(2,)), TensorType(float64, shape=(3, 3))]\n", "Inputs shapes: ['No shapes', (0,), (), (2,), (3, 3)]\n", "Inputs strides: ['No strides', (0,), (), (8,), (24, 8)]\n", "Inputs values: [Generator(PCG64) at 0x7F6425B8B060, array([], dtype=int64), array(4), array([ 5, 10]), 'not shown']\n", "Outputs clients: [['output'], ['output']]\n", "\n", "HINT: Re-running with most PyTensor optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the PyTensor flag 'optimizer=fast_compile'. If that does not work, PyTensor optimizations can be disabled with 'optimizer=None'.\n", "HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.\n"]}], "source": ["try:\n", " draw(pm.Multinomial.dist(n=[5, 10], p=[[0.1, 0.3, 0.6], [0.1, 0.3, 0.6], [0.1, 0.3, 0.6]]))\n", "except ValueError as error:\n", " print(error)\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["### 结合隐式和显式批量维度\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["您可以并且应该将多维参数的隐式维度与显式形状信息结合,这样更易于推理。\n"]}, {"cell_type": "code", "execution_count": 34, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"data": {"text/plain": ["array([[0, 1, 4],\n", " [4, 1, 5]])"]}, "execution_count": 34, "metadata": {}, "output_type": "execute_result"}], "source": ["draw(pm.Multinomial.dist(n=[5, 10], p=[0.1, 0.3, 0.6], shape=(2, 3)))\n"]}, {"cell_type": "markdown", "metadata": {}, "source": ["显式的批量维度仍然可以超出任何隐式的批量维度。同样,由于广播的工作原理,显式批量维度必须始终\"在左边\"。以下情况是无效的,因为`n`具有形状为`(2,)`的批量维度,这无法广播到形状为`(2, 4)`的显式批量维度。\n"]}, {"cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["operands could not be broadcast together with remapped shapes [original->remapped]: (2,) and requested shape (2,4)\n", "Apply node that caused the error: multinomial_rv{1, (0, 1), int64, True}(RandomGeneratorSharedVariable(), [2 4], 4, [ 5 10], [0.1 0.3 0.6])\n", "Toposort index: 0\n", "Inputs types: [RandomGeneratorType, TensorType(int64, shape=(2,)), TensorType(int64, shape=()), TensorType(int64, shape=(2,)), TensorType(float64, shape=(3,))]\n", "Inputs shapes: ['No shapes', (2,), (), (2,), (3,)]\n", "Inputs strides: ['No strides', (8,), (), (8,), (8,)]\n", "Inputs values: [Generator(PCG64) at 0x7F6425AC8120, array([2, 4]), array(4), array([ 5, 10]), array([0.1, 0.3, 0.6])]\n", "Outputs clients: [['output'], ['output']]\n", "\n", "HINT: Re-running with most PyTensor optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the PyTensor flag 'optimizer=fast_compile'. If that does not work, PyTensor optimizations can be disabled with 'optimizer=None'.\n", "HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.\n"]}], "source": ["try:\n", " draw(pm.Multinomial.dist(n=[5, 10], p=[0.1, 0.3, 0.6], shape=(2, 4, 3)))\n", "except ValueError as error:\n", " print(error)\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["## 使用模型图检查维度\n"]}, {"cell_type": "markdown", "metadata": {}, "source": ["在PyMC模型中,分布通常会被频繁使用,因此存在一些工具可以在该上下文中帮助推理分布的形状。\n"]}, {"cell_type": "code", "execution_count": 36, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"name": "stdout", "output_type": "stream", "text": [" x: shape=(3,)\n", "sigma_log__: shape=()\n", " sigma: shape=()\n", " y: shape=(3,)\n"]}], "source": ["with pm.Model() as pmodel:\n", " mu = pm.Normal(\"x\", mu=0, shape=(3))\n", " sigma = pm.HalfNormal(\"sigma\")\n", " y = pm.Normal(\"y\", mu=mu, sigma=sigma)\n", "\n", "for rv, shape in pmodel.eval_rv_shapes().items():\n", " print(f\"{rv:>11}: shape={shape}\")\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["一个更强大的工具来理解和调试PyMC中的维度是{func}`~pymc.model_to_graphviz`函数。我们可以通过读取Graphviz输出而不是检查数组输出,以理解变量的维度。\n"]}, {"cell_type": "code", "execution_count": 37, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"data": {"image/svg+xml": ["\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "cluster3\n", "\n", "3\n", "\n", "\n", "\n", "y\n", "\n", "y\n", "~\n", "Normal\n", "\n", "\n", "\n", "x\n", "\n", "x\n", "~\n", "Normal\n", "\n", "\n", "\n", "x->y\n", "\n", "\n", "\n", "\n", "\n", "sigma\n", "\n", "sigma\n", "~\n", "HalfNormal\n", "\n", "\n", "\n", "sigma->y\n", "\n", "\n", "\n", "\n", "\n"], "text/plain": [""]}, "execution_count": 37, "metadata": {}, "output_type": "execute_result"}], "source": ["pm.model_to_graphviz(pmodel)\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["在上面的示例中,每个框(或板)左下角的数字表示其中分布的维度。如果某个分布在任何带数字的框之外,它具有标量形状。\n", "\n", "让我们利用这个工具来回顾隐式和显式维度:\n"]}, {"cell_type": "code", "execution_count": 38, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"data": {"image/svg+xml": ["\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "cluster3\n", "\n", "3\n", "\n", "\n", "cluster4\n", "\n", "4\n", "\n", "\n", "\n", "scalar (support)\n", "\n", "scalar (support)\n", "~\n", "Normal\n", "\n", "\n", "\n", "vector (implicit)\n", "\n", "vector (implicit)\n", "~\n", "Normal\n", "\n", "\n", "\n", "vector (explicit)\n", "\n", "vector (explicit)\n", "~\n", "Normal\n", "\n", "\n", "\n"], "text/plain": [""]}, "execution_count": 38, "metadata": {}, "output_type": "execute_result"}], "source": ["with pm.Model() as pmodel:\n", " pm.Normal(\"scalar (support)\")\n", " pm.Normal(\"vector (implicit)\", mu=[1, 2, 3])\n", " pm.Normal(\"vector (explicit)\", shape=(4,))\n", "\n", "pm.model_to_graphviz(pmodel)\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["## 维度\n"]}, {"cell_type": "markdown", "metadata": {}, "source": ["PyMC 支持 `dims` 的概念。对于许多随机变量来说,哪个维度对应于哪个“现实世界”的概念,例如观察数量、处理单元数量等,可能会变得令人困惑。`dims` 参数是一个额外的易读标签,可以传达这种含义。当单独使用时,`dims` 必须与显式的 `shape` 信息结合使用。\n"]}, {"cell_type": "code", "execution_count": 39, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"data": {"image/svg+xml": ["\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "clusteryear (3)\n", "\n", "year (3)\n", "\n", "\n", "\n", "profit\n", "\n", "profit\n", "~\n", "Normal\n", "\n", "\n", "\n"], "text/plain": [""]}, "execution_count": 39, "metadata": {}, "output_type": "execute_result"}], "source": ["with pm.Model() as model:\n", " pm.Normal(\"profit\", shape=(3,), dims=\"year\")\n", "\n", "pm.model_to_graphviz(model)\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["在 `dims` 的使用中,最强大的地方在于模型中指定的 `coords`。这为每个 `dim` 条目提供了一个独特的标签,使其更具意义。\n"]}, {"cell_type": "code", "execution_count": 40, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"data": {"image/svg+xml": ["\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "clusteryear (3)\n", "\n", "year (3)\n", "\n", "\n", "\n", "profit\n", "\n", "profit\n", "~\n", "Normal\n", "\n", "\n", "\n"], "text/plain": [""]}, "execution_count": 40, "metadata": {}, "output_type": "execute_result"}], "source": ["coords = {\"year\": [2020, 2021, 2022]}\n", "with pm.Model(coords=coords) as model:\n", " pm.Normal(\"profit\", dims=\"year\")\n", "\n", "pm.model_to_graphviz(model)\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["在这种情况下,分布的维度实际上可以通过使用的 `dims` 来定义。我们不需要传递形状或定义隐式批处理维度。\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["让我们通过一个多元正态分布的例子来回顾不同维度的特点。\n"]}, {"cell_type": "code", "execution_count": 41, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"data": {"image/svg+xml": ["\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "clustersupport (3)\n", "\n", "support (3)\n", "\n", "\n", "clusterbatch (4) x support (3)\n", "\n", "batch (4) x support (3)\n", "\n", "\n", "\n", "vector\n", "\n", "vector\n", "~\n", "MvNormal\n", "\n", "\n", "\n", "matrix (explicit)\n", "\n", "matrix (explicit)\n", "~\n", "MvNormal\n", "\n", "\n", "\n", "matrix (implicit)\n", "\n", "matrix (implicit)\n", "~\n", "MvNormal\n", "\n", "\n", "\n"], "text/plain": [""]}, "execution_count": 41, "metadata": {}, "output_type": "execute_result"}], "source": ["coords = {\n", " \"batch\": [0, 1, 2, 3],\n", " \"support\": [\"A\", \"B\", \"C\"],\n", "}\n", "with pm.Model(coords=coords) as model:\n", " pm.MvNormal(\"vector\", mu=[0, 0, 0], cov=np.eye(3), dims=(\"support\",))\n", " pm.MvNormal(\"matrix (implicit)\", mu=np.zeros((4, 3)), cov=np.eye(3), dims=(\"batch\", \"support\"))\n", " pm.MvNormal(\n", " \"matrix (explicit)\", mu=[0, 0, 0], cov=np.eye(3), shape=(4, 3), dims=(\"batch\", \"support\")\n", " )\n", "\n", "pm.model_to_graphviz(model)\n"]}, {"cell_type": "markdown", "metadata": {}, "source": [":::{tip} 对于最终模型的发布,我们建议将尺寸和坐标作为标签,因为这些标签将传递给 {class}`arviz.InferenceData`。这既是最佳实践,确保透明性和可读性,方便他人理解。同时,在单开发者的工作流程中,这也是非常有用的,例如,当存在三维或更高维度的分布时,它将有助于指明哪个维度对应于哪个模型概念。\n", ":::\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["## 调试形状问题的提示\n"]}, {"cell_type": "markdown", "metadata": {"pycharm": {"name": "#%% md\n"}}, "source": ["尽管我们提供了所有这些工具以方便用户,并且 PyMC 尽力理解用户的意图,但混合维度工具的结果可能并不总是符合最终的预期维度。有时模型可能在采样时不会指示错误,或者根本不会指示问题。在处理维度,特别是更复杂的维度时,我们建议:\n", "\n", "* 在采样之前使用 `model_to_graphviz` 可视化模型\n", "* 使用 `draw` 或 `sample_prior predictive` 早期捕捉错误\n", "* 检查返回的 `az.InferenceData` 对象,以确保所有数组大小符合预期\n", "* 在追踪错误时使用质数定义形状。\n"]}, {"cell_type": "code", "execution_count": 42, "metadata": {"pycharm": {"name": "#%%\n"}}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["Last updated: Mon Jul 17 2023\n", "\n", "Python implementation: CPython\n", "Python version : 3.10.9\n", "IPython version : 8.11.0\n", "\n", "pytensor: 2.12.3\n", "\n", "pymc : 5.2.0\n", "numpy : None\n", "pytensor: 2.12.3\n", "\n", "Watermark: 2.3.1\n", "\n"]}], "source": ["%load_ext watermark\n", "\n", "%watermark -n -u -v -iv -w -p pytensor\n"]}, {"cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": ["\n"]}], "metadata": {"hide_input": false, "interpreter": {"hash": "f574fac5b7e4a41f7640949d1e1759089329dd116ff7b389caa9cf21f93d872d"}, "kernelspec": {"display_name": "pymc", "language": "python", "name": "pymc"}, "language_info": {"codemirror_mode": {"name": "ipython", "version": 3}, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.9"}, "toc": {"base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": true}, "varInspector": {"cols": {"lenName": 16, "lenType": 16, "lenVar": 40}, "kernels_config": {"python": {"delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())"}, "r": {"delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) "}}, "types_to_exclude": ["module", "function", "builtin_function_or_method", "instance", "_Feature"], "window_display": false}}, "nbformat": 4, "nbformat_minor": 4}