functorch.jacrev

functorch.jacrev(func, argnums=0, *, has_aux=False, chunk_size=None, _preallocate_and_copy=False)

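Computes the Jacobian of func with respect to the argument(s) at index argnums, using reverse-mode automatic differentiation.

Note

Using chunk_size=1 is equivalent to computing the Jacobian row by row with a for-loop, so the constraints of vmap() do not apply.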

Parameters

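func (function) – A Python function that takes one or more arguments, at least one of which must be a Tensor, and returns one or more Tensors.
argnums (int or Tuple[int]) – Optional. An integer or tuple of integers specifying which argument(s) to compute the Jacobian with respect to. Default: 0.
has_aux (bool) – Flag indicating that func returns an (output, aux) tuple, where the first element is the output of the function to be differentiated and the second element is an auxiliary object that will not be differentiated. Default: False.
chunk_size (None or int) – If None (default), use the maximum chunk size (equivalent to doing a single vmap over vjp to compute the Jacobian). If 1, compute the Jacobian row by row with a for-loop. If any other int, compute the Jacobian chunk_size rows at a time (equivalent to doing multiple vmaps over vjp). If you run into memory issues when computing the Jacobian, try specifying a non-None chunk_size.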

Returns

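Returns a function that takes the same inputs as func and returns the Jacobian of func with respect to the argument(s) at argnums. If has_aux is True, the returned function instead returns a (jacobian, aux) tuple, where jacobian is the Jacobian and aux is the auxiliary object returned by func.

Basic usage with a pointwise unary operation gives a diagonal matrix as the Jacobian: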

>>> import torch
>>> from torch.func import jacrev
>>> x = torch.randn(5)
>>> jacobian = jacrev(torch.sin)(x)
>>> expected = torch.diag(torch.cos(x))
>>> assert torch.allclose(jacobian, expected)
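
For a function that is not elementwise, the Jacobian is generally dense and need not be square. The following is a minimal sketch, assuming a hypothetical linear map `linear` with a fixed weight `W` (both names are illustrative and not part of the original examples). The Jacobian of `W @ x` with respect to `x` should be `W` itself, with shape (output size, input size).

```python
import torch
from torch.func import jacrev

torch.manual_seed(0)
W = torch.randn(5, 3)  # hypothetical fixed weight, for illustration only

def linear(x):
    # Maps a length-3 vector to a length-5 vector.
    return W @ x

x = torch.randn(3)
jacobian = jacrev(linear)(x)

# Rows index outputs, columns index inputs.
assert jacobian.shape == (5, 3)
assert torch.allclose(jacobian, W)
```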

If you want to compute the output of the function as well as its Jacobian, use the has_aux flag to return the output as an auxiliary object:

>>> import torch
>>> from torch.func import jacrev
>>> x = torch.randn(5)
>>>
>>> def f(x):
...     return x.sin()
...
>>> def g(x):
...     result = f(x)
...     return result, result
...
>>> jacobian_f, f_x = jacrev(g, has_aux=True)(x)
>>> assert torch.allclose(f_x, f(x))

jacrev() can be composed with vmap to produce batched Jacobians:

>>> import torch
>>> from torch.func import jacrev, vmap
>>> x = torch.randn(64, 5)
>>> jacobian = vmap(jacrev(torch.sin))(x)
>>> assert jacobian.shape == (64, 5, 5)
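
To make the batching semantics concrete, the sketch below compares the vmap composition against an explicit Python loop over the batch dimension. It assumes only that each row of `x` is treated as an independent sample; the names `batched` and `looped` are illustrative.

```python
import torch
from torch.func import jacrev, vmap

x = torch.randn(4, 5)

# One Jacobian per sample, computed in a single batched call.
batched = vmap(jacrev(torch.sin))(x)

# The same result, computed one sample at a time and stacked.
looped = torch.stack([jacrev(torch.sin)(row) for row in x])

assert batched.shape == (4, 5, 5)
assert torch.allclose(batched, looped)
```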

Additionally, jacrev() can be composed with itself to produce Hessians:

>>> import torch
>>> from torch.func import jacrev
>>> def f(x):
...     return x.sin().sum()
...
>>> x = torch.randn(5)
>>> hessian = jacrev(jacrev(f))(x)
>>> assert torch.allclose(hessian, torch.diag(-x.sin()))

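By default, jacrev() computes the Jacobian with respect to the first input. However, it can compute the Jacobian with respect to a different argument by using argnums: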

>>> import torch
>>> from torch.func import jacrev
>>> def f(x, y):
...     return x + y ** 2
...
>>> x, y = torch.randn(5), torch.randn(5)
>>> jacobian = jacrev(f, argnums=1)(x, y)
>>> expected = torch.diag(2 * y)
>>> assert torch.allclose(jacobian, expected)

Additionally, passing a tuple to argnums computes the Jacobian with respect to multiple arguments:

>>> import torch
>>> from torch.func import jacrev
>>> def f(x, y):
...     return x + y ** 2
...
>>> x, y = torch.randn(5), torch.randn(5)
>>> jacobian = jacrev(f, argnums=(0, 1))(x, y)
>>> expected_x = torch.diag(torch.ones_like(x))
>>> expected_y = torch.diag(2 * y)
>>> assert torch.allclose(jacobian[0], expected_x)
>>> assert torch.allclose(jacobian[1], expected_y)
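
The chunk_size parameter described under Parameters changes how the Jacobian is assembled, not its value. The sketch below is a hedged illustration, assuming a PyTorch version whose jacrev accepts chunk_size as in the signature above. The function `f` here is a made-up example with a dense Jacobian.

```python
import torch
from torch.func import jacrev

def f(x):
    # Illustrative function whose Jacobian is dense, not diagonal.
    return x.sin() * x.sum()

x = torch.randn(6)

full = jacrev(f)(x)                    # chunk_size=None: a single vmap over vjp
chunked = jacrev(f, chunk_size=2)(x)   # two rows of the Jacobian at a time
rowwise = jacrev(f, chunk_size=1)(x)   # one row at a time, via a for-loop

assert full.shape == (6, 6)
assert torch.allclose(full, chunked)
assert torch.allclose(full, rowwise)
```

Smaller chunks lower peak memory at the cost of more sequential work, which is why a non-None chunk_size is the suggested remedy for memory issues.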

Note

Using PyTorch torch.no_grad together with jacrev. Case 1: Using torch.no_grad inside a function:

>>> def f(x):
...     with torch.no_grad():
...         c = x ** 2
...     return x - c

In this case, jacrev(f)(x) respects the inner torch.no_grad, so c is treated as a constant during differentiation.

Case 2: Using jacrev inside a torch.no_grad context manager:

>>> with torch.no_grad():
...     jacrev(f)(x)

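In this case, jacrev respects the inner torch.no_grad but not the outer one. This is because jacrev is a "function transform": its result should not depend on the result of a context manager outside of f.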
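
The two cases can be checked numerically. This is a sketch under the behavior described in this note, not a guarantee for every PyTorch version. The function `g` is a hypothetical variant of `f` without the inner torch.no_grad, added only for comparison.

```python
import torch
from torch.func import jacrev

def f(x):
    with torch.no_grad():
        c = x ** 2      # computed under no_grad, so not differentiated
    return x - c

def g(x):
    return x - x ** 2   # same formula, but without the inner no_grad

x = torch.randn(5)

# Case 1: the inner no_grad is respected, so only the `x` term contributes.
jac_f = jacrev(f)(x)
assert torch.allclose(jac_f, torch.eye(5))

# Case 2: the outer no_grad is ignored, so the full derivative is computed.
with torch.no_grad():
    jac_g = jacrev(g)(x)
assert torch.allclose(jac_g, torch.diag(1 - 2 * x))
```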

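Warning

We've integrated functorch into PyTorch. As the final step of the integration, functorch.jacrev is deprecated as of PyTorch 2.0 and will be deleted in a future version of PyTorch >= 2.3. Please use torch.func.jacrev instead; see the PyTorch 2.0 release notes and/or the torch.func migration guide for more details: https://pytorch.org/docs/master/func.migrating.html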