
ray.rllib.models.distributions.Distribution.rsample

abstract Distribution.rsample(*, sample_shape: Tuple[int, ...] = None, return_logp: bool = False, **kwargs) → numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor | Tuple[numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor, numpy.array | jnp.ndarray | tf.Tensor | torch.Tensor]

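Draw a re-parameterized sample from the action distribution.

If this method is implemented, gradients of the sample with respect to the distribution's parameters can be computed.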

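Parameters:

sample_shape – The shape of the sample to draw.
return_logp – Whether to return the log-probability of the sampled values.
**kwargs – Forward-compatibility placeholder.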

Returns:

The sampled values. If return_logp is True, returns a tuple of the sampled values and their log-probabilities.
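
To illustrate what a re-parameterized sample is, here is a minimal sketch that uses only the Python standard library. `ToyGaussian` and `pathwise_grad_wrt_mean` are hypothetical names invented for this example; this is not RLlib's implementation, and a real subclass would return framework tensors rather than flat Python lists. The idea shown is that the noise is drawn independently of the parameters, so each sample is a differentiable function of `mean` and `std`.

```python
import math
import random
from typing import Tuple


class ToyGaussian:
    """Hypothetical scalar Gaussian for illustration; not an RLlib class."""

    def __init__(self, mean: float, std: float, seed: int = 0):
        self.mean = mean
        self.std = std
        self._rng = random.Random(seed)

    def logp(self, value: float) -> float:
        # Log-density of a normal distribution N(mean, std^2) at `value`.
        z = (value - self.mean) / self.std
        return -0.5 * z * z - math.log(self.std) - 0.5 * math.log(2.0 * math.pi)

    def rsample(self, *, sample_shape: Tuple[int, ...] = (), return_logp: bool = False):
        # Total number of draws; math.prod(()) is 1. The result is kept as a
        # flat list for simplicity (an assumption of this sketch).
        n = math.prod(sample_shape)
        # The noise does not depend on the distribution parameters ...
        eps = [self._rng.gauss(0.0, 1.0) for _ in range(n)]
        # ... so sample = mean + std * eps is differentiable in mean and std.
        samples = [self.mean + self.std * e for e in eps]
        if return_logp:
            return samples, [self.logp(s) for s in samples]
        return samples


def pathwise_grad_wrt_mean(mean: float, std: float, seed: int = 0, h: float = 1e-6) -> float:
    """Finite-difference estimate of d(sample)/d(mean) with the noise held fixed."""
    base = ToyGaussian(mean, std, seed).rsample(sample_shape=(1,))[0]
    shifted = ToyGaussian(mean + h, std, seed).rsample(sample_shape=(1,))[0]
    return (shifted - base) / h
```

Because both distributions in `pathwise_grad_wrt_mean` share a seed, they reuse the same noise, and the estimate comes out close to 1, which is the analytic derivative of `mean + std * eps` with respect to `mean`. In a tensor framework the same derivative would be obtained through automatic differentiation instead of finite differences.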
