speechbrain.nnet.quaternion_networks.q_RNN module

Library implementing quaternion-valued recurrent neural networks.

Authors

Titouan Parcollet 2020

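Summary

Classes:

`QLSTM`	Implements a quaternion-valued LSTM, first introduced in "Quaternion Recurrent Neural Networks", Parcollet T. et al.
`QLSTM_Layer`	Implements a quaternion-valued LSTM layer.
`QLiGRU`	Implements a quaternion-valued light GRU (liGRU).
`QLiGRU_Layer`	Implements a quaternion-valued light gated recurrent unit (liGRU) layer.
`QRNN`	Implements a vanilla quaternion-valued RNN.
`QRNN_Layer`	Implements a quaternion-valued recurrent layer.

Reference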

class speechbrain.nnet.quaternion_networks.q_RNN.QLSTM(hidden_size, input_shape, num_layers=1, bias=True, dropout=0.0, bidirectional=False, init_criterion='glorot', weight_init='quaternion', autograd=True)[source]

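Bases: Module

This class implements a quaternion-valued LSTM, first introduced in "Quaternion Recurrent Neural Networks", Parcollet T. et al.

Input format is (batch, time, fea) or (batch, time, fea, channel). In the latter case, the last two dimensions are merged: (batch, time, fea * channel).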
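The merging of trailing dimensions described above is plain shape arithmetic. The following is a minimal sketch using only shape tuples; the helper name `merged_input_shape` is hypothetical and is not part of the library.

```python
from math import prod


def merged_input_shape(shape):
    """Return the 3-D shape obtained by merging every dimension after (batch, time).

    Hypothetical helper for illustration only; it works on shape tuples,
    not on tensors.
    """
    if len(shape) < 3:
        raise ValueError("expected at least (batch, time, fea)")
    batch, time = shape[0], shape[1]
    # (batch, time, fea) is left unchanged; (batch, time, fea, channel)
    # becomes (batch, time, fea * channel).
    return (batch, time, prod(shape[2:]))


print(merged_input_shape((10, 16, 40)))     # (10, 16, 40)
print(merged_input_shape((10, 16, 20, 2)))  # (10, 16, 40)
```

On an actual tensor `x`, an equivalent reshape would be `x.reshape(x.shape[0], x.shape[1], -1)`; whether the library performs the merge in exactly this way is an assumption.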

Parameters:

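hidden_size (int) – Number of output neurons (i.e., the dimensionality of the output). The value is expressed in quaternion-valued neurons, so the output has 4*hidden_size real values.
input_shape (tuple) – Expected shape of the input tensor.
num_layers (int, optional) – Number of layers to employ in the RNN architecture (default: 1).
bias (bool, optional) – If True, the additive bias b is adopted (default: True).
dropout (float, optional) – Dropout factor (must be between 0 and 1) (default: 0.0).
bidirectional (bool, optional) – If True, a bidirectional model that scans the sequence both left-to-right and right-to-left is used (default: False).
init_criterion (str, optional) – (glorot, he). Controls the initialization criterion of the weights. It is combined with weight_init to build the initialization method of the quaternion-valued weights (default: "glorot").
weight_init (str, optional) – (quaternion, unitary). Defines the initialization procedure of the quaternion-valued weights. "quaternion" generates random quaternion weights following init_criterion and the quaternion polar form. "unitary" normalizes the weights to lie on the unit circle (default: "quaternion"). More details in: "Quaternion Recurrent Neural Networks", Parcollet T. et al.
autograd (bool, optional) – When True, the default PyTorch autograd is used. When False, a custom backpropagation is used, which reduces memory consumption by a factor of 3 to 4 but is about 2x slower (default: True).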
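To make the "unitary" option above more concrete, here is a minimal sketch of what normalizing a quaternion to unit norm means. It is not the library's initializer: the function names and the uniform sampling range are assumptions chosen for illustration.

```python
import math
import random


def random_quaternion(rng):
    """Draw four real components uniformly from [-1, 1] (illustrative choice)."""
    return tuple(rng.uniform(-1.0, 1.0) for _ in range(4))


def normalize(q):
    """Scale a quaternion (r, i, j, k) so that its norm equals 1."""
    norm = math.sqrt(sum(c * c for c in q))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero quaternion")
    return tuple(c / norm for c in q)


rng = random.Random(0)
q = normalize(random_quaternion(rng))
print(math.sqrt(sum(c * c for c in q)))  # very close to 1.0
```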

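Example

>>> import torch
>>> from speechbrain.nnet.quaternion_networks.q_RNN import QLSTM
>>> inp_tensor = torch.rand([10, 16, 40])
>>> rnn = QLSTM(hidden_size=16, input_shape=inp_tensor.shape)
>>> out_tensor, hh = rnn(inp_tensor)
>>> out_tensor.shape
torch.Size([10, 16, 64])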

forward(x, hx: Tensor | None = None)[source]

Returns the output of the quaternion LSTM.

Parameters:

x (torch.Tensor) – Input tensor.
hx (torch.Tensor, optional) – Initial hidden state (default: None).

Returns:

output (torch.Tensor) – Output of the quaternion LSTM.
hh (torch.Tensor) – Hidden states.

class speechbrain.nnet.quaternion_networks.q_RNN.QLSTM_Layer(input_size, hidden_size, num_layers, batch_size, dropout=0.0, bidirectional=False, init_criterion='glorot', weight_init='quaternion', autograd='true')[source]

Bases: Module

This class implements a quaternion-valued LSTM layer.

Parameters:

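input_size (int) – Feature dimensionality of the input tensors (in terms of real values).
hidden_size (int) – Number of output values (in terms of real values).
num_layers (int) – Number of layers to employ in the RNN architecture.
batch_size (int) – Batch size of the input tensors.
dropout (float, optional) – Dropout factor (must be between 0 and 1) (default: 0.0).
bidirectional (bool, optional) – If True, a bidirectional model that scans the sequence both left-to-right and right-to-left is used (default: False).
init_criterion (str, optional) – (glorot, he). Controls the initialization criterion of the weights. It is combined with weight_init to build the initialization method of the quaternion-valued weights (default: "glorot").
weight_init (str, optional) – (quaternion, unitary). Defines the initialization procedure of the quaternion-valued weights. "quaternion" generates random quaternion weights following init_criterion and the quaternion polar form. "unitary" normalizes the weights to lie on the unit circle (default: "quaternion"). More details in: "Quaternion Recurrent Neural Networks", Parcollet T. et al.
autograd (bool, optional) – When True, the default PyTorch autograd is used. When False, a custom backpropagation is used, which reduces memory consumption by a factor of 3 to 4 but is about 2x slower (default: True).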

forward(x: Tensor, hx: Tensor | None = None) → Tensor[source]

Returns the output of the quaternion LSTM layer.

Parameters:

x (torch.Tensor) – Input tensor.
hx (torch.Tensor, optional) – Initial hidden state (default: None).

Returns:

h – Output of the quaternion LSTM layer.

Return type:

torch.Tensor

class speechbrain.nnet.quaternion_networks.q_RNN.QRNN(hidden_size, input_shape, nonlinearity='tanh', num_layers=1, bias=True, dropout=0.0, bidirectional=False, init_criterion='glorot', weight_init='quaternion', autograd=True)[source]

Bases: Module

This class implements a vanilla quaternion-valued RNN.

Input format is (batch, time, fea) or (batch, time, fea, channel). In the latter case, the last two dimensions are merged: (batch, time, fea * channel).

Parameters:

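hidden_size (int) – Number of output neurons (i.e., the dimensionality of the output). The value is expressed in quaternion-valued neurons, so the output has 4*hidden_size real values.
input_shape (tuple) – Expected shape of the input tensor.
nonlinearity (str, optional) – Type of nonlinearity (tanh, relu) (default: "tanh").
num_layers (int, optional) – Number of layers to employ in the RNN architecture (default: 1).
bias (bool, optional) – If True, the additive bias b is adopted (default: True).
dropout (float, optional) – Dropout factor (must be between 0 and 1) (default: 0.0).
bidirectional (bool, optional) – If True, a bidirectional model that scans the sequence both left-to-right and right-to-left is used (default: False).
init_criterion (str, optional) – (glorot, he). Controls the initialization criterion of the weights. It is combined with weight_init to build the initialization method of the quaternion-valued weights (default: "glorot").
weight_init (str, optional) – (quaternion, unitary). Defines the initialization procedure of the quaternion-valued weights. "quaternion" generates random quaternion weights following init_criterion and the quaternion polar form. "unitary" normalizes the weights to lie on the unit circle (default: "quaternion"). More details in: "Quaternion Recurrent Neural Networks", Parcollet T. et al.
autograd (bool, optional) – When True, the default PyTorch autograd is used. When False, a custom backpropagation is used, which reduces memory consumption by a factor of 3 to 4 but is about 2x slower (default: True).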
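Because hidden_size counts quaternion-valued neurons, the size of the last output dimension follows from simple arithmetic. The sketch below uses a hypothetical helper, `output_feature_size`; the doubling for bidirectional models assumes that the forward and backward outputs are concatenated along the feature axis, which this page does not state.

```python
def output_feature_size(hidden_size, bidirectional=False):
    """Number of real values in the last output dimension (illustrative helper)."""
    real_values_per_quaternion = 4  # one real part plus three imaginary parts
    # Assumption: a bidirectional model concatenates both directions.
    directions = 2 if bidirectional else 1
    return real_values_per_quaternion * hidden_size * directions


print(output_feature_size(16))  # 64
```

The unidirectional value matches the documented example, where hidden_size=16 yields an output of shape (10, 16, 64).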

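Example

>>> import torch
>>> from speechbrain.nnet.quaternion_networks.q_RNN import QRNN
>>> inp_tensor = torch.rand([10, 16, 40])
>>> rnn = QRNN(hidden_size=16, input_shape=inp_tensor.shape)
>>> out_tensor, hh = rnn(inp_tensor)
>>> out_tensor.shape
torch.Size([10, 16, 64])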

forward(x, hx: Tensor | None = None)[source]

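Returns the output of the vanilla quaternion RNN.

Parameters:

x (torch.Tensor) – Input tensor.
hx (torch.Tensor, optional) – Initial hidden state (default: None).

Returns:

output (torch.Tensor) – Output of the quaternion RNN.
hh (torch.Tensor) – Hidden states.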

class speechbrain.nnet.quaternion_networks.q_RNN.QRNN_Layer(input_size, hidden_size, num_layers, batch_size, dropout=0.0, nonlinearity='tanh', bidirectional=False, init_criterion='glorot', weight_init='quaternion', autograd='true')[source]

Bases: Module

This class implements a quaternion-valued recurrent layer.

Parameters:

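input_size (int) – Feature dimensionality of the input tensors (in terms of real values).
hidden_size (int) – Number of output values (in terms of real values).
num_layers (int) – Number of layers to employ in the RNN architecture.
batch_size (int) – Batch size of the input tensors.
dropout (float, optional) – Dropout factor (must be between 0 and 1) (default: 0.0).
nonlinearity (str, optional) – Type of nonlinearity (tanh, relu) (default: "tanh").
bidirectional (bool, optional) – If True, a bidirectional model that scans the sequence both left-to-right and right-to-left is used (default: False).
init_criterion (str, optional) – (glorot, he). Controls the initialization criterion of the weights. It is combined with weight_init to build the initialization method of the quaternion-valued weights (default: "glorot").
weight_init (str, optional) – (quaternion, unitary). Defines the initialization procedure of the quaternion-valued weights. "quaternion" generates random quaternion weights following init_criterion and the quaternion polar form. "unitary" normalizes the weights to lie on the unit circle (default: "quaternion"). More details in: "Quaternion Recurrent Neural Networks", Parcollet T. et al.
autograd (bool, optional) – When True, the default PyTorch autograd is used. When False, a custom backpropagation is used, which reduces memory consumption by a factor of 3 to 4 but is about 2x slower (default: True).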
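As background for what a quaternion-valued recurrent step could look like, the sketch below runs a single quaternion neuron through h_t = tanh(w ⊗ x_t + u ⊗ h_{t-1} + b), where ⊗ is the standard Hamilton product. This is an illustration under stated assumptions and not the library's implementation: the function names are hypothetical, only one neuron is modeled, and applying tanh to each component separately is assumed rather than taken from this page.

```python
import math


def hamilton_product(a, b):
    """Multiply two quaternions given as (r, i, j, k) tuples."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    )


def add(a, b):
    """Component-wise quaternion addition."""
    return tuple(x + y for x, y in zip(a, b))


def recurrent_step(w, u, bias, x_t, h_prev):
    """One step of a single-neuron quaternion RNN (illustrative only)."""
    pre_activation = add(
        add(hamilton_product(w, x_t), hamilton_product(u, h_prev)), bias
    )
    # Assumption: the nonlinearity acts on each real component independently.
    return tuple(math.tanh(c) for c in pre_activation)


# Arbitrary example weights and a two-step input sequence.
w = (0.5, 0.1, -0.2, 0.3)
u = (0.1, 0.0, 0.2, -0.1)
bias = (0.0, 0.0, 0.0, 0.0)
h = (0.0, 0.0, 0.0, 0.0)
for x_t in [(1.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0)]:
    h = recurrent_step(w, u, bias, x_t, h)
print(h)
```

A full layer would apply the same idea to many neurons at once; the sketch keeps one neuron so that the role of the four components stays visible.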

forward(x: Tensor, hx: Tensor | None = None) → Tensor[source]

Returns the output of the quaternion RNN layer.

Parameters:

x (torch.Tensor) – Input tensor.
hx (torch.Tensor, optional) – Initial hidden state (default: None).

Returns:

h – Output of the quaternion RNN layer.

Return type:

torch.Tensor

class speechbrain.nnet.quaternion_networks.q_RNN.QLiGRU(hidden_size, input_shape, nonlinearity='leaky_relu', num_layers=1, bias=True, dropout=0.0, bidirectional=False, init_criterion='glorot', weight_init='quaternion', autograd=True)[source]

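Bases: Module

This class implements a quaternion-valued light GRU (liGRU).

liGRU is a single-gate GRU model based on batch normalization, ReLU activations, and recurrent dropout. For more information, see:

"M. Ravanelli, P. Brakel, M. Omologo, Y. Bengio, Light Gated Recurrent Units for Speech Recognition, in IEEE Transactions on Emerging Topics in Computational Intelligence, 2018" (https://arxiv.org/abs/1803.10225)

To speed it up, it is compiled with the torch just-in-time compiler (jit) right before use.

It accepts input tensors formatted as (batch, time, fea). For 4D inputs such as (batch, time, fea, channel), the tensor is flattened to (batch, time, fea*channel).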
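The single-gate design mentioned above can be sketched with real-valued scalars. The example follows the update equations of the cited liGRU paper (an update gate z_t, a ReLU candidate state, and an interpolation between the previous state and the candidate). Batch normalization, recurrent dropout, and the quaternion algebra are deliberately left out, and all names are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def relu(v):
    return max(0.0, v)


def ligru_step(x_t, h_prev, w_z, u_z, w_h, u_h):
    """One real-valued, scalar liGRU step (no batch norm, no dropout)."""
    z_t = sigmoid(w_z * x_t + u_z * h_prev)        # update gate
    h_candidate = relu(w_h * x_t + u_h * h_prev)   # candidate state
    # Interpolate between the previous state and the candidate.
    return z_t * h_prev + (1.0 - z_t) * h_candidate


h = 0.0
for x_t in [1.0, 0.5, -0.5]:
    h = ligru_step(x_t, h, w_z=0.3, u_z=0.1, w_h=0.8, u_h=0.2)
print(h)
```

Unlike a standard GRU, there is no reset gate here, which is what "single-gate" refers to.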

Parameters:

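hidden_size (int) – Number of output neurons (i.e., the dimensionality of the output). The value is expressed in quaternion-valued neurons, so the output has 4*hidden_size real values.
input_shape (tuple) – Expected shape of the input.
nonlinearity (str) – Type of nonlinearity (tanh, relu); the signature default is 'leaky_relu'.
num_layers (int) – Number of layers to employ in the RNN architecture.
bias (bool) – If True, the additive bias b is adopted.
dropout (float) – Dropout factor (must be between 0 and 1).
bidirectional (bool) – If True, a bidirectional model that scans the sequence both left-to-right and right-to-left is used.
init_criterion (str, optional) – (glorot, he). Controls the initialization criterion of the weights. It is combined with weight_init to build the initialization method of the quaternion-valued weights (default: "glorot").
weight_init (str, optional) – (quaternion, unitary). Defines the initialization procedure of the quaternion-valued weights. "quaternion" generates random quaternion-valued weights following init_criterion and the quaternion polar form. "unitary" normalizes the weights to lie on the unit circle (default: "quaternion"). More details in: "Deep quaternion Networks", Trabelsi C. et al.
autograd (bool, optional) – When True, the default PyTorch autograd is used. When False, a custom backpropagation is used, which reduces memory consumption by a factor of 3 to 4 but is about 2x slower (default: True).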

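Example

>>> import torch
>>> from speechbrain.nnet.quaternion_networks.q_RNN import QLiGRU
>>> inp_tensor = torch.rand([10, 16, 40])
>>> rnn = QLiGRU(input_shape=inp_tensor.shape, hidden_size=16)
>>> out_tensor, hh = rnn(inp_tensor)
>>> out_tensor.shape
torch.Size([10, 16, 64])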

forward(x, hx: Tensor | None = None)[source]

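Returns the output of the quaternion liGRU.

Parameters:

x (torch.Tensor) – Input tensor.
hx (torch.Tensor, optional) – Initial hidden state (default: None).

Returns:

output (torch.Tensor) – Output of the quaternion liGRU.
hh (torch.Tensor) – Hidden states.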

class speechbrain.nnet.quaternion_networks.q_RNN.QLiGRU_Layer(input_size, hidden_size, num_layers, batch_size, dropout=0.0, nonlinearity='leaky_relu', normalization='batchnorm', bidirectional=False, init_criterion='glorot', weight_init='quaternion', autograd=True)[source]

Bases: Module

This class implements a quaternion-valued light gated recurrent unit (liGRU) layer.

Parameters:

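input_size (int) – Feature dimensionality of the input tensors.
hidden_size (int) – Number of output values.
num_layers (int) – Number of layers to employ in the RNN architecture.
batch_size (int) – Batch size of the input tensors.
dropout (float) – Dropout factor (must be between 0 and 1).
nonlinearity (str) – Type of nonlinearity (tanh, relu); the signature default is 'leaky_relu'.
normalization (str) – Type of normalization to use (batchnorm or none).
bidirectional (bool) – If True, a bidirectional model that scans the sequence both left-to-right and right-to-left is used.
init_criterion (str, optional) – (glorot, he). Controls the initialization criterion of the weights. It is combined with weight_init to build the initialization method of the quaternion-valued weights (default: "glorot").
weight_init (str, optional) – (quaternion, unitary). Defines the initialization procedure of the quaternion-valued weights. "quaternion" generates random quaternion weights following init_criterion and the quaternion polar form. "unitary" normalizes the weights to lie on the unit circle (default: "quaternion"). More details in: "Deep quaternion Networks", Trabelsi C. et al.
autograd (bool, optional) – When True, the default PyTorch autograd is used. When False, a custom backpropagation is used, which reduces memory consumption by a factor of 3 to 4 but is about 2x slower (default: True).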

forward(x: Tensor, hx: Tensor | None = None) → Tensor[source]

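Returns the output of the quaternion liGRU layer.

Parameters:

x (torch.Tensor) – Input tensor.
hx (torch.Tensor, optional) – Initial hidden state (default: None).

Returns:

h – Output of the quaternion liGRU layer.

Return type:

torch.Tensor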