speechbrain.lobes.models.ESPnetVGG module

This lobe replicates the encoder first introduced in ESPnet v1.

Source: https://github.com/espnet/espnet/blob/master/espnet/nets/pytorch_backend/rnn/encoders.py

Authors

Titouan Parcollet 2020

Summary

Classes:

ESPnetVGG

This model is a combination of CNNs and RNNs, following the ESPnet encoder.

Reference

class speechbrain.lobes.models.ESPnetVGG.ESPnetVGG(input_shape, activation=<class 'torch.nn.modules.activation.ReLU'>, dropout=0.15, cnn_channels=[64, 128], rnn_class=<class 'speechbrain.nnet.RNN.LSTM'>, rnn_layers=4, rnn_neurons=512, rnn_bidirectional=True, rnn_re_init=False, projection_neurons=512)

Bases: Sequential

This model is a combination of CNNs and RNNs, following the ESPnet encoder (VGG+RNN+MLP+tanh()).
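
To make the VGG+RNN+MLP+tanh() pipeline concrete, the following is a minimal sketch written in plain PyTorch. It is not the SpeechBrain implementation: the class name `TinyVGGRNNEncoder`, the 3x3 convolutions, the two convolutions per block, and the 2x2 max pooling are assumptions chosen for illustration, and only the overall stage order is taken from the description above.

```python
import torch
import torch.nn as nn


class TinyVGGRNNEncoder(nn.Module):
    """Hypothetical stand-in for a VGG+RNN+MLP+tanh encoder (not SpeechBrain code)."""

    def __init__(self, n_feats, cnn_channels=(64, 128), rnn_neurons=512,
                 rnn_layers=4, projection_neurons=512, dropout=0.15):
        super().__init__()
        blocks = []
        in_ch = 1  # the (time, feats) input is treated as a one-channel image
        for out_ch in cnn_channels:
            blocks += [
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=2),  # assumed: halves time and feature axes
            ]
            in_ch = out_ch
        self.cnn = nn.Sequential(*blocks)
        feats_after_cnn = n_feats // (2 ** len(cnn_channels))
        self.rnn = nn.LSTM(
            input_size=in_ch * feats_after_cnn,
            hidden_size=rnn_neurons,
            num_layers=rnn_layers,
            dropout=dropout,  # dropout is applied between recurrent layers only
            bidirectional=True,
            batch_first=True,
        )
        # Bidirectional output concatenates both directions, hence 2 * rnn_neurons.
        self.proj = nn.Linear(2 * rnn_neurons, projection_neurons)

    def forward(self, x):
        # (batch, time, feats) -> (batch, 1, time, feats)
        x = self.cnn(x.unsqueeze(1))
        # (batch, channels, time', feats') -> (batch, time', channels * feats')
        b, c, t, f = x.shape
        x = x.permute(0, 2, 1, 3).reshape(b, t, c * f)
        x, _ = self.rnn(x)
        return torch.tanh(self.proj(x))


# Small sizes keep the sketch quick to run; they are not the library defaults.
encoder = TinyVGGRNNEncoder(n_feats=60, rnn_neurons=32, rnn_layers=2,
                            projection_neurons=16)
outputs = encoder(torch.rand(10, 40, 60))
print(outputs.shape)
```

Because the final stage is a tanh, every output value lies in [-1, 1], and the time axis is shorter than the input because of the pooling in each CNN block.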

Parameters:

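input_shape (tuple) – The shape of an example expected input.
activation (torch class) – A class used for constructing the activation layers. Applies to the CNN and DNN.
dropout (float) – Neuron dropout rate, applied to the RNN only.
cnn_channels (list of ints) – A list of the number of output channels for each CNN block.
rnn_class (torch class) – The type of RNN to use (LiGRU, LSTM, GRU, RNN).
rnn_layers (int) – The number of recurrent layers to include.
rnn_neurons (int) – The number of neurons in each layer of the RNN.
rnn_bidirectional (bool) – Whether the model processes the input in the forward direction only (False) or in both directions (True).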
rnn_re_init (bool) – If True, an orthogonal initialization is applied to the recurrent weights.
projection_neurons (int) – The number of neurons in the last linear layer.

Example

>>> import torch
>>> from speechbrain.lobes.models.ESPnetVGG import ESPnetVGG
>>> inputs = torch.rand([10, 40, 60])
>>> model = ESPnetVGG(input_shape=inputs.shape)
>>> outputs = model(inputs)
>>> outputs.shape
torch.Size([10, 10, 512])
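
The output shape in the example can be reasoned about without running the model. The sketch below uses a hypothetical helper, `expected_output_shape`, which is not part of SpeechBrain. It assumes that each CNN block halves the time axis with floor rounding and that the last dimension equals `projection_neurons`. That assumption is consistent with the example above (40 frames become 10 after two blocks), but the rounding behavior for odd lengths has not been checked against the library.

```python
def expected_output_shape(input_shape, cnn_channels=(64, 128), projection_neurons=512):
    """Estimate the encoder output shape (hypothetical helper, not a SpeechBrain API)."""
    batch, time, _feats = input_shape
    for _ in cnn_channels:
        time //= 2  # assumed: one 2x pooling step along time per CNN block
    return (batch, time, projection_neurons)


shape = expected_output_shape((10, 40, 60))
print(shape)  # (10, 10, 512)
```

Under the same assumption, adding a third entry to `cnn_channels` would shorten a 40-frame input to 5 frames, which matters when the encoder output is aligned with frame-level targets.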