paddlespeech.s2t.frontend.augmentor.augmentation module

Contains the data augmentation pipeline.

class paddlespeech.s2t.frontend.augmentor.augmentation.AugmentationPipeline(preprocess_conf: str, random_seed: int = 0)[source]

Bases: object

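Builds a preprocessing pipeline composed of various augmentation models. Such a data augmentation pipeline is typically used to augment the training samples so that the model becomes invariant to certain types of perturbation found in the real world, which improves its generalization ability.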

The pipeline is built from an augmentation configuration given as a JSON string, for example:

[ {
        "type": "noise",
        "params": {"min_snr_dB": 10,
                   "max_snr_dB": 20,
                   "noise_manifest_path": "datasets/manifest.noise"},
        "prob": 0.0
    },
    {
        "type": "speed",
        "params": {"min_speed_rate": 0.9,
                   "max_speed_rate": 1.1},
        "prob": 1.0
    },
    {
        "type": "shift",
        "params": {"min_shift_ms": -5,
                   "max_shift_ms": 5},
        "prob": 1.0
    },
    {
        "type": "volume",
        "params": {"min_gain_dBFS": -10,
                   "max_gain_dBFS": 10},
        "prob": 0.0
    },
    {
        "type": "bayesian_normal",
        "params": {"target_db": -20,
                   "prior_db": -20,
                   "prior_samples": 100},
        "prob": 0.0
    }
]

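This configuration lists five augmentors, but only the two with a non-zero "prob", "speed" and "shift", take effect; "noise", "volume", and "bayesian_normal" have "prob" set to 0.0 and are therefore disabled. "prob" is the probability that the corresponding augmentor is applied. If "prob" is zero, the augmentor has no effect.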
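The role of "prob" can be sketched in a few lines of standard-library Python. This is an illustration only: `select_augmentors` is a hypothetical helper, not a PaddleSpeech function, and it assumes that each augmentor is gated by one uniform random draw per sample. The configuration is abbreviated, with "params" left empty.

```python
import json
import random

# Abbreviated version of the configuration above ("params" omitted for brevity).
CONF = """
[
    {"type": "noise",  "params": {}, "prob": 0.0},
    {"type": "speed",  "params": {}, "prob": 1.0},
    {"type": "shift",  "params": {}, "prob": 1.0},
    {"type": "volume", "params": {}, "prob": 0.0}
]
"""


def select_augmentors(conf_json, rng):
    """Return the types of the augmentors that fire for one sample.

    Hypothetical helper for illustration only; not part of PaddleSpeech.
    """
    try:
        entries = json.loads(conf_json)
    except json.JSONDecodeError as err:
        # Surface malformed JSON as a plain ValueError with a clearer message.
        raise ValueError(f"malformed augmentation config: {err}") from err

    fired = []
    for entry in entries:
        # Draw a number in [0, 1); the augmentor fires when the draw is below "prob".
        # prob == 0.0 can never fire, and prob == 1.0 always fires.
        if rng.random() < entry["prob"]:
            fired.append(entry["type"])
    return fired


rng = random.Random(0)  # seeded generator, analogous in spirit to random_seed
print(select_augmentors(CONF, rng))  # ['speed', 'shift']
```

With values strictly between 0 and 1, the set of augmentors that fire varies from sample to sample, which is why a fixed seed is useful for reproducible runs.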

Params:: preprocess_conf(str): Augmentation configuration, given as a JSON file or a JSON string.
random_seed(int): Random seed.
Raises:: ValueError: If the augmentation JSON config is malformed.

Methods

`__call__`(xs[, uttid_list])	Call self as a function.
`transform_audio`(audio_segment)	Run the preprocessing pipeline for data augmentation.
`transform_feature`(spec_segment)	Spectrogram augmentation.

SPEC_TYPES = {'specaug'}

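transform_audio(audio_segment)[source]

Run the preprocessing pipeline for data augmentation.

Note that this is an in-place transformation.

Parameters:: audio_segment (AudioSegment|SpeechSegment) -- Audio segment to process.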
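What "in-place" means for the caller can be shown with a toy example. `ToySegment` and `toy_transform_audio` are hypothetical stand-ins written for this sketch; they do not reproduce the PaddleSpeech classes or their behavior beyond the fact that the input object itself is modified.

```python
class ToySegment:
    """Hypothetical stand-in for an audio segment; not the PaddleSpeech class."""

    def __init__(self, samples):
        self.samples = list(samples)


def toy_transform_audio(segment, gain=0.5):
    """Scale the samples of `segment` in place; nothing is returned."""
    for i, value in enumerate(segment.samples):
        segment.samples[i] = value * gain


segment = ToySegment([0.2, -0.4, 0.8])
backup = ToySegment(segment.samples)  # copy first if the original is still needed
result = toy_transform_audio(segment)

print(result)           # None
print(segment.samples)  # [0.1, -0.2, 0.4]
print(backup.samples)   # [0.2, -0.4, 0.8]
```

The practical consequence is that a caller who needs the unmodified audio afterwards should copy the segment before passing it to the pipeline.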

transform_feature(spec_segment)[source]

Spectrogram augmentation.

Args:: spec_segment (np.ndarray): Audio feature, (D, T).
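To make the (D, T) layout concrete, here is a rough sketch of frequency and time masking, the two masking steps commonly associated with SpecAugment. It is not the PaddleSpeech implementation: `mask_feature` and its parameters are hypothetical, nested lists stand in for the np.ndarray, and the result is returned as a copy for clarity.

```python
import random


def mask_feature(spec, max_freq_width, max_time_width, rng):
    """Zero one frequency band and one time span of a (D, T) feature matrix.

    Hypothetical helper; returns a new nested list and leaves `spec` untouched.
    """
    num_bins, num_frames = len(spec), len(spec[0])
    out = [row[:] for row in spec]

    # Frequency mask: zero `f_width` consecutive rows (feature dimension D).
    f_width = rng.randint(0, max_freq_width)
    f_start = rng.randint(0, num_bins - f_width)
    for d in range(f_start, f_start + f_width):
        out[d] = [0.0] * num_frames

    # Time mask: zero `t_width` consecutive columns (time dimension T).
    t_width = rng.randint(0, max_time_width)
    t_start = rng.randint(0, num_frames - t_width)
    for row in out:
        for t in range(t_start, t_start + t_width):
            row[t] = 0.0
    return out


spec = [[1.0] * 6 for _ in range(4)]  # D = 4 feature bins, T = 6 frames
masked = mask_feature(spec, max_freq_width=2, max_time_width=3, rng=random.Random(0))
print(len(masked), len(masked[0]))  # 4 6
```

Rows index the feature dimension D and columns index time T, so a frequency mask blanks whole rows while a time mask blanks whole columns; the shape of the feature matrix is unchanged.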