paddleaudio.backends.soundfile_backend module

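paddleaudio.backends.soundfile_backend.info(filepath: str, format: Optional[str] = None) → AudioInfo

Get signal information of an audio file.

Note:

The filepath argument is intentionally annotated as str only, even though it also accepts pathlib.Path objects. This is for consistency with the "sox_io" backend.

Args:

filepath (path-like object or file-like object): Source of audio data.
format (str or None, optional): Not used. PySoundFile does not accept format hints.

Returns:

AudioInfo: Metadata of the given audio.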
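
As a rough illustration of the kind of metadata involved, the sketch below reads header fields from a small in-memory WAV using Python's standard-library wave module. It does not call paddleaudio. The helper name wav_metadata and the dictionary keys are assumptions chosen for illustration, and the actual fields of AudioInfo may differ.

```python
import io
import wave


def wav_metadata(fileobj):
    """Read basic header metadata from a WAV path or file-like object."""
    with wave.open(fileobj, "rb") as wf:
        return {
            "sample_rate": wf.getframerate(),
            "num_frames": wf.getnframes(),
            "num_channels": wf.getnchannels(),
            "bits_per_sample": wf.getsampwidth() * 8,
        }


# Build a tiny in-memory WAV: 100 frames of 16-bit mono silence at 8 kHz.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)  # 2 bytes per sample = 16 bits
    wf.setframerate(8000)
    wf.writeframes(b"\x00\x00" * 100)

buffer.seek(0)
meta = wav_metadata(buffer)
```

The same helper accepts a file path, which loosely mirrors how info takes either a path-like or a file-like source.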

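paddleaudio.backends.soundfile_backend.load(filepath: str, frame_offset: int = 0, num_frames: int = -1, normalize: bool = True, channels_first: bool = True, format: Optional[str] = None) → Tuple[Tensor, int]

Load audio data from file.

Note:

The formats this function can handle depend on the soundfile installation. This function has been tested on the following formats:

WAV
- 32-bit floating-point
- 32-bit signed integer
- 16-bit signed integer
- 8-bit unsigned integer
FLAC
OGG/VORBIS
SPHERE

By default (normalize=True, channels_first=True), this function returns a Tensor with float32 dtype and shape [channel, time]. The samples are normalized to fit in the range [-1.0, 1.0].

When the input format is WAV with an integer type, such as 32-bit signed integer, 16-bit signed integer, or 8-bit unsigned integer (24-bit signed integer is not supported), passing normalize=False makes this function return an integer Tensor whose samples span the full range of the corresponding dtype: an int32 Tensor for 32-bit signed PCM, int16 for 16-bit signed PCM, and uint8 for 8-bit unsigned PCM.

The normalize parameter has no effect on 32-bit floating-point WAV or on other formats such as flac and mp3. For these formats, this function always returns a float32 Tensor with values normalized to [-1.0, 1.0].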
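
To make the normalize behaviour concrete, here is a minimal pure-Python sketch of how integer PCM samples are commonly mapped into the [-1.0, 1.0] range. The divisors follow the usual full-scale convention and are an assumption, not taken from the paddleaudio source; pcm_to_float is a hypothetical helper.

```python
# Full-scale divisors for signed integer PCM (common convention, assumed here).
_SCALE = {"int16": 32768.0, "int32": 2147483648.0}


def pcm_to_float(samples, dtype):
    """Map integer PCM samples to floats in the range [-1.0, 1.0)."""
    if dtype == "uint8":
        # 8-bit WAV is unsigned, with its zero level at 128.
        return [(s - 128) / 128.0 for s in samples]
    return [s / _SCALE[dtype] for s in samples]


as_float = pcm_to_float([0, 16384, -32768], "int16")
uint8_as_float = pcm_to_float([128, 0, 192], "uint8")
```

With normalize=False the integer values on the left-hand side would be returned as they are, which is why the resulting dtype depends on the file's bit depth.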

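Note:

The filepath argument is intentionally annotated as str only, even though it also accepts pathlib.Path objects. This is for consistency with the "sox_io" backend.

Args:

filepath (path-like object or file-like object): Source of audio data.
frame_offset (int, optional): Number of frames to skip before starting to read data.
num_frames (int, optional): Maximum number of frames to read. -1 reads all remaining samples, starting from frame_offset. This function may return fewer frames if the given file does not contain enough frames.
normalize (bool, optional): When True, this function always returns float32, and sample values are normalized to [-1.0, 1.0]. If the input file is integer WAV, passing False changes the resulting Tensor type to an integer type. This argument has no effect for formats other than integer WAV.
channels_first (bool, optional): When True, the returned Tensor has dimensions [channel, time]. Otherwise, it has dimensions [time, channel].
format (str or None, optional): Not used. PySoundFile does not accept format hints.

Returns:

(paddle.Tensor, int): Resulting Tensor and sample rate. If the input file is integer WAV and normalization is off, the Tensor has an integer type; otherwise it is float32. If channels_first=True, it has shape [channel, time]; otherwise [time, channel].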
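
The interaction of frame_offset, num_frames and channels_first can be sketched with plain nested lists. This is only an illustration of the semantics described in the argument list, not the backend's implementation; select_frames is a hypothetical helper.

```python
def select_frames(frames, frame_offset=0, num_frames=-1, channels_first=True):
    """Select a window from audio laid out as [time][channel] nested lists."""
    end = None if num_frames == -1 else frame_offset + num_frames
    # Slicing past the end simply yields fewer frames than requested.
    window = frames[frame_offset:end]
    if channels_first:
        # Transpose [time][channel] into [channel][time].
        return [list(channel) for channel in zip(*window)]
    return window


stereo = [[0, 10], [1, 11], [2, 12], [3, 13]]  # 4 frames, 2 channels
clip = select_frames(stereo, frame_offset=1, num_frames=2)
tail = select_frames(stereo, frame_offset=3, num_frames=5, channels_first=False)
```

Here clip holds two frames per channel, while tail contains a single frame because only one frame remains after the offset.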

paddleaudio.backends.soundfile_backend.normalize(y: ndarray, norm_type: str = 'linear', mul_factor: float = 1.0) → ndarray

Normalize an input audio array, with an additional multiplier.

Args:

y (np.ndarray): Input waveform array, 1D or 2D.
norm_type (str, optional): Type of normalization. Defaults to 'linear'.
mul_factor (float, optional): Scaling factor. Defaults to 1.0.

Returns:

np.ndarray: y after normalization.
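
One plausible reading of 'linear' normalization is peak scaling: divide by the largest absolute sample, then multiply by mul_factor. The sketch below assumes that interpretation and works on a plain 1D list; peak_normalize is a hypothetical helper, and the library's exact formula may differ (for example by adding a small epsilon).

```python
def peak_normalize(y, mul_factor=1.0):
    """Scale a 1D sequence so its largest absolute value equals mul_factor."""
    peak = max((abs(v) for v in y), default=0.0)
    if peak == 0.0:
        # Silent or empty input: nothing to scale.
        return list(y)
    return [v / peak * mul_factor for v in y]


scaled = peak_normalize([0.5, -0.25, 0.0])
half = peak_normalize([0.5, -0.25, 0.0], mul_factor=0.5)
```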

paddleaudio.backends.soundfile_backend.resample(y: ndarray, src_sr: int, target_sr: int, mode: str = 'kaiser_fast') → ndarray

Resample audio.

Args:

y (np.ndarray): Input waveform array, 1D or 2D.
src_sr (int): Source sample rate.
target_sr (int): Target sample rate.
mode (str, optional): The resampling filter to use. Defaults to 'kaiser_fast'.

Returns:

np.ndarray: y resampled to target_sr.
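
To show what changing the sample rate does to an array, the sketch below resamples a 1D list by linear interpolation. This is deliberately a simpler technique than the 'kaiser_fast' filter named above and performs no anti-aliasing, so it is only a conceptual stand-in; resample_linear and its output-length rule are assumptions.

```python
def resample_linear(y, src_sr, target_sr):
    """Resample a 1D sequence by linear interpolation (not a Kaiser filter)."""
    if not y:
        return []
    n_out = int(len(y) * target_sr / src_sr)
    out = []
    for i in range(n_out):
        pos = i * src_sr / target_sr       # position measured in source samples
        lo = min(int(pos), len(y) - 1)
        hi = min(lo + 1, len(y) - 1)       # clamp at the last sample
        frac = pos - lo
        out.append(y[lo] * (1.0 - frac) + y[hi] * frac)
    return out


upsampled = resample_linear([0.0, 1.0, 2.0, 3.0], src_sr=4, target_sr=8)
```

Doubling the rate doubles the number of samples; the final value repeats the last input sample because there is nothing further to interpolate towards.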

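paddleaudio.backends.soundfile_backend.save(filepath: str, src: Tensor, sample_rate: int, channels_first: bool = True, compression: Optional[float] = None, format: Optional[str] = None, encoding: Optional[str] = None, bits_per_sample: Optional[int] = None)

Save audio data to file.

Note:

The formats this function can handle depend on the soundfile installation. This function has been tested on the following formats:

WAV
- 32-bit floating-point
- 32-bit signed integer
- 16-bit signed integer
- 8-bit unsigned integer
FLAC
OGG/VORBIS
SPHERE

Note:

The filepath argument is intentionally annotated as str only, even though it also accepts pathlib.Path objects. This is for consistency with the "sox_io" backend.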

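Args:

filepath (str or pathlib.Path): Path to the audio file.
src (paddle.Tensor): Audio data to save. Must be a 2D Tensor.
sample_rate (int): Sample rate.
channels_first (bool, optional): If True, the given Tensor is interpreted as [channel, time]; otherwise as [time, channel].

compression (float or None, optional): Not used.

It is here only for interface compatibility with the "sox_io" backend.

format (str or None, optional): Override the audio format.

When the filepath argument is a path-like object, the audio format is inferred from the file extension. If the file extension is missing or different, you can specify the correct format with this argument.

When the filepath argument is a file-like object, this argument is required.

Valid values are "wav", "ogg", "vorbis", "flac" and "sph".

encoding (str or None, optional): Changes the encoding for supported formats.

This argument is effective only for supported formats, such as "wav", "flac" and "sph". Valid values are:

"PCM_S" (signed integer linear PCM)

"PCM_U" (unsigned integer linear PCM)

"PCM_F" (floating-point PCM)

"ULAW" (mu-law)

"ALAW" (a-law)

bits_per_sample (int or None, optional): Changes the bit depth for the supported formats.

When format is one of "wav", "flac" or "sph", you can change the bit depth. Valid values are 8, 16, 24, 32 and 64.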

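Supported formats/encodings/bit depths/compression are:

"wav"

32-bit floating-point PCM
32-bit signed integer PCM
24-bit signed integer PCM
16-bit signed integer PCM
8-bit unsigned integer PCM
8-bit mu-law
8-bit a-law

Note: The default encoding/bit depth is determined by the dtype of the input Tensor.

"flac"

8-bit
16-bit (default)
24-bit

"ogg", "vorbis"

Does not accept configuration changes.

"sph"

8-bit signed integer PCM
16-bit signed integer PCM
24-bit signed integer PCM
32-bit signed integer PCM (default)
8-bit mu-law
8-bit a-law
16-bit a-law
24-bit a-law
32-bit a-law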
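
The format rules for save can be summarised in a small sketch: an explicit format wins, otherwise the file extension is used, and an encoding/bit-depth pair is checked against the lists above. The names resolve_format, is_supported and SUPPORTED are hypothetical and exist only for this illustration; the sketch handles path-like inputs only and is not the backend's own validation logic.

```python
from pathlib import Path

VALID_FORMATS = {"wav", "ogg", "vorbis", "flac", "sph"}

# (encoding, bits_per_sample) pairs transcribed from the "wav" and "sph" lists.
SUPPORTED = {
    "wav": {("PCM_F", 32), ("PCM_S", 32), ("PCM_S", 24), ("PCM_S", 16),
            ("PCM_U", 8), ("ULAW", 8), ("ALAW", 8)},
    "sph": {("PCM_S", 8), ("PCM_S", 16), ("PCM_S", 24), ("PCM_S", 32),
            ("ULAW", 8), ("ALAW", 8), ("ALAW", 16), ("ALAW", 24), ("ALAW", 32)},
}


def resolve_format(filepath, format=None):
    """Use the explicit format if given, else infer it from the extension."""
    if format is None:
        format = Path(filepath).suffix.lstrip(".").lower()
    if format not in VALID_FORMATS:
        raise ValueError(f"unsupported or missing format: {format!r}")
    return format


def is_supported(format, encoding, bits_per_sample):
    """Check an encoding/bit-depth pair against the transcribed table."""
    return (encoding, bits_per_sample) in SUPPORTED.get(format, set())


inferred = resolve_format("speech.WAV")
overridden = resolve_format("audio.bin", format="flac")
```

A path without a recognisable extension and without an explicit format raises ValueError in this sketch, which corresponds to the case where the format argument should be supplied.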

paddleaudio.backends.soundfile_backend.soundfile_load(file: PathLike, sr: Optional[int] = None, mono: bool = True, merge_type: str = 'average', normal: bool = True, norm_type: str = 'linear', norm_mul_factor: float = 1.0, offset: float = 0.0, duration: Optional[int] = None, dtype: str = 'float32', resample_mode: str = 'kaiser_fast') → Tuple[ndarray, int]

Load an audio file from disk. This function uses an audio backend to load audio from disk.

Args:

file (os.PathLike): Path of the audio file to load.
sr (Optional[int], optional): Sample rate of the loaded waveform. Defaults to None.
mono (bool, optional): Return the waveform as a single channel. Defaults to True.
merge_type (str, optional): Merge type for multi-channel waveforms. Defaults to 'average'.
normal (bool, optional): Whether to normalize the waveform. Defaults to True.
norm_type (str, optional): Type of normalization. Defaults to 'linear'.
norm_mul_factor (float, optional): Scaling factor. Defaults to 1.0.
offset (float, optional): Offset to the start of the waveform. Defaults to 0.0.
duration (Optional[int], optional): Duration of the waveform to read. Defaults to None.
dtype (str, optional): Data type of the waveform. Defaults to 'float32'.
resample_mode (str, optional): The resampling filter to use. Defaults to 'kaiser_fast'.

Returns:

Tuple[np.ndarray, int]: Waveform as an ndarray and its sample rate.
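
The units of offset and duration are not stated above. Assuming both are given in seconds, a loader would need to convert them to frame counts using the file's native sample rate. The sketch below shows that conversion; seconds_to_frames is a hypothetical helper and the seconds-based interpretation is an assumption.

```python
def seconds_to_frames(offset, duration, sample_rate):
    """Return (start_frame, num_frames); num_frames is -1 for 'read to end'."""
    start = int(offset * sample_rate)
    count = -1 if duration is None else int(duration * sample_rate)
    return start, count


window = seconds_to_frames(offset=0.5, duration=2, sample_rate=16000)
whole = seconds_to_frames(offset=0.0, duration=None, sample_rate=16000)
```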

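paddleaudio.backends.soundfile_backend.soundfile_save(y: ndarray, sr: int, file: PathLike) → None

Save an audio file to disk. This function uses scipy.io.wavfile to save audio to disk, with an additional step that converts the input waveform to int16.

Args:

y (np.ndarray): Input waveform array, 1D or 2D.
sr (int): Sample rate.
file (os.PathLike): Path of the audio file to save.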
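
The int16 conversion step mentioned above can be pictured as clipping and scaling. The sketch below assumes the input is floating point in [-1.0, 1.0] and uses 32767 as the scale factor; both the helper name float_to_int16 and the exact scaling are assumptions, and the library may round or scale differently.

```python
def float_to_int16(y):
    """Clip floats to [-1.0, 1.0] and scale them to the int16 range."""
    out = []
    for v in y:
        clipped = max(-1.0, min(1.0, v))  # guard against out-of-range samples
        out.append(int(round(clipped * 32767)))
    return out


pcm = float_to_int16([0.0, 1.0, -1.0, 2.0])
```

The out-of-range sample 2.0 is clipped to full scale instead of wrapping around, which avoids loud artefacts in the saved file.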

paddleaudio.backends.soundfile_backend.to_mono(y: ndarray, merge_type: str = 'average') → ndarray

Convert stereo audio to mono.

Args:

y (np.ndarray): Input waveform array, 1D or 2D.
merge_type (str, optional): Merge type used to generate the mono waveform. Defaults to 'average'.

Returns:

np.ndarray: y as a mono waveform.
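
For the default 'average' merge type, the conversion amounts to taking the mean across channels at each time step. The sketch below assumes a [channel][time] layout expressed as nested lists; to_mono_average is a hypothetical helper, and other merge types are not covered.

```python
def to_mono_average(y):
    """Average a [channel][time] nested list into a single channel."""
    if not y or not isinstance(y[0], (list, tuple)):
        # Already 1D (mono): return a copy unchanged.
        return list(y)
    num_channels = len(y)
    return [sum(frame) / num_channels for frame in zip(*y)]


mono = to_mono_average([[1.0, 3.0], [3.0, 5.0]])
already_mono = to_mono_average([0.1, 0.2])
```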