Using pyDub to chop up a long audio file

小编典典

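I'd like to use pyDub to take a long WAV file of individual words (with silence in between) as input, strip out all the silence, and output the remaining chunks as individual WAV files. The filenames can just be sequential numbers, such as 001.wav, 002.wav, 003.wav, and so on.

The "Yet another Example?" example on the Github page does something very similar, but instead of outputting separate files it combines the silence-stripped segments back into a single file: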

from functools import reduce  # reduce is not a builtin in Python 3

from pydub import AudioSegment
from pydub.utils import db_to_float

# Let's load up the audio we need...
podcast = AudioSegment.from_mp3("podcast.mp3")
intro = AudioSegment.from_wav("intro.wav")
outro = AudioSegment.from_wav("outro.wav")

# Let's consider anything that is 30 decibels quieter than
# the average volume of the podcast to be silence
average_loudness = podcast.rms
silence_threshold = average_loudness * db_to_float(-30)

# filter out the silence
podcast_parts = (ms for ms in podcast if ms.rms > silence_threshold)

# combine all the chunks back together
podcast = reduce(lambda a, b: a + b, podcast_parts)

# add on the bumpers
podcast = intro + podcast + outro

# save the result
podcast.export("podcast_processed.mp3", format="mp3")

Is it possible to output those podcast_parts fragments as individual WAV files? If so, how?

Thanks!

2021-01-20

1 answer

小编典典

The example code is pretty simplified; you'll probably want to look at the strip_silence function:

https://github.com/jiaaro/pydub/blob/2644289067aa05dbb832974ac75cdc91c3ea6911/pydub/effects.py#L98

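Then just export each chunk instead of combining them.

The main difference between the example and the strip_silence function is that the example looks at one-millisecond slices, which does not measure low-frequency sounds well: one waveform of a 40 Hz sound, for example, is 25 milliseconds long.

The answer to the original question, though, is that all those slices of the original audio segment are also audio segments, so you can simply call the export method on them :)

Update: you may want to look at the silence utilities I've just pushed to the master branch; in particular, split_on_silence() can do this (given the right parameters), like so: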
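
To make the 40 Hz point concrete, here is a minimal pure-Python sketch that does not use pydub at all. The 8000 Hz sample rate, the synthetic sine tone and the helper names (`period_ms`, `rms`, `window_rms`) are assumptions chosen only for illustration. It compares the RMS of a steady 40 Hz tone measured over 1 ms windows with the RMS measured over whole 25 ms cycles:

```python
import math

SAMPLE_RATE = 8000  # samples per second, chosen only for this illustration


def period_ms(freq_hz):
    """Length of one full cycle of a tone, in milliseconds."""
    return 1000.0 / freq_hz


def rms(samples):
    """Root mean square of a list of sample values."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def window_rms(freq_hz, window_ms, duration_ms=100):
    """RMS of consecutive, non-overlapping windows of a pure sine tone."""
    total = SAMPLE_RATE * duration_ms // 1000
    tone = [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(total)]
    step = SAMPLE_RATE * window_ms // 1000
    return [rms(tone[i:i + step]) for i in range(0, total - step + 1, step)]


one_ms = window_rms(40, 1)       # 100 readings, one per millisecond
full_cycle = window_rms(40, 25)  # 4 readings, one per complete cycle

print(period_ms(40))
print(min(one_ms), max(one_ms))
print(min(full_cycle), max(full_cycle))
```

The tone is equally loud throughout, yet the 1 ms readings swing widely depending on where each window falls within the cycle, while the 25 ms readings stay constant. That swing is why a per-millisecond threshold test can be unreliable for low-frequency material.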
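
Exporting every one-millisecond slice on its own would produce thousands of tiny files, so in practice the loud slices need to be grouped into runs first. The following is only a rough sketch of that idea and not how strip_silence is implemented; `group_loud_ranges` and `export_loud_ranges` are hypothetical helper names, and the -30 dB margin is taken from the example in the question. Because of the low-frequency issue described above, treat it as a starting point only.

```python
import os


def group_loud_ranges(loudness, threshold):
    """Return (start, end) index pairs for each run of values above threshold.

    `end` is exclusive, so a pair can be used directly as a slice.
    """
    ranges = []
    start = None
    for i, value in enumerate(loudness):
        if value > threshold:
            if start is None:
                start = i  # a new loud run begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended at the previous index
            start = None
    if start is not None:
        ranges.append((start, len(loudness)))  # run lasts until the end
    return ranges


def export_loud_ranges(wav_path, out_dir):
    """Hypothetical helper: write each loud run of a WAV file to its own file."""
    # Imported here so the grouping logic above can be used without pydub.
    from pydub import AudioSegment
    from pydub.utils import db_to_float

    sound = AudioSegment.from_wav(wav_path)
    threshold = sound.rms * db_to_float(-30)
    # Iterating an AudioSegment yields one-millisecond slices, as in the question.
    loudness = [ms.rms for ms in sound]
    for n, (start, end) in enumerate(group_loud_ranges(loudness, threshold)):
        # Slicing by milliseconds returns another AudioSegment, so export works.
        out_path = os.path.join(out_dir, "chunk{0}.wav".format(n))
        sound[start:end].export(out_path, format="wav")


print(group_loud_ranges([0, 5, 6, 0, 0, 7, 0, 8, 9], threshold=1))
```

Because every run is split at any single quiet millisecond, this tends to cut words into fragments; requiring a minimum silence length, as split_on_silence() does with min_silence_len, avoids that.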

from pydub import AudioSegment
from pydub.silence import split_on_silence

sound = AudioSegment.from_mp3("my_file.mp3")
chunks = split_on_silence(sound, 
    # must be silent for at least half a second
    min_silence_len=500,

    # consider it silent if quieter than -16 dBFS
    silence_thresh=-16
)
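
As a rough guide to what a threshold such as -16 dBFS or the -30 dB margin in the question means, the sketch below converts decibels to amplitude ratios. It assumes the usual amplitude convention, ratio = 10 ** (dB / 20), and the helper names `db_to_ratio` and `ratio_to_db` are made up for this illustration rather than taken from pydub.

```python
import math


def db_to_ratio(db):
    """Convert a decibel value to an amplitude ratio (10 ** (dB / 20))."""
    return 10 ** (db / 20.0)


def ratio_to_db(ratio):
    """Convert an amplitude ratio back to decibels."""
    return 20.0 * math.log10(ratio)


# Fraction of full scale that corresponds to a -16 dBFS threshold.
print(db_to_ratio(-16))

# Fraction of the average loudness used as the cutoff in the question's example.
print(db_to_ratio(-30))
```

Since dBFS is measured against the loudest level the format can represent, a fixed -16 dBFS may be too high for a quiet recording; the values here are a starting point to tune, not a recommendation.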

You could export all the individual chunks as wav files like this:

for i, chunk in enumerate(chunks):
    chunk.export("/path/to/output/dir/chunk{0}.wav".format(i), format="wav")

This names the output files "chunk0.wav", "chunk1.wav", "chunk2.wav", and so on.
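
The question asked for names like 001.wav, 002.wav, 003.wav rather than chunk0.wav. A small sketch of how such names could be generated follows; `numbered_filename` is a hypothetical helper, and the three-digit width is an assumption that only holds for fewer than 1000 chunks.

```python
def numbered_filename(index, width=3, extension="wav"):
    """Build a zero-padded filename such as 001.wav (index counts from 1)."""
    return "{0:0{1}d}.{2}".format(index, width, extension)


names = [numbered_filename(i) for i in range(1, 4)]
print(names)

# With pydub chunks this could be combined with enumerate(chunks, start=1), e.g.
#     chunk.export(os.path.join(out_dir, numbered_filename(i)), format="wav")
# where out_dir is whatever output directory you choose.
```

Zero-padding keeps the files in the right order when a file browser sorts them alphabetically.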

2021-01-20