silero_vad_torch_hub_speech_segment_detection_and_extraction.py
Python. Loads the Silero VAD model via torch.hub and performs speech segment detection
import torch

# Limit PyTorch to a single CPU thread
torch.set_num_threads(1)

# Load the Silero VAD model and its helper utilities from torch.hub.
# force_reload=True re-downloads the repository on every run.
model, utils = torch.hub.load(repo_or_dir='snakers4/silero-vad',
                              model='silero_vad',
                              force_reload=True,
                              onnx=False)

# Unpack the helper functions returned alongside the model
(get_speech_timestamps,
 save_audio,
 read_audio,
 VADIterator,
 collect_chunks) = utils

# Read audio file (16 kHz mono recommended)
# Note: You can provide any 16 kHz wav file here
wav = read_audio('test.wav', sampling_rate=16000)

# Get speech timestamps from the whole audio
speech_timestamps = get_speech_timestamps(wav, model, sampling_rate=16000)
print(speech_timestamps)

# Merge all speech chunks and save to a single file
save_audio('only_speech.wav',
           collect_chunks(speech_timestamps, wav), sampling_rate=16000)
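The list printed by the script holds one dict per detected segment. By default the `start` and `end` values are sample offsets, not seconds. Below is a minimal sketch of converting them to seconds and summing the total speech duration, assuming 16 kHz audio as in the script. The helper names `to_seconds` and `total_speech_seconds` and the example segments are hypothetical, made up for illustration, and are not part of silero-vad.

```python
SAMPLING_RATE = 16000  # assumed to match the rate passed to read_audio


def to_seconds(timestamps, sampling_rate=SAMPLING_RATE):
    """Convert sample-offset segments to seconds (hypothetical helper)."""
    return [
        {"start": ts["start"] / sampling_rate, "end": ts["end"] / sampling_rate}
        for ts in timestamps
    ]


def total_speech_seconds(timestamps, sampling_rate=SAMPLING_RATE):
    """Sum the length of all segments, in seconds (hypothetical helper)."""
    total_samples = sum(ts["end"] - ts["start"] for ts in timestamps)
    return total_samples / sampling_rate


# Made-up segments in the same shape as the printed speech_timestamps
example = [{"start": 16000, "end": 48000}, {"start": 64000, "end": 72000}]

print(to_seconds(example))
print(total_speech_seconds(example))
```

In the script itself, the same helpers could be called on `speech_timestamps` directly. Recent silero-vad versions also appear to accept `return_seconds=True` in `get_speech_timestamps`. Check that against the installed version before relying on it.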