medkit.audio.segmentation.webrtc_voice_detector

This module requires extra dependencies that are not installed with the core medkit package. To install them, run pip install medkit-lib[webrtc-voice-detector].

Classes:

WebRTCVoiceDetector(output_label[, ...])

Voice Activity Detection operation relying on the webrtcvad package.

class WebRTCVoiceDetector(output_label, aggressiveness=2, frame_duration=30, nb_frames_in_window=10, switch_ratio=0.9, uid=None)

Voice Activity Detection operation relying on the webrtcvad package.

The per-frame VAD results returned by webrtcvad are aggregated with a switch algorithm that considers the proportion of speech and non-speech frames in a wider sliding window.
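The aggregation step can be pictured with a minimal sketch. This is an illustration under stated assumptions, not medkit's actual implementation: the function name aggregate_frames is hypothetical, and so is the exact switching rule used here (switch to speech once the share of speech frames in the window reaches switch_ratio, switch back once the share of non-speech frames reaches it).

```python
from collections import deque
from typing import List, Sequence, Tuple


def aggregate_frames(
    frame_is_speech: Sequence[bool],
    nb_frames_in_window: int = 10,
    switch_ratio: float = 0.9,
) -> List[Tuple[int, int]]:
    """Return (start, end) frame index ranges (end exclusive) treated as speech.

    Hypothetical helper: the switching rule is an assumption, not medkit's code.
    """
    # Sliding window of (frame index, is_speech) pairs
    window = deque(maxlen=nb_frames_in_window)
    in_speech = False
    start = 0
    ranges = []
    for index, is_speech in enumerate(frame_is_speech):
        window.append((index, is_speech))
        nb_speech = sum(1 for _, flag in window if flag)
        nb_non_speech = len(window) - nb_speech
        if not in_speech and nb_speech / nb_frames_in_window >= switch_ratio:
            # Enough speech frames: the range starts at the oldest frame in the window
            in_speech = True
            start = window[0][0]
            window.clear()
        elif in_speech and nb_non_speech / nb_frames_in_window >= switch_ratio:
            # Enough non-speech frames: close the range at the current frame
            in_speech = False
            ranges.append((start, index + 1))
            window.clear()
    if in_speech:
        # Input ended while still in the speech state
        ranges.append((start, len(frame_is_speech)))
    return ranges


# 5 non-speech frames, 20 speech frames, then 15 non-speech frames
frames = [False] * 5 + [True] * 20 + [False] * 15
print(aggregate_frames(frames))  # [(4, 34)]
```

Because the state only switches once the window is dominated by one class, isolated misclassified frames do not split a speech segment, at the cost of range boundaries that can include a few frames of the other class.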

Input segments must be mono, with a sample rate of 8 kHz, 16 kHz, 32 kHz or 48 kHz.

Parameters
  • output_label (str) – Label of output speech segments.

  • aggressiveness (Literal[0, 1, 2, 3]) – Aggressiveness mode passed to webrtcvad (per the webrtcvad documentation, the higher the value, the more aggressively non-speech is filtered out).

  • frame_duration (Literal[10, 20, 30]) – Duration in milliseconds of frames passed to webrtcvad.

  • nb_frames_in_window (int) – Number of frames in the sliding window used when aggregating per-frame VAD results.

  • switch_ratio (float) – Fraction (between 0 and 1) of speech or non-speech frames in the sliding window required to switch the window's speech state when aggregating per-frame VAD results.

  • uid (str) – Identifier of the detector.
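To get a feel for how frame_duration and nb_frames_in_window interact with the supported sample rates, the arithmetic can be sketched as follows. The helper names samples_per_frame and window_duration_ms are hypothetical and not part of medkit; they only restate the constraints listed above.

```python
SUPPORTED_SAMPLE_RATES = (8000, 16000, 32000, 48000)  # Hz, per the input constraint above
SUPPORTED_FRAME_DURATIONS = (10, 20, 30)  # ms, per the frame_duration parameter


def samples_per_frame(sample_rate: int, frame_duration: int = 30) -> int:
    """Number of audio samples in one frame (hypothetical helper)."""
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        raise ValueError(f"Unsupported sample rate: {sample_rate} Hz")
    if frame_duration not in SUPPORTED_FRAME_DURATIONS:
        raise ValueError(f"Unsupported frame duration: {frame_duration} ms")
    return sample_rate * frame_duration // 1000


def window_duration_ms(frame_duration: int = 30, nb_frames_in_window: int = 10) -> int:
    """Duration in milliseconds covered by the sliding window (hypothetical helper)."""
    return frame_duration * nb_frames_in_window


print(samples_per_frame(16000))  # 480 samples per 30 ms frame at 16 kHz
print(window_duration_ms())      # 300 ms with the default parameters
```

With the default parameters, the sliding window therefore spans 300 ms of audio, which gives a rough idea of the shortest stretch of speech or silence that can flip the state.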

Methods:

run(segments)

Return all speech segments detected for all input segments.

set_prov_tracer(prov_tracer)

Enable provenance tracing.

Attributes:

description

Contains all the operation init parameters.

run(segments)

Return all speech segments detected for all input segments.

Parameters

segments (List[Segment]) – Audio segments on which to perform VAD.

Return type

List[Segment]

Returns

List[Segment] – Segments detected as containing speech activity.

property description: medkit.core.operation_desc.OperationDescription

Contains all the operation init parameters.

Return type

OperationDescription

set_prov_tracer(prov_tracer)

Enable provenance tracing.

Parameters

prov_tracer (ProvTracer) – The tracer used to record provenance.