Core audio components

This page describes the core audio concepts of medkit.

Note

For more details about public APIs, refer to medkit.core.audio.

Document & Annotations

The AudioDocument class implements the Document protocol. It stores instances of the Segment class, which implements the Annotation protocol.

classDiagram
    direction TB
    class Document~Annotation~{
        <<protocol>>
    }
    class Annotation{
        <<protocol>>
    }
    class AudioDocument{
        uid: str
        anns: AudioAnnotationContainer
    }
    class Segment {
        uid: str
        label: str
        attrs: AttributeContainer
    }
    Document <|.. AudioDocument: implements
    Annotation <|.. Segment: implements
    AudioDocument *-- Segment : contains\n(AudioAnnotationContainer)

Fig. 5 Audio document and annotation hierarchy

Document

AudioDocument relies on AudioAnnotationContainer, a subclass of AnnotationContainer, to manage the annotations.

Note

For common interfaces provided by core components, you can refer to Document.

Annotations

For the audio modality, an AudioDocument can only contain Segment annotations.
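The containment rule can be sketched as follows. This is a minimal, self-contained illustration and not medkit's implementation: the classes `SketchSegment`, `SketchAnnotationContainer` and `SketchAudioDocument` and the methods `add()` and `get()` are hypothetical stand-ins. Only the attribute names `uid`, `label`, `attrs` and `anns` are taken from the diagram above.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class SketchSegment:
    """Hypothetical stand-in for an audio annotation (not medkit's Segment)."""

    label: str
    attrs: dict = field(default_factory=dict)
    uid: str = field(default_factory=lambda: uuid.uuid4().hex)


class SketchAnnotationContainer:
    """Hypothetical container that accepts only SketchSegment instances."""

    def __init__(self):
        self._by_uid = {}

    def add(self, ann):
        # Reject anything that is not a segment, mirroring the rule that
        # an audio document can only contain segments.
        if not isinstance(ann, SketchSegment):
            raise TypeError(f"expected SketchSegment, got {type(ann).__name__}")
        self._by_uid[ann.uid] = ann

    def get(self, label=None):
        # Return all annotations, optionally filtered by label.
        return [a for a in self._by_uid.values() if label is None or a.label == label]


@dataclass
class SketchAudioDocument:
    """Hypothetical stand-in for an audio document."""

    anns: SketchAnnotationContainer = field(default_factory=SketchAnnotationContainer)
    uid: str = field(default_factory=lambda: uuid.uuid4().hex)


doc = SketchAudioDocument()
doc.anns.add(SketchSegment(label="speech"))
doc.anns.add(SketchSegment(label="silence"))
speech_labels = [seg.label for seg in doc.anns.get(label="speech")]
```

Keeping the type check inside the container, rather than in the document, means the document class itself stays a thin wrapper around its annotations.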

Spans

Like text annotations, audio annotations have a span pointing to the part of the audio document that is annotated. Unlike text annotations, they do not support multiple discontinuous spans: an audio annotation has exactly one continuous span, and there is no concept of “modified spans”.
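To make the "one continuous span" constraint concrete, here is a minimal sketch. `SketchSpan` is a hypothetical class written for this page, not the class exposed by medkit.core.audio.span, and the choice of seconds as the unit is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchSpan:
    """Hypothetical single continuous audio span (boundaries assumed in seconds)."""

    start: float
    end: float

    def __post_init__(self):
        # A span is one continuous interval, so it needs ordered boundaries.
        if self.start < 0 or self.end < self.start:
            raise ValueError("span must satisfy 0 <= start <= end")

    @property
    def length(self):
        # Duration covered by the span.
        return self.end - self.start


span = SketchSpan(start=1.5, end=4.0)
span_length = span.length
```

Because an annotation holds a single interval rather than a list of intervals, no bookkeeping is needed to map a modified signal back to several original positions.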

Note

For more details about public APIs, refer to medkit.core.audio.span.

Audio Buffer

Access to the actual waveform data is handled through AudioBuffer instances. Just as text annotations store the text they refer to in their text property (a string), audio annotations store the portion of the audio signal they refer to in their audio property (an AudioBuffer).

The contents of an AudioBuffer may differ from the initial raw signal if the signal has been preprocessed. If the signal is identical to the initial raw signal, a FileAudioBuffer can be used (with appropriate start and end boundaries). Otherwise, a MemoryAudioBuffer must be used, since no audio file contains the corresponding signal.

Creating a new AudioBuffer containing a portion of a pre-existing buffer is done through the trim() method.
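The difference between the two buffer kinds, and what trimming means for each, can be sketched as below. This is an illustration under stated assumptions, not medkit's code: `SketchFileBuffer` and `SketchMemoryBuffer` are hypothetical classes, the `trim()` parameters are assumed to be sample indices relative to the buffer being trimmed, and `recording.wav` is a placeholder path that is never opened.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SketchMemoryBuffer:
    """Hypothetical in-memory buffer: holds the (possibly processed) samples."""

    samples: List[float]
    sample_rate: int

    def trim(self, start: Optional[int] = None, end: Optional[int] = None):
        # Copy the requested sample range into a new buffer.
        start = 0 if start is None else start
        end = len(self.samples) if end is None else end
        return SketchMemoryBuffer(self.samples[start:end], self.sample_rate)


@dataclass
class SketchFileBuffer:
    """Hypothetical file-backed buffer: holds a path and sample boundaries."""

    path: str
    sample_rate: int
    start: int
    end: int

    def trim(self, start: Optional[int] = None, end: Optional[int] = None):
        # Only the boundaries change; no audio data is read or copied.
        new_start = self.start + (0 if start is None else start)
        new_end = self.end if end is None else self.start + end
        if not (self.start <= new_start <= new_end <= self.end):
            raise ValueError("trim range falls outside the buffer")
        return SketchFileBuffer(self.path, self.sample_rate, new_start, new_end)


# Raw signal: a file-backed buffer only needs to narrow its boundaries.
raw = SketchFileBuffer("recording.wav", sample_rate=16000, start=0, end=160000)
two_seconds = raw.trim(start=32000, end=64000)
first_half = two_seconds.trim(end=16000)

# Preprocessed signal: no file holds these samples, so they live in memory.
processed = SketchMemoryBuffer([0.0, 0.1, 0.2, 0.3, 0.4], sample_rate=16000)
middle = processed.trim(start=1, end=4)
```

In this sketch, trimming a file-backed buffer is cheap because the waveform stays on disk, while trimming an in-memory buffer copies the selected samples.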

Note

For more details about public APIs, refer to medkit.core.audio.audio_buffer.

Operations

Abstract subclasses of Operation are defined for audio to ease the development of audio operations. Each subclass corresponds to a different kind of run method.

classDiagram
    Operation <|-- DocOperation
    Operation <|-- PreprocessingOperation
    Operation <|-- SegmentationOperation

Fig. 6 Operation hierarchy
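The hierarchy can be sketched with abstract base classes. Everything below is hypothetical and written for illustration: the `Sketch*` classes are not medkit's classes, the `run()` signatures are assumptions, and segments are reduced to `(start, end)` tuples in seconds to keep the example self-contained.

```python
from abc import ABC, abstractmethod


class SketchOperation(ABC):
    """Hypothetical common base class for all operations."""


class SketchDocOperation(SketchOperation):
    @abstractmethod
    def run(self, docs):
        """Process whole documents (assumed signature)."""


class SketchPreprocessingOperation(SketchOperation):
    @abstractmethod
    def run(self, segments):
        """Return transformed versions of the input segments (assumed signature)."""


class SketchSegmentationOperation(SketchOperation):
    @abstractmethod
    def run(self, segments):
        """Return new, smaller segments cut from the inputs (assumed signature)."""


class FixedLengthSplitter(SketchSegmentationOperation):
    """Hypothetical concrete operation: cut segments into fixed-length chunks."""

    def __init__(self, chunk_seconds):
        if chunk_seconds <= 0:
            raise ValueError("chunk_seconds must be positive")
        self.chunk_seconds = chunk_seconds

    def run(self, segments):
        chunks = []
        for start, end in segments:
            cursor = start
            while cursor < end:
                # The last chunk may be shorter than chunk_seconds.
                chunk_end = min(cursor + self.chunk_seconds, end)
                chunks.append((cursor, chunk_end))
                cursor = chunk_end
        return chunks


splitter = FixedLengthSplitter(chunk_seconds=2.0)
chunks = splitter.run([(0.0, 5.0)])
```

A developer writing a new operation only implements `run()` in the subclass matching the kind of processing, and the abstract base prevents instantiating an operation that lacks it.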

Note

For more details about public APIs, refer to medkit.core.audio.operation.