medkit.audio.transcription.doc_transcriber
Classes:
DocTranscriber | Speech-to-text transcriber generating text documents from audio documents.
TranscriptionOperation | Protocol for operations in charge of the actual speech-to-text transcription to use with DocTranscriber.
- class DocTranscriber(input_label, output_label, transcription_operation, attrs_to_copy=None, uid=None)
Speech-to-text transcriber generating text documents from audio documents.
For each audio document, all audio segments with a specific label are converted into text segments and regrouped in a corresponding new text document. The text of each segment is concatenated to form the full raw text of the new document.
Generated text documents are instances of TranscribedTextDocument (subclass of TextDocument) with additional info such as the identifier of the original audio document and a mapping between audio spans and text spans.
The methods create_text_segment() and augment_full_text_for_next_segment() can be overridden to customize how the text segments are created and how they are concatenated to form the full text.
The actual transcription task is delegated to a TranscriptionOperation that must be provided, for instance medkit.audio.transcription.hf_transcriber.HFTranscriber or medkit.audio.transcription.sb_transcriber.SBTranscriber.
- Parameters
input_label (str) – Label of audio segments that should be transcribed.
output_label (str) – Label of generated text segments.
transcription_operation (TranscriptionOperation) – Transcription operation in charge of actually transcribing each audio segment.
attrs_to_copy (Optional[List[str]]) – Labels of attributes that should be copied from the original audio segments to the transcribed text segments.
uid (str) – Identifier of the transcriber.
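The grouping and concatenation described above can be pictured with a small self-contained model. This is only a sketch: `FakeAudioSegment`, `transcribe_doc`, the single-space joiner and the direction of the span mapping are hypothetical choices made for illustration, not medkit classes or medkit's actual behavior.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class FakeAudioSegment:
    """Hypothetical stand-in for an audio segment (not a medkit class)."""

    label: str
    start: float  # start time in seconds
    end: float  # end time in seconds
    transcription: str  # pretend output of the transcription operation


def transcribe_doc(
    segments: List[FakeAudioSegment], input_label: str
) -> Tuple[str, Dict[Tuple[int, int], Tuple[float, float]]]:
    """Concatenate the text of matching segments and map text spans to audio spans."""
    full_text = ""
    text_to_audio: Dict[Tuple[int, int], Tuple[float, float]] = {}
    for seg in segments:
        if seg.label != input_label:
            continue  # only segments with the requested label are transcribed
        if full_text:
            full_text += " "  # assumed joiner, chosen for this sketch
        start = len(full_text)
        full_text += seg.transcription
        # remember which audio span produced this stretch of text
        text_to_audio[(start, len(full_text))] = (seg.start, seg.end)
    return full_text, text_to_audio


segments = [
    FakeAudioSegment("speech", 0.0, 1.5, "Hello."),
    FakeAudioSegment("noise", 1.5, 2.0, ""),
    FakeAudioSegment("speech", 2.0, 4.0, "How are you?"),
]
full_text, spans = transcribe_doc(segments, input_label="speech")
```

In this model the "noise" segment is skipped because its label does not match, and each text span keeps a pointer back to the audio span it came from.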
Methods:
augment_full_text_for_next_segment(full_text, segment_text, audio_segment) | Append intermediate joining text to full text before the next segment is concatenated to it.
run(audio_docs) | Return a transcribed text document for each document in audio_docs.
set_prov_tracer(prov_tracer) | Enable provenance tracing.
Attributes:
description | Contains all the operation init parameters.
- run(audio_docs)
Return a transcribed text document for each document in audio_docs.
- Parameters
audio_docs (List[AudioDocument]) – Audio documents to transcribe.
- Return type
List[TranscribedTextDocument]
- Returns
Transcribed text documents (one per document in audio_docs).
- augment_full_text_for_next_segment(full_text, segment_text, audio_segment)
Append intermediate joining text to full text before the next segment is concatenated to it. Override for custom behavior.
- Return type
str
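As a rough illustration of overriding this hook, the sketch below uses a hypothetical stand-in base class rather than the real DocTranscriber. `SketchTranscriber`, `build_full_text` and the single-space default are assumptions made so the example runs without medkit installed; only the method name and its three parameters come from the signature above.

```python
class SketchTranscriber:
    """Hypothetical stand-in mimicking the hook (not medkit's DocTranscriber)."""

    def augment_full_text_for_next_segment(self, full_text, segment_text, audio_segment):
        # assumed default for this sketch: one space between segments,
        # nothing before the first segment
        return full_text + " " if full_text else full_text

    def build_full_text(self, segment_texts):
        # hypothetical driver loop: call the hook, then append the segment text
        full_text = ""
        for text in segment_texts:
            full_text = self.augment_full_text_for_next_segment(full_text, text, None)
            full_text += text
        return full_text


class OneSegmentPerLine(SketchTranscriber):
    """Override the hook so that each segment starts on its own line."""

    def augment_full_text_for_next_segment(self, full_text, segment_text, audio_segment):
        return full_text + "\n" if full_text else full_text


default_text = SketchTranscriber().build_full_text(["Hello.", "How are you?"])
per_line_text = OneSegmentPerLine().build_full_text(["Hello.", "How are you?"])
```

The hook returns the augmented full text and leaves the appending of the segment text to the caller, which matches the description "before the next segment is concatenated to it".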
- property description: medkit.core.operation_desc.OperationDescription
Contains all the operation init parameters.
- Return type
OperationDescription
- set_prov_tracer(prov_tracer)
Enable provenance tracing.
- Parameters
prov_tracer (ProvTracer) – The provenance tracer used to trace the provenance.
- class TranscriptionOperation(*args, **kwargs)
Protocol for operations in charge of the actual speech-to-text transcription to use with DocTranscriber.
Attributes:
output_label | Label to use for generated transcription attributes.
Methods:
run(segments) | Add a transcription attribute to each segment with a text value containing the transcribed text.
- output_label: str
Label to use for generated transcription attributes
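The shape of such a protocol can be sketched with `typing.Protocol`. Everything below is a hypothetical model: `FakeSegment`, `SketchTranscriptionOperation`, `DummyTranscriber`, `transcribe_all` and the default label "transcribed_text" are stand-ins invented for this example, and only the `output_label` attribute and the `run(segments)` method mirror the members documented above.

```python
from typing import Dict, List, Protocol


class FakeSegment:
    """Hypothetical stand-in for an audio segment (not a medkit class)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.attrs: Dict[str, str] = {}  # attribute label -> text value


class SketchTranscriptionOperation(Protocol):
    """Assumed shape of the protocol: one attribute and one method."""

    output_label: str

    def run(self, segments: List[FakeSegment]) -> None: ...


class DummyTranscriber:
    """Toy operation satisfying the protocol structurally, without inheriting from it."""

    def __init__(self, output_label: str = "transcribed_text") -> None:
        self.output_label = output_label

    def run(self, segments: List[FakeSegment]) -> None:
        for seg in segments:
            # a real operation would run a speech-to-text model here
            seg.attrs[self.output_label] = f"transcript of {seg.name}"


def transcribe_all(op: SketchTranscriptionOperation, segments: List[FakeSegment]) -> List[str]:
    """Run the operation, then read back the attribute it attached to each segment."""
    op.run(segments)
    return [seg.attrs[op.output_label] for seg in segments]


segments = [FakeSegment("seg-1"), FakeSegment("seg-2")]
texts = transcribe_all(DummyTranscriber(), segments)
```

Because the protocol is structural, any object exposing `output_label` and a compatible `run` can be passed to `transcribe_all`, which is the same idea that lets different transcription backends be plugged into one document-level transcriber.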