medkit.audio.transcription.hf_transcriber
This module requires extra dependencies that are not installed with the core dependencies of medkit. To install them, use pip install medkit-lib[hf-transcriber].
Classes:
HFTranscriber – Transcriber operation based on a Hugging Face transformers model.
- class HFTranscriber(model='facebook/s2t-large-librispeech-asr', output_label='transcribed_text', add_trailing_dot=True, capitalize=True, device=-1, batch_size=1, hf_auth_token=None, cache_dir=None, uid=None)
Transcriber operation based on a Hugging Face transformers model.
For each segment given as input, a transcription attribute will be created with the transcribed text as value. If needed, a text document can later be created from all the transcriptions of an audio document using medkit.audio.transcription.TranscribedTextDocument.from_audio_doc.
- Parameters
model (str) – Name of the ASR model on the Hugging Face models hub. Must be a model compatible with the AutomaticSpeechRecognitionPipeline transformers class.
output_label (str) – Label of the attribute containing the transcribed text that will be attached to the input segments.
add_trailing_dot (bool) – If True, a dot will be added at the end of each transcription text.
capitalize (bool) – If True, the first letter of each transcription text will be uppercased and the rest lowercased.
device (int) – Device to use for PyTorch models. Follows the Hugging Face convention (-1 for CPU and the device number for GPU, for instance 0 for "cuda:0").
batch_size (int) – Size of batches processed by the ASR pipeline.
hf_auth_token (Optional[str]) – Hugging Face authentication token (to access private models on the hub).
cache_dir (Union[str, Path, None]) – Directory in which to store downloaded models. If not set, the default Hugging Face cache directory is used.
uid (str) – Identifier of the transcriber.
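The effect of add_trailing_dot, capitalize and device can be illustrated with a small standalone sketch. This is not medkit's implementation: the helper names format_transcription and device_name are hypothetical, and the sketch assumes the dot is appended unconditionally, as the parameter description suggests.

```python
def format_transcription(
    text: str, add_trailing_dot: bool = True, capitalize: bool = True
) -> str:
    """Hypothetical helper mimicking the documented post-processing options."""
    if capitalize:
        # str.capitalize() uppercases the first character and lowercases the rest
        text = text.capitalize()
    if add_trailing_dot:
        # Assumption: the dot is appended even if the text already ends with one
        text += "."
    return text


def device_name(device: int) -> str:
    """Hypothetical helper: map the integer device convention to a device string."""
    # -1 (any negative value here) means CPU; n >= 0 means the n-th CUDA device
    return "cpu" if device < 0 else f"cuda:{device}"


print(format_transcription("HELLO world"))
print(device_name(-1), device_name(0))
```

With the default options, a raw transcription such as "HELLO world" would come out as "Hello world.", and device=0 would select "cuda:0".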
Methods:
run(segments) – Add a transcription attribute to each segment with a text value containing the transcribed text.
set_prov_tracer(prov_tracer) – Enable provenance tracing.
Attributes:
description – Contains all the operation init parameters.
- run(segments)
Add a transcription attribute to each segment with a text value containing the transcribed text.
- Parameters
segments (List[Segment]) – List of segments to transcribe.
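How batch_size and output_label interact in run can be pictured with the rough sketch below. Everything in it is an assumption made for illustration: FakeSegment, run_sketch and transcribe_batch are hypothetical stand-ins, and attributes are simplified to a plain dict rather than medkit attribute objects. The actual operation delegates transcription to a Hugging Face ASR pipeline.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class FakeSegment:
    """Hypothetical stand-in for a medkit audio segment."""

    audio: str
    attrs: Dict[str, str] = field(default_factory=dict)


def run_sketch(
    segments: List[FakeSegment],
    transcribe_batch: Callable[[List[str]], List[str]],
    output_label: str = "transcribed_text",
    batch_size: int = 1,
) -> None:
    """Transcribe segments batch by batch and attach the text to each one."""
    for start in range(0, len(segments), batch_size):
        batch = segments[start : start + batch_size]
        # One call to the (stand-in) ASR function per batch
        texts = transcribe_batch([seg.audio for seg in batch])
        for seg, text in zip(batch, texts):
            # Simplification: store the text under the output label in a dict
            seg.attrs[output_label] = text


# Usage with a dummy "model" that just uppercases its input
segments = [FakeSegment("a"), FakeSegment("b"), FakeSegment("c")]
run_sketch(segments, lambda audios: [a.upper() for a in audios], batch_size=2)
print([seg.attrs["transcribed_text"] for seg in segments])
```

In this sketch the segments are modified in place and nothing is returned, which mirrors the description of run as adding an attribute to each input segment.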
- property description: medkit.core.operation_desc.OperationDescription
Contains all the operation init parameters.
- Return type
OperationDescription
- set_prov_tracer(prov_tracer)
Enable provenance tracing.
- Parameters
prov_tracer (ProvTracer) – The provenance tracer used to trace the provenance.