medkit.text.preprocessing.char_replacer

Classes:

CharReplacer(output_label[, rules, name, uid])

Generic character replacer to be used as pre-processing module

class CharReplacer(output_label, rules=None, name=None, uid=None)

Generic character replacer to be used as pre-processing module

This module non-destructively replaces selected single characters with replacement strings of any length. It keeps track of the resulting span modifications: each output is a new text-bound annotation whose spans record how the modified text maps back to the input text.

Parameters
  • output_label (str) – The output label of the created annotations

  • rules (Optional[List[Tuple[str, str]]]) – The list of replacement rules. Default: ALL_CHAR_RULES

  • name (Optional[str]) – Name describing the pre-processing module (defaults to the class name)

  • uid (str) – Identifier of the pre-processing module
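
The idea of replacing single characters with longer strings while keeping a link back to the input text can be sketched in plain Python. This is a minimal illustration and not medkit's implementation: the `replace_chars` helper, the `EXAMPLE_RULES` list, and the tuple-based offset record are all assumptions made for this example, and the rules shown are not the contents of ALL_CHAR_RULES.

```python
from typing import List, Tuple

# Hypothetical rules for illustration only (not medkit's ALL_CHAR_RULES).
# Each rule is assumed to be a (single character, replacement string) pair.
EXAMPLE_RULES: List[Tuple[str, str]] = [("œ", "oe"), ("½", "1/2"), ("…", "...")]


def replace_chars(
    text: str, rules: List[Tuple[str, str]]
) -> Tuple[str, List[Tuple[int, int, int]]]:
    """Replace each 1-char key with its replacement string.

    Returns the new text and, for every replacement, a tuple
    (start in new text, end in new text, index in original text),
    so the original text is never lost.
    """
    mapping = dict(rules)
    pieces = []
    replaced = []
    pos = 0  # current offset in the new text
    for index, char in enumerate(text):
        new = mapping.get(char, char)
        if char in mapping:
            replaced.append((pos, pos + len(new), index))
        pieces.append(new)
        pos += len(new)
    return "".join(pieces), replaced


new_text, replaced = replace_chars("un œuf… ½", EXAMPLE_RULES)
print(new_text)
print(replaced)
```

If only the replaced text is needed, `text.translate(str.maketrans(dict(rules)))` produces the same string; the explicit loop is used here because it also records the offsets, which is the part that makes the operation non-destructive.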

Methods:

run(segments)

Run the module on a list of input segments and return a new list of segments

set_prov_tracer(prov_tracer)

Enable provenance tracing.

Attributes:

description

Contains all the operation init parameters.

run(segments)

Run the module on a list of input segments and return a new list of segments

Parameters

segments (List[Segment]) – List of segments to process

Return type

List[Segment]

Returns

List[Segment] – List of new segments

property description: medkit.core.operation_desc.OperationDescription

Contains all the operation init parameters.

Return type

OperationDescription

set_prov_tracer(prov_tracer)

Enable provenance tracing.

Parameters

prov_tracer (ProvTracer) – The tracer used to record provenance.