medkit.text.spacy.edsnlp#

This module requires extra dependencies that are not installed with the core dependencies of medkit. To install them, use pip install medkit[edsnlp].

Classes:

EDSNLPDocPipeline(nlp[, medkit_labels_anns, ...])

DocPipeline to obtain annotations created using EDS-NLP

EDSNLPPipeline(nlp[, spacy_entities, ...])

Segment annotator relying on an EDS-NLP pipeline

Functions:

build_adicap_attribute(spacy_span, spacy_label)

Build a medkit ADICAP normalization attribute from an EDS-NLP attribute with an ADICAP object as value.

build_date_attribute(spacy_span, spacy_label)

Build a medkit date attribute from an EDS-NLP attribute with a date object as value.

build_duration_attribute(spacy_span, spacy_label)

Build a medkit duration attribute from an EDS-NLP attribute with a duration object as value.

build_measurement_attribute(spacy_span, ...)

Build a medkit attribute from an EDS-NLP attribute with a measurement object as value.

build_tnm_attribute(spacy_span, spacy_label)

Build a medkit TNM attribute from an EDS-NLP attribute with a TNM object as value.

Data:

DEFAULT_ATTRIBUTE_FACTORIES

Pre-defined attribute factories to handle EDS-NLP attributes

class EDSNLPPipeline(nlp, spacy_entities=None, spacy_span_groups=None, spacy_attrs=None, medkit_attribute_factories=None, name=None, uid=None)[source]#

Segment annotator relying on an EDS-NLP pipeline

Initialize the segment annotator

Parameters
  • nlp (Language) – Language object with the loaded pipeline from Spacy

  • spacy_entities (Optional[List[str]]) – Labels of new spacy entities (doc.ents) to convert into medkit entities. If None (default) all the new spacy entities will be converted

  • spacy_span_groups (Optional[List[str]]) – Names of new spacy span groups (doc.spans) to convert into medkit segments. If None (default), all the new spacy span groups will be converted

  • spacy_attrs (Optional[List[str]]) – Name of span extensions to convert into medkit attributes. If None, all non-redundant EDS-NLP attributes will be handled.

  • medkit_attribute_factories (Optional[Dict[str, Callable[[Span, str], Attribute]]]) – Mapping of factories in charge of converting spacy attributes to medkit attributes. Factories will receive a spacy span and an attribute label when called. The key in the mapping is the attribute label. Pre-defined default factories are listed in DEFAULT_ATTRIBUTE_FACTORIES

  • name (Optional[str]) – Name describing the pipeline (defaults to the class name).

  • uid (str) – Identifier of the pipeline

Attributes:

description

Contains all the operation init parameters.

Methods:

run(segments)

Run a spacy pipeline on a list of segments provided as input and return a new list of segments.

set_prov_tracer(prov_tracer)

Enable provenance tracing.

property description: medkit.core.operation_desc.OperationDescription#

Contains all the operation init parameters.

Return type

OperationDescription

run(segments)#

Run a spacy pipeline on a list of segments provided as input and return a new list of segments. Each segment is converted to a spacy document (Doc object). The spacy pipeline is then executed and, finally, the new annotations and attributes are converted into medkit annotations.

Parameters

segments (List[Segment]) – List of segments on which to run the spacy pipeline

Return type

List[Segment]

Returns

List[Segment] – List of new annotations
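
A minimal usage sketch of the segment annotator is given here. It assumes that the edsnlp extra is installed, that spacy.blank("eds") and the "eds.sentences" and "eds.dates" components are provided by the installed EDS-NLP version, and that Segment and Span can be imported from medkit.core.text. The helper name annotate_dates is hypothetical. Imports are deferred into the function so that defining it does not require the extra dependencies.

```python
def annotate_dates(text):
    """Run an EDS-NLP date pipeline on a single segment covering `text`.

    Returns a (label, text, attribute labels) tuple for each new segment.
    """
    # Deferred imports: they require `pip install medkit[edsnlp]`.
    import spacy
    from medkit.core.text import Segment, Span
    from medkit.text.spacy.edsnlp import EDSNLPPipeline

    # Assumed EDS-NLP setup; component names may differ between versions.
    nlp = spacy.blank("eds")
    nlp.add_pipe("eds.sentences")
    nlp.add_pipe("eds.dates")

    # Only the "date" span extension is converted into medkit attributes.
    annotator = EDSNLPPipeline(nlp, spacy_attrs=["date"])

    # One input segment spanning the whole text.
    segment = Segment(label="raw_text", text=text, spans=[Span(0, len(text))])

    # run() returns the newly created segments, not the input ones.
    new_segments = annotator.run([segment])

    # `attrs.get()` is assumed to list the attributes of an annotation.
    return [
        (seg.label, seg.text, [attr.label for attr in seg.attrs.get()])
        for seg in new_segments
    ]
```

Calling annotate_dates("Le patient est admis le 21 janvier 2020.") would then be expected to yield one tuple per span detected by the date component, provided the assumptions above hold.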

set_prov_tracer(prov_tracer)#

Enable provenance tracing.

Parameters

prov_tracer (ProvTracer) – The provenance tracer used to trace the provenance.

class EDSNLPDocPipeline(nlp, medkit_labels_anns=None, medkit_attrs=None, spacy_entities=None, spacy_span_groups=None, spacy_attrs=None, medkit_attribute_factories=None, name=None, uid=None)[source]#

DocPipeline to obtain annotations created using EDS-NLP

Initialize the pipeline

Parameters
  • nlp (Language) – Language object with the loaded pipeline from Spacy

  • medkit_labels_anns (Optional[List[str]]) – Labels of medkit annotations to include in the spacy document. If None (default) all the annotations will be included.

  • medkit_attrs (Optional[List[str]]) – Labels of medkit attributes to add in the annotations that will be included. If None (default) all the attributes will be added as custom attributes in each annotation included.

  • spacy_entities (Optional[List[str]]) – Labels of new spacy entities (doc.ents) to convert into medkit entities. If None (default), all the new spacy entities will be converted and added to their origin medkit document.

  • spacy_span_groups (Optional[List[str]]) – Names of new spacy span groups (doc.spans) to convert into medkit segments. If None (default), all the new spacy span groups will be converted and added to their origin medkit document.

  • spacy_attrs (Optional[List[str]]) – Name of span extensions to convert into medkit attributes. If None, all non-redundant EDS-NLP attributes will be handled.

  • medkit_attribute_factories (Optional[Dict[str, Callable[[Span, str], Attribute]]]) – Mapping of factories in charge of converting spacy attributes to medkit attributes. Factories will receive a spacy span and an attribute label when called. The key in the mapping is the attribute label. Pre-defined default factories are listed in DEFAULT_ATTRIBUTE_FACTORIES

  • name (Optional[str]) – Name describing the pipeline (defaults to the class name).

  • uid (str) – Identifier of the pipeline

Attributes:

description

Contains all the operation init parameters.

Methods:

run(medkit_docs)

Run a spacy pipeline on a list of medkit documents.

set_prov_tracer(prov_tracer)

Enable provenance tracing.

property description: medkit.core.operation_desc.OperationDescription#

Contains all the operation init parameters.

Return type

OperationDescription

run(medkit_docs)#

Run a spacy pipeline on a list of medkit documents. Each medkit document is converted to a spacy document (Doc object) with the selected annotations and attributes. The spacy pipeline is then executed and, finally, the new annotations and attributes are converted into medkit annotations.

Parameters

medkit_docs (List[TextDocument]) – List of TextDocuments on which to run the pipeline

Return type

None
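
Because run() returns None, results are read from the documents themselves rather than from a return value. The sketch below illustrates this under the same assumptions as any EDS-NLP setup (edsnlp extra installed, spacy.blank("eds") and the "eds.sentences" and "eds.dates" components available) and assumes TextDocument can be imported from medkit.core.text. The helper name annotate_documents is hypothetical.

```python
def annotate_documents(texts):
    """Create one TextDocument per text and annotate them with EDS-NLP."""
    # Deferred imports: they require `pip install medkit[edsnlp]`.
    import spacy
    from medkit.core.text import TextDocument
    from medkit.text.spacy.edsnlp import EDSNLPDocPipeline

    # Assumed EDS-NLP setup; component names may differ between versions.
    nlp = spacy.blank("eds")
    nlp.add_pipe("eds.sentences")
    nlp.add_pipe("eds.dates")

    # Default arguments: all annotations, attributes, entities and span
    # groups are selected, as described in the parameter list.
    doc_pipeline = EDSNLPDocPipeline(nlp)

    docs = [TextDocument(text=text) for text in texts]

    # Nothing is returned: new annotations are added to each document.
    doc_pipeline.run(docs)
    return docs
```

The new annotations can then be inspected on each returned document, for instance through its annotation container in recent medkit versions (the exact accessor depends on the installed version).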

set_prov_tracer(prov_tracer)#

Enable provenance tracing.

Parameters

prov_tracer (ProvTracer) – The provenance tracer used to trace the provenance.

build_date_attribute(spacy_span, spacy_label)[source]#

Build a medkit date attribute from an EDS-NLP attribute with a date object as value.

Parameters
  • spacy_span (Span) – Spacy span having an EDS-NLP date attribute

  • spacy_label (str) – Label of the date attribute on spacy_span. Ex: “date”, “consultation_date”

Return type

Attribute

Returns

Attribute – DateAttribute or RelativeDateAttribute instance, depending on the EDS-NLP attribute

build_duration_attribute(spacy_span, spacy_label)[source]#

Build a medkit duration attribute from an EDS-NLP attribute with a duration object as value.

Parameters
  • spacy_span (Span) – Spacy span having an EDS-NLP duration attribute

  • spacy_label (str) – Label of the duration attribute on spacy_span. Ex: “duration”

Return type

DurationAttribute

Returns

DurationAttribute – Medkit duration attribute

build_adicap_attribute(spacy_span, spacy_label)[source]#

Build a medkit ADICAP normalization attribute from an EDS-NLP attribute with an ADICAP object as value.

Parameters
  • spacy_span (Span) – Spacy span having an ADICAP object as value

  • spacy_label (str) – Label of the attribute on spacy_span. Ex: “adicap”

Return type

ADICAPNormAttribute

Returns

ADICAPNormAttribute – Medkit ADICAP normalization attribute

build_tnm_attribute(spacy_span, spacy_label)[source]#

Build a medkit TNM attribute from an EDS-NLP attribute with a TNM object as value.

Parameters
  • spacy_span (Span) – Spacy span having a TNM object as value

  • spacy_label (str) – Label of the attribute on spacy_span. Ex: “tnm”

Return type

TNMAttribute

Returns

TNMAttribute – Medkit TNM attribute

build_measurement_attribute(spacy_span, spacy_label)[source]#

Build a medkit attribute from an EDS-NLP attribute with a measurement object as value.

Parameters
  • spacy_span (Span) – Spacy span having a measurement object as value

  • spacy_label (str) – Label of the attribute on spacy_span. Ex: “size”, “weight”, “bmi”

Return type

Attribute

Returns

Attribute – Medkit attribute with normalized measurement value and “unit” metadata

DEFAULT_ATTRIBUTE_FACTORIES = {'adicap': <function build_adicap_attribute>, 'bmi': <function build_measurement_attribute>, 'consultation_date': <function build_date_attribute>, 'date': <function build_date_attribute>, 'duration': <function build_duration_attribute>, 'size': <function build_measurement_attribute>, 'tnm': <function build_tnm_attribute>, 'volume': <function build_measurement_attribute>, 'weight': <function build_measurement_attribute>}#

Pre-defined attribute factories to handle EDS-NLP attributes
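
As an illustration of the factory contract (a callable receiving a spacy span and an attribute label, and returning an attribute), the sketch below defines a custom factory and merges it with the defaults. The factory build_upper_attribute and the "grade" span extension are hypothetical, and the fallback definitions exist only so the sketch runs without the optional dependencies. It assumes Attribute is importable from medkit.core and that extension values can be read with spacy_span._.get(label).

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any

try:
    from medkit.core import Attribute
    from medkit.text.spacy.edsnlp import DEFAULT_ATTRIBUTE_FACTORIES
except ImportError:
    # Stand-ins so the sketch runs without medkit[edsnlp] installed.
    @dataclass
    class Attribute:
        label: str
        value: Any = None

    DEFAULT_ATTRIBUTE_FACTORIES = {}


def build_upper_attribute(spacy_span, spacy_label):
    """Hypothetical factory: store the extension value as upper-cased text."""
    raw_value = spacy_span._.get(spacy_label)
    return Attribute(label=spacy_label, value=str(raw_value).upper())


# Keep the pre-defined factories and add one for the hypothetical label.
attribute_factories = {
    **DEFAULT_ATTRIBUTE_FACTORIES,
    "grade": build_upper_attribute,
}

# Duck-typed stand-in for a spacy span exposing `span._.get(name)`.
fake_span = SimpleNamespace(
    _=SimpleNamespace(get=lambda name: {"grade": "ii"}[name])
)

# A pipeline would look up the factory by attribute label and call it.
attr = attribute_factories["grade"](fake_span, "grade")
print(attr.label, attr.value)
```

Such a mapping would be passed as medkit_attribute_factories when creating EDSNLPPipeline or EDSNLPDocPipeline. Merging with DEFAULT_ATTRIBUTE_FACTORIES explicitly keeps the default conversions available regardless of whether the constructor merges them itself.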