medkit.text.relations.syntactic_relation_extractor

This module requires extra dependencies that are not installed with the core dependencies of medkit. To install them, run pip install medkit-lib[syntactic-relation-extractor].

Classes:

SyntacticRelationExtractor([...])

Extractor of syntactic relations between entities in a TextDocument.

class SyntacticRelationExtractor(name_spacy_model='fr_core_news_sm', relation_label='has_syntactic_rel', entities_source=None, entities_target=None, name=None, uid=None)

Extractor of syntactic relations between entities in a TextDocument. The extraction relies on the dependency parser of a spaCy pipeline. A transition-based dependency parser assigns a dependency tag to each token (word) in a document. This relation extractor inspects the syntactic neighbours of the words of each entity to determine whether a dependency exists between two entities.
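The idea of linking entities through syntactic neighbours can be illustrated with a small, dependency-free sketch. It is not medkit's implementation: the parse is written by hand, "neighbours" is taken to mean the head and direct children of a token, and the names TOKENS, HEADS, syntactic_neighbours and find_relations are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hand-written toy parse of "aspirine contre la fièvre".
# HEADS[i] is the index of the syntactic head of token i; the root token
# points to itself (the convention spaCy also uses).
TOKENS = ["aspirine", "contre", "la", "fièvre"]
HEADS = [0, 3, 3, 0]


def syntactic_neighbours(index: int, heads: List[int]) -> Set[int]:
    """Return the head and the direct children of the token at `index`."""
    neighbours = {i for i, head in enumerate(heads) if head == index and i != index}
    if heads[index] != index:  # the root has no head of its own
        neighbours.add(heads[index])
    return neighbours


def find_relations(
    entities: Dict[str, Set[int]], heads: List[int]
) -> List[Tuple[str, str]]:
    """Pair two entities when a token of one neighbours a token of the other.

    `entities` maps an entity name to the set of token indices it covers.
    Pairs are reported in text order, which is an assumption of this sketch.
    """
    names = sorted(entities, key=lambda name: min(entities[name]))
    relations = []
    for position, first in enumerate(names):
        for second in names[position + 1:]:
            linked = any(
                syntactic_neighbours(i, heads) & entities[second]
                for i in entities[first]
            )
            if linked:
                relations.append((first, second))
    return relations


# "fièvre" (token 3) is a direct child of "aspirine" (token 0).
print(find_relations({"drug": {0}, "symptom": {3}}, HEADS))  # [('drug', 'symptom')]
```

In the real extractor the heads come from the spaCy dependency parser rather than from a hand-written list, and the exact neighbourhood rule may differ from the one assumed here.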

Each TextDocument is converted to a spaCy doc containing the entities of interest. The user can provide the labels of the entities to use as sources and targets of the relation, but it is also possible to leave the source and/or target labels unrestricted. If neither source nor target labels are provided, the SyntacticRelationExtractor detects relations among all entities in the document, and the order of each relation follows the syntactic order.

Initialize the syntactic relation extractor

Parameters
  • name_spacy_model (str) – Name or path of a spaCy pipeline to load. It should include a syntactic dependency parser. To obtain consistent results, the spaCy model should be in the same language as the documents in which relations are to be found.

  • relation_label (str) – Label of identified relations

  • entities_source (List[str]) – Labels of medkit entities to use as source of the relation. If None, any entity can be used as source.

  • entities_target (List[str]) – Labels of medkit entities to use as target of the relation. If None, any entity can be used as target.

  • name (Optional[str]) – Name describing the relation extractor (defaults to the class name)

  • uid (str) – Identifier of the relation extractor

Raises

ValueError – If the spaCy model defined by name_spacy_model cannot parse a document (for example, because it lacks a syntactic dependency parser)
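A minimal usage sketch, assuming the optional extra and the fr_core_news_sm pipeline are installed. It uses only the constructor parameters and the run method documented on this page; the helper names build_extractor and add_relations are hypothetical, and the import is deferred so the snippet can be loaded without the extra.

```python
from typing import List, Optional


def build_extractor(
    entities_source: Optional[List[str]] = None,
    entities_target: Optional[List[str]] = None,
):
    """Create a SyntacticRelationExtractor restricted to the given labels."""
    # Imported lazily: the module needs the optional
    # medkit-lib[syntactic-relation-extractor] dependencies.
    from medkit.text.relations.syntactic_relation_extractor import (
        SyntacticRelationExtractor,
    )

    # May raise ValueError if the spaCy model is unsuitable (see Raises).
    return SyntacticRelationExtractor(
        name_spacy_model="fr_core_news_sm",  # documented default
        relation_label="has_syntactic_rel",  # documented default
        entities_source=entities_source,  # None: any entity can be a source
        entities_target=entities_target,  # None: any entity can be a target
    )


def add_relations(documents, extractor=None):
    """Run the extractor on a list of TextDocument and return the same list."""
    if extractor is None:
        extractor = build_extractor()
    extractor.run(documents)  # relations are added to each document
    return documents
```

For instance, build_extractor(entities_source=["medication"], entities_target=["disease"]) would only consider relations from entities labelled "medication" to entities labelled "disease"; both labels are hypothetical and should match the labels actually present in the documents.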

Methods:

run(documents)

Add relations to each document in documents

set_prov_tracer(prov_tracer)

Enable provenance tracing.

Attributes:

description

Contains all the operation init parameters.

run(documents)

Add relations to each document in documents

Parameters

documents (List[TextDocument]) – List of text documents in which relations are to be found

property description: medkit.core.operation_desc.OperationDescription

Contains all the operation init parameters.

Return type

OperationDescription

set_prov_tracer(prov_tracer)

Enable provenance tracing.

Parameters

prov_tracer (ProvTracer) – The provenance tracer used to trace the provenance.