medkit.text.relations.syntactic_relation_extractor
This module requires extra dependencies that are not installed with the core dependencies of medkit. To install them, use pip install medkit-lib[syntactic-relation-extractor].
Classes:
SyntacticRelationExtractor – Extractor of syntactic relations between entities in a TextDocument.
- class SyntacticRelationExtractor(name_spacy_model='fr_core_news_sm', relation_label='has_syntactic_rel', entities_source=None, entities_target=None, name=None, uid=None)
Extractor of syntactic relations between entities in a TextDocument. The relation relies on the dependency parser from a spacy pipeline. A transition-based dependency parser defines a dependency tag for each token (word) in a document. This relation extractor uses syntactic neighbours of the words of an entity to determine whether a dependency exists between the entities.
Each TextDocument is converted to a spacy doc containing the entities of interest. The user can provide the labels of the entities to be used as sources and targets of the relation, but it is also possible to leave the source and/or target labels unrestricted. If neither source nor target labels are provided, the SyntacticRelationExtractor detects relations among all entities in the document, and the order of each relation is the syntactic order.
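The idea can be illustrated without spacy. The following is a minimal sketch, not medkit's implementation: it assumes a hand-written dependency parse in which each token stores the index of its head, and the names Token, Entity and find_syntactic_pairs are hypothetical. Two entities are treated as related when a token of one has its head inside the other; the entity holding the head is taken as the source, which is only one possible reading of "syntactic order".

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Token:
    text: str
    head: int  # index of the syntactic head; the root points to itself


@dataclass
class Entity:
    label: str
    start: int  # index of the first token (inclusive)
    end: int    # index past the last token (exclusive)

    def covers(self, index: int) -> bool:
        return self.start <= index < self.end


def find_syntactic_pairs(
    tokens: List[Token], entities: List[Entity]
) -> List[Tuple[str, str]]:
    """Return (governor_label, dependent_label) for directly linked entities."""
    pairs = []
    for governor in entities:
        for dependent in entities:
            if governor is dependent:
                continue
            # Related if any token of `dependent` has its head inside `governor`.
            dependent_heads = (
                tokens[i].head for i in range(dependent.start, dependent.end)
            )
            if any(governor.covers(head) for head in dependent_heads):
                pairs.append((governor.label, dependent.label))
    return pairs


# Hand-written parse of "Patient takes aspirin 500 mg" (an assumption, not
# the output of a real parser): "500" depends on "mg", "mg" on "aspirin".
tokens = [
    Token("Patient", 1),
    Token("takes", 1),
    Token("aspirin", 1),
    Token("500", 4),
    Token("mg", 2),
]
entities = [
    Entity("person", 0, 1),
    Entity("drug", 2, 3),
    Entity("dose", 3, 5),
]
pairs = find_syntactic_pairs(tokens, entities)
print(pairs)
```

In this toy parse only the drug and dose entities are linked, because "mg" attaches directly to "aspirin", whereas "Patient" and "aspirin" both attach to the verb, which belongs to no entity.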
Initialize the syntactic relation extractor
- Parameters
name_spacy_model (str) – Name or path of the spacy pipeline to load. The pipeline should include a syntactic dependency parser. To obtain consistent results, the spacy model should have the same language as the documents in which relations are to be found.
relation_label (str) – Label of identified relations
entities_source (List[str]) – Labels of medkit entities to use as source of the relation. If None, any entity can be used as source.
entities_target (List[str]) – Labels of medkit entities to use as target of the relation. If None, any entity can be used as target.
name (Optional[str]) – Name describing the relation extractor (defaults to the class name)
uid (str) – Identifier of the relation extractor
- Raises
ValueError – If the spacy model defined by name_spacy_model does not parse a document
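How the entities_source and entities_target filters might interact can be sketched as follows. This is an illustrative assumption, not medkit's code: orient_relation is a hypothetical helper that receives the labels of two related entities in syntactic order and returns the (source, target) pair to keep, or None if the pair is rejected. It assumes that a pair found in the reverse direction is swapped to match the requested labels.

```python
from typing import List, Optional, Tuple


def orient_relation(
    first: str,
    second: str,
    entities_source: Optional[List[str]] = None,
    entities_target: Optional[List[str]] = None,
) -> Optional[Tuple[str, str]]:
    """Pick (source, target) labels for a related pair, or None to reject it."""

    def allowed(source: str, target: str) -> bool:
        # A filter set to None accepts any label.
        source_ok = entities_source is None or source in entities_source
        target_ok = entities_target is None or target in entities_target
        return source_ok and target_ok

    if allowed(first, second):
        return (first, second)  # syntactic order is kept
    if allowed(second, first):
        return (second, first)  # assumption: the pair is swapped to fit
    return None


# No restriction: the syntactic order is kept.
print(orient_relation("dose", "drug"))
# Source restricted to "drug": the pair is swapped in this sketch.
print(orient_relation("dose", "drug", entities_source=["drug"]))
# Neither label is an allowed source: the pair is rejected.
print(orient_relation("dose", "drug", entities_source=["date"]))
```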
Methods:
run(documents) – Add relations to each document from documents
set_prov_tracer(prov_tracer) – Enable provenance tracing.
Attributes:
description – Contains all the operation init parameters.
- run(documents)
Add relations to each document from documents
- Parameters
documents (List[TextDocument]) – List of text documents in which relations are to be found
- property description: medkit.core.operation_desc.OperationDescription
Contains all the operation init parameters.
- Return type
OperationDescription
- set_prov_tracer(prov_tracer)
Enable provenance tracing.
- Parameters
prov_tracer (ProvTracer) – The provenance tracer used to trace the provenance.