medkit.io.spacy
This module requires extra dependencies that are not installed with the core medkit package. To install them, run pip install medkit-lib[spacy].
Classes:
SpacyInputConverter – Class in charge of converting spacy documents into a collection of TextDocuments.
SpacyOutputConverter – Class in charge of converting a list of TextDocuments into a list of spacy documents.
- class SpacyInputConverter(entities=None, span_groups=None, attrs=None, uid=None)
Class in charge of converting spacy documents into a collection of TextDocuments.
Initialize the spacy input converter
- Parameters
entities (Optional[List[str]]) – Labels of spacy entities (doc.ents) to convert into medkit entities. If None (default), all spacy entities are converted and added to the resulting medkit document.
span_groups (Optional[List[str]]) – Names of the groups of spacy spans (doc.spans) to convert into medkit segments. If None (default), all groups of spacy spans are converted and added to the medkit document.
attrs (Optional[List[str]]) – Names of the span extensions to convert into medkit attributes. If None (default), all non-None extensions are added to each annotation.
uid (Optional[str]) – Identifier of the converter
Methods:
load(spacy_docs) – Create a list of TextDocuments from a list of spacy Doc objects.
- load(spacy_docs)
Create a list of TextDocuments from a list of spacy Doc objects. Depending on the configuration of the converter, the selected annotations and attributes are included in the documents.
- Parameters
spacy_docs (List[Doc]) – A list of spacy documents to convert
- Return type
List[TextDocument]
- Returns
A list of TextDocuments
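As an illustration of load, the following minimal sketch converts one hand-annotated spacy document into medkit. It assumes that spacy and medkit-lib[spacy] are installed. The helper name spacy_to_medkit, the sample text, and the entity labels are hypothetical; a blank English pipeline is used so that no trained model is needed.

```python
def spacy_to_medkit():
    """Convert one spacy Doc into a list of medkit TextDocuments (sketch)."""
    # Imports are kept local so this snippet can be loaded even when the
    # optional dependencies are missing.
    import spacy
    from spacy.tokens import Span
    from medkit.io.spacy import SpacyInputConverter

    # A blank pipeline only tokenizes, so the entities are set by hand.
    nlp = spacy.blank("en")
    spacy_doc = nlp("Marie Curie lived in Paris.")
    spacy_doc.ents = [
        Span(spacy_doc, 0, 2, label="PERSON"),  # tokens "Marie", "Curie"
        Span(spacy_doc, 4, 5, label="GPE"),  # token "Paris"
    ]

    # Only entities labelled PERSON are requested. span_groups and attrs
    # keep their default (None), which means "convert everything found".
    converter = SpacyInputConverter(entities=["PERSON"])
    return converter.load([spacy_doc])
```

Calling spacy_to_medkit() should return a list holding a single TextDocument built from the spacy document. How the converted annotations are then read from that document depends on the medkit version in use.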
- class SpacyOutputConverter(nlp, apply_nlp_spacy=False, labels_anns=None, attrs=None, uid=None)
Class in charge of converting a list of TextDocuments into a list of spacy documents
Initialize the spacy output converter
- Parameters
nlp (Language) – Language object with the loaded spacy pipeline
apply_nlp_spacy (bool) – If True, each component of the nlp pipeline is applied to the new spacy document. Some features, such as POS tags, are added by a pipeline component, so this parameter must be True to obtain them. If False, the nlp pipeline is not applied to the spacy document, which then contains only the annotations and attributes transferred by medkit.
labels_anns (Optional[List[str]]) – Labels of medkit annotations to include in the spacy document. If None (default), all annotations are included.
attrs (Optional[List[str]]) – Labels of medkit attributes to add to the included annotations. If None (default), all attributes are added as custom attributes on each included annotation.
uid (Optional[str]) – Identifier of the pipeline
Methods:
convert(medkit_docs) – Convert a list of TextDocuments into a list of spacy Doc objects.
- convert(medkit_docs)
Convert a list of TextDocuments into a list of spacy Doc objects. Depending on the configuration of the converter, the selected annotations and attributes are included in the documents.
- Parameters
medkit_docs (List[TextDocument]) – A list of TextDocuments to convert
- Return type
List[Doc]
- Returns
A list of spacy Doc objects
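The reverse direction can be sketched in the same way. The example below wraps raw strings in TextDocuments and hands them to convert. It assumes that spacy and medkit-lib[spacy] are installed and that TextDocument can be imported from medkit.core.text and built with a text keyword, neither of which is stated on this page. The helper name medkit_to_spacy and its apply_pipeline argument are hypothetical.

```python
def medkit_to_spacy(texts, apply_pipeline=False):
    """Convert raw texts, wrapped as TextDocuments, into spacy Docs (sketch)."""
    # Imports are kept local so this snippet can be loaded even when the
    # optional dependencies are missing.
    import spacy
    from medkit.core.text import TextDocument  # assumed import path
    from medkit.io.spacy import SpacyOutputConverter

    # Blank English pipeline with a rule-based sentence splitter, so no
    # trained model has to be downloaded for this illustration.
    nlp = spacy.blank("en")
    nlp.add_pipe("sentencizer")

    medkit_docs = [TextDocument(text=text) for text in texts]

    # apply_nlp_spacy decides whether the components of nlp are run on the
    # new spacy documents. labels_anns and attrs keep their default (None),
    # so every annotation and attribute is transferred.
    converter = SpacyOutputConverter(nlp=nlp, apply_nlp_spacy=apply_pipeline)
    return converter.convert(medkit_docs)
```

For example, medkit_to_spacy(["First note.", "Second note."]) should return two spacy Doc objects. Passing apply_pipeline=True additionally runs the sentencizer on each of them, as described for apply_nlp_spacy.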