medkit.core.text.annotation

Classes:

Entity(label, text, spans[, attrs, ...])

Text entity referencing part of a TextDocument.

Relation(label, source_id, target_id[, ...])

Relation between two text entities.

Segment(label, text, spans[, attrs, ...])

Text segment referencing part of a TextDocument.

TextAnnotation(label[, attrs, metadata, ...])

Base abstract class for all text annotations

class TextAnnotation(label, attrs=None, metadata=None, uid=None, attr_container_class=<class 'AttributeContainer'>)

Base abstract class for all text annotations

Variables
  • uid (str) – Unique identifier of the annotation.

  • label (str) – The label for this annotation (e.g., SENTENCE)

  • attrs (medkit.core.attribute_container.AttributeContainer) – Attributes of the annotation. Stored in an AttributeContainer but can be passed as a list at init.

  • metadata (Dict[str, Any]) – The metadata of the annotation

  • keys (Set[str]) – Pipeline output keys to which the annotation belongs.

Methods:

get_subclass_for_data_dict(data_dict)

Return the subclass that corresponds to the class name found in a data dict

classmethod get_subclass_for_data_dict(data_dict)

Return the subclass that corresponds to the class name found in a data dict

Parameters

data_dict (Dict[str, Any]) – Data dict returned by the to_dict() method of a subclass (or of the base class itself)

Return type

Optional[Type[Self]]

Returns

subclass – Subclass that generated data_dict, or None if data_dict corresponds to the base class itself.
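
The lookup described above can be illustrated with a minimal, self-contained sketch. It does not use medkit and is not medkit's implementation: the classes BaseAnnotation and SentenceAnnotation, the registry attribute, and the "class_name" dict key are all hypothetical names chosen for illustration. It only shows the general pattern of recording a class name when serializing and resolving it back to a subclass, with None returned for the base class.

```python
from typing import Any, Dict, Optional, Type


class BaseAnnotation:
    """Hypothetical stand-in for a base annotation class (not medkit code)."""

    # Maps class name -> subclass; filled in as subclasses are defined.
    _subclasses: Dict[str, Type["BaseAnnotation"]] = {}

    def __init__(self, label: str) -> None:
        self.label = label

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        BaseAnnotation._subclasses[cls.__name__] = cls

    def to_dict(self) -> Dict[str, Any]:
        # "class_name" is an assumed key, chosen for this sketch only.
        return {"class_name": type(self).__name__, "label": self.label}

    @classmethod
    def get_subclass_for_data_dict(
        cls, data_dict: Dict[str, Any]
    ) -> Optional[Type["BaseAnnotation"]]:
        class_name = data_dict["class_name"]
        if class_name == cls.__name__:
            return None  # the dict was produced by the class itself
        return cls._subclasses.get(class_name)


class SentenceAnnotation(BaseAnnotation):
    """Hypothetical subclass used only to exercise the lookup."""


# A dict produced by the subclass resolves to that subclass.
subclass = BaseAnnotation.get_subclass_for_data_dict(
    SentenceAnnotation("SENTENCE").to_dict()
)
# A dict produced by the base class resolves to None.
base_result = BaseAnnotation.get_subclass_for_data_dict(
    BaseAnnotation("SENTENCE").to_dict()
)
```

A caller would typically use the returned subclass to pick the right from_dict() when deserializing a mixed list of annotation dicts.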

class Segment(label, text, spans, attrs=None, metadata=None, uid=None, store=None, attr_container_class=<class 'AttributeContainer'>)

Text segment referencing part of a TextDocument.

Variables
  • uid (str) – The segment identifier.

  • label (str) – The label for this segment (e.g., SENTENCE)

  • text (str) – Text of the segment.

  • spans (List[medkit.core.text.span.AnySpan]) – List of spans indicating which parts of the segment text correspond to which part of the document’s full text.

  • attrs (medkit.core.attribute_container.AttributeContainer) – Attributes of the segment. Stored in an AttributeContainer but can be passed as a list at init.

  • metadata (Dict[str, Any]) – The metadata of the segment

  • keys (Set[str]) – Pipeline output keys to which the segment belongs.
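
To make the relationship between text and spans concrete, here is a small sketch that does not depend on medkit. SketchSpan and extract_text are hypothetical names, and the sketch assumes the simplest case only: plain character ranges where the segment text is exactly the document text covered by the spans. The real span types may cover more cases (for instance, text that was modified after extraction), which this sketch does not attempt to model.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SketchSpan:
    """Hypothetical character range [start, end) in the document text."""

    start: int
    end: int


def extract_text(document_text: str, spans: List[SketchSpan]) -> str:
    """Concatenate the document characters covered by each span, in order."""
    return "".join(document_text[s.start:s.end] for s in spans)


document_text = "The patient has asthma. No fever."

# A contiguous segment: the first sentence.
sentence_spans = [SketchSpan(0, 23)]
sentence_text = extract_text(document_text, sentence_spans)

# A discontinuous segment made of two separate ranges.
discontinuous_spans = [SketchSpan(4, 11), SketchSpan(15, 22)]
discontinuous_text = extract_text(document_text, discontinuous_spans)
```

Keeping spans as a list rather than a single range is what allows a segment to cover non-adjacent parts of the document.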

Methods:

from_dict(segment_dict)

Creates a Segment from a dict

get_subclass_for_data_dict(data_dict)

Return the subclass that corresponds to the class name found in a data dict

classmethod from_dict(segment_dict)

Creates a Segment from a dict

Parameters

segment_dict (dict) – A dictionary from a serialized segment as generated by to_dict()

Return type

Self
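
The Self return type means that from_dict() called on a subclass is expected to return an instance of that subclass. The following sketch shows that round-trip pattern with stand-in classes; SketchSegment and SketchEntity are hypothetical, and the dict layout is an assumption made for this example rather than the format medkit's to_dict() produces.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class SketchSegment:
    """Hypothetical stand-in for a serializable segment (not medkit code)."""

    label: str
    text: str
    spans: List[Tuple[int, int]]
    metadata: Dict[str, Any] = field(default_factory=dict)
    uid: str = field(default_factory=lambda: str(uuid.uuid4()))

    def to_dict(self) -> Dict[str, Any]:
        # Assumed layout: plain JSON-friendly types only.
        return {
            "uid": self.uid,
            "label": self.label,
            "text": self.text,
            "spans": [list(span) for span in self.spans],
            "metadata": dict(self.metadata),
        }

    @classmethod
    def from_dict(cls, segment_dict: Dict[str, Any]) -> "SketchSegment":
        # Using cls (not SketchSegment) is what makes subclasses round-trip.
        return cls(
            label=segment_dict["label"],
            text=segment_dict["text"],
            spans=[(start, end) for start, end in segment_dict["spans"]],
            metadata=dict(segment_dict["metadata"]),
            uid=segment_dict["uid"],
        )


class SketchEntity(SketchSegment):
    """Hypothetical subclass that inherits from_dict() unchanged."""


original = SketchEntity(label="DISEASE", text="asthma", spans=[(16, 22)])
restored = SketchEntity.from_dict(original.to_dict())
```

Because the uid is stored in the dict and passed back to the constructor, the restored object keeps the identifier of the original instead of receiving a new one.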

classmethod get_subclass_for_data_dict(data_dict)

Return the subclass that corresponds to the class name found in a data dict

Parameters

data_dict (Dict[str, Any]) – Data dict returned by the to_dict() method of a subclass (or of the base class itself)

Return type

Optional[Type[Self]]

Returns

subclass – Subclass that generated data_dict, or None if data_dict corresponds to the base class itself.

class Entity(label, text, spans, attrs=None, metadata=None, uid=None, store=None, attr_container_class=<class 'EntityAttributeContainer'>)

Text entity referencing part of a TextDocument.

Variables
  • uid (str) – The entity identifier.

  • label (str) – The label for this entity (e.g., DISEASE)

  • text (str) – Text of the entity.

  • spans (List[medkit.core.text.span.AnySpan]) – List of spans indicating which parts of the entity text correspond to which part of the document’s full text.

  • attrs (medkit.core.text.entity_attribute_container.EntityAttributeContainer) – Attributes of the entity. Stored in an EntityAttributeContainer but can be passed as a list at init.

  • metadata (Dict[str, Any]) – The metadata of the entity

  • keys (Set[str]) – Pipeline output keys to which the entity belongs.

Methods:

from_dict(segment_dict)

Creates a Segment from a dict

get_subclass_for_data_dict(data_dict)

Return the subclass that corresponds to the class name found in a data dict

classmethod from_dict(segment_dict)

Creates a Segment from a dict

Parameters

segment_dict (dict) – A dictionary from a serialized segment as generated by to_dict()

Return type

Self

classmethod get_subclass_for_data_dict(data_dict)

Return the subclass that corresponds to the class name found in a data dict

Parameters

data_dict (Dict[str, Any]) – Data dict returned by the to_dict() method of a subclass (or of the base class itself)

Return type

Optional[Type[Self]]

Returns

subclass – Subclass that generated data_dict, or None if data_dict corresponds to the base class itself.

class Relation(label, source_id, target_id, attrs=None, metadata=None, uid=None, store=None, attr_container_class=<class 'AttributeContainer'>)

Relation between two text entities.

Variables
  • uid (str) – The identifier of the relation

  • label (str) – The relation label

  • source_id (str) – The identifier of the entity from which the relation is defined

  • target_id (str) – The identifier of the entity to which the relation is defined

  • attrs (medkit.core.attribute_container.AttributeContainer) – The attributes of the relation

  • metadata (Dict[str, Any]) – The metadata of the relation

  • keys (Set[str]) – Pipeline output keys to which the relation belongs
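
Since a relation stores only the identifiers of its source and target, resolving it back to entities requires a lookup by uid. The sketch below illustrates that idea with stand-in dataclasses; LinkedEntity, SketchRelation, and the entities_by_uid index are hypothetical and are not part of medkit.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict


def new_uid() -> str:
    """Generate a unique identifier (assumed scheme for this sketch)."""
    return str(uuid.uuid4())


@dataclass
class LinkedEntity:
    """Hypothetical stand-in for an entity with a unique identifier."""

    label: str
    text: str
    uid: str = field(default_factory=new_uid)


@dataclass
class SketchRelation:
    """Hypothetical directed relation that references entities by uid."""

    label: str
    source_id: str
    target_id: str
    uid: str = field(default_factory=new_uid)


drug = LinkedEntity(label="DRUG", text="aspirin")
symptom = LinkedEntity(label="SYMPTOM", text="nausea")

# The relation is directed: from the drug (source) to the symptom (target).
relation = SketchRelation(
    label="CAUSES", source_id=drug.uid, target_id=symptom.uid
)

# Resolve the identifiers back to entities through an index keyed by uid.
entities_by_uid: Dict[str, LinkedEntity] = {e.uid: e for e in (drug, symptom)}
source = entities_by_uid[relation.source_id]
target = entities_by_uid[relation.target_id]
```

Referencing entities by identifier rather than by object keeps the relation easy to serialize, at the cost of needing such an index when reading it back.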

Methods:

from_dict(relation_dict)

Creates a Relation from a dict

get_subclass_for_data_dict(data_dict)

Return the subclass that corresponds to the class name found in a data dict

classmethod get_subclass_for_data_dict(data_dict)

Return the subclass that corresponds to the class name found in a data dict

Parameters

data_dict (Dict[str, Any]) – Data dict returned by the to_dict() method of a subclass (or of the base class itself)

Return type

Optional[Type[Self]]

Returns

subclass – Subclass that generated data_dict, or None if data_dict corresponds to the base class itself.

classmethod from_dict(relation_dict)

Creates a Relation from a dict

Parameters

relation_dict (dict) – A dictionary from a serialized relation as generated by to_dict()

Return type

Self