medkit.core.pipeline
Classes:
Pipeline – Graph of processing operations
PipelineStep – Pipeline item describing how a processing operation is connected to others
PipelineCompatibleOperation
- class Pipeline(steps, input_keys, output_keys, name=None, uid=None)
Graph of processing operations
A pipeline is made of pipeline steps, which connect processing operations to one another through input/output keys. Each operation can be seen as a node and the keys as its edges. Two operations are chained by using the same string as an output key of the first operation and as an input key of the second.
Steps must be added in order of execution; there is no dependency detection mechanism.
Initialize the pipeline
- Parameters
steps (List[PipelineStep]) – List of pipeline steps. Steps will be executed in the order in which they were added, so make sure to add first the steps generating data used by other steps.
input_keys (List[str]) – List of keys corresponding to the inputs passed to run()
output_keys (List[str]) – List of keys corresponding to the outputs returned by run()
name (Optional[str]) – Name describing the pipeline (defaults to the class name)
uid (Optional[str]) – Identifier of the pipeline
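The key-based chaining described above can be illustrated with a small stand-alone sketch. It does not import medkit and is not medkit's implementation: `ToyStep`, `run_toy_pipeline`, `split_words` and `uppercase` are hypothetical names introduced only to show how data might flow between steps through shared keys, and why steps must be listed in execution order. For readability, the sketch returns a dict keyed by output key rather than following the return convention of run().

```python
from typing import Any, Callable, Dict, List, NamedTuple


class ToyStep(NamedTuple):
    """Hypothetical stand-in for a pipeline step (not medkit's PipelineStep)."""

    operation: Callable[..., Any]
    input_keys: List[str]
    output_keys: List[str]


def run_toy_pipeline(steps, input_keys, output_keys, *all_input_data):
    # Map each pipeline input key to the list of items passed for it
    data_by_key: Dict[str, List[Any]] = dict(zip(input_keys, all_input_data))
    # Steps run strictly in the order given; nothing is reordered
    for step in steps:
        args = [data_by_key[key] for key in step.input_keys]
        result = step.operation(*args)
        if len(step.output_keys) == 1:
            data_by_key[step.output_keys[0]] = result
        else:
            # Zero or several output keys: store each returned list, if any
            for key, value in zip(step.output_keys, result or ()):
                data_by_key[key] = value
    return {key: data_by_key[key] for key in output_keys}


def split_words(texts):
    return [word for text in texts for word in text.split()]


def uppercase(words):
    return [word.upper() for word in words]


steps = [
    # "words" is the output key of the first step and the input key of the second
    ToyStep(split_words, input_keys=["raw"], output_keys=["words"]),
    ToyStep(uppercase, input_keys=["words"], output_keys=["upper"]),
]
result = run_toy_pipeline(steps, ["raw"], ["upper"], ["chest pain", "no fever"])
print(result)  # {'upper': ['CHEST', 'PAIN', 'NO', 'FEVER']}
```

In this sketch, passing the same steps in reverse order raises a KeyError, because the "words" key has not been produced yet when the second operation asks for it.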
Methods:
run(*all_input_data) – Run the pipeline.
- run(*all_input_data)
Run the pipeline.
- Parameters
*all_input_data (List[Any]) – Input data expected by the pipeline. The number of arguments must match the number of pipeline input_keys. For each input key, the corresponding input data must be a list of items that can be of any type.
- Return type
Union[None, List[Any], Tuple[List[Any], …]]
- Returns
All output data returned by the pipeline, with one entry per pipeline output key. For each output key, the corresponding output will be a list of items that can be of any type. If the pipeline has only one output key, the corresponding output is returned directly, not wrapped in a tuple. If the pipeline doesn’t have any output key, nothing (i.e. None) is returned.
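The three possible shapes of the return value can be summarized with a short sketch. This is an illustration of the documented convention only, assuming the pipeline keeps its intermediate data in a dict keyed by string; the helper name `collect_outputs` is hypothetical and is not part of medkit.

```python
from typing import Any, Dict, List, Tuple, Union


def collect_outputs(
    data_by_key: Dict[str, List[Any]], output_keys: List[str]
) -> Union[None, List[Any], Tuple[List[Any], ...]]:
    # One list per output key, in the order of output_keys
    outputs = tuple(data_by_key[key] for key in output_keys)
    if len(outputs) == 0:
        # No output key: nothing is returned
        return None
    if len(outputs) == 1:
        # Single output key: return the list itself, not a 1-tuple
        return outputs[0]
    return outputs


data = {"sentences": ["s1", "s2"], "entities": ["e1"]}
print(collect_outputs(data, []))                         # None
print(collect_outputs(data, ["entities"]))               # ['e1']
print(collect_outputs(data, ["sentences", "entities"]))  # (['s1', 's2'], ['e1'])
```

Callers that handle pipelines with a variable number of output keys may therefore need to check whether they received None, a list, or a tuple of lists.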
- class PipelineStep(operation, input_keys, output_keys, aggregate_input_keys=False)
Pipeline item describing how a processing operation is connected to others
- Parameters
operation (medkit.core.pipeline.PipelineCompatibleOperation) – The operation to use at that step
input_keys (List[str]) – For each input of operation, the key to use to retrieve the corresponding annotations (either retrieved from a document or generated by an earlier pipeline step)
output_keys (List[str]) – For each output of operation, the key used to pass output annotations to the next pipeline step. Can be empty if operation doesn’t return new annotations.
aggregate_input_keys (bool) – If True, all the annotations from multiple input keys are aggregated in a single list. Defaults to False
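One plausible reading of aggregate_input_keys is sketched below: without aggregation, the operation receives one list per input key; with aggregation, it receives a single list containing the items of all input keys. The helper `gather_step_inputs` is hypothetical, and the concatenation order (the order of input_keys) is an assumption rather than something the documentation states.

```python
from typing import Any, Dict, List


def gather_step_inputs(
    data_by_key: Dict[str, List[Any]],
    input_keys: List[str],
    aggregate_input_keys: bool = False,
) -> List[List[Any]]:
    """Build the positional arguments passed to an operation's run()."""
    per_key = [data_by_key[key] for key in input_keys]
    if aggregate_input_keys:
        # Assumed behavior: flatten all per-key lists into one single list
        return [[item for items in per_key for item in items]]
    return per_key


data = {"sentences": ["s1", "s2"], "titles": ["t1"]}
print(gather_step_inputs(data, ["sentences", "titles"]))
# [['s1', 's2'], ['t1']]  -> operation.run(sentences, titles)
print(gather_step_inputs(data, ["sentences", "titles"], aggregate_input_keys=True))
# [['s1', 's2', 't1']]    -> operation.run(all_items)
```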
- class PipelineCompatibleOperation(*args, **kwargs)
Methods:
run(*all_input_data) – Takes one or several lists of data items to process.
- run(*all_input_data)
- Parameters
all_input_data (List[Any]) – One or several lists of data items to process (according to the number of inputs the operation needs)
- Return type
Union[None, List[Any], Tuple[List[Any], …]]
- Returns
Tuple of lists of all new data items created by the operation. Can be None if the operation does not create any new data items but rather modifies existing items in-place (for instance by adding attributes to existing annotations). If there is only one list of created data items, that list can be returned directly without wrapping it in a tuple.
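The contract above only requires a run() method with a compatible signature, which can be sketched with a structural type. The example below is stand-alone and does not import medkit: `OperationLike`, `Tokenizer` and `LengthTagger` are hypothetical names, and modelling the interface with typing.Protocol is an assumption made for illustration. It shows one operation that returns a new list and one that modifies items in-place and returns None.

```python
from typing import Any, Dict, List, Protocol, runtime_checkable


@runtime_checkable
class OperationLike(Protocol):
    """Hypothetical structural type: any object exposing run(*all_input_data)."""

    def run(self, *all_input_data: List[Any]) -> Any: ...


class Tokenizer:
    def run(self, texts: List[str]) -> List[str]:
        # Creates new data items: a single list, returned without a tuple
        return [word for text in texts for word in text.split()]


class LengthTagger:
    def run(self, items: List[Dict[str, Any]]) -> None:
        # Modifies existing items in-place and creates nothing new
        for item in items:
            item["length"] = len(item["text"])
        return None


records = [{"text": "fever"}, {"text": "cough"}]
tokens = Tokenizer().run(["chest pain", "no fever"])
tagged = LengthTagger().run(records)

print(isinstance(Tokenizer(), OperationLike))  # True
print(tokens)   # ['chest', 'pain', 'no', 'fever']
print(tagged)   # None
print(records)  # [{'text': 'fever', 'length': 5}, {'text': 'cough', 'length': 5}]
```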