Utilities for pipelines¶
This page lists all the utility functions the library provides for pipelines.
Most of those are only useful if you are studying the code of the models in the library.
Argument handling¶
- class transformers.pipelines.ArgumentHandler[source]¶
  Base interface for handling arguments for each Pipeline.
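  A custom pipeline can normalize its inputs by subclassing this interface and implementing __call__. A minimal sketch (the handler name and normalization logic below are hypothetical, not part of the library):

  ```python
  from transformers.pipelines import ArgumentHandler


  class SingleColumnArgumentHandler(ArgumentHandler):
      """Hypothetical handler that normalizes inputs to a list of strings."""

      def __call__(self, inputs, **kwargs):
          if isinstance(inputs, str):
              return [inputs]
          return list(inputs)
  ```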
- class transformers.pipelines.ZeroShotClassificationArgumentHandler[source]¶
  Handles arguments for zero-shot text classification by turning each possible label into an NLI premise/hypothesis pair.
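  This handler is used internally by the zero-shot classification pipeline. A minimal sketch of the user-facing call that exercises it (the sequence and candidate labels are illustrative):

  ```python
  from transformers import pipeline

  classifier = pipeline("zero-shot-classification")

  # Each candidate label is turned into an NLI hypothesis such as
  # "This example is politics." and paired with the sequence as the premise.
  result = classifier(
      "Who are you voting for in 2020?",
      candidate_labels=["politics", "economics", "sports"],
      hypothesis_template="This example is {}.",
  )
  print(result["labels"], result["scores"])
  ```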
- class transformers.pipelines.QuestionAnsweringArgumentHandler[source]¶
  QuestionAnsweringPipeline requires the user to provide multiple arguments (i.e. question & context) to be mapped to internal SquadExample. QuestionAnsweringArgumentHandler manages all the possible ways to create a SquadExample from the command-line supplied arguments.
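  For example, the question-answering pipeline accepts several equivalent calling conventions, all of which this handler normalizes into SquadExample objects (the question and context strings are illustrative):

  ```python
  from transformers import pipeline

  qa = pipeline("question-answering")
  context = "Hugging Face is a company based in New York City."

  # Keyword arguments ...
  qa(question="Where is Hugging Face based?", context=context)

  # ... a single dict ...
  qa({"question": "Where is Hugging Face based?", "context": context})

  # ... or a list of dicts for batched inputs.
  qa([{"question": "Where is Hugging Face based?", "context": context}])
  ```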
Data format¶
- class transformers.pipelines.PipelineDataFormat(output_path: Optional[str], input_path: Optional[str], column: Optional[str], overwrite: bool = False)[source]¶
  Base class for all the pipeline supported data formats, both for reading and writing. Currently supported data formats include:
  - JSON
  - CSV
  - stdin/stdout (pipe)
  PipelineDataFormat also includes some utilities to work with multi-column data, like mapping from dataset columns to pipeline keyword arguments through the dataset_kwarg_1=dataset_column_1 format.
  - Parameters
    - output_path (str, optional) – Where to save the outgoing data.
    - input_path (str, optional) – Where to look for the input data.
    - column (str, optional) – The column to read.
    - overwrite (bool, optional, defaults to False) – Whether or not to overwrite the output_path.
- static from_str(format: str, output_path: Optional[str], input_path: Optional[str], column: Optional[str], overwrite=False) → transformers.pipelines.PipelineDataFormat[source]¶
  Creates an instance of the right subclass of PipelineDataFormat depending on format.
  - Parameters
    - format (str) – The format of the desired pipeline. Acceptable values are "json", "csv" or "pipe".
    - output_path (str, optional) – Where to save the outgoing data.
    - input_path (str, optional) – Where to look for the input data.
    - column (str, optional) – The column to read.
    - overwrite (bool, optional, defaults to False) – Whether or not to overwrite the output_path.
  - Returns
    The proper data format.
  - Return type
    PipelineDataFormat
- abstract save(data: Union[dict, List[dict]])[source]¶
  Save the provided data object with the representation for the current PipelineDataFormat.
  - Parameters
    - data (dict or list of dict) – The data to store.
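  A minimal sketch of the factory and save methods used together (the file names are hypothetical; inputs.json is assumed to exist):

  ```python
  from transformers.pipelines import PipelineDataFormat

  # from_str dispatches to JsonPipelineDataFormat for format="json".
  fmt = PipelineDataFormat.from_str(
      format="json",
      output_path="predictions.json",
      input_path="inputs.json",
      column="text",
      overwrite=True,
  )

  # save() serializes the pipeline outputs in the chosen format.
  fmt.save([{"label": "POSITIVE", "score": 0.99}])
  ```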
- class transformers.pipelines.CsvPipelineDataFormat(output_path: Optional[str], input_path: Optional[str], column: Optional[str], overwrite=False)[source]¶
  Support for pipelines using CSV data format.
  - Parameters
    - output_path (str, optional) – Where to save the outgoing data.
    - input_path (str, optional) – Where to look for the input data.
    - column (str, optional) – The column to read.
    - overwrite (bool, optional, defaults to False) – Whether or not to overwrite the output_path.
- save(data: List[dict])[source]¶
  Save the provided data object with the representation for the current PipelineDataFormat.
  - Parameters
    - data (List[dict]) – The data to store.
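  A minimal sketch, assuming an inputs.csv file with question and context columns that are mapped onto pipeline keyword arguments via the column argument (file names are hypothetical):

  ```python
  from transformers import pipeline
  from transformers.pipelines import CsvPipelineDataFormat

  qa = pipeline("question-answering")

  # "question=question,context=context" maps CSV columns to pipeline kwargs,
  # so iterating yields dicts that can be splatted into the pipeline call.
  fmt = CsvPipelineDataFormat(
      output_path="answers.csv",
      input_path="inputs.csv",
      column="question=question,context=context",
      overwrite=True,
  )

  outputs = [qa(**row) for row in fmt]
  fmt.save(outputs)
  ```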
- class transformers.pipelines.JsonPipelineDataFormat(output_path: Optional[str], input_path: Optional[str], column: Optional[str], overwrite=False)[source]¶
  Support for pipelines using JSON file format.
  - Parameters
    - output_path (str, optional) – Where to save the outgoing data.
    - input_path (str, optional) – Where to look for the input data.
    - column (str, optional) – The column to read.
    - overwrite (bool, optional, defaults to False) – Whether or not to overwrite the output_path.
- class transformers.pipelines.PipedPipelineDataFormat(output_path: Optional[str], input_path: Optional[str], column: Optional[str], overwrite: bool = False)[source]¶
  Read data from piped input to the python process. For multi-column data, columns should be separated by \t.
  If columns are provided, then the output will be a dictionary with {column_x: value_x}.
  - Parameters
    - output_path (str, optional) – Where to save the outgoing data.
    - input_path (str, optional) – Where to look for the input data.
    - column (str, optional) – The column to read.
    - overwrite (bool, optional, defaults to False) – Whether or not to overwrite the output_path.
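  A minimal sketch of feeding newline-separated stdin through a pipeline (the script name is hypothetical; for the piped format, save() writes results back to stdout rather than to a file):

  ```python
  # classify_stdin.py -- e.g. `echo "I love this library" | python classify_stdin.py`
  from transformers import pipeline
  from transformers.pipelines import PipedPipelineDataFormat

  nlp = pipeline("sentiment-analysis")
  fmt = PipedPipelineDataFormat(output_path=None, input_path=None, column=None)

  for line in fmt:        # one item per line read from stdin
      fmt.save(nlp(line))
  ```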
Utilities¶
- transformers.pipelines.get_framework(model, revision: Optional[str] = None)[source]¶
  Select framework (TensorFlow or PyTorch) to use.
  - Parameters
    - model (str, PreTrainedModel or TFPreTrainedModel) – If both frameworks are installed, picks the one corresponding to the model passed (either a model class or the model name). If no specific model is provided, defaults to using PyTorch.
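  A minimal sketch (the checkpoint name is illustrative; note that passing a string may load the model in order to decide which framework it belongs to):

  ```python
  from transformers.pipelines import get_framework

  # Returns "pt" when the PyTorch model/class is picked, "tf" for TensorFlow.
  framework = get_framework("distilbert-base-uncased")
  print(framework)
  ```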