Tools

Transformers Models

onnxscript.tools.transformers_models.get_model_and_inputs(model: str, config: str, dynamic_shapes: bool, device: str = 'cpu', num_hidden_layers: int = 1, with_mask: bool = True, implementation: str = 'eager', dtype: str | None = None, warmup: int = 5, repeat: int = 10) → tuple[Any, list[tuple[Tensor, ...]], dict | None]

Returns a model together with a list of dummy inputs to test or benchmark it.

Parameters:

model – model name: ‘phi’, ‘llama’, ‘phi3’, …
config – configuration size: ‘small’, ‘medium’, ‘large’, …
dynamic_shapes – whether to use dynamic shapes (True) or static shapes (False)
device – ‘cpu’ or ‘cuda’
num_hidden_layers – number of hidden layers
with_mask – whether the model takes two inputs, including a mask (True), or a single input (False)
implementation – attention implementation, ‘eager’ or ‘sdpa’
warmup – number of inputs to generate for the warmup runs
repeat – number of inputs to generate for the repeated runs
dtype – if specified, the model and the inputs are cast to this type

Returns:

Model, list of inputs, and a dictionary (or None).
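The call described above can be sketched as follows. This is a minimal sketch, not the library's own example: `example_kwargs` and `load_example` are hypothetical helpers, and the list of required packages (`onnxscript`, `torch`, `transformers`) is an assumption. The keyword names and default values are copied from the signature above.

```python
import importlib.util


def example_kwargs(model: str = "phi", config: str = "small") -> dict:
    """Hypothetical helper: keyword arguments mirroring the documented signature."""
    return {
        "model": model,
        "config": config,
        "dynamic_shapes": False,
        "device": "cpu",
        "num_hidden_layers": 1,
        "with_mask": True,
        "implementation": "eager",
        "dtype": None,
        "warmup": 5,
        "repeat": 10,
    }


def load_example(**overrides):
    """Hypothetical helper: call get_model_and_inputs, or return None if it is unavailable."""
    # Assumption: these three packages are what the function needs at runtime.
    needed = ("onnxscript", "torch", "transformers")
    if any(importlib.util.find_spec(name) is None for name in needed):
        return None
    try:
        from onnxscript.tools.transformers_models import get_model_and_inputs
    except ImportError:
        return None
    kwargs = {**example_kwargs(), **overrides}
    # Returns a 3-tuple according to the documented return annotation.
    return get_model_and_inputs(**kwargs)


# Usage, once the packages are installed:
# result = load_example(model="llama")
# if result is not None:
#     model, inputs, extra = result
```

Importing lazily inside `load_example` keeps the snippet importable on machines where the heavy dependencies are missing.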

onnxscript.tools.transformers_models.phi.get_phi_model_from_config(warmup: int = 5, repeat: int = 10, config: str = 'small', num_hidden_layers: int = 1, implementation: str = 'eager', dynamic_shapes: bool = False, with_mask: bool = True) → tuple[Any, list[tuple[Tensor, ...]], dict]

Returns a Phi model to test or benchmark.

Parameters:

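warmup – number of inputs to generate for the warmup runs
repeat – number of inputs to generate for the repeated runs
config – ‘small’, ‘medium’ or ‘large’
num_hidden_layers – number of hidden layers
implementation – attention implementation, ‘eager’ or ‘sdpa’
with_mask – whether the model takes two inputs, including a mask (True), or a single input (False)
dynamic_shapes – whether to use dynamic shapes (True) or static shapes (False)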

Returns:

Model, list of inputs, and a dictionary.

onnxscript.tools.transformers_models.phi3.get_phi3_model_from_config(warmup: int = 5, repeat: int = 10, config: str = 'small', num_hidden_layers: int = 1, implementation: str = 'eager', dynamic_shapes: bool = False, with_mask: bool = True) → tuple[Any, list[tuple[Tensor, ...]], dict]

Returns a Phi3 model to test or benchmark.

Parameters:

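warmup – number of inputs to generate for the warmup runs
repeat – number of inputs to generate for the repeated runs
config – ‘small’, ‘medium’ or ‘large’
num_hidden_layers – number of hidden layers
implementation – attention implementation, ‘eager’ or ‘sdpa’
with_mask – whether the model takes two inputs, including a mask (True), or a single input (False)
dynamic_shapes – whether to use dynamic shapes (True) or static shapes (False)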

Returns:

Model, list of inputs, and a dictionary.

onnxscript.tools.transformers_models.llama.get_llama_model_from_config(warmup: int = 5, repeat: int = 10, config: str = 'small', num_hidden_layers: int = 1, implementation: str = 'eager', dynamic_shapes: bool = False, with_mask: bool = True) → tuple[Any, list[tuple[Tensor, ...]], dict]

Returns a Llama model to test or benchmark.

Parameters:

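warmup – number of inputs to generate for the warmup runs
repeat – number of inputs to generate for the repeated runs
config – ‘small’, ‘medium’ or ‘large’
num_hidden_layers – number of hidden layers
implementation – attention implementation, ‘eager’ or ‘sdpa’
with_mask – whether the model takes two inputs, including a mask (True), or a single input (False)
dynamic_shapes – whether to use dynamic shapes (True) or static shapes (False)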

Returns:

Model, list of inputs, and a dictionary.
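Since each function above returns a model and a list of input tuples, one way to consume them in a benchmark might look like the sketch below. The helper `time_calls` is hypothetical, and treating the first `warmup` entries of the list as warmup runs is an assumption about how the list is meant to be used; the documentation above does not say how the entries are ordered. A plain Python function stands in for the model so the snippet runs without `torch`.

```python
import time
from typing import Any, Callable, Sequence


def time_calls(
    model: Callable[..., Any], inputs: Sequence[tuple], warmup: int = 0
) -> list[float]:
    """Call model(*args) for every tuple; return durations of the calls after the first `warmup`."""
    durations = []
    for index, args in enumerate(inputs):
        begin = time.perf_counter()
        model(*args)
        elapsed = time.perf_counter() - begin
        # Assumption: the first `warmup` calls are not measured.
        if index >= warmup:
            durations.append(elapsed)
    return durations


def fake_model(input_ids, mask):
    """Stand-in for a real model: takes two inputs, as with with_mask=True."""
    return [i * m for i, m in zip(input_ids, mask)]


# 15 input tuples, matching the default warmup=5 plus repeat=10.
fake_inputs = [([1, 2, 3], [1, 1, 0]) for _ in range(15)]
durations = time_calls(fake_model, fake_inputs, warmup=5)
```

With a real model, `model` and `inputs` would be the first two elements of the returned tuple.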