AIMET PyTorch Layer Output Generation API¶
This API captures and saves intermediate layer-outputs of a model. The model can be original (FP32) or quantsim. The layer-outputs are named according to the PyTorch/ONNX/TorchScript model exported by the quantsim export API. This allows layer-output comparison among the FP32 model, the quantization-simulated model, and the actual quantized model on the target device, to debug accuracy mismatch issues.
Top-level API¶
class aimet_torch.layer_output_utils.LayerOutputUtil(model, dir_path, naming_scheme=<NamingScheme.PYTORCH: 1>, dummy_input=None, onnx_export_args=None)[source]¶
Implementation to capture and save outputs of intermediate layers of a model (fp32/quantsim).
Constructor for LayerOutputUtil.
- Parameters
  - model (Module) – Model whose layer-outputs are needed.
  - dir_path (str) – Directory wherein layer-outputs will be saved.
  - naming_scheme (NamingScheme) – Naming scheme to be followed to name layer-outputs. There are multiple schemes as per the exported model (pytorch, onnx or torchscript). Refer to the NamingScheme enum definition.
  - dummy_input (Union[Tensor, Tuple, List[~T], None]) – Dummy input to the model. Required if naming_scheme is 'NamingScheme.ONNX' or 'NamingScheme.TORCHSCRIPT'.
  - onnx_export_args (Union[OnnxExportApiArgs, Dict[~KT, ~VT], None]) – Should be the same as that passed to the quantsim export API, to ensure consistency between the layer-output names present in the exported onnx model and the generated layer-outputs. Required if naming_scheme is 'NamingScheme.ONNX'.
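To make the constructor arguments concrete, here is a minimal, self-contained sketch using the PYTORCH naming scheme (the model choice and directory path are illustrative, not part of the API):
import torch
from torchvision import models
from aimet_torch.layer_output_utils import LayerOutputUtil, NamingScheme

# Illustrative FP32 model; PYTORCH scheme needs neither dummy_input nor onnx_export_args
fp32_model = models.resnet18().eval()
layer_output_util = LayerOutputUtil(model=fp32_model, dir_path='./layer_output_dump',
                                    naming_scheme=NamingScheme.PYTORCH)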
The following API can be used to generate layer-outputs.
LayerOutputUtil.generate_layer_outputs(input_batch)[source]¶
This method captures the output of every layer of the model and saves the inputs and the corresponding layer-outputs to disk.
- Parameters
  input_batch (Union[Tensor, List[Tensor], Tuple[Tensor]]) – Batch of inputs for which we want to obtain layer-outputs.
- Returns
  None
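A usage sketch, continuing from the constructor sketch above (the random batch stands in for real pre-processed inputs):
input_batch = torch.rand(4, 3, 224, 224)  # illustrative batch
layer_output_util.generate_layer_outputs(input_batch)  # inputs and per-layer outputs are saved under dir_path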
Enum Definition¶
Naming Scheme Enum
class aimet_torch.layer_output_utils.NamingScheme[source]¶
Enumeration of layer-output naming schemes.
ONNX = 2¶
Names outputs according to the exported onnx model. Layer output names are generally numeric.

PYTORCH = 1¶
Names outputs according to the exported pytorch model. Layer names are used.

TORCHSCRIPT = 3¶
Names outputs according to the exported torchscript model. Layer output names are generally numeric.
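The scheme choice determines which constructor arguments are required, as noted in the parameter descriptions above: PYTORCH requires neither dummy_input nor onnx_export_args; ONNX and TORCHSCRIPT require dummy_input, and ONNX additionally requires onnx_export_args. A sketch, reusing fp32_model from the earlier sketch (the path is illustrative):
from aimet_torch.onnx_utils import OnnxExportApiArgs

# ONNX scheme: requires dummy_input and, for name consistency, the same
# onnx_export_args as passed to the quantsim export API
util_onnx = LayerOutputUtil(model=fp32_model, dir_path='./layer_output_dump_onnx',
                            naming_scheme=NamingScheme.ONNX,
                            dummy_input=torch.rand(1, 3, 224, 224),
                            onnx_export_args=OnnxExportApiArgs())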
Code Example¶
Imports
import torch
from torchvision import models
from aimet_torch.onnx_utils import OnnxExportApiArgs
from aimet_torch.model_preparer import prepare_model
from aimet_torch.quantsim import QuantizationSimModel
from aimet_torch.layer_output_utils import LayerOutputUtil, NamingScheme
Obtain Original or QuantSim model
# Obtain original model
original_model = models.resnet18()
original_model.eval()
original_model = prepare_model(original_model)
# Obtain quantsim model
dummy_input = torch.rand(1, 3, 224, 224)
def forward_pass(model: torch.nn.Module, input_batch: torch.Tensor):
    model.eval()
    with torch.no_grad():
        _ = model(input_batch)

quantsim = QuantizationSimModel(model=original_model, quant_scheme='tf_enhanced',
                                dummy_input=dummy_input, rounding_mode='nearest',
                                default_output_bw=8, default_param_bw=8, in_place=False)

quantsim.compute_encodings(forward_pass_callback=forward_pass,
                           forward_pass_callback_args=dummy_input)
Obtain pre-processed inputs
# Get inputs that are pre-processed in the same manner as they were while computing quantsim encodings
input_batches = get_pre_processed_inputs()
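get_pre_processed_inputs() above is a placeholder for user-defined data loading. A minimal sketch, assuming an ImageNet-style torchvision pipeline (the dataset path, transforms and batch size are illustrative assumptions):
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

def get_pre_processed_inputs():
    # Hypothetical helper: apply the same pre-processing that was used
    # while computing quantsim encodings (standard ImageNet transforms here)
    transform = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])
    dataset = datasets.ImageFolder('./calibration_data', transform=transform)  # illustrative path
    loader = DataLoader(dataset, batch_size=4, shuffle=False)
    return [images for images, _ in loader]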
Generate Layer Outputs
# Generate layer-outputs
layer_output_util = LayerOutputUtil(model=quantsim.model, dir_path='./layer_output_dump',
                                    naming_scheme=NamingScheme.ONNX,
                                    dummy_input=dummy_input, onnx_export_args=OnnxExportApiArgs())
for input_batch in input_batches:
    layer_output_util.generate_layer_outputs(input_batch)