AIMET ONNX AdaRound API

User Guide Link

To learn more about this technique, please see the AdaRound user guide.

Top-level API

aimet_onnx.adaround.adaround_weight.Adaround.apply_adaround(model, params, path, filename_prefix, default_param_bw=4, param_bw_override_list=None, ignore_quant_ops_list=None, default_quant_scheme=QuantScheme.post_training_tf_enhanced, default_config_file=None, use_cuda=True, device=0, user_onnx_libs=None)

Returns the model with optimized weight rounding applied to every module (Conv and Linear). Also saves the corresponding quantization encodings to a separate JSON-formatted file that QuantSim can then import for inference or QAT.

Parameters:

model (ModelProto) – Model to Adaround
params (AdaroundParameters) – Parameters for Adaround
path (str) – Path where the parameter encodings are stored
filename_prefix (str) – Prefix to use for filename of the encodings file
default_param_bw (int) – Default bitwidth (4-31) to use for quantizing layer parameters
param_bw_override_list (Optional[List[Tuple[str, int]]]) – List of Tuples. Each Tuple is a param name and the corresponding parameter bitwidth to be used for that param.
ignore_quant_ops_list (Optional[List[str]]) – Ops listed here are skipped during the quantization needed for AdaRound. Do not list Conv and Linear modules here; doing so will affect accuracy.
default_quant_scheme (QuantScheme) – Quantization scheme. Supported options from the QuantScheme enum are QuantScheme.post_training_tf and QuantScheme.post_training_tf_enhanced
default_config_file (Optional[str]) – Default configuration file for model quantizers
use_cuda (bool) – Whether to use CUDA
device (int) – CUDA device ID
user_onnx_libs (Optional[List[str]]) – List of paths to all compiled ONNX custom ops libraries

Return type:

ModelProto

Returns:

Model with Adarounded weights. The corresponding parameter encodings JSON file is saved at the provided path.
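
The argument constraints above can be checked before starting a long AdaRound run. The following is a minimal sketch, not part of AIMET: check_bitwidths and expected_encodings_file are hypothetical helpers. It assumes the 4-31 range quoted for default_param_bw also applies to entries of param_bw_override_list, and that the encodings file is named <filename_prefix>.encodings inside path, as in the code example later on this page ('./' and 'dummy' give './dummy.encodings').

```python
import os
from typing import List, Optional, Tuple

# Range quoted for default_param_bw in the parameter list above.
MIN_BW, MAX_BW = 4, 31


def check_bitwidths(default_param_bw: int = 4,
                    param_bw_override_list: Optional[List[Tuple[str, int]]] = None) -> None:
    """Hypothetical pre-flight check (not an AIMET function).

    Raises ValueError if any bitwidth falls outside MIN_BW..MAX_BW.
    Applying the range to override entries is an assumption.
    """
    entries = [("default_param_bw", default_param_bw)]
    entries += list(param_bw_override_list or [])
    for name, bw in entries:
        if not MIN_BW <= bw <= MAX_BW:
            raise ValueError(f"{name}: bitwidth {bw} is outside {MIN_BW}-{MAX_BW}")


def expected_encodings_file(path: str, filename_prefix: str) -> str:
    """Assumed location of the saved encodings: <path>/<filename_prefix>.encodings."""
    return os.path.join(path, filename_prefix + ".encodings")
```

A caller might run check_bitwidths(4, [("conv1.weight", 8)]) first (the parameter name here is made up), then pass the result of expected_encodings_file(path, filename_prefix) to QuantSim after apply_adaround returns.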

Adaround Parameters

class aimet_onnx.adaround.adaround_weight.AdaroundParameters(data_loader, num_batches, default_num_iterations=None, default_reg_param=0.01, default_beta_range=(20, 2), default_warm_start=0.2, forward_fn=None, forward_pass_callback_args=None)

Configuration parameters for Adaround

Parameters:

data_loader – Data loader
num_batches (int) – Number of batches to be used for Adaround. A commonly recommended value for this parameter is the smaller value among (1) len(data_loader) and (2) ceil(2000/batch_size)
default_num_iterations (Optional[int]) – Number of iterations to adaround each layer. The default value is 10K for models with weights of 8 bits or more, and 15K for models with weights below 8 bits.
default_reg_param (float) – Regularization parameter, trading off rounding loss against reconstruction loss. Default 0.01
default_beta_range (Tuple) – Start and stop beta parameter for annealing of rounding loss (start_beta, end_beta). Default (20, 2)
default_warm_start (float) – Warm-up period, during which rounding loss has zero effect. Default 20% (0.2)
forward_fn (Optional[Callable]) – Forward-pass callback used to compute encodings for the sim
forward_pass_callback_args – These argument(s) are passed to the forward-pass callback (forward_fn) as-is. It is up to the user to determine the type of this parameter. For example, it could be an integer representing the number of data samples to use, a tuple of parameters, or an object representing something more complex. If set to None, the callback is invoked with no additional arguments.
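
The num_batches recommendation and the documented default_num_iterations values can be written out as small helpers. This is an illustrative sketch only: recommended_num_batches and documented_default_iterations are hypothetical names, not AIMET functions, and they simply restate the rules given in the parameter descriptions above.

```python
import math


def recommended_num_batches(num_loader_batches: int, batch_size: int,
                            target_samples: int = 2000) -> int:
    """Smaller of len(data_loader) and ceil(2000 / batch_size).

    num_loader_batches stands in for len(data_loader).
    """
    return min(num_loader_batches, math.ceil(target_samples / batch_size))


def documented_default_iterations(param_bw: int) -> int:
    """Default iteration count described above: 10K at 8 bits or more, else 15K."""
    return 10_000 if param_bw >= 8 else 15_000
```

For example, with a batch size of 32 and a loader holding 500 batches, the helper returns ceil(2000 / 32) = 63; with a loader of only 10 batches it returns 10.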

Note: It is recommended to use onnx-simplifier before adarounding the model.
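
To picture how default_beta_range, default_warm_start and default_reg_param interact, the sketch below computes a beta value and a rounding-loss weight per iteration. It is a hedged illustration, not AIMET's code: the page only says that beta is annealed from start_beta to end_beta and that the rounding loss has zero effect during warm-up. The cosine shape of the schedule, and the function names beta_at and rounding_loss_weight, are assumptions made here.

```python
import math
from typing import Tuple


def beta_at(cur_iter: int, max_iter: int,
            beta_range: Tuple[float, float] = (20.0, 2.0),
            warm_start: float = 0.2) -> float:
    """Beta at a given iteration, assuming cosine annealing after warm-up.

    Requires warm_start < 1 so that the annealing phase has nonzero length.
    """
    start_beta, end_beta = beta_range
    warm_end = warm_start * max_iter
    if cur_iter < warm_end:
        # During warm-up the rounding loss is inactive; hold beta at its start value.
        return start_beta
    rel_iter = (cur_iter - warm_end) / (max_iter - warm_end)
    return end_beta + 0.5 * (start_beta - end_beta) * (1 + math.cos(rel_iter * math.pi))


def rounding_loss_weight(cur_iter: int, max_iter: int,
                         reg_param: float = 0.01,
                         warm_start: float = 0.2) -> float:
    """Weight on the rounding loss: zero during warm-up, reg_param afterwards."""
    return 0.0 if cur_iter < warm_start * max_iter else reg_param
```

With the defaults and 10,000 iterations, this sketch keeps the rounding loss switched off for the first 2,000 iterations, then lowers beta from 20 toward 2 over the remaining 8,000.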

Code Example - Adaptive Rounding (AdaRound)

This example shows how to use AIMET to perform Adaptive Rounding (AdaRound).

Required imports

from onnxsim import simplify
from aimet_onnx.adaround.adaround_weight import AdaroundParameters, Adaround
from aimet_onnx.quantsim import QuantizationSimModel

User should write this function to pass calibration data

def pass_calibration_data(model):
    """
    The User of the QuantizationSimModel API is expected to write this function based on their data set.
    This is not a working function and is provided only as a guideline.

    :param model:
    """

Apply Adaround

def apply_adaround_example(model, dataloader):
    """
    Example code to run adaround
    """
    # Simplify the model
    model, _ = simplify(model)

    # num_batches=1 and default_num_iterations=5 keep this example short;
    # see the parameter descriptions above for recommended values.
    params = AdaroundParameters(data_loader=dataloader, num_batches=1, default_num_iterations=5,
                                forward_fn=pass_calibration_data,
                                forward_pass_callback_args=None)
    # Saves the parameter encodings to './dummy.encodings'
    ada_rounded_model = Adaround.apply_adaround(model, params, './', 'dummy')

    sim = QuantizationSimModel(ada_rounded_model,
                               default_param_bw=8,
                               default_activation_bw=8, use_cuda=True)
    # Load the AdaRound encodings and freeze them so they are not recomputed
    sim.set_and_freeze_param_encodings('./dummy.encodings')