AIMET PyTorch AdaRound API

Top-level API

aimet_torch.adaround.adaround_weight.Adaround.apply_adaround(model, dummy_input, params, path, filename_prefix, default_param_bw=4, param_bw_override_list=None, ignore_quant_ops_list=None, default_quant_scheme=<QuantScheme.post_training_tf_enhanced: 2>, default_config_file=None)

Returns a model with optimized weight rounding of every supported module (Conv and Linear) and saves the corresponding quantization encodings to a separate JSON-formatted file that can then be imported by QuantSim for inference or QAT.

Parameters
  • model (Module) – Model to Adaround

  • dummy_input (Union[Tensor, Tuple]) – Dummy input to the model. Used to parse model graph. If the model has more than one input, pass a tuple. User is expected to place the tensors on the appropriate device.

  • params (AdaroundParameters) – Parameters for Adaround

  • path (str) – Path where the parameter encodings file is saved

  • filename_prefix (str) – Prefix to use for filename of the encodings file

  • default_param_bw (int) – Default bitwidth (4-31) to use for quantizing layer parameters

  • param_bw_override_list (Optional[List[Tuple[Module, int]]]) – List of Tuples. Each Tuple is a module and the corresponding parameter bitwidth to be used for that module.

  • ignore_quant_ops_list (Optional[List[Module]]) – Modules listed here are skipped during the quantization needed for AdaRounding. Do not specify Conv and Linear modules in this list; doing so will affect accuracy.

  • default_quant_scheme (QuantScheme) – Quantization scheme. Supported options are QuantScheme.post_training_tf and QuantScheme.post_training_tf_enhanced from the QuantScheme enum

  • default_config_file (Optional[str]) – Default configuration file for model quantizers

Return type

Module

Returns

Model with AdaRounded weights. The corresponding parameter encodings are saved as a JSON file at the provided path.

Multiple parameters can be specified by the user of the API; a short sketch illustrating the param_bw_override_list and ignore_quant_ops_list arguments follows the parameter descriptions below.

Parameters
  • model - Model to apply AdaRound to

  • params - AdaroundParameters object; its fields are described in the AdaroundParameters section below

  • default_param_bw - Default bitwidth (4-31) to use for initializing the encodings used for adaptive rounding. Default: 4

  • default_quant_scheme - Default quantization scheme used for initializing the encodings used for adaptive rounding. Supported options are QuantScheme.post_training_tf and QuantScheme.post_training_tf_enhanced. Default: QuantScheme.post_training_tf_enhanced

  • default_config_file - Default configuration file for model quantizers. Default: None
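
The param_bw_override_list and ignore_quant_ops_list arguments are not exercised in the Code Examples section below. The following sketch shows how they might be passed for a torchvision ResNet18; the chosen modules (conv1, bn1) are illustrative only.

# Sketch: keep the first conv layer's weights at 8 bits and skip a BatchNorm module
# during the quantization needed for AdaRounding (do not list Conv or Linear modules
# in ignore_quant_ops_list)
adarounded_model = Adaround.apply_adaround(model, dummy_input, params, path='./',
                                           filename_prefix='resnet18', default_param_bw=4,
                                           param_bw_override_list=[(model.conv1, 8)],
                                           ignore_quant_ops_list=[model.bn1])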


AdaroundParameters
  • data_loader - The Data Loader containing training data

  • num_batches - The number of batches to use for adarounding

  • default_num_iterations - Number of iterations to adaround each layer. Default: 10000

  • default_reg_param - Regularization parameter, trading off between rounding loss vs reconstruction loss. Default: 0.01

  • default_beta_range - Start and stop beta parameter for annealing of rounding loss (start_beta, end_beta). Default: (20, 2)

  • default_warm_start - Warm-up period during which the rounding loss has zero effect. Default: 20% (0.2)
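
As an illustrative sketch of constructing these parameters, assuming train_loader is a user-supplied torch DataLoader (the name is hypothetical), the documented defaults are spelled out explicitly below:

from aimet_torch.adaround.adaround_weight import AdaroundParameters

# Use 4 batches from the (hypothetical) train_loader; the remaining arguments repeat
# the documented default values
params = AdaroundParameters(data_loader=train_loader, num_batches=4,
                            default_num_iterations=10000, default_reg_param=0.01,
                            default_beta_range=(20, 2), default_warm_start=0.2)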


Enum Definition

Quant Scheme Enum

class aimet_common.defs.QuantScheme

Enumeration of Quant schemes

post_training_tf = 1

TF scheme

post_training_tf_enhanced = 2

TF-enhanced scheme
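
A short sketch of selecting a scheme; using the same scheme for AdaRound and for the subsequent QuantizationSimModel keeps the frozen parameter encodings on a consistent quantization grid:

from aimet_common.defs import QuantScheme

# Either member can be passed as default_quant_scheme / quant_scheme
scheme = QuantScheme.post_training_tf_enhanced   # or QuantScheme.post_training_tf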


Code Examples

Required imports


import logging
import torch
import torch.cuda
from torchvision import models

from aimet_common.utils import AimetLogger
from aimet_common.defs import QuantScheme
from aimet_torch.utils import create_fake_data_loader
from aimet_torch.quantsim import QuantizationSimModel
from aimet_torch.adaround.adaround_weight import Adaround, AdaroundParameters

Evaluation function

def dummy_forward_pass(model: torch.nn.Module, forward_pass_callback_args) -> float:
    """
    This is intended to be the user-defined model evaluation function.
    AIMET requires the above signature. So if the user's eval function does not
    match this signature, please create a simple wrapper.

    :param model: Model to evaluate
    :param forward_pass_callback_args: These argument(s) are passed to the forward_pass_callback as-is. Up to
            the user to determine the type of this parameter. E.g. could be simply an integer representing the number
            of data samples to use. Or could be a tuple of parameters or an object representing something more complex.
            If set to None, forward_pass_callback will be invoked with no parameters.
    :return: single float number (accuracy) representing model's performance
    """
    return .5
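
The dummy above simply returns a fixed value. A hedged sketch of a more realistic callback, computing top-1 accuracy over a user-supplied validation DataLoader passed in via forward_pass_callback_args (val_loader is a hypothetical name), might look like this:

def eval_top1_accuracy(model: torch.nn.Module, val_loader) -> float:
    """
    Sketch of an evaluation callback matching the required signature: the second
    argument carries whatever the user passed as forward_pass_callback_args.
    """
    device = next(model.parameters()).device
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for images, labels in val_loader:
            images, labels = images.to(device), labels.to(device)
            predictions = model(images).argmax(dim=1)
            correct += (predictions == labels).sum().item()
            total += labels.numel()
    return correct / total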

The following example applies AdaRound to ResNet18; the AdaRounded model is returned and the associated parameter encodings are saved to disk.

def apply_adaround_example():

    AimetLogger.set_level_for_all_areas(logging.DEBUG)
    torch.cuda.empty_cache()

    model = models.resnet18(pretrained=True).eval()
    model = model.to(torch.device('cuda'))
    input_shape = (1, 3, 224, 224)
    dummy_input = torch.randn(input_shape).to(torch.device('cuda'))

    # As an illustrative example, a fake data loader is used here.
    # For AdaRound, the user should provide the training data loader.
    data_loader = create_fake_data_loader(dataset_size=64, batch_size=16, image_size=input_shape[1:])

    params = AdaroundParameters(data_loader=data_loader, num_batches=4, default_num_iterations=50,
                                default_reg_param=0.01, default_beta_range=(20, 2))

    # Returns model with adarounded weights and their corresponding encodings
    adarounded_model = Adaround.apply_adaround(model, dummy_input, params, path='./',
                                               filename_prefix='resnet18', default_param_bw=4,
                                               default_quant_scheme=QuantScheme.post_training_tf_enhanced,
                                               default_config_file=None)

    # Create QuantSim using adarounded_model
    # Use the same parameter bitwidth and quantization scheme as AdaRound; 8-bit
    # activations are a typical choice here
    sim = QuantizationSimModel(adarounded_model, quant_scheme=QuantScheme.post_training_tf_enhanced,
                               default_param_bw=4, default_output_bw=8, dummy_input=dummy_input)

    # Set and freeze encodings to use same quantization grid and then invoke compute encodings
    sim.set_and_freeze_param_encodings(encoding_path='./resnet18.encodings')
    sim.compute_encodings(dummy_forward_pass, forward_pass_callback_args=None)
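
    # Optionally, evaluate the quantization-simulated model with the user's callback
    accuracy = dummy_forward_pass(sim.model, None)

    # A typical follow-on step (sketch): export the model without simulation ops
    # together with its encodings; the filename_prefix here is illustrative
    sim.export(path='./', filename_prefix='quantized_resnet18', dummy_input=dummy_input.cpu())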