AIMET TensorFlow AdaRound API

User Guide Link

To learn more about this technique, please see AdaRound.

Examples Notebook Link

For an end-to-end notebook showing how to use Keras AdaRound, please see here.

Top-level API

aimet_tensorflow.keras.adaround_weight.Adaround.apply_adaround(model, params, path, filename_prefix, default_param_bw=4, default_quant_scheme=QuantScheme.post_training_tf_enhanced, config_file=None)

Returns model with optimized weight rounding of every op (Conv and Linear) and also saves the corresponding quantization encodings to a separate JSON-formatted file that can then be imported by QuantSim for inference or QAT

Parameters:

model (Model) – Model to adaround
params (AdaroundParameters) – Parameters for adaround
path (str) – path where to store parameter encodings
filename_prefix (str) – Prefix to use for filename of the encodings file
default_param_bw (int) – Default bitwidth (4-31) to use for quantizing layer parameters. Default 4
default_quant_scheme (QuantScheme) – Quantization scheme. Supported options are QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced. Default QuantScheme.post_training_tf_enhanced
config_file (Optional[str]) – Configuration file for model quantizers

Return type:

Model

Returns:

Model with Adarounded weights

Adaround Parameters

class aimet_tensorflow.keras.adaround_weight.AdaroundParameters(data_set, num_batches, default_num_iterations=10000, default_reg_param=0.01, default_beta_range=(20, 2), default_warm_start=0.2)[source]

Configuration parameters for Adaround

Parameters:

data_set (DatasetV2) – TF Data set
num_batches (int) – Number of batches
default_num_iterations (int) – Number of iterations to adaround each layer. Default 10000
default_reg_param (float) – Regularization parameter, trading off between rounding loss vs reconstruction loss. Default 0.01
default_beta_range (Tuple) – Start and stop beta parameter for annealing of rounding loss (start_beta, end_beta). Default (20, 2)
default_warm_start (float) – warm up period, during which rounding loss has zero effect. Default 20% (0.2)

Enum Definition

Quant Scheme Enum

class aimet_common.defs.QuantScheme(value)[source]

Enumeration of Quant schemes

post_training_percentile = 6: For a Tensor, adjusted minimum and maximum values are selected based on the percentile value passed. The Quantization encodings are calculated using the adjusted minimum and maximum value.

post_training_tf = 1: For a Tensor, the absolute minimum and maximum value of the Tensor are used to compute the Quantization encodings.

post_training_tf_enhanced = 2: For a Tensor, searches and selects the optimal minimum and maximum value that minimizes the Quantization Noise. The Quantization encodings are calculated using the selected minimum and maximum value.

training_range_learning_with_tf_enhanced_init = 4: For a Tensor, the encoding values are initialized with the post_training_tf_enhanced scheme. Then, the encodings are learned during training.

training_range_learning_with_tf_init = 3: For a Tensor, the encoding values are initialized with the post_training_tf scheme. Then, the encodings are learned during training.

Code Examples

Required imports

import logging
import numpy as np
import tensorflow as tf

from aimet_common.utils import AimetLogger
from aimet_common.defs import QuantScheme
from aimet_tensorflow.examples.test_models import keras_model
from aimet_tensorflow.keras.quantsim import QuantizationSimModel
from aimet_tensorflow.keras.adaround_weight import Adaround
from aimet_tensorflow.keras.adaround_weight import AdaroundParameters

Evaluation function

def dummy_forward_pass(model: tf.keras.Model, _):
    """
    This is intended to be the user-defined model evaluation function.
    AIMET requires the above signature. So if the user's eval function does not
    match this signature, please create a simple wrapper.
    :param model: Model to be evaluated
    :param _: These argument(s) are passed to the forward_pass_callback as-is. Up to
            the user to determine the type of this parameter. E.g. could be simply an integer representing the number
            of data samples to use. Or could be a tuple of parameters or an object representing something more complex.
            If set to None, forward_pass_callback will be invoked with no parameters.
    :return: single float number (accuracy) representing model's performance
    """
    input_data = np.random.rand(32, 16, 16, 3)
    return model(input_data)

After applying AdaRound to the model, the AdaRounded model and associated encodings are returned

def apply_adaround_example():

    AimetLogger.set_level_for_all_areas(logging.DEBUG)
    tf.keras.backend.clear_session()

    model = keras_model()
    dataset_size = 32
    batch_size = 16
    possible_batches = dataset_size // batch_size
    input_data = np.random.rand(dataset_size, 16, 16, 3)
    dataset = tf.data.Dataset.from_tensor_slices(input_data)
    dataset = dataset.batch(batch_size=batch_size)

    params = AdaroundParameters(data_set=dataset, num_batches=possible_batches, default_num_iterations=10)

    # W4A8
    param_bw = 4
    output_bw = 8
    quant_scheme = QuantScheme.post_training_tf_enhanced

    # Returns session with adarounded weights and their corresponding encodings
    adarounded_model = Adaround.apply_adaround(model, params, path='./', filename_prefix='dummy',
                                               default_param_bw=param_bw, default_quant_scheme=quant_scheme)

    # Create QuantSim using adarounded_session
    sim = QuantizationSimModel(adarounded_model, quant_scheme, default_output_bw=output_bw, default_param_bw=param_bw)

    # Set and freeze encodings to use same quantization grid and then invoke compute encodings
    sim.set_and_freeze_param_encodings(encoding_path='./dummy.encodings')
    sim.compute_encodings(dummy_forward_pass, None)