AIMET TensorFlow AdaRound API

Top-level API

aimet_tensorflow.adaround.adaround_weight.Adaround.apply_adaround(session, starting_op_names, output_op_names, params, path, filename_prefix, default_param_bw=4, default_quant_scheme=<QuantScheme.post_training_tf_enhanced: 2>, default_is_symmetric=False)

Returns a TF session whose model has optimized (AdaRounded) weight rounding for every supported op (Conv and Linear), and saves the corresponding quantization encodings to a separate JSON-formatted file that can then be imported by QuantSim for inference or QAT

Parameters
  • session (Session) – TF session containing the model to apply AdaRound to

  • starting_op_names (List[str]) – List of starting op names of the model

  • output_op_names (List[str]) – List of output op names of the model

  • params (AdaroundParameters) – Parameters for adaround

  • path (str) – Path where the parameter encodings file is stored

  • filename_prefix (str) – Prefix to use for filename of the encodings file

  • default_param_bw (int) – Default bitwidth (4-31) to use for quantizing layer parameters. Default 4

  • default_quant_scheme (QuantScheme) – Quantization scheme. Supported options are QuantScheme.post_training_tf or QuantScheme.post_training_tf_enhanced. Default QuantScheme.post_training_tf_enhanced

  • default_is_symmetric (bool) – True to use symmetric encodings, False to use asymmetric encodings. Default False.

Return type

Session

Returns

TF session with AdaRounded weights; the corresponding parameter encodings JSON file is saved at the provided path
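
The encodings file is written as <path>/<filename_prefix>.encodings (in the example below, path='./' and filename_prefix='dummy' yield './dummy.encodings'). A minimal inspection sketch, assuming only that the file is plain JSON; the exact schema may vary between AIMET versions:

import json

# File name follows <path>/<filename_prefix>.encodings, matching the example below
with open('./dummy.encodings') as encodings_file:
    param_encodings = json.load(encodings_file)

# Typically a mapping from parameter name to its computed encoding(s);
# the precise structure depends on the AIMET version
print(param_encodings)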

Adaround Parameters

class aimet_tensorflow.adaround.adaround_weight.AdaroundParameters(data_set, num_batches, default_num_iterations=10000, default_reg_param=0.01, default_beta_range=(20, 2), default_warm_start=0.2)

Configuration parameters for Adaround

Parameters
  • data_set (DatasetV1) – TF Dataset

  • num_batches (int) – Number of batches

  • default_num_iterations (int) – Number of iterations to adaround each layer. Default 10000

  • default_reg_param (float) – Regularization parameter, trading off between rounding loss vs reconstruction loss. Default 0.01

  • default_beta_range (Tuple) – Start and stop beta parameter for annealing of rounding loss (start_beta, end_beta). Default (20, 2)

  • default_warm_start (float) – Warm start period (fraction of iterations) during which rounding loss has zero effect. Default 20% (0.2)
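
A minimal construction sketch, assuming a tf.data.Dataset built from in-memory NumPy data (the data shape, dataset size and batch size are illustrative; the default values shown are the documented defaults above):

import numpy as np
import tensorflow as tf

from aimet_tensorflow.adaround.adaround_weight import AdaroundParameters

# Illustrative calibration data: 64 random samples, batched by 16
calibration_data = np.random.rand(64, 16, 16, 3).astype(np.float32)
dataset = tf.data.Dataset.from_tensor_slices(calibration_data).batch(16)

params = AdaroundParameters(data_set=dataset,
                            num_batches=4,                  # 64 samples / 16 per batch
                            default_num_iterations=10000,   # per-layer AdaRound iterations
                            default_reg_param=0.01,         # rounding vs. reconstruction trade-off
                            default_beta_range=(20, 2),     # (start_beta, end_beta) for annealing
                            default_warm_start=0.2)         # first 20% of iterations: no rounding loss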

Enum Definition

Quant Scheme Enum

class aimet_common.defs.QuantScheme

Enumeration of Quant schemes

post_training_tf = 1

TF scheme (absolute min-max)

post_training_tf_enhanced = 2

TF-enhanced scheme (SQNR-based approach to discard outliers)

training_range_learning_with_tf_enhanced_init = 4

Learn appropriate encodings (scale/offset) during QAT. Uses TF-enhanced scheme (SQNR) to initialize.

training_range_learning_with_tf_init = 3

Learn appropriate encodings (scale/offset) during QAT. Uses TF scheme (absolute min/max) to initialize.
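
For apply_adaround, only the two post-training schemes above are supported (see default_quant_scheme). A quick illustration of selecting one:

from aimet_common.defs import QuantScheme

# Either post-training scheme can be passed as default_quant_scheme
quant_scheme = QuantScheme.post_training_tf_enhanced   # SQNR-based range selection
# quant_scheme = QuantScheme.post_training_tf          # absolute min-max range selection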

Code Examples

Required imports


import logging
import numpy as np
import tensorflow as tf

from aimet_common.utils import AimetLogger
from aimet_common.defs import QuantScheme
from aimet_tensorflow.examples.test_models import keras_model
from aimet_tensorflow.quantsim import QuantizationSimModel
from aimet_tensorflow.adaround.adaround_weight import Adaround, AdaroundParameters

Evaluation function

def dummy_forward_pass(session: tf.compat.v1.Session, _):
    """
    This is intended to be the user-defined model evaluation function.
    AIMET requires the above signature. So if the user's eval function does not
    match this signature, please create a simple wrapper.
    :param session: Session with model to be evaluated
    :param _: Argument(s) passed to forward_pass_callback as-is. It is up to the
            user to determine their type, e.g. an integer representing the number of
            data samples to use, a tuple of parameters, or a more complex object.
            If set to None, forward_pass_callback is invoked with no parameters.
    :return: Output of the model's forward pass (a typical evaluation function
            would instead return a single float accuracy)
    """
    input_data = np.random.rand(32, 16, 16, 3)
    input_tensor = session.graph.get_tensor_by_name('conv2d_input:0')
    output_tensor = session.graph.get_tensor_by_name('keras_model/Softmax:0')
    output = session.run(output_tensor, feed_dict={input_tensor: input_data})
    return output

Apply AdaRound to the model. The AdaRounded session is returned, and the associated parameter encodings are saved to the given path.

def apply_adaround_example():

    AimetLogger.set_level_for_all_areas(logging.DEBUG)
    tf.compat.v1.reset_default_graph()

    _ = keras_model()
    init = tf.compat.v1.global_variables_initializer()
    dataset_size = 32
    batch_size = 16
    possible_batches = dataset_size // batch_size
    input_data = np.random.rand(dataset_size, 16, 16, 3)
    dataset = tf.data.Dataset.from_tensor_slices(input_data)
    dataset = dataset.batch(batch_size=batch_size)

    session = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph())
    session.run(init)

    params = AdaroundParameters(data_set=dataset, num_batches=possible_batches, default_num_iterations=10)
    starting_op_names = ['conv2d_input']
    output_op_names = ['keras_model/Softmax']

    # W4A8
    param_bw = 4
    output_bw = 8
    quant_scheme = QuantScheme.post_training_tf_enhanced

    # Returns session with adarounded weights and their corresponding encodings
    adarounded_session = Adaround.apply_adaround(session, starting_op_names, output_op_names, params, path='./',
                                                 filename_prefix='dummy', default_param_bw=param_bw,
                                                 default_quant_scheme=quant_scheme, default_is_symmetric=False)

    # Create QuantSim using adarounded_session
    sim = QuantizationSimModel(adarounded_session, starting_op_names, output_op_names, quant_scheme,
                               default_output_bw=output_bw, default_param_bw=param_bw, use_cuda=False)

    # Set and freeze the AdaRounded parameter encodings so QuantSim uses the same quantization grid,
    # then compute encodings for the remaining quantizers
    sim.set_and_freeze_param_encodings(encoding_path='./dummy.encodings')
    sim.compute_encodings(dummy_forward_pass, None)
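
    # Optional follow-up (sketch, not part of the original example): export the
    # simulated model and its encodings. Assumes QuantizationSimModel.export(path,
    # filename_prefix) is available in the installed AIMET version.
    sim.export(path='./', filename_prefix='adarounded_model')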

    session.close()
    adarounded_session.close()