AIMET TensorFlow BatchNorm Re-estimation APIs¶
Examples Notebook Link¶
For an end-to-end notebook showing how to use Keras Quantization-Aware Training with BatchNorm Re-estimation, please see here.
Introduction¶
Batch Norm (BN) Re-estimation re-estimates the statistics of BN layers after performing QAT. Using the re-estimated statistics, the BN layers are folded into the preceding Conv and Linear layers.
Top-level APIs¶
API for BatchNorm Re-estimation

aimet_tensorflow.bn_reestimation.reestimate_bn_stats(sim, start_op_names, output_op_names, bn_re_estimation_dataset, bn_num_batches=100)

Top-level API that re-estimates the BN statistics of a quantized model over a number of batches of training data.

- Parameters
  - sim (QuantizationSimModel) – tf quantized model
  - start_op_names (List[str]) – List of starting op names of the model
  - output_op_names (List[str]) – List of output op names of the model
  - bn_re_estimation_dataset (DatasetV1) – Training dataset
  - bn_num_batches (int) – The number of batches to be used for re-estimation
- Return type
  - Handle
- Returns
  - Handle that undoes the effect of BN re-estimation upon handle.remove()
API for BatchNorm fold to scale

aimet_tensorflow.batch_norm_fold.fold_all_batch_norms_to_scale(sim, starting_op_names, output_op_names)

Fold all batch_norm layers in a model into the quantization scale parameter of the corresponding conv layers.

- Parameters
  - sim (QuantizationSimModel) – tf quantized model
  - starting_op_names (List[str]) – List of starting op names of the model
  - output_op_names (List[str]) – List of output op names of the model
Code Example - BN Re-estimation¶
Step 1. Load the model
For this example, we are going to load a pretrained ResNet50 model.
import tensorflow as tf

def load_fp32_model():
    from tensorflow.compat.v1.keras.applications.resnet import ResNet50
    tf.keras.backend.clear_session()
    model = ResNet50(weights='imagenet', input_shape=(224, 224, 3))
    sess = tf.keras.backend.get_session()

    # Following lines are additional steps to make the keras model work with AIMET.
    # add_image_net_computational_nodes_in_graph and image_net_config come from
    # the AIMET example utilities.
    from Examples.tensorflow.utils.add_computational_nodes_in_graph import add_image_net_computational_nodes_in_graph
    add_image_net_computational_nodes_in_graph(sess, model.output.name, image_net_config.dataset['images_classes'])

    input_op_names = [model.input.op.name]
    output_op_names = [model.output.op.name]

    return sess, input_op_names, output_op_names
Step 2. Create QuantSim with Range Learning and Per Channel Quantization Enabled
For an example of creating QuantSim with Range Learning QuantScheme, please see here
For how to enable Per Channel Quantization, please see here
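Below is a minimal sketch of this step, not a substitute for the linked examples. It assumes the session and op names returned by load_fp32_model() in Step 1; the config file path is a placeholder for a quantsim config JSON in which per-channel quantization has been enabled, and forward_pass_callback / forward_pass_callback_args are placeholders for a function (and its argument) that runs representative data through the sim session.

from aimet_common.defs import QuantScheme
from aimet_tensorflow.quantsim import QuantizationSimModel

sess, input_op_names, output_op_names = load_fp32_model()

# Range-learning quant scheme: encodings are initialized from TF statistics
# and then learned jointly with the weights during fine-tuning.
quant_sim = QuantizationSimModel(sess,
                                 starting_op_names=input_op_names,
                                 output_op_names=output_op_names,
                                 quant_scheme=QuantScheme.training_range_learning_with_tf_init,
                                 default_output_bw=8,
                                 default_param_bw=8,
                                 use_cuda=True,
                                 config_file='path/to/per_channel_quantsim_config.json')  # placeholder path

# Initialize the quantization encodings before QAT.
# forward_pass_callback should run a few batches of representative data
# through quant_sim.session; it is a placeholder here.
quant_sim.compute_encodings(forward_pass_callback, forward_pass_callback_args)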
Step 3. Perform QAT
update_ops_name = [op.name for op in model.updates] # Used for finetuning
# User action required
# The following line shows how the example ImageNetDataPipeline's finetune function is used.
# Replace it with your own pipeline's training function.
ImageNetDataPipeline.finetune(quant_sim.session, update_ops_name=update_ops_name, epochs=1, learning_rate=5e-7, decay_steps=5)
Step 4 a. Perform BatchNorm Re-estimation
from aimet_tensorflow.bn_reestimation import reestimate_bn_stats
reestimate_bn_stats(quant_sim, start_op_names=input_op_names, output_op_names=output_op_names,
                    bn_re_estimation_dataset=bn_re_estimation_dataset, bn_num_batches=100)
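reestimate_bn_stats() returns a Handle; if you want to be able to undo the re-estimation (for example, to compare accuracy before and after), capture the return value:

handle = reestimate_bn_stats(quant_sim, start_op_names=input_op_names,
                             output_op_names=output_op_names,
                             bn_re_estimation_dataset=bn_re_estimation_dataset,
                             bn_num_batches=100)

# ... evaluate the re-estimated model here ...

handle.remove()  # restores the BN statistics from before re-estimation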
Step 4 b. Perform BatchNorm Fold to scale
from aimet_tensorflow.batch_norm_fold import fold_all_batch_norms_to_scale
fold_all_batch_norms_to_scale(quant_sim, input_op_names, output_op_names)
Step 5. Export the model and encodings and test on target
For how to export the model and encodings, please see here
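As a minimal sketch of the export step, assuming the quant_sim object from the previous steps (the output directory and filename prefix below are placeholder values):

# Export the model and the quantization encodings for the target runtime.
quant_sim.export(path='./output', filename_prefix='resnet50_after_bn_reestimation')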