Adaptive Rounding (AdaRound)

This notebook contains a working example of AIMET adaptive rounding (AdaRound).

AIMET quantization features typically use the “nearest rounding” technique, in which each weight value is quantized to the nearest integer value.

AdaRound optimizes a loss function using unlabeled training data to decide whether to quantize a specific weight to the closer integer value or the farther one. With AdaRound, quantized accuracy is typically closer to that of the FP32 model than it is with nearest rounding.
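
To make the difference concrete, the following is a minimal sketch in plain Python, not AIMET code. It compares nearest rounding with a brute-force search over the up/down choice for each weight, scoring each choice by the squared error of a single layer output. The helper names (round_nearest, output_error, best_up_down), the toy weights, and the single input sample are all assumptions made for illustration. AdaRound itself optimizes a loss function rather than enumerating every choice.

```python
import itertools
import math

def round_nearest(weights, scale):
    """Quantize each weight to the nearest multiple of `scale`."""
    return [math.floor(w / scale + 0.5) * scale for w in weights]

def output_error(weights, q_weights, inputs):
    """Sum of squared differences between FP and quantized layer outputs."""
    err = 0.0
    for x in inputs:
        fp_out = sum(w * xi for w, xi in zip(weights, x))
        q_out = sum(qw * xi for qw, xi in zip(q_weights, x))
        err += (fp_out - q_out) ** 2
    return err

def best_up_down(weights, scale, inputs):
    """Try rounding each weight down or up; keep the lowest-error choice.

    Brute force (2**n candidates), so only usable for tiny toy layers.
    """
    best_q, best_err = None, None
    for choice in itertools.product((0, 1), repeat=len(weights)):
        cand = [(math.floor(w / scale) + up) * scale
                for w, up in zip(weights, choice)]
        err = output_error(weights, cand, inputs)
        if best_err is None or err < best_err:
            best_q, best_err = cand, err
    return best_q, best_err

# Toy data (hypothetical): four equal weights and one unlabeled input sample.
weights = [0.3, 0.3, 0.3, 0.3]
scale = 1.0
inputs = [[1.0, 1.0, 1.0, 1.0]]

nearest = round_nearest(weights, scale)
nearest_err = output_error(weights, nearest, inputs)
searched, searched_err = best_up_down(weights, scale, inputs)

print("nearest :", nearest, "error:", nearest_err)
print("searched:", searched, "error:", searched_err)
```

In this toy case, nearest rounding maps all four weights to 0, so the layer output collapses to 0. Rounding exactly one weight up gives an output of 1, which is much closer to the full-precision output of 1.2. Note that no labels are needed: only the layer's own full-precision output is used as the target.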

Overall flow

The example follows these high-level steps:

  1. Instantiate the example evaluation and training pipeline

  2. Load the FP32 model and evaluate the model to find the baseline FP32 accuracy

  3. Create a quantization simulation model (with fake quantization ops) and evaluate the quantized simulation model

  4. Apply AdaRound and evaluate the simulation model to get a post-finetuned quantized accuracy score
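
Step 3 mentions fake quantization ops. As a rough illustration only, and not AIMET's implementation, a fake quantization op can be thought of as quantizing a floating-point value to an integer grid and immediately dequantizing it, so the rest of the model still runs in floating point but sees the quantization error. The function name fake_quantize, the unsigned 8-bit affine scheme, and the scale and zero_point values below are assumptions chosen for the sketch.

```python
import math

def fake_quantize(x, scale, zero_point, num_bits=8):
    """Quantize x to an unsigned integer grid, then dequantize back to float."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # Nearest rounding onto the integer grid (round half up).
    q = math.floor(x / scale + 0.5) + zero_point
    # Values outside the representable range saturate at the grid limits.
    q = max(qmin, min(qmax, q))
    # Dequantize: the result is a float that carries the quantization error.
    return (q - zero_point) * scale

# Hypothetical values: one in range, one above the range, one below it.
values = [0.26, 100.0, -1.0]
simulated = [fake_quantize(v, scale=0.1, zero_point=0) for v in values]
print(simulated)
```

With zero_point set to 0, negative inputs saturate at 0 and large inputs saturate at 255 * scale. A nonzero zero_point shifts the grid so that part of the negative range becomes representable.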

Note

This notebook does not show state-of-the-art results. For example, it uses a relatively quantization-friendly model (ResNet18). Also, some optimization parameters, such as the number of fine-tuning epochs, are chosen to improve execution speed in the notebook.


Dataset

This example performs image classification on the ImageNet dataset. If you already have a version of the dataset, use that. Otherwise, download the dataset (for example, from https://image-net.org/challenges/LSVRC/2012/index).