AIMET AdaRound

By default, AIMET uses nearest rounding for quantization. A single weight value in a weight tensor is illustrated in the following figure. In nearest rounding, this weight value is quantized to the nearest integer value.

The Adaptive Rounding (AdaRound) feature uses a subset of the unlabeled training data to adaptively round weights. In the following figure, the weight value is instead quantized to the farther integer value.

[Figure: adaround.png, nearest rounding vs. adaptive rounding of a weight value]
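As a minimal illustration of the default nearest-rounding behavior (plain NumPy, not AIMET code; the scale value here is arbitrary, whereas real scales come from the tensor's quantization encoding):

```python
import numpy as np

def quantize_nearest(weights, scale):
    """Snap each weight to the closest point on the quantization grid."""
    return np.round(weights / scale) * scale

# Illustrative weights and scale.
w = np.array([0.27, -0.41, 0.66])
print(quantize_nearest(w, scale=0.5))  # [ 0.5 -0.5  0.5]
```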

AdaRound optimizes a loss function using the unlabeled training data to decide whether to quantize a weight to the closer or the farther integer value. AdaRound quantization achieves accuracy closer to the FP32 model, while using low bit-width integer quantization.
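To make the selection problem concrete, the following self-contained sketch (plain NumPy, not AIMET code) brute-forces the rounding direction of each weight in a toy linear layer so that the layer's output reconstruction error on unlabeled data is minimized. AdaRound solves this same per-weight floor-vs-ceil choice, but with a continuous relaxation optimized by gradient descent rather than brute force:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(256, 4))        # stand-in for unlabeled calibration data
w = np.array([0.6, -1.4, 2.3, 0.5])  # FP32 weights of a toy linear layer
scale = 1.0                          # illustrative quantization scale
y_fp32 = x @ w                       # reference FP32 layer output

# Nearest rounding: a fixed per-weight choice.
w_nearest = np.round(w / scale) * scale
err_nearest = np.mean((x @ w_nearest - y_fp32) ** 2)

# Adaptive rounding: per weight, choose floor or ceil (floor + 1) so the
# layer's output error is minimized. Brute force over all 2^n choices.
floor = np.floor(w / scale)
best_err, w_adaptive = np.inf, w_nearest
for bits in range(2 ** len(w)):
    up = np.array([(bits >> i) & 1 for i in range(len(w))])
    w_cand = (floor + up) * scale
    err = np.mean((x @ w_cand - y_fp32) ** 2)
    if err < best_err:
        best_err, w_adaptive = err, w_cand

# The adaptive choice is never worse than nearest rounding, because
# nearest rounding is itself one of the 2^n candidates.
print(best_err <= err_nearest)  # True
```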

When creating a QuantizationSimModel from an AdaRounded model, use the QuantizationSimModel API to set and freeze the parameter encodings before computing the encodings. Refer to the code example in the AdaRound API.
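A sketch of that flow for the PyTorch variant follows. Module paths and signatures track the aimet_torch API but may differ across AIMET releases, and `model`, `data_loader`, and `forward_pass_callback` are placeholders you must supply; treat this as an outline, not a drop-in script.

```python
import torch
from aimet_torch.quantsim import QuantizationSimModel
from aimet_torch.adaround.adaround_weight import Adaround, AdaroundParameters

dummy_input = torch.rand(1, 3, 224, 224)
params = AdaroundParameters(data_loader=data_loader,  # unlabeled data
                            num_batches=16,
                            default_num_iterations=10000)

# apply_adaround returns a model with adapted weights and writes the
# parameter encodings to <path>/<filename_prefix>.encodings.
ada_model = Adaround.apply_adaround(model, dummy_input, params,
                                    path='./output',
                                    filename_prefix='adaround',
                                    default_param_bw=4)

sim = QuantizationSimModel(ada_model, dummy_input=dummy_input,
                           default_param_bw=4, default_output_bw=8)

# Set and freeze the AdaRounded parameter encodings BEFORE computing
# activation encodings, so compute_encodings does not overwrite them.
sim.set_and_freeze_param_encodings(encoding_path='./output/adaround.encodings')
sim.compute_encodings(forward_pass_callback, forward_pass_callback_args=None)
```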

AdaRound use cases

Terminology

The following abbreviations are used in the use case descriptions:

BC: Bias Correction

BNF: Batch Norm Folding

CLE: Cross Layer Equalization

HBF: High Bias Folding

QAT: Quantization Aware Training

{ }: An optional step in the use case

Recommended

The following sequences are recommended:

  1. {BNF} –> {CLE} –> AdaRound

    Applying BNF and CLE before AdaRound is optional, as the braces indicate. Some models benefit from applying CLE while others do not.

  2. AdaRound –> QAT

    AdaRound is a post-training quantization feature, but for some models BNF and CLE may not recover enough accuracy. For these models, applying AdaRound before QAT can help: AdaRound provides a better weight initialization, which speeds up QAT.

Not recommended

Applying bias correction (BC) either before or after AdaRound is not recommended.

  1. AdaRound –> BC

  2. BC –> AdaRound

AdaRound hyperparameter guidelines

AdaRound optimization exposes a number of hyperparameters. The default values of some of these parameters produce stable, good results across many models; we recommend not changing them.

Use the following guidelines when adjusting AdaRound hyperparameters.

  • Hyperparameters to change often
    • Number of batches: AdaRound should see approximately 500-1000 images. For example, with a data loader batch size of 64, 16 batches covers 1024 images.

    • Number of iterations (default: 10000)

  • Hyperparameters to change with caution
    • Regularization parameter (default: 0.01)

  • Hyperparameters to avoid changing
    • Beta range (default: (20, 2))

    • Warm start period (default: 20%)
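The number-of-batches guideline depends on the data loader's batch size; a quick way to derive it (plain Python; the helper name is illustrative, not part of the AIMET API):

```python
import math

def adaround_num_batches(target_images, batch_size):
    """Batches needed for AdaRound to see at least target_images samples."""
    return math.ceil(target_images / batch_size)

# With a batch size of 64, 16 batches covers 16 * 64 = 1024 images,
# at the top of the recommended ~500-1000 image range.
print(adaround_num_batches(1000, 64))  # 16
```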

AdaRound API

See the AdaRound API variant for your platform: