Techniques

This section gives a brief overview of AIMET's quantization and compression techniques and how to apply them.

Post Training Quantization

Quantizes a model using calibration data, at a user-specified parameter and activation precision (bit-width).
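
The typical flow wraps the model in a quantization simulation object, runs calibration data through it to compute encodings, and exports the result. The sketch below assumes the aimet_torch QuantizationSimModel API; argument names and the callback signature can differ between AIMET releases, and the small model and calibration data are placeholders.

```python
import torch
from aimet_torch.quantsim import QuantizationSimModel

# Placeholder pre-trained model and representative calibration input.
model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU()).eval()
dummy_input = torch.randn(1, 3, 32, 32)

# Wrap the model with simulated quantization ops at the chosen bit-widths.
sim = QuantizationSimModel(model,
                           dummy_input=dummy_input,
                           default_param_bw=8,    # parameter (weight) precision
                           default_output_bw=8)   # activation precision

def calibrate(sim_model, _):
    # Run a few batches of calibration data so AIMET can collect the
    # statistics used to compute quantization encodings.
    sim_model(dummy_input)

# Compute scale/offset encodings from calibration data, then export.
sim.compute_encodings(calibrate, None)
sim.export(path='./output', filename_prefix='model_int8', dummy_input=dummy_input)
```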

Quantization Aware Training

Trains the model with quantization simulated in the forward pass, so the weights learn to compensate for quantization noise.
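
QAT builds on the same QuantizationSimModel flow: after encodings are computed, the wrapped model is fine-tuned with an ordinary training loop so the weights adapt to the simulated quantization ops. A minimal sketch, reusing the sim and dummy_input from the example above and assuming a hypothetical train_loader that yields (inputs, labels) batches:

```python
import torch

# `sim` and `dummy_input` come from the post-training quantization sketch above;
# `train_loader` is a hypothetical DataLoader yielding (inputs, labels).
optimizer = torch.optim.SGD(sim.model.parameters(), lr=1e-4)
loss_fn = torch.nn.CrossEntropyLoss()

sim.model.train()
for epoch in range(2):                      # short fine-tuning run
    for inputs, labels in train_loader:
        optimizer.zero_grad()
        outputs = sim.model(inputs)         # forward pass includes simulated quantization
        loss = loss_fn(outputs, labels)
        loss.backward()                     # gradients flow through the quantization ops
        optimizer.step()

# Export the fine-tuned quantized model as in the PTQ flow.
sim.export(path='./output', filename_prefix='model_int8_qat', dummy_input=dummy_input)
```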

Blockwise Quantization

Quantizes each tensor in blocks of a configurable size, trading off accuracy against runtime cost.
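
Conceptually, each tensor is split into fixed-size blocks and a separate scale is computed per block, which tracks local value ranges more tightly than a single per-tensor scale. The sketch below illustrates the idea in plain PyTorch; it is not AIMET's blockwise API, and the block size and bit-width are arbitrary choices.

```python
import torch

def blockwise_quantize(weight: torch.Tensor, block_size: int = 64, bits: int = 4):
    """Symmetric per-block quantization of a 2-D weight tensor (illustrative only)."""
    out_features, in_features = weight.shape
    assert in_features % block_size == 0
    # Split the input dimension into blocks and compute one scale per block.
    blocks = weight.reshape(out_features, in_features // block_size, block_size)
    max_abs = blocks.abs().amax(dim=-1, keepdim=True)          # per-block range
    scale = (max_abs / (2 ** (bits - 1) - 1)).clamp(min=1e-8)   # per-block scale
    q = torch.clamp(torch.round(blocks / scale),
                    -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)    # integer codes
    dequant = (q * scale).reshape(out_features, in_features)    # simulated quantization
    return q, scale, dequant

w = torch.randn(16, 256)
q, scale, w_hat = blockwise_quantize(w, block_size=64, bits=4)
print((w - w_hat).abs().max())   # quantization error with per-block scales
```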

Analysis Tools

Automatically identify quantization-sensitive areas and hotspots in your pre-trained model.
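
The underlying idea is per-layer sensitivity analysis: quantize one layer at a time and rank layers by the error they introduce. The sketch below shows that idea in plain PyTorch; it is not AIMET's analysis API (aimet_torch provides tools such as QuantAnalyzer that automate this), and the toy model and error metric are placeholders.

```python
import copy
import torch

def fake_quant_(module: torch.nn.Module, bits: int = 8):
    """Round the module's weight to `bits`-bit integers in place (per-tensor, symmetric)."""
    with torch.no_grad():
        w = module.weight
        scale = w.abs().max() / (2 ** (bits - 1) - 1)
        module.weight.copy_(torch.round(w / scale)
                            .clamp(-(2 ** (bits - 1)), 2 ** (bits - 1) - 1) * scale)

model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU(),
    torch.nn.Conv2d(8, 16, 3), torch.nn.ReLU(),
).eval()
x = torch.randn(4, 3, 32, 32)
baseline = model(x)

# Quantize one layer at a time and measure the output deviation it causes;
# layers with the largest deviation are the "hotspots" an analysis tool would flag.
for name, module in model.named_modules():
    if isinstance(module, torch.nn.Conv2d):
        trial = copy.deepcopy(model)
        fake_quant_(dict(trial.named_modules())[name])
        error = (trial(x) - baseline).abs().mean().item()
        print(f'{name}: mean output error {error:.6f}')
```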

Compression

Reduces a pre-trained model's multiply-accumulate (MAC) and memory costs with a minimal drop in accuracy. AIMET supports compression techniques such as Weight SVD, Spatial SVD, and Channel Pruning.
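
As an illustration of one of these techniques, Weight SVD factorizes a layer's weight matrix into two low-rank factors, replacing one large matrix multiply with two smaller ones. The sketch below shows the idea with a plain PyTorch Linear layer; it is not AIMET's compression API, and the layer sizes and rank are arbitrary.

```python
import torch

def weight_svd(linear: torch.nn.Linear, rank: int) -> torch.nn.Sequential:
    """Replace a Linear layer with two smaller ones via truncated SVD (illustrative only)."""
    W = linear.weight.data                                   # shape: (out_features, in_features)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    first = torch.nn.Linear(linear.in_features, rank, bias=False)
    second = torch.nn.Linear(rank, linear.out_features, bias=linear.bias is not None)
    first.weight.data = Vh[:rank, :]                         # (rank, in_features)
    second.weight.data = U[:, :rank] * S[:rank]              # (out_features, rank)
    if linear.bias is not None:
        second.bias.data = linear.bias.data.clone()
    return torch.nn.Sequential(first, second)

layer = torch.nn.Linear(512, 512)
compressed = weight_svd(layer, rank=64)          # ~4x fewer MACs and parameters for this layer
x = torch.randn(2, 512)
print((layer(x) - compressed(x)).abs().mean())   # approximation error from rank truncation
```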