AIMET TensorFlow Quantization APIs
To make full use of AIMET Quantization features, users are encouraged to follow several guidelines when defining Keras models. AIMET provides APIs that automate some of the model definition changes and check whether AIMET Quantization features can be applied to a Keras model.
- Users should invoke the Model Preparer API before using any of the AIMET Quantization features.
Model Guidelines: Guidelines for defining Keras models
Model Preparer API: Automates model definition changes
- AIMET Quantization for Keras provides the following functionality:
Quant Analyzer API: Analyzes the model and identifies layers that are sensitive to quantization
Quantization Simulation API: Simulates inference and training on quantized hardware
Adaptive Rounding API: Post-training quantization technique that optimizes the rounding of weight tensors
Cross-Layer Equalization API: Post-training quantization technique that equalizes layer parameters
BatchNorm Re-estimation API: Quantization-aware training technique that counters potential instability of BatchNorm statistics (i.e., running mean and variance)
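
To make the idea behind quantization simulation concrete, the sketch below shows the general quantize-dequantize ("fake quantization") step that simulating quantized hardware is typically built on: a float value is mapped to an integer grid and back, so the rounding and clipping error becomes visible in floating point. This is a pure-Python illustration, not AIMET's API; the function names `compute_encoding` and `quantize_dequantize`, the min/max range rule, and the sample weights are assumptions made for this example only.

```python
def compute_encoding(values, bitwidth=8):
    """Derive an asymmetric scale and zero point from the observed min/max.

    Hypothetical helper for illustration; not part of AIMET.
    """
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    levels = 2 ** bitwidth - 1          # highest integer code, e.g. 255 for 8 bits
    scale = (hi - lo) / levels if hi > lo else 1.0
    zero_point = round(-lo / scale)     # integer code that represents 0.0
    return scale, zero_point, levels


def quantize_dequantize(values, scale, zero_point, levels):
    """Map floats to the integer grid, clamp, and map back to floats."""
    result = []
    for v in values:
        q = round(v / scale) + zero_point   # quantize
        q = max(0, min(levels, q))          # clamp to the representable codes
        result.append((q - zero_point) * scale)  # dequantize
    return result


def max_abs_error(values, bitwidth):
    """Largest absolute difference introduced by quantize-dequantize."""
    scale, zero_point, levels = compute_encoding(values, bitwidth)
    simulated = quantize_dequantize(values, scale, zero_point, levels)
    return max(abs(a - b) for a, b in zip(values, simulated))


# Made-up weights, used only to compare two bitwidths.
weights = [-1.0, -0.37, 0.0, 0.42, 0.9]
print("8-bit max error:", max_abs_error(weights, 8))
print("4-bit max error:", max_abs_error(weights, 4))
```

Running the same weights at a lower bitwidth produces a coarser grid and therefore a larger error, which is the kind of effect a simulation of quantized inference is meant to expose before deployment.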