AIMET TensorFlow Quantization APIs
AIMET Quantization for TensorFlow provides the following functionality:
- Quantization Simulation: Simulates inference and training on quantized hardware
- Adaptive Rounding: Post-training quantization technique that optimizes the rounding of weight tensors
- Cross-Layer Equalization: Post-training quantization technique that equalizes layer parameters
- Bias Correction: Post-training quantization technique that corrects the shift in layer outputs caused by quantization noise
- AutoQuant API: Unified API that integrates the post-training quantization techniques provided by AIMET
- BN Re-estimation APIs: APIs that re-estimate the statistics of batch normalization (BN) layers and fold the BN layers
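
To make the Quantization Simulation item in the list above more concrete, the following is a minimal NumPy sketch of the general quantize-dequantize ("fake quantization") idea that simulation of quantized hardware relies on. It is not AIMET code and does not use any AIMET API: the function name `fake_quantize`, the `bitwidth` parameter, and the min/max range selection are assumptions made purely for illustration.

```python
import numpy as np


def fake_quantize(x, bitwidth=8):
    """Quantize-dequantize x onto a uniform asymmetric grid.

    Hypothetical helper for illustration only; not part of AIMET.
    """
    x = np.asarray(x, dtype=np.float64)
    # Extend the range to include zero so that 0.0 stays exactly representable.
    x_min = min(float(x.min()), 0.0)
    x_max = max(float(x.max()), 0.0)
    num_steps = 2 ** bitwidth - 1
    scale = (x_max - x_min) / num_steps
    if scale == 0.0:
        return x.copy()  # all-zero input: nothing to quantize
    # Integer offset that aligns the grid with the lower end of the range.
    offset = np.round(x_min / scale)
    # Map to integer levels in [0, num_steps], then back to floating point.
    q = np.clip(np.round(x / scale) - offset, 0, num_steps)
    return (q + offset) * scale


# Usage: the rounding error introduced here is the "quantization noise"
# that the post-training techniques listed above try to reduce.
weights = np.array([-1.0, 0.0, 0.5, 1.0])
weights_q = fake_quantize(weights, bitwidth=8)
noise = weights - weights_q
```

The output stays in floating point but takes only values on the quantization grid, which is what lets a model be evaluated or fine-tuned as if it were running on quantized hardware.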