AIMET TensorFlow Quantization APIs

AIMET Quantization for TensorFlow provides the following functionality:
  • Quantization Simulation: Simulates inference and training on quantized hardware

  • QuantAnalyzer: Analyzes the model and identifies ops that are sensitive to quantization

  • Adaptive Rounding: Post-training quantization technique to optimize rounding of weight tensors

  • Cross-Layer Equalization: Post-training quantization technique to equalize layer parameters

  • Bias Correction: Post-training quantization technique to correct shifts in layer outputs caused by quantization noise

  • AutoQuant API: Unified API that integrates the post-training quantization techniques provided by AIMET

  • BN Re-estimation APIs: APIs that re-estimate the statistics of batch normalization (BN) layers and fold the BN layers
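
To make the idea behind Quantization Simulation more concrete, the following is a minimal sketch of a quantize-dequantize step in plain Python. It is not AIMET code and does not use the AIMET API: the function names compute_encoding and quantize_dequantize, the min/max range selection, and the asymmetric 8-bit scheme are all assumptions chosen for illustration. The sketch snaps floating-point values onto an integer grid and maps them back, so the output carries the rounding and clipping noise that quantized hardware would introduce.

```python
def compute_encoding(values, bitwidth=8):
    """Derive a scale and integer offset from the min/max of the observed values.

    Hypothetical helper for illustration; not part of the AIMET API.
    """
    # Extend the range to include zero so that 0.0 is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    num_steps = 2 ** bitwidth - 1
    scale = (hi - lo) / num_steps
    if scale == 0.0:
        scale = 1.0  # all values are zero; any positive scale works
    offset = round(lo / scale)  # integer grid index of the range minimum (<= 0)
    return scale, offset


def quantize_dequantize(values, scale, offset, bitwidth=8):
    """Snap each value to the integer grid, clip it, and map it back to float."""
    num_steps = 2 ** bitwidth - 1
    simulated = []
    for v in values:
        q = round(v / scale) - offset      # quantize to an integer grid index
        q = max(0, min(num_steps, q))      # clip to the representable range
        simulated.append((q + offset) * scale)  # dequantize back to float
    return simulated


if __name__ == "__main__":
    weights = [-1.0, -0.25, 0.0, 0.4, 1.0]
    scale, offset = compute_encoding(weights, bitwidth=8)
    simulated = quantize_dequantize(weights, scale, offset, bitwidth=8)
    for w, s in zip(weights, simulated):
        print(f"{w:+.4f} -> {s:+.4f} (error {abs(w - s):.5f})")
```

In this sketch, values inside the encoding range are reproduced to within one grid step (scale), and values outside it are clipped to the nearest range boundary. Running a model with such a step applied to weights and activations is one way to estimate accuracy on quantized hardware without deploying to it.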