AIMET ONNX Quantization APIs
AIMET Quantization for ONNX Models provides the following functionality:
- Quantization Simulation API: Simulates inference on quantized hardware
- Cross-Layer Equalization API: Post-training quantization technique that equalizes layer parameters
- Adaround API: Post-training quantization technique that optimizes the rounding of weight tensors
- AutoQuant API: Unified API that integrates the post-training quantization techniques provided by AIMET
- QuantAnalyzer API: Analyzes the model and identifies layers that are sensitive to quantization
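
To make the idea behind quantization simulation concrete, the following is a minimal conceptual sketch in plain Python. It does not use AIMET or its API; the function names `compute_encoding` and `quantize_dequantize` are hypothetical and exist only for this illustration. It assumes an asymmetric min/max encoding and shows how a float tensor can be rounded onto an integer grid and mapped back to float, which is the quantization noise that a simulation of quantized hardware has to reproduce.

```python
def compute_encoding(values, bitwidth=8):
    """Derive an asymmetric (scale, offset) encoding from the observed min/max.

    Hypothetical helper for illustration only; not part of AIMET.
    """
    lo = min(min(values), 0.0)  # include zero so 0.0 is exactly representable
    hi = max(max(values), 0.0)
    levels = 2 ** bitwidth - 1  # highest integer code, e.g. 255 for 8 bits
    scale = (hi - lo) / levels or 1.0  # fall back to 1.0 for all-zero input
    offset = round(lo / scale)  # integer shift that aligns the grid with lo
    return scale, offset, levels


def quantize_dequantize(values, scale, offset, levels):
    """Round each value onto the integer grid, clamp, and map back to float."""
    simulated = []
    for v in values:
        q = round(v / scale) - offset  # integer code in the unsigned range
        q = max(0, min(levels, q))  # clamp to [0, levels]
        simulated.append((q + offset) * scale)  # dequantize back to float
    return simulated


# Example: simulate 8-bit quantization of a small weight vector.
weights = [-1.0, -0.26, 0.0, 0.4, 1.0]
scale, offset, levels = compute_encoding(weights, bitwidth=8)
simulated = quantize_dequantize(weights, scale, offset, levels)
for original, approx in zip(weights, simulated):
    print(f"{original:+.4f} -> {approx:+.4f}")
```

Each simulated value stays in floating point but can only take one of the values on the quantization grid, so comparing `weights` with `simulated` gives a rough sense of the error that quantized inference would introduce.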