AIMET PyTorch Quantization APIs
To make full use of AIMET Quantization features, users are encouraged to follow several guidelines when defining PyTorch models. AIMET provides APIs that automate some of the required model definition changes and check whether AIMET Quantization features can be applied to a PyTorch model.
- Users should invoke the Model Preparer API before using any AIMET Quantization feature.
  - Model Guidelines: Guidelines for defining PyTorch models
  - Model Preparer API: Automates model definition changes
  - Model Validator API: Checks whether AIMET Quantization features can be applied to a PyTorch model
- AIMET Quantization for PyTorch models provides the following functionality.
  - Quantization Simulation API: Simulates inference and training on quantized hardware
  - Adaptive Rounding API: Post-training quantization technique that optimizes the rounding of weight tensors
  - Cross-Layer Equalization API: Post-training quantization technique that equalizes layer parameters
  - Bias Correction API: Post-training quantization technique that corrects the shift in layer outputs caused by quantization noise
  - AutoQuant API: Unified API that integrates the post-training quantization techniques provided by AIMET
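To give a feel for what "simulating quantized hardware" means, the sketch below quantizes a list of floats to a low-bitwidth integer grid and maps them back to floats, so the rounding noise becomes visible. It is a minimal, framework-free illustration and not AIMET's implementation: the function name `quantize_dequantize`, the asymmetric min/max range choice, and the sample weights are all assumptions made for this example.

```python
def quantize_dequantize(values, bitwidth=8):
    """Hypothetical helper: round floats to a uniform integer grid, then map back."""
    # Extend the range to include zero so that 0.0 stays exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    levels = 2 ** bitwidth - 1                      # highest integer code
    scale = (hi - lo) / levels if hi > lo else 1.0  # float width of one step
    zero_point = round(-lo / scale)                 # integer code that represents 0.0

    simulated = []
    for v in values:
        q = round(v / scale) + zero_point           # quantize to an integer code
        q = max(0, min(levels, q))                  # clamp to the valid code range
        simulated.append((q - zero_point) * scale)  # dequantize back to float
    return simulated


# Illustrative weights (made up for this sketch), quantized to 4 bits.
weights = [-1.0, -0.25, 0.0, 0.3, 1.0]
simulated = quantize_dequantize(weights, bitwidth=4)
max_error = max(abs(w - s) for w, s in zip(weights, simulated))
```

Comparing `weights` with `simulated` shows the kind of quantization noise that the post-training techniques listed above are meant to reduce.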