AIMET Documentation¶
The AI Model Efficiency Toolkit (AIMET) is a software toolkit for quantizing and compressing trained models. The primary objective of model optimization is to facilitate deploying models on edge devices such as mobile phones and laptops.
AIMET employs post-training and fine-tuning techniques to minimize accuracy loss during quantization and compression.
AIMET supports models from the PyTorch, TensorFlow/Keras, and ONNX frameworks.
Quick Start¶
To quickly install and begin using AIMET with PyTorch, see the Quick Start Guide.
Installation¶
For other install options, including for TensorFlow and ONNX platforms or to run AIMET in a Docker container, see Installation.
User Guide¶
For a high-level explanation of how to use AIMET to optimize a model, see the Quantization User Guide.
Quantization Simulation Guide¶
Quantization simulation (QuantSim) emulates the behavior of quantized hardware within a model running on floating-point hardware. For a guide to quantization simulation and its related techniques, see the Quantization Simulation Guide.
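The core idea behind quantization simulation can be sketched in a few lines: each value is rounded to a discrete integer grid and then mapped back to floating point, so the model experiences quantization error while still running in floating-point arithmetic. The sketch below is a generic illustration of this quantize-dequantize ("fake quantization") operation, not AIMET's actual API; the `scale` and `zero_point` values are illustrative assumptions.

```python
def quantize_dequantize(x: float, scale: float, zero_point: int, bitwidth: int = 8) -> float:
    """Round-trip a float through a simulated integer grid (fake quantization)."""
    qmin, qmax = 0, 2 ** bitwidth - 1
    # Map to the integer grid and clamp to the representable range...
    q = min(qmax, max(qmin, round(x / scale) + zero_point))
    # ...then map back to floating point, introducing quantization error.
    return (q - zero_point) * scale

# Illustrative int8 parameters covering the range [-1.0, 1.0] (assumed, not AIMET defaults)
scale = 2.0 / 255      # step size: 256 levels across a range of width 2.0
zero_point = 128       # integer offset that represents 0.0

print(quantize_dequantize(0.0, scale, zero_point))   # exactly representable
print(quantize_dequantize(0.7, scale, zero_point))   # small rounding error
print(quantize_dequantize(5.0, scale, zero_point))   # clamped to the range maximum
```

In a real QuantSim workflow, operations like this are inserted throughout the model so that accuracy on quantized hardware can be estimated, and recovered via fine-tuning, before deployment.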
Feature Guide¶
For instructions on applying individual AIMET features, see the Feature Guide.
Example Notebooks¶
To view end-to-end examples of model quantization and compression, and to download the examples in Jupyter notebook format, see the Example Notebooks page.
API Reference¶
For a detailed look at the AIMET API, see the API Reference.
Release Notes¶
For information about new features in this release, see the Release Notes.
Glossary¶
See the glossary for explanations of terms and acronyms used on this website.
AI Model Efficiency Toolkit is a product of Qualcomm Innovation Center, Inc.
Qualcomm® AI Engine Direct SDK is a product of Qualcomm Technologies, Inc. and/or its subsidiaries.