AIMET Documentation

The AI Model Efficiency Toolkit (AIMET) is a software toolkit for quantizing and compressing trained models. The primary objective of model optimization is to enable deployment of trained models on edge devices such as mobile phones and laptops.

AIMET employs post-training and fine-tuning techniques to minimize accuracy loss during quantization and compression.

AIMET supports models from the PyTorch, TensorFlow/Keras, and ONNX frameworks.

Quick Start

To quickly install and begin using AIMET with PyTorch, see the Quick Start Guide.

Installation

For other install options, including for TensorFlow and ONNX platforms or to run AIMET in a Docker container, see Installation.

User Guide

For a high-level explanation of how to use AIMET to optimize a model, see the Quantization User Guide.

Quantization Simulation Guide

Quantization simulation (QuantSim) emulates the behavior of quantized hardware within a model that runs on floating-point hardware, allowing you to estimate on-target accuracy before deployment. For a guide to quantization simulation and its related techniques, see the Quantization Simulation Guide.
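At the core of quantization simulation is a quantize-dequantize round trip: values are rounded onto an integer grid and mapped back to floating point, so the model experiences quantization error while still executing in floating point. The sketch below is a conceptual illustration using NumPy, not AIMET's actual API; the `fake_quantize` helper and its parameters are hypothetical.

```python
import numpy as np

def fake_quantize(x: np.ndarray, bitwidth: int = 8) -> np.ndarray:
    """Simulate quantization: round values to a uniform integer grid,
    then map them back to floating point (quantize-dequantize).

    Hypothetical helper for illustration only; AIMET's real QuantSim
    inserts comparable operations into the model graph automatically.
    """
    # Asymmetric quantization range derived from the tensor's observed min/max.
    x_min, x_max = float(x.min()), float(x.max())
    n_steps = 2 ** bitwidth - 1
    scale = (x_max - x_min) / n_steps or 1.0  # guard against constant tensors
    offset = round(x_min / scale)
    # Quantize: map to integers and clamp to the representable range.
    q = np.clip(np.round(x / scale) - offset, 0, n_steps)
    # Dequantize: map back to float, carrying the rounding error with it.
    return (q + offset) * scale
```

Running a model with such quantize-dequantize operations inserted at weights and activations lets the resulting accuracy drop be measured, and recovered via calibration or fine-tuning, without needing quantized hardware.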

Feature Guide

For instructions on applying individual AIMET features, see the Feature Guide.

Example Notebooks

To view end-to-end examples of model quantization and compression, and to download the examples in Jupyter notebook format, see the Example Notebooks page.

API Reference

For a detailed look at the AIMET API, see the API Reference.

Release Notes

For information about new features in this release, see the Release Notes.

Glossary

See the glossary for explanations of terms and acronyms used on this website.




AI Model Efficiency Toolkit is a product of Qualcomm Innovation Center, Inc.

Qualcomm® AI Engine Direct SDK is a product of Qualcomm Technologies, Inc. and/or its subsidiaries.