AIMET Documentation

The AI Model Efficiency Toolkit (AIMET) is a software toolkit for quantizing trained ML models.

AIMET improves the runtime performance of deep learning models by reducing compute load and memory footprint.

Models quantized with AIMET are easier to deploy on edge devices such as mobile phones and laptops because of their reduced memory footprint.

AIMET employs post-training and fine-tuning techniques to minimize accuracy loss during quantization and compression.

AIMET supports models from the ONNX, PyTorch and TensorFlow/Keras frameworks.

Overview

For a summary of what the AI Model Efficiency Toolkit (AIMET) is and how you can use it, see the Overview.

To quickly install and begin using AIMET, see the Quick Start Guide.

Tutorials

For curated tutorials on quantization workflows and other AIMET techniques, see the Tutorials.

Example Notebooks

To view end-to-end examples of model quantization and compression, and to download the examples in Jupyter notebook format, see the Example Notebooks page.

Techniques

For a high-level explanation of how to use AIMET to optimize a model, see the Quantization User Guide.

Post-Training Quantization Techniques

For instructions on applying individual AIMET post-training quantization (PTQ) techniques, see the PTQ Techniques.

API Reference

For a detailed look at the AIMET API, see the API Reference.

Release Notes

For information about new features in this release, see the Release Notes.

Glossary

See the glossary for explanations of terms and acronyms used on this website.




AI Model Efficiency Toolkit is a product of Qualcomm Innovation Center, Inc.

Qualcomm® AI Engine Direct SDK is a product of Qualcomm Technologies, Inc. and/or its subsidiaries.