AIMET Documentation

The AI Model Efficiency Toolkit (AIMET) is a software toolkit for quantizing and compressing trained models. The primary objective of model optimization is to enable deployment of trained models on edge devices such as mobile phones and laptops.

AIMET employs post-training and fine-tuning techniques to minimize accuracy loss during quantization and compression.

AIMET supports models from the PyTorch, TensorFlow/Keras, and ONNX frameworks.

Quick Start

To quickly install and begin using AIMET with PyTorch, see the Quick Start Guide.

Installation

For other install options, including for TensorFlow and ONNX platforms or to run AIMET in a Docker container, see Installation.

User Guide

For a high-level explanation of how to use AIMET to optimize a model, see the Quantization User Guide.

Quantization Simulation Guide

Quantization simulation (QuantSim) emulates the behavior of quantized hardware within a model that runs on floating-point hardware, allowing you to estimate on-target accuracy before deployment. For a guide to quantization simulation and its related techniques, see the Quantization Simulation Guide.
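At the core of quantization simulation is a quantize-dequantize round trip: values are rounded onto an integer grid and mapped back to floating point, so the model experiences quantization error while still executing in floating point. The sketch below is a conceptual illustration using NumPy, not AIMET's actual API; the `fake_quantize` helper and its parameters are hypothetical.

```python
import numpy as np

def fake_quantize(x: np.ndarray, bitwidth: int = 8) -> np.ndarray:
    """Simulate quantization: round values to a uniform integer grid,
    then map them back to floating point (quantize-dequantize).

    Hypothetical helper for illustration only; AIMET's real QuantSim
    inserts comparable operations into the model graph automatically.
    """
    # Asymmetric quantization range derived from the tensor's observed min/max.
    x_min, x_max = float(x.min()), float(x.max())
    n_steps = 2 ** bitwidth - 1
    scale = (x_max - x_min) / n_steps or 1.0  # guard against constant tensors
    offset = round(x_min / scale)
    # Quantize: map to integers and clamp to the representable range.
    q = np.clip(np.round(x / scale) - offset, 0, n_steps)
    # Dequantize: map back to float, carrying the rounding error with it.
    return (q + offset) * scale
```

Running a model with such quantize-dequantize operations inserted at weights and activations lets the resulting accuracy drop be measured, and recovered via calibration or fine-tuning, without needing quantized hardware.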

Feature Guide

For instructions on applying individual AIMET features, see the Feature Guide.

Example Notebooks

To view end-to-end examples of model quantization and compression, and to download the examples in Jupyter notebook format, see the Example Notebooks page.

API Reference

For a detailed look at the AIMET API, see the API Reference.

Release Notes

For information about new features in this release, see the Release Notes.

Glossary

See the glossary for explanations of terms and acronyms used on this website.




AI Model Efficiency Toolkit is a product of Qualcomm Innovation Center, Inc.

Qualcomm® AI Engine Direct SDK is a product of Qualcomm Technologies, Inc. and/or its subsidiaries.