Installation

This page provides instructions for installing the latest version of AIMET across supported frameworks: ONNX, PyTorch, and TensorFlow.

Select the installation method that best suits your environment:

PyPI

Install AIMET from PyPI

# For ONNX models
pip install aimet-onnx
# Optional: to accelerate ONNX quantization with CUDA
pip install onnxruntime-gpu

# For PyTorch models
pip install aimet-torch

Note

aimet_tf is not available on PyPI. Please download the aimet_tf package from https://github.com/quic/aimet/releases
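After installing from PyPI, you can sanity-check that the packages resolve in your environment. This is a hedged sketch: the import names aimet_onnx and aimet_torch are assumptions based on the package names above; adjust them if your environment differs.

```python
import importlib.util

# Check which AIMET variants are importable without actually importing them.
status = {
    name: importlib.util.find_spec(name) is not None
    for name in ("aimet_onnx", "aimet_torch")
}
print(status)  # e.g. {'aimet_onnx': True, 'aimet_torch': True} after a successful install
```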

Alternative packages

Install the latest version of AIMET for your framework (ONNX, PyTorch, or TensorFlow) and compute platform from the .whl files hosted at https://github.com/quic/aimet/releases.

Prerequisites

The AIMET package requires the following host platform setup. These prerequisites apply to all framework variants.

  • 64-bit Intel x86-compatible processor

  • OS: Ubuntu 22.04 LTS

  • Python 3.10

  • For GPU variants:
    • Nvidia GPU card (Compute capability 5.2 or later)

    • Nvidia driver version 455 or later (using the latest driver is recommended; both CUDA and cuDNN are supported)

Choose and install a package

Use one of the following commands to install AIMET based on your choice of framework and compute platform.

ONNX

With CUDA 12.x:

python3 -m pip install https://github.com/quic/aimet/releases/download/2.12.0/aimet_onnx-2.12.0+cu121-cp310-cp310-manylinux_2_34_x86_64.whl -f https://download.pytorch.org/whl/torch_stable.html

With CPU only:

python3 -m pip install https://github.com/quic/aimet/releases/download/2.12.0/aimet_onnx-2.12.0+cpu-cp310-cp310-manylinux_2_34_x86_64.whl -f https://download.pytorch.org/whl/torch_stable.html

PyTorch

With CUDA 12.x:

python3 -m pip install https://github.com/quic/aimet/releases/download/2.12.0/aimet_torch-2.12.0+cu121-py38-none-any.whl -f https://download.pytorch.org/whl/torch_stable.html

With CPU only:

python3 -m pip install https://github.com/quic/aimet/releases/download/2.12.0/aimet_torch-2.12.0+cpu-py38-none-any.whl -f https://download.pytorch.org/whl/torch_stable.html

TensorFlow 2.10

With CUDA 11.x:

python3 -m pip install https://github.com/quic/aimet/releases/download/2.12.0/aimet_tensorflow-2.12.0+cu118-cp310-cp310-manylinux_2_34_x86_64.whl

With CPU only:

python3 -m pip install https://github.com/quic/aimet/releases/download/2.12.0/aimet_tensorflow-2.12.0+cpu-cp310-cp310-manylinux_2_34_x86_64.whl

Verifying the installation

Verify your installation using the following instructions.

Step 1: Handle imports and other setup.

import numpy as np
from aimet_common import libpymo

x = np.random.randn(100)

quant_scheme = libpymo.QuantizationMode.QUANTIZATION_TF
analyzer = libpymo.EncodingAnalyzerForPython(quant_scheme)

Step 2: Compute scale and offset.

bitwidth = 8
is_symmetric, strict_symmetric, unsigned_symmetric = True, False, True
use_cuda = False
analyzer.updateStats(x, use_cuda)
encoding, _ = analyzer.computeEncoding(bitwidth, is_symmetric, strict_symmetric, unsigned_symmetric)

print(f'Min: {encoding.min}, Max: {encoding.max}, Scale(delta): {encoding.delta}, Offset: {encoding.offset}')

The encoding values should be similar to those shown below.

Min: -3.3734087606114667, Max: 3.3470540046691895, Scale(delta): 0.026354755942277083, Offset: -128.0
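The reported scale and offset can be reproduced by hand. For an 8-bit asymmetric (TF-scheme) encoding, the observed float range is divided into 2**bitwidth - 1 equal steps, and the offset is the minimum expressed in steps. The following is a sketch of that arithmetic under those assumptions, not the library's actual code:

```python
# Values taken from the example output above.
obs_min, obs_max = -3.3734087606114667, 3.3470540046691895
bitwidth = 8

delta = (obs_max - obs_min) / (2 ** bitwidth - 1)  # quantization step size
offset = round(obs_min / delta)                    # zero-point in integer steps

print(f"Scale(delta): {delta}, Offset: {offset}")
```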

Step 3: Perform quantize-dequantize.

quantizer = libpymo.TensorQuantizationSimForPython()
out = quantizer.quantizeDequantize(x,
                                   encoding,
                                   libpymo.RoundingMode.ROUND_NEAREST,
                                   bitwidth,
                                   use_cuda)
print(out)

The quantized-dequantized output should be similar to the one shown below.

[-1.291383    0.36896658  1.0541903  -1.2123188  -2.2137995   1.2650282
 -0.23719281  0.10541902  0.50074035 -0.05270951 -0.94877124  0.
  0.10541902  0.52709514 -0.7115784   2.2401543  -0.34261182  2.0293162
  0.34261182 -0.6061594  -0.36896658 -0.6588689  -1.5022211  -0.10541902
 -1.4758663  -0.8433522   0.7115784  -0.23719281  0.44803086 -0.94877124
  0.18448329 -1.0014807   0.55344987 -0.13177378  0.15812853 -0.7115784
 -0.4216761   1.1068997  -0.07906426  1.6603496   0.55344987 -0.47438562
 -0.6325141   0.4216761  -1.4495116   1.5549306  -0.6325141  -1.2123188
  0.50074035  1.291383    0.07906426 -1.2123188  -2.0820258   1.0014807
 -0.18448329 -0.4216761   1.0278355  -0.21083805  0.52709514  1.6867044
 -0.68522364  1.0278355  -0.55344987 -0.26354757  0.10541902 -0.02635476
  0.6588689  -0.34261182 -0.05270951  3.347054    0.07906426 -1.080545
 -0.57980466  1.4231569  -0.6588689   1.291383   -0.13177378  0.31625706
 -0.36896658  0.05270951 -0.81699747 -1.4231569  -1.1068997  -0.68522364
  0.7115784  -1.2650282  -0.7115784   0.50074035  0.28990233 -0.73793316
  0.21083805  2.4246376  -0.15812853  0.52709514 -0.02635476 -0.13177378
 -1.8711877   0.4216761  -0.55344987 -0.76428795]
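To see what Step 3 computes, here is a plain-NumPy sketch of uniform affine quantize-dequantize. It follows the usual integer-grid convention and is a reference sketch, not the library's actual implementation; the delta and offset values are illustrative, chosen to be similar in magnitude to the Step 2 encoding.

```python
import numpy as np

def quantize_dequantize(x, delta, offset, bitwidth):
    # Map to the integer grid, clip to the representable range,
    # then map back to floats.
    q = np.clip(np.round(x / delta) - offset, 0, 2 ** bitwidth - 1)
    return (q + offset) * delta

delta, offset = 0.0264, -128          # illustrative values
x = np.linspace(-3.0, 3.0, 100)       # inputs inside the representable range
out = quantize_dequantize(x, delta, offset, bitwidth=8)

# For in-range inputs the round-trip error is bounded by half a step.
print(f"max round-trip error: {np.max(np.abs(out - x)):.6f}")
```

Note that inputs outside the representable range, roughly [offset * delta, (2**bitwidth - 1 + offset) * delta], are clipped and incur larger error.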

Old versions

Release notes for older AIMET versions are available at https://github.com/quic/aimet/releases. Follow the documentation for the corresponding release to select and install the appropriate AIMET package.

Building from source

For most users, the recommended way to install AIMET is by using the pre-built package available through the pip package manager, as it provides the most straightforward and reliable experience.

However, if you prefer to work with the latest source code or plan to contribute to AIMET development, you’ll need to build AIMET from source. To do so, follow the steps in Build AIMET from source.