Examples¶
AIMET end-to-end examples are Jupyter Notebooks that are intended to:
Familiarize you with the AIMET APIs,
Demonstrate how to apply AIMET to a pre-trained model from the PyTorch, TensorFlow, or ONNX framework,
Teach you how to use AIMET quantization and compression techniques.
For a discussion of quantization and compression techniques, see the Optimization User Guide.
For the API reference, see API reference.
Browse the notebooks¶
The following tables provide links to viewable HTML versions of the Jupyter notebooks for AIMET quantization and compression features. Instructions after the tables describe how to run the notebooks.
Model Quantization Examples
Features |
PyTorch |
TensorFlow |
ONNX |
---|---|---|---|
Quantization simulation (QuantSim) |
|||
Quantization-aware training (QAT) |
Not implemented. |
||
Cross-layer equalization (CLE) |
|||
Adaptive rounding (AdaRound) |
|||
Automatic quantization (AutoQuant) |
Not implemented. |
||
Automatic mixed precision (AMP) |
|||
BatchNorm re-estimation |
Not implemented. |
||
Quant analyzer |
Not implemented. |
Model Compression Examples
Running the notebooks¶
To run the notebooks, follow the instructions below.
1. Run the notebook server¶
Install the Jupyter metapackage using the following command. (Prepend the command with sudo -H if necessary to grant admin privilege.)
python3 -m pip install jupyter
Start the notebook server as follows:
jupyter notebook --ip=* --no-browser &
The command generates and displays a URL in the terminal.
Copy and paste the URL into your browser.
Install AIMET and its dependencies using the instructions in AIMET installation.
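After installation, a quick check can confirm which AIMET framework packages Python can locate. The following is a minimal sketch, not part of the official instructions: the import names `aimet_torch`, `aimet_tensorflow`, and `aimet_onnx` are assumed here, and the helper `installed_aimet_packages` is a hypothetical name introduced for illustration.

```python
import importlib.util

# Assumed import names for the AIMET framework packages; adjust if your install differs.
AIMET_PACKAGES = {
    "PyTorch": "aimet_torch",
    "TensorFlow": "aimet_tensorflow",
    "ONNX": "aimet_onnx",
}


def installed_aimet_packages(packages=AIMET_PACKAGES):
    """Map each framework name to True if its package can be located (without importing it)."""
    return {
        framework: importlib.util.find_spec(module) is not None
        for framework, module in packages.items()
    }


if __name__ == "__main__":
    for framework, found in installed_aimet_packages().items():
        print(f"{framework}: {'found' if found else 'not found'}")
```

Only the package for the framework you plan to use needs to be found. The check locates packages without importing them, so it runs quickly even when a heavy framework is installed.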
3. Run the notebooks¶
Navigate to one of the following paths in your local repository directory and launch your chosen Jupyter notebook (.ipynb extension):
Model quantization notebooks
Examples/torch/quantization/
Examples/tensorflow/quantization/keras/
Examples/onnx/quantization/
Model compression notebooks
Examples/torch/compression/
Follow the instructions in the notebook to execute the code.
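To see which notebooks are available before opening the Jupyter file browser, the directories listed above can be scanned with a short script. This is a sketch under stated assumptions: `find_notebooks` is a hypothetical helper, and the repository root is passed in by the caller (the usage example assumes the script is run from that root).

```python
from pathlib import Path

# Example directories listed above, relative to the repository root.
NOTEBOOK_DIRS = [
    "Examples/torch/quantization",
    "Examples/tensorflow/quantization/keras",
    "Examples/onnx/quantization",
    "Examples/torch/compression",
]


def find_notebooks(repo_root):
    """Return {directory: sorted notebook file names} for the listed directories that exist."""
    root = Path(repo_root)
    found = {}
    for rel_dir in NOTEBOOK_DIRS:
        directory = root / rel_dir
        if directory.is_dir():
            found[rel_dir] = sorted(p.name for p in directory.glob("*.ipynb"))
    return found


if __name__ == "__main__":
    # "." assumes the current directory is the repository root.
    for rel_dir, notebooks in find_notebooks(".").items():
        print(rel_dir)
        for name in notebooks:
            print(f"  {name}")
```

Directories that are missing from the local checkout are skipped rather than reported as errors, so the script also works on a partial copy of the repository.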