tf-torch-cpu_1.29.0
AI Model Efficiency Toolkit

AIMET PyTorch APIs

  • PyTorch Model Quantization API
  • PyTorch Model Compression API
  • PyTorch Model Visualization API for Compression
  • PyTorch Model Visualization API for Quantization
  • PyTorch Debug API

© Copyright 2020, Qualcomm Innovation Center, Inc.
