Qualcomm® Cloud AI SDK User Guide
APIs
Refer to api.html.
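As a quick orientation before reading the full API reference, below is a minimal sketch of running an inference through the Cloud AI `qaic` Python package. The class name `Session`, the `run()` call, the QPC path, and the tensor names are all illustrative assumptions for this sketch; consult api.html for the authoritative classes and signatures.

```python
# Hypothetical sketch: run a compiled model (QPC) with the qaic Python package.
# All names below (Session, run, the QPC path, and the tensor names) are
# illustrative assumptions -- see api.html for the real API surface.
import numpy as np
import qaic  # Qualcomm Cloud AI Python package

# Load a compiled Qualcomm Program Container (QPC); the path is a placeholder.
sess = qaic.Session(model_path="path/to/programqpc.bin")

# Build an input dictionary keyed by the model's input tensor names.
inputs = {"input": np.zeros((1, 3, 224, 224), dtype=np.float32)}

# Execute one inference; the result maps output tensor names to arrays.
outputs = sess.run(inputs)
print({name: arr.shape for name, arr in outputs.items()})
```

The same workflow is also exposed through the C++ API documented in api.html.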