API

The Cloud AI 100 SDK exposes three API surfaces for running inference:

Python API
C++ API
OnnxRT API
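As a quick illustration of the OnnxRT path, the sketch below uses the standard ONNX Runtime Python API to load a model and run a single inference. The execution-provider name string, model path, and input handling are assumptions for illustration only; see the QAIC execution provider page for the exact registration string and provider options.

```python
# Minimal sketch: running an ONNX model through ONNX Runtime with a
# device execution provider, falling back to CPU. The provider name
# "QAICExecutionProvider" is a placeholder assumption, not confirmed here.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "model.onnx",  # placeholder model path
    providers=["QAICExecutionProvider", "CPUExecutionProvider"],
)

# Build a dummy input matching the model's first input tensor,
# substituting 1 for any symbolic (dynamic) dimensions.
input_meta = session.get_inputs()[0]
shape = [d if isinstance(d, int) else 1 for d in input_meta.shape]
dummy = np.zeros(shape, dtype=np.float32)

outputs = session.run(None, {input_meta.name: dummy})
print([o.shape for o in outputs])
```

The Python and C++ APIs documented in the sections above provide lower-level control over device selection, buffer management, and execution than the OnnxRT path shown here.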