Supported Features

This section describes the vLLM features supported on Qualcomm Cloud AI, including model coverage, serving capabilities, and performance optimizations.

Model Coverage

Execution, Memory, and Context Management

Decoding, Sampling, and Output Control

Model Optimization and Adaptation

Serving Architecture and Deployment

Configuration, Compatibility, and Reference