efficient-transformers

Getting Started

  • Introduction: Qualcomm efficient-transformers library
  • Validated Models
  • Models Coming Soon

Installation

  • Pre-requisites
  • Installation
  • Sanity Check

Upgrade Efficient-Transformers

  • Using GitHub Repository

Inference on Cloud AI 100

  • Quick Start
  • Command Line Interface Use (CLI)
  • Python API

QAIC Finetune

  • Finetune Infra

Blogs

  • Train anywhere, Infer on Qualcomm Cloud AI 100
  • How to Quadruple LLM Decoding Performance with Speculative Decoding (SpD) and Microscaling (MX) Formats on Qualcomm® Cloud AI 100
  • Power-efficient acceleration for large language models – Qualcomm Cloud AI SDK
  • Qualcomm Cloud AI 100 Accelerates Large Language Model Inference by ~2x Using Microscaling (Mx) Formats
  • Qualcomm Cloud AI Introduces Efficient Transformers: One API, Infinite Possibilities

Reference

  • Qualcomm Cloud AI home
  • Qualcomm Cloud AI SDK download
  • Qualcomm Cloud AI API reference
  • User Guide
  • OCP Microscaling Formats (MX) Specification

© Copyright 2024, Qualcomm.
