Frequently Asked Questions¶
General¶
What is the Cloud AI 100 accelerator?
Cloud AI 100 accelerators enable high-performance inference on deep learning models. The accelerators are available in multiple form factors and associated SKUs. Cloud AI SDKs enable end-to-end workflows, from onboarding a pre-trained model to deploying the ML inference application in production.
Cloud AI SDK Installation and Platform/OS Support¶
What operating systems and platforms are supported?
Where do I download the SDKs?
The Cloud AI SDK consists of a Platform SDK and an Apps SDK. Refer to Cloud AI SDKs for more information.
For Platform SDK download, see Platform SDK Download
For Apps SDK download, see Apps SDK Download
What environment variables need to be set to resolve toolchain errors such as a missing libQAic.so?
Set the environment variables as mentioned here.
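For example, a minimal sketch assuming the default SDK install location under /opt/qti-aic (adjust the paths to match your installation):

```bash
# Assumption: SDK installed at the default /opt/qti-aic location.
# Make the runtime libraries (e.g., libQAic.so) and tools visible:
export LD_LIBRARY_PATH=/opt/qti-aic/dev/lib/x86_64:$LD_LIBRARY_PATH
export PATH=/opt/qti-aic/exec:$PATH
```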
Deep Learning frameworks and networks¶
Which deep learning frameworks are supported by Cloud AI SDKs?
ONNX, TensorFlow, PyTorch, Caffe, and Caffe2 are supported by the compiler. `qaic-exec` can dump the operators supported across different frameworks. ONNX has the best operator support.
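As an illustrative sketch (the exact option name may vary by SDK version; verify with `qaic-exec -h`):

```bash
# Dump the operators supported for a given framework (option name
# assumed here; check `qaic-exec -h` in your SDK version).
/opt/qti-aic/exec/qaic-exec -operators-supported=onnx
```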
Which deep learning neural networks are supported?
Cloud AI platforms support many network categories: computer vision, object detection, semantic segmentation, natural language processing, ADAS, and generative AI networks.
Performance information can be found on the Qualcomm Cloud AI 100 page.
Model recipes can be found in the cloud-ai-sdk GitHub repository.
I have a neural network that I would like to run on Cloud AI platforms. How do I go about it?
There are three steps to running inference on Cloud AI platforms:
- Export the model in ONNX format (preferred due to operator support) and prepare the model.
- Compile the model to generate a QPC (Qaic Program Container).
- Execute, integrate, and deploy into the production pipeline (see the sketch after this list).
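A minimal command-line sketch of the compile and execute steps (flag names and values are illustrative; verify them with `qaic-exec -h` and `qaic-runner -h` in your SDK version):

```bash
# Compile an ONNX model into a QPC (illustrative flags; tune the
# core count and precision for your model).
/opt/qti-aic/exec/qaic-exec \
  -m=model.onnx \
  -aic-hw \
  -aic-num-cores=4 \
  -convert-to-fp16 \
  -aic-binary-dir=./qpc

# Run the compiled QPC on device 0 (option names assumed; verify
# with `qaic-runner -h`).
/opt/qti-aic/exec/qaic-runner -t ./qpc -d 0
```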
The quick start guide provides a quick overview of the steps involved in running inference, using a vision transformer model as an example.
Refer to Inference Workflow for detailed information on how to onboard and run inference on Cloud AI platforms.
Users can also refer to the model recipes, which provide the best performance for several networks across several categories.
Tutorials are another resource that walks through the workflows to onboard models, tune for best performance, profile inferences, and more.
Runtime errors during inference¶
While running inference, I encounter 'IOCTL: Connection timed out ERROR'. What is the fix?
There are three parameters that we recommend increasing significantly when this issue is encountered. If the issue persists, please raise a case through the Qualcomm case support system or the cloud-ai-sdk GitHub.
System management¶
Which utility/tool is used to query the health, telemetry, etc., of all Cloud AI cards in the server?
Use the `qaic-util` CLI tool to query the health, telemetry, etc., of Cloud AI cards in the server.
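For example (the `-q` option queries device status; check `qaic-util -h` for the full option list in your SDK version):

```bash
# Query status/health information for all Cloud AI cards in the server.
/opt/qti-aic/tools/qaic-util -q
```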
The Cloud AI device shows `Status:Error`. How do I fix it?
`Status:Error` could be due to one of the following:
- The respective card(s) has not booted up completely.
- The user has not used the `sudo` prefix, which is required if the user has not been added to the `qaic` group: `sudo /opt/qti-aic/tools/qaic-util`
- Unsupported OS/platform, secure boot, etc.
Users can try issuing a soc_reset to see if the device recovers.
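To confirm recovery, re-query the device status (assuming the default tool location; use `sudo` if you are not in the `qaic` group):

```bash
# Re-query device status after the reset; the card should no longer
# report Status:Error.
sudo /opt/qti-aic/tools/qaic-util -q | grep -i status
```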