aimet_torch.onnx.export
- aimet_torch.onnx.export(model, args, f, *, export_int32_bias=True, prequantize_constants=False, **kwargs)
Export a QuantizationSimModel to an ONNX model with ONNX QuantizeLinear and DequantizeLinear nodes embedded in the graph. This function takes the same set of arguments as torch.onnx.export()
- Parameters:
model – The model to be exported
args – Same as torch.onnx.export()
f – Same as torch.onnx.export()
export_int32_bias (bool, optional) – If True, generate and export int32 bias encodings on the fly (default: True)
**kwargs – Same as torch.onnx.export()
Note
For robustness, onnx >= 1.19 is highly recommended with this API, especially when exporting large models (> 2 GB), due to a known bug in the version converter of onnx versions earlier than 1.19. For more information, see https://github.com/onnx/onnx/issues/6529
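A quick guard along these lines can catch the problem before exporting; the 1.19 threshold is taken from the note above, and the use of the packaging library is an assumption, not something this API requires:

import onnx
from packaging import version

# Warn before exporting large (>2GB) models on onnx versions affected by the
# version-converter bug referenced above (onnx/onnx#6529)
if version.parse(onnx.__version__) < version.parse("1.19"):
    print(f"Warning: onnx {onnx.__version__} < 1.19; large-model export may fail")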
Note
Dynamo-based export (dynamo=True) is not yet supported.
Examples
>>> aimet_torch.onnx.export(sim.model, x, f="model.onnx",
...                         input_names=["input"], output_names=["output"],
...                         opset_version=21, dynamo=False,
...                         export_int32_bias=True)
>>> import onnxruntime as ort
>>> options = ort.SessionOptions()
>>> options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL
>>> sess = ort.InferenceSession("model.onnx", sess_options=options)
>>> onnx_output, = sess.run(None, {"input": x.detach().numpy()})
>>> torch.nn.functional.cosine_similarity(torch.from_numpy(onnx_output), sim.model(x))
tensor([1.0000, 0.9999, 1.0000,  ..., 1.0000, 1.0000, 1.0000],
       grad_fn=<AliasBackward0>)
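The example above assumes that sim (a QuantizationSimModel) and x (a sample input tensor) already exist. A minimal sketch of one way to prepare them is shown below; the toy model, the input shape, and the calibration callback are illustrative assumptions, and the exact QuantizationSimModel / compute_encodings signatures may vary across AIMET versions:

import torch
import aimet_torch.onnx
from aimet_torch.quantsim import QuantizationSimModel

# Toy float model and sample input (shapes are illustrative only)
model = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.ReLU()).eval()
x = torch.randn(8, 16)

# Build the quantization simulation model and calibrate its encodings,
# using the sample input as a stand-in for real calibration data
sim = QuantizationSimModel(model, dummy_input=x)
sim.compute_encodings(lambda m, _: m(x), forward_pass_callback_args=None)

# Export with QuantizeLinear/DequantizeLinear embedded in the ONNX graph
aimet_torch.onnx.export(sim.model, x, f="model.onnx",
                        input_names=["input"], output_names=["output"],
                        export_int32_bias=True)

From there, the exported model.onnx can be loaded and verified with onnxruntime exactly as in the example above.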