aimet_onnx.quantsim.set_grouped_blockwise_quantization_for_weights
Top level APIs
- aimet_onnx.quantsim.set_grouped_blockwise_quantization_for_weights(sim, op_types, bitwidth, decompressed_bw, block_size, strict=False)
Set weight parameter quantizers of modules to grouped blockwise quantization.
- Parameters:
  - sim (QuantizationSimModel) – Quantsim to set weight quantizers for
  - op_types (Union[str, Tuple]) – Operator types for which to enable grouped blockwise weight quantization
  - bitwidth (int) – Bitwidth for affine quantization
  - decompressed_bw (int) – Decompressed bitwidth for grouped block quantization
  - block_size (int) – Block size for affine quantization. The block size is applied to the weight's input features dimension, while per-channel quantization is used for the weight's output features dimension (see the sketch after this list)
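To make the block layout concrete, the sketch below (not part of the AIMET API; the weight shape is a hypothetical example) counts the quantization scales a Gemm weight of shape [out_features, in_features] = [128, 512] would receive with block_size=64:
>>> out_features, in_features = 128, 512     # hypothetical Gemm weight shape
>>> blocks_per_channel = in_features // 64   # blockwise along the input features dimension
>>> blocks_per_channel
8
>>> out_features * blocks_per_channel        # per-channel along the output features dimension
1024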
Examples
>>> # Assume 'sim' is a QuantizationSimModel object
>>> # Sets all Gemm, MatMul, and Conv weight quantizers to block_size 64 in the input_channels dimension:
>>> set_grouped_blockwise_quantization_for_weights(sim=sim,
...                                                op_types=("Gemm", "MatMul", "Conv"),
...                                                bitwidth=4,
...                                                decompressed_bw=8,
...                                                block_size=64)
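The snippet below extends this into a hedged end-to-end sketch. The model path "model.onnx" is hypothetical, and constructing QuantizationSimModel with default arguments is an assumption; consult the QuantizationSimModel documentation for its exact constructor signature.
>>> import onnx
>>> from aimet_onnx.quantsim import QuantizationSimModel, set_grouped_blockwise_quantization_for_weights
>>> model = onnx.load("model.onnx")      # hypothetical model path
>>> sim = QuantizationSimModel(model)    # assumed default quantsim settings
>>> set_grouped_blockwise_quantization_for_weights(sim=sim,
...                                                op_types=("Gemm", "MatMul", "Conv"),
...                                                bitwidth=4,
...                                                decompressed_bw=8,
...                                                block_size=64)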