aimet_onnx.quantsim.set_grouped_blockwise_quantization_for_weights

Top level APIs

aimet_onnx.quantsim.set_grouped_blockwise_quantization_for_weights(sim, op_types, bitwidth, decompressed_bw, block_size, strict=False)[source]

Set the weight parameter quantizers of matching modules to use grouped blockwise quantization.

Parameters:
  • sim (QuantizationSimModel) – Quantsim to set weight quantizers for

  • op_types (Union[str, Tuple]) – Operator types for which to enable grouped blockwise weight quantization

  • bitwidth (int) – Bitwidth for affine quantization

  • decompressed_bw (int) – Decompressed bitwidth for grouped blockwise quantization; must be greater than or equal to bitwidth

  • block_size (int) – Block size for affine quantization. The block size applies to the weight’s input features dimension; per-channel quantization is used along the weight’s output features dimension
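
To see how bitwidth, decompressed_bw, and block_size interact, here is a minimal NumPy sketch of a grouped blockwise (LPBQ-style) scale decomposition. It is illustrative only, not AIMET's implementation, and the function name grouped_block_scales is hypothetical. Per-block scales along the input features dimension are re-expressed as small integer multiples of a coarser per-channel scale:

>>> import numpy as np
>>> def grouped_block_scales(weight, block_size, bitwidth, decompressed_bw):
...     # weight: [output_features, input_features]; for this sketch,
...     # input_features must be divisible by block_size
...     out_ch, in_ch = weight.shape
...     blocks = weight.reshape(out_ch, in_ch // block_size, block_size)
...     # One symmetric scale per block at the compressed bitwidth
...     block_scale = np.abs(blocks).max(axis=-1) / (2 ** (bitwidth - 1) - 1)
...     # Coarse per-channel scale spanning the largest block scale per channel
...     per_channel_scale = block_scale.max(axis=-1) / 2 ** (decompressed_bw - bitwidth)
...     # Each block scale becomes an integer multiple of the per-channel scale,
...     # so only (decompressed_bw - bitwidth) extra bits are stored per block
...     int_multiplier = np.clip(np.round(block_scale / per_channel_scale[:, None]),
...                              1, 2 ** (decompressed_bw - bitwidth))
...     return per_channel_scale, int_multiplier
>>> w = np.random.randn(32, 128).astype(np.float32)
>>> per_channel_scale, int_multiplier = grouped_block_scales(w, 64, 4, 8)
>>> int_multiplier.shape  # one integer multiplier per (output channel, block)
(32, 2)

With bitwidth=4 and decompressed_bw=8, each block scale is stored as a 4-bit integer multiplier of the per-channel scale, so the weights stay 4-bit while dequantization recovers an effectively 8-bit grid per output channel.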

Examples

>>> # Assume 'sim' is a QuantizationSimModel object
>>> # Set all Gemm, MatMul, and Conv weight quantizers to use block_size 64 along the input channels dimension:
>>> set_grouped_blockwise_quantization_for_weights(sim=sim,
...                                                op_types=("Gemm", "MatMul", "Conv"),
...                                                bitwidth=4,
...                                                decompressed_bw=8,
...                                                block_size=64)
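
>>> # Because op_types is annotated Union[str, Tuple], a single operator type
>>> # string should also be accepted; for example, to target only Conv weights:
>>> set_grouped_blockwise_quantization_for_weights(sim=sim,
...                                                op_types="Conv",
...                                                bitwidth=4,
...                                                decompressed_bw=8,
...                                                block_size=64)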