quantize

aimet_torch.quantization.affine.quantize(tensor, scale, offset, *args, **kwargs)[source]

Applies quantization to the input.

Precisely,

$$out = clamp\left(\left\lceil\frac{input}{scale}\right\rfloor - offset,\; qmin,\; qmax\right)$$

If block size $B = (B_0, B_1, \ldots, B_{D-1})$ is specified, this equation is further generalized as

$$out_{j_0 \ldots j_{D-1}} = clamp\left(\left\lceil\frac{input_{j_0 \ldots j_{D-1}}}{scale_{i_0 \ldots i_{D-1}}}\right\rfloor - offset_{i_0 \ldots i_{D-1}},\; qmin,\; qmax\right)$$

$$\text{where} \quad \forall\, 0 \le d < D, \quad i_d = \left\lfloor \frac{j_d}{B_d} \right\rfloor$$
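The equations above can be sketched in plain Python for the 1-D case (a minimal illustration only; `quantize_1d` is a hypothetical helper, not part of the AIMET API, and the real kernel operates on `torch.Tensor` inputs):

```python
def quantize_1d(x, scale, offset, qmin, qmax, block_size=None):
    """Sketch of the affine quantize equation for a 1-D input list.

    With block_size, scale/offset hold one entry per block, and element j
    uses block i = j // block_size (i.e. i_d = floor(j_d / B_d)).
    Note: Python's round() uses banker's rounding, which may differ from
    the library's round-to-nearest at exact .5 boundaries.
    """
    out = []
    for j, v in enumerate(x):
        i = j // block_size if block_size is not None else 0
        q = round(v / scale[i]) - offset[i]      # round(x / scale) - offset
        out.append(min(max(q, qmin), qmax))      # clamp to [qmin, qmax]
    return out
```

For example, with `block_size=2` the first two elements share `scale[0]` and `offset[0]`, while the next two share `scale[1]` and `offset[1]`.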

This function is overloaded with the signatures listed below:

aimet_torch.quantization.affine.quantize(tensor, scale, offset, bitwidth, signed=False, block_size=None)[source]

Equivalent to:

$$qmin = \begin{cases} -\left\lceil \dfrac{2^{bitwidth} - 1}{2} \right\rceil, & \text{if signed} \\ 0, & \text{otherwise (default)} \end{cases} \qquad qmax = \begin{cases} \left\lfloor \dfrac{2^{bitwidth} - 1}{2} \right\rfloor, & \text{if signed} \\ 2^{bitwidth} - 1, & \text{otherwise (default)} \end{cases}$$
Parameters:
  • tensor (Tensor) – Tensor to quantize

  • scale (Tensor) – Scale for quantization

  • offset (Tensor) – Offset for quantization

  • bitwidth (int) – Bitwidth of quantized tensor based on which qmin and qmax will be derived

  • signed (bool) – If false, the output will be mapped to non-negative integers only. Otherwise, it will range over both positive and negative integers.

  • block_size (Tuple[int, ...], optional) – Block size
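The derivation of qmin and qmax from bitwidth can be sketched in plain Python (the helper name `qrange_from_bitwidth` is illustrative, not part of the AIMET API):

```python
import math

def qrange_from_bitwidth(bitwidth, signed=False):
    # 2**bitwidth - 1 quantization steps span the range; when signed,
    # the steps are split roughly evenly around zero.
    num_steps = 2 ** bitwidth - 1
    if signed:
        return -math.ceil(num_steps / 2), num_steps // 2
    return 0, num_steps
```

For instance, `bitwidth=4` gives `(0, 15)` unsigned and `(-8, 7)` signed, matching the standard INT4 ranges.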

aimet_torch.quantization.affine.quantize(tensor, scale, offset, *, num_steps, signed=False, block_size=None)[source]

Equivalent to:

$$qmin = \begin{cases} -\left\lceil \dfrac{num\_steps}{2} \right\rceil, & \text{if signed} \\ 0, & \text{otherwise (default)} \end{cases} \qquad qmax = \begin{cases} \left\lfloor \dfrac{num\_steps}{2} \right\rfloor, & \text{if signed} \\ num\_steps, & \text{otherwise (default)} \end{cases}$$
Parameters:
  • tensor (Tensor) – Tensor to quantize

  • scale (Tensor) – Scale for quantization

  • offset (Tensor) – Offset for quantization

  • num_steps (int) – The number of steps in the quantization range based on which qmin and qmax will be derived

  • signed (bool) – If false, the output will be mapped to non-negative integers only. Otherwise, it will range over both positive and negative integers.

  • block_size (Tuple[int, ...], optional) – Block size
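Likewise, the num_steps-based derivation of qmin and qmax can be sketched as follows (`qrange_from_num_steps` is an illustrative name, not part of the AIMET API):

```python
import math

def qrange_from_num_steps(num_steps, signed=False):
    # When signed, the steps are split roughly evenly around zero,
    # with one extra step on the negative side for odd num_steps.
    if signed:
        return -math.ceil(num_steps / 2), num_steps // 2
    return 0, num_steps
```

With `num_steps=15` this reproduces the 4-bit ranges from the previous overload: `(0, 15)` unsigned and `(-8, 7)` signed.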

aimet_torch.quantization.affine.quantize(tensor, scale, offset, *, qmin, qmax, block_size=None)[source]
Parameters:
  • tensor (Tensor) – Tensor to quantize

  • scale (Tensor) – Scale for quantization

  • offset (Tensor) – Offset for quantization

  • qmin (int) – Minimum value of the quantization range

  • qmax (int) – Maximum value of the quantization range

  • block_size (Tuple[int, ...], optional) – Block size

Examples

>>> import torch
>>> import aimet_torch.v2.quantization as Q
>>> input = torch.arange(start=-0.3, end=1.3, step=0.05)
>>> print(input)
tensor([-3.0000e-01, -2.5000e-01, -2.0000e-01, -1.5000e-01, -1.0000e-01,
        -5.0000e-02, -1.1921e-08,  5.0000e-02,  1.0000e-01,  1.5000e-01,
         2.0000e-01,  2.5000e-01,  3.0000e-01,  3.5000e-01,  4.0000e-01,
         4.5000e-01,  5.0000e-01,  5.5000e-01,  6.0000e-01,  6.5000e-01,
         7.0000e-01,  7.5000e-01,  8.0000e-01,  8.5000e-01,  9.0000e-01,
         9.5000e-01,  1.0000e+00,  1.0500e+00,  1.1000e+00,  1.1500e+00,
        1.2000e+00,  1.2500e+00])
>>> scale = torch.tensor(1/15)
>>> offset = torch.tensor(0.0)
>>> Q.affine.quantize(input, scale, offset, bitwidth=4)
tensor([ 0.,  0.,  0.,  0.,  0.,  0., -0.,  1.,  2.,  2.,  3.,  4.,  4.,  5.,
         6.,  7.,  7.,  8.,  9., 10., 10., 11., 12., 13., 13., 14., 15., 15.,
         15., 15., 15., 15.])
>>> Q.affine.quantize(input, scale, offset, num_steps=15)
tensor([ 0.,  0.,  0.,  0.,  0.,  0., -0.,  1.,  2.,  2.,  3.,  4.,  4.,  5.,
         6.,  7.,  7.,  8.,  9., 10., 10., 11., 12., 13., 13., 14., 15., 15.,
         15., 15., 15., 15.])
>>> Q.affine.quantize(input, scale, offset, qmin=0, qmax=15)
tensor([ 0.,  0.,  0.,  0.,  0.,  0., -0.,  1.,  2.,  2.,  3.,  4.,  4.,  5.,
         6.,  7.,  7.,  8.,  9., 10., 10., 11., 12., 13., 13., 14., 15., 15.,
         15., 15., 15., 15.])