quantize_dequantize
- aimet_torch.v2.quantization.affine.quantize_dequantize(tensor, scale, offset, *args, **kwargs)[source]
Applies fake-quantization by quantizing and dequantizing the input.
Precisely,
\[out = (\overline{input} + offset) * scale\]where
\[\overline{input} = clamp\left(\left\lceil\frac{input}{scale}\right\rfloor - offset, qmin, qmax\right)\]If block size \(B = \begin{pmatrix} B_0 & B_1 & \cdots & B_{D-1} \end{pmatrix}\) is specified, this equation will be further generalized as
\[ \begin{align}\begin{aligned}\begin{split}out_{j_0 \cdots j_{D-1}} &= (\overline{input}_{j_0 \cdots j_{D-1}} + offset_{i_0 \cdots i_{D-1}}) * scale_{i_0 \cdots i_{D-1}}\\ \overline{input}_{j_0 \cdots j_{D-1}} &= clamp\left( \left\lceil\frac{input_{j_0 \cdots j_{D-1}}}{scale_{i_0 \cdots i_{D-1}}}\right\rfloor - offset_{i_0 \cdots i_{D-1}}, qmin, qmax\right)\\\end{split}\\\text{where } \quad \forall_{0 \leq d < D} \quad i_d = \left\lfloor \frac{j_d}{B_d} \right\rfloor\end{aligned}\end{align} \]This function is overloaded with the signatures listed below:
- aimet_torch.v2.quantization.affine.quantize_dequantize(tensor, scale, offset, bitwidth, signed=False, block_size=None)[source]
Equivalent to:
\[\begin{split}qmin= \begin{cases} -\left\lceil\frac{2^{bitwidth}-1}{2}\right\rceil,& \text{if } signed\\ 0, & \text{otherwise (default)} \end{cases} qmax= \begin{cases} \left\lfloor\frac{2^{bitwidth}-1}{2}\right\rfloor,& \text{if } signed\\ 2^{bitwidth}-1, & \text{otherwise (default)} \end{cases}\end{split}\]- Parameters
tensor (Tensor) – Tensor to quantize
scale (Tensor) – Scale for quantization
offset (Tensor) – Offset for quantization
bitwidth (int) – Bitwidth of quantized tensor based on which \(qmin\) and \(qmax\) will be derived
signed (bool) – If false, \(\overline{input}\) will be mapped to positive integers only. Otherwise, \(\overline{input}\) will range over both positive and negative integers.
block_size (Tuple[int, ...], optional) – Block size
- aimet_torch.v2.quantization.affine.quantize_dequantize(tensor, scale, offset, *, num_steps, signed=False, block_size=None)[source]
Equivalent to:
\[\begin{split}qmin= \begin{cases} -\left\lceil\frac{num\_steps}{2}\right\rceil,& \text{if } signed\\ 0, & \text{otherwise (default)} \end{cases} qmax= \begin{cases} \left\lfloor\frac{num\_steps}{2}\right\rfloor,& \text{if } signed\\ num\_steps, & \text{otherwise (default)} \end{cases}\end{split}\]- Parameters
tensor (Tensor) – Tensor to quantize
scale (Tensor) – Scale for quantization
offset (Tensor) – Offset for quantization
num_steps (int) – The number of steps in the quantization range based on which \(qmin\) and \(qmax\) will be derived
signed (bool) – If false, \(\overline{input}\) will be mapped to positive integers only. Otherwise, \(\overline{input}\) will range over both positive and negative integers.
block_size (Tuple[int, ...], optional) – Block size
- aimet_torch.v2.quantization.affine.quantize_dequantize(tensor, scale, offset, *, qmin, qmax, block_size=None)[source]
- Parameters
tensor (Tensor) – Tensor to quantize
scale (Tensor) – Scale for quantization
offset (Tensor) – Offset for quantization
qmin (int) – Minimum value of the quantization range
qmax (int) – Maximum value of the quantization range
block_size (Tuple[int, ...], optional) – Block size
Examples
>>> import aimet_torch.v2.quantization as Q >>> input = torch.arange(start=-0.3, end=1.3, step=0.05) >>> print(input) tensor([-3.0000e-01, -2.5000e-01, -2.0000e-01, -1.5000e-01, -1.0000e-01, -5.0000e-02, -1.1921e-08, 5.0000e-02, 1.0000e-01, 1.5000e-01, 2.0000e-01, 2.5000e-01, 3.0000e-01, 3.5000e-01, 4.0000e-01, 4.5000e-01, 5.0000e-01, 5.5000e-01, 6.0000e-01, 6.5000e-01, 7.0000e-01, 7.5000e-01, 8.0000e-01, 8.5000e-01, 9.0000e-01, 9.5000e-01, 1.0000e+00, 1.0500e+00, 1.1000e+00, 1.1500e+00, 1.2000e+00, 1.2500e+00]) >>> scale = torch.tensor(1/15) >>> offset = torch.tensor(0.0) >>> Q.affine.quantize_dequantize(input, scale, offset, bitwidth=4) tensor([0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0667, 0.1333, 0.1333, 0.2000, 0.2667, 0.2667, 0.3333, 0.4000, 0.4667, 0.4667, 0.5333, 0.6000, 0.6667, 0.6667, 0.7333, 0.8000, 0.8667, 0.8667, 0.9333, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000]) >>> Q.affine.quantize_dequantize(input, scale, offset, num_steps=15) tensor([0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0667, 0.1333, 0.1333, 0.2000, 0.2667, 0.2667, 0.3333, 0.4000, 0.4667, 0.4667, 0.5333, 0.6000, 0.6667, 0.6667, 0.7333, 0.8000, 0.8667, 0.8667, 0.9333, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000]) >>> Q.affine.quantize_dequantize(input, scale, offset, qmin=0, qmax=15) tensor([0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0667, 0.1333, 0.1333, 0.2000, 0.2667, 0.2667, 0.3333, 0.4000, 0.4667, 0.4667, 0.5333, 0.6000, 0.6667, 0.6667, 0.7333, 0.8000, 0.8667, 0.8667, 0.9333, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000])