QuantizedTensor

class aimet_torch.quantization.QuantizedTensor(*args, **kwargs)[source]

Represents a quantized tensor object. The object holds quantized values stored in a floating-point tensor along with an EncodingBase object which holds the information necessary to map the quantized values back to the real/represented values.
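For instance, given a QuantizedTensor x_q produced by a quantizer (as in the examples below), the raw quantized values live in the tensor itself, while the information needed to map them back to real values lives in its encoding. A minimal sketch, assuming x_q was produced by an affine quantizer as shown under dequantize():

>>> scale = x_q.encoding.scale      # mapping information held by the encoding
>>> x_int = x_q.quantized_repr()    # raw quantized values as an integer tensor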

dequantize()[source]

Dequantizes self using self.encoding to produce a DequantizedTensor with the same encoding information.

Return type:

DequantizedTensor

Example

>>> import torch
>>> from aimet_torch.v2 import quantization as Q
>>> x = torch.tensor([[2.57, -2.312],
...                   [0.153, 0.205]])
>>> quantizer = Q.affine.Quantize(shape=(), bitwidth=8, symmetric=True)
>>> quantizer.set_range(-128 * 0.1, 127 * 0.1)
>>> x_q = quantizer(x)
>>> x_q
QuantizedTensor([[ 26., -23.],
                 [  2.,   2.]], grad_fn=<AliasBackward0>)
>>> x_dq = x_q.dequantize()
>>> x_dq
DequantizedTensor([[ 2.6000, -2.3000],
                   [ 0.2000,  0.2000]], grad_fn=<AliasBackward0>)
>>> torch.equal(x_dq.encoding.scale, x_q.encoding.scale)
True
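Each dequantized value is the quantized value multiplied by the scale of 0.1 established by set_range (for example, 26 × 0.1 = 2.6), and as the check above shows, the encoding is carried over unchanged.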

quantize()[source]

Returns self

Return type:

QuantizedTensor
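Because quantize() returns self, calling it on an already-quantized tensor is a no-op. Reusing x_q from the example above:

>>> x_q.quantize() is x_q
True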

quantized_repr()[source]

Returns the quantized representation of self as a torch.Tensor with data type self.encoding.dtype.

Return type:

Tensor

Note

Depending on the quantized data type, the result of this function may not be able to carry a gradient. It may therefore be necessary to call this method only within a custom autograd function to allow for backpropagation.

Example

>>> import torch
>>> from aimet_torch.v2 import quantization as Q
>>> quantizer = Q.affine.Quantize(shape=(2, 1), bitwidth=8, symmetric=True)
>>> x = torch.randn((2, 4), requires_grad=True)
>>> with quantizer.compute_encodings():
...     x_q = quantizer(x)
>>> x_q
QuantizedTensor([[  11.,  -57., -128.,   38.],
                 [  28.,   -0., -128.,  -40.]], grad_fn=<AliasBackward0>)
>>> x_q.quantized_repr()
tensor([[  11,  -57, -128,   38],
        [  28,    0, -128,  -40]], dtype=torch.int8)