quantization.tensor

Classes

class aimet_torch.v2.quantization.tensor.QuantizedTensor(*args, **kwargs)

Represents a quantized tensor. The object holds quantized values stored in a floating-point tensor, along with an EncodingBase object that holds the information needed to map the quantized values back to the real (represented) values.

dequantize()

Dequantizes self using self.encoding to produce a DequantizedTensor with the same encoding information.

Return type: DequantizedTensor

Example

>>> import torch
>>> import aimet_torch.v2.quantization as Q
>>> x = torch.tensor([[2.57, -2.312],
...                   [0.153, 0.205]])
>>> quantizer = Q.affine.Quantize(shape=(), bitwidth=8, symmetric=True)
>>> quantizer.set_range(-128 * 0.1, 127 * 0.1)
>>> x_q = quantizer(x)
>>> x_q
QuantizedTensor([[ 26., -23.],
                 [  2.,   2.]], grad_fn=<AliasBackward0>)
>>> x_dq = x_q.dequantize()
>>> x_dq
DequantizedTensor([[ 2.6000, -2.3000],
                   [ 0.2000,  0.2000]], grad_fn=<AliasBackward0>)
>>> torch.equal(x_dq.encoding.scale, x_q.encoding.scale)
True
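The arithmetic behind the example above can be sketched in plain Python. This is a minimal illustration of the common symmetric affine scheme, not the library's implementation: the helper names `quantize_symmetric` and `dequantize_symmetric` are hypothetical, and the scale of 0.1 is assumed to follow from the range set above, (12.7 - (-12.8)) / 255.

```python
def quantize_symmetric(x, scale, bitwidth=8):
    """Map a real value onto the signed integer grid (hypothetical helper)."""
    qmin = -(2 ** (bitwidth - 1))      # -128 for 8 bits
    qmax = 2 ** (bitwidth - 1) - 1     # 127 for 8 bits
    q = round(x / scale)               # round to the nearest grid point
    return max(qmin, min(qmax, q))     # clamp to the representable range


def dequantize_symmetric(q, scale):
    """Map a grid point back to the real value it represents."""
    return q * scale


scale = 0.1  # assumed: (12.7 - (-12.8)) / 255
values = [2.57, -2.312, 0.153, 0.205]
quantized = [quantize_symmetric(v, scale) for v in values]
dequantized = [dequantize_symmetric(q, scale) for q in quantized]
print(quantized)
print([round(v, 4) for v in dequantized])
```

Under these assumptions the integers match the QuantizedTensor values shown above, and multiplying by the same scale recovers the DequantizedTensor values.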

quantize()

Returns self

Return type: QuantizedTensor

quantized_repr()

Returns the quantized representation of self as a torch.Tensor with data type self.encoding.dtype.

Return type: Tensor

Note

The result of this function may not be able to carry a gradient depending on the quantized data type. Thus, it may be necessary to call this only within an autograd function to allow for backpropagation.

Example

>>> import torch
>>> from aimet_torch.v2 import quantization as Q
>>> quantizer = Q.affine.Quantize(shape=(2, 1), bitwidth=8, symmetric=True)
>>> x = torch.randn((2, 4), requires_grad=True)
>>> with quantizer.compute_encodings():
...     x_q = quantizer(x)
>>> x_q
QuantizedTensor([[  11.,  -57., -128.,   38.],
                 [  28.,   -0., -128.,  -40.]], grad_fn=<AliasBackward0>)
>>> x_q.quantized_repr()
tensor([[  11,  -57, -128,   38],
        [  28,    0, -128,  -40]], dtype=torch.int8)
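To make the per-row (shape=(2, 1)) behavior and the integer cast concrete, here is a plain-Python sketch. It rests on assumptions: the function name `per_row_symmetric_quantize` is hypothetical, the calibration rule (the smallest scale that covers both extremes of a row) is a guess at one reasonable choice rather than what compute_encodings() does, and the input values are illustrative numbers chosen to land on the integers shown above. The results are plain Python ints, which, like an integer-typed tensor, cannot carry a gradient.

```python
def per_row_symmetric_quantize(rows, bitwidth=8):
    """Quantize each row with its own scale and return plain Python ints."""
    qmin = -(2 ** (bitwidth - 1))      # -128 for 8 bits
    qmax = 2 ** (bitwidth - 1) - 1     # 127 for 8 bits
    quantized_rows = []
    for row in rows:
        # Assumed calibration: the smallest scale that covers both extremes.
        # The 1e-12 floor only guards against an all-zero row.
        scale = max(max(row) / qmax, min(row) / qmin, 1e-12)
        quantized_rows.append(
            [max(qmin, min(qmax, round(v / scale))) for v in row]
        )
    return quantized_rows


# Illustrative inputs (not taken from the example above, which uses randn).
rows = [[0.11, -0.57, -1.28, 0.38],
        [0.56, 0.0, -2.56, -0.80]]
int_rows = per_row_symmetric_quantize(rows)
print(int_rows)
```

Each row gets its own scale (0.01 and 0.02 here), so the most negative entry of each row maps to -128 independently of the other row.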

class aimet_torch.v2.quantization.tensor.DequantizedTensor(*args, **kwargs)

Represents a tensor that has been quantized and subsequently dequantized. The object contains real floating-point data along with an EncodingBase object that holds the quantization parameters with which the data was quantized. With this, a DequantizedTensor can be converted back to its quantized representation without further loss of information.

dequantize()

Returns self

Return type: DequantizedTensor

quantize()

Quantizes self using self.encoding to produce a QuantizedTensor with the same encoding information.

Return type: QuantizedTensor

Example

>>> import torch
>>> import aimet_torch.v2.quantization as Q
>>> x = torch.tensor([[0.39, 51.0], [3.521, 9.41]])
>>> quant_dequant = Q.affine.QuantizeDequantize(shape=(), bitwidth=8, symmetric=False)
>>> quant_dequant.set_range(-10, 41)
>>> x_qdq = quant_dequant(x)
>>> x_qdq
DequantizedTensor([[ 0.4000, 41.0000],
                   [ 3.6000,  9.4000]], grad_fn=<AliasBackward0>)
>>> x_qdq.quantize()
QuantizedTensor([[ 52., 255.],
                 [ 68.,  97.]], grad_fn=<AliasBackward0>)
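The asymmetric case above can be reproduced by hand. The sketch below assumes the usual unsigned affine convention, with scale = (max - min) / 255 and a zero point that shifts the grid so that the minimum maps to 0. The names `quantize_asymmetric`, `dequantize_asymmetric`, and `zero_point` are hypothetical, and the library may store the offset with a different sign convention.

```python
def quantize_asymmetric(x, x_min, x_max, bitwidth=8):
    """Map a real value onto the unsigned integer grid (hypothetical helper)."""
    qmax = 2 ** bitwidth - 1                 # 255 for 8 bits
    scale = (x_max - x_min) / qmax           # 51 / 255 = 0.2 for range (-10, 41)
    zero_point = round(-x_min / scale)       # 50: the grid index of real 0.0
    q = round(x / scale) + zero_point
    return max(0, min(qmax, q))              # clamp to [0, 255]


def dequantize_asymmetric(q, x_min, x_max, bitwidth=8):
    """Map a grid index back to the real value it represents."""
    qmax = 2 ** bitwidth - 1
    scale = (x_max - x_min) / qmax
    zero_point = round(-x_min / scale)
    return (q - zero_point) * scale


values = [0.39, 51.0, 3.521, 9.41]
quantized = [quantize_asymmetric(v, -10, 41) for v in values]
restored = [dequantize_asymmetric(q, -10, 41) for q in quantized]
print(quantized)
print([round(v, 4) for v in restored])
```

Note that 51.0 lies outside the range and is clamped to 255, which is why it dequantizes to 41.0. Re-quantizing the restored values gives the same integers, which is the sense in which the round trip loses no further information.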

quantized_repr()

Returns the quantized representation of self as a torch.Tensor with data type self.encoding.dtype.

Return type: Tensor

Note

The result of this function may not be able to carry a gradient depending on the quantized data type. Thus, it may be necessary to call this only within an autograd function to allow for backpropagation.

Example

>>> import torch
>>> import aimet_torch.v2.quantization as Q
>>> x = torch.tensor([[0.39, 51.0], [3.521, 9.41]])
>>> quant_dequant = Q.affine.QuantizeDequantize(shape=(), bitwidth=8, symmetric=False)
>>> quant_dequant.set_range(-10, 41)
>>> x_qdq = quant_dequant(x)
>>> x_qdq
DequantizedTensor([[ 0.4000, 41.0000],
                   [ 3.6000,  9.4000]], grad_fn=<AliasBackward0>)
>>> x_qdq.quantized_repr()
tensor([[ 52, 255],
        [ 68,  97]], dtype=torch.uint8)