QuantizedTensor
- class aimet_torch.quantization.QuantizedTensor(*args, **kwargs)
Represents a quantized tensor object. The object holds quantized values stored in a floating-point tensor along with an EncodingBase object, which holds the information necessary to map the quantized values back to the real/represented values.

- dequantize()

Dequantizes self using self.encoding to produce a DequantizedTensor with the same encoding information.

Return type: DequantizedTensor

Example:
>>> import torch
>>> from aimet_torch.v2 import quantization as Q
>>> x = torch.tensor([[2.57, -2.312],
...                   [0.153, 0.205]])
>>> quantizer = Q.affine.Quantize(shape=(), bitwidth=8, symmetric=True)
>>> quantizer.set_range(-128 * 0.1, 127 * 0.1)
>>> x_q = quantizer(x)
>>> x_q
QuantizedTensor([[ 26., -23.],
                 [  2.,   2.]], grad_fn=<AliasBackward0>)
>>> x_dq = x_q.dequantize()
>>> x_dq
DequantizedTensor([[ 2.6000, -2.3000],
                   [ 0.2000,  0.2000]], grad_fn=<AliasBackward0>)
>>> torch.equal(x_dq.encoding.scale, x_q.encoding.scale)
True
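For a symmetric encoding like the one above (zero offset), dequantization amounts to multiplying the stored integer-grid values by the encoding scale. The continuation below is a minimal sketch of that relationship under the zero-offset assumption; the as_subclass calls are plain torch.Tensor methods used here only to strip the quantized wrappers and are not part of the AIMET API:

>>> raw = x_q.as_subclass(torch.Tensor)    # integer grid values as a plain float tensor
>>> dq = x_dq.as_subclass(torch.Tensor)    # real values as a plain float tensor
>>> torch.allclose(dq, raw * x_q.encoding.scale)
True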
- quantized_repr()
Return the quantized representation of self as a torch.Tensor with data type self.encoding.dtype.

Return type: Tensor
Note
The result of this function may not be able to carry a gradient, depending on the quantized data type. Thus, it may be necessary to call this only within an autograd function to allow for backpropagation; a sketch of that pattern follows the example below.
Example
>>> import torch
>>> from aimet_torch.v2 import quantization as Q
>>> quantizer = Q.affine.Quantize(shape=(2, 1), bitwidth=8, symmetric=True)
>>> x = torch.randn((2, 4), requires_grad=True)
>>> with quantizer.compute_encodings():
...     x_q = quantizer(x)
>>> x_q
QuantizedTensor([[  11.,  -57., -128.,   38.],
                 [  28.,   -0., -128.,  -40.]], grad_fn=<AliasBackward0>)
>>> x_q.quantized_repr()
tensor([[  11,  -57, -128,   38],
        [  28,    0, -128,  -40]], dtype=torch.int8)
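As the note above indicates, one way to keep backpropagation working while using the integer representation is to confine the quantized_repr() call to a custom autograd function that returns a floating-point tensor and defines its own backward pass. The sketch below is illustrative only: the class IntRepresentationOp, its straight-through backward, and its mapping back to float via the encoding scale (a zero-offset/symmetric assumption) are not part of the AIMET API.

import torch
from aimet_torch.v2 import quantization as Q

class IntRepresentationOp(torch.autograd.Function):
    """Illustrative sketch: use the integer representation inside forward,
    return a float tensor so the graph stays differentiable, and treat the
    round trip as identity in backward (straight-through estimator)."""

    @staticmethod
    def forward(ctx, x_q):
        int_repr = x_q.quantized_repr()   # e.g. torch.int8; cannot carry a gradient itself
        # ... integer-domain work would go here ...
        # Map back to real values (zero-offset assumption) for the float graph.
        return int_repr.to(x_q.dtype) * x_q.encoding.scale

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: the gradient passes through unchanged.
        return grad_output

quantizer = Q.affine.Quantize(shape=(2, 1), bitwidth=8, symmetric=True)
x = torch.randn((2, 4), requires_grad=True)
with quantizer.compute_encodings():
    x_q = quantizer(x)

y = IntRepresentationOp.apply(x_q)
y.sum().backward()    # gradients reach x through the straight-through path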