quantization.tensor
Classes
- class aimet_torch.v2.quantization.tensor.QuantizedTensor(*args, **kwargs)[source]
  Represents a quantized tensor object. The object holds quantized values stored in a floating-point tensor, along with an EncodingBase object which holds the information necessary to map the quantized values back to the real/represented values.

  - dequantize()[source]

    Dequantizes self using self.encoding to produce a DequantizedTensor with the same encoding information.

    Return type: DequantizedTensor

    Example:

    >>> import torch
    >>> import aimet_torch.v2.quantization as Q
    >>> x = torch.tensor([[2.57, -2.312],
    ...                   [0.153, 0.205]])
    >>> quantizer = Q.affine.Quantize(shape=(), bitwidth=8, symmetric=True)
    >>> quantizer.set_range(-128 * 0.1, 127 * 0.1)
    >>> x_q = quantizer(x)
    >>> x_q
    QuantizedTensor([[ 26., -23.],
                     [  2.,   2.]], grad_fn=<AliasBackward0>)
    >>> x_dq = x_q.dequantize()
    >>> x_dq
    DequantizedTensor([[ 2.6000, -2.3000],
                       [ 0.2000,  0.2000]], grad_fn=<AliasBackward0>)
    >>> torch.equal(x_dq.encoding.scale, x_q.encoding.scale)
    True
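    Because the returned DequantizedTensor carries the same encoding, the dequantize/quantize round trip is lossless on the quantization grid (see quantize() below). A minimal sketch, continuing the session above:

    >>> x_q2 = x_dq.quantize()
    >>> torch.equal(x_q2.quantized_repr(), x_q.quantized_repr())
    True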
  - quantized_repr()[source]

    Return the quantized representation of self as a torch.Tensor with data type self.encoding.dtype.

    Return type: Tensor

    Note: The result of this function may not be able to carry a gradient depending on the quantized data type. Thus, it may be necessary to call this only within an autograd function to allow for backpropagation.

    Example:

    >>> import torch
    >>> from aimet_torch.v2 import quantization as Q
    >>> quantizer = Q.affine.Quantize(shape=(2, 1), bitwidth=8, symmetric=True)
    >>> x = torch.randn((2, 4), requires_grad=True)
    >>> with quantizer.compute_encodings():
    ...     x_q = quantizer(x)
    >>> x_q
    QuantizedTensor([[  11.,  -57., -128.,   38.],
                     [  28.,   -0., -128.,  -40.]], grad_fn=<AliasBackward0>)
    >>> x_q.quantized_repr()
    tensor([[  11,  -57, -128,   38],
            [  28,    0, -128,  -40]], dtype=torch.int8)
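    To make the note above concrete: the QuantizedTensor itself is float-backed and stays on the autograd tape, while the int8 tensor returned by quantized_repr() cannot carry a gradient. A minimal sketch, continuing the session above:

    >>> x_q.grad_fn is not None                # float-backed, still tracks gradients
    True
    >>> x_q.quantized_repr().grad_fn is None   # int8 result is detached from autograd
    True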
- class aimet_torch.v2.quantization.tensor.DequantizedTensor(*args, **kwargs)[source]
  Represents a tensor which has been quantized and subsequently dequantized. This object contains real floating-point data as well as an EncodingBase object which holds information about the quantization parameters with which the data was quantized. With this, a DequantizedTensor can be converted back to its quantized representation without further loss of information.

  - quantize()[source]

    Quantizes self using self.encoding to produce a QuantizedTensor with the same encoding information.

    Return type: QuantizedTensor

    Example:

    >>> import torch
    >>> import aimet_torch.v2.quantization as Q
    >>> x = torch.tensor([[0.39, 51.0], [3.521, 9.41]])
    >>> quant_dequant = Q.affine.QuantizeDequantize(shape=(), bitwidth=8, symmetric=False)
    >>> quant_dequant.set_range(-10, 41)
    >>> x_qdq = quant_dequant(x)
    >>> x_qdq
    DequantizedTensor([[ 0.4000, 41.0000],
                       [ 3.6000,  9.4000]], grad_fn=<AliasBackward0>)
    >>> x_qdq.quantize()
    QuantizedTensor([[ 52., 255.],
                     [ 68.,  97.]], grad_fn=<AliasBackward0>)
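    For intuition on the values above: set_range(-10, 41) on an asymmetric 8-bit grid implies scale = (41 - (-10)) / 255 = 0.2 and offset = -10 / 0.2 = -50. A minimal sketch in plain torch that reproduces the quantized values under that assumption (the scale and offset here are derived by hand, not read from the API):

    >>> import torch
    >>> scale, offset = 0.2, -50.0
    >>> x = torch.tensor([[0.39, 51.0], [3.521, 9.41]])
    >>> torch.clamp(torch.round(x / scale) - offset, 0, 255)
    tensor([[ 52., 255.],
            [ 68.,  97.]])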
  - quantized_repr()[source]

    Return the quantized representation of self as a torch.Tensor with data type self.encoding.dtype.

    Return type: Tensor

    Note: The result of this function may not be able to carry a gradient depending on the quantized data type. Thus, it may be necessary to call this only within an autograd function to allow for backpropagation.

    Example:

    >>> import torch
    >>> import aimet_torch.v2.quantization as Q
    >>> x = torch.tensor([[0.39, 51.0], [3.521, 9.41]])
    >>> quant_dequant = Q.affine.QuantizeDequantize(shape=(), bitwidth=8, symmetric=False)
    >>> quant_dequant.set_range(-10, 41)
    >>> x_qdq = quant_dequant(x)
    >>> x_qdq
    DequantizedTensor([[ 0.4000, 41.0000],
                       [ 3.6000,  9.4000]], grad_fn=<AliasBackward0>)
    >>> x_qdq.quantized_repr()
    tensor([[ 52, 255],
            [ 68,  97]], dtype=torch.uint8)
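    Comparing this output with the quantize() example above, quantized_repr() yields the same integer values, only as a plain torch.Tensor in self.encoding.dtype rather than a float-backed QuantizedTensor. A minimal check, continuing the session above:

    >>> torch.equal(x_qdq.quantized_repr(), x_qdq.quantize().quantized_repr())
    True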