quantization.tensor
Classes
- class aimet_torch.v2.quantization.tensor.QuantizedTensor(*args, **kwargs)[source]
Represents a quantized tensor object. The object holds quantized values stored in a floating-point tensor, along with an `EncodingBase` object which holds the information necessary to map the quantized values back to the real/represented values.
- dequantize()[source]
Dequantizes `self` using `self.encoding` to produce a `DequantizedTensor` with the same encoding information.
Return type: `DequantizedTensor`
Example:
```python
>>> import torch
>>> import aimet_torch.v2.quantization as Q
>>> x = torch.tensor([[2.57, -2.312],
...                   [0.153, 0.205]])
>>> quantizer = Q.affine.Quantize(shape=(), bitwidth=8, symmetric=True)
>>> quantizer.set_range(-128 * 0.1, 127 * 0.1)
>>> x_q = quantizer(x)
>>> x_q
QuantizedTensor([[ 26., -23.],
                 [  2.,   2.]], grad_fn=<AliasBackward0>)
>>> x_dq = x_q.dequantize()
>>> x_dq
DequantizedTensor([[ 2.6000, -2.3000],
                   [ 0.2000,  0.2000]], grad_fn=<AliasBackward0>)
>>> torch.equal(x_dq.encoding.scale, x_q.encoding.scale)
True
```
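For orientation, the numbers in the example above can be reproduced with plain torch ops. This is a minimal sketch of the standard symmetric affine relation (q = clamp(round(x / scale), -128, 127) and x_dq = q * scale), inferred from the example's output rather than taken from AIMET's internals; the scale of 0.1 follows from the `set_range(-128 * 0.1, 127 * 0.1)` call.

```python
import torch

# Assumed symmetric affine mapping (sketch, not AIMET's implementation):
#   q    = clamp(round(x / scale), -128, 127)
#   x_dq = q * scale
scale = torch.tensor(0.1)  # implied by quantizer.set_range(-128 * 0.1, 127 * 0.1)
x = torch.tensor([[2.57, -2.312], [0.153, 0.205]])

q = torch.clamp(torch.round(x / scale), -128, 127)
x_dq = q * scale
print(q)     # tensor([[ 26., -23.], [  2.,   2.]])
print(x_dq)  # tensor([[ 2.6000, -2.3000], [ 0.2000,  0.2000]])
```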
- quantized_repr()[source]
Return the quantized representation of `self` as a `torch.Tensor` with data type `self.encoding.dtype`.
Return type: `Tensor`
Note
Depending on the quantized data type, the result of this function may not be able to carry a gradient. It may therefore be necessary to call this method only within an autograd function to allow for backpropagation.
Example
```python
>>> import torch
>>> from aimet_torch.v2 import quantization as Q
>>> quantizer = Q.affine.Quantize(shape=(2, 1), bitwidth=8, symmetric=True)
>>> x = torch.randn((2, 4), requires_grad=True)
>>> with quantizer.compute_encodings():
...     x_q = quantizer(x)
>>> x_q
QuantizedTensor([[  11.,  -57., -128.,   38.],
                 [  28.,   -0., -128.,  -40.]], grad_fn=<AliasBackward0>)
>>> x_q.quantized_repr()
tensor([[  11,  -57, -128,   38],
        [  28,    0, -128,  -40]], dtype=torch.int8)
```
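If gradients do need to flow across the integer conversion, the note above suggests performing it inside an autograd function. Below is a minimal sketch of that pattern using a straight-through estimator; `IntReprFunction` is a hypothetical name, not an AIMET API, and its forward pass returns a float tensor because autograd cannot track integer outputs.

```python
import torch

class IntReprFunction(torch.autograd.Function):
    """Hypothetical wrapper (not an AIMET API): operate on the integer
    representation in forward() and pass gradients straight through."""

    @staticmethod
    def forward(ctx, x_q):
        # x_q is a QuantizedTensor; quantized_repr() yields e.g. an int8 tensor.
        int_repr = x_q.quantized_repr()
        # Return a float tensor so autograd can keep tracking the result.
        return int_repr.float()

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: treat the integer round-trip as identity.
        return grad_output
```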
- class aimet_torch.v2.quantization.tensor.DequantizedTensor(*args, **kwargs)[source]
Represents a tensor which has been quantized and subsequently dequantized. This object contains real floating-point data as well as an `EncodingBase` object which holds information about the quantization parameters with which the data was quantized. With this, a `DequantizedTensor` can be converted back to its quantized representation without further loss of information.
- quantize()[source]
Quantizes `self` using `self.encoding` to produce a `QuantizedTensor` with the same encoding information.
Return type: `QuantizedTensor`
Example:
```python
>>> import torch
>>> import aimet_torch.v2.quantization as Q
>>> x = torch.tensor([[0.39, 51.0], [3.521, 9.41]])
>>> quant_dequant = Q.affine.QuantizeDequantize(shape=(), bitwidth=8, symmetric=False)
>>> quant_dequant.set_range(-10, 41)
>>> x_qdq = quant_dequant(x)
>>> x_qdq
DequantizedTensor([[ 0.4000, 41.0000],
                   [ 3.6000,  9.4000]], grad_fn=<AliasBackward0>)
>>> x_qdq.quantize()
QuantizedTensor([[ 52., 255.],
                 [ 68.,  97.]], grad_fn=<AliasBackward0>)
```
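The output values follow from the asymmetric affine parameters implied by `set_range(-10, 41)` at 8 bits: scale = (41 - (-10)) / 255 = 0.2 and a zero point of 50, so q = clamp(round(x / scale) + 50, 0, 255). The sketch below reproduces the example by hand under that assumed formula; it is not AIMET's internal code. Note how 51.0 lies outside the calibrated range and saturates at 255.

```python
import torch

# Assumed asymmetric affine mapping (sketch, not AIMET's implementation):
#   scale      = (max - min) / (2**bitwidth - 1) = (41 - (-10)) / 255 = 0.2
#   zero_point = round(-min / scale)             = 50
scale, zero_point = 51 / 255, 50
x = torch.tensor([[0.39, 51.0], [3.521, 9.41]])

q = torch.clamp(torch.round(x / scale) + zero_point, 0, 255)
print(q)                         # tensor([[ 52., 255.], [ 68.,  97.]])
print((q - zero_point) * scale)  # tensor([[ 0.4000, 41.0000], [ 3.6000,  9.4000]])
```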
- quantized_repr()[source]
Return the quantized representation of `self` as a `torch.Tensor` with data type `self.encoding.dtype`.
Return type: `Tensor`
Note
Depending on the quantized data type, the result of this function may not be able to carry a gradient. It may therefore be necessary to call this method only within an autograd function to allow for backpropagation.
Example
```python
>>> import torch
>>> import aimet_torch.v2.quantization as Q
>>> x = torch.tensor([[0.39, 51.0], [3.521, 9.41]])
>>> quant_dequant = Q.affine.QuantizeDequantize(shape=(), bitwidth=8, symmetric=False)
>>> quant_dequant.set_range(-10, 41)
>>> x_qdq = quant_dequant(x)
>>> x_qdq
DequantizedTensor([[ 0.4000, 41.0000],
                   [ 3.6000,  9.4000]], grad_fn=<AliasBackward0>)
>>> x_qdq.quantized_repr()
tensor([[ 52, 255],
        [ 68,  97]], dtype=torch.uint8)
```
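A typical use of `quantized_repr()` is exporting compact integer data together with the parameters needed to reconstruct real values later. The sketch below assumes the affine encoding exposes `scale` and `offset` attributes and that values dequantize as scale * (q + offset); both assumptions should be checked against the AIMET version in use, and `save_quantized`/`load_dequantized` are hypothetical helpers, not AIMET APIs.

```python
import torch

def save_quantized(x_q, path):
    # Hypothetical helper: persist the integer data plus encoding parameters.
    torch.save({
        "data": x_q.quantized_repr(),   # compact integer storage, e.g. torch.uint8
        "scale": x_q.encoding.scale,    # assumes an affine encoding with .scale
        "offset": x_q.encoding.offset,  # ... and .offset (verify per AIMET version)
    }, path)

def load_dequantized(path):
    # Hypothetical helper: rebuild real values under the assumed affine
    # relation x = scale * (q + offset); verify the offset sign convention.
    blob = torch.load(path)
    return blob["scale"] * (blob["data"].float() + blob["offset"])
```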