Encoding Analyzers
- class aimet_torch.v2.quantization.encoding_analyzer.EncodingAnalyzer(observer)
Base class that gathers statistics of input data and computes encodings
- compute_encodings(num_steps, is_symmetric)
Computes encodings from the observed input data and the calibration scheme, and returns the encoding minimum and maximum values
- Parameters:
num_steps (int) – Number of steps used in quantization.
is_symmetric (bool) – True if encodings are symmetric
- Returns:
Encoding min and max as a tuple
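To illustrate what the returned pair is typically used for, here is a minimal sketch of uniform quantization over a range [min, max] divided into `num_steps` steps. The function `quantize_dequantize` is a hypothetical helper written for this illustration, not part of aimet_torch, and it covers only the asymmetric case.

```python
def quantize_dequantize(x, enc_min, enc_max, num_steps):
    """Snap x to the nearest point of a uniform grid over [enc_min, enc_max].

    Hypothetical helper for illustration only; not an aimet_torch function.
    """
    if enc_max <= enc_min:
        raise ValueError("enc_max must be greater than enc_min")
    # Width of one quantization step.
    step = (enc_max - enc_min) / num_steps
    # Nearest grid index, clamped so out-of-range inputs are clipped.
    level = round((x - enc_min) / step)
    level = min(max(level, 0), num_steps)
    return enc_min + level * step


print(quantize_dequantize(0.3, -1.0, 1.0, 4))
print(quantize_dequantize(5.0, -1.0, 1.0, 4))
```

With four steps over [-1, 1] the grid spacing is 0.5, so in-range inputs move to the nearest grid point and inputs outside the range are clipped to the nearest bound.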
Variants
- class aimet_torch.v2.quantization.encoding_analyzer.MinMaxEncodingAnalyzer(shape)
EncodingAnalyzer subclass which uses min-max calibration. This involves tracking the minimum and maximum observed values and computing the min-max range as \([min(input), max(input)]\)
- Parameters:
shape (tuple) – Shape of calculated encoding
Example
>>> import math
>>> import torch
>>> from aimet_torch.v2.quantization.encoding_analyzer import MinMaxEncodingAnalyzer
>>> encoding_analyzer = MinMaxEncodingAnalyzer(shape=(1,))
>>> encoding_analyzer.update_stats(torch.randn(100))
>>> encoding_analyzer.compute_encodings(num_steps=math.pow(2, 8), is_symmetric=False)
(tensor([-2.0991]), tensor([2.3696]))
>>> encoding_analyzer.reset_stats()
>>> encoding_analyzer.update_stats(torch.randn(100))
_MinMaxRange(min=tensor([-2.1721]), max=tensor([2.2592]))
>>> encoding_analyzer.compute_encodings(num_steps=math.pow(2, 8), is_symmetric=False)
(tensor([-2.1721]), tensor([2.2592]))
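The idea behind min-max calibration can be sketched in plain Python. The class `RunningMinMax` below is a hypothetical stand-in that mirrors the update/compute/reset cycle of the example; it is not the aimet_torch implementation. Its handling of `is_symmetric` (mirroring the larger absolute bound around zero) is an assumption made for illustration.

```python
class RunningMinMax:
    """Hypothetical min-max tracker for illustration; not the aimet_torch class."""

    def __init__(self):
        self.reset_stats()

    def reset_stats(self):
        # Forget everything observed so far.
        self.min = None
        self.max = None

    def update_stats(self, batch):
        # Fold the extremes of this batch into the running extremes.
        batch_min, batch_max = min(batch), max(batch)
        self.min = batch_min if self.min is None else min(self.min, batch_min)
        self.max = batch_max if self.max is None else max(self.max, batch_max)

    def compute_encodings(self, is_symmetric=False):
        if self.min is None:
            raise RuntimeError("no statistics collected; call update_stats first")
        if is_symmetric:
            # Assumption: a symmetric range mirrors the larger absolute bound.
            bound = max(abs(self.min), abs(self.max))
            return -bound, bound
        return self.min, self.max


analyzer = RunningMinMax()
analyzer.update_stats([0.5, -1.0])
analyzer.update_stats([2.0, 0.1])
print(analyzer.compute_encodings())
print(analyzer.compute_encodings(is_symmetric=True))
```

Statistics accumulate across calls to `update_stats` until `reset_stats` is called, which is why the example above resets before observing a new batch.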
- class aimet_torch.v2.quantization.encoding_analyzer.SqnrEncodingAnalyzer(shape, num_bins=2048, *, asymmetric_delta_candidates=17, symmetric_delta_candidates=101, offset_candidates=21, max_parallelism=64, gamma=3.0)
EncodingAnalyzer subclass which uses SQNR (signal-to-quantization-noise ratio) calibration. This involves recording values in a histogram and computing the min-max range that produces the highest expected SQNR, that is, the lowest expected quantization noise.
- Parameters:
shape (tuple) – Shape of calculated encoding
num_bins (int) – Number of bins used to create the histogram
asymmetric_delta_candidates (int) – Number of delta values to search over in asymmetric mode
symmetric_delta_candidates (int) – Number of delta values to search over in symmetric mode
offset_candidates (int) – Number of offset values to search over in asymmetric mode
max_parallelism (int) – Maximum number of encodings to process in parallel (higher number results in higher memory usage but faster computation)
gamma (float) – Weighting factor on clipping noise (higher value results in less clipping noise)
Example
>>> import math
>>> import torch
>>> from aimet_torch.v2.quantization.encoding_analyzer import SqnrEncodingAnalyzer
>>> encoding_analyzer = SqnrEncodingAnalyzer(shape=(1,), num_bins=10, gamma=1)
>>> encoding_analyzer.update_stats(torch.randn(100))
>>> encoding_analyzer.compute_encodings(num_steps=math.pow(2, 8), is_symmetric=False)
(tensor([-2.3612]), tensor([2.8497]))
>>> encoding_analyzer.reset_stats()
>>> encoding_analyzer.update_stats(torch.randn(100))
[_Histogram(histogram=tensor([ 2., 0., 8., 8., 16., 22., 23., 12., 6., 3.]),
            bin_edges=tensor([-2.8907, -2.3625, -1.8343, -1.3061, -0.7779, -0.2497, 0.2784, 0.8066, 1.3348, 1.8630, 2.3912]),
            min=tensor(-2.8907), max=tensor(2.3912))]
>>> encoding_analyzer.compute_encodings(num_steps=math.pow(2, 8), is_symmetric=False)
(tensor([-2.7080]), tensor([2.2438]))
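The search described above can be sketched as a simplified grid search in plain Python. This is an illustration of the general idea only: it works on raw samples rather than a histogram, tries ranges shrunk around the midpoint of the observed range instead of separate delta and offset candidates, and the names `weighted_noise` and `search_range` are hypothetical. It is not the aimet_torch algorithm.

```python
def weighted_noise(values, lo, hi, num_steps, gamma):
    """Sum of squared quantization errors, with clipped values weighted by gamma."""
    step = (hi - lo) / num_steps
    noise = 0.0
    for v in values:
        clipped = min(max(v, lo), hi)
        level = round((clipped - lo) / step)
        reconstructed = lo + level * step
        error = (v - reconstructed) ** 2
        # Values outside [lo, hi] contribute clipping noise, scaled by gamma.
        noise += gamma * error if v != clipped else error
    return noise


def search_range(values, num_steps, num_candidates=20, gamma=3.0):
    """Pick the candidate range with the lowest weighted noise.

    For a fixed signal, the lowest noise corresponds to the highest SQNR.
    """
    obs_min, obs_max = min(values), max(values)
    if obs_min == obs_max:
        return obs_min, obs_max
    mid = (obs_min + obs_max) / 2
    half = (obs_max - obs_min) / 2
    # Start from the full observed range, then try progressively narrower ones.
    best = (obs_min, obs_max)
    best_noise = weighted_noise(values, obs_min, obs_max, num_steps, gamma)
    for i in range(1, num_candidates):
        shrink = 1 - i / num_candidates
        lo, hi = mid - half * shrink, mid + half * shrink
        noise = weighted_noise(values, lo, hi, num_steps, gamma)
        if noise < best_noise:
            best, best_noise = (lo, hi), noise
    return best


samples = [-1.0, -0.5, 0.0, 0.5, 1.0, 10.0]
print(search_range(samples, num_steps=4, gamma=1.0))
print(search_range(samples, num_steps=255, gamma=1e9))
```

In this sketch a very large `gamma` makes any clipping prohibitively expensive, so the full observed range is kept, which matches the description of `gamma` as a weighting factor on clipping noise.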
- class aimet_torch.v2.quantization.encoding_analyzer.PercentileEncodingAnalyzer(shape, num_bins=2048, percentile=100)
EncodingAnalyzer subclass which uses percentile calibration. This involves recording values in a histogram and computing the min-max range given a percentile value \(p\). The range is computed after clipping (100 - \(p\))% of the largest and smallest observed values.
- Parameters:
shape (tuple) – Shape of calculated encoding
num_bins (int) – Number of bins used to create the histogram
percentile (float) – Percentile value which is used to clip values
Example
>>> import math
>>> import torch
>>> from aimet_torch.v2.quantization.encoding_analyzer import PercentileEncodingAnalyzer
>>> encoding_analyzer = PercentileEncodingAnalyzer(shape=(1,), num_bins=10, percentile=80)
>>> encoding_analyzer.update_stats(torch.randn(100))
>>> encoding_analyzer.compute_encodings(num_steps=math.pow(2, 8), is_symmetric=False)
(tensor([-1.1188]), tensor([0.3368]))
>>> encoding_analyzer.reset_stats()
>>> encoding_analyzer.update_stats(torch.randn(100))
[_Histogram(histogram=tensor([ 1., 1., 8., 13., 19., 27., 16., 10., 3., 2.]),
            bin_edges=tensor([-2.5710, -2.0989, -1.6269, -1.1548, -0.6827, -0.2106, 0.2614, 0.7335, 1.2056, 1.6776, 2.1497]),
            min=tensor(-2.5710), max=tensor(2.1497))]
>>> encoding_analyzer.compute_encodings(num_steps=math.pow(2, 8), is_symmetric=False)
(tensor([-1.1548]), tensor([0.2614]))
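One plausible reading of percentile clipping on a histogram is sketched below in plain Python. The function `percentile_range` is hypothetical, and the rule it uses (take the lower edge of the first bin whose cumulative count reaches the target fraction) is an assumption. It reproduces the range returned for the histogram printed in the example above with percentile 80, but it has not been checked against the aimet_torch source or against other inputs.

```python
def percentile_range(counts, bin_edges, percentile):
    """Return (min, max) bin edges after clipping both tails of a histogram.

    Hypothetical helper for illustration only; not an aimet_torch function.
    """
    if len(bin_edges) != len(counts) + 1:
        raise ValueError("expected one more bin edge than bins")
    total = sum(counts)

    def first_bin_reaching(target_percent):
        # Index of the first bin whose cumulative count reaches the target.
        running = 0
        for index, count in enumerate(counts):
            running += count
            if running * 100 >= target_percent * total:
                return index
        return len(counts) - 1

    lo_index = first_bin_reaching(100 - percentile)
    hi_index = first_bin_reaching(percentile)
    # Assumption: both bounds are read from the lower edge of the selected bin.
    return bin_edges[lo_index], bin_edges[hi_index]


# Histogram and bin edges copied from the example output above.
counts = [1, 1, 8, 13, 19, 27, 16, 10, 3, 2]
bin_edges = [-2.5710, -2.0989, -1.6269, -1.1548, -0.6827, -0.2106,
             0.2614, 0.7335, 1.2056, 1.6776, 2.1497]
print(percentile_range(counts, bin_edges, 80))  # (-1.1548, 0.2614)
```

Because the bounds snap to bin edges, the resolution of the result depends on `num_bins`; the example uses only 10 bins to keep the output readable.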