aimet_tensorflow.quant_analyzer
Top level APIs
- class aimet_tensorflow.keras.quant_analyzer.QuantAnalyzer(model, forward_pass_callback, eval_callback)[source]
QuantAnalyzer tool provides
- model sensitivity to weight and activation quantization,
- per layer sensitivity analysis,
- per layer encoding (min-max range),
- per layer statistics histogram (PDF) analysis, and
- per layer MSE analysis.
- Parameters:
  - model (Model) – FP32 model to analyze for quantization.
  - forward_pass_callback (CallbackFunc) – A callback function for model calibration that simply runs forward passes on the model to compute encodings (delta/offset). This callback should use representative data, typically a subset of the entire train/validation dataset (~1000 images/samples).
  - eval_callback (CallbackFunc) – A callback function for model evaluation that determines model performance. It is expected to return a scalar value representing the model performance evaluated against the entire test/evaluation dataset.
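A minimal construction sketch follows. The helper build_model and the calib_dataset/eval_dataset pipelines are hypothetical stand-ins for your own model and data; the CallbackFunc import path and signature are as used elsewhere in the AIMET docs.

    import tensorflow as tf
    from aimet_common.utils import CallbackFunc
    from aimet_tensorflow.keras.quant_analyzer import QuantAnalyzer

    def forward_pass(model: tf.keras.Model, dataset: tf.data.Dataset):
        # Calibration: run forward passes on ~1000 representative samples.
        for inputs, _ in dataset:  # assumes (inputs, labels) batches
            model(inputs)

    def evaluate(model: tf.keras.Model, dataset: tf.data.Dataset) -> float:
        # Evaluation: return a scalar metric over the entire eval dataset.
        metric = tf.keras.metrics.SparseCategoricalAccuracy()
        for inputs, labels in dataset:
            metric.update_state(labels, model(inputs))
        return float(metric.result())

    model = build_model()  # hypothetical: returns an FP32 tf.keras.Model
    forward_pass_callback = CallbackFunc(forward_pass, func_callback_args=calib_dataset)
    eval_callback = CallbackFunc(evaluate, func_callback_args=eval_dataset)
    quant_analyzer = QuantAnalyzer(model, forward_pass_callback, eval_callback)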
- analyze(quant_scheme=QuantScheme.post_training_tf_enhanced, rounding_mode='nearest', default_param_bw=8, default_output_bw=8, config_file=None, results_dir='./tmp/')[source]
- Analyze the model for quantization and point out sensitive parts/hotspots by performing
  - model sensitivity analysis to quantization,
  - per layer sensitivity analysis by enabling and disabling quant wrappers,
  - export of per layer encoding min-max ranges,
  - export of per layer statistics histograms (PDF) when the quant scheme is TF-Enhanced, and
  - per layer MSE analysis.
- Parameters:
  - quant_scheme (QuantScheme) – Quantization scheme. Supported values are QuantScheme.post_training_tf and QuantScheme.post_training_tf_enhanced.
  - rounding_mode (str) – The rounding scheme to use. One of 'nearest' or 'stochastic'; defaults to 'nearest'.
  - default_param_bw (int) – Default bitwidth (4-31) to use for quantizing layer parameters.
  - default_output_bw (int) – Default bitwidth (4-31) to use for quantizing layer inputs and outputs.
  - config_file (Optional[str]) – Path to configuration file for model quantizers.
  - results_dir (str) – Directory to save the results.
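A minimal invocation sketch, reusing the quant_analyzer instance from the construction sketch above; the results directory name is an arbitrary choice.

    from aimet_common.defs import QuantScheme

    quant_analyzer.analyze(quant_scheme=QuantScheme.post_training_tf_enhanced,
                           rounding_mode='nearest',
                           default_param_bw=8,
                           default_output_bw=8,
                           config_file=None,
                           results_dir='./quant_analyzer_results/')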
- check_model_sensitivity_to_quantization(sim, default_param_bw, default_output_bw)[source]
Perform sensitivity analysis for weight and activation quantization individually.
- Parameters:
  - sim (QuantizationSimModel) – Quantsim model.
  - default_param_bw (int) – Default bitwidth (4-31) to use for quantizing layer parameters.
  - default_output_bw (int) – Default bitwidth (4-31) to use for quantizing layer inputs and outputs.
- Returns:
FP32 eval score, weight-quantized eval score, act-quantized eval score.
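analyze() drives this check internally, but it can also be called directly with a hand-built quantsim. A sketch, assuming the QuantizationSimModel constructor and compute_encodings signature from the Keras quantsim docs; see those docs for the authoritative arguments.

    from aimet_tensorflow.keras.quantsim import QuantizationSimModel

    sim = QuantizationSimModel(model, quant_scheme='tf_enhanced',
                               default_param_bw=8, default_output_bw=8)
    sim.compute_encodings(forward_pass, calib_dataset)

    fp32_score, weight_q_score, act_q_score = \
        quant_analyzer.check_model_sensitivity_to_quantization(sim, 8, 8)
    # A large drop from fp32_score in one of the two quantized scores points
    # to weight or activation quantization as the bottleneck, respectively.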
- enable_per_layer_mse_loss(unlabeled_dataset, num_batches)[source]
Enable per layer MSE loss analysis.
- Parameters:
  - unlabeled_dataset (DatasetV2) – tf.data.Dataset provided as input to the model and used to calculate MSE loss.
  - num_batches (int) – Maximum number of batches to be used for MSE loss calculation.
- Return type:
None
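A sketch of enabling MSE-loss analysis before running analyze(); unlabeled_dataset is assumed to be a tf.data.Dataset yielding model inputs only.

    # Enable per layer MSE collection so analyze() also reports MSE loss.
    quant_analyzer.enable_per_layer_mse_loss(unlabeled_dataset, num_batches=4)
    quant_analyzer.analyze(results_dir='./quant_analyzer_results/')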
- export_per_layer_encoding_min_max_range(sim, results_dir)[source]
Export encoding min and max range for all weights and activations. After invoking this API, results_dir should contain html files in the following format:

    results_dir
        activations.html
        weights.html

If per channel quantization (PCQ) is enabled:

    results_dir
        activations.html
        {wrapped_module_name}_{param_name}.html
- Parameters:
  - sim (QuantizationSimModel) – Quantsim model.
  - results_dir (str) – Directory to save the results.
- Return type:
  Tuple[Dict, Dict]
- Returns:
  layer-wise min-max range for weights and activations.
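A retrieval sketch; since the order and key layout of the two returned dicts are not pinned down here, the code iterates them generically rather than naming them.

    min_max_ranges = quant_analyzer.export_per_layer_encoding_min_max_range(
        sim, results_dir='./quant_analyzer_results/')
    # A tuple of two dicts, one for weights and one for activations.
    for ranges in min_max_ranges:
        for name, min_max in ranges.items():
            print(name, min_max)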
- export_per_layer_mse_loss(sim, results_dir)[source]
NOTE: The same model input data must be passed through both the FP32 model and the quantsim model to tap the output activations of each layer.

Export MSE loss between FP32 and quantized output activations for each layer.

- Parameters:
  - sim (QuantizationSimModel) – Quantsim model.
  - results_dir (str) – Directory to save the results.
- Return type:
  Dict[str, float]
- Returns:
  layer-wise MSE loss. dict[layer_name] = MSE loss.
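A sketch that ranks layers by quantization error, assuming enable_per_layer_mse_loss() was called beforehand so the analyzer has a dataset to replay.

    mse_per_layer = quant_analyzer.export_per_layer_mse_loss(
        sim, results_dir='./quant_analyzer_results/')
    # Layers with the highest MSE are the likeliest quantization hotspots.
    for layer_name, mse in sorted(mse_per_layer.items(),
                                  key=lambda kv: kv[1], reverse=True)[:10]:
        print(f'{layer_name}: {mse:.6f}')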
- export_per_layer_stats_histogram(sim, results_dir)[source]
NOTE: Invoke this API only when the quantization scheme is TF-Enhanced.

Export a histogram that represents the PDF of statistics collected by each quantizer of every quant wrapper. After invoking this API, results_dir should contain html files in the following format:

    results_dir
        activations_pdf
            name_{input/output}_{index}.html
        weights_pdf
            name
                param_name_{channel_index}.html
- Parameters:
  - sim (QuantizationSimModel) – Quantsim model.
  - results_dir (str) – Directory to save the results.
- Return type:
None
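A guarded-call sketch mirroring the note above: the histograms exist only when the sim was built with the TF-Enhanced scheme, which collects the underlying statistics.

    # Valid only for the TF-Enhanced quant scheme.
    quant_analyzer.export_per_layer_stats_histogram(
        sim, results_dir='./quant_analyzer_results/')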
- perform_per_layer_analysis_by_disabling_quant_wrappers(sim, results_dir)[source]
NOTE: Option 2

All quant wrappers' parameters and activations quantizers are enabled as per the JSON config file and set to the specified bitwidth.

For every quant wrapper, in order of occurrence:
  1. Disable the quant wrapper's parameters and activations quantizers.
  2. Measure and record the eval score on a subset of the dataset.
  3. Re-enable the quantizers disabled in step 1.

Returns a dictionary containing each quant wrapper name and its corresponding eval score.
- Parameters:
  - sim (QuantizationSimModel) – Quantsim model.
  - results_dir (str) – Directory to save the results.
- Return type:
  Dict[str, float]
- Returns:
  layer-wise eval score dictionary. dict[layer_name] = eval_score
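A sketch of running Option 2 and reading the result; the interpretation in the comment is an assumption about how the scores are typically read, not part of the API contract.

    layer_wise_eval_score = \
        quant_analyzer.perform_per_layer_analysis_by_disabling_quant_wrappers(
            sim, results_dir='./quant_analyzer_results/')
    # A layer whose score recovers sharply when its quantizers are disabled
    # is likely sensitive to quantization.
    for layer_name, score in layer_wise_eval_score.items():
        print(layer_name, score)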
- perform_per_layer_analysis_by_enabling_quant_wrappers(sim, results_dir)[source]
NOTE: Option 1

All quant wrappers' parameters and activations quantizers are disabled.

For every quant wrapper, in order of occurrence:
  1. Enable the quant wrapper's parameters and activations quantizers as per the JSON config file and set them to the specified bitwidth.
  2. Measure and record the eval score on a subset of the dataset.
  3. Disable the quantizers enabled in step 1.

Returns a dictionary containing each quant wrapper name and its corresponding eval score.
- Parameters:
  - sim (QuantizationSimModel) – Quantsim model.
  - results_dir (str) – Directory to save the results.
- Return type:
  Dict[str, float]
- Returns:
  layer-wise eval score dictionary. dict[layer_name] = eval_score
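The complementary sketch for Option 1; as above, the reading of the scores is an assumption.

    layer_wise_eval_score = \
        quant_analyzer.perform_per_layer_analysis_by_enabling_quant_wrappers(
            sim, results_dir='./quant_analyzer_results/')
    # A layer whose score drops sharply when it alone is quantized is likely
    # sensitive to quantization.
    most_sensitive = min(layer_wise_eval_score, key=layer_wise_eval_score.get)
    print('Most sensitive layer:', most_sensitive)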