aimet_tensorflow.compress

Top-level API for Compression

class aimet_tensorflow.keras.compress.ModelCompressor

AIMET model compressor. Enables model compression using various schemes.

static ModelCompressor.compress_model(model, eval_callback, eval_iterations, compress_scheme, cost_metric, parameters, trainer=None, visualization_url=None)

Compress a given model using the specified parameters

Parameters:

model (Model) – Model, represented by a tf.keras.Model, to compress
eval_callback (Callable[[Any, Optional[int], bool], float]) – Evaluation callback. Expected signature is evaluate(model, iterations, use_cuda). Expected to return an accuracy metric.
eval_iterations – Iterations to run evaluation for.
compress_scheme (CompressionScheme) – Compression scheme. See the enum for allowed values
cost_metric (CostMetric) – Cost metric to use for the compression-ratio (either mac or memory)
parameters (SpatialSvdParameters) – Compression parameters specific to given compression scheme
trainer (Optional[Callable]) – Training function. Pass None if per-layer fine-tuning is not required while creating the final compressed model.
visualization_url (Optional[str]) – URL, supplied by the user, at which visualizations will appear

Return type:

Tuple[Model, CompressionStats]

Returns:

A tuple of the compressed model and the compression statistics
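The evaluation callback is the piece callers most often have to write themselves. The following is a minimal sketch of a callback with the documented `evaluate(model, iterations, use_cuda)` signature. The `make_eval_callback` helper, the toy model, and the sample data are hypothetical stand-ins, and treating `iterations` as a sample count (with `None` meaning "all samples") is an assumption made to keep the sketch free of framework dependencies.

```python
from typing import Any, Callable, Optional, Sequence, Tuple

# Hypothetical stand-in data: (input, expected_label) pairs. Real code would
# iterate over batches of a validation dataset instead.
Sample = Tuple[Any, int]


def make_eval_callback(samples: Sequence[Sample]) -> Callable[[Any, Optional[int], bool], float]:
    """Build a callback with the documented (model, iterations, use_cuda) signature."""

    def evaluate(model: Any, iterations: Optional[int], use_cuda: bool = False) -> float:
        # Assumption: iterations=None means "evaluate on every sample".
        subset = samples if iterations is None else samples[:iterations]
        if not subset:
            return 0.0
        correct = sum(1 for x, label in subset if model(x) == label)
        # use_cuda is accepted only to match the signature; this sketch ignores it.
        return correct / len(subset)

    return evaluate


def toy_model(x):
    """Toy 'model': any callable that maps an input to a predicted label."""
    return int(x > 0)


samples = [(-2, 0), (3, 1), (5, 0), (-1, 0)]
evaluate = make_eval_callback(samples)
accuracy_all = evaluate(toy_model, None, False)  # all four samples
accuracy_two = evaluate(toy_model, 2, False)     # first two samples only
```

The callback returns a single float where higher is better, which matches the "accuracy metric" the parameter description asks for.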

Greedy Selection Parameters

class aimet_common.defs.GreedySelectionParameters(target_comp_ratio, num_comp_ratio_candidates=10, use_monotonic_fit=False, saved_eval_scores_dict=None)

Configuration parameters for the Greedy compression-ratio selection algorithm

Variables:

target_comp_ratio – Target compression ratio, expressed as a value between 0 and 1. The compression ratio is the ratio of the cost of the compressed model to the cost of the original model.
num_comp_ratio_candidates – Number of comp-ratio candidates to analyze per layer. More candidates allow a more granular distribution of compression at the cost of increased run-time during analysis. Default value=10. Value should be greater than 1.
use_monotonic_fit – If True, eval scores in the eval dictionary are fitted to a monotonically increasing function. This is useful when the eval dictionary scores for some layers are not monotonically increasing. By default, this option is set to False.
saved_eval_scores_dict – Path to the eval_scores dictionary pickle file that was saved in a previous run. This is useful for speeding up experiments, for example when trying different target compression ratios. AIMET saves the eval_scores dictionary pickle file automatically in a ./data directory relative to the current path. The num_comp_ratio_candidates parameter is ignored when this option is used.
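To make the quantities above concrete, here is a small sketch of the compression-ratio arithmetic, a candidate grid, and a monotonic fit. None of this is AIMET code. The evenly spaced grid is an assumption about how candidates might be laid out, and the running-maximum fit is a simple substitute technique chosen for illustration, not necessarily the fitting method AIMET uses. All function names are hypothetical.

```python
from decimal import Decimal
from typing import Dict, List


def compression_ratio(compressed_cost: int, original_cost: int) -> Decimal:
    """Cost of the compressed model divided by cost of the original model."""
    return Decimal(compressed_cost) / Decimal(original_cost)


def candidate_ratios(num_candidates: int) -> List[Decimal]:
    """Evenly spaced candidates strictly between 0 and 1 (assumed spacing)."""
    if num_candidates <= 1:
        raise ValueError("num_candidates should be greater than 1")
    return [Decimal(i) / Decimal(num_candidates) for i in range(1, num_candidates)]


def monotonic_fit(scores: Dict[Decimal, float]) -> Dict[Decimal, float]:
    """Make scores non-decreasing in the ratio using a running maximum."""
    fitted: Dict[Decimal, float] = {}
    best = float("-inf")
    for ratio in sorted(scores):
        best = max(best, scores[ratio])
        fitted[ratio] = best
    return fitted


# A model whose cost (MACs or memory) drops from 100M to 40M has ratio 0.4.
ratio = compression_ratio(compressed_cost=40_000_000, original_cost=100_000_000)

candidates = candidate_ratios(10)

# Hypothetical per-layer eval scores with a dip at ratio 0.3.
raw = {
    Decimal("0.1"): 0.20,
    Decimal("0.2"): 0.50,
    Decimal("0.3"): 0.40,
    Decimal("0.4"): 0.70,
}
fitted = monotonic_fit(raw)
```

After the fit, the dip at ratio 0.3 is lifted to the previous best score, so the sequence never decreases as the ratio grows.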

Spatial SVD Configuration

class aimet_tensorflow.keras.defs.SpatialSvdParameters(input_op_names, output_op_names, mode, params, multiplicity=1)

Configuration parameters for spatial svd compression

Parameters:

input_op_names (List[str]) – List of input op names of the model
output_op_names (List[str]) – List of output op names of the model
mode (Mode) – Either auto mode or manual mode
params (Union[ManualModeParams, AutoModeParams]) – Parameters for the mode selected
multiplicity – The multiplicity to which ranks/input channels will get rounded. Default: 1
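The effect of `multiplicity` can be illustrated with a short sketch. The helper below is hypothetical and is not taken from AIMET. It assumes that values are rounded up to the next multiple and then capped at the layer's original size; the source only states that ranks/input channels are rounded to the multiplicity, so the direction and the cap are assumptions.

```python
def round_to_multiplicity(value: int, multiplicity: int, max_value: int) -> int:
    """Round value up to a multiple of multiplicity, capped at max_value.

    Assumptions: rounding goes up, and the result never exceeds max_value
    (so a capped result may not itself be a multiple).
    """
    if multiplicity < 1:
        raise ValueError("multiplicity must be a positive integer")
    # Ceiling division, then scale back up to the nearest multiple.
    rounded = -(-value // multiplicity) * multiplicity
    return min(rounded, max_value)


rank_default = round_to_multiplicity(13, 1, 64)   # multiplicity=1 changes nothing
rank_rounded = round_to_multiplicity(13, 8, 64)   # next multiple of 8
rank_capped = round_to_multiplicity(60, 8, 60)    # would be 64, capped at 60
```

Rounding to a multiple such as 8 is typically motivated by hardware that processes channels in fixed-size groups; the default of 1 leaves the selected values unchanged.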

class AutoModeParams(greedy_select_params, modules_to_ignore=None)

Configuration parameters for auto-mode compression

Parameters:

greedy_select_params (GreedySelectionParameters) – Params for greedy comp-ratio selection algorithm
modules_to_ignore (Optional[List[Operation]]) – List of modules to ignore (None indicates nothing to ignore)

class ManualModeParams(list_of_module_comp_ratio_pairs)

Configuration parameters for manual-mode spatial svd compression

Parameters:

list_of_module_comp_ratio_pairs (List[ModuleCompRatioPair]) – List of (module, comp-ratio) pairs

class Mode(value)

Mode enumeration

auto = 2 – Auto mode

manual = 1 – Manual mode
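Putting the pieces together, the sketch below shows one way the classes documented on this page might be combined for auto-mode Spatial SVD. It is an illustration built from the signatures above, not a verified recipe. The wrapper functions `build_auto_mode_params` and `compress_with_spatial_svd` are hypothetical; accessing `AutoModeParams` and `Mode` as attributes of `SpatialSvdParameters`, and the enum members `CompressionScheme.spatial_svd` and `CostMetric.mac`, are assumptions about the installed AIMET version. Imports are placed inside the functions so that AIMET is only required when they are actually called.

```python
def build_auto_mode_params(input_op_names, output_op_names, target_comp_ratio,
                           num_comp_ratio_candidates=10, multiplicity=1):
    """Assemble SpatialSvdParameters for auto mode (hypothetical helper)."""
    # Lazy imports: AIMET is needed only when this function is called.
    from aimet_common.defs import GreedySelectionParameters
    from aimet_tensorflow.keras.defs import SpatialSvdParameters

    greedy = GreedySelectionParameters(
        target_comp_ratio=target_comp_ratio,
        num_comp_ratio_candidates=num_comp_ratio_candidates,
    )
    # Assumption: AutoModeParams and Mode are nested in SpatialSvdParameters.
    auto = SpatialSvdParameters.AutoModeParams(
        greedy_select_params=greedy,
        modules_to_ignore=None,  # None indicates nothing to ignore
    )
    return SpatialSvdParameters(
        input_op_names=input_op_names,
        output_op_names=output_op_names,
        mode=SpatialSvdParameters.Mode.auto,
        params=auto,
        multiplicity=multiplicity,
    )


def compress_with_spatial_svd(model, eval_callback, eval_iterations, parameters):
    """Call ModelCompressor.compress_model with the documented arguments."""
    from aimet_common.defs import CompressionScheme, CostMetric
    from aimet_tensorflow.keras.compress import ModelCompressor

    # Returns a tuple of (compressed model, compression statistics).
    return ModelCompressor.compress_model(
        model=model,
        eval_callback=eval_callback,
        eval_iterations=eval_iterations,
        compress_scheme=CompressionScheme.spatial_svd,  # assumed enum member
        cost_metric=CostMetric.mac,                     # assumed enum member
        parameters=parameters,
        trainer=None,  # no per-layer fine-tuning
    )
```

A caller would first build the parameters, then pass them together with a `tf.keras.Model` and an evaluation callback to `compress_with_spatial_svd`. Manual mode would follow the same shape, with `ManualModeParams` and `Mode.manual` in place of the auto-mode objects.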