aimet_tensorflow.compress
Top-level API for Compression
- class aimet_tensorflow.keras.compress.ModelCompressor
AIMET model compressor. Enables model compression using various schemes.
- static ModelCompressor.compress_model(model, eval_callback, eval_iterations, compress_scheme, cost_metric, parameters, trainer=None, visualization_url=None)
Compress a given model using the specified parameters.
- Parameters:
model (Model) – Model, represented by a tf.keras.Model, to compress
eval_callback (Callable[[Any, Optional[int], bool], float]) – Evaluation callback. Expected signature is evaluate(model, iterations, use_cuda). Expected to return an accuracy metric.
eval_iterations – Iterations to run evaluation for.
compress_scheme (CompressionScheme) – Compression scheme. See the enum for allowed values.
cost_metric (CostMetric) – Cost metric to use for the compression ratio (either mac or memory)
parameters (SpatialSvdParameters) – Compression parameters specific to the given compression scheme
trainer (Optional[Callable]) – Training function. Pass None if per-layer fine-tuning is not required while creating the final compressed model.
visualization_url (Optional[str]) – URL, supplied by the user, where visualizations will appear
- Return type:
Tuple[Model, CompressionStats]
- Returns:
A tuple of the compressed model and compression statistics
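The callback contract for eval_callback can be sketched in plain Python. This is a minimal illustration, not AIMET code: `evaluate`, `ToyModel`, and `SAMPLES` are hypothetical names, the model is a stand-in callable rather than a tf.keras.Model, and treating `iterations=None` as "use every sample" is an assumption.

```python
from typing import Any, Optional, Sequence, Tuple

# Hypothetical labelled data: (input, expected_label) pairs.
SAMPLES: Sequence[Tuple[int, int]] = [(1, 1), (2, 0), (3, 1), (4, 0), (5, 0)]


class ToyModel:
    """Stand-in for a model: predicts 1 for odd inputs, 0 otherwise."""

    def __call__(self, x: int) -> int:
        return x % 2


def evaluate(model: Any, iterations: Optional[int], use_cuda: bool) -> float:
    """Return accuracy in [0, 1] over at most `iterations` samples.

    `use_cuda` is accepted only to match the documented signature; this toy
    example runs on the CPU and ignores it.
    """
    # Assumption: None means "evaluate on every available sample".
    samples = SAMPLES if iterations is None else SAMPLES[:iterations]
    if not samples:
        return 0.0
    correct = sum(1 for x, label in samples if model(x) == label)
    return correct / len(samples)
```

A real callback would run the Keras model over a validation dataset instead of a fixed list, but the shape stays the same: three positional arguments in, one float out.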
Greedy Selection Parameters
- class aimet_common.defs.GreedySelectionParameters(target_comp_ratio, num_comp_ratio_candidates=10, use_monotonic_fit=False, saved_eval_scores_dict=None)
Configuration parameters for the Greedy compression-ratio selection algorithm
- Variables:
target_comp_ratio – Target compression ratio. Expressed as value between 0 and 1. Compression ratio is the ratio of cost of compressed model to cost of the original model.
num_comp_ratio_candidates – Number of comp-ratio candidates to analyze per layer. More candidates allow a more granular distribution of compression, at the cost of increased run-time during analysis. Default value is 10. Value should be greater than 1.
use_monotonic_fit – If True, eval scores in the eval dictionary are fitted to a monotonically increasing function. This is useful if you see the eval dict scores for some layers are not monotonically increasing. By default, this option is set to False.
saved_eval_scores_dict – Path to the eval_scores dictionary pickle file that was saved in a previous run. This is useful for speeding up experiments, for example when trying different target compression ratios. AIMET automatically saves the eval_scores dictionary pickle file in a ./data directory relative to the current path. The num_comp_ratio_candidates parameter is ignored when this option is used.
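To make the definitions above concrete, the sketch below computes a compression ratio from two cost figures and shows one simple way to force a set of eval scores to be non-decreasing as the compression ratio grows. The function names are hypothetical, and the running-maximum fit is only an illustrative stand-in: this page does not say which fitting method AIMET applies when use_monotonic_fit is True.

```python
from typing import Dict


def compression_ratio(compressed_cost: float, original_cost: float) -> float:
    """Ratio of compressed-model cost to original-model cost (MACs or memory)."""
    if original_cost <= 0:
        raise ValueError("original_cost must be positive")
    return compressed_cost / original_cost


def running_max_fit(scores_by_ratio: Dict[float, float]) -> Dict[float, float]:
    """Make scores non-decreasing as the compression ratio increases.

    Illustrative assumption: each score is replaced by the best score seen
    at any smaller-or-equal compression ratio.
    """
    fitted: Dict[float, float] = {}
    best = float("-inf")
    for ratio in sorted(scores_by_ratio):
        best = max(best, scores_by_ratio[ratio])
        fitted[ratio] = best
    return fitted
```

With these definitions, a model reduced from 1000 to 250 cost units has a compression ratio of 0.25, so a smaller ratio means more aggressive compression.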
Spatial SVD Configuration
- class aimet_tensorflow.keras.defs.SpatialSvdParameters(input_op_names, output_op_names, mode, params, multiplicity=1)
Configuration parameters for spatial SVD compression
- Parameters:
input_op_names (List[str]) – List of input op names to the model
output_op_names (List[str]) – List of output op names of the model
mode (Mode) – Either auto mode or manual mode
params (Union[ManualModeParams, AutoModeParams]) – Parameters for the mode selected
multiplicity – The multiplicity to which ranks/input channels will get rounded. Default: 1
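The effect of multiplicity can be illustrated with a small helper. This is a sketch under stated assumptions, not the library's implementation: `round_to_multiplicity` is a hypothetical name, and both the rounding direction (up) and the cap at a maximum value are assumptions, since this page only says that ranks/input channels "will get rounded".

```python
def round_to_multiplicity(value: int, multiplicity: int, max_value: int) -> int:
    """Round `value` up to the next multiple of `multiplicity`, capped at `max_value`.

    Assumption: rounding goes upward and never exceeds the layer's original
    size (`max_value`).
    """
    if multiplicity < 1:
        raise ValueError("multiplicity must be >= 1")
    # Ceiling division, then scale back to a multiple of `multiplicity`.
    rounded = -(-value // multiplicity) * multiplicity
    return min(rounded, max_value)
```

With the default multiplicity of 1 every value is left unchanged; a multiplicity such as 8 would turn a selected rank of 13 into 16 under these assumptions.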
- class AutoModeParams(greedy_select_params, modules_to_ignore=None)
Configuration parameters for auto-mode compression
- Parameters:
greedy_select_params (GreedySelectionParameters) – Params for greedy comp-ratio selection algorithm
modules_to_ignore (Optional[List[Operation]]) – List of modules to ignore (None indicates nothing to ignore)
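The classes documented on this page might be combined as in the sketch below, which wires GreedySelectionParameters, AutoModeParams, and SpatialSvdParameters into a call to ModelCompressor.compress_model. Constructor and method signatures are taken from this page. The wrapper function `compress_with_spatial_svd` is hypothetical, and the enum member names `SpatialSvdParameters.Mode.auto`, `CompressionScheme.spatial_svd`, and `CostMetric.mac`, as well as passing the target ratio as a `Decimal`, are assumptions that should be checked against the installed AIMET version.

```python
from decimal import Decimal
from typing import Any, Callable, List, Optional


def compress_with_spatial_svd(
    model: Any,
    eval_callback: Callable[[Any, Optional[int], bool], float],
    input_op_names: List[str],
    output_op_names: List[str],
    target_comp_ratio: str = "0.8",
    eval_iterations: int = 10,
):
    """Run auto-mode spatial SVD compression; returns (model, stats)."""
    # Imported lazily so this module loads even where AIMET is not installed.
    from aimet_common.defs import (
        CompressionScheme,
        CostMetric,
        GreedySelectionParameters,
    )
    from aimet_tensorflow.keras.compress import ModelCompressor
    from aimet_tensorflow.keras.defs import SpatialSvdParameters

    # Greedy per-layer ratio selection (assumption: ratio passed as Decimal).
    greedy_params = GreedySelectionParameters(
        target_comp_ratio=Decimal(target_comp_ratio),
        num_comp_ratio_candidates=10,
    )
    # Auto mode: no modules are excluded from compression.
    auto_params = SpatialSvdParameters.AutoModeParams(
        greedy_select_params=greedy_params,
        modules_to_ignore=None,
    )
    params = SpatialSvdParameters(
        input_op_names=input_op_names,
        output_op_names=output_op_names,
        mode=SpatialSvdParameters.Mode.auto,  # assumed enum member name
        params=auto_params,
        multiplicity=1,
    )
    # trainer and visualization_url keep their documented defaults (None).
    return ModelCompressor.compress_model(
        model,
        eval_callback=eval_callback,
        eval_iterations=eval_iterations,
        compress_scheme=CompressionScheme.spatial_svd,  # assumed member name
        cost_metric=CostMetric.mac,  # assumed member name
        parameters=params,
    )
```

Keeping the op names as arguments leaves it to the caller to decide how they are derived from the model, which this page does not specify.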