aimet_torch.mixed_precision
Top-level API for Manual mixed precision
- class aimet_torch.v2.mixed_precision.MixedPrecisionConfigurator(sim)[source]
Mixed Precision Configurator helps set up a mixed precision profile in the QuantSim object. To set up the sim in mixed precision, follow these steps (a sketch of the full workflow appears after the parameter list below):
1. Create the QuantSim object.
2. Create the MixedPrecisionConfigurator object by passing in the QuantSim object.
3. Make a series of set_precision/set_model_input_precision/set_model_output_precision calls.
4. Call the apply() method, optionally passing in a log file and the strict flag.
5. Run compute_encodings on the above QuantSim object.
6. Export the encodings/ONNX artifacts.
- Parameters:
  sim (QuantizationSimModel) – QuantSim object
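A minimal sketch of this workflow, assuming a toy model and dummy calibration data; the import paths and the export path/file name are assumptions based on this page, so adjust them to your AIMET version:

```python
import torch
from aimet_torch.quantsim import QuantizationSimModel
from aimet_torch.v2.mixed_precision import MixedPrecisionConfigurator

# Placeholder model and input; substitute your own
model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU()).eval()
dummy_input = torch.randn(1, 3, 224, 224)

sim = QuantizationSimModel(model, dummy_input)     # 1. create QuantSim
mp_configurator = MixedPrecisionConfigurator(sim)  # 2. create configurator

# 3. request precisions for module types and model inputs
mp_configurator.set_precision(torch.nn.Conv2d, activation='int8',
                              param={'weight': 'int8'})
mp_configurator.set_model_input_precision('int16')

mp_configurator.apply()                            # 4. realize the settings

# 5. calibrate, then 6. export the encodings/ONNX artifacts
sim.compute_encodings(lambda m, _: m(dummy_input), None)
sim.export('/tmp/mmp_example', 'model_mmp', dummy_input)
```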
- set_precision(arg, activation, param=None)[source]
- Parameters:
  arg (Union[Module, Type[Module]]) – Module can be a torch.nn.Module instance or the type of the module.
  activation (Union[List[Literal['int16', 'int8', 'int4', 'fp16']], Literal['int16', 'int8', 'int4', 'fp16']]) – A string representing the activation dtype of the module input(s)
  param (Optional[Dict[str, Literal['int16', 'int8', 'int4', 'fp16']]]) – Dict with the name of the param as key and its dtype as value
If ‘module’ is a leaf module (one that is not composed of other torch.nn.Module instances), the specified settings are applied to the module itself.
If ‘module’ is a non-leaf module (one composed of other torch.nn.Module instances), the specified settings are applied to all the leaf modules within it.
If ‘module’ is a module type, the specified activation and param settings are applied to all modules of that type in the model.
If the same ‘module’ is specified through multiple set_precision(…) calls, the latest call takes effect.
Example:
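A minimal sketch, assuming mp_configurator was built as in the workflow above and that the model has a conv1 submodule (a hypothetical name):

```python
# Type-level rule: all Conv2d layers run int8 activations and int8 weights
mp_configurator.set_precision(torch.nn.Conv2d, activation='int8',
                              param={'weight': 'int8'})

# Instance-level request for one specific layer; being the later call for
# this module, it takes effect per the precedence rule described above
mp_configurator.set_precision(model.conv1, activation='int16',
                              param={'weight': 'int16'})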
- set_model_input_precision(activation)[source]
Sets the activation precision for the model inputs.
- Parameters:
  activation (Union[List[Optional[Literal['int16', 'int8', 'int4', 'fp16']]], Tuple[Optional[Literal['int16', 'int8', 'int4', 'fp16']]], Literal['int16', 'int8', 'int4', 'fp16']]) – Activation dtypes for the inputs of the model
- set_model_output_precision(activation)[source]
Sets the activation precision for the model outputs.
- Parameters:
  activation (Union[List[Optional[Literal['int16', 'int8', 'int4', 'fp16']]], Tuple[Optional[Literal['int16', 'int8', 'int4', 'fp16']]], Literal['int16', 'int8', 'int4', 'fp16']]) – Activation dtypes for the outputs of the model
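A minimal sketch of both methods; judging by the Optional entries in the signatures, the list/tuple form with None entries is presumably how individual inputs or outputs are left untouched when a model has several:

```python
# Single dtype applied to the (sole) model input and output
mp_configurator.set_model_input_precision('int16')
mp_configurator.set_model_output_precision('int8')

# For a model with two inputs: set the first to fp16, leave the second as-is
mp_configurator.set_model_input_precision(['fp16', None])
```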
- apply(log_file='./mmp_log.txt', strict=True)[source]
Applies the mixed precision settings specified through the set_precision/set_model_input_precision/set_model_output_precision calls to the QuantSim object.
- Parameters:
  log_file (Union[IO, str, None]) – Log file to store the logs. log_file can either be a string representing the path or an IO object to write the logs into.
  strict (bool) – Boolean flag indicating whether to fail on incorrect or conflicting inputs (strict=True) or to take a best-effort approach to realize the mixed precision settings (strict=False)
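For example, to take the best-effort path and write the decisions to a custom log file (a placeholder path) instead of failing on conflicting requests:

```python
mp_configurator.apply(log_file='./mp_decisions.txt', strict=False)
```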
Top-level API for Automatic mixed precision
Note
To enable phase 3, set the attribute GreedyMixedPrecisionAlgo.ENABLE_CONVERT_OP_REDUCTION = True.
Currently only two candidates are supported: ((8, int), (8, int)) and ((16, int), (8, int)).
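A minimal sketch of enabling phase 3, assuming GreedyMixedPrecisionAlgo lives in aimet_torch.amp.mixed_precision_algo (the module also documented below); verify the import path against your AIMET version:

```python
from aimet_torch.amp.mixed_precision_algo import GreedyMixedPrecisionAlgo

# Opt in to phase 3 (convert-op reduction) before running the AMP algorithm
GreedyMixedPrecisionAlgo.ENABLE_CONVERT_OP_REDUCTION = True
```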
Quantizer Groups definition
- class aimet_torch.amp.quantizer_groups.QuantizerGroup(input_quantizers=<factory>, output_quantizers=<factory>, parameter_quantizers=<factory>, supported_kernel_ops=<factory>)[source]
Group of modules and quantizers
- get_active_quantizers(name_to_quantizer_dict)[source]
Find all active tensor quantizers associated with this quantizer group
- get_candidate(name_to_quantizer_dict)[source]
Gets activation & parameter bitwidth
- Parameters:
  name_to_quantizer_dict (Dict) – Gets module from module name
- Return type:
  Tuple[Tuple[int, QuantizationDataType], Tuple[int, QuantizationDataType]]
- Returns:
  Tuple of activation, parameter bitwidth and data type
- get_input_quantizer_modules()[source]
Helper method to get the module names corresponding to input_quantizers.
- set_quantizers_to_candidate(name_to_quantizer_dict, candidate)[source]
Sets a quantizer group to a given candidate bitwidth
- Parameters:
  name_to_quantizer_dict (Dict) – Gets module from module name
  candidate (Tuple[Tuple[int, QuantizationDataType], Tuple[int, QuantizationDataType]]) – Candidate with act and param bw and data types
- Return type:
  None
CallbackFunc Definition
- class aimet_common.defs.CallbackFunc(func, func_callback_args=None)[source]
Class encapsulating a callback function and its arguments
- Parameters:
  func (Callable) – Callable function
  func_callback_args – Arguments passed to the callable function
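A minimal sketch, assuming AMP later invokes the bundle as func(model, func_callback_args); the evaluation body here is a placeholder metric, not a real accuracy measurement:

```python
import torch
from aimet_common.defs import CallbackFunc

def evaluate(model, num_samples):
    # Placeholder evaluation: run random data through the model and return
    # a scalar score; replace with a real metric over your validation set
    model.eval()
    with torch.no_grad():
        return model(torch.randn(num_samples, 3, 224, 224)).mean().item()

eval_callback = CallbackFunc(evaluate, func_callback_args=100)
```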
- class aimet_torch.amp.mixed_precision_algo.EvalCallbackFactory(data_loader, forward_fn=None)[source]
Factory class for various built-in eval callbacks
- Parameters:
  data_loader (DataLoader) – Data loader to be used for evaluation
  forward_fn (Optional[Callable[[Module, Any], Tensor]]) – Function that runs a forward pass and returns the output tensor. This function is expected to take 1) a model and 2) a single batch yielded from the data loader, and return a single torch.Tensor object representing the output of the model. The default forward function is roughly equivalent to lambda model, batch: model(batch)
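A minimal sketch using a toy unlabeled dataset; because TensorDataset yields 1-tuples, a custom forward_fn unpacks the batch before the forward pass (with batches that are plain tensors, the default forward function described above would suffice):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from aimet_torch.amp.mixed_precision_algo import EvalCallbackFactory

# Toy calibration data: 32 unlabeled images
dataset = TensorDataset(torch.randn(32, 3, 224, 224))
data_loader = DataLoader(dataset, batch_size=8)

# Each batch is a 1-tuple, so index into it before calling the model
factory = EvalCallbackFactory(data_loader,
                              forward_fn=lambda model, batch: model(batch[0]))
```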