aimet_torch.experimental.adascale
Top level APIs
- aimet_torch.experimental.adascale.apply_adascale(qsim, data_loader, forward_fn=None, num_epochs=1)
- Parameters:
  - qsim (QuantizationSimModel) – Quantization Sim model
  - data_loader (DataLoader) – DataLoader object to load the input data
  - forward_fn (Optional[Callable[[Module, Any], Any]]) – forward function to run the forward pass of the model
  - num_epochs (int) – Number of epochs to perform the AdaScale BKD
Note that forward_fn must take exactly two arguments: 1) the model, and 2) the object returned from the data loader, regardless of whether that object is a tensor, a tuple of tensors, a dict, etc.
forward_fn should prepare the “input sample” as needed and call the model’s forward pass at the very end. It should not run any sort of eval or create a full data loader inside the function. A sketch is shown below.
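The following is a minimal sketch of a conforming forward_fn; the (inputs, labels) unpacking is an assumption and should be adapted to whatever your DataLoader actually yields:

def forward_fn(model, batch):
    # Assumption: the data loader yields (inputs, labels) pairs; adapt the
    # unpacking to your DataLoader's actual output.
    inputs, _ = batch
    # Prepare the input sample as needed, then call the forward pass last.
    return model(inputs)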
- Example usage:
>>> model = DummyModel()
>>> dummy_input = ...
>>> data_set = DataSet(dummy_input)
>>> data_loader = DataLoader(data_set, ...)
>>> sim = QuantizationSimModel(model, dummy_input)
>>> apply_adascale(sim, data_loader, forward_fn=forward_fn, num_epochs=10)
>>> sim.compute_encodings(...)
>>> sim.export(...)
apply_adascale modifies the model’s weights in place.
compute_encodings should not be called before the apply_adascale call.
Activation quantizers remain uninitialized throughout the feature, so the user must call compute_encodings afterwards; this ensures the activation encodings are computed with the updated weights taken into account.
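As a rough illustration of that final step, the sketch below passes a calibration callback to compute_encodings. calibration_fn is a hypothetical helper, and the callback-style compute_encodings signature shown is an assumption that may differ across AIMET versions, so consult the QuantizationSimModel documentation for your release:

def calibration_fn(model, data_loader):
    # Hypothetical callback: run forward passes over calibration data so
    # activation encodings are computed with the AdaScale-updated weights.
    for batch in data_loader:
        forward_fn(model, batch)

sim.compute_encodings(calibration_fn, data_loader)  # assumed signature; verify for your AIMET version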
Warning: This feature is currently considered experimental, and its API may change.