aimet_onnx.experimental.adascale¶
Top level APIs
- aimet_onnx.experimental.adascale.adascale_optimizer.apply_adascale(sim, inputs, adascale_model_config, num_iterations=1500)¶
- Parameters:
  - sim (QuantizationSimModel) – Quantization sim model
  - inputs (Collection[Dict[str, np.ndarray]]) – The set of input samples to use during optimization
  - adascale_model_config (AdaScaleModelConfig) – AdaScale model config. Pre-defined configs exist for Llama, Qwen2, Mistral, Qwen3, and Phi3; for other models, construct an AdaScaleModelConfig manually.
  - num_iterations (int) – Number of iterations to optimize for during AdaScale
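Since inputs expects a collection of dictionaries mapping graph input names to NumPy arrays, a calibration set can be sketched as below. This is a minimal illustration using NumPy only; the input name "input_ids" and the shapes are hypothetical placeholders and must match your model's actual ONNX graph inputs.

```python
import numpy as np

# Build a small calibration set of type Collection[Dict[str, np.ndarray]]:
# each element maps ONNX graph input names to one sample's arrays.
# NOTE: "input_ids", the vocabulary size, and the (1, 128) shape are
# hypothetical -- substitute the input names and shapes of your own model.
rng = np.random.default_rng(0)
inputs = [
    {"input_ids": rng.integers(0, 32000, size=(1, 128), dtype=np.int64)}
    for _ in range(32)
]
```

In practice these samples would come from a representative calibration dataset rather than random data, since AdaScale optimizes against the model's actual input distribution.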
- Example usage:

>>> model = DummyModel()
>>> inputs = ...
>>> adascale_model_config = adascale_model_config['llama']
>>> sim = QuantizationSimModel(model)
>>> apply_adascale(sim, inputs, adascale_model_config, num_iterations=num_iterations)
>>> sim.compute_encodings(...)
>>> sim.export(...)
Notes:
- apply_adascale modifies the model's weights in-place.
- compute_encodings should not be called before the apply_adascale call.
- Activation quantizers remain uninitialized throughout this feature, so the user must call compute_encodings afterwards. This ensures that activation encodings are computed with the updated weights taken into account.
Warning: This feature is currently experimental and its API is subject to change.