aimet_onnx.experimental.adascale

Top level APIs

aimet_onnx.experimental.adascale.adascale_optimizer.apply_adascale(sim, inputs, adascale_model_config, num_iterations=1500)
Parameters:
  • sim (QuantizationSimModel) – Quantization Sim model

  • inputs (Collection[Dict[str, np.ndarray]]) – The set of input samples to use during optimization.

  • adascale_model_config (AdaScaleModelConfig) – AdaScale model config. Pre-defined configs are available for Llama, Qwen2, Mistral, Qwen3, and Phi3; for other models, construct an AdaScaleModelConfig directly.

  • num_iterations (int) – Number of optimization iterations to run during AdaScale
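The inputs argument is a collection of samples, where each sample is a dict mapping ONNX graph input names to NumPy arrays. A minimal sketch of preparing such a collection; the input name "input_ids", the vocabulary size, and the shapes below are hypothetical placeholders, not values prescribed by the API:

```python
import numpy as np

# Each sample is a dict of {onnx_input_name: np.ndarray};
# the collection of samples is what apply_adascale iterates over.
rng = np.random.default_rng(0)
inputs = [
    # "input_ids" and the (1, 128) shape are illustrative only
    {"input_ids": rng.integers(0, 32000, size=(1, 128)).astype(np.int64)}
    for _ in range(8)
]
```

The collection can be any iterable of such dicts (e.g. a list built from a calibration dataloader), as long as the keys match the model's ONNX input names.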

Example usage:
>>> model = DummyModel()
>>> inputs = ...
>>> adascale_model_config = adascale_model_config['llama']
>>> sim = QuantizationSimModel(model)
>>> apply_adascale(sim, inputs, adascale_model_config, num_iterations=num_iterations)
>>> sim.compute_encodings(...)
>>> sim.export(...)
  1. apply_adascale modifies the model weights in place.

  2. compute_encodings should not be called before the apply_adascale call.

  3. Activation quantizers remain uninitialized throughout the feature, so compute_encodings must be called by the user afterwards. This ensures activation encodings are computed with the updated weights taken into account.

Warning: This feature is considered experimental and its API is subject to change.

class aimet_onnx.experimental.adascale.adascale_optimizer.AdaScaleModelConfig(model_type, beta_gamma_lr=0.001, scales_lr=0.0005)
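The constructor signature above takes a model type plus two learning rates with defaults. As a sketch of how the parameters relate, here is a hypothetical stand-in dataclass mirroring the documented signature (the real class lives in aimet_onnx.experimental.adascale and may differ internally):

```python
from dataclasses import dataclass

# Hypothetical stand-in mirroring the documented signature:
# AdaScaleModelConfig(model_type, beta_gamma_lr=0.001, scales_lr=0.0005)
@dataclass
class AdaScaleModelConfigSketch:
    model_type: str          # e.g. "llama", "qwen2", "mistral" (names illustrative)
    beta_gamma_lr: float = 1e-3   # learning rate for beta/gamma parameters
    scales_lr: float = 5e-4       # learning rate for the learned scales

# A custom config for a model without a pre-defined entry
cfg = AdaScaleModelConfigSketch(model_type="my_model", scales_lr=1e-4)
```

Overriding only the learning rates you need, as shown, keeps the remaining values at the documented defaults.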