AIMET Keras Layer Output Generation API
This API captures and saves the intermediate layer outputs of a model. The model can be the original (FP32) model or a quantsim model. The layer outputs are named according to the Keras model exported by the quantsim export API. This enables layer-output comparison between the FP32 model, the quantization-simulated model, and the actual quantized model on the target device, to debug accuracy-mismatch issues.
Top-level API

class aimet_tensorflow.keras.layer_output_utils.LayerOutputUtil(model, save_dir='./KerasLayerOutput')

    This class captures the output of every layer of a Keras (FP32/quantsim) model, creates a layer-output-name to layer-output dictionary, and saves the per-layer outputs to disk.

    Constructor - Initializes what is required for capturing and naming layer outputs.

    - Parameters
        model (Model) - Keras (FP32/quantsim) model.
        save_dir (str) - Directory in which to save the layer outputs.
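To illustrate what the utility captures, the following sketch builds the same kind of layer-name to layer-output dictionary using plain Keras. The tiny model and layer names here are hypothetical, chosen only to match the (16, 16, 3) input shape used in the code example below; the real class additionally names outputs to match the quantsim-exported model and saves them to disk.

```python
import tensorflow as tf

# A tiny FP32 Keras model (hypothetical, for illustration only).
inputs = tf.keras.Input(shape=(16, 16, 3))
x = tf.keras.layers.Conv2D(8, 3, padding="same", name="conv2d")(inputs)
outputs = tf.keras.layers.GlobalAveragePooling2D(name="pool")(x)
model = tf.keras.Model(inputs, outputs)

# Conceptually what LayerOutputUtil does: build a model that emits every
# layer's output, run a batch through it, and key the captured outputs
# by layer name.
capture_model = tf.keras.Model(
    inputs=model.input,
    outputs=[layer.output for layer in model.layers[1:]])
batch = tf.random.uniform((1, 16, 16, 3))
layer_outputs = dict(zip(
    (layer.name for layer in model.layers[1:]),
    capture_model(batch)))
```

After this runs, `layer_outputs["conv2d"]` holds the (1, 16, 16, 8) convolution output and `layer_outputs["pool"]` the (1, 8) pooled output.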
The following API can be used to generate layer outputs:
LayerOutputUtil.generate_layer_outputs(input_batch)

    This method captures the output of every layer of a Keras model and saves the inputs and the corresponding layer outputs to disk. This enables layer-output comparison either between the original FP32 model and the quantization-simulated model, or between the quantization-simulated model and the actual quantized model on target, to debug accuracy-mismatch issues.

    - Parameters
        input_batch (Union[Tensor, List[Tensor], Tuple[Tensor]]) - Batch of inputs for which layer outputs need to be generated.
    - Returns
        None
Code Example
Imports
import numpy as np
import tensorflow as tf
from aimet_tensorflow.keras.quantsim import QuantizationSimModel
from aimet_tensorflow.keras.layer_output_utils import LayerOutputUtil
Obtain the original or QuantSim model
def quantsim_forward_pass_callback(model, dummy_input):
_ = model.predict(dummy_input)
# Load the baseline/original (FP32) model
base_model = load_baseline_model()
dummy_input = np.random.rand(1, 16, 16, 3)
# Create QuantizationSimModel object
quantsim_obj = QuantizationSimModel(
model=base_model,
quant_scheme='tf_enhanced',
rounding_mode="nearest",
default_output_bw=8,
default_param_bw=8,
in_place=False,
config_file=None
)
# Compute encodings
quantsim_obj.compute_encodings(quantsim_forward_pass_callback,
forward_pass_callback_args=dummy_input
)
Obtain pre-processed inputs
# Get inputs that are pre-processed in the same manner as the data used while computing the quantsim encodings
input_batches = get_pre_processed_inputs()
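The source does not define `get_pre_processed_inputs`; a minimal hypothetical stand-in, assuming the (1, 16, 16, 3) input shape of the dummy input above and random data in place of a real pre-processing pipeline, might look like:

```python
import numpy as np

# Hypothetical stand-in for get_pre_processed_inputs(): yields batches shaped
# like the model input. In practice these batches should come from the same
# pre-processing pipeline used while computing the quantsim encodings.
def get_pre_processed_inputs(num_batches=4, batch_size=1):
    rng = np.random.default_rng(0)
    for _ in range(num_batches):
        yield rng.random((batch_size, 16, 16, 3)).astype(np.float32)

input_batches = list(get_pre_processed_inputs())
```

Each yielded batch can then be passed to `generate_layer_outputs` as in the loop below.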
Generate Layer Outputs
# Generate layer-outputs
layer_output_util = LayerOutputUtil(model=quantsim_obj.model, save_dir="./KerasLayerOutput")
for input_batch in input_batches:
layer_output_util.generate_layer_outputs(input_batch=input_batch)