AIMET TensorFlow Layer Output Generation API
This API captures and saves the intermediate layer outputs of a model. The model can be either the original (FP32) model or a QuantSim model. The layer outputs are named to match the Keras model exported by the QuantSim export API. This allows layer outputs to be compared across the FP32 model, the quantization-simulated model, and the quantized model running on the target device, which helps debug accuracy mismatches.
Top-level API
- class aimet_tensorflow.keras.layer_output_utils.LayerOutputUtil(model, save_dir='./KerasLayerOutput')
Implementation to capture and save outputs of intermediate layers of a model (fp32/quantsim)
Constructor for LayerOutputUtil.
- Parameters:
  - model (Model) – Keras (fp32/quantsim) model.
  - save_dir (str) – Directory to save the layer outputs.
The following API can be used to generate layer outputs:
- LayerOutputUtil.generate_layer_outputs(input_batch)
This method captures the output of every layer of a model and saves the inputs and the corresponding layer outputs to disk.
- Parameters:
  - input_batch (Union[Tensor, List[Tensor], Tuple[Tensor]]) – Batch of inputs for which layer outputs need to be generated.
- Returns:
  None
Code Example
Imports
import tensorflow as tf
from aimet_tensorflow.keras.quantsim import QuantizationSimModel
from aimet_tensorflow.keras.layer_output_utils import LayerOutputUtil
Obtain Original or QuantSim model from AIMET Export Artifacts
# Load the model.
model = tf.keras.models.load_model('path/to/aimet_export_artifacts/model.h5')
# Use the same arguments that were used for the exported QuantSim model. For simplicity, only the mandatory arguments are passed below.
quantsim = QuantizationSimModel(model)
# Load exported encodings into quantsim object.
quantsim.load_encodings_to_sim('path/to/aimet_export_artifacts/model.encodings')
# Check that the constructed original and QuantSim models run properly before using the Layer Output Generation API.
# dummy_input is a placeholder for an input that matches the model's expected input shape.
_ = model.predict(dummy_input)
_ = quantsim.predict(dummy_input)
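The sanity check above uses dummy_input without defining it. A minimal sketch of one way to derive its shape follows, assuming a single-input Keras model whose input_shape reports the batch dimension as None, for example (None, 224, 224, 3). The helper name concrete_input_shape is hypothetical and is not part of AIMET or Keras.

```python
from typing import Optional, Sequence, Tuple


def concrete_input_shape(
    input_shape: Sequence[Optional[int]], batch_size: int = 1
) -> Tuple[int, ...]:
    """Turn a Keras-style input shape into a fully concrete shape.

    Hypothetical helper: an unknown (None) batch dimension is replaced with
    batch_size, while a fixed batch dimension is kept as it is.
    """
    if not input_shape:
        raise ValueError("input_shape must have at least a batch dimension")
    if any(dim is None for dim in input_shape[1:]):
        raise ValueError("only the batch dimension may be None")
    batch = batch_size if input_shape[0] is None else input_shape[0]
    return (batch, *input_shape[1:])


# Example: shape for a single dummy image.
shape = concrete_input_shape((None, 224, 224, 3))
# With TensorFlow available, a dummy input could then be created with
# tf.random.normal(shape).
```

Models with several inputs report a list of shapes, so the helper would need to be applied to each entry separately.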
Obtain inputs for which we want to generate intermediate layer-outputs
# Use the same input pre-processing pipeline that was used for computing the quantization encodings.
input_batches = get_pre_processed_inputs()
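get_pre_processed_inputs() is left to the user. The sketch below shows one possible shape for such a helper, written in plain Python so that it runs without TensorFlow. The parameters samples, preprocess and batch_size are assumptions added for illustration, since the call above takes no arguments.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def get_pre_processed_inputs(
    samples: Iterable[T],
    preprocess: Callable[[T], U],
    batch_size: int,
) -> Iterator[List[U]]:
    """Yield lists of pre-processed samples, batch_size at a time.

    Hypothetical sketch: the arguments are illustrative, not an AIMET API.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[U] = []
    for sample in samples:
        batch.append(preprocess(sample))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final batch, possibly smaller than batch_size
        yield batch


# Toy usage: scale raw pixel values into [0, 1], two samples per batch.
input_batches = list(
    get_pre_processed_inputs([0, 51, 102, 255, 204], lambda p: p / 255.0, batch_size=2)
)
```

In a real pipeline each yielded list would likely be stacked into a single tensor (for example with tf.stack) before being passed to generate_layer_outputs, because a list of tensors is an accepted input type in its own right and may be treated as multiple model inputs.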
Generate layer-outputs
# Use original model to get fp32 layer-outputs
fp32_layer_output_util = LayerOutputUtil(model=model, save_dir='fp32_layer_outputs')
# Use quantsim model to get quantsim layer-outputs
quantsim_layer_output_util = LayerOutputUtil(model=quantsim.model, save_dir='quantsim_layer_outputs')
for input_batch in input_batches:
    fp32_layer_output_util.generate_layer_outputs(input_batch=input_batch)
    quantsim_layer_output_util.generate_layer_outputs(input_batch=input_batch)
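Once both sets of layer outputs are saved, they can be compared layer by layer to find where the quantized model starts to diverge. The sketch below assumes the saved outputs have already been loaded into dictionaries that map layer names to flat sequences of floats. The on-disk format is not described here, so loading is left out. The function names and the choice of SQNR as the metric are illustrative assumptions, not something AIMET prescribes.

```python
import math
from typing import Dict, List, Sequence, Tuple


def sqnr_db(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Signal-to-quantization-noise ratio in dB; higher means closer agreement."""
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same number of elements")
    signal = sum(r * r for r in reference)
    noise = sum((r - c) ** 2 for r, c in zip(reference, candidate))
    if noise == 0.0:
        return math.inf  # identical outputs
    if signal == 0.0:
        return -math.inf  # all-zero reference but non-zero error
    return 10.0 * math.log10(signal / noise)


def rank_layers_by_sqnr(
    fp32_outputs: Dict[str, Sequence[float]],
    quantsim_outputs: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Return (layer_name, sqnr_db) pairs for shared layers, worst layer first."""
    shared = fp32_outputs.keys() & quantsim_outputs.keys()
    scores = [
        (name, sqnr_db(fp32_outputs[name], quantsim_outputs[name]))
        for name in shared
    ]
    return sorted(scores, key=lambda item: item[1])


# Toy data standing in for loaded layer outputs (hypothetical layer names).
fp32 = {"conv1": [1.0, 2.0, 3.0], "dense": [0.5, -0.5]}
quant = {"conv1": [1.0, 2.0, 3.0], "dense": [0.5, -0.4]}
ranked = rank_layers_by_sqnr(fp32, quant)
```

Because the layer outputs share the names of the exported model, the same name-keyed comparison could also be applied to outputs collected from the target device.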