AIMET TensorFlow Layer Output Generation API¶
This API captures and saves intermediate layer-outputs of a model. The model can be the original (FP32) model or a quantsim model. The layer-outputs are named according to the TensorFlow model exported by the quantsim export API. This allows layer-output comparison among the FP32 model, the quantization-simulated model, and the actually quantized model on the target device, in order to debug accuracy-mismatch issues.
Top-level API¶
class aimet_tensorflow.layer_output_utils.LayerOutputUtil(session, starting_op_names, output_op_names, dir_path)[source]¶

Implementation to capture and save outputs of intermediate layers of a model (FP32/quantsim).

Constructor for LayerOutputUtil.

- Parameters
  - session (Session) – Session containing the model whose layer-outputs are needed.
  - starting_op_names (List[str]) – List of starting op names of the model.
  - output_op_names (List[str]) – List of output op names of the model.
  - dir_path (str) – Directory wherein layer-outputs will be saved.
The following API can be used to generate layer-outputs:
LayerOutputUtil.generate_layer_outputs(input_batch)[source]¶

This method captures the output of every layer of a model and saves the inputs and corresponding layer-outputs to disk.

- Parameters
  - input_batch (Union[ndarray, List[ndarray], Tuple[ndarray]]) – Batch of inputs for which we want to obtain layer-outputs.
- Returns
  - None
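As the type annotation above indicates, `input_batch` may be a single ndarray (single-input model) or a list/tuple of ndarrays (multi-input model, one array per starting op). A minimal numpy-only sketch of building such batches; the shapes match the 16x16x3 example model used below, and the second input of the multi-input case is purely hypothetical:

```python
import numpy as np

# Single-input model: one ndarray batch of shape (batch_size, H, W, C)
single_input_batch = np.random.randn(4, 16, 16, 3).astype(np.float32)

# Multi-input model: a list (or tuple) of ndarrays, one per starting op.
# The second entry here is a hypothetical extra input for illustration.
multi_input_batch = [
    np.random.randn(4, 16, 16, 3).astype(np.float32),
    np.random.randn(4, 10).astype(np.float32),
]
```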
Code Example¶
Imports
import numpy as np
import tensorflow as tf
from aimet_tensorflow.examples.test_models import keras_model
from aimet_tensorflow.quantsim import QuantizationSimModel
from aimet_tensorflow.layer_output_utils import LayerOutputUtil
Obtain Original or QuantSim model session
# Load original model into session
def cpu_session():
    tf.compat.v1.reset_default_graph()
    with tf.device('/cpu:0'):
        model = keras_model()
        init = tf.compat.v1.global_variables_initializer()
    session = tf.compat.v1.Session()
    session.run(init)
    return session
session = cpu_session()
# Obtain quantsim model session
def quantsim_forward_pass_callback(session, dummy_input):
    model_input = session.graph.get_tensor_by_name('conv2d_input:0')
    model_output = session.graph.get_tensor_by_name('keras_model/Softmax_quantized:0')
    return session.run(model_output, feed_dict={model_input: dummy_input})
dummy_input = np.random.randn(1, 16, 16, 3)
quantsim = QuantizationSimModel(session, ['conv2d_input'], ['keras_model/Softmax'], use_cuda=False)
quantsim.compute_encodings(quantsim_forward_pass_callback, dummy_input)
Obtain pre-processed inputs
# Get inputs that are pre-processed in the same manner as those used while computing quantsim encodings
input_batches = get_pre_processed_inputs()
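`get_pre_processed_inputs` stands in for your own data pipeline. A hypothetical numpy-only sketch of such a helper (the batch count, shapes, and [0, 1] scaling are illustrative assumptions, not part of the AIMET API):

```python
import numpy as np

def get_pre_processed_inputs(num_batches=2, batch_size=4):
    """Hypothetical helper: return pre-processed input batches.

    Stand-in for a real data pipeline; it should apply the same
    pre-processing used while computing quantsim encodings.
    """
    batches = []
    for _ in range(num_batches):
        # Raw uint8 images in [0, 255], shaped for the 16x16x3 example model
        raw = np.random.randint(0, 256, size=(batch_size, 16, 16, 3), dtype=np.uint8)
        # Example pre-processing: scale to float32 in [0, 1]
        batches.append(raw.astype(np.float32) / 255.0)
    return batches
```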
Generate Layer Outputs
# Generate layer-outputs
layer_output_util = LayerOutputUtil(session=quantsim.session, starting_op_names=['conv2d_input'],
output_op_names=['keras_model/Softmax'], dir_path='./layer_output_dump')
for input_batch in input_batches:
    layer_output_util.generate_layer_outputs(input_batch)
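Once layer-outputs have been dumped for both the FP32 session and the quantsim session (into two separate directories), they can be compared layer by layer to locate where quantization error grows. A sketch of such a comparison using SQNR, assuming each layer-output was saved as a `<layer_name>.npy` file with matching names in both directories (an assumption about the dump layout for illustration, not a documented AIMET contract):

```python
import os
import numpy as np

def sqnr_db(fp32_out, quant_out, eps=1e-10):
    """Signal-to-quantization-noise ratio (dB) between two layer-outputs."""
    signal = np.mean(fp32_out.astype(np.float64) ** 2)
    noise = np.mean((fp32_out.astype(np.float64) - quant_out.astype(np.float64)) ** 2)
    return 10.0 * np.log10((signal + eps) / (noise + eps))

def compare_layer_outputs(fp32_dir, quantsim_dir):
    """Compute per-layer SQNR for identically named .npy dumps.

    Assumes matching '<layer_name>.npy' files in both directories;
    layers with low SQNR are candidates for accuracy debugging.
    """
    results = {}
    for fname in sorted(os.listdir(fp32_dir)):
        if fname.endswith('.npy') and os.path.exists(os.path.join(quantsim_dir, fname)):
            fp32_out = np.load(os.path.join(fp32_dir, fname))
            quant_out = np.load(os.path.join(quantsim_dir, fname))
            results[fname] = sqnr_db(fp32_out, quant_out)
    return results
```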