How to compute inference time (ms) to compare my Original, Projected and Fine Tuned models? Error For code generation of convolution1dLayer, when convolving over the time dime
Silvia
on 6 Nov 2024 at 9:46
Commented: Katja Mogalle
on 14 Nov 2024 at 12:17
I am trying to compute the inference time of three different models (Original, Projected, and Fine-Tuned) to compare their performances not only with my evaluation metrics and in terms of dimensions (number of learnable parameters) but also in terms of inference time. I am following this example: https://it.mathworks.com/help/deeplearning/ug/compress-network-for-estimating-soc.html. The architectures of my networks are as follows:
Original Net:
- 'input' Sequence Input Sequence input with 1 dimensions
- 'conv1' 1-D Convolution 10 8×1 convolutions with stride 1 and padding 'same'
- 'batchnorm1' Batch Normalization Batch normalization with 10 channels
- 'relu1' ReLU ReLU
- 'gru1' GRU GRU with 32 hidden units
- 'output' Fully Connected 1 fully connected layer
Projected and Fine Tuned Net:
- 'input' Sequence Input Sequence input with 1 dimensions
- 'conv1' 1-D Convolution 10 8×1 convolutions with stride 1 and padding 'same'
- 'batchnorm1' Batch Normalization Batch normalization with 10 channels
- 'relu1' ReLU ReLU
- 'gru1' Projected Layer Projected GRU with 32 hidden units
- 'output' Projected Layer Projected fully connected layer with output size 1
This is my code:
cfg = coder.config("mex");
cfg.TargetLang = "C++";
cfg.DeepLearningConfig = coder.DeepLearningConfig("none");
noisyInputType = coder.typeof('double', [Inf 1], [1 0]);
codegen -config cfg FinalFineTuned_predict -args {noisyInputType}
codegen -config cfg FinalProjected_predict -args {noisyInputType}
codegen -config cfg FinalOriginal_predict -args {noisyInputType}
Where the functions are:
function out = FinalOriginal_predict(in) %#codegen
% A persistent object mynet is used to load the series network object.
% At the first call to this function, the persistent object is constructed and
% setup. When the function is called subsequent times, the same object is reused
% to call predict on inputs, thus avoiding reconstructing and reloading the
% network object.
% Copyright 2019-2021 The MathWorks, Inc.
    persistent mynet;
    if isempty(mynet)
        mynet = coder.loadDeepLearningNetwork('1DCNN_LSTM07.mat');
    end
    outDlarray = predict(mynet, dlarray(single(in), 'TCB'));
    out = extractdata(outDlarray);
end
2nd function:
function out = FinalProjected_predict(in) %#codegen
    persistent mynet;
    if isempty(mynet)
        mynet = coder.loadDeepLearningNetwork('FinalProjected_unpacked.mat');
    end
    outDlarray = predict(mynet, dlarray(single(in), 'TCB'));
    out = extractdata(outDlarray);
end
3rd function:
function out = FinalFineTuned_predict(in) %#codegen
    persistent mynet;
    if isempty(mynet)
        mynet = coder.loadDeepLearningNetwork('FinalFineTuned_unpacked.mat');
    end
    outDlarray = predict(mynet, dlarray(single(in), 'TCB'));
    out = extractdata(outDlarray);
end
I had to unpack the projected layers in both the Projected and Fine-Tuned networks; otherwise, I got an error while compiling.
In all the cases, the error I am encountering now is: "For code generation of convolution1dLayer, when convolving over the time dimension ('T'), the 'T' dimension of the input must be fixed size." Can you help me?
Thank you in advance,
Silvia
Accepted Answer
Katja Mogalle
on 7 Nov 2024 at 9:55
The error message ("the 'T' dimension of the input must be fixed size.") means that C/C++ code generation for networks containing a 1-D convolution layer is not supported if your sequences have variable length. All sequences in your inference data must have the same number of time steps.
So, let's assume all your sequences have 100 time steps, then you need to specify the input data to the codegen command as follows:
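% The first argument is an example value: double(0) declares a double input,
% whereas the character vector 'double' would declare a char input.
noisyInputType = coder.typeof(double(0), [100 1], [false false]);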
codegen -config cfg FinalOriginal_predict -args {noisyInputType}
If you indeed have variable length sequences, you'd have to cut off sequences either on the left or right side to make them all fixed length (or pad shorter sequences). If you want to do this in MATLAB, you can use the padsequences function.
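As a minimal sketch of the padding/truncation step, assuming each sequence is stored as a T-by-1 array in a cell array and that 100 time steps is the fixed length passed to codegen (the variable names and random data here are illustrative only):

```matlab
% Three hypothetical sequences of different lengths, each T-by-1 (time x channel).
sequences = {rand(80, 1); rand(100, 1); rand(130, 1)};

% Assumed fixed length; it must match the size given to coder.typeof.
numTimeSteps = 100;

% Pad with zeros or truncate on the right along dimension 1 (time).
% Shorter sequences are padded, longer ones are cut off.
XPad = padsequences(sequences, 1, ...
    Length=numTimeSteps, Direction="right", PaddingValue=0);

% XPad stacks the sequences along a new third dimension,
% so XPad(:, :, k) is the k-th fixed-length sequence.
firstSequence = XPad(:, :, 1);
disp(size(XPad))
```

Padding on the right with zeros is only one option; whether left or right padding (or truncation) is appropriate depends on where the informative part of your signal lies.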
Hope that helps.
5 Comments
Katja Mogalle
on 14 Nov 2024 at 12:17
I am certain the act of unpacking the projected layers is not the issue. Just make sure to save the "unpacked" network to the mat-file before re-generating the C/C++ code.
However, I do find inference speed is a tricky thing to measure and to understand. First, we need to make sure we have reliable measurements. This documentation example shows how to use timeit to measure inference speed and compare the original against the projected network: https://www.mathworks.com/help/deeplearning/ug/compress-network-for-estimating-soc.html#CompressNetworkForEstimatingBatteryStateOfChargeExample-13
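A minimal sketch of such a timeit comparison for generated code is shown here. The helper name compareInferenceTimes is hypothetical, and the usage in its help text assumes that the three MEX files were generated with the default codegen output names (entry-point name plus _mex) for a fixed-size 100-by-1 input:

```matlab
function timesMs = compareInferenceTimes(fcns, x)
% COMPAREINFERENCETIMES  Time each function handle in FCNS on input X.
%   Returns one timing per handle, in milliseconds.
%
%   Example (assumes the MEX files exist on the path and were generated
%   for an input with the same size and class as x):
%     x = rand(100, 1);
%     t = compareInferenceTimes({@FinalOriginal_predict_mex, ...
%         @FinalProjected_predict_mex, @FinalFineTuned_predict_mex}, x)
timesMs = zeros(1, numel(fcns));
for k = 1:numel(fcns)
    f = fcns{k};
    % Warm-up call, so that loading the persistent network on the first
    % call is not counted in the measurement.
    f(x);
    % timeit calls the handle several times and returns a typical
    % run time in seconds; convert to milliseconds.
    timesMs(k) = 1e3 * timeit(@() f(x));
end
end
```

Timing all models in one session with the same input keeps the comparison fair; for very fast functions timeit may warn that the measurement is inaccurate, in which case a longer input sequence can help.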
Just to double-check: you are running the generated code on the CPU, not a GPU, correct? And you are not using any third-party deep learning libraries for codegen?
Would you be able to share your inference measurements (original network vs. projected network)?
The next thing we can look at is how much each layer was compressed by the projection technique. If a layer was not compressed very much, projecting it can have a negative impact on inference speed, because projection adds overhead. If you are using MATLAB R2024a or newer, you can use the analyzeNetwork function to analyze the projected network (before unpacking). If you see small or even negative values in the "Learnables Reduction" column of the layer analysis table, consider not projecting those layers (by using the LayerNames argument of the compressNetworkUsingProjection function).