Why is my transformer training erroring out with the message "Error using trainnet (line 46)"?
The full error message reads
Error using trainnet (line 46)
The number of mini-batch queue outputs (2) must match the number of network inputs plus
the number of network outputs (4).
I'm using an arrayDatastore to pass the two predictors and the two targets to the transformer model. Both predictors have 410 features; one of the targets has 410 features and the other target is a scalar.
Code to generate the dummy predictor and target data is pasted below:
%--------------------------------------------------------------------
% data generation for encoder
numObs = 10;
vocabSize = 410; % assumed: not defined in the original post; 410 matches the data shown below
seqLen = vocabSize;
x_enc = randi([1,10],[seqLen,numObs]);
y_enc = zeros(numObs,1);
for i = 1:numObs
    idx = x_enc(1:2,i);
    y_enc(i,:) = sum(x_enc(idx,i));
end
x_enc = num2cell(x_enc',2);
y_enc = num2cell(y_enc)';
x_1 = x_enc;
y_2 = y_enc';
% data generation for decoder
x_series = randi([1,10],[seqLen,numObs]);
y_series = sin(rand([seqLen,numObs]));
x_dec = x_series(:,1:end)';
y_dec = y_series(:,1:end)';
x_dec = num2cell(x_dec,2); x_2 = x_dec;
y_dec = num2cell(y_dec,2); y_1 = y_dec;
cell_data = {}; cell_data = [cell_data x_1 x_2 y_1 y_2];
dstrain = arrayDatastore(cell_data,'OutputType','same');
%-------------------------------------------------------------------
cell_data is of the form:
cell_data
cell_data =
10×4 cell array
{1×410 double} {1×410 double} {1×410 double} {[ 8]}
{1×410 double} {1×410 double} {1×410 double} {[10]}
{1×410 double} {1×410 double} {1×410 double} {[13]}
{1×410 double} {1×410 double} {1×410 double} {[20]}
{1×410 double} {1×410 double} {1×410 double} {[ 7]}
{1×410 double} {1×410 double} {1×410 double} {[ 8]}
{1×410 double} {1×410 double} {1×410 double} {[17]}
{1×410 double} {1×410 double} {1×410 double} {[ 6]}
{1×410 double} {1×410 double} {1×410 double} {[11]}
{1×410 double} {1×410 double} {1×410 double} {[ 8]}
If I were to use readall(dstrain) to read the datastore, I get the same format as cell_data:
fds = readall(dstrain)
fds =
10×4 cell array
{1×410 double} {1×410 double} {1×410 double} {[ 8]}
{1×410 double} {1×410 double} {1×410 double} {[10]}
{1×410 double} {1×410 double} {1×410 double} {[13]}
{1×410 double} {1×410 double} {1×410 double} {[20]}
{1×410 double} {1×410 double} {1×410 double} {[ 7]}
{1×410 double} {1×410 double} {1×410 double} {[ 8]}
{1×410 double} {1×410 double} {1×410 double} {[17]}
{1×410 double} {1×410 double} {1×410 double} {[ 6]}
{1×410 double} {1×410 double} {1×410 double} {[11]}
{1×410 double} {1×410 double} {1×410 double} {[ 8]}
Finally, if I use minibatchqueue to create a minibatch of datastore 'dstrain', I get:
mbq = minibatchqueue(dstrain)
mbq =
minibatchqueue with 4 outputs and properties:
Mini-batch creation:
MiniBatchSize: 10
PartialMiniBatch: 'return'
MiniBatchFcn: 'collate'
PreprocessingEnvironment: 'serial'
Outputs:
OutputCast: {'single' 'single' 'single' 'single'}
OutputAsDlarray: [1 1 1 1]
MiniBatchFormat: {'' '' '' ''}
OutputEnvironment: {'auto' 'auto' 'auto' 'auto'}
As you can see, there are four outputs for the minibatch, which appears to contradict the original error message stating that there are only two minibatchqueue outputs.
Also, to confirm, I double-checked the transformer's input/output structure:
net
net =
dlnetwork with properties:
Layers: [64×1 nnet.cnn.layer.Layer]
Connections: [1714×2 table]
Learnables: [110×3 table]
State: [0×3 table]
InputNames: {'in_enc' 'in_dec'}
OutputNames: {'decoder_out' 'fc_13'}
Initialized: 1
View summary with summary.
which shows two inputs and two outputs.
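One way to check this by hand is to compare the number of columns that a single read of the datastore returns against the number of network inputs plus outputs. The following is a minimal sketch, not a diagnosis: the input and output names are copied from the net display above, the data is a small stand-in for cell_data, and the variable names sample and numExpected are only illustrative.

```matlab
% Stand-in for cell_data: 10 observations, 4 columns (2 predictors, 2 targets).
numObs = 10;
seqLen = 410;   % assumed from the 410-feature data shown above
cell_data = cell(numObs,4);
for i = 1:numObs
    cell_data{i,1} = randi([1,10],[1,seqLen]);
    cell_data{i,2} = randi([1,10],[1,seqLen]);
    cell_data{i,3} = sin(rand([1,seqLen]));
    cell_data{i,4} = randi([1,20]);
end
dstrain = arrayDatastore(cell_data,'OutputType','same');

% Names copied from the dlnetwork display (net.InputNames, net.OutputNames).
inputNames  = {'in_enc','in_dec'};
outputNames = {'decoder_out','fc_13'};
numExpected = numel(inputNames) + numel(outputNames);

% Read one observation and count the columns it returns.
reset(dstrain);
sample = read(dstrain);
fprintf('Columns per read: %d, expected by the network: %d\n', ...
    size(sample,2), numExpected);
```

If the two numbers agree here, the column count of the datastore itself is probably not the problem, and the mismatch would have to come from how the data is passed on to trainnet.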
Could someone point me to the mistake I'm making here (likely with the datastore format)? It seems that during batching, the model is only choosing two of the cell columns from cell_data/dstrain for the input and output, rather than all four, and it's not clear why. Thanks in advance for your help!
CG
Answers (1)
Jaimin
on 8 Jan 2025
The error suggests a mismatch between the model input dimensions and the mini-batch created by the “minibatchqueue” function. To fix this issue, you can adjust the parameters provided to the “minibatchqueue” function.
For better understanding kindly refer to the following code snippet.
mbq = minibatchqueue(dstrain, ...
'MiniBatchSize', 10, ...
'OutputAsDlarray', [true, true, true, true], ...
'MiniBatchFormat', {'CB', 'CB', 'BC', 'BC'}, ...
'OutputCast', {'single', 'single', 'single', 'single'});
For more information kindly refer to the following MathWorks documentation.
I hope this will be helpful.
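A common pattern for networks with several inputs and outputs is to build one arrayDatastore per predictor and per target and join them with combine, so that each read returns one cell per network input and output. The sketch below illustrates this under assumptions: the numeric matrices X1, X2, T1 and T2 are hypothetical stand-ins for the data in the question, and the column order (predictors first, then targets) follows the order of InputNames and OutputNames. It has not been verified against the network in the question.

```matlab
numObs = 10;
seqLen = 410;   % assumed from the question's data

% Hypothetical numeric stand-ins, one row per observation.
X1 = randi([1,10],[numObs,seqLen]);   % predictor for 'in_enc'
X2 = randi([1,10],[numObs,seqLen]);   % predictor for 'in_dec'
T1 = sin(rand([numObs,seqLen]));      % 410-feature target
T2 = sum(X1(:,1:2),2);                % dummy scalar target per observation

% One datastore per variable; by default each read returns one row in a cell.
dsX1 = arrayDatastore(X1);
dsX2 = arrayDatastore(X2);
dsT1 = arrayDatastore(T1);
dsT2 = arrayDatastore(T2);

% combine concatenates the per-datastore reads horizontally.
dsTrain = combine(dsX1,dsX2,dsT1,dsT2);

% Inspect a single read: one cell per network input and output.
sample = read(dsTrain);
disp(size(sample))
```

If each read returns one cell per input and output, the column count matches the two inputs plus two outputs of the network. Whether the shape and data format of each column suit the layers is a separate question that this sketch does not address.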