signalDatastore of a large Dataset for feedforward training
I'm trying to train a feedforward net on a very large amount of data (approx. 13k files, more than 3000 rows each). Since I can't fit all the data into a single matrix for training, I tried to build a signalDatastore and pass it to the network, but I always get the same error:
Error using trainNetwork (line 191)
Invalid training data. The output size (1) of the last layer does not match the response size (2201).
Error in NN_datastore_v2 (line 76)
net=trainNetwork(sdsTrain, layers,options);
Where is the mistake? I suppose it's in the read function, maybe in the format of its output? I tried several options but can't seem to find the right combination. Please help.
Here's the full code:
clc
clear all;
Folders="********";
sds=signalDatastore(Folders,"IncludeSubfolders",true,"ReadFcn", @dataproc, 'FileExtensions','.txt');
numFiles = numel(sds.Files);
rng('default'); % For reproducibility
fileIndices = randperm(numFiles);
trainRatio = 0.7;
valRatio = 0.15;
numTrain = floor(trainRatio * numFiles);
numVal = floor(valRatio * numFiles);
% Indices for each set
trainIdx = fileIndices(1:numTrain);
valIdx = fileIndices(numTrain+1:numTrain+numVal);
testIdx = fileIndices(numTrain+numVal+1:end);
% Create the sub-datastores
sdsTrain = subset(sds, trainIdx);
sdsVal = subset(sds, valIdx);
sdsTest = subset(sds, testIdx);
%%
layers = [
featureInputLayer(429, "Normalization", "zscore")
reluLayer
...
fullyConnectedLayer(1)
regressionLayer
];
options = trainingOptions('adam', ...
'MaxEpochs', 1000, ...
'MiniBatchSize', 64, ...
'ValidationData',sdsVal,...
'OutputNetwork','best-validation',...
'Verbose',true);
%%
net=trainNetwork(sdsTrain, layers,options);
%%
%%
function data=dataproc(filename)
l_max=2500;
% opts = detectImportOptions(filename, 'Delimiter','\t');
opts=delimitedTextImportOptions("NumVariables", 442);
opts.Delimiter = "\t";
fixedVariableNames = [******];
dynamicVariableNames = "Gage" + string(1:429);
opts.VariableNames = [fixedVariableNames, dynamicVariableNames];
opts.VariableTypes = repmat("double", 1, 442);
opts=setvaropts(opts, "DecimalSeparator", ",");
tableData = readtable(filename, opts);
dataNumeric=table2array(tableData);
if size(dataNumeric,1) <l_max
data={};
return
end
% if size(dataNumeric,1) > l_max
Fz= dataNumeric(300:l_max,strcmp(fixedVariableNames, 'FzN'));
lambdas = dataNumeric(300:l_max, 14:end);
[b_butter, a_butter] = butter(7, 0.03); % Low-pass filter
window_size = 5; % Window for the median filter
outlierIndices = isoutlier(lambdas, 'mean');
lambdas(outlierIndices) = nan;
lambdas = fillmissing(lambdas, 'linear');
strain_filt = medfilt1(lambdas, window_size);
filtered_force = filtfilt(b_butter, a_butter, Fz);
% data.X =strain_filt;
% data.Y =filtered_force;
data = {strain_filt, filtered_force};
% end
end
2 Comments
Abhaya
on 19 Dec 2024
Hi Daniele, could you please provide the data you're using to train the network?
Answers (1)
Gayathri
on 23 Dec 2024
As I understand it, each of your files contributes 2201 samples (rows 300 to 2500), but the network outputs only one value because the last "fullyConnectedLayer" has a single neuron. Please replace that line with the following:
fullyConnectedLayer(2201)
This should most probably resolve the error you are seeing. I have not run the code on my end, as I do not have access to the input data.
For more information about "fullyConnectedLayer", please refer to the below link.
Hope you find this information helpful!
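If the goal is instead to predict one force value from each row of 429 strain readings (which is what featureInputLayer(429) followed by fullyConnectedLayer(1) suggests), another option might be to leave the layers as they are and change what the read function returns. As far as I understand the trainNetwork documentation for feature data, a datastore should return one observation per row, with the predictors as a numFeatures-by-1 column vector and the response as a scalar. The sketch below uses random stand-in data with the sizes from the question; it has not been tested against the real files, and whether signalDatastore passes this cell array through unchanged is an assumption to verify.

```matlab
% Synthetic stand-ins for the output of one file (sizes taken from the question).
numRows = 2201;        % rows 300:2500 of one file
numFeatures = 429;     % matches featureInputLayer(429)
strain_filt = rand(numRows, numFeatures);
filtered_force = rand(numRows, 1);

% Build one observation per row:
% column 1 holds a numFeatures-by-1 predictor vector, column 2 a scalar response.
predictors = num2cell(strain_filt.', 1).';   % numRows-by-1 cell, each entry 429-by-1
responses = num2cell(filtered_force);        % numRows-by-1 cell, each entry a scalar
data = [predictors, responses];              % numRows-by-2 cell array

disp(size(data))         % size of the cell array returned per file
disp(size(data{1,1}))    % size of one predictor vector
```

Inside dataproc this would take the place of the line data = {strain_filt, filtered_force};. Calling read(sdsTrain) once and inspecting the sizes of the result is a quick way to check what the network actually receives.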
0 Comments