
How to specify the input and target data

uma on 16 Jun 2022
Commented: Walter Roberson on 21 Jun 2022
I have a dataset stored as a 2310x25 table. I don't know how to specify the input and target data. I'm using the code below for k-fold cross-validation.
data = dlmread('data\\inputs1.txt');     % inputs
groups = dlmread('data\\targets1.txt');  % targets
Fold = 10;
indices = crossvalind('Kfold', length(groups), Fold);
for i = 1:Fold
    testy = (indices == i);   % rows held out in this fold
    trainy = ~testy;          % remaining rows used for training
    TestInputData = data(testy,:)';
    TrainInputData = data(trainy,:)';
    TestOutputData = groups(testy,:)';
    TrainOutputData = groups(trainy,:)';
end
8 Comments
Walter Roberson on 20 Jun 2022
Are you aware that some of the entries are question marks?
uma on 21 Jun 2022
Yes, I know that. Now can you tell me how this dataset can be used to specify the input and target data?


Answers (1)

Walter Roberson on 21 Jun 2022
filename = 'https://www.mathworks.com/matlabcentral/answers/uploaded_files/1038775/bankruptcy.csv';
opt = detectImportOptions(filename, 'TrimNonNumeric', true);
data = readmatrix(filename, opt);
data = rmmissing(data);
groups = data(:,end);
data = data(:,1:end-1);
whos groups
  Name          Size            Bytes  Class     Attributes

  groups      3194x1            25552  double
[sum(groups==0), sum(groups==1)]
ans = 1×2
3164 30
cp = classperf(groups);
Fold=10;
indices = crossvalind('Kfold',length(groups),Fold);
failures = 0;
for i = 1:Fold
    test = (indices == i);
    train = ~test;
    try
        class = classify(data(test,:), data(train,:), groups(train,:));
        classperf(cp, class, test);
    catch ME
        failures = failures + 1;
        if failures <= 5
            fprintf('failed on iteration %d\n', i);
        else
            break
        end
    end
end
failed on iteration 1
failed on iteration 2
failed on iteration 3
failed on iteration 4
failed on iteration 5
cp
                        Label: ''
                  Description: ''
                  ClassLabels: [2×1 double]
                  GroundTruth: [3194×1 double]
         NumberOfObservations: 3194
               ControlClasses: 2
                TargetClasses: 1
            ValidationCounter: 0
           SampleDistribution: [3194×1 double]
            ErrorDistribution: [3194×1 double]
    SampleDistributionByClass: [2×1 double]
     ErrorDistributionByClass: [2×1 double]
               CountingMatrix: [3×2 double]
                  CorrectRate: NaN
                    ErrorRate: NaN
              LastCorrectRate: 0
                LastErrorRate: 0
             InconclusiveRate: NaN
               ClassifiedRate: NaN
                  Sensitivity: NaN
                  Specificity: NaN
      PositivePredictiveValue: NaN
      NegativePredictiveValue: NaN
           PositiveLikelihood: NaN
           NegativeLikelihood: NaN
                   Prevalence: NaN
              DiagnosticTable: [2×2 double]
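Since the question starts from a table rather than a numeric matrix, the same input/target split can be sketched directly on a table. This is only an illustration under assumptions: the small table T and its variable names are made up, the target is assumed to sit in the last column (as in the code above), and every column is assumed to be numeric already. Columns that still contain '?' entries would need to be cleaned first, for example with the readmatrix and rmmissing steps shown earlier.

```matlab
% Hypothetical stand-in for the real table: 6 rows, 3 predictors, 1 label.
T = array2table([rand(6,3), [0;1;0;1;0;1]], ...
    'VariableNames', {'x1','x2','x3','label'});

% Inputs: every column except the last, extracted as a numeric matrix.
inputs = T{:, 1:end-1};

% Targets: the last column, extracted as a numeric column vector.
targets = T{:, end};

disp(size(inputs));
disp(size(targets));
```

Curly-brace indexing returns the underlying numeric data, whereas parentheses would return a smaller table, which functions such as classify do not accept as sample data.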
1 Comment
Walter Roberson on 21 Jun 2022
The reason for the failure is that you only have 30 entries with class 1, and when you are doing random selection for K-fold purposes, you are ending up with situations where there are no entries for class 1 in the training data.
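One possible way to guard against folds that lack a class is stratified partitioning, where each fold keeps roughly the original class proportions. The sketch below is not part of the original answer. It assumes the Statistics and Machine Learning Toolbox is available for cvpartition, and it uses synthetic labels with the class counts reported above (3164 zeros and 30 ones) instead of the real data.

```matlab
% Synthetic labels with the same class counts as reported above;
% the real predictors are not needed for this check.
groups = [zeros(3164,1); ones(30,1)];
Fold = 10;

% Passing the label vector (instead of its length) asks cvpartition to
% stratify, so each fold keeps roughly the same class proportions.
c = cvpartition(groups, 'KFold', Fold);

minorityInTrain = zeros(Fold,1);
for i = 1:Fold
    trainIdx = training(c, i);   % logical index of the training rows
    minorityInTrain(i) = sum(groups(trainIdx) == 1);
end

% Number of class-1 rows in each training set.
disp(minorityInTrain.');
```

In the loop from the answer, training(c, i) and test(c, i) could take the place of the train and test masks built from indices. As far as I know, crossvalind also accepts a grouping vector in place of the observation count, as in crossvalind('Kfold', groups, Fold), which would keep the rest of the code unchanged.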

