
Categorical to Numeric problem

Stephen Gray on 8 Jan 2024
Commented: Cris LaPierre on 11 Jan 2024
Hi
I have a table that contains both numeric and categorical items. I have converted the categorical items to numeric using the unique() function, which works very well, and I can then feed the matrix into an NN for training. The problem is that when I feed in new data to get results, I don't know how to make sure the converted categorical data in the new table matches the numbers in the training data. That is, if a categorical value in the training data is converted to the number 5, how do I make sure that the same value in the new data gets assigned the same number? I'm beginning to think it may be a manual thing.
SPG

Accepted Answer

Hassaan on 8 Jan 2024
% Example training data (category labels as a cell array of character vectors)
training_categorical_data = {'cat', 'dog', 'fish', 'dog', 'cat'};
% Convert categorical data to numeric for training
unique_categories = unique(training_categorical_data);
category_to_number_map = containers.Map(unique_categories, num2cell(1:length(unique_categories)));
% values() expects a cell array of keys, so the data is passed in directly
numeric_training_data = cell2mat(values(category_to_number_map, training_categorical_data));
% Training process with numeric_training_data
% [Your neural network training code goes here]
% Example new data (categorical)
new_categorical_data = {'dog', 'cat', 'bird'};
% Convert new categorical data to numeric using the training mapping
numeric_new_data = zeros(size(new_categorical_data));
for i = 1:length(new_categorical_data)
    if isKey(category_to_number_map, new_categorical_data{i})
        numeric_new_data(i) = category_to_number_map(new_categorical_data{i});
    else
        % Handle unseen categories, e.g., assign a special number or ignore
        numeric_new_data(i) = NaN; % Assign NaN for unseen categories
    end
end
% Now, numeric_new_data is ready for use with the trained model
% [Your prediction or evaluation code goes here]
  • The training data training_categorical_data is a cell array of character vectors holding the category labels. This is converted to numeric_training_data using a mapping (category_to_number_map).
  • The new data new_categorical_data is then converted using the same mapping. Unseen categories (like 'bird' in this example) are handled separately; here, I've assigned NaN to them, but you can choose another method as appropriate.
  • You'll need to insert your specific neural network training and prediction code where indicated. The numeric_training_data and numeric_new_data arrays are what you'd use for training and prediction, respectively.
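As a possible alternative to the loop above, the same lookup can be sketched in vectorized form with ismember, which returns the position of each new item in the list of training categories. The variable names here (train_data, train_categories, new_data, codes) are illustrative and not taken from the original answer.

```matlab
% Categories learned from the training data (unique returns them sorted)
train_data = {'cat', 'dog', 'fish', 'dog', 'cat'};
train_categories = unique(train_data);      % {'cat', 'dog', 'fish'}

% New data, including a category that never appeared in training
new_data = {'dog', 'cat', 'bird'};

% For each new item, ismember returns its index in train_categories,
% or 0 when the item is not found
[is_known, codes] = ismember(new_data, train_categories);

% Flag unseen categories with NaN, as in the loop version
codes(~is_known) = NaN;
```

Because the codes are positions in train_categories, the category list has to be saved with the trained network and reused unchanged for every new batch.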
  4 Comments
Stephen Gray on 10 Jan 2024
OK, I'm using a dictionary instead, and it's working so far.
Stephen Gray on 11 Jan 2024
OK. I've got it to work now using dictionaries. Both this answer and the next one helped me get it working. As yours includes how to handle new data too, I'll mark it as the answer. Thanks to both of you for answering.
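The dictionary-based solution was not posted in the thread, so the following is only a minimal sketch of how it might look, assuming string keys and MATLAB R2022b or later (where dictionary is available). The names train_labels, new_labels and codes are hypothetical.

```matlab
% Build the mapping once, from the training labels only
train_labels = ["cat" "dog" "fish" "dog" "cat"];
cats = unique(train_labels);                 % ["cat" "dog" "fish"]
d = dictionary(cats, 1:numel(cats));         % cat -> 1, dog -> 2, fish -> 3

% New labels, one of which was never seen in training
new_labels = ["dog" "cat" "bird"];

% Indexing a dictionary with a missing key raises an error,
% so check membership first and look up only the known keys
codes = nan(size(new_labels));
known = isKey(d, new_labels);
codes(known) = d(new_labels(known));
```

Unseen labels stay NaN here; mapping them to a reserved code instead would be an equally reasonable choice, depending on how the network should treat them.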


More Answers (1)

Cris LaPierre on 8 Jan 2024
Moved: Cris LaPierre on 8 Jan 2024
Could you provide more details about your NN? I would think you should be able to pass categorical data into your network without having to convert it to numeric first.
If not, then I'd look into creating a dictionary, where you pass in the categorical value and it returns the numeric value.
A = categorical({'medium' 'large' 'small' 'medium' 'large' 'small'});
names = unique(A)
names = 1×3 categorical array
large medium small
values = (1:length(names));
d = dictionary(names,values)
d =
  dictionary (categorical --> double) with 3 entries:
    large  --> 1
    medium --> 2
    small  --> 3
A(4)
ans = categorical
medium
x = d(A(4))
x = 2
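When the data is already a categorical array, as in the example above, one option that avoids a separate lookup structure is to reuse the training categories as the value set for the new data. This is a sketch of that idea, not part of the original answer; train_categories, B and new_codes are illustrative names, and 'huge' is a made-up unseen value.

```matlab
% Training data and the category list derived from it
A = categorical({'medium' 'large' 'small' 'medium' 'large' 'small'});
train_categories = categories(A);     % {'large'; 'medium'; 'small'}
train_codes = double(A);              % index of each element's category

% New data forced onto the same category list; values outside the
% list ('huge' here) become <undefined>
B = categorical({'small' 'huge' 'large'}, train_categories);

% double() returns the category index, and NaN for <undefined>
new_codes = double(B);
```

The numbering matches the dictionary above (large is 1, medium is 2, small is 3) because both rely on the same sorted category order.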
  4 Comments
Stephen Gray on 9 Jan 2024
Unfortunately not. The code part is
InpsM = table2cell(Inps);
OutsM =table2cell(Outs);
InpsM=InpsM';
OutsM=OutsM';
net=feedforwardnet([96,48,24]);
net.trainFcn = 'trainlm';
net.inputs{1}.processFcns = {'mapstd'};
net=train(net,InpsM,OutsM,'useParallel','yes');
The error I get is
Error using nntraining.setup>setupPerWorker
Inputs X{1,1} is not numeric or logical.
Error in nntraining.setup (line 77)
[net,data,tr,err] = setupPerWorker(net,trainFcn,X,Xi,Ai,T,EW,enableConfigure);
Error in network/train (line 336)
[net,data,tr,err] = nntraining.setup(net,net.trainFcn,X,Xi,Ai,T,EW,enableConfigure,isComposite);
Error in untitled (line 52)
net=train(net,InpsM,OutsM,'useParallel','yes');
SPG
Cris LaPierre on 11 Jan 2024
I found this, albeit on the trainNetwork page rather than the train page, but it still appears to be applicable.
"To train a network using categorical features, you must first convert the categorical features to numeric."
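One way this conversion might be done for a table that mixes numeric and categorical variables is sketched below. The table contents and variable names ('weight', 'label') are invented stand-ins for the poster's Inps table, and the result is a numeric matrix rather than the cell array produced by table2cell.

```matlab
% Hypothetical stand-in for Inps: one numeric and one categorical variable
Inps = table([0.1; 0.7; 0.4], categorical({'small'; 'large'; 'small'}), ...
    'VariableNames', {'weight', 'label'});

% Find the categorical variables and replace each with its category index
is_cat = varfun(@iscategorical, Inps, 'OutputFormat', 'uniform');
for v = find(is_cat)
    Inps.(v) = double(Inps.(v));
end

% Numeric matrix with one column per sample, matching the transposed
% layout used in the code above
InpsM = table2array(Inps)';
```

For new data, the categorical variables would first need to be put on the training category list so that the indices agree between training and prediction.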

Version

R2023b
