How to convert categorical data to numeric in separate columns?
% Hi! I have a dataset 'data5' with a column 'Location' which contains values Asia, US and Africa.
% I want to convert it into 3 separate columns, one per location, each containing a 1 if the row is from that location and a 0 otherwise.
% This is the function I have created:
function data = categorical_values(data, var)
uniques = unique(var);
for i = 1:length(uniques)
values(:, i) = double(ismember(var, uniques(i)));
end
t = table;
[rows, cols] = size(values);
for i = 1:cols
t1 = table(values(:, i));
t1.Properties.VariableNames = uniques(i);
t = [t t1];
end
data = [t data];
end
% And this is the code I have been running, in a file called prep.m:
new = categorical_values(data5, data5.Location);
new.Location = []; % delete the old Location column
% I have been getting this error:
Error using categorical_values (line 11)
The VariableNames property is a cell array of character vectors. To
assign multiple variable names, specify names in a string array or a cell
array of character vectors.
Error in prep (line 16)
new = categorical_values(data5, data5.Location);
% Can anyone help??????? Thanks!
Answers (1)
Adam Danz
on 10 Aug 2020
Edited: Adam Danz
on 26 Oct 2020
Here's a more efficient solution.
% Create demo data
location = categorical({'Asia','US','Asia','Africa','Africa','US','US','Asia'}');
unqCountries = unique(location(:)')
% Create a logical matrix of 1s and 0s.
% Columns are identified by "unqCountries"
countryIdx = location(:) == unqCountries
% If you want to turn it into a table
T = array2table(countryIdx, 'VariableNames', string(unqCountries))
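To connect this back to the question, here is a minimal sketch that applies the same comparison to a table and then drops the original column. The table below is a made-up stand-in for data5 (the Value column and its numbers are invented for illustration), and it assumes Location is stored as a categorical.

```matlab
% Hypothetical stand-in for the asker's table; the Value column is invented.
data5 = table(categorical({'Asia'; 'US'; 'Africa'; 'US'}), [10; 20; 30; 40], ...
    'VariableNames', {'Location', 'Value'});

% Unique locations as a row vector, so the comparison expands to a matrix
unqLoc = unique(data5.Location)';

% Rows follow data5, columns follow unqLoc; true where the location matches
idx = data5.Location == unqLoc;

% Convert to 0/1 doubles and name each column after its location
T = array2table(double(idx), 'VariableNames', string(unqLoc));

% Indicator columns first (as in the question), then remove Location
new = [T, data5];
new.Location = [];
```

Leaving out the double() call keeps the columns logical, which is usually fine as well.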
You're getting the error because uniques(i) is a categorical value, and table variable names must be given as character vectors (in a cell array) or as a string array. Convert it to a string first:
t1.Properties.VariableNames = string(uniques(i));
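If you'd rather keep your own function, a possible cleaned-up version is sketched below. It is not tested against your actual data; it assumes var is a categorical (or something categorical() can convert, such as a cell array of character vectors) with one entry per row of data. It also builds the indicator table in one step instead of concatenating inside a loop.

```matlab
function data = categorical_values(data, var)
% Prepend one 0/1 indicator column per unique value of var to the table data.
var = categorical(var(:));                      % ensure a categorical column
uniques = unique(var);                          % unique values as a column
values = zeros(numel(var), numel(uniques));     % preallocate the 0/1 matrix
for i = 1:numel(uniques)
    values(:, i) = double(var == uniques(i));   % 1 where the row matches
end
% Variable names must be text, so convert the categorical values to strings
t = array2table(values, 'VariableNames', string(uniques)');
data = [t data];
end
```

It is called the same way as before: new = categorical_values(data5, data5.Location); followed by new.Location = [];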
4 Comments
Adam Danz
on 26 Oct 2020
"Is this same as dummy coding or One Hot Encoding?"
The table T contains one logical (true|false) column per location, so it could be used as a set of dummy variables, similar to the dummy variables used in regression.
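As a rough illustration of the difference, using the demo data from the answer: with all columns kept, every row has exactly one true value, which is what is usually called one-hot encoding. For a regression model that includes an intercept, one column is commonly dropped as the reference level, because the full set of indicators sums to the intercept column. Which level to drop is a modelling choice; dropping the first column here is only an assumption for the example.

```matlab
% Demo data from the answer above
location = categorical({'Asia','US','Asia','Africa','Africa','US','US','Asia'}');
unqCountries = unique(location(:)');
T = array2table(location(:) == unqCountries, 'VariableNames', string(unqCountries));

% One-hot check: each row should contain exactly one true value
rowSums = sum(T{:, :}, 2);

% Dummy coding with a reference level: drop the first column
% (assumption: the first category serves as the reference)
Tdummy = T(:, 2:end);
```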