How to access the data type of the each column of a table
23 Ansichten (letzte 30 Tage)
Ältere Kommentare anzeigen
I have a table which contains categorical and numerical data. In order to separate those two, I want to know how to access the data type of the each column of a table.
Since I don't know how to, I tried function iscellstr as follows:
for i = 1:size(T,1)
for j=1:size(T,2)
CurrentData {i,j} = T{i,j}; %// Access all rows of a given column.
if iscategorical(CurrentData)
CategoricalData{i,j} = CurrentData{i,j};
elseif iscellstr(CurrentData)
StringcData{i,j} = CurrentData{i,j};
else
NumericData{i,j} = CurrentData{i,j};
end
end
end
But it does not seem to be working either. Any help would be appreciated.
1 Kommentar
Jan
am 20 Okt. 2014
It is overwhelming to read such a large amount of information. Shorter questions are more likely to be answered. Concentrate on the essential core of the problem an omit all details, which do not concern an answer.
Akzeptierte Antwort
Peter Perkins
am 23 Okt. 2014
Bearbeitet: Peter Perkins
am 23 Okt. 2014
Ege, your question isn't all that clear. But I'm guessing that you don't want to be looping over every element of the table and putting things in a cell array. What about this:
>> t = table(randn(3,1),randn(3,2),categorical({'a';'a';'b'}),categorical({'x';'y';'y'}))
t =
Var1 Var2 Var3 Var4
_______ ______________________ ____ ____
2.7694 0.7254 -0.20497 a x
-1.3499 -0.063055 -0.12414 a y
3.0349 0.71474 1.4897 b y
>> numericVars = varfun(@isnumeric,t,'output','uniform')
numericVars =
1 1 0 0
>> tNumeric = t(:,numericVars)
tNumeric =
Var1 Var2
_______ ______________________
2.7694 0.7254 -0.20497
-1.3499 -0.063055 -0.12414
3.0349 0.71474 1.4897
>> tNonNumeric = t(:,~numericVars)
tNonNumeric =
Var3 Var4
____ ____
a x
a y
b y
Or
>> tNumeric = t{:,numericVars}
tNumeric =
2.7694 0.7254 -0.20497
-1.3499 -0.063055 -0.12414
3.0349 0.71474 1.4897
Hope this helps.
Weitere Antworten (1)
Geoff Hayes
am 20 Okt. 2014
I think that your code is almost there in terms of breaking down the data into categorial, string, and numeric types, it is just the line of code that initializes the CurrentData which may be causing a problem
CurrentData {i,j} = T{i,j};
Since CurrentData is a cell matrix, we are not getting the expected response when we say (for example) iscellstr(CurrentData) since it is not just one element that we are considering but the current one and all previous, some or all which might be cell arrays of strings.
What we should probably do instead is just consider the one (current) element as
CurrentData = T{i,j};
Then we use that element when we need to update the categorical, string, or numeric arrays. Your code would then become
CategoricalData = cell(size(T));
StringcData = cell(size(T));
NumericData = cell(size(T));
for u = 1:size(T,1)
for v=1:size(T,2)
CurrentData = T{u,v};
if iscategorical(CurrentData)
CategoricalData{u,v} = CurrentData;
elseif iscellstr(CurrentData)
StringcData{u,v} = CurrentData;
else
NumericData{u,v} = CurrentData;
end
end
end
The above code pre-sizes all arrays to that of the table. (This is a good habit to get into as performance may degrade when arrays are not pre-sized.) I replaced your indices of i and j with u and v since i and j are also used to represent the imaginary number. We then iterate through the all elements in the table and group the categorical, numeric, and string data together. You will note that if there is an empty column in your *Data array, then that means that that column is not (for example) numeric. I tried the above with the example
load patients
BloodPressure = [Systolic Diastolic];
T = table(Gender,Age,Smoker,BloodPressure,'RowNames',LastName);
4 Kommentare
Geoff Hayes
am 23 Okt. 2014
If your numerical data consists of mx1 columns only (for some m), then you could try
for u = 1:size(T,2)
CurrentData = T.(u)(1);
if iscategorical(CurrentData)
CategoricalData(:,u) = T.(u);
elseif iscellstr(CurrentData)
StringcData(:,u) = T.(u);
else
data = T{:,u};
if iscolumn(data)
NumericData(:,u) = num2cell(T{:,u});
else
% not sure how to handle non-column data
end
end
end
The problem with the above is to handle the case where the T{:,u} contains non-scalar elements, so something like
[ 1 3 ]
[ 2 4 ]
etc.
Siehe auch
Kategorien
Mehr zu Data Type Identification finden Sie in Help Center und File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!