How to fill in NaNs or <undefined> in data with the mode of each column

Dhruv Ghulati on 21 Dec 2015
Commented: jgg on 22 Dec 2015
I have converted a mixed table of categorical and double arrays so that every column is of type double, by mapping each category in the categorical arrays to a double.
The table has 40k rows and 40 columns. I want to replace each NaN with the mode of its column.
I found a clear looping method in R via this link, but couldn't find a simple loop in MATLAB to do it. inpaint_nans seems to be more focused on interpolating the data.
knnimpute() also fails because I can have swathes of up to 1000 rows which are all NaNs (so I need 1200+ neighbours), as well as 40+ columns, so the algorithm has to loop through 40! times, which is very slow.
Any ideas?

Answers (1)

jgg on 22 Dec 2015
Edited: jgg on 22 Dec 2015
Select the NaNs with a logical mask and overwrite them with the matching column modes:
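A = [1 2 NaN 4 5; 1 2 3 NaN 5; 1 NaN NaN NaN 5];
m = mode(A,1);                % mode of each column (1-by-5 row vector)
m = repmat(m,size(A,2), 1);   % as posted; the repeat count should be size(A,1), see the comment below
A_f = A;                      % work on a copy of the data
A_f(isnan(A)) = m(isnan(A));  % overwrite each NaN with the entry of m at the same position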
Looping is not necessary if you use vectorized operations.
Note: if your matrix is very large, the repmat step can be replaced with a for loop over the columns in order to use less memory, but 40k by 40 is not that large, so it should be fine.
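The column-wise loop mentioned in the note could look roughly like the sketch below. It is not from the original answer: the variable names `col` and `nanRows` are illustrative, and the choice to leave all-NaN columns untouched is an assumption about the desired behaviour.

```matlab
% Example data from the answer: 3 rows, 5 columns, NaN marks missing values.
A = [1 2 NaN 4 5; 1 2 3 NaN 5; 1 NaN NaN NaN 5];

A_f = A;                            % copy so the original data is left untouched
for j = 1:size(A, 2)                % one pass per column
    col = A(:, j);
    nanRows = isnan(col);           % logical mask of the NaNs in this column
    if any(nanRows) && ~all(nanRows)
        % Take the mode over the observed (non-NaN) values only
        A_f(nanRows, j) = mode(col(~nanRows));
    end
    % Assumption: a column that is entirely NaN has no mode and stays NaN
end
```

Only one column is held in a temporary at a time, so no full-size copy of the mode matrix is needed.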
2 Comments
Dhruv Ghulati on 22 Dec 2015
Thanks so much! I changed it to
m = repmat(m,size(A,1), 1);
to make the matrix repeat row-wise rather than column-wise, but otherwise it worked!
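Put together with that change, the vectorized approach might read as the following sketch. The names `M` and `nanMask` are introduced here for illustration, and the code relies, as the answer does, on mode skipping NaN values when computing each column's mode.

```matlab
% Example data from the answer: 3 rows, 5 columns, NaN marks missing values.
A = [1 2 NaN 4 5; 1 2 3 NaN 5; 1 NaN NaN NaN 5];

m = mode(A, 1);                  % 1-by-5 row vector of column modes
M = repmat(m, size(A, 1), 1);    % repeat once per row so M is 3-by-5, like A

nanMask = isnan(A);              % logical mask of the missing entries
A_f = A;                         % work on a copy of the data
A_f(nanMask) = M(nanMask);       % same mask on both sides keeps positions aligned
```

Because `M` has the same size as `A`, the logical mask selects the same positions in both, which is what the original repeat count of `size(A,2)` did not guarantee.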
jgg on 22 Dec 2015
If you liked this answer, please accept it so other people can see it resolved your problem!
