How to use the kmeans function on data stored in a datastore?

Ahmed Hamed on 29 Apr 2016
Edited: Josh Meyer on 17 Jul 2017
I'm trying to cluster big data using kmeans. I found some code that does something similar:
Mu = bsxfun(@times,ones(20,30),(1:20)'); % Gaussian mixture mean
rn30 = randn(30,30);
Sigma = rn30'*rn30; % Symmetric and positive-definite covariance
Mdl = gmdistribution(Mu,Sigma);
rng(1); % For reproducibility
X = random(Mdl,10000);
pool = parpool; % Invokes workers
stream = RandStream('mlfg6331_64'); % Random number stream
options = statset('UseParallel',1,'UseSubstreams',1,...
'Streams',stream);
tic; % Start stopwatch timer
[idx,C,sumd,D] = kmeans(X,20,'Options',options,'MaxIter',10000,...
'Display','final','Replicates',10);
toc % Terminate stopwatch timer
But as you can see, X is an in-memory double matrix.
My problem is that I have a file named HIS_all.csv, and I used the datastore function to access it as follows:
ds = datastore('HIS_all.csv', 'DatastoreType', 'tabulartext','TreatAsMissing', 'NA');
When I tried
[idx,C,sumd,D] = kmeans(ds,20,'Options',options,'MaxIter',10000, 'Display','final','Replicates',10);
I got the following error:
Undefined function 'isnan' for input arguments of type 'matlab.io.datastore.TabularTextDatastore'.
Error in kmeans (line 158)
wasnan = any(isnan(X),2);
Any suggestions?

Answers (1)

Josh Meyer on 15 Jul 2017
Edited: Josh Meyer on 17 Jul 2017
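A datastore is just a framework for loading small chunks of the data at a time, so you can't call generic functions such as kmeans directly on it. That is why the call fails at isnan: the function receives a datastore object rather than numeric data. Instead, try converting the datastore into a tall array first: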
T = tall(ds);
The kmeans function supports tall arrays, so once the data is in this format you can use the function. Note that there are some limitations to using kmeans on a tall array, so some of the name-value pairs you specified might not work. The limitations are outlined in the kmeans documentation.
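As a rough sketch of the steps above (not a definitive recipe), the following assumes a MATLAB release in which kmeans accepts tall arrays. It uses a small generated file, the hypothetical HIS_demo.csv, in place of HIS_all.csv so that it can run on its own. Because tall(ds) on a tabular text datastore yields a tall table, the sketch converts it to a numeric tall matrix with table2array before clustering. It also drops the parallel options and the fourth output D from the question, on the assumption that they fall under the tall-array limitations; check the kmeans documentation for your release.

```matlab
% Hypothetical stand-in for HIS_all.csv: a small numeric table written to disk.
rng(1);                                      % for reproducibility
demoFile = fullfile(tempdir, 'HIS_demo.csv');
writetable(array2table(randn(1000, 3)), demoFile);

% Same datastore call as in the question, pointed at the demo file.
ds = datastore(demoFile, 'DatastoreType', 'tabulartext', 'TreatAsMissing', 'NA');

T = tall(ds);           % tall table backed by the datastore
X = table2array(T);     % kmeans needs a numeric matrix, not a table

% Assumption: only a subset of name-value pairs is accepted for tall input,
% so the call is kept to MaxIter and Replicates, with three outputs.
k = 5;                  % illustrative cluster count, not the 20 from the question
[idx, C, sumd] = kmeans(X, k, 'MaxIter', 100, 'Replicates', 2);

% idx is a tall array; gather evaluates it and brings it into memory.
idx = gather(idx);
```

With the real file, replace demoFile with 'HIS_all.csv' and set k to 20. If the file contains non-numeric columns, select the numeric variables first (for example through the datastore's SelectedVariableNames property) so that table2array returns a numeric matrix.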
