Fast way to retrieve nonzero entries of each row in a sparse matrix
Dominik Mattioli
on 30 Aug 2017
Edited: Matt J
on 31 Aug 2017
I'm working with very large sparse matrices (tens of thousands to millions of entries), and I'd like to efficiently retrieve and store the unique nonzero entries of each row. For my application, each row will have between 1 and 9 unique nonzero entries.
I've used "sprand" to build an example (I'm fairly new to sparse matrices and am not the best programmer, so forgive my naivety), and I've tried to choose a density within the aforementioned range. My code is:
numrows = 1000000;
numcols = 1000000;
density = numrows^(-1.87);
num_entries_per_row = numrows*numcols*density % Note: this is the expected total number of nonzeros; the per-row average is numcols*density.
rsm = sprand(numrows,numcols,density); % random sparse matrix
tic
out = cell(numrows,1); % output.
for idx = 1:numrows
    [~,~,value] = find(rsm(idx,:));
    out{idx} = unique(value);
end
toc
I've had this running for a few minutes and still have no result. Previously, when the number of rows and columns was 100,000 each, it took about 32 seconds, which is far too slow for my purposes. The loop is simple but clearly not the fastest approach. Is there a better, more clever way to achieve this result in MATLAB?
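One likely contributor to the slow loop is that MATLAB stores sparse matrices in compressed column form, so extracting a row with rsm(idx,:) costs far more than extracting a column. Here is a minimal sketch of the same loop over columns instead, assuming the matrix can be transposed once up front. The name rsmT is introduced here for illustration, and the sizes are shrunk so the sketch runs quickly:

```matlab
% Smaller sizes than in the question, for illustration only.
numrows = 2000;
numcols = 2000;
rsm = sprand(numrows, numcols, 5/numcols);   % about 5 nonzeros per row on average

rsmT = rsm.';                 % transpose once: original rows become columns
out = cell(numrows, 1);
for idx = 1:numrows
    % Column extraction is cheap for sparse matrices; nonzeros returns
    % the nonzero values of that column as a column vector.
    out{idx} = unique(nonzeros(rsmT(:, idx)));
end
```

This still loops over every row, so a vectorized approach may well be faster for millions of rows.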
Accepted Answer
Matt J
on 31 Aug 2017
Edited: Matt J
on 31 Aug 2017
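[I,~,values]=find(rsm);                % row index and value of every nonzero
u=unique([I,values],'rows');           % unique (row,value) pairs, sorted by row, then by value
N=size(u,1);
tmp=diff([0;find(diff(u(:,1)));N]);    % number of unique values in each row that has any nonzeros
out=mat2cell(u(:,2),tmp);              % one cell per such row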
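To see what each step produces, here is the same sequence on a tiny matrix, written as a sketch rather than as part of the original answer. The matrix A and the name outByRow are made up for illustration. Note that rows with no nonzeros produce no cell in the mat2cell output; the accumarray line is one possible variant, assumed here, that keeps one cell per row:

```matlab
% Hypothetical 4-by-4 example; row 2 has no nonzeros.
A = sparse([1 1 1 3 4 4], [1 2 3 2 1 4], [5 5 2 7 3 3], 4, 4);

[I, ~, values] = find(A);
u = unique([I, values], 'rows');        % u = [1 2; 1 5; 3 7; 4 3]
N = size(u, 1);
tmp = diff([0; find(diff(u(:,1))); N]); % tmp = [2; 1; 1]
out = mat2cell(u(:,2), tmp);            % out = {[2;5]; 7; 3}, no cell for row 2

% Variant (an assumption, not from the thread): one cell per row,
% with [] for rows that have no nonzeros. u(:,1) is already sorted,
% so the values within each cell stay in ascending order.
outByRow = accumarray(u(:,1), u(:,2), [size(A,1) 1], @(x) {x});
```

If every row is known to contain at least one nonzero, as in the question, the two outputs have the same length.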
5 Comments
Matt J
on 31 Aug 2017
Edited: Matt J
on 31 Aug 2017
"keeping the result in a sparse matrix rather than millions of cell elements"
With further testing, I'm actually seeing that cells overtake matrices in efficiency when the density of rsm gets above a certain threshold. Not sure why...
numrows = 1000000;
numcols = 1000000;
rsm = sprand(numrows,numcols,30/numcols);
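rsm0=rsm.';                             % transpose once: rows of the original become columns
tic;
rsm=sort(rsm0);                         % sort each column, i.e. each original row
[~,J,values]=find(rsm);                 % J is the original row index
u=unique([J,values],'rows');
N=size(u,1);
tmp=diff([0;find(diff(u(:,1)));N]);     % number of unique values per row
out1=mat2cell(u(:,2),tmp);              % cell output
toc;
% Elapsed time is 8.225773 seconds.
tic;
rsm=sort(rsm0);
[I,J,values]=find(rsm);
[u,k]=unique([J,values],'rows');
out2=sparse(I(k),u(:,1),u(:,2));        % sparse output: column j holds the unique values of original row j
toc;
% Elapsed time is 16.365144 seconds.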
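For the sparse-output variant, the unique values of original row j end up in column j of the result, so they can be read back with nonzeros. Below is a small sketch on a made-up matrix. The names A and At are hypothetical, and passing explicit dimensions to sparse is an addition assumed here so that trailing empty rows are not dropped:

```matlab
% Hypothetical 3-by-4 example; row 2 has no nonzeros.
A = sparse([1 1 1 3 3], [1 2 3 1 4], [5 5 2 3 3], 3, 4);

At = sort(A.');                          % column j of At is original row j, sorted
[I, J, values] = find(At);
[u, k] = unique([J, values], 'rows');    % u = [1 2; 1 5; 3 3]

% Explicit size (an assumption, not in the thread) keeps one column per original row.
out2 = sparse(I(k), u(:,1), u(:,2), size(At,1), size(At,2));

row1 = nonzeros(out2(:, 1));             % unique values of original row 1: [2; 5]
row2 = nonzeros(out2(:, 2));             % empty: original row 2 has no nonzeros
row3 = nonzeros(out2(:, 3));             % unique values of original row 3: 3
```

Reading a column of out2 is a cheap operation for a sparse matrix, which is the reason for working on the transpose.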
More Answers (0)