Using accumarray for histcounts2 with >1024 bins in 1 dimension

As the title says, I'm trying to generate bivariate histograms of datasets in which I will often have more than 1024 bins in one dimension, given my required data fidelity / bin width. As such, I'm trying to use accumarray in place of histcounts2, but I'm having trouble defining subs.
For full context, I'm starting from an arbitrarily sized sparse array (let's say 100-by-100000). I then extract its nonzero entries as row/column/value triplets using find(), and am finally trying to generate a bivariate histogram to visualize the data.
*Sidenote* variable names used herein are meant to be useful for the example, not reflective of the actual variable names in my script.
DataSparse = sparse(100,100000);
DataSparse(randi(100*100000,[474 1])) = (82-36).*rand(474,1)+36;
[DataFull(:,1),DataFull(:,2),DataFull(:,3)] = find(DataSparse);
[~,~,subs] = unique([DataFull(:,2),quant(DataFull(:,3),5)],'rows');
%The columns of the original sparse matrix are already at the minimum
%acceptable fidelity for the data in that dimension, but the
%"height"/z-data/values in the sparse matrix can be dropped to a lower
%resolution.
%I know the subs definition is lacking, but am unsure how to properly
%define it. It's just where I'm at now.
BigHistogram = accumarray(subs,DataFull(:,3),[],@numel,[],1);
%again, this doesn't work to generate the equivalent of
%BigHistogram = histcounts2(DataFull(:,2),DataFull(:,3),'BinWidth',[55 5]),
%but histcounts2 fails to retain the desired resolution if there's
%sufficiently far-flung data in DataSparse.
Help?

Accepted Answer

Steven Lord on 3 Feb 2023
MATLAB limits the number of bins (to 1024 per dimension for histcounts2) if you specify BinWidth. If you specify a list of edges, MATLAB will use that list of edges to determine the bins even if that results in more than 1024 bins.
DataSparse = sparse(100,100000);
DataSparse(randi(100*100000,[474 1])) = (82-36).*rand(474,1)+36;
[DataFull(:,1),DataFull(:,2),DataFull(:,3)] = find(DataSparse);
% Set up bin edge vectors
[min2, max2] = bounds(DataFull(:, 2));
xedges = min2:55:max2;
[min3, max3] = bounds(DataFull(:, 3));
yedges = min3:5:max3;
BigHistogram = histcounts2(DataFull(:,2),DataFull(:,3), ...
'XBinEdges', xedges, 'YBinEdges', yedges);
whos
  Name                 Size              Bytes  Class     Attributes

  BigHistogram      1806x9              130032  double
  DataFull           474x3               11376  double
  DataSparse         100x100000         807704  double    sparse
  cmdout               1x33                 66  char
  max2                 1x1                   8  double
  max3                 1x1                   8  double
  min2                 1x1                   8  double
  min3                 1x1                   8  double
  xedges               1x1807            14456  double
  yedges               1x10                 80  double
Note the sizes of xedges and BigHistogram.
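Since the question asked about accumarray specifically, here is a minimal sketch of how subs could be built from the same kind of edge vectors, assuming discretize is available (R2015a or later). The variables x and y are hypothetical stand-ins for DataFull(:,2) and DataFull(:,3), and the edge vectors are extended by one bin width so that no point falls outside the last bin. This is an illustration under those assumptions, not a replacement for the histcounts2 call above.

```matlab
% Sketch: bin each coordinate with discretize, then count (x,y) bin pairs
% with accumarray. x and y are illustrative stand-ins, not the real data.
rng(0);                                % fixed seed so the example is repeatable
x = randi(100000, 474, 1);             % stand-in for DataFull(:,2)
y = (82-36).*rand(474,1) + 36;         % stand-in for DataFull(:,3)

% Edges with the bin widths from the thread, extended by one bin width so
% the maximum value is always covered (assumption: an extra empty bin is OK).
xedges = min(x):55:(max(x) + 55);
yedges = min(y):5:(max(y) + 5);

ix = discretize(x, xedges);            % bin index of each point along x
iy = discretize(y, yedges);            % bin index of each point along y

% Each row of subs is one (x-bin, y-bin) pair; accumulate a count of 1 per point.
subs   = [ix iy];
counts = accumarray(subs, 1, [numel(xedges)-1, numel(yedges)-1]);

% Cross-check against histcounts2 called with the same explicit edges.
ref = histcounts2(x, y, xedges, yedges);
sameResult = isequal(counts, ref);
```

The explicit size argument keeps trailing empty bins in the output, so counts has the same shape as the histcounts2 result.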
3 Comments
Steven Lord on 3 Feb 2023
You could test whether the last element of xedges is strictly less than max2. If it is, concatenate that last element plus 55 to the xedges vector. Alternatively, add 55 to max2 and use the result as the third input to colon when you build xedges.
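The two options described in this comment could look roughly like the following. This is a sketch with small made-up data; the variable names x, w, lo, hi, and xedgesAlt are assumptions for illustration, and 55 is the bin width used in the thread.

```matlab
% Illustrative data (assumption); in the thread this would be DataFull(:,2).
x = [3; 60; 170; 200];
w = 55;                                % bin width
[lo, hi] = bounds(x);

% Option 1: build the edges, then append one more edge only if the last
% edge is strictly less than the maximum.
xedges = lo:w:hi;
if xedges(end) < hi
    xedges(end+1) = xedges(end) + w;
end

% Option 2: extend the upper limit before calling colon. This may add an
% extra empty bin when (hi - lo) is an exact multiple of w.
xedgesAlt = lo:w:(hi + w);
```

Option 1 never adds an unnecessary bin, while option 2 is shorter and only costs an occasional empty bin at the top end.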
Gabriel Stanley on 3 Feb 2023
Yeah, I've ended up getting to the second option. Ultimately it's not a big deal if there's an extra empty bin at the top end. Ty.

Version

R2019b
