Statistics: Data binning vs increased size
Dear Users,
Suppose you have data that is recorded daily (let's say a temperature at different places over 4 years). Each time you compute a histogram, whether one- or two-dimensional, the number of bins dramatically affects the apparent behavior of the data. The default here is 256. While there are some techniques for optimizing that number, is 256 fine?
4 Comments
Jan
on 6 Oct 2013
Perhaps. If the number of bins is a critical parameter, how could you optimize it? How could you decide whether a certain number is good, or better than another?
Image Analyst
on 6 Oct 2013
I agree with Jan - it's hard to know how to "optimize" the number of bins when the criteria for saying what is "fine" and what is "not fine" have not been explained.
As far as I am concerned, tests and statistics are analytical, and histograms (as distinct from optimal binning) are for data visualization only. So for me the ideal bin size is the one that shows the major behaviors and smooths out small-scale fluctuations. In other words, it's a question of scale. If I had to build a "cheap" automatic bin-size adjustment algorithm (e.g. if MATLAB had to output a series of figures automatically for report generation), I guess I would just implement a loop that starts at 3 bins and increases the number of bins until the derivative of the histogram changes sign more often than a certain threshold (which could depend on the number of bins).
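A minimal sketch of that loop, written in plain Python for illustration (the logic ports directly to MATLAB). The function names, the 256-bin cap, and the threshold rule of one sign change per four bins are all assumptions made for the example and not part of the suggestion itself. The "derivative" is taken as the first difference of the bin counts.

```python
import random

def histogram_counts(data, n_bins):
    """Count values in n_bins equal-width bins spanning [min(data), max(data)]."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant data
    counts = [0] * n_bins
    for x in data:
        # The maximum value would land one past the end, so clamp it.
        i = min(int((x - lo) / width), n_bins - 1)
        counts[i] += 1
    return counts

def sign_changes(counts):
    """Number of times the first difference of the counts changes sign."""
    diffs = [b - a for a, b in zip(counts, counts[1:])]
    signs = [1 if d > 0 else -1 for d in diffs if d != 0]  # skip flat steps
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def pick_bin_count(data, max_bins=256, max_changes=lambda n: max(1, n // 4)):
    """Grow the bin count from 3 until the histogram gets too 'wiggly'.

    max_changes is a hypothetical threshold rule (assumption): it maps the
    number of bins to the largest tolerated number of sign changes.
    """
    best = 3
    for n in range(3, max_bins + 1):
        if sign_changes(histogram_counts(data, n)) > max_changes(n):
            break
        best = n
    return best

# Synthetic stand-in for about 4 years of daily temperatures (not real data).
random.seed(0)
temperatures = [random.gauss(15.0, 8.0) for _ in range(4 * 365)]
print(pick_bin_count(temperatures))
```

The threshold rule is where the "question of scale" lives: a stricter rule stops earlier and gives a smoother histogram, while a looser one lets more fine structure (and more noise) through.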
Youssef Khmou
on 6 Oct 2013
Edited: Youssef Khmou
on 6 Oct 2013
Answers (0)