How to get area between 2 cdf curves?

Chaithra D on 3 Jun 2023
Answered: John D'Errico on 3 Jun 2023
Hi all,
I am writing a script that automatically saves n cdfplots, and from those n plots I need to pick the best one automatically. I am stuck on how to pick the best cdfplot. Here are the approaches I have tried so far:
1. Interpolation method: Taking one of the n cdfplots as an example, I interpolate points on curve 1 and curve 2 and take the difference of the values (curve1(:,1)-curve2(:,1)). I then set a threshold and count how many entries of the difference array are below it. I repeat this for all n plots, take max(count), and use the index of that maximum to decide which plot is best. This method does not give me the desired results.
So I am now trying the following approach:
2. Area method: Find the area between the 2 cdf curves in each plot, and choose the plot with the minimum area as the best cdfplot.
Finding the area between curves was easy with plot(), but I am finding it difficult with cdfplot(). Could someone please help me solve this?
Here is my code for plotting the cdfs:
Example code:
y= abs(evrnd(0,3,100,1));
x= abs(evrnd(0,4,100,1));
figure(1)
cdfplot(y)
hold on;
cdfplot(x)
legend('cdf1','cdf2')
Output:
[Figure: cdfplot output showing the curves cdf1 and cdf2]
In the figure above, I am trying to find the area between the curves, as marked by the lines drawn between the curves in the attached figure below.
Thanks in advance.
2 Comments
Star Strider on 3 Jun 2023
I doubt that this is possible to do correctly. The problem is that the independent variables in the plots are not the same between the functions, and it’s likely not possible to create a common independent variable vector because of the differences between the two of them, and get a reliable result.
y= abs(evrnd(0,3,100,1));
x= abs(evrnd(0,4,100,1));
whos
  Name        Size     Bytes  Class   Attributes

  cmdout      1x33        66  char
  x         100x1        800  double
  y         100x1        800  double
figure(1)
cp1 = cdfplot(y);
x1 = cp1.XData;
y1 = cp1.YData;
[x1min,x1max] = bounds(x1(isfinite(x1)))
x1min = 0.0476
x1max = 9.8995
hold on;
cp2 = cdfplot(x);
x2 = cp2.XData;
y2 = cp2.YData;
[x2min,x2max] = bounds(x2(isfinite(x2)))
x2min = 0.0090
x2max = 17.1675
legend('cdf1','cdf2')
Torsten on 3 Jun 2023
You should use the raw data of all the n CDF curves, put them together in one data field and generate the CDF for this data field.


Accepted Answer

John D'Errico on 3 Jun 2023
If the curves cross, then you need to decide what the "area between" means. That is, is the area in one part negative? Or do you just want to compute the integral of the absolute difference between the curves?
Regardless, it is pretty simple in any case. You can do multiple things, all of which are easy, AND correct. But first, you NEED to decide which area you intend to compute.
The area between two curves is simply the integral of the difference. Essentially you just make sure all of the curves are extended at the top end to the same point, so some may need to be extrapolated. That allows you to evaluate them at the same points.
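As a minimal sketch of evaluating two empirical CDFs at the same points, assuming only base MATLAB (R2016b or later for implicit expansion): an empirical CDF is flat at 1 beyond its largest sample, so extending it at the top end needs no real extrapolation. The vectors s1 and s2 are made-up illustration data, not the poster's samples.

```matlab
% Hypothetical samples, chosen only to keep the numbers easy to follow
s1 = [1; 4; 5];
s2 = [2; 3; 6];

% Common evaluation points: every value at which either ECDF jumps
xq = unique([s1; s2]);        % sorted column vector

% Empirical CDF F(x) = fraction of samples <= x, evaluated on xq.
% Comparing the column xq with the row s1.' gives a numel(xq)-by-numel(s1)
% logical matrix; the row-wise mean is the ECDF value at each xq.
F1 = mean(s1.' <= xq, 2);
F2 = mean(s2.' <= xq, 2);

% Show the common grid next to both ECDFs
disp([xq F1 F2])
```

Both curves now live on the same grid xq, and both reach 1 at its last point, so their difference F1 - F2 is defined everywhere it matters.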
Note that the integral of the difference will, IF the curves cross, have some parts as negative, and others positive. They will negate each other, unless you decide to compute the integral of the absolute value of the difference.
How should you compute the integrals? That part is also trivial. Since these are empirical CDFs, it probably makes the most sense to use a rectangle rule, but that may not be crucially important. These CDFs are fairly bumpy, as empirical CDFs are, so a high-order integration rule is probably not that important. You could just use trapz, as an easy solution. Or you could even get tricky and use polyshapes. So there are many ways to do this.
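A rough sketch of the rectangle rule and trapz side by side, again in base MATLAB with made-up samples s1 and s2 (hypothetical data, picked so that the two curves cross). It computes both the signed area and the area of the absolute difference, so you can see how they differ.

```matlab
% Hypothetical samples whose ECDFs cross each other
s1 = [1; 4; 5];
s2 = [2; 3; 6];

% Common grid and ECDF values (fraction of samples <= each grid point)
xq = unique([s1; s2]);
F1 = mean(s1.' <= xq, 2);
F2 = mean(s2.' <= xq, 2);

d = F1 - F2;      % difference at each breakpoint
w = diff(xq);     % widths of the intervals [xq(k), xq(k+1))

% Rectangle rule: an ECDF is a right-continuous step function, so on each
% interval it holds the value taken at the left breakpoint. For step
% functions this is exact, not an approximation.
signedArea = sum(d(1:end-1) .* w);        % positive and negative parts cancel
absArea    = sum(abs(d(1:end-1)) .* w);   % total area between the curves

% trapz treats the curves as piecewise linear between breakpoints, so it
% only approximates the step-function area
trapzArea = trapz(xq, abs(d));

% Sanity check: the signed area equals the difference of the sample means
fprintf('signed %.4f, abs %.4f, trapz %.4f, mean diff %.4f\n', ...
    signedArea, absArea, trapzArea, mean(s2) - mean(s1));
```

For ranking n plots as in the question, absArea is probably the quantity to minimize, since the signed area can be near zero for two curves that differ a lot but cross.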
Honestly, this is not a difficult problem. (I think Star was a bit confused when he said it would not be possible.) Without any data posted, or knowing in what sense you are asking to compute the area "between" the curves, I won't go any further.
