How to plot a smooth curve for Empirical probability density ?

17 Ansichten (letzte 30 Tage)
Andi
Andi am 29 Mär. 2022
Kommentiert: Mathieu NOE am 1 Apr. 2022
Hi everyone,
I require to plot an empirical probability density curve of a data set (184 rows, 59 columns). Initially, I pick one column to plot the results but, the EPD curve looks so wired. May someone suggest to me how can I get desired result (Attached). Data is also attached for reference.
clear all
clc
X = load('R_0.01T.csv'); % input data
SzX = size(X) % matrix size
r=[X(1,:)]; % first row of the data set
ra = r(~isnan(r)); % remove all nan enteries
[f,x,flo,fhi] = ecdf(ra);
WinLen = 5
dfdxs = smoothdata(gradient(f)./gradient(x), 'movmedian',WinLen); % ... Then, Smooth Them Again ...
aaa = smooth(dfdxs);
plot(x, aaa)
This is what i get
Here is the expected results.
  2 Kommentare
VBBV
VBBV am 29 Mär. 2022
Change the line
WinLen = 35%
Modify this value and you can observe the difference
Andi
Andi am 29 Mär. 2022
Still not expected, I have tried with different values but no significant change.

Melden Sie sich an, um zu kommentieren.

Akzeptierte Antwort

Mathieu NOE
Mathieu NOE am 29 Mär. 2022
hello
tried a few options , smoothdata and spline fit. In fact, I was thinking you wanted something super smooth so I first opted for the spline fit. But I saw that the expected plot was supposed to have some waviness , so I changed to smoothdata
NB as I don't have the toolbox for ecdf , I found an alternative on FEX for it https://fr.mathworks.com/matlabcentral/fileexchange/32831-homemade-ecdf
but I had to correct a bug (the corrected function is attaced fyi)
hope the code below can help you
clear all
clc
X = load('R_0.01T.csv'); % input data
SzX = size(X); % matrix size
r=X(1,:); % first row of the data set
ra = r(~isnan(r)); % remove all nan enteries
% [f,x,flo,fhi] = ecdf(ra);
[f,x] = homemade_ecdf(ra); % FEX : https://fr.mathworks.com/matlabcentral/fileexchange/32831-homemade-ecdf
% remove duplicates
[x,ia,ic] = unique(x);
f = f(ia);
% resample the data (interpolation) on linearly spaced points
xx = linspace(min(x),max(x),100);
ff = interp1(x,f,xx);
% smoothing
ffs = smoothdata(ff, 'loess',40);
% or spline fit ?
% Breaks interpolated from data (log x scale to get more points in the
% lower x range and less in the upper x range)
breaks = logspace(log10(min(x)),log10(max(x)),5); %
p = splinefit(xx,ff,breaks); %
ff2 = ppval(p,xx);
figure(1),
plot(x,f,'*',xx,ffs,'-',xx,ff2,'-')
legend('raw','smoothed','spline fit');
% compute gradient from spline fitted data (my choice)
dx = mean(diff(xx));
dfdxs = gradient(ffs)./dx;
figure(2),plot(xx, dfdxs)
  14 Kommentare
Andi
Andi am 31 Mär. 2022
Bearbeitet: Andi am 31 Mär. 2022
Thank a lot. I think now i very near to the expected results. I have modified your script a bit to actual condition and plot the results. But woundering, why i unable to set color bar as jet, secondly the tick bar are not required only few ont power termed be fine.
clear all
clc
X = load('PDS_case.csv'); % input data
C_sort = sortrows(X,5);
Tri = C_sort(1:86,:);
data1 = sortrows(Tri,7);
PDS=(data1(:,7))';
data2=data1(:,8:66);
data4=data2';
Y=data4(:,1:86);
X=Y';
UU=[2e-8, 4e-8, 6e-8, 8e-8, 1e-7, 3e-7]; % these values linked with each color bar, should be make a color bar
r{1}=[X(1:60,:)];
r{2}=[X(61:68,:)];
r{3}=[X(69:76,:)];
r{4}=[X(77:79,:)];
r{5}=[X(80:81,:)];
r{6}=[X(82:85,:)];
figure
cm = colormap(jet(numel(r)));
cc = colorbar('vert');
set(cc,'Ticks',((1:numel(UU))/numel(UU))');
set(cc,'TickLabels',num2str(UU'));
ylabel(cc,'PDS')
hold on
for k = 1:numel(r) % ... Then, Remove The 'NaN' Elements ...
ra = r{k};
ra = ra(~isnan(ra));
[f ,x ] = ecdf(ra);
[x ,ia ,ic ] = unique(x);
f = f(ia );
xx = linspace(min(x ),max(x ),100);
ff = interp1(x ,f ,xx );
ffs = smoothdata(ff , 'loess',10);
dx = mean(diff(xx ));
dfdxs = gradient(ffs )./dx ;
plot(xx , dfdxs , '-', 'linewidth', 1, 'Color',cm(k,:))
ylim([0, 8])
end
hold off
grid
Here is what i get
and the expected one should be like
Mathieu NOE
Mathieu NOE am 1 Apr. 2022
hello again
1/ seems the expected curves are smoother compare to what we have now. You can probably increase the smoothing (driven by the value in smoothdata parameter)
ffs = smoothdata(ff , 'loess',10); % increase 10 =>20
2/ scale of the colorbar (ticks) : I don't know what are the values you specify in UU coming from ?

Melden Sie sich an, um zu kommentieren.

Weitere Antworten (0)

Kategorien

Mehr zu Preprocessing Data finden Sie in Help Center und File Exchange

Tags

Produkte

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by