Unexpected interquartile range (IQR) result

Sim on 9 Dec 2023
Commented: Sim on 11 Dec 2023
For a number of distributions, I would like to compare and display the interquartile range (IQR) and the standard deviation (STD).
For the normal distribution I got more or less what I expected: about 68% of the data lies within 1 STD of the mean, and the IQR covers about 50% of the data (the central half of the distribution). Here is my test:
clear all; clc;
samplesize = 100000;
% generate distribution
mu = 0;
sigma = 1;
data = normrnd(mu,repmat(sigma,samplesize,1));
% statistics
m = mean(data);
s = std(data);
data1sigma = data((data < (m+s)) & (data > (m-s)));
percentage_data_1sigma = length(data1sigma)/length(data)*100
percentage_data_1sigma = 68.1040
q = quantile(data,[0.25 0.5 0.75]);
dataIQR = data((data < (q(2)+q(1))) | (data > (q(2)-q(1))));
percentage_data_IQR = length(dataIQR)/length(data)*100
percentage_data_IQR = 50.1370
% plot
figure
hold on
h = histogram(data);
xline([m-s m m+s],'-k',{'-1 Standard Dev.','Mean','+1 Standard Dev.'},'linewidth',1)
xline([q(2)-q(1) q(2) q(2)+q(1)],'-r',{'Q1','Q2','Q3'},'linewidth',1)
set(h,'FaceAlpha',0.2)
hold off
However, if I try the same with another distribution, such as a gamma distribution, the IQR no longer covers 50% of the data. What did I do wrong?
clear all; clc;
samplesize = 100000;
% generate distribution
a = 1;
b = 5;
data = gamrnd(a,repmat(b,samplesize,1));
% statistics
m = mean(data);
s = std(data);
data1sigma = data((data < (m+s)) & (data > (m-s)));
percentage_data_1sigma = length(data1sigma)/length(data)*100
percentage_data_1sigma = 86.5350
q = quantile(data,[0.25 0.5 0.75]);
dataIQR = data((data < (q(2)+q(1))) | (data > (q(2)-q(1))));
percentage_data_IQR = length(dataIQR)/length(data)*100
percentage_data_IQR = 100
% plot
figure
hold on
h = histogram(data);
xline([m-s m m+s],'-k',{'-1 Standard Dev.','Mean','+1 Standard Dev.'},'linewidth',1)
xline([q(2)-q(1) q(2) q(2)+q(1)],'-r',{'Q1','Q2','Q3'},'linewidth',1)
set(h,'FaceAlpha',0.2)
hold off

Answers (1)

Sim on 9 Dec 2023
Edited: Sim on 9 Dec 2023
My mistake. The data inside the IQR should be selected like this:
dataIQR = data( data > q(1) & data < q(3) );
and the vertical lines marking the quartiles should be drawn with this command:
xline([q(1) q(2) q(3)],'-r',{'Q1','Q2','Q3'},'linewidth',1)
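It may help to see why the original expression looked right for the normal case. The sketch below is only an illustration under my own assumptions: it uses randn and an inverse-transform exponential draw (a gamma with a = 1) instead of normrnd and gamrnd, a fixed seed, and variable names that are not from the original post. For a symmetric distribution with median near 0, q(2)+q(1) and q(2)-q(1) land close to Q1 and Q3, and the "|" condition selects the two tails, which also hold about 50% of the data. For the skewed gamma sample the lower bound ends up above the upper bound, so every value satisfies at least one condition.

```matlab
% Sketch: what the original logical expression actually selects.
rng(0);                          % fixed seed (assumption, for reproducibility)
n = 100000;

% Standard normal sample: the median is ~0 and Q1 is ~-0.67.
x = randn(n,1);
q = quantile(x,[0.25 0.5 0.75]);
lo = q(2)+q(1);                  % ~ -0.67, near Q1 only because the median is ~0
hi = q(2)-q(1);                  % ~ +0.67, near Q3 only because of symmetry
pct_orig_normal  = mean(x < lo | x > hi)*100;        % the two tails
pct_fixed_normal = mean(x > q(1) & x < q(3))*100;    % the central half

% Exponential sample (gamma with a = 1, scale b = 5), drawn by inversion.
b = 5;
y = -b*log(rand(n,1));
p = quantile(y,[0.25 0.5 0.75]);
lo = p(2)+p(1);                  % median + Q1
hi = p(2)-p(1);                  % median - Q1, smaller than lo
pct_orig_gamma  = mean(y < lo | y > hi)*100;         % conditions overlap
pct_fixed_gamma = mean(y > p(1) & y < p(3))*100;     % the central half
```

Because lo is larger than hi in the gamma case, the two conditions overlap and the original selection keeps the whole sample, which matches the 100 reported in the question.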
This is a correct example:
% generate distribution
samplesize = 100000;
a = 1;
b = 8;
data = gamrnd(a,repmat(b,samplesize,1));
% statistics
m = mean(data);
s = std(data);
data1sigma = data((data < (m+s)) & (data > (m-s)));
percentage_data_1sigma = length(data1sigma)/length(data)*100
percentage_data_1sigma = 86.3970
q = quantile(data,[0.25 0.5 0.75]);
dataIQR = data( data > q(1) & data < q(3) );
percentage_data_IQR = length(dataIQR)/length(data)*100
percentage_data_IQR = 50
% plot
hold on
h = histogram(data);
xline([m-s m m+s],'-k',{'-1 Standard Dev.','Mean','+1 Standard Dev.'},'linewidth',1)
xline([q(1) q(2) q(3)],'-r',{'Q1','Q2','Q3'},'linewidth',1)
set(h,'FaceAlpha',0.2)
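As a side note on the roughly 86% within one standard deviation: this is what one would expect for this particular choice of parameters, not a sign of another bug. A short derivation, assuming the shape/scale parameterization used by gamrnd, so that a = 1 gives an exponential distribution with scale b:

```latex
% Gamma(a, b) with shape a and scale b: mean = a b, std = \sqrt{a}\, b.
% For a = 1 (exponential with scale b): m = s = b.
\begin{align*}
P(m - s < X < m + s) &= P(0 < X < 2b) \\
                     &= 1 - e^{-2b/b} \\
                     &= 1 - e^{-2} \approx 0.8647
\end{align*}
```

The result does not depend on b, which is consistent with the two runs above (b = 5 and b = 8) both giving values close to 86.5%.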
2 Comments
Steven Lord on 9 Dec 2023
You could check your results using the iqr function and/or the prctile function, each moved from Statistics and Machine Learning Toolbox to MATLAB in release R2022a.
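A minimal sketch of such a check, assuming R2022a or later (or the Statistics and Machine Learning Toolbox on older releases). The sample is again drawn by inversion rather than with gamrnd, and the variable names are illustrative only:

```matlab
% Sketch: cross-check the manual quartiles against prctile and iqr.
rng(0);                               % fixed seed (assumption)
b = 8;
data = -b*log(rand(100000,1));        % exponential, i.e. gamma with a = 1

q = quantile(data,[0.25 0.5 0.75]);   % quartiles as in the answer above
p = prctile(data,[25 75]);            % same quartiles, given as percentiles

iqr_manual  = q(3)-q(1);              % width of the interquartile range
iqr_builtin = iqr(data);              % expected to agree with iqr_manual

gap_prctile = max(abs(p - q([1 3]))); % difference between the two quartile sets
gap_iqr     = abs(iqr_builtin - iqr_manual);
```

For this distribution the theoretical IQR is b*log(3), about 8.79 for b = 8, so both values should land near that number.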
Sim on 11 Dec 2023
Thanks a lot @Steven Lord for your nice comment and suggestion! :-) :-)
