Violinplot extending beyond data range

Angie on 28 Nov 2024 at 13:56
Commented: William Rose on 3 Dec 2024 at 15:28
Hello everyone,
I’m using the violinplot function in MATLAB to create violin plots for some datasets. I am specifying the position and the data as follows:
violinplot(3, data2(5:end));
However, I’ve encountered an issue. The violin plot extends to negative values even though all my data values are positive. For another dataset, I observed a similar problem: the violin plot includes values that are negative or larger than the maximum values in my data.
I’ve read that this might be caused by the kernel density estimation (KDE) method used by violinplot to calculate and visualize the data's probability density. KDE smooths the data distribution and can sometimes produce density values outside the actual range of the data.
I’m unsure how to resolve this issue and would greatly appreciate any advice or suggestions.
Thank you!
Angie

Accepted Answer

William Rose on 28 Nov 2024 at 15:56
Edited: William Rose on 28 Nov 2024 at 16:01
[Edit: added ylim() so that all 3 plots have the same y-axis range.]
You can vary the bandwidth, the kernel function, or both. In the examples below, the data are uniformly distributed on (0,1), which is close to a worst case if you don't want the violin to extend to negative values, because many samples lie right next to the lower limit. The violins still extend beyond the data in all three examples, but the options control how far they extend. Experiment to see if you like the results. Depending on your data, you may not be able to avoid the violin going negative.
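ydata = rand(100,1);   % 100 samples, uniformly distributed on (0,1)
figure;
% Panel 1: violinplot with its default density estimate
subplot(131)
violinplot(ydata);
title('Default Violinplot'); ylim([-.5,1.5])
% Panel 2: density from kde with a fixed bandwidth of 0.05
[f1,xf1] = kde(ydata,Bandwidth=0.05);
subplot(132)
violinplot(EvaluationPoints=xf1,DensityValues=f1)
title('Bandwidth=0.05'); ylim([-.5,1.5])
% Panel 3: density from kde with a box kernel
[f2,xf2] = kde(ydata,Kernel="box");
subplot(133)
violinplot(EvaluationPoints=xf2,DensityValues=f2)
title('Box Kernel'); ylim([-.5,1.5])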
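If the violin must stop at the data limits, one possible workaround that builds on the kde-based calls above is to keep only the part of the estimated density that lies between the smallest and largest data values before passing it to violinplot. The sketch below is an illustration under that assumption, not a built-in violinplot setting; the names inRange, xfClip, and fClip are made up for this example. Note that the clipped curve no longer integrates to 1, so it shows the shape of the estimate rather than a proper density.

```matlab
% Sketch: clip the kernel density estimate to the observed data range.
% inRange, xfClip and fClip are hypothetical names used only here.
ydata = rand(100,1);                 % same kind of test data as above

[f,xf] = kde(ydata);                 % default kernel density estimate

% Keep only evaluation points inside [min(ydata), max(ydata)]
inRange = xf >= min(ydata) & xf <= max(ydata);
xfClip  = xf(inRange);
fClip   = f(inRange);

figure;
violinplot(EvaluationPoints=xfClip,DensityValues=fClip)
title('KDE clipped to data range'); ylim([-.5,1.5])
```

With this approach the violin ends abruptly at the minimum and maximum of the data instead of tapering off. If that look is not acceptable, the kde documentation is worth checking for options intended for bounded data.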
4 Comments
Angie on 3 Dec 2024 at 12:38
Thank you very much! As a pdf obtained with a kernel distribution extends beyond the most extreme data points in my dataset, which is something I want to avoid, I was considering using other distributions instead. Your examples have been very helpful.
William Rose on 3 Dec 2024 at 15:28
@Angie, you're welcome.


Release: R2024b