Boxplot for multiple categorical data sets
Joana
on 24 Oct 2019
Edited: Cris LaPierre
on 7 Apr 2023
Hi
I want to plot boxplots for 3 repeated variables collected for 4 data sets, where each data set has 15x3 values. So I actually want to plot 4 categories on the x-axis, where each category has 3 vertical boxplots.
Can anyone please help me with that?
I have attached the file named features.
Thanks in advance.
Accepted Answer
Cris LaPierre
on 25 Oct 2019
Edited: Cris LaPierre
on 25 Oct 2019
You could use the 'BoxStyle','filled' name-value pair when creating the boxplot, but I don't like how that looks. The best way I could find to create it the way I like was this post. Note that the fill is a colored patch object placed on top of the box plot. That means it will cover up the median line unless you adjust its transparency.
I've moved the plotting of the mean so that it is on top of the new object creating the fill. I've also added it to the legend so that others know what that non-standard marker represents.
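For comparison, here is a minimal sketch of the built-in 'BoxStyle','filled' option mentioned above, so you can judge the look for yourself. It uses synthetic data rather than the attached file, and the 'Colors','rgb' choice is only for illustration. boxplot requires the Statistics and Machine Learning Toolbox.

```matlab
% Synthetic stand-in data (not the data from the question)
rng(0)                      % make the random numbers reproducible
x = randn(15,3);            % 15 observations of 3 variables

% Draw the boxes in the built-in filled style, one color per box
h = boxplot(x,'BoxStyle','filled','Colors','rgb');
```

The filled style draws narrow solid bars instead of outlined boxes, which is why the answer below overlays semi-transparent patches instead.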
Final answer would be this:
load 'Data for plot.mat'
nDataSets = 7;
nVars = 3;
nVals = 15;
% Create column vector to indicate dataset
dataSet = categorical([ones(nVars*nVals,1); ...
ones(nVars*nVals,1)*2; ...
ones(nVars*nVals,1)*3; ...
ones(nVars*nVals,1)*4;...
ones(nVars*nVals,1)*5;...
ones(nVars*nVals,1)*6;...
ones(nVars*nVals,1)*7]);
% Create column vector to indicate the variable
clear var
var(1:nVals,1) = "Var1";
var(end+1:end+nVals,1) = "Var2";
var(end+1:end+nVals,1) = "Var3";
Var = categorical([var;var;var;var;var;var;var]);
% Create a table
testData = table(data,dataSet,Var);
h = boxplot(testData.data,{testData.dataSet,testData.Var},...
'ColorGroup',testData.Var,...
'Labels',{'','Data1','','','Data2','','','Data3','','','Data4','','','Data5','','','Data6','','','Data7',''});
% set(gca,'XTickLabel',{' '})
% Don't display outliers
ol = findobj(h,'Tag','Outliers');
set(ol,'Visible','off');
% Find all boxes
box_vars = findall(h,'Tag','Box');
% Fill boxes
for j=1:length(box_vars)
    patch(get(box_vars(j),'XData'),get(box_vars(j),'YData'),box_vars(j).Color, ...
        'FaceAlpha',.1,'EdgeColor','none');
end
% Add legend
Lg = legend(box_vars(1:3), {'G1','G2','G3'},'Location','northoutside','Orientation','horizontal');
%% Add Mean to boxplots
summaryTbl = groupsummary(testData,{'dataSet','Var'},"mean")
hold on
plot(summaryTbl.mean_data, '+k')
hold off
Lg.String{4} = 'mean';
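The long 'Labels' list in the boxplot call above can also be generated instead of typed by hand, which helps when nDataSets or nVars changes. This is only a sketch; the variable names labels and mid are made up for illustration. For an even nVars there is no middle box, so the label lands on the box just left of center.

```matlab
nDataSets = 7;   % number of data sets (groups on the x-axis)
nVars = 3;       % number of boxes per data set

% Start with an empty label for every box
labels = repmat({''},1,nDataSets*nVars);

% Put the data set name on the middle box of each group
mid = ceil(nVars/2);
labels(mid:nVars:end) = cellstr("Data" + (1:nDataSets));
```

The result can be passed as the value of 'Labels' in place of the hard-coded cell array.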
5 Comments
Clara Yang
on 7 Apr 2023
Edited: Clara Yang
on 7 Apr 2023
Hi, I figured out the reason: I only have 2 nVars, so I need to delete the extra '' entries from the labels. If possible, is there a way for me to center the label in this case? Thank you so much again for writing this method!
Cris LaPierre
on 7 Apr 2023
Edited: Cris LaPierre
on 7 Apr 2023
It is probably best to ask a new question of your own, as more people will see it.
I don't see a good way to do this with boxplot, but boxchart can really simplify the code (it's come a long way since the question was originally asked). It does require a little manipulation to get the mean values to align, but nothing difficult.
The X tick label names come directly from the categorical information used to group the data. You don't have to use categorical for grouping, but it does make it convenient to group on non-numeric data.
Here, I've renamed the categories just to demonstrate.
% Create a test data set
nDataSets = 7;
nVars = 2;
nVals = 15;
data = rand(nVals*nVars*nDataSets,1);
% Create column vector to indicate dataset
dataSet = categorical([ones(nVars*nVals,1); ...
ones(nVars*nVals,1)*2; ...
ones(nVars*nVals,1)*3; ...
ones(nVars*nVals,1)*4;...
ones(nVars*nVals,1)*5;...
ones(nVars*nVals,1)*6;...
ones(nVars*nVals,1)*7]);
dataSet = renamecats(dataSet,{'Red','Orange','Yellow','Green','Purple','Indigo','Violet'});
% Create column vector to indicate the variable
clear var
var(1:nVals,1) = "Var1";
var(end+1:end+nVals,1) = "Var2";
Var = categorical([var;var;var;var;var;var;var]);
% Create a table
testData = table(data,dataSet,Var);
% ########################################
% Actual visualization code using boxchart
boxchart(testData.dataSet,testData.data,"GroupByColor",testData.Var)
%% Add Mean to boxplots
summaryTbl = groupsummary(testData,{'dataSet','Var'},"mean");
hold on
plot((1:nDataSets*nVars)/2 + 0.25, summaryTbl.mean_data, '+k')
hold off
legend(["G1","G2","Mean"],'Location','northoutside','Orientation','horizontal')
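The x positions used for the mean markers above are specific to nVars = 2. A possible generalization is sketched below. It assumes that each category spans one unit on the x-axis and that its nVars boxes are centered in nVars equal-width slots. That assumption reproduces the positions used above for two variables, but I have not checked it against the boxchart documentation for other counts, so verify it visually. The names offsets and xMean are made up for illustration.

```matlab
nDataSets = 7;
nVars = 2;

% Assumed offset of each box from its category center
% (gives [-0.25 0.25] for nVars = 2)
offsets = ((1:nVars) - (nVars+1)/2) / nVars;

% One column per data set, one row per variable
xMean = (1:nDataSets) + offsets';

% Flatten column by column: Var1, Var2 of set 1, then set 2, ...
% This matches the row order of the groupsummary table above
xMean = xMean(:)';
```

You could then call plot(xMean, summaryTbl.mean_data, '+k') instead of hard-coding the division by 2 and the 0.25 shift.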
More Answers (3)
Cris LaPierre
on 24 Oct 2019
Edited: Cris LaPierre
on 24 Oct 2019
Not sure what you are hoping it looks like in the end, but here's one way.
load features.mat
data1 = features{1};
data2 = features{2};
data3 = features{3};
data4 = features{4};
subplot(1,4,1)
boxplot(data1)
title('Data 1')
subplot(1,4,2)
boxplot(data2)
title('Data 2')
subplot(1,4,3)
boxplot(data3)
title('Data 3')
subplot(1,4,4)
boxplot(data4)
title('Data 4')
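The same figure can be produced with a loop, which avoids repeating the subplot block for each data set. This is a sketch that uses random numbers as a stand-in for features.mat, assuming (as the indexing above suggests) that features is a cell array of 15x3 matrices.

```matlab
% Stand-in for the attached file: a 1x4 cell array of 15x3 matrices
features = {rand(15,3), rand(15,3), rand(15,3), rand(15,3)};

nSets = numel(features);
for k = 1:nSets
    subplot(1,nSets,k)        % one panel per data set
    boxplot(features{k})      % one box per column (variable)
    title("Data " + k)
end
```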
1 Comment
Cris LaPierre
on 24 Oct 2019
Edited: Cris LaPierre
on 24 Oct 2019
One potentially cool thing is to take advantage of the grouping option (second syntax described in the doc). To do so, I'd recommend getting your data into a table. Create one variable with all the data, one with categorical info on the data set, and one with categorical info on the variable.
% Create column vector of all data
data = [data1(:); data2(:); data3(:); data4(:)];
% Create column vector to indicate dataset
dataSet = categorical([ones(numel(data1),1); ...
ones(numel(data2),1)*2; ...
ones(numel(data3),1)*3; ...
ones(numel(data4),1)*4]);
% Create column vector to indicate the variable
clear var
var(1:length(data1),1) = "Var1";
var(end+1:end+length(data1),1) = "Var2";
var(end+1:end+length(data1),1) = "Var3";
Var = categorical([var;var;var;var]);
% Create a table
testData = table(data,dataSet,Var);
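The grouping columns can also be built with repelem and repmat rather than by concatenating blocks by hand. The sketch below produces a table with the same layout as above and scales to any number of data sets. It uses random stand-in data and assumes every data set has the same size.

```matlab
% Stand-in for the four 15x3 matrices loaded from features.mat
features = {rand(15,3), rand(15,3), rand(15,3), rand(15,3)};
[nVals,nVars] = size(features{1});
nSets = numel(features);

% Stack every matrix column-wise into one long column vector
data = cell2mat(cellfun(@(m) m(:), features(:), 'UniformOutput', false));

% Data set index: 1 repeated nVals*nVars times, then 2, ...
dataSet = categorical(repelem((1:nSets)', nVals*nVars));

% Variable name: Var1 x nVals, Var2 x nVals, ..., repeated per data set
Var = categorical(repmat(repelem("Var" + (1:nVars)', nVals), nSets, 1));

testData = table(data,dataSet,Var);
```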
Now you can use a single boxplot command to create the boxplot you describe. You can use multiple grouping variables to organize the data into separate boxplots (enclose them in curly braces). Here, I group first by dataSet, then by Var.
boxplot(testData.data,{testData.dataSet,testData.Var})
The two X-axis labels indicate 1) dataSet and 2) Variable.
If you want to see all the boxplots for a specific variable next to each other, change the order of your grouping variables to first group by Var, then by dataSet.
boxplot(testData.data,{testData.Var,testData.dataSet})
Notice the X-axis labels can still be used to correctly identify each boxplot.
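Before plotting, it can be worth checking that the grouping variables line up with the data as intended. One way, sketched here with random stand-in data, is to count the rows in each dataSet/Var combination with groupcounts (available from R2019a). Every group should contain 15 values.

```matlab
% Stand-in table with the same layout as testData above
nSets = 4; nVars = 3; nVals = 15;
data = rand(nSets*nVars*nVals,1);
dataSet = categorical(repelem((1:nSets)', nVars*nVals));
Var = categorical(repmat(repelem("Var" + (1:nVars)', nVals), nSets, 1));
testData = table(data,dataSet,Var);

% One row per dataSet/Var combination, with the number of values in each
counts = groupcounts(testData,{'dataSet','Var'});
```

The rows of counts appear in the same order as the boxes when grouping first by dataSet, then by Var.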
Hudson Vieira Coutinho
on 12 Oct 2022
Hi friend, take a look at this: https://www.mathworks.com/help/matlab/ref/boxchart.html
It's very simple!