How can I compute the mean and standard deviation for a set of data with variability?

I have plotted 5 data samples against a common x-axis (Eng. strain) with a constant interval; the y-values (Eng. stress) vary from sample to sample. Each sample contains NaN values beyond a threshold x value that depends on the degradation of its curve. I tried to compute the mean and standard deviation using the following commands:
% Calculate the mean and standard deviation
mean_stress_3p = mean([interpolated_stress_s1_3p; interpolated_stress_s2_3p; interpolated_stress_s3_3p;interpolated_stress_s4_3p;interpolated_stress_s5_3p],'omitnan');
std_stress_3p = std([interpolated_stress_s1_3p; interpolated_stress_s2_3p; interpolated_stress_s3_3p;interpolated_stress_s4_3p;interpolated_stress_s5_3p],'omitnan');
However, the resulting mean curve (blue line) doesn't seem right. Is there a better approach to extract the mean curve and its corresponding standard deviation along the degradation for such data?
Code:
clear all; close all; clc
%%
[~, ~, data_s1_3p] = xlsread('C:\Hydrogel_damage\SA1_3p.xlsx');
strain_extensometer_s1_3p= cell2mat(data_s1_3p(2:92, 2));
time_dic_s1_3p= cell2mat(data_s1_3p(2:92, 1));
experimentalstress_original_s1_3p= cell2mat(data_s1_3p(2:894, 5));
time_utm_s1_3p= cell2mat(data_s1_3p(2:894, 3));
predicted_experimentalstress_s1_3p= interp1(time_utm_s1_3p,experimentalstress_original_s1_3p,time_dic_s1_3p,"linear","extrap");
commonTime_s1_3p = min(time_dic_s1_3p):0.005:max(max(time_dic_s1_3p));
commonStrain_s1_3p = interp1(time_dic_s1_3p, strain_extensometer_s1_3p, commonTime_s1_3p, 'linear',"extrap");
commonStress_s1_3p = interp1(time_utm_s1_3p, experimentalstress_original_s1_3p, commonTime_s1_3p, 'linear',"extrap");
strain_range_s1_3p = 0:0.00005:0.13;
[commonStrain_s1_3p, sortIndex] = sort(commonStrain_s1_3p);
commonStress_s1_3p = commonStress_s1_3p(sortIndex);
[commonStrain_s1_3p, ia, ~] = unique(commonStrain_s1_3p);
commonStress_s1_3p = commonStress_s1_3p(ia);
interpolated_stress_s1_3p= interp1(commonStrain_s1_3p, commonStress_s1_3p, strain_range_s1_3p, 'linear',"extrap");
interpolated_stress_s1_3p(interpolated_stress_s1_3p<0)=nan;
%%
[~, ~, data_s2_3p] = xlsread('C:\Hydrogel_damage\SA2_3p.xlsx');
strain_extensometer_s2_3p= cell2mat(data_s2_3p(2:126, 2));
time_dic_s2_3p= cell2mat(data_s2_3p(2:126, 1));
experimentalstress_original_s2_3p= cell2mat(data_s2_3p(2:1236, 5));
time_utm_s2_3p= cell2mat(data_s2_3p(2:1236, 3));
predicted_experimentalstress_s2_3p= interp1(time_utm_s2_3p,experimentalstress_original_s2_3p,time_dic_s2_3p,"linear","extrap");
commonTime_s2_3p = min(time_dic_s2_3p):0.005:max(max(time_dic_s2_3p));
commonStrain_s2_3p = interp1(time_dic_s2_3p, strain_extensometer_s2_3p, commonTime_s2_3p, 'linear',"extrap");
commonStress_s2_3p = interp1(time_utm_s2_3p, experimentalstress_original_s2_3p, commonTime_s2_3p, 'linear',"extrap");
strain_range_s2_3p = 0:0.00005:0.13;
[commonStrain_s2_3p, sortIndex] = sort(commonStrain_s2_3p);
commonStress_s2_3p = commonStress_s2_3p(sortIndex);
[commonStrain_s2_3p, ia, ~] = unique(commonStrain_s2_3p);
commonStress_s2_3p = commonStress_s2_3p(ia);
interpolated_stress_s2_3p= interp1(commonStrain_s2_3p, commonStress_s2_3p, strain_range_s2_3p, 'linear',"extrap");
interpolated_stress_s2_3p(interpolated_stress_s2_3p<0)=nan;
%%
[~, ~, data_s3_3p] = xlsread('C:\Hydrogel_damage\SA3_3p.xlsx');
strain_extensometer_s3_3p= cell2mat(data_s3_3p(2:109, 2));
time_dic_s3_3p= cell2mat(data_s3_3p(2:109, 1));
experimentalstress_original_s3_3p= cell2mat(data_s3_3p(2:1068, 5));
time_utm_s3_3p= cell2mat(data_s3_3p(2:1068, 3));
predicted_experimentalstress_s3_3p= interp1(time_utm_s3_3p,experimentalstress_original_s3_3p,time_dic_s3_3p,"linear","extrap");
commonTime_s3_3p = min(time_dic_s3_3p):0.005:max(max(time_dic_s3_3p));
commonStrain_s3_3p = interp1(time_dic_s3_3p, strain_extensometer_s3_3p, commonTime_s3_3p, 'linear',"extrap");
commonStress_s3_3p = interp1(time_utm_s3_3p, experimentalstress_original_s3_3p, commonTime_s3_3p, 'linear',"extrap");
strain_range_s3_3p = 0:0.00005:0.13;
[commonStrain_s3_3p, sortIndex] = sort(commonStrain_s3_3p);
commonStress_s3_3p = commonStress_s3_3p(sortIndex);
[commonStrain_s3_3p, ia, ~] = unique(commonStrain_s3_3p);
commonStress_s3_3p = commonStress_s3_3p(ia);
interpolated_stress_s3_3p= interp1(commonStrain_s3_3p, commonStress_s3_3p, strain_range_s3_3p, 'linear',"extrap");
interpolated_stress_s3_3p(interpolated_stress_s3_3p<0)=nan;
%%
[~, ~, data_s4_3p] = xlsread('C:\Hydrogel_damage\SA4_3p.xlsx');
strain_extensometer_s4_3p= cell2mat(data_s4_3p(2:116, 2));
time_dic_s4_3p= cell2mat(data_s4_3p(2:116, 1));
experimentalstress_original_s4_3p= cell2mat(data_s4_3p(2:1139, 5));
time_utm_s4_3p= cell2mat(data_s4_3p(2:1139, 3));
predicted_experimentalstress_s4_3p= interp1(time_utm_s4_3p,experimentalstress_original_s4_3p,time_dic_s4_3p,"linear","extrap");
commonTime_s4_3p = min(time_dic_s4_3p):0.005:max(max(time_dic_s4_3p));
commonStrain_s4_3p = interp1(time_dic_s4_3p, strain_extensometer_s4_3p, commonTime_s4_3p, 'linear',"extrap");
commonStress_s4_3p = interp1(time_utm_s4_3p, experimentalstress_original_s4_3p, commonTime_s4_3p, 'linear',"extrap");
strain_range_s4_3p = 0:0.00005:0.13;
[commonStrain_s4_3p, sortIndex] = sort(commonStrain_s4_3p);
commonStress_s4_3p = commonStress_s4_3p(sortIndex);
[commonStrain_s4_3p, ia, ~] = unique(commonStrain_s4_3p);
commonStress_s4_3p = commonStress_s4_3p(ia);
interpolated_stress_s4_3p= interp1(commonStrain_s4_3p, commonStress_s4_3p, strain_range_s4_3p, 'linear',"extrap");
interpolated_stress_s4_3p(interpolated_stress_s4_3p<0)=nan;
%%
[~, ~, data_s5_3p] = xlsread('C:\Hydrogel_damage\SA5_3p.xlsx');
strain_extensometer_s5_3p= cell2mat(data_s5_3p(2:130, 2));
time_dic_s5_3p= cell2mat(data_s5_3p(2:130, 1));
experimentalstress_original_s5_3p= cell2mat(data_s5_3p(2:1276, 5));
time_utm_s5_3p= cell2mat(data_s5_3p(2:1276, 3));
predicted_experimentalstress_s5_3p= interp1(time_utm_s5_3p,experimentalstress_original_s5_3p,time_dic_s5_3p,"linear","extrap");
commonTime_s5_3p = min(time_dic_s5_3p):0.005:max(max(time_dic_s5_3p));
commonStrain_s5_3p = interp1(time_dic_s5_3p, strain_extensometer_s5_3p, commonTime_s5_3p, 'linear',"extrap");
commonStress_s5_3p = interp1(time_utm_s5_3p, experimentalstress_original_s5_3p, commonTime_s5_3p, 'linear',"extrap");
strain_range_s5_3p = 0:0.00005:0.13;
[commonStrain_s5_3p, sortIndex] = sort(commonStrain_s5_3p);
commonStress_s5_3p = commonStress_s5_3p(sortIndex);
[commonStrain_s5_3p, ia, ~] = unique(commonStrain_s5_3p);
commonStress_s5_3p = commonStress_s5_3p(ia);
interpolated_stress_s5_3p= interp1(commonStrain_s5_3p, commonStress_s5_3p, strain_range_s5_3p, 'linear',"extrap");
interpolated_stress_s5_3p(interpolated_stress_s5_3p<0)=nan;
figure;
hold on
xlabel('Eng.strain');
ylabel('Engineering Stress (kPa)');
ylim([0 70])
plot(strain_range_s1_3p,interpolated_stress_s1_3p,"-.",'MarkerSize',5)
plot(strain_range_s2_3p,interpolated_stress_s2_3p,"-.",'MarkerSize',5)
plot(strain_range_s3_3p,interpolated_stress_s3_3p,"-.",'MarkerSize',5)
plot(strain_range_s4_3p,interpolated_stress_s4_3p,"-.",'MarkerSize',5)
plot(strain_range_s5_3p,interpolated_stress_s5_3p,"-.",'MarkerSize',5)
% Calculate the mean and standard deviation
mean_stress_3p = mean([interpolated_stress_s1_3p; interpolated_stress_s2_3p; interpolated_stress_s3_3p;interpolated_stress_s4_3p;interpolated_stress_s5_3p],'omitnan');
std_stress_3p = std([interpolated_stress_s1_3p; interpolated_stress_s2_3p; interpolated_stress_s3_3p;interpolated_stress_s4_3p;interpolated_stress_s5_3p],'omitnan');
% Define the x (strain) and y (stress) coordinates for the patch
x = strain_range_s1_3p;
y = mean_stress_3p ;
err = std_stress_3p;
% Create the patch (shaded area) for error bars
figure;
patch([x, fliplr(x)], [y + err, fliplr(y - err)], 'b', 'FaceAlpha', 0.2, 'EdgeColor', 'none');
hold on;
% Plot the mean stress curve
m= plot(x, y, 'b-', 'LineWidth', 2);
ylim([0 70])
[EDIT - formatted code as code]
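A minimal sketch of the NaN-aware mean and standard deviation on a stacked matrix, using small synthetic curves in place of the Excel data (the values and the names `S`, `mu`, `sigma`, `nUsed`, and `ok` are assumptions for illustration, not part of the original code). It also drops non-finite points before calling `patch`, since a polygon whose vertices contain NaN is generally not filled.

```matlab
% Synthetic stand-in for the five interpolated curves (hypothetical data)
x = 0:0.01:0.1;                          % common strain grid, 1x11
S = [10*x; 12*x; 20*x; 22*x; 24*x];      % 5xN matrix, one row per sample
S(1, 8:end)  = NaN;                      % sample 1 ends early
S(2, 10:end) = NaN;                      % sample 2 ends a little later

mu    = mean(S, 1, 'omitnan');           % column-wise mean, ignoring NaN
sigma = std(S, 0, 1, 'omitnan');         % column-wise std, N-1 normalization
nUsed = sum(~isnan(S), 1);               % number of samples contributing per column

% Keep only finite points so the shaded band can be drawn
ok = isfinite(mu) & isfinite(sigma);
figure; hold on
patch([x(ok), fliplr(x(ok))], ...
      [mu(ok) + sigma(ok), fliplr(mu(ok) - sigma(ok))], ...
      'b', 'FaceAlpha', 0.2, 'EdgeColor', 'none');
plot(x(ok), mu(ok), 'b-', 'LineWidth', 2);
```

Plotting `nUsed` next to the mean makes it easier to see which parts of the mean curve are supported by all five samples and which by only a few.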
2 Comments
dpb on 30 Aug 2024
Edited: dpb on 31 Aug 2024
Those single line plots show a very sizable difference between the upper two and the lower three -- does it really make physical sense to average all five? That doesn't even consider the differences in where the turnover occurs.
If one ignores all of those issues, then the middle blue curve of the second figure looks like a quite reasonable average of the five -- it's in the middle but weighted towards the three lower which will outweigh the two upper and reflects the influence of the yellow and red traces before the NaN data kick in and are omitted. Perhaps you were expecting the mean to be halfway between the two groups? That would be so if there were two traces in each group, but there are 2/5 and 3/5 -- 0.4/0.6 weighting towards each group which is what the actual location is showing.
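The weighting argument can be checked with a few numbers. The stress values below are hypothetical and only illustrate the arithmetic at a single strain value; they are not taken from the posted data.

```matlab
% Hypothetical stresses (kPa) at one strain value: two upper and three lower traces
upper = [60 60];
lower = [30 30 30];

m_all = mean([upper, lower]);                  % plain mean of all five: 42
m_w   = 0.4*mean(upper) + 0.6*mean(lower);     % 2/5 and 3/5 weighting: also 42
m_mid = (mean(upper) + mean(lower)) / 2;       % midpoint between the groups: 45

% Once one upper trace has turned into NaN, 'omitnan' averages the remaining four
m_drop = mean([NaN 60 30 30 30], 'omitnan');   % 37.5
```

The last line also shows why the mean curve can step toward the remaining traces at the strain where one trace ends.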
We know nothing of the real basis of what the data are, but it looks like you would need to do more preprocessing at a minimum and perhaps even reconsider what it is you're trying to accomplish; an average just doesn't seem like a useful statistic; certainly the variance/std deviation is going to be quite large.
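One possible form of such preprocessing (an assumption here, not something specified in the comment) is to report the mean and standard deviation only where a minimum number of samples still contribute. The threshold `minSamples` and the small matrix `S` below are hypothetical.

```matlab
% Hypothetical 3-sample example; NaN marks where a curve has ended
S = [1 2 3 NaN NaN;
     1 2 3 4   NaN;
     3 4 5 6   7];

minSamples = 2;                          % hypothetical threshold, adjust as needed

mu    = mean(S, 1, 'omitnan');           % column-wise mean, ignoring NaN
sigma = std(S, 0, 1, 'omitnan');         % column-wise std, ignoring NaN
nUsed = sum(~isnan(S), 1);               % contributing samples per column

% Blank out statistics that rest on too few samples
tooFew        = nUsed < minSamples;
mu(tooFew)    = NaN;
sigma(tooFew) = NaN;
```

Here the last column is supported by a single sample, so its mean and standard deviation are masked rather than reported.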
Image Analyst on 30 Aug 2024
Edited: Image Analyst on 30 Aug 2024
@Yogesh Why doesn't it seem "right"? Looks right to me, given the data that you have. Explain in detail what is not right about it. Would you rather just compute the average up to the peak/highest value of each curve?
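Averaging only up to the peak of each curve could be sketched as follows. The data are synthetic and the names `S`, `T`, and `iPeak` are assumptions for illustration; whether discarding the post-peak branch is physically appropriate depends on the experiment.

```matlab
% Hypothetical curves, one row per sample, each rising to a peak and then degrading
S = [1 2 3 4 3 2 1;
     2 4 6 5 4 3 2;
     1 3 5 7 9 8 7];

[~, iPeak] = max(S, [], 2);              % column index of the peak in each row

T = S;
for k = 1:size(S, 1)
    T(k, iPeak(k)+1:end) = NaN;          % discard everything after the peak
end

mu    = mean(T, 1, 'omitnan');           % mean of the pre-peak branches only
sigma = std(T, 0, 1, 'omitnan');         % matching standard deviation
nUsed = sum(~isnan(T), 1);               % samples contributing per column
```

Columns beyond the last peak have no contributing samples, so `mu` is NaN there.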


Answers (0)
Version

R2023a