Statistical Test for decaying signals

6 views (last 30 days)
Henry Carey-Morgan
Henry Carey-Morgan on 24 Sep 2024
Commented: Star Strider on 8 Oct 2024
I have two decaying relative-intensity curves and would like a statistical test to show that they are different. Each time point on each curve is produced by averaging 100 measured data points. Does anyone have any suggestions? The data are:
Data 1: 1 0.914144 0.876253 0.836468 0.806563 0.781585 0.744672 0.727541 0.695955 0.677459 0.630814 0.637396 0.609646 0.569227 0.565882 0.529177 0.520497 0.514375 0.504086 0.474612 0.447513 0.425238 0.432216 0.441622 0.407928 0.381347 0.387921 0.387443 0.380426 0.363821 0.353484
Data 2: 0.984578 0.9664 0.985515 0.98057 1 0.980536 0.930023 0.957503 0.903321 0.886397 0.897744 0.821625 0.85142 0.833694 0.826525 0.81353 0.768527 0.793422 0.81677 0.76768 0.773302 0.777807 0.736474 0.693616 0.694688 0.74992 0.712753 0.700593 0.708191 0.677843 0.720385
Time: 0 60 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800

Answers (2)

Star Strider
Star Strider on 24 Sep 2024
... each time point on each curve is produced by averaging from 100 taken data points ...
The statistical test depends on what the data represent and on the characteristics of those data. However, using just the mean values is not going to be of any real value, since you also need measures of the dispersion of the data, specifically the variance and, if applicable, the standard deviation (not all distributions, such as the Cauchy distribution, have standard deviations).
If you do not know the underlying distributions of the data, my suggestion would be to use a nonparametric test. Several could work; however, the Friedman test (friedman) might be the most appropriate here.
Data_1 = [1 0.914144 0.876253 0.836468 0.806563 0.781585 0.744672 0.727541 0.695955 0.677459 0.630814 0.637396 0.609646 0.569227 0.565882 0.529177 0.520497 0.514375 0.504086 0.474612 0.447513 0.425238 0.432216 0.441622 0.407928 0.381347 0.387921 0.387443 0.380426 0.363821 0.353484];
Data_2 = [0.984578 0.9664 0.985515 0.98057 1 0.980536 0.930023 0.957503 0.903321 0.886397 0.897744 0.821625 0.85142 0.833694 0.826525 0.81353 0.768527 0.793422 0.81677 0.76768 0.773302 0.777807 0.736474 0.693616 0.694688 0.74992 0.712753 0.700593 0.708191 0.677843 0.720385];
Time = [ 0 60 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800];
figure
plot(Time, Data_1, '.-', 'DisplayName','Data_1')
hold on
plot(Time, Data_2, '.-', 'DisplayName','Data_2')
hold off
grid
legend('Location','best')
  4 comments
Henry Carey-Morgan
Henry Carey-Morgan on 8 Oct 2024
I have standard deviations for each point as well and can generate 95% confidence intervals that show the curves are different, but I need a p value. Could you please explain how you are inputting the data into the Friedman test? Is reps the number of points I average over, in my case 100? Thank you so much for your help!
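Since the per-point standard deviations and the replicate count (n = 100) are available, pointwise comparisons can be made from the summary statistics alone. The following is only a sketch, and the sd1 and sd2 vectors are hypothetical placeholders for your actual standard deviations, which are not posted in this thread:

```matlab
% Sketch: per-time-point z statistics from summary statistics
% (mean, SD, n = 100 replicates). sd1 and sd2 are PLACEHOLDERS --
% replace them with your measured per-point standard deviations.
Data_1 = [1 0.914144 0.876253 0.836468 0.806563 0.781585 0.744672 0.727541 0.695955 0.677459 0.630814 0.637396 0.609646 0.569227 0.565882 0.529177 0.520497 0.514375 0.504086 0.474612 0.447513 0.425238 0.432216 0.441622 0.407928 0.381347 0.387921 0.387443 0.380426 0.363821 0.353484];
Data_2 = [0.984578 0.9664 0.985515 0.98057 1 0.980536 0.930023 0.957503 0.903321 0.886397 0.897744 0.821625 0.85142 0.833694 0.826525 0.81353 0.768527 0.793422 0.81677 0.76768 0.773302 0.777807 0.736474 0.693616 0.694688 0.74992 0.712753 0.700593 0.708191 0.677843 0.720385];
n   = 100;
sd1 = 0.05*ones(size(Data_1));            % placeholder SDs for curve 1
sd2 = 0.05*ones(size(Data_2));            % placeholder SDs for curve 2
z = (Data_1 - Data_2) ./ sqrt(sd1.^2/n + sd2.^2/n);   % standard error of the difference
p_pointwise = 2*normcdf(-abs(z));         % two-sided p value at each time point
```

This tests each time point separately rather than the curves as a whole, so some multiple-comparison adjustment (e.g. Bonferroni) would be needed if the pointwise p values are reported.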
Star Strider
Star Strider on 8 Oct 2024
To do what I suggested, you need the original data at each point.
That would go something like this:
Data_1 = [1 0.914144 0.876253 0.836468 0.806563 0.781585 0.744672 0.727541 0.695955 0.677459 0.630814 0.637396 0.609646 0.569227 0.565882 0.529177 0.520497 0.514375 0.504086 0.474612 0.447513 0.425238 0.432216 0.441622 0.407928 0.381347 0.387921 0.387443 0.380426 0.363821 0.353484];
Data_2 = [0.984578 0.9664 0.985515 0.98057 1 0.980536 0.930023 0.957503 0.903321 0.886397 0.897744 0.821625 0.85142 0.833694 0.826525 0.81353 0.768527 0.793422 0.81677 0.76768 0.773302 0.777807 0.736474 0.693616 0.694688 0.74992 0.712753 0.700593 0.708191 0.677843 0.720385];
Time = [ 0 60 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800];
nPoints = numel(Time)
nPoints = 31
Data_Orig_1 = Data_1 + (rand(10, numel(Data_1))-0.5);  % simulate 10 raw replicates per point
Data_Orig_2 = Data_2 + (rand(10, numel(Data_2))-0.5);
orng = [0.9 0.5 0.2];
friedman_data = [Data_Orig_1(:) Data_Orig_2(:)]
friedman_data = 310×2
    0.5660    1.4103
    0.6106    1.3911
    1.0290    0.9247
    1.1947    1.1333
    0.7528    0.8116
    1.3387    1.2896
    1.0860    0.5246
    0.7695    0.5941
    1.4248    0.9095
    0.6528    1.0547
      ⋮
[p,T,S] = friedman(friedman_data, size(Data_Orig_1,1))
p = 2.6813e-16
T = 5x6 cell array
    {'Source'     }    {'SS'         }    {'df'  }    {'MS'         }    {'Chi-sq'  }    {'Prob>Chi-sq'}
    {'Columns'    }    {[2.3459e+03]}    {[  1]}    {[2.3459e+03]}    {[67.0247]}    {[2.6813e-16]}
    {'Interaction'}    {[  914.3355]}    {[ 30]}    {[   30.4778]}    {0x0 double}    {0x0 double  }
    {'Error'      }    {[1.7355e+04]}    {[558]}    {[   31.1018]}    {0x0 double}    {0x0 double  }
    {'Total'      }    {[     20615]}    {[619]}    {0x0 double  }    {0x0 double}    {0x0 double  }
S = struct with fields:
       source: 'friedman'
            n: 31
    meanranks: [8.5548 12.4452]
        sigma: 5.9161
figure
hp1 = plot(Time, Data_1, '.-b', 'DisplayName','Data_1', 'LineWidth',1.5);
hold on
plot(Time, Data_Orig_1, '.b')
hp2 = plot(Time, Data_2, '.-', 'DisplayName','Data_2', 'Color',orng, 'LineWidth',1.5);
plot(Time, Data_Orig_2, '.', 'Color',orng)
hold off
grid
xlabel('Time')
ylabel('Value')
legend([hp1 hp2],'Location','best')
Here, the matrix for the Friedman test consists of vertically concatenated columns of the simulated data around the original points, with a uniform 10 replicates per point (the second argument to friedman), creating a 310x2 matrix. The friedman function then compares these two columns and determines that they are significantly different (in this instance). I also considered using multcompare; however, with only two groups it is likely not necessary here.
I have never done anything even remotely like this, nor seen it done. (I have only compared two models using the same data with the likelihood ratio test.) I believe the friedman test is appropriate for this problem. In any event, I cannot envision any other way to approach it.
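If the raw replicates are genuinely unavailable, a simpler nonparametric sketch (assuming the Statistics and Machine Learning Toolbox) is a Wilcoxon signed-rank test on the paired per-point means themselves, since the two curves are observed at the same time points:

```matlab
% Sketch: paired nonparametric test on the per-point means.
% signrank tests whether the paired differences Data_1 - Data_2
% have zero median.
Data_1 = [1 0.914144 0.876253 0.836468 0.806563 0.781585 0.744672 0.727541 0.695955 0.677459 0.630814 0.637396 0.609646 0.569227 0.565882 0.529177 0.520497 0.514375 0.504086 0.474612 0.447513 0.425238 0.432216 0.441622 0.407928 0.381347 0.387921 0.387443 0.380426 0.363821 0.353484];
Data_2 = [0.984578 0.9664 0.985515 0.98057 1 0.980536 0.930023 0.957503 0.903321 0.886397 0.897744 0.821625 0.85142 0.833694 0.826525 0.81353 0.768527 0.793422 0.81677 0.76768 0.773302 0.777807 0.736474 0.693616 0.694688 0.74992 0.712753 0.700593 0.708191 0.677843 0.720385];
[p, h] = signrank(Data_1, Data_2)   % h = 1 -> reject equal medians at the 5% level
```

Note that this ignores the within-point dispersion (the 100 replicates behind each mean), so it is less powerful than a test on the raw data.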



Jeff Miller
Jeff Miller on 25 Sep 2024
One simple approach is to fit a straight line to each dataset and show that the slopes are statistically different. For example,
% I'm dividing Time by 1000 to get more readable slope values--i.e.,
% decrease per 1000 time units.
mdl1 = fitlm(Time/1000,Data_1);
ci1 = coefCI(mdl1);
mdl2 = fitlm(Time/1000,Data_2);
ci2 = coefCI(mdl2);
fprintf('Slope for data 1 = %f with 95 pct confidence interval %f to %f\n',mdl1.Coefficients.Estimate(2),ci1(2,1),ci1(2,2));
fprintf('Slope for data 2 = %f with 95 pct confidence interval %f to %f\n',mdl2.Coefficients.Estimate(2),ci2(2,1),ci2(2,2));
% Slope for data 1 = -0.326349 with 95 pct confidence interval -0.355400 to -0.297297
% Slope for data 2 = -0.185232 with 95 pct confidence interval -0.204295 to -0.166170
Since the confidence intervals don't overlap (and it's not even close), you are statistically justified in concluding that the decrease is steeper for data 1 than 2.
If you need an actual p value for a test of the difference in slopes, you'll need to do a bit more work.
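One way to do that extra work (a sketch, assuming the Statistics and Machine Learning Toolbox) is to stack both datasets, add a 0/1 group indicator, and fit a single linear model with a time-by-group interaction; the p value on the interaction coefficient directly tests whether the two slopes differ:

```matlab
% Sketch: test the slope difference via an interaction term in one model.
Time   = [0 60 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800];
Data_1 = [1 0.914144 0.876253 0.836468 0.806563 0.781585 0.744672 0.727541 0.695955 0.677459 0.630814 0.637396 0.609646 0.569227 0.565882 0.529177 0.520497 0.514375 0.504086 0.474612 0.447513 0.425238 0.432216 0.441622 0.407928 0.381347 0.387921 0.387443 0.380426 0.363821 0.353484];
Data_2 = [0.984578 0.9664 0.985515 0.98057 1 0.980536 0.930023 0.957503 0.903321 0.886397 0.897744 0.821625 0.85142 0.833694 0.826525 0.81353 0.768527 0.793422 0.81677 0.76768 0.773302 0.777807 0.736474 0.693616 0.694688 0.74992 0.712753 0.700593 0.708191 0.677843 0.720385];
t = [Time(:); Time(:)]/1000;                          % same rescaling as above
y = [Data_1(:); Data_2(:)];
g = [zeros(numel(Data_1),1); ones(numel(Data_2),1)];  % 0 = data 1, 1 = data 2
mdl = fitlm(table(t, g, y), 'y ~ t*g');               % fits 1 + t + g + t:g
p_slopes = mdl.Coefficients.pValue('t:g')             % p value for the slope difference
```

The 't:g' coefficient estimates the difference between the two slopes, so its p value answers the question directly; this is equivalent to the classical test for equality of regression slopes.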
