What is the Interpretation of the p-Value from runstest()?

Question

Paul on 18 Sep 2024


https://de.mathworks.com/matlabcentral/answers/2153640-what-is-the-interpretation-of-the-p-value-from-runstest

Answered: Jeff Miller on 19 Sep 2024

Accepted Answer: Jeff Miller


For the function runstest the null hypothesis is "that the values in the data vector x come in random order."

Running 100 runs tests on the output of rand yields the following test decisions and p-values:

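rng(100);                            % fix the seed so the results are reproducible
h = zeros(1,100); p = zeros(1,100);  % preallocate test decisions and p-values
for ii = 1:100
    [h(ii),p(ii)] = runstest(rand(1e4,1));
end
plot(1:100,h,'-o',1:100,p,'-o')      % decisions and p-values versus test index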

In 97 cases, we can't reject the null hypothesis (is it surprising that the null hypothesis was rejected in even three of the cases?).

My real question is about the variability in the p-value. The doc states that: "Small values of p cast doubt on the validity of the null hypothesis."

How small is "small" in this context? What does "cast doubt" mean?

Regardless, is it expected that the p-value would have such variability when running the same test on data sets that, I would hope, are (in some sense) the same with respect to their "random ordering"?

2 Comments

dpb on 18 Sep 2024

Maybe @John D'Errico's discussion of a similar question will help.

I am not familiar with the specific test and the MATLAB doc doesn't provide any details of the specific test statistic being calculated, so would have to dig into the bowels some to comment much more in depth.

The use of "cast doubt" is simply editorial; it has no technical meaning other than that smaller values are less likely (under the test statistic) to have come from "random" sequences.

The definition of random for this purpose simply means how many consecutive values are above/below the mean of the sample; with a uniform distribution, the probability is 50:50 that any given value is above/below the mean, and the probability of the next value falling on the same side as the previous one is (theoretically) independent of the prior value as well. Given that, it doesn't seem at all surprising to me that whether 2, 3, ..., N consecutive values are above/below the mean would be quite variable from one sample to another.
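As a rough illustration of the counting described above, the sketch below tallies the runs above/below the sample mean and applies the usual large-sample normal approximation for a runs test. It uses only base MATLAB, and the variable names are made up for illustration. The resulting p-value need not match the output of runstest exactly, because the internals of runstest (for example, any continuity correction or exact small-sample method) are not covered in this thread.

```matlab
rng(100);                          % reproducible sample
x  = rand(1e4,1);
s  = double(x > mean(x));          % 1 if above the sample mean, 0 otherwise
n  = numel(s);
n1 = sum(s);                       % number of values above the mean
n0 = n - n1;                       % number of values at or below the mean

% A new run starts every time the above/below indicator changes.
nRuns = 1 + sum(diff(s) ~= 0);

% Mean and variance of the number of runs when the order is random.
mu     = 2*n1*n0/n + 1;
sigma2 = 2*n1*n0*(2*n1*n0 - n) / (n^2*(n - 1));

% Two-sided p-value from the normal approximation;
% erfc(abs(z)/sqrt(2)) equals 2*(1 - normcdf(abs(z))).
z       = (nRuns - mu) / sqrt(sigma2);
pApprox = erfc(abs(z)/sqrt(2));

fprintf('runs = %d, expected = %.1f, z = %.3f, p = %.3f\n', nRuns, mu, z, pApprox);
```

Changing the seed changes nRuns, and therefore z and the p-value, even though every sample comes from the same generator.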

Paul on 19 Sep 2024

Thanks for pointing out that other question. I hadn't seen it, but it doesn't really address the heart of this question.

What I can't get my head wrapped around is how different realizations from the same process, which should be stationary in the sense that n values from rand should be similar (not a technical term) to a different set of n values, could be seen as being so different, as measured by the p-value, by runstest.


Answer 1

Jeff Miller on 19 Sep 2024


https://de.mathworks.com/matlabcentral/answers/2153640-what-is-the-interpretation-of-the-p-value-from-runstest#answer_1519050

I'm not sure how much detail you want about hypothesis testing, but the fact is that the p values you get from repeated tests of a null hypothesis are uniformly distributed between 0 and 1 when the null hypothesis is true (at least for continuous test statistics, but the approximation to uniform is also quite good for most discrete ones). This is an inherent property of hypothesis tests because of the way they are constructed.

In your runs test example, the null hypothesis is true because the rng has no serial dependence, so across your 100 tests the p values are approximately uniform. (Run 10,000 and the approximation will be better.) You would get the same thing with repeated t-tests of a true null hypothesis, repeated p values for a sample correlation when the true population correlation is zero, etc.
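To see this for the runs test, one could repeat the experiment from the question with 10,000 tests and look at the histogram of the p-values. This is only a sketch: it assumes the Statistics and Machine Learning Toolbox is available for runstest and that runstest is left at its default significance level of 0.05; the variable names are illustrative.

```matlab
rng(100);                          % reproducible results
nTests = 10000;
h = zeros(nTests,1);               % test decisions (1 = reject the null)
p = zeros(nTests,1);               % p-values

for k = 1:nTests
    [h(k),p(k)] = runstest(rand(1e4,1));
end

% Under a true null the histogram should be roughly flat at height 1.
histogram(p, 20, 'Normalization', 'pdf')
hold on
plot([0 1], [1 1], '--')           % density of a uniform(0,1) distribution
hold off
xlabel('p-value'), ylabel('density')

% With a uniform p-value, about 5% of the tests reject at alpha = 0.05.
fprintf('Fraction rejected: %.3f\n', mean(h));
```

If the p-values are uniform, roughly 5 rejections are expected in 100 tests of a true null, so seeing three in the original experiment would be unremarkable.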

I doubt if I can make it much more intuitive, but here is a try: different random realizations of the same process do give different results, which may conform more or less exactly to what is expected under the null model. One way to pick up deviations from the null model is to scale those differences so that all of the difference sizes are equally likely when the null model is true.
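The scaling idea can be illustrated without any particular test. In the sketch below, a standard normal draw stands in for a generic test statistic under a true null (an assumption made purely for illustration), and the two-sided p-value is obtained by passing that statistic through its own null distribution. Only base MATLAB is used.

```matlab
rng(1);                            % reproducible draws
z = randn(1e5,1);                  % stand-in test statistics under a true null

% Two-sided p-value: probability of a value at least as extreme as |z|.
% erfc(abs(z)/sqrt(2)) equals 2*(1 - normcdf(abs(z))).
p = erfc(abs(z)/sqrt(2));

% Fraction of p-values falling in each tenth of [0,1].
edges  = 0:0.1:1;
counts = histcounts(p, edges);
disp(counts / numel(p))            % each bin should hold roughly 10%
```

The statistics themselves pile up near zero, but after the transformation every interval of p-values of the same width is about equally likely, which is why a p-value of 0.9 on one sample and 0.1 on the next is ordinary behavior under a true null.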
