Does the matlab ranksum function work for larger sample sizes?

Question

Rosie on 20 Jul 2017

Direct link to this question

https://de.mathworks.com/matlabcentral/answers/349607-does-the-matlab-ranksum-function-work-for-larger-sample-sizes

Commented: the cyclist on 26 Mar 2021

I'm using the MATLAB ranksum function for a power analysis of two samples, and I'm getting statistical significance (small p-values). However, my textbook only uses the Wilcoxon rank-sum test for small non-parametric samples (sample sizes of 10 to 12), and my sample size is 50.

I wanted to know whether ranksum is still valid for larger sample sizes, or whether I'm getting these small p-values because my sample size is too large. Also, does anyone know the upper limit on the sample size that ranksum can handle?

Thanks.


Answer 1

Star Strider on 20 Jul 2017

Direct link to this answer

https://de.mathworks.com/matlabcentral/answers/349607-does-the-matlab-ranksum-function-work-for-larger-sample-sizes#answer_274930

The Wilcoxon ranksum (and signrank) tests are generally used for small samples because they are distribution-free; that is, they do not depend on how the data are distributed, only on the data having similar distributions. Large samples tend to be normally distributed, as described by the Central Limit Theorem, so the normal distribution would apply if the samples are sufficiently large.

You can certainly use the Wilcoxon ranksum on large sample sizes, and a sample size of 50 is certainly appropriate for it.
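As a quick check of this, here is a minimal sketch with two made-up samples of 50 points each. The data and the 0.5 shift are assumptions for illustration only. For samples of this size, ranksum should by default use a normal approximation to the rank-sum distribution rather than the exact one; the 'method' option below just makes that choice explicit.

```matlab
rng default                         % reproducible random numbers
n  = 50;                            % sample size from the question
x1 = randn(n,1);                    % hypothetical sample 1
x2 = randn(n,1) + 0.5;              % hypothetical sample 2, shifted by 0.5

% Request the normal approximation explicitly
[p,h,stats] = ranksum(x1,x2,'method','approximate');

% stats.zval is the z-statistic used by the approximation
fprintf('p = %.4g, h = %d, z = %.3f\n', p, h, stats.zval);
```

Because the approximation only needs the ranks and a z-statistic, the cost grows gently with sample size, which is why much larger samples are not a problem in practice.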

1 Comment

the cyclist on 21 Jul 2017

One needs to be a little bit careful with the statement "Large samples tend to be normally distributed", which is not strictly true. The CLT generally makes assertions about statistics of distributions, for example the mean, not about the raw data.
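A rough sketch of that distinction, using exponentially distributed data as an assumed example (the sample sizes are arbitrary): the raw sample stays skewed however many points are drawn, while the means of repeated samples of size 50 are much closer to symmetric. In theory the skewness of an exponential distribution is 2, and the skewness of the mean of 50 such draws is 2/sqrt(50), about 0.28.

```matlab
rng default
N = 1e5;                                   % a large sample
x = -log(rand(N,1));                       % exponential(1) draws, strongly skewed

% The raw sample stays skewed no matter how large N is
fprintf('skewness of raw sample:   %.2f\n', skewness(x));

% Means of many samples of size 50 are much closer to normal
m = mean(-log(rand(50,1e4)));              % 1e4 sample means, one per column
fprintf('skewness of sample means: %.2f\n', skewness(m));
```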

Answer 2

the cyclist on 20 Jul 2017

Direct link to this answer

https://de.mathworks.com/matlabcentral/answers/349607-does-the-matlab-ranksum-function-work-for-larger-sample-sizes#answer_274929

Edited: the cyclist on 20 Jul 2017

I would expect the test to be valid for large samples.

The danger as one moves from small samples to really large samples shifts from "Do I have enough data to see a meaningful effect?" to "I have so much data that I can detect really tiny differences between samples, but are these statistically significant differences actually meaningful?" It becomes more important to have a sense of what a meaningful effect size is.

For example, with a huge sample you might be able to detect a difference of 1 day between two 5-year survival curves. But even though it is statistically significant, it might not be clinically significant.
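To make the same point with numbers, here is a small sketch under assumed values: two very large samples that differ by a shift of only 0.03 standard deviations. The sample size and shift are hypothetical. Reporting an effect size, here simply the difference in medians, next to the p-value shows how small the detected difference is.

```matlab
rng default
N  = 1e5;                          % hypothetical, very large samples
x1 = randn(N,1);
x2 = randn(N,1) + 0.03;            % assumed tiny shift: 3% of a standard deviation

p = ranksum(x1,x2);                % statistical significance
shift = median(x2) - median(x1);   % effect size on the original scale

fprintf('p = %.3g, median difference = %.3f\n', p, shift);
```

Whether a shift of that size matters is a question about the application, not something the test can answer.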

I don't believe there is a conceptual upper limit to sample size. Just a computer memory limit. :-)


Answer 3

Blanca Larraga on 28 Nov 2018

Direct link to this answer

https://de.mathworks.com/matlabcentral/answers/349607-does-the-matlab-ranksum-function-work-for-larger-sample-sizes#answer_349407

I am using ranksum with two samples of 200 elements, and I get a p-value that does not make any sense. If I make a boxplot, I can clearly see that there is no difference between the two samples, even though I get h=1 and a really small p-value. Is there any other function I should use for this purpose?

2 Comments

Elisa Iovene on 23 Mar 2021

Hello Blanca, I'm having the same problem. Have you solved it? It would be really useful. Thanks!

the cyclist on 26 Mar 2021

Even though the topic here is relevant, you will not usually get a response to a question and comment that are 3 years old. (I happened to see it quite by accident.)

I never noticed @Blanca Larraga's note here. But it is pretty easy to create two samples whose ranksum() result is statistically significant but that don't look very different to the eye (if you have enough data points). In the code below, you can see the difference, but barely. And I wasn't even trying very hard. :-)

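rng default                 % reset the random number generator for reproducibility
N = 200;
x1 = randn(N,1);            % standard normal sample
x2 = x1 + 0.21;             % the same values shifted by 0.21
[p,h] = ranksum(x1,x2)

p = 0.0475
h = logical
   1

figure
boxplot([x1 x2])            % side-by-side boxplots of the two samples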

Can you post your data, and give more details? If you tag me with @, I'll try to take a look.
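One way to put a number on "barely different" is a rank-based effect size. The sketch below is only an illustration, not part of ranksum's output: it rebuilds the two samples from the example above and estimates the probability that a value from x1 exceeds a value from x2, using pooled ranks from tiedrank. A value near 0.5 means the samples overlap almost completely, even when the p-value falls below 0.05.

```matlab
rng default
N  = 200;
x1 = randn(N,1);
x2 = x1 + 0.21;                      % same construction as in the example above

r = tiedrank([x1; x2]);              % ranks of the pooled data
W = sum(r(1:N));                     % rank sum of x1
U = W - N*(N+1)/2;                   % Mann-Whitney U: count of pairs with x1 > x2
A = U/(N*N);                         % estimated P(x1 > x2); 0.5 means no effect

fprintf('P(x1 > x2) is about %.2f\n', A);
```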
