# Using chi2gof to test two distributions

Allie on 6 Feb 2019
Edited: Sim on 14 Aug 2024
I want to use chi2gof to test whether two distributions come from a common distribution (null hypothesis) or not (alternative hypothesis). I have binned observational data (x), binned model data (y), and the bin edges (bins). Both the observational and the model data are counts per bin.
x= [41 22 11 10 9 5 2 3 2]
y= [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665]
bins=[0:9:81]
Because the data is already binned and because I'm testing x against y, I used the following code
[h,p,stat]=chi2gof(x,'Edges',bins,'Expected',y)
Manual calculation of the chi2 test statistic gives 4.6861 with a probability of p = 0.7905. The function above, however, produces a very different result: the resulting stats show bin edges different from the ones I specified, the observed counts per bin do not match x, the chi2 test statistic is ~87, and p < 0.001. Could someone please explain why I'm getting such dramatically different results?
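For reference, the manual calculation described above can be reproduced in a few lines. This is a minimal sketch that assumes the usual Pearson statistic, sum((O-E).^2./E), with degrees of freedom equal to the number of bins minus 1 (no parameters estimated from the data). It uses the base MATLAB function gammainc for the upper tail so that it does not depend on chi2gof; the variable names chi2stat, df, and p are only illustrative.

```matlab
x = [41 22 11 10 9 5 2 3 2];   % observed counts per bin
y = [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665]; % expected counts per bin

chi2stat = sum((x - y).^2 ./ y);   % Pearson chi-square statistic
df = numel(x) - 1;                 % assumption: no fitted parameters

% Upper-tail chi-square probability via the regularized incomplete gamma function
p = gammainc(chi2stat/2, df/2, 'upper');

fprintf('chi2 = %.4f, df = %d, p = %.4f\n', chi2stat, df, p);
```

This should land close to the chi2 of about 4.69 and the p of about 0.79 quoted in the question.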

### Accepted Answer

Jeff Miller on 7 Feb 2019
Sorry, the x's really do have to be the data values. Try this:
bins=[0:9:81]
xvals = bins(1:end-1)+4.5; % Here are some fake data values that belong in each bin.
xcounts= [41 22 11 10 9 5 2 3 2] % These are the counts of the data values in each bin.
y= [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665];
[h,p,stat]=chi2gof(xvals,'Edges',bins,'Expected',y,'Frequency',xcounts,'EMin',1)
This will give you your 4.68. By default, chi2gof pools bins whose expected count is less than 5; setting 'EMin' to 1 lowers that threshold, so none of your bins get pooled.
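To see what the default pooling does to these numbers, the sketch below merges the last four bins by hand. That grouping is the one implied by the pooled edges [0 9 18 27 36 45 81] in the stat output posted under the other answer. It only illustrates the effect and is not a reproduction of chi2gof's internal pooling rule; the names O, E, chi2Full, and chi2Pooled are made up for the example.

```matlab
xcounts = [41 22 11 10 9 5 2 3 2];   % observed counts per bin
y = [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665]; % expected counts per bin

% All nine bins kept separate (the case that gives the 4.68 with 'EMin',1)
chi2Full = sum((xcounts - y).^2 ./ y);

% Hand-pooled version: bins 6 to 9 all have expected counts below 5, so merge them
O = [xcounts(1:5), sum(xcounts(6:9))];
E = [y(1:5), sum(y(6:9))];
chi2Pooled = sum((O - E).^2 ./ E);

fprintf('unpooled chi2 = %.4f (9 bins)\n', chi2Full);
fprintf('pooled   chi2 = %.4f (6 bins)\n', chi2Pooled);
```

With six bins instead of nine, both the statistic and the degrees of freedom change, which is why the default settings would not match a nine-bin manual calculation.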
##### 2 Comments
Allie on 7 Feb 2019
This worked! Thank you
Sim on 29 Jul 2024


### More Answers (2)

Jeff Miller on 6 Feb 2019
It looks like chi2gof expects the values in x to be the actual, original scores, not the bin counts. Try adding 'Frequency',x to the parameter list.
##### 1 Comment
Allie on 7 Feb 2019
Edited: Allie on 7 Feb 2019
This did not work. The stat output is below. As you can see, it changed the edges and expected values from what I originally input and the chi2stat became even bigger.
stat =
chi2stat: 234.4383
df: 5
edges: [0 9 18 27 36 45 81]
O: [12 30 22 0 41 0]
E: [38.0520 24.2655 15.4665 9.8595 6.2895 11.0670]
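The O values above can be explained by hand: with 'Frequency',x added, the nine counts are still read as data values, and each value is additionally weighted by itself. The sketch below bins x into the pooled edges from the output and sums the weights per bin. It is a reconstruction from the posted numbers, not chi2gof's actual code, and idx, O, and E are illustrative names.

```matlab
x = [41 22 11 10 9 5 2 3 2];
y = [38.052 24.2655 15.4665 9.8595 6.2895 4.011 2.562 1.6275 2.8665];
edges = [0 9 18 27 36 45 81];   % pooled edges copied from the stat output

% Bin index of each value: count how many left edges are <= the value
% (relies on implicit expansion, available in R2016b and later)
idx = sum(x(:) >= edges(1:end-1), 2);

% Treat x as the data values AND as their own frequencies
O = accumarray(idx, x(:), [numel(edges)-1, 1])';

% Expected counts with the last four bins merged, as in the stat output
E = [y(1:5), sum(y(6:9))];

chi2stat = sum((O - E).^2 ./ E);
disp(O);
fprintf('chi2 = %.4f\n', chi2stat);
```

This gives O = [12 30 22 0 41 0] and a statistic of about 234.44, matching the output above. The data values and the counts therefore need to be passed as two separate inputs.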


Sim on 14 Aug 2024
Edited: Sim on 14 Aug 2024
Shouldn't you use the two-sample chi-square test?
The chi-squared test needs binned data. However, as far as I understand, you need to pass the raw data, not the binned data, as inputs to CHI2TEST2.
Indeed, CHI2TEST2 places the raw data into bins:
bins = unique([x1(:,1); x2(:,1)]); % create a bin for each unique value