fishertest

Fisher’s exact test

Description

h = fishertest(x) returns a test decision for Fisher’s exact test of the null hypothesis that there are no nonrandom associations between the two categorical variables in x, against the alternative that there is a nonrandom association. The result h is 1 if the test rejects the null hypothesis at the 5% significance level, or 0 otherwise.

[h,p,stats] = fishertest(x) also returns the p-value p of the test and the structure stats containing additional test results, including the odds ratio and its asymptotic confidence interval.

[___] = fishertest(x,Name,Value) returns a test decision using additional options specified by one or more name-value pair arguments. For example, you can change the significance level of the test or conduct a one-sided test.

Examples

In a small survey, a researcher asked 17 individuals if they received a flu shot this year, and whether they caught the flu this winter. The results indicate that, of the nine people who did not receive a flu shot, three got the flu and six did not. Of the eight people who received a flu shot, one got the flu and seven did not.

Create a 2-by-2 contingency table containing the survey data. Row 1 contains data for the individuals who did not receive a flu shot, and row 2 contains data for the individuals who received a flu shot. Column 1 contains the number of individuals who got the flu, and column 2 contains the number of individuals who did not.

x = table([3;1],[6;7],'VariableNames',{'Flu','NoFlu'},'RowNames',{'NoShot','Shot'})
x=2×2 table
              Flu    NoFlu
              ___    _____

    NoShot     3       6  
    Shot       1       7  

Use Fisher's exact test to determine if there is a nonrandom association between receiving a flu shot and getting the flu.

h = fishertest(x)
h = logical
   0

The returned test decision h = 0 indicates that fishertest does not reject the null hypothesis of no nonrandom association between the categorical variables at the default 5% significance level. Therefore, the test results do not provide sufficient evidence that individuals who do not get a flu shot have different odds of getting the flu than those who got the flu shot.

In a small survey, a researcher asked 17 individuals if they received a flu shot this year, and whether they caught the flu. The results indicate that, of the nine people who did not receive a flu shot, three got the flu and six did not. Of the eight people who received a flu shot, one got the flu and seven did not.

x = [3,6;1,7];

Use a right-tailed Fisher's exact test to determine if the odds of getting the flu are higher for individuals who did not receive a flu shot than for individuals who did. Conduct the test at the 1% significance level.

[h,p,stats] = fishertest(x,'Tail','right','Alpha',0.01)
h = logical
   0

p = 0.3353
stats = struct with fields:
             OddsRatio: 3.5000
    ConfidenceInterval: [0.1289 95.0408]

The returned test decision h = 0 indicates that fishertest does not reject the null hypothesis of no nonrandom association between the categorical variables at the 1% significance level. Since this is a right-tailed hypothesis test, the conclusion is that the data do not provide sufficient evidence that individuals who do not get a flu shot have greater odds of getting the flu than those who got the flu shot.
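The right-tailed p-value in this example can be checked by hand. The following is a minimal sketch, assuming the standard hypergeometric formulation of Fisher's exact test with fixed row and column totals; it is not taken from the fishertest implementation, and the variable names (r1, r2, c1, n, pk, pRight) are illustrative only.

```matlab
% Assumption: right-tailed p-value = probability of the observed table plus
% all tables with a larger count in cell (1,1), with the margins held fixed.
x  = [3 6; 1 7];
r1 = sum(x(1,:));          % row 1 total (no flu shot)
r2 = sum(x(2,:));          % row 2 total (flu shot)
c1 = sum(x(:,1));          % column 1 total (got the flu)
n  = sum(x(:));            % sample size

k  = x(1,1):min(r1,c1);    % observed count and every larger feasible count
pk = arrayfun(@(j) nchoosek(r1,j)*nchoosek(r2,c1-j)/nchoosek(n,c1), k);

pRight = sum(pk)           % hypergeometric upper-tail probability
```

Under this assumption the sum is 798/2380, which rounds to the p = 0.3353 reported by fishertest for the right-tailed test.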

Load the hospital data.

load hospital
hospital = dataset2table(hospital)
hospital=100×7 table
                 LastName       Sex      Age    Weight    Smoker    BloodPressure        Trials     
               ____________    ______    ___    ______    ______    _____________    _______________

    YPL-320    {'SMITH'   }    Male      38      176      true       124     93      {[         18]}
    GLI-532    {'JOHNSON' }    Male      43      163      false      109     77      {[   11 13 22]}
    PNI-258    {'WILLIAMS'}    Female    38      131      false      125     83      {1×0 double   }
    MIJ-579    {'JONES'   }    Female    40      133      false      117     75      {[       6 12]}
    XLK-030    {'BROWN'   }    Female    49      119      false      122     80      {[      14 23]}
    TFP-518    {'DAVIS'   }    Female    46      142      false      121     70      {[         19]}
    LPD-746    {'MILLER'  }    Female    33      142      true       130     88      {[         13]}
    ATA-945    {'WILSON'  }    Male      40      180      false      115     82      {1×0 double   }
    VNL-702    {'MOORE'   }    Male      28      183      false      115     78      {[          2]}
    LQW-768    {'TAYLOR'  }    Female    31      132      false      118     86      {[         11]}
    QFY-472    {'ANDERSON'}    Female    45      128      false      114     77      {[    8 10 14]}
    UJG-627    {'THOMAS'  }    Female    42      137      false      115     68      {[        4 9]}
    XUE-826    {'JACKSON' }    Male      25      174      false      127     74      {1×0 double   }
    TRW-072    {'WHITE'   }    Male      39      202      true       130     95      {[          8]}
    ELG-976    {'HARRIS'  }    Female    36      129      false      114     79      {1×0 double   }
    KOQ-996    {'MARTIN'  }    Male      48      181      true       130     92      {[13 15 21 27]}
      ⋮

The hospital table contains data on 100 hospital patients, including last name, gender, age, weight, smoking status, and systolic and diastolic blood pressure measurements.

To determine if smoking status is independent of gender, use crosstab to create a 2-by-2 contingency table of smokers and nonsmokers, grouped by gender.

[tbl,chi2,p,labels] = crosstab(hospital.Sex,hospital.Smoker)
tbl = 2×2

    40    13
    26    21

chi2 = 4.5083
p = 0.0337
labels = 2×2 cell
    {'Female'}    {'0'}
    {'Male'  }    {'1'}

The rows of the resulting contingency table tbl correspond to the patient's gender, with row 1 containing data for females and row 2 containing data for males. The columns correspond to the patient's smoking status, with column 1 containing data for nonsmokers and column 2 containing data for smokers. The returned result chi2 = 4.5083 is the value of the chi-squared test statistic for a chi-squared test of independence. The returned value p = 0.0337 is an approximate p-value based on the chi-squared distribution.
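For reference, the chi-squared statistic and its approximate p-value can be reproduced directly from the contingency table. This is a sketch that assumes the usual Pearson statistic without continuity correction; the names expected, chi2stat, and pChi are illustrative, and chi2cdf requires Statistics and Machine Learning Toolbox.

```matlab
% Assumption: Pearson chi-squared test of independence, no continuity correction.
tbl = [40 13; 26 21];
n   = sum(tbl(:));

% Expected counts under independence: (row total * column total) / n
expected = sum(tbl,2) * sum(tbl,1) / n;

chi2stat = sum((tbl(:) - expected(:)).^2 ./ expected(:))

% A 2-by-2 table has (2-1)*(2-1) = 1 degree of freedom
pChi = chi2cdf(chi2stat,1,'upper')
```

The two values agree with the chi2 and p outputs of crosstab to the displayed precision.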

Use the contingency table generated by crosstab to perform Fisher's exact test on the data.

[h,p,stats] = fishertest(tbl)
h = logical
   1

p = 0.0375
stats = struct with fields:
             OddsRatio: 2.4852
    ConfidenceInterval: [1.0624 5.8135]

The result h = 1 indicates that fishertest rejects the null hypothesis of no association between smoking status and gender at the 5% significance level. In other words, the data provide evidence of an association between gender and smoking status. The odds ratio indicates that the male patients have about 2.5 times the odds of being smokers compared with the female patients.

The returned p-value of the test, p = 0.0375, is close to, but not exactly the same as, the result obtained by crosstab. This is because fishertest computes an exact p-value using the sample data, while crosstab uses a chi-squared approximation to compute the p-value.
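The odds ratio and its confidence interval in this example can also be reproduced from the table. The sketch below assumes that the asymptotic interval is the usual Wald interval on the log odds ratio; this page does not state the formula, so treat that as an assumption that happens to be consistent with the displayed output. The names oddsRatio, se, z, and ci are illustrative.

```matlab
% Assumption: Wald interval, exp(log(OR) +/- z * SE), where
% SE = sqrt(1/a + 1/b + 1/c + 1/d) for the four cell counts.
tbl   = [40 13; 26 21];
alpha = 0.05;

% Cross-product ratio of the 2-by-2 table
oddsRatio = (tbl(1,1)*tbl(2,2)) / (tbl(1,2)*tbl(2,1))

se = sqrt(sum(1 ./ tbl(:)));     % standard error of log(oddsRatio)
z  = norminv(1 - alpha/2);       % standard normal critical value

ci = exp(log(oddsRatio) + [-1 1]*z*se)
```

Because the interval excludes 1, it points the same way as the test decision h = 1 at the 5% level.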

Input Arguments

Contingency table x, specified as a 2-by-2 matrix or table containing nonnegative integer values. A contingency table contains the frequency distribution of the variables in the sample data. You can use crosstab to generate a contingency table from sample data.

Example: [4,0;0,4]

Data Types: single | double

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: 'Alpha',0.01,'Tail','right' specifies a right-tailed hypothesis test at the 1% significance level.
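As an illustration of the two calling conventions described above, the following sketch passes the same options in both forms. The first call assumes R2021a or later; the output names h1, p1, h2, and p2 are illustrative.

```matlab
x = [3 6; 1 7];

% Name=Value syntax (R2021a and later)
[h1,p1] = fishertest(x,Alpha=0.01,Tail='right');

% Comma-separated syntax with quoted names (works in earlier releases too)
[h2,p2] = fishertest(x,'Alpha',0.01,'Tail','right');

% Both calls specify the same test, so the results should match
isequal(h1,h2) && isequal(p1,p2)
```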

Significance level of the hypothesis test, specified as the comma-separated pair consisting of 'Alpha' and a scalar value in the range (0,1). The default is 0.05, which corresponds to the 5% significance level.

Example: 'Alpha',0.01

Data Types: single | double

Type of alternative hypothesis, specified as the comma-separated pair consisting of 'Tail' and one of the following.

'both': Two-tailed test. The alternative hypothesis is that there is a nonrandom association between the two variables in x, and the odds ratio is not equal to 1.
'right': Right-tailed test. The alternative hypothesis is that the odds ratio is greater than 1.
'left': Left-tailed test. The alternative hypothesis is that the odds ratio is less than 1.

Example: 'Tail','right'
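To compare the three alternatives on the same data, a short loop such as the following can be used. This is only a sketch; the names tails and pvals are illustrative, and the data is the flu-shot table from the examples.

```matlab
x     = [3 6; 1 7];
tails = {'both','right','left'};
pvals = zeros(1,numel(tails));

% Run the test once per alternative hypothesis and keep the p-value
for i = 1:numel(tails)
    [~,pvals(i)] = fishertest(x,'Tail',tails{i});
end

disp(table(tails',pvals','VariableNames',{'Tail','pValue'}))
```

The p-value does not depend on Alpha, so the 'right' entry equals the value p = 0.3353 from the one-sided example.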

Output Arguments

Hypothesis test result h, returned as a logical value.

  • If h is 1, then fishertest rejects the null hypothesis at the Alpha significance level.

  • If h is 0, then fishertest fails to reject the null hypothesis at the Alpha significance level.
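The dependence of h on Alpha can be seen by running the same table at two significance levels. The sketch below reuses the smoker-by-gender contingency table from the examples; h05 and h01 are illustrative names.

```matlab
tbl = [40 13; 26 21];

[h05,p] = fishertest(tbl);            % default 5% significance level
h01     = fishertest(tbl,'Alpha',0.01);   % stricter 1% significance level

fprintf('p = %.4f, h at 5%% = %d, h at 1%% = %d\n', p, h05, h01)
```

With p = 0.0375, the test rejects the null hypothesis at the 5% level but not at the 1% level.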

p-value of the test, returned as a scalar value in the range [0,1]. p is the probability of observing a test statistic as extreme as, or more extreme than, the observed value under the null hypothesis. Small values of p cast doubt on the validity of the null hypothesis.

Test data stats, returned as a structure with the following fields:

  • OddsRatio — A measure of association between the two variables.

  • ConfidenceInterval — Asymptotic confidence interval for the odds ratio. If any of the cell frequencies in x are 0, then fishertest does not compute a confidence interval and instead displays [-Inf Inf].
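The zero-cell behavior described above can be observed with the example table [4,0;0,4] from the input argument description. This is a small sketch; it relies only on the documented behavior that the confidence interval is reported as [-Inf Inf] when a cell frequency is 0.

```matlab
% Two of the four cell frequencies are zero
x = [4 0; 0 4];

[h,p,stats] = fishertest(x);

% Per the description above, no finite interval is computed in this case
stats.ConfidenceInterval
```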

Version History

Introduced in R2014b
