coefTest

Linear hypothesis test on multinomial regression model coefficients

Since R2023a

Syntax

p = coefTest(mdl)

p = coefTest(mdl,H)

p = coefTest(mdl,H,C)

[p,F] = coefTest(___)

[p,F,r] = coefTest(___)

Description

p = coefTest(mdl) computes the p-value for an F-test that all coefficient estimates in mdl, except for the intercept terms, are zero.

p = coefTest(mdl,H) performs an F-test that H × B = 0, where B represents the coefficient vector. Use H to specify the coefficients to include in the F-test.

p = coefTest(mdl,H,C) performs an F-test that H × B = C.

[p,F] = coefTest(___) also returns the F-test statistic F using any of the input argument combinations in previous syntaxes.

[p,F,r] = coefTest(___) also returns the numerator degrees of freedom r for the test.
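The three-argument syntax can be sketched as follows. This is a minimal illustration, not an example from the original page: the model formula, the choice of the second coefficient, and the hypothesized value of -10 are arbitrary assumptions made only to show how H and C fit together.

```matlab
% Fit a small nominal model on the iris data (illustrative formula).
load fisheriris
tbl = array2table(meas, ...
    VariableNames=["SepalLength","SepalWidth","PetalLength","PetalWidth"]);
tbl.Species = species;
mdl = fitmnr(tbl,"Species ~ PetalLength + PetalWidth");

% Total number of coefficients s, taken from the coefficient table.
s = height(mdl.Coefficients);

% One-row hypothesis matrix that selects the second coefficient
% in the display order.
H = zeros(1,s);
H(2) = 1;

% Hypothesized value (arbitrary, for illustration only).
C = -10;

% Test H*B = C and also return the F-statistic and numerator df.
[p,F,r] = coefTest(mdl,H,C)
```

Because H has one row, C is a scalar; in general C needs one element per row of H.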

Examples

Test Significance of Multinomial Regression Model

Load the fisheriris data set.

load fisheriris

The column vector species contains iris flowers of three different species: setosa, versicolor, and virginica. The matrix meas contains four types of measurements for the flowers: the length and width of sepals and petals in centimeters.

Create a table from the iris measurements and species data by using the array2table function.

tbl = array2table(meas,...
    VariableNames=["SepalLength","SepalWidth","PetalLength","PetalWidth"]);
tbl.Species = species;

Fit a multinomial regression model using the petal measurements as the predictor data and the species as the response data.

mdl = fitmnr(tbl,"Species ~ PetalLength + PetalWidth^2")

mdl = 
Multinomial regression with nominal responses

                                Value       SE       tStat       pValue  
                               _______    ______    _______    __________

    (Intercept_setosa)           136.9    12.587     10.876    1.4933e-27
    PetalLength_setosa         -17.351    7.0021     -2.478      0.013211
    PetalWidth_setosa          -77.383     24.06    -3.2163     0.0012987
    PetalWidth^2_setosa        -24.719    8.3324    -2.9666     0.0030111
    (Intercept_versicolor)      8.2731    14.489      0.571         0.568
    PetalLength_versicolor     -5.7089    2.0638    -2.7662     0.0056709
    PetalWidth_versicolor       35.208     21.97     1.6026       0.10903
    PetalWidth^2_versicolor    -14.041    7.1653    -1.9596      0.050037


150 observations, 292 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 309.3988, p-value = 7.9151e-64

mdl is a multinomial regression model object that contains the results of fitting a nominal multinomial regression model to the data. The chi-squared statistic and p-value correspond to the null hypothesis that the fitted model does not outperform a degenerate model consisting of only an intercept term. The small p-value indicates that enough evidence exists to reject the null hypothesis and conclude that the fitted model outperforms the degenerate model.

Perform an F-test to test the null hypothesis that all coefficients, except the intercept terms, are zero. Interpret the resulting p-value at the 5% significance level.

p = coefTest(mdl)

p = 
3.5512e-133

The small p-value in the output indicates that enough evidence exists to reject the null hypothesis that all coefficients, except the intercept terms, are zero. At the 5% significance level, you can conclude that at least one of the fitted model coefficients is statistically significant.
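As a rough cross-check, you can write the same kind of test with an explicit hypothesis matrix. The sketch below assumes that the one-argument syntax corresponds to an H that selects every non-intercept coefficient; that equivalence is an assumption, not something stated on this page, so treat any agreement between the two p-values as something to verify rather than a guarantee.

```matlab
% Refit the model from this example.
load fisheriris
tbl = array2table(meas, ...
    VariableNames=["SepalLength","SepalWidth","PetalLength","PetalWidth"]);
tbl.Species = species;
mdl = fitmnr(tbl,"Species ~ PetalLength + PetalWidth^2");

% The coefficient display lists 8 coefficients, with intercepts in
% positions 1 and 5. Keep the rows of the identity matrix for the
% remaining positions.
I = eye(8);
H = I([2 3 4 6 7 8],:);

% Default test and explicit-H test, side by side.
pDefault = coefTest(mdl)
[pH,FH,rH] = coefTest(mdl,H)
```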

Test Significance of Multinomial Model Coefficient

Load the carsmall data set.

load carsmall

The variables Acceleration, Weight, and Model_Year contain data for car acceleration, weight, and model year, respectively. The variable MPG contains car mileage data in miles per gallon (MPG).

Sort the data in MPG into four response categories by using the discretize function.

MPG = discretize(MPG,[9 19 29 39 48]);
tbl = table(MPG,Acceleration,Weight,Model_Year);

Fit a multinomial regression model of the car mileage as a function of the acceleration, weight, and model year.

mdl = fitmnr(tbl,"MPG ~ Acceleration + Model_Year + Weight",CategoricalPredictors="Model_Year")

mdl = 
Multinomial regression with nominal responses

                        Value         SE         tStat       pValue   
                       ________    _________    _______    ___________

    (Intercept_1)        154.38       15.697      9.835     7.9576e-23
    Acceleration_1       -11.31      0.53323     -21.21    7.7405e-100
    Weight_1           0.098347    0.0034745     28.306    2.9244e-176
    Model_Year_76_1      182.33       4.5868      39.75              0
    Model_Year_82_1     -1690.4       4.6231    -365.64              0
    (Intercept_2)        177.87       14.211     12.516     6.0891e-36
    Acceleration_2       -11.28      0.48884    -23.076    8.1522e-118
    Weight_2           0.090009    0.0030349     29.658    2.6661e-193
    Model_Year_76_2      187.19       4.2373     44.176              0
    Model_Year_82_2      -136.5       3.4781    -39.244              0
    (Intercept_3)        103.66       14.991     6.9146     4.6928e-12
    Acceleration_3      -11.359      0.48805    -23.274    8.2157e-120
    Weight_3           0.080071    0.0033652     23.794    3.8879e-125
    Model_Year_76_3      283.31       4.7309     59.885              0
    Model_Year_82_3     -34.727       4.0878    -8.4953     1.9743e-17


94 observations, 267 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 169.6193, p-value = 5.7114e-30

mdl is a multinomial regression model object that contains the results of fitting a nominal multinomial regression model to the data. By default, the fourth response category is the reference category. Each row of the table output corresponds to the coefficient of the model term in the first column. The tStat and pValue columns contain the t-statistics and p-values, respectively, for the null hypothesis that the corresponding coefficient is zero. The small p-values for the Model_Year terms indicate that the model year has a statistically significant effect in mdl. For example, the p-value for the term Model_Year_76_2 indicates that a car being manufactured in 1976 has a statistically significant effect on $\ln\left(\frac{\pi_{2}}{\pi_{4}}\right)$, where $\pi_{i}$ is the probability of the $i$th category.

You can use a numeric index matrix to investigate whether a group of coefficients contains a coefficient that is statistically significant. Use a numeric index matrix to test the null hypothesis that all coefficients corresponding to the Model_Year terms are zero.

idx_Model_Year = [0 0 0 1 0 0 0 0 0 0 0 0 0 0 0;...
                  0 0 0 0 1 0 0 0 0 0 0 0 0 0 0;...
                  0 0 0 0 0 0 0 0 1 0 0 0 0 0 0;...
                  0 0 0 0 0 0 0 0 0 1 0 0 0 0 0;...
                  0 0 0 0 0 0 0 0 0 0 0 0 0 1 0;...
                  0 0 0 0 0 0 0 0 0 0 0 0 0 0 1];

[p_Model_Year,F_Model_Year,r_Model_Year] = coefTest(mdl,idx_Model_Year)

p_Model_Year = 
0

F_Model_Year = 
4.8985e+04

r_Model_Year = 
6

The returned p-value indicates that at least one of the category coefficients corresponding to Model_Year is statistically different from zero. This result is consistent with the small p-value for each of the Model_Year coefficients.
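Typing a 6-by-15 index matrix by hand is error-prone, so one option is to build it from the coefficient names. The sketch below assumes that the row names of mdl.Coefficients match the names shown in the model display (for example, Model_Year_76_1); check the row names in your release before relying on this pattern.

```matlab
% Refit the model from this example.
load carsmall
MPG = discretize(MPG,[9 19 29 39 48]);
tbl = table(MPG,Acceleration,Weight,Model_Year);
mdl = fitmnr(tbl,"MPG ~ Acceleration + Model_Year + Weight", ...
    CategoricalPredictors="Model_Year");

% Assumption: the coefficient table carries the displayed names as row names.
names = string(mdl.Coefficients.Properties.RowNames);

% Logical mask marking the coefficients that belong to Model_Year terms.
isYear = startsWith(names,"Model_Year");

% Each selected row of the identity matrix picks out one coefficient.
I = eye(numel(names));
H = I(isYear,:);

[p,F,r] = coefTest(mdl,H)
```

Under that assumption, H has the same six rows as idx_Model_Year above.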

Input Arguments

`mdl` — Multinomial regression model object
`MultinomialRegression` model object

Multinomial regression model object, specified as a MultinomialRegression model object created with the fitmnr function.

`H` — Hypothesis matrix
numeric index matrix | logical matrix

Hypothesis matrix, specified as a full-rank numeric index matrix of size r-by-s, where r is the number of linear combinations of coefficients being tested, and s is the total number of coefficients.

If you specify H, then the output p is the p-value for an F-test that H × B = 0, where B represents the coefficient vector.
If you specify H and C, then the output p is the p-value for an F-test that H × B = C.

Example: [1 0 0 0 0] tests the first coefficient among five coefficients.

Data Types: single | double | logical
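Rows of H are not limited to selecting single coefficients; a row can also encode a linear combination, such as the difference between two coefficients. The following sketch uses a hypothetical coefficient vector B (made-up values, not from a fitted model) only to show what H × B evaluates to and how to check the full-rank requirement.

```matlab
% Hypothetical coefficient vector with five coefficients.
B = [2; 0.5; 0.5; -1; 3];

% Row 1 selects the first coefficient.
% Row 2 forms the contrast (second coefficient) - (third coefficient).
H = [1 0  0 0 0; ...
     0 1 -1 0 0];

% H must have full row rank: one independent hypothesis per row.
isFullRank = rank(H) == size(H,1)

% Left side of the null hypothesis H*B = 0 for this B.
HB = H*B
```

Here the second element of HB is zero because the second and third entries of B are equal, which is the situation the contrast row is designed to test for.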

`C` — Hypothesized value
numeric vector

Hypothesized value for testing the null hypothesis, specified as a numeric vector with the same number of rows as H.

If you specify H and C, then the output p is the p-value for an F-test that H × B = C, where B represents the coefficient vector.

Data Types: single | double

Output Arguments

`p` — p-value for F-test
numeric value in the range [0,1]

p-value for the F-test, returned as a numeric value in the range [0,1].

`F` — Value of test statistic for F-test
numeric value

Value of the test statistic for the F-test, returned as a numeric value.

`r` — Numerator degrees of freedom for F-test
positive integer

Numerator degrees of freedom for the F-test, returned as a positive integer. The F-statistic has r degrees of freedom in the numerator and mdl.DFE degrees of freedom in the denominator.
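Given these degrees of freedom, the p-value can be read as the upper-tail probability of an F distribution. The sketch below plugs in the values printed in the Model_Year example (F = 4.8985e+04, r = 6, and 267 error degrees of freedom) and uses fcdf from Statistics and Machine Learning Toolbox. That coefTest computes its p-value in exactly this way is an assumption based on the description above.

```matlab
% Values copied from the Model_Year example output on this page.
F   = 4.8985e+04;  % F-statistic
r   = 6;           % numerator degrees of freedom
dfe = 267;         % error (denominator) degrees of freedom

% Upper-tail probability of F(r,dfe) beyond the observed statistic.
p = fcdf(F,r,dfe,'upper')
```

With a fitted model in the workspace, mdl.DFE would take the place of the hard-coded 267.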

Version History

Introduced in R2023a
