# predict

Predict responses of generalized linear regression model

## Syntax

`ypred = predict(mdl,Xnew)`
`[ypred,yci] = predict(mdl,Xnew)`
`[ypred,yci] = predict(mdl,Xnew,Name,Value)`

## Description

`ypred = predict(mdl,Xnew)` returns the predicted response values of the generalized linear regression model `mdl` to the points in `Xnew`.

`[ypred,yci] = predict(mdl,Xnew)` also returns confidence intervals for the responses at `Xnew`.

`[ypred,yci] = predict(mdl,Xnew,Name,Value)` specifies additional options using one or more name-value pair arguments. For example, you can specify the confidence level of the confidence interval.

## Examples

Create a generalized linear regression model, and predict its response to new data.

Generate sample data using Poisson random numbers with two underlying predictors `X(:,1)` and `X(:,2)`.

```
rng('default') % For reproducibility
rndvars = randn(100,2);
X = [2 + rndvars(:,1),rndvars(:,2)];
mu = exp(1 + X*[1;2]);
y = poissrnd(mu);
```

Create a generalized linear regression model of Poisson data.

`mdl = fitglm(X,y,'y ~ x1 + x2','Distribution','poisson');`

Create data points for prediction.

```
[Xtest1,Xtest2] = meshgrid(-1:.5:3,-2:.5:2);
Xnew = [Xtest1(:),Xtest2(:)];
```

Predict responses at the data points.

`ypred = predict(mdl,Xnew);`

Plot the predictions.

`surf(Xtest1,Xtest2,reshape(ypred,9,9))`

Fit a generalized linear regression model, and then save the model by using `saveLearnerForCoder`. Define an entry-point function that loads the model by using `loadLearnerForCoder` and calls the `predict` function of the fitted model. Then use `codegen` to generate C/C++ code. Note that generating C/C++ code requires MATLAB® Coder™.

This example briefly explains the code generation workflow for the prediction of generalized linear regression models at the command line. For more details, see Code Generation for Prediction of Machine Learning Model at Command Line. You can also generate code using the MATLAB Coder app. For details, see Code Generation for Prediction of Machine Learning Model Using MATLAB Coder App.

Train Model

Generate sample data using Poisson random numbers with two underlying predictors `X(:,1)` and `X(:,2)`.

```
rng('default') % For reproducibility
rndvars = randn(100,2);
X = [2 + rndvars(:,1),rndvars(:,2)];
mu = exp(1 + X*[1;2]);
y = poissrnd(mu);
```

Create a generalized linear regression model. Specify the Poisson distribution for the response.

`mdl = fitglm(X,y,'y ~ x1 + x2','Distribution','poisson');`

Save Model

Save the fitted generalized linear regression model to the file `GLMMdl.mat` by using `saveLearnerForCoder`.

`saveLearnerForCoder(mdl,'GLMMdl');`

Define Entry-Point Function

In your current folder, define an entry-point function named `mypredictGLM.m` that does the following:

• Accept new predictor input and valid name-value pair arguments.

• Load the fitted generalized linear regression model in `GLMMdl.mat` by using `loadLearnerForCoder`.

• Return predictions and confidence interval bounds.

```
function [yhat,ci] = mypredictGLM(x,varargin) %#codegen
%MYPREDICTGLM Predict responses using GLM model
%   MYPREDICTGLM predicts responses for the n observations in the rows of
%   the predictor matrix x using the GLM model stored in the MAT-file
%   GLMMdl.mat, and then returns the predictions in the n-by-1 vector yhat.
%   MYPREDICTGLM also returns confidence interval bounds for the
%   predictions in the n-by-2 matrix ci.
CompactMdl = loadLearnerForCoder('GLMMdl');
narginchk(1,Inf);
[yhat,ci] = predict(CompactMdl,x,varargin{:});
end
```

Add the `%#codegen` compiler directive (or pragma) to the entry-point function after the function signature to indicate that you intend to generate code for the MATLAB algorithm. Adding this directive instructs the MATLAB Code Analyzer to help you diagnose and fix violations that would result in errors during code generation.

Generate Code

Generate code for the entry-point function using `codegen`. Because C and C++ are statically typed languages, you must determine the properties of all variables in the entry-point function at compile time. To specify the data type and exact input array size, pass a MATLAB® expression that represents the set of values with a certain data type and array size. Use `coder.Constant` for the names of name-value pair arguments.

Create points for prediction.

```
[Xtest1,Xtest2] = meshgrid(-1:.5:3,-2:.5:2);
Xnew = [Xtest1(:),Xtest2(:)];
```

Generate code and specify returning 90% simultaneous confidence intervals on the predictions.

`codegen mypredictGLM -args {Xnew,coder.Constant('Alpha'),0.1,coder.Constant('Simultaneous'),true}`

`codegen` generates the MEX function `mypredictGLM_mex` with a platform-dependent extension.

If the number of observations is unknown at compile time, you can also specify the input as variable-size by using `coder.typeof`. For details, see Specify Variable-Size Arguments for Code Generation and Specify Properties of Entry-Point Function Inputs (MATLAB Coder).
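The variable-size case can be sketched as follows, assuming `mypredictGLM.m` and `GLMMdl.mat` from the previous steps are in the current folder and MATLAB Coder is installed. The variable name `XnewType` is illustrative and not part of the original example.

```matlab
% Describe a double matrix with exactly 2 columns and an unbounded,
% variable number of rows (only the first dimension is variable-size).
XnewType = coder.typeof(0,[Inf,2],[1,0]);

% Generate code that accepts any number of observations.
codegen mypredictGLM -args {XnewType,coder.Constant('Alpha'),0.1,coder.Constant('Simultaneous'),true}
```

The resulting MEX function should then accept predictor matrices with any number of rows, as long as they have two columns.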

Verify Generated Code

Compare predictions and confidence intervals using `predict` and `mypredictGLM_mex`. Specify name-value pair arguments in the same order as in the `-args` argument in the call to `codegen`.

```
[yhat1,ci1] = predict(mdl,Xnew,'Alpha',0.1,'Simultaneous',true);
[yhat2,ci2] = mypredictGLM_mex(Xnew,'Alpha',0.1,'Simultaneous',true);
```

The returned values from `mypredictGLM_mex` might include round-off differences compared to the values from `predict`. In this case, compare the values allowing a small tolerance.

`find(abs(yhat1-yhat2) > 1e-6)`

```
ans =

  0x1 empty double column vector
```

`find(abs(ci1-ci2) > 1e-6)`

```
ans =

  0x1 empty double column vector
```

The comparison confirms that the returned values are equal within the tolerance `1e-6`.

## Input Arguments

Generalized linear regression model, specified as a `GeneralizedLinearModel` object created using `fitglm` or `stepwiseglm`, or a `CompactGeneralizedLinearModel` object created using `compact`.

New predictor input values, specified as a table, dataset array, or matrix. Each row of `Xnew` corresponds to one observation, and each column corresponds to one variable.

• If `Xnew` is a table or dataset array, it must contain predictors that have the same predictor names as in the `PredictorNames` property of `mdl`.

• If `Xnew` is a matrix, it must have the same number of variables (columns) in the same order as the predictor input used to create `mdl`. Note that `Xnew` must also contain any predictor variables that are not used as predictors in the fitted model. Also, all variables used in creating `mdl` must be numeric. To treat numerical predictors as categorical, identify the predictors using the `'CategoricalVars'` name-value pair argument when you create `mdl`.

Data Types: `single` | `double` | `table`
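As a minimal sketch of the table case, the following code fits a Poisson model to a table and predicts from a new table whose variable names match the predictors. The variable names `Dose`, `Age`, and `Count` and the simulated coefficients are invented for illustration.

```matlab
rng('default') % For reproducibility

% Hypothetical predictors and a simulated Poisson response
Dose = randn(50,1);
Age = randn(50,1);
Count = poissrnd(exp(0.2 + 0.5*Dose - 0.3*Age));
tbl = table(Dose,Age,Count);

mdl = fitglm(tbl,'Count ~ Dose + Age','Distribution','poisson');

% New data: a table with the same predictor names as mdl.PredictorNames
newTbl = table([0;1],[0.5;-0.5],'VariableNames',{'Dose','Age'});
ypredTbl = predict(mdl,newTbl);
```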

### Name-Value Pair Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

Example: `[ypred,yci] = predict(mdl,Xnew,'Alpha',0.01,'Simultaneous',true)` returns the confidence interval `yci` with a 99% confidence level, computed simultaneously for all predictor values.

Significance level for the confidence interval, specified as the comma-separated pair consisting of `'Alpha'` and a numeric value in the range [0,1]. The confidence level of `yci` is equal to 100(1 – `Alpha`)%. `Alpha` is the probability that the confidence interval does not contain the true value.

Example: `'Alpha',0.01`

Data Types: `single` | `double`
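To illustrate the effect of `'Alpha'`, the sketch below refits the Poisson model from the Examples section and requests intervals at two significance levels. It assumes that the default significance level is 0.05 (95% intervals); the names `yci95`, `yci99`, `width95`, and `width99` are illustrative.

```matlab
rng('default') % For reproducibility
rndvars = randn(100,2);
X = [2 + rndvars(:,1),rndvars(:,2)];
y = poissrnd(exp(1 + X*[1;2]));
mdl = fitglm(X,y,'y ~ x1 + x2','Distribution','poisson');

Xnew = [1 0; 2 0.5; 3 -1];
[ypred,yci95] = predict(mdl,Xnew);           % default significance level (assumed 0.05)
[~,yci99] = predict(mdl,Xnew,'Alpha',0.01);  % 99% confidence level

% Interval widths, one per row of Xnew
width95 = diff(yci95,1,2);
width99 = diff(yci99,1,2);
```

A smaller `Alpha` demands higher confidence, so `width99` should exceed `width95` in every row.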

Number of trials for the binomial distribution, specified as the comma-separated pair consisting of `'BinomialSize'` and a scalar or vector of the same length as the response. `predict` expands the scalar input into a constant array of the same size as the response. The scalar input means that all observations have the same number of trials.

The meaning of the output values in `ypred` depends on the value of `'BinomialSize'`.

• If `'BinomialSize'` is 1 (default), then each value in the output `ypred` is the probability of success.

• If `'BinomialSize'` is not 1, then each value in the output `ypred` is the predicted number of successes in the trials.

Data Types: `single` | `double`
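The two interpretations of `ypred` can be sketched with a small logistic model. The data below are simulated, and the names `prob` and `count` are illustrative.

```matlab
rng('default') % For reproducibility

% Simulated binary response with one predictor
x = randn(100,1);
p = 1./(1 + exp(-(0.5 + x)));
y = binornd(1,p);
mdl = fitglm(x,y,'Distribution','binomial');

xnew = [-1;0;1];
prob = predict(mdl,xnew);                     % 'BinomialSize' is 1: success probabilities
count = predict(mdl,xnew,'BinomialSize',10);  % expected successes in 10 trials
```

Under these assumptions, `count` should equal `10*prob` up to round-off.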

Offset value for each row in `Xnew`, specified as the comma-separated pair consisting of `'Offset'` and a scalar or vector with the same length as the response. `predict` expands the scalar input into a constant array of the same size as the response.

Note that the default value of this argument is a vector of zeros even if you specify the `'Offset'` name-value pair argument when fitting a model. If you specify `'Offset'` for fitting, the software treats the offset as an additional predictor with a coefficient value fixed at 1. In other words, the formula for fitting is

`f(μ) = Offset + X*b`,

where f is the link function, μ is the mean response, and X*b is the linear combination of predictors X. The `Offset` predictor has coefficient `1`.

Data Types: `single` | `double`
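The following sketch shows why the offset usually needs to be passed again at prediction time. It simulates Poisson counts with a hypothetical exposure variable, fits the model with `log(exposure)` as the offset, and then predicts with and without an offset. The names `exposure`, `rate`, and `cnt` are illustrative.

```matlab
rng('default') % For reproducibility

% Simulated counts whose mean is proportional to a known exposure
x = randn(100,1);
exposure = randi([1 10],100,1);
y = poissrnd(exposure .* exp(0.5 + 0.3*x));

% Fit with log(exposure) as the offset (coefficient fixed at 1)
mdl = fitglm(x,y,'Distribution','poisson','Offset',log(exposure));

xnew = [0;1];
rate = predict(mdl,xnew);                 % offset defaults to 0: mean per unit exposure
cnt = predict(mdl,xnew,'Offset',log(5));  % scalar offset expanded: mean for exposure 5
```

Because the Poisson model uses a log link by default, adding `log(5)` to the linear predictor multiplies the predicted mean by 5, so `cnt` should equal `5*rate`.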

Flag to compute simultaneous confidence bounds, specified as the comma-separated pair consisting of `'Simultaneous'` and either true or false.

• `true`: `predict` computes confidence bounds for the curve of response values corresponding to all predictor values in `Xnew`, using Scheffe's method. The range between the upper and lower bounds contains the curve consisting of true response values with 100(1 – α)% confidence.

• `false`: `predict` computes confidence bounds for the response value at each observation in `Xnew`. The confidence interval for a response value at a specific predictor value contains the true response value with 100(1 – α)% confidence.

Simultaneous bounds are wider than separate bounds, because requiring the entire curve of response values to be within the bounds is stricter than requiring the response value at a single predictor value to be within the bounds.

Example: `'Simultaneous',true`
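A short sketch of the comparison, reusing the Poisson model from the Examples section. The grid of prediction points and the names `ciSep`, `ciSim`, `widthSep`, and `widthSim` are illustrative.

```matlab
rng('default') % For reproducibility
rndvars = randn(100,2);
X = [2 + rndvars(:,1),rndvars(:,2)];
y = poissrnd(exp(1 + X*[1;2]));
mdl = fitglm(X,y,'y ~ x1 + x2','Distribution','poisson');

% Prediction points along the first predictor, second predictor held at 0
Xnew = [linspace(0,4,5)',zeros(5,1)];

[~,ciSep] = predict(mdl,Xnew,'Simultaneous',false);  % pointwise bounds
[~,ciSim] = predict(mdl,Xnew,'Simultaneous',true);   % bounds for the whole curve

widthSep = diff(ciSep,1,2);
widthSim = diff(ciSim,1,2);
```

Each entry of `widthSim` should be at least as large as the corresponding entry of `widthSep`.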

## Output Arguments

Predicted response values at `Xnew`, returned as a numeric vector.

For a binomial model, the meaning of the output values in `ypred` depends on the value of the `'BinomialSize'` name-value pair argument.

• If `'BinomialSize'` is 1 (default), then each value in the output `ypred` is the probability of success.

• If `'BinomialSize'` is not 1, then each value in the output `ypred` is the predicted number of successes in the trials.

For a model with an offset, specify the offset value by using the `'Offset'` name-value pair argument. Otherwise, `predict` uses `0` as the offset value.

Confidence intervals for the responses, returned as a two-column matrix with each row providing one interval. The meaning of the confidence interval depends on the settings of the name-value pair arguments `'Alpha'` and `'Simultaneous'`.

## Alternative Functionality

• `feval` returns the same predictions as `predict`. The `feval` function does not support the `'Offset'` and `'BinomialSize'` name-value pair arguments. `feval` uses 0 as the offset value, and the output values in `ypred` are predicted probabilities. The `feval` function can take multiple input arguments for new predictor input values, with one input for each predictor variable, which is simpler to use with a model created from a table or dataset array. Note that the `feval` function does not give confidence intervals on its predictions.

• `random` returns predictions with added noise.
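
The two alternatives can be compared with `predict` in a short sketch that reuses the Poisson model from the Examples section. The names `yfeval` and `ysim` are illustrative.

```matlab
rng('default') % For reproducibility
rndvars = randn(100,2);
X = [2 + rndvars(:,1),rndvars(:,2)];
y = poissrnd(exp(1 + X*[1;2]));
mdl = fitglm(X,y,'y ~ x1 + x2','Distribution','poisson');

Xnew = [1 0; 2 0.5; 3 -1];
ypred = predict(mdl,Xnew);                % predicted mean responses
yfeval = feval(mdl,Xnew(:,1),Xnew(:,2));  % same predictions, one input per predictor
ysim = random(mdl,Xnew);                  % simulated Poisson responses around the mean
```

Here `yfeval` should match `ypred`, while `ysim` varies from run to run unless the random number generator is seeded.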