estimate

Fit vector error-correction (VEC) model to data

Syntax

``EstMdl = estimate(Mdl,Y)``
``[EstMdl,EstSE,logL,E] = estimate(Mdl,Y)``
``EstMdl = estimate(Mdl,Tbl1)``
``[EstMdl,EstSE,logL,Tbl2] = estimate(Mdl,Tbl1)``
``[___] = estimate(___,Name,Value)``

Description

`EstMdl = estimate(Mdl,Y)` returns a fully specified VEC(p – 1) model. This model stores the estimated parameter values resulting from fitting the VEC(p – 1) model `Mdl` to all variables (columns) of the matrix of observed multivariate response series `Y` using maximum likelihood.

`[EstMdl,EstSE,logL,E] = estimate(Mdl,Y)` also returns the estimated asymptotic standard errors of the estimated parameters `EstSE`, the optimized loglikelihood objective function value `logL`, and the multivariate residuals `E`.

`EstMdl = estimate(Mdl,Tbl1)` fits the VEC(p – 1) model `Mdl` to variables in the input table or timetable `Tbl1`, which contains time series data, and returns the fully specified, estimated VEC(p – 1) model `EstMdl`. `estimate` selects the variables in `Mdl.SeriesNames` or all variables in `Tbl1`. To select different variables in `Tbl1` to fit the model to, use the `ResponseVariables` name-value argument. (since R2022b)

`[EstMdl,EstSE,logL,Tbl2] = estimate(Mdl,Tbl1)` also returns the estimated asymptotic standard errors of the estimated parameters `EstSE`, the optimized loglikelihood objective function value `logL`, and the table or timetable `Tbl2` of all variables in `Tbl1` and residuals corresponding to the response variables to which the model is fit (`ResponseVariables`). (since R2022b)

`[___] = estimate(___,Name,Value)` specifies options using one or more name-value arguments in addition to any of the input argument combinations in previous syntaxes. `estimate` returns the output argument combination for the corresponding input arguments. For example, `estimate(Mdl,Y,Model="H1*",X=Exo)` fits the VEC(p – 1) model `Mdl` to the matrix of response data `Y`, and specifies the H1* Johansen form of the deterministic terms and the matrix of exogenous predictor data `Exo`.

Supply all input data using the same data type. Specifically:

• If you specify the numeric matrix `Y`, optional data sets must be numeric arrays and you must use the appropriate name-value argument. For example, to specify a presample, set the `Y0` name-value argument to a numeric matrix of presample data.

• If you specify the table or timetable `Tbl1`, optional data sets must be tables or timetables, respectively, and you must use the appropriate name-value argument. For example, to specify a presample, set the `Presample` name-value argument to a table or timetable of presample data.
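The pairing of data types described above can be sketched as follows. This is a minimal illustration, assuming the `Data_USEconVECModel` data set used in the examples on this page and R2022b or later for the timetable route; the variable names `TT`, `EstMdlMat`, and `EstMdlTT` are chosen here for illustration only.

```matlab
% Sketch: supply the presample in the same data type as the response data.
load Data_USEconVECModel                      % loads the timetable FRED
logVars = ["GDP" "GDPDEF" "COE" "HOANBS" "PCEC" "GPDI"];
FRED{:,logVars} = 100*log(FRED{:,logVars});   % same preprocessing as the examples

Mdl = vecm(7,4,1);
Mdl.SeriesNames = FRED.Properties.VariableNames;
p = Mdl.P;                                    % number of presample rows required

% Numeric route: matrix response data with a matrix presample (Y0).
Y = FRED.Variables;
EstMdlMat = estimate(Mdl,Y((p+1):end,:),Y0=Y(1:p,:));

% Timetable route: timetable data with a timetable presample (Presample).
TT = FRED;
TT.Time = dateshift(TT.Time,"start","quarter");   % make the time step regular
EstMdlTT = estimate(Mdl,TT((p+1):end,:),Presample=TT(1:p,:));
```

Both calls use the same observations, so the two estimated models should agree; mixing types (for example, a timetable `Tbl1` with a numeric `Y0`) is not supported.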

Examples

Fit VEC(1) Model to Matrix of Response Data

Fit a VEC(1) model to seven macroeconomic series. Supply the response data as a numeric matrix.

Consider a VEC model for the following macroeconomic series:

• Gross domestic product (GDP)

• GDP implicit price deflator

• Paid compensation of employees

• Nonfarm business sector hours of all persons

• Effective federal funds rate

• Personal consumption expenditures

• Gross private domestic investment

Suppose that a cointegrating rank of 4 and one short-run term are appropriate, that is, consider a VEC(1) model.

Load the `Data_USEconVECModel` data set.

`load Data_USEconVECModel`

For more information on the data set and variables, enter `Description` at the command line.

Determine whether the data needs to be preprocessed by plotting the series on separate plots.

```
figure
tiledlayout(2,2)
nexttile
plot(FRED.Time,FRED.GDP);
title("Gross Domestic Product");
ylabel("Index");
xlabel("Date");
nexttile
plot(FRED.Time,FRED.GDPDEF);
title("GDP Deflator");
ylabel("Index");
xlabel("Date");
nexttile
plot(FRED.Time,FRED.COE);
title("Paid Compensation of Employees");
ylabel("Billions of $");
xlabel("Date");
nexttile
plot(FRED.Time,FRED.HOANBS);
title("Nonfarm Business Sector Hours");
ylabel("Index");
xlabel("Date");
```

```
figure
tiledlayout(2,2)
nexttile
plot(FRED.Time,FRED.FEDFUNDS)
title("Federal Funds Rate")
ylabel("Percent")
xlabel("Date")
nexttile
plot(FRED.Time,FRED.PCEC)
title("Consumption Expenditures")
ylabel("Billions of $")
xlabel("Date")
nexttile
plot(FRED.Time,FRED.GPDI)
title("Gross Private Domestic Investment")
ylabel("Billions of $")
xlabel("Date")
```

Stabilize all series, except the federal funds rate, by applying the log transform. Scale the resulting series by 100 so that all series are on the same scale.

```
FRED.GDP = 100*log(FRED.GDP);
FRED.GDPDEF = 100*log(FRED.GDPDEF);
FRED.COE = 100*log(FRED.COE);
FRED.HOANBS = 100*log(FRED.HOANBS);
FRED.PCEC = 100*log(FRED.PCEC);
FRED.GPDI = 100*log(FRED.GPDI);
```
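The same transform can be written in one vectorized assignment. The following is a sketch rather than the page's own code; `logVars` and `FREDlog` are names introduced here. Scaling by 100 means that first differences of the transformed series are approximately percent changes, which puts them on a scale comparable to the federal funds rate.

```matlab
load Data_USEconVECModel                                   % loads the timetable FRED
logVars = ["GDP" "GDPDEF" "COE" "HOANBS" "PCEC" "GPDI"];   % every series except FEDFUNDS
FREDlog = FRED;                                            % keep the raw data intact
FREDlog{:,logVars} = 100*log(FRED{:,logVars});             % transform all six columns at once
```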

Create a VEC(1) model using the shorthand syntax. Specify the variable names.

```
Mdl = vecm(7,4,1);
Mdl.SeriesNames = FRED.Properties.VariableNames
```
```
Mdl = 
  vecm with properties:

             Description: "7-Dimensional Rank = 4 VEC(1) Model with Linear Time Trend"
             SeriesNames: "GDP"  "GDPDEF"  "COE"  ... and 4 more
               NumSeries: 7
                    Rank: 4
                       P: 2
                Constant: [7×1 vector of NaNs]
              Adjustment: [7×4 matrix of NaNs]
           Cointegration: [7×4 matrix of NaNs]
                  Impact: [7×7 matrix of NaNs]
   CointegrationConstant: [4×1 vector of NaNs]
      CointegrationTrend: [4×1 vector of NaNs]
                ShortRun: {7×7 matrix of NaNs} at lag [1]
                   Trend: [7×1 vector of NaNs]
                    Beta: [7×0 matrix]
              Covariance: [7×7 matrix of NaNs]
```

`Mdl` is a `vecm` model object. All properties containing `NaN` values correspond to parameters to be estimated given data.

Estimate the model using the entire data set and the default options.

`EstMdl = estimate(Mdl,FRED.Variables)`
```
EstMdl = 
  vecm with properties:

             Description: "7-Dimensional Rank = 4 VEC(1) Model"
             SeriesNames: "GDP"  "GDPDEF"  "COE"  ... and 4 more
               NumSeries: 7
                    Rank: 4
                       P: 2
                Constant: [14.1329 8.77841 -7.20359 ... and 4 more]'
              Adjustment: [7×4 matrix]
           Cointegration: [7×4 matrix]
                  Impact: [7×7 matrix]
   CointegrationConstant: [-28.6082 109.555 -77.0912 ... and 1 more]'
      CointegrationTrend: [4×1 vector of zeros]
                ShortRun: {7×7 matrix} at lag [1]
                   Trend: [7×1 vector of zeros]
                    Beta: [7×0 matrix]
              Covariance: [7×7 matrix]
```

`EstMdl` is an estimated `vecm` model object. It is fully specified because all parameters have known values. By default, `estimate` imposes the constraints of the H1 Johansen VEC model form by removing the cointegrating trend and linear trend terms from the model. Parameter exclusion from estimation is equivalent to imposing equality constraints to zero.

Display a short summary from the estimation.

`results = summarize(EstMdl)`
```
results = struct with fields:
               Description: "7-Dimensional Rank = 4 VEC(1) Model"
                     Model: "H1"
                SampleSize: 238
    NumEstimatedParameters: 112
             LogLikelihood: -1.4939e+03
                       AIC: 3.2118e+03
                       BIC: 3.6007e+03
                     Table: [133x4 table]
                Covariance: [7x7 double]
               Correlation: [7x7 double]
```

The `Table` field of `results` is a table of parameter estimates and corresponding statistics.

Consider the model and data in Fit VEC(1) Model to Matrix of Response Data, and suppose that the estimation sample starts at Q1 of 1980.

Load the `Data_USEconVECModel` data set and preprocess the data.

```
load Data_USEconVECModel
FRED.GDP = 100*log(FRED.GDP);
FRED.GDPDEF = 100*log(FRED.GDPDEF);
FRED.COE = 100*log(FRED.COE);
FRED.HOANBS = 100*log(FRED.HOANBS);
FRED.PCEC = 100*log(FRED.PCEC);
FRED.GPDI = 100*log(FRED.GPDI);
```

Identify the index corresponding to the start of the estimation sample.

`estIdx = FRED.Time(2:end) > '1979-12-31';`

Create a default VEC(1) model using the shorthand syntax. Assume that the appropriate cointegration rank is 4. Specify the variable names.

```
Mdl = vecm(7,4,1);
Mdl.SeriesNames = FRED.Properties.VariableNames;
```

Estimate the model using the estimation sample. Specify all observations before the estimation sample as presample data. Also, specify estimation of the H Johansen form of the VEC model, which includes all deterministic parameters.

```
Y0 = FRED{~estIdx,:};
EstMdl = estimate(Mdl,FRED{estIdx,:},'Y0',Y0,'Model',"H")
```
```
EstMdl = 
  vecm with properties:

             Description: "7-Dimensional Rank = 4 VEC(1) Model with Linear Time Trend"
             SeriesNames: "GDP"  "GDPDEF"  "COE"  ... and 4 more
               NumSeries: 7
                    Rank: 4
                       P: 2
                Constant: [17.5698 3.74759 -20.1998 ... and 4 more]'
              Adjustment: [7×4 matrix]
           Cointegration: [7×4 matrix]
                  Impact: [7×7 matrix]
   CointegrationConstant: [85.4825 -57.3569 -81.7344 ... and 1 more]'
      CointegrationTrend: [-0.0264185 -0.00275396 -0.0249583 ... and 1 more]'
                ShortRun: {7×7 matrix} at lag [1]
                   Trend: [0.000514564 -0.000291183 0.00179965 ... and 4 more]'
                    Beta: [7×0 matrix]
              Covariance: [7×7 matrix]
```

Because the VEC model order p is 2, `estimate` uses only the last two observations (rows) in `Y0` as a presample.
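One way to check that only the latest `Mdl.P` presample rows matter is to refit with a trimmed presample and compare. This is a sketch under the assumptions of this example; the full-length logical index `isEst` and the names `EstFull`, `EstTrim`, and `maxDiff` are introduced here for illustration.

```matlab
load Data_USEconVECModel
logVars = ["GDP" "GDPDEF" "COE" "HOANBS" "PCEC" "GPDI"];
FRED{:,logVars} = 100*log(FRED{:,logVars});

isEst = FRED.Time > datetime(1979,12,31);   % full-length index of the estimation sample
Y  = FRED{isEst,:};                         % estimation sample
Y0 = FRED{~isEst,:};                        % everything before it

Mdl = vecm(7,4,1);
EstFull = estimate(Mdl,Y,'Y0',Y0,'Model',"H");                        % long presample
EstTrim = estimate(Mdl,Y,'Y0',Y0((end-Mdl.P+1):end,:),'Model',"H");   % last Mdl.P rows only

% The two fits should coincide because the extra presample rows are ignored.
maxDiff = max(abs(EstFull.Constant - EstTrim.Constant))
```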

Since R2022b

Fit a VEC(1) model to seven macroeconomic series. Supply a timetable of data and specify the series for the fit. This example is based on Fit VEC(1) Model to Matrix of Response Data.

Load the `Data_USEconVECModel` data set.

```
load Data_USEconVECModel
head(FRED)
```
```
       Time         GDP     GDPDEF     COE     HOANBS    FEDFUNDS    PCEC     GPDI
    ___________    _____    ______    _____    ______    ________    _____    ____

    31-Mar-1957    470.6    16.485    260.6    54.756      2.96      282.3    77.7
    30-Jun-1957    472.8    16.601    262.5    54.639         3      284.6    77.9
    30-Sep-1957    480.3    16.701    265.1    54.375      3.47      289.2    79.3
    31-Dec-1957    475.7    16.711    263.7    53.249      2.98      290.8      71
    31-Mar-1958    468.4    16.892    260.2    52.043       1.2      290.3    66.7
    30-Jun-1958    472.8     16.94    259.9    51.297      0.93      293.2    65.1
    30-Sep-1958    486.7    17.043    267.7    51.908      1.76      298.3      72
    31-Dec-1958    500.4    17.123    272.7    52.683      2.42      302.2      80
```

Stabilize all series, except the federal funds rate, by applying the log transform. Scale the resulting series by 100 so that all series are on the same scale.

```
FRED.GDP = 100*log(FRED.GDP);
FRED.GDPDEF = 100*log(FRED.GDPDEF);
FRED.COE = 100*log(FRED.COE);
FRED.HOANBS = 100*log(FRED.HOANBS);
FRED.PCEC = 100*log(FRED.PCEC);
FRED.GPDI = 100*log(FRED.GPDI);
numobs = height(FRED)
```
```
numobs = 240
```

Prepare Timetable for Estimation

When you plan to supply a timetable directly to `estimate`, you must ensure that it has all of the following characteristics:

• All selected response variables are numeric and do not contain any missing values.

• The timestamps in the `Time` variable are regular, and they are ascending or descending.

Remove all missing values from the table.

```
DTT = rmmissing(FRED);
numobs = height(DTT)
```
```
numobs = 240
```

`DTT` does not contain any missing values.

Determine whether the sampling timestamps have a regular frequency and are sorted.

`areTimestampsRegular = isregular(DTT,"quarters")`
```
areTimestampsRegular = logical
   0
```
`areTimestampsSorted = issorted(DTT.Time)`
```
areTimestampsSorted = logical
   1
```

`areTimestampsRegular = 0` indicates that the timestamps of `DTT` are irregular. `areTimestampsSorted = 1` indicates that the timestamps are sorted. The macroeconomic series in this example are timestamped at the end of the last month of each quarter, so the day of the month varies between 30 and 31. This property makes the series irregular with respect to calendar quarters.

Remedy the time irregularity by shifting all dates to the first day of the quarter.

```
dt = DTT.Time;
dt = dateshift(dt,"start","quarter");
DTT.Time = dt;
areTimestampsRegular = isregular(DTT,"quarters")
```
```
areTimestampsRegular = logical
   1
```

`DTT` is regular with respect to time.
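The effect of the shift can be reproduced on a small synthetic time vector, without the data set. This is a sketch; `tStart` and `tEnd` are hypothetical names, and the eight dates are built to mirror the first eight timestamps shown by `head(FRED)`.

```matlab
tStart = datetime(1957,1,1) + calquarters(0:7)';   % first day of eight consecutive quarters
tEnd   = dateshift(tStart,"end","quarter");        % last day of the same quarters

startIsRegular = isregular(tStart,"quarters")      % start-of-quarter stamps
endIsRegular   = isregular(tEnd,"quarters")        % end-of-quarter stamps (day 30 or 31)
gapInDays      = days(diff(tEnd))'                 % gaps in days differ between quarters
```

Shifting to the start of the quarter changes only the timestamps, not the observations, so the fit is unaffected.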

Create Model Template for Estimation

Create a VEC(1) model using the shorthand syntax. Specify the variable names.

```
Mdl = vecm(7,4,1);
Mdl.SeriesNames = FRED.Properties.VariableNames
```
```
Mdl = 
  vecm with properties:

             Description: "7-Dimensional Rank = 4 VEC(1) Model with Linear Time Trend"
             SeriesNames: "GDP"  "GDPDEF"  "COE"  ... and 4 more
               NumSeries: 7
                    Rank: 4
                       P: 2
                Constant: [7×1 vector of NaNs]
              Adjustment: [7×4 matrix of NaNs]
           Cointegration: [7×4 matrix of NaNs]
                  Impact: [7×7 matrix of NaNs]
   CointegrationConstant: [4×1 vector of NaNs]
      CointegrationTrend: [4×1 vector of NaNs]
                ShortRun: {7×7 matrix of NaNs} at lag [1]
                   Trend: [7×1 vector of NaNs]
                    Beta: [7×0 matrix]
              Covariance: [7×7 matrix of NaNs]
```

Fit Model to Data

Estimate the model. Pass the entire timetable `DTT`. By default, `estimate` selects the response variables in `Mdl.SeriesNames` to fit to the model. Alternatively, you can use the `ResponseVariables` name-value argument.

Return the timetable of residuals and data fit to the model.

`[EstMdl,~,~,Tbl2] = estimate(Mdl,DTT);`

`EstMdl` is an estimated `vecm` model object. It is fully specified because all parameters have known values.

Display the head of the table `Tbl2`.

`head(Tbl2)`
```
       Time         GDP      GDPDEF     COE      HOANBS    FEDFUNDS     PCEC      GPDI     GDP_Residuals    GDPDEF_Residuals    COE_Residuals    HOANBS_Residuals    FEDFUNDS_Residuals    PCEC_Residuals    GPDI_Residuals
    ___________    ______    ______    ______    ______    ________    ______    ______    _____________    ________________    _____________    ________________    __________________    ______________    ______________

    01-Jul-1957    617.44    281.55    558.01    399.59      3.47      566.71    437.32        0.12076           0.090979          -0.31114          -0.47341            -0.013177             0.14899            1.1764
    01-Oct-1957    616.48    281.61    557.48     397.5      2.98      567.26    426.27        -2.4005           -0.39287           -2.1158           -2.1552             -0.86464            -0.89017           -12.289
    01-Jan-1958    614.93    282.68    556.15    395.21       1.2      567.09    420.02        -2.0142            0.92195           -1.5874           -1.1852              -1.3247            -0.72797           -4.4964
    01-Apr-1958    615.87    282.97    556.03    393.76      0.93      568.09    417.59         0.2131           -0.39586          -0.22658         -0.070487             -0.24993             0.17697          -0.31486
    01-Jul-1958    618.76    283.57    558.99    394.95      1.76      569.81    427.67         2.0866            0.45876            2.4738            1.9098              0.98197              1.0195             9.119
    01-Oct-1958    621.54    284.04    560.84    396.43      2.42      571.11     438.2        0.68671           0.053454           0.48556           0.63518              0.23659            -0.21548            4.2428
    01-Jan-1959    623.66    284.31    563.55    398.35       2.8      573.62    442.12        0.39546          -0.066055           0.97292            1.0224            -0.054929             0.86153           0.68805
    01-Apr-1959    626.19    284.46    565.91    400.24      3.39      575.54    449.31        0.24314           -0.22217           0.33889            0.4216             -0.20457             0.26963          -0.15985
```

Because `Mdl.P` is `2`, estimation requires two presample observations. Consequently, `estimate` uses the first two rows (first two quarters of 1957) of `DTT` as a presample, fits the model to the remaining observations, and returns only those observations used in estimation in `Tbl2`.
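The row accounting described above can be checked directly. The following sketch repeats the preparation steps of this example so that it runs on its own; `rowsDropped` and `resid` are names introduced here.

```matlab
load Data_USEconVECModel
logVars = ["GDP" "GDPDEF" "COE" "HOANBS" "PCEC" "GPDI"];
FRED{:,logVars} = 100*log(FRED{:,logVars});

DTT = FRED;
DTT.Time = dateshift(DTT.Time,"start","quarter");   % regular quarterly time step

Mdl = vecm(7,4,1);
Mdl.SeriesNames = DTT.Properties.VariableNames;
[~,~,~,Tbl2] = estimate(Mdl,DTT);

rowsDropped = height(DTT) - height(Tbl2)            % rows consumed by the default presample
isResid = endsWith(Tbl2.Properties.VariableNames,"_Residuals");
resid = Tbl2{:,isResid};                            % residuals as a numeric matrix
size(resid)
```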

Plot the residuals.

```
varnames = Tbl2.Properties.VariableNames;
resnames = varnames(contains(Tbl2.Properties.VariableNames,"_Residuals"));

figure
tiledlayout(3,3)
for j = 1:7
    nexttile
    plot(Tbl2.Time,Tbl2{:,resnames(j)})
    title(resnames(j),Interpreter="none")
    grid on
end
```

Consider the model and data in Fit VEC(1) Model to Matrix of Response Data.

Load the `Data_USEconVECModel` data set and preprocess the data.

```
load Data_USEconVECModel
FRED.GDP = 100*log(FRED.GDP);
FRED.GDPDEF = 100*log(FRED.GDPDEF);
FRED.COE = 100*log(FRED.COE);
FRED.HOANBS = 100*log(FRED.HOANBS);
FRED.PCEC = 100*log(FRED.PCEC);
FRED.GPDI = 100*log(FRED.GPDI);
```

The `Data_Recessions` data set contains the beginning and ending serial dates of recessions. Load this data set. Convert the matrix of date serial numbers to a datetime array.

```
load Data_Recessions
dtrec = datetime(Recessions,'ConvertFrom','datenum');
```

Create a dummy variable that identifies periods in which the U.S. was in a recession or worse. Specifically, the variable should be `1` if `FRED.Time` occurs during a recession, and `0` otherwise.

```
isin = @(x)(any(dtrec(:,1) <= x & x <= dtrec(:,2)));
isrecession = double(arrayfun(isin,FRED.Time));
```
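To see what the anonymous function does, it can be applied to a few dates with a small set of recession windows. The windows and test dates below are hypothetical values chosen for illustration and are not taken from `Data_Recessions`.

```matlab
% Hypothetical recession windows: column 1 is the start, column 2 is the end.
dtrec = [datetime(2001,3,1)  datetime(2001,11,30)
         datetime(2007,12,1) datetime(2009,6,30)];

% Four test dates: before, inside the first window, inside the second, after.
t = [datetime(2000,12,31); datetime(2001,6,30); datetime(2008,9,30); datetime(2010,3,31)];

isin = @(x)(any(dtrec(:,1) <= x & x <= dtrec(:,2)));   % true if x lies in any window
flag = double(arrayfun(isin,t))                         % returns [0; 1; 1; 0]
```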

Create a VEC(1) model using the shorthand syntax. Assume that the appropriate cointegration rank is 4. You do not have to specify the presence of a regression component when creating the model. Specify the variable names.

```
Mdl = vecm(7,4,1);
Mdl.SeriesNames = FRED.Properties.VariableNames;
```

Estimate the model using the entire sample. Specify the predictor identifying whether the observation was measured during a recession. Return the standard errors.

`[EstMdl,EstSE] = estimate(Mdl,FRED.Variables,'X',isrecession);`

Display the regression coefficient for each equation and the corresponding standard errors.

`EstMdl.Beta`
```
ans = 7×1

   -1.1975
   -0.0187
   -0.7530
   -0.7094
   -0.5932
   -0.6835
   -4.4839
```
`EstSE.Beta`
```
ans = 7×1

    0.1547
    0.0581
    0.1507
    0.1278
    0.2471
    0.1311
    0.7150
```

`EstMdl.Beta` and `EstSE.Beta` are 7-by-1 vectors. Rows correspond to response variables in `EstMdl.SeriesNames` and columns correspond to predictors.

To check whether the effects of recessions are significant, obtain summary statistics from `summarize`, and then display the results for `Beta`.

```
results = summarize(EstMdl);
isbeta = contains(results.Table.Properties.RowNames,'Beta');
betaresults = results.Table(isbeta,:)
```
```
betaresults=7×4 table
                   Value      StandardError    TStatistic      PValue  
                 _________    _____________    __________    __________

    Beta(1,1)      -1.1975       0.15469         -7.7411     9.8569e-15
    Beta(2,1)    -0.018738       0.05806        -0.32273         0.7469
    Beta(3,1)     -0.75305       0.15071         -4.9966     5.8341e-07
    Beta(4,1)     -0.70936       0.12776         -5.5521     2.8221e-08
    Beta(5,1)      -0.5932       0.24712         -2.4004       0.016377
    Beta(6,1)     -0.68353       0.13107         -5.2151      1.837e-07
    Beta(7,1)      -4.4839         0.715         -6.2712     3.5822e-10
```
`whichsig = EstMdl.SeriesNames(betaresults.PValue < 0.05)`
```
whichsig = 1x6 string
    "GDP"    "COE"    "HOANBS"    "FEDFUNDS"    "PCEC"    "GPDI"
```

At the 5% level, all series except `GDPDEF` appear to have a significant recession effect.
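The t statistics in the table are the estimates divided by their standard errors. The sketch below recomputes them from the rounded values printed above, and computes two-sided p-values under the assumption of a standard normal reference distribution, which appears consistent with the printed `PValue` column; that assumption is not stated on this page.

```matlab
% Values copied from the betaresults table above (rounded as displayed).
beta = [-1.1975 -0.018738 -0.75305 -0.70936 -0.5932 -0.68353 -4.4839]';
se   = [ 0.15469 0.05806   0.15071  0.12776  0.24712 0.13107  0.715 ]';

tStat   = beta./se;                      % t statistic = estimate / standard error
pNormal = erfc(abs(tStat)/sqrt(2));      % two-sided p-value, standard normal assumption
numSignificant = nnz(pNormal < 0.05)     % number of equations significant at the 5% level
```

Because the inputs are rounded, the recomputed values match the table only to about three significant digits.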

Input Arguments

`Mdl`: VEC model containing unknown parameter values, specified as a `vecm` model object returned by `vecm`.

`NaN`-valued elements in properties indicate unknown, estimable parameters. Specified elements indicate equality constraints on parameters in model estimation. The innovations covariance matrix `Mdl.Covariance` cannot contain a mix of `NaN` values and real numbers; you must fully specify the covariance or it must be completely unknown (`NaN(Mdl.NumSeries)`).
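As a sketch of how such a template might be set up, the following code marks one short-run coefficient as an equality constraint and leaves the rest free. It assumes that the `ShortRun` and `Covariance` properties can be set with dot notation after creating the model; the small two-series model and the variable `Phi` are hypothetical and for illustration only.

```matlab
Mdl = vecm(2,1,1);                     % 2 series, cointegrating rank 1, one short-run lag
Phi = NaN(2);                          % NaN marks coefficients to estimate
Phi(1,2) = 0;                          % hold this short-run coefficient at zero
Mdl.ShortRun = {Phi};                  % one cell per short-run lag
Mdl.Covariance = NaN(Mdl.NumSeries);   % covariance completely unknown (no mixing allowed)

freeMask = isnan(Mdl.ShortRun{1})      % true where estimate would fit the coefficient
```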

`Y`: Observed multivariate response series to which `estimate` fits the model, specified as a `numobs`-by-`numseries` numeric matrix.

`numobs` is the sample size. `numseries` is the number of response variables (`Mdl.NumSeries`).

Rows correspond to observations, and the last row contains the latest observation.

Columns correspond to individual response variables.

`Y` represents the continuation of the presample response series in `Y0`.

Data Types: `double`

Since R2022b

`Tbl1`: Time series data to which `estimate` fits the model, specified as a table or timetable with `numvars` variables and `numobs` rows.

Each variable is a numeric vector representing a single path of `numobs` observations. You can optionally specify `numseries` response variables to fit to the model by using the `ResponseVariables` name-value argument, and you can specify `numpreds` predictor variables for the exogenous regression component by using the `PredictorVariables` name-value argument.

Each row is an observation, and measurements in each row occur simultaneously.

If `Tbl1` is a timetable, it must represent a sample with a regular datetime time step (see `isregular`), and the datetime vector `Tbl1.Time` must be ascending or descending.

If `Tbl1` is a table, the last row contains the latest observation.

Name-Value Arguments

Specify optional pairs of arguments as `Name1=Value1,...,NameN=ValueN`, where `Name` is the argument name and `Value` is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose `Name` in quotes.

Example: `estimate(Mdl,Y,Model="H1*",X=Exo)` fits the VEC(p – 1) model `Mdl` to the matrix of response data `Y`, and specifies the H1* Johansen form of the deterministic terms and the matrix of exogenous predictor data `Exo`.

Since R2022b

`ResponseVariables`: Variables to select from `Tbl1` to treat as response variables $y_t$, specified as one of the following data types:

• String vector or cell vector of character vectors containing `numseries` variable names in `Tbl1.Properties.VariableNames`

• A length `numseries` vector of unique indices (integers) of variables to select from `Tbl1.Properties.VariableNames`

• A length `numvars` logical vector, where ```ResponseVariables(j) = true``` selects variable `j` from `Tbl1.Properties.VariableNames`, and `sum(ResponseVariables)` is `numseries`

The selected variables must be numeric vectors and cannot contain missing values (`NaN`).

If the number of variables in `Tbl1` matches `Mdl.NumSeries`, the default specifies all variables in `Tbl1`. If the number of variables in `Tbl1` exceeds `Mdl.NumSeries`, the default matches variables in `Tbl1` to names in `Mdl.SeriesNames`.

Example: `ResponseVariables=["GDP" "CPI"]`

Example: `ResponseVariables=[true false true false]` or `ResponseVariables=[1 3]` selects the first and third table variables as the response variables.
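The three selector forms are interchangeable. The following sketch converts a selection by name into the equivalent index and logical forms using base MATLAB only; `varNames` stands in for `Tbl1.Properties.VariableNames` and lists the variables of the example timetable on this page.

```matlab
varNames = ["GDP" "GDPDEF" "COE" "HOANBS" "FEDFUNDS" "PCEC" "GPDI"];   % table variable names
byName = ["GDP" "PCEC"];                     % selection by name

[~,byIndex] = ismember(byName,varNames)      % selection by index: [1 6]
byLogical   = ismember(varNames,byName)      % selection by logical mask of length numvars
```

Any of `byName`, `byIndex`, or `byLogical` could then be passed as `ResponseVariables` together with a model whose `NumSeries` is 2.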

Data Types: `double` | `logical` | `char` | `cell` | `string`

`Y0`: Presample response observations to initialize the model for estimation, specified as a `numpreobs`-by-`numseries` numeric matrix. `numpreobs` is the number of presample observations. Use `Y0` only when you supply a matrix of response data `Y`.

Rows correspond to presample observations, and the last row contains the latest observation. `Y0` must have at least `Mdl.P` rows. If you supply more rows than necessary, `estimate` uses the latest `Mdl.P` observations only.

Columns must correspond to the `numseries` response variables in `Y`.

By default, `estimate` uses `Y(1:Mdl.P,:)` as presample observations, and then fits the model to ```Y((Mdl.P + 1):end,:)```. This action reduces the effective sample size.
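The default split can be made explicit, which also shows how the residual count depends on the presample. This is a sketch using the data set from the examples; the output names `EA`, `EB`, `logLA`, and `logLB` are introduced here.

```matlab
load Data_USEconVECModel
logVars = ["GDP" "GDPDEF" "COE" "HOANBS" "PCEC" "GPDI"];
FRED{:,logVars} = 100*log(FRED{:,logVars});
Y = FRED.Variables;

Mdl = vecm(7,4,1);
p = Mdl.P;

% Default: the first p rows of Y become the presample.
[~,~,logLA,EA] = estimate(Mdl,Y);
% The same split written out with the Y0 name-value argument.
[~,~,logLB,EB] = estimate(Mdl,Y((p+1):end,:),Y0=Y(1:p,:));

rowCounts = [size(EA,1) size(EB,1) size(Y,1)-p]   % all three should agree
logLGap   = abs(logLA - logLB)                    % should be zero or negligible
```

To fit the model to every row of `Y`, supply earlier observations separately in `Y0` instead.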

Data Types: `double`

Since R2022b

`Presample`: Presample data to initialize the model for estimation, specified as a table or timetable, the same type as `Tbl1`, with `numprevars` variables and `numpreobs` rows. Use `Presample` only when you supply a table or timetable of data `Tbl1`.

Each variable is a single path of `numpreobs` observations representing the presample of the corresponding variable in `Tbl1`.

Each row is a presample observation, and measurements in each row occur simultaneously. `numpreobs` must be at least `Mdl.P`. If you supply more rows than necessary, `estimate` uses the latest `Mdl.P` observations only.

If `Presample` is a timetable, all the following conditions must be true:

• `Presample` must represent a sample with a regular datetime time step (see `isregular`).

• The inputs `Tbl1` and `Presample` must be consistent in time such that `Presample` immediately precedes `Tbl1` with respect to the sampling frequency and order.

• The datetime vector of sample timestamps `Presample.Time` must be ascending or descending.

If `Presample` is a table, the last row contains the latest presample observation.

By default, `estimate` uses the first or earliest `Mdl.P` observations in `Tbl1` as a presample, and then it fits the model to the remaining `numobs – Mdl.P` observations. This action reduces the effective sample size.

Since R2022b

`PresampleResponseVariables`: Variables to select from `Presample` to use for presample data, specified as one of the following data types:

• String vector or cell vector of character vectors containing `numseries` variable names in `Presample.Properties.VariableNames`

• A length `numseries` vector of unique indices (integers) of variables to select from `Presample.Properties.VariableNames`

• A length `numprevars` logical vector, where ```PresampleResponseVariables(j) = true ``` selects variable `j` from `Presample.Properties.VariableNames`, and `sum(PresampleResponseVariables)` is `numseries`

The selected variables must be numeric vectors and cannot contain missing values (`NaN`).

`PresampleResponseVariables` does not need to contain the same names as in `Tbl1`; `estimate` uses the data in selected variable `PresampleResponseVariables(j)` as a presample for `ResponseVariables(j)`.

The default specifies the same response variables as those selected from `Tbl1`, see `ResponseVariables`.

Example: `PresampleResponseVariables=["GDP" "CPI"]`

Example: `PresampleResponseVariables=[true false true false]` or `PresampleResponseVariables=[1 3]` selects the first and third table variables for presample data.

Data Types: `double` | `logical` | `char` | `cell` | `string`

`X`: Predictor data for the regression component in the model, specified as a numeric matrix containing `numpreds` columns. Use `X` only when you supply a matrix of response data `Y`.

`numpreds` is the number of predictor variables.

Rows correspond to observations, and the last row contains the latest observation. `estimate` does not use the regression component in the presample period. `X` must have at least as many observations as are used after the presample period:

• If you specify `Y0`, `X` must have at least `numobs` rows (see `Y`).

• Otherwise, `X` must have at least `numobs – Mdl.P` observations to account for the presample removal.

In either case, if you supply more rows than necessary, `estimate` uses the latest observations only.

Columns correspond to individual predictor variables. All predictor variables are present in the regression component of each response equation.

By default, `estimate` excludes the regression component, regardless of its presence in `Mdl`.

Data Types: `double`

Since R2022b

`PredictorVariables`: Variables to select from `Tbl1` to treat as exogenous predictor variables $x_t$, specified as one of the following data types:

• String vector or cell vector of character vectors containing `numpreds` variable names in `Tbl1.Properties.VariableNames`

• A length `numpreds` vector of unique indices (integers) of variables to select from `Tbl1.Properties.VariableNames`

• A length `numvars` logical vector, where `PredictorVariables(j) = true ` selects variable `j` from `Tbl1.Properties.VariableNames`, and `sum(PredictorVariables)` is `numpreds`

The selected variables must be numeric vectors and cannot contain missing values (`NaN`).

By default, `estimate` excludes the regression component, regardless of its presence in `Mdl`.

Example: `PredictorVariables=["M1SL" "TB3MS" "UNRATE"]`

Example: `PredictorVariables=[true false true false]` or `PredictorVariables=[1 3]` selects the first and third table variables to supply the predictor data.

Data Types: `double` | `logical` | `char` | `cell` | `string`

`Model`: Johansen form of the VEC(p – 1) model deterministic terms [2], specified as a value in this table (for variable definitions, see Vector Error-Correction Model).

| Value | Error-Correction Term | Description |
| --- | --- | --- |
| `"H2"` | $AB'y_{t-1}$ | No intercepts or trends are present in the cointegrating relations, and no deterministic trends are present in the levels of the data. Specify this model only when all response series have a mean of zero. |
| `"H1*"` | $A(B'y_{t-1}+c_0)$ | Intercepts are present in the cointegrating relations, and no deterministic trends are present in the levels of the data. |
| `"H1"` | $A(B'y_{t-1}+c_0)+c_1$ | Intercepts are present in the cointegrating relations, and deterministic linear trends are present in the levels of the data. |
| `"H*"` | $A(B'y_{t-1}+c_0+d_0t)+c_1$ | Intercepts and linear trends are present in the cointegrating relations, and deterministic linear trends are present in the levels of the data. |
| `"H"` | $A(B'y_{t-1}+c_0+d_0t)+c_1+d_1t$ | Intercepts and linear trends are present in the cointegrating relations, and deterministic quadratic trends are present in the levels of the data. If quadratic trends are not present in the data, this model can produce good in-sample fits but poor out-of-sample forecasts. |

During estimation, if the overall model constant, overall linear trend, cointegrating constant, or cointegrating linear trend parameters are not in the model, then `estimate` constrains them to zero. If you specify a different equality constraint, that is, if the properties corresponding to those deterministic terms being constrained to zero have a value other than a vector of `NaN` values or zeros, then `estimate` issues an error. To enforce supported equality constraints, choose the Johansen model containing the deterministic term that you want to constrain.
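A simple way to see how the choice of Johansen form changes the fit is to estimate each form on the same data and collect summary fields. This is a sketch using the example data set; it relies on the `NumEstimatedParameters` and `BIC` fields that `summarize` returns, and the names `forms`, `nParams`, and `bic` are introduced here. It is an illustration, not a formal procedure for choosing the deterministic terms.

```matlab
load Data_USEconVECModel
logVars = ["GDP" "GDPDEF" "COE" "HOANBS" "PCEC" "GPDI"];
FRED{:,logVars} = 100*log(FRED{:,logVars});
Y = FRED.Variables;

Mdl = vecm(7,4,1);
forms = ["H2" "H1*" "H1" "H*" "H"];      % most restrictive to least restrictive
nParams = zeros(numel(forms),1);
bic = zeros(numel(forms),1);

for k = 1:numel(forms)
    Est = estimate(Mdl,Y,Model=forms(k));   % same template, different deterministic terms
    s = summarize(Est);
    nParams(k) = s.NumEstimatedParameters;
    bic(k) = s.BIC;
end

comparison = table(forms',nParams,bic,'VariableNames',{'Model','NumEstimatedParameters','BIC'})
```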

Example: `Model="H1*"`

Data Types: `string` | `char`

`Display`: Estimation information display type, specified as a value in this table.

| Value | Description |
| --- | --- |
| `"off"` | `estimate` does not display estimation information at the command line. |
| `"table"` | `estimate` displays a table of estimation information. Rows correspond to parameters, and columns correspond to estimates, standard errors, t statistics, and p values. |
| `"full"` | In addition to a table of summary statistics, `estimate` displays the estimated innovations covariance and correlation matrices, loglikelihood value, Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), and other estimation information. |

Example: `Display="full"`

Data Types: `string` | `char`

`MaxIterations`: Maximum number of solver iterations allowed, specified as a positive numeric scalar.

`estimate` dispatches `MaxIterations` to `mvregress`.

Example: `MaxIterations=2000`

Data Types: `double`

Note

• `NaN` values in `Y`, `Y0`, and `X` indicate missing values. `estimate` removes missing values from the data by list-wise deletion.

• For the presample, `estimate` removes any row containing at least one `NaN`.

• For the estimation sample, `estimate` removes any row of the concatenated data matrix `[Y X]` containing at least one `NaN`.

This type of data reduction reduces the effective sample size.

• `estimate` issues an error when any table or timetable input contains missing values.
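List-wise deletion on numeric inputs can be mimicked in a few lines of base MATLAB. This is a sketch of the rule described above on made-up data, not the internal code of `estimate`; `keep`, `Yc`, and `Xc` are hypothetical names.

```matlab
% Made-up response and predictor data with missing values.
Y = [1 2; NaN 3; 4 5; 6 7];
X = [0.1; 0.2; NaN; 0.4];

keep = ~any(isnan([Y X]),2);   % keep a row only if no element of [Y X] is NaN
Yc = Y(keep,:)                 % returns [1 2; 6 7]
Xc = X(keep,:)                 % returns [0.1; 0.4]
```

Rows 2 and 3 are dropped even though each has only one missing element, which is why this reduction shrinks the effective sample size.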

Output Arguments

`EstMdl`: Estimated VEC(p – 1) model, returned as a `vecm` model object. `EstMdl` is a fully specified `vecm` model.

`estimate` uses `mvregress` to implement multivariate normal, maximum likelihood estimation. For more details, see Estimation of Multivariate Regression Models.

`EstSE`: Estimated asymptotic standard errors of the estimated parameters, returned as a structure array containing the fields in this table.

| Field | Description |
| --- | --- |
| `Constant` | Standard errors of the overall model constants (c) corresponding to the estimates in `EstMdl.Constant`, an `Mdl.NumSeries`-by-1 numeric vector |
| `Adjustment` | Standard errors of the adjustment speeds (A) corresponding to the estimates in `EstMdl.Adjustment`, an `Mdl.NumSeries`-by-`Mdl.Rank` numeric matrix |
| `Impact` | Standard errors of the impact coefficient (Π) corresponding to the estimates in `EstMdl.Impact`, an `Mdl.NumSeries`-by-`Mdl.NumSeries` numeric matrix |
| `ShortRun` | Standard errors of the short-run coefficients (Φ) corresponding to estimates in `EstMdl.ShortRun`, a cell vector with elements corresponding to `EstMdl.ShortRun` |
| `Beta` | Standard errors of regression coefficients (β) corresponding to the estimates in `EstMdl.Beta`, an `Mdl.NumSeries`-by-`numpreds` numeric matrix |
| `Trend` | Standard errors of the overall linear time trends (d) corresponding to the estimates in `EstMdl.Trend`, an `Mdl.NumSeries`-by-1 numeric vector |

If `estimate` applies equality constraints during estimation by fixing any parameters to a value, then corresponding standard errors of those parameters are `0`.

`estimate` extracts all standard errors from the inverse of the expected Fisher information matrix returned by `mvregress` (see Standard Errors).

`logL`: Optimized loglikelihood objective function value, returned as a numeric scalar.

`E`: Multivariate residuals from the fitted model `EstMdl`, returned as a numeric matrix containing `numseries` columns. `estimate` returns `E` only when you supply a matrix of response data `Y`.

• If you specify `Y0`, then `E` has `numobs` rows (see `Y`).

• Otherwise, `E` has `numobs – Mdl.P` rows to account for the presample removal.

Since R2022b

`Tbl2`: Multivariate residuals and estimation data, returned as a table or timetable, the same data type as `Tbl1`. `estimate` returns `Tbl2` only when you supply the input `Tbl1`.

`Tbl2` contains the residuals `E` from the model fit to the selected variables in `Tbl1`, and it contains all variables in `Tbl1`. `estimate` names the residuals corresponding to variable `ResponseJ` in `Tbl1` `ResponseJ_Residuals`. For example, if one of the selected response variables for estimation in `Tbl1` is `GDP`, `Tbl2` contains a variable for the residuals in the response equation of `GDP` with the name `GDP_Residuals`.

If you specify presample response data, `Tbl2` and `Tbl1` have the same number of rows, and their rows correspond. Otherwise, because `estimate` removes initial observations from `Tbl1` for the required presample by default, `Tbl2` has `numobs – Mdl.P` rows to account for that removal.

If `Tbl1` is a timetable, `Tbl1` and `Tbl2` have the same row order, either ascending or descending.
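As an illustration of the naming and row-count rules for `Tbl2`, here is a short sketch that assumes simulated data. The variable names `GDP` and `CONS` are hypothetical placeholders, not real economic series, and the example requires a release that supports tabular input (R2022b or later).

```matlab
rng(1);                              % reproducible simulated data
T = 200;
w = cumsum(randn(T,1));              % shared stochastic trend
GDP  = w + randn(T,1);               % hypothetical series, named only for illustration
CONS = 0.5*w + randn(T,1);           % hypothetical series, named only for illustration
Tbl1 = table(GDP,CONS);

Mdl = vecm(2,1,1);                   % 2 series, rank 1, 1 short-run lag (Mdl.P = 2)
[~,~,~,Tbl2] = estimate(Mdl,Tbl1,ResponseVariables=["GDP" "CONS"]);

% Tbl2 should hold the original variables plus one residual variable per response.
disp(Tbl2.Properties.VariableNames)

% No presample was supplied, so the first Mdl.P rows are removed.
disp(height(Tbl2) == height(Tbl1) - Mdl.P)
```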


Vector Error-Correction Model

A vector error-correction (VEC) model is a multivariate, stochastic time series model consisting of a system of m = `numseries` equations of m distinct, differenced response variables. Equations in the system can include an error-correction term, which is a linear function of the responses in levels used to stabilize the system. The cointegrating rank r is the number of cointegrating relations that exist in the system.

Each response equation can include an autoregressive polynomial composed of first differences of the response series (short-run polynomial of degree p – 1), a constant, a time trend, exogenous predictor variables, and a constant and time trend in the error-correction term.

A VEC(p – 1) model in difference-equation notation and in reduced form can be expressed in two ways:

• This equation is the component form of a VEC model, where the cointegration adjustment speeds and cointegration matrix are explicit, whereas the impact matrix is implied.

`$\begin{array}{c}\Delta {y}_{t}=A\left(B\prime {y}_{t-1}+{c}_{0}+{d}_{0}t\right)+{c}_{1}+{d}_{1}t+{\Phi }_{1}\Delta {y}_{t-1}+...+{\Phi }_{p-1}\Delta {y}_{t-\left(p-1\right)}+\beta {x}_{t}+{\epsilon }_{t}\\ =c+dt+AB\prime {y}_{t-1}+{\Phi }_{1}\Delta {y}_{t-1}+...+{\Phi }_{p-1}\Delta {y}_{t-\left(p-1\right)}+\beta {x}_{t}+{\epsilon }_{t}.\end{array}$`

The cointegrating relations are $B'y_{t-1} + c_0 + d_0 t$ and the error-correction term is $A(B'y_{t-1} + c_0 + d_0 t)$.

• This equation is the impact form of a VEC model, where the impact matrix is explicit, whereas the cointegration adjustment speeds and cointegration matrix are implied.

`$\begin{array}{c}\Delta {y}_{t}=\Pi {y}_{t-1}+A\left({c}_{0}+{d}_{0}t\right)+{c}_{1}+{d}_{1}t+{\Phi }_{1}\Delta {y}_{t-1}+...+{\Phi }_{p-1}\Delta {y}_{t-\left(p-1\right)}+\beta {x}_{t}+{\epsilon }_{t}\\ =c+dt+\Pi {y}_{t-1}+{\Phi }_{1}\Delta {y}_{t-1}+...+{\Phi }_{p-1}\Delta {y}_{t-\left(p-1\right)}+\beta {x}_{t}+{\epsilon }_{t}.\end{array}$`

In the equations:

• yt is an m-by-1 vector of values corresponding to m response variables at time t, where t = 1,...,T.

• $\Delta y_t = y_t - y_{t-1}$. The structural coefficient is the identity matrix.

• r is the number of cointegrating relations and, in general, 0 < r < m.

• A is an m-by-r matrix of adjustment speeds.

• B is an m-by-r cointegration matrix.

• Π is an m-by-m impact matrix with a rank of r.

• c0 is an r-by-1 vector of constants (intercepts) in the cointegrating relations.

• d0 is an r-by-1 vector of linear time trends in the cointegrating relations.

• c1 is an m-by-1 vector of constants (deterministic linear trends in yt).

• d1 is an m-by-1 vector of linear time-trend values (deterministic quadratic trends in yt).

• c = Ac0 + c1 and is the overall constant.

• d = Ad0 + d1 and is the overall time-trend coefficient.

• Φj is an m-by-m matrix of short-run coefficients, where j = 1,...,p – 1 and Φp – 1 is not a matrix containing only zeros.

• xt is a k-by-1 vector of values corresponding to k exogenous predictor variables.

• β is an m-by-k matrix of regression coefficients.

• εt is an m-by-1 vector of random Gaussian innovations, each with a mean of 0 and collectively an m-by-m covariance matrix Σ. For t ≠ s, εt and εs are independent.

Condensed and in lag operator notation, the system is

`$\begin{array}{c}\Phi \left(L\right)\left(1-L\right){y}_{t}=A\left(B\prime {y}_{t-1}+{c}_{0}+{d}_{0}t\right)+{c}_{1}+{d}_{1}t+\beta {x}_{t}+{\epsilon }_{t}\\ =c+dt+AB\prime {y}_{t-1}+\beta {x}_{t}+{\epsilon }_{t}\end{array}$`

where $\Phi \left(L\right)=I-{\Phi }_{1}L-{\Phi }_{2}{L}^{2}-...-{\Phi }_{p-1}{L}^{p-1}$, I is the m-by-m identity matrix, and L is the lag operator, so that $L y_t = y_{t-1}$.

If m = r, then the VEC model is a stable VAR(p) model in the levels of the responses. If r = 0, then the error-correction term is a matrix of zeros, and the VEC(p – 1) model is a stable VAR(p – 1) model in the first differences of the responses.
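The equivalence of the component and impact forms can be checked numerically. The sketch below uses hypothetical, hand-picked parameter values (none come from a fitted model), a single short-run lag, and no exogenous predictors. It evaluates one time step of each form and compares the results.

```matlab
rng(0);
m = 3; r = 1;                          % hypothetical dimensions
A  = [-0.3; 0.1; 0.0];                 % m-by-r adjustment speeds
B  = [1; -2; 0.5];                     % m-by-r cointegration matrix
c0 = 0.2;  d0 = 0.01;                  % r-by-1 constant and trend in the cointegrating relation
c1 = [0.1; 0; -0.1];  d1 = zeros(m,1); % m-by-1 constant and trend outside the relation
Phi1 = 0.1*eye(m);                     % single short-run coefficient matrix (p - 1 = 1)

yPrev  = randn(m,1);                   % y_{t-1}
dyPrev = randn(m,1);                   % Delta y_{t-1}
t = 5;                                 % time index
epsT = zeros(m,1);                     % innovation, set to zero for clarity

% Component form: A and B are explicit.
dyComponent = A*(B'*yPrev + c0 + d0*t) + c1 + d1*t + Phi1*dyPrev + epsT;

% Impact form: Pi = A*B' is explicit, with overall constant c and trend d.
Pi = A*B';
c  = A*c0 + c1;
d  = A*d0 + d1;
dyImpact = c + d*t + Pi*yPrev + Phi1*dyPrev + epsT;

disp(norm(dyComponent - dyImpact))     % difference is at the level of rounding error
disp(rank(Pi))                         % rank of the impact matrix equals r
```

Because Π is the product of an m-by-r and an r-by-m matrix, its rank cannot exceed r, which is why the cointegrating rank and the rank of the impact matrix coincide.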

Johansen Form

The Johansen forms of a VEC model differ with respect to the presence of deterministic terms. As detailed in [2], the estimation procedure differs among the forms. Consequently, allowable equality constraints on the deterministic terms during estimation differ among forms. For more details, see The Role of Deterministic Terms.

This table describes the five Johansen forms and supported equality constraints.

| Form | Error-Correction Term | Deterministic Coefficients | Equality Constraints |
| --- | --- | --- | --- |
| H2 | $AB'y_{t-1}$ | $c = 0$ (`Constant`). $d = 0$ (`Trend`). $c_0 = 0$ (`CointegrationConstant`). $d_0 = 0$ (`CointegrationTrend`). | You can fully specify $B$. All deterministic coefficients are zero. |
| H1* | $A(B'y_{t-1} + c_0)$ | $c = Ac_0$. $d = 0$. $d_0 = 0$. | If you fully specify either $B$ or $c_0$, then you must fully specify the other. MATLAB® derives the value of $c$ from $c_0$ and $A$. All deterministic trends are zero. |
| H1 | $A(B'y_{t-1} + c_0) + c_1$ | $c = Ac_0 + c_1$. $d = 0$. $d_0 = 0$. | You can fully specify $B$. You can specify a mixture of `NaN` and numeric values for $c$. MATLAB derives the value of $c_0$ from $c$ and $A$. All deterministic trends are zero. |
| H* | $A(B'y_{t-1} + c_0 + d_0 t) + c_1$ | $c = Ac_0 + c_1$. | If you fully specify either $B$ or $d_0$, then you must fully specify the other. You can specify a mixture of `NaN` and numeric values for $c$. MATLAB derives the value of $c_0$ from $c$ and $A$. MATLAB derives the value of $d$ from $A$ and $d_0$. |
| H | $A(B'y_{t-1} + c_0 + d_0 t) + c_1 + d_1 t$ | $c = Ac_0 + c_1$. $d = Ad_0 + d_1$. | You can fully specify $B$. You can specify a mixture of `NaN` and numeric values for $c$ and $d$. MATLAB derives the values of $c_0$ and $d_0$ from $c$, $d$, and $A$. |
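To see how a Johansen form restricts the deterministic terms, the following sketch fits the H1* form to simulated data (the series are hypothetical and generated only for illustration). Under H1*, the overall constant should equal the adjustment speeds times the cointegration constant, and the trend terms should be zero. Treat the comparison as a check of that expectation, not as a statement about the internals of `estimate`.

```matlab
rng(1);                                        % reproducible simulated data
T = 200;
w = cumsum(randn(T,1));                        % shared stochastic trend
Y = [w + randn(T,1), 0.5*w + 1 + randn(T,1)];  % cointegrated pair with an offset

Mdl = vecm(2,1,1);                             % 2 series, rank 1, 1 short-run lag
EstMdl = estimate(Mdl,Y,Model="H1*");

% Overall constant implied by the H1* restriction c = A*c0.
cImplied = EstMdl.Adjustment*EstMdl.CointegrationConstant;

disp([EstMdl.Constant cImplied])               % the two columns should agree
disp(EstMdl.Trend')                            % expected to be zero under H1*
disp(EstMdl.CointegrationTrend')               % expected to be zero under H1*
```

Changing `Model` to `"H1"` frees the constant outside the cointegrating relation, so `EstMdl.Constant` is then no longer tied to `A*c0` alone.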

Algorithms

• If 1 ≤ `Mdl.Rank` ≤ `Mdl.NumSeries` – 1, as with most VEC models, then `estimate` performs parameter estimation in two steps.

1. `estimate` estimates the parameters of the cointegrating relations, including any restricted intercepts and time trends, by the Johansen method [2].

• The form of the cointegrating relations corresponds to one of the five parametric forms considered by Johansen in [2] (see `'Model'`). For more details, see `jcitest` and `jcontest`.

• The adjustment speed parameter (A) and the cointegration matrix (B) in the VEC(p – 1) model cannot be uniquely identified. However, the product Π = A*Bʹ is identifiable. In this estimation step, $B = V_{1:r}$, where $V_{1:r}$ is the matrix composed of all rows and the first r columns of the eigenvector matrix V. V is normalized so that $V' S_{11} V = I$. For more details, see [2].

2. `estimate` constructs the error-correction terms from the estimated cointegrating relations. Then, `estimate` estimates the remaining terms in the VEC model by constructing a vector autoregression (VAR) model in first differences and including the error-correction terms as predictors. For models without cointegrating relations (`Mdl.Rank` = 0) or with a cointegrating matrix of full rank (`Mdl.Rank` = `Mdl.NumSeries`), `estimate` performs this VAR estimation step only.

• You can remove stationary series, which are associated with standard unit vectors in the space of cointegrating relations, from cointegration analysis. To pretest individual series for stationarity, use `adftest`, `pptest`, `kpsstest`, and `lmctest`. As an alternative, you can test for standard unit vectors in the context of the full model by using `jcontest`.

• If 1 ≤ `Mdl.Rank` ≤ `Mdl.NumSeries` – 1, the asymptotic error covariances of the parameters in the cointegrating relations (which include B, c0, and d0 corresponding to the `Cointegration`, `CointegrationConstant`, and `CointegrationTrend` properties, respectively) are generally non-Gaussian. Therefore, `estimate` does not estimate or return corresponding standard errors.

In contrast, the error covariances of the composite impact matrix, which is defined as the product A*Bʹ, are asymptotically Gaussian. Therefore, `estimate` estimates and returns its standard errors. Similar caveats hold for the standard errors of the overall constant and linear trend (A*c0 and A*d0, corresponding to the `Constant` and `Trend` properties, respectively) of the H1* and H* Johansen forms.
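The point about which standard errors are available can be inspected directly on the `EstSE` output. This sketch assumes simulated data (hypothetical series) and assumes that `EstSE` exposes an `Impact` field alongside the fields described for this output. It lists the fields that are present instead of relying on a fixed layout.

```matlab
rng(1);                                    % reproducible simulated data
T = 200;
w = cumsum(randn(T,1));                    % shared stochastic trend
Y = [w + randn(T,1), 0.5*w + randn(T,1)];  % cointegrated pair

Mdl = vecm(2,1,1);                         % 1 <= Mdl.Rank <= Mdl.NumSeries - 1
[EstMdl,EstSE] = estimate(Mdl,Y);

% List the parameters for which standard errors are returned.
disp(fieldnames(EstSE))

% Standard errors of the impact matrix Pi = A*B' are available ...
disp(EstSE.Impact)

% ... whereas none are expected for the cointegration matrix B itself.
disp(isfield(EstSE,'Cointegration'))
```

For inference on the cointegrating relations themselves, the text above points to `jcitest` and `jcontest`.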

References

[1] Hamilton, James D. Time Series Analysis. Princeton, NJ: Princeton University Press, 1994.

[2] Johansen, S. Likelihood-Based Inference in Cointegrated Vector Autoregressive Models. Oxford: Oxford University Press, 1995.

[3] Juselius, K. The Cointegrated VAR Model. Oxford: Oxford University Press, 2006.

[4] Lütkepohl, H. New Introduction to Multiple Time Series Analysis. Berlin: Springer, 2005.

Version History

Introduced in R2017b
