# Predict Responses Using RegressionEnsemble Predict Block

This example shows how to train an ensemble model with optimal hyperparameters, and then use the RegressionEnsemble Predict block for response prediction in Simulink®. The block accepts an observation (predictor data) and returns the predicted response for the observation using the trained regression ensemble model.

### Train Regression Model with Optimal Hyperparameters

Load the `carbig` data set, which contains measurements of cars made in the 1970s and early 1980s.

```
load carbig
whos
```

```
  Name                Size            Bytes  Class     Attributes

  Acceleration      406x1              3248  double
  Cylinders         406x1              3248  double
  Displacement      406x1              3248  double
  Horsepower        406x1              3248  double
  MPG               406x1              3248  double
  Mfg               406x13            10556  char
  Model             406x36            29232  char
  Model_Year        406x1              3248  double
  Origin            406x7              5684  char
  Weight            406x1              3248  double
  cyl4              406x5              4060  char
  org               406x7              5684  char
  when              406x5              4060  char
```

`Origin` is a categorical variable. When you train a model for the RegressionEnsemble Predict block, you must preprocess categorical predictors by using the `dummyvar` function to include the categorical predictors in the model. You cannot use the `'CategoricalPredictors'` name-value argument. Create dummy variables for `Origin`.

```
c_Origin = categorical(cellstr(Origin));
d_Origin = dummyvar(c_Origin);
```

`dummyvar` creates dummy variables for each category of `c_Origin`. Determine the number of categories in `c_Origin` and the number of dummy variables in `d_Origin`.

`unique(cellstr(Origin))`
```
ans = 7x1 cell
    {'England'}
    {'France' }
    {'Germany'}
    {'Italy'  }
    {'Japan'  }
    {'Sweden' }
    {'USA'    }
```
`size(d_Origin)`
```
ans = 1×2

   406     7
```

`Origin` has seven categories, so `d_Origin` has seven columns, one dummy variable for each category.
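As a small illustration of how `dummyvar` maps categories to columns, the following sketch uses a made-up four-element array (the variable names `c` and `d` and the sample values are assumptions for this illustration, not part of the example data). Each column of the result corresponds to one category, in the order returned by `categories`.

```matlab
% Toy data for illustration only; not taken from carbig.
c = categorical({'USA';'Japan';'USA';'Sweden'});

% Categories are sorted alphabetically: Japan, Sweden, USA.
cats = categories(c);

% One column per category, one row per observation.
% A 1 marks the category that the observation belongs to.
d = dummyvar(c);
```

Here, the first row of `d` is `[0 0 1]` because the first observation belongs to the third category, `USA`.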

Create a matrix containing six numeric predictor variables and the seven dummy variables for `Origin`. Also, create a vector of the response variable.

```
X = [Acceleration,Cylinders,Displacement,Horsepower,Model_Year,Weight,d_Origin];
Y = MPG;
```

Train an ensemble using `X` and `Y` with these options:

• Specify `'OptimizeHyperparameters'` as `'auto'` to train an ensemble with optimal hyperparameters. The `'auto'` option finds optimal values for `'Method'`, `'NumLearningCycles'`, and `'LearnRate'` (for applicable methods) of `fitrensemble` and `'MinLeafSize'` of tree learners.

• For reproducibility, set the random seed and use the `'expected-improvement-plus'` acquisition function. Also, for reproducibility of the random forest algorithm, specify `'Reproducible'` as `true` for tree learners.

```
rng('default')
t = templateTree('Reproducible',true);
ensMdl = fitrensemble(X,Y,'Learners',t, ...
    'OptimizeHyperparameters','auto', ...
    'HyperparameterOptimizationOptions', ...
    struct('AcquisitionFunctionName','expected-improvement-plus'))
```
```
|===================================================================================================================================|
| Iter | Eval   | Objective:  | Objective   | BestSoFar   | BestSoFar   |       Method | NumLearningC-|    LearnRate |  MinLeafSize |
|      | result | log(1+loss) | runtime     | (observed)  | (estim.)    |              | ycles        |              |              |
|===================================================================================================================================|
|    1 | Best   |      2.7403 |      10.448 |      2.7403 |      2.7403 |          Bag |          184 |            - |           69 |
|    2 | Accept |      4.1317 |     0.72245 |      2.7403 |      2.8143 |          Bag |           10 |            - |          176 |
|    3 | Best   |      2.1687 |      11.583 |      2.1687 |      2.1689 |          Bag |          118 |            - |            2 |
|    4 | Accept |      2.2747 |      2.0694 |      2.1687 |      2.1688 |      LSBoost |           24 |      0.37779 |            7 |
|    5 | Best   |      2.1421 |      3.1716 |      2.1421 |      2.1422 |          Bag |           75 |            - |            1 |
|    6 | Best   |      2.1365 |       19.87 |      2.1365 |      2.1365 |          Bag |          500 |            - |            1 |
|    7 | Accept |      2.4302 |      1.6717 |      2.1365 |      2.1365 |      LSBoost |           37 |      0.94779 |           71 |
|    8 | Accept |      2.1813 |      24.433 |      2.1365 |      2.1365 |      LSBoost |          497 |     0.023582 |            1 |
|    9 | Accept |      6.1992 |       4.005 |      2.1365 |      2.1363 |      LSBoost |           91 |    0.0012439 |            1 |
|   10 | Accept |      2.2119 |      19.656 |      2.1365 |      2.1363 |      LSBoost |          497 |     0.087441 |           11 |
|   11 | Accept |      4.7782 |     0.85121 |      2.1365 |      2.1366 |      LSBoost |           15 |     0.055744 |            1 |
|   12 | Accept |      2.3093 |      19.471 |      2.1365 |      2.1366 |      LSBoost |          493 |      0.39665 |            1 |
|   13 | Accept |      4.1304 |      8.4476 |      2.1365 |      2.1366 |      LSBoost |          198 |      0.33031 |          201 |
|   14 | Accept |       2.595 |     0.86258 |      2.1365 |      2.1367 |      LSBoost |           16 |      0.99848 |            1 |
|   15 | Accept |      2.6643 |      1.2997 |      2.1365 |      2.1363 |      LSBoost |           25 |      0.97637 |            5 |
|   16 | Accept |      2.2388 |     0.86244 |      2.1365 |      2.1363 |      LSBoost |           11 |      0.42205 |            1 |
|   17 | Accept |      4.1304 |      1.5347 |      2.1365 |      2.1789 |      LSBoost |           19 |      0.79808 |          202 |
|   18 | Accept |      2.3399 |      2.9743 |      2.1365 |      2.1363 |      LSBoost |           71 |      0.44856 |            1 |
|   19 | Accept |      2.7734 |      4.3919 |      2.1365 |      2.1394 |      LSBoost |          107 |     0.020776 |            2 |
|   20 | Accept |      2.3204 |      14.501 |      2.1365 |       2.136 |          Bag |          463 |            - |           16 |
|===================================================================================================================================|
| Iter | Eval   | Objective:  | Objective   | BestSoFar   | BestSoFar   |       Method | NumLearningC-|    LearnRate |  MinLeafSize |
|      | result | log(1+loss) | runtime     | (observed)  | (estim.)    |              | ycles        |              |              |
|===================================================================================================================================|
|   21 | Accept |      2.2005 |      19.969 |      2.1365 |       2.137 |      LSBoost |          464 |      0.10107 |           10 |
|   22 | Accept |       2.479 |      2.4884 |      2.1365 |       2.136 |      LSBoost |           40 |      0.93931 |           26 |
|   23 | Accept |      4.4432 |      1.6021 |      2.1365 |      2.1366 |      LSBoost |           16 |     0.094719 |          189 |
|   24 | Accept |      2.2531 |      19.566 |      2.1365 |       2.137 |      LSBoost |          497 |      0.32798 |            5 |
|   25 | Accept |       2.158 |      17.777 |      2.1365 |      2.1366 |      LSBoost |          433 |     0.015137 |            1 |
|   26 | Accept |      2.6254 |      17.567 |      2.1365 |      2.1369 |      LSBoost |          467 |      0.94779 |           50 |
|   27 | Accept |      2.5612 |     0.72008 |      2.1365 |      2.1369 |      LSBoost |           12 |      0.19061 |           17 |
|   28 | Accept |       2.256 |     0.62882 |      2.1365 |      2.1366 |      LSBoost |           10 |      0.37427 |            2 |
|   29 | Accept |      2.2065 |      20.399 |      2.1365 |      2.1366 |      LSBoost |          499 |     0.018238 |            5 |
|   30 | Accept |      2.2539 |     0.48196 |      2.1365 |      2.1369 |          Bag |           10 |            - |            7 |
```

```
__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 285.8617 seconds
Total objective function evaluation time: 254.0255

Best observed feasible point:
    Method    NumLearningCycles    LearnRate    MinLeafSize
    ______    _________________    _________    ___________

     Bag             500              NaN            1

Observed objective function value = 2.1365
Estimated objective function value = 2.1369
Function evaluation time = 19.8698

Best estimated feasible point (according to models):
    Method    NumLearningCycles    LearnRate    MinLeafSize
    ______    _________________    _________    ___________

     Bag             500              NaN            1

Estimated objective function value = 2.1369
Estimated function evaluation time = 17.8725
```
```
ensMdl = 
  RegressionBaggedEnsemble
                         ResponseName: 'Y'
                CategoricalPredictors: []
                    ResponseTransform: 'none'
                      NumObservations: 398
    HyperparameterOptimizationResults: [1x1 BayesianOptimization]
                           NumTrained: 500
                               Method: 'Bag'
                         LearnerNames: {'Tree'}
                 ReasonForTermination: 'Terminated normally after completing the requested number of training cycles.'
                              FitInfo: []
                   FitInfoDescription: 'None'
                       Regularization: []
                            FResample: 1
                              Replace: 1
                     UseObsForLearner: [398x500 logical]

  Properties, Methods
```

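`fitrensemble` returns a `RegressionBaggedEnsemble` object because the optimization selects the random forest algorithm (`'Bag'`) as the best method. `NumObservations` is 398 rather than 406 because `fitrensemble` excludes the observations whose response value (`MPG`) is missing.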
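Before moving to Simulink, it can help to check at the command line that a bagged ensemble trained on this predictor layout returns sensible predictions. The following is a minimal sketch, not part of the original example: to keep the run short it trains a separate model, `quickMdl`, with fixed settings (`'Bag'` with 50 learning cycles, an arbitrary choice) instead of rerunning the hyperparameter optimization. The names `quickMdl`, `yhat`, and `resubMSE` are assumptions introduced for this illustration.

```matlab
% Rebuild the predictor matrix exactly as in the example.
load carbig
d_Origin = dummyvar(categorical(cellstr(Origin)));
X = [Acceleration,Cylinders,Displacement,Horsepower,Model_Year,Weight,d_Origin];
Y = MPG;

% Train a small bagged ensemble with fixed (not optimized) settings.
rng('default')
t = templateTree('Reproducible',true);
quickMdl = fitrensemble(X,Y,'Method','Bag', ...
    'NumLearningCycles',50,'Learners',t);

% Predict MPG for the first five observations (one value per row).
yhat = predict(quickMdl,X(1:5,:));

% Mean squared error on the training data, as a rough sanity check.
resubMSE = resubLoss(quickMdl);
```

The same `predict` and `resubLoss` calls can be applied to `ensMdl` after the optimization finishes.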

### Create Simulink Model

This example provides the Simulink model `slexCarDataRegressionEnsemblePredictExample.slx`, which includes the RegressionEnsemble Predict block. You can open the Simulink model or create a new model as described in this section.

Open the Simulink model `slexCarDataRegressionEnsemblePredictExample.slx`.

```
SimMdlName = 'slexCarDataRegressionEnsemblePredictExample';
open_system(SimMdlName)
```

The `PreLoadFcn` callback function of `slexCarDataRegressionEnsemblePredictExample` includes code to load the sample data, train the model using the optimal hyperparameters, and create an input signal for the Simulink model. If you open the Simulink model, then the software runs the code in `PreLoadFcn` before loading the Simulink model. To view the callback function, in the Setup section on the Modeling tab, click Model Settings and select Model Properties. Then, on the Callbacks tab, select the `PreLoadFcn` callback function in the Model callbacks pane.

To create a new Simulink model, open the Blank Model template and add the RegressionEnsemble Predict block. Add the Inport and Outport blocks and connect them to the RegressionEnsemble Predict block.

Double-click the RegressionEnsemble Predict block to open the Block Parameters dialog box. Specify the Select trained machine learning model parameter as `ensMdl`, which is the name of a workspace variable that contains the trained model. Click the Refresh button. The dialog box displays the options used to train the model `ensMdl` under Trained Machine Learning Model.

The RegressionEnsemble Predict block expects an observation containing 13 predictor values. Double-click the Inport block, and set the Port dimensions to 13 on the Signal Attributes tab.

Create an input signal in the form of a structure array for the Simulink model. The structure array must contain these fields:

• `time` — The points in time at which the observations enter the model. The orientation must correspond to the observations in the predictor data. So, in this example, `time` must be a column vector.

• `signals` — A 1-by-1 structure array describing the input data and containing the fields `values` and `dimensions`, where `values` is a matrix of predictor data, and `dimensions` is the number of predictor variables.

Create an appropriate structure array for the `slexCarDataRegressionEnsemblePredictExample` model from the `carsmall` data set. When you convert `Origin` in `carsmall` to the categorical array `c_Origin_small`, pass `categories(c_Origin)` as the second input so that `c_Origin` and `c_Origin_small` have the same categories in the same order. Otherwise, `dummyvar` can return a different number of columns for the test data than for the training data.

```
load carsmall
c_Origin_small = categorical(cellstr(Origin),categories(c_Origin));
d_Origin_small = dummyvar(c_Origin_small);
testX = [Acceleration,Cylinders,Displacement,Horsepower,Model_Year,Weight,d_Origin_small];
testX = rmmissing(testX);

carsmallInput.time = (0:size(testX,1)-1)';
carsmallInput.signals(1).values = testX;
carsmallInput.signals(1).dimensions = size(testX,2);
```
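To see why the category list matters, consider a toy case in which the test data contains fewer categories than the training data. The sketch below uses made-up values, and the variable names (`trainOrigin`, `testRaw`, `c_naive`, `c_aligned`) are assumptions introduced for this illustration only.

```matlab
% Toy training and test labels; not taken from carbig or carsmall.
trainOrigin = categorical({'USA';'Japan';'Sweden';'France'});
testRaw = {'USA';'Japan';'USA'};

% Without a category list, only the categories present in the test data exist.
c_naive = categorical(testRaw);
nNaive = size(dummyvar(c_naive),2);       % 2 columns

% With the training categories, the test array has all four categories,
% in the same (alphabetical) order: France, Japan, Sweden, USA.
c_aligned = categorical(testRaw,categories(trainOrigin));
d_aligned = dummyvar(c_aligned);
nAligned = size(d_aligned,2);             % 4 columns
```

Only the aligned version produces dummy variables whose columns line up with the columns used to train the model.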

To import signal data from the workspace:

• Open the Configuration Parameters dialog box. On the Modeling tab, click Model Settings.

• In the Data Import/Export pane, select the Input check box and enter `carsmallInput` in the adjacent text box.

• In the Solver pane, under Simulation time, set Stop time to `carsmallInput.time(end)`. Under Solver selection, set Type to `Fixed-step`, and set Solver to `discrete (no continuous states)`.
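The same settings can also be applied programmatically with `set_param`. The following is a sketch under a few assumptions: the model is the one shipped with this example, `carsmallInput` already exists in the base workspace, and the parameter names shown (`LoadExternalInput`, `ExternalInput`, `StopTime`, `SolverType`, `Solver`) are the command-line equivalents of the dialog box options described above.

```matlab
SimMdlName = 'slexCarDataRegressionEnsemblePredictExample';
load_system(SimMdlName)   % no effect if the model is already loaded

% Data Import/Export pane: read the input signal from the workspace.
set_param(SimMdlName, ...
    'LoadExternalInput','on', ...
    'ExternalInput','carsmallInput');

% Solver pane: stop at the last time point and use a fixed-step discrete solver.
set_param(SimMdlName, ...
    'StopTime','carsmallInput.time(end)', ...
    'SolverType','Fixed-step', ...
    'Solver','FixedStepDiscrete');
```

Setting the parameters in code makes the configuration easy to repeat when you build the model from the Blank Model template rather than opening the provided one.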

Simulate the model.

`sim(SimMdlName);`

When the Inport block detects an observation, it directs the observation into the RegressionEnsemble Predict block. You can use the Simulation Data Inspector (Simulink) to view the logged data of the Outport block.