loss

Loss for multiresponse regression model

Since R2024b

    Description

    L = loss(Mdl,Tbl) returns the regression loss, or mean squared error (MSE), for the trained multiresponse regression model Mdl. The function calculates the loss using the predictor data and response variables in table Tbl. For more information, see Loss with Regression Chain Ensembles.

    L = loss(Mdl,X,Y) returns the regression loss for the model Mdl using the predictor data X and the response values in Y.
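
    As a minimal sketch of this matrix syntax (not part of the original example), the following code trains a model on numeric predictor and response matrices and computes the loss on the same data. It assumes the carbig sample data set and the fitrchains(X,Y) syntax. The choice of predictors, the removal of rows with missing values, and the explicit OutputType="average" setting are illustrative choices.

```matlab
% Illustrative sketch: loss with numeric matrices X and Y
load carbig
X = [Displacement Horsepower Weight];   % one column per predictor
Y = [Acceleration MPG];                 % one column per response

% Keep only complete rows so that missing values play no role here
complete = ~any(isnan([X Y]),2);
X = X(complete,:);
Y = Y(complete,:);

rng("default") % For reproducibility
Mdl = fitrchains(X,Y);

% Resubstitution loss: optimistic, because the model was trained on this data
L = loss(Mdl,X,Y,OutputType="average")
```

    Because the loss is computed on the training data, treat the value as a sanity check rather than an estimate of performance on new data.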

    L = loss(___,Name=Value) specifies options using one or more name-value arguments in addition to any of the input argument combinations in the previous syntaxes. For example, you can standardize the response variables.

    Examples

    Create a regression model with more than one response variable by using fitrchains.

    Load the carbig data set, which contains measurements of cars made in the 1970s and early 1980s. Create a table containing the predictor variables Displacement, Horsepower, and so on, as well as the response variables Acceleration and MPG. Display the first eight rows of the table.

    load carbig
    cars = table(Displacement,Horsepower,Model_Year, ...
        Origin,Weight,Acceleration,MPG);
    head(cars)
        Displacement    Horsepower    Model_Year    Origin     Weight    Acceleration    MPG
        ____________    __________    __________    _______    ______    ____________    ___
    
            307            130            70        USA         3504           12        18 
            350            165            70        USA         3693         11.5        15 
            318            150            70        USA         3436           11        18 
            304            150            70        USA         3433           12        16 
            302            140            70        USA         3449         10.5        17 
            429            198            70        USA         4341           10        15 
            454            220            70        USA         4354            9        14 
            440            215            70        USA         4312          8.5        14 
    

    Categorize the cars based on whether they were made in the USA.

    cars.Origin = categorical(cellstr(cars.Origin));
    cars.Origin = mergecats(cars.Origin,["France","Japan",...
        "Germany","Sweden","Italy","England"],"NotUSA");

    Partition the data into training and test sets. Use approximately 85% of the observations to train a multiresponse model, and 15% of the observations to test the performance of the trained model on new data. Use cvpartition to partition the data.

    rng("default") % For reproducibility
    c = cvpartition(height(cars),"Holdout",0.15);
    carsTrain = cars(training(c),:);
    carsTest = cars(test(c),:);

    Train a multiresponse regression model by passing the carsTrain training data to the fitrchains function. By default, the function uses bagged ensembles of trees in the regression chains.

    Mdl = fitrchains(carsTrain,["Acceleration","MPG"])
    Mdl = 
      RegressionChainEnsemble
               PredictorNames: {'Displacement'  'Horsepower'  'Model_Year'  'Origin'  'Weight'}
                 ResponseName: ["Acceleration"    "MPG"]
        CategoricalPredictors: 4
                    NumChains: 2
                LearnedChains: {2×2 cell}
              NumObservations: 338
    
    
      Properties, Methods
    
    

    Mdl is a trained RegressionChainEnsemble model object. You can use dot notation to access the properties of Mdl. For example, you can specify Mdl.Learners to see the bagged ensembles used to train the model.

    Evaluate the performance of the regression model on the test set by computing the test mean squared error (MSE). Smaller MSE values indicate better performance. Return the loss for each response variable separately by setting the OutputType name-value argument to "per-response".

    testMSE = loss(Mdl,carsTest,["Acceleration","MPG"], ...
        OutputType="per-response")
    testMSE = 1×2
    
        2.4921    9.0568
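
    One way to cross-check these values is to compute the mean squared error by hand from the predictions. The sketch below repeats the setup steps so that it runs on its own, and then averages the squared errors in each response column. It assumes that every observation has equal weight and that rows with a missing response are skipped. If loss handles missing values or weights differently, the results can differ slightly.

```matlab
% Rebuild the data, partition, and model as in the preceding steps
load carbig
cars = table(Displacement,Horsepower,Model_Year, ...
    Origin,Weight,Acceleration,MPG);
cars.Origin = categorical(cellstr(cars.Origin));
cars.Origin = mergecats(cars.Origin,["France","Japan",...
    "Germany","Sweden","Italy","England"],"NotUSA");
rng("default") % For reproducibility
c = cvpartition(height(cars),"Holdout",0.15);
carsTrain = cars(training(c),:);
carsTest = cars(test(c),:);
Mdl = fitrchains(carsTrain,["Acceleration","MPG"]);

% Observed and predicted responses as numeric matrices
responseNames = ["Acceleration","MPG"];
Ytrue = carsTest{:,responseNames};
Ypred = predict(Mdl,carsTest,OutputType="table");
Ypred = Ypred{:,responseNames};

% Mean squared error per response column, skipping missing responses
manualMSE = mean((Ytrue - Ypred).^2,1,"omitnan")
```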
    
    

    Predict the response values for the observations in the test set. Return the predicted response values as a table.

    predictedY = predict(Mdl,carsTest,OutputType="table")
    predictedY=60×2 table
        Acceleration     MPG  
        ____________    ______
    
           12.573       16.109
            10.78       13.988
           11.282       12.963
           15.185       21.066
           12.203       13.773
           13.216       14.216
           17.117       30.199
           16.478       29.033
           13.439       14.208
           11.552       13.066
           13.398       13.271
           14.848       20.927
           16.552       24.603
           12.501       15.359
           15.778       19.328
           12.343       13.185
          ⋮
    
    

    Input Arguments

    Mdl

    Multiresponse regression model, specified as a RegressionChainEnsemble or CompactRegressionChainEnsemble object.

    Tbl

    Sample data, specified as a table. Each row of Tbl corresponds to one observation, and each column corresponds to one variable. Tbl must have the same data type as the data used to train Mdl, and must include all predictor and response variables.

    Data Types: table

    X

    Predictor data, specified as a numeric matrix or a table. Each row of X corresponds to one observation, and each column corresponds to one predictor. X must have the same data type as the predictor data used to train Mdl, and must contain the same predictors. X and Y must have the same number of observations.

    Data Types: single | double | table

    Y

    Response data, specified as a numeric matrix or a table. Each row of Y corresponds to one observation, and each column corresponds to one response variable. X and Y must have the same number of observations.

    Data Types: single | double | table

    Name-Value Arguments

    Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

    Example: loss(Mdl,Tbl,OutputType="per-response") returns a regression loss value for each response variable.

    OutputType

    Type of output loss, specified as "average" or "per-response".

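    Value             Description
    ______________    ________________________________________________________________________________________

    "average"         loss averages the loss values across all response variables and returns a scalar value.
    "per-response"    loss returns a vector, where each element is the loss for one response variable.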

    Example: OutputType="per-response"

    Data Types: char | string
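
    The relationship between the two output types can be illustrated with a small worked example that does not call loss. The observed values are the Acceleration and MPG entries from the first four rows of the table in the example. The predicted values are made up for illustration, and the unweighted mean is an assumption of this sketch.

```matlab
% Observed responses: columns are Acceleration and MPG
Y    = [12 18; 11.5 15; 11 18; 12 16];
% Hypothetical predictions, made up for illustration
Yhat = [12.5 17; 11 16; 11.5 17; 12.5 15];

% "per-response": one mean squared error per response column
perResponse = mean((Y - Yhat).^2,1)

% "average": mean of the per-response values, a single scalar
averageLoss = mean(perResponse)
```

    Here the per-response values are 0.25 and 1, and their average is 0.625. The second response dominates the average only because its errors are larger in its own units.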

    StandardizeResponses

    Flag to standardize the response data before computing the loss, specified as a numeric or logical 0 (false) or 1 (true). If you set StandardizeResponses to true, the software centers and scales each response variable by the corresponding variable mean and standard deviation in the training data.

    Specify StandardizeResponses as true when you have multiple response variables with very different scales and the OutputType is "average".

    Example: StandardizeResponses=true

    Data Types: single | double | logical
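
    The effect of standardization on the loss can be sketched with plain arithmetic. The code below assumes that observed and predicted responses are centered and scaled with the same training mean and standard deviation. This is an assumption of the sketch, not a statement about the internal implementation. The mu and sigma values and the predictions are hypothetical.

```matlab
% Observed responses (Acceleration, MPG) and hypothetical predictions
Y    = [12 18; 11.5 15; 11 18; 12 16];
Yhat = [12.5 17; 11 16; 11.5 17; 12.5 15];

% Hypothetical training-set mean and standard deviation per response
mu    = [15 23];
sigma = [2.5 8];

% Center and scale observed and predicted values in the same way
Ystd    = (Y - mu) ./ sigma;
YhatStd = (Yhat - mu) ./ sigma;

rawMSE = mean((Y - Yhat).^2,1)         % in squared original units
stdMSE = mean((Ystd - YhatStd).^2,1)   % unitless, equals rawMSE ./ sigma.^2
```

    The centering cancels in the difference, so each standardized loss is the raw loss divided by the squared standard deviation. The losses then share a common scale, which makes the average across responses easier to interpret.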

    Output Arguments

    L

    Regression loss, or mean squared error (MSE), returned as a numeric scalar or vector.

    • If OutputType is "average", then loss averages the loss values across all response variables and returns a scalar value.

    • If OutputType is "per-response", then loss returns a vector, where each element is the loss for one response variable.

    For more information, see Loss with Regression Chain Ensembles.

    Algorithms

    References

    [1] Spyromitros-Xioufis, Eleftherios, Grigorios Tsoumakas, William Groves, and Ioannis Vlahavas. "Multi-Target Regression via Input Space Expansion: Treating Targets as Inputs." Machine Learning 104, no. 1 (July 2016): 55–98. https://doi.org/10.1007/s10994-016-5546-z.

    Version History

    Introduced in R2024b