loss

Loss at each horizon step

Since R2023b

    Description

    L = loss(Mdl,TestTbl) returns the test set loss (mean squared error) for the direct forecasting model Mdl at each step of the horizon (Mdl.Horizon). The function uses the test data TestTbl to prepare lagged and leading predictors, and uses the response variable in TestTbl to compute loss values.

    L is a 1-by-h numeric vector, where h is the number of elements in Mdl.Horizon.
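As a rough illustration of the output shape, the following sketch computes a per-step mean squared error by hand. The matrices Y and predictedY are hypothetical n-by-h arrays invented for this illustration (one column per horizon step). This is not the internal implementation of loss, which also prepares the lagged and leading predictors before predicting.

```matlab
% Hypothetical observed and predicted responses: n = 3 observations,
% h = 2 horizon steps (one column per step). The values are made up.
Y          = [20 22; 25 24; 30 31];
predictedY = [21 20; 25 27; 28 31];

% Average the squared errors down each column to get one loss per step.
L = mean((Y - predictedY).^2,1);   % 1-by-h row vector

disp(size(L))   % one row, two columns
disp(L)         % approximately 1.6667 and 4.3333
```

Each element of L corresponds to one horizon step, in the same order as the columns of the hypothetical Y.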

    L = loss(Mdl,TestX,TestY) returns losses for the test set exogenous predictor data TestX and the test set response data TestY. This syntax assumes that Mdl uses exogenous predictors; that is, Mdl.PredictorNames is nonempty.

    L = loss(Mdl,TestY) returns test set losses when the model Mdl does not use exogenous predictors. That is, Mdl.PredictorNames must be empty.

    L = loss(___,LossFun=lossfun) specifies the loss function in addition to any of the input argument combinations in previous syntaxes.

    Examples

    Calculate the test set mean squared error (MSE) of a direct forecasting model.

    Load the sample file TemperatureData.csv, which contains average daily temperatures from January 2015 through July 2016. Read the file into a table. Observe the first eight observations in the table.

    temperatures = readtable("TemperatureData.csv");
    head(temperatures)
        Year       Month       Day    TemperatureF
        ____    ___________    ___    ____________
    
        2015    {'January'}     1          23     
        2015    {'January'}     2          31     
        2015    {'January'}     3          25     
        2015    {'January'}     4          39     
        2015    {'January'}     5          29     
        2015    {'January'}     6          12     
        2015    {'January'}     7          10     
        2015    {'January'}     8           4     
    

    For this example, use a subset of the temperature data that omits the first 100 observations.

    Tbl = temperatures(101:end,:);

    Create a datetime variable t that contains the year, month, and day information for each observation in Tbl. Then, use t to convert Tbl into a timetable.

    numericMonth = month(datetime(Tbl.Month, ...
        InputFormat="MMMM",Locale="en_US"));
    t = datetime(Tbl.Year,numericMonth,Tbl.Day);
    Tbl.Time = t;
    Tbl = table2timetable(Tbl);

    Plot the temperature values in Tbl over time.

    plot(Tbl.Time,Tbl.TemperatureF)
    xlabel("Date")
    ylabel("Temperature in Fahrenheit")

    [Figure: line plot of temperature in Fahrenheit versus date.]

    Partition the temperature data into training and test sets by using tspartition. Reserve 20% of the observations for testing.

    partition = tspartition(size(Tbl,1),"Holdout",0.20);
    trainingTbl = Tbl(training(partition),:);
    testTbl = Tbl(test(partition),:);

    Create a full direct forecasting model by using the data in trainingTbl. Train the model using a decision tree learner. All three of the predictors (Year, Month, and Day) are leading predictors because their future values are known. To create new predictors by shifting the leading predictor and response variables backward in time, specify the leading predictor lags and the response variable lags.

    Mdl = directforecaster(trainingTbl,"TemperatureF", ...
        Learner="tree", ...
        LeadingPredictors="all",LeadingPredictorLags={0:1,0:1,0:7}, ...
        ResponseLags=1:7)
    Mdl = 
      DirectForecaster
    
                      Horizon: 1
                 ResponseLags: [1 2 3 4 5 6 7]
            LeadingPredictors: [1 2 3]
         LeadingPredictorLags: {[0 1]  [0 1]  [0 1 2 3 4 5 6 7]}
                 ResponseName: 'TemperatureF'
               PredictorNames: {'Year'  'Month'  'Day'}
        CategoricalPredictors: 2
                     Learners: {[1×1 classreg.learning.regr.CompactRegressionTree]}
                       MaxLag: 7
              NumObservations: 372
    
    
      Properties, Methods
    
    

    Mdl is a DirectForecaster model object. By default, the horizon is one step ahead. That is, Mdl predicts a value that is one step into the future.

    Calculate the test set MSE. Smaller MSE values indicate better performance.

    testMSE = loss(Mdl,testTbl)
    testMSE = 
    61.0849
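Because the MSE is expressed in squared units, it can help to take the square root to get an error in the units of the response (degrees Fahrenheit here). The short sketch below reuses the value displayed above; the variable name testRMSE is introduced only for this illustration.

```matlab
% Value reported by loss(Mdl,testTbl) in the example above.
testMSE = 61.0849;

% Root mean squared error, in the same units as TemperatureF.
testRMSE = sqrt(testMSE);
disp(testRMSE)   % approximately 7.82
```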
    

    Input Arguments

    Mdl: Direct forecasting model, specified as a DirectForecaster or CompactDirectForecaster model object.

    TestTbl: Test set data, specified as a table or timetable. Each row of TestTbl corresponds to one observation, and each column corresponds to one variable. TestTbl must have the same data type as the predictor data argument used to train Mdl, and must include all exogenous predictors and the response variable.

    TestX: Test set exogenous predictor data, specified as a numeric matrix, table, or timetable. Each row of TestX corresponds to one observation, and each column corresponds to one predictor. TestX must have the same data type as the predictor data argument used to train Mdl, and must consist of the same exogenous predictors.

    TestY: Test set response data, specified as a numeric vector, one-column table, or one-column timetable. Each row of TestY corresponds to one observation.

    • If TestX is a numeric matrix, then TestY must be a numeric vector.

    • If TestX is a table, then TestY must be a numeric vector or one-column table.

    • If TestX is a timetable or it is not specified, then TestY must be a numeric vector, one-column table, or one-column timetable.

    If you specify both TestX and TestY, then they must have the same number of observations.

    LossFun: Loss function, specified as "mse" or a function handle.

    • If you specify the built-in function "mse", then the loss function is the mean squared error.

    • If you specify your own function using function handle notation, then the function must have the signature lossvalue = lossfun(Y,predictedY), where:

      • The output argument lossvalue is a scalar.

      • lossfun is the name of your function.

      • Y is an n-by-1 vector of observed numeric responses at a specific horizon step, where n is the number of test set observations.

      • predictedY is an n-by-1 vector of predicted numeric responses at a specific horizon step.

      Specify your function using LossFun=@lossfun.

    Data Types: char | string | function_handle
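To make the custom loss signature concrete, here is a minimal sketch of a mean absolute error function that follows the form lossvalue = lossfun(Y,predictedY). The name maeLoss and the test vectors are hypothetical, chosen only for this illustration.

```matlab
% Mean absolute error with the required two-input, scalar-output form.
% maeLoss is a hypothetical name; any valid function name works.
maeLoss = @(Y,predictedY) mean(abs(Y - predictedY),"omitnan");

% Quick check on made-up n-by-1 vectors for a single horizon step.
Y          = [10; 12; 14];
predictedY = [11; 12; 11];
lossvalue  = maeLoss(Y,predictedY);
disp(lossvalue)   % approximately 1.3333

% Assuming Mdl and testTbl exist as in the example on this page,
% the handle could then be passed to loss like this:
% testMAE = loss(Mdl,testTbl,LossFun=maeLoss);
```

Because maeLoss is already a function handle, it is passed without the @ prefix. A function defined in its own file would be passed as LossFun=@lossfun instead.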

    Limitations

    • When you use the loss object function, the test set data must contain at least Mdl.MaxLag + max(Mdl.Horizon) observations. The software requires these observations for creating lagged and leading predictors.
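The minimum test set size can be checked before calling loss. The sketch below uses stand-in values taken from the model display in the example on this page (MaxLag of 7 and a horizon of 1); the test set size numTestObs is a hypothetical number chosen for illustration.

```matlab
% Stand-ins for Mdl.MaxLag and Mdl.Horizon (values from the example display).
maxLag  = 7;
horizon = 1;

% Hypothetical number of rows in the test set.
numTestObs = 96;

% Minimum number of observations required by loss.
minObs   = maxLag + max(horizon);
isEnough = numTestObs >= minObs;
fprintf("Need at least %d observations; enough data: %d\n",minObs,isEnough)

% With a trained model and test table in the workspace, the same check
% would read (assuming the names Mdl and testTbl):
% isEnough = height(testTbl) >= Mdl.MaxLag + max(Mdl.Horizon);
```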

    Version History

    Introduced in R2023b