customLifetimePDModel

Create customLifetimePDModel object for lifetime probability of default

Since R2022b

Description

Create and analyze a customLifetimePDModel object to calculate the lifetime probability of default (PD) using this workflow:

  1. Fit a PD model that can predict PD for a loan or a portfolio of loans.

  2. Define a function handle to the function that predicts PD with your fitted model.

  3. Use customLifetimePDModel and pass the specified function handle to create a customLifetimePDModel object. The designated model is now wrapped as a lifetime PD model.

  4. Use predict to predict the conditional PD and predictLifetime to predict the lifetime PD.

  5. Use modelDiscrimination to return AUROC and ROC data. You can plot the results using modelDiscriminationPlot.

  6. Use modelCalibration to return the RMSE of the observed and predicted PD data. You can plot the results using modelCalibrationPlot.
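The six steps above can be sketched end to end as follows. This is a minimal illustration, not a recommended modeling approach: the logistic coefficients in pdFormula, the synthetic table, and the variable names Age and Score are all assumptions made up for the sketch, and step 1 (fitting) is replaced by a fixed formula. It requires Risk Management Toolbox.

```matlab
% Step 1 (stand-in): a hypothetical, already "fitted" logistic PD formula.
% The coefficients are illustrative assumptions, not estimates from data.
pdFormula = @(age,score) 1 ./ (1 + exp(4 + 0.1*age - 0.5*score));

% Step 2: function handle that takes a table and returns one PD per row.
pdFcnHandle = @(T) pdFormula(T.Age,T.Score);

% Small synthetic panel data set (multiple rows per loan ID).
ID      = [1;1;1;2;2;2;3;3];
Age     = [1;2;3;1;2;3;1;2];
Score   = [0;0;0;1;1;1;2;2];
Default = [0;0;0;0;0;0;0;1];
data = table(ID,Age,Score,Default);

% Step 3: wrap the function handle as a lifetime PD model.
pdModel = customLifetimePDModel(pdFcnHandle,IDVar="ID",ResponseVar="Default", ...
    AgeVar="Age",LoanVars="Score");

% Step 4: conditional PD per row and cumulative lifetime PD per loan.
condPD = predict(pdModel,data);
lifetimePD = predictLifetime(pdModel,data);

% Steps 5 and 6: discrimination (AUROC) and calibration (RMSE) measures.
DiscMeasure = modelDiscrimination(pdModel,data);
CalMeasure = modelCalibration(pdModel,data,"Age");
```

With so few rows, the validation measures carry no statistical meaning. The sketch only shows how the calls fit together.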

Creation

Description

CustomLifetimePDModel = customLifetimePDModel(pdFcnHandle,IDVar=idvar_value,ResponseVar=responsevar_value) creates a customLifetimePDModel object for a PD model using required name-value arguments and sets model object properties.

CustomLifetimePDModel = customLifetimePDModel(___,Name=Value) specifies options using one or more name-value arguments in addition to the input arguments in the previous syntax. The optional name-value arguments set model object properties. For example, CustomLifetimePDModel = customLifetimePDModel(pdFcnHandle,IDVar='ID',AgeVar='YOB',Description='Scorecard as lifetime PD model',LoanVars='ScoreGroup',MacroVars={'GDP','Market'},ModelID='ScorecardLifetime',ResponseVar='Default',WeightsVar="Weights") creates a customLifetimePDModel object.

Input Arguments

pdFcnHandle: Function handle for the custom model's probability of default prediction function, specified as a function handle.

The function takes in a data table which includes variables that you specify in AgeVar, LoanVars, and MacroVars, and returns a predicted conditional PD value for each row of the table.

Note

Because the prediction and validation functions pass the data input to pdFcnHandle in its entirety, the function must allow extra columns in the data table for other variables, such as IDVar, ResponseVar, and grouping variables.

Data Types: function_handle
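As a minimal sketch of this contract, the following handle reads only the predictor columns it needs by name, so extra columns in the table are ignored. The coefficients b0, bAge, and bGDP and the table values are hypothetical and chosen only for illustration.

```matlab
% Hypothetical logistic coefficients (assumptions, not fitted values).
b0 = -4; bAge = -0.1; bGDP = -0.2;

% The handle takes a table and returns one conditional PD per row.
pdFcnHandle = @(T) 1 ./ (1 + exp(-(b0 + bAge*T.YOB + bGDP*T.GDP)));

% Table containing the predictors plus extra columns (ID and Default).
T = table([1;1;2],[1;2;1],[2.72;3.57;2.72],[0;0;1], ...
    'VariableNames',{'ID','YOB','GDP','Default'});

% The extra columns do not affect the prediction.
pd = pdFcnHandle(T);
```

The result pd has the same number of rows as T, which is the shape the description above requires.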

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: CustompdModel = customLifetimePDModel(pdFcnHandle,IDVar='ID',AgeVar='YOB',Description='Scorecard as lifetime PD model',LoanVars='ScoreGroup',MacroVars={'GDP','Market'},ModelID='ScorecardLifetime',ResponseVar='Default',WeightsVar="Weights")

Required customLifetimePDModel Name-Value Arguments

IDVar: ID variable indicating which column in the data accepted as input by pdFcnHandle contains the loan or borrower ID, specified as IDVar and a string or character vector.

Note

IDVar is required for lifetime PD prediction using predictLifetime.

Data Types: string | char

ResponseVar: Variable indicating which column in the data accepted as input by pdFcnHandle contains the response variable, specified as ResponseVar and a string or character vector.

Note

ResponseVar is required for model validation when you use modelDiscrimination, modelDiscriminationPlot, modelCalibration, and modelCalibrationPlot.

Data Types: string | char

Optional customLifetimePDModel Name-Value Arguments

ModelID: User-defined model ID, specified as ModelID and a string or character vector. The software uses ModelID to format outputs, so keep the value short.

Data Types: string | char

Description: User-defined description for the model, specified as Description and a string or character vector.

Data Types: string | char

AgeVar: Age variable indicating which column in the data accepted as input by pdFcnHandle contains the loan age information, specified as AgeVar and a string or character vector.

Note

AgeVar, LoanVars, and MacroVars work together as data input for the predictor variables of the model. You must specify at least one of these inputs. The predict function validates that the data input contains all the predictor variables.

If the distinction between AgeVar, LoanVars, and MacroVars is not important for the custom model's PD prediction, use LoanVars to store all the predictor variables in the model.

An age variable is common for lifetime PD modeling. When you specify AgeVar, the predictLifetime function uses it to validate the periodicity of the rows in the data.

Data Types: string | char

LoanVars: Loan variables indicating which columns in the data accepted as input by pdFcnHandle contain the loan-specific information, such as origination score or loan-to-value ratio, specified as LoanVars and a string array or cell array of character vectors.

Note

AgeVar, LoanVars, and MacroVars work together as data input for the predictor variables of the model. You must specify at least one of these inputs. The predict function validates that the data input contains all the predictor variables.

If the distinction between AgeVar, LoanVars, and MacroVars is not important for the custom model's PD prediction, use LoanVars to store all the predictor variables in the model.

Data Types: string | cell

MacroVars: Macro variables indicating which columns in the data accepted as input by pdFcnHandle contain the macroeconomic information, such as gross domestic product (GDP) growth or unemployment rate, specified as MacroVars and a string array or cell array of character vectors.

Note

AgeVar, LoanVars, and MacroVars work together as data input for the predictor variables of the model. You must specify at least one of these inputs. The predict function validates that the data input contains all the predictor variables.

If the distinction between AgeVar, LoanVars, and MacroVars is not important for the custom model's PD prediction, use LoanVars to store all the predictor variables in the model.

Data Types: string | cell

WeightsVar: Variable indicating which column in data contains the observation weights, specified as WeightsVar and a string array.

Note

The default value ("") results in a weight of 1 for each row in the data.

For an example using WeightsVar, see Create Weighted Lifetime PD Model.

Data Types: string

TimeInterval: Time interval value, specified as TimeInterval and a positive scalar indicating the time interval used to define the 0-1 default indicator values in the response variable. For models trained with training data in panel data format, the time interval typically coincides with the distance between age values in the training data. For example, if the age data (AgeVar) is 1, 2, 3, ..., then the TimeInterval is 1; if the age data is 0.25, 0.5, 0.75, ..., then the TimeInterval is 0.25. For more information, see Lifetime Prediction and Time Interval.

By default, if you do not specify TimeInterval when you create a customLifetimePDModel object, TimeInterval is set to [].

Data Types: double
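One way to check which TimeInterval matches your training data is to look at the spacing of the age values within each loan. The following is a sketch in base MATLAB, assuming a hypothetical table with columns ID and Age whose rows are already sorted by age within each ID.

```matlab
% Hypothetical quarterly panel data: two loans, ages in years.
ID  = [1;1;1;2;2];
Age = [0.25;0.5;0.75;0.25;0.5];
T = table(ID,Age);

% Differences between consecutive age values, computed separately per loan.
steps = splitapply(@(a){diff(a)},T.Age,findgroups(T.ID));

% If the data is regularly spaced, a single unique step remains.
dt = unique(vertcat(steps{:}));

% The value could then be passed when creating the model, for example:
% pdModel = customLifetimePDModel(pdFcnHandle,IDVar="ID", ...
%     ResponseVar="Default",AgeVar="Age",TimeInterval=dt);
```

If dt is not a scalar, the age data is irregularly spaced and a single TimeInterval value does not describe it. For age values that are not exactly representable in floating point, consider rounding the differences before calling unique.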

Properties

ModelID: User-defined model ID, returned as a string.

Data Types: string

Description: User-defined description, returned as a string.

Data Types: string

UnderlyingModel: Custom model defined by pdFcnHandle, returned as the PD prediction function handle.

Data Types: function_handle

IDVar: ID variable indicating which column in the data input defined by pdFcnHandle contains loan or borrower ID, returned as a string.

Data Types: string

AgeVar: Age variable indicating which column in the data input defined by pdFcnHandle contains loan age information, returned as a string.

Data Types: string

LoanVars: Loan variables indicating which columns in the data input defined by pdFcnHandle contain loan-specific information, returned as a string array.

Data Types: string

MacroVars: Macro variables indicating which columns in the data input defined by pdFcnHandle contain macroeconomic information, returned as a string array.

Data Types: string

ResponseVar: Variable indicating which column in the data input defined by pdFcnHandle contains the response variable, returned as a string.

Data Types: string

WeightsVar: Variable indicating which column in data contains the observation weights, returned as a string array.

Data Types: string

TimeInterval: Time interval value, returned as a positive numeric scalar.

Data Types: double

Object Functions

predict                     Compute conditional PD
predictLifetime             Compute cumulative lifetime PD, marginal PD, and survival probability
modelDiscrimination         Compute AUROC and ROC data
modelCalibration            Compute RMSE of predicted and observed PDs on grouped data
modelDiscriminationPlot     Plot ROC curve
modelCalibrationPlot        Plot observed default rates compared to predicted PDs on grouped data

Examples

This example shows how to use customLifetimePDModel with a function handle to wrap a credit scorecard model as a lifetime PD model.

Load Data

Load the credit portfolio data. The data set is in panel data format, with multiple rows per loan.

load RetailCreditPanelData.mat
disp(head(data))
    ID    ScoreGroup    YOB    Default    Year
    __    __________    ___    _______    ____

    1      Low Risk      1        0       1997
    1      Low Risk      2        0       1998
    1      Low Risk      3        0       1999
    1      Low Risk      4        0       2000
    1      Low Risk      5        0       2001
    1      Low Risk      6        0       2002
    1      Low Risk      7        0       2003
    1      Low Risk      8        0       2004
disp(head(dataMacro))
    Year     GDP     Market
    ____    _____    ______

    1997     2.72      7.61
    1998     3.57     26.24
    1999     2.86      18.1
    2000     2.43      3.19
    2001     1.26    -10.51
    2002    -0.59    -22.95
    2003     0.63      2.78
    2004     1.85      9.48

Join the two data components into a single data set.

data = join(data,dataMacro);
disp(head(data))
    ID    ScoreGroup    YOB    Default    Year     GDP     Market
    __    __________    ___    _______    ____    _____    ______

    1      Low Risk      1        0       1997     2.72      7.61
    1      Low Risk      2        0       1998     3.57     26.24
    1      Low Risk      3        0       1999     2.86      18.1
    1      Low Risk      4        0       2000     2.43      3.19
    1      Low Risk      5        0       2001     1.26    -10.51
    1      Low Risk      6        0       2002    -0.59    -22.95
    1      Low Risk      7        0       2003     0.63      2.78
    1      Low Risk      8        0       2004     1.85      9.48

Fit Credit Scorecard Model

Use creditscorecard to create a creditscorecard object, use autobinning to perform automatic binning of specified predictors, and then use fitmodel to fit a logistic regression model to weight of evidence (WOE) data. In this example, the entire data set is used to train the model.

sc = creditscorecard(data,'IDVar','ID','PredictorVars',{'ScoreGroup' 'YOB' 'GDP' 'Market'},'ResponseVar','Default');
sc = autobinning(sc);
sc = autobinning(sc,'YOB','Algorithm','Split');
sc = fitmodel(sc,'Display','off');
displaypoints(sc)
ans=16×3 table
      Predictors            Bin          Points 
    ______________    _______________    _______

    {'ScoreGroup'}    {'High Risk'  }    0.61102
    {'ScoreGroup'}    {'Medium Risk'}     1.3043
    {'ScoreGroup'}    {'Low Risk'   }     1.9113
    {'ScoreGroup'}    {'<missing>'  }        NaN
    {'YOB'       }    {'[-Inf,2)'   }    0.56226
    {'YOB'       }    {'[2,5)'      }     1.0024
    {'YOB'       }    {'[5,7)'      }     1.4549
    {'YOB'       }    {'[7,Inf]'    }      2.509
    {'YOB'       }    {'<missing>'  }        NaN
    {'GDP'       }    {'[-Inf,0.63)'}      1.042
    {'GDP'       }    {'[0.63,Inf]' }     1.1657
    {'GDP'       }    {'<missing>'  }        NaN
    {'Market'    }    {'[-Inf,2.78)'}     1.0731
    {'Market'    }    {'[2.78,9.48)'}     1.1219
    {'Market'    }    {'[9.48,Inf]' }     1.2294
    {'Market'    }    {'<missing>'  }        NaN

Create customLifetimePDModel Object Using Function Handle

Use customLifetimePDModel with a function handle for the probdefault function.

pdFcnHandle = @(data) probdefault(sc,data);
pdModel = customLifetimePDModel(pdFcnHandle,IDVar='ID',AgeVar='YOB', ...
          Description='Scorecard as lifetime PD model',LoanVars='ScoreGroup', ...
          MacroVars={'GDP' 'Market'},ModelID='ScorecardLifetime',ResponseVar='Default');
disp(pdModel)
  CustomLifetimePD with properties:

            ModelID: "ScorecardLifetime"
        Description: "Scorecard as lifetime PD model"
    UnderlyingModel: @(data)probdefault(sc,data)
              IDVar: "ID"
             AgeVar: "YOB"
           LoanVars: "ScoreGroup"
          MacroVars: ["GDP"    "Market"]
        ResponseVar: "Default"
         WeightsVar: ""
       TimeInterval: []
pdModel.UnderlyingModel
ans = function_handle with value:
    @(data)probdefault(sc,data)

Predict Lifetime PD

Use the predictLifetime function to predict lifetime cumulative PD values for the first ID associated with the first eight rows of the data. The data input to predictLifetime must be in panel data form, with multiple rows per loan, and the function computes the cumulative probability of default for each period. For more information, see Time Interval and Data Input for Lifetime Prediction.

predictLifetime(pdModel,data(1:8,:))
ans = 8×1

    0.0085
    0.0134
    0.0182
    0.0236
    0.0272
    0.0312
    0.0324
    0.0335
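To see how cumulative values like these relate to per-period conditional PDs, the following sketch applies the standard survival relationship in base MATLAB. The conditional PD values are hypothetical, and the code illustrates the relationship only; it is not the toolbox's internal implementation.

```matlab
% Hypothetical conditional PDs for one loan over three consecutive periods.
condPD = [0.01; 0.02; 0.03];

% Survival probability: chance of no default through the end of each period.
survival = cumprod(1 - condPD);

% Cumulative lifetime PD: chance of default at or before each period.
cumPD = 1 - survival;

% Marginal PD: increase in the cumulative PD during each period.
marginalPD = [cumPD(1); diff(cumPD)];

disp(table(condPD,survival,cumPD,marginalPD))
```

Because each factor 1 - condPD lies between 0 and 1, the cumulative PD never decreases from one period to the next, which matches the pattern in the predictLifetime output above.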

Validate Model

By wrapping the scorecard model as a lifetime PD model, all the validation functionality of the lifetime PD models is available. For example, use modelCalibrationPlot to visualize the observed default rates compared to the predicted probabilities of default.

modelCalibrationPlot(pdModel,data,'YOB')
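The comparison behind a calibration plot can be sketched in base MATLAB: group the rows, then compare the observed default rate with the mean predicted PD in each group. The data below is hypothetical, and the unweighted RMSE across groups is a simplifying assumption of this sketch; the modelCalibration function may define or weight its RMSE differently.

```matlab
% Hypothetical panel rows: grouping variable, 0-1 default flag, predicted PD.
YOB         = [1; 1; 2; 2];
Default     = [0; 1; 0; 0];
PredictedPD = [0.2; 0.4; 0.1; 0.1];

% Assign each row to a group defined by the YOB value.
G = findgroups(YOB);

% Observed default rate and mean predicted PD per group.
observedRate  = splitapply(@mean,Default,G);
meanPredicted = splitapply(@mean,PredictedPD,G);

% Unweighted root mean squared error across groups (assumption of this sketch).
rmseUnweighted = sqrt(mean((observedRate - meanPredicted).^2));
```

A small value indicates that predicted PDs track observed default rates closely across the groups.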

Version History

Introduced in R2022b
