Real or categorical predictors: which one is faster?
In regression, is there a guideline for treating predictors as real-valued or categorical?
In a fitting problem with inputs X and y, where X contains the hour of the day (e.g. 1, 2, 3, ...), I tend to treat it as a categorical predictor because the number of unique values in X is limited (i.e. 24). Surprisingly, fitting a Gaussian process with fitrgp seems slower this way than when the hour is treated as a real value.
My questions are:
- Why does it take longer with a categorical predictor?
- In a similar situation, is there a guideline for deciding whether to treat the predictors as real values or as categorical inputs?
3 Comments
Walter Roberson
on 17 Sep 2023
Have you experimented with passing uint8 data? I don't know if that is permitted; if it is, then it would signal that discrete algorithms are to be used.
mono
on 17 Sep 2023
"why does it take longer with categorical predictor?"
I'd venture owing to the large number of dummy variables introduced by having 24 levels of time being modeled as categorical instead of continuous/discrete. You could try artificially reducing the same data set to 24, 12, 2 levels and see if that hypothesis is correct.
Regardless of whether it's true or not, it's still the model definition and purpose that should be controlling decisions such as this, not anything to do with compute time.
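To make the dummy-variable hypothesis concrete, here is a minimal sketch in plain Python (not MATLAB, and not what fitrgp does internally). It assumes full dummy coding, i.e. one indicator column per level, which the thread does not confirm for fitrgp. The helper names `coarsen_hours` and `one_hot` are hypothetical, the data are synthetic, and hours are numbered 0 to 23 for convenience. It only shows how the number of predictor columns grows with the number of levels, following the suggestion to try 24, 12 and 2 levels.

```python
def coarsen_hours(hours, n_levels):
    """Map hours 0..23 onto n_levels equal-width bins (n_levels must divide 24)."""
    if 24 % n_levels != 0:
        raise ValueError("n_levels must divide 24")
    width = 24 // n_levels
    return [h // width for h in hours]


def one_hot(levels):
    """Return one indicator column per distinct level (full dummy coding, assumed)."""
    categories = sorted(set(levels))
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in levels:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return rows


# Synthetic data: 10 days of hourly samples, hours numbered 0..23.
hours = [h % 24 for h in range(240)]

for n_levels in (24, 12, 2):
    design = one_hot(coarsen_hours(hours, n_levels))
    # A numeric hour is a single column; a categorical one becomes n_levels columns.
    print(n_levels, "levels ->", len(design[0]), "predictor columns")
```

If timing the real fit on the 24-, 12- and 2-level versions of the same data shows the cost falling with the column count, that would support the hypothesis; if not, the slowdown has another cause.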