I am working on a project where I need to predict multiple response variables for a given data set, likely using random forests or boosting. Are there any functions I could use that provide this? Basically, what I mean is:
data = (2-D matrix of regressors)
regression model = regression_function(data,response variables)

Accepted Answer

Ive J on 6 Jun. 2023

0 votes

I'm not aware of such a function in MATLAB, but you can loop over your target/response variables, and each time fit a new model. Something like this:
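% data is a table holding all features and outcomes; features and
% responseVars are string arrays of its variable names
models = cell(numel(responseVars), 1);
for k = 1:numel(models)
    tbl = data(:, [features, responseVars(k)]);  % features + the k-th outcome
    models{k} = fitrensemble(tbl, responseVars(k));
end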
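
As a minimal sketch of how the loop above might be exercised end to end, the following uses a small synthetic table. The variable names (x1, x2, y1, y2) and the data-generating formulas are made up for illustration and are not from the thread.

```matlab
rng(0);                                   % reproducible synthetic data (assumption)
n = 200;
data = table(randn(n,1), randn(n,1), 'VariableNames', {'x1','x2'});
data.y1 = 2*data.x1 + 0.1*randn(n,1);             % made-up first outcome
data.y2 = data.x1 .* data.x2 + 0.1*randn(n,1);    % made-up second outcome

features     = ["x1", "x2"];
responseVars = ["y1", "y2"];

% Fit one boosted regression ensemble per response variable
models = cell(numel(responseVars), 1);
for k = 1:numel(models)
    tbl = data(:, [features, responseVars(k)]);
    models{k} = fitrensemble(tbl, responseVars(k));
end

% Collect the per-response predictions into one n-by-d matrix
Yhat = zeros(n, numel(models));
for k = 1:numel(models)
    Yhat(:, k) = predict(models{k}, data(:, features));
end
```

Each column of Yhat comes from a separate model, so nothing in this sketch shares information between the responses.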

7 Comments

Alejandro Plata on 6 Jun. 2023
That's unfortunate. Yeah, I attempted this method, but the interactions between the response variables are so strong that the result is no better than random guessing.
Ive J on 6 Jun. 2023
I don't understand why the interaction between response variables matters. Can you please elaborate on your objective? Do you mean independent variables (AKA features)? If you intend to perform a multivariate (NOT multivariable or multiple) regression, please check @the cyclist's answer. Note, however, that mvregress can only perform linear regression.
Ive J on 6 Jun. 2023
Edited: Ive J on 6 Jun. 2023
Alejandro Plata on 6 Jun. 2023
Sorry for the lack of clarity on my objective. I am analyzing a data set of sensitivity vectors between the nominal and varied trajectories in phase space for a damaged Lorenz system. Damage is simulated by changing the three parameters (sigma, beta, and r) slightly. The response variable in the case of machine learning is the continuous range of these parameters. So, the feature data is a set of vectors and magnitudes and the labeling data is a set of three parameters. I say the interaction between the response variables is strong because the parameters define how the phase space changes in a nonlinear manner.
What I want is a function that allows for the labeling of all three parameters at once, accounting for their combined relationships, rather than a system of three models that independently attempt to label the parameters. Basically some sort of adaptation of basic machine learning algorithms like SVM, random forest, etc. rather than a simple transformation to a set of single output problems (which seems to fail altogether at prediction).
Attached below are the results of changing just one parameter and using a stacked ensemble to predict the parameter values of the systems as well as changing two parameters and attempting the same process on test sets. Blue represents the test set and orange represents the learning model.
[Figure: Stacked ensemble where r is varied from its nominal value of 28 following a Gaussian distribution with a standard deviation of 1.]
[Figure: Prediction of the sigma parameter variation when both sigma and r were varied.]
Using multiple linear learning models seems to fail. Thank you for your help so far, and hopefully this makes more sense!
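
One possible middle ground between fully independent models and a true multi-output learner is a regressor chain, where each model also receives the earlier responses as extra features. This technique is not proposed anywhere in the thread; the sketch below is only an illustration on made-up synthetic data, with bagged trees from fitrensemble, and there is no guarantee it helps on the Lorenz data.

```matlab
rng(0);                                        % reproducible synthetic data (assumption)
n = 300;
X = randn(n, 3);                               % made-up features
Y = zeros(n, 3);                               % three coupled, made-up responses
Y(:,1) = X(:,1) + 0.1*randn(n,1);
Y(:,2) = Y(:,1) .* X(:,2) + 0.1*randn(n,1);
Y(:,3) = Y(:,1) + Y(:,2) + X(:,3) + 0.1*randn(n,1);
idxTrain = 1:200;
idxTest  = 201:n;

% Training: link k sees the features plus the true responses 1..k-1
d = size(Y, 2);
chain = cell(d, 1);
Z = X(idxTrain, :);
for k = 1:d
    chain{k} = fitrensemble(Z, Y(idxTrain, k), 'Method', 'Bag');
    Z = [Z, Y(idxTrain, k)];                   % append response k for the next link
end

% Prediction: feed each predicted response forward along the chain
Zt = X(idxTest, :);
Yhat = zeros(numel(idxTest), d);
for k = 1:d
    Yhat(:, k) = predict(chain{k}, Zt);
    Zt = [Zt, Yhat(:, k)];
end
```

The order of the responses in the chain is a design choice; here it simply follows the column order of Y.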
Ive J on 6 Jun. 2023
In that case you're facing a multivariate problem. To the best of my knowledge, MATLAB does not have built-in functions for multivariate random forests (you can check the above R package for this purpose), but it does support multi-class SVMs (fitcecoc), which may suit your specific problem.
the cyclist on 8 Jun. 2023
fitcecoc doesn't fit multiple response variables. It fits a single (categorical) response variable that has more than two categories.
Ive J on 8 Jun. 2023
Edited: Ive J on 8 Jun. 2023
Yes, that's correct, and I didn't mean that fitcecoc is multivariate. For a multivariate SVM one could check sklearn. But for the OP's specific problem, I meant something like this: aggregate the different responses into a single combined label, to see how one label vs. the others differs compared to separate SVMs:
y1 = ["y1-1", "y1-2", "y1-3"];
y2 = ["y2-1", "y2-2"];
y_multi = y1' + "_" + y2;
y_multi = categorical(y_multi(:))
y_multi = 6×1 categorical array
     y1-1_y2-1
     y1-2_y2-1
     y1-3_y2-1
     y1-1_y2-2
     y1-2_y2-2
     y1-3_y2-2
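
A rough sketch of how such combined labels might be used with fitcecoc follows. It assumes the responses are discrete (continuous parameters such as those in the question would first have to be binned), and the features and labels here are synthetic and made up for illustration.

```matlab
rng(0);                                   % reproducible synthetic data (assumption)
n = 300;
X = randn(n, 2);                          % two made-up features

% Two made-up discrete responses, each with two levels
y1 = "y1-" + (1 + (X(:,1) > 0));
y2 = "y2-" + (1 + (X(:,2) > 0));

% Aggregate both responses into a single categorical label
yMulti = categorical(y1 + "_" + y2);

% One multi-class model on the combined label
mdl  = fitcecoc(X, yMulti);
pred = predict(mdl, X);

% Split the predicted combined label back into its two parts
parts  = split(string(pred), "_");        % n-by-2 string array
y1Pred = parts(:, 1);
y2Pred = parts(:, 2);
```

The number of combined classes grows as the product of the levels of each response, so this only stays practical for a few coarse levels.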


More Answers (1)

the cyclist on 6 Jun. 2023

0 votes

The only MATLAB function (that I know of) that can handle multiple response variables is mvregress. Take a look at my answer here for examples with some common design matrices. There are of course examples in the documentation page I linked, as well.
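
As a minimal sketch of the mvregress route, the following fits all responses at once with a shared n-by-p design matrix. The data and the coefficient matrix B are synthetic and made up for illustration. As noted elsewhere in the thread, this is a linear model.

```matlab
rng(0);                                   % reproducible synthetic data (assumption)
n = 200;
x = randn(n, 2);                          % two made-up predictors
X = [ones(n,1), x];                       % n-by-p design matrix with an intercept column
B = [ 1 -1 0.5;                           % made-up p-by-d true coefficients
      2  0 1;
     -3  1 2];
Y = X*B + 0.1*randn(n, 3);                % n-by-d matrix of responses

% With a single n-by-p design matrix and d > 1 responses,
% beta is p-by-d and Sigma is the d-by-d error covariance estimate
[beta, Sigma] = mvregress(X, Y);

Yhat = X*beta;                            % fitted values for all responses
```

The off-diagonal entries of Sigma describe how the residuals of the different responses co-vary, which per-response models do not estimate.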

Release: R2023a
Asked: 5 Jun. 2023
Edited: 8 Jun. 2023
