Constrained multiple linear regression with multiple dependent variables

I am doing a calibration of my mass spectrometer in order to quantify the amount of product that is being produced in my electrochemical cell from the mass peaks. I have 9 mass peaks (X) and 8 chemical products (Y) that I want to fit together through multiple linear regression.
I have about 860 separate data points covering 86 different concentration profiles that are linearly independent. I already successfully calibrated the reverse direction (linking the products to each of the mass peaks), but when I take the inverse of that matrix I suffer from huge error propagation and can no longer quantify my products. However, when I instead calibrate my product concentrations (Y) directly against my mass peaks (X) using mvregress in MATLAB, I get a coefficient matrix (beta) that is optimized to the data but contains values that are statistically unsound: t-statistics that are negative or smaller than 2.
Instead, I would like to do a multiple linear regression where I can put constraints on the values of beta (setting them to 0 where I know there is no linear relation between that mass peak and that product concentration). Is there a function I could use for this? I tried mvregress and LSQLIN, but LSQLIN only takes a single dependent variable.
Furthermore, I have no background in data science or chemometrics, so if there is anything I am doing wrong, or if you have any other suggestions, please let me know. It is much appreciated :)
  2 Comments
Matt J on 22 Nov 2022
"LSQLIN only takes a single dependent variable."
It doesn't. LSQLIN should be fine.
Daniel van den Berg on 23 Nov 2022
Thank you so much for your quick answer! However, the documentation says that the input d needs to be a vector, while my dependent variables form a matrix of multiple dependent variables. Also, when I try to implement it, I get the error "Matrix dimensions must agree."
So there must be something I am doing wrong. Could you please be so kind as to tell me where my error or wrongful assumption is?


Accepted Answer

Matt J on 23 Nov 2022
Edited: Matt J on 23 Nov 2022
lsqlin is applicable with:
N = size(Y,1);            % rows of beta (one per product); beta is the unknown, so use Y
C = kron(X.', eye(N));    % vec(beta*X) = kron(X.', eye(N)) * vec(beta)
d = Y(:);
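Then lsqlin(C,d,...) solves for beta(:), and the known-zero coefficients can be pinned through the bound arguments. A minimal sketch, assuming zeroMask is a logical matrix of the same size as beta that marks the coefficients you know must be zero:
% zeroMask is an assumed N-by-size(X,1) logical matrix of structural zeros
lb = -inf(numel(zeroMask),1);
ub =  inf(numel(zeroMask),1);
lb(zeroMask(:)) = 0;    % equal lower and upper bounds of 0
ub(zeroMask(:)) = 0;    % force those entries of beta(:) to zero
betaVec = lsqlin(C, d, [], [], [], [], lb, ub);
beta = reshape(betaVec, size(zeroMask));   % back to matrix form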
  2 Comments
Matt J on 23 Nov 2022
Edited: Matt J on 23 Nov 2022
You can also set this up with the problem-based optimization setup tools, e.g.
Y = rand(5); X = rand(5);                        % fake data
beta = optimvar('beta',[size(Y,1),size(X,1)], ...
    'Lower',0,'Upper',10);                       % fake bounds
prob = optimproblem;
prob.Objective = sum(reshape(Y - beta*X,[],1).^2);
solution = solve(prob)
Solving problem using lsqlin.

Minimum found that satisfies the constraints.

Optimization completed because the objective function is non-decreasing in
feasible directions, to within the value of the optimality tolerance, and
constraints are satisfied to within the value of the constraint tolerance.

solution = struct with fields:
    beta: [5×5 double]
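If you want the structural zeros in this problem-based form as well, they could be written as equality constraints; a minimal sketch with purely illustrative index pairs:
% illustrative: suppose beta(1,3) and beta(2,5) are known to be zero
prob.Constraints.structuralZeros = [beta(1,3); beta(2,5)] == 0;
solution = solve(prob);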
Daniel van den Berg on 23 Nov 2022
Thank you for your quick answers! It is much appreciated.


More Answers (1)

the cyclist on 23 Nov 2022
I think mvregress does what you want. It has been ages since I used it, but I wrote a fairly detailed answer giving three examples of design matrices for regressions with multiple response variables. The syntax is tricky, but if you carefully work through those three examples, you should get the gist and be able to figure out whether it will work for your case.
I'm pretty sure you can enforce "structural" zeros for coefficients, although that answer does not include an example of it.
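For what it's worth, here is a minimal sketch (my own illustration, not from that answer) of how structural zeros might be imposed with mvregress by omitting the corresponding columns from the per-observation design matrices. Xdata (n-by-p mass peaks), Ydata (n-by-d products), and zeroMask (d-by-p logical) are assumed names:
n = size(Xdata,1); d = size(Ydata,2); p = size(Xdata,2);
keep = ~reshape(zeroMask.',[],1);     % coefficients stacked response by response
Xcell = cell(n,1);
for i = 1:n
    D = kron(eye(d), Xdata(i,:));     % d-by-(d*p) full design matrix
    Xcell{i} = D(:,keep);             % drop the structural-zero columns
end
[bKept,Sigma] = mvregress(Xcell, Ydata);
bFull = zeros(d*p,1); bFull(keep) = bKept;   % re-embed the zeros
beta = reshape(bFull,p,d).';                 % d-by-p coefficient matrix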
