Fittype issue: unrealistic results and problems with upper and lower bounds
Ältere Kommentare anzeigen
Hi me and my colleague are working on a problem to solve for the a, b, and d coefficients in the following equation:
y = s*[a*exp(-x/b)+(1-a)*exp(-x/d)];
where
x (x2 in the code) = transposed vector E2=[7.0, 14.0, 28.0, 42.0, 70.0, 90.0, 112.0, 180.0, 224.0, 327.0], later called E2T;
y (y2 in the code) = transposed vector M2 = [59012.590, 38612.883, 18706.516, 11165.926, 5639.9341, 3804.1899, 2776.5547, 1011.2273, 919.47906, 829.12860 ], later called M2T;
We're trying to use fittype function, which is the following:
ft = fittype( 's*[a*exp(-x/b)+(1-a)*exp(-x/d)]');
opts = fitoptions( 'Method', 'NonlinearLeastSquares', 'StartPoint', [60000, 0.8, 100, 1000], 'Upper', [inf, 1, 2000, 2000], 'Lower', [100, 0.1, 1, 1]);
[fitresult2, gof2] = fit(x2,cast(y2, "double"),ft,opts);
f = fit(x2,cast(y2, "double"),ft,opts);
% Assign results to coefficients
t2w_1=fitresult2.b;
t2w_2=fitresult2.d;
ampw1=fitresult2.a;
ampw2=1-fitresult2.a;
Our goal is to obtain the values of ampw1, ampw2, a specifically of t2w_1 and t2w_2.
The problem is that my colleague uses the same equation, initial guesses and bounds and gets a perfect fit in Excel, while my MATLAB code fails to do so and even ignores the bounds I set for "a". I am obtaining terrible results: "a" is supposed to be between 0.1 and 1 while I get something like 590 instead. One thing we found is that MATLAB probably tries to "normalize" the data somehow before the fit which might be part of the problem, but I am not sure how to resolve the issue. I tried other functions than fittype but the issue persists. Please help me fix this problem so that I can get a realistic fit.
I'd appreciate your help. Thanks in advance.
2 Kommentare
the cyclist
am 2 Mär. 2024
It's difficult to help you debug this without being able to run your code. Since you don't tell us the values of the components of y, we can't do that. Can you upload the data?
Edham
am 2 Mär. 2024
Akzeptierte Antwort
Weitere Antworten (1)
John D'Errico
am 2 Mär. 2024
Bearbeitet: John D'Errico
am 2 Mär. 2024
Star pointed much out to you, but I want to say a few extra things, and the fit he got was relative crap due to the terrible choice of starting values, so I'll post this as as answer.
First, that a parameter in a nonlinear model is statistically different from zero is often MEANINGLESS. It matters when you are fitting polynomial models, where if a coefficient is zero, then that term could potentially be dropped from the model. But that is a meaningless statement here.
Next, lets look at your problem and model:
x2 = [7.0, 14.0, 28.0, 42.0, 70.0, 90.0, 112.0, 180.0, 224.0, 327.0].';
y2 = [59012.590, 38612.883, 18706.516, 11165.926, 5639.9341, 3804.1899, 2776.5547, 1011.2273, 919.47906, 829.12860 ].';
ft = fittype( 's*(a*exp(-x/b)+(1-a)*exp(-x/d))')
I pointed out in my comment to @Star Strider that you should never use randn to generate starting values, ESPECIALLY when applied to nonlinear exponential models. You never want to choose exponential rates with the wrong sign, and randn will do that randomly.
As Star points out, the parameters are listed in the order fittype shows them. So, here, they are {a,b,d,s}. I'll need to guess which order those bounds should lie in, and what intelligently chosen starting values should be.
a we know should lie in the interval [0,1]. It is a coefficient that trades the two exponential terms off between each other. A start value of 0.5 is probably reasonable.
b is an exponential rate parameter. It should ALWAYS be positive here, but because you are dividing x by b, we will expect b to be on the order of 10 to maybe 100.
d is also an exponential rate parameter. It should ALWAYS be positive here, but because you are dividing x by d, we will expect d to be on the order of 10 to maybe 100. It is VERY important to note that the starting value for b and d should NOT be the same. That will often blow many fitting tools out of the water! Never use the same rate constant for two exponential terms.
s is a multiplicative constant, scaling the entire problem. Since your y values are on the order of 10000 to 60000, a start point of 60000 seems reasonable.
opts = fitoptions( 'Method', 'NonlinearLeastSquares', 'StartPoint', [ 0.5, 100, 1000, 60000], 'Upper', [1, 2000, 2000, inf], 'Lower', [0, 1, 1, 10000]);
[fitresult2, gof2] = fit(x2,cast(y2, "double"),ft,opts)
plot(fitresult2,x2,y2)
And that fit looks eminently reasonable! The RMSE is pretty small, and the R^2 coefficient is also near 1, not that I ever give a tinker's damn about numbers like that. If the fit looks good, then it is good. Your eye is a far better measure of goodness of fit than some scalar number. Trust your eyes.
As you can see, what matters a lot is the start point, but more importantly that you have the bounds and starting values in the CORRECT ORDER for the parameters. Don't just guess which order they might be in. You can read it off directly from the fittype return.
Were I to look further into this the huge variation in y suggests that additive errors may not be the best choice. I might consider a weighted model perhaps, or even a log transformation to fit in terms of a proportional error structure. But that fit seems reasonable.
Kategorien
Mehr zu Get Started with Curve Fitting Toolbox finden Sie in Hilfe-Center und File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!

