# Three terms exponential fit without initial guess


HamzaChem
on 17 Aug 2022

Edited: HamzaChem
on 20 Aug 2022

Is there a way to fit data (plenty of data points, thousands of them) with a 3-term exponential model without initial guesses? If not, is there a way to get an idea of the initial guesses mathematically, without prior knowledge of the physical model (just to be safe and unbiased by what we want to get)?

Many thanks!


### Accepted Answer

John D'Errico
on 17 Aug 2022

Edited: John D'Errico
on 17 Aug 2022

There are a lot of misconceptions here.

- Can you do a fit without ANY initial guesses? No. Any nonlinear fit will use initial guesses. The ones that do not explicitly require a start point, like fit, still use guesses, but they supply random numbers as guesses. And that can be actively a bad thing. (I'll discuss that later.)
- You CAN use tools like GA. But even there you will need to have some intelligent set of bounds for the parameters, else the exponentials will just arbitrarily overflow or underflow. Either case is an actively bad thing in a fit.
- You can use tools like a partitioned least squares solver. (I'll show an example of that too.) These tools are more robust to bad starting values, they converge more stably, etc. But they will still require some starting values.

A nonlinear least squares fit is just a search routine. You need to start it looking in some intelligent place. And exponentials are notorious for being nasty things, IF you fail to treat them properly. Again, remember that a fit is JUST a search. Think of it as putting a blind person down on the face of the earth, and tasking them with finding the point of lowest elevation. Give them a cane, to figure out which way is downhill. Also an altimeter (with spoken output), to figure out how high they are. And finally, give the poor fellow some scuba gear, as they will need it.

The point is, a nonlinear fit is little more intelligent than that. Start the fellow out in the vicinity of the Dead Sea, and expect he will get wet, but not that he will find a point in the bottom of the Pacific ocean.

Since you have supplied no data, I'll need to make up some data.

x = 1 + rand(500,1)*100;

y = 200 ./ x + randn(size(x));

plot(x,y,'.')

Yes. I know that data is NOT in fact a sum of three exponentials. But it has the necessary shape. And most of the time people are just guessing how many terms they need in such a model anyway. Much of the time, they don't even really know the true model. Too bad. This is my data. It looks a lot like any other exponential model data.

So let me see if I can fit this data as a sum of three exponentials. We can see some problems you might run into.

mdl = fittype('a*exp(-b*x) + c*exp(-d*x) + e*exp(-f*x)','indep','x')

First, look at the data. Remember that exponential data often looks like that. The problem is, the points at the left end will carry the most influence, dragging the rest of the curve around. Any error there will have a huge impact on the fit.

Next, remember that the magnitude of your independent variable will be hugely important. If x is too large, then the exponential rate parameters can easily be large enough, or small enough in comparison, that you will see overflows or underflows in the exponential. And then the fit will become complete crap. For example, the data I gave here has x on the order of 0-100. Say an average value of 50. Given a model with terms like

a*exp(-b*x)

if the starting estimate for b is on the order of 100, then ALL of your data will underflow at that start point. Your fit will probably fail to converge.
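The underflow is easy to check numerically. The following is only a small illustration; the two rate values `b_bad` and `b_good` are hypothetical numbers picked for the x range of the example data, not values from any real fit.

```matlab
% Illustration only: how a poorly scaled rate constant wipes out a term.
x = [1 50 100];          % x values spanning the range of the example data
b_bad  = 100;            % hypothetical start value, far too large for this x range
b_good = 0.05;           % hypothetical start value, roughly on the order of 1/mean(x)

term_bad  = exp(-b_bad  * x);   % exp(-100), exp(-5000), exp(-10000)
term_good = exp(-b_good * x);   % stays well inside floating point range

disp(term_bad)    % the entries at x = 50 and x = 100 underflow to exactly 0
disp(term_good)
```

With the large rate, the term contributes nothing over most of the data, so the solver sees essentially no gradient with respect to that term's parameters.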

As well, you NEED to know the sign of the rate constants. Is this a positive or negative exponential model? The problem is that the optimization routine will have a great deal of difficulty in switching the signs of the rate constants. So if you provide a model as

a*exp(b*x)

and you give it the wrong sign for b as a starting value, the solver will usually fail to converge to something reasonable. Here, the problem is that exp(-x) and exp(x) have fundamentally different shapes. But since solvers don't really understand mathematics, they just fail to converge.

fittedmdl = fit(x,y,mdl)

plot(fittedmdl)

hold on

plot(x,y,'b.')

hold off

I tested out this fit multiple times, allowing fit to choose its own starting values. It actually did surprisingly well on this specific data, even though the data is not at all a sum of exponentials. Sometimes it failed to converge. That happened a little less than half the time here in my play as I was writing this.
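If you would rather not rely on random starting values, fit accepts explicit start points and bounds. What follows is a minimal sketch under stated assumptions: the numbers in `start`, `lb` and `ub` are hypothetical, chosen only to suit x on the order of 1 to 100, and restricting the amplitudes to be non-negative is my assumption for a decaying curve, not something the data requires.

```matlab
% Sketch: pass explicit start values and bounds to fit instead of random ones.
rng(1);                                   % reproducible made-up data
x = 1 + rand(500,1)*100;
y = 200 ./ x + randn(size(x));

mdl = fittype('a*exp(-b*x) + c*exp(-d*x) + e*exp(-f*x)', 'independent', 'x');

% Check the coefficient order with coeffnames(mdl); here it is a,b,c,d,e,f.
start = [100 1   50 0.1  10 0.01];   % hypothetical amplitudes and rates
lb    = [0   0   0  0    0  0];      % assumption: non-negative amplitudes and rates
ub    = [Inf 5   Inf 5   Inf 5];     % keep rates small enough to avoid underflow

fittedmdl = fit(x, y, mdl, 'StartPoint', start, 'Lower', lb, 'Upper', ub);
disp(coeffvalues(fittedmdl))
```

Spreading the three start rates over different orders of magnitude is a deliberate choice here, so that the three terms do not begin as near copies of each other.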

Another problem is that sums of exponentials are themselves difficult problems to estimate. I saw that the Curve Fitting Toolbox did not have an 'exp3' model built in as an option. That is a good thing, because sums of exponentials often result in poorly posed problems in the optimization. One would see that in the form of multiple singular or nearly singular matrix warnings, telling you the solver is having grave numerical problems.

Ok, can you find good starting values for such a model? SOMETIMES this is doable. Doing it automatically is not easy though. You could do things like take the log of your data. Since the log of an exponential model is now linear in the unknown coefficients, you could see if the result is a linear fit. That is, if we have

y = a*exp(b*x)

then

log(y) = log(a) + b*x

Essentially, we now would hope to see a straight line fit, and the slope of that line would be a good estimate of your rate constant. Typically, I do this using a semi-logy plot. If the curve looks nice and linear, then an exponential model will be a good idea. Sadly, this fails for a sum of exponentials, since the log of a sum is nothing special.
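For the single-term case, the straight-line idea can be sketched with polyfit. The data below is synthetic and noise-free, and `a_true` and `b_true` are hypothetical values used only to show that the estimates come back; with real, noisy data the points near zero would need more care.

```matlab
% Sketch: start values for a single term y = a*exp(b*x) from a straight-line
% fit to log(y). Synthetic, noise-free data is assumed for illustration.
x = linspace(1, 100, 200)';
a_true = 5;  b_true = -0.03;             % hypothetical "true" parameters
y = a_true * exp(b_true * x);

keep = y > 0;                            % log is only defined for positive y
p = polyfit(x(keep), log(y(keep)), 1);   % p(1) = slope, p(2) = intercept

b0 = p(1);          % estimate of the rate constant
a0 = exp(p(2));     % estimate of the amplitude
fprintf('a0 = %.4f, b0 = %.4f\n', a0, b0)
```

The values `a0` and `b0` could then serve as a start point for a nonlinear fit of that one term.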

semilogy(x,y,'.')

What does that tell me? It tells me that a pure exponential fit must fail to fit well, since the curve is not a straight line. But maybe, if we have a sum of several terms, it may work. The semi-logy plot also points out the problem with trying to model a process with additive noise, and then taking the log.

Finally, you can use a partitioned nonlinear least squares solver. I have one of them posted on the file exchange, in the form of fminspleas. This tool needs to iterate on only the nonlinear parameters in the problem, so given a model like:

a*exp(-b*x) + c*exp(-d*x) + e*exp(-f*x)

the solver works differently on the intrinsically nonlinear parameters b, d, and f, compared to the conditionally linear parameters a, c, and e. Essentially, the solver needs to search only in a 3-dimensional parameter space instead of a 6-dimensional parameter space. And this makes the solve more efficient, more robustly convergent, etc. You can find fminspleas on the File Exchange for free download.
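To make the partitioning idea concrete, here is a hand-rolled sketch built on fminsearch and backslash. It is NOT fminspleas and does not use its interface; the synthetic data, the helper names `basis` and `resnorm`, and the start rates `k0` are all assumptions made for illustration.

```matlab
% Sketch of a partitioned (variable projection) least squares fit.
% This is NOT fminspleas; it only illustrates the idea with fminsearch.
rng(0);                                   % reproducible synthetic data
x = linspace(0, 10, 400)';
y = 3*exp(-0.2*x) + 2*exp(-1*x) + 1*exp(-5*x) + 0.001*randn(size(x));

% Design matrix for given rates k = [b d f]; column i is exp(-k(i)*x).
basis = @(k) exp(-x * k(:)');

% For fixed rates, the amplitudes [a c e] solve a LINEAR least squares
% problem, so backslash supplies them. Only k is searched iteratively.
resnorm = @(k) norm(y - basis(k) * (basis(k) \ y))^2;

k0   = [0.1 1 10];                        % hypothetical start values for the rates
kfit = fminsearch(resnorm, k0);           % 3-dimensional search
amps = basis(kfit) \ y;                   % recover the linear coefficients

yhat = basis(kfit) * amps;                % fitted curve
fprintf('rates: %s\namplitudes: %s\n', mat2str(kfit, 4), mat2str(amps', 4))
```

The search still needs start values for the three rates, but none at all for the amplitudes, which is where much of the robustness comes from.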

In the end though, a nonlinear least squares fit is often no better than the quality of the starting values you can provide.

##### 7 Comments

John D'Errico
on 20 Aug 2022

In that file, I find 4 columns of data.

load thisdata

plot(data(:,1),data(:,2))

plot(data(:,3),data(:,4))

So you want to find a model for that second curve as a sum of three exponential terms, if we ignore the first 4000 points before anything starts to happen? Something like this?

ind = 4000:size(data,1);

plot(data(ind,3),data(ind,4),'-')

### More Answers (1)

Sulaymon Eshkabilov
on 17 Aug 2022

Yes, it is possible to get such a fit. Here is a first draft of the code:

Xdata = ...

Ydata = ...

MODEL = fittype( @(a,b,c,d,f,g,x) (a*exp(b*x)+c*exp(d*x)+f*exp(g*x)), 'independent', 'x', 'coefficients', {'a', 'b', 'c', 'd', 'f', 'g'} );

[FITRESULTS, GF] = fit(Xdata, Ydata, MODEL);

plot(FITRESULTS, Xdata, Ydata)
