Curve Fitting Weibull error

6 Ansichten (letzte 30 Tage)
Joe Hasrouny
Joe Hasrouny am 20 Mai 2020
Kommentiert: Joe Hasrouny am 21 Mai 2020
Hello,
I am trying to find the corresponding parameters for my model function.
I always get this error message :
Error using internal.stats.getscheffeparam>ValidateParameters (line 182)
If non-empty, JW must be a numeric, real matrix.
This is the correspnding code :
x0 is the initial estimation of the 3 parameters P(1) P(2) P(3)
xdata = [0:660];
ydata = [1.022608794,1.009868918,1.009812254,1.001822681,0.995703007,0.995910774,0.991103808,0.993474237,0.991292687,0.996392415,0.996184648,0.9975068,0.995665231,1.000103883,1.002077667,1.001992672,1.002257102,0.995627455,0.988978921,0.989876095,0.993294802,0.991472121,0.989951647,0.993209807,0.997062934,0.98818563,0.997270701,0.999338924,0.994050317,0.990574947,1.001926564,1.000944394,1.012919311,1.020691674,1.020616123,1.01528974,1.016158583,1.007857359,1.006563539,1.00512806,1.01346706,1.002427093,0.999631686,1.000349426,1.000613856,0.993020928,0.990886597,0.989290571,0.985343004,0.983491992,0.986051299,0.995041931,0.99621298,0.993738667,0.995750227,1.001492143,1.000897174,1.00579858,1.016366349,1.019067316,1.008121789,1.006138561,0.996297975,0.985522439,0.978005062,0.982462602,0.979374433,0.993105923,0.994446963,0.997204594,0.994163645,0.996090209,0.982538154,0.985305228,0.991689332,0.997610683,1.00668631,1.011134406,1.013939257,1.008905636,1.010331671,1.000736627,1.001086053,0.999896117,0.995976881,0.989271683,0.991424902,0.988572832,0.989073361,0.988062859,0.987486778,0.990376624,0.995693563,0.993700892,0.999395588,1.004202554,1.005562481,1.003314823,0.99710071,1.00517528,1.012758764,1.004325325,1.007602372,1.022816561,1.015195301,1.010001133,1.020351692,1.022127153,1.021296086,1.029181777,1.049958447,1.075853732,1.126633802,1.184978468,1.269520626,1.352089,1.448313312,1.54306437,1.649025385,1.723736401,1.798116878,1.856149894,1.906108341,1.949021608,1.975445754,1.975880175,1.967493956,1.920425355,1.851900121,1.789012919,1.73890337,1.679236552,1.635813312,1.600407978,1.578696358,1.551563917,1.539428453,1.528596253,1.515365292,1.503324267,1.496751284,1.479837186,1.460590435,1.449843231,1.429982623,1.413899592,1.404767301,1.404417875,1.390790269,1.391866878,1.381591871,1.378550922,1.375831067,1.369343079,1.364819432,1.353231717,1.340992369,1.32654314,1.322680568,1.320338471,1.324729903,1.321840057,1.321027879,1.316126473,1.3035377,1.29244107,1.290108416,1.285112572,1.285773648,1.281854412,1.28347877,1.281656089,1.279257328,1.268444016,1.269709504,1.262919311,1.258169009,1.253711469,1.263287625,1.256884633,1.247742898,1.23849728,1.235682986,1.225256875,1.219892717,1.232604261,1.232396494,1.224161378,1.221469855,1.212620882,1.201108719,1.198492747,1.205273497,1.208201118,1.208758311,1.204839075,1.207048957,1.202251435,1.194091871,1.195574569,1.200409867,1.197538909,1.192949154,1.195177924,1.190002644,1.191249244,1.191475899,1.195716228,1.188094968,1.18941712,1.185563992,1.177744409,1.171841946,1.165344515,1.159205953,1.152614083,1.150375869,1.145266697,1.147986552,1.143736778,1.141668555,1.138183741,1.138249849,1.130590813,1.129674751,1.119191976,1.116424902,1.113110079,1.116661,1.116594893,1.112071245,1.113053415,1.109105848,1.112996751,1.113393397,1.114904427,1.112600106,1.116160471,1.116151028,1.110824645,1.112656769,1.118124811,1.119333636,1.110701874,1.11670822,1.120211922,1.11225068,1.103505591,1.104110003,1.102051224,1.097725899,1.104062783,1.100615745,1.101843457,1.096583182,1.09700816,1.088423617,1.099520248,1.093853883,1.091898988,1.100691296,1.110210789,1.102362874,1.099964113,1.099784678,1.097338697,1.085647099,1.087224237,1.09027463,1.091030145,1.080075174,1.084551602,1.074711015,1.068355243,1.066929208,1.072057268,1.068100257,1.076939785,1.072586129,1.076713131,1.067996374,1.068638561,1.06820414,1.075693185,1.069044651,1.074711015,1.073039438,1.073445527,1.071688954,1.071887277,1.073247205,1.071027879,1.065361514,1.073369976,1.069346857,1.064908205,1.064388788,1.058967966,1.055842022,1.060998413,1.058457993,1.056625869,1.061583938,1.061555606,1.053197718,1.056219779,1.053188274,1.051639468,1.04618087,1.052385539,1.047814672,1.045718117,1.0402123,1.038899592,1.036217513,1.042592173,1.049580689,1.058023572,1.064870429,1.067646948,1.063321623,1.055417044,1.049977335,1.044896494,1.049117936,1.044820943,1.044839831,1.052224992,1.052121109,1.047890224,1.054349879,1.058854639,1.055709807,1.048759066,1.047323587,1.042582729,1.045727561,1.043262693,1.047181928,1.043196585,1.041893321,1.03788909,1.037501889,1.032751587,1.0293801,1.020427244,1.026754684,1.03026783,1.04078838,1.048749622,1.05928906,1.049967891,1.052942732,1.041610003,1.046105319,1.040731717,1.042932155,1.036963584,1.048466304,1.037161907,1.044112647,1.04555757,1.040693941,1.03343155,1.036123073,1.037596328,1.03918291,1.041949985,1.040098972,1.038956256,1.034017075,1.034224841,1.036368616,1.044320414,1.054444319,1.050883953,1.047333031,1.036311952,1.027226881,1.02068223,1.026565805,1.026376927,1.031051677,1.027972953,1.033337111,1.027330765,1.030758915,1.029568979,1.031278332,1.025914173,1.029313992,1.022221593,1.026934119,1.02746298,1.029276216,1.034300393,1.042337186,1.037549108,1.037690768,1.033261559,1.02746298,1.022419915,1.029049562,1.032553264,1.031523874,1.039031807,1.032789362,1.031438879,1.031958296,1.030541704,1.024620354,1.022948776,1.019029541,1.011361061,1.014723104,1.012003249,1.023373753,1.029833409,1.032024403,1.032609927,1.031221668,1.019671729,1.014562557,1.019199532,1.020956105,1.018302357,1.027019115,1.029880629,1.028275159,1.023770399,1.025725295,1.026036945,1.023590964,1.021286642,1.020408356,1.025980281,1.02590473,1.024403143,1.0337432,1.031401103,1.026367483,1.019001209,1.023014884,1.01756573,1.020937217,1.018198474,1.032515488,1.030551148,1.027387428,1.030201723,1.035820867,1.024799788,1.023298202,1.022316032,1.016970762,1.013004306,1.017707389,1.021522741,1.019246751,1.02451647,1.037492445,1.035764204,1.03726579,1.041402236,1.036519719,1.025819734,1.025640299,1.023883726,1.02710411,1.030409489,1.029257328,1.03031505,1.025706407,1.019275083,1.01845346,1.021003324,1.023609852,1.031504986,1.043234361,1.046048655,1.038144077,1.037700212,1.03820074,1.031703309,1.031731641,1.036925808,1.032874358,1.032968797,1.034224841,1.034942581,1.040193412,1.044452629,1.034224841,1.035716984,1.033393775,1.036000302,1.03125,1.042252191,1.038446283,1.045009822,1.043451571,1.040863932,1.035178679,1.02906845,1.025526972,1.020304473,1.017811272,1.013807041,1.02616916,1.017820716,1.022674902,1.030900574,1.038927924,1.040561726,1.042686612,1.037756875,1.040089529,1.035074796,1.022722121,1.02259935,1.021909943,1.02202327,1.022882668,1.02398761,1.031174448,1.030702251,1.027359096,1.02590473,1.023713735,1.017169084,1.020125038,1.017594062,1.011710487,1.017046313,1.020644455,1.014279238,1.004901405,1.013495391,1.02367596,1.019813388,1.023392641,1.027566863,1.025149214,1.009406165,1.009972801,1.010001133,1.012645437,1.01373149,1.022684346,1.029323436,1.029219553,1.035726428,1.037133575,1.028917347,1.024280372,1.025753626,1.027359096,1.026509142,1.034205953,1.041704442,1.043489347,1.0324966,1.030966682,1.025479752,1.012324343,1.017395739,1.020455576,1.020550015,1.027047446,1.029200665,1.022136597,1.018642339,1.016385237,1.009047295,1.014458673,1.01524252,1.022164929,1.027784074,1.035065352,1.039079027,1.041902765,1.041005591,1.035207011,1.037681324,1.025725295,1.020408356,1.028001284,1.030645588,1.023619296,1.02933288,1.033195452,1.024280372,1.028190163,1.032912134,1.032005515,1.032704367,1.041043367,1.039938426,1.040514506,1.035905863,1.028879571,1.023723179,1.024535358,1.017971819,1.024714793,1.028057948,1.025800846,1.026065277,1.024223708,1.018056815,1.019870051,1.023288758,1.020493351,1.024667573,1.026405258,1.01953007,1.016734663,1.018264581,1.015393623,1.025970837,1.023770399,1.018651783,1.009406165,1.01154994,1.000897174,1.011937141,1.01627191,1.021683288,1.017046313,1.020134482,1.009132291,1.014817543,1.017905712,1.025404201,1.023779843,1.029795633,1.022325476,1.026065277,1.018585675,1.022193261,1.029568979,1.034272061,1.028407374,1.027576307,1.03244938,1.026726352,1.024233152,1.019397854,1.02389317,1.017263524,1.022504911,1.018906769,1.027264657,1.027944621,1.03057948,1.028870127,1.034017075,1.031108341,1.03290269,1.030881686,1.029606754];
fun = @(P,x)(-P(1)/P(2)).*(((x-P(3))/P(2)).^(P(1)-1)).*exp(-((x-P(3))/n).^P(1));
x0=[-1.2,33.15,105];
g = fitnlm(xdata,ydata,fun,x0)

Akzeptierte Antwort

John D'Errico
John D'Errico am 20 Mai 2020
Bearbeitet: John D'Errico am 20 Mai 2020
We are given not much more than this:
fun = @(P,x)(-P(1)/P(2)).*(((x-P(3))/P(2)).^(P(1)-1)).*exp(-((x-P(3))/n).^P(1));
fun is a function of P and x. But then we see the number n. No hint as to what n is.
I might guess 17, or some others might guess 42, which is probably the answer to all questions in the end anyway.
So, is n a typo? If not, and n has some reasonable value that you have kept secret, there are still issues. I cannot test your code, since there is no hint about n. But something in your code is producing results that are not real numbers. I looked at ydata, and it is all real.
But consider code fragments like this:
(((x-P(3))/P(2)).^(P(1)-1))
x is a vector of integers. We see that as the vector: 0:660.
We have x0 as:
x0=[-1.2,33.15,105];
So P(3) is 105, although as iterations proceed, it will not remain an integer. Anyway, if we take the vector [0:660], then subtract something on the order of 105 from it, we will have some arbitrary negative numbers.
ANd P(1) is something on the order of -1.2. So we then raise those numbers with various signs on them to some non-integer power. Just for kicks, what is this:
(-105)^-1.2
ans =
-0.00303759931155025 + 0.00220694508288106i
Oh, yes. Raising a negative number to some real, non-integer power produces complex numbers. If I now read the error message, I start to wonder if this is what produces the error. In turn, that brings about some questions as to the validity of your model, or possibly about the parameters you have chosen as start points.
  5 Kommentare
John D'Errico
John D'Errico am 21 Mai 2020
Bearbeitet: John D'Errico am 21 Mai 2020
Actually, I gave you a hint about what was wrong. But I'll concede you may have missed it. So let me be a bit more serious here.
The trick is to understand what is in the model, what the parameters mean, and when to add or subtract, when a parameter should be negative, etc.
For example, suppose you wanted to fit a Gaussian model to some data. It will probably be in a form like this:
y(x) = a*exp(-((x - b)/c)^2)
what do the parameters mean in context? First, what happens when x==b? It turns out that b is the location of the peak of the Gaussian mode, along the x-axis. In statistics, we think of that parameter as the mean.
Effectively, a term like x-b shifts the peak of that Gaussian model to the right or left. If nothing else in the model changes, changing b just shifts that entire curve to the right or to the left.
How about a? Remember we said that when x==b, the curve is at its peak. What is the value of y at that location?
y(b) = a*exp(-((b - b)/c)^2) = a*exp(0) = a
So in context of this model, what does a tell us? It is a scale parameter. It tells us how high the peak is.
Finally, what does c do in that model? This is perhaps the most difficult to explain, but c works sort of like a standard deviation. Large values of c make the curve a very wide, flattish thing. Small values of c make it a very narrow peaky thing.
All of that is for the Gaussian model.
The point is, when you understand how the parameters in the model enter in, what they do to the curve, then you can appreciate how to choose good starting values. You can also see when there is a problem. But remember, when we do this subtraction: (x-P(3)), then we are essentially shifting the entire curve to the right (as long as P(3) is a positive number.)
So now it is time to look at your data.
plot(xdata,ydata)
grid on
You described that as a Weibull model. I'll trust you (at least for the moment) without looking seriously at the model. But one characteristic of a Weibull is it is not defined at all for negative values of X.
Your Weibull model has a parameter - P(3), that alows the model to be shifted to the right or to the left. Effectively, that defines the point below which the model is NOT DEFINED.
What happens when data is below that point? You get strange crap happening. In this case, you get negative numbers, which when we raise them to non-integer powers results in complex numbers.
fun = @(P,x)(-P(1)/P(2)).*(((x-P(3))/P(2)).^(P(1)-1)).*exp(-((x-P(3))/P(2)).^P(1));
Essentially, bad stuff happens below P(3). But your data goes down all the way to zero in x. Now, when I look at your data, I see a model that is essentially constant below around 100 or so. And that is PROBABLY why you think that P(3) should indeed be 105 or so. You might be even close to correct in that respect.
You seem to think that these are decent starting parameter for that model. As I said, P(3) might be close. I'll now plot the model you have indicated.
P = [-1.2,33.15,105];
fplot(@(x) fun(P,x),[P(3),500])
And that looks reasonably like your data, almost, sort of.
For x above P(3), we get a number.
>> fun(P,125)
ans =
0.0176
>> fun(P,100)
ans =
5.8398e+03 - 1.9933e+02i
But for x below P(3), we get complex valued crapola. As I said, this model is NOT defined when x is less than P(3).
But wait! You have data that goes down all the way to x==0. What can you do?
The idea is to change your model, to make it a piecewise thing. When X is less than P(3), we will make fun return 0. Does that make sense?
Sort of. Except that it would not be consistent with your data. fun, as you have defined it, approaches zero as x-->inf.
>> fun(P,1000)
ans =
2.5202e-05
>> fun(P,10000)
ans =
1.2983e-07
>> fun(P,100000)
ans =
8.0304e-10
But now look back at your data. It has an asymptote at roughly 1, NOT 0. And below x==100 or so, y seems to fluctuate around 1, NOT 0.
That suggests your model is truly incorrect, at least it is so for this data.
We need to have a baseline parameter.
Lets go back to the Gaussian model. I'll change it by adding one new parameter.
y(x) = d + a*exp(-((x - b)/c)^2)
What does d do here? It allows us to now shift the entire curve up or down by a constant d. Whereas the orginal Gaussian model must approach zero as x-->inf, here it will approach d as a limit.
Do you see that? How it works?
So now lets change fun. fun will now be a function with 4 parameters, not just 3 parameters. (If we do not do this, then you cannot accomplish this curve fit.)
x0 = [-1.2, 33.15, 105, 1];
fun = @(P,x) P(4) + (-P(1)/P(2)).*((max(x-P(3),0)/P(2)).^(P(1)-1)).*exp(-(max(x-P(3),0)/P(2)).^P(1));
What did I do different? Wherever I saw (x-P(3)) I changed that to max(x-P(3),0). Essentially, this replaces anything below 0 when you subtract P(3) from x, anything below that gets turned into 0. We will no longer see negative numbers created, and then raised to a power. That is good.
Next, I added the parameter P(4) to fun. And what does that do? It now allows us to estimate a vertical translation of your entire curve. Essentially, below P(3), the function will now predict P(4). The asympotote as x-->inf is now also P(4). Both of these changes are important.
Does that help us?
>> x0 = [-1.2, 33.15, 105, 1];
>> fun(x0,105)
ans =
NaN
>> fun(x0,100)
ans =
NaN
We still have a problem. Now below that point, we get a NaN. So I'll make one final tweak to the model. I need to make the max function return a TINY number below P(3), not zero. So here we see the term fragment as max(x-P(3),eps).
>> fun = @(P,x) P(4) + (-P(1)/P(2)).*((max(x-P(3),eps)/P(2)).^(P(1)-1)).*exp(-(max(x-P(3),eps)/P(2)).^P(1));
>> fun(x0,100)
ans =
1
> fun(x0,1000000)
ans =
1.0000
So for x below P(3), we get something reasonable. As x-->inf, we also get 1. In both extreme cases the model now does something reasonable.
Does that mean it is time to fit a model? We are getting close. The problem is that your model is still not properly scaled. We need to have a 5th parameter in there.
Remember that the parameter I called a does for that Gaussian model? Your model needs to have a scale parameter in the vertical direction too.
I'll guess that number should be on the order of 100 here to make things work out.
fun = @(P,x) P(4) + P(5)*(-P(1)/P(2)).*((max(x-P(3),eps)/P(2)).^(P(1)-1)).*exp(-(max(x-P(3),eps)/P(2)).^P(1));
x0 = [-1.2, 33.15, 105, 1, 100];
g = fitnlm(xdata,ydata,fun,x0)
g =
Nonlinear regression model:
y ~ F(P,x)
Estimated Coefficients:
Estimate SE tStat pValue
________ _________ _______ ___________
P1 -0.69962 0.017736 -39.447 2.6215e-175
P2 39.285 1.1525 34.088 2.4328e-147
P3 111.16 0.21368 520.21 0
P4 1.0081 0.0014369 701.58 0
P5 62.072 1.4565 42.616 3.7784e-191
Number of observations: 661, Error degrees of freedom: 656
Root Mean Squared Error: 0.023
R-Squared: 0.978, Adjusted R-Squared 0.978
F-statistic vs. constant model: 7.26e+03, p-value = 0
And now we see that fitlm works. P(4) is the baseline parameter. As expected, it should be around 1. P(3) is the point where the curve does something above the baseline. 111 seems reasonable.
The final coefficients are:
g.Coefficients.Estimate
ans =
-0.6996
39.2852
111.1581
1.0081
62.0715
plot(xdata,ydata,'r.')
hold on
plot(xdata,fun(g.Coefficients.Estimate,xdata),'b-')
legend('Data','fitnlm model')
grid on
xlabel xdata
ylabel ydata
So not a splendid fit, as it seems to be missing the peak by just a bit. That error is known as "lack-of-fit". It happens when the model you have chosen for your data is not truly a correct model for that data, but only one that works approximately well. That seems to be exactly the case here, where I assume you have some data, and then someone has probably told you to use this model. In fact, a Weibull distribution is not a valid model for that data, as a Weibull PDF has the property that the integral of that function from P(3) to inf must be 1. That would not happen here.
Does this function, as estimated, have that property?
Pfinal = g.Coefficients.Estimate;
>> integral(@(x) fun([Pfinal(1:3);0;1],x),Pfinal(3),1000)
ans =
0.8933
>> integral(@(x) fun([Pfinal(1:3);0;1],x),Pfinal(3),10000)
ans =
0.9793
>> integral(@(x) fun([Pfinal(1:3);0;1],x),Pfinal(3),100000)
ans =
0.9959
>> integral(@(x) fun([Pfinal(1:3);0;1],x),Pfinal(3),1000000)
ans =
0.9992
As you can see, if I replace the DC offset parameter P(4) with 0, and the amplitude adjustment P(5) with 1, then indeed it does seem to behave as a PDF. The integral seems to approach 1.
integral(@(x) fun(Pfinal,x),Pfinal(3),1000000)
ans =
1.0080e+06
However, as it must be to fit your data, this is NOT a Weibull PDF. The integral under that curve out to infinity is not even finite.
Again, I had to SIGNIFICANTLY tweak your model to make it work here. Still essentially the same model. But without these changes, you will get crapola. fitnlm will completely fail otherwise, as you have seen.
Joe Hasrouny
Joe Hasrouny am 21 Mai 2020
Thank you very much for the detailed explanation. It makes sense now.

Melden Sie sich an, um zu kommentieren.

Weitere Antworten (0)

Kategorien

Mehr zu Linear and Nonlinear Regression finden Sie in Help Center und File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by