How to tweak an equation to properly fit a dataset using lsqcurvefit?

4 Ansichten (letzte 30 Tage)
Juneight
Juneight am 25 Sep. 2023
Verschoben: Voss am 27 Sep. 2023
I have an equation that I'm working on that has 3 independent variables and may need to include additional fractional terms to properly fit a dataset. Here is the original equation:
Units:
y in m/s
and in m
in Pa
in
is unitless (Poisson's ratio)
I've tried setting the parameters to be used in fitting as such:
where and are the parameters.
However, this setup was not sufficient for the equation to be fitted to my dataset, as there were huge differences.
My question is, how can I know which fractional terms to add to this equation to be used in the fitting process, while also conserving the units of the output? For instance, I tried modifying the equation like this:
but the fit did not change, as far as I can tell visually.
I am using lsqcurvefit for fitting, here is my code:
independentVariables = [x1 x2 x3];
y = @(p, x) p(1) .* sqrt((2 .* x(:,1)) + p(3) .* (x(:,2) ./ x(:,3)) .^ p(4) ./ ...
(c1 .* (3 + c2) .* (1 - ((x(:,2) - x(:,3)) ./ x(:,2)) .^ p(2))));
% Initial guess for parameters, lower and upper bounds
initialGuess = [1 1 1];
lb = [];
ub = [];
% Call lsqcurvefit
parameters_y = lsqcurvefit(y, initialGuess, independentVariables, ydata(:,1),lb,ub);
Here is my dataset, I also added it as an attachment to this post:
ydata =
36165.2399652261
36832.2283727431
38244.8839743203
39532.2710645211
38985.1533416210
34225.9039660907
30164.7430624110
26359.0764634342
22772.0684138657
24623.4765316140
8877.15808522203
8717.70659273395
8558.63955174474
8603.93260894558
8659.68585200240
6188.18034995559
6123.01438079126
6057.56501843796
5991.56509717007
5924.71769692532
4872.24385495395
4852.85576311786
4832.76067401530
4811.90659475242
4790.23277171101
ydata = reshape(ydata,5,[]) % ydata is actually a 2D matrix where rows vary with x2 and columns vary with x3.
And finally here is how my fit looks right now:
Different angle:
Any help would be appreciated.
Edit: Added the independent variables, where columns in independentVariables are .
Edit 2: Added units of c1, c2, x1, x2, x3 and y.
Edit 3: Corrected the variables in the post to be in line with the .mat files.
  8 Kommentare
William Rose
William Rose am 26 Sep. 2023
You asked, "What would you recommend for the number of parameters to be used in such an equation?".
That is a very good question with no simple answer. There is no specific number of parameters that are recommended. I have read (I forget where) that one should have at least 10 data points per fitted parameter. One could question the justification for this recommendation. I think it is not a bad place to start. Of course you want the data points to span ranges that are significantly larger than the measurement noise in their respective dimensions. A good approach, which takes some effort, is to compute error bars for the parameter estimates. If the 95% error bar for parameter P spans zero, then you probably should not keep parameter P. There are various approaches to estimating whether added parameters are justified in a model: Akaike information criterion, Bayesian information criterion, Rissanen's minimum description length are well known.
Juneight
Juneight am 26 Sep. 2023
Sorry @dpb and @William Rose, I made a mistake while preparing the question post and changed the naming of the independent variables, they should make more sense now, I also updated the post but the independent variable that was previously named should have been , as its in Pascals and it is the third column in the independentVariables.
"In your code, you did not specify lower and upper bounds for the parameters. It is a good idea to include bounds for all parameters being fitted: non-negative, less than a million, or something. This is not hard, if the parameters have some physical meaning."
Noted, I will add them.

Melden Sie sich an, um zu kommentieren.

Antworten (1)

William Rose
William Rose am 26 Sep. 2023
Bearbeitet: William Rose am 27 Sep. 2023
{Edit:
Dimensional analysis needs changing because now you say x3 is [Pa] and before you said x3 is [m].
You said "the independent variable that was previously named x1 should have been x3, as its in Pascals and it is the third column in the independentVariables". Are you proposing that we exchange x1 and x3 in the model equation? Then my previous model analysis is now incorrect (and was a waste of time), and my model equations are incorrect. Arrgh!
When I compare youre new model 1 to the old model 1, I see that you did not just exchage x1 and x3 in the equation. What was x1 is now x3. What was x2 is now x1. What was x3 is now x2. @Juneight, am I going crazy? I'm definitely getting grumpy. Please clarify your comments regarding the exchanges of variables.}
I would not use interpolation to add points. This will not be a relaible way to improve the fit, since there seem to be non-linearities in the relationship, which we cannot account for in the interpolation. Also, this would be interpolation in 3 dimensions of x, which introduces more chance for error than interpolation in 1 or 2 dimensions.
The units for p(2) in model 1, and for p(2) and p(4) in model 2, are dimensionless. p(3) in model 2 has dimensions of meters. p(1) in models 1 and 2 is dimensionless. What is the physical interpretation of p(1)? It is nice when model parameters have physical meaning.
load('ydata');
load('independentVariables');
xdata=independentVariables;
% Make plots showing distribution of the independent variables
figure; subplot(131); plot(xdata(:,1),xdata(:,2),'r*'); xlabel('X1'); ylabel('X2'); grid on
subplot(132); plot(xdata(:,2),xdata(:,3),'g*'); xlabel('X2'); ylabel('X3'); grid on
subplot(133); plot(xdata(:,3),xdata(:,1),'b*'); xlabel('X3'); ylabel('X1'); grid on
Let's try fitting y=f(x1,x2,x3) with the model 1 equation, which is
[Insert the new equation which @Juneight substituted.]
In your first model (before model 1), you had 2 in place of p(2).
{The next several lines may not be corerct since @Juneight has redefined the variables x1, x2, x3.}
Therefore I assume that p(2) is probably a positive number, but maybe not 2. But since x3>>x2, and since x1 is positive, and c1 and c2 are positive, you will get a negative value under the square root, i.e. you'll get a complex number from this model. Therefore I suspect there is something wrong with your x3 or x2 data. Maybe x2 is in units of meters, and x3 is in nanometers. Or the model is wrong.
c1=1e3; % density (kg/m^3)
c2=0.33; % Poisson's ratio (non-dim)
%y = @(p, x) p(1)*sqrt(2*x(:,1)./(c1*(3+c2)*(1-((x(:,2)-x(:,3))./x(:,2)).^p(2))));
% What was x1 is now x3. What was x2 is now x1. What was x3 is now x2.
y = @(p, x) p(1)*sqrt(2*x(:,3)./(c1*(3+c2)*(1-((x(:,1)-x(:,2))./x(:,1)).^p(2))));
p0 = [1 2]; % initial guess for p()
lb = [1e-2 0.3]; % lower bounds for p()
ub = [1e6 4]; % upper bounds for p()
% Estimate p and get the sum squared error (SSE)
[pEst,SSE] = lsqcurvefit(y, p0, xdata, ydata, lb, ub);
Local minimum possible. lsqcurvefit stopped because the final change in the sum of squares relative to its initial value is less than the value of the function tolerance.
fprintf('p1=%.3f, p2=%.3f, SSE=%.2e.\n',pEst(1),pEst(2),SSE)
p1=47.782, p2=4.000, SSE=3.74e+09.
  6 Kommentare
dpb
dpb am 27 Sep. 2023
Bearbeitet: dpb am 27 Sep. 2023
load ydata
load independentVariables
y=ydata; clear ydata
x=independentVariables; clear independentVariables
x1=x(1:5:end,1); x2=x(1:5,2);
subplot(1,3,1), plot(x(:,1),x(:,2),'*'), xlabel('x1'), ylabel('x2')
subplot(1,3,2), plot(x(:,1),x(:,3),'*'), xlabel('x1'), ylabel('x3')
subplot(1,3,3), plot(x(:,2),x(:,3),'*'), xlabel('x2'), ylabel('x3')
figure
z=reshape(y,5,5);
surf(x1,x2,z), xlabel('x1'), ylabel('x2')
Have to leave unfinished -- x1, x2 are uniform grids, but x3 is not, can't use surf() but will have to use scattered interpolant because there isn't a uniform mesh.
numel(unique(x(:,3)))
ans = 25
In fact, every pressure point is unique...
William Rose
William Rose am 27 Sep. 2023
Verschoben: Voss am 27 Sep. 2023
“Think??” Makes me chuckle :) Interesting plot and interesting that all pressures are unique.

Melden Sie sich an, um zu kommentieren.

Produkte


Version

R2021b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by