Estimating the parameters of a non-linear equation

The equation is P = (1-(1-q)*x*m)^(1/(1-q)), where P is a probability and x is the inter-event time (s); q and m are the parameters to be determined.
I have attached the Excel file: column A holds the x values and column B the corresponding values of P. I need to determine the best possible q and m on a log-log plot. (The Excel file includes a plot for reference: the blue line is the data (P_actual vs x) and the orange line is the fitted curve obtained after estimating the parameters (P_calculated vs x).)
There are 21115 values. First take rows 1 to 2000 (Group 1) and calculate q and m for this group, then take rows 2 to 2001 (Group 2) and calculate q and m for that group, and so on, i.e. estimate the parameters for every group. The group size can also be increased; 2000 is just an example.
I also don't have a good idea of the initial guess for the parameters: whatever initial guess I give, the fit never converges. For your convenience, the expected range is 1 < q < 3 and 10 < m < 1000. One thing is certain: q should always be greater than 1.
Please help!!

Answers (2)

Ayush on 20 Aug. 2024
Hey Kashif,
I understand that you are trying to determine the best-fit parameters q and m of your probability equation using MATLAB. This involves fitting the model to successive subsets of your data and making sure the optimization converges for each one.
Here's a structured approach to achieve this:
To fit q and m for each group of data, we can use the "fmincon" function, which is suited to constrained optimization problems. We iterate over the data in windows of a specified group size, run the optimization for each window, and store the results.
% Load the data from Excel (column 1 = x, column 2 = P)
data = readtable('Parameter_estimation.xlsx');
x_data = data{:, 1};
P_data = data{:, 2};

% Define the model function; params = [q, m]
model_function = @(params, x) (1 - (1 - params(1)) * x * params(2)) .^ (1 / (1 - params(1)));

% Define the objective function for optimization (sum of squared residuals)
objective_function = @(params, x, P) sum((model_function(params, x) - P).^2);

% Define initial guess and bounds
initial_guess = [1.5, 100]; % Example initial guess [q, m]
lb = [1 + 1e-6, 10];        % Lower bounds for [q, m]; q = 1 exactly would make 1/(1-q) undefined
ub = [3, 1000];             % Upper bounds for [q, m]

% Options for fmincon
options = optimoptions('fmincon', 'Display', 'off', 'MaxFunctionEvaluations', 10000);

% Define the group size
group_size = 2000; % You can adjust this size

% Initialize results storage; each row is [start, q, m]
results = [];

% Iterate over the data, moving the window start by 500 points each time
for start = 1:500:(length(x_data) - group_size + 1)
    x_group = x_data(start:start + group_size - 1);
    P_group = P_data(start:start + group_size - 1);

    % Define a local objective function for the current group
    local_objective = @(params) objective_function(params, x_group, P_group);

    % Perform the optimization
    [params, fval, exitflag] = fmincon(local_objective, initial_guess, [], [], [], [], lb, ub, [], options);

    % Check if the optimization converged
    if exitflag > 0
        results = [results; start, params];
        fprintf('Group starting at %d: q = %.4f, m = %.4f, Error = %.4f\n', start, params(1), params(2), fval);
    else
        fprintf('Fit did not converge for group starting at %d\n', start);
    end
end

% Display all results at the end
fprintf('\nSummary of results:\n');
for i = 1:size(results, 1)
    fprintf('Group starting at %d: q = %.4f, m = %.4f\n', results(i, 1), results(i, 2), results(i, 3));
end
Note: To keep the run time manageable on a large dataset, the loop advances the window start by 500 points per iteration instead of by 1. Each window is still fitted on its full 2000 points, and consecutive windows overlap by 1500 points, so the script runs faster while still giving a representative fit for each data segment. Reduce the step in the "for" statement if you need finer resolution.
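To get a feel for the trade-off described in the note, the following sketch counts how many fits the loop performs for a step of 500 versus a step of 1. The variable names are illustrative; the row count of 21115 is the one quoted in the question.

```matlab
% Count the windows visited for two step sizes (illustrative names).
n_points    = 21115;  % number of rows quoted in the question
group_size  = 2000;   % window length used in the script above
step_coarse = 500;    % step used in the loop above
step_fine   = 1;      % step the question originally asked for

last_start    = n_points - group_size + 1;   % last valid window start
starts_coarse = 1:step_coarse:last_start;
starts_fine   = 1:step_fine:last_start;

fprintf('step %d -> %d fits\n', step_coarse, numel(starts_coarse));
fprintf('step %d -> %d fits\n', step_fine, numel(starts_fine));
```

With these numbers the coarse loop runs 39 fits, against 19116 for a step of 1.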
For more information on the "fmincon" function, refer to the MathWorks documentation.
Hope this helps !!
Regards

14 Comments

Thank you, Ayush. But I can see that the value of m is 1000 for every group. This should not be the case. Please look into this.
Torsten on 20 Aug. 2024
Edited: Torsten on 20 Aug. 2024
Change
ub = [3, 1000]; % Upper bounds for [q, m]
to
ub = [30, 100000]; % Upper bounds for [q, m]
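A quick way to confirm the symptom reported in the previous comment (m equal to 1000 for every group) is to test whether a fitted parameter sits on one of its bounds, which usually means the bound rather than the data decided the result. This is a minimal sketch: `params_fit` is a made-up example value and `tol` an arbitrary tolerance, neither taken from the thread's output.

```matlab
% Flag fitted parameters that ended up on a bound (illustrative values).
lb = [1, 10];              % lower bounds for [q, m] from the first answer
ub = [3, 1000];            % upper bounds for [q, m] from the first answer
params_fit = [1.8, 1000];  % hypothetical fit result with m stuck at its upper bound
tol = 1e-6;                % assumed relative tolerance for "on the bound"

on_lower = abs(params_fit - lb) <= tol * max(1, abs(lb));
on_upper = abs(params_fit - ub) <= tol * max(1, abs(ub));
at_bound = on_lower | on_upper;

names = {'q', 'm'};
for k = find(at_bound)
    fprintf('%s = %g lies on a bound; consider widening it\n', names{k}, params_fit(k));
end
```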
But the parameters of a probability distribution shouldn't be fitted with a curve-fitting tool; use a distribution-fitting tool instead.
The code gives absurd values for these initial guesses.
What does the equation
P = (1-(1-q)*x*m)^(1/(1-q))
represent? Is it a probability density function?
Kashif Naukhez on 20 Aug. 2024
Edited: Kashif Naukhez on 20 Aug. 2024
P is a complementary cumulative distribution function, i.e. P(X > x), which is calculated from the values of x.
So is it correct that your data "x" should follow this density function?
If so, use MATLAB's "mle" to fit your parameters, with "density" as the underlying pdf.
syms x m q
P = 1-(1-(1-q)*x*m)^(1/(1-q))
density = simplify(diff(P,x))
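For reference, the derivative requested from the symbolic code can also be worked out by hand with the chain rule. This derivation assumes, as stated in the thread, that 1 minus the given expression is the cumulative distribution function of x, with q > 1 and m > 0; "simplify" may print an equivalent but differently arranged form.

```latex
\begin{aligned}
F(x) &= 1 - \bigl(1 - (1-q)\,m\,x\bigr)^{\frac{1}{1-q}}, \\
f(x) &= \frac{dF}{dx}
      = -\frac{1}{1-q}\,\bigl(1 - (1-q)\,m\,x\bigr)^{\frac{1}{1-q}-1}\cdot\bigl(-(1-q)\,m\bigr) \\
     &= m\,\bigl(1 - (1-q)\,m\,x\bigr)^{\frac{q}{1-q}}, \qquad x \ge 0 .
\end{aligned}
```

The exponent follows from 1/(1-q) - 1 = q/(1-q).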
Kashif Naukhez on 20 Aug. 2024
Edited: Kashif Naukhez on 20 Aug. 2024
Yes, 'x' will follow this density function: P = 1-(1-(1-q)*x*m)^(1/(1-q)). I can obtain the parameters in Excel, but that is not possible for such a large dataset because it is very time-consuming. I don't know why MATLAB can't do it.
Torsten on 20 Aug. 2024
Edited: Torsten on 20 Aug. 2024
"Yes 'x' will follow this density function P = 1-(1-(1-q)*x*m)^(1/(1-q))..."
According to the information you have given, this is the cumulative distribution function, not the probability density function.
"I dont know why MATLAB cant do it.."
It can if you use the correct tool, namely "mle".
Can you please write the code using MLE?
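As a rough illustration of what a maximum-likelihood fit of this density involves, the sketch below minimizes the negative log-likelihood by hand with base-MATLAB "fminsearch". It is not the code from the thread: the data are synthetic quantiles generated from invented parameters (`q_true`, `m_true`), and the search runs over log(q-1) and log(m) so that q > 1 and m > 0 always hold. With the Statistics and Machine Learning Toolbox, "mle" with a custom 'pdf' handle and 'Start' values would be the packaged route.

```matlab
% Maximum-likelihood sketch for f(x) = m*(1-(1-q)*m*x)^(q/(1-q)), q > 1, m > 0.
% Synthetic, deterministic "samples": quantiles of the distribution for known
% parameters (q_true and m_true are illustrative values, not from the thread).
q_true = 1.5;
m_true = 100;
n = 5000;
u = ((1:n)' - 0.5) / n;                                  % survival probabilities in (0,1)
x = (u.^(-(q_true - 1)) - 1) / ((q_true - 1) * m_true);  % solves P(X > x) = u for x

% Negative log-likelihood in theta = [log(q-1), log(m)]:
% log f = log(m) - (q/(q-1)) * log(1 + (q-1)*m*x)
negloglik = @(theta) -sum(theta(2) ...
    - ((1 + exp(theta(1))) / exp(theta(1))) * log1p(exp(theta(1)) * exp(theta(2)) * x));

theta0 = [log(1.2 - 1), log(450)];   % assumed starting point, q = 1.2 and m = 450
opts = optimset('MaxFunEvals', 5000, 'MaxIter', 5000, 'TolX', 1e-6, 'TolFun', 1e-6);
theta_hat = fminsearch(negloglik, theta0, opts);

q_hat = 1 + exp(theta_hat(1));
m_hat = exp(theta_hat(2));
fprintf('q = %.4f, m = %.4f\n', q_hat, m_hat);
```

To use it on real data, replace `x` with the measured inter-event times (a column vector of positive values).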
Does your distribution have a name?
No, it has no specific name.
Can you include your fit in Excel and the underlying data set?
And why do you want to split your data set into smaller pieces and fit each subset separately?
Check sheets 1 and 2. Column A is x, column B is P_actual, and column C is P_calculated, which is computed using the equation mentioned above. I just change the values of q and m until the orange plot fits the blue plot. It is very time-consuming because it is based on trial and error.
I need to find the variation of q with time, so I need to track it at every moment by sliding the window by 1 (1 to 500, then 2 to 501, and so on until the end), thus forming different groups and estimating the parameters for each group. I have not been able to solve this problem for more than a year. If you could do it, it would be a jackpot for me!!
Thanks in advance.
If we can do it in Excel, then why not in MATLAB? I would recommend forgetting about the groups for now. Just take the entire sheet 1 data as Group 1 and try to estimate q and m just like I did in Excel. The data can be divided into further groups later.


Torsten on 21 Aug. 2024
Edited: Torsten on 21 Aug. 2024
If only reproducing the x-P curve matters, you can use the following code.
If the density of the x-values in certain regions should be used in the fitting process (e.g. to give weights to the measurements that are higher in regions with greater density of x-values), report back.
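% Read the data: column 1 is x, column 2 is P
data = xlsread('Parameter_estimation.xlsx');
x = data(:,1);
P = data(:,2);
% Sort by x and drop duplicate x-values so that interp1 gets a strictly increasing grid
[xsort,isort] = sort(x);
Psort = P(isort);
[xsort,ia] = unique(xsort);
Psort = Psort(ia);
% Resample the curve on 5000 equally spaced points
xinter = linspace(xsort(1),xsort(end),5000);
Pinter = interp1(xsort,Psort,xinter);
% Model and residual vector for lsqnonlin; p = [q, m]
f = @(p)(1 - (1 - p(1)) * xinter * p(2)) .^ (1 / (1 - p(1)));
F = @(p) f(p) - Pinter;
p = lsqnonlin(F,[1.2,450])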
Local minimum possible. lsqnonlin stopped because the final change in the sum of squares relative to its initial value is less than the value of the function tolerance.
p = 1x2
1.9855 933.1432
hold on
plot(xsort,Psort,'r')
plot(xinter,f(p),'b')
hold off
grid on
xlim([0 0.2])
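Because the question asks for a good fit on a log-log plot, one possible variation on the code above is to resample on logarithmically spaced points and minimize the residuals of log(P) instead of P, so that the small tail probabilities carry weight. The sketch below is an assumption-laden alternative, not the method used in this answer: it runs on synthetic noise-free data (`q_true` and `m_true` are invented), uses base-MATLAB "fminsearch" instead of "lsqnonlin", and searches over log(q-1) and log(m) to keep q > 1 and m > 0.

```matlab
% Least-squares fit of log(P) on log-spaced x (synthetic data, illustrative names).
model = @(q, m, x) (1 - (1 - q) .* x .* m) .^ (1 ./ (1 - q));

q_true = 1.5;  m_true = 100;            % invented "true" parameters
x_obs = logspace(-4, 0, 400);           % stand-in for the sorted, unique x-values
P_obs = model(q_true, m_true, x_obs);   % stand-in for the measured P(X > x)

% Resample on a logarithmic grid so every decade of x is equally represented.
x_log = logspace(log10(x_obs(1)), log10(x_obs(end)), 200);
x_log(1) = x_obs(1);                    % pin the end points inside the data range
x_log(end) = x_obs(end);
P_log = interp1(x_obs, P_obs, x_log);

% theta = [log(q-1), log(m)] keeps q > 1 and m > 0 during the search.
resid = @(theta) log(model(1 + exp(theta(1)), exp(theta(2)), x_log)) - log(P_log);
sse   = @(theta) sum(resid(theta).^2);

theta0 = [log(1.2 - 1), log(450)];      % same starting point as the code above
opts = optimset('MaxFunEvals', 5000, 'MaxIter', 5000, 'TolX', 1e-8, 'TolFun', 1e-12);
theta_hat = fminsearch(sse, theta0, opts);

q_hat = 1 + exp(theta_hat(1));
m_hat = exp(theta_hat(2));
fprintf('q = %.4f, m = %.2f\n', q_hat, m_hat);
```

For real data, `x_obs` and `P_obs` would be replaced by `xsort` and `Psort` from the code above, provided all x and P values are strictly positive so that the logarithms are defined.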

4 Comments

Kashif Naukhez on 10 Sep. 2024
Edited: Kashif Naukhez on 10 Sep. 2024
The plot should be on a log-log scale. Also, the fit should be a smooth curve, which is not the case here. This code is not working at all.
"The plot should be in log-log scale"
"Also, the fit should be a smooth curve, which is not the case here"
Why is the fit not a smooth curve? The blue fitted curve is
f(x) = (1 - (1 - p(1)) * x* p(2)) .^ (1 / (1 - p(1)));
and this is a smooth function of x.
It does not come out as a smooth curve for other data. Plus, it should be on a log-log scale.
Torsten on 10 Sep. 2024
Edited: Torsten on 10 Sep. 2024
If the x-values and xinter-values are sorted in ascending or descending order, as in the code above, the curve will always come out as "smooth".
And for a log-log plot, look up "loglog" as noted above.
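As a minimal sketch of the "loglog" suggestion, the fitted curve can be evaluated on logarithmically spaced, sorted x-values and drawn on logarithmic axes. The x range below is an assumption (log axes need x > 0); the parameters are the ones printed by "lsqnonlin" in the answer.

```matlab
% Log-log plot of the fitted curve (assumed x range, illustrative names).
f = @(p, x) (1 - (1 - p(1)) .* x .* p(2)) .^ (1 ./ (1 - p(1)));
p_fit  = [1.9855, 933.1432];     % parameters reported in the answer above
x_plot = logspace(-4, 0, 200);   % sorted, strictly positive x-values
P_plot = f(p_fit, x_plot);

loglog(x_plot, P_plot, 'b-');    % fitted curve on logarithmic axes
grid on
xlabel('x (inter-event time, s)');
ylabel('P(X > x)');
% To overlay the measured data from the answer's code:
% hold on; loglog(xsort, Psort, 'r.'); hold off
```

Because `x_plot` is sorted in ascending order, the line is drawn as one continuous curve.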



Asked: 20 Aug. 2024

Edited: 10 Sep. 2024
