Fitting scattered data to multiple cosine functions

3 Ansichten (letzte 30 Tage)
Ahmad Hamad
Ahmad Hamad am 11 Jul. 2018
Kommentiert: Matt J am 12 Jul. 2018
Hello,
I have data that represent 16 cosine shaped curves, but the data is in the form of scattered points (x_i y_i) i= 1,2,3 .... N. please relate to the attached plot. The points are not associated to the functions, further more, I don't have the exact functions but only their models, more specifically, each function follows the form: f_k = A_k cos(phi_0k + omega*x), omega is fixed for all the functions, only the amplitude A_k and the phase shift phi_k are specific to each function. Is there an easy way to associate each data point to one of the 16 curves?
Thanks in advance.
  6 Kommentare
Ahmad Hamad
Ahmad Hamad am 11 Jul. 2018
Bearbeitet: Ahmad Hamad am 11 Jul. 2018
@Anton Semechko, the nonlinear least squares has to be carried out on what exactly?
dpb
dpb am 11 Jul. 2018
Well, that's the issue; if you just throw all the data at a NLLSQ algorithm trying to estimate 32 parameters (mag and phase for sixteen functions), it'll fail miserably unless you have some way a priori to associate which point(s) belong to which term; otherwise it just looks basically like random noise.

Melden Sie sich an, um zu kommentieren.

Akzeptierte Antwort

Anton Semechko
Anton Semechko am 11 Jul. 2018
Bearbeitet: Anton Semechko am 11 Jul. 2018
Below is an example where I use brute-force search to find an optimal set of sinusoid parameters that best fit an unorganized dataset; like the one you have. Fitted model parameters can be subsequently used to classify the individual data points. I omitted that latter part, but it wont be too difficult to implement.
SAMPLE SINUSOIDS
RESULTS OF THE SEARCH
function brute_force_sinusiod_fit_demo
% Generate a sample data set composed on N sinusoids with the same angular
% frequency but varying amplitudes and phases
% -------------------------------------------------------------------------
N=16; % # of sinusoids
w=3; % angular frequency
A_o=(1+rand(N,1))/2; % amplitudes in the range [0.5 1]
phi_o=rand(N,1)*pi; % phases in the range [0 pi]
f=@(x,A,phi) A*cos(w*x+phi);
[F,X]=deal(cell(N,1));
for n=1:N
X{n}=sort(pi*rand(1E3,1)); % unevenly spaced samples in time
F{n}=f(X{n},A_o(n),phi_o(n)); % measured signal
end
F=cell2mat(F);
X=cell2mat(X);
% Scramble the order of samples so it becomes difficult to tell which point
% came from what signal
id_prm=randperm(numel(F));
F=F(id_prm);
X=X(id_prm);
figure('color','w')
plot(X,F,'.','MarkerSize',5)
drawnow
% Attempt to recover parameters of the N curves from simulated data
% -------------------------------------------------------------------------
A_rng=[0.4 1.2]; % expected range of amplitudes
phi_rng=[0 pi]; % expected range of phase shifts
% Search grid
phi=linspace(phi_rng(1),phi_rng(2),200);
dphi=phi(2)-phi(1);
A=linspace(A_rng(1),A_rng(2),200);
dA=A(2)-A(1);
[phi,A]=meshgrid(phi,A);
% Search A-phi parameter space
Ng=numel(A);
MAE=median(abs(F))*ones(size(A)); % expected error
for i=1:Ng
fprintf('%u/%u\n',i,Ng)
Fi=f(X,A(i),phi(i)); % output of the model
dF=abs(F-Fi); % absolute residuals
dF(dF>5*dA)=[]; % remove data points that deviate from Fi by more than 5*dA
if isempty(dF), continue; end
dF(dF>3*median(dF))=[];
if isempty(dF), continue; end
MAE(i)=median(dF); % quality of the fit
end
% Extract N best fits
[A_fit,phi_fit]=deal(zeros(N,1));
MAE2=MAE;
for n=1:N
% Absolute minimum
[~,id_min]=min(MAE2(:));
A_fit(n)=A(id_min);
phi_fit(n)=phi(id_min);
% Set neighbourhood around absolute minimum to Inf
D=((A-A_fit(n))/dA).^2 + ((phi-phi_fit(n))/dphi).^2;
MAE2(D<=(2^2))=Inf;
end
% Visualize
% -------------------------------------------------------------------------
figure('color','w')
subplot(2,1,1)
imagesc(phi_rng,A_rng,MAE)
hold on
plot(phi_fit,A_fit,'or','MarkerSize',10,'MarkerFaceColor','none')
set(gca,'YDir','normal','XLim',phi_rng+dphi*[-1 1],'YLim',A_rng+dA*[-1 1],'FontSize',15)
axis equal
xlabel('phase','FontSize',20)
ylabel('amplitude','FontSize',20)
title('Best Fit Model Parameters','FontSize',25)
subplot(2,1,2)
imagesc(phi_rng,A_rng,MAE)
hold on
plot(phi_o,A_o,'or','MarkerSize',10,'MarkerFaceColor','none')
set(gca,'YDir','normal','XLim',phi_rng+dphi*[-1 1],'YLim',A_rng+dA*[-1 1],'FontSize',15)
axis equal
xlabel('phase','FontSize',20)
ylabel('amplitude','FontSize',20)
title('Actual Model Parameters','FontSize',25)
  2 Kommentare
Ahmad Hamad
Ahmad Hamad am 12 Jul. 2018
Bearbeitet: Matt J am 12 Jul. 2018
@Anton Semechko,
Thank you for the code, Hough transform seems to be the way to go but the parameter space is huge, keep in mind that in your example you assumed w is given but that is not the case. It's common between all the signals, that is true, but it is also unknown.
Anton Semechko
Anton Semechko am 12 Jul. 2018
Yeah, that problem is significantly more challenging, but can be tackled using a similar approach.
There are various heuristics you can use along the way to speed-up the search. Assuming sinusoids have different amplitudes, you can fit them in decreasing order of amplitude, and remove data points best explained by the fitted model along the way. Maxim amplitude can be estimated directly from the data without any optimization. Frequency and phase of the signal with maximum amplitude can be optimized using the same general strategy I used in the demo above. Frequency found for this first signal can either be fixed when fitting remaining sinusoids, or you can implement a more robust strategy that allows you to test this solution on some sub-set of data.

Melden Sie sich an, um zu kommentieren.

Weitere Antworten (1)

Matt J
Matt J am 11 Jul. 2018
You need to implement a sort of sinusoidal Hough transform. Bin the 2D space of coordinates (A,phi) into cells and loop over the cells. For each combination (A,phi), generate the appropriate curve y=f(x) and see how many points lie within a tolerance region of that curve. This will give you a tableau Counts(A,phi). The top 16 peaks in that tableau will give you your 16 sinusoids.
  3 Kommentare
Ahmad Hamad
Ahmad Hamad am 12 Jul. 2018
I thought about Hough transform but with three parameters ( amplitude, phase and frequency which is common but unknown) the parameter space will be huge. Or is it manageable?
Matt J
Matt J am 12 Jul. 2018
You could start with coarse sampling, e.g., 10x10x10 and gradually refine.

Melden Sie sich an, um zu kommentieren.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by