Input for ARfit ? (EEG analysis)
Ältere Kommentare anzeigen
Hello,
I am trying to use the ARfit toolbox on EEG data, more specifically the arfit function itself, but I can't figure out what format the input should be in.
It says on the toolbox page that "The input matrix v must contain the time series data, with columns of v representing variables and rows of v representing observations" but I don't understand what it means. What I have is a vector where the rows are the EEG channels, and the columns are the signal values for each data point. However, it clearly doesn't work (either returns an error "time series too short", or it runs but the output is wrong (siglev = 0 with the function arres)).
I'm hoping someone there know how to use the toolbox better than I do... Thanks in advance.
Antworten (2)
William Rose
am 5 Nov. 2024
Bearbeitet: William Rose
am 5 Nov. 2024
[Edit: fix spelling.]
Have you run the script ardem.m, which is part of the ARfit toolbox? Does it produce reasonable output?
Assuming it does, I would modify ardem, so that it uses two columns of your data, instead of using the simulated data that it does use. To do this, change line
v = arsim(w, A, C, 200); % generate simulated data v (200x2 array)
to something like
myData=load('myData.txt'); % read data from file
v=myData(1:200,1:2); % v=rows 1-200 & columns 1 and 2 of data
The details of the new lines above will vary depending on which columns you want to use, etc. But you get the idea. See what happens.
You can also start a discussion on the FileExchange site for that toolbox.
Attach a sample data file here, if you want more assistance.
Walter Roberson
am 5 Nov. 2024
1 Stimme
What I have is a vector where the rows are the EEG channels, and the columns are the signal values for each data point.
Your input is in the wrong form for the toolbox.
Each EEG channel should be considered independent. Going to the next row of the data does not correspond to going to the next time-point of the data.
Potentially if you were to flip the array it just might be the right order... provided that what is currently the colums that are signal values for each data point, happen to reflect different times.
You need
channel1_time1 channel2_time1 channel3_time1
channel1_time2 channel2_time2 channel3_time2
channel1_time3 channel2_time3 channel3_time3
channel1_time4 channel2_time4 channel3_time4
and so on.
7 Kommentare
Annaëlle
am 6 Nov. 2024
William Rose
am 6 Nov. 2024
What @Walter Roberson means is that each row of your data array is a different time, and each column is a different channel, or vice versa. If it does not work before you transpose the data array, and it doesn't work after you transpose it, then that's weird.
Please upload a file of the data you want to analyze.
The data array passed to arfit() should have many rows (i.e. measurements at many times) and several columns (i.e. several channels). Here is an example of using arfit().
First, simulate a multiple autoregressive (multi AR) process with 2 channels and 500 time ponts. This could be done with arsim() in the toolbox, but I do it here without the toolbox routine:
>> N=500; % desired simulation duration (points)
>> v=zeros(N+100,2); % allocate v with 100 extra values for simulation startup
>> v(1:2,:)=randn(2,2); % initial values
>> A1 = [0.4 1.2; 0.3 0.7]; % multi AR coeffs for lag=1
>> A2 = [0.35 -0.3; -0.4 -0.5]; % multi AR coeffs for lag=2
>> C=[1 0.5; 0.5 1.5]; % covariance matrix for random noise
>> for i=3:N+100, v(i,:)=(A1*v(i-1,:)'+A2*v(i-2,:)')'+randn(1,2)*chol(C); end
>> v=v(101:end,:); % discard startup (first 100 points)
>> disp(size(v)) % display data array size
500 2
>>
Analyze the data with arfit():
>> [west, Aest, Cest, sbc]=arfit(v,1,5)
west =
-0.0875
-0.0574
Aest =
0.4224 1.1762 0.3259 -0.2827
0.2989 0.6430 -0.4120 -0.4466
Cest =
0.9287 0.4828
0.4828 1.5717
sbc =
0.5327 0.1546 0.1761 0.1987 0.2200
>>
See if you get similar results. (Results won't match exactly, due to randomness.) Then replace the simulated v above with your data array, with successive time points on successive rows, and a column for each channel.
Walter Roberson
am 6 Nov. 2024
My personal speculation is that "the columns are the signal values for each data point" does not reflect signal values over time.
William Rose
am 7 Nov. 2024
Bearbeitet: William Rose
am 7 Nov. 2024
Let's look at the actual data:
load v
n=length(v);
plot(1:n,v(:,1),'-r',1:n,v(:,2),'-b')
grid on; xlabel('Time (samples)'); ylabel('Amplitude');
legend('Ch.1','Ch.2')
Run arfit() on your data. Specify minimum and maximum lags to be 1 and 5, for now. Screenshot of results is below.
>> [w, A, Cest, sbc]=arfit(v,1,5)
w =
1.0e-10 *
-0.1137
-0.1830
A =
Columns 1 through 8
4.6171 0.0131 -8.8420 -0.0523 8.7743 0.0802 -4.5114 -0.0561
-0.0263 4.6635 0.0803 -8.9806 -0.0903 8.9246 0.0418 -4.5761
Columns 9 through 10
0.9621 0.0150
-0.0054 0.9685
Cest =
1.0e-19 *
0.6669 0.3503
0.3503 0.3031
sbc =
-35.3360 -37.2472 -40.0320 -42.3298 -44.9596
The results look reasonable. A is 2-by-10. arfit chooses the optimum model order (optOrder), within the range minlags to maxlags. The optimum order is the order where sbc is lowest. sbc is smallest at 5 lags. That is why A is 2x10. The size of A is numChan rows by (numChan x optOrder) columns. I cannot explain why you got such a large A matrix.
By the way, if you examine the A matrix, you see that each channel is predicted mainly by itself, and not by the other channel. I say that because A consists of 5 2x2 matrices: one 2x2 matrix for each lag. And the individual 2x2 matrices are almost diagonal. From the results above, we have
A=
4.6171 0.0131 -8.8420 -0.0523 8.7743 0.0802 -4.5114 -0.0561 0.9621 0.0150
-0.0263 4.6635 0.0803 -8.9806 -0.0903 8.9246 0.0418 -4.5761 -0.0054 0.9685
Therefore
A1=[4.62 0.01; -0.03 4.66] (lag 1)
A2=[-8.84 -0.05; 0.08 -8.98] (lag 2)
A3=[8.77 0.08; -0.09 8.92] (lag 3)
etc
Note how A1, A2, A3 are almost diagonal. The arfit package includes routines to compute uncertainty esitmates for all the estimated parametrers. See the package documentation, and ardem() for details. It would not surprise me if the off-diagonal values in the matrices above are not significantly different from 0. If that is the case, then a multiple autoregressive model is not justified for this data, and you should just use individual autoregressive models for each channel independently.
Since sbc was lowest at the maximum order investigated (i.e. at order 5), you could try even higher ranges for the order, to see if you can find a lower sbc (which means a significantly better fit) at a higher order.
Annaëlle
am 12 Nov. 2024
William Rose
am 12 Nov. 2024
@Annaëlle, you're welcome.
Good luck with your research.
Kategorien
Mehr zu EEG/MEG/ECoG finden Sie in Hilfe-Center und File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!
