I need to get the power spectrum of gene expression data that were sampled every 4 hours for a 48 hour period (13 samples, including zero point). I did an initial analysis based on a sound recording example but I think I'm missing something.
This is the code I'm using:
signal = load('data.txt')
N = length(signal);
fs = 1/(4*60*60); %Samples per second
fnyquist = fs/2; %Nyquist frequency
X_mags = abs(fft(signal));
bin_vals = [0 : N-1];
fax_Hz = bin_vals*fs/N;
N_2 = ceil(N/2);
plot(fax_Hz(1:N_2), X_mags(1:N_2))
I want to know if there are frequencies associated with the period of the signals. This data should have a period around 24 hours.
Mike

 Akzeptierte Antwort

Star Strider
Star Strider am 19 Okt. 2014
Bearbeitet: Star Strider am 19 Okt. 2014

0 Stimmen

I would do it slightly differently, and use your 4-hour sampling time directly:
GE = dlmread('Circadian analysis using FFT_data.txt');
T = [0:length(GE)-1]*4; % Sampling Times (Hours)
Ts = 4; % Sampling Interval (Hours)
Fs = 1/Ts; % Sampling Frequency (1/Hour)
Fn = Fs/2; % Nyquist Frequency (1/Hour)
FGE = fft(GE)/length(GE);
Fv = linspace(0,1,length(FGE)/2+1)*Fn;
Iv = 1:length(Fv);
figure(1)
% plot(Fv, abs(FGE(Iv))) % Frequency = 1/Hour
plot(Fv*24, abs(FGE(Iv))) % Frequency = 1/Day
grid
xlabel('Frequency (D^{-1})')
% xlabel('Frequency (H^{-1})')
ylabel('Amplitude')
The commented lines in the plot are in frequencies of 1/Hour. The uncommented lines are in frequencies of 1/Day, or (obviously) 1/(24 Hours).
EDIT to add plot:

6 Kommentare

Miguel
Miguel am 19 Okt. 2014
Thank you very much. I just have a doubt about the frequency at 0. It is normal to have a higher amplitud at the beginning? Do you know if there is a way to evaluate multliple signals at the same time? Or I need to do each plot by separate? I have data from 40 genes.
My pleasure!
The frequency at 0 (by definition a constant because with a frequency of 0 it does not oscillate) is actually the mean of all your data. It is the value about which all other data oscillate. You can see this if you plot it in time:
figure(2)
plot(T, GE)
grid
and calculate it as the mean:
GEm = mean(GE);
You can easily evaluate and plot the FFTs of all the genes at the same time. The fft function takes the FFT of column data by default, so you don’t have to do anything other than to give it all your gene data in a matrix with the data for each gene as a separate column. However they all have to be sampled at the same time increments (here every 4 hours), and have the same number of data points for you to use the same code (such as I’ve written here) on all of them simultaneously. So you could do the analysis and plotting on all your genes at once. Consider using the subplot function for your plots, since otherwise 40 of them are going to be difficult to distinguish if plotted together.
Are these the CLOCK genes or genes regulated by them?
Miguel
Miguel am 20 Okt. 2014
Just Per2 and Bmal1 are clock genes. The other genes were found to be regulated by the clock using microarray data. But all of them were validated by qPCR using the same number of samples, so I can perfectly use the subplot function. Thanks a lot. Is there a way to put the frequency value of the higher peak in the plot?
My pleasure!
Interesting. I’ve read about them in Science but that is the extent of my knowledge of them. (I trained as an Endocrinologist, so circadian rhythms interest me.) If you found the new clock-regulated genes, congratulations!
To display the frequency on the plot, put these two lines after the ylabel call in figure(1):
aFGE = abs(FGE(Iv));
text(Fv(3)*(24-.5), aFGE(3), sprintf('\\downarrow Frequency %6.3f Day^{-1}', Fv(3)*24), 'HorizontalAlignment','left', 'VerticalAlignment','bottom')
Experiment with it to get the result you want. The ‘Fv(3)*(24-.5)’ offset puts the arrow in the correct location. The ‘\\’ in the sprintf call allows sprintf to pass a single ‘\’ to the Tex interpreter so it will display the arrow. If you do not want the arrow, remove the ‘\\downarrow’ from the sprintf call.
If you were intending something else, let me know and I will see if it is possible to do.
Miguel
Miguel am 20 Okt. 2014
I was wondering if you knew about circadian rhythms because you asked me before. Good to know that you are interested in this topic too. I identify genes that were fluctuating over a 24 h period using R programming lenguaje but someone in my thesis committe asked me about the frequency of the signals, and this is another world. It was very confusing to me because the results were in hz, now it is more clear. Again, thank you very much for helping me. If I have another questions I will feel free to ask.
Star Strider
Star Strider am 20 Okt. 2014
My pleasure!
Signal processing techniques do not care what sampling times or time base you use so long as you are consistent. If you are acquiring your data in terms of hours rather than seconds, it makes sense to do the analysis in units of hours. (One day is 86400 seconds, so that would create some extremely inconvenient units if you defined your frequencies in Hz.)
You can probably do many calculations in MATLAB that you can do in R, although they may require Toolboxes that are not part of core MATLAB. I will help as much as I can with them if you want to expand your use of MATLAB.

Melden Sie sich an, um zu kommentieren.

Weitere Antworten (0)

Kategorien

Mehr zu Get Started with Signal Processing Toolbox finden Sie in Hilfe-Center und File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by