Extracting Audio File Frequency

Question

rmscalculation.m

Hello there,

I need to find the frequency of an audio file for specific segments. My code finds the segments that contain talking, takes the FFT of each segment, and finds the frequencies. The problem is in the frequency step: I expect different frequencies for different segments, but I get exactly the same values. Could you please help?

Thanks in advance.

Audio file : https://drive.google.com/drive/folders/1EQABtLT-Is-oEk5w_6U4b1-jB8PJxaMp?usp=sharing

Answer 1

William Rose on 15 Apr 2022

@mehtap agirsoy,

I have listened to file A1.wav. The instances of singing are not at 15-second intervals, even though the code expects this. Therefore the segments analyzed do not always contain singing. The amplitude of the singing is small, and there are significant unrelated background noises. The pitch being sung sounds like the E flat above middle C (E flat 4). Therefore the detected dominant frequency should be around 311.1 Hz.

Approximate times of vocalization, in seconds: 1-5, 22-27, 42-46, 61-66, 82-87, 102-107.

There is background talking during 61-66. There is coughing or some other background sound in 82-87.

Conclusion: The frequency analysis of file A1.wav by rmscalculation.m is affected by background noises and incorrect timing. The signal-to-noise ratio is poor.

Recommendations:

Improve the recording set up to increase signal amplitude and reduce background noise.
Edit the audio file to extract the exact segments that contain the singing which you want to analyze.

I have looked at your code: rmscalculation.m.

Analysis of the script:

rmscalculation.m has three nested loops.

The outer loop is: for k=1:number of participants.

The middle loop is: for l=1:number of tests. This loop reads in a different audio file on each pass. It computes the envelope of the signal as the moving average (with a width of 1000 points, about 1/44 of a second) of the absolute value of the signal. The time at which the moving average crosses a threshold is deemed to be the time when talking starts.

The inner loop is: for i=1:6. Each pass extracts a segment of the signal. The segment start times are 15 seconds apart. The segments are 4.9 seconds long. The power spectrum of the segment is determined. The frequency that has max. power, within the frequency range 236 to 367 Hz, is determined for each segment.

Does that sound correct?
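The onset-detection step described above can be sketched as follows. This is a minimal illustration, not the code from rmscalculation.m: the sampling rate, the synthetic test signal, and the threshold value are all assumptions chosen for the demo. Because movmean uses a centered window, the detected onset can differ from the true onset by up to about half a window.

```matlab
% Sketch: envelope = moving average of |x|, onset = first threshold crossing.
% Fs, the test signal, and thresh are assumptions chosen for illustration.
Fs = 44100;                          % sampling rate (Hz), assumed
t  = (0:2*Fs-1)/Fs;                  % 2 seconds of time values
x  = 0.01*sin(2*pi*50*t);            % weak background hum
on = t >= 1;                         % "talking" starts at t = 1 s
x(on) = x(on) + 0.5*sin(2*pi*262*t(on));

env    = movmean(abs(x), 1000);      % 1000-point moving average of |x|
thresh = 0.1;                        % hypothetical threshold
iStart = find(env > thresh, 1);      % first sample where envelope exceeds it
tStart = t(iStart);                  % estimated onset time (s)
fprintf('Estimated onset: %.3f s\n', tStart);
```

With a real recording, the threshold would have to be chosen relative to the background level of that file.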

The script rmscalculation.m does not run. It gives the error

Error using xlsread (line 136)
Unable to open file 'F4_A1'.
File 'F4_A1' not found.
Error in rmscalculation (line 11)
a = xlsread(fname1); % comand to read excel/ particle count file

I commented out the lines related to file F4_A1. Then the script ran without error. It does not display any results.

To see the results:

>> disp(seg_Freq')
9312
8933
9312
9312
0339
0995

The frequency range of 90% to 140% of the middle C frequency will allow detection of frequencies corresponding to pitches from just below B3 to just above F4.
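The effect of restricting the peak search to this band can be sketched as follows. The segment here is synthetic (a 311 Hz tone plus a stronger 622 Hz tone), and all variable names are illustrative rather than taken from rmscalculation.m. The band-limited search returns the strongest peak inside the band even when a stronger peak lies outside it.

```matlab
% Sketch: peak frequency inside a search band vs. over the whole spectrum.
% The sampling rate and the synthetic segment are assumptions.
Fs   = 44100;                        % sampling rate (Hz), assumed
Tseg = 4.9;                          % segment length (s)
N    = round(Fs*Tseg);               % samples per segment
t    = (0:N-1)/Fs;
seg  = sin(2*pi*311*t) + 2*sin(2*pi*622*t);   % synthetic test segment

P = abs(fft(seg)).^2;                % power spectrum (two-sided)
f = (0:N-1)*Fs/N;                    % frequency of each FFT bin (Hz)

band = f >= 236 & f <= 367;          % about 0.9 to 1.4 times middle C
fb   = f(band);
[~, k] = max(P(band));
fPeakBand = fb(k);                   % strongest frequency inside the band

[~, kAll] = max(P(1:floor(N/2)+1));  % search up to the Nyquist frequency
fPeakAll  = f(kAll);                 % strongest frequency overall

fprintf('In band: %.1f Hz, overall: %.1f Hz\n', fPeakBand, fPeakAll);
```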

Answer 2

William Rose on 12 Apr 2022

@mehtap agirsoy,

[moved my answer from a comment to an answer]

The Google Drive link you provided requires access permission. You may attach the audio file if you zip it first.

You may know this already, but I will mention it just in case:

When you compute the FFT or power spectrum of a segment of the signal, the frequencies of the FFT or power spectrum will be the same for each different segment (assuming the segment lengths are the same). The amplitude or power at each frequency will vary from segment to segment. You can compute the mean frequency for a segment, or you can compute the frequency with maximum power in each segment, etc. The script below does both, for an 8-second signal with gradually increasing frequency, divided into 0.5 second long segments. It plots the results. It appears that the max power frequency is better behaved than the mean frequency, in this example.

%% constants
Fs=8000;        %sampling rate (Hz)
T=8;            %signal duration (s)
wi=220*2*pi;    %initial frequency (radians/s)
wf=880*2*pi;    %final frequency (radians/s)
Tseg=0.5;       %segment duration (s)

%% compute the signal
dt=1/Fs;        %sampling interval
N=Fs*T;         %signal duration (samples)
t=dt*(0:N-1);   %vector of time values
phase=wi*t+(wf-wi)*t.*t/(2*T);  %phase for signal with changing frequency
x=cos(phase);   %signal amplitude

%% compute FFT of each segment
N1=Fs*Tseg;     %segment duration (samples)
Nseg=T/Tseg;    %number of segments
fmax=zeros(1,Nseg);     %allocate array for max.power frequency of each segment
fmean=zeros(1,Nseg);    %allocate array for mean frequency of each segment
df=1/Tseg;      %frequency interval
f=(0:N1/2)*df;  %vector of frequencies, up to Nyquist frequency
Nf=length(f);   %number of frequencies in one-sided FFT
Y=zeros(Nf,Nseg);       %allocate array for FFTs
for i=1:Nseg
    X=fft(x((i-1)*N1+1:i*N1));
    Y(:,i)=abs(X(1:Nf));        %magnitude of one-sided FFT
    [~,indmax]=max(Y(:,i));     %index of largest element of Y
    fmax(i)=f(indmax);          %frequency with maximum power
    fmean(i)=sum(f'.*Y(:,i))/sum(Y(:,i));   %mean frequency (amplitude-weighted)
end

%% plot results
figure;
subplot(211), plot(1:Nseg,fmax,'rx',1:Nseg,fmean,'bo');
xlabel('Segment'); ylabel('Frequency (Hz)');
legend('Max.Freq.','Mean Freq.'); grid on
title('Max & Mean Frequency vs. Segment')
subplot(212)
%one color per segment, running from red through green and blue to magenta
colorspec=[1,0,0;1,.33,0;1,.67,0;
    1,1,0;.67,1,0;.33,1,0;
    0,1,0;0,1,.33;0,1,.67;
    0,1,1;0,.67,1;0,.33,1;
    0,0,1;.5,0,1;
    1,0,1;1,0,.5];
for i=1:Nseg
    plot(f,Y(:,i),'Color',colorspec(i,:));
    hold on;
end
xlabel('Frequency (Hz)'); ylabel('Amplitude'); xlim([0,1200])
grid on; title('Amplitude Spectra for Segments')

Try it. Good luck.

mehtap agirsoy on 12 Apr 2022

Hi, many thanks for the help.

The zip file exceeds the size limit, so I added a Drive link but forgot to change the permissions. It is OK now. I'd be glad if you could check it when you have time.

My frequency results should fluctuate around 262 Hz. When I tried the max and mean approaches, the results were 617 and 22049.9, respectively. My segment frequencies are

261.931228637695

239.893341064453

261.931228637695

255.033874511719

262.099456787109

I'm not sure whether these are OK; they look a bit suspicious.

Answer 3

William Rose on 13 Apr 2022

@mehtap agirsoy,

Middle C! The frequency sweep in my code goes from A3 to A5.

William Rose on 13 Apr 2022

I was able to see the file on google drive, which I could not do before. However, when I click "download" to put it on my drive - which I need to do in order to open it in Matlab - nothing happens. The Help for google drive says

"If you can't download a file: If you can't download a file, the owner may have disabled options to print, download, or copy for people with only comment or view access."

I suspect that's what is happening here. I can't help more while the file is impossible to access. Post a shorter file that fits within the zip limit.

mehtap agirsoy on 13 Apr 2022

So sorry for the inconvenience. The file still exceeds the limit when I compress it. Anyone with the link is an editor now.

Answer 4

William Rose on 16 Apr 2022

estimateAudioFrequencies.m

@mehtap agirsoy,

I wrote a script that extracts 3 seconds of sound from each vocalization. As I said before, the times of note-singing are approximately: 1-5, 22-27, 42-46, 61-66, 82-87, 102-107 seconds.

Therefore I extract sound from 2-5, 23-26, 43-46, 62-65, 83-86, 103-106 seconds.

I measure the mean frequency and the frequency of maximum power in each segment.

The max.power frequencies are about 620-630 Hz, consistent with the subjects singing E flat 5, also known as the E flat above treble C. The expected frequency of this pitch is 622 Hz, with A440 equal temperament tuning.

The script plots the max frequency for each segment and the power spectrum for each segment.

You confined the frequency search to 0.9 - 1.4 times middle C. This singing signal has very little power in that frequency range. Most of the power is around 630 Hz. I initially thought these children were singing in octave 4 (using scientific pitch notation). Now I think they are singing an octave higher, in octave 5. It is not always easy to decide.
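The pitch frequencies mentioned in this thread can be checked with the standard equal-temperament formula, f = 440 * 2^((n - 69)/12), where n is the MIDI note number and A4 = 440 Hz is note 69. The sketch below is only an illustration: noteFreq is a hypothetical helper, not part of either script. It also tests which pitches fall inside the 0.9 to 1.4 times middle C search band.

```matlab
% Equal-tempered pitch frequencies, A4 = 440 Hz (MIDI note 69).
% noteFreq is a hypothetical helper, not part of any script in this thread.
noteFreq = @(n) 440 * 2.^((n - 69)/12);

names = {'C4 (middle C)', 'Eb4', 'Eb5'};
midi  = [60, 63, 75];                  % MIDI note numbers
f     = noteFreq(midi);                % frequencies in Hz

band   = [0.9, 1.4] * noteFreq(60);    % search band around middle C
inBand = f >= band(1) & f <= band(2);  % which pitches fall inside the band

for k = 1:numel(midi)
    fprintf('%-14s %7.2f Hz  in band: %d\n', names{k}, f(k), inBand(k));
end
fprintf('Band: %.1f to %.1f Hz\n', band(1), band(2));
```

Middle C and E flat 4 fall inside the band, while E flat 5 at about 622 Hz lies well above its upper edge.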

My code also creates a file, A1sel.wav, which is the selected audio segments, plus 1 second of silence after each segment. The graphical output from the script is below.
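Selecting fixed time windows and joining them with gaps of silence could look roughly like the sketch below. It is not the attached estimateAudioFrequencies.m: the signal is a synthetic stand-in for the recording, the sampling rate is an assumption, and the output file name in the commented-out line is hypothetical.

```matlab
% Sketch: extract 3-second segments and join them with 1 s of silence.
% x is a synthetic stand-in; with a real file, [x,Fs] = audioread('A1.wav').
Fs = 8000;                               % sampling rate (Hz), assumed
x  = sin(2*pi*622*(0:110*Fs-1)'/Fs);     % 110 s test tone (column vector)

tStart = [2 23 43 62 83 103];            % segment start times (s)
Tseg   = 3;                              % segment length (s)
gap    = zeros(Fs, 1);                   % 1 second of silence

y = [];
for k = 1:numel(tStart)
    idx = tStart(k)*Fs + (1:Tseg*Fs);    % sample indices of this segment
    y = [y; x(idx); gap];                %#ok<AGROW> append segment + silence
end

% audiowrite('A1sel_demo.wav', y, Fs);   % hypothetical output file name
fprintf('Output length: %.0f s\n', numel(y)/Fs);
```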

mehtap agirsoy on 16 Apr 2022

I really appreciate your help. It's my first time with signal processing and your explanations and code are awesome. Thanks awfully.

William Rose on 16 Apr 2022

@mehtap agirsoy, You are welcome. Good luck with your work!
