Main Content

harmonicRatio

Harmonic ratio

Description

example

hr = harmonicRatio(audioIn,fs) returns the harmonic ratio of the signal, audioIn, over time. Columns of the input are treated as individual channels.

example

hr = harmonicRatio(audioIn,fs,Name=Value) specifies options using one or more name-value arguments.

Example: hr = harmonicRatio(audioIn,fs,Window=rectwin(round(fs*0.1)),OverlapLength=round(fs*0.05)) returns the harmonic ratio for the audio input signal sampled at fs Hz. The harmonic ratio is calculated for 100 ms rectangular windows with 50 ms overlap.

example

harmonicRatio(___) with no output arguments plots the harmonic ratio against time. You can specify an input combination from any of the previous syntaxes.

Examples

collapse all

Read in an audio file and calculate the harmonic ratio using default parameters.

[audioIn,fs] = audioread("RandomOscThree-24-96-stereo-13secs.aif");
audioInMono = mean(audioIn,2);

hr = harmonicRatio(audioInMono,fs);

Plot the amplitude and the harmonic ratio of the signal against time.

t = (0:length(audioInMono)-1)/fs;
subplot(2,1,1)
plot(t,audioInMono)
ylabel("Amplitude")
grid minor
axis tight

subplot(2,1,2)
harmonicRatio(audioInMono,fs)

Read in an audio file.

[audioIn,fs] = audioread("Counting-16-44p1-mono-15secs.wav");

Calculate the harmonic ratio of the audio file using 50 ms Hann windows with 25 ms overlap.

hr = harmonicRatio(audioIn,fs, ...
                   Window=hann(round(fs.*0.05),"periodic"), ...
                   OverlapLength=round(fs.*0.025));

Plot the harmonic ratio.

harmonicRatio(audioIn,fs, ...
              Window=hann(round(fs.*0.05),"periodic"), ...
              OverlapLength=round(fs.*0.025))

The harmonic ratio indicates the ratio of energy in the harmonic portion of audio to the total energy of the audio. Because the audio signal in this example has regions of near silence, where the total energy is very low, the harmonic ratio does a poor job discriminating between regions of speech and regions of silence. Add white noise to the audio signal and then calculate the harmonic ratio.

audioIn = audioIn + 0.1*randn(size(audioIn));
hr = harmonicRatio(audioIn,fs, ...
                   Window=hann(round(fs.*0.05),"periodic"), ...
                   OverlapLength=round(fs.*0.025));

Plot the new harmonic ratio of the audio signal combined with white noise.

harmonicRatio(audioIn,fs, ...
              Window=hann(round(fs.*0.05),"periodic"), ...
              OverlapLength=round(fs.*0.025))

Create a dsp.AudioFileReader object to read in stereo audio data frame-by-frame. Create a dsp.SignalSink object to log the harmonic ratio calculation.

fileReader = dsp.AudioFileReader('RandomOscThree-24-96-stereo-13secs.aif');
logger = dsp.SignalSink;

In an audio stream loop:

  1. Read in a frame of audio data.

  2. Calculate the harmonic ratio for each channel of the frame of audio.

  3. Log the harmonic ratio for later plotting.

To calculate the harmonic ratio for only a given input frame, specify a window with the same number of samples as the input, and set the overlap length to zero. Plot the logged data.

win = hamming(fileReader.SamplesPerFrame,'periodic');
while ~isDone(fileReader)
    audioIn = fileReader();
    
    hr = harmonicRatio(audioIn,fileReader.SampleRate, ...
                       'Window',win, ...
                       'OverlapLength',0);
    logger(hr)
end

plot(logger.Buffer)
ylabel('Harmonic Ratio')
legend('Left Channel','Right Channel')

If the input to your audio stream loop has a variable samples-per-frame, an inconsistent samples-per-frame with the analysis window size of harmonicRatio, or if you want to calculate the harmonic ratio of overlapped data, use dsp.AsyncBuffer.

Create a dsp.AsyncBuffer object, reset the logger, and release the file reader.

buff = dsp.AsyncBuffer;
reset(logger)
release(fileReader)

Calculate the harmonic ratio using 50 ms frames with a 25 ms overlap.

fs = fileReader.SampleRate;

samplesPerFrame = round(fs*0.05);
samplesOverlap = round(fs*0.025);

samplesPerHop = samplesPerFrame - samplesOverlap;

win = hamming(samplesPerFrame);

while ~isDone(fileReader)
    audioIn = fileReader();
    write(buff,audioIn);

    while buff.NumUnreadSamples >= samplesPerHop
        audioBuffered = read(buff,samplesPerFrame,samplesOverlap);

        hr = harmonicRatio(audioBuffered,fs, ...
                           'Window',win, ...
                           'OverlapLength',0);
        logger(hr)
    end

end
release(fileReader)

Plot the logged data.

plot(logger.Buffer)
ylabel('Harmonic Ratio')
legend('Left Channel','Right Channel')

The harmonic ratio measures the amount of energy in the tonal part of the signal compared to the amount of energy in the total signal.

Harmonic Ratio of Pure Tone

Create a pure tone and then calculate the harmonic ratio using default parameters. By default, the harmonic ratio is calculated for 30 ms Hamming windows with 10 ms hops. Plot the results. The harmonic ratio is near 1, which is the theoretical maximum.

fs = 48e3;
osc = audioOscillator(Frequency=500,SamplesPerFrame=192e3,SampleRate=fs);
sinewave = osc();

hr = harmonicRatio(sinewave,fs);

Plot the harmonic ratio against time. The harmonic ratio is near 1, which is the theoretical maximum.

harmonicRatio(sinewave,fs)
title("Sinusoid - Default Parameters")

The short-time analysis required for windowing lowers the harmonic ratio from the theoretical value of 1. To diminish the effect of windowing, you can increase the window size. Use a 100 ms Hamming window and a 10 ms hop.

win = hamming(round(fs*0.1),"periodic");
overlap = round(fs*0.099);

hr = harmonicRatio(sinewave,fs,Window=win,OverlapLength=overlap);

Plot the harmonic ratio and observe that it is closer to one than when using the default window length.

harmonicRatio(sinewave,fs,Window=win,OverlapLength=overlap)
title("Sinusoid - 100 ms Window")

Harmonic Ratio of White Noise

Create 5 seconds of white noise and then calculate the harmonic ratio using default parameters. By default, the harmonic ratio is calculated for 30 ms Hamming windows with 10 ms hops.

fs = 48e3;
noise = rand(fs*5,1);

hr = harmonicRatio(noise,fs);

Plot the results. The harmonic ratio is 0.

harmonicRatio(noise,fs)
title("Noise - Default Parameters")

Input Arguments

collapse all

Input signal, specified as a column vector or matrix. If specified as a matrix, harmonicRatio treats the columns of the matrix as individual audio channels.

Data Types: single | double

Sample rate of the input signal in Hz, specified as a positive scalar.

Data Types: single | double

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: Window=hamming(256)

Window applied in the time domain, specified as a real vector. The number of elements in the vector must be in the range [1, size(audioIn,1)]. The number of elements in the vector must also be greater than OverlapLength.

Data Types: single | double

Number of samples overlapped between adjacent windows, specified as an integer in the range [0, size(Window,1)).

Data Types: single | double

Output Arguments

collapse all

Harmonic ratio, returned as a scalar, vector, or matrix. Each row of hr corresponds to the harmonic ratio of a window of audioIn. The harmonic ratio is returned with values in the range [0,1]. A value of 0 represents low harmonicity, and a value of 1 represents high harmonicity.

Data Types: single | double

Algorithms

The harmonic ratio is calculated as described in [1]. The following algorithm is applied independently to each window of audio data. The normalized autocorrelation of the signal is determined as:

Γ(m)=n=1Ns(n)s(nm)n=1Ns(n)2n=0Ns(nm)2for(1mM)

where

  • s is a single frame of audio data with N elements.

  • M is the maximum lag in the calculation. The maximum lag is 40 ms, which corresponds to a minimum fundamental frequency of 25 Hz.

A first estimate of the harmonic ratio is determined as the maximum of the normalized autocorrelation, within a given range:

βHR=maxM0mM{Γ(m)}

where M0 is the lower edge of the search range, determined as the first zero crossing of the normalized autocorrelation.

Finally, the harmonic ratio estimate is improved using parabolic interpolation, as described in [2].

References

[1] Kim, Hyoung-Gook, Nicholas Moreau, and Thomas Sikora. MPEG-7 Audio and Beyond: Audio Content Indexing and Retrieval. John Wiley & Sons, 2005.

[2] Quadratic Interpolation of Spectral Peaks. Accessed October 11, 2018. https://ccrma.stanford.edu/~jos/sasp/Quadratic_Interpolation_Spectral_Peaks.html

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

Version History

Introduced in R2019a