audioTimeScaler

Apply time scaling to streaming audio

Since R2019b

Description

The audioTimeScaler object performs audio time scale modification (TSM) independently across each input channel.

To modify the time scale of streaming audio:

1. Create the audioTimeScaler object and set its properties.
2. Call the object with arguments, as if it were a function.

To learn more about how System objects work, see What Are System Objects?

Creation

Syntax

aTS = audioTimeScaler

aTS = audioTimeScaler(speedupFactor)

aTS = audioTimeScaler(___,'Name',Value)

Description

aTS = audioTimeScaler creates an object, aTS, that performs audio time scale modification independently across each input channel over time.

aTS = audioTimeScaler(speedupFactor) sets the SpeedupFactor property to speedupFactor.

aTS = audioTimeScaler(___,'Name',Value) sets each property Name to the specified Value. Unspecified properties have default values.

Example: aTS = audioTimeScaler(1.2,'Window',sqrt(hann(1024,'periodic')),'OverlapLength',768) creates an object, aTS, that speeds up audio to 1.2 times its original speed using the square root of a periodic 1024-point Hann window and a 768-point overlap.

Properties

Unless otherwise indicated, properties are nontunable, which means you cannot change their values after calling the object. Objects lock when you call them, and the release function unlocks them.

If a property is tunable, you can change its value at any time.

For more information on changing property values, see System Design in MATLAB Using System Objects.
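
The lock and release cycle described above can be sketched as follows. This is a minimal illustration that assumes Audio Toolbox is installed; the frame of zeros is a placeholder input, not real audio.

```matlab
% Create a time scaler with the default Window (512 points) and OverlapLength (384).
aTS = audioTimeScaler;
hopLength = numel(aTS.Window) - aTS.OverlapLength;
frame = zeros(hopLength,1);        % placeholder input frame (assumption)

audioOut = aTS(frame);             % the first call locks the object
lockedAfterCall = isLocked(aTS);   % true while the object is locked

release(aTS)                       % unlock so nontunable properties can change
aTS.OverlapLength = 256;           % nontunable property, settable again after release
lockedAfterRelease = isLocked(aTS);
```

Changing OverlapLength before calling release would produce an error, because the property is nontunable while the object is locked.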

`SpeedupFactor` — Speedup factor
`1.1` (default) | positive real scalar

Speedup factor, specified as a positive real scalar. Values greater than 1 speed up the audio, and values less than 1 slow it down.

Tunable: Yes
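
Because SpeedupFactor is tunable, it can be changed between calls without releasing the object. The following is a minimal sketch, assuming Audio Toolbox is installed; the noise frame is a placeholder for real audio.

```matlab
aTS = audioTimeScaler(1.5);        % start at 1.5 times the original speed
hopLength = numel(aTS.Window) - aTS.OverlapLength;
frame = randn(hopLength,1);        % placeholder noise frame (assumption)

fastOut = aTS(frame);              % processed with SpeedupFactor = 1.5

aTS.SpeedupFactor = 0.8;           % tunable: no call to release is needed
slowOut = aTS(frame);              % processed with SpeedupFactor = 0.8

release(aTS)
```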

`InputDomain` — Domain of input signal
`"Time"` (default) | `"Frequency"`

Domain of the input signal, specified as "Time" or "Frequency".

Data Types: char | string

`Window` — Analysis window
`sqrt(hann(512,'periodic'))` (default) | real vector

Analysis window, specified as a real vector.

Note

If using audioTimeScaler with frequency-domain input, you must specify Window as the same window used to transform audioIn to the frequency domain.

Data Types: single | double

`OverlapLength` — Overlap length of adjacent analysis windows
`384` (default) | nonnegative integer

Overlap length of adjacent analysis windows, specified as a nonnegative integer.

Note

If using audioTimeScaler with frequency-domain input, you must specify OverlapLength as the same overlap length used to transform audioIn to a time-frequency representation.

`FFTLength` — FFT length
`[]` (default) | positive scalar integer

FFT length, specified as a positive integer. The default, [], means that the FFT length is equal to the number of rows in the input signal.

Dependencies

To enable this property, set InputDomain to "Time".

`LockPhase` — Apply identity phase locking
`false` (default) | `true`

Apply identity phase locking, specified as true or false.

Data Types: logical

Usage

Syntax

audioOut = aTS(audioIn)

Description

audioOut = aTS(audioIn) applies time-scale modification to the input, audioIn, and returns the time-scaled output, audioOut.

Input Arguments

`audioIn` — Input audio
column vector | matrix

Input audio, specified as a column vector or matrix. How audioTimeScaler interprets audioIn depends on the InputDomain property.

If InputDomain is set to "Time", audioIn must be a real N-by-1 column vector or N-by-C matrix. The number of rows, N, must be equal to or less than the hop length (size(audioIn,1) <= numel(Window)-OverlapLength). Columns of a matrix are interpreted as individual channels.
If InputDomain is set to "Frequency", specify audioIn as a real or complex NFFT-by-1 column vector or NFFT-by-C matrix. The number of rows, NFFT, is the number of points in the DFT calculation, and is set on the first call to the audio time scaler. NFFT must be greater than or equal to the window length (size(audioIn,1) >= numel(Window)). Columns of a matrix are interpreted as individual channels.

Data Types: single | double
Complex Number Support: Yes
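
The two size constraints above can be checked before streaming. The sketch below uses only base MATLAB arithmetic with the default window length and overlap length; the variable names and the zero-valued frames are illustrative assumptions.

```matlab
windowLength  = 512;                         % numel(Window) for the default window
overlapLength = 384;                         % default OverlapLength
hopLength     = windowLength - overlapLength;

% Time-domain input: the number of rows must not exceed the hop length.
timeFrame = zeros(hopLength,2);              % stereo frame, one column per channel
timeFrameOK = size(timeFrame,1) <= hopLength;

% Frequency-domain input: the number of rows (NFFT) must be at least the window length.
nfft = 512;
freqFrame = complex(zeros(nfft,2));          % placeholder two-channel spectrum
freqFrameOK = size(freqFrame,1) >= windowLength;
```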

Output Arguments

`audioOut` — Time-stretched audio
column vector | matrix

Time-stretched audio, returned as a column vector or matrix.

Data Types: single | double

Object Functions

To use an object function, specify the System object™ as the first input argument. For example, to release system resources of a System object named obj, use this syntax:

release(obj)

Common to All System Objects

`step`	Run System object algorithm
`release`	Release resources and allow changes to System object property values and input characteristics
`reset`	Reset internal states of System object
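
A minimal sketch of the three common functions, assuming Audio Toolbox is installed. Calling step(obj,x) is equivalent to calling obj(x); the frame of zeros is a placeholder input.

```matlab
aTS = audioTimeScaler;
hopLength = numel(aTS.Window) - aTS.OverlapLength;
frame = zeros(hopLength,1);   % placeholder input frame (assumption)

y1 = step(aTS,frame);         % equivalent to y1 = aTS(frame)
reset(aTS)                    % reset internal states; property values are unchanged
y2 = aTS(frame);
release(aTS)                  % unlock the object and free its resources
```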

Examples

Apply Time Scale Modification to Streaming Audio

To minimize artifacts caused by windowing, create a square root Hann window capable of perfect reconstruction. Use iscola to verify the design.

win = sqrt(hann(1024,'periodic'));
overlapLength = 896;
iscola(win,overlapLength)

ans = logical
   1
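
To see why a square root Hann window suits an analysis and synthesis chain, the following sketch sums the squared window over all shifts by the hop length and checks that the result is constant. It is written in base MATLAB under the assumption that the same window is applied once at analysis and once at synthesis. It is an illustration of the reconstruction property, not necessarily the test that iscola performs.

```matlab
N = 1024;                                 % window length
overlapLength = 896;
hopLength = N - overlapLength;            % 128 samples

n = (0:N-1).';
win = sqrt(0.5 - 0.5*cos(2*pi*n/N));      % same values as sqrt(hann(N,'periodic'))

% Accumulate the squared window segments that overlap in steady state.
olaSum = zeros(hopLength,1);
for shift = 0:hopLength:N-1
    olaSum = olaSum + win(shift+1:shift+hopLength).^2;
end

% A deviation near zero means the overlapped squared windows sum to a constant.
deviation = max(abs(olaSum - mean(olaSum)));
```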

Create an audioTimeScaler with a speedup factor of 1.5. Change the value of alpha to hear the effect of the speedup factor.

alpha = 1.5;
aTS = audioTimeScaler( ...
    'SpeedupFactor',alpha, ...
    'Window',win, ...
    'OverlapLength',overlapLength);

Create a dsp.AudioFileReader object to read frames from an audio file. The length of frames input to the audio time scaler must be less than or equal to the analysis hop length defined in audioTimeScaler. To minimize buffering, set the samples per frame of the file reader to the analysis hop length.

hopLength = numel(aTS.Window) - overlapLength;
fileReader = dsp.AudioFileReader('Counting-16-44p1-mono-15secs.wav', ...
    'SamplesPerFrame',hopLength);

Create an audioDeviceWriter to write frames to your audio device. Use the same sample rate as the file reader.

deviceWriter = audioDeviceWriter('SampleRate',fileReader.SampleRate);

In an audio stream loop, read a frame from the file, apply time scale modification, and then write a frame to the device.

while ~isDone(fileReader)
    audioIn = fileReader();
    audioOut = aTS(audioIn);
    deviceWriter(audioOut);
end

As a best practice, release your objects once done.

release(deviceWriter)
release(fileReader)
release(aTS)

Process Frequency-Domain Input

Create a window capable of perfect reconstruction. Use iscola to verify the design.

win = kbdwin(512);
overlapLength = 256;
iscola(win,overlapLength)

ans = logical
   1

Create an audioTimeScaler with a speedup factor of 0.8. Set InputDomain to "Frequency" and specify the window and overlap length used to transform time-domain audio to the frequency domain. Set LockPhase to true to increase the fidelity in the time-scaled output.

alpha = 0.8;
timeScaleModification = audioTimeScaler( ...
    "SpeedupFactor",alpha, ...
    "InputDomain","Frequency", ...
    "Window",win, ...
    "OverlapLength",overlapLength, ...
    "LockPhase",true);

Create a dsp.AudioFileReader object to read frames from an audio file. Create a dsp.STFT object to perform a short-time Fourier transform on streaming audio. Specify the same window and overlap length you used to create the audioTimeScaler. Create an audioDeviceWriter object to write frames to your audio device.

fileReader = dsp.AudioFileReader('RockDrums-44p1-stereo-11secs.mp3','SamplesPerFrame',numel(win)-overlapLength);

shortTimeFourierTransform = dsp.STFT('Window',win,'OverlapLength',overlapLength,'FFTLength',numel(win));

deviceWriter = audioDeviceWriter('SampleRate',fileReader.SampleRate);

In an audio stream loop:

1. Read a frame from the file.
2. Input the frame to the STFT. The dsp.STFT object performs buffering.
3. Apply time scale modification.
4. Write the modified audio to your audio device.

while ~isDone(fileReader)
    x = fileReader();
    X = shortTimeFourierTransform(x);
    y = timeScaleModification(X);
    deviceWriter(y);
end

As a best practice, release your objects once done.

release(fileReader)
release(shortTimeFourierTransform)
release(timeScaleModification)
release(deviceWriter)

Algorithms

audioTimeScaler uses the same phase vocoder algorithm as stretchAudio and is based on the descriptions in [1] and [2].
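
For orientation, the sketch below shows the textbook phase propagation step of a phase vocoder, in the spirit of the description in [1]. It is not the internal implementation of audioTimeScaler. The relation between the synthesis hop and the speedup factor, the test tone, and all variable names are assumptions made for this illustration.

```matlab
N  = 512;                      % window length
Ha = 128;                      % analysis hop (window length minus overlap length)
alpha = 1.5;                   % speedup factor
Hs = Ha/alpha;                 % synthesis hop; assumed relation for this sketch

n   = (0:N-1).';
win = sqrt(0.5 - 0.5*cos(2*pi*n/N));       % square root of a periodic Hann window
x   = sin(2*pi*0.05*(0:N+Ha-1).');         % hypothetical test tone

% Two consecutive analysis frames, keeping the nonnegative-frequency bins.
X1 = fft(win .* x(1:N));        X1 = X1(1:N/2+1);
X2 = fft(win .* x(Ha+1:Ha+N));  X2 = X2(1:N/2+1);

omega = 2*pi*(0:N/2).'/N;                  % bin center frequencies in rad/sample

% Deviation of the measured phase advance from the advance expected at the bin center.
dphi = angle(X2) - angle(X1) - omega*Ha;
dphi = mod(dphi + pi, 2*pi) - pi;          % wrap to the principal value

instFreq = omega + dphi/Ha;                % instantaneous frequency estimate

synPhase1 = angle(X1);                     % first synthesis frame keeps its analysis phase
synPhase2 = synPhase1 + Hs*instFreq;       % propagate the phase over the synthesis hop
Y2 = abs(X2) .* exp(1i*synPhase2);         % frame with original magnitude and new phase
```

The magnitudes pass through unchanged, and only the phases are adjusted so that adjacent frames stay coherent when they are overlapped at the synthesis hop.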

References

[1] Driedger, Jonathan, and Meinard Müller. "A Review of Time-Scale Modification of Music Signals." Applied Sciences. Vol. 6, Issue 2, 2016.

[2] Driedger, Jonathan. "Time-Scale Modification Algorithms for Music Audio Signals." Master's thesis, Saarland University, 2011.

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

Usage notes and limitations:

System Objects in MATLAB Code Generation (MATLAB Coder)

Version History

Introduced in R2019b
