openl3Features

(Removed) Extract OpenL3 features

Since R2021a

The openl3Features function has been removed. Use openl3Embeddings instead. For more information, see Version History.

Syntax

embeddings = openl3Features(audioIn,fs)

embeddings = openl3Features(audioIn,fs,Name,Value)

Description

embeddings = openl3Features(audioIn,fs) returns OpenL3 feature embeddings over time for audio input audioIn with sample rate fs. Columns of the input are treated as individual channels.

embeddings = openl3Features(audioIn,fs,Name,Value) specifies options using one or more Name,Value arguments. For example, embeddings = openl3Features(audioIn,fs,'OverlapPercentage',75) applies a 75% overlap between consecutive frames used to create the audio embeddings.

This function requires both Audio Toolbox™ and Deep Learning Toolbox™.

Examples

Download `openl3Features` Functionality

Download and unzip the Audio Toolbox™ model for OpenL3.

Type openl3Features at the command line. If the Audio Toolbox model for OpenL3 is not installed, the function provides a link to the location of the network weights. To download the model, click the link. Unzip the file to a location on the MATLAB® path.

Alternatively, execute the following commands to download and unzip the OpenL3 model to your temporary directory.

downloadFolder = fullfile(tempdir,'OpenL3Download');
loc = websave(downloadFolder,'https://ssd.mathworks.com/supportfiles/audio/openl3.zip');
OpenL3Location = tempdir;
unzip(loc,OpenL3Location)
addpath(fullfile(OpenL3Location,'openl3'))

Extract OpenL3 Embeddings

Read in an audio file.

[audioIn,fs] = audioread('MainStreetOne-16-16-mono-12secs.wav');

Call the openl3Features function with the audio and sample rate to extract OpenL3 feature embeddings from the audio.

featureVectors = openl3Features(audioIn,fs);

The openl3Features function returns a matrix of 512-element feature vectors over time.

[numHops,numElementsPerHop,numChannels] = size(featureVectors)

numHops = 111

numElementsPerHop = 512

numChannels = 1

Decrease Time Resolution of OpenL3 Features

Create a 10-second pink noise signal and then extract OpenL3 features. By default, the openl3Features function extracts features from mel spectrograms with 90% overlap.

fs = 16e3;
dur = 10;
audioIn = pinknoise(dur*fs,1,'single');
features = openl3Features(audioIn,fs);

Plot the OpenL3 features over time.

surf(features,'EdgeColor','none')
view([30 65])
axis tight
xlabel('Feature Index')
ylabel('Frame')
zlabel('Feature Value')
title('OpenL3 Features')

To decrease the resolution of OpenL3 features over time, specify a lower percent overlap between consecutive mel spectrograms. Plot the results.

overlapPercentage = 10;
features = openl3Features(audioIn,fs,'OverlapPercentage',overlapPercentage);
surf(features,'EdgeColor','none')
view([30 65])
axis tight
xlabel('Feature Index')
ylabel('Frame')
zlabel('Feature Value')
title('OpenL3 Features')

Input Arguments

`audioIn` — Input signal
column vector | matrix

Input signal, specified as a column vector or matrix. If you specify a matrix, openl3Features treats the columns of the matrix as individual audio channels.

Data Types: single | double
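
As an illustration of the multichannel behavior described above, the following minimal sketch passes a two-column matrix and inspects the third dimension of the output. It assumes a release in which openl3Features is still available (before R2024b) and an installed OpenL3 model. The signal length and variable names are chosen only for illustration.

```matlab
% Sketch only: requires Audio Toolbox, Deep Learning Toolbox, and the OpenL3 model,
% on a release where openl3Features has not yet been removed.
fs = 16e3;
stereoIn = pinknoise(2*fs,2,'single');     % 2 seconds of noise, 2 columns = 2 channels
embeddings = openl3Features(stereoIn,fs);  % each column is processed as its own channel
numChannels = size(embeddings,3);          % third dimension indexes the input channels
```

With a two-column input, numChannels is expected to equal 2, matching the N-by-L-by-C output layout.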

`fs` — Sample rate (Hz)
positive scalar

Sample rate of the input signal in Hz, specified as a positive scalar.

Data Types: single | double

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: openl3Features(audioIn,fs,'SpectrumType','mel256')

`OverlapPercentage` — Percentage overlap between consecutive spectrograms
`90` (default) | scalar in the range [0,100)

Percentage overlap between consecutive spectrograms, specified as a scalar in the range [0,100).

Data Types: single | double

`SpectrumType` — Spectrum type
`'mel128'` (default) | `'mel256'` | `'linear'`

Spectrum type generated from audio and used as input to the neural network, specified as 'mel128', 'mel256', or 'linear'.

Note

The SpectrumType that you select controls the spectrogram used in the network. See openl3 or openl3Preprocess for more details.

Data Types: char | string

`EmbeddingLength` — Embedding length
`512` (default) | `6144`

Length of the output audio embedding, specified as 512 or 6144.

Data Types: single | double

`ContentType` — Audio content type
`'env'` (default) | `'music'`

Audio content type the neural network is trained on, specified as 'env' or 'music'.

Set ContentType to:

'env' when you want to use a model trained on environmental data.
'music' when you want to use a model trained on musical data.

Data Types: char | string
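
The name-value arguments above can be combined in a single call. The following is a minimal sketch, assuming a release in which openl3Features is still available and an installed OpenL3 model. The particular option values are arbitrary and chosen only to show the syntax.

```matlab
% Sketch only: combines the four documented name-value arguments in one call.
fs = 16e3;
audioIn = pinknoise(3*fs,1,'single');      % 3 seconds of single-channel noise
embeddings = openl3Features(audioIn,fs, ...
    'ContentType','music', ...             % model trained on musical data
    'SpectrumType','mel256', ...           % spectrogram type fed to the network
    'EmbeddingLength',6144, ...            % numeric value, not a character vector
    'OverlapPercentage',50);               % overlap between consecutive spectrograms
embeddingLength = size(embeddings,2);      % second dimension is the embedding length
```

Because the order of name-value pairs does not matter, the same call can list the options in any order.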

Output Arguments

`embeddings` — Compact representation of audio data
N-by-L-by-C array

Compact representation of audio data, returned as an N-by-L-by-C array, where:

N is the number of buffered frames that the audio signal is partitioned into. N depends on the length of audioIn and the value of OverlapPercentage.
L is the audio embedding length.
C is the number of input channels.

Data Types: single
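
To make the dependence of N on the signal length and overlap concrete, the following sketch estimates the frame count with plain arithmetic. It rests on two assumptions that this page does not state: each frame spans 1 second of audio, and a trailing partial frame is dropped. Under those assumptions, the result agrees with the 111 frames reported for the 12-second file in the Extract OpenL3 Embeddings example.

```matlab
% Assumptions (not stated on this page): 1-second frames, no partial final frame.
frameMs = 1000;                 % assumed frame duration in milliseconds
audioMs = 12000;                % duration of the audio in milliseconds (12 s)
overlapPercentage = 90;         % default overlap between consecutive frames

% Hop between frame starts, rounded to whole milliseconds to avoid
% floating-point error in the floor operation below.
hopMs = round(frameMs*(1 - overlapPercentage/100));

% Count the frames that fit entirely inside the signal.
numFrames = floor((audioMs - frameMs)/hopMs) + 1;
```

Lowering overlapPercentage increases hopMs and therefore reduces numFrames, which is the effect shown in the Decrease Time Resolution of OpenL3 Features example.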

References

[1] Cramer, Jason, et al. "Look, Listen, and Learn More: Design Choices for Deep Audio Embeddings." In ICASSP 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2019, pp. 3852-56. doi:10.1109/ICASSP.2019.8682475.

Extended Capabilities

GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.

This function fully supports GPU arrays. For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
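
A minimal sketch of GPU usage follows, assuming Parallel Computing Toolbox, a supported GPU, an installed OpenL3 model, and a release in which openl3Features is still available. The variable names are illustrative.

```matlab
% Sketch only: move the audio to the GPU, extract embeddings, and bring them back.
fs = 16e3;
audioIn = pinknoise(2*fs,1,'single');
audioInGPU = gpuArray(audioIn);                 % transfer the input to GPU memory
embeddingsGPU = openl3Features(audioInGPU,fs);  % computation runs on the GPU input
embeddings = gather(embeddingsGPU);             % copy the result back to host memory
```

Calling gather is safe whether or not its input is a gpuArray, so the last line works regardless of where the result resides.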

Version History

Introduced in R2021a

R2024b: Errors

The openl3Features function has been removed. Use openl3Embeddings instead.
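
Code that must run across releases can pick whichever function is present. The following is a minimal sketch, assuming that openl3Embeddings accepts the same inputs and the same OverlapPercentage option as openl3Features. Confirm this against the openl3Embeddings reference page before relying on it.

```matlab
% Sketch only: prefer openl3Embeddings when it exists, else fall back to openl3Features.
fs = 16e3;
audioIn = pinknoise(2*fs,1,'single');
if ~isempty(which('openl3Embeddings'))
    % Replacement function (assumed to take the same arguments).
    embeddings = openl3Embeddings(audioIn,fs,'OverlapPercentage',75);
else
    % Older releases in which openl3Features has not been removed.
    embeddings = openl3Features(audioIn,fs,'OverlapPercentage',75);
end
```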

R2024a: Warns

The openl3Features function issues a warning that it will be removed in a future release.

R2022a: To be removed

The openl3Features function will be removed in a future release. Existing calls to openl3Features continue to run without warning.
