Audio Processing

Extend deep learning workflows with audio and speech processing applications

Apply deep learning to audio and speech processing applications by using Deep Learning Toolbox™ together with Audio Toolbox™. For signal processing applications, see Signal Processing. For applications in wireless communications, see Wireless Communications.

Apps

Signal Labeler - Label signal attributes, regions, and points of interest, and extract features

Functions

audioDatastore - Datastore for collection of audio files
audioDataAugmenter - Augment audio data
audioFeatureExtractor - Streamline audio feature extraction
openl3Embeddings - Extract OpenL3 feature embeddings
pitchnn - Estimate pitch with deep learning neural network
vggishEmbeddings - Extract VGGish feature embeddings
yamnet - YAMNet neural network
classifySound - Classify sounds in audio signal
crepe - CREPE neural network
vggish - VGGish neural network
openl3 - OpenL3 neural network
vadnet - Voice activity detection (VAD) neural network
detectspeechnn - Detect boundaries of speech in audio signal using AI

Blocks

VGGish - VGGish embeddings extraction network
VGGish Embeddings - Extract VGGish embeddings
YAMNet - YAMNet sound classification network
Sound Classifier - Classify sounds in audio signal
OpenL3 - OpenL3 embeddings extraction network
OpenL3 Embeddings - Extract OpenL3 embeddings
CREPE - CREPE deep pitch estimation neural network
Deep Pitch Estimator - Estimate pitch with CREPE deep learning neural network

Topics