addAugmentationMethod

Add custom augmentation method

Description

addAugmentationMethod(aug,algorithmName,algorithmHandle) adds a custom augmentation algorithm to an audioDataAugmenter object.

addAugmentationMethod(aug,algorithmName,algorithmHandle,Name,Value) specifies options using one or more Name,Value pair arguments.
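
The two syntaxes can be compared side by side in a minimal sketch. The method names 'Invert' and 'Scale' and the parameter name 'ScaleFactor' are hypothetical, chosen only for illustration, and the variable demoAug is used so the sketch does not interfere with the examples on this page.

```matlab
% Sketch only: 'Invert', 'Scale', and 'ScaleFactor' are hypothetical names.
demoAug = audioDataAugmenter('NumAugmentations',2);

% Syntax 1: custom method with no tunable parameter.
addAugmentationMethod(demoAug,'Invert',@(x)-x)

% Syntax 2: custom method with one parameter, configured through
% name-value arguments. The handle takes the audio first, then the parameter.
addAugmentationMethod(demoAug,'Scale',@(x,g)g*x, ...
    'AugmentationParameter','ScaleFactor', ...
    'ParameterRange',[0.5,1.5], ...
    'ParameterValue',1)

% Each added method exposes a <algorithmName>Probability property, and each
% parameter exposes a <parameter>Range property in 'random' mode.
disp(demoAug.InvertProbability)
disp(demoAug.ScaleFactorRange)
```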

Examples

You can expand the capabilities of audioDataAugmenter by adding custom augmentation methods.

Read in an audio signal and listen to it.

[audioIn,fs] = audioread('Counting-16-44p1-mono-15secs.wav');
sound(audioIn,fs)

Create an audioDataAugmenter object. Set the probability of applying white noise to 0.

augmenter = audioDataAugmenter('AddNoiseProbability',0)
augmenter = 
  audioDataAugmenter with properties:

               AugmentationMode: 'sequential'
    AugmentationParameterSource: 'random'
               NumAugmentations: 1
         TimeStretchProbability: 0.5000
             SpeedupFactorRange: [0.8000 1.2000]
          PitchShiftProbability: 0.5000
             SemitoneShiftRange: [-2 2]
       VolumeControlProbability: 0.5000
                VolumeGainRange: [-3 3]
            AddNoiseProbability: 0
           TimeShiftProbability: 0.5000
                 TimeShiftRange: [-0.0050 0.0050]

Specify a custom augmentation algorithm that applies pink noise. After the call to addAugmentationMethod, the augmenter has a new AddPinkNoiseProbability property.

algorithmName = 'AddPinkNoise';
algorithmHandle = @(x)x+pinknoise(size(x),'like',x);
addAugmentationMethod(augmenter,algorithmName,algorithmHandle)

augmenter
augmenter = 
  audioDataAugmenter with properties:

               AugmentationMode: 'sequential'
    AugmentationParameterSource: 'random'
               NumAugmentations: 1
         TimeStretchProbability: 0.5000
             SpeedupFactorRange: [0.8000 1.2000]
          PitchShiftProbability: 0.5000
             SemitoneShiftRange: [-2 2]
       VolumeControlProbability: 0.5000
                VolumeGainRange: [-3 3]
            AddNoiseProbability: 0
           TimeShiftProbability: 0.5000
                 TimeShiftRange: [-0.0050 0.0050]
        AddPinkNoiseProbability: 0.5000

Set the probability of adding pink noise to 1.

augmenter.AddPinkNoiseProbability = 1
augmenter = 
  audioDataAugmenter with properties:

               AugmentationMode: 'sequential'
    AugmentationParameterSource: 'random'
               NumAugmentations: 1
         TimeStretchProbability: 0.5000
             SpeedupFactorRange: [0.8000 1.2000]
          PitchShiftProbability: 0.5000
             SemitoneShiftRange: [-2 2]
       VolumeControlProbability: 0.5000
                VolumeGainRange: [-3 3]
            AddNoiseProbability: 0
           TimeShiftProbability: 0.5000
                 TimeShiftRange: [-0.0050 0.0050]
        AddPinkNoiseProbability: 1

Augment the original signal and listen to the result. Inspect parameters of the augmentation algorithms applied.

data = augment(augmenter,audioIn,fs);
sound(data.Audio{1},fs)

data.AugmentationInfo(1)
ans = struct with fields:
    SpeedupFactor: 1
    SemitoneShift: 0
       VolumeGain: 2.4803
        TimeShift: -0.0022
     AddPinkNoise: 'Applied'

Plot the mel spectrograms of the original and augmented signals.

melSpectrogram(audioIn,fs)
title('Original Signal')

[Figure: mel spectrogram titled "Original Signal"; x-axis Time (s), y-axis Frequency (kHz)]

melSpectrogram(data.Audio{1},fs)
title('Augmented Signal')

[Figure: mel spectrogram titled "Augmented Signal"; x-axis Time (s), y-axis Frequency (kHz)]

In this example, you add a custom augmentation method that applies median filtering to your audio.

Read in an audio signal and listen to it.

[audioIn,fs] = audioread("Counting-16-44p1-mono-15secs.wav");
sound(audioIn,fs)

Create a random sequential augmenter that adds noise with an SNR range of 5 dB to 10 dB. Set the probability of applying volume control, time stretching, pitch shifting, and time shifting to 0. Set NumAugmentations to 4 to create 4 separate augmentations.

aug = audioDataAugmenter('NumAugmentations',4, ...
    "AddNoiseProbability",1, ...
    "SNRRange",[5,10], ...
    "VolumeControlProbability",0, ...
    "TimeStretchProbability",0, ...
    "TimeShiftProbability",0, ...
    "PitchShiftProbability",0)
aug = 
  audioDataAugmenter with properties:

               AugmentationMode: 'sequential'
    AugmentationParameterSource: 'random'
               NumAugmentations: 4
         TimeStretchProbability: 0
          PitchShiftProbability: 0
       VolumeControlProbability: 0
            AddNoiseProbability: 1
                       SNRRange: [5 10]
           TimeShiftProbability: 0

Call addAugmentationMethod with an algorithm name and function handle. Specify the algorithm name as MedianFilter and the function handle as movmedian with a 100-element window length. The augmentation is added to the properties of your audioDataAugmenter object.

algorithmName = 'MedianFilter';
algorithmHandle = @(x)(movmedian(x,100));
addAugmentationMethod(aug,algorithmName,algorithmHandle)

aug
aug = 
  audioDataAugmenter with properties:

               AugmentationMode: 'sequential'
    AugmentationParameterSource: 'random'
               NumAugmentations: 4
         TimeStretchProbability: 0
          PitchShiftProbability: 0
       VolumeControlProbability: 0
            AddNoiseProbability: 1
                       SNRRange: [5 10]
           TimeShiftProbability: 0
        MedianFilterProbability: 0.5000

Set the probability of applying median filtering to 80%.

aug.MedianFilterProbability = 0.8
aug = 
  audioDataAugmenter with properties:

               AugmentationMode: 'sequential'
    AugmentationParameterSource: 'random'
               NumAugmentations: 4
         TimeStretchProbability: 0
          PitchShiftProbability: 0
       VolumeControlProbability: 0
            AddNoiseProbability: 1
                       SNRRange: [5 10]
           TimeShiftProbability: 0
        MedianFilterProbability: 0.8000

Call augment on the audio to create 4 augmentations.

data = augment(aug,audioIn,fs);

You can check the parameter configuration of each augmentation using the AugmentationInfo table variable. If median filtering was applied for an augmentation, then AugmentationInfo lists the parameter as 'Applied'. If median filtering was not applied for an augmentation, then AugmentationInfo lists the parameter as 'Bypassed'.

augmentationToInspect = 2;
data.AugmentationInfo(augmentationToInspect)
ans = struct with fields:
             SNR: 9.5787
    MedianFilter: 'Applied'
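
To summarize all augmentations at once rather than inspecting them one by one, the 'Applied' and 'Bypassed' entries can be counted. The following is a minimal sketch that assumes AugmentationInfo behaves as a struct array with one element per augmentation, as the indexing above suggests. It uses a synthetic tone and its own variables (demoAug, tone, fsDemo) so it does not overwrite the variables in this example.

```matlab
% Sketch only: synthetic input and separate variables, not the example's data.
fsDemo = 16e3;
tone = 0.5*sin(2*pi*440*(0:fsDemo-1)'/fsDemo);

% Disable the built-in methods so only the custom method is in play.
demoAug = audioDataAugmenter('NumAugmentations',4, ...
    'AddNoiseProbability',0, ...
    'VolumeControlProbability',0, ...
    'TimeStretchProbability',0, ...
    'TimeShiftProbability',0, ...
    'PitchShiftProbability',0);
addAugmentationMethod(demoAug,'MedianFilter',@(x)movmedian(x,100))
demoAug.MedianFilterProbability = 0.8;

demoData = augment(demoAug,tone,fsDemo);

% True where the custom method was applied, false where it was bypassed.
applied = arrayfun(@(s)strcmp(s.MedianFilter,'Applied'),demoData.AugmentationInfo);
fprintf('%d of %d augmentations applied the median filter.\n', ...
    nnz(applied),numel(applied));
```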

Listen to the audio you are inspecting. Plot the time-domain representation of the original and augmented signals.

augmentation = data.Audio{augmentationToInspect};
sound(augmentation,fs)
t = (0:(numel(audioIn)-1))/fs;
taug = (0:(numel(augmentation)-1))/fs;
plot(t,audioIn,taug,augmentation)
legend("Original Audio","Augmented Audio")
ylabel("Amplitude")
xlabel("Time (s)")

[Figure: time-domain plot of Original Audio and Augmented Audio; x-axis Time (s), y-axis Amplitude]

You can specify additional parameters and corresponding parameter ranges (for use when AugmentationParameterSource is set to 'random') and parameter values (for use when AugmentationParameterSource is set to 'specify'). You must specify additional parameters, parameter ranges, and parameter values during your call to addAugmentationMethod.

Call removeAugmentationMethod to remove the MedianFilter augmentation method. Call addAugmentationMethod again, this time specifying an augmentation parameter, parameter range, and parameter value. The augmentation and parameter range are added to the properties of your audioDataAugmenter object.

removeAugmentationMethod(aug,'MedianFilter')

algorithmName = 'MedianFilter';
augmentationParameter = 'MedianFilterWindowLength';
parameterRange = [1,200];
parameterValue = 100;

algorithmHandle = @(x,k)(movmedian(x,k));
addAugmentationMethod(aug,algorithmName,algorithmHandle, ...
    'AugmentationParameter',augmentationParameter, ...
    'ParameterRange',parameterRange, ...
    'ParameterValue',parameterValue)

aug
aug = 
  audioDataAugmenter with properties:

                 AugmentationMode: 'sequential'
      AugmentationParameterSource: 'random'
                 NumAugmentations: 4
           TimeStretchProbability: 0
            PitchShiftProbability: 0
         VolumeControlProbability: 0
              AddNoiseProbability: 1
                         SNRRange: [5 10]
             TimeShiftProbability: 0
          MedianFilterProbability: 0.5000
    MedianFilterWindowLengthRange: [1 200]

In the current configuration, AugmentationParameterSource is 'random', so the parameter value is not used. ParameterValue applies only when AugmentationParameterSource is set to 'specify'. Set AugmentationParameterSource to 'specify' to enable the current parameter value.

aug.AugmentationParameterSource = 'specify'
aug = 
  audioDataAugmenter with properties:

               AugmentationMode: 'sequential'
    AugmentationParameterSource: 'specify'
               ApplyTimeStretch: 1
                  SpeedupFactor: 0.8000
                ApplyPitchShift: 1
                  SemitoneShift: -3
             ApplyVolumeControl: 1
                     VolumeGain: -3
                  ApplyAddNoise: 1
                            SNR: 5
                 ApplyTimeShift: 1
                      TimeShift: 0.0050
              ApplyMedianFilter: 1
       MedianFilterWindowLength: 100
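
In 'specify' mode, the custom method is controlled through the ApplyMedianFilter and MedianFilterWindowLength properties shown above. The following is a minimal sketch of applying only the median filter with a fixed window length. It assumes these properties can be set after the method is added, uses a synthetic tone, and keeps its own variables (demoAug, tone, fsDemo) separate from the example.

```matlab
% Sketch only: synthetic input and separate variables, not the example's data.
fsDemo = 16e3;
tone = 0.5*sin(2*pi*440*(0:fsDemo-1)'/fsDemo);

demoAug = audioDataAugmenter;
addAugmentationMethod(demoAug,'MedianFilter',@(x,k)movmedian(x,k), ...
    'AugmentationParameter','MedianFilterWindowLength', ...
    'ParameterRange',[1,200], ...
    'ParameterValue',100)

% Switch to fixed parameter values and turn off the built-in methods.
demoAug.AugmentationParameterSource = 'specify';
demoAug.ApplyTimeStretch = false;
demoAug.ApplyPitchShift = false;
demoAug.ApplyVolumeControl = false;
demoAug.ApplyAddNoise = false;
demoAug.ApplyTimeShift = false;

% Apply the custom method with a window length chosen for this sketch.
demoAug.ApplyMedianFilter = true;
demoAug.MedianFilterWindowLength = 50;

demoData = augment(demoAug,tone,fsDemo);
disp(demoData.AugmentationInfo(1))
```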

Set AugmentationParameterSource to 'random' and then call augment.

aug.AugmentationParameterSource = "random";
data = augment(aug,audioIn,fs);

If median filtering was applied for an augmentation, then AugmentationInfo lists the value applied.

augmentationToInspect = 3;
data.AugmentationInfo(augmentationToInspect)
ans = struct with fields:
             SNR: 8.7701
    MedianFilter: 117.9847
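
The value drawn from the parameter range, 117.9847 here, is not an integer. If a whole-number window length is preferred, one option is to round inside the function handle. The following sketch shows that variation with a synthetic tone and its own variables (demoAug, tone, fsDemo). The rounding is a suggestion for illustration, not part of the example above, and AugmentationInfo can be expected to report the drawn value rather than the rounded one.

```matlab
% Sketch only: rounding inside the handle is an illustrative choice.
fsDemo = 16e3;
tone = 0.5*sin(2*pi*440*(0:fsDemo-1)'/fsDemo);

demoAug = audioDataAugmenter('NumAugmentations',3, ...
    'AddNoiseProbability',0, ...
    'VolumeControlProbability',0, ...
    'TimeStretchProbability',0, ...
    'TimeShiftProbability',0, ...
    'PitchShiftProbability',0);

% Round the drawn window length and keep it at 1 or more.
roundedMedian = @(x,k)movmedian(x,max(1,round(k)));
addAugmentationMethod(demoAug,'MedianFilter',roundedMedian, ...
    'AugmentationParameter','MedianFilterWindowLength', ...
    'ParameterRange',[1,200], ...
    'ParameterValue',100)
demoAug.MedianFilterProbability = 1;

demoData = augment(demoAug,tone,fsDemo);
disp(demoData.AugmentationInfo(1))
```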

Listen to the audio you are inspecting. Plot the time-domain representation of the original and augmented signals.

augmentation = data.Audio{augmentationToInspect};
sound(augmentation,fs)
t = (0:(numel(audioIn)-1))/fs;
taug = (0:(numel(augmentation)-1))/fs;
plot(t,audioIn,taug,augmentation)
legend("Original Audio","Augmented Audio")
ylabel("Amplitude")
xlabel("Time (s)")

[Figure: time-domain plot of Original Audio and Augmented Audio; x-axis Time (s), y-axis Amplitude]

Read in an audio signal and listen to it.

[audioIn,fs] = audioread('RockDrums-44p1-stereo-11secs.mp3');
sound(audioIn,fs)

Create an audioDataAugmenter object that outputs 5 augmentations. Set AddNoiseProbability to 0.

aug = audioDataAugmenter('NumAugmentations',5,'AddNoiseProbability',0);

Add reverberation as a custom augmentation algorithm. The applyReverb function creates a reverberator object, updates the sample rate, pre-delay, and wet/dry mix as indicated, and then applies reverberation. To minimize computational overhead, the reverberator object is persistent. The object is reset on every call to avoid mixing the reverberation tail between audio files.

type applyReverb.m
function audioOut = applyReverb(audio,preDelay,wetDryMix,sampleRate)
    persistent reverbObject
    if isempty(reverbObject)
        reverbObject = reverberator;
    end
    reverbObject.SampleRate = sampleRate;
    reverbObject.PreDelay = preDelay;
    reverbObject.WetDryMix = wetDryMix;
    
    audioOut = reverbObject(audio);
    reset(reverbObject)
end

Add applyReverb as a custom augmentation method. To specify multiple parameters for a custom method, specify the parameters, parameter ranges, and parameter values as cell arrays with the same number of cells. Set the probability of applying reverberation to 1.

algorithmName = 'Reverb';
algorithmHandle = @(x,preDelay,wetDryMix)applyReverb(x,preDelay,wetDryMix,fs);
parameters = {'PreDelay','WetDryMix'};
parameterRanges = {[0,1],[0,1]};
parameterValues = {0,0.3};

addAugmentationMethod(aug,algorithmName,algorithmHandle, ...
    'AugmentationParameter',parameters, ...
    'ParameterRange',parameterRanges, ...
    'ParameterValue',parameterValues)

aug.ReverbProbability = 1
aug = 
  audioDataAugmenter with properties:

               AugmentationMode: 'sequential'
    AugmentationParameterSource: 'random'
               NumAugmentations: 5
         TimeStretchProbability: 0.5000
             SpeedupFactorRange: [0.8000 1.2000]
          PitchShiftProbability: 0.5000
             SemitoneShiftRange: [-2 2]
       VolumeControlProbability: 0.5000
                VolumeGainRange: [-3 3]
            AddNoiseProbability: 0
           TimeShiftProbability: 0.5000
                 TimeShiftRange: [-0.0050 0.0050]
              ReverbProbability: 1
                  PreDelayRange: [0 1]
                 WetDryMixRange: [0 1]

Call augment to create 5 augmentations.

data = augment(aug,audioIn,fs);

Check the configuration of each augmentation using AugmentationInfo.

augmentationToInspect = 1;
data.AugmentationInfo(augmentationToInspect)
ans = struct with fields:
    SpeedupFactor: 1
    SemitoneShift: -1.4325
       VolumeGain: 0
        TimeShift: 0
           Reverb: [0.2760 0.4984]

Listen to the audio you are inspecting. Plot the time-domain representation of the original and augmented signals.

augmentation = data.Audio{augmentationToInspect};
sound(augmentation,fs)
t = (0:(size(audioIn,1)-1))/fs;
taug = (0:(size(augmentation,1)-1))/fs;
plot(t,audioIn(:,1),taug,augmentation(:,1))
legend("Original Audio","Augmented Audio")
ylabel("Amplitude")
xlabel("Time (s)")

[Figure: time-domain plot of Original Audio and Augmented Audio (first channel); x-axis Time (s), y-axis Amplitude]

Input Arguments

algorithmName: Algorithm name, specified as a character vector or string. algorithmName must be unique among the property names of the audioDataAugmenter object, aug.

Data Types: char | string

algorithmHandle: Handle to the function that implements the custom augmentation algorithm, specified as a function_handle.

Data Types: function_handle

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: 'AugmentationParameter','PreDelay'

AugmentationParameter: Augmentation parameter, specified as a character vector, string, cell array of character vectors, or cell array of strings.

Use cell arrays to create multiple augmentation parameters. If you create multiple augmentation parameters, you must also specify ParameterRange and ParameterValue as cell arrays containing information for each augmentation parameter.

Example: 'AugmentationParameter','PreDelay'

Example: 'AugmentationParameter',{'PreDelay','HighCutFrequency'}

Data Types: char | string
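
As a minimal sketch of the cell-array form, the following adds a two-parameter method. The hard-clipping handle, the method name 'Clip', and the parameter names 'ClipGain' and 'ClipLevel' are all hypothetical. The three cell arrays have the same number of cells, and the handle takes the parameters in the same order as they are listed.

```matlab
% Sketch only: 'Clip', 'ClipGain', and 'ClipLevel' are hypothetical names.
fsDemo = 16e3;
tone = 0.5*sin(2*pi*440*(0:fsDemo-1)'/fsDemo);

% Amplify by gain g, then hard-clip to the interval [-c, c].
clipFcn = @(x,g,c)max(min(g*x,c),-c);

demoAug = audioDataAugmenter('NumAugmentations',3, ...
    'AddNoiseProbability',0, ...
    'VolumeControlProbability',0, ...
    'TimeStretchProbability',0, ...
    'TimeShiftProbability',0, ...
    'PitchShiftProbability',0);

% One cell per parameter in each of the three name-value arguments.
addAugmentationMethod(demoAug,'Clip',clipFcn, ...
    'AugmentationParameter',{'ClipGain','ClipLevel'}, ...
    'ParameterRange',{[1,4],[0.2,0.8]}, ...
    'ParameterValue',{2,0.5})
demoAug.ClipProbability = 1;

demoData = augment(demoAug,tone,fsDemo);

% With multiple parameters, the info entry holds one value per parameter.
disp(demoData.AugmentationInfo(1).Clip)
```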

ParameterRange: Parameter range, specified as a two-element vector of nondecreasing values (for a single parameter) or a cell array of two-element vectors of nondecreasing values (for multiple parameters).

Example: 'ParameterRange',[0,1]

Example: 'ParameterRange',{[0,1],[20,20000]}

Dependencies

To enable this property, set the AugmentationParameterSource property of your audioDataAugmenter object to 'random'.

Data Types: single | double | cell

ParameterValue: Parameter value, specified as a scalar, vector, or cell array of scalars or vectors.

Example: 'ParameterValue',0

Example: 'ParameterValue',[0,0.5,1]

Example: 'ParameterValue',{0,20000}

Example: 'ParameterValue',{[0,0.5,1],20000}

Dependencies

To enable this property, set the AugmentationParameterSource property of your audioDataAugmenter to 'specify'.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
Complex Number Support: Yes

Version History

Introduced in R2019b