Imbalanced Audio Dataset for Deep Learning Classification

Nicholas Ang

12 Jul. 2021

1 answer

Updated 30 Jul. 2021

Hi, I am trying to do binary classification on audio data from interviews by converting the recordings into spectrograms and feeding them to a CNN. The recordings vary in duration (from 7 min to 30 min), and the dataset is imbalanced. I am aware of techniques such as SMOTE and oversampling of the minority class, but I am not sure how to oversample my minority class in this setting. Should I convert the audio into spectrograms before oversampling, and is there a recommended way to do it? Thanks!

Answers (1)

Vineet Joshi on 30 Jul. 2021

The following documentation covers data augmentation for audio data. It includes examples of how to build custom augmentation pipelines and how to use functions such as pitch shifting, time shifting, and time stretching.

Data Augmentation
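
To make the idea concrete, here is a minimal sketch of one way augmentation can be used to oversample the minority class. It is written in plain Python with NumPy and is not the API from the documentation above. All function names and parameter values (`split_into_segments`, `augment`, `oversample_minority`, `log_spectrogram`, 1 s segments, a gain range of -6 to 6 dB) are assumptions chosen for illustration. The sketch cuts each recording into fixed-length segments, adds randomly time-shifted and gain-scaled copies of minority-class segments until the classes are balanced, and only then computes spectrograms.

```python
import numpy as np

def split_into_segments(signal, segment_len):
    """Cut a 1-D recording into non-overlapping fixed-length segments.
    Samples that do not fill a whole segment at the end are dropped."""
    n = len(signal) // segment_len
    return signal[: n * segment_len].reshape(n, segment_len)

def augment(segment, rng, max_shift, gain_db_range=(-6.0, 6.0)):
    """Apply a random circular time shift and a random gain (assumed ranges)."""
    shift = int(rng.integers(-max_shift, max_shift + 1))
    gain_db = rng.uniform(*gain_db_range)
    return np.roll(segment, shift) * 10.0 ** (gain_db / 20.0)

def oversample_minority(segments, labels, rng, max_shift):
    """Append augmented copies of the smaller class until both classes match."""
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    deficit = int(counts.max() - counts.min())
    if deficit == 0:
        return segments, labels
    minority = classes[np.argmin(counts)]
    pool = segments[labels == minority]
    picks = rng.integers(0, len(pool), size=deficit)
    extra = np.stack([augment(pool[i], rng, max_shift) for i in picks])
    extra_labels = np.full(deficit, minority, dtype=labels.dtype)
    return np.concatenate([segments, extra]), np.concatenate([labels, extra_labels])

def log_spectrogram(segment, n_fft=512, hop=256):
    """Log-power spectrogram of shape (n_fft // 2 + 1, n_frames), Hann window."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(segment) - n_fft) // hop
    frames = np.stack(
        [segment[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return 10.0 * np.log10(power + 1e-10).T

# Synthetic stand-in for real recordings: noise at an assumed 16 kHz sample rate.
rng = np.random.default_rng(0)
fs = 16000
majority = split_into_segments(rng.standard_normal(fs * 6), fs)  # 6 segments
minority = split_into_segments(rng.standard_normal(fs * 2), fs)  # 2 segments
segments = np.concatenate([majority, minority])
labels = np.array([0] * len(majority) + [1] * len(minority))

balanced, balanced_labels = oversample_minority(segments, labels, rng, max_shift=fs // 10)
spectrograms = np.stack([log_spectrogram(s) for s in balanced])
print(np.bincount(balanced_labels), spectrograms.shape)
```

In this sketch the augmentation is applied to the waveform before the spectrogram is computed, because shifts and gain changes are simple to express in the time domain. If you hold out data for validation or testing, it is safer to split by recording first and oversample only the training portion, so that augmented copies of a segment do not end up on both sides of the split.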

Hope this helps you.

Thanks