Audio to Mel Spectrogram

Mudasser Ahmad on 22 Sep 2023
Commented: Mudasser Ahmad on 22 Sep 2023
Hello, I am working on a sound classification problem. My task is to create mel spectrograms with three different window lengths (93 ms, 46 ms, and 23 ms), which is achieved by setting n_fft to 2048, 1024, and 512, respectively. I am getting an output of shape (128, 216), but I don't understand the 3 in (128, 216, 3). Here 128 is the number of mel frequency bins and 216 is the number of frames. Can someone help me understand the right side of the attached image, the DL part?
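The window lengths and the frame count quoted above can be checked with a little arithmetic. This is only a sketch under assumptions the post does not state: a sample rate of 22050 Hz (the default that librosa.load resamples to), a hop length of 512, librosa's default centered framing, and a hypothetical 5-second clip. The helper names window_ms and n_frames are made up for illustration.

```python
SR = 22050        # assumed sample rate: librosa.load's default
HOP_LENGTH = 512  # assumed hop length between successive frames


def window_ms(n_fft, sr=SR):
    """Duration of one analysis window of n_fft samples, in milliseconds."""
    return 1000.0 * n_fft / sr


def n_frames(n_samples, hop_length=HOP_LENGTH):
    """Frame count for centered framing (librosa's default center=True)."""
    return 1 + n_samples // hop_length


# Window duration for each FFT size, rounded to whole milliseconds
window_lengths = {n_fft: round(window_ms(n_fft)) for n_fft in (2048, 1024, 512)}
print(window_lengths)  # {2048: 93, 1024: 46, 512: 23}

# Hypothetical 5-second clip at the assumed sample rate
frames_5s = n_frames(5 * SR)
print(frames_5s)  # 216
```

Under these assumptions the frame count depends only on the signal length and the hop length, not on n_fft, which is why all three spectrograms can end up with the same (128, 216) shape.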
2 Comments
Mathieu NOE on 22 Sep 2023
You have 3 time windows, so you are computing 3 spectrograms, each of which is an array of size 128 x 216.
At the end, your 3 spectrograms are stored in a single 3D array of size 128 x 216 x 3.
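The stacking described in this comment can be illustrated with placeholder arrays. This is a minimal sketch, assuming NumPy and using constant-filled arrays in place of real spectrograms; the variable names are hypothetical and the values carry no meaning beyond marking which array is which.

```python
import numpy as np

N_MELS, N_FRAMES = 128, 216  # shape reported in the question

# Placeholder "spectrograms", one per window length, filled with 0, 1, 2
# so that each one can be recognized after stacking
placeholders = [np.full((N_MELS, N_FRAMES), float(k)) for k in range(3)]

# Stack along a new last axis: three 2D arrays become one 3D array
stacked = np.stack(placeholders, axis=-1)
print(stacked.shape)  # (128, 216, 3)

# Slicing the last axis recovers each original 2D array
first_channel = stacked[:, :, 0]
last_channel = stacked[:, :, 2]
print(first_channel.shape)  # (128, 216)
```

The last axis plays the same role as the color channels of an RGB image: index 0, 1, or 2 selects the spectrogram computed with the first, second, or third window length.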
Mudasser Ahmad on 22 Sep 2023
Thanks for your feedback.
Is my code doing this correctly? Is this what the image describes?
import librosa
import numpy as np

# Load the audio file (librosa.load resamples to sr=22050 by default)
y, sr = librosa.load(r'G:\A NEW RESEARCH DATASET\1Fire\2_Fire.wav')  # Replace with the path to your audio file

# FFT sizes; at sr=22050 these correspond to windows of about 93 ms, 46 ms, and 23 ms
n_ffts = [2048, 1024, 512]

# List to hold the spectrograms
spectrograms = []

# Generate one mel spectrogram per n_fft value.
# hop_length is the same for all three, so they all have the same number of frames.
for n_fft in n_ffts:
    mel_spec = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=n_fft, hop_length=512, n_mels=128)
    spectrograms.append(mel_spec)

# Stack the spectrograms along a new third dimension
tensor = np.stack(spectrograms, axis=-1)
print(tensor.shape)  # Prints (128, time_steps, 3), where time_steps depends on the length of the audio file


Answers (0)
