Saving images quickly for huge datasets

Joenam Coutinho on 14 Apr 2022
Commented: Joenam Coutinho on 15 Apr 2022
ads = audioDatastore(fulfolder, ...
    'IncludeSubfolders',true, ...
    'FileExtensions','.wav','LabelSource','foldernames');
ads.Files = natsortfiles(ads.Files);
fs = 44100; % sampling rate (Hz) for melSpectrogram
for i = 1:numel(ads.Files)
    [~,filename] = fileparts(ads.Files{i});
    readingdata = read(ads);
    % Pre-process audio data: mix multichannel audio down to mono
    if width(readingdata)>1
        readingdata = mean(readingdata,2);
    end
    % Double clips that are shorter than one second (fewer than fs samples)
    if length(readingdata)<fs
        readingdata = [readingdata;readingdata];
    end
    path = fullfile(image_save,filename);
    % Save spectrogram as image
    spectro(readingdata,fs,path);
end

function spectro(audiodata, fs, path)
    % Called without an output argument, melSpectrogram draws a plot
    melSpectrogram(audiodata,fs);
    colorbar('off');
    axis off;
    f = gcf;
    saveas(f,path,'jpg');
    % Crop spectrogram data only
    file = [path,'.jpg'];
    img = imread(file);
    crop_im = imcrop(img,[115 50 675 535]);
    imwrite(crop_im,file,"jpg");
end
I have written this code, which saves the mel spectrogram image of each audio sample into a specified folder and then crops it.
My problem is that I have 5136 audio samples, and saving each image takes a very long time.
I would like to know if there is a quicker way to get these images saved to my folder. I have kept my machine running for almost two days and it is still only at the 1100th image.
Just as I moved my training process onto the GPU, is there a way I can offload this work to my GPU?

Answers (2)

Joss Knight on 14 Apr 2022
It's hard to say what will speed things up, since we don't know which part of the process is slow. Is saving slow? Is computing the spectrogram slow? Try running the MATLAB profiler on a subset of the data to see where the bottlenecks are.
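As a rough sketch of that profiling step, something like the following could be run on a small subset. The function processOneFile is a hypothetical stand-in for the body of the loop in the question (read, pre-process, plot, save); it is not part of the original code.

```matlab
% Profile a handful of iterations to see where the time goes.
nSubset = 20;                 % assumed subset size, adjust as needed

profile on
for k = 1:nSubset
    processOneFile(k);        % hypothetical: one iteration of the real loop
end
profile off

stats = profile('info');      % results as a struct (see stats.FunctionTable)
if usejava('desktop')
    profile viewer            % interactive report, sorted by time spent
end

function processOneFile(k)
    % Placeholder work so the sketch runs on its own; replace this with
    % the real read / melSpectrogram / save steps.
    x = randn(44100, 1);
    y = abs(fft(x));          %#ok<NASGU>
    pause(0.01);
end
```

In the report, the lines with the largest self time (for example saveas, imread or melSpectrogram) are the ones worth optimizing first.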
If it's file I/O that's slow, you can try parallelizing with a parallel construct such as parfor. You might also try the datastore writeall function, for which you can define a WriteFcn that would essentially contain the code of your spectro function. writeall lets you set the UseParallel option to true.
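A minimal sketch of the parfor route (not writeall) is below. It is only an illustration under assumptions: the folder names are made up, the WAV files are synthetic so the snippet runs on its own, and it writes the spectrogram matrix directly as a grayscale image instead of saving a figure. Without Parallel Computing Toolbox, parfor simply runs as an ordinary loop.

```matlab
% Create a few synthetic WAV files so the sketch is self-contained.
fs     = 44100;
inDir  = fullfile(tempdir, 'mel_par_in');    % hypothetical input folder
outDir = fullfile(tempdir, 'mel_par_out');   % hypothetical output folder
if ~isfolder(inDir),  mkdir(inDir);  end
if ~isfolder(outDir), mkdir(outDir); end
for k = 1:4
    audiowrite(fullfile(inDir, sprintf('clip%d.wav', k)), 0.1*randn(fs, 1), fs);
end

listing = dir(fullfile(inDir, '*.wav'));
files   = fullfile({listing.folder}, {listing.name});

% Each iteration is independent, so the files can be processed in parallel.
parfor i = 1:numel(files)
    [x, fsFile] = audioread(files{i});
    x = mean(x, 2);                               % mix down to mono
    S = melSpectrogram(x, fsFile);                % output requested: no plot
    G = rescale(flipud(10*log10(S + eps)));       % dB, low bands at bottom, [0,1]
    [~, name] = fileparts(files{i});
    imwrite(G, fullfile(outDir, [name '.jpg']));  % one write per file
end
```

With the datastore from the question, files could be taken from ads.Files and outDir from image_save.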
If it's the spectrogram computation that's slow, and you have a GPU, maybe running on the GPU will help. Just move your data to the GPU, for instance, melSpectrogram(gpuArray(audiodata),fs).
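A small sketch of the GPU variant, assuming Audio Toolbox and Parallel Computing Toolbox are installed; the signal here is synthetic noise standing in for one audio clip, and the code falls back to the CPU when no supported GPU is found.

```matlab
fs = 44100;
x  = randn(fs, 1);            % synthetic stand-in for one mono clip

if canUseGPU()                % true only if a supported GPU is available
    x = gpuArray(x);          % move the samples to GPU memory
end

S = melSpectrogram(x, fs);    % runs on the GPU when x is a gpuArray

if isa(S, 'gpuArray')
    S = gather(S);            % bring the result back to host memory
end
```

For short clips the transfer to and from the GPU may cost more than the computation saves, so it is worth timing both variants on real data.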
  1 Comment
Joss Knight on 14 Apr 2022
Oh, I've noticed that you're saving a figure to disk, then loading it again in order to crop it using imcrop. This is highly inefficient. Do not use saveas, use print, and work with the options to axis, axes and print to get the output you're after.
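One way this could look, as a sketch rather than a drop-in replacement: let the axes fill the whole figure so nothing needs cropping, then write the file once with print. The output file name and the synthetic signal are assumptions for illustration, and the resulting image size depends on the figure size, so it will not match the crop rectangle used in the question exactly.

```matlab
fs = 44100;
x  = randn(fs, 1);                          % synthetic stand-in for one clip
outFile = fullfile(tempdir, 'mel_print_demo.jpg');   % hypothetical output path

fig = figure('Visible', 'off');             % off-screen figure, no redraw cost
melSpectrogram(x, fs);                      % no output argument: draws the plot
colorbar('off');
ax = gca;
axis(ax, 'off');                            % hide ticks and labels
ax.Position = [0 0 1 1];                    % axes fill the figure: no margins
print(fig, outFile, '-djpeg', '-r0');       % single write at screen resolution
close(fig);                                 % avoid accumulating figures
```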



jibrahim on 14 Apr 2022
Hi Joenam,
A couple of things I noticed in your code:
1) You rely on melSpectrogram to generate a plot for you, which is fine, but that will be a bottleneck, as you generate a plot for every file. Returning the spectrogram instead (S = melSpectrogram(...) does not generate a plot) and saving S to a file is likely to be faster.
2) For each audio file, you write an image file, but then you read it, and then write it again. You would save time by pre-processing S, and writing the image file once, with no need to read it again.
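The two points above might be combined roughly as follows. This is a sketch under assumptions: the signal is synthetic, the output path is made up, and the dB conversion, flip and parula colormap are choices made here to resemble the default plot, not something the original code specifies.

```matlab
fs = 44100;
x  = randn(fs, 1);                       % synthetic stand-in for one mono clip
outFile = fullfile(tempdir, 'mel_demo.jpg');   % hypothetical output path

S   = melSpectrogram(x, fs);             % bands-by-frames matrix, no plot
SdB = 10*log10(S + eps);                 % power to dB; eps avoids log of zero
G   = rescale(flipud(SdB));              % low bands at the bottom, range [0,1]

idx = uint8(255 * G);                    % 0..255 indices into the colormap
rgb = ind2rgb(idx, parula(256));         % color image, values in [0,1]

imwrite(rgb, outFile);                   % written once, never read back
```

Because S is an ordinary matrix, a sub-region can be selected by plain indexing before writing, for example SdB(:, 1:50) for the first 50 frames, so imcrop is not needed. The image has one pixel per band and frame; imresize could enlarge it if a fixed size is required.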
  1 Comment
Joenam Coutinho on 15 Apr 2022
I am not quite clear on point 2.
I tried cropping the mel spectrogram before saving it, in order to save the time spent reading and writing, but I am unable to feed S into imcrop. It gives me the error 'Expected DATA to be nonempty.'
I feel I am doing something wrong, but I do not know where.
