Hello Yong,
To support parallel training with your custom datastore, you need to choose one of the following options:
- Implement MiniBatchable + PartitionableByIndex (see here)
- Implement Partitionable. This is what is documented here.
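In classdef terms, those two combinations look roughly like this (class names here are illustrative, and each classdef must live in its own .m file):

% Option 1: read returns a table containing a full mini-batch
classdef MiniBatchDs < matlab.io.Datastore & ...
        matlab.io.datastore.MiniBatchable & ...
        matlab.io.datastore.PartitionableByIndex
end

% Option 2: read returns a single observation
classdef ObservationDs < matlab.io.Datastore & ...
        matlab.io.datastore.Partitionable
end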
Unfortunately, you implemented MiniBatchable + Partitionable, which is not a supported combination.
Usually, the recommendation is to stick to the datastores that we ship (e.g. fileDatastore) and to use the transform function to adapt them as needed.
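For example, something along these lines (the file pattern, read function, and preprocessing step are placeholders, not a reference to your setup):

fds = fileDatastore('myData/*.mat','ReadFcn',@load);   % placeholder location and ReadFcn
tds = transform(fds,@(s) myPreprocess(s));             % myPreprocess is hypothetical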
In your case, though, a custom datastore seems justified: your data has a particular structure, and shuffle and partition need to behave in a specific way.
To support parallel training, you could:
- Implement PartitionableByIndex if it is not too much effort. However, given the structure of your data, it may not be possible to index into it directly.
- Otherwise, remove the MiniBatchable interface from your datastore and modify read so that it returns a single observation instead of a table, like so (a fuller sketch of the whole class follows at the end of this message):
function [data,info] = read(ds)
    % Return one observation as a {predictor, response} cell array
    data = {read(ds.Datastore), ds.Labels(ds.CurrentFileIndex)};
    ds.CurrentFileIndex = ds.CurrentFileIndex + 1;
    info = struct();   % read must also assign its info output
end
This will put your datastore in the Partitionable-only case, which should support the 'multi-gpu' execution environment.
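For completeness, here is a minimal sketch of what the Partitionable-only layout could look like. It assumes the Datastore, Labels, and CurrentFileIndex properties from your snippet, and it leaves the label partitioning as a stub because that depends on how your files map to labels:

classdef MyCustomDatastore < matlab.io.Datastore & ...
        matlab.io.datastore.Partitionable
    properties
        Datastore        % underlying shipped datastore holding the files
        Labels           % one response per observation
        CurrentFileIndex % index of the next observation to read
    end
    methods
        function tf = hasdata(ds)
            % True while unread observations remain
            tf = ds.CurrentFileIndex <= numel(ds.Labels);
        end
        function [data,info] = read(ds)
            % Return one observation: {predictor, response}
            data = {read(ds.Datastore), ds.Labels(ds.CurrentFileIndex)};
            ds.CurrentFileIndex = ds.CurrentFileIndex + 1;
            info = struct();
        end
        function reset(ds)
            % Rewind to the first observation
            reset(ds.Datastore);
            ds.CurrentFileIndex = 1;
        end
        function subds = partition(ds,n,index)
            % Give partition 'index' of 'n' its share of the files
            subds = copy(ds);
            subds.Datastore = partition(ds.Datastore,n,index);
            % TODO: also partition subds.Labels to match the file split;
            % how to do that depends on your file-to-label mapping
            reset(subds);
        end
    end
    methods (Access = protected)
        function n = maxpartitions(ds)
            % Delegate the partition count to the underlying datastore
            n = maxpartitions(ds.Datastore);
        end
    end
end

Once that is in place, multi-GPU training is just a matter of the execution environment (the solver and layers below are placeholders):

opts = trainingOptions('sgdm','ExecutionEnvironment','multi-gpu');
net = trainNetwork(ds,layers,opts);   % ds is an instance of the class above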