Got it! It's actually easier than expected: I can pass the 4-D array directly into trainNetwork, without even using arrayDatastore. I typed 'edit trainNetwork' and read through the comments, and this one explained most of the steps:
% trainedNet = trainNetwork(X, Y, layers, options) trains and returns a
% network, trainedNet. The format of X depends on the input layer.
% - For an image input layer, X is a numeric array of images arranged
% so that the first three dimensions are the width, height and
% channels, and the last dimension indexes the individual images.
% - For a 3-D image input layer, X is a numeric array of 3-D images
% with the dimensions width, height, depth, channels, and the last
% dimension indexes the individual observations.
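
For anyone who wants to try this, here is a minimal sketch of the image-input-layer case. The data is made up (100 random 28x28 grayscale images in 10 classes), and the layer stack and training options are placeholders I chose for illustration, not a recommended architecture. It assumes the Deep Learning Toolbox is installed.

```matlab
% Hypothetical data: 100 grayscale 28x28 images, 10 classes.
numObs = 100;
X = rand(28, 28, 1, numObs);               % last dimension indexes the images
Y = categorical(repmat((1:10)', 10, 1));   % 100x1 labels, every class present

% Placeholder network; the input layer size must match the first
% three dimensions of X.
layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(3, 8, 'Padding', 'same')
    reluLayer
    fullyConnectedLayer(10)
    softmaxLayer
    classificationLayer];

% Short run just to confirm the call works; tune for real data.
options = trainingOptions('sgdm', 'MaxEpochs', 2, 'Verbose', false);

% The 4-D numeric array goes in directly, no datastore needed.
trainedNet = trainNetwork(X, Y, layers, options);
```

The images here are square, so the order of the first two dimensions does not matter. For non-square images, note that imageInputLayer takes its size as [height width channels], so the first dimension of X should be the image height.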