
Data Preprocessing

Manage and preprocess sequence and tabular data for deep learning

Preprocessing is a common first step in the deep learning workflow: it converts raw data into a format that the network can accept. You can also preprocess data to enhance desired features or to reduce artifacts that can bias the network, for example by normalizing the input data or removing noise from it.

You can preprocess sequence input with operations such as normalization by using datastores and functions available in MATLAB® and Deep Learning Toolbox™. Other MATLAB toolboxes offer functions, datastores, and apps for labeling, processing, and augmenting deep learning data. Use specialized tools from other MATLAB toolboxes to process data for domains such as audio, text, and signal processing.
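As an illustration of the normalization mentioned above, the following is a minimal sketch of per-channel z-score normalization for a single sequence. The data and the variable names (X, Xn, zscoreRows) are hypothetical, and the sketch assumes the sequence is stored as a channels-by-time-steps matrix; it uses only base MATLAB functions (mean, std) rather than any toolbox-specific preprocessing routine.

```matlab
% Hypothetical sequence: 3 channels (rows) by 10 time steps (columns).
rng(0);                          % fix the seed so the sketch is repeatable
X = 5 + 2*rand(3, 10);

% Z-score each channel across time: subtract the channel mean and
% divide by the channel standard deviation (implicit expansion, R2016b+).
% Note: a constant channel has zero standard deviation and would
% produce NaN or Inf values here.
zscoreRows = @(x) (x - mean(x, 2)) ./ std(x, 0, 2);

Xn = zscoreRows(X);

% After normalization, each channel has mean ~0 and standard deviation ~1.
channelMeans = mean(Xn, 2);
channelStds  = std(Xn, 0, 2);
```

A function handle like zscoreRows could in principle also be applied through transform so that normalization happens as data is read from a datastore; the exact wrapper depends on the format that the underlying datastore returns from read.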


Video Labeler: Label video for computer vision applications
Ground Truth Labeler: Label ground truth data for automated driving applications
Signal Labeler: Label signal attributes, regions, and points of interest, and extract features (Since R2019a)


transform: Transform datastore (Since R2019a)
combine: Combine data from multiple datastores (Since R2019a)
TransformedDatastore: Datastore to transform underlying datastore (Since R2019a)
CombinedDatastore: Datastore to combine data read from multiple underlying datastores (Since R2019a)
padsequences: Pad or truncate sequence data to same length (Since R2021a)
minibatchqueue: Create mini-batches for deep learning (Since R2020b)