Create Modular Neural Networks

Many neural networks for image processing follow a modular pattern: an encoder module that downsamples the input, followed by a decoder module that upsamples the data. Bridge layers optionally connect the encoder and decoder modules. Convolutional neural networks (CNNs) such as U-Net follow this pattern, as do generative adversarial network (GAN) generator and discriminator networks such as CycleGAN and PatchGAN.

Create Encoder and Decoder Modules

To create encoder and decoder modules, you can:

Create an encoder network from a pretrained network, such as SqueezeNet, using the `pretrainedEncoderNetwork` function. The function prunes the pretrained network so that the encoder includes the number of downsampling operations that you specify.
Create encoder and decoder modules from building blocks of layers that follow a repeating pattern. To create a module, define a function that specifies the pattern, then assemble blocks into a module using the `blockedNetwork` function.
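The two approaches above can be sketched as follows. This is a minimal illustration, not a recommended architecture: the depth of 3, the filter counts, and the variable names (`encoder`, `customEncoder`, `encoderBlock`) are assumptions chosen for the example, and loading a pretrained network may require the corresponding support package to be installed.

```matlab
% Number of downsampling operations to keep (assumed value for illustration).
depth = 3;

% Approach 1: prune a pretrained network to form an encoder.
% outputNames lists encoder layers that can serve as skip connection sources.
[encoder,outputNames] = pretrainedEncoderNetwork("squeezenet",depth);

% Approach 2: define a repeating block pattern as a function of the
% block index, then stack `depth` blocks into one module.
encoderBlock = @(blockIdx) [
    convolution2dLayer(3,2^(5+blockIdx),"Padding","same")  % 64, 128, 256 filters
    reluLayer
    maxPooling2dLayer(2,"Stride",2)];                      % downsample by 2

customEncoder = blockedNetwork(encoderBlock,depth,"NamePrefix","encoder_");
```

The function handle receives the block index, so properties such as the number of filters can vary from block to block while the layer pattern stays the same.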

An encoder module consists of an initial block of layers, downsampling blocks, and residual blocks. A decoder module consists of upsampling blocks and a final block that provides the network output. The table describes the blocks of layers that commonly comprise encoder and decoder modules.

Type of Block	Description
Initial block	An `imageInputLayer` (Deep Learning Toolbox); a `convolution2dLayer` (Deep Learning Toolbox) with a stride of [1 1]; an optional normalization layer; an activation layer
Downsampling block	A downsampling layer, such as a pooling layer or a `convolution2dLayer` (Deep Learning Toolbox) with a stride greater than 1; an optional normalization layer; an activation layer
Residual block	A `convolution2dLayer` (Deep Learning Toolbox); an optional normalization layer; an activation layer; an optional dropout layer; a second `convolution2dLayer` (Deep Learning Toolbox); an optional second normalization layer; an `additionLayer` (Deep Learning Toolbox) that provides a skip connection between every block
Upsampling block	Layers that perform upsampling, such as a transposed convolution layer, or a convolution layer followed by a resizing or depth-to-space layer; an optional normalization layer; an activation layer
Final block	A `convolution2dLayer` (Deep Learning Toolbox); an optional output layer
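As a rough sketch, several of the blocks in the table can be written as plain layer arrays. The input size, filter sizes, filter counts, and the choice of batch normalization and ReLU are assumptions made for illustration; the table leaves these choices open. The residual block is omitted here because its `additionLayer` requires explicit layer connections rather than a simple array.

```matlab
numFilters = 64;  % assumed base number of filters

% Initial block: image input, stride-1 convolution, normalization, activation.
initialBlock = [
    imageInputLayer([256 256 3],"Normalization","none")
    convolution2dLayer(7,numFilters,"Stride",[1 1],"Padding","same")
    batchNormalizationLayer
    reluLayer];

% Downsampling block: a convolution with a stride greater than 1.
downsamplingBlock = [
    convolution2dLayer(3,2*numFilters,"Stride",2,"Padding","same")
    batchNormalizationLayer
    reluLayer];

% Upsampling block: a transposed convolution that doubles the spatial size.
upsamplingBlock = [
    transposedConv2dLayer(3,numFilters,"Stride",2,"Cropping","same")
    batchNormalizationLayer
    reluLayer];

% Final block: a convolution that maps to the desired number of output channels.
finalBlock = convolution2dLayer(7,3,"Padding","same");
```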

Create Networks from Encoder and Decoder Modules

After you have an encoder and a decoder module, you can combine the modules to form a CNN, GAN generator, or GAN discriminator network using the `encoderDecoderNetwork` function. You can optionally include a bridge connection, skip connections, or additional layers at the end of the network.
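The following sketch assembles a U-Net-style network from an encoder, a decoder, and a bridge. The input size, depth, filter counts, and bridge layers are assumptions picked for illustration; the decoder mirrors the encoder so that the concatenating skip connections join feature maps of matching spatial size.

```matlab
inputSize = [224 224 3];  % assumed input image size
depth = 4;                % assumed number of down/upsampling stages

% Encoder: two convolutions followed by 2x max pooling in each block.
encoderBlock = @(blockIdx) [
    convolution2dLayer(3,2^(5+blockIdx),"Padding","same")
    reluLayer
    convolution2dLayer(3,2^(5+blockIdx),"Padding","same")
    reluLayer
    maxPooling2dLayer(2,"Stride",2)];
encoder = blockedNetwork(encoderBlock,depth,"NamePrefix","encoder_");

% Decoder: 2x transposed convolution followed by two convolutions.
decoderBlock = @(blockIdx) [
    transposedConv2dLayer(2,2^(10-blockIdx),"Stride",2)
    convolution2dLayer(3,2^(10-blockIdx),"Padding","same")
    reluLayer
    convolution2dLayer(3,2^(10-blockIdx),"Padding","same")
    reluLayer];
decoder = blockedNetwork(decoderBlock,depth,"NamePrefix","decoder_");

% Bridge layers that connect the encoder to the decoder.
bridge = [
    convolution2dLayer(3,1024,"Padding","same")
    reluLayer
    convolution2dLayer(3,1024,"Padding","same")
    reluLayer
    dropoutLayer(0.5)];

% Combine the modules, adding skip connections and a 3-channel output.
unet = encoderDecoderNetwork(inputSize,encoder,decoder, ...
    "OutputChannels",3, ...
    "SkipConnections","concatenate", ...
    "LatentNetwork",bridge);
```

Setting "SkipConnections" to "none" instead would produce a plain encoder-decoder network without the U-Net-style shortcuts.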

You can also create popular GAN generator and discriminator networks directly by using functions available in Image Processing Toolbox™. These networks include CycleGAN, PatchGAN, pix2pixHD, and UNIT. For more information, see Get Started with GANs for Image-to-Image Translation.
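For example, a CycleGAN generator and a PatchGAN discriminator might be created as shown in this sketch. The input size is an assumed value, and both functions are called with their default options only.

```matlab
inputSize = [256 256 3];  % assumed size of the input images

% CycleGAN generator network with default settings.
generator = cycleGANGenerator(inputSize);

% PatchGAN discriminator network with default settings.
discriminator = patchGANDiscriminator(inputSize);
```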