The compile method:

Generates the external memory address map.
Optimizes networks for deployment.
Splits networks into legs for deployment.

To integrate the generated deep learning processor IP core into your system reference design, use the compile method outputs.
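For reference, this is a minimal sketch of a typical compile call, assuming a pretrained network object named net in the workspace and a Xilinx board reachable over Ethernet. The variable names, interface, and bitstream choice are illustrative; adjust them for your setup.

hTarget = dlhdl.Target('Xilinx', 'Interface', 'Ethernet');  % illustrative target setup
hW = dlhdl.Workflow('Network', net, ...
    'Bitstream', 'zcu102_single', ...
    'Target', hTarget);
dn = hW.compile;  % prints the address map and optimization log, returns the compile outputs

The sections that follow describe the address map and optimizations that this call produces.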
When you create a dlhdl.Workflow object and use the compile method, an external memory address map is generated. Use the address map to:

Load the network inputs.
Load the deep learning processor IP core instructions.
Load the network weights and biases.
Retrieve the output results.

The compile method generates these address offsets based on the deep learning network and the target board:
InputDataOffset — Address offset where the input images are loaded.
OutputResultOffset — Output results are written starting at this address offset.
SchedulerDataOffset — Address offset where the scheduler runtime activation data is written. The runtime activation data includes information such as the handoff between the different deep learning processor kernels, instructions for the different deep learning processor kernels, and so on.
SystemBufferOffset — Do not use the memory region starting at this offset and ending at the start of the InstructionDataOffset.
InstructionDataOffset — All layer configuration (LC) instructions are written starting at this address offset.
ConvWeightDataOffset — All conv processing module weights are written starting at this address offset.
FCWeightDataOffset — All fully connected (FC) processing module weights are written starting at this address offset.
EndOffset — DDR memory end offset for the generated deep learning processor IP core.
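As a hedged sketch of how you can capture these outputs for system integration, the compile method returns a structure, often named dn in the shipped examples. The field names noted in the comment below are indicative of what those examples display; verify them against your release.

dn = hW.compile;      % returns the compile outputs as a structure
disp(fieldnames(dn))  % in the shipped examples: fields such as weights, instructions, registers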
For an example that displays the external memory map generated for the ResNet-18 recognition network that uses the zcu102_single bitstream, see Compile dagnet network object.
The compile function optimizes networks for deployment by identifying network layers that you can execute in a single operation on hardware and then fusing those layers together. The compile function performs these layer fusions and optimizations:
Batch normalization layer (batchNormalizationLayer) and 2-D convolution layer (convolution2dLayer).
2-D zero padding layer (nnet.keras.layer.ZeroPadding2dLayer) and 2-D convolution layer (convolution2dLayer).
2-D zero padding layer (nnet.keras.layer.ZeroPadding2dLayer) and 2-D max pooling layer (maxPooling2dLayer).
This code output is an example of a compiler optimization in the compiler log:
Optimizing series network: Fused 'nnet.cnn.layer.BatchNormalizationLayer' into 'nnet.cnn.layer.Convolution2DLayer'
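As a sketch, this is the layer pattern that triggers the fusion message above; the input size and filter parameters are placeholders. When you compile a trained network that contains this pattern, the log reports the batch normalization layer being fused into the preceding convolution layer.

layers = [
    imageInputLayer([224 224 3])  % placeholder input size
    convolution2dLayer(3, 16)     % fusion target
    batchNormalizationLayer       % fused into the convolution above
    reluLayer];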
The compile function splits a network into legs during compilation. A leg is a subset of the network that you can convert into a series network. The compile function groups the legs based on the output format of the layers. The layer output format is defined as the data format of the deep learning processor module that processes that layer. The layer output format is conv, fc, or adder. For example, in this image, the compile function groups all the layers in Leg 2 together because they have a conv output format. To learn about the layer output formats, see Supported Layers.
This image shows the legs of the ResNet-18 network created by the compile function and those legs highlighted on the ResNet-18 layer architecture.