Use Compiler Output for System Integration

The compile method:

  • Generates the external memory address map.

  • Optimizes networks for deployment.

  • Splits networks into legs for deployment.

To integrate the generated deep learning processor IP core into your system reference design, use the compile method outputs.
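
This minimal sketch shows how the compile outputs are generated, assuming that Deep Learning HDL Toolbox, the pretrained ResNet-18 support package, and a Xilinx board connected over JTAG are available:

    % Minimal sketch: compile a pretrained network for deployment on a
    % ZCU102 board. Assumes the ResNet-18 support package is installed.
    net = resnet18;

    hTarget = dlhdl.Target('Xilinx','Interface','JTAG');

    hW = dlhdl.Workflow('Network',net, ...
        'Bitstream','zcu102_single', ...
        'Target',hTarget);

    % compile generates the external memory address map, optimizes the
    % network, and splits it into legs, then returns the results in dn.
    dn = compile(hW);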

External Memory Address Map

When you create a dlhdl.Workflow object and use the compile method, the compile method generates an external memory address map. The address offsets are based on the deep learning network and the target board. Use the address map to:

  • Load the network inputs.

  • Load the deep learning processor IP core instructions.

  • Load the network weights and biases.

  • Retrieve the output results.

The compile method generates these address offsets:

  • InputDataOffset — Address offset where the input images are loaded.

  • OutputResultOffset — Output results are written starting at this address offset.

  • SchedulerDataOffset — Address offset where the scheduler runtime activation data is written. The runtime activation data includes information such as handoffs between the different deep learning processor kernels, instructions for the different deep learning processor kernels, and so on.

  • SystemBufferOffset — Do not use the memory region starting at this offset and ending at the start of the InstructionDataOffset.

  • InstructionDataOffset — All layer configuration (LC) instructions are written starting at this address offset.

  • ConvWeightDataOffset — All convolution (conv) processing module weights are written starting at this address offset.

  • FCWeightDataOffset — All fully connected (FC) processing module weights are written starting at this address offset.

  • EndOffset — DDR memory end offset for the generated deep learning processor IP core.

For an example that displays the external memory map generated for the ResNet-18 recognition network by using the zcu102_single bitstream, see Compile dagnet network object.
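
During system integration, you typically add these offsets to the DDR base address of the deep learning processor IP core to obtain absolute addresses. This hypothetical sketch illustrates the arithmetic only; the base address and offset values are placeholders, so substitute the values that the compile method reports for your network and board:

    % Hypothetical values -- replace with your DDR base address and the
    % offsets reported by the compile method for your network and board.
    ddrBaseAddr        = 0x80000000u32;   % DDR base address (placeholder)
    inputDataOffset    = 0x00000000u32;   % InputDataOffset (placeholder)
    outputResultOffset = 0x01800000u32;   % OutputResultOffset (placeholder)

    % Absolute addresses for loading inputs and retrieving results
    inputAddr  = ddrBaseAddr + inputDataOffset;
    outputAddr = ddrBaseAddr + outputResultOffset;

    fprintf('Load inputs at 0x%08X, read results at 0x%08X\n', ...
        inputAddr, outputAddr);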

Compiler Optimizations

The compile function optimizes networks for deployment by identifying network layers that can be executed as a single operation on hardware and fusing those layers together. For example, the compile function can fuse a batch normalization layer into a convolution layer.

This output shows an example of such an optimization in the compiler log:

Optimizing series network: Fused 'nnet.cnn.layer.BatchNormalizationLayer' into 'nnet.cnn.layer.Convolution2DLayer'
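
As an illustration, this simple series network contains a batch normalization layer that immediately follows a convolution layer, which is the pattern fused in the log message above. The layer sizes and structure are arbitrary, chosen only for illustration:

    % Sketch of a layer array with a conv-batchnorm pattern. When you
    % compile a trained network containing this pattern, the compiler
    % log reports a fusion like the one shown above.
    layers = [
        imageInputLayer([28 28 1])
        convolution2dLayer(3,16,'Padding','same')
        batchNormalizationLayer
        reluLayer
        fullyConnectedLayer(10)
        softmaxLayer
        classificationLayer];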

Leg Level Compilations

The compile function splits a network into legs during compilation. A leg is a subset of the network that you can convert into a series network. The compile function groups the legs based on the output format of the layers. The layer output format is defined as the data format of the deep learning processor module that processes that layer. The layer output format is conv, fc, or adder. For example, in this image, the compile function groups all the layers in Leg 2 together because they have a conv output format. To learn about the layer output formats, see Supported Layers.

ResNet-18 leg grouping

This image shows the legs of the ResNet-18 network created by the compile function and those legs highlighted on the ResNet-18 layer architecture.

ResNet-18 architecture and compiler legs
