Main Content

Multicore Simulation of Audio Beamforming System

This example shows how an audio beamforming system simulation model in Simulink® can have improved performance using dataflow domain. It uses the dataflow domain in Simulink to automatically partition the data-driven portions of the communications system into multiple threads and thereby improving the performance of the simulation by executing it on your desktop's multiple cores.


The dataflow execution domain allows you to make use of multiple cores in the simulation of computationally intensive systems. This example shows how dataflow as the execution domain of a subsystem improves simulation performance of the model. To learn more about dataflow and how to run Simulink models using multiple threads, see Multicore Execution Using Dataflow Domain.

Acoustic Beamforming

This example shows acoustic beamforming using a uniform linear array (ULA) of microphones. The model simulates the reception of three audio signals from different directions on a 10-element uniformly spaced linear microphone array. After the addition of thermal noise at the receiver, beamforming is applied for different source angles and the result is played on a sound device. The audio source that needs to be played in the audio player can be selected using the dialog from the Select Source block.

Setting up the Dataflow Subsystem

This example uses dataflow domain in Simulink to make use of multiple cores on your desktop to improve simulation performance. The Domain parameter of the dataflow subsystem in this model is set as Dataflow. You can view this by selecting the subsystem and then accessing Property Inspector. To access Property Inspector, in the Simulink Toolstrip, on the Modeling tab, in the Design gallery select Property Inspector or on the Simulation tab, Prepare gallery, select Property Inspector.

Dataflow domains automatically partition your model into multiple threads for better performance. Once you set the Domain parameter to Dataflow, you can use the Multicore tab analysis to analyze your model to get better performance. The Multicore tab is available in the toolstrip when there is a dataflow domain in the model. To learn more about the Multicore tab, see Perform Multicore Analysis for Dataflow.

Analyzing Concurrency in Dataflow Subsystem

For this example the Multicore tab mode is set to Simulation Profiling for simulation performance analysis.

It is recommended to optimize model settings for optimal simulation performance. To accept the proposed model settings, on the Multicore tab, click Optimize. Alternatively, you can use the drop menu below the Optimize button to change the settings individually. In this example the model settings are already optimal.

On the Multicore tab, click the Run Analysis button to start the analysis of the dataflow domain for simulation performance. Once the analysis is finished, the Analysis Report and Suggestions window shows how many threads the dataflow subsystem uses during simulation.

After analyzing the model, the Analysis Report and Suggestions window shows 3 threads. This is because the three beamformer blocks are computationally intensive and can run in parallel. The three beamformer blocks however, depend on the Microphone Array and the Receiver blocks. Pipeline delays can be used to break this dependency and increase concurrency. The Analysis Report and Suggestions window shows the recommended number of pipeline delays as Suggested for Increasing Concurrency. The suggested latency value is computed to give the best performance.

The following diagram shows the Analysis Report and Suggestions window where the suggested latency is 1 for the dataflow subsystem.

Click the Accept button to use the recommended latency for the dataflow subsystem. This value can also be entered directly in the Property Inspector for Latency parameter. Simulink shows the latency parameter value using $Z^{-n}$ tags at the output ports of the dataflow subsystem.

The Analysis Report and Suggestions window now shows the number of threads as 4 meaning that the blocks inside the dataflow subsystem simulate in parallel using 4 threads. Highlight threads highlights the blocks with colors based on their thread allocation as shown in the Thread Highlighting Legend. Show pipeline delays shows where pipelining delays were inserted within the dataflow subsystem using $Z^{-n}$ tags.

Multicore Simulation Performance

We measure the performance improvement of using dataflow domain by comparing the execution time taken for running model with and without using dataflow. Execution time is measured using the sim command, which returns the simulation execution time of the model. These numbers and analysis were published on a Windows® desktop computer with Intel® Xeon® CPU W-2133 @ 3.6 GHz 6 Cores 12 Threads processor.

Simulation execution time for multithreaded model = 4.03s
Simulation execution time for single-threaded model = 6.26s
Actual speedup with dataflow: 1.6x


This example shows how multithreading using dataflow domain can improve performance in a audio beamforming simulation model using multiple cores on the desktop.