Windows named pipe for data input and output in Matlab extremely slow compared to other languages
A Python module passing data between two named pipes runs roughly 40 times faster than the equivalent Matlab implementation.
Source of data: A binary program generates a data stream of records, each consisting of 4 float values of 8 bytes each. Every 32 bytes therefore represent 4 measurements taken at the same time, followed by the next 4 measurements.
Destination of data: the same binary, or a separate one, should receive the data blocks after some processing. The output stream must therefore have the same structure, so that any data source and destination can be chained together.
A Python function that simply passes the data from one named pipe to the other sets the performance baseline.
Doing the same with a Matlab module, without any data processing in between, just passing input pipe data to the output pipe, takes roughly 40 times longer. Python simply reads 32 bytes from the pipe ( inpipe.read() ) and writes them to the output pipe ( outpipe.write() ).
Matlab does the same, but at a data rate about 40 times lower.
Here is the example Matlab code:
bufferSize = 32;   % one record: 4 floats of 8 bytes each
timeOut = 100;     % connect timeout in milliseconds
% Pipe name and server definition
pipeNameIn = "datain_fifo";
pipeNameOut = "dataout_fifo";
serverName = "localhost";
% Add .NET assembly that provides System.IO.Pipes
NET.addAssembly('System.Core');
pipeStreamIn = System.IO.Pipes.NamedPipeClientStream(serverName, ...
    pipeNameIn, ...
    System.IO.Pipes.PipeDirection.In);
pipeStreamOut = System.IO.Pipes.NamedPipeClientStream(serverName, ...
    pipeNameOut, ...
    System.IO.Pipes.PipeDirection.Out);
pipeStreamIn.Connect(timeOut);
pipeStreamOut.Connect(timeOut);
if ~pipeStreamIn.IsConnected
    error('Pipe %s is not connected.', pipeNameIn);
end
if ~pipeStreamOut.IsConnected
    error('Pipe %s is not connected.', pipeNameOut);
end
read_buffer = NET.createArray('System.Byte', bufferSize);
write_buffer = NET.createArray('System.Byte', bufferSize);
while pipeStreamIn.IsConnected
    % read_data is the number of bytes actually read (up to 32, i.e. 4 * 8 byte floats)
    read_data = pipeStreamIn.Read(read_buffer, int32(0), int32(bufferSize));
    if read_data == 0
        break;  % zero bytes read: the writing end has closed the pipe
    end
    inBuf = read_buffer.uint8;
    % data processing should happen here
    outBuf = inBuf;
    pipeStreamOut.Write(outBuf, int32(0), read_data);
end
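As a rough illustration of what could go in the "data processing should happen here" slot, the sketch below reinterprets one 32-byte block as four doubles with typecast and converts it back to bytes. It assumes the producer writes IEEE 754 doubles in the machine's native byte order; the sample values and the inBuf stand-in for read_buffer.uint8 are made up for the demo.

```matlab
% Hypothetical record: four doubles serialized in native byte order.
values = [1.5, -2.25, 3.0, 4.125];
inBuf  = typecast(values, 'uint8');      % 1x32 uint8, stands in for read_buffer.uint8

% Decode: reinterpret the 32 bytes as 4 doubles (bit pattern unchanged).
samples = typecast(uint8(inBuf), 'double');

% ... any processing on samples would go here ...

% Encode: back to 32 bytes, ready to be written to the output pipe.
outBuf = typecast(samples, 'uint8');
```

Because typecast only reinterprets the bytes, a pure pass-through yields an outBuf identical to inBuf.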
Any ideas for raising the data throughput to a rate comparable to Python would be appreciated. Since no data processing is involved in this pass-through at the moment, I assume the lower data rate results from an incorrectly configured pipe or usage mode.
For example, processing 1M data rows takes ~10 s with Python but 400 s with Matlab.
Even though the binary producing the data can provide both the output and the input pipe, this is not mandatory. Therefore separate input and output named pipes are used, to be more general. An InOut pipe could not be used, since the interface of the binary model cannot be modified. Since Python performs so much better with the same two pipes, I would not expect this to be the issue.
Asynchronous mode on both pipes did not improve anything. So I assume there must be a way to optimize the data throughput, but I do not see the current caveats.
Any comments and ideas would be really appreciated.
2 Comments
MikeMoc
on 18 Dec 2023
Eric
on 1 Apr 2024
Some ideas (I just started messing with named pipes on Windows myself, so I apologize for dumb suggestions):
- For me, MATLAB likes to default to Message mode instead of Byte mode for the In pipe's ReadMode (no idea why). I'm not sure if that's affecting your performance somehow. (I fixed it by just recreating the pipe. Bizarre.)
- I discovered that my server writes were initially hanging because the default buffer size is zero. Apparently, a buffer size of 0 is supposed to mean the buffer is allocated "as needed", but this was clearly not happening for me. So my server writes and client reads had to occur simultaneously, and I imagine no optimization was being done by the .NET interface. I'm not sure what your server settings are, but this could be contributing to your slowdown. The fix would be to set the buffer size in the server constructor.
- I would try increasing bufferSize. In my experience, 32 is a bit small for reading/writing data chunks. I recommend trying 4096 (i.e. a page size on Linux) and see if you do any better. Your program already captures the number of bytes read in read_data, and so could easily be adjusted to account for less than full buffer reads.
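If bufferSize is raised as suggested above, a single read may return several records plus a fragment of the next one. Below is a minimal sketch of the carry-over bookkeeping, with no pipe involved. The cell array chunks is hypothetical and stands in for the first read_data bytes returned by successive Read calls; the 32-byte record of 4 doubles is taken from the question.

```matlab
recordSize = 32;                         % bytes per record (4 doubles)

% Simulated stream: doubles 1..12 are 96 bytes, i.e. 3 records,
% split at arbitrary points to mimic partial reads.
stream = typecast(1:12, 'uint8');
chunks = {stream(1:40), stream(41:50), stream(51:96)};

leftover = zeros(1, 0, 'uint8');         % bytes of an incomplete record
records  = zeros(4, 0);                  % one column per decoded record

for k = 1:numel(chunks)
    data   = [leftover, chunks{k}];      % prepend the unfinished fragment
    nFull  = floor(numel(data) / recordSize);
    nBytes = nFull * recordSize;
    if nFull > 0
        % Decode all complete records in one call instead of one per read.
        block   = reshape(typecast(data(1:nBytes), 'double'), 4, nFull);
        records = [records, block]; %#ok<AGROW>
    end
    leftover = data(nBytes+1:end);       % keep the tail for the next chunk
end
```

The point of the design is that the per-call cost of crossing into .NET is paid once per chunk rather than once per 32-byte record.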
Answers (1)
Gojo
on 22 Jan 2024
1 vote
Hey MikeMoc,
It appears that some adjustments to the handling of named pipes in Windows using MATLAB are required. The discrepancy in performance between the Python module and MATLAB's approach may stem from MATLAB's reliance on the .NET framework in Windows, an issue not observed on the Linux system.
Although MATLAB has no library support for named pipes, I can suggest some workarounds.
- “MEX” files for using named pipes: A “MEX” file is a MATLAB executable file. “MEX” files provide an interface for using functions written in C/C++. A named pipe can be created in C/C++ and can be used for inter-process communication. Please have a look at the following answer for creating a named pipe in C: https://stackoverflow.com/questions/2784500/how-to-send-a-simple-string-between-two-programs-using-pipes After creating a C function, use the “mex” command for building the file. Please refer to the following MATLAB documentation for building a “MEX” function: https://www.mathworks.com/help/matlab/ref/mex.html For more information regarding external language interfaces that could be used in MATLAB, please check the following: https://www.mathworks.com/help/matlab/external-language-interfaces.html?s_tid=srchbrcm
- Using other IPC methods: Please have a look at the following IPC benchmarks: https://github.com/goldsborough/ipc-bench#: The “Shared Memory” and “Memory-Mapped Files” methods show much better benchmark results than named pipes.
- “Shared Memory” method: With this method, MATLAB objects can be shared through shared memory. A File Exchange submission implements this and also supports Windows. Please find the post here: https://www.mathworks.com/matlabcentral/fileexchange/28572-sharedmatrix
- “Memory-Mapped Files” method: Inter-process communication can be done through a shared file. The processes share the file by each mapping part of their memory space to a common location in the file. Please refer to the following MATLAB documentation on using memory mapping for communication: https://www.mathworks.com/help/matlab/import_export/share-memory-between-applications.html
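A minimal sketch of the memory-mapped file idea using MATLAB's memmapfile, run inside a single session for simplicity. The file name ipc_demo.bin and the one-record layout are assumptions for illustration. In a real setup the second mapping would live in the other process, and some handshake (not shown here) would be needed to signal when new data is ready.

```matlab
% Hypothetical backing file with room for one record of 4 doubles.
fname = fullfile(tempdir, 'ipc_demo.bin');
fid = fopen(fname, 'w');
fwrite(fid, zeros(1, 4), 'double');
fclose(fid);

% Two independent mappings of the same file: one writable, one read-only.
writer = memmapfile(fname, 'Format', 'double', 'Writable', true);
reader = memmapfile(fname, 'Format', 'double');

% "Send" a record by writing into the mapped region ...
writer.Data(1:4) = [1.5; -2.25; 3.0; 4.125];

% ... and "receive" it through the other mapping.
received = reader.Data(1:4)';

% Release the mappings before removing the file.
clear writer reader
delete(fname);
```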
Hope this helps.
1 Comment
MikeMoc
on 23 Jan 2024