
fwrite and MATLAB for a raid0 disk - Only one lane?

Vincent Perrot on 28 Mar 2022
Edited: Vincent Perrot on 30 Mar 2022
Hello everyone,
I have a RAID 0 NVMe volume (four NVMe disks connected through a PCIe adapter card).
The volume works great (up to 12 GB/s OUTSIDE MATLAB, PCIe 3.0), but I cannot reach that speed in MATLAB.
It looks like MATLAB is using a single bus lane (i.e. 3.5 GB/s) to write the data to the disk (simple example):
data = randn(1024, 1024, 1024, 'double'); %8 GB
fid = fopen('test.bin', 'W');
tic;
fwrite(fid, data(:), 'double');
toc;
fclose(fid);
This takes about 2.3 seconds, which is about 3.5 GB/s, as if only one lane were used, whereas the RAID 0 uses 4 lanes (4x4 PCIe).
I am running out of ideas. This is not related to the disk or the RAID 0 itself: I tested many RAID 0 configurations (BIOS, VROC, Windows RAID), and the issue only occurs in MATLAB. Using HDF5 files does not solve it, so it seems to be related to MATLAB itself.
FYI: I need this speed. In my field/lab we generate about 1 TB of data every 5 minutes, and the bottleneck is always saving the data.
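As rough arithmetic on the figures quoted above, here is a minimal sketch that converts a transfer size and time into GB/s and estimates nominal PCIe 3.0 link bandwidth. The function names are hypothetical, and the estimate assumes 8 GT/s per lane with 128b/130b encoding and ignores protocol overhead, so real drives land somewhat lower. Under these assumptions a single PCIe 3.0 lane carries roughly 1 GB/s and an x4 link roughly 3.9 GB/s, so the measured rate is closer to what one x4 drive link can carry than to one lane.

```cpp
#include <cstdio>

// Approximate PCIe 3.0 link bandwidth in GB/s for a given lane count.
// Assumption: 8 GT/s per lane, 128b/130b encoding, no protocol overhead.
double pcie3_gb_per_s(int lanes)
{
    const double gigatransfers_per_s = 8.0;
    const double encoding_efficiency = 128.0 / 130.0;
    return lanes * gigatransfers_per_s * encoding_efficiency / 8.0;  // bits -> bytes
}

// Throughput in decimal GB/s for `bytes` transferred in `seconds`.
double throughput_gb_per_s(double bytes, double seconds)
{
    return bytes / seconds / 1e9;
}
```

For example, `pcie3_gb_per_s(4)` evaluates to about 3.94, `throughput_gb_per_s(8.0 * 1024 * 1024 * 1024, 2.3)` to about 3.73 (the 8 GiB test array in 2.3 s), and `throughput_gb_per_s(1e12, 300.0)` to about 3.33 (1 TB every 5 minutes, sustained).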
EDIT 1: Removed "b" argument from "fopen"
EDIT 2: Added type "double" to "fwrite"
Thank you a lot.
  5 Comments
Walter Roberson on 30 Mar 2022
Getting high speed transfer to disk can require using special system calls. I do not have any information about how it is done in Windows; in Linux apparently there are methods that can avoid round-trips to user mode. It is unlikely that MATLAB implements those methods.
In Windows... I don't know. Is WriteFileEx still used in practice? https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-writefileex That does asynchronous writes, which historically has been an important step in performance improvement. Or perhaps WriteFileGather() https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-writefilegather ?
In a logging situation, you would like to be able to grab a buffer full of input, schedule it to be written, and continue on without waiting for the I/O to complete.
I suspect that MATLAB simply uses C or C++ fwrite() https://www.cplusplus.com/reference/cstdio/fwrite/ which waits for the I/O call to complete before returning.
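The "grab a buffer, schedule the write, continue" pattern described above can be sketched in portable C++. This is not WriteFileEx or WriteFileGather: it swaps in standard `std::async` with two buffers, so that one chunk is written on a background task while the calling thread prepares the next. The function name `write_double_buffered` is hypothetical, and the `memcpy` only stands in for whatever acquisition step fills the buffer. Whether this helps on a given RAID 0 setup is untested here.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <future>
#include <vector>

// Write `total` bytes from `src` to `path` in chunks of `chunk` bytes.
// While one chunk is written on a background task, the calling thread
// fills the other buffer. Returns the number of bytes written.
std::size_t write_double_buffered(const char *path, const char *src,
                                  std::size_t total, std::size_t chunk)
{
    if (chunk == 0) return 0;
    std::FILE *f = std::fopen(path, "wb");
    if (!f) return 0;

    std::vector<char> buffers[2] = {std::vector<char>(chunk),
                                    std::vector<char>(chunk)};
    std::future<std::size_t> pending;  // write currently in flight, if any
    std::size_t written = 0;
    int cur = 0;

    for (std::size_t off = 0; off < total; off += chunk) {
        const std::size_t n = (total - off < chunk) ? total - off : chunk;
        // Stand-in for data acquisition: fill the buffer that is not in flight.
        std::memcpy(buffers[cur].data(), src + off, n);

        // Only one write at a time: wait for the previous one to finish.
        if (pending.valid()) written += pending.get();

        const char *p = buffers[cur].data();
        pending = std::async(std::launch::async,
                             [f, p, n] { return std::fwrite(p, 1, n, f); });
        cur ^= 1;  // the next iteration fills the other buffer
    }
    if (pending.valid()) written += pending.get();
    std::fclose(f);
    return written;
}
```

The design keeps a single write in flight, so chunks reach the file in order; overlapping several writes at once would need explicit file offsets and is a different technique.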
Vincent Perrot on 30 Mar 2022
@Walter Roberson I wrote a MEX file using WriteFile, without success. I will try asynchronous writes with WriteFileEx and also try WriteFileGather.
I have contacted support to get some answers about this.
I tried fwrite/ofstream/WriteFile (MEX files), even in chunks, without any success.
Thanks for taking the time. I will read those links and try those approaches.


Answers (2)

Jan on 29 Mar 2022
Edited: Jan on 29 Mar 2022
What about trying it as C-Mex?
data = randn(1024, 1024, 1024, 'double'); %8 GB
tic
uglyCWrite(data);
toc
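// Short hack, UNTESTED!!!
// uglyCWrite.c
#include "mex.h"
#include <stdio.h>
#include <stdlib.h>
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    double *data;
    size_t n, w;
    FILE *fid;
    if (nrhs < 1)
        mexErrMsgTxt("One input array required.");
    data = (double *) mxGetData(prhs[0]);
    n = mxGetNumberOfElements(prhs[0]);  /* number of elements */
    w = mxGetElementSize(prhs[0]);       /* bytes per element */
    fid = fopen("test.bin", "wb");       /* binary mode: no newline translation on Windows */
    if (fid == NULL)
        mexErrMsgTxt("Cannot open test.bin for writing.");
    fwrite(data, w, n, fid);             /* fwrite(ptr, element size, count, stream) */
    fclose(fid);
}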
  2 Comments
Vincent Perrot on 29 Mar 2022
Edited: Vincent Perrot on 29 Mar 2022
Thank you for taking the time to put that piece of code together.
This morning I tested several MEX implementations from this post: https://stackoverflow.com/questions/70126690/write-binary-file-to-disk-super-fast-in-mex
Those are not faster than fwrite in MATLAB:
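#include <cstdint>
#include <cstdio>
#include <fstream>
#include <windows.h>

// Variant 1: C stdio
void writeBinFile(int16_t *data, size_t size)
{
    FILE *fID = fopen("file_fopen.bin", "wb");  // C fopen has no "W" mode; use binary write
    if (fID == NULL)
        return;
    fwrite(data, sizeof(int16_t), size, fID);
    fclose(fID);
}

// Variant 2: C++ stream
void writeBinFileFast(int16_t *data, size_t size)
{
    std::ofstream file("file_ostream.bin", std::ios::out | std::ios::binary);
    file.write(reinterpret_cast<const char *>(data), size * sizeof(int16_t));
    file.close();
}

// Variant 3: Win32 WriteFile in 64 MiB parts
void writeBinFilePartByPart(int16_t *int_data, size_t size)
{
    const size_t part = 64 * 1024 * 1024;
    size = size * sizeof(int16_t);  // from here on, size is in bytes
    char *data = reinterpret_cast<char *>(int_data);
    HANDLE file = CreateFileA(
        "windows_test.bin",
        GENERIC_WRITE,
        0,
        NULL,
        CREATE_ALWAYS,
        FILE_FLAG_SEQUENTIAL_SCAN,
        NULL);
    if (file == INVALID_HANDLE_VALUE)
        return;
    // Expand file size; SetFilePointerEx takes a 64-bit offset, so sizes above 2 GB work
    LARGE_INTEGER pos;
    pos.QuadPart = static_cast<LONGLONG>(size);
    SetFilePointerEx(file, pos, NULL, FILE_BEGIN);
    SetEndOfFile(file);
    pos.QuadPart = 0;
    SetFilePointerEx(file, pos, NULL, FILE_BEGIN);
    DWORD written;
    if (size < part)
    {
        WriteFile(file, data, static_cast<DWORD>(size), &written, NULL);
        CloseHandle(file);
        return;
    }
    size_t rem = size % part;
    for (size_t i = 0; i < size - rem; i += part)
    {
        WriteFile(file, data + i, static_cast<DWORD>(part), &written, NULL);
    }
    if (rem)
        WriteFile(file, data + size - rem, static_cast<DWORD>(rem), &written, NULL);
    CloseHandle(file);
}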
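To compare chunk sizes on equal footing, the chunked approach above can be wrapped in a small timing helper. The following is a minimal portable sketch, not one of the tested MEX variants: `timed_chunked_write` is a hypothetical name, it uses plain `fwrite` rather than WriteFile, and it checks every write for a short count (which the loop above does not). It times the calls only, so data may still sit in the operating system's cache when it returns.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdio>

// Write `bytes` bytes from `data` to `path` in pieces of at most `chunk`
// bytes. Returns the measured rate in GB/s, or -1.0 on a failed or short write.
double timed_chunked_write(const char *path, const char *data,
                           std::size_t bytes, std::size_t chunk)
{
    if (chunk == 0) return -1.0;
    std::FILE *f = std::fopen(path, "wb");
    if (!f) return -1.0;

    const auto t0 = std::chrono::steady_clock::now();
    for (std::size_t off = 0; off < bytes; off += chunk) {
        const std::size_t n = (bytes - off < chunk) ? bytes - off : chunk;
        if (std::fwrite(data + off, 1, n, f) != n) {  // short write: give up
            std::fclose(f);
            return -1.0;
        }
    }
    const bool closed_ok = (std::fclose(f) == 0);  // fclose flushes stdio buffers
    const std::chrono::duration<double> dt =
        std::chrono::steady_clock::now() - t0;
    if (!closed_ok) return -1.0;
    return static_cast<double>(bytes) / dt.count() / 1e9;
}
```

Calling it in a loop over several chunk sizes (for instance 1 MiB up to 1 GiB) with the same buffer would show whether chunking changes the rate at all on a given volume.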



Jeremy Hughes on 29 Mar 2022
I was playing around with this and found that this is much faster (by a factor of 3 on my machine):
fwrite(fid,data(:),"double");
  1 Comment
Vincent Perrot on 29 Mar 2022
Edited: Vincent Perrot on 29 Mar 2022
Thank you.
Sadly, we already tried that; this is how I got the 3.5 GB/s I mentioned in my first message.
I had been playing around with the code and forgot to restore it in my question, sorry about that.
I edited my question; we are still at 3.5 GB/s instead of roughly 12 GB/s.


Release: R2021a
