Large streaming data direct to file
Hi!
I would like to set up a system that logs months' worth of financial JSON WebSocket data to a file.
- The JSON data coming in looks like this: {"this": "that", "foo": [1,2,3], "bar": ["a", "b", "c"]}, and there are about 20 messages per second.
- I tested FPRINTF, writing directly to a .txt file. That works, but the files get really big, about 2 GB per day, because there is no compression.
- I tested different SAVE formats ('-v7' being by far the best) to save a new variable inside a .mat file every 10 minutes. This was a little too slow to keep up with the incoming stream: each save took almost a second, and processing would not be ideal if I have to load a ton of different variables. The file size looked very good, though. (http://undocumentedmatlab.com/blog/improving-save-performance)
- I tried MATFILE to write directly to the file, but I could only append to the end of a file with '-v7.3' .mat files, which makes the file a lot bigger than '-v7' and still takes a little too long.
- I would like a file with good compression that I can append a new message to quickly. Maybe the HDF5 file format?
I believe I need to serialize the incoming data and save it directly to a file in some compressed form, but I'm not exactly sure how to do that.
- I read through this article but don't understand exactly how to implement it (https://undocumentedmatlab.com/blog/serializing-deserializing-matlab-data). Since this is an older article, is there a more up-to-date way?
- Do I use something like "h5write"? "getByteStreamFromArray"?
- After the file has been created with months of data, how do I pull out each message, one by one, to process it?
- Is the "Fast serialize/deserialize" submission on the File Exchange the right path? I can't figure out how to use it.
Thank you!
Joe
Answers (1)
Jan
on 16 Nov 2018
Edited: Jan
on 16 Nov 2018
You can create the text as a char vector with sprintf instead of fprintf and compress it in RAM before writing it to disk: https://www.mathworks.com/matlabcentral/fileexchange/69388-mkzip . This should avoid the overhead of compressed MAT files.
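A minimal sketch of the "build with sprintf, compress in RAM, then append" idea. It does not use the linked File Exchange submission. Instead it assumes that MATLAB's bundled Java is available and uses java.util.zip for the compression. The file layout (a uint32 length prefix followed by one gzip block per batch), the batch size, and all variable names are assumptions made for illustration only. The second half shows one way to pull the messages back out one by one with jsondecode.

```matlab
% Sketch only: the block layout and all names below are assumptions.
logFile = [tempname '.bin'];              % hypothetical log file

% Two sample messages standing in for a batch collected from the stream
msgs = {'{"this": "that", "foo": [1,2,3], "bar": ["a", "b", "c"]}', ...
        '{"this": "other", "foo": [4,5,6], "bar": ["d", "e", "f"]}'};

% --- Writer: build text in RAM, compress it, append it to the file ------
fid = fopen(logFile, 'a');                % append mode
for batch = 1:3                           % stands in for "every few seconds"
    txt = sprintf('%s\n', msgs{:});       % one JSON message per line
    raw = unicode2native(txt, 'UTF-8');   % char vector -> uint8 bytes

    bos = java.io.ByteArrayOutputStream();
    gz  = java.util.zip.GZIPOutputStream(bos);
    gz.write(raw, 0, numel(raw));         % compress in memory
    gz.close();                           % finishes the gzip stream
    z = typecast(bos.toByteArray(), 'uint8');

    fwrite(fid, numel(z), 'uint32');      % length prefix for this block
    fwrite(fid, z, 'uint8');              % the compressed block itself
end
fclose(fid);

% --- Reader: walk the blocks and decode the messages one by one ---------
fid   = fopen(logFile, 'r');
count = 0;
n = fread(fid, 1, 'uint32');              % returns [] at end of file
while ~isempty(n)
    z = fread(fid, n, '*uint8');
    reader = java.io.BufferedReader(java.io.InputStreamReader( ...
        java.util.zip.GZIPInputStream(java.io.ByteArrayInputStream(z)), ...
        'UTF-8'));
    line = reader.readLine();             % Java null comes back as []
    while ~isempty(line)
        msg   = jsondecode(char(line));   % process a single message here
        count = count + 1;
        line  = reader.readLine();
    end
    reader.close();
    n = fread(fid, 1, 'uint32');
end
fclose(fid);
delete(logFile);                          % clean up the demo file
```

Larger batches usually compress better, but everything still held in RAM is lost if the session crashes, so the batch interval is a trade-off between file size and how much data you can afford to lose.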
Maybe it is just the disk access that slows down the processing. In that case, try an SSD instead.
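To check whether the write call itself is the bottleneck, you could time a number of appends of roughly one second of traffic. This is only a rough sketch: the payload (20 copies of the sample message), the number of writes, and the file name are assumptions, and the timing includes buffering by MATLAB and the operating system, so it shows what your program experiences rather than the raw speed of the drive.

```matlab
% Sketch only: payload size, write count and file name are assumptions.
msg     = '{"this": "that", "foo": [1,2,3], "bar": ["a", "b", "c"]}';
payload = uint8(repmat(sprintf('%s\n', msg), 1, 20));  % about 20 messages
fname   = [tempname '.txt'];              % put this on the drive to test
nWrites = 200;
t       = zeros(1, nWrites);              % seconds per write call

fid = fopen(fname, 'a');
for k = 1:nWrites
    t0 = tic;
    fwrite(fid, payload, 'uint8');        % append one batch
    t(k) = toc(t0);
end
fclose(fid);
delete(fname);                            % clean up the test file

fprintf('median %.3f ms, worst %.3f ms per write\n', ...
        1e3 * median(t), 1e3 * max(t));
```

If the worst case stays far below the time between incoming batches, the disk is probably not what is slowing things down.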