
Reading a large .dat file or some parts of it

Amin Rajabi on 24 Feb 2017
Commented: Walter Roberson on 9 Oct 2018
Dear All,
I have a very large DAT file (almost 16 GB). It contains the electricity usage of 8000 customers over about 4 years, recorded every 30 minutes (so it has roughly 8000*4*365*24*2 rows!)
MS Excel lets me open this file, but it obviously loads only part of it. From that part I could figure out that the format is something like this:
990814, 246745, 0, 2012-07-22 20:00:00, 3.25, 0,0,0,0
which corresponds with:
CUSTOMER_KEY, CALENDAR_KEY, EVENT_KEY, READING_DATETIME, GENERAL_SUPPLY_KWH, CONTROLLED_LOAD_KWH, GROSS_GENERATION_KWH, NET_GENERATION_KWH, OTHER_KWH
My main problem is that when I try to load it into MATLAB, it fails because there is not enough RAM.
I have read about fopen, fread, fscanf, textscan, etc. However, I couldn't figure out whether it is possible to read only part of this DAT file instead of the whole of it. Is there a command to read, for example, rows 100 to 1000 of this DAT file without first loading all of it into memory?
I only need the usage of about 1000 customers for one month.
Thanks in advance for your help.

Accepted Answer

Walter Roberson on 25 Feb 2017
The calling sequence for textscan is:
textscan(SOURCE, FORMAT, COUNT, OPTIONS...)
where SOURCE is either a file identifier or a string, FORMAT is a string, and COUNT is the maximum number of times to apply the FORMAT.
So to read a particular portion of the file, you can use the HeaderLines option to skip everything before that portion, and you can use COUNT to give the number of lines to process.
COUNT is not exactly a number of lines, though: if the file has empty lines then, unless you have carefully chosen your options, each empty line is treated as leading whitespace and is skipped automatically without incrementing the count. It is more accurate to say that, provided there is enough data, the count is the number of rows of data that are returned.
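As a rough illustration of the HeaderLines and COUNT combination described above, here is a minimal sketch. It assumes the comma-separated nine-field layout from the question; the temporary demo file, its contents, and the variable names (`firstRow`, `lastRow`, `customerKey`, and so on) are made up for the example and are not part of the original post.

```matlab
% Build a small demo file in the question's format (hypothetical data).
fname = [tempname '.dat'];
fid = fopen(fname, 'w');
for k = 1:10
    fprintf(fid, '%d, 246745, 0, 2012-07-22 20:00:00, %.2f, 0,0,0,0\n', ...
        990800 + k, k / 4);
end
fclose(fid);

firstRow = 4;   % first row to read (1-based), assumed for the demo
lastRow  = 6;   % last row to read

fid = fopen(fname, 'r');
assert(fid ~= -1, 'Could not open %s', fname);

% Nine fields per row; the timestamp is kept as text (%s).
fmt = '%f %f %f %s %f %f %f %f %f';

% HeaderLines skips the rows before firstRow; the third argument (COUNT)
% limits how many times the format is applied, i.e. how many rows come back.
C = textscan(fid, fmt, lastRow - firstRow + 1, ...
    'Delimiter', ',', 'HeaderLines', firstRow - 1);
fclose(fid);
delete(fname);

customerKey = C{1};   % numeric column CUSTOMER_KEY
timestamps  = C{4};   % cell array of character vectors READING_DATETIME
supplyKwh   = C{5};   % numeric column GENERAL_SUPPLY_KWH
```

Skipped lines still have to be scanned for line breaks, so reaching a row deep inside a 16 GB file takes time, but only the requested rows are held in memory. Calling textscan again on the same file identifier continues from the current file position, so the same call in a loop (without HeaderLines) could process the file block by block and keep only the customers and dates of interest.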
3 Comments
Rahimeh Rouhi on 8 Oct 2018
Dear Walter Roberson, could you please help? I have a big dataset of images in the form of .mat files, and I have a similar problem. I used matfile to save all the data on a hard disk and load some parts, but it is very slow. What is a better way to load part of the data into the workspace? Would writing the data to a text file and loading it with the command you mentioned help?
Walter Roberson on 9 Oct 2018
How do you store the images inside the mat file? Cell array? One variable per image? Multidimensional array? Struct array?
