Unable to read a huge XML or text file
3 Ansichten (letzte 30 Tage)
Ältere Kommentare anzeigen
Hi,
I have a XML of 2GB in size. I keep getting java heap memory error when loading it. So I am thinking of reading it in as a text file and remove many useless rows in that file before saving it into a new and smaller file.
How to do that? I cannot even read it with textpad. Thanks!
Antworten (1)
Santa Raghavan
am 26 Jul. 2017
Bearbeitet: Santa Raghavan
am 26 Jul. 2017
The amount of Java Heap memory available to MATLAB can be increased and this can be done in the following way:
In the MATLAB Desktop Window:
For versions of MATLAB R2010a and above, use - File -> Preferences -> General -> Java Heap Memory. Move the slider to adjust the allocated heap memory.
For versions of MATLAB prior to R2010a, refer to the link below-
If that does not work, you can read it in as a text file using the textscan function by specifying the block size you wish to read at a time.
fileID = fopen('bigfile.txt');
formatSpec = '%s %f %*f %*f %s';
Read a block of data in the file. Use the HeaderLines name-value pair argument to instruct textscan to skip two lines before reading data.
D = textscan(fileID,formatSpec,'HeaderLines',2,'Delimiter','\t')
2 Kommentare
Santa Raghavan
am 27 Jul. 2017
You can also try the datastore function that lets you read files that dont fit into the memory.
ds = datastore('Myfile.xml', ...
'TreatAsMissing','NA')
ds.ReadSize = 100; % Specifies the number of lines
% you want to read at a time.
read(ds) % Reads first 100 lines in file
read(ds) % Reads next 100 lines in file
Subsequent read calls on ds fetches data from last read point.
Siehe auch
Kategorien
Mehr zu Large Files and Big Data finden Sie in Help Center und File Exchange
Produkte
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!