Read specific rows from a large .csv
Lorenzo
on 6 Jul 2016
Commented: Steven Hunsinger
on 14 Sep 2022
Hi,
I'm trying to find a fast solution for handling a big .csv file (35 MB). The good part is that I only need a certain part of the file: basically, I would like to read only the rows that start with a certain name.
Unfortunately the file is composed like this:
Varname_1 timestring(t=0) valueX valueY
Varname_2 timestring(t=0) valueX valueY
...
Varname_n timestring(t=0) valueX valueY
Varname_1 timestring(t=1) valueX valueY
Varname_2 timestring(t=1) valueX valueY
...
Varname_n timestring(t=1) valueX valueY
...
... and so on
My idea would be to read the .csv file line by line, check whether the variable name matches (e.g. Varname_1), and write the matching rows to a cell array (or 4 vectors) like this:
Varname_1 timestring(t=0) valueX valueY
Varname_1 timestring(t=1) valueX valueY
Varname_1 timestring(t=2) valueX valueY
...
Any ideas for smart code? Thank you! (Additional notes: varname = string, time = string, value = number with a comma as the decimal separator.)
------------------------------------ EDIT: example data
The output would be, e.g.:
var2 10:10:10 16,1010138923
var2 10:10:20 89,1560542863
var2 10:10:30 69,557621819
var2 10:10:40 9,9246195517
3 Comments
dpb
on 6 Jul 2016
That, I think, you'll have to fix up outside MATLAB; I don't think it knows how to handle it. If the file is comma-separated, that's a problem for certain.
Accepted Answer
Image Analyst
on 6 Jul 2016
Use readtable() and then search column 1 for the filename pattern you want. Attach a small example with wanted and unwanted filenames if you can't figure it out.
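A minimal sketch of what that could look like, under stated assumptions: the sample file, the semicolon delimiter, and the column names Name/Time/X/Y are all made up for illustration (the question does not say which delimiter the real file uses). Every column is read as text, and the comma decimals are converted afterwards with strrep and str2double.

```matlab
% Write a tiny sample file (made-up data; ';' as delimiter is an assumption).
fname = [tempname '.csv'];
fid = fopen(fname,'w');
fprintf(fid,'var1;10:10:10;1,5;7,25\nvar2;10:10:10;16,1010138923;3,0\n');
fprintf(fid,'var1;10:10:20;2,5;8,25\nvar2;10:10:20;89,1560542863;4,0\n');
fclose(fid);

% Read all four columns as text so the comma decimals are not misparsed.
T = readtable(fname,'Delimiter',';','ReadVariableNames',false, ...
              'Format','%s%s%s%s');
T.Properties.VariableNames = {'Name','Time','X','Y'};  % hypothetical names

% Keep only the rows whose first column matches the wanted name.
sel = T(strcmp(T.Name,'var2'),:);

% Convert "16,1010138923" style text to numbers.
x = str2double(strrep(sel.X,',','.'));
y = str2double(strrep(sel.Y,',','.'));

delete(fname);  % remove the temporary sample file
```

Note that this still loads the whole file before filtering, so it trades memory for simplicity.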
0 Comments
More Answers (2)
dpb
on 6 Jul 2016
Edited: dpb
on 6 Jul 2016
Untested, but check that the pattern matching format string doesn't solve the problem directly...
vName='Varname_1'; % the variable name you're looking for
fmt=[vName '%s %f %f']; % match vName, string, two numerics
fid=fopen('yourbigfile.csv','r');
data=textscan(fid,fmt,'delimiter',',');
fid=fclose(fid);
As said I'm not positive, but I think there's at least a reasonable chance the pattern-matching will do what you're looking for. Worth a shot methinks...
Well, doggonit, magic doesn't happen, joy didn't ensue... :(
But, the original idea isn't difficult...
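fid=fopen('yourbigfile.csv','r');
data={}; i=0; % collected rows and row counter
while ~feof(fid)
l=fgetl(fid); % read one line
if ~isempty(strfind(l,vName)) % line contains the wanted name
i=i+1;
data{i}=textscan(l,fmt); % parse the matching line
end
end
fid=fclose(fid);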
This worked for a sample file, although I used a space-delimited file with '.' as the decimal indicator; I think the comma decimal will still be a problem.
I thought
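fid=fopen('yourbigfile.csv','r'); i=0;
while ~feof(fid)
l=fgetl(fid); % read one line
try
i=i+1;
data{i}=textscan(l,fmt); % hoped non-matching lines would raise an error
catch
end
end
fid=fclose(fid);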
would work around the issue, but it didn't; textscan simply gave up and stopped reading anything once it failed. It doesn't throw an error, it just throws up its hands silently. :(
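For what it's worth, here is a rough sketch of the line-by-line idea that sidesteps textscan for the parsing step. It assumes whitespace-separated fields with a single value column, as in the example data from the question; the sample file contents and the names times and vals are invented for illustration. The comma decimal is swapped for a dot before str2double.

```matlab
% Write a tiny sample file (made-up data in the layout of the question's example).
fname = [tempname '.csv'];
fid = fopen(fname,'w');
fprintf(fid,'var1 10:10:10 1,5\nvar2 10:10:10 16,1010138923\n');
fprintf(fid,'var1 10:10:20 2,5\nvar2 10:10:20 89,1560542863\n');
fclose(fid);

vName = 'var2';           % row name to keep
n = numel(vName);
times = {};  vals = [];   % results (hypothetical names)

fid = fopen(fname,'r');
while true
    l = fgetl(fid);
    if ~ischar(l), break; end   % fgetl returns -1 at end of file
    % Keep the line only if it starts with vName followed by whitespace,
    % so that 'var2' does not also match 'var20'.
    if numel(l) > n && strncmp(l,vName,n) && isspace(l(n+1))
        parts = strsplit(strtrim(l));   % split on whitespace
        times{end+1} = parts{2};
        vals(end+1)  = str2double(strrep(parts{3},',','.'));
    end
end
fclose(fid);
delete(fname);  % remove the temporary sample file
```

Growing times and vals inside the loop is fine for a sketch; for a large file, preallocating or collecting the matching lines first would likely be faster.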
3 Comments
dpb
on 6 Jul 2016
I used textscan not csvread, IA???
He's also got comma as the decimal indicator and says he's got a .csv file in which case it's indeterminable--which comma is a delimiter and which is a decimal point?
Lorenzo
on 6 Jul 2016
1 Comment
Steven Hunsinger
on 14 Sep 2022
Not so lightning fast if you get your company network involved. 67.5MB with a breakpoint after readtable. 10 minutes. This might be OK if I need all that data loaded into RAM, but seems excessive for reading the first line or so. Is there a better way?
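One thing that might be worth trying for the "first line or so" case is to skip readtable and pull only the leading lines with fopen and fgetl. This is a sketch only: the sample file and the variable names nLines and firstLines are made up, and whether it actually helps over a slow network share has not been measured here.

```matlab
% Write a tiny sample file (made-up data) to stand in for the large one.
fname = [tempname '.csv'];
fid = fopen(fname,'w');
fprintf(fid,'var1 10:10:10 1,5\nvar2 10:10:10 2,5\nvar3 10:10:10 3,5\n');
fclose(fid);

nLines = 2;                      % how many leading lines to read
firstLines = cell(nLines,1);

fid = fopen(fname,'r');
for k = 1:nLines
    l = fgetl(fid);              % reads one line, without the newline
    if ~ischar(l)                % file has fewer than nLines lines
        firstLines = firstLines(1:k-1);
        break;
    end
    firstLines{k} = l;
end
fclose(fid);
delete(fname);  % remove the temporary sample file
```

Each line in firstLines is still raw text, so it would need to be split and converted separately.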