How can I read a file with two different sets of columns and multiple delimiters?
If no delimiter is passed to importdata, the whole file is read as a single column of data. If the comma and tab delimiters are used together, the first part of the file has about 15 columns and 1000 rows, and the second part has 200 columns and 3000 rows.
So I think the best way to code this is in two steps: first extract the single column using the tab delimiter, save that data as another delimited file, and then read that file again using a comma delimiter.
Is this the right way to do it?
3 Comments
Brendan Hamm
on 16 Sep 2015
One would really need to see the data you are talking about to have a good answer for this. It may be that importdata is not the function you wish to use, or it may be that you require reading in these different sections at different times.
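Reading the sections at different times does not necessarily require an intermediate file. A minimal sketch, assuming the first section is tab-delimited with a known row count and the second is comma-delimited after a known number of header lines: the file is read twice through textscan, once per delimiter. The tiny file written here, and the names nRows1, nCols1, nHdr and nCols2, are hypothetical stand-ins for the real data.

```matlab
% Hypothetical stand-in for the real file: 3 tab-delimited rows of 3 values,
% one text header line, then 2 comma-delimited rows of 4 values.
fname = [tempname '.txt'];
fid = fopen(fname,'w');
fprintf(fid,'1\t2\t3\n4\t5\t6\n7\t8\t9\n');
fprintf(fid,'second section\n');
fprintf(fid,'10,20,30,40\n50,60,70,80\n');
fclose(fid);

nRows1 = 3; nCols1 = 3;   % assumed size of the first section
nHdr   = 1;               % assumed header lines before the second section
nCols2 = 4;               % assumed width of the second section

fid = fopen(fname,'r');
% Pass 1: read exactly nRows1 rows using the tab delimiter
c1 = textscan(fid, repmat('%f',1,nCols1), nRows1, ...
              'Delimiter','\t', 'CollectOutput',1);
% Pass 2: rewind, skip the first section plus its header, read with commas
frewind(fid);
c2 = textscan(fid, repmat('%f',1,nCols2), ...
              'Delimiter',',', 'HeaderLines',nRows1+nHdr, 'CollectOutput',1);
fclose(fid);
delete(fname);

sec1 = c1{1};   % numeric matrix holding the first section
sec2 = c2{1};   % numeric matrix holding the second section
```

Rewinding before the second pass means the number of skipped lines is counted from the top of the file, so the sketch does not depend on where the first call left the file position.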
Harshith Nutulapati
on 16 Sep 2015
Harshith Nutulapati
on 16 Sep 2015
Edited: Harshith Nutulapati
on 16 Sep 2015
Answers (2)
It's not clear what the two rows at the beginning of the green section are: are they also header rows, or actually data? Also, is the last column a column-number indicator for the given row, a line number or some such, or is it actually the 200th column value? In other words, are the lines actually the same length or not?
Assuming there really are 200 columns in the second section and the first two lines of that section are headers, then
data=csvread('yourfile.csv',1002,0);
should work just fine. NB: the row offset in csvread is zero-based, so an offset of 1002 skips the first 1002 lines (the 1000 lines of the first section plus the two header lines) and reading starts at line 1003. If either assumption above is wrong, clarify the actual situation in detail.
Of course, textscan or any number of alternatives is possible as well...
fid=fopen('yourfile.csv');
data=cell2mat(textscan(fid,repmat('%f',1,200), 'delimiter',',', ...
                       'headerlines',1002, ...
                       'collectoutput',1));
fclose(fid);
is the equivalent.
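To make the offset convention concrete, here is a small sketch that checks a csvread row offset against the textscan 'headerlines' count on a throwaway file. The file contents and the variable nSkip are made up for illustration, and every row is numeric so the check does not depend on how a given release treats text in skipped lines.

```matlab
% Hypothetical stand-in file: 5 numeric rows, 3 columns each
fname = [tempname '.csv'];
fid = fopen(fname,'w');
fprintf(fid,'%d,%d,%d\n', reshape(1:15,3,5));   % rows: 1,2,3 / 4,5,6 / ...
fclose(fid);

nSkip = 2;   % number of leading lines to skip (assumed)

% csvread: the row offset is zero-based, so an offset of nSkip
% starts reading at line nSkip+1
a = csvread(fname, nSkip, 0);

% textscan: 'HeaderLines' is a count of lines to skip
fid = fopen(fname,'r');
b = cell2mat(textscan(fid, repmat('%f',1,3), 'Delimiter',',', ...
                      'HeaderLines',nSkip, 'CollectOutput',1));
fclose(fid);
delete(fname);
```

Both a and b should hold the last three rows of the file, which is one way to confirm the count before applying it to the real data.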
Kirby Fears
on 16 Sep 2015
Try using delimread. Download the function from the File Exchange and add its folder to your MATLAB path using the addpath() function.
You can specify what rows you want to read and what the delimiter is.
out=delimread('harshith.txt',',','num',[4 7],[]);
disp(out.num);
I used the sample file below with the code above and it worked fine.
a b c d e f g h i j k
a b c d e f g
a b c d e f g h i j k
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20
If your data is not purely numerical, you can get numerical only, text only, mixed output, etc.
1 Comment
There's no apparent need for a File Exchange routine; the builtin dlmread will do (or, since the file is comma-delimited, csvread is even simpler).
I copied your sample file and it also works just fine with those. Of course, as noted above, the line offset is zero-based, so the syntax for the above file is
data=csvread('filename',3,0);
or
data=dlmread('filename',',',3,0);
At one time there were "issues" with csvread and friends on files containing nonnumeric data even if it was to be skipped over by the row count; this seems to have been alleviated with recent versions. I checked with your file and R12 handled it fine; R11 read the numeric values but returned a column vector instead of the 2D array. R11 is the earliest version I still have installed.
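For anyone wanting to repeat that check, the following sketch writes the sample file from the answer above to a temporary location and reads it with both calls. The temporary file name is arbitrary, and whether the text header lines are tolerated depends on the release, as noted in the comment above, so treat this as something to try rather than a guarantee.

```matlab
% Recreate the sample file: 3 text header lines, then 4 rows of 1..20
fname = [tempname '.txt'];
fid = fopen(fname,'w');
fprintf(fid,'a b c d e f g h i j k\n');
fprintf(fid,'a b c d e f g\n');
fprintf(fid,'a b c d e f g h i j k\n');
for k = 1:4
    fprintf(fid,'%d,',1:19);   % first 19 values, each followed by a comma
    fprintf(fid,'20\n');       % last value ends the line
end
fclose(fid);

% Row offset 3 (zero-based) skips the three header lines
d1 = csvread(fname,3,0);
d2 = dlmread(fname,',',3,0);
delete(fname);

disp(size(d1));   % size of the numeric block that was read
```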
