How can I read a file with two different sets of columns and multiple delimiters?

Question

0 votes

If no delimiter is given to the 'importdata' command, the whole file comes back as just one column of data. If the comma and tab delimiters are used together in importdata, however, the first part of the file contains about 15 columns and 1000 rows of data, and the second part contains 200 columns and 3000 rows.

So I think the best way to code this is in two steps: first extract the single column using the tab delimiter, save that data as another delimited file, and then redo the whole thing using a comma delimiter.

Is this the right way to do it?

3 Comments

Harshith Nutulapati on 16 Sep 2015

Edited: Harshith Nutulapati on 16 Sep 2015

I also tried various other functions: textscan, dlmread/dlmwrite, fscanf/fread, csvread, etc. I don't need the first part of the data; I only need the comma-delimited part. I actually got the data as a single row using the importdata command without specifying any delimiters, and deleted the first part of the data using 'cellfun'. But I have not yet figured out how to process the rest of the data.
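One way to keep only the comma-delimited part, without saving an intermediate file, can be sketched as follows. This is a minimal sketch, not something posted in the thread: it assumes that no line in the first section contains a comma and that the first line containing a comma is already numeric data (any header lines in the second section would have to be added to the skip count). The temporary file and the variable names `fname`, `nskip`, and `tline` are hypothetical stand-ins.

```matlab
% Build a small stand-in file: 2 tab-delimited lines, then 3 comma-delimited rows.
fname = [tempname '.txt'];                 % hypothetical temporary file
fid = fopen(fname, 'w');
fprintf(fid, 'a\tb\tc\n1\t2\t3\n');
fprintf(fid, '10,20,30,40\n11,21,31,41\n12,22,32,42\n');
fclose(fid);

% Count the lines that precede the first line containing a comma.
% Assumption: the first section never contains a comma.
fid = fopen(fname, 'r');
nskip = 0;
tline = fgetl(fid);
while ischar(tline) && isempty(strfind(tline, ','))
    nskip = nskip + 1;
    tline = fgetl(fid);
end
fclose(fid);

% Read only the comma-delimited block. The row offset is zero-based,
% so passing nskip starts at the first comma-delimited line.
data = dlmread(fname, ',', nskip, 0);
delete(fname);
```

With a real file, the first block that writes the stand-in file would be dropped and `fname` set to the actual file name.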


Answer 1

dpb on 16 Sep 2015

Edited: dpb on 17 Sep 2015

0 votes

It's not clear what the two rows at the beginning of the green section are: are they also header rows, or actually data? Also, is the last column a column-number indicator for the given row, a line number, or some such, or is it actually the 200th column value? In other words, are the lines actually the same length or not?

Assuming there really are 200 columns in the second section and the first two lines are headers, then

data=csvread('yourfile.csv',1002-1,0);

should work just fine. NB: the "-1" is because the row offset in csvread is zero-based; I wrote it that way to emphasize the assumption above that 1000 lines in the first section plus two header lines are being skipped. If either assumption above is wrong, clarify the actual situation in detail.

Of course, textscan or any number of alternatives is possible as well...

fid=fopen('yourfile.csv');
data=cell2mat(textscan(fid,repmat('%f',1,200), 'delimiter',',', ...
                                               'headerlines',1002, ...
                                               'collectoutput',1));
fid=fclose(fid);

is the equivalent.
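If the column count is not known for certain, the textscan format string can be built from the first data line instead of hard-coding 200. The following is only a sketch under stated assumptions: the number of lines to skip (`nhead`) is known, every data line has the same number of comma-separated fields, and the small temporary file stands in for the real one. The names `fname`, `nhead`, `first`, and `ncol` are hypothetical.

```matlab
% Stand-in file: 2 lines to skip, then 2 comma-delimited numeric rows.
fname = [tempname '.csv'];                 % hypothetical temporary file
fid = fopen(fname, 'w');
fprintf(fid, 'header one\nheader two\n');
fprintf(fid, '1,2,3\n4,5,6\n');
fclose(fid);

nhead = 2;                                 % assumed number of lines to skip

fid = fopen(fname, 'r');
for k = 1:nhead
    fgetl(fid);                            % discard the skipped lines
end
first = fgetl(fid);                        % first data line
ncol = numel(strfind(first, ',')) + 1;     % fields = commas + 1
frewind(fid);                              % back to the start of the file

% One %f per column; CollectOutput merges them into a single matrix.
C = textscan(fid, repmat('%f', 1, ncol), 'Delimiter', ',', ...
             'HeaderLines', nhead, 'CollectOutput', 1);
fclose(fid);
data = C{1};
delete(fname);
```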


Answer 2

Kirby Fears on 16 Sep 2015

0 votes

Try using delimread. Download the function and add it to your MATLAB path using the addpath() function.

You can specify what rows you want to read and what the delimiter is.

 out=delimread('harshith.txt',',','num',[4 7],[]);
 disp(out.num);

I used the sample file below with the code above and it worked fine.

a b c d e f g h i j k
a b c d e f g
a b c d e f g h i j k
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20

If your data is not purely numerical, you can get numerical only, text only, mixed output, etc.

1 Comment

dpb on 16 Sep 2015

Edited: dpb on 16 Sep 2015

There's no apparent need for a File Exchange routine; the built-in dlmread will do (or, since the data is comma-delimited, csvread is even simpler).

I copied your sample file and it also works just fine; of course, as noted above, the line offset is zero-based, so the syntax for the above file is

data=csvread('filename',3,0);

or

data= dlmread('filename',',',3,0)

At one time there were "issues" with csvread and friends on files containing nonnumeric data even if it was to be skipped over by the row count; this seems to have been alleviated with recent versions. I checked with your file and R12 handled it fine; R11 read the numeric values but returned a column vector instead of the 2D array. R11 is the earliest version I still have installed.
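For anyone who wants to repeat that check, here is a minimal sketch that rebuilds the sample file from the answer (three text lines followed by four rows of 1 to 20) and reads it with both functions. The temporary file name and the variables `fname`, `row`, `d1`, and `d2` are hypothetical; the behavior on text lines inside the skipped range depends on the version, as described above.

```matlab
% Recreate the sample file: 3 text lines, then 4 rows of 1..20.
fname = [tempname '.txt'];                 % hypothetical temporary file
fid = fopen(fname, 'w');
fprintf(fid, 'a b c d e f g h i j k\n');
fprintf(fid, 'a b c d e f g\n');
fprintf(fid, 'a b c d e f g h i j k\n');
row = sprintf('%d,', 1:20);
row(end) = [];                             % drop the trailing comma
for k = 1:4
    fprintf(fid, '%s\n', row);
end
fclose(fid);

% Zero-based row offset: 3 means "start at the fourth line".
d1 = csvread(fname, 3, 0);
d2 = dlmread(fname, ',', 3, 0);
delete(fname);
```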
