Reading a string to get required data

Tom Pearce on 27 Mar 2011
I'm trying to write a program that reads HTML code in order to extract data for an analysis I'm conducting. I've managed to remove the HTML jargon and am now left with a string that contains the data I require. However, I'm now trying to convert the data into a readable cell array:
Mar25,2011>4.88>4.88>4.83>4.88>51,000>Mar24,2011>4.72>4.72>4.72>4.72>13,300>Mar22,2011>4.88>4.88>4.88>4.88>0>Mar18,2011>5.00>5.00>5.00>5.00>0>Mar17,2011>4.81>4.89>4.81>4.89>1,001>
I know this may seem rather simple to most of you, but I'm new to MATLAB. Basically, I'm trying to convert this string into a column array: first the date, followed by the five successive numbers, for the whole data set. Any help on this would be greatly appreciated.

Answers (2)

Walter Roberson on 27 Mar 2011
textscan('%s%f%f%f%f%f', 'Delimiter', '>', 'CollectOutput', 1)
You might need to change the shapes around afterwards. I am not clear on what you are envisioning for a "column array".
2 Comments
Tom Pearce on 29 Mar 2011
Basically, I just want the data in a list (6 columns wide) that I can write to a file and use to produce a graph. I've tried textscan, but it keeps returning {0x1} [0x1] [0x1] [0x1] [0x1] [0x1]. Now that I realise I'm along the right lines, I will persevere. Thanks very much for your help.
Walter Roberson on 29 Mar 2011
Ah, you have commas in your fifth numeric field; that throws off parsing them as a number. Also I forgot to show the string field.
Let T be the string you have the line stored in. Then,
Q = textscan(T,'%s%f%f%f%f%s', 'Delimiter', '>');
Q{6} = str2double(regexprep(Q{6},',',''));
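As a minimal sketch of how those two lines behave, the snippet below runs them on the first two records of the string from the question. Stripping the trailing '>' first is a precaution added here, not part of the answer above, and the variable names dates, prices and volume are only illustrative.

```matlab
% First two records of the string from the question.
T = 'Mar25,2011>4.88>4.88>4.83>4.88>51,000>Mar24,2011>4.72>4.72>4.72>4.72>13,300>';

% Assumption: drop a trailing delimiter so that no partial record is started.
T = regexprep(T, '>$', '');

% Read the date and the volume as strings and the four prices as numbers.
Q = textscan(T, '%s%f%f%f%f%s', 'Delimiter', '>');

% Remove the thousands separators, then convert the volume to numeric.
Q{6} = str2double(regexprep(Q{6}, ',', ''));

dates  = Q{1};       % cell array of date strings, one per record
prices = [Q{2:5}];   % numeric matrix, one row per record, four price columns
volume = Q{6};       % numeric column vector
```

Keeping the dates in a cell array and the numbers in a matrix makes it straightforward to plot or write out the numeric columns afterwards.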



Clemens on 29 Mar 2011
Personally I don't remove the "html jargon" in such cases. I use regexps like:
table_lines = regexp(table,'<tr [^>]*>(.*?)</tr>','tokens');
table_line_entry = regexp(table_line,'<td [^>]*>(.*?)</td>','tokens');
This has the advantage that it keeps the structure information of the original table.
Also, you might run into problems if a table cell contains HTML code, or just a ">" sign.
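A rough sketch of how the two patterns could be combined is shown below. The HTML fragment is hypothetical, since the real page markup is not shown in this thread, and the names html, rows, cells and nums are made up for illustration. Note that the patterns as written expect a space after the tag name, so a bare <tr> or <td> without attributes would not match.

```matlab
% Hypothetical HTML fragment; the actual page source is an assumption.
html = ['<table>' ...
        '<tr class="r"><td class="d">Mar25,2011</td><td class="n">4.88</td><td class="n">51,000</td></tr>' ...
        '<tr class="r"><td class="d">Mar24,2011</td><td class="n">4.72</td><td class="n">13,300</td></tr>' ...
        '</table>'];

% Each match comes back as a 1-by-1 cell holding the captured row body.
rows = regexp(html, '<tr [^>]*>(.*?)</tr>', 'tokens');

% Assumes every row has exactly three cells, as in the fragment above.
cells = cell(numel(rows), 3);
for k = 1:numel(rows)
    entries = regexp(rows{k}{1}, '<td [^>]*>(.*?)</td>', 'tokens');
    cells(k, :) = [entries{:}];   % flatten the nested token cells
end

% Convert the numeric columns, removing thousands separators first.
nums = str2double(strrep(cells(:, 2:3), ',', ''));
```

Because each row is parsed on its own, a row with a missing or extra cell shows up as an error at that row instead of silently shifting every later value.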
