MATLAB Answers

reading mixed format csv data with empty value '-'

Asked by Jiali
on 18 Dec 2014
Latest activity Edited by per isakson
on 19 Dec 2014
The data has mixed formats. I list only the floating-point columns here to show the problem.
1 1 100.3 -45000 -0.23 0.2555 600000
2 1 100.3 -45000 -0.23 0.2555 800000
3 1 100.3 -45000 -0.23 0.2555 800000
4 1 - - - - -
5 1 - - - - -
I cannot delete the empty lines, since I need to know where they are. But when I use
textscan(fid,'%f %f %f %f %f %f %f'),
I have trouble with the class of every column. And if I use
'TreatAsEmpty', '-'
inside textscan, all the negative values are read incorrectly. Does anyone have any suggestions?


1 Answer

Answer by per isakson
on 19 Dec 2014
Edited by per isakson
on 19 Dec 2014

If the file, together with the parsed result, fits in memory, try
>> out = cssm('cssm.txt')
out =
1.0e+05 *
0.0000 0.0000 0.0010 -0.4500 -0.0000 0.0000 6.0000
0.0000 0.0000 0.0010 -0.4500 -0.0000 0.0000 8.0000
0.0000 0.0000 0.0010 -0.4500 -0.0000 0.0000 8.0000
0.0000 0.0000 NaN NaN NaN NaN NaN
0.0001 0.0000 NaN NaN NaN NaN NaN
>>
where
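function out = cssm( filespec )
% Read the whole file into one character vector.
str = fileread( filespec );
% Replace each lone '-' (whitespace on both sides) with 'nan'; minus signs of numbers are left alone.
str = regexprep( str, '(?<=\s)\-(?=\s)', 'nan' );
% Parse seven numeric columns; textscan reads 'nan' as NaN.
out = cell2mat(textscan(str,'%f%f%f%f%f%f%f','CollectOutput',true ));
end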
and where cssm.txt contains
1 1 100.3 -45000 -0.23 0.2555 600000
2 1 100.3 -45000 -0.23 0.2555 800000
3 1 100.3 -45000 -0.23 0.2555 800000
4 1 - - - - -
5 1 - - - - -
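Since the question asks where the empty lines are, one possible follow-up is to look for NaN in the parsed matrix. This is only a minimal sketch: it assumes out holds the values from cssm.txt (typed in as a literal here so the snippet runs on its own), and emptyRows is just an illustrative name.

```matlab
% Parsed result for cssm.txt, written out as a literal so that the
% snippet does not depend on the file (assumption for illustration).
out = [1 1 100.3 -45000 -0.23 0.2555 600000
       2 1 100.3 -45000 -0.23 0.2555 800000
       3 1 100.3 -45000 -0.23 0.2555 800000
       4 1 NaN   NaN    NaN   NaN    NaN
       5 1 NaN   NaN    NaN   NaN    NaN];

% A row counts as "empty" here if any of its columns is NaN.
emptyRows = find(any(isnan(out), 2));
```

With this input, emptyRows should contain the row numbers 4 and 5.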
cssm.m above fails if the last character of the file is -. To fix that, replace
str = regexprep( str, '(?<=\s)\-(?=\s)', 'nan' );
by
str = regexprep( str, '(?<=\s)\-(?=\s|$)', 'nan' );
to match the character, -, when it is the last character of the string.
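To illustrate the corrected expression, here is a small sketch that works on an in-memory string instead of a file. The sample text and the four-column format are made up for the example; the string deliberately ends in - with no trailing newline.

```matlab
% Made-up sample: two rows, four columns, last character is '-'.
str = sprintf('1 1 100.3 -45000\n2 1 - -');

% A lone '-' is one preceded by whitespace and followed by whitespace
% or the end of the string; the minus sign of -45000 is followed by a
% digit, so it is left alone.
fixed = regexprep( str, '(?<=\s)\-(?=\s|$)', 'nan' );

% Parse the four numeric columns; textscan reads 'nan' as NaN.
out = cell2mat(textscan(fixed,'%f%f%f%f','CollectOutput',true ));
```

Here out should be a 2-by-4 matrix whose second row ends in two NaN values, while -45000 keeps its sign.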
