How to import a CSV file (problem with an incorrectly written file)
Hello everyone!
I am using the UM7 sensor from CH Robotics, which can log data. Unfortunately it saves the log as a CSV file in which the records are written incorrectly. Why?
A comma should separate one number from the next, but in my file I have:
0,12334,0,0323233 (see the attached file).
These are in fact not four numbers but two numbers (written with a decimal comma). Unfortunately, MATLAB reads this as four numbers.
Does anyone have an idea how to solve this problem?
5 Comments
dpb
on 28 Jun 2017
Star has given a workaround. Who in their right mind would use the same character as both field delimiter and decimal point?!
I'd just ask whether the vendor has a way to configure either the delimiter or (preferably) the decimal point character, to fix the problem on the generation side and thus make it go away on the MATLAB/user end.
Or is it using the locale setting for your region, and if so, can you change that, at least while creating these files?
Matthew Jameson
on 28 Jun 2017
dpb
on 28 Jun 2017
Well, that is a bummer, indeed!
I'm no regexp guru; what's the regular expression pattern to change every other ',' to '.' in the file? Then, as long as the rule holds that every field contains a value with a decimal part, you could convert the file directly after the fix-up.
That assumption, of course, is the same one Star has made in writing the string he then parses; it is just a different way to get to that same text file.
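One possible answer to the regexp question above, as a minimal sketch: if every field really has both an integer and a fractional part, a pattern that pairs each integer part with the digits after the next comma converts every other comma without spelling out the number of columns. The sample text and variable names are made up for illustration; they are not taken from the attached file.

```matlab
% Hypothetical sample: one header line, then two values per row written
% with a decimal comma AND a comma as field separator.
raw = sprintf('t,gx\n12,9086,-0,0277\n12,9668,-0,0275\n');

% Pair "<integer part>,<fraction digits>" and rejoin them with a point.
% Matches are non-overlapping and found left to right, so the comma
% between two numbers is never consumed. Assumes every field has a
% fractional part and that the header contains no "digits,digits" run.
fixed = regexprep(raw, '([+-]?\d+),(\d+)', '$1.$2');

% Parse the repaired text as ordinary comma-separated data.
cac = textscan(fixed, '%f%f', 'Delimiter', ',', ...
               'HeaderLines', 1, 'CollectOutput', true);
num = cac{1};   % numeric matrix, one row per data line
```

For a real file, raw would come from reading the whole file into a string first. A field without a fractional part would shift the pairing for the rest of that line, so it is worth checking the column count of the result.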
@Matthew Jameson: are the files created on a computer which is set to use the decimal comma (e.g. most non-English European languages)? It is possible that the problem is caused by a hard-coded comma being used as the field separator, together with the computer's locale setting being used to generate the numeric strings.
If so, then the whole problem could be easily avoided by switching the computer's locale settings to use the decimal point instead of the decimal comma.
Matthew Jameson
on 5 Jul 2017
Accepted Answer
More Answers (2)
per isakson
on 30 Jun 2017
Edited: per isakson
on 30 Jun 2017
I ran an experiment with regular expressions on R2016a and encountered a problem: (?m), i.e. 'lineanchors', doesn't work as documented for a string in which CRLF is used as the newline sequence. It works with LF as the newline character. That's confusing. My solution is to replace fileread with txt2str, which returns a string with LF as the newline character.
str = txt2str('blaa.csv'); % read file to string. Ensure LF is used for newline.
xpr = [ '(?m)' ... let '^' and '$' match at the head and tail of each line
, '^' ... beginning of line
, '([+-]?\d+)' ... group integer part of 1st number
, ',' ... decimal separator
, '(\d+,[+-]?\d+)' ... group frac part of 1st number, list sep
... and integer part of 2nd number
, ',' ... decimal separator
, '(\d+,[+-]?\d+)' ... group frac part of 2nd number, list sep
... and integer part of 3rd number
, ',' ... decimal separator
, '(\d+,[+-]?\d+)' ... group frac part of 3rd number, list sep
... and integer part of 4th number
, ',' ... decimal separator
, '(\d+)' ... group frac part of 4th number
, '[ ]*$' ... trailing blanks and end of line
];
buf = regexprep( str, xpr, '$1.$2.$3.$4.$5' ); % insert decimal points
%
cac = textscan( buf, '%f%f%f%f', 'Delimiter',',', 'CollectOutput',true, 'Headerlines',1 );
num = cac{1,1};
Inspect the result:
>> num(1:3,:)
ans =
12.9086 -0.0277 -0.0320 -0.8833
12.9668 -0.0275 -0.0316 -0.8838
13.0281 -0.0270 -0.0315 -0.8842
>> whos num
Name Size Bytes Class Attributes
num 234x4 7488 double
2 Comments
Walter Roberson
on 30 Jun 2017
per:
Using fileread() and then deleting the CR characters would work.
It is true that lineanchors only expects newline. But you could use '\r?$' or '(?=\r?)$'
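The two suggestions above might look like this in practice. This is only a sketch on a made-up two-line string with CRLF line endings; the variable names are placeholders.

```matlab
% Hypothetical two-line sample with Windows (CRLF) line endings.
raw = sprintf('12,5\r\n13,5\r\n');

% Option 1: delete every CR so that only LF remains, then anchor as usual.
str  = strrep(raw, char(13), '');
tokA = regexp(str, '^(\d+),(\d+)$', 'tokens', 'lineanchors');

% Option 2: keep the CRLF text and allow an optional CR before '$'.
tokB = regexp(raw, '^(\d+),(\d+)\r?$', 'tokens', 'lineanchors');
```

Both calls should return one token pair per line. With a file on disk, raw would be the output of fileread().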
per isakson
on 30 Jun 2017
Edited: per isakson
on 30 Jun 2017
Walter
Yes and yes, but
- "delete the CR" adds clutter to the code and takes a bit of time, since the size of the string changes.
- "use '\r?$' or '(?=\r?)$'" adds clutter
"It is true that lineanchors only expect newline" Yes, that's true. And the same is true with 'dotexceptnewline'. IMO: this is a design mistake, since
- Matlab users typically don't want to bother about new line characters.
- PCRE (e.g. https://regex101.com/) handles CRLF and LF the same way (in these cases).
- I cannot think of any situation, in which the Matlab way offers an advantage.
The difference between txt2str and fileread is that txt2str opens the file with
[fid, msg] = fopen( filespec, 'rt' );
in place of
[fid, msg] = fopen(filename);
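Since txt2str itself is not shown in this thread, the following is only a guess at what such a helper does, based on the one-line difference quoted above. It assumes the documented text-mode behaviour of fopen on Windows, where reading in 'rt' mode drops the CR of each CRLF pair; on other platforms text mode changes nothing. The temporary file exists only to make the sketch self-contained.

```matlab
% Write a small CRLF-terminated sample file (illustration only).
filespec = [tempname, '.csv'];
fid = fopen(filespec, 'w');
fwrite(fid, sprintf('12,5\r\n13,5\r\n'));
fclose(fid);

% Read the whole file as one char row vector, opened in text mode ('rt').
[fid, msg] = fopen(filespec, 'rt');
assert(fid >= 0, msg);
str = fread(fid, [1, Inf], '*char');
fclose(fid);
delete(filespec);   % remove the temporary sample file
```

On Windows, str should then contain LF-only line breaks, which is what the (?m) anchors expect.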
Matthew Jameson
on 30 Jun 2017