Why are some csv files imported incorrectly into my cell array?

lil brain on 13 Dec 2022
Edited: Stephen23 on 14 Dec 2022
Hi,
I have a cell array called alldata which contains the contents of 24 csv files. However, after importing these files I can see that the last five (for example the csv file 5422_task.csv) have been imported incorrectly: the first column includes two values (separated by a comma) with an apostrophe in front.
alldata{1, 24}
ans =
1216×3 cell array
{'media_open; media_play; medi…'} {'2022/09/23 15:06:18:984'} {' 2022/09/23 15:11:37:652"'}
{'Multimedia File,"task_com.Ut…'} {1×1 missing } {1×1 missing }
{'Lower Label,"Weak Presence"' } {1×1 missing } {1×1 missing }
{'Upper Label,"Strong Presence"' } {1×1 missing } {1×1 missing }
{'Minimum Value,-100' } {1×1 missing } {1×1 missing }
{'Maximum Value,100' } {1×1 missing } {1×1 missing }
{'Number of Steps,9' } {1×1 missing } {1×1 missing }
{'Second,"Rating"' } {1×1 missing } {1×1 missing }
{'%%%%%%,"%%%%%%"' } {1×1 missing } {1×1 missing }
{'10.5,96.09' } {1×1 missing } {1×1 missing }
{'10.75,96.09' } {1×1 missing } {1×1 missing }
{'11,96.09' } {1×1 missing } {1×1 missing }
{'11.25,96.16375' } {1×1 missing } {1×1 missing }
{'11.5,96.45875' } {1×1 missing } {1×1 missing }
On the other hand, all the other csv files have been imported correctly, so that the two values appear in separate columns (for example the csv file 1311_task.csv).
alldata{1, 1}
ans =
682×3 cell array
{'media_open; media_play; medi…'} {'2022/09/19 14:42:27:371' } {' 2022/09/19 14:54:07:167"'}
{'Multimedia File' } {'com.UtrechtUniversity.XRPS_Q…'} {1×1 missing }
{'Lower Label' } {'Negative Affect' } {1×1 missing }
{'Upper Label' } {'Positive Affect' } {1×1 missing }
{'Minimum Value' } {[ -100]} {1×1 missing }
{'Maximum Value' } {[ 100]} {1×1 missing }
{'Number of Steps' } {[ 9]} {1×1 missing }
{'Second' } {'Rating' } {1×1 missing }
{'%%%%%%' } {'%%%%%%' } {1×1 missing }
{[ 1]} {[ 0.7800]} {1×1 missing }
{[ 2]} {[ 0.8975]} {1×1 missing }
{[ 3]} {[ 0.7800]} {1×1 missing }
{[ 4]} {[ 0.7800]} {1×1 missing }
{[ 5]} {[ 0.8385]} {1×1 missing }
{[ 6]} {[ 0.7800]} {1×1 missing }
{[ 7]} {[ 0.7800]} {1×1 missing }
Any idea why this might be the case?
Thank you!

Accepted Answer

Voss on 13 Dec 2022
"Any idea why this might be the case?"
It's because the different files have commas and semicolons in different places, e.g. line 10 of 1311_task.csv looks like this:
1;0.78;
but line 10 of 5422_task.csv looks like this:
10.5,96.09;;
So in one file you've got a semicolon after each number, and in the other file a comma in between the numbers and two semicolons at the end of the line.
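One way to see this difference for yourself is to count the delimiter characters in the raw text. The following is only a minimal sketch: it uses the two sample lines quoted above rather than the real files, and the variable names (sampleLines, nComma, nSemi) are made up for illustration.

```matlab
% Sample lines copied from the answer above (line 10 of each file).
sampleLines = ["1;0.78;", "10.5,96.09;;"];

% count works element-wise on string arrays.
nComma = count(sampleLines, ",");
nSemi  = count(sampleLines, ";");

% Print one summary row per sample line.
for k = 1:numel(sampleLines)
    fprintf('%-14s commas: %d, semicolons: %d\n', ...
        sampleLines(k), nComma(k), nSemi(k));
end
```

To check an actual file, something like txt = readlines('5422_task.csv'); txt(10) should show the raw tenth line (readlines needs R2020b or later).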
I don't know what function(s) you're using to import the files, but here's an attempt to handle both of those situations with one piece of code:
files = {'1311_task.csv' '5422_task.csv'};
C = cell(1,numel(files));
for ii = 1:numel(files)
C{ii} = readcell(files{ii},'Delimiter',{',' ';'},'ConsecutiveDelimitersRule','join');
end
C{:}
ans = 682×3 cell array
{'media_open; media_play; media_end'} {'2022/09/19 14:42:27:371' } {' 2022/09/19 14:54:07:167"'}
{'Multimedia File' } {'com.UtrechtUniversity.XRPS_Quest-20220919-135434.mkv'} {1×1 missing }
{'Lower Label' } {'Negative Affect' } {1×1 missing }
{'Upper Label' } {'Positive Affect' } {1×1 missing }
{'Minimum Value' } {[ -100]} {1×1 missing }
{'Maximum Value' } {[ 100]} {1×1 missing }
{'Number of Steps' } {[ 9]} {1×1 missing }
{'Second' } {'Rating' } {1×1 missing }
{'%%%%%%' } {'%%%%%%' } {1×1 missing }
{[ 1]} {[ 0.7800]} {1×1 missing }
{[ 2]} {[ 0.8975]} {1×1 missing }
{[ 3]} {[ 0.7800]} {1×1 missing }
{[ 4]} {[ 0.7800]} {1×1 missing }
{[ 5]} {[ 0.8385]} {1×1 missing }
{[ 6]} {[ 0.7800]} {1×1 missing }
{[ 7]} {[ 0.7800]} {1×1 missing }
ans = 1216×3 cell array
{'media_open; media_play; media_end,"2022/09/23 15:06:11:215' } {'2022/09/23 15:06:18:984'} {' 2022/09/23 15:11:37:652"'}
{'Multimedia File,"task_com.UtrechtUniversity.XRPS_Quest-20220923-142855.mkv"'} {1×1 missing } {1×1 missing }
{'Lower Label,"Weak Presence"' } {1×1 missing } {1×1 missing }
{'Upper Label,"Strong Presence"' } {1×1 missing } {1×1 missing }
{'Minimum Value' } {[ -100]} {1×1 missing }
{'Maximum Value' } {[ 100]} {1×1 missing }
{'Number of Steps' } {[ 9]} {1×1 missing }
{'Second,"Rating"' } {1×1 missing } {1×1 missing }
{'%%%%%%,"%%%%%%"' } {1×1 missing } {1×1 missing }
{[ 10.5000]} {[ 96.0900]} {1×1 missing }
{[ 10.7500]} {[ 96.0900]} {1×1 missing }
{[ 11]} {[ 96.0900]} {1×1 missing }
{[ 11.2500]} {[ 96.1637]} {1×1 missing }
{[ 11.5000]} {[ 96.4587]} {1×1 missing }
{[ 11.7500]} {[ 96.0900]} {1×1 missing }
{[ 12]} {[ 96.0900]} {1×1 missing }
As you can see there, the header info (lines 1-9) is not parsed the same between the two files, but the data section (lines 10-end) is, so maybe that's good enough?
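If the header lines of the comma-style files also need to be split, one possible approach is to process the raw text directly instead of relying on the delimiter options. The sketch below is not part of the original answer. It assumes every line contains exactly one comma between the two fields and is padded with semicolons at the end, and the sample lines are hypothetical stand-ins written in the style of 5422_task.csv. The first header line, which has semicolons inside a quoted field, may need separate handling.

```matlab
% Hypothetical raw lines in the style of 5422_task.csv
% ("" inside a string literal is one literal double quote).
raw = ["Lower Label,""Weak Presence"";;"
       "Minimum Value,-100;;"
       "10.5,96.09;;"];

raw   = regexprep(raw, ';+$', '');   % drop the trailing semicolons
parts = split(raw, ",");             % 3-by-2 string array (one comma per line assumed)
parts = strip(parts, '"');           % remove surrounding double quotes

% Numeric conversion: text that is not a number becomes NaN.
vals = str2double(parts);

disp(parts)
```

With real data the lines could come from readlines (R2020b or later) instead of the literal array, and split will throw an error if the lines do not all contain the same number of commas, which is a useful early warning that the one-comma assumption does not hold.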
  5 Comments
lil brain on 14 Dec 2022
It seems that this error appears no matter what files I select. It is always the first file in the list though.
Stephen23 on 14 Dec 2022
"Why is that?"
Forgot the path, see fixed code.
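The fixed code referred to here is not shown in this thread. As a rough illustration only, one common way to include the path is to build each file name with fullfile from the output of dir. The folder name task_csv_demo and the stand-in file written below are assumptions made so that the sketch runs on its own. The readcell options are the ones from the accepted answer.

```matlab
% Hypothetical folder; replace P with the folder that holds the csv files.
P = fullfile(tempdir, 'task_csv_demo');
if ~isfolder(P)
    mkdir(P);
end

% Write one small stand-in file so the sketch is self-contained.
fid = fopen(fullfile(P, '1311_task.csv'), 'w');
fprintf(fid, '1;0.78;\n');
fclose(fid);

% dir returns both the folder and the name of each matching file.
S = dir(fullfile(P, '*_task.csv'));
C = cell(1, numel(S));
for ii = 1:numel(S)
    F = fullfile(S(ii).folder, S(ii).name);   % full path, not just the name
    C{ii} = readcell(F, 'Delimiter', {',' ';'}, ...
        'ConsecutiveDelimitersRule', 'join');
end
```

Passing only S(ii).name works when the files are in the current folder, and fails otherwise, which is why the folder is joined in explicitly.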
