Readtable not configuring delimiters correctly

Question

Andrea am 10 Okt. 2024

0
Verknüpfen

Direkter Link zu dieser Frage

https://de.mathworks.com/matlabcentral/answers/2158850-readtable-not-configuring-delimiters-correctly

Kommentiert: dpb am 10 Okt. 2024

experiments.csv

Hello,

I have a .csv file which, strangely enough, cannot be read correctly by readtable. Here you go with my code and the csv file.

Thank you in advance!

Andrea

P.S. I tried many versions of delimiters!

filename = 'experiments.csv';
opts = detectImportOptions(filename);
opts.Delimiter = ',';
T = readtable(filename,opts);

3 Kommentare
1 älteren Kommentar anzeigen1 älteren Kommentar ausblenden

Arjun am 10 Okt. 2024

In MATLAB Online öffnen

experiments.csv

You can try this code below and let me know if it worked for you:

filename = 'experiments.csv';
% Detect import options, skipping the initial lines
opts = detectImportOptions(filename, 'Delimiter', ',');
opts.DataLines = [9, Inf]; 
% Read the table with only the first column
opts.SelectedVariableNames = opts.VariableNames(1);
T = readtable(filename, opts);
% Split the first column into separate columns
splitData = cellfun(@(x) strsplit(x, ','), T.(1), 'UniformOutput', false);
splitDataMatrix = vertcat(splitData{:});
% Keep only the last three columns
lastThreeColumns = splitDataMatrix(:, end-2:end);
% Clean and convert the string values to double
cleanedColumns = cellfun(@(x) strtrim(x), lastThreeColumns, 'UniformOutput', false); % Trim whitespace
% Remove non-numeric characters
cleanedColumns = regexprep(cleanedColumns, '[^\d.-]', ''); 
% Convert to double
lastThreeColumnsDouble = cellfun(@str2double, cleanedColumns);
% Display the converted matrix
disp(lastThreeColumnsDouble);
         0    4.6084         0
0200    4.6086         0
0400    4.6133    0.0001
0600    4.6319    0.0001
0800    4.6620         0
1000    4.6944         0
1200    4.7266         0
1400    4.7598         0
1600    4.7942         0
1800    4.8289         0
2000    4.8633    0.0001
2200    4.8970    0.0001
2400    4.9303         0
2600    4.9634         0
2800    4.9967         0
3000    5.0302         0
3200    5.0638         0
3400    5.0972   -0.0001
3600    5.1306         0
3800    5.1639         0
4000    5.1972         0
4200    5.2305         0
4400    5.2639         0
4600    5.2972   -0.0001
4800    5.3306         0
5000    5.3639    0.0001
5200    5.3972    0.0002
5400    5.4305         0
5600    5.4638         0
5800    5.4971    0.0001
6000    5.5305         0
6200    5.5639         0
6400    5.5972    0.0011
6600    5.6305    0.0093
6800    5.6638    0.0233
7000    5.6970    0.0383
7200    5.7304    0.0552
7400    5.7638    0.0724
7600    5.7971    0.0887
7800    5.8305    0.1036
8000    5.8638    0.1164
8200    5.8972    0.1285
8400    5.9305    0.1403
8600    5.9638    0.1516
8800    5.9972    0.1624
9000    6.0305    0.1722
9200    6.0639    0.1812
9400    6.0972    0.1892
9600    6.1305    0.1961
9800    6.1638    0.2017
0000    6.1971    0.2066
0200    6.2304    0.2109
0400    6.2638    0.2147
0600    6.2971    0.2182
0800    6.3304    0.2214
1000    6.3637    0.2244
1200    6.3971    0.2273
1400    6.4305    0.2300
1600    6.4638    0.2324
1800    6.4971    0.2346
2000    6.5304    0.2366
2200    6.5637    0.2385
2400    6.5970    0.2402
2600    6.6304    0.2418
2800    6.6638    0.2433
3000    6.6972    0.2448
3200    6.7305    0.2460
3400    6.7638    0.2472
3600    6.7972    0.2483
3800    6.8306    0.2493
4000    6.8639    0.2502
4200    6.8972    0.2511
4400    6.9305    0.2520
4600    6.9639    0.2528
4800    6.9972    0.2537
5000    7.0306    0.2543
5200    7.0639    0.2549
5400    7.0972    0.2556
5600    7.1305    0.2562
5800    7.1638    0.2567
6000    7.1971    0.2573
6200    7.2304    0.2578
6400    7.2638    0.2584
6600    7.2971    0.2589
6800    7.3305    0.2593
7000    7.3638    0.2597
7200    7.3971    0.2601
7400    7.4304    0.2605
7600    7.4638    0.2609
7800    7.4972    0.2614
8000    7.5305    0.2618

The final values are contained in "lastThreeColumnsDouble".

Stephen23 am 10 Okt. 2024

Bearbeitet: Stephen23 am 10 Okt. 2024

In MATLAB Online öffnen

experiments.csv

"I have a .csv file which, strangely enough, cannot be read correctly by readtable"

It is not strange at all: there is no heuristic or algorithm in the world that works in every circumstance. Every algorithm is fallible (no matter how much beginners imagine them to be perfect, just like their computers have infinite memory and are infinitely fast).

Did you look at the file in a text editor to check if it is a well formatted CSV file?

Because that file is not a well-formatted CSV file. It is actually a mess. Strictly speaking every line is one field:

txt = fileread('experiments.csv')
txt = 
    'Results Table 1
     
     
     Results Table 2
     
     
     1,Time,Displacement,Force
     ,(s),(mm),(kN)
     ",""0.0000"",""4.6084"",""0.0000"""
     ",""0.0200"",""4.6086"",""0.0000"""
     ",""0.0400"",""4.6133"",""0.0001"""
     ",""0.0600"",""4.6319"",""0.0001"""
     ",""0.0800"",""4.6620"",""0.0000"""
     ",""0.1000"",""4.6944"",""0.0000"""
     ",""0.1200"",""4.7266"",""0.0000"""
     ",""0.1400"",""4.7598"",""0.0000"""
     ",""0.1600"",""4.7942"",""0.0000"""
     ",""0.1800"",""4.8289"",""0.0000"""
     ",""0.2000"",""4.8633"",""0.0001"""
     ",""0.2200"",""4.8970"",""0.0001"""
     ",""0.2400"",""4.9303"",""0.0000"""
     ",""0.2600"",""4.9634"",""0.0000"""
     ",""0.2800"",""4.9967"",""0.0000"""
     ",""0.3000"",""5.0302"",""0.0000"""
     ",""0.3200"",""5.0638"",""0.0000"""
     ",""0.3400"",""5.0972"",""-0.0001"""
     ",""0.3600"",""5.1306"",""0.0000"""
     ",""0.3800"",""5.1639"",""0.0000"""
     ",""0.4000"",""5.1972"",""0.0000"""
     ",""0.4200"",""5.2305"",""0.0000"""
     ",""0.4400"",""5.2639"",""0.0000"""
     ",""0.4600"",""5.2972"",""-0.0001"""
     ",""0.4800"",""5.3306"",""0.0000"""
     ",""0.5000"",""5.3639"",""0.0001"""
     ",""0.5200"",""5.3972"",""0.0002"""
     ",""0.5400"",""5.4305"",""0.0000"""
     ",""0.5600"",""5.4638"",""0.0000"""
     ",""0.5800"",""5.4971"",""0.0001"""
     ",""0.6000"",""5.5305"",""0.0000"""
     ",""0.6200"",""5.5639"",""0.0000"""
     ",""0.6400"",""5.5972"",""0.0011"""
     ",""0.6600"",""5.6305"",""0.0093"""
     ",""0.6800"",""5.6638"",""0.0233"""
     ",""0.7000"",""5.6970"",""0.0383"""
     ",""0.7200"",""5.7304"",""0.0552"""
     ",""0.7400"",""5.7638"",""0.0724"""
     ",""0.7600"",""5.7971"",""0.0887"""
     ",""0.7800"",""5.8305"",""0.1036"""
     ",""0.8000"",""5.8638"",""0.1164"""
     ",""0.8200"",""5.8972"",""0.1285"""
     ",""0.8400"",""5.9305"",""0.1403"""
     ",""0.8600"",""5.9638"",""0.1516"""
     ",""0.8800"",""5.9972"",""0.1624"""
     ",""0.9000"",""6.0305"",""0.1722"""
     ",""0.9200"",""6.0639"",""0.1812"""
     ",""0.9400"",""6.0972"",""0.1892"""
     ",""0.9600"",""6.1305"",""0.1961"""
     ",""0.9800"",""6.1638"",""0.2017"""
     ",""1.0000"",""6.1971"",""0.2066"""
     ",""1.0200"",""6.2304"",""0.2109"""
     ",""1.0400"",""6.2638"",""0.2147"""
     ",""1.0600"",""6.2971"",""0.2182"""
     ",""1.0800"",""6.3304"",""0.2214"""
     ",""1.1000"",""6.3637"",""0.2244"""
     ",""1.1200"",""6.3971"",""0.2273"""
     ",""1.1400"",""6.4305"",""0.2300"""
     ",""1.1600"",""6.4638"",""0.2324"""
     ",""1.1800"",""6.4971"",""0.2346"""
     ",""1.2000"",""6.5304"",""0.2366"""
     ",""1.2200"",""6.5637"",""0.2385"""
     ",""1.2400"",""6.5970"",""0.2402"""
     ",""1.2600"",""6.6304"",""0.2418"""
     ",""1.2800"",""6.6638"",""0.2433"""
     ",""1.3000"",""6.6972"",""0.2448"""
     ",""1.3200"",""6.7305"",""0.2460"""
     ",""1.3400"",""6.7638"",""0.2472"""
     ",""1.3600"",""6.7972"",""0.2483"""
     ",""1.3800"",""6.8306"",""0.2493"""
     ",""1.4000"",""6.8639"",""0.2502"""
     ",""1.4200"",""6.8972"",""0.2511"""
     ",""1.4400"",""6.9305"",""0.2520"""
     ",""1.4600"",""6.9639"",""0.2528"""
     ",""1.4800"",""6.9972"",""0.2537"""
     ",""1.5000"",""7.0306"",""0.2543"""
     ",""1.5200"",""7.0639"",""0.2549"""
     ",""1.5400"",""7.0972"",""0.2556"""
     ",""1.5600"",""7.1305"",""0.2562"""
     ",""1.5800"",""7.1638"",""0.2567"""
     ",""1.6000"",""7.1971"",""0.2573"""
     ",""1.6200"",""7.2304"",""0.2578"""
     ",""1.6400"",""7.2638"",""0.2584"""
     ",""1.6600"",""7.2971"",""0.2589"""
     ",""1.6800"",""7.3305"",""0.2593"""
     ",""1.7000"",""7.3638"",""0.2597"""
     ",""1.7200"",""7.3971"",""0.2601"""
     ",""1.7400"",""7.4304"",""0.2605"""
     ",""1.7600"",""7.4638"",""0.2609"""
     ",""1.7800"",""7.4972"",""0.2614"""
     ",""1.8000"",""7.5305"",""0.2618"""'

Even if we ignore that "feature" there are more such "features": double quotes doubled around numeric values! Ugh. Sorry, but whatever wrote that file is the problem. Sadly, the Do What I Want and Not What I Gave You Toolbox is currently being tested and has not yet been released. Currently what you have is: rubbish in => rubbish out.

No surprises there.

Melden Sie sich an, um zu kommentieren.

Melden Sie sich an, um diese Frage zu beantworten.

Answer 1

dpb am 10 Okt. 2024

0
Verknüpfen

Direkter Link zu dieser Antwort

https://de.mathworks.com/matlabcentral/answers/2158850-readtable-not-configuring-delimiters-correctly#answer_1530375

Bearbeitet: dpb am 10 Okt. 2024

In MATLAB Online öffnen

experiments.csv

filename = 'experiments.csv';
type(filename)
Results Table 1


Results Table 2


1,Time,Displacement,Force
,(s),(mm),(kN)
",""0.0000"",""4.6084"",""0.0000"""
",""0.0200"",""4.6086"",""0.0000"""
",""0.0400"",""4.6133"",""0.0001"""
",""0.0600"",""4.6319"",""0.0001"""
",""0.0800"",""4.6620"",""0.0000"""
",""0.1000"",""4.6944"",""0.0000"""
",""0.1200"",""4.7266"",""0.0000"""
",""0.1400"",""4.7598"",""0.0000"""
",""0.1600"",""4.7942"",""0.0000"""
",""0.1800"",""4.8289"",""0.0000"""
",""0.2000"",""4.8633"",""0.0001"""
",""0.2200"",""4.8970"",""0.0001"""
",""0.2400"",""4.9303"",""0.0000"""
",""0.2600"",""4.9634"",""0.0000"""
",""0.2800"",""4.9967"",""0.0000"""
",""0.3000"",""5.0302"",""0.0000"""
",""0.3200"",""5.0638"",""0.0000"""
",""0.3400"",""5.0972"",""-0.0001"""
",""0.3600"",""5.1306"",""0.0000"""
",""0.3800"",""5.1639"",""0.0000"""
",""0.4000"",""5.1972"",""0.0000"""
",""0.4200"",""5.2305"",""0.0000"""
",""0.4400"",""5.2639"",""0.0000"""
",""0.4600"",""5.2972"",""-0.0001"""
",""0.4800"",""5.3306"",""0.0000"""
",""0.5000"",""5.3639"",""0.0001"""
",""0.5200"",""5.3972"",""0.0002"""
",""0.5400"",""5.4305"",""0.0000"""
",""0.5600"",""5.4638"",""0.0000"""
",""0.5800"",""5.4971"",""0.0001"""
",""0.6000"",""5.5305"",""0.0000"""
",""0.6200"",""5.5639"",""0.0000"""
",""0.6400"",""5.5972"",""0.0011"""
",""0.6600"",""5.6305"",""0.0093"""
",""0.6800"",""5.6638"",""0.0233"""
",""0.7000"",""5.6970"",""0.0383"""
",""0.7200"",""5.7304"",""0.0552"""
",""0.7400"",""5.7638"",""0.0724"""
",""0.7600"",""5.7971"",""0.0887"""
",""0.7800"",""5.8305"",""0.1036"""
",""0.8000"",""5.8638"",""0.1164"""
",""0.8200"",""5.8972"",""0.1285"""
",""0.8400"",""5.9305"",""0.1403"""
",""0.8600"",""5.9638"",""0.1516"""
",""0.8800"",""5.9972"",""0.1624"""
",""0.9000"",""6.0305"",""0.1722"""
",""0.9200"",""6.0639"",""0.1812"""
",""0.9400"",""6.0972"",""0.1892"""
",""0.9600"",""6.1305"",""0.1961"""
",""0.9800"",""6.1638"",""0.2017"""
",""1.0000"",""6.1971"",""0.2066"""
",""1.0200"",""6.2304"",""0.2109"""
",""1.0400"",""6.2638"",""0.2147"""
",""1.0600"",""6.2971"",""0.2182"""
",""1.0800"",""6.3304"",""0.2214"""
",""1.1000"",""6.3637"",""0.2244"""
",""1.1200"",""6.3971"",""0.2273"""
",""1.1400"",""6.4305"",""0.2300"""
",""1.1600"",""6.4638"",""0.2324"""
",""1.1800"",""6.4971"",""0.2346"""
",""1.2000"",""6.5304"",""0.2366"""
",""1.2200"",""6.5637"",""0.2385"""
",""1.2400"",""6.5970"",""0.2402"""
",""1.2600"",""6.6304"",""0.2418"""
",""1.2800"",""6.6638"",""0.2433"""
",""1.3000"",""6.6972"",""0.2448"""
",""1.3200"",""6.7305"",""0.2460"""
",""1.3400"",""6.7638"",""0.2472"""
",""1.3600"",""6.7972"",""0.2483"""
",""1.3800"",""6.8306"",""0.2493"""
",""1.4000"",""6.8639"",""0.2502"""
",""1.4200"",""6.8972"",""0.2511"""
",""1.4400"",""6.9305"",""0.2520"""
",""1.4600"",""6.9639"",""0.2528"""
",""1.4800"",""6.9972"",""0.2537"""
",""1.5000"",""7.0306"",""0.2543"""
",""1.5200"",""7.0639"",""0.2549"""
",""1.5400"",""7.0972"",""0.2556"""
",""1.5600"",""7.1305"",""0.2562"""
",""1.5800"",""7.1638"",""0.2567"""
",""1.6000"",""7.1971"",""0.2573"""
",""1.6200"",""7.2304"",""0.2578"""
",""1.6400"",""7.2638"",""0.2584"""
",""1.6600"",""7.2971"",""0.2589"""
",""1.6800"",""7.3305"",""0.2593"""
",""1.7000"",""7.3638"",""0.2597"""
",""1.7200"",""7.3971"",""0.2601"""
",""1.7400"",""7.4304"",""0.2605"""
",""1.7600"",""7.4638"",""0.2609"""
",""1.7800"",""7.4972"",""0.2614"""
",""1.8000"",""7.5305"",""0.2618"""

The file is a list of strings; the delimiters are inside the string delimiters.

How was the file created? It isn't a valid CSV file format...well, it is, but only as one-column string.

But,

M=readcell(filename,'NumHeaderLines',9,'Whitespace','"','Delimiter',',');
M(1:5,:)
ans = 5x1 cell array
    {',""0.0200"",""4.6086"",""0.0000'}
    {',""0.0400"",""4.6133"",""0.0001'}
    {',""0.0600"",""4.6319"",""0.0001'}
    {',""0.0800"",""4.6620"",""0.0000'}
    {',""0.1000"",""4.6944"",""0.0000'}

didn't solve the problem...

str2double(strrep(split(ans,','),'"',''))
ans = 5×4
       NaN    0.0200    4.6086         0
       NaN    0.0400    4.6133    0.0001
       NaN    0.0600    4.6319    0.0001
       NaN    0.0800    4.6620         0
       NaN    0.1000    4.6944         0
<mw-icon class=""></mw-icon>
<mw-icon class=""></mw-icon>

will work around it.

Alternatively, for this file you could use the FixedWidthImportOptions object, but that wouldn't always be true unless the file was written as this one.

I'd suggest going back to the source and see if you can get the file created as a valid CSV file; it's ok that the fields are string-delimited, but the whole line should not be contained within the apostrophes; that turns each line into a single string with embedded strings; certainly not what was intended.

It does seem a shortcoming in the 'DelimitedImportOptions' to not have a flag for the 'StringDelimited' type; it is pretty common owing to embedded blanks in string data, but that wouldn't solve this problem and it may get recognized automagically as is for properly constructed files.

0 Kommentare
-2 ältere Kommentare anzeigen-2 ältere Kommentare ausblenden

Melden Sie sich an, um zu kommentieren.

Answer 2

Star Strider am 10 Okt. 2024

0
Verknüpfen

Direkter Link zu dieser Antwort

https://de.mathworks.com/matlabcentral/answers/2158850-readtable-not-configuring-delimiters-correctly#answer_1530385

Bearbeitet: Star Strider am 10 Okt. 2024

In MATLAB Online öffnen

experiments.csv

Try something like this —

% type('experiments.csv')

filename = 'experiments.csv';

Lines = readlines(filename);

FindLine = strfind(Lines, "1,");

CheckLineC = cellfun(@(x)~isempty(x), FindLine, 'Unif',0);

idx = find([CheckLineC{:}]);

FirstLine = extractAfter(Lines(idx), "1,");

VN = compose('%s',split(FirstLine,','));

T = readtable(filename, 'NumHeaderLines',8, 'Delimiter',{',','"'}, 'LeadingDelimitersRule','ignore', 'ConsecutiveDelimitersRule','join');

T.Properties.VariableNames = VN

T = 91x3 table

Time Displacement Force ____ ____________ ______ 0 4.6084 0 0.02 4.6086 0 0.04 4.6133 0.0001 0.06 4.6319 0.0001 0.08 4.662 0 0.1 4.6944 0 0.12 4.7266 0 0.14 4.7598 0 0.16 4.7942 0 0.18 4.8289 0 0.2 4.8633 0.0001 0.22 4.897 0.0001 0.24 4.9303 0 0.26 4.9634 0 0.28 4.9967 0 0.3 5.0302 0

figure

plot(T.Time, T.Displacement, 'DisplayName',VN{2})

hold on

plot(T.Time, T.Force, 'DisplayName',VN{3})

grid

xlabel(VN{1})

legend('Location','best')

This is not quite as efficient as I would like it to be, however it has the virtue of working.

EDIT — Added plot

.

0 Kommentare
-2 ältere Kommentare anzeigen-2 ältere Kommentare ausblenden

Melden Sie sich an, um zu kommentieren.

Answer 3

dpb am 10 Okt. 2024

0
Verknüpfen

Direkter Link zu dieser Antwort

https://de.mathworks.com/matlabcentral/answers/2158850-readtable-not-configuring-delimiters-correctly#answer_1530440

Bearbeitet: dpb am 10 Okt. 2024

In MATLAB Online öffnen

experiments.csv

If it were me and I couldn't fix the original source that created files, I'd probably fix these first, then read a cleaned version instead...

filename = 'experiments.csv';
F=readlines(filename);
ix=find(startsWith(F,"1,"),1);          % make sure is first of line; should be, but...
F=F(ix:end);                            % remove the nondata header rows
F(1)=strrep(F(1),'1,',',');             % leave only the leading delimiter of first row
F=strrep(F,'"','');                     % strip the quotes--they're unneeded and get in the way
F=arrayfun(@(s)string(s{:}(2:end)),F);  % as is the leading delimiter
writelines(F,filename)                  % write a cleaned up version instead

If run the above on the file(s) first, then all you'll have to do later is read them directly...

T=readtable(filename,'ReadVariableNames',1,'VariableNamesLine',1,'VariableUnitsLine',2);

head(T)

Time Displacement Force ____ ____________ ______ 0 4.6084 0 0.02 4.6086 0 0.04 4.6133 0.0001 0.06 4.6319 0.0001 0.08 4.662 0 0.1 4.6944 0 0.12 4.7266 0 0.14 4.7598 0

%T.Properties

subplot(2,1,1)

plot(T.Time,T.Force)

xlabel(join([T.Properties.VariableNames(1),T.Properties.VariableUnits(1)]))

ylabel(join([T.Properties.VariableNames(3),T.Properties.VariableUnits(3)]))

subplot(2,1,2)

plot(T.Time,T.Displacement)

xlabel(join([T.Properties.VariableNames(1),T.Properties.VariableUnits(1)]))

ylabel(join([T.Properties.VariableNames(2),T.Properties.VariableUnits(2)]))

The total time may be a little longer by running them through the filter first, but it will make life much simpler from there on out...

1 Kommentar
-1 ältere Kommentare anzeigen-1 ältere Kommentare ausblenden

dpb am 10 Okt. 2024

A thought strikes that there were two header sections in the file and the first (and only in this file) set of data had the preceding "1," on its variable names header row.

The code both @Star Strider and I show relies on specifically finding that "1" -- if the actual file structure is such that the data for both of those tests is in the same file, then to write totally generic code one would need to scan those header lines for the number of "Results" lines and then loop over those finding each test segment in turn...

Melden Sie sich an, um zu kommentieren.

Readtable not configuring delimiters correctly

3 Kommentare
1 älteren Kommentar anzeigen1 älteren Kommentar ausblenden

Antworten (3)

0 Kommentare
-2 ältere Kommentare anzeigen-2 ältere Kommentare ausblenden

0 Kommentare
-2 ältere Kommentare anzeigen-2 ältere Kommentare ausblenden

1 Kommentar
-1 ältere Kommentare anzeigen-1 ältere Kommentare ausblenden

Siehe auch

Kategorien

Tags

Community Treasure Hunt

Readtable not configuring delimiters correctly

3 Kommentare 1 älteren Kommentar anzeigen1 älteren Kommentar ausblenden

Antworten (3)

0 Kommentare -2 ältere Kommentare anzeigen-2 ältere Kommentare ausblenden

0 Kommentare -2 ältere Kommentare anzeigen-2 ältere Kommentare ausblenden

1 Kommentar -1 ältere Kommentare anzeigen-1 ältere Kommentare ausblenden

Siehe auch

Kategorien

Tags

Community Treasure Hunt

3 Kommentare
1 älteren Kommentar anzeigen1 älteren Kommentar ausblenden

0 Kommentare
-2 ältere Kommentare anzeigen-2 ältere Kommentare ausblenden

0 Kommentare
-2 ältere Kommentare anzeigen-2 ältere Kommentare ausblenden

1 Kommentar
-1 ältere Kommentare anzeigen-1 ältere Kommentare ausblenden