Smart import of random files

4 Ansichten (letzte 30 Tage)
Luca D'Angelo
Luca D'Angelo am 12 Jan. 2021
Bearbeitet: dpb am 16 Jan. 2021
Good morning,
I can't find a way to properly set a script for import data as table.
Suppose you have many folders and in each of them there are many csv or xlsx or txt files and you want to create as output a single file ('pathOut/database.xlsx'). For each folder, appending one file to the other is easy.
The problem is that I'd like to write a script (actually an app with App Designer) that is able to load files that are structured in different way: for example, I have files which have the first 30 rows that are meaningless text and I want to neglect those. In another folder, files have 3 meaningless rows.
I'm using App Designer and I've created a Edit Field (numeric) which allows to indicate the data starting rows, the column delimiter and the headerline row. On the other hand, detectImportOptions is not able to properly attribute the right variable type; in addition, in some cases althought I told that delimiter=',' readtable understands that but not for the headerline: the resultant table is mainly full of string column, with an header line confused where comma (the delimeter) is converted with ' _ '.
I attached here two different examples (Input_type_1.xlsx was converted in xlsx because this webpage does not support the .dat file): I don't need to load both of them in the same run, but I'd like to create an app which is able to read both of them specifing as input the less that I can (e.g. data starting rows, column delimiter and headerline row).
Thus, how can I manage detectImportOptions to do that? (I have Matlab 2018b release).
Thank you for your suggestions and time!
Luca
  8 Kommentare
Luca D'Angelo
Luca D'Angelo am 12 Jan. 2021
Well, that file is an output of an instrument. I have several thousands of that file and that's why I'm trying to write an app which should allow me not wasting some months splitting columns and I learnt to use MATLAB and not VBA/COM.
I wrote a specific script for that kind of file that works but I wanted to wrote an app able to load all kind of output file that I obtain from every instruments that I normally use.
I could convert that script as a function and call that one in my app. This could be a solution even if a little unelegant.
I'll keep trying and maybe I'll do that.
Thank you.
dpb
dpb am 13 Jan. 2021
Did the instrument name that as a .xlsx on creation or was that done later?
Doesn't make sense to have done this way; who's the manufacturer of the instrument and what is the instrument itself?
I've used a lot of test gear over 40 years and never saw this symptom from any of them...

Melden Sie sich an, um zu kommentieren.

Akzeptierte Antwort

dpb
dpb am 13 Jan. 2021
Bearbeitet: dpb am 13 Jan. 2021
"When the instrument creates its output, this is in .dat."
AHA! So you created the problem with the spreadsheet yourself, not the instrument manufacturer. They did write a csv file (albeit with non-quoted space-containing string variables). If you don't lie to detectImportOptions it seems to work just fine...
The one thing I've discovered in the past is that it (detectImportOptions) needs the klew about the delimiter on input to it instead of trying to only patch the default import options object. That's owing to the non-quoted strings in the file so there's still an issue there that the vendor could have done better, but it's not completely broken in being a one-column spreadsheet as is your original file.
optType1DAT=detectImportOptions('Input_type_1-1.dat','delimiter',',');
tType1_DAT=readtable('Input_type_1-1.dat',optType1DAT,'ReadVariableNames',1);
This returned
tType1_DAT =
17×58 table
DataOrder_DateTime_yyyy_mm_ddHh_mm_ss_ Status Stage1 Stage2 Stage3 Stage4 Stage5 Stage6 Stage7 Stage8 Stage9 Stage10 Stage11 Stage12 Stage13 Stage14 MISC ChargerI ChargerU TrapU ImpactorLowPressure AmbientPressure AuxChannel1 AuxChannel2 AuxChannel3 AuxChannel4 AuxChannel5 AuxChannel6 T1 T2 T3 T4 Reserved CAL Stage1_1 Stage2_1 Stage3_1 Stage4_1 Stage5_1 Stage6_1 Stage7_1 Stage8_1 Stage9_1 Stage10_1 Stage11_1 Stage12_1 Stage13_1 Stage14_1 CON ConcentrationValue COM UserComment PM10 PM2_5 PM1 PN10 PN2_5 PN1
______________________________________ _________________ ______ ______ ______ ______ ______ ______ ______ ______ ______ _______ _______ _______ _______ _______ ________ ________ ________ _____ ___________________ _______________ ___________ ___________ ___________ ___________ ___________ ___________ ______ ______ _____ _____ ________ _______ ________ ________ ________ ________ ________ ________ ________ ________ ________ _________ _________ _________ _________ _________ _______ __________________ _______ ___________ _______ _______ _______ _____ _____ _____
{'2019/03/10 00:00:43.1'} {'CbTlsfpAvzij0'} 44.39 78.19 75.01 76.31 96.02 157 117.6 60.55 19.55 14.59 12.38 16.15 8.71 18.38 {'MISC'} 0.9955 3997 19.99 40.77 1013 0.01602 0.01556 0.01564 0.01556 0.01556 0.01579 30.58 170.2 22.57 34.34 0 {'CAL'} 42550 22090 8786 4306 3160 2768 1358 315 26.64 4.705 2.177 4.434 0.1254 1.105 {'CON'} 29010 {'COM'} {0×0 char} 0.07477 0.01382 0.01214 29010 29010 29010
{'2019/03/10 00:01:43.1'} {'CbTlsfpAvzij0'} 46.4 87.02 81.06 79.99 99.32 162.5 120.8 62.23 20.2 15.06 12.86 16.96 8.786 18.99 {'MISC'} 0.9954 3821 19.99 40.84 1012 0.0161 0.01556 0.01564 0.01556 0.01549 0.01564 -39.32 -12.55 22.56 34.34 0 {'CAL'} 44470 24660 9494 4498 3255 2857 1389 320.6 26.34 4.384 2.053 4.61 0.04668 1.109 {'CON'} 30820 {'COM'} {0×0 char} 0.07461 0.01398 0.0124 30820 30820 30820
{'2019/03/10 00:02:43.1'} {'CbTlsfpAvzij0'} 43.44 79.87 77.16 77.11 96.31 157.9 118.2 60.78 19.91 14.62 12.33 16.71 8.317 18.69 {'MISC'} 0.9956 3907 19.99 40.67 1012 0.01602 0.01556 0.01579 0.01564 0.01564 0.01556 427 -261.7 22.54 33.93 0 {'CAL'} 41640 22610 9069 4356 3169 2786 1366 316.5 28.11 4.736 2.125 4.792 0.1119 1.152 {'CON'} 28870 {'COM'} {0×0 char} 0.07704 0.01391 0.01227 28870 28870 28870
{'2019/03/10 00:03:43.1'} {'CbTlsfpAvzij0'} 40.22 71.18 72.52 73.55 91.59 148.5 112.2 58.56 18.44 13.91 11.56 15.96 7.723 17.17 {'MISC'} 0.9954 3940 19.99 40.79 1011 0.01602 0.01564 0.01572 0.01564 0.01549 0.01572 -14.4 18.25 22.54 34.34 0 {'CAL'} 38550 20110 8561 4177 3025 2622 1302 308.6 25.86 4.973 2.101 4.736 0.1009 1.051 {'CON'} 26610 {'COM'} {0×0 char} 0.07071 0.01334 0.01171 26610 26610 26610
{'2019/03/10 00:04:43.1'} {'CbTlsfpAvzij0'} 37.01 60.24 65.23 68.88 85.76 137.8 105.8 56.72 17.11 13.35 10.73 15.07 7.242 16.31 {'MISC'} 0.9956 3929 19.99 40.71 1010 0.01595 0.01564 0.01579 0.01556 0.01556 0.01572 -213.3 208.3 22.52 34.34 0 {'CAL'} 35480 16950 7718 3937 2846 2439 1236 304.6 24.7 5.663 2.161 4.652 0.1681 1.049 {'CON'} 24060 {'COM'} {0×0 char} 0.07086 0.01286 0.0112 24050 24050 24050
{'2019/03/10 00:05:43.1'} {'CbTlsfpAvzij0'} 39.07 66.93 69.13 71.23 87.19 139.7 105.9 58.12 17.44 14.01 11.34 15.86 7.529 17.62 {'MISC'} 0.9957 3867 19.99 40.9 1011 0.01602 0.01572 0.01572 0.01549 0.01556 0.01564 -433.9 342.5 22.52 34.34 0 {'CAL'} 37440 18880 8159 4051 2877 2464 1226 309.9 23.85 5.902 2.296 4.878 0.1038 1.155 {'CON'} 25580 {'COM'} {0×0 char} 0.07627 0.013 0.01123 25580 25580 25580
{'2019/03/10 00:06:43.1'} {'CbTlsfpAvzij0'} 41.39 69.27 71.15 70.84 83.48 134.6 103.6 54.26 19.96 13.12 11.44 15.05 7.471 17.05 {'MISC'} 0.9957 3838 19.99 40.97 1012 0.01602 0.01549 0.01572 0.01564 0.01556 0.01556 -254.7 180.4 22.52 34.34 0 {'CAL'} 39670 19520 8380 4001 2725 2356 1191 281.2 33.06 4.008 2.156 4.231 0.08179 1.039 {'CON'} 26660 {'COM'} {0×0 char} 0.07019 0.01272 0.01107 26660 26660 26660
{'2019/03/10 00:07:43.1'} {'CbTlsfpAvzij0'} 39.5 64.87 68.21 69.03 81.26 128.6 99.68 54.06 18.43 13.09 11.07 14.66 7.613 16.4 {'MISC'} 0.9956 3941 19.99 40.7 1011 0.01602 0.01572 0.01572 0.01572 0.01556 0.01572 -78.19 -352.2 22.51 33.93 0 {'CAL'} 37860 18260 8048 3913 2660 2252 1147 284.1 28.94 4.809 2.209 4.215 0.1711 1.009 {'CON'} 25410 {'COM'} {0×0 char} 0.06858 0.01241 0.0107 25410 25410 25410
{'2019/03/10 00:08:43.1'} {'CbTlsfpAvzij0'} 40.91 66.71 68.9 69.02 80.86 126.8 98.06 54.47 18.36 13.48 11.35 15.17 7.67 17.16 {'MISC'} 0.9954 3990 19.99 41 1011 0.01595 0.01556 0.01572 0.01549 0.01556 0.01556 189.9 -292.1 22.51 34.34 0 {'CAL'} 39210 18770 8108 3897 2636 2212 1122 285.5 27.88 5.112 2.324 4.423 0.09481 1.08 {'CON'} 26120 {'COM'} {0×0 char} 0.07162 0.01234 0.01055 26120 26110 26110
{'2019/03/10 00:09:43.1'} {'CbTlsfpAvzij0'} 40.82 66.16 68.03 68.24 79.47 123.3 95.87 53.86 18.04 13.45 11.1 14.93 7.63 17.48 {'MISC'} 0.9955 3996 19.99 40.77 1011 0.0161 0.01579 0.01579 0.01564 0.01564 0.01572 -428.6 -293 22.5 34.34 0 {'CAL'} 39120 18610 7998 3851 2587 2146 1094 282.3 27.06 5.253 2.167 4.316 0.1383 1.13 {'CON'} 25980 {'COM'} {0×0 char} 0.07459 0.01199 0.01032 25980 25980 25980
{'2019/03/10 00:10:43.1'} {'CbTlsfpAvzij0'} 40 57.94 61.19 65.07 76.05 117.6 92.68 53.06 17.28 13.14 10.72 14.47 7.244 16.4 {'MISC'} 0.9954 3947 19.99 40.88 1012 0.01602 0.01564 0.01579 0.01572 0.01564 0.01572 -254.4 -261.5 22.49 34.34 0 {'CAL'} 38340 16180 7166 3684 2483 2050 1063 281.8 26.33 5.635 2.253 4.283 0.1049 1.045 {'CON'} 24670 {'COM'} {0×0 char} 0.0692 0.01181 0.01007 24670 24670 24670
{'2019/03/10 00:11:43.1'} {'CbTlsfpAvzij0'} 45.92 69.13 66.5 68.72 79.25 122.3 95.41 55.17 18.82 14.36 11.98 15.9 8.304 18.2 {'MISC'} 0.9954 3868 19.99 40.99 1012 0.01602 0.01564 0.01579 0.01549 0.01556 0.01556 -400.8 -398.6 22.47 34.34 0 {'CAL'} 44020 19360 7712 3845 2557 2113 1078 286.7 27.61 5.783 2.494 4.574 0.108 1.132 {'CON'} 28140 {'COM'} {0×0 char} 0.07431 0.01224 0.01032 28140 28140 28140
{'2019/03/10 00:12:43.1'} {'CbTlsfpAvzij0'} 51.3 78.16 70.87 71.26 80.85 124.2 96.79 55.37 20.16 14.67 12.53 16.59 8.705 18.7 {'MISC'} 0.9955 3848 19.99 40.89 1011 0.01602 0.01564 0.01572 0.01564 0.01564 0.01564 442.9 279.8 22.47 34.34 0 {'CAL'} 49170 21900 8156 3943 2578 2126 1078 280.1 28.96 4.744 2.175 4.501 0.08361 1.075 {'CON'} 31170 {'COM'} {0×0 char} 0.07079 0.01201 0.01033 31170 31170 31170
{'2019/03/10 00:13:43.1'} {'CbTlsfpAvzij0'} 45.04 66.49 64.36 68.23 78.32 120 94.31 54.52 18.61 13.92 11.52 15.44 8.108 17.54 {'MISC'} 0.9955 3933 19.99 40.74 1012 0.01602 0.01579 0.01579 0.01572 0.01556 0.01572 101.2 65.98 22.46 33.93 0 {'CAL'} 43170 18590 7459 3829 2532 2074 1068 284.3 27.8 5.451 2.228 4.409 0.1309 1.075 {'CON'} 27480 {'COM'} {0×0 char} 0.07106 0.01195 0.01023 27480 27480 27480
{'2019/03/10 00:14:43.1'} {'CbTlsfpAvzij0'} 46.31 68.89 65.76 69.64 79.68 122.2 95.59 55.48 18.76 14.11 11.7 15.6 7.816 17.66 {'MISC'} 0.9955 3932 19.99 40.33 1011 0.0161 0.01564 0.01564 0.01549 0.01564 0.01556 -324.1 49.78 22.45 33.93 0 {'CAL'} 44390 19270 7611 3903 2572 2109 1080 288.6 27.19 5.299 2.163 4.369 0.07646 1.058 {'CON'} 28270 {'COM'} {0×0 char} 0.06982 0.01202 0.01033 28270 28270 28270
{'2019/03/10 00:15:43.1'} {'CbTlsfpAvzij0'} 47.62 71.44 67.48 70.54 80.14 123 95.99 55.35 19.31 14.27 11.85 16.11 8.374 17.77 {'MISC'} 0.9955 3887 19.99 40.75 1011 0.0161 0.01564 0.01556 0.01556 0.01564 0.01572 -306 192.3 22.44 33.93 0 {'CAL'} 45640 20000 7803 3944 2578 2118 1080 285.4 28.34 5.152 2.077 4.553 0.08317 1.04 {'CON'} 29070 {'COM'} {0×0 char} 0.06907 0.01195 0.01036 29070 29070 29070
{'2019/03/10 00:16:43.1'} {'CbTlsfpAvzij0'} 47.44 78.03 72.56 72.99 83 125.3 96.54 57.36 18.18 14.7 12.12 16.24 8.249 17.5 {'MISC'} 0.9955 3937 19.99 40.47 1011 0.01617 0.01572 0.01564 0.01572 0.01564 0.01556 -216.1 -342.7 22.44 34.34 0 {'CAL'} 45470 21960 8427 4077 2669 2155 1080 296.5 22.45 5.36 2.074 4.494 0.06754 0.9752 {'CON'} 29760 {'COM'} {0×0 char} 0.06514 0.01189 0.01029 29760 29760 29760
>>
Go ahead and finish giving it the help it needs by setting the variable types and the import date format in the import options object to match.
I'd reset the first column column/variable name there as well as probably turn the Status variable into categorical.
  3 Kommentare
Luca D'Angelo
Luca D'Angelo am 16 Jan. 2021
I still can't understand why it works since it seems to be more or less the same I've tried at the beginning but...thank you so much!!! :)
dpb
dpb am 16 Jan. 2021
Bearbeitet: dpb am 16 Jan. 2021
"...it seems to be more or less the same..."
It's NOT the same as in any that I saw you post, no. "More or less" doesn't cut it in programming; details are not only important, they can be critical.
Here the critical difference is what I outlined above -- I called detectImportOptions with the 'delimiter',',' optional named parameter defined on input to it rather than relying on the default delimiter detection routine internal to the function and then patching the default returned import object.
As noted, from empirical evidence this is a particularly important additional aid to the function in its parsing of the file when there are non-quoted strings in a CSV file that contain embedded spaces; it is nearly impossible for a generic routine to be able to identify that condtion unilaterally.
As the last comment notes, it's a pretty good bet you'll still have random occasions where it fails depending on just what the extraneous lines can contain.

Melden Sie sich an, um zu kommentieren.

Weitere Antworten (2)

dpb
dpb am 13 Jan. 2021
Save the file as a .csv instead of .xlsx. May have to coerce Excel to not double-quote the strings; I presume there's a way but I use Excel as little as possible so that's an "Exercise for Student".
Following that, it's pretty straightforward...
optType1CSV=detectImportOptions('Input_type_1.csv','Delimiter',',');
optType1CSV.VariableNamesLine=optType1CSV.DataLines(1)-3; % if this is variable, takes more work
tType1=readtable('Input_type_1.csv',optType1CSV,'ReadVariableNames',1);
tType1.Date=datetime(tType1.Date,'InputFormat','yyyy/MM/dd hh:mm:ss.S');
The result of this for your sample file is:
tType1 =
8×58 table
Date Status Stage1 Stage2 Stage3 Stage4 Stage5 Stage6 Stage7 Stage8 Stage9 Stage10 Stage11 Stage12 Stage13 Stage14 MISC ChargerI ChargerU TrapU ImpactorLowPressure AmbientPressure AuxChannel1 AuxChannel2 AuxChannel3 AuxChannel4 AuxChannel5 AuxChannel6 T1 T2 T3 T4 Reserved CAL Stage1_1 Stage2_1 Stage3_1 Stage4_1 Stage5_1 Stage6_1 Stage7_1 Stage8_1 Stage9_1 Stage10_1 Stage11_1 Stage12_1 Stage13_1 Stage14_1 CON ConcentrationValue COM UserComment PM10 PM2_5 PM1 PN10 PN2_5 PN1
____________________ _________________ ______ ______ ______ ______ ______ ______ ______ ______ ______ _______ _______ ________ _______ _______ ________ ________ ________ _____ ___________________ _______________ ___________ ___________ ___________ ___________ ___________ ___________ ______ ______ _____ _____ ________ _______ _________ ________ ________ ________ ________ ________ ________ ________ ________ _________ _________ _________ _________ _________ _______ __________________ _______ ___________ _______ _______ _______ _____ _____ _____
26-Feb-2019 00:30:21 {'CbTlsfpAvzij0'} 106.2 72.35 121.2 155.9 259.9 379.2 340.7 231.8 68.89 24.7 10.26 7.802 19.51 12.76 {'MISC'} 0.9982 3437 19.99 39.96 1021 0.01602 0.01564 0.01564 0.01556 0.01556 0.01564 101.5 -50.8 24.83 29.85 0 {'CAL'} 1.019e+05 18730 14200 8973 8909 6805 4143 1391 163.6 1.231 0 0.0107 0.8538 0.02063 {'CON'} 58360 {'COM'} {0×0 char} 0.05119 0.04368 0.04368 58360 58360 58360
26-Feb-2019 00:31:21 {'CbTlsfpAvzij0'} 100.7 66.72 118.6 152.5 252.8 372.3 329.9 226.5 69.89 24.14 12.06 4.145 20.63 8.241 {'MISC'} 0.9981 3551 19.99 40.05 1020 0.01602 0.01564 0.01572 0.01549 0.01556 0.01572 -423.7 -28.2 24.8 30.67 0 {'CAL'} 96650 17200 13960 8805 8681 6693 4015 1362 174.8 6.267 0.1556 0.6002 1.995 0.1764 {'CON'} 55490 {'COM'} {0×0 char} 0.06779 0.04344 0.04332 55490 55490 55490
26-Feb-2019 00:32:21 {'CbTlsfpAvzij0'} 106.9 78.78 123.5 154.4 249.9 365.3 327.2 223.4 70.82 24.62 12.62 7.961 20.35 8.58 {'MISC'} 0.9982 3507 19.99 40.47 1020 0.01595 0.01556 0.01579 0.01556 0.01549 0.01556 189.9 -244.5 24.79 31.08 0 {'CAL'} 1.025e+05 20630 14440 8850 8525 6532 3961 1333 171.8 2.635 0 0.006366 0.8741 0 {'CON'} 59010 {'COM'} {0×0 char} 0.04922 0.04255 0.04255 59010 59010 59010
26-Feb-2019 00:33:21 {'CbTlsfpAvzij0'} 107.5 79.25 123.9 152.4 247.5 359 320.1 221.6 67.33 26 10.5 4.261 22.98 20.01 {'MISC'} 0.9975 3547 19.99 40.27 1020 0.01579 0.01556 0.01579 0.01549 0.01556 0.01564 -177.6 -111.7 24.79 30.67 0 {'CAL'} 1.032e+05 20750 14490 8712 8436 6410 3865 1322 158.1 3.178 0 0 1.659 0.3021 {'CON'} 59220 {'COM'} {0×0 char} 0.0704 0.0414 0.0414 59220 59220 59220
26-Feb-2019 00:34:21 {'CbTlsfpAvzij0'} 96.54 59.88 111.2 144.4 246.7 362 323.7 223.2 63.95 23.99 7.791 -0.06385 23.19 10.1 {'MISC'} 0.9967 3551 19.99 40.19 1020 0.01595 0.01564 0.01572 0.01564 0.01564 0.01564 -420.4 -402.7 24.79 31.49 0 {'CAL'} 92620 15270 13080 8344 8493 6520 3952 1349 152.6 2.434 0 0 2.366 0 {'CON'} 52840 {'COM'} {0×0 char} 0.05984 0.04178 0.04178 52840 52840 52840
26-Feb-2019 00:35:21 {'CbTlsfpAvzij0'} 99.33 63.98 112.1 144.5 251.1 369 328.9 227.8 61.62 25.13 6.76 -3.935 27.56 3.491 {'MISC'} 0.9967 3464 19.99 39.86 1020 0.01595 0.01572 0.01579 0.01564 0.01572 0.01564 -405.6 -22.27 24.81 31.89 0 {'CAL'} 95300 16420 13130 8316 8639 6643 4012 1376 140.8 3.313 0 0 3.761 0 {'CON'} 54390 {'COM'} {0×0 char} 0.07067 0.04189 0.04189 54390 54390 54390
26-Feb-2019 00:36:21 {'CbTlsfpAvzij0'} 92.43 58.33 104.8 135.6 238.9 351.7 312 220.1 58.7 24.7 5.016 -6.357 28.83 -0.1046 {'MISC'} 0.9969 3450 19.99 40.11 1020 0.0161 0.01564 0.01572 0.01549 0.01556 0.01564 10.44 -369.9 24.83 32.3 0 {'CAL'} 88670 14920 12300 7818 8235 6341 3811 1336 135.5 4.209 0 0 4.619 0 {'CON'} 50630 {'COM'} {0×0 char} 0.07547 0.04022 0.04022 50630 50630 50630
26-Feb-2019 00:37:21 {'CbTlsfpAvzij0'} 93.91 53.83 104 134.8 235.2 343.8 306 216.3 57.03 24.13 4.745 -7.297 28.45 -0.4211 {'MISC'} 0.9969 3627 19.99 40.47 1019 0.01595 0.01556 0.01572 0.01556 0.01549 0.01549 -312.5 -352.6 24.85 32.3 0 {'CAL'} 90080 13540 12200 7768 8099 6191 3733 1310 129.5 3.907 0 0 4.487 0 {'CON'} 50720 {'COM'} {0×0 char} 0.07352 0.03927 0.03927 50720 50720 50720
>>
Moral: Quit trying to pound square peg into round hole; fix the underlying problem of the badly constructed file.
  1 Kommentar
Luca D'Angelo
Luca D'Angelo am 13 Jan. 2021
Since this website didn't permit to upload a .dat file, in my second comment I attached a zip file where both the type input are. When the instrument creates its output, this is in .dat. In my zip file it was converted in .txt.
I've tried to attach it again from my Mac. It doesn't seem that there is a big difference.
The headerline row (Data=DataOrder=DateTime(yyyy/mm/dd hh:mm:ss),Status,Stage1,Stage2,Stage3,Stage4,Stage5,Stage6,...,Concentrarion value,...) has the problem that it has a space after "yyyy/mm/dd". Since I forced comma as delimiter, I thought that it wouldn't be a problem. I could try to solve this to set two delimiter character, i.e. space and comma (as done here: https://it.mathworks.com/matlabcentral/answers/466503-multiple-delimiters-for-readtable) and then merge the first and second column as one datetime.
pathIn='/path/Input_type_1/';
lista=dir(pathIn);
lista=lista(~ismember({lista.name},{'.','..','.DS_Store'})); %elimina elementi non leggibili
NumFile=length(lista);
delimiter=',';
RigaStartDati=41; %as custom input
count=1; %contatore per assegnare n righe e n colonne
i=1;
D{count,1}=lista(i).name;
Filename=strcat(pathIn,'/','Input_type_1-1.csv');
filename=char(Filename);
opt=detectImportOptions(filename,'Delimite',{',',' '});
opt.DataLines(1,1)=RigaStartDati;
RigaHeader=38; %as custom input
opt
opt =
DelimitedTextImportOptions with properties:
Format Properties:
Delimiter: {' ' ','}
Whitespace: '\b\t'
LineEnding: {'\n' '\r' '\r\n'}
CommentStyle: {}
ConsecutiveDelimitersRule: 'split'
LeadingDelimitersRule: 'keep'
EmptyLineRule: 'skip'
Encoding: 'ISO-8859-1'
Replacement Properties:
MissingRule: 'fill'
ImportErrorRule: 'fill'
ExtraColumnsRule: 'addvars'
Variable Import Properties: Set types by name using setvartype
VariableNames: {'DataOrder_DateTime_yyyy_mm_dd', 'hh_mm_ss_', 'Status' ... and 57 more}
VariableTypes: {'char', 'duration', 'char' ... and 57 more}
SelectedVariableNames: {'DataOrder_DateTime_yyyy_mm_dd', 'hh_mm_ss_', 'Status' ... and 57 more}
VariableOptions: Show all 60 VariableOptions
Access VariableOptions sub-properties using setvaropts/getvaropts
Location Properties:
DataLines: [41 Inf]
VariableNamesLine: 38
RowNamesColumn: 0
VariableUnitsLine: 0
VariableDescriptionsLine: 0
To display a preview of the table, use preview
This solves but since there is one of the headers is "Concentration value", it splits that and then shift data from column 51 to end.
In the end, the problem was not the file format but that even if I set comma as delimiter, the "space" in headerline causes problems.
If you have another idea, I will thank you more than I've already done ;)

Melden Sie sich an, um zu kommentieren.


dpb
dpb am 13 Jan. 2021
c=readcell('Input_type_1.xlsx'); % read as cellstr array
c=c(~cellfun(@(c)all(ismissing(c)),c,'UniformOutput',1)); % get rid of missing lines
c=c(find(contains(c,'DataOrder=')):end) % and everything before the data header line
c=c(contains(c,',')); % and anything else not comma delimited
writecell(c,'tempType1.csv',"QuoteStrings",0) % write to a temporary file
tType1=readtable('tempType1.csv','delimiter',',','ReadVariableNames',1)
This is about 99% there with only minor cleanup left...
>> readtable('tempType1.csv','delimiter',',','ReadVariableNames',1)
ans =
8×58 table
DataOrder_DateTime_yyyy_mm_ddHh_mm_ss_ Status Stage1 Stage2 Stage3 Stage4 Stage5 Stage6 Stage7 Stage8 Stage9 Stage10 Stage11 Stage12 Stage13 Stage14 MISC ChargerI ChargerU TrapU ImpactorLowPressure AmbientPressure AuxChannel1 AuxChannel2 AuxChannel3 AuxChannel4 AuxChannel5 AuxChannel6 T1 T2 T3 T4 Reserved CAL Stage1_1 Stage2_1 Stage3_1 Stage4_1 Stage5_1 Stage6_1 Stage7_1 Stage8_1 Stage9_1 Stage10_1 Stage11_1 Stage12_1 Stage13_1 Stage14_1 CON ConcentrationValue COM UserComment PM10 PM2_5 PM1 PN10 PN2_5 PN1
______________________________________ _________________ ______ ______ ______ ______ ______ ______ ______ ______ ______ _______ _______ ________ _______ _______ ________ ________ ________ _____ ___________________ _______________ ___________ ___________ ___________ ___________ ___________ ___________ ______ ______ _____ _____ ________ _______ _________ ________ ________ ________ ________ ________ ________ ________ ________ _________ _________ _________ _________ _________ _______ __________________ _______ ___________ _______ _______ _______ _____ _____ _____
{'2019/02/26 12:30:21.8'} {'CbTlsfpAvzij0'} 106.2 72.35 121.2 155.9 259.9 379.2 340.7 231.8 68.89 24.7 10.26 7.802 19.51 12.76 {'MISC'} 0.9982 3437 19.99 39.96 1021 0.01602 0.01564 0.01564 0.01556 0.01556 0.01564 101.5 -50.8 24.83 29.85 0 {'CAL'} 1.019e+05 18730 14200 8973 8909 6805 4143 1391 163.6 1.231 0 0.0107 0.8538 0.02063 {'CON'} 58360 {'COM'} NaN 0.05119 0.04368 0.04368 58360 58360 58360
{'2019/02/26 12:31:21.8'} {'CbTlsfpAvzij0'} 100.7 66.72 118.6 152.5 252.8 372.3 329.9 226.5 69.89 24.14 12.06 4.145 20.63 8.241 {'MISC'} 0.9981 3551 19.99 40.05 1020 0.01602 0.01564 0.01572 0.01549 0.01556 0.01572 -423.7 -28.2 24.8 30.67 0 {'CAL'} 96650 17200 13960 8805 8681 6693 4015 1362 174.8 6.267 0.1556 0.6002 1.995 0.1764 {'CON'} 55490 {'COM'} NaN 0.06779 0.04344 0.04332 55490 55490 55490
{'2019/02/26 12:32:21.8'} {'CbTlsfpAvzij0'} 106.9 78.78 123.5 154.4 249.9 365.3 327.2 223.4 70.82 24.62 12.62 7.961 20.35 8.58 {'MISC'} 0.9982 3507 19.99 40.47 1020 0.01595 0.01556 0.01579 0.01556 0.01549 0.01556 189.9 -244.5 24.79 31.08 0 {'CAL'} 1.025e+05 20630 14440 8850 8525 6532 3961 1333 171.8 2.635 0 0.006366 0.8741 0 {'CON'} 59010 {'COM'} NaN 0.04922 0.04255 0.04255 59010 59010 59010
...
to convert to datetime and set first column name.
More elegant would be to then use detectImportOptions on the cleaned-up file and can then set the time import format, etc., in it.
NB: If this format is consistent, you don't need to actually call detectImportOptions each time; create a library of import options objects for each particular type and save them as .mat files to reload and use where needed.
Either of the three ways gets around the problem your instrument vendor has created...you can't use their files in Excel as is, either. Something seems wrong in the setup; it doesn't make sense they would have done this deliberately.

Produkte


Version

R2018b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by