readtable is ignoring import options to get variable names
Am I being stupid or is this function not logical?
I want to import a csv file. There are 3 header lines. The actual variable names are on line 3. The units are on line 2. Line 1 is to be ignored.
So if I just run opts = detectImportOptions(file) and then set opts.VariableNamesLine = 3 and opts.VariableUnitsLine = 2, it picks up the latter but completely ignores the former, and just uses the variable names it originally picked up on line 1.
If I call detectImportOptions(file,'NumHeaderLines',1), it then picks up the units line as the names.
If I do it again and tell it to skip 2 lines, it picks the right names. I can then set opts.VariableUnitsLine to 2 and it does go back and pick the units correctly.
So I do get what I want in the end. But the function doesn't seem to work as expected, i.e. create the options first and then modify the line options. It seems that whatever it first picks up as the names gets set in stone, and you can't do anything about it (except the sorta hacky way I just worked out).
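The workaround described above can be sketched as follows. This is a minimal illustration, not a confirmed fix: the file contents and the name fname are made up for the example, and the thread suggests the behaviour differs between MATLAB releases.

```matlab
% Write a small CSV with the layout from the question (made-up data):
% line 1 is ignored, line 2 holds units, line 3 holds variable names.
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'log,export,v1\n');
fprintf(fid, 's,V,A\n');
fprintf(fid, 'time,voltage,current\n');
fprintf(fid, '0,1.5,0.10\n');
fprintf(fid, '1,1.6,0.20\n');
fclose(fid);

% Skip two header lines at detection time so that line 3 is taken as the names.
opts = detectImportOptions(fname, 'NumHeaderLines', 2);

% Then point back at line 2 for the units.
opts.VariableUnitsLine = 2;

T = readtable(fname, opts);
disp(T.Properties.VariableNames)
disp(T.Properties.VariableUnits)

delete(fname);  % remove the temporary file
```

The point of the sketch is the order of operations: the names line is fixed when the options are detected, and only the units line is changed afterwards.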
5 Comments
jonas
on 16 Aug 2018
attach the file
Alex Mason
on 16 Aug 2018
Adam Danz
on 16 Aug 2018
You could anonymize the data or create a working example that has the same structure but fake data that produces the same behavior as your current file. Providing the relevant code would also be helpful.
jonas
on 16 Aug 2018
I have not been able to solve it using readtable, but I was able to reproduce the problem easily using the attached textfile. So if anyone else wants to give it a try...
Accepted Answer
More Answers (2)
Jacob Hootman
on 8 Oct 2018
I had the same issue. I stepped through it in the debugger several times, and I believe this is a bug. Here is what I found:
Open TextImportOptions.m and go to line 211, it will read:
% Read Names
if opts.VariableNamesLine > 0 && rvn
names = readVariableNames(parser);
else
names = opts.SelectedVariableNames;
end
% Read Metadata
units = readVariableUnits(parser);
descr = readVariableDescriptions(parser);
The problem is that 'rvn' gets its value from a persistent variable, which means unless that parameter is specified on the first function call, it will always be false.
Change the && in the if statement to 'OR' logic (read the 'NOTES' below before doing so). Now the code will work as intended. This is what it should look like:
% Read Names
if opts.VariableNamesLine > 0 || rvn
names = readVariableNames(parser);
else
names = opts.SelectedVariableNames;
end
% Read Metadata
units = readVariableUnits(parser);
descr = readVariableDescriptions(parser);
Also, I'm not sure why the programmer decided to use an 'if else' statement to decide how to get the variable names, yet only calls a function to get the units and descriptions.
NOTES: (1) Making this change requires administrative access, (2) the .m file must be changed with a non-MATLAB editor (e.g. Notepad++), (3) this change will only affect your local machine (i.e. the same code will behave differently on other computers that do not have this change installed), (4) any updates that MATLAB installs may revert this code.
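To make the difference between the two conditions concrete, here is a small stand-alone sketch. The variables namesLine and rvn are stand-ins for opts.VariableNamesLine and the flag described above, not the actual toolbox internals.

```matlab
% Stand-in values: the user asked for names on line 3, but the
% flag that the answer calls 'rvn' is false.
namesLine = 3;
rvn = false;

% Original condition: both parts must be true, so the names are not re-read.
readWithAnd = namesLine > 0 && rvn;

% Modified condition: either part is enough, so the names are re-read.
readWithOr = namesLine > 0 || rvn;

fprintf('&& reads names: %d\n', readWithAnd);
fprintf('|| reads names: %d\n', readWithOr);
```

With these stand-in values, only the || form takes the branch that calls readVariableNames, which matches the behaviour reported in this answer.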
7 Comments
Adam Danz
on 9 Oct 2018
What is TextImportOptions.m? That's not a matlab file. Even detectImportOptions.m doesn't have the variable name opts.VariableNamesLine so I'm not sure what file you're working with.
Jacob Hootman
on 9 Oct 2018
Hmm, that's interesting. What version of matlab are you running? I'm on 9.3.0.713579 (R2017b).
TextImportOptions.m should exist in this folder:
C:\Program Files\MATLAB\R2017b\toolbox\shared\io\+matlab\+io\+text
The parent folders may be different depending on your setup.
jonas
on 9 Oct 2018
readtable seems to have had quite a few updates over the last couple of releases, or a major one at some point recently. I always run into trouble when helping my colleagues with imports, as they are missing several key features.
Guillaume
on 9 Oct 2018
detectImportOptions has been improved with every version since it was introduced, so I wouldn't expect the code to be similar from version to version. There's no TextImportOptions.m in R2018b; there's a getTextOpts.m instead, which delegates the heavy lifting to a built-in function (hence you can't see the actual detection code).
I see now, TextImportOptions is stored in a package directory which isn't allowed in the matlab path which is why it doesn't appear when I search for it using which() or similar methods (even in 2017b). It's a classdef m file. When you google " matlab TextImportOptions " there is nearly no information about this file.
Anyway, how did you end up in this classdef file? What function called it and how did you end up stepping through this file during debugging?
@Guillaume: That explains it. The fact that there are several different versions is unfortunate as it becomes difficult to write complex importopts for beginners on this forum. Many times people just reply with an error message, and therefore I usually opt for something more reliable such as textscan despite readtable usually being the more practical choice for semi-complex imports.
Sorry for interrupting your discussion, I will be on my way now :)
Jacob Hootman
on 28 Oct 2018
@Adam Danz I just kept stepping into every function that resulted in an error. I called the readtable function with arguments for both the fileName and the OPTS.
Juan Nicolás Ibáñez
on 23 Sep 2024
Edited: Juan Nicolás Ibáñez
on 23 Sep 2024
The help for the function detectImportOptions() says
% "ReadVariableNames" - Whether or not to expect variable names in
% the file. Defaults to true.
However, for one large database, it did not get the variable names until I specified that as true in the command line, like this:
opts = detectImportOptions(path_filename,'NumHeaderLines',0,'ReadVariableNames',1)
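A self-contained version of that call might look like the sketch below. The file contents and the name fname are invented for illustration; only the two name-value arguments come from the call above.

```matlab
% Write a tiny CSV whose first line holds the variable names (made-up data).
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'id,score\n');
fprintf(fid, '1,0.5\n');
fprintf(fid, '2,0.7\n');
fclose(fid);

% State explicitly that there are no extra header lines and that
% variable names should be read from the file.
opts = detectImportOptions(fname, 'NumHeaderLines', 0, 'ReadVariableNames', true);
T = readtable(fname, opts);
disp(T.Properties.VariableNames)

delete(fname);  % remove the temporary file
```

Passing both arguments explicitly avoids relying on detection, which is the approach that worked for the large file mentioned above.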