Readtable with mixed variable types - 2021a version behaving differently than 2019a
Hello,
I'm using readtable to read an Excel file. The file I'm reading is not strictly organized by columns. For example, in the second column, line 1 can have a string, line 2 a number, and line 3 an empty cell.
In 2019a this was no issue, I would simply get a table with empty strings or numbers read as strings, which I could easily convert.
In 2021a some columns are detected as numeric, and if the cell contains a string, it is simply read as "NaN". If I force the variable type to 'string', I still get empty cells (rather than empty strings), which breaks my subsequent code.
Which set of options can I pass to readtable so that
- cells containing strings are read as strings
- empty cells (in columns that are otherwise populated) are read as empty strings
- the number of columns read = maximum number of columns containing data in any row?
Thanks,
Martin
4 Comments
dpb
on 22 Aug 2021
Edited: dpb
on 23 Aug 2021
Unlike Fortran or C/C++, MATLAB is a proprietary product not bound by a standards committee. While there is an attempt at maintaining backwards compatibility at a given level, it is not at all unusual for MathWorks to change the operational behavior of various functions; higher-level abstractions like readtable in particular are regularly improved. The enhanced scanning, a relatively recent introduction, is most often of benefit in being able to assess and import irregular files more accurately, at the cost of some more overhead that is occasionally noticeable. Unfortunately, "there is no free lunch!", so once in a while a revision such as this can cause a hiccup in previous code, as you've noticed here.
In general, it's probably more reliable to spend a little more time with the import options in such a case and rely less on the default processing, which is, again, somewhat of a conundrum in that the whole point is to make the function more of an "easy-to-use, no intervention" tool. Sometimes it succeeds; occasionally it ends up going the other way. There is no perfect solution other than the status quo, which also isn't a viable development model.
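As a rough illustration of "spending more time with the import options", here is a minimal sketch rather than a tested recipe for the original file. The workbook name and its contents are made up for the demo. It assumes that setting FillValue to "" on string variables replaces the default <missing> entries with empty strings. It does not address the question of how many columns get detected.

```matlab
% Build a small hypothetical workbook with mixed content in the second column.
filename = fullfile(tempdir, 'mixed_demo.xlsx');   % hypothetical file name
C = {'id', 'value', 'note'; ...
     'a',  1,       'x'; ...
     'b',  'text',  ''; ...
     'c',  3,       'z'};
writecell(C, filename);

% Start from the detected options, then override the guessed types.
opts = detectImportOptions(filename);
opts = setvartype(opts, 'string');          % import every variable as string
opts = setvaropts(opts, 'FillValue', "");   % assumption: empty cells become "" instead of <missing>

T = readtable(filename, opts);
disp(T)
```

If the fill value does not behave as assumed on a given release, replacing the missing entries after the import (for example with ismissing and logical indexing) would be the fallback.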
TMW is pretty good about documenting changes; this one occurred in R2020a
readtable Function: Uses results of detectImportOptions function by default
Starting in R2020a, the readtable function uses the results of the detectImportOptions
function to import tabular data. In essence, these two readtable function calls behave
identically.
T = readtable(filename)
T = readtable(filename,detectImportOptions(filename))
Compatibility Considerations
There are several differences between the default behavior of readtable and its default
behavior in previous releases. To call readtable with the default behavior it had up to
R2019b, use the 'Format','auto' name-value pair argument.
T = readtable(filename,'Format','auto')
...
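For code that has to run on releases on both sides of the change, one possible sketch is a version guard around the call quoted above. The file name and contents are hypothetical. That 'Format','auto' is accepted for spreadsheet files is taken from the release note, not verified here.

```matlab
% Hypothetical demo workbook; replace with the real file.
filename = fullfile(tempdir, 'mixed_demo.xlsx');
writecell({'id', 'value'; 'a', 1; 'b', 'text'}, filename);

if verLessThan('matlab', '9.8')      % MATLAB 9.8 corresponds to R2020a
    % Before R2020a the default call already has the older behavior.
    T = readtable(filename);
else
    % Per the release note, this requests the pre-R2020a default behavior.
    T = readtable(filename, 'Format', 'auto');
end
```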
The whole skinny is at <release-notes-link>, although you have to navigate to the R2020a section and then the Data Import subsection.
Of course, if one doesn't update every release, there's a lot to go through every six months to have any hope of staying abreast...one of the disadvantages of such an active development cycle as compared to the advantage of new features and bug fixes...it's a tradeoff everybody has to make for themselves.
For mission-critical code, it is really a conundrum...one almost has to redo the whole validation exercise on each release which may be a very expensive and time-consuming effort.
Answers (0)