readtable not reading logical values as expected

I am using version 9.9.0.1592791 (R2020b) Update 5.
I found this behavior while reading and writing tables via readtable() and writetable().
% First, I make a table and write it to file:
myint = 1;
mychar = {'1'};
mybool = true;
t = table();
t = addvars(t, myint);
t = addvars(t, mychar);
t = addvars(t, mybool)
writetable(t, 'mytable.csv');
% This gives the table t:
%{
t =
  1×3 table
    myint    mychar    mybool
    _____    ______    ______
      1      {'1'}     true
%}
% and the .csv file:
%{
myint,mychar,mybool
1,1,1
%}
% Note that "true" is written as a "1" as is expected
% Now, I try to read the table, properly setting the options:
opts = detectImportOptions('mytable.csv');
opts = setvartype(opts,'myint','int8');
opts = setvartype(opts,'mychar','char');
opts = setvartype(opts,'mybool','logical');
t = readtable('mytable.csv',opts)
% Instead of the expected result of getting a table with values 1 (as an integer), "1" (as a char), and true (logical), I get:
%{
t =
  1×3 table
    myint    mychar    mybool
    _____    ______    ______
      1      {'1'}     false
%}
% When I change my file to:
%{
myint,mychar,mybool
1,1,true
%}
% readtable() reads the data as expected, giving:
%{
t =
  1×3 table
    myint    mychar    mybool
    _____    ______    ______
      1      {'1'}     true
%}
Why is this happening? This is unexpected, especially considering that writetable writes logical values as '0' and '1'. Is this expected or is this a bug?
Note: When I use .xls instead of .csv, it behaves as expected.

1 Comment

Adam Danz
Adam Danz on 10 Aug 2021
Edited: Adam Danz on 10 Aug 2021
Same behavior in R2021a. This is unexpected.
You should report this to tech support: Contact Us - MATLAB & Simulink
As a workaround, read in the logical column as double and then convert it to logical.
opts = setvartype(opts,'mybool','double');
t = readtable('mytable.csv',opts);
t.mybool = logical(t.mybool);


Accepted Answer

Jeremy Hughes
Jeremy Hughes on 10 Aug 2021
Edited: Jeremy Hughes on 10 Aug 2021
When detectImportOptions sees "1" and "0" in a column, it detects that column as numeric by default. The default logical import expects "true"/"t" or "false"/"f" (case insensitive), but this can be overridden.
T = array2table(randn(6,3)>0)
T = 6×3 table
    Var1     Var2     Var3
    _____    _____    _____
    true     true     true
    true     true     false
    true     false    true
    true     true     true
    false    false    true
    true     false    true
writetable(T,"mytable.csv","WriteVariableNames",false)
opts = delimitedTextImportOptions("NumVariables",3,...
"VariableNames",["myint","mychar","mybool"],...
"VariableTypes",["int8","char","logical"]);
getvaropts(opts,'mybool')
ans =
LogicalVariableImportOptions with properties:
Variable Properties:
Name: 'mybool'
Type: 'logical'
FillValue: 0
TreatAsMissing: {}
QuoteRule: 'remove'
Prefixes: {}
Suffixes: {}
EmptyFieldRule: 'missing'
Logical Options:
TrueSymbols: {'true' 't'}
FalseSymbols: {'false' 'f'}
CaseSensitive: 0
You can set what gets treated as true or false with the true and false symbols.
opts = setvaropts(opts,'mybool',...
"TrueSymbols",["t","true","1"],...
"FalseSymbols",["f","false","0"]);
t = readtable('mytable.csv',opts)
t = 6×3 table
    myint    mychar    mybool
    _____    ______    ______
      1      {'1'}     true
      1      {'1'}     false
      1      {'0'}     true
      1      {'1'}     true
      0      {'0'}     true
      1      {'0'}     true
Note: This matches only the literal text "0" and "1". It will not read "0.0" as false or "2" as true, but it should be good enough for most cases.
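If the file may contain other numeric spellings such as "0.0" or "2", one possible route is to import the column as double and convert afterwards, so that every nonzero number becomes true. The sketch below is only an illustration: the file contents and the `id` column are made up, and the NaN check is there because `logical` throws an error on NaN values.

```matlab
% Hypothetical file: the id column and the values are made up for illustration.
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'id,mybool\n1,0.0\n2,2\n3,1\n4,0\n');
fclose(fid);

% Import the column as a number instead of as logical.
opts = detectImportOptions(fname);
opts = setvartype(opts, 'mybool', 'double');
t = readtable(fname, opts);

% logical() errors on NaN, so check for missing fields before converting.
assert(~any(isnan(t.mybool)), 'mybool contains missing values');

% Zero becomes false, any other number becomes true.
t.mybool = logical(t.mybool);
delete(fname);
```

This trades the symbol matching of the logical importer for MATLAB's usual numeric rule, so it only suits files that never use "t"/"true" style text in that column.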

11 Comments

Andrew Wisti
Andrew Wisti on 10 Aug 2021
Edited: Andrew Wisti on 10 Aug 2021
Okay, I see the issue. The key part is the default values for the TrueSymbols and FalseSymbols properties of the LogicalVariableImportOptions. By default only 'true' and 't' are read as true, and while 'false' and 'f' are explicitly set to read as false, anything that is not true is read as false.
Repeating the initial setup:
% write a table of 1,'1',true to csv
myint = 1;
mychar = {'1'};
mybool = true;
t = table();
t = addvars(t, myint);
t = addvars(t, mychar);
t = addvars(t, mybool);
writetable(t, 'mytable.csv');
% This actually writes:
% myint,mychar,mybool
% 1,1,1
% attempt to read table
opts = detectImportOptions('mytable.csv');
opts = setvartype(opts,'myint','int8');
opts = setvartype(opts,'mychar','char');
opts = setvartype(opts,'mybool','logical');
getvaropts(opts,'mybool')
The variable options for mybool show:
LogicalVariableImportOptions with properties:
Variable Properties:
Name: 'mybool'
Type: 'logical'
FillValue: 0
TreatAsMissing: {}
QuoteRule: 'remove'
Prefixes: {}
Suffixes: {}
EmptyFieldRule: 'missing'
Logical Options:
TrueSymbols: {'true' 't'}
FalseSymbols: {'false' 'f'}
CaseSensitive: 0
Because the default true symbols are 'true' and 't', the '1' in the file does not get read as true, and it appears by default anything that's not true or false is read as false.
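Putting the accepted answer's `setvaropts` call together with the detected options from the original question, a round trip might look like the sketch below. It uses a temporary file name in place of 'mytable.csv' and is a minimal illustration, not a statement of how the importer is meant to be configured.

```matlab
% Write the one-row table from the question; the row is stored as 1,1,1.
fname = [tempname '.csv'];
t = table(1, {'1'}, true, 'VariableNames', {'myint', 'mychar', 'mybool'});
writetable(t, fname);

% Detect the options, then set the variable types as in the question.
opts = detectImportOptions(fname);
opts = setvartype(opts, 'myint', 'int8');
opts = setvartype(opts, 'mychar', 'char');
opts = setvartype(opts, 'mybool', 'logical');

% Add "1" and "0" to the symbols the logical importer recognizes.
opts = setvaropts(opts, 'mybool', ...
    'TrueSymbols',  {'true', 't', '1'}, ...
    'FalseSymbols', {'false', 'f', '0'});

t2 = readtable(fname, opts);
delete(fname);
```

With the extra symbols in place, the "1" in the third column matches a true symbol instead of falling through to the fill value.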
But why? writetable() writes logicals as 0 and 1 and MATLAB itself interprets 0 and 1 as false and true, respectively:
true == 1 % logical 1
true == 0 % logical 0
false == 1 % logical 0
false == 0 % logical 1
true == ~0 % logical 1
true == ~1 % logical 0
false == ~1 % logical 1
false == ~0 % logical 0
So it seems that 0 and 1 should be included in the default TrueSymbols and FalseSymbols.
But why does the example provided by OP have unexpected results? The original table contained a true/1 value in column 3 but it is read-in as false even though the variable type is set to logical. Examples from the documentation show csv files being read-in by detectImportOptions rather than delimitedTextImportOptions.
I wonder if it's a problem with the encoding
opts = detectImportOptions('mytable.csv')
opts.Encoding % = UTF-8
I would be surprised if the encoding came into play, since all the characters in question are in the ASCII range.
I didn't see the exact file written, but I guess it contains "1" and not "true".
Anything that doesn't match the true and false symbols follows the ImportErrorRule, which is "fill" by default. That replaces the field where the error happened with the FillValue, which in this case is false.
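One way to make that failure visible rather than silent is to change the ImportErrorRule on the options object. The sketch below assumes the rule applies to unmatched logical fields as described above; the temporary file name is made up, and the try/catch is only there to show that the import aborts.

```matlab
% Write a one-row table whose logical column is stored as "1".
fname = [tempname '.csv'];
t = table(1, {'1'}, true, 'VariableNames', {'myint', 'mychar', 'mybool'});
writetable(t, fname);

opts = detectImportOptions(fname);
opts = setvartype(opts, 'mybool', 'logical');

% Default rule 'fill': the unmatched "1" is replaced by FillValue (false).
tFill = readtable(fname, opts);

% Rule 'error': the same unmatched field should abort the import instead.
opts.ImportErrorRule = 'error';
try
    readtable(fname, opts);
    didError = false;
catch err
    didError = true;
    disp(err.message)
end
delete(fname);
```

Switching to 'error' does not fix the data, but it turns a silent true-to-false conversion into a failure that is hard to miss.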
If it were literally "true" in the file, it should have read it as true.
Also, detectImportOptions will return a delimitedTextImportOptions for this case. I just skipped the detection part since the OP was redefining the properties anyway. Detection is more expensive than just creating options--but detection is a lot more convenient.
While this answers my question, it raises another, which I tried to get at in my previous comment: why are '1' and '0' not default values for TrueSymbols and FalseSymbols? Again, MATLAB treats them as valid values for true and false and writetable() writes true and false as '1' and '0'. It is absolutely unexpected to write a table then read it in immediately and then get a different table when I try to specify data types. I understand that this is entering programming philosophy territory but again, this behavior is completely unexpected.
Jeremy Hughes
Jeremy Hughes on 11 Aug 2021
Edited: Jeremy Hughes on 11 Aug 2021
I think the reason 1 and 0 aren't treated as logical by default is the "Note:" part of my answer. Should the importer treat only the literal text "1" as true, or any text representation of a number equal to 1, or all numbers not equal to zero?
For the latter two cases, importing as uint8 or double, and calling T.mybool = logical(T.mybool) works well enough, but I understand why someone might expect that to happen automatically.
I'm happy to create an enhancement request for this on your behalf.
Stephen23
Stephen23 on 11 Aug 2021
Edited: Stephen23 on 11 Aug 2021
"but I understand why someone might expect that to happen automatically."
I second this opinion. It is reasonable for the automagic detection to treat 0/1 as numeric, but once the user has explicitly specified the type of that variable it seems reasonable to expect a bit more effort from the importer and not just stick to t/f/true/false. Perhaps in addition it could use MATLAB's usual definition: 0==false, everything else is true.
> I'm happy to create an enhancement request for this on your behalf.
Yes please, that would be great.
Adam Danz
Adam Danz on 11 Aug 2021
Thanks for the explanation, @Jeremy Hughes. I also second the enhancement request to consider 0=false ~0=true when the variable type is declared as logical.
Sorry for bumping this almost a year later, but this bug has not been fixed yet (MATLAB version 2022a update 2)
This bug just bit me recently and it took me a while to figure out what was wrong. This is a nasty bug because true entries are converted to false entirely silently. When I submitted a bug report I was directed here and given the same workarounds, which is fine, but to my surprise the case was quickly closed (in less than 24 hours).
This attitude is quite frustrating from a user's perspective. It is the second time that a bug report that I submit is getting "closed" despite only being given cheap workarounds. Why close a case if the reported bug is not fixed? And side question: Why is the "bug fix" suggestion here qualified as an "enhancement" rather than what it is, a "bug fix"?
(Apologies for the disgruntled tone.)
Hi, the reason the case was closed so quickly is that the report already exists in the system and it's not considered a bug. Your expectation and the design of this feature aren't in alignment, but the feature is working as designed. This is, at least from a "policy" perspective, why it's an enhancement. Unfortunately, changing the default behavior would result in silent breaking changes to existing uses that expect the current behavior. I realize that's not the most satisfying answer for your case.
Based on what you're expecting, instead of "logical" you'd be better off continuing to treat this as a number on import (uint8 perhaps) and converting it afterwards. There aren't any performance ramifications for that; it's just one line:
t.mybool = logical(t.mybool)
That works as long as there aren't any "T" or "True" values as well. The paradigm of logical representation needs to be consistent in the file at the very least, or all hope is lost.


Version

R2020b