I am trying to make an excel import function robust to treat either empty cells or cells containing only spaces in a specific way, namely to fill with the string
(I know, this may not seem useful, but it is for me...)
Variables = [...
"double","Index";
"string","StringData";
"string","Comment"];
opts = spreadsheetImportOptions(...
"NumVariables",size(Variables,1),...
"VariableNames",Variables(:,2)',...
"VariableTypes",Variables(:,1)',...
"DataRange","A3",...
"VariableNamesRange","A2",...
"VariableDescriptionsRange","A1",...
"RowNamesRange","A3",...
"ImportErrorRule","fill");
opts = setvaropts(opts,"StringData","FillValue","");
out = readtable("example_empty_cell.xlsx",opts)
For the attached excel file, this returns
out =
6×3 table
Index StringData Comment
_____ __________ ______________________
1 1 <missing> "single space"
2 2 <missing> "three spaces"
3 3 <missing> "empty"
4 4 <missing> "apostrophe and space"
5 5 <missing> "apostrophe only"
6 6 "hello" "non empty string"
It's not what I would expect...but if my replacement string is instead
opts = setvaropts(opts,"StringData","FillValue"," ");
I get the expected
out =
6×3 table
Index StringData Comment
_____ __________ ______________________
1 1 " " "single space"
2 2 " " "three spaces"
3 3 " " "empty"
4 4 " " "apostrophe and space"
5 5 " " "apostrophe only"
6 6 "hello" "non empty string"
Perhaps this issue is not specific to readtable/setvaropts, but rather some general internal conversion of "" into <missing>?
One clue is that if I set the fill value to strings(0), it produces an error saying
Expected a value which can be converted to a character vector.
In any case, I'm wondering if there is a one-shot way to achieve what I want without having to post-process the imported data one way or another?