MATLAB Answers

What is the difference between string arrays and cell arrays of character vectors?

Asked by Chris Volpe on 24 Apr 2017
Latest activity: commented on by Walter Roberson on 23 May 2019
R2016b allows you to create string arrays, and R2017a allows you to use the double-quote syntax for specifying string literals. What is the practical difference between a string array (e.g. ["one", "two"]) and a cell array of character vectors (e.g. {'one', 'two'})? Aside from minor conveniences like "strlength" (which could easily have been implemented to operate on cell arrays of character vectors), why should I care about this? Am I missing something?
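For concreteness, here is a minimal sketch of how the two containers behave side by side. It assumes R2017a or later for the double-quote syntax; the variable names are only illustrative.

```matlab
% String array versus cell array of character vectors.
s = ["one", "two"];      % 1x2 string array
c = {'one', 'two'};      % 1x2 cell array of character vectors

% Parentheses on a string array give a string; a cell needs braces
% to get at the text itself.
firstS = s(1);           % string scalar
firstC = c{1};           % character vector

% strlength accepts both containers.
lenS = strlength(s);
lenC = strlength(c);

% Conversion in either direction.
c2 = cellstr(s);
s2 = string(c);

% Appending text: plus works elementwise on strings, while the cell
% version goes through strcat.
sPlus = s + "!";
cPlus = strcat(c, '!');
```

So the difference shows up mostly in indexing and in which operators are defined, rather than in what text can be stored.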

  3 Comments

I don't have the new release, as I'm not able to run 64-bit here as yet, but I'd hazard that the biggest reason to use the new string class over cellstr arrays is that it should let you get rid of the ugly cellfun gyrations needed to do searching in cell arrays. Cellstr arrays require such abhorrences as the following:
function [is,idx] = isFundName(pool,name)
% isFundName(funds,name) returns a logical array marking where the
% requested fund name occurs within the fund names or pool
%
% [isName,locName] = isFundName(NAMELIST,NAME)
% returns the logical array and optionally the index locations within
% the name list
if isa(pool,'dataset')
    % dataset input: search its Fund variable
    is = ~cellfun('isempty',strfind(pool.Fund,name));
else
    % cell array of character vectors: search it directly
    is = ~cellfun('isempty',strfind(pool,name));
end
idx = find(is);
end
String functions such as strfind, applied to a cell array, return another cell array, so such nonsense as the above is needed.
Hopefully strings will let one write the above as simply a string search.
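As a rough sketch of what that simpler search can look like: contains, introduced in R2016b alongside string, returns the logical mask directly and accepts either container. The fund names below are made up for illustration.

```matlab
% Hypothetical fund names, used only for illustration.
pool = {'Growth Fund A', 'Income Fund', 'Growth Fund B'};
name = 'Growth';

% The cell-array idiom from the function above.
isOld = ~cellfun('isempty', strfind(pool, name));

% contains gives the same mask without cellfun, for a cellstr...
isNew = contains(pool, name);

% ...and for a string array.
isStr = contains(string(pool), name);

idx = find(isNew);
```

Note that contains works on the cell array too, so the cleaner search does not by itself depend on switching data types.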
Yes, but even without adding a new data type, a new function could have been added to the base language to perform that search over the cell array, hiding the cellfun call, and giving the same external appearance, no? I'm trying to think of a case where the desired usage/programming paradigm couldn't be achieved via functions and thus necessitated a new data type.
Is there any non-numeric data structure that could not be implemented as a struct plus functions added to the language?

3 Answers

Answer by Jurgen vL on 20 May 2019
Edited by Jurgen vL on 21 May 2019
 Accepted Answer

I'd like to add that for loops can become cleaner: instead of indexing with cellarray{idx}, you can use the loop variable directly, e.g. when displaying messages or iterating over struct fields.
for field = string(fieldnames(S)')
S.(field) = somevalue;
end
% I haven't figured out why this only works with horizontal arrays
In addition, cellfun typically requires the annoying 'UniformOutput' flag to be set to false when a function returns a character array. If a function that returns a char array were changed to return a scalar string, this would clean things up too, e.g.:
[~, patientID] = cellfun(@fileparts,{cohort.pfolder})
%no need for UniformOutput if fileparts() is modernized to return strings.
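For comparison, this is a sketch of the form that works with character vectors today, followed by a conversion to a string array. The folder names are hypothetical stand-ins for {cohort.pfolder}.

```matlab
% Hypothetical folder list standing in for {cohort.pfolder}.
folders = {'/data/p001', '/data/p002'};

% fileparts returns a character vector for char input, so the outputs
% have to be collected in a cell: 'UniformOutput', false is required.
[~, idCell] = cellfun(@fileparts, folders, 'UniformOutput', false);

% Converting afterwards gives one string per folder.
ids = string(idCell);
```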

  3 Comments

"I haven't figured out why this only works with horizontal arrays"
For some reason the for operator was designed and is documented to actually loop over the columns of its values argument. I have never seen anyone make use of this "feature" (in my experience it just gets in the way of writing compact, clear code by requiring pointless reshape calls).
I vaguely remember a question a year (or two?) ago where this quirk was actually helpful (by allowing rows of the object array to be processed). If I recall correctly that was in the context of parfor. I can't find it in the parfor doc, but I seem to recall parfor doesn't process your array of objects the same as for. I don't have the parallel computing toolbox, so I can only test the fallback implementation of parfor.
</ramble>
I have occasionally made use of the fact that for processes by columns. It seldom provides additional clarity, though.
What I have sometimes wanted is to loop over cell entries without having to do a specific de-reference.
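A small sketch of the column behaviour discussed in these comments, with made-up values. It also shows the de-reference that a cell loop still needs.

```matlab
names = ["alpha"; "beta"; "gamma"];   % 3x1 column string array

% for iterates over columns, so a column vector arrives in a single pass.
nCol = 0;
for v = names
    nCol = nCol + 1;      % v is the whole 3x1 array here
end

% Transposing to a row gives one string scalar per iteration.
nRow = 0;
for v = names'
    nRow = nRow + 1;
end

% Looping over a cell row yields 1x1 cells, so braces are still needed.
c = {'alpha', 'beta'};
for v = c
    txt = v{1};           % v is a 1x1 cell; v{1} is the char vector
end
```

This is why the loop over string(fieldnames(S)') in the answer needs the transpose: fieldnames returns a column.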


Answer by Steven Lord on 25 Apr 2017

You may find today's post from Loren's blog interesting and informative. If you have questions or feedback, as Dave wrote, "Expect to hear more from me on this topic. And please share your input with us by leaving a comment below. We're interested to hear from you."

  0 Comments


Answer by Walter Roberson on 25 Apr 2017

Students keep trying to use == to compare strings, and keep trying to use () to store strings. Making MATLAB easier for students is a practical difference.
Now as to whether it is faster or whether there are additional meaningful features... those are different questions ;-)
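To illustrate the two habits mentioned above, here is a minimal sketch with made-up values, showing what == and () do with strings and where the char equivalents trip up.

```matlab
a = "apple";
b = "banana";

% == compares strings as whole values, whatever their lengths.
eqStr = (a == b);

% Character vectors compare element by element with ==, which needs
% compatible sizes, so strcmp is the usual comparison for them.
eqChr = strcmp('apple', 'banana');

% Parentheses can store whole strings in an array.
fruits = strings(1, 2);
fruits(1) = "apple";
fruits(2) = "banana";

% The same assignment with char fails: five characters do not fit
% into one element.
try
    chars = '';
    chars(1) = 'apple';
    charAssignOK = true;
catch
    charAssignOK = false;
end
```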

  1 Comment

What about the search issue--are strfind and friends now string aware? If so, that would be a_good_thing (tm).
From the blog Steven L referenced, it appears "not yet". I'd echo the sentiments of another poster there that it would be better to hold off the introduction of these new features until they're really "ready for prime time" instead of just interesting little tidbits stuck on like the candy commercial...
How are strings displayed -- do they have a double-quote around them a la the single for cell strings to differentiate their appearance?
This is a 'yes' it seems...makes sense; presumed so but curious.
