Is it possible to extract numbers from formatted strings without a for loop?

I have {'abc12', 'abc23', 'abc24', 'abc99'} and I need the vector [12,23,24,99]. How can I do this without a for loop?

Accepted Answer

Stephen23 on 29 Jan 2018
Edited: Stephen23 on 29 Jan 2018
Much faster than cellfun, str2double, or strrep, and with no explicit loop:
>> C = {'abc12', 'abc23', 'abc24', 'abc99'};
>> V = sscanf([C{:}],'%*3c%d')
V =
12
23
24
99
>>
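To spell out what the one-liner above does, here is a minimal sketch. The intermediate variable S is introduced only for illustration, and the format assumes every string has exactly three leading characters before the number. In '%*3c%d', the '%*3c' part reads three characters and discards them, '%d' reads an integer, and sscanf reapplies the format until the input is exhausted.

```matlab
C = {'abc12', 'abc23', 'abc24', 'abc99'};

% Concatenate all cell contents into one char vector.
S = [C{:}];                  % 'abc12abc23abc24abc99'

% '%*3c' skips three characters, '%d' reads an integer.
% sscanf cycles through the format and returns a column vector.
V = sscanf(S, '%*3c%d');

% Transpose to get the row vector the question asks for.
V = V.';
```

The transpose is only needed if a row vector is required, as in the question.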
2 Comments
Jan on 29 Jan 2018
Edited: Jan on 29 Jan 2018
+1. Very efficient.
If C is large (e.g. 5000 elements), the concatenation takes a lot of time. It seems that MATLAB's horzcat has a problem with pre-allocation. Using FEX: Cell2Vec together with the format string 'abc%d', the code is even two times faster. But if the OP only needs this for short cell arrays of strings, the time spent compiling the fast C-MEX function might be wasted.
See timings in my answer.
Remark: What a pity that MATLAB's sprintfc is not documented and that there is no corresponding sscanfc.
Stephen23 on 29 Jan 2018
Edited: Stephen23 on 29 Jan 2018
@Jan Simon: thank you for your in-depth timing and investigation.
"with the format string 'abc%d' the code is even two times faster"
I thought this might be the case, but suspected (based on this question) that the user would not always want the same 'abc' characters being matched.
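If the leading characters vary in both content and length, neither 'abc%d' nor '%*3c%d' fits. One possible alternative, not taken from this thread's answers and not timed here, is to pull out the digit runs with regexp and convert them with str2double. The sketch assumes the prefixes contain no digits and the numbers are unsigned integers; the cell array C2 is a made-up example.

```matlab
% Hypothetical input with prefixes of different lengths.
C2 = {'ab12', 'xyz23', 'q24'};

% Find every run of digits in the concatenated text.
tokens = regexp([C2{:}], '\d+', 'match');   % {'12', '23', '24'}

% str2double accepts a cell array of char vectors.
V2 = str2double(tokens);
```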


More Answers (3)

Matt J on 29 Jan 2018
Edited: Matt J on 29 Jan 2018
No, it is not really possible to do this without a for-loop. The suggestions Star Strider and I have given you use str2double and/or cellfun, which have for-loops inside them.
If for-loops hidden in functions don't count for you, then okay, but you could just as easily write your own function to hide the loop.
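For comparison, the explicit loop that such a function would hide might look like the sketch below. It assumes the same three-character prefix as the question. Saving the loop in a function file of your own would hide it from the caller just as str2double or cellfun do.

```matlab
C = {'abc12', 'abc23', 'abc24', 'abc99'};

% Pre-allocate the result, then parse one string per iteration.
V = zeros(1, numel(C));
for k = 1:numel(C)
    % Skip three characters, then read one integer.
    V(k) = sscanf(C{k}, '%*3c%d');
end
```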
3 Comments
Matt J on 29 Jan 2018
Edited: Matt J on 29 Jan 2018
Hmmm. In R2017b, str2double is M-coded but strrep is not, so maybe both have loops, but they can't be the "same kind".
Jan on 29 Jan 2018
@Matt J: Exactly, this is what I actually wanted to express. Mentioning strrep and str2double was meant to supplement your answer. They contain loops at the M or C code level. As you wrote: "it is not really possible to do this without a for-loop". +1
Remark: Old versions of MATLAB shipped the C source code of many built-in functions, such as cellfun.c or histc.c. Even for some P-coded files, the corresponding M-files were shipped. These source files were very good examples. What a pity that they are no longer available in modern versions.



Jan on 29 Jan 2018
Edited: Jan on 29 Jan 2018
V = sscanf(Cell2Vec(C), 'abc%d');
Some timings:
C = repmat({'abc12', 'abc23', 'abc24', 'abc99'}, 1, 1000);
tic;
for k = 1:100
V = sscanf([C{:}],'%*3c%d');
end;
toc
tic;
for k = 1:100
V = sscanf(Cell2Vec(C), '%*3c%d');
end;
toc
% While '%*3c%d' requires some work, 'abc%d' is cheaper:
tic;
for k = 1:100
V = sscanf(Cell2Vec(C), 'abc%d');
end;
toc
tic;
for k = 1:100
V = str2double(strrep(C, 'abc', ''));
end
toc
tic;
for k = 1:100
V = cellfun(@str2double, regexp(C, '\d*', 'match'));
end
toc
Results:
Elapsed time is 0.556899 seconds. % sscanf([C{:}], '%*3c%d')
Elapsed time is 0.311912 seconds. % sscanf(Cell2Vec(C), '%*3c%d')
Elapsed time is 0.254616 seconds. % sscanf(Cell2Vec(C), 'abc%d')
Elapsed time is 12.940544 seconds. % str2double(strrep)
Elapsed time is 22.966292 seconds. % cellfun(@str2double, regexp)
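As an alternative to hand-written tic/toc loops, MATLAB's timeit function can be used to time a single call. The sketch below is not part of the measurements above and covers only two of the built-in variants, since Cell2Vec is a File Exchange submission that has to be compiled first. The variable names are chosen for illustration, and the resulting times depend on the machine and MATLAB release.

```matlab
C = repmat({'abc12', 'abc23', 'abc24', 'abc99'}, 1, 1000);

% timeit runs each function handle several times and
% returns a typical time for one call, in seconds.
tSscanf = timeit(@() sscanf([C{:}], '%*3c%d'));
tStrrep = timeit(@() str2double(strrep(C, 'abc', '')));

fprintf('sscanf([C{:}]):     %g s per call\n', tSscanf);
fprintf('str2double(strrep): %g s per call\n', tStrrep);
```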

Matt J on 29 Jan 2018
str2double( strrep( {'abc12', 'abc23', 'abc24', 'abc99'},'abc','') )
