How to slice each string in a string array without using for loop

37 Ansichten (letzte 30 Tage)
For a string array, for example,
celldata =
3×1 cell array
{'2018-12-12'}
{'2018-11-05'}
{'2018-09-02'}
Is there array operation (i.e. without using for loop) to extract the months from each cell and convert them to a 3*1 numeric matrix, which should be [12;11;09]. I don't want to use for loop because it was too slow.

Akzeptierte Antwort

Cedric
Cedric am 19 Sep. 2018
Bearbeitet: Cedric am 19 Sep. 2018
If you favor performance over readability/maintainability, you can build an approach around the following:
buffer = vertcat( dates{:} ) ;
months = (buffer(:,6:7) - '00') * [10;1] ;
where dates is a cell array of date strings ( celldata in your example).
Here is the benchmark:
% Build "large" data set.
N = 1e4 ;
dates = repmat( {'2018-12-12'; '2018-11-05'; '2018-09-02'}, N, 1 ) ;
% Basic FOR loop, STR2DOUBLE.
tic ;
n = numel( dates ) ;
months_forStr2double = zeros( n, 1 ) ;
for k = 1 : n
months_forStr2double(k) = str2double( dates{k}(6:7) ) ;
end
fprintf( 'Basic FOR, STR2DOUBLE : %.3fs\n', toc ) ;
% Basic FOR loop, STR2NUM.
tic ;
n = numel( dates ) ;
months_forStr2num = zeros( n, 1 ) ;
for k = 1 : n
months_forStr2num(k) = str2num( dates{k}(6:7) ) ;
end
fprintf( 'Basic FOR, STR2NUM : %.3fs\n', toc ) ;
% Basic FOR loop, SSCANF.
tic ;
n = numel( dates ) ;
months_forScanf = zeros( n, 1 ) ;
for k = 1 : n
months_forScanf(k) = sscanf( dates{k}(6:7), '%d' ) ;
end
fprintf( 'Basic FOR, SSCANF : %.3fs\n', toc ) ;
% CELLFUN (hidden FOR), SSCANF.
tic ;
months_cellfun = cellfun( @(date) sscanf( date(6:7), '%d' ), dates ) ;
fprintf( 'CELLFUN, SSCANF : %.3fs\n', toc ) ;
% REGEXP
tic ;
months_regexp = str2double( regexp( dates,'(?<=-)\d+(?=-)','match','once' )) ;
fprintf( 'REGEXP : %.3fs\n', toc ) ;
% CELL2MAT, STR2NUM
tic ;
chardata = cell2mat( dates ) ;
months_cell2matStr2num = str2num( chardata(:,6:7) ) ;
fprintf( 'CELL2MAT, STR2NUM : %.3fs\n', toc ) ;
% DATETIME
tic ;
dt = datetime( dates) ;
months_datetime = month( dt ) ;
fprintf( 'DATETIME : %.3fs\n', toc ) ;
% SSCANF
tic ;
months_sscanf = sscanf([dates{:}],'%*4d-%2d-%*2d') ;
fprintf( 'SSCANF : %.3fs\n', toc ) ;
% EXTRACTBETWEEN
tic ;
months_extractBetween = extractBetween( dates, '-', '-' ) ;
months_extractBetween = cellfun( @str2double, months_extractBetween ) ;
fprintf( 'EXTRACTBETWEEN : %.3fs\n', toc ) ;
% Trick.
tic ;
buffer = vertcat( dates{:} ) ;
months_trick = (buffer(:,6:7) - '00') * [10;1] ;
fprintf( 'Trick: %.3fs\n', toc ) ;
% Check
disp( [isequal( months_forStr2num, months_forStr2double ), ...
isequal( months_forScanf, months_forStr2double ), ...
isequal( months_cellfun, months_forStr2double ), ...
isequal( months_regexp, months_forStr2double ), ...
isequal( months_cell2matStr2num, months_forStr2double ), ...
isequal( months_datetime, months_forStr2double ), ...
isequal( months_sscanf, months_forStr2double ), ...
isequal( months_extractBetween, months_forStr2double ), ...
isequal( months_trick, months_forStr2double )] ) ;
Output:
Basic FOR, STR2DOUBLE : 0.489s
Basic FOR, STR2NUM : 0.975s
Basic FOR, SSCANF : 0.356s
CELLFUN, SSCANF : 0.550s
REGEXP : 0.673s
CELL2MAT, STR2NUM : 0.015s
DATETIME : 0.201s
SSCANF : 0.023s
EXTRACTBETWEEN : 0.624s
Trick: 0.008s
1 1 1 1 1 1 1 1 1
  2 Kommentare
Stephen23
Stephen23 am 19 Sep. 2018
Bearbeitet: Stephen23 am 19 Sep. 2018
sscanf does not require a loop, simply concatenate the char vectors and use an appropriate format string:
C = {'2018-12-12','2018-11-05','2018-09-02'}
sscanf([C{:}],'%*4d-%2d-%*2d')
Cedric
Cedric am 19 Sep. 2018
Thanks Stephen, just added this to the benchmark!

Melden Sie sich an, um zu kommentieren.

Weitere Antworten (4)

Christopher Wallace
Christopher Wallace am 19 Sep. 2018
chardata = cell2mat(a);
numdata = str2num(chardata(:,6:7));

Paolo
Paolo am 18 Sep. 2018
Bearbeitet: Paolo am 18 Sep. 2018
str2double(regexp(celldata,'(?<=-)\d+(?=-)','match','once'))
ans =
12
11
9

Star Strider
Star Strider am 19 Sep. 2018
Using datetime and its functions:
celldata = [{'2018-12-12'}
{'2018-11-05'}
{'2018-09-02'}];
dt = datetime(celldata);
M = month(dt)
M =
12
11
9

Akira Agata
Akira Agata am 19 Sep. 2018
Another possible solution:
celldata = [{'2018-12-12'}
{'2018-11-05'}
{'2018-09-02'}];
M = extractBetween(celldata,'-','-');
M = cellfun(@str2double,M);

Kategorien

Mehr zu Data Type Conversion finden Sie in Help Center und File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by