More efficient way to return a string's position in a cell string array
Given an array such as
>> whos endtxt
Name Size Bytes Class Attributes
endtxt 137x2 22466 cell
Surely there is a (much) less verbose way to return the location of a specific string within the array than
>> find(~cellfun('isempty',strfind(endtxt(:,2),'FULLER')))
ans =
46
>>
I haven't found an efficient way to do anything useful with the cell array that strfind returns, which is a zillion empty cells except for the one(s) of interest. The above does (finally!) work, but surely there is something simpler?
Answers (3)
Guillaume
on 8 Oct 2014
Assuming that the string you're looking for ('FULLER') is an exact match for one of the strings in the cell array (and not just a substring), then
find(ismember(endtxt(:, 2), 'FULLER'))
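To illustrate the exact-match caveat, here is a minimal sketch. The cell array `names` is a hypothetical stand-in for `endtxt(:,2)`; `strcmp` is shown as an equivalent exact-match test, and the original `strfind` approach is included for contrast because it also reports substring hits.

```matlab
% Hypothetical stand-in for endtxt(:,2)
names = {'SMITH'; 'FULLER'; 'FULLERTON'; 'JONES'};

% Exact match only: 'FULLERTON' is not reported
exactIdx  = find(ismember(names, 'FULLER'));
exactIdx2 = find(strcmp(names, 'FULLER'));

% Substring match: both 'FULLER' and 'FULLERTON' are reported
subIdx = find(~cellfun('isempty', strfind(names, 'FULLER')));
```

If the target may appear as part of a longer string, the exact-match forms will miss it, so the choice depends on what the data looks like.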
matt dash
on 8 Oct 2014
Well, here is an option with even more rigmarole, but it is faster, if that matters. Basically, make a copy of your text that is not a cell array, and cross-reference it with a vector indicating where the row breaks occur. For potentially even more speed, you could remove the find entirely by keeping a vector of row indices for every character in the text (if memory is not an issue).
1) Use cellfun(@length,<cells>) to get the length of each cell, then cumsum this to get the start index of each line (prepend a 0 at the beginning).
2) Convert the cell array to one long string with [<cells>{:}].
3) Now just use strfind on this one string to get the index.
4) Cross-reference this with the index vector from (1) to see which line it begins in.
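The four steps can be traced on a tiny example. The data here is hypothetical and chosen so that the hit falls on the last character of a line; the comparison in step 4 is strict (`<`) in this sketch so that such a hit is attributed to its own line rather than the next one.

```matlab
lines  = {'ab', 'cd', 'ef'};         % hypothetical data
q      = cellfun(@length, lines);    % step 1: length of each line
starts = [0 cumsum(q)];              % characters that precede each line
alltxt = [lines{:}];                 % step 2: one long string, 'abcdef'
a      = strfind(alltxt, 'd');       % step 3: position of the hit
row    = find(starts < a, 1, 'last'); % step 4: line containing the hit
```

Here `'d'` sits at position 4 of the joined string, and only the first two offsets are smaller than 4, so the hit is placed in line 2.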
Ridiculous, but on my computer it is 10-30x faster than find(~cellfun('isempty',strfind(lines,teststr)))
and seems to get faster for larger amounts of text.
code:
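fid = fopen('book.txt','r');   % some long text file
teststr = 'Lampsacus';         % some word in it
% read the text file line by line:
tline = fgetl(fid);
lines = {};
while ischar(tline)
    lines{end+1} = tline;
    tline = fgetl(fid);
end
fclose(fid);
% method 1: search one long string, then map each hit back to its line
q = cellfun(@length,lines);
starts = [0 cumsum(q)];        % starts(k) = number of characters before line k
alltxt = [lines{:}];
tic                            % note: the timing excludes the setup above
a = strfind(alltxt,teststr);
idx = zeros(size(a));          % preallocate one line number per hit
for i = 1:numel(a)
    % strict <, so a hit on the last character of a line stays in that line
    idx(i) = find(starts<a(i),1,'last');
end
toc
% method 2: strfind on the cell array directly
tic
idx2 = find(~cellfun('isempty',strfind(lines,teststr)));
toc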
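The variant mentioned earlier, keeping a row index for every character so that no find is needed, might look like the sketch below. The sample `lines` are hypothetical, the name `rowOfChar` is made up for illustration, and `repelem` is assumed to be available (it was added to MATLAB in R2015a, after this thread was written).

```matlab
% Hypothetical sample data standing in for the lines read from a file
lines   = {'ab', '', 'cd Lampsacus', 'x Lampsacus y'};
teststr = 'Lampsacus';

q      = cellfun(@length, lines);   % length of each line
alltxt = [lines{:}];                % one long string

% rowOfChar(k) is the line that character k of alltxt came from;
% empty lines contribute no characters
rowOfChar = repelem(1:numel(lines), q);

a   = strfind(alltxt, teststr);     % start position of each hit
idx = rowOfChar(a);                 % line number of each hit, no find needed
```

This trades memory (one number per character) for a plain indexing step. Like the other joined-string approach, it can report a hit that straddles two adjacent lines, since the line breaks are gone from `alltxt`.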