count occurrences of string in a single cell array (How many times a string appear)
88 Ansichten (letzte 30 Tage)
Ältere Kommentare anzeigen
KnowledgeSeeker
am 12 Feb. 2014
Kommentiert: Rahulsivanand s
am 25 Mai 2021
I have a single cell array containing long string as shown bellow:
xx = {'computer', 'car', 'computer', 'bus', 'tree', 'car'};
I am trying to achieve output in two cell array as shown:
xx = {'computer', 'car', 'bus', 'tree'}
occ = {'2', '2','1','1'}
Your suggestion and ideas are highly appreciated. Thanx in advance
0 Kommentare
Akzeptierte Antwort
Azzi Abdelmalek
am 12 Feb. 2014
xx = {'computer', 'car', 'computer', 'bus', 'tree', 'car'}
a=unique(xx,'stable')
b=cellfun(@(x) sum(ismember(xx,x)),a,'un',0)
2 Kommentare
Weitere Antworten (4)
Jos (10584)
am 12 Feb. 2014
A faster method and more direct method of counting using the additional output of UNIQUE:
XX = {'computer', 'car', 'computer', 'bus', 'tree', 'car'}
[uniqueXX, ~, J]=unique(XX)
occ = histc(J, 1:numel(uniqueXX))
6 Kommentare
Adam Danz
am 29 Aug. 2020
Bearbeitet: Adam Danz
am 29 Aug. 2020
For the carsmall data used in the other comparisons, histc was actually 1.33x faster than histcounts in r2019b and 1.22 x faster on r2020a (matlab online). On both machines I repeated the 10000-rep analysis 3 times and the final results were all within +/-0.02 of what's reported.
The difference between those numbers and your results may have to do with first-time-costs if you're just measuring the execution once with tic/toc.
I like your dedication to optimization! 😎
Bruno Luong
am 29 Aug. 2020
Bearbeitet: Bruno Luong
am 29 Aug. 2020
No first-time cost I ensure you. I post just one result ans snipet for simplicity, but I ran tic/toc on loop and within function and on 2 different computers (Windows 8.1 Windows 10 both with R2020a).
The conclusion on my side doesn't not change.
Yeah I'm kind of obssesing with Matlab speed, and I can't hide it.
MSchoenhart
am 27 Sep. 2018
Bearbeitet: Adam Danz
am 29 Aug. 2020
A very fast and simple vectorized method is to use categories (since R2013b). "countcats" is also using histc in the background but the code looks much cleaner:
xx = {'computer', 'car', 'computer', 'bus', 'tree', 'car'};
c = categorical(xx);
categories(c)
countcats(c)
2 Kommentare
Giuseppe Degan Di Dieco
am 27 Apr. 2021
Dear MSchoenhart,
thanks for your solution, it helped me too.
Best!
Bruno Luong
am 29 Aug. 2020
Bearbeitet: Bruno Luong
am 29 Aug. 2020
[yy,~,i] = unique(xx,'stable');
count = accumarray(i(:),1,[numel(yy),1]);
1 Kommentar
Girish Chandra
am 12 Feb. 2017
Bearbeitet: Adam Danz
am 29 Aug. 2020
Not using histc function you can do it in the following way
xx = {'computer', 'car', 'computer', 'bus', 'tree', 'car'}
U=unique(xx)
A=zeros(1,numel(U))
for i=1:numel(U)
for j=1:numel(xx)
if strcmp(U(i),xx(j))==1
A(i)=A(i)+1
end
end
end
3 Kommentare
Jon Adsersen
am 8 Apr. 2020
Based on the answer by Jos, a function that works for both numerical and string arrays could be formulated:
function [rep_values, N_rep, ind_rep] = f_reapeated_elements(A)
% Find repeated elements in A (can be both numeric or cell strings etc.)
% Outputs:
% rep_values - repeated values in A (occuring 2 or more times)
% N_rep - Number of repetitions of the values given in "rep_values"
% ind_rep - Ind in A of repeated values (occuring 2 or more times)
[un, ~, ind_un] = unique(A) ;
N_A = histc(ind_un,1:numel(un)) ;
rep_values = un(N_A>1) ;
N_rep = N_A(N_A>1) ;
ind_cell = cell(1, numel(rep_values)) ;
A_list = 1:numel(A) ;
for k = 1:numel(rep_values)
if isnumeric(rep_values)
ind_cell{k} = find(A == rep_values(k)) ;
else
log_ind = strcmp(A,rep_values(k)) ;
ind_cell{k} = A_list(log_ind) ;
end
end
ind_rep = unique([ind_cell{:}]) ;
Siehe auch
Kategorien
Mehr zu Logical finden Sie in Help Center und File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!