Finding Duplicate Values per Column
Greetings, suppose Column A has these values - 7 18 27 42 65 49 54 65 78 82 87 98
Is there a way to compare the values (row by row) and search for duplicates? (I'm using MATLAB R2010b.) I don't want the duplicated values to be removed.
Thanks.
Accepted Answer
Jan
on 22 Oct 2011
A = [7 18 27 42 65 49 54 65 78 82 87 98];
[n, bin] = histc(A, unique(A));
multiple = find(n > 1);
index = find(ismember(bin, multiple));
Now the values A(index) appear multiple times.
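For reference, the same indices can also be obtained without histc, which newer MATLAB releases mark as not recommended. The following is only a sketch of one alternative, not part of the answer above. It assumes the third output of unique combined with accumarray, and the names group, counts, and isRepeated are made up for illustration.

```matlab
A = [7 18 27 42 65 49 54 65 78 82 87 98];

% group(k) is the position of A(k) in the sorted list of unique values
[uniqueA, ~, group] = unique(A);
group = group(:);                 % force a column; orientation differs between releases

% counts(j) is how often uniqueA(j) occurs in A
counts = accumarray(group, 1);

% Mark every element of A whose value occurs more than once
isRepeated = counts(group) > 1;
index = find(isRepeated).';       % row vector of positions in A

A(index)                          % the duplicated values; A itself is unchanged
```

For this A, index is [5 8] and A(index) is [65 65], the same result as the histc version.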
More Answers (4)
the cyclist
on 22 Oct 2011
Here's a slightly different way:
X = [1 2 3 4 5 5 5 1];
uniqueX = unique(X);
countOfX = hist(X,uniqueX);
indexToRepeatedValue = (countOfX~=1);
repeatedValues = uniqueX(indexToRepeatedValue)
numberOfAppearancesOfRepeatedValues = countOfX(indexToRepeatedValue)
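Since the question asks where the duplicates sit rather than only which values repeat, one possible follow-up is to map the repeated values back to positions in X with ismember. This is a sketch that extends the code above; the name indexInX is an assumption and does not come from the answer.

```matlab
X = [1 2 3 4 5 5 5 1];
uniqueX = unique(X);
countOfX = hist(X, uniqueX);              % counts per unique value, as in the answer
repeatedValues = uniqueX(countOfX ~= 1);

% Positions in X of every element whose value is repeated
indexInX = find(ismember(X, repeatedValues));
```

Here indexInX is [1 5 6 7 8]. One caveat worth keeping in mind: if X holds only a single distinct value, uniqueX is a scalar and hist interprets a scalar second argument as a number of bins, so that case would need separate handling.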
Anurag Pujari
on 25 Mar 2016
Edited: Anurag Pujari
on 25 Mar 2016
Accurate. What an excellent piece of code.
Wesley Allen
on 9 Feb 2018
Edited: Wesley Allen
on 9 Feb 2018
Duplicate Finding with Tolerance
If you want to find duplicates with tolerances (e.g., for non-integers), I use the following:
A = [1.313;2.4;2.400000001;1.31299999999;2.25;2.25;2.25000000001;3.7];
TOL = 1e-5;
uniqueA = uniquetol(A,TOL);
duplicateBool = abs(repmat(A,size(uniqueA.'))-repmat(uniqueA.',size(A))) < max(abs(uniqueA))*TOL;
duplicateCount = sum(duplicateBool).';
Just like with the cyclist's answer, if you want to isolate only the values that have more than one instance:
iDuplicate = (duplicateCount ~= 1);
repeatedValues = uniqueA(iDuplicate);
numberOfAppearancesOfRepeatedValues = duplicateCount(iDuplicate);
repeatedBool = duplicateBool(:,iDuplicate);
Using the Results
The unique values are in uniqueA:
>> uniqueA
uniqueA =
1.3130
2.2500
2.4000
3.7000
The quantity of each unique value is in duplicateCount:
>> duplicateCount
duplicateCount =
2
3
2
1
To get the indices of A corresponding to the n-th unique value, uniqueA(n):
>> n = 2;
>> uniqueA(n)
ans =
2.2500
>> duplicateIndex = find(duplicateBool(:,n))
duplicateIndex =
5
6
7
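A more compact variant of the same idea, offered only as a sketch, reuses the grouping that uniquetol already computes instead of rebuilding it with repmat. It assumes the third output of uniquetol (called ic here) and accumarray; it is not taken from the answer above.

```matlab
A = [1.313;2.4;2.400000001;1.31299999999;2.25;2.25;2.25000000001;3.7];
TOL = 1e-5;

% ic(k) is the row of uniqueA that A(k) was merged into
[uniqueA, ~, ic] = uniquetol(A, TOL);
ic = ic(:);

% Count how many elements of A fall into each group
duplicateCount = accumarray(ic, 1);

% Indices of A belonging to the n-th unique value
n = 2;
duplicateIndex = find(ic == n);
```

For this A, duplicateCount is [2;3;2;1] and duplicateIndex is [5;6;7], matching the output shown above.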
Fernando Meo
on 13 Aug 2018
Here is another answer (a one-liner).
If AA is a 2D matrix and you wish to find the rows that contain duplicate values across their columns:
RowsWhichHaveDuplicates = find(arrayfun(@(i) (~isequal(length(unique(AA(i,:))),size(AA,2))), [1:size(AA,1)]));
Example
AA = [6 7 11 6; 7 11 4 8; 11 15 1 10; 15 4 14 12;
18 13 18 8; 12 13 18 1; 3 14 6 18];
>> RowsWhichHaveDuplicates = find(arrayfun(@(i) (~isequal(length(unique(AA(i,:))),size(AA,2))), [1:size(AA,1)]))
RowsWhichHaveDuplicates =
1 5
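The same rows can also be found without arrayfun. The sketch below sorts each row and looks for neighbouring equal values; it is an alternative technique, not the one-liner above, and the names sortedRows and hasDuplicate are assumptions.

```matlab
AA = [6 7 11 6; 7 11 4 8; 11 15 1 10; 15 4 14 12;
      18 13 18 8; 12 13 18 1; 3 14 6 18];

% Sort each row so that equal values become neighbours
sortedRows = sort(AA, 2);

% A zero difference between neighbours means the row holds a repeated value
hasDuplicate = any(diff(sortedRows, 1, 2) == 0, 2);

RowsWhichHaveDuplicates = find(hasDuplicate).'
```

For this AA the result is again rows 1 and 5.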
If your values are real (non-integer), a tolerance can be set by rounding them with the MATLAB "round" function to the number of decimal places you wish to compare.
AA = round(rand(10)*10,1); % First decimal place
AA =
6.0000 2.0000 0.4000 6.7000 9.4000 0.6000 8.3000 3.1000 1.0000 3.0000
9.1000 7.5000 0.6000 6.0000 0.7000 3.1000 0.3000 4.2000 9.0000 3.7000
2.5000 8.9000 5.0000 3.4000 7.2000 6.6000 8.4000 9.3000 9.0000 7.6000
8.6000 1.0000 4.1000 4.0000 8.3000 4.6000 2.6000 0.6000 0.8000 3.1000
7.6000 5.2000 2.2000 3.9000 7.3000 0.2000 6.6000 8.2000 5.2000 9.6000
2.2000 6.0000 4.3000 7.0000 5.1000 6.9000 6.7000 6.4000 2.8000 2.1000
4.2000 9.8000 9.5000 1.4000 5.2000 4.1000 2.6000 8.2000 8.8000 7.3000
1.3000 6.7000 2.0000 3.8000 7.6000 5.7000 3.3000 3.3000 6.7000 2.5000
9.2000 8.5000 7.1000 2.2000 6.3000 9.9000 2.5000 9.5000 1.2000 8.9000
2.9000 1.7000 7.8000 4.1000 0.7000 8.6000 7.1000 9.1000 3.7000 7.1000
RowsWhichHaveDuplicates = find(arrayfun(@(i) (~isequal(length(unique(AA(i,:))),size(AA,2))), [1:size(AA,1)]))
RowsWhichHaveDuplicates =
5 8 10
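As far as I know, the two-argument call round(x,1) is not available in releases as old as the R2010b mentioned in the question. A sketch of a workaround is to scale, round, and scale back. The small matrix below is hypothetical and chosen only so the result can be checked by hand.

```matlab
% Hypothetical small matrix, chosen so the result is easy to check by hand
AA = [1.26 1.31 2.00; 0.50 0.74 0.90];

% Round to the first decimal place without the two-argument form of round
AA = round(AA*10)/10;

% Same test as above: a row has duplicates if it has fewer unique values than columns
RowsWhichHaveDuplicates = find(arrayfun(@(i) (~isequal(length(unique(AA(i,:))),size(AA,2))), 1:size(AA,1)))
```

After rounding, row 1 becomes 1.3 1.3 2.0, so the result is 1.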
Hope this helps