Compare for uniqueness between 2 very large matrices

Koren Murphy on 13 Jan 2021
Edited: Adam Danz on 15 Jan 2021
I have two matrices of the same size - 95 x 100,000.
The columns are in different orders, but I would like to check whether any columns in one matrix are repeated anywhere in the other matrix, or whether the 2 matrices are completely unique.
  2 Comments
Iuliu Ardelean on 13 Jan 2021
You can use Lia = ismember(A,B,'rows'), which returns a logical column vector with one element per row of A; each element is true where that row of A is also a row of B.
Because you want to compare columns, you will need to transpose your matrices first.
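A minimal sketch of this suggestion, assuming A and B are numeric matrices with the same number of rows and contain no NaN values (ismember never treats NaN as a match). The small matrices are made up for illustration and are not the asker's data.

```matlab
% Hypothetical demo data: each column is one item to compare
A = [1 4 7; 2 5 8; 3 6 9];
B = [7 0 1; 8 0 2; 9 0 3];

% Transpose so columns become rows, then compare row-wise
isInB = ismember(A.', B.', 'rows').';   % 1-by-size(A,2) logical

% true if at least one column of A also occurs in B
anyShared = any(isInB);
```

If anyShared is false, the two matrices have no columns in common.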
Jan on 14 Jan 2021
I do not understand this sentence: "I want to know if the columns are completely unique - even if the same rows are filled in each they could have different values."
ismember(x,y,'rows') searches for equal rows in both matrices. This is exactly what "if columns in one matrix are repeated elsewhere in the other matrix" means, isn't it?


Answers (1)

Adam Danz on 13 Jan 2021
Edited: Adam Danz on 13 Jan 2021
To determine whether two arrays are 100% identical, use isequal, or use isequaln if NaN values in matching positions should be treated as equal.
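A short illustration of the difference between the two functions, using made-up vectors:

```matlab
% Two vectors that are identical, including a NaN in the same position
x = [1 NaN 3];
y = [1 NaN 3];

sameStrict = isequal(x, y);    % false: isequal treats NaN as unequal to NaN
sameNaN    = isequaln(x, y);   % true: isequaln treats NaN values as equal
```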
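To determine whether columns in matrix A are found in matrix B:
% create demo data
rng('default')
A = randi(3,3,20);
B = randi(4,3,20);
% preallocate the result, one element per column of A
isInB = false(1,size(A,2));
for i = 1:size(A,2)
    % compare column i of A (as a row) against all columns of B (as rows)
    isInB(i) = ismember(A(:,i)', B', 'rows');
end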
isInB is a 1xn logical vector for n columns of A where isInB(i) indicates whether column i of A is found in B.
To find the column numbers in B that match each column in A:
isMatchInB = false(size(A,2),size(B,2));
for i = 1:size(A,2)
isMatchInB(i,:) = arrayfun(@(j)isequaln(A(:,i),B(:,j)),1:size(B,2));
end
isInB = any(isMatchInB,2);
isInB is a 1xn logical vector for n columns of A where isInB(i) indicates whether column i of A is found in B.
isMatchInB is an m-by-n logical matrix (m columns in A, n columns in B) where isMatchInB(i,j) indicates whether column i of A matches column j of B.
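For the sizes in the question (100,000 columns in each matrix), a full 100,000-by-100,000 logical matrix would need about 10 GB, since each logical element takes one byte. One possible lighter-weight alternative, sketched here under the assumption that the data contain no NaN values, is the second output of ismember, which gives the lowest matching index in B for each column of A, or 0 when there is no match. The demo matrices are hypothetical.

```matlab
% Hypothetical demo data: column 3 of A occurs twice in B
A = [1 4 7; 2 5 8; 3 6 9];
B = [7 0 1 7; 8 0 2 8; 9 0 3 9];

% Compare columns by transposing and using the 'rows' option
[isInB, firstColInB] = ismember(A.', B.', 'rows');

isInB       = isInB.';        % 1-by-size(A,2) logical
firstColInB = firstColInB.';  % lowest matching column of B, 0 if none
```

This stores only two vectors of length size(A,2), but it reports just the first match in B rather than every match.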
  3 Comments
Adam Danz on 14 Jan 2021
You could try storing the subscript indices instead:
c = cell(1,size(A,2));
for i = 1:size(A,2)
c{i} = find(arrayfun(@(j)isequaln(A(:,i),B(:,j)),1:size(B,2)));
end
Bruno Luong on 14 Jan 2021
Edited: Bruno Luong on 14 Jan 2021
Or use a sparse matrix:
isMatchInB = sparse([],[],false(0),size(A,2),size(B,2));
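One way such a sparse matrix might be filled without looping over every pair of columns is sketched below. This is not from the thread: it labels each distinct column with unique(...,'rows') and multiplies two sparse indicator matrices. It assumes no NaN values (unique treats each NaN as distinct) and only saves memory when matches are rare. The variable names and demo data are hypothetical.

```matlab
% Hypothetical demo data: column 3 of A occurs twice in B
A = [1 4 7; 2 5 8; 3 6 9];
B = [7 0 1 7; 8 0 2 8; 9 0 3 9];
nA = size(A,2);
nB = size(B,2);

% Give every distinct column (across A and B) an integer label
[~, ~, label] = unique([A.'; B.'], 'rows');
labelA  = label(1:nA);
labelB  = label(nA+1:end);
nLabels = max(label);

% Indicator matrices: PA(i,k) = 1 if column i of A has label k,
% PB(k,j) = 1 if column j of B has label k
PA = sparse((1:nA).', labelA, 1, nA, nLabels);
PB = sparse(labelB, (1:nB).', 1, nLabels, nB);

% isMatchInB(i,j) is true where column i of A equals column j of B
isMatchInB = (PA * PB) > 0;

% Column pairs that match, listed in column-major order
[colA, colB] = find(isMatchInB);
```

Here colA and colB list the matching pairs directly, so the full matrix never needs to be stored in dense form.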
