intersection of multiple arrays

85 Ansichten (letzte 30 Tage)
Ananya Malik
Ananya Malik am 14 Sep. 2017
Kommentiert: Ananya Malik am 14 Sep. 2017
I have multiple 1D arrays with integers. Eg. A=[1,2,3,4] B=[1,4,5,6,7] C=[1,5,8,9,10,11]. Here 1 is repeated in all the arrays and 4 in A&B, 5 in B&C. I want these repeated values to be assigned to a single array, based on the size of the array, and removed from other arrays. How can I achieve this. Please help. TIA.
  5 Kommentare
KL
KL am 14 Sep. 2017
So you only wanto to remove 1 from all the matrices in the above example?
Ananya Malik
Ananya Malik am 14 Sep. 2017
Bearbeitet: Ananya Malik am 14 Sep. 2017
No. The common elements are [1,4,5]. Upon removing them from all arrays, we get A=[2,3] B=[6,7] and C=[8,9,10,11]. 1 can be present in either A or B. Suppose we put 1 in A ==> A=[1,2,3] B=[6,7] C=[8,9,10,11]. 4 can be present in A or B. Size(B)<size(A), therefore A=[1 2 3] B=[4,6 7] c=[8,9,10,11]. Finally 5 can be present in B or C, but size(B)<size(C). Therefore the final output should be A=[1,2,3] B=[4,5,6,7] and C=[8,9,10,11].

Melden Sie sich an, um zu kommentieren.

Akzeptierte Antwort

Guillaume
Guillaume am 14 Sep. 2017
This is in no way guaranteed to give you a perfectly balanced output. After that you need to look at optimisation algorithm which is beyond my field. This is also a lot less efficient than my other solution:
C = {[1 2 3 4 5 6], [1 4 5 6 7], [1 5 8 9 12 20]}
[uvals, ~, bin] = unique(cell2mat(C));
origarray = repelem(1:numel(C), cellfun(@numel, C));
dist = table(uvals', accumarray(bin, 1), accumarray(bin, origarray, [], @(x) {x}), 'VariableNames', {'value', 'repetition', 'arrayindices'});
dist = sortrows(dist, 'repetition');
The above builds a table of all the unique values, how many times they're repeated and where they come from originally. You can then iterate through that to distribute the values:
%distribute non-repeated values first
newC = accumarray(cell2mat(dist.arrayindices(dist.repetition == 1)), ...
dist.value(dist.repetition == 1), [], @(x){x'});
%and remove them from table
dist(dist.repetition == 1, :) = [];
%distribute remaining values one at a time
for row = 1:height(dist)
destarrays = dist.arrayindices{row}; %which array originally had the current value?
[~, destidx] = min(cellfun(@numel, newC(destarrays))); %find which is currently smallest
newC{destarrays(destidx)} = [newC{destarrays(destidx)}, dist.value(row)]; %append value to that smallest array
end
celldisp(newC)
I'm using a table here, which is not the most efficient container in matlab, but that make it easier to understand the code.
  1 Kommentar
Ananya Malik
Ananya Malik am 14 Sep. 2017
Thank you so much. Not exactly what I wanted, but pretty close. Will tweak it further for my use. Cheers

Melden Sie sich an, um zu kommentieren.

Weitere Antworten (2)

KL
KL am 14 Sep. 2017
A=[1,2,3,4];
B=[1,4,5,6,7];
C=[1,5,8,9,10,11];
M = {A,B,C};
N = {[B C], [A C], [A B]};
CommonElements = unique(cell2mat(cellfun(@intersect, M,N,'UniformOutput',false)));
NewM = cellfun(@(x) removeEl(x,CommonElements),M,'UniformOutput',false);
%and then
function x = removeEl(x,a)
for el=1:length(a)
x(x==a(el))=[];
end
end

Guillaume
Guillaume am 14 Sep. 2017
This is how I'd do it:
C = {[1 2 3 4], [1 4 5 6 7], [1 5 8 9 10 11]};
%sort cell array by increasing vector size:
[~, order] = sort(cellfun(@numel, C));
C = C(order);
%get all unique values
uvals = unique(cell2mat(C));
%go through all arrays, keeping all values in uvals, then removing them from uvals so they can't be used by the next array
for iarr = 1:numel(C)
C{iarr} = intersect(C{iarr}, uvals); %only keep the values in uvals
uvals = setdiff(uvals, C{iarr}); %and remove the one we've just used
end
celldisp(C)
  1 Kommentar
Ananya Malik
Ananya Malik am 14 Sep. 2017
Bearbeitet: Ananya Malik am 14 Sep. 2017
I am trying to balance the number of elements in the arrays. So if I change my input to A=[1 2 3 4 5 6], B=[1 4 5 6 7] and C=[1 5 8 9 10 11]. Using the code you provided gives me A=[1 4 5 6 7] B=[2 3] and C=[8 9 10 11]. Whereas ideal case would be A=[2 3 4 6] B=[1 7 5] and C=[8 9 10 11] or A=[1 2 3 4] B=[5 6 7] C=[8 9 10 11] or something along this line.

Melden Sie sich an, um zu kommentieren.

Kategorien

Mehr zu Matrices and Arrays finden Sie in Help Center und File Exchange

Produkte

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by