Is it possible to vectorize this?

Omar Ali Muhammed on 27 Apr 2021
Edited: Jan on 28 Apr 2021
I want to find the most frequently occurring elements in each column of a matrix, excluding zero elements.
E.g. if A = [1 2 0 3 4 6; 9 3 4 0 9 5; 4 3 0 5 6 7; 3 7 7 3 0 0;1 1 8 8 4 8; 0 0 0 0 4 2; 0 0 0 0 0 0]'
the result is a cell array B:
B={[1 2 3 4 6], [9], [4 3 5 6 7],[3 7], [8],[4 2], nan}
So the most frequent elements are returned as a cell array.
Loops are inefficient for large matrices.
Thanks in advance.
  6 Comments
Matt J on 27 Apr 2021
"loops are inefficient for large matrices."
Note that in Matlab, there is no way of populating a cell array that doesn't involve a for-loop (or something the same speed as a for-loop). So, the use of loops in some way will be inevitable.
Scott MacKenzie on 27 Apr 2021
Edited: Scott MacKenzie on 27 Apr 2021
Just to clarify, your question says "column wise". Do you mean "row wise"? Your example solution, B, shows the most occurring elements along the rows in A.
The code below avoids a loop and gets close to your goal:
% Assume A is the initial matrix (as in the example)
A(A==0) = NaN;
[~, ~, B] = mode(A,2);
B = B'
If A is your example matrix, then B matches your example result, except for the NaN entries where 0 occurs. Oddly (to me, anyway), you want 0s excluded except in the situation where all elements in a row are 0. In that case, NaN appears as the most frequent element. That doesn't quite add up to me, but that's the logic I see in your example.
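A minimal sketch of the NaN-masking idea above, applied along columns (dimension 1) and to a copy of the matrix so that A itself keeps its zeros. The names T and C are illustrative. The comment above reports NaN for the all-zero case; what the cell for an all-NaN column contains is worth checking on your own release, so the sketch does not rely on it.

```matlab
% Example matrix from the question (note the transpose: 6 rows, 7 columns)
A = [1 2 0 3 4 6; 9 3 4 0 9 5; 4 3 0 5 6 7; 3 7 7 3 0 0; ...
     1 1 8 8 4 8; 0 0 0 0 4 2; 0 0 0 0 0 0]';

T = A;                    % work on a copy so A keeps its zeros
T(T == 0) = NaN;          % mode ignores NaN values
[~, ~, C] = mode(T, 1);   % third output: all tied modes per column, as a cell array

% Each cell holds a sorted column vector; transpose to row vectors
B = cellfun(@(v) v.', C, 'UniformOutput', false);
```

Because the modes in each cell come back sorted, the third cell is [3 4 5 6 7] rather than the [4 3 5 6 7] written in the question.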


Accepted Answer

Sean de Wolski on 27 Apr 2021
Edited: Sean de Wolski on 27 Apr 2021
A = [1 2 0 3 4 6; 9 3 4 0 9 5; 4 3 0 5 6 7; 3 7 7 3 0 0;1 1 8 8 4 8; 0 0 0 0 4 2; 0 0 0 0 0 0]
A = 7×6
     1     2     0     3     4     6
     9     3     4     0     9     5
     4     3     0     5     6     7
     3     7     7     3     0     0
     1     1     8     8     4     8
     0     0     0     0     4     2
     0     0     0     0     0     0
B = accumarray(repmat((1:height(A)).',width(A),1),A(:), [],@(x)modeall(nonzeros(x)))
B = 7×1 cell array
    {5×1 double}
    {[       9]}
    {5×1 double}
    {2×1 double}
    {[       8]}
    {2×1 double}
    {[     NaN]}
celldisp(B)
B{1} = 1 2 3 4 6
B{2} = 9
B{3} = 3 4 5 6 7
B{4} = 3 7
B{5} = 8
B{6} = 2 4
B{7} = NaN
function m = modeall(x)
    [~,~,m] = mode(x);
    if isempty(m{1}) % Handle empty case
        m{1} = nan;
    end
end
  2 Comments
Omar Ali Muhammed on 27 Apr 2021
Edited: Omar Ali Muhammed on 27 Apr 2021
The code processes the matrix row-wise, not column-wise.
A =
1 9 4 3 1 0 0
2 3 3 7 1 0 0
0 4 0 7 8 0 0
3 0 5 3 8 0 0
4 9 6 0 4 4 0
6 5 7 0 8 2 0
Jan on 27 Apr 2021
@Omar Ali Muhammed: Then move it to a function and provide A.' as input.
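As an alternative to transposing the input, the grouping index can be built per column instead of per row. The following is a hedged sketch, not Sean's code: colIdx and G are illustrative names, and the modes are extracted in a short loop rather than inside the accumarray callback.

```matlab
A = [1 2 0 3 4 6; 9 3 4 0 9 5; 4 3 0 5 6 7; 3 7 7 3 0 0; ...
     1 1 8 8 4 8; 0 0 0 0 4 2; 0 0 0 0 0 0]';
[m, n] = size(A);

% A(:) is stored column by column, so each column index repeats m times
colIdx = repelem((1:n).', m, 1);

% Collect the nonzero entries of every column into one cell per column
G = accumarray(colIdx, A(:), [n 1], @(x) {nonzeros(x)});

% Extract all tied modes per column; an all-zero column yields NaN
B = cell(1, n);
for k = 1:n
    if isempty(G{k})
        B{k} = NaN;
    else
        [~, ~, c] = mode(G{k});   % c{1} is a sorted column vector of all modes
        B{k} = c{1}.';
    end
end
```

With the example matrix this gives one row vector per column of A, with NaN in the last cell.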


More Answers (2)

Bruno Luong on 27 Apr 2021
Edited: Bruno Luong on 27 Apr 2021
NOTE: with this algorithm, the most frequent elements are returned in sorted order:
A = [1 2 0 3 4 6;
9 3 4 0 9 5;
4 3 0 5 6 7;
3 7 7 3 0 0;
1 1 8 8 4 8;
0 0 0 0 4 2;
0 0 0 0 0 0]'
A = 6×7
     1     9     4     3     1     0     0
     2     3     3     7     1     0     0
     0     4     0     7     8     0     0
     3     0     5     3     8     0     0
     4     9     6     0     4     4     0
     6     5     7     0     8     2     0
% Algo
[u,~,I] = unique(A);
keep = A ~= 0;
[~,J] = find(keep);
c = accumarray([I(keep),J],1);
[r,c] = find(c == max(c,[],1) & c>0);
B = accumarray(c,r,[size(A,2) 1], @(r) {u(r)})';
celldisp(B)
B{1} = 1 2 3 4 6
B{2} = 9
B{3} = 3 4 5 6 7
B{4} = 3 7
B{5} = 8
B{6} = 2 4
B{7} = []
  1 Comment
Bruno Luong on 28 Apr 2021
Edited: Bruno Luong on 28 Apr 2021
If A contains reasonably small non-negative integers, the UNIQUE call can be dropped, which can make this method faster:
% I = A; % <= this replaces UNIQUE
keep = A ~= 0;
[~,J] = find(keep);
c = accumarray([A(keep),J],1);
[r,c] = find(c == max(c,[],1) & c>0);
B = accumarray(c,r,[size(A,2) 1], @(r) {r})'; % indexing u(r) is no longer needed
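Both variants leave the cell of an all-zero column empty ([]), whereas the question asks for NaN there. A minimal sketch of one way to patch this, based on the UNIQUE variant: the names counts and col are illustrative, the explicit reshape of I and the explicit size of the count table are precautions added here, and the final line is an addition that is not part of the original answer.

```matlab
A = [1 2 0 3 4 6; 9 3 4 0 9 5; 4 3 0 5 6 7; 3 7 7 3 0 0; ...
     1 1 8 8 4 8; 0 0 0 0 4 2; 0 0 0 0 0 0]';

[u, ~, I] = unique(A);        % u: sorted distinct values, I: index into u
I = reshape(I, size(A));      % same shape as A, so the mask below applies cleanly
keep = A ~= 0;                % ignore zeros
[~, J] = find(keep);          % column number of every kept element

% counts(i, j) = how often u(i) occurs in column j
counts = accumarray([I(keep), J], 1, [numel(u), size(A, 2)]);

% Keep every value whose count equals the column maximum (ties included)
[r, col] = find(counts == max(counts, [], 1) & counts > 0);
B = accumarray(col, r, [size(A, 2) 1], @(r) {u(r).'}).';

% Added step: replace empty cells (all-zero columns) by NaN
B(cellfun(@isempty, B)) = {NaN};
```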



Jan on 27 Apr 2021
mode() also handles matrices as inputs. Only ignoring the zeros is complicated.
For comparison, here is the loop method:
A = [1 2 0 3 4 6; 9 3 4 0 9 5; 4 3 0 5 6 7; 3 7 7 3 0 0;1 1 8 8 4 8; 0 0 0 0 4 2; 0 0 0 0 0 0];
C = ModeFull(A.');
celldisp(C)
function C = ModeFull(A)
% Mode along 1st dimension ignoring zeros
n = size(A, 2);
C = cell(1, n);
for k = 1:n
    a = A(:, k);
    a = a(a ~= 0);
    if isempty(a)
        C{k} = NaN;
    else
        x = sort(a);
        start = find([true; diff(x) ~= 0]);
        freq = zeros(numel(x), 1);
        freq(start) = [diff(start); numel(x) + 1 - start(end)];
        m = max(freq);
        C{k} = x(freq == m).';
    end
end
end
Please compare the run time with Sean de Wolski's vectorized approach for your real data.
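Before comparing run times, it helps to confirm that two methods return the same modes on random data. The sketch below is only an illustration under stated assumptions: it inlines the sort-and-run-length idea of ModeFull and checks it against NaN masking plus mode; the matrix size, the fixed seed, and the names T, C, D and agree are arbitrary choices.

```matlab
rng(0);                              % fixed seed, for reproducibility only
A = randi(50, 200, 100);
A(rand(size(A)) < 0.2) = 0;          % about 20 percent zeros

% Method 1: mask zeros with NaN and ask mode for all tied values
T = A;
T(T == 0) = NaN;
[~, ~, C] = mode(T, 1);

% Method 2: loop over columns, sort, and count run lengths
n = size(A, 2);
D = cell(1, n);
for k = 1:n
    x = sort(nonzeros(A(:, k)));
    if isempty(x)
        D{k} = NaN;
    else
        start = find([true; diff(x) ~= 0]);                     % first index of each run
        freq = zeros(numel(x), 1);
        freq(start) = [diff(start); numel(x) + 1 - start(end)]; % run lengths
        D{k} = x(freq == max(freq)).';
    end
end

% Compare cell by cell, ignoring row/column orientation
agree = all(cellfun(@(p, q) isequal(p(:), q(:)), C, D));
```

If agree is true, wrapping each method in tic/toc (or timeit) on your real data gives a like-for-like timing comparison.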
  2 Comments
Bruno Luong on 27 Apr 2021
I tested with a large 1000 x 1000 matrix, and Jan's method is the fastest.
Jan on 28 Apr 2021
Edited: Jan on 28 Apr 2021
@Bruno Luong: Some timings (i5 mobile, R2018b)
A = randi(50, 1000, 1000);
A(rand(size(A)) < 0.2) = 0;
tic
B = accumarray(repmat((1:size(A, 1)).', size(A, 2), 1), A(:), [], ...
@(x)modeall(nonzeros(x)));
toc
tic; C = BrunosMode(A.'); toc
tic; D = ModeFull(A.'); toc
% Elapsed time is 0.402765 seconds. Sean
% Elapsed time is 0.165996 seconds. Bruno
% Elapsed time is 0.075373 seconds. Jan
This is another example where the assumption "loops are inefficient for large matrices" does not hold. It was true before the JIT became powerful in Matlab 6.5, which was in 2002. But like the "brute clearing header", the rumor of slow loops is still alive.
Vectorizing is very efficient if the data and the operation are suitable and if no huge intermediate arrays are produced.


Version

R2021a
