for loop over subset - finding indices vs if-clause

I want to compare two cell arrays (X and Y) cell by cell; each cell holds a 2-dimensional array of points. For each pair I compute how many points overlap. However, a condition that depends on values in another array has to be fulfilled for each pair. Which of the following approaches makes more sense, in terms of speed or otherwise? Is there even a significant difference?
Edit: The condition consists of two numeric comparisons, i.e. array(jj) >= 0.5*K && array(jj) <= 2*K, where K is constant for a given ii.
1. Find the indices beforehand and run the inner for-loop only over those:
for ii = 1:N
    % K is constant for a given ii
    idx = find(array >= 0.5*K & array <= 2*K);
    for jj = idx  % assumes array is a row vector, so idx is one as well
        matrix(ii,jj) = sum(ismember(X{ii}, Y{jj}));
    end
end
2. Run the for-loop over all indices and check the condition individually with an if-clause:
for ii = 1:N
    for jj = 1:M
        % K is constant for a given ii
        if array(jj) >= 0.5*K && array(jj) <= 2*K
            matrix(ii,jj) = sum(ismember(X{ii}, Y{jj}));
        end
    end
end

Accepted Answer

Felix Müller on 5 Jul 2021

1 vote

I ended up following Jan's advice: I coded both versions and timed them. The first approach (finding the indices and then looping only over those) seems to be slightly faster, taking between 0% and 5% less time than the second. I only tested five random cases from my data.
If anybody has some insight into why this is faster, I'd be very interested to hear.

8 comments

Jan on 5 Jul 2021 (edited by Jan on 6 Jul 2021)
What are the elements of X and Y? Are they unique? Do they contain NaNs?
ismembc is the fast core of ismember. You can save time by pre-sorting:
matrix = zeros(N, M);  % Pre-allocation is important!
for jj = 1:M
    Yjj = sort(Y{jj}(:));  % [EDITED] Consider 2D contents
    if Y(jj) fulfills CONDITION in array  % pseudocode: insert the actual test here
        for ii = 1:N
            matrix(ii,jj) = sum(ismembc(X{ii}(:), Yjj));
        end
    end
end
I preallocate with NaNs.
X{ii} and Y{jj} are 2-dimensional arrays of different lengths (which I do not know beforehand) that represent pixel positions. Because they are 2-dimensional, I need the 'rows' argument.
I read about ismembc, but guessed I couldn't use it because a 2-dimensional array cannot be 'sorted' in the regular sense.
Jan on 6 Jul 2021
I've edited my code to consider 2D matrices in X and Y. Please compare the run times now.
Am I misunderstanding something? The (:) operator you use turns my Nx2 matrix into one long vector (which is then sorted). But I actually need the Nx2 structure, because each row is a point in 2D space and I want to find the overlap between these two sets of points (which are all in 2D space).
That is why I thought ismembc wasn't usable for me. I do believe it is faster (I haven't tested it, but people recommend it and the explanations make sense).
Jan on 7 Jul 2021
The code in your question calls ismember(A,B), which compares individual elements and does not search for overlapping rows. I assumed that this code does what you need.
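To illustrate the difference discussed here, a minimal sketch with made-up 3x2 point sets (A and B are illustrative names, not taken from the question). Without 'rows', ismember tests each number on its own; with 'rows', it tests whole points:

```matlab
% Illustrative data: each row is one 2D point.
A = [1 2; 3 4; 5 6];
B = [2 1; 3 4; 9 9];

% Element-wise test: true wherever a single number of A occurs anywhere in B.
elementwise = ismember(A, B);      % 3x2 logical

% Row-wise test: true only where a complete row of A occurs as a row of B.
rowwise = ismember(A, B, 'rows');  % 3x1 logical

% Number of overlapping points (rows of A that also appear in B).
nOverlap = sum(rowwise);
```

Note that sum(elementwise) on a matrix sums each column separately and returns a 1x2 vector, so it is not a point count either.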
Argh, I'm very sorry. I had rewritten my question for clarity and then I must have dropped that part (I also had a sentence explaining the 'rows' argument, which is why I remember I had it).
Jan on 9 Jul 2021
Maybe you could post a minimal working example of the part of your code that consumes the most time. Then we could check whether there are other ways to speed it up. For 2D data there should be a fast approach using sortrows() and diff().
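One possible reading of the sortrows() and diff() idea, as a rough sketch and not necessarily the approach Jan had in mind: remove duplicate points within each set, stack both sets, sort the rows, and count neighbouring rows that are identical. The data and variable names are made up for illustration, and no claim is made about how its speed compares to ismember with 'rows'.

```matlab
% Illustrative point sets: each row is one 2D point.
A = [1 2; 3 4; 5 6; 3 4];
B = [3 4; 5 6; 7 8];

% Make the points unique within each set, so that a repeated row in the
% stacked list can only come from a point present in both sets.
P = unique(A, 'rows');
Q = unique(B, 'rows');

% Stack and sort lexicographically; equal points become neighbours.
S = sortrows([P; Q]);

% A zero difference in every column marks two identical adjacent rows.
isDuplicate = all(diff(S, 1, 1) == 0, 2);
nCommon = nnz(isDuplicate);  % number of distinct points shared by A and B
```

This counts distinct shared points. It matches sum(ismember(A, B, 'rows')) only when A contains no repeated rows.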
Felix Müller on 13 Jul 2021 (edited by Jan on 13 Jul 2021)
I have posted a new question because this issue is independent from the one I asked about here. Link to the new question: https://de.mathworks.com/matlabcentral/answers/877513-speed-up-ismember-with-two-dim-data


More Answers (1)

Jan on 5 Jul 2021

1 vote

It depends. How expensive is "fulfill CONDITION"?
Simply try it. Implement both versions and measure the timings with tic/toc.
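A minimal timing sketch along these lines, using synthetic data. Everything here is assumed for illustration: the sizes, the random contents of X, Y and array, and the vector Kvals, which stands in for "K is constant for one ii". The cells hold plain vectors so that sum(ismember(...)) returns a scalar.

```matlab
rng(0);                          % reproducible synthetic data
N = 50;  M = 60;
X = cell(1, N);  Y = cell(1, M);
for ii = 1:N, X{ii} = randi(100, randi([5 20]), 1); end
for jj = 1:M, Y{jj} = randi(100, randi([5 20]), 1); end
array = 10 * rand(1, M);         % row vector the condition is tested on
Kvals = 1 + 5 * rand(1, N);      % hypothetical: one K per ii

% Approach 1: find the valid indices first, loop only over those.
tic
m1 = zeros(N, M);
for ii = 1:N
    K = Kvals(ii);
    idx = find(array >= 0.5*K & array <= 2*K);
    for jj = idx
        m1(ii,jj) = sum(ismember(X{ii}, Y{jj}));
    end
end
t1 = toc;

% Approach 2: loop over all indices, test the condition with an if-clause.
tic
m2 = zeros(N, M);
for ii = 1:N
    K = Kvals(ii);
    for jj = 1:M
        if array(jj) >= 0.5*K && array(jj) <= 2*K
            m2(ii,jj) = sum(ismember(X{ii}, Y{jj}));
        end
    end
end
t2 = toc;

assert(isequal(m1, m2));         % both approaches must agree
fprintf('find first: %.4f s, if-clause: %.4f s\n', t1, t2);
```

A single tic/toc run is noisy, so it is worth repeating the measurement several times, or wrapping each variant in a function and using timeit, before reading anything into a difference of a few percent.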

1 comment

I'll update the question, but it's only two number comparisons (i.e. if array(jj)>= 0.5*K && array(jj)<= 2*K, where K is constant for one ii).


Version

R2020b
