Improve algorithm to fill a rectangle with given number of points

I have a working code but it´s taking too long if I change some of the parameters. The goal is to fill a rectangle of given size with a given number of points (NumLigands) and a distance restraint between them (LigSpace). Here it is:
tic
W = 200 ; % Angstrom
H = 200 ; % Angstrom
dx = 1 ; % Resolution in x axis
dy = 1 ; % Resolution in y axis
X = [0:dx:W]' ; % Set of coordinates in x axis
Y = [0:dy:H]' ; % Set of positions in y axis
Area = W*H/10^2 ; % in nm2
density = 2.67 ; % #/nm2
NumLigands = density * Area ;
LigSpace = 7.11 ; % Ligand spacing
keepP(1,:) = [randsample(X,1) ; randsample(Y,1)]' ; % Fix this point
add = 2 ;
while add <= NumLigands
newP = [randsample(X,1) ; randsample(Y,1)]' ; % try a new point
%Calculate distance to other points
Distance = sqrt( (newP(1,1) - keepP(:,1)).^2 + (newP(1,2) - keepP(:,2)).^2 ) ;
minDistance = min(Distance) ;
if minDistance > normrnd(LigSpace,1)
keepP(add,:) = newP ;
figure(1);plot(keepP(add,1),keepP(add,2),'g*')
xlim = [0 200];
ylim = [0 200];
hold on
add = add + 1 ;
end
end
toc
With these parameters it takes about 2 min in my computer, which is not too bad, but if I change something like increasing the number of points or reducing the distance constraint it can take up to 10 minutes or more.
I want to keep the positioning of ligands randomly, but is there an heuristic I could use to reduce the computation time? Thanks!

2 Kommentare

dpb
dpb am 15 Aug. 2018
Bearbeitet: dpb am 15 Aug. 2018
What's the density of points? I wonder if one were to start with array at the allowed spacing (with a random offset by row/column initially so aren't exactly uniform) and then, instead of generating N, randomly purge M, leaving N? You'd start of with the constraint being satisfied.
Just a thought, not sure how well it might suit needs...
The density of points is what I actually get experimentally and then use to build this surface. I´m not sure how to create that first array you mentioned. I tried a (very naive) purging approach by starting with all the available coordinates and then randomly choosing and purging but the loop I made to create that "unpurged" array was just too big and would take forever.

Melden Sie sich an, um zu kommentieren.

 Akzeptierte Antwort

Adam Danz
Adam Danz am 15 Aug. 2018
Bearbeitet: Adam Danz am 15 Aug. 2018
This is 3x faster on my machine.
Your version took 64 sec and the version below takes 21 sec. I made several changes. The biggest change was moving all of the plotting outside of the while-loop. This means you won't be able to see the plot update during the loop but it's a lot faster. I also allocated 'keepP', combined your 'distance' and 'minDistance' lines, and made other smaller changes. Due to the use of random numbers and random permutations, I didn't reduce the while-loop any further.
tic
W = 200 ; % Angstrom
H = 200 ; % Angstrom
dx = 1 ; % Resolution in x axis
dy = 1 ; % Resolution in y axis
X = (0:dx:W)' ; % Set of coordinates in x axis
Y = (0:dy:H)' ; % Set of positions in y axis
Area = W*H/10^2 ; % in nm2
density = 2.67 ; % #/nm2
NumLigands = density * Area ;
LigSpace = 7.11 ; % Ligand spacing
keepP = NaN(NumLigands, 2); %allocate!
keepP(1,:) = [randsample(X,1) ; randsample(Y,1)]' ; % Fix this point
add = 2 ;
while add <= NumLigands
newP = [randsample(X,1) ; randsample(Y,1)]' ; % try a new point
%Calculate distance to other points
minDistance = min(sqrt( (newP(1,1) - keepP(:,1)).^2 + (newP(1,2) - keepP(:,2)).^2 )) ;
if minDistance > normrnd(LigSpace,1)
keepP(add,:) = newP ;
add = add + 1;
end
end
figure(1);
plot(keepP(2:end,1),keepP(2:end,2),'g*')
xlim = [0 200];
ylim = [0 200];
toc

10 Kommentare

I didn't notice the plot; if one wanted the dynamic visualization updating the X/YData arrays directly is pretty quick in comparison to calling plot so may be able to afford the time...
Adam Danz
Adam Danz am 15 Aug. 2018
Bearbeitet: Adam Danz am 15 Aug. 2018
Yeah, it's nice to know that the code is working. A waitbar() is another alternative but probably slower than dpb's suggestion above. There's probably a way to do this without a loop that's much faster than my version. The idea would be to create a large set of coordinates that are random permutations (with replacement) of (X,Y) and then to use squareform(pdist()) to calculate the distance between all points; then eliminate those greater than normrnd(). I toyed around with it for a while but ran out of time.
Thanks! Your machine is definitely faster than mine haha yet this was twice faster than my previous attempt. That would be an interesting approach. I was checking the functions squareform and pdist which I didn´t know. For the coordinates would you use just a rand() function and then purge them?
squareform just reorganizes the outputs of pdist.
In your while loop you are selecting random values from [X,Y] by using
newP = [randsample(X,1) ; randsample(Y,1)]' ;
I was suggesting that instead of doing this within a loop choosing them one-by-one, you could select all random samples at once using randi() or randsample(). For example (here I select 9000 random permutations with replacement):
Not tested
X = (0:dx:W)'
Y = (0:dy:H)'
nSamp = 9000;
newP = [X(randi(length(X), 1, nSamp)); Y(randi(length(Y), 1, nSamp))];
I see! I´ll give it a try
OK, randsample() might be more straightforward to use than randi(); just be sure to use the 'replacement' input because your current methods are sampling with replacement.
Well, not having replacements would be the right way I guess... I tried in a previous attempt to keep track of the rejected points, and then before computing the distance check if the new random point was already chosen before. I didn´t see much improvement with that at the time but maybe I´ll try to include it again.
Adding screenshot for completeness:
If you don't want replacement, either use randsample() to create the entire vector of samples or use randperm which might be better. Just note that you can only create as many samples are there are data in X,Y.
Oh I just realized that I think I do need the replacement because I build the points with x and y independently..
Without the loop it would be definitely faster! I was running a very silly code to get an idea of these functions and it runs very fast despite the huge numbers I was using, something like this:
nSamp = 10000 ;
newP = [X(randi(length(X), 1, nSamp)), Y(randi(length(Y), 1, nSamp))];
newP = unique(newP,'rows');
d = squareform(pdist(newP)) ;
[r,c] = find(d < LigSpace, 250000) ;
newP(r,:) = [] ;
plot(newP(:,1), newP(:,2),'g*')
and it was taking just a few seconds. So far I´m happy with the improved looped version and I´ll keep exploring this way again in a few days.
Thanks once more for all the help!

Melden Sie sich an, um zu kommentieren.

Weitere Antworten (0)

Kategorien

Mehr zu Loops and Conditional Statements finden Sie in Hilfe-Center und File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by