Filter löschen
Filter löschen

How can the following code be optimized / vectorized?

2 Ansichten (letzte 30 Tage)
L'O.G.
L'O.G. am 16 Mär. 2022
Kommentiert: L'O.G. am 16 Mär. 2022
How can you optimize the following? It is painfully slow and takes around 5 minutes to run on my computer. Here I am calcuatling the distances between atoms in molecules while taking account periodic boundary conditions (which mean an atom on the edge of the box might actually belong to a molecule that wraps around, so to speak). I imagine this could be done in a far more efficient and speedy way than my Fortran-esque implementation.
A % an array where column 1 is the atom ID, column 2 is the molecule ID, and columns 3-5 are the positions in x, y, and z
N = 200; % molecules in system
Sites = 150; % sites in each molecule
length = 120; % length of box on each side
SitesInSystem = N*Sites;
B = zeros(SitesInSystem,SitesInSystem);
% center box at origin
A(:,3) = A(:,3) - length/2;
A(:,4) = A(:,4) - length/2;
A(:,5) = A(:,5) - length/2;
for ii = 1:SitesInSystem
for jj = 1:SitesInSystem
if ii ~= jj % if not the same site
dx = A(jj,3) - A(ii,3);
dy = A(jj,4) - A(ii,4);
dz = A(jj,5) - A(ii,5);
% taking into account periodic boundaries
if (dx > length * 0.5)
dx = dx - length;
elseif (dx <= -length * 0.5)
dx = dx + length;
end
if (dy > length * 0.5)
dy = dy - length;
elseif (dy <= -length * 0.5)
dy = dy + length;
end
if (dz > length * 0.5)
dz = dz - length;
elseif (dz <= -length * 0.5)
dz = dz + length;
end
% calculate distances
B(ii,jj) = sqrt(dx^2+dy^2+dz^2);
end
end
end
B = triu(B); % to avoid double counting
Perhaps this also can be combined with the following snippet, which calculates something based on the values of the matrix B:
for ii = 1:N % molecule
for jj = 1:Sites % site
for kk = 1:N
for mm = 1:Sites
if ii ~= kk % if not the same molecule
if (B(ii*jj,kk*mm) < 1.75) && (B(ii*jj,kk*mm) > 0)
% do a calculation, i.e., add edge ii-kk to a graph
% with N nodes
end
end
end
end
end
end
  4 Kommentare
Stephen23
Stephen23 am 16 Mär. 2022
"Here you can read the data into MATLAB and use readtable and table2array to turn the data into a MATLAB array."
READMATRIX would be simpler and more efficient: https://www.mathworks.com/help/matlab/ref/readmatrix.html
L'O.G.
L'O.G. am 16 Mär. 2022
@Stephen Thanks. I didn't know about that function!

Melden Sie sich an, um zu kommentieren.

Akzeptierte Antwort

Bruno Luong
Bruno Luong am 16 Mär. 2022
Bearbeitet: Bruno Luong am 16 Mär. 2022
It takes about 25 seconds only PC
Ac35 = A(:,3:5);
Ai = reshape(Ac35,[],1,3);
B = zeros(SitesInSystem,SitesInSystem);
for j=1:SitesInSystem
Aj = reshape(Ac35(j,:),1,[],3);
dxyz = Ai-Aj;
dxyz = mod(dxyz+length/2,length)-length/2;
B(:,j) = sqrt(sum(dxyz.^2,3));
end

Weitere Antworten (2)

KSSV
KSSV am 16 Mär. 2022
I think pdist2, this inbuilt function will work for you.
T = readtable('Sample.xlsx') ;
A = table2array(T) ;
A = A(:,3:5) ; % pick coordinates (x,y,z)
d = pdist2(A,A) ;
B = triu(d) ; % you may check this with you answer. Use isequal to check.
  3 Kommentare
KSSV
KSSV am 16 Mär. 2022
I have checked your code and pdist2 for the provided 274*5 data. Both the methods, gave same solution.
L'O.G.
L'O.G. am 16 Mär. 2022
Bearbeitet: L'O.G. am 16 Mär. 2022
@KSSV Right, but for the full case (not the sample file I gave) I would need to consider that some atoms might be at the edges of the box, so that periodic boundaries as applied to that molecule must be taken into account. Sorry I didn't make that clearer. I mentioned it in my initial query and coded it using the if statements nested in the orignal set of for loops, but I guess not very clearly.

Melden Sie sich an, um zu kommentieren.


Walter Roberson
Walter Roberson am 16 Mär. 2022
You can see that the rem() based expression I use here can be used to vectorize the calculation.
format bank
Length = 5;
x = -3.5 : .5 : 3.5
x = 1×15
-3.50 -3.00 -2.50 -2.00 -1.50 -1.00 -0.50 0 0.50 1.00 1.50 2.00 2.50 3.00 3.50
for K = 1 : length(x)
dx = x(K);
if (dx > Length * 0.5)
dx = dx - Length;
elseif (dx <= -Length * 0.5)
dx = dx + Length;
end
newx_if(K) = dx;
end
newx_mod = rem((x - Length/2), Length) + Length/2;
newx_if
newx_if = 1×15
1.50 2.00 2.50 -2.00 -1.50 -1.00 -0.50 0 0.50 1.00 1.50 2.00 2.50 -2.00 -1.50
newx_mod
newx_mod = 1×15
1.50 2.00 2.50 -2.00 -1.50 -1.00 -0.50 0 0.50 1.00 1.50 2.00 2.50 3.00 3.50
newx_if - newx_mod
ans = 1×15
0 0 0 0 0 0 0 0 0 0 0 0 0 -5.00 -5.00
  3 Kommentare
Walter Roberson
Walter Roberson am 16 Mär. 2022
newxyz_mod = rem((A - Length/2), Length) + Length/2; % for x, y, z
looks fine, as long as Length is the same in all three directions.
You would follow it with
B = sqrt( sum(newxyz_mod.^2, 2) );
L'O.G.
L'O.G. am 16 Mär. 2022
Bearbeitet: L'O.G. am 16 Mär. 2022
@Walter Roberson Got it. Thank you! And about my other question, the differences -- rem() should use the differences, correct? Not the positions themselves? That's my understanding, but I don't know how to calculate that efficiently. As I mentioned in my previous comment, something like pdist(A(:,1),@(x,y) x-y); generates a huge vector and is slow, and that's just for x! I think there will be (N*Sites)^2 / 2 possible distances which will fill an upper triangular matrix. I'm not sure how to code that efficiently here, though.

Melden Sie sich an, um zu kommentieren.

Kategorien

Mehr zu Matrix Indexing finden Sie in Help Center und File Exchange

Produkte


Version

R2021b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by