Nested for loop to parfor

Subramanian on 10 Apr 2012
I have nested for loops. After I convert the outer for loop to parfor (I get no compilation errors), the program doesn't reach the first statement inside the parfor loop until about 2.5 hours have passed. Can someone please tell me why?
I tried various configurations in which the number of broadcast variables was small and none of them were large arrays, but the problem remains. The program structure is as follows:
Variable initializations (around 10)
Matrix initializations (all to zeros, some of size 15*15, some 225*225) % Even if these matrices are initialized as local matrices inside the parfor loop it makes no difference.
parfor m=1:200
    del(m)=some_value_depending on m
    m %doesn't even print this for 2.5 hours! :(
    for v=1:800
        H=[some 15*15 matrix with some elements depending on m and some others on v, but none of them depend on both simultaneously]
        for i=1:225
            for j=1:225
                M=[225*225 matrix computed from some operations on H]
                for k=1:15
                    some one line operation on M giving 2 matrices W and S
                    B=Inverse(W)*S -- size is 15*1;
    Few lines of code operating on A(m)
plot(different values)
As you can see, there are no functions, but there are several large matrices within the loops. Will writing each loop as a function make it faster? If so, can someone please explain why?
Also, can someone tell me how to use the profile command for parallel computations, if that is possible at all?
I have been stuck on this for the past few weeks. Please help.
Thank You
Subramanian on 13 Apr 2012
Unfortunately, the M matrix is calculated in each iteration. There may be smarter ways of doing it, but for now this is not the problem. The program doesn't even reach the first statement inside the parfor loop for nearly 2.5 hours, running on 32 cores on a cluster.
Yes, I use W\S. I have initialized del and A. And I am opening my configuration using matlabpool just before the parfor.


Answers (2)

Ken Atwell on 10 Apr 2012
How long does this loop take to run before converting the 'for' to 'parfor'? Are you running multiple local MATLAB workers on your computer, or connecting to a cluster?
Here are a few things that could be an explanation (assuming local workers -- if a cluster is involved, you may need to consult with its admin):
  1. Make sure you run 'matlabpool open' before the parfor loop (sorry if this sounds obvious, but it is a common pitfall).
  2. While the code is running, open the Task Manager (on Windows), Activity Monitor (Mac) or a similar tool and look at the CPU usage and memory usage. CPU usage will probably be pegged at 100% ("good"), but if memory is also maxed out, the computer may be thrashing (relying too heavily on virtual memory, which will almost certainly overwhelm any benefit from parfor).
  3. Run the code without 'parfor' and consult the Task Manager again for CPU usage. It is possible that the built-in multithreading in MATLAB is already doing a reasonable job, so there is little more to gain by switching to 'parfor'.
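The checks in points 1 and 3 can be sketched roughly as follows. This is only an illustration, assuming the R2012a-era Parallel Computing Toolbox (matlabpool was replaced by parpool in later releases). The variable nTest and the loop body are hypothetical placeholders, not the original poster's computation.

```matlab
% Assumption: R2012a-era syntax; newer releases use parpool instead of matlabpool.
if matlabpool('size') == 0      % 0 means no worker pool is currently open
    matlabpool open             % opens the default configuration
end

nTest = 4;                      % hypothetical: a small subset of the 200 outer iterations
out = zeros(1, nTest);          % preallocated result vector (sliced inside parfor)

% Time the plain for loop first.
tic
for m = 1:nTest
    out(m) = sum(rand(225) \ rand(225, 1));   % placeholder work, not the real computation
end
tSerial = toc;

% Time the same body under parfor.
tic
parfor m = 1:nTest
    out(m) = sum(rand(225) \ rand(225, 1));   % same placeholder work
end
tParallel = toc;

fprintf('for: %.2f s, parfor: %.2f s\n', tSerial, tParallel);
```

If the parfor version of such a small test is already slow to start, the delay is likely in pool or cluster startup and data transfer rather than in the loop body itself.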
  1 Comment
Subramanian on 10 Apr 2012
To answer some of your questions:
The code takes hardly any time to run before getting to the parfor; it is almost instantaneous.
1. matlabpool open is right before the parfor.
2. I am running it on a cluster using a configuration defined by me - 32 cores.


Konrad Malkowski on 11 Apr 2012
Are the matrices preallocated, or are they allocated dynamically at run time by the inner for loops? If matrices are not preallocated, try doing that.
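A minimal sketch of what preallocation could look like for a loop of this shape, assuming the outputs del and A are indexed only by m. All values computed below are hypothetical placeholders; only the sizes come from the question.

```matlab
nOuter = 200;
del = zeros(1, nOuter);     % preallocated output, sliced by m inside parfor
A   = zeros(1, nOuter);     % preallocated output, sliced by m inside parfor

parfor m = 1:nOuter
    del(m) = 0.01 * m;              % placeholder for "some value depending on m"

    % Temporaries are assigned inside the loop body, so each worker creates
    % its own copy and nothing large is broadcast from the client.
    H = zeros(15, 15);
    H(1, :) = del(m);               % placeholder for the m-dependent elements
    W = H + eye(15);                % placeholder 15-by-15 system matrix
    S = ones(15, 1);                % placeholder right-hand side
    B = W \ S;                      % backslash solve instead of Inverse(W)*S; B is 15-by-1

    A(m) = sum(B);                  % placeholder for the operations on A(m)
end
```

Matrices that are zeroed before the parfor and then overwritten inside it are treated as temporaries on each worker, so initializing them outside the loop does not by itself avoid per-iteration allocation.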
  1 Comment
Subramanian on 11 Apr 2012
Yes, they are all pre-initialized to zeros before the parfor.

