How to start parallel pool in multiple nested loops?

Sajid Sarwar on 25 Jun 2021
Commented: Sajid Sarwar on 30 Jun 2021
My outer loop is parfor SNRindB=0:2:20, and I have more than 20 loops nested inside it. When I try to use a parallel pool, I get the following error: Error: Unable to classify the variable 'num_err_bits' in the body of the parfor-loop. I am attaching the main file of my program for reference. I could not apply the techniques explained in "https://kr.mathworks.com/help/parallel-computing/troubleshoot-variables-in-parfor-loops.html" because my code is too complex. Can anyone help me? I would be grateful.

Accepted Answer

Walter Roberson on 25 Jun 2021
You have
    v=1;
    parfor SNRindB=0:2:20
        %some code
        num_err_bits(v)=0;
        %more code
            for num=1:(m*Nt)
                if in_index(num) ~= out_index(num)
                    num_err_bits(v) = num_err_bits(v) + 1;
                end
            end
            kk
        end
        prb(v) = num_err_bits(v)/(B);
        v=v+1;
    end
Let us run through this, supposing you have two parallel workers.
Suppose for the purposes of discussion that your code was permitted.
You assign 1 to v before the parfor, so v, with the value 1, is copied to both workers when the parfor starts.
Suppose one of the workers gets SNRindB = 14 and the other gets SNRindB = 20. Remember that parfor does not even try to hand out iterations in increasing 0:2:20 order. It is not the case that one worker gets 0, the next gets 2, and whichever finishes first gets 4, and so on; instead, parfor allocates chunks of the range to each worker. At least one of the workers will be assigned the maximum value first, which is an optimization designed to make sure that arrays get allocated as large as needed the first time. So it could well be that on the first pass one worker is assigned [20 18 16] and the other [14 12 10], with the rest held in reserve for whichever worker finishes first.
Both workers have v = 1, so num_err_bits(1) is assigned 0 in both workers.
You do some work, and both workers enter the for num loop. Suppose the if is true, so both workers get to
num_err_bits(v) = num_err_bits(v) + 1
and v is 1 on both workers, so this is
num_err_bits(1) = num_err_bits(1) + 1
Now, let us take two cases:
  1. Due to chances of timing, one of the two workers is able to execute the complete statement first and then the other executes. If so, then num_err_bits(1) would be assigned 0+1 = 1 by the first worker and then 1+1 = 2 by the second. So you might happen to avoid a timing conflict, and on both workers num_err_bits(v) would be 2 (for the purposes of discussion!); OR
  2. Due to chances of timing, one of the two workers is able to execute the num_err_bits(v) reading part first, pulling out the 0, and then, before the assignment in the first worker, the other worker executes the num_err_bits(v) reading part, also pulling out the 0. Both workers increment the count to 1, and both workers assign that 1 to num_err_bits(1) (for the purpose of discussion).
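The two cases above can be imitated serially, without any parallel pool, by writing the read and the write of num_err_bits(1) = num_err_bits(1) + 1 as separate steps. This is only an illustrative sketch: shared, readA, and readB are names made up for the example, and real parfor workers do not share a workspace at all, which is part of why MATLAB rejects the original code.

```matlab
% Case 1: worker A finishes its read-modify-write before worker B starts.
shared = 0;              % stands in for num_err_bits(1)
readA  = shared;         % A reads 0
shared = readA + 1;      % A writes 1
readB  = shared;         % B reads 1
shared = readB + 1;      % B writes 2
case1  = shared;

% Case 2: both workers read before either one writes.
shared = 0;
readA  = shared;         % A reads 0
readB  = shared;         % B also reads 0
shared = readA + 1;      % A writes 1
shared = readB + 1;      % B writes 1 again, so one increment is lost
case2  = shared;

fprintf('case 1: %d, case 2: %d\n', case1, case2);
```

The same two increments give 2 in one interleaving and 1 in the other, so the count would depend on timing rather than on the data.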
Now suppose that in one of the workers, in_index(num) ~= out_index(num) is false for all the other iterations of num, so it finishes the for num loop faster, and moves on and gets through to the assignment to prb(v) before the second worker is done with its for num loop.
So that faster worker goes to execute
prb(v) = num_err_bits(v)/(B);
If the slower worker is in the middle of a num_err_bits(v) = num_err_bits(v) + 1 then does the expression use the value before the increment or after the increment?
And the faster worker moves on to
v=v+1;
so v becomes 2. Meanwhile the slower worker is back at
num_err_bits(v) = num_err_bits(v) + 1
but now (for the purpose of discussion) v has become 2. So the slower worker has to read out num_err_bits(2). But num_err_bits(2) has perhaps not been assigned to yet. Or, if the faster worker has made it to assigning 0 to num_err_bits(v) for the next iteration, then num_err_bits(v) might be 0. Or maybe num_err_bits(v) was somehow initialized with something, and the slower worker increments it a few times, and then the value gets overwritten with 0 when the faster worker reaches the place where it assigns 0 to num_err_bits(v).
So (for the purpose of discussion) eventually the slower worker assigns to prb(v) and increments v to 3... at a time when the faster worker might be counting on v staying the same for its work.
After all the cycles (for the purpose of discussion) all of the prb are assigned to. How many of them? Well, that depends upon the timing of the increments of v. In the worst case, prb can end up being short by (number of workers minus 1).
Now... supposing that prb came out the expected length. Your parfor finishes, and you do
SNRindB=0:2:20;
semilogy(SNRindB,prb,'-^b');
Gee, that sounds like you expect a particular position in prb to correspond to a particular SNRindB value. But it doesn't, because parfor executes in what should be considered a random order (it is at least non-deterministic order) and (for the purposes of discussion) you stored into prb in the order of finishing iterations, not according to the SNRindB value that was appropriate for the iteration.
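A serial sketch of that ordering problem, under the assumption of one made-up completion order (finishOrder is hypothetical; a real parfor gives no ordering guarantee). It compares storing results with a running counter, as in the original code, against storing them at the position of the iteration itself.

```matlab
SNRindBvals = 0:2:20;
n = numel(SNRindBvals);

% Hypothetical order in which the iterations happen to finish.
finishOrder = [11 10 9 6 5 4 8 7 3 2 1];

byFinish = zeros(1, n);   % filled with a running counter, like prb(v) with v=v+1
byIndex  = zeros(1, n);   % filled at the position that belongs to the iteration
v = 1;
for k = finishOrder
    result = SNRindBvals(k);   % stand-in "result": simply the SNR value itself
    byFinish(v) = result;      % position depends on when the iteration finished
    v = v + 1;
    byIndex(k) = result;       % position depends only on which iteration it was
end

matchesByIndex  = isequal(byIndex,  SNRindBvals);   % true
matchesByFinish = isequal(byFinish, SNRindBvals);   % false
```

Only byIndex lines up with SNRindBvals, which is what a later semilogy(SNRindBvals, prb) call silently relies on.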
How does MATLAB deal with all of this? Simple: it prohibits you from doing this.
When you assign inside a parfor to a variable that is intended to be an output (not a temporary variable), the output variable must be indexed by a simple arithmetic expression of the parfor index. If I recall correctly, division is not permitted, so it would not be valid to index prb(SNRindB/2+1).
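A minimal sketch of an output indexed in a form that parfor accepts. As I understand the documentation, the loop range also has to be consecutive increasing integers, so the sketch loops over positions 1:n and looks the SNR value up, rather than looping over 0:2:20 directly. The value stored in prb here is a placeholder, not a real error rate.

```matlab
SNRindBvals = 0:2:20;
n = numel(SNRindBvals);
prb = zeros(1, n);          % preallocate the output

parfor v = 1:n
    % prb is indexed directly by the loop variable v, so MATLAB can
    % classify it as a sliced output variable.
    prb(v) = 10^(-SNRindBvals(v)/10);   % placeholder value only
end
```

Each iteration writes only to its own element prb(v), so the result does not depend on the order in which the iterations run. Without Parallel Computing Toolbox the loop still runs, just serially.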
The code you posted does not appear to use num_err_bits after the parfor, so perhaps you can turn num_err_bits into a local variable.
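Suggested code:
    SNRindBvals = 0:2:20;
    prb = zeros(1, length(SNRindBvals));   % preallocate the output
    parfor v = 1 : length(SNRindBvals)
        SNRindB = SNRindBvals(v);
        %your code, with num_err_bits as a plain scalar reset in each iteration
        prb(v) = num_err_bits/(B);   % or whatever is appropriate
        %and NO increment of v
    end
    plot(SNRindBvals, prb)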
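To show the suggested structure running end to end, here is a sketch with a stand-in for the real simulation. Everything inside the loop apart from the counting is an assumption made for illustration: the bit-flip model, the value of B, and the way in_index and out_index are generated are not taken from the original program.

```matlab
SNRindBvals = 0:2:20;
nSNR = numel(SNRindBvals);
B = 1000;                    % hypothetical number of bits per SNR point
prb = zeros(1, nSNR);        % sliced output, one element per iteration

parfor v = 1:nSNR
    SNRindB = SNRindBvals(v);

    % Stand-in for the real simulation: flip each bit with a probability
    % that shrinks as the SNR grows. This is NOT the original channel model.
    in_index  = randi([0 1], 1, B);
    flip      = rand(1, B) < 0.5 * 10^(-SNRindB/10);
    out_index = xor(in_index, flip);

    % Temporary variable: assigned before it is used in every iteration,
    % so nothing is carried over between iterations.
    num_err_bits = 0;
    for num = 1:B
        if in_index(num) ~= out_index(num)
            num_err_bits = num_err_bits + 1;
        end
    end

    prb(v) = num_err_bits / B;
end

semilogy(SNRindBvals, prb, '-^b');
```

Because num_err_bits is a scalar that is reset at the top of each iteration and prb is indexed by v, there is no counter shared between iterations and no separate v to increment.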
  14 Comments
Walter Roberson on 30 Jun 2021
Some of your loops can be replaced by reshape() or permute()
I have not studied what your code actually does.
Have you profiled the code to see where the time is going?
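As a generic illustration of that kind of replacement (not taken from the attached program, which I have not studied), here are two made-up loop nests and their reshape() and permute() equivalents. All variable names and sizes are invented for the example.

```matlab
rows = 3; cols = 4;
x = 1:(rows*cols);           % made-up data

% Loop version: copy a vector into a matrix column by column.
M_loop = zeros(rows, cols);
for c = 1:cols
    for r = 1:rows
        M_loop(r, c) = x((c-1)*rows + r);
    end
end

% Same result without loops.
M_fast = reshape(x, rows, cols);

% Loop version: swap the first two dimensions of a 3-D array.
A = rand(2, 3, 4);
P_loop = zeros(3, 2, 4);
for i = 1:2
    for j = 1:3
        for k = 1:4
            P_loop(j, i, k) = A(i, j, k);
        end
    end
end

% Same result without loops.
P_fast = permute(A, [2 1 3]);

sameReshape = isequal(M_loop, M_fast);
samePermute = isequal(P_loop, P_fast);
```

To check whether loops like these are where the time actually goes, run profile on, then the script, then profile viewer.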
Sajid Sarwar on 30 Jun 2021
I have not profiled the code to see where the time is going. I will try to replace some of the loops with reshape, though.
