How to use RandStreams appropriately with Parallel Computing?

Laura on 21 Jan 2026 at 22:44
Edited: Laura on 28 Jan 2026 at 0:56
I am currently working to update an existing set of code for reproducibility.
Currently, the code is structured as follows:
nlabs = 6;
seed = 1; % User-choice
[globalstream, labstreams{1:nlabs}] = RandStream.create('mrg32k3a','NumStreams',nlabs+1,'Seed',seed);
RandStream.setGlobalStream( globalstream );
parallelpool=parpool(nlabs);
spmd
    RandStream.setGlobalStream( labstreams{spmdIndex} );
end
parfor i=1:nlabs
    % Calculations here
end
However, I need the code to be fully reproducible. I understand that to achieve reproducibility with parallel computing I need to use substreams ( https://www.mathworks.com/help/stats/reproducibility-in-parallel-statistical-computations.html ). However, I am not sure how to keep the global stream and the worker streams separate.
I've seen an example in which the user used only a single global stream by storing and retrieving the stream state before and after the parfor loop ( https://www.mathworks.com/matlabcentral/answers/1670009-reproducible-and-independent-random-stream-generation-in-parfor-loop ), but it seems like it would be simpler to set up two independent streams.
I've outlined a two-stream setup below. Does this seem reasonable? I want globalstream and each substream of labstream to be independent.
nlabs = 6;
seed = 1; % User-choice
[globalstream, labstream] = RandStream.create('mrg32k3a','NumStreams',2,'Seed',seed);
RandStream.setGlobalStream( globalstream );
% <Some calculations>
parallelpool=parpool(nlabs);
parallel.pool.Constant(RandStream.setGlobalStream(labstream)) % Not sure of the syntax here
parfor i=1:nlabs
    set(labstream,'Substream',i)
    % <Some calculations>
end
RandStream.setGlobalStream( globalstream );
% <Some calculations>

Answers (1)

Will on 27 Jan 2026 at 13:22
To achieve full reproducibility in parallel MATLAB code, it is essential to separate client-side (global) random number generation from worker-side random number generation, and to ensure that no RandStream object is shared across workers. Substreams are the correct mechanism for reproducible parallel execution, but a single shared stream should not be advanced inside a parfor loop: the order of iterations and their assignment to workers are not defined, so the numbers each iteration receives can change from run to run.
Use one stream on the client for all serial computations. This stream is independent of any parallel execution.
seed = 1;
clientStream = RandStream('mrg32k3a','Seed',seed);
RandStream.setGlobalStream(clientStream);
% Client-side computations
A = rand(1,10);
Each worker must have its own stream instance. This is done using parallel.pool.Constant, which constructs a separate RandStream on each worker with the same seed.
nlabs = 6;
parpool(nlabs);
workerStreams = parallel.pool.Constant(@() ...
    RandStream('mrg32k3a','Seed',seed));
This avoids sharing stream handles and guarantees deterministic initialization on every run.
Inside the parfor loop, assign a substream based on the loop index and set it as the worker’s global stream before generating random numbers.
parfor i = 1:nlabs
    s = workerStreams.Value;
    s.Substream = i; % Deterministic mapping
    RandStream.setGlobalStream(s);
    % Parallel computations
    x = rand(1,5); % x is a temporary variable; assign into a sliced array to keep it
end
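With mrg32k3a, each substream is statistically independent of the others, and setting Substream to i always starts from the beginning of substream i. Iteration i therefore draws the same numbers on whichever worker runs it, so results are reproducible regardless of scheduling or execution order.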
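One way to check this is to run the same loop twice and compare the outputs. The following is a minimal sketch, assuming Parallel Computing Toolbox is available; the names nTasks, sc, r1, r2 and sameAcrossRuns are illustrative and not part of the original code. For brevity it draws from the stream object directly instead of setting it as the global stream.

```matlab
seed   = 1;
nTasks = 6;   % illustrative number of loop iterations

% Start a pool only if none is running; the pool size is left at the default.
if isempty(gcp('nocreate'))
    parpool;
end

% One identically seeded stream is constructed on each worker.
sc = parallel.pool.Constant(@() RandStream('mrg32k3a','Seed',seed));

r1 = zeros(nTasks,5);
parfor i = 1:nTasks
    s = sc.Value;
    s.Substream = i;         % position at the start of substream i
    r1(i,:) = rand(s,1,5);   % draw from s directly
end

r2 = zeros(nTasks,5);
parfor i = 1:nTasks
    s = sc.Value;
    s.Substream = i;
    r2(i,:) = rand(s,1,5);
end

% Both runs are intended to give the same matrix, row for row.
sameAcrossRuns = isequal(r1, r2);
```

Because the substream index depends only on i, the comparison should not be affected by how many workers the pool has or by which worker picks up which iteration.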
1 Comment
Laura on 27 Jan 2026 at 18:44
Edited: Laura on 28 Jan 2026 at 0:56
Thank you so much for your advice.
There is a nuance to your suggestion I want to be sure I understand.
Context: Based on my reading and tinkering, it seems that there are two alternative methods for creating random streams
Option 01: s1 and s2 are distinct, non-identical streams such that x is not equal to y
seed=1;
[s1,s2]=RandStream.create('mrg32k3a','NumStreams',2,'Seed',seed);
x=rand(s1,1)
y=rand(s2,1)
Option 02: s3 and s4 are distinct, identical streams such that z=w.
seed=1;
s3 = RandStream.create('mrg32k3a','Seed',seed);
s4 = RandStream.create('mrg32k3a','Seed',seed);
z=rand(s3,1)
w=rand(s4,1)
If I understand your suggestion correctly, you are proposing a strategy analogous to Option 02, so that each worker is given a distinct but identical stream. My initial concern was that I would lose independence between calculations with this method. However, it seems that substreams within a given stream are independent? So by combining an identical stream on each worker with a substream selected by the parfor index, I can achieve independence in the calculations. Is this correct?
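The substream behaviour described here can be checked on the client without a pool. This is a small sketch built on the Option 02 streams; the variable names a, b and c are made up for illustration.

```matlab
seed = 1;
s3 = RandStream('mrg32k3a','Seed',seed);
s4 = RandStream('mrg32k3a','Seed',seed);

s3.Substream = 1;
s4.Substream = 2;
a = rand(s3,1,5);   % identical streams, substream 1
b = rand(s4,1,5);   % identical streams, substream 2: a different sequence

s4.Substream = 1;   % jump to the start of substream 1
c = rand(s4,1,5);   % same seed and same substream as a

sameSubstreamMatches   = isequal(a, c);
otherSubstreamDiffers  = ~isequal(a, b);
```

Setting the Substream property repositions the stream at the start of that substream, which is why c repeats a even though s4 had already been used.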
For reasons that I will clarify in a follow-up reply, I need to alter your syntax a bit.
Would the following be conceptually equivalent to what you propose?
% User options
seed = 1;
nlabs = 6;
% Create streams
makestream=@()RandStream('mrg32k3a','Seed',seed);
clientStream = makestream();
for i=1:nlabs
    workerstreams{i}=makestream();
end
RandStream.setGlobalStream(clientStream);
% Client-side computations
A = rand(1,5)
parpool(nlabs);
spmd
    RandStream.setGlobalStream(workerstreams{spmdIndex});
end
parfor i = 1:nlabs
    s = RandStream.getGlobalStream
    s.Substream = i; % Deterministic mapping
    % Parallel computations
    B(i,:) = rand(1,5)
end
The one potential issue I see with this method is that A equals B(1,:), i.e. the client stream is not distinct from the worker streams. It seems that if I want independence in the calculations, the client stream needs to be distinct from the worker streams? Assuming this is correct, would the following be an appropriate update to the strategy that incorporates the concept of your suggestion?
% User options
seedchoice = 1;
nlabs = 6;
% Create streams
makestream=@(seed)RandStream('mrg32k3a','Seed',seed);
clientStream = makestream(seedchoice);
for i=1:nlabs
    workerstreams{i}=makestream(seedchoice+1); %Give the clientstream and workerstreams different seeds
end
RandStream.setGlobalStream(clientStream);
% Client-side computations
A = rand(1,5)
parpool(nlabs);
spmd
    RandStream.setGlobalStream(workerstreams{spmdIndex});
end
parfor i = 1:nlabs
    s = RandStream.getGlobalStream
    s.Substream = i; % Deterministic mapping
    % Parallel computations
    B(i,:) = rand(1,5)
end
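A possible alternative to giving the worker streams a different seed is to reuse the two-stream idea from the original question: RandStream.create with 'NumStreams',2 and 'StreamIndices' can build stream 1 for the client and stream 2 on each worker from the same seed. The sketch below assumes Parallel Computing Toolbox is available; workerProto and overlap are hypothetical names, and the per-worker construction via parallel.pool.Constant follows the answer above rather than the spmd setup.

```matlab
seedchoice = 1;
nlabs = 6;

% Stream 1 of 2 for the client.
clientStream = RandStream.create('mrg32k3a','NumStreams',2, ...
    'StreamIndices',1,'Seed',seedchoice);
RandStream.setGlobalStream(clientStream);
A = rand(1,5);   % client-side draw

% Start a pool only if none is running.
if isempty(gcp('nocreate'))
    parpool(nlabs);
end

% Stream 2 of 2, constructed separately on every worker.
workerProto = parallel.pool.Constant(@() RandStream.create('mrg32k3a', ...
    'NumStreams',2,'StreamIndices',2,'Seed',seedchoice));

B = zeros(nlabs,5);
parfor i = 1:nlabs
    s = workerProto.Value;
    s.Substream = i;                 % deterministic mapping from i
    RandStream.setGlobalStream(s);   % rand below now draws from s
    B(i,:) = rand(1,5);
end

% With two distinct streams, A is not expected to match B(1,:).
overlap = isequal(A, B(1,:));
```

The design choice here is that the client and the workers differ by stream index rather than by seed, while the per-iteration substream mapping stays the same as before.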

Release

R2024b
