Looking for something like a matrix version of randsample... [vectorization!]

14 Ansichten (letzte 30 Tage)
If I have a vector w (length n) and I want to pick a single random number between 1 and n using w as the weights, I can do something like
idx = randsample(n,1,true,w)
and I'll get a number between 1 and n with probability w(idx)/sum(w). Great.
Similarly, I have a matrix W (size N x M, where each of N,M is in the thousands or so), and I want to draw M random numbers between 1 and N, with the columns of W acting as independent weight vectors. I could obviously do
idx = zeros(N,1);
for i = 1:M
idx(i) = randsample(N,1,true,W(:,i));
end
...but I'm going to be calling this literally billions of times, so I'm looking for some efficiency.
I know that an equivalent way to think of this is to take my W matrix, normalize the columns so that they sum to one, do a cumsum on the columns, select a vector of uniform random numbers using rand(1,M), and find the first row indices where they are greater than the cumsum values, but I don't know how to do that without using a loop and find():
W_normalized = bsxfun(@rdivide,W,sum(W,1));
W_cdf = cumsum(W,1);
x = rand(1,M);
C = bsxfun(@lt,x,W_cdf);
and then the first row of each column of C with a 1 in it is my random number, but I haven't had any luck doing that in an efficient, vectorized way (I've seen this thread, but I think their conclusion to use a for-loop doesn't really seem to hold for larger matrices).
Any suggestions?
Thanks, Dan

Akzeptierte Antwort

Kirby Fears
Kirby Fears am 25 Nov. 2015
Bearbeitet: Kirby Fears am 30 Nov. 2015
Here's a pure matrix version of your bsxfun calls. It should be internally parallelized if your matrices are large.
W_normalized = W./repmat(sum(W,1),size(W,1),1);
W_cdf = cumsum(W_normalized,1);
x = rand(1,size(W,2));
C = repmat(x,size(W,1),1)<W_cdf;
You still need to find the first "true" value in C for each column to form an index.
Below I make a logical array where only the first "true" is present in each column:
idx = [C(1,:); xor(C(2:end,:),C(1:end-1,:))];
Hope this helps.
  4 Kommentare
Dan Gianotti
Dan Gianotti am 8 Mär. 2017
I just realized that I never "accepted" this answer. Thanks very much for the huge assistance! You presumably saved me a hundred years of CPU time.

Melden Sie sich an, um zu kommentieren.

Weitere Antworten (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by