Generate random numbers given distribution/histogram

David C on 25 Oct 2012
Commented: Himanshu Tanwar on 25 May 2022
MATLAB provides built-in functions to generate random numbers with a uniform or Gaussian (normal) distribution. My question is: if I have a discrete distribution or histogram, how can I generate random numbers that follow that distribution (provided the population, i.e. the set of numbers I generate, is large enough)?
Please post here if anyone knows of a good method of doing this.
Thanks, David

Accepted Answer

Jonathan Epperl on 27 Oct 2012
Since nobody has offered any suggestions, here's one. Suppose your discrete distribution is an Nx2 matrix PD, with the discrete values in the first column and the probabilities of the corresponding values in the second, so that sum(PD(:,2))==1.
Then map the probabilities onto the unit interval and use rand. Here is what I mean by that:
% Those are your values and the corr. probabilities:
PD =[
1.0000 0.1000
2.0000 0.3000
3.0000 0.4000
4.0000 0.2000];
% Then make it into a cumulative distribution
D = cumsum(PD(:,2));
% D = [0.1000 0.4000 0.8000 1.0000]'
Now for every r generated by rand, if it lies between D(i) and D(i+1), it corresponds to the outcome PD(i+1,1), with the obvious extension at i==0 (r below D(1) gives PD(1,1)). Here's one way you could do that, even though I'm sure there are better ones:
R = rand(100,1); % Your trials
p = @(r) find(r<D,1,'first'); % find the 1st index s.t. r<D(i);
% Now these are the results of your random trials (indices into PD)
rR = arrayfun(p,R);
% Check whether the distribution looks right:
hist(rR,1:4)
% It does, roughly 10% are 1, 30% are 2 and so on
If you want more help you should post a minimal example of the form in which you have the discrete distribution.
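For larger sample sizes the arrayfun/find lookup above can be replaced by a vectorized comparison. The following is only a sketch under a few assumptions: it reuses the example PD from this answer, relies on implicit expansion (R2016b or later), and the names idx, samples and freq are introduced here purely for illustration.

```matlab
% Values and their probabilities (same example as in the answer)
PD = [1 0.1; 2 0.3; 3 0.4; 4 0.2];
D = cumsum(PD(:,2));
D = D / D(end);               % guard against round-off so that D(end) is exactly 1

N = 1e5;                      % number of trials
R = rand(N,1);

% For each r, count the CDF entries that are <= r; adding 1 gives the
% first index i with r < D(i). R is N-by-1 and D.' is 1-by-K, so the
% comparison expands to an N-by-K logical matrix.
idx = sum(R >= D.', 2) + 1;
samples = PD(idx,1);          % map indices back to the actual values

% Compare the empirical frequencies with the target probabilities
freq = accumarray(idx, 1, [size(PD,1) 1]) / N;
disp([PD(:,1) PD(:,2) freq])
```

The last column of the displayed matrix should be close to the second one; the N-by-K intermediate matrix makes this approach memory-hungry when the distribution has very many values.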
5 Comments
Aasheesh Dixit on 8 Jun 2020
One change is required: the anonymous function must compare against the cumulative vector D (MATLAB is case-sensitive):
p = @(r) find(r<D,1,'first'); % find the 1st index s.t. r<D(i);
Himanshu Tanwar on 25 May 2022
The overall code may be written (with some slight modifications) as:
dx = 0.001;
x = -100 : dx : 100; % Limits depend on the support of the random variable,
% for example (dx : dx : 100) for Rayleigh (starting at 0 is not recommended).
f = PDF(x);
F = cumsum(f) * dx; % Numerical CDF
F = F / F(end); % Normalize; appropriate when F(end) is already close to 1. Otherwise widen the range of x until F(end) is close to 1.
N = 10000;
U = rand(1, N);
pdf2rand = @(u) x(find(u <= F, 1, 'first')); % Inverse-CDF lookup
X = arrayfun(pdf2rand, U);
X % Desired random samples
function f = PDF(x)
% Probability density function
...
end
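To make the comment above concrete, here is a minimal sketch for one specific density. The Rayleigh PDF, the scale sigma = 2, the grid limits, and the use of interp1 in place of arrayfun/find are all assumptions chosen for illustration; they are not part of the comment's code.

```matlab
sigma  = 2;                                  % assumed Rayleigh scale parameter
pdfFun = @(x) (x ./ sigma^2) .* exp(-x.^2 ./ (2*sigma^2));  % assumed example density

dx = 0.001;
x  = dx : dx : 30;                           % grid covering essentially all of the mass
F  = cumsum(pdfFun(x)) * dx;                 % numerical CDF
F  = F / F(end);                             % normalize so that F(end) is exactly 1

% interp1 needs distinct sample points; the flat tail of F contains repeats
[Fu, iu] = unique(F);
xu = x(iu);

N = 1e5;
U = rand(1, N);
% Inverse-CDF lookup by linear interpolation; any U below Fu(1) is mapped
% to the first grid point instead of NaN
X = interp1(Fu, xu, U, 'linear', xu(1));

fprintf('sample mean %.3f, theoretical mean %.3f\n', mean(X), sigma*sqrt(pi/2));
```

Interpolating between grid points also avoids the samples being restricted to multiples of dx, which is what the find-based lookup returns.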


More Answers (2)

Image Analyst on 28 Oct 2012
I didn't notice your question or I would have answered, especially since it's asked so often. Try RANDRAW (http://www.mathworks.com/matlabcentral/fileexchange/7309-randraw) for a list of common distributions. Or to refresh yourself on the theory of why using the CDF works, see Wikipedia: http://en.wikipedia.org/wiki/Inverse_transform_sampling
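For reference, the core of that argument fits in one line. This is a sketch of the standard reasoning, where F denotes the target CDF and F^{-1} its generalized inverse (the smallest x with F(x) >= u); the notation is introduced here and does not come from the thread.

```latex
\[
X = F^{-1}(U),\quad U \sim \mathrm{Uniform}(0,1)
\;\Longrightarrow\;
P(X \le x) = P\bigl(F^{-1}(U) \le x\bigr) = P\bigl(U \le F(x)\bigr) = F(x).
\]
```

The middle step uses the fact that F^{-1}(u) <= x holds exactly when u <= F(x), and the last step uses P(U <= t) = t for t in [0,1].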

Theron FARRELL on 30 Apr 2019
Edited: Theron FARRELL on 30 Apr 2019
Hi there,
I use this naive function to generate artificial outliers for machine learning. I hope it will be of some help in your case.
function [Out_Data, Out_PDF, CHist] = Complement_PDF(Hist, Data_Num, p)
% Generate a 1D vector of data whose PDF is the complementary PDF of the
% input histogram. Note that the larger Data_Num is, the more closely
% Out_PDF will resemble CHist.
% Input
%   Hist: PDF/histogram of the data
%   Data_Num: Desired number of data points to generate
%   p: Precision, given as the number of decimal digits
% Output
%   Out_Data: Generated data as per the complementary PDF
%   Out_PDF: Bin counts (unnormalized histogram) of Out_Data
%   CHist: The complementary PDF as per Hist
% Example
%   Hist = [1, 6, 7, 100, 0, 0, 0, 2, 3, 5];
%   Data_Num = 100000;
%   p = 3
Hist = Hist/sum(Hist);
CHist = 1 - Hist;
CHist = CHist/sum(CHist);
CDF_CHist = cumsum(CHist);
CDF_CHist = double(int32(CDF_CHist*10^p))/10^p;
Out_Data = zeros(1, Data_Num);
Out_PDF = zeros(1, length(CDF_CHist));
for i = 1:Data_Num
    % Generate a uniformly distributed variable, rounded to p digits
    x = double(int32(rand*10^p))/10^p;
    % Inversely index the CDF
    Out_Data(i) = Inverse_CDF(x, CDF_CHist);
    % Recover the bin index; round rather than floor, because
    % (ind/n)*n can fall just below ind in floating point
    temp = round(Out_Data(i) * length(CDF_CHist));
    Out_PDF(temp) = Out_PDF(temp) + 1;
end
figure;
subplot 221, bar(Hist);
subplot 222, bar(CHist);
subplot 223, plot(CDF_CHist);
subplot 224, bar(Out_PDF);
end
function [y] = Inverse_CDF(x, CDF_CHist)
% Return the normalized bin position ind/n of the CDF interval containing x
CDF_CHist_Ext = [0, CDF_CHist];
y = 1; % Default: last bin (covers x == 1 after rounding)
for ind = 1:length(CDF_CHist)
    if (x >= CDF_CHist_Ext(ind)) && (x < CDF_CHist_Ext(ind+1))
        y = ind/length(CDF_CHist);
        break;
    end
end
end
