Monte Carlo technique for normally distributed random variable
6 Ansichten (letzte 30 Tage)
Ältere Kommentare anzeigen
I am given a task where I have to use a simple Monte Carlo technique to estimate P(A > 5), where A is a variable in a normal distribution.
Using the code below, I am generating numbers in a normal distribution with a mean of 100 and s.d. of 1: (reference: https://www.vertex42.com/ExcelArticles/mc/MatlabMCExample.html)
x = ( randn(n,1) * 1 ) + 100;
So after that code spits out numbers and gets stored in the array x, how would I be able to pick out all the numbers in the array that are +5 s.d. away from the mean, AND count how many of them there are? (so for this case, pick all the numbers in the array that are >= 105).
0 Kommentare
Antworten (2)
TADA
am 28 Nov. 2018
Bearbeitet: TADA
am 28 Nov. 2018
sig = 1; % standard deviation
mu = 100; % mean
n = 10000; % number of values
x = ( randn(n,1) * sig ) + mu;
z = 5; % desired standard score
flags = (x - z*sig) >= mu;
nOut = sum(flags); % this is how you can count them
disp(nOut);
nOut = length(x(flags)); % another way to count them
outValues = x(flags); % this is how you retrieve the values
disp(outValues);
disp(nOut);
It's not going to generate a lot of those outliers though, because >99% of the values are within 3 standard deviations away from the mean value. a standard score of >=5 is A LOT
I only got a couple after increasing n to 10,000,000
you can also use normrnd to generate your population, it's easier because it accepts the mean and standard deviation as input arguments.
3 Kommentare
John D'Errico
am 28 Nov. 2018
Sometimes hard to know what your instructor is thinking. ;-) But my guess is this is an attempt to teach that Monte Carlo is not always a good thing when the events you care about are truly rare.
John D'Errico
am 28 Nov. 2018
Bearbeitet: John D'Errico
am 28 Nov. 2018
I'd suggest that a better way is to employ a truncated normal distribution. So, if you carefully sample points from a truncated normal, where all events lie above say 4 sigma, then the odds of getting a sample at 5+ sigma or more are reasonably high.
Now you can compute the desired probability (with some careful mathematics). The problem is, my guess is this is against the philosophy of the homework assignment, which is to do it using a pure (untruncated) Gaussian. And since those 5+ sigma events are darn rare, you need a huge sample size to get any points in that bin at all.
format long g
normcdf(5)
ans =
0.999999713348428
2*(1 - normcdf(5))
ans =
5.7330314384707e-07
So tens or better yet, hundreds of millions of samples will be needed for brute force Monte Carlo sampling.
sum(abs(randn(1,1e8)) >= 5)
ans =
54
which matches the expected number of events nicely:
2*(1 - normcdf(5))*1e8
ans =
57.330314384707
So how might a truncated Gaussian sampling work? I'll truncate at +/- 4 sigma.
n = 1e6;
T = 4;
x = norminv(normcdf(-T)*rand(1,n)).*(2*(rand(1,n)<0.5)-1);
count5 = sum(abs(x) >= 5);
P5 = count5/n*2*normcdf(-T)
P5 =
5.76416601362783e-7
Which compares far more favorably with the actual probability of 5.7330314384707e-07, with a far smaller sample size. As I said, it is a bit of mathematics that seems to skirt acceptablity for such a problem. Certainly, if I was your instructor, I might look slightly askew at the approach, except that if you really understood how I did this and what was done here, it would also indicate that you do know what you are doing.
2 Kommentare
John D'Errico
am 28 Nov. 2018
Yes. normcdf(-T) for example, computes the integral from -inf to -T. But that is only the lower tail. Double it to get the upper tail too, since you care about the case where abs(x)>=5.
Siehe auch
Produkte
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!