Picking random numbers with some predefined probability

I have a variable, let's say x, in the range 3.3 to 8.5.
I have to pick 200 random numbers from this range, subject to two conditions:
1) I have predefined probabilities of picking random numbers, p = [0.16 0.24 0.28 0.32], for the intervals [3.3 4.6], [4.6 5.9], [5.9 7.2], and [7.2 8.5].
2) I would like to pick more elements from the higher end of each interval; for example, numbers picked from [7.2 8.5] should be closer to 8.5 than to 7.2.
I am trying to use a normal distribution with its mean at 8.5, truncated at 3.3 and 8.5. I am playing with the standard deviation to find the desired CDF plot.
I would like to know if there is a way to define a distribution from just the information I have that is more rigorous than playing with the standard deviation. You can find my code attached.
Also, should I include kurtosis and skewness in the definition of the distribution? I am not sure I need skewness, since I am already truncating at 8.5, which also happens to be the mean.
Thanks.
  2 Comments
John D'Errico on 4 Apr 2020
You have predefined probabilities. So then why in the name of god and little green apples are you then trying to stuff this into a normal distribution? Yes, I know. That is the only thing you know how to use. It is not even remotely close.
Instead, to choose a random number from this strange distribution:
  1. Choose which interval a given point will be chosen from. That happens with the probabilities indicated in p, so that choice is just a discrete random variable from the numbers [1,2,3,4], based on p.
  2. Once you have chosen the interval for any given sample, now you need to decide what the probability distribution in that interval is. A simple choice might be a translated beta distribution. So sample from a beta random variable (see betarnd, from the stats toolbox) then shift and scale it to lie in the appropriate interval.
The nice thing about choosing a beta distribution is that you have a great deal of control over the shape of the pdf, through the beta parameters.
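The two steps above can be sketched as follows. This is only a minimal sketch: it assumes the Statistics and Machine Learning Toolbox is available (for randsample and betarnd), and the shape parameters alphaShape = 4 and betaShape = 1 are illustrative assumptions, not values from the thread. Any choice with alphaShape > betaShape pushes mass toward the upper end of each interval.

```matlab
% Sketch: 200 samples using interval probabilities p and a shifted,
% scaled beta distribution inside each interval.
p     = [0.16 0.24 0.28 0.32];      % interval probabilities (from the question)
edges = [3.3 4.6 5.9 7.2 8.5];      % interval boundaries (from the question)
n     = 200;

alphaShape = 4;                     % ASSUMED shape parameters; alpha > beta
betaShape  = 1;                     % skews samples toward the upper end

% Step 1: pick an interval index 1..4 for every sample according to p.
idx = randsample(numel(p), n, true, p);
idx = idx(:);                       % force a column vector

% Step 2: draw beta values on [0,1], then shift and scale them.
u  = betarnd(alphaShape, betaShape, n, 1);
lo = edges(idx);      lo = lo(:);   % lower bound of each chosen interval
hi = edges(idx + 1);  hi = hi(:);   % upper bound of each chosen interval
x  = lo + (hi - lo) .* u;           % samples in [3.3, 8.5]
```

Changing alphaShape and betaShape changes how strongly the samples cluster near the top of each interval, while the interval probabilities stay fixed by p.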
bhuvan khoshoo on 4 Apr 2020
Hello John,
It's true, I do not have much of a background in statistics, so I do not know much about distributions. I was just trying to come up with something from what I know, and the only thing I know is the normal distribution.
I will look into the beta distribution you suggested.
Thanks a lot.

Accepted Answer

Walter Roberson on 4 Apr 2020
1) I have predefined probabilities of picking random numbers, p = [0.16 0.24 0.28 0.32], for the intervals [3.3 4.6], [4.6 5.9], [5.9 7.2], and [7.2 8.5].
To implement that, you will need to generate a uniform random number in the range [0 1] and locate it among the boundaries of cumsum(p) to find out which interval it falls in. Then you would use an independent distribution for each of the intervals, each with 100% total probability over its own interval. In practice this might mean using a beta distribution for each of them.
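The cumsum(p) lookup can be sketched in base MATLAB as below. The use of discretize (available since R2015a) is one way to do the lookup, not necessarily what was intended in the answer, and the variable names are illustrative.

```matlab
% Sketch: choose an interval index for each sample by locating a
% uniform random number among the boundaries of cumsum(p).
p = [0.16 0.24 0.28 0.32];          % interval probabilities (from the question)
n = 200;

cdfEdges      = [0 cumsum(p)];      % 0, 0.16, 0.40, 0.68, 1
cdfEdges(end) = 1;                  % guard against floating-point round-off

r   = rand(n, 1);                   % uniform random numbers in (0,1)
idx = discretize(r, cdfEdges);      % idx(k) is 1, 2, 3 or 4

% Empirical frequency of each interval, for comparison with p.
counts = accumarray(idx, 1, [numel(p) 1]).';
disp(counts / n)
```

With only 200 samples the printed frequencies will fluctuate around p rather than match it exactly. The value within each chosen interval still has to be drawn from that interval's own distribution.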
truncate() of a distribution is also a possibility. truncate automatically rescales the values so that the total probability within the truncated range is 100%. So a normal distribution truncated to a range is not the same as a normal distribution whose values outside the range somehow "don't count" and yet magically do not affect the probability calculations. A truncate()'d distribution is still a real distribution with a real pdf() and a real cdf() that are meaningful.
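A short sketch of what truncate() does, assuming the Statistics and Machine Learning Toolbox. The standard deviation sigma = 2 is an arbitrary illustrative value, not one taken from the thread.

```matlab
% Sketch: a normal distribution with mean 8.5, truncated to [3.3, 8.5].
sigma = 2;                                    % ASSUMED value, for illustration
pd    = makedist('Normal', 'mu', 8.5, 'sigma', sigma);
t     = truncate(pd, 3.3, 8.5);

% The truncated distribution is rescaled: its cdf runs from 0 to 1
% over the truncation range.
cdfAtEnds = cdf(t, [3.3 8.5]);

% Probability that the truncated distribution assigns to each of the
% four intervals from the question, for comparison with p.
edges     = [3.3 4.6 5.9 7.2 8.5];
pInterval = diff(cdf(t, edges));

x = random(t, 200, 1);                        % 200 samples, all in [3.3, 8.5]
```

Comparing pInterval with p = [0.16 0.24 0.28 0.32] shows how far a given sigma is from the required interval probabilities. A single parameter has to fit all four values at once, so an exact match is generally not to be expected.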
2) I would like to pick more elements from the higher end of each interval; for example, numbers picked from [7.2 8.5] should be closer to 8.5 than to 7.2.
You have not really defined how you want the values to cluster more closely. One approach over the interval [a, b] is to use the pdf
(1-exp(-x))/int(1-exp(-x),a,b)
Another is to use
(x^2)/int(x^2,a,b)
Although these work out mathematically, I suspect that over [7.2 8.5] their shapes might not bias the results visibly enough for your purposes.
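One way to check that suspicion is to compute how much probability each candidate pdf puts in the upper half of [7.2 8.5]. The sketch below uses base MATLAB's integral; the variable names are illustrative.

```matlab
% Sketch: fraction of probability in the upper half of [a, b] for the
% two candidate (unnormalized) densities.
a = 7.2;  b = 8.5;  m = (a + b) / 2;

f1 = @(x) 1 - exp(-x);              % first candidate
f2 = @(x) x.^2;                     % second candidate

% Dividing by the integral over [a, b] normalizes each density.
upper1 = integral(f1, m, b) / integral(f1, a, b);
upper2 = integral(f2, m, b) / integral(f2, a, b);

fprintf('1-exp(-x): %.4f\nx^2:       %.4f\n', upper1, upper2);
```

Working the integrals by hand gives about 0.50 for the first candidate and about 0.54 for the second, so over this interval both are close to uniform.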
  3 Comments
Walter Roberson on 5 Apr 2020
Remember that no matter what distribution you choose over the interval [a, b], the pdf must be such that int(pdf(x), a, b) = 1, which can also be understood as saying that the cdf must be such that cdf(b) = 1.
To have a high probability in your upper half, you want the cdf at the center, cdf((a+b)/2), to be less than 1/2. With the integral being 1, less than 1/2 up to the center point implies that more than half lies after the center point.
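As a small worked check of both conditions, the sketch below uses a hypothetical density proportional to (x - a)^k on [a, b], which is not from the thread. Its cdf is ((x - a)/(b - a))^(k+1), so it can be sampled by inverse transform in base MATLAB. The exponent k = 3 is an assumption for illustration.

```matlab
% Sketch: verify cdf(b) = 1 and cdf at the center < 1/2 for a
% hypothetical density proportional to (x - a)^k on [a, b].
a = 7.2;  b = 8.5;
k = 3;                                        % ASSUMED exponent

F = @(x) ((x - a) ./ (b - a)).^(k + 1);       % cdf on [a, b]

cdfAtB      = F(b);                           % equals 1
cdfAtCenter = F((a + b) / 2);                 % 0.5^(k+1), about 0.0625

% Inverse-transform sampling: solve F(x) = u for x.
u = rand(1e5, 1);
x = a + (b - a) .* u.^(1 / (k + 1));

fracUpper = mean(x > (a + b) / 2);            % close to 1 - cdfAtCenter
```

A larger k lowers the cdf at the center further and concentrates more of the samples near b.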
bhuvan khoshoo on 5 Apr 2020
Yes, I will make sure that the cumulative probability over each interval is 1, and I will shape my cdf according to my closeness requirement.
Thanks again Walter!!

More Answers (0)

Release

R2019a
