# Generate random sequence number with condition

Mi Tung on 23 Sep 2020 at 14:56
Commented: Mi Tung on 24 Sep 2020 at 0:21
Hello everyone!
For example, I want to generate a sequence of 200 random numbers with values from 0.1 to 3.5, such that only 10% of them are greater than 2. How can I do this in MATLAB? Can someone help me out? Thank you in advance.

the cyclist on 23 Sep 2020 at 15:25
When generating random numbers, it is critically important to fully specify the distribution characteristics. Beginners (at random number generation and at coding) often don't think about this carefully enough.
For example, you cannot have a distribution that obeys both of these restrictions:
1. uniformly distributed on the interval [0.1,3.5]
2. less than 10% of the values are on the interval [2,3.5]
It may be that a particular sample meets these criteria, but your distribution cannot.
You could create a non-uniform distribution (e.g. an exponentially distributed one) where the distribution range is [0.1,3.5], and on average only 10% of the values are in [2,3.5]. But in a given sample, more than 10% could be in [2,3.5].
Just to try to reinforce these points a bit more. Suppose I roll a 6-sided die. On average, the number 6 will come up less than 20% of the time. But I could get a sample in which it comes up more than 20% of the time. If I try to enforce that the sample has fewer than 20% 6's, then I no longer have a "fair" die.
So, you need to tell us significantly more detail about what you want, and also be clear about your distribution vs a sample from that distribution.
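To make the distribution-versus-sample point concrete, here is a minimal simulation sketch. It assumes only that each value independently exceeds 2 with probability 0.10, and it draws many samples of size 200. The variable names, the seed, and the number of repetitions are illustrative choices, not something from the original question.

```matlab
% Sketch: the distribution has P(X > 2) = 0.10, but individual samples vary.
rng(1);                         % fixed seed, only so the run is repeatable
n = 200;                        % sample size from the question
nSamples = 10000;               % number of independent samples (illustrative)

% Each entry is true with probability 0.10, standing in for "value > 2"
isLarge = rand(n, nSamples) < 0.10;

% Fraction of "large" values within each sample of 200
fracLarge = mean(isLarge, 1);

fprintf('Mean fraction over all samples: %.4f\n', mean(fracLarge));
fprintf('Smallest and largest fraction:  %.3f, %.3f\n', min(fracLarge), max(fracLarge));
fprintf('Share of samples above 10%%:     %.3f\n', mean(fracLarge > 0.10));
```

The average fraction should sit close to 0.10, while a sizeable share of the individual samples lands above it. That is the sense in which a distribution can meet the 10% condition while a particular sample does not.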
Mi Tung on 23 Sep 2020 at 15:28
It does not need to be exactly 10%; approximately is fine. Thanks.
Mi Tung on 23 Sep 2020 at 15:47
Hi the cyclist. Thanks for your suggestion. I want to generate a sequence of random numbers that does not need to be uniformly distributed on the interval [0.1,3.5]. The only condition is that the amount of numbers greater than 2 is approximately 10% of the total. Below is an example, which I entered manually.
AB=[0.25 0.32 0.26 0.17 0.43 0.62 0.16 1.51 0.34 0.42 0.25 0.56 0.12 0.32 0.33 0.13 0.24 0.37 0.41 0.61 0.72 0.56 0.84 1.35 0.63 0.44 0.38 0.29 0.31 0.42 0.66...
0.34 0.26 0.51 0.62 0.42 0.81 0.43 0.57 0.66 1.39 0.56 0.45 0.36 0.27 0.44 0.52 0.36 0.45 0.54 0.24 0.38 0.48 0.65 0.74 2.68 0.72 0.81 0.45 0.5 0.29 ...
0.38 0.57 0.42 0.38 0.64 0.54 0.39 2.41 0.15 2.28 0.37 0.57 0.71 0.83 1.95 0.74 0.65 0.24 0.36 0.47 0.46 0.28 0.37 0.14 0.19 0.58 0.43 0.51 0.11 0.42 0.55...
2.45 0.52 0.34 0.21 0.29 0.38 0.56 0.68 0.92 2.15 1.05 0.75 0.61 3.42 0.38 0.67 0.28 0.74 0.38 0.45 0.56 2.74 2.41 0.85 0.65 0.23 0.27 0.51 0.53 0.71...
2.68 0.73 0.48 0.51 0.76 2.41 0.28 0.39 0.37 0.58 0.14 0.25 0.29 2.35 0.48 0.85 0.64 1.95 2.56 0.84 0.95 3.05 1.03 2.87 0.69 0.73 0.81 0.55 0.48 0.56 0.61...
0.74 1.68 0.95 0.76 0.42 0.67 0.34 0.48 2.16 0.28 0.37 0.46 0.59 0.78 1.09 1.14 1.06 2.37 0.38 0.29 0.46 0.17 0.16 0.38 2.58 0.62 0.71 0.24 0.16 0.19...
0.41 0.26 0.35 2.75 0.24 2.74 0.78 1.14 1.08 0.84 0.64 0.37 0.89 2.87 0.65 0.37 0.47]

Bruno Luong on 23 Sep 2020 at 18:49
Edited: Bruno Luong on 23 Sep 2020 at 18:49
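% piecewise-cubic (pchip) map from uniform u to the value:
% u = 0 -> 0.1, u = 0.9 -> 2, u = 1 -> 3.5
P=interp1([0 0.9 1],[0.1 2 3.5],'pchip','pp');
n = 1e6; % 200 in your case
% here is the array of random numbers
r=ppval(P,rand(1,n));
% fraction of values >= 2; this should return a value close to 0.1
sum(r>=2)/length(r)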

the cyclist on 23 Sep 2020 at 19:52
One of literally an infinite number of solutions to OP's question, given the lack of specificity in what is needed. :-)
Mi Tung on 24 Sep 2020 at 0:19
Many thanks to Mr. Bruno Luong and the cyclist for the suggestions. I got it. I will try them in MATLAB.

John D'Errico on 23 Sep 2020 at 17:08
Edited: John D'Errico on 23 Sep 2020 at 17:13
I think you still do not appreciate that you cannot just say something is "random" and not define the distribution.
For example, if we generate a sequence of 100 numbers that are each EXACTLY either 0.1 OR 3.5, where 3.5 occurs with probability 10%, then it satisfies your requirement. That is trivial to do.
X = (rand(1,100) > 0.9)*3.4 + 0.1;
sum(X>2)/numel(X)
ans =
0.11
So here we have a random sequence where 11% of the 100 numbers exceeded 2. Since the sequence is of length 100, that means we had 11 of the 3.5 events, and 89 of the 0.1 events. As the sequence length goes to infinity, the fraction of values exceeding 2 will approach 10% quite nicely.
Here are the first 20 events in that sequence:
X(1:20)
ans =
Columns 1 through 9
0.1 0.1 0.1 3.5 0.1 0.1 0.1 3.5 0.1
Columns 10 through 18
0.1 0.1 3.5 0.1 0.1 0.1 0.1 0.1 0.1
Columns 19 through 20
0.1 0.1
It completely satisfies your requirements, but I may postulate, based on your example, that it does not satisfy your goals.
So, do you want some sort of uniform distribution in each sub-interval? That is, it is easy enough to generate a mixture distribution, where 10% of the time we have a uniform sample from one interval, and 90% of the time it is uniform over another interval. Trivially easy. Or do you want to see some continuous distribution that has the desired probability? For example, it would be easy enough to derive some sort of truncated exponential, such that by variation of the rate parameter, one could tailor the distribution to what you want. Or we could choose a beta distribution with the same property, and beta distributions have all sorts of shapes.
But until you explain CLEARLY what you are looking for, and what distribution you want, I've given you a perfectly valid solution in one line of code.
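As an illustration of the truncated exponential idea mentioned above, here is a minimal sketch. It assumes an exponential with rate lam truncated to [0.1, 3.5], solves numerically for the rate that leaves 10% of the probability above 2, and then samples by inverting the CDF. The names a, b, c, p, lam and F, the seed, and the fzero bracket are choices made for this example only.

```matlab
% Sketch: truncated exponential on [a, b] tuned so that P(X > c) = p.
a = 0.1; b = 3.5;               % range from the question
c = 2;   p = 0.10;              % want P(X > c) = p
n = 200;                        % sample size from the question
rng(2);                         % fixed seed, only so the run is repeatable

% CDF of an exponential with rate lam, truncated to [a, b]
F = @(x, lam) (1 - exp(-lam*(x - a))) ./ (1 - exp(-lam*(b - a)));

% Find the rate that puts a fraction 1-p of the probability below c.
% F(c, lam) - (1-p) is negative at lam = 0.1 and positive at lam = 5,
% so this interval brackets a root.
lam = fzero(@(lam) F(c, lam) - (1 - p), [0.1 5]);

% Inverse-transform sampling: solve F(x, lam) = u for x
u = rand(1, n);
x = a - log(1 - u*(1 - exp(-lam*(b - a)))) / lam;

% Sample fraction above c; close to p on average, not exactly p
mean(x > c)
```

Unlike the two-point solution, this gives a smooth density that decays across the whole interval. As with any sampling approach, the fraction above 2 in a single run of 200 values will only be near 10%.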

Jeff Miller on 24 Sep 2020 at 0:04
Here is another distributional option with a figure showing the comparison to Bruno's:
%% Bruno's solution
P=interp1([0 0.9 1],[0.1 2 3.5],'pchip','pp');
n = 1e6; % 200 in your case
% here is the array of random
r=ppval(P,rand(1,n));
% this should return value close to 0.1
sum(r>=2)/length(r);
subplot(1,2,1);
histogram(r,'normalization','pdf') % normalize so the y-axis really is a density
ylabel('Probability density')
xlabel('Random number')
%% Here is another possible distribution with uniform random numbers
% within each interval:
final = zeros(n,1); % the final random numbers will go here
% randomly choose whether each final number is
% in the interval 0.1-2 or the interval 2-3.5
preliminary = rand(n,1);
large = preliminary < 0.10; % only 10% should be large
largeN = sum(large);
final(large) = 2 + rand(largeN,1)*(3.5-2); % random numbers between 2-3.5
final(~large) = 0.1 + rand(n-largeN,1)*(2-0.1); % random numbers between 0.1-2
subplot(1,2,2);
histogram(final,'normalization','pdf')
ylabel('Probability density')
xlabel('Random number')
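If the count of large values should be exactly 10% of the sample rather than 10% on average, one possible variation on the mixture above is to fix the number of large values first and then choose their positions at random. This is only a sketch under that assumption. As noted earlier in the thread, forcing the sample to obey the constraint means the values are no longer independent draws from a single distribution.

```matlab
% Sketch: exactly 10% of the values lie above 2, at random positions.
n = 200;                        % sample size from the question
nLarge = round(0.10 * n);       % exactly 20 large values
rng(3);                         % fixed seed, only so the run is repeatable

% Pick nLarge distinct positions for the large values
isLarge = false(n, 1);
isLarge(randperm(n, nLarge)) = true;

x = zeros(n, 1);
x(isLarge)  = 2   + rand(nLarge, 1)     * (3.5 - 2);   % uniform on (2, 3.5)
x(~isLarge) = 0.1 + rand(n - nLarge, 1) * (2 - 0.1);   % uniform on (0.1, 2)

sum(x > 2)                      % equals nLarge by construction
```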

#### 1 Comment

Mi Tung on 24 Sep 2020 at 0:21
Thank you, Mr. Jeff Miller, for your solution. I accepted Bruno Luong's answer because he answered before you. I'm sorry. Best regards.