Given a list of values and their probabilities, sample 10,000 values.
Example:
x = [1 2 3 4 5]; prob = [0.2 0.1 0.4 0.1 0.2]
Note that sum(prob)=1. Function output should look like this:
output = [1 4 3 1 4 1 2 3 1 1 1 ... 3 4 2] % a vector of length 10,000
All vectors are meant to be row vectors.
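For illustration only, here is a minimal sketch of one way to draw such a sample (inverse-CDF sampling with rand). It is not the reference solution; the variable names n, cdfUpper, u and idx are hypothetical, and bsxfun is used so the sketch does not depend on newer MATLAB features.

```matlab
x    = [1 2 3 4 5];
prob = [0.2 0.1 0.4 0.1 0.2];
n    = 10000;                      % number of samples requested by the problem

cdfUpper      = cumsum(prob);      % upper edge of each value's interval on [0, 1]
cdfUpper(end) = 1;                 % guard against floating-point round-off in the sum

u = rand(n, 1);                    % uniform draws in (0, 1)

% For each draw, count how many upper edges lie at or below it;
% adding 1 gives the index of the interval the draw falls into.
idx = sum(bsxfun(@ge, u, cdfUpper), 2).' + 1;

output = x(idx);                   % 1-by-n row vector of sampled values
```

Each value x(k) owns an interval of width prob(k), so a uniform draw lands in it with exactly that probability.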
Completely irrelevant test suite.
How would you test random output? : )
One possible way is to compute the empirical distribution based on a large number of random samples, and then compare the empirical distribution with the true distribution (up to a specified tolerance). Taking integer samples as an example, x = histcounts(data, [unique(data) Inf], 'Normalization','probability') returns the empirical probability of each distinct value in data.
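A minimal sketch of that check, under stated assumptions: the tolerance tol = 0.03 is an assumed value (about six standard deviations for the largest probability at n = 10,000), the sampler producing data is only a stand-in for the submission under test, and the bin edges use the known x (assumed sorted ascending) rather than unique(data), so a value that never appears still gets a bin.

```matlab
x    = [1 2 3 4 5];
prob = [0.2 0.1 0.4 0.1 0.2];
n    = 10000;
tol  = 0.03;                       % assumed tolerance, not taken from the actual test suite

% Stand-in for the submission under test (inverse-CDF sampling).
cdfUpper      = cumsum(prob);
cdfUpper(end) = 1;
data = x(sum(bsxfun(@ge, rand(n, 1), cdfUpper), 2).' + 1);

% Empirical probability of each value; one bin per entry of x.
empirical = histcounts(data, [x Inf], 'Normalization', 'probability');

assert(all(abs(empirical - prob) <= tol), ...
    'Empirical distribution deviates from prob by more than tol.');
```

Note that this check alone still accepts a sorted, "deterministic" output with the right counts, so it would need to be combined with a test of the ordering.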
That is exactly how the code validation is done (it mimics the central limit theorem), except that I set the admissible variance sufficiently large in case someone ends up as an outlier. Anyway, I changed it a little today so that "deterministic" solutions do not get accepted any more...
Your test is very weak. It does not fully reflect the random sampling procedure.
See my new solution which takes advantage of the weakness of your test suite. It is essentially "deterministic", but it passes your test.
Hi, Jakub. One more fairly simple check you could add would be to ensure that the result of diff( output ) does not produce mostly zeros. .... BTW, you do not need to put all checks inside one assert command. Indeed, it may be helpful for players to know which assertion is tripping them up, with the addition of text error messages. —DIV
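A sketch of what separate assertions with messages plus the diff check might look like. The 0.5 threshold is an assumption; for independent draws from the example distribution the expected fraction of zeros in diff(output) is sum(prob.^2) = 0.26. The sampler is again only a stand-in for a submission.

```matlab
x    = [1 2 3 4 5];
prob = [0.2 0.1 0.4 0.1 0.2];
n    = 10000;

% Stand-in for a genuine random submission (inverse-CDF sampling).
cdfUpper      = cumsum(prob);
cdfUpper(end) = 1;
output = x(sum(bsxfun(@ge, rand(n, 1), cdfUpper), 2).' + 1);

% One assertion per property, each with its own message.
assert(isrow(output) && numel(output) == n, ...
    'output must be a 1-by-10000 row vector.');
assert(all(ismember(output, x)), ...
    'output contains values that are not in x.');

zeroFraction = mean(diff(output) == 0);   % share of equal neighbours
assert(zeroFraction < 0.5, ...
    'diff(output) is mostly zeros; output looks sorted or blockwise constant.');

% For comparison: a sorted ("deterministic") arrangement of the same values.
cheat             = sort(output);
cheatZeroFraction = mean(diff(cheat) == 0);   % very close to 1
```

A sorted output changes value at most four times in 10,000 entries, so it fails the third assertion while a genuine sample passes comfortably.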
To be fair, Jakub, it is certainly a lot of work to set up a robust Test Suite for this problem. It is not possible to make a perfect Test Suite that will stop all cheating, so the plan would be just to make cheating difficult in comparison to a genuine solution. BTW, the leeway you allow on the sums (±50%) seems too generous. According to https://www.mathworks.com/matlabcentral/cody/problems/43592-sample-from-random-roulette/solutions/1351272 it seems ±2% would be about right, or perhaps ±5% if you still want to be extremely conservative / very generous.
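As a rough sanity check on those tolerances, the sketch below computes the relative standard deviation of sum(output) for the example distribution. It assumes that "the sums" refers to the sum of the 10,000 samples, which may not match the actual Test Suite; the variable names are hypothetical.

```matlab
x    = [1 2 3 4 5];
prob = [0.2 0.1 0.4 0.1 0.2];
n    = 10000;

mu     = sum(x .* prob);               % mean of one draw
sigma2 = sum(x.^2 .* prob) - mu^2;     % variance of one draw

expectedSum = n * mu;                  % expected value of sum(output)
stdSum      = sqrt(n * sigma2);        % standard deviation of sum(output)
relStd      = stdSum / expectedSum;    % relative standard deviation

fprintf('Relative std of the sum: %.3f%%\n', 100 * relStd);
fprintf('A 2%% band is %.1f standard deviations wide on each side.\n', 0.02 / relStd);
```

For this example the relative standard deviation is about 0.45%, so ±2% is already roughly 4.5 standard deviations, while ±50% accepts almost anything.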
...thank you, this works 20x faster than the reference solution, although discretize() seems to work on newer versions of Matlab only.
A good example which illustrates the weakness of the test suite.
Right, I am running out of ideas, since the deterministic solutions lie in the region of true solutions. Anyway, would you have some speedy solution for me if you followed the instructions and really wanted to return a random sample?
This is why I recommended that you check the empirical distribution, rather than the statistical mean. Anyway, I have just submitted a real solution with truly random samples.
Hmm... I kinda like my cheat better (Solution 1350403), Peng Liu :-) .... But seriously: Jakub, if you want to check your ability to detect true random UDF integer sequences, try Problem 44393 at https://www.mathworks.com/matlabcentral/cody/problems/44393 . My reference solution (unpublished) passes the Test Suite, as does Alfonso's concise code, and I expect both would catch the basic cheats employed here.