# Sampling data using constraints

Maaz Ahmad on 11 Sep 2020
Commented: Maaz Ahmad on 11 Sep 2020
Hi, I want to sample points uniformly in a multidimensional design space such that the sampled points satisfy an
equality constraint. Specifically, I want to sample 4 mole fractions (Xi) for the 4 components of an input stream. Each Xi
also has lower and upper bounds (inequality constraints).
E.g., 0.733<=X1<=0.944, 0.04<=X2<=0.093, 0.04<=X3<=0.094, 0.02<=X4<=0.048, X1+X2+X3+X4=1
Is there any way to use the built-in sampling techniques such as Sobol sequences or LHS while enforcing these constraints?

John D'Errico on 11 Sep 2020
Edited: John D'Errico on 11 Sep 2020
Use my randFixedLinearCombination tool, as found on the File Exchange. (I just posted an update today, fixing a bug that the code tripped on for this problem.)
n = 5;
lb = [0.733 0.04, 0.04, 0.02];
ub = [0.944, 0.093, 0.094, 0.048];
A = [1 1 1 1]
b = 1;
X = randFixedLinearCombination(n,A,b,lb,ub);
Does the array X satisfy the requirements? Yes.
>> X
X =
0.80603 0.078998 0.093954 0.021013
0.86969 0.041981 0.04217 0.046161
0.82108 0.082301 0.058065 0.038558
0.84723 0.045193 0.085937 0.021644
0.81159 0.055575 0.092472 0.040365
>> sum(X,2)
ans =
1
1
1
1
1
>> all(X >= lb,2)
ans =
5×1 logical array
1
1
1
1
1
>> all(X <= ub,2)
ans =
5×1 logical array
1
1
1
1
1
So X has 5 rows, each of which sums to 1, and every element of each column lies within the bounds given for that column.
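The three checks above can be folded into one reusable test. This is only a sketch: the tolerance tol is an assumed value, and the matrix X below is a hand-made stand-in (not output from randFixedLinearCombination) so the snippet runs on its own. A tolerance is used because the row sums display as 1 but need not equal 1 exactly in floating point.

```matlab
% Bounds and equality constraint from the question.
lb = [0.733 0.04 0.04 0.02];
ub = [0.944 0.093 0.094 0.048];
A  = [1 1 1 1];
b  = 1;

% Stand-in sample (assumption, for illustration only); replace with the X
% returned by randFixedLinearCombination.
X = [0.85 0.06 0.06 0.03;
     0.80 0.09 0.07 0.04];

tol = 1e-12;                              % assumed tolerance for round-off

% Equality constraint, checked row by row: A*x' should equal b.
sumOK = all(abs(X*A.' - b) <= tol);

% Bound constraints, checked element by element (implicit expansion, R2016b+).
boundsOK = all(all(X >= lb - tol)) && all(all(X <= ub + tol));

fprintf('sum ok: %d, bounds ok: %d\n', sumOK, boundsOK);
```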
It is difficult to prove, in a higher number of dimensions, that they are uniformly distributed in that space, but they are. I'm just too sleepy and too lazy to try to prove that here.
There are limits on the size of the problem, due to the way the code is written. Depending on the problem, the upper bound might be on the order of 10-25 variables. Beyond that point lie some computationally intensive dragons in this code. The actual limit will be problem dependent though.
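For a problem as small as the one in the question, a plain alternative that does not need the File Exchange tool is to sample three of the fractions uniformly inside their own bounds, let the equality constraint fix the fourth, and reject any row whose fourth value falls outside its bounds. The sketch below is not how randFixedLinearCombination works internally; it is a different, simpler technique, and choosing X1 as the dependent variable is an assumption made here. The accepted rows are an affine image of uniform draws, so they should be uniform on the feasible slice of the plane X1+X2+X3+X4=1.

```matlab
n  = 5;
lb = [0.733 0.04 0.04 0.02];
ub = [0.944 0.093 0.094 0.048];

X = zeros(0,4);
while size(X,1) < n
    % Uniform draws for X2..X4 inside their own bounds. U could instead come
    % from lhsdesign or sobolset (Statistics and Machine Learning Toolbox).
    U    = rand(n,3);
    free = lb(2:4) + U .* (ub(2:4) - lb(2:4));

    % The equality constraint determines X1 once X2..X4 are chosen.
    x1 = 1 - sum(free,2);

    % Reject rows whose X1 violates its own bounds.
    keep = x1 >= lb(1) & x1 <= ub(1);
    X = [X; x1(keep), free(keep,:)]; %#ok<AGROW>
end
X = X(1:n,:);                             % trim any surplus rows
```

With the bounds in the question, X2+X3+X4 always lies between 0.1 and 0.235, so X1 lies between 0.765 and 0.9 and no row is rejected. In general the acceptance rate can be low, and rejecting rows would break the stratification of an LHS or Sobol design.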

Maaz Ahmad on 11 Sep 2020
Thanks John! Just a small correction to your sample code: I think we need to swap b and A in the function arguments. That works for me.
John D'Errico on 11 Sep 2020
Yes. I have no idea why I did it that way. The correct line of code as I wrote the function is:
X = randFixedLinearCombination(n,b,A,lb,ub);
That is how the code expects the parameters, and it is exactly the opposite of what I would have written based on most MATLAB codes that employ linear constraints. Tools like lsqlin, for example, always seem to sequence those arguments as A,b in a calling sequence. So, while still half asleep, I wrote A,b. In fact, when I pasted in the code, I saw the error but then forgot to fix it in my answer. That is what you get when I try to answer questions when half asleep. :)
Somehow, Walter answers questions even while fully asleep, and he is always correct even then. Oh well.
I would rather I wrote the code in a more traditional argument order. But I did what I did a few years ago, and I should not change the code now.
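One way to get the conventional A,b order without touching the File Exchange code is a thin wrapper. This is only a sketch: the name randLinComb is hypothetical and not part of the tool.

```matlab
% Hypothetical wrapper: accepts (n,A,b,lb,ub) and forwards the arguments in
% the order randFixedLinearCombination expects, which is (n,b,A,lb,ub).
randLinComb = @(n,A,b,lb,ub) randFixedLinearCombination(n,b,A,lb,ub);

% Usage (requires randFixedLinearCombination on the MATLAB path):
% lb = [0.733 0.04 0.04 0.02];
% ub = [0.944 0.093 0.094 0.048];
% X  = randLinComb(5, [1 1 1 1], 1, lb, ub);
```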
Maaz Ahmad on 11 Sep 2020
haha, rather, what you get when you do not code in a traditional way!!