I need to create a matrix that must have 3 columns and contain all the elements of P (order isn't important), with zeros in the free places. For example, in my case:
4.5000 4.5000 5.2500
5.2500 2.2500 0.6000
0.5000 0.6000 0.5000
0.3000 1.0000 2.0000
0.5000 0.5000 0
My goal is to swap matrix elements between columns to reach an arrangement that satisfies this property: say the sum of all elements in vector P is S. The arrangement should be such that the sum of all elements in column 1 is as close as possible to S/3, and the same for columns 2 and 3.
My first attempt was to generate every possible arrangement of the matrix and, in a while loop, keep track of the one that gets closest to my goal. The problem is that using perms(G) to generate them I run out of memory, because perms generates ALL possible permutations, while I only need the ways of splitting n elements into 3 groups, which is far fewer, but I don't know how to generate those.
Example of one possible output found by hand (order of elements within the same column doesn't matter):
My comment to John made me realise what an efficient way to go about it would be. The crux of the problem is generating the unique permutations of repelem(1:numberofsets, setsize), i.e. finding the unique permutations of a set with repetition. I'm sure there's already something on the File Exchange that does that, but I can't find it.
I've left my original answer below, but don't use it. This one is much faster and more efficient. Note that the code for finding unique permutations is heavily inspired by this post. First, the helper function that generates the unique permutations:
function partitions = uniquesets(setcount, setlength)
%generate unique partitions of setcount sets of length setlength
%number of partitions = factorial(setcount*setlength) / factorial(setlength)^setcount
%setcount: the number of sets in the partition
%setlength: the length of each set in the partition
%partitions: a K*L array, where K = setcount*setlength and L = factorial(setcount*setlength) / factorial(setlength)^setcount. Each column is a partition. Rows are set indices from 1 to setcount
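The body of that helper is not shown above, so here is a minimal, unoptimized sketch of one way such a generator could work (my reconstruction, not the original code; the name uniquesetsSketch is hypothetical): assign the positions for each set label in turn with nchoosek, so every assignment is produced exactly once and no duplicates need to be filtered out afterwards.

```matlab
function partitions = uniquesetsSketch(setcount, setlength)
%UNIQUESETSSKETCH Sketch: unique permutations of repelem(1:setcount, setlength).
%Each column of the output is one partition; rows hold set indices 1..setcount.
n = setcount * setlength;
partitions = recurse(zeros(n, 1), 1:n, 1, setcount, setlength, zeros(n, 0));
end

function out = recurse(current, free, label, setcount, setlength, out)
%Pick which of the still-free positions receive the current label,
%then recurse on the remaining positions with the next label.
if label > setcount
    out(:, end+1) = current;   % all labels placed: record this partition
    return
end
choices = nchoosek(free, setlength);        % position sets for this label
for k = 1:size(choices, 1)
    next = current;
    next(choices(k, :)) = label;
    out = recurse(next, setdiff(free, choices(k, :)), label + 1, ...
        setcount, setlength, out);
end
end
```

Because each label's positions are chosen by nchoosek over the remaining free slots, the total count comes out to factorial(setcount*setlength)/factorial(setlength)^setcount, matching the comment above.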
This is very quick, about half a second on my machine, and will work for much longer P.
It still creates a few duplicate matrices, as duplicate numbers in P are still considered distinct for the purpose of partitioning. However, it does not create unnecessary partitions.
Original answer, which wasn't efficient at all
Bearing in mind this is not my domain at all (this is a problem for John), here is a solution that works for this problem size but is probably very inefficient.
It uses this File Exchange entry to find all partitions of size 3 of your P vector. This is quite inefficient, since of these you only want the partitions where each subset has size 5. Still, it's better than computing all permutations. In addition, duplicate numbers are considered distinct for the partitioning problem, again giving you more partitions than necessary:
Your problem is poorly formulated: how close to S/3 do you need to be?
For long vectors, your problem really is large, especially in the way you are trying to solve it.
For a vector of length n, there are 2^n subsets of its elements. This is well known, and 2^n grows exponentially. Note that perms is not appropriate here, so at least you don't have factorial growth, but it is still big.
Given the vector P of length 14, there are 2^14 subsets of elements to consider. We can think of them as the bits:
B14 = dec2bin(0:2^14-1);
10×14 char array
That was for the first 10 subsets we might choose. So each element of P is either included or excluded in a sum, as indicated by the corresponding bit.
(I'll discuss later the issue that it looks like you want at MOST 3 terms in the final sum.)
Now, your goal is essentially to find all such sums. In the end, you probably want to rank the sums, sorting them so that those closest to S/3 are at the top. But since you never say how close to S/3 the sums are allowed to get, you effectively need to compute them all. This is not even remotely difficult for a vector of length only 14, though.
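For concreteness, all 2^14 subset sums can be computed in one shot as a matrix-vector product; a sketch, assuming the 14 nonzero cells of the matrix in the question are the elements of P:

```matlab
% the 14 elements of P, read row-wise off the question's matrix (padding zero dropped)
P = [4.5 4.5 5.25 5.25 2.25 0.6 0.5 0.6 0.5 0.3 1 2 0.5 0.5];
n = numel(P);
B = dec2bin(0:2^n-1) - '0';          % 2^n-by-n matrix of 0/1 inclusion flags
sums = B * P(:);                     % every subset sum at once
S = sum(P);
[~, order] = sort(abs(sums - S/3));  % rank subsets by closeness to S/3
```

Each row of B plays the role of one of the bit patterns shown below: a 1 in column j means P(j) is included in that subset's sum.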
What were the first 10 such sums that were found? Represented by a 1 here where a given term was included in the sum, we see:
0 0 0 1 1 0 0 1 0 1 0 0 1 1
0 0 0 1 1 0 0 1 0 1 1 0 0 0
0 0 0 1 1 0 0 1 1 1 0 0 0 1
0 0 0 1 1 0 0 1 1 1 0 0 1 0
0 0 0 1 1 0 1 1 0 1 0 0 0 1
0 0 0 1 1 0 1 1 0 1 0 0 1 0
0 0 0 1 1 0 1 1 1 1 0 0 0 0
0 0 0 1 1 1 0 0 0 1 0 0 1 1
0 0 0 1 1 1 0 0 0 1 1 0 0 0
0 0 0 1 1 1 0 0 1 1 0 0 0 1
We can extract the specific terms in one such subset sum as:
0 0 0 5.25 2.25 0 0 0.6 0 0.3 0 0 0.5 0.5
So, was this all easy? Well, yes. Trivially so, at least for a vector of length only 14. It would have succeeded up to around 20 elements in the vector P. If you increase the length of P by a significant amount, though, things go to hell. You can effectively never generate all such sums for a vector of length 50. (At least not with the computers we have available today.) Even that operation for partial sums of 25 terms or so will be brutal.
Remember that we computed ALL such partial sums, because we really don't know when to stop looking at all possible subsets. How far away from S/3 are we allowed to go?
1 1 1 1 1 1 1 1 1 1 1 1 1 1
So one of those candidate sums was the sum of all elements, and since none of the elements of P were negative, that sum will lie way out in the weeds.
For large, long vectors, is there a better way? Well, yes. You could write an algorithm that recursively generates subsets of P, then uses a tolerance to decide when a sum has strayed too far. But even this can be difficult to perform, because you may not know how to choose that tolerance. Doable, but sometimes difficult.
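As a hedged illustration of that recursive idea (a sketch only, assuming all elements of P are nonnegative, so a branch can be pruned as soon as its running sum exceeds target + tol; the function name boundedsubsets is hypothetical):

```matlab
function subsets = boundedsubsets(P, target, tol)
%BOUNDEDSUBSETS Sketch: subsets of nonnegative P whose sum is within tol
%of target, enumerated recursively with pruning of hopeless branches.
subsets = recurse(sort(P, 'descend'), 1, [], 0, target, tol, {});
end

function acc = recurse(P, k, chosen, s, target, tol, acc)
if abs(s - target) <= tol
    acc{end+1} = chosen;            % record this subset's terms
end
if k > numel(P) || s - target > tol % prune: nonneg terms only grow the sum
    return
end
acc = recurse(P, k+1, [chosen P(k)], s + P(k), target, tol, acc); % include P(k)
acc = recurse(P, k+1, chosen, s, target, tol, acc);               % exclude P(k)
end
```

Sorting P in descending order first makes the pruning condition trigger earlier, which is the whole point of the recursion over brute-force enumeration.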
Now, suppose we want to solve the problem of finding all subsets with at most 3 terms in any candidate sum. This is most easily done using nchoosek. So I might split the problem into finding all partial sums of 1, 2, or 3 terms, then combine them at the end.
S = sum(P);
Starget = S/3;
Stol = 0.2; % a tolerance on the sum error
T3 = nchoosek(P,3);
err = abs(sum(T3,2) - Starget);
T3(err > Stol,:) = [];   % discard combinations too far from the target
err(err > Stol) = [];
[err,tags] = sort(err,'ascend');
T3 = T3(tags,:);
This is now trivially easy to write, as you see. And it is efficient, since we only care about at most 3 terms in the partial sums. In fact, there were only 9 partial sums of 3 terms within a tolerance of 0.2 of the target sum:
4.5 4.5 0.5
4.5 4.5 0.5
4.5 4.5 0.5
4.5 4.5 0.5
5.25 2.25 2
5.25 2.25 2
4.5 4.5 0.3
4.5 4.5 0.6
4.5 4.5 0.6
Now you could do the same set of operations for exactly 2 terms in the sum, then look at 1 term. Each time, keep only the partial sets that were within the designated tolerance on the sum.
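Those three cases can be handled in one loop; a sketch along the lines of the code above (same Starget and Stol as before):

```matlab
S = sum(P);
Starget = S/3;
Stol = 0.2;                          % tolerance on the sum error
candidates = cell(1, 3);
for k = 1:3
    Tk = nchoosek(P, k);             % all k-term combinations of P
    errk = abs(sum(Tk, 2) - Starget);
    Tk(errk > Stol, :) = [];         % keep only sums within tolerance
    candidates{k} = Tk;              % rows are the surviving k-term subsets
end
```

Each cell of candidates then holds the near-target subsets of exactly k terms, ready to be combined or ranked at the end.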
The nice thing is that, even for significantly longer vectors P, if you are only ever interested in partial sums of at most 3 terms, this is still doable. Thus, for vectors of length 50 or 100, we still only need to consider a few tens of thousands of combinations, or even 100K such partial sets. It will still get nasty for vectors with many thousands of elements, but you can always overwhelm any computational scheme if you try hard enough.