nchoose2: save output in chunks?

phlie on 19 Sep 2016
Commented: phlie on 22 Sep 2016
Hi everyone, I have a cell array N of size m-by-n containing mixed numeric and string data, with the first row as a header. I would like to create every combination without repetition of each row i with each other row j ≠ i. I am currently doing this with the user-written function nchoose2:
ind = nchoose2(1:size(N, 1)-1);
Unfortunately, my cell array is so large that computing ind causes an out-of-memory error. Can I generate the output of nchoose2 (I wouldn't mind using nchoosek instead) in chunks? For example: generate the first 50k rows of ind, process them, delete them, and then move on to the next 50k?
2 Comments
José-Luis on 19 Sep 2016
Do you have any idea how large your total output would actually be?
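A rough way to answer that question is to count the pairs before generating them: n rows give n*(n-1)/2 unordered pairs. The sketch below assumes a hypothetical row count and an index matrix stored as doubles with two columns; both are illustrative assumptions, not figures from the original post.

```matlab
% Estimate the size of the full pair-index matrix without building it.
n = 100000;                    % hypothetical number of data rows (assumption)
npairs = n * (n - 1) / 2;      % number of unordered row pairs
nbytes = npairs * 2 * 8;       % two columns of 8-byte doubles per pair
fprintf('%.0f pairs, roughly %.1f GiB as a double index matrix\n', ...
    npairs, nbytes / 2^30);
```

If the estimate is far beyond the available memory, generating the pairs in chunks is the only practical option.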
Guillaume on 19 Sep 2016
For reference: nchoose2

Accepted Answer

Guillaume on 19 Sep 2016
Neither nchoosek nor nchoose2 lets you return a portion of the output.
You can always generate the output using a loop and break out whenever you want:
function [rowcombination, nextfirstrow, nextsecondrow] = choose2row(in, maxrows, startfirstrow, startsecondrow)
%CHOOSE2ROW create every combination of 2 rows of a matrix/cell array
%The function can return a portion of the output and be called again to return the next portion.
%The function uses double loops to compute all combinations.
%Outputs:
% rowcombination: matrix/cell array where each row is the concatenation of two distinct rows of the original matrix/cell array.
% nextfirstrow:
% nextsecondrow: parameters to pass back to a subsequent call to CHOOSE2ROW to return the next portion of row combinations. Both are Inf once all combinations have been returned.
%Inputs:
% in: input matrix/cell array of size [m, n].
% maxrows: maximum number of rows of output rowcombination. Inf for no limit. Scalar, optional. default Inf.
% startfirstrow: outer loop start index. Scalar, optional. default 1.
% startsecondrow: inner loop start index. Scalar, optional. default startfirstrow + 1.
    if nargin < 2 || maxrows == Inf
        maxrows = Inf;
    else
        validateattributes(maxrows, {'numeric'}, {'scalar', 'positive', 'integer'}, 2);
    end
    if nargin < 3
        startfirstrow = 1;
    else
        validateattributes(startfirstrow, {'numeric'}, {'scalar', 'positive', 'integer', '<', size(in, 1)}, 3);
    end
    if nargin < 4
        startsecondrow = startfirstrow + 1;
    else
        validateattributes(startsecondrow, {'numeric'}, {'scalar', 'positive', 'integer', '<=', size(in, 1), '>', startfirstrow}, 4);
    end
    nrows = (size(in, 1) - startfirstrow + 1) * (size(in, 1) - startfirstrow) / 2 - (startsecondrow - startfirstrow - 1); %total size of output still to generate
    rowcombination = repmat(in(1, :), min(nrows, maxrows), 2); %initialise output to required size
    rowout = 1;
    for nextfirstrow = startfirstrow : size(in, 1)-1
        for nextsecondrow = startsecondrow : size(in, 1)
            rowcombination(rowout, :) = [in(nextfirstrow, :), in(nextsecondrow, :)];
            rowout = rowout + 1;
            if rowout > maxrows
                nextsecondrow = nextsecondrow + 1; %#ok<FXSET> exiting the loop
                if nextsecondrow > size(in, 1)
                    nextfirstrow = nextfirstrow + 1; %#ok<FXSET>
                    nextsecondrow = nextfirstrow + 1; %#ok<FXSET>
                    if nextfirstrow == size(in, 1)
                        nextfirstrow = Inf; %#ok<FXSET>
                        nextsecondrow = Inf; %#ok<FXSET>
                    end
                end
                return
            end
        end
        startsecondrow = nextfirstrow + 2;
    end
    nextfirstrow = Inf;
    nextsecondrow = Inf;
end
Of course, you're trading performance for memory.
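One possible way to drive choose2row in chunks is sketched below. It assumes the function is saved as choose2row.m on the MATLAB path; the sample cell array, the chunk size, and the variable names are hypothetical and only illustrate the calling pattern, with the two extra outputs fed back in until they come back as Inf.

```matlab
% Assumes choose2row.m (the function from this answer) is on the MATLAB path.
N = {'id', 'name'; 1, 'a'; 2, 'b'; 3, 'c'; 4, 'd'};  % hypothetical data, header in row 1
data = N(2:end, :);          % drop the header row before combining
chunksize = 4;               % hypothetical chunk size (50000 in the question)
firstrow = 1;                % state passed from one call to the next
secondrow = 2;
total = 0;                   % running count of combinations seen so far
while isfinite(firstrow)     % choose2row returns Inf when nothing is left
    [chunk, firstrow, secondrow] = choose2row(data, chunksize, firstrow, secondrow);
    total = total + size(chunk, 1);
    % ... process chunk here; it is overwritten on the next iteration ...
end
```

Only one chunk is held in memory at a time, so peak memory is governed by chunksize rather than by the total number of combinations.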
1 Comment
phlie on 22 Sep 2016
Thank you, Guillaume. This works very well!