Most efficient way to vertically concatenate numeric data?

In the profiled output picture below, mergedDataPerRank is a cell array storing 53 double matrices with size (6300, 33). Using vertcat, it takes approx. 0.5 seconds (20 calls) to vertically concatenate the data. Is there a more efficient way to do this or is 0.5 seconds already fairly good?

4 Comments

0.5 seconds is already good for 20 calls, each vertically concatenating 53 matrices of size 6300 x 33:
C = arrayfun(@(idx) rand(6300,33), (1:53).', 'uniform', 0);
tic; for K = 1 : 20; result1 = cell2mat(C); end; t1 = toc;
tic; for K = 1 : 20; result2 = vertcat(C{:}); end; t2 = toc;
format long g
[t1, t2]
ans = 1×2
1.811011 0.86993
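As a cross-check on the tic/toc loops above, the built-in timeit function can time each call on its own. This is only a sketch: the variable names tVertcat and tCell2mat are made up for illustration, and the numbers will differ from machine to machine.

```matlab
% Same test data as above: 53 matrices of size 6300 x 33.
nmat = 53; nrow = 6300; ncol = 33;
C = cell(nmat, 1);
for K = 1:nmat
    C{K} = rand(nrow, ncol);
end

% timeit runs each handle several times and returns a typical
% (median) time in seconds for a single call.
tVertcat  = timeit(@() vertcat(C{:}));
tCell2mat = timeit(@() cell2mat(C));
fprintf('vertcat: %.4f s, cell2mat: %.4f s\n', tVertcat, tCell2mat);
```

Multiplying the per-call time by 20 gives a figure comparable to the loop timings.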
On my desktop, the vertcat approach takes roughly 0.286 seconds.
nrows = 6300;
ncols = 33;
ncells = 53;
C = arrayfun(@(idx) rand(nrows,ncols), (1:ncells).', 'uniform', 0);
tic; for K = 1 : 20; result1 = cell2mat(C); end; t1 = toc;
tic; for K = 1 : 20; result2 = vertcat(C{:}); end; t2 = toc;
Let's look at the "preallocate and fill" approach as an alternative. Of course, this code assumes that each cell contains a matrix of the same size and all of them are double matrices.
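tic;
for k = 1:20
    % Preallocate the full result, then copy each cell into its block of rows.
    result3 = zeros(nrows*ncells, ncols);
    beforefirst = 0;   % number of rows already filled
    for whichcell = 1:ncells
        result3(beforefirst+(1:nrows), :) = C{whichcell};
        beforefirst = beforefirst + nrows;
    end
end
t3 = toc;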
format long g
[t1, t2, t3]
ans = 1×3
1.878186 0.88884 0.856936
Its time is pretty close to the vertcat approach. Do the three approaches give the same answer?
check = isequal(result1, result2, result3)
check = logical
1
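The preallocate-and-fill loop above relies on every cell holding a matrix of the same size. If the row counts differ (the column counts still have to match), one possible variant computes each block's row range with cumsum first. This is a minimal sketch on small made-up data; rowCounts, firstRow and lastRow are names chosen for illustration.

```matlab
% Hypothetical example data: cells with differing row counts, equal column counts.
rowCounts = [3; 5; 2];
ncols = 4;
C = arrayfun(@(n) rand(n, ncols), rowCounts, 'UniformOutput', false);

% Rows per cell, and the first/last output row each block occupies.
nrowsPerCell = cellfun(@(m) size(m, 1), C);
lastRow  = cumsum(nrowsPerCell);
firstRow = lastRow - nrowsPerCell + 1;

% Preallocate once, then copy each block into its slot.
result = zeros(lastRow(end), ncols);
for k = 1:numel(C)
    result(firstRow(k):lastRow(k), :) = C{k};
end

% Compare against the built-in concatenation.
sameAsVertcat = isequal(result, vertcat(C{:}));
```

Given the timings reported in this thread, there is no reason to expect this to beat vertcat(C{:}); it only shows how the same idea extends to unequal block heights.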
Konstantin on 3 Apr. 2025
Edited: Konstantin on 3 Apr. 2025
But what is "Merge_filesAcrossRanks"? It looks like a custom function of your own that reads many files in a given directory. If so (and if the size of all tables is guaranteed a priori), then I would recommend combining reading and merging: preallocate the whole final matrix (53*6300 rows tall, 33 columns wide), create a "current position (line)" variable, and then read the files into this matrix while advancing the "current position".
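A minimal sketch of that read-and-merge idea, under loud assumptions: the real Merge_filesAcrossRanks function and its file format are not shown in this thread, so the sketch writes a few small CSV files to a temporary folder itself and reads them back with readmatrix. The file names (rank1.csv and so on) and the sizes are purely hypothetical.

```matlab
% Hypothetical setup: write a few small files so the sketch is self-contained.
nfiles = 3; nrow = 4; ncol = 2;
folder = tempname;
mkdir(folder);
for K = 1:nfiles
    writematrix(K * ones(nrow, ncol), fullfile(folder, sprintf('rank%d.csv', K)));
end

% Preallocate the final matrix, then read each file straight into its rows.
merged = zeros(nfiles * nrow, ncol);
pos = 1;                                   % current position (first free row)
for K = 1:nfiles
    block = readmatrix(fullfile(folder, sprintf('rank%d.csv', K)));
    merged(pos:pos+nrow-1, :) = block;
    pos = pos + nrow;
end

rmdir(folder, 's');                        % remove the temporary files
```

In practice the file reads are likely to dominate the run time, so whether this beats collecting the blocks in a cell array and calling vertcat is something to measure rather than assume.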
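In the test below that approach looks slower, but note that the timer is not restarted before the second measurement, so t2 also includes the time already counted in t1.
nmat = 53;
nrow = 6300;
ncol = 33;
rng(12345)
tic
% Approach 1: collect the blocks in a cell array, then concatenate once.
C = cell(nmat, 1);
for K = 1 : nmat; C{K} = rand(nrow, ncol); end
R1 = vertcat(C{:});
t1 = toc;
rng(12345);
% Approach 2: preallocate the result and fill it block by block.
% There is no tic here, so t2 is measured from the tic above and includes t1.
R2 = zeros(nmat*nrow,ncol);
counter = 1;
for K = 1 : nmat
    R2(counter:counter+nrow-1,:) = rand(nrow, ncol);
    counter = counter + nrow;
end
t2 = toc;
format long g
[t1, t2]
ans = 1×2
0.159601 0.273057
Subtracting t1 from t2 leaves roughly 0.113 seconds for the preallocate-and-fill part alone, so these numbers do not actually show it to be slower.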


Answers (0)

Asked: 1 Apr. 2025

Commented: 3 Apr. 2025