Reducing a cell array of tables to a single table
Ingo Marquart
on 25 Jun 2019
Edited: Stephen23
on 23 Feb 2022
I am using a one-dimensional cell array to save a set of tables.
This is necessary because the main part of the program uses a parfor loop in which the i-th iteration produces a table of results, and the outputs must be generated in parallel. I would like to store everything in one table, and the order must be preserved. Since parfor restricts indexing, the best approach I have found is to fill such a cell array and loop through it afterwards. Because each table corresponds to a single index, MATLAB happily accepts this indexing in the parallel loop.
Each iteration returns a table with maxT rows and a number of columns that I determine dynamically. I then preallocate the table mainTable and loop over my cell array to fill it. To set the correct indices, I use a vector called asdf, which tells me which rows of mainTable belong to a given iteration i (there are other ways to do this; this one just came out of trying to make parfor work). If that seems confusing, just think of me looping through the cell array and appending the table in cell i to mainTable.
The issue now is that the second loop is rather slow because it is not parallelized. The main work happens in the first parfor loop, so the current solution is still better than running without parfor, but I would very much like to make the reduction to a single table fast.
Even though I know the position of each table within mainTable (see the variable asdf), I cannot index with such slices in a parfor loop. The code below, which does this without parfor, works.
Some things that do not work:
cell2table(resultCell) gives a table of tables. No join or union on this is successful.
resultCell{:} theoretically gives a list of all tables, but [resultCell{:}] throws an error because the column names are duplicated. Otherwise only the first table is extracted.
I did not find a way to parallelize the assignment to mainTable, because I always need to slice from a starting point to an ending point.
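The horzcat failure and the "only the first table" behaviour can be reproduced on toy data. This is a minimal sketch; the two small tables and their variable names t and y are made up for illustration and are not the tables from the actual program.

```matlab
% Two toy result tables with identical variable names (hypothetical data).
resultCell = cell(1, 2);
resultCell{1} = table((1:3)', [10; 20; 30], 'VariableNames', {'t', 'y'});
resultCell{2} = table((1:3)', [40; 50; 60], 'VariableNames', {'t', 'y'});

% [resultCell{:}] is horzcat(resultCell{:}): it tries to place the tables
% side by side, which fails because t and y would each appear twice.
horzcatFailed = false;
try
    wide = [resultCell{:}]; %#ok<NASGU>
catch
    horzcatFailed = true;
end

% Assigning the comma-separated list to a single output keeps only the
% first element of the list, i.e. the first table.
firstOnly = resultCell{:};
```

Both observations follow from resultCell{:} expanding into a comma-separated list of tables; what happens next depends entirely on the function that receives that list.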
Any ideas?
parfor i=1:NrSims
    %% Do something
    % resultTable is a table with maxT rows
    resultCell{i}=resultTable;
end
%% Create main table
% Here I preallocate mainTable etc.
(...)
% Next, I create this index vector which allows me to slice mainTable for each i
asdf=kron(1:NrSims,ones(maxT,1)')';
for i=1:NrSims
    slice=(asdf==i);
    mainTable(slice, :) = resultCell{i};
end
4 Comments
Guillaume
on 25 Jun 2019
Edited: Guillaume
on 25 Jun 2019
I sincerely hope you're not actually naming your variable asdf! Giving variables a meaningful name (such as tableorder in this case) is the first step of documenting code.
As for your question, it seems you need to understand what cellarray{:} does, and thus the difference between [cellarray{:}] (i.e. horzcat(cellarray{:})) and vertcat(cellarray{:}). See Stephen's answer.
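To illustrate the vertcat side of that difference, here is a small sketch on toy data. The table contents and the variable names sim and t are assumptions made for the example, and this is not a reproduction of Stephen's answer.

```matlab
% Build a cell array of toy tables, one per simulation (hypothetical data).
NrSims = 3;
maxT = 4;
resultCell = cell(NrSims, 1);
for i = 1:NrSims
    resultCell{i} = table(repmat(i, maxT, 1), (1:maxT)', ...
        'VariableNames', {'sim', 't'});
end

% vertcat stacks tables with matching variable names row-wise,
% in the order in which they appear in the cell array.
mainTable = vertcat(resultCell{:});

% Reference result: append the tables one at a time in a loop.
loopTable = resultCell{1};
for i = 2:NrSims
    loopTable = [loopTable; resultCell{i}]; %#ok<AGROW>
end
sameResult = isequal(mainTable, loopTable);
```

Because the stacking follows cell order, the rows of iteration i end up in the block that an index vector such as asdf would select, so no explicit row mask is needed.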
Note that:
asdf=kron(1:NrSims,ones(maxT,1)')';
is more simply:
asdf = repelem(1:NrSims, maxT)'; % which is a lot clearer as to the intent
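A quick check on small, made-up sizes shows that the two expressions produce the same index vector:

```matlab
% Small example sizes chosen only for illustration.
NrSims = 3;
maxT = 2;

viaKron = kron(1:NrSims, ones(maxT, 1)')';   % original expression
viaRepelem = repelem(1:NrSims, maxT)';       % simpler equivalent

% Both are column vectors in which each simulation index repeats maxT times.
sameVector = isequal(viaKron, viaRepelem);

% Rows of the combined table that belong to simulation 2.
rowsOfSim2 = find(viaRepelem == 2);
```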
Accepted Answer
Stephen23
on 25 Jun 2019
Edited: Stephen23
on 23 Feb 2022
2 Comments
David Kelly
on 18 Aug 2020
Stephen,
I just wanted to say thanks for contributing and for solving so many of these questions on MATLAB Answers.
The number of times you have saved me is unbelievable!
Cheers
David
More Answers (1)
Campion Loong
on 27 Jun 2019
Hi Ingo,
Glad you've found a solution. In case it may be useful in your workflow, I'd like to mention the various datastores available to you in base MATLAB.
parfor support is built in via partition, so you don't need to explicitly manage the chunking and re-merging. It also lets you scale out to other resources such as clusters more easily.
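As a rough sketch of what that could look like, assuming the per-simulation results have been written to CSV files: the temporary folder, the file names, and the variable names t and sim below are all made up for the example, and tabularTextDatastore is just one of the available datastore types.

```matlab
% Write a few toy result files to a temporary folder (hypothetical data).
folder = tempname;
mkdir(folder);
for k = 1:3
    T = table((1:4)', k * ones(4, 1), 'VariableNames', {'t', 'sim'});
    writetable(T, fullfile(folder, sprintf('sim_%02d.csv', k)));
end

% One datastore over all files in the folder.
ds = tabularTextDatastore(folder);

% Let the datastore decide how many partitions it supports, then read
% each partition in its own parfor iteration.
n = numpartitions(ds);
parts = cell(n, 1);
parfor p = 1:n
    sub = partition(ds, n, p);   % p-th of n partitions
    parts{p} = readall(sub);     % table holding that partition's rows
end

% Merge the partition tables into one table.
combined = vertcat(parts{:});
```

Without Parallel Computing Toolbox the parfor loop simply runs serially, so the sketch still works as a plain loop.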
Hope this helps.