Recommendation: Struct or Cell array for panel data?

fr_sk on 3 Jun 2016
Edited: Stephen23 on 3 Jun 2016
Hi, this issue seems to be one that people do not agree on easily.
I have hourly load profiles (about two years) for several thousand households. I want to work on the load data but still be able to identify the household no. (and possibly the hour) afterwards.
In terms of memory allocation and indexing accessibility, what would be the recommended way to store this data?
  • Cell arrays
  • Structs
  • Multidimensional arrays
Your opinion will be appreciated!

Accepted Answer

Stephen23 on 3 Jun 2016
Edited: Stephen23 on 3 Jun 2016
Rather than starting from a top-down decision, a good rule of thumb is to make this a bottom-up decision:
Use the simplest array class and arrangement that you reasonably can:
  1. numeric
  2. cell
  3. struct / table / ...
  4. something much more complex...
Choosing a simpler array also simplifies much of the data processing: we get many questions from beginners who want to find the maximum of scalar values stored in separate cells of a cell array. The solution is always to take the data out of the cell array, convert it to a numeric array, and use a built-in function. The point is that there was often no reason to store the data in a cell array in the first place, so all of those extra steps just slow the code down and make it more complicated. From this point of view, ND numeric arrays are the simplest to index and access (if the data consists of arrays of numeric data).
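To illustrate the beginner scenario described above, here is a minimal sketch. The values and variable names are made up for illustration only; `cell2mat` and `max` are standard MATLAB functions.

```matlab
% Hypothetical example data: one scalar stored per cell.
c = {3, 7, 2, 9, 4};

% To use a built-in numeric function, the data first has to be
% taken out of the cell array and converted to a numeric array.
v = cell2mat(c);        % 1x5 double: [3 7 2 9 4]
[m, idx] = max(v);      % largest value and its position

% Storing the same values numerically from the start makes the
% conversion step unnecessary.
w = [3 7 2 9 4];
m2 = max(w);
```

Both routes give the same maximum; the numeric version simply skips the detour through the cell array.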
Of course, if something in your data does not allow you to use a simpler data class (e.g. the data are not simple ND arrays, or are arrays of different sizes or different classes, etc.), then by all means use a more complex storage type: a cell array could be a perfectly good storage medium. That is why these more complex types exist: sometimes they are the easiest way to represent data. But clearly there is no point in using a sparse array if a normal array does the job!
So, keeping this in mind, if you are planning on doing any kind of numeric processing over the data (i.e. between cases/tests) then using ND numeric arrays is likely to be much simpler. If you simply need to loop and repeat processing on each case independently then perhaps a cell would work... or structure for readability, or ... whatever suits your data's own arrangement.
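As a rough sketch of what processing across cases can look like with a plain numeric matrix, assume one row per hour and one column per household. This layout, the tiny example values, the identifiers in `householdID`, and all variable names are assumptions made for illustration, not part of the original question.

```matlab
% Hypothetical load data in kW: 4 hourly readings (rows) for
% 3 households (columns). Real data would be far larger.
loadKW = [0.5 1.2 0.8;
          0.7 1.0 0.9;
          1.1 0.4 1.5;
          0.6 0.9 0.7];

% Hypothetical household numbers, one per column of loadKW.
householdID = [1017 2044 3090];

% Total load over all households for each hour (sum along dimension 2).
totalPerHour = sum(loadKW, 2);           % 4x1 column vector

% Peak load of each household and the hour (row index) where it occurs.
[peakKW, peakHour] = max(loadKW, [], 1); % both 1x3 row vectors

% The column index links a result back to the household number.
[~, k] = max(peakKW);
topHousehold = householdID(k);
```

Because the household is identified by its column index and the hour by its row index, no per-household loop or cell indexing is needed for this kind of calculation.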
A good decision depends on your data, what you plan to do with it, and what is most robust and readable to code.
  2 Comments
fr_sk on 3 Jun 2016
Thank you for the comprehensive answer. I will use numeric arrays then. So I guess I will put the actual data in a two-dimensional array and then add the metadata (household no.) as a third dimension (if that makes sense at all).
Stephen23 on 3 Jun 2016
Edited: Stephen23 on 3 Jun 2016
@fr_sk: that makes perfect sense! Although it can be a bit daunting at first, using the dimensions of numeric arrays is a really efficient way to handle data in MATLAB. Many built-in numeric functions have an option to select the dimension they operate along, so make sure to read the documentation.
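A small sketch of the dimension argument mentioned above, using a 3D array. Reading the dimensions as hour-of-day x day x household is only an assumption for this example, and the values are placeholder integers rather than real load data.

```matlab
% Placeholder 2x3x4 array: think of it as 2 hours x 3 days x 4 households
% (an assumed layout, chosen only to keep the example small).
A = reshape(1:24, 2, 3, 4);

% Sum over the first dimension (hours): result is 1x3x4.
sumOverHours = sum(A, 1);

% Mean over the second dimension (days): result is 2x1x4.
% squeeze removes the singleton dimension, giving a 2x4 matrix
% with one column per household.
dailyMean = squeeze(mean(A, 2));

% Maximum over the third dimension (households): result is 2x3.
% Note the empty second argument that max requires before the dimension.
maxAcross = max(A, [], 3);
```

The same pattern (a trailing dimension argument) applies to many other functions such as `min`, `median`, `std` and `cumsum`; the documentation of each function lists the exact syntax.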
