Table: group by datetime

22 Ansichten (letzte 30 Tage)
Arjan
Arjan am 11 Jun. 2015
Kommentiert: Peter Perkins am 16 Jul. 2015
Hi all,
I have defined a table and I now want to group by a column of type 'datetime'. I tried varfun, but this function doesn't work for this kind. Can someone help? Thanks a lot! So table looks like:
[TYPE Datetime VALUE]
And the result should be the value per datetime, summed by TYPE.
  4 Kommentare
Arjan
Arjan am 12 Jun. 2015
Hi Peter, Of course. I have defined a table in the following way and I now want to aggregate the rows, grouped by DT (=date time), i.e. I want to get the total value per hour of the day. Hopefully I made it clear.
if true
T_Flex = readtable('TEXTFILE.csv','Delimiter',';','Format',formatSpec,'ReadVariableNames',false);
T_Flex.Properties.VariableNames{1}='TYPE';%type of flex
T_Flex.Properties.VariableNames{2}='Date';%date
T_Flex.Properties.VariableNames{3}='HOUR'; %period of the day T_Flex.Properties.VariableNames{4}='VALUE';%flex
DT = table(datetime(year(T_Flex.Date),month(T_Flex.Date),day(T_Flex.Date),0,(T_Flex.PTU-1)*15,0));%date+period of day
T_Flex=[T_Flex DT];
end
Peter Perkins
Peter Perkins am 16 Jul. 2015
Arjan, I don't exactly understand your table, but I think your question boils down to, "I want to use varfun, with a grouping variable that's a datetime."
In R2014b and R2015a, varfun does not allow that. One work-around would be to temporarily convert the datetime to a numeric representation (for example, the number of seconds since some reference time). Then do the grouped calculation, and convert the grouping variable's unique values back to datetime:
>> d = datetime(2015,1,1) + caldays(randi(2,5,1));
>> x = randn(size(d));
>> t = table(d,x)
t =
d x
___________ ________
02-Jan-2015 0.30352
03-Jan-2015 -0.60033
02-Jan-2015 0.48997
03-Jan-2015 0.73936
02-Jan-2015 1.7119
>> d0 = datetime(2015,1,1);
>> t.d = seconds(t.d - d0)
t =
d x
_________ ________
86400 0.30352
1.728e+05 -0.60033
86400 0.48997
1.728e+05 0.73936
86400 1.7119
>> gt = varfun(@mean,t,'groupingVariable','d')
gt =
d GroupCount mean_x
_________ __________ ________
86400 86400 3 0.83512
172800 1.728e+05 2 0.069518
>> gt.d = d0 + seconds(gt.d)
gt =
d GroupCount mean_x
____________________ __________ ________
86400 02-Jan-2015 00:00:00 3 0.83512
172800 03-Jan-2015 00:00:00 2 0.069518

Melden Sie sich an, um zu kommentieren.

Akzeptierte Antwort

Michael Rice
Michael Rice am 12 Jun. 2015
It isn't possible. To illustrate, I'll sum by matching date (instead of hour) using datetime versus datenum.
VALUE=[2;3;4];
dt = datetime(2015,[1;2;1],1);
tbl_dt=table(dt,VALUE,'VariableNames',{'dt','VALUE'});
tbl_dt=varfun(@nansum,tbl_dt,'GroupingVariables',{'dt'});
MATLAB returns
Error using table/varfun (line 154)
A grouping variable must be a categorical, numeric, or logical vector, a cell vector of strings, or a 2D character array.
Now use datenum instead:
dn = datenum(2015,[1;2;1],1);
tbl_dn=table(dn,VALUE,'VariableNames',{'dn','VALUE'});
tbl_dn=varfun(@nansum,tbl_dn,'GroupingVariables',{'dn'});
Because datenum is an integer, MATLAB has no problem:
tbl_dn =
dn GroupCount nansum_VALUE
__________ __________ ____________
735965 7.3597e+05 2 6
735996 7.36e+05 1 3
If you have table variables with dates or times and need to do grouping or summary functions (i.e. varfun or grpstats), this makes datetime pretty much useless. You could work around it by casting the value from datetime to datenum around each operation. Perhaps there is a forthcoming modification to varfun that would solve this limitation.

Weitere Antworten (0)

Kategorien

Mehr zu Tables finden Sie in Help Center und File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by