Create a single average day out of hourly data (33days)?

4 Ansichten (letzte 30 Tage)
Martin Muehlegger
Martin Muehlegger am 29 Feb. 2020
Kommentiert: dpb am 9 Mai 2021
How am I able to sum my hourly data to create one "typical/average" day (diurnal cycle) out of 33 days.
I'm using 2017b and stored my data in a timetable using retime() to story hourly means
First row is date second is hourly mean temp.
'07-Mar-2019 11:00:00' 16
'07-Mar-2019 12:00:00' 14
'07-Mar-2019 13:00:00' 12
'07-Mar-2019 14:00:00' 19
'07-Mar-2019 15:00:00' 23
'07-Mar-2019 16:00:00' 14
'07-Mar-2019 17:00:00' 15
'07-Mar-2019 18:00:00' 18
'07-Mar-2019 19:00:00' 17
'07-Mar-2019 20:00:00' 13
'07-Mar-2019 21:00:00' 14
'07-Mar-2019 22:00:00' 14
'07-Mar-2019 23:00:00' 15
'08-Mar-2019 00:00:00' 16
'08-Mar-2019 01:00:00' 15
'08-Mar-2019 02:00:00' 13
'08-Mar-2019 03:00:00' 12
'08-Mar-2019 04:00:00' 16
'08-Mar-2019 05:00:00' 12
'08-Mar-2019 06:00:00' 12
'08-Mar-2019 07:00:00' 12
Matlab 2019 has groupsummary() which might do my job....

Akzeptierte Antwort

dpb
dpb am 29 Feb. 2020
Bearbeitet: dpb am 2 Mär. 2020
"First row is date second is hourly mean temp."
That would be column, not row, as just a nit.
Yes, either groupsummary or varfun will do the job, but you need to create the grouping variable to use them...beginning with a timetable, tt that looks like the above, then
>> tt.Day=day(tt.Date);
>> varfun(@mean,tt,'groupingvariables','Day')
ans =
2×3 timetable
Date Day GroupCount mean_Temp
____________________ ___ __________ _________
07-Mar-2019 11:00:00 7 13 15.692
08-Mar-2019 00:00:00 8 8 13.5
>> groupsummary(tt,'Day','mean')
ans =
2×3 table
Day GroupCount mean_Temp
___ __________ _________
7 13 15.692
8 8 13.5
>> [g,ig]=findgroups(tt.Day);
>> splitapply(@mean,tt.Temp,g)
ans =
15.6923
13.5000
>> table(tt.Date(ig),splitapply(@mean,tt.Temp,g))
ans =
2×2 table
Var1 Var2
____________________ ______
07-Mar-2019 17:00:00 15.692
07-Mar-2019 18:00:00 13.5
>>
are various routes to the end...
>> retime(timetable(tt.Date(ig),splitapply(@mean,tt.Temp,g),'VariableNames',{'MeanTemp'}),'daily','nearest')
ans =
2×1 timetable
Time MeanTemp
____________________ ________
07-Mar-2019 00:00:00 15.692
08-Mar-2019 00:00:00 13.5
>>
also gets rid of the odd beginning hour for each day if your data aren't all collected beginning at 00:00 each day.
You could add some error checking to ensure that don't include days that are just nightime or daytime measurements so those would bias results where there is some extreme from day to night.
ERRATUM:
Average over hours, not days as misread original intent:
tt.Hour=hour(tt.Time);
>> varfun(@mean,tt,"InputVariables",'Temp',"GroupingVariables",'Hour')
ans =
21×3 timetable
Time Hour GroupCount mean_Temp
____________________ ____ __________ _________
08-Mar-2019 00:00:00 0 1 16
08-Mar-2019 01:00:00 1 1 15
08-Mar-2019 02:00:00 2 1 13
08-Mar-2019 03:00:00 3 1 12
08-Mar-2019 04:00:00 4 1 16
08-Mar-2019 05:00:00 5 1 12
08-Mar-2019 06:00:00 6 1 12
08-Mar-2019 07:00:00 7 1 12
07-Mar-2019 11:00:00 11 1 16
07-Mar-2019 12:00:00 12 1 14
07-Mar-2019 13:00:00 13 1 12
07-Mar-2019 14:00:00 14 1 19
07-Mar-2019 15:00:00 15 1 23
07-Mar-2019 16:00:00 16 1 14
07-Mar-2019 17:00:00 17 1 15
07-Mar-2019 18:00:00 18 1 18
07-Mar-2019 19:00:00 19 1 17
07-Mar-2019 20:00:00 20 1 13
07-Mar-2019 21:00:00 21 1 14
07-Mar-2019 22:00:00 22 1 14
07-Mar-2019 23:00:00 23 1 15
>>
  3 Kommentare
dpb
dpb am 2 Mär. 2020
I thought it was the day average you wanted over the year and there are only two days in the dataset.
Use hour() instead of day() for grouping variable but you don't have enough data in the sample to average (which is part of why I misread the request even though it does indeed say dirurnal; my equivalent of row vs column! :)
Martin Muehlegger
Martin Muehlegger am 3 Mär. 2020
Perfekt that looks good. THANKS

Melden Sie sich an, um zu kommentieren.

Weitere Antworten (1)

Paul MAUGER
Paul MAUGER am 8 Mai 2021
Bearbeitet: Paul MAUGER am 9 Mai 2021
Ia ora na,
As I've got a similar question I suppose it's better if I post it here :
I've got almost 5 years of hourly solar GHI data and I want to create a " (hourly) typical year ", i.e., I want to average for each hour, of each day in any year, the GHI value (see attached timetable : MUS_TT.txt). For example, if I got 4 values of GHI for February, 8th, at 9:00 (within years 2017, 2018, 2019 and 2020) then I want to average these 4 values.
At the end I should get a 365*24 = 8760 rows timetable (if there's not too much holes).
Note that I don't care about Feb. 29th of leap years 2016 and 2020 as the purpose is a typical year average and most years have 365 days.
Thanks for your help.
  11 Kommentare
dpb
dpb am 9 Mai 2021
Example...
>> tt.Date=linspace(tt.Date(1),datetime(2016,12,31),24).'; % build a leap year table
>> tt.DOY=tt.Date; tt.DOY.Format='DDD'; % and DOY grouping var
>> tail(tt) % get 366
ans
8×2 table
Date DOY
_________________ ___
11-Sep-2016 03:26 255
26-Sep-2016 23:31 270
12-Oct-2016 19:36 286
28-Oct-2016 15:41 302
13-Nov-2016 11:45 318
29-Nov-2016 07:50 334
15-Dec-2016 03:55 350
31-Dec-2016 00:00 366
>> tt.Date=tt.Date+years(1); % add a year...now it's 2017
>> tt.DOY=tt.Date; tt.DOY.Format='DDD'; % redo DOY
tail(tt)
8×2 table
Date DOY
_________________ ___
11-Sep-2017 09:15 254
27-Sep-2017 05:20 270
13-Oct-2017 01:25 286
28-Oct-2017 21:30 301
13-Nov-2017 17:34 317
29-Nov-2017 13:39 333
15-Dec-2017 09:44 349
31-Dec-2017 05:49 365
>>
and now there are only 365 days since it isn't a leapyear any longer.
Alternatively, if you didn't want to just throw the data away, I wasn't sure what datetime would do internally, but experimentation shows--
t =
datetime
29-Feb-2016 14:36:31
>> t=t+years(1)
t =
datetime
28-Feb-2017 20:25:43
>>
Adding a year to a leapyear Feb 29 day just converts it to the last day of the month for the next year. This would end up giving you two Feb 28th for the two leap years to be averaged in to the means for the 59th DOY. Your choice as to which way you would want to go, looks like.
dpb
dpb am 9 Mai 2021
OH! Again...wasn't thinking -- you want to use calyears() here, not years() --
>> t+days(1)
ans =
datetime
29-Feb-2016 20:25:43
>> ans+calyears(1)
ans =
datetime
28-Feb-2017 20:25:43
>>
It just rolls the time field over and maxes-out the number of days/month so 29-Feb --> 28-Feb in the non-leapyear year.
This is "free" and doesn't require you to throw the 29th day of data away entirely unless you choose to do so.

Melden Sie sich an, um zu kommentieren.

Kategorien

Mehr zu Dates and Time finden Sie in Help Center und File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by