How to use groupsummary to group by hour-of-year

The "groupsummary" function offers a convenient way to group variables in a table and apply functions such as computing the mean.
For example, if I want to compute the day-of-year mean of a variable named "foo" in a table "myTable" containing daily data with timestamp "date", I would use the command:
myDoyTable = groupsummary(myTable, "date","dayofyear","mean","foo");
This is great for daily data, but I am stumped in the case of hourly data.
Suppose I want to do a similar thing, but now myTable contains hourly data and date has an hourly timestamp.
Ideally I would want to use a command such as:
myDoyTable = groupsummary(myTable, "date","hourofyear","mean","foo");
Unfortunately, there is no "hourofyear" option (though there is hour of day, minute of hour, and many more).
Can anyone suggest a workaround?
Thanks,
-Dino

Antworten (2)

the cyclist
the cyclist am 30 Jun. 2023

0 Stimmen

The groupbin option "hour" doesn't do what you want?
(If not, can you upload a sample of input and output data indicating what you want? I'm having a difficult time understanding.)

4 Kommentare

Thank you for your response, let me try again:
Given a table "myTable" with 20 years of hourly data (each entry with a time stamp "time" like 01-Jan-2014 01:00:00, and a variable "foo" like 0.00425), the command:
doyTable = groupsummary(myTable, "time","dayofyear","mean","foo");
produces a table "doyTable" with 366 rows (one for each day of the year, including a bin for leap years) that contain the mean of "foo" for day 1 across the 20 data years, day 2 across the 20 data years, and so forth.
Similarly, the command:
hodTable = groupsummary(myTable, "time","hourofday","mean","foo");
produces a table "hodTable" with 24 rows (one for each hour of the day) that contain the mean of "foo" for each hour of the day across the 20 data years.
What I would like is the equivalent of a command:
hoyTable = groupsummary(myTable, "time","hourofyear","mean","foo");
This, in my ideal world, would produce a table that has 366 * 24 rows, meaning one row for each hour of the year (accounting for leap year bins). Each row would in this case contain the mean of foo for each specific hour of the year, across the 20 years of data. The entries would thus be the mean of 20 values for each hour of a year (except for the leap year bins, which would only have a mean of 5 values each).
In simpler words, I'm after a grouping "hourofyear" that would allow me to compute the mean of a variable on each specific hour of the year across the number of years of data available.
I hope that was more clear.
Thanks,
-d.
I think I understand. I can't test this idea on your data, because you have not uploaded it, but I believe that the following would do what you want.
First, convert your table (or make a copy) to change all of the years to be the same. (Choose a leap year, so that it has a Feb 29.)
Then do groupsummary with groupbin = "hour" on that table. MATLAB doesn't "know" that these are different years, so it includes each hour of each day (regardless of year) in the summary.
Thanks, that's clever! I will give it a try and let you know..
Here is the data, BTW, apologies for not uploading it earlier. I'm interested in the hour of year mean for column AP, namely "Fch4_f_1_1_1". I had to chop off the data file after a few years, due to size limitations. Feel free to take it for a spin, cyclist! Thanks again, -d.

Melden Sie sich an, um zu kommentieren.

Is this really "hour of year", a value that runs from 0 to 8784? In the same way that day of year is just
dt = datetime(2023,7,17,14,15,0)
dt = datetime
17-Jul-2023 14:15:00
doy = between(dateshift(dt,"start","year"),dt,"days") % or maybe - caldays(1)
doy = calendarDuration
197d
hour of year is just
hoy = between(dateshift(dt,"start","year"),dt,"time")
hoy = calendarDuration
4742h 15m 0s
hoy = floor(time(hoy),"hours"); hoy.Format = 'h'
hoy = duration
4742 hr
If you add an HoY grouping variable to your table, groupsummary can group by that.

Kategorien

Mehr zu Data Type Identification finden Sie in Hilfe-Center und File Exchange

Produkte

Version

R2023a

Gefragt:

am 30 Jun. 2023

Beantwortet:

am 17 Jul. 2023

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by