Comparing datetime value from two tables

12 Ansichten (letzte 30 Tage)
Ziye
Ziye am 11 Jul. 2017
Kommentiert: Ziye am 3 Aug. 2017
I have two tables a and b both containing time and price data. They are imported from csv files and therefore become two tables.
Both contain data of similar format but the datetime data in a is in minutes eg "11-07-17 09:00" and in b it's in dates eg "11-07-17".
I want to compare the price data given a date, something like "select * from b where b(:,1)=date(a(1,1))"
I'm extremely confused in trying to convert table into other format in order to compare them, I felt that there must be a very easy way to do it which i'm not aware of. someone pls help thanks so much!
  1 Kommentar
Ziye
Ziye am 11 Jul. 2017
Thanks! ya I should have attached the files. Pls refer to the attached file

Melden Sie sich an, um zu kommentieren.

Akzeptierte Antwort

Guillaume
Guillaume am 11 Jul. 2017
Bearbeitet: Guillaume am 11 Jul. 2017
A datetime always has an hour, minute, second component. The display format of the datetime array may be such that it is not shown, but it is always there. If you've read that array from a text file which didn't have a time component, then it defaults to 00:00:00.
You have several options to compare datetime arrays using just the date
  • You could set the time component in a to 00:00:00 and use standard comparison functions (e.g. ismember, or join)
  • You could extract the year, month and day from each with ymd and do array comparison (e.g. with ismember with the 'rows' option)
  • You could extract the date as strings (using char or datestr) and compare the strings.
Going with option 2, you could do:
ymda = zeros(height(a), 3);
[ymda(:, 1), ymda(:, 2), ymda(:, 3)] = ymd(a.Datetime);
ymdb = zeros(height(b), 3);
[ymdb(:, 1), ymdb(:, 2), ymdb(:, 3)] = ymd(b.Date);
queryresult = b(ismember(ymdb, ymda, 'rows'), :); %equivalent to select * from b where b.date = a.date
  3 Kommentare
Guillaume
Guillaume am 11 Jul. 2017
I wouldn't convert the tables to cell arrays. Tables are much easier to work with.
if you just use plain readtable it's not the month and day that is swapped but the day and year. That's because your year is encoded with only two digits. That confuses matlab so you need to tell it how to decode the date properly:
optionsa = setvaropts(detectImportOptions('a.csv'), 'Datetime', 'InputFormat', 'dd-MM-yy HH:mm');
a = readtable('a.csv', optionsa);
optionsb = setvaropts(detectImportOptions('b.csv'), 'Date', 'InputFormat', 'dd-MM-yy');
b = readtable('b.csv', optionsb);
%change datetime display format to something less ambiguous (optional)
a.Datetime.Format = 'dd MMM yyyy HH:mm';
b.Date.Format = 'dd MMM yyyy';
With that my code above works fine.
Ziye
Ziye am 12 Jul. 2017
Thanks Guillaume. It's a nice solution. Too bad i'm using an older version of matlab but I get the ideas.
I've been quite confused with these datatypes, when I need to work with multiple datasets containing datetime, double and strings you think it's better that I put them in tables?

Melden Sie sich an, um zu kommentieren.

Weitere Antworten (2)

Peter Perkins
Peter Perkins am 11 Jul. 2017
One of your data files contains daily data with holes, the other contains irregular data of much higher frequency. The only two ways I can make sense of your question is that you want to either
1) braodcast out the daily data to the higher frequency data, or 2) somehow summarize the higher frequency data to a daily value and match them up
If you have access to MATLAB R2016b or later, both are easy to do with two timetables and the synchronize method. If tt1 has "pure dates" as it's row times, and tt2 has dates+times as its row times, then something like
tt = synchronize(tt1,tt2,'first','mean')
says, at the times defined by the first input, take the mean of the values in the second input between each of those times. The output is a timetable that has your prices as its two variables.
tt = synchronize(tt1,tt2,'last','previous')
says, at the times defined by the second input, find the value in the first input immediately before each of those times. This does what is in effect a join on the date.
There are subtleties you probably need to account for due to the holes in your daily data, but that's the general idea.
Prior to R2016b, you should be able to do more or less the same things by first creating a Date variable in your high frequency data using dateshift. To get a daily series,
varfun(@mean,tt2,'GroupingVariable','Date)
on the high frequency data, and then use the join function to combine that result with the daily data.
To get a high frequency series, just use the join function.
Hope this helps.

Steven Lord
Steven Lord am 11 Jul. 2017
If you store your data in a timetable you could index into rows in a specific range of time using a timerange.

Tags

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by