How can we convert a datetime into a string that represents a Unix timestamp in nanoseconds?

89 Ansichten (letzte 30 Tage)
I am trying to use Matlab to generate a string which contains a Unix timestamp in nanoseconds (i.e number of nanoseconds since 01-Jan-1970 00:00:00, the Unix epoch) from an input date string.
For example, if my input is only 1 ns after the start of the epoch, then the following code works:
t0 = datetime('01-Jan-1970 00:00:00.000000000','Format','dd-MMM-yyyy HH:mm:ss.SSSSSSSSS');
t1 = datetime('01-Jan-1970 00:00:00.000000001','Format','dd-MMM-yyyy HH:mm:ss.SSSSSSSSS');
dt_ns = seconds(t1 - t0)*1e9
dt_ns_string = sprintf('%.0f',dt_ns)
% Output:
dt_ns_string =
'1'
and I have the nanosecond precision that I need.
However, for later dates this does not work. For example, if I instead for t1 use a date around today:
t1 = datetime('16-Jun-2020 00:00:00.000000001','Format','dd-MMM-yyyy HH:mm:ss.SSSSSSSSS');
then the output is the following:
dt_ns_string =
'1592265600000000000'
and I have lost the final nanosecond precision on the end of the string (final character should be a "1").
I believe this may be due to working with double types, and I might need to use uint64, but I can't figure out how to make the change.
How can I solve this?

Akzeptierte Antwort

Stephen23
Stephen23 am 17 Jun. 2020
Bearbeitet: Stephen23 am 20 Jun. 2020
Warning: this answer delves into undocumented features of the datetime object and relies on my own wild speculation that may be completely incorrect. Use only at your own risk!
Lets start by defining those datetime objects:
>> t0 = datetime('01-Jan-1970 00:00:00.000000000','Format','dd-MMM-yyyy HH:mm:ss.SSSSSSSSS')
t0 =
01-Jan-1970 00:00:00.000000000
>> t1 = datetime('01-Jan-1970 00:00:00.000000001','Format','dd-MMM-yyyy HH:mm:ss.SSSSSSSSS')
t1 =
01-Jan-1970 00:00:00.000000001
>> t2 = datetime('16-Jun-2020 00:00:00.000000001','Format','dd-MMM-yyyy HH:mm:ss.SSSSSSSSS')
t2 =
16-Jun-2020 00:00:00.000000001
However, while the difference t1-t0 gives the expected nanosecond output, the difference t2-t0 does not:
>> d = t1-t0; % so far so good!
>> d.Format = 'dd:hh:mm:ss.SSSSSSSSS'
d =
00:00:00.000000001
>> d = t2-t0; % where did the nanoseconds go to?
>> d.Format = 'dd:hh:mm:ss.SSSSSSSSS'
d =
18429:00:00:00.000000000
Clearly we have to fix whatever it is before getting to this difference, as the duration object d has already lost this information (which we can also confirm by opening the overloaded minus function and observing that the operations are marked for millisecond precision). But the nanosecond information is apparently stored in the datetime objects (after all we can see it displayed), so perhaps we can get it out somehow? The answer is yes, but we first need to convert the objects into something we can investigate easily:
warning off
s0 = struct(t0);
s1 = struct(t1);
s2 = struct(t2);
warning on
Apparently datetime objects store the time in milliseconds since 1st January 1970, as a floating point number:
>> s0.data % real() = 0 milliseconds since epoch
ans = 0 + 0i
>> s1.data % real() = 0.000001 milliseconds since epoch
ans = 1e-06 + 0i
This allows a huge range of date values, much larger than can be supported with a simple integer class. But the floating point has a limited precision, which is compensated for using the imaginary part of that number:
>> s2.data % real() = 1592265600000 milliseconds since epoch, imag() = 0.000001 milliseconds
ans = 1592265600000 + 1e-06i
Note the precision limit of double floating point is around the microseconds for dates around 2020:
>> eps(1592265600000) % milliseconds since epoch
ans = 0.000244140625
so there is no way one double floating point number by itself could count the milliseconds since 1st January 1970 and also have nanosecond precision for a date in 2020. It simply isn't possible. But by storing that compensation value in the imaginary part, datetime can effectively store a higher precision.
Can we use this? Perhaps... lets try converting those millisecond values to nanoseconds stored in a 64bit unsigned integer (the 1e6 factor converts millisecond -> nanosecond).
>> u0 = uint64(fix(real(s0.data)*1e6)); % should be zero!
>> u0 = u0 + imag(s0.data)*1e6
u0 = 0
>> u1 = uint64(fix(real(s1.data)*1e6)); % one nanosecond since epoch
>> u1 = u1 + imag(s1.data)*1e6
u1 = 1
>> u2 = uint64(fix(real(s2.data)*1e6));
>> u2 = u2 + imag(s2.data)*1e6
u2 = 1592265600000000001
and there is your uint64 value complete with nanosecond at the end :)
I guessed that fix is probably more appropriate than implicit rounding, but some experimentation is probably required. I also found some examples where datetime apparently doesn't store nanosecond precision, so if you really require nanoseconds since that epoch your best bet is probably just to count them yourself.
Oh, and the final part of your question:
>> out = sprintf('%u',u2)
out =
1592265600000000001
  3 Kommentare
Stephen23
Stephen23 am 20 Jun. 2020
Bearbeitet: Stephen23 am 20 Jun. 2020
"I am surprised there isn't an easier way to achieve this"
I was also wondering about the data design. My guess is that such a conversion is inherently lossy for many dates: whilst binary floating point can store an extremely wide range of dates at quite a reasonable relative precision, there is no single conversion to convert that entire floating point range into one integer number at your requested absolute precision.
For example using 64bit unsigned integer to count nanoseconds since 1st January 1970:
  • dates before that epoch cannot be represented, and
  • limited to slightly less than 585 years.
If we use signed integer then that allows earlier dates, but at the cost of a lower end date. So any conversion to integer is limited in its range OR precision, either way the user is faced with losing some dates that can be represented as floating point. Providing such a conversion to integer might be possible in theory, but I bet the first thing users would do is ask why it cannot represent dates after X, or why (given a wider range) it cannot represent times with less than Y precision. And given that many MATLAB users are scientists with data covering hundreds (thousands?) of years, this range is important.
It would certainly be possible using two integers, but I doubt many users would expect that either.
Peter Perkins
Peter Perkins am 26 Jun. 2020
Stephen, without getting too much into the details of the internals of datetime, your observation that MATLAB users have a very wide range of use cases for time is spot on. Cosmology, astrology (yes), high-frequency trading. So unlike many other time packages in other languages, there is no practical limit on datetime's range, and it retains enough precision even at extreme values for any need that I can think of:
>> datetime(14e9,1,1,0,0,1:2)
ans =
1×2 datetime array
1.4000e+10 CE 1.4000e+10 CE
>> diff(ans)
ans =
duration
00:00:01
And of course datetime supports missing data, crucial for data analysis. But your observation about the limits of elapsed times, i.e. durations, over long time spans is also correct: duration starts running out of ns precision for elapsed time in about +/- 104 days. Not to say it can't store elapsed times of almost any magnitude, just that it is a floating-point value.
I didn't go over your ns code too closely, but it looks similar to what convertTo (see my answer below) does.

Melden Sie sich an, um zu kommentieren.

Weitere Antworten (1)

Peter Perkins
Peter Perkins am 26 Jun. 2020
There's an easier way already built into datetime:
>> dt = datetime(["16-Jun-2020 00:00:00.000000001" "16-Jun-2020 00:00:00.000000002" "16-Jun-2020 00:00:00.000000003"],'Format','dd-MMM-yyyy HH:mm:ss.SSSSSSSSS')
dt =
1×3 datetime array
16-Jun-2020 00:00:00.000000001 16-Jun-2020 00:00:00.000000002 16-Jun-2020 00:00:00.000000003
>> convertTo(dt,'epochtime','TicksPerSecond',1e9)
ans =
1×3 int64 row vector
1592265600000000001 1592265600000000002 1592265600000000003
teeeeee, for reasons that may become apparent below, I'm wondering what you are doing with these values. If you are really needing that kind of precision over spans of decades, the obvious questions might be, "how are you measuring that precisely?" and "are you forgetting about leap seconds?" You may only care about order, not the actual values.
You shouldn't rely too much on the internals of datetime, but I will say that while datetime has a very wide range and precision, duration has the same behavior as double. So as the magnitude of a duration increases, the precision decreases. That's what allows you to create both of these:
>> seconds(1e-15)
ans =
duration
1e-15 sec
>> seconds(1e15)
ans =
duration
1e+15 sec
To represent ns since 1970, you'll need
>> ceil(log2(seconds(datetime('now') - datetime(1970,1,1))*1e9))
ans =
61
bits, which is more precision that duration has. With duration, you get about +/- 104 days at ns precision. or about 285 My at ms precision. This is completely independent of the display format, which might show units of s, min, hr, days (exact 86400s days, not calendar days), or years (exact 365.2425*86400s years, not calendar years). It would be interesting to hear about your use case for ns precision of elapsed times on the order of decades.
  3 Kommentare
Stephen23
Stephen23 am 26 Jun. 2020
Bearbeitet: Stephen23 am 28 Jun. 2020
@teeeeee: the convertTo documentation states at the bottom "Introduced in R2018b", which is a shame because this seems to be quite a neat solution to your original question.
@Peter Perkins: an example on this page
in the "Convert Datetime Arrays to Numeric Values" section would not go amiss. It is one of the first pages returned by [major internet search engine] when searching for "MATLAB convert date to number", but no sign of this useful function (or for that matter exceltime and posixtime, all of which have numeric outputs).

Melden Sie sich an, um zu kommentieren.

Produkte


Version

R2017a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by