December 9, 2009

Day 9 - What Time Is It?

Time is relative. No, I'm not talking about Einstein's theory of relativity. Time is relative to geographies and synchronization sources. It's five o'clock somewhere, right?

Time is a complex thing. Did you know there are a few bazillion time standards? Not just time representations, but actually standards on how to record and observe the passage of time! That's awesome! Further, learning about time standards helps explain why we have leap years and leap seconds.

Why do we care about time? Before we talk about why, let's first introduce some time representation formats.

Time formats for exchange and storage are also complex. ISO 8601 is a widely used and abused time format standard. This standard also supports durations (e.g., 30 seconds) and intervals (start and end timestamps), though I've never seen anything use ISO 8601 durations or intervals. Timestamps (i.e., a point in time) are the focus of today, so we'll ignore durations and intervals. The problem with ISO 8601 is that it is huge and complex and supports a pile of different timestamp representations with all kinds of optional fields. This complexity has led groups like the W3C and IETF to publish time format specifications that are similar to ISO 8601, but much simpler and reduced. The IETF's version is RFC 3339.

An example RFC 3339 timestamp looks like: 2009-12-09T00:45:58-08:00 (see RFC 3339 section 5.6). Like ISO 8601, RFC 3339 supports fractional seconds, so the following is also valid: 2009-12-09T00:45:58.335134-08:00.

The RFC does not specify a limit to the precision of the fractional seconds. This fractional seconds field is also the only optional field in RFC 3339; all other values must be present.
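
If you want to poke at this yourself, here is a small sketch using Python's standard library. It assumes Python 3.7 or later, where datetime.fromisoformat accepts timestamps of this shape. It is not a complete RFC 3339 parser (older Python versions reject a trailing "Z", for instance), so treat it as an illustration only.

```python
from datetime import datetime, timedelta, timezone

# The two example timestamps from the text above.
plain = datetime.fromisoformat("2009-12-09T00:45:58-08:00")
fractional = datetime.fromisoformat("2009-12-09T00:45:58.335134-08:00")

# Both carry their UTC offset, so converting between zones is unambiguous.
as_utc = plain.astimezone(timezone.utc)

# Writing a timestamp back out in the same shape.
# A fixed -08:00 offset stands in for PST here (an assumption for the demo).
pst = timezone(timedelta(hours=-8))
stamp = datetime(2009, 12, 9, 0, 45, 58, tzinfo=pst).isoformat()

print(as_utc.isoformat())      # 2009-12-09T08:45:58+00:00
print(fractional.microsecond)  # 335134
print(stamp)                   # 2009-12-09T00:45:58-08:00
```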

Another common time format is RFC 2822 (which supersedes RFC 822). RFC 2822 section 3.3 covers time formats. This format has redundant fields and generally appears to be a format for the benefit of humans, not computers. It appears to be backwards compatible with the RFC 822 time format. Notable points about this format are that it uses abbreviated English names for months, does not support fractional seconds, and has some oddities in whitespace suggestions.
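
For comparison, here is the same moment in the RFC 2822 style, using the helpers in Python's email.utils module. This is just a sketch. The fixed -08:00 offset is my assumption for the demo, and the variable names are made up.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime

pst = timezone(timedelta(hours=-8))  # fixed offset, assumed for the demo
moment = datetime(2009, 12, 9, 0, 45, 58, tzinfo=pst)

# Day name and month name are spelled out in abbreviated English.
rfc2822 = format_datetime(moment)
print(rfc2822)  # Wed, 09 Dec 2009 00:45:58 -0800

# Fractional seconds have nowhere to go in this format, so they are dropped.
with_fraction = moment.replace(microsecond=335134)
lossy = format_datetime(with_fraction)

# Round-tripping the whole-second timestamp gives back the same moment.
parsed = parsedate_to_datetime(rfc2822)
```

Note the redundancy: "Wed" can be computed from the date, which is one reason this format feels aimed at humans.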

Text time formats aren't the only ones you need to be aware of. MySQL supports several time types, none of them superb. None of MySQL's time types support fractional seconds. DATETIME and TIMESTAMP lack time zone storage. DATE doesn't hold times. TIME doesn't hold dates. TIMESTAMP will convert to UTC on write and back to local time on read, but the mysql client has to declare its time zone separately from any read or write. MySQL's leap second support is odd - you won't be able to detect if a stored timestamp falls during a leap second, due to a bug they had to work around in mysql dump/restore with leap seconds.

Did you know Unix time doesn't support leap seconds? Neither did I. Apparently Unix time is based on days since January 1, 1970. Every day, regardless of leap seconds, is 86400 seconds long, or so I've read. Leap second adjustments are pretty rare, but when they do happen, that 1 second change can make for some confusion during debugging. Next time you hear of a leap second being applied, mark it on your calendar so you don't forget.
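
The "every day is 86400 seconds" arithmetic is simple enough to sketch. The unix_time helper below is a hypothetical name of mine, not a library function. It just counts days since 1970-01-01 and multiplies. The example straddles the leap second that was inserted at the end of December 31, 2008.

```python
from datetime import date

SECONDS_PER_DAY = 86400  # fixed, leap second or not

def unix_time(year, month, day, hour=0, minute=0, second=0):
    """Seconds since 1970-01-01T00:00:00 UTC, counting every day as 86400s."""
    days = (date(year, month, day) - date(1970, 1, 1)).days
    return days * SECONDS_PER_DAY + hour * 3600 + minute * 60 + second

# 23:59:59 UTC on 2008-12-31 was followed by a leap second (23:59:60)
# before 00:00:00 on 2009-01-01.
before = unix_time(2008, 12, 31, 23, 59, 59)
after = unix_time(2009, 1, 1)

print(after)           # 1230768000
print(after - before)  # 1, although two real seconds elapsed
```

The leap second simply has no Unix timestamp of its own, which is where the debugging confusion comes from.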

Now we're getting into the 'why' that we should care about time. Like the MySQL example, there are thousands of cases of software not using standard time formats. Almost every piece of software uses its own logging time format, which brings me to my next point:

Time formats are a perfect example of what the programming world would call mixing your data model with your view on that data. In this case, our data model is time, and the view is whatever format we present it in. For example, in syslog, you'll see times like "Dec 6 11:51:04" and in Apache you'll see "02/Oct/2009:12:01:16 -0700". Nagios uses "[1253586479]", which is Unix epoch time.

Any random time format, depending on what it is needed for, can have one of a few problems. First, time formats often omit values: syslog usually has no year or time zone. Second, most time formats don't sort when compared as text: Apache logs do not sort because the first value is the day. Finally, few things support or bother to use fractional seconds. Fractional seconds (micro/milli/nanoseconds) may not be required for some applications, but it's something to be mindful of. Speaking of fractional seconds, standard libc strptime(3) cannot parse fractional seconds. Further, strptime(3) will not parse time zones.
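
Here is a quick sketch of the sorting problem. Only the first Apache timestamp comes from this post. The other two are made up for the demo. Note that this uses Python's own datetime.strptime, not libc's strptime(3), and Python's version does understand a numeric offset via %z.

```python
from datetime import datetime, timezone

APACHE_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

apache_stamps = [
    "02/Oct/2009:12:01:16 -0700",  # from the text above
    "01/Nov/2009:08:00:00 -0800",  # hypothetical
    "15/Jan/2009:23:59:59 -0800",  # hypothetical
]

# Sorting as plain text puts November first and January last.
text_order = sorted(apache_stamps)

def to_utc_iso(stamp):
    """Convert an Apache-style timestamp to an ISO-style UTC string."""
    parsed = datetime.strptime(stamp, APACHE_FORMAT)
    return parsed.astimezone(timezone.utc).isoformat()

# Year-first, fixed-width, single-zone strings sort chronologically as text.
iso_order = sorted(to_utc_iso(s) for s in apache_stamps)

for stamp in iso_order:
    print(stamp)
# 2009-01-16T07:59:59+00:00
# 2009-10-02T19:01:16+00:00
# 2009-11-01T16:00:00+00:00
```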

The lack of a common time format means that you can pick any two pieces of software in your system and have a high probability that they don't speak the same time format. This makes it very difficult to compare times across applications without writing custom parsers (with awk, perl, whatever) to convert times to a single format for comparison. This sucks, especially when (as mentioned above) some parsing tools don't support time zones and fractional seconds.
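
To give a feel for what those custom parsers look like, here is a rough sketch that pulls the three example timestamps from earlier (syslog, Apache, Nagios) into one comparable form. The function names are my own. Because syslog drops the year and time zone, the caller has to guess them. The 2009 and -08:00 used below are assumptions, not facts recovered from the log line.

```python
from datetime import datetime, timedelta, timezone

def parse_syslog(stamp, year, tz):
    """syslog omits year and zone, so the caller must supply both."""
    naive = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    return naive.replace(tzinfo=tz).astimezone(timezone.utc)

def parse_apache(stamp):
    """Apache includes a numeric offset, so nothing has to be guessed."""
    parsed = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
    return parsed.astimezone(timezone.utc)

def parse_nagios(stamp):
    """Nagios logs Unix epoch seconds wrapped in square brackets."""
    return datetime.fromtimestamp(int(stamp.strip("[]")), tz=timezone.utc)

assumed_tz = timezone(timedelta(hours=-8))  # assumption: the host logs in PST

events = [
    parse_syslog("Dec 6 11:51:04", 2009, assumed_tz),
    parse_apache("02/Oct/2009:12:01:16 -0700"),
    parse_nagios("[1253586479]"),
]

for event in sorted(events):
    print(event.isoformat())
# 2009-09-22T02:27:59+00:00
# 2009-10-02T19:01:16+00:00
# 2009-12-06T19:51:04+00:00
```

Three formats, three parsers, and one of them only works if you already know things the log didn't tell you.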

The above highlights a common problem: having two tools speak different time languages makes for Great Sadness. I am quite annoyed that most software chooses to use its own randomly-created time format. You should be annoyed, too.

Time formats aren't the only problem. Having two devices out of time sync can cause many debugging headaches. There are even some protocols that require time to be synchronized. Kerberos for example will reject requests with timestamps too far out of sync with the server.

Use NTP to synchronize clocks across your servers.

As discussed above, lots of time formats do not include time zones. For your own sanity, you should standardize on a single time zone. UTC is a good choice as it does not use daylight saving time. Where applications allow, have them always show time in local time but store time in UTC. If you are bad at converting UTC to local time in your head, buy two wall clocks for your office and set one to UTC and one to local. Plus, having multiple wall clocks clearly increases the awesomeness of your office.
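
The "store in UTC, show in local time" idea can be sketched in a few lines. The helper names to_storage and to_display are hypothetical, and the fixed offsets stand in for real time zones. Anything serious should use a proper time zone database so daylight saving changes are handled.

```python
from datetime import datetime, timedelta, timezone

def to_storage(moment):
    """Normalize an offset-aware datetime to a UTC string for storage."""
    if moment.tzinfo is None:
        raise ValueError("refusing to store a timestamp without a time zone")
    return moment.astimezone(timezone.utc).isoformat()

def to_display(stored, local_tz):
    """Render a stored UTC string in the viewer's time zone."""
    moment = datetime.fromisoformat(stored).astimezone(local_tz)
    return moment.strftime("%Y-%m-%d %H:%M:%S %z")

# Fixed offsets, assumed for the demo (no daylight saving handling).
pacific = timezone(timedelta(hours=-8))
eastern = timezone(timedelta(hours=-5))

stored = to_storage(datetime(2009, 12, 9, 0, 45, 58, tzinfo=pacific))

print(stored)                       # 2009-12-09T08:45:58+00:00
print(to_display(stored, pacific))  # 2009-12-09 00:45:58 -0800
print(to_display(stored, eastern))  # 2009-12-09 03:45:58 -0500
```

Refusing timestamps without a zone is a deliberate choice in this sketch: guessing the zone at write time is how mismatched logs happen in the first place.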

Lastly, be careful when you change the system time zone (on any operating system). Some processes don't check for time zone changes. For example, Apache will continue to log in PST (-0800) even after you change /etc/localtime to use UTC - fixing this requires a restart of Apache. The same problem may plague your cron, syslog, and other systems, depending on how they are implemented.

Time is hard. Keep your server and device clocks in sync, try to keep your software using the same time format, use a common time zone across your systems, and if you have the option, use time formats that don't drop things like year.

Steve Allen said...

Don't miss the fact that the combination of international agreements and technical standards result in an EPOCH FAIL.

vitroth said...

If you have equipment in multiple timezones, try to standardize on whether everything keeps its time in local time or in UTC. For some infrastructure, local time is a requirement.

I find it easiest to use UTC everywhere I can. In particular if all devices which send syslog data to a common syslog server don't use the same timezone, you're going to end up confusing timezones eventually when analyzing logs.

deadhead said...

openntpd : easy to configure, safe (from openbsd guys), reliable..