From munro at ip9.org Sun May 10 17:51:08 2009
From: munro at ip9.org (Thomas Munro)
Date: Sun, 10 May 2009 18:51:08 +0100
Subject: [local-time-devel] ENCODE-TIMESTAMP with a timezone
Message-ID: <70ffe3e40905101051p41925f72oeb3f3dc0ecb4590c@mail.gmail.com>

Hi

I was trying out local-time 1.0.1 and I encountered a problem: the
function ENCODE-TIMESTAMP takes an offset, rather than a timezone
(unlike the interface described in the Naggum paper, I think). The
offset defaults to your offset for now, but that is not necessarily
the offset that was in effect at the local time you are encoding,
which might not be what some users want, and I couldn't figure out how
to get the appropriate offset for that time with the functions
available (although I may have missed something).

For example, my computer and I live in the timezone Europe/London,
which maps to subzone BST (GMT+01) today, and during the winter it
maps to subzone GMT. The following REPL session shows the problem as
I see it:

  CL-USER> (local-time:encode-timestamp 0 0 0 12 25 12 2008)
  @2008-12-25T11:00:00.000000Z

  ;; The local time was interpreted as BST, even though at the time
  ;; mentioned the subtimezone here was GMT; working as designed
  ;; but is it what most users would want?

  CL-USER> (local-time:encode-timestamp 0 0 0 12 09 05 2009)
  @2009-05-09T12:00:00.000000+01:00

  ;; For yesterday's date it so happens that the subtimezone matches
  ;; today's, so the result is good.

I need to be able to get from "2008-12-25 12:00:00 in tzinfo zone
Europe/London" to a TIMESTAMP. I can see of course that there is a
problem of ambiguity with such an interface: in countries with DST
there is one hour per year when the local time repeats, but as far as
I can see other libraries support this (for example Java's calendar
system and POSIX mktime) and I guess they must just pick an arbitrary
interpretation of ambiguous dates.

Here is my attempt at modifying ENCODE-TIMESTAMP to support either
OFFSET or TIMEZONE arguments.
If you supply neither it uses *DEFAULT-TIMEZONE* rather than the
current offset (so this is a change in behaviour). Here is the code:

  (defun encode-timestamp (nsec sec minute hour day month year
                           &key (timezone *default-timezone*) offset into)
    "Return a new TIMESTAMP instance corresponding to the specified time elements."
    ;; If the user provided an explicit offset, we use that. Otherwise,
    ;; we try converting the local time to a timestamp using each available
    ;; subtimezone, until we find one where the offset matches the offset that
    ;; applies at that time (according to the transition table).
    ;;
    ;; Consequence for ambiguous cases:
    ;; Whichever subtimezone is listed first in the tzinfo database will be
    ;; the one that we pick to resolve ambiguous local time representations.
    (declare (type integer nsec sec minute hour day month year))
    (if offset
        ;; a specific offset was requested
        (multiple-value-bind (nsec sec day)
            (encode-timestamp-into-values nsec sec minute hour day month year
                                          :offset offset)
          (if into
              (progn
                (setf (nsec-of into) nsec)
                (setf (sec-of into) sec)
                (setf (day-of into) day)
                into)
              (make-timestamp :nsec nsec :sec sec :day day)))
        ;; find the first potential offset that is valid at the represented time
        (let ((timestamp (or into (make-timestamp))))
          (loop for subtimezone across (timezone-subzones timezone)
                do (encode-timestamp nsec sec minute hour day month year
                                     :offset (subzone-offset subtimezone)
                                     :into timestamp)
                   (if (= (timestamp-subtimezone timestamp timezone)
                          (subzone-offset subtimezone))
                       (return timestamp))
                finally (error "The requested local time is not valid")))))

Here are those examples again in the REPL with this definition of
ENCODE-TIMESTAMP:

  CL-USER> (local-time:encode-timestamp 0 0 0 12 25 12 2008)
  @2008-12-25T12:00:00.000000Z

  CL-USER> (local-time:encode-timestamp 0 0 0 12 9 5 2009)
  @2009-05-09T12:00:00.000000+01:00

What do you think, is this useful? Is the algorithm correct and is
there a better one?
Would it be better to have two separate functions, one with the
OFFSET and the other with the TIMEZONE? Does my LOOP syntax suck?

Thanks
Thomas Munro

From dlowe at bitmuse.com Wed May 13 19:23:43 2009
From: dlowe at bitmuse.com (Daniel Lowe)
Date: Wed, 13 May 2009 15:23:43 -0400
Subject: [local-time-devel] ENCODE-TIMESTAMP with a timezone
In-Reply-To: <70ffe3e40905101051p41925f72oeb3f3dc0ecb4590c@mail.gmail.com>
References: <70ffe3e40905101051p41925f72oeb3f3dc0ecb4590c@mail.gmail.com>
Message-ID: <4A0B1E3F.9080901@bitmuse.com>

Thomas Munro wrote:
> What do you think, is this useful? Is the algorithm correct and
> is there a better one? Would it be better to have two separate
> functions, one with the OFFSET and the other with the TIMEZONE?
> Does my LOOP syntax suck?

Hi, Thomas.

You can get the offset of a particular timezone with the
timestamp-subtimezone function, given a timestamp of the period you
want. If you want to start with a sub-zone name, you can use:

  (subzone-offset
   (first (gethash "CEST"
                   local-time::*abbreviated-subzone-name->timezone-list*)))

Not the greatest solution, I know. We should probably have a function
similar to timestamp-subtimezone that works with the name instead.

One of the design principles I've had in mind while making the
local-time library is that the complexity in time representations
shouldn't be covered up with half-solutions, so I'm afraid the patch
isn't going to go in. I've actually been planning to remove the
guessing of the time offset entirely, defaulting to UTC. It's simply
not meaningful, in the context of a timestamp, to have such an
ambiguity lying around.

The idea is that a timestamp is an unambiguous representation of a
point in time. I'm still working on how to perform calculations with
fuzzier time definitions. Sadly, I haven't been able to devote much
time to it.
: Daniel :

From munro at ip9.org Thu May 14 09:27:11 2009
From: munro at ip9.org (Thomas Munro)
Date: Thu, 14 May 2009 10:27:11 +0100
Subject: [local-time-devel] ENCODE-TIMESTAMP with a timezone
In-Reply-To: <4A0B1E3F.9080901@bitmuse.com>
References: <70ffe3e40905101051p41925f72oeb3f3dc0ecb4590c@mail.gmail.com>
	<4A0B1E3F.9080901@bitmuse.com>
Message-ID: <70ffe3e40905140227k3ca0d0e3xe66425f9768cb59@mail.gmail.com>

Hi Daniel

2009/5/13 Daniel Lowe :
> You can get the offset of a particular timezone with the timestamp-subtimezone
> function, given a timestamp of the period you want.

Isn't there a chicken and egg problem here? Given elements of a local
time and a timezone, I can't make a timestamp because I don't have the
offset, and you're saying that to get the correct offset I need the
timestamp first.

> [...]
> One of the design principles I've had in mind while making the local-time
> library is that the complexity in time representations shouldn't be covered up
> with half-solutions, so I'm afraid the patch isn't going to go in. I've actually
> been planning to remove the guessing of the time offset entirely, defaulting to
> UTC. It's simply not meaningful, in the context of a timestamp, to have such an
> ambiguity lying around.

That is what my patch attempted to provide: you do not need to
guess/provide the offset (unless you want to), so that this matches
the capabilities of other Olson-powered time libraries. If you have
only the timezone (which is obtained from the Olson zoneinfo name
based on a city, like "America/New_York") and a set of local time
elements, the algorithm I provided will test each possible offset
(usually there are only two) until it finds one which satisfies the
constraint that the resulting timestamp should fall at a Unix epoch
time where the transition table says that the offset in question
applies.
This is the only way that I could think of to resolve that chicken
and egg problem: you need a timestamp to find out the offset, but you
need an offset to make a timestamp (but there is probably a better
way).

> The idea is that a timestamp is an unambiguous representation of a point in
> time. [...]

Understood. What I am interested in is finding a way to translate
from the normal human description of time ("2008-12-25 12:00:00 in
London") to your unambiguous representation of time (a type of epoch
time), since without it, ENCODE-TIMESTAMP is less useful than libc
mktime for anyone working with a set of historical times expressed as
local times in a given city.

Is it the interface (being able to translate from "2008-12-25 12:00:00
in London" to a timestamp) or the implementation (the test-each-offset
algorithm) that you didn't like?

Thanks
Thomas

From dlowe at bitmuse.com Thu May 14 13:59:14 2009
From: dlowe at bitmuse.com (Daniel Lowe)
Date: Thu, 14 May 2009 09:59:14 -0400
Subject: [local-time-devel] ENCODE-TIMESTAMP with a timezone
In-Reply-To: <70ffe3e40905140227k3ca0d0e3xe66425f9768cb59@mail.gmail.com>
References: <70ffe3e40905101051p41925f72oeb3f3dc0ecb4590c@mail.gmail.com>
	<4A0B1E3F.9080901@bitmuse.com>
	<70ffe3e40905140227k3ca0d0e3xe66425f9768cb59@mail.gmail.com>
Message-ID: <4A0C23B2.7080500@bitmuse.com>

Thomas Munro wrote:
> Isn't there a chicken and egg problem here? Given elements of a local
> time and a timezone, I can't make a timestamp because I don't have the
> offset, and you're saying that to get the correct offset I need the
> timestamp first.

You don't have to pass in the timestamp you're trying to create. A
timezone is really just a locale setting, and the offsets are stored
in sub-timezones under a given timezone. As I said before, if you
could look up the sub-timezone by name, you'd be able to identify the
offset you desired. I'll make a patch for that functionality sometime
this week (unless someone else writes it).

> That is what my patch attempted to provide: you do not need to
> guess/provide the offset (unless you want to)

Your patch is guessing, using the UTC timestamp to attempt to find a
working timezone. The problem is that more than one offset may be
valid, given a local time. Yours simply picks the first it finds.
It's equivalent to:

  (let ((offset (timestamp-subtimezone (encode-timestamp nsec sec minute hour
                                                         day month year
                                                         :offset 0)
                                       timezone)))
    (encode-timestamp nsec sec minute hour day month year :offset offset))

I don't have any problems with guessing the offset - I just want to
make explicit when a guess is being made, and that's most easily done
by not providing a default offset at all.

Come to think of it, it'd be pretty cool for :offset in
encode-timestamp to optionally take a string, referring to the
subtimezone.

: Daniel :

From munro at ip9.org Thu May 14 14:35:13 2009
From: munro at ip9.org (Thomas Munro)
Date: Thu, 14 May 2009 15:35:13 +0100
Subject: [local-time-devel] ENCODE-TIMESTAMP with a timezone
In-Reply-To: <4A0C23B2.7080500@bitmuse.com>
References: <70ffe3e40905101051p41925f72oeb3f3dc0ecb4590c@mail.gmail.com>
	<4A0B1E3F.9080901@bitmuse.com>
	<70ffe3e40905140227k3ca0d0e3xe66425f9768cb59@mail.gmail.com>
	<4A0C23B2.7080500@bitmuse.com>
Message-ID: <70ffe3e40905140735w7060edacoe45bcec2bf258972@mail.gmail.com>

2009/5/14 Daniel Lowe :
> Your patch is guessing, using the UTC timestamp to attempt to find a working
> timezone. The problem is that more than one offset may be valid, given a local
> time. Yours simply picks the first it finds. It's equivalent to:
>
> (let ((offset (timestamp-subtimezone (encode-timestamp nsec sec minute hour
>                                                        day month year
>                                                        :offset 0)
>                                      timezone)))
>   (encode-timestamp nsec sec minute hour day month year :offset offset))

Not quite -- it isn't using UTC to guess the timestamp (unless UTC is
one of your subzones as it happens to be for London).
It is doing the following, using 2009-12-25 12:00 America/New_York as
the example (to avoid confusion about the use of GMT/UTC in London):

Examine the available subtimezones for America/New_York, then:

1. Try to interpret the time as 2009-12-25 12:00 EST (subzone 0);
   does that map to a point in time when EST was valid?

2. Try to interpret the time as 2009-12-25 12:00 EDT (subzone 1);
   does that map to a point in time when EDT was valid?

> I don't have any problems with guessing the offset - I just want to make
> explicit when a guess is being made, and that's most easily done by not
> providing a default offset at all.
>
> Come to think of it, it'd be pretty cool for :offset in encode-timestamp to
> optionally take a string, referring to the subtimezone.

But that would not be different in effect from using the offset as a
number - EDT and EST are just 'nicknames' for offsets, and if you
already know which one applies at the time you're encoding then you
don't need the functionality that I am attempting to propose.
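
For readers following the thread, the test-each-offset idea from the
two numbered steps can be sketched in plain Common Lisp without
touching local-time at all. This is only a toy model under stated
assumptions: TOY-ZONE, UTC, OFFSET-AT and LOCAL->INSTANTS are
hypothetical names made up for this sketch, not part of the library,
and the transition table covers nothing but the two 2009 DST changes
for America/New_York. Unlike the patch, it collects every matching
subzone rather than returning the first, which makes both the
ambiguous hour and the skipped hour visible.

```lisp
;;; Toy model, NOT the local-time API: every name here is hypothetical.
;;; A zone is a vector of subzone offsets (seconds east of UTC) plus a
;;; list of transitions (utc-universal-time . subzone-index), ascending.
(defstruct toy-zone subzones transitions (initial 0))

(defun utc (year month day hour &optional (minute 0) (second 0))
  "Universal time for the given wall-clock elements read as UTC."
  (encode-universal-time second minute hour day month year 0))

(defun offset-at (zone instant)
  "Offset in force at INSTANT: that of the last transition at or before it."
  (let ((index (toy-zone-initial zone)))
    (loop for (start . subzone) in (toy-zone-transitions zone)
          while (<= start instant)
          do (setf index subzone))
    (aref (toy-zone-subzones zone) index)))

(defun local->instants (zone year month day hour
                        &optional (minute 0) (second 0))
  "All instants whose local representation in ZONE is the given wall time.
Tries each subzone offset in turn and keeps those that the transition
table says were actually in force at the resulting instant."
  (let ((wall (utc year month day hour minute second)))
    (loop for offset across (remove-duplicates (toy-zone-subzones zone))
          for candidate = (- wall offset)
          when (= offset (offset-at zone candidate))
            collect candidate)))

(defparameter *new-york-2009*
  (make-toy-zone
   :subzones (vector -18000 -14400)                  ; EST, EDT
   :transitions (list (cons (utc 2009 3 8 7) 1)      ; 02:00 EST -> EDT
                      (cons (utc 2009 11 1 6) 0))))  ; 02:00 EDT -> EST

;; Ordinary case: only EST matches, so one instant comes back.
(local->instants *new-york-2009* 2009 12 25 12)
;; Repeated hour when DST ends: both EST and EDT match, two instants.
(local->instants *new-york-2009* 2009 11 1 1 30)
;; Skipped hour when DST starts: neither matches, empty list.
(local->instants *new-york-2009* 2009 3 8 2 30)
```

Taking the first element of the result corresponds to the "pick
whichever subzone is listed first" behaviour of the patch, and an
empty result corresponds to its "requested local time is not valid"
error. Returning the whole list instead is one possible way to make
the guess explicit, in the spirit of Daniel's objection.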