From munro at ip9.org  Sun May 10 17:51:08 2009
From: munro at ip9.org (Thomas Munro)
Date: Sun, 10 May 2009 18:51:08 +0100
Subject: [local-time-devel] ENCODE-TIMESTAMP with a timezone
Message-ID: <70ffe3e40905101051p41925f72oeb3f3dc0ecb4590c@mail.gmail.com>

Hi

I was trying out local-time 1.0.1 and I encountered a problem: the
function ENCODE-TIMESTAMP takes an offset, rather than a timezone
(unlike the interface described in the Naggum paper, I think).
The offset defaults to the offset in effect right now, which is not
necessarily the offset that was in effect at the local time you
are encoding.  That might not be what some users want, and I
couldn't figure out how to get the appropriate offset for that time
with the functions available (although I may have missed something).

For example, my computer and I live in the timezone Europe/London,
which maps to subzone BST (GMT+01) today, and during the winter it
maps to subzone GMT.  The following REPL session shows the problem
as I see it:

CL-USER> (local-time:encode-timestamp 0 0 0 12 25 12 2008)
@2008-12-25T11:00:00.000000Z
;; The local time was interpreted as BST, even though at the time
;; mentioned the subtimezone here was GMT; working as designed
;; but is it what most users would want?

CL-USER> (local-time:encode-timestamp 0 0 0 12 09 05 2009)
@2009-05-09T12:00:00.000000+01:00
;; For yesterday's date it so happens that the subtimezone matches
;; today's, so the result is good.

I need to be able to get from "2008-12-25 12:00:00 in tzinfo zone
Europe/London" to a TIMESTAMP.  I can see of course that there
is a problem of ambiguity with such an interface: in countries with
DST there is one hour per year when the local time repeats (and one
hour that is skipped entirely), but as far as I can see other
libraries support this (for example Java's calendar system and POSIX
mktime) and I guess they must just pick an arbitrary interpretation
of ambiguous dates.

Here is my attempt at modifying ENCODE-TIMESTAMP to support either
OFFSET or TIMEZONE arguments.  If you supply neither it uses
*DEFAULT-TIMEZONE* rather than the current offset (so this is a change
in behaviour).  Here is the code:

(defun encode-timestamp (nsec sec minute hour day month year
                         &key (timezone *default-timezone*) offset into)
  "Return a new TIMESTAMP instance corresponding to the specified time
elements."
  ;; If the user provided an explicit offset, we use that.  Otherwise,
  ;; we try converting the local time to a timestamp using each available
  ;; subtimezone, until we find one where the offset matches the offset that
  ;; applies at that time (according to the transition table).
  ;;
  ;; Consequence for ambiguous cases:
  ;; Whichever subtimezone is listed first in the tzinfo database will be
  ;; the one that we pick to resolve ambiguous local time representations.
  (declare (type integer nsec sec minute hour day month year))
  (if offset
      ;; a specific offset was requested
      (multiple-value-bind (nsec sec day)
          (encode-timestamp-into-values nsec sec minute hour day month year
                                        :offset offset)
        (if into
            (progn
              (setf (nsec-of into) nsec)
              (setf (sec-of into) sec)
              (setf (day-of into) day)
              into)
            (make-timestamp
             :nsec nsec
             :sec sec
             :day day)))
      ;; find the first potential offset that is valid at the represented time
      (let ((timestamp (or into (make-timestamp))))
        (loop
           for subtimezone across (timezone-subzones timezone) do
             (encode-timestamp nsec sec minute hour day month year
                               :offset (subzone-offset subtimezone)
                               :into timestamp)
             (when (= (timestamp-subtimezone timestamp timezone)
                      (subzone-offset subtimezone))
               (return timestamp))
           finally
             (error "The requested local time is not valid")))))

Here are those examples again in the REPL with this definition of
ENCODE-TIMESTAMP:

CL-USER> (local-time:encode-timestamp 0 0 0 12 25 12 2008)
@2008-12-25T12:00:00.000000Z

CL-USER> (local-time:encode-timestamp 0 0 0 12 9 5 2009)
@2009-05-09T12:00:00.000000+01:00

What do you think, is this useful?  Is the algorithm correct and
is there a better one?  Would it be better to have two separate
functions, one with the OFFSET and the other with the TIMEZONE?
Does my LOOP syntax suck?

Thanks
Thomas Munro


From dlowe at bitmuse.com  Wed May 13 19:23:43 2009
From: dlowe at bitmuse.com (Daniel Lowe)
Date: Wed, 13 May 2009 15:23:43 -0400
Subject: [local-time-devel] ENCODE-TIMESTAMP with a timezone
In-Reply-To: <70ffe3e40905101051p41925f72oeb3f3dc0ecb4590c@mail.gmail.com>
References: <70ffe3e40905101051p41925f72oeb3f3dc0ecb4590c@mail.gmail.com>
Message-ID: <4A0B1E3F.9080901@bitmuse.com>

Thomas Munro wrote:
> What do you think, is this useful?  Is the algorithm correct and
> is there a better one?  Would it be better to have two separate
> functions, one with the OFFSET and the other with the TIMEZONE?
> Does my LOOP syntax suck?

Hi, Thomas.

You can get the offset of a particular timezone with the timestamp-subtimezone
function, given a timestamp of the period you want.  If you want to start with a
sub-zone name, you can use:
 (subzone-offset (first
                   (gethash "CEST"
                     local-time::*abbreviated-subzone-name->timezone-list*)))

Not the greatest solution, I know.  We should probably have a function similar
to timestamp-subtimezone that works with the name instead.

One of the design principles I've had in mind while making the local-time
library is that the complexity in time representations shouldn't be covered up
with half-solutions, so I'm afraid the patch isn't going to go in. I've actually
been planning to remove the guessing of the time offset entirely, defaulting to
UTC.  It's simply not meaningful, in the context of a timestamp, to have such an
ambiguity lying around.

The idea is that a timestamp is an unambiguous representation of a point in
time.  I'm still working on how to perform calculations with fuzzier time
definitions.  Sadly, I haven't been able to devote much time to it.

: Daniel :


From munro at ip9.org  Thu May 14 09:27:11 2009
From: munro at ip9.org (Thomas Munro)
Date: Thu, 14 May 2009 10:27:11 +0100
Subject: [local-time-devel] ENCODE-TIMESTAMP with a timezone
In-Reply-To: <4A0B1E3F.9080901@bitmuse.com>
References: <70ffe3e40905101051p41925f72oeb3f3dc0ecb4590c@mail.gmail.com>
	<4A0B1E3F.9080901@bitmuse.com>
Message-ID: <70ffe3e40905140227k3ca0d0e3xe66425f9768cb59@mail.gmail.com>

Hi Daniel

2009/5/13 Daniel Lowe <dlowe at bitmuse.com>:
> You can get the offset of a particular timezone with the timestamp-subtimezone
> function, given a timestamp of the period you want.

Isn't there a chicken and egg problem here?  Given elements of a local
time and a timezone, I can't make a timestamp because I don't have the
offset, and you're saying that to get the correct offset I need the
timestamp first.

> [...]
> One of the design principles I've had in mind while making the local-time
> library is that the complexity in time representations shouldn't be covered up
> with half-solutions, so I'm afraid the patch isn't going to go in. I've actually
> been planning to remove the guessing of the time offset entirely, defaulting to
> UTC.  It's simply not meaningful, in the context of a timestamp, to have such an
> ambiguity lying around.

That is what my patch attempted to provide: you do not need to
guess/provide the offset (unless you want to), so that this matches
the capabilities of other Olson-powered time libraries.  If you have
only the timezone (which is obtained from the Olson zoneinfo name
based on a city, like "America/New_York") and a set of local time
elements, the algorithm I provided will test each possible offset
(usually there are only two) until it finds one which satisfies the
constraint that the resulting timestamp should fall at a Unix epoch
time where the transition table says that the offset in question
applies.  This is the only way that I could think of to resolve that
chicken and egg problem: you need a timestamp to find out the offset,
but you need an offset to make a timestamp (but there is probably a
better way).
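
To make the idea concrete, here is a minimal, self-contained sketch of
the test-each-offset search.  It does not use local-time's data
structures: the zone is a hand-written excerpt of London's 2009
transitions, and OFFSET-AT, CANDIDATE-OFFSETS and LOCAL->UTC-CANDIDATES
are hypothetical names invented for illustration.  Unlike the patch, it
collects every matching offset instead of returning the first, so the
repeated and skipped hours become visible:

```lisp
;; Toy zone (an assumption, not local-time's representation): an initial
;; offset plus (utc-universal-time . offset-seconds) transitions, sorted.
(defparameter *london-2009*
  (list :initial-offset 0
        :transitions
        (list (cons (encode-universal-time 0 0 1 29 3 2009 0) 3600) ; BST starts
              (cons (encode-universal-time 0 0 1 25 10 2009 0) 0)))) ; back to GMT

(defun offset-at (zone utc)
  "Offset in seconds east of UTC in effect in ZONE at universal time UTC."
  (let ((offset (getf zone :initial-offset)))
    (loop for (start . new-offset) in (getf zone :transitions)
          while (<= start utc)
          do (setf offset new-offset))
    offset))

(defun candidate-offsets (zone)
  "Every distinct offset ZONE uses, in order of first appearance."
  (remove-duplicates
   (cons (getf zone :initial-offset)
         (mapcar #'cdr (getf zone :transitions)))
   :from-end t))

(defun local->utc-candidates (zone sec minute hour day month year)
  "All universal times whose local reading in ZONE is the given fields."
  ;; Passing time zone 0 encodes the fields as if they were already UTC.
  (let ((naive (encode-universal-time sec minute hour day month year 0)))
    (loop for offset in (candidate-offsets zone)
          for utc = (- naive offset)
          ;; Keep the candidate only if the table says OFFSET applies there.
          when (= (offset-at zone utc) offset)
            collect utc)))
```

With this table, 12:00 on 2009-05-09 yields exactly one candidate
(11:00 UTC), 01:30 on 2009-10-25 yields two (the repeated hour), and
01:30 on 2009-03-29 yields none (the skipped hour).  Returning the
first element of that list corresponds to the "pick whichever comes
first" behaviour described in the patch.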

> The idea is that a timestamp is an unambiguous representation of a point in
> time. [...]

Understood.  What I am interested in is finding a way to translate
from the normal human description of time ("2008-12-25 12:00:00 in
London") to your unambiguous representation of time (a type of epoch
time), since without it, ENCODE-TIMESTAMP is less useful than libc
mktime for anyone working with a set of historical times expressed as
local times in a given city.

Is it the interface (being able to translate from "2008-12-25 12:00:00
in London" to a timestamp) or the implementation (the test-each-offset
algorithm) that you didn't like?

Thanks
Thomas


From dlowe at bitmuse.com  Thu May 14 13:59:14 2009
From: dlowe at bitmuse.com (Daniel Lowe)
Date: Thu, 14 May 2009 09:59:14 -0400
Subject: [local-time-devel] ENCODE-TIMESTAMP with a timezone
In-Reply-To: <70ffe3e40905140227k3ca0d0e3xe66425f9768cb59@mail.gmail.com>
References: <70ffe3e40905101051p41925f72oeb3f3dc0ecb4590c@mail.gmail.com>	<4A0B1E3F.9080901@bitmuse.com>
	<70ffe3e40905140227k3ca0d0e3xe66425f9768cb59@mail.gmail.com>
Message-ID: <4A0C23B2.7080500@bitmuse.com>

Thomas Munro wrote:
> Isn't there a chicken and egg problem here?  Given elements of a local
> time and a timezone, I can't make a timestamp because I don't have the
> offset, and you're saying that to get the correct offset I need the
> timestamp first.

You don't have to pass in the timestamp you're trying to create.  A timezone is
really just a locale setting, and the offsets are stored in sub-timezones under
a given timezone.  As I said before, if you could look up the sub-timezone by
name, you'd be able to identify the offset you desired.  I'll make a patch for
that functionality sometime this week (unless someone else writes it).

> That is what my patch attempted to provide: you do not need to
> guess/provide the offset (unless you want to)

Your patch is guessing, using the UTC timestamp to attempt to find a working
timezone.  The problem is that more than one offset may be valid, given a local
time.  Yours simply picks the first it finds.  It's equivalent to:

(let ((offset (timestamp-subtimezone (encode-timestamp nsec sec minute hour
                                                       day month year
                                                       :offset 0)
                                     timezone)))
  (encode-timestamp nsec sec minute hour day month year :offset offset))

I don't have any problems with guessing the offset - I just want to make
explicit when a guess is being made, and that's most easily done by not
providing a default offset at all.

Come to think of it, it'd be pretty cool for :offset in encode-timestamp to
optionally take a string, referring to the subtimezone.

: Daniel :


From munro at ip9.org  Thu May 14 14:35:13 2009
From: munro at ip9.org (Thomas Munro)
Date: Thu, 14 May 2009 15:35:13 +0100
Subject: [local-time-devel] ENCODE-TIMESTAMP with a timezone
In-Reply-To: <4A0C23B2.7080500@bitmuse.com>
References: <70ffe3e40905101051p41925f72oeb3f3dc0ecb4590c@mail.gmail.com>
	<4A0B1E3F.9080901@bitmuse.com>
	<70ffe3e40905140227k3ca0d0e3xe66425f9768cb59@mail.gmail.com>
	<4A0C23B2.7080500@bitmuse.com>
Message-ID: <70ffe3e40905140735w7060edacoe45bcec2bf258972@mail.gmail.com>

2009/5/14 Daniel Lowe <dlowe at bitmuse.com>:
> Your patch is guessing, using the UTC timestamp to attempt to find a working
> timezone.  The problem is that more than one offset may be valid, given a local
> time.  Yours simply picks the first it finds.  It's equivalent to:
>
> (let ((offset (timestamp-subtimezone (encode-timestamp nsec sec minute hour
>                                                       day month year
>                                                       :offset 0)
>                                     timezone)))
>  (encode-timestamp nsec sec minute hour day month year :offset offset))

Not quite -- it isn't using UTC to guess the timestamp (unless UTC is
one of your subzones as it happens to be for London).  It is doing the
following, using 2009-12-25 12:00 America/New_York as the example (to
avoid confusion about the use of GMT/UTC in London):

Examine the available subtimezones for America/New_York, then:
1.  Try to interpret the time as 2009-12-25 12:00 EST (subzone 0);
does that map to a point in time when EST was valid?
2.  Try to interpret the time as 2009-12-25 12:00 EDT (subzone 1);
does that map to a point in time when EDT was valid?
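
The difference from the treat-it-as-UTC shortcut only shows up within
a few hours of a transition.  Here is a minimal sketch under stated
assumptions: the table is a hand-written excerpt of New York's 2009
transitions rather than local-time's data, times are plain universal
times, and NY-OFFSET-AT, UTC-LOOKUP-GUESS and TEST-EACH-OFFSET are
hypothetical names that only model the two approaches:

```lisp
;; (utc-universal-time . offset-seconds) pairs; offsets are east of UTC.
(defparameter *ny-transitions*
  (list (cons (encode-universal-time 0 0 7 8 3 2009 0) -14400)    ; EDT starts
        (cons (encode-universal-time 0 0 6 1 11 2009 0) -18000))) ; back to EST

(defun ny-offset-at (utc)
  "Offset in effect in New York at universal time UTC (EST before the table)."
  (let ((offset -18000))
    (loop for (start . new-offset) in *ny-transitions*
          while (<= start utc)
          do (setf offset new-offset))
    offset))

(defun utc-lookup-guess (naive)
  "Read the local fields as UTC, look up the offset there, then apply it."
  (- naive (ny-offset-at naive)))

(defun test-each-offset (naive)
  "Try EST, then EDT; return the first reading the table agrees with."
  (loop for offset in '(-18000 -14400)
        for utc = (- naive offset)
        when (= (ny-offset-at utc) offset)
          return utc))

;; NAIVE is the local wall-clock time encoded as if it were UTC.
(defparameter *christmas-noon* (encode-universal-time 0 0 12 25 12 2009 0))
(defparameter *march-8-4am*    (encode-universal-time 0 0 4 8 3 2009 0))
```

For *CHRISTMAS-NOON* both functions agree on 17:00 UTC.  For
*MARCH-8-4AM*, which is 04:00 EDT, TEST-EACH-OFFSET gives 08:00 UTC,
while UTC-LOOKUP-GUESS looks up the offset at 04:00 UTC (still EST) and
ends up an hour late at 09:00 UTC.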

> I don't have any problems with guessing the offset - I just want to make
> explicit when a guess is being made, and that's most easily done by not
> providing a default offset at all.
>
> Come to think of it, it'd be pretty cool for :offset in encode-timestamp to
> optionally take a string, referring to the subtimezone.

But that would not be different in effect from using the offset as a
number - EDT and EST are just 'nicknames' for offsets, and if you
already know which one applies at the time you're encoding then you
don't need the functionality that I am attempting to propose.