From Siebe at de-vos.de Wed May 25 09:31:55 2011
From: Siebe at de-vos.de (Siebe de Vos)
Date: Wed, 25 May 2011 11:31:55 +0200
Subject: [local-time-devel] encode-timestamp : wrong offset at DST transition
Message-ID: <201105251131.55530.Siebe@de-vos.de>

Hi Local-Time,

Correctly parsing and printing a local time given a timezone is an
essential feature for our application, and I am happy to have found that
local-time can offer this. But there are some problems with using
timezones. From the list history it is clear that this is not a trivial
issue. I will point to some obvious mismatches between documentation and
actual behaviour and then describe a bug and a solution for the bug.

On a general level: in my opinion the philosophy of local-time is
unclear with regard to timezones. It seems to be the only Lisp library
around that is integrated with zoneinfo information. The date formatting
of a timestamp for a symbolic timezone works fine. However, parsing and
encoding of a timestamp expressed in local time historically required an
explicit offset. Now timezones have been added, but not with enough
care: the documentation does not even mention the API facilities.
Without touching the hairy issue of time arithmetic and timezones --
using the zoneinfo library when encoding a timestamp can and should be
handled correctly. And it will make local-time really live up to its
name!

1. Arguments of timestamp-subtimezone

"Function: timestamp-subtimezone timestamp &optional timezone"

LOCAL-TIME: (timestamp-subtimezone (now))
Error: TIMESTAMP-SUBTIMEZONE got 1 arg, wanted 2 args.

Timezone is not optional in the implementation.

2. Arguments of encode-timestamp

"Function: encode-timestamp nsec sec minute hour day month year
&optional offset"

LOCAL-TIME: (encode-timestamp 0 0 0 2 26 3 2011 7200)
Error: &key list isn't even.

Offset is a keyword argument in the implementation.

3. Default offset used by encode-timestamp

encode-timestamp : "The offset is the number of seconds offset from UTC
of the locale. The offset will be set by default to the lisp
implementation's default offset at the current time."

LOCAL-TIME: (timestamp-subtimezone (now) *default-timezone*)
7200
T
"CEST"

The "implementation's default offset at the current time" is 7200.

LOCAL-TIME: (encode-timestamp 0 0 0 2 26 3 2011)
@2011-03-26T02:00:00.000000+01:00
LOCAL-TIME: (encode-timestamp 0 0 0 2 26 3 2011 :offset 7200)
@2011-03-26T01:00:00.000000+01:00

Contrary to the documentation, the two values are not the same. A better
wording would be: "The implementation uses the offset of the provided
timezone (if not provided, the default timezone) valid at the local time
being encoded."

4. encode-timestamp fails near DST transitions

If my interpretation of encode-timestamp in point 3 is correct, the
following should not happen. I'm in a CET (+01:00) locale, currently
CEST (+02:00). The last transition was on March 27, 2011, when the local
time stepped from 01:59:59 to 03:00:00.

CL-USER: (local-time:encode-timestamp 0 0 0 0 27 3 2011)
@2011-03-27T00:00:00.000000+01:00 ; Correct
CL-USER: (local-time:encode-timestamp 0 0 0 1 27 3 2011)
@2011-03-27T00:00:00.000000+01:00 ; Wrong, should be T01:00:00
CL-USER: (local-time:encode-timestamp 0 0 0 2 27 3 2011)
@2011-03-27T01:00:00.000000+01:00 ; Neither wrong nor right: this is the hour
                                  ; lost in the transition. Still more wrong
                                  ; than right, I would say.
CL-USER(32): (local-time:encode-timestamp 0 0 0 3 27 3 2011)
@2011-03-27T03:00:00.000000+02:00 ; Correct

The cause of the wrong value is in the function %guess-offset: it takes
seconds and day and turns them into a unix-time to search the zoneinfo
transition table. This means that the provided day and seconds are
treated as having a 0 offset from UTC. However, in the context of
encode-timestamp they should be treated as local time in the given
timezone.
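To make the cause concrete, here is a minimal sketch of the lookup
problem. It is a toy model and not the local-time source: the names
offset-at, naive-encode and show are hypothetical, the single hard-coded
transition stands in for the zoneinfo table, and times are counted in
seconds from 2011-03-27T00:00:00 UTC instead of as real unix-times.

```lisp
;; Toy model, NOT the local-time source: one zone with a single DST
;; transition. Times are seconds relative to 2011-03-27T00:00:00 UTC.
(defparameter *transition-utc* 3600
  "CET -> CEST switch at 01:00:00 UTC (02:00 CET becomes 03:00 CEST).")

(defparameter *offset-before* 3600 "CET, +01:00")
(defparameter *offset-after* 7200 "CEST, +02:00")

(defun offset-at (utc-seconds)
  "Offset in force at a UTC instant, as a transition table would report it."
  (if (< utc-seconds *transition-utc*) *offset-before* *offset-after*))

(defun naive-encode (wall-seconds)
  "Look up the offset as if WALL-SECONDS were UTC, then convert to UTC.
This is the assumed shape of the problem, not the actual %guess-offset."
  (- wall-seconds (offset-at wall-seconds)))

(defun show (utc-seconds)
  "Render a UTC instant as local wall-clock hour plus offset in hours."
  (let ((offset (offset-at utc-seconds)))
    (format nil "T~2,'0D:00 +~2,'0D:00"
            (floor (+ utc-seconds offset) 3600)
            (floor offset 3600))))

;; Encode wall-clock hours 00:00 .. 03:00 on the transition day.
(dolist (hour '(0 1 2 3))
  (format t "wall ~2,'0D:00 -> ~A~%" hour (show (naive-encode (* hour 3600)))))
```

In this toy model the wall-clock hours 01:00 and 02:00 come out as
T00:00 +01:00 and T01:00 +01:00, the same pattern as in the transcript
above: the table is consulted one offset too late, so the hour before
the transition is already encoded with the CEST offset.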

ANALYSIS

The mail archives include several attempts to arrive at a correct offset
given a timezone and a date and time for that timezone. Thomas Munro
(10/05/2009) proposed a solution, which also appears in a patch from
Antoni Piotr Oleksicki (28/07/2009):

(let ((timestamp (or into (make-timestamp))))
  (loop for subtimezone across (timezone-subzones timezone)
        do (encode-timestamp nsec sec minute hour day month year
                             :offset (subzone-offset subtimezone)
                             :into timestamp)
           (if (= (timestamp-subtimezone timestamp timezone)
                  (subzone-offset subtimezone))
               (return timestamp))
        finally (error "The requested local time is not valid")))

This has three apparent drawbacks:
- inefficient: the transition table is used after guessing
- inefficient: too much consing, too many timestamp to unix-time
  conversions (involving bignums), ...
- unfriendly: no useful fallback when the requested time is invalid

The 1.0.2 code, in turn, has %guess-offset. As stated above, the
unix-time is created without offset, whereas the offset of the timezone
should be applied when converting to unix-time. I modified
%guess-offset so that it searches for the transition boundary a second
time after applying the offset, more or less like this:

(if (zerop offset)
    offset
    (let ((transition-position-better
            (transition-position (- unix-time offset)
                                 (timezone-transitions zone))))
      (if (eql transition-position transition-position-better)
          offset
          (subzone-offset (elt timezone-subzones
                               (elt timezone-indexes transition-position-better))))))

This works for me. Before making a patch, the following should be
considered:
- is this compatible with the spirit of local-time?
- are there situations where more iterations are required?
- what to do when both positions are wrong, like T02:30:00 at the
  forward transition from CET to CEST? Suggestion: assume that the
  requestor forgot to change the clock, so use the subzone after the
  transition.
- what to do if both apply, like for T02:30:00 at the backward
  transition from CEST to CET? Suggestion: assume the new time.
- better use of the ordered transition table structure: the second
  lookup does not have to search again, it only has to check the
  transition before or after transition-position (depending on the sign
  of the offset)
- what is the behaviour in other Olson based implementations (POSIX,
  Java)?
- the improvement is based on what zic has done for us and not on a
  real understanding of the zone rules

Thanks for your feedback!

Bye,
Siebe