Re: [Datetime-SIG] Are there any "correct" implementations of tzinfo?

2015-09-12 Thread Alexander Belopolsky
On Sat, Sep 12, 2015 at 3:41 PM, Tim Peters  wrote:

> > If there are not, maybe the intended semantics should go
> > by the wayside and be replaced by what pytz does.
>
> Changing anything about default arithmetic behavior is not a
> possibility.  This has been beaten to death multiple times on this
> mailing list already, and I'm not volunteering for another round of it
> ;-)


Tim and Guido only grudgingly accept it, but datetime already gives you
"the pytz way" and PEP 495 makes a small improvement to it.  The
localize/normalize functionality is provided by the .astimezone() method
which when called without arguments will attach an appropriate fixed offset
timezone to a datetime object.   You can then add timedeltas to the result
and stay within a "fictitious" fixed offset timezone that extends
indefinitely in both directions.  To get back to the actual civil time,
you call .astimezone() again.  This gives you what we call here a
"timeline" arithmetic and occasionally it is preferable to doing arithmetic
in UTC.  (Effectively you do arithmetic in local standard time instead of
UTC.)   Using a fixed offset timezone other than UTC for timeline
arithmetic is preferable in timezones that are far enough from UTC that
business hours straddle UTC midnight.
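
A minimal sketch of the localize / add / normalize cycle described above,
assuming a Python where .astimezone() can be called without arguments (3.3
or later).  The starting instant and the 180-day step are made up for
illustration, and the offsets printed depend on the system time zone.

```python
from datetime import datetime, timedelta, timezone

# Start from a fixed UTC instant so the example does not depend on the clock.
start = datetime(2015, 9, 12, 19, 41, tzinfo=timezone.utc)

# "localize": with no argument, astimezone() attaches a fixed-offset
# datetime.timezone describing the system zone's offset at that instant.
local = start.astimezone()
print(type(local.tzinfo).__name__, local.utcoffset())

# Arithmetic stays inside that fictitious fixed-offset zone, so the offset
# cannot change even if a DST transition lies in between.
later = local + timedelta(days=180)
print(later.utcoffset() == local.utcoffset())

# "normalize": a second astimezone() call returns to actual civil time.
# The instant is unchanged; only the attached offset may differ.
civil = later.astimezone()
print(civil == later, civil.utcoffset())
```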
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [Datetime-SIG] Are there any "correct" implementations of tzinfo?

2015-09-12 Thread Alexander Belopolsky
On Sat, Sep 12, 2015 at 4:10 PM, Tim Peters  wrote:

> "A potential problem" with .astimezone()'s default is that it _does_
> create a fixed-offset zone.  It's not at all obvious that it should do
> so.  First time I saw it, my initial _expectation_ was that it
> "obviously" created a hybrid tzinfo reflecting the system zone's
> actual daylight rules, as various "tzlocal" implementations outside of
> Python do.
>

The clue should have been that .astimezone() is an instance method, and you
don't need to know the time to create a hybrid tzinfo.  If a Local tzinfo was
available, it could just be passed to the .astimezone() method as an
argument.  You would not need .astimezone() to both create a tzinfo and
convert the datetime instance to it.

Still, I agree that this was a hack and a very similar hack to the one
implemented by pytz.  Hopefully, once PEP 495 is implemented, we will
soon see "as intended" tzinfos become more popular.


Re: [Datetime-SIG] Are there any "correct" implementations of tzinfo?

2015-09-13 Thread Alexander Belopolsky
On Sat, Sep 12, 2015 at 9:58 PM, Tim Peters  wrote:

> > That's why I believe PEP 495 followed by the implementation
> > of fold-aware "as intended" tzinfos (either within stdlib or by third
> > parties) is the right approach.
>
> Me too - except I think acceptance of 495 should be contingent upon
> someone first completing a fully functional (if not releasable)
> fold-aware zoneinfo wrapping.


Good idea.  How far are you from completing that?


>   Details have a way of surprising, and
> we should learn from the last time we released a tzinfo spec in the
> absence of any industrial-strength wrappings using it.


I completely agree.  That's why I am adding test cases like Lord Howe
Island and Vilnius to datetimetester.

I will try to create a  zoneinfo wrapping prototype as well, but I will
probably "cheat" and build it on top of pytz.


Re: [Datetime-SIG] Are there any "correct" implementations of tzinfo?

2015-09-13 Thread Alexander Belopolsky
On Sat, Sep 12, 2015 at 6:24 PM, Guido van Rossum  wrote:

> The repeated claims (by Alexander?) that astimezone() has the power of
> pytz's localize() need to stop.


Prove me wrong! :-)


> Those pytz methods work for any (pytz) timezone -- astimezone() with a
> default argument only works for the local time zone.


That's what os.environ['TZ'] = zonename is for.  The  astimezone() method
works for every timezone installed on your system.  Try it - you won't even
need to call time.tzset()!


> (And indeed what it does is surprising, except perhaps to pytz users.)


That I agree with.  Which makes it even more surprising that I often find
myself and pytz advocates on the opposite sides of the fence.

Granted, setting TZ is a silly trick, but one simple way to bring a full TZ
database to Python is to allow .astimezone() to take a zonename string like
'Europe/Amsterdam' or 'America/Montevideo' as an argument and act as
os.environ['TZ'] = zonename; t.astimezone() does now, but without messing
with global state.

I made this suggestion before, but I find it inferior to "as intended"
tzinfos.

The only real claim that I am making is that fictitious fixed offset
timezones are useful and we already have some support for them in stdlib.
The datetime.timezone instances that .astimezone() attaches as tzinfo are
not that different from the instances that are attached by pytz's localize
and normalize methods.

In fact, the only major difference between datetime.timezone instances and
those used by pytz is that pytz's EST and EDT instances know that they come
from America/New_York, while datetime.timezone instances don't.  That's why
once you specify America/New_York in localize, your tzinfo.normalize knows
it implicitly, while in the extended .astimezone() solution you will have
to specify it again.  This is not a problem when you only support one local
timezone, but comes with a different set of tradeoffs when you have
multiple timezones.

One advantage of not carrying the memory of the parent zoneinfo in the
fixed offset tzinfo instance is that pickling of datetime objects and their
interchange between different systems becomes simpler.  A pickle of a
datetime.timezone instance is trivial - same as that of a tuple of
timedelta and a short string, but if your fixed offset tzinfo carries a
reference to a potentially large zoneinfo structure, you get all kinds of
interesting problems when you share them between systems that have
different TZ databases.
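
To illustrate the pickling point, here is a small round trip of a datetime
carrying a datetime.timezone instance.  The offset, the name "EST" and the
date are arbitrary choices for the example; nothing here touches a TZ
database.

```python
import pickle
from datetime import datetime, timedelta, timezone

# A fixed-offset tzinfo is fully described by a timedelta and a short name.
est = timezone(timedelta(hours=-5), "EST")
dt = datetime(2015, 9, 13, 12, 0, tzinfo=est)

# The pickle carries the offset and the name, nothing else, so the
# receiving system needs no zoneinfo files to restore it.
payload = pickle.dumps(dt)
restored = pickle.loads(payload)

print(restored == dt)
print(restored.utcoffset(), restored.tzname())
```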

In any case, there are three approaches to designing a TZ database
interface in the datetime module: the "as intended" approach, the pytz
approach and the astimezone(zonename:str) approach.  The last two don't
require a fold attribute to disambiguate end-of-dst times and the first one
does.  With respect to arithmetic, the last two approaches are equivalent:
both timeline and classic arithmetics are possible, but neither is
painless.  The "as intended" approach comes with classic arithmetic that
"just works" and encourages the best practice for timeline arithmetic: do
it in UTC.  That's why I believe PEP 495 followed by the implementation of
fold-aware "as intended" tzinfos (either within stdlib or by third parties)
is the right approach.


Re: [Datetime-SIG] Are there any "correct" implementations of tzinfo?

2015-09-13 Thread Alexander Belopolsky
On Sat, Sep 12, 2015 at 10:25 PM, Tim Peters  wrote:

> > I will try to create a  zoneinfo wrapping prototype as well, but I will
> > probably "cheat" and build it on top of pytz.
>
> It would be crazy not to ;-)  Note that Stuart got to punt on "the
> hard part":  .utcoffset(), since pytz only uses fixed-offset classes.
> For a prototype - and possibly forever after - I'd be inclined to
> create an exhaustive list of transition times in local time, parallel
> to the list of such times already there in UTC.


Yes.  The only complication is that you need four transition points instead
of two per year in a regular DST case: (1) start of gap; (2) end of gap;
(3) start of fold; and (4) end of fold.  Once you know where you are with
respect to those points, figuring out utcoffset(), dst() and tzname() for
either value of fold is trivial.
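
A rough sketch of the four-point lookup, assuming a Python with the PEP 495
fold attribute (3.6 or later).  The class name Eastern2015 and the
hard-coded 2015 US Eastern transition points are illustrative assumptions;
a real wrapping would read the points from zoneinfo data and cover every
year.

```python
from datetime import datetime, timedelta, tzinfo

STD = timedelta(hours=-5)
DST = timedelta(hours=-4)

# Four transition points in local time for 2015 (hard-coded for illustration).
GAP_START = datetime(2015, 3, 8, 2)    # (1) start of gap
GAP_END = datetime(2015, 3, 8, 3)      # (2) end of gap
FOLD_START = datetime(2015, 11, 1, 1)  # (3) start of fold
FOLD_END = datetime(2015, 11, 1, 2)    # (4) end of fold


class Eastern2015(tzinfo):
    """Hypothetical fold-aware tzinfo, only meaningful for dates in 2015."""

    def utcoffset(self, dt):
        naive = dt.replace(tzinfo=None)
        if naive < GAP_START:
            return STD
        if naive < GAP_END:
            # Missing time: fold=0 uses the offset in effect before the
            # transition, fold=1 the offset in effect after it.
            return DST if dt.fold else STD
        if naive < FOLD_START:
            return DST
        if naive < FOLD_END:
            # Ambiguous time: fold=0 is the first pass (DST), fold=1 the second.
            return STD if dt.fold else DST
        return STD

    def dst(self, dt):
        return self.utcoffset(dt) - STD

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


tz = Eastern2015()
first = datetime(2015, 11, 1, 1, 30, tzinfo=tz)
second = datetime(2015, 11, 1, 1, 30, fold=1, tzinfo=tz)
print(first.tzname(), second.tzname())
```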


>   An index into either
> list then gives an index into the other, and into the list of
> information about the transition (total offset, is_dst, etc).


Right.

It's a shame though to work from a list of transitions in UTC, because most
DST rules are expressed in local times and then laboriously converted into
UTC.  I think I should also implement the POSIX TZ spec tzinfo.  This is
where the advantage of the "as intended" approach will be obvious.
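
As an illustration of a rule expressed in local time, here is a sketch that
turns a POSIX TZ "Mm.w.d" rule into a local wall-clock transition time.  The
function name posix_m_rule and its signature are invented for this example,
and the default of 02:00 mirrors the POSIX default transition time.

```python
from datetime import date, datetime


def posix_m_rule(year, month, week, weekday, hour=2):
    """Local time of POSIX rule M<month>.<week>.<weekday>/<hour>.

    weekday follows POSIX (0 = Sunday); week 5 means "last such weekday".
    """
    first = date(year, month, 1)
    # date.weekday() counts Monday as 0; shift so that Sunday is 0.
    first_posix = (first.weekday() + 1) % 7
    day = 1 + (weekday - first_posix) % 7 + 7 * (week - 1)
    # Week 5 may overshoot the end of the month; step back whole weeks.
    next_month = date(year + month // 12, month % 12 + 1, 1)
    days_in_month = (next_month - first).days
    while day > days_in_month:
        day -= 7
    return datetime(year, month, day, hour)


# US rules in 2015: M3.2.0 (second Sunday in March), M11.1.0 (first in November).
print(posix_m_rule(2015, 3, 2, 0))
print(posix_m_rule(2015, 11, 1, 0))
```

No conversion to UTC is needed at any point, which is the convenience the
"as intended" approach is after.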


Re: [Datetime-SIG] Are there any "correct" implementations of tzinfo?

2015-09-14 Thread Alexander Belopolsky
On Sun, Sep 13, 2015 at 6:21 PM, Guido van Rossum  wrote:
>
> Now, the question may remain how do people know what to set their
> timezone to. But neither pytz nor datetime can help with that -- it is up
> to the sysadmin.


Note that this question is also out of the scope of "tzdist", IETF Time
Zone Data Distribution Service Working Group:

"""
The following are Out of scope for the working group:
...
- Lookup protocols or APIs to map a location to a time zone.
""" 

I am not aware of any effort to develop such a service. On the other hand,
stationary ISPs have means to distribute TZ information to the hosts.  See
for example, RFC 4833 ("Timezone Options for DHCP").


Re: [Datetime-SIG] Are there any "correct" implementations of tzinfo?

2015-09-14 Thread Alexander Belopolsky
On Mon, Sep 14, 2015 at 3:30 PM, Tim Peters  wrote:

> > make it much cheaper to maintain global invariants like a sort order
> > according to the UTC value
>
> It would be nice to have!  .utcoffset() is an expensive operation
> as-is, and being able to rely on tm_gmtoff would make that dirt-cheap
> instead.


If it is just a question of optimization, datetime objects can be extended
to cache utcoffset.  Note that PyPy has recently added caching of the hash
values in datetime objects.  I merged their changes in our datetime.py, but
it did not look like the C implementation would benefit from it as much as
the pure Python one did.  I expect something similar from caching utcoffset: a
measurable improvement for tzinfos implemented in Python and a wash for
those implemented in C.  (A more promising optimization approach is to
define a C API for tzinfo interface.)
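
A sketch of what caching utcoffset on the datetime object could look like,
done here in a subclass rather than in datetime itself.  CountingTZ and
CachedDatetime are hypothetical names for this example.  Note that, as far
as I can tell, the C implementation's internal operations (comparison,
astimezone) call the tzinfo directly and would bypass an override like this
one, so the sketch only shows the idea, not a real speedup.

```python
from datetime import datetime, timedelta, tzinfo


class CountingTZ(tzinfo):
    """Hypothetical fixed-offset tzinfo that counts utcoffset() calls."""

    calls = 0

    def utcoffset(self, dt):
        CountingTZ.calls += 1
        return timedelta(hours=-5)

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return "EST"


class CachedDatetime(datetime):
    """Sketch: remember the first utcoffset() result on the instance.

    datetime objects are immutable, so the cached value cannot go stale.
    """

    def utcoffset(self):
        try:
            return self._offset_cache
        except AttributeError:
            self._offset_cache = super().utcoffset()
            return self._offset_cache


t = CachedDatetime(2015, 9, 14, 15, 30, tzinfo=CountingTZ())
first = t.utcoffset()
second = t.utcoffset()  # served from the cache, tzinfo is not consulted
print(first, second, CountingTZ.calls)
```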


Re: [Datetime-SIG] Are there any "correct" implementations of tzinfo?

2015-09-14 Thread Alexander Belopolsky
On Mon, Sep 14, 2015 at 3:13 PM, Random832  wrote:

> (No, I don't
> *care* how that's not how it's defined, it is *in fact* true for the UTC
> value that you will ever actually get from converting the values to UTC
> *today*, and it's the only total ordering that actually makes any sense)
>

This is a fine attitude when you implement your own brand new datetime
library.  As the author of a new library you have freedoms that developers
of 12-year-old, widely deployed code don't have.


Re: [Datetime-SIG] Are there any "correct" implementations of tzinfo?

2015-09-14 Thread Alexander Belopolsky
On Mon, Sep 14, 2015 at 4:08 PM, Random832  wrote:

> On Mon, Sep 14, 2015, at 15:48, Alexander Belopolsky wrote:
> > On Mon, Sep 14, 2015 at 3:44 PM, Random832 
> > wrote:
> >
> > > It is an
> > > invariant that is true today, and therefore which you can't rely on any
> > > of the consumers of this 12 years old widely deployed code not to
> assume
> > > will remain true.
> > >
> >
> > Sorry, this sentence does not parse.  You are missing a "not" somewhere.
>
> Nope. I am asserting that:
>
> This invariant is true today.
>

You've never specified "this invariant", but I'll assume you are talking
about "a < b implies a.astimezone(UTC) < b.astimezone(UTC)."  This is *not*
true today:

>>> from datetime import *
>>> from datetimetester import Eastern
>>> UTC = timezone.utc
>>> a = datetime(2002, 4, 7, 1, 40, tzinfo=Eastern)
>>> b = datetime(2002, 4, 7, 2, 20, tzinfo=Eastern)
>>> a < b
True
>>> a.astimezone(UTC) < b.astimezone(UTC)
False
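
For readers without CPython's test suite at hand, here is a self-contained
sketch of the same session.  Eastern2002 is a hypothetical stand-in for
datetimetester's Eastern: a classic (pre-PEP 495) tzinfo with the 2002 US
rules hard-coded, so it is only meaningful for dates in that year.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class Eastern2002(tzinfo):
    """Hypothetical classic tzinfo for US Eastern, valid for 2002 only."""

    DST_START = datetime(2002, 4, 7, 2)   # first Sunday in April, 02:00
    DST_END = datetime(2002, 10, 27, 1)   # last Sunday in October; the
                                          # repeated hour counts as standard

    def dst(self, dt):
        naive = dt.replace(tzinfo=None)
        if self.DST_START <= naive < self.DST_END:
            return timedelta(hours=1)
        return timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


Eastern = Eastern2002()
UTC = timezone.utc
a = datetime(2002, 4, 7, 1, 40, tzinfo=Eastern)  # 01:40 EST
b = datetime(2002, 4, 7, 2, 20, tzinfo=Eastern)  # falls in the spring gap

# Same tzinfo on both sides: the naive wall-clock values are compared.
print(a < b)
# 01:40 EST maps to 06:40 UTC; 02:20 is treated as EDT and maps to 06:20 UTC.
print(a.astimezone(UTC) < b.astimezone(UTC))
```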



> Therefore, it is likely that at least some consumers of datetime will
> assume it is true.
>

Obviously, if Random832 is a real person, the last statement is true.  This
does not make the assumption true, just proves that at least one user is
confused about the current behavior. :-)


> Therefore, you cannot rely on there not being any consumers which assume
> it will remain true.
>

That's where we are now.  Some users make baseless assumptions.  This will
probably remain true. :-(


> It's awkward, since when I go back to analyze it it turns out that the
> "not" after 'code' actually technically modifies "any" earlier in the
> sentence, but the number of negatives is correct.


Writing in shorter sentences may help.


> (Though, it actually
> works out even without that change, since the question of *which*
> consumers rely on the invariant is unknown.)
>

True.  We will never know how many users rely on false assumptions.


Re: [Datetime-SIG] Are there any "correct" implementations of tzinfo?

2015-09-14 Thread Alexander Belopolsky
On Mon, Sep 14, 2015 at 3:44 PM, Random832  wrote:

> It is an
> invariant that is true today, and therefore which you can't rely on any
> of the consumers of this 12 years old widely deployed code not to assume
> will remain true.
>

Sorry, this sentence does not parse.  You are missing a "not" somewhere.


Re: [Datetime-SIG] Are there any "correct" implementations of tzinfo?

2015-09-14 Thread Alexander Belopolsky
On Mon, Sep 14, 2015 at 4:22 PM, Tim Peters  wrote:

> > faster
> > than CPython can look up the .utcoffset method. (At least for times
> > within a few years around now.) A programmer who makes it slower should
> > be fired.
>
> So any programmer who implements .utcoffset() in Python should be
> fired?  That's the only way I can read that.


No, no!  I've already conceded that caching UTC offset will probably help
pure Python implementations.  PyPy folks have established this fact for
hash and I am willing to extrapolate their results to UTC offset.  I am
only trying to say that if we decide to bring a fast TZ database to
CPython, the pure Python tzinfo interface will likely become our main
bottleneck, not the speed with which C code can compute the offset value.


Re: [Datetime-SIG] Are there any "correct" implementations of tzinfo?

2015-09-14 Thread Alexander Belopolsky
On Mon, Sep 14, 2015 at 3:49 PM, Tim Peters  wrote:

> It depends on how expensive .utcoffset()
> is, which in turn depends on how the tzinfo author implements it.
>

No, it does not.  In most time zones, UTC offset in seconds can be computed
by C code as a 4-byte integer faster than CPython can look up the
.utcoffset method. (At least for times within a few years around now.) A
programmer who makes it slower should be fired.  Yet I agree, "'premature
optimization' applies at this time."


Adding PEP 495 support to dateutil

2015-09-16 Thread Alexander Belopolsky
On Sat, Sep 12, 2015 at 9:58 PM, Tim Peters  wrote:

> I think acceptance of 495 should be contingent upon
> someone first completing a fully functional (if not releasable)
> fold-aware zoneinfo wrapping.
>

After studying both pytz and dateutil offerings, I decided that it is
easier to add "fold-awareness" to the latter.  I created a fork [1] on
GitHub and added [2] fold-awareness logic to the tzrange class that appears
to be the base class for most other tzinfo implementations.  I was
surprised how few test cases had to be changed.  It looks like the dateutil
test suite does not test questionable (in the absence of fold) behavior.  I
will need to beef up the test coverage.

I am making all development public early on and hope to see code reviews
and pull requests from interested parties.  Pull requests with additional
test cases are most welcome.

[1]: https://github.com/abalkin/dateutil/tree/pep-0495
[2]:
https://github.com/abalkin/dateutil/commit/57ecdbf481de7e21335ece8fcc5673d59252ec3f


Re: Adding PEP 495 support to dateutil

2015-09-19 Thread Alexander Belopolsky
[Tim Peters]
>
> I think acceptance of 495 should be contingent upon
> someone first completing a fully functional (if not releasable)
> fold-aware zoneinfo wrapping.


[Alexander Belopolsky]
>
> I am making all development public early on and hope to see code reviews
> and pull requests from interested parties.  Pull requests with additional
> test cases are most welcome.


I've made some additional progress in my dateutil fork [1].  The tzfile
class is now fold-aware.  The tzfile implementation of tzinfo takes the
history of local time type changes from a binary zoneinfo file. These files
are installed on the majority of UNIX platforms.

More testing is needed, but I think my fork is now close to meeting Tim's
challenge.

Please note that you need to run the modified dateutil fork [1] code under
the PEP 495 fork of CPython [2].

[1]: https://github.com/abalkin/dateutil/tree/pep-0495
[2]: https://github.com/abalkin/cpython