Hello,
This week, I've finished the work on serialization by making the deserializers
capable of handling UTC offsets. I had to rewrite DateTimeField.to_python to
extract and interpret timezone offsets. Still, deserialization of aware
datetimes doesn't work with PyYAML: http://pyyaml.org/ticket/202
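As a rough illustration of what extracting and interpreting a UTC offset involves, here is a minimal sketch in modern Python. It is not the code from the branch: `DATETIME_RE` and `parse_datetime_with_offset` are hypothetical names, and the accepted formats (a 'T' or space separator, an optional fraction, and an optional `Z`, `+HH:MM` or `+HHMM` suffix) are assumptions.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical pattern: date, 'T' or space, time, optional fraction,
# optional UTC offset ("Z", "+HH:MM" or "+HHMM").
DATETIME_RE = re.compile(
    r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})"
    r"[T ](?P<hour>\d{2}):(?P<minute>\d{2}):(?P<second>\d{2})"
    r"(?:\.(?P<microsecond>\d{1,6}))?"
    r"(?P<tz>Z|[+-]\d{2}:?\d{2})?$"
)


def parse_datetime_with_offset(value):
    """Return an aware datetime if an offset is present, else a naive one."""
    match = DATETIME_RE.match(value)
    if match is None:
        raise ValueError("unsupported datetime format: %r" % value)
    parts = match.groupdict()
    tz = parts.pop("tz")
    micro = parts.pop("microsecond") or "0"
    fields = {name: int(text) for name, text in parts.items()}
    # Right-pad the fraction so that ".5" means 500000 microseconds.
    fields["microsecond"] = int(micro.ljust(6, "0"))
    if tz is None:
        tzinfo = None
    elif tz == "Z":
        tzinfo = timezone.utc
    else:
        sign = -1 if tz[0] == "-" else 1
        hours, minutes = int(tz[1:3]), int(tz[-2:])
        tzinfo = timezone(sign * timedelta(hours=hours, minutes=minutes))
    return datetime(tzinfo=tzinfo, **fields)
```

Strings without an offset still yield naive datetimes here, so older fixtures written without timezone information would keep parsing.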
I also implemented the storage and retrieval of aware datetime objects in
PostgreSQL, MySQL and Oracle. Conversions happen:
- on storage, in `connection.ops.value_to_db_datetime`, called from
`get_db_prep_value`;
- on retrieval, in the database adapter's conversion functions.
The code is rather straightforward. When USE_TZ is True, naive datetimes are
interpreted as local time in TIME_ZONE, for backwards compatibility with
existing applications.
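The two conversion points can be sketched as follows for a backend that stores datetimes without a timezone. This is only an illustration under stated assumptions: `to_db_value` and `from_db_value` are hypothetical stand-ins for the real hooks, and a fixed UTC+2 offset stands in for TIME_ZONE (a real time zone has DST transitions, which a fixed offset cannot model).

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC+2 offset stands in for settings.TIME_ZONE.
LOCAL_TZ = timezone(timedelta(hours=2))


def to_db_value(value, use_tz=True):
    """Hypothetical storage step: return a naive datetime expressed in UTC."""
    if not use_tz:
        return value
    if value.tzinfo is None:
        # Naive input is interpreted as local time, for backwards compatibility.
        value = value.replace(tzinfo=LOCAL_TZ)
    return value.astimezone(timezone.utc).replace(tzinfo=None)


def from_db_value(value, use_tz=True):
    """Hypothetical retrieval step: mark the stored naive value as UTC."""
    if not use_tz:
        return value
    return value.replace(tzinfo=timezone.utc)


stored = to_db_value(datetime(2011, 10, 1, 15, 24))  # naive, local time
loaded = from_db_value(stored)                       # aware, UTC
```

With these assumptions, a naive 15:24 local value is stored as 13:24 and comes back as an aware datetime in UTC.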
SQLite is trickier because it uses
`django.db.backends.util.typecast_timestamp` to convert strings to datetimes.
However:
- this function is used elsewhere in Django, in combination with the
`needs_datetime_string_cast` flag;
- it performs essentially the same operations as
`DateTimeField.to_python`.
I'll review the history of this code, and I'll try to refactor it.
Besides adding support for SQLite, I still have to:
- check that datetimes behave correctly when they're used as query
arguments, in aggregation functions, etc.
- optimize django.utils.tzinfo: fix #16899, use pytz.utc as the UTC
timezone class when pytz is available, etc.
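The "use pytz.utc when pytz is available" item could look roughly like the sketch below. It is an assumption about the shape of the fallback, not the contents of django.utils.tzinfo; the `UTC` class and the module-level name `utc` are hypothetical.

```python
from datetime import datetime, timedelta, tzinfo

ZERO = timedelta(0)


class UTC(tzinfo):
    """Minimal UTC tzinfo, used only when pytz cannot be imported."""

    def utcoffset(self, dt):
        return ZERO

    def tzname(self, dt):
        return "UTC"

    def dst(self, dt):
        return ZERO


try:
    import pytz
except ImportError:
    # pytz is not installed: fall back to the minimal class above.
    utc = UTC()
else:
    utc = pytz.utc

now_utc = datetime(2011, 10, 1, 13, 24, tzinfo=utc)
```

Either way, callers only depend on the `utc` name, so the choice of implementation stays in one place.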
I won't have much time for this project next week. See you in two weeks for the
next check-in!
Best regards,
--
Aymeric Augustin.
On 24 sept. 2011, at 15:24, Aymeric Augustin wrote:
> Hello,
>
> This week, I've been working on a related topic that I had missed entirely in
> my initial proposal: serialization.
>
> Developers will obtain aware datetimes from Django when USE_TZ = True. We
> must ensure that they serialize correctly.
>
> Currently, the serialization code isn't very consistent with datetimes:
> - JSON: the serializer uses the '%Y-%m-%d %H:%M:%S' format, losing
> microseconds and timezone information. This dates back to the initial commit
> at r3237. See also #10201.
> - XML: the serializer delegates to DateTimeField.value_to_string, which
> also uses the '%Y-%m-%d %H:%M:%S' format.
> - YAML: the serializer handles datetimes natively, and it includes
> microseconds and UTC offset in the output.
>
> I've hesitated between converting datetimes to UTC or rendering them as-is
> with a UTC offset. The former would be more consistent with the database and
> it's recommended in YAML. But the latter avoids modifying the data: not only
> is it faster, but it's also more predictable. Serialization isn't just about
> storing the data for further retrieval, it can be used to print arbitrary
> data in a different format. Finally, when the data comes straight from the
> database (the common case), it will be in UTC anyway.
>
> In the end, I've decided to serialize aware datetimes without conversion. The
> implementation is here:
> https://bitbucket.org/aaugustin/django/compare/..django/django
>
> Here are the new serialization formats for datetimes:
> - JSON: as described in the specification at
> http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf,
> section 15.9.1.15, Date Time String Format.
> - XML: as produced by datetime.isoformat(), ISO8601.
> - YAML: unchanged, compatible with http://yaml.org/type/timestamp.html
> — the canonical representation uses 'T' as separator and is in UTC, but it's
> also acceptable to use a space and include an offset like pyyaml does.
> These formats follow the best practices described in
> http://www.w3.org/TR/NOTE-datetime.
>
> This fix is backwards-incompatible for the JSON and XML serializers: it
> includes fractional seconds and timezone information, and it uses the
> normalized separator, 'T', between the date and time parts. However, I've
> made sure that existing fixtures will load properly with the new code. I'll
> mention all this in the release notes.
>
> Unrelatedly, I have switched the SQLite backend to supports_timezones =
> False, because it really doesn't make sense to write the UTC offset but
> ignore it when reading back the data.
>
> Best regards,
>
> --
> Aymeric Augustin.
>
> On 17 sept. 2011, at 09:59, Aymeric Augustin wrote:
>
>> Hello,
>>
>> This week, I've gathered all the information I need about how the database
>> engines and adapters supported by Django handle datetime objects. I'm
>> attaching my findings.
>>
>> The good news is that the database representations currently used by Django
>> are already optimal for my proposal. I'll store data in UTC:
>> - with an explicit timezone on PostgreSQL,
>> - without timezone on SQLite and MySQL because the database engine doesn't
>> support it,
>> - without timezone on Oracle because the database adapter doesn't support it.
>>
>>
>> Currently, Django sets the "supports_timezones" feature to True for SQLite.
>> I'm skeptical about this choice. Indeed, the time zone is stored: SQLite
>> just saves the output of "<datetime>.isoformat()", which includes the UTC
>> offset for aware datetime objects. However, the timezone information is
>> ignored when reading the data back from the database, thus yielding
>> incorrect data when it's different from the local time defined by
>> settings.TIME_ZONE.
>>
>> As far as I can tell, the "supports_timezones" and the
>> "needs_datetime_string_cast" database features are incompatible, at least
>> with the current implementation of "typecast_timestamp". There's a comment
>> about this problem that dates back to the merge of magic-removal, possibly
>> before:
>> https://code.djangoproject.com/browser/django/trunk/django/db/backends/util.py?annotate=blame#L79
>>
>> SQLite is the only engine that has these two flags set to True. I think
>> "supports_timezones" should be False. Does anyone know why it's True? Is it
>> just a historical artifact?
>>
>>
>> Finally, I have read the document that describes "to_python",
>> "value_to_string", and r"get_(db_)?prep_(value|save|lookup)". The next step
>> is to adjust these functions in DateTimeField, depending on the value of
>> settings.USE_TZ.
>>
>> Best regards,
>>
>> --
>> Aymeric Augustin.
>>
>> <DATABASE-NOTES.html>
>>
>> On 11 sept. 2011, at 23:18, Aymeric Augustin wrote:
>>
>>> Hello,
>>>
>>> Given the positive feedback received here and on IRC, I've started the
>>> implementation.
>>>
>>> Being most familiar with mercurial, I've forked the Bitbucket mirror. This
>>> page compares my branch to trunk:
>>> https://bitbucket.org/aaugustin/django/compare/..django/django
>>>
>>> I've read a lot of code in django.db, and also the documentation of
>>> PostgreSQL, MySQL and SQLite regarding date/time types.
>>>
>>> I've written some tests that validate the current behavior of Django. Their
>>> goal is to guarantee backwards-compatibility when USE_TZ = False.
>>>
>>> At first they failed because runtests.py doesn't set os.environ['TZ'] and
>>> doesn't call time.tzset(), so the tests ran with my system local time. I
>>> fixed that in setUp and tearDown. Maybe we should call them in runtests.py
>>> too for consistency?
>>>
>>> By the way, since everything is supposed to be in UTC internally when
>>> USE_TZ is True, it is theoretically possible to get rid of
>>> os.environ['TZ'] and time.tzset(). They are only useful to make
>>> timezone-dependent functions
>>> respect the TIME_ZONE setting. However, for backwards compatibility (in
>>> particular with third-party apps), it's better to keep them and interpret
>>> naive datetimes in the timezone defined by settings.TIME_ZONE (instead of
>>> rejecting them outright). For this reason, I've decided to keep
>>> os.environ['TZ'] and time.tzset() even when USE_TZ is True.
>>>
>>> Best regards,
>>>
>>> --
>>> Aymeric Augustin.
>>>
>>>
>>> On 3 sept. 2011, at 17:40, Aymeric Augustin wrote:
>>>
>>>> Hello,
>>>>
>>>> The GSoC proposal "Multiple timezone support for datetime representation"
>>>> wasn't picked up in 2011 and 2010. Although I'm not a student and the
>>>> summer is over, I'd like to tackle this problem, and I would appreciate it
>>>> very much if a core developer agreed to mentor me during this work,
>>>> GSoC-style.
>>>>
>>>> Here is my proposal, following the GSoC guidelines. I apologize for the
>>>> wall of text; this has been discussed many times in the past 4 years and
>>>> I've tried to address as many concerns and objections as possible.
>>>>
>>>> Definition of success
>>>> ---------------------
>>>>
>>>> The goal is to resolve ticket #2626 in Django 1.4 or 1.5 (depending on
>>>> when 1.4 is released).
>>>>
>>>> Design specification
>>>> --------------------
>>>>
>>>> Some background on timezones in Django and Python
>>>> .................................................
>>>>
>>>> Currently, Django stores datetime objects in local time in the database,
>>>> local time being defined by the TIME_ZONE setting. It retrieves them as
>>>> naive datetime objects. As a consequence, developers work with naive
>>>> datetime objects in local time.
>>>>
>>>> This approach sort of works when all the users are in the same timezone
>>>> and don't care about data loss (inconsistencies) when DST kicks in or out.
>>>> Unfortunately, these assumptions aren't true for many Django projects: for
>>>> instance, one may want to log sessions (login/logout) for security
>>>> purposes: that's a 24/7 flow of important data. Read tickets #2626 and
>>>> #10587 for more details.
>>>>
>>>> Python's standard library provides limited support for timezones, but this
>>>> gap is filled by pytz <http://pytz.sourceforge.net/>. If you aren't
>>>> familiar with the topic, I strongly recommend reading this page before my
>>>> proposal. It explains the problems of working in local time and the
>>>> limitations of Python's APIs. It has a lot of examples, too.
>>>>
>>>> Django should use timezone-aware UTC datetimes internally
>>>> .........................................................
>>>>
>>>> Example: datetime.datetime(2011, 9, 23, 8, 34, 12, tzinfo=pytz.utc)
>>>>
>>>> In my opinion, the problem of local time is strikingly similar to the
>>>> problem of character encodings. Django uses only unicode internally and
>>>> converts at the borders (HTTP requests/responses and database). I propose
>>>> a similar solution: Django should always use UTC internally, and
>>>> conversion should happen at the borders, i.e. when rendering the templates
>>>> and processing POST data (in form fields/widgets). I'll discuss the
>>>> database in the next section.
>>>>
>>>> Quoting pytz' docs: "The preferred way of dealing with times is to always
>>>> work in UTC, converting to localtime only when generating output to be
>>>> read by humans." I think we can trust pytz' developers on this topic.
>>>>
>>>> Note that a timezone-aware UTC datetime is different from a naive
>>>> datetime. If we were using naive datetimes, and assuming we're using pytz,
>>>> a developer could write:
>>>>
>>>> mytimezone.localize(datetime_django_gave_me)
>>>>
>>>> which is incorrect, because it will interpret the naive datetime as local
>>>> time in "mytimezone". With a timezone-aware UTC datetime, this kind of
>>>> error can't happen, and the equivalent code is:
>>>>
>>>> datetime_django_gave_me.astimezone(mytimezone)
>>>>
>>>> Django should store datetimes in UTC in the database
>>>> ....................................................
>>>>
>>>> This horse has been beaten to death on this mailing-list so many times
>>>> that I'll keep the argumentation short. If Django handles everything as
>>>> UTC internally, it isn't useful to convert to anything else for storage,
>>>> and re-convert to UTC at retrieval.
>>>>
>>>> In order to make the database portable and interoperable:
>>>> - in databases that support timezones (at least PostgreSQL), the timezone
>>>> should be set to UTC, so that the data is unambiguous;
>>>> - in databases that don't (at least SQLite), storing data in UTC is the
>>>> most reasonable choice: if there's a "default timezone", that's UTC.
>>>>
>>>> I don't intend to change the storage format of datetimes. It has been
>>>> proposed on this mailing-list to store datetimes with original timezone
>>>> information. However, I suspect that in many cases, datetimes don't have a
>>>> significant "original timezone" by themselves. Furthermore, there are many
>>>> different ways to implement this outside of Django's core. One is to
>>>> store a local date + a local time + a place or timezone + is_dst flag and
>>>> skip datetime entirely. Another is to store a UTC datetime + a place or
>>>> timezone. In the end, since there's no obvious and consensual way to
>>>> implement this idea, I've chosen to exclude it from my proposal. See the
>>>> "Timezone-aware storage of DateTime" thread on this mailing list for a
>>>> long and inconclusive discussion of this idea.
>>>>
>>>> I'm expecting to take some flak because of this choice :) Indeed, if
>>>> you're writing a multi-timezone calendaring application, my work isn't
>>>> going to resolve all your problems — but it won't hurt either. It may even
>>>> provide a saner foundation to build upon. Once again, there's more than
>>>> one way to solve this problem, and I'm afraid that choosing one would
>>>> offend some people sufficiently to get the entire proposal rejected.
>>>>
>>>> Django should convert between UTC and local time in the templates and forms
>>>> ...........................................................................
>>>>
>>>> I regard the problem of local time (in which time zone is my user?) as
>>>> very similar to internationalization (which language does my user read?),
>>>> and even more to localization (in which country does my user live?),
>>>> because localization happens both on output and on input.
>>>>
>>>> I want controllable conversion to local time when rendering a datetime in
>>>> a template. I will introduce:
>>>> - a template tag, {% localtime on|off %}, that works exactly like {%
>>>> localize on|off %}; it will be available with {% load tz %};
>>>> - two template filters, {{ datetime|localtime }} and {{ datetime|utctime
>>>> }}, that work exactly like {{ value|localize }} and {{ value|unlocalize }}.
>>>>
>>>> I will convert datetimes to local time when rendering a DateTimeInput
>>>> widget, and also handle SplitDateTimeWidget and SplitHiddenDateTimeWidget
>>>> which are more complicated.
>>>>
>>>> Finally, I will convert datetimes entered by end-users in forms to UTC. I
>>>> can't think of cases where you'd want an interface in local time but user
>>>> input in UTC. As a consequence, I don't plan to introduce the equivalent
>>>> of the `localize` keyword argument in form fields, unless someone brings
>>>> up a sufficiently general use case.
>>>>
>>>> How to set each user's timezone
>>>> ...............................
>>>>
>>>> Internationalization and localization are based on the LANGUAGES setting.
>>>> There's a widely accepted standard to select automatically the proper
>>>> language and country, the Accept-Language header.
>>>>
>>>> Unfortunately, some countries like the USA have more than one timezone, so
>>>> country information isn't enough to select a timezone. To the best of my
>>>> knowledge, there isn't a widely accepted way to determine the timezones of
>>>> the end users on the web.
>>>>
>>>> I intend to use the TIME_ZONE setting by default and to provide an
>>>> equivalent of `translation.activate()` for setting the timezone. With this
>>>> feature, developers can implement their own middleware to set the timezone
>>>> for each user, for instance they may want to use
>>>> <http://pytz.sourceforge.net/#country-information>.
>>>>
>>>> This means I'll have to introduce another thread local. I know this is
>>>> frowned upon. I'd be very interested if someone has a better idea.
>>>>
>>>> It might be no longer necessary to set os.environ['TZ'] and run
>>>> time.tzset() at all. That would avoid a number of problems and make
>>>> Windows as well supported as Unix-based OSes — there's a bunch of tickets
>>>> in Trac about this.
>>>>
>>>> I'm less familiar with this part of the project and I'm interested in
>>>> advice about how to implement it properly.
>>>>
>>>> Backwards compatibility
>>>> .......................
>>>>
>>>> Most previous attempts to resolve this issue have stumbled upon this problem.
>>>>
>>>> I propose to introduce a USE_TZ setting (yes, I know, yet another
>>>> setting) that works exactly like USE_L10N. If set to False, the default,
>>>> you will get the legacy (current) behavior. Thus, existing websites won't
>>>> be affected. If set to True, you will get the new behavior described above.
>>>>
>>>> I will also explain in the release notes how to migrate a database — which
>>>> means shifting all datetimes to UTC. I will attempt to develop a script to
>>>> automate this task.
>>>>
>>>> Dependency on pytz
>>>> ..................
>>>>
>>>> I plan to make pytz a mandatory dependency when USE_TZ is True. This would
>>>> be similar to the dependency on gettext when USE_I18N is True.
>>>>
>>>> pytz gets a new release every time the Olson database is updated. For this
>>>> reason, it's better not to copy it in Django, unlike simplejson and
>>>> unittest2.
>>>>
>>>> It was split from Zope some time ago. It's a small amount of clean code
>>>> and it could be maintained within Django if it was abandoned (however
>>>> unlikely that sounds).
>>>>
>>>> Miscellaneous
>>>> .............
>>>>
>>>> The following items have caused bugs in the past and should be checked
>>>> carefully:
>>>>
>>>> - caching: add timezone to cache key? See #5691.
>>>> - functions that use LocalTimezone: naturaltime, timesince, timeuntil,
>>>> dateformat.
>>>> - os.environ['TZ']. See #14264.
>>>> - time.tzset() isn't supported on Windows. See #7062.
>>>>
>>>> Finally, my proposal shares some ideas with
>>>> https://github.com/brosner/django-timezones; I didn't find any
>>>> documentation, but I intend to review the code.
>>>>
>>>> About me
>>>> --------
>>>>
>>>> I've been working with Django since 2008. I'm doing a lot of triage in
>>>> Trac, I've written some patches (notably r16349, r16539, r16548, also some
>>>> documentation improvements and bug fixes), and I've helped to set up
>>>> continuous integration (especially for Oracle). In my day job, I'm
>>>> producing enterprise software based on Django with a team of ten
>>>> developers.
>>>>
>>>> Work plan
>>>> ---------
>>>>
>>>> Besides the research that's about 50% done, and discussion that's going to
>>>> take place now, I expect the implementation and tests to take me around
>>>> 80h. Given how much free time I can devote to Django, this means three to
>>>> six months.
>>>>
>>>> Here's an overview of my work plan:
>>>>
>>>> - Implement the USE_TZ flag and database support — this requires checking
>>>> the capabilities of each supported database in terms of datetime types and
>>>> time zone support. Write tests, especially to ensure backwards
>>>> compatibility. Write docs. (20h)
>>>>
>>>> - Implement timezone localization in templates. Write tests. Write docs.
>>>> (10h)
>>>>
>>>> - Implement timezone localization in widgets and forms. Check the admin
>>>> thoroughly. Write tests. Write docs. (15h)
>>>>
>>>> - Implement the utilities to set the user's timezone. Write tests. Write
>>>> docs. (15h)
>>>>
>>>> - Reviews, etc. (20h)
>>>>
>>>> What's next?
>>>> ------------
>>>>
>>>> Constructive criticism, obviously :) Remember that the main problems here
>>>> are backwards-compatibility and keeping things simple.
>>>>
>>>> Best regards,
>>>>
>>>> --
>>>> Aymeric.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Annex: Research notes
>>>> ---------------------
>>>>
>>>> Wiki
>>>> ....
>>>>
>>>> [GSOC]
>>>> https://code.djangoproject.com/wiki/SummerOfCode2011#Multipletimezonesupportfordatetimerepresentation
>>>>
>>>> Relevant tickets
>>>> ................
>>>>
>>>> #2626: canonical ticket for this issue
>>>>
>>>> #2447: dupe, an alternative solution
>>>> #8953: dupe, not much info
>>>> #10587: dupe, a fairly complete proposal, but doesn't address backwards
>>>> compatibility for existing data
>>>>
>>>> Relevant related tickets
>>>> ........................
>>>>
>>>> #14253: how should "now" behave in the admin when "client time" != "server
>>>> time"?
>>>>
>>>> Irrelevant related tickets
>>>> ..........................
>>>>
>>>> #11385: make it possible to enter data in a different timezone in
>>>> DateTimeField
>>>> #12666: timezone in the 'Date:' headers of outgoing emails - independent
>>>> resolution
>>>>
>>>> Relevant threads
>>>> ................
>>>>
>>>> 2011-05-31 Timezone-aware storage of DateTime
>>>> http://groups.google.com/group/django-developers/browse_thread/thread/76e2b486d561ab79
>>>>
>>>> 2010-08-16 Datetimes with timezones for mysql
>>>> https://groups.google.com/group/django-developers/browse_thread/thread/5e220687b7af26f5
>>>>
>>>> 2009-03-23 Django internal datetime handling
>>>> https://groups.google.com/group/django-developers/browse_thread/thread/ca023360ab457b91
>>>>
>>>> 2008-06-25 Proposal: PostgreSQL backends should *stop* using
>>>> settings.TIME_ZONE
>>>> http://groups.google.com/group/django-developers/browse_thread/thread/b8c885389374c040
>>>>
>>>> 2007-12-02 Timezone aware datetimes and MySQL (ticket #5304)
>>>> https://groups.google.com/group/django-developers/browse_thread/thread/a9d765f83f552fa4
>>>>
>>>> Relevant related threads
>>>> ........................
>>>>
>>>> 2009-11-24 Why not datetime.utcnow() in auto_now/auto_now_add
>>>> http://groups.google.com/group/django-developers/browse_thread/thread/4ca560ef33c88bf3
>>>>
>>>> Irrelevant related threads
>>>> ..........................
>>>>
>>>> 2011-07-25 "c" date formating and Internet usage
>>>> https://groups.google.com/group/django-developers/browse_thread/thread/61296125a4774291
>>>>
>>>> 2011-02-10 GSoC 2011 student contribution
>>>> https://groups.google.com/group/django-developers/browse_thread/thread/0596b562cdaeac97/585ce1b04632198a?#585ce1b04632198a
>>>>
>>>> 2010-11-04 Changing settings per test
>>>> https://groups.google.com/group/django-developers/browse_thread/thread/65aabb45687e572e
>>>>
>>>> 2009-09-15 What is the status of auto_now and auto_now_add?
>>>> https://groups.google.com/group/django-developers/browse_thread/thread/cd1a76bca6055179
>>>>
>>>> 2009-03-09 TimeField broken in Oracle
>>>> https://groups.google.com/group/django-developers/browse_thread/thread/bba2f80a2ca9b068
>>>>
>>>> 2009-01-12 Rolling back tests -- status and open issues
>>>> https://groups.google.com/group/django-developers/browse_thread/thread/1e4f4c840b180895
>>>>
>>>> 2008-08-05 Transactional testsuite
>>>> https://groups.google.com/group/django-developers/browse_thread/thread/49aa551ad41fb919
>>>>
>>>
>>
>