On Tue, 15 Jun 2021 at 10:11, Antoine Pitrou <anto...@python.org> wrote:
>
>
> Le 15/06/2021 à 09:31, Joris Van den Bossche a écrit :
> >
> > (but I also don't fully understand your point here, as your "they
> > would get the correct histogram" seems to imply a positive statement
> > for tz-naive timestamps, while your email starts with a +1 on
> > Antoine's proposal which, as far as I understand it, says that
> > timestamps without timezone are useless / should be interpreted as UTC
> > instead (which makes your above described scenario impossible)).
>
> My proposal is that timestamps without timezone should be interpreted as
> UTC.  I don't get how that makes them "useless".  In my view, that makes
> them far more useful than if we don't know their base of reference
> (because then most operations you can do on them will give
> uninterpretable data).
>

Note that the "useless" was your wording about my interpretation of
timestamps without timezone as "unknown local timezone" (so my above
statement should probably have been phrased as ".. are either useless
or should be interpreted as UTC").
So I didn't want to imply that interpreting timestamps without
timezone as UTC is useless. That's certainly a clear interpretation
(and a useful abstraction, given earlier references to Java's
"instant" which is kind of similar AFAIU), but it's a *different*
interpretation from how I understand the current spec, and changing our
interpretation has consequences.

First, there are systems that have the notion of tz-naive local
timestamps / TIMESTAMP WITHOUT TIMEZONE (and without interpreting it
as UTC).
Some examples I am aware of are pandas, most database systems
(although with varying names), Joda-Time's LocalDateTime, etc. If we
want to support those systems, Arrow needs to have an equivalent
timezone-less type. To quote Wes from his last email about dropping
the timezone-less timestamp: "I don't think that is something we can
do at this time lest we lose the ability to have high-fidelity
interoperability with other systems."

In addition, I will continue to argue that, depending on your
application, it *can* be reasonable to work with timestamps without a
timezone. Certainly, such timestamps don't contain information about
the absolute time point, and thus are inherently ambiguous for certain
operations. But as Wes mentioned before, there are still many
analytical operations that you can do on timezone-less data without
any problem or ambiguity (such as aggregating by year or month, or
even the hour of the day).
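
As a minimal sketch of that kind of operation (using Python's standard
library rather than Arrow, with made-up sample values and a hypothetical
helper named count_by), aggregating tz-naive timestamps by hour or by
month only reads the stored wall-clock fields and never needs to know
which timezone they came from:

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample data: tz-naive wall-clock timestamps (tzinfo is None).
naive_timestamps = [
    datetime(2021, 6, 14, 9, 15),
    datetime(2021, 6, 14, 9, 45),
    datetime(2021, 6, 15, 9, 31),
    datetime(2021, 6, 15, 10, 11),
    datetime(2021, 7, 1, 17, 0),
]


def count_by(timestamps, key):
    """Count timestamps per bucket; `key` maps a datetime to its bucket."""
    return Counter(key(ts) for ts in timestamps)


# Field extraction uses only the stored year/month/hour values,
# so no timezone (and no absolute time point) is required.
by_hour = count_by(naive_timestamps, lambda ts: ts.hour)
by_month = count_by(naive_timestamps, lambda ts: (ts.year, ts.month))

print(sorted(by_hour.items()))
print(sorted(by_month.items()))
```

Reinterpreting the same values as UTC and then converting them to a
local timezone would shift the hour buckets, which is where the two
interpretations start to give different results.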

Joris
