> t...@sss.pgh.pa.us wrote:
> 
>> david.g.johns...@gmail.com <mailto:david.g.johns...@gmail.com> wrote:
>> 
>>> b...@yugabyte.com wrote:
>>> 
>>> Am I missing an essential clue to resolving what seems to me to be a 
>>> paradox? Or am I seeing two kinds of bug?
>> 
>> 
>> You are missing the material in appendix B.4. Date/Time Configuration Files
>> https://www.postgresql.org/docs/current/datetime-config-files.html
> 
> The short answer is that the reason there are two views is that there are two 
> sources of truth involved. pg_timezone_names reflects the zone names defined 
> in the IANA timezone database, while pg_timezone_abbrevs reflects the 
> abbreviations defined in our user-customizable abbreviations table.  It'd be 
> impossible to make them match exactly, and we don't try exceedingly hard.  In 
> particular, the IANA list has some zones such as "CET" that don't follow 
> their own continent/city naming convention.  (AFAIK those are all legacy 
> zones that they'd get rid of if they weren't concerned with backwards 
> compatibility.)  If those look like abbreviations, which they mostly do, then 
> it's confusing.
> 
> Where the rubber meets the road is in timestamptz input, and there we consult 
> the abbreviations table first.  (Not sure if that's documented, but you can 
> easily prove it by experiment.)
> 
> As for the question about "abbreviations" like +09 --- those are not 
> abbreviations at all, they're just hard-coded numeric UTC offsets.  So they 
> don't appear in pg_timezone_abbrevs.  IANA uses those as display offsets in 
> zones where there's not any widely-used-on-the-ground abbreviation.

Thanks, as ever, David and Tom, for your quick responses. Thanks also to Adrian 
Klaver, who replied in a branched thread with this—in response to my comment 
about my reading of the information content of the pg_timezone_abbrevs view: « 
This claims (as I read it) that a time zone abbreviation uniquely determines an 
offset from UTC. » 

> adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com> wrote:
> 
> It says no such thing and would be lying if it did. Take a look at this:
> 
> https://en.wikipedia.org/wiki/List_of_time_zone_abbreviations
> 
> and see the abbreviations that share offsets.
> 
> What it is saying that, for example,  the timezone America/Los_Angeles has 
> two timezone abbreviations PDT (what I'm currently in) and PST. If you use an 
> abbreviation you don't get the DST transition rules that a full timezone name 
> has.

______________________________________________________________________

Firstly, David's response.

I (re)read "B.4. Date/Time Configuration Files" and located this directory on 
my macOS Big Sur for my PG Version 13.2 installation:

/usr/local/share/postgresql/timezonesets

I saw that it has files with names like America.txt, Asia.txt, Europe.txt, and 
so on. I opened America.txt and read this:

# NOTE:
#   This file is NOT loaded by the PostgreSQL database.  It just serves as
#   a template for timezones you could need.  See the `Date/Time Support'
#   appendix in the PostgreSQL documentation for more information.

The x-ref'd Appendix doesn't x-ref further to an explanation of the steps to 
follow in order actually to add timezons that I might need. It does have an 
entry for PST.

# CONFLICT! PST is not unique
# Other timezones:
#  - PST: Philippine Standard Time
PST    -28800    # Pacific Standard Time

Note the comment about a conflict. This suggests to naïve readers like me that 
conflicts are to be avoided. It also has a PDT entry.

I also looked in Asia.txt and found this:

# CONFLICT! PST is not unique
# Other timezones:
#  - PST: Pacific Standard Time (America)
PST     28800    # Philippine Standard Time

I s'pose that this is what David wanted me to see. Conflicts are a bad 
thing—but not disallowed.

This does mean that the outcome of this:

select ('2000-01-01 12:00:00 +00:00'::timestamptz) at time zone 'PST';

is undefined (unless I'm missing a statement of the priority rule for handling 
conflicts elsewhere in the PG doc) as I said in my email that started this 
thread, using a different example.
______________________________________________________________________

Secondly, Adrians's response.

Yes, the point that a timezone abbreviation does not uniquely determine the 
timezone offset is taken now. But notice this:

« In short, this is the difference between abbreviations and full names: 
abbreviations represent a specific offset from UTC…»
from

"8.5.3. Time Zones"
https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-TIMEZONES

This seems to me to be flat-out wrong. An abbreviation, in general, does not 
represent a specific offset from UTC. Rather, it can represent two or more 
different offsets.

This tells me that I must recommend never to use a timezone abbreviation in 
application code. It's anyway illegal as an argument for "set timezone". (If 
the same text happens to be a timezone name, well… that's what it is in that 
context.) And the result of using a timezone abbreviation in the "at time zone" 
clause is, in general, unspecified.

I'm convinced that the typical use case calls for using time zone 
names—exactly, as as been said, to benefit from the encoded DST transition 
rules that this brings. If I want to arrange a Zoom call with a colleague in 
Helsinki on, say, the Monday after the San Francisco "spring forward" weekend, 
and I don't remember when Finland springs forward, then I can do this to see 
what local time it is over there at my 09:00:

select '2021-03-20 09:00:00 America/Los_Angeles'::timestamptz at time zone 
'Europe/Helsinki';

And if, for some strange reason, I want to find out what my local time in this 
example is in a +03:00 timezone (as this is shown in a ::text typecast of  
timestamptz value and not as POSIX has it) not caring about who does what with 
DST, I can spell it as I mean it:

select ('2021-03-20 09:00:00 America/Los_Angeles'::timestamptz at time zone 
'UTC') + '3 hours'::interval;

This formulation is self-documenting. And it sidesteps all that rubbish about 
POSIZ saying "up" for what everybody else calls "down".

Because the purpose for the pg_timezone_abbrevs view (as it seems to me) is to 
control what's legal in the "at time zone" clause, I can forget it. Whatever it 
might say about utc_offset and is_dst is available, against the unique name, in 
pg_timezone_names.

This is an instructive example:

set timezone = 'America/Los_Angeles';
select
  to_char('2021-11-07 08:30:00 UTC'::timestamptz, 'hh24:mi:ss TZ (OF)') as "1st 
1:30",
  to_char('2021-11-07 09:30:00 UTC'::timestamptz, 'hh24:mi:ss TZ (OF)') as "2nd 
1:30";

This is the result:

      1st 1:30      |      2nd 1:30      
--------------------+--------------------
 01:30:00 PDT (-07) | 01:30:00 PST (-08)

The "(-07)" and "(-08)" convey real information. But the "PDT" and "PST" do 
not—except in the sense that one is culturally aware and can Google for things. 
The pg_timezone_names view never has PDT and PST at the same querying moment. 
Rather, it has only one according to the date. And this

select abbrev, utc_offset, is_dst::text
from pg_timezone_abbrevs
where abbrev in ('PST', 'PDT');

with this result:

 PDT    | -07:00:00  | true
 PST    | -08:00:00  | false

gives just an illusion of value because the two abbreviations happen each to be 
unique in that view. Each could easily have many rows that, for each 
abbreviation, mix "true" and "false".
______________________________________________________________________

Thirdly, Tom's response.

About « the reason there are two views is that there are two sources of truth 
involved », well yes… and who wants two sources of truth? All the more reason 
to forget the pg_timezone_abbrevs view and never to use an abbreviation in the 
"at time zone" clause.

About « the IANA list has some zones such as "CET" that don't follow their own 
continent/city naming convention… If those look like abbreviations, which they 
mostly do, then it's confusing. », what about "UTC" itself. Sources like this"

The Difference Between GMT and UTC
https://www.timeanddate.com/time/gmt-utc-time.html 
<https://www.timeanddate.com/time/gmt-utc-time.html>

point out that UTC is a time standard and not a time zone. But it's still 
useful to have it listed in pg_timezone_names. It's a challenge to work out how 
to recommend what subset of what's listed there to use. This page:

List of tz database time zones
https://en.wikipedia.org/wiki/List_of_tz_database_time_zones

is helpful because it shows the "cannonical" or not status of each entry. But 
consider this:

select name, abbrev, utc_offset, is_dst::text
from pg_timezone_names
where
  lower(name) like 'etc/gmt%0%' or
  lower(name) like 'etc/gmt%1%' or
  lower(name) like 'etc/gmt%2%' or
  lower(name) like 'etc/gmt%3%' or
  lower(name) like 'etc/gmt%4%' or
  lower(name) like 'etc/gmt%5%' or
  lower(name) like 'etc/gmt%6%' or
  lower(name) like 'etc/gmt%7%' or
  lower(name) like 'etc/gmt%8%' or
  lower(name) like 'etc/gmt%9%'
order by utc_offset;

with this result:

 Etc/GMT+12 | -12    | -12:00:00  | false
 Etc/GMT+11 | -11    | -11:00:00  | false
 Etc/GMT+10 | -10    | -10:00:00  | false
 ...
 Etc/GMT+1  | -01    | -01:00:00  | false
 Etc/GMT+0  | GMT    | 00:00:00   | false
 Etc/GMT-0  | GMT    | 00:00:00   | false
 Etc/GMT0   | GMT    | 00:00:00   | false
 Etc/GMT-1  | +01    | 01:00:00   | false
 ...
 Etc/GMT-12 | +12    | 12:00:00   | false
 Etc/GMT-13 | +13    | 13:00:00   | false
 Etc/GMT-14 | +14    | 14:00:00   | false

(Notice the "up is down" POSIX silliness.) The Wikipedia "List of tz database 
time zones" has each of these as canonical. But I intend to recommend avoiding 
using these for two reasons. First reason: 'cos each name contains a number, it 
isn't controlled by the names in pg_timezone_names (like David pointed out in 
another email).

Try this (simulating a keystroke bounce typo):

set timezone = 'Etc/GMT+122';
show timezone;

It's silently accepted and gives "ETC/GMT+122" with "show". And then, in turn, 
this:

select '2021-03-20 09:00:00'::timestamptz;

gives this result:

 2021-03-20 09:00:00-122

Nonsense, eh? As David said, it's an instance of the more general:

set timezone = 'Foo42Bar';
show timezone;

I wish there was a way to turn this off and accept only pg_timestamp_names.name 
values.

The second reason is that the abbreviations confuse ordinary readers who are 
slow to remember the "up is down" story.

Reply via email to