Domain based on TIMEZONE WITH TIME ZONE
Hi, I'm using a domain to specialize the built in TIMEZONE WITH TIME ZONE type. I want to sanity check this approach before continuing to use this. I want to prevent timestamps with non-UTC offsets from getting inserted into the database. Having a UTC-only database at the schema level means no app, proc, script or load operation can inadvertently supply a local time offset. The domain is defined as: CREATE DOMAIN UTC_TIMESTAMP AS TIMESTAMP WITH TIME ZONE CHECK (EXTRACT(TIMEZONE FROM VALUE) = 0); CREATE TYPE UTC_RANGE AS RANGE ( SUBTYPE = UTC_TIMESTAMP ); My potentially misguided assumption is that I am effectively re-using a check constraint across the schema. Is this crazy? Am I missing the point of how Postgres stores TIMEZONE WITH TIME ZONE internally? Thanks in advance, Ben
Re: Domain based on TIMEZONE WITH TIME ZONE
Sorry about the bug in the subject: the data type is TIMESTAMP WITH TIME ZONE, not TIMEZONE WITH TIME ZONE > On 10 May 2018, at 09:03, Ben Hood wrote: > > Hi, > > I'm using a domain to specialize the built in TIMEZONE WITH TIME ZONE type. I > want to sanity check this approach before continuing to use this. > > I want to prevent timestamps with non-UTC offsets from getting inserted into > the database. Having a UTC-only database at the schema level means no app, > proc, script or load operation can inadvertently supply a local time offset. > > The domain is defined as: > > CREATE DOMAIN UTC_TIMESTAMP AS TIMESTAMP WITH TIME ZONE CHECK > (EXTRACT(TIMEZONE FROM VALUE) = 0); > CREATE TYPE UTC_RANGE AS RANGE ( SUBTYPE = UTC_TIMESTAMP ); > > My potentially misguided assumption is that I am effectively re-using a check > constraint across the schema. > > Is this crazy? > > Am I missing the point of how Postgres stores TIMEZONE WITH TIME ZONE > internally? > > Thanks in advance, > > Ben > >
Re: Domain based on TIMEZONE WITH TIME ZONE
> On 10 May 2018, at 09:59, Francisco Olarte wrote: > > On Thu, May 10, 2018 at 10:03 AM, Ben Hood wrote: > ... >> Am I missing the point of how Postgres stores TIMEZONE WITH TIME ZONE >> internally? > > After reading in the follow up TIMEZONE is a typo for TIMESTAMP, yes, you are. > > Aproximately, postgres stores ( virtually ) a point in the time line, > for both with and without ts types, same format. Something like a real > number. > > The difference is mainly for transforming from/to text ( bear in mind > when you put a constant in a query you are trasnforming from text ). > In the with time zone case it formats/expects it as a time string in > the session configured time zone, in the without case it treats it ( > aproximately ) as if it was in utc ( and then discards the "+00" after > formating ). > > Maybe I'm confussing you more, its not too easy to explain. > > The point is TIMEZONE is not stored in either of them. Many thanks for clarification, very much appreciated. Your point is consistent with the manual: “All timezone-aware dates and times are stored internally in UTC” The subtleties of how a TIMESTAMP WITH TIME ZONE is stored versus how it is rendered by a client or used in calculations and queries have been discussed in numerous places. On reflection, maybe my question was phrased badly. The question should not be “how does Postgres store the timestamp internally”. Rather it should read “is enforcing the submission of UTC denominated timestamps in the server by using a domain a sensible way to enforce a policy that will blow up when apps attempt to use non-UTC timestamps (when they shouldn’t be)”. So the question is not how does the timestamp get stored, rather, is it an anti-pattern to use Postgres as a linter for apps that forget to use UTC exclusively?
Re: Domain based on TIMEZONE WITH TIME ZONE
> On 10 May 2018, at 11:36, Karsten Hilbert wrote: > > On Thu, May 10, 2018 at 11:19:36AM +0100, Ben Hood wrote: > > I dare say it is one of PG's strengths' to be usable as a > "linter”. Interesting that you share this view, because after thinking about why I was doing this, using UTC domains is for PG to be a linter. > However, maybe rephrase to: > > Is it an anti-pattern to use Postgres as a linter for > apps that forget to use ... timezones *appropriately* ... ? > > As long as you can force apps to submit proper timestamp-with- > timezone data is there _really_ a need to care whether apps > do submit in UTC ? OK, so by using TIMESTAMP WITH TIME ZONE, you force all apps to submit timezone qualified timestamps in what language they are written in. > After all, it is always converted to UTC > servside anyway ? And because of the internal UTC representation, there is no room for ambiguous timezones. > > In case you want to enforce only ever _handing out_ UTC data > you could wrap the table in a view with forces the output > timezone to UTC and only offers timestamp-withOUT-timezone to > the outside. Then force read access via the view. So on balance there is no need to use a domain for this? Or are we saying that domains are one way of achieving the timestamp hygiene, but equally, you can get the same result as described above? Or is there a specific downside to using a domain for this purpose? …..maybe I’ve just got a subjective prejudice for using domains to refine and tighten built in data types….
Re: Domain based on TIMEZONE WITH TIME ZONE
> On 10 May 2018, at 14:41, Adrian Klaver wrote: >> OK, so by using TIMESTAMP WITH TIME ZONE, you force all apps to submit >> timezone qualified timestamps in what language they are written in. > > Not really: > > https://www.postgresql.org/docs/10/static/datatype-datetime.html > > "For timestamp with time zone, the internally stored value is always in UTC > (Universal Coordinated Time, traditionally known as Greenwich Mean Time, > GMT). An input value that has an explicit time zone specified is converted to > UTC using the appropriate offset for that time zone. If no time zone is > stated in the input string, then it is assumed to be in the time zone > indicated by the system's TimeZone parameter, and is converted to UTC using > the offset for the timezone zone.” Many thanks for this clarification. So therefore you can’t rely on the TIMESTAMP WITH TIME ZONE forcing the app to explicitly specify the offset. This is is because if the app is not specific, then the server will default back to its configured timezone. So to get deterministic timestamps, you could either: a) make sure the server is always configured to run in UTC; b) issue SET TIME ZONE ‘UTC’; at the beginning of any application session > >>> After all, it is always converted to UTC >>> servside anyway ? >> And because of the internal UTC representation, there is no room for >> ambiguous timezones. > > Define ambiguous timezone? What I meant to say that is there should be no possibility for an effective timezone to arise implicitly. For example, if you (1) didn’t control the db server config and (2) and you forgot to enforce UTC at a client driver level and (3) didn’t set the offset in the app session Then the only way to know what the effective zone offset will be is to find out what the server default is. Is this plausible?
Re: Domain based on TIMEZONE WITH TIME ZONE
> On 10 May 2018, at 15:12, Vick Khera wrote: > > On Thu, May 10, 2018 at 7:31 AM, Ben Hood <mailto:b...@relops.com>> wrote: > Or are we saying that domains are one way of achieving the timestamp hygiene, > but equally, you can get the same result as described above? > > The *only* way to have timestamp hygiene is to require them to have time > zones at all times, even if that time zone is UTC. Any other representation > of a time is ambiguous without context. That makes sense. The motivation behind narrowing the built in TIMESTAMP WITH TIME ZONE down to a domain is to ensure the only permissible zone offset is UTC. This would be unambiguous.
Re: Domain based on TIMEZONE WITH TIME ZONE
> On 10 May 2018, at 15:17, Karsten Hilbert wrote: > >> Not really: >> >> https://www.postgresql.org/docs/10/static/datatype-datetime.html >> >> "For timestamp with time zone, the internally stored value is always in UTC >> (Universal Coordinated Time, traditionally known as Greenwich Mean Time, >> GMT). An input value that has an explicit time zone specified is converted >> to UTC using the appropriate offset for that time zone. If no time zone is >> stated in the input string, then it is assumed to be in the time zone >> indicated by the system's TimeZone parameter, and is converted to UTC using >> the offset for the timezone zone." > > True enough, I didn't remember that behaviour. > > And since a BEFORE UPDATE/INSERT trigger will see the data to > be inserted PG will have already done that while parsing from > on-the-wire data into in-memory ts-w-tz presentation so we > can't simply use a trigger to enforce explicit specification > of a timezone. > > Therefore, a domain could work but will require client > language support for easy integration. For reference, the domain based solution has been working for a while. It has fleshed out bugs in apps that failed to set the zone either on the session level at the field level. In those circumstances, the server raised an exception to say that a non-UTC timestamp was submitted. It could have been the timestamp had a specific zone other than UTC. In this case, the behavior would be that Postgres stores it as UTC and therefore all is good. But it also would be that a timestamp WITHOUT a zone was submitted, in which case, the server default would take precedence. > >>> And because of the internal UTC representation, there is no room for >>> ambiguous timezones. >> >> Define ambiguous timezone? > > OP likely means underspecified for his use case (= not > assuming "unspecified" to mean "TimeZone value"). But, then, > OP could always force TimeZone to UTC on his servers :-) That is what the OP meant. That the zone value was not explicit in all cases. For example, if you fallback to the server default. And yes, the OP could have set UTC both on the DB servers (assuming OP controls them) and within each server app (assuming OP controls them).
Re: Domain based on TIMEZONE WITH TIME ZONE
> On 10 May 2018, at 15:33, Tom Lane wrote: > > Ben Hood writes: >> So the question is not how does the timestamp get stored, rather, is it an >> anti-pattern to use Postgres as a linter for apps that forget to use UTC >> exclusively? > > Well, using a domain to enforce additional constraints on a field's value > is certainly not an anti-pattern in itself. But you have to realize that > the processing consists of first creating a value of the base type and > then applying the constraint expressions of the domain to it. This means > you cannot check any details that are lost in the input conversion, > because you don't have access to the original input string, only the > stored value. Fair point. In the case of the domain based on TIMESTAMP WITH TIME ZONE, if you are mandating UTC, what further details could get lost that may have been in the original input string? The semantics are that inserting into a column of this domain with a non-UTC or absent zone will be rejected and hence is data you don’t want in the database in any case. > > As others have explained, Postgres' TIMESTAMP WITH TIME ZONE type doesn't > preserve the input's timezone specification (if any) but forcibly rotates > to UTC and stores just a scalar UTC value. So you can't use a domain to > check anything about whether the input had a timezone field and if so what > it was. OK, point taken. The intention of the domain is prevent any non-UTC or absent zones, so do you need to check anything after you have inserted it? > > (This behavior is nonstandard --- the SQL spec evidently expects the > timezone to be stored explicitly in some fashion --- but I don't foresee > us changing it; we've accumulated too much backwards-compatibility > baggage now.) > > If you're sufficiently intent on having checking of that sort, you could > invent your own datatype with your own input function, and then make it > binary-compatible with timestamptz so that you don't need to provide much > else besides the I/O functions. varchar(n) has the same sort of > relationship with text, so there's precedent … Many thanks for your feedback, much appreciated. What I hear you saying is that is possible and similar things have been done before. But by the same token, just because you can do it, doesn’t necssarily mean it buys you that much. So if a person of your experience hasn’t come across something like this before, it might mean that on balance it’s not worth the effort and/or potential deviation from standard DB tooling. Be pragmatic. As an aside, this whole discussion has been super valuable for me to get a better understanding of exactly what is going on. This is precisely the intention of the domain in the first place - to be explicit about timestamps everywhere. So if I get timestamp UTC explicitness everywhere in my database by actually understanding what is going on as opposed to using a domain, then I have achieved my goal. I appreciate everybody chiming in on this topic :-)
Re: Domain based on TIMEZONE WITH TIME ZONE
> On 10 May 2018, at 18:29, Adrian Klaver wrote: > > Per my previous post a timestamp with timezone is going to be stored as UTC, > so there is no ambiguity there. On reflection I realized your concern maybe > with determining the original input timezone. That information is not stored > by Postgres, so there can be ambiguity as to their value. Doing all > timestamps in UTC is one one way to eliminate this. The potential issue I see > is that you now push the ambiguity into the app. Namely just from looking at > the database values you still do not know what the original timezone the app > lives in is. That’s very true, I hadn’t thought of that. The use case I have is a schema that unites billing events from disparate telecoms systems from a bunch of different carriers. The source of the non-specific timestamps is party input data files that provide different local timestamps for systems in different systems, but also different apps that submit billing events. So there are many inconsistent feeds into the database. So when we push the ambiguity into the app, at least what is happening is the the transaction is rejected which means the app breaks. When the app breaks, we can go in and fix the timestamp bug in the particular app. Often this is difficult, and we need to reason about the source data, but the breakage at least tells us that something is wrong. Otherwise we ingress the data, do complex billing queries and then the only time we find out about a bug is when a customer complains about a particular bill. When this happens, it is really difficult to determine whether there is a bug in the query logic or if the input is bogus. So in this sense, the database is linting the the source data.
Re: Domain based on TIMEZONE WITH TIME ZONE
> On 10 May 2018, at 17:35, David G. Johnston > wrote: > > '2018-05-10T15:23:00-07:00'::timestamptz is unambiguous That is true. Mandating UTC is not the only way to eliminate ambiguity. Apologies for appearing to suggest that this is case. > > Allowing client applications to represent time in the user's timezone is a > feature. Fair point. > If I send 4pm ET to the server to be stored in a timestamptz field, and fail > to tell the server that the timezone is ET in the value itself then I must > instead set my session timezone to ET or the server is going to store the > wrong value. There is nothing you can do in an default server to prevent > this. Tom has described how you could possibly make the "fail to tell the > server that the timezone is ET" impossible using a custom type. This seems > to be what you want though I'd question whether it is worth the cost. The domain approach is what has been implemented and has been running in production for some time without a problem (that we know of). The cost of this appears to be quite trivial in Postgres - there are 2 lines of DDL in the OP that describe the implementation and it seems to work. The reason why I started this thread is because somebody on a JDBC related list told me that I was crazy for trying to do this and that I had most likely misunderstood how Postgres works. So respecting their opinion, I decided to ask the Postgres experts. It is heartening to hear Tom say that the idea is not completely insane, but questions the business value of mandating UTC, given there are other ways to ensure timezone explicitness across the app(s) and the database. I’m also somewhat wiser about Postgres works now as well. > I'm not sure how binary timestamp values being sent to the server in a BIND > command plays into this... The bit I’m ignorant of is that the domain approach appears to work with the timestampz wire representation. Apps seem to bind using the platform timestampz representation and the server barfs if the app forgets to enforce UTC.
Re: Domain based on TIMEZONE WITH TIME ZONE
> On 10 May 2018, at 17:38, Adrian Klaver wrote: > > Well if you are using a timestamp with timezone field the value is always > going to be stored as UTC. The TimeZone setting just determines the rotation > from the input value to the stored value and the reverse. My previous point > was just that Postgres will not enforce an offset on input data. Good point. >> Then the only way to know what the effective zone offset will be is to find >> out what the server default is. >> Is this plausible? > > If you mean find the server default then yes: > > test_(aklaver)> select current_setting('TimeZone'); > current_setting > - > US/Pacific Thanks for the tip.
Re: Domain based on TIMEZONE WITH TIME ZONE
> On 10 May 2018, at 16:33, Francisco Olarte wrote: > > For what you want to do I think you'll have to parse the text value, > maybe by definig a view with a text columns and using some > rule/trigger magic for insert / updates. Sorry for being unclear - the solution I have in production appears to work with CREATE DOMAIN UTC_TIMESTAMP AS TIMESTAMP WITH TIME ZONE CHECK (EXTRACT(TIMEZONE FROM VALUE) = 0); This raises an exception when an app doesn’t use UTC. > > It seems you want to force your users to use an explcit time zone. > This may be better handled above the database. This is what I wanted to achieve and the approach so far seems to work. It is just that most people think the approach is weird. And this is the reason for the OP - a knowledgable person on a different list saw this approach, thought it was weird and suggested that I should educate myself. Which led me to asking on this list. And the consensus appears to be that the approach is weird and that there are other ways to achieve timezone explicitness, one of which is to gain a deep understanding of how Postgres handles and stores timestamps.
Re: Domain based on TIMEZONE WITH TIME ZONE
> On 10 May 2018, at 22:17, Peter J. Holzer wrote: > > I don't understand how this can work. As Francisco demonstrated, > EXTRACT(TIMEZONE FROM ts) doesn't extract the time zone from the value > ts, it reports the offset of the client's time zone. Yes, you and Francisco are right. If you do: set time zone 'UTC'; select '2018-05-10 23:17:44-02' :: utc_timestamp; This will work. But if you change the client zone, it will not: set time zone 'Asia/Pyongyang’; select '2018-05-10 23:17:44-02' :: utc_timestamp; [Code: , SQL State: 23514] ERROR: value for domain utc_timestamp violates check constraint "utc_timestamp_check" > What your check probably does is to enforce that the client's time zone > is set to UTC. Logically you must be right, given (a) you cannot extract the zone from the stored value, because the stored value does not contain a zone (b) the examples that only work when the session zone is set to UTC Therefore the semantics of EXTRACT(TIMEZONE FROM ts) must rely on the session zone only. Hence all the check does is to enforce the UTC zone.
Re: Domain based on TIMEZONE WITH TIME ZONE
> On 10 May 2018, at 23:43, Adrian Klaver wrote: > > Trying to tame time and time zones is maybe quixotic, but not weird. Quixotic is a very good description, I’d happily admit that using the UTC domain in this way is not as pragmatic as I thought it would when I introduced it. > While I was working on my response to Peter I realized that the below: > > CREATE DOMAIN UTC_TIMESTAMP AS TIMESTAMP WITH TIME ZONE CHECK > (EXTRACT(TIMEZONE FROM VALUE) = 0); > > might not work the way you expect if you have your server set to 'UTC’. “Might not work the way you expect it to” is the underlying theme and the overall motivation for trying to master time zones from disparate sources. > > show timezone; > TimeZone > -- > UTC > (1 row) > > CREATE DOMAIN UTC_TIMESTAMP AS TIMESTAMP WITH TIME ZONE CHECK > (EXTRACT(TIMEZONE FROM VALUE) = 0); > > create table ts_check(id integer, ts_fld utc_timestamp); > > insert into ts_check values (1, now()); > insert into ts_check values (2, '05/10/18 15:23'); > insert into ts_check values (3, '05/10/18 15:23+07'); > > > test=> select * from ts_check ; > id |ts_fld > +--- > 1 | 2018-05-10 22:37:58.745263+00 > 2 | 2018-05-10 15:23:00+00 > 3 | 2018-05-10 08:23:00+00 > > In fact I am not sure how it works:): As Peter and Francisco has previously demonstrated, I believe that inserting a timestamp into the UTC_TIMESTAMP column: (1) Checks the client time zone is UTC (2) Converts the timestamp to UTC, whether it was presented as UTC or not. Going to back to the original intention of enforcing UTC, the end result is correct, but for the wrong reason. In this example, row 3 contains the UTC value of '05/10/18 15:23+07’, which is the end result I wanted (pure UTC everywhere). But, the way that it is actually happens under the covers is masked by the use of the UTC_TIMESTAMP domain. In the scenario, there is a belief that only UTC timestamps are being inserted, which is not true. The way that Postgres stores any timestamp leads to the desired result, fooling me into believing that the UTC_TIMESTAMP domain is casual for the correct result. I would have need check whether the behavior is the same using a network client driver (maybe there is some implicit coercion going on when psql parses the statement). But I think the point still stands that Postgres timestamp handling is doing the real work and is casual for the desired outcome, not the domain.
Detaching multiple partitions in 1 ALTER TABLE statement
Hi, I’m wondering what is the syntax for detaching multiple partitions in 1 ALTER TABLE statement. I’ve tried this: # ALTER TABLE movement_legs DETACH PARTITION movement_legs_2019_06_20_11_15, DETACH PARTITION movement_legs_2019_06_20_11_20; psql: ERROR: syntax error at or near "," LINE 1: ...gs DETACH PARTITION movement_legs_2019_06_20_11_15, DETACH P… The documentation suggests "DETACH PARTITION can be combined into a list of multiple alterations to be applied together”: https://www.postgresql.org/docs/12/sql-altertable.html But I haven’t found an example of where multiple partitions are detached in 1 statement. This is using version 12 beta 1. Many thanks, Ben