Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-24 Thread Robert Haas
On Wed, Jul 24, 2024 at 3:43 PM Jeremy Schneider wrote: > But non-unique indexes for case insensitive searches will be more common. > Historically this is the most common way people did case insensitive on > oracle. > > Changing ctype would mean these queries can return wrong results Yeah. I me

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-24 Thread Jeremy Schneider
On Wed, Jul 24, 2024 at 12:47 PM Robert Haas wrote: On Wed, Jul 24, 2024 at 1:45 PM Jeff Davis wrote: > There's a qualitative difference between a collation update which can > break your PKs and FKs, and a ctype update which definitely will not. I don't think that's true. All you need is a uniq

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-24 Thread Robert Haas
On Wed, Jul 24, 2024 at 3:12 PM Jeff Davis wrote: > In any case, you are correct that Unicode updates could put some > constraints at risk, including unique indexes, CHECK, and partition > constraints. But someone has to actually use one of the affected > functions somewhere, and that's the main d

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-24 Thread Jeff Davis
On Wed, 2024-07-24 at 14:47 -0400, Robert Haas wrote: > On Wed, Jul 24, 2024 at 1:45 PM Jeff Davis wrote: > > There's a qualitative difference between a collation update which > > can > > break your PKs and FKs, and a ctype update which definitely will > > not. > > I don't think that's true. All

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-24 Thread Robert Haas
On Wed, Jul 24, 2024 at 1:45 PM Jeff Davis wrote: > There's a qualitative difference between a collation update which can > break your PKs and FKs, and a ctype update which definitely will not. I don't think that's true. All you need is a unique index on UPPER(somecol). -- Robert Haas EDB: http
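A minimal sketch of the scenario Robert describes, assuming a case mapping is added for a code point between two Unicode versions. It is written in Python, not PostgreSQL. The mapping tables and the helper names (make_upper, duplicate_keys) are invented for illustration, and private-use code points stand in for the affected characters. Two rows that satisfy a unique index on UPPER(somecol) under the old mapping collide under the new one:

```python
# Illustrative sketch only: the mapping tables below are invented and do not
# reflect real Unicode data or PostgreSQL internals.
from collections import Counter

# Private-use code points stand in for characters whose case mapping is
# assumed to change between two hypothetical Unicode versions.
LOWER = "\ue000"
UPPER = "\ue001"


def make_upper(extra):
    """Return an uppercase function that applies `extra` before str.upper()."""
    def upper(s):
        return "".join(extra.get(ch, ch.upper()) for ch in s)
    return upper


upper_old = make_upper({})              # code point has no case mapping yet
upper_new = make_upper({LOWER: UPPER})  # a later version adds the mapping


def duplicate_keys(rows, upper):
    """Return the keys that a unique index on upper(col) would reject."""
    counts = Counter(upper(r) for r in rows)
    return sorted(k for k, n in counts.items() if n > 1)


rows = ["id-" + LOWER, "id-" + UPPER]

print(duplicate_keys(rows, upper_old))  # no collisions under the old mapping
print(duplicate_keys(rows, upper_new))  # both rows now share one key
```

The same mismatch would affect a non-unique index: keys stored under the old mapping would not match keys computed at query time under the new one.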

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-24 Thread Peter Eisentraut
On 24.07.24 14:20, Robert Haas wrote: > On Wed, Jul 24, 2024 at 12:42 AM Peter Eisentraut wrote: > > Fair enough. My argument was, that topic is distinct from the topic of > > this thread. > OK, that's fair. But I think the solutions are the same: we complain all the time about glibc and ICU shipping co

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-24 Thread Jeff Davis
On Wed, 2024-07-24 at 08:20 -0400, Robert Haas wrote: > I note in passing that the last time I saw a customer query with > UPPER() in the join clause was... yesterday. Can you expand on that? This thread is mostly about durable state so I don't immediately see the connection. > So I don't want to

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-24 Thread Jeff Davis
On Tue, 2024-07-23 at 06:31 -0600, Jeremy Schneider wrote: > Other RDBMS are very careful not to corrupt databases, afaik > including function based indexes, by changing Unicode. I’m not aware > of any other RDBMS that updates Unicode versions in place; instead > they support multiple Unicode versi

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-24 Thread Jeremy Schneider
On Wed, Jul 24, 2024 at 6:20 AM Robert Haas wrote: > > I note in passing that the last time I saw a customer query with > UPPER() in the join clause was... yesterday. The problems there had > nothing to do with CTYPE, but there's no reason to suppose that it > couldn't have had such a problem. I

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-24 Thread Robert Haas
On Wed, Jul 24, 2024 at 12:42 AM Peter Eisentraut wrote: > Fair enough. My argument was, that topic is distinct from the topic of > this thread. OK, that's fair. But I think the solutions are the same: we complain all the time about glibc and ICU shipping collations and not versioning them. We s

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-24 Thread Noah Misch
On Tue, Jul 23, 2024 at 01:07:49PM -0700, Jeff Davis wrote: > On Tue, 2024-07-23 at 07:39 -0700, Noah Misch wrote: > > Short-term, we should remedy the step backward that pg_c_utf8 has taken: > > https://postgr.es/m/20240718233908.52.nmi...@google.com > > https://postgr.es/m/486d71991a3f80ec1c47e1b

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Peter Eisentraut
On 24.07.24 03:37, Robert Haas wrote: On Tue, Jul 23, 2024 at 4:36 PM Peter Eisentraut wrote: The sorting isn't the problem. We have a versioning mechanism for collations. What we do with the version information is clearly not perfect yet, but the mechanism exists and you can hack together qu

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Jeff Davis
On Tue, 2024-07-23 at 21:37 -0400, Robert Haas wrote: > In my experience, sorting is, overwhelmingly, the problem. I strongly agree. > That we have versioning information that someone could hypothetically > know how to do something useful with is not really useful, because > nobody actually knows

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Robert Haas
On Tue, Jul 23, 2024 at 4:36 PM Peter Eisentraut wrote: > The sorting isn't the problem. We have a versioning mechanism for > collations. What we do with the version information is clearly not > perfect yet, but the mechanism exists and you can hack together queries > that answer the question, d

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Tom Lane
Jeff Davis writes: > On Tue, 2024-07-23 at 16:07 -0400, Tom Lane wrote: >> Well, it depends on which libc collation you have in mind.  I was >> thinking of a libc-supplied C.UTF-8 collation, which I would expect >> to behave the same as pg_c_utf8, modulo which Unicode version it's >> based on. >

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Jeff Davis
On Tue, 2024-07-23 at 16:07 -0400, Tom Lane wrote: > Well, it depends on which libc collation you have in mind.  I was > thinking of a libc-supplied C.UTF-8 collation, which I would expect > to behave the same as pg_c_utf8, modulo which Unicode version it's > based on. Daniel Vérité documented[1]

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Tom Lane
"Daniel Verite" writes: > Tom Lane wrote: >> Why? If we agree that that's the way forward, we could certainly >> stick some collversion other than "1" into pg_c_utf8's pg_collation >> entry. There's already been one v17 catversion bump since beta2 >> (716bd12d2), so another one is basicall

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Peter Eisentraut
On 22.07.24 19:55, Robert Haas wrote: Every other piece of software in the world has to deal with changes as a result of the addition of new code points, and probably less commonly, revisions to existing code points. Presumably, their stuff breaks too, from time to time. I mean, I find it a bit d

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Daniel Verite
Tom Lane wrote: > > I don't see how we can get by without some kind of versioning here. > > It's probably too late to do that for v17, > > Why? If we agree that that's the way forward, we could certainly > stick some collversion other than "1" into pg_c_utf8's pg_collation > entry. Ther

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Tom Lane
Robert Haas writes: > On Tue, Jul 23, 2024 at 3:26 PM Tom Lane wrote: >>> Do we need to version the new ctype provider? >> It would be a version for the underlying Unicode definitions, >> not the provider as such, but perhaps yes. I don't know to what >> extent doing so would satisfy Noah's con

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Robert Haas
On Tue, Jul 23, 2024 at 3:26 PM Tom Lane wrote: > No, I think we *are* winning, because the updates are not "equally > unstable": with pg_c_utf8, we control when changes happen. We can > align them with major releases and release-note the differences. > With libc-based collations, we have zero co

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Jeff Davis
On Tue, 2024-07-23 at 07:39 -0700, Noah Misch wrote: > we should remedy the step backward that pg_c_utf8 has taken: Obviously I disagree that we've taken a step backwards. Can you articulate the principle by which all of the other problems with IMMUTABLE are just fine, but updates to Unicode are

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Tom Lane
Jeff Davis writes: > On Tue, 2024-07-23 at 15:26 -0400, Tom Lane wrote: >> No, I think we *are* winning, because the updates are not "equally >> unstable": with pg_c_utf8, we control when changes happen.  We can >> align them with major releases and release-note the differences. >> With libc-based

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Jeff Davis
On Tue, 2024-07-23 at 15:26 -0400, Tom Lane wrote: > No, I think we *are* winning, because the updates are not "equally > unstable": with pg_c_utf8, we control when changes happen.  We can > align them with major releases and release-note the differences. > With libc-based collations, we have zero

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Joe Conway
On 7/23/24 15:26, Tom Lane wrote: Robert Haas writes: Also, Noah has pointed out that C.UTF-8 introduces some forward-compatibility hazards of its own, at least with respect to ctype semantics. I don't have a clear view of what ought to be done about that, but if we just replace a dependency on

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Tom Lane
Robert Haas writes: > Also, Noah has pointed out that C.UTF-8 introduces some > forward-compatibility hazards of its own, at least with respect to > ctype semantics. I don't have a clear view of what ought to be done > about that, but if we just replace a dependency on an unstable set of > libc de

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Robert Haas
On Tue, Jul 23, 2024 at 1:03 PM Jeff Davis wrote: > One of my strongest motivations for PG_C_UTF8 was that there was still > a use case for libc in PG16: the "C.UTF-8" locale, which is not > supported at all in ICU. Daniel Vérité made me aware of the importance > of this locale, which offers code

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Jeff Davis
On Tue, 2024-07-23 at 08:49 -0400, Robert Haas wrote: > Hmm. I think we might have some unique problems due to the fact that > we rely partly on the operating system behavior, partly on libicu, > and > partly on our own internal tables. The reliance on the OS is especially problematic for reasons

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Noah Misch
On Mon, Jul 22, 2024 at 09:34:42AM -0700, Jeff Davis wrote: > On Mon, 2024-07-22 at 11:14 -0400, Robert Haas wrote: > > On Mon, Jul 22, 2024 at 10:26 AM Peter Eisentraut > > wrote: > > > I disagree with that.  We should put ourselves into the position to > > > adopt new Unicode versions without f

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Robert Haas
On Tue, Jul 23, 2024 at 8:32 AM Jeremy Schneider wrote: > Other RDBMS are very careful not to corrupt databases, afaik including > function based indexes, by changing Unicode. I’m not aware of any other RDBMS > that updates Unicode versions in place; instead they support multiple Unicode > vers

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Jeremy Schneider
On Tue, Jul 23, 2024 at 1:11 AM Laurenz Albe wrote: > On Mon, 2024-07-22 at 13:55 -0400, Robert Haas wrote: > > On Mon, Jul 22, 2024 at 1:18 PM Laurenz Albe > wrote: > > > I understand the difficulty (madness) of discussing every Unicode > > > change. If that's unworkable, my preference would b

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Robert Haas
On Tue, Jul 23, 2024 at 3:11 AM Laurenz Albe wrote: > I hear you. It would be interesting to know what other RDBMS do here. Yeah, I agree. -- Robert Haas EDB: http://www.enterprisedb.com

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Laurenz Albe
On Mon, 2024-07-22 at 13:55 -0400, Robert Haas wrote: > On Mon, Jul 22, 2024 at 1:18 PM Laurenz Albe wrote: > > I understand the difficulty (madness) of discussing every Unicode > > change.  If that's unworkable, my preference would be to stick with some > > Unicode version and never modify it, ev

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-22 Thread Robert Haas
On Mon, Jul 22, 2024 at 1:18 PM Laurenz Albe wrote: > I understand the difficulty (madness) of discussing every Unicode > change. If that's unworkable, my preference would be to stick with some > Unicode version and never modify it, ever. I think that's a completely non-viable way forward. Even

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-22 Thread Isaac Morland
On Mon, 22 Jul 2024 at 13:51, Jeff Davis wrote: > > Are you proposing a switch that would make PostgreSQL error out if > > somebody wants to use an unassigned code point? That would be an > > option. > > You can use a CHECK(UNICODE_ASSIGNED(t)) in version 17, and in version > 18 I have a propos

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-22 Thread Jeff Davis
On Mon, 2024-07-22 at 19:18 +0200, Laurenz Albe wrote: > I understand the difficulty (madness) of discussing every Unicode > change.  If that's unworkable, my preference would be to stick with > some > Unicode version and never modify it, ever. Among all the ways that IMMUTABLE and indexes can go

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-22 Thread Laurenz Albe
On Mon, 2024-07-22 at 16:26 +0200, Peter Eisentraut wrote: > I propose that, going forward, we take more care with Unicode updates: > > assess the impact, provide time for comments, and consider possible > > mitigations. In other words, it would be reviewed like any other > > change. > > I disagre

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-22 Thread Jeff Davis
On Mon, 2024-07-22 at 11:14 -0400, Robert Haas wrote: > On Mon, Jul 22, 2024 at 10:26 AM Peter Eisentraut > wrote: > > I disagree with that.  We should put ourselves into the position to > > adopt new Unicode versions without fear.  Similar to updates to > > time > > zones, snowball, etc. > > > >

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-22 Thread Jeff Davis
On Mon, 2024-07-22 at 16:26 +0200, Peter Eisentraut wrote: > Unless I missed something here, all the problem examples involve > unassigned code points that were later assigned. For normalization and case mapping that's right. For regexes, a character property could change. But that's mostly a th

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-22 Thread Robert Haas
On Mon, Jul 22, 2024 at 10:26 AM Peter Eisentraut wrote: > I disagree with that. We should put ourselves into the position to > adopt new Unicode versions without fear. Similar to updates to time > zones, snowball, etc. > > We can't be discussing the merits of the Unicode update every year. > Th

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-22 Thread Peter Eisentraut
On 19.07.24 21:41, Jeff Davis wrote: On Fri, 2024-07-19 at 21:06 +0200, Laurenz Albe wrote: Perhaps I should moderate my statement: if a change affects only a newly introduced code point (which is unlikely to be used in a database), and we think that the change is very important, we could consid

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-19 Thread Jeff Davis
On Fri, 2024-07-19 at 21:06 +0200, Laurenz Albe wrote: > Perhaps I should moderate my statement: if a change affects only a > newly > introduced code point (which is unlikely to be used in a database), > and we > think that the change is very important, we could consider applying > it. > But that s

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-19 Thread Laurenz Albe
On Tue, 2024-07-16 at 10:42 -0700, Jeff Davis wrote: > The IMMUTABLE marker for functions is quite simple on the surface, but > could be interpreted a few different ways, and there's some historical > baggage that makes it complicated. > > There are a number of ways in which IMMUTABLE functions ca

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-16 Thread Jeff Davis
On Tue, 2024-07-16 at 13:27 -0700, David G. Johnston wrote: > I'd teach pg_upgrade to inspect the post-upgraded catalog of in-use > dependencies and report on any of these it finds and remind the DBA > that this latent issue may exist in their system. That's impossible to do in a complete way, and

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-16 Thread Jeremy Schneider
On Tue, Jul 16, 2024 at 3:28 PM David G. Johnston < david.g.johns...@gmail.com> wrote: > > I'd teach pg_upgrade to inspect the post-upgraded catalog of in-use > dependencies and report on any of these it finds and remind the DBA that > this latent issue may exist in their system. > Would this hel

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-16 Thread David G. Johnston
On Tue, Jul 16, 2024 at 1:16 PM Tom Lane wrote: > Joe Conway writes: > > So you are proposing we add STATIC to VOLATILE/STABLE/IMMUTABLE (in the > > third position before IMMUTABLE), give it IMMUTABLE semantics, mark > > builtin functions that deserve it, and document with suitable caution > > s

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-16 Thread Tom Lane
Joe Conway writes: > Fair enough, but then I think we should change the documentation to not > say "forever". No objection to that, it's clearly a misleading definition. regards, tom lane

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-16 Thread Joe Conway
On 7/16/24 16:16, Tom Lane wrote: > Joe Conway writes: > > So you are proposing we add STATIC to VOLATILE/STABLE/IMMUTABLE (in the > > third position before IMMUTABLE), give it IMMUTABLE semantics, mark > > builtin functions that deserve it, and document with suitable caution > > statements? > What is the poin

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-16 Thread Tom Lane
Joe Conway writes: > So you are proposing we add STATIC to VOLATILE/STABLE/IMMUTABLE (in the > third position before IMMUTABLE), give it IMMUTABLE semantics, mark > builtin functions that deserve it, and document with suitable caution > statements? What is the point of that, exactly? I'll agr

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-16 Thread Joe Conway
On 7/16/24 15:33, David G. Johnston wrote: On Tue, Jul 16, 2024 at 11:57 AM Joe Conway > wrote: > There are two alternative philosophies: > > A. By choosing to use a Unicode-based function, the user has opted in > to the Unicode stability guarantee

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-16 Thread David G. Johnston
On Tue, Jul 16, 2024 at 11:57 AM Joe Conway wrote: > > > There are two alternative philosophies: > > > > A. By choosing to use a Unicode-based function, the user has opted in > > to the Unicode stability guarantees[2], and it's fine to update Unicode > > occasionally in new major versions as long

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-16 Thread Joe Conway
On 7/16/24 13:42, Jeff Davis wrote: The IMMUTABLE marker for functions is quite simple on the surface, but could be interpreted a few different ways, and there's some historical baggage that makes it complicated. There are a number of ways in which IMMUTABLE functions can change behavior: 1. Up

[18] Policy on IMMUTABLE functions and Unicode updates

2024-07-16 Thread Jeff Davis
The IMMUTABLE marker for functions is quite simple on the surface, but could be interpreted a few different ways, and there's some historical baggage that makes it complicated. There are a number of ways in which IMMUTABLE functions can change behavior: 1. Updating or moving to a different OS aff