Re: Order changes in PG16 since ICU introduction

2023-05-18 Thread Jeff Davis
ith no users). Beyond that, there seems to be some danger: if the syntax for rules is not perfectly compatible between ICU versions, the user might run into big problems. Regards, Jeff Davis

Re: Order changes in PG16 since ICU introduction

2023-05-18 Thread Jeff Davis
0001 and 0002), but I'd like to hear what others think. For historical reasons, users may assume that LC_COLLATE controls the default collation order because that's true in libc. And if their provider is ICU, they may be surprised that it doesn't. I believe w

Re: ICU locale validation / canonicalization

2023-05-20 Thread Jeff Davis
On Tue, 2023-05-02 at 07:29 -0700, Noah Misch wrote: > On Thu, Mar 30, 2023 at 08:59:41AM +0200, Peter Eisentraut wrote: > > On 30.03.23 04:33, Jeff Davis wrote: > > > Attached is a new version of the final patch, which performs > > > canonicalization. I'm not 10

Re: Order changes in PG16 since ICU introduction

2023-05-20 Thread Jeff Davis
directly and rely on the server environment. But in those cases, there's no way to set a provider at all, it's just relying on the server environment. There aren't many of these cases, and hopefully we can eliminate the reliance on the server environment over time. If I'm missing something, let me know what cases you have in mind. Regards, Jeff Davis

Re: Order changes in PG16 since ICU introduction

2023-05-22 Thread Jeff Davis
ome back and forth, like checking datlocprovider, then looking in the right fields and ignoring the wrong ones. Regards, Jeff Davis

Re: Order changes in PG16 since ICU introduction

2023-05-22 Thread Jeff Davis
atter in some cases. I'd just say that they are too confusing (likely to be misused), and becoming obsolete (or less relevant), or something along those lines. Otherwise, this is fine with me. I didn't do a detailed review because it's just mechanical. Regards, Jeff Davis

Re: Order changes in PG16 since ICU introduction

2023-05-22 Thread Jeff Davis
indexes or something. > > * I don't understand what "kc" means if "ks" is not set to > > "level1". > > There is an example here: > https://peter.eisentraut.org/blog/2023/05/16/overview-of-icu-collation-settings#colcaselevel Interesting, thank you. Regards, Jeff Davis

Re: Order changes in PG16 since ICU introduction

2023-05-24 Thread Jeff Davis
E > > > In practice we're probably getting the "und" ICU locale whereas "fr" > would > be appropriate. This is a good point and illustrates that ICU is not a drop-in replacement for libc in all cases. I don't see a solution here that doesn't involve some rough edges, though. "Locale" is a generic term, and if we continue to insist that it really means a libc locale, then ICU will never be on an equal footing with libc, let alone the preferred provider. Regards, Jeff Davis

Re: pg_collation.collversion for C.UTF-8

2023-05-25 Thread Jeff Davis
hich is not great for such a common locale name). ICU versions 63 and earlier recognize C.UTF-8 as en-US-u-va-posix (a.k.a. en_US_POSIX), which has some adjustments to match expectations of C sorting (e.g. upper case first). * libc: problems as raised in this thread. Regards, Jeff Davis

Re: Order changes in PG16 since ICU introduction

2023-05-25 Thread Jeff Davis
s, where you can make better use of those concepts. I feel like there are some interesting things that can be done with rules, but I haven't had a chance to really dig in yet. Regards, Jeff Davis

Re: pg_collation.collversion for C.UTF-8

2023-05-26 Thread Jeff Davis
On Thu, 2023-05-25 at 14:48 -0400, Tom Lane wrote: > Jeff Davis writes: > > What should we do with locales like C.UTF-8 in both libc and ICU? > > I vote for passing those to the existing C-specific code paths, Great, this would be a big step toward solving the ICU usability

Fix search_path for all maintenance commands

2023-05-26 Thread Jeff Davis
s of reasons). But I'm open to suggestion if someone knows a good way to do it. -- Jeff Davis PostgreSQL Contributor Team - AWS From 5c6d707a88c887641d551ed9a6983c74d6a82a7a Mon Sep 17 00:00:00 2001 From: Jeff Davis Date: Tue, 18 Apr 2023 10:45:51 -0700 Subject: [PATCH] Fix search_path

Re: pg_collation.collversion for C.UTF-8

2023-06-05 Thread Jeff Davis
On Fri, 2023-05-26 at 10:43 -0700, Jeff Davis wrote: > We still need to consider backwards compatibility. If someone has a > collation with locale name C.UTF-8 in an earlier version, any change > to > the interpretation of that locale name after an upgrade carries a > corruption

Re: Order changes in PG16 since ICU introduction

2023-06-06 Thread Jeff Davis
ctly what I did in v6 of this series: I created a "none" provider, and when someone specified provider=icu iculocale=C, it would change the provider to "none": https://www.postgresql.org/message-id/5f9bf4a0b040428c5db2dc1f23cc3ad96acb5672.camel%40j-davis.com I'm fine with either approach. Regards, Jeff Davis

Re: pg_collation.collversion for C.UTF-8

2023-06-06 Thread Jeff Davis
see what happens (older versions of ICU would interpret it as en-US-u-va-posix; newer versions would give the root locale). b. Consistently interpret it as en-US-u-va-posix. c. Don't pass it to the provider at all and treat it with memcmp semantics. Regards, Jeff Davis

Re: pg_collation.collversion for C.UTF-8

2023-06-07 Thread Jeff Davis
On Wed, 2023-06-07 at 23:28 +0200, Peter Eisentraut wrote: > On 06.06.23 21:23, Jeff Davis wrote: > > What about ICU? How should provider=icu locale=C.UTF-8 behave? We > > could: > > It should be an error. > > > a. Just pass it to the provider and see what happens

Re: Order changes in PG16 since ICU introduction

2023-06-07 Thread Jeff Davis
subthread. It also leaves the fundamental problem in place that LOCALE only applies to the libc provider, which multiple people have agreed is not acceptable. Regards, Jeff Davis

Re: Order changes in PG16 since ICU introduction

2023-06-07 Thread Jeff Davis
On Thu, 2023-06-08 at 00:11 +0200, Peter Eisentraut wrote: > On 05.06.23 19:54, Jeff Davis wrote: > > New patch series attached. > > Could you clarify what here is intended for 16 and what is for later? I apologize about the patch churn here. I implemented several approaches to se

Re: Order changes in PG16 since ICU introduction

2023-06-08 Thread Jeff Davis
ror. It's hard for me to estimate how many users might be inconvenienced by that, but it sounds like a risk. Perhaps for this specific case, and only in initdb, we change C.anything and POSIX.anything to the builtin provider? CREATE DATABASE and CREATE COLLATION could still reject such locales.

Re: Order changes in PG16 since ICU introduction

2023-06-09 Thread Jeff Davis
eems cleaner. You also suggested that we consider switching the provider to libc any time ICU doesn't support something. I'm not sure whether you meant a static list (C, C.UTF-8, POSIX, ...?) or some kind of dynamic test. I'm skeptical of being too smart here, but I'd like

Re: Fix search_path for all maintenance commands

2023-06-09 Thread Jeff Davis
> I'm inclined to agree that this is reasonable to desupport. Committed. > I bet we could skip forcing the search_path for maintenance commands > run as > the table owner, but such a discrepancy seems likely to cause far > more > confusion than anything else. Agreed. Regards, Jeff Davis

Re: Fix search_path for all maintenance commands

2023-06-09 Thread Jeff Davis
ommand-line option, or GUC, etc. That way we can > mark the old behaviour "deprecated", with a workaround for those who > may desperately need it, and in another release or so, finally pull > the plug on old behaviour. That sounds wise, though others may not like the idea of a GUC just for this change. Regards, Jeff Davis

Re: allow granting CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX

2023-06-14 Thread Jeff Davis
ms, best to take that out and reconsider in 17 if worthwhile. Regards, Jeff Davis

[17] collation provider "builtin"

2023-06-14 Thread Jeff Davis
://www.postgresql.org/message-id/87sfb4gwgv.fsf%40news-spur.riddles.org.uk [2] https://www.postgresql.org/message-id/8a3dc06f-9b9d-4ed7-9a12-2070d8b01...@manitou-mail.org -- Jeff Davis PostgreSQL Contributor Team - AWS From 065cdf57239280ef121b51d2616c0729946af9dd Mon Sep 17 00:00:00 2001 From: Je

[17] CREATE COLLATION default provider

2023-06-14 Thread Jeff Davis
fault for "CREATE DATABASE ... TEMPLATE template0", which then becomes the default provider for "CREATE COLLATION (LOCALE='...')". -- Jeff Davis PostgreSQL Contributor Team - AWS From 329e32bfe5e1883a2cfd6e224c1d512b67256870 Mon Sep 17 00:00:00 2001 From: Jeff Dav

Re: Order changes in PG16 since ICU introduction

2023-06-14 Thread Jeff Davis
On Mon, 2023-06-12 at 23:04 +0200, Peter Eisentraut wrote: > I object to adding a new provider for PG16 (patch 0001). Added to July CF for 17. > > 2. Patch 0004 is possibly out of scope for 16 > Also clearly a new feature. Added to July CF for 17. Regards, Jeff Davis

Re: Order changes in PG16 since ICU introduction

2023-06-16 Thread Jeff Davis
On Fri, 2023-06-16 at 16:50 +0200, Peter Eisentraut wrote: > This looks good to me. > > Attached is small fixup patch with some documentation tweaks and > simplifying some test code (also includes pgperltidy). Thank you. Committed with your fixups. Regards, Jeff Davis

test_extensions: fix inconsistency between meson.build and Makefile

2023-06-16 Thread Jeff Davis
Patch attached. Currently, the Makefile specifies NO_LOCALE=1, and the meson.build does not. -- Jeff Davis PostgreSQL Contributor Team - AWS From 1775c98badb94a2ee185d7a6bd11482a4e5db58a Mon Sep 17 00:00:00 2001 From: Jeff Davis Date: Fri, 16 Jun 2023 11:51:00 -0700 Subject: [PATCH v1

Re: [17] collation provider "builtin"

2023-06-16 Thread Jeff Davis
ovider needs to be explicitly requested (as in the current patch), it's still useful, so I don't think we need to decide now. We should also keep in mind that whatever provider is selected at initdb time also becomes the default for future databases. Regards, Jeff Davis

Re: pg_collation.collversion for C.UTF-8

2023-06-16 Thread Jeff Davis
, but leave LC_CTYPE=C.UTF-8 as-is? Regards, Jeff Davis

Re: pg_collation.collversion for C.UTF-8

2023-06-19 Thread Jeff Davis
ing to the "true" semantics, if they are truly simple and well-defined and stable. But I don't think ctype=C.UTF-8 is actually stable because new characters can be added, right? Regards, Jeff Davis

collation-related loose ends before beta2

2023-06-20 Thread Jeff Davis
later. But if the default collation provider goes back to libc, the risk of ICU validation errors goes way down, so I don't object if Peter would like to change it back to an ERROR. Regards, Jeff Davis

Re: allow granting CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX

2023-06-20 Thread Jeff Davis
le from the table's owner is an edge case in behavior and both make sense to me. In the absence of a use case, I'd be inclined towards just being consistent with the other privileges. Regards, Jeff Davis

Re: allow granting CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX

2023-06-20 Thread Jeff Davis
the check for !skip_privs but need to add it to the flags in vacuum_is_permitted_for_relation(). Regards, Jeff Davis

Re: allow granting CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX

2023-06-20 Thread Jeff Davis
fusing to users. Regards, Jeff Davis

Re: allow granting CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX

2023-06-20 Thread Jeff Davis
On Tue, 2023-06-20 at 10:56 -0700, Nathan Bossart wrote: > On Tue, Jun 20, 2023 at 10:49:27AM -0700, Nathan Bossart wrote: > > Patch incoming... > > Attached. Looks good to me. Regards, Jeff Davis

Re: collation-related loose ends before beta2

2023-06-20 Thread Jeff Davis
On Tue, 2023-06-20 at 12:16 -0400, Tom Lane wrote: > Jeff Davis writes: > > Status on collation loose ends: > > This all sounds good to me. Patches attached. 0001 also removes the code to get a default locale when ICU is being used, because that was a part of the same commit t

Re: EBCDIC sorting as a use case for ICU rules

2023-06-21 Thread Jeff Davis
we could add some explanation along the way about how the rule is constructed to match EBCDIC, which would reduce the shock of a long rule like that. I wonder why the rule syntax is such that it cannot be broken up? Would it be incorrect for us to allow some whitespace in there? Regards, Jeff Davis

Re: allow granting CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX

2023-06-21 Thread Jeff Davis
10) TO (20); CREATE INDEX p_idx ON p (i); CREATE INDEX special_idx ON p0 (j); GRANT MAINTAIN ON p TO foo; \c - foo REINDEX TABLE p; That would reindex p0_i_idx and p1_i_idx, but skip special_idx. That might be too confusing, but feels a bit more consistent permissions-wise. Regards, Jeff Davis

Re: allow granting CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX

2023-06-21 Thread Jeff Davis
, we might also consider making REINDEX work a bit more like > VACUUM > and ANALYZE and emit a WARNING for any relations that the user is not > permitted to process.  But this probably deserves its own thread, and > it > might even need to wait until v17. Yes, we can revisit for 17. Regards, Jeff Davis

Re: pgsql: Fix search_path to a safe value during maintenance operations.

2023-06-29 Thread Jeff Davis
ch is new, so no breakage. And if someone is using the MAINTAIN privilege, they wouldn't be able to abuse the search_path, so it would close the hole. Patch attached (created a bit quickly, but seems to work). Regards, Jeff Davis [1] https://postgr.es/m/CAKFQuwaVJkM9u%

Re: pgsql: Fix search_path to a safe value during maintenance operations.

2023-06-30 Thread Jeff Davis
table. At some point in the very near future (though I realize that point may come after version 16), we need to lock down the search path in a lot of cases (not just maintenance commands), and I don't see any way around that. Regards, Jeff Davis

Re: test_extensions: fix inconsistency between meson.build and Makefile

2023-07-05 Thread Jeff Davis
.build does not need to, either. Regards, Jeff Davis

Re: pgsql: Fix search_path to a safe value during maintenance operations.

2023-07-06 Thread Jeff Davis
if (!vacuum_is_relation_owner(relid, classForm, options)) + continue; in get_all_vacuum_rels() whereas your patch left it out -- double-check that we're doing the right thing there. Also remember to bump the catversion. Other than that, it looks good to me. Regards, Jeff Davis

Re: EBCDIC sorting as a use case for ICU rules

2023-07-06 Thread Jeff Davis
'@' < \' < '=' < '"' > < a < b < c < d < e < f < g < h < i > < j < k < l < m < n < o < p < q < r > < '~' < s < t < u < v < w < x < y < z > < '[' < '^' < ']' > < '{' < A < B < C < D < E < F < G < H < I > < '}' < J < K < L < M < N < O < P < Q < R > < '\'  < S < T < U < V < W < X < Y < Z > < 0 < 1 < 2 < 3 < 4 < 5 < 6 < 7 < 8 < 9 > $$); That looks much nicer and would go nicely in the documentation along with some explanation. Regards, Jeff Davis

Re: Fix search_path for all maintenance commands

2023-07-06 Thread Jeff Davis
On Fri, 2023-05-26 at 16:21 -0700, Jeff Davis wrote: > Maintenance commands (ANALYZE, CLUSTER, REFRESH MATERIALIZED VIEW, > REINDEX, and VACUUM) currently run as the table owner, and as a > SECURITY_RESTRICTED_OPERATION. > > I propose that we also fix the search_path to "

Re: Fix search_path for all maintenance commands

2023-07-07 Thread Jeff Davis
evel I suspect we want lexical scoping, which is what most of us > have in our programming languages, in the database; but the database > has many elements of dynamic scoping, and changing that is both a > compatibility break and requires significant changes in the way the > database is designed. Does that suggest another approach? Regards, Jeff Davis

Re: 010_database.pl fails on openbsd w/ LC_ALL=LANG=C

2023-07-07 Thread Jeff Davis
is accepting it? If some libc implementations are too permissive, I might need to just disable this test. But if we can find a locale that is consistently acceptable in ICU but invalid in libc, then I can keep it... perhaps "und@colStrength=primary"? Regards, Jeff Davis

Re: ICU locale validation / canonicalization

2023-07-07 Thread Jeff Davis
bly indicates a user mistake). I don't think this is a practical problem any more. Regards, Jeff Davis

Re: pgsql: Fix search_path to a safe value during maintenance operations.

2023-07-07 Thread Jeff Davis
of weirdness. Also I'm not quite sure how quickly my search_path fix will be committed. Hopefully soon, because the current state is not great, but it's hard for me to say for sure. Regards, Jeff Davis

Re: [17] CREATE COLLATION default provider

2023-07-07 Thread Jeff Davis
or a non-C locale. A GUC might be a better default, and we could have CREATE COLLATION default to ICU if the server is built with ICU and if PROVIDER, LC_COLLATE and LC_CTYPE are unspecified. Regards, Jeff Davis

Re: 010_database.pl fails on openbsd w/ LC_ALL=LANG=C

2023-07-07 Thread Jeff Davis
On Sat, 2023-07-08 at 07:04 +1200, Thomas Munro wrote: > Doesn't look too hopeful: https://man.openbsd.org/setlocale.3 Hmm. I could try using a bogus encoding, but that may be too clever. I'll just remove the test. Regards, Jeff Davis

Refactor: allow pg_strncoll(), etc., to accept -1 length for NUL-terminated cstrings.

2024-08-22 Thread Jeff Davis
be a table of methods, which means we can add an extension hook to provide a different method table. That still requires more work, I'm just mentioning it here for context. Regards, Jeff Davis From 6f0c0a9e05039cd295c6c090b3d98d381244b35c Mon Sep 17 00:00:00 2001 From: Jeff Davis Date

Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM

2024-08-26 Thread Jeff Davis
yway (which also materializes it; see tts_virtual_copyslot()) at heapam.c:2710? * After correcting the memory issues, can you get updated performance numbers for COPY? Regards, Jeff Davis

Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM

2024-08-26 Thread Jeff Davis
ator over the buffered tuples to the caller. The caller can then use the iterator to insert into indexes, return a tuple to the executor, etc., and then release the iterator when done (freeing the buffer). That control flow is less convenient for most callers, though, so perhaps that should be optional? Regards, Jeff Davis

Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM

2024-08-27 Thread Jeff Davis
ad-of-row triggers, and volatile functions in the query. We could also just consider RETURNING another restriction, which could be lifted later by implementing the logic in the callback (as described above) without an API change. Regards, Jeff Davis

Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM

2024-08-27 Thread Jeff Davis
using the callback to copy tuples into the caller's context. In 0003, why do you need the global insert_modify_buffer_flush_context? 0004 is the only place that calls table_modify_buffer_flush(). Is that really necessary, or is automatic flushing enough? Regards, Jeff Davis

Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM

2024-08-27 Thread Jeff Davis
On Mon, 2024-08-26 at 14:18 -0700, Jeff Davis wrote: > 0001 implementation issues: > > * We need default implementations for AMs that don't implement the > new > APIs, so that the AM will still function even if it only defines the > single-tuple APIs. If we need

Re: allowing extensions to control planner behavior

2024-08-28 Thread Jeff Davis
useful and seems relatively easy -- A JOIN B or B JOIN A (though there's some nuance about when you try to make that decision). The latter requires controlling an explosion of possibilities, and would be an entirely different kind of hook. Regards, Jeff Davis

Re: allowing extensions to control planner behavior

2024-08-28 Thread Jeff Davis
where there's enough context to know what's happening. There could be many such hooks, but I suspect only a handful of important ones. This idea allows the extension author to preserve the right paths long enough to use set_rel_pathlist_hook/set_join_pathlist_hook, which can editorialize on costs or do its own pruning. Regards, Jeff Davis

Re: allowing extensions to control planner behavior

2024-08-28 Thread Jeff Davis
On Wed, 2024-08-28 at 16:35 -0400, Robert Haas wrote: > On Wed, Aug 28, 2024 at 4:29 PM Jeff Davis wrote: > > Preserving a path for the right amount of time seems like the > > primary > > challenge for most of the use cases you raised (removing paths is > > easier tha

Re: Introduce new multi insert Table AM and improve performance of various SQL commands with it for Heap AM

2024-08-29 Thread Jeff Davis
nd it still requires a solution for #4). Regards, Jeff Davis [1] https://www.postgresql.org/docs/devel/trigger-datachanges.html

Re: allowing extensions to control planner behavior

2024-08-29 Thread Jeff Davis
to hold onto multiple paths for longer, similar to pathkeys, which might offer some benefits or simplifications. Regards, Jeff Davis [1] https://www.postgresql.org/message-id/CA+TgmoZQyVxnRU--4g2bJonJ8RyJqNi2CHpy-=nwwbtnpaj...@mail.gmail.com

Re: allowing extensions to control planner behavior

2024-08-30 Thread Jeff Davis
formance goes, I'm only looking at the branch in add_path() that calls compare_pathkeys(). Do you have some example queries which would be a worst case for that path? In general if you can post some details about how you are measuring, that would be helpful. Regards, Jeff Davis

Re: tiny step toward threading: reduce dependence on setlocale()

2024-09-03 Thread Jeff Davis
On Wed, 2024-08-28 at 18:43 +0200, Andreas Karlsson wrote: > On 8/15/24 12:55 AM, Jeff Davis wrote: > > This overlaps a bit with what Peter already proposed here: > > > > https://www.postgresql.org/message-id/4f562d84-87f4-44dc-8946-01d6c437936f%40eisentraut.org > >

Re: tiny step toward threading: reduce dependence on setlocale()

2024-09-04 Thread Jeff Davis
Committed v2-0001. On Tue, 2024-09-03 at 22:04 -0700, Jeff Davis wrote: > * This patch may change the handling of collation oid 0, and I'm not > sure whether that was intentional or not. lc_collate_is_c(0) returned > false, whereas pg_newlocale_from_collation(0)->collate_is_c r

Re: tiny step toward threading: reduce dependence on setlocale()

2024-09-12 Thread Jeff Davis
Regards, Jeff Davis

Re: Addressing SECURITY DEFINER Function Vulnerabilities in PostgreSQL Extensions

2024-07-15 Thread Jeff Davis
On Mon, 2024-07-15 at 13:44 -0400, Robert Haas wrote: > But ... why? I mean, what's the point of prohibiting that? Agreed. We ignore all kinds of stuff in search_path that doesn't make sense, like non-existent schemas. Simpler is better. Regards, Jeff Davis

Re: Addressing SECURITY DEFINER Function Vulnerabilities in PostgreSQL Extensions

2024-07-15 Thread Jeff Davis
for the session, or on a function that's not part of an extension. On re-reading, I see that you mean it should work if they explicitly set it as a part of a function that *is* part of an extension. And I agree with that -- just make it work. Regards, Jeff Davis

[18] Policy on IMMUTABLE functions and Unicode updates

2024-07-16 Thread Jeff Davis
ctions. We've been following (A), and that's the de facto policy today[3][4]. Noah and Laurenz argued[5] that the policy starting in version 18 should be (B). Given that it's a policy decision that affects more than just the builtin collation provider, I'd like to discus

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-16 Thread Jeff Davis
o in a complete way, and hard to do with much accuracy. I don't oppose it though -- if someone finds a way to provide enough information to be useful, then that's fine with me. Regards, Jeff Davis

Re: Built-in CTYPE provider

2024-07-17 Thread Jeff Davis
b2c332c47e3e0a67f0640b49c.ca...@j-davis.com Regards, Jeff Davis

Re: Built-in CTYPE provider

2024-07-17 Thread Jeff Davis
bd45b2c332c47e3e0a67f0640b49c.camel%40j-davis.com which seems like a more direct (and more complete) path to a resolution of your concerns. I speak only for myself, but I assure you that I have an open mind in that discussion, and that I have no intention of forcing a Unicode update past objections.

Re: Built-in CTYPE provider

2024-07-18 Thread Jeff Davis
't engage in the version 18 policy discussion. >   Maybe someone will change > something in v18 so it's not like that, but don't count on it. That's backwards. If nothing happens in v18, then there will be no breaking Unicode change. It takes an active step by a

Re: Built-in CTYPE provider

2024-07-18 Thread Jeff Davis
above is an accurate characterization. There's plenty of opportunity for deliberation and compromise in version 18, and my mind is still open to pretty much everything, up to and including freezing Unicode updates if necessary[3]. Regards, Jeff Davis [1] https://www.postgresql.org/m

Re: Built-in CTYPE provider

2024-07-19 Thread Jeff Davis
version 18 like normal, because there's no actual problem now, I see no reason your objections would be taken less seriously later. Regards, Jeff Davis [1] https://www.postgresql.org/message-id/d75d2d0d1d2bd45b2c332c47e3e0a67f0640b49c.camel%40j-davis.com

Re: Built-in CTYPE provider

2024-07-19 Thread Jeff Davis
e more that you say so in the policy thread here: https://www.postgresql.org/message-id/d75d2d0d1d2bd45b2c332c47e3e0a67f0640b49c.camel%40j-davis.com which would get broader visibility and I believe provide you with stronger assurances that *everyone* will be careful with Unicode updates. Regards,

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-19 Thread Jeff Davis
In other words, it would be reviewed like any other change. Ideally, some new developments would make it less worrisome, and Unicode updates could become more routine. I have some ideas, which I can propose in separate threads. But for now, I don't see a reason to rush Unicode updates. Regards, Jeff Davis

Re: Statistics Import and Export

2024-07-19 Thread Jeff Davis
lways completely replaced, but the way you can call pg_set_attribute_stats() doesn't imply that -- calling pg_set_attribute_stats(..., most_common_vals => ..., most_common_freqs => ...) looks like it would just replace the most_common_vals+freqs and leave histogram_bounds as it was, but it actually clears histogram_bounds, right? Should we make that work or should we document better that it doesn't? Regards, Jeff Davis

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-22 Thread Jeff Davis
's mostly a theoretical problem because, at least in my experience, I can't recall ever seeing an index that would be affected. Regards, Jeff Davis

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-22 Thread Jeff Davis
rings and EXECUTE them. Though perhaps not impossible if we use some kind of runtime detection. We could have some kind of global context that tracks, at runtime, when an expression is executing for the purposes of an index. If a function depends on a versioned collation, then mark the index or add a version somewhere. Regards, Jeff Davis

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-22 Thread Jeff Davis
obody has commented yet.) Regards, Jeff Davis

Re: Statistics Import and Export

2024-07-22 Thread Jeff Davis
't be imported from an old version into a new version because it's either gone or the meaning has changed too much. But that argument doesn't apply to a bogus call, where the name/value pairs get misaligned or something. Regards, Jeff Davis

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Jeff Davis
ffers code point order collation combined with Unicode ctype semantics. With PG17, between ICU and the builtin provider, there's little remaining reason to use libc (aside from legacy). Regards, Jeff Davis

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Jeff Davis
tes, so primary keys will never be affected. The risks we are talking about are for expression indexes, e.g. on LOWER(). Even if you do have such expression indexes, the types of changes Unicode makes to casing and character properties are typically much more mild. Regards, Jeff Davis

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Jeff Davis
code are intolerable, and only for PG_C_UTF8? Regards, Jeff Davis

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Jeff Davis
rité documented[1] cases where the libc C.UTF-8 locale changed the *sort* behavior, thereby affecting primary keys. Regards, Jeff Davis [1] https://www.postgresql.org/message-id/8a3dc06f-9b9d-4ed7-9a12-2070d8b0165f%40manitou-mail.org

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-23 Thread Jeff Davis
's much more tractable to review your expression indexes and look for problems (not ideal, but better). Also, as Peter points out, CTYPE changes are typically more narrow, so there's a good chance that there's no problem at all. Regards, Jeff Davis

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-24 Thread Jeff Davis
re you rebuild/fix objects to use the new collation, and when that's done then you change the default so that queries use version 2. How does all that work? Regards, Jeff Davis

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-24 Thread Jeff Davis
logy doesn't quite capture this distinction. I don't mean to over-emphasize this point, but I do think we need to keep some perspective here. But I agree with your general point that we shouldn't dismiss the problem just because it's minor. We should expect the problem to surface at some point and be reasonably prepared. Regards, Jeff Davis

Re: [18] Policy on IMMUTABLE functions and Unicode updates

2024-07-24 Thread Jeff Davis
On Wed, 2024-07-24 at 14:47 -0400, Robert Haas wrote: > On Wed, Jul 24, 2024 at 1:45 PM Jeff Davis wrote: > > There's a qualitative difference between a collation update which > > can > > break your PKs and FKs, and a ctype update which definitely will > > not. >

Re: Statistics Import and Export

2024-07-25 Thread Jeff Davis
stics and control, the control parameters will be: I don't like the idea of mixing statistics and control parameters in the same list. I do like the idea of returning a set, but I think it should be the positive set (effectively a representation of what is now in the pg_stats view) and any ignored settings would be output as WARNINGs. Regards, Jeff Davis

[18] separate collation and ctype versions, and cleanup of pg_database locale fields

2024-07-25 Thread Jeff Davis
pg_database will be locale-related? Regards, Jeff Davis

Re: tiny step toward threading: reduce dependence on setlocale()

2024-07-25 Thread Jeff Davis
; > { > > > > The patch sequencing might be a bit tricky here.  Maybe it's ok if > > patch 0004 stays as is in this respect if 0006 were to fix it back. Addressed in v3-0006. > > * v2-0005-Avoid-setlocale-in-lc_collate_is_c-and-lc_ctype_i.patch > > > >

Re: tiny step toward threading: reduce dependence on setlocale()

2024-07-26 Thread Jeff Davis
> Also is there any reason you do not squash the 4th and the 6th patch? Done. I had to rearrange the patch ordering a bit because prior to the cache refactoring patch, it's unsafe to call pg_newlocale_from_collation() without checking lc_collate_is_c() or lc_

Re: Speed up collation cache

2024-07-26 Thread Jeff Davis
On Thu, 2024-06-20 at 17:07 +0700, John Naylor wrote: > On Sat, Jun 15, 2024 at 6:46 AM Jeff Davis wrote: > > Attached is a patch to use simplehash.h instead, which speeds > > things up > > enough to make them fairly close (from around 15% slower to around > > 8%).

Re: MAINTAIN privilege -- what do we need to un-revert it?

2024-07-26 Thread Jeff Davis
nds of functions between releases. Even if the signatures remain the same, the parse structures may change, which creates similar incompatibilities. So let's just get rid of the 'params' argument from both functions. Regards, Jeff Davis

Re: [18] separate collation and ctype versions, and cleanup of pg_database locale fields

2024-07-27 Thread Jeff Davis
On Thu, 2024-07-25 at 13:29 -0700, Jeff Davis wrote: > it may be a good idea to version collation and ctype > separately. The ctype version is, more or less, the Unicode version, > and we know what that is for the builtin provider as well as ICU. Attached a rough patch for the pu

Re: Speed up collation cache

2024-07-28 Thread Jeff Davis
is > ready for committer. Committed, thank you. > And then we can discuss after committing if an additional cache of > the > last locale is worth it or not. Yeah, I'm holding off on that until refactoring in the area settles, and we'll see if it's still worth it. Regards, Jeff Davis

Re: tiny step toward threading: reduce dependence on setlocale()

2024-07-30 Thread Jeff Davis
etlocale(). I changed this to lookup the collation and then use pg_strxfrm(). That should improve histogram selectivity estimates because it uses the correct provider, rather than relying on setlocale(), right? New series attached. Regards, Jeff Davis From 5b903c82f34f5da9cab58ecd0a268345
