Re: Container Types

2023-10-25 Thread Jeff Davis
ference limitation, and that needs to be solved regardless. After the type inference figures out what the right type is, then I think you're right that an OID is not required to track it, and however we do track it should be able to reuse some of the existing infrastructure for dealing with rec

Re: Does UCS_BASIC have the right CTYPE?

2023-10-26 Thread Jeff Davis
On Thu, 2023-10-26 at 16:49 +0200, Peter Eisentraut wrote: > On 25.10.23 20:32, Jeff Davis wrote: > > But what should the result of UPPER('á' COLLATE UCS_BASIC) be? In > > Postgres, the answer is 'á', but intuitively, one could reasonably > > expect the an

Re: Does UCS_BASIC have the right CTYPE?

2023-10-26 Thread Jeff Davis
On Thu, 2023-10-26 at 09:21 -0700, Jeff Davis wrote: > Our initcap() is not defined in the standard, and we document that it > only differentiates between alphanumeric and non-alphanumeric > characters, so we could get that behavior pretty easily as well. If > we > wanted to do it

Re: Does UCS_BASIC have the right CTYPE?

2023-10-26 Thread Jeff Davis
ing_8h.html#aa64fbd4ad23af84d01c931d7cfa25f89 See also the part about tailorings here: https://www.unicode.org/versions/Unicode15.1.0/ch03.pdf#G33992 Regards, Jeff Davis

Re: Does UCS_BASIC have the right CTYPE?

2023-10-26 Thread Jeff Davis
"builtin" provider I proposed earlier. If the behavior does change with a new Unicode version it would be easier to see and less likely to affect on-disk structures than a collation change. Regards, Jeff Davis

Re: [17] Special search_path names "!pg_temp" and "!pg_catalog"

2023-10-27 Thread Jeff Davis
On Thu, 2023-10-26 at 16:28 -0500, Nathan Bossart wrote: > On Fri, Aug 18, 2023 at 02:44:31PM -0700, Jeff Davis wrote: > > +    SET search_path = admin, "!pg_temp"; > > I think it's unfortunate that these new identifiers must be quoted.  > I > wonder i

Re: Improve WALRead() to suck data directly from WAL buffers when possible

2023-10-27 Thread Jeff Davis
d be good to comment that the function still works past the flush pointer, and that it will be safe to remove later (right?). * An "Assert(!RecoveryInProgress())" would be more appropriate than an error. Perhaps we will remove even that check in the future to achieve cascaded replication of unflushed data. Regards, Jeff Davis

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

2023-10-27 Thread Jeff Davis
On Thu, 2023-09-21 at 14:33 -0700, Jeff Davis wrote: > I have attached an updated patch. Changes: Withdrawing this from CF due to lack of consensus. I'm happy to resume this discussion if someone sees a path forward to make it easier to secure the search_path; or at least help warn user

Re: Pre-proposal: unicode normalized text

2023-10-27 Thread Jeff Davis
On Mon, 2023-10-16 at 20:32 -0700, Jeff Davis wrote: > On Wed, 2023-10-11 at 08:56 +0200, Peter Eisentraut wrote: > > We need to be careful about precise terminology.  "Valid" has a > > defined > > meaning for Unicode.  A byte sequence can be valid or not as UTF

Re: Fix search_path for all maintenance commands

2023-10-27 Thread Jeff Davis
On Fri, 2023-07-21 at 15:32 -0700, Jeff Davis wrote: > Attached is a new version. Do we still want to do this? Right now, the MAINTAIN privilege is blocking on some way to prevent malicious users from abusing the MAINTAIN privilege and search_path to acquire the table owner's privileg

Re: unnest multirange, returned order

2023-10-27 Thread Jeff Davis
t I can add the patch > to the commitfest. > > Tiny as the patch is, I don't want it to fall between the cracks. Committed with adjusted wording. Thank you! -- Jeff Davis PostgreSQL Contributor Team - AWS

Re: MERGE ... RETURNING

2023-10-30 Thread Jeff Davis
(sorry to backtrack yet again...)? It couldn't be used in an arbitrary expression, but that also means that it couldn't end up in the wrong kind of expression. Regards, Jeff Davis

Re: always use runtime checks for CRC-32C instructions

2023-10-30 Thread Jeff Davis
rned about the call going through a function pointer? If so, is it possible that setting a flag and then branching would be better? Also, if it's a concern, should we also consider making an inlineable version of pg_comp_crc32c_sse42()? Regards, Jeff Davis

Re: MERGE ... RETURNING

2023-10-31 Thread Jeff Davis
On Tue, 2023-10-31 at 12:45 +0100, Vik Fearing wrote: > On 10/24/23 21:10, Jeff Davis wrote: > > Can we revisit the idea of a per-WHEN RETURNING clause? > > For the record, I dislike this idea. I agree that it makes things awkward, and if it creates grammatical problems as well

Re: MERGE ... RETURNING

2023-11-01 Thread Jeff Davis
parse analysis code, and > a > lot more if you grep more widely across the whole of the backend > code.) If you can point to a precedent, then I'm much more inclined to be OK with the implementation. Regards, Jeff Davis

Re: Fix search_path for all maintenance commands

2023-11-02 Thread Jeff Davis
which would compromise the index structure and any constraints using that index. But that problem is more bounded, at least. ] > After that, change search_path on function invocation as usual > rather than having special rules for what happens when a function is > invoked during a m

Inconsistent use of "volatile" when accessing shared memory?

2023-11-02 Thread Jeff Davis
mple examples with gcc at -O2, which seem to emit the loads/stores where expected. What is the guidance here? Is the volatile pointer use in AdvanceXLInsertBuffer() required, and if so, why not other places? Regards, Jeff Davis

Re: Improve WALRead() to suck data directly from WAL buffers when possible

2023-11-03 Thread Jeff Davis
't think that's true right now: AdvanceXLInsertBuffer() zeroes the old page before updating xlblocks[nextidx]. I think it needs something like: pg_atomic_write_u64(&XLogCtl->xlblocks[nextidx], InvalidXLogRecPtr); pg_write_barrier(); before the MemSet. I didn't review your latest v14 patch yet. Regards, Jeff Davis
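The ordering suggested above (invalidate, barrier, then zero) can be sketched in portable C11. This is only an illustration, not the PostgreSQL code: end_ptrs, pages, recycle_buffer, PAGE_SIZE, N_BUFFERS and INVALID_PTR are hypothetical names, and a C11 release fence is used as a rough stand-in for pg_write_barrier().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE   8192            /* hypothetical page size */
#define N_BUFFERS   4               /* hypothetical number of buffers */
#define INVALID_PTR UINT64_C(0)     /* stands in for InvalidXLogRecPtr */

/* Stands in for XLogCtl->xlblocks[]: the end pointer of each buffered page. */
static _Atomic uint64_t end_ptrs[N_BUFFERS];
static char pages[N_BUFFERS][PAGE_SIZE];

/* Reuse buffer idx for a new page whose end pointer is new_end. */
static void
recycle_buffer(int idx, uint64_t new_end)
{
    assert(idx >= 0 && idx < N_BUFFERS);

    /* 1. Mark the slot invalid so readers stop trusting its contents. */
    atomic_store_explicit(&end_ptrs[idx], INVALID_PTR, memory_order_relaxed);

    /* 2. Barrier: the invalidation must not be reordered after the zeroing. */
    atomic_thread_fence(memory_order_release);

    /* 3. Only now clear the old page. */
    memset(pages[idx], 0, PAGE_SIZE);

    /* 4. Publish the new end pointer after the page has been cleared. */
    atomic_store_explicit(&end_ptrs[idx], new_end, memory_order_release);
}
```

A reader that sees INVALID_PTR for a slot would treat the page as being recycled and fall back to another source for the data.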

Re: Pre-proposal: unicode normalized text

2023-11-03 Thread Jeff Davis
unicode_category.c to @pgcommonallfiles in Mkvcbuild.pm. I'll do a trial commit tomorrow and see if that fixes it unless someone has a better suggestion. Regards, Jeff Davis

Re: Pre-proposal: unicode normalized text

2023-11-03 Thread Jeff Davis
push that to fix the MSVC buildfarm > members. > > Sorry for the duplicate effort and/or stepping on your toes. Thank you, no apology necessary. Regards, Jeff Davis

Re: Pre-proposal: unicode normalized text

2023-11-03 Thread Jeff Davis
On Fri, 2023-11-03 at 17:11 +0700, John Naylor wrote: > On Sat, Oct 28, 2023 at 4:15 AM Jeff Davis wrote: > > > > I plan to commit something like v3 early next week unless someone > > else > > has additional comments or I missed a concern. > > Hi Jeff, is the C

Re: Improve WALRead() to suck data directly from WAL buffers when possible

2023-11-03 Thread Jeff Davis
nceXLInsertBuffer(), 3) > the following sanity check to see if the read page is valid in > XLogReadFromBuffers(). If it sounds sensible, I'll work towards > coding > it up. Thoughts? I like it. I think it will ultimately be a fairly simple loop. And by moving to atomics, we won't need the delicate comment in GetXLogBuffer(). Regards, Jeff Davis

Re: Inconsistent use of "volatile" when accessing shared memory?

2023-11-03 Thread Jeff Davis
r to me exactly why that matters. Intuitively, access through a local pointer seems much more likely to be optimized and therefore more dangerous, but that doesn't imply that access through global variables is not dangerous. Regards, Jeff Davis

Re: Inconsistent use of "volatile" when accessing shared memory?

2023-11-04 Thread Jeff Davis
because I'm suggesting that he can avoid the WALBufMappingLock to reduce the risk of a regression. In the process, we'll probably get rid of that unnecessary "volatile" in AdvanceXLInsertBuffer(). Regards, Jeff Davis

Re: Improve WALRead() to suck data directly from WAL buffers when possible

2023-11-04 Thread Jeff Davis
Assert(!XLogRecPtrIsInvalid(EndPtr)); Can that really happen? If the EndPtr is invalid, that means the page is in the process of being cleared, so the contents of the page are undefined at that time, right? Regards, Jeff Davis

Re: Fix search_path for all maintenance commands

2023-11-07 Thread Jeff Davis
her way search path can be changed, which adds to the complexity. Also, by default it's "$user", public; and given that "public" was world-writable until recently, that doesn't seem like a good idea for a change intended to prevent search_path manipulation. Regards, Jeff Davis

Why do indexes and sorts use the database collation?

2023-11-10 Thread Jeff Davis
Granted, there are reasons to want an index to have a particular collation, in which case it makes sense to opt-in to #2. But in the common case, the high performance costs and dependency versioning risks aren't worth it. Thoughts? Regards, Jeff Davis

Re: Why do indexes and sorts use the database collation?

2023-11-11 Thread Jeff Davis
>   than the user's direct request (e.g. DISTINCT/GROUP BY, merge > joins). +1. Where "cheaper" comes from is an interesting question -- is it a property of the provider or the specific collation? Or do we just call "C" special? Regards, Jeff Davis

Re: Why do indexes and sorts use the database collation?

2023-11-13 Thread Jeff Davis
On Mon, 2023-11-13 at 13:43 +0100, Peter Eisentraut wrote: > On 11.11.23 01:03, Jeff Davis wrote: > > But the database collation is always deterministic, > > So far! Yeah, if we did that, clearly the index collation would need to match that of the database to be useful. Wh

Re: Why do indexes and sorts use the database collation?

2023-11-13 Thread Jeff Davis
> > I'd think the specific collation. Even if we initially perhaps just > get the > default cost from the provider such, it structurally seems the sanest > place to > locate the cost. Makes sense, though I'm thinking we'd still want to special case the fastest collation as "C". Regards, Jeff Davis

Re: Why do indexes and sorts use the database collation?

2023-11-13 Thread Jeff Davis
course, if we feel entitled to create the primary key index with a > collation of our choosing, that'd make this unpredictable. I wouldn't describe it as "unpredictable". We'd have some defined way of defaulting the collation of an index which might be affected by a database option or something. In any case, it would be visible with \d. Regards, Jeff Davis

Re: Why do indexes and sorts use the database collation?

2023-11-14 Thread Jeff Davis
On Tue, 2023-11-14 at 17:15 +0100, Peter Eisentraut wrote: > On 14.11.23 02:58, Jeff Davis wrote: > > If the user just wants PK/FK constraints, and equality lookups, > > then an > > index with the "C" collation makes a lot of sense to serve those > > purposes.

Re: Why do indexes and sorts use the database collation?

2023-11-14 Thread Jeff Davis
e in this thread, that's less useful than it may seem at first (text indexes are often uncorrelated). It seems valid to offer this as a trade-off that users can make. Regards, Jeff Davis

Re: Why do indexes and sorts use the database collation?

2023-11-14 Thread Jeff Davis
be easy to block where necessary. Regards, Jeff Davis

Re: Faster "SET search_path"

2023-11-14 Thread Jeff Davis
On Thu, 2023-10-19 at 19:01 -0700, Jeff Davis wrote: > 0003: Cache for recomputeNamespacePath. Committed with some further simplification around the OOM handling. Instead of using MCXT_ALLOC_NO_OOM, it just temporarily sets the cache invalid while copying the string, and sets it valid ag

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

2023-11-14 Thread Jeff Davis
ant under all possible values of > search_path. If you care about your function behaving the same way > all > the time, you have to set the search_path. After adding the search path cache (recent commit f26c2368dc) hopefully that helps to make the above suggestion more reasonable performance-wise. I think we can call that progress. Regards, Jeff Davis

Re: Why do indexes and sorts use the database collation?

2023-11-15 Thread Jeff Davis
k this answers my earlier question. Now that I think about > this, the one confusing thing with this syntax is that it seems to > assign the collation to the constraint, but in reality we want the > constraint to be enforced with the column's collation and the > alternative collation is for the index. Yeah, let's be careful about that. It's still technically correct: uniqueness in either collation makes sense. But it could be confusing anyway. > > Regards, Jeff Davis

Re: Faster "SET search_path"

2023-11-16 Thread Jeff Davis
On Tue, 2023-11-14 at 20:13 -0800, Jeff Davis wrote: > On Thu, 2023-10-19 at 19:01 -0700, Jeff Davis wrote: > > 0003: Cache for recomputeNamespacePath. > > Committed with some further simplification around the OOM handling. While I considered OOM during hash key initialization

simplehash: preserve consistency in case of OOM

2023-11-17 Thread Jeff Davis
Right now, if allocation fails while growing a hashtable, it's left in an inconsistent state and can't be used again. Patch attached. -- Jeff Davis PostgreSQL Contributor Team - AWS From 82068d744f668039de7249854bc42eead4e77ebc Mon Sep 17 00:00:00 2001 From: Jeff Davis Date: F
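One common way to keep a hash table usable after a failed grow is to allocate the new storage first and touch the table only once that allocation has succeeded. The sketch below shows that general idea with a tiny open-addressing table. It is not simplehash.h and not necessarily what the attached patch does; Slot, Table and the table_* functions are hypothetical, and duplicate keys are not handled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { uint32_t key; int used; } Slot;
typedef struct { Slot *slots; size_t size; size_t members; } Table;

/* size must be a power of two */
static int
table_init(Table *t, size_t size)
{
    t->slots = calloc(size, sizeof(Slot));
    if (t->slots == NULL)
        return 0;
    t->size = size;
    t->members = 0;
    return 1;
}

/* Linear probing; the caller guarantees at least one free slot. */
static void
place(Slot *slots, size_t size, uint32_t key)
{
    size_t i = key & (size - 1);

    while (slots[i].used)
        i = (i + 1) & (size - 1);
    slots[i].key = key;
    slots[i].used = 1;
}

/* Returns 0 on allocation failure, leaving *t exactly as it was. */
static int
table_grow(Table *t)
{
    size_t newsize = t->size * 2;
    Slot  *newslots = calloc(newsize, sizeof(Slot));

    if (newslots == NULL)
        return 0;               /* old array, size and members untouched */

    for (size_t i = 0; i < t->size; i++)
        if (t->slots[i].used)
            place(newslots, newsize, t->slots[i].key);

    free(t->slots);
    t->slots = newslots;
    t->size = newsize;
    return 1;
}

/* Keeps the load factor at or below 1/2 so probing always terminates. */
static int
table_insert(Table *t, uint32_t key)
{
    if ((t->members + 1) * 2 > t->size && !table_grow(t))
        return 0;
    place(t->slots, t->size, key);
    t->members++;
    return 1;
}

static int
table_contains(const Table *t, uint32_t key)
{
    size_t i = key & (t->size - 1);

    while (t->slots[i].used)
    {
        if (t->slots[i].key == key)
            return 1;
        i = (i + 1) & (t->size - 1);
    }
    return 0;
}
```

Because table_grow() commits the new array only at the very end, a failed insert leaves every existing entry reachable.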

Change GUC hashtable to use simplehash?

2023-11-17 Thread Jeff Davis
I had briefly experimented changing the hash table in guc.c to use simplehash. It didn't offer any measurable speedup, but the API is slightly nicer. I thought I'd post the patch in case others thought this was a good direction or nice cleanup. -- Jeff Davis PostgreSQL Contributor

Re: Why do indexes and sorts use the database collation?

2023-11-17 Thread Jeff Davis
ter to his > argument that > we could just use "C" for such indexes. I am saying we shouldn't prematurely optimize for the case of ORDER BY on a text PK case by making an index with a non-"C" collation, given the costs and risks of non-"C" indexes. Particularly because, even if there is an ORDER BY, there are several common reasons such an index would not help anyway. > > > > Regards, Jeff Davis

Re: should check collations when creating partitioned index

2023-11-17 Thread Jeff Davis
518f791168bc6fb653d1f95f4d.ca...@j-davis.com Regards, Jeff Davis

Re: simplehash: preserve consistency in case of OOM

2023-11-17 Thread Jeff Davis
On Fri, 2023-11-17 at 12:13 -0800, Andres Freund wrote: > On 2023-11-17 10:42:54 -0800, Jeff Davis wrote: > > Right now, if allocation fails while growing a hashtable, it's left > > in > > an inconsistent state and can't be used again. > > I'm not ag

Re: should check collations when creating partitioned index

2023-11-17 Thread Jeff Davis
n other potential improvements/mitigations and see if I can make progress somewhere else. Regards, Jeff Davis

Re: Change GUC hashtable to use simplehash?

2023-11-17 Thread Jeff Davis
ng both hsearch.h and simplehash.h for overlapping use cases indefinitely, then I'll drop this. Regards, Jeff Davis

Re: Change GUC hashtable to use simplehash?

2023-11-17 Thread Jeff Davis
e might not rewrite hsearch.  But simplehash was never meant > to be a universal solution. OK, I will withdraw the patch until/unless it provides a concrete benefit. Regards, Jeff Davis

Re: Change GUC hashtable to use simplehash?

2023-11-17 Thread Jeff Davis
l see if I can solve the case-folding slowness first, and then maybe it will be measurable. Regards, Jeff Davis

Re: Change GUC hashtable to use simplehash?

2023-11-17 Thread Jeff Davis
fails, case-fold and try again. I'll hack up a patch -- I believe that would be measurable for the proconfigs. Regards, Jeff Davis
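The "try the exact name first, case-fold only on a miss" idea mentioned here can be sketched as follows. This is a hedged illustration rather than the guc.c code: known_names, find_exact and lookup_name are hypothetical, a linear scan stands in for the hash table, and tolower() stands in for whatever ASCII folding the real code uses.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table; names are stored already folded to lower case. */
static const char *const known_names[] = {"search_path", "work_mem", "datestyle"};

static const char *
find_exact(const char *name)
{
    for (size_t i = 0; i < sizeof(known_names) / sizeof(known_names[0]); i++)
        if (strcmp(known_names[i], name) == 0)
            return known_names[i];
    return NULL;
}

static const char *
lookup_name(const char *name)
{
    char        folded[64];
    size_t      len;
    const char *hit = find_exact(name);     /* fast path: no folding */

    if (hit != NULL)
        return hit;

    len = strlen(name);
    if (len >= sizeof(folded))
        return NULL;                        /* too long for this sketch */

    /* Slow path: fold a copy (including the terminating NUL) and retry. */
    for (size_t i = 0; i <= len; i++)
        folded[i] = (char) tolower((unsigned char) name[i]);

    return find_exact(folded);
}
```

Callers that already pass folded names, which the thread suggests is the common case, never pay for the copy and fold.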

Re: Change GUC hashtable to use simplehash?

2023-11-19 Thread Jeff Davis
002 and 0001): master: 7899ms 0001: 7850 0002: 7958 0003: 7942 0004: 7549 0005: 7411 I'm inclined toward all of these patches. I'll also look at adding SH_STORE_HASH for the search_path cache. Looks like we're on track to bring the overhead of SET search_p

Re: PANIC serves too many masters

2023-11-20 Thread Jeff Davis
ting a few PANIC sites at a time? Is it fine to leave plain PANICs in place for the foreseeable future, or do you want all of them to eventually move? Regards, Jeff Davis

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

2023-11-20 Thread Jeff Davis
t as a best practice in multi-user environments". Regards, Jeff Davis

Re: PANIC serves too many masters

2023-11-20 Thread Jeff Davis
"could not locate a valid checkpoint record"), errabort(false),errrestart(false))); Regards, Jeff Davis

Re: Faster "SET search_path"

2023-11-20 Thread Jeff Davis
On Thu, 2023-11-16 at 16:46 -0800, Jeff Davis wrote: > While I considered OOM during hash key initialization, I missed some > other potential out-of-memory hazards. Attached a fixup patch 0003, > which re-introduces one list copy but it simplifies things > substantially in addition to

simplehash: SH_OPTIMIZE_REPEAT for optimizing repeated lookups of the same key

2023-11-20 Thread Jeff Davis
, Jeff Davis From b878af835da794f3384f870db57b34e236b1efba Mon Sep 17 00:00:00 2001 From: Jeff Davis Date: Mon, 20 Nov 2023 17:42:07 -0800 Subject: [PATCH] Add SH_OPTIMIZE_REPEAT option to simplehash.h. Callers which expect to look up the same value repeatedly can specify SH_OPTIMIZE_REPEAT

Re: simplehash: SH_OPTIMIZE_REPEAT for optimizing repeated lookups of the same key

2023-11-20 Thread Jeff Davis
sh, and see where it tends to win. The caller can also save the hash and pass it down, but that's not always convenient to do. Regards, Jeff Davis

Re: Change GUC hashtable to use simplehash?

2023-11-21 Thread Jeff Davis
might benefit some other callers? Regards, Jeff Davis

Re: simplehash: SH_OPTIMIZE_REPEAT for optimizing repeated lookups of the same key

2023-11-21 Thread Jeff Davis
re likely to benefit we can reconsider. Though it makes it easy to test a few other callers, just to see what numbers appear. Regards, Jeff Davis

Re: proposal: change behavior on collation version mismatch

2023-11-27 Thread Jeff Davis
tor. > If we want to have a GUC that > allows warning behavior, I think that's OK but I think it should be > superuser-only and documented as a "developer" setting similar to > zero_damaged_pages. A GUC seems sensible to express the availability-vs-safety trade-off. I su

Re: proposal: change behavior on collation version mismatch

2023-11-27 Thread Jeff Davis
o it will be a long time before it's used widely enough to consider the problem solved. And even after all of that, ICU is not perfect, and our support for it still has various rough edges. Regards, Jeff Davis

Re: proposal: change behavior on collation version mismatch

2023-11-27 Thread Jeff Davis
false positives and false negatives. We'd need to document the setting so that users understand the consequences and limitations. I won't push strongly for such a setting to exist because I know that it's far from a complete solution. But I believe it would be sensible considering that this problem is going to take a while to resolve. Regards, Jeff Davis

Re: proposal: change behavior on collation version mismatch

2023-11-27 Thread Jeff Davis
nd packaging infrastructure, that is not very practical. Regards, Jeff Davis

encoding affects ICU regex character classification

2023-11-29 Thread Jeff Davis
se ICU is not allowed for that encoding), but I'd like it if we could make this infrastructure independent of ICU, because I have some follow-up proposals to simplify character classification here and in ts_locale.c. Thoughts? Regards, Jeff Davis

Re: Change GUC hashtable to use simplehash?

2023-12-03 Thread Jeff Davis
s.com which optimizes exact hits (most GUC names are already folded) before trying case folding? Regards, Jeff Davis

Re: Change GUC hashtable to use simplehash?

2023-12-04 Thread Jeff Davis
ial/clever with the hash functions. We would still want the faster hash for C-strings, but that's general and helps all callers. But you're right that it's more code, and that's not great. Regards, Jeff Davis

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

2023-12-04 Thread Jeff Davis
om around ~7300ms to ~6800ms. This doesn't seem very controversial or complex, so I'll probably commit this soon unless someone else has a comment. -- Jeff Davis PostgreSQL Contributor Team - AWS From 906cb1cdf42f92090d4a9acf296098ec3bfa53e0 Mon Sep 17 00:00:00 2001 From: Jeff Davis Da

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

2023-12-05 Thread Jeff Davis
the cast. Regards, Jeff Davis From 72b00b1b094945845e4ea4d427e426eafd5650c2 Mon Sep 17 00:00:00 2001 From: Jeff Davis Date: Mon, 4 Dec 2023 16:20:05 -0800 Subject: [PATCH v2] Cache opaque handle for GUC option to avoid repeasted lookups. When setting GUCs from proconfig, perfo

Re: Faster "SET search_path"

2023-12-05 Thread Jeff Davis
On Mon, 2023-11-20 at 17:13 -0800, Jeff Davis wrote: > Will commit 0005 soon. Committed. > I also attached a trivial 0006 patch that uses SH_STORE_HASH. I > wasn't > able to show much benefit, though, even when there's a bucket > collision. Perhaps there just aren'

Re: Change GUC hashtable to use simplehash?

2023-12-06 Thread Jeff Davis
find it hard > to > follow. OK. I am fine with (a). Regards, Jeff Davis

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

2023-12-07 Thread Jeff Davis
I'm not inclined to commit this in its current form but if someone thinks that it's a worthwhile direction, I can clean it up a bit and reconsider. Regards, Jeff Davis From e48a54d9880ab65a1e5ad6d136b849bda2e4554e Mon Sep 17 00:00:00 2001 From: Jeff Davis Date:

Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }

2023-12-07 Thread Jeff Davis
On Tue, 2023-12-05 at 11:58 -0800, Jeff Davis wrote: > Also, I forward-declared config_generic in guc.h to eliminate the > cast. Looking more closely, I fixed an issue related to placeholder configs. We can't return a handle to a placeholder, because it's not stable, so in that ca

Re: micro-optimizing json.c

2023-12-07 Thread Jeff Davis
e there's a way to use a static buffer to even avoid the palloc() in get_str_from_var()? Not sure these are worth the effort; just brainstorming. In any case, +1 to your simple change. Regards, Jeff Davis

Re: Improve WALRead() to suck data directly from WAL buffers when possible

2023-12-07 Thread Jeff Davis
to be ahead of the + * page we're looking for. Don't PANIC on that, until we've verified the + * value while holding the lock. Is that still true even without a torn read? The code for 0001 itself looks good. These are minor concerns and I am inclined to commit something like it fa

Re: Change GUC hashtable to use simplehash?

2023-12-08 Thread Jeff Davis
earch path cache, and there's a significant speedup for cases not benefiting from a86c61c9ee. It's enough that we almost don't need a86c61c9ee. So a definite +1 to the new APIs. Regards, Jeff Davis

Re: Change GUC hashtable to use simplehash?

2023-12-08 Thread Jeff Davis
t if you want to commit that piece now, but I hesitate > to > call it a performance improvement on its own. > > - The runtime measurements I saw reported were well within the noise > level. > - The memory usage starts out better, but with more entries is worse. I suppose I'll wa

Re: Change GUC hashtable to use simplehash?

2023-12-09 Thread Jeff Davis
oing in the attached path is using part of the key as the seed. Is that a good idea or should the seed be zero or come from somewhere else? Regards, Jeff Davis From a30e5f0ea580fb5038eb90e862f697b557627f32 Mon Sep 17 00:00:00 2001 From: Jeff Davis Date: Fri, 8 Dec 2023 12:14:27 -0800 Subj

Re: encoding affects ICU regex character classification

2023-12-12 Thread Jeff Davis
nd ICU select 'Σ' ~* 'ς'; -- true in both libc and ICU Similarly for titlecase variants: select 'Dž' ~* 'dž'; -- false in libc and ICU select 'dž' ~* 'Dž'; -- true in libc and ICU If we do the case mapping ourselves, we can make those work. We'd just have to modify the APIs a bit so that allcases() can actually get all of the case variants, rather than relying on just towupper/towlower. Regards, Jeff Davis

Re: Built-in CTYPE provider

2023-12-13 Thread Jeff Davis
at > we're not going to ever support newly added Unicode characters like > Latin Glottals. If, by "version it", you mean "update the data tables in new Postgres versions", then I agree. If you mean that one PG version would need to support many versions of Unicode, I don't agree. Regards, Jeff Davis [5] https://postgr.es/m/c5e9dac884332824e0797937518da0b8766c1238.ca...@j-davis.com [6] https://www.unicode.org/policies/stability_policy.html#Case_Folding

Re: Built-in CTYPE provider

2023-12-13 Thread Jeff Davis
On Wed, 2023-12-13 at 16:34 +0100, Daniel Verite wrote: > But there are CLDR mappings on top of that. I see, thank you. Would it still be called "full" case mapping to only use the mappings in SpecialCasing.txt? And would that be useful? Regards, Jeff Davis

Re: Built-in CTYPE provider

2023-12-14 Thread Jeff Davis
ithout this additional tailoring. You are correct that ICU will still have some features that won't be supported by the builtin provider. Better word boundary semantics in INITCAP() are another advantage. Regards, Jeff Davis

Re: encoding affects ICU regex character classification

2023-12-14 Thread Jeff Davis
should support case folding.) > And I have no idea if or when > glibc might have picked up the new unicode characters. That's a strong argument in favor of a builtin provider. Regards, Jeff Davis

Re: Built-in CTYPE provider

2023-12-18 Thread Jeff Davis
n't consistent with each other. ICU, libc, and the builtin provider will all be based on different versions of Unicode. That's by design. The built-in provider will be a bit better in the sense that it's consistent with the normalization functions, and the other providers aren't. Regards, Jeff Davis

Re: encoding affects ICU regex character classification

2023-12-18 Thread Jeff Davis
rried about. Regards, Jeff Davis

Re: Change GUC hashtable to use simplehash?

2023-12-18 Thread Jeff Davis
boundary, which I think is OK (though I think I'd need to fix the patch for when maxalign < 8). Regards, Jeff Davis From 055d5cc24404584fd98109fabdcf83348e5c49b4 Mon Sep 17 00:00:00 2001 From: Jeff Davis Date: Mon, 18 Dec 2023 16:44:27 -0800 Subject: [PATCH v10jd] Optimize hash functi

Re: Change GUC hashtable to use simplehash?

2023-12-19 Thread Jeff Davis
t of place and possibly slow, and there's a bitwise trick we can use instead. My original test case is a bit too "macro" of a benchmark at this point, so I'm not sure it's a good guide for these individual micro-optimizations. Regards, Jeff Davis

Re: Built-in CTYPE provider

2023-12-19 Thread Jeff Davis
ifferent locale at initdb time, they would be doing so intentionally, rather than implicitly accepting index corruption risks based on an environment variable. Regards, Jeff Davis

Re: Built-in CTYPE provider

2023-12-20 Thread Jeff Davis
ibity, > truly immutable and faster indexes for fields that > don't require linguistic ordering, alignment between Unicode > updates and Postgres updates. Thank you, that summarizes exactly the compromise that I'm trying to reach. Regards, Jeff Davis

Re: Built-in CTYPE provider

2023-12-20 Thread Jeff Davis
e, fast, stable, better semantics than "C" for many locales, and we can document it. In any case, we don't need to decide that now. If the builtin provider is useful, we should do it. Regards, Jeff Davis

Re: broken master regress tests

2023-12-20 Thread Jeff Davis
e.linux.utf8.sql seems to be skipped on my machine because of the "version() !~ 'linux-gnu'" check, even though I'm running Ubuntu. Is that test getting run often enough? And relatedly, is it worth thinking about extending pg_regress to report skipped tests so it's easier to f

Re: Built-in CTYPE provider

2023-12-21 Thread Jeff Davis
we'd have to consider whether it's worth it or not. Ideally, new callers would either use the new APIs or use the pg_ascii_* APIs. Regards, Jeff Davis

Re: Built-in CTYPE provider

2023-12-21 Thread Jeff Davis
connect to a database with non-unicode > encoding? > 💥😜  ...at least it seems to be able to walk the index without > decoding > strings to find other users - but the way these global catalogs work > scares me a little bit) I didn't see that specific demo, but in general we seem to change between pg_wchar and unicode code points too freely, so I'm not surprised that something went wrong. Regards, Jeff Davis

Re: broken master regress tests

2023-12-21 Thread Jeff Davis
On Wed, 2023-12-20 at 17:48 -0800, Jeff Davis wrote: > Attached. It appears to increase the coverage. I committed it and I'll see how the buildfarm reacts. Regards, Jeff Davis

Re: broken master regress tests

2023-12-28 Thread Jeff Davis
rministic = true);'; END $$; The above may need some adjustment, but perhaps you can try it out? Another option might be to use \gset to assign it to a variable, which might be more readable, but I think it's better to just follow what the rest of the file is doing. Regards, Jeff Davis

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2023-12-29 Thread Jeff Davis
d we look around for other unrelated protocol changes to make at the same time? Do we want a more generic form of negotiation? Regards, Jeff Davis

Re: [17] collation provider "builtin"

2023-12-29 Thread Jeff Davis
s://www.postgresql.org/message-id/804eb67b37f41d3afeb2b6469cbe8bfa79c562cc.ca...@j-davis.com and the most recent patch is posted there. Having a built-in provider is more useful if it also offers a "C.UTF-8" locale that is superior to the libc locale of the same name. Regards, Jeff Davis

Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs

2023-12-29 Thread Jeff Davis
one idea that came up. Regards, Jeff Davis

Re: broken master regress tests

2023-12-29 Thread Jeff Davis
ate that nondeterministic > collations not supported. Thank you, pushed this version. There are other similar commands in the file, so I think it's fine. It exercises a specific locale that might be different from datcollate. Regards, Jeff Davis

Re: [17] CREATE SUBSCRIPTION ... SERVER

2023-12-29 Thread Jeff Davis
On Tue, 2023-09-05 at 12:08 -0700, Jeff Davis wrote: > OK, so we could have a built-in FDW called pg_connection that would > do > the right kinds of validation; and then also allow other FDWs but the > subscription would have to do its own validation. Attached a rough rebased version

Re: [17] CREATE SUBSCRIPTION ... SERVER

2023-12-31 Thread Jeff Davis
On Fri, 2023-12-29 at 15:22 -0800, Jeff Davis wrote: > On Tue, 2023-09-05 at 12:08 -0700, Jeff Davis wrote: > > OK, so we could have a built-in FDW called pg_connection that would > > do > > the right kinds of validation; and then also allow other FDWs but > > the > >

Re: Minor cleanup for search path cache

2024-01-02 Thread Jeff Davis
in > check_search_path(). Looks good to me. Regards, Jeff Davis

Re: Improve WALRead() to suck data directly from WAL buffers when possible

2024-01-04 Thread Jeff Davis
's not clear to me the check of the wal page headers is the right one anyway. It seems like all of this would be simpler if you checked first how far you can safely read data, and then just loop and read that far. I'm not sure that it's worth it to try to mix the validity checks
