On Thu, 2024-10-24 at 10:05 +0200, Andreas Karlsson wrote: > Why is there no pg_locale_builtin.c?
Just that it would be a fairly small file, but I'm fine with doing that. > I think adding an assert to create_pg_locale() which enforces valid > there is always a combination of ctype_is_c and casemap would be > good, > similar to the collate field. Good idea. > Why are casemap and ctype_methods not the same struct? They seem very > closely related. The code impact was in fairly different places, so it seemed like a nice way to break it out. I could combine them, but it would be a fairly large patch. > This commit makes me tempted to handle the ctype_is_c logic for > character classes also in callbacks and remove the if in functions > like > pg_wc_ispunct(). But this si something that would need to be > benchmarked. That's a good idea. The reason collate_is_c is important is because there are quite a few caller-specific optimizations, but that doesn't seem to be true of ctype_is_c. > I wonder if the bitmask idea isn't terrible for the branch predictor > and > that me may want one function per character class, but this is yet > again > something we need to benchmark. Agreed -- a lot of work has gone into optimizing the regex code, and we don't want a perf regression there. But I'm also not sure exactly which kinds of tests I should be running for that. > Is there a reason we allocate the icu_provider in > create_pg_locale_icu > with MemoryContextAllocZero when we intialize everything anyway? And > similar for other providers. Allocating and zeroing is a good defense against new optional methods and fields which can safely default to zero. > = v6-0011-Introduce-hooks-for-creating-custom-pg_locale_t.patch > > Looks good but seems like a quite painful API to use. How is it painful and can we make it better? > > * Have a CREATE LOCALE PROVIDER command and make "provider" an Oid > > rather than a char ('b'/'i'/'c'). The v6 patches brings us close to > > this point, but I'm not sure if we want to go this far in v18. > > Probably necessary but I hate all the DDL commands the way to SQL > standard is written forces us to add. There is some precedent for a DDL-like thing without new grammar: pg_replication_origin_create(). I don't have a strong opinion on whether to do that or not. > Regards, Jeff Davis