> passed, so I think this is a reasonable alternative to that design.
I'd have to see the patch to see whether I liked the end result. But
I'm guessing that involves a lot of non-mechanical changes in the call
sites, and also relies on test coverage for all of them.
Regards,
Jeff Davis
ile.
Should we automatically retain files associated with warnings, or copy
them to a different location?
Regards,
Jeff Davis
g infrastructure is a lot less of a
problem than other kinds of complexity, so it might be OK. But it would
be nice if there were a couple cases that would benefit rather than
one.
Regards,
Jeff Davis
as SECURITY DEFINER and then someone changes it
later?
Regards,
Jeff Davis
to an
"upgrade_warnings" directory sounds like a reasonable way to go.
Regards,
Jeff Davis
ficant performance overhead
to wrapping the function as is done for SECURITY DEFINER, so if the
function is obviously safe, it would be nice to avoid that. And it
would be another tool to help us mitigate the various related problems
we have with selecting from views, etc.
Regards,
Jeff Davis
ities
for optimization as well, such as:
* reducing the need for palloc and extra buffers, perhaps by using
buffers on the stack for small strings
* operate more directly on UTF-8 data rather than decoding and
re-encoding the entire string
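As a minimal sketch of the first point, assuming plain malloc() in place of palloc() and ASCII-only folding: the names SMALL_BUF_SIZE, fold_ascii() and folded_equals() are hypothetical and do not come from the PostgreSQL tree. Short inputs use a fixed stack buffer, and only longer ones pay for a heap allocation.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define SMALL_BUF_SIZE 64       /* hypothetical threshold, not a tuned value */

/* Fold ASCII letters of src into dst; dst must hold len + 1 bytes. */
void
fold_ascii(char *dst, const char *src, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = (char) tolower((unsigned char) src[i]);
    dst[len] = '\0';
}

/* Return true if the ASCII-folded form of s equals the already-folded key. */
bool
folded_equals(const char *s, const char *folded_key)
{
    char        stackbuf[SMALL_BUF_SIZE];
    char       *buf = stackbuf;
    size_t      len;
    bool        result;

    assert(s != NULL && folded_key != NULL);
    len = strlen(s);

    /* Allocate only when the input does not fit in the stack buffer. */
    if (len + 1 > sizeof(stackbuf))
    {
        buf = malloc(len + 1);
        if (buf == NULL)
            abort();
    }

    fold_ascii(buf, s, len);
    result = (strcmp(buf, folded_key) == 0);

    /* Free only what was heap-allocated. */
    if (buf != stackbuf)
        free(buf);
    return result;
}
```

Calling folded_equals("HeLLo", "hello") takes the stack path. A 200-byte input takes the malloc() path and gives the same answer.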
Regards,
Jeff Davis
>
> Works for me.
Sounds good. We can document compatibility notes around this point.
If normalization becomes important, we can take the time to work out
the performance implications more carefully, and potentially introduce
an NCASEFOLD() if needed.
Regards,
Jeff Davis
On Thu, 2025-06-12 at 08:58 -0700, Jeff Davis wrote:
> On Thu, 2025-06-12 at 09:52 -0500, Nathan Bossart wrote:
> > If the idea is to remove all options for default behavior, we'd be
> > removing
> > --no-statistics, --with-data, and --with-schema at this point.
>
he entry for EXCLUDE? I also merged your wording with
some similar wording from the entry about UNIQUE. Attached.
Regards,
Jeff Davis
From 0988ec1bac79055899fb555ac0c0441333888c83 Mon Sep 17 00:00:00 2001
From: "Paul A. Jungwirth"
Date: Tue, 17 Jun 2025 20:48:56 -0700
Subject:
type.
> I guess I don't feel strongly about it either
> way.
Are you a user of citext? I'm genuinely interested in the use cases,
and whether the separate-data-type approach has merits that are missing
in the other approaches.
Regards,
Jeff Davis
tisfy Robert's concern about
the --help output. But Robert also wants stats off by default for
pg_dump and on by default for pg_restore, which I think means we need
both --with-statistics and --no-statistics anyway. Robert, comments?
Regards,
Jeff Davis
override that and I'm not sure we have one right now.
Regards,
Jeff Davis
ot sure whether we'd want to standardize one or both of
those functions.
And if you think there's likely to be a collision with the standard
that's hard to anticipate and fix now, then we should consider
reverting CASEFOLD() for 18 and wait for more progress on the
standardization. W
R(), so that
sounds like a good idea. I'd be interested to hear from users of
citext.
Regards,
Jeff Davis
ize _or_ use form-
> insensitive string comparison, but nothing did that 20 years ago.
> Thus
> doing the form-insensitivity in the filesystem seemed best, and if
> you
> do that you can be form-preserving to enable the optimization
> described
> above.
Databases have similar concerns as a filesystem in this respect.
Regards,
Jeff Davis
pen-source Unicode normalization? If so, that would be very
cool.
The reason I'm asking is because, if there are multiple open source
implementations, we should either have the best one, or just borrow
another one as long as it has a suitable license (perhaps translating
to C as necessary).
Regards,
Jeff Davis
isible changes in the past, and
> regenerating tsvectors because of that were merely a suggestion.
Interesting, thank you for looking into the history here. It would
certainly be simpler to just make FTS fully collation-aware.
Regards,
Jeff Davis
ther options,
we don't need to worry about consistency with them, and I think we
should just use "--statistics".
Regards,
Jeff Davis
Fixed.
Regards,
Jeff Davis
y.
To me, "last option wins" means that you don't raise an error; the
later option simply overrides the earlier one.
Given that the pg_dump options are not order-sensitive now (unless I'm
missing something), I'm worried about the consequences of trying to
make them so now.
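To make the distinction concrete, here is a minimal sketch of "last option wins" for a single pair of switches. It is not pg_dump's actual option parser, and resolve_statistics() is a hypothetical name. Each occurrence overwrites the setting without raising an error, so the result depends on argument order.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Scan the arguments left to right. Every occurrence of either switch
 * overwrites the current setting, so the last one seen decides.
 */
bool
resolve_statistics(int argc, const char *const *argv, bool default_value)
{
    bool        dump_stats = default_value;

    assert(argc == 0 || argv != NULL);

    for (int i = 0; i < argc; i++)
    {
        if (strcmp(argv[i], "--statistics") == 0)
            dump_stats = true;
        else if (strcmp(argv[i], "--no-statistics") == 0)
            dump_stats = false;
    }
    return dump_stats;
}
```

With this approach, "--no-statistics --statistics" and "--statistics --no-statistics" give opposite results. That is the order sensitivity the options do not have today.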
Regards,
Jeff Davis
On Mon, 2025-06-16 at 16:09 -0500, Nathan Bossart wrote:
> So perhaps there's not as strong of a
> consensus as we thought. Maybe we should ask for any new/updated
> votes.
Does it make any sense to be off by default in 18 and on in some later
release?
Regards,
Jeff Davis
but the "--x-only" options
also put us in a tough spot.
If --data-only had always been spelled "--no-schema" (or
"--without-data" or whatever), and --schema-only had always been
spelled "--no-data", then I think it would be a lot easier to add
statistics into the mix.
Regards,
Jeff Davis
folding would also
want normalization, but it's hard to weigh that against the performance
cost. It might not matter outside of a few edge cases, though I'm not
sure exactly how many.
Regards,
Jeff Davis
/ comments. Another caller is
get_iso_localename().
There are also a couple false positives where mbstowcs_l/wcstombs_l are
emulated with uselocale() and mbstowcs/wcstombs. In that case, it's not
actually sensitive to the global setting.
---
copyfromparse.c - the input is
ted behavior.
If we make the opposite assumption, that none are ordering-sensitive
unless we mark them so, that would allow properly-marked functions to
fail at parse time, and the rest to fail at runtime. But this
assumption doesn't work as well for recording dependencies, because
we'd miss the dependencies for UDFs that aren't properly marked.
Thoughts?
Regards,
Jeff Davis
, then ignore LC_COLLATE/LC_CTYPE and emit a
WARNING, rather than trying to set it based on LOCALE and getting an
error.
Regards,
Jeff Davis
[1]
https://www.postgresql.org/message-id/cd3517c7-ddb8-454e-9dd5-70e3d84ff6a2%40eisentraut.org
From fea7ab4f0495330fae56f069520de374d75ae0b8 Mon Sep 17
On Fri, 2025-06-06 at 15:47 -0700, Jeff Davis wrote:
> > > * Force the environment variables LC_COLLATE=C and LC_CTYPE=C
> > > unconditionally, and pg_perm_setlocale() them
> >
> > Currently that would be a regression for some people, because
> > when
e database, and we've had plenty of fixes involving
> the startup process and a different process, mostly the checkpointer.
> That's an annoying limitation.
If you have in mind some other ways to use it, then I like it a lot
more. And I don't have a better idea.
Regards,
Jeff Davis
t execute any non-superuser-owned code"
would be very useful at a practical level, e.g. for pg_dump.
Regards,
Jeff Davis
ct users to create their own functions which depend on our
normalization tables, we can add a fourth marker UNICODE. Otherwise, we
can just special case the few builtin functions we have to create those
dependency entries.
Regards,
Jeff Davis
that a UDF with collatable inputs depends on
all of the behaviors.
Regards,
Jeff Davis
a strong opinion on which route to
take, but I chose the above names from existing keywords so we wouldn't
have to add any.
Regards,
Jeff Davis
On Tue, 2025-06-03 at 20:22 -0700, Jeff Davis wrote:
> EQUALITY marker: indicates that the function or index AM depends on
> CollOid for the equality semantics of the input expression. Examples:
> texteq(), btree AM, hash AM. (Note: EQUALITY is only important for
> non-
> determini
We could try to create a GUC to control this behavior, but behavior-
changing GUCs don't have a great history, and it would probably last
quite some time before we could really turn off libc for good.
There would be similar challenges for downcase_identifier() and maybe
pg_strcasecmp().
Regards,
Jeff Davis
ger of accidentally depending on that setting. Can the encoding be
controlled with LC_MESSAGES instead of LC_CTYPE?
Do you have an example of how things can go wrong?
> For the LC_COLLATE settings, I think we could just
> do the setting in main(), where the other non-database-speci
NCTION
statements that come from other places (e.g. direct from applications,
or migration scripts, or extension scripts).
>
Regards,
Jeff Davis
on datctype, and I could have offered a more clear reply to
the user.
Regards,
Jeff Davis
On Thu, 2025-06-05 at 22:15 -0700, Jeff Davis wrote:
> To continue this thread, I did a symbol search in the meson build
> directory like (patterns.txt attached):
Attached a rough patch series which does what everyone seemed to agree
on:
* Change some trivial ASCII cases to use pg_
rip out --statistics-only (in favor
> of
> --no-schema --no-data --with-statistics).
I'd probably keep --statistics-only.
Regards,
Jeff Davis
On Thu, 2025-06-12 at 15:57 -0500, Nathan Bossart wrote:
> FWIW I don't have a tremendously strong opinion about --statistics-
> only.
Same here. I won't cast a vote on this particular issue, as long as the
functionality is available.
Regards,
Jeff Davis
simple to start using "last option wins" behavior
now. There are probably some combinations of options where it's not
clear whether a later option is an extra constraint or will override a
previous option.
Regards,
Jeff Davis
o.
I guess "CTYPE" works, but it's too technical and feels libc-specific.
Regards,
Jeff Davis
we need is the right encoding, do
we need a proper locale?
Regards,
Jeff Davis
ndexes,
which are in SECTION_POST_DATA).
Regards,
Jeff Davis
On Thu, 2025-06-12 at 10:18 -0400, Robert Haas wrote:
> Am I too late to propose ripping this out?
As long as we keep the functionality, I'm fine changing the
options/names around at this point.
Regards,
Jeff Davis
On Fri, 2025-02-07 at 11:19 -0800, Jeff Davis wrote:
>
> Attached v15. Just a rebase.
Attached v16.
> * commit this on the grounds that it's a desirable code improvement
> and
> the worst-case regression isn't a major concern; or
I plan to commit this soon after bra
SQL standard seems to require Unicode Full Case Mapping.
Regards,
Jeff Davis
[1] https://www.postgresql.org/docs/devel/locale.html#LOCALE-PROVIDERS
hem when either --statistics-only or --no-
> > schema is used.
Thank you.
>
> +1, pending resolution of the defaults issue.
I went ahead and committed this as it clearly needs to be fixed. We can
continue the options discussion.
Regards,
Jeff Davis
On Tue, 2025-07-01 at 08:06 -0700, Jeff Davis wrote:
> Attached rebased v3.
And here's v4.
I changed the global variable to only hold the LC_CTYPE (not
LC_COLLATE), because Windows doesn't support a _locale_t that
represents multiple categories with different locales.
This pa
On Wed, 2025-06-11 at 12:15 -0700, Jeff Davis wrote:
> > v1-0008-Set-process-LC_COLLATE-C-and-LC_CTYPE-C.patch
> >
> > As I mentioned earlier in the thread, I don't think we can do this
> > for
> > LC_CTYPE, because otherwise system error messages would not
I was trying to exercise the function IsoLocaleName(), which is
surrounded by:
#if defined(WIN32) && defined(LC_MESSAGES)
but, at least in CI, that combination never seems to be true, which
surprised me. What platforms exercise this code path?
Regards,
Jeff Davis
On Mon, 2025-07-07 at 17:56 -0700, Jeff Davis wrote:
> I looked into this a bit, and if I understand correctly, the only
> problem is with strerror() and strerror_r(), which depend on
> LC_MESSAGES for the language but LC_CTYPE to find the right encoding.
...
> Windows would be a dif
On Wed, 2025-06-18 at 10:21 -0700, Jeff Davis wrote:
> * reject the combination of an "only" option and a "with" option
There seems to be a rough consensus on this point. Should we move ahead
with this small change and see if we can get consensus to go further?
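For concreteness, a minimal sketch of such a check follows. It assumes the switches can be recognized by their "-only" suffix and "--with-" prefix, and only_with_conflict() is a hypothetical name. It is not the attached patch or pg_dump's real option handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if the argument list mixes any "--*-only" switch with any
 * "--with-*" switch, the combination proposed for rejection.
 */
bool
only_with_conflict(int argc, const char *const *argv)
{
    bool        seen_only = false;
    bool        seen_with = false;

    assert(argc == 0 || argv != NULL);

    for (int i = 0; i < argc; i++)
    {
        size_t      len = strlen(argv[i]);

        if (strncmp(argv[i], "--with-", 7) == 0)
            seen_with = true;
        else if (len > 5 && strcmp(argv[i] + len - 5, "-only") == 0)
            seen_only = true;
    }
    return seen_only && seen_with;
}
```

Under this sketch, "--schema-only --with-statistics" is rejected, while "--no-schema --no-data --with-statistics" is accepted because no "only" switch is present.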
Regards,
Jeff Davis
o-statistics and reject --statistics.
Other options are mostly the same between them, so I'm not sure it's a
good idea for them to diverge.
Regards,
Jeff Davis
7b25c86f).
The revert seems to be related to pgport_shlib. At least for my current
work, I'm focused on removing setlocale() dependencies in the backend,
and a PG_C_LOCALE should work fine there.
Regards,
Jeff Davis
On Thu, 2025-07-10 at 11:53 +1200, Thomas Munro wrote:
> On Thu, Jul 10, 2025 at 10:52 AM Jeff Davis
> wrote:
> > The first problem -- how to affect the encoding of strings returned
> > by
> > strerror() on windows -- may be solvable as well. It looks like
> > LC_ME
On Wed, 2025-06-11 at 12:15 -0700, Jeff Davis wrote:
> I changed this to a global_libc_locale that includes both LC_COLLATE
> and LC_CTYPE (from datcollate and datctype), in case an extension is
> relying on strcoll for some reason.
..
> This patch series, at least so far, is desi
s from pg_locale.h but instead put
> them in the .c files as needed, and explain why this is possible or
> suitable now.
It goes with v16-0003, so I will hold this back for now as well.
Regards,
Jeff Davis
one.
Regards,
Jeff Davis
d be confusing, but maybe it's fine.
Regards,
Jeff Davis
elease of the provider, it seems less likely to cause a problem
for equality searches, and therefore carries a lower risk for PKs. The
downside is that the keys will be larger and there are still some
risks, including bugs in the implementation (which is not just a
theoretical concern).
Othe
much milk if we only convert ASCII correctly.
>
> But perhaps I am just being paranoid.
That's a reasonable concern, and I don't mean to dismiss it. But I
believe that problem is two orders of magnitude smaller than the
problems we have with the status quo.
Regards,
Jeff Davis
On Fri, 2025-07-11 at 11:48 +1200, Thomas Munro wrote:
> On Fri, Jul 11, 2025 at 6:22 AM Jeff Davis wrote:
> > I don't have a great windows development environment, and it
> > appears CI
> > and the buildfarm don't offer great coverage either. Can I ask for
> &
ocale. The current proposal doesn't attempt that kind of
cleverness.
Comments?
Regards,
Jeff Davis
From 8ba8f74d28a64bfb006a76fbec64638f55f3660c Mon Sep 17 00:00:00 2001
From: Jeff Davis
Date: Thu, 17 Jul 2025 13:07:50 -0700
Subject: [PATCH] initdb: default to builtin C.UTF-8
Disc
On Wed, 2025-07-23 at 19:11 -0700, Jeff Davis wrote:
> The patch feels a bit over-engineered, but I'd like to know what you
> think. It would be great if you could test/debug the windows NLS-
> enabled paths.
Let me explain how it ended up looking over-engineered, and perhaps
On Wed, 2025-07-30 at 12:21 -0500, Nathan Bossart wrote:
> Here is what I have staged for commit.
That's more clear to me. I also like that it shows that the options
work well together, because that was not obvious before.
Regards,
Jeff Davis
On Tue, 2025-07-29 at 20:22 +0200, Álvaro Herrera wrote:
> Please move the switches themselves out of the translatable message,
> otherwise there are too many of them. For instance,
Thank you for looking, v2 attached.
Regards,
Jeff Davis
From 61b0239f17a1c7220de32699e95c6b365a
be builtin in that case, I suppose.
Another annoyance is that, if INITDB_LOCALE_PROVIDER=builtin, and
LC_CTYPE is not UTF-8-compatible, then we need to force LC_CTYPE=C.
That affects fewer things than it would with the libc provider, but it
still affects some things.
Regards,
Jeff Davis
lt
> behavior about statistics in pg_dump, though.
I don't see a consensus to make stats the default.
Regards,
Jeff Davis
On Thu, 2025-07-10 at 10:42 -0700, Jeff Davis wrote:
> On Wed, 2025-06-18 at 10:21 -0700, Jeff Davis wrote:
> > * reject the combination of an "only" option and a "with" option
>
> There seems to be a rough consensus on this point.
Patch attached.
On Wed, 2025-06-18 at 10:21 -0700, Jeff Davis wrote:
> On Wed, 2025-06-18 at 10:43 -0500, Nathan Bossart wrote:
> > IIUC the current proposal is to:
> >
> > * Dump/restore stats by default.
We don't have a consensus for that, so unless a few people make an
abrupt turnar
On Thu, 2025-07-31 at 17:21 +0200, Tomas Vondra wrote:
> On 7/31/25 15:39, Greg Burd wrote:
> > I recall a conversation at the last PGConf.dev (2025) with a
> > representative
> > from Intel and Jeff Davis (CC’ed) that had to do with checksums and
> > a vast
> > >
"? Because you currently can't do "--data-only --schema-only". So that
would make it not quite an alias.
If we go in this direction, it might be easier to just say that
--include conflicts with --schema-only and --data-only.
Regards,
Jeff Davis
On Tue, 2025-07-29 at 11:24 -0700, Jeff Davis wrote:
> On Wed, 2025-06-18 at 10:21 -0700, Jeff Davis wrote:
> > On Wed, 2025-06-18 at 10:43 -0500, Nathan Bossart wrote:
> > > IIUC the current proposal is to:
> > >
> > > * Dump/restore stats by default.
>
people generally
think it's an improvement over what we have now.
Otherwise, we should just proceed with:
https://www.postgresql.org/message-id/40cedfc22da152928a74d472708aaadb8855d8d9.ca...@j-davis.com
and close the open item.
Regards,
Jeff Davis