ight be to treat the built-in ICU
version differently from the ones in icu_library_path. Not quite sure,
I'd have to think more. But as of now, I'd still lean toward #1 until a
better option is presented.
Regards,
Jeff Davis
On Sat, 2022-11-26 at 18:27 +1300, Thomas Munro wrote:
> On Thu, Nov 24, 2022 at 5:48 PM Thomas Munro
> wrote:
> > On Thu, Nov 24, 2022 at 3:07 PM Jeff Davis
> > wrote:
> > > I'd vote for 1 on the grounds that it's easier to document and
> > > unde
well enough to be certain. It's just that, on the basis of
> previous experience, (1) it's not that uncommon for people to
> actually
> end up in situations that we thought shouldn't ever happen and (2)
> code that deals with collations is more untrustworthy than average.
Yeah...
--
Jeff Davis
PostgreSQL Contributor Team - AWS
nable).
#6 doesn't answer all of the problems I pointed out earlier:
https://www.postgresql.org/message-id/83faecb4a89dfb5794938e7b4d9f89daf4c5d631.ca...@j-davis.com
but could be a better starting place for answers.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
| 153.88.34
64.2 | 12.1 | 153.97.35.8
65.1 | 12.1 | 153.97.36
66.1 | 13.0 | 153.14.36.8
67.1 | 13.0 | 153.14.37
68.2 | 13.0 | 153.14.38.8
69.1 | 13.0 | 153.14.39
70.1 | 14.0 | 153.112.40
(21 rows)
--
Jeff Davis
PostgreSQL Contributor Team - AWS
On Mon, 2022-11-28 at 19:36 -0800, Jeff Davis wrote:
> On Mon, 2022-11-28 at 21:57 -0500, Robert Haas wrote:
> > That is ... astonishingly bad.
>
> https://unicode-org.atlassian.net/browse/CLDR-16175
Oops, reported in CLDR instead of ICU. Moved to:
https://unicode-org.atlassian
On Tue, 2022-11-29 at 10:46 -0800, Jeff Davis wrote:
> One bit of weirdness is that I may have found another ICU problem.
Reported as:
https://unicode-org.atlassian.net/browse/ICU-22216
--
Jeff Davis
PostgreSQL Contributor Team - AWS
n (no updates or new minors, only new majors). #6 might be a good
approach to facilitate this best practice. We'd then probably need to
change collversion to be a library major version, and then come up with
a migration path from 15 -> 16. Or we could store both library major
version and
d 153.80.32.1 are really the same
version? But 64.1 -> 64.2 looks like a real difference.
I suppose the next step is to test with actual data and find
differences?
--
Jeff Davis
PostgreSQL Contributor Team - AWS
ICU
differently from what is found in icu_library_path. I think that could
remove confusion over what happens when you upgrade the system's ICU
library.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
On Wed, 2022-11-30 at 10:52 +1300, Thomas Munro wrote:
> On Wed, Nov 30, 2022 at 8:38 AM Jeff Davis wrote:
> > On Tue, 2022-11-29 at 10:46 -0800, Jeff Davis wrote:
> > > One bit of weirdness is that I may have found another ICU
> > > problem.
> >
> >
On Wed, 2022-11-30 at 10:29 +1300, Thomas Munro wrote:
> On Wed, Nov 30, 2022 at 9:59 AM Jeff Davis wrote:
> > Here's what I found for the 'ar' locale (firstminor/lastminor are
> > the
> > icu library versions, firstcollversion/lastcollversion are their
>
ion()) doesn't match what's in
collversion. But we basically get out of the business of understanding
ICU versioning and leave that up to the administrator.
It's easier to document, and would require fewer GUCs (if any). And it
avoids mixing version information from another project into our data
model.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
On Tue, 2022-12-06 at 10:53 -0800, Andres Freund wrote:
> Perhaps worth posting a new version? Or are the remaining patches
> abandoned in
> favor of the other threads?
Marked what is there as committed, and the remainder is abandoned in
favor of other threads.
Thanks,
Jeff Davis
//www.postgresql.org/docs/devel/sql-createdatabase.html
[3] https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/ustring_8h.html
--
Jeff Davis
PostgreSQL Contributor Team - AWS
vilege to refresh a
materialized view.
It seems like the discussion on VACUUM/CLUSTER/REINDEX privileges is
happening in the other thread. What would you like to accomplish in
this thread?
--
Jeff Davis
PostgreSQL Contributor Team - AWS
int64().
2. Change MaxOffsetNumber to 2047. This seems likely to break
extensions that rely on it.
3. Define MaxOffsetNumber as 65536 and introduce a new
MaxItemsPerPage as 2048 for the stack-allocated arrays. We'd still need
to fix itemptr_to_uint64().
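Option 3 could look roughly like the sketch below. It is only an illustration under stated assumptions: the sketch_* and SKETCH_* names are hypothetical, the constants are not actual PostgreSQL definitions, and "fixing" the packing function is assumed here to mean giving the offset its full 16 bits.

```c
#include <stdint.h>

/* Hypothetical constants mirroring option 3; not PostgreSQL definitions. */
#define SKETCH_MAX_ITEMS_PER_PAGE 2048  /* bounds stack-allocated arrays only */
#define SKETCH_OFFSET_BITS        16    /* every uint16_t offset is encodable */

/* A stack array sized by the per-page limit rather than by the TID range. */
typedef struct SketchDeletable
{
    uint16_t offsets[SKETCH_MAX_ITEMS_PER_PAGE];
    int      count;
} SketchDeletable;

/* Pack (block, offset) into one integer, block number in the high bits. */
static uint64_t
sketch_itemptr_to_uint64(uint32_t block, uint16_t offset)
{
    return ((uint64_t) block << SKETCH_OFFSET_BITS) | offset;
}

/* Reverse of sketch_itemptr_to_uint64(). */
static void
sketch_uint64_to_itemptr(uint64_t packed, uint32_t *block, uint16_t *offset)
{
    *block = (uint32_t) (packed >> SKETCH_OFFSET_BITS);
    *offset = (uint16_t) (packed & 0xFFFF);
}
```

In this sketch the packed value needs 48 bits instead of fewer, which is the price of letting a table AM use the whole offset range while arrays of per-page items stay small.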
Thoughts?
Regards,
Jeff Davis
a uint64), or still hold an ItemPointerData?
Regards,
Jeff Davis
hen there exist tableAMs that
> use those upper 5 bits.
Does that mean we should declare the valid range of offsets to be
between 1 and 0xfffc (inclusive)?
I'm trying to use some mapping now that's somewhat stable so that I
don't have to worry that something will break later, and then require
reindexing all tables with my table AM.
Regards,
Jeff Davis
eful to make TIDs fully logical (not
> "physiological" -- physical across blocks, logical within blocks) for
> some new table AM. Even then, it would still definitely be a good
> idea
> to make these logical TID values correlate with the physical table
> structure in some way.
Agreed.
Regards,
Jeff Davis
m a particular physical table.
In the future we may support primary unique indexes at the table AM
layer, which would get more interesting. I can see an argument for a
TID being an arbitrary datum in that case, but I haven't really
considered the design implications. Is this what you are suggesting?
Regards,
Jeff Davis
have a parameterized granularity
might be nice, but not required.
Regards,
Jeff Davis
insofar as possible. The reality is that
> everything evolved around heapam, and that that's likely to matter in
> all kinds of ways that nobody fully understands just yet.
Agreed. I think of this as an evolving situation where we take steps
toward a better abstraction.
One (hopefully r
s though, so it could be a long
road.
Regards,
Jeff Davis
> That's what I'm talking about. I'd like to hear what you think about
> it.
FWIW, this is not a problem in my table AM. I am fine having different
TIDs for each version, just like heapam.
For index-organized tables it does seem like an interesting problem.
Regards,
Jeff Davis
or beyond, that may require a REINDEX for indexes on some table AMs. As
long as we have some robust way to check that a REINDEX is necessary,
that's fine with me.
Regards,
Jeff Davis
than the maximum number of
items that can fit on a page. That essentially wastes 5 bits of address
space for no obvious reason.
> and if you want bitmap scans
> to run reasonably quickly, the block number had also better
> correspond
> to physical locality to some degree.
Right (at least for now).
Regards,
Jeff Davis
o bring it down to 43 bits is not great.
Regards,
Jeff Davis
On Fri, 2021-04-30 at 10:55 -0700, Jeff Davis wrote:
> On Fri, 2021-04-30 at 12:35 -0400, Tom Lane wrote:
> > ISTM that would be up to the index AM. We'd need some interlocks
> > on
> > which index AMs could be used with which table AMs in any case, I
> > think.
level
> discussion.
One problem is that ginpostinglist.c restricts the use of offset
numbers higher than MaxOffsetNumber - 1. At best, that's a confusing
and unnecessary off-by-one error that we happen to be stuck with
because it affects the on-disk format. Now that I'm past that
particular confusion, I can live with a workaround until we do
something better.
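The off-by-one can be illustrated with a small sketch. It assumes, as an illustration rather than a copy of ginpostinglist.c, that a TID is packed into a single integer with 11 bits reserved for the offset, and that MaxOffsetNumber is 2048 (its value with 8kB pages). The sketch_* names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed width of the offset field; hypothetical name. */
#define SKETCH_OFFSET_BITS 11
#define SKETCH_OFFSET_MASK ((UINT64_C(1) << SKETCH_OFFSET_BITS) - 1)

/* Pack (block, offset) into one integer, block number in the high bits. */
static uint64_t
sketch_tid_pack(uint32_t block, uint16_t offset)
{
    return ((uint64_t) block << SKETCH_OFFSET_BITS) | offset;
}

/* Reverse of sketch_tid_pack(); only exact if the offset fit in the field. */
static void
sketch_tid_unpack(uint64_t packed, uint32_t *block, uint16_t *offset)
{
    *offset = (uint16_t) (packed & SKETCH_OFFSET_MASK);
    *block = (uint32_t) (packed >> SKETCH_OFFSET_BITS);
}

/* True if the offset survives a pack/unpack round trip. */
static bool
sketch_offset_fits(uint16_t offset)
{
    return offset <= SKETCH_OFFSET_MASK;
}
```

Under these assumptions, offset 2048 on block 0 packs to the same value as offset 0 on block 1, so only offsets up to MaxOffsetNumber - 1 round-trip safely.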
What is the other problem with GIN?
Regards,
Jeff Davis
frustrating.
Whatever we do or don't do, we should try to avoid surprises. I expect
table AMs to be used heavily with partitioning.
Regards,
Jeff Davis
1,
291].
Attached is a patch that clarifies what I've found so far and gives
clear guidance to table AM authors. Before I commit this I'll make sure
that following the guidance actually works for the columnar AM.
Regards,
Jeff Davis
[1] Even for the current version of columnar, which
setNumber
> deletable[MaxOffsetNumber];
I don't think those are problems because they represent items on an
*index* page, not ItemPointers coming from a table.
Regards,
Jeff Davis
int is if "sooner" turns into "later" then we at least have some
guidance for table AM authors in the interim. But if nobody else thinks
that's useful, then so be it.
Regards,
Jeff Davis
but it's reckless to
> extrapolate
> from 1 working example, and right now that's all we have.
We should count columnar as a second example. While it doesn't support
everything that heap does, we are actively working on it and it's
gaining features quickly. It's also showing some impressive real-world
results.
Regards,
Jeff Davis
seems *weird* to
> me.
I am happy to keep table AM discussions concrete, as I have plenty of
concrete problems which I would like to turn into proposals.
Regards,
Jeff Davis
ceeding? Table AMs?
And why isn't columnar an example of something that can "get by with
heapam's idea of TID"? I mean, it's not a perfect fit, but my primary
complaint this whole thread is that it's undefined, not that it's
completely unworkable.
Regards,
Jeff Davis
esentations. By "more careful", I don't mean that we reject all
proposals; I mean that we don't casually impose new limits in other
parts of the system that happen to work for heapam but will cause table
AM extensions to break.
Regards,
Jeff Davis
s consensus in this thread that we want to do that,
but I'd be fine with it.
It's possible but not trivial. tidbitmap.c would be the biggest
challenge, I think.
Regards,
Jeff Davis
The attached patch implements ALTER TABLE ... SET ACCESS METHOD.
For simplicity, I used the normal alter table path, ATRewriteTable(),
which does not follow the stricter isolation semantics that VACUUM FULL
follows. If someone thinks that's unacceptable, please let me know.
Regards,
On Wed, 2021-05-05 at 23:40 -0500, Justin Pryzby wrote:
> Why does your patch say v15?
> It's nearly the same as my pre-existing patch, so should merge them.
Sorry, I completely missed your patch. I retract mine and we'll
continue with yours.
Regards,
Jeff Davis
't it just detoast the
data itself? Shouldn't that be able to decompress anything?
For example, in columnar[1], we just always detoast/decompress because
we want to compress it ourselves (along with other values from the same
column), and we never have a separate toast table. Is that code
in
on't need to do
anything to the toast table, just leave it where it is. But then the
responsibilities get a little confusing to me -- how is B supposed to
know that it doesn't need to toast anything? Is this the problem you
are trying to solve?
Regards,
Jeff Davis
CT?
It's the table AM's responsibility to detoast out-of-line datums and
toast any values that are too large (see
heapam.c:heap_prepare_insert()).
Regards,
Jeff Davis
ing
a change and going through the ordinary INSERT paths, for instance with
RLS. Also solvable.
5. Andres raised in another thread the idea of switching to the table
owner when applying changes (perhaps in a
SECURITY_RESTRICTED_OPERATION?):
https://www.postgresql.org/message-id/20230112033355.u5tiyr2bmuoc4...@awork3.anarazel.de
That seems related, and I like the idea.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
want to
support a way to pass SSL keys as values rather than file paths, so
that we can still do SSL.
So perhaps the answer is that it will be a small patch to get non-
superuser subscription owners, but we need three or four preliminary
patches first.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
t we have now)? I don't know if that solves the
problem you're trying to solve, but it seems lower-risk.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
feel that's the right thing.
But (a) that's not a very strong objection; and (b) my efforts are
better spent doing some of that groundwork than arguing about the order
in which the work should be done. So, time permitting, I may be able to
put out a patch or two for the next 'fest.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
t investigated deeply yet.
Maybe something about LTO, some intervening patch, or I just made some
mistakes somewhere (I did this fairly quickly). But as of now, it
doesn't look like the refactoring patch hurts anything.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
From d1e2e1757b043c876
On Fri, 2023-01-20 at 12:54 -0800, Jeff Davis wrote:
> Both of these are surprising, and I haven't investigated deeply yet.
It's just because autoconf defaults to -O2 and meson to -O3, at least
on my machine. It turns out that, at -O2, master and the refactoring
branch are even; but
iated keys get more complex (where maybe we need to consider
costing each provider?), we can make it more user-facing.
This is fairly simple, so I plan to commit soon.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
From 7699f28634f04772c718ac465bbbff48b849f2bc Mon Sep 17 00:00:00 2001
From:
ion string. You're right that it's incomplete, and also that it
doesn't make a lot of sense for files accessed indirectly.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
riate controls) one at a time. Arguably, that's already
what's happened by demanding a password (even if we don't like the
mechanism, it does seem to work for some important cases).
Is your patch targeted at use case (A), (B), or both?
--
Jeff Davis
PostgreSQL Contributor Team - AWS
s memory contexts, etc.) and before
preparing the sort keys (which involves catalog lookups). The
trust_strxfrm branch is happening in the type-specific sort support
function, which needs to be looked up in the catalog before being
called (using V1 calling convention).
It doesn't look li
mization could return wrong results. Set to
> > + true if certain that
> > strxfrm()
> > + can be trusted.
>
> "if you are certain"; or "if it is ..."
Done.
Thank you, rebased patch attached.
--
Jeff Davis
PostgreSQL Contributor Team
a
subscription with a server object, you'd just need to be a member of
pg_create_subscription and have the USAGE privilege on the server
object.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
> in this
> patch that didn't make it into ff9618e are the following
> documentation
> adjustments. I've added Jeff to get his thoughts.
Committed these extra clarifications. Thank you.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
ption today, and then later require
additional privileges (e.g. pg_create_connection).
If that's not a problem, then this sounds fine with me.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
f not I'll just leave them
in my branch and withdraw from this thread.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
substantial
speedup when using meson (-O3).
Also, when/if the multilib ICU support goes in, that may lose some of
these gains due to an extra indirect function call.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
text_generator.pl
Description: Perl program
From 39ed011cc51ba3a4af5e3b559a7
put
It is non-deterministic, but I tried with two generated files, and got
similar results.
Right now I suspect the ICU version might be the reason. I'll try with
72.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
ivial fixup if you eliminate the GUC patch.
I left it there because it makes exploring/testing easier (at least for
me), but the GUC patch doesn't need to be committed if there's no
consensus.
Regards,
Jeff Davis
t-icu (-Dicu=disabled/auto)
* At initdb time, default to --locale-provider=icu if built with
ICU support
If we don't want to nudge users toward ICU, is it because we are
waiting for something, or is there a lack of consensus that ICU is
actually better?
Regards,
Jeff Davis
On Thu, 2023-02-02 at 08:44 -0500, Robert Haas wrote:
> On Thu, Feb 2, 2023 at 8:13 AM Jeff Davis wrote:
> > If we don't want to nudge users toward ICU, is it because we are
> > waiting for something, or is there a lack of consensus that ICU is
> > actually better?
>
logic when trying to open a collator, which
could only be bad news.
Thoughts?
[1] https://unicode-org.github.io/icu/userguide/locale/#fallback
[2] https://en.wikipedia.org/wiki/IETF_language_tag
[3]
https://unicode-org.github.io/icu/userguide/locale/#canonicalization
--
Jeff Davis
PostgreSQL Contributor Team - AWS
7667fdd4178497aeb796c48d26e69b9.ca...@j-davis.com
--
Jeff Davis
PostgreSQL Contributor Team - AWS
From 1b7d940c0f12062185b8b42bf8d3c0a6f05a74d4 Mon Sep 17 00:00:00 2001
From: Jeff Davis
Date: Wed, 8 Feb 2023 12:06:26 -0800
Subject: [PATCH v1] Use ICU by default at initdb time.
If the ICU local
ow.
That's why we need to be careful about versioning (library versions or
collator versions or both), and we've had long discussions about that.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
I'm happy to hear more input or other proposals.
[1]
https://unicode-org.github.io/icu/userguide/locale/#canonicalization
--
Jeff Davis
PostgreSQL Contributor Team - AWS
#include
#include
#include
#define CAPACITY 1024
int main(int argc, char *argv[])
{
UErrorCode status;
UColl
fy things where we can -- collator versioning is
hard enough without wondering how a user-entered string will be
interpreted. And if we're going to be consistent, BCP 47 seems like the
most obvious choice.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
the bcp47 tag is de-CH.
uloc_canonicalize() and uloc_toLanguageTag() are declared in uloc.h,
and they aren't (as far as I can tell) tied to which collations are
actually defined.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
On Fri, 2023-02-10 at 09:43 -0500, Robert Haas wrote:
> On Thu, Feb 9, 2023 at 5:09 PM Jeff Davis wrote:
> > I do like the ICU format locale IDs from a readability standpoint.
> > "en_US@colstrength=primary" is more meaningful to me than "en-US-u-
> > ks-
>
Jeff Davis
PostgreSQL Contributor Team - AWS
From fac5ada4fc64c16b9553be1d69e4e117ccfebd88 Mon Sep 17 00:00:00 2001
From: Jeff Davis
Date: Fri, 10 Feb 2023 10:54:42 -0800
Subject: [PATCH v1] Correct docs for the default locale_provider of a new
database.
If the locale provider is not specifie
On Wed, 2023-02-08 at 18:22 -0800, Andres Freund wrote:
> On 2023-02-08 12:16:46 -0800, Jeff Davis wrote:
> > On Thu, 2023-02-02 at 18:10 -0500, Tom Lane wrote:
> > > Yeah. I would be resistant to making ICU a required dependency,
> > > but it doesn't seem unreaso
uld we use the ICU
format locale IDs, or BCP 47 language tags?
Do you have an opinion on that topic? If not, do you need additional
information?
--
Jeff Davis
PostgreSQL Contributor Team - AWS
anage the buffers.
Do you have a more specific suggestion? I'd like to keep the API
flexible enough that the caller can manage the buffers, like with
abbreviated keys. Perhaps the check can just be removed if we trust
that the library functions at least get the size calculation r
On Thu, 2023-02-02 at 05:13 -0800, Jeff Davis wrote:
> As a project, do we want to nudge users toward ICU as the collation
> provider as the best practice going forward?
One consideration here is security. Any vulnerability in ICU collation
routines could easily become a vulnerability in Po
ow I don't have a concrete proposal.
Regards,
Jeff Davis
This stuff shouldn't be in here, it's due to a debian patched
> autoconf.
Removed, thank you.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
From 0a691bdc1952871b2ec8d1c5086c90c8943d99cb Mon Sep 17 00:00:00 2001
From: Jeff Davis
Date: Fri, 10 Feb 2023 12:08:11 -0800
S
of issues that
one finds any time they dig into a dependency enough. "Setting our
sights very high"[1], to me, would just be ICU with a bit more rigorous
attention to quality issues.
[1]
https://www.postgresql.org/message-id/CA%2BTgmoYmeGJaW%3DPy9tAZtrnCP%2B_Q%2BzRQthv%3Dzn_HyA_nqEDM-A%40mail.gmail.com
--
Jeff Davis
PostgreSQL Contributor Team - AWS
On Thu, 2023-02-09 at 14:09 -0800, Jeff Davis wrote:
> It feels like BCP 47 is the right catalog representation. We are
> already using it for the import of initial collations, and it's a
> standard, and there seems to be good support in ICU.
Patch attached.
We should have been
t the new databases' collation fields equal to that of
the old cluster?
I'll submit it as a separate patch because it would be independently
useful.
Regards,
Jeff Davis
On Fri, 2023-02-17 at 00:06 -0800, Jeff Davis wrote:
> On Tue, 2023-02-14 at 09:59 -0800, Andres Freund wrote:
> > I am saying that pg_upgrade should be able to deal with the
> > difference. The
> > details of how to implement that, don't matter that much.
>
>
a check that the new cluster is empty, so I think it's
safe to hack the pg_database locale fields.
Regards,
Jeff Davis
>
be less surprising if we also
fixed up template0.)
And if we do fixup template0/template1/postgres to match the old
cluster, then CREATE DATABASE will have no issue.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
ech collate, an index
> rebuild is necessary
Yes, that's true of any locale change, provider change, or even
provider version change.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
lc_ctype, which also don't matter a whole lot
(because they can all be changed when using template0 as a template).
2. Update the pg_database entry for template0. This has less potential
for surprise in case people are actually using template0 for a
template.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
t seems like we should do the same thing for the loop in
GetXLogSummaryStats(). Maybe just for the outer loop is fine (the inner
loop is only 16 elements); though again, there's not an obvious
downside to fixing that, too.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
we don't have to guess about whether the
amount of memory is significant or not.
Committed to 16 with the changes to GetXLogSummaryStats() as well.
Committed unmodified version of your 15 backport. Thank you!
--
Jeff Davis
PostgreSQL Contributor Team - AWS
in
general and not specific to this patch. Changing providers obviously
requires an index rebuild to be safe.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
the default, and then adapt this patch if
necessary.
[1]
https://www.postgresql.org/message-id/20230214175957.idkb7shsqzp5n...@awork3.anarazel.de
--
Jeff Davis
PostgreSQL Contributor Team - AWS
From b2e42cef9d8080ad27ef76444b74a72e5cda922c Mon Sep 17 00:00:00 2001
From: Jeff Davis
Date: Wed, 1
On Mon, 2023-02-13 at 15:45 -0800, Jeff Davis wrote:
> New version attached. Changes:
These patches, especially 0001, have been around for a while, and
they've received some review attention with no outstanding TODOs that
I'm aware of.
I plan to commit v10 (or something close to it
particularly complicated, but it now looks more like an optimization
> problem of some kind or other.
There's another important point here, which is that it gives an
opportunity to decide to freeze some all-visible pages in a given round
just to reduce the deferred work, without worrying about advancing
relfrozenxid.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
moderate number of pages without worrying about
advancing relfrozenxid?
Regards,
Jeff Davis
on table size?
Regards,
Jeff Davis
bit), which was rejected, but
perhaps its time has come?
Regards,
Jeff Davis
[1]
https://www.postgresql.org/message-id/1353551097.11440.128.camel%40sussancws0025
just moved some code around.
Is there a good way to look for regressions (perf or correctness) when
making changes in this area, especially on windows and/or with strange
collation rules? What I did doesn't look like a problem, but there's
always a chance of a regression.
--
Jeff D
memory management, error
> handling,
> and some other things". That's pretty much a non-starter.
You may be surprised how much you can do with rust extensions without
changing any of those things[3].
Regards,
Jeff Davis
[1] https://doc.rust-lang.org/std/mem/fn.forget.html
[2] https://doc.rust-lang.org/book/ch17-02-trait-objects.html
[3] https://www.pgcon.org/2019/schedule/attachments/532_RustTalk.pdf
ked as all-visible in VM,
> but not having PD_ALL_VISIBLE in page header. And it violates VM
> constraint:
I'm not quite following this scenario. If the heap page has a lower LSN
than the VM page, how could we recover to a point where the VM bit is
set but the heap flag isn't? And what does it have to do with
wal_log_hints/checksums?
--
Jeff Davis
PostgreSQL Contributor Team - AWS
he buffers on the stack of varstr_cmp(). I'm not sure if that's a
problem or not.
> The length arguments ought to be of type size_t, I think.
Changed.
Thank you.
--
Jeff Davis
PostgreSQL Contributor Team - AWS
From 4d5664552a8a86418a94c37fd4ab8ca3a665c1cd Mon Sep 17 00:00:00 2001
o both
> things at exactly the same XID-age-wise time. But there is reason to
> think that doing so could make matters worse rather than better [1].
Can you explain?
Regards,
Jeff Davis