Re: POC: Parallel processing of indexes in autovacuum

2025-05-02 Thread Daniil Davydov
On Thu, May 1, 2025 at 8:03 AM Masahiko Sawada  wrote:
>
> As I understand it, we initially disabled parallel vacuum for
> autovacuum because their objectives are somewhat contradictory.
> Parallel vacuum aims to accelerate the process by utilizing additional
> resources, while autovacuum is designed to perform cleaning operations
> with minimal impact on foreground transaction processing (e.g.,
> through vacuum delay).
>
Yep, we also decided that we must not create more a/v workers for
index processing.
In the current implementation, the leader process sends a signal to the
a/v launcher, and the launcher tries to launch all requested workers.
But the number of workers never exceeds `autovacuum_max_workers`.
Thus, we will never have more a/v workers than in the standard case
(without this feature).
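
For illustration only, here is a minimal sketch of the capping rule
described above (the helper name and signature are hypothetical and not
taken from the patch; autovacuum_max_workers is the real GUC):

```c
#include "postgres.h"
#include "postmaster/autovacuum.h"

/*
 * Hypothetical helper (not from the patch): cap the number of parallel
 * index-processing workers so that the total number of a/v workers
 * never exceeds autovacuum_max_workers.
 */
static int
av_parallel_worker_cap(int nrequested, int nrunning_av_workers)
{
	int			navailable = autovacuum_max_workers - nrunning_av_workers;

	return Min(nrequested, Max(navailable, 0));
}
```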

> Nevertheless, I see your point about the potential benefits of using
> parallel vacuum within autovacuum in specific scenarios. The crucial
> consideration is determining appropriate criteria for triggering
> parallel vacuum in autovacuum. Given that we currently support only
> parallel index processing, suitable candidates might be autovacuum
> operations on large tables that have a substantial number of
> sufficiently large indexes and a high volume of garbage tuples.
>
> Although the actual number of parallel workers ultimately depends on
> the number of eligible indexes, it might be beneficial to introduce a
> storage parameter, say parallel_vacuum_workers, that allows control
> over the number of parallel vacuum workers on a per-table basis.
>
For now, we have three GUC variables for this purpose:
max_parallel_index_autovac_workers, autovac_idx_parallel_min_rows,
autovac_idx_parallel_min_indexes.
That is, everything is as you said. But we are still conducting
research on this issue. I would like to get rid of some of these
parameters.

> Regarding implementation: I notice the WIP patch implements its own
> parallel vacuum mechanism for autovacuum. Have you considered simply
> setting at_params.nworkers to a value greater than zero?
>
About `at_params.nworkers = N` - that's exactly what we're doing (you
can see it in the `vacuum_rel` function). But we cannot fully reuse the
code of parallel VACUUM, because it creates its own processes via the
dynamic bgworkers machinery.
As I said above, we don't want to consume additional resources. We also
don't want to complicate communication between processes (the idea
is that a/v workers can only send signals to the a/v launcher).
As a result, we created our own implementation of parallel index
processing control - see changes in vacuumparallel.c and autovacuum.c.

--
Best regards,
Daniil Davydov




Re: [PoC] Federated Authn/z with OAUTHBEARER

2025-05-02 Thread Tom Lane
Jacob Champion  writes:
> On Fri, May 2, 2025 at 10:35 AM Tom Lane  wrote:
>> FWIW, on my Mac a meson build from HEAD goes through fine, with or
>> without this patch.  I'm getting openssl and libcurl from MacPorts
>> not Homebrew, but I don't know why that would make any difference.

> Do your  and  live in the same place? Mine do,
> so I had to disable NLS to get this to break.

Yeah, they are both under /opt/local/include in a MacPorts setup.
But disabling NLS doesn't break it for me.  I tried

meson setup build --auto-features=disabled -Dlibcurl=enabled

to make sure that /opt/local/include wasn't getting pulled in
some other way, and it still builds.

Apropos of that: our fine manual claims that option is spelled
--auto_features, but that fails for me.  Is that a typo in our
manual, or do some meson versions accept the underscore?

regards, tom lane




Re: [PoC] Federated Authn/z with OAUTHBEARER

2025-05-02 Thread Nathan Bossart
On Fri, May 02, 2025 at 10:42:25AM -0700, Jacob Champion wrote:
> On Fri, May 2, 2025 at 10:35 AM Tom Lane  wrote:
>> FWIW, on my Mac a meson build from HEAD goes through fine, with or
>> without this patch.  I'm getting openssl and libcurl from MacPorts
>> not Homebrew, but I don't know why that would make any difference.
> 
> Do your  and  live in the same place? Mine do,
> so I had to disable NLS to get this to break.

I enabled NLS and the problem disappeared for me, but that seems to be a
side effect of setting -Dextra_{include,lib}_dirs to point to my Homebrew
directories, which I needed to do to get NLS to work.

-- 
nathan




Assert("vacrel->eager_scan_remaining_successes > 0")

2025-05-02 Thread Masahiko Sawada
Hi,

I hit the assertion failure in the subject line with the following script:

create table t (a int) with (autovacuum_enabled = off);
insert into t select generate_series(1, 1_000_000);
vacuum t;
insert into t select generate_series(1, 1_000_000);
set vacuum_freeze_min_age to 0;
vacuum t;

When the success count reaches 0, we disable the eager scan by
resetting related fields as follows:

/*
 * If we hit our success cap, permanently disable eager
 * scanning by setting the other eager scan management
 * fields to their disabled values.
 */
vacrel->eager_scan_remaining_fails = 0;
vacrel->next_eager_scan_region_start = InvalidBlockNumber;
vacrel->eager_scan_max_fails_per_region = 0;

However, there is a possibility that we have already eagerly scanned
another page and returned it to the read stream before we freeze the
eagerly-scanned page and disable the eager scan. In this case, the
next block that we retrieved from the read stream could also be an
eagerly-scanned page.
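
For illustration only (a fragment-level sketch, not a proposed fix), one
defensive variant would be to tolerate such an already-queued page
instead of asserting that the success budget is still positive:

```c
/*
 * Hypothetical defensive variant, for illustration only: once the
 * success cap has been reached, treat an eagerly-scanned page that was
 * already queued by the read stream as a no-op rather than asserting
 * that the counter is still positive.
 */
if (vacrel->eager_scan_remaining_successes > 0)
	vacrel->eager_scan_remaining_successes--;
```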

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Adding skip scan (including MDAM style range skip scan) to nbtree

2025-05-02 Thread Peter Geoghegan
On Fri, May 2, 2025 at 2:22 PM Peter Geoghegan  wrote:
> A slight variant of my fuzzing Python script did in fact go on to
> detect a couple of bugs.
>
> I'm attaching a compressed SQL file with repros for 2 different bugs.
> The first bug was independently detected by some kind of fuzzing
> performed by Mark Dilger, reported elsewhere [1].

Picking up from the email with the big attachment...

Both bugs are from commit 8a510275, "Further optimize nbtree search
scan key comparisons" (not the main skip scan commit). I actually
wrote code very much like the code from these patches that appeared in
certain versions of the skip scan patch series -- it was originally
supposed to be defensive hardening. This so-called hardening wasn't
kept in the final committed version because I incorrectly believed
that it wasn't necessary.

I would like to commit the first patch later today, ahead of shipping
beta1. But the second patch won't make it into beta1.

In practice the bug fixed by the first patch is more likely to affect
users, and (unlike the bug fixed by the second patch), it involves a
hard crash. The first patch prevents us from dereferencing a NULL
pointer (pstate) within _bt_advance_array_keys (unless on an
assert-enabled build, where we get an assertion failure first). It
would also be possible to fix the issue by testing if pstate itself is
not a NULL pointer in the usual defensive style, but I find the
approach taken in the first patch slightly more natural.

The second patch is more complicated, and seems like something that
I'll need to spend more time thinking about before proceeding with
commit. It has subtle behavioral implications, in that it causes the
pstate.forcenonrequired mechanism to influence when and how
_bt_advance_array_keys schedules primitive index scans in a tiny
minority of forward scan cases. I know of only 3 queries where this
happens, 2 of which are from my repro -- it's actually really hard to
find an example of this, even if you go out of your way to.

Allowing pstate.forcenonrequired to affect primscan scheduling like
this is something that I have been avoiding up until now, since that
makes things cleaner -- but I'm starting to think that that goal isn't
important enough to force the second patch to be significantly more
complicated than what I came up with here. It's not like the
behavioral differences represent a clear regression; they're just
slightly different to what we see in cases where
pstate.forcenonrequired/pstate.ikey is forcibly not applied (e.g., by
commenting-out the calls to _bt_set_startikey made by _bt_readpage).

My approach in the second patch is to simply call _bt_start_array_keys
ahead of the finaltup call to _bt_checkkeys when
pstate.forcenonrequired, which has the merit of being relatively
simple (it's likely the simplest possible approach). I'm unwilling to
pay too much of a cost in implementation complexity just to avoid
side-effects in _bt_advance_array_keys/primscan scheduling, but maybe
I'll find that the cost isn't too high.

-- 
Peter Geoghegan


v1-0001-Avoid-treating-nonrequired-nbtree-keys-as-require.patch
Description: Binary data


v1-0002-Prevent-prematurely-nbtree-array-advancement.patch
Description: Binary data


Re: [PoC] Federated Authn/z with OAUTHBEARER

2025-05-02 Thread Tom Lane
Jacob Champion  writes:
> On Fri, May 2, 2025 at 11:26 AM Tom Lane  wrote:
>> Apropos of that: our fine manual claims that option is spelled
>> --auto_features, but that fails for me.  Is that a typo in our
>> manual, or do some meson versions accept the underscore?

> --auto_features doesn't work for me either. That looks like an
> accidental mashup of --auto-features and -Dauto_features.

Ah, I see somebody already complained of this [1], but apparently
we did nothing about it.  I shall go fix it now.

regards, tom lane

[1] 
https://www.postgresql.org/message-id/flat/172465652540.862882.17808523044292761256%40wrigleys.postgresql.org




Re: [PoC] Federated Authn/z with OAUTHBEARER

2025-05-02 Thread Jacob Champion
On Fri, May 2, 2025 at 11:56 AM Jacob Champion
 wrote:
> -I/opt/homebrew/Cellar/openssl@3/3.5.0/include

Except it _is_ right there.

Oh, ha -- I'm not using Homebrew's Curl in this minimal build. Looks
like it's coming from the sysroot.

% ls -l 
/Library/Developer/CommandLineTools/SDKs/MacOSX15.2.sdk/usr/include/curl
total 208
-rw-r--r--  1 root  wheel  129052 Nov  9 22:54 curl.h
-rw-r--r--  1 root  wheel3044 Nov  9 22:54 curlver.h
...

Well, that was fun.

--Jacob




Re: [PoC] Federated Authn/z with OAUTHBEARER

2025-05-02 Thread Jacob Champion
On Fri, May 2, 2025 at 10:31 AM Nathan Bossart  wrote:
> Yup, thanks!

Great, thanks. I'll push it soon.

--Jacob




Re: [PoC] Federated Authn/z with OAUTHBEARER

2025-05-02 Thread Jacob Champion
On Fri, May 2, 2025 at 10:35 AM Tom Lane  wrote:
> FWIW, on my Mac a meson build from HEAD goes through fine, with or
> without this patch.  I'm getting openssl and libcurl from MacPorts
> not Homebrew, but I don't know why that would make any difference.

Do your  and  live in the same place? Mine do,
so I had to disable NLS to get this to break.

--Jacob




Re: Slot's restart_lsn may point to removed WAL segment after hard restart unexpectedly

2025-05-02 Thread Vitaly Davydov
Dear All,

Thank you for the attention to the patch. I updated the patch with a better
solution for the master branch, which can be easily backported to the other
branches once we agree on the final solution.

Two tests are introduced which are based on Tomas Vondra's test for logical
slots with injection points from the discussion (patches: [1], [2], [3]). The
tests are implemented as module tests in the
src/test/modules/test_replslot_required_lsn directory. I slightly modified the
original test for logical slots and created a new, simpler test for physical
slots without any additional injection points.

I prepared a new solution (patch [4]) which is also based on Tomas Vondra's
proposal. With a fresh eye, I realized that it can fix the issue as well. It is
easier and less invasive to implement. The new solution differs from my original
one: it is backward compatible (it doesn't require any changes in the
ReplicationSlot structure). My original solution could be made backward
compatible as well by allocating flushed_restart_lsn in a separate array in
shmem rather than in the ReplicationSlot structure, but I believe the new
solution is the better one. If you still think that my previous solution is
better (I don't think so), I will prepare a backward compatible patch based on
it.

I also proposed one more commit (patch [5]) which removes unnecessary calls of
the ReplicationSlotsComputeRequiredLSN function, which seem to be redundant. This
function updates the oldest required LSN for slots and is called every time
a slot's restart_lsn is changed. Since we use the oldest required LSN in
CreateCheckPoint/CreateRestartPoint to remove old WAL segments, I believe there
is no need to calculate the oldest value immediately when the slot is advancing
or in other cases when restart_lsn is changed. It may affect the
GetWALAvailability function, because the oldest required LSN will not be up to
date, but this function seems to be used in the system view
pg_get_replication_slots and doesn't affect the logic of old WAL segment
removal. I also have some doubts concerning the advancing of logical replication
slots, where the call of ReplicationSlotsComputeRequiredLSN was removed: I'm not
sure how it can affect ReplicationSlotsComputeRequiredXmin. This commit is not
necessary for the fix, but I think it is worth considering. It may be dropped or
applied only to the master branch.

This patch can be easily backported to the major release branches. I will
quickly prepare the patches for the major releases once we agree on the final
solution.

I apologize for making such a late change to the patch when it is already in the
commitfest. I'm not yet well experienced in the internals of PostgreSQL;
sometimes the better solution needs some time to grow. In doing we learn :)

[1] 0001-Add-injection-points-to-test-replication-slot-advanc.v2.patch
[2] 0002-Add-TAP-test-to-check-logical-repl-slot-advance-duri.v2.patch
[3] 0003-Add-TAP-test-to-check-physical-repl-slot-advance-dur.v2.patch
[4] 0004-Keep-WAL-segments-by-slot-s-flushed-restart-LSN.v2.patch
[5] 0005-Remove-redundant-ReplicationSlotsComputeRequiredLSN-.v2.patch

With best regards,
Vitaly
From a8693c3003df7f9850af0be5284bb6f0e7a82fa6 Mon Sep 17 00:00:00 2001
From: Vitaly Davydov 
Date: Wed, 30 Apr 2025 12:48:27 +0300
Subject: [PATCH 3/5] Add TAP test to check physical repl slot advance during
 checkpoint

The test verifies that the physical replication slot is still valid
after immediate restart on checkpoint completion in case when the slot
was advanced during checkpoint.

Discussion: https://www.postgresql.org/message-id/flat/1d12d2-67235980-35-19a406a0%4063439497
---
 .../test_replslot_required_lsn/meson.build|   3 +-
 .../t/002_physical_slot.pl| 126 ++
 2 files changed, 128 insertions(+), 1 deletion(-)
 create mode 100644 src/test/modules/test_replslot_required_lsn/t/002_physical_slot.pl

diff --git a/src/test/modules/test_replslot_required_lsn/meson.build b/src/test/modules/test_replslot_required_lsn/meson.build
index 999c16201fb..44d2546632b 100644
--- a/src/test/modules/test_replslot_required_lsn/meson.build
+++ b/src/test/modules/test_replslot_required_lsn/meson.build
@@ -9,7 +9,8 @@ tests += {
'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
 },
 'tests': [
-  't/001_logical_slot.pl'
+  't/001_logical_slot.pl',
+  't/002_physical_slot.pl'
 ],
   },
 }
diff --git a/src/test/modules/test_replslot_required_lsn/t/002_physical_slot.pl b/src/test/modules/test_replslot_required_lsn/t/002_physical_slot.pl
new file mode 100644
index 000..f89aec1da32
--- /dev/null
+++ b/src/test/modules/test_replslot_required_lsn/t/002_physical_slot.pl
@@ -0,0 +1,126 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+#
+# This test verifies the case when the physical slot is advanced during
+# checkpoint. The test checks that the physical slot's restart_lsn still refers
+# to an existed WAL segmen

Re: SQL functions: avoid making a tuplestore unnecessarily

2025-05-02 Thread Tom Lane
I wrote:
> Given the lack of field complaints over the many years it's been
> like this, there doesn't seem to be a live problem.  I think the
> explanation for that is
> (1) those mechanisms are never used to call set-returning functions,
> (2) therefore, the tuplestore will not be called on to hold more
> than one result row,
> (3) therefore, it won't get large enough that it wants to spill
> to disk,
> (4) therefore, its potentially dangling resowner pointer is never
> used.
> However, this is an uncomfortably shaky chain of assumptions.
> I want to cut it down by fixing things so that there is no
> tuplestore, period, in a non-set-returning SQL function.

Following up on this thread: Alexander Lakhin's report at [1] shows
that point (3) above is wrong: the tuplestore code will spill to disk
even when holding just one tuple, if that tuple is bigger than the
tuplestore's allowed work_mem.  (This seems kinda dubious to me, since
no memory savings can ensue.  But I have no desire to rejigger that
right now.)  So there may actually be a live bug associated with
use of a deleted resource owner here.  I've not tried to pursue that
though.

More immediately: Alexander's report also shows that there's an easily
reached bug in HEAD when the tuplestore does spill to disk.  When it
reads that tuple back in, it allocates it in the caller's memory
context, and fmgr_sql is now calling that in the short-lived context
it was called in, not in its long-lived fcontext.  The end result of
that is that the long-lived result TupleTableSlot is now holding a
should_free pointer to a short-lived tuple, which ends up in an
attempt to pfree already-wiped storage during the next call of the
SQL function.

The patch I presented here eliminates that problem because with it,
fmgr_sql no longer pulls tuples out of the tuplestore at all.
So I want to apply this patch now instead of holding it for v19.

regards, tom lane

[1] 
https://www.postgresql.org/message-id/9f975803-1a1c-4f21-b987-f572e110e860%40gmail.com




Re: [PoC] Federated Authn/z with OAUTHBEARER

2025-05-02 Thread Jacob Champion
On Fri, May 2, 2025 at 11:26 AM Tom Lane  wrote:
> Yeah, they are both under /opt/local/include in a MacPorts setup.
> But disabling NLS doesn't break it for me.  I tried
>
> meson setup build --auto-features=disabled -Dlibcurl=enabled
>
> to make sure that /opt/local/include wasn't getting pulled in
> some other way, and it still builds.

Hm. If you clear out the build artifacts under
src/interfaces/libpq-oauth, and then build with

$ ninja -v src/interfaces/libpq-oauth/libpq-oauth.a

does that help surface anything interesting?

> Apropos of that: our fine manual claims that option is spelled
> --auto_features, but that fails for me.  Is that a typo in our
> manual, or do some meson versions accept the underscore?

--auto_features doesn't work for me either. That looks like an
accidental mashup of --auto-features and -Dauto_features.

--Jacob




Re: POC: Parallel processing of indexes in autovacuum

2025-05-02 Thread Daniil Davydov
On Fri, May 2, 2025 at 11:58 PM Sami Imseih  wrote:
>
> I am generally -1 on the idea of autovacuum performing parallel
> index vacuum, because I always felt that the parallel option should
> be employed in a targeted manner for a specific table. if you have a bunch
> of large tables, some more important than others, a/v may end
> up using parallel resources on the least important tables and you
> will have to adjust a/v settings per table, etc to get the right table
> to be parallel index vacuumed by a/v.

Hm, this is a good point. I think I should clarify one thing: in
practice, there is a common situation where users have one huge table
among all databases (with 80+ indexes created on it). But, of course,
in general there may be a few such tables.
We can still adjust the autovac_idx_parallel_min_rows parameter.
If a table has a lot of dead tuples => it is actively used => the table
is important (?).
Also, if the user can really determine the "importance" of each of the
tables, we can provide an appropriate table option. Tables with this
option set would be processed in parallel in priority order. What do
you think about such an idea?

>
> Also, with the TIDStore improvements for index cleanup, and the practical
> elimination of multi-pass index vacuums, I see this being even less
> convincing as something to add to a/v.

If I understood correctly, we are talking about the fact that
TIDStore can store so many tuples that in fact a second pass is never
needed.
But the number of passes does not affect the presented optimization in
any way. We must think about the large number of indexes that must be
processed. Even within a single pass we can get a 40% increase in
speed.

>
> Now, If I am going to allocate extra workers to run vacuum in parallel, why
> not just provide more autovacuum workers instead so I can get more tables
> vacuumed within a span of time?

For now, only one process can clean up indexes, so I don't see how
increasing the number of a/v workers will help in the situation that I
mentioned above.
Also, we don't consume additional resources during autovacuum with this
patch - the total number of a/v workers is always <= autovacuum_max_workers.

BTW, see the v2 patch attached to this letter (bug fixes) :-)

--
Best regards,
Daniil Davydov
From 1c93a729b844a1dfe109e8d9e54d5cc0a941d061 Mon Sep 17 00:00:00 2001
From: Daniil Davidov 
Date: Sat, 3 May 2025 00:27:45 +0700
Subject: [PATCH v2] WIP Allow autovacuum to process indexes of single table in
 parallel

---
 src/backend/commands/vacuum.c |  27 +
 src/backend/commands/vacuumparallel.c | 289 +-
 src/backend/postmaster/autovacuum.c   | 906 +-
 src/backend/utils/misc/guc_tables.c   |  30 +
 src/backend/utils/misc/postgresql.conf.sample |   6 +
 src/include/postmaster/autovacuum.h   |  23 +
 src/test/modules/autovacuum/.gitignore|   1 +
 src/test/modules/autovacuum/Makefile  |  14 +
 .../autovacuum/t/001_autovac_parallel.pl  | 137 +++
 9 files changed, 1387 insertions(+), 46 deletions(-)
 create mode 100644 src/test/modules/autovacuum/.gitignore
 create mode 100644 src/test/modules/autovacuum/Makefile
 create mode 100644 src/test/modules/autovacuum/t/001_autovac_parallel.pl

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 33a33bf6b1c..a5ef5319ccc 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -2234,6 +2234,33 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params,
 	else
 		toast_relid = InvalidOid;
 
+	/*
+	 * Decide whether we need to process table with given oid in parallel mode
+	 * during autovacuum.
+	 */
+	if (AmAutoVacuumWorkerProcess() &&
+		params->index_cleanup != VACOPTVALUE_DISABLED)
+	{
+		PgStat_StatTabEntry *tabentry;
+
+		/* fetch the pgstat table entry */
+		tabentry = pgstat_fetch_stat_tabentry_ext(rel->rd_rel->relisshared,
+  rel->rd_id);
+		if (tabentry && tabentry->dead_tuples >= autovac_idx_parallel_min_rows)
+		{
+			List   *indexes = RelationGetIndexList(rel);
+			int		num_indexes = list_length(indexes);
+
+			list_free(indexes);
+
+			if (num_indexes >= autovac_idx_parallel_min_indexes &&
+max_parallel_index_autovac_workers > 0)
+			{
+params->nworkers = max_parallel_index_autovac_workers;
+			}
+		}
+	}
+
 	/*
 	 * Switch to the table owner's userid, so that any index functions are run
 	 * as that user.  Also lock down security-restricted operations and
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 2b9d548cdeb..cb4b7c23010 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1,20 +1,23 @@
 /*-
  *
  * vacuumparallel.c
- *	  Support routines for parallel vacuum execution.
+ *	  Support routines for parallel [auto]vacuum execution.
  *
  * This file contains routines that are inten

Re: [PoC] Federated Authn/z with OAUTHBEARER

2025-05-02 Thread Tom Lane
Jacob Champion  writes:
> Hm. If you clear out the build artifacts under
> src/interfaces/libpq-oauth, and then build with
> $ ninja -v src/interfaces/libpq-oauth/libpq-oauth.a
> does that help surface anything interesting?

$ rm -rf src/interfaces/libpq-oauth
$ ninja -v src/interfaces/libpq-oauth/libpq-oauth.a
[1/2] ccache cc -Isrc/interfaces/libpq-oauth/libpq-oauth.a.p 
-Isrc/interfaces/libpq-oauth -I../src/interfaces/libpq-oauth 
-Isrc/interfaces/libpq -I../src/interfaces/libpq -Isrc/port -I../src/port 
-Isrc/include -I../src/include -I/opt/local/include 
-I/opt/local/libexec/openssl3/include -fdiagnostics-color=always -Wall 
-Winvalid-pch -O2 -g -isysroot 
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX15.4.sdk
 -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wmissing-prototypes 
-Wpointer-arith -Werror=vla -Werror=unguarded-availability-new -Wendif-labels 
-Wmissing-format-attribute -Wcast-function-type -Wformat-security 
-Wdeclaration-after-statement -Wmissing-variable-declarations 
-Wno-unused-command-line-argument -Wno-compound-token-split-by-macro 
-Wno-format-truncation -Wno-cast-function-type-strict -MD -MQ 
src/interfaces/libpq-oauth/libpq-oauth.a.p/oauth-curl.c.o -MF 
src/interfaces/libpq-oauth/libpq-oauth.a.p/oauth-curl.c.o.d -o 
src/interfaces/libpq-oauth/libpq-oauth.a.p/oauth-curl.c.o -c 
../src/interfaces/libpq-oauth/oauth-curl.c
[2/2] rm -f src/interfaces/libpq-oauth/libpq-oauth.a && ar csr 
src/interfaces/libpq-oauth/libpq-oauth.a 
src/interfaces/libpq-oauth/libpq-oauth.a.p/oauth-curl.c.o && ranlib -c 
src/interfaces/libpq-oauth/libpq-oauth.a

So it's getting -I/opt/local/include and also
-I/opt/local/libexec/openssl3/include from somewhere,
which I guess must be libcurl's pkg-config data ... yup:

$ pkg-config --cflags libcurl
-I/opt/local/include -I/opt/local/libexec/openssl3/include -I/opt/local/include

I bet Homebrew's libcurl packaging doesn't do that.

regards, tom lane




Re: [PoC] Federated Authn/z with OAUTHBEARER

2025-05-02 Thread Jacob Champion
On Fri, May 2, 2025 at 11:52 AM Tom Lane  wrote:
> $ pkg-config --cflags libcurl
> -I/opt/local/include -I/opt/local/libexec/openssl3/include 
> -I/opt/local/include
>
> I bet Homebrew's libcurl packaging doesn't do that.

Nope, Homebrew breaks them out into smaller pieces:

% PKG_CONFIG_PATH=/opt/homebrew/opt/curl/lib/pkgconfig pkg-config
--cflags libcurl
-I/opt/homebrew/Cellar/curl/8.13.0/include
-I/opt/homebrew/Cellar/brotli/1.1.0/include
-I/opt/homebrew/opt/zstd/include
-I/opt/homebrew/Cellar/libssh2/1.11.1/include
-I/opt/homebrew/Cellar/rtmpdump/2.4-20151223_3/include
-I/opt/homebrew/Cellar/openssl@3/3.5.0/include
-I/opt/homebrew/Cellar/libnghttp2/1.65.0/include

--Jacob




Re: extension_control_path and "directory"

2025-05-02 Thread Peter Eisentraut

On 29.04.25 17:06, Matheus Alcantara wrote:

On Tue, Apr 29, 2025 at 11:08 AM David E. Wheeler  wrote:

Right. My point is a minor one, but I think you can use an if/else there:

```c
if (strcmp(piece, "$system") == 0) {
 /* Substitute the path macro if needed */
 mangled = substitute_path_macro(piece, "$system", system_dir);
} else {
 /*
 * Append "extension" suffix in case is a custom extension
 * control path.
 */
 mangled = psprintf("%s/extension", mangled);
}
```



The substitute_path_macro() already handles the if/else on "piece" but I
think that this if/else version looks nicer. Fixed.

I've also included some documentation changes for this v5 version to
remove the "extension" from the examples and also mention the scenario
when using the "directory" on the .control file.


Thanks, I have committed this.  I did a bit of code reformatting and 
adjusted the documentation a bit.  It's good to get this in before beta1 
so that we don't have to change the valid values of 
extension_control_path past beta1.






Re: queryId constant squashing does not support prepared statements

2025-05-02 Thread Álvaro Herrera
On 2025-May-02, Michael Paquier wrote:

> That depends.  If we conclude that tracking this information through
> the parser based on the start and end positions in a query string
> for a set of values is more relevant, then we would be redesigning the
> facility from the ground, so the old approach would not be really
> relevant..

I disagree that a revert is warranted for this reason.  If you want to
change the implementation later, that's fine, as long as the user
interface doesn't change.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Doing what he did amounts to sticking his fingers under the hood of the
implementation; if he gets his fingers burnt, it's his problem."  (Tom Lane)




Small fixes needed by high-availability tools

2025-05-02 Thread Andrey Borodin
Hi hackers!

I want to revive attempts to fix some old edge cases of physical quorum 
replication.

Please find attached draft patches that demonstrate ideas. These patches are 
not actually proposed code changes, I rather want to have a design consensus 
first.

1. Allow checking standby sync before making data visible after crash recovery

Problem: a Postgres instance must not allow reading data if that data is not yet
known to be replicated.
Immediately after a crash we do not know if we are still the cluster primary. We
can disallow new connections until standby quorum is established. Of course,
walsenders and superusers must be exempt from this restriction.

Key change is following:
@@ -1214,6 +1215,16 @@ InitPostgres(const char *in_dbname, Oid dboid,
if (PostAuthDelay > 0)
pg_usleep(PostAuthDelay * 100L);
 
+   /* Check if we need to wait for startup synchronous replication */
+   if (!am_walsender &&
+   !superuser() &&
+   !StartupSyncRepEstablished())
+   {
+   ereport(FATAL,
+   (errcode(ERRCODE_CANNOT_CONNECT_NOW),
+    errmsg("cannot connect until synchronous replication is established with standbys according to startup_synchronous_standby_level")));
+   }

We might also want to have some kind of cache indicating that quorum was
already established. Also, the place where the check is done might not be the
most appropriate.

2. Do not allow cancelling a locally written transaction

The problem was discussed many times [0,1,2,3] with some agreement on the taken
approach. But there were concerns that the solution is incomplete without the
first patch in the current thread.

Problem: a user might try to cancel a locally committed transaction, and if we do
so we will show non-replicated data as committed. This leads to losing data with
UPSERTs.

The key change is how we process cancels in SyncRepWaitForLSN().

3. Allow reading the LSN written by the walreceiver, but not yet flushed

Problem: if we have synchronous_standby_names = ANY(node1,node2), node2 might
be ahead of node1 by flush LSN, but behind by written LSN. If we do a failover
we choose node2 instead of node1 and lose data recently committed with
synchronous_commit=remote_write.

Caveat: we already have a function pg_last_wal_receive_lsn(), which in fact
returns the flushed LSN, not the written one. I propose adding a new function
which returns the LSN actually written. The internals of this function are
already implemented (GetWalRcvWriteRecPtr()), but unused.
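
For illustration, a minimal sketch of what such a SQL-callable wrapper could
look like (the name pg_last_wal_receive_write_lsn is only a placeholder; the
attached patch may spell it differently):

```c
#include "postgres.h"
#include "fmgr.h"
#include "replication/walreceiver.h"
#include "utils/pg_lsn.h"

/*
 * Hypothetical SQL-callable wrapper around GetWalRcvWriteRecPtr(),
 * returning the last WAL position written (but possibly not yet
 * flushed) by the walreceiver.
 */
Datum
pg_last_wal_receive_write_lsn(PG_FUNCTION_ARGS)
{
	XLogRecPtr	recptr = GetWalRcvWriteRecPtr();

	if (XLogRecPtrIsInvalid(recptr))
		PG_RETURN_NULL();

	PG_RETURN_LSN(recptr);
}
```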

Currently we just use a separate program, lwaldump [4], which simply reads WAL
until the last valid record. In case of a failover, pg_consul uses LSNs from
lwaldump. This approach works well, but it is cumbersome.


There are other caveats of replication, but IMO these 3 problems are the most
annoying in terms of data durability.

I'd greatly appreciate any thoughts on this.


Best regards, Andrey Borodin.


[0] 
https://www.postgresql.org/message-id/flat/C1F7905E-5DB2-497D-ABCC-E14D4DEE506C%40yandex-team.ru
[1] 
https://www.postgresql.org/message-id/flat/caeet0zhg5off7iecby6tzadh1moslmfz1hlm311p9vot7z+...@mail.gmail.com
[2] 
https://www.postgresql.org/message-id/flat/6a052e81060824a8286148b1165bafedbd7c86cd.ca...@j-davis.com#415dc2f7d41b8a251b419256407bb64d
[3] 
https://www.postgresql.org/message-id/flat/CALj2ACUrOB59QaE6%3DjF2cFAyv1MR7fzD8tr4YM5%2BOwEYG1SNzA%40mail.gmail.com
[4] https://github.com/g0djan/lwaldump



0001-Allow-checking-standby-sync-before-making-data-visib.patch
Description: Binary data


0002-Do-not-allow-to-cancel-locally-written-transaction.patch
Description: Binary data


0003-Allow-reading-LSN-written-by-walreciever-but-not-flu.patch
Description: Binary data


Re: [PoC] Federated Authn/z with OAUTHBEARER

2025-05-02 Thread Jacob Champion
On Fri, May 2, 2025 at 8:11 AM Nathan Bossart  wrote:
>
> After commit b0635bf, I'm seeing the following meson build failures on
> macOS:

Thanks for the report, and sorry for the breakage.

> In file included from 
> ../postgresql/src/interfaces/libpq-oauth/oauth-curl.c:51:
> ../postgresql/src/interfaces/libpq/libpq-int.h:70:10: fatal error: 
> 'openssl/ssl.h' file not found
>70 | #include 
>   |  ^~~
> 1 error generated.

Hm. My test setup here is Homebrew with -Dextra_include_dirs, which
may explain why it's not failing for me. Looks like Cirrus also has
-Dextra_include_dirs...

> The following patch seems to resolve it.  I'm curious if commit 4ea1254
> might apply to meson, too, but FWIW I haven't noticed any related failures
> on my machine.

Yeah, I wonder if libintl is being similarly "cheated" on the Meson side.

> diff --git a/meson.build b/meson.build
> index 29d46c8ad01..19ad03042d3 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -3295,6 +3295,7 @@ libpq_deps += [
>
>  libpq_oauth_deps += [
>libcurl,
> +  ssl,
>  ]

Thanks! I think the include directory is the only thing needed for the
static library, not the full link dependency. But I'll try to build up
a new Meson setup, with minimal added settings, to see if I can
reproduce here. Can you share your Meson configuration?

--Jacob




Re: queryId constant squashing does not support prepared statements

2025-05-02 Thread Dmitry Dolgov
> On Fri, May 02, 2025 at 04:18:37PM GMT, Michael Paquier wrote:
> On Fri, May 02, 2025 at 09:13:39AM +0200, Dmitry Dolgov wrote:
> > Squashing constants was meant to be a first step towards doing the same
> > for other types of queries (params, rte_values), reverting it to
> > implement everything at once makes very little sense to me.
>
> That depends.  If we conclude that tracking this information through
> the parser based on the start and end positions in a query string
> for a set of values is more relevant, then we would be redesigning the
> facility from the ground, so the old approach would not be really
> relevant..

If I understand you correctly, changing the way the element list is
identified is not going to address the question of whether or not to
squash parameters, right?




Re: Fix slot synchronization with two_phase decoding enabled

2025-05-02 Thread Nisha Moond
On Wed, Apr 30, 2025 at 5:45 AM Masahiko Sawada  wrote:
>
> On Tue, Apr 29, 2025 at 5:00 AM Nisha Moond  wrote:
>
> Thank you for updating the patch! Here are some comments on v10.
>

Thanks for reviewing the patch!

> ---
> +
> +# Also wait for two-phase to be enabled
> +$subscriber1->poll_query_until(
> +   'postgres', qq[
> +   SELECT count(1) = 0 FROM pg_subscription WHERE subname =
> 'regress_mysub2' and subtwophasestate NOT IN ('e');]
> +) or die "Timed out while waiting for subscriber to enable twophase";
> +
> +# Try to enable the failover for the subscription, should give error
> +($result, $stdout, $stderr) = $subscriber1->psql(
> +   'postgres',
> +   "ALTER SUBSCRIPTION regress_mysub2 DISABLE;
> +ALTER SUBSCRIPTION regress_mysub2 SET (failover = true);");
> +ok( $stderr =~
> + qr/ERROR:  could not alter replication slot "lsub2_slot": ERROR:
>  cannot enable failover for a two-phase enabled replication slot/,
> +   "failover can't be enabled if restart_lsn < two_phase_at on a
> two_phase subscription."
> +);
>
> I think it's possible between two tests that the walsender consumes
> some changes (e.g. generated by autovacuums) then the second check
> fails (i.e., ALTER SUBSCRIPTION SET (failover = ture) succeeds).
>

Yes, this is a possibility. To account for this negative test case
where restart_lsn < two_phase_at, I updated the test to attempt
altering a two_phase-enabled slot without any associated subscription.

> ---
> +# Test that SQL API disallows slot creation with both two_phase and
> failover enabled
> +($result, $stdout, $stderr) = $publisher->psql('postgres',
> +   "SELECT pg_create_logical_replication_slot('regress_mysub3',
> 'pgoutput', false, true, true);"
> +);
> +ok( $stderr =~
> + /ERROR:  cannot enable both "failover" and "two_phase" options
> during replication slot creation/,
> +   "cannot create slot with both two_phase and failover enabled");
>
> Isn't this test already covered by test_decoding/sql/slot.sql?
>

Yes, it is covered in slot.sql, hence removed from here.

> I've attached a patch that contains comment changes I mentioned above.
> Please incorporate it if you agree with them.
>

I have incorporated the suggested changes and updated a couple more
places accordingly.
~~~

Please find the v11 patch addressing the above points and all other
comments. I have also optimized the test by reducing the number of
subscriptions and slots.

--
Thanks,
Nisha


v11-0001-PG17-Approach-3-Fix-slot-synchronization-for-two.patch
Description: Binary data


Re: fixing CREATEROLE

2025-05-02 Thread Robert Haas
On Thu, May 1, 2025 at 4:31 PM Tom Lane  wrote:
> In any case, I'd be happier about createrole_self_grant if it had
> been a role property bit instead of a GUC.  But we'd still need
> to worry about whether it corrupts the results of dump/restore
> (offhand I think it still would, if it results in GRANTs that
> weren't there before).

Hmm. That might have been a better design. :-(

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: extension_control_path and "directory"

2025-05-02 Thread Matheus Alcantara
On Fri, May 2, 2025 at 11:51 AM Peter Eisentraut  wrote:
>
> Thanks, I have committed this.  I did a bit of code reformatting and
> adjusted the documentation a bit.  It's good to get this in before beta1
> so that we don't have to change the valid values of
> extension_control_path past beta1.
>

Thanks Peter!


-- 
Matheus Alcantara




Re: [PoC] Federated Authn/z with OAUTHBEARER

2025-05-02 Thread Nathan Bossart
After commit b0635bf, I'm seeing the following meson build failures on
macOS:

In file included from 
../postgresql/src/interfaces/libpq-oauth/oauth-curl.c:51:
../postgresql/src/interfaces/libpq/libpq-int.h:70:10: fatal error: 
'openssl/ssl.h' file not found
   70 | #include 
  |  ^~~
1 error generated.

The following patch seems to resolve it.  I'm curious if commit 4ea1254
might apply to meson, too, but FWIW I haven't noticed any related failures
on my machine.

diff --git a/meson.build b/meson.build
index 29d46c8ad01..19ad03042d3 100644
--- a/meson.build
+++ b/meson.build
@@ -3295,6 +3295,7 @@ libpq_deps += [

 libpq_oauth_deps += [
   libcurl,
+  ssl,
 ]

 subdir('src/interfaces/libpq')

-- 
nathan




Re: POC: Parallel processing of indexes in autovacuum

2025-05-02 Thread Sami Imseih
Thanks for raising this idea!

I am generally -1 on the idea of autovacuum performing parallel
index vacuum, because I always felt that the parallel option should
be employed in a targeted manner for a specific table. If you have a bunch
of large tables, some more important than others, a/v may end
up using parallel resources on the least important tables, and you
will have to adjust a/v settings per table, etc., to get the right table
to be parallel index vacuumed by a/v.

Also, with the TIDStore improvements for index cleanup, and the practical
elimination of multi-pass index vacuums, I see this being even less
convincing as something to add to a/v.

Now, if I am going to allocate extra workers to run vacuum in parallel, why
not just provide more autovacuum workers instead so I can get more tables
vacuumed within a span of time?

> Once we have parallel heap vacuum, as discussed in thread[1], it would
> also likely be beneficial to incorporate it into autovacuum during
> aggressive vacuum or failsafe mode.

IIRC, index cleanup is disabled by failsafe.


--
Sami Imseih
Amazon Web Services (AWS)




Re: [PoC] Federated Authn/z with OAUTHBEARER

2025-05-02 Thread Jacob Champion
On Fri, May 2, 2025 at 8:59 AM Jacob Champion
 wrote:
> libintl is already coming in via frontend_stlib_code, so that's fine.
> So now I'm wondering if any other static clients of libpq-int.h (if
> there are any) need the ssl dependency too, for correctness, or if
> it's just me.

Looks like it's just me. And using partial_dependency for the includes
seems like overkill, so I've kept the full ssl dependency object, but
moved it to the staticlib only, which is enough to solve the breakage
on my machine.

Nathan, if you get a chance, does the attached patch work for you?

Thanks!
--Jacob


0001-oauth-Correct-SSL-dependency-for-libpq-oauth.a.patch
Description: Binary data


Re: POC: Parallel processing of indexes in autovacuum

2025-05-02 Thread Sami Imseih
> On Fri, May 2, 2025 at 11:58 PM Sami Imseih  wrote:
> >
> > I am generally -1 on the idea of autovacuum performing parallel
> > index vacuum, because I always felt that the parallel option should
> > be employed in a targeted manner for a specific table. if you have a bunch
> > of large tables, some more important than others, a/v may end
> > up using parallel resources on the least important tables and you
> > will have to adjust a/v settings per table, etc to get the right table
> > to be parallel index vacuumed by a/v.
>
> Hm, this is a good point. I think I should clarify one moment - in
> practice, there is a common situation when users have one huge table
> among all databases (with 80+ indexes created on it). But, of course,
> in general there may be few such tables.
> But we can still adjust the autovac_idx_parallel_min_rows parameter.
> If a table has a lot of dead tuples => it is actively used => table is
> important (?).
> Also, if the user can really determine the "importance" of each of the
> tables - we can provide an appropriate table option. Tables with this
> option set will be processed in parallel in priority order. What do
> you think about such an idea?

I think in most cases, the user will want to determine the priority of
a table getting parallel vacuum cycles rather than having the autovacuum
determine the priority. I also see users wanting to stagger
vacuums of large tables with many indexes through some time period,
and give the
tables the full amount of parallel workers they can afford at these
specific periods
of time. A/V currently does not really allow for this type of
scheduling, and if we
give some kind of GUC to prioritize tables, I think users will constantly have
to be modifying this priority.

I am basing my comments on the scenarios I have seen on the field, and others
may have a different opinion.

> > Also, with the TIDStore improvements for index cleanup, and the practical
> > elimination of multi-pass index vacuums, I see this being even less
> > convincing as something to add to a/v.
>
> If I understood correctly, then we are talking about the fact that
> TIDStore can store so many tuples that in fact a second pass is never
> needed.
> But the number of passes does not affect the presented optimization in
> any way. We must think about a large number of indexes that must be
> processed. Even within a single pass we can have a 40% increase in
> speed.

I am not discounting that a single-table vacuum with many indexes may
perform better with a parallel index scan; I am merely saying that
the TIDStore optimization now makes index vacuums better, and perhaps
there is less of an incentive to use parallel.

> > Now, If I am going to allocate extra workers to run vacuum in parallel, why
> > not just provide more autovacuum workers instead so I can get more tables
> > vacuumed within a span of time?
>
> For now, only one process can clean up indexes, so I don't see how
> increasing the number of a/v workers will help in the situation that I
> mentioned above.
> Also, we don't consume additional resources during autovacuum in this
> patch - total number of a/v workers always <= autovacuum_max_workers.

Increasing a/v workers will not help speed up a specific table. What I
am suggesting is that, instead of speeding up one table, we just allow
other tables to not be starved of a/v cycles due to lack of a/v workers.

--
Sami




Re: Parallel CREATE INDEX for GIN indexes

2025-05-02 Thread Tomas Vondra
On 4/30/25 14:39, Tomas Vondra wrote:
> 
> On 4/18/25 03:03, Vinod Sridharan wrote:
>> ...
>>
> 
> The patch seems fine to me - I repeated the tests with mailing list
> archives, with MemoryContextStats() in _gin_parallel_merge, and it
> reliably minimizes the memory usage. So that's fine.
> 
> I was also worried if this might have performance impact, but it
> actually seems to make it a little bit faster.
> 
> I'll get this pushed.
> 

And pushed, so it'll be in beta1.


Thanks!

-- 
Tomas Vondra





Re: Improve explicit cursor handling in pg_stat_statements

2025-05-02 Thread Sami Imseih
> Hmm.  What are the workloads that you have seen as problematic?  Do
> these involve cursor names generated randomly, where most of them are
> similar with a random factor for the name?

postgres_fdw is one example, in which cursor names get reused
for different queries. Notice below that "c1" and "c2" are reused for
different queries, so what underlying SQL is a FETCH, i.e. FETCH 100 FROM c1,
referring to? v2-0001 does not help us with the FETCH problem
because, as I mentioned, we don't have access to the underlying SQL
(and parsing is even too early to do a portal lookup to find the
underlying SQL on which to base the queryId). What v2-0001 will do is at
least group the DECLARE CURSOR statements together for cursors referencing
the same query and reduce the number of entries.

```
create foreign table t2(id int) server r1;
create foreign table t1(id int) server r1;

postgres=# select * from t2, t ;
 id | id
+
  1 |  1
(1 row)

postgres=# select * from t, t2 ;
 id | id
+
  1 |  1
(1 row)
```
on the remote side
```
postgres=# select calls, query from pg_stat_statements where query like '% c%';
 calls |  query
---+-
 1 | DECLARE c2 CURSOR FOR  +
   | SELECT id FROM public.t2
 2 | DECLARE c1 CURSOR FOR  +
   | SELECT id FROM public.t2
 3 | CLOSE c2
 3 | CLOSE c1
 2 | DECLARE c2 CURSOR FOR  +
   | SELECT id FROM public.t
 3 | FETCH 100 FROM c1
 3 | FETCH 100 FROM c2
 1 | DECLARE c1 CURSOR FOR  +
   | SELECT id FROM public.t
 2 | select calls, query from pg_stat_statements where query like $1
(9 rows)
```

> Too much normalization
> here would limit the amount of verbosity that we have for this area,
> especially if we are dealing with query patterns that rely on few
> CLOSE naming patterns spread across a couple of key queries, because
> we would not know anymore about their distribution.

It is already not clear which underlying SQL the FETCH and CLOSE
statements refer to, and there is not much chance to actually
improve that unless we track a cursor queryId in pg_stat_statements (at
that point we can show that a FETCH or CLOSE refers to a specific cursor
statement).

--
Sami




Re: queryId constant squashing does not support prepared statements

2025-05-02 Thread Dmitry Dolgov
> On Fri, May 02, 2025 at 07:10:19AM GMT, Michael Paquier wrote:
> > I am really leaning towards that we should revert this feature as the
> > limitation we have now with parameters is a rather large one and I think
> > we need to go back and address this issue.
>
> I am wondering if this would not be the best move to do on HEAD.
> Let's see where the discussion drives us.

Squashing constants was meant to be a first step towards doing the same
for other types of queries (params, rte_values); reverting it in order to
implement everything at once makes very little sense to me.




Re: queryId constant squashing does not support prepared statements

2025-05-02 Thread Michael Paquier
On Fri, May 02, 2025 at 09:13:39AM +0200, Dmitry Dolgov wrote:
> Squashing constants was meant to be a first step towards doing the same
> for other types of queries (params, rte_values), reverting it to
> implement everything at once makes very little sense to me.

That depends.  If we conclude that tracking this information through
the parser based on the start and end positions in a query string
for a set of values is more relevant, then we would be redesigning the
facility from the ground up, so the old approach would not really be
relevant..
--
Michael


signature.asc
Description: PGP signature


Re: Fix slot synchronization with two_phase decoding enabled

2025-05-02 Thread shveta malik
On Fri, May 2, 2025 at 12:57 PM Nisha Moond  wrote:
>
>
> Please find the v11 patch addressing the above points and all other
> comments. I have also optimized the test by reducing the number of
> subscriptions and slots.
>

Thanks for the patch. Few comments:

1)

pg_upgrade/t/003_logical_slots.pl:

- "SELECT slot_name, two_phase, failover FROM pg_replication_slots");
-is($result, qq(regress_sub|t|t), 'check the slot exists on new cluster');
+ "SELECT slot_name, two_phase FROM pg_replication_slots");
+is($result, qq(regress_sub|t), 'check the slot exists on new cluster');

With this change, the failover property test during upgrade is
removed. Shall we add it again either using ALTER SUB to enable
failover and two_phase together or a new subscription with failover
alone?

2)
+  The default is false. This option cannot be set together with
+  failover when creating a logical replication slot.
+  However, once the slot is created with two_phase =
true,
+  failover can be set to true after the slot has
+  consumed all transactions up to the point where two-phase decoding
+  was enabled.


Suggestion:  all transactions --> all the changes.

thanks
Shveta




Re: Add an option to skip loading missing publication to avoid logical replication failure

2025-05-02 Thread Xuneng Zhou
Yeah, thanks for your clarification. I have a basic understanding of it now. I
mean, is this considered a bug or a design defect in the codebase? If so,
should we prevent it from occurring in general, not just for this specific
test?

vignesh C 

>
> We have three processes involved in this scenario:
> A walsender process on the publisher, responsible for decoding and
> sending WAL changes.
> An apply worker process on the subscriber, which applies the changes.
> A session executing the ALTER SUBSCRIPTION command.
>
> Due to the asynchronous nature of these processes, the ALTER
> SUBSCRIPTION command may not be immediately observed by the apply
> worker. Meanwhile, the walsender may process and decode an INSERT
> statement.
> If the insert targets a table (e.g., tab_3) that does not belong to
> the current publication (pub1), the walsender silently skips
> replicating the record and advances its decoding position. This
> position is sent in a keepalive message to the subscriber, and since
> there are no pending transactions to flush, the apply worker reports
> it as the latest received LSN.
> Later, when the apply worker eventually detects the subscription
> change, it restarts—but by then, the insert has already been skipped
> and is no longer eligible for replay, as the table was not part of the
> publication (pub1) at the time of decoding.
> This race condition arises because the three processes run
> independently and may progress at different speeds due to CPU
> scheduling or system load.
> Thoughts?
>
> Regards,
> Vignesh
>


Re: Make COPY format extendable: Extract COPY TO format implementations

2025-05-02 Thread Masahiko Sawada
On Thu, May 1, 2025 at 4:04 PM Michael Paquier  wrote:
>
> On Thu, May 01, 2025 at 12:15:30PM -0700, Masahiko Sawada wrote:
> > In light of these concerns, I've been contemplating alternative
> > interface designs. One promising approach would involve registering
> > custom copy formats via a C function during module loading
> > (specifically, in _PG_init()). This method would require extension
> > authors to invoke a registration function, say
> > RegisterCustomCopyFormat(), in _PG_init() as follows:
> >
> > JsonLinesFormatId = RegisterCustomCopyFormat("jsonlines",
> >  &JsonLinesCopyToRoutine,
> >  &JsonLinesCopyFromRoutine);
> >
> > The registration function would validate the format name and store it
> > in TopMemoryContext. It would then return a unique identifier that can
> > be used subsequently to reference the custom copy format extension.
>
> Hmm.  How much should we care about the observability of the COPY
> format used by a given backend?  Storing this information in a
> backend's TopMemoryContext is OK to get the extensibility basics to
> work, but could it make sense to use some shmem state to allocate a
> uint32 ID that could be shared by all backends.  Contrary to EXPLAIN,
> COPY commands usually run for a very long time, so I am wondering if
> these APIs should be designed so as it would be possible to monitor
> the format used.  One layer where the format information could be made
> available is the progress reporting view for COPY, for example.  I can
> also imagine a pgstats kind where we do COPY stats aggregates, with a
> per-format pgstats kind, and sharing a fixed ID across multiple
> backends is relevant (when flushing the stats at shutdown, we would
> use a name/ID mapping like replication slots).
>
> I don't think that this needs to be relevant for the option part, just
> for the format where, I suspect, we should store in a shmem array
> based on the ID allocated the name of the format, the library of the
> callback and the function name fed to load_external_function().
>
> Note that custom LWLock and wait events use a shmem state for
> monitoring purposes, where we are able to do ID->format name lookups
> as much as format->ID lookups.  Perhaps it's OK not to do that for
> COPY, but I am wondering if we'd better design things from scratch
> with states in shmem state knowing that COPY is a long-running
> operation, and that if one mixes multiple formats they would most
> likely want to know which formats are bottlenecks, through SQL.  Cloud
> providers would love that.

Good point. It would make sense to have such information as a map on
shmem. It might be better to use dshash here since a custom copy
format module can be loaded at runtime. Or we can use dynahash with
large enough elements.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Make COPY format extendable: Extract COPY TO format implementations

2025-05-02 Thread David G. Johnston
On Thursday, May 1, 2025, Masahiko Sawada  wrote:

>
> In light of these concerns, I've been contemplating alternative
> interface designs. One promising approach would involve registering
> custom copy formats via a C function during module loading
> (specifically, in _PG_init()). This method would require extension
> authors to invoke a registration function, say
> RegisterCustomCopyFormat(), in _PG_init() as follows:
>
> JsonLinesFormatId = RegisterCustomCopyFormat("jsonlines",
>  &JsonLinesCopyToRoutine,
>  &JsonLinesCopyFromRoutine);
>
> The registration function would validate the format name and store it
> in TopMemoryContext. It would then return a unique identifier that can
> be used subsequently to reference the custom copy format extension.
>

How does this fix the search_path concern?  Are query writers supposed to
put JsonLinesFormatId into their queries?  Or are you just prohibiting a
DBA from ever installing an extension that wants to register a format name
that is already registered so that no namespace is ever required?

ISTM accommodating a namespace for formats is required just like we do for
virtually every other named object in the system.  At least, if we want to
play nice with extension authors.  It doesn’t have to be within the
existing pg_proc scope, we can create a new scope if desired, but
abolishing it seems unwise.

It would be more consistent with established policy if we didn’t make
exceptions for text/csv/binary - if the DBA permits a text format to exist
in a different schema and that schema appears first in the search_path,
unqualified references to text would resolve to the non-core handler.  We
already protect ourselves with safe search_paths.  This is really no
different than if someone wanted to implement a now() function and people
are not putting pg_catalog in front of existing usage.  It’s the DBA’s
problem, not ours.

David J.


Re: Make COPY format extendable: Extract COPY TO format implementations

2025-05-02 Thread Masahiko Sawada
On Fri, May 2, 2025 at 10:36 PM David G. Johnston
 wrote:
>
> On Thursday, May 1, 2025, Masahiko Sawada  wrote:
>>
>>
>> In light of these concerns, I've been contemplating alternative
>> interface designs. One promising approach would involve registering
>> custom copy formats via a C function during module loading
>> (specifically, in _PG_init()). This method would require extension
>> authors to invoke a registration function, say
>> RegisterCustomCopyFormat(), in _PG_init() as follows:
>>
>> JsonLinesFormatId = RegisterCustomCopyFormat("jsonlines",
>>  &JsonLinesCopyToRoutine,
>>  &JsonLinesCopyFromRoutine);
>>
>> The registration function would validate the format name and store it
>> in TopMemoryContext. It would then return a unique identifier that can
>> be used subsequently to reference the custom copy format extension.
>
>
> How does this fix the search_path concern?  Are query writers supposed to put 
> JsonLinesFormatId into their queries?  Or are you just prohibiting a DBA from 
> ever installing an extension that wants to register a format name that is 
> already registered so that no namespace is ever required?

In this case, users specify "jsonlines", the name passed as the first
argument to the registration function, in the COPY FORMAT option.
JsonLinesFormatId is reserved for internal operations such as module
processing and monitoring, and any attempt to load another custom COPY
format module named 'jsonlines' will result in an error.
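
For example, the user-facing side stays a plain string and the ID never
appears in SQL (a sketch, assuming such a jsonlines module is installed):

    COPY mytable TO STDOUT (FORMAT 'jsonlines');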

> ISTM accommodating a namespace for formats is required just like we do for 
> virtually every other named object in the system.  At least, if we want to 
> play nice with extension authors.  It doesn’t have to be within the existing 
> pg_proc scope, we can create a new scope if desired, but abolishing it seems 
> unwise.
>
> It would be more consistent with established policy if we didn’t make 
> exceptions for text/csv/binary - if the DBA permits a text format to exist in 
> a different schema and that schema appears first in the search_path, 
> unqualified references to text would resolve to the non-core handler.  We 
> already protect ourselves with safe search_paths.  This is really no 
> different than if someone wanted to implement a now() function; people already
> put pg_catalog first because of existing usage.  It’s the DBA's problem, not ours.

I'm concerned about allowing multiple 'text' format implementations
with identical names within the database, as this could lead to
considerable confusion. When users specify 'text', it would be more
logical to guarantee that the built-in 'text' format is consistently
used. This principle aligns with other customizable components, such
as custom resource managers, wait events, lightweight locks, and
custom scans. These components maintain their built-in data/types and
explicitly prevent the registration of duplicate names.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Make COPY format extendable: Extract COPY TO format implementations

2025-05-02 Thread Masahiko Sawada
On Fri, May 2, 2025 at 9:56 PM Sutou Kouhei  wrote:
>
> Hi,
>
> In 
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> on Fri, 2 May 2025 21:38:32 -0700,
>   Masahiko Sawada  wrote:
>
> >> How about requiring schema for all custom formats?
> >>
> >> Valid:
> >>
> >>   COPY ... TO ... (FORMAT 'text');
> >>   COPY ... TO ... (FORMAT 'my_schema.jsonlines');
> >>
> >> Invalid:
> >>
> >>   COPY ... TO ... (FORMAT 'jsonlines'); -- no schema
> >>   COPY ... TO ... (FORMAT 'pg_catalog.text'); -- needless schema
> >>
> >> If we require "schema" for all custom formats, we don't need
> >> to depend on search_path.
> >
> > I'm concerned that users cannot use the same format name in the FORMAT
> > option depending on which schema the handler function is created in.
>
> I'm not sure that it's a problem or not. If users want to
> use the same format name, they can install the handler
> function to the same schema.
>
> >> Why do we need to assign a unique ID? For performance? For
> >> RegisterCustomCopyFormatOption()?
> >
> > I think it's required for monitoring purposes for example. For
> > instance, we can set the format ID in the progress information and the
> > progress view can fetch the format name by the ID so that users can
> > see what format is being used in the COPY command.
>
> How about setting the format name instead of the format ID
> in the progress information?

The progress view can know only numbers. We need to extend the
progress view infrastructure so that we can pass other data types.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: POC: Parallel processing of indexes in autovacuum

2025-05-02 Thread Masahiko Sawada
On Fri, May 2, 2025 at 11:13 AM Daniil Davydov <3daniss...@gmail.com> wrote:
>
> On Thu, May 1, 2025 at 8:03 AM Masahiko Sawada  wrote:
> >
> > As I understand it, we initially disabled parallel vacuum for
> > autovacuum because their objectives are somewhat contradictory.
> > Parallel vacuum aims to accelerate the process by utilizing additional
> > resources, while autovacuum is designed to perform cleaning operations
> > with minimal impact on foreground transaction processing (e.g.,
> > through vacuum delay).
> >
> Yep, we also decided that we must not create more a/v workers for
> index processing.
> In current implementation, the leader process sends a signal to the
> a/v launcher, and the launcher tries to launch all requested workers.
> But the number of workers never exceeds `autovacuum_max_workers`.
> Thus, we will never have more a/v workers than in the standard case
> (without this feature).

I have concerns about this design. When autovacuuming on a single
table consumes all available autovacuum_max_workers slots with
parallel vacuum workers, the system becomes incapable of processing
other tables. This means that when determining the appropriate
autovacuum_max_workers value, users must consider not only the number
of tables to be processed concurrently but also the potential number
of parallel workers that might be launched. I think it would make more
sense to maintain the existing autovacuum_max_workers parameter while
introducing a new parameter that would either control the maximum
number of parallel vacuum workers per autovacuum worker or set a
system-wide cap on the total number of parallel vacuum workers.
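
For illustration only (the second parameter name is hypothetical, not a
settled proposal), the postgresql.conf settings could end up looking like:

    autovacuum_max_workers = 3               # unchanged: concurrent autovacuum workers
    autovacuum_max_parallel_workers = 6      # new: cap on parallel vacuum workers
                                             # launched on behalf of autovacuum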

>
> > Regarding implementation: I notice the WIP patch implements its own
> > parallel vacuum mechanism for autovacuum. Have you considered simply
> > setting at_params.nworkers to a value greater than zero?
> >
> About `at_params.nworkers = N` - that's exactly what we're doing (you
> can see it in the `vacuum_rel` function). But we cannot fully reuse
> code of VACUUM PARALLEL, because it creates its own processes via
> dynamic bgworkers machinery.
> As I said above - we don't want to consume additional resources. Also
> we don't want to complicate communication between processes (the idea
> is that a/v workers can only send signals to the a/v launcher).

Could you elaborate on the reasons why you don't want to use
background workers and want to avoid complicated communication between
processes? I'm not sure whether these concerns provide sufficient
justification for implementing a separate parallel index processing
mechanism.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: AIO v2.5

2025-05-02 Thread Noah Misch
On Wed, Apr 30, 2025 at 04:00:35PM -0400, Andres Freund wrote:
> pgaio_io_wait_for_free() does what it says on the tin. For that, after a bunch
> of other things, finds the oldest in-flight IO and waits for it.
> 
>   PgAioHandle *ioh = dclist_head_element(PgAioHandle, node,
>                                           &pgaio_my_backend->in_flight_ios);
> 
>   switch (ioh->state)
>   {
> ...
>       case PGAIO_HS_COMPLETED_IO:
>       case PGAIO_HS_SUBMITTED:
>           pgaio_debug_io(DEBUG2, ioh,
>                          "waiting for free io with %d in flight",
>                          dclist_count(&pgaio_my_backend->in_flight_ios));
> ...
>           pgaio_io_wait(ioh, ioh->generation);
>           break;
> 
> 
> The problem is that, if the log level is low enough, ereport() (which is
> called by pgaio_debug_io()), processes interrupts.  The interrupt processing
> may end up execute ProcessBarrierSmgrRelease(), which in turn needs to wait
> for all in-flight IOs before the IOs are closed.
> 
> Which then leads to the
>   elog(PANIC, "waiting for own IO in wrong state: %d",
>        state);
> 
> error.

Printing state 0 (PGAIO_HS_IDLE), right?  I think the chief problem is that
pgaio_io_wait_for_free() is fetching ioh->state, then possibly processing
interrupts in pgaio_debug_io(), then finally fetching ioh->generation.  If it
fetched ioh->generation to a local variable before pgaio_debug_io, I think
that would resolve this one.  Then the pgaio_io_was_recycled() would prevent
the PANIC:

if (pgaio_io_was_recycled(ioh, ref_generation, &state))
    return;

if (am_owner)
{
if (state != PGAIO_HS_SUBMITTED
&& state != PGAIO_HS_COMPLETED_IO
&& state != PGAIO_HS_COMPLETED_SHARED
&& state != PGAIO_HS_COMPLETED_LOCAL)
{
elog(PANIC, "waiting for own IO in wrong state: %d",
 state);
}
}

Is that right?  If that's the solution, pgaio_closing_fd() and
pgaio_shutdown() would need similar care around fetching the generation before
the pgaio_debug_io.  Maybe there's an opportunity for a common inline
function.  Or at least a comment at the "generation" field on how to safely
time a fetch thereof and any barrier required.
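
Concretely, the reordering would look something like this (names taken from
the quoted code; the uint64 type for the generation is an assumption):

    uint64      ref_generation = ioh->generation;   /* fetch before any interrupt processing */

    pgaio_debug_io(DEBUG2, ioh,
                   "waiting for free io with %d in flight",
                   dclist_count(&pgaio_my_backend->in_flight_ios));

    /* ioh may have been recycled by now; the saved generation makes that detectable */
    pgaio_io_wait(ioh, ref_generation);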

> A similar set of steps can lead to the "no free IOs despite no in-flight IOs"
> ERROR that Alexander also observed - if pgaio_submit_staged() triggers a debug
> ereport that executes ProcessBarrierSmgrRelease() in an interrupt, we might
> wait for all in-flight IOs during IO submission, triggering the error.

That makes sense.

> I'm not yet sure how to best fix it - locally I have done so by pgaio_debug()
> do a HOLD_INTERRUPTS()/RESUME_INTERRUPTS() around the call to ereport.  But
> that doesn't really seem great - otoh requiring various pieces of code to know
> that anything emitting debug messages needs to hold interrupts etc makes for
> rare and hard to understand bugs.
> 
> We could just make the relevant functions hold interrupts, and that might be
> the best path forward, but we don't really need to hold all interrupts
> (e.g. termination would be fine), so it's a bit coarse grained.  It would need
> to happen in a few places, which isn't great either.
> 
> Other suggestions?

For the "no free IOs despite no in-flight IOs" case, I'd replace the
ereport(ERROR) with "return;" since we now know interrupt processing reclaimed
an IO.  Then decide what protection if any, we need against bugs causing an
infinite loop in caller pgaio_io_acquire().  What's the case motivating the
unbounded loop in pgaio_io_acquire(), as opposed to capping at two
pgaio_io_acquire_nb() calls?  If the theory is that pgaio_io_acquire() could
be reentrant, what scenario would reach that reentrancy?
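
Roughly, the capped shape would be something like this (signatures as I read
them from the quoted code and aio.h; a sketch, not a patch):

    PgAioHandle *
    pgaio_io_acquire(struct ResourceOwnerData *resowner, PgAioReturn *ret)
    {
        PgAioHandle *ioh;

        /* first attempt; interrupt processing inside may already reclaim an IO */
        ioh = pgaio_io_acquire_nb(resowner, ret);
        if (ioh != NULL)
            return ioh;

        /* wait for one in-flight IO to complete, then retry exactly once */
        pgaio_io_wait_for_free();
        ioh = pgaio_io_acquire_nb(resowner, ret);
        if (ioh == NULL)
            elog(PANIC, "no free IO handle even after waiting for in-flight IOs");
        return ioh;
    }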

> Thanks again for finding and reporting this Alexander!

+1!




Re: PG 18 release notes draft committed

2025-05-02 Thread Bruce Momjian
On Sat, May  3, 2025 at 01:46:29AM +0200, Jelte Fennema-Nio wrote:
> On Fri, 2 May 2025 at 04:45, Bruce Momjian  wrote:
> >
> > I have committd the first draft of the PG 18 release notes.  The item
> > count looks strong:
> 
> Thanks for all the work. Some notes:
> 
> 1. There's currently no mention that protocol version 3.2 was
> introduced in this release. I'm not sure where/how this should be
> mentioned, but I definitely think it should be somewhere. It's a
> pretty major change. One option is to replace/amend the "Make cancel
> request keys 256 bits" item. Maybe replace that with something like:
> "Postgres 18 introduces protocol version 3.2. This is the first new
> protocol version since 3.0, which was introduced in Postgres 7.4. This
> new protocol version 3.2 allows a server to use longer cancel request
> keys. When the client advertises support for protocol version 3.2 (or
> higher) Postgres 18 will use a cancel key size of 256 bits."

Okay, I added a mention next to the libpq version function entries.

> 2. Obviously biased since it's my contribution, but I think d38bab5
> might deserve a mention.

I disagree.  pgbench limits like this are not the kind of error-avoidance
detail we usually cover in the release notes.

> 3. The "Add PQtrace() output..." commitlist should also contain 7adec2d5fc

Added.  Patch attached.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.
diff --git a/doc/src/sgml/release-18.sgml b/doc/src/sgml/release-18.sgml
index 2654960eef8..ed7fa5d903f 100644
--- a/doc/src/sgml/release-18.sgml
+++ b/doc/src/sgml/release-18.sgml
@@ -2146,6 +2146,10 @@ Author: Robert Haas 
 Add function PQfullProtocolVersion() to report the full, including minor, protocol version number (Jacob Champion, Jelte Fennema-Nio)
 §
 
+
+
+This release introduces wire protocol version 3.2.
+
 
 
 
 
 


Re: POC: Parallel processing of indexes in autovacuum

2025-05-02 Thread Sami Imseih
> I think it would more make
> sense to maintain the existing autovacuum_max_workers parameter while
> introducing a new parameter that would either control the maximum
> number of parallel vacuum workers per autovacuum worker or set a
> system-wide cap on the total number of parallel vacuum workers.

+1, and would it make sense for parallel workers to come from
max_parallel_maintenance_workers? This is capped by
max_parallel_workers and max_worker_processes, so increasing
the defaults for all 3 will be needed as well.
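
For reference, the existing hierarchy those come from (current defaults shown):

    SHOW max_worker_processes;               -- 8 by default; cluster-wide cap on background workers
    SHOW max_parallel_workers;               -- 8 by default; the portion usable for parallelism
    SHOW max_parallel_maintenance_workers;   -- 2 by default; per-command cap (VACUUM, CREATE INDEX)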


--
Sami




Re: PG 18 release notes draft committed

2025-05-02 Thread Jelte Fennema-Nio
On Fri, 2 May 2025 at 04:45, Bruce Momjian  wrote:
>
> I have committd the first draft of the PG 18 release notes.  The item
> count looks strong:

Thanks for all the work. Some notes:

1. There's currently no mention that protocol version 3.2 was
introduced in this release. I'm not sure where/how this should be
mentioned, but I definitely think it should be somewhere. It's a
pretty major change. One option is to replace/amend the "Make cancel
request keys 256 bits" item. Maybe replace that with something like:
"Postgres 18 introduces protocol version 3.2. This is the first new
protocol version since 3.0, which was introduced in Postgres 7.4. This
new protocol version 3.2 allows a server to use longer cancel request
keys. When the client advertises support for protocol version 3.2 (or
higher) Postgres 18 will use a cancel key size of 256 bits."
2. Obviously biased since it's my contribution, but I think d38bab5
might deserve a mention.
3. The "Add PQtrace() output..." commitlist should also contain 7adec2d5fc




Re: pgsql: Add function to log the memory contexts of specified backend pro

2025-05-02 Thread Fujii Masao



On 2025/05/02 14:58, torikoshia wrote:

> I confirmed that with this patch applied, the process no longer crashes even
> after applying the change Robert used to trigger the crash.
> 
> a small nitpick:
> 
> +    * requested  repeatedly and rapidly, potentially leading to infinite
> 
> It looks like there are two spaces between "requested" and "repeatedly".


Thanks for the review and testing! I've fixed the issue you pointed out.
Updated patch attached.

Since git cherry-pick didn't work cleanly for v16 and earlier,
I've also prepared a separate patch for those versions.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
From c39b3b83f8a04e781812bd3b83acb235babe3116 Mon Sep 17 00:00:00 2001
From: Fujii Masao 
Date: Sat, 3 May 2025 10:21:01 +0900
Subject: [PATCH v2] Add guard to prevent recursive memory contexts logging.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Previously, if memory contexts logging was triggered repeatedly and
rapidly while a previous request was still being processed, it could
result in recursive calls to ProcessLogMemoryContextInterrupt().
This could lead to infinite recursion and potentially crash the process.

This commit adds a guard to prevent such recursion.
If ProcessLogMemoryContextInterrupt() is already in progress and
logging memory contexts, subsequent calls will exit immediately,
avoiding unintended recursive calls.

While this scenario is unlikely in practice, it’s not impossible.
This change adds a safety check to prevent such failures.

Back-patch to v14, where memory contexts logging was introduced.

Reported-by: Robert Haas 
Author: Fujii Masao 
Reviewed-by: Atsushi Torikoshi 
Discussion: 
https://postgr.es/m/ca+tgmozmrv32tbnrrftvf9iwlntgqbhyslvcrhguwzvctph...@mail.gmail.com
Backpatch-through: 14
---
 src/backend/utils/mmgr/mcxt.c | 60 ---
 1 file changed, 41 insertions(+), 19 deletions(-)

diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index 9fc83f11f6f..73dcd64c351 100644
--- a/src/backend/utils/mmgr/mcxt.c
+++ b/src/backend/utils/mmgr/mcxt.c
@@ -1198,28 +1198,50 @@ HandleLogMemoryContextInterrupt(void)
 void
 ProcessLogMemoryContextInterrupt(void)
 {
+   static bool in_progress = false;
+
LogMemoryContextPending = false;
 
-   /*
-    * Use LOG_SERVER_ONLY to prevent this message from being sent to the
-    * connected client.
-    */
-   ereport(LOG_SERVER_ONLY,
-           (errhidestmt(true),
-            errhidecontext(true),
-            errmsg("logging memory contexts of PID %d", MyProcPid)));
+   PG_TRY();
+   {
+       /*
+        * Exit immediately if memory contexts logging is already in progress.
+        * This prevents recursive calls, which could occur if logging is
+        * requested repeatedly and rapidly, potentially leading to infinite
+        * recursion and a crash.
+        */
+       if (in_progress)
+           return;
+       in_progress = true;
 
-   /*
-    * When a backend process is consuming huge memory, logging all its memory
-    * contexts might overrun available disk space. To prevent this, we limit
-    * the number of child contexts to log per parent to 100.
-    *
-    * As with MemoryContextStats(), we suppose that practical cases where the
-    * dump gets long will typically be huge numbers of siblings under the
-    * same parent context; while the additional debugging value from seeing
-    * details about individual siblings beyond 100 will not be large.
-    */
-   MemoryContextStatsDetail(TopMemoryContext, 100, false);
+       /*
+        * Use LOG_SERVER_ONLY to prevent this message from being sent to the
+        * connected client.
+        */
+       ereport(LOG_SERVER_ONLY,
+               (errhidestmt(true),
+                errhidecontext(true),
+                errmsg("logging memory contexts of PID %d", MyProcPid)));
+
+       /*
+        * When a backend process is consuming huge memory, logging all its
+        * memory contexts might overrun available disk space. To prevent
+        * this, we limit the number of child contexts to log per parent to
+        * 100.
+        *
+        * As with MemoryContextStats(), we suppose that practical cases where
+        * the dump gets long will typically be huge numbers of siblings under
+        * the same parent context; while the additional debugging value from
+        * seeing details about individual siblings beyond 100 will not be
+        * large.
+        */
+       MemoryContextStatsDetail(TopMemoryContext, 100, false);

Re: Make COPY format extendable: Extract COPY TO format implementations

2025-05-02 Thread Sutou Kouhei
Hi,

In 
  "Re: Make COPY format extendable: Extract COPY TO format implementations" on 
Thu, 1 May 2025 12:15:30 -0700,
  Masahiko Sawada  wrote:

> One of the primary considerations we need to address is the treatment
> of the specified format name. The current patch set utilizes built-in
> formats (namely 'csv', 'text', and 'binary') when the format name is
> either unqualified or explicitly specified with 'pg_catalog' as the
> schema. In all other cases, we search for custom format handler
> functions based on the search_path. To be frank, I have reservations
> about this interface design, as the dependence of the specified custom
> format name on the search_path could potentially confuse users.

How about requiring schema for all custom formats?

Valid:

  COPY ... TO ... (FORMAT 'text');
  COPY ... TO ... (FORMAT 'my_schema.jsonlines');

Invalid:

  COPY ... TO ... (FORMAT 'jsonlines'); -- no schema
  COPY ... TO ... (FORMAT 'pg_catalog.text'); -- needless schema

If we require "schema" for all custom formats, we don't need
to depend on search_path.

> In light of these concerns, I've been contemplating alternative
> interface designs. One promising approach would involve registering
> custom copy formats via a C function during module loading
> (specifically, in _PG_init()). This method would require extension
> authors to invoke a registration function, say
> RegisterCustomCopyFormat(), in _PG_init() as follows:
> 
> JsonLinesFormatId = RegisterCustomCopyFormat("jsonlines",
>  &JsonLinesCopyToRoutine,
>  &JsonLinesCopyFromRoutine);
> 
> The registration function would validate the format name and store it
> in TopMemoryContext. It would then return a unique identifier that can
> be used subsequently to reference the custom copy format extension.

I don't object the suggested interface because I don't have
a strong opinion how to implement this feature.

Why do we need to assign a unique ID? For performance? For
RegisterCustomCopyFormatOption()?

I think that we don't need to use it so much in COPY. We
don't need the format name or the assigned ID after we
retrieve the corresponding Copy{To,From}Routine, because all
the needed information is in Copy{To,From}Routine.

>  Extensions could register their own options within this
> framework, for example:
> 
> RegisterCustomCopyFormatOption(JsonLinesFormatId,
> "custom_option",
> custom_option_handler);

Can we defer to discuss how to add support for custom
options while we focus on the first implementation? Earlier
patch sets with the current approach had custom options
support but it's removed in the first implementation.

(BTW, I think that it's not a good API because we want COPY
FROM only options and COPY TO only options something like
"compression level".)

> This approach offers several advantages: it would eliminate the
> search_path issue, provide greater flexibility, and potentially
> simplify the overall interface for users and developers alike.

What contributes to the "flexibility"? Developers can call
multiple Register* functions in _PG_init(), right?


Thanks,
-- 
kou




Re: Make COPY format extendable: Extract COPY TO format implementations

2025-05-02 Thread Sutou Kouhei
Hi,

In 
  "Re: Make COPY format extendable: Extract COPY TO format implementations" on 
Fri, 2 May 2025 15:52:49 -0700,
  Masahiko Sawada  wrote:

>> Hmm.  How much should we care about the observability of the COPY
>> format used by a given backend?  Storing this information in a
>> backend's TopMemoryContext is OK to get the extensibility basics to
>> work, but could it make sense to use some shmem state to allocate a
>> uint32 ID that could be shared by all backends.  Contrary to EXPLAIN,
>> COPY commands usually run for a very long time, so I am wondering if
>> these APIs should be designed so as it would be possible to monitor
>> the format used.  One layer where the format information could be made
>> available is the progress reporting view for COPY, for example.  I can
>> also imagine a pgstats kind where we do COPY stats aggregates, with a
>> per-format pgstats kind, and sharing a fixed ID across multiple
>> backends is relevant (when flushing the stats at shutdown, we would
>> use a name/ID mapping like replication slots).
>>
>> I don't think that this needs to be relevant for the option part, just
>> for the format where, I suspect, we should store in a shmem array
>> based on the ID allocated the name of the format, the library of the
>> callback and the function name fed to load_external_function().
>>
>> Note that custom LWLock and wait events use a shmem state for
>> monitoring purposes, where we are able to do ID->format name lookups
>> as much as format->ID lookups.  Perhaps it's OK not to do that for
>> COPY, but I am wondering if we'd better design things from scratch
>> with states in shmem state knowing that COPY is a long-running
>> operation, and that if one mixes multiple formats they would most
>> likely want to know which formats are bottlenecks, through SQL.  Cloud
>> providers would love that.
> 
> Good point. It would make sense to have such information as a map on
> shmem. It might be better to use dshash here since a custom copy
> format module can be loaded at runtime. Or we can use dynahash with
> large enough elements.

If we don't need to assign an ID for each format, can we
avoid it? If we do implement it, does this approach become
more complex than the current tablesample-like approach?


Thanks,
-- 
kou




Re: PG 18 release notes draft committed

2025-05-02 Thread Bruce Momjian
On Fri, May  2, 2025 at 12:18:06PM -0400, Bruce Momjian wrote:
> Finally, I see the big increases in this release as being the optimizer,
> monitoring, and constraints.

Also, I am loving the chapter markers linking to gitweb.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.




Re: extension_control_path and "directory"

2025-05-02 Thread Christoph Berg
Re: Matheus Alcantara
> > Thanks, I have committed this.  I did a bit of code reformatting and
> > adjusted the documentation a bit.  It's good to get this in before beta1
> > so that we don't have to change the valid values of
> > extension_control_path past beta1.
> 
> Thanks Peter!

And thanks everyone!

Christoph




Re: [PoC] Federated Authn/z with OAUTHBEARER

2025-05-02 Thread Jacob Champion
On Fri, May 2, 2025 at 8:46 AM Jacob Champion
 wrote:
> Yeah, I wonder if libintl is being similarly "cheated" on the Meson side.

libintl is already coming in via frontend_stlib_code, so that's fine.
So now I'm wondering if any other static clients of libpq-int.h (if
there are any) need the ssl dependency too, for correctness, or if
it's just me.

> But I'll try to build up
> a new Meson setup, with minimal added settings, to see if I can
> reproduce here. Can you share your Meson configuration?

(Never mind -- this was pretty easy to hit in a from-scratch configuration.)

--Jacob




Re: PG 18 release notes draft committed

2025-05-02 Thread Bruce Momjian
On Fri, May  2, 2025 at 01:00:57PM +0900, Amit Langote wrote:
> 1. Speed up execution of cached plans by deferring locks on partitions
> subject to pruning (Amit Langote)
> (bb3ec16e1, d47cbf474, cbc127917, 525392d57)
> 
> 2. Speed up child EquivalenceMember lookup in planner (Yuya Watari,
> David Rowley)
> (d69d45a5a)
> 
> 3. Speed up derived clause lookup in EquivalenceClass (Ashutosh Bapat)
> (88f55bc97)
> 
> Alternatively, 2 and 3 can be combined as:
> 
> 2. Speed up partition planning by improving EquivalenceClass lookups
> (Yuya Watari, David Rowley, Ashutosh Bapat)
> 
> I think 1 should go under Partitioning, which I see is currently missing.
> 
> Any thoughts, David?
> 
> Can work on a patch if you'd like.

So, a few things.  First, this set of commits was in a group of 10 that
I added since there have been complaints in the past that optimizer
improvements were not listed and therefore patch authors were not given
sufficient credit.  That means the 209 item count for PG 18 is 10 higher
than my normal filtering would produce.

Second, looking at the items, these are a case of "X is faster", which
we don't normally mention in the release notes.  We normally mention
"faster" when it is so much faster that use cases which were not
possible before might be possible now, so it is recommended to retest.
That is what I saw this grouped item as, whereas I don't think the
individual items meet that criterion.

Also, I didn't see enough partition items to warrant a separate
partition section, and we didn't have one in PG 17 either.  We could
pull all the partition items from the sections they are already in, but
they seem more natural in the sections they are in.

I don't think most people would know what EquivalenceMember is, and even
if they did, would they be able to connect it to an SQL query?

Finally, I see the big increases in this release as being the optimizer,
monitoring, and constraints.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.




First-draft back-branch release notes are up

2025-05-02 Thread Tom Lane
Much less exciting than the v18 release notes, but we
still gotta do 'em.  See

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=176877f461a8b55e921f597fb217f6ab89ee019f

As usual, please send corrections by Sunday.

regards, tom lane




Re: RFC: Additional Directory for Extensions

2025-05-02 Thread David E. Wheeler
On May 1, 2025, at 16:24, Peter Eisentraut  wrote:

> I see.  I have committed it now describing the current state.

Quick follow-up to tweak a couple of commas.

--- a/doc/src/sgml/extend.sgml
+++ b/doc/src/sgml/extend.sgml
@@ -1813,8 +1813,8 @@ include $(PGXS)
 

 You can select a separate directory prefix in which to install your
-extension's files, by setting the make variable
-prefix when executing make install
+extension's files by setting the make variable
+prefix when executing make install,
 like so:
 
 make install prefix=/usr/local/postgresql

Best,

David





Re: Add an option to skip loading missing publication to avoid logical replication failure

2025-05-02 Thread Tom Lane
vignesh C  writes:
> Due to the asynchronous nature of these processes, the ALTER
> SUBSCRIPTION command may not be immediately observed by the apply
> worker. Meanwhile, the walsender may process and decode an INSERT
> statement.
> If the insert targets a table (e.g., tab_3) that does not belong to
> the current publication (pub1), the walsender silently skips
> replicating the record and advances its decoding position. This
> position is sent in a keepalive message to the subscriber, and since
> there are no pending transactions to flush, the apply worker reports
> it as the latest received LSN.

So this theory presumes that the apply worker receives and reacts to
the keepalive message, yet it has not observed a relevant
subscriber-side catalog update that surely committed before the
keepalive was generated.  It's fairly hard to see how that is okay,
because it's at least adjacent to something that must be considered a
bug: applying transmitted data without having observed DDL updates to
the target table.  Why is the processing of keepalives laxer than the
processing of data messages?

regards, tom lane




Re: PG 18 release notes draft committed

2025-05-02 Thread Jacob Champion
On Thu, May 1, 2025 at 7:44 PM Bruce Momjian  wrote:
> I will probably add markup in 1-3 weeks.  Let the feedback begin.  ;-)

Thanks!

>
> Version 18 contains a number of changes that may affect compatibility
> with previous releases.  Observe the following incompatibilities:
>
>[...]
> Rename server variable ssl_ecdh_curve to ssl_groups and allow multiple 
> colon-separated ECDH curves to be specified (Erica Zhang, Daniel Gustafsson)

The previous setting name should continue to function correctly, since
it's mapped as an alias, so this can probably be moved into the
"standard" config features rather than a compatibility change.

--Jacob




Re: PG 18 release notes draft committed

2025-05-02 Thread Bruce Momjian
On Fri, May  2, 2025 at 08:24:42AM -0700, Jacob Champion wrote:
> On Thu, May 1, 2025 at 7:44 PM Bruce Momjian  wrote:
> > I will probably add markup in 1-3 weeks.  Let the feedback begin.  ;-)
> 
> Thanks!
> 
> >
> > Version 18 contains a number of changes that may affect compatibility
> > with previous releases.  Observe the following incompatibilities:
> >
> >[...]
> > Rename server variable ssl_ecdh_curve to ssl_groups and allow multiple 
> > colon-separated ECDH curves to be specified (Erica Zhang, Daniel Gustafsson)
> 
> The previous setting name should continue to function correctly, since
> it's mapped as an alias, so this can probably be moved into the
> "standard" config features rather than a compatibility change.

Thanks, done.  The commit message didn't indicate the old name would
still work, and I didn't review the patch for that.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.




Re: First-draft back-branch release notes are up

2025-05-02 Thread Bruce Momjian
On Fri, May  2, 2025 at 12:39:15PM -0400, Tom Lane wrote:
> Much less exciting than the v18 release notes, but we
> still gotta do 'em.  See
> 
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=176877f461a8b55e921f597fb217f6ab89ee019f
> 
> As usual, please send corrections by Sunday.

They look good to me, as does my entry:


 
  Avoid incorrect optimizations based on IS [NOT]
  NULL tests that are applied to composite values
  (Bruce Momjian)
  §
 


My commit message erroneously said "domains" instead of "composite
values".

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.




Re: [PoC] Federated Authn/z with OAUTHBEARER

2025-05-02 Thread Nathan Bossart
On Fri, May 02, 2025 at 10:05:26AM -0700, Jacob Champion wrote:
> Nathan, if you get a chance, does the attached patch work for you?

Yup, thanks!

-- 
nathan




Re: [PoC] Federated Authn/z with OAUTHBEARER

2025-05-02 Thread Tom Lane
Jacob Champion  writes:
> Looks like it's just me. And using partial_dependency for the includes
> seems like overkill, so I've kept the full ssl dependency object, but
> moved it to the staticlib only, which is enough to solve the breakage
> on my machine.
> Nathan, if you get a chance, does the attached patch work for you?

FWIW, on my Mac a meson build from HEAD goes through fine, with or
without this patch.  I'm getting openssl and libcurl from MacPorts
not Homebrew, but I don't know why that would make any difference.

regards, tom lane




Re: Issue attaching a table to a partitioned table with an auto-referenced foreign key

2025-05-02 Thread Tender Wang
Alvaro Herrera  wrote on Thu, May 1, 2025 at 20:17:

> Hello,
>
> I've been looking at this bug once again and I think I finally
> understood what's going on and how to fix it.
>
> Ref 1: https://postgr.es/m/20230707175859.17c91538@karst
>Re: Issue attaching a table to a partitioned table with an
>auto-referenced foreign key
>(Guillaume Lelarge)
> Ref 2: https://postgr.es/m/18156-a44bc7096f068...@postgresql.org
>BUG #18156: Self-referential foreign key in partitioned table not
>enforced on deletes
>(Matthew Gabeler-Lee)
> Ref 3:
> https://postgr.es/m/myvsif-attja5dcwouwh21r12r-sfxecy2-3ynt8kao...@mail.gmail.com
>Self referential foreign keys in partitioned table not working as
>expected
>(Luca Vallisa)
>
> First of all -- apparently we broke this in commit 5914a22f6ea5 (which
> fixed the other problems that had been reported by G. Lelarge in Ref 1)
> even worse than how it was before, by having the new functions just skip
> processing the referenced side altogether.  Previously we were at least
> partially setting up the necessary triggers, at least some of the time.
> So what the report by Luca is saying is, plain and simple, that the
> referenced-side action triggers just do not exist, which is why no error
> is thrown even on the most trivial cases, on the releases that contain
> that commit (17.1, 16.5, 15.9).
>

Hmm.  I didn't reach the same conclusion.
Before commit 5914a22f6ea5, the issue reported by Luca could already happen.
Look at the test below on v17.0:
psql (17.0)
Type "help" for help.

postgres=# create table test (
id_1 int4 not null,
id_2 int4 not null,
parent_id_2 int4 null,
primary key (id_1, id_2),
foreign key (id_1, parent_id_2) references test (id_1, id_2)
) partition by list (id_1);
create table test_1 partition of test for values in (1);
insert into test values (1, 1, null), (1, 2, 1);
delete from test where (id_1, id_2) = (1, 1);
CREATE TABLE
CREATE TABLE
INSERT 0 2
DELETE 1

You can see from the above test that no error was reported.
But if I revert commit 614a406b4ff1, the above test reports an error on
v16devel:
psql (16devel)
Type "help" for help.

postgres=# create table test (
id_1 int4 not null,
id_2 int4 not null,
parent_id_2 int4 null,
primary key (id_1, id_2),
foreign key (id_1, parent_id_2) references test (id_1, id_2)
) partition by list (id_1);
create table test_1 partition of test for values in (1);
insert into test values (1, 1, null), (1, 2, 1);
delete from test where (id_1, id_2) = (1, 1);
CREATE TABLE
CREATE TABLE
INSERT 0 2
ERROR:  update or delete on table "test_1" violates foreign key constraint
"test_id_1_parent_id_2_fkey1" on table "test"
DETAIL:  Key (id_1, id_2)=(1, 1) is still referenced from table "test".


> Anyway, if people have a chance to give this a look, it would be
> helpful.
>

It's midnight in my time zone. I will look at this tomorrow.

-- 
Thanks,
Tender Wang


Re: POC: Parallel processing of indexes in autovacuum

2025-05-02 Thread Masahiko Sawada
On Fri, May 2, 2025 at 9:58 AM Sami Imseih  wrote:
>
> > Once we have parallel heap vacuum, as discussed in thread[1], it would
> > also likely be beneficial to incorporate it into autovacuum during
> > aggressive vacuum or failsafe mode.
>
> IIRC, index cleanup is disabled by failsafe.

Yes. My idea is to use parallel *heap* vacuum in autovacuum during
failsafe mode. I think it would make sense as users want to complete
freezing tables as soon as possible in this situation.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Make COPY format extendable: Extract COPY TO format implementations

2025-05-02 Thread Masahiko Sawada
On Fri, May 2, 2025 at 7:20 PM Sutou Kouhei  wrote:
>
> Hi,
>
> In 
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> on Thu, 1 May 2025 12:15:30 -0700,
>   Masahiko Sawada  wrote:
>
> > One of the primary considerations we need to address is the treatment
> > of the specified format name. The current patch set utilizes built-in
> > formats (namely 'csv', 'text', and 'binary') when the format name is
> > either unqualified or explicitly specified with 'pg_catalog' as the
> > schema. In all other cases, we search for custom format handler
> > functions based on the search_path. To be frank, I have reservations
> > about this interface design, as the dependence of the specified custom
> > format name on the search_path could potentially confuse users.
>
> How about requiring schema for all custom formats?
>
> Valid:
>
>   COPY ... TO ... (FORMAT 'text');
>   COPY ... TO ... (FORMAT 'my_schema.jsonlines');
>
> Invalid:
>
>   COPY ... TO ... (FORMAT 'jsonlines'); -- no schema
>   COPY ... TO ... (FORMAT 'pg_catalog.text'); -- needless schema
>
> If we require "schema" for all custom formats, we don't need
> to depend on search_path.

I'm concerned that users cannot use the same format name in the FORMAT
option depending on which schema the handler function is created in.

>
> > In light of these concerns, I've been contemplating alternative
> > interface designs. One promising approach would involve registering
> > custom copy formats via a C function during module loading
> > (specifically, in _PG_init()). This method would require extension
> > authors to invoke a registration function, say
> > RegisterCustomCopyFormat(), in _PG_init() as follows:
> >
> > JsonLinesFormatId = RegisterCustomCopyFormat("jsonlines",
> >  &JsonLinesCopyToRoutine,
> >  &JsonLinesCopyFromRoutine);
> >
> > The registration function would validate the format name and store it
> > in TopMemoryContext. It would then return a unique identifier that can
> > be used subsequently to reference the custom copy format extension.
>
> I don't object the suggested interface because I don't have
> a strong opinion how to implement this feature.
>
> Why do we need to assign a unique ID? For performance? For
> RegisterCustomCopyFormatOption()?

I think it's required for monitoring purposes for example. For
instance, we can set the format ID in the progress information and the
progress view can fetch the format name by the ID so that users can
see what format is being used in the COPY command.

>
> >  Extensions could register their own options within this
> > framework, for example:
> >
> > RegisterCustomCopyFormatOption(JsonLinesFormatId,
> > "custom_option",
> > custom_option_handler);
>
> Can we defer to discuss how to add support for custom
> options while we focus on the first implementation? Earlier
> patch sets with the current approach had custom options
> support but it's removed in the first implementation.

I think we can skip the custom option patch for the first
implementation but still need to discuss how we will be able to
implement it to understand the big picture of this feature. Otherwise
we could end up going the wrong direction.

>
> (BTW, I think that it's not a good API because we want COPY
> FROM only options and COPY TO only options something like
> "compression level".)

Why does this matter in terms of API? I think that even with this API
we can pass is_from to the option handler function so that it
validates the option based on it.

>
> > This approach offers several advantages: it would eliminate the
> > search_path issue, provide greater flexibility, and potentially
> > simplify the overall interface for users and developers alike.
>
> What contributes to the "flexibility"? Developers can call
> multiple Register* functions in _PG_init(), right?

I think that with a tablesample-like approach we need to do everything
based on one handler function and callbacks returned from it whereas
there is no such limitation with C API style.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Make COPY format extendable: Extract COPY TO format implementations

2025-05-02 Thread Sutou Kouhei
Hi,

In 
  "Re: Make COPY format extendable: Extract COPY TO format implementations" on 
Fri, 2 May 2025 21:38:32 -0700,
  Masahiko Sawada  wrote:

>> How about requiring schema for all custom formats?
>>
>> Valid:
>>
>>   COPY ... TO ... (FORMAT 'text');
>>   COPY ... TO ... (FORMAT 'my_schema.jsonlines');
>>
>> Invalid:
>>
>>   COPY ... TO ... (FORMAT 'jsonlines'); -- no schema
>>   COPY ... TO ... (FORMAT 'pg_catalog.text'); -- needless schema
>>
>> If we require "schema" for all custom formats, we don't need
>> to depend on search_path.
> 
> I'm concerned that users cannot use the same format name in the FORMAT
> option depending on which schema the handler function is created in.

I'm not sure whether it's a problem or not. If users want to
use the same format name, they can install the handler
function into the same schema.

>> Why do we need to assign a unique ID? For performance? For
>> RegisterCustomCopyFormatOption()?
> 
> I think it's required for monitoring purposes for example. For
> instance, we can set the format ID in the progress information and the
> progress view can fetch the format name by the ID so that users can
> see what format is being used in the COPY command.

How about setting the format name instead of the format ID
in the progress information?

> I think we can skip the custom option patch for the first
> implementation but still need to discuss how we will be able to
> implement it to understand the big picture of this feature. Otherwise
> we could end up going the wrong direction.

I think that we don't need to discuss it deeply because we
have many options with this approach. We can call C
functions in _PG_init(). I think that this feature will not
be a blocker for this approach.

>> (BTW, I think that it's not a good API because we want COPY
>> FROM only options and COPY TO only options something like
>> "compression level".)
> 
> Why does this matter in terms of API? I think that even with this API
> we can pass is_from to the option handler function so that it
> validates the option based on it.

If we choose that API, each custom format developer needs to
handle the case in their handler function. For example, if we
instead tell PostgreSQL whether an option is only for COPY TO,
ProcessCopyOptions() rather than the handler functions can
handle it.

Anyway, I think that we don't need to discuss this deeply
for now.

>> What contributes to the "flexibility"? Developers can call
>> multiple Register* functions in _PG_init(), right?
> 
> I think that with a tablesample-like approach we need to do everything
> based on one handler function and callbacks returned from it whereas
> there is no such limitation with C API style.

Thanks for clarifying it. It seems that my understanding is
correct.

I hope that this is flexibility we actually need, and that too
much flexibility doesn't introduce too much complexity.


Thanks,
-- 
kou




Re: Make COPY format extendable: Extract COPY TO format implementations

2025-05-02 Thread Sutou Kouhei
Hi,

In 
  "Re: Make COPY format extendable: Extract COPY TO format implementations" on 
Fri, 2 May 2025 23:02:25 -0700,
  Masahiko Sawada  wrote:

> The progress view can know only numbers. We need to extend the
> progress view infrastructure so that we can pass other data types.

Sorry. Could you tell me which APIs are referred to here?
The pgstat_progress_*() functions in
src/include/utils/backend_progress.h?


Thanks,
-- 
kou




Re: Make COPY format extendable: Extract COPY TO format implementations

2025-05-02 Thread Masahiko Sawada
On Fri, May 2, 2025 at 11:20 PM Sutou Kouhei  wrote:
>
> Hi,
>
> In 
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> on Fri, 2 May 2025 23:02:25 -0700,
>   Masahiko Sawada  wrote:
>
> > The progress view can know only numbers. We need to extend the
> > progress view infrastructure so that we can pass other data types.
>
> Sorry. Could you tell me which APIs are referred to here?
> The pgstat_progress_*() functions in
> src/include/utils/backend_progress.h?

The progress information is stored in PgBackendStatus defined in
backend_status.h:

/*
 * Command progress reporting.  Any command which wishes can advertise
 * that it is running by setting st_progress_command,
 * st_progress_command_target, and st_progress_param[].
 * st_progress_command_target should be the OID of the relation which the
 * command targets (we assume there's just one, as this is meant for
 * utility commands), but the meaning of each element in the
 * st_progress_param array is command-specific.
 */
ProgressCommandType st_progress_command;
Oid st_progress_command_target;
int64   st_progress_param[PGSTAT_NUM_PROGRESS_PARAM];

Then the progress view maps the numbers to the corresponding strings:

CREATE VIEW pg_stat_progress_copy AS
SELECT
S.pid AS pid, S.datid AS datid, D.datname AS datname,
S.relid AS relid,
CASE S.param5 WHEN 1 THEN 'COPY FROM'
  WHEN 2 THEN 'COPY TO'
  END AS command,
CASE S.param6 WHEN 1 THEN 'FILE'
  WHEN 2 THEN 'PROGRAM'
  WHEN 3 THEN 'PIPE'
  WHEN 4 THEN 'CALLBACK'
  END AS "type",
S.param1 AS bytes_processed,
S.param2 AS bytes_total,
S.param3 AS tuples_processed,
S.param4 AS tuples_excluded,
S.param7 AS tuples_skipped
FROM pg_stat_get_progress_info('COPY') AS S
LEFT JOIN pg_database D ON S.datid = D.oid;

So the idea is that the backend process sets the format ID somewhere
in st_progress_param, and then the progress view calls a SQL function,
say pg_stat_get_copy_format_name(), with the format ID, which returns
the corresponding format name.
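
A sketch of what that could look like (param8 as the format ID slot and
pg_stat_get_copy_format_name() are hypothetical; columns trimmed for brevity):

    CREATE VIEW pg_stat_progress_copy AS
        SELECT
            S.pid AS pid,
            S.relid AS relid,
            pg_stat_get_copy_format_name(S.param8) AS format,
            S.param1 AS bytes_processed,
            S.param3 AS tuples_processed
        FROM pg_stat_get_progress_info('COPY') AS S;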

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Make COPY format extendable: Extract COPY TO format implementations

2025-05-02 Thread David G. Johnston
On Friday, May 2, 2025, Masahiko Sawada  wrote:

>
> I'm concerned about allowing multiple 'text' format implementations
> with identical names within the database, as this could lead to
> considerable confusion. When users specify 'text', it would be more
> logical to guarantee that the built-in 'text' format is consistently
> used.


Do you want to only give text/csv/binary this special treatment, or also any
future format name we ever decide to implement in core?  If an extension
takes up “xml” and we try to do that in core do we fail an upgrade because
of the conflict, and make it impossible to actually use said extension?

> This principle aligns with other customizable components, such
> as custom resource managers, wait events, lightweight locks, and
> custom scans. These components maintain their built-in data/types and
> explicitly prevent the registration of duplicate names.
>

I am totally lost on how any of those resemble this feature.

I’m all for registration to enable additional options and features - but am
against moving away from turning format into a namespaced identifier.  This
is a query-facing feature where namespaces are common and fundamentally
required.  I have some sympathy for the fact that until now one could not
prefix text/binary/csv with pg_catalog to be fully safe, but in reality
DBAs/query authors either put pg_catalog first in their search_path or make
an informed decision when they deviate.  That is the established precedent
relevant to this feature.  The power, and responsibility for education,
lies with the user.

David J.