Mutaamba (cc'd) and I reviewed this patch together. To summarize the patch and thread so far: The patch adds a new function, CheckSlotIsInSingleUserMode. If true then we raise an error. Otherwise we would trip an assert in ReplicationSlotRelease requiring the slot to have an active_pid, which is never set in single-user mode. We do want to support pg_drop_replication_slot in single-user mode, but not other replication slot functions. In particular, advancing the slot may call non-core code, which seems risky in single-user mode.
The patch does not apply. The attached v5 fixes it with a small update to the context for the slot.h hunk. The commit message needs some explanation. Why are we doing this? What does the patch do? What alternatives did we consider? The current error message seems reasonable. The patch has no docs. If we plan to forbid some functions in single-user mode, we should document which ones (e.g. in func.sgml). There are no tests. Do we have any tests at all for single-user mode? The only one I see is in recovery/t/017_shm.pl. What if we had tests that ran the regress suite in single-user mode? Basically instead of psql we run postgres --single? Do we expect a lot of it would fail? Is it small enough the test could maintain a diff that expects those failures? I'm not saying we should do that as part of this patch, but is there any interest in that? Since single-user mode is most often used when things are broken, it's a harsh place to hit a bug. The patch lacks source comments. Something on CheckSlotIsInSingleUserMode explaining why would be good. In ReplicationSlotRelease, why only assert that `slot->active_pid != 0`? Why not assert that it's MyProcPid (even outside single-user mode)? Can one backend really release a slot while another is using it? Can you drop it? Maybe you can copy it? Are we calling CheckSlotIsInSingleUserMode from everywhere that needs it? We tried to check all the functions that call ReplicationSlotCreate, ReplicationSlotRelease, and update_synced_slots_inactive_since (since they all have these asserts). To call out a few: The pg_replication_origin_* functions don't call the Assert-ing functions. binary_upgrade_logical_slot_has_caught_up - Not available from the command line. WalSndErrorCleanup - Probably not used in single-user mode? We also see that PostgresMain calls ReplicationSlotRelease from its error handler. Presumably single-user mode would be executing PostgresSingleUserMain instead, but still perhaps we should call CheckSlotIsInSingleUserMode here just as a sanity-check? On Thu, Feb 27, 2025 at 9:42 PM Hayato Kuroda (Fujitsu) <kuroda.hay...@fujitsu.com> wrote: > > I understand that we may not have a clear use case for this to work in > > single-user mode. But how will we define the boundary in similar > > cases? I mean, we should have some rule for such exposed functions, > > and it should be followed uniformly. Now, if one needs a bigger or > > complex change to make the function work in single-user mode, then it > > is easy to take an exception from what is currently being followed in > > the code. However, if the change is as trivial as you proposed in the > > first email, why not go with that and make this function work in > > single-user mode? > > Hmm, the opinion about this is completely opposite with other reviewers. I > want > to ask them again how you feel. I also added Tom who pointed out in the > initial > thread. > > Question: how you feel to restrict SQL functions for slot during the > single-user > mode? Nobody has considered use cases for it; we do not have concrete theory > and > similar cases. And needed band-aid to work seems very small. No one has replied yet, but I vote for forbidding these functions. I can't articulate a full theory for which functions we restrict in single-user mode, and I think we should permit as much as possible. But any theory would weigh usefulness against risk. Here no one has found a use-case for several years, and executing user-supplied code in an unusual, higher-risk scenario seems like a good reason to be cautious. Also without tests for single-user mode, I'm extra wary. If single-user support is desired by someone, they could submit a patch. Yours, -- Paul ~{:-) p...@illuminatedcomputing.com
From b19815119909d0529d6c52889ab0959ec44e7e7f Mon Sep 17 00:00:00 2001 From: Hayato Kuroda <kuroda.hay...@fujitsu.com> Date: Wed, 19 Feb 2025 11:37:26 +0900 Subject: [PATCH v4] Prohibit slot operations while in the single user mode --- src/backend/replication/logical/logicalfuncs.c | 2 ++ src/backend/replication/slot.c | 12 ++++++++++++ src/backend/replication/slotfuncs.c | 12 ++++++++++++ src/include/replication/slot.h | 1 + 4 files changed, 27 insertions(+) diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c index ca53caac2f..178a07af7d 100644 --- a/src/backend/replication/logical/logicalfuncs.c +++ b/src/backend/replication/logical/logicalfuncs.c @@ -113,6 +113,8 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin List *options = NIL; DecodingOutputState *p; + CheckSlotIsInSingleUserMode(); + CheckSlotPermissions(); CheckLogicalDecodingRequirements(); diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index 719e531eb9..d041adfe48 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -1442,6 +1442,18 @@ CheckSlotPermissions(void) "REPLICATION"))); } +/* + * Check whether the instance is in single-user mode. + */ +void +CheckSlotIsInSingleUserMode(void) +{ + if (!IsUnderPostmaster) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("replication slots cannot be used in single-user mode"))); +} + /* * Reserve WAL for the currently active slot. * diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c index 146eef5871..b669774c87 100644 --- a/src/backend/replication/slotfuncs.c +++ b/src/backend/replication/slotfuncs.c @@ -17,6 +17,7 @@ #include "access/xlogrecovery.h" #include "access/xlogutils.h" #include "funcapi.h" +#include "miscadmin.h" #include "replication/logical.h" #include "replication/slot.h" #include "replication/slotsync.h" @@ -76,6 +77,8 @@ pg_create_physical_replication_slot(PG_FUNCTION_ARGS) if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE) elog(ERROR, "return type must be a row type"); + CheckSlotIsInSingleUserMode(); + CheckSlotPermissions(); CheckSlotRequirements(); @@ -182,6 +185,8 @@ pg_create_logical_replication_slot(PG_FUNCTION_ARGS) if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE) elog(ERROR, "return type must be a row type"); + CheckSlotIsInSingleUserMode(); + CheckSlotPermissions(); CheckLogicalDecodingRequirements(); @@ -515,6 +520,8 @@ pg_replication_slot_advance(PG_FUNCTION_ARGS) Assert(!MyReplicationSlot); + CheckSlotIsInSingleUserMode(); + CheckSlotPermissions(); if (XLogRecPtrIsInvalid(moveto)) @@ -612,9 +619,12 @@ copy_replication_slot(FunctionCallInfo fcinfo, bool logical_slot) TupleDesc tupdesc; HeapTuple tuple; + if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE) elog(ERROR, "return type must be a row type"); + CheckSlotIsInSingleUserMode(); + CheckSlotPermissions(); if (logical_slot) @@ -871,6 +881,8 @@ pg_sync_replication_slots(PG_FUNCTION_ARGS) char *err; StringInfoData app_name; + CheckSlotIsInSingleUserMode(); + CheckSlotPermissions(); if (!RecoveryInProgress()) diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h index f5a24ccfbf..24523b1001 100644 --- a/src/include/replication/slot.h +++ b/src/include/replication/slot.h @@ -306,6 +306,7 @@ extern void CheckPointReplicationSlots(bool is_shutdown); extern void CheckSlotRequirements(void); extern void CheckSlotPermissions(void); +extern void CheckSlotIsInSingleUserMode(void); extern ReplicationSlotInvalidationCause GetSlotInvalidationCause(const char *cause_name); extern const char *GetSlotInvalidationCauseName(ReplicationSlotInvalidationCause cause); -- 2.43.5