On 21/06/2023 16:45, Egor Chindyaskin wrote:
Hello! In continuation of the topic I would like to suggest solution.
This patch adds several checks to the vulnerable functions above.
I looked at this last patch. The depth checks are clearly better than
segfaulting, but I think we can also avoid the recursions and having to
error out. That seems nice especially for MemoryContextDelete(), which
is called at transaction cleanup.
1. CommitTransactionCommand
This is just tail recursion. The compiler will almost certainly optimize
it away unless you're using -O0. We can easily turn it into iteration
ourselves to avoid that hazard, per attached
0001-Turn-tail-recursion-into-iteration-in-CommitTransact.patch.
2. ShowTransactionStateRec
Since this is just a debugging aid, I think we can just stop recursing
if we're about to run out of stack space. Seems nicer than erroring out,
although it can still error if you run out of memory. See
0002-Avoid-stack-overflow-in-ShowTransactionStateRec.patch.
3. All the MemoryContext functions
I'm reluctant to add stack checks to these, because they are called in
places like cleaning up after transaction abort. MemoryContextDelete()
in particular. If you ereport an error, it's not clear that you can
recover cleanly; you'll leak memory if nothing else.
Fortunately MemoryContext contains pointers to parent and siblings, so
we can traverse a tree of MemoryContexts iteratively, without using stack.
MemoryContextStats() is a bit tricky, but we can put a limit on how much
it recurses, and just print a summary line if the limit is reached.
That's what we already do if a memory context has a lot of children.
(Actually, if we didn't try keep track of the # of children at each
level, to trigger the summarization, we could traverse the tree without
using stack. But a limit seems useful.)
What do you think?
--
Heikki Linnakangas
Neon (https://neon.tech)
From 543e6569a3f4730f3105379cc946cc6b54525179 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakan...@iki.fi>
Date: Fri, 24 Nov 2023 14:13:10 +0200
Subject: [PATCH 1/4] Turn tail recursion into iteration in
CommitTransactionCommand()
Usually the compiler will optimize away the tail recursion anyway, but
if it doesn't, you can drive the function into stack overflow. For
example:
(n=1000000; printf "BEGIN;"; for ((i=1;i<=$n;i++)); do printf "SAVEPOINT s$i;"; done; printf "ERROR; COMMIT;") | psql >/dev/null
Report by Egor Chindyaskin and Alexander Lakhin.
Discussion: https://www.postgresql.org/message-id/1672760457.940462079%40f306.i.mail.ru
---
src/backend/access/transam/xact.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 536edb3792..e16b73a9f1 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -3033,12 +3033,15 @@ RestoreTransactionCharacteristics(const SavedTransactionCharacteristics *s)
void
CommitTransactionCommand(void)
{
- TransactionState s = CurrentTransactionState;
+ TransactionState s;
SavedTransactionCharacteristics savetc;
+recurse:
+ s = CurrentTransactionState;
/* Must save in case we need to restore below */
SaveTransactionCharacteristics(&savetc);
+ s = CurrentTransactionState;
switch (s->blockState)
{
/*
@@ -3226,8 +3229,7 @@ CommitTransactionCommand(void)
*/
case TBLOCK_SUBABORT_END:
CleanupSubTransaction();
- CommitTransactionCommand();
- break;
+ goto recurse;
/*
* As above, but it's not dead yet, so abort first.
@@ -3235,8 +3237,7 @@ CommitTransactionCommand(void)
case TBLOCK_SUBABORT_PENDING:
AbortSubTransaction();
CleanupSubTransaction();
- CommitTransactionCommand();
- break;
+ goto recurse;
/*
* The current subtransaction is the target of a ROLLBACK TO
--
2.39.2
From 1da20313b6252346fbc3f46cdfde612532e98c84 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakan...@iki.fi>
Date: Fri, 24 Nov 2023 15:06:53 +0200
Subject: [PATCH 2/4] Avoid stack overflow in ShowTransactionStateRec()
The function recurses, but didn't perform stack-depth checks. It's
just a debugging aid, so instead of the usual check_stack_depth()
call, stop the printing if we'd risk stack overflow.
Here's an example of how to test this:
(n=1000000; printf "BEGIN;"; for ((i=1;i<=$n;i++)); do printf "SAVEPOINT s$i;"; done; printf "SET log_min_messages = 'DEBUG5'; SAVEPOINT sp;") | psql >/dev/null
In the passing, swap building the list of child XIDs and recursing to
parent. That saves memory while recursing, reducing the risk of out of
memory errors with lots of subtransactions. The saving is not very
significant in practice, but this order seems more logical anyway.
Report by Egor Chindyaskin and Alexander Lakhin.
Discussion: https://www.postgresql.org/message-id/1672760457.940462079%40f306.i.mail.ru
---
src/backend/access/transam/xact.c | 21 +++++++++++++++------
1 file changed, 15 insertions(+), 6 deletions(-)
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index e16b73a9f1..53584f7dc6 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -5507,8 +5507,22 @@ ShowTransactionStateRec(const char *str, TransactionState s)
{
StringInfoData buf;
- initStringInfo(&buf);
+ if (s->parent)
+ {
+ /*
+ * Since this function recurses, it could be driven to stack overflow.
+ * This is just a debugging aid, so we can leave out some details
+ * instead of erroring out with check_stack_depth().
+ */
+ if (stack_is_too_deep())
+ ereport(DEBUG5,
+ (errmsg_internal("%s(%d): parent omitted to avoid stack overflow",
+ str, s->nestingLevel)));
+ else
+ ShowTransactionStateRec(str, s->parent);
+ }
+ initStringInfo(&buf);
if (s->nChildXids > 0)
{
int i;
@@ -5517,10 +5531,6 @@ ShowTransactionStateRec(const char *str, TransactionState s)
for (i = 1; i < s->nChildXids; i++)
appendStringInfo(&buf, " %u", s->childXids[i]);
}
-
- if (s->parent)
- ShowTransactionStateRec(str, s->parent);
-
ereport(DEBUG5,
(errmsg_internal("%s(%d) name: %s; blockState: %s; state: %s, xid/subid/cid: %u/%u/%u%s%s",
str, s->nestingLevel,
@@ -5532,7 +5542,6 @@ ShowTransactionStateRec(const char *str, TransactionState s)
(unsigned int) currentCommandId,
currentCommandIdUsed ? " (used)" : "",
buf.data)));
-
pfree(buf.data);
}
--
2.39.2
From be7a1be658479de72ee694e41a8ce0f07b8b40ad Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakan...@iki.fi>
Date: Fri, 24 Nov 2023 17:03:28 +0200
Subject: [PATCH 3/4] Avoid recursion in MemoryContext functions.
You might run out of stack space with recursion, which is not nice in
functions that might be used e.g. at cleanup after transaction
abort. MemoryContext contains pointer to parent and siblings, so we
can traverse a tree of contexts iteratively, without using
stack. Refactor the functions to do that.
MemoryContextStats() still recurses, but it now has a limit to how
deep it recurses. Once the limit is reached, it prints just a summary
of the rest of the hierarchy, similar to how it summarizes contexts
with lots of children. That seems good anyway, because a context dump
with hundreds of nested contexts isn't very readable.
Report by Egor Chindyaskin and Alexander Lakhin.
Discussion: https://www.postgresql.org/message-id/1672760457.940462079%40f306.i.mail.ru
---
src/backend/utils/mmgr/mcxt.c | 244 +++++++++++++++++++++++-----------
src/include/utils/memutils.h | 3 +-
2 files changed, 167 insertions(+), 80 deletions(-)
diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index 9fc83f11f6..2e9563d8c1 100644
--- a/src/backend/utils/mmgr/mcxt.c
+++ b/src/backend/utils/mmgr/mcxt.c
@@ -149,9 +149,10 @@ MemoryContext CurTransactionContext = NULL;
/* This is a transient link to the active portal's memory context: */
MemoryContext PortalContext = NULL;
+static void MemoryContextDeleteOnly(MemoryContext context);
static void MemoryContextCallResetCallbacks(MemoryContext context);
static void MemoryContextStatsInternal(MemoryContext context, int level,
- bool print, int max_children,
+ int max_level, int max_children,
MemoryContextCounters *totals,
bool print_to_stderr);
static void MemoryContextStatsPrint(MemoryContext context, void *passthru,
@@ -260,6 +261,41 @@ BogusGetChunkSpace(void *pointer)
return 0; /* keep compiler quiet */
}
+/*
+ * MemoryContextTraverse
+ * Helper function to traverse all descendants of a memory context
+ * without recursion.
+ *
+ * Recursion could lead to out-of-stack errors with deep context hierarchies,
+ * which would be unpleasant in error cleanup code paths.
+ *
+ * To process 'context' and all its descendants, use a loop like this:
+ *
+ * <process 'context'>
+ * for (MemoryContext curr = context->firstchild;
+ * curr != NULL;
+ * curr = MemoryContextTraverse(curr, context))
+ * {
+ * <process 'curr'>
+ * }
+ *
+ * This visits all the contexts in preorder.
+ */
+static MemoryContext
+MemoryContextTraverse(MemoryContext curr, MemoryContext top)
+{
+ if (curr->firstchild != NULL)
+ return curr->firstchild;
+
+ while (curr->nextchild == NULL)
+ {
+ curr = curr->parent;
+ if (curr == top)
+ return NULL;
+ }
+ return curr->nextchild;
+}
+
/*****************************************************************************
* EXPORTED ROUTINES *
@@ -379,14 +415,13 @@ MemoryContextResetOnly(MemoryContext context)
void
MemoryContextResetChildren(MemoryContext context)
{
- MemoryContext child;
-
Assert(MemoryContextIsValid(context));
- for (child = context->firstchild; child != NULL; child = child->nextchild)
+ for (MemoryContext curr = context->firstchild;
+ curr != NULL;
+ curr = MemoryContextTraverse(curr, context))
{
- MemoryContextResetChildren(child);
- MemoryContextResetOnly(child);
+ MemoryContextResetOnly(curr);
}
}
@@ -401,16 +436,53 @@ MemoryContextResetChildren(MemoryContext context)
*/
void
MemoryContextDelete(MemoryContext context)
+{
+ MemoryContext curr;
+
+ /*
+ * Delete subcontexts from the bottom up.
+ *
+ * Note: Do not use recursion here. A "stack depth limit exceeded" error
+ * would be unpleasant if we're already in the process of cleaning up from
+ * transaction abort. We also cannot use MemoryContextTraverse() here
+ * because we modify the tree as we go.
+ */
+ curr = context;
+ for (;;)
+ {
+ MemoryContext parent;
+
+ /* Descend down until we find a leaf context with no children */
+ while (curr->firstchild != NULL)
+ curr = curr->firstchild;
+
+ /*
+ * We're now at a leaf with no children. Free it and continue from the
+ * parent. Or if this was the original node, we're all done.
+ */
+ parent = curr->parent;
+ MemoryContextDeleteOnly(curr);
+
+ if (curr == context)
+ break;
+ curr = parent;
+ }
+}
+
+/*
+ * Subroutine of MemoryContextDelete, to delete a context that has no
+ * children.
+ */
+static void
+MemoryContextDeleteOnly(MemoryContext context)
{
Assert(MemoryContextIsValid(context));
/* We had better not be deleting TopMemoryContext ... */
Assert(context != TopMemoryContext);
/* And not CurrentMemoryContext, either */
Assert(context != CurrentMemoryContext);
-
- /* save a function call in common case where there are no children */
- if (context->firstchild != NULL)
- MemoryContextDeleteChildren(context);
+ /* All the children should've been deleted already */
+ Assert(context->firstchild == NULL);
/*
* It's not entirely clear whether 'tis better to do this before or after
@@ -676,12 +748,12 @@ MemoryContextMemAllocated(MemoryContext context, bool recurse)
if (recurse)
{
- MemoryContext child;
-
- for (child = context->firstchild;
- child != NULL;
- child = child->nextchild)
- total += MemoryContextMemAllocated(child, true);
+ for (MemoryContext curr = context->firstchild;
+ curr != NULL;
+ curr = MemoryContextTraverse(curr, context))
+ {
+ total += curr->mem_allocated;
+ }
}
return total;
@@ -698,8 +770,8 @@ MemoryContextMemAllocated(MemoryContext context, bool recurse)
void
MemoryContextStats(MemoryContext context)
{
- /* A hard-wired limit on the number of children is usually good enough */
- MemoryContextStatsDetail(context, 100, true);
+ /* Hard-wired limits are usually good enough */
+ MemoryContextStatsDetail(context, 100, 100, true);
}
/*
@@ -711,14 +783,15 @@ MemoryContextStats(MemoryContext context)
* with fprintf(stderr), otherwise use ereport().
*/
void
-MemoryContextStatsDetail(MemoryContext context, int max_children,
+MemoryContextStatsDetail(MemoryContext context,
+ int max_level, int max_children,
bool print_to_stderr)
{
MemoryContextCounters grand_totals;
memset(&grand_totals, 0, sizeof(grand_totals));
- MemoryContextStatsInternal(context, 0, true, max_children, &grand_totals, print_to_stderr);
+ MemoryContextStatsInternal(context, 0, max_level, max_children, &grand_totals, print_to_stderr);
if (print_to_stderr)
fprintf(stderr,
@@ -756,78 +829,89 @@ MemoryContextStatsDetail(MemoryContext context, int max_children,
*/
static void
MemoryContextStatsInternal(MemoryContext context, int level,
- bool print, int max_children,
+ int max_level, int max_children,
MemoryContextCounters *totals,
bool print_to_stderr)
{
- MemoryContextCounters local_totals;
MemoryContext child;
int ichild;
Assert(MemoryContextIsValid(context));
/* Examine the context itself */
- context->methods->stats(context,
- print ? MemoryContextStatsPrint : NULL,
- (void *) &level,
- totals, print_to_stderr);
+ {
+ context->methods->stats(context,
+ MemoryContextStatsPrint,
+ (void *) &level,
+ totals,
+ print_to_stderr);
+ }
/*
- * Examine children. If there are more than max_children of them, we do
- * not print the rest explicitly, but just summarize them.
+ * Examine children.
+ *
+ * If we are past the recursion depth limit or already running low on
+ * stack, do not print them explicitly but just summarize them. Similarly,
+ * if there are more than max_children of them, we do not print the rest
+ * explicitly, but just summarize them.
*/
- memset(&local_totals, 0, sizeof(local_totals));
-
- for (child = context->firstchild, ichild = 0;
- child != NULL;
- child = child->nextchild, ichild++)
+ ichild = 0;
+ child = context->firstchild;
+ if (level < max_level && !stack_is_too_deep())
{
- if (ichild < max_children)
+ for (; child != NULL && ichild < max_children;
+ child = child->nextchild)
+ {
MemoryContextStatsInternal(child, level + 1,
- print, max_children,
+ max_level, max_children,
totals,
print_to_stderr);
- else
- MemoryContextStatsInternal(child, level + 1,
- false, max_children,
- &local_totals,
- print_to_stderr);
+ ichild++;
+ }
}
- /* Deal with excess children */
- if (ichild > max_children)
+ if (child != NULL)
{
- if (print)
+ /* Summarize the rest of the children, avoiding recursion. */
+ MemoryContextCounters local_totals;
+
+ memset(&local_totals, 0, sizeof(local_totals));
+
+ while (child != NULL)
{
- if (print_to_stderr)
- {
- int i;
-
- for (i = 0; i <= level; i++)
- fprintf(stderr, " ");
- fprintf(stderr,
- "%d more child contexts containing %zu total in %zu blocks; %zu free (%zu chunks); %zu used\n",
- ichild - max_children,
- local_totals.totalspace,
- local_totals.nblocks,
- local_totals.freespace,
- local_totals.freechunks,
- local_totals.totalspace - local_totals.freespace);
- }
- else
- ereport(LOG_SERVER_ONLY,
- (errhidestmt(true),
- errhidecontext(true),
- errmsg_internal("level: %d; %d more child contexts containing %zu total in %zu blocks; %zu free (%zu chunks); %zu used",
- level,
- ichild - max_children,
- local_totals.totalspace,
- local_totals.nblocks,
- local_totals.freespace,
- local_totals.freechunks,
- local_totals.totalspace - local_totals.freespace)));
+ child->methods->stats(child, NULL, NULL, &local_totals, print_to_stderr);
+ child = MemoryContextTraverse(child, context);
+ ichild++;
}
+ if (print_to_stderr)
+ {
+ int i;
+
+ for (i = 0; i <= level; i++)
+ fprintf(stderr, " ");
+ fprintf(stderr,
+ "%d more child contexts containing %zu total in %zu blocks; %zu free (%zu chunks); %zu used\n",
+ ichild - max_children,
+ local_totals.totalspace,
+ local_totals.nblocks,
+ local_totals.freespace,
+ local_totals.freechunks,
+ local_totals.totalspace - local_totals.freespace);
+ }
+ else
+ ereport(LOG_SERVER_ONLY,
+ (errhidestmt(true),
+ errhidecontext(true),
+ errmsg_internal("level: %d; %d more child contexts containing %zu total in %zu blocks; %zu free (%zu chunks); %zu used",
+ level,
+ ichild - max_children,
+ local_totals.totalspace,
+ local_totals.nblocks,
+ local_totals.freespace,
+ local_totals.freechunks,
+ local_totals.totalspace - local_totals.freespace)));
+
if (totals)
{
totals->nblocks += local_totals.nblocks;
@@ -847,8 +931,7 @@ MemoryContextStatsInternal(MemoryContext context, int level,
*/
static void
MemoryContextStatsPrint(MemoryContext context, void *passthru,
- const char *stats_string,
- bool print_to_stderr)
+ const char *stats_string, bool print_to_stderr)
{
int level = *(int *) passthru;
const char *name = context->name;
@@ -927,13 +1010,15 @@ MemoryContextStatsPrint(MemoryContext context, void *passthru,
void
MemoryContextCheck(MemoryContext context)
{
- MemoryContext child;
-
Assert(MemoryContextIsValid(context));
-
context->methods->check(context);
- for (child = context->firstchild; child != NULL; child = child->nextchild)
- MemoryContextCheck(child);
+
+ for (MemoryContext curr = context->firstchild;
+ curr != NULL;
+ curr = MemoryContextTraverse(curr, context))
+ {
+ curr->methods->check(curr);
+ }
}
#endif
@@ -1212,14 +1297,15 @@ ProcessLogMemoryContextInterrupt(void)
/*
* When a backend process is consuming huge memory, logging all its memory
* contexts might overrun available disk space. To prevent this, we limit
- * the number of child contexts to log per parent to 100.
+ * the depth of the hierarchy, as well as the number of child contexts to
+ * log per parent to 100.
*
* As with MemoryContextStats(), we suppose that practical cases where the
* dump gets long will typically be huge numbers of siblings under the
* same parent context; while the additional debugging value from seeing
* details about individual siblings beyond 100 will not be large.
*/
- MemoryContextStatsDetail(TopMemoryContext, 100, false);
+ MemoryContextStatsDetail(TopMemoryContext, 100, 100, false);
}
void *
diff --git a/src/include/utils/memutils.h b/src/include/utils/memutils.h
index d14e8546a6..c25f982dc6 100644
--- a/src/include/utils/memutils.h
+++ b/src/include/utils/memutils.h
@@ -85,7 +85,8 @@ extern MemoryContext MemoryContextGetParent(MemoryContext context);
extern bool MemoryContextIsEmpty(MemoryContext context);
extern Size MemoryContextMemAllocated(MemoryContext context, bool recurse);
extern void MemoryContextStats(MemoryContext context);
-extern void MemoryContextStatsDetail(MemoryContext context, int max_children,
+extern void MemoryContextStatsDetail(MemoryContext context,
+ int max_level, int max_children,
bool print_to_stderr);
extern void MemoryContextAllowInCriticalSection(MemoryContext context,
bool allow);
--
2.39.2