On Mon, Feb 12, 2018 at 6:15 PM, Peter Geoghegan <p...@bowt.ie> wrote:
> * For 9.5 and 9.6, the approach taken in bugfix commit d8589946d
> should be taken even further -- we should always copy. Moreover, we
> should always copy within tuplesort_getdatum(), for the same reasons.
>
> * For 9.5, 9.6, 10, and master, we should make sure that
> tuplesort_getdatum() uses the caller's memory context. The fact that
> it doesn't already do so seems like a simple oversight. We should do
> this to be consistent with tuplesort_gettupleslot(). (This isn't
> critical, but seems like a good idea.)
>
> * For 9.5, 9.6, 10, and master, we should adjust some comments from
> tuplesort_getdatum() callers, so that they no longer say that
> tuplesort datum tuple memory lives in tuplesort context. That won't be
> true anymore.

Attached patches do it that way. I'm happy with what I came up with,
which is a lot simpler than my first approach. The extra copying seems
likely to be well worth it, since it is fairly isolated in practice,
especially on 9.6. There is no extra copying from v10+, since they
don't need the first fix at all.

-- 
Peter Geoghegan
From fcbdb2e4fb76a75a0c56ca258f99df888b174dc3 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <p...@bowt.ie>
Date: Thu, 14 Dec 2017 21:22:41 -0800
Subject: [PATCH] Fix double free of tuple with grouping sets/tuplesort.c.

Have tuplesort_gettupleslot() copy tuple contents for slot into caller's
memory context in all cases.  This necessitates an extra copy on 9.5 and
9.6, though only for cases where an explicit copy was not already
required (mostly just TSS_SORTEDONTAPE fetches, though all
TSS_FINALMERGE fetches on 9.5).  Do the same for tuplesort_getdatum().

This fixes a crash that can occur on at least 9.5 and 9.6, where some
grouping sets queries may call tuplesort_end() at node shutdown time,
and free tuple memory that the executor/a tuple slot imagines it is
responsible for.

In the case of all other fetch routines, continue to allocate the memory
in the tuplesort-owned context (only change comments).  This is a little
inconsistent with what we now do for tuplesort_gettupleslot() and
tuplesort_getdatum(), but that's preferable to adding new copy overhead
in the backbranches where it's clearly unnecessary.  These other fetch
routines provide the weakest possible guarantees about tuple memory
lifespan from v10 on, anyway, so this actually seems more consistent
overall.

In passing, change the memory context switch point within
tuplesort_getdatum() to matchtuplesort_gettupleslot(), so that we
actually allocate memory in the right context there (caller's context),
which seems like an independent bug.  It's not clear that this can cause
a crash, but it's still wrong.

Only the latter, less important fix is required on v10 and master,
because in v10 there is already no question of not having to make a copy
in caller's context anyway (see commits e94568ec and 3856cf96).

Author: Peter Geoghegan
Reported-By: Bernd Helmle
Analyzed-By: Peter Geoghegan, Andreas Seltenreich, Bernd Helmle
Discussion: https://www.postgresql.org/message-id/1512661638.9720.34.ca...@oopsware.de
Backpatch: 9.5-, where the issue is known to cause a crash.
---
 src/backend/utils/adt/orderedsetaggs.c | 21 +++++-----
 src/backend/utils/sort/tuplesort.c     | 74 +++++++++++++++++++++++++---------
 2 files changed, 63 insertions(+), 32 deletions(-)

diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index fe44d56..4d1e201 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -457,10 +457,9 @@ percentile_disc_final(PG_FUNCTION_ARGS)
 		elog(ERROR, "missing row in percentile_disc");
 
 	/*
-	 * Note: we *cannot* clean up the tuplesort object here, because the value
-	 * to be returned is allocated inside its sortcontext.  We could use
-	 * datumCopy to copy it out of there, but it doesn't seem worth the
-	 * trouble, since the cleanup callback will clear the tuplesort later.
+	 * Note: we could clean up the tuplesort object here, but it doesn't seem
+	 * worth the trouble, since the cleanup callback will clear the tuplesort
+	 * later.
 	 */
 
 	/* We shouldn't have stored any nulls, but do the right thing anyway */
@@ -575,10 +574,9 @@ percentile_cont_final_common(FunctionCallInfo fcinfo,
 	}
 
 	/*
-	 * Note: we *cannot* clean up the tuplesort object here, because the value
-	 * to be returned may be allocated inside its sortcontext.  We could use
-	 * datumCopy to copy it out of there, but it doesn't seem worth the
-	 * trouble, since the cleanup callback will clear the tuplesort later.
+	 * Note: we could clean up the tuplesort object here, but it doesn't seem
+	 * worth the trouble, since the cleanup callback will clear the tuplesort
+	 * later.
 	 */
 
 	PG_RETURN_DATUM(val);
@@ -1097,10 +1095,9 @@ mode_final(PG_FUNCTION_ARGS)
 		pfree(DatumGetPointer(last_val));
 
 	/*
-	 * Note: we *cannot* clean up the tuplesort object here, because the value
-	 * to be returned is allocated inside its sortcontext.  We could use
-	 * datumCopy to copy it out of there, but it doesn't seem worth the
-	 * trouble, since the cleanup callback will clear the tuplesort later.
+	 * Note: we could clean up the tuplesort object here, but it doesn't seem
+	 * worth the trouble, since the cleanup callback will clear the tuplesort
+	 * later.
 	 */
 
 	if (mode_freq)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 4dd5407..544f06d 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -1824,6 +1824,11 @@ tuplesort_performsort(Tuplesortstate *state)
  * direction into *stup.  Returns FALSE if no more tuples.
  * If *should_free is set, the caller must pfree stup.tuple when done with it.
  * Otherwise, caller should not use tuple following next call here.
+ *
+ * Note:  Public tuplesort fetch routine callers cannot rely on tuple being
+ * allocated in their own memory context when should_free is TRUE.  It may be
+ * necessary to create a new copy of the tuple to meet the requirements of
+ * public fetch routine callers.
  */
 static bool
 tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
@@ -2046,9 +2051,11 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
  * NULL value in leading attribute will set abbreviated value to zeroed
  * representation, which caller may rely on in abbreviated inequality check.
  *
- * The slot receives a copied tuple (sometimes allocated in caller memory
- * context) that will stay valid regardless of future manipulations of the
- * tuplesort's state.
+ * The slot receives a copied tuple that will stay valid regardless of future
+ * manipulations of the tuplesort's state.  This includes calls to
+ * tuplesort_end() -- tuple will be allocated in caller's own context to make
+ * this work (this differs from similar routines for other types of
+ * tuplesorts).
  */
 bool
 tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
@@ -2065,16 +2072,30 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
 
 	if (stup.tuple)
 	{
+		void   *tuple = stup.tuple;
+
 		/* Record abbreviated key for caller */
 		if (state->sortKeys->abbrev_converter && abbrev)
 			*abbrev = stup.datum1;
 
-		if (!should_free)
-		{
-			stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
-			should_free = true;
-		}
-		ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, should_free);
+		/*
+		 * Callers rely on memory being in their own memory context, which is
+		 * not guaranteed by tuplesort_gettuple_common(), even when should_free
+		 * is set to TRUE.  We must always copy here, since our interface does
+		 * not allow callers to opt into arrangement where tuple memory can go
+		 * away on the next call here, or after tuplesort_end() is called.
+		 */
+		stup.tuple = heap_copy_minimal_tuple((MinimalTuple) tuple);
+
+		/*
+		 * Free local copy eagerly.  It would be very invasive to get
+		 * tuplesort_gettuple_common() to allocate tuple in caller's context
+		 * for us, so we just do this instead.
+		 */
+		if (should_free)
+			pfree(tuple);
+
+		ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, true);
 		return true;
 	}
 	else
@@ -2089,7 +2110,7 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
  * Returns NULL if no more tuples.  If *should_free is set, the
  * caller must pfree the returned tuple when done with it.
  * If it is not set, caller should not use tuple following next
- * call here.
+ * call here.  It's never okay to use it after tuplesort_end().
  */
 HeapTuple
 tuplesort_getheaptuple(Tuplesortstate *state, bool forward, bool *should_free)
@@ -2110,7 +2131,7 @@ tuplesort_getheaptuple(Tuplesortstate *state, bool forward, bool *should_free)
  * Returns NULL if no more tuples.  If *should_free is set, the
  * caller must pfree the returned tuple when done with it.
  * If it is not set, caller should not use tuple following next
- * call here.
+ * call here.  It's never okay to use it after tuplesort_end().
  */
 IndexTuple
 tuplesort_getindextuple(Tuplesortstate *state, bool forward,
@@ -2132,7 +2153,8 @@ tuplesort_getindextuple(Tuplesortstate *state, bool forward,
  * Returns FALSE if no more datums.
  *
  * If the Datum is pass-by-ref type, the returned value is freshly palloc'd
- * and is now owned by the caller.
+ * in caller's context, and is now owned by the caller (this differs from
+ * similar routines for other types of tuplesorts).
  *
  * Caller may optionally be passed back abbreviated value (on TRUE return
  * value) when abbreviation was used, which can be used to cheaply avoid
@@ -2154,6 +2176,8 @@ tuplesort_getdatum(Tuplesortstate *state, bool forward,
 		MemoryContextSwitchTo(oldcontext);
 		return false;
 	}
+	/* Copy into caller's memory context */
+	MemoryContextSwitchTo(oldcontext);
 
 	/* Record abbreviated key for caller */
 	if (state->sortKeys->abbrev_converter && abbrev)
@@ -2166,16 +2190,26 @@ tuplesort_getdatum(Tuplesortstate *state, bool forward,
 	}
 	else
 	{
-		/* use stup.tuple because stup.datum1 may be an abbreviation */
-
-		if (should_free)
-			*val = PointerGetDatum(stup.tuple);
-		else
-			*val = datumCopy(PointerGetDatum(stup.tuple), false, state->datumTypeLen);
+		/*
+		 * Callers rely on memory being in their own memory context, which is
+		 * not guaranteed by tuplesort_gettuple_common(), even when should_free
+		 * is set to TRUE.  We must always copy here, since our interface does
+		 * not allow callers to opt into arrangement where tuple memory can go
+		 * away on the next call here, or after tuplesort_end() is called.
+		 *
+		 * Use stup.tuple because stup.datum1 may be an abbreviation.
+		 */
+		*val = datumCopy(PointerGetDatum(stup.tuple), false, state->datumTypeLen);
 		*isNull = false;
-	}
 
-	MemoryContextSwitchTo(oldcontext);
+		/*
+		 * Free local copy eagerly.  It would be very invasive to get
+		 * tuplesort_gettuple_common() to allocate tuple in caller's context
+		 * for us, so we just do this instead.
+		 */
+		if (should_free)
+			pfree(stup.tuple);
+	}
 
 	return true;
 }
-- 
2.7.4

From f7a480807127e7b2fa5ee356981bbfc23ae0cd60 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <p...@bowt.ie>
Date: Fri, 9 Feb 2018 14:58:08 -0800
Subject: [PATCH] Fix double free of tuple with grouping sets/tuplesort.c.

Have tuplesort_gettupleslot() copy tuple contents for slot into caller's
memory context in all cases.  This necessitates an extra copy on 9.5 and
9.6, though only for cases where an explicit copy was not already
required (mostly just TSS_SORTEDONTAPE fetches, though all
TSS_FINALMERGE fetches on 9.5).  Do the same for tuplesort_getdatum().

This fixes a crash that can occur on at least 9.5 and 9.6, where some
grouping sets queries may call tuplesort_end() at node shutdown time,
and free tuple memory that the executor/a tuple slot imagines it is
responsible for.

In the case of all other fetch routines, continue to allocate the memory
in the tuplesort-owned context (only change comments).  This is a little
inconsistent with what we now do for tuplesort_gettupleslot() and
tuplesort_getdatum(), but that's preferable to adding new copy overhead
in the backbranches where it's clearly unnecessary.  These other fetch
routines provide the weakest possible guarantees about tuple memory
lifespan from v10 on, anyway, so this actually seems more consistent
overall.

In passing, change the memory context switch point within
tuplesort_getdatum() to matchtuplesort_gettupleslot(), so that we
actually allocate memory in the right context there (caller's context),
which seems like an independent bug.  It's not clear that this can cause
a crash, but it's still wrong.

Only the latter, less important fix is required on v10 and master,
because in v10 there is already no question of not having to make a copy
in caller's context anyway (see commits e94568ec and 3856cf96).

Author: Peter Geoghegan
Reported-By: Bernd Helmle
Analyzed-By: Peter Geoghegan, Andreas Seltenreich, Bernd Helmle
Discussion: https://www.postgresql.org/message-id/1512661638.9720.34.ca...@oopsware.de
Backpatch: 9.5-, where the issue is known to cause a crash.
---
 src/backend/utils/adt/orderedsetaggs.c | 21 +++++++++------------
 src/backend/utils/sort/tuplesort.c     | 19 ++++++++++---------
 2 files changed, 19 insertions(+), 21 deletions(-)

diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index 8502fcf..c1d50eb 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -458,10 +458,9 @@ percentile_disc_final(PG_FUNCTION_ARGS)
 		elog(ERROR, "missing row in percentile_disc");
 
 	/*
-	 * Note: we *cannot* clean up the tuplesort object here, because the value
-	 * to be returned is allocated inside its sortcontext.  We could use
-	 * datumCopy to copy it out of there, but it doesn't seem worth the
-	 * trouble, since the cleanup callback will clear the tuplesort later.
+	 * Note: we could clean up the tuplesort object here, but it doesn't seem
+	 * worth the trouble, since the cleanup callback will clear the tuplesort
+	 * later.
 	 */
 
 	/* We shouldn't have stored any nulls, but do the right thing anyway */
@@ -576,10 +575,9 @@ percentile_cont_final_common(FunctionCallInfo fcinfo,
 	}
 
 	/*
-	 * Note: we *cannot* clean up the tuplesort object here, because the value
-	 * to be returned may be allocated inside its sortcontext.  We could use
-	 * datumCopy to copy it out of there, but it doesn't seem worth the
-	 * trouble, since the cleanup callback will clear the tuplesort later.
+	 * Note: we could clean up the tuplesort object here, but it doesn't seem
+	 * worth the trouble, since the cleanup callback will clear the tuplesort
+	 * later.
 	 */
 
 	PG_RETURN_DATUM(val);
@@ -1098,10 +1096,9 @@ mode_final(PG_FUNCTION_ARGS)
 		pfree(DatumGetPointer(last_val));
 
 	/*
-	 * Note: we *cannot* clean up the tuplesort object here, because the value
-	 * to be returned is allocated inside its sortcontext.  We could use
-	 * datumCopy to copy it out of there, but it doesn't seem worth the
-	 * trouble, since the cleanup callback will clear the tuplesort later.
+	 * Note: we could clean up the tuplesort object here, but it doesn't seem
+	 * worth the trouble, since the cleanup callback will clear the tuplesort
+	 * later.
 	 */
 
 	if (mode_freq)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 59cd28e..1dd568c 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -2103,11 +2103,12 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
  * representation, which caller may rely on in abbreviated inequality check.
  *
  * If copy is true, the slot receives a copied tuple that will stay valid
- * regardless of future manipulations of the tuplesort's state.  Memory is
- * owned by the caller.  If copy is false, the slot will just receive a
- * pointer to a tuple held within the tuplesort, which is more efficient, but
- * only safe for callers that are prepared to have any subsequent manipulation
- * of the tuplesort's state invalidate slot contents.
+ * regardless of future manipulations of the tuplesort's state.  This includes
+ * calls to tuplesort_end() -- tuple will be allocated in caller's own context
+ * to make this work.  If copy is false, the slot will just receive a pointer
+ * to a tuple held within the tuplesort, which is more efficient, but only safe
+ * for callers that are prepared to have any subsequent manipulation of the
+ * tuplesort's state invalidate slot contents.
  */
 bool
 tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
@@ -2185,8 +2186,8 @@ tuplesort_getindextuple(Tuplesortstate *state, bool forward)
  * Returns FALSE if no more datums.
  *
  * If the Datum is pass-by-ref type, the returned value is freshly palloc'd
- * and is now owned by the caller (this differs from similar routines for
- * other types of tuplesorts).
+ * in caller's context, and is now owned by the caller (this differs from
+ * similar routines for other types of tuplesorts).
  *
  * Caller may optionally be passed back abbreviated value (on TRUE return
  * value) when abbreviation was used, which can be used to cheaply avoid
@@ -2207,6 +2208,8 @@ tuplesort_getdatum(Tuplesortstate *state, bool forward,
 		MemoryContextSwitchTo(oldcontext);
 		return false;
 	}
+	/* Copy into caller's memory context */
+	MemoryContextSwitchTo(oldcontext);
 
 	/* Record abbreviated key for caller */
 	if (state->sortKeys->abbrev_converter && abbrev)
@@ -2224,8 +2227,6 @@ tuplesort_getdatum(Tuplesortstate *state, bool forward,
 		*isNull = false;
 	}
 
-	MemoryContextSwitchTo(oldcontext);
-
 	return true;
 }
 
-- 
2.7.4

From 083b4f26446b51b4a3d839ca44c58144aecadc34 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <p...@bowt.ie>
Date: Fri, 9 Feb 2018 15:02:08 -0800
Subject: [PATCH] Fix double free of tuple with grouping sets/tuplesort.c.

Have tuplesort_gettupleslot() copy tuple contents for slot into caller's
memory context in all cases.  This necessitates an extra copy on 9.5 and
9.6, though only for cases where an explicit copy was not already
required (mostly just TSS_SORTEDONTAPE fetches, though all
TSS_FINALMERGE fetches on 9.5).  Do the same for tuplesort_getdatum().

This fixes a crash that can occur on at least 9.5 and 9.6, where some
grouping sets queries may call tuplesort_end() at node shutdown time,
and free tuple memory that the executor/a tuple slot imagines it is
responsible for.

In the case of all other fetch routines, continue to allocate the memory
in the tuplesort-owned context (only change comments).  This is a little
inconsistent with what we now do for tuplesort_gettupleslot() and
tuplesort_getdatum(), but that's preferable to adding new copy overhead
in the backbranches where it's clearly unnecessary.  These other fetch
routines provide the weakest possible guarantees about tuple memory
lifespan from v10 on, anyway, so this actually seems more consistent
overall.

In passing, change the memory context switch point within
tuplesort_getdatum() to matchtuplesort_gettupleslot(), so that we
actually allocate memory in the right context there (caller's context),
which seems like an independent bug.  It's not clear that this can cause
a crash, but it's still wrong.

Only the latter, less important fix is required on v10 and master,
because in v10 there is already no question of not having to make a copy
in caller's context anyway (see commits e94568ec and 3856cf96).

Author: Peter Geoghegan
Reported-By: Bernd Helmle
Analyzed-By: Peter Geoghegan, Andreas Seltenreich, Bernd Helmle
Discussion: https://www.postgresql.org/message-id/1512661638.9720.34.ca...@oopsware.de
Backpatch: 9.5-, where the issue is known to cause a crash.
---
 src/backend/utils/adt/orderedsetaggs.c | 21 +++++------
 src/backend/utils/sort/tuplesort.c     | 68 +++++++++++++++++++++++++++++-----
 2 files changed, 67 insertions(+), 22 deletions(-)

diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index 39ed85b..1a10202 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -457,10 +457,9 @@ percentile_disc_final(PG_FUNCTION_ARGS)
 		elog(ERROR, "missing row in percentile_disc");
 
 	/*
-	 * Note: we *cannot* clean up the tuplesort object here, because the value
-	 * to be returned is allocated inside its sortcontext.  We could use
-	 * datumCopy to copy it out of there, but it doesn't seem worth the
-	 * trouble, since the cleanup callback will clear the tuplesort later.
+	 * Note: we could clean up the tuplesort object here, but it doesn't seem
+	 * worth the trouble, since the cleanup callback will clear the tuplesort
+	 * later.
 	 */
 
 	/* We shouldn't have stored any nulls, but do the right thing anyway */
@@ -575,10 +574,9 @@ percentile_cont_final_common(FunctionCallInfo fcinfo,
 	}
 
 	/*
-	 * Note: we *cannot* clean up the tuplesort object here, because the value
-	 * to be returned may be allocated inside its sortcontext.  We could use
-	 * datumCopy to copy it out of there, but it doesn't seem worth the
-	 * trouble, since the cleanup callback will clear the tuplesort later.
+	 * Note: we could clean up the tuplesort object here, but it doesn't seem
+	 * worth the trouble, since the cleanup callback will clear the tuplesort
+	 * later.
 	 */
 
 	PG_RETURN_DATUM(val);
@@ -1089,10 +1087,9 @@ mode_final(PG_FUNCTION_ARGS)
 		pfree(DatumGetPointer(last_val));
 
 	/*
-	 * Note: we *cannot* clean up the tuplesort object here, because the value
-	 * to be returned is allocated inside its sortcontext.  We could use
-	 * datumCopy to copy it out of there, but it doesn't seem worth the
-	 * trouble, since the cleanup callback will clear the tuplesort later.
+	 * Note: we could clean up the tuplesort object here, but it doesn't seem
+	 * worth the trouble, since the cleanup callback will clear the tuplesort
+	 * later.
 	 */
 
 	if (mode_freq)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 5676c07..294f2bd 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -1664,6 +1664,12 @@ tuplesort_performsort(Tuplesortstate *state)
  * Internal routine to fetch the next tuple in either forward or back
  * direction into *stup.  Returns FALSE if no more tuples.
  * If *should_free is set, the caller must pfree stup.tuple when done with it.
+ * Otherwise, caller should not use tuple following next call here.
+ *
+ * Note:  Public tuplesort fetch routine callers cannot rely on tuple being
+ * allocated in their own memory context when should_free is TRUE.  It may be
+ * necessary to create a new copy of the tuple to meet the requirements of
+ * public fetch routine callers.
  */
 static bool
 tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
@@ -1865,6 +1871,12 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
  * Fetch the next tuple in either forward or back direction.
  * If successful, put tuple in slot and return TRUE; else, clear the slot
  * and return FALSE.
+ *
+ * The slot receives a copied tuple that will stay valid regardless of future
+ * manipulations of the tuplesort's state.  This includes calls to
+ * tuplesort_end() -- tuple will be allocated in caller's own context to make
+ * this work (this differs from similar routines for other types of
+ * tuplesorts).
  */
 bool
 tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
@@ -1881,7 +1893,26 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
 
 	if (stup.tuple)
 	{
-		ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, should_free);
+		void   *tuple = stup.tuple;
+
+		/*
+		 * Callers rely on memory being in their own memory context, which is
+		 * not guaranteed by tuplesort_gettuple_common(), even when should_free
+		 * is set to TRUE.  We must always copy here, since our interface does
+		 * not allow callers to opt into arrangement where tuple memory can go
+		 * away on the next call here, or after tuplesort_end() is called.
+		 */
+		stup.tuple = heap_copy_minimal_tuple((MinimalTuple) tuple);
+
+		/*
+		 * Free local copy eagerly.  It would be very invasive to get
+		 * tuplesort_gettuple_common() to allocate tuple in caller's context
+		 * for us, so we just do this instead.
+		 */
+		if (should_free)
+			pfree(tuple);
+
+		ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, true);
 		return true;
 	}
 	else
@@ -1895,6 +1926,8 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
  * Fetch the next tuple in either forward or back direction.
  * Returns NULL if no more tuples.  If *should_free is set, the
  * caller must pfree the returned tuple when done with it.
+ * If it is not set, caller should not use tuple following next
+ * call here.  It's never okay to use it after tuplesort_end().
  */
 HeapTuple
 tuplesort_getheaptuple(Tuplesortstate *state, bool forward, bool *should_free)
@@ -1914,6 +1947,8 @@ tuplesort_getheaptuple(Tuplesortstate *state, bool forward, bool *should_free)
  * Fetch the next index tuple in either forward or back direction.
  * Returns NULL if no more tuples.  If *should_free is set, the
  * caller must pfree the returned tuple when done with it.
+ * If it is not set, caller should not use tuple following next
+ * call here.  It's never okay to use it after tuplesort_end().
  */
 IndexTuple
 tuplesort_getindextuple(Tuplesortstate *state, bool forward,
@@ -1935,7 +1970,8 @@ tuplesort_getindextuple(Tuplesortstate *state, bool forward,
  * Returns FALSE if no more datums.
  *
  * If the Datum is pass-by-ref type, the returned value is freshly palloc'd
- * and is now owned by the caller.
+ * in caller's context, and is now owned by the caller (this differs from
+ * similar routines for other types of tuplesorts).
  */
 bool
 tuplesort_getdatum(Tuplesortstate *state, bool forward,
@@ -1950,6 +1986,8 @@ tuplesort_getdatum(Tuplesortstate *state, bool forward,
 		MemoryContextSwitchTo(oldcontext);
 		return false;
 	}
+	/* Copy into caller's memory context */
+	MemoryContextSwitchTo(oldcontext);
 
 	if (stup.isnull1 || state->datumTypeByVal)
 	{
@@ -1958,16 +1996,26 @@ tuplesort_getdatum(Tuplesortstate *state, bool forward,
 	}
 	else
 	{
-		/* use stup.tuple because stup.datum1 may be an abbreviation */
-
-		if (should_free)
-			*val = PointerGetDatum(stup.tuple);
-		else
-			*val = datumCopy(PointerGetDatum(stup.tuple), false, state->datumTypeLen);
+		/*
+		 * Callers rely on memory being in their own memory context, which is
+		 * not guaranteed by tuplesort_gettuple_common(), even when should_free
+		 * is set to TRUE.  We must always copy here, since our interface does
+		 * not allow callers to opt into arrangement where tuple memory can go
+		 * away on the next call here, or after tuplesort_end() is called.
+		 *
+		 * Use stup.tuple because stup.datum1 may be an abbreviation.
+		 */
+		*val = datumCopy(PointerGetDatum(stup.tuple), false, state->datumTypeLen);
 		*isNull = false;
-	}
 
-	MemoryContextSwitchTo(oldcontext);
+		/*
+		 * Free local copy eagerly.  It would be very invasive to get
+		 * tuplesort_gettuple_common() to allocate tuple in caller's context
+		 * for us, so we just do this instead.
+		 */
+		if (should_free)
+			pfree(stup.tuple);
+	}
 
 	return true;
 }
-- 
2.7.4

From 61ee080f3763e5a4112e9507f81d60fb8e889e99 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <p...@bowt.ie>
Date: Fri, 9 Feb 2018 15:40:44 -0800
Subject: [PATCH] Fix double free of tuple with grouping sets/tuplesort.c.

Have tuplesort_gettupleslot() copy tuple contents for slot into caller's
memory context in all cases.  This necessitates an extra copy on 9.5 and
9.6, though only for cases where an explicit copy was not already
required (mostly just TSS_SORTEDONTAPE fetches, though all
TSS_FINALMERGE fetches on 9.5).  Do the same for tuplesort_getdatum().

This fixes a crash that can occur on at least 9.5 and 9.6, where some
grouping sets queries may call tuplesort_end() at node shutdown time,
and free tuple memory that the executor/a tuple slot imagines it is
responsible for.

In the case of all other fetch routines, continue to allocate the memory
in the tuplesort-owned context (only change comments).  This is a little
inconsistent with what we now do for tuplesort_gettupleslot() and
tuplesort_getdatum(), but that's preferable to adding new copy overhead
in the backbranches where it's clearly unnecessary.  These other fetch
routines provide the weakest possible guarantees about tuple memory
lifespan from v10 on, anyway, so this actually seems more consistent
overall.

In passing, change the memory context switch point within
tuplesort_getdatum() to matchtuplesort_gettupleslot(), so that we
actually allocate memory in the right context there (caller's context),
which seems like an independent bug.  It's not clear that this can cause
a crash, but it's still wrong.

Only the latter, less important fix is required on v10 and master,
because in v10 there is already no question of not having to make a copy
in caller's context anyway (see commits e94568ec and 3856cf96).

Author: Peter Geoghegan
Reported-By: Bernd Helmle
Analyzed-By: Peter Geoghegan, Andreas Seltenreich, Bernd Helmle
Discussion: https://www.postgresql.org/message-id/1512661638.9720.34.ca...@oopsware.de
Backpatch: 9.5-, where the issue is known to cause a crash.
---
 src/backend/utils/adt/orderedsetaggs.c |  5 +----
 src/backend/utils/sort/tuplesort.c     | 19 ++++++++++---------
 2 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index 50b34fc..ed36851 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -329,10 +329,7 @@ ordered_set_startup(FunctionCallInfo fcinfo, bool use_tuples)
  *
  * In the case where we're not expecting multiple finalfn calls, we could
  * arguably rely on the finalfn to clean up; but it's easier and more testable
- * if we just do it the same way in either case.  Note that many of the
- * finalfns could *not* free the tuplesort object, at least not without extra
- * data copying, because what they return is a pointer to a datum inside the
- * tuplesort object.
+ * if we just do it the same way in either case.
  */
 static void
 ordered_set_shutdown(Datum arg)
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 041bdc2..7fec18e 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -2148,11 +2148,12 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
  * representation, which caller may rely on in abbreviated inequality check.
  *
  * If copy is true, the slot receives a copied tuple that will stay valid
- * regardless of future manipulations of the tuplesort's state.  Memory is
- * owned by the caller.  If copy is false, the slot will just receive a
- * pointer to a tuple held within the tuplesort, which is more efficient, but
- * only safe for callers that are prepared to have any subsequent manipulation
- * of the tuplesort's state invalidate slot contents.
+ * regardless of future manipulations of the tuplesort's state.  This includes
+ * calls to tuplesort_end() -- tuple will be allocated in caller's own context
+ * to make this work.  If copy is false, the slot will just receive a pointer
+ * to a tuple held within the tuplesort, which is more efficient, but only safe
+ * for callers that are prepared to have any subsequent manipulation of the
+ * tuplesort's state invalidate slot contents.
  */
 bool
 tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
@@ -2230,8 +2231,8 @@ tuplesort_getindextuple(Tuplesortstate *state, bool forward)
  * Returns false if no more datums.
  *
  * If the Datum is pass-by-ref type, the returned value is freshly palloc'd
- * and is now owned by the caller (this differs from similar routines for
- * other types of tuplesorts).
+ * in caller's context, and is now owned by the caller (this differs from
+ * similar routines for other types of tuplesorts).
  *
  * Caller may optionally be passed back abbreviated value (on true return
  * value) when abbreviation was used, which can be used to cheaply avoid
@@ -2252,6 +2253,8 @@ tuplesort_getdatum(Tuplesortstate *state, bool forward,
 		MemoryContextSwitchTo(oldcontext);
 		return false;
 	}
+	/* Copy into caller's memory context */
+	MemoryContextSwitchTo(oldcontext);
 
 	/* Record abbreviated key for caller */
 	if (state->sortKeys->abbrev_converter && abbrev)
@@ -2269,8 +2272,6 @@ tuplesort_getdatum(Tuplesortstate *state, bool forward,
 		*isNull = false;
 	}
 
-	MemoryContextSwitchTo(oldcontext);
-
 	return true;
 }
 
-- 
2.7.4

Reply via email to