On Wed, 6 Nov 2019 14:34:38 +0100
Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:

> On 2019-11-05 17:29, Jehan-Guillaume de Rorthais wrote:
> > My best bet so far is that logicalrep_relmap_invalidate_cb is not called
> > after the DDL on the subscriber, so the relmap cache is not invalidated. So
> > we end up with slot->tts_tupleDescriptor->natts greater than
> > rel->remoterel->natts in slot_store_cstrings, leading to the overflow of
> > attrmap and the SIGSEGV.
> 
> It looks like something like that is happening.  But it shouldn't. 
> Different table schemas on publisher and subscriber are well supported, 
> so this must be an edge case of some kind.  Please continue investigating.

I've been able to find the origin of the crash, but it was a long journey.

<debugger hard life>

  I was unable to debug using gdb record because of this famous error:

    Process record does not support instruction 0xc5 at address 0x1482758a4b30.

    Program stopped.
    __memset_avx2_unaligned_erms ()
    at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:168
    168 ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such
      file or directory.

  Trying with rr, I constantly hit "stack depth limit exceeded", even with the
  limit set to unlimited. Would it be worth opening a discussion or a wiki page
  about these tools? Peter, it looks like you have some experience with rr, any
  feedback?

  Finally, Julien Rouhaud spent some time with me after work hours, answering
  my questions about some parts of the code and pointing me to the excellent
  backtrace_functions GUC added a few days ago. This finally did the trick to
  find out what was happening. Many thanks, Julien!

</debugger hard life>

Back to the bug itself. Consider a working logical replication setup with
constant update/insert activity, e.g. pgbench running against the provider.

Now, on the subscriber side, a session issues an "ALTER TABLE ADD
COLUMN" on a subscribed table, e.g. pgbench_branches. A cache invalidation
message is then pending for this table.

In the meantime, the logical replication worker receives an UPDATE to apply. It
opens the local relation using "logicalrep_rel_open". It finds that the related
entry in LogicalRepRelMap is valid, so it does not update its attrmap
and directly opens and locks the local relation:

  /* Need to update the local cache? */
  if (!OidIsValid(entry->localreloid))
  {
    [...updates attrmap here...]
  }
  else
    entry->localrel = table_open(entry->localreloid, lockmode);

However, when locking the table, the code in LockRelationOid() actually
processes any pending invalidation messages:

  LockRelationOid(Oid relid, LOCKMODE lockmode)
  {
    [...]
    /*
     * Now that we have the lock, check for invalidation messages, so that we
     * will update or flush any stale relcache entry before we try to use it.
     * RangeVarGetRelid() specifically relies on us for this.  We can skip
     * this in the not-uncommon case that we already had the same type of lock
     * being requested, since then no one else could have modified the
     * relcache entry in an undesirable way.  (In the case where our own xact
     * modifies the rel, the relcache update happens via
     * CommandCounterIncrement, not here.)
     *
     * However, in corner cases where code acts on tables (usually catalogs)
     * recursively, we might get here while still processing invalidation
     * messages in some outer execution of this function or a sibling.  The
     * "cleared" status of the lock tells us whether we really are done
     * absorbing relevant inval messages.
     */
    if (res != LOCKACQUIRE_ALREADY_CLEAR)
    {
      AcceptInvalidationMessages();
      MarkLockClear(locallock);
    }
  }
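
AcceptInvalidationMessages() ends up firing the relcache callback the relmap
code registered at startup, logicalrep_relmap_invalidate_cb(). Simplified from
src/backend/replication/logical/relation.c (the real function also guards
against a NULL map and handles the reloid == InvalidOid "invalidate all" case),
it marks the map entry stale by resetting localreloid, which is exactly the
flag logicalrep_rel_open() tested *before* taking the lock:

  /* Simplified sketch of the relmap invalidation callback */
  static void
  logicalrep_relmap_invalidate_cb(Datum arg, Oid reloid)
  {
    LogicalRepRelMapEntry *entry;
    HASH_SEQ_STATUS status;

    hash_seq_init(&status, LogicalRepRelMap);
    while ((entry = hash_seq_search(&status)) != NULL)
    {
      if (entry->localreloid == reloid)
      {
        /* stale, but logicalrep_rel_open() already chose its branch */
        entry->localreloid = InvalidOid;
        hash_seq_term(&status);
        break;
      }
    }
  }

So the entry is correctly marked invalid, but only once logicalrep_rel_open()
has already decided the cache was fresh for the current apply.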

We end up with attrmap referencing N columns while the relcache entry now
references N+1 columns. Later, in apply_handle_update(), we build a
TupleTableSlot based on the relcache representation and we crash by overflowing
attrmap while trying to fill this larger slot in slot_store_cstrings().
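
To make the overflow concrete, here is the relevant part of
slot_store_cstrings(), abridged from src/backend/replication/logical/worker.c.
The loop bound comes from the tuple descriptor freshly rebuilt from the
relcache, i.e. N+1 attributes, while rel->attrmap was palloc'd for the
previous N:

  /* Abridged from slot_store_cstrings() */
  natts = slot->tts_tupleDescriptor->natts;  /* N+1 after the DDL */

  for (i = 0; i < natts; i++)
  {
    /* attrmap only holds N entries, so reading attrmap[N] on the last
     * iteration is the out-of-bounds access behind the SIGSEGV */
    int remoteattnum = rel->attrmap[i];
    [...]
  }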

Please find attached a bugfix proposal. It simply moves the attrmap update
after the table_open() call.

Lastly, I was wondering whether entry->attrmap should be pfree'd before being
palloc'd again during the rebuild, to avoid a memory leak?
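
Something like the following in the rebuild path, reusing the names from
logicalrep_rel_open() (just a sketch to illustrate the question;
entry->attrmap lives in the long-lived LogicalRepRelMapContext, so my
understanding is that the old array is otherwise leaked on every rebuild):

  /* free the map built for the previous version of the relation, if any */
  if (entry->attrmap != NULL)
    pfree(entry->attrmap);

  oldctx = MemoryContextSwitchTo(LogicalRepRelMapContext);
  entry->attrmap = palloc(desc->natts * sizeof(*entry->attrmap));
  MemoryContextSwitchTo(oldctx);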

Regards,
From 4295b277952a46378f01211bd37075f20223aadc Mon Sep 17 00:00:00 2001
From: Jehan-Guillaume de Rorthais <j...@dalibo.com>
Date: Mon, 25 Nov 2019 15:21:38 +0100
Subject: [PATCH] Fix subscriber invalid memory access on DDL

Before this patch, the attrmap structure mapping the local fields
to remote ones for a subscribed relation was rebuilt before handling
pending invalidation messages and potential relcache updates. This
led to an invalid memory access by overflowing the attrmap when
filling the related tuple slot with data received from the provider.
---
 src/backend/replication/logical/relation.c | 58 +++++++++++++++++------
 1 file changed, 43 insertions(+), 15 deletions(-)

diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index b386f8460d..cfc34d49e0 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -220,6 +220,8 @@ logicalrep_rel_open(LogicalRepRelId remoteid, LOCKMODE lockmode)
 {
 	LogicalRepRelMapEntry *entry;
 	bool		found;
+	Oid		localreloid = InvalidOid;
+	LogicalRepRelation	*remoterel;
 
 	if (LogicalRepRelMap == NULL)
 		logicalrep_relmap_init();
@@ -232,29 +234,57 @@ logicalrep_rel_open(LogicalRepRelId remoteid, LOCKMODE lockmode)
 		elog(ERROR, "no relation map entry for remote relation ID %u",
 			 remoteid);
 
-	/* Need to update the local cache? */
+	remoterel = &entry->remoterel;
+
+	/*
+	 * entry->localreloid is set to InvalidOid when the local relation is
+	 * either unknown or invalidated.
+	 * Moreover, when opening and locking a relation, pending invalidation
+	 * messages are processed.
+	 * Because of that, we first need to open the local relation, then rebuild
+	 * the mapping of local attributes if the relation is invalidated.
+	 */
+
+	/*
+	 * Unknown or invalidated local relation. First look for its OID and open
+	 * it. We will build the mapping of local attributes afterwards.
+	 */
 	if (!OidIsValid(entry->localreloid))
 	{
 		Oid			relid;
-		int			i;
-		int			found;
-		Bitmapset  *idkey;
-		TupleDesc	desc;
-		LogicalRepRelation *remoterel;
-		MemoryContext oldctx;
-
-		remoterel = &entry->remoterel;
 
-		/* Try to find and lock the relation by name. */
+		/* Try to find the relation by name */
 		relid = RangeVarGetRelid(makeRangeVar(remoterel->nspname,
 											  remoterel->relname, -1),
-								 lockmode, true);
+								 NoLock, true);
 		if (!OidIsValid(relid))
 			ereport(ERROR,
 					(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 					 errmsg("logical replication target relation \"%s.%s\" does not exist",
 							remoterel->nspname, remoterel->relname)));
-		entry->localrel = table_open(relid, NoLock);
+
+		localreloid = relid;
+	}
+	else
+		localreloid = entry->localreloid;
+
+	/* Actually open and lock the related local relation */
+	entry->localrel = table_open(localreloid, lockmode);
+
+	/*
+	 * Pending invalidation messages might have been processed while grabbing
+	 * the lock during table_open, applying some DDL operations that are then
+	 * reflected in the relcache.
+	 * Because of that, we need to rebuild the mapping of local attributes after
+	 * the relation is opened.
+	 */
+	if (!OidIsValid(entry->localreloid))
+	{
+		int			found;
+		Bitmapset  *idkey;
+		TupleDesc	desc;
+		MemoryContext oldctx;
+		int			i;
 
 		/* Check for supported relkind. */
 		CheckSubscriptionRelkind(entry->localrel->rd_rel->relkind,
@@ -348,10 +377,8 @@ logicalrep_rel_open(LogicalRepRelId remoteid, LOCKMODE lockmode)
 			}
 		}
 
-		entry->localreloid = relid;
+		entry->localreloid = localreloid;
 	}
-	else
-		entry->localrel = table_open(entry->localreloid, lockmode);
 
 	if (entry->state != SUBREL_STATE_READY)
 		entry->state = GetSubscriptionRelState(MySubscription->oid,
-- 
2.20.1
