[HACKERS] Make relation_openrv atomic wrt DDL

Noah Misch Sun, 12 Jun 2011 12:20:12 -0700

On Thu, Jan 20, 2011 at 09:48:12PM -0500, Robert Haas wrote:
> On Thu, Jan 20, 2011 at 6:19 PM, Noah Misch <n...@leadboat.com> wrote:
> > On Thu, Jan 20, 2011 at 09:36:11PM +0000, Simon Riggs wrote:
> >> I agree that the DDL behaviour is wrong and should be fixed. Thank you
> >> for championing that alternative view.
> >>
> >> Swapping based upon names only works and is very flexible, much more so
> >> than EXCHANGE could be.
> >>
> >> A separate utility might be worth it, but the feature set of that should
> >> be defined in terms of correctly-working DDL behaviour. It's possible
> >> that no further requirement exists. I remove my own patch from
> >> consideration for this release.
> >>
> >> I'll review your patch and commit it, problems or objections excepted. I
> >> haven't looked at it in any detail.
> >
> > Thanks. ?I wouldn't be very surprised if that patch is even the wrong way to
> > achieve these semantics, but it's great that we're on the same page as to 
> > which
> > semantics they are.
> 
> I think Noah's patch is a not a good idea, because it will result in
> calling RangeVarGetRelid twice even in the overwhelmingly common case
> where no relevant invalidation occurs.  That'll add several syscache
> lookups per table to very common operations.


Valid concern.

[Refresher: this was a patch to improve behavior for this test case:

        psql -c "CREATE TABLE t AS VALUES ('old'); CREATE TABLE new_t AS VALUES 
('new')"
        psql -c 'SELECT pg_sleep(2) FROM t' & # block the ALTER or DROP briefly
        sleep 1                                                           # 
give prev time to take AccessShareLock

        # Do it this way, and the next SELECT gets data from the old table.
        psql -c 'ALTER TABLE t RENAME TO old_t; ALTER TABLE new_t RENAME TO t' &
        # Do it this way, and get: ERROR:  could not open relation with OID 
41380
        #psql -c 'DROP TABLE t; ALTER TABLE new_t RENAME TO t' &

        psql -c 'SELECT * FROM t'               # I get 'old' or an error, 
never 'new'.
        psql -c 'DROP TABLE IF EXISTS t, old_t, new_t'

It did so by rechecking the RangeVar->oid resolution after locking the found
relation, by which time concurrent DDL could no longer be interfering.]

I benchmarked the original patch with this function:

        Datum
        nmtest(PG_FUNCTION_ARGS)
        {
                int32           n = PG_GETARG_INT32(0);
                int                     i;
                RangeVar   *rv = makeRangeVar(NULL, "pg_am", 0);
        
                for (i = 0; i < n; ++i)
                {
                        Relation        r = relation_openrv(rv, 
AccessShareLock);
                        relation_close(r, AccessShareLock);
                }
                PG_RETURN_INT32(4);
        }

(Releasing the lock before transaction end makes for an unrealistic benchmark,
but so would opening the same relation millions of times in a single
transaction.  I'm trying to isolate the cost that would be spread over
millions of transactions opening relations a handful of times.  See attached
shar archive for a complete extension wrapping that test function.)

Indeed, the original patch slowed it by about 50%.  I improved the patch,
adding a global SharedInvalidMessageCounter to increment as we process
messages.  If this counter does not change between the RangeVarGetRelid() call
and the post-lock AcceptInvalidationMessages() call, we can skip the second
RangeVarGetRelid() call.  With the updated patch, I get these timings (in ms)
for runs of "SELECT nmtest(10000000)":

master: 19697.642, 20087.477, 19748.995
patched: 19723.716, 19879.538, 20257.671

In other words, no significant difference.  Since the patch removes the
no-longer-needed pre-lock call to AcceptInvalidationMessages(), changing to
"relation_close(r, NoLock)" in the test case actually reveals a 15%
performance improvement.  Granted, nothing to get excited about in light of
the artificiality.

This semantic improvement would be hard to test with the current pg_regress
suite, so I do not include any test case addition in the patch.  If
sufficiently important, it could be done with isolationtester.

Thanks,
nm

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 01a492e..63537fd 100644
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
***************
*** 979,1004 **** relation_openrv(const RangeVar *relation, LOCKMODE lockmode)
  {
        Oid                     relOid;
  
!       /*
!        * Check for shared-cache-inval messages before trying to open the
!        * relation.  This is needed to cover the case where the name 
identifies a
!        * rel that has been dropped and recreated since the start of our
!        * transaction: if we don't flush the old syscache entry then we'll 
latch
!        * onto that entry and suffer an error when we do RelationIdGetRelation.
!        * Note that relation_open does not need to do this, since a relation's
!        * OID never changes.
!        *
!        * We skip this if asked for NoLock, on the assumption that the caller 
has
!        * already ensured some appropriate lock is held.
!        */
!       if (lockmode != NoLock)
!               AcceptInvalidationMessages();
! 
!       /* Look up the appropriate relation using namespace search */
!       relOid = RangeVarGetRelid(relation, false);
  
        /* Let relation_open do the rest */
!       return relation_open(relOid, lockmode);
  }
  
  /* ----------------
--- 979,989 ----
  {
        Oid                     relOid;
  
!       /* Look up and lock the appropriate relation using namespace search */
!       relOid = RangeVarLockRelid(relation, lockmode, false);
  
        /* Let relation_open do the rest */
!       return relation_open(relOid, NoLock);
  }
  
  /* ----------------
***************
*** 1014,1043 **** try_relation_openrv(const RangeVar *relation, LOCKMODE 
lockmode)
  {
        Oid                     relOid;
  
!       /*
!        * Check for shared-cache-inval messages before trying to open the
!        * relation.  This is needed to cover the case where the name 
identifies a
!        * rel that has been dropped and recreated since the start of our
!        * transaction: if we don't flush the old syscache entry then we'll 
latch
!        * onto that entry and suffer an error when we do RelationIdGetRelation.
!        * Note that relation_open does not need to do this, since a relation's
!        * OID never changes.
!        *
!        * We skip this if asked for NoLock, on the assumption that the caller 
has
!        * already ensured some appropriate lock is held.
!        */
!       if (lockmode != NoLock)
!               AcceptInvalidationMessages();
! 
!       /* Look up the appropriate relation using namespace search */
!       relOid = RangeVarGetRelid(relation, true);
  
        /* Return NULL on not-found */
        if (!OidIsValid(relOid))
                return NULL;
  
        /* Let relation_open do the rest */
!       return relation_open(relOid, lockmode);
  }
  
  /* ----------------
--- 999,1013 ----
  {
        Oid                     relOid;
  
!       /* Look up and lock the appropriate relation using namespace search */
!       relOid = RangeVarLockRelid(relation, lockmode, true);
  
        /* Return NULL on not-found */
        if (!OidIsValid(relOid))
                return NULL;
  
        /* Let relation_open do the rest */
!       return relation_open(relOid, NoLock);
  }
  
  /* ----------------
diff --git a/src/backend/catalog/namespindex 41e9299..771976b 100644
*** a/src/backend/catalog/namespace.c
--- b/src/backend/catalog/namespace.c
***************
*** 44,49 ****
--- 44,51 ----
  #include "parser/parse_func.h"
  #include "storage/backendid.h"
  #include "storage/ipc.h"
+ #include "storage/lmgr.h"
+ #include "storage/sinval.h"
  #include "utils/acl.h"
  #include "utils/builtins.h"
  #include "utils/guc.h"
***************
*** 285,290 **** RangeVarGetRelid(const RangeVar *relation, bool failOK)
--- 287,358 ----
  }
  
  /*
+  * RangeVarLockRelid
+  *            Like RangeVarGetRelid, but simulatenously acquire the specified 
lock on
+  *            the relation.  This works properly in the face of concurrent 
DDL that
+  *            may drop, create or rename relations.
+  *
+  * If the relation is not found and failOK = true, take no lock and return
+  * InvalidOid.  Otherwise, raise an error.
+  */
+ Oid
+ RangeVarLockRelid(const RangeVar *relation, LOCKMODE lockmode,
+                                 bool failOK)
+ {
+       int                     lastCounter;
+       Oid                     relOid1,
+                               relOid2;
+ 
+       /*
+        * First attempt.  If the caller requested NoLock, it already acquired 
an
+        * appropriate lock and has called AcceptInvalidationMessages() since
+        * doing so.  In this case, our first search is always correct, and we
+        * degenerate to behave exactly like RangeVarGetRelid().
+        */
+       lastCounter = SharedInvalidMessageCounter;
+       relOid1 = RangeVarGetRelid(relation, failOK);
+       if (lockmode == NoLock)
+               return relOid1;
+ 
+       /*
+        * By the time we acquire the lock, our RangeVar may denote a different
+        * relation or no relation at all.  In particular, this can happen when
+        * the lock acquisition blocks on a transaction performing DROP or ALTER
+        * TABLE RENAME.  However, once and while we do hold a lock of any 
level,
+        * we can count on the name of the found relation remaining stable.
+        *
+        * Even so, DDL activity could cause an object in a schema earlier in 
the
+        * search path to mask our original selection undetected.  No current 
lock
+        * would prevent that.  We let the user worry about it, such as by 
taking
+        * additional explicit locks.
+        */
+       do
+       {
+               /* Not-found is always final. */
+               if (!OidIsValid(relOid1))
+                       return relOid1;
+ 
+               /*
+                * LockRelationOid also calls AcceptInvalidationMessages() to 
make
+                * recent DDL effects visible, if needed.  Finding none, we're 
done.
+                */
+               LockRelationOid(relOid1, lockmode);
+               if (lastCounter == SharedInvalidMessageCounter)
+                       break;
+               else
+                       lastCounter = SharedInvalidMessageCounter;
+ 
+               /* Some invalidation messages arrived; search again. */
+               relOid2 = relOid1;
+               relOid1 = RangeVarGetRelid(relation, failOK);
+ 
+               /* Done when our RangeVar denotes the same relation we locked. 
*/
+       } while (relOid1 != relOid2);
+ 
+       return relOid1;
+ }
+ 
+ /*
   * RangeVarGetCreationNamespace
   *            Given a RangeVar describing a to-be-created relation,
   *            choose which namespace to create it in.
diff --git a/src/backend/storage/ipc/sindex 9ab16b1..9b1ec82 100644
*** a/src/backend/storage/ipc/sinval.c
--- b/src/backend/storage/ipc/sinval.c
***************
*** 22,27 ****
--- 22,30 ----
  #include "utils/inval.h"
  
  
+ unsigned SharedInvalidMessageCounter;
+ 
+ 
  /*
   * Because backends sitting idle will not be reading sinval events, we
   * need a way to give an idle backend a swift kick in the rear and make
***************
*** 90,95 **** ReceiveSharedInvalidMessages(
--- 93,99 ----
        {
                SharedInvalidationMessage *msg = &messages[nextmsg++];
  
+               SharedInvalidMessageCounter++;
                invalFunction(msg);
        }
  
***************
*** 106,111 **** ReceiveSharedInvalidMessages(
--- 110,116 ----
                {
                        /* got a reset message */
                        elog(DEBUG4, "cache state reset");
+                       SharedInvalidMessageCounter++;
                        resetFunction();
                        break;                          /* nothing more to do */
                }
***************
*** 118,123 **** ReceiveSharedInvalidMessages(
--- 123,129 ----
                {
                        SharedInvalidationMessage *msg = &messages[nextmsg++];
  
+                       SharedInvalidMessageCounter++;
                        invalFunction(msg);
                }
  
diff --git a/src/backend/storage/lmgr/lindex 859b385..90b2ecc 100644
*** a/src/backend/storage/lmgr/lmgr.c
--- b/src/backend/storage/lmgr/lmgr.c
***************
*** 81,90 **** LockRelationOid(Oid relid, LOCKMODE lockmode)
        /*
         * Now that we have the lock, check for invalidation messages, so that 
we
         * will update or flush any stale relcache entry before we try to use 
it.
!        * We can skip this in the not-uncommon case that we already had the 
same
!        * type of lock being requested, since then no one else could have
!        * modified the relcache entry in an undesirable way.  (In the case 
where
!        * our own xact modifies the rel, the relcache update happens via
         * CommandCounterIncrement, not here.)
         */
        if (res != LOCKACQUIRE_ALREADY_HELD)
--- 81,91 ----
        /*
         * Now that we have the lock, check for invalidation messages, so that 
we
         * will update or flush any stale relcache entry before we try to use 
it.
!        * RangeVarLockRelid() specifically relies on us for this.  We can skip
!        * this in the not-uncommon case that we already had the same type of 
lock
!        * being requested, since then no one else could have modified the
!        * relcache entry in an undesirable way.  (In the case where our own 
xact
!        * modifies the rel, the relcache update happens via
         * CommandCounterIncrement, not here.)
         */
        if (res != LOCKACQUIRE_ALREADY_HELD)
diff --git a/src/include/catalog/namesindex 5360096..7e64952 100644
*** a/src/include/catalog/namespace.h
--- b/src/include/catalog/namespace.h
***************
*** 15,20 ****
--- 15,21 ----
  #define NAMESPACE_H
  
  #include "nodes/primnodes.h"
+ #include "storage/lock.h"
  
  
  /*
***************
*** 48,53 **** typedef struct OverrideSearchPath
--- 49,56 ----
  
  
  extern Oid    RangeVarGetRelid(const RangeVar *relation, bool failOK);
+ extern Oid    RangeVarLockRelid(const RangeVar *relation, LOCKMODE lockmode,
+                                 bool failOK);
  extern Oid    RangeVarGetCreationNamespace(const RangeVar *newRelation);
  extern Oid    RangeVarGetAndCheckCreationNamespace(const RangeVar 
*newRelation);
  extern Oid    RelnameGetRelid(const char *relname);
diff --git a/src/include/storage/sinvaindex e9ce025..a90aa2d 100644
*** a/src/include/storage/sinval.h
--- b/src/include/storage/sinval.h
***************
*** 116,121 **** typedef union
--- 116,125 ----
  } SharedInvalidationMessage;
  
  
+ /* Counter of messages processed; may overflow. */
+ extern unsigned SharedInvalidMessageCounter;
+ 
+ 
  extern void SendSharedInvalidMessages(const SharedInvalidationMessage *msgs,
                                                  int n);
  extern void ReceiveSharedInvalidMessages(

nmtest-relation_openrv.shar
Description: Unix shell archive

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Make relation_openrv atomic wrt DDL

Reply via email to