Dear Tom,

Thanks for this, and sorry for not replying earlier. We finally obtained a window to deploy this patch on the real (rather busy!) production system as of last Saturday evening.

The good news is that the patch has now been in place for 5 days, and, despite some very high loading, it has survived without a single crash.

I'd venture to say that this issue is now fixed.

Best wishes,

Richard




Tom Lane wrote:
I wrote:
I'll get you a real fix as soon as I can, but might not be till
tomorrow.

The attached patch (against 8.4.x) fixes the problem as far as I can
tell.  Please test.

                        regards, tom lane



------------------------------------------------------------------------

Index: src/backend/utils/cache/relcache.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/cache/relcache.c,v
retrieving revision 1.287
diff -c -r1.287 relcache.c
*** src/backend/utils/cache/relcache.c  11 Jun 2009 14:49:05 -0000      1.287
--- src/backend/utils/cache/relcache.c  25 Sep 2009 17:32:02 -0000
***************
*** 1386,1392 ****
         *
         * The data we insert here is pretty incomplete/bogus, but it'll serve 
to
         * get us launched.  RelationCacheInitializePhase2() will read the real
!        * data from pg_class and replace what we've done here.
         */
        relation->rd_rel = (Form_pg_class) palloc0(CLASS_TUPLE_SIZE);
--- 1386,1394 ----
         *
         * The data we insert here is pretty incomplete/bogus, but it'll serve 
to
         * get us launched.  RelationCacheInitializePhase2() will read the real
!        * data from pg_class and replace what we've done here.  Note in 
particular
!        * that relowner is left as zero; this cues 
RelationCacheInitializePhase2
!        * that the real data isn't there yet.
         */
        relation->rd_rel = (Form_pg_class) palloc0(CLASS_TUPLE_SIZE);
***************
*** 2603,2619 ****
         * rows and replace the fake entries with them. Also, if any of the
         * relcache entries have rules or triggers, load that info the hard way
         * since it isn't recorded in the cache file.
         */
        hash_seq_init(&status, RelationIdCache);
while ((idhentry = (RelIdCacheEnt *) hash_seq_search(&status)) != NULL)
        {
                Relation        relation = idhentry->reldesc;
/*
                 * If it's a faked-up entry, read the real pg_class tuple.
                 */
!               if (needNewCacheFile && relation->rd_isnailed)
                {
                        HeapTuple       htup;
                        Form_pg_class relp;
--- 2605,2635 ----
         * rows and replace the fake entries with them. Also, if any of the
         * relcache entries have rules or triggers, load that info the hard way
         * since it isn't recorded in the cache file.
+        *
+        * Whenever we access the catalogs to read data, there is a possibility
+        * of a shared-inval cache flush causing relcache entries to be removed.
+        * Since hash_seq_search only guarantees to still work after the 
*current*
+        * entry is removed, it's unsafe to continue the hashtable scan 
afterward.
+        * We handle this by restarting the scan from scratch after each access.
+        * This is theoretically O(N^2), but the number of entries that actually
+        * need to be fixed is small enough that it doesn't matter.
         */
        hash_seq_init(&status, RelationIdCache);
while ((idhentry = (RelIdCacheEnt *) hash_seq_search(&status)) != NULL)
        {
                Relation        relation = idhentry->reldesc;
+               bool            restart = false;
+ + /*
+                * Make sure *this* entry doesn't get flushed while we work 
with it.
+                */
+               RelationIncrementReferenceCount(relation);
/*
                 * If it's a faked-up entry, read the real pg_class tuple.
                 */
!               if (relation->rd_rel->relowner == InvalidOid)
                {
                        HeapTuple       htup;
                        Form_pg_class relp;
***************
*** 2630,2636 ****
                         * Copy tuple to relation->rd_rel. (See notes in
                         * AllocateRelationDesc())
                         */
-                       Assert(relation->rd_rel != NULL);
                        memcpy((char *) relation->rd_rel, (char *) relp, 
CLASS_TUPLE_SIZE);
/* Update rd_options while we have the tuple */
--- 2646,2651 ----
***************
*** 2639,2660 ****
                        RelationParseRelOptions(relation, htup);
/*
!                        * Also update the derived fields in rd_att.
                         */
!                       relation->rd_att->tdtypeid = relp->reltype;
!                       relation->rd_att->tdtypmod = -1;  /* unnecessary, 
but... */
!                       relation->rd_att->tdhasoid = relp->relhasoids;
ReleaseSysCache(htup);
                }
/*
                 * Fix data that isn't saved in relcache cache file.
                 */
                if (relation->rd_rel->relhasrules && relation->rd_rules == NULL)
                        RelationBuildRuleLock(relation);
                if (relation->rd_rel->relhastriggers && relation->trigdesc == 
NULL)
                        RelationBuildTriggers(relation);
        }
/*
--- 2654,2710 ----
                        RelationParseRelOptions(relation, htup);
/*
!                        * Check the values in rd_att were set up correctly.  
(We cannot
!                        * just copy them over now: formrdesc must have set up 
the
!                        * rd_att data correctly to start with, because it may 
already
!                        * have been copied into one or more catcache entries.)
                         */
!                       Assert(relation->rd_att->tdtypeid == relp->reltype);
!                       Assert(relation->rd_att->tdtypmod == -1);
!                       Assert(relation->rd_att->tdhasoid == relp->relhasoids);
ReleaseSysCache(htup); + + /* relowner had better be OK now, else we'll loop forever */
+                       if (relation->rd_rel->relowner == InvalidOid)
+                               elog(ERROR, "invalid relowner in pg_class entry for 
\"%s\"",
+                                        RelationGetRelationName(relation));
+ + restart = true;
                }
/*
                 * Fix data that isn't saved in relcache cache file.
+                *
+                * relhasrules or relhastriggers could possibly be wrong or out 
of
+                * date.  If we don't actually find any rules or triggers, 
clear the
+                * local copy of the flag so that we don't get into an infinite 
loop
+                * here.  We don't make any attempt to fix the pg_class entry, 
though.
                 */
                if (relation->rd_rel->relhasrules && relation->rd_rules == NULL)
+               {
                        RelationBuildRuleLock(relation);
+                       if (relation->rd_rules == NULL)
+                               relation->rd_rel->relhasrules = false;
+                       restart = true;
+               }
                if (relation->rd_rel->relhastriggers && relation->trigdesc == 
NULL)
+               {
                        RelationBuildTriggers(relation);
+                       if (relation->trigdesc == NULL)
+                               relation->rd_rel->relhastriggers = false;
+                       restart = true;
+               }
+ + /* Release hold on the relation */
+               RelationDecrementReferenceCount(relation);
+ + /* Now, restart the hashtable scan if needed */
+               if (restart)
+               {
+                       hash_seq_term(&status);
+                       hash_seq_init(&status, RelationIdCache);
+               }
        }
/*

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Reply via email to