On Tue, Feb 08, 2022 at 06:04:03PM -0800, Noah Misch wrote:
> On Tue, Feb 08, 2022 at 04:43:47PM -0800, Andres Freund wrote:
> > On 2022-02-08 22:13:01 +0100, Tomas Vondra wrote:
> > > On 10/24/21 03:40, Noah Misch wrote:
> > > > Avoid race in RelationBuildDesc() affecting CREATE INDEX CONCURRENTLY.
> > > > 
> > > > CIC and REINDEX CONCURRENTLY assume backends see their catalog changes
> > > > no later than each backend's next transaction start.  That failed to
> > > > hold when a backend absorbed a relevant invalidation in the middle of
> > > > running RelationBuildDesc() on the CIC index.  Queries that use the
> > > > resulting index can silently fail to find rows.  Fix this for future
> > > > index builds by making RelationBuildDesc() loop until it finishes
> > > > without accepting a relevant invalidation.  It may be necessary to
> > > > reindex to recover from past occurrences; REINDEX CONCURRENTLY suffices.
> > > > Back-patch to 9.6 (all supported versions).
> > > > 
> > > > Noah Misch and Andrey Borodin, reviewed (in earlier versions) by Andres
> > > > Freund.
> > > > 
> > > > Discussion: 
> > > > https://postgr.es/m/20210730022548.ga1940...@gust.leadboat.com
> > > > 
> > > 
> > > Unfortunately, this seems to have broken CLOBBER_CACHE_ALWAYS builds. 
> > > Since
> > > this commit, initdb never completes due to infinite retrying over and over
> > > (on the first RelationBuildDesc call).
> 
> Thanks for the report.  I had added the debug_discard arguments of
> InvalidateSystemCachesExtended() and RelationCacheInvalidate() to make the new
> code survive a CREATE TABLE at debug_discard_caches=5.  Apparently that's not
> enough for initdb.  I'll queue a task to look at it.

The explanation was more boring than that.  v13 and earlier have an additional
InvalidateSystemCaches() call site, which I neglected to update.  Here's the
fix I intend to push.
Author:     Noah Misch <n...@leadboat.com>
Commit:     Noah Misch <n...@leadboat.com>

    Fix back-patch of "Avoid race in RelationBuildDesc() ..."
    
    The back-patch of commit fdd965d074d46765c295223b119ca437dbcac973 broke
    CLOBBER_CACHE_ALWAYS for v9.6 through v13, because it updated the
    InvalidateSystemCaches() call pertaining to CLOBBER_CACHE_RECURSIVELY
    and not also the one pertaining to CLOBBER_CACHE_ALWAYS.  Back-patch to
    v13, v12, v11, and v10.
    
    Reported by Tomas Vondra.  Reviewed by FIXME.
    
    Discussion: 
https://postgr.es/m/df7b4c0b-7d92-f03f-75c4-9e08b269a...@enterprisedb.com

diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c
index 8b0503f..8fe1f0d 100644
--- a/src/backend/utils/cache/inval.c
+++ b/src/backend/utils/cache/inval.c
@@ -715,27 +715,27 @@ AcceptInvalidationMessages(void)
         * slows things by at least a factor of 10000, so I wouldn't suggest
         * trying to run the entire regression tests that way.  It's useful to 
try
         * a few simple tests, to make sure that cache reload isn't subject to
         * internal cache-flush hazards, but after you've done a few thousand
         * recursive reloads it's unlikely you'll learn more.
         */
 #if defined(CLOBBER_CACHE_ALWAYS)
        {
                static bool in_recursion = false;
 
                if (!in_recursion)
                {
                        in_recursion = true;
-                       InvalidateSystemCaches();
+                       InvalidateSystemCachesExtended(true);
                        in_recursion = false;
                }
        }
 #elif defined(CLOBBER_CACHE_RECURSIVELY)
        {
                static int      recursion_depth = 0;
 
                /* Maximum depth is arbitrary depending on your threshold of 
pain */
                if (recursion_depth < 3)
                {
                        recursion_depth++;
                        InvalidateSystemCachesExtended(true);
                        recursion_depth--;

Reply via email to