On Tue, Feb 08, 2022 at 06:04:03PM -0800, Noah Misch wrote: > On Tue, Feb 08, 2022 at 04:43:47PM -0800, Andres Freund wrote: > > On 2022-02-08 22:13:01 +0100, Tomas Vondra wrote: > > > On 10/24/21 03:40, Noah Misch wrote: > > > > Avoid race in RelationBuildDesc() affecting CREATE INDEX CONCURRENTLY. > > > > > > > > CIC and REINDEX CONCURRENTLY assume backends see their catalog changes > > > > no later than each backend's next transaction start. That failed to > > > > hold when a backend absorbed a relevant invalidation in the middle of > > > > running RelationBuildDesc() on the CIC index. Queries that use the > > > > resulting index can silently fail to find rows. Fix this for future > > > > index builds by making RelationBuildDesc() loop until it finishes > > > > without accepting a relevant invalidation. It may be necessary to > > > > reindex to recover from past occurrences; REINDEX CONCURRENTLY suffices. > > > > Back-patch to 9.6 (all supported versions). > > > > > > > > Noah Misch and Andrey Borodin, reviewed (in earlier versions) by Andres > > > > Freund. > > > > > > > > Discussion: > > > > https://postgr.es/m/20210730022548.ga1940...@gust.leadboat.com > > > > > > > > > > Unfortunately, this seems to have broken CLOBBER_CACHE_ALWAYS builds. > > > Since > > > this commit, initdb never completes due to infinite retrying over and over > > > (on the first RelationBuildDesc call). > > Thanks for the report. I had added the debug_discard arguments of > InvalidateSystemCachesExtended() and RelationCacheInvalidate() to make the new > code survive a CREATE TABLE at debug_discard_caches=5. Apparently that's not > enough for initdb. I'll queue a task to look at it.
The explanation was more boring than that. v13 and earlier have an additional InvalidateSystemCaches() call site, which I neglected to update. Here's the fix I intend to push.
Author: Noah Misch <n...@leadboat.com> Commit: Noah Misch <n...@leadboat.com> Fix back-patch of "Avoid race in RelationBuildDesc() ..." The back-patch of commit fdd965d074d46765c295223b119ca437dbcac973 broke CLOBBER_CACHE_ALWAYS for v9.6 through v13, because it updated the InvalidateSystemCaches() call pertaining to CLOBBER_CACHE_RECURSIVELY and not also the one pertaining to CLOBBER_CACHE_ALWAYS. Back-patch to v13, v12, v11, and v10. Reported by Tomas Vondra. Reviewed by FIXME. Discussion: https://postgr.es/m/df7b4c0b-7d92-f03f-75c4-9e08b269a...@enterprisedb.com diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c index 8b0503f..8fe1f0d 100644 --- a/src/backend/utils/cache/inval.c +++ b/src/backend/utils/cache/inval.c @@ -715,27 +715,27 @@ AcceptInvalidationMessages(void) * slows things by at least a factor of 10000, so I wouldn't suggest * trying to run the entire regression tests that way. It's useful to try * a few simple tests, to make sure that cache reload isn't subject to * internal cache-flush hazards, but after you've done a few thousand * recursive reloads it's unlikely you'll learn more. */ #if defined(CLOBBER_CACHE_ALWAYS) { static bool in_recursion = false; if (!in_recursion) { in_recursion = true; - InvalidateSystemCaches(); + InvalidateSystemCachesExtended(true); in_recursion = false; } } #elif defined(CLOBBER_CACHE_RECURSIVELY) { static int recursion_depth = 0; /* Maximum depth is arbitrary depending on your threshold of pain */ if (recursion_depth < 3) { recursion_depth++; InvalidateSystemCachesExtended(true); recursion_depth--;