Pavan Deolasee wrote: > Looking at the history and some past discussions, it seems Tomas reported > somewhat similar problem and Andres proposed a patch here > https://www.postgresql.org/message-id/20140514155204.ge23...@awork2.anarazel.de > which got committed via b23b0f5588d2e2. Not exactly the same issue, but > related to relcache flush happening in index_open(). > > I wonder if we should just add a recheck logic > in RelationGetIndexAttrBitmap() such that if rd_indexvalid is seen as 0 at > the end, we go back and start again from RelationGetIndexList(). Or is > there any code path that relies on finding the old information?
Good find on that old report. It occured to me to re-run Tomas' test case with this second patch you proposed (the "ddl" test takes 11 minutes in my laptop), and lo and behold -- your "recheck" code enters an infinite loop. Not good. I tried some more tricks that came to mind, but I didn't get any good results. Pavan and I discussed it further and he came up with the idea of returning the bitmapset we compute, but if an invalidation occurs in index_open, simply do not cache the bitmapsets so that next time this is called they are all computed again. Patch attached. This appears to have appeased both test_decoding's ddl test under CACHE_CLOBBER_ALWAYS, as well as the CREATE INDEX CONCURRENTLY bug. I intend to commit this soon to all branches, to ensure it gets into the next set of minors. Here is a detailed reproducer for the CREATE INDEX CONCURRENTLY bug. Line numbers are as of today's master; comments are supposed to help locating good spots in other branches. Given the use of gdb breakpoints, there's no good way to integrate this with regression tests, which is a pity because this stuff looks a little brittle to me, and if it breaks it is hard to detect. Prep: DROP TABLE IF EXISTS testtab; CREATE TABLE testtab (a int unique, b int, c int); INSERT INTO testtab VALUES (1, 100, 10); CREATE INDEX testindx ON testtab (b); S1: SELECT * FROM testtab; UPDATE testtab SET b = b + 1 WHERE a = 1; SELECT pg_backend_pid(); attach GDB to this session. break relcache.c:4800 # right before entering the per-index loop cont S2: SELECT pg_backend_pid(); DROP INDEX testindx; attach GDB to this session handle SIGUSR1 nostop break indexcmds.c:666 # right after index_create break indexcmds.c:790 # right before the second CommitTransactionCommand cont CREATE INDEX CONCURRENTLY testindx ON testtab (b); -- this blocks in breakpoint 1. Leave it there. S1: UPDATE testtab SET b = b + 1 WHERE a = 1; -- this blocks in breakpoint 1. Leave it there. S2: gdb -> cont -- This blocks waiting for S1 S1: gdb -> cont -- S1 finishes the update. S2 awakens and blocks again in breakpoint 2. S1: -- Now run more UPDATEs: UPDATE testtab SET b = b + 1 WHERE a = 1; This produces a broken HOT chain. This can be done several times; all these updates should be non-HOT because of the index created by CIC, but they are marked HOT. S2: "cont" From this point onwards, update are correct. SELECT * FROM testtab; SET enable_seqscan TO 0; SET enable_bitmapscan TO 0; SELECT * FROM testtab WHERE b between 100 and 110; -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
commit 7054c21883154f67058d42180606e0c5428d2745[m Author: Alvaro Herrera <alvhe...@alvh.no-ip.org> AuthorDate: Fri Feb 3 15:40:37 2017 -0300 CommitDate: Fri Feb 3 15:41:59 2017 -0300 Fix CREATE INDEX CONCURRENTLY relcache bug diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c index 8a7c560..003ccfa 100644 --- a/src/backend/utils/cache/relcache.c +++ b/src/backend/utils/cache/relcache.c @@ -4745,9 +4745,11 @@ RelationGetIndexPredicate(Relation relation) * Attribute numbers are offset by FirstLowInvalidHeapAttributeNumber so that * we can include system attributes (e.g., OID) in the bitmap representation. * - * Caller had better hold at least RowExclusiveLock on the target relation - * to ensure that it has a stable set of indexes. This also makes it safe - * (deadlock-free) for us to take locks on the relation's indexes. + * While all callers should at least RowExclusiveLock on the target relation, + * we still can't guarantee a stable set of indexes because CREATE INDEX + * CONCURRENTLY and DROP INDEX CONCURRENTLY can change set of indexes, without + * taking any conflicting lock. So we must be prepared to handle changes in the + * set of indexes and recompute bitmaps, when necessary. * * The returned result is palloc'd in the caller's memory context and should * be bms_free'd when not needed anymore. @@ -4763,7 +4765,6 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind) Oid relpkindex; Oid relreplindex; ListCell *l; - MemoryContext oldcxt; /* Quick exit if we already computed the result. */ if (relation->rd_indexattr != NULL) @@ -4807,6 +4808,8 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind) relpkindex = relation->rd_pkindex; relreplindex = relation->rd_replidindex; + Assert(relation->rd_indexvalid != 0); + /* * For each index, add referenced attributes to indexattrs. * @@ -4893,18 +4896,23 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind) relation->rd_idattr = NULL; /* - * Now save copies of the bitmaps in the relcache entry. We intentionally - * set rd_indexattr last, because that's the one that signals validity of - * the values; if we run out of memory before making that copy, we won't - * leave the relcache entry looking like the other ones are valid but - * empty. + * Now save copies of the bitmaps in the relcache entry, but only if no + * invalidation occured in the meantime. We intentionally set rd_indexattr + * last, because that's the one that signals validity of the values; if we + * run out of memory before making that copy, we won't leave the relcache + * entry looking like the other ones are valid but empty. */ - oldcxt = MemoryContextSwitchTo(CacheMemoryContext); - relation->rd_keyattr = bms_copy(uindexattrs); - relation->rd_pkattr = bms_copy(pkindexattrs); - relation->rd_idattr = bms_copy(idindexattrs); - relation->rd_indexattr = bms_copy(indexattrs); - MemoryContextSwitchTo(oldcxt); + if (relation->rd_indexvalid != 0) + { + MemoryContext oldcxt; + + oldcxt = MemoryContextSwitchTo(CacheMemoryContext); + relation->rd_keyattr = bms_copy(uindexattrs); + relation->rd_pkattr = bms_copy(pkindexattrs); + relation->rd_idattr = bms_copy(idindexattrs); + relation->rd_indexattr = bms_copy(indexattrs); + MemoryContextSwitchTo(oldcxt); + } /* We return our original working copy for caller to play with */ switch (attrKind)
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers