On Sun, Jul 23, 2017 at 8:32 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Meanwhile, it's still pretty unclear what happened yesterday on
> culicidae.

That failure is indeed baffling. The only code paths that insert
(HASH_ENTER[_NULL]) into PredicateLockTargetHash are:

1. CreatePredicateLock(). It would be a bug if that ever tried to
insert a { 0, 0, 0, 0 } tag, and in any case it holds
SerializablePredicateLockListLock in LW_SHARED.

2. TransferPredicateLocksToNewTarget(), which removes and restores
the scratch entry and also explicitly inserts a transferred entry. It
asserts that it holds SerializablePredicateLockListLock and is called
only by PredicateLockPageSplit(), which acquires it in LW_EXCLUSIVE.

3. DropAllPredicateLocksFromTable(), which removes and restores the
scratch entry and also explicitly inserts a transferred entry. It
acquires SerializablePredicateLockListLock in LW_EXCLUSIVE.

I wondered if DropAllPredicateLocksFromTable() had itself inserted a
tag that accidentally looks like the scratch tag in between removing
and restoring it, perhaps because the relation passed in had a bogus
DB OID of 0 etc. But it constructs a tag with
SET_PREDICATELOCKTARGETTAG_RELATION(heaptargettag, dbId, heapId),
which sets locktag_field3 to InvalidBlockNumber == -1, not 0, so that
can't explain it.

I wondered if a concurrent PredicateLockPageSplit() called
TransferPredicateLocksToNewTarget() using a newtargettag built from a
Relation that somehow had a bogus DB OID 0, rel OID 0 and newblkno 0.
But that doesn't help either, because
SerializablePredicateLockListLock is acquired in LW_EXCLUSIVE, so it
can't run concurrently.

It looks a bit like either something at a lower level is broken (GCC
6.3 was released 6 months ago; maybe it interacts badly with some
clever memory-model-dependent code of ours?) or something is
trashing memory.
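
To make the tag argument above concrete, here is a minimal standalone sketch. It is a simplified model, not the actual predicate.c code: the TargetTag struct, the helper names, and the choice of 0 for the fourth field are assumptions made for illustration. Only the { 0, 0, 0, 0 } scratch tag and the InvalidBlockNumber == -1 value in the third field come from the discussion above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a predicate lock target tag: four 32-bit fields. */
typedef struct
{
    uint32_t field1;            /* database OID */
    uint32_t field2;            /* relation OID */
    uint32_t field3;            /* block number */
    uint32_t field4;            /* offset number */
} TargetTag;

/* All bits set, i.e. (uint32_t) -1, matching "InvalidBlockNumber == -1". */
#define INVALID_BLOCK_NUMBER ((uint32_t) 0xFFFFFFFF)

/* The scratch tag from the thread: { 0, 0, 0, 0 }. */
const TargetTag scratch_tag = {0, 0, 0, 0};

/*
 * Hypothetical model of a relation-level tag: field3 is set to
 * InvalidBlockNumber as described in the thread; field4 = 0 is an assumption.
 */
TargetTag
make_relation_tag(uint32_t db_oid, uint32_t rel_oid)
{
    TargetTag tag = {db_oid, rel_oid, INVALID_BLOCK_NUMBER, 0};

    return tag;
}

/* Hypothetical model of a page-level tag; field4 = 0 is an assumption. */
TargetTag
make_page_tag(uint32_t db_oid, uint32_t rel_oid, uint32_t blkno)
{
    TargetTag tag = {db_oid, rel_oid, blkno, 0};

    return tag;
}

/* Field-by-field comparison of two tags. */
bool
tags_equal(const TargetTag *a, const TargetTag *b)
{
    return a->field1 == b->field1 &&
           a->field2 == b->field2 &&
           a->field3 == b->field3 &&
           a->field4 == b->field4;
}
```

In this model, make_relation_tag(0, 0) never compares equal to scratch_tag, because its third field is 0xFFFFFFFF. make_page_tag(0, 0, 0) does compare equal, which is why only the page-split path would need to be ruled out by the locking argument.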

Here's the set of tests that ran concurrently with select_into, whose
backtrace we see ("DROP SCHEMA selinto_schema CASCADE;"):

parallel group (20 tests): select_distinct_on delete select_having
random btree_index select_distinct namespace update case hash_index
select_implicit subselect select_into arrays prepared_xacts
transactions portals aggregates join union

Of those, I see that prepared_xacts, portals and transactions
explicitly use SERIALIZABLE (which may or may not be important). I
wonder if the thing to do here is to run select_into (or maybe just
its setup and tear-down, "DROP SCHEMA ...") concurrently with those
others in tight loops and burn some CPU.

--
Thomas Munro
http://www.enterprisedb.com