Hi, On 2016-04-29 10:38:55 -0700, Jeff Janes wrote: > I've bisected the errors I was seeing, discussed in > http://www.postgresql.org/message-id/CAMkU=1xqehc0ok4d+tkjfq1nvuho37pyrkhjp6q8oxifmx7...@mail.gmail.com > > It look like they first appear in: > > commit 48354581a49c30f5757c203415aa8412d85b0f70 > Author: Andres Freund <and...@anarazel.de> > Date: Sun Apr 10 20:12:32 2016 -0700 > > Allow Pin/UnpinBuffer to operate in a lockfree manner. > > > I get the errors: > > ERROR: attempted to delete invisible tuple > STATEMENT: update foo set count=count+1,text_array=$1 where text_array @> $2 > > And also: > > ERROR: unexpected chunk number 1 (expected 2) for toast value > 85223889 in pg_toast_16424 > STATEMENT: update foo set count=count+1 where text_array @> $1
I'm a bit dumbfounded here. I think I found the issue, and, if so, it's *very* longstanding. HeapTupleSatisfiesToast(), aborted transactions, and hint bits (or oid wraparound) appear to be a broken combination. What I'm seing is that tuptoaster.c is trying to fetch or delete toast chunks generated in an aborted (via crash) transaction. And that, with the same valueid/chunk_id, a valid value exist somewhere else in the toast table. HeapTupleSatisfiesToast() does not do any actual tuple liveliness detection besides checking basic hint bits. Thus, if there's a dead tuple in the toast table's heap, which a *live* reference from the table, toast_fetch_datum() potentially returns the wrong results. That makes the new question: How can there be a live reference to a dead and reused toast id? As far as I can see the primary problem is that GetNewOidWithIndex() uses SnapshotDirty to verify whether a tuple is live. As that actually verifies tuple liveliness (and sets hint bits), it'll skip over an aborted toasted datum. A valid question is why the hint bits set by HeapTupleSatisfiesDirty() doesn't prevent that from occuring (besides standbys which aren't evolved): It's the constant crashes. Looking at a tuple which triggers the "unexpected chunk number" error, it indeed has a live and a dead version of the same chunk id. Interestingly the dead rows looks like: chunk_id = 1 - t_infomask: 0xA02 - HEAP_XMAX_INVALID | HEAP_XMIN_INVALID | HEAP_HASVARWIDTH chunk_id = 2 - t_infomask: 0x802 - HEAP_XMAX_INVALID | HEAP_HASVARWIDTH My rather strong suspicion on why that is, is that both tuples were visited by SnapshotDirty(), but only the first ones hint bits got persisted, because checksums were enabled. The second time through MarkBufferDirtyHint skipped work, because the buffer was already dirty. rmgr: XLOG len (rec/tot): 0/ 3809, tx: 206127725, lsn: 1C/EB3B0910, prev 1C/EB3B08E0, desc: FPI_FOR_HINT , blkref #0: rel 1663/16384/16433 blk 202567 FPW that's indeed the only time the affected page is touched as visible to pg_xlogdump after being filled... The easy way to trigger this problem would be to have an oid wraparound - but the WAL shows that that's not the case here. I've not figured that one out entirely (and won't tonight). But I do see WAL records like: rmgr: XLOG len (rec/tot): 4/ 30, tx: 0, lsn: 2/12004018, prev 2/12003288, desc: NEXTOID 4302693 rmgr: XLOG len (rec/tot): 4/ 30, tx: 0, lsn: 2/1327EA08, prev 2/1327DC60, desc: NEXTOID 4302693 i.e. two NEXTOID records allocating the same range, which obviously doesn't seem right. There's also every now and then close by ranges: rmgr: XLOG len (rec/tot): 4/ 30, tx: 0, lsn: 1/9A404DB8, prev 1/9A404270, desc: NEXTOID 3311455 rmgr: XLOG len (rec/tot): 4/ 30, tx: 7814505, lsn: 1/9A4EC888, prev 1/9A4EB9D0, desc: NEXTOID 3311461 As far as I can see something like the above, or an oid wraparound, are pretty much deadly for toast. Is anybody ready with a good defense for SatisfiesToast not doing any actual liveliness checks? Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers