Re: [HACKERS] Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?

Andres Freund Mon, 24 Feb 2014 13:18:27 -0800

Hi,

On 2014-02-24 17:55:14 -0300, Alvaro Herrera wrote:
> Greg Stark wrote:
> > I have a database where a couple rows don't appear in index scans
> > but do appear in sequential scans. It looks like the same problem as
> > Peter reported but this is a different database. I've extracted all
> > the xlogdump records and below are the ones I think are relevant. You
> > can see that lp 2 gets a few HOT updates and concurrently has someone
> > create a MultiXact NO KEY UPDATE lock while one of those HOT updates
> > is pending but not committed.


Per se the sequence of records doesn't look bad (even though I am not
happy that we log intermediate and final rows first, and only then the
start of the chain).

> > The net result seems to be that the ctid
> > update chain got broken. The index of course points to the head of the
> > HOT chain so it doesn't find the live tail whereas the sequential scan
> > picks it up.

Yea, that's the problem.

> > rmgr: Heap        len (rec/tot):    291/   323, tx:    5943849, lsn: FD/2F0ADFC0, prev FD/2F0ADF90, bkp: 0000, desc: hot_update: rel 1663/16385/212653; tid 13065/2 xmax 5943849 ; new tid 13065/3 xmax 0
> > rmgr: Heap2       len (rec/tot):     25/    57, tx:    5943851, lsn: FD/2F0AE450, prev FD/2F0AE408, bkp: 0000, desc: lock updated: xmax 5943851 msk 000a; rel 1663/16385/212653; tid 13065/3
> > rmgr: MultiXact   len (rec/tot):     28/    60, tx:    5943851, lsn: FD/2F0AE490, prev FD/2F0AE450, bkp: 0000, desc: create mxid 728896 offset 1632045 nmembers 2: 5943849 (nokeyupd) 5943851 (keysh)
> > rmgr: Heap        len (rec/tot):     25/    57, tx:    5943851, lsn: FD/2F0AE4D0, prev FD/2F0AE490, bkp: 0000, desc: lock 728896: rel 1663/16385/212653; tid 13065/2 IS_MULTI EXCL_LOCK

> >  lp | lp_off | lp_flags | lp_len | t_xmin  | t_xmax  | t_field3 |   t_ctid   | t_infomask2 | t_infomask | t_hoff |
> > ----+--------+----------+--------+---------+---------+----------+------------+-------------+------------+--------+-
> >   2 |   3424 |        1 |    232 | 5943845 |  728896 |        0 | (13065,2)  |          32 |       4419 |     32 |
> >   3 |   3152 |        1 |    272 | 5943849 | 5943879 |        0 | (13065,4)  |       49184 |       9475 |     32 |
> >   4 |   2864 |        1 |    287 | 5943879 | 5943880 |        0 | (13065,7)  |       49184 |       9475 |     32 |
> >   7 |   2576 |        1 |    287 | 5943880 |       0 |        0 | (13065,7)  |       32800 |      10499 |     32 |
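
For anyone reading along, the t_infomask2 and t_infomask columns in that
dump are easier to follow once decoded. What follows is only a minimal
sketch in C, not PostgreSQL code. The flag values are written down from
memory of 9.3's htup_details.h and should be checked against the real
header; has_flag, natts and describe_tuple are hypothetical helper names
made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* ASSUMPTION: values recalled from 9.3's htup_details.h; verify before use. */
#define HEAP_XMAX_KEYSHR_LOCK 0x0010   /* t_infomask bits */
#define HEAP_XMAX_EXCL_LOCK   0x0040
#define HEAP_XMAX_LOCK_ONLY   0x0080
#define HEAP_XMIN_COMMITTED   0x0100
#define HEAP_XMAX_COMMITTED   0x0400
#define HEAP_XMAX_INVALID     0x0800
#define HEAP_XMAX_IS_MULTI    0x1000
#define HEAP_UPDATED          0x2000

#define HEAP_NATTS_MASK       0x07FF   /* t_infomask2 bits */
#define HEAP_KEYS_UPDATED     0x2000
#define HEAP_HOT_UPDATED      0x4000
#define HEAP_ONLY_TUPLE       0x8000

typedef struct { uint16_t mask; const char *name; } FlagName;

static const FlagName infomask_names[] = {
    {HEAP_XMAX_KEYSHR_LOCK, "XMAX_KEYSHR_LOCK"},
    {HEAP_XMAX_EXCL_LOCK,   "XMAX_EXCL_LOCK"},
    {HEAP_XMAX_LOCK_ONLY,   "XMAX_LOCK_ONLY"},
    {HEAP_XMIN_COMMITTED,   "XMIN_COMMITTED"},
    {HEAP_XMAX_COMMITTED,   "XMAX_COMMITTED"},
    {HEAP_XMAX_INVALID,     "XMAX_INVALID"},
    {HEAP_XMAX_IS_MULTI,    "XMAX_IS_MULTI"},
    {HEAP_UPDATED,          "UPDATED"},
};

static const FlagName infomask2_names[] = {
    {HEAP_KEYS_UPDATED, "KEYS_UPDATED"},
    {HEAP_HOT_UPDATED,  "HOT_UPDATED"},
    {HEAP_ONLY_TUPLE,   "HEAP_ONLY_TUPLE"},
};

/* True if every bit of 'flag' is set in 'field'. */
static bool has_flag(uint16_t field, uint16_t flag)
{
    return (field & flag) == flag;
}

/* The low bits of t_infomask2 hold the attribute count. */
static int natts(uint16_t infomask2)
{
    return infomask2 & HEAP_NATTS_MASK;
}

/* Print the names of the flags set for one line pointer. */
static void describe_tuple(int lp, uint16_t infomask2, uint16_t infomask)
{
    size_t i;

    assert(lp > 0);             /* line pointers are numbered from 1 */
    printf("lp %d: natts=%d", lp, natts(infomask2));
    for (i = 0; i < sizeof(infomask2_names) / sizeof(infomask2_names[0]); i++)
        if (has_flag(infomask2, infomask2_names[i].mask))
            printf(" %s", infomask2_names[i].name);
    for (i = 0; i < sizeof(infomask_names) / sizeof(infomask_names[0]); i++)
        if (has_flag(infomask, infomask_names[i].mask))
            printf(" %s", infomask_names[i].name);
    printf("\n");
}
```

Calling describe_tuple(2, 32, 4419) and describe_tuple(3, 49184, 9475)
from a small main() decodes the first two rows. If the constants above
are right, lp 2 carries XMAX_IS_MULTI and XMAX_EXCL_LOCK but no
HOT_UPDATED bit, while lp 3 and lp 4 still have HOT_UPDATED set.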

Those together explain the story. Note this bit:

static void
heap_xlog_lock(XLogRecPtr lsn, XLogRecord *record)
{
...
    HeapTupleHeaderClearHotUpdated(htup);
    HeapTupleHeaderSetXmax(htup, xlrec->locking_xid);
    HeapTupleHeaderSetCmax(htup, FirstCommandId, false);
    /* Make sure there is no forward chain link in t_ctid */
    htup->t_ctid = xlrec->target.tid;
...
}

So, the replay of FD/2F0AE4D0 breaks the ctid chain *and* unsets the
HOT_UPDATED flag.

Which means fkey locks have never properly worked across SR/crash
recovery.
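
To make the failure mode concrete, here is a toy model in C of the page
above. It is a sketch under stated assumptions, not the real heap code:
ToyTuple and the toy_* functions are hypothetical names, the chain on the
primary is assumed to be 2 -> 3 -> 4 -> 7 (as the t_ctid values of lp 3,
4 and 7 suggest), and toy_replay_lock only mimics the two assignments
from the heap_xlog_lock excerpt that matter for chain following.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_LP 8

/* Hypothetical, heavily simplified stand-in for a heap tuple header. */
typedef struct ToyTuple
{
    bool used;         /* line pointer is in use */
    bool hot_updated;  /* stands in for the HOT_UPDATED flag */
    int  ctid;         /* lp of the next version; own lp at chain end */
} ToyTuple;

/* Build the chain 2 -> 3 -> 4 -> 7 (ASSUMED state on the primary). */
static void toy_build_page(ToyTuple *page)
{
    memset(page, 0, sizeof(ToyTuple) * TOY_MAX_LP);
    page[2] = (ToyTuple){true, true, 3};
    page[3] = (ToyTuple){true, true, 4};
    page[4] = (ToyTuple){true, true, 7};
    page[7] = (ToyTuple){true, false, 7};
}

/*
 * Walk the chain from the root line pointer, roughly the way an index
 * scan has to: keep going only while the current tuple is flagged as
 * HOT-updated and its ctid points somewhere else.
 */
static int toy_chain_tail(const ToyTuple *page, int root)
{
    int lp = root;
    int steps;

    assert(root > 0 && root < TOY_MAX_LP && page[root].used);
    for (steps = 0; steps < TOY_MAX_LP; steps++)   /* bound the walk */
    {
        if (!page[lp].hot_updated || page[lp].ctid == lp)
            break;
        lp = page[lp].ctid;
    }
    return lp;
}

/* Mimic the replay excerpt: clear the flag and point ctid at itself. */
static void toy_replay_lock(ToyTuple *page, int lp)
{
    page[lp].hot_updated = false;
    page[lp].ctid = lp;
}
```

Before toy_replay_lock(page, 2), toy_chain_tail(page, 2) walks to lp 7;
afterwards it stops at lp 2. In the toy model that is the same symptom
as reported: the index entry points at the root, the walk ends there,
and the live tail is only reachable by a sequential scan.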

Haven't thought about how to fix it yet, I hope I won't have to (hint hint).

We somehow need to have a policy of testing changes to the WAL format
with full_page_writes disabled. Full-page images hide bugs in replay far,
far too often.


Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

