On Mon, Apr 11, 2016 at 7:16 PM, Stas Kelvich wrote:
> Michael, it looks like that you are the only one person who can reproduce
> that bug. I’ve tried on bunch of OS’s and didn’t observe that behaviour, also
> looking at your backtraces I can’t get who is holding this lock (and all of
> that happens before first prepare record is replayed).
Where did you try it? FWIW, I can reproduce that on Linux and OSX, though only manually:
1) Create a master and a streaming standby.
2) Run the following on the master:
   BEGIN;
   CREATE TABLE foo (a int);
   PREPARE TRANSACTION 'tx';
3) Stop the standby with -m immediate.
4) Run COMMIT PREPARED 'tx' on the master.
5) Restart the standby; the node will wait for a lock.

> Can you investigate it more? Particularly find out who holds the lock?

OK, so if I look at this backtrace, that's a standby LOCK record, but we already know that:

  frame #9: 0x0000000107600383 postgres`LockAcquireExtended(locktag=0x00007fff58a4b228, lockmode=8, sessionLock='\x01', dontWait='\0', reportMemoryError='\0') + 2819 at lock.c:998
  frame #10: 0x00000001075f9cd6 postgres`StandbyAcquireAccessExclusiveLock(xid=867, dbOid=16384, relOid=16385) + 358 at standby.c:627
* frame #11: 0x00000001075fa23b postgres`standby_redo(record=0x00007f90a9841e38) + 251 at standby.c:809
  frame #12: 0x0000000107298d37 postgres`StartupXLOG + 9351 at xlog.c:6871

Here is the record pointer:

(lldb) p *record
(XLogReaderState) $4 = {
  read_page = 0x000000010729b3c0 (postgres`XLogPageRead at xlog.c:10973)
  system_identifier = 6272572355656465658
  private_data = 0x00007fff58a4bf40
  ReadRecPtr = 50424128
  EndRecPtr = 50424176
  decoded_record = 0x00007f90a9843178
  main_data = 0x00007f90a9804b78 "\x01"
  main_data_len = 16
  main_data_bufsz = 784
  record_origin = 0
  blocks = {
    [0] = {
      in_use = '\0'
      rnode = (spcNode = 1663, dbNode = 16384, relNode = 2674)
      forknum = MAIN_FORKNUM
      blkno = 22
      flags = '\x10'
      has_image = '\0'
      bkp_image = 0x00007f90a984826b "\x01"
      hole_offset = 892
      hole_length = 2076
      bimg_len = 6116
      bimg_info = '\x01'
      has_data = '\0'
      data = 0x00007f90a98595d8 "\a"
      data_len = 0
      data_bufsz = 154
    }

And in our case this corresponds to the record with LSN 0/03016940 that cannot take an exclusive LOCK:

rmgr: Transaction len (rec/tot):    784/   813, tx:        867, lsn: 0/03016610, prev 0/030165D8, desc: PREPARE
rmgr: Standby     len (rec/tot):     16/    42, tx:          0, lsn: 0/03016940, prev 0/03016610, desc: LOCK xid 867 db 16384 rel 16385
rmgr: Standby     len (rec/tot):     28/    54, tx:          0, lsn: 0/03016970, prev 0/03016940, desc: RUNNING_XACTS nextXid 868 latestCompletedXid 866 oldestRunningXid 867; 1 xacts: 867

There are two XID locks taken before that:

rmgr: Standby     len (rec/tot):     16/    42, tx:        867, lsn: 0/03016578, prev 0/03014D40, desc: LOCK xid 867 db 16384 rel 16385
rmgr: Standby     len (rec/tot):     16/    42, tx:          0, lsn: 0/030165A8, prev 0/03016578, desc: LOCK xid 867 db 16384 rel 16385

And pg_locks on the standby is reporting that another lock has been taken but not released:

=# select locktype, pid, mode, granted, fastpath from pg_locks where relation = 16385;
 locktype |  pid  |        mode         | granted | fastpath
----------+-------+---------------------+---------+----------
 relation | 68955 | AccessExclusiveLock | f       | f
 relation | null  | AccessExclusiveLock | t       | f
(2 rows)

In this case 68955 is the startup process trying to take the lock for the LOCK record, and it has not been granted yet:

ioltas 68955 0.0 0.0 2593064 3228 ?? TXs 4:44PM 0:00.05 postgres: startup process recovering 000000010000000000000003 waiting

Now there is already a lock that has been taken and granted, conflicting here... As the relation has only been PREPARE'd and not yet COMMIT PREPARED at this stage of the replay, isn't this lock taken during the PREPARE phase and then not released by your new code paths? [One LOCK_DEBUG build later...] It looks to be the case...
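For what it's worth, the manual reproduction above boils down to something like the following sketch. The ports, data directory paths and the pre-existing streaming replication setup are assumptions for illustration, not part of the report itself:

    # Assumes: master on port 5432 with data directory ./master,
    # standby on port 5433 with data directory ./standby,
    # and streaming replication already configured between them.

    # 2) Prepare a transaction on the master that creates a table.
    psql -p 5432 <<'SQL'
    BEGIN;
    CREATE TABLE foo (a int);
    PREPARE TRANSACTION 'tx';
    SQL

    # 3) Stop the standby in immediate mode so it restarts through crash recovery.
    pg_ctl -D ./standby stop -m immediate

    # 4) Commit the prepared transaction on the master.
    psql -p 5432 -c "COMMIT PREPARED 'tx'"

    # 5) Restart the standby; its startup process then blocks on the standby LOCK record.
    pg_ctl -D ./standby start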
The PREPARE phase, replayed after the standby is restarted in recovery, creates a series of exclusive locks on the newly created table, and those are not taken on HEAD. Once they have been replayed, the standby LOCK record conflicts with them. In this failure case the COMMIT PREPARED record can never be fetched from the master via the WAL stream, so the relation never becomes visible.

Attached are two log files from a couple of runs; these are the logs of the standby after being restarted in crash recovery:
- 2pc_master_logs.log, for HEAD.
- 2pc_patch_logs.log, with your last patch applied.
Feel free to have a look at them.

Regards,
--
Michael