On Tue, Nov 21, 2017 at 3:48 PM, Masahiko Sawada <sawada.m...@gmail.com> wrote: > On Mon, Nov 20, 2017 at 7:35 PM, atorikoshi > <torikoshi_atsushi...@lab.ntt.co.jp> wrote: >> Hi, >> >> I put many queries into one transaction and made ReorderBuffer spill >> data to disk, and sent SIGKILL to postgres before the end of the >> transaction. >> >> After starting up postgres again, I observed the files spilled to >> data wasn't deleted. >> >> I think these files should be deleted because its transaction was no >> more valid, so no one can use these files. >> >> >> Below is a reproduction instructions. >> >> ------------------------------------------------ >> 1. Create table and publication at publiser. >> >> @pub =# CREATE TABLE t1( >> id INT PRIMARY KEY, >> name TEXT); >> >> @pub =# CREATE PUBLICATION pub FOR TABLE t1; >> >> 2. Create table and subscription at subscriber. >> >> @sub =# CREATE TABLE t1( >> id INT PRIMARY KEY, >> name TEXT >> ); >> >> @sub =# CREATE SUBSCRIPTION sub >> CONNECTION 'host=[hostname] port=[port] dbname=[dbname]' >> PUBLICATION pub; >> >> 3. Put many queries into one transaction. >> >> @pub =# BEGIN; >> INSERT INTO t1 >> SELECT >> i, >> 'aaaaaaaaaa' >> FROM >> generate_series(1, 1000000) as i; >> >> 4. Then we can see spilled files. >> >> @pub $ ls -1 ${PGDATA}/pg_replslot/sub/ >> state >> xid-561-lsn-0-1000000.snap >> xid-561-lsn-0-2000000.snap >> xid-561-lsn-0-3000000.snap >> xid-561-lsn-0-4000000.snap >> xid-561-lsn-0-5000000.snap >> xid-561-lsn-0-6000000.snap >> xid-561-lsn-0-7000000.snap >> xid-561-lsn-0-8000000.snap >> xid-561-lsn-0-9000000.snap >> >> 5. Kill publisher's postgres process before COMMIT. >> >> @pub $ kill -s SIGKILL [pid of postgres] >> >> 6. Start publisher's postgres process. >> >> @pub $ pg_ctl start -D ${PGDATA} >> >> 7. After a while, we can see the files remaining. >> (Immediately after starting publiser, we can not see these files.) >> >> @pub $ pg_ctl start -D ${PGDATA} >> >> When I configured with '--enable-cassert', below assertion error >> was appeared. >> >> TRAP: FailedAssertion("!(txn->final_lsn != 0)", File: "reorderbuffer.c", >> Line: 2576) >> ------------------------------------------------ >> >> Attached patch sets final_lsn to the last ReorderBufferChange if >> final_lsn == 0. > > Thank you for the report. I could reproduce this issue with the above > step. My analysis is, the cause of that a serialized reorder buffer > isn't cleaned up is that the aborted transaction without an abort WAL > record has no chance to set ReorderBufferTXN->final_lsn. So if there > is such serialized transactions ReorderBufferRestoreCleanup cleanups > no files, which is cause of the assertion failure (or a file being > orphaned). What do you think? > > On detail of your patch, I'm not sure it's safe if we set the lsn of > other than commit record or abort record to final_lsn. The comment in > reorderbuffer.h says, > > typedef trcut ReorderBufferTXN > { > (snip) > > /* ---- > * LSN of the record that lead to this xact to be committed or > * aborted. This can be a > * * plain commit record > * * plain commit record, of a parent transaction > * * prepared transaction commit > * * plain abort record > * * prepared transaction abort > * * error during decoding > * ---- > */ > XLogRecPtr final_lsn; > > But with your patch, we could set a lsn of a record that is other than > what listed above to final_lsn. One way I came up with is to make > ReorderBufferRestoreCleanup accept an invalid value of final_lsn and > regards it as a aborted transaction that doesn't has a abort WAL > record. So we can cleanup all serialized files if final_lsn of a > transaction is invalid.
After more thought, since there are some codes cosmetically setting final_lsn when the fate of transaction is determined possibly we should not accept a invalid value of final_lsn even in the case. > Since I'm not very familiar with snapshot > building part please check it. Sorry I wanted to say the reorderbuffer module. The snapshot building isn't relevant. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center