At Tue, 21 Nov 2017 20:53:04 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyot...@lab.ntt.co.jp> wrote in <20171121.205304.90315453.horiguchi.kyot...@lab.ntt.co.jp> > At Tue, 21 Nov 2017 20:27:25 +0900, atorikoshi > <torikoshi_atsushi...@lab.ntt.co.jp> wrote in > <d5dddea9-4182-a7e4-f368-156f5470d...@lab.ntt.co.jp> > > Thanks for reviewing! > > > > > > On 2017/11/21 18:12, Masahiko Sawada wrote: > > > On Tue, Nov 21, 2017 at 3:48 PM, Masahiko Sawada > > > <sawada.m...@gmail.com> wrote: > > >> On Mon, Nov 20, 2017 at 7:35 PM, atorikoshi > > >> <torikoshi_atsushi...@lab.ntt.co.jp> wrote: > > >>> Hi, > > >>> > > >>> I put many queries into one transaction and made ReorderBuffer spill > > >>> data to disk, and sent SIGKILL to postgres before the end of the > > >>> transaction. > > >>> > > >>> After starting up postgres again, I observed the files spilled to > > >>> data wasn't deleted. > > >>> > > >>> I think these files should be deleted because its transaction was no > > >>> more valid, so no one can use these files. > > >>> > > >>> > > >>> Below is a reproduction instructions. > > >>> > > >>> ------------------------------------------------ > > >>> 1. Create table and publication at publiser. > > >>> > > >>> @pub =# CREATE TABLE t1( > > >>> id INT PRIMARY KEY, > > >>> name TEXT); > > >>> > > >>> @pub =# CREATE PUBLICATION pub FOR TABLE t1; > > >>> > > >>> 2. Create table and subscription at subscriber. > > >>> > > >>> @sub =# CREATE TABLE t1( > > >>> id INT PRIMARY KEY, > > >>> name TEXT > > >>> ); > > >>> > > >>> @sub =# CREATE SUBSCRIPTION sub > > >>> CONNECTION 'host=[hostname] port=[port] dbname=[dbname]' > > >>> PUBLICATION pub; > > >>> > > >>> 3. Put many queries into one transaction. > > >>> > > >>> @pub =# BEGIN; > > >>> INSERT INTO t1 > > >>> SELECT > > >>> i, > > >>> 'aaaaaaaaaa' > > >>> FROM > > >>> generate_series(1, 1000000) as i; > > >>> > > >>> 4. Then we can see spilled files. > > >>> > > >>> @pub $ ls -1 ${PGDATA}/pg_replslot/sub/ > > >>> state > > >>> xid-561-lsn-0-1000000.snap > > >>> xid-561-lsn-0-2000000.snap > > >>> xid-561-lsn-0-3000000.snap > > >>> xid-561-lsn-0-4000000.snap > > >>> xid-561-lsn-0-5000000.snap > > >>> xid-561-lsn-0-6000000.snap > > >>> xid-561-lsn-0-7000000.snap > > >>> xid-561-lsn-0-8000000.snap > > >>> xid-561-lsn-0-9000000.snap > > >>> > > >>> 5. Kill publisher's postgres process before COMMIT. > > >>> > > >>> @pub $ kill -s SIGKILL [pid of postgres] > > >>> > > >>> 6. Start publisher's postgres process. > > >>> > > >>> @pub $ pg_ctl start -D ${PGDATA} > > >>> > > >>> 7. After a while, we can see the files remaining. > > >>> (Immediately after starting publiser, we can not see these files.) > > >>> > > >>> @pub $ pg_ctl start -D ${PGDATA} > > >>> > > >>> When I configured with '--enable-cassert', below assertion error > > >>> was appeared. > > >>> > > >>> TRAP: FailedAssertion("!(txn->final_lsn != 0)", File: > > >>> "reorderbuffer.c", > > >>> Line: 2576) > > >>> ------------------------------------------------ > > >>> > > >>> Attached patch sets final_lsn to the last ReorderBufferChange if > > >>> final_lsn == 0. > > >> > > >> Thank you for the report. I could reproduce this issue with the above > > >> step. My analysis is, the cause of that a serialized reorder buffer > > >> isn't cleaned up is that the aborted transaction without an abort WAL > > >> record has no chance to set ReorderBufferTXN->final_lsn. So if there > > >> is such serialized transactions ReorderBufferRestoreCleanup cleanups > > >> no files, which is cause of the assertion failure (or a file being > > >> orphaned). What do you think? > > >> > > >> On detail of your patch, I'm not sure it's safe if we set the lsn of > > >> other than commit record or abort record to final_lsn. The comment in > > >> reorderbuffer.h says, > > >> > > >> typedef trcut ReorderBufferTXN > > >> { > > >> (snip) > > >> > > >> /* ---- > > >> * LSN of the record that lead to this xact to be committed or > > >> * aborted. This can be a > > >> * * plain commit record > > >> * * plain commit record, of a parent transaction > > >> * * prepared transaction commit > > >> * * plain abort record > > >> * * prepared transaction abort > > >> * * error during decoding > > >> * ---- > > >> */ > > >> XLogRecPtr final_lsn; > > >> > > >> But with your patch, we could set a lsn of a record that is other than > > >> what listed above to final_lsn. One way I came up with is to make > > > > I added some comments on final_lsn. > > > > >> ReorderBufferRestoreCleanup accept an invalid value of final_lsn and > > >> regards it as a aborted transaction that doesn't has a abort WAL > > >> record. So we can cleanup all serialized files if final_lsn of a > > >> transaction is invalid. > > > > > > After more thought, since there are some codes cosmetically setting > > > final_lsn when the fate of transaction is determined possibly we > > > should not accept a invalid value of final_lsn even in the case. > > > > > It doesn't seem just a cosmetic. The first/last_lsn are used to > determin the files to be deleted. On the other hand the TXN > cannot have last_lsn since it hasn't see abort/commit record. > > > My new patch keeps setting final_lsn, but changed its location to the > > top of ReorderBufferCleanupTXN(). > > I think it's a kind of preparation, so doing it at the top improves > > readability. > > > > > Anyway I think you should register this patch to the next commit fest > > > so as not forget. > > > > Thanks for you advice, I've registered this issue as a bug. > > Using last changing LSN might work but I'm afraid that that fails > to remove the last snap file if the crash happens at the very > start of a segment. > > Anyway all files of the transaction is no longer useless at the > time, but it seems that the last_lsn is required to avoid > directory scanning at every transaction end. > > Letting ReorderBufferAbortOld scan the directory and determine > the first and last LSN then set to the txn would work but it > might be an overkill. Using the beginning LSN of the next segment > of the last_change->lsn could surely work... really? > (ReorderBufferRestoreCleanup doesn't complain on ENOENT.)
Somehow I deleted exessively while editing. One more possible solution is making ReorderBufferAbortOld take final LSN and DecodeStandbyOp passes the LSN of XLOG_RUNNING_XACTS record to it. > By the way, just using unlink() there might lead to the revmoed > file's resurrection but it would be another issue. -- Kyotaro Horiguchi NTT Open Source Software Center