Hi,

When the master shuts down or crashes, there seems to be a case where walreceiver exits without flushing WAL that it has already written. This might lead the startup process to replay unflushed WAL and break the Write-Ahead-Logging rule.
walreceiver.c
>         /* Wait a while for data to arrive */
>         if (walrcv_receive(NAPTIME_PER_CYCLE, &type, &buf, &len))
>         {
>             /* Accept the received data, and process it */
>             XLogWalRcvProcessMsg(type, buf, len);
>
>             /* Receive any more data we can without sleeping */
>             while (walrcv_receive(0, &type, &buf, &len))
>                 XLogWalRcvProcessMsg(type, buf, len);
>
>             /*
>              * If we've written some records, flush them to disk and let the
>              * startup process know about them.
>              */
>             XLogWalRcvFlush();
>         }

The problematic case happens when the latter walrcv_receive (the one in the while loop) emits ERROR. In that case XLogWalRcvFlush() is never reached, so the WAL received by the former walrcv_receive is not guaranteed to have been flushed yet.

The attached patch ensures that all received WAL is flushed to disk before walreceiver exits. I think this patch should be backported to 9.0.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
flush_before_walreceiver_exit_v1.patch
Description: Binary data
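
For readers without the patch at hand, here is a minimal stand-alone sketch of the failure mode and of the flush-on-exit idea. It is not the attached patch and does not use PostgreSQL's real code: WalState, receive_cycle, receiver_exit and unflushed_after_crash are hypothetical names, a "flush" is only simulated by advancing a counter, and an ERROR is modelled as an early return rather than PostgreSQL's actual error handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for walreceiver's progress tracking. */
typedef struct
{
    size_t written;   /* bytes written to the WAL file, maybe not durable */
    size_t flushed;   /* bytes known to be durable on disk */
} WalState;

/* Scripted outcome of one simulated receive call. */
typedef enum { RECV_DATA, RECV_NONE, RECV_ERROR } RecvResult;

/* Simulated write of one received message (fixed 100 bytes). */
static void wal_write(WalState *st)
{
    st->written += 100;
}

/* Simulated fsync: everything written so far becomes durable. */
static void wal_flush(WalState *st)
{
    assert(st->flushed <= st->written);
    st->flushed = st->written;
}

/*
 * One receive cycle, mirroring the shape of the quoted loop: process
 * messages until none are left, then flush. Returns false if a receive
 * failed, in which case the flush at the bottom is skipped.
 */
static bool receive_cycle(WalState *st, const RecvResult *script, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (script[i] == RECV_ERROR)
            return false;       /* models ERROR: control leaves the loop */
        if (script[i] == RECV_NONE)
            break;
        wal_write(st);
    }
    wal_flush(st);
    return true;
}

/* Exit path; flush_on_exit selects the proposed behaviour. */
static void receiver_exit(WalState *st, bool flush_on_exit)
{
    if (flush_on_exit)
        wal_flush(st);
}

/* Returns how many written bytes are still unflushed after the exit. */
size_t unflushed_after_crash(bool flush_on_exit)
{
    WalState st = {0, 0};
    const RecvResult script[] = { RECV_DATA, RECV_ERROR };

    if (!receive_cycle(&st, script, 2))
        receiver_exit(&st, flush_on_exit);
    return st.written - st.flushed;
}
```

Calling unflushed_after_crash(false) leaves the 100 bytes from the first receive unflushed, while unflushed_after_crash(true) leaves none, which is the property the patch is meant to guarantee.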