On Wed, Aug 1, 2018 at 12:56 PM Kyotaro HORIGUCHI <horiguchi.kyot...@lab.ntt.co.jp> wrote: > > Thank you, Amit, Michael. >
Can you verify the first patch that I have posted above [1]? We can commit it separately. > > It's a long time ago.. Let me have a bit of time to blow dust off.. > > https://www.postgresql.org/message-id/20180420.173354.43303926.horiguchi.kyot...@lab.ntt.co.jp > > ...done..i working.. > > The original problem here was UpdateFullPageWrites() can call > RecoveryInProgress(), which does palloc in a critical > section. This happens when standy is commanded to reload with > change of full_pages_writes during recovery. > AFAIU from the original problem reported by Dilip, it can happen during checkpoint without standby as well. > While exploring it, I found that update of fullPageWrite flag is > updated concurrently against its design. A race condition happens > in the following steps in my diagnosis. > > 1. While the startup process is running recovery, we didn't > consider that checkpointer may be running concurrently, but it > happens for standby. > > 2. Checkpointer considers itself (or designed) as the *only* > updator of shared config including fillPageWrites. In reality > the startup is another concurent updator on standby. Addition to > that, checkpointer assumes that it is started under WAL-emitting > state, that is, InitXLogInsert() has been called elsewhere, but > it is not the case on standby. > > Note that checkpointer mustn't update FPW flag before recovery > ends because startup will overrides the flag. > > 3. As the result, when standby gets full_page_writes changed and > SIGHUP during recovery, checkpointer tries to update FPW flag > and calls RecoveryInProgress() on the way. bang! > > > With the 0002-step1.patch, checkpointer really becomes the only > updator of the FPW flag after recovery ends. StartupXLOG() > updates it just once before checkpointer starts to update it. > - * Update full_page_writes in shared memory and write an XLOG_FPW_CHANGE - * record before resource manager writes cleanup WAL records or checkpoint - * record is written. + * Update full_page_writes in shared memory with the lastest value before + * resource manager writes cleanup WAL records or checkpoint record is + * written. We don't need to write XLOG_FPW_CHANGE since this just + * reflects the status at the last redo'ed record. No lock is required + * since startup is the only updator of the flag at this + * point. Checkpointer will take over after SharedRecoveryInProgress is + * turned to false. */ Insert->fullPageWrites = lastFullPageWrites; - LocalSetXLogInsertAllowed(); - UpdateFullPageWrites(); - LocalXLogInsertAllowed = -1; lastFullPageWrites will contain the latest value among the replayed WAL records. It can still be different fullPageWrites which is updated by UpdateFullPageWrites(). So, I don't think it is advisable to remove it and rely on checkpointer to update it. I think when it is called from this code path, it will anyway not write XLOG_FPW_CHANGE because of the !RecoveryInProgress() check. > Under the normal(slow?) promotion mode, checkpointer is woken up > explicitly to make the config setting effective as soon as > possible. (This might be unnecessary.) > I am not sure this is the right approach. If we are worried about concurrency issues, then we can use lock to protect it. > In checkpointer, RecoveryInProgress() is called preior to > UpdateFPW() to disable update FPW during recovery, so the crash > that is the issue here is fixed. > It seems to me that you are trying to resolve two different problems in the same patch. I think we can have a patch to deal with concurrency issue if any and a separate patch to call RecoveryInProgress outside critical section. [1] - https://www.postgresql.org/message-id/CAA4eK1JvKxsHfM6GCHoKNas-7vDSniLgaAm%3DcG_OaQaOYRNc3w%40mail.gmail.com -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com