On Oct 30, 2019, at 09:45, Michael Paquier <mich...@paquier.xyz <mailto:mich...@paquier.xyz>> wrote: > > On Tue, Oct 29, 2019 at 07:50:01PM +0900, Kyotaro Horiguchi wrote: >> At Fri, 25 Oct 2019 15:18:34 +0800, "Dongming Liu" >> <lingce....@alibaba-inc.com> wrote in >>> I recently discovered two possible bugs about synchronous replication. >>> >>> 1. SyncRepCleanupAtProcExit may delete an element that has been deleted >>> SyncRepCleanupAtProcExit first checks whether the queue is detached, if it >>> is not detached, >>> acquires the SyncRepLock lock and deletes it. If this element has been >>> deleted by walsender, >>> it will be deleted repeatedly, SHMQueueDelete will core with a segment >>> fault. >>> >>> IMO, like SyncRepCancelWait, we should lock the SyncRepLock first and then >>> check >>> whether the queue is detached or not. >> >> I think you're right here. > > Indeed. Looking at the surroundings we expect some code paths to hold > SyncRepLock in exclusive or shared mode but we don't actually check > that the lock is hold. So let's add some assertions while on it. > >> This is not right. It is in transaction commit so it is in a >> HOLD_INTERRUPTS section. ProcessInterrupt does not respond to >> cancel/die interrupts thus the ereport should return. > > Yeah. There is an easy way to check after that: InterruptHoldoffCount > needs to be strictly positive. > > My suggestions are attached. Any thoughts?
Thanks, this patch looks good to me.
smime.p7s
Description: S/MIME cryptographic signature