On Fri, Dec 29, 2017 at 2:38 PM, Andres Freund <and...@anarazel.de> wrote: > Hm, I'm not quite convinced by this approach. Partially because of the > backpatch issue you mention, partially because using the list length as > a limit doesn't seem quite nice.
Seems OK to me. Certainly better than your competing proposal. > Given that the proclist_contains() checks in condition_variable.c are > already racy, I think it might be feasible to collect all procnos to > signal while holding the spinlock, and then signal all of them in one > go. That doesn't seem very nice at all. Not only does it violate the coding rule against looping while holding a spinlock, but it seems that it would require allocating memory while holding one, which is a non-starter. > Obviously it'd be nicer to not hold a spinlock while looping, but that > seems like something we can't fix in the back branches. [insert rant > about never using spinlocks unless there's very very clear convicing > reasons]. I don't think that's a coding rule that I'd be prepared to endorse. We've routinely used spinlocks for years in cases where the critical section was very short, just to keep the overhead down. I think it works fine in that case, although I admit that I failed to appreciate how unpleasant the livelock possibilities were in this case. It's not clear to me that we entirely need a back-patchable fix for this. It could be that parallel index scan can have the same issue, but I'm not aware of any user complaints. Parallel bitmap heap only ever waits once so it's probably fine. If we do need a back-patchable fix, I suppose slock_t mutex could be replaced by pg_atomic_uint32 state. I think that would avoid changing the size of the structure on common platforms, though obscure systems with spinlocks > 4 bytes might be affected. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company