On Wed, Jul 26, 2023 at 1:27 AM Andres Freund <and...@anarazel.de> wrote:
>
> > 0001 has been now applied. I have done more tests while looking at
> > this patch since yesterday and was surprised to see higher TPS numbers
> > on HEAD with the same tests as previously, and the patch was still
> > shining with more than 256 clients.
>
> Just a small heads up:
>
> I just rebased my aio tree over the commit and promptly, on the first run, saw
> a hang. I did some debugging on that. Unfortunately repeated runs haven't
> repeated that hang, despite quite a bit of trying.

Hm. Please share workload details, test scripts, system info and any
special settings for running in my setup.

> The symptom I was seeing is that all running backends were stuck in
> LWLockWaitForVar(), even though the value they're waiting for had
> changed. Which obviously "shouldn't be possible".

Were the backends stuck there indefinitely? IOW, did they get into a
deadlock?

> It's of course possible that this is AIO specific, but I didn't see anything
> in stacks to suggest that.
>
> I do wonder if this possibly exposed an undocumented prior dependency on the
> value update always happening under the list lock.

I'm going through the other thread mentioned by Michael Paquier. I'm
wondering if the deadlock issue illustrated here
https://www.postgresql.org/message-id/55BB50D3.9000702%40iki.fi
is showing up again, because 71e4cc6b8e reduced the contention on
waitlist lock and made things *a bit* faster.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com