Re: LWLocks by LockManager slowing large DB

2021-04-12 Thread Nikolay Samokhvalov
On Mon, Apr 12, 2021 at 14:57 Andres Freund wrote:
> Without knowing the proportion of LockManager wait events compared to
> the rest it's hard to know what to make of it.

These OSS tools can be useful to understand the proportion:
- pgCenter https://github.com/lesovsky/pgcenter
- pg_wait_samp
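
For readers who want a rough idea of what "proportion of wait events" means in practice, here is a minimal sketch in Python. It is not how pgCenter or any other tool computes it. It assumes you have already sampled rows of (wait_event_type, wait_event) yourself, for example by repeatedly querying pg_stat_activity for active backends, and that a NULL wait event means the backend was not waiting. The function name and the sample rows are made up for illustration and are not data from this thread.

```python
from collections import Counter


def wait_event_proportions(samples):
    """Return {label: fraction} for a list of (wait_event_type, wait_event) rows.

    Hypothetical helper: rows with a None wait event are counted as
    "not waiting" (assumption: the backend was running, not blocked).
    """
    counts = Counter(
        "not waiting" if event is None else f"{etype}:{event}"
        for etype, event in samples
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


# Made-up sample rows for illustration only.
example = [
    ("LWLock", "LockManager"),
    ("LWLock", "LockManager"),
    (None, None),
    ("IO", "DataFileRead"),
]

# Print the share of each wait event, largest first.
shares = wait_event_proportions(example)
for label, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{label:25s} {share:.0%}")
```

The more samples are collected over the steady-state period, the more meaningful the shares become; a single snapshot can be misleading.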

Re: LWLocks by LockManager slowing large DB

2021-04-12 Thread Andres Freund
Hi,

On 2021-04-12 15:56:08 -0700, Paul Friedman wrote:
> Also, I didn't understand your comment about a 'futex profile', could you
> point me in the right direction here?

My earlier mail included a section about running a perf profile showing the callers of the futex() system call, which in turn

RE: LWLocks by LockManager slowing large DB

2021-04-12 Thread Paul Friedman
Yes, I ran the query after a couple of minutes. Those are the steady-state numbers.

Also 'top' shows:

top - 22:44:26 up 12 days, 23:14, 5 users, load average: 20.99, 21.35, 19.27
Tasks: 859 total, 26 running, 539 sleeping, 0 stopped, 0 zombie
%Cpu(s): 34.3 us, 1.6 sy, 0.0 ni, 64.1 id,

Re: LWLocks by LockManager slowing large DB

2021-04-12 Thread Andres Freund
Hi,

On 2021-04-12 15:15:05 -0700, Paul Friedman wrote:
> Thanks again for any advice you have.

I think we'd need the perf profiles to be able to dig into this further. It's odd that there is a meaningful amount of LockManager contention in your case - assuming the stats you collected weren't ju

RE: LWLocks by LockManager slowing large DB

2021-04-12 Thread Paul Friedman
Thanks for the quick reply! These queries take ~1hr and are the only thing running on the system (all 60 are launched at the same time and the tables/files are fully-primed into memory so iowaits are basically zero). Yes, that’s the same query I’ve been running to analyze the locks and this i

Re: LWLocks by LockManager slowing large DB

2021-04-12 Thread Andres Freund
Hi,

On 2021-04-12 12:37:42 -0700, Paul Friedman wrote:
> Boiling the complex queries down to their simplest form, we test running 60
> of this query simultaneously:

How long does one execution of these queries take (on average)? The likely bottlenecks are very different between running 60 concurr

LWLocks by LockManager slowing large DB

2021-04-12 Thread Paul Friedman
Hello, apologies for the long post, but I want to make sure I’ve got enough details to describe the problem for y’all. I’ve got a 64-core instance (Ubuntu 18.04, 240GB RAM, running at GCP) running PG 13.2 and PostGIS 3.1.1 and we’re having trouble getting it to run more than 30 or so large quer