On Fri, Sep 13, 2013 at 3:20 PM, Josh Berkus <j...@agliodbs.com> wrote: > On 09/13/2013 12:55 PM, Merlin Moncure wrote: >> what are the specific symptoms of your problem? anything interesting >> in pg_locks? is $client willing to experiment with custom patches? > > 3 servers: 1 master, two replicas. > 32-core Xeon, hyperthreaded to 64 cores > 512GB RAM each > s_b set to 8GB > Load-balanced between all 3 > ~~ 11 different databases > combined database size around 600GB > using pgbouncer > > Irregularly, during periods of high activity (although not necessarily > peak activity) one or another of the systems will go into paralysis, > with all backends apparently waiting on LWLocks (we need more tracing > information to completely confirm this). Activity at this time is > usually somewhere between 50 and 100 concurrent queries (and 80 to 150 > connections). pg_locks doesn't think anything is waiting on a lock. > > What's notable is that sometimes it's just *one* of the replicas which > goes into paralysis. If the master gets this issue though, the replicas > experience it soon afterwards. Increasing wal_buffers from 16GB to 64GB > seems to make this issue happen less frequently, but it doesn't go away > entirely. Only a restart of the server, or killing all backend, ends > the lockup. > > The workload is OLTP, essentially, around 20/80 write/read. They use > PostGIS. The other notable thing about their workload is that due to an > ORM defect, they get idle-in-transactions which last from 5 to 15 > seconds several times a minute. > > They are willing to use experimental patches, but only if those patches > can be applied only to a replica.
ok, points similar: *) master/slave config (two slaves for me) *) 'big' server 256GB mem, 32 core *) 80% approx. (perhaps more) *) some spacial searching (but not very much) *) OLTP *) presentation of load, although in my case it did resolve anywhere from 30 secs to half hour *) aside from the spike, 100% healthy points different *) application side pooling: 96 app servers, max 5 connections each (aside: are you using transaction mode pgbouncer?) *) I saw gripes about relation extension in pg_locks merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers