On Tue, Oct 7, 2014 at 4:12 PM, Andres Freund <and...@2ndquadrant.com> wrote:
>> I think the easiest way to measure lwlock contention would be to put
>> some counters in the lwlock itself. My guess, based on a lot of
>> fiddling with LWLOCK_STATS over the years, is that there's no way to
>> count lock acquisitions and releases without harming performance
>> significantly - no matter where we put the counters, it's just going
>> to be too expensive. However, I believe that incrementing a counter -
>> even in the lwlock itself - might not be too expensive if we only do
>> it when (1) a process goes to sleep or (2) spindelays occur.
>
> Increasing the size will be painful on its own :(.

I am afraid that in this case we should think about minimizing the overhead rather than avoiding it altogether: a DBA-friendly feature like this is worth it.

Let me step back a bit, since the discussion has gone into details while the overall design idea remains unclear. What do we actually need: the fact that an lwlock was acquired? A lock count? Time spent in the lock? Overall lock duration?

The usual way to explain how any such performance tool works is the traffic example (any DBA familiar with the Oracle/DB2 wait interface knows it):

You have a route from home to the office that takes an hour. You try to optimize it and find that although you take a highway with no speed limit, you usually get stuck in traffic at the turn from the highway to your office and spend about 10-30 minutes there. The alternative is another route with two speed-limit zones and one traffic light: you lose 2 and 5 minutes in the speed-limit zones and 2 minutes at the red light, about 9 minutes in total. That is better overall than 30 minutes in a jam, and even better than 10 minutes in a jam.

That is what this is all about: to find a bottleneck we need to know that a process holds a certain lock, and whether it was held for a certain time or there are a lot of shorter locks.

I think that sampling even 1-2 times per second and building some sort of histogram is good enough at the moment, because it shows (not very precisely, however) that a process holds a certain lock, and whether it was held for a certain time or there are a lot of shorter locks.

After that it is possible to implement something more precise. (As far as I know, Greg Smith is working on some sort of wait events, but it seems to me there is a lot of work to do to implement an exact analog of OWI.)

-- 
Ilya Kosmodemiansky,
PostgreSQL-Consulting.com
tel. +14084142500
cell. +4915144336040
i...@postgresql-consulting.com
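
For illustration, the sampling idea described above (poll once or twice per second, then build a histogram) could be sketched roughly as follows. This is only a sketch under assumptions: the function name `estimate_wait_seconds`, the snapshot format (a mapping from backend pid to the lock it is waiting on, or `None`), and the lock names in the usage note are hypothetical, and nothing here corresponds to an existing PostgreSQL interface.

```python
from collections import Counter


def estimate_wait_seconds(snapshots, interval_s=0.5):
    """Build a per-lock histogram from periodic samples.

    snapshots  -- iterable of dicts {pid: lock_name or None}, one per sample
                  (hypothetical format, assumed for this sketch)
    interval_s -- time between samples; 0.5 means two samples per second

    Returns {lock_name: estimated seconds spent waiting}. Each sample hit
    is credited with one full interval, so the result is approximate.
    """
    hits = Counter()
    for snapshot in snapshots:
        for lock in snapshot.values():
            if lock is not None:
                hits[lock] += 1
    return {lock: count * interval_s for lock, count in hits.items()}
```

Fed with four snapshots of two backends, where one backend is seen on an illustrative lock "WALInsertLock" in three samples and the other on "ProcArrayLock" in one, the function attributes three intervals to the first lock and one to the second. A lock that keeps showing up across samples is the likely bottleneck, although sampling alone cannot tell one long wait from many short ones.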