2011/12/10 10:54, Greg Smith wrote:
On 12/08/2011 09:48 AM, Satoshi Nagayasu wrote:
For example, I've been investigating PostgreSQL LWLock behavior in detail
for a few weeks. That information cannot be obtained from within PostgreSQL
itself, so I turned to SystemTap. However, SystemTap cannot be used on a
production system, because it often kills the target processes. :(
How can I observe LWLocks on a production system?

I decided about a year ago that further work on using SystemTap was a black hole: time 
goes in, nothing really usable on any production server seems to come out. It can be 
useful for collecting data in a developer context. But the sort of problems people are 
more interested in all involve "why is the production server doing this?", and,
as you've also discovered, the only reasonable answer so far doesn't involve SystemTap; it 
involves DTrace and either Solaris or FreeBSD (or Mac OS, for smaller server hardware 
deployments). Since those platforms are problematic to run database servers on in many 
cases, that doesn't help very much.

Absolutely. SystemTap would be useful if I were able to reproduce the situation
outside the production system. However, in most cases, that is actually
difficult.

I'm planning to put that instrumentation into the database directly, which is 
what people with an Oracle background are asking for. There are two underlying 
low-level problems to solve before even starting that:

-How can the overhead of collecting the timing data be kept down? It's really high in 
some places. This is being worked out right now on pgsql-hackers; see "Timing 
overhead and Linux clock sources".

-How do you log the potentially large amount of data collected without killing server 
performance? Initial discussions are also happening right now; see "logging in high 
performance systems".
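
To make the first problem concrete, here is a minimal sketch (Python, not code from the thread or from PostgreSQL) that estimates what a single clock read costs on a given machine and what that implies when every timed event needs a start and a stop timestamp. The function names and the figure of 1000 timed events per query are hypothetical, chosen only for illustration. The measured number also includes Python's own loop and call overhead, so treat it as an upper bound rather than the raw cost of the clock source.

```python
import time


def clock_overhead_ns(clock=time.perf_counter_ns, calls=100_000):
    """Estimate the average cost, in nanoseconds, of one call to `clock`.

    The result includes interpreter loop overhead, so it is an upper
    bound on the cost of the underlying clock source.
    """
    start = time.perf_counter_ns()
    for _ in range(calls):
        clock()
    elapsed = time.perf_counter_ns() - start
    return elapsed / calls


def instrumented_cost_ns(events_per_query, overhead_ns):
    """Extra time added to one query by timing each event.

    Assumes two clock reads (start and stop) per timed event.
    """
    return 2 * events_per_query * overhead_ns


if __name__ == "__main__":
    per_call = clock_overhead_ns()
    # 1000 timed events per query is a made-up figure for illustration.
    extra = instrumented_cost_ns(1000, per_call)
    print(f"one clock read: about {per_call:.0f} ns")
    print(f"1000 timed events per query: about {extra / 1000:.1f} us extra")
```

Running this on the same host under different clock sources is one rough way to see how widely the per-event cost can vary.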

I feel this will increasingly be the top blocker for performance-sensitive 
deployments in the coming year; people used to having these tools in Oracle 
cannot imagine how they would operate without them. One of my big-picture 
goals is to have this available as a compile-time option starting in PostgreSQL 
9.3 in 2013, piggybacked off the existing DTrace support. And the earlier the 
better--since many migrations have a long lead time, just knowing it's coming 
in the next version would be good enough for some people who are blocked right 
now to start working on theirs.

I'm glad to hear that. I'm very interested in focusing on it,
and will follow the threads. Thanks.

--
NAGAYASU Satoshi <satoshi.nagay...@gmail.com>

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
