Hi all, I've been working on a new system view, pg_stat_lwlocks, to observe LWLock, and just completed my 'proof-of-concept' code that can work with version 9.1.
Now, I'd like to know the possibility of this feature for future release. With this patch, DBA can easily determine a bottleneck around lwlocks. -------------------------------------------------- postgres=# SELECT * FROM pg_stat_lwlocks ORDER BY time_ms DESC LIMIT 10; lwlockid | calls | waits | time_ms ----------+--------+-------+--------- 49 | 193326 | 32096 | 23688 8 | 3305 | 133 | 1335 2 | 21 | 0 | 0 4 | 135188 | 0 | 0 5 | 57935 | 0 | 0 6 | 141 | 0 | 0 7 | 24580 | 1 | 0 3 | 3282 | 0 | 0 1 | 41 | 0 | 0 9 | 3 | 0 | 0 (10 rows) postgres=# -------------------------------------------------- In this view, 'lwlockid' column represents LWLockId used in the backends. 'calls' represents how many times LWLockAcquire() was called. 'waits' represents how many times LWLockAcquire() needed to wait within it before lock acquisition. 'time_ms' represents how long LWLockAcquire() totally waited on a lwlock. And lwlocks that use a LWLockId range, such as BufMappingLock or LockMgrLock, would be grouped and summed up in a single record. For example, lwlockid 49 in the above view represents LockMgrLock statistics. Now, I know there are some considerations. (1) Performance I've measured LWLock performance both with and without the patch, and confirmed that this patch does not affect the LWLock perfomance at all. pgbench scores with the patch: tps = 900.906658 (excluding connections establishing) tps = 908.528422 (excluding connections establishing) tps = 903.900977 (excluding connections establishing) tps = 910.470595 (excluding connections establishing) tps = 909.685396 (excluding connections establishing) pgbench scores without the patch: tps = 909.096785 (excluding connections establishing) tps = 894.868712 (excluding connections establishing) tps = 910.074669 (excluding connections establishing) tps = 904.022770 (excluding connections establishing) tps = 895.673830 (excluding connections establishing) Of course, this experiment was not I/O bound, and the cache hit ratio was >99.9%. (2) Memory space In this patch, I added three new members to LWLock structure as uint64 to collect statistics. It means that those members must be held in the shared memory, but I'm not sure whether it's appropriate. I think another possible option is holding those statistics values in local (backend) process memory, and send them through the stat collector process (like other statistics values). (3) LWLock names (or labels) Now, pg_stat_lwlocks view shows LWLockId itself. But LWLockId is not easy for DBA to determine actual lock type. So, I want to show LWLock names (or labels), like 'WALWriteLock' or 'LockMgrLock', but how should I implement it? Any comments? Regards, -- Satoshi Nagayasu <sn...@uptime.jp> Uptime Technologies, LLC. http://www.uptime.jp
diff -rc postgresql-9.1.2.orig/src/backend/catalog/postgres.bki postgresql-9.1.2/src/backend/catalog/postgres.bki *** postgresql-9.1.2.orig/src/backend/catalog/postgres.bki 2012-06-20 03:32:46.000000000 +0900 --- postgresql-9.1.2/src/backend/catalog/postgres.bki 2012-06-26 01:51:52.000000000 +0900 *************** *** 1553,1558 **** --- 1553,1559 ---- insert OID = 3071 ( pg_xlog_replay_pause 11 10 12 1 0 0 f f f t f v 0 0 2278 "" _null_ _null_ _null_ _null_ pg_xlog_replay_pause _null_ _null_ _null_ ) insert OID = 3072 ( pg_xlog_replay_resume 11 10 12 1 0 0 f f f t f v 0 0 2278 "" _null_ _null_ _null_ _null_ pg_xlog_replay_resume _null_ _null_ _null_ ) insert OID = 3073 ( pg_is_xlog_replay_paused 11 10 12 1 0 0 f f f t f v 0 0 16 "" _null_ _null_ _null_ _null_ pg_is_xlog_replay_paused _null_ _null_ _null_ ) + insert OID = 3764 ( pg_stat_get_lwlocks 11 10 12 1 100 0 f f f f t s 0 0 2249 "" "{20,20,20,20}" "{o,o,o,o}" "{lwlockid,calls,waits,time_ms}" _null_ pg_stat_get_lwlocks _null_ _null_ _null_ ) insert OID = 2621 ( pg_reload_conf 11 10 12 1 0 0 f f f t f v 0 0 16 "" _null_ _null_ _null_ _null_ pg_reload_conf _null_ _null_ _null_ ) insert OID = 2622 ( pg_rotate_logfile 11 10 12 1 0 0 f f f t f v 0 0 16 "" _null_ _null_ _null_ _null_ pg_rotate_logfile _null_ _null_ _null_ ) insert OID = 2623 ( pg_stat_file 11 10 12 1 0 0 f f f t f v 1 0 2249 "25" "{25,20,1184,1184,1184,1184,16}" "{i,o,o,o,o,o,o}" "{filename,size,access,modification,change,creation,isdir}" _null_ pg_stat_file _null_ _null_ _null_ ) diff -rc postgresql-9.1.2.orig/src/backend/catalog/postgres.description postgresql-9.1.2/src/backend/catalog/postgres.description *** postgresql-9.1.2.orig/src/backend/catalog/postgres.description 2012-06-20 03:32:46.000000000 +0900 --- postgresql-9.1.2/src/backend/catalog/postgres.description 2012-06-26 01:51:52.000000000 +0900 *************** *** 947,952 **** --- 947,953 ---- 3071 pg_proc 0 pause xlog replay 3072 pg_proc 0 resume xlog replay, if it was paused 3073 pg_proc 0 true if xlog replay is paused + 3764 pg_proc 0 light-weight lock statistics 2621 pg_proc 0 reload configuration files 2622 pg_proc 0 rotate log file 2623 pg_proc 0 get information about file diff -rc postgresql-9.1.2.orig/src/backend/catalog/system_views.sql postgresql-9.1.2/src/backend/catalog/system_views.sql *** postgresql-9.1.2.orig/src/backend/catalog/system_views.sql 2012-06-20 03:32:46.000000000 +0900 --- postgresql-9.1.2/src/backend/catalog/system_views.sql 2012-06-26 03:35:59.000000000 +0900 *************** *** 594,599 **** --- 594,607 ---- pg_stat_get_buf_alloc() AS buffers_alloc, pg_stat_get_bgwriter_stat_reset_time() AS stats_reset; + CREATE VIEW pg_stat_lwlocks AS + SELECT + S.lwlockid, + S.calls, + S.waits, + S.time_ms + FROM pg_stat_get_lwlocks() AS S; + CREATE VIEW pg_user_mappings AS SELECT U.oid AS umid, diff -rc postgresql-9.1.2.orig/src/backend/storage/lmgr/lwlock.c postgresql-9.1.2/src/backend/storage/lmgr/lwlock.c *** postgresql-9.1.2.orig/src/backend/storage/lmgr/lwlock.c 2012-06-20 03:32:46.000000000 +0900 --- postgresql-9.1.2/src/backend/storage/lmgr/lwlock.c 2012-06-26 01:57:28.000000000 +0900 *************** *** 32,37 **** --- 32,38 ---- #include "storage/proc.h" #include "storage/spin.h" + #include <sys/time.h> /* We use the ShmemLock spinlock to protect LWLockAssign */ extern slock_t *ShmemLock; *************** *** 46,51 **** --- 47,57 ---- PGPROC *head; /* head of list of waiting PGPROCs */ PGPROC *tail; /* tail of list of waiting PGPROCs */ /* tail is undefined when head is NULL */ + + /* statistics stuff */ + uint64 calls; + uint64 waits; + uint64 time_ms; } LWLock; /* *************** *** 268,273 **** --- 274,282 ---- lock->lock.shared = 0; lock->lock.head = NULL; lock->lock.tail = NULL; + lock->lock.calls = 0; + lock->lock.waits = 0; + lock->lock.time_ms = 0; } /* *************** *** 323,330 **** --- 332,341 ---- PGPROC *proc = MyProc; bool retry = false; int extraWaits = 0; + struct timeval t1,t2; PRINT_LWDEBUG("LWLockAcquire", lockid, lock); + lock->calls++; #ifdef LWLOCK_STATS /* Set up local count state first time through in a given process */ *************** *** 457,462 **** --- 468,475 ---- #endif TRACE_POSTGRESQL_LWLOCK_WAIT_START(lockid, mode); + lock->waits++; + gettimeofday(&t1, NULL); for (;;) { *************** *** 468,473 **** --- 481,488 ---- } TRACE_POSTGRESQL_LWLOCK_WAIT_DONE(lockid, mode); + gettimeofday(&t2, NULL); + lock->time_ms += (t2.tv_sec-t1.tv_sec)*1000 + (t2.tv_usec-t1.tv_usec)/1000; LOG_LWDEBUG("LWLockAcquire", lockid, "awakened"); *************** *** 701,703 **** --- 716,752 ---- } return false; } + + void + lwlock_get_stat(LWLockId lockid, uint64 *calls, uint64 *waits, uint64 *time_ms) + { + volatile LWLock *lock = &(LWLockArray[lockid].lock); + + SpinLockAcquire(&lock->mutex); + *calls = lock->calls; + *waits = lock->waits; + *time_ms = lock->time_ms; + SpinLockRelease(&lock->mutex); + } + + void + lwlock_get_stat_sum(LWLockId start, LWLockId end, uint64 *calls, uint64 *waits, uint64 *time_ms) + { + LWLockId lockid; + + *calls = 0; + *waits = 0; + *time_ms = 0; + + for (lockid=start ; lockid<=end ; lockid++) + { + uint64 calls2, waits2, time_ms2; + + lwlock_get_stat(lockid, &calls2, &waits2, &time_ms2); + + *calls += calls2; + *waits += waits2; + *time_ms += time_ms2; + } + } + diff -rc postgresql-9.1.2.orig/src/backend/utils/adt/pgstatfuncs.c postgresql-9.1.2/src/backend/utils/adt/pgstatfuncs.c *** postgresql-9.1.2.orig/src/backend/utils/adt/pgstatfuncs.c 2012-06-20 03:32:50.000000000 +0900 --- postgresql-9.1.2/src/backend/utils/adt/pgstatfuncs.c 2012-06-26 01:59:17.000000000 +0900 *************** *** 109,114 **** --- 109,116 ---- extern Datum pg_stat_reset_single_table_counters(PG_FUNCTION_ARGS); extern Datum pg_stat_reset_single_function_counters(PG_FUNCTION_ARGS); + extern Datum pg_stat_get_lwlocks(PG_FUNCTION_ARGS); + /* Global bgwriter statistics, from bgwriter.c */ extern PgStat_MsgBgWriter bgwriterStats; *************** *** 1572,1574 **** --- 1574,1670 ---- PG_RETURN_VOID(); } + + + Datum + pg_stat_get_lwlocks(PG_FUNCTION_ARGS) + { + FuncCallContext *funcctx; + + /* stuff done only on the first call of the function */ + if (SRF_IS_FIRSTCALL()) + { + MemoryContext oldcontext; + TupleDesc tupdesc; + + /* create a function context for cross-call persistence */ + funcctx = SRF_FIRSTCALL_INIT(); + + oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx); + + tupdesc = CreateTemplateTupleDesc(4, false); + TupleDescInitEntry(tupdesc, (AttrNumber) 1, "lockid", + INT8OID, -1, 0); + TupleDescInitEntry(tupdesc, (AttrNumber) 2, "calls", + INT8OID, -1, 0); + TupleDescInitEntry(tupdesc, (AttrNumber) 3, "waits", + INT8OID, -1, 0); + TupleDescInitEntry(tupdesc, (AttrNumber) 4, "time_ms", + INT8OID, -1, 0); + + funcctx->tuple_desc = BlessTupleDesc(tupdesc); + funcctx->max_calls = NumLWLocks(); + + MemoryContextSwitchTo(oldcontext); + } + + /* stuff done on every call of the function */ + funcctx = SRF_PERCALL_SETUP(); + + if (funcctx->call_cntr < funcctx->max_calls) + { + Datum values[4]; + bool nulls[4]; + HeapTuple tuple; + LWLockId lockid; + uint64 calls,waits,time_ms; + + MemSet(values, 0, sizeof(values)); + MemSet(nulls, 0, sizeof(nulls)); + + lockid = funcctx->call_cntr; + + if ( lockid<FirstBufMappingLock ) + { + lwlock_get_stat(lockid, &calls, &waits, &time_ms); + } + else if ( FirstBufMappingLock<=lockid && lockid<FirstLockMgrLock ) + { + lwlock_get_stat_sum(FirstBufMappingLock, FirstLockMgrLock-1, + &calls, &waits, &time_ms); + funcctx->call_cntr = FirstLockMgrLock - 1; + } + else if ( FirstLockMgrLock<=lockid && lockid<FirstPredicateLockMgrLock ) + { + lwlock_get_stat_sum(FirstLockMgrLock, FirstPredicateLockMgrLock-1, + &calls, &waits, &time_ms); + funcctx->call_cntr = FirstPredicateLockMgrLock - 1; + } + else if ( FirstPredicateLockMgrLock<=lockid && lockid<NumFixedLWLocks ) + { + lwlock_get_stat_sum(FirstPredicateLockMgrLock, NumFixedLWLocks-1, + &calls, &waits, &time_ms); + funcctx->call_cntr = NumFixedLWLocks - 1; + } + else if ( NumFixedLWLocks<=lockid && lockid<NumLWLocks() ) + { + lwlock_get_stat_sum(NumFixedLWLocks, NumLWLocks()-1, + &calls, &waits, &time_ms); + funcctx->call_cntr = NumLWLocks() - 1; + } + + values[0] = Int64GetDatum(lockid); + values[1] = Int64GetDatum(calls); + values[2] = Int64GetDatum(waits); + values[3] = Int64GetDatum(time_ms); + + tuple = heap_form_tuple(funcctx->tuple_desc, values, nulls); + + SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(tuple)); + } + else + { + SRF_RETURN_DONE(funcctx); + } + } + diff -rc postgresql-9.1.2.orig/src/backend/utils/fmgroids.h postgresql-9.1.2/src/backend/utils/fmgroids.h *** postgresql-9.1.2.orig/src/backend/utils/fmgroids.h 2012-06-20 03:32:50.000000000 +0900 --- postgresql-9.1.2/src/backend/utils/fmgroids.h 2012-06-26 01:51:52.000000000 +0900 *************** *** 1977,1982 **** --- 1977,1983 ---- #define F_GET_CURRENT_TS_CONFIG 3759 #define F_TS_MATCH_TT 3760 #define F_TS_MATCH_TQ 3761 + #define F_PG_STAT_GET_LWLOCKS 3764 #define F_PG_TS_TEMPLATE_IS_VISIBLE 3768 #define F_REGDICTIONARYIN 3771 #define F_REGDICTIONARYOUT 3772 diff -rc postgresql-9.1.2.orig/src/backend/utils/fmgrtab.c postgresql-9.1.2/src/backend/utils/fmgrtab.c *** postgresql-9.1.2.orig/src/backend/utils/fmgrtab.c 2012-06-20 03:32:46.000000000 +0900 --- postgresql-9.1.2/src/backend/utils/fmgrtab.c 2012-06-26 01:51:52.000000000 +0900 *************** *** 1966,1971 **** --- 1966,1972 ---- extern Datum get_current_ts_config (PG_FUNCTION_ARGS); extern Datum ts_match_tt (PG_FUNCTION_ARGS); extern Datum ts_match_tq (PG_FUNCTION_ARGS); + extern Datum pg_stat_get_lwlocks (PG_FUNCTION_ARGS); extern Datum pg_ts_template_is_visible (PG_FUNCTION_ARGS); extern Datum regdictionaryin (PG_FUNCTION_ARGS); extern Datum regdictionaryout (PG_FUNCTION_ARGS); *************** *** 4156,4161 **** --- 4157,4163 ---- { 3759, "get_current_ts_config", 0, true, false, get_current_ts_config }, { 3760, "ts_match_tt", 2, true, false, ts_match_tt }, { 3761, "ts_match_tq", 2, true, false, ts_match_tq }, + { 3764, "pg_stat_get_lwlocks", 0, false, true, pg_stat_get_lwlocks }, { 3768, "pg_ts_template_is_visible", 1, true, false, pg_ts_template_is_visible }, { 3771, "regdictionaryin", 1, true, false, regdictionaryin }, { 3772, "regdictionaryout", 1, true, false, regdictionaryout }, diff -rc postgresql-9.1.2.orig/src/include/catalog/pg_proc.h postgresql-9.1.2/src/include/catalog/pg_proc.h *** postgresql-9.1.2.orig/src/include/catalog/pg_proc.h 2012-06-20 03:32:43.000000000 +0900 --- postgresql-9.1.2/src/include/catalog/pg_proc.h 2012-06-26 01:51:46.000000000 +0900 *************** *** 2881,2886 **** --- 2881,2889 ---- DATA(insert OID = 3073 ( pg_is_xlog_replay_paused PGNSP PGUID 12 1 0 0 f f f t f v 0 0 16 "" _null_ _null_ _null_ _null_ pg_is_xlog_replay_paused _null_ _null_ _null_ )); DESCR("true if xlog replay is paused"); + DATA(insert OID = 3764 ( pg_stat_get_lwlocks PGNSP PGUID 12 1 100 0 f f f f t s 0 0 2249 "" "{20,20,20,20}" "{o,o,o,o}" "{lwlockid,calls,waits,time_ms}" _null_ pg_stat_get_lwlocks _null_ _null_ _null_ )); + DESCR("light-weight lock statistics"); + DATA(insert OID = 2621 ( pg_reload_conf PGNSP PGUID 12 1 0 0 f f f t f v 0 0 16 "" _null_ _null_ _null_ _null_ pg_reload_conf _null_ _null_ _null_ )); DESCR("reload configuration files"); DATA(insert OID = 2622 ( pg_rotate_logfile PGNSP PGUID 12 1 0 0 f f f t f v 0 0 16 "" _null_ _null_ _null_ _null_ pg_rotate_logfile _null_ _null_ _null_ )); diff -rc postgresql-9.1.2.orig/src/include/storage/lwlock.h postgresql-9.1.2/src/include/storage/lwlock.h *** postgresql-9.1.2.orig/src/include/storage/lwlock.h 2012-06-20 03:32:43.000000000 +0900 --- postgresql-9.1.2/src/include/storage/lwlock.h 2012-06-26 01:57:22.000000000 +0900 *************** *** 115,118 **** --- 115,121 ---- extern void RequestAddinLWLocks(int n); + extern void lwlock_get_stat(LWLockId, uint64 *, uint64 *, uint64 *); + extern void lwlock_get_stat_sum(LWLockId, LWLockId, uint64 *, uint64 *, uint64 *); + #endif /* LWLOCK_H */
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers