Hi all,

I've been working on a new system view, pg_stat_lwlocks, to observe
LWLock, and just completed my 'proof-of-concept' code that can work
with version 9.1.

Now, I'd like to know the possibility of this feature for future
release.

With this patch, DBA can easily determine a bottleneck around lwlocks.
--------------------------------------------------
postgres=# SELECT * FROM pg_stat_lwlocks ORDER BY time_ms DESC LIMIT 10;
 lwlockid | calls  | waits | time_ms
----------+--------+-------+---------
       49 | 193326 | 32096 |   23688
        8 |   3305 |   133 |    1335
        2 |     21 |     0 |       0
        4 | 135188 |     0 |       0
        5 |  57935 |     0 |       0
        6 |    141 |     0 |       0
        7 |  24580 |     1 |       0
        3 |   3282 |     0 |       0
        1 |     41 |     0 |       0
        9 |      3 |     0 |       0
(10 rows)

postgres=#
--------------------------------------------------

In this view,
  'lwlockid' column represents LWLockId used in the backends.
  'calls' represents how many times LWLockAcquire() was called.
  'waits' represents how many times LWLockAcquire() needed to wait
  within it before lock acquisition.
  'time_ms' represents how long LWLockAcquire() totally waited on
  a lwlock.

And lwlocks that use a LWLockId range, such as BufMappingLock or
LockMgrLock, would be grouped and summed up in a single record.
For example, lwlockid 49 in the above view represents LockMgrLock
statistics.

Now, I know there are some considerations.

(1) Performance

  I've measured LWLock performance both with and without the patch,
  and confirmed that this patch does not affect the LWLock perfomance
  at all.

  pgbench scores with the patch:
    tps = 900.906658 (excluding connections establishing)
    tps = 908.528422 (excluding connections establishing)
    tps = 903.900977 (excluding connections establishing)
    tps = 910.470595 (excluding connections establishing)
    tps = 909.685396 (excluding connections establishing)

  pgbench scores without the patch:
    tps = 909.096785 (excluding connections establishing)
    tps = 894.868712 (excluding connections establishing)
    tps = 910.074669 (excluding connections establishing)
    tps = 904.022770 (excluding connections establishing)
    tps = 895.673830 (excluding connections establishing)

  Of course, this experiment was not I/O bound, and the cache hit ratio
  was >99.9%.

(2) Memory space

  In this patch, I added three new members to LWLock structure
  as uint64 to collect statistics.

  It means that those members must be held in the shared memory,
  but I'm not sure whether it's appropriate.

  I think another possible option is holding those statistics
  values in local (backend) process memory, and send them through
  the stat collector process (like other statistics values).

(3) LWLock names (or labels)

  Now, pg_stat_lwlocks view shows LWLockId itself. But LWLockId is
  not easy for DBA to determine actual lock type.

  So, I want to show LWLock names (or labels), like 'WALWriteLock'
  or 'LockMgrLock', but how should I implement it?

Any comments?

Regards,
-- 
Satoshi Nagayasu <sn...@uptime.jp>
Uptime Technologies, LLC. http://www.uptime.jp
diff -rc postgresql-9.1.2.orig/src/backend/catalog/postgres.bki 
postgresql-9.1.2/src/backend/catalog/postgres.bki
*** postgresql-9.1.2.orig/src/backend/catalog/postgres.bki      2012-06-20 
03:32:46.000000000 +0900
--- postgresql-9.1.2/src/backend/catalog/postgres.bki   2012-06-26 
01:51:52.000000000 +0900
***************
*** 1553,1558 ****
--- 1553,1559 ----
  insert OID = 3071 ( pg_xlog_replay_pause 11 10 12 1 0 0 f f f t f v 0 0 2278 
"" _null_ _null_ _null_ _null_ pg_xlog_replay_pause _null_ _null_ _null_ )
  insert OID = 3072 ( pg_xlog_replay_resume 11 10 12 1 0 0 f f f t f v 0 0 2278 
"" _null_ _null_ _null_ _null_ pg_xlog_replay_resume _null_ _null_ _null_ )
  insert OID = 3073 ( pg_is_xlog_replay_paused 11 10 12 1 0 0 f f f t f v 0 0 
16 "" _null_ _null_ _null_ _null_ pg_is_xlog_replay_paused _null_ _null_ _null_ 
)
+ insert OID = 3764 ( pg_stat_get_lwlocks 11 10 12 1 100 0 f f f f t s 0 0 2249 
"" "{20,20,20,20}" "{o,o,o,o}" "{lwlockid,calls,waits,time_ms}" _null_ 
pg_stat_get_lwlocks _null_ _null_ _null_ )
  insert OID = 2621 ( pg_reload_conf 11 10 12 1 0 0 f f f t f v 0 0 16 "" 
_null_ _null_ _null_ _null_ pg_reload_conf _null_ _null_ _null_ )
  insert OID = 2622 ( pg_rotate_logfile 11 10 12 1 0 0 f f f t f v 0 0 16 "" 
_null_ _null_ _null_ _null_ pg_rotate_logfile _null_ _null_ _null_ )
  insert OID = 2623 ( pg_stat_file 11 10 12 1 0 0 f f f t f v 1 0 2249 "25" 
"{25,20,1184,1184,1184,1184,16}" "{i,o,o,o,o,o,o}" 
"{filename,size,access,modification,change,creation,isdir}" _null_ pg_stat_file 
_null_ _null_ _null_ )
diff -rc postgresql-9.1.2.orig/src/backend/catalog/postgres.description 
postgresql-9.1.2/src/backend/catalog/postgres.description
*** postgresql-9.1.2.orig/src/backend/catalog/postgres.description      
2012-06-20 03:32:46.000000000 +0900
--- postgresql-9.1.2/src/backend/catalog/postgres.description   2012-06-26 
01:51:52.000000000 +0900
***************
*** 947,952 ****
--- 947,953 ----
  3071  pg_proc 0       pause xlog replay
  3072  pg_proc 0       resume xlog replay, if it was paused
  3073  pg_proc 0       true if xlog replay is paused
+ 3764  pg_proc 0       light-weight lock statistics
  2621  pg_proc 0       reload configuration files
  2622  pg_proc 0       rotate log file
  2623  pg_proc 0       get information about file
diff -rc postgresql-9.1.2.orig/src/backend/catalog/system_views.sql 
postgresql-9.1.2/src/backend/catalog/system_views.sql
*** postgresql-9.1.2.orig/src/backend/catalog/system_views.sql  2012-06-20 
03:32:46.000000000 +0900
--- postgresql-9.1.2/src/backend/catalog/system_views.sql       2012-06-26 
03:35:59.000000000 +0900
***************
*** 594,599 ****
--- 594,607 ----
          pg_stat_get_buf_alloc() AS buffers_alloc,
          pg_stat_get_bgwriter_stat_reset_time() AS stats_reset;
  
+ CREATE VIEW pg_stat_lwlocks AS
+     SELECT
+             S.lwlockid,
+             S.calls,
+             S.waits,
+             S.time_ms
+     FROM pg_stat_get_lwlocks() AS S;
+ 
  CREATE VIEW pg_user_mappings AS
      SELECT
          U.oid       AS umid,
diff -rc postgresql-9.1.2.orig/src/backend/storage/lmgr/lwlock.c 
postgresql-9.1.2/src/backend/storage/lmgr/lwlock.c
*** postgresql-9.1.2.orig/src/backend/storage/lmgr/lwlock.c     2012-06-20 
03:32:46.000000000 +0900
--- postgresql-9.1.2/src/backend/storage/lmgr/lwlock.c  2012-06-26 
01:57:28.000000000 +0900
***************
*** 32,37 ****
--- 32,38 ----
  #include "storage/proc.h"
  #include "storage/spin.h"
  
+ #include <sys/time.h>
  
  /* We use the ShmemLock spinlock to protect LWLockAssign */
  extern slock_t *ShmemLock;
***************
*** 46,51 ****
--- 47,57 ----
        PGPROC     *head;                       /* head of list of waiting 
PGPROCs */
        PGPROC     *tail;                       /* tail of list of waiting 
PGPROCs */
        /* tail is undefined when head is NULL */
+ 
+       /* statistics stuff */
+       uint64          calls;
+       uint64          waits;
+       uint64          time_ms;
  } LWLock;
  
  /*
***************
*** 268,273 ****
--- 274,282 ----
                lock->lock.shared = 0;
                lock->lock.head = NULL;
                lock->lock.tail = NULL;
+               lock->lock.calls = 0;
+               lock->lock.waits = 0;
+               lock->lock.time_ms = 0;
        }
  
        /*
***************
*** 323,330 ****
--- 332,341 ----
        PGPROC     *proc = MyProc;
        bool            retry = false;
        int                     extraWaits = 0;
+       struct timeval          t1,t2;
  
        PRINT_LWDEBUG("LWLockAcquire", lockid, lock);
+       lock->calls++;
  
  #ifdef LWLOCK_STATS
        /* Set up local count state first time through in a given process */
***************
*** 457,462 ****
--- 468,475 ----
  #endif
  
                TRACE_POSTGRESQL_LWLOCK_WAIT_START(lockid, mode);
+               lock->waits++;
+               gettimeofday(&t1, NULL);
  
                for (;;)
                {
***************
*** 468,473 ****
--- 481,488 ----
                }
  
                TRACE_POSTGRESQL_LWLOCK_WAIT_DONE(lockid, mode);
+               gettimeofday(&t2, NULL);
+               lock->time_ms += (t2.tv_sec-t1.tv_sec)*1000 + 
(t2.tv_usec-t1.tv_usec)/1000;
  
                LOG_LWDEBUG("LWLockAcquire", lockid, "awakened");
  
***************
*** 701,703 ****
--- 716,752 ----
        }
        return false;
  }
+ 
+ void
+ lwlock_get_stat(LWLockId lockid, uint64 *calls, uint64 *waits, uint64 
*time_ms)
+ {
+       volatile LWLock *lock = &(LWLockArray[lockid].lock);
+ 
+       SpinLockAcquire(&lock->mutex);
+       *calls   = lock->calls;
+       *waits   = lock->waits;
+       *time_ms = lock->time_ms;
+       SpinLockRelease(&lock->mutex);
+ }
+ 
+ void
+ lwlock_get_stat_sum(LWLockId start, LWLockId end, uint64 *calls, uint64 
*waits, uint64 *time_ms)
+ {
+       LWLockId lockid;
+ 
+       *calls   = 0;
+       *waits   = 0;
+       *time_ms = 0;
+ 
+       for (lockid=start ; lockid<=end ; lockid++)
+       {
+               uint64 calls2, waits2, time_ms2;
+ 
+               lwlock_get_stat(lockid, &calls2, &waits2, &time_ms2);
+ 
+               *calls   += calls2;
+               *waits   += waits2;
+               *time_ms += time_ms2;
+       }
+ }
+ 
diff -rc postgresql-9.1.2.orig/src/backend/utils/adt/pgstatfuncs.c 
postgresql-9.1.2/src/backend/utils/adt/pgstatfuncs.c
*** postgresql-9.1.2.orig/src/backend/utils/adt/pgstatfuncs.c   2012-06-20 
03:32:50.000000000 +0900
--- postgresql-9.1.2/src/backend/utils/adt/pgstatfuncs.c        2012-06-26 
01:59:17.000000000 +0900
***************
*** 109,114 ****
--- 109,116 ----
  extern Datum pg_stat_reset_single_table_counters(PG_FUNCTION_ARGS);
  extern Datum pg_stat_reset_single_function_counters(PG_FUNCTION_ARGS);
  
+ extern Datum pg_stat_get_lwlocks(PG_FUNCTION_ARGS);
+ 
  /* Global bgwriter statistics, from bgwriter.c */
  extern PgStat_MsgBgWriter bgwriterStats;
  
***************
*** 1572,1574 ****
--- 1574,1670 ----
  
        PG_RETURN_VOID();
  }
+ 
+ 
+ Datum
+ pg_stat_get_lwlocks(PG_FUNCTION_ARGS)
+ {
+       FuncCallContext *funcctx;
+ 
+       /* stuff done only on the first call of the function */
+       if (SRF_IS_FIRSTCALL())
+       {
+               MemoryContext oldcontext;
+               TupleDesc       tupdesc;
+ 
+               /* create a function context for cross-call persistence */
+               funcctx = SRF_FIRSTCALL_INIT();
+ 
+               oldcontext = 
MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+ 
+               tupdesc = CreateTemplateTupleDesc(4, false);
+               TupleDescInitEntry(tupdesc, (AttrNumber) 1, "lockid",
+                                                  INT8OID, -1, 0);
+               TupleDescInitEntry(tupdesc, (AttrNumber) 2, "calls",
+                                                  INT8OID, -1, 0);
+               TupleDescInitEntry(tupdesc, (AttrNumber) 3, "waits",
+                                                  INT8OID, -1, 0);
+               TupleDescInitEntry(tupdesc, (AttrNumber) 4, "time_ms",
+                                                  INT8OID, -1, 0);
+ 
+               funcctx->tuple_desc = BlessTupleDesc(tupdesc);
+               funcctx->max_calls = NumLWLocks();
+ 
+               MemoryContextSwitchTo(oldcontext);
+       }
+ 
+       /* stuff done on every call of the function */
+       funcctx = SRF_PERCALL_SETUP();
+ 
+       if (funcctx->call_cntr < funcctx->max_calls)
+       {
+               Datum           values[4];
+               bool            nulls[4];
+               HeapTuple       tuple;
+               LWLockId        lockid;
+               uint64          calls,waits,time_ms;
+ 
+               MemSet(values, 0, sizeof(values));
+               MemSet(nulls, 0, sizeof(nulls));
+ 
+               lockid = funcctx->call_cntr;
+ 
+               if ( lockid<FirstBufMappingLock )
+               {
+                       lwlock_get_stat(lockid, &calls, &waits, &time_ms);
+               }
+               else if ( FirstBufMappingLock<=lockid && 
lockid<FirstLockMgrLock )
+               {
+                       lwlock_get_stat_sum(FirstBufMappingLock, 
FirstLockMgrLock-1,
+                                           &calls, &waits, &time_ms);
+                       funcctx->call_cntr = FirstLockMgrLock - 1;
+               }
+               else if ( FirstLockMgrLock<=lockid && 
lockid<FirstPredicateLockMgrLock )
+               {
+                       lwlock_get_stat_sum(FirstLockMgrLock, 
FirstPredicateLockMgrLock-1,
+                                           &calls, &waits, &time_ms);
+                       funcctx->call_cntr = FirstPredicateLockMgrLock - 1;
+               }
+               else if ( FirstPredicateLockMgrLock<=lockid && 
lockid<NumFixedLWLocks )
+               {
+                       lwlock_get_stat_sum(FirstPredicateLockMgrLock, 
NumFixedLWLocks-1,
+                                           &calls, &waits, &time_ms);
+                       funcctx->call_cntr = NumFixedLWLocks - 1;
+               }
+               else if ( NumFixedLWLocks<=lockid && lockid<NumLWLocks() )
+               {
+                       lwlock_get_stat_sum(NumFixedLWLocks, NumLWLocks()-1,
+                                           &calls, &waits, &time_ms);
+                       funcctx->call_cntr = NumLWLocks() - 1;
+               }
+ 
+               values[0] = Int64GetDatum(lockid);
+               values[1] = Int64GetDatum(calls);
+               values[2] = Int64GetDatum(waits);
+               values[3] = Int64GetDatum(time_ms);
+ 
+               tuple = heap_form_tuple(funcctx->tuple_desc, values, nulls);
+ 
+               SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(tuple));
+       }
+       else
+       {
+               SRF_RETURN_DONE(funcctx);
+       }
+ }
+ 
diff -rc postgresql-9.1.2.orig/src/backend/utils/fmgroids.h 
postgresql-9.1.2/src/backend/utils/fmgroids.h
*** postgresql-9.1.2.orig/src/backend/utils/fmgroids.h  2012-06-20 
03:32:50.000000000 +0900
--- postgresql-9.1.2/src/backend/utils/fmgroids.h       2012-06-26 
01:51:52.000000000 +0900
***************
*** 1977,1982 ****
--- 1977,1983 ----
  #define F_GET_CURRENT_TS_CONFIG 3759
  #define F_TS_MATCH_TT 3760
  #define F_TS_MATCH_TQ 3761
+ #define F_PG_STAT_GET_LWLOCKS 3764
  #define F_PG_TS_TEMPLATE_IS_VISIBLE 3768
  #define F_REGDICTIONARYIN 3771
  #define F_REGDICTIONARYOUT 3772
diff -rc postgresql-9.1.2.orig/src/backend/utils/fmgrtab.c 
postgresql-9.1.2/src/backend/utils/fmgrtab.c
*** postgresql-9.1.2.orig/src/backend/utils/fmgrtab.c   2012-06-20 
03:32:46.000000000 +0900
--- postgresql-9.1.2/src/backend/utils/fmgrtab.c        2012-06-26 
01:51:52.000000000 +0900
***************
*** 1966,1971 ****
--- 1966,1972 ----
  extern Datum get_current_ts_config (PG_FUNCTION_ARGS);
  extern Datum ts_match_tt (PG_FUNCTION_ARGS);
  extern Datum ts_match_tq (PG_FUNCTION_ARGS);
+ extern Datum pg_stat_get_lwlocks (PG_FUNCTION_ARGS);
  extern Datum pg_ts_template_is_visible (PG_FUNCTION_ARGS);
  extern Datum regdictionaryin (PG_FUNCTION_ARGS);
  extern Datum regdictionaryout (PG_FUNCTION_ARGS);
***************
*** 4156,4161 ****
--- 4157,4163 ----
    { 3759, "get_current_ts_config", 0, true, false, get_current_ts_config },
    { 3760, "ts_match_tt", 2, true, false, ts_match_tt },
    { 3761, "ts_match_tq", 2, true, false, ts_match_tq },
+   { 3764, "pg_stat_get_lwlocks", 0, false, true, pg_stat_get_lwlocks },
    { 3768, "pg_ts_template_is_visible", 1, true, false, 
pg_ts_template_is_visible },
    { 3771, "regdictionaryin", 1, true, false, regdictionaryin },
    { 3772, "regdictionaryout", 1, true, false, regdictionaryout },
diff -rc postgresql-9.1.2.orig/src/include/catalog/pg_proc.h 
postgresql-9.1.2/src/include/catalog/pg_proc.h
*** postgresql-9.1.2.orig/src/include/catalog/pg_proc.h 2012-06-20 
03:32:43.000000000 +0900
--- postgresql-9.1.2/src/include/catalog/pg_proc.h      2012-06-26 
01:51:46.000000000 +0900
***************
*** 2881,2886 ****
--- 2881,2889 ----
  DATA(insert OID = 3073 ( pg_is_xlog_replay_paused     PGNSP PGUID 12 1 0 0 f 
f f t f v 0 0 16 "" _null_ _null_ _null_ _null_ pg_is_xlog_replay_paused _null_ 
_null_ _null_ ));
  DESCR("true if xlog replay is paused");
  
+ DATA(insert OID = 3764 (  pg_stat_get_lwlocks PGNSP PGUID 12 1 100 0 f f f f 
t s 0 0 2249 "" "{20,20,20,20}" "{o,o,o,o}" "{lwlockid,calls,waits,time_ms}" 
_null_ pg_stat_get_lwlocks _null_ _null_ _null_ ));
+ DESCR("light-weight lock statistics");
+ 
  DATA(insert OID = 2621 ( pg_reload_conf                       PGNSP PGUID 12 
1 0 0 f f f t f v 0 0 16 "" _null_ _null_ _null_ _null_ pg_reload_conf _null_ 
_null_ _null_ ));
  DESCR("reload configuration files");
  DATA(insert OID = 2622 ( pg_rotate_logfile            PGNSP PGUID 12 1 0 0 f 
f f t f v 0 0 16 "" _null_ _null_ _null_ _null_ pg_rotate_logfile _null_ _null_ 
_null_ ));
diff -rc postgresql-9.1.2.orig/src/include/storage/lwlock.h 
postgresql-9.1.2/src/include/storage/lwlock.h
*** postgresql-9.1.2.orig/src/include/storage/lwlock.h  2012-06-20 
03:32:43.000000000 +0900
--- postgresql-9.1.2/src/include/storage/lwlock.h       2012-06-26 
01:57:22.000000000 +0900
***************
*** 115,118 ****
--- 115,121 ----
  
  extern void RequestAddinLWLocks(int n);
  
+ extern void lwlock_get_stat(LWLockId, uint64 *, uint64 *, uint64 *);
+ extern void lwlock_get_stat_sum(LWLockId, LWLockId, uint64 *, uint64 *, 
uint64 *);
+ 
  #endif   /* LWLOCK_H */
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to