Hello hackers,

 

This proposal is about recording additional statistics of wait events.

 

 

PostgreSQL statistics Issue

----------------------------------------

The pg_stat_activity view is very useful for analyzing performance issues.
However, it is difficult to get detailed information about wait events
when you need to dig deeper into a performance problem,
because pg_stat_activity only shows the current wait status of each backend.

 

If PostgreSQL provided additional statistics about wait events,
it would help pinpoint the exact cause of throughput issues.

 

 

Proposal

----------------------------------------

Record additional statistics per wait event for every backend:

    - total elapsed wait time

    - maximum elapsed wait time

    - number of waits

 

I suggest storing the above statistics in the PgBackendStatus structure.

 

typedef struct PgBackendStatus
{
...
    /*
     * Additional wait event information for this backend:
     * elapsed time and count for each wait event.
     */
    TimestampTz st_wait_event_start_timestamp;
    uint64      st_wait_event_total_elapsed[NUM_WAIT_EVENT];
    uint64      st_wait_event_max_elapsed[NUM_WAIT_EVENT];
    uint32      st_wait_event_counting[NUM_WAIT_EVENT];
} PgBackendStatus;
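
To make the bookkeeping concrete, here is a minimal standalone sketch of how these fields could be updated when a wait starts and ends. It is not taken from the patch: WaitEventStats, wait_stats_start(), wait_stats_end(), the array size of 4, and the use of a plain int64_t timestamp (with 0 meaning "not waiting") are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in; the real array size would cover every wait event. */
#define NUM_WAIT_EVENT 4

/* Simplified mirror of the proposed fields (names are illustrative). */
typedef struct WaitEventStats
{
    int64_t  start_timestamp;               /* 0 means "not waiting" (assumption) */
    uint64_t total_elapsed[NUM_WAIT_EVENT];
    uint64_t max_elapsed[NUM_WAIT_EVENT];
    uint32_t counting[NUM_WAIT_EVENT];
} WaitEventStats;

/* Called when a wait begins: remember the start time. */
static void
wait_stats_start(WaitEventStats *stats, int64_t now)
{
    stats->start_timestamp = now;
}

/* Called when the wait ends: fold the elapsed time into the counters. */
static void
wait_stats_end(WaitEventStats *stats, int event, int64_t now)
{
    uint64_t elapsed;

    assert(event >= 0 && event < NUM_WAIT_EVENT);

    /* Ignore an end without a matching start, or a clock that went backwards. */
    if (stats->start_timestamp == 0 || now < stats->start_timestamp)
    {
        stats->start_timestamp = 0;
        return;
    }

    elapsed = (uint64_t) (now - stats->start_timestamp);
    stats->total_elapsed[event] += elapsed;
    if (elapsed > stats->max_elapsed[event])
        stats->max_elapsed[event] = elapsed;
    stats->counting[event]++;
    stats->start_timestamp = 0;
}
```

Keeping only a running total, a maximum, and a count means each wait costs a few additions and one comparison, and the mean wait time can still be derived later as total / count.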

 

 

PoC test

----------------------------------------

I wrote a prototype patch.

With this patch, you can get the additional wait event statistics via
the new function pg_stat_get_wait_events().

 

You can test it with the following steps:

    1. Apply the patch

        - patch -p0 < wait_event_stat_patchfile.diff

    2. make distclean

    3. ./configure --with-wait-event-detail

    4. make && make install

    5. Start PostgreSQL and run psql

    6. Call the pg_stat_get_wait_events(null) function

        - the input parameter is a pid.

 

Example output:

postgres=# select * from pg_stat_get_wait_events(null) where counting > 0;

  pid  | wait_event_type |      wait_event       | total_elapsed | max_elapsed | counting

-------+-----------------+-----------------------+---------------+-------------+----------

 25291 | LWLock          | WALBufMappingLock     |          1359 |         376 |        6

 25291 | LWLock          | WALWriteLock          |        679733 |      113803 |        8

 25291 | IO              | BufFileRead           |           780 |           7 |      172

 25291 | IO              | BufFileWrite          |          1247 |          19 |      171

 25291 | IO              | DataFileExtend        |         44703 |          53 |     3395

 25291 | IO              | DataFileImmediateSync |        268798 |       72286 |       12

 25291 | IO              | DataFileRead          |         91763 |       22149 |       30

 25291 | IO              | WALSync               |        441139 |       60456 |       28

 25291 | IO              | WALWrite              |          9567 |         637 |      737

 24251 | LWLock          | WALBufMappingLock     |          1256 |         350 |        6

 24251 | LWLock          | WALWriteLock          |        649140 |      153994 |        7

 24251 | IO              | BufFileRead           |           620 |           9 |      172

 24251 | IO              | BufFileWrite          |          1228 |          20 |      171

 24251 | IO              | DataFileExtend        |         26884 |          51 |     3395

 24251 | IO              | DataFileImmediateSync |        208630 |       21067 |       12

 24251 | IO              | DataFileRead          |        426278 |       17327 |      128

 24251 | IO              | WALSync               |        307055 |       70853 |       24

 24251 | IO              | WALWrite              |         17935 |         961 |     2720

(18 rows)

     

 

Other ideas

------------------------------------------

1. I allocated arrays for the additional statistics, one slot per wait event.

   Normally a backend doesn't use all wait events,

   so the memory used for recording statistics could be reduced

   by allocating a single hash table as a memory pool for wait event statistics.

 

2. This feature could be implemented as an extension

   if hooks were provided in the following functions:

 - pgstat_report_wait_start

 - pgstat_report_wait_end
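
For idea 1, a fixed-size open-addressing table keyed by wait event ID is one possible shape for such a pool. The sketch below is only an illustration under stated assumptions: WaitStatTable, WaitStatEntry, wait_stat_lookup(), wait_stat_record(), the pool size of 64, and the rule that an event ID of 0 marks an empty slot are all hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WAIT_STAT_SLOTS 64      /* hypothetical pool size; must be a power of two */

typedef struct WaitStatEntry
{
    uint32_t event_id;          /* 0 marks an empty slot (assumption) */
    uint64_t total_elapsed;
    uint64_t max_elapsed;
    uint32_t counting;
} WaitStatEntry;

typedef struct WaitStatTable
{
    WaitStatEntry slots[WAIT_STAT_SLOTS];
} WaitStatTable;

/* Find the entry for event_id, claiming an empty slot on first use. */
static WaitStatEntry *
wait_stat_lookup(WaitStatTable *table, uint32_t event_id)
{
    uint32_t pos;
    uint32_t i;

    assert(event_id != 0);

    /* Multiplicative hash, then linear probing over the fixed pool. */
    pos = (event_id * 2654435761u) & (WAIT_STAT_SLOTS - 1);
    for (i = 0; i < WAIT_STAT_SLOTS; i++)
    {
        WaitStatEntry *entry = &table->slots[(pos + i) & (WAIT_STAT_SLOTS - 1)];

        if (entry->event_id == event_id)
            return entry;
        if (entry->event_id == 0)
        {
            entry->event_id = event_id;
            return entry;
        }
    }
    return NULL;                /* pool exhausted */
}

/* Record one completed wait; returns 0 if the pool has no room left. */
static int
wait_stat_record(WaitStatTable *table, uint32_t event_id, uint64_t elapsed)
{
    WaitStatEntry *entry = wait_stat_lookup(table, event_id);

    if (entry == NULL)
        return 0;
    entry->total_elapsed += elapsed;
    if (elapsed > entry->max_elapsed)
        entry->max_elapsed = elapsed;
    entry->counting++;
    return 1;
}
```

With this layout a backend pays only for the events it actually waits on, at the cost of a probe on every update and the need to decide what happens when the pool fills up.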
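
For idea 2, the hooks could follow the usual function-pointer pattern. The following standalone sketch uses stand-in functions rather than the real pgstat_report_wait_start and pgstat_report_wait_end; the hook names, their signatures, and the extension-side functions are assumptions for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook types; no such hooks exist in core today. */
typedef void (*wait_start_hook_type) (uint32_t wait_event_info);
typedef void (*wait_end_hook_type) (void);

static wait_start_hook_type wait_start_hook = NULL;
static wait_end_hook_type wait_end_hook = NULL;

/* Stand-ins for the core reporting functions. */
static uint32_t current_wait_event = 0;

static void
report_wait_start(uint32_t wait_event_info)
{
    current_wait_event = wait_event_info;
    if (wait_start_hook)
        wait_start_hook(wait_event_info);
}

static void
report_wait_end(void)
{
    if (wait_end_hook)
        wait_end_hook();
    current_wait_event = 0;
}

/* Extension side: save any previously installed hooks and chain to them. */
static wait_start_hook_type prev_wait_start_hook = NULL;
static wait_end_hook_type prev_wait_end_hook = NULL;
static int      ext_waiting = 0;
static uint32_t ext_wait_count = 0;

static void
ext_on_wait_start(uint32_t wait_event_info)
{
    if (prev_wait_start_hook)
        prev_wait_start_hook(wait_event_info);
    ext_waiting = 1;
}

static void
ext_on_wait_end(void)
{
    if (prev_wait_end_hook)
        prev_wait_end_hook();
    if (ext_waiting)
    {
        ext_wait_count++;       /* a real extension would also record elapsed time */
        ext_waiting = 0;
    }
}

/* Would be called once when the extension is loaded. */
static void
ext_install_hooks(void)
{
    prev_wait_start_hook = wait_start_hook;
    prev_wait_end_hook = wait_end_hook;
    wait_start_hook = ext_on_wait_start;
    wait_end_hook = ext_on_wait_end;
}
```

When no extension is loaded, the only extra cost in the reporting path is a NULL check on each hook.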

 

 

Feedback and suggestions are very welcome.

Thanks!

 

Attachment: wait_event_stat_patchfile.diff