Re: [GENERAL] Excessive (and slow) fsync() within single transaction

Stephen Tyler Thu, 10 Dec 2009 07:43:56 -0800

On Wed, Dec 9, 2009 at 12:57 PM, Greg Smith <g...@2ndquadrant.com> wrote:


> You should turn on log_checkpoint in the postgresql.conf and confirm the
> slowdowns are happening around the same time as the checkpoint report gets
> written to the log files.


I have turned on log_checkpoints, and re-run the command.  Checkpoints are
being written every 220 to 360 seconds.  About 80% are "time" and 20%
"xlog".  Here are a representative sample:

10/12/09 5:12:59 PM    org.postgresql.postgres[445]    LOG:  checkpoint
starting: time
10/12/09 5:16:41 PM    org.postgresql.postgres[445]    LOG:  checkpoint
complete: wrote 133795 buffers (25.5%); 0 transaction log file(s) added, 0
removed, 98 recycled; write=112.281 s, sync=108.809 s, total=221.166 s
10/12/09 5:17:32 PM    org.postgresql.postgres[445]    LOG:  checkpoint
starting: xlog
10/12/09 5:23:45 PM    org.postgresql.postgres[445]    LOG:  checkpoint
complete: wrote 292601 buffers (55.8%); 0 transaction log file(s) added, 0
removed, 76 recycled; write=162.919 s, sync=209.277 s, total=372.229 s
10/12/09 5:23:45 PM    org.postgresql.postgres[445]    LOG:  checkpoint
starting: time
10/12/09 5:27:58 PM    org.postgresql.postgres[445]    LOG:  checkpoint
complete: wrote 113330 buffers (21.6%); 0 transaction log file(s) added, 0
removed, 128 recycled; write=165.252 s, sync=87.588 s, total=253.285 s
10/12/09 5:28:45 PM    org.postgresql.postgres[445]    LOG:  checkpoint
starting: time
10/12/09 5:34:11 PM    org.postgresql.postgres[445]    LOG:  checkpoint
complete: wrote 206947 buffers (39.5%); 0 transaction log file(s) added, 0
removed, 65 recycled; write=152.953 s, sync=172.978 s, total=326.460 s

So a "typical" checkpoint is around 200K buffers (1.5GBytes, 40% of max),
taking around 150 seconds to write (10MByte/second, 1300 buffers/second),
and around 150 seconds to sync.

The system pauses/freezes are much more frequent.  Roughly 20 to 60 seconds
long, interspersed with a brief pulse of activity.  They align with the
lengthy fsync calls reported by DTrace.  But there are many more fsync calls
being made.  Most fsync calls return in a couple of milliseconds or
microseconds.

As I have previously mentioned, my system is using the default
"open_datasync" WAL sync method.  The test_fsync utility reports a small
"open o_dsync, write" call returning in around 300 microseconds or less,
both on the SSD RAID 0 array and a 7200rpm SATA HDD.  If that utility is to
be believed, "open o_dsync, write" is not reaching the disk platters by the
time it returns.

Why then does it take postgresql 200seconds to sync a checkpoint (or DTrace
report 20 to 60 second fsync calls)?  The drives themselves have only
smallish 32MB buffers.  I can write a 1.5GByte file (the size of the average
checkpoint) in only 10 seconds, if I do it from outside postgresql.

Stephen

Re: [GENERAL] Excessive (and slow) fsync() within single transaction

Reply via email to