Re: [HACKERS] More stats about skipped vacuums

Kyotaro HORIGUCHI Tue, 31 Oct 2017 02:50:50 -0700

This is just a repost as a (true) new thread.

At Mon, 30 Oct 2017 20:57:50 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
<horiguchi.kyot...@lab.ntt.co.jp> wrote in 
<20171030.205750.246076862.horiguchi.kyot...@lab.ntt.co.jp>
> At Fri, 20 Oct 2017 19:15:16 +0900, Masahiko Sawada <sawada.m...@gmail.com> 
> wrote in <CAD21AoAkaw-u0feAVN_VrKZA5tvzp7jT=mqcqp-svmegkxh...@mail.gmail.com>
> > >   n_mod_since_analyze          | 20000
> > > + vacuum_requred               | true
> > > + last_vacuum_oldest_xid       | 8023
> > > + last_vacuum_left_to_truncate | 5123
> > > + last_vacuum_truncated        | 387
> > >   last_vacuum                  | 2017-10-10 17:21:54.380805+09
> > >   last_autovacuum              | 2017-10-10 17:21:54.380805+09
> > > + last_autovacuum_status       | Killed by lock conflict
> > > ...
> > >   autovacuum_count             | 128
> > > + incomplete_autovacuum_count  | 53
> > >
> > > # The last one might be needless..
> > 
> > I'm not sure that the above informations will help for users or DBA
> > but personally I sometimes want to have the number of index scans of
> > the last autovacuum in the pg_stat_user_tables view. That value
> > indicates how efficiently vacuums performed and would be a signal to
> > increase the setting of autovacuum_work_mem for user.
> 
> Btree and all existing index AMs (except brin) seem to visit the
> all pages in every index scan so it would be valuable. Instead
> the number of visited index pages during a table scan might be
> usable. It is more relevant to performance than the number of
> scans, on the other hand it is a bit difficult to get something
> worth from the number in a moment. I'll show the number of scans
> in the first cut.
> 
> > > Where the "Killed by lock conflict" is one of the followings.
> > >
> > >    - Completed
> > >    - Truncation skipped
> > >    - Partially truncated
> > >    - Skipped
> > >    - Killed by lock conflict
> > >
> > > This seems enough to find the cause of a table bloat. The same
> > > discussion could be applied to analyze but it might be the
> > > another issue.
> > >
> > > There may be a better way to indicate the vacuum soundness. Any
> > > opinions and suggestions are welcome.
> > >
> > > I'm going to make a patch to do the 'formal' one for the time
> > > being.


Done with small modifications. In the attached patch
pg_stat_all_tables has the following new columns. Documentations
is not provided at this stage.

-----
  n_mod_since_analyze     | 0
+ vacuum_required         | not requried
  last_vacuum             | 
  last_autovacuum         | 2017-10-30 18:51:32.060551+09
  last_analyze            | 
  last_autoanalyze        | 2017-10-30 18:48:33.414711+09
  vacuum_count            | 0
+ last_vacuum_truncated   | 0
+ last_vacuum_untruncated | 0
+ last_vacuum_index_scans | 0
+ last_vacuum_oldest_xmin | 2134
+ last_vacuum_status      | agressive vacuum completed
+ autovacuum_fail_count   | 0
  autovacuum_count        | 5
  analyze_count           | 0
  autoanalyze_count       | 1
-----
Where each column shows the following infomation.

+ vacuum_required         | not requried

 VACUUM requirement status. Takes the following values.

  - partial
    Partial (or normal) will be performed by the next autovacuum.
    The word "partial" is taken from the comment for
    vacuum_set_xid_limits.

  - aggressive
    Aggressive scan will be performed by the next autovacuum.

  - required
    Any type of autovacuum will be performed. The type of scan is
    unknown because the view failed to take the required lock on
    the table. (AutoVacuumrequirement())

  - not required
    Next autovacuum won't perform scan on this relation.

  - not required (lock not acquired)

    Autovacuum should be disabled and the distance to
    freeze-limit is not known because required lock is not
    available.

  - close to freeze-limit xid
    Shown while autovacuum is disabled. The table is in the
    manual vacuum window to avoid anti-wraparound autovacuum.

+ last_vacuum_truncated | 0

  The number of truncated pages in the last completed
  (auto)vacuum.

+ last_vacuum_untruncated | 0
  The number of pages the last completed (auto)vacuum tried to
  truncate but could not for some reason.

+ last_vacuum_index_scans | 0
  The number of index scans performed in the last completed
  (auto)vacuum.

+ last_vacuum_oldest_xmin | 2134
  The oldest xmin used in the last completed (auto)vacuum.

+ last_vacuum_status      | agressive vacuum completed

  The finish status of the last vacuum. Takes the following
  values. (pg_stat_get_last_vacuum_status())

   - completed
     The last partial (auto)vacuum is completed.

   - vacuum full completed
     The last VACUUM FULL is completed.

   - aggressive vacuum completed
     The last aggressive (auto)vacuum is completed.

   - error while $progress
     The last vacuum stopped by error while $progress.
     The $progress one of the vacuum progress phases.

   - canceled while $progress
     The last vacuum was canceled while $progress

     This is caused by user cancellation of manual vacuum or
     killed by another backend who wants to acquire lock on the
     relation.

   - skipped - lock unavailable
     The last autovacuum on the relation was skipped because
     required lock was not available.

   - unvacuumable
     A past autovacuum tried vacuum on the relation but it is not
     vacuumable for reasons of ownership or accessibility problem.
     (Such relations are not shown in pg_stat_all_tables..)

+ autovacuum_fail_count   | 0
  The number of successive failure of vacuum on the relation.
  Reset to zero by completed vacuum.

======

In the patch, vacrelstats if pointed from a static variable and
cancel reporting is performed in PG_CATCH() section in vacuum().
Every unthrown error like lock acquisition failure is reported by
explicit pgstat_report_vacuum() with the corresponding finish
code.

Vacuum requirement status is calculated in AutoVacuumRequirment()
and returned as a string. Access share lock on the target
relation is required but it returns only available values if the
lock is not available. I decided to return incomplete (coarse
grained) result than wait for a lock that isn't known to be
relased in a short time for a perfect result.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] More stats about skipped vacuums

Reply via email to