At Fri, 22 Sep 2017 17:15:08 +0300, Sokolov Yura <y.soko...@postgrespro.ru> wrote in <c7d736b5ea4cda67a644a0247f1a3...@postgrespro.ru> > On 2017-09-22 16:22, Sokolov Yura wrote: > > On 2017-09-22 11:21, Masahiko Sawada wrote: > >> On Fri, Sep 22, 2017 at 4:16 PM, Kyotaro HORIGUCHI > >> <horiguchi.kyot...@lab.ntt.co.jp> wrote: > >>> At Fri, 22 Sep 2017 15:00:20 +0900, Masahiko Sawada > >>> <sawada.m...@gmail.com> wrote in > >>> <cad21aod6zgb1w6ps1axj0ccab_chdyiitntedpmhkefgg13...@mail.gmail.com> > >>>> On Tue, Sep 19, 2017 at 3:31 PM, Kyotaro HORIGUCHI > >>>> <horiguchi.kyot...@lab.ntt.co.jp> wrote: > >>>> > I was just looking the thread since it is found left alone for a > >>>> > long time in the CF app. > >>>> > > >>>> > At Mon, 18 Sep 2017 16:35:58 -0700, Peter Geoghegan <p...@bowt.ie> > >>>> > wrote > >>>> > in > >>>> > <CAH2-WzkhJhAXD+6DdBp7D8WYLfJ3D0m=AZbGsiw=usujtmu...@mail.gmail.com> > >>>> >> On Wed, Apr 5, 2017 at 3:50 PM, Andres Freund <and...@anarazel.de> > >>>> >> wrote: > >>>> >> > Hi, > >>>> >> > > >>>> >> > On 2017-04-01 03:05:07 +0900, Masahiko Sawada wrote: > >>>> >> >> On Fri, Mar 31, 2017 at 11:44 PM, Robert Haas > >>>> >> >> <robertmh...@gmail.com> wrote: > >>>> >> >> [ lots of valuable discussion ] > >>>> >> > > >>>> >> > I think this patch clearly still is in the design stage, and has > >>>> >> > received plenty feedback this CF. I'll therefore move this to the > >>>> >> > next > >>>> >> > commitfest. > >>>> >> > >>>> >> Does anyone have ideas on a way forward here? I don't, but then I > >>>> >> haven't thought about it in detail in several months. > >>>> > > >>>> > Is the additional storage in metapage to store the current status > >>>> > of vaccum is still unacceptable even if it can avoid useless > >>>> > full-page scan on indexes especially for stable tables? > >>>> > > >>>> > Or, how about additional 1 bit in pg_stat_*_index to indicate > >>>> > that the index *don't* require vacuum cleanup stage. (default > >>>> > value causes cleanup) > >>>> You meant that "the next cycle" is the lazy_cleanup_index() function > >>>> called by lazy_scan_heap()? > >>> Both finally call btvacuumscan under a certain condition, but > >>> what I meant by "the next cycle" is the lazy_cleanup_index call > >>> in the next round of vacuum since abstract layer (index am) isn't > >>> conscious of the detail of btree. > >>> > >>>> > index_bulk_delete (or ambulkdelete) returns the flag in > >>>> > IndexBulkDeleteResult then lazy_scan_heap stores the flag in > >>>> > stats and in the next cycle it is looked up to decide the > >>>> > necessity of index cleanup. > >>>> > > >>>> Could you elaborate about this? For example in btree index, the index > >>>> cleanup skips to scan on the index scan if index_bulk_delete has been > >>>> called during vacuuming because stats != NULL. So I think we don't > >>>> need such a flag. > >>> The flag works so that successive two index full scans don't > >>> happen in a vacuum round. If any rows are fully deleted, just > >>> following btvacuumcleanup does nothing. > >>> I think what you wanted to solve here was the problem that > >>> index_vacuum_cleanup runs a full scan even if it ends with no > >>> actual work, when manual or anti-wraparound vacuums. (I'm > >>> getting a bit confused on this..) It is caused by using the > >>> pointer "stats" as the flag to instruct to do that. If the > >>> stats-as-a-flag worked as expected, the GUC doesn't seem to be > >>> required. > >> Hmm, my proposal is like that if a table doesn't changed since the > >> previous vacuum much we skip the cleaning up index. > >> If the table has at least one garbage we do the lazy_vacuum_index and > >> then IndexBulkDeleteResutl is stored, which causes to skip doing the > >> btvacuumcleanup. On the other hand, if the table doesn't have any > >> garbage but some new tuples inserted since the previous vacuum, we > >> don't do the lazy_vacuum_index but do the lazy_cleanup_index. In this > >> case, we always do the lazy_cleanup_index (i.g, we do the full scan) > >> even if only one tuple is inserted. That's why I proposed a new GUC > >> parameter which allows us to skip the lazy_cleanup_index in the case. > >> > >>> Addition to that, as Simon and Peter pointed out > >>> index_bulk_delete can leave not-fully-removed pages (so-called > >>> half-dead pages and pages that are recyclable but not registered > >>> in FSM, AFAICS) in some cases mainly by RecentGlobalXmin > >>> interlock. In this case, just inhibiting cleanup scan by a > >>> threshold lets such dangling pages persist in the index. (I > >>> conldn't make such a many dangling pages, though..) > >>> The first patch in the mail (*1) does that. It seems having some > >>> bugs, though.. > >>> Since the dangling pages persist until autovacuum decided to scan > >>> the belonging table again, we should run a vacuum round (or > >>> index_vacuum_cleanup itself) even having no dead rows if we want > >>> to clean up such pages within a certain period. The second patch > >>> doesn that. > >>> > >> IIUC half-dead pages are not relevant to this proposal. The proposal > >> has two problems; > >> * By skipping index cleanup we could leave recyclable pages that are > >> not marked as a recyclable. > >> * we stash an XID when a btree page is deleted, which is used to > >> determine when it's finally safe to recycle the page > >> Regards, > >> -- > >> Masahiko Sawada > >> NIPPON TELEGRAPH AND TELEPHONE CORPORATION > >> NTT Open Source Software Center > > Here is a small patch that skips scanning btree index if no pending > > deleted pages exists. > > It detects this situation by comparing pages_deleted with pages_free.
It seems to work to prevent needless cleanup scans. (Although I still uncertain how to deal with half-dead pages..) > > If they are equal, then there is no half-deleted pages, and it is > > safe to skip next lazy scan. > > Flag stored in a btpo_flags. It is unset using general wal before > > scan. I'm not sure it's good to add a meta-page specific information into per-page storage. > > If no half-deleted pages found, it is set without wal (like hint bit). > > (it is safe to miss setting flag, but it is not safe to miss unsetting > > flag). > > This patch works correctly: > > - if rows are only inserted and never deleted, index always skips > > scanning (starting with second scan). (Yes, I was confused here..) > > - if some rows updated/deleted, then some scans are not skipped. But > > when all half-deleted pages are marked as deleted, lazy scans start to > > be skipped. > > Open question: what to do with index statistic? For simplicity this > > patch skips updating stats (just returns NULL from btvacuumcleanup). > > Probably, it should detect proportion of table changes, and do not > > skip scans if table grows too much. Agreed. > Excuse me, I didn't mean to overwrite "last attachment" on commitfest > page. I'm the other one who also posted a patch in this thread:p I'm not sure what we should behave for the case, but I think it's inevitable to attach a patch when proposing a funcamental change of the design in a concrete shape. (But this also cofuses the CI machinery:p) Actually, it was quite clear what you are thinking. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers