On Tue, Feb 6, 2018 at 9:51 PM, Claudio Freire <klaussfre...@gmail.com> wrote:
> On Tue, Feb 6, 2018 at 4:56 AM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
>> On Tue, Feb 6, 2018 at 2:55 AM, Claudio Freire <klaussfre...@gmail.com>
>> wrote:
>>> On Mon, Feb 5, 2018 at 1:53 AM, Masahiko Sawada <sawada.m...@gmail.com>
>>> wrote:
>>>> On Fri, Feb 2, 2018 at 11:13 PM, Claudio Freire <klaussfre...@gmail.com>
>>>> wrote:
>>>>> After autovacuum gets cancelled, the next time it wakes up it will
>>>>> retry vacuuming the cancelled relation. That's because a cancelled
>>>>> autovacuum doesn't update the last-vacuumed stats.
>>>>>
>>>>> So the timing between an autovacuum work item and the next retry for
>>>>> that relation is more or less an autovacuum nap time, except perhaps
>>>>> in the case where many vacuums get cancelled, and they have to be
>>>>> queued.
>>>>
>>>> I think that's not true if there are multiple databases.
>>>
>>> I'd have to re-check.
>>>
>>> The loop is basically, IIRC:
>>>
>>> while(1) { vacuum db ; work items ; nap }
>>>
>>> Now, if that's vacuum one db, not all, and if the decision on the
>>> second run doesn't pick the same db because that big table failed to
>>> be vacuumed, then you're right.
>>>
>>> In that case we could add the FSM vacuum as a work item *in addition*
>>> to what this patch does. If the work item queue is full and the FSM
>>> vacuum doesn't happen, it'd be no worse than with the patch as-is.
>>>
>>> Is that what you suggest?
>>
>> That's one of the my suggestion. I might had to consider this issue
>> for each case. To be clear let me summarize for each case.
>>
>> For table with indices, vacuum on fsm of heap is done after
>> lazy_vacuum_heap(). Therefore I think that the case where a table got
>> vacuumed but fsm couldn't get vacuumed doesn't happen unless the
>> autovacuum gets cancelled before or while vacuuming fsm.
>
> Well, that's the whole point. Autovacuums get cancelled all the time
> in highly contended tables. I have more than a handful tables in
> production that never finish autovacuum, so I have to trigger manual
> vacuums periodically.
>
>> (1) using autovacuum work-item, (2) vacuuming fsm of table in
>> PG_CATCH, (3) remembering the tables got cancelled and vacuuming them
>> after finished a loop of table_oids.
> ...
>> So I'm in favor of (3).
> ...
>> However, it's quite possible that I'm not seeing the whole picture here.
>
> Well, none of that handles the case where the autovacuum of a table
> doesn't get cancelled, but takes a very long time.
>
> No free space becomes visible during long-running vacuums. That means
> bloat keeps accumulating even though vacuum is freeing space, because
> the FSM doesn't expose that free space.
>
> The extra work incurred in those FSM vacuums isn't useless, it's
> preventing bloat in long-running vacuums.
>
> I can look into doing 3, that *might* get rid of the need to do that
> initial FSM vacuum, but all other intermediate vacuums are still
> needed.
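
To make the long-running-vacuum point concrete, here is a toy model in
Python. It is not PostgreSQL code, and everything in it is an assumption
made up for illustration: the function name simulate_vacuum_pass, the
fsm_vacuum_every parameter, and the rate of one concurrent insert per
vacuumed page. It only assumes that space freed by vacuum cannot be found
by inserters until an FSM vacuum publishes it, and it counts how many pages
the relation gets extended by in the meantime.

```python
def simulate_vacuum_pass(total_pages, fsm_vacuum_every, inserts_per_page=1):
    """Toy model: count relation extensions during one long vacuum pass.

    total_pages      -- heap pages the vacuum walks (hypothetical size)
    fsm_vacuum_every -- publish freed space every N pages; 0 means only
                        at the very end of the vacuum (hypothetical knob)
    inserts_per_page -- concurrent inserts arriving per vacuumed page
                        (hypothetical workload)
    """
    hidden_free = 0    # space freed by vacuum, not yet findable via the FSM
    visible_free = 0   # space inserters can actually find
    extended = 0       # new pages added because no free space was visible

    for page in range(1, total_pages + 1):
        # Vacuum frees one page worth of space, but it stays hidden for now.
        hidden_free += 1

        # An intermediate FSM vacuum makes everything freed so far visible.
        if fsm_vacuum_every and page % fsm_vacuum_every == 0:
            visible_free += hidden_free
            hidden_free = 0

        # Concurrent inserts reuse visible space or extend the relation.
        for _ in range(inserts_per_page):
            if visible_free > 0:
                visible_free -= 1
            else:
                extended += 1

    return extended


if __name__ == "__main__":
    for every in (0, 100, 1):
        print(every, simulate_vacuum_pass(1000, every))
```

In this model, with no intermediate FSM vacuum every concurrent insert
extends the relation, while publishing free space periodically caps the
growth at roughly one interval's worth of inserts. Real behaviour depends
on the workload and on how the FSM is searched, so treat the numbers as
illustrative only.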

Understood. So how about having this patch focus only on making FSM
vacuums more frequent, and leaving out the initial FSM vacuum and the
handling of the cancellation cases? Those can go into a separate patch.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center