Re: [HACKERS] [PATCH] Vacuum: Update FSM more frequently

Claudio Freire Mon, 29 Jan 2018 06:31:58 -0800

On Mon, Jan 29, 2018 at 4:12 AM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Sat, Jul 29, 2017 at 9:42 AM, Claudio Freire <klaussfre...@gmail.com> 
> wrote:
>> Introduce a tree pruning threshold to FreeSpaceMapVacuum that avoids
>> recursing into branches that already contain enough free space, to
>> avoid having to traverse the whole FSM and thus induce quadratic
>> costs. Intermediate FSM vacuums are only supposed to make enough
>> free space visible to avoid extension until the final (non-partial)
>> FSM vacuum.
>
> Hmm, I think this resolve a  part of the issue. How about calling
> AutoVacuumRequestWork() in PG_CATCH() if VACOPT_VACUUM is specified
> and give the relid that we were vacuuming but could not complete as a
> new autovacuum work-item? The new autovacuum work-item makes the
> worker vacuum FSMs of the given relation and its indices.


Well, I tried that in fact, as I mentioned in the OP.

I abandoned it due to the conjunction of the 2 main blockers I found
and mentioned there. In essence, those issues defeat the purpose of
the patch (to get the free space visible ASAP).

Don't forget, this is aimed at cases where autovacuum of a single
relation takes a very long time. That is, very big relations. Maybe
days, like in my case. A whole autovacuum cycle can take weeks, so
delaying FSM vacuum that much is not good, and using work items still
cause those delays, not to mention the segfaults.

> That way, we
> can ensure that FSM gets vacuumed by the cancelled autovacuum process
> or other autovacuum processes. Since a work-item can be handled by
> other autovacuum process I think 256 work-item limit would not be a
> problem.

Why do you think it wouldn't? In particular if you take into account
the above. If you have more than 256 relations in the cluster, it
could very well happen that you've queued the maximum amount and no
autovac worker has had a chance to take a look at them, because
they're all stuck vacuuming huge relations.

Not to mention the need to de-duplicate work items. We wouldn't want
to request repeated FSM vacuums, or worst, queue an FSM vacuum of a
single table 256 times and fill up the queue with redundant items.
With the current structure, de-duplication is O(N), so if we wanted to
raise the limit of 256 work items, we'd need a structure that would
let us de-duplicate in less than O(N). In essence, it's a ton of work
for very little gain. Hence why I abandoned it.

> Or it might be better to make autovacuum launcher launch
> worker process if there is pending work-item in a database even if
> there is no table needs to be vacuumed/analyzed.

Quite a must, in fact.

>> For one, it would sometimes crash when adding the work item from
>> inside autovacuum itself. I didn't find the cause of the crash, but I suspect
>> AutoVacuumRequestWork was being called in a context where it
>> was not safe.
>
> Perhaps the cause of this might be that autovacuum work-item is
> implemented using DSA, which has been changed before by commit
> 31ae1638ce35c23979f9bcbb92c6bb51744dbccb.

I thought about that. Perhaps. But the work-item approach didn't fail
just because of the segfaults. It's also that it doesn't address the
free space visibility issue quickly enough.

Re: [HACKERS] [PATCH] Vacuum: Update FSM more frequently

Reply via email to