On Wed, Apr 14, 2021 at 11:17 PM Mead, Scott <me...@amazon.com> wrote:
>
>
>
> > On Mar 1, 2021, at 8:43 PM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> >
> > On Mon, Feb 8, 2021 at 11:49 PM Mead, Scott <me...@amazon.com> wrote:
> >>
> >> Hello,
> >>   I recently looked at what it would take to make a running autovacuum
> >> pick up a change to either cost_delay or cost_limit.  Users frequently
> >> have a conservative value set, and then wish to change it when
> >> autovacuum initiates a freeze on a relation.  Most users discover that
> >> they are in a ‘to prevent wraparound’ vacuum only after it has begun;
> >> this means that if they want the vacuum to take advantage of more I/O,
> >> they need to stop and then restart the currently running vacuum (after
> >> reloading the GUCs).
> >>
> >>  Initially, my goal was to determine the feasibility of making this
> >> dynamic.  I added debug code to vacuum.c:vacuum_delay_point(void) and
> >> found that changes to cost_delay and cost_limit are already processed
> >> by a running vacuum.  However, there is a bug that prevents cost_delay
> >> or cost_limit from being reconfigured to allow higher throughput.
> >>
> >> I believe this is a bug: autovacuum will dynamically detect and apply
> >> changes to cost_limit or cost_delay, but it can never adjust those
> >> values beyond the settings that were in effect when the vacuum began.
> >> The current behavior limits the maximum throughput of running vacuum
> >> processes to the cost_limit that was set when each vacuum process
> >> began.
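
(For context: the nap logic in vacuum.c:vacuum_delay_point() looks roughly
like the sketch below. It consults VacuumCostLimit and VacuumCostDelay
afresh on each call, which is why a running vacuum can pick up changed
values at all. This is a simplified paraphrase, not verbatim source.)

    /*
     * Simplified paraphrase of the nap logic in vacuum_delay_point().
     * VacuumCostLimit and VacuumCostDelay are read each time we consider
     * sleeping, so reloaded values take effect here once the worker's own
     * limits have been rebalanced.
     */
    if (VacuumCostActive && VacuumCostBalance >= VacuumCostLimit)
    {
        double      msec;

        msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;
        if (msec > VacuumCostDelay * 4)
            msec = VacuumCostDelay * 4;     /* clamp overly long naps */

        pg_usleep((long) (msec * 1000));
        VacuumCostBalance = 0;
    }
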
> >
> > Thanks for your report.
> >
> > I've not looked at the patch yet, but I agree that the calculation for
> > the autovacuum cost delay does not seem to work correctly when
> > vacuum-delay-related parameters (e.g., autovacuum_vacuum_cost_delay)
> > are changed while a table is being vacuumed in order to speed up the
> > running autovacuums. Here is my analysis:
>
>
> I appreciate your in-depth analysis and will comment in-line.  That said,
> I still think it’s important that the attached patch is applied.  As it
> stands today, a few lines of code prevent users from being able to
> increase the throughput of running vacuums without having to cancel them
> first.
>
> The patch that I’ve provided allows users to decrease their
> vacuum_cost_delay and get an immediate performance boost for their
> running vacuum jobs.
>
>
> >
> > Suppose we have the following parameters and 3 autovacuum workers are
> > running on different tables:
> >
> > autovacuum_vacuum_cost_delay = 100
> > autovacuum_vacuum_cost_limit = 100
> >
> > The vacuum cost-based delay parameters for each worker are as follows:
> >
> > worker->wi_cost_limit_base = 100
> > worker->wi_cost_limit = 66
> > worker->wi_cost_delay = 100

Sorry, worker->wi_cost_limit should be 33.

> >
> > Each running autovacuum has "wi_cost_limit = 66" because the total
> > limit (100) is rationed equally among the workers. Another point is
> > that the total wi_cost_limit (198 = 66*3) is less than
> > autovacuum_vacuum_cost_limit, 100. Both of these are fine.

So the total wi_cost_limit is actually 99 (33 * 3), which is less than
autovacuum_vacuum_cost_limit, 100.
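
To make the arithmetic concrete, here is a small standalone sketch modeled
on the rationing step in autovac_balance_cost() (autovacuum.c). It is a
simplified model, not the real function: it ignores per-table cost options
and zero-delay workers, and the names are abbreviated.

#include <stdio.h>

#define NWORKERS 3
#define Max(a,b) ((a) > (b) ? (a) : (b))
#define Min(a,b) ((a) < (b) ? (a) : (b))

int
main(void)
{
    int     av_cost_limit = 100;    /* autovacuum_vacuum_cost_limit */
    int     av_cost_delay = 100;    /* autovacuum_vacuum_cost_delay */
    int     base[NWORKERS]  = {100, 100, 100};  /* wi_cost_limit_base */
    int     delay[NWORKERS] = {100, 100, 100};  /* wi_cost_delay */
    double  cost_total = 0.0;
    double  cost_avail;

    /* total "I/O rate" currently claimed by all workers */
    for (int i = 0; i < NWORKERS; i++)
        cost_total += (double) base[i] / delay[i];          /* 3.0 */

    /* rate available under the global settings */
    cost_avail = (double) av_cost_limit / av_cost_delay;    /* 1.0 */

    for (int i = 0; i < NWORKERS; i++)
    {
        int     limit = (int) (cost_avail * base[i] / cost_total);

        /* 1.0 * 100 / 3.0 = 33; capped at base, floored at 1 */
        printf("worker %d: wi_cost_limit = %d\n", i,
               Max(Min(limit, base[i]), 1));
    }
    return 0;
}

Running it prints wi_cost_limit = 33 for each worker, 99 in total.
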

> >
> > Here let's change autovacuum_vacuum_cost_delay/limit value to speed up
> > running autovacuums.
> >
> > Case 1: increasing autovacuum_vacuum_cost_limit to 1000.
> >
> > After reloading the configuration file, vacuum cost-based delay
> > parameters for each worker become as follows:
> >
> > worker->wi_cost_limit_base = 100
> > worker->wi_cost_limit = 100
> > worker->wi_cost_delay = 100
> >
> > If we rationed autovacuum_vacuum_cost_limit, 1000, across the 3
> > workers, each would get 333. But since we cap it at
> > wi_cost_limit_base, the wi_cost_limit is 100. I think this is what
> > Mead reported here.
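
Plugging the Case 1 numbers into the same simplified model shown earlier:

    /*
     * Case 1, after raising autovacuum_vacuum_cost_limit to 1000:
     *
     *   cost_avail = 1000 / 100        = 10.0
     *   limit      = 10.0 * 100 / 3.0  = 333
     *
     * but wi_cost_limit_base is still 100 (it is not refreshed from the
     * reloaded GUC), so the cap applies:
     *
     *   wi_cost_limit = Max(Min(333, 100), 1) = 100
     */
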
>
>
> Yes, this is exactly correct.  The cost_limit is capped at the value
> that was in effect when the running vacuum started.  My patch changes
> this cap to the maximum allowed cost_limit (10,000).

The comment on the worker's limit calculation says:

        /*
         * We put a lower bound of 1 on the cost_limit, to avoid division-
         * by-zero in the vacuum code.  Also, in case of roundoff trouble
         * in these calculations, let's be sure we don't ever set
         * cost_limit to more than the base value.
         */
        worker->wi_cost_limit = Max(Min(limit,
                                        worker->wi_cost_limit_base),
                                    1);

If we used the max cost_limit as the upper bound here, the worker's
limit could unnecessarily end up higher than the base value in the case
of roundoff trouble, couldn't it? I think that the problem here is
rather that we don't update wi_cost_limit_base and wi_cost_delay when
rebalancing the cost.
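
To illustrate that direction, here is a hypothetical sketch (not the
attached patch): refresh the worker's base values from the currently
effective settings inside the rebalance step, so a reload can reach
already-running workers. Here vac_cost_limit and vac_cost_delay stand for
the effective global values computed earlier in autovac_balance_cost().

    /*
     * Hypothetical sketch only, not the attached patch: refresh the base
     * values from the currently effective cost parameters at rebalance
     * time, so that raising cost_limit (or lowering cost_delay) after a
     * reload can reach already-running workers.  Table-specific options,
     * where set, would still need to take precedence.
     */
    worker->wi_cost_limit_base = vac_cost_limit;
    worker->wi_cost_delay = vac_cost_delay;

    limit = (int) (cost_avail * worker->wi_cost_limit_base / cost_total);
    worker->wi_cost_limit = Max(Min(limit,
                                    worker->wi_cost_limit_base),
                                1);

The sketch only shows where the stale base values enter the calculation;
exactly where the refresh should happen is a separate question.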

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

