On Wed, Apr 14, 2021 at 11:17 PM Mead, Scott <me...@amazon.com> wrote:
>
> > On Mar 1, 2021, at 8:43 PM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
> >
> > On Mon, Feb 8, 2021 at 11:49 PM Mead, Scott <me...@amazon.com> wrote:
> >>
> >> Hello,
> >>   I recently looked at what it would take to make a running autovacuum
> >> pick up a change to either cost_delay or cost_limit. Users frequently
> >> have a conservative value set, and then wish to change it when
> >> autovacuum initiates a freeze on a relation. Most users end up finding
> >> out they are in ‘to prevent wraparound’ mode after it has happened; this
> >> means that if they want the vacuum to take advantage of more I/O, they
> >> need to stop and then restart the currently running vacuum (after
> >> reloading the GUCs).
> >>
> >> Initially, my goal was to determine the feasibility of making this
> >> dynamic. I added debug code to vacuum.c:vacuum_delay_point(void) and
> >> found that changes to cost_delay and cost_limit are already processed by
> >> a running vacuum. However, there was a bug preventing cost_delay or
> >> cost_limit from being configured to allow higher throughput.
> >>
> >> I believe this is a bug because currently, autovacuum will dynamically
> >> detect and increase the cost_limit or cost_delay, but it can never
> >> decrease those values beyond their setting when the vacuum began. The
> >> current behavior is for vacuum to limit the maximum throughput of
> >> currently running vacuum processes to the cost_limit that was set when
> >> the vacuum process began.
> >
> > Thanks for your report.
> >
> > I've not looked at the patch yet, but I agree that the calculation for
> > the autovacuum cost delay doesn't seem to work correctly if
> > vacuum-delay-related parameters (e.g., autovacuum_vacuum_cost_delay) are
> > changed while a table is being vacuumed in order to speed up running
> > autovacuums. Here is my analysis:
>
> I appreciate your in-depth analysis and will comment in-line. That said, I
> still think it’s important that the attached patch is applied. As it is
> today, a few simple lines of code prevent users from being able to
> increase the throughput of running vacuums without having to cancel them
> first.
>
> The patch that I’ve provided allows users to decrease their
> vacuum_cost_delay and get an immediate boost in the performance of their
> running vacuum jobs.
>
> > Suppose we have the following parameters and 3 autovacuum workers are
> > running on different tables:
> >
> > autovacuum_vacuum_cost_delay = 100
> > autovacuum_vacuum_cost_limit = 100
> >
> > Vacuum cost-based delay parameters for each worker are as follows:
> >
> > worker->wi_cost_limit_base = 100
> > worker->wi_cost_limit = 66
> > worker->wi_cost_delay = 100
Sorry, worker->wi_cost_limit should be 33.

> > Each running autovacuum has "wi_cost_limit = 66" because the total
> > limit (100) is equally rationed. And another point is that the total
> > wi_cost_limit (198 = 66*3) is less than autovacuum_vacuum_cost_limit,
> > 100. Which are fine.

So the total wi_cost_limit, 99, is less than autovacuum_vacuum_cost_limit,
100.

> > Here let's change the autovacuum_vacuum_cost_delay/limit values to speed
> > up the running autovacuums.
> >
> > Case 1: increasing autovacuum_vacuum_cost_limit to 1000.
> >
> > After reloading the configuration file, the vacuum cost-based delay
> > parameters for each worker become as follows:
> >
> > worker->wi_cost_limit_base = 100
> > worker->wi_cost_limit = 100
> > worker->wi_cost_delay = 100
> >
> > If we rationed autovacuum_vacuum_cost_limit, 1000, to 3 workers, it
> > would be 333. But since we cap it by wi_cost_limit_base, the
> > wi_cost_limit is 100. I think this is what Mead reported here.
>
> Yes, this is exactly correct. The cost_limit is capped at the cost_limit
> that was set during the start of a running vacuum. My patch changes this
> cap to be the max allowed cost_limit (10,000).

The comment on the worker's limit calculation says:

    /*
     * We put a lower bound of 1 on the cost_limit, to avoid division-
     * by-zero in the vacuum code. Also, in case of roundoff trouble
     * in these calculations, let's be sure we don't ever set
     * cost_limit to more than the base value.
     */
    worker->wi_cost_limit = Max(Min(limit, worker->wi_cost_limit_base), 1);

If we used the max cost_limit as the upper bound here, couldn't the worker's
limit unnecessarily end up higher than the base value in case of roundoff
trouble?

I think the real problem here is rather that we don't update
wi_cost_limit_base and wi_cost_delay when rebalancing the cost.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/
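
[Editor's illustration] The capping behaviour discussed above can be shown with
a minimal, self-contained sketch. The helper balanced_limit() and the equal
rationing across workers are simplifications invented for illustration; they
are not the actual autovac_balance_cost() logic in autovacuum.c. Only the final
Max(Min(...)) clamp mirrors the quoted line of code.

/*
 * Simplified model of the per-worker cost_limit calculation: the global
 * limit is rationed equally across the active workers, then clamped to the
 * wi_cost_limit_base that was frozen when the worker started its table.
 */
#include <stdio.h>

#define Max(a,b) ((a) > (b) ? (a) : (b))
#define Min(a,b) ((a) < (b) ? (a) : (b))

/* Hypothetical helper, for illustration only. */
static int
balanced_limit(int global_cost_limit, int nworkers, int cost_limit_base)
{
    int limit = global_cost_limit / nworkers;   /* equal rationing */

    /* same clamp as the quoted code: never above base, never below 1 */
    return Max(Min(limit, cost_limit_base), 1);
}

int
main(void)
{
    int base = 100;     /* autovacuum_vacuum_cost_limit when the vacuum started */

    /* before the reload: 100 / 3 = 33 per worker */
    printf("limit before reload: %d\n", balanced_limit(100, 3, base));

    /*
     * after raising autovacuum_vacuum_cost_limit to 1000: 1000 / 3 = 333,
     * but the Min() clamp keeps the worker at its original base of 100
     */
    printf("limit after reload:  %d\n", balanced_limit(1000, 3, base));

    return 0;
}

Run as-is, this prints 33 and then 100, matching the numbers in the analysis
above. In this simplified model, refreshing cost_limit_base at rebalance time
(as suggested in the reply) would let the second call return 333 instead.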