Hi hackers,

After watching Robert's talk[1] on autovacuum and participating in the related workshop yesterday, it appears that people are inclined to use prioritization to address the issues highlighted in Robert's presentation. Here I list two of the failure modes that were discussed.

- Spinning. Running repeatedly on the same table but not accomplishing anything useful.
- Starvation. autovacuum can't vacuum everything that needs vacuuming.
- ...

The prioritization approach needs some basic infrastructure that Postgres doesn't have yet.

I had a random thought that introducing some randomness might help mitigate some of the issues mentioned above. Before performing vacuum on the collected tables, we could rotate the table_oids list by a random offset in the range [0, list_length(table_oids) - 1]. This way, every table would have an equal chance of being vacuumed first, which should mitigate both spinning and starvation. Even if there is a broken table that repeatedly gets stuck, this random approach would still give other tables a chance to be vacuumed. Eventually, the system would converge.

The change is something like the following. I haven't tested the code; I'm just posting it here for discussion. Let me know your thoughts.

diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 16756152b71..6dddd273d22 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -79,6 +79,7 @@
 #include "catalog/pg_namespace.h"
 #include "commands/dbcommands.h"
 #include "commands/vacuum.h"
+#include "common/pg_prng.h"
 #include "common/int.h"
 #include "lib/ilist.h"
 #include "libpq/pqsignal.h"
@@ -2267,6 +2268,25 @@ do_autovacuum(void)
 										  "Autovacuum Portal",
 										  ALLOCSET_DEFAULT_SIZES);
 
+	/*
+	 * Randomly rotate the list of tables to vacuum.  This is to avoid
+	 * always vacuuming the same table first, which could lead to spinning
+	 * on the same table or vacuuming starvation.
+	 */
+	if (list_length(table_oids) > 2)
+	{
+		int			rand = 0;
+		static pg_prng_state prng_state;
+		List	   *tmp_oids = NIL;
+
+		pg_prng_seed(&prng_state, (uint64) (getpid() ^ time(NULL)));
+		rand = (int) pg_prng_uint64_range(&prng_state, 0, list_length(table_oids) - 1);
+		if (rand != 0) {
+			tmp_oids = list_copy_head(table_oids, rand);	/* first rand OIDs */
+			table_oids = list_copy_tail(table_oids, rand);	/* the remaining OIDs */
+			table_oids = list_concat(table_oids, tmp_oids);	/* remainder, then head */
+		}
+	}
 	/*
 	 * Perform operations on collected tables.
 	 */

[1] How Autovacuum Goes Wrong: And Can We Please Make It Stop Doing That?
    https://www.youtube.com/watch?v=RfTD-Twpvac

-- 
Regards
Junwang Zhao
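
P.S. To make the intended rotation easy to check outside the backend, here is a minimal standalone sketch of "move the first k entries to the end" on a plain array. It is only an illustration under stated assumptions: TableOid, reverse_range and rotate_left are hypothetical names that do not exist in PostgreSQL, it works on a C array rather than a List, and the random offset k is assumed to be chosen by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PostgreSQL's Oid; not the real typedef. */
typedef uint32_t TableOid;

/* Reverse the half-open range items[lo, hi) in place. */
static void
reverse_range(TableOid *items, size_t lo, size_t hi)
{
	while (lo + 1 < hi)
	{
		TableOid	tmp = items[lo];

		items[lo] = items[hi - 1];
		items[hi - 1] = tmp;
		lo++;
		hi--;
	}
}

/*
 * Rotate items left by k positions (hypothetical helper): the entry at
 * index k becomes the first one and the first k entries move to the end.
 * Uses the three-reversal technique, so no extra storage is needed.
 */
void
rotate_left(TableOid *items, size_t n, size_t k)
{
	assert(items != NULL || n == 0);

	if (n < 2)
		return;					/* nothing to rotate */
	k %= n;						/* an offset of n is the same as 0 */
	if (k == 0)
		return;

	reverse_range(items, 0, k); /* reverse the first k entries */
	reverse_range(items, k, n); /* reverse the remaining entries */
	reverse_range(items, 0, n); /* reverse everything */
}
```

With five entries {10, 11, 12, 13, 14} and k = 2, the array becomes {12, 13, 14, 10, 11}: every entry is still present exactly once, and only the starting point has changed.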