Hello, everyone! I think we currently lack the information needed to analyze how vacuum behaves.
Needless to say, vacuum is one of the most important processes in a database system. It prevents problems such as table and index bloat, and the emergency freezing that is forced when transaction ID wraparound approaches. It also keeps the visibility map up to date. On the other hand, overly aggressive autovacuum settings can make it consume a lot of computing resources, which causes all queries in the system to run longer.
Nowadays vacuum also gathers statistical information about tables, and this information is important not only for the optimizer.
Because vacuum is an automated process, there are many settings that determine how aggressively it acts on other objects in the database. In addition, it is sometimes important to set the right parameters for a specific table because of how dynamically it changes.
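To illustrate how such settings interact with table size, here is a minimal sketch of the documented autovacuum trigger condition: a table is vacuumed once its dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples. The struct and function names below are my own and hypothetical, not PostgreSQL symbols, and the sketch ignores the insert-based and wraparound triggers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for two autovacuum settings; not a PostgreSQL type. */
typedef struct AvSettings
{
    double vacuum_threshold;    /* autovacuum_vacuum_threshold */
    double vacuum_scale_factor; /* autovacuum_vacuum_scale_factor */
} AvSettings;

/* Number of dead tuples above which autovacuum would pick up the table. */
static double
av_vacuum_trigger(AvSettings s, double reltuples)
{
    assert(reltuples >= 0);
    return s.vacuum_threshold + s.vacuum_scale_factor * reltuples;
}

/* True when the dead-tuple count exceeds the trigger point. */
static bool
av_needs_vacuum(AvSettings s, double reltuples, double dead_tuples)
{
    return dead_tuples > av_vacuum_trigger(s, reltuples);
}
```

With the default values (50 and 0.2), a table of one million rows is vacuumed only after roughly 200,050 dead tuples accumulate, while a per-table scale factor of 0.01 lowers that to roughly 10,050. This is why a single global setting rarely suits both small and large, or quiet and frequently updated, tables.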
A database administrator needs to tune autovacuum to strike a balance between the useful work vacuum does for the database system on the one hand and the overhead of its workload on the other. However, the statistics on the number of vacuum passes over each table and the operational data from pg_stat_progress_vacuum are not enough to judge how vacuum is behaving, because the latter is available only while a vacuum is running and gives no strategic overview of the period under consideration.
To sum up, autovacuum behaves strategically: how often it runs and how many resources it consumes depend on the workload of the database. Its load on the database is minimal for an append-only table and highest for a table with frequent updates. Furthermore, the vacuum load depends heavily on the number and size of indexes. Because there is no visibility map for indexes, vacuum scans each index completely, and the worst case is when it has to do so for a bloated index on a small table.
I suggest gathering information about the resources vacuum consumes when processing tables and indexes and storing it in the statistics for table and index relations (for example, in the PgStat_StatTabEntry structure, as is already done for the usual statistics). This would let us determine how well vacuum is configured and evaluate its overhead on the system at the strategic level. Vacuum already gathers this valuable information, but it is not stored anywhere.
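To make the idea more concrete, below is a rough sketch of what cumulative per-relation vacuum counters could look like. Everything in it is hypothetical: the struct, its fields, and the functions are names I made up for illustration, they do not exist in PostgreSQL, and the actual set of counters is open for discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cumulative vacuum counters for one relation. */
typedef struct VacuumUsageSketch
{
    int64_t vacuum_runs;    /* completed vacuum passes */
    int64_t pages_scanned;  /* pages visited by vacuum */
    int64_t tuples_removed; /* dead tuples removed */
    int64_t buffers_read;   /* buffers read from outside shared buffers */
    int64_t wal_bytes;      /* WAL generated by vacuum */
    double  total_time_ms;  /* elapsed time spent vacuuming */
} VacuumUsageSketch;

/* Fold the figures of one finished vacuum pass into the running totals. */
static void
vacuum_usage_accumulate(VacuumUsageSketch *total,
                        const VacuumUsageSketch *one_pass)
{
    assert(total != NULL && one_pass != NULL);
    total->vacuum_runs += one_pass->vacuum_runs;
    total->pages_scanned += one_pass->pages_scanned;
    total->tuples_removed += one_pass->tuples_removed;
    total->buffers_read += one_pass->buffers_read;
    total->wal_bytes += one_pass->wal_bytes;
    total->total_time_ms += one_pass->total_time_ms;
}

/* Average elapsed time per pass; 0 when no pass has been recorded yet. */
static double
vacuum_usage_avg_time_ms(const VacuumUsageSketch *total)
{
    assert(total != NULL);
    if (total->vacuum_runs == 0)
        return 0.0;
    return total->total_time_ms / (double) total->vacuum_runs;
}
```

Counters accumulated like this, sampled at two points in time, would give the strategic view over a period that the progress view cannot provide.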
--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company