On Sun, 24 Feb 2019 at 10:06, Gunther <r...@gusw.net> wrote:
> I am using an SQL queue for distributing work to massively parallel workers.
> Workers come in servers with 12 parallel threads. One of those worker sets
> handles 7 transactions per second. If I add a second one, for 24 parallel
> workers, it scales to 14 /s. Even a third, for 36 parallel workers, I can add
> to reach 21 /s. If I try a fourth set, 48 workers, I end up in trouble. But
> that isn't even so much my problem rather than the fact that in short time,
> the performance will deteriorate, and it looks like that is because the queue
> index deteriorates and needs a REINDEX.
It sounds very much like auto-vacuum is simply unable to keep up with the rate at which the table is being updated. Please be aware that, by default, auto-vacuum is configured to run fairly slowly so as not to saturate low-end machines.

vacuum_cost_limit / autovacuum_vacuum_cost_limit control how many "points" the vacuum process can accumulate before it pauses for autovacuum_vacuum_cost_delay / vacuum_cost_delay. Additionally, after an auto-vacuum run completes, it will wait for autovacuum_naptime before checking again if any tables require some attention.

I think you should be monitoring how many auto-vacuum workers are busy during your runs. If you find that the "queue" table is being vacuumed almost constantly, then you'll likely want to increase vacuum_cost_limit / autovacuum_vacuum_cost_limit.

You could get an idea of how often this table is being auto-vacuumed by setting log_autovacuum_min_duration to 0 and checking the logs. Another way to check would be to sample what:

  SELECT query FROM pg_stat_activity WHERE query LIKE 'autovacuum%';

returns. You may find that all of the workers are busy most of the time. If so, that indicates that the cost limits need to be raised.
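Untested, but as a rough sketch (the specific values and the "queue" table name are only for illustration), that sampling and the follow-up changes could look something like this:

  -- Count busy auto-vacuum workers; compare the result against
  -- the autovacuum_max_workers setting.
  SELECT count(*) AS busy_workers
    FROM pg_stat_activity
   WHERE query LIKE 'autovacuum%';

  -- Log every auto-vacuum run, then let vacuum do more work per
  -- cost cycle. 2000 is just an example; the default is 200.
  ALTER SYSTEM SET log_autovacuum_min_duration = 0;
  ALTER SYSTEM SET vacuum_cost_limit = 2000;
  SELECT pg_reload_conf();

  -- Or raise the limit for just the hot table:
  ALTER TABLE queue SET (autovacuum_vacuum_cost_limit = 2000);

--
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services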