On Thu, Apr 25, 2024 at 10:24 PM Laurenz Albe <laurenz.a...@cybertec.at> wrote:
> I don't find that convincing. Why are 2TB of wasted space in a 10TB
> table worse than 2TB of wasted space in 100 tables of 100GB each?

It's not worse, but it's more avoidable. No matter what you do, any
table that suffers a reasonable number of updates and/or deletes is
going to have some wasted space. When a tuple is deleted or updated,
the old one has to stick around until its xmax is all-visible, and
then after that until the page is HOT pruned, which may not happen
immediately, and then even after that the line pointer sticks around
until the next vacuum, which doesn't happen instantly either. No
matter how aggressive you make autovacuum, or even no matter how
aggressively you vacuum manually, non-insert-only tables are always
going to end up containing some bloat.

But how much? Well, it's basically given by
RATE_AT_WHICH_SPACE_IS_WASTED * AVERAGE_TIME_UNTIL_SPACE_IS_RECLAIMED.
Which, you'll note, does not really depend on the table size. It does
a little bit, because the time until a tuple is fully removed,
including the line pointer, depends on how long vacuum takes, and
vacuum takes longer on a big table than on a small one. But the
effect is much less than linear, I believe, because you can HOT-prune
as soon as the xmax is all-visible, which reclaims most of the space
instantly. So in practice, the minimum feasible steady-state bloat
for a table depends a great deal on how fast updates and deletes are
happening, but only weakly on the size of the table.
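To make the arithmetic concrete, here's a toy model of that formula
in Python. The churn rate and reclaim latency below are made-up
numbers for illustration, not measurements of anything:

# Steady-state bloat is churn rate times average reclaim latency;
# note that the size of the table never enters the calculation.
def steady_state_bloat_mb(waste_mb_per_min, reclaim_latency_min):
    return waste_mb_per_min * reclaim_latency_min

# Assume 1MB/min of churn and about an hour until space is reclaimed.
bloat_mb = steady_state_bloat_mb(1, 60)

# Same absolute bloat, wildly different relative bloat by table size.
for table_mb in (10, 10_000, 10_000_000):  # 10MB, 10GB, 10TB
    print(f"{table_mb} MB table: {bloat_mb / table_mb:.4%} bloat")

The same churn produces 0.0006% bloat on the 10TB table and 600% on
the 10MB one, which is the asymmetry described above.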
Which, in plain English, means that you should be able to vacuum a
10TB table often enough that it doesn't accumulate 2TB of bloat, if
you want to. It's going to be harder to vacuum a 10GB table often
enough that it doesn't accumulate 2GB of bloat. And it's going to be
*really* hard to vacuum a 10MB table often enough that it doesn't
accumulate 2MB of bloat. The only way you're going to be able to do
that last one at all is if the update rate is very low.

> > Another reason, at least in existing releases, is that at some
> > point index vacuuming hits a wall because we run out of space for
> > dead tuples. We *most definitely* want to do index vacuuming
> > before we get to the point where we're going to have to do
> > multiple cycles of index vacuuming.
>
> That is more convincing. But do we need a GUC for that? What about
> making a table eligible for autovacuum as soon as the number of
> dead tuples reaches 90% of what you can hold in
> "autovacuum_work_mem"?

That would have been a good idea to do in existing releases, a long
time before now, but we didn't. However, the new dead TID store
changes the picture, because if I understand John Naylor's remarks
correctly, the new TID store can hold so many TIDs so efficiently
that you basically won't run out of memory. So now I think this
wouldn't be effective - yet I still think it's wrong to let the
vacuum threshold scale without bound as the table size increases.

--
Robert Haas
EDB: http://www.enterprisedb.com
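For reference on that closing point, the trigger condition at issue
is, in effect, the following (a Python paraphrase of autovacuum's
threshold check using the default GUC values; the real logic lives in
autovacuum.c):

# Dead tuples needed before autovacuum triggers, with the defaults
# autovacuum_vacuum_threshold = 50 and
# autovacuum_vacuum_scale_factor = 0.2.
def autovacuum_trigger_threshold(reltuples,
                                 base_threshold=50,
                                 scale_factor=0.2):
    # Grows linearly with table size, with no upper bound.
    return base_threshold + scale_factor * reltuples

for reltuples in (10_000, 10_000_000, 10_000_000_000):
    threshold = autovacuum_trigger_threshold(reltuples)
    print(f"{reltuples:>14,} rows -> {threshold:>14,.0f} dead tuples")

A ten-billion-row table has to accumulate two billion dead tuples
before autovacuum even wakes up, which is the unbounded scaling being
objected to.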