Hi,

On 2023-09-19 14:50:13 -0400, Robert Haas wrote:
> On Tue, Sep 19, 2023 at 12:56 PM Andres Freund <and...@anarazel.de> wrote:
> > Yea, a setting like what's discussed here seems, uh, not particularly useful
> > for achieving the goal of compacting tables. I don't think guiding this
> > through SQL makes a lot of sense. For decent compaction you'd want to scan the
> > table backwards, and move rows from the end to earlier, but stop once
> > everything is filled up. You can somewhat do that from SQL, but it's going to
> > be awkward and slow. I doubt you even want to use the normal UPDATE WAL
> > logging.
> >
> > I think having explicit compaction support in VACUUM or somewhere similar
> > would make sense, but I don't think the proposed GUC is a useful stepping
> > stone.
>
> I think there's a difference between wanting to compact instantly and
> wanting to compact over time. I think that this kind of thing is
> reasonably well-suited to the latter, if we can engineer away the
> cases where it backfires.
>
> But I know people will try to use it for instant compaction too, and
> there it's worth remembering why we removed old-style VACUUM FULL. The
> main problem is that it was mind-bogglingly slow.

I think some of the slowness was implementation related, rather than
fundamental. But more importantly, storage was something entirely different
back then than it is now.

> The other really bad problem is that it caused massive index bloat. I think
> any system that's based on moving around my tuples right now to make my
> table smaller right now is likely to have similar issues.

I think the problem of exploding WAL usage exists both for compaction being
done in VACUUM (or a dedicated command) and being done by backends. I think to
make using a facility like this realistic, you really need some form of rate
limiting, regardless of when compaction is performed.

Even leaving WAL volume aside, naively doing on-update compaction will cause
lots of additional contention on early FSM pages.

> In the case where you're trying to compact gradually, I think there
> are potentially serious issues with index bloat, but only potentially.
> It seems like there are reasonable cases where it's fine.
> Specifically, if you have relatively few indexes per table, relatively
> few long-running transactions, and all tuples get updated on a
> semi-regular basis, I'm thinking that you're more likely to win than
> lose.

Maybe - but are you going to have a significant bloat issue in that case?
Sure, if the updates update most of the table, you are going to - but then
on-update compaction won't really be needed either, since you're going to run
out of space on pages on a regular basis.

Greetings,

Andres Freund