On Wed, Mar 16, 2011 at 6:36 PM, Jim Nasby <j...@nasby.net> wrote: > One way to look at this is that any system will have a limit on how quickly > it can vacuum everything. If it's having trouble dedicating enough IO to > vacuum, then autovac is going to have a long list of tables that it wants to > vacuum. When you're in that situation, you want to get to the next table that > needs vacuuming as quickly as possible, so if you've run through the first > heap scan and found only a limited number of dead tuples, it doesn't make > sense to spend a bunch of time scanning indexes and making a second heap scan > (though, IIRC the second scan doesn't hit the entire heap; it only hits the > tuples that were remembered as being dead).
I mostly agree with this, but you also can't postpone vacuuming indefinitely just because you're too busy; that's going to blow up in your face. > Of course, going along the lines of an autovac-based tuning mechanism, you > have to question how a table would show up for autovac if there's not > actually a number of dead tuples. One scenario is freezing (though I'm not > sure if your super-lazy vacuum could freeze tuples or not). Another is > inserts. That might become a big win; you might want to aggressively scan a > table that gets data loaded into it in order to set hint/all visible bits. Right. Really-lazy vacuum could freeze tuples. Unlike regular vacuum, it can also sensibly be done incrementally. One thing I was thinking about is counting the number of times that we fetched a tuple that was older than RecentGlobalXmin and had a committed xmin and an invalid xmax, but where the page was not PD_ALL_VISIBLE. If that's happening a lot, it probably means that some vacuuming would speed things up, by getting those PD_ALL_VISIBLE bits set. Perhaps you could work out some formula where you do a variable amount of super-lazy vacuuming depending on the number of such tuple fetches. The trick would be to avoid overdoing it (so that you swamp the I/O system) or underdoing it (so that the system never converges). It would be really nice (for this and for other things) if we had some way of measuring the I/O saturation of the system, so that we could automatically adjust the aggressiveness of background processes accordingly. Note also that if and when we get index-only scans, making sure the PD_ALL_VISIBLE bits (and thus the visibility map bits) actually get set is going to be a lot more important. > From a manual standpoint, ISTM that super-lazy vac would be extremely useful > for dealing with hint bits after a bulk insert to a table that also has some > update activity. Using a regular vacuum in that case would result in a lot of > extra work to deal with the small number of dead tuples. I can see that. > Perhaps it would be useful to write a script that analyzed the output of > vacuum verbose looking for tables where a super-lazy vacuum would have made > sense (assuming vacuum verbose provides the needed info). If we had such a > script we could ask folks to run it and see how much super-lazy vacuuming > would help in the real world. I'm a bit doubtful about this part. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers