Matthew,

> As long as pg_autovacuum remains a contrib module, I don't think any
> changes to the system catalogs will be made. If pg_autovacuum is
> deemed ready to move out of contrib, then we can talk about the above.
But we could create a config file that would store stuff in a flat-file
table, OR we could add our own "system table" that would be created when
one "initializes" pg_avd. Just an idea.

Mind you, I'm not so sure that we want to focus immediately on per-table
settings. I think that we want to get the "automatic" settings working
fairly well first; a lot of new DBAs would use the per-table settings to
shoot themselves in the foot. So we need to be able to make a strong
recommendation to "try the automatic settings first."

> Are you saying that you can vacuum a 1 million row table in 2-4
> minutes? While a vacuum of the same table with an additional 1 million
> dead tuples would take an hour?

I'm probably exaggerating. I do know that I can vacuum a fairly clean
1-5 million row table in less than 4 minutes. I've never let such a
table get to 50% dead tuples, so I don't really know how long that
takes. Call me a coward if you like ...

> > I'd be really reluctant to base pg_avd frequency on the fsm settings
> > instead. What if the user loads 8GB of data but leaves fsm_pages at
> > the default of 10,000? You can't do much with that; you'd have to
> > vacuum if even 1% of the data changed.
>
> Ok, but as you said above it's very easy to set the FSM once you know
> your db size.

Actually, thinking about this I realize that pg_avd and the Perl-based
postgresql.conf configuration script I was working on (darn, who was
doing that with me?) need to go together. With pg_avd, setting
max_fsm_pages is very easy; without it, it's a bit of guesswork.

So I think we can do this for the 'auto' settings:

If max_fsm_pages is between 13% and 100% of the total database pages,
set the vacuum scale factor to 3/4 of the ratio of max_fsm_pages to
database pages, e.g. database = 18,000,000 data pages; max_fsm_pages =
3,600,000; vacuum scale factor = 3.6mil/18mil * 3/4 = 0.15.

If max_fsm_pages is less than 13% of database pages, issue a warning to
the user (log it, if possible) and set the scale factor to 0.1. If it's
greater than 100%, set the scale factor to 1 and leave it alone. (A
quick C sketch of this rule is at the end of this mail.)

> I don't have a good plan as to how to incorporate
> this data, but to a large extent the FSM already tracks table activity
> and gives us the most accurate answer about storage growth (short of
> using something like contrib/pgstattuple which takes nearly the same
> amount of time as an actual vacuum)

I don't really think we need to do dynamic monitoring at this point. It
would be a lot of engineering to check data page pollution without
having a significant performance impact. It's doable, but something I
think we should hold off on until version 3. It would mean hacking the
FSM, which is a little beyond me right now.

> In my testing, I never changed the default statistics settings.

Ah. Well, a lot of users do to resolve query problems.

> But we track tuples because we can compare against the count given by
> the stats system. I don't know of a way (other than looking at the FSM,
> or contrib/pgstattuple) to see how many dead pages exist.

No, but for scaling you don't need the dynamic count of tuples or of
dead tuples; pg_class holds a reasonably accurate count of pages per
table as of the last vacuum.

--
Josh Berkus
Aglio Database Solutions
San Francisco
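
To make the arithmetic concrete, here is a minimal C sketch of the 'auto'
scale-factor rule described above. It is purely illustrative: the names
total_db_pages and max_fsm_pages and the warning text are mine, not
anything that exists in pg_avd today.

    #include <stdio.h>

    /*
     * Illustrative only: derive the 'auto' vacuum scale factor from how
     * much of the database the FSM can cover.  total_db_pages and
     * max_fsm_pages are assumed to be gathered elsewhere (e.g. summed
     * pg_class.relpages and the max_fsm_pages setting).
     */
    static double
    auto_vacuum_scale_factor(double total_db_pages, double max_fsm_pages)
    {
        double ratio = max_fsm_pages / total_db_pages;

        if (ratio < 0.13)
        {
            /* FSM is undersized for this database: warn and floor at 0.1 */
            fprintf(stderr,
                    "WARNING: max_fsm_pages is below 13%% of database pages\n");
            return 0.1;
        }
        if (ratio > 1.0)
            return 1.0;     /* FSM covers more than the whole database */

        /* normal case: scale factor is 3/4 of the FSM coverage ratio */
        return ratio * 0.75;
    }

    int
    main(void)
    {
        /* example from above: 18,000,000 data pages, max_fsm_pages = 3,600,000 */
        printf("%.2f\n", auto_vacuum_scale_factor(18000000.0, 3600000.0));  /* 0.15 */
        return 0;
    }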