Gavin,

For the record, I don't consider myself a PostgreSQL newbie, nor do I manage any 2 TB databases (much less tables), but I do have an unusual production use case: thousands (> 10,000) of tables, many of them inherited, and many of them with hundreds of thousands (a few with millions) of rows.

Honestly, managing vacuums for this scenario via crontab would be a nightmare, and pg_autovacuum has been a godsend. Given the recent revelations about O(n^2) iteration over table lists in the current versions, and the apparent ease with which that problem could be solved by integrating pg_autovacuum's basic functionality into the backend, I can personally attest that there are real-world use cases that would benefit tremendously from integrated autovacuum.
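To make the scaling concern concrete, here is a toy sketch (not pg_autovacuum's actual code; names and structures are illustrative) of how re-scanning a table list once per table makes each polling cycle O(n^2), and how indexing the list by name brings it back to O(n):

```python
# Hypothetical illustration of the O(n^2) pattern: each polling cycle,
# every table observed in the catalog triggers a linear scan of the
# daemon's remembered table list.

def sync_linear(known, current):
    """For each table name seen this cycle, linearly scan the known-table
    list for its stats entry: O(n) per lookup, O(n^2) per cycle."""
    matched = []
    for name in current:
        for entry in known:          # repeated linear scan
            if entry["name"] == name:
                matched.append(entry)
                break
    return matched

def sync_hashed(known, current):
    """Index known tables by name once, then each lookup is O(1),
    making the whole cycle O(n)."""
    by_name = {entry["name"]: entry for entry in known}
    return [by_name[name] for name in current if name in by_name]

# With >10,000 tables, the difference between these two is the
# difference between a cheap poll and a CPU-bound one.
known = [{"name": f"t{i}", "n_dead": i} for i in range(10_000)]
current = [f"t{i}" for i in range(10_000)]
assert sync_linear(known, current) == sync_hashed(known, current)
```

The point is only about asymptotics: with thousands of tables, the nested scan dominates the daemon's polling cost even though each individual comparison is trivial.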

A few months ago, I attempted to solve the wrong problem by converting a hardcoded threshold into another command-line option. Had I spotted the O(n^2) problem, I might have spent that time working on it instead. I may still head down that road anyway if it looks like integrated pg_autovacuum is going to be put on hold indefinitely after this discussion.
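For context, the kind of threshold in question is pg_autovacuum's trigger condition: vacuum a table once its dead tuples exceed a base value plus a scale factor times the table's row count. A minimal sketch of that logic, with illustrative parameter names and defaults (the actual daemon exposes these as command-line options):

```python
# Sketch of a pg_autovacuum-style trigger check. The base/scale defaults
# here are illustrative, not necessarily the shipped values.

def needs_vacuum(n_dead, reltuples, base=1000, scale=2.0):
    """Vacuum when dead tuples exceed base + scale * reltuples."""
    return n_dead > base + scale * reltuples

assert needs_vacuum(5000, 1000)       # 5000 > 1000 + 2.0 * 1000
assert not needs_vacuum(2500, 1000)   # 2500 <= 3000
```

Exposing base and scale as options lets small, busy tables be vacuumed aggressively without subjecting large, mostly-static tables to the same schedule.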

Anyway, I just wanted to offer some food for thought on the practicality of a tool like pg_autovacuum.

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC

Strategic Open Source: Open Your i™

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005

On Jun 16, 2005, at 5:22 PM, Gavin Sherry wrote:

On Thu, 16 Jun 2005, Alvaro Herrera wrote:


On Thu, Jun 16, 2005 at 04:20:34PM +1000, Gavin Sherry wrote:


2) By no fault of its own, autovacuum's level of granularity is the table level. For people dealing with non-trivial amounts of data (and we're not talking gigabytes or terabytes here), this is a serious drawback. Vacuum
at peak times can cause very intense IO bursts -- even with the
enhancements in 8.0. I don't think the solution to the problem is to give users the impression that it is solved and then vacuum their tables during
peak periods. I cannot stress this enough.


People running systems with petabyte-sized tables can disable autovacuum
for those tables, and leave it running for the rest.  Then they can
schedule whatever maintenance they see fit on their gigantic tables.
Trying to run a database with more than a dozen gigabytes of data
without expert advice (or at least reading the manual) would be
extremely stupid anyway.


As I've said a few times, I'm not concerned about such users. I'm
concerned about users with some busy tables of a few hundred megabytes. I
still don't think VACUUM at arbitrary times on such tables is suitable.

Thanks,

Gavin
