Gavin,
For the record, I don't consider myself a PostgreSQL newbie, nor do I
manage any 2 TB databases (much less tables), but I do have an
unusual production use case: thousands (> 10,000) of tables, many of
them inherited, and many of them with hundreds of thousands (a few
with millions) of rows.
Honestly, managing vacuums by crontab for this scenario would be a
nightmare, and pg_autovacuum has been a godsend. Considering the
recent revelations of O(n^2) iteration over table lists in the
current version, and the stated (and apparent) ease with which that
problem could be solved by integrating pg_autovacuum's basic
functionality into the backend, I can personally attest that there
are real-world use cases that would benefit tremendously from
integrated autovacuum.
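For anyone who hasn't read the daemon's source, my understanding is
that the quadratic cost comes from rescanning a linked list of known
tables once for every table reported on each polling round. A minimal
sketch of that pattern in C -- names and structures are purely
illustrative, this is not pg_autovacuum's actual code:

    /* Purely illustrative: the quadratic lookup pattern. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct tbl_info {
        char name[32];
        struct tbl_info *next;
    };

    /* Linear scan of the list for one name: O(n). */
    static struct tbl_info *find_table(struct tbl_info *head,
                                       const char *name)
    {
        struct tbl_info *t;
        for (t = head; t != NULL; t = t->next)
            if (strcmp(t->name, name) == 0)
                return t;
        return NULL;
    }

    int main(void)
    {
        struct tbl_info *head = NULL;
        char name[32];
        int i, n = 10000;

        /* Build a list of n "tables". */
        for (i = 0; i < n; i++) {
            struct tbl_info *t = malloc(sizeof *t);
            snprintf(t->name, sizeof t->name, "table_%d", i);
            t->next = head;
            head = t;
        }

        /* One polling round: look up every table by name.
           n lookups x O(n) scan each = O(n^2) work per round.
           With 10,000+ tables that's on the order of 10^8 steps. */
        for (i = 0; i < n; i++) {
            snprintf(name, sizeof name, "table_%d", i);
            find_table(head, name);
        }
        printf("did %d lookups over a %d-entry list\n", n, n);
        return 0;
    }

Replacing the list with a hash keyed on table oid would make each
lookup O(1) and a round O(n), and presumably integration with the
backend would let autovacuum lean on the existing catalog and stats
machinery instead of a hand-rolled list.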
A few months ago, I attempted to solve the wrong problem by
converting a hardcoded threshold into another command-line option.
Had I spotted the O(n^2) problem then, I might have spent that time
working on it instead. I suppose it's possible I'll head down that
road anyway if it looks like integrated pg_autovacuum is going to be
put on hold indefinitely after this discussion.
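For context, the threshold I mean is the per-table trigger
pg_autovacuum computes as documented: a base value plus a scale
factor times the table's tuple count, with a VACUUM issued once the
stats collector's dead-tuple count exceeds it. A sketch of that test
(variable names are mine, not the daemon's):

    /* Sketch of pg_autovacuum's documented vacuum trigger:
       threshold = base + scale * reltuples. Names are illustrative. */
    #include <stdio.h>

    static int needs_vacuum(double reltuples, double dead_tuples,
                            double base, double scale)
    {
        return dead_tuples > base + scale * reltuples;
    }

    int main(void)
    {
        /* With base 1000 and scale 2 (illustrative values), a
           500,000-row table is vacuumed once roughly 1,001,000
           tuples have been updated or deleted. */
        printf("%d\n", needs_vacuum(500000.0, 1200000.0, 1000.0, 2.0));
        return 0;
    }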
Anyway, just wanted to throw out some food for thought on the
practicality of a tool like pg_autovacuum.
--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC
Strategic Open Source: Open Your i™
http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005
On Jun 16, 2005, at 5:22 PM, Gavin Sherry wrote:
> On Thu, 16 Jun 2005, Alvaro Herrera wrote:
>
>> On Thu, Jun 16, 2005 at 04:20:34PM +1000, Gavin Sherry wrote:
>>
>>> 2) By no fault of its own, autovacuum's level of granularity is the
>>> table level. For people dealing with non-trivial amounts of data (and
>>> we're not talking gigabytes or terabytes here), this is a serious
>>> drawback. Vacuum at peak times can cause very intense IO bursts --
>>> even with the enhancements in 8.0. I don't think the solution to the
>>> problem is to give users the impression that it is solved and then
>>> vacuum their tables during peak periods. I cannot stress this enough.
>>
>> People running systems with petabyte-sized tables can disable
>> autovacuum for those tables, and leave it running for the rest. Then
>> they can schedule whatever maintenance they see fit on their gigantic
>> tables. Trying to run a database with more than a dozen gigabytes of
>> data without expert advice (or at least reading the manual) would be
>> extremely stupid anyway.
>
> As I've said a few times, I'm not concerned about such users. I'm
> concerned about users with some busy tables of a few hundred megabytes.
> I still don't think VACUUM at arbitrary times on such tables is
> suitable.
>
> Thanks,
> Gavin