Amit Kapila wrote:

> Today while again thinking about the strategy used in patch to
> parallelize the operation (vacuum database), I think we can
> improve the same for cases when number of connections are
> lesser than number of tables in database (which I presume
> will normally be the case). Currently we are sending command
> to vacuum one table per connection, how about sending multiple
> commands (example Vacuum t1; Vacuum t2) on one connection.
> It seems to me there is extra roundtrip for cases when there
> are many small tables in database and few large tables. Do
> you think we should optimize for any such cases?

I don't think this is a good idea, at least not in a first cut of this
patch. It's easy to imagine that a table you initially think is small
enough turns out to have grown much larger since the last analyze. In
that case, having one worker process that table together with some
other table could end up being bad for parallelism, if it later turns
out that some other worker has no table to process. (Table t2 in your
example could grow between the time the command is sent and the time
t1 is vacuumed.) It's simpler to have each worker do only one thing at
a time.

I don't think it's a very good idea to call pg_relation_size() on
every table in the database from vacuumdb.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
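
To illustrate the parallelism concern, here is a rough sketch in Python, not vacuumdb code. It compares handing each idle connection one table at a time against sending fixed batches up front. The function names and the durations are made up for the example. The model also ignores the per-command round trip that batching is meant to save, so it shows only the downside being discussed.

```python
import heapq

def makespan_one_at_a_time(durations, n_workers):
    """Total elapsed time when each idle worker takes the next table."""
    # Min-heap of the times at which each worker becomes free.
    free_at = [0] * n_workers
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)      # earliest idle worker
        heapq.heappush(free_at, start + d)  # busy until start + d
    return max(free_at)

def makespan_batched(durations, batches):
    """Total elapsed time when each worker gets a fixed batch up front."""
    return max(sum(durations[i] for i in batch) for batch in batches)

# Hypothetical case: four tables that all looked small by stale
# statistics, so they were paired as (t1, t2) and (t3, t4). In reality
# t1 and t2 have grown; the numbers are invented vacuum times.
actual = [4, 8, 1, 1]            # t1, t2, t3, t4
batches = [[0, 1], [2, 3]]       # two connections, two tables each

batched = makespan_batched(actual, batches)
dynamic = makespan_one_at_a_time(actual, n_workers=2)
print("batched:", batched, "one at a time:", dynamic)
```

In this made-up case the batched schedule leaves the second connection idle after 2 time units while the first one works through t1 and then t2, whereas with one table per command the idle connection picks up the remaining tables on its own.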