On 12/5/14, 9:08 AM, José Luis Tallón wrote:
> Moreover, when load goes up, the relative cost of parallel working should go up as well. Something like:
>
>     c = number of cores
>     l = 1-minute load average
>
>     additional_cost = tuple_estimate * cpu_tuple_cost * (l + 1) / (c - 1)    (for c > 1, of course)
>
> ...
>
> The parallel seq scan nodes are definitely the best approach for "parallel query", since the planner can optimize them based on cost. I'm wondering about the ability to change the implementation of some methods at execution time: given a previously planned query (I'm thinking specifically of prepared statements here), chances are that at execution time a different implementation of the same "node" would be more suitable, and it could be used instead while the condition holds.
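To make the quoted cost adjustment concrete, a minimal sketch of how it might be computed; this is not PostgreSQL's actual costing code, and the function and parameter names are invented for illustration:

/*
 * Sketch only, not actual costing code.  Per the quoted formula:
 *   c = number of cores, l = 1-minute load average
 *   additional_cost = tuple_estimate * cpu_tuple_cost * (l + 1) / (c - 1)
 */
static double
parallel_additional_cost(double tuple_estimate, double cpu_tuple_cost,
                         double load_1min, int n_cores)
{
    /* The formula is only defined for c > 1. */
    if (n_cores <= 1)
        return 0.0;

    return tuple_estimate * cpu_tuple_cost *
           (load_1min + 1.0) / (n_cores - 1);
}

For example, 100000 tuples with cpu_tuple_cost = 0.01 on a 4-core box running at a 1-minute load of 3 would add 100000 * 0.01 * 4 / 3 ≈ 1333 to the parallel path's estimate, so a busier machine penalizes parallelism more heavily.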
These comments got me wondering... would it be better to decide on parallelism during execution instead of at plan time? That would allow us to scale parallelism dynamically based on system load. If we didn't even consider parallelism until we'd pulled some number of tuples/pages from a relation, that would also eliminate all parallel overhead on small relations.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
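A rough sketch of what such an execution-time decision might look like; this is purely hypothetical, and the page threshold, function name, and load heuristic below are invented rather than taken from any existing patch:

/*
 * Hypothetical sketch, not an actual patch: start the scan serially and
 * only consider launching workers once enough pages have been read,
 * scaling the worker count down when the machine is already busy.
 */
#define PARALLEL_PAGE_THRESHOLD 1024    /* made-up cutoff */

static int
choose_workers_at_runtime(long pages_scanned, int max_workers,
                          double load_1min, int n_cores)
{
    int idle_cores;

    /* Small relations never pay the parallel startup overhead. */
    if (pages_scanned < PARALLEL_PAGE_THRESHOLD)
        return 0;

    /* Back off when most cores are already occupied. */
    idle_cores = n_cores - (int) load_1min - 1;
    if (idle_cores <= 0)
        return 0;

    return (idle_cores < max_workers) ? idle_cores : max_workers;
}

Presumably a long-running scan could re-run such a check periodically, which is the kind of flexibility a plan-time worker count can't offer.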