On Tue, Jan 13, 2015 at 6:25 AM, John Gorman <johngorm...@gmail.com> wrote:
> One approach that has worked well for me is to break big jobs into much
> smaller bite size tasks. Each task is small enough to complete quickly.
>
> We add the tasks to a task queue and spawn a generic worker pool which eats
> through the task queue items.
>
> This solves a lot of problems.
>
> - Small to medium jobs can be parallelized efficiently.
> - No need to split big jobs perfectly.
> - We don't get into a situation where we are waiting around for a worker to
>   finish chugging through a huge task while the other workers sit idle.
> - Worker memory footprint is tiny so we can afford many of them.
> - Worker pool management is a well known problem.
> - Worker spawn time disappears as a cost factor.
> - The worker pool becomes a shared resource that can be managed and reported
>   on and becomes considerably more predictable.

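(For anyone skimming the thread, the pattern described in the quoted message can be sketched roughly as below. This is only an illustration in Python using threads, not PostgreSQL code and not necessarily how John's system works; `run_pool`, `split_range`, the worker count, and the chunk size are all made-up names and values.)

```python
import queue
import threading


def split_range(total, chunk_size):
    """Break one big job (a range of item numbers) into small (start, end) tasks."""
    return [(start, min(start + chunk_size, total))
            for start in range(0, total, chunk_size)]


def run_pool(tasks, handler, num_workers=4):
    """Run handler(task) for every task using a fixed pool of generic workers."""
    task_queue = queue.Queue()
    for task in tasks:
        task_queue.put(task)

    results = []
    results_lock = threading.Lock()

    def worker():
        # Each worker keeps pulling small tasks until the queue is drained,
        # so no worker sits idle while another is stuck on a huge task.
        while True:
            try:
                task = task_queue.get_nowait()
            except queue.Empty:
                return
            result = handler(task)
            with results_lock:
                results.append(result)

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return results  # completion order, not task order


if __name__ == "__main__":
    # Hypothetical "big job": sum the integers 0..999 in chunks of 100.
    chunks = split_range(1000, 100)
    partial_sums = run_pool(chunks, lambda c: sum(range(c[0], c[1])))
    print(sum(partial_sums))  # prints 499500
```

Because the tasks are small and the pool is fixed, the split does not have to be perfect: a slow chunk only delays the one worker that picked it up.
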
I think this is a good idea, but for now I would like to keep our goals
somewhat more modest: let's see if we can get parallel sequential scan,
and only parallel sequential scan, working and committed. Ultimately, I
think we may need something like what you're talking about, because if
you have a query with three or six or twelve different parallelizable
operations in it, you want the available CPU resources to switch between
those as their respective needs may dictate. You certainly don't want to
spawn a separate pool of workers for each scan.

But I think getting that all working in the first version is probably
harder than what we should attempt. We have a bunch of problems to solve
here just around parallel sequential scan and the parallel mode
infrastructure: heavyweight locking, prefetching, the cost model, and so
on. Trying to add to that all of the problems that might attend on a
generic task queueing infrastructure fills me with no small amount of
fear.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers