On Sat, Mar 1, 2014 at 9:04 PM, Stephen Frost <sfr...@snowman.net> wrote:
> * Robert Haas (robertmh...@gmail.com) wrote:
>> I don't see that parallelizing Append is any easier than any other
>> problem in this space.  There's no parallel I/O facility, so you need
>> a background worker per append branch to wait on I/O.  And you have
>> all the problems of making sure that the workers have the same
>> snapshot, making sure they don't self-deadlock, etc. that you have for
>> any other case.
>
> Erm, my thought was to use a select() loop which sends out I/O requests
> and then loops around waiting to see who finishes it.  It doesn't
> parallelize the CPU cost of getting the rows back to the caller, but
> it'd at least parallelize the I/O, and if what's underneath is actually
> a remote FDW running a complex query (because the other side is actually
> a view), it would be a massive win to have all the remote FDWs executing
> concurrently instead of serially as we have today.
I can't really make sense of this.  In general, what's under each branch
of an Append node is an arbitrary plan tree, and the only operation you
can count on being able to do for each one is "get me the next tuple"
(i.e. ExecProcNode).  Append has no idea whether that involves disk I/O,
or for which blocks.  But even if it did, there's no standard API for
issuing an asynchronous read(), which is how we get blocks into shared
buffers.  We do have an API for requesting the prefetch of a block on
platforms with posix_fadvise(), but there's no way to find out via
select() whether the prefetch has completed, and even if there were,
you'd still have to do the actual read() afterwards.

For FDWs, one idea might be to kick off the remote query at
ExecInitNode() time rather than ExecProcNode() time, at least when the
remote query doesn't depend on parameters that aren't available until
run time.  That would actually allow multiple remote queries to run
simultaneously, or in parallel with local work.  It would also run them
in cases where the relevant plan node is never executed, which would be
bad, but perhaps rare enough not to worry about.  Or we could add a new
API, something like ExecPrefetchNode(), that tells nodes to prepare to
have tuples pulled, so they can do things like kick off asynchronous
queries.  But I still don't see any clean way for the Append node to
find out which child is ready to return results first.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
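For illustration only, below is a rough standalone libpq sketch of the
"start all the remote queries up front, then select() on their sockets
to see which finishes first" pattern Stephen describes above.  It
deliberately lives outside the executor, so it sidesteps the integration
questions raised in this message (snapshots, run-time parameters,
never-executed nodes); the connection strings, the query text, and the
fixed array of three connections are just placeholders.

/*
 * async_append_sketch.c -- illustrative sketch only, not executor code.
 *
 * Sends a placeholder query to several servers with PQsendQuery() and
 * then waits in a select() loop to see which connection has results
 * ready first, instead of running the queries serially.
 */
#include <stdio.h>
#include <sys/select.h>

#include <libpq-fe.h>

#define NCONN 3

int
main(void)
{
	/* Placeholder connection strings; substitute real ones. */
	const char *conninfo[NCONN] = {
		"host=shard1 dbname=test",
		"host=shard2 dbname=test",
		"host=shard3 dbname=test"
	};
	PGconn	   *conn[NCONN];
	int			pending = 0;
	int			i;

	/* Kick off every remote query up front, without waiting for results. */
	for (i = 0; i < NCONN; i++)
	{
		conn[i] = PQconnectdb(conninfo[i]);
		if (PQstatus(conn[i]) != CONNECTION_OK)
		{
			fprintf(stderr, "connection %d: %s", i, PQerrorMessage(conn[i]));
			PQfinish(conn[i]);
			conn[i] = NULL;
			continue;
		}
		if (PQsendQuery(conn[i], "SELECT count(*) FROM some_big_view"))
			pending++;
	}

	/* Wait for whichever connection becomes readable first. */
	while (pending > 0)
	{
		fd_set		rfds;
		int			maxfd = -1;

		FD_ZERO(&rfds);
		for (i = 0; i < NCONN; i++)
		{
			if (conn[i] && PQisBusy(conn[i]))
			{
				int			fd = PQsocket(conn[i]);

				FD_SET(fd, &rfds);
				if (fd > maxfd)
					maxfd = fd;
			}
		}
		if (maxfd < 0 || select(maxfd + 1, &rfds, NULL, NULL, NULL) < 0)
			break;

		for (i = 0; i < NCONN; i++)
		{
			if (conn[i] == NULL || !FD_ISSET(PQsocket(conn[i]), &rfds))
				continue;
			if (!PQconsumeInput(conn[i]))
			{
				fprintf(stderr, "connection %d: %s", i, PQerrorMessage(conn[i]));
				PQfinish(conn[i]);
				conn[i] = NULL;
				pending--;
				continue;
			}
			if (!PQisBusy(conn[i]))
			{
				PGresult   *res;

				/* Drain all results for this connection's query. */
				while ((res = PQgetResult(conn[i])) != NULL)
				{
					printf("connection %d: %d row(s)\n", i, PQntuples(res));
					PQclear(res);
				}
				pending--;
			}
		}
	}

	for (i = 0; i < NCONN; i++)
		if (conn[i])
			PQfinish(conn[i]);
	return 0;
}

The only part that carries over to the Append discussion is the shape of
the loop: work is started before any results are demanded, and
completion is discovered by socket readiness rather than by blocking in
PQgetResult() on one connection at a time.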