On Tue, Oct 28, 2014 at 4:48 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
> On 16 October 2014 16:22, Robert Haas <robertmh...@gmail.com> wrote:
>>> Might I gently enquire what the "something usable" we are going to see
>>> in this release? I'm not up on current plans.
>>
>> I don't know how far I'm going to get for this release yet. I think
>> pg_background is a pretty good milestone, and useful in its own right.
>> I would like to get something that's truly parallel working sooner
>> rather than later, but this group locking issue is one of 2 or 3
>> significant hurdles that I need to climb over first.
>
> pg_background is very cute, but really its not really a step forward,
> or at least very far. It's sounding like you've already decided that
> is as far as we're going to get this release, which I'm disappointed
> about.
>
> Given your description of pg_background it looks an awful lot like
> infrastructure to make Autonomous Transactions work, but it doesn't
> even do that. I guess it could do in a very small additional patch, so
> maybe it is useful for something.
>
> You asked for my help, but I'd like to see some concrete steps towards
> an interim feature so I can see some benefit in a clear direction.
>
> Can we please have the first step we discussed? Parallel CREATE INDEX?
> (Note the please)

What I've been thinking about trying to work towards is parallel sequential scan. I think that it would actually be pretty easy to code up a mostly-working version using the existing infrastructure, but the patch would be rejected with a bazooka, because the non-working parts would include things like:

1. The cooperating backends might not all be using the same snapshot, because that requires sharing the snapshot, combo CID hash, and transaction state.

2. The quals that got pushed down to the workers might not return the same answers that they would have produced with a single backend, because we have no mechanism for assessing pushdown-safety.

3. Deadlock detection would be to some greater or lesser degree broken, the details depending on the implementation choices you made.

There is a bit of a chicken-and-egg problem here. If I submit a patch for parallel sequential scan, it'll (justifiably) get rejected because it doesn't solve those problems. So I'm trying to solve those above-enumerated problems first, with working and at least somewhat-useful examples that show how the incremental bits of infrastructure can be used to do stuff. But that leads to your (understandable) complaint that this isn't very real yet.

Why am I now thinking about parallel sequential scan instead of parallel CREATE INDEX? You may remember that I posted a patch for a new memory allocator some time ago, and it came in for a fair amount of criticism and not much approbation. Some of that criticism was certainly justified, and perhaps I was as hard on myself as anyone else was. However you want to look at it, I see the trade-off between parallel sort and parallel seq-scan this way: parallel seq-scan requires dealing with the planner (ouch!) but parallel sort requires dealing with memory allocation in dynamic shared memory segments (ouch!). Both of them require solving the three problems listed above.
And maybe a few others, but I think those are the big ones - and I think proper deadlock detection is the hardest of them.

A colleague of mine has drafted patches for sharing snapshots and combo CIDs between processes, and as you might expect that's pretty easy. Sharing the transaction state (so we can test whether a transaction ID is "our" transaction ID inside the worker) is a bit trickier, but I think not too hard. Assessing pushdown-safety will probably boil down to adding some equivalent of proisparallel. Maybe not the most elegant, but defensible, and if you're looking for the shortest path to something usable, that's probably it. But deadlock detection ... well, I don't see any simpler solution than what I'm trying to build here.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company