Reviewing 0001: Perhaps ExecEndCteScan needs an adjustment. What if node->leader was never set?
Other than that, I think this is in good shape. Maybe there are other things we'd want to adjust here, or maybe there aren't, but there doesn't seem to be any good reason to bundle more changes into the same patch. Reviewing 0002 and beyond: I think it's good that you have tried to divide up a big change into little pieces, but I'm finding the result difficult to understand. It doesn't really seem like each patch stands on its own. I keep flipping between patches to try to understand why other patches are doing things, which kind of defeats the purpose of splitting stuff up. For example, 0002 adds a NodeTag field to QueryDesc, but it doesn't even seem to initialize that field, let alone use it for anything. It adds a CachedPlan pointer to QueryDesc too, and adapts CreateQueryDesc to allow one as an argument, but none of the callers actually pass anything. I suspect that that the first change (adding a NodeTag) field is a bug, and that the second one is intentional, but it's hard to tell without flipping through all of the other patches to see how they build on what 0002 does. And even when something isn't a bug, it's also hard to tell whether it's the right design, again because you can't consider each patch in isolation. Ideally, splitting a patch set should bring related changes together in a single patch and push unrelated changes apart into different patches, but I don't really see this particular split having that effect. There is a chicken and egg problem here, to be fair. If we add code that can make plan initialization fail without teaching the planner to cope with failures, then we have broken the server, and if we do the reverse, then we have a bunch of dead code that we can't test. Neither is very satisfactory. But I still hope there's some better division possible than what you have here currently. For instance, I wonder if it would be possible to add all the stuff to cope with plan initialization failing and then have a test patch that makes initialization randomly fail with some probability (or maybe you can even cause failures at specific points). Then you could test that infrastructure by running the regression tests in a loop with various values of the relevant setting. Another overall comment that I have is that it doesn't feel like there's enough high-level explanation of the design. I don't know how much of that should go in comments vs. commit messages vs. a README that accompanies the patch set vs. whatever else, and I strongly suspect that some of the stuff that seems confusing now is actually stuff that at one point I understood and have just forgotten about. But rediscovering it shouldn't be quite so hard. For example, consider the question "why are we storing the CachedPlan in the QueryDesc?" I eventually figured out that it's so that ExecPlanStillValid can call CachedPlanStillValid which can then consult the cached plan's is_valid flag. But is that the only access to the CachedPlan that we ever expect to occur via the QueryDesc? If not, what else is allowable? If so, why not just store a Boolean in the QueryDesc and arrange for the plancache to be able to flip it when invalidating? I'm not saying that's a better design -- I'm saying that it looks hard to understand your thought process from the patch set. And also, you know, assuming the current design is correct, could there be some way of dividing up the patch set so that this one change, where we add the CachedPlan to the QueryDesc, isn't so spread out across the whole series? Some more detailed review comments below. This isn't really a full review because I don't understand the patches well enough for that, but it's some stuff I noticed. In 0002: + * result-rel info, etc. Also, we don't pass the parent't copy of the Typo. + /* + * All the necessary locks must already have been taken when + * initializing the parent's copy of subplanstate, so the CachedPlan, + * if any, should not have become invalid during ExecInitNode(). + */ + Assert(ExecPlanStillValid(rcestate)); This -- and the other similar instance -- feel very uncomfortable. There's a lot of action at a distance here. If this assertion ever failed, how would anyone ever figure out what went wrong? You wouldn't for example know which object got invalidated, presumably corresponding to a lock that you failed to take. Unless the problem were easily reproducible in a test environment, trying to guess what happened might be pretty awful; imagine seeing this assertion failure in a customer log file and trying to back-track to the find the underlying bug. A further problem is that what would actually happen is you *wouldn't* see this in the customer log file, because assertions wouldn't be enabled, so you'd just see queries occasionally returning wrong answers, I guess? Or crashing in some other random part of the code? Which seems even worse. At a minimum I think this should be upgraded to a test-and-elog, and maybe there's some value in trying to think of what should get printed by that elog to facilitate proper debugging, if it happens. In 0003: + * + * OK to ignore the return value; plan can't become invalid, + * because there's no CachedPlan. */ - ExecutorStart(cstate->queryDesc, 0); + (void) ExecutorStart(cstate->queryDesc, 0); This also feels awkward, for similar reasons. Sure, it shouldn't return false, but also, if it did, you'd just blindly continue. Maybe there should be test-and-elog here too. Or maybe this is an indication that we need less action at a distance. Like, if ExecutorStart took the CachedPlan as an argument instead of feeding it through the QueryDesc, then you could document that ExecutorStart returns true if that value is passed as NULL and true or false otherwise. Here, whether ExecutorStart can return true or false depends on the contents of the queryDesc ... which, granted, in this case is just built a line or two before anyway, but if you just passed to to ExecutorStart then you wouldn't need to feed it through the QueryDesc, it seems to me. Even better, maybe there should be ExecutorStart() that continues returning void and ExecutorStartExtended() that takes a cached plan as an additional argument and returns a bool. /* - * Check that ExecutorFinish was called, unless in EXPLAIN-only mode. This - * Assert is needed because ExecutorFinish is new as of 9.1, and callers - * might forget to call it. + * Check that ExecutorFinish was called, unless in EXPLAIN-only mode or if + * execution was canceled. This Assert is needed because ExecutorFinish is + * new as of 9.1, and callers might forget to call it. */ Maybe we could drop the second sentence at this point. In 0005: + * XXX Maybe we should we skip calling ExecCheckPermissions from + * InitPlan in a parallel worker. Why? If the thinking is to save overhead, then perhaps try to assess the overhead. If the thinking is that we don't want it to fail spuriously, then we have to weight that against the (security) risk of succeeding spuriously. + * Returns true if current transaction holds a lock on the given relation of + * mode 'lockmode'. If 'orstronger' is true, a stronger lockmode is also OK. + * ("Stronger" is defined as "numerically higher", which is a bit + * semantically dubious but is OK for the purposes we use this for.) I don't particularly enjoy seeing this comment cut and pasted into some new place. Especially the tongue-in-cheek parenthetical part. Better to refer to the original comment or something instead of cut-and-pasting. Also, why is it appropriate to pass orstronger = true here? Don't we expect the *exact* lock mode that we have planned to be held, and isn't it a sure sign of a bug if it isn't? Maybe orstronger should just be ripped out here (and the comment could then go away too). In 0006: + /* + * RTIs of all partitioned tables whose children are scanned by + * appendplans. The list contains a bitmapset for every partition tree + * covered by this Append. + */ The first sentence of this comment makes this sound like a list of integers, the RTIs of all partitioned tables that are scanned. The second sentence makes it sound like a list of bitmapsets, but what does it mean to take about each partition tree covered by this Append? This is far from a complete review but I'm running out of steam for today. I hope that it's at least somewhat useful. ...Robert