On Sun, 19 May 2024 at 13:27, Tom Lane <t...@sss.pgh.pa.us> wrote: > > David Rowley <dgrowle...@gmail.com> writes: > > 1. No ability to control the order that the locks are obtained. The > > order in which the locks are taken will be at the mercy of the plan > > the planner chooses. > > I do not think I buy this argument, because plancache.c doesn't > provide any "ability to control the order" today, and never has. > The order in which AcquireExecutorLocks re-gets relation locks is only > weakly related to the order in which the parser/planner got them > originally. The order in which AcquirePlannerLocks re-gets the locks > is even less related to the original. This doesn't cause any big > problems that I'm aware of, because these locks are fairly weak.
It may not bite many people, it's just that if it does, I don't see what we could do to help those people. At the moment we could tell them to adjust their DDL script to obtain the locks in the same order as their query. With your idea that cannot be done as the order could change when the planner switches the join order. > I think we do have a guarantee that for partitioned tables, parents > will be locked before children, and that's probably valuable. > But an executor-driven lock order could preserve that property too. I think you'd have to lock the parent before the child. That would remain true and consistent anyway when taking locks during a breadth-first plan traversal. > > For #3, I've been thinking about what improvements we can do to make > > the executor more efficient. In [1], Andres talks about some very > > interesting things. In particular, in his email items 3) and 5) are > > relevant here. If we did move lots of executor startup code into the > > planner, I think it would be possible to one day get rid of executor > > startup and have the plan record how much memory is needed for the > > non-readonly part of the executor state and tag each plan node with > > the offset in bytes they should use for their portion of the executor > > working state. > > I'm fairly skeptical about that idea. The entire reason we have an > issue here is that we want to do runtime partition pruning, which > by definition can't be done at plan time. So I doubt it's going > to play nice with what we are trying to accomplish in this thread. I think we could have both, providing there was a way to still traverse the executor state tree in EXPLAIN. We'd need a way to skip portions of the plan that are not relevant or could be invalid for the current execution. e.g can't show Index Scan because index has been dropped. > > I think what Amit had before your objection was starting to turn into > > something workable and we should switch back to working on that. > > The reason I posted this idea was that I didn't think the previously > existing patch looked promising at all. Ok. It would be good if you could expand on that so we could determine if there's some fundamental reason it can't work or if that's because you were blinded by your epiphany and didn't give that any thought after thinking of the alternative idea. I've gone to effort to point out things that I think are concerning with your idea. It would be good if you could do the same for the previous patch other than "it didn't look promising". It's pretty hard for me to argue with that level of detail. David