On Tue, Aug 27, 2024 at 4:15 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> Seems reasonable. It might be possible to say that our answer
> to "control over join order" is to provide a hook that can modify
> the "joinlist" before it's passed to make_one_rel. If you want
> to force a particular join order you can rearrange that
> list-of-lists-of-range-table-indexes to do so. The thing this
> would not give you is control over which rel is picked as outer
> in any given join step. Not sure how critical that bit is.

This has a big advantage over what I proposed yesterday in that it's
basically declarative. With one call to the hook, you get all the
information about the join order that you could ever want. That's
really nice. However, I don't really think it quite works, partly
because of what you mention here about not being able to control which
rel ends up on which side of the join, which I do think is important,
and also because if the join order isn't possible, planning will fail,
rather than falling back to some other plan shape. If you have an idea
how we could address those things within this same general framework,
I'd be keen to hear it.

It has occurred to me more than once that it might be really useful if
we could attempt to plan under a set of constraints and then, if we
don't end up finding a plan, retry without the constraints. But I
don't quite see how to make it work. When I tried to do that as a
solution to the disable_cost problem, it ended up meaning that once
you couldn't satisfy every constraint perfectly, you gave up on even
trying. I wasn't immediately certain that such behavior was
unacceptable, but I didn't have to look any further than our own
regression test suites to see that it was going to cause a lot of
unhappiness.

In this case, if we could attempt join planning with the
user-prescribed join order and then try it again if we fail to find a
path, that would be really cool. Or if we could do all of planning
without generating disabled paths *at all* and then go back and
restart if it becomes clear that's not working out, that would be
slick. But, unless you have a clever idea, that all seems like
advanced magic that should wait until we have basic things working.

Right now, I think we should focus on getting something in place where
we still try all the paths but an extension can arrange for some of
them to be disabled. Then all the right things will happen naturally;
we'll only be leaving some CPU cycles on the table. Which isn't
amazing, but I don't think it's a critical defect either, and we can
try to improve things later if we want to.

-- 
Robert Haas
EDB: http://www.enterprisedb.com