On Wed, Dec 24, 2025 at 1:16 PM Tom Lane <[email protected]> wrote:
> Having said that, I'm starting to wonder whether "do this stuff in a
> separate pass before the main optimizer" is the wrong structural
> decision.  Should we be injecting the logic at some later point
> where we've gathered more information?  At least in principle,
> we should be able to build all base-relation Paths before we start
> to think about join order; would having those help?
I've been wondering for a while about making cardinality estimation a separate pass from path construction. One reason is that we currently come up with different row count estimates for partitionwise and non-partitionwise paths for the same rel, which doesn't really make sense. We could do partitionwise estimates for joins that could be done partitionwise even when partitionwise join is not selected, or even when it is entirely disabled, and I think we'd get more accurate estimates that way (as we should for baserels, too). But I wonder if it might help with this problem as well.

Like, suppose we try to construct a star-join path using some fast path, and as we go we compute cardinality estimates for the joinrels we consider building. At any point we can decide to give up, e.g. because we find a row-count-inflating join, or something else that doesn't meet the criteria for the fast path. At that point we can fall back to the standard join search, and all the cardinality estimates we computed are still available and don't need to be recomputed, so the only effort we've wasted is the effort of constructing paths for some rels. Those paths probably do need to be discarded, unless the fast path always computes the full path set for each rel it considers, but being able to reuse the cardinality estimates would limit the wasted work.

Your idea of building base-relation paths early, and thus also getting base-relation cardinality estimates, has some of the same advantages and might be easier to implement. It would mean that we lose any join cardinality estimates the fast-path algorithm computes, though, so I think it's probably at least modestly worse if you ignore implementation complexity, but I can almost hear you telling me that we cannot and should not ignore implementation complexity, which is fair enough as far as it goes.
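To make that concrete, here is a toy, self-contained sketch of the shape I have in mind. None of this is real planner code; the types and function names (EstimateCache, try_star_join_fast_path, and so on) are invented for illustration, and the estimator is a fake fixed-selectivity stand-in. The point is just the structure: the fast path fills a cache of row-count estimates keyed by the set of base rels, bails out when it hits a row-count-inflating join, and the fallback join search consults the cache instead of recomputing.

/*
 * Toy sketch, not real planner code: every name here is made up.  Cache
 * row-count estimates keyed by the set of base rels, let the star-join
 * fast path fill the cache and bail out on a row-count-inflating join,
 * and let the fallback join search reuse the cached estimates so only
 * the path-building work is lost.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_CACHE 64

typedef struct EstimateCache
{
    uint64_t relids[MAX_CACHE]; /* bitmask of base rels in the joinrel */
    double   rows[MAX_CACHE];   /* cached row-count estimate */
    int      nentries;
} EstimateCache;

static bool
cache_lookup(const EstimateCache *cache, uint64_t relids, double *rows)
{
    for (int i = 0; i < cache->nentries; i++)
    {
        if (cache->relids[i] == relids)
        {
            *rows = cache->rows[i];
            return true;
        }
    }
    return false;
}

static void
cache_store(EstimateCache *cache, uint64_t relids, double rows)
{
    if (cache->nentries < MAX_CACHE)
    {
        cache->relids[cache->nentries] = relids;
        cache->rows[cache->nentries] = rows;
        cache->nentries++;
    }
}

/* Stand-in for the real estimator: a fixed 1% join selectivity. */
static double
estimate_join_rows(double outer_rows, double inner_rows)
{
    return outer_rows * inner_rows * 0.01;
}

/*
 * Fast path: join the fact rel (bit 0) to each dimension in turn, caching
 * every estimate; give up as soon as a join inflates the row count.
 */
static bool
try_star_join_fast_path(const double *base_rows, int nrels,
                        EstimateCache *cache)
{
    uint64_t joined = 1;        /* start from the fact rel */
    double   rows = base_rows[0];

    for (int dim = 1; dim < nrels; dim++)
    {
        double   joinrows = estimate_join_rows(rows, base_rows[dim]);

        joined |= (uint64_t) 1 << dim;
        cache_store(cache, joined, joinrows);
        if (joinrows > rows)
            return false;       /* row-count-inflating join: bail out */
        rows = joinrows;
        /* ... the real fast path would also build paths here ... */
    }
    return true;
}

int
main(void)
{
    double   base_rows[] = {1e6, 100, 50, 2e6}; /* last join inflates */
    EstimateCache cache = {.nentries = 0};

    if (!try_star_join_fast_path(base_rows, 4, &cache))
    {
        /* Standard join search runs here, checking the cache first. */
        double   cached;

        if (cache_lookup(&cache, 0x3, &cached)) /* fact rel + first dim */
            printf("reused estimate for {0,1}: %.0f rows\n", cached);
    }
    return 0;
}

In the real planner the cache key would presumably be the joinrel's Relids and the numbers would come from the existing size-estimation machinery; what the sketch is meant to show is that the estimates survive the bail-out while the partially built paths get thrown away.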
-- 
Robert Haas
EDB: http://www.enterprisedb.com