On Wed, Dec 24, 2025 at 1:16 PM Tom Lane <[email protected]> wrote:
> Having said that, I'm starting to wonder whether "do this stuff in a
> separate pass before the main optimizer" is the wrong structural
> decision.  Should we be injecting the logic at some later point
> where we've gathered more information?  At least in principle,
> we should be able to build all base-relation Paths before we start
> to think about join order; would having those help?

I've been wondering for a while about making cardinality estimation a
separate pass from path construction. One reason is that we currently
come up with different row-count estimates for partitionwise and
non-partitionwise paths for the same rel, which doesn't really make
sense. We could do partitionwise estimates for any join that could be
done partitionwise, even when partitionwise join is not selected or is
disabled entirely, and I think we'd get more accurate estimates that
way (and the same goes for baserels). But I wonder if it might help
with this problem, too. Suppose, for example, that we try to construct
a star-join path using some fast path, and as we go we compute
cardinality estimates for the joinrels we consider building. At any
point we can decide to give up, e.g. because we hit a
row-count-inflating join or something else that doesn't meet the
criteria for the fast path. At that point we fall back to the standard
join search, and all the cardinality estimates we've computed are
still available and don't need to be recomputed, so the only effort
we've wasted is the effort of constructing paths for some rels. Those
paths probably do need to be discarded, unless the fast path always
computes the full path set for each rel it considers. But being able
to reuse the cardinality estimates would limit the wasted work.
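
To make the reuse idea concrete, here is a rough sketch in
deliberately simplified C. This is not PostgreSQL code and none of
these names exist in the tree; it's only meant to show the shape of
the bookkeeping: joinrel estimates get cached by the set of baserels
joined, the fast path bails out on the first row-count-inflating join,
and the fallback search consults the cache before recomputing
anything.

/*
 * Illustrative only: estimates cached by the set of baserels joined,
 * so a fallback join search can reuse what the fast path computed.
 */
#include <stdbool.h>
#include <stdio.h>

#define MAX_CACHE 64

typedef struct
{
    unsigned relids;            /* bitmask of baserel indexes */
    double   rows;              /* cached cardinality estimate */
} EstimateCacheEntry;

static EstimateCacheEntry estimate_cache[MAX_CACHE];
static int cache_len;

/* Return a cached estimate for this set of rels, if we have one. */
static bool
lookup_estimate(unsigned relids, double *rows)
{
    for (int i = 0; i < cache_len; i++)
    {
        if (estimate_cache[i].relids == relids)
        {
            *rows = estimate_cache[i].rows;
            return true;
        }
    }
    return false;
}

/* Remember an estimate so that a later pass need not recompute it. */
static void
remember_estimate(unsigned relids, double rows)
{
    if (cache_len < MAX_CACHE)
    {
        estimate_cache[cache_len].relids = relids;
        estimate_cache[cache_len].rows = rows;
        cache_len++;
    }
}

/*
 * Fast-path star-join attempt: join the fact rel to each dimension in
 * turn, caching every joinrel estimate as we go.  "factor" folds the
 * dimension's row count and the join selectivity into one number, so
 * factor > 1 means the join inflates the row count and we give up.
 * Returning false means "fall back to the standard join search"; the
 * estimates cached so far remain available to it.
 */
static bool
try_fast_path_star_join(unsigned fact_relid, double fact_rows,
                        const unsigned *dim_relids,
                        const double *factors, int ndims)
{
    unsigned relids = 1u << fact_relid;
    double   rows = fact_rows;

    for (int i = 0; i < ndims; i++)
    {
        relids |= 1u << dim_relids[i];
        rows *= factors[i];
        remember_estimate(relids, rows);
        if (factors[i] > 1.0)
            return false;       /* row-count-inflating join: give up */
    }
    return true;
}

int
main(void)
{
    unsigned dims[] = {1, 2, 3};
    double   factors[] = {0.1, 5.0, 0.2};   /* second join inflates */
    double   rows;

    if (!try_fast_path_star_join(0, 1e6, dims, factors, 3))
    {
        /*
         * The standard join search would run here; crucially, it can
         * ask the cache before recomputing any joinrel's cardinality.
         */
        if (lookup_estimate((1u << 0) | (1u << 1), &rows))
            printf("reused estimate for {0,1}: %.0f rows\n", rows);
    }
    return 0;
}

In the real planner the cache key would presumably be the joinrel's
Relids and the cache could hang off the PlannerInfo, but the above is
enough to show why the fallback's wasted work is limited to path
construction.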

Your idea of building base-relation paths early, and thus also
getting base-relation cardinality estimates, has some of the same
advantages and might be easier to implement. It would mean losing
whatever join cardinality estimates the fast-path algorithm computes,
so I think it's probably at least modestly worse if you ignore
implementation complexity, but I can almost hear you telling me that
we cannot and should not ignore implementation complexity, which is
fair enough as far as it goes.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

