On 2/7/2025 11:14, Richard Guo wrote:
> On Wed, Jul 2, 2025 at 4:32 PM Andrei Lepikhov <lepi...@gmail.com> wrote:
>> I must say that I appreciate Tom's idea and see significant benefits in
>> making the parse tree a read-only structure. In complex queries, it can
>> be frustrating to make copies of the parse tree, leading to complaints
>> from users about insufficient memory allocation. This is why, in our
>> enterprise fork, we support a specific option to avoid copying the parse
>> tree multiple times.
>
> I don't see how the changes in this patchset violate Tom's proposal
> regarding keeping the parse tree read-only. The only potential issue
> I can see is that we may clear the rte->inh flag in some cases -- but
> that behavior has existed for a long time, not starting from this
> patchset.
>
>> I think the 1e4351a solution was a little too fast and it changes the
>> parse tree inside the planner. To achieve a read-only parse tree, we
>> will need to redesign it.
>>
>> Therefore, it would be better to find a way to refactor the
>> `preprocess_relation_rtes` function to gather table statistics lazily
>> into the hash table when they are needed. For example, we could do this
>> at the moment of creating the `RelOptInfo` or before a subquery pull-up,
>> without modifying the RTE at all.
>
> All the catalog information collected in preprocess_relation_rtes() is
> needed very early in the planner. I don't see how we could move that
> logic to a later stage, such as at the moment of creating RelOptInfos
> as you mentioned.

I apologise for the confusion in my previous message. I am not
suggesting that we postpone this. Instead, I would like an explanation
of why you believe that accessing the table statistics earlier could
negatively impact planner performance. As I mentioned before, I can
envision only rare cases where it would: join elimination reducing the
number of relations, and clause evaluation resulting in a constant.
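
For reference, the lazy gathering mentioned earlier in the thread (filling
a hash table with per-relation information on first use, without touching
the RTE) can be illustrated with a small standalone C sketch. This is not
PostgreSQL code: `RelId`, `RelInfo`, `RelInfoCache`, `load_rel_info` and
`get_rel_info` are hypothetical names invented for the illustration, and
the "catalog lookup" is simulated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are PostgreSQL types or functions. */
typedef unsigned int RelId;

typedef struct RelInfo
{
    RelId   relid;      /* key: relation identifier */
    bool    valid;      /* true once the slot has been filled */
    double  ntuples;    /* simulated statistic */
} RelInfo;

#define CACHE_SLOTS 64  /* must be a power of two */

typedef struct RelInfoCache
{
    RelInfo slots[CACHE_SLOTS]; /* open-addressing table; zero-initialize */
    int     nloads;             /* number of simulated catalog lookups */
} RelInfoCache;

/* Simulated expensive catalog access (made-up numbers). */
static RelInfo
load_rel_info(RelId relid)
{
    RelInfo info;

    info.relid = relid;
    info.valid = true;
    info.ntuples = 1000.0 * (double) (relid % 7 + 1);
    return info;
}

/*
 * Return the cached entry for relid, loading it on first request only.
 * Uses linear probing; returns NULL if the table is full.
 */
static const RelInfo *
get_rel_info(RelInfoCache *cache, RelId relid)
{
    size_t  start = relid & (CACHE_SLOTS - 1);

    assert(cache != NULL);
    for (size_t i = 0; i < CACHE_SLOTS; i++)
    {
        RelInfo *slot = &cache->slots[(start + i) & (CACHE_SLOTS - 1)];

        if (slot->valid && slot->relid == relid)
            return slot;            /* hit: no further lookup needed */
        if (!slot->valid)
        {
            *slot = load_rel_info(relid);   /* miss: load exactly once */
            cache->nloads++;
            return slot;
        }
    }
    return NULL;
}
```

A caller would start from `RelInfoCache cache = {0};` and call
`get_rel_info(&cache, relid)` wherever the information is first needed. A
relation that is never asked about (for instance, one removed before
anything looked at it) costs no lookup at all, and the structure describing
the query itself is never modified.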
--
regards, Andrei Lepikhov