On 19/9/2024 13:12, Ashutosh Bapat wrote:
> On Thu, Sep 19, 2024 at 4:18 PM Andrei Lepikhov <lepi...@gmail.com> wrote:
>> At the same time, this technique (while highly useful in general) adds
>> fragility and increases complexity: a developer needs to remember to
>> link the path via the pointer in different places in the code.
>> So maybe go an alternative way? Invent a subquery memory context and
>> store all the path allocations there. It can be freed once setrefs
>> finishes planning this subquery, provided the subquery was not pulled up.
> Using a memory context per subquery won't help with partitioning, right?
> If the outermost query has a partitioned table with thousands of
> partitions, it will still accumulate those paths till the very end of
> planning.
I see. I just hadn't dealt with huge tables in the outer query before.
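
Just for the record, the idea I had in mind would look roughly like this
(the context name and its placement inside subquery_planner() are only
illustrative, not a patch):

    MemoryContext subquery_ctx;
    MemoryContext oldctx;

    /* All path allocations for this subquery land in a dedicated context. */
    subquery_ctx = AllocSetContextCreate(CurrentMemoryContext,
                                         "subquery planning",
                                         ALLOCSET_DEFAULT_SIZES);
    oldctx = MemoryContextSwitchTo(subquery_ctx);

    /* ... plan the subquery; paths accumulate in subquery_ctx ... */

    MemoryContextSwitchTo(oldctx);

    /*
     * Once setrefs.c has finished with this subquery (and it was not pulled
     * up), anything still needed must already live in a longer-lived
     * context, so the accumulated paths can be dropped in one call.
     */
    MemoryContextDelete(subquery_ctx);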
> We could instead use memory context(s) to store all the paths
> created, then copy the optimal paths into a new memory context at
> strategic points and blow up the old memory context. And repeat this
> till we choose the final path and create a plan out of it; at that
> point we could blow up the memory context containing the remaining paths
> as well. That would free the paths as soon as they are rendered
> useless.
I think any scalable solution should be based on per-partition
cleanup. For starters, why not adopt Tom's patch [1] for selectivity
estimation? We would see the benefit in cases with long lists of clauses.
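
The shape of that technique, as I understand it (a rough sketch of the
idea, not the actual code in [1]; the "clauses" list is a placeholder), is
to do the per-clause work in a scratch context and reset it as we go:

    ListCell   *lc;
    MemoryContext scratch;
    MemoryContext oldctx;

    /* Short-lived context for the per-clause (or per-array-element) work. */
    scratch = AllocSetContextCreate(CurrentMemoryContext,
                                    "clause estimation scratch",
                                    ALLOCSET_SMALL_SIZES);

    foreach(lc, clauses)
    {
        oldctx = MemoryContextSwitchTo(scratch);

        /* ... estimate one clause; temporary nodes leak into scratch ... */

        MemoryContextSwitchTo(oldctx);
        MemoryContextReset(scratch);    /* drop the garbage right away */
    }

    MemoryContextDelete(scratch);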
> I discussed this idea with Alvaro offline. We thought that
> this approach needs some code to copy paths, and copying paths
> recursively has some overhead of its own.
That needs a path_tree_walker first. We discussed it before, but it went
nowhere. Maybe design it up front and also use it in the re-parameterisation code?
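
Something along these lines, by analogy with expression_tree_walker(); the
name, the callback type, and the (incomplete) set of handled node types
below are purely hypothetical:

    typedef bool (*path_walker_fn) (Path *path, void *context);

    static bool
    path_tree_walker(Path *path, path_walker_fn walker, void *context)
    {
        if (path == NULL)
            return false;

        /* Visit the node itself first. */
        if (walker(path, context))
            return true;

        /* Recurse into child paths; the set of cases here is incomplete. */
        switch (nodeTag(path))
        {
            case T_NestPath:
            case T_MergePath:
            case T_HashPath:
                {
                    JoinPath   *jp = (JoinPath *) path;

                    if (path_tree_walker(jp->outerjoinpath, walker, context) ||
                        path_tree_walker(jp->innerjoinpath, walker, context))
                        return true;
                }
                break;

                /* ... Append, MergeAppend, Material, SubqueryScan, etc. ... */

            default:
                break;
        }

        return false;
    }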
> The current work of adding a
> reference count, OTOH, has the potential to bring discipline into the way
> we handle paths. We need to avoid the risks posed by dangling pointers.
Both hands up for having reference counters: it is a constant pain in
extensions to invent an approach to safely remove a path you want to
replace with a custom one. Right now, doing that looks too dangerous
compared to the value of the possible positive outcome.
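
To show why I like it from the extension side, assume purely for
illustration that Path grew a refcount field with a pair of helpers
(made-up names, not the actual patch):

    static inline void
    path_ref(Path *path)
    {
        path->refcount++;           /* hypothetical new field in Path */
    }

    static inline void
    path_unref(Path *path)
    {
        Assert(path->refcount > 0);
        if (--path->refcount == 0)
            pfree(path);            /* nothing references it any more */
    }

An extension that puts its custom path into a RelOptInfo's pathlist could
then unref the path it replaces instead of guessing whether anything else
still points at it.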
> For that, Alvaro suggested looking at the way we manage snapshots, but
> I haven't had time to explore that idea yet.
Unfortunately, I can't follow this idea without more explanation.
[1] Optimize planner memory consumption for huge arrays
https://www.postgresql.org/message-id/1367418.1708816...@sss.pgh.pa.us
--
regards, Andrei Lepikhov