On 4/4/25 04:53, Richard Guo wrote:
On Fri, Apr 4, 2025 at 1:02 AM Alexander Korotkov <aekorot...@gmail.com> wrote:
I've got an off-list bug report from Alexander Lakhin involving a
placeholder variable. Alena and Andrei proposed a fix. It is fairly
simple: we just shouldn't remove PHVs during self-join elimination, as
they might still be referenced from other parts of a query. The patch
is attached. I'm going to fix this if no objections.
Hmm, I'm not sure about the fix. It seems to me that it simply
prevents removing any PHVs in the self-join removal case. My concern
is that this might result in PHVs that could actually be removed not
being removed in many cases.
Let's play with use cases:
If a PHV is needed in the inner or outer only, it means we have a clause
in the baserestrictinfo that will be transferred to the keeping
relation, and we shouldn't remove the PHV.
Another case is when the PHV is needed in a join clause of the
self-join. I may imagine such a case:
toKeep.x+toRemove.y=PHV
This clause will be transformed to "toKeep.x+toKeep.y=PHV", pushed to
baserestrictinfo of keeping relation and should be saved.
I think it is possible to invent quite a narrow case of clause like the
following:
PHV_evaluated_at_inner = PHV_evaluated_at_outer
It needs to prove reproducibility. But even if it makes sense, it seems
to have no danger for further selectivity estimation compared to the
source clause and is a too-narrow case, isn't it?
In other cases, this PHV is needed something else, and we can't remove it.
Maybe I lost the case you keep in mind? I would like to discover it.
Besides, there's the specific comment above this code explaining the
logic behind the removal of PHVs. Shouldn't that comment be updated
to reflect the changes?
It makes sense: for now, it seems that PHV removal should be used in the
case of an outer join removal. In the case of SJE, logically we make a
replacement, not a removal, and we should not reduce the number of
entities involved.
--
regards, Andrei Lepikhov