Is there some transformation we'd want to apply to that tree, but can't
because we have no concept of scope? It's already possible for a plan rule
to traverse each node's subtree if it wants.
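
As a rough illustration of that last point, here is a toy sketch of a rule
helper that walks a node's subtree to find out which attributes are in
scope. It is not Catalyst's API: the Node class, the "name#id" attribute
strings, the scan column names, and the subtree_attributes helper are all
made up for this example.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    """Toy plan node (hypothetical, not a Catalyst class)."""
    name: str
    output: List[str] = field(default_factory=list)      # attributes this node produces
    children: List["Node"] = field(default_factory=list)


def subtree_attributes(node: Node) -> Set[str]:
    """Return every attribute produced by `node` or any of its descendants."""
    scope = set(node.output)
    for child in node.children:
        scope |= subtree_attributes(child)
    return scope


# Toy version of the plan from the quoted message; attribute ids are invented.
scan1 = Node("ScanTable1", ["c#10"])
project1 = Node("Project1", ["a#1", "b#2"], [scan1])
scan2 = Node("ScanTable2", ["c#20"])
project2 = Node("Project2", ["a#3", "b#4"], [scan2])
union = Node("Union", ["a#1", "b#2"], [project1, project2])
project0 = Node("Project0", ["a#1", "b#2"], [union])

# A rule visiting Project1 can ask what is reachable below it.
in_scope = subtree_attributes(project1)
```

With this toy plan, in_scope contains the attributes of Project1 and
ScanTable1 but nothing from the Project2 branch. The cost is that every rule
recomputes the walk for each node it visits, which is presumably what a
stateful transform would avoid.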

On Tue, Apr 24, 2018 at 10:18 AM, Marco Gaido <marcogaid...@gmail.com>
wrote:

> Hi all,
>
> While working on SPARK-24051, I realized that in the Optimizer, and
> everywhere else we transform a query plan, we currently lack context
> information about what is in scope and what is not.
>
> Coming back to the ticket, the reported bug has two main causes:
>  1 - we have two aliases in different places of the plan;
>  2 - (the focus of this email) we apply all the rules globally over the
> whole plan, without any notion of the scope in which something is
> reachable/visible.
>
> I will start with an easy example to explain what I mean. If we have a
> simple query like:
>
> select a, b from (
>   select 1 as a, 2 as b from table1
>     union
>   select 3 as a, 4 as b from table2) q
>
> We produce a tree which is logically something like:
>
> Project0(a, b)
> -   Union
> --    Project1 (a, b)
> ---     ScanTable1
> --    Project2 (a, b)
> ---     ScanTable2
>
> So when we apply a transformation on Project1, for instance, we have no
> information about what comes from ScanTable1 (or, in general, from any
> node in the subtree rooted at Project1): we lack a stateful transform
> that lets the children tell the parent, grandparents, and so on what is
> in their scope. This is particularly true for Attributes: within a node,
> we have no idea whether an Attribute comes from its subtree (i.e., it is
> in scope) or not.
>
> So, the point of this email is: do you think it would be useful in general
> to introduce a way of navigating the tree that lets the children keep
> state to be used by their parents? Or do you think it would be useful to
> introduce the concept of scope (i.e., whether an attribute can be accessed
> by a given node of a plan)?
>
> Thanks,
> Marco
>
>
>
