gruuya commented on PR #20819:
URL: https://github.com/apache/datafusion/pull/20819#issuecomment-4088922846

   > This approach doesn't handle queries like:
   > 
   > SELECT a, b FROM t1 UNION ALL SELECT count(*), count(*) FROM t2;
   
   Added a fix (and tests) for this now.
   
   > I wonder if it would be cleaner to arrange not to check for duplicate 
column names in set operation queries that aren't the left-most query? Since 
the column names for such queries are discarded anyway, it seems a bit 
laborious to first rewrite them to ensure they are unique, and then check that 
they are indeed unique, before discarding them anyway.
   
   Indeed, whilst in theory this sounds like the best approach, in practice it 
would require a much more substantial change. We'd need to
   
   1. extend `PlannerContext` with a flag to denote that we're planning a set 
expression
   2. set that flag in `set_expr_to_plan`, right after we plan the left side
   3. thread the flag down into `SqlToRel::project`
   4. then we'd either need to
       a. introduce a breaking change to `LogicalPlanBuilder::project` in order 
to pass that flag, or
       b. introduce a new method on `LogicalPlanBuilder`, something like 
`project_without_validation`, where we'd skip `validate_unique_names` (but keep 
the normalize & columnize steps of `project_with_validation`)
   5. unset the flag at the outermost set expression (i.e. where we've set it), 
so that further sql->plan calls go through `validate_unique_names` again
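
To make the steps above concrete, here is a rough, self-contained sketch of the flag-based flow. It is a toy model only: the `PlannerContext` struct, its `in_set_expr` field, and the `project` / `plan_union` functions below are hypothetical stand-ins operating on plain column-name slices, not DataFusion's actual types or signatures.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for `PlannerContext`; only the new flag is modelled.
#[derive(Default)]
struct PlannerContext {
    /// Assumed new field (step 1): true while planning a non-left-most set-expression input.
    in_set_expr: bool,
}

/// Toy analogue of `validate_unique_names`: reject duplicate output column names.
fn validate_unique_names(names: &[&str]) -> Result<(), String> {
    let mut seen = HashSet::new();
    for name in names {
        if !seen.insert(*name) {
            return Err(format!("duplicate column name: {name}"));
        }
    }
    Ok(())
}

/// Toy analogue of `SqlToRel::project` (steps 3 and 4): validation is skipped
/// when the flag is set, because those column names are discarded anyway.
fn project(ctx: &PlannerContext, names: &[&str]) -> Result<Vec<String>, String> {
    if !ctx.in_set_expr {
        validate_unique_names(names)?;
    }
    Ok(names.iter().map(|n| n.to_string()).collect())
}

/// Toy analogue of `set_expr_to_plan` for `left UNION ALL right`.
fn plan_union(
    ctx: &mut PlannerContext,
    left: &[&str],
    right: &[&str],
) -> Result<Vec<String>, String> {
    // The left-most side is validated as usual; its names become the output schema.
    let left_cols = project(ctx, left)?;
    // Remember whether this call is the outermost one, i.e. the one setting the flag.
    let outermost = !ctx.in_set_expr;
    ctx.in_set_expr = true; // step 2
    let right_result = project(ctx, right);
    if outermost {
        ctx.in_set_expr = false; // step 5
    }
    right_result?;
    Ok(left_cols)
}
```

With this model, planning `["a", "b"]` against `["count(*)", "count(*)"]` succeeds and leaves the flag unset afterwards, while projecting the duplicate names outside a set expression still fails.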
   
   All in all it's more verbose and complex than this approach, but if 
you'd like, I could open a PR for that so that we can compare.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

