Re: Re: [DISCUSS] Preserving Output Alias Names After RelNode Optimization

Mihai Budiu Fri, 27 Jun 2025 00:14:18 -0700

That's fine with me, but Julian pointed out many difficulties that arise in 
cases I had not considered. Let's see if these will be an obstacle in practice.


Mihai

________________________________
From: Yanjing Wang <[email protected]>
Sent: Friday, June 27, 2025 12:11 AM
To: [email protected] <[email protected]>
Subject: Re: Re: [DISCUSS] Preserving Output Alias Names After RelNode 
Optimization

Thank you for your detailed response. Given that consistent column naming
is a common challenge across organizations using the Apache Calcite planner,
I believe we should establish a standardized approach. End users expect
predictable column aliases in query results, regardless of the optimization
process. Your proposed utility method for Calcite is promising. To
formalize this solution, I suggest we:

1. Document this as the recommended best practice.
2. Integrate it into the planner.findBestExp method, which would provide a
   centralized point for handling column alias preservation.

This standardization would benefit all Calcite implementations by providing
a consistent and reliable way to handle column aliases throughout the
optimization process.

Julian, Mihai, would you agree with this approach?

suibianwanwan <[email protected]> wrote on Thu, Jun 26, 2025 at 18:07:

> 1. I think so.
>
> 2. In my view, as long as we ensure the top-level Project is restored
> after Planner (some Calcite users might output RelNode), it should be fine.
>
> 3. RelBuilder#project optimizes away identity projections. You can set
> force=true to force building a Project, or call LogicalProject#create
> directly.
>
> I think we can add this utility method in Calcite:
> 1. When the top level is a Project, merge the Project to preserve aliases
> 2. When the top level is a Sort, call this method on its input
> 3. For other cases, directly add a Project to restore aliases
>
> On 2025/06/26 07:24:48 Yanjing Wang wrote:
> > >
> > > Dear Julian and Mihai,
> > >
> > > Thank you both for your detailed and insightful responses. I'd like to
> > > confirm my understanding:
> > >
> > > 1. Regarding column name preservation approach: If I understand
> > >    correctly, using RelRoot to get a projected rel node of the best rel
> > >    would be the recommended way to preserve column names of rel after
> > >    optimization?
> > > 2. About subquery generation: I see that subquery generation is
> > >    controlled by RelToSqlConverter, so I should focus on making
> > >    adjustments there to control the subquery generation behavior for
> > >    Project <- Sort rel pattern.
> > > 3. One observation I'd like to share: I noticed that when I tried using
> > >    a rel builder to add a project to the best rel (specifically when the
> > >    best rel is a sort), adding a project to the sort input rel doesn't
> > >    seem to make a difference in the outcome.
> > >
> > > Could you please confirm if my understanding aligns with your
> > > suggestions? This would help ensure I'm moving in the right direction
> > > with the implementation.
> > >
> > > Best regards,
> > > Yanjing
> >
>
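
For illustration, the three cases of the utility method proposed in the
quoted message can be sketched as below. This is a self-contained toy model
and not Calcite code: Node, its kind strings, and restoreAliases are
hypothetical stand-ins introduced here. A real version would presumably
operate on RelNode, Project, and Sort, and would carry the projection
expressions along rather than just the field names.

```java
import java.util.List;

/** Toy model of the three-case alias restoration; NOT Calcite classes. */
final class AliasRestoreSketch {

    /** Hypothetical plan node: a kind, output field names, at most one input. */
    static final class Node {
        final String kind;          // "Project", "Sort", or anything else
        final List<String> fields;  // output column names, in order
        final Node input;           // null for leaf nodes

        Node(String kind, List<String> fields, Node input) {
            this.kind = kind;
            this.fields = fields;
            this.input = input;
        }
    }

    /** Returns a plan equivalent to rel whose output fields are named by aliases. */
    static Node restoreAliases(Node rel, List<String> aliases) {
        if (aliases.size() != rel.fields.size()) {
            throw new IllegalArgumentException("alias count must match field count");
        }
        if (rel.fields.equals(aliases)) {
            return rel; // names already match, nothing to restore
        }
        switch (rel.kind) {
            case "Project":
                // Case 1: merge into the existing top-level Project by renaming
                // its outputs instead of stacking a second Project on top.
                return new Node("Project", aliases, rel.input);
            case "Sort":
                // Case 2: apply the method to the Sort's input; the Sort stays
                // on top and exposes its input's (now renamed) fields.
                return new Node("Sort", aliases, restoreAliases(rel.input, aliases));
            default:
                // Case 3: any other node gets a new Project that restores names.
                return new Node("Project", aliases, rel);
        }
    }
}
```

In this model, calling restoreAliases on a Sort over a scan yields a Sort
over a new Project over the same scan, which matches case 2 of the proposal.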
