Re: [PR] feat: metadata columns [datafusion]

via GitHub Fri, 07 Feb 2025 16:51:10 -0800


chenkovsky commented on PR #14057:
URL: https://github.com/apache/datafusion/pull/14057#issuecomment-2644389349


   > 
   > as I previously asked, in your implementation "a system column stops being 
a system column once it's projected" ? If this is correct, then as you said 
there's no need to add more UTs.
   > 
   > I have to call out it seems that this behavior is incompatible with Spark. 
I know whether follow Spark's standard is another problem. but community should 
be aware of this.
   
   I have to withdraw my earlier judgements about #14362 from the 
metadata/system propagation side. Those judgements assumed that the two 
approaches differ only in how the information is transmitted and that the goal 
is the same. That does not seem to be true: #14362 has its own propagation 
rules. It's really hard for me to discuss something that is totally different, 
so let's look at the pros and cons directly.
   
   pros of this approach:
   1. DataFrame API friendly. DataFrame API users have no chance to hurt 
themselves.
   2. Spark compatible. Spark has already been battle-tested in many areas, and 
it is designed to be compatible with many different data sources and data sinks, 
so there are fewer unknown problems.
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


