adriangb commented on PR #14057:
URL: https://github.com/apache/datafusion/pull/14057#issuecomment-2629230026

   > I don't know whether you agree, when make a design, every component should 
do one thing, and do it well. reuse metadata map violates this, it takes two 
roles.
   
   I have to disagree on that. Field metadata is a hook point to do these sorts 
of things without having to pipe major code changes throughout the entire 
codebase. I think this *is* the use case for field metadata.
   
   > what makes things worse is that this map is mutable by user.
   
   Who do you consider the "user" in this scenario? I am a system implementer 
and a user of DataFusion. By design and necessity I edit metadata on fields 
(e.g. to indicate that a UTF8 column holds JSON data). The users of the system I 
implement do not edit field metadata in my system. Maybe you're coming at it 
from a different perspective of "user" that I'm not understanding?
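
As a rough illustration of the kind of tagging described above, here is a minimal std-only sketch. The `Field` struct is a simplified stand-in, not the real Arrow type, and the `content_type` key and `json` value are hypothetical conventions chosen for this example.

```rust
use std::collections::HashMap;

/// Simplified stand-in for an Arrow field: a name, a type name, and
/// free-form string metadata. This is NOT the real `arrow` type.
#[derive(Debug, Clone)]
struct Field {
    name: String,
    data_type: String,
    metadata: HashMap<String, String>,
}

// Hypothetical metadata key; any convention the implementer picks would do.
const CONTENT_TYPE_KEY: &str = "content_type";

/// Build a UTF8 field that the system implementer has tagged as JSON.
fn json_field(name: &str) -> Field {
    let mut metadata = HashMap::new();
    metadata.insert(CONTENT_TYPE_KEY.to_string(), "json".to_string());
    Field {
        name: name.to_string(),
        data_type: "Utf8".to_string(),
        metadata,
    }
}

/// Downstream code checks the tag instead of needing a new type or API.
fn is_json(field: &Field) -> bool {
    field.metadata.get(CONTENT_TYPE_KEY).map(String::as_str) == Some("json")
}

fn main() {
    let payload = json_field("payload");
    let plain = Field {
        name: "id".to_string(),
        data_type: "Utf8".to_string(),
        metadata: HashMap::new(),
    };
    for f in [&payload, &plain] {
        println!("{} ({}) is JSON: {}", f.name, f.data_type, is_json(f));
    }
}
```

The tag is set once by the system implementer when the schema is built; end users of that system never touch it.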
   
   > but for metadata column or system column, we wish it's const for every 
data source.
   
   Maybe, but I don't see how it's any different for a `TableProvider` to 
declare which columns are system columns via a new method such as 
`TableProvider::system_columns_schema` vs adding metadata to fields returned 
from `TableProvider::schema`.
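
To make the comparison concrete, here is a minimal std-only sketch of the metadata approach: the set of system columns can be recovered from the fields of a single schema, which is the same information a separate method would return. The `Field` struct is again a simplified stand-in for the Arrow type, and the `system_column` key is a hypothetical name, not something DataFusion defines.

```rust
use std::collections::HashMap;

/// Simplified stand-in for an Arrow field (not the real `arrow` type).
#[derive(Debug, Clone)]
struct Field {
    name: String,
    metadata: HashMap<String, String>,
}

// Hypothetical metadata key marking a field as a system column.
const SYSTEM_COLUMN_KEY: &str = "system_column";

/// Build a field, tagging it as a system column when requested.
fn new_field(name: &str, is_system: bool) -> Field {
    let mut metadata = HashMap::new();
    if is_system {
        metadata.insert(SYSTEM_COLUMN_KEY.to_string(), "true".to_string());
    }
    Field { name: name.to_string(), metadata }
}

fn is_system_column(field: &Field) -> bool {
    field.metadata.get(SYSTEM_COLUMN_KEY).map(String::as_str) == Some("true")
}

/// Split one schema into (regular, system) column names using only the
/// metadata tag, preserving the original field order within each group.
fn split_columns(schema: &[Field]) -> (Vec<&str>, Vec<&str>) {
    let mut regular = Vec::new();
    let mut system = Vec::new();
    for field in schema {
        if is_system_column(field) {
            system.push(field.name.as_str());
        } else {
            regular.push(field.name.as_str());
        }
    }
    (regular, system)
}

fn main() {
    // What a table provider's schema might look like in this model.
    let schema = vec![
        new_field("id", false),
        new_field("value", false),
        new_field("_rowid", true),
    ];
    let (regular, system) = split_columns(&schema);
    println!("regular: {:?}", regular);
    println!("system:  {:?}", system);
}
```

In this model the provider declares its system columns once, when it builds the schema, so they are as constant per data source as they would be behind a dedicated method.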
   
   Ultimately, I think using field metadata will result in a smaller change in 
terms of LOC, fewer new methods and other API changes in DataFusion, will be 
less likely to break DataFusion implementers' code (e.g. because they make 
assumptions about field indexes being contiguous; I'd like to see some tests 
against `SchemaAdapter`), and will be easier to retrofit into existing systems 
with system/metadata columns.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

