adriangb commented on PR #14057: URL: https://github.com/apache/datafusion/pull/14057#issuecomment-2629230026
> I don't know whether you agree, when make a design, every component should do one thing, and do it well. reuse metadata map violates this, it takes two roles.

I have to disagree on that. Field metadata is a hook point for doing these sorts of things without having to pipe major code changes through the entire codebase. I think this *is* the use case for field metadata.

> what makes things worse is that this map is mutable by user.

Who do you consider the "user" in this scenario? I am a system implementer and a user of DataFusion. By design and necessity I edit metadata on fields (e.g. to indicate that a UTF8 column is JSON data). The users of the system I implement do not edit field metadata in my system. Maybe you're coming at it from a different perspective of "user" that I'm not understanding?

> but for metadata column or system column, we wish it's const for every data source.

Maybe, but I don't see how it's any different for a `TableProvider` to declare which columns are system columns via a new `TableProvider::system_columns_schema` method vs. adding metadata to the fields returned from `TableProvider::schema`.

Ultimately I think using field metadata will result in a smaller change in terms of LOC and fewer new methods and other API changes in DataFusion, will be less likely to break DataFusion implementers' code (e.g. because they make assumptions about field indexes being contiguous; I'd like to see some tests against `SchemaAdapter`), and will be easier to retrofit into existing systems with system/metadata columns.
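To make the field-metadata approach concrete, here is a minimal, std-only sketch. It is not DataFusion or Arrow code: `Field` is a simplified stand-in for Arrow's `Field`, the key `example.system_column` is a hypothetical name chosen for illustration, and `wildcard_columns` and `example_schema` are hypothetical helpers. The idea is only that a provider tags system columns in the metadata of the fields it already returns, and a consumer filters on that tag.

```rust
use std::collections::HashMap;

/// Hypothetical metadata key; not an actual DataFusion or Arrow constant.
const SYSTEM_COLUMN_KEY: &str = "example.system_column";

/// Simplified stand-in for Arrow's `Field`: a name plus string metadata.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    metadata: HashMap<String, String>,
}

impl Field {
    fn new(name: &str) -> Self {
        Field {
            name: name.to_string(),
            metadata: HashMap::new(),
        }
    }

    /// Tag this field as a system column through its metadata map.
    fn as_system_column(mut self) -> Self {
        self.metadata
            .insert(SYSTEM_COLUMN_KEY.to_string(), "true".to_string());
        self
    }

    /// A field is a system column only if the tag is present and set to "true".
    fn is_system_column(&self) -> bool {
        self.metadata.get(SYSTEM_COLUMN_KEY).map(String::as_str) == Some("true")
    }
}

/// Columns a `SELECT *` might expand to: every field not tagged as a system column.
fn wildcard_columns(schema: &[Field]) -> Vec<&str> {
    schema
        .iter()
        .filter(|f| !f.is_system_column())
        .map(|f| f.name.as_str())
        .collect()
}

/// What a provider's schema could look like under this approach:
/// ordinary and system columns live in one contiguous field list.
fn example_schema() -> Vec<Field> {
    vec![
        Field::new("id"),
        Field::new("payload"),
        Field::new("_rowid").as_system_column(),
    ]
}
```

In this sketch no new trait method is needed and field indexes stay contiguous; whether that holds up in DataFusion itself is what tests against `SchemaAdapter` would need to confirm.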