berkaysynnada commented on PR #16166: URL: https://github.com/apache/datafusion/pull/16166#issuecomment-2918419765
> Thank you for this contribution @berkaysynnada and @mertak-synnada > > I am a little confused about the new structure and exactly what problem is being solved with this change. > > The PR description says > > > In the current implementation, the TableOptions struct stores information for CSV, JSON, and Parquet file types, and its behavior changes depending on whether the file format is set before or after. > > However, I didn't see any tests or examples that show a different behavior after this PR. > > In my initial quick read through it I would say it has made the APIs harder to use (as now there are format specific options on `TableOptions` AND a new `TableFormatOptions`). Is there any way to consolidate the number of structures? If not, I think we should clearly document the difference between the two structures The confusion we are trying to address is that the TableOptions struct is currently being used in two different ways: 1) As a sub-configuration field in SessionState, similar to other configuration fields. 2) For directly configuring the specific file currently being worked on, typically by cloning it from the SessionState and overriding fields as needed., In other words, it seems like we actually need two distinct objects for these two different use cases: - One for storing global TableOptions configuration across the session, applicable to all file types. - Another for holding format-specific options derived from the global configuration, intended for the currently active file. Another key point is that once you set a specific file format, you don’t need the configuration fields for the others in the existing TableOptions. So for the second use case, the appropriate design might be an enum rather than a struct. In summary, this PR is more of a code cleanup than a functional change. It aims to clarify the separation of concerns and improve maintainability, in my opinion. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org