alamb commented on issue #14874: URL: https://github.com/apache/datafusion/issues/14874#issuecomment-2702059366
> I opened https://github.com/apache/datafusion/issues/14993 today which I realized is a duplicate of a question I asked before in https://github.com/apache/datafusion/issues/7845#issuecomment-2463360160. My understanding of how ClickHouse handles JSON is by creating specialized "hidden" columns for each key (linked above but see https://clickhouse.com/blog/a-new-powerful-json-data-type-for-clickhouse). I think if DataFusion supported something like what I'm proposing in those comments (pushing down an expression into a file) we could:

FWIW the idea to store some fields as separate columns is referred to as "shredding" in the Parquet doc / format they are adding:
- https://github.com/apache/parquet-format/pull/456

> I think if DataFusion supported something like what I'm proposing in those comments (pushing down an expression into a file) we could:

I think adding expression pushdown into table providers would be valuable and has come up a number of times. This use case is a good one. I'll work on a writeup.
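
For readers unfamiliar with the term, here is a rough sketch of what "shredding" means in this discussion: selected JSON keys are pulled out into their own columns, and the rest of each document stays behind as JSON text. It is only an illustration in plain Python. The names `shred` and `get_field` are hypothetical and are not part of DataFusion, Parquet, or ClickHouse, and the real formats are considerably more involved.

```python
import json


def shred(rows, keys):
    """Split JSON documents into one column per chosen key plus a residual column.

    Hypothetical helper for illustration only.
    """
    columns = {k: [] for k in keys}
    residual = []
    for raw in rows:
        doc = json.loads(raw)
        for k in keys:
            # A missing key becomes None, similar to a NULL in a columnar file.
            columns[k].append(doc.pop(k, None))
        # Whatever was not shredded stays as JSON text.
        residual.append(json.dumps(doc, sort_keys=True))
    return columns, residual


def get_field(columns, residual, key, i):
    """Evaluate `doc[key]` for row i, using a shredded column when one exists."""
    if key in columns:
        # Fast path: read the value straight from the column, no JSON parsing.
        return columns[key][i]
    # Slow path: parse the leftover JSON for this row.
    return json.loads(residual[i]).get(key)


rows = [
    '{"user": "a", "status": 200, "path": "/x"}',
    '{"user": "b", "status": 404}',
]
columns, residual = shred(rows, ["user", "status"])
```

In this sketch, an expression such as `doc["status"]` that is pushed down to the reader can be answered from the `status` column alone, while `doc["path"]` still requires parsing the residual JSON. That difference is the reason pushing the expression into the file reader matters.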