alamb opened a new issue, #12119: URL: https://github.com/apache/datafusion/issues/12119
### Is your feature request related to a problem or challenge?

Part of https://github.com/apache/datafusion/issues/11752

We are trying to change DataFusion to use `StringViewArray` by default when reading parquet (and in other places where it makes sense, such as the `substr` function). StringView enables many interesting optimization opportunities.

However, as StringView is still being adopted across the rest of the arrow ecosystem, if DataFusion begins to emit `StringViewArray` in some places, it may cause issues with other parts of the ecosystem (e.g. flight clients may not be able to interpret data sent by a server using DataFusion).

### Describe the solution you'd like

I would like DataFusion to retain maximum compatibility at the interfaces, but be able to use `StringViewArray` internally when it improves performance.

### Describe alternatives you've considered

I recommend a config flag that makes it possible to convert `Utf8View`/`BinaryView` --> `Utf8`/`Binary` at the query output, and I think this conversion should be done by default.

For example we might add this configuration flag:

```
datafusion.optimizer.expand_views_at_output=true
```

If this flag is true,
1. add code in the Analyzer (maybe in the TypeCoercion code)
2. check the output columns of a plan, and if any are `DataType::Utf8View` or `DataType::BinaryView`, add a `ProjectionExec` that converts them to Utf8/Binary (by adding a cast to `DataType::Utf8` or `DataType::Binary` respectively)

### Additional context

We already have to do something similar in flight with dictionary arrays
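The conversion step proposed above (check the output columns and add a cast for any view-typed column) could look roughly like the following. This is only a toy sketch using the standard library: the `DataType` and `Expr` enums here are simplified stand-ins defined for illustration, not the real arrow or DataFusion types, and `expand_views_at_output` is a hypothetical helper named after the proposed flag, not an existing DataFusion function.

```rust
/// Toy stand-in for arrow's `DataType` (assumption: only the variants
/// this sketch needs, not the real enum).
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Utf8,
    Utf8View,
    Binary,
    BinaryView,
    Int64,
}

/// Toy output expression: a bare column reference, or a column wrapped
/// in a cast to another type.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Cast { column: String, to: DataType },
}

/// Hypothetical helper: builds the expressions of the extra output
/// projection. When `enabled` is true, `Utf8View` columns are cast to
/// `Utf8` and `BinaryView` columns to `Binary`; all other columns, and
/// every column when `enabled` is false, pass through unchanged.
fn expand_views_at_output(schema: &[(&str, DataType)], enabled: bool) -> Vec<Expr> {
    schema
        .iter()
        .map(|(name, data_type)| {
            // Pick a cast target only for view types, and only if the flag is on.
            let target = match data_type {
                DataType::Utf8View if enabled => Some(DataType::Utf8),
                DataType::BinaryView if enabled => Some(DataType::Binary),
                _ => None,
            };
            match target {
                Some(to) => Expr::Cast { column: name.to_string(), to },
                None => Expr::Column(name.to_string()),
            }
        })
        .collect()
}

fn main() {
    // Made-up output schema of a plan, for illustration only.
    let schema = [
        ("id", DataType::Int64),
        ("name", DataType::Utf8View),
        ("payload", DataType::BinaryView),
    ];
    for expr in expand_views_at_output(&schema, true) {
        println!("{expr:?}");
    }
}
```

With the flag off the helper returns plain column references, so view arrays would still flow through to the output unchanged; a real implementation would operate on the plan's schema and expressions rather than on these toy types.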
