alamb commented on issue #12788: URL: https://github.com/apache/datafusion/issues/12788#issuecomment-2402419364
> Yes, we need to support binary -> utf8view in arrow cast

Casting from binary --> utf8view via `cast` will work, but it won't be much (if any) faster than what is done today.

BTW, thinking more about this, I do think we need to support the cast, but in this PR we should effectively change the *file* schema (not just the table schema) when we set up the parquet reader (specifically with [`ArrowReaderOptions::with_schema`](https://docs.rs/parquet/latest/parquet/arrow/arrow_reader/struct.ArrowReaderOptions.html#method.with_schema)).

Here is the code that does so for `Utf8` --> `Utf8View`:

https://github.com/apache/datafusion/blob/b821929194c0ae3d3d5e862e0f82a0e5dc55702f/datafusion/core/src/datasource/physical_plan/parquet/opener.rs#L126-L129

I think we also need to allow the switch when the table schema is `Utf8View` and the file schema is Binary/BinaryView.
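
To make the proposed rule concrete, here is a rough sketch of the per-column decision. It is not the code in `opener.rs`: `ColumnType`, `reader_type`, and `adjust_file_schema` are hypothetical names, and the enum is a std-only stand-in for the few Arrow types mentioned above. The assumption is that the adjusted types would then be used to build the schema passed to `ArrowReaderOptions::with_schema`.

```rust
/// Hypothetical stand-in for the few Arrow data types discussed in this comment.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ColumnType {
    Utf8,
    Utf8View,
    Binary,
    BinaryView,
    Int64,
}

/// Pick the type to request from the parquet reader for a single column.
/// (Hypothetical helper; the real logic lives in DataFusion's parquet opener.)
#[allow(dead_code)]
fn reader_type(file_type: ColumnType, table_type: ColumnType) -> ColumnType {
    match (file_type, table_type) {
        // Switch that the comment says is already done: Utf8 --> Utf8View.
        (ColumnType::Utf8, ColumnType::Utf8View) => ColumnType::Utf8View,
        // Proposed addition: also switch when the file column is Binary/BinaryView.
        (ColumnType::Binary | ColumnType::BinaryView, ColumnType::Utf8View) => {
            ColumnType::Utf8View
        }
        // Otherwise leave the file's type untouched.
        _ => file_type,
    }
}

/// Apply the rule column by column, assuming both schemas list the same
/// columns in the same order (an assumption made only for this sketch).
#[allow(dead_code)]
fn adjust_file_schema(file: &[ColumnType], table: &[ColumnType]) -> Vec<ColumnType> {
    file.iter()
        .zip(table.iter())
        .map(|(f, t)| reader_type(*f, *t))
        .collect()
}
```

With this rule, a file schema of `[Binary, Utf8, Int64]` read against a table schema of `[Utf8View, Utf8View, Int64]` would be adjusted to `[Utf8View, Utf8View, Int64]`, so the reader produces view arrays directly instead of requiring a later `cast`.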
