jkosh44 commented on issue #16248: URL: https://github.com/apache/datafusion/issues/16248#issuecomment-2952747673
Most of the cast errors have the same cause: they are trying to cast an Arrow type that doesn't exist in Substrait.

- https://github.com/apache/datafusion/issues/16275
- https://github.com/apache/datafusion/issues/16278
- https://github.com/apache/datafusion/issues/16285
- https://github.com/apache/datafusion/issues/16296
- https://github.com/apache/datafusion/issues/16298

It's probably worth coming up with a solution for all of them together, instead of independently. Has it been previously discussed how we should handle these kinds of types? I left some comments in https://github.com/apache/datafusion/issues/16285, but here's a TLDR:

- One option is to use UDTs in Substrait to represent these types. That's what the Arrow C++ library does (https://github.com/apache/arrow/issues/40695). The downside of this approach is that external systems are unlikely to understand the UDT.
- Another approach is to do a lossy conversion to a similar type: Duration -> IntervalDay, Float16 -> Fp32, FixedSizeList -> List, Time32 -> Time, Time64 -> Time. If I understand correctly, these conversions are lossy only in that they lose some semantic information; we don't have to lower the precision of any value. They won't, however, round trip from Arrow -> Substrait -> Arrow, so we'll probably need to update the roundtrip tests to account for this.
- Continue to return an error, because these types are not supported in Substrait, and update the Substrait roundtrip tests to ignore these types.
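To make the second option concrete, here is a minimal sketch of what a lossy mapping could look like. Everything in it is an assumption for illustration: `ArrowType`, `SubstraitType`, and `to_substrait` are hypothetical, simplified stand-ins, not the real `arrow` or `datafusion-substrait` APIs, and time units and list lengths are reduced to the bare minimum. The function returns a flag marking conversions that drop semantic information, which a roundtrip test could use to skip strict equality checks.

```rust
/// Hypothetical, simplified stand-ins for the Arrow types named above.
#[derive(Debug, Clone, PartialEq)]
enum ArrowType {
    Duration,
    Float16,
    FixedSizeList(Box<ArrowType>, i32),
    Time32,
    Time64,
    Int32, // a type with an exact Substrait counterpart, for contrast
}

/// Hypothetical, simplified stand-ins for the Substrait target types.
#[derive(Debug, Clone, PartialEq)]
enum SubstraitType {
    IntervalDay,
    Fp32,
    List(Box<SubstraitType>),
    Time,
    I32,
}

/// Maps an Arrow type to a similar Substrait type.
/// The boolean is `true` when the mapping loses semantic information,
/// meaning Arrow -> Substrait -> Arrow would not return the same type.
fn to_substrait(t: &ArrowType) -> (SubstraitType, bool) {
    match t {
        ArrowType::Duration => (SubstraitType::IntervalDay, true),
        ArrowType::Float16 => (SubstraitType::Fp32, true),
        ArrowType::FixedSizeList(inner, _len) => {
            // The fixed length is dropped, so the list itself is always lossy.
            let (inner, _) = to_substrait(inner);
            (SubstraitType::List(Box::new(inner)), true)
        }
        ArrowType::Time32 | ArrowType::Time64 => (SubstraitType::Time, true),
        ArrowType::Int32 => (SubstraitType::I32, false),
    }
}
```

With a flag like this, a roundtrip test could assert exact type equality only when the flag is `false`, and fall back to comparing against the expected widened type otherwise.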