jkosh44 commented on issue #16248:
URL: https://github.com/apache/datafusion/issues/16248#issuecomment-2952747673

   Most of the cast errors have the same cause: they are trying to cast an 
Arrow type that doesn't exist in Substrait.
   
   - https://github.com/apache/datafusion/issues/16275
   - https://github.com/apache/datafusion/issues/16278
   - https://github.com/apache/datafusion/issues/16285
   - https://github.com/apache/datafusion/issues/16296
   - https://github.com/apache/datafusion/issues/16298
   
   It's probably worth coming up with a solution for all of them together, 
instead of independently. Has it been previously discussed how we should handle 
these kinds of types?
   
   I left some comments in https://github.com/apache/datafusion/issues/16285, 
but here's a TLDR: 
   
   - One option is to use UDTs in Substrait to represent these. That's what the 
Arrow C++ library does (https://github.com/apache/arrow/issues/40695). The 
downside of this approach is that external systems are unlikely to understand 
the UDT.
   - Another approach is to do a lossy conversion to a similar type: Duration 
-> IntervalDay, Float16 -> Fp32, FixedSizeList -> List, Time32 -> Time, Time64 
-> Time. If I understand correctly, these conversions are lossy only in that 
they lose some semantic information; we don't have to lower the precision of 
any value. However, they won't round trip from Arrow -> Substrait -> Arrow, so 
we'll probably need to update the roundtrip tests to account for this.
   - Continue to return an error, because these types are not supported in 
Substrait, and update the Substrait roundtrip tests to ignore these types.
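
To make the second option concrete, here is a rough sketch of what the lossy mapping could look like. This is not the DataFusion or Substrait API: `ArrowType`, `SubstraitType`, and `to_substrait` are hypothetical, simplified stand-ins invented for illustration, and the `lossy` flag is just one possible way to let the roundtrip tests know when an exact match shouldn't be expected.

```rust
/// Hypothetical, simplified stand-in for the Arrow types discussed above.
/// (Not the real `arrow::datatypes::DataType`.)
#[derive(Debug, Clone, PartialEq)]
enum ArrowType {
    Duration,
    Float16,
    FixedSizeList(Box<ArrowType>, usize),
    Time32,
    Time64,
    Int32, // an example of a type that maps without loss
}

/// Hypothetical, simplified stand-in for the Substrait target types.
#[derive(Debug, Clone, PartialEq)]
enum SubstraitType {
    IntervalDay,
    Fp32,
    List(Box<SubstraitType>),
    Time,
    I32,
}

/// Maps an Arrow type to a similar Substrait type.
/// The returned bool is `true` when semantic information is dropped,
/// i.e. the type is not expected to round trip back to the same Arrow type.
fn to_substrait(t: &ArrowType) -> (SubstraitType, bool) {
    match t {
        ArrowType::Duration => (SubstraitType::IntervalDay, true),
        ArrowType::Float16 => (SubstraitType::Fp32, true),
        ArrowType::FixedSizeList(inner, _len) => {
            // The fixed length is dropped; the element type is mapped recursively.
            let (inner, _) = to_substrait(inner);
            (SubstraitType::List(Box::new(inner)), true)
        }
        ArrowType::Time32 | ArrowType::Time64 => (SubstraitType::Time, true),
        ArrowType::Int32 => (SubstraitType::I32, false),
    }
}
```

With something like this, a roundtrip test could compare types exactly only when the flag is `false`, and fall back to a looser check (or skip the comparison) when it is `true`.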


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

