andygrove opened a new issue, #4646:
URL: https://github.com/apache/datafusion-comet/issues/4646

   ## Describe the bug
   
   Spark 4.0 widens many string-typed `inputTypes` on datetime expressions to 
`StringTypeWithCollation(supportsTrimCollation = true)`. The affected datetime 
expressions include `convert_timezone`, `date_format`, `date_trunc`, 
`from_unixtime`, `make_timestamp`, `next_day`, `to_unix_timestamp`, `trunc`, 
and `unix_timestamp`.
   
   Today the Comet serdes for these expressions accept those string inputs 
without distinguishing the collation, so non-default collations are silently 
treated as compatible. Per the `audit-comet-expression` skill (rule 11), a 
non-default collation on a string input should flip the support level to 
`Incompatible(Some(...))` so the divergence is visible in EXPLAIN and the 
auto-generated compatibility guide, and so the projection falls back rather 
than producing potentially divergent results.
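
The gating rule described above can be modeled in a few lines. This is a minimal, language-neutral sketch in Python, not Comet's Scala serde code: `Compatible`, `Incompatible`, and `support_level` are hypothetical stand-ins, and the only fact taken as given is that Spark's default string collation is `UTF8_BINARY`.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Union


# Hypothetical stand-ins for the support levels named in this issue.
# They are NOT the real Comet classes, only a model of the decision.
@dataclass(frozen=True)
class Compatible:
    pass


@dataclass(frozen=True)
class Incompatible:
    reason: Optional[str] = None


# Spark's default string collation.
DEFAULT_COLLATION = "UTF8_BINARY"


def support_level(
    expr_name: str, string_input_collations: Iterable[str]
) -> Union[Compatible, Incompatible]:
    """Return Incompatible if any string input uses a non-default collation."""
    non_default = sorted(
        {c.upper() for c in string_input_collations}
        - {DEFAULT_COLLATION}
    )
    if non_default:
        return Incompatible(
            f"{expr_name}: string input uses non-default collation "
            f"{', '.join(non_default)}"
        )
    return Compatible()
```

Under this model, `support_level("date_format", ["UTF8_BINARY"])` stays `Compatible()`, while any `UTF8_LCASE` or `UNICODE_CI` input yields `Incompatible` with a reason string that could be surfaced in EXPLAIN.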
   
   ## Steps to reproduce
   
   On Spark 4.0, apply a non-default collation (for example `UTF8_LCASE` or 
`UNICODE_CI`) to a string argument of one of the datetime expressions above, 
then inspect the plan. Comet still runs the expression natively and does not 
flag the collation as incompatible.
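
As a starting point for a reproduction, the helper below builds candidate SQL statements for a few of the affected expressions. It is a sketch under assumptions: the table name `events` and its columns `ts` (timestamp) and `d` (date) are hypothetical, and the `'literal' COLLATE <name>` form is assumed to be the way the collation is attached in Spark 4.0 SQL. The helper only produces strings and does not need a Spark session.

```python
# Templates for a subset of the affected expressions. Each one attaches a
# collation to the string argument (format, unit, or day name).
# Assumed schema (hypothetical): {t}(ts TIMESTAMP, d DATE).
REPRO_TEMPLATES = {
    "date_format": "SELECT date_format(ts, 'yyyy-MM-dd' COLLATE {c}) FROM {t}",
    "date_trunc": "SELECT date_trunc('MONTH' COLLATE {c}, ts) FROM {t}",
    "trunc": "SELECT trunc(d, 'MM' COLLATE {c}) FROM {t}",
    "next_day": "SELECT next_day(d, 'MON' COLLATE {c}) FROM {t}",
}


def repro_queries(collation: str, table: str = "events") -> dict:
    """Return one SQL statement per expression for the given collation."""
    return {
        name: template.format(c=collation, t=table)
        for name, template in REPRO_TEMPLATES.items()
    }


if __name__ == "__main__":
    for name, sql in repro_queries("UTF8_LCASE").items():
        print(f"{name}: {sql}")
```

Each generated statement could then be run with `spark.sql(sql).explain()` on a Spark 4.0 session with Comet enabled to see whether the projection is executed natively or falls back.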
   
   ## Expected behavior
   
   Non-default collations on string inputs to these datetime expressions should 
report `Incompatible(Some(...))` (falling back unless explicitly opted in), 
consistent with how other expressions gate collation.
   
   ## Additional context
   
   Split out from the high-priority list in #4502 (item 5, originally tracked 
as medium priority) so that #4502 can be closed once the remaining fixes land. 
See also #2190 and #4496.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

