vigneshsiva11 commented on PR #3265:
URL:
https://github.com/apache/datafusion-comet/pull/3265#issuecomment-3912718812
@andygrove Thank you for the detailed review and feedback!
I've updated the PR to address all your concerns:
### Changes Made:
1. Dictionary-encoded timestamp support: Added proper handling for
dictionary-encoded arrays (common in Parquet/Iceberg) in
`extract_date_part.rs`. The code now:
- First normalizes dictionary arrays by casting to the underlying
timestamp type
- Then applies the appropriate timezone logic based on whether it's
TimestampNTZ or Timestamp
2. Regression test added: Added a comprehensive test in
`CometTemporalExpressionSuite` (as suggested) that validates
`hour/minute/second` extraction from TimestampNTZ columns across multiple
session timezones (UTC, America/Los_Angeles, and Asia/Tokyo). This ensures the
behavior is consistent regardless of session timezone.
3. Cleaned up PR scope: Removed the unrelated `datediff` changes that were
causing test failures and belonged in a separate PR.
### Code Changes:
-
[native/spark-expr/src/datetime_funcs/extract_date_part.rs](https://github.com/apache/datafusion-comet/pull/3265/files#diff-...)
- Dictionary support + TimestampNTZ handling
-
[spark/src/test/scala/org/apache/comet/CometTemporalExpressionSuite.scala](https://github.com/apache/datafusion-comet/pull/3265/files#diff-...)
- New regression test
The PR now focuses solely on fixing issue #3180. Ready for another review
when you have time!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]