[ https://issues.apache.org/jira/browse/SPARK-57315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Max Gekk resolved SPARK-57315.
------------------------------
Fix Version/s: 4.3.0
Resolution: Fixed
Issue resolved by pull request 56368
[https://github.com/apache/spark/pull/56368]
> Support HOUR, MINUTE and SECOND functions over nanosecond-precision timestamps
> ------------------------------------------------------------------------------
>
> Key: SPARK-57315
> URL: https://issues.apache.org/jira/browse/SPARK-57315
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 4.3.0
> Reporter: Max Gekk
> Assignee: Max Gekk
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.3.0
>
> Attachments: nanos_hour_plan.md
>
>
> The nanosecond-precision timestamp types TIMESTAMP_NTZ(p) and TIMESTAMP_LTZ(p)
> (p in [7, 9]) are currently being added to Spark SQL. Their physical value is
> TimestampNanosVal(epochMicros: Long, nanosWithinMicro: Short).
> The time-of-day extraction functions hour(), minute() and second() do not yet
> accept these types. They are implemented by the GetTimeField expressions
> (Hour, Minute, Second), whose inputTypes is AnyTimestampType, which only accepts
> the microsecond TimestampType and TimestampNTZType. As a result, calling these
> functions on a TIMESTAMP_NTZ(p) / TIMESTAMP_LTZ(p) value fails analysis.
> These three functions return an integer field (hour 0-23, minute 0-59, second
> 0-59) that depends only on epochMicros; the sub-microsecond digits never affect
> the result. We can therefore reuse the existing expressions and DateTimeUtils
> logic by casting the nanosecond input down to the matching microsecond type
> before evaluation:
> - TimestampNTZNanosType(p) -> TimestampNTZType (UTC / wall-clock extraction)
> - TimestampLTZNanosType(p) -> TimestampType (session-zone extraction)
> The cast (already available, SPARK-57293) keeps epochMicros and drops
> nanosWithinMicro, which is lossless for these integer results.
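
The losslessness argument above can be checked with a small plain-Python model. This is only a sketch, not Spark code: the class mirrors the TimestampNanosVal(epochMicros, nanosWithinMicro) layout from the description, while the 0..999 range for the nanosecond remainder and the helper names cast_to_micros and time_fields_utc are assumptions made for illustration. It covers only the UTC / wall-clock (NTZ) case; the LTZ case would additionally apply the session zone offset.

```python
from dataclasses import dataclass

MICROS_PER_SECOND = 1_000_000
SECONDS_PER_MINUTE = 60
SECONDS_PER_HOUR = 3_600
SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class TimestampNanosVal:
    """Plain-Python stand-in for the physical value named in the issue."""
    epoch_micros: int         # microseconds since 1970-01-01T00:00:00Z
    nanos_within_micro: int   # assumed range 0..999 (sub-microsecond remainder)


def cast_to_micros(ts: TimestampNanosVal) -> int:
    """Model of the down-cast: keep epoch_micros, drop nanos_within_micro."""
    return ts.epoch_micros


def time_fields_utc(epoch_micros: int) -> tuple[int, int, int]:
    """Return (hour, minute, second) for a UTC / wall-clock reading."""
    # Floor division keeps pre-1970 (negative) values on the correct second.
    seconds_of_day = (epoch_micros // MICROS_PER_SECOND) % SECONDS_PER_DAY
    hour = seconds_of_day // SECONDS_PER_HOUR
    minute = (seconds_of_day % SECONDS_PER_HOUR) // SECONDS_PER_MINUTE
    second = seconds_of_day % SECONDS_PER_MINUTE
    return hour, minute, second


# 13:45:59.999999 on the epoch day, paired with every sub-microsecond remainder.
base_micros = ((13 * 60 + 45) * 60 + 59) * MICROS_PER_SECOND + 999_999
results = {
    time_fields_utc(cast_to_micros(TimestampNanosVal(base_micros, n)))
    for n in range(1000)
}
print(results)  # a single (hour, minute, second) triple: {(13, 45, 59)}
```

Because the remainder is assumed to be non-negative and below one microsecond, it can never carry into epoch_micros, so all 1000 inputs collapse to the same triple.
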
> Implementation:
> - Add a dedicated analyzer rule (ResolveTimestampNanosExpressions), modeled on
> ResolveBinaryArithmetic, that rewrites a resolved Hour/Minute/Second whose
> child is a nanosecond timestamp type into <expr>(Cast(child, microType)).
> The rule is preferred over a TypeCoercion rule so the behavioral change stays
> scoped to these functions rather than every AnyTimestampType expression.
> - The rule is named generically so future nanos-aware expressions can be added
> as additional case branches.
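
The shape of the rewrite described above can be sketched on a toy expression tree. This is a plain-Python model and not the Catalyst implementation: DataType, Column, Cast and TimeField are hypothetical stand-ins, the function handles a single node rather than traversing a plan, and the name "second_with_fraction" is only a placeholder for an expression the rule leaves alone.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataType:
    name: str
    micro_type: Optional["DataType"] = None  # set only for nanosecond types


INTEGER = DataType("IntegerType")
TIMESTAMP = DataType("TimestampType")
TIMESTAMP_NTZ = DataType("TimestampNTZType")


def ntz_nanos(p: int) -> DataType:
    return DataType(f"TimestampNTZNanosType({p})", TIMESTAMP_NTZ)


def ltz_nanos(p: int) -> DataType:
    return DataType(f"TimestampLTZNanosType({p})", TIMESTAMP)


@dataclass(frozen=True)
class Column:
    name: str
    data_type: DataType


@dataclass(frozen=True)
class Cast:
    child: object
    data_type: DataType


@dataclass(frozen=True)
class TimeField:
    func: str                     # e.g. "hour", "minute", "second"
    child: object
    data_type: DataType = INTEGER


# Only these functions are rewritten; anything else passes through untouched.
REWRITTEN_FUNCS = {"hour", "minute", "second"}


def resolve_timestamp_nanos_expressions(expr):
    """Wrap a nanosecond-typed child in a cast to its microsecond type."""
    if isinstance(expr, TimeField) and expr.func in REWRITTEN_FUNCS:
        micro = expr.child.data_type.micro_type
        if micro is not None:
            return TimeField(expr.func, Cast(expr.child, micro))
    return expr


ts = Column("ts", ntz_nanos(9))
rewritten = resolve_timestamp_nanos_expressions(TimeField("hour", ts))
print(rewritten.child.data_type.name)  # TimestampNTZType
```

Applying the function to its own output changes nothing, since the cast's type is no longer a nanosecond type; keeping the match list explicit is what scopes the change to these three functions.
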
> Out of scope:
> - SecondWithFraction (the extract(SECOND) path returning DECIMAL(8,6)) is
> excluded because its result depends on the sub-microsecond digits.
> - Other timestamp expressions that return a timestamp, read sub-second
> precision, or compare/order/hash the value require genuine nanos-aware
> evaluation and are handled separately.
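
To illustrate why a fractional-second result cannot reuse the same down-cast, here is a small plain-Python sketch. The nine-digit scale and the helper second_with_fraction are assumptions for illustration only; the description does not say what type a nanos-aware version would return.

```python
from decimal import Decimal

MICROS_PER_MINUTE = 60_000_000
NANOS_PER_MICRO = 1_000


def second_with_fraction(epoch_micros: int, nanos_within_micro: int) -> Decimal:
    """Seconds within the minute, keeping nine fractional digits (assumed scale)."""
    nanos_of_minute = (
        (epoch_micros % MICROS_PER_MINUTE) * NANOS_PER_MICRO + nanos_within_micro
    )
    # Shift the decimal point nine places left: nanoseconds -> seconds.
    return Decimal(nanos_of_minute).scaleb(-9)


# 00:00:59.999999123 UTC on the epoch day.
epoch_micros, nanos_within_micro = 59_999_999, 123

exact = second_with_fraction(epoch_micros, nanos_within_micro)
truncated = second_with_fraction(epoch_micros, 0)  # what a cast to micros would leave

print(exact != truncated)             # the fractional result changes
print(int(exact) == int(truncated))   # the integer second does not
```
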
> This change is gated by spark.sql.timestampNanosTypes.enabled.