[ https://issues.apache.org/jira/browse/SPARK-57826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Max Gekk updated SPARK-57826:
-----------------------------
Shepherd: Max Gekk
> Support nanosecond-precision timestamps in
> approx_percentile/percentile_approx and histogram_numeric
> ----------------------------------------------------------------------------------------------------
>
> Key: SPARK-57826
> URL: https://issues.apache.org/jira/browse/SPARK-57826
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 4.3.0
> Reporter: Max Gekk
> Priority: Major
>
> This sub-task is part of the umbrella SPARK-56822 (timestamps with nanosecond
> precision).
> h2. Problem
> {{ApproximatePercentile}} (aggregate/ApproximatePercentile.scala ~L104-209)
> lists {{TimestampType}}, {{TimestampNTZType}}, and {{AnyTimeType}} in
> {{inputTypes}} but omits {{AnyTimestampNanoType}}. Its value path converts
> the input with {{value.toDouble}} and the result with {{.toLong}}, which
> does not work for {{TimestampNanosVal}} because it is not a {{Number}}.
> {{HistogramNumeric}} (aggregate/HistogramNumeric.scala ~L80-168) follows the
> same pattern ({{asInstanceOf[Number].doubleValue()}}). Microsecond timestamps
> work; nanosecond types are rejected at analysis and, if they were accepted,
> would fail at runtime. The required change mirrors the TIME extension added
> by SPARK-57557.
> h2. Goal
> Accept {{AnyTimestampNanoType}} in both aggregates and convert
> {{TimestampNanosVal}} to/from the internal double representation without
> losing sub-microsecond precision (e.g. via epoch seconds + fractional nanos),
> returning a nanosecond timestamp with the same precision and family
> (NTZ or LTZ) as the input.
> h2. Scope
> Extend {{inputTypes}}; add nanosecond value<->double conversion in
> update/merge/eval; preserve precision on the result type. Follow the
> SPARK-57557 pattern.
> h2. Acceptance criteria
> * {{approx_percentile}} / {{percentile_approx}} / {{histogram_numeric}} over
> NTZ/LTZ nanosecond timestamps return correctly-typed nanosecond results;
> accuracy comparable to the microsecond path.
> h2. Testing
> {{ApproximatePercentileQuerySuite}}, {{HistogramNumericSuite}}; nanos golden
> files.
> h2. Dependencies
> None - independent (mirrors the TIME work SPARK-57557).
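
The precision concern in the Goal section can be illustrated with a small Python sketch. It is not Spark code: the helper names below are hypothetical, and it only shows why a single 64-bit double cannot carry present-day epoch nanoseconds exactly, while a split into epoch seconds and nanos-of-second (the direction the description hints at) round-trips without loss. How the percentile and histogram sketches would consume such a pair is left open here.

```python
from typing import Tuple

NANOS_PER_SECOND = 1_000_000_000


def roundtrip_single_double(epoch_nanos: int) -> int:
    """Convert epoch nanoseconds to one float64 and back to an integer.

    A float64 has a 53-bit significand, so values near 1.7e18 (between
    2**60 and 2**61) are spaced 256 apart and the low bits are rounded away.
    """
    return int(float(epoch_nanos))


def split_nanos(epoch_nanos: int) -> Tuple[float, float]:
    """Split into (epoch seconds, nanos of second) as two floats.

    Both parts are small enough to be exact in a float64. divmod floors,
    so the nanos part stays in [0, 1e9) for pre-epoch timestamps too.
    """
    seconds, nanos = divmod(epoch_nanos, NANOS_PER_SECOND)
    return float(seconds), float(nanos)


def join_nanos(seconds: float, nanos: float) -> int:
    """Rebuild epoch nanoseconds from the two exact parts."""
    return int(seconds) * NANOS_PER_SECOND + int(nanos)


if __name__ == "__main__":
    ts = 1_700_000_000_123_456_789  # a timestamp in late 2023, in epoch nanos
    lossy = roundtrip_single_double(ts)
    exact = join_nanos(*split_nanos(ts))
    print("single double error (ns):", abs(lossy - ts))
    print("split round-trip exact:", exact == ts)
```

The single-double path is off by up to 128 ns for current timestamps, which is why a plain {{toDouble}}-style conversion of epoch nanoseconds would not meet the "without losing sub-microsecond precision" goal.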