Max Gekk created SPARK-57848:
--------------------------------

             Summary: Support the TIME data type in approx_top_k
                 Key: SPARK-57848
                 URL: https://issues.apache.org/jira/browse/SPARK-57848
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 4.3.0
            Reporter: Max Gekk


This sub-task is part of the umbrella SPARK-57550 (extend support for the TIME 
data type).

h2. Problem
{{ApproxTopK}} (aggregate/ApproxTopKAggregates.scala) rejects {{TimeType}}: the 
{{isDataTypeSupported}} whitelist (~L227-234) excludes it, and 
{{createItemsSketch}} / {{genSketchSerDe}} (~L246-260) have no TIME arm. This 
gap is not covered by SPARK-57557.

h2. Goal
Support {{TimeType}} in {{approx_top_k}}, keying the sketch on each value's 
nanos-of-day {{Long}} representation.
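
To illustrate the keying idea, here is a minimal sketch of mapping a time-of-day value to a nanos-of-day integer and back. It is written in Python for brevity and is not the Spark implementation; the helper names {{time_to_nanos_of_day}} and {{nanos_of_day_to_time}} are hypothetical, and Python's {{datetime.time}} only carries microsecond precision, so sub-microsecond digits are dropped on the way back.

```python
from datetime import time

NANOS_PER_MICRO = 1_000
MICROS_PER_SECOND = 1_000_000


def time_to_nanos_of_day(t: time) -> int:
    """Encode a time of day as nanoseconds since midnight (hypothetical helper)."""
    seconds = (t.hour * 60 + t.minute) * 60 + t.second
    return (seconds * MICROS_PER_SECOND + t.microsecond) * NANOS_PER_MICRO


def nanos_of_day_to_time(nanos: int) -> time:
    """Decode nanoseconds since midnight back into a time of day.

    datetime.time has microsecond resolution, so any sub-microsecond
    remainder is discarded here.
    """
    micros = nanos // NANOS_PER_MICRO
    seconds, microsecond = divmod(micros, MICROS_PER_SECOND)
    minutes, second = divmod(seconds, 60)
    hour, minute = divmod(minutes, 60)
    return time(hour, minute, second, microsecond)
```

Because the encoding is a plain integer, two equal TIME values always map to the same key, which is what a frequency sketch needs.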

h2. Scope
Add {{TimeType}} to {{isDataTypeSupported}} and to the sketch (de)serialization 
paths; return TIME-typed items.
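
As a rough picture of what a (de)serialization path for {{Long}} keys involves, the following Python sketch round-trips a list of nanos-of-day keys through a byte buffer. The layout (a 4-byte little-endian count followed by 8-byte little-endian signed integers) and the function names are assumptions made for illustration only; they are not taken from {{genSketchSerDe}} or from the sketch library's actual format.

```python
import struct


def serialize_long_keys(keys):
    """Pack keys as: int32 count, then one int64 per key (assumed layout)."""
    return struct.pack(f"<i{len(keys)}q", len(keys), *keys)


def deserialize_long_keys(data):
    """Inverse of serialize_long_keys for the same assumed layout."""
    (count,) = struct.unpack_from("<i", data, 0)
    return list(struct.unpack_from(f"<{count}q", data, 4))
```

The point of the sketch is only that TIME needs no new wire format if it reuses whatever path already handles 64-bit integer items.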

h2. Acceptance criteria
* {{approx_top_k}} over a TIME column returns correct top-k TIME values and 
counts.
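
The acceptance criterion can be pictured with a small Python model. It uses an exact {{collections.Counter}} as a stand-in for the approximate sketch, so it only shows the expected shape of the result (TIME items with counts, most frequent first), not the approximation behaviour. The function {{top_k_times}} and the tie-breaking rule (earlier time first) are hypothetical.

```python
from collections import Counter
from datetime import time


def to_nanos(t: time) -> int:
    """Nanoseconds since midnight for a datetime.time value."""
    seconds = (t.hour * 60 + t.minute) * 60 + t.second
    return (seconds * 1_000_000 + t.microsecond) * 1_000


def from_nanos(nanos: int) -> time:
    """Convert nanoseconds since midnight back to datetime.time."""
    seconds, microsecond = divmod(nanos // 1_000, 1_000_000)
    minutes, second = divmod(seconds, 60)
    hour, minute = divmod(minutes, 60)
    return time(hour, minute, second, microsecond)


def top_k_times(values, k):
    """Count values by their integer key, then return the k most frequent
    as (time, count) pairs. Exact counting stands in for the sketch."""
    counts = Counter(to_nanos(v) for v in values)
    # Sort by descending count; break ties by the smaller key (assumption).
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(from_nanos(n), c) for n, c in ranked[:k]]


sample = [time(9, 0)] * 3 + [time(12, 30)] * 2 + [time(18, 45)]
print(top_k_times(sample, 2))
```

A real test would compare the function's output on a TIME column against counts like these, on data small enough that the sketch is exact.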

h2. Testing
Add TIME cases to the {{ApproxTopK}}-related test suite.

h2. Dependencies
None - independent.




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
