[ 
https://issues.apache.org/jira/browse/SPARK-57848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk updated SPARK-57848:
-----------------------------
    Shepherd: Max Gekk

> Support the TIME data type in approx_top_k
> ------------------------------------------
>
>                 Key: SPARK-57848
>                 URL: https://issues.apache.org/jira/browse/SPARK-57848
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 4.3.0
>            Reporter: Max Gekk
>            Priority: Major
>
> This sub-task is part of the umbrella SPARK-57550 (extend support for the 
> TIME data type).
> h2. Problem
> {{ApproxTopK}} (aggregate/ApproxTopKAggregates.scala) rejects {{TimeType}}: 
> the {{isDataTypeSupported}} whitelist (~L227-234) excludes it, and 
> {{createItemsSketch}} / {{genSketchSerDe}} (~L246-260) have no TIME arm. 
> SPARK-57557 does not cover this.
> h2. Goal
> Support {{TimeType}} in {{approx_top_k}}, keying the sketch on the 
> nanos-of-day {{Long}}.
> h2. Scope
> Add {{TimeType}} to {{isDataTypeSupported}} and to the sketch 
> (de)serialization paths; return TIME-typed items.
> h2. Acceptance criteria
> * {{approx_top_k}} over a TIME column returns correct top-k TIME values and 
> counts.
> h2. Testing
> Cover the change in the {{ApproxTopK}}-related test suite.
> h2. Dependencies
> None - independent.
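
The Goal section asks for the sketch to be keyed on the nanos-of-day Long. The following is a minimal, self-contained sketch of that idea only, not the Spark implementation: it uses java.time.LocalTime in place of Spark's internal TIME representation and an exact Map-based count in place of the frequent-items sketch. The names TimeTopKSketch, toNanosOfDay, fromNanosOfDay and topK are hypothetical and do not appear in the ticket.

```scala
import java.time.LocalTime

// Illustration only: all names here are hypothetical, not Spark APIs.
object TimeTopKSketch {
  // Encode a time-of-day value as nanoseconds since midnight (the Long key).
  def toNanosOfDay(t: LocalTime): Long = t.toNanoOfDay

  // Decode the Long key back into a time-of-day value for the result.
  def fromNanosOfDay(n: Long): LocalTime = LocalTime.ofNanoOfDay(n)

  // Exact top-k over Long keys. Assumption: this stands in for the
  // approximate sketch, which would also count Long keys but with bounded
  // memory and approximate counts.
  def topK(times: Seq[LocalTime], k: Int): Seq[(LocalTime, Long)] = {
    val counts: Map[Long, Long] =
      times.groupBy(t => toNanosOfDay(t)).map { case (key, vs) => key -> vs.size.toLong }
    counts.toSeq
      .sortBy { case (key, count) => (-count, key) } // highest count first, ties by key
      .take(k)
      .map { case (key, count) => (fromNanosOfDay(key), count) } // return TIME-typed items
  }

  def main(args: Array[String]): Unit = {
    val input = Seq(
      "09:00:00", "09:00:00", "12:30:00.000000001",
      "09:00:00", "12:30:00.000000001", "23:59:59"
    ).map(s => LocalTime.parse(s))
    topK(input, 2).foreach { case (t, c) => println(s"$t -> $c") }
  }
}
```

Counting on the Long key and converting back only when producing results mirrors the Scope section: the sketch and its (de)serialization see plain Longs, while callers still receive TIME-typed items.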



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

