[ https://issues.apache.org/jira/browse/SPARK-57557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Max Gekk resolved SPARK-57557.
------------------------------
    Fix Version/s: 4.3.0
       Resolution: Fixed

Issue resolved by pull request 56889
[https://github.com/apache/spark/pull/56889]

> Support the TIME data type in quantile and sketch aggregates
> ------------------------------------------------------------
>
>                 Key: SPARK-57557
>                 URL: https://issues.apache.org/jira/browse/SPARK-57557
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 4.3.0
>            Reporter: Max Gekk
>            Assignee: Max Gekk
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.3.0
>
>
> h2. What
> Allow {{TimeType}} as input to the quantile/sketch aggregate functions:
> {{percentile}}, {{percentile_approx}} / {{approx_percentile}} ({{ApproximatePercentile}}),
> {{median}}, {{histogram_numeric}} ({{HistogramNumeric}}), and the datasketches aggregates.
> h2. Why
> These functions currently accept NumericType, DateType, TimestampType, TimestampNTZType and
> intervals, but not {{TimeType}} (see {{ApproximatePercentile.inputTypes}}). TIME is an
> ordered datetime type with a {{Long}} internal value, so percentiles/medians are well
> defined and consistent with how TIMESTAMP is already handled.
> h2. Scope
> * Add {{TimeType}} to the {{inputTypes}}/{{TypeCollection}} of the affected aggregates.
> * Add the {{TimeType}} branches in the value<->double conversions (the internal value is a
>   {{Long}}, same as TIMESTAMP/DayTimeInterval).
> * Datasketches aggregates: include {{TimeType}} in the supported-types list
>   (note the existing "implement support for decimal/datetime/interval types" TODO).
> * Return type for a TIME percentile/median is {{TimeType}} (matching the TIMESTAMP behavior).
> h2. Acceptance criteria
> * {{percentile}}, {{percentile_approx}}, {{median}}, {{histogram_numeric}} work on TIME
>   columns and return TIME.
> * Tests added alongside the existing datetime aggregate tests.
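
The value<->double conversion and the TIME return type described in the Scope section can be illustrated with a small standalone sketch. This is not the Spark implementation: the helper names (time_to_nanos, nanos_to_time, percentile_time) are hypothetical, the encoding as nanoseconds since midnight is an assumption about what the Long internal value holds, and linear interpolation between neighboring ranks is chosen here only for illustration.

```python
from datetime import time

NANOS_PER_SECOND = 1_000_000_000
NANOS_PER_MICRO = 1_000


def time_to_nanos(t: time) -> int:
    """Encode a time of day as an integer (assumed: nanoseconds since midnight)."""
    seconds = t.hour * 3600 + t.minute * 60 + t.second
    return seconds * NANOS_PER_SECOND + t.microsecond * NANOS_PER_MICRO


def nanos_to_time(nanos: int) -> time:
    """Decode nanoseconds since midnight back into a time.

    Python's datetime.time only keeps microseconds, so sub-microsecond
    digits are truncated in this sketch.
    """
    seconds, rem_nanos = divmod(nanos, NANOS_PER_SECOND)
    minutes, second = divmod(seconds, 60)
    hour, minute = divmod(minutes, 60)
    return time(hour, minute, second, rem_nanos // NANOS_PER_MICRO)


def percentile_time(values, p):
    """Percentile of time values: long -> double, interpolate, double -> long -> time."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if not values:
        return None
    # Nanoseconds of a day stay below 2**53, so the double conversion is exact.
    doubles = sorted(float(time_to_nanos(v)) for v in values)
    pos = p * (len(doubles) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(doubles) - 1)
    frac = pos - lower
    result = doubles[lower] + (doubles[upper] - doubles[lower]) * frac
    # The result is converted back, so the output type matches the input type.
    return nanos_to_time(int(result))


samples = [time(9, 0), time(12, 0), time(10, 0)]
median = percentile_time(samples, 0.5)
first_quartile = percentile_time(samples, 0.25)
```

The point of the sketch is the round trip: the aggregate works on doubles internally, while the caller passes time values in and gets a time value back, mirroring the "return TIME" acceptance criterion.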



