[ 
https://issues.apache.org/jira/browse/FLINK-36123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee updated FLINK-36123:
--------------------------------
    Fix Version/s: 2.0-preview

> Add PERCENTILE function
> -----------------------
>
>                 Key: FLINK-36123
>                 URL: https://issues.apache.org/jira/browse/FLINK-36123
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table SQL / API
>            Reporter: Dylan He
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.0-preview
>
>         Attachments: image-2024-09-06-10-23-42-094.png
>
>
> Add PERCENTILE function.
> ----
> Return a percentile value based on a continuous distribution of the input 
> column. If no input row lies exactly at the desired percentile, the result is 
> calculated using linear interpolation of the two nearest input values. NULL 
> values are ignored in the calculation.
> Example:
> {code:sql}
> > SELECT PERCENTILE(col, 0.3) FROM VALUES (0), (10), (10) AS tab(col);
>  6.0
> > SELECT PERCENTILE(col, 0.3, freq) FROM VALUES (0, 1), (10, 2) AS tab(col, freq);
>  6.0
> > SELECT PERCENTILE(col, array(0.25, 0.75)) FROM VALUES (0), (10) AS tab(col);
>  [2.5,7.5]
> {code}
> Syntax:
> {code:sql}
> PERCENTILE(expr, percentage[, frequency])
> {code}
> Arguments:
>  * {{expr}}: A NUMERIC expression.
>  * {{percentage}}: A NUMERIC expression between 0 and 1 or an ARRAY of 
> NUMERIC expressions, each between 0 and 1.
>  * {{frequency}}: An optional integral number greater than 0.
> Returns:
> DOUBLE if percentage is numeric, or an ARRAY of DOUBLE if percentage is an 
> ARRAY.
> Frequency describes the number of times expr must be counted. The default 
> value is 1.
> See also:
>  * [Hive|https://cwiki.apache.org/confluence/display/hive/languagemanual+udf#LanguageManualUDF-Built-inAggregateFunctions(UDAF)]
>  * [Spark|https://spark.apache.org/docs/3.5.1/sql-ref-functions-builtin.html#mathematical-functions]
>  * [Databricks|https://docs.databricks.com/en/sql/language-manual/functions/percentile.html]
>  * [PostgreSQL|https://www.postgresql.org/docs/16/functions-aggregate.html] percentile_cont
>  * [Snowflake|https://docs.snowflake.com/en/sql-reference/functions/percentile_cont]
>  * [Oracle|https://docs.oracle.com/en/database/oracle/oracle-database/23/sqlrf/PERCENTILE_CONT.html]
>  * [Wiki|https://en.wikipedia.org/wiki/Percentile]
> ----
> Currently our implementation is inspired by Spark's PERCENTILE, which offers 
> an additional frequency parameter compared to the SQL standard function 
> PERCENTILE_CONT.
> Building on this function, we can easily extend support to PERCENTILE_CONT and 
> PERCENTILE_DISC from the SQL standard with minor modifications in the future.
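
The interpolation rule described in the issue can be sketched in a few lines of Python. This is only an illustration of the arithmetic behind the three examples, not the Flink implementation: the function name `percentile`, the `(value, frequency)` row shape, returning `None` for empty input, and raising on invalid arguments are all assumptions made for the sketch.

```python
import math
from typing import Iterable, Optional, Tuple


def percentile(
    rows: Iterable[Tuple[Optional[float], int]], percentage: float
) -> Optional[float]:
    """Continuous percentile over (value, frequency) rows (hypothetical helper)."""
    if not 0.0 <= percentage <= 1.0:
        raise ValueError("percentage must be between 0 and 1")

    # NULL (None) values are ignored; the rest are sorted by value.
    data = sorted((v, f) for v, f in rows if v is not None)
    if any(f <= 0 for _, f in data):
        raise ValueError("frequency must be greater than 0")

    total = sum(f for _, f in data)
    if total == 0:
        return None  # assumption: no usable input yields NULL

    # Zero-based fractional position in the sorted sequence, where each
    # value is counted `frequency` times.
    position = percentage * (total - 1)
    lower, upper = math.floor(position), math.ceil(position)

    def value_at(index: int) -> float:
        # Walk cumulative frequencies instead of repeating values in memory.
        seen = 0
        for value, freq in data:
            seen += freq
            if index < seen:
                return value
        raise IndexError(index)

    low, high = value_at(lower), value_at(upper)
    # Linear interpolation between the two nearest input values.
    return low + (high - low) * (position - lower)
```

For the first example, the three values 0, 10, 10 give a position of 0.3 * (3 - 1) = 0.6, which lies between index 0 (value 0) and index 1 (value 10), so the result is 0 + 10 * 0.6 = 6.0. An ARRAY percentage corresponds to calling the helper once per element, for example `[percentile(rows, p) for p in (0.25, 0.75)]`.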



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
