[ https://issues.apache.org/jira/browse/FLINK-36123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
lincoln lee updated FLINK-36123:
--------------------------------
    Fix Version/s: 2.0-preview

> Add PERCENTILE function
> -----------------------
>
>                 Key: FLINK-36123
>                 URL: https://issues.apache.org/jira/browse/FLINK-36123
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table SQL / API
>            Reporter: Dylan He
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.0-preview
>
>         Attachments: image-2024-09-06-10-23-42-094.png
>
>
> Add PERCENTILE function.
> ----
> Return a percentile value based on a continuous distribution of the input column. If no input row lies exactly at the desired percentile, the result is calculated using linear interpolation of the two nearest input values. NULL values are ignored in the calculation.
> Example:
> {code:sql}
> > SELECT PERCENTILE(col, 0.3) FROM VALUES (0), (10), (10) AS tab(col);
>  6.0
> > SELECT PERCENTILE(col, 0.3, freq) FROM VALUES (0, 1), (10, 2) AS tab(col, freq);
>  6.0
> > SELECT PERCENTILE(col, array(0.25, 0.75)) FROM VALUES (0), (10) AS tab(col);
>  [2.5,7.5]
> {code}
> Syntax:
> {code:sql}
> PERCENTILE(expr, percentage[, frequency])
> {code}
> Arguments:
> * {{expr}}: A NUMERIC expression.
> * {{percentage}}: A NUMERIC expression between 0 and 1 or an ARRAY of NUMERIC expressions, each between 0 and 1.
> * {{frequency}}: An optional integral number greater than 0.
> Returns:
> DOUBLE if percentage is numeric, or an ARRAY of DOUBLE if percentage is an ARRAY.
> Frequency describes the number of times expr must be counted. The default value is 1.
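
The interpolation and frequency semantics described above can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not the Flink implementation: the function name `percentile` and the `(value, frequency)` row shape are assumptions made for the example.

```python
import bisect
import math


def percentile(rows, percentage):
    """Illustrative sketch of PERCENTILE(expr, percentage[, frequency]).

    rows: iterable of (value, frequency) pairs; value may be None (NULL).
    percentage: a number in [0, 1] or a list of such numbers.
    """
    # Aggregate frequencies per distinct value, ignoring NULL values.
    counts = {}
    for value, freq in rows:
        if value is None:
            continue
        if freq <= 0:
            raise ValueError("frequency must be greater than 0")
        counts[value] = counts.get(value, 0) + freq
    if not counts:
        return None

    values = sorted(counts)
    cumulative = []  # cumulative[i] = number of rows with value <= values[i]
    total = 0
    for v in values:
        total += counts[v]
        cumulative.append(total)

    def value_at(index):
        # Value at a 0-based index of the frequency-expanded sorted input.
        return values[bisect.bisect_right(cumulative, index)]

    def one(p):
        if not 0 <= p <= 1:
            raise ValueError("percentage must be between 0 and 1")
        position = p * (total - 1)  # fractional rank in the sorted input
        lower, upper = math.floor(position), math.ceil(position)
        lo, hi = value_at(lower), value_at(upper)
        # Linear interpolation between the two nearest input values.
        return float(lo + (hi - lo) * (position - lower))

    if isinstance(percentage, (list, tuple)):
        return [one(p) for p in percentage]
    return one(percentage)
```

Calling `percentile([(0, 1), (10, 2)], 0.3)` mirrors the second SQL example: the three counted rows are 0, 10, 10, the fractional rank is 0.3 * (3 - 1) = 0.6, and interpolating between 0 and 10 gives approximately 6.0.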
> See also:
> * [Hive|https://cwiki.apache.org/confluence/display/hive/languagemanual+udf#LanguageManualUDF-Built-inAggregateFunctions(UDAF)]
> * [Spark|https://spark.apache.org/docs/3.5.1/sql-ref-functions-builtin.html#mathematical-functions]
> * [Databricks|https://docs.databricks.com/en/sql/language-manual/functions/percentile.html]
> * [PostgreSQL|https://www.postgresql.org/docs/16/functions-aggregate.html] percentile_cont
> * [Snowflake|https://docs.snowflake.com/en/sql-reference/functions/percentile_cont]
> * [Oracle|https://docs.oracle.com/en/database/oracle/oracle-database/23/sqlrf/PERCENTILE_CONT.html]
> * [Wiki|https://en.wikipedia.org/wiki/Percentile]
> ----
> Currently our implementation is inspired by Spark's PERCENTILE, which offers an additional frequency parameter compared to the SQL standard function PERCENTILE_CONT.
> Based on this function, we can easily extend support for PERCENTILE_CONT and PERCENTILE_DISC from the SQL standard with a little modification in the future.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)