[ https://issues.apache.org/jira/browse/FLINK-36123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dylan He updated FLINK-36123:
-----------------------------
    Description:
Add PERCENTILE function.
----
Return a percentile value based on a continuous distribution of the input column. If no input row lies exactly at the desired percentile, the result is calculated using linear interpolation of the two nearest input values. NULL values are ignored in the calculation.

Example:
{code:sql}
> SELECT PERCENTILE(col, 0.3) FROM VALUES (0), (10), (10) AS tab(col);
 6.0
> SELECT PERCENTILE(col, 0.3, freq) FROM VALUES (0, 1), (10, 2) AS tab(col, freq);
 6.0
> SELECT PERCENTILE(col, array(0.25, 0.75)) FROM VALUES (0), (10) AS tab(col);
 [2.5,7.5]
{code}

Syntax:
{code:sql}
PERCENTILE(expr, percentage[, frequency])
{code}

Arguments:
* {{expr}}: A NUMERIC expression.
* {{percentage}}: A NUMERIC expression between 0 and 1 or an ARRAY of NUMERIC expressions, each between 0 and 1.
* {{frequency}}: An optional integral number greater than 0.

Returns:
DOUBLE if percentage is numeric, or an ARRAY of DOUBLE if percentage is an ARRAY.

Frequency describes the number of times expr must be counted. The default value is 1.

See also:
* [Hive|https://cwiki.apache.org/confluence/display/hive/languagemanual+udf#LanguageManualUDF-Built-inAggregateFunctions(UDAF)]
* [Spark|https://spark.apache.org/docs/3.5.1/sql-ref-functions-builtin.html#mathematical-functions]
* [Databricks|https://docs.databricks.com/en/sql/language-manual/functions/percentile.html]
* [PostgreSQL|https://www.postgresql.org/docs/16/functions-aggregate.html] percentile_cont
* [Snowflake|https://docs.snowflake.com/en/sql-reference/functions/percentile_cont]
* [Oracle|https://docs.oracle.com/en/database/oracle/oracle-database/23/sqlrf/PERCENTILE_CONT.html]
* [Wiki|https://en.wikipedia.org/wiki/Percentile]
----
Currently our implementation is inspired by Spark's PERCENTILE, which offers an additional frequency parameter compared to the SQL standard function PERCENTILE_CONT.
Based on this function, we can easily extend support for the SQL standard functions PERCENTILE_CONT and PERCENTILE_DISC with a little modification in the future.

  was:
Add PERCENTILE function.
----
Return a percentile value based on a continuous distribution of the input column. If no input row lies exactly at the desired percentile, the result is calculated using linear interpolation of the two nearest input values. NULL values are ignored in the calculation.

Example:
{code:sql}
> SELECT PERCENTILE(col, 0.3) FROM VALUES (0), (10), (10) AS tab(col);
 6.0
> SELECT PERCENTILE(col, array(0.25, 0.75)) FROM VALUES (0), (10) AS tab(col);
 [2.5,7.5]
{code}

Syntax:
{code:sql}
PERCENTILE(expr, percentage)
{code}

Arguments:
* {{expr}}: A NUMERIC expression.
* {{percentage}}: A NUMERIC expression between 0 and 1 or an ARRAY of NUMERIC expressions, each between 0 and 1.

Returns:
DOUBLE if percentage is numeric, or an ARRAY of DOUBLE if percentage is an ARRAY.

See also:
* [Hive|https://cwiki.apache.org/confluence/display/hive/languagemanual+udf#LanguageManualUDF-Built-inAggregateFunctions(UDAF)]
* [Spark|https://spark.apache.org/docs/3.5.1/sql-ref-functions-builtin.html#mathematical-functions]
* [Databricks|https://docs.databricks.com/en/sql/language-manual/functions/percentile.html]
* [PostgreSQL|https://www.postgresql.org/docs/16/functions-aggregate.html] percentile_cont
* [Snowflake|https://docs.snowflake.com/en/sql-reference/functions/percentile_cont]
* [Oracle|https://docs.oracle.com/en/database/oracle/oracle-database/23/sqlrf/PERCENTILE_CONT.html]
* [Wiki|https://en.wikipedia.org/wiki/Percentile]


> Add PERCENTILE function
> -----------------------
>
>                 Key: FLINK-36123
>                 URL: https://issues.apache.org/jira/browse/FLINK-36123
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table SQL / API
>            Reporter: Dylan He
>            Priority: Major
>              Labels: pull-request-available
>
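
The interpolation rule in the description can be illustrated with a small sketch. It is written in Python purely for illustration and is not the Flink implementation. The function name {{percentile}}, its parameters, and the choice to raise an error on a non-positive frequency are assumptions made for this example; the position formula {{percentage * (total - 1)}} follows the Spark-style behaviour that the description says the proposal is inspired by.

```python
from bisect import bisect_right
from itertools import accumulate
from math import ceil, floor


def percentile(values, percentage, frequencies=None):
    """Illustrative PERCENTILE(expr, percentage[, frequency]) for one percentage.

    Hypothetical helper, not Flink code: each value counts `frequency` times,
    and the result is linearly interpolated between the two nearest values.
    """
    if not 0.0 <= percentage <= 1.0:
        raise ValueError("percentage must be between 0 and 1")
    if frequencies is None:
        frequencies = [1] * len(values)  # the default frequency is 1

    # NULL (None) inputs are ignored in the calculation.
    pairs = sorted((v, f) for v, f in zip(values, frequencies) if v is not None)
    if not pairs:
        return None
    if any(f <= 0 for _, f in pairs):
        # Assumption: reject frequencies that are not greater than 0.
        raise ValueError("frequency must be greater than 0")

    keys = [v for v, _ in pairs]
    # cumulative[i] = number of (expanded) rows with value <= keys[i]
    cumulative = list(accumulate(f for _, f in pairs))
    total = cumulative[-1]

    # Zero-based fractional position in the expanded, sorted input.
    position = percentage * (total - 1)
    lower, upper = floor(position), ceil(position)

    def value_at(index):
        # Value found at a zero-based index of the expanded, sorted input.
        return keys[bisect_right(cumulative, index)]

    low_value = value_at(lower)
    if lower == upper:
        return float(low_value)
    high_value = value_at(upper)
    return float(low_value + (position - lower) * (high_value - low_value))


print(percentile([0, 10, 10], 0.3))
print(percentile([0, 10], 0.3, frequencies=[1, 2]))
print([percentile([0, 10], p) for p in (0.25, 0.75)])
```

The three calls mirror the three SQL examples in the description: the frequency form treats {{(0, 1), (10, 2)}} as the expanded input {{0, 10, 10}}, and the ARRAY form is modelled here by evaluating each percentage separately.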
-- This message was sent by Atlassian Jira (v8.20.10#820010)