[ 
https://issues.apache.org/jira/browse/FLINK-36123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dylan He updated FLINK-36123:
-----------------------------
    Description: 
Add PERCENTILE function.
----
Return a percentile value based on a continuous distribution of the input 
column. If no input row lies exactly at the desired percentile, the result is 
calculated using linear interpolation of the two nearest input values. NULL 
values are ignored in the calculation.

Example:
{code:sql}
> SELECT PERCENTILE(col, 0.3) FROM VALUES (0), (10), (10) AS tab(col);
 6.0
> SELECT PERCENTILE(col, 0.3, freq) FROM VALUES (0, 1), (10, 2) AS tab(col, freq);
 6.0
> SELECT PERCENTILE(col, array(0.25, 0.75)) FROM VALUES (0), (10) AS tab(col);
 [2.5,7.5]
{code}

Syntax:
{code:sql}
PERCENTILE(expr, percentage[, frequency])
{code}

Arguments:
 * {{expr}}: A NUMERIC expression.
 * {{percentage}}: A NUMERIC expression between 0 and 1 or an ARRAY of NUMERIC 
expressions, each between 0 and 1.
 * {{frequency}}: An optional integral number greater than 0.

Returns:
DOUBLE if percentage is numeric, or an ARRAY of DOUBLE if percentage is an 
ARRAY.

{{frequency}} specifies how many times each {{expr}} value is counted. The 
default value is 1.
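
The interpolation and frequency rules above can be illustrated with a small Python sketch. It is not the Flink implementation: the function name {{percentile}}, the position formula {{percentage * (N - 1)}} over the frequency-expanded sorted input, and the rejection of non-positive frequencies are assumptions chosen to reproduce the examples in this description.

```python
import math
from bisect import bisect_right
from itertools import accumulate


def percentile(values, percentage, frequencies=None):
    """Continuous percentile with linear interpolation (illustrative sketch).

    values      -- numeric inputs; None plays the role of SQL NULL and is skipped
    percentage  -- a number between 0 and 1
    frequencies -- optional positive integers, one per value (default 1 each)
    """
    if not 0.0 <= percentage <= 1.0:
        raise ValueError("percentage must be between 0 and 1")
    if frequencies is None:
        frequencies = [1] * len(values)

    # Ignore NULL inputs, then sort by value.
    pairs = sorted((v, f) for v, f in zip(values, frequencies) if v is not None)
    if not pairs:
        return None
    if any(f <= 0 for _, f in pairs):
        # Assumption: the description only says frequency is greater than 0.
        raise ValueError("frequency must be greater than 0")

    # Running totals of the frequencies: cumulative[i] is the number of
    # elements up to and including pairs[i] in the expanded sequence.
    cumulative = list(accumulate(f for _, f in pairs))
    total = cumulative[-1]

    # Assumed position formula: 0-based rank in the expanded sorted sequence.
    position = percentage * (total - 1)
    lower = math.floor(position)
    upper = math.ceil(position)

    def value_at(index):
        # Value of the index-th element without materializing the expansion.
        return pairs[bisect_right(cumulative, index)][0]

    low, high = value_at(lower), value_at(upper)
    return float(low + (position - lower) * (high - low))


# The three examples from this description:
print(percentile([0, 10, 10], 0.3))                      # 6.0
print(percentile([0, 10], 0.3, [1, 2]))                  # 6.0
print([percentile([0, 10], p) for p in (0.25, 0.75)])    # [2.5, 7.5]
```

In the second call the frequencies 1 and 2 make the input equivalent to (0), (10), (10), which is why it yields the same result as the first.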

See also:
 * [Hive|https://cwiki.apache.org/confluence/display/hive/languagemanual+udf#LanguageManualUDF-Built-inAggregateFunctions(UDAF)]
 * [Spark|https://spark.apache.org/docs/3.5.1/sql-ref-functions-builtin.html#mathematical-functions]
 * [Databricks|https://docs.databricks.com/en/sql/language-manual/functions/percentile.html]
 * [PostgreSQL|https://www.postgresql.org/docs/16/functions-aggregate.html] percentile_cont
 * [Snowflake|https://docs.snowflake.com/en/sql-reference/functions/percentile_cont]
 * [Oracle|https://docs.oracle.com/en/database/oracle/oracle-database/23/sqlrf/PERCENTILE_CONT.html]
 * [Wiki|https://en.wikipedia.org/wiki/Percentile]

----

Currently our implementation is inspired by Spark's PERCENTILE, which offers 
an additional {{frequency}} parameter compared to the SQL standard function 
PERCENTILE_CONT.
Based on this function, we can easily extend support to the SQL standard 
functions PERCENTILE_CONT and PERCENTILE_DISC with minor modifications in the 
future.


  was:
Add PERCENTILE function.
----
Return a percentile value based on a continuous distribution of the input 
column. If no input row lies exactly at the desired percentile, the result is 
calculated using linear interpolation of the two nearest input values. NULL 
values are ignored in the calculation.

Example:
{code:sql}
> SELECT PERCENTILE(col, 0.3) FROM VALUES (0), (10), (10) AS tab(col);
 6.0
> SELECT PERCENTILE(col, array(0.25, 0.75)) FROM VALUES (0), (10) AS tab(col);
 [2.5,7.5]
{code}

Syntax:
{code:sql}
PERCENTILE(expr, percentage)
{code}

Arguments:
 * {{expr}}: A NUMERIC expression.
 * {{percentage}}: A NUMERIC expression between 0 and 1 or an ARRAY of NUMERIC 
expressions, each between 0 and 1.

Returns:
DOUBLE if percentage is numeric, or an ARRAY of DOUBLE if percentage is an 
ARRAY.

See also:
 * [Hive|https://cwiki.apache.org/confluence/display/hive/languagemanual+udf#LanguageManualUDF-Built-inAggregateFunctions(UDAF)]
 * [Spark|https://spark.apache.org/docs/3.5.1/sql-ref-functions-builtin.html#mathematical-functions]
 * [Databricks|https://docs.databricks.com/en/sql/language-manual/functions/percentile.html]
 * [PostgreSQL|https://www.postgresql.org/docs/16/functions-aggregate.html] percentile_cont
 * [Snowflake|https://docs.snowflake.com/en/sql-reference/functions/percentile_cont]
 * [Oracle|https://docs.oracle.com/en/database/oracle/oracle-database/23/sqlrf/PERCENTILE_CONT.html]
 * [Wiki|https://en.wikipedia.org/wiki/Percentile]


> Add PERCENTILE function
> -----------------------
>
>                 Key: FLINK-36123
>                 URL: https://issues.apache.org/jira/browse/FLINK-36123
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table SQL / API
>            Reporter: Dylan He
>            Priority: Major
>              Labels: pull-request-available
>
> Add PERCENTILE function.
> ----
> Return a percentile value based on a continuous distribution of the input 
> column. If no input row lies exactly at the desired percentile, the result is 
> calculated using linear interpolation of the two nearest input values. NULL 
> values are ignored in the calculation.
> Example:
> {code:sql}
> > SELECT PERCENTILE(col, 0.3) FROM VALUES (0), (10), (10) AS tab(col);
>  6.0
> > SELECT PERCENTILE(col, 0.3, freq) FROM VALUES (0, 1), (10, 2) AS tab(col, freq);
>  6.0
> > SELECT PERCENTILE(col, array(0.25, 0.75)) FROM VALUES (0), (10) AS tab(col);
>  [2.5,7.5]
> {code}
> Syntax:
> {code:sql}
> PERCENTILE(expr, percentage[, frequency])
> {code}
> Arguments:
>  * {{expr}}: A NUMERIC expression.
>  * {{percentage}}: A NUMERIC expression between 0 and 1 or an ARRAY of 
> NUMERIC expressions, each between 0 and 1.
>  * {{frequency}}: An optional integral number greater than 0.
> Returns:
> DOUBLE if percentage is numeric, or an ARRAY of DOUBLE if percentage is an 
> ARRAY.
> {{frequency}} specifies how many times each {{expr}} value is counted. The 
> default value is 1.
> See also:
>  * [Hive|https://cwiki.apache.org/confluence/display/hive/languagemanual+udf#LanguageManualUDF-Built-inAggregateFunctions(UDAF)]
>  * [Spark|https://spark.apache.org/docs/3.5.1/sql-ref-functions-builtin.html#mathematical-functions]
>  * [Databricks|https://docs.databricks.com/en/sql/language-manual/functions/percentile.html]
>  * [PostgreSQL|https://www.postgresql.org/docs/16/functions-aggregate.html] percentile_cont
>  * [Snowflake|https://docs.snowflake.com/en/sql-reference/functions/percentile_cont]
>  * [Oracle|https://docs.oracle.com/en/database/oracle/oracle-database/23/sqlrf/PERCENTILE_CONT.html]
>  * [Wiki|https://en.wikipedia.org/wiki/Percentile]
> ----
> Currently our implementation is inspired by Spark's PERCENTILE, which offers 
> an additional {{frequency}} parameter compared to the SQL standard function 
> PERCENTILE_CONT.
> Based on this function, we can easily extend support to the SQL standard 
> functions PERCENTILE_CONT and PERCENTILE_DISC with minor modifications in the 
> future.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
