[ https://issues.apache.org/jira/browse/FLINK-21949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17764658#comment-17764658 ]
Jiabao Sun commented on FLINK-21949:
------------------------------------

The pull request is ready for review now.

This implementation makes some simplifications relative to Calcite's SqlLibraryOperators.ARRAY_AGG.
{code:java}
// calcite
ARRAY_AGG([ ALL | DISTINCT ] value [ RESPECT NULLS | IGNORE NULLS ] [ ORDER BY orderItem [, orderItem ]* ] )

// flink
ARRAY_AGG([ ALL | DISTINCT ] expression)
{code}
The differences from Calcite are as follows:
# Null values are ignored.
# The ORDER BY expression within the function is not supported, because the complete row record cannot be accessed within the function implementation.
# The function returns null when there are no input rows, whereas the Calcite definition returns an empty array.

This behavior follows BigQuery and Postgres:
https://cloud.google.com/bigquery/docs/reference/standard-sql/aggregate_functions#array_agg
https://www.postgresql.org/docs/8.4/functions-aggregate.html

> Support ARRAY_AGG aggregate function
> ------------------------------------
>
>                 Key: FLINK-21949
>                 URL: https://issues.apache.org/jira/browse/FLINK-21949
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table SQL / API
>    Affects Versions: 1.12.0
>            Reporter: Jiabao Sun
>            Assignee: Jiabao Sun
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 1.19.0
>
>
> Some NoSQL databases such as MongoDB and Elasticsearch support nested data types.
> Aggregating multiple rows into ARRAY<ROW> is a common requirement.
> The CollectToArray function is similar to Collect, except that it returns
> ARRAY<ROW> instead of MULTISET<ROW>.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
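
For illustration only, the semantics described in the comment (nulls ignored, optional DISTINCT, null rather than an empty array when nothing was aggregated) can be modelled in plain Java. This is a minimal sketch and not the code from the pull request: the class and method names (ArrayAggSketch, arrayAgg) are hypothetical, it does not use any Flink API, and it assumes that an input consisting only of nulls also yields null, which the comment does not state explicitly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-alone model of the described ARRAY_AGG semantics.
class ArrayAggSketch {

    /**
     * Collects the non-null values in encounter order.
     * With distinct = true, only the first occurrence of each value is kept.
     * Returns null (not an empty list) when nothing was collected.
     */
    static <T> List<T> arrayAgg(Iterable<T> values, boolean distinct) {
        List<T> result = new ArrayList<>();
        Set<T> seen = new HashSet<>();
        for (T value : values) {
            if (value == null) {
                continue; // null values are ignored
            }
            if (distinct && !seen.add(value)) {
                continue; // duplicate under DISTINCT
            }
            result.add(value);
        }
        // Assumption: empty input and all-null input are treated alike.
        return result.isEmpty() ? null : result;
    }

    public static void main(String[] args) {
        System.out.println(arrayAgg(Arrays.asList("a", null, "b", "a"), false));
        System.out.println(arrayAgg(Arrays.asList("a", null, "b", "a"), true));
        System.out.println(arrayAgg(Collections.<String>emptyList(), false));
    }
}
```

Because no ORDER BY is available inside the function, the sketch simply keeps encounter order; in a real query the element order would depend on the order in which rows reach the aggregate.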