[ https://issues.apache.org/jira/browse/FLINK-5315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572373#comment-16572373 ]
ASF GitHub Bot commented on FLINK-5315:
---------------------------------------

walterddr opened a new pull request #6521: [FLINK-5315][table] Adding support for distinct operation for table API on DataStream
URL: https://github.com/apache/flink/pull/6521

## What is the purpose of the change

* Adds `distinct` aggregation support to the Table API. Example usages are:
  - For built-in expressions: `'a.count.distinct`
  - For user-defined aggregate functions: `udaggFunc.distinct('a, 'b)`

## Brief change log

  - *Added a `distinctAgg` operator to expressions as an aggregation*
  - *Created aggregation resolution rules in `operators` so that the distinct aggregation modifier is accepted before the actual aggregation is reached*
  - *Modified the UDAGG function interface to add a `distinct` modifier API*

## Verifying this change

This change added tests and can be verified as follows:

  - *Added integration tests for UDAGG function calls and for expression aggregations*
  - *Added unit tests for both cases (prefix modifier for UDAGG, suffix modifier for expressions), as well as for unsupported use cases (suffix modifier for UDAGG)*
  - *Backward compatibility for other aggregations is covered by existing unit tests*

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no, but added a `private[flink]` modifier to the `AggregationFunction` API, which might have been exposed to the Java API)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
  - The S3 file system connector: (no)

## Documentation

  - Does this pull request introduce a new feature? (yes)
  - If yes, how is the feature documented?
(not documented yet)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Support distinct aggregations in table api
> ------------------------------------------
>
>                 Key: FLINK-5315
>                 URL: https://issues.apache.org/jira/browse/FLINK-5315
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API & SQL
>            Reporter: Kurt Young
>            Assignee: Rong Rong
>            Priority: Major
>              Labels: pull-request-available
>
> Support distinct aggregations in Table API in the following format:
> For Expressions:
> {code:scala}
> 'a.count.distinct // Expressions distinct modifier
> {code}
> For User-defined Function:
> {code:scala}
> singleArgUdaggFunc.distinct('a) // FunctionCall distinct modifier
> multiArgUdaggFunc.distinct('a, 'b) // FunctionCall distinct modifier
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
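
For readers unfamiliar with the `distinct` modifier requested in the issue above, the following is a minimal plain-Scala sketch of what a distinct aggregation computes. It does not use Flink or the Table API from this pull request; `DistinctAggSketch`, `Row`, `rows`, and all three helper functions are hypothetical names introduced only for illustration. The multi-argument case assumes that deduplication happens on the combination of arguments (as with `DISTINCT` in SQL), which the message itself does not spell out.

```scala
// Plain-Scala illustration of distinct aggregation semantics.
// No Flink dependency; every name below is hypothetical.
object DistinctAggSketch {
  final case class Row(key: String, a: Int, b: Int)

  val rows: Seq[Row] = Seq(
    Row("x", 1, 10),
    Row("x", 1, 10), // repeats the (a, b) pair of the previous row
    Row("x", 1, 20),
    Row("x", 2, 20),
    Row("y", 5, 50)
  )

  // Plain count per key: every row contributes.
  def countA(rs: Seq[Row]): Map[String, Int] =
    rs.groupBy(_.key).map { case (k, group) => k -> group.size }

  // Distinct count per key: duplicate values of `a` are dropped
  // before counting (the idea behind 'a.count.distinct).
  def countDistinctA(rs: Seq[Row]): Map[String, Int] =
    rs.groupBy(_.key).map { case (k, group) =>
      k -> group.map(_.a).distinct.size
    }

  // Two-argument aggregate (sum of a * b) applied after dropping
  // duplicate (a, b) pairs. ASSUMPTION: distinct on several arguments
  // deduplicates on the tuple of arguments, as in SQL.
  def sumProductDistinct(rs: Seq[Row]): Map[String, Int] =
    rs.groupBy(_.key).map { case (k, group) =>
      k -> group.map(r => (r.a, r.b)).distinct.map { case (a, b) => a * b }.sum
    }

  def main(args: Array[String]): Unit = {
    println(countA(rows))
    println(countDistinctA(rows))
    println(sumProductDistinct(rows))
  }
}
```

For key `x`, the plain count is 4 while the distinct count of `a` is 2, and the two-argument aggregate sums 10 + 20 + 40 = 70 because the repeated `(1, 10)` pair is counted once.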