[jira] [Commented] (FLINK-4575) DataSet aggregate methods should support POJOs

Fabian Hueske (JIRA) Mon, 13 Nov 2017 01:52:40 -0800

    [ https://issues.apache.org/jira/browse/FLINK-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16249312#comment-16249312 ]


Fabian Hueske commented on FLINK-4575:
--------------------------------------

I'm not sure we should extend the DataSet API for such special cases. In fact, 
I'd rather remove the built-in aggregation functions on Tuples as well at some 
point in the future.
IMO, the DataSet API is a rather low-level API that should provide the tools to 
implement custom functions based on {{MapFunction}}, {{GroupReduceFunction}}, 
etc.
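
As a rough illustration of the custom-function route for POJOs, here is a minimal sketch in plain Java. It assumes a hypothetical {{WordCount}} POJO and declares a local stand-in interface with the same method shape as Flink's {{ReduceFunction}}, so that it compiles without Flink on the classpath. The {{reduceByWord}} helper is also hypothetical and only emulates a grouped reduce on an in-memory list.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PojoReduceSketch {

    // Hypothetical POJO, used only for illustration.
    public static class WordCount {
        public String word;
        public long count;

        public WordCount() {}

        public WordCount(String word, long count) {
            this.word = word;
            this.count = count;
        }
    }

    // Local stand-in with the same method shape as Flink's ReduceFunction<T>;
    // declared here so the sketch compiles without Flink on the classpath.
    public interface ReduceFunction<T> {
        T reduce(T value1, T value2) throws Exception;
    }

    // A custom "sum" over the count field, written as a reduce function.
    public static final ReduceFunction<WordCount> SUM_COUNTS =
            (a, b) -> new WordCount(a.word, a.count + b.count);

    // Hypothetical helper: emulates a reduce grouped by the "word" field
    // on an in-memory list, keeping one running value per key.
    public static Map<String, WordCount> reduceByWord(
            List<WordCount> input, ReduceFunction<WordCount> fn) {
        Map<String, WordCount> result = new LinkedHashMap<>();
        for (WordCount wc : input) {
            WordCount previous = result.get(wc.word);
            try {
                result.put(wc.word, previous == null ? wc : fn.reduce(previous, wc));
            } catch (Exception e) {
                throw new RuntimeException("reduce failed for key " + wc.word, e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<WordCount> input = Arrays.asList(
                new WordCount("flink", 1),
                new WordCount("pojo", 2),
                new WordCount("flink", 3));
        for (WordCount wc : reduceByWord(input, SUM_COUNTS).values()) {
            System.out.println(wc.word + " " + wc.count);
        }
    }
}
```

In an actual DataSet program the same lambda body would presumably be passed as an {{org.apache.flink.api.common.functions.ReduceFunction}} to {{groupBy("word").reduce(...)}}, with Flink handling the grouping instead of the helper above.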

Built-in aggregation functions are much better covered by the Table API or 
SQL. In fact, it is very easy to convert a {{DataSet}} into a {{Table}} and 
vice versa.

> DataSet aggregate methods should support POJOs
> ----------------------------------------------
>
>                 Key: FLINK-4575
>                 URL: https://issues.apache.org/jira/browse/FLINK-4575
>             Project: Flink
>          Issue Type: Improvement
>          Components: DataSet API
>            Reporter: Gabor Gevay
>            Priority: Minor
>              Labels: starter
>
> The aggregate methods of DataSets (aggregate, sum, min, max) currently only 
> support Tuples, with the fields specified by indices. With 
> https://issues.apache.org/jira/browse/FLINK-3702 resolved, adding support for 
> POJOs and field expressions would be easy: {{AggregateOperator}} would create 
> {{FieldAccessors}} instead of just storing field positions, and 
> {{AggregateOperator.AggregatingUdf}} would use these {{FieldAccessors}} 
> instead of the Tuple field access methods.
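
The accessor idea in the description above can be sketched in plain Java as follows. This is not Flink's {{FieldAccessor}} implementation: {{NamedFieldAccessor}}, the {{Order}} POJO, and the {{sum}}/{{max}} helpers are hypothetical names, and the sketch assumes a public numeric field that is looked up once by name via reflection and then read per record.

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.List;

class FieldAccessorSketch {

    // Hypothetical POJO, used only for illustration.
    public static class Order {
        public String customer;
        public long amount;

        public Order() {}

        public Order(String customer, long amount) {
            this.customer = customer;
            this.amount = amount;
        }
    }

    // Hypothetical accessor: resolves a public field by name once,
    // then reads it from each record. Assumes the field is numeric.
    public static final class NamedFieldAccessor<T> {
        private final Field field;

        public NamedFieldAccessor(Class<T> type, String fieldName) {
            try {
                this.field = type.getField(fieldName);
            } catch (NoSuchFieldException e) {
                throw new IllegalArgumentException(
                        "No public field '" + fieldName + "' on " + type.getName(), e);
            }
        }

        public long getLong(T record) {
            try {
                return ((Number) field.get(record)).longValue();
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    // Aggregations go through the accessor instead of a Tuple field position.
    public static <T> long sum(List<T> records, NamedFieldAccessor<T> accessor) {
        long total = 0L;
        for (T record : records) {
            total += accessor.getLong(record);
        }
        return total;
    }

    public static <T> long max(List<T> records, NamedFieldAccessor<T> accessor) {
        if (records.isEmpty()) {
            throw new IllegalArgumentException("max of an empty input is undefined");
        }
        long best = accessor.getLong(records.get(0));
        for (T record : records) {
            best = Math.max(best, accessor.getLong(record));
        }
        return best;
    }

    public static void main(String[] args) {
        List<Order> orders = Arrays.asList(
                new Order("alice", 5), new Order("bob", 7), new Order("alice", 2));
        NamedFieldAccessor<Order> amount = new NamedFieldAccessor<>(Order.class, "amount");
        System.out.println("sum=" + sum(orders, amount) + " max=" + max(orders, amount));
    }
}
```

The point of the design is that the aggregation code only sees the accessor, so the same {{sum}} and {{max}} logic could serve Tuple positions and POJO field names alike.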



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
