[ 
https://issues.apache.org/jira/browse/FLINK-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235335#comment-15235335
 ] 

Fabian Hueske commented on FLINK-3723:
--------------------------------------

Hi [~yijieshen], welcome to the Flink community. Great that you are interested 
in contributing :-)
The Table API is currently under heavy development and we can certainly use 
some help here.

Thanks for the TPC-H Q1 example. How would you add the {{l_returnflag}} or 
{{l_linestatus}} fields to the output? Or do you assume that these will be 
implicitly added because they are grouping fields?

I am not sure about the benefits of the proposed {{agg}} method compared to the 
existing {{select}} method.
{{select}} already checks for columns that are neither grouped nor aggregated, 
so it is not possible to have nondeterministic fields in the result.
In addition, {{select}} lets you explicitly add (or leave out) grouping fields 
or apply expressions directly to grouping or aggregated fields. Of course, this 
would also be possible by using {{agg}} followed by {{select}}. A separate 
{{agg}} would make {{select}} more specific and maybe easier to use. On the 
other hand, the current implementation of {{select}} is closer to the original 
SQL notation.
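
To make the check concrete, here is a minimal sketch in plain Scala. It is a toy 
model under my own assumptions and not the actual Table API code: {{Expr}}, 
{{Field}}, {{Agg}}, {{Plus}} and {{GroupedSelect.validate}} are names made up 
for illustration. It only shows the rule that, on a grouped table, a field may 
appear outside an aggregate only if it is a grouping key.

```scala
// Toy model, NOT Flink code: every name below is hypothetical.
sealed trait Expr
case class Field(name: String) extends Expr            // column reference, e.g. 'deptno
case class Agg(fn: String, arg: Expr) extends Expr     // aggregate, e.g. 'age.max
case class Plus(left: Expr, right: Expr) extends Expr  // a scalar expression

object GroupedSelect {
  // Collect the field names that are referenced outside of any aggregate.
  def bareFields(e: Expr): Set[String] = e match {
    case Field(n)   => Set(n)
    case Agg(_, _)  => Set.empty                        // fields inside an aggregate are fine
    case Plus(l, r) => bareFields(l) ++ bareFields(r)
  }

  // Right(exprs) if every bare field is a grouping key, otherwise Left(error message).
  def validate(groupKeys: Set[String], exprs: Seq[Expr]): Either[String, Seq[Expr]] = {
    val offending = exprs.flatMap(bareFields).filterNot(groupKeys.contains).distinct
    if (offending.isEmpty) Right(exprs)
    else Left("Not grouped or aggregated: " + offending.mkString(", "))
  }
}
```

With grouping key {{deptno}}, validating {{deptno, max(age)}} succeeds, while 
{{deptno, name, max(age)}} is rejected because {{name}} is neither grouped nor 
aggregated, which mirrors the SQL errors quoted in the issue description.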

> Aggregate Functions and scalar expressions shouldn't be mixed in select
> -----------------------------------------------------------------------
>
>                 Key: FLINK-3723
>                 URL: https://issues.apache.org/jira/browse/FLINK-3723
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table API
>    Affects Versions: 1.0.1
>            Reporter: Yijie Shen
>
> When we type {code}select deptno, name, max(age) from dept group by 
> deptno;{code} in calcite or Oracle, it will complain {code}Expression 'NAME' 
> is not being grouped{code} or {code}Column 'dept.name' is invalid in the 
> select list because it is not contained in either an aggregate function or 
> the GROUP BY clause.{code} because of the nondeterministic result.
> Therefore, I suggest separating the current functionality of `select` into 
> two APIs: a new `select` that only handles scalar expressions, and an `agg` 
> that accepts aggregates.
> {code}
> def select(exprs: Expression*)
> def agg(aggs: Aggregation*)
> ....
> tbl.groupBy('deptno)
>    .agg('age.max, 'age.min)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
