[jira] [Updated] (FLINK-10972) Enhancements to Flink Table API

sunjincheng (JIRA) Mon, 22 Apr 2019 23:21:12 -0700


     [ https://issues.apache.org/jira/browse/FLINK-10972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


sunjincheng updated FLINK-10972:
--------------------------------
    Description: 
With continuous effort from the community, the Flink system has steadily 
improved and has attracted more and more users. Flink SQL is a canonical, 
widely used relational query language. However, there are still scenarios 
where Flink SQL fails to meet user needs in terms of functionality and ease 
of use, such as:
 * In terms of functionality

Iteration, user-defined windows, user-defined joins, user-defined GroupReduce, 
etc. cannot be expressed with SQL.
 * In terms of ease of use

 * Map - e.g. “dataStream.map(mapFun)”. Although “table.select(udf1(), udf2(), 
udf3()....)” can accomplish the same thing, reproducing a map() function that 
returns 100 columns requires defining or calling 100 UDFs when using SQL, 
which is quite involved.

 * FlatMap - e.g. “dataStream.flatMap(flatMapFun)”. Similarly, it can be 
implemented with “table.join(udtf).select()”. However, the DataStream API is 
clearly easier to use than SQL here.
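
The ease-of-use gap above can be illustrated with a rough plain-Java sketch. It does not use any Flink classes; every name in it (RowMapSketch, selectStyle, mapStyle, flatMapStyle and the sample functions) is hypothetical and only models the difference between one function per output column and one function per row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Plain-Java illustration only: none of these types are Flink classes,
// and all names here are hypothetical.
final class RowMapSketch {

    // Two single-column functions, standing in for udf1() and udf2().
    static final Function<String, Object> LENGTH = s -> s.length();
    static final Function<String, Object> UPPER = s -> s.toUpperCase();

    // One row-level function that produces both columns at once.
    static final Function<String, Object[]> LENGTH_AND_UPPER =
            s -> new Object[] {s.length(), s.toUpperCase()};

    // One input row becomes one output row per non-empty word.
    static final Function<String, List<Object[]>> SPLIT_WORDS = s -> {
        List<Object[]> rows = new ArrayList<>();
        for (String word : s.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                rows.add(new Object[] {word, word.length()});
            }
        }
        return rows;
    };

    // Column-at-a-time style: one function per output column,
    // analogous to table.select(udf1(), udf2(), ...).
    static Object[] selectStyle(String input, List<Function<String, Object>> columnFns) {
        Object[] row = new Object[columnFns.size()];
        for (int i = 0; i < row.length; i++) {
            row[i] = columnFns.get(i).apply(input);
        }
        return row;
    }

    // Row-at-a-time style: a single function returns the whole output row,
    // analogous to dataStream.map(mapFun).
    static Object[] mapStyle(String input, Function<String, Object[]> rowFn) {
        return rowFn.apply(input);
    }

    // flatMap style: a single function returns zero or more output rows.
    static List<Object[]> flatMapStyle(String input, Function<String, List<Object[]>> rowsFn) {
        return rowsFn.apply(input);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(selectStyle("flink", Arrays.asList(LENGTH, UPPER))));
        System.out.println(Arrays.toString(mapStyle("flink", LENGTH_AND_UPPER)));
        for (Object[] row : flatMapStyle("table api", SPLIT_WORDS)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

With two columns both styles look similar; the column-at-a-time style needs one function per column, so the number of functions grows with the number of output columns, while the row-level style stays at one.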

For these two reasons, we will enhance the Table API in stages in this group 
of JIRA issues.

-----------------------

In the first stage we seek to support the following (details will be 
described in the sub-issues):
 * Table.map()
 * Table.flatMap()
 * GroupedTable.aggregate()
 * GroupedTable.flatAggregate()
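
As a rough plain-Java sketch of the two grouped operations, assuming aggregate() yields exactly one (possibly multi-column) row per group while flatAggregate() may yield several rows per group: the class and method names below are hypothetical and no Flink classes are used; the sub-issues and the FLIP define the real semantics.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; names are hypothetical, not Flink API.
final class GroupedAggSketch {

    // Groups values by key, keeping keys in first-seen order.
    static Map<String, List<Integer>> groupByKey(String[] keys, int[] values) {
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (int i = 0; i < keys.length; i++) {
            groups.computeIfAbsent(keys[i], k -> new ArrayList<>()).add(values[i]);
        }
        return groups;
    }

    // aggregate style (assumed): one output row per group, here {key, min, max}.
    static List<Object[]> aggregateMinMax(Map<String, List<Integer>> groups) {
        List<Object[]> out = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> g : groups.entrySet()) {
            out.add(new Object[] {
                g.getKey(), Collections.min(g.getValue()), Collections.max(g.getValue())});
        }
        return out;
    }

    // flatAggregate style (assumed): up to two output rows per group,
    // here the two largest values as {key, value}.
    static List<Object[]> flatAggregateTop2(Map<String, List<Integer>> groups) {
        List<Object[]> out = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> g : groups.entrySet()) {
            List<Integer> sorted = new ArrayList<>(g.getValue());
            sorted.sort(Collections.reverseOrder());
            for (int i = 0; i < Math.min(2, sorted.size()); i++) {
                out.add(new Object[] {g.getKey(), sorted.get(i)});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> groups =
                groupByKey(new String[] {"a", "b", "a", "a"}, new int[] {3, 7, 9, 1});
        for (Object[] row : aggregateMinMax(groups)) {
            System.out.println("aggregate:     " + Arrays.toString(row));
        }
        for (Object[] row : flatAggregateTop2(groups)) {
            System.out.println("flatAggregate: " + Arrays.toString(row));
        }
    }
}
```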

The FLIP can be found here: 
[FLIP-29|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=97552739]

 

The second part is about column operations:

1) Table (schema) operators
 * Add columns
 * Replace columns
 * Drop columns
 * Rename columns

2) Fine-grained column/row operations
 * Column selection
 * Row packing and flattening
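
The four schema operators listed under 1) can be sketched in plain Java on a single row modeled as an ordered map from column name to value. This is an illustration under assumptions only: the class and method names are hypothetical, no Flink classes are involved, and the error handling (adding an existing column or touching a missing one fails) is a choice made for the sketch, not something the proposal states.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java illustration only; names and error handling are hypothetical.
final class ColumnOpsSketch {

    private static void requireColumn(Map<String, Object> row, String name) {
        if (!row.containsKey(name)) {
            throw new IllegalArgumentException("no such column: " + name);
        }
    }

    // Add: appends a new column at the end; the name must be unused.
    static Map<String, Object> addColumn(Map<String, Object> row, String name, Object value) {
        if (row.containsKey(name)) {
            throw new IllegalArgumentException("column already exists: " + name);
        }
        Map<String, Object> out = new LinkedHashMap<>(row);
        out.put(name, value);
        return out;
    }

    // Replace: overwrites the value of an existing column, keeping its position.
    static Map<String, Object> replaceColumn(Map<String, Object> row, String name, Object value) {
        requireColumn(row, name);
        Map<String, Object> out = new LinkedHashMap<>(row);
        out.put(name, value);
        return out;
    }

    // Drop: removes an existing column.
    static Map<String, Object> dropColumn(Map<String, Object> row, String name) {
        requireColumn(row, name);
        Map<String, Object> out = new LinkedHashMap<>(row);
        out.remove(name);
        return out;
    }

    // Rename: changes a column's name, keeping its position and value.
    static Map<String, Object> renameColumn(Map<String, Object> row, String from, String to) {
        requireColumn(row, from);
        if (row.containsKey(to)) {
            throw new IllegalArgumentException("column already exists: " + to);
        }
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : row.entrySet()) {
            out.put(e.getKey().equals(from) ? to : e.getKey(), e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("a", 1);
        row.put("b", 2);
        row = addColumn(row, "c", 3);
        row = replaceColumn(row, "b", 20);
        row = dropColumn(row, "a");
        row = renameColumn(row, "c", "d");
        System.out.println(row);
    }
}
```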

See [google doc|https://docs.google.com/document/d/1tryl6swt1K1pw7yvv5pdvFXSxfrBZ3_OkOObymis2ck/edit]

 

 

  was:
With continuous effort from the community, the Flink system has steadily 
improved and has attracted more and more users. Flink SQL is a canonical, 
widely used relational query language. However, there are still scenarios 
where Flink SQL fails to meet user needs in terms of functionality and ease 
of use, such as:
 * In terms of functionality

Iteration, user-defined windows, user-defined joins, user-defined GroupReduce, 
etc. cannot be expressed with SQL.
 * In terms of ease of use

 * Map - e.g. “dataStream.map(mapFun)”. Although “table.select(udf1(), udf2(), 
udf3()....)” can accomplish the same thing, reproducing a map() function that 
returns 100 columns requires defining or calling 100 UDFs when using SQL, 
which is quite involved.

 * FlatMap - e.g. “dataStream.flatMap(flatMapFun)”. Similarly, it can be 
implemented with “table.join(udtf).select()”. However, the DataStream API is 
clearly easier to use than SQL here.

For these two reasons, we will enhance the Table API in stages in this group 
of JIRA issues.

-----------------------

In the first stage we seek to support the following (details will be 
described in the sub-issues):
 * Table.map()
 * Table.flatMap()
 * GroupedTable.aggregate()
 * GroupedTable.flatAggregate()

The FLIP can be found here: 
[FLIP-29|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=97552739]

 

The second part is about column operations:

1) Table (schema) operators
 * Add columns
 * Replace columns
 * Drop columns
 * Rename columns

2) Fine-grained column/row operations
 * Column selection
 * Row packing and flattening

See [google doc|https://docs.google.com/document/d/1tryl6swt1K1pw7yvv5pdvFXSxfrBZ3_OkOObymis2ck/edit]

--------------------

Unify BatchTableEnvironment and StreamTableEnvironment

From:
{code:java}
ExecutionEnvironment env = ...
BatchTableEnvironment tEnv = 
 TableEnvironment.getTableEnvironment(env);
{code}
To:
{code:java}
ExecutionEnvironment env = ...
TableEnvironment tEnv = TableEnvironment.getTableEnvironment(env);
{code}
See [google doc|https://docs.google.com/document/d/1t-AUGuaChADddyJi6e0WLsTDEnf9ZkupvvBiQ4yTTEI/edit]

 

 


> Enhancements to Flink Table API
> -------------------------------
>
>                 Key: FLINK-10972
>                 URL: https://issues.apache.org/jira/browse/FLINK-10972
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table SQL / API
>            Reporter: sunjincheng
>            Assignee: sunjincheng
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
