Hi everybody,

I'd like to start a discussion about blocking issues and outstanding features of the Table API and SQL for the 1.1.0 release. As you probably know, the Table API was completely reworked and ported to Apache Calcite. Moreover, we added initial support for SQL on batch and streaming tables.
We have come quite far, but there are still a couple of issues that need to be resolved before we can release a new version of Flink. I would like to start collecting and prioritizing issues so that we can work towards a feature set that we would like to include in the next release.

In order to prepare this list, I tried to execute the TPC-H query set using the currently supported SQL feature set. Only one (Q18) out of the 22 queries could be executed. The others failed due to unsupported features or bugs.

In the following, I list issues, ordered by priority, that I think need to be resolved for the release.

- FLINK-3728: Detect unsupported operators and improve error messages. While we can effectively prevent unsupported operations in the Table API, this is not easily possible with SQL queries. At the moment, unsupported operations are either not detected and translated into invalid plans, or they throw hard-to-understand exceptions.
- FLINK-3859: Add support for DECIMAL. Without this feature, it is not possible to use floating point literals in SQL queries.
- FLINK-3152 / FLINK-3580: Add support for date types and date functions.
- FLINK-3586: Prevent AVG(LONG) overflow by using BigInteger as intermediate data type.
- FLINK-2971: Add support for outer joins (a PR for this issue exists: #1981).
- FLINK-3936: Add MIN / MAX aggregation functions for BOOLEAN types.
- FLINK-3916: Add support for generic types which are handled by
- FLINK-3723: This is a proposal to split the Table API select() method into select() for projection and aggregate() for aggregations. At the moment, both are handled by select() (as in SQL) and internally separated by the Table API. We should decide for Flink 1.1.0 whether to implement the proposal or not.
- FLINK-3871 / FLINK-3873: Add TableSource and TableSink for Avro encoded Kafka sources
- FLINK-3872 / FLINK-3874: Add TableSource and TableSink for JSON encoded Kafka sources
- More TableSources / TableSinks

Please review this list, add issues that you think should go in as well, and discuss the priorities of the features. Also, if you would like to get involved with improving the Table API / SQL, drop a mail to the mailing list or a comment on a JIRA issue.

I think it would be good if somebody coordinated these efforts. I would be happy to do it. However, I will leave in one month for a two-month parental leave, and I don't know how much I can contribute in that time. So if somebody would like to step up and help with coordinating, please let me and the others know.

Cheers,
Fabian