Hi everybody,

over the last few days, Timo and I refined the design document for adding a SQL / StreamSQL interface on top of Flink that was started by Stephan.
The document proposes an architecture that is centered around Apache Calcite. Calcite is an Apache top-level project and includes a SQL parser, a semantic validator for relational queries, and a rule- and cost-based relational optimizer. Calcite is used by Apache Hive and Apache Drill (among other projects). In a nutshell, the plan is to translate Table API and SQL queries into Calcite's relational expression trees, optimize these trees, and translate them into DataSet and DataStream programs.

The document breaks down the work into several tasks and subtasks. Please review the design document and comment:

--> https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit?usp=sharing

Unless there are major concerns with the design, Timo and I want to start next week to move the current Table API on top of Apache Calcite (Task 1 in the document). The goal of this task is to keep the current functionality, but with Calcite in the translation process. This is a blocking task that we hope to complete soon. Afterwards, we can work independently on different aspects such as extending the Table API, adding a SQL interface (basically just a parser), integration with external data sources, better code generation, optimization rules, streaming support for the Table API, StreamSQL, etc.

Timo and I plan to work on a WIP branch to implement Task 1 and merge it to the master branch once the task is completed. Of course, everybody is welcome to contribute to this effort. Please let us know so that we can coordinate our efforts.

Thanks,
Fabian