Effort to add SQL / StreamSQL to Flink

Fabian Hueske Thu, 07 Jan 2016 06:06:29 -0800

Hi everybody,

Over the last few days, Timo and I have refined the design document, started
by Stephan, for adding a SQL / StreamSQL interface on top of Flink.

The document proposes an architecture that is centered around Apache
Calcite. Calcite is an Apache top-level project and includes a SQL parser,
a semantic validator for relational queries, and a rule- and cost-based
relational optimizer. Calcite is used by Apache Hive and Apache Drill
(among other projects). In a nutshell, the plan is to translate Table API
and SQL queries into Calcite's relational expression trees, optimize these
trees, and translate them into DataSet and DataStream programs. The document
breaks down the work into several tasks and subtasks.
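
To make the parse / optimize / translate flow described above more concrete,
here is a minimal toy sketch in Python. It is not Calcite's or Flink's API:
the node classes (Scan, Filter, Project), the single rewrite rule in
optimize(), and translate() are all hypothetical names invented for
illustration, and a plain list of dicts stands in for a DataSet.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Row = Dict[str, object]

# Hypothetical relational expression nodes (stand-ins for a real
# optimizer's relational expression tree).
@dataclass(frozen=True)
class Scan:
    table: str

@dataclass(frozen=True)
class Filter:
    input: object
    predicate: Callable[[Row], bool]

@dataclass(frozen=True)
class Project:
    input: object
    fields: tuple

def optimize(node):
    """Apply one toy rewrite rule: merge adjacent Filters into one."""
    if isinstance(node, Scan):
        return node
    child = optimize(node.input)
    if isinstance(node, Filter):
        if isinstance(child, Filter):
            outer, inner = node.predicate, child.predicate
            return Filter(child.input, lambda r: inner(r) and outer(r))
        return Filter(child, node.predicate)
    return Project(child, node.fields)

def translate(node, tables):
    """Turn the tree into a plain callable that produces rows."""
    if isinstance(node, Scan):
        return lambda: list(tables[node.table])
    source = translate(node.input, tables)
    if isinstance(node, Filter):
        return lambda: [r for r in source() if node.predicate(r)]
    return lambda: [{f: r[f] for f in node.fields} for r in source()]

# Made-up sample data for the illustration.
tables = {"orders": [{"id": 1, "amount": 5},
                     {"id": 2, "amount": 50},
                     {"id": 3, "amount": 500}]}

# Roughly: SELECT id FROM orders WHERE amount > 10 AND amount < 100,
# written as a hand-built tree because this sketch has no parser.
plan = Project(
    Filter(Filter(Scan("orders"), lambda r: r["amount"] > 10),
           lambda r: r["amount"] < 100),
    ("id",))

optimized = optimize(plan)          # the two Filters collapse into one
program = translate(optimized, tables)
result = program()
```

A real rule- and cost-based optimizer would choose among many such rewrites;
the sketch only shows the overall shape of the pipeline.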

Please review the design document and comment.

-->
https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit?usp=sharing

Unless there are major concerns with the design, Timo and I plan to start
moving the current Table API on top of Apache Calcite next week (Task 1 in
the document). The goal of this task is to keep the current functionality
unchanged, but with Calcite in the translation process. This is a blocking
task that we hope to complete soon. Afterwards, we can independently work
on different aspects such as extending the Table API, adding a SQL
interface (basically just a parser), integration with external data
sources, better code generation, optimization rules, streaming support for
the Table API, StreamSQL, etc.

Timo and I plan to work on a WIP branch to implement Task 1 and merge it into
the master branch once the task is completed. Of course, everybody is
welcome to contribute to this effort. Please let us know so that we can
coordinate our efforts.

Thanks,
Fabian
