Hi everyone,
As Stephan already announced on the mailing list [1], the Flink
community will receive a big code contribution from Alibaba. The
flink-table module is one of the biggest parts that will receive many
new features and major architectural improvements. Instead of waiting
until the next major version of Flink or introducing big API-breaking
changes, we would like to gradually build up the Blink-based planner and
runtime while keeping the Table & SQL API mostly stable. Users will be
able to play around with the current merge status of the new planner or
fall back to the old planner until the new one is stable.
We have prepared a design document that discusses a restructuring of the
flink-table module and suggests a rough implementation plan:
https://docs.google.com/document/d/1Tfl2dBqBV3qSBy7oV3qLYvRRDbUOasvA1lhvYWWljQw/edit?usp=sharing
I will briefly summarize the steps we would like to take:
- Split the flink-table module similar to the proposal in FLIP-28 [3],
which is now outdated. This prepares the separation of the API from the core
(targeted for Flink 1.8).
- Perform minor API changes to separate API from actual implementation
(targeted for Flink 1.8).
- Merge an MVP Blink SQL planner, given that the necessary Flink core/runtime
changes have been completed.
The merging will happen in stages (e.g. basic planner framework, then
operator by operator). The exact merging plan still needs to be determined.
- Rework the type system in order to unblock work on unified table
environments, UDFs, sources/sinks, and catalog.
- Enable full end-to-end batch and stream execution features.
Our mid-term goal:
Run full TPC-DS on a unified batch/streaming runtime. Initially, we will
only support ingesting data coming from the DataStream API. Once we
have reworked the source/sink interfaces, we will target full end-to-end
TPC-DS query execution with table connectors.
A rough task dependency graph is illustrated in the design document. A
more detailed task dependency structure will be added to JIRA after we
have agreed on this FLIP.
Looking forward to any feedback.
Thanks,
Timo
[1]
https://lists.apache.org/thread.html/2f7330e85d702a53b4a2b361149930b50f2e89d8e8a572f8ee2a0e6d@%3Cdev.flink.apache.org%3E
[2]
https://lists.apache.org/thread.html/6066abd0f09fc1c41190afad67770ede8efd0bebc36f00938eecc118@%3Cdev.flink.apache.org%3E
[3]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-28%3A+Long-term+goal+of+making+flink-table+Scala-free