Thanks Stephan for the great proposal. This would not only be beneficial for new users but also for contributors to keep track of all upcoming features.

I think better window operator support could also be grouped into its own category, as it affects both the future DataStream API and batch/stream unification. Can we also include:
- OVER aggregate for the DataStream API, separately, as @jincheng suggested.
- Improving the sliding window operator [1]

One additional suggestion: can we also include the more extensible security module [2,3] that @shuyi and I are currently working on? This will significantly improve the usability of Flink in corporate environments where proprietary or 3rd-party security integration is needed.

Thanks,
Rong

[1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Improvement-to-Flink-Window-Operator-with-Slicing-td25750.html
[2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-security-improvements-td21068.html
[3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Kerberos-Improvement-td25983.html

On Wed, Feb 13, 2019 at 3:39 AM jincheng sun <sunjincheng...@gmail.com> wrote:

> Very excited and thank you for launching such a great discussion, Stephan!
>
> Just one small suggestion: in the Batch Streaming Unification
> section, do we need to add an item:
>
> - Same window operators on bounded/unbounded Table API and DataStream API
> (currently OVER window only exists in SQL/Table API; the DataStream API does
> not yet support it)
>
> Best,
> Jincheng
>
> Stephan Ewen <se...@apache.org> wrote on Wed, Feb 13, 2019 at 7:21 PM:
>
>> Hi all!
>>
>> Recently several contributors, committers, and users asked about making
>> it more visible in which way the project is currently going.
>>
>> Users and developers can track the direction by following the discussion
>> threads and JIRA, but due to the mass of discussions and open issues, it is
>> very hard to get a good overall picture.
>> Especially for new users and contributors, it is very hard to get a quick
>> overview of the project direction.
>>
>> To fix this, I suggest to add a brief roadmap summary to the homepage. It
>> is a bit of a commitment to keep that roadmap up to date, but I think the
>> benefit for users justifies that.
>> The Apache Beam project has added such a roadmap [1]
>> <https://beam.apache.org/roadmap/>, which was received very well by the
>> community. I would suggest to follow a similar structure here.
>>
>> If the community is in favor of this, I would volunteer to write a first
>> version of such a roadmap. The points I would include are below.
>>
>> Best,
>> Stephan
>>
>> [1] https://beam.apache.org/roadmap/
>>
>> ========================================================
>>
>> Disclaimer: Apache Flink is not governed or steered by any one single
>> entity, but by its community and Project Management Committee (PMC). This
>> is not an authoritative roadmap in the sense of a plan with a specific
>> timeline. Instead, we share our vision for the future and major initiatives
>> that are receiving attention and give users and contributors an
>> understanding what they can look forward to.
>>
>> *Future Role of Table API and DataStream API*
>> - Table API becomes first class citizen
>> - Table API becomes primary API for analytics use cases
>>   * Declarative, automatic optimizations
>>   * No manual control over state and timers
>> - DataStream API becomes primary API for applications and data pipeline
>> use cases
>>   * Physical, user controls data types, no magic or optimizer
>>   * Explicit control over state and time
>>
>> *Batch Streaming Unification*
>> - Table API unification (environments) (FLIP-32)
>> - New unified source interface (FLIP-27)
>> - Runtime operator unification & code reuse between DataStream / Table
>> - Extending Table API to make it convenient API for all analytical use
>> cases (easier mix in of UDFs)
>> - Same join operators on bounded/unbounded Table API and DataStream API
>>
>> *Faster Batch (Bounded Streams)*
>> - Much of this comes via Blink contribution/merging
>> - Fine-grained Fault Tolerance on bounded data (Table API)
>> - Batch Scheduling on bounded data (Table API)
>> - External Shuffle Services Support on bounded streams
>> - Caching of intermediate results on bounded data (Table API)
>> - Extending DataStream API to explicitly model bounded streams (API
>> breaking)
>> - Add fine fault tolerance, scheduling, caching also to DataStream API
>>
>> *Streaming State Evolution*
>> - Let all built-in serializers support stable evolution
>> - First class support for other evolvable formats (Protobuf, Thrift)
>> - Savepoint input/output format to modify / adjust savepoints
>>
>> *Simpler Event Time Handling*
>> - Event Time Alignment in Sources
>> - Simpler out-of-the box support in sources
>>
>> *Checkpointing*
>> - Consistency of Side Effects: suspend / end with savepoint (FLIP-34)
>> - Failed checkpoints explicitly aborted on TaskManagers (not only on
>> coordinator)
>>
>> *Automatic scaling (adjusting parallelism)*
>> - Reactive scaling
>> - Active scaling policies
>>
>> *Kubernetes Integration*
>> - Active Kubernetes Integration (Flink actively manages containers)
>>
>> *SQL Ecosystem*
>> - Extended Metadata Stores / Catalog / Schema Registries support
>> - DDL support
>> - Integration with Hive Ecosystem
>>
>> *Simpler Handling of Dependencies*
>> - Scala in the APIs, but not in the core (hide in separate class loader)
>> - Hadoop-free by default