Re: [DISCUSS] Adding a mid-term roadmap to the Flink website

Rong Rong Wed, 13 Feb 2019 09:48:23 -0800

Thanks Stephan for the great proposal.

This would not only be beneficial for new users but also for contributors
to keep track on all upcoming features.


I think that better window operator support can also be separately group
into its own category, as they affects both future DataStream API and batch
stream unification.
can we also include:
- OVER aggregate for DataStream API separately as @jincheng suggested.
- Improving sliding window operator [1]

One more additional suggestion, can we also include a more extendable
security module [2,3] @shuyi and I are currently working on?
This will significantly improve the usability for Flink in corporate
environments where proprietary or 3rd-party security integration is needed.

Thanks,
Rong


[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Improvement-to-Flink-Window-Operator-with-Slicing-td25750.html
[2]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-security-improvements-td21068.html
[3]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Kerberos-Improvement-td25983.html




On Wed, Feb 13, 2019 at 3:39 AM jincheng sun <[email protected]>
wrote:

> Very excited and thank you for launching such a great discussion, Stephan !
>
> Here only a little suggestion that in the Batch Streaming Unification
> section, do we need to add an item:
>
> - Same window operators on bounded/unbounded Table API and DataStream API
> (currently OVER window only exists in SQL/TableAPI, DataStream API does
> not yet support)
>
> Best,
> Jincheng
>
> Stephan Ewen <[email protected]> 于2019年2月13日周三 下午7:21写道：
>
>> Hi all!
>>
>> Recently several contributors, committers, and users asked about making
>> it more visible in which way the project is currently going.
>>
>> Users and developers can track the direction by following the discussion
>> threads and JIRA, but due to the mass of discussions and open issues, it is
>> very hard to get a good overall picture.
>> Especially for new users and contributors, is is very hard to get a quick
>> overview of the project direction.
>>
>> To fix this, I suggest to add a brief roadmap summary to the homepage. It
>> is a bit of a commitment to keep that roadmap up to date, but I think the
>> benefit for users justifies that.
>> The Apache Beam project has added such a roadmap [1]
>> <https://beam.apache.org/roadmap/>, which was received very well by the
>> community, I would suggest to follow a similar structure here.
>>
>> If the community is in favor of this, I would volunteer to write a first
>> version of such a roadmap. The points I would include are below.
>>
>> Best,
>> Stephan
>>
>> [1] https://beam.apache.org/roadmap/
>>
>> ========================================================
>>
>> Disclaimer: Apache Flink is not governed or steered by any one single
>> entity, but by its community and Project Management Committee (PMC). This
>> is not a authoritative roadmap in the sense of a plan with a specific
>> timeline. Instead, we share our vision for the future and major initiatives
>> that are receiving attention and give users and contributors an
>> understanding what they can look forward to.
>>
>> *Future Role of Table API and DataStream API*
>>   - Table API becomes first class citizen
>>   - Table API becomes primary API for analytics use cases
>>       * Declarative, automatic optimizations
>>       * No manual control over state and timers
>>   - DataStream API becomes primary API for applications and data pipeline
>> use cases
>>       * Physical, user controls data types, no magic or optimizer
>>       * Explicit control over state and time
>>
>> *Batch Streaming Unification*
>>   - Table API unification (environments) (FLIP-32)
>>   - New unified source interface (FLIP-27)
>>   - Runtime operator unification & code reuse between DataStream / Table
>>   - Extending Table API to make it convenient API for all analytical use
>> cases (easier mix in of UDFs)
>>   - Same join operators on bounded/unbounded Table API and DataStream API
>>
>> *Faster Batch (Bounded Streams)*
>>   - Much of this comes via Blink contribution/merging
>>   - Fine-grained Fault Tolerance on bounded data (Table API)
>>   - Batch Scheduling on bounded data (Table API)
>>   - External Shuffle Services Support on bounded streams
>>   - Caching of intermediate results on bounded data (Table API)
>>   - Extending DataStream API to explicitly model bounded streams (API
>> breaking)
>>   - Add fine fault tolerance, scheduling, caching also to DataStream API
>>
>> *Streaming State Evolution*
>>   - Let all built-in serializers support stable evolution
>>   - First class support for other evolvable formats (Protobuf, Thrift)
>>   - Savepoint input/output format to modify / adjust savepoints
>>
>> *Simpler Event Time Handling*
>>   - Event Time Alignment in Sources
>>   - Simpler out-of-the box support in sources
>>
>> *Checkpointing*
>>   - Consistency of Side Effects: suspend / end with savepoint (FLIP-34)
>>   - Failed checkpoints explicitly aborted on TaskManagers (not only on
>> coordinator)
>>
>> *Automatic scaling (adjusting parallelism)*
>>   - Reactive scaling
>>   - Active scaling policies
>>
>> *Kubernetes Integration*
>>   - Active Kubernetes Integration (Flink actively manages containers)
>>
>> *SQL Ecosystem*
>>   - Extended Metadata Stores / Catalog / Schema Registries support
>>   - DDL support
>>   - Integration with Hive Ecosystem
>>
>> *Simpler Handling of Dependencies*
>>   - Scala in the APIs, but not in the core (hide in separate class loader)
>>   - Hadoop-free by default
>>
>>

Re: [DISCUSS] Adding a mid-term roadmap to the Flink website

Reply via email to