Re: [DISCUSS] Features for Apache Flink 1.10

Till Rohrmann Sat, 07 Sep 2019 01:58:21 -0700

Thanks for compiling the list of 1.10 efforts for the community, Gary. I
think this helps a lot to better understand what the community is currently
working on.


Thanks for volunteering as the release managers for the next major
release. +1 for Gary and Yu being the RMs for Flink 1.10.

Cheers,
Till

On Sat, Sep 7, 2019 at 7:26 AM Zhu Zhu <[email protected]> wrote:

> Thanks Gary for kicking off this discussion.
> I really appreciate that you and Yu have offered to help manage the 1.10 release.
>
> +1 for Gary and Yu as release managers.
>
> Thanks,
> Zhu Zhu
>
> On Sat, Sep 7, 2019 at 12:26 PM Dian Fu <[email protected]> wrote:
>
> > Hi Gary,
> >
> > Thanks for kicking off the release schedule of 1.10. +1 for you and Yu Li
> > as the release managers.
> >
> > The feature freeze/release time sounds reasonable.
> >
> > Thanks,
> > Dian
> >
> > > On Sep 7, 2019, at 11:30 AM, Jark Wu <[email protected]> wrote:
> > >
> > > Thanks Gary for kicking off the discussion for 1.10 release.
> > >
> > > +1 for Gary and Yu as release managers. Thank you for your effort.
> > >
> > > Best,
> > > Jark
> > >
> > >
> > >> On Sep 7, 2019, at 00:52, zhijiang <[email protected]> wrote:
> > >>
> > >> Hi Gary,
> > >>
> > >> Thanks for kicking off the feature discussion for the next release, 1.10.
> > >> I am very supportive of you and Yu Li being the release managers.
> > >>
> > >> I would just like to mention two more improvements that we want to cover
> > >> in Flink 1.10. I have already discussed and agreed on them with Piotr.
> > >>
> > >> 1. Serialize and copy data only once for broadcast partitions [1]: This
> > >> would greatly improve throughput in broadcast mode and was actually
> > >> proposed in Flink 1.8. Most of the work is already done, and only the last
> > >> critical JIRA/PR is left. It will not take much effort to make it ready.
> > >>
> > >> 2. Let Netty use Flink's buffers directly in credit-based mode [2]: This
> > >> could avoid the memory copy from the Netty stack to Flink's managed network
> > >> buffers. The obvious benefit is a large reduction of the direct memory
> > >> overhead in large-scale jobs. I have also heard of some user cases that
> > >> encountered direct memory OOMs caused by Netty's memory overhead. This
> > >> improvement was actually proposed by Nico in Flink 1.7, but there was never
> > >> time to focus on it. Yun Gao submitted a PR half a year ago, but it has not
> > >> been reviewed yet. I could help review the design and the PR code to make
> > >> it ready.
> > >>
> > >> You could give these two items the lowest priority if possible.
> > >>
> > >> [1] https://issues.apache.org/jira/browse/FLINK-10745
> > >> [2] https://issues.apache.org/jira/browse/FLINK-10742
> > >>
> > >> Best,
> > >> Zhijiang
> > >> ------------------------------------------------------------------
> > >> From:Gary Yao <[email protected]>
> > >> Send Time:Friday, September 6, 2019, 17:06
> > >> To:dev <[email protected]>
> > >> Cc:carp84 <[email protected]>
> > >> Subject:[DISCUSS] Features for Apache Flink 1.10
> > >>
> > >> Hi community,
> > >>
> > >> Since Apache Flink 1.9.0 was released more than 2 weeks ago, I want to
> > >> kick off the discussion about what we want to achieve for the 1.10
> > >> release.
> > >>
> > >> Based on discussions with various people as well as observations from
> > >> mailing list threads, Yu Li and I have compiled a list of features that we
> > >> deem important to be included in the next release. Note that the features
> > >> presented here are not meant to be exhaustive. As always, I am sure that
> > >> there will be other contributions that will make it into the next release.
> > >> This email thread is merely to kick off a discussion, and to give users and
> > >> contributors an understanding of where the focus of the next release lies.
> > >> If there is anything we have missed that somebody is working on, please
> > >> reply to this thread.
> > >>
> > >>
> > >> ** Proposed features and focus
> > >>
> > >> Following the contribution of Blink to Apache Flink, the community released
> > >> a preview of the Blink SQL Query Processor, which offers better SQL coverage
> > >> and improved performance for batch queries, in Flink 1.9.0. However, the
> > >> integration of the Blink query processor is not fully completed yet as there
> > >> are still pending tasks, such as implementing full TPC-DS support. With the
> > >> next Flink release, we aim at finishing the Blink integration.
> > >>
> > >> Furthermore, there are several ongoing work threads addressing long-standing
> > >> issues reported by users, such as improving checkpointing under backpressure
> > >> and limiting RocksDB's native memory usage, which can be especially
> > >> problematic in containerized Flink deployments.
> > >>
> > >> Notable features surrounding Flink’s ecosystem that are planned for the next
> > >> release include active Kubernetes support (i.e., enabling Flink’s
> > >> ResourceManager to launch new pods), improved Hive integration, Java 11
> > >> support, and new algorithms for the Flink ML library.
> > >>
> > >> Below I have included the list of features that we compiled ordered by
> > >> priority, some of which already have ongoing mailing list threads, JIRAs,
> > >> or FLIPs.
> > >>
> > >> - Improving Flink’s build system & CI [1] [2]
> > >> - Support Java 11 [3]
> > >> - Table API improvements
> > >>   - Configuration Evolution [4] [5]
> > >>   - Finish type system: Expression Re-design [6] and UDF refactor
> > >>   - Streaming DDL: Time attribute (watermark) and Changelog support
> > >>   - Full SQL partition support for both batch & streaming [7]
> > >>   - New Java Expression DSL [8]
> > >>   - SQL CLI with DDL and DML support
> > >> - Hive compatibility completion (DDL/UDF) to support full Hive integration
> > >>   - Partition/Function/View support
> > >> - Remaining Blink planner/runtime merge
> > >>   - Support all TPC-DS queries [9]
> > >> - Finer grained resource management
> > >>   - Unified TaskExecutor Memory Configuration [10]
> > >>   - Fine Grained Operator Resource Management [11]
> > >>   - Dynamic Slots Allocation [12]
> > >> - Finish scheduler re-architecture [13]
> > >>   - Allows implementing more sophisticated scheduling strategies such as
> > >>     a better batch scheduler or speculative execution.
> > >> - New DataStream Source Interface [14]
> > >>   - A new source connector architecture to unify the implementation of
> > >>     source connectors and make it simpler to implement custom source
> > >>     connectors.
> > >> - Add more source/system metrics
> > >>   - For better Flink job monitoring and to facilitate customized solutions
> > >>     like auto-scaling.
> > >> - Executor Interface / Client API [15]
> > >>   - Allow Flink downstream projects to monitor and control Flink jobs more
> > >>     easily and effectively.
> > >> - Interactive Programming [16]
> > >>   - Allow users to cache the intermediate results in Table API for later
> > >>     usage to avoid redundant computation when a Flink application contains
> > >>     multiple jobs.
> > >> - Python User Defined Function [17]
> > >>   - Support native user-defined functions in Flink Python, including
> > >>     UDF/UDAF/UDTF in Table API and Python-Java mixed UDF.
> > >> - Spillable heap backend [18]
> > >>   - A new state backend supporting automatic data spill and load when
> > >>     memory is exhausted/regained.
> > >> - RocksDB backend memory control [19]
> > >>   - Prevent excessive memory usage from RocksDB, especially in container
> > >>     environments.
> > >> - Unaligned checkpoints [20]
> > >>   - Resolve the checkpoint timeout issue under backpressure.
> > >> - Separate framework and user class loader in per-job mode
> > >> - Active Kubernetes Integration [21]
> > >>   - Allow ResourceManager to talk to Kubernetes to launch new pods,
> > >>     similar to Flink's Yarn/Mesos integration
> > >> - ML pipeline/library
> > >>   - Aims at delivering several core algorithms, including Logistic
> > >>     Regression, Naive Bayes, Random Forest, KMeans, etc.
> > >> - Add vertex subtask log url on WebUI [22]
> > >>
> > >>
> > >> ** Suggested release timeline
> > >>
> > >> Based on our usual time-based release schedule [23], and considering that
> > >> several events, such as Flink Forward Europe and Asia, are overlapping with
> > >> the current release cycle, we should aim at releasing 1.10 around the
> > >> beginning of January 2020. To give the community enough testing time, I
> > >> propose the feature freeze to be at the end of November. We should announce
> > >> an exact date later in the release cycle.
> > >>
> > >> Lastly, I would like to use the opportunity to propose Yu Li and myself as
> > >> release managers for the upcoming release.
> > >>
> > >> What do you think?
> > >>
> > >>
> > >> Best,
> > >> Gary
> > >>
> > >> [1] https://lists.apache.org/thread.html/775447a187410727f5ba6f9cefd6406c58ca5cc5c580aecf30cf213e@%3Cdev.flink.apache.org%3E
> > >> [2] https://lists.apache.org/thread.html/b90aa518fcabce94f8e1de4132f46120fae613db6e95a2705f1bd1ea@%3Cdev.flink.apache.org%3E
> > >> [3] https://issues.apache.org/jira/browse/FLINK-10725
> > >> [4] https://cwiki.apache.org/confluence/display/FLINK/FLIP-54%3A+Evolve+ConfigOption+and+Configuration
> > >> [5] https://cwiki.apache.org/confluence/display/FLINK/FLIP-59%3A+Enable+execution+configuration+from+Configuration+object
> > >> [6] https://cwiki.apache.org/confluence/display/FLINK/FLIP-51%3A+Rework+of+the+Expression+Design
> > >> [7] https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
> > >> [8] https://cwiki.apache.org/confluence/display/FLINK/FLIP-55%3A+Introduction+of+a+Table+API+Java+Expression+DSL
> > >> [9] https://issues.apache.org/jira/browse/FLINK-11491
> > >> [10] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
> > >> [11] https://cwiki.apache.org/confluence/display/FLINK/FLIP-53%3A+Fine+Grained+Operator+Resource+Management
> > >> [12] https://cwiki.apache.org/confluence/display/FLINK/FLIP-56%3A+Dynamic+Slot+Allocation
> > >> [13] https://issues.apache.org/jira/browse/FLINK-10429
> > >> [14] https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
> > >> [15] https://lists.apache.org/thread.html/498dd3e0277681cda356029582c1490299ae01df912e15942e11ae8e@%3Cdev.flink.apache.org%3E
> > >> [16] https://cwiki.apache.org/confluence/display/FLINK/FLIP-36%3A+Support+Interactive+Programming+in+Flink
> > >> [17] https://cwiki.apache.org/confluence/display/FLINK/FLIP-58%3A+Flink+Python+User-Defined+Stateless+Function+for+Table
> > >> [18] https://cwiki.apache.org/confluence/display/FLINK/FLIP-50%3A+Spill-able+Heap+Keyed+State+Backend
> > >> [19] https://issues.apache.org/jira/browse/FLINK-7289
> > >> [20] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Checkpointing-under-backpressure-td31616.html
> > >> [21] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Best-practice-to-run-flink-on-kubernetes-td31532.html
> > >> [22] https://issues.apache.org/jira/browse/FLINK-13894
> > >> [23] https://cwiki.apache.org/confluence/display/FLINK/Time-based+releases
> > >>
> > >
> >
> >
>
