Hi All,

Awesome! I see this as a great opportunity for newcomers like me to contribute.
Is this discussion also happening in a Slack or Discord forum? If so, please include me.

Thanks,
Sai

On Fri, Apr 28, 2023 at 2:55 AM Martijn Visser <martijnvis...@apache.org> wrote:

> Hi all,
>
> I think the proposal is a good starting point. We should aim to make Flink a unified data processing, cloud friendly / cloud native technology, with proper low-level and high-level interfaces (DataStream API, Table API, SQL). I think it would make a lot of sense that we write down a vision for Flink for the long term. That would also mean sharing and discussing more insights and having conversations around some of the long-term direction from the proposal.
>
> In order to achieve that vision, I believe that we need a Flink 2.0 which I consider a long overdue clean-up. That version should be the foundation for Flink that allows the above mentioned vision to become actual proposals and implementations.
>
> As a foundation in Flink 2.0, I would be inclined to say it should be:
>
> - Remove all deprecated APIs, including the DataSet API, Scala API, Queryable State, legacy Source and Sink implementations, legacy SQL functions etc.
> - Add support for Java 17 and 21, make 17 the default (given that the next Java LTS, 21, is released in September this year and the timeline is set for 2024)
> - Drop support for Java 8 and 11
> - Refactor the configuration layer
> - Refactor the DataStream API, such as:
> ** Having a coherent and well designed API
> ** Decouple the API into API-only modules, so no more cyclic dependencies and leaking of non-APIs, including Kryo
> ** Reorganize APIs and modules
>
> I think these are some of the must-haves. Curious about the thoughts of the community.
>
> Thanks, Martijn
>
> On Thu, Apr 27, 2023 at 10:16, David Morávek <d...@apache.org> wrote:
>
> > Hi,
> >
> > Great to see this topic moving forward; I agree it's long overdue.
> >
> > I keep thinking about 2.0 as a chance to eliminate things that didn't work, make the feature set denser, and fix rough edges and APIs that hold us back.
> >
> > Some items in the doc (Key Features section) don't tick these boxes for me, as they could also be implemented in the 1.x branch. We should consider whether we need a backward incompatible release to introduce each feature. This should help us to keep the discussion more focused.
> >
> > Best,
> > D.
> >
> > On Wed, Apr 26, 2023 at 2:33 PM DONG Weike <kyled...@connect.hku.hk> wrote:
> >
> > > Hi,
> > >
> > > It is thrilling to see the foreseeable upcoming rollouts of Flink 2.x releases, and I believe that this roadmap can take Flink to the next stage of a top-notch unified streaming & batch computing engine.
> > >
> > > Given that all of the existing user programs are written and run in Flink 1.x versions as of now, and some of them are very complex and rely on various third-party connectors written with legacy APIs, one concern I have is this: if, one day in the future, the community decides that new features are only given to 2.x releases, could the last release of Flink 1.x be turned into an LTS version (backporting severe bug fixes and critical security patches), so that existing users have enough time to wait for third-party connectors to upgrade, test their programs on the Flink APIs, and avoid a sudden loss of community support?
> > >
> > > Just my two cents : )
> > >
> > > Best,
> > > Weike
> > >
> > > ________________________________
> > > From: Xintong Song <tonysong...@gmail.com>
> > > Sent: April 26, 2023 20:01
> > > To: dev <dev@flink.apache.org>
> > > Subject: Re: [DISCUSS] Planning Flink 2.0
> > >
> > > @Chesnay
> > >
> > > > Technically this implies that every minor release may contain breaking changes, which is exactly what users don't want.
> > >
> > > It's not necessary to introduce the breaking changes immediately upon reaching the minimum guaranteed stable time. If there are multiple changes waiting for the stable time, we can still gather them in 1 minor release. But I see your point, from the user's perspective, the mechanism does not provide any guarantees for the compatibility of minor releases.
> > >
> > > > What problems do you see in creating major releases every N years?
> > >
> > > It might not be a concrete problem, but I'm a bit concerned by the uncertainty. I assume N should not be too small, e.g., at least 3. I'd expect the decision to ship a major release would be made based on comprehensive considerations over the situations at that time. Making a decision now that we would ship a major release 3 years later seems a bit aggressive to me.
> > >
> > > > We need to figure out what this release means for connectors compatibility-wise.
> > >
> > > +1
> > >
> > > > What process are you thinking of for deciding what breaking changes to make? The obvious choice would be FLIPs, but I'm worried that this will overload the mailing list / wiki for lots of tiny changes.
> > >
> > > This should be a community decision. What I have in mind would be: (1) collect a wish list on wiki, (2) schedule a series of online meetings (like the release syncs) to get an agreed set of must-have items, (3) develop and polish the detailed plans of items via FLIPs, and (4) if the plan for a must-have item does not work out then go back to (2) for an update. I'm also open to other opinions.
> > >
> > > > Would we wait a few months for people to prepare/agree on changes so we reduce the time we need to merge things into 2 branches?
> > >
> > > That's what I had in mind. Hopefully after 1.18.
> > >
> > > @Max
> > >
> > > > When I look at
> > > > https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit
> > > > , I'm a bit skeptical we will even be able to reach all these goals. I think we have to prioritize and try to establish a deadline. Otherwise we will end up never releasing 2.0.
> > >
> > > Sorry for the confusion. I should have explained this more clearly. We are not planning to finish all the items in the list. It's more like a brainstorm, a list of candidates. We are also expecting to collect more ideas from the community. And after collecting the ideas, we should prioritize them and decide on a subset of must-have items, following the consensus decision making.
> > >
> > > > +1 on Flink 2.0 by May 2024 (not a hard deadline but I think having a deadline helps).
> > >
> > > I agree that having a deadline helps. I proposed mid 2024, which is similar to but not as explicit as what you proposed. We may start with having a deadline for deciding the must-have items (e.g., by the end of June). That should make it easier for estimating the overall time needed for preparing the release.
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > > On Wed, Apr 26, 2023 at 6:57 PM Gyula Fóra <gyf...@apache.org> wrote:
> > >
> > > > +1 to everything Max said.
> > > >
> > > > Gyula
> > > >
> > > > On Wed, 26 Apr 2023 at 11:42, Maximilian Michels <m...@apache.org> wrote:
> > > >
> > > > > Thanks for starting the discussion, Jark and Xintong!
> > > > >
> > > > > Flink 2.0 is long overdue. In the past, the expectations for such a release were unreasonably high. I think everybody had a different understanding of what exactly the criteria were. This led to releasing 18 minor releases for the current major version.
> > > > >
> > > > > What I'm most excited about for Flink 2.0 is removal of baggage that Flink has accumulated over the years:
> > > > >
> > > > > - Removal of Scala, deprecated interfaces, unmaintained libraries and APIs (DataSet)
> > > > > - Consolidation of configuration
> > > > > - Merging of multiple scheduler implementations
> > > > > - Ability to freely combine batch / streaming tasks in the runtime
> > > > >
> > > > > When I look at
> > > > > https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit
> > > > > , I'm a bit skeptical we will even be able to reach all these goals. I think we have to prioritize and try to establish a deadline. Otherwise we will end up never releasing 2.0.
> > > > >
> > > > > +1 on Flink 2.0 by May 2024 (not a hard deadline but I think having a deadline helps).
> > > > >
> > > > > -Max
> > > > >
> > > > > On Wed, Apr 26, 2023 at 10:08 AM Chesnay Schepler <ches...@apache.org> wrote:
> > > > >
> > > > > > > /Instead of defining compatibility guarantees as "this API won't change in all 1.x/2.x series", what if we define it as "this API won't change in the next 2/3 years"./
> > > > > >
> > > > > > I can see some benefits to this approach (all APIs having a fixed minimum lifetime) but it's just gonna be difficult to communicate. Technically this implies that every minor release may contain breaking changes, which is exactly what users don't want.
> > > > > >
> > > > > > What problems do you see in creating major releases every N years?
> > > > > >
> > > > > > > /IIUC, the milestone releases are a breakdown of the 2.0 release, while we are free to introduce breaking changes between them.
> > > > > > > And you suggest using longer-living feature branches to keep the master branch in a releasable state (in terms of milestone releases). Am I understanding it correctly?/
> > > > > >
> > > > > > I think you got the general idea. There are a lot of details to be ironed out though (e.g., do we release connectors for each milestone?...).
> > > > > >
> > > > > > Conflicts in the long-lived branches are certainly a concern, but I think those will be inevitable. Right now I'm not _too_ worried about them, at least based on my personal wish-list.
> > > > > > Maybe the milestones could even help with that, as we could preemptively decide on an order for certain changes that have a high chance of conflicting with each other?
> > > > > > I guess we could do that anyway.
> > > > > > Maybe we should explicitly evaluate how invasive a change is (in relation to other breaking changes!) and manage things accordingly
> > > > > >
> > > > > > Other thoughts:
> > > > > >
> > > > > > We need to figure out what this release means for connectors compatibility-wise. The current rules for which versions a connector must support don't cover major releases at all.
> > > > > > (This depends a bit on the scope of 2.0; if we add binary compatibility to Public APIs and promote a few Evolving ones then compatibility across minor releases becomes trivial)
> > > > > >
> > > > > > What process are you thinking of for deciding what breaking changes to make? The obvious choice would be FLIPs, but I'm worried that this will overload the mailing list / wiki for lots of tiny changes.
> > > > > >
> > > > > > Provided that we agree on doing 2.0, when would we cut the 2.0 branch?
> > > > > > Would we wait a few months for people to prepare/agree on changes so we reduce the time we need to merge things into 2 branches?
> > > > > >
> > > > > > On 26/04/2023 05:51, Xintong Song wrote:
> > > > > > > Thanks all for the positive feedback.
> > > > > > >
> > > > > > > @Martijn
> > > > > > >
> > > > > > >> If we want to have that roadmap, should we consolidate this into a dedicated Confluence page over storing it in a Google doc?
> > > > > > >
> > > > > > > Having a dedicated wiki page is definitely a good way for the roadmap discussion. I haven't created one yet because it's still a proposal to have such roadmap discussion. If the community agrees with our proposal, the release manager team can decide how they want to drive and track the roadmap discussion.
> > > > > > >
> > > > > > > @Chesnay
> > > > > > >
> > > > > > >> We should discuss how regularly we will ship major releases from now on. Let's avoid again making breaking changes because we "gotta do it now because 3.0 isn't happening anytime soon". (e.g., every 2 years or something)
> > > > > > >
> > > > > > > I'm not entirely sure about shipping major releases regularly. But I do agree that we may want to avoid the situation that "breaking changes can only happen now, or no idea when". Instead of defining compatibility guarantees as "this API won't change in all 1.x/2.x series", what if we define it as "this API won't change in the next 2/3 years". That should allow us to incrementally iterate the APIs.
> > > > > > >
> > > > > > > E.g., in 2.a, all APIs annotated as `@Stable` will be guaranteed compatible until 2 years after 2.a is shipped, and in 2.b if the API is still annotated `@Stable` it extends the compatibility guarantee to 2 years after 2.b is shipped. To remove an API, we would need to mark it as `@Deprecated` and wait for 2 years after the last release in which it was marked `@Stable`.
> > > > > > >
> > > > > > >> My thinking goes rather in the area of defining Milestone releases, each Milestone targeting specific changes.
> > > > > > >
> > > > > > > I'm trying to understand what you are suggesting here. IIUC, the milestone releases are a breakdown of the 2.0 release, while we are free to introduce breaking changes between them. And you suggest using longer-living feature branches to keep the master branch in a releasable state (in terms of milestone releases). Am I understanding it correctly?
> > > > > > >
> > > > > > > I haven't thought this through. My gut feeling is this might be a good direction to go, in terms of keeping things organized. The risk is the cost of merging feature branches and rebasing feature branches after other features are merged. That depends on how closely the features are related to each other. E.g., reorganization of the project modules and dependencies may change the project structure a lot, which may significantly affect most of the feature branches. Maybe we can identify such widely-affecting changes and perform them at the beginning or end of the release cycle.
> > > > > > >
> > > > > > > Best,
> > > > > > >
> > > > > > > Xintong
> > > > > > >
> > > > > > > On Wed, Apr 26, 2023 at 8:23 AM ConradJam <jam.gz...@gmail.com> wrote:
> > > > > > >
> > > > > > >> Thanks Xintong and Jark for kicking off the great discussion!
> > > > > > >>
> > > > > > >> I checked the list carefully. The plans are detailed and most of the problems are covered.
> > > > > > >> Regarding some of the ideas Chesnay mentioned, I think we should iterate in small steps and collect feedback in time.
> > > > > > >> Looking forward to the start of the work on Flink 2.0. I am willing to provide assistance ~
> > > > > > >>
> > > > > > >> Xintong Song <tonysong...@gmail.com> wrote on Tue, Apr 25, 2023 at 19:10:
> > > > > > >>> Hi everyone,
> > > > > > >>>
> > > > > > >>> I'd like to start a discussion on planning for a Flink 2.0 release.
> > > > > > >>>
> > > > > > >>> AFAIK, in the past years this topic has been mentioned from time to time, in mailing lists, Jira tickets and offline discussions. However, few concrete steps have been taken, due to the significant determination and efforts it requires and distractions from other prioritized focuses. After a series of offline discussions in the recent weeks, with folks mostly from our team internally as well as a few from outside Alibaba / Ververica (thanks for insights from Becket and Robert), we believe it's time to kick this off in the community.
> > > > > > >>>
> > > > > > >>> Below are some of our thoughts about the 2.0 release. Looking forward to your opinions and feedback.
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> ## Why plan for release 2.0?
> > > > > > >>>
> > > > > > >>> Flink 1.0.0 was released in March 2016. In the past 7 years, many new features have been added and the project has become different from what it used to be. So what is Flink now? What will it become in the next 3-5 years? What do we think of Flink's position in the industry? We believe it's time to rethink these questions, and draw a roadmap towards another milestone, a milestone that is worth a new major release.
> > > > > > >>>
> > > > > > >>> In addition, we are still providing backwards compatibility (maybe not perfectly but largely) with APIs that we designed and claimed stable 7 years ago. While such backwards compatibility helps users to stick with the latest Flink releases more easily, it sometimes, and more and more over time, also becomes a burden for maintenance and a limitation for new features and improvements. It's probably time to have a comprehensive review and clean-up over all the public APIs.
> > > > > > >>>
> > > > > > >>> Furthermore, next year is the 10th year for Flink as an Apache project. Flink joined the Apache incubator in April 2014, and became a top-level project in December 2014. That makes 2024 a perfect time for bringing out the release 2.0 milestone. And for such a major release, we'd expect it takes one year or even longer to prepare for, which means we probably should start now.
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> ## What should we focus on in release 2.0?
> > > > > > >>>
> > > > > > >>> - Roadmap discussion - How do we define and position Flink for now and in future? This is probably something we lacked. I believe some people have thought about it, but at least it's not explicitly discussed and aligned in the community. Ideally, the 2.0 release should be a result of the roadmap.
> > > > > > >>> - Breaking changes - Important improvements, bugfixes, technical debts that involve breaking of API backwards compatibility, which can only be carried out in major releases.
> > > > > > >>>    - With breaking API changes, we may need multiple 2.0-alpha/beta versions to collect feedback.
> > > > > > >>> - Key features - Significant features and improvements (e.g., new user stories, architectural upgrades) that may change how users use Flink and its position in the industry. Some items from this category may also involve API breaking changes or significant behavior changes.
> > > > > > >>>    - There are also opinions that we should stay focused as much as possible on the breaking changes only. Incremental / non-breaking improvements and features, or anything that can be added in 2.x minor releases, should not block the 2.0 release.
> > > > > > >>>
> > > > > > >>> It might be better to discuss the detailed technical items later in another thread, to keep the current discussion focused on the overall proposal, and to leave time for all parties to think about their technical plans. For your reference, I've attached a preliminary list of work items proposed by Alibaba / Ververica [1]. Note that the listed items are still being carefully evaluated and prioritized, and may change in future.
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> ## How do we manage the release?
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> #### Release Process
> > > > > > >>>
> > > > > > >>> We'd expect the release process for Flink 2.0 to be different from the 1.x releases.
> > > > > > >>>
> > > > > > >>> A major difference is that we think the timeline-based release management may not be suitable. The idea behind the timeline-based approach is that we can have more frequent releases and deliver completed features to users earlier, while incomplete features can be postponed to the next release, which won't be too late given the short release cycle. However, for breaking changes that can only take place in major releases, the price for missing a release is too high.
> > > > > > >>>
> > > > > > >>> Alternatively, we probably should discuss and agree on a list of must-have work items. That doesn't mean we keep postponing the release over a few delayed features.
> > > > > > >>> In fact, we would need to closely monitor the progress of the must-have items during the entire release cycle, making sure they are taken care of by contributors with enough expertise and capacities.
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> #### Timeline
> > > > > > >>>
> > > > > > >>> The release cycle should be decided according to the feature list, especially the must-have items that we plan to do in the release. However, a target feature freeze date would still be helpful when making the plan, so that we don't pack too many things into the release. We propose to aim for a feature freeze around mid 2024, so that in case must-have items are delayed, we still have a good chance to make the release happen by the end of 2024.
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> #### Branch
> > > > > > >>>
> > > > > > >>> A longer release cycle also means we probably should keep shipping the 1.x releases while preparing for the 2.0 release. We may cut a release-1 branch from master, on which we can keep developing and release 1.x releases. The version on the master branch will then become '2.0-SNAPSHOT'.
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> #### Release Manager
> > > > > > >>>
> > > > > > >>> Given the new and to-be-explored release process, longer cycle and higher synchronization requirements, we'd expect the 2.0 release to be more challenging than previous 1.x releases.
> > > > > > >>> Therefore, we'd like to propose to assemble a release management team with 4-5 experienced PMC members. Jark and I would like to volunteer as 2 of the release managers.
> > > > > > >>>
> > > > > > >>> Looking forward to your thoughts.
> > > > > > >>>
> > > > > > >>> Best,
> > > > > > >>>
> > > > > > >>> Jark & Xintong
> > > > > > >>>
> > > > > > >>> [1] https://docs.google.com/document/d/1_PMGl5RuDQGlV99_gL3y7OiRsF0DgCk91Coua6hFXhE/edit?usp=sharing
> > > > > > >>
> > > > > > >> --
> > > > > > >> Best
> > > > > > >>
> > > > > > >> ConradJam

--
Thanks,
Sai
*+91 917 623 3379*