+1 on having a separate repository. I am always an advocate of separate repositories, and the substantial benefits of doing so are quite convincing. The only reason we might want to put Stateful Functions in the main repo is probably that it looks just like CEP, Gelly and other libraries that target specific use cases. That is somewhat philosophical. But given that Stateful Functions no longer seems to be a "data processing" use case, it also looks reasonable to treat it differently. And as others mentioned, we can always move it into the main repo later if we want to.

Thanks,

Jiangjie (Becket) Qin

On Wed, Nov 6, 2019 at 6:25 PM Stephan Ewen <se...@apache.org> wrote:

> Are there still open questions here?
>
> Or can I treat this discussion as converged in the sense of concluding
> that:
>   - we start initially with a separate repository to allow for individual
>     releases in the early stages
>   - we later revisit this discussion once the project is a bit further
>     along and more converged
>
> Best,
> Stephan
>
>
> On Wed, Oct 16, 2019 at 3:03 PM Stephan Ewen <se...@apache.org> wrote:
>
> > Whether the side project will be overlooked or not will depend a lot on
> > how we integrate it with the current Flink website and documentation.
> >
> > I would think that a separate repository is not necessarily a big problem
> > there.
> > It might also help, because a link to that repo shows prominently that
> > particular angle of the project (application development), rather than it
> > being an API hidden between 100 modules.
> >
> > On Wed, Oct 16, 2019 at 10:02 AM Timo Walther <twal...@apache.org> wrote:
> >
> >> Hi Stephan,
> >>
> >> +1 for keeping it in a separate repository for fast release cycles and
> >> stability until it is mature enough. But we should definitely merge it
> >> back to the core repo also for marketing reasons.
> >>
> >> IMHO side projects tend to be overlooked by the outside world even
> >> though they are great technology.
> >>
> >> Would we still document the code in our main documentation or on a
> >> separate website?
> >>
> >> Thanks,
> >> Timo
> >>
> >>
> >> On 16.10.19 09:15, Aljoscha Krettek wrote:
> >> > I would keep statefun in a separate repo in the beginning, for the
> >> > reasons you mentioned.
> >> >
> >> > Best,
> >> > Aljoscha
> >> >
> >> >> On 15. Oct 2019, at 23:40, Flavio Pompermaier <pomperma...@okkam.it> wrote:
> >> >>
> >> >> Definitely on the same page. +1 to keep it in a separate repo (at least
> >> >> until the code becomes "stable" and widely adopted by the community)
> >> >>
> >> >> On Tue, Oct 15, 2019, 23:17 Stephan Ewen <se...@apache.org> wrote:
> >> >>
> >> >>> Hi Flink folks!
> >> >>>
> >> >>> After the positive reaction to the contribution proposal for Stateful
> >> >>> Functions, I would like to kick off the discussion for the big
> >> >>> question: In which form should it go into Flink?
> >> >>>
> >> >>> Before jumping into the "repository" question directly, let's get some
> >> >>> clarity on what would be our high-level goal with this project and the
> >> >>> contribution.
> >> >>> My thinking so far was:
> >> >>>
> >> >>>   - Stateful Functions is a way for Flink and stream processing to
> >> >>>     become applicable for more general application development. That is
> >> >>>     a chance to grow our community to a new crowd of developers.
> >> >>>
> >> >>>   - While adding this to Flink gives synergies with the runtime it
> >> >>>     builds on top of, it makes sense to offer the new developers a
> >> >>>     lightweight way to get involved. Simple setup, easy contributions.
> >> >>>
> >> >>>   - This is a new project, the API and many designs are not frozen at
> >> >>>     this point and may still change heavily.
> >> >>>     To become really good, the project needs to still make a bunch of
> >> >>>     iterations (no pun intended) and change many things quickly.
> >> >>>
> >> >>>   - The Stateful Functions project will likely try to release very
> >> >>>     frequently in its early days, to improve quickly and gather feedback
> >> >>>     fast. Being bound to Flink core release cycle would hurt here.
> >> >>>
> >> >>>
> >> >>> I believe that with all those goals, adding Stateful Functions to the
> >> >>> Flink core repository would not make sense. Flink core has processes
> >> >>> that make sense for an established project that needs to guarantee
> >> >>> stability. These processes are simply prohibitive for new projects to
> >> >>> develop.
> >> >>> In addition, the Flink main repository is gigantic, has a build system
> >> >>> and CI system that cannot handle the size of the project any more. Not
> >> >>> the best way to start expanding into a new community.
> >> >>>
> >> >>> In some sense, Stateful Functions could make sense as an independent
> >> >>> project, but it is so tightly coupled to Flink right now that I think
> >> >>> an even better fit is a separate repository in Flink.
> >> >>> Think Hive and Hadoop in the early days. That way, we get the synergy
> >> >>> between the two (the same community drives them) while letting both
> >> >>> move at their own speed.
> >> >>> It would somehow mean two closely related projects shepherded by the
> >> >>> same community.
> >> >>>
> >> >>> It might be possible at a later stage to either merge this into Flink
> >> >>> core (once Stateful Functions is more settled) or even spin this out as
> >> >>> a standalone Apache project, if that is how the community develops.
> >> >>>
> >> >>> That is my main motivation. It is not driven primarily by
> >> >>> technicalities like code versioning and dependencies, but much rather
> >> >>> by what is the best setup to develop this as Flink's way to expand its
> >> >>> community towards new users from a different background.
> >> >>>
> >> >>> Curious to hear if that makes sense to you.
> >> >>>
> >> >>> Best,
> >> >>> Stephan