Hi Flink folks!

After the positive reaction to the contribution proposal for Stateful
Functions, I would like to kick off the discussion for the big question: In
which form should it go into Flink?

Before jumping into the "repository" question directly, let's get some
clarity on what would be our high-level goal with this project and the
contribution.
My thinking so far was:

  - Stateful Functions is a way for Flink and stream processing to become
applicable for more general application development. That is a chance to
grow our community to a new crowd of developers.

  - While adding this to Flink gives synergies with the runtime it builds on
top of, it makes sense to offer the new developers a lightweight way to get
involved. Simple setup, easy contributions.

  - This is a new project; the API and many designs are not frozen at this
point and may still change heavily.
    To become really good, the project needs to still make a bunch of
iterations (no pun intended) and change many things quickly.

  - The Stateful Functions project will likely try to release very
frequently in its early days, to improve quickly and gather feedback fast.
Being bound to the Flink core release cycle would hurt here.


I believe that with all those goals, adding Stateful Functions to the Flink
core repository would not make sense. Flink core has processes that make
sense for an established project that needs to guarantee stability. These
processes are simply prohibitive for new projects to develop.
In addition, the Flink main repository is gigantic, and its build system and
CI system can no longer handle the size of the project. Not the best
way to start expanding into a new community.

In some sense, Stateful Functions could make sense as an independent
project, but it is so tightly coupled to Flink right now that I think an
even better fit is a separate repository in Flink.
Think Hive and Hadoop in the early days. That way, we get the synergy
between the two (the same community drives them) while letting both move at
their own speed.
In effect, that would mean two closely related projects shepherded by the same
community.

It might be possible at a later stage to either merge this into Flink core
(once Stateful Functions is more settled) or even spin this out as a
standalone Apache project, if that is how the community develops.

That is my main motivation. It is not driven primarily by technicalities
like code versioning and dependencies, but much rather by what is the best
setup to develop this as Flink's way to expand its community towards new
users from a different background.

Curious to hear if that makes sense to you.

Best,
Stephan
