[PROPOSAL] Structure the Flink Open Source Development

Stephan Ewen Tue, 10 May 2016 02:24:42 -0700

Hi everyone!

We propose to establish some lightweight structures in the Flink open
source community and development process,
to help us better handle the increased interest in Flink (mailing list and
pull requests), while not overwhelming the
committers, and giving users and contributors a good experience.


This proposal is triggered by the observation that we are reaching the
limits of where the current community can support
users and guide new contributors. The proposal below is based on
observations and ideas from Till, Robert, and me.

========
Goals
========

We want to achieve the following:

  - Pull requests get handled in a timely fashion
  - New contributors are better integrated into the community
  - The community feels empowered on the mailing list.
    Questions that need the attention of someone with deep
knowledge of a certain part of Flink still get that attention.
  - At the same time, the committers that are knowledgeable about many core
parts do not get completely overwhelmed.
  - We don't overlook threads that report critical issues.
  - We always have a pretty good overview of the status of certain
parts of the system:
      -> What are the frequently encountered known issues
      -> What are the most frequently requested features


========
Problems
========

Looking into the process, there are two big issues:

(1) Up to now, we have been relying on the fact that everything just
"organizes itself", driven by best effort. That assumes
that everyone feels equally responsible for every part, question, and
contribution. In the current state, this is impossible
to maintain; it overwhelms the committers and contributors.

Example: Pull requests are picked up by whoever wants to pick them up. Pull
requests that are a lot of work, have little
chance of getting in, or relate to less active components are sometimes not
picked up. When contributors are already heavily
loaded, it may happen that in the end no one feels responsible for
picking up a pull request, and it falls through the cracks.

(2) There is no good overview of the known shortcomings, efforts, and
requested features for different parts of the system.
This information exists in various people's heads, but is not easily
accessible for new people. The Flink JIRA is not well
maintained, so it is not easy to draw insights from it.


===========
The Proposal
===========

Since we are building a parallel system, the natural solution seems to be:
partition the workload ;-)

We propose to define a set of components for Flink. Each component is
maintained or tracked by one or more
people - let's call them maintainers. It is important to note that we do not
intend maintainers to have an authoritative role. They are
simply committers or contributors who visibly step up for a certain
component, and mainly track and drive the efforts
pertaining to that component.

It is also important to realize that we do not want to suggest that people
get less involved with certain parts and components, because
they are not the maintainers. We simply want to make sure that each pull
request or question or contribution has in the end
one person (or a small set of people) responsible for catching and tracking
it, if it was not worked on by the pro-active
community.

For some components, having multiple maintainers will be helpful. In that
case, one maintainer should be the "chair" or "lead"
and make sure that no issue of that component gets lost between the
multiple maintainers.


A maintainer's role is:
-----------------------------

  - Have an overview of which of the open pull requests relate to their
component
  - Drive the pull requests relating to the component to resolution
      => Moderate the decision whether the feature should be merged
      => Make sure the pull request gets a shepherd.
           In many cases, the maintainers would shepherd themselves.
      => In case the shepherd becomes inactive, the maintainers need to
find a new shepherd.

  - Have an overview of what are the known issues of their component
  - Have an overview of what are the frequently requested features of their
component

  - Have an overview of which contributors are doing very good work in
their component,
    would be candidates for committers, and should be mentored towards that.

  - Resolve email threads that have been brought to their attention,
because deeper
    component knowledge is required for that thread.

A maintainer's role is NOT:
----------------------------------

  - Review all pull requests of that component
  - Answer every mail with questions about that component
  - Fix all bugs and implement all features of that component


We imagine the community and the maintainers interacting as follows:
--------------------------------------------------------------------

  - Pull requests should be tagged by component. Since we cannot add labels
at this point, we need
    to rely on the following:
     => The pull request opener should name the pull request like
"[FLINK-XXX] [component] Title"
     => Components can be (re-)tagged by adding special comments in the
pull request ("==> component client")
     => With some luck, GitHub and Apache Infra will allow us to use labels
at some point

  - When pull requests are associated with a component, the maintainers
will manage them
    (decision whether to add, find shepherd, catch dropped pull requests)

  - We assume that maintainers frequently reach out to other community
members and ask them if they want
    to shepherd a pull request.

  - On the mailing list, everyone should feel equally empowered to answer
and discuss.
    If at some point in the discussion, some deep technical knowledge about
a component is required,
    the maintainer(s) should be drawn into the discussion.
    Because the Mailing List infrastructure has no support to tag threads,
here are some simple workarounds:

    => One possibility is to put the maintainers' mail addresses on cc for
the thread, so they get the mail
          not just via the mailing list
    => Another way would be to post something like "+maintainer runtime" in
the thread and the "runtime"
         maintainers would have a filter/alert on these keywords in their
mail program.

  - We assume that maintainers will reach out to community members that are
very active and helpful in
    a component, and will ask them if they want to be added as maintainers.
    That will make it visible that those people are experts for that part
of Flink.
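
As an illustration only, the naming and tagging conventions above could be
matched mechanically. The following is a minimal sketch in Python, not
existing tooling: the regular expressions, the function names, and the issue
number FLINK-1234 are assumptions made for the example, and it assumes that
"XXX" in the title stands for a numeric JIRA id.

```python
import re

# Assumed formats, taken from the conventions proposed above:
#   PR title:        "[FLINK-1234] [component] Title"
#   re-tag comment:  "==> component client"
#   mail mention:    "+maintainer runtime"
TITLE_RE = re.compile(r"^\[FLINK-(\d+)\]\s*\[([^\]]+)\]\s*(.+)$")
RETAG_RE = re.compile(r"^==>\s*component\s+(.+?)\s*$", re.MULTILINE)
MENTION_RE = re.compile(r"\+maintainer\s+([\w-]+)")


def component_from_title(title):
    """Return the component named in a pull request title, or None."""
    match = TITLE_RE.match(title.strip())
    return match.group(2).strip().lower() if match else None


def component_for_pull_request(title, comments):
    """The most recent re-tag comment wins; otherwise fall back to the title."""
    for comment in reversed(comments):
        match = RETAG_RE.search(comment)
        if match:
            return match.group(1).strip().lower()
    return component_from_title(title)


def mentioned_components(mail_body):
    """Return the components whose maintainers are called into a thread."""
    return sorted({name.lower() for name in MENTION_RE.findall(mail_body)})
```

For example, a pull request titled "[FLINK-1234] [runtime] Fix buffer leak"
with a later comment "==> component client" would be tracked under "client",
and a mail containing "+maintainer runtime" would alert the "runtime"
maintainers.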


======================================
Maintainers: Committers and Contributors
======================================

It helps if maintainers are committers (since we want them to resolve pull
requests which often involves
merging them).

Components with multiple maintainers can easily have non-committer
contributors in addition to committer
contributors.


======
JIRA
======

Ideally, JIRA can be used to get an overview of what are the known issues
of each component, and what are
common feature requests. Unfortunately, the Flink JIRA is quite unorganized
right now.

A natural follow-up effort of this proposal would be to define in JIRA the
same components as we defined here,
and have the maintainers keep JIRA meaningful for that particular
component. That would allow us to
easily generate some tables out of JIRA (like top known issues per
component, most requested features)
and post them on the dev list once in a while as a "state of the union" report.
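
To make the idea of such tables concrete, here is a minimal sketch in Python.
It does not talk to JIRA; it assumes the issues have already been exported
into plain records. The field names ("component", "type", "summary", "votes")
and the use of votes as the ranking measure are assumptions for this example.

```python
from collections import defaultdict


def top_issues_per_component(issues, issue_type, limit=3):
    """Group issues of one type by component and keep the most-voted ones."""
    by_component = defaultdict(list)
    for issue in issues:
        if issue["type"] == issue_type:
            by_component[issue["component"]].append(issue)
    # Components are listed alphabetically; issues by descending vote count.
    return {
        component: sorted(items, key=lambda i: i["votes"], reverse=True)[:limit]
        for component, items in sorted(by_component.items())
    }


def format_report(issues, issue_type, limit=3):
    """Render the per-component ranking as plain text for a mail to the list."""
    lines = []
    ranked = top_issues_per_component(issues, issue_type, limit)
    for component, items in ranked.items():
        lines.append(component)
        for issue in items:
            lines.append(f"  - {issue['summary']} ({issue['votes']} votes)")
    return "\n".join(lines)
```

Calling format_report(issues, "bug") would give the "top known issues per
component" table, and format_report(issues, "feature") the most requested
features, provided the component field in JIRA is kept accurate.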

Initial assignment of issues to components should be made by the people
opening the issue. The maintainer
of the tagged component needs to change the tag if the component was
classified incorrectly.


======================================
Initial Components and Maintainers Suggestion
======================================

Below is a suggestion of how to define components for Flink. One goal of
the division was to make it
obvious for the majority of questions and contributions to which component
they would relate. Otherwise,
if many contributions had fuzzy component associations, we would again not
solve the issue of having clear
responsibilities for who would track the progress and resolution.

We also looked at each component and wrote the names of some people who we
thought were natural
experts for the components, and thus natural candidates for maintainers.

**These names are only a starting point for discussion.**

Once agreed upon, the components and names of maintainers should be kept in
the wiki and updated as
components change and people step up or down.


*DataSet API* (*Fabian, Greg, Gabor*)
  - Including Hadoop compatibility parts

*DataStream API* (*Aljoscha, Max, Stephan*)

*Runtime*
  - Distributed Coordination (JobManager/TaskManager, Akka)  (*Till*)
  - Local Runtime (Memory Management, State Backends, Tasks/Operators) (*Stephan*)
  - Network (*Ufuk*)

*Client/Optimizer* (*Fabian*)

*Type system / Type extractor* (*Timo*)

*Cluster Management* (Yarn, Mesos, Docker, ...) (*Max, Robert*)

*Libraries*
  - Gelly (*Vasia, Greg*)
  - ML (*Till, Theo*)
  - CEP (*Till*)
  - Python (*Chesnay*)

*Table API & SQL* (*Fabian, Vasia, Timo, Chengxiang*)

*Streaming Connectors* (*Robert, Aljoscha*)

*Batch Connectors and Input/Output Formats* (*Chesnay*)

*Storm Compatibility Layer* (*Mathias*)

*Scala shell* (*Till*)

*Startup Shell Scripts* (*Ufuk*)

*Flink Build System, Maven Files* (*Robert*)

*Documentation* (*Ufuk*)


Please let us know what you think about this proposal.
Happy discussing!

Greetings,
Stephan
