Re: [PROPOSAL] [community] A more structured approach to reviews and contributions

陈梓立 Wed, 19 Sep 2018 02:26:09 -0700

Hi Piotr,

I strongly agree with your idea. Since we work on a huge project, diff in
the PR becomes hard to review if there is too much noise.


Just to clarify the process. Do you mean that a pull request should go into
the way below?

Separated commits during the implement, mainly distinguish feature/bug fix
with clean up/rework.

[FLINK-XXX] [module] future/bug fix...
[FLINK-XXX] [module] more on future/bug fix...
[hotfix] clean up/rework...

and so on.

And finally, when get merged, put all stuff into one commit and comments
close #PRID
so coming ones could see the detail by jump to #PRID.

One thing to trade off is that if we merge by one commit, we cannot revert
part of it automated; if we merge by PR commits(one by one), the commit log
might mess.

Best,
tison.


Piotr Nowojski <pi...@data-artisans.com> 于2018年9月19日周三 下午5:10写道：

> Hi,
>
> I would like to rise one more issue. Often contributions are very heavy
> and difficult to review. They have one big commit that changes multiple
> things which is difficult to review. Instead I would be in favour of
> implementing a rule, that a single commit should never do more then one
> thing. For example never mixing refactoring/renames/clean ups/code
> deduplications with functional changes. If implementing something requires
> two independent changes in two separate components, those also should be
> two independent commits. In other words, if there are two changed lines in
> a single commit, they should interact somehow together and strictly depend
> on one another. This has couple of important advantages:
>
> 1. Obviously faster reviews. Especially if a reviewer do not have to find
> 2 lines bug fix among 200 lines of renames/refactoring.
> 2. Provides good “cut out” points for reviewer. For example he can easily
> interrupt reviewing in the middle and continue later or even merge PR in
> stages.
> 3. Better reference “why something was done this way not the other” for
> the future. This is the same argument as first point, however with benefit
> not during reviewing, but when after merging someone is trying to
> understand the code.
> 4. Commit message becomes much better place to write down reasons why
> something was done and what are the effects (not that this should replace
> comments/documentation, only to complement it).
> 5. In case of need to revert/drop some part of the contribution, we are
> not loosing all of it. If we have to revert some small feature, it would be
> easier to keep refactoring and clean ups.
>
>
> Some examples of PRs that were more or less following this rule:
> https://github.com/apache/flink/pull/6692/commits
> https://github.com/apache/flink/pull/5423/commits <
> https://github.com/apache/flink/pull/5423/commits> (a bit extreme)
>
> If someone is not convinced I encourage to open those PRs and browse
> through couple of first commits (which are refactoring/clean up commits)
> one by one (GitHub has next/prev commit button). Then imagine if they were
> squashed with some functional/performance improvement changes.
>
> Piotrek
>
> > On 18 Sep 2018, at 17:12, 陈梓立 <wander4...@gmail.com> wrote:
> >
> > Put some good cases here might be helpful.
> >
> > See how a contribution of runtime module be proposed, discussed,
> > implemented and merged from  https://github.com/apache/flink/pull/5931
> to
> > https://github.com/apache/flink/pull/6132.
> >
> > 1. #5931 fix a bug, but remains points could be improved. Here sihuazhou
> > and shuai-xu share their considerations and require review(of the
> proposal)
> > by Stephan, Till and Gary, our committers.
> > 2. After discussion, all people involved reach a consensus. So the rest
> > work is to implement it.
> > 3. sihuazhou gives out an implementation #6132, Till reviews it and find
> it
> > is somewhat out of the "architectural" aspect, so suggests
> > implementation-level changes.
> > 4. Addressing those implementation-level comments, the PR gets merged.
> >
> > I think this is quite a good example how we think our review process
> should
> > go.
> >
> > Best,
> > tison.
> >
> >
> > 陈梓立 <wander4...@gmail.com> 于2018年9月18日周二 下午10:53写道：
> >
> >> Maybe a little rearrange to the process would help.
> >>
> >> (1). Does the contributor describe itself well?
> >>  (1.1) By whom this contribution should be given attention. This often
> >> shows by its title, "[FLINK-XXX] [module]", the module part infer.
> >>  (1.2) What the purpose of this contribution is. Done by the PR
> template.
> >> Even on JIRA an issue should cover these points.
> >>
> >> (2). Is there consensus on the contribution?
> >> This follows (1), because we need to clear what the purpose of the
> >> contribution first. At this stage reviewers could cc to module
> maintainer
> >> as a supplement to (1.1). Also reviewers might ask the contributor to
> >> clarify his purpose to sharp(1.2)
> >>
> >> (3). Is the implement architectural and fit code style?
> >> This follows (2). And only after a consensus we talk about concrete
> >> implement, which prevent spend time and put effort in vain.
> >>
> >> In addition, ideally a "+1" comment or approval means the purpose of
> >> contribution is supported by the reviewer and implement(if there is)
> >> quality is fine, so the reviewer vote for a consensus.
> >>
> >> Best,
> >> tison.
> >>
> >>
> >> Stephan Ewen <se...@apache.org> 于2018年9月18日周二 下午6:44写道：
> >>
> >>> On the template discussion, some thoughts
> >>>
> >>> *PR Template*
> >>>
> >>> I think the PR template went well. We can rethink the "checklist" at
> the
> >>> bottom, but all other parts turned out helpful in my opinion.
> >>>
> >>> With the amount of contributions, it helps to ask the contributor to
> take
> >>> a
> >>> little more work in order for the reviewer to be more efficient.
> >>> I would suggest to keep that mindset: Whenever we find a way that the
> >>> contributor can prepare stuff in such a way that reviews become
> >>> more efficient, we should do that. In my experience, most contributors
> are
> >>> willing to put in some extra minutes if it helps that their
> >>> PR gets merged faster.
> >>>
> >>> *Review Template*
> >>>
> >>> I think it would be helpful to have this checklist. It does not matter
> in
> >>> which form, be that as a text template, be that as labels.
> >>>
> >>> The most important thing is to make explicit which questions have been
> >>> answered in the review.
> >>> Currently there is a lot of "+1" on pull requests which means "code
> >>> quality
> >>> is fine", but all other questions are unanswered.
> >>> The contributors then rightfully wonder why this does not get merged.
> >>>
> >>>
> >>>
> >>> On Tue, Sep 18, 2018 at 7:26 AM, 陈梓立 <wander4...@gmail.com> wrote:
> >>>
> >>>> Hi all interested,
> >>>>
> >>>> Within the document there is a heated discussion about how the PR
> >>>> template/review template should be.
> >>>>
> >>>> Here share my opinion:
> >>>>
> >>>> 1. For the review template, actually we don't need comment a review
> >>>> template at all. GitHub has a tag system and only committer could add
> >>> tags,
> >>>> which we can make use of it. That is, tagging this PR is
> >>>> waiting-for-proposal-approved, waiting-for-code-review,
> >>>> waiting-for-benchmark or block-by-author and so on. Asfbot could pick
> >>>> GitHub tag state to the corresponding JIRA and we always regard JIRA
> as
> >>> the
> >>>> main discussion borad.
> >>>>
> >>>> 2. For the PR template, the greeting message is redundant. Just
> >>> emphasize a
> >>>> JIRA associated is important and how to format the title is enough.
> >>>> Besides, the "Does this pull request potentially affect one of the
> >>>> following parts" part and "Documentation" should be coved from "What
> is
> >>> the
> >>>> purpose of the change" and "Brief change log". These two parts, users
> >>>> always answer no and would be aware if they really make changes on it.
> >>> As
> >>>> example, even pull request requires document, its owner might no add
> it
> >>> at
> >>>> first. The PR template is a guide but not which one have to learn.
> >>>>
> >>>> To sum up, (1) take advantage of GitHub's tag system to tag review
> >>> progress
> >>>> (2) make the template more concise to avoid burden mature contributors
> >>> and
> >>>> force new comer to learn too much.
> >>>>
> >>>> Best,
> >>>> tison.
> >>>>
> >>>>
> >>>> Rong Rong <walter...@gmail.com> 于2018年9月18日周二 上午7:05写道：
> >>>>
> >>>>> Thanks for putting the review contribution doc together, Stephan!
> This
> >>>> will
> >>>>> definitely help the community to make the review process better.
> >>>>>
> >>>>> From my experience this will benefit on both contributors and
> >>> reviewers
> >>>>> side! Thus +1 for putting into practice as well.
> >>>>>
> >>>>> --
> >>>>> Rong
> >>>>>
> >>>>> On Mon, Sep 17, 2018 at 10:18 AM Stephan Ewen <se...@apache.org>
> >>> wrote:
> >>>>>
> >>>>>> Hi!
> >>>>>>
> >>>>>> Thanks you for the encouraging feedback so far.
> >>>>>>
> >>>>>> The overall goal is definitely to make the contribution process
> >>> better
> >>>>> and
> >>>>>> get fewer pull requests that are disregarded.
> >>>>>>
> >>>>>> There are various reasons for the disregarded pull requests, one
> >>> being
> >>>>> that
> >>>>>> fewer committers really participate in reviews beyond
> >>>>>> the component they are currently very involved with. This is a
> >>> separate
> >>>>>> issue and I am thinking on how to encourage more
> >>>>>> activity there.
> >>>>>>
> >>>>>> The other reason I was lack of structure and lack of decision
> >>> making,
> >>>>> which
> >>>>>> is what I am first trying to fix here.
> >>>>>> A follow-up to this will definitely be to improve the contribution
> >>>> guide
> >>>>> as
> >>>>>> well.
> >>>>>>
> >>>>>> Best,
> >>>>>> Stephan
> >>>>>>
> >>>>>>
> >>>>>> On Mon, Sep 17, 2018 at 12:05 PM, Zhijiang(wangzhijiang999) <
> >>>>>> wangzhijiang...@aliyun.com.invalid> wrote:
> >>>>>>
> >>>>>>> From my personal experience as a contributor for three years, I
> >>> feel
> >>>>>>> better experience in contirbuting or reviewing than before,
> >>> although
> >>>> we
> >>>>>>> still have some points for further progress.
> >>>>>>>
> >>>>>>> I reviewed the proposal doc, and it gives very constructive and
> >>>>>> meaningful
> >>>>>>> guides which could help both contributor and reviewer. I agree
> >>> with
> >>>> the
> >>>>>>> bove suggestions and wish they can be praticed well!
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Zhijiang
> >>>>>>> ------------------------------------------------------------------
> >>>>>>> 发件人：Till Rohrmann <trohrm...@apache.org>
> >>>>>>> 发送时间：2018年9月17日(星期一) 16:27
> >>>>>>> 收件人：dev <dev@flink.apache.org>
> >>>>>>> 主 题：Re: [PROPOSAL] [community] A more structured approach to
> >>> reviews
> >>>>> and
> >>>>>>> contributions
> >>>>>>>
> >>>>>>> Thanks for writing this up Stephan. I like the steps and hope
> >>> that it
> >>>>>> will
> >>>>>>> help the community to make the review process better. Thus, +1 for
> >>>>>> putting
> >>>>>>> your proposal to practice.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Till
> >>>>>>>
> >>>>>>> On Mon, Sep 17, 2018 at 10:00 AM Stephan Ewen <se...@apache.org>
> >>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi Flink community members!
> >>>>>>>>
> >>>>>>>> As many of you will have noticed, the Flink project activity has
> >>>> gone
> >>>>>> up
> >>>>>>>> again quite a bit.
> >>>>>>>> There are many more contributions, which is an absolutely great
> >>>> thing
> >>>>>> to
> >>>>>>>> have :-)
> >>>>>>>>
> >>>>>>>> However, we see a continuously growing backlog of pull requests
> >>> and
> >>>>>> JIRA
> >>>>>>>> issues.
> >>>>>>>> To make sure the community will be able to handle the increased
> >>>>>> volume, I
> >>>>>>>> think we need to revisit some
> >>>>>>>> approaches and processes. I believe there are a few
> >>> opportunities
> >>>> to
> >>>>>>>> structure things a bit better, which
> >>>>>>>> should help to scale the development.
> >>>>>>>>
> >>>>>>>> The first thing I would like to bring up are *Pull Request
> >>>> Reviews*.
> >>>>>> Even
> >>>>>>>> though more community members being
> >>>>>>>> active in reviews (which is a really great thing!) the Pull
> >>> Request
> >>>>>>> backlog
> >>>>>>>> is increasing quite a bit.
> >>>>>>>>
> >>>>>>>> Why are pull requests still not merged faster? Looking at the
> >>>>> reviews,
> >>>>>>> one
> >>>>>>>> thing I noticed is that most reviews deal
> >>>>>>>> immediately with detailed code issues, and leave out most of the
> >>>> core
> >>>>>>>> questions that need to be answered
> >>>>>>>> before a Pull Request can be merged, like "is this a desired
> >>>>> feature?"
> >>>>>> or
> >>>>>>>> "does this align well with other developments?".
> >>>>>>>> I think that we even make things slightly worse that way: From
> >>> my
> >>>>>>> personal
> >>>>>>>> experience, I have often thought "oh, this
> >>>>>>>> PR has a review already" and rather looked at another PR, only
> >>> to
> >>>>> find
> >>>>>>>> later that the first review did never decide whether
> >>>>>>>> this PR is actually a good fit for Flink.
> >>>>>>>>
> >>>>>>>> There has never been a proper documentation of how to answer
> >>> these
> >>>>>>>> questions, what to evaluate in reviews,
> >>>>>>>> guidelines for how to evaluate pull requests, other than code
> >>>>> quality.
> >>>>>> I
> >>>>>>>> suspect that this is why so many reviewers
> >>>>>>>> do not address the "is this a good contribution" questions,
> >>> making
> >>>>> pull
> >>>>>>>> requests linger until another committers joins
> >>>>>>>> the review.
> >>>>>>>>
> >>>>>>>> Below is an idea for a guide *"How to Review Contributions"*. It
> >>>>>> outlines
> >>>>>>>> five core aspects to be checked in every
> >>>>>>>> pull request, and suggests a priority for clarifying those. The
> >>>> idea
> >>>>> is
> >>>>>>>> that this helps us to better structure reviews, and
> >>>>>>>> to make each reviewer aware what we look for in a review and
> >>> where
> >>>> to
> >>>>>>> best
> >>>>>>>> bring in their help.
> >>>>>>>>
> >>>>>>>> Looking forward to comments!
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Stephan
> >>>>>>>>
> >>>>>>>> ====================================
> >>>>>>>>
> >>>>>>>> The draft is in this Google Doc. Please add small textual
> >>> comments
> >>>> to
> >>>>>> the
> >>>>>>>> doc, and bigger principle discussions as replies to this mail.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> https://docs.google.com/document/d/1yaX2b9LNh-6LxrAmE23U3D2c
> >>>>>>> RbocGlGKCYnvJd9lVhk/edit?usp=sharing
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> *How to Review Contributions------------------------------This
> >>>> guide
> >>>>> is
> >>>>>>> for
> >>>>>>>> all committers and contributors that want to help with reviewing
> >>>>>>>> contributions. Thank you for your effort - good reviews are one
> >>> the
> >>>>>> most
> >>>>>>>> important and crucial parts of an open source project. This
> >>> guide
> >>>>>> should
> >>>>>>>> help the community to make reviews such that: - Contributors
> >>> have a
> >>>>>> good
> >>>>>>>> contribution experience- Reviews are structured and check all
> >>>>> important
> >>>>>>>> aspects of a contribution- Make sure we keep a high code
> >>> quality in
> >>>>>>> Flink-
> >>>>>>>> We avoid situations where contributors and reviewers spend a
> >>> lot of
> >>>>>> time
> >>>>>>> to
> >>>>>>>> refine a contribution that gets rejected laterReview
> >>> ChecklistEvery
> >>>>>>> review
> >>>>>>>> needs to check the following five aspects. We encourage to check
> >>>>> these
> >>>>>>>> aspects in order, to avoid spending time on detailed code
> >>> quality
> >>>>>> reviews
> >>>>>>>> when there is not yet consensus that a feature or change should
> >>> be
> >>>>>>> actually
> >>>>>>>> be added.(1) Is there consensus whether the change of feature
> >>>> should
> >>>>> go
> >>>>>>>> into to Flink?For bug fixes, this needs to be checked only in
> >>> case
> >>>> it
> >>>>>>>> requires bigger changes or might break existing programs and
> >>>>>>>> setups.Ideally, this question is already answered from a JIRA
> >>> issue
> >>>>> or
> >>>>>>> the
> >>>>>>>> dev-list discussion, except in cases of bug fixes and small
> >>>>> lightweight
> >>>>>>>> additions/extensions. In that case, this question can be
> >>>> immediately
> >>>>>>> marked
> >>>>>>>> as resolved. For pull requests that are created without prior
> >>>>>> consensus,
> >>>>>>>> this question needs to be answered as part of the review.The
> >>>> decision
> >>>>>>>> whether the change should go into Flink needs to take the
> >>> following
> >>>>>>> aspects
> >>>>>>>> into consideration: - Does the contribution alter the behavior
> >>> of
> >>>>>>> features
> >>>>>>>> or components in a way that it may break previous users’
> >>> programs
> >>>> and
> >>>>>>>> setups? If yes, there needs to be a discussion and agreement
> >>> that
> >>>>> this
> >>>>>>>> change is desirable. - Does the contribution conceptually fit
> >>> well
> >>>>> into
> >>>>>>>> Flink? Is it too much of special case such that it makes things
> >>>> more
> >>>>>>>> complicated for the common case, or bloats the abstractions /
> >>>> APIs? -
> >>>>>>> Does
> >>>>>>>> the feature fit well into Flink’s architecture? Will it scale
> >>> and
> >>>>> keep
> >>>>>>>> Flink flexible for the future, or will the feature restrict
> >>> Flink
> >>>> in
> >>>>>> the
> >>>>>>>> future? - Is the feature a significant new addition (rather
> >>> than an
> >>>>>>>> improvement to an existing part)? If yes, will the Flink
> >>> community
> >>>>>> commit
> >>>>>>>> to maintaining this feature? - Does the feature produce added
> >>> value
> >>>>> for
> >>>>>>>> Flink users or developers? Or does it introduce risk of
> >>> regression
> >>>>>>> without
> >>>>>>>> adding relevant user or developer benefit?All of these questions
> >>>>> should
> >>>>>>> be
> >>>>>>>> answerable from the description/discussion in JIRA and Pull
> >>>> Request,
> >>>>>>>> without looking at the code.(2) Does the contribution need
> >>>> attention
> >>>>>> from
> >>>>>>>> some specific committers and is there time commitment from these
> >>>>>>>> committers?Some changes require attention and approval from
> >>>> specific
> >>>>>>>> committers. For example, changes in parts that are either very
> >>>>>>> performance
> >>>>>>>> sensitive, or have a critical impact on distributed coordination
> >>>> and
> >>>>>>> fault
> >>>>>>>> tolerance need input by a committer that is deeply familiar with
> >>>> the
> >>>>>>>> component.As a rule of thumb, this is the case when the Pull
> >>>> Request
> >>>>>>>> description answers one of the questions in the template section
> >>>>> “Does
> >>>>>>> this
> >>>>>>>> pull request potentially affect one of the following parts” with
> >>>>>>> ‘yes’.This
> >>>>>>>> question can be answered with - Does not need specific
> >>> attention-
> >>>>> Needs
> >>>>>>>> specific attention for X (X can be for example checkpointing,
> >>>>>> jobmanager,
> >>>>>>>> etc.).- Has specific attention for X by @commiterA,
> >>> @contributorBIf
> >>>>> the
> >>>>>>>> pull request needs specific attention, one of the tagged
> >>>>>>>> committers/contributors should give the final approval.(3) Is
> >>> the
> >>>>>>>> contribution described well?Check whether the contribution is
> >>>>>>> sufficiently
> >>>>>>>> well described to support a good review. Trivial changes and
> >>> fixes
> >>>> do
> >>>>>> not
> >>>>>>>> need a long description. Any pull request that changes
> >>>> functionality
> >>>>> or
> >>>>>>>> behavior needs to describe the big picture of these changes, so
> >>>> that
> >>>>>>>> reviews know what to look for (and don’t have to dig through the
> >>>> code
> >>>>>> to
> >>>>>>>> hopefully understand what the change does).Changes that require
> >>>>> longer
> >>>>>>>> descriptions are ideally based on a prior design discussion in
> >>> the
> >>>>>>> mailing
> >>>>>>>> list or in JIRA and can simply link to there or copy the
> >>>> description
> >>>>>> from
> >>>>>>>> there.(4) Does the implementation follow the right overall
> >>>>>>>> approach/architecture?Is this the best approach to implement the
> >>>> fix
> >>>>> or
> >>>>>>>> feature, or are there other approaches that would be easier,
> >>> more
> >>>>>> robust,
> >>>>>>>> or more maintainable?This question should be answerable from the
> >>>> Pull
> >>>>>>>> Request description (or the linked JIRA) as much as possible.We
> >>>>>> recommend
> >>>>>>>> to check this before diving into the details of commenting on
> >>>>>> individual
> >>>>>>>> parts of the change.(5) Is the overall code quality good,
> >>> meeting
> >>>>>>> standard
> >>>>>>>> we want to maintain in Flink?This is the detailed code review of
> >>>> the
> >>>>>>> actual
> >>>>>>>> changes, covering: - Are the changes doing what is described in
> >>> the
> >>>>>>> design
> >>>>>>>> document or PR description?- Does the code follow the right
> >>>> software
> >>>>>>>> engineering practices? It the code correct, robust,
> >>> maintainable,
> >>>>>>>> testable?- Are the change performance aware, when changing a
> >>>>>> performance
> >>>>>>>> sensitive part?- Are the changes sufficiently covered by tests?-
> >>>> Are
> >>>>>> the
> >>>>>>>> tests executing fast?- Does the code format follow Flink’s
> >>>> checkstyle
> >>>>>>>> pattern?- Does the code avoid to introduce additional compiler
> >>>>>>>> warnings?Some code style guidelines can be found in the [Flink
> >>> Code
> >>>>>> Style
> >>>>>>>> Page](https://flink.apache.org/contribute-code.html#code-style
> >>>>>>>> <https://flink.apache.org/contribute-code.html#code-style>)Pull
> >>>>>> Request
> >>>>>>>> Review TemplateAdd the following checklist to the pull request
> >>>>> review,
> >>>>>>>> checking the boxes as the questions are answered:  - [ ]
> >>> Consensus
> >>>>> that
> >>>>>>> the
> >>>>>>>> contribution should go into to Flink  - [ ] Does not need
> >>> specific
> >>>>>>>> attention | Needs specific attention for X | Has attention for X
> >>>> by Y
> >>>>>> -
> >>>>>>> [
> >>>>>>>> ] Contribution description  - [ ] Architectural approach  - [ ]
> >>>>> Overall
> >>>>>>>> code quality*
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
>
>

Re: [PROPOSAL] [community] A more structured approach to reviews and contributions

Reply via email to