Re: [DISCUSS] Increased use of feature branches

Karthik Kambatla Fri, 10 Jun 2016 08:29:08 -0700

Inline.

On Fri, Jun 10, 2016 at 6:56 AM, Junping Du <j...@hortonworks.com> wrote:

> Compared with the advantages, I believe the disadvantages of shipping any
> releases directly from trunk are more obvious and significant:
> - A lot of commits (incompatible, risky, incomplete features, etc.) have
> to wait to be committed to trunk or be put into a separate branch, which
> could delay feature development because an additional vote process gets
> involved even when the feature is simple and harmless.
>

Including these sorts of commits in trunk is a major pain.

One example from a recent mistake I made:
YARN-2877 and YARN-1011 had some common changes. Instead of putting them in
a separate branch, I committed these common changes to trunk because, well,
we don't release from trunk, so what could go wrong? After a few days, other
contributors and committers grew annoyed at having to submit
two different patches for trunk and branch-2. This inconvenience led to
those patches being pulled into branch-2 even though they were not ready
for inclusion in branch-2 or a 2.x release.

I feel the major friction with feature branches comes from only some
features using them. If everyone uses feature branches and we have better
processes around quantifying the stability of a feature branch, feature
branches should make for a smoother experience for everyone.

It is not uncommon for features to get merged into trunk before being ready,
with promises of follow-up work. While that might very well be the intent
of contributors, other work items come up and things get sidelined. How
often have we seen features without HA and security?

>
> - These commits left in separate branches are isolated and have more
> chance to conflict with each other, and more bugs could be introduced due
> to conflicts and/or fewer eyes watching or blessing isolated branches.
>

Partially agree. There is a tradeoff here: if we keep putting them into
trunk, they (1) destabilize trunk, and (2) conflict with other bug fixes
and smaller improvements.

>
> - More unnecessary arguments/debates will happen about whether some commits
> should land on trunk or a separate branch, just like we have had recently.
>

Again, clearly defining the requirements to be merged into trunk will make
this easier. How is this different from what we do today for branch-2? If
we still have debates, perhaps they are needed. Not having them today is
arguably the real concern.

>
> - Because the number of branches will increase massively, more community
> effort will be spent on reviewing and voting on branch merges, which means
> less effort will be spent on reviewing other commits, given that our review
> bandwidth is already quite short.
>

Yes and no. Strictly using feature branches will serialize features.
Integrating with other features is a one-time, albeit more involved,
process instead of multiple rebases/resolutions, each somewhat involved.

>
> - For a small feature with only 1 or 2 commits, needing three +1s from PMC
> members will raise the bar considerably for contributors who are just
> starting to contribute Hadoop features but lack that level of support.
>

If a feature/improvement is not supported by 3 committers (not PMC
members), it is probably worth looking at why. Maybe this feature should
not be included at all?

I am open to changing the requirements for a merge. What do you think of
one +1 (thorough review) and two +0s (high-level review)?

If the concern is finding enough committers, I would like for the PMC to
consider voting in more committers and increasing bandwidth.

>
> Given these concerns, I am open to other options, such as those proposed by
> Vinod or Chris, rather than releasing anything directly from trunk.
>

I actually thought this was Vinod's proposal. My understanding is Andrew is
resurfacing this so we finalize things.

>
> > This point doesn't necessarily need to be resolved now though, since
> > again we're still doing alphas.
> No. I think we have to settle this first. Without a commonly agreed and
> transparent release process and branching model in the community, any
> release (alpha, beta) bits can only be called a private release, not an
> official Apache Hadoop release (even an alpha).
>
>
I am absolutely with Junping here. Changing this process primarily requires
a change in our mental model. I think it is pretty important that we decide
on one approach, preferably before doing an alpha release.

To clarify: our current approach (trunk and branch-2) has been working
okay. The only issue I see is in the way we take merging into trunk
lightly. If we have well-defined requirements for merging to trunk and take
those seriously, I am comfortable with using the approach for 3.x. The new
proposal forces following these requirements and hence I like it more.

>
> Thanks,
>
> Junping
> ________________________________________
> From: Karthik Kambatla <ka...@cloudera.com>
> Sent: Friday, June 10, 2016 7:49 AM
> To: Andrew Wang
> Cc: common-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org;
> mapreduce-...@hadoop.apache.org; yarn-...@hadoop.apache.org
> Subject: Re: [DISCUSS] Increased use of feature branches
>
> Thanks for restarting this thread Andrew. I really hope we can get this
> across to a VOTE so it is clear.
>
> I see a few advantages to shipping from trunk:
>
>    - No need for one additional backport each time.
>    - Less feature rot in trunk.
>
> Instead of creating branch-3, I recommend creating branch-3.x so we can
> continue doing 3.x releases off branch-3 even after we move trunk to 4.x (I
> said it :))
>
> On Thu, Jun 9, 2016 at 11:12 PM, Andrew Wang <andrew.w...@cloudera.com>
> wrote:
>
> > Hi all,
> >
> > On a separate thread, a question was raised about 3.x branching and use
> > of feature branches going forward.
> >
> > We discussed this previously on the "Looking to a Hadoop 3 release"
> > thread that has spanned the years, with Vinod making this proposal
> > (building on ideas from others who also commented in the email thread):
> >
> >
> >
> http://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201604.mbox/browser
> >
> > Pasting here for ease:
> >
> > On an unrelated note, offline I was pitching to a bunch of
> > contributors another idea to deal
> > with rotting trunk post 3.x: *Make 3.x releases off of trunk directly*.
> >
> > What this gains us is that
> >  - Trunk is always nearly stable or nearly ready for releases
> >  - We no longer have some code lying around in some branch (today’s
> > trunk) that is not releasable
> > because it gets mixed with other undesirable and incompatible changes.
> >  - This needs to be coupled with more discipline on individual
> > features - medium to large
> > features are always worked upon in branches and get merged into trunk
> > (and a nearing release!)
> > when they are ready
> >  - All incompatible changes go into some sort of a trunk-incompat
> > branch and stay there till
> > we accumulate enough of those to warrant another major release.
> >
> > Regarding "trunk-incompat", since we're still in the alpha stage for
> > 3.0.0, there's no need for this branch yet. This aspect of Vinod's
> > proposal was still under a bit of discussion; Chris Douglas thought we
> > should cut a branch-3 for the first 3.0.0 beta, which aligns with my
> > original thinking. This point doesn't necessarily need to be resolved now
> > though, since again we're still doing alphas.
> >
> > What we should get consensus on is the goal of keeping trunk stable, and
> > achieving that by doing more development on feature branches and being
> > judicious about merges. My sense from the Hadoop 3 email thread (and the
> > more recent one on the async API) is that people are generally in favor
> > of this.
> >
> > We're just about ready to do the first 3.0.0 alpha, so would greatly
> > appreciate everyone's timely response in this matter.
> >
> > Thanks,
> > Andrew
> >
>
