Inline.

On Fri, Jun 10, 2016 at 6:56 AM, Junping Du <j...@hortonworks.com> wrote:

> Comparing with advantages, I believe the disadvantages of shipping any
> releases directly from trunk are more obvious and significant:
> - A lot of commits (incompatible, risky, uncompleted feature, etc.) have
> to wait to commit to trunk or put into a separated branch that could delay
> feature development progress as additional vote process get involved even
> the feature is simple and harmless.
>

Including these sorts of commits in trunk is a major pain. One example from a recent mistake I made: YARN-2877 and YARN-1011 had some common changes. Instead of putting them in a separate branch, I committed these common changes to trunk because, well, we don't release from trunk, so what could go wrong? After a few days, other contributors and committers started feeling annoyed about having to submit two different patches for trunk and branch-2. This inconvenience led to those patches being pulled into branch-2 even though they were not ready for inclusion in branch-2 or a 2.x release.

I feel the major friction for feature branches comes from only some features using them. If everyone uses feature branches and we have better processes around quantifying the stability of a feature branch, feature branches should make for a smoother experience for everyone.

It is not uncommon for features to get merged into trunk before being ready, with promises of follow-up work. While that might very well be the intent of contributors, other work items come up and things get sidelined. How often have we seen features without HA and security?

>
> - These commits left in separated branches are isolated and get more
> chance to conflict each other, and more bugs could be involved due to
> conflicts and/or less eyes watching/bless on isolated branches.
>

Partially agree. There is a tradeoff here: if we keep putting them into trunk, they (1) destabilize trunk, and (2) conflict with other bug fixes and smaller improvements.

>
> - More unnecessary arguments/debates will happen on if some commits should
> land on trunk or a separated branch, just like what we have recently.
>

Again, clearly defining the requirements to be merged into trunk will make this easier. How is this different from what we do today for branch-2? If we still have debates, they are probably necessary. If anything, not having them today is a concern.

>
> - Because branches will get increased massively, more community efforts
> will be spent on review & vote for branches merge that means less effort
> will be spent on other commits review given our review bandwidth is quite
> short so far.
>

Yes and no. Strictly using feature branches will serialize features. Integrating with other features becomes a one-time, albeit more involved, process instead of multiple rebases/resolutions, each somewhat involved.

>
> - For small feature with only 1 or 2 commits, that need three +1 from PMCs
> will increase the bar largely for contributors who just start to contribute
> on Hadoop features but no such sufficient support.
>

If a feature/improvement is not supported by 3 committers (not PMC members), it is probably worth looking at why. Maybe this feature should not be included at all? I am open to changing the requirements for a merge. What do you think of one +1 (thorough review) and two +0s (high-level review)? If the concern is finding enough committers, I would like the PMC to consider voting in more committers and increasing bandwidth.

>
> Given these concerns, I am open to other options, like: proposed by Vinod
> or Chris, but rather than to release anything directly from trunk.
>

I actually thought this was Vinod's proposal. My understanding is Andrew is resurfacing this so we finalize things.

>
> - This point doesn't necessarily need to be resolved now though, since
> again we're still doing alphas.
> No. I think we have to settle down this first. Without a common agreed and
> transparent release process and branches in community, any release (alpha,
> beta) bits is only called a private release but not a official apache
> hadoop release (even alpha).
>

I am absolutely with Junping here. Changing this process primarily requires a change in our mental model. I think it is pretty important that we decide on one approach, preferably before doing an alpha release.

To clarify: our current approach (trunk and branch-2) has been working okay. The only issue I see is in the way we take merging into trunk lightly. If we have well-defined requirements for merging to trunk and take those seriously, I am comfortable with using the current approach for 3.x. The new proposal forces following these requirements and hence I like it more.

>
> Thanks,
>
> Junping
> ________________________________________
> From: Karthik Kambatla <ka...@cloudera.com>
> Sent: Friday, June 10, 2016 7:49 AM
> To: Andrew Wang
> Cc: common-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org;
> mapreduce-...@hadoop.apache.org; yarn-...@hadoop.apache.org
> Subject: Re: [DISCUSS] Increased use of feature branches
>
> Thanks for restarting this thread Andrew. I really hope we can get this
> across to a VOTE so it is clear.
>
> I see a few advantages shipping from trunk:
>
> - The lack of need for one additional backport each time.
> - Feature rot in trunk
>
> Instead of creating branch-3, I recommend creating branch-3.x so we can
> continue doing 3.x releases off branch-3 even after we move trunk to 4.x (I
> said it :))
>
> On Thu, Jun 9, 2016 at 11:12 PM, Andrew Wang <andrew.w...@cloudera.com>
> wrote:
>
> > Hi all,
> >
> > On a separate thread, a question was raised about 3.x branching and use of
> > feature branches going forward.
> >
> > We discussed this previously on the "Looking to a Hadoop 3 release" thread
> > that has spanned the years, with Vinod making this proposal (building on
> > ideas from others who also commented in the email thread):
> >
> > http://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201604.mbox/browser
> >
> > Pasting here for ease:
> >
> > On an unrelated note, offline I was pitching to a bunch of contributors
> > another idea to deal with rotting trunk post 3.x: *Make 3.x releases off
> > of trunk directly*.
> >
> > What this gains us is that
> > - Trunk is always nearly stable or nearly ready for releases
> > - We no longer have some code lying around in some branch (today’s
> >   trunk) that is not releasable because it gets mixed with other
> >   undesirable and incompatible changes.
> > - This needs to be coupled with more discipline on individual
> >   features - medium to large features are always worked upon in
> >   branches and get merged into trunk (and a nearing release!) when
> >   they are ready
> > - All incompatible changes go into some sort of a trunk-incompat
> >   branch and stay there till we accumulate enough of those to warrant
> >   another major release.
> >
> > Regarding "trunk-incompat", since we're still in the alpha stage for 3.0.0,
> > there's no need for this branch yet. This aspect of Vinod's proposal was
> > still under a bit of discussion; Chris Douglas thought we should cut a
> > branch-3 for the first 3.0.0 beta, which aligns with my original thinking.
> > This point doesn't necessarily need to be resolved now though, since again
> > we're still doing alphas.
> >
> > What we should get consensus on is the goal of keeping trunk stable, and
> > achieving that by doing more development on feature branches and being
> > judicious about merges. My sense from the Hadoop 3 email thread (and the
> > more recent one on the async API) is that people are generally in favor of
> > this.
> >
> > We're just about ready to do the first 3.0.0 alpha, so would greatly
> > appreciate everyone's timely response in this matter.
> >
> > Thanks,
> > Andrew