On 6/13/16, 12:41 PM, "Anu Engineer" <aengin...@hortonworks.com> wrote:
>Hi Colin,
>
>>Even if everyone used branches for all development, person X might merge
>>their branch before person Y, forcing person Y to do a rebase or merge.
>>It is not the presence or absence of branches that causes the need to
>>merge or rebase, but the presence or absence of "churn."
>
>You are perfectly right on this technically. The issue is when a
>branch developer gets caught in a Commit, Revert, let-us-commit-again,
>oh-it-is-not-fixed-completely, let-us-revert-the-revert cycle.
>
>I was hoping that branches would be exposed to less of this if everyone
>had private branches and got some time to test and bake the feature
>instead of just directly committing to trunk and then testing.
>
>Once again, I agree with your point that in a perfect world, merges should
>be about the churn, but trunk is often treated as a development branch,
>so my point is that it gets unnecessary churn. I really appreciate the
>thought in the thread - that is - let us be more responsible about how we
>treat trunk.
>
>> I thought the feature branch merge voting period had been shortened to 5
>>days rather than 7? We should probably spell this out on
>>https://hadoop.apache.org/bylaws.html
>
>Thanks for the link, right now it says 7 days. That is why I assumed it
>is 7.
>Would you be kind enough to point me to a thread that says it is 5 days
>for a merge Vote?
>I did a google search, but was not able to find a thread like that.
>Thanks in advance.

I remember the 5-day voting period was related to releases. I am not sure we discussed the branch merge voting period at that time.
Here is the link: http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201406.mbox/%3C64A 2c234-dd6a-4e4c-b52d-e91d5d472...@hortonworks.com%3E

>
>Thanks
>Anu
>
>
>On 6/13/16, 11:51 AM, "Colin McCabe" <cmcc...@apache.org> wrote:
>
>>On Sun, Jun 12, 2016, at 05:06, Steve Loughran wrote:
>>> > On 10 Jun 2016, at 20:37, Anu Engineer <aengin...@hortonworks.com> wrote:
>>> >
>>> > I actively work on two branches (Diskbalancer and ozone) and I agree with most of what Sangjin said.
>>> > There is an overhead in working with branches, there are both technical costs and administrative issues
>>> > which discourage developers from using branches.
>>> >
>>> > I think the biggest issue with branch based development is the fact that other developers do not use a branch.
>>> > If a small feature appears as a series of commits to "datanode.java", the branch based developer ends up rebasing
>>> > and paying this price of rebasing many times. If everyone followed a model of branch + Pull request, other branches
>>> > would not have to deal with continuous rebasing to trunk commits. If we are moving to a branch based
>>
>>Even if everyone used branches for all development, person X might merge
>>their branch before person Y, forcing person Y to do a rebase or merge.
>>It is not the presence or absence of branches that causes the need to
>>merge or rebase, but the presence or absence of "churn."
>>
>>We try to minimize "churn" in many ways. For example, we discourage
>>people from making trivial whitespace changes to parts of the code
>>they're not modifying in their patch. Or doing things like letting
>>their editor change the line ending of files from LF to CR/LF. However,
>>in the final analysis, churn will always exist because development
>>exists.
>>
>>> > development, we should probably move to that model for most development to avoid this tax on people who
>>> > actually end up working in the branches.
>>> >
>>> > I do have a question in my mind though: What is being proposed is that we move active development to branches
>>> > if the feature is small or incomplete, however keep the trunk open for check-ins. One of the biggest reasons why we
>>> > check-in into trunk and not to branch-2 is because it is a change that will break backward compatibility. So do we
>>> > have an expectation of backward compatibility thru the 3.0-alpha series (I personally vote No, since 3.0 is experimental
>>> > at this stage), but if we decide to support some sort of backward-compat then willy-nilly committing to trunk
>>> > and still maintaining the expectation we can release Alphas from 3.0 does not look possible.
>>> >
>>> > And then comes the question, once 3.0 becomes official, where do we check-in a change, if that would break something?
>>> > so this will lead us back to trunk being the unstable - 3.0 being the new "branch-2".
>>
>>I'm not sure I really understand the goal of the "trunk-incompat"
>>proposal. Like Karthik asked earlier in this thread, isn't it really
>>just a rename of the existing trunk branch?
>>It sounds like the policy is going to be exactly the same as now:
>>incompatible stuff in trunk/trunk-incompat/whatever, 3.x compatible
>>changes in the 3.x line, 2.x compatible changes in the 2.x line, etc.
>>etc.
>>
>>I think we should just create branch-3 and follow the same policy we
>>followed with branch-2 and branch-1. Switching around the names doesn't
>>really change the policy, and it creates confusion since it's
>>inconsistent with what we did earlier.
>>
>>I think one of the big frustrations with trunk is that features sat
>>there a while without being released because they weren't compatible
>>with branch-2-- the shell script rewrite, for example. However, this
>>reflects a fundamental tradeoff-- either incompatible features can't be
>>developed at all in the lifetime of Hadoop 3.x, or we will need
>>somewhere to put them.
>>The trunk-incompat proposal is like saying that
>>you've solved the prison overcrowding problem by renaming all prisons to
>>"correctional facilities."
>>
>>> >
>>> > One more point: If we are moving to use a branch always - then we are looking at a model similar to using a git + pull
>>> > request model. If that is so would it make sense to modify the rules to make these branches easier to merge?
>>> > Say for example, if all commits in a branch have followed review and checking policy - just like trunk and commits
>>> > have been made only after a sign off from a committer, would it be possible to merge with a 3-day voting period
>>> > instead of 7, or treat it just like today's commit to trunk - but with 2 people signing-off?
>>
>>I thought the feature branch merge voting period had been shortened to 5
>>days rather than 7? We should probably spell this out on
>>https://hadoop.apache.org/bylaws.html . Like I said above, I don't
>>believe that *all* development should be on feature branches, just
>>biggish stuff that is likely to be controversial and/or disruptive. The
>>suggestion I made earlier is that if 3 people ask you for a branch, you
>>should definitely strongly consider a branch.
>>
>>I do think we should shorten the voting period for adding new branch
>>committers... making it 3 or 4 days would be fine. After all, the work
>>of branch committers is reviewed during the merge in any case.
>>
>>best,
>>Colin
>>
>>
>>> >
>>> > What I am suggesting is reducing the administrative overheads of using a branch to encourage use of branching.
>>> > Right now it feels like Apache's process encourages committing directly to trunk rather than a branch
>>> >
>>> > Thanks
>>> > Anu
>>>
>>>
>>> It's a per project process. In slider, we've used a git flow: all work
>>> goes in a feature branch, then merge in with a merge point.
>>> This gives a
>>> better history of workflow, as an individual body of work is an ordered
>>> sequence of operations, independent of everything else. This makes cherry
>>> picking a sequence easier, it even makes unrolling a series of changes
>>> easier: until the entire set of changes is committed, there is nothing to
>>> back out.
>>>
>>> 1. there's the rebase/merge problem: coping with conflicting change.
>>> Rebasing helps, but makes team dev complex. And, if there are big
>>> conflict changes, it's often easier to take the current diff with trunk
>>> branch and reapply it than try to rebase a sequence of operations. You
>>> don't always need to rebase though; an FB can repeatedly merge in trunk,
>>> for a history which may not be self contained, but does isolate the
>>> feature dev from everyone else's work.
>>>
>>> 2. Changes don't get exposed more broadly until the feature is in. That
>>> may reduce review, but for those of us who work on downstream code it
>>> means: nothing breaks until the complete feature is in. You may not
>>> realise it, but those of us who do compile downstream things (slider,
>>> spark) against even branch-2 always fear discovering what's just broken
>>> at the API level alone. And that's "the stable branch". I haven't dared
>>> build against trunk for a while.
>>>
>>> 3. It's a real PITA trying to do development which spans >1 feature
>>> branch. Even today it's tricky with code spanning >1 patch (HADOOP-13207
>>> and HADOOP-13208 this weekend). There I'm working in one branch and
>>> generating two separate patches. That's hard to do in a single feature
>>> branch.
>>>
>>> 4. The rules for feature branch merge. If I get a patch into trunk, it's
>>> in the codebase. If I get it into a feature branch, there's the risk the
>>> entire feature branch doesn't get in. Fix: for short lived feature
>>> branches, we have an RTC policy strict enough we can say "if a feature
>>> branch commit is in,
>>> it's considered good enough, even if a few more
>>> successor commits are required before the whole sequence of commits is
>>> considered stable."
>>>
>>> 5. If you do lots of incremental patches (as feature branches encourage),
>>> the patch history gets very noisy. Maybe here the patches can be rolled
>>> up for the final commit. This is how Spark works.
>>>
>>> 6. Jenkins doesn't test feature branches today. Can yetus do this if I
>>> give a name of any branch? If so, for a feature branch of > 1w we could
>>> just fork the trunk jenkins builds too, but have it only email the
>>> committers.
>>>
>>> 7. That final merge process needs to be rigorous from the regression
>>> testing perspective. The last commit on a feature branch should be the
>>> one to
>>>
>>> Feature branches need to be short lived to cope with change well. And if
>>> you are doing fundamental changes (e.g. core APIs), there is some
>>> incentive to get that common feature in, while you still get the full
>>> implementation stable in a feature branch. But: you'd better be
>>> confident that the stuff in trunk isn't going to break. Nobody gets to
>>> break the main build -- or at least not for longer than it takes for the
>>> merge to be reverted.
>>>
>>> I think maybe we should try doing very-short-lived feature branches, with
>>> a simple policy:
>>>
>>> -self contained patch which delivers a complete feature/fix: single
>>> patch. These are things where it means
>>>
>>> -something which is an intermediate step to delivering something: part of
>>> a feature branch. A branch where the process for committing patches is as
>>> rigorous as for trunk -- so there's no ambiguity about *whether* a
>>> feature is merged in, only *when*
>>>
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
>>For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org