Hello all,

Is it written anywhere what the difference is between a minor release and a point/dot/maintenance (I'll use "point" from here on out) release? I have looked around and I can't find anything other than some compatibility documentation in 2.x that has since been removed in 3.x [1] [2]. I think this would help shape my opinion on whether or not to keep branch-2 alive.

My current understanding is that we can't really break compatibility in either a minor or point release. But the only mention of the difference between minor and point releases is how to deal with Stable, Evolving, and Unstable tags, and how to deal with changing default configuration values. So it seems like there really isn't a big official difference between the two.

In my mind, the functional difference between the two is that the minor releases may have added features and rewrites, while the point releases only have bug fixes. This might be an incorrect understanding, but that's what I have gathered from watching the releases over the last few years. Whether or not this is a correct understanding, I think that this needs to be documented somewhere, even if it is just a convention.

Given my assumed understanding of minor vs point releases, here are the pros/cons that I can think of for having a branch-2. Please add on or correct me for anything you feel is missing or inadequate.

Pros:
- Features/rewrites/higher-risk patches are less likely to be put into 2.10.x
- It is less necessary to move to 3.x

Cons:
- Bug fixes are less likely to be put into 2.10.x
- An extra branch to maintain
- Committers have an extra branch (5 vs 4 total branches) to commit patches to if they should go all the way back to 2.10.x
- It is less necessary to move to 3.x

So on the one hand you get added stability in fewer features being committed to 2.10.x, but then on the other you get fewer bug fixes being committed. In a perfect world, we wouldn't have to make this tradeoff. But we don't live in a perfect world and committers will make mistakes either because of lack of knowledge or simply because they made a mistake. If we have a branch-2, committers will forget, not know to, or choose not to (for whatever reason) commit valid bug fixes back all the way to branch-2.10. If we don't have a branch-2, committers who want their borderline risky feature in the 2.x line will err on the side of putting it into branch-2.10 instead of proposing the creation of a branch-2. Clearly I have made quite a few assumptions here based on my own experiences, so I would like to hear if others have similar or opposing views.

As far as 3.x goes, to me it seems like some of the reasoning for killing branch-2 is due to an effort to push the community towards 3.x. This is why I have added movement to 3.x as both a pro and a con. As a community trying to move forward, keeping as many companies on similar branches as possible is a good way to make sure the code is well-tested. However, from a stability point of view, moving to 3.x is still scary and being able to stay on 2.x until you are comfortable to move is very nice.
The 2.10.0 bridge release effort has been very good at making it possible for people to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that it is reasonable for companies to want to be extra cautious with 3.x due to potential performance degradation at large scale.

A question I'm pondering is what happens when we move to Java 11 and someone is still on 2.x? If they want to backport HADOOP-15338 <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support to 2.x, surely not everyone is going to want that (at least not immediately). The 2.10 documentation states, "The JVM requirements will not change across point releases within the same minor release except if the JVM version under question becomes unsupported" [1], so this would warrant a 2.11 release until Java 8 becomes unsupported (though one could argue that it is already unsupported since Oracle is no longer giving public Java 8 updates). If we don't keep branch-2 around now, would a Java 11 backport be the catalyst for a branch-2 revival?

Not sure if this really leads to any sort of answer from me on whether or not we should keep branch-2 alive, but these are the things that I am weighing in my mind. For me, the bigger problem beyond having branch-2 or not is committers not being on the same page with where they should commit their patches.

Eric

[1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
[2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html

On Tue, Nov 19, 2019 at 2:49 PM epa...@apache.org <epa...@apache.org> wrote:

> Hi Konstantin,
>
> Sure, I understand those concerns. On the other hand, I worry about the stability of 2.10, since we will be on it for a couple of years at least. I worry that some committers may want to put new features into a branch 2 release, and without a branch-2, they will go directly into 2.10.
> Since we don't always catch corner cases or performance problems for some time (usually not until the release is deployed to a busy, 4-thousand node cluster), it may be very difficult to back out those changes.
>
> It sounds like I'm in the minority here, so I'm not nixing the idea, but I do have these reservations.
>
> Thanks,
> -Eric
>
> On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <shv.had...@gmail.com> wrote:
>
> Hi Eric,
>
> We had a long discussion on this list regarding making the 2.10 release the last of branch-2 releases. We intended 2.10 as a bridge release between Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not in the picture right now, and many people may object to this idea.
>
> I understand Jonathan's proposal as an attempt to
> 1. eliminate confusion about which branches people should commit their back-ports to
> 2. save engineering effort committing to more branches than necessary
>
> "Branches are cheap" as our founder used to say. If we ever decide to release 2.11 we can resurrect the branch.
> Until then I am in favor of Jonathan's proposal +1.
>
> Thanks,
> --Konstantin
>
> On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2...@gmail.com> wrote:
>
> > Thanks Eric for the comments - regarding your concerns, I feel the pros outweigh the cons. To me, the chances of patch releases on 2.10.x are much higher than a new 2.11 minor release. (There didn't seem to be many people outside of our company who expressed interest in getting new features to branch-2 prior to the 2.10.0 release.) Even now, a few weeks after 2.10.0 release, there are 29 patches that have gone into branch-2 and 9 in branch-2.10, so it's already diverged quite a bit.
> >
> > In any case, we can always reverse this decision if we really need to, by recreating branch-2. But this proposal would reduce a lot of confusion IMO.
> >
> > Jonathan Hung
> >
> > On Fri, Nov 15, 2019 at 11:41 AM epa...@apache.org <epa...@apache.org> wrote:
> >
> > > Thanks Jonathan for opening the discussion.
> > >
> > > I am not in favor of this proposal. 2.10 was very recently released, and moving to 2.10 will take some time for the community. It seems premature to make a decision at this point that there will never be a need for a 2.11 release.
> > >
> > > -Eric
> > >
> > > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <jyhung2...@gmail.com> wrote:
> > >
> > > Hi folks,
> > >
> > > Given the release of 2.10.0, and the fact that it's intended to be a bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor release line in branch-2. Currently, the main issue is that there are many fixes going into branch-2 (the theoretical 2.11.0) that are not going into branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will likely never see the light of day unless they are backported to branch-2.10.
> > >
> > > To do this, I propose we:
> > >
> > > - Delete branch-2.10
> > > - Rename branch-2 to branch-2.10
> > > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
> > >
> > > This way we get all the current branch-2 fixes into the 2.10.x release line. Then the commit chain will look like: trunk -> branch-3.2 -> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
> > >
> > > Thoughts?
> > >
> > > Jonathan Hung
> > >
> > > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html