Alan, your email client is not compatible with the Gmail viewer. For some reason your reply contains the whole thread of the discussion.

On May 22, 2015 10:58 AM, "Alan Gates" <alanfga...@gmail.com> wrote:
> I don't think anyone is advocating for option 2, as that would be
> disastrous. Option 3 is closest to what I'm proposing, though again
> dropping support for Hadoop 1 is only a part of it.
>
> Alan.
>
> Alexander Pivovarov <apivova...@gmail.com>
> May 22, 2015 at 10:03
> Looks like we are discussing 3 options:
>
> 1. Support Hadoop 1, 2 and 3 in the master branch.
>
> 2. Support Hadoop 1 in branch-1, Hadoop 2 in branch-2, Hadoop 3 in branch-3
>
> 3. Support Hadoop 2 and 3 in master
>
> I do not think option 2 is a good solution because it is much more difficult
> to manage 3 active prod branches than one master branch.
>
> I think we should go with option 1 or 3.
>
> +1 on Xuefu's and Edward's opinion
>
> Sergey Shelukhin <ser...@hortonworks.com>
> May 22, 2015 at 9:08
> I think branch-2 doesn’t need to be framed as particularly adventurous
> (other than due to the general increase in the amount of work done in Hive
> by the community).
> All the new features that normally go on trunk/master will go to branch-2.
> branch-2 is just trunk as it is now; in fact there will be no branch-2,
> just master :) The difference is the dropped functionality, not added
> functionality.
> So you shouldn’t lose stability if you retain the same process as now by
> just staying on versions off master.
>
> Perhaps, as is usually the case in Apache projects, developing features on
> older branches would be discouraged. Right now, all features usually go on
> trunk/master, and are then backported as needed and practical; so you
> wouldn’t (in Apache) make a feature on Hive 0.14 to be released in 0.14.N,
> and not backport it to master.
>
>
> Chris Drome <cdr...@yahoo-inc.com.INVALID>
> May 22, 2015 at 0:49
> I understand the motivation and benefits of creating a branch-2 where more
> disruptive work can go on without affecting branch-1. While not necessarily
> against this approach, from Yahoo's standpoint, I do have some questions
> (concerns).
> Upgrading to a new version of Hive requires a significant commitment of
> time and resources to stabilize and certify a build for deployment to our
> clusters. Given the size of our clusters and scale of datasets, we have to
> be particularly careful about adopting new functionality. However, at the
> same time we are interested in testing and making available new
> features and functionality. That said, we would have to rely on branch-1
> for the immediate future.
> One concern is that branch-1 would be left to stagnate, at which point
> there would be no option but for users to move to branch-2 as branch-1
> would be effectively end-of-lifed. I'm not sure how long this would take,
> but it would eventually happen as a direct result of the very reason for
> creating branch-2.
> A related concern is how disruptive the code changes will be in branch-2.
> I imagine that changes early in branch-2 will be easy to backport to
> branch-1, while this effort will become more difficult, if not impractical,
> as time goes on. If the code bases diverge too much then this could lead to
> more pressure for users of branch-1 to add features just to branch-1, which
> has been mentioned as undesirable. By the same token, backporting any code
> in branch-2 will require an increasing amount of effort, which contributors
> to branch-2 may not be interested in committing to.
> These questions affect us directly because, while we require a certain
> amount of stability, we also like to pull in new functionality that will be
> of value to our users. For example, our current 0.13 release is probably
> closer to 0.14 at this point. Given the lifespan of a release, it is often
> more palatable to backport features and bugfixes than to jump to a new
> version.
>
> The good thing about this proposal is the opportunity to evaluate and
> clean up a lot of the old code.
> Thanks,
> chris
>
>
>
> On Monday, May 18, 2015 11:48 AM, Sergey Shelukhin
> <ser...@hortonworks.com> wrote:
>
>
> Note: by “cannot” I mean “are unwilling to”; upgrade paths exist, but some
> people are set in their ways or have practical considerations and don’t
> care for new shiny stuff.
>
>
> Sergey Shelukhin <ser...@hortonworks.com>
> May 18, 2015 at 11:47
> Note: by “cannot” I mean “are unwilling to”; upgrade paths exist, but some
> people are set in their ways or have practical considerations and don’t
> care for new shiny stuff.
>
>
> Sergey Shelukhin <ser...@hortonworks.com>
> May 18, 2015 at 11:46
> I think we need some path for deprecating old Hadoop versions, the same
> way we deprecate old Java version support or old RDBMS version support.
> At some point the cost of supporting Hadoop 1 exceeds the benefit. Same
> goes for stuff like MR; supporting it, esp. for perf work, becomes a
> burden, and it’s outdated with 2 alternatives, one of which has been
> around for 2 releases.
> The branches are a graceful way to get rid of the legacy burden.
>
> Alternatively, when sweeping changes are made, we can do what HBase did
> (which is not pretty imho), where the 0.94 version had ~30 dot releases
> because people cannot upgrade to the 0.96 “singularity” release.
>
>
> I posit that people who run Hadoop 1 and MR in this day and age (and more
> so as time passes) are people who don’t care about perf and new
> features, only stability; so a stability-focused branch would be perfect to
> support them.