Re: [DISCUSS] Supporting Hadoop-1 and experimental features

Alexander Pivovarov Fri, 22 May 2015 11:29:39 -0700

Alan, your email client is not compatible with the Gmail viewer. For some
reason your reply contains the whole thread of the discussion.
On May 22, 2015 10:58 AM, "Alan Gates" <alanfga...@gmail.com> wrote:


> I don't think anyone is advocating for option 2, as that would be
> disastrous.  Option 3 is closest to what I'm proposing, though again
> dropping support for Hadoop 1 is only a part of it.
>
> Alan.
>
>   Alexander Pivovarov <apivova...@gmail.com>
>  May 22, 2015 at 10:03
> Looks like we are discussing 3 options:
>
> 1. Support hadoop 1, 2 and 3 in master branch.
>
> 2. Support hadoop 1 in branch-1, hadoop 2 in branch-2, hadoop 3 in branch-3
>
> 3. Support hadoop 2 and 3 in master
>
> I do not think option 2 is a good solution because it is much more difficult
> to manage 3 active prod branches than one master branch.
>
> I think we should go with option 1 or 3.
>
> +1 on Xuefu's and Edward's opinion
>
>   Sergey Shelukhin <ser...@hortonworks.com>
>  May 22, 2015 at 9:08
> I think branch-2 doesn’t need to be framed as particularly adventurous
> (other than due to the general increase in the amount of work done in Hive by
> the community).
> All the new features that normally go on trunk/master will go to branch-2.
> branch-2 is just trunk as it is now; in fact there will be no branch-2,
> just master :) The difference is the dropped functionality, not added
> functionality.
> So you shouldn’t lose stability if you retain the same process as now by
> just staying on versions off master.
>
> Perhaps, as is usually the case in Apache projects, developing features on
> older branches would be discouraged. Right now, all features usually go on
> trunk/master, and are then back ported as needed and practical; so you
> wouldn’t (in Apache) make a feature on Hive 0.14 to be released in 0.14.N,
> and not back port to master.
>
>
>   Chris Drome <cdr...@yahoo-inc.com.INVALID>
>  May 22, 2015 at 0:49
> I understand the motivation and benefits of creating a branch-2 where more
> disruptive work can go on without affecting branch-1. While not necessarily
> against this approach, from Yahoo's standpoint, I do have some questions
> (concerns).
> Upgrading to a new version of Hive requires a significant commitment of
> time and resources to stabilize and certify a build for deployment to our
> clusters. Given the size of our clusters and scale of datasets, we have to
> be particularly careful about adopting new functionality. However, at the
> same time we are interested in testing and making available new
> features and functionality. That said, we would have to rely on branch-1
> for the immediate future.
> One concern is that branch-1 would be left to stagnate, at which point
> there would be no option but for users to move to branch-2 as branch-1
> would be effectively end-of-lifed. I'm not sure how long this would take,
> but it would eventually happen as a direct result of the very reason for
> creating branch-2.
> A related concern is how disruptive the code changes will be in branch-2.
> I imagine that changes early in branch-2 will be easy to backport to
> branch-1, while this effort will become more difficult, if not impractical,
> as time goes on. If the code bases diverge too much then this could lead to
> more pressure for users of branch-1 to add features just to branch-1, which
> has been mentioned as undesirable. By the same token, backporting any code
> in branch-2 will require an increasing amount of effort, which contributors
> to branch-2 may not be interested in committing to.
> These questions affect us directly because, while we require a certain
> amount of stability, we also like to pull in new functionality that will be
> of value to our users. For example, our current 0.13 release is probably
> closer to 0.14 at this point. Given the lifespan of a release, it is often
> more palatable to backport features and bugfixes than to jump to a new
> version.
>
> The good thing about this proposal is the opportunity to evaluate and
> clean up a lot of the old code.
> Thanks,
> chris
>
>
>
> On Monday, May 18, 2015 11:48 AM, Sergey Shelukhin
> <ser...@hortonworks.com> wrote:
>
>
> Note: by “cannot” I mean “are unwilling to”; upgrade paths exist, but some
> people are set in their ways or have practical considerations and don’t
> care for new shiny stuff.
>
>
>
>
>
>   Sergey Shelukhin <ser...@hortonworks.com>
>  May 18, 2015 at 11:46
> I think we need some path for deprecating old Hadoop versions, the same
> way we deprecate old Java version support or old RDBMS version support.
> At some point the cost of supporting Hadoop 1 exceeds the benefit. Same
> goes for stuff like MR; supporting it, esp. for perf work, becomes a
> burden, and it’s outdated with 2 alternatives, one of which has been
> around for 2 releases.
> The branches are a graceful way to get rid of the legacy burden.
>
> Alternatively, when sweeping changes are made, we can do what HBase did
> (which is not pretty imho), where the 0.94 version had ~30 dot releases
> because people cannot upgrade to the 0.96 “singularity” release.
>
>
> I posit that people who run Hadoop 1 and MR in this day and age (and more
> so as time passes) are people who don’t care about perf and new
> features, only stability; so a stability-focused branch would be perfect to
> support them.
>
>
>
>
