Hadoop 3.0 brings some interesting benefits anyway, such as reduced storage needs (with erasure coding you no longer need to replicate data three times for reliability), so that may be convincing.
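To put rough numbers on the storage point, here is a small back-of-the-envelope sketch. It assumes HDFS erasure coding with a Reed-Solomon layout of 6 data blocks plus 3 parity blocks, compared against classic 3x replication; the 100 TB dataset size and the function name are made up for illustration, and the actual savings depend on the policy you configure and on your file sizes.

```python
def raw_storage_tb(logical_tb, data_units, redundant_units):
    """Raw disk needed to hold logical_tb of data when every group of
    data_units blocks is stored together with redundant_units extra blocks."""
    return logical_tb * (data_units + redundant_units) / data_units


logical_tb = 100  # hypothetical dataset size, in TB

# 3x replication: each block is stored once plus 2 extra copies.
replicated = raw_storage_tb(logical_tb, data_units=1, redundant_units=2)

# Erasure coding, assuming a Reed-Solomon 6 data + 3 parity layout.
erasure_coded = raw_storage_tb(logical_tb, data_units=6, redundant_units=3)

print(f"3x replication: {replicated} TB raw")
print(f"RS(6,3) erasure coding: {erasure_coded} TB raw")
```

Under these assumptions the raw footprint drops from 3x to 1.5x the logical data size.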
> On 22. Jul 2018, at 08:28, 彭鱼宴 <461292...@qq.com> wrote:
>
> Hi Tanvi,
>
> Thanks! I will check that and talk with my colleagues about whether to
> upgrade.
>
> Best,
> Zhefu Peng
>
>
> ------------------ Original Message ------------------
> From: "Tanvi Thacker"<tanvithack...@gmail.com>;
> Sent: Saturday, July 21, 2018, 3:24 PM
> To: "user"<user@hive.apache.org>;
> Subject: Re: Does Hive 3.0 only works with hadoop3.x.y?
>
> I would recommend upgrading to Hadoop 3.0 or 3.1 for the following
> reasons:
>
> - Hadoop 2.x may transitively bring in some dependencies that conflict
>   with libraries used by Hive (unpredictable libraries such as Google
>   Guava, etc.), which will affect your runtime environment.
> - Hive might be using some of the new public APIs that are exposed in the
>   3.x line of Hadoop, so with Hadoop 2.x you may see
>   ClassNotFound/NoSuchMethod errors at runtime if your query hits such a
>   code path.
> - In production, you must use the same dependencies that Hive is compiled
>   and tested with.
>   https://github.com/apache/hive/blob/rel/release-3.0.0/pom.xml#L149
>
> Thanks,
> Tanvi Thacker
>
>
>> On Thu, Jul 19, 2018 at 8:15 PM, Sungwoo Park <glap...@gmail.com> wrote:
>> I would say yes (because I am actually running Hive 3.0 on Hadoop 2.7.6 and
>> HDP 2.7.5), provided that you make small changes to the source code of Hive
>> 3.0. However, I have not tested Hive 3.0 on Spark.
>>
>> --- Sungwoo
>>
>>> On Thu, Jul 19, 2018 at 10:34 PM, 彭鱼宴 <461292...@qq.com> wrote:
>>> Hi Sungwoo,
>>>
>>> Just to confirm, does that mean I only need to update the Hive
>>> version, without updating the Hadoop version?
>>>
>>> Thanks!
>>>
>>> Best,
>>> Zhefu Peng
>>>
>>>
>>> ------------------ Original Message ------------------
>>> From: "Sungwoo Park"<glap...@gmail.com>;
>>> Sent: Thursday, July 19, 2018, 8:20 PM
>>> To: "user"<user@hive.apache.org>;
>>> Subject: Re: Does Hive 3.0 only works with hadoop3.x.y?
>>>
>>> Hive 3.0 makes a few function calls that depend on Hadoop 3.x, but they are
>>> easy to replace with code that compiles okay on Hadoop 2.8+. I am currently
>>> running Hive 3.x on Hadoop 2.7.6 and HDP 2.6.4 to test with the TPC-DS
>>> benchmark, and have not encountered any compatibility issues yet. I
>>> previously posted a diff file that lets us compile Hive 3.x on Hadoop
>>> 2.8+.
>>>
>>> http://mail-archives.apache.org/mod_mbox/hive-user/201806.mbox/%3CCAKHFPXDDFn52buKetHzSXTtjzX3UMHf%3DQvxm9QNNkv9r5xBs-Q%40mail.gmail.com%3E
>>>
>>>
>>> --- Sungwoo Park
>>>
>>>
>>>> On Thu, Jul 19, 2018 at 8:21 PM, 彭鱼宴 <461292...@qq.com> wrote:
>>>> Hi,
>>>>
>>>> I have already deployed Hive 2.2.0 on our Hadoop cluster. Recently, we
>>>> also deployed a Spark 2.3.0 cluster, aiming to use the Hive on Spark
>>>> engine. However, when I checked the Hive releases page,
>>>> I found the text below:
>>>> 21 May 2018 : release 3.0.0 available
>>>> This release works with Hadoop 3.x.y.
>>>>
>>>> The Hadoop version we currently have deployed is 2.7.6. I wonder, does
>>>> Hive 3.0 only work with Hadoop 3.x.y? In other words, if we want to use
>>>> Hive 3.0, do we have to update Hadoop to 3.x.y?
>>>>
>>>> Looking forward to your reply and help.
>>>>
>>>> Best,
>>>>
>>>> Zhefu Peng
>>>>
>>>
>>
>