I would recommend upgrading to Hadoop 3.0 or 3.1, for the following reasons:
- Hadoop 2.x may transitively bring in dependencies that conflict with the libraries used by Hive (for example, an unpredictable library such as Google Guava), which will affect your runtime environment.
- Hive may be using some of the new public APIs exposed in the 3.x line of Hadoop, so with Hadoop 2.x you may see ClassNotFound/NoSuchMethod errors at runtime if your query hits such a code path.

In production, you must use the same dependencies against which Hive was compiled and tested:
https://github.com/apache/hive/blob/rel/release-3.0.0/pom.xml#L149

Thanks,
Tanvi Thacker

On Thu, Jul 19, 2018 at 8:15 PM, Sungwoo Park <glap...@gmail.com> wrote:

> I would say yes (because I am actually running Hive 3.0 on Hadoop 2.7.6
> and HDP 2.7.5), provided that you make small changes to the source code of
> Hive 3.0. However, I have not tested Hive 3.0 on Spark.
>
> --- Sungwoo
>
> On Thu, Jul 19, 2018 at 10:34 PM, 彭鱼宴 <461292...@qq.com> wrote:
>
>> Hi Sungwoo,
>>
>> Just to confirm: does that mean I only need to update the Hive
>> version, without updating the Hadoop version?
>>
>> Thanks!
>>
>> Best,
>> Zhefu Peng
>>
>>
>> ------------------ Original Message ------------------
>> *From:* "Sungwoo Park"<glap...@gmail.com>;
>> *Sent:* Thursday, July 19, 2018, 8:20 PM
>> *To:* "user"<user@hive.apache.org>;
>> *Subject:* Re: Does Hive 3.0 only works with hadoop3.x.y?
>>
>> Hive 3.0 makes a few function calls that depend on Hadoop 3.x, but they
>> are easy to replace with code that compiles okay on Hadoop 2.8+. I am
>> currently running Hive 3.x on Hadoop 2.7.6 and HDP 2.6.4 to test with the
>> TPC-DS benchmark, and have not encountered any compatibility issues yet. I
>> previously posted a diff file that lets us compile Hive 3.x on Hadoop
>> 2.8+.
>>
>> http://mail-archives.apache.org/mod_mbox/hive-user/201806.mbox/%3CCAKHFPXDDFn52buKetHzSXTtjzX3UMHf%3DQvxm9QNNkv9r5xBs-Q%40mail.gmail.com%3E
>>
>> --- Sungwoo Park
>>
>>
>> On Thu, Jul 19, 2018 at 8:21 PM, 彭鱼宴 <461292...@qq.com> wrote:
>>
>>> Hi,
>>>
>>> I have already deployed Hive 2.2.0 on our Hadoop cluster. Recently, we
>>> also deployed a Spark 2.3.0 cluster, aiming to use the Hive on Spark
>>> engine. However, when I checked the Hive releases page,
>>> I found the text below:
>>> 21 May 2018 : release 3.0.0 available
>>> <https://hive.apache.org/downloads.html#21-may-2018-release-300-available>
>>>
>>> This release works with Hadoop 3.x.y.
>>>
>>> The Hadoop version we currently have deployed is 2.7.6. I wonder, does Hive
>>> 3.0 only work with Hadoop 3.x.y? Or, if we want to use Hive 3.0, do we have to
>>> update the Hadoop version to 3.x.y?
>>>
>>> Looking forward to your reply and help.
>>>
>>> Best,
>>>
>>> Zhefu Peng
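
A footnote on the ClassNotFound/NoSuchMethod point in the reply at the top of this thread: one way to catch that kind of mismatch before a query hits it is to probe the classpath with reflection. The following is only a minimal sketch, not anything shipped with Hive or Hadoop. The class name ClasspathProbe and its two helpers are hypothetical; org.apache.hadoop.util.VersionInfo and its getVersion() method are used as the example target, and the probe simply reports false if no Hadoop jars are on the classpath.

```java
public class ClasspathProbe {

    /** Returns true if the named class can be loaded from the current classpath. */
    public static boolean hasClass(String className) {
        try {
            // initialize=false: only check that the class loads, do not run static initializers
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns true if the class loads AND exposes a public method with this exact signature. */
    public static boolean hasMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            Class<?> c = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            c.getMethod(methodName, paramTypes); // throws NoSuchMethodException if absent
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Example target only; substitute whichever class/method your code path depends on.
        String cls = "org.apache.hadoop.util.VersionInfo";
        System.out.println(cls + " present: " + hasClass(cls));
        System.out.println(cls + ".getVersion() present: " + hasMethod(cls, "getVersion"));
    }
}
```

Running this with the same classpath that Hive will use shows whether a given class or method signature is actually available. It does not replace building and testing against the Hadoop version pinned in Hive's pom.xml.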