What’s the update and next step on this?

We have real users getting blocked by this issue.


________________________________
From: Xiao Li <gatorsm...@gmail.com>
Sent: Wednesday, January 16, 2019 9:37 AM
To: Ryan Blue
Cc: Marcelo Vanzin; Hyukjin Kwon; Sean Owen; Felix Cheung; Yuming Wang; dev
Subject: Re: [DISCUSS] Upgrade built-in Hive to 2.3.4

Thanks for your feedback!

Yuming and I are working to reduce the stability and quality risks. I will keep 
you posted when the proposal is ready.

Cheers,

Xiao

Ryan Blue <rb...@netflix.com> wrote on Wed, Jan 16, 2019 at 9:27 AM:
+1 for what Marcelo and Hyukjin said.

In particular, I agree that we can't expect Hive to release a version that is 
now more than 3 years old just to solve a problem for Spark. Maybe that would 
have been a reasonable ask instead of publishing a fork years ago, but I think 
this is now Spark's problem.

On Tue, Jan 15, 2019 at 9:02 PM Marcelo Vanzin 
<van...@cloudera.com> wrote:
+1 to that. HIVE-16391 by itself means we're giving up things like
Hadoop 3, and we're also putting the burden on the Hive folks to fix a
problem that we created.

The current PR is basically a Spark-side fix for that bug. It also means
upgrading Hive (which gives us Hadoop 3, yay!), but I think it's really
the right path to take here.
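
For illustration only, a minimal sketch of the knob that already exists on the
metastore side, separate from the built-in execution jars (this assumes a build
with Hive support; whether a given version string is accepted depends on the
Spark release):

    // Sketch: pick the Hive metastore client version at runtime, independent of
    // the built-in (forked 1.2.1) execution jars bundled with Spark.
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("metastore-client-version-sketch")
      .config("spark.sql.hive.metastore.version", "2.3.4") // assumption: this version is in the supported list for your Spark release
      .config("spark.sql.hive.metastore.jars", "maven")    // download matching client jars from Maven
      .enableHiveSupport()
      .getOrCreate()

    spark.sql("SHOW DATABASES").show()

That knob only covers the metastore client, though; execution still uses the
built-in forked jars, which is what this upgrade is about.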

On Tue, Jan 15, 2019 at 6:32 PM Hyukjin Kwon 
<gurwls...@gmail.com> wrote:
>
> Resolving HIVE-16391 means Hive releasing a 1.2.x that contains the fixes from 
> our Hive fork (correct me if I am mistaken).
>
> To be honest, and as a personal opinion, that basically asks Hive to take care 
> of Spark's dependency.
> Hive looks to be moving ahead to 3.1.x, and no one would use a newer 1.2.x 
> release. By analogy, Spark doesn't make 1.6.x releases anymore either.
>
> Frankly, my impression is that this is our own mistake to fix. Since the Spark 
> community is big enough, I think we should try to fix it ourselves first.
> I am not saying upgrading is the only way through this, but I think we should 
> at least try it first and see what comes next.
>
> Yes, upgrading on our side does sound riskier, but I think it's worth trying to 
> see whether it's possible.
> Upgrading the dependency is a more standard approach than keeping the fork or 
> asking the Hive side to release another 1.2.x.
>
> If we fail to upgrade for critical or unavoidable reasons, then yes, we could 
> find an alternative, but that basically means we're going to stay on 1.2.x for 
> a long time (say, until Spark 4.0.0?).
>
> I know this has turned out to be a sensitive topic, but to be completely honest 
> with myself, I think we should give it a try.
>


--
Marcelo


--
Ryan Blue
Software Engineer
Netflix
