It wasn't clear to me from this thread what the objection to these PRs is: https://github.com/apache/spark/pull/23552 https://github.com/apache/spark/pull/23553
Would we like to specifically discuss whether to merge these or not? I hear support for it, and concerns about continuing to support Hive too, but I wasn't clear whether those concerns specifically argue against these PRs.

On Fri, Feb 1, 2019 at 2:03 PM Felix Cheung <felixcheun...@hotmail.com> wrote:
>
> What's the update and next step on this?
>
> We have real users getting blocked by this issue.
>
> ________________________________
> From: Xiao Li <gatorsm...@gmail.com>
> Sent: Wednesday, January 16, 2019 9:37 AM
> To: Ryan Blue
> Cc: Marcelo Vanzin; Hyukjin Kwon; Sean Owen; Felix Cheung; Yuming Wang; dev
> Subject: Re: [DISCUSS] Upgrade built-in Hive to 2.3.4
>
> Thanks for your feedback!
>
> I am working with Yuming to reduce the risks to stability and quality. I will keep you posted when the proposal is ready.
>
> Cheers,
>
> Xiao
>
> Ryan Blue <rb...@netflix.com> 于2019年1月16日周三 上午9:27写道:
>>
>> +1 for what Marcelo and Hyukjin said.
>>
>> In particular, I agree that we can't expect Hive to release a version on a branch that is now more than 3 years old just to solve a problem for Spark. Maybe that would have been a reasonable ask instead of publishing a fork years ago, but I think this is now Spark's problem.
>>
>> On Tue, Jan 15, 2019 at 9:02 PM Marcelo Vanzin <van...@cloudera.com> wrote:
>>>
>>> +1 to that. Relying on HIVE-16391 by itself means we're giving up things like Hadoop 3, and we're also putting the burden on the Hive folks to fix a problem that we created.
>>>
>>> The current PR is basically a Spark-side fix for that bug. It does also mean upgrading Hive (which gives us Hadoop 3, yay!), but I think it's really the right path to take here.
>>>
>>> On Tue, Jan 15, 2019 at 6:32 PM Hyukjin Kwon <gurwls...@gmail.com> wrote:
>>> >
>>> > Resolving HIVE-16391 means Hive releasing a 1.2.x that contains the fixes from our Hive fork (correct me if I am mistaken).
>>> >
>>> > To be honest, and as a personal opinion, that basically asks Hive to take care of Spark's dependency.
>>> > Hive looks to be moving ahead with 3.1.x, and no one would use a newer 1.2.x release. By analogy, Spark doesn't make 1.6.x releases anymore either.
>>> >
>>> > Frankly, my impression is that this is our own mistake to fix. Since the Spark community is big enough, I think we should try to fix it ourselves first.
>>> > I am not saying upgrading is the only way to get through this, but I think we should at least try first and see what comes next.
>>> >
>>> > Yes, it does sound riskier to upgrade it on our side, but I think it's worth checking and trying to see if it's possible.
>>> > Upgrading the dependency is a more standard approach than using the fork or asking the Hive side to release another 1.2.x.
>>> >
>>> > If we fail to upgrade it for some critical or unavoidable reason, yes, we could find an alternative, but that basically means we're going to stay on 1.2.x for a long time (say, until Spark 4.0.0?).
>>> >
>>> > I know this has turned out to be a sensitive topic, but to be honest with myself, I think we should make the attempt.
>>> >
>>>
>>> --
>>> Marcelo
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix