+1 from my side too. I have created a PR against the current branch. It still needs some work and as many reviews as possible, because it is quite big and I might have made some mistakes.

https://issues.apache.org/jira/browse/HIVE-26134
https://github.com/apache/hive/pull/3201

Thanks,
Peter

On Thu, 10 Feb 2022 at 17:43, Zoltan Haindrich <k...@rxd.hu> wrote:

> Hey,
>
> I think there is no real interest in this feature; we don't have users/contributors backing it. The last development was around October 2018, and there have been ~2 bugfix commits since then. We should stop carrying dead weight. Another 2 weeks have gone by since Stamatis reminded us that after 1.5 years(!) nothing has changed.
>
> +1 on removing it
>
> cheers,
> Zoltan
>
> you may inspect some of the recent changes with:
> git log -c `find . -type f -path '**/spark/**'|grep -v xml|grep -v properties|grep -v q.out`
>
> On 1/28/22 2:32 PM, Stamatis Zampetakis wrote:
> > Hi team,
> >
> > Almost one year has passed since the last exchange in this discussion and, if I am not wrong, there has been no effort to revive Hive-on-Spark. To be more precise, I don't think I have seen any Spark-related JIRA for quite some time now and, although I don't want to rush to conclusions, there does not seem to be any community member involved in maintaining or adding new features to this part of the code.
> >
> > Keeping dead code in the repository does not do the project any good and puts a non-negligible burden on future maintainers.
> >
> > Clearly, we cannot make a new Hive release where a major feature is completely untested, so either someone commits to re-enabling/fixing the respective tests soon or we move forward with the work started by David and drop support for Hive-on-Spark.
> >
> > I would like to ask the community if there is anyone who can take up this maintenance task and enable/fix Spark-related tests in the next month or so?
> >
> > Best,
> > Stamatis
> >
> > On Sat, Feb 27, 2021 at 4:17 AM Edward Capriolo <edlinuxg...@gmail.com> wrote:
> >
> >> I do not know how it works for most of the world. But in Cloudera, where the Tez options were never popular, Hive-on-Spark represents a solid way to get things done for small datasets at lower latency.
> >>
> >> As for Spark adoption: a while ago I came up with some ways to make Hive more Spark-like. One of them was that I found a way to make "compile" a Hive keyword so folks could build UDFs on the fly. It was such an uphill climb. Folks found a way to make it disabled by default for security. Then later, when things moved from the CLI to Beeline, it was like the ONLY thing that I found not ported. It was extremely frustrating.
> >>
> >> On Mon, Jul 27, 2020 at 3:19 PM David <dam6...@gmail.com> wrote:
> >>
> >>> Hello Xuefu,
> >>>
> >>> I am not part of the Cloudera Hive product team, though I volunteer to work on small projects from time to time. Perhaps someone from that team can chime in with some of their thoughts, but personally, I think that in the long run, there will be more of a merge between Hive-on-Spark and other Spark-native offerings. I'm not sure what the differentiation will be going forward. With that said, are there any developers on this mailing list who are willing to take on the maintenance effort of keeping HoS moving forward?
> >>>
> >>> http://www.russellspitzer.com/2017/05/19/Spark-Sql-Thriftserver/
> >>>
> >>> https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.4/bk_spark-component-guide/content/config-sts.html
> >>>
> >>> Thanks.
> >>>
> >>> On Thu, Jul 23, 2020 at 12:35 PM Xuefu Zhang <xu...@apache.org> wrote:
> >>>
> >>>> Previous reasoning seemed to suggest a lack of user adoption. Now we are concerned about ongoing maintenance effort. Both are valid considerations. However, I think we should have ways to find out the answers. Therefore, I suggest the following be carried out:
> >>>>
> >>>> 1. Send out the proposal (removing Hive on Spark) to users including user@hive.apache.org and get their feedback.
> >>>> 2. Ask if any developers on this mailing list are willing to take on the maintenance effort.
> >>>>
> >>>> I'm concerned about user impact because I can still see issues being reported on HoS from time to time. I'm more concerned about the future of Hive if we narrow Hive's neutrality on execution engines, which will possibly force more Hive users to migrate to alternatives such as Spark SQL, which is already eroding Hive's user base.
> >>>>
> >>>> Being open and neutral used to be Hive's most admired strengths.
> >>>>
> >>>> Thanks,
> >>>> Xuefu
> >>>>
> >>>> On Wed, Jul 22, 2020 at 8:46 AM Alan Gates <alanfga...@gmail.com> wrote:
> >>>>
> >>>>> An important point here is I don't believe David is proposing to remove Hive on Spark from the 2 or 3 lines, but only from trunk. Continuing to support it in the existing 2 and 3 lines makes sense, but since no one has maintained it on trunk for some time and it does not work with many of the newer features, it should be removed from trunk.
> >>>>>
> >>>>> Alan.
> >>>>>
> >>>>> On Tue, Jul 21, 2020 at 4:10 PM Chao Sun <sunc...@apache.org> wrote:
> >>>>>
> >>>>>> Thanks David. FWIW Uber is still running Hive on Spark (2.3.4) on a very large scale in production right now and I don't think we have any plan to change it soon.
> >>>>>>
> >>>>>> On Tue, Jul 21, 2020 at 11:28 AM David <dam6...@gmail.com> wrote:
> >>>>>>
> >>>>>>> Hello,
> >>>>>>>
> >>>>>>> Thanks for the feedback.
> >>>>>>>
> >>>>>>> Just a quick recap: I did propose this @dev and I received unanimous +1's from the community. After a couple of months, I created the PR.
> >>>>>>>
> >>>>>>> Certainly open to discussion, but there hasn't been any discussion thus far because there have been no objections until this point.
> >>>>>>>
> >>>>>>> HoS has low adoption and heavy technical debt, and the manner in which its build process is set up is impeding some other work that is not even related to HoS.
> >>>>>>>
> >>>>>>> We can deprecate in Hive 3.x and remove in Hive 4.x. The plan would be to use Tez moving forward.
> >>>>>>>
> >>>>>>> My point about the vendor's move to Tez is that HoS adoption is very low, it's only going lower, and while I don't know the specifics of it, there must be some migration plan in place there (i.e., it must be possible to do it already).
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> David
> >>>>>>>
> >>>>>>> On Tue, Jul 21, 2020 at 12:23 PM Xuefu Zhang <xu...@apache.org> wrote:
> >>>>>>>
> >>>>>>>> Hi David,
> >>>>>>>>
> >>>>>>>> While a vendor may not support a component in an open source project, removing it or not is a decision by and for the community. I certainly understand that the vendor you mentioned has contributed a great deal (including my personal effort while working there), but it's not up to the vendor to make a call like what is proposed here.
> >>>>>>>>
> >>>>>>>> As a community, we should have gone through a thorough discussion and reached a consensus before actually making such a big change, in my opinion.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Xuefu
> >>>>>>>>
> >>>>>>>> On Tue, Jul 21, 2020 at 8:49 AM David <dam6...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> Hey,
> >>>>>>>>>
> >>>>>>>>> Thanks for the input.
> >>>>>>>>>
> >>>>>>>>> FYI. Cloudera (Cloudera + Hortonworks) has removed HoS from their latest offering.
> >>>>>>>>>
> >>>>>>>>> "Tez is now the only supported execution engine, existing queries that change execution mode to Spark or MapReduce within a session, for example, fail."
> >>>>>>>>>
> >>>>>>>>> https://docs.cloudera.com/cdp/latest/upgrade-post/topics/ug_hive_configuration_changes.html
> >>>>>>>>>
> >>>>>>>>> So I don't know who will be supporting this feature moving forward, but there has been a lot of work done to make this change as painless as possible. Simply setting the engine to 'tez' and removing the HoS-related settings should address many use cases.
> >>>>>>>>>
> >>>>>>>>> Thanks.
> >>>>>>>>>
> >>>>>>>>> On Tue, Jul 21, 2020 at 11:36 AM Xuefu Z <usxu...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> Sorry for chiming in late. However, I don't think we should remove Hive on Spark just because of a technical problem. This is rather a big decision that we need to be careful about. There are users that will be left high and dry by this move.
> >>>>>>>>>>
> >>>>>>>>>> If the community decides to desupport and eventually remove it, I think we need to have a due process. We also need a deprecation plan if that's what we decide to do. Before that, I'm -1 on this proposal.
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Xuefu
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Jul 21, 2020 at 7:57 AM David <dam6...@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hello Team,
> >>>>>>>>>>>
> >>>>>>>>>>> https://github.com/apache/hive/pull/1285
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks.
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Jun 3, 2020 at 11:49 PM Gopal V <gop...@apache.org> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> +1
> >>>>>>>>>>>>
> >>>>>>>>>>>> Cheers,
> >>>>>>>>>>>> Gopal
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 6/3/20 7:48 PM, Jesus Camacho Rodriguez wrote:
> >>>>>>>>>>>>> +1
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> -Jesús
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, Jun 3, 2020 at 1:58 PM Alan Gates <alanfga...@gmail.com> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> +1.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Alan.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Wed, Jun 3, 2020 at 1:40 PM Prasanth Jayachandran <pjayachand...@cloudera.com.invalid> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> +1
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Jun 3, 2020, at 1:38 PM, Ashutosh Chauhan <hashut...@apache.org> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> +1
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Wed, Jun 3, 2020 at 1:23 PM David Mollitor <dam6...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Hello Gang,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I have spent some time working on upgrading Avro (far less than others):
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/HIVE-21737
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> This should be a relatively easy thing to do, but is blocked by Hive-on-Spark. HoS has a weird thing where it downloads some cloud-storage-hosted file of Spark-Hadoop as part of its maven run.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Since HoS is not going to receive updates from the major vendors, is it time to simply remove it?
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Tests are currently disabled:
> >>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/HIVE-23137
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Thanks.
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Xuefu Zhang
> >>>>>>>>>>
> >>>>>>>>>> "In Honey We Trust!"