I struggled with this issue many times over the past year, and thankfully we finally decided to use the official version of Hive 2.3.x as well (thank you, Yuming, Alan, and everyone). I think starting to use the official version of Hive is already huge progress.
I think we should have at least one minor release cycle to let users test Spark with Hive 2.3.x before switching it to the default. My impression is that this was the decision made earlier at:
http://apache-spark-developers-list.1001551.n3.nabble.com/DISCUSS-Upgrade-built-in-Hive-to-2-3-4-td26153.html

How about we try to make it the default in Spark 3.1, using this thread as a reference? I think switching it in 3.0 is too radical a change.

On Tue, Nov 19, 2019 at 2:11 PM, Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:

> Hi, All.
>
> First of all, I want to put this as a policy issue instead of a technical
> issue.
> Also, this is orthogonal from `hadoop` version discussion.
>
> Apache Spark community kept (not maintained) the forked Apache Hive
> 1.2.1 because there has been no other options before. As we see at
> SPARK-20202, it's not a desirable situation among the Apache projects.
>
> https://issues.apache.org/jira/browse/SPARK-20202
>
> Also, please note that we `kept`, not `maintained`, because we know it's
> not good.
> There are several attempt to update that forked repository
> for several reasons (Hadoop 3 support is one of the example),
> but those attempts are also turned down.
>
> From Apache Spark 3.0, it seems that we have a new feasible option
> `hive-2.3` profile. What about moving forward in this direction further?
>
> For example, can we remove the usage of forked `hive` in Apache Spark 3.0
> completely officially? If someone still needs to use the forked `hive`, we
> can have a profile `hive-1.2`. Of course, it should not be a default profile
> in the community.
>
> I want to say this is a goal we should achieve someday.
> If we don't do anything, nothing happen. At least we need to prepare this.
> Without any preparation, Spark 3.1+ will be the same.
>
> Shall we focus on what are our problems with Hive 2.3.6?
> If the only reason is that we didn't use it before, we can release
> another `3.0.0-preview` for that.
>
> Bests,
> Dongjoon.
>