Hi,

I've been testing SparkSQL in the 1.4 RC and found two issues. I wanted to
confirm whether these are actually bugs before opening a JIRA.

*1)* I can no longer compile SparkSQL with -Phive-0.12.0. I noticed that in
1.4, IsolatedClientLoader was introduced so that different versions of the
Hive metastore jars can be loaded at runtime. But as a side effect, SparkSQL
no longer compiles against Hive 0.12.0.

My question is, is this intended? If so, shouldn't the hive-0.12.0 profile
be removed from the POM?
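For reference, this is roughly the build invocation that now fails (the flags other than -Phive-0.12.0 are just my environment, so treat this as an example, not the exact command):

```shell
# Example build; only -Phive-0.12.0 is the point here, other profiles are my setup
build/mvn -Phive -Phive-thriftserver -Phive-0.12.0 -DskipTests clean package
```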

*2)* After compiling SparkSQL with -Phive-0.13.1, I ran into my second
problem. Since I have a Hive 0.12 metastore in production, I have to use it
for now. But even with "spark.sql.hive.metastore.version" and
"spark.sql.hive.metastore.jars" set, the SparkSQL CLI throws the following
error-

15/05/24 05:03:29 WARN RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect.
org.apache.thrift.TApplicationException: Invalid method name: 'get_functions'
        at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_functions(ThriftHiveMetastore.java:2886)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_functions(ThriftHiveMetastore.java:2872)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getFunctions(HiveMetaStoreClient.java:1727)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
        at com.sun.proxy.$Proxy12.getFunctions(Unknown Source)
        at org.apache.hadoop.hive.ql.metadata.Hive.getFunctions(Hive.java:2670)
        at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionNames(FunctionRegistry.java:674)
        at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionNames(FunctionRegistry.java:662)
        at org.apache.hadoop.hive.cli.CliDriver.getCommandCompletor(CliDriver.java:540)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:175)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)

What's happening is that when the SparkSQL CLI starts up, it tries to fetch
permanent UDFs from the Hive metastore (due to HIVE-6330
<https://issues.apache.org/jira/browse/HIVE-6330>, which was introduced in
Hive 0.13). It ends up invoking a thrift method ('get_functions') that
doesn't exist in Hive 0.12. To work around this error, I have to comment
out the following line of code for now-
https://goo.gl/wcfnH1

My question is, is SparkSQL compiled against Hive 0.13 supposed to work
with a Hive 0.12 metastore (by setting
"spark.sql.hive.metastore.version" and "spark.sql.hive.metastore.jars")? It
only works if I comment out the line of code above.
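Concretely, these are the settings I'm passing; the jar classpath below is a placeholder for wherever the Hive 0.12 jars live in your environment:

```shell
# Example invocation; /path/to/hive-0.12-jars/* is a placeholder classpath
./bin/spark-sql \
  --conf spark.sql.hive.metastore.version=0.12.0 \
  --conf spark.sql.hive.metastore.jars=/path/to/hive-0.12-jars/*
```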

Thanks,
Cheolsoo
