Thanks. I just set up an Oozie Hive action to test on a single-node cluster, and putting "ADD JAR /path/to/hdfs/location" in the Hive script worked. Hopefully I won't hit any issues when I try it on a multi-node cluster.
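For anyone who finds this later, here is a rough sketch of the kind of script I mean. The jar path and SerDe class are the ones from my original message below; the table name, its columns, and the script filename are made up for illustration, so treat this as an outline under those assumptions rather than my exact setup.

```shell
#!/bin/sh
# Sketch only: example_csv, its columns, and add_jar_example.sql are hypothetical.
# The jar path and SerDe class are the ones quoted in this thread.

cat > add_jar_example.sql <<'EOF'
-- Register the SerDe jar straight from HDFS for this Hive session.
ADD JAR hdfs:///hdfspath/to/truven-hive-serdes-1.0.jar;

-- Hypothetical table that uses the SerDe shipped in that jar.
CREATE TABLE IF NOT EXISTS example_csv (id STRING, amount STRING)
ROW FORMAT SERDE 'com.truven.hiveserde.csv.CsvSerDe';
EOF

# Run the script only where the Hive CLI is actually installed.
if command -v hive >/dev/null 2>&1; then
  hive -f add_jar_example.sql
else
  echo "hive not found; wrote add_jar_example.sql"
fi
```

In the Oozie case, the same script would be the file named in the Hive action's <script> element, so no hive.aux.jars.path setting is involved.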

On Mon, Dec 2, 2013 at 5:37 PM, Adam Kawa <kawa.a...@gmail.com> wrote:

> You can use the ADD JAR command inside a Hive script and a parameter in the
> Oozie workflow definition. An example is here:
> http://blog.cloudera.com/blog/2013/01/how-to-schedule-recurring-hadoop-jobs-with-apache-oozie/
>
> 2013/12/2 <mpeters...@gmail.com>
>
>> Is it possible to specify a Hive auxiliary jar (like a SerDe) that is in
>> HDFS rather than the local filesystem?
>>
>> I am using a CsvSerDe I wrote, and when I specify it in Hive's
>> hive.aux.jars.path with a local file system path, it works:
>>
>> hive -hiveconf
>> hive.aux.jars.path=*file:*///path/to/truven-hive-serdes-1.0.jar
>> -hiveconf hive.auto.convert.join.noconditionaltask.size=25000000 -f
>> hivefiscalyearqueries.sql
>>
>> But when I put that jar in HDFS and point to it, it fails:
>>
>> hive -hiveconf
>> hive.aux.jars.path=*hdfs:*///hdfspath/to/truven-hive-serdes-1.0.jar
>> -hiveconf hive.auto.convert.join.noconditionaltask.size=25000000 -f
>> hivefiscalyearqueries.sql
>>
>> with the error message:
>>
>> java.lang.ClassNotFoundException: com.truven.hiveserde.csv.CsvSerDe
>> Continuing ...
>> 2013-12-02 03:48:25 Starting to launch local task to process map join; maximum memory = 1065484288
>> org.apache.hadoop.hive.ql.metadata.HiveException: Failed with exception nulljava.lang.NullPointerException
>>     at org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromTable(FetchOperator.java:230)
>>     at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:595)
>>     at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:406)
>>     at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:290)
>>     at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:682)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
>>
>>     at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631)
>>     at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:406)
>>     at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:290)
>>     at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:682)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
>> Execution failed with exit status: 2
>> Obtaining error information
>>
>> Task failed!
>>
>> I will need to run this from Oozie eventually, so I'd like to know how to
>> get Hive to use a jar in HDFS, rather than having to distribute the file to
>> the local file system of all datanodes.
>>
>> Thank you,
>> Michael