Hi,

My Hadoop core-site.xml contains the following about tmp:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/hadoop_data/hadoop_data/tmp</value>
</property>

My hive-default.xml contains the following about tmp:

<property>
  <name>hive.exec.scratchdir</name>
  <value>/tmp/hive-${user.name}</value>
  <description>Scratch space for Hive jobs</description>
</property>

<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/tmp/${user.name}</value>
  <description>Local scratch space for Hive jobs</description>
</property>

Is this related to a configuration issue, or is it a bug? Please help!

Regards
Arthur

On 6 Jan, 2015, at 3:45 am, Jason Dere <jd...@hortonworks.com> wrote:

> During query compilation Hive needs to instantiate the UDF class, so the
> JAR needs to be resolvable by the class loader; thus the JAR is copied
> to a local temp location for use.
> During map/reduce jobs the local jar (like all jars added with the ADD JAR
> command) should then be added to the distributed cache. It looks like this is
> where the issue is occurring, but based on the path in the error message I
> suspect that either Hive or Hadoop is mistaking what should be a local path
> for an HDFS path.
>
> On Jan 4, 2015, at 10:23 AM, arthur.hk.c...@gmail.com
> <arthur.hk.c...@gmail.com> wrote:
>
>> Hi,
>>
>> A question: why does it need to copy the jar file to the temp folder? Why
>> couldn't it use the file defined in USING JAR
>> 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar' directly?
>>
>> Regards
>> Arthur
>>
>> On 4 Jan, 2015, at 7:48 am, arthur.hk.c...@gmail.com
>> <arthur.hk.c...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> A1: Are all of these commands (Steps 1-5) from the same Hive CLI prompt?
>>> Yes.
>>>
>>> A2: Would you be able to check if such a file exists with the same path
>>> on the local file system?
>>> The file does not exist on the local file system.
>>>
>>> Is there a way to set another "tmp" folder for Hive, or are there any
>>> suggestions to fix this issue?
>>>
>>> Thanks!!
>>>
>>> Arthur
>>>
>>> On 3 Jan, 2015, at 4:12 am, Jason Dere <jd...@hortonworks.com> wrote:
>>>
>>>> The point of USING JAR as part of the CREATE FUNCTION statement is to
>>>> avoid having to do the ADD JAR/aux path steps to get the UDF to work.
>>>>
>>>> Are all of these commands (Steps 1-5) from the same Hive CLI prompt?
>>>>
>>>>>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate'
>>>>>> using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>>>>>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>>>>>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>> OK
>>>>
>>>> One note:
>>>> /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>> here should actually be on the local file system, not on HDFS where you
>>>> were checking in Step 5. During CREATE FUNCTION/query compilation, Hive
>>>> makes a copy of the source JAR
>>>> (hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar) in a temp location
>>>> on the local file system, where it is used by that Hive session.
>>>>
>>>> The location mentioned in the FileNotFoundException
>>>> (hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar)
>>>> has a different path than the local copy mentioned during CREATE FUNCTION
>>>> (/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar).
>>>> I'm not really sure why it is an HDFS path here either; I'm not too
>>>> familiar with what goes on during the job submission process. But the fact
>>>> that this HDFS path has the same naming convention as the directory used
>>>> for downloading resources locally (***_resources) looks a little fishy to
>>>> me.
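[Editing note on the "fishy" path: in a URI of the form hdfs://tmp/..., the tmp segment sits in the authority (namenode host) slot, which is exactly the shape you would get if a local /tmp/..._resources path were accidentally given an hdfs:// scheme. A quick sketch with Python's standard urllib, for illustration only; this is not part of Hive's actual code path:]

```python
from urllib.parse import urlparse

# The failing path from the FileNotFoundException in Step 4's output:
bad = urlparse("hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/"
               "nexr-hive-udf-0.2-SNAPSHOT.jar")
print(bad.scheme)  # hdfs
print(bad.netloc)  # tmp  <- "tmp" is parsed as the namenode host, not a directory
print(bad.path)    # /5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar

# The local resource path reported by CREATE FUNCTION has no authority at all:
local = urlparse("/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/"
                 "nexr-hive-udf-0.2-SNAPSHOT.jar")
print(local.netloc == "")  # True -- a plain local filesystem path
```

[So if a local /tmp/<session>_resources/... path is naively re-qualified with an hdfs:// scheme, the leading /tmp component is swallowed as the host, which matches the failing path above.]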
Would you be able to check if such a file exists with the same path
>>>> on the local file system?
>>>>
>>>> On Dec 31, 2014, at 5:22 AM, Nirmal Kumar <nirmal.ku...@impetus.co.in>
>>>> wrote:
>>>>
>>>>> Important: HiveQL's ADD JAR operation does not work with HiveServer2
>>>>> and the Beeline client when Beeline runs on a different host. As an
>>>>> alternative to ADD JAR, Hive's auxiliary path functionality should be
>>>>> used as described below.
>>>>>
>>>>> Refer:
>>>>> http://www.cloudera.com/content/cloudera/en/documentation/cloudera-manager/v4-8-0/Cloudera-Manager-Managing-Clusters/cmmc_hive_udf.html
>>>>>
>>>>> Thanks,
>>>>> -Nirmal
>>>>>
>>>>> From: arthur.hk.c...@gmail.com <arthur.hk.c...@gmail.com>
>>>>> Sent: Tuesday, December 30, 2014 9:54 PM
>>>>> To: vic0777
>>>>> Cc: arthur.hk.c...@gmail.com; user@hive.apache.org
>>>>> Subject: Re: CREATE FUNCTION: How to automatically load extra jar file?
>>>>>
>>>>> Thank you.
>>>>>
>>>>> Will this work for HiveServer2?
>>>>>
>>>>> Arthur
>>>>>
>>>>> On 30 Dec, 2014, at 2:24 pm, vic0777 <vic0...@163.com> wrote:
>>>>>
>>>>>> You can put it into $HOME/.hiverc like this: ADD JAR
>>>>>> full_path_of_the_jar. Then the file is automatically loaded when Hive
>>>>>> is started.
>>>>>>
>>>>>> Wantao
>>>>>>
>>>>>> At 2014-12-30 11:01:06, "arthur.hk.c...@gmail.com"
>>>>>> <arthur.hk.c...@gmail.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I am using Hive 0.13.1 on Hadoop 2.4.1. I need to automatically load
>>>>>> an extra JAR file into Hive for a UDF; below are my steps to create
>>>>>> the UDF function. I have tried the following but still have had no
>>>>>> luck getting through.
>>>>>>
>>>>>> Please help!!
>>>>>>
>>>>>> Regards
>>>>>> Arthur
>>>>>>
>>>>>> Step 1: (make sure the jar is in HDFS)
>>>>>> hive> dfs -ls hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar;
>>>>>> -rw-r--r-- 3 hadoop hadoop 57388 2014-12-30 10:02 hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>
>>>>>> Step 2: (drop the function if it exists)
>>>>>> hive> drop function sysdate;
>>>>>> OK
>>>>>> Time taken: 0.013 seconds
>>>>>>
>>>>>> Step 3: (create the function using the jar in HDFS)
>>>>>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>>>>>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>>>>>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>> OK
>>>>>> Time taken: 0.034 seconds
>>>>>>
>>>>>> Step 4: (test)
>>>>>> hive> select sysdate();
>>>>>> Automatically selecting local only mode for query
>>>>>> Total jobs = 1
>>>>>> Launching Job 1 out of 1
>>>>>> Number of reduce tasks is set to 0 since there's no reduce operator
>>>>>> SLF4J: Class path contains multiple SLF4J bindings.
>>>>>> SLF4J: Found binding in [jar:file:/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>>>> SLF4J: Found binding in [jar:file:/hadoop/hbase-0.98.5-hadoop2/lib/phoenix-4.1.0-client-hadoop2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>>>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>>>>>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>>>>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
>>>>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: yarn.nodemanager.loacl-dirs; Ignoring.
>>>>>> 14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
>>>>>> Execution log at: /tmp/hadoop/hadoop_20141230101717_282ec475-8621-40fa-8178-a7927d81540b.log
>>>>>> java.io.FileNotFoundException: File does not exist: hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>> at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
>>>>>> at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
>>>>>> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>>>>> at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
>>>>>> at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
>>>>>> at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
>>>>>> at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
>>>>>> at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
>>>>>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
>>>>>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>>>>>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
>>>>>> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
>>>>>> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>>>>>> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
>>>>>> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
>>>>>> at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
>>>>>> at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:740)
>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>>>>>> Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar)'
>>>>>> Execution failed with exit status: 1
>>>>>> Obtaining error information
>>>>>> Task failed!
>>>>>> Task ID:
>>>>>>   Stage-1
>>>>>> Logs:
>>>>>> /tmp/hadoop/hive.log
>>>>>> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>>>>
>>>>>> Step 5: (check the file)
>>>>>> hive> dfs -ls /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar;
>>>>>> ls: `/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar': No such file or directory
>>>>>> Command failed with exit code = 1
>>>>>> Query returned non-zero code: 1, cause: null
>>>>>
>>>>> NOTE: This message may contain information that is confidential,
>>>>> proprietary, privileged or otherwise protected by law. The message is
>>>>> intended solely for the named addressee. If received in error, please
>>>>> destroy and notify the sender. Any use of this email is prohibited when
>>>>> received in error. Impetus does not represent, warrant and/or guarantee
>>>>> that the integrity of this communication has been maintained nor that
>>>>> the communication is free of errors, virus, interception or interference.
>>>>
>>>> CONFIDENTIALITY NOTICE
>>>> NOTICE: This message is intended for the use of the individual or entity
>>>> to which it is addressed and may contain information that is confidential,
>>>> privileged and exempt from disclosure under applicable law. If the reader
>>>> of this message is not the intended recipient, you are hereby notified
>>>> that any printing, copying, dissemination, distribution, disclosure or
>>>> forwarding of this communication is strictly prohibited.
If you have received this communication in error, please contact the
>>>> sender immediately and delete it from your system. Thank You.
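[Editing note on Arthur's earlier question about setting another "tmp" folder for Hive: the two scratch-dir properties he quoted from hive-default.xml can be overridden in hive-site.xml. A hedged sketch; the /hadoop_data/... paths below are placeholders, not values recommended by anyone on the thread:]

```xml
<!-- hive-site.xml: override the scratch-dir defaults from hive-default.xml.
     The paths below are illustrative only; pick locations that exist and
     are writable by the Hive user. -->
<property>
  <name>hive.exec.scratchdir</name>
  <!-- scratch space on HDFS for Hive jobs -->
  <value>/hadoop_data/hive/scratch/${user.name}</value>
</property>
<property>
  <name>hive.exec.local.scratchdir</name>
  <!-- scratch space on the local file system -->
  <value>/hadoop_data/hive/local-scratch/${user.name}</value>
</property>
```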