Hi, I'm new to Hive and I'm trying to use a Python script as a UDF.
I wrote a simple projection function in add.py:

    #!/usr/local/python/bin/python
    import sys
    import string

    try:
        line = sys.stdin.readline()
        a, b = string.split(line, "\t")
        print a
    except:
        print sys.exc_info()

I added the file with:

    hive> add file /home/hadoop/usr/local/hive/hivescript/add.py;
    Added resources: [/home/hadoop/usr/local/hive/hivescript/add.py]

which appears to succeed. Then I issued the query:

    hive> select transform("aaa", "bbb") using 'python add.py' as (add string);

but the task fails with these errors:

    Diagnostic Messages for this Task:
    Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
        at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
    Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20003]: An error occurred when trying to close the Operator running your custom script.
        at org.apache.hadoop.hive.ql.exec.ScriptOperator.close(ScriptOperator.java:585)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
        at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:189)
        ... 8 more
    FAILED: Execution Error, return code 20003 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. An error occurred when trying to close the Operator running your custom script.
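In case it's relevant, here is my understanding of how a transform script is supposed to consume its input: looping over every line on stdin and splitting on tabs, rather than a single readline() with the deprecated string.split. This rewrite is only my guess at a more robust version, not something I've verified on the cluster:

```python
#!/usr/bin/env python
import sys

def first_fields(stream):
    # Strip the trailing newline from each tab-delimited input line
    # and keep only the first field.
    return [line.rstrip("\n").split("\t")[0] for line in stream]

if __name__ == "__main__":
    # Hive streams one row per line to stdin; emit one output row per line.
    for field in first_fields(sys.stdin):
        print(field)
```

The shebang here is also an assumption: /usr/bin/env python instead of the hard-coded /usr/local/python/bin/python path, in case that interpreter doesn't exist where the task runs.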
    MapReduce Jobs Launched:
    Stage-Stage-1: Map: 1  HDFS Read: 0  HDFS Write: 0  FAIL
    Total MapReduce CPU Time Spent: 0 msec

Does anyone know why this happens and how to fix it? I'm running Hive 2.1.0 on Hadoop 2.7.0 on CentOS 6 (single machine).

Thanks for your help!

Best,
Kangfei

*****************************************************************************
Department of Systems Engineering and Engineering Management
The Chinese University of Hong Kong
William M. Engineering Building, Rm 801
Tel: 3943 8326
Email: kfz...@se.cuhk.edu.hk; zkf1...@gmail.com
*****************************************************************************