The DROP TABLE command between these two queries fails due to a permission issue, and if I fix the permission issue the Tez error disappears. What confuses me is that the failure of the DROP TABLE command should not affect the following Tez jobs. Yet the hdfs-audit log shows the Tez staging directory being deleted between the first query and the second query, and apart from the audit log I see no trace of that deletion anywhere else, including the hive log and the YARN application log. Does anyone know whether Hive deletes the Tez staging directory in this case?
Here are the hive log and the hdfs-audit log.

*hive.log*

2015-06-25 15:54:33,623 INFO [main]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(732)) - 0: drop_table : db=TESTDEFAULT tbl=tmp_recomm_prdt_detail0
2015-06-25 15:54:33,623 INFO [main]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(358)) - ugi=lujian ip=unknown-ip-addr cmd=drop_table : db=TESTDEFAULT tbl=tmp_recomm_prdt_detail0
2015-06-25 15:54:33,658 INFO [main]: metastore.hivemetastoressimpl (HiveMetaStoreFsImpl.java:deleteDir(41)) - deleting hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/user/hive/warehouse/testdefault.db/tmp_recomm_prdt_detail0
2015-06-25 15:54:33,660 INFO [main]: fs.TrashPolicyDefault (TrashPolicyDefault.java:initialize(92)) - Namenode trash configuration: Deletion interval = 1440 minutes, Emptier interval = 0 minutes.
2015-06-25 15:54:33,669 ERROR [main]: hive.log (MetaStoreUtils.java:logAndThrowMetaException(1173)) - Got exception: java.io.IOException Failed to move to trash: hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/user/hive/warehouse/testdefault.db/tmp_recomm_prdt_detail0
java.io.IOException: Failed to move to trash: hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/user/hive/warehouse/testdefault.db/tmp_recomm_prdt_detail0
        at org.apache.hadoop.fs.TrashPolicyDefault.moveToTrash(TrashPolicyDefault.java:160)
        at org.apache.hadoop.fs.Trash.moveToTrash(Trash.java:109)
        ... (full stack trace omitted)
2015-06-25 15:54:33,670 ERROR [main]: hive.log (MetaStoreUtils.java:logAndThrowMetaException(1174)) - Converting exception to MetaException
2015-06-25 15:54:33,670 ERROR [main]: metastore.HiveMetaStore (HiveMetaStore.java:deleteTableData(1557)) - Failed to delete table directory: hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/user/hive/warehouse/testdefault.db/tmp_recomm_prdt_detail0 Got exception: java.io.IOException Failed to move to trash: hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/user/hive/warehouse/testdefault.db/tmp_recomm_prdt_detail0
2015-06-25 15:54:33,670 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=runTasks start=1435218873582 end=1435218873670 duration=88 from=org.apache.hadoop.hive.ql.Driver>
2015-06-25 15:54:33,670 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=Driver.execute start=1435218873582 end=1435218873670 duration=88 from=org.apache.hadoop.hive.ql.Driver>
2015-06-25 15:54:33,671 INFO [main]: ql.Driver (SessionState.java:printInfo(852)) - OK
2015-06-25 15:54:33,671 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
2015-06-25 15:54:33,671 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=releaseLocks start=1435218873671 end=1435218873671 duration=0 from=org.apache.hadoop.hive.ql.Driver>
2015-06-25 15:54:33,671 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=Driver.run start=1435218873554 end=1435218873671 duration=117 from=org.apache.hadoop.hive.ql.Driver>
2015-06-25 15:54:33,671 INFO [main]: CliDriver (SessionState.java:printInfo(852)) - Time taken: 0.117 seconds
2015-06-25 15:54:33,671 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
2015-06-25 15:54:33,671 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver>
2015-06-25 15:54:33,671 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver>
2015-06-25 15:54:33,672 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=parse from=org.apache.hadoop.hive.ql.Driver>
2015-06-25 15:54:33,672 INFO [main]: parse.ParseDriver (ParseDriver.java:parse(185)) - Parsing command: create table TESTDEFAULT.tmp_recomm_prdt_detail0(track_id bigint,cart_time string,product_id bigint,session_id string,gu_id string,url string,referer string,province_id bigint,page_type bigint,button_position string,link_position string,ip string,m_session_id string,link_id bigint,button_id bigint,algorithm_id bigint,from_value string,ref_value string,step string,platform_type_id int,pm_info_id bigint,refer_page_value string,url_page_value string)
2015-06-25 15:54:33,674 INFO [main]: parse.ParseDriver (ParseDriver.java:parse(206)) - Parse Completed
2015-06-25 15:54:33,674 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=parse start=1435218873672 end=1435218873674 duration=2 from=org.apache.hadoop.hive.ql.Driver>
2015-06-25 15:54:33,674 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
2015-06-25 15:54:33,675 INFO [main]: parse.CalcitePlanner (SemanticAnalyzer.java:analyzeInternal(10146)) - Starting Semantic Analysis

*hdfs-audit.log*

2015-06-25 15:54:33,673 INFO FSNamesystem.audit: allowed=true ugi=lujian (auth:SIMPLE) ip=/10.17.28.11 cmd=delete src=/user/lujian/lujian/_tez_session_dir/abb91da9-ac07-4024-a09f-8622ee1caedf dst=null perm=null

On Tue, Jun 23, 2015 at 2:02 PM, r7raul1...@163.com <r7raul1...@163.com> wrote:

> Sorry, I made a mistake:
> *using hive cli to execute the same query on tez throws the same exception.*
> ------------------------------
> r7raul1...@163.com
>
> *From:* r7raul1...@163.com
> *Date:* 2015-06-23 13:53
> *To:* user <user@hive.apache.org>
> *Subject:* hive -e run tez query error
>
> When I use hive 1.1.0 on tez 0.53 in hadoop 2.3.0:
>
> *hive -v -e "set hive.execution.engine=tez;set mapred.job.queue.name=bi_etl;drop table TESTTMP.a_start;create table TESTTMP.a_start(id bigint);insert overwrite table TESTTMP.a_start select id from tandem.p_city;drop table TESTTMP.a_end;create table TESTTMP.a_end(id bigint);insert overwrite table TESTTMP.a_end select id from TESTTMP.a_start;"*
>
> Logging initialized using configuration in jar:file:/usr/local/src/apache-hive/lib/hive-common-1.1.0.jar!/hive-log4j.properties
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/usr/local/src/apache-hive/lib/hive-jdbc-1.1.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> set hive.execution.engine=tez
> set mapred.job.queue.name=bi_etl
> drop table TESTTMP.a_start
> OK
> Time taken: 1.702 seconds
> create table TESTTMP.a_start(id bigint)
> OK
> Time taken: 0.311 seconds
> insert overwrite table TESTTMP.a_start select id from tandem.p_city
> Query ID = lujian_20150623134848_31ba3183-74b8-4c3a-96ae-5c8b650b99df
> Total jobs = 1
> Launching Job 1 out of 1
>
> Status: Running (Executing on YARN cluster with App id application_1433219182593_252390)
>
> --------------------------------------------------------------------------------
>         VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> --------------------------------------------------------------------------------
> Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
> --------------------------------------------------------------------------------
> VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 13.55 s
> --------------------------------------------------------------------------------
>
> Loading data to table testtmp.a_start
> Moved: 'hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/user/hive/warehouse/testtmp.db/a_start/000000_0' to trash at: hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/user/lujian/.Trash/Current
> Table testtmp.a_start stats: [numFiles=1, numRows=531, totalSize=2154, rawDataSize=1623]
> OK
> Time taken: 25.586 seconds
> drop table TESTTMP.a_end
> OK
> Time taken: 0.232 seconds
> create table TESTTMP.a_end(id bigint)
> OK
> Time taken: 0.068 seconds
> insert overwrite table TESTTMP.a_end select id from TESTTMP.a_start
> Query ID = lujian_20150623134949_bff735c9-6abc-47e7-a9f7-f7e2be7e43e9
> Total jobs = 1
> Launching Job 1 out of 1
>
> Status: Running (Executing on YARN cluster with App id application_1433219182593_252390)
>
> --------------------------------------------------------------------------------
>         VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> --------------------------------------------------------------------------------
> Map 1                 FAILED      1          0        0        1       4       0
> --------------------------------------------------------------------------------
> VERTICES: 00/01  [>>--------------------------] 0%    ELAPSED TIME: 15.46 s
> --------------------------------------------------------------------------------
>
> Status: Failed
> Vertex failed, vertexName=Map 1, vertexId=vertex_1433219182593_252390_2_00, diagnostics=[Task failed, taskId=task_1433219182593_252390_2_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Container container_1433219182593_252390_01_000003 finished with diagnostics set to [Container failed. File does not exist: hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/user/lujian/lujian/_tez_session_dir/63de23a2-1cff-4434-96ad-1304089fb489/.tez/application_1433219182593_252390/tez-conf.pb]], TaskAttempt 1 failed, info=[Container container_1433219182593_252390_01_000004 finished with diagnostics set to [Container failed. File does not exist: hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/user/lujian/lujian/_tez_session_dir/63de23a2-1cff-4434-96ad-1304089fb489/.tez/application_1433219182593_252390/tez-conf.pb]], TaskAttempt 2 failed, info=[Container container_1433219182593_252390_01_000005 finished with diagnostics set to [Container failed. File does not exist: hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/user/lujian/lujian/_tez_session_dir/63de23a2-1cff-4434-96ad-1304089fb489/.tez/application_1433219182593_252390/tez-conf.pb]], TaskAttempt 3 failed, info=[Container container_1433219182593_252390_01_000006 finished with diagnostics set to [Container failed. File does not exist: hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/user/lujian/lujian/_tez_session_dir/63de23a2-1cff-4434-96ad-1304089fb489/.tez/application_1433219182593_252390/tez-conf.pb]]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex vertex_1433219182593_252390_2_00 [Map 1] killed/failed due to:null]
> DAG failed due to vertex failure. failedVertices:1 killedVertices:0
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask
>
> *Using hive cli to execute the same query, no exception is thrown.*
> *Using hive -e to execute the same query on mr, no exception is thrown.*
>
> ------------------------------
> r7raul1...@163.com

-- 
Best Regards

Jeff Zhang
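P.S. For anyone trying to correlate the two logs, here is a small parsing sketch. The regex and field names are inferred from the single FSNamesystem.audit line quoted above (real deployments may log extra fields), so treat this as an assumption, not an official format spec. It pulls the user, command, and path out of an audit entry, which makes it easy to grep the full audit log for every delete touching the `_tez_session_dir`:

```python
import re

# Hypothetical helper: extract fields from an HDFS FSNamesystem.audit entry.
# Pattern inferred from the one audit line quoted in this thread.
AUDIT_RE = re.compile(
    r"allowed=(?P<allowed>\S+)\s+ugi=(?P<ugi>\S+).*?"
    r"cmd=(?P<cmd>\S+)\s+src=(?P<src>\S+)"
)

def parse_audit(line):
    """Return a dict of audit fields, or None if the line does not match."""
    m = AUDIT_RE.search(line)
    return m.groupdict() if m else None

line = ("2015-06-25 15:54:33,673 INFO FSNamesystem.audit: allowed=true "
        "ugi=lujian (auth:SIMPLE) ip=/10.17.28.11 cmd=delete "
        "src=/user/lujian/lujian/_tez_session_dir/abb91da9-ac07-4024-a09f-8622ee1caedf "
        "dst=null perm=null")

entry = parse_audit(line)
# The session-dir delete shows up with cmd=delete under the same ugi as the CLI session.
print(entry["ugi"], entry["cmd"], entry["src"])
```

Running this over the whole hdfs-audit.log, filtered to paths containing `_tez_session_dir`, would show whether the delete came from the same user and IP as the Hive CLI session or from some other process.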