Hi Yi,
Thanks for your reply.
1. The version of spark is 1.2.0 and the version of hive is 0.10.0-cdh4.2.1.
2. The full stack trace of the exception:
15/03/03 13:41:30 INFO Client:
client token: DAAAAUrrav1rAADCnhQzX_Ic6CMnfqcW2NIxra5n8824CRFZQVJOX0NMSUVOVF9UT0tFTgA
diagnostics: User class threw exception: checkPaths: hdfs://longzhou-hdpnn.lz.dscc:11000/tmp/hive-hadoop/hive_2015-03-03_13-41-04_472_3573658402424030395-1/-ext-10000 has nested directoryhdfs://longzhou-hdpnn.lz.dscc:11000/tmp/hive-hadoop/hive_2015-03-03_13-41-04_472_3573658402424030395-1/-ext-10000/attempt_201503031341_0057_m_003375_21951
ApplicationMaster host: longzhou-hdp4.lz.dscc
ApplicationMaster RPC port: 0
queue: dt_spark
start time: 1425361063973
final status: FAILED
tracking URL: longzhou-hdpnn.lz.dscc:12080/proxy/application_1421288865131_49822/history/application_1421288865131_49822
user: dt
Exception in thread "main" org.apache.spark.SparkException: Application finished with failed status
    at org.apache.spark.deploy.yarn.ClientBase$class.run(ClientBase.scala:504)
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:39)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:143)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
It seems you are right about the cause.
Still, I am confused: why is the nested directory
`hdfs://longzhou-hdpnn.lz.dscc:11000/tmp/hive-hadoop/hive_2015-03-03_13-41-04_472_3573658402424030395-1/-ext-10000/attempt_201503031341_0057_m_003375_21951`
rather than the path that |bak_startup_log_uid_20150227| points to? What is
in `/tmp/hive-hadoop`, and what are those files used for? There seem to be
a huge number of files in that directory.
Thanks.
On 2015-03-03 14:43, Yi Tian wrote:
Hi,
Some suggestions:
1 You should tell us the versions of Spark and Hive you are using.
2 You should paste the full stack trace of the exception.
In this case, I guess you have a nested directory in the path which
|bak_startup_log_uid_20150227| points to,
and the config field |hive.mapred.supports.subdirectories| is |false|
by default.
so…
if (!conf.getBoolVar(HiveConf.ConfVars.HIVE_HADOOP_SUPPORTS_SUBDIRECTORIES) &&
    item.isDir()) {
  throw new HiveException("checkPaths: " + src.getPath()
      + " has nested directory" + itemSource);
}
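The check quoted above can be sketched outside Hive with plain java.io.File (a minimal illustration of the same logic, not Hive's actual implementation; the class name and the simulated directory layout are made up for this example):

```java
import java.io.File;
import java.io.IOException;

public class CheckPathsSketch {
    // A simplified stand-in for Hive's checkPaths: when subdirectory support
    // is off, any directory found directly under the load path is an error.
    // The missing space after "directory" mirrors the Hive source quoted
    // above, which is why the error reads "nested directoryhdfs://...".
    static void checkPaths(File src, boolean supportsSubDirs) throws IOException {
        File[] items = src.listFiles();
        if (items == null) {
            return; // src is not a directory or is unreadable
        }
        for (File item : items) {
            if (!supportsSubDirs && item.isDirectory()) {
                throw new IOException("checkPaths: " + src.getPath()
                        + " has nested directory" + item.getPath());
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate a load path containing a stray task-attempt directory.
        File loadPath = new File(System.getProperty("java.io.tmpdir"), "checkpaths-demo");
        new File(loadPath, "attempt_201503031341_0057_m_003375_21951").mkdirs();
        try {
            checkPaths(loadPath, false); // false = the default config value
            System.out.println("ok: no nested directories");
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With the config left at its default (false), any directory sitting directly under the load path, such as a leftover task-attempt directory, trips the exception.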
On 3/3/15 14:36, LinQili wrote:
Hi all,
I was running an insert-select with Spark SQL, like:
insert into table startup_log_uid_20150227
select * from bak_startup_log_uid_20150227
where login_time < 1425027600
Usually, it fails with an exception:
org.apache.hadoop.hive.ql.metadata.Hive.checkPaths(Hive.java:2157)
org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:2298)
org.apache.hadoop.hive.ql.metadata.Table.copyFiles(Table.java:686)
org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1469)
org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:243)
org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:137)
org.apache.spark.sql.execution.Command$class.execute(commands.scala:46)
org.apache.spark.sql.hive.execution.InsertIntoHiveTable.execute(InsertIntoHiveTable.scala:51)
org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425)
org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:425)
org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58)
org.apache.spark.sql.SchemaRDD.<init>(SchemaRDD.scala:108)
org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:94)
com.nd.home99.LogsProcess$anonfun$main$1$anonfun$apply$1.apply(LogsProcess.scala:286)
com.nd.home99.LogsProcess$anonfun$main$1$anonfun$apply$1.apply(LogsProcess.scala:83)
scala.collection.immutable.List.foreach(List.scala:318)
com.nd.home99.LogsProcess$anonfun$main$1.apply(LogsProcess.scala:83)
com.nd.home99.LogsProcess$anonfun$main$1.apply(LogsProcess.scala:82)
scala.collection.immutable.List.foreach(List.scala:318)
com.nd.home99.LogsProcess$.main(LogsProcess.scala:82)
com.nd.home99.LogsProcess.main(LogsProcess.scala)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:601)
org.apache.spark.deploy.yarn.ApplicationMaster$anon$2.run(ApplicationMaster.scala:427)
Are there any hints about this?