hi,
I'm trying to set up a standalone server, and in one of my tests I got the
following exception:
java.io.IOException: Can't make directory for path 's3n://ww-sandbox/name_of_path' since it is a file.
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.mkdir(NativeS3FileSystem.java:541)
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.mkdirs(NativeS3FileSystem.java:532)
    at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1867)
    at org.apache.hadoop.mapred.FileOutputCommitter.setupJob(FileOutputCommitter.java:52)
    at org.apache.spark.SparkHadoopWriter.preSetup(SparkHadoopWriter.scala:64)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1093)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply(PairRDDFunctions.scala:1065)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply(PairRDDFunctions.scala:1065)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:148)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:109)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:286)
    at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:1065)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$4.apply$mcV$sp(PairRDDFunctions.scala:989)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$4.apply(PairRDDFunctions.scala:965)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$4.apply(PairRDDFunctions.scala:965)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:148)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:109)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:286)
    at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:965)
    at org.apache.spark.api.java.JavaPairRDD.saveAsHadoopFile(JavaPairRDD.scala:789)
    at com.windward.spark.io.SparkWriter.write(SparkWriter.java:93)
    at com.windward.spark.io.MultipleDataFileWriter.write(MultipleDataFileWriter.java:48)
    at com.windward.spark.io.SparkWriterContainer.write(SparkWriterContainer.java:85)
    at com.windward.spark.io.SparkWriterContainer.write(SparkWriterContainer.java:72)
    at com.windward.spark.io.SparkWriterContainer.write(SparkWriterContainer.java:56)
    at com.windward.spark.apps.VesselStoriesRunner.doWork(VesselStoriesRunner.java:91)
    at com.windward.spark.AbstractSparkRunner.calcAll(AbstractSparkRunner.java:60)
    at com.windward.spark.apps.VesselStoriesApp.main(VesselStoriesApp.java:8)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
The relevant part of the S3 bucket looks like this:
[root@ip-172-31-7-77 startup_scripts]$ s3cmd ls s3://ww-sandbox/name_of_path
                       DIR   s3://ww-sandbox/name_of_path/
2014-11-13 10:27         0   s3://ww-sandbox/name_of_path
2015-06-21 20:39         0   s3://ww-sandbox/name_of_path_$folder$
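Note that the listing shows both a directory marker and a zero-byte object at the exact key 's3n://ww-sandbox/name_of_path'. NativeS3FileSystem treats that zero-byte object as a file, which is why mkdirs refuses to create a directory there. A minimal local sketch of the same conflict (using a plain filesystem as a stand-in for S3, with a hypothetical temp path):

```shell
# Stand-in for the zero-byte object occupying the key 'name_of_path'
tmp=$(mktemp -d)
touch "$tmp/name_of_path"

# mkdirs on the same path must fail, because a file already sits there
if mkdir "$tmp/name_of_path" 2>/dev/null; then
  msg="mkdir succeeded"
else
  msg="mkdir failed: path already exists as a file"
fi
echo "$msg"

rm -rf "$tmp"
```

If that zero-byte object is indeed the culprit, deleting it (e.g. `s3cmd del s3://ww-sandbox/name_of_path`, leaving the `name_of_path/` prefix and the `_$folder$` marker in place) would be one thing to try before re-running the job; this is an assumption based on the exception message, not something I've verified.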
I tried passing the path both with and without a trailing '/'.
The exact same call works for me on a YARN cluster (which I'm trying to
move away from).
Does anyone have any idea?
thanks, nizan
Sent from the Apache Spark User List mailing list archive at Nabble.com.