Hi

I am using Spark 1.3.0. The command I use is below.

/spark-submit --class org.com.td.sparkdemo.spark.WordCount \
    --master yarn-cluster \
    target/spark-0.0.1-SNAPSHOT-jar-with-dependencies.jar

Thanks
Shashi

On Sun, Nov 8, 2015 at 11:33 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> Which release of Spark were you using?
>
> Can you post the command you used to run WordCount?
>
> Cheers
>
> On Sat, Nov 7, 2015 at 7:59 AM, Shashi Vishwakarma <
> shashi.vish...@gmail.com> wrote:
>
>> I am trying to run a simple word count job in Spark, but I am getting an
>> exception while running the job.
>>
>> For more detailed output, check application tracking page:
>> http://quickstart.cloudera:8088/proxy/application_1446699275562_0006/
>> Then, click on links to logs of each attempt.
>> Diagnostics: Exception from container-launch.
>> Container id: container_1446699275562_0006_02_000001
>> Exit code: 15
>> Stack trace: ExitCodeException exitCode=15:
>>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
>>         at org.apache.hadoop.util.Shell.run(Shell.java:455)
>>         at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
>>         at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
>>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
>>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
>>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>         at java.lang.Thread.run(Thread.java:745)
>>
>> Container exited with a non-zero exit code 15
>> Failing this attempt. Failing the application.
>>          ApplicationMaster host: N/A
>>          ApplicationMaster RPC port: -1
>>          queue: root.cloudera
>>          start time: 1446910483956
>>          final status: FAILED
>>          tracking URL: http://quickstart.cloudera:8088/cluster/app/application_1446699275562_0006
>>          user: cloudera
>> Exception in thread "main" org.apache.spark.SparkException: Application finished with failed status
>>         at org.apache.spark.deploy.yarn.Client.run(Client.scala:626)
>>         at org.apache.spark.deploy.yarn.Client$.main(Client.scala:651)
>>         at org.apache.spark.deploy.yarn.Client.main(Client.scala)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>         at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
>>         at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
>>         at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
>>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
>>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>
>> I checked the log with the following command:
>>
>> yarn logs -applicationId application_1446699275562_0006
>>
>> Here is the log:
>>
>> 15/11/07 07:35:09 ERROR yarn.ApplicationMaster: User class threw exception: Output directory hdfs://quickstart.cloudera:8020/user/cloudera/WordCountOutput already exists
>> org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://quickstart.cloudera:8020/user/cloudera/WordCountOutput already exists
>>         at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:132)
>>         at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:1053)
>>         at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:954)
>>         at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:863)
>>         at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1290)
>>         at org.com.td.sparkdemo.spark.WordCount$.main(WordCount.scala:23)
>>         at org.com.td.sparkdemo.spark.WordCount.main(WordCount.scala)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>         at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:480)
>> 15/11/07 07:35:09 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: Output directory hdfs://quickstart.cloudera:8020/user/cloudera/WordCountOutput already exists)
>> 15/11/07 07:35:14 ERROR yarn.ApplicationMaster: SparkContext did not initialize after waiting for 100000 ms. Please check earlier log output for errors. Failing the application.
>>
>> The exception clearly indicates that the WordCountOutput directory already
>> exists, but I made sure the directory was not there before running the job.
>>
>> Why am I getting this error even though the directory did not exist before
>> I ran the job?
>>
>>
>
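
One detail in the diagnostics may be worth checking. The failing container id, container_1446699275562_0006_02_000001, appears to carry an attempt number of 02, which suggests it belongs to the second attempt of the application. YARN can re-run the ApplicationMaster after a failure, so an earlier attempt might have created WordCountOutput before this one started. That is only a guess from the log, not a confirmed cause. The sketch below pulls the attempt number out of such an id, assuming the layout container_<clusterTimestamp>_<appSeq>_<attempt>_<containerSeq>; the names ContainerIds, ContainerIdParts and parseContainerId are hypothetical and used only for illustration.

```scala
object ContainerIds {
  // Hypothetical holder for the four numeric fields of a container id.
  final case class ContainerIdParts(
      clusterTimestamp: Long,
      appSeq: Int,
      attempt: Int,
      containerSeq: Long)

  // Assumed layout: container_<clusterTimestamp>_<appSeq>_<attempt>_<containerSeq>
  private val Pattern = """container_(\d+)_(\d+)_(\d+)_(\d+)""".r

  // Returns None when the string does not match the assumed layout.
  def parseContainerId(id: String): Option[ContainerIdParts] = id match {
    case Pattern(ts, app, attempt, seq) =>
      Some(ContainerIdParts(ts.toLong, app.toInt, attempt.toInt, seq.toLong))
    case _ => None
  }

  def main(args: Array[String]): Unit = {
    val id = "container_1446699275562_0006_02_000001"
    parseContainerId(id) match {
      case Some(parts) => println(s"attempt = ${parts.attempt}") // prints "attempt = 2"
      case None        => println(s"unrecognized container id: $id")
    }
  }
}
```

If the attempt number is greater than 1, the logs of the first attempt (also included in the `yarn logs` output) are the place to look for the original failure.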
