Thanks Mathieu,
So I need either a shared filesystem or Hadoop as the filesystem in order to write data from a standalone-mode cluster. Thanks for your input.

Regards
Stuti Awasthi

From: Mathieu Longtin [math...@closetwork.org]
Sent: Tuesday, May 24, 2016 7:34 PM
To: Stuti Awasthi; Jacek Laskowski
Cc: user
Subject: Re: Not able to write output to local filesystem from Standalone mode.

In standalone mode, executors assume they have access to a shared file system. The driver creates the output directory and the executors write the files. Without a shared file system, that directory exists only on the driver's machine, so the executors end up not writing anything because it is missing from their local disks.
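
One possible workaround when no shared file system is available, sketched here under assumptions rather than taken from the thread, is to bring the results back to the driver and write them there with plain Java I/O. The helper name LocalWrite.writeLines and the output path in the usage comment are hypothetical. This only suits output small enough to be streamed through the driver.

```scala
import java.io.{File, PrintWriter}

object LocalWrite {
  // Writes each line to a single file on the machine where this code runs
  // (in spark-shell, that is the driver). Returns the number of lines written.
  def writeLines(lines: Iterator[String], path: String): Long = {
    val out = new PrintWriter(new File(path), "UTF-8")
    try {
      var count = 0L
      lines.foreach { line =>
        out.println(line)
        count += 1
      }
      count
    } finally {
      out.close() // always release the file handle, even if writing fails
    }
  }
}

// Hypothetical usage in spark-shell, assuming `data` is an RDD[String]:
//   LocalWrite.writeLines(data.toLocalIterator, "/home/stuti/test1.txt")
```

RDD.toLocalIterator pulls one partition at a time to the driver, so the whole dataset does not have to fit in driver memory at once, although every record still travels over the network to a single machine.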

On Tue, May 24, 2016 at 8:01 AM Stuti Awasthi <stutiawas...@hcl.com> wrote:
Hi Jacek,

The parent directory already exists; it's my home directory. I'm using a 64-bit Linux (Red Hat) machine.
Also, I noticed that a "test1" folder is created on my master with an empty "_temporary" subdirectory, but on the slaves no such directory is created under /home/stuti.

Thanks
Stuti 

From: Jacek Laskowski [ja...@japila.pl]
Sent: Tuesday, May 24, 2016 5:27 PM
To: Stuti Awasthi
Cc: user
Subject: Re: Not able to write output to local filesystem from Standalone mode.

Hi,

What happens when you create the parent directory /home/stuti? I think the failure is due to missing parent directories. What's the OS?

Jacek

On 24 May 2016 11:27 a.m., "Stuti Awasthi" <stutiawas...@hcl.com> wrote:

Hi All,

I have a 3-node Spark 1.6 standalone-mode cluster with 1 master and 2 slaves. I am not using Hadoop as the filesystem. I am able to launch the shell, read the input file from the local filesystem, and perform transformations successfully. When I try to write my output to a local filesystem path, I receive the error below.

 

I searched the web and found a similar Jira issue: https://issues.apache.org/jira/browse/SPARK-2984 . Although it is marked as resolved for Spark 1.3+, people have already posted that the same issue persists in the latest versions.

 

ERROR

scala> data.saveAsTextFile("/home/stuti/test1")

16/05/24 05:03:42 WARN TaskSetManager: Lost task 1.0 in stage 1.0 (TID 2, server1): java.io.IOException: The temporary job-output directory file:/home/stuti/test1/_temporary doesn't exist!
        at org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:250)
        at org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:244)
        at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:116)
        at org.apache.spark.SparkHadoopWriter.open(SparkHadoopWriter.scala:91)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1193)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1185)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

 

What is the best way to resolve this issue if I don't want to install Hadoop? Or is it mandatory to have Hadoop in order to write output from a standalone-mode cluster?

Please suggest.

Thanks & Regards
Stuti Awasthi

 




--------------------------------------------------------------------- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
--
Mathieu Longtin
1-514-803-8977