I found the problem: for each application, the Spark worker node saves the corresponding stdout and stderr under ./spark/work/appid, where appid is the id of the application. If I run several applications in a row, the disk runs out of space. In my case, the disk usage under ./spark/work/ is as follows:

1689784  ./app-20150208203033-0002/0
1689788  ./app-20150208203033-0002
40324    ./driver-20150208180505-0001
1691400  ./app-20150208180509-0001/0
1691404  ./app-20150208180509-0001
40316    ./driver-20150208203030-0002
40320    ./driver-20150208173156-0000
1649876  ./app-20150208173200-0000/0
1649880  ./app-20150208173200-0000
5152036  .

Any suggestion on how to resolve this? Thanks.

Ey-Chih Chow
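For the growth under ./spark/work itself, one mitigation is to let the standalone worker purge old application directories on its own. A minimal sketch, assuming a Spark 1.x standalone cluster (the spark.worker.cleanup.* properties come from the standalone-mode docs; the intervals below are only examples), added to conf/spark-env.sh on each worker and followed by a worker restart:

    # Periodically purge work dirs of *finished* applications under
    # SPARK_WORKER_DIR (./spark/work): check every 30 minutes and
    # delete anything older than one hour.
    export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true \
      -Dspark.worker.cleanup.interval=1800 \
      -Dspark.worker.cleanup.appDataTtl=3600"

Note that this only removes directories of applications that have already stopped; if the stdout/stderr of a single long-running application is what fills the disk, the spark.executor.logs.rolling.* properties available in recent 1.x releases can cap the log size instead.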
From: eyc...@hotmail.com
To: gen.tan...@gmail.com
CC: user@spark.apache.org
Subject: RE: no space left at worker node
Date: Sun, 8 Feb 2015 15:25:43 -0800

By the way, the input and output paths of the job are all in S3. I did not use HDFS paths as input or output.

Best regards,
Ey-Chih Chow

From: eyc...@hotmail.com
To: gen.tan...@gmail.com
CC: user@spark.apache.org
Subject: RE: no space left at worker node
Date: Sun, 8 Feb 2015 14:57:15 -0800

Hi Gen,

Thanks. I save my logs in a file under /var/log. This is the only place where I save data. Will the problem go away if I use a better machine?

Best regards,
Ey-Chih Chow

Date: Sun, 8 Feb 2015 23:32:27 +0100
Subject: Re: no space left at worker node
From: gen.tan...@gmail.com
To: eyc...@hotmail.com
CC: user@spark.apache.org

Hi,

I am sorry, I made a mistake. r3.large has only one SSD, which is mounted at /mnt, so there is no /dev/sdc. In fact, the problem is that there is no space left under the / directory. You should check whether your application writes data under this directory (for instance, saving files with file:///). If not, you can run watch du -sh while the job is running to figure out which directory is growing. Normally only the /mnt directory, which is backed by the SSD, grows significantly, because the HDFS data is stored there. Then you can find the directory that caused the no-space problem and track down the specific reason.

Cheers,
Gen

On Sun, Feb 8, 2015 at 10:45 PM, ey-chih chow <eyc...@hotmail.com> wrote:

Thanks Gen. How can I check if /dev/sdc is well mounted or not? In general, the problem shows up when I submit the second or third job. The first job I submit will most likely succeed.

Ey-Chih Chow

Date: Sun, 8 Feb 2015 18:18:03 +0100
Subject: Re: no space left at worker node
From: gen.tan...@gmail.com
To: eyc...@hotmail.com
CC: user@spark.apache.org

Hi,

In fact, /dev/sdb is /dev/xvdb. It seems that there is no problem with a double mount. However, there is no information about /mnt2. You should check whether /dev/sdc is well mounted or not. Michael's reply is a good solution for this type of problem. You can check his site.

Cheers,
Gen

On Sun, Feb 8, 2015 at 5:53 PM, ey-chih chow <eyc...@hotmail.com> wrote:

Gen,

Thanks for your information. The content of /etc/fstab at the worker node (r3.large) is:

#LABEL=/   /         ext4    defaults,noatime  1 1
tmpfs      /dev/shm  tmpfs   defaults  0 0
devpts     /dev/pts  devpts  gid=5,mode=620  0 0
sysfs      /sys      sysfs   defaults  0 0
proc       /proc     proc    defaults  0 0
/dev/sdb   /mnt      auto    defaults,noatime,nodiratime,comment=cloudconfig  0 0
/dev/sdc   /mnt2     auto    defaults,noatime,nodiratime,comment=cloudconfig  0 0

There is no entry for /dev/xvdb.

Ey-Chih Chow

Date: Sun, 8 Feb 2015 12:09:37 +0100
Subject: Re: no space left at worker node
From: gen.tan...@gmail.com
To: eyc...@hotmail.com
CC: user@spark.apache.org

Hi,

In fact, I met this problem before; it is a bug of AWS. Which type of machine do you use? If I guess well, you can check the file /etc/fstab: there would be a double mount of /dev/xvdb. If yes, you should
1. stop hdfs
2. umount /dev/xvdb at /
3. restart hdfs

Hope this could be helpful.
Cheers,
Gen
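A sketch of the check and fix Gen describes, assuming the stock spark-ec2 layout where ephemeral HDFS lives under /root/ephemeral-hdfs (the script paths and the duplicate mount point below are assumptions; substitute whatever your cluster actually uses):

    # See whether the instance-store device shows up more than once.
    grep xvdb /proc/mounts
    df -h

    # While a job runs, watch which directories keep growing.
    watch -n 10 "du -sh /root/spark/work /mnt /var/log 2>/dev/null"

    # If /dev/xvdb is mounted twice, stop HDFS, drop the duplicate
    # mount (use the mount point reported by /proc/mounts above),
    # then restart HDFS.
    /root/ephemeral-hdfs/bin/stop-dfs.sh   # path assumes the spark-ec2 AMI
    umount /path/of/duplicate/mount        # hypothetical placeholder mount point
    /root/ephemeral-hdfs/bin/start-dfs.sh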
On Sun, Feb 8, 2015 at 8:16 AM, ey-chih chow <eyc...@hotmail.com> wrote:

Hi,

I submitted a Spark job to an EC2 cluster using spark-submit. At a worker node, there is an exception of 'no space left on device' as follows.

==========================================
15/02/08 01:53:38 ERROR logging.FileAppender: Error writing stream to file /root/spark/work/app-20150208014557-0003/0/stdout
java.io.IOException: No space left on device
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:345)
        at org.apache.spark.util.logging.FileAppender.appendToFile(FileAppender.scala:92)
        at org.apache.spark.util.logging.FileAppender.appendStreamToFile(FileAppender.scala:72)
        at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply$mcV$sp(FileAppender.scala:39)
        at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
        at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1311)
        at org.apache.spark.util.logging.FileAppender$$anon$1.run(FileAppender.scala:38)
===========================================

The command df showed the following information at the worker node:

Filesystem     1K-blocks     Used  Available  Use%  Mounted on
/dev/xvda1       8256920  8256456          0  100%  /
tmpfs            7752012        0    7752012    0%  /dev/shm
/dev/xvdb       30963708  1729652   27661192    6%  /mnt

Does anybody know how to fix this? Thanks.

Ey-Chih Chow
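Since df shows the 8 GB root volume at 100% while the 30 GB volume on /mnt is nearly empty, another mitigation is to move Spark's work and scratch directories off / and onto /mnt. A sketch for the workers, assuming a standalone cluster (the directory names are examples; SPARK_WORKER_DIR and SPARK_LOCAL_DIRS are the standard spark-env.sh variables):

    # Create directories on the large instance-store volume.
    mkdir -p /mnt/spark-work /mnt/spark-local

    # In conf/spark-env.sh on each worker:
    export SPARK_WORKER_DIR=/mnt/spark-work    # application work dirs + executor stdout/stderr
    export SPARK_LOCAL_DIRS=/mnt/spark-local   # shuffle and spill scratch space

Restart the workers afterwards so that new applications pick up the new directories.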