Varun,

What version of Ceph are you running? Can you confirm whether the MDS daemon 
(ceph-mds) is still running or has crashed when the MDS becomes 
laggy/unresponsive? If it has crashed, check the MDS log for a crash report. 
There were a couple of Hadoop workloads that caused the MDS to misbehave for us 
as well.
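
Something like the following should cover those checks. This is just a sketch: 
it assumes a standard packaged install, so the daemon id (`a`) and the log path 
under /var/log/ceph/ may differ on your setup:

    # Report the installed Ceph version
    ceph -v

    # Overall cluster health; a laggy MDS shows up in the warning output
    ceph health

    # Current MDS map state (up:active, laggy, etc.)
    ceph mds stat

    # Is the ceph-mds process still alive on the MDS host?
    pgrep -l ceph-mds

    # If the daemon died, look for a crash backtrace near the end of its log
    # (assuming the default log location for an MDS named "a"):
    grep -n -A5 'Caught signal' /var/log/ceph/ceph-mds.a.log | tail -40

If `pgrep` shows the process alive while `ceph health` reports it laggy, the 
daemon is probably wedged rather than crashed, and the interesting part of the 
log is whatever it printed just before it stopped responding.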

-Noah

On Apr 24, 2013, at 12:56 AM, Varun Chandramouli <varun....@gmail.com> wrote:

> Hi All,
> 
> I am running the MapReduce wordcount code (on a ceph cluster consisting of 2 
> VMs) on a data set consisting of 5000 odd files (approx. 10gb size in total). 
> Periodically, the ceph health says that the mds is laggy/unresponsive, and I 
> get messages like the following:
> 
> 13/04/24 10:41:00 INFO mapred.JobClient:  map 11% reduce 3%
> 13/04/24 10:42:36 INFO mapred.JobClient:  map 12% reduce 3%
> 13/04/24 10:42:45 INFO mapred.JobClient:  map 12% reduce 4%
> 13/04/24 10:44:08 INFO mapred.JobClient:  map 13% reduce 4%
> 13/04/24 10:45:29 INFO mapred.JobClient:  map 14% reduce 4%
> 13/04/24 11:06:31 INFO mapred.JobClient: Task Id : 
> attempt_201304241023_0001_m_000706_0, Status : FAILED
> Task attempt_201304241023_0001_m_000706_0 failed to report status for 600 
> seconds. Killing!
> Task attempt_201304241023_0001_m_000706_0 failed to report status for 600 
> seconds. Killing!
> 
> I then have to manually restart the mds again, and the process continues 
> execution. Can someone please tell me the reason for this, and how to solve 
> it? Pasting my ceph.conf file below:
> 
> [global]
>         auth client required = none
>         auth cluster required = none
>         auth service required = none
> 
> [osd]
>         osd journal size = 1000
>         filestore xattr use omap = true
> #       osd data = /var/lib/ceph/osd/ceph-$id
> 
> [mon.a]
>         host = varunc4-virtual-machine
>         mon addr = 10.72.148.209:6789
> #       mon data = /var/lib/ceph/mon/ceph-a
> 
> [mds.a]
>         host = varunc4-virtual-machine
> #       mds data = /var/lib/ceph/mds/ceph-a
> 
> [osd.0]
>         host = varunc4-virtual-machine
> 
> [osd.1]
>         host = varunc5-virtual-machine
> 
> Regards
> Varun 
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
