On Thu, Apr 25, 2013 at 8:22 AM, Noah Watkins <noah.watk...@inktank.com> wrote:
>
> On Apr 25, 2013, at 4:08 AM, Varun Chandramouli <varun....@gmail.com> wrote:
>
>> 2013-04-25 13:54:36.182188 bff8cb40 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread bff8cb40 time 2013-04-25 13:54:36.053392
>> common/Thread.cc: 110: FAILED assert(ret == 0)
>>
>>  ceph version 0.58-500-gaf3b163 (af3b16349a49a8aee401e27c1b71fd704b31297c)
>>  1: (Thread::create(unsigned int)+0xdc) [0x843866c]
>>  2: (Pipe::start_writer()+0x4e) [0x84d837e]
>>  3: (Pipe::accept()+0x4955) [0x84ee625]
>>  4: (Pipe::reader()+0x1758) [0x84f10b8]
>>  5: (Pipe::Reader::entry()+0x1e) [0x84f2dee]
>>  6: (Thread::_entry_func(void*)+0xf) [0x843833f]
>>  7: (()+0x6d4c) [0xb7784d4c]
>>  8: (clone()+0x5e) [0xb7106ace]
>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>
> The assertion failure here doesn't look like any of the MDS problems I was
> getting with Hadoop, but someone else may recognize the problem. A couple of
> things might be helpful. First, I think that multi-MDS is less stable right
> now than running a single MDS. Second, using GDB to run 'thread apply all
> bt' on the crashed MDS core file would provide a lot more context to help
> debug.

That assert indicates the MDS tried to create a new thread and got an
error back. Given that your MDS is already running, this means it's
not an issue with thread setup; you've run into a resource limit of
some kind. Since you're in VMs I'll guess you've run out of RAM, but
it's also possible that the process has exceeded a limit imposed by
the kernel, such as a cap on the number of threads.
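
If you want to check the limit theory before the next crash, here is a
rough sketch (not part of Ceph) that reads the thread count and a few
related limits for a running process from /proc on Linux. The helper
names thread_count, soft_limit and report are made up for illustration.
Keep in mind that "Max processes" is counted per user rather than per
process, and that none of this rules out plain memory exhaustion.

```python
from pathlib import Path


def thread_count(status_text):
    """Extract the 'Threads:' value from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split()[1])
    raise ValueError("no 'Threads:' line found")


def soft_limit(limits_text, name):
    """Extract a soft limit from the text of /proc/<pid>/limits.

    Returns None when the limit is reported as 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith(name):
            # The soft limit is the first column after the limit name.
            value = line[len(name):].split()[0]
            return None if value == "unlimited" else int(value)
    raise ValueError(f"no {name!r} line found")


def report(pid):
    """Print thread usage and related limits for one process (Linux only)."""
    proc = Path("/proc") / str(pid)
    status = (proc / "status").read_text()
    limits = (proc / "limits").read_text()
    threads_max = Path("/proc/sys/kernel/threads-max").read_text().strip()
    print("threads in process:      ", thread_count(status))
    print("max processes (per user):", soft_limit(limits, "Max processes"))
    print("max stack size (bytes):  ", soft_limit(limits, "Max stack size"))
    print("kernel.threads-max:      ", threads_max)
```

Calling report() with the PID of the ceph-mds process (or "self" to try
it out) prints the four values. If the thread count is nowhere near the
limits, memory is the more likely culprit.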
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
