Re: [ceph-users] unable to start OSD

2014-02-20 Thread Samuel Just
What has happened in the last few weeks to this cluster? Was there an upgrade? -Sam On Wed, Feb 12, 2014 at 10:07 AM, Dietmar Maurer wrote: >> > It would be great to get two logs from two different crashing OSDs for >> > comparison purposes. >> >> ftp://download.proxmox.com/tmp/ceph-osd.4.log >>

Re: [ceph-users] unable to start OSD

2014-02-12 Thread Dietmar Maurer
> > It would be great to get two logs from two different crashing OSDs for > > comparison purposes. > > ftp://download.proxmox.com/tmp/ceph-osd.4.log > ftp://download.proxmox.com/tmp/ceph-osd.10.log I guess I should also mention that there was a misconfiguration in the network MTU setting of on

Re: [ceph-users] unable to start OSD

2014-02-12 Thread Dietmar Maurer
> It would be great to get two logs from two different crashing OSDs for > comparison purposes. ftp://download.proxmox.com/tmp/ceph-osd.4.log ftp://download.proxmox.com/tmp/ceph-osd.10.log > > > > and post the log somewhere? (You can use 'ceph-post ' to > > > send it to us > > > > # ceph-post-f

Re: [ceph-users] unable to start OSD

2014-02-12 Thread Sage Weil
On Wed, 12 Feb 2014, Dietmar Maurer wrote: > > This sounds like a bug introduced an entry into the pg log that is not > > ordered > > properly. I don't think I've seen this before... Sam, have you? > > > > How many OSDs do you have? > > 12 OSDs, 3 nodes > > > Can you set 'debug osd = 20' i

Re: [ceph-users] unable to start OSD

2014-02-12 Thread Dietmar Maurer
Weil > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] unable to start OSD > > > This sounds like a bug introduced an entry into the pg log that is not > > ordered properly. I don't think I've seen this before... Sam, have you? > > > > How many

Re: [ceph-users] unable to start OSD

2014-02-12 Thread Dietmar Maurer
> This sounds like a bug introduced an entry into the pg log that is not ordered > properly. I don't think I've seen this before... Sam, have you? > > How many OSDs do you have? 12 OSDs, 3 nodes > Can you set 'debug osd = 20' in your ceph.conf, restart and reproduce the > crash, The log I

Re: [ceph-users] unable to start OSD

2014-02-12 Thread Sage Weil
Hi Dietmar, This sounds like a bug introduced an entry into the pg log that is not ordered properly. I don't think I've seen this before... Sam, have you? How many OSDs do you have? Can you set 'debug osd = 20' in your ceph.conf, restart and reproduce the crash, and post the log somewhere

Re: [ceph-users] unable to start OSD

2014-02-12 Thread Dietmar Maurer
After enabling debugging, I get:
...
-4> 2014-02-12 09:43:44.739648 7f7f8b848780 20 read_log 6100'1677 (6100'1676) modify 85949a17/rbd_data.dd6592ae8944a.01bd/head//25 by client.890681.0:76884 2014-01-26 16:44:08.412457
-3> 2014-02-12 09:43:44.739670 7f7f8b848780 20 read_lo

Re: [ceph-users] unable to start OSD

2014-02-12 Thread Dietmar Maurer
> I am unable to start my OSDs on one node: > > > osd/PGLog.cc: 672: FAILED assert(last_e.version.version < e.version.version) > > Does that mean there is something wrong with my journal disk? Or why can such a > thing happen? After rebooting other nodes, all my OSDs are offline, showing exactly th
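
For readers unfamiliar with the failed assertion quoted above: read literally, it requires that each pg log entry has a strictly larger version counter than the entry read before it. The sketch below is not Ceph's code. `Version` and `is_strictly_increasing` are hypothetical names invented here to illustrate that kind of ordering check, and the sample values in the usage note are only loosely modeled on the `6100'1677 (6100'1676)` pair in the debug output.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a pg log entry version,
// written in logs as epoch'version (for example 6100'1677).
struct Version {
    std::uint64_t epoch;
    std::uint64_t version;
};

// Returns true when every entry's version counter is strictly greater
// than its predecessor's, which is the condition the quoted assert
// expresses for consecutive entries (last_e, e).
bool is_strictly_increasing(const std::vector<Version>& log) {
    for (std::size_t i = 1; i < log.size(); ++i) {
        const Version& last_e = log[i - 1];
        const Version& e = log[i];
        if (!(last_e.version < e.version)) {
            return false;  // duplicate or out-of-order entry
        }
    }
    return true;
}
```

Under these assumptions, a log holding 6100'1676 followed by 6100'1677 passes the check, while a log in which 6100'1677 appears twice in a row, or is followed by 6100'1676, fails it.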