What has happened in the last few weeks to this cluster? Was there an upgrade?
-Sam
On Wed, Feb 12, 2014 at 10:07 AM, Dietmar Maurer wrote:
>> > It would be great to get two logs from two different crashing OSDs for
>> > comparison purposes.
>>
>> ftp://download.proxmox.com/tmp/ceph-osd.4.log
>>
> > It would be great to get two logs from two different crashing OSDs for
> > comparison purposes.
>
> ftp://download.proxmox.com/tmp/ceph-osd.4.log
> ftp://download.proxmox.com/tmp/ceph-osd.10.log
I guess I should also mention that there was a misconfiguration in
the network MTU setting of on
> It would be great to get two logs from two different crashing OSDs for
> comparison purposes.
ftp://download.proxmox.com/tmp/ceph-osd.4.log
ftp://download.proxmox.com/tmp/ceph-osd.10.log
>
> > > and post the log somewhere? (You can use 'ceph-post ' to
> > > send it to us
> >
> > # ceph-post-f
On Wed, 12 Feb 2014, Dietmar Maurer wrote:
> > This sounds like a bug introduced an entry into the pg log that is not
> > ordered
> > properly. I don't think I've seen this before... Sam, have you?
> >
> > How many OSDs do you have?
>
> 12 OSDs, 3 nodes
>
> > Can you set 'debug osd = 20' i
Weil
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] unable to start OSD
>
> > This sounds like a bug introduced an entry into the pg log that is not
> > ordered properly. I don't think I've seen this before... Sam, have you?
> >
> > How many
> This sounds like a bug introduced an entry into the pg log that is not ordered
> properly. I don't think I've seen this before... Sam, have you?
>
> How many OSDs do you have?
12 OSDs, 3 nodes
> Can you set 'debug osd = 20' in your ceph.conf, restart and reproduce the
> crash,
The log I
Hi Dietmar,
This sounds like a bug introduced an entry into the pg log that is not
ordered properly. I don't think I've seen this before... Sam, have you?
How many OSDs do you have?
Can you set 'debug osd = 20' in your ceph.conf, restart and reproduce the
crash, and post the log somewhere
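
For reference, a minimal sketch of the ceph.conf change requested here. Placing the setting under the [osd] section is an assumption (it then applies to every OSD daemon that reads this file); the option name and value are the ones quoted in the message.

```ini
; Assumption: set under [osd] so all OSDs on the node log at this level.
[osd]
    debug osd = 20
```

The OSDs need to be restarted for the setting in the file to take effect, and the level is best lowered again afterwards because the logs grow quickly at 20.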
After enabling debugging, I get:
...
-4> 2014-02-12 09:43:44.739648 7f7f8b848780 20 read_log 6100'1677
(6100'1676) modify 85949a17/rbd_data.dd6592ae8944a.01bd/head//25
by client.890681.0:76884 2014-01-26 16:44:08.412457
-3> 2014-02-12 09:43:44.739670 7f7f8b848780 20 read_lo
> I am unable to start my OSDs on one node:
>
> > osd/PGLog.cc: 672: FAILED assert(last_e.version.version < e.version.version)
>
> Does that mean there is something wrong with my journal disk? Or why can such
> a thing happen?
After rebooting other nodes, all my OSDs are offline, showing exactly th
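
To narrow down which pg log entry trips the assert, the 'debug osd = 20' output can be scanned for read_log lines whose version does not increase. The following is only a rough sketch, not a Ceph tool: the regular expression is an assumption based on the read_log line shown in this thread, the function names are hypothetical, and it assumes the lines passed in belong to a single PG's log (versions from different PGs are not comparable).

```python
import re

# Matches the entry version in lines such as:
#   ... 20 read_log 6100'1677 (6100'1676) modify ...
# Assumption: the format is inferred from the log excerpt in this thread.
READ_LOG_RE = re.compile(r"read_log (\d+)'(\d+)")


def find_out_of_order(lines):
    """Return (previous, current) version pairs that are not strictly increasing.

    Each version is an (epoch, version) tuple taken from a read_log line.
    """
    problems = []
    last = None
    for line in lines:
        m = READ_LOG_RE.search(line)
        if m is None:
            continue  # not a read_log line
        current = (int(m.group(1)), int(m.group(2)))
        # Mirrors the failed check:
        #   assert(last_e.version.version < e.version.version)
        if last is not None and not last[1] < current[1]:
            problems.append((last, current))
        last = current
    return problems


def check_file(path):
    """Hypothetical helper: print every out-of-order pair found in a log file."""
    with open(path, errors="replace") as f:
        for prev, cur in find_out_of_order(f):
            print("out of order: %d'%d followed by %d'%d" % (prev + cur))
```

Calling check_file("ceph-osd.4.log") on a downloaded copy of the log would list candidate entries; an empty result only means this simple scan found nothing, not that the log is healthy.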