[ceph-users] Periodically activating / peering on OSD add

2018-07-14 Thread Kevin Olbrich
Hi, why do I see activating followed by peering during OSD add (refill)? I did not change pg(p)_num. Is this normal? On my other clusters, I don't think that happened... Kevin ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com

Re: [ceph-users] Periodically activating / peering on OSD add

2018-07-14 Thread Kevin Olbrich
PS: It's luminous 12.2.5! Best regards, Kevin Olbrich. 2018-07-14 15:19 GMT+02:00 Kevin Olbrich : > Hi, > > why do I see activating followed by peering during OSD add (refill)? > I did not change pg(p)_num. > > Is this normal? From my other clusters, I don't think that
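[Editor's note: as a rough illustration of watching PG state churn during an OSD refill, here is a minimal sketch that counts PGs per state. The sample data and its field names (`pgid`, `state`) are assumptions modeled on `ceph pg dump --format json`-style output, which varies between Ceph releases; this is not the poster's method.]

```python
import json
from collections import Counter

# Hypothetical sample shaped like a `ceph pg dump --format json` PG list;
# real field names may differ between Ceph releases.
sample = json.loads("""
[
  {"pgid": "1.0", "state": "active+clean"},
  {"pgid": "1.1", "state": "activating"},
  {"pgid": "1.2", "state": "peering"},
  {"pgid": "1.3", "state": "activating+remapped"}
]
""")

def summarize_states(pgs):
    """Count PGs per state so transient activating/peering spikes stand out."""
    return Counter(pg["state"] for pg in pgs)

if __name__ == "__main__":
    for state, n in sorted(summarize_states(sample).items()):
        print(f"{state}: {n}")
```

Running something like this in a loop against fresh dumps would show whether activating/peering counts are shrinking as backfill proceeds.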

[ceph-users] 12.2.6 CRC errors

2018-07-14 Thread Glen Baars
Hello Ceph users! Note to users, don't install new servers on Friday the 13th! We added a new ceph node on Friday and it has received the latest 12.2.6 update. I started to see CRC errors and investigated hardware issues. I have since found that it is caused by the 12.2.6 release. About 80TB co

Re: [ceph-users] 12.2.6 CRC errors

2018-07-14 Thread Uwe Sauter
Hi Glen, about 16h ago there was a notice on this list with the subject "IMPORTANT: broken luminous 12.2.6 release in repo, do not upgrade" from Sage Weil (main developer of Ceph). Quote from this notice: "tl;dr: Please avoid the 12.2.6 packages that are currently present on download.ceph.c
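[Editor's note: to check whether any daemons are already running the broken release, one could parse `ceph versions` output. The JSON structure below is an assumption modeled on Luminous-era `ceph versions --format json` output; verify against your cluster before relying on it.]

```python
import json

# Sample shaped like `ceph versions --format json` output (structure assumed;
# the "..." in the version strings stands in for the real build hashes).
sample = json.loads("""
{
  "osd": {
    "ceph version 12.2.5 (...) luminous (stable)": 10,
    "ceph version 12.2.6 (...) luminous (stable)": 2
  },
  "mon": {
    "ceph version 12.2.5 (...) luminous (stable)": 3
  }
}
""")

def count_daemons_on(versions, bad="12.2.6"):
    """Count daemons per type still running the broken release."""
    hits = {}
    for daemon_type, by_version in versions.items():
        n = sum(count for ver, count in by_version.items() if bad in ver)
        if n:
            hits[daemon_type] = n
    return hits

print(count_daemons_on(sample))  # → {'osd': 2}
```

An empty result would suggest no daemon picked up the bad packages; a non-empty one shows which daemon types need attention.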

Re: [ceph-users] 12.2.6 CRC errors

2018-07-14 Thread Glen Baars
Thanks Uwe, I saw that on the website. Any idea if what I have done is correct? Do I now just wait? Sent from my Cyanogen phone On 14 Jul 2018 11:16 PM, Uwe Sauter wrote: Hi Glen, about 16h ago there has been a notice on this list with subject "IMPORTANT: broken luminous 12.2.6 release in re

Re: [ceph-users] 12.2.6 CRC errors

2018-07-14 Thread Sage Weil
On Sat, 14 Jul 2018, Glen Baars wrote: > Hello Ceph users! > > Note to users, don't install new servers on Friday the 13th! > > We added a new ceph node on Friday and it has received the latest 12.2.6 > update. I started to see CRC errors and investigated hardware issues. I > have since found t
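[Editor's note: a quick way to gauge how widespread the CRC errors are is to scan OSD logs for mismatch lines. The log excerpt and the exact message wording below are assumptions for illustration; adjust the pattern to match what your OSDs actually log.]

```python
import re

# Hypothetical OSD log excerpt; the "crc ... != expected ..." wording is an
# assumption, not a verbatim Ceph message.
log = """\
2018-07-14 10:01:02.123 osd.5 full-object read crc 0x6b1a2f != expected 0x8c3d4e on 1:abc
2018-07-14 10:01:03.456 osd.5 tick
"""

CRC_RE = re.compile(r"crc 0x([0-9a-f]+) != expected 0x([0-9a-f]+)")

def crc_mismatches(lines):
    """Yield (found, expected) CRC pairs from OSD log lines."""
    for line in lines:
        m = CRC_RE.search(line)
        if m:
            yield m.group(1), m.group(2)

mismatches = list(crc_mismatches(log.splitlines()))
print(len(mismatches))  # → 1
```

Counting mismatches per OSD over time would help distinguish the 12.2.6 software bug from a genuine single-disk hardware fault.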

[ceph-users] OSD fails to start after power failure

2018-07-14 Thread David Young
Hey folks, I have a Luminous 12.2.6 cluster which suffered a power failure recently. On recovery, one of my OSDs is continually crashing and restarting, with the error below: 9ae00 con 0     -3> 2018-07-15 09:50:58.313242 7f131c5a9700 10 monclient: tick     -2> 2018-07-15 09:50:58.313277

[ceph-users] OSD fails to start after power failure (with FAILED assert(num_unsent <= log_queue.size()) error)

2018-07-14 Thread David Young
Hey folks, Sorry, posting this from a second account, since for some reason my primary account doesn't seem to be able to post to the list... I have a Luminous 12.2.6 cluster which suffered a power failure recently. On recovery, one of my OSDs is continually crashing and restarting, with the e
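[Editor's note: the "-3> ... -2> ... -1> ..." lines quoted above are the recent-events dump an OSD prints just before an assert fires. A small sketch like the following can split that dump into ordered events so the last one before the crash is easy to spot; the sample text below is an assumed reconstruction in the same shape, not the poster's actual log.]

```python
import re

# A few lines shaped like the pre-crash event dump quoted in the post
# (contents assumed for illustration).
dump = """\
    -3> 2018-07-15 09:50:58.313242 7f131c5a9700 10 monclient: tick
    -2> 2018-07-15 09:50:58.313277 7f131c5a9700 10 monclient: _check_auth_rotating
    -1> 2018-07-15 09:50:58.313290 7f131c5a9700 -1 FAILED assert(num_unsent <= log_queue.size())
"""

# Matches: "-N> <date> <time> <thread-id> <level> <message>"
EVENT_RE = re.compile(r"^\s*(-\d+)>\s+(\S+ \S+)\s+(\S+)\s+(-?\d+)\s+(.*)$")

def parse_events(text):
    """Return (index, timestamp, message) tuples from a recent-events dump."""
    out = []
    for line in text.splitlines():
        m = EVENT_RE.match(line)
        if m:
            out.append((int(m.group(1)), m.group(2), m.group(5)))
    return out

events = parse_events(dump)
print(events[-1][2])  # the last event before the crash
```

The event at index -1 is the most recent one, so its message usually names the failing assert, here `FAILED assert(num_unsent <= log_queue.size())` in the monitor client code.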