I'm following up from a while ago. I don't think this is the same bug. The bug referenced shows "abort: Corruption: block checksum mismatch", and I'm not seeing that message on mine.
Now I've had 8 OSDs down on this one server for a couple of weeks, and I just tried to start it back up. Here's a link to the log of one of those OSDs (which segfaulted right after starting up): http://people.beocat.ksu.edu/~kylehutson/ceph-osd.414.log

To me, the logs are providing surprisingly few hints as to where the problem lies. Is there a way I can turn up logging to get more information about why this is happening? (I've put a sketch of the settings I'm planning to try below the quoted message.)

On Thu, Feb 8, 2018 at 3:02 AM, Mike O'Connor <m...@oeg.com.au> wrote:

> On 7/02/2018 8:23 AM, Kyle Hutson wrote:
> > We had a 26-node production ceph cluster which we upgraded to Luminous
> > a little over a month ago. I added a 27th node with Bluestore and
> > didn't have any issues, so I began converting the others, one at a
> > time. The first two went off pretty smoothly, but the 3rd is doing
> > something strange.
> >
> > Initially, all the OSDs came up fine, but then some started to
> > segfault. Out of curiosity more than anything else, I did reboot the
> > server to see if it would get better or worse, and it pretty much
> > stayed the same - 12 of the 18 OSDs did not properly come up. Of
> > those, 3 again segfaulted.
> >
> > I picked one that didn't properly come up and copied the log to where
> > anybody can view it:
> > http://people.beocat.ksu.edu/~kylehutson/ceph-osd.426.log
> >
> > You can contrast that with one that is up:
> > http://people.beocat.ksu.edu/~kylehutson/ceph-osd.428.log
> >
> > (which is still showing segfaults in the logs, but seems to be
> > recovering from them OK?)
> >
> > Any ideas?
> Ideas? Yes.
>
> There is a bug which is hitting a small number of systems, and at this
> time there is no solution. Issue details are at
> http://tracker.ceph.com/issues/22102.
>
> Please submit more details of your problem on the ticket.
>
> Mike
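For reference, here is roughly what I was planning to try in order to turn up logging on that host. This is only a sketch; I'm assuming the BlueStore-related subsystems (bluestore, bdev, rocksdb) are the useful ones to raise for a crash at OSD startup, so please correct me if a different subsystem would tell us more:

    # In ceph.conf on the affected host, raise the debug levels for the
    # subsystems most likely involved in a BlueStore startup crash:
    [osd]
        debug osd = 20
        debug bluestore = 20
        debug bdev = 20
        debug rocksdb = 10

    # Then start the failing OSD in the foreground with logging to stderr,
    # so the output leading up to the segfault is easy to capture
    # (please double-check the options against ceph-osd --help):
    ceph-osd -d --cluster ceph --id 414 --setuser ceph --setgroup ceph

If the daemon stayed up long enough I could also inject the same levels at runtime with 'ceph tell osd.414 injectargs', but since it segfaults right after starting, setting them in ceph.conf before the restart seems safer.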
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com