I mistyped the user list mail address. I am correcting and sending again. 
Apologies for the noise.

My mail is below.


Begin forwarded message:

> From: Goktug Yildirim <goktug.yildi...@gmail.com>
> Date: 1 October 2018 21:54:31 GMT+2
> To: ceph-users-j...@lists.ceph.com
> Cc: ceph-de...@vger.kernel.org
> Subject: Mimic offline problem
> 
> Hi all,
> 
> We recently upgraded from Luminous to Mimic. The cluster has been offline 
> for 6 days now. The long story short is here: 
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-September/030078.html
> 
> I’ve also CC’ed the developers since I believe this is a bug. If this is not 
> the correct way, I apologize; please let me know.
> 
> Over these 6 days a lot has happened and we reached several conclusions 
> about the problem. Some of them were misjudged and some were not 
> investigated deeply enough. However, the most certain diagnosis is this: 
> each OSD causes very high disk I/O on its bluestore disk (WAL and DB are 
> fine). After that, the OSDs become unresponsive or respond only very slowly. 
> For example, "ceph tell osd.x version" hangs seemingly forever.
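
The "ceph tell osd.x version" check described above can be scripted so that a hung OSD does not block the shell. What follows is only a minimal sketch, not something taken from the original report: the helper names `osd_responds` and `unresponsive_osds`, the 10-second timeout, and the injectable `run` parameter are all assumptions made for illustration. The only Ceph command it issues is the one quoted in the mail.

```python
import subprocess


def osd_responds(osd_id, timeout_s=10.0, run=subprocess.run):
    """Return True if `ceph tell osd.<id> version` finishes within timeout_s.

    `run` defaults to subprocess.run; it is a parameter only so the check
    can be exercised without a live cluster (an assumption of this sketch).
    """
    cmd = ["ceph", "tell", f"osd.{osd_id}", "version"]
    try:
        result = run(cmd, capture_output=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # The command hung, which matches the symptom described in the mail.
        return False
    except OSError:
        # The ceph binary is missing or could not be executed.
        return False
    return result.returncode == 0


def unresponsive_osds(osd_ids, timeout_s=10.0, run=subprocess.run):
    """Return the ids from osd_ids that did not answer in time."""
    return [i for i in osd_ids if not osd_responds(i, timeout_s, run)]
```

Calling `unresponsive_osds([90])` on a node with admin credentials would list the OSD if it does not answer within the timeout. The id 90 is used here only because it matches the log file name linked further down.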
> 
> So, because of the unresponsive OSDs, the cluster does not settle. This is 
> our problem! 
> 
> This is the one thing we are very sure of. But we are not sure of the reason. 
> 
> Here is the latest ceph status: 
> https://paste.ubuntu.com/p/2DyZ5YqPjh/. 
> 
> This is the status after we started all of the OSDs 24 hours ago.
> Some of the OSDs have not started. However, it didn't make any difference 
> when all of them were online.
> 
> Here is the debug=20 log of one OSD, which is the same for all the others: 
> https://paste.ubuntu.com/p/8n2kTvwnG6/
> As far as we can tell there is a loop pattern. I am sure it will not escape 
> the eye.
> 
> This is the full log of the same OSD:
> https://www.dropbox.com/s/pwzqeajlsdwaoi1/ceph-osd.90.log?dl=0
> 
> Here is the strace of the same OSD process:
> https://paste.ubuntu.com/p/8n2kTvwnG6/
> 
> Lately we hear more and more about upgrading to Mimic. I hope no one gets 
> hurt as we did. I am sure we made lots of mistakes to let this happen. This 
> situation may serve as an example for other users and could point to a 
> potential bug for the Ceph developers.
> 
> Any help to figure out what is going on would be great.
> 
> Best Regards,
> Goktug Yildirim
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
