Today I had a peering problem, not when I marked osd.71 out, but during normal Ceph operation.
Regards
Dominik

2013/6/28 Andrey Korolyov <and...@xdel.ru>:
> There is almost the same problem with the 0.61 cluster, at least with the
> same symptoms. It can be reproduced quite easily: remove an osd and then
> mark it as out, and with quite high probability one of its neighbors will
> be stuck at the end of the peering process with a couple of peering pgs
> with the primary copy on it. Such an osd process seems to be stuck in some
> kind of lock, eating exactly 100% of one core.
>
> On Thu, Jun 13, 2013 at 8:42 PM, Gregory Farnum <g...@inktank.com> wrote:
>> On Thu, Jun 13, 2013 at 6:33 AM, Sławomir Skowron <szi...@gmail.com> wrote:
>>> Hi, sorry for the late response.
>>>
>>> https://docs.google.com/file/d/0B9xDdJXMieKEdHFRYnBfT3lCYm8/view
>>>
>>> Logs in attachment, and on google drive, from today.
>>>
>>> https://docs.google.com/file/d/0B9xDdJXMieKEQzVNVHJ1RXFXZlU/view
>>>
>>> We had this problem today. The new logs are on google drive with
>>> today's date.
>>>
>>> The strange thing is that the problematic osd.71 has about 10-15% more
>>> space used than the other osds in the cluster.
>>>
>>> Today, within one hour, osd.71 failed 3 times in the mon log, and after
>>> the third one recovery got stuck, and many 500 errors appeared in the
>>> http layer on top of rgw. When it is stuck, restarting osd.71, osd.23,
>>> and osd.108, all from the stuck pg, helps, but I even ran a repair on
>>> this osd, just in case.
>>>
>>> I have a theory that this pg holds the rgw index of objects, or that
>>> one of the osds in this pg has some problem with its local filesystem
>>> or the drive below (the raid controller reports nothing about that),
>>> but I do not see any problem in the system.
>>>
>>> How can we find in which pg/osd the index of objects of an rgw bucket
>>> exists?
>>
>> You can find the location of any named object by grabbing the OSD map
>> from the cluster and using the osdmaptool: "osdmaptool <mapfile>
>> --test-map-object <objname> --pool <poolid>".
>>
>> You're not providing any context for your issue though, so we really
>> can't help. What symptoms are you observing?
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
Regards,
Dominik
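
For reference, Greg's suggestion quoted above could be wrapped roughly as follows. This is only a sketch, not something tested against this cluster: the function name `locate_object` and the default map path `/tmp/osdmap.bin` are made up for illustration, and it assumes the `ceph` and `osdmaptool` binaries are on the PATH with a working admin keyring. The pool id and object name have to be supplied by the caller (pool ids can be listed with `ceph osd lspools`).

```shell
# Sketch: dump the current OSD map and ask osdmaptool where an object maps.
# Usage: locate_object <poolid> <objname> [mapfile]
# NOTE: locate_object and the default mapfile path are hypothetical names.
locate_object() {
    pool_id=$1
    objname=$2
    mapfile=${3:-/tmp/osdmap.bin}

    # Grab the current OSD map from the cluster into a local file.
    ceph osd getmap -o "$mapfile" || return 1

    # Print the pg and the set of osds the named object maps to.
    osdmaptool "$mapfile" --test-map-object "$objname" --pool "$pool_id"
}
```

Called as `locate_object <poolid> <objname>`, it should print the pg and the osds for that object, which can then be compared against the stuck pg (osd.71, osd.23, osd.108 in this thread).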