On 13/05/2014 20:10, Gregory Farnum wrote:
> On Tue, May 13, 2014 at 9:06 AM, Mike Dawson <mike.daw...@cloudapt.com> wrote:
>> All,
>>
>> I have a recurring issue where the admin sockets
>> (/var/run/ceph/ceph-*.*.asok) may vanish on a running cluster while the
>> daemons keep running
>
> Hmm.
>
>> (or restart without my knowledge).
>
> I'm guessing this might be involved:
>
>> I see this issue on
>> a dev cluster running Ubuntu and Ceph Emperor/Firefly, deployed with
>> ceph-deploy using Upstart to control daemons. I never see this issue on
>> Ubuntu / Dumpling / sysvinit.
>
> *goes and greps the git log*
>
> I'm betting it was commit 45600789f1ca399dddc5870254e5db883fb29b38
> (which has, in fact, been backported to dumpling and emperor),
> intended so that turning on a new daemon wouldn't remove the admin
> socket of an existing one. But I think that means that if you activate
> the new daemon before the old one has finished shutting down and
> unlinking, you would end up with a daemon that had no admin socket.
> Perhaps it's an incomplete fix and we need a tracker ticket?
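
The sequence Greg describes can be sketched outside Ceph. This is a minimal illustration in Python, not Ceph's code: `try_bind_admin_socket` and `simulate_restart_race` are hypothetical names, and the sketch assumes the new daemon declines to replace a socket file that already exists, which is how the commit's intent is described above.

```python
import errno
import os
import socket
import tempfile


def try_bind_admin_socket(path):
    """Bind a Unix socket at path without replacing an existing file.

    Returns the listening socket, or None if the path is already taken.
    (Assumed behaviour for illustration; not taken from the Ceph source.)
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
    except OSError as exc:
        sock.close()
        if exc.errno == errno.EADDRINUSE:
            return None  # someone else still owns the path
        raise
    sock.listen(1)
    return sock


def simulate_restart_race():
    """Replay the suspected ordering in a single process."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "daemon.asok")

        # 1. The old daemon is running and owns the admin socket.
        old = try_bind_admin_socket(path)

        # 2. The new daemon starts before the old one has finished
        #    shutting down; the path is taken, so it gets no socket.
        new = try_bind_admin_socket(path)

        # 3. The old daemon completes shutdown and unlinks its socket.
        old.close()
        os.unlink(path)

        # The new daemon keeps running, but nothing exists at the path.
        return new is None, os.path.exists(path)
```

Calling `simulate_restart_race()` returns `(True, False)`: the new daemon never obtained a socket, and the path is gone once the old daemon has unlinked it.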

https://github.com/ceph/ceph/commit/45600789f1ca399dddc5870254e5db883fb29b38

I see the race condition now, missed it the first time around, thanks Greg :-) I'll work on it.

Cheers

> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>

-- 
Loïc Dachary, Artisan Logiciel Libre