On 13/05/2014 20:10, Gregory Farnum wrote:
> On Tue, May 13, 2014 at 9:06 AM, Mike Dawson <mike.daw...@cloudapt.com> wrote:
>> All,
>>
>> I have a recurring issue where the admin sockets
>> (/var/run/ceph/ceph-*.*.asok) may vanish on a running cluster while the
>> daemons keep running
> 
> Hmm.
> 
>> (or restart without my knowledge).
> 
> I'm guessing this might be involved:
> 
>> I see this issue on
>> a dev cluster running Ubuntu and Ceph Emperor/Firefly, deployed with
>> ceph-deploy using Upstart to control daemons. I never see this issue on
>> Ubuntu / Dumpling / sysvinit.
> 
> *goes and greps the git log*
> 
> I'm betting it was commit 45600789f1ca399dddc5870254e5db883fb29b38
> (which has, in fact, been backported to dumpling and emperor),
> intended so that turning on a new daemon wouldn't remove the admin
> socket of an existing one. But I think that means that if you activate
> the new daemon before the old one has finished shutting down and
> unlinking, you would end up with a daemon that had no admin socket.
> Perhaps it's an incomplete fix and we need a tracker ticket?

https://github.com/ceph/ceph/commit/45600789f1ca399dddc5870254e5db883fb29b38

I see the race condition now; I missed it the first time around. Thanks, Greg :-)
I'll work on it.
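
The race Greg describes can be sketched with a toy model. This is not the Ceph code: the Daemon class, run_race(), and the socket file name are hypothetical, and the "do not bind if the path already exists" rule is only an assumption about what the commit's behaviour amounts to. The steps are run in a fixed order to make the outcome deterministic.

```python
import os
import socket
import tempfile


class Daemon:
    """Toy daemon that owns (or fails to own) an admin socket path."""

    def __init__(self, name, path):
        self.name = name
        self.path = path
        self.sock = None

    def start(self):
        # Assumed behaviour after the commit: never unlink a path that
        # another daemon may still be serving; skip binding instead.
        if os.path.exists(self.path):
            return False
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.bind(self.path)
        self.sock.listen(1)
        return True

    def shutdown(self):
        # Only the daemon that bound the socket closes and unlinks it.
        if self.sock is not None:
            self.sock.close()
            self.sock = None
            os.unlink(self.path)


def run_race():
    """Start the new daemon before the old one has finished unlinking."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "ceph-osd.0.asok")  # hypothetical name
        old = Daemon("old", path)
        new = Daemon("new", path)
        events = []
        events.append(("old bound", old.start()))
        # The new daemon sees the old socket and leaves it alone.
        events.append(("new bound", new.start()))
        # The old daemon now finishes shutting down and unlinks the path.
        old.shutdown()
        # The new daemon keeps running, but no admin socket is left.
        events.append(("socket exists", os.path.exists(path)))
        return events


if __name__ == "__main__":
    for step, result in run_race():
        print(step, result)
```

In this model the new daemon ends up running with no admin socket, which matches the symptom Mike reports.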

Cheers

> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
> 

-- 
Loïc Dachary, Artisan Logiciel Libre

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
