On Tue, 14 Jan 2014, GuangYang wrote:
> Hi ceph-users and ceph-devel,
> I ran into an issue after restarting the cluster's monitors: authentication 
> fails, which prevents running any ceph command.
> 
> After some maintenance work, I restarted an OSD and found that it would not 
> rejoin the cluster automatically, even though a TCP dump showed it had 
> already sent a message to the monitor asking to be added to the cluster.
> 
> So I suspected a problem with the monitors and restarted them one by one 
> (3 in total). After the monitors were restarted, however, every ceph command 
> failed with an authentication timeout:
> 
> 2014-01-14 12:00:30.499397 7fc7f195e700  0 monclient(hunting): authenticate timed out after 300
> 2014-01-14 12:00:30.499440 7fc7f195e700  0 librados: client.admin authentication error (110) Connection timed out
> Error connecting to cluster: Error
> 
> Any idea why this error happens? (Restarting an OSD results in the same 
> error.)
> 
> I believe the authentication information is persisted on the mon's local 
> disk. Is there a chance that data got corrupted?

That sounds unlikely, but you're right that the core problem is with the 
mons.  What does 

 ceph daemon mon.`hostname` mon_status

say?  Perhaps they are not forming a quorum and that is what is preventing 
authentication.
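
If it helps to read that output, here is a minimal Python sketch that 
summarizes the JSON printed by mon_status. It is only an illustration: the 
function name summarize_mon_status and the SAMPLE data are made up for this 
example (they are not output from the cluster above), and it assumes the 
output carries "name", "state", "quorum" and "monmap"/"mons" fields, which 
may differ between releases.

```python
import json


def summarize_mon_status(raw):
    """Summarize the JSON text printed by `ceph daemon mon.<id> mon_status`.

    Assumed fields (may vary by release): "name", "state", "quorum",
    and "monmap" -> "mons" -> [{"rank": ..., "name": ...}, ...].
    """
    status = json.loads(raw)
    # Ranks of every monitor listed in the monmap.
    all_ranks = {mon["rank"] for mon in status["monmap"]["mons"]}
    # Ranks this monitor currently sees as quorum members.
    in_quorum = set(status.get("quorum", []))
    # A quorum needs a strict majority of the monitors in the monmap.
    needed = len(all_ranks) // 2 + 1
    return {
        "name": status.get("name"),
        "state": status.get("state"),
        "has_quorum": len(in_quorum) >= needed,
        "missing_ranks": sorted(all_ranks - in_quorum),
    }


# Hypothetical sample: a 3-mon cluster where this mon is still probing
# and no quorum has formed. Names and ranks are invented.
SAMPLE = json.dumps({
    "name": "mon-a",
    "rank": 0,
    "state": "probing",
    "quorum": [],
    "monmap": {
        "mons": [
            {"rank": 0, "name": "mon-a"},
            {"rank": 1, "name": "mon-b"},
            {"rank": 2, "name": "mon-c"},
        ]
    },
})

if __name__ == "__main__":
    print(summarize_mon_status(SAMPLE))
```

A state other than "leader" or "peon" together with an empty quorum list 
would suggest the mons have not formed a quorum, which would explain the 
authentication timeouts.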

sage
