You'll need to go look at the individual OSDs to determine why they
aren't on. All the cluster knows is that the OSDs aren't communicating
properly.
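
As a rough aid for reading the pgmap line quoted below, here is a minimal
Python sketch that tallies the PG states per individual flag. The helper
names (parse_pgmap, count_by_flag) are made up for illustration and are not
part of any ceph tooling; the input is the status line copied from the
quoted message with its line breaks joined.

```python
import re
from collections import Counter

# The pgmap status line from the quoted message, joined onto one line.
PGMAP_LINE = (
    "pgmap v1047121: 98432 pgs: 241 inactive, 33138 peering, 25 remapped, "
    "60067 down+peering, 3489 remapped+peering, 1472 down+remapped+peering; "
    "66261 bytes data, 1647 MB used, 5582 GB / 5583 GB avail"
)


def parse_pgmap(line):
    """Return (total_pgs, {state_combo: count}) from a pgmap status line."""
    match = re.search(r"(\d+) pgs: ([^;]+);", line)
    if match is None:
        raise ValueError("not a pgmap line")
    total = int(match.group(1))
    states = {}
    for part in match.group(2).split(","):
        count, combo = part.split()
        states[combo] = int(count)
    return total, states


def count_by_flag(states):
    """Count PGs per flag; a PG in 'down+peering' counts for both flags."""
    flags = Counter()
    for combo, count in states.items():
        for flag in combo.split("+"):
            flags[flag] += count
    return flags


total, states = parse_pgmap(PGMAP_LINE)
flags = count_by_flag(states)
for flag, count in flags.most_common():
    print(f"{flag:10s} {count:6d} of {total}")
print("active PGs:", flags["active"])
```

On this input no PG reports "active", and most are peering or down+peering.
As far as I can tell that fits a cluster that has not started recovery yet
because the PGs are still waiting on OSDs, rather than one that is
recovering slowly.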
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Tue, Apr 29, 2014 at 3:06 AM, Gandalf Corvotempesta
<gandalf.corvotempe...@gmail.com> wrote:
> After a simple "service ceph restart" on a server, I'm unable to get
> my cluster up again.
> http://pastebin.com/raw.php?i=Wsmfik2M
>
> Suddenly, some OSDs go up and down randomly.
>
> I don't see any network traffic on the cluster interface.
> How can I detect what ceph is doing? From the posted output there is
> no way to tell whether ceph is recovering or not. Showing just a bunch of
> increasing/decreasing numbers doesn't help.
>
> I can see this:
>
> 2014-04-29 12:03:49.013808 mon.0 [INF] pgmap v1047121: 98432 pgs: 241
> inactive, 33138 peering, 25 remapped, 60067 down+peering, 3489
> remapped+peering, 1472 down+remapped+peering; 66261 bytes data, 1647
> MB used, 5582 GB / 5583 GB avail
>
> So what, is it recovering? Is it sleeping? Why is it not recovering?
>
> http://pastebin.com/raw.php?i=2EdugwQa
> Why are all OSDs from hosts osd12 and osd13 down? Both hosts are up and
> running.
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com