Hello Ceph Users,

We have a Ceph test cluster that we want to bring into production; it will
grow rapidly in the future.
Ceph version:
ceph         0.80.7-2+deb8u1  amd64  distributed storage and file system
ceph-common  0.80.7-2+deb8u1  amd64  common utilities to mount and interact with a ceph storage cluster


Our config:
5 hosts, each running 12 OSDs (60 OSDs in total),
containing 2 objects.

One node went down and stayed down for about 12 hours.
When it was brought back online (manually), the entire cluster slowly
came to a halt. Both the status right after the crash and the current
status are shown below.

First status after this crash:

    cluster e2295d66-a265-11e5-8c92-00219bfd424c
     health HEALTH_WARN 4628 pgs down; 4628 pgs peering; 4628 pgs stuck inactive; 4628 pgs stuck unclean
     monmap e3: 3 mons at {a=172.30.0.2:6789/0,b=172.30.0.67:6789/0,mon=172.30.0.1:6789/0}, election epoch 16, quorum 0,1,2 mon,a,b
     osdmap e18880: 60 osds: 48 up, 48 in
      pgmap v127495: 4628 pgs, 4 pools, 1238 bytes data, 4 objects
            283 GB used, 130 TB / 130 TB avail
                4628 down+peering

The Ceph status at this moment:
# ceph status
    cluster e2295d66-a265-11e5-8c92-00219bfd424c
     health HEALTH_WARN 4622 pgs down; 4628 pgs peering; 1427 pgs stale; 4628 pgs stuck inactive; 1427 pgs stuck stale; 4628 pgs stuck unclean; 2/17 in osds are down; 1 mons down, quorum 1,2 a,b
     monmap e3: 3 mons at {a=172.30.0.2:6789/0,b=172.30.0.67:6789/0,mon=172.30.0.1:6789/0}, election epoch 18, quorum 1,2 a,b
     osdmap e19242: 60 osds: 15 up, 17 in
      pgmap v128135: 4628 pgs, 4 pools, 118 bytes data, 3 objects
            100 GB used, 47383 GB / 47483 GB avail
                   3 peering
                1424 stale+down+peering
                3198 down+peering
                   3 stale+peering
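
To make the two outputs easier to compare, here is a minimal Python sketch that pulls the OSD counts out of the two osdmap lines above. The helper name `parse_osdmap` and the regular expression are my own and not part of Ceph; they only assume the line format exactly as pasted in this mail.

```python
import re

# Matches the osdmap summary line as it appears in the status output above.
OSDMAP_RE = re.compile(r"osdmap e(\d+): (\d+) osds: (\d+) up, (\d+) in")


def parse_osdmap(line):
    """Parse one osdmap summary line and derive the down/out counts."""
    m = OSDMAP_RE.search(line)
    if m is None:
        raise ValueError("not an osdmap summary line: %r" % line)
    epoch, total, up, in_ = (int(g) for g in m.groups())
    return {
        "epoch": epoch,
        "total": total,
        "up": up,
        "in": in_,
        "down": total - up,   # OSDs known to the map but not running
        "out": total - in_,   # OSDs marked out of data placement
    }


first = parse_osdmap("osdmap e18880: 60 osds: 48 up, 48 in")
now = parse_osdmap("osdmap e19242: 60 osds: 15 up, 17 in")

print("down after crash:", first["down"])              # 12, i.e. one host
print("down now:", now["down"])                        # 45
print("osdmap epochs since:", now["epoch"] - first["epoch"])  # 362
```

So right after the crash exactly one host's worth of OSDs (12) was missing, and now 45 of the 60 OSDs are down, with 362 osdmap epochs in between.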



It is a test cluster, so no real harm was done. How can we get it back up,
and why did this happen?

Regards, Arnoud.
