Gaurav,

Are there any unfound or incomplete PGs? If not, you can remove the failed
OSDs (while monitoring the ceph -w and ceph -s output) and replace them with
good ones, one OSD at a time. I have done that successfully.
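
For reference, here is a rough sketch of the sequence I used for each failed
OSD (osd.12 is just an example id; adapt it to your own OSDs and check
ceph -s between steps):

    ceph osd out 12              # let its data re-replicate elsewhere
    ceph -w                      # watch recovery until the cluster is clean again
    ceph osd crush remove osd.12 # remove it from the CRUSH map
    ceph auth del osd.12         # drop its authentication key
    ceph osd rm 12               # remove it from the OSD map
    # then prepare and activate the replacement disk as a new OSD
    # (e.g. with ceph-deploy or ceph-disk, depending on your setup)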

Best regards,

On Tue, May 17, 2016 at 12:30 PM, Gaurav Bafna <baf...@gmail.com> wrote:

> I faced the same issue with our production cluster.
>
>     cluster fac04d85-db48-4564-b821-deebda046261
>      health HEALTH_WARN
>             658 pgs degraded
>             658 pgs stuck degraded
>             688 pgs stuck unclean
>             658 pgs stuck undersized
>             658 pgs undersized
>             recovery 3064/1981308 objects degraded (0.155%)
>             recovery 124/1981308 objects misplaced (0.006%)
>      monmap e11: 11 mons at
> {dssmon2=
> 10.140.208.224:6789/0,dssmon3=10.140.208.225:6789/0,dssmon31=10.135.38.141:6789/0,dssmon32=10.135.38.142:6789/0,dssmon33=10.135.38.143:6789/0,dssmon34=10.135.38.144:6789/0,dssmon35=10.135.38.145:6789/0,dssmon4=10.140.208.226:6789/0,dssmon5=10.140.208.227:6789/0,dssmon6=10.140.208.228:6789/0,dssmonleader1=10.140.208.223:6789/0
> }
>             election epoch 792, quorum 0,1,2,3,4,5,6,7,8,9,10
>
> dssmon31,dssmon32,dssmon33,dssmon34,dssmon35,dssmonleader1,dssmon2,dssmon3,dssmon4,dssmon5,dssmon6
>      osdmap e8778: 2774 osds: 2746 up, 2746 in; 30 remapped pgs
>       pgmap v2740957: 75680 pgs, 11 pools, 386 GB data, 322 kobjects
>             16288 GB used, 14299 TB / 14315 TB avail
>             3064/1981308 objects degraded (0.155%)
>             124/1981308 objects misplaced (0.006%)
>                74992 active+clean
>                  658 active+undersized+degraded
>                   30 active+remapped
>   client io 12394 B/s rd, 17 op/s
>
> With 12 OSDs down due to H/W failure, and a replication factor of 6, the
> cluster should have recovered, but it is not recovering.
>
> When I kill an OSD daemon, it recovers quickly. Any ideas why the PGs
> remain undersized?
>
> What could be the difference between the two scenarios:
>
> 1. OSD down due to H/W failure.
> 2. OSD daemon killed.
>
> When I remove the 12 OSDs from the CRUSH map manually, or run ceph osd
> crush remove for those OSDs, the cluster recovers just fine.
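>
> For the record, the bulk removal I did was roughly this (the OSD ids here
> are placeholders, not my real ones):
>
>     for id in 101 102 103; do          # ids of the failed OSDs
>         ceph osd crush remove osd.$id
>         ceph auth del osd.$id
>         ceph osd rm $id
>     done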
>
> Thanks
> Gaurav
>
> On Tue, May 17, 2016 at 2:08 AM, Wido den Hollander <w...@42on.com> wrote:
> >
> >> On 14 May 2016 at 12:36, Lazuardi Nasution <mrxlazuar...@gmail.com>
> >> wrote:
> >>
> >>
> >> Hi Wido,
> >>
> >> Yes, you are right. After removing the down OSDs, reformatting them and
> >> bringing them up again, at least until 75% of the total OSDs were back,
> >> my Ceph cluster is healthy again. It seems there is a high probability of
> >> data safety if the number of active PGs equals the total number of PGs
> >> and the number of degraded PGs equals the number of undersized PGs, but
> >> it is better to check the PGs one by one to make sure there are no
> >> incomplete, unfound and/or missing objects.
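> >>
> >> For what it is worth, the per-PG checks I did were roughly these (the
> >> pg id 3.1a is just a placeholder):
> >>
> >>     ceph health detail        # lists the problem PGs and their states
> >>     ceph pg 3.1a query        # peering/recovery state and acting set of one PG
> >>     ceph pg 3.1a list_missing # unfound objects in that PG, if any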
> >>
> >> Anyway, why 75%? Can I reduce this value by increasing the replica
> >> count (size) of the pool?
> >>
> >
> > How many OSDs have to be added back to allow the cluster to recover
> > depends entirely on the CRUSHMap.
> >
> > A CRUSHMap has failure domains, which are usually hosts. You have to
> > make sure you have enough 'hosts' online with OSDs, one for each replica.
> >
> > So with 3 replicas you need 3 hosts online with OSDs on them.
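> >
> > To see how this is laid out on your cluster, something like the following
> > shows the failure domains and the pool's replica count ('rbd' is only an
> > example pool name):
> >
> >     ceph osd tree               # buckets: root / host / osd
> >     ceph osd crush rule dump    # which bucket type each rule chooses from
> >     ceph osd pool get rbd size  # replica count of the pool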
> >
> > You can lower the replica count of a pool (size), but that makes it
> > more vulnerable to data loss.
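> >
> > For illustration only (pool name 'rbd' is an example; think hard before
> > doing this on real data):
> >
> >     ceph osd pool set rbd size 2      # keep only two copies
> >     ceph osd pool set rbd min_size 1  # allow I/O with a single copy left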
> >
> > Wido
> >
> >> Best regards,
> >>
> >> On Fri, May 13, 2016 at 5:04 PM, Wido den Hollander <w...@42on.com> wrote:
> >>
> >> >
> >> > > On 13 May 2016 at 11:55, Lazuardi Nasution <mrxlazuar...@gmail.com>
> >> > > wrote:
> >> > >
> >> > >
> >> > > Hi Wido,
> >> > >
> >> > > The status is the same after 24 hours of running. It seems that the
> >> > > status will not go fully active+clean until all the down OSDs come
> >> > > back again. The only way to bring the down OSDs back is to reformat
> >> > > them, or to replace them if the HDDs have a hardware issue. Do you
> >> > > think that is a safe way to do it?
> >> > >
> >> >
> >> > Ah, you are probably lacking enough replicas to make the recovery
> >> > proceed.
> >> >
> >> > If that is needed I would do this OSD by OSD. Your CRUSHMap will
> >> > probably tell you which OSDs you need to bring back before it works
> >> > again.
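> >> >
> >> > A quick way to see which OSDs a stuck PG wants is something like this
> >> > (the pg id 3.1a is just a placeholder):
> >> >
> >> >     ceph pg map 3.1a   # prints the 'up' and 'acting' OSD sets for that PG
> >> >     ceph osd tree      # maps those OSD ids back to their hosts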
> >> >
> >> > Wido
> >> >
> >> > > Best regards,
> >> > >
> >> > > On Fri, May 13, 2016 at 4:44 PM, Wido den Hollander <w...@42on.com> wrote:
> >> > >
> >> > > >
> >> > > > > On 13 May 2016 at 11:34, Lazuardi Nasution
> >> > > > > <mrxlazuar...@gmail.com> wrote:
> >> > > > >
> >> > > > >
> >> > > > > Hi,
> >> > > > >
> >> > > > > After a disaster and a restart for automatic recovery, I found
> >> > > > > the following ceph status. Some OSDs cannot be restarted due to
> >> > > > > file system corruption (it seems that XFS is fragile).
> >> > > > >
> >> > > > > [root@management-b ~]# ceph status
> >> > > > >     cluster 3810e9eb-9ece-4804-8c56-b986e7bb5627
> >> > > > >      health HEALTH_WARN
> >> > > > >             209 pgs degraded
> >> > > > >             209 pgs stuck degraded
> >> > > > >             334 pgs stuck unclean
> >> > > > >             209 pgs stuck undersized
> >> > > > >             209 pgs undersized
> >> > > > >             recovery 5354/77810 objects degraded (6.881%)
> >> > > > >             recovery 1105/77810 objects misplaced (1.420%)
> >> > > > >      monmap e1: 3 mons at {management-a=
> >> > > > >
> >> > > >
> >> >
> 10.255.102.1:6789/0,management-b=10.255.102.2:6789/0,management-c=10.255.102.3:6789/0
> >> > > > > }
> >> > > > >             election epoch 2308, quorum 0,1,2
> >> > > > > management-a,management-b,management-c
> >> > > > >      osdmap e25037: 96 osds: 49 up, 49 in; 125 remapped pgs
> >> > > > >             flags sortbitwise
> >> > > > >       pgmap v9024253: 2560 pgs, 5 pools, 291 GB data, 38905
> objects
> >> > > > >             678 GB used, 90444 GB / 91123 GB avail
> >> > > > >             5354/77810 objects degraded (6.881%)
> >> > > > >             1105/77810 objects misplaced (1.420%)
> >> > > > >                 2226 active+clean
> >> > > > >                  209 active+undersized+degraded
> >> > > > >                  125 active+remapped
> >> > > > >   client io 0 B/s rd, 282 kB/s wr, 10 op/s
> >> > > > >
> >> > > > > Since the number of active PGs equals the total number of PGs
> >> > > > > and the number of degraded PGs equals the number of undersized
> >> > > > > PGs, does it mean that all PGs have at least one good replica, so
> >> > > > > I can just mark the down OSDs lost or remove them, reformat them
> >> > > > > and then restart them if there is no hardware issue with the
> >> > > > > HDDs? Which PG status should I pay more attention to, degraded or
> >> > > > > undersized, with regard to the possibility of lost objects?
> >> > > > >
> >> > > >
> >> > > > Yes. Your system is not reporting any inactive, unfound or stale
> >> > > > PGs, so that is good news.
> >> > > >
> >> > > > However, I recommend that you wait for the system to become fully
> >> > > > active+clean before you start removing any OSDs or formatting hard
> >> > > > drives. Better safe than sorry.
> >> > > >
> >> > > > Wido
> >> > > >
> >> > > > > Best regards,
> >> > > >
> >> >
>
>
>
> --
> Gaurav Bafna
> 9540631400
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
