Just losing one disk doesn't automagically delete it from CRUSH, but in the output you had 10 disks listed, so something else must have gone on - did you delete the disk from the crush map as well?
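For reference, the usual sequence to fully remove a dead OSD looks roughly like this (just a sketch - osd.10 is a placeholder id, and the crush/auth/rm steps should normally wait until recovery has finished):

    ceph osd out 10
    # stop the ceph-osd daemon on its host, then:
    ceph osd crush remove osd.10
    ceph auth del osd.10
    ceph osd rm 10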
AFAIK Ceph waits 300 seconds by default before marking an OSD out; after that it will start to recover.

> On 29 Jun 2016, at 10:42, Mario Giammarco <mgiamma...@gmail.com> wrote:
>
> I thank you for your reply, so I can add my experience:
>
> 1) The other time this happened to me I had a cluster with min_size=2
> and size=3 and the problem was the same. That time I set min_size=1 to
> recover the pool but it did not help. So I do not understand what the
> advantage of keeping three copies is when Ceph can decide to discard all three.
> 2) I started with 11 HDDs. One hard disk failed. Ceph waited forever for the
> disk to come back, but the disk is really completely broken, so I followed
> the procedure to really delete it from the cluster. Still, Ceph did not
> recover.
> 3) I have 307 PGs per OSD, more than the 300 limit, but that is because I had
> 11 HDDs and now only 10. I will add more HDDs after I repair the pool.
> 4) I have reduced the monitors to 3.
>
> On Wed, 29 Jun 2016 at 10:25, Christian Balzer <ch...@gol.com> wrote:
>
> Hello,
>
> On Wed, 29 Jun 2016 06:02:59 +0000 Mario Giammarco wrote:
>
> > pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash
>                            ^
> And that's the root cause of all your woes.
> The default replication size is 3 for a reason, and while I do run pools
> with replication of 2, they are either HDD RAIDs or extremely trustworthy
> and well-monitored SSDs.
>
> That said, something more than a single HDD failure must have happened
> here; you should check the logs and backtrace all the steps you took after
> that OSD failed.
>
> You said there were 11 HDDs, and your first ceph -s output showed:
> ---
> osdmap e10182: 10 osds: 10 up, 10 in
> ---
> And your crush map states the same.
>
> So how and WHEN did you remove that OSD?
> My suspicion would be that it was removed before recovery was complete.
>
> Also, as I think was mentioned before, 7 mons are overkill; 3-5 would be a
> saner number.
>
> Christian
>
> > rjenkins pg_num 512 pgp_num 512 last_change 9313 flags hashpspool
> > stripe_width 0
> > removed_snaps [1~3]
> > pool 1 'rbd2' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> > rjenkins pg_num 512 pgp_num 512 last_change 9314 flags hashpspool
> > stripe_width 0
> > removed_snaps [1~3]
> > pool 2 'rbd3' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> > rjenkins pg_num 512 pgp_num 512 last_change 10537 flags hashpspool
> > stripe_width 0
> > removed_snaps [1~3]
> >
> > ID WEIGHT  REWEIGHT SIZE   USE   AVAIL %USE  VAR
> >  5 1.81000  1.00000 1857G  984G  872G  53.00 0.86
> >  6 1.81000  1.00000 1857G 1202G  655G  64.73 1.05
> >  2 1.81000  1.00000 1857G 1158G  698G  62.38 1.01
> >  3 1.35999  1.00000 1391G  906G  485G  65.12 1.06
> >  4 0.89999  1.00000  926G  702G  223G  75.88 1.23
> >  7 1.81000  1.00000 1857G 1063G  793G  57.27 0.93
> >  8 1.81000  1.00000 1857G 1011G  846G  54.44 0.88
> >  9 0.89999  1.00000  926G  573G  352G  61.91 1.01
> >  0 1.81000  1.00000 1857G 1227G  629G  66.10 1.07
> > 13 0.45000  1.00000  460G  307G  153G  66.74 1.08
> >           TOTAL    14846G 9136G 5710G  61.54
> > MIN/MAX VAR: 0.86/1.23  STDDEV: 6.47
> >
> > ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432)
> >
> > http://pastebin.com/SvGfcSHb
> > http://pastebin.com/gYFatsNS
> > http://pastebin.com/VZD7j2vN
> >
> > I do not understand why I/O on the ENTIRE cluster is blocked when only a
> > few PGs are incomplete.
> >
> > Many thanks,
> > Mario
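On the "why is the ENTIRE cluster blocked" question: roughly speaking, an incomplete PG has no copy that Ceph trusts as complete, so it blocks I/O to that PG rather than serve possibly stale data, and since every RBD image is striped across many PGs, almost every VM ends up touching one of the 19 bad ones. To see exactly which OSDs each incomplete PG is waiting for, something like the following helps (the PG id 2.5f is only an example - take the real ids from the health output):

    ceph health detail
    ceph pg 2.5f query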
> >
> > On Tue, 28 Jun 2016 at 19:34, Stefan Priebe - Profihost AG
> > <s.pri...@profihost.ag> wrote:
> >
> > > And ceph health detail
> > >
> > > Stefan
> > >
> > > Excuse my typos, sent from my mobile phone.
> > >
> > > On 28.06.2016 at 19:28, Oliver Dzombic <i...@ip-interactive.de> wrote:
> > >
> > > Hi Mario,
> > >
> > > please give some more details:
> > >
> > > Please post the output of:
> > >
> > > ceph osd pool ls detail
> > > ceph osd df
> > > ceph --version
> > >
> > > ceph -w for 10 seconds ( use http://pastebin.com/ please )
> > >
> > > ceph osd crush dump ( also pastebin pls )
> > >
> > > --
> > > Mit freundlichen Gruessen / Best regards
> > >
> > > Oliver Dzombic
> > > IP-Interactive
> > >
> > > mailto:i...@ip-interactive.de
> > >
> > > Address:
> > >
> > > IP Interactive UG ( haftungsbeschraenkt )
> > > Zum Sonnenberg 1-3
> > > 63571 Gelnhausen
> > >
> > > HRB 93402, Amtsgericht Hanau
> > > Managing director: Oliver Dzombic
> > >
> > > Tax No.: 35 236 3622 1
> > > VAT ID: DE274086107
> > >
> > > On 28.06.2016 at 18:59, Mario Giammarco wrote:
> > >
> > > Hello,
> > >
> > > this is the second time this has happened to me; I hope that someone can
> > > explain what I can do.
> > >
> > > Proxmox Ceph cluster with 8 servers, 11 HDDs. min_size=1, size=2.
> > >
> > > One HDD went down due to bad sectors.
> > > Ceph recovered, but it ended up with:
> > >
> > >     cluster f2a8dd7d-949a-4a29-acab-11d4900249f4
> > >      health HEALTH_WARN
> > >             3 pgs down
> > >             19 pgs incomplete
> > >             19 pgs stuck inactive
> > >             19 pgs stuck unclean
> > >             7 requests are blocked > 32 sec
> > >      monmap e11: 7 mons at
> > >             {0=192.168.0.204:6789/0,1=192.168.0.201:6789/0,
> > >             2=192.168.0.203:6789/0,3=192.168.0.205:6789/0,
> > >             4=192.168.0.202:6789/0,5=192.168.0.206:6789/0,
> > >             6=192.168.0.207:6789/0}
> > >             election epoch 722, quorum 0,1,2,3,4,5,6 1,4,2,0,3,5,6
> > >      osdmap e10182: 10 osds: 10 up, 10 in
> > >       pgmap v3295880: 1024 pgs, 2 pools, 4563 GB data, 1143 kobjects
> > >             9136 GB used, 5710 GB / 14846 GB avail
> > >                 1005 active+clean
> > >                   16 incomplete
> > >                    3 down+incomplete
> > >
> > > Unfortunately "7 requests are blocked" means no virtual machine can boot
> > > because Ceph has stopped I/O.
> > >
> > > I can accept losing some data, but not ALL data!
> > > Can you help me please?
> > >
> > > Thanks,
> > > Mario
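As for the original report above: Christian already pointed out that size=2 with min_size=1 is what made this so fragile. Once the cluster is healthy again it would be worth going back to three copies - a sketch for the 'rbd' pool (repeat for rbd2 and rbd3, and expect a lot of data movement):

    ceph osd pool set rbd size 3
    ceph osd pool set rbd min_size 2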
>
> --
> Christian Balzer        Network/Systems Engineer
> ch...@gol.com           Global OnLine Japan/Rakuten Communications
> http://www.gol.com/
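By the way, the ~300 second delay I mentioned at the top is the mon_osd_down_out_interval option; if you want to check what your cluster actually uses, you can look it up on the host running one of your mons (mon.0 here, taken from your monmap), e.g.:

    ceph daemon mon.0 config show | grep mon_osd_down_out_interval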
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com