Vasiliy, I don't think that's the cause. Can you paste the other tuning options from your ceph.conf?
Also, have you fixed the problems with cephx auth?

Bob
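For reference, this is the sort of [mon] section I mean -- the option names below are real 0.94.x options, but the values are only illustrative examples, not recommendations:

    [mon]
    # seconds a "down" OSD may stay "in" before the monitors mark it out
    mon osd down out interval = 300
    # per the docs: the smallest CRUSH unit type that will NOT be marked
    # out automatically; with "host", if every OSD under a single host is
    # down, the monitors will not auto-mark those OSDs out
    mon osd downout subtree limit = host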
On Mon, Nov 30, 2015 at 12:56 AM, Vasiliy Angapov <anga...@gmail.com> wrote:
> Btw, in my configuration "mon osd downout subtree limit" is set to "host".
> Does it influence things?
>
> 2015-11-29 14:38 GMT+08:00 Vasiliy Angapov <anga...@gmail.com>:
> > Bob,
> > Thanks for the explanation, sounds reasonable! But how could it happen
> > that a host is down while its OSDs are still IN the cluster?
> > I mean, the NOOUT flag is not set and my timeouts are all at their
> > defaults...
> >
> > But if I remember correctly, the host was not completely down: it was
> > pingable, but no other services were reachable, like SSH or anything
> > else. Is it possible that the OSDs were still sending some information
> > to the monitors, making them look IN?
> >
> > 2015-11-29 2:10 GMT+08:00 Bob R <b...@drinksbeer.org>:
> >> Vasiliy,
> >>
> >> Your OSDs are marked as 'down' but 'in'.
> >>
> >> "Ceph OSDs have two known states that can be combined. Up and Down only
> >> tells you whether the OSD is actively involved in the cluster. OSD states
> >> also are expressed in terms of cluster replication: In and Out. Only when a
> >> Ceph OSD is tagged as Out does the self-healing process occur"
> >>
> >> Bob
> >>
> >> On Fri, Nov 27, 2015 at 6:15 AM, Mart van Santen <m...@greenhost.nl> wrote:
> >>>
> >>> Dear Vasiliy,
> >>>
> >>> On 11/27/2015 02:00 PM, Irek Fasikhov wrote:
> >>>
> >>> Is your time synchronized?
> >>>
> >>> Best regards, Irek Fasikhov
> >>> Mob.: +79229045757
> >>>
> >>> 2015-11-27 15:57 GMT+03:00 Vasiliy Angapov <anga...@gmail.com>:
> >>>>
> >>>> > It seems that you played around with the crushmap and did
> >>>> > something wrong. Compare the output of 'ceph osd tree' with the
> >>>> > crushmap: some 'osd' devices are renamed to 'device'. I think
> >>>> > that is your problem.
> >>>> Is this actually a mistake? What I did was remove a bunch of OSDs
> >>>> from my cluster, which is why the numeration is sparse. But is it an
> >>>> issue to have a sparse numeration of OSDs?
> >>>
> >>> I think this is normal and should be no problem; I have had this
> >>> previously as well.
> >>>
> >>>> > Hi.
> >>>> > Vasiliy, yes, it is a problem with the crushmap. Look at the weights:
> >>>> > -3 14.56000 host slpeah001
> >>>> > -2 14.56000 host slpeah002
> >>>> What exactly is wrong here?
> >>>
> >>> I do not know exactly how the weights of the hosts contribute to
> >>> determining where to store the third copy of a PG. As you explained,
> >>> you have enough space on all hosts, but maybe the weights of these
> >>> hosts do not add up and CRUSH comes to the conclusion that it cannot
> >>> place the PGs. What you can try is to artificially raise the weights
> >>> of these hosts, to see if it starts mapping the third copies of the
> >>> PGs onto the available hosts.
> >>>
> >>> I had a similar problem in the past, which was solved by upgrading to
> >>> the latest crush tunables. But be aware that this can cause massive
> >>> data movement.
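In case it is useful, here is a minimal sketch of Mart's two suggestions. The host name comes from your osd tree, the edited weight is an arbitrary example, and either step can trigger significant data movement:

    # 1) raise a host's weight by editing the crushmap offline
    ceph osd getcrushmap -o cm.bin
    crushtool -d cm.bin -o cm.txt
    #    ... edit the "item slpeah001 weight ..." entries in cm.txt ...
    crushtool -c cm.txt -o cm.new
    ceph osd setcrushmap -i cm.new

    # 2) check and, if needed, raise the crush tunables profile
    ceph osd crush show-tunables
    ceph osd crush tunables optimal   # "optimal" = best for the running release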
> >>>
> >>>> I also found out that my OSD logs are full of records like these
> >>>> (the same pattern repeats every 5 seconds):
> >>>> 2015-11-26 08:31:19.273268 7fe4f49b1700 0 cephx: verify_authorizer could not get service secret for service osd secret_id=2924
> >>>> 2015-11-26 08:31:19.273276 7fe4f49b1700 0 -- 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x41fd1000 sd=79 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee1a520).accept: got bad authorizer
> >>>> 2015-11-26 08:31:24.273207 7fe4f49b1700 0 auth: could not find secret_id=2924
> >>>> 2015-11-26 08:31:24.273225 7fe4f49b1700 0 cephx: verify_authorizer could not get service secret for service osd secret_id=2924
> >>>> 2015-11-26 08:31:24.273231 7fe4f49b1700 0 -- 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x3f90b000 sd=79 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee1a3c0).accept: got bad authorizer
> >>>> 2015-11-26 08:31:29.273199 7fe4f49b1700 0 auth: could not find secret_id=2924
> >>>>
> >>>> What does it mean? Google says it might be a time sync issue, but my
> >>>> clocks are perfectly synchronized...
> >>>
> >>> Normally you get a warning in "ceph status" if time is out of sync.
> >>> Nevertheless, you can try to restart the OSDs. I had issues with
> >>> timing in the past and discovered that it sometimes helps to restart
> >>> the daemons *after* syncing the times, since they did not always pick
> >>> up the new time otherwise. But this was mostly the case with
> >>> monitors, though.
> >>>
> >>> Regards,
> >>>
> >>> Mart
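About the "could not find secret_id" messages above: those usually point at the rotating cephx service keys, which are time-sensitive. A quick checklist sketch, assuming an NTP client and the ceph CLI on every node (the systemd unit name is a guess; 0.94 packages are often sysvinit):

    # monitors flag clock skew here once it exceeds the allowed drift
    ceph health detail
    # compare each node's offset against its NTP peers
    ntpq -p
    # if clocks were fixed after the daemons started, restart so the
    # daemon fetches fresh rotating keys
    service ceph restart osd.1        # sysvinit-style installs
    # systemctl restart ceph-osd@1    # systemd-style installs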
> >>>
> >>>> 2015-11-26 21:05 GMT+08:00 Irek Fasikhov <malm...@gmail.com>:
> >>>> > Hi.
> >>>> > Vasiliy, yes, it is a problem with the crushmap. Look at the weights:
> >>>> > " -3 14.56000 host slpeah001
> >>>> >   -2 14.56000 host slpeah002
> >>>> > "
> >>>> >
> >>>> > Best regards, Irek Fasikhov
> >>>> > Mob.: +79229045757
> >>>> >
> >>>> > 2015-11-26 13:16 GMT+03:00 Kamil Kuramshin (CIT RT)
> >>>> > <kamil.kurams...@tatar.ru>:
> >>>> >>
> >>>> >> It seems that you played around with the crushmap and did
> >>>> >> something wrong. Compare the output of 'ceph osd tree' with the
> >>>> >> crushmap: some 'osd' devices are renamed to 'device'. I think
> >>>> >> that is your problem.
> >>>> >>
> >>>> >> Sent from a mobile device.
> >>>> >>
> >>>> >> -----Original Message-----
> >>>> >> From: Vasiliy Angapov <anga...@gmail.com>
> >>>> >> To: ceph-users <ceph-users@lists.ceph.com>
> >>>> >> Sent: Thu, Nov 26, 7:53
> >>>> >> Subject: [ceph-users] Undersized pgs problem
> >>>> >>
> >>>> >> Hi, colleagues!
> >>>> >>
> >>>> >> I have a small 4-node Ceph cluster (0.94.2); all pools have
> >>>> >> size 3, min_size 1.
> >>>> >> Last night one host failed, and the cluster was unable to
> >>>> >> rebalance, saying there are a lot of undersized PGs.
> >>>> >>
> >>>> >> root@slpeah002:[~]:# ceph -s
> >>>> >>     cluster 78eef61a-3e9c-447c-a3ec-ce84c617d728
> >>>> >>      health HEALTH_WARN
> >>>> >>             1486 pgs degraded
> >>>> >>             1486 pgs stuck degraded
> >>>> >>             2257 pgs stuck unclean
> >>>> >>             1486 pgs stuck undersized
> >>>> >>             1486 pgs undersized
> >>>> >>             recovery 80429/555185 objects degraded (14.487%)
> >>>> >>             recovery 40079/555185 objects misplaced (7.219%)
> >>>> >>             4/20 in osds are down
> >>>> >>             1 mons down, quorum 1,2 slpeah002,slpeah007
> >>>> >>      monmap e7: 3 mons at {slpeah001=192.168.254.11:6780/0,slpeah002=192.168.254.12:6780/0,slpeah007=172.31.252.46:6789/0}
> >>>> >>             election epoch 710, quorum 1,2 slpeah002,slpeah007
> >>>> >>      osdmap e14062: 20 osds: 16 up, 20 in; 771 remapped pgs
> >>>> >>       pgmap v7021316: 4160 pgs, 5 pools, 1045 GB data, 180 kobjects
> >>>> >>             3366 GB used, 93471 GB / 96838 GB avail
> >>>> >>             80429/555185 objects degraded (14.487%)
> >>>> >>             40079/555185 objects misplaced (7.219%)
> >>>> >>                 1903 active+clean
> >>>> >>                 1486 active+undersized+degraded
> >>>> >>                  771 active+remapped
> >>>> >>   client io 0 B/s rd, 246 kB/s wr, 67 op/s
> >>>> >>
> >>>> >> root@slpeah002:[~]:# ceph osd tree
> >>>> >> ID  WEIGHT   TYPE NAME          UP/DOWN REWEIGHT PRIMARY-AFFINITY
> >>>> >>  -1 94.63998 root default
> >>>> >>  -9 32.75999     host slpeah007
> >>>> >>  72  5.45999         osd.72          up  1.00000          1.00000
> >>>> >>  73  5.45999         osd.73          up  1.00000          1.00000
> >>>> >>  74  5.45999         osd.74          up  1.00000          1.00000
> >>>> >>  75  5.45999         osd.75          up  1.00000          1.00000
> >>>> >>  76  5.45999         osd.76          up  1.00000          1.00000
> >>>> >>  77  5.45999         osd.77          up  1.00000          1.00000
> >>>> >> -10 32.75999     host slpeah008
> >>>> >>  78  5.45999         osd.78          up  1.00000          1.00000
> >>>> >>  79  5.45999         osd.79          up  1.00000          1.00000
> >>>> >>  80  5.45999         osd.80          up  1.00000          1.00000
> >>>> >>  81  5.45999         osd.81          up  1.00000          1.00000
> >>>> >>  82  5.45999         osd.82          up  1.00000          1.00000
> >>>> >>  83  5.45999         osd.83          up  1.00000          1.00000
> >>>> >>  -3 14.56000     host slpeah001
> >>>> >>   1  3.64000         osd.1         down  1.00000          1.00000
> >>>> >>  33  3.64000         osd.33        down  1.00000          1.00000
> >>>> >>  34  3.64000         osd.34        down  1.00000          1.00000
> >>>> >>  35  3.64000         osd.35        down  1.00000          1.00000
> >>>> >>  -2 14.56000     host slpeah002
> >>>> >>   0  3.64000         osd.0           up  1.00000          1.00000
> >>>> >>  36  3.64000         osd.36          up  1.00000          1.00000
> >>>> >>  37  3.64000         osd.37          up  1.00000          1.00000
> >>>> >>  38  3.64000         osd.38          up  1.00000          1.00000
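Tying this back to Bob's point about down vs. out: the failed host's OSDs are 'down' but still 'in', so recovery will not start for them on its own. A sketch of kicking it off by hand, with the OSD ids taken from the tree above (only once you are sure the host will stay dead for a while):

    # mark the dead host's OSDs out so CRUSH re-replicates their PGs
    ceph osd out 1 33 34 35
    # watch recovery progress
    ceph -w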
> >>>> >>
> >>>> >> Crushmap:
> >>>> >>
> >>>> >> # begin crush map
> >>>> >> tunable choose_local_tries 0
> >>>> >> tunable choose_local_fallback_tries 0
> >>>> >> tunable choose_total_tries 50
> >>>> >> tunable chooseleaf_descend_once 1
> >>>> >> tunable chooseleaf_vary_r 1
> >>>> >> tunable straw_calc_version 1
> >>>> >> tunable allowed_bucket_algs 54
> >>>> >>
> >>>> >> # devices
> >>>> >> device 0 osd.0
> >>>> >> device 1 osd.1
> >>>> >> device 2 device2
> >>>> >> device 3 device3
> >>>> >> device 4 device4
> >>>> >> device 5 device5
> >>>> >> device 6 device6
> >>>> >> device 7 device7
> >>>> >> device 8 device8
> >>>> >> device 9 device9
> >>>> >> device 10 device10
> >>>> >> device 11 device11
> >>>> >> device 12 device12
> >>>> >> device 13 device13
> >>>> >> device 14 device14
> >>>> >> device 15 device15
> >>>> >> device 16 device16
> >>>> >> device 17 device17
> >>>> >> device 18 device18
> >>>> >> device 19 device19
> >>>> >> device 20 device20
> >>>> >> device 21 device21
> >>>> >> device 22 device22
> >>>> >> device 23 device23
> >>>> >> device 24 device24
> >>>> >> device 25 device25
> >>>> >> device 26 device26
> >>>> >> device 27 device27
> >>>> >> device 28 device28
> >>>> >> device 29 device29
> >>>> >> device 30 device30
> >>>> >> device 31 device31
> >>>> >> device 32 device32
> >>>> >> device 33 osd.33
> >>>> >> device 34 osd.34
> >>>> >> device 35 osd.35
> >>>> >> device 36 osd.36
> >>>> >> device 37 osd.37
> >>>> >> device 38 osd.38
> >>>> >> device 39 device39
> >>>> >> device 40 device40
> >>>> >> device 41 device41
> >>>> >> device 42 device42
> >>>> >> device 43 device43
> >>>> >> device 44 device44
> >>>> >> device 45 device45
> >>>> >> device 46 device46
> >>>> >> device 47 device47
> >>>> >> device 48 device48
> >>>> >> device 49 device49
> >>>> >> device 50 device50
> >>>> >> device 51 device51
> >>>> >> device 52 device52
> >>>> >> device 53 device53
> >>>> >> device 54 device54
> >>>> >> device 55 device55
> >>>> >> device 56 device56
> >>>> >> device 57 device57
> >>>> >> device 58 device58
> >>>> >> device 59 device59
> >>>> >> device 60 device60
> >>>> >> device 61 device61
> >>>> >> device 62 device62
> >>>> >> device 63 device63
> >>>> >> device 64 device64
> >>>> >> device 65 device65
> >>>> >> device 66 device66
> >>>> >> device 67 device67
> >>>> >> device 68 device68
> >>>> >> device 69 device69
> >>>> >> device 70 device70
> >>>> >> device 71 device71
> >>>> >> device 72 osd.72
> >>>> >> device 73 osd.73
> >>>> >> device 74 osd.74
> >>>> >> device 75 osd.75
> >>>> >> device 76 osd.76
> >>>> >> device 77 osd.77
> >>>> >> device 78 osd.78
> >>>> >> device 79 osd.79
> >>>> >> device 80 osd.80
> >>>> >> device 81 osd.81
> >>>> >> device 82 osd.82
> >>>> >> device 83 osd.83
> >>>> >>
> >>>> >> # types
> >>>> >> type 0 osd
> >>>> >> type 1 host
> >>>> >> type 2 chassis
> >>>> >> type 3 rack
> >>>> >> type 4 row
> >>>> >> type 5 pdu
> >>>> >> type 6 pod
> >>>> >> type 7 room
> >>>> >> type 8 datacenter
> >>>> >> type 9 region
> >>>> >> type 10 root
> >>>> >>
> >>>> >> # buckets
> >>>> >> host slpeah007 {
> >>>> >>     id -9    # do not change unnecessarily
> >>>> >>     # weight 32.760
> >>>> >>     alg straw
> >>>> >>     hash 0   # rjenkins1
> >>>> >>     item osd.72 weight 5.460
> >>>> >>     item osd.73 weight 5.460
> >>>> >>     item osd.74 weight 5.460
> >>>> >>     item osd.75 weight 5.460
> >>>> >>     item osd.76 weight 5.460
> >>>> >>     item osd.77 weight 5.460
> >>>> >> }
> >>>> >> host slpeah008 {
> >>>> >>     id -10   # do not change unnecessarily
> >>>> >>     # weight 32.760
> >>>> >>     alg straw
> >>>> >>     hash 0   # rjenkins1
> >>>> >>     item osd.78 weight 5.460
> >>>> >>     item osd.79 weight 5.460
> >>>> >>     item osd.80 weight 5.460
> >>>> >>     item osd.81 weight 5.460
> >>>> >>     item osd.82 weight 5.460
> >>>> >>     item osd.83 weight 5.460
> >>>> >> }
> >>>> >> host slpeah001 {
> >>>> >>     id -3    # do not change unnecessarily
> >>>> >>     # weight 14.560
> >>>> >>     alg straw
> >>>> >>     hash 0   # rjenkins1
> >>>> >>     item osd.1 weight 3.640
> >>>> >>     item osd.33 weight 3.640
> >>>> >>     item osd.34 weight 3.640
> >>>> >>     item osd.35 weight 3.640
> >>>> >> }
> >>>> >> host slpeah002 {
> >>>> >>     id -2    # do not change unnecessarily
> >>>> >>     # weight 14.560
> >>>> >>     alg straw
> >>>> >>     hash 0   # rjenkins1
> >>>> >>     item osd.0 weight 3.640
> >>>> >>     item osd.36 weight 3.640
> >>>> >>     item osd.37 weight 3.640
> >>>> >>     item osd.38 weight 3.640
> >>>> >> }
> >>>> >> root default {
> >>>> >>     id -1    # do not change unnecessarily
> >>>> >>     # weight 94.640
> >>>> >>     alg straw
> >>>> >>     hash 0   # rjenkins1
> >>>> >>     item slpeah007 weight 32.760
> >>>> >>     item slpeah008 weight 32.760
> >>>> >>     item slpeah001 weight 14.560
> >>>> >>     item slpeah002 weight 14.560
> >>>> >> }
> >>>> >>
> >>>> >> # rules
> >>>> >> rule default {
> >>>> >>     ruleset 0
> >>>> >>     type replicated
> >>>> >>     min_size 1
> >>>> >>     max_size 10
> >>>> >>     step take default
> >>>> >>     step chooseleaf firstn 0 type host
> >>>> >>     step emit
> >>>> >> }
> >>>> >>
> >>>> >> # end crush map
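Since the crushmap itself is under suspicion, it can be tested offline -- a sketch with crushtool; "--rule 0" matches the "ruleset 0" in the map above:

    ceph osd getcrushmap -o cm.bin
    # simulate placing 3 replicas for a range of sample inputs;
    # every line printed is a mapping that failed to get 3 OSDs,
    # so no output means the map can place all three copies
    crushtool -i cm.bin --test --rule 0 --num-rep 3 --show-bad-mappings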
> >>>> >>
> >>>> >> This is odd, because the pools have size 3 and I have 3 hosts
> >>>> >> alive, so why is it saying that undersized PGs are present? It
> >>>> >> makes me feel like CRUSH is not working properly.
> >>>> >> There is not much data in the cluster currently, about 3 TB, and
> >>>> >> as you can see from the osd tree, each host has at least 14 TB of
> >>>> >> disk space on its OSDs.
> >>>> >> So I'm a bit stuck now...
> >>>> >> How can I find the source of the trouble?
> >>>> >>
> >>>> >> Thanks in advance!
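To see exactly which PGs are undersized and what CRUSH chose for them, something like this helps (the pool name "rbd" and the PG id are placeholder examples, substitute your own; "dump_stuck undersized" should be available on 0.94.x):

    # list stuck undersized PGs with their up/acting OSD sets
    ceph pg dump_stuck undersized
    # confirm the replication settings per pool
    ceph osd pool get rbd size
    ceph osd pool get rbd min_size
    # inspect the mapping of one suspect PG
    ceph pg map 2.3f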
> >>>
> >>> --
> >>> Mart van Santen
> >>> Greenhost
> >>> E: m...@greenhost.nl
> >>> T: +31 20 4890444
> >>> W: https://greenhost.nl
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com