Hi,

Here is the complete CRUSH map that is currently in use:
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable straw_calc_version 1

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 osd.8
device 9 osd.9
device 10 osd.10
device 11 osd.11
device 12 osd.12
device 13 osd.13
device 14 osd.14
device 15 osd.15
device 16 osd.16
device 17 osd.17
device 18 osd.18
device 19 osd.19
device 20 osd.20
device 21 osd.21
device 22 osd.22
device 23 osd.23
device 24 osd.24
device 25 osd.25
device 26 osd.26
device 27 osd.27
device 28 osd.28
device 29 osd.29
device 30 osd.30
device 31 osd.31
device 32 osd.32
device 33 osd.33
device 34 osd.34
device 35 osd.35
device 36 osd.36
device 37 osd.37
device 38 osd.38
device 39 osd.39

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host or1010051251040 {
    id -3        # do not change unnecessarily
    # weight 20.000
    alg tree     # do not change pos for existing items unnecessarily
    hash 0       # rjenkins1
    item osd.0 weight 2.000 pos 0
    item osd.1 weight 2.000 pos 1
    item osd.2 weight 2.000 pos 2
    item osd.3 weight 2.000 pos 3
    item osd.4 weight 2.000 pos 4
    item osd.5 weight 2.000 pos 5
    item osd.6 weight 2.000 pos 6
    item osd.7 weight 2.000 pos 7
    item osd.8 weight 2.000 pos 8
    item osd.9 weight 2.000 pos 9
}
host or1010051251044 {
    id -8        # do not change unnecessarily
    # weight 20.000
    alg tree     # do not change pos for existing items unnecessarily
    hash 0       # rjenkins1
    item osd.30 weight 2.000 pos 0
    item osd.31 weight 2.000 pos 1
    item osd.32 weight 2.000 pos 2
    item osd.33 weight 2.000 pos 3
    item osd.34 weight 2.000 pos 4
    item osd.35 weight 2.000 pos 5
    item osd.36 weight 2.000 pos 6
    item osd.37 weight 2.000 pos 7
    item osd.38 weight 2.000 pos 8
    item osd.39 weight 2.000 pos 9
}
rack rack_A1 {
    id -2        # do not change unnecessarily
    # weight 40.000
    alg tree     # do not change pos for existing items unnecessarily
    hash 0       # rjenkins1
    item or1010051251040 weight 20.000 pos 0
    item or1010051251044 weight 20.000 pos 1
}
host or1010051251041 {
    id -5        # do not change unnecessarily
    # weight 20.000
    alg tree     # do not change pos for existing items unnecessarily
    hash 0       # rjenkins1
    item osd.10 weight 2.000 pos 0
    item osd.11 weight 2.000 pos 1
    item osd.12 weight 2.000 pos 2
    item osd.13 weight 2.000 pos 3
    item osd.14 weight 2.000 pos 4
    item osd.15 weight 2.000 pos 5
    item osd.16 weight 2.000 pos 6
    item osd.17 weight 2.000 pos 7
    item osd.18 weight 2.000 pos 8
    item osd.19 weight 2.000 pos 9
}
host or1010051251045 {
    id -9        # do not change unnecessarily
    # weight 0.000
    alg tree     # do not change pos for existing items unnecessarily
    hash 0       # rjenkins1
}
rack rack_B1 {
    id -4        # do not change unnecessarily
    # weight 20.000
    alg tree     # do not change pos for existing items unnecessarily
    hash 0       # rjenkins1
    item or1010051251041 weight 20.000 pos 0
    item or1010051251045 weight 0.000 pos 1
}
host or1010051251042 {
    id -7        # do not change unnecessarily
    # weight 20.000
    alg tree     # do not change pos for existing items unnecessarily
    hash 0       # rjenkins1
    item osd.20 weight 2.000 pos 0
    item osd.21 weight 2.000 pos 1
    item osd.22 weight 2.000 pos 2
    item osd.23 weight 2.000 pos 3
    item osd.24 weight 2.000 pos 4
    item osd.25 weight 2.000 pos 5
    item osd.26 weight 2.000 pos 6
    item osd.27 weight 2.000 pos 7
    item osd.28 weight 2.000 pos 8
    item osd.29 weight 2.000 pos 9
}
host or1010051251046 {
    id -10       # do not change unnecessarily
    # weight 0.000
    alg tree     # do not change pos for existing items unnecessarily
    hash 0       # rjenkins1
}
host or1010051251023 {
    id -11       # do not change unnecessarily
    # weight 0.000
    alg tree     # do not change pos for existing items unnecessarily
    hash 0       # rjenkins1
}
rack rack_C1 {
    id -6        # do not change unnecessarily
    # weight 20.000
    alg tree     # do not change pos for existing items unnecessarily
    hash 0       # rjenkins1
    item or1010051251042 weight 20.000 pos 0
    item or1010051251046 weight 0.000 pos 1
    item or1010051251023 weight 0.000 pos 2
}
host or1010051251048 {
    id -12       # do not change unnecessarily
    # weight 0.000
    alg tree     # do not change pos for existing items unnecessarily
    hash 0       # rjenkins1
}
rack rack_D1 {
    id -13       # do not change unnecessarily
    # weight 0.000
    alg tree     # do not change pos for existing items unnecessarily
    hash 0       # rjenkins1
    item or1010051251048 weight 0.000 pos 0
}
root default {
    id -1        # do not change unnecessarily
    # weight 80.000
    alg tree     # do not change pos for existing items unnecessarily
    hash 0       # rjenkins1
    item rack_A1 weight 40.000 pos 0
    item rack_B1 weight 20.000 pos 1
    item rack_C1 weight 20.000 pos 2
    item rack_D1 weight 0.000 pos 3
}

# rules
rule replicated_ruleset {
    ruleset 0
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type rack
    step emit
}

# end crush map
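In case it is useful, this is roughly how we have been sanity-checking the map offline with crushtool. The rule number 0 and --num-rep 3 are assumptions based on replicated_ruleset above and a size-3 pool; adjust as needed:

# fetch and decompile the map currently in use
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt

# simulate placements for rule 0 with 3 replicas and show per-OSD usage;
# if osd.38 never shows up here, the map itself is excluding it
crushtool --test -i crushmap.bin --rule 0 --num-rep 3 --show-utilization

# or dump the individual mappings and look for osd 38
crushtool --test -i crushmap.bin --rule 0 --num-rep 3 --show-mappings | grep -w 38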
Thanks,
Pardhiv Karri

On Tue, May 22, 2018 at 9:58 AM, Pardhiv Karri <meher4in...@gmail.com> wrote:

> Hi David,
>
> We are using the tree algorithm.
>
> Thanks,
> Pardhiv Karri
>
> On Tue, May 22, 2018 at 9:42 AM, David Turner <drakonst...@gmail.com> wrote:
>
>> Your PG counts per pool per OSD don't show any PGs on osd.38. That
>> definitely matches what you're seeing, but I've never seen this happen
>> before. The OSD doesn't seem to be misconfigured at all.
>>
>> Does anyone have any ideas what could be happening here? I expected to
>> see something wrong in one of those outputs, but it all looks good.
>> Possibly something with straw vs straw2 or crush tunables.
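[Inline note on the tree vs straw point above: if we do end up trying straw buckets instead of tree, my understanding is it would be a decompile/edit/recompile of the same map, roughly as sketched below. File names are arbitrary, the change would trigger data movement, and straw2 would additionally need the hammer tunable profile and new-enough clients, so plain straw looks like the safer first step on 0.94.9.]

ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# change every bucket from the tree algorithm to straw in the decompiled map
sed -i 's/alg tree/alg straw/' crushmap.txt
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new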
>>
>>
>> On Tue, May 22, 2018, 12:33 PM Pardhiv Karri <meher4in...@gmail.com> wrote:
>>
>>> Hi David,
>>>
>>> root@or1010051251044:~# ceph df
>>> GLOBAL:
>>>     SIZE       AVAIL      RAW USED     %RAW USED
>>>     79793G     56832G       22860G         28.65
>>> POOLS:
>>>     NAME        ID     USED      %USED     MAX AVAIL     OBJECTS
>>>     rbd         0          0         0        14395G           0
>>>     compute     1          0         0        14395G           0
>>>     volumes     2      7605G     28.60        14395G     1947372
>>>     images      4          0         0        14395G           0
>>> root@or1010051251044:~#
>>>
>>> pool :     4    0     1     2  | SUM
>>> ------------------------------------------------
>>> osd.10     8   10    44    96  | 158
>>> osd.11    14    8    58   100  | 180
>>> osd.12    12    6    50    95  | 163
>>> osd.13    14    4    49   121  | 188
>>> osd.14     9    8    54    86  | 157
>>> osd.15    12    5    55   103  | 175
>>> osd.16    23    5    56    99  | 183
>>> osd.30     6    4    31    47  | 88
>>> osd.17     8    8    50   114  | 180
>>> osd.31     7    1    23    35  | 66
>>> osd.18    15    5    42    94  | 156
>>> osd.32    12    6    24    54  | 96
>>> osd.19    13    5    54   116  | 188
>>> osd.33     4    2    28    49  | 83
>>> osd.34     7    5    18    62  | 92
>>> osd.35    10    2    21    56  | 89
>>> osd.36     5    1    34    35  | 75
>>> osd.37     4    4    24    45  | 77
>>> osd.39    14    8    48   106  | 176
>>> osd.0     12    3    27    67  | 109
>>> osd.1      8    3    27    43  | 81
>>> osd.2      4    5    27    45  | 81
>>> osd.3      4    3    19    50  | 76
>>> osd.4      4    1    23    54  | 82
>>> osd.5      4    2    23    56  | 85
>>> osd.6      1    5    32    50  | 88
>>> osd.7      9    1    32    66  | 108
>>> osd.8      7    4    27    49  | 87
>>> osd.9      6    4    24    55  | 89
>>> osd.20     7    4    43   122  | 176
>>> osd.21    14    5    46    95  | 160
>>> osd.22    13    8    51   107  | 179
>>> osd.23    11    7    54   105  | 177
>>> osd.24    11    6    52   112  | 181
>>> osd.25    16    6    36    98  | 156
>>> osd.26    15    7    59   101  | 182
>>> osd.27     7    9    58   101  | 175
>>> osd.28    16    5    60    89  | 170
>>> osd.29    18    7    53    94  | 172
>>> ------------------------------------------------
>>> SUM :    384  192  1536  3072
>>>
>>> root@or1010051251044:~# for i in `rados lspools`; do echo "================="; echo Working on pool: $i; ceph osd pool get $i pg_num; ceph osd pool get $i pgp_num; done
>>> =================
>>> Working on pool: rbd
>>> pg_num: 64
>>> pgp_num: 64
>>> =================
>>> Working on pool: compute
>>> pg_num: 512
>>> pgp_num: 512
>>> =================
>>> Working on pool: volumes
>>> pg_num: 1024
>>> pgp_num: 1024
>>> =================
>>> Working on pool: images
>>> pg_num: 128
>>> pgp_num: 128
>>> root@or1010051251044:~#
>>>
>>> Thanks,
>>> Pardhiv Karri
>>>
>>> On Tue, May 22, 2018 at 9:16 AM, David Turner <drakonst...@gmail.com> wrote:
>>>
>>>> This is all weird. Maybe it just doesn't have any PGs with data on
>>>> them. `ceph df`, how many PGs you have in each pool, and which PGs are
>>>> on osd 38.
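[Inline note on "which PGs are on osd 38": this is roughly how I have been checking it on Hammer, by grepping the up/acting sets out of pg dump. The [a,b,c] set format is an assumption about the pgs_brief column layout, and given the per-pool counts above I would expect it to come back empty for 38.]

# list PGs whose up/acting set contains osd 38
ceph pg dump pgs_brief 2>/dev/null | grep -E '\[([0-9]+,)*38(,[0-9]+)*\]'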
>>>>
>>>> On Tue, May 22, 2018, 11:19 AM Pardhiv Karri <meher4in...@gmail.com> wrote:
>>>>
>>>>> Hi David,
>>>>>
>>>>> root@or1010051251044:~# ceph osd tree
>>>>> ID  WEIGHT   TYPE NAME                     UP/DOWN REWEIGHT PRIMARY-AFFINITY
>>>>>  -1 80.00000 root default
>>>>>  -2 40.00000     rack rack_A1
>>>>>  -3 20.00000         host or1010051251040
>>>>>   0  2.00000             osd.0                  up  1.00000          1.00000
>>>>>   1  2.00000             osd.1                  up  1.00000          1.00000
>>>>>   2  2.00000             osd.2                  up  1.00000          1.00000
>>>>>   3  2.00000             osd.3                  up  1.00000          1.00000
>>>>>   4  2.00000             osd.4                  up  1.00000          1.00000
>>>>>   5  2.00000             osd.5                  up  1.00000          1.00000
>>>>>   6  2.00000             osd.6                  up  1.00000          1.00000
>>>>>   7  2.00000             osd.7                  up  1.00000          1.00000
>>>>>   8  2.00000             osd.8                  up  1.00000          1.00000
>>>>>   9  2.00000             osd.9                  up  1.00000          1.00000
>>>>>  -8 20.00000         host or1010051251044
>>>>>  30  2.00000             osd.30                 up  1.00000          1.00000
>>>>>  31  2.00000             osd.31                 up  1.00000          1.00000
>>>>>  32  2.00000             osd.32                 up  1.00000          1.00000
>>>>>  33  2.00000             osd.33                 up  1.00000          1.00000
>>>>>  34  2.00000             osd.34                 up  1.00000          1.00000
>>>>>  35  2.00000             osd.35                 up  1.00000          1.00000
>>>>>  36  2.00000             osd.36                 up  1.00000          1.00000
>>>>>  37  2.00000             osd.37                 up  1.00000          1.00000
>>>>>  38  2.00000             osd.38                 up  1.00000          1.00000
>>>>>  39  2.00000             osd.39                 up  1.00000          1.00000
>>>>>  -4 20.00000     rack rack_B1
>>>>>  -5 20.00000         host or1010051251041
>>>>>  10  2.00000             osd.10                 up  1.00000          1.00000
>>>>>  11  2.00000             osd.11                 up  1.00000          1.00000
>>>>>  12  2.00000             osd.12                 up  1.00000          1.00000
>>>>>  13  2.00000             osd.13                 up  1.00000          1.00000
>>>>>  14  2.00000             osd.14                 up  1.00000          1.00000
>>>>>  15  2.00000             osd.15                 up  1.00000          1.00000
>>>>>  16  2.00000             osd.16                 up  1.00000          1.00000
>>>>>  17  2.00000             osd.17                 up  1.00000          1.00000
>>>>>  18  2.00000             osd.18                 up  1.00000          1.00000
>>>>>  19  2.00000             osd.19                 up  1.00000          1.00000
>>>>>  -9        0         host or1010051251045
>>>>>  -6 20.00000     rack rack_C1
>>>>>  -7 20.00000         host or1010051251042
>>>>>  20  2.00000             osd.20                 up  1.00000          1.00000
>>>>>  21  2.00000             osd.21                 up  1.00000          1.00000
>>>>>  22  2.00000             osd.22                 up  1.00000          1.00000
>>>>>  23  2.00000             osd.23                 up  1.00000          1.00000
>>>>>  24  2.00000             osd.24                 up  1.00000          1.00000
>>>>>  25  2.00000             osd.25                 up  1.00000          1.00000
>>>>>  26  2.00000             osd.26                 up  1.00000          1.00000
>>>>>  27  2.00000             osd.27                 up  1.00000          1.00000
>>>>>  28  2.00000             osd.28                 up  1.00000          1.00000
>>>>>  29  2.00000             osd.29                 up  1.00000          1.00000
>>>>> -10        0         host or1010051251046
>>>>> -11        0         host or1010051251023
>>>>> root@or1010051251044:~#
>>>>>
>>>>> root@or1010051251044:~# ceph -s
>>>>>     cluster 6eacac66-087a-464d-94cb-9ca2585b98d5
>>>>>      health HEALTH_OK
>>>>>      monmap e3: 3 mons at {or1010051251037=10.51.251.37:6789/0,or1010051251038=10.51.251.38:6789/0,or1010051251039=10.51.251.39:6789/0}
>>>>>             election epoch 144, quorum 0,1,2 or1010051251037,or1010051251038,or1010051251039
>>>>>      osdmap e1814: 40 osds: 40 up, 40 in
>>>>>       pgmap v446581: 1728 pgs, 4 pools, 7389 GB data, 1847 kobjects
>>>>>             22221 GB used, 57472 GB / 79793 GB avail
>>>>>                 1728 active+clean
>>>>>   client io 61472 kB/s wr, 30 op/s
>>>>> root@or1010051251044:~#
>>>>>
>>>>> Thanks,
>>>>> Pardhiv Karri
>>>>>
>>>>> On Tue, May 22, 2018 at 5:01 AM, David Turner <drakonst...@gmail.com> wrote:
>>>>>
>>>>>> What are your `ceph osd tree` and `ceph status` as well?
>>>>>>
>>>>>> On Tue, May 22, 2018, 3:05 AM Pardhiv Karri <meher4in...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> We are using Ceph Hammer 0.94.9. Some of our OSDs never get any data
>>>>>>> or PGs even though they are at full crush weight, up, and running. The
>>>>>>> rest of the OSDs are at about 50% full. Is there a bug in Hammer that
>>>>>>> is causing this issue? Does upgrading to Jewel or Luminous fix it?
>>>>>>>
>>>>>>> I tried deleting and recreating this OSD N number of times and still
>>>>>>> hit the same issue. I am seeing this in 3 of our 4 Ceph clusters in
>>>>>>> different datacenters. We are using HDDs as OSDs and SSDs as journal
>>>>>>> drives.
>>>>>>>
>>>>>>> The output below is from our lab, and OSD 38 is the one that never fills.
>>>>>>>
>>>>>>> ID  WEIGHT   REWEIGHT SIZE   USE    AVAIL  %USE  VAR  TYPE NAME
>>>>>>>  -1 80.00000        -      0      0      0     0    0 root default
>>>>>>>  -2 40.00000        - 39812G  6190G 33521G 15.55 0.68     rack rack_A1
>>>>>>>  -3 20.00000        - 19852G  3718G 16134G 18.73 0.82         host or1010051251040
>>>>>>>   0  2.00000  1.00000  1861G   450G  1410G 24.21 1.07             osd.0
>>>>>>>   1  2.00000  1.00000  1999G   325G  1673G 16.29 0.72             osd.1
>>>>>>>   2  2.00000  1.00000  1999G   336G  1662G 16.85 0.74             osd.2
>>>>>>>   3  2.00000  1.00000  1999G   386G  1612G 19.35 0.85             osd.3
>>>>>>>   4  2.00000  1.00000  1999G   385G  1613G 19.30 0.85             osd.4
>>>>>>>   5  2.00000  1.00000  1999G   364G  1634G 18.21 0.80             osd.5
>>>>>>>   6  2.00000  1.00000  1999G   319G  1679G 15.99 0.70             osd.6
>>>>>>>   7  2.00000  1.00000  1999G   434G  1564G 21.73 0.96             osd.7
>>>>>>>   8  2.00000  1.00000  1999G   352G  1646G 17.63 0.78             osd.8
>>>>>>>   9  2.00000  1.00000  1999G   362G  1636G 18.12 0.80             osd.9
>>>>>>>  -8 20.00000        - 19959G  2472G 17387G 12.39 0.55         host or1010051251044
>>>>>>>  30  2.00000  1.00000  1999G   362G  1636G 18.14 0.80             osd.30
>>>>>>>  31  2.00000  1.00000  1999G   293G  1705G 14.66 0.65             osd.31
>>>>>>>  32  2.00000  1.00000  1999G   202G  1796G 10.12 0.45             osd.32
>>>>>>>  33  2.00000  1.00000  1999G   215G  1783G 10.76 0.47             osd.33
>>>>>>>  34  2.00000  1.00000  1999G   192G  1806G  9.61 0.42             osd.34
>>>>>>>  35  2.00000  1.00000  1999G   337G  1661G 16.90 0.74             osd.35
>>>>>>>  36  2.00000  1.00000  1999G   206G  1792G 10.35 0.46             osd.36
>>>>>>>  37  2.00000  1.00000  1999G   266G  1732G 13.33 0.59             osd.37
>>>>>>>  38  2.00000  1.00000  1999G 55836k  1998G  0.00    0             osd.38
>>>>>>>  39  2.00000  1.00000  1968G   396G  1472G 20.12 0.89             osd.39
>>>>>>>  -4 20.00000        -      0      0      0     0    0     rack rack_B1
>>>>>>>  -5 20.00000        - 19990G  5978G 14011G 29.91 1.32         host or1010051251041
>>>>>>>  10  2.00000  1.00000  1999G   605G  1393G 30.27 1.33             osd.10
>>>>>>>  11  2.00000  1.00000  1999G   592G  1406G 29.62 1.30             osd.11
>>>>>>>  12  2.00000  1.00000  1999G   539G  1460G 26.96 1.19             osd.12
>>>>>>>  13  2.00000  1.00000  1999G   684G  1314G 34.22 1.51             osd.13
>>>>>>>  14  2.00000  1.00000  1999G   510G  1488G 25.56 1.13             osd.14
>>>>>>>  15  2.00000  1.00000  1999G   590G  1408G 29.52 1.30             osd.15
>>>>>>>  16  2.00000  1.00000  1999G   595G  1403G 29.80 1.31             osd.16
>>>>>>>  17  2.00000  1.00000  1999G   652G  1346G 32.64 1.44             osd.17
>>>>>>>  18  2.00000  1.00000  1999G   544G  1454G 27.23 1.20             osd.18
>>>>>>>  19  2.00000  1.00000  1999G   665G  1333G 33.27 1.46             osd.19
>>>>>>>  -9        0        -      0      0      0     0    0         host or1010051251045
>>>>>>>  -6 20.00000        -      0      0      0     0    0     rack rack_C1
>>>>>>>  -7 20.00000        - 19990G  5956G 14033G 29.80 1.31         host or1010051251042
>>>>>>>  20  2.00000  1.00000  1999G   701G  1297G 35.11 1.55             osd.20
>>>>>>>  21  2.00000  1.00000  1999G   573G  1425G 28.70 1.26             osd.21
>>>>>>>  22  2.00000  1.00000  1999G   652G  1346G 32.64 1.44             osd.22
>>>>>>>  23  2.00000  1.00000  1999G   612G  1386G 30.62 1.35             osd.23
>>>>>>>  24  2.00000  1.00000  1999G   614G  1384G 30.74 1.35             osd.24
>>>>>>>  25  2.00000  1.00000  1999G   561G  1437G 28.11 1.24             osd.25
>>>>>>>  26  2.00000  1.00000  1999G   558G  1440G 27.93 1.23             osd.26
>>>>>>>  27  2.00000  1.00000  1999G   610G  1388G 30.52 1.34             osd.27
>>>>>>>  28  2.00000  1.00000  1999G   515G  1483G 25.81 1.14             osd.28
>>>>>>>  29  2.00000  1.00000  1999G   555G  1443G 27.78 1.22             osd.29
>>>>>>> -10        0        -      0      0      0     0    0         host or1010051251046
>>>>>>> -11        0        -      0      0      0     0    0         host or1010051251023
>>>>>>>               TOTAL 79793G 18126G 61566G 22.72
>>>>>>> MIN/MAX VAR: 0/1.55  STDDEV: 8.26
>>>>>>>
>>>>>>> Thanks
>>>>>>> Pardhiv karri
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> ceph-users mailing list
>>>>>>> ceph-users@lists.ceph.com
>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> *Pardhiv Karri*
>>>>> "Rise and Rise again until LAMBS become LIONS"
>>>>>
>>>
>>>
>>> --
>>> *Pardhiv Karri*
>>> "Rise and Rise again until LAMBS become LIONS"
>>>
>
>
> --
> *Pardhiv Karri*
> "Rise and Rise again until LAMBS become LIONS"
>

--
*Pardhiv Karri*
"Rise and Rise again until LAMBS become LIONS"
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com