Hi Ceph Admins,

Last night our Ceph cluster reported all pools 100% full. This happened after
osd.56 (95% used) reached the OSD_FULL state.

Ceph version: 12.2.2
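
(For context, and please correct me if my understanding is wrong: in Luminous the
nearfull/backfillfull/full thresholds are cluster-wide ratios, so I believe they
can be inspected, and in an emergency temporarily raised, with something like the
commands below. The 0.96 value is only an example, not something we have set.)

# ceph osd dump | grep ratio
# ceph osd set-full-ratio 0.96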

Logs

2018-03-03 17:15:22.560710 mon.cephnode01 mon.0 10.212.32.18:6789/0 5224452
: cluster [ERR] overall HEALTH_ERR noscrub,nodeep-scrub flag(s) set; 1
backfillfull osd(s); 5 nearfull osd(s); 21 pool(s) backfillfull;
638551/287271738 objects misplaced (0.222%); Degraded data redundancy:
253066/287271738 objects degraded (0.088%), 25 pgs unclean; Degraded data
redundancy (low space): 25 pgs backfill_toofull
2018-03-03 17:15:42.513194 mon.cephnode01 mon.0 10.212.32.18:6789/0 5224515
: cluster [WRN] Health check update: 638576/287284518 objects misplaced
(0.222%) (OBJECT_MISPLACED)
2018-03-03 17:15:42.513256 mon.cephnode01 mon.0 10.212.32.18:6789/0 5224516
: cluster [WRN] Health check update: Degraded data redundancy:
253266/287284518 objects degraded (0.088%), 25 pgs unclean (PG_DEGRADED)
2018-03-03 17:15:44.684928 mon.cephnode01 mon.0 10.212.32.18:6789/0 5224524
: cluster [ERR] Health check failed: 1 full osd(s) (OSD_FULL)
2018-03-03 17:15:44.684969 mon.cephnode01 mon.0 10.212.32.18:6789/0 5224525
: cluster [WRN] Health check failed: 21 pool(s) full (POOL_FULL)
2018-03-03 17:15:44.684987 mon.cephnode01 mon.0 10.212.32.18:6789/0 5224526
: cluster [INF] Health check cleared: OSD_BACKFILLFULL (was: 1 backfillfull
osd(s))
2018-03-03 17:15:44.685013 mon.cephnode01 mon.0 10.212.32.18:6789/0 5224527
: cluster [INF] Health check cleared: POOL_BACKFILLFULL (was: 21 pool(s)
backfillfull)


# ceph df detail captured at the time the pools went full
GLOBAL:
    SIZE     AVAIL     RAW USED     %RAW USED     OBJECTS
    381T      102T         278T         73.05      38035k
POOLS:
    NAME                           ID     QUOTA OBJECTS     QUOTA BYTES       USED     %USED   MAX AVAIL     OBJECTS     DIRTY      READ     WRITE   RAW USED
    rbd                            0      N/A               N/A                  0         0           0           0         0         1      134k          0
    vms                            1      N/A               N/A                  0         0           0           0         0         0         0          0
    images                         2      N/A               N/A              7659M    100.00           0        1022      1022      110k      5668     22977M
    volumes                        3      N/A               N/A             40991G    100.00           0    10514980    10268k     3404M     4087M       120T
    .rgw.root                      4      N/A               N/A               1588    100.00           0           4         4      402k         4       4764
    default.rgw.control            5      N/A               N/A                  0         0           0           8         8         0         0          0
    default.rgw.data.root          6      N/A               N/A              94942    100.00           0         339       339      257k      6422       278k
    default.rgw.gc                 7      N/A               N/A                  0         0           0          32        32     3125M     7410k          0
    default.rgw.log                8      N/A               N/A                  0         0           0         186       186    27222k    18146k          0
    default.rgw.users.uid          9      N/A               N/A               4252    100.00           0          17        17      262k     64561      12756
    default.rgw.usage              10     N/A               N/A                  0         0           0           8         8      332k      665k          0
    default.rgw.users.email        11     N/A               N/A                 87    100.00           0           4         4         0         4        261
    default.rgw.users.keys         12     N/A               N/A                206    100.00           0          11        11       459        23        618
    default.rgw.users.swift        13     N/A               N/A                 40    100.00           0           3         3         0         3        120
    default.rgw.buckets.index      14     N/A               N/A                  0         0           0         210       210      321M    41709k          0
    default.rgw.buckets.non-ec     16     N/A               N/A                  0         0           0         114       114     18006     12055          0
    default.rgw.buckets.extra      17     N/A               N/A                  0         0           0           0         0         0         0          0
    .rgw.buckets.extra             18     N/A               N/A                  0         0           0           0         0         0         0          0
    default.rgw.buckets.data       20     N/A               N/A               104T    100.00           0    28334451    27670k      160M      156M       156T
    benchmark_replicated           21     N/A               N/A             87136M    100.00           0       21792     21792     1450k     4497k       255G
    benchmark_erasure_coded        22     N/A               N/A               292G    100.00           0       74779     74779     61288      680k       439G
#


What we did to reclaim some space:
- deleted the two benchmark pools
- reweighted the full osd.56 from 1.0 to 0.85
- added a new node, cephnode10 (the cluster has grown from 9 to 10 nodes, but I
  had to crush-reweight the new OSDs down to 0 because a lot of slow requests
  (3000+) appeared and customer IOPS dropped badly; I am now bringing the new
  OSDs in one at a time - rough commands below)
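
The commands involved were roughly the following (weights as above, from memory):

# ceph osd reweight 56 0.85
# ceph osd crush reweight osd.108 0          (repeated for osd.109 ... osd.119)
# ceph osd crush reweight osd.108 3.63640    (then restoring full weight one OSD at a time)

I assume backfill pressure could also be capped further via the osd_max_backfills
setting if needed.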

Current status

# ceph -s
  cluster:
    id:     1023c49f-3a10-42de-9f62-9b122db32f1f
    health: HEALTH_ERR
            noscrub,nodeep-scrub flag(s) set
            5 nearfull osd(s)
            19 pool(s) nearfull
            16151257/286563963 objects misplaced (5.636%)
            Degraded data redundancy: 20949/286563963 objects degraded
(0.007%), 431 pgs unclean, 28 pgs degraded, 1 pg undersized
            Degraded data redundancy (low space): 15 pgs backfill_toofull

  services:
    mon: 3 daemons, quorum cephnode01,cephnode02,cephnode03
    mgr: cephnode02(active), standbys: cephnode03, cephnode01
    osd: 120 osds: 117 up, 117 in; 405 remapped pgs
         flags noscrub,nodeep-scrub
    rgw: 3 daemons active

  data:
    pools:   19 pools, 3760 pgs
    objects: 37941k objects, 144 TB
    usage:   278 TB used, 146 TB / 425 TB avail
    pgs:     20949/286563963 objects degraded (0.007%)
             16151257/286563963 objects misplaced (5.636%)
             3329 active+clean
             370  active+remapped+backfill_wait
             26   active+recovery_wait+degraded
             18   active+remapped+backfilling
             15   active+remapped+backfill_wait+backfill_toofull
             1    active+recovery_wait+degraded+remapped
             1    active+undersized+degraded+remapped+backfilling

  io:
    client:   18337 B/s rd, 29269 kB/s wr, 1 op/s rd, 234 op/s wr
    recovery: 946 MB/s, 243 objects/s
#
# ceph df detail
GLOBAL:
    SIZE     AVAIL     RAW USED     %RAW USED     OBJECTS
    425T      146T         278T         65.50      37941k
POOLS:
    NAME                           ID     QUOTA OBJECTS     QUOTA BYTES       USED     %USED   MAX AVAIL     OBJECTS     DIRTY      READ     WRITE   RAW USED
    rbd                            0      N/A               N/A                  0         0       7415G           0         0         1      134k          0
    vms                            1      N/A               N/A                  0         0       7415G           0         0         0         0          0
    images                         2      N/A               N/A              7659M      0.10       7415G        1022      1022      110k      5668     22445M
    volumes                        3      N/A               N/A             40992G     84.68       7415G    10515231    10268k     3416M     4090M       120T
    .rgw.root                      4      N/A               N/A               1588         0       7415G           4         4      141k         4       4764
    default.rgw.control            5      N/A               N/A                  0         0       7415G           8         8         0         0          0
    default.rgw.data.root          6      N/A               N/A              94942         0       7415G         339       339      257k      6422       278k
    default.rgw.gc                 7      N/A               N/A                  0         0       7415G          32        32     3125M     7430k          0
    default.rgw.log                8      N/A               N/A                  0         0       7415G         186       186    27249k    18164k          0
    default.rgw.users.uid          9      N/A               N/A               4252         0       7415G          17        17      263k     64577      12756
    default.rgw.usage              10     N/A               N/A                  0         0       7415G           8         8      332k      665k          0
    default.rgw.users.email        11     N/A               N/A                 87         0       7415G           4         4         0         4        261
    default.rgw.users.keys         12     N/A               N/A                206         0       7415G          11        11       483        23        580
    default.rgw.users.swift        13     N/A               N/A                 40         0       7415G           3         3         0         3        120
    default.rgw.buckets.index      14     N/A               N/A                  0         0       7415G         210       210      321M    41709k          0
    default.rgw.buckets.non-ec     16     N/A               N/A                  0         0       7415G         114       114     18006     12055          0
    default.rgw.buckets.extra      17     N/A               N/A                  0         0       7415G           0         0         0         0          0
    .rgw.buckets.extra             18     N/A               N/A                  0         0       7415G           0         0         0         0          0
    default.rgw.buckets.data       20     N/A               N/A               104T     87.85      14831G    28334711    27670k      160M      156M       157T
#


The most utilized pools are volumes (a replicated pool) and
default.rgw.buckets.data (an EC pool, k=6, m=3):
pool 3 'volumes' replicated size 3 min_size 2 crush_rule 0 object_hash
rjenkins pg_num 1024 pgp_num 1024 last_change 10047 flags
hashpspool,backfillfull stripe_width 0 application rbd
removed_snaps [1~3]
pool 20 'default.rgw.buckets.data' erasure size 9 min_size 6 crush_rule 1
object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 10047 flags
hashpspool,backfillfull stripe_width 4224 application rgw
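
Both pools are at pg_num/pgp_num 1024. If increasing them turns out to be the
right call (see my question at the end), I assume it would be done in small steps
with something like the following, where 2048 is just a placeholder target:

# ceph osd pool set volumes pg_num 2048
# ceph osd pool set volumes pgp_num 2048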

CRUSH rules for the above pools:
# rules
rule replicated_ruleset {
        id 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type rack # !!! rack as failure domain
        step emit
}
rule ec_rule_k6_m3 {
        id 1
        type erasure
        min_size 3
        max_size 9
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default
        step chooseleaf indep 0 type host # !!! host as failure domain
        step emit
}
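
For reference, this is how I plan to sanity-check how the EC rule maps PGs onto
the hosts (just a sketch, output not included here):

# ceph osd getcrushmap -o crushmap.bin
# crushtool -i crushmap.bin --test --rule 1 --num-rep 9 --show-utilization
# crushtool -i crushmap.bin --test --rule 1 --num-rep 9 --show-bad-mappings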

And finally, the cluster topology:

# ceph osd df tree
ID  CLASS WEIGHT    REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS TYPE NAME

 -1       392.72797        -   425T   278T   146T 65.51 1.00   - root default
 -6       392.72797        -   425T   278T   146T 65.51 1.00   -     region region01
 -5       392.72797        -   425T   278T   146T 65.51 1.00   -         datacenter dc01
 -4       392.72797        -   425T   278T   146T 65.51 1.00   -             room room01
 -8        43.63699        - 44684G 31703G 12980G 70.95 1.08   -                 rack rack01
 -7        43.63699        - 44684G 31703G 12980G 70.95 1.08   -                     host cephnode01
  0   hdd   3.63599  1.00000  3723G  2957G   765G 79.43 1.21 178                         osd.0
  2   hdd   3.63599  1.00000  3723G  2407G  1315G 64.66 0.99 157                         osd.2
  4   hdd   3.63599  1.00000  3723G  2980G   742G 80.05 1.22 184                         osd.4
  6   hdd   3.63599  1.00000  3723G  2768G   955G 74.34 1.13 170                         osd.6
  8   hdd   3.63599  1.00000  3723G  2704G  1019G 72.62 1.11 172                         osd.8
 11   hdd   3.63599  1.00000  3723G  2899G   824G 77.87 1.19 181                         osd.11
 12   hdd   3.63599  1.00000  3723G  2788G   935G 74.89 1.14 183                         osd.12
 14   hdd   3.63599  1.00000  3723G  2139G  1584G 57.44 0.88 139                         osd.14
 16   hdd   3.63599  1.00000  3723G  2672G  1050G 71.78 1.10 174                         osd.16
 18   hdd   3.63599  1.00000  3723G  2575G  1148G 69.17 1.06 166                         osd.18
 20   hdd   3.63599  1.00000  3723G  2395G  1328G 64.33 0.98 149                         osd.20
 22   hdd   3.63599  1.00000  3723G  2414G  1309G 64.83 0.99 161                         osd.22
 -3        43.63699        - 44684G 32329G 12354G 72.35 1.10   -                 rack rack02
 -2        43.63699        - 44684G 32329G 12354G 72.35 1.10   -                     host cephnode02
  1   hdd   3.63599  1.00000  3723G  2874G   848G 77.21 1.18 172                         osd.1
  3   hdd   3.63599  1.00000  3723G  3287G   436G 88.27 1.35 190                         osd.3
  5   hdd   3.63599  1.00000  3723G  2588G  1135G 69.50 1.06 151                         osd.5
  7   hdd   3.63599  1.00000  3723G  2566G  1156G 68.94 1.05 156                         osd.7
  9   hdd   3.63599  1.00000  3723G  2481G  1242G 66.65 1.02 164                         osd.9
 10   hdd   3.63599  1.00000  3723G  2622G  1101G 70.43 1.08 156                         osd.10
 13   hdd   3.63599  1.00000  3723G  2498G  1225G 67.08 1.02 150                         osd.13
 15   hdd   3.63599  1.00000  3723G  2664G  1058G 71.56 1.09 167                         osd.15
 17   hdd   3.63599  1.00000  3723G  2510G  1213G 67.42 1.03 163                         osd.17
 19   hdd   3.63599  1.00000  3723G  2562G  1161G 68.82 1.05 162                         osd.19
 21   hdd   3.63599  1.00000  3723G  2683G  1040G 72.05 1.10 169                         osd.21
 23   hdd   3.63599  1.00000  3723G  2989G   734G 80.28 1.23 169                         osd.23
-10        43.63699        - 44684G 32556G 12128G 72.86 1.11   -                 rack rack03
 -9        43.63699        - 44684G 32556G 12128G 72.86 1.11   -                     host cephnode03
 24   hdd   3.63599  1.00000  3723G  2757G   966G 74.05 1.13 155                         osd.24
 25   hdd   3.63599  1.00000  3723G  3003G   720G 80.66 1.23 186                         osd.25
 26   hdd   3.63599  1.00000  3723G  2494G  1229G 66.98 1.02 168                         osd.26
 28   hdd   3.63599  1.00000  3723G  3021G   701G 81.15 1.24 180                         osd.28
 30   hdd   3.63599  1.00000  3723G  2554G  1169G 68.60 1.05 164                         osd.30
 32   hdd   3.63599  1.00000  3723G  2060G  1662G 55.34 0.84 147                         osd.32
 34   hdd   3.63599  1.00000  3723G  3131G   592G 84.08 1.28 181                         osd.34
 36   hdd   3.63599  1.00000  3723G  2512G  1211G 67.47 1.03 162                         osd.36
 38   hdd   3.63599  1.00000  3723G  2408G  1315G 64.68 0.99 157                         osd.38
 40   hdd   3.63599  1.00000  3723G  2997G   726G 80.49 1.23 194                         osd.40
 42   hdd   3.63599  1.00000  3723G  2645G  1078G 71.05 1.08 161                         osd.42
 44   hdd   3.63599  1.00000  3723G  2969G   754G 79.74 1.22 173                         osd.44
-12        43.63699        - 44684G 32504G 12179G 72.74 1.11   -                 rack rack04
-11        43.63699        - 44684G 32504G 12179G 72.74 1.11   -                     host cephnode04
 27   hdd   3.63599  1.00000  3723G  2947G   775G 79.16 1.21 186                         osd.27
 29   hdd   3.63599  1.00000  3723G  3095G   628G 83.13 1.27 175                         osd.29
 31   hdd   3.63599  1.00000  3723G  2514G  1209G 67.52 1.03 163                         osd.31
 33   hdd   3.63599  1.00000  3723G  2557G  1166G 68.68 1.05 160                         osd.33
 35   hdd   3.63599  1.00000  3723G  3215G   508G 86.35 1.32 183                         osd.35
 37   hdd   3.63599  1.00000  3723G  2455G  1268G 65.93 1.01 151                         osd.37
 39   hdd   3.63599  1.00000  3723G  2335G  1387G 62.73 0.96 155                         osd.39
 41   hdd   3.63599  1.00000  3723G  2774G   949G 74.51 1.14 165                         osd.41
 43   hdd   3.63599  1.00000  3723G  2764G   959G 74.24 1.13 169                         osd.43
 45   hdd   3.63599  1.00000  3723G  2553G  1169G 68.59 1.05 163                         osd.45
 46   hdd   3.63599  1.00000  3723G  2645G  1077G 71.06 1.08 167                         osd.46
 47   hdd   3.63599  1.00000  3723G  2644G  1079G 71.02 1.08 156                         osd.47
-14        39.99585        - 33513G 27770G  5742G 82.86 1.26   -                 rack rack05
-13        39.99585        - 33513G 27770G  5742G 82.86 1.26   -                     host cephnode05
 48   hdd   3.63599  0.90002  3723G  3310G   413G 88.89 1.36 211                         osd.48
 49   hdd   3.63599  0.80005  3723G  3029G   694G 81.36 1.24 182                         osd.49
 50   hdd   3.63599  0.85004  3723G  2918G   804G 78.38 1.20 167                         osd.50
 51   hdd   3.63599  0.85004  3723G  3103G   620G 83.33 1.27 186                         osd.51
 52   hdd         0        0      0      0      0     0    0   0                         osd.52
 53   hdd   3.63599        0      0      0      0     0    0   0                         osd.53
 54   hdd   3.63599        0      0      0      0     0    0   0                         osd.54
 55   hdd   3.63599  0.85004  3723G  3003G   720G 80.65 1.23 178                         osd.55
 56   hdd   3.63599  0.84999  3723G  3347G   376G 89.89 1.37 189                         osd.56
 57   hdd   3.63599  0.75006  3723G  2707G  1016G 72.71 1.11 161                         osd.57
 58   hdd   3.63599  0.80005  3723G  3228G   495G 86.71 1.32 186                         osd.58
 59   hdd   3.63599  0.80005  3723G  3122G   601G 83.85 1.28 194                         osd.59
-16        43.63699        - 44684G 33402G 11281G 74.75 1.14   -                 rack rack06
-15        43.63699        - 44684G 33402G 11281G 74.75 1.14   -                     host cephnode06
 60   hdd   3.63599  1.00000  3723G  2317G  1406G 62.22 0.95 149                         osd.60
 61   hdd   3.63599  1.00000  3723G  3039G   684G 81.62 1.25 183                         osd.61
 62   hdd   3.63599  1.00000  3723G  2945G   778G 79.09 1.21 189                         osd.62
 63   hdd   3.63599  1.00000  3723G  2923G   800G 78.50 1.20 166                         osd.63
 64   hdd   3.63599  1.00000  3723G  3057G   665G 82.11 1.25 180                         osd.64
 65   hdd   3.63599  1.00000  3723G  2989G   733G 80.30 1.23 170                         osd.65
 66   hdd   3.63599  1.00000  3723G  2764G   959G 74.25 1.13 166                         osd.66
 67   hdd   3.63599  1.00000  3723G  2811G   912G 75.50 1.15 175                         osd.67
 68   hdd   3.63599  1.00000  3723G  1785G  1938G 47.95 0.73 139                         osd.68
 69   hdd   3.63599  1.00000  3723G  2744G   979G 73.69 1.12 159                         osd.69
 70   hdd   3.63599  1.00000  3723G  3068G   655G 82.40 1.26 178                         osd.70
 71   hdd   3.63599  1.00000  3723G  2956G   767G 79.40 1.21 174                         osd.71
-18        43.63699        - 44684G 33524G 11159G 75.03 1.15   -                 rack rack07
-17        43.63699        - 44684G 33524G 11159G 75.03 1.15   -                     host cephnode07
 72   hdd   3.63599  1.00000  3723G  2901G   822G 77.91 1.19 178                         osd.72
 73   hdd   3.63599  1.00000  3723G  2612G  1110G 70.16 1.07 168                         osd.73
 74   hdd   3.63599  1.00000  3723G  2870G   853G 77.09 1.18 172                         osd.74
 75   hdd   3.63599  1.00000  3723G  2813G   910G 75.56 1.15 169                         osd.75
 76   hdd   3.63599  1.00000  3723G  2861G   862G 76.85 1.17 170                         osd.76
 77   hdd   3.63599  1.00000  3723G  2807G   916G 75.39 1.15 168                         osd.77
 78   hdd   3.63599  1.00000  3723G  2678G  1045G 71.92 1.10 156                         osd.78
 79   hdd   3.63599  1.00000  3723G  2556G  1166G 68.67 1.05 160                         osd.79
 80   hdd   3.63599  1.00000  3723G  3082G   640G 82.79 1.26 190                         osd.80
 81   hdd   3.63599  1.00000  3723G  2418G  1305G 64.94 0.99 144                         osd.81
 82   hdd   3.63599  1.00000  3723G  2881G   841G 77.39 1.18 161                         osd.82
 83   hdd   3.63599  1.00000  3723G  3039G   683G 81.64 1.25 175                         osd.83
-20        90.91017        -   130T 61630G 72421G 45.98 0.70   -                 rack rack08
-19        43.63699        - 44684G 30861G 13823G 69.06 1.05   -                     host cephnode08
 84   hdd   3.63599  1.00000  3723G  2532G  1190G 68.02 1.04 157                         osd.84
 85   hdd   3.63599  1.00000  3723G  2518G  1205G 67.64 1.03 166                         osd.85
 86   hdd   3.63599  1.00000  3723G  2504G  1219G 67.25 1.03 151                         osd.86
 87   hdd   3.63599  1.00000  3723G  2698G  1024G 72.47 1.11 161                         osd.87
 88   hdd   3.63599  1.00000  3723G  2527G  1196G 67.87 1.04 147                         osd.88
 89   hdd   3.63599  1.00000  3723G  2508G  1215G 67.36 1.03 142                         osd.89
 90   hdd   3.63599  1.00000  3723G  2317G  1406G 62.24 0.95 142                         osd.90
 91   hdd   3.63599  1.00000  3723G  2582G  1140G 69.36 1.06 147                         osd.91
 92   hdd   3.63599  1.00000  3723G  2656G  1066G 71.35 1.09 144                         osd.92
 93   hdd   3.63599  1.00000  3723G  2448G  1275G 65.74 1.00 154                         osd.93
 94   hdd   3.63599  1.00000  3723G  2783G   939G 74.76 1.14 163                         osd.94
 95   hdd   3.63599  1.00000  3723G  2782G   941G 74.73 1.14 152                         osd.95
-21        43.63678        - 44684G 30331G 14353G 67.88 1.04   -                     host cephnode09
 96   hdd   3.63640  1.00000  3723G  3003G   719G 80.67 1.23 161                         osd.96
 97   hdd   3.63640  1.00000  3723G  2581G  1142G 69.32 1.06 151                         osd.97
 98   hdd   3.63640  1.00000  3723G  2118G  1605G 56.88 0.87 140                         osd.98
 99   hdd   3.63640  1.00000  3723G  2926G   796G 78.60 1.20 165                         osd.99
100   hdd   3.63640  1.00000  3723G  2492G  1231G 66.92 1.02 149                         osd.100
101   hdd   3.63640  1.00000  3723G  2605G  1117G 69.98 1.07 165                         osd.101
102   hdd   3.63640  1.00000  3723G  2159G  1563G 58.01 0.89 141                         osd.102
103   hdd   3.63640  1.00000  3723G  2328G  1395G 62.53 0.95 146                         osd.103
104   hdd   3.63640  1.00000  3723G  2624G  1099G 70.48 1.08 163                         osd.104
105   hdd   3.63640  1.00000  3723G  2582G  1141G 69.34 1.06 142                         osd.105
106   hdd   3.63640  1.00000  3723G  2401G  1322G 64.48 0.98 161                         osd.106
107   hdd   3.63640  1.00000  3723G  2507G  1216G 67.33 1.03 159                         osd.107
-43         3.63640        - 44684G   438G 44245G  0.98 0.01   -                     host cephnode10  ## Added after cluster pools got full
108   hdd   3.63640  1.00000  3723G 51915M  3672G  1.36 0.02  36                         osd.108
109   hdd         0  1.00000  3723G 72735M  3652G  1.91 0.03   4                         osd.109
110   hdd         0  1.00000  3723G 36948M  3687G  0.97 0.01   2                         osd.110
111   hdd         0  1.00000  3723G 37043M  3687G  0.97 0.01   2                         osd.111
112   hdd         0  1.00000  3723G 72382M  3652G  1.90 0.03   4                         osd.112
113   hdd         0  1.00000  3723G 54850M  3670G  1.44 0.02   3                         osd.113
114   hdd         0  1.00000  3723G 36664M  3687G  0.96 0.01   2                         osd.114
115   hdd         0  1.00000  3723G 36087M  3688G  0.95 0.01   2                         osd.115
116   hdd         0  1.00000  3723G 12066M  3711G  0.32 0.00   0                         osd.116
117   hdd         0  1.00000  3723G 36793M  3687G  0.96 0.01   2                         osd.117
118   hdd         0  1.00000  3723G   775M  3722G  0.02    0   0                         osd.118
119   hdd         0  1.00000  3723G   760M  3722G  0.02    0   0                         osd.119
                       TOTAL   425T   278T   146T 65.51

MIN/MAX VAR: 0/1.37  STDDEV: 23.07
#

I'm wondering why a single full OSD made all of the cluster's pools full. Does
the OSD_FULL state stop write operations to all OSDs on the node where the full
OSD resides, or only to the affected OSD?
Should pg_num/pgp_num be increased to get better data balancing across all
OSDs?
Why is there only 7415G MAX AVAIL for the volumes pool and 14831G for the
default.rgw.buckets.data pool, while the cluster-wide %RAW USED is only 65.50?
Is it somehow related to the badly balanced node cephnode05 (highly utilized
OSDs) and the fact that K+M of the EC pool was equal to the number of nodes in
the cluster?
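
My own back-of-the-envelope, assuming MAX AVAIL is derived from the projected
headroom of the fullest OSD(s) divided by the pool's redundancy overhead (my
understanding only, not verified):

volumes (replicated, size=3):        7415G  x 3    ~= 22245G projected raw headroom
default.rgw.buckets.data (EC 6+3):   14831G x 9/6  ~= 22246G projected raw headroom

Both pools point at roughly the same ~22T of raw headroom, well below the 146T
AVAIL in the GLOBAL line, which suggests to me that the limit comes from the most
utilized OSDs (mostly on cephnode05) rather than from the cluster-wide average.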

Best Regards
Jakub