Hi cephers,
I noticed something I don't understand about Ceph's behavior when adding
an OSD. When I start with a clean cluster (all PGs active+clean) and
add an OSD (via ceph-deploy, for example), the CRUSH map gets updated,
PGs get reassigned to different OSDs, and the new OSD starts getting
filled with data. As the new OSD fills up, I start seeing PGs in
degraded states. Here is an example:
pgmap v52068792: 42496 pgs, 6 pools, 1305 TB data, 390 Mobjects
      3164 TB used, 781 TB / 3946 TB avail
      *8017/994261437 objects degraded (0.001%)*
      2220581/994261437 objects misplaced (0.223%)
          42393 active+clean
             91 active+remapped+wait_backfill
              9 active+clean+scrubbing+deep
             *1 active+recovery_wait+degraded*
              1 active+clean+scrubbing
              1 active+remapped+backfilling
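(For reference, the summary above is the pgmap section of ceph -s; something
like this pulls out the same counters plus the per-PG detail:)

    # summary counters while backfill runs
    ceph -s | grep -E 'degraded|misplaced'
    # per-PG detail for anything that is not active+clean
    # (note: this also hides clean PGs that are merely scrubbing)
    ceph pg dump pgs_brief 2>/dev/null | grep -v 'active+clean'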
Any ideas why there would be persistent degradation in the cluster
while the newly added drive is being filled? It takes perhaps a day or
two to fill the drive, and during all that time the cluster appears to be
running degraded. As data is written to the cluster, the number of
degraded objects keeps increasing. Once the newly added OSD is
filled, the cluster comes back to clean again.
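(For anyone who wants to watch the count climb, something as simple as this
is enough to sample the counters over time:)

    # sample the degraded/misplaced counters once a minute during backfill
    while true; do date; ceph pg stat; sleep 60; done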
Here is the PG that is degraded in the output above:
7.87c 1 0 2 0 0 4194304 7 7
active+recovery_wait+degraded 2017-06-20 14:12:44.119921 344610'7
583572:2797 [402,521] 402 [402,521] 402 344610'7
2017-06-16 06:04:55.822503 344610'7 2017-06-16 06:04:55.822503
The newly added OSD here is 521. Before it was added, this PG had two
clean replicas - did one of them somehow get forgotten?
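(I haven't picked apart the peering details for this PG yet; something like
the query below should show why it considers a copy missing - the
"recovery_state" and "info" sections are the interesting parts:)

    # full peering/recovery detail for the degraded PG
    ceph pg 7.87c query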
Other remapped PGs have 521 in their "up" set but still keep the two
existing copies in their "acting" set, and no degradation is shown for
those (see the quick check after the examples below).
Examples:
2.f24 14282 0 16 28564 0 51014850801 3102 3102
active+remapped+wait_backfill 2017-06-20 14:12:42.650308
583553'2033479 583573:2033266 [467,521] 467 [467,499] 467
582430'2033337 2017-06-16 09:08:51.055131 582036'2030837
2017-05-31 20:37:54.831178
6.2b7d 10499 0 140 20998 0 37242874687 3673 3673
active+remapped+wait_backfill 2017-06-20 14:12:42.070019
583569'165163 583572:342128 [541,37,521] 541 [541,37,532]
541 582430'161890 2017-06-18 09:42:49.148402 582430'161890
2017-06-18 09:42:49.148402
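(A minimal way to list every PG whose up set differs from its acting set,
assuming the pgs_brief column order of pg_stat / state / up / up_primary /
acting / acting_primary:)

    # PGs where the up set differs from the acting set
    ceph pg dump pgs_brief 2>/dev/null | awk '$3 != $5 {print $1, $2, $3, $5}'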
We are running the latest Jewel patch level everywhere (10.2.7). Any
insights would be appreciated.
Andras