Can you show a `ceph osd tree`?
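If it's easier than pasting it inline, piping it to termbin (which you already used for the `ceph -s` output) should work, assuming netcat is installed on that host:

    ceph osd tree | nc termbin.com 9999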
On 11/7/20 1:14 AM, seffyr...@gmail.com wrote:
I've inherited a Ceph Octopus cluster that seems to need urgent maintenance before
data loss starts to happen. I'm the guy with the most Ceph experience on hand, and
that's not saying much; I'm running into most of these ops and repair tasks for the
first time.
Ceph health output looks like this:
HEALTH_WARN Degraded data redundancy: 3640401/8801868 objects degraded (41.359%), 128 pgs degraded, 128 pgs undersized; 128 pgs not deep-scrubbed in time; 128 pgs not scrubbed in time
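If more detail would help, I'm happy to run and share the usual inspection commands; as far as I can tell these are the standard ones for the warnings above:

    ceph health detail       # per-PG breakdown of the degraded/undersized warnings
    ceph pg ls undersized    # list the undersized PGs with their up/acting OSD sets
    ceph osd df tree         # per-OSD utilisation, weights and placement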
`ceph -s` output: https://termbin.com/i06u
The CRUSH rule 'cephfs.media' is here: https://termbin.com/2klmq
So it seems like all PGs for the main pool are in a warning state. That pool is
erasure coded, 11 TiB across 4 OSDs, of which around 6.4 TiB is used. The Ceph
services themselves seem happy: they're stable and have quorum, and I can access
the web panel fine as well. The block devices are of different sizes and types
(two large spinners of different sizes, and two identical SSDs).
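If it matters, I can also pull the erasure-code profile behind the pool and the full rule definition. From what I've read, if the profile's k+m is larger than the number of OSDs (or failure domains) the rule can choose from, the PGs would stay undersized, so I assume this is relevant. I'm guessing at the pool name here, assuming it matches the rule:

    ceph osd pool ls detail                              # size/min_size, crush rule and EC profile per pool
    ceph osd pool get cephfs.media erasure_code_profile  # assuming the pool is also named cephfs.media
    ceph osd erasure-code-profile get <profile-name>     # shows k, m and crush-failure-domain
    ceph osd crush rule dump cephfs.media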
I would welcome any pointers on what steps to take to bring this back to full
health. If the PGs are undersized, can I simply add another block device/OSD? Or
would adjusting the config somewhere get it to rebalance successfully? (The
rebalance jobs have been stuck at 0% for weeks.)
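I can also query one of the stuck PGs directly if that helps; my understanding is that something like the following shows what a PG is waiting on (substituting a real PG id taken from `ceph pg ls undersized`):

    ceph pg <pgid> query     # the 'recovery_state' section should say why backfill hasn't started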
Thank you for your time reading this message.
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io