I found a place to paste my output from `ceph daemon osd.xx config show` for all my OSDs:
https://www.zerobin.net/?743bbbdea41874f4#FNk5EjsfRxvkX1JuTp52fQ4CXW6VOIEB0Lj0Icnyr4Q=

If you want it as a gzip'd txt file, you can download it here:
https://mega.nz/#!oY5QAByC!JEWhHRms0WwbYbwG4o4RdTUWtFwFjUDLWhtNtEDhBkA

It honestly looks to me like the disks are maxing out on IOPS: a good portion of the disks were hitting 100% utilization according to dstat whenever there was rebalancing or client I/O. I'm running this to look at my disk stats:

dstat -cd --disk-util -D sda,sdb,sdc,sdd,sde,sdf,sdg,sdh --disk-tps

I don't have any client load on the cluster at this point to show any good output, but with just '11 active+clean+scrubbing+deep' running, I am seeing 70-80% disk utilization on each OSD according to dstat.

On Thu, Sep 3, 2015 at 2:34 AM, Jan Schermer <j...@schermer.cz> wrote:

> Can you post the output of
>
> ceph daemon osd.xx config show? (probably as an attachment)
>
> There are several things that I've seen cause it:
> 1) too many PGs but too few degraded objects make it seem "slow" (if you
> just have 2 degraded objects but restarted a host with 10K PGs, it will
> probably have to scan all the PGs)
> 2) sometimes the process gets stuck when a toofull condition occurs
> 3) sometimes the process gets stuck for no apparent reason - restarting
> the currently backfilling/recovering OSDs fixes it
> Setting osd_recovery_threads sometimes fixes both 2) and 3), but usually not.
> 4) setting recovery_delay_start to anything > 0 makes recovery slow (even
> 0.0000001 makes it much slower than a plain 0). On the other hand, we had
> to set it high as a default because of slow ops when restarting OSDs,
> which this partially fixed.
>
> Can you see any bottleneck in the system? CPU spinning, disks reading?
> I don't think this is the issue, just make sure it's not something more
> obvious...
>
> Jan
>
>
> On 02 Sep 2015, at 22:34, Bob Ababurko <b...@ababurko.net> wrote:
>
> When I lose a disk or replace an OSD in my POC Ceph cluster, it takes a
> very long time to rebalance. I should note that my cluster is slightly
> unique in that I am using CephFS (shouldn't matter?) and it currently
> contains about 310 million objects.
>
> The last time I replaced a disk/OSD was 2.5 days ago and it is still
> rebalancing. This is on a cluster with no client load.
>
> The configuration is 5 hosts, each with 6 x 1TB 7200rpm SATA OSDs and one
> 850 Pro SSD that holds the journals for those OSDs. That means 30 OSDs in
> total. The system disk is on its own drive. I'm also using a backend
> network with a single Gb NIC. The rebalancing rate (objects/s) seems to
> be very slow when it is close to finishing, say <1% objects misplaced.
>
> It doesn't seem right that it would take 2+ days to rebalance a 1TB disk
> with no load on the cluster. Are my expectations off?
>
> I'm not sure whether my pg_num/pgp_num needs to be changed, or whether
> the rebalance time depends on the number of objects in the pool. These
> are thoughts I've had but am not certain are relevant here.
>
> $ sudo ceph -v
> ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)
>
> $ sudo ceph -s
> [sudo] password for bababurko:
>     cluster f25cb23f-2293-4682-bad2-4b0d8ad10e79
>      health HEALTH_WARN
>             5 pgs backfilling
>             5 pgs stuck unclean
>             recovery 3046506/676638611 objects misplaced (0.450%)
>      monmap e1: 3 mons at {cephmon01=10.15.24.71:6789/0,cephmon02=10.15.24.80:6789/0,cephmon03=10.15.24.135:6789/0}
>             election epoch 20, quorum 0,1,2 cephmon01,cephmon02,cephmon03
>      mdsmap e6070: 1/1/1 up {0=cephmds01=up:active}, 1 up:standby
>      osdmap e4395: 30 osds: 30 up, 30 in; 5 remapped pgs
>       pgmap v3100039: 2112 pgs, 3 pools, 6454 GB data, 321 Mobjects
>             18319 GB used, 9612 GB / 27931 GB avail
>             3046506/676638611 objects misplaced (0.450%)
>                 2095 active+clean
>                   12 active+clean+scrubbing+deep
>                    5 active+remapped+backfilling
>   recovery io 2294 kB/s, 147 objects/s
>
> $ sudo rados df
> pool name               KB    objects  clones  degraded  unfound        rd        rd KB         wr       wr KB
> cephfs_data     6767569962  335746702       0         0        0   2136834            1  676984208  7052266742
> cephfs_metadata      42738    1058437       0         0        0  16130199  30718800215  295996938  3811963908
> rbd                      0          0       0         0        0         0            0          0           0
>   total used   19209068780  336805139
>   total avail  10079469460
>   total space  29288538240
>
> $ sudo ceph osd pool get cephfs_data pgp_num
> pg_num: 1024
> $ sudo ceph osd pool get cephfs_metadata pgp_num
> pg_num: 1024
>
> thanks,
> Bob
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
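
Since comparing full `config show` dumps across 30 OSDs by hand is tedious, here is a rough sketch of how just the recovery/backfill-related values Jan mentioned could be pulled from each OSD's admin socket on a host. It assumes the default admin socket path under /var/run/ceph and the standard option names in 0.94; adjust to your setup.

#!/bin/bash
# Sketch: print only the recovery/backfill settings from every OSD
# admin socket on this host (default socket location assumed).
for sock in /var/run/ceph/ceph-osd.*.asok; do
    echo "== ${sock} =="
    ceph daemon "${sock}" config show | \
        egrep 'osd_max_backfills|osd_recovery_max_active|osd_recovery_delay_start|osd_recovery_threads|osd_recovery_op_priority'
done

Run it on each OSD host; the settings most relevant to how fast a replaced 1TB OSD backfills are osd_max_backfills, osd_recovery_max_active, and osd_recovery_delay_start.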
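If it turns out the recovery throttles rather than the spindles are the limit, Jan's point about recovery_delay_start (and the related backfill settings) can be tested at runtime with injectargs. This is only a sketch with placeholder values, not a recommendation for 7200rpm SATA disks that are already at 70-80% utilization:

# Sketch only: relax the recovery throttles cluster-wide at runtime.
# Values are illustrative; verify the result with 'config show' as above,
# and revert to your previous values once the backfill has caught up.
ceph tell osd.* injectargs '--osd-recovery-delay-start 0'
ceph tell osd.* injectargs '--osd-max-backfills 2 --osd-recovery-max-active 4'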