On Sun, Mar 17, 2013 at 8:31 PM, Gregory Farnum <g...@inktank.com> wrote:
> On Sunday, March 17, 2013 at 9:25 AM, Andrey Korolyov wrote:
>> On Sun, Mar 17, 2013 at 8:14 PM, Gregory Farnum <g...@inktank.com> wrote:
>> > On Sunday, March 17, 2013 at 9:09 AM, Andrey Korolyov wrote:
>> > > On Sun, Mar 17, 2013 at 7:56 PM, Gregory Farnum <g...@inktank.com> wrote:
>> > > > On Sunday, March 17, 2013 at 4:46 AM, Andrey Korolyov wrote:
>> > > > > Hi,
>> > > > >
>> > > > > from osd tree:
>> > > > >
>> > > > > -16 4.95 host 10.5.0.52
>> > > > > 32 1.9 osd.32 up 2
>> > > > > 33 1.05 osd.33 up 1
>> > > > > 34 1 osd.34 up 1
>> > > > > 35 1 osd.35 up 1
>> > > > >
>> > > > > df -h:
>> > > > > /dev/sdd3 3.7T 595G 3.1T 16% /var/lib/ceph/osd/32
>> > > > > /dev/sde3 3.7T 332G 3.4T 9% /var/lib/ceph/osd/33
>> > > > > /dev/sdf3 3.7T 322G 3.4T 9% /var/lib/ceph/osd/34
>> > > > > /dev/sdg3 3.7T 320G 3.4T 9% /var/lib/ceph/osd/35
>> > > > >
>> > > > > -10 2 host 10.5.0.32
>> > > > > 18 1 osd.18 up 1
>> > > > > 26 1 osd.26 up 1
>> > > > >
>> > > > > df -h:
>> > > > > /dev/sda2 926G 417G 510G 45% /var/lib/ceph/osd/18
>> > > > > /dev/sdb2 926G 431G 496G 47% /var/lib/ceph/osd/26
>> > > > >
>> > > > > Since the osds on 10.5.0.32 almost surely do not contain
>> > > > > garbage bytes, there seems to be some weirdness in the
>> > > > > placement. Crush rules are almost default; there is no
>> > > > > adjustment by node subsets. Any thoughts will be appreciated!
>> > > >
>> > > > Do you have any other nodes? What does the rest of your osd tree
>> > > > look like?
>> > > >
>> > > > I do note that at a first glance, you've got 1569GB in 10.5.0.52 and
>> > > > 848GB in 10.5.0.32, which is a 1.85 differential when you'd really
>> > > > like a ~2.5 differential (based on the very odd CRUSH weights you've
>> > > > assigned to each device, and the hosts). I suspect/hope you've also
>> > > > got something weird going on with the rest of your interior nodes (not
>> > > > pictured here), but perhaps not. Either way I'd recommend fixing
>> > > > up the rest of your weights and seeing if that improves the
>> > > > distribution.
>> > >
>> > > Nope, all other osds have weight one (and each host contains two osds;
>> > > this many-disk system is an experimental one). This host had round
>> > > values until recently; I've just changed the weights a bit to test the
>> > > speed of data rearrangement. The problem has existed since 10.5.0.52
>> > > entered the data placement with default "1" osd weights.
>> >
>> > So you had them all set to weight 1 for a while, despite the disks having
>> > very different sizes. That would give them very different utilization
>> > percentages (with the same absolute usage) like you've shown here and is
>> > expected behavior. Weight them according to size if you want them to fill
>> > up at the same rate.
>>
>> Yes, but in my case absolute usage values are different too - that's
>> why I thought that something is not right.
>
> With your current crush map they have to be: you've got a node with 2 disks
> totaling ~2TB and a weight of 2 compared to a node with 4 disks totaling
> ~15TB and a weight of ~5. That's not the right modifier to keep their
> absolute usages the same!
> And of course it's all probabilities: your usages might be off by a bit and
> generally will converge as you add more data into the cluster.

Nice, thanks for the explanation! Anyway, it is a bit counterintuitive when
one does not keep disk sizes in mind and expects data to spread exactly as
the weights do.

> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
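
For anyone who wants to redo the arithmetic from this thread, here is a minimal Python sketch. It only re-derives the two ratios Greg mentions (observed usage vs. CRUSH weight) from the osd tree and df -h figures quoted above, plus the raw capacity ratio for comparison. The dictionary layout and the helper name `ratios` are made up for illustration and are not part of Ceph; the disk sizes are the rounded values df printed.

```python
# Figures copied from the osd tree and df -h output quoted in this thread.
# The data layout and the helper below are illustrative only, not a Ceph API.
hosts = {
    "10.5.0.52": {
        "crush_weight": 4.95,               # host weight from osd tree
        "used_gb": [595, 332, 322, 320],    # "Used" column of df -h
        "size_gb": [3700, 3700, 3700, 3700] # 3.7T per disk, as rounded by df
    },
    "10.5.0.32": {
        "crush_weight": 2.0,
        "used_gb": [417, 431],
        "size_gb": [926, 926],
    },
}


def ratios(hosts, a, b):
    """Compare host a against host b.

    Returns (observed usage ratio, CRUSH weight ratio, raw capacity ratio).
    """
    observed = sum(hosts[a]["used_gb"]) / sum(hosts[b]["used_gb"])
    by_weight = hosts[a]["crush_weight"] / hosts[b]["crush_weight"]
    by_capacity = sum(hosts[a]["size_gb"]) / sum(hosts[b]["size_gb"])
    return observed, by_weight, by_capacity


observed, by_weight, by_capacity = ratios(hosts, "10.5.0.52", "10.5.0.32")
print(f"observed usage ratio: {observed:.2f}")
print(f"CRUSH weight ratio:   {by_weight:.3f}")
print(f"raw capacity ratio:   {by_capacity:.2f}")
```

The usage ratio follows the weight ratio only approximately, since placement is probabilistic, and neither is close to the capacity ratio. That is why the fill percentages differ so much until the weights are made proportional to disk size.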