Hi Peter,

Thanks a lot for the reply. Please find the 'ceph osd df' output below:
# ceph osd df
ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS
 2 0.04399  1.00000 46056M 35576k 46021M  0.08 0.00   0
 1 0.04399  1.00000 46056M 40148k 46017M  0.09 0.00 384
 0 0.04399  1.00000 46056M 43851M  2205M 95.21 2.99 192
 0 0.04399  1.00000 46056M 43851M  2205M 95.21 2.99 192
 1 0.04399  1.00000 46056M 40148k 46017M  0.09 0.00 384
 2 0.04399  1.00000 46056M 35576k 46021M  0.08 0.00   0
              TOTAL 134G   43925M 94244M 31.79
MIN/MAX VAR: 0.00/2.99  STDDEV: 44.85

I set up this cluster by manipulating the CRUSH map via the CLI. I originally
had a default root, and my impression was that, since every rack sat under
that single root bucket, the entire cluster was being marked down when one of
the OSDs hit 95% full. So I removed the root bucket, but that did not help
either; no CRUSH rule refers to a root bucket in the case above.

Yes, I added one OSD under two racks by linking its host bucket from one rack
into the other, using the following command:

  osd crush link <name> <args> [<args>...] : link existing entry for <name>
                                             under location <args>
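For reference, this is roughly the sequence I used to build the hierarchy and
the rules (a sketch using the bucket and rule names from the outputs quoted
below, not a verbatim command history; the same steps were repeated for
ip-10-0-9-210 and ip-10-0-9-146):

  # a per-node rack holding only that node's host bucket
  ceph osd crush add-bucket ip-10-0-9-122-rack rack
  ceph osd crush move ip-10-0-9-122 rack=ip-10-0-9-122-rack

  # a shared rack for the replicated rule, with each host linked in as well
  ceph osd crush add-bucket rep-rack rack
  ceph osd crush link ip-10-0-9-122 rack=rep-rack

  # one rule replicating across hosts under rep-rack, one "local" rule per node
  ceph osd crush rule create-simple rep_ruleset rep-rack host firstn
  ceph osd crush rule create-simple ip-10-0-9-122_ruleset ip-10-0-9-122-rack host firstn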
On Thu, Aug 10, 2017 at 1:40 PM, Peter Maloney
<peter.malo...@brockmann-consult.de> wrote:

> I think a `ceph osd df` would be useful.
>
> And how did you set up such a cluster? I don't see a root, and you have
> each osd in there more than once... is that even possible?
>
> On 08/10/17 08:46, Mandar Naik wrote:
>
> Hi,
> I am evaluating a Ceph cluster for a solution where Ceph could be used to
> provision pools that are either stored local to a node or replicated
> across the cluster. That way Ceph could serve as a single solution for
> writing both local and replicated data. Local storage avoids the capacity
> cost of a replication factor greater than one, and the data stays
> available as long as the host holding it is alive. So I ran an experiment
> with a Ceph cluster that has one CRUSH rule which replicates data across
> nodes, and other rules that each point to a CRUSH bucket containing a
> single local OSD. The cluster configuration is pasted below.
>
> Here I observed that if one of the disks is full (95%), the entire
> cluster goes into an error state and stops accepting new writes from/to
> the other nodes. So the cluster became unusable even though it is only
> 32% full. Writes are blocked even for pools that do not touch the full
> OSD. I have tried playing around with the CRUSH hierarchy, but it did not
> help. So, is it possible to store data in the above manner with Ceph? If
> yes, can the cluster be brought back to a usable state after one of the
> nodes is full?
>
> # ceph df
> GLOBAL:
>     SIZE AVAIL  RAW USED %RAW USED
>     134G 94247M 43922M   31.79
>
> # ceph -s
>     cluster ba658a02-757d-4e3c-7fb3-dc4bf944322f
>      health HEALTH_ERR
>             1 full osd(s)
>             full,sortbitwise,require_jewel_osds flag(s) set
>      monmap e3: 3 mons at {ip-10-0-9-122=10.0.9.122:6789/0,ip-10-0-9-146=10.0.9.146:6789/0,ip-10-0-9-210=10.0.9.210:6789/0}
>             election epoch 14, quorum 0,1,2 ip-10-0-9-122,ip-10-0-9-146,ip-10-0-9-210
>      osdmap e93: 3 osds: 3 up, 3 in
>             flags full,sortbitwise,require_jewel_osds
>       pgmap v630: 384 pgs, 6 pools, 43772 MB data, 18640 objects
>             43922 MB used, 94247 MB / 134 GB avail
>                  384 active+clean
>
> # ceph osd tree
> ID WEIGHT  TYPE NAME                   UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -9 0.04399 rack ip-10-0-9-146-rack
> -8 0.04399     host ip-10-0-9-146
>  2 0.04399         osd.2                    up  1.00000          1.00000
> -7 0.04399 rack ip-10-0-9-210-rack
> -6 0.04399     host ip-10-0-9-210
>  1 0.04399         osd.1                    up  1.00000          1.00000
> -5 0.04399 rack ip-10-0-9-122-rack
> -3 0.04399     host ip-10-0-9-122
>  0 0.04399         osd.0                    up  1.00000          1.00000
> -4 0.13197 rack rep-rack
> -3 0.04399     host ip-10-0-9-122
>  0 0.04399         osd.0                    up  1.00000          1.00000
> -6 0.04399     host ip-10-0-9-210
>  1 0.04399         osd.1                    up  1.00000          1.00000
> -8 0.04399     host ip-10-0-9-146
>  2 0.04399         osd.2                    up  1.00000          1.00000
>
> # ceph osd crush rule list
> [
>     "rep_ruleset",
>     "ip-10-0-9-122_ruleset",
>     "ip-10-0-9-210_ruleset",
>     "ip-10-0-9-146_ruleset"
> ]
>
> # ceph osd crush rule dump rep_ruleset
> {
>     "rule_id": 0,
>     "rule_name": "rep_ruleset",
>     "ruleset": 0,
>     "type": 1,
>     "min_size": 1,
>     "max_size": 10,
>     "steps": [
>         { "op": "take", "item": -4, "item_name": "rep-rack" },
>         { "op": "chooseleaf_firstn", "num": 0, "type": "host" },
>         { "op": "emit" }
>     ]
> }
>
> # ceph osd crush rule dump ip-10-0-9-122_ruleset
> {
>     "rule_id": 1,
>     "rule_name": "ip-10-0-9-122_ruleset",
>     "ruleset": 1,
>     "type": 1,
>     "min_size": 1,
>     "max_size": 10,
>     "steps": [
>         { "op": "take", "item": -5, "item_name": "ip-10-0-9-122-rack" },
>         { "op": "chooseleaf_firstn", "num": 0, "type": "host" },
>         { "op": "emit" }
>     ]
> }
>
> --
> Thanks,
> Mandar Naik.
>
> --------------------------------------------
> Peter Maloney
> Brockmann Consult
> Max-Planck-Str. 2
> 21502 Geesthacht
> Germany
> Tel: +49 4152 889 300
> Fax: +49 4152 889 333
> E-mail: peter.malo...@brockmann-consult.de
> Internet: http://www.brockmann-consult.de
> --------------------------------------------
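P.S. On my last question above (getting the cluster back to a usable state):
as far as I understand, the 95% threshold is Ceph's default full ratio
(mon_osd_full_ratio = 0.95), and once any single OSD crosses it the "full"
flag blocks writes cluster-wide. On Jewel the ratio can be raised temporarily
as an emergency measure while space is freed up, e.g. (a sketch, not a fix
for the underlying imbalance):

  # identify the full OSD
  ceph health detail

  # temporarily raise the full ratio so writes unblock
  # (Jewel syntax; Luminous and later use `ceph osd set-full-ratio`)
  ceph pg set_full_ratio 0.97

  # after freeing space or reweighting the full OSD, restore the default
  ceph pg set_full_ratio 0.95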
--
Thanks,
Mandar Naik.

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com