On Wed, Aug 7, 2019 at 12:08 AM Konstantin Shalygin <k0...@k0ste.ru> wrote:
> On 8/7/19 1:40 PM, Robert LeBlanc wrote:
> > Maybe it's the lateness of the day, but I'm not sure how to do that.
> > Do you have an example where all the OSDs are of class ssd?
>
> Can't parse what you mean. You always should paste your `ceph osd tree`
> first.

Our `ceph osd tree` is like this:

```
ID  CLASS WEIGHT    TYPE NAME                 STATUS REWEIGHT PRI-AFF
 -1       892.21326 root default
 -3        69.16382     host sun-pcs01-osd01
  0   ssd   3.49309         osd.0                 up 1.00000 1.00000
  1   ssd   3.42329         osd.1                 up 0.87482 1.00000
  2   ssd   3.49309         osd.2                 up 0.88989 1.00000
  3   ssd   3.42329         osd.3                 up 0.94989 1.00000
  4   ssd   3.49309         osd.4                 up 0.93993 1.00000
  5   ssd   3.42329         osd.5                 up 1.00000 1.00000
  6   ssd   3.49309         osd.6                 up 0.89490 1.00000
  7   ssd   3.42329         osd.7                 up 1.00000 1.00000
  8   ssd   3.49309         osd.8                 up 0.89482 1.00000
  9   ssd   3.42329         osd.9                 up 1.00000 1.00000
100   ssd   3.49309         osd.100               up 1.00000 1.00000
101   ssd   3.42329         osd.101               up 1.00000 1.00000
102   ssd   3.49309         osd.102               up 1.00000 1.00000
103   ssd   3.42329         osd.103               up 0.81482 1.00000
104   ssd   3.49309         osd.104               up 0.87973 1.00000
105   ssd   3.42329         osd.105               up 0.86485 1.00000
106   ssd   3.49309         osd.106               up 0.79965 1.00000
107   ssd   3.42329         osd.107               up 1.00000 1.00000
108   ssd   3.49309         osd.108               up 1.00000 1.00000
109   ssd   3.42329         osd.109               up 1.00000 1.00000
 -5        62.24744     host sun-pcs01-osd02
 10   ssd   3.49309         osd.10                up 1.00000 1.00000
 11   ssd   3.42329         osd.11                up 0.72473 1.00000
 12   ssd   3.49309         osd.12                up 1.00000 1.00000
 13   ssd   3.42329         osd.13                up 0.78979 1.00000
 14   ssd   3.49309         osd.14                up 0.98961 1.00000
 15   ssd   3.42329         osd.15                up 1.00000 1.00000
 16   ssd   3.49309         osd.16                up 0.96495 1.00000
 17   ssd   3.42329         osd.17                up 0.94994 1.00000
 18   ssd   3.49309         osd.18                up 1.00000 1.00000
 19   ssd   3.42329         osd.19                up 0.80481 1.00000
110   ssd   3.49309         osd.110               up 0.97998 1.00000
111   ssd   3.42329         osd.111               up 1.00000 1.00000
112   ssd   3.49309         osd.112               up 1.00000 1.00000
113   ssd   3.42329         osd.113               up 0.72974 1.00000
116   ssd   3.49309         osd.116               up 0.91992 1.00000
117   ssd   3.42329         osd.117               up 0.96997 1.00000
118   ssd   3.49309         osd.118               up 0.93959 1.00000
119   ssd   3.42329         osd.119               up 0.94481 1.00000
... plus 11 more hosts just like this
```

How do you single out one OSD from each host for metadata only and
prevent data from landing on that OSD when all the device classes are the
same? It seems that you would need one OSD to be a different class to do
that. In a previous email I asked "Is it possible to add a new device
class like 'metadata'?" and the answer was "Yes, but you don't need this.
Just use your existing class with another crush ruleset."

So, I'm trying to figure out how you use the existing class of 'ssd' with
another CRUSH ruleset to accomplish the above.
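The only way I can see to do it is to carve those OSDs out into a class of
their own, roughly like the sketch below (the 'metadata' class name, the
osd ids, and the rule/pool names are just placeholders, not something we
actually run):

```
# Sketch only: move one OSD per host (osd.0 and osd.10 as examples)
# into a hypothetical 'metadata' device class.
ceph osd crush rm-device-class osd.0 osd.10
ceph osd crush set-device-class metadata osd.0 osd.10

# Replicated rule that only takes the new class, failure domain = host.
ceph osd crush rule create-replicated fs_meta_rule default host metadata

# Point the CephFS metadata pool at that rule; the data pool keeps its
# class ssd rule, so it can no longer land on the reclassed OSDs.
ceph osd pool set cephfs_metadata crush_rule fs_meta_rule
```

But that is exactly the "new device class" route that was said to be
unnecessary, so I don't see how to get the same separation with only the
existing 'ssd' class and a different ruleset.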
> > Yes, we can set quotas to limit space usage (or number objects), but
> > you can not reserve some space that other pools can't use. The problem
> > is if we set a quota for the CephFS data pool to the equivalent of 95%
> > there are at least two scenarios that make that quota useless.
>
> Of course. 95% of CephFS deployments is where meta_pool on flash drives
> with enough space for this.
>
> ```
> pool 21 'fs_data' replicated size 3 min_size 2 crush_rule 4 object_hash
> rjenkins pg_num 64 pgp_num 64 last_change 56870 flags hashpspool
> stripe_width 0 application cephfs
> pool 22 'fs_meta' replicated size 3 min_size 2 crush_rule 0 object_hash
> rjenkins pg_num 16 pgp_num 16 last_change 56870 flags hashpspool
> stripe_width 0 application cephfs
> ```
>
> ```
> # ceph osd crush rule dump replicated_racks_nvme
> {
>     "rule_id": 0,
>     "rule_name": "replicated_racks_nvme",
>     "ruleset": 0,
>     "type": 1,
>     "min_size": 1,
>     "max_size": 10,
>     "steps": [
>         {
>             "op": "take",
>             "item": -44,
>             "item_name": "default~nvme"    <------------
>         },
>         {
>             "op": "chooseleaf_firstn",
>             "num": 0,
>             "type": "rack"
>         },
>         {
>             "op": "emit"
>         }
>     ]
> }
> ```

Yes, our HDD cluster is much like this, but it is not on Luminous, so we
created a separate root with SSD OSDs for the metadata and set up a CRUSH
rule that maps the metadata pool to those SSDs. I understand that the
CRUSH rule should have a `step take default class ssd`, which I don't see
in your rule unless the `~` in the item_name means the device class.
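For comparison, here is roughly how I would expect a class-aware rule to
be created and dumped on a Luminous cluster (the rule name is illustrative
and I am guessing at the output, so take it as a sketch only):

```
# Replicated rule restricted to device class ssd, failure domain = rack.
ceph osd crush rule create-replicated replicated_racks_ssd default rack ssd

# I would expect the dump's first step to reference the class's shadow
# tree, i.e. "item_name": "default~ssd", rather than showing a literal
# 'step take default class ssd' line.
ceph osd crush rule dump replicated_racks_ssd
```

If that is the case, then the `~` in "default~nvme" is indeed the device
class and your rule is already class-aware.
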
Thanks
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1