[ceph-users] Help with crushmap
Hi, I need help with a crushmap.
I have:
3 regions - r1 r2 r3
5 DCs - dc1 dc2 dc3 dc4 dc5
dc1, dc2, dc3 are in r1
dc4 is in r2
dc5 is in r3

Each DC has 3 nodes with 2 disks.
I need three rules:
rule1: 2 copies on two nodes in each DC - 10 copies total, failure domain dc
rule2: 2 copies on two nodes in each region - 6 copies total, failure domain region
rule3: 2 copies on two nodes in dc1, failure domain node

What does the crushmap look like in this case for the replicated type?
Thanks.
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Help with crushmap
10 copies for a replicated setup seems... excessive. The rules are quite simple, for example rule 1 could be:

    take default
    choose firstn 5 type datacenter    # picks 5 datacenters
    chooseleaf firstn 2 type host      # 2 different hosts in each datacenter
    emit

Rule 2 is the same but with type region and firstn 3, and for rule 3 you can just start directly in the selected DC (take dc1).

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Sun, 2 Dec 2018 at 17:44, Vasiliy Tolstov wrote:
>
> Hi, I need help with a crushmap.
> I have:
> 3 regions - r1 r2 r3
> 5 DCs - dc1 dc2 dc3 dc4 dc5
> dc1, dc2, dc3 are in r1
> dc4 is in r2
> dc5 is in r3
>
> Each DC has 3 nodes with 2 disks.
> I need three rules:
> rule1: 2 copies on two nodes in each DC - 10 copies total, failure domain dc
> rule2: 2 copies on two nodes in each region - 6 copies total, failure domain region
> rule3: 2 copies on two nodes in dc1, failure domain node
>
> What does the crushmap look like in this case for the replicated type?
> Thanks.
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
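Spelled out, the three rules from this thread could look roughly like the following in a decompiled crushmap. This is only a sketch and untested; it assumes the bucket names from the question (dc1..dc5 under r1..r3, hosts underneath) and a single root named default, and the rule ids and min/max_size values are illustrative.

    rule rule1 {                                # 10 copies: 2 hosts in each of the 5 DCs
        id 1
        type replicated
        min_size 1
        max_size 10
        step take default
        step choose firstn 5 type datacenter    # pick 5 datacenters
        step chooseleaf firstn 2 type host      # 2 different hosts in each of them
        step emit
    }

    rule rule2 {                                # 6 copies: 2 hosts in each of the 3 regions
        id 2
        type replicated
        min_size 1
        max_size 10
        step take default
        step choose firstn 3 type region        # pick 3 regions
        step chooseleaf firstn 2 type host      # 2 different hosts in each of them
        step emit
    }

    rule rule3 {                                # 2 copies on 2 different hosts in dc1
        id 3
        type replicated
        min_size 1
        max_size 10
        step take dc1                           # start directly at the dc1 bucket
        step chooseleaf firstn 0 type host      # one distinct host per replica
        step emit
    }

The pools using these rules would then need size 10, 6 and 2 respectively, and the datacenter and region bucket types must actually exist in the hierarchy (both are part of the default type list).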
Re: [ceph-users] Help with crushmap
On Sun, 2 Dec 2018 at 20:38, Paul Emmerich (paul.emmer...@croit.io) wrote:
>
> 10 copies for a replicated setup seems... excessive.

I'm trying to create a Go package for a simple key-value store that uses the Ceph crushmap to distribute data. Each namespace has a Ceph crushmap rule attached to it.
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Slow rbd reads (fast writes) with luminous + bluestore
Hi Mark,

just taking the liberty to follow up on this one, as I'd really like to get to the bottom of this.

On 28/11/2018 16:53, Florian Haas wrote:
> On 28/11/2018 15:52, Mark Nelson wrote:
>> Option("bluestore_default_buffered_read", Option::TYPE_BOOL,
>> Option::LEVEL_ADVANCED)
>> .set_default(true)
>> .set_flag(Option::FLAG_RUNTIME)
>> .set_description("Cache read results by default (unless hinted NOCACHE or WONTNEED)"),
>>
>> Option("bluestore_default_buffered_write", Option::TYPE_BOOL,
>> Option::LEVEL_ADVANCED)
>> .set_default(false)
>> .set_flag(Option::FLAG_RUNTIME)
>> .set_description("Cache writes by default (unless hinted NOCACHE or WONTNEED)"),
>>
>> This is one area where bluestore is a lot more confusing for users than
>> filestore was. There was a lot of concern about enabling buffer cache on
>> writes by default because there's some associated overhead (potentially
>> both during writes and in the mempool thread when trimming the cache).
>> It might be worth enabling bluestore_default_buffered_write and seeing
>> if it helps reads.
>
> So yes, this is rather counterintuitive, but I happily gave it a shot and
> the results are... more head-scratching than before. :)
>
> The output is here: http://paste.openstack.org/show/736324/
>
> In summary:
>
> 1. The write benchmark is in the same ballpark as before (good).
>
> 2. The read benchmark *without* readahead is *way* better than before
> (splendid!) but has a weird dip down to 9K IOPS that I find inexplicable.
> Any ideas on that?
>
> 3. The read benchmark *with* readahead is still abysmal, which I also
> find rather odd. What do you think about that one?

These two still confuse me. In addition, I'm curious what you think of the approach of configuring OSDs with bluestore_cache_kv_ratio = .49, so that rather than using 1%/99%/0% of cache memory for metadata/KV data/objects, the OSDs use 1%/49%/50%. Is this sensible? I assume the default of not using any memory to actually cache object data is there for a reason, but I am struggling to grasp what that reason would be, particularly since with filestore we always got in-memory object caching for free, via the page cache.

Thanks again!

Cheers,
Florian
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
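In case anyone wants to reproduce this, a minimal sketch of applying the two settings discussed above to a test cluster; the values are simply the ones from this thread, not recommendations:

    # ceph.conf on the OSD nodes (restart the OSDs afterwards)
    [osd]
        bluestore_default_buffered_write = true
        bluestore_cache_kv_ratio = 0.49

    # or, for a quick test, inject at runtime; options not marked as
    # runtime-changeable will still require an OSD restart to take effect
    ceph tell osd.* injectargs '--bluestore_default_buffered_write=true'
    ceph tell osd.* injectargs '--bluestore_cache_kv_ratio=0.49'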
[ceph-users] How to use the "CEPH_OSD_FLAG_BALANCE_READS" feature?
Hi~

I want to turn on the CEPH_OSD_FLAG_BALANCE_READS flag to optimize read performance. Do I just need to set the flag through the librados API, or are there any other problems?
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
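A minimal, untested sketch of passing the corresponding per-operation flag through the librados C API (the pool name "rbd" and the object name are placeholders; as far as I can tell, LIBRADOS_OPERATION_BALANCE_READS is the librados-level counterpart of CEPH_OSD_FLAG_BALANCE_READS):

    #include <rados/librados.h>
    #include <stdio.h>

    int main(void)
    {
        rados_t cluster;
        rados_ioctx_t io;
        char buf[4096];
        size_t bytes_read = 0;
        int rval = 0;

        /* connect with the usual ceph.conf / admin keyring */
        rados_create(&cluster, "admin");
        rados_conf_read_file(cluster, NULL);
        if (rados_connect(cluster) < 0)
            return 1;
        if (rados_ioctx_create(cluster, "rbd", &io) < 0) {    /* placeholder pool */
            rados_shutdown(cluster);
            return 1;
        }

        /* build a read op and execute it with BALANCE_READS set, so the read
         * may be served by any replica rather than only the primary OSD */
        rados_read_op_t op = rados_create_read_op();
        rados_read_op_read(op, 0, sizeof(buf), buf, &bytes_read, &rval);
        int ret = rados_read_op_operate(op, io, "some-object",    /* placeholder oid */
                                        LIBRADOS_OPERATION_BALANCE_READS);
        rados_release_read_op(op);

        printf("ret=%d rval=%d bytes_read=%zu\n", ret, rval, bytes_read);

        rados_ioctx_destroy(io);
        rados_shutdown(cluster);
        return ret < 0 ? 1 : 0;
    }

This should build with "cc -o balance_read balance_read.c -lrados". As far as I know the flag only has an effect on replicated pools.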
Re: [ceph-users] rbd IO monitoring
On Thu, Nov 29, 2018 at 11:48:35PM -0500, Michael Green wrote:
> Hello collective wisdom,
>
> Ceph neophyte here, running v13.2.2 (mimic).
>
> Question: what tools are available to monitor IO stats on the RBD level?
> That is, IOPS, throughput, IOs in flight and so on?

There is some brand new code for rbd IO monitoring. This PR (https://github.com/ceph/ceph/pull/25114) added rbd client-side perf counters, and this PR (https://github.com/ceph/ceph/pull/25358) will add those counters as Prometheus metrics. There is also room for an "rbd top" tool, though I haven't seen any code for this. I'm sure Mykola (the author of both PRs) could go into more detail if needed.

I expect this functionality to land in Nautilus.

> I'm testing with fio and want to verify independently the IO load on each
> RBD image.
>
> --
> Michael Green
> Customer Support & Integration
> gr...@e8storage.com

--
Jan Fajerski
Engineer Enterprise Storage
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
HRB 21284 (AG Nürnberg)
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
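Until that lands, one way to get per-image, client-side counters on mimic today is the librbd admin socket, assuming fio uses the rbd (librbd) engine and the client has an admin socket configured; roughly:

    # client-side ceph.conf, so every librbd client gets its own socket
    [client]
        admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok

    # while the fio run is active, dump that client's perf counters
    ceph --admin-daemon /var/run/ceph/ceph-client.admin.<pid>.<cctid>.asok perf dump
    # the librbd-* section(s) contain per-image read/write op, byte and latency counters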