[ceph-users] Help with crushmap

2018-12-02 Thread Vasiliy Tolstov
Hi, I need help with a crushmap.
I have:
3 regions - r1 r2 r3
5 dc - dc1 dc2 dc3 dc4 dc5
dc1 dc2 dc3 in r1
dc4 in r2
dc5 in r3

Each dc has 3 nodes with 2 disks.
I need 3 rules:
rule1: 2 copies on two nodes in each dc - 10 copies total, failure domain dc
rule2: 2 copies on two nodes in each region - 6 copies total, failure domain
region
rule3: 2 copies on two nodes in dc1, failure domain node

What would the crushmap look like in this case for the replicated type?
Thanks.


Re: [ceph-users] Help with crushmap

2018-12-02 Thread Paul Emmerich
10 copies for a replicated setup seems... excessive.

The rules are quite simple; for example, rule 1 could be:

take default
choose firstn 5 type datacenter # picks 5 datacenters
chooseleaf firstn 2 type host # 2 different hosts in each datacenter
emit

Rule 2 is the same but with type region and firstn 3, and for rule 3 you can
just start directly in the selected dc (take dc1).
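
Spelled out, the three rules could look roughly like this in a decompiled
crushmap (a sketch, assuming the root bucket is named "default", the dc and
region buckets exist in your hierarchy, and the rule ids are arbitrary):

rule rule1 {
        id 1
        type replicated
        min_size 1
        max_size 10
        step take default
        step choose firstn 5 type datacenter
        step chooseleaf firstn 2 type host
        step emit
}

rule rule2 {
        id 2
        type replicated
        min_size 1
        max_size 10
        step take default
        step choose firstn 3 type region
        step chooseleaf firstn 2 type host
        step emit
}

rule rule3 {
        id 3
        type replicated
        min_size 1
        max_size 10
        step take dc1
        step chooseleaf firstn 2 type host
        step emit
}

The pool size then needs to match what each rule produces (10, 6 and 2
respectively) for the placement to come out as intended.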

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90



Re: [ceph-users] Help with crushmap

2018-12-02 Thread Vasiliy Tolstov
On Sun, 2 Dec 2018 at 20:38, Paul Emmerich paul.emmer...@croit.io wrote:

> 10 copies for a replicated setup seems... excessive.
>

I'm trying to write a Golang package for a simple key-value store that uses a
Ceph crushmap to distribute data. Each namespace has a Ceph crushmap rule
attached to it.



Re: [ceph-users] Slow rbd reads (fast writes) with luminous + bluestore

2018-12-02 Thread Florian Haas
Hi Mark,

just taking the liberty to follow up on this one, as I'd really like to
get to the bottom of this.

On 28/11/2018 16:53, Florian Haas wrote:
> On 28/11/2018 15:52, Mark Nelson wrote:
>> Option("bluestore_default_buffered_read", Option::TYPE_BOOL,
>> Option::LEVEL_ADVANCED)
>>     .set_default(true)
>>     .set_flag(Option::FLAG_RUNTIME)
>>     .set_description("Cache read results by default (unless hinted
>> NOCACHE or WONTNEED)"),
>>
>>     Option("bluestore_default_buffered_write", Option::TYPE_BOOL,
>> Option::LEVEL_ADVANCED)
>>     .set_default(false)
>>     .set_flag(Option::FLAG_RUNTIME)
>>     .set_description("Cache writes by default (unless hinted NOCACHE or
>> WONTNEED)"),
>>
>>
>> This is one area where bluestore is a lot more confusing for users than
>> filestore was.  There was a lot of concern about enabling buffer cache
>> on writes by default because there's some associated overhead
>> (potentially both during writes and in the mempool thread when trimming
>> the cache).  It might be worth enabling bluestore_default_buffered_write
>> and seeing if it helps reads.
> 
> So yes this is rather counterintuitive, but I happily gave it a shot and
> the results are... more head-scratching than before. :)
> 
> The output is here: http://paste.openstack.org/show/736324/
> 
> In summary:
> 
> 1. Write benchmark is in the same ballpark as before (good).
> 
> 2. Read benchmark *without* readahead is *way* better than before
> (splendid!) but has a weird dip down to 9K IOPS that I find
> inexplicable. Any ideas on that?
> 
> 3. Read benchmark *with* readahead is still abysmal, which I also find
> rather odd. What do you think about that one?

These two still confuse me.

In addition, I'm curious what you think of the approach of configuring OSDs
with bluestore_cache_kv_ratio = .49, so that rather than using 1%/99%/0% of
cache memory for metadata/KV data/objects, the OSDs use 1%/49%/50%. Is this
sensible? I assume the default of not using any memory to actually cache
object data is there for a reason, but I am struggling to grasp what that
reason would be, particularly since with filestore we always got in-memory
object caching for free, via the page cache.
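
(Concretely, I mean something like the ceph.conf sketch below, on the
assumption that bluestore_cache_meta_ratio stays at its 0.01 default and
whatever share is left over goes to cached object data:)

[osd]
# sketch: shift roughly half of the bluestore cache from KV data to object data
# (meta ratio left at its 0.01 default, so the split becomes ~1%/49%/50%)
bluestore_cache_kv_ratio = 0.49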

Thanks again!

Cheers,
Florian





[ceph-users] How to use the feature of "CEPH_OSD_FLAG_BALANCE_READS"?

2018-12-02 Thread 韦皓诚
Hi~
I want to turn on the "CEPH_OSD_FLAG_BALANCE_READS" flag to
optimize read performance. Do I just need to set the flag in the librados API,
and are there any other problems to watch out for?
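
What I have in mind is roughly the sketch below, using the librados C API -
I'm assuming the flag is exposed there as LIBRADOS_OPERATION_BALANCE_READS and
passed per read operation (pool name, object name and buffer size are just
placeholders):

/* Rough, untested sketch: read one object with balanced reads enabled.
 * LIBRADOS_OPERATION_BALANCE_READS should map to CEPH_OSD_FLAG_BALANCE_READS,
 * allowing the read to be served by a non-primary replica. */
#include <rados/librados.h>
#include <stdio.h>

int main(void)
{
    rados_t cluster;
    rados_ioctx_t io;
    char buf[4096];
    size_t bytes_read = 0;
    int rval = 0;

    if (rados_create(&cluster, NULL) < 0)              /* connect as client.admin */
        return 1;
    rados_conf_read_file(cluster, NULL);               /* default ceph.conf search path */
    if (rados_connect(cluster) < 0)
        return 1;
    if (rados_ioctx_create(cluster, "rbd", &io) < 0)   /* placeholder pool name */
        return 1;

    rados_read_op_t op = rados_create_read_op();
    rados_read_op_read(op, 0, sizeof(buf), buf, &bytes_read, &rval);
    /* The flag is passed per operation rather than set globally. */
    int ret = rados_read_op_operate(op, io, "some-object",
                                    LIBRADOS_OPERATION_BALANCE_READS);
    rados_release_read_op(op);

    printf("ret=%d bytes_read=%zu\n", ret, bytes_read);
    rados_ioctx_destroy(io);
    rados_shutdown(cluster);
    return ret < 0 ? 1 : 0;
}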


Re: [ceph-users] rbd IO monitoring

2018-12-02 Thread Jan Fajerski

On Thu, Nov 29, 2018 at 11:48:35PM -0500, Michael Green wrote:

  Hello collective wisdom,

  Ceph neophyte here, running v13.2.2 (mimic).

  Question: what tools are available to monitor IO stats at the RBD level?
  That is, IOPS, throughput, IOs in flight, and so on?
There is some brand-new code for RBD IO monitoring. This PR
(https://github.com/ceph/ceph/pull/25114) added RBD client-side perf counters,
and this PR (https://github.com/ceph/ceph/pull/25358) will expose those counters
as Prometheus metrics. There is also room for an "rbd top" tool, though I haven't
seen any code for that yet.
I'm sure Mykola (the author of both PRs) could go into more detail if needed. I
expect this functionality to land in nautilus.


  I'm testing with FIO and want to verify independently the IO load on
  each RBD image.

  --
  Michael Green
  Customer Support & Integration
  gr...@e8storage.com







--
Jan Fajerski
Engineer Enterprise Storage
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
HRB 21284 (AG Nürnberg)