Re: [ceph-users] CRUSH straw2 can not handle big weight differences

2018-01-29 Thread Peter Linder
I realize we're probably kind of pushing it. It was the only option I could think of, however, that would satisfy the idea: have separate servers for HDD and NVMe storage spread out across 3 data centers, and always select 1 NVMe and 2 HDDs, in separate data centers (making sure the NVMe is primary). If on
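[A minimal sketch of the straightforward way to express that goal with Luminous device classes, assuming the OSDs carry nvme/hdd classes and a datacenter level exists in the hierarchy; the rule name and id are placeholders. Note that two independent takes like this do not by themselves force the HDD copies into different datacenters than the NVMe copy, which is what pushed the thread toward the inverted hierarchy below:]

    rule hybrid {
        id 1
        type replicated
        min_size 3
        max_size 3
        # primary copy: one NVMe OSD, taken from some datacenter
        step take default class nvme
        step chooseleaf firstn 1 type datacenter
        step emit
        # remaining copies: HDD OSDs from distinct datacenters
        step take default class hdd
        step chooseleaf firstn -1 type datacenter
        step emit
    }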

Re: [ceph-users] CRUSH straw2 can not handle big weight differences

2018-01-29 Thread Gregory Farnum
CRUSH is a pseudorandom, probabilistic algorithm. That can lead to problems with extreme input. In this case, you've given it a bucket in which one child contains ~3.3% of the total weight, and there are only three weights. So on only 3% of "draws", as it tries to choose a child bucket to descend
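[A back-of-the-envelope illustration of why this bites, assuming independent draws and the default choose_total_tries of 50; the real retry behaviour is more involved, so treat this only as an order-of-magnitude sketch. The chance that a child holding ~3.3% of the weight is missed by every retry is roughly

    (1 - 0.033)^{50} \approx e^{-1.65} \approx 0.19

so a sizeable fraction of PGs can fail to pick that child at all, which surfaces as the undersized/degraded PGs reported elsewhere in this thread.]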

Re: [ceph-users] CRUSH straw2 can not handle big weight differences

2018-01-29 Thread Peter Linder
We kind of turned the crushmap inside out a little bit. Instead of the traditional "for 1 PG, select OSDs from 3 separate data centers", we did "force selection from only one datacenter (out of 3) and leave only enough options to make sure precisely 1 SSD and 2 HDDs are selected". We then orga
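[One way to read that inverted layout, as a hypothetical decompiled crushmap fragment: each synthetic "datacenter" bucket holds exactly one NVMe host plus two HDD hosts from the other two physical data centers, so choosing 3 hosts inside it forces the 1 NVMe + 2 HDD split. All names, ids and weights are made up for illustration; the skewed NVMe-vs-HDD weights inside such a bucket are exactly the straw2 problem this thread is about:]

    # each synthetic group: 1 small NVMe host + 2 large HDD hosts
    datacenter hybrid-a {
        id -21
        alg straw2
        hash 0
        item nvme-host-1 weight 1.000     # ~3.3% of the group's weight
        item hdd-host-2  weight 14.500
        item hdd-host-3  weight 14.500
    }

    rule hybrid {
        id 1
        type replicated
        min_size 3
        max_size 3
        step take hybrid
        step choose firstn 1 type datacenter   # pick exactly one synthetic group
        step chooseleaf firstn 0 type host     # then all 3 hosts inside it
        step emit
    }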

Re: [ceph-users] CRUSH straw2 can not handle big weight differences

2018-01-29 Thread Niklas
Yes. It is a hybrid solution where a placement group is always located on one NVMe drive and two HDD drives. The advantage is great read performance and cost savings; the disadvantage is lower write performance. Still, write performance is good thanks to RocksDB on Intel Optane disks in the HDD servers.
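[For reference, the kind of OSD layout described here (BlueStore data on an HDD with its RocksDB/WAL on a fast Optane partition) is typically built along these lines in Luminous; the device paths are placeholders only:]

    # one OSD per HDD, with its RocksDB (block.db) on an Optane partition
    ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1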

Re: [ceph-users] CRUSH straw2 can not handle big weight differences

2018-01-29 Thread Wido den Hollander
On 01/29/2018 01:14 PM, Niklas wrote: Ceph luminous 12.2.2 $: ceph osd pool create hybrid 1024 1024 replicated hybrid $: ceph -s cluster: id: e07f568d-056c-4e01-9292-732c64ab4f8e health: HEALTH_WARN Degraded data redundancy: 431 pgs unclean, 431 pgs degraded, 431

[ceph-users] CRUSH straw2 can not handle big weight differences

2018-01-29 Thread Niklas
Ceph luminous 12.2.2 $: ceph osd pool create hybrid 1024 1024 replicated hybrid $: ceph -s cluster: id: e07f568d-056c-4e01-9292-732c64ab4f8e health: HEALTH_WARN Degraded data redundancy: 431 pgs unclean, 431 pgs degraded, 431 pgs undersized services: mon: 3 daemo
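[For completeness, a custom rule like the "hybrid" one referenced by that pool create is normally added by round-tripping the crushmap, and crushtool can report whether the rule fails to produce full mappings, which is what shows up as the undersized/degraded PGs here; a rough sketch, with the rule id as a placeholder:]

    ceph osd getcrushmap -o crush.bin
    crushtool -d crush.bin -o crush.txt
    # edit crush.txt: add the hybrid rule, then recompile
    crushtool -c crush.txt -o crush.new
    # dry-run the rule before injecting it; bad mappings here become undersized PGs later
    crushtool -i crush.new --test --rule 1 --num-rep 3 --show-bad-mappings
    ceph osd setcrushmap -i crush.new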