I created a new pool that only contains OSDs on a single node. The Rados
bench gives me the speed I'd expect (1GB/s, all coming out of cache).

I then created a pool that contains OSDs from 2 nodes. Now the strange part
is, if I run the Rados bench from either of those nodes, I get the speed I'd
expect: 2GB/s (1GB/s local and 1GB/s coming over from the other node). If I
run the same bench from a 3rd node, I only get about 200MB/s. During that
bench I monitor the interfaces on the 2 OSD nodes and they never go any
faster than 1Gbit/s; it's almost as if the links have negotiated down to
1Gbit. If I run iperf tests between the 3 nodes I get the full 10Gbit/s.
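
For reference, these are roughly the checks I've been running; the interface
name and host below are just examples, so adjust for your setup:

  # confirm the NIC hasn't actually renegotiated down
  ethtool eth0 | grep -i speed

  # watch per-interface throughput on the OSD nodes during the bench
  sar -n DEV 1

  # raw TCP throughput from the 3rd node to each OSD node
  iperf -c <osd-node> -P 4 -t 30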

'rados -p 2node bench 60 rand --no-cleanup' from one of the nodes in the
2-node pool:

Total time run:       60.036413
Total reads made:     33496
Read size:            4194304
Object size:          4194304
Bandwidth (MB/sec):   2231.71
Average IOPS:         557
Stddev IOPS:          10
Max IOPS:             584
Min IOPS:             535
Average Latency(s):   0.0275722
Max latency(s):       0.164382
Min latency(s):       0.00480053

'rados -p 2node bench 60 rand --no-cleanup' from a 3rd node that is not in
the 2-node pool:

Total time run:       60.383206
Total reads made:     2715
Read size:            4194304
Object size:          4194304
Bandwidth (MB/sec):   179.851
Average IOPS:         44
Stddev IOPS:          10
Max IOPS:             77
Min IOPS:             28
Average Latency(s):   0.355126
Max latency(s):       2.17366
Min latency(s):       0.00641849
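
In case the arithmetic helps: assuming the bench is running the default 16
concurrent 4MB reads, an average latency of ~0.355s works out to roughly
16 * 4MB / 0.355s ~ 180MB/s, which is about what I'm seeing from the 3rd
node, so it looks more like a per-op latency problem than a raw bandwidth
one. The next thing I plan to try is increasing the bench concurrency with
the -t option, e.g.:

  rados -p 2node bench 60 rand -t 64 --no-cleanup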

I realise this may not be a Ceph config issue, but any tips on tracking it
down would be much appreciated.


On Sat, Aug 6, 2016 at 9:38 PM, David <dclistsli...@gmail.com> wrote:

> Hi All
>
> I've just installed Jewel 10.2.2 on hardware that has previously been
> running Giant. Rados Bench with the default rand and seq tests is giving me
> approx 40% of the throughput I used to achieve. On Giant I would get
> ~1000MB/s (so probably limited by the 10GbE interface), now I'm getting 300
> - 400MB/s.
>
> I can see there is no activity on the disks during the bench so the data
> is all coming out of cache. The cluster isn't doing anything else during
> the test. I'm fairly sure my network is sound, I've done the usual testing
> with iperf etc. The write test seems about the same as I used to get
> (~400MB/s).
>
> This was a fresh install rather than an upgrade.
>
> Are there any gotchas I should be aware of?
>
> Some more details:
>
> OS: CentOS 7
> Kernel: 3.10.0-327.28.2.el7.x86_64
> 5 nodes (each 10 * 4TB SATA, 2 * Intel dc3700 SSD partitioned up for
> journals).
> 10GbE public network
> 10GbE cluster network
> MTU 9000 on all interfaces and switch
> Ceph installed from ceph repo
>
> Ceph.conf is pretty basic (IPs, hosts etc omitted):
>
> filestore_xattr_use_omap = true
> osd_journal_size = 10000
> osd_pool_default_size = 3
> osd_pool_default_min_size = 2
> osd_pool_default_pg_num = 4096
> osd_pool_default_pgp_num = 4096
> osd_crush_chooseleaf_type = 1
> max_open_files = 131072
> mon_clock_drift_allowed = .15
> mon_clock_drift_warn_backoff = 30
> mon_osd_down_out_interval = 300
> mon_osd_report_timeout = 300
> mon_osd_full_ratio = .95
> mon_osd_nearfull_ratio = .80
> osd_backfill_full_ratio = .80
>
> Thanks
> David
>
>