I created a new pool that only contains OSDs on a single node. The rados bench gives me the speed I'd expect (~1GB/s, all coming out of cache).
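(For anyone wanting to reproduce this: a pool can be pinned to the OSDs on one host with a simple CRUSH rule, roughly along these lines. The host name "node1", rule name "onenode" and pool name "1node" are just examples, not my actual names:

    # Rule that places replicas only on OSDs under the "node1" host bucket
    ceph osd crush rule create-simple onenode node1 osd

    # Pool that uses that rule
    ceph osd pool create 1node 128 128 replicated onenode

The multi-node test pools are built the same way, just rooted at buckets that contain the relevant hosts.)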
I then created a pool that contains OSDs from 2 nodes. Now the strange part: if I run the rados bench from either of those two nodes, I get the speed I'd expect, ~2GB/s (1GB/s local and 1GB/s coming over from the other node). If I run the same bench from a 3rd node, I only get about 200MB/s. During that bench I monitor the interfaces on the 2 OSD nodes and they never go faster than 1Gb/s. It's almost as if the speed has negotiated down to 1Gb. If I run iperf tests between the 3 nodes I get the full 10Gb speed.

'rados -p 2node bench 60 rand --no-cleanup' from one of the nodes in the 2-node pool:

Total time run:       60.036413
Total reads made:     33496
Read size:            4194304
Object size:          4194304
Bandwidth (MB/sec):   2231.71
Average IOPS:         557
Stddev IOPS:          10
Max IOPS:             584
Min IOPS:             535
Average Latency(s):   0.0275722
Max latency(s):       0.164382
Min latency(s):       0.00480053

'rados -p 2node bench 60 rand --no-cleanup' from a node that is not in the 2-node pool:

Total time run:       60.383206
Total reads made:     2715
Read size:            4194304
Object size:          4194304
Bandwidth (MB/sec):   179.851
Average IOPS:         44
Stddev IOPS:          10
Max IOPS:             77
Min IOPS:             28
Average Latency(s):   0.355126
Max latency(s):       2.17366
Min latency(s):       0.00641849

I appreciate this may not be a Ceph config issue, but any tips on tracking it down would be much appreciated. (The checks I plan to run next are listed after the quoted message below.)

On Sat, Aug 6, 2016 at 9:38 PM, David <dclistsli...@gmail.com> wrote:
> Hi All
>
> I've just installed Jewel 10.2.2 on hardware that has previously been
> running Giant. Rados Bench with the default rand and seq tests is giving me
> approx 40% of the throughput I used to achieve. On Giant I would get
> ~1000MB/s (so probably limited by the 10GbE interface), now I'm getting 300
> - 400MB/s.
>
> I can see there is no activity on the disks during the bench so the data
> is all coming out of cache. The cluster isn't doing anything else during
> the test. I'm fairly sure my network is sound, I've done the usual testing
> with iperf etc. The write test seems about the same as I used to get
> (~400MB/s).
>
> This was a fresh install rather than an upgrade.
>
> Are there any gotchas I should be aware of?
>
> Some more details:
>
> OS: CentOS 7
> Kernel: 3.10.0-327.28.2.el7.x86_64
> 5 nodes (each 10 * 4TB SATA, 2 * Intel dc3700 SSD partitioned up for
> journals).
> 10GbE public network
> 10GbE cluster network
> MTU 9000 on all interfaces and switch
> Ceph installed from ceph repo
>
> Ceph.conf is pretty basic (IPs, hosts etc omitted):
>
> filestore_xattr_use_omap = true
> osd_journal_size = 10000
> osd_pool_default_size = 3
> osd_pool_default_min_size = 2
> osd_pool_default_pg_num = 4096
> osd_pool_default_pgp_num = 4096
> osd_crush_chooseleaf_type = 1
> max_open_files = 131072
> mon_clock_drift_allowed = .15
> mon_clock_drift_warn_backoff = 30
> mon_osd_down_out_interval = 300
> mon_osd_report_timeout = 300
> mon_osd_full_ratio = .95
> mon_osd_nearfull_ratio = .80
> osd_backfill_full_ratio = .80
>
> Thanks
> David
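For completeness, these are the kinds of checks I plan to run next on the 3rd node and on both OSD nodes. The interface name and address below are only placeholders:

    # Confirm the NIC really negotiated 10Gb full duplex
    ethtool eth0 | egrep 'Speed|Duplex'

    # Look for errors/drops that could point at a bad cable or switch port
    ip -s link show eth0

    # Verify jumbo frames pass end to end (8972 = 9000 MTU minus 28 bytes of IP/ICMP headers)
    ping -M do -s 8972 <osd-node-ip>

    # Re-test throughput with parallel streams between the 3rd node and each OSD node
    iperf -c <osd-node-ip> -P 4 -t 30

If anyone can think of other things worth checking, please let me know.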