Running 9.0.3 rados bench on a 9.0.3 cluster...
In the following experiments the cluster has just 2 osd nodes (6 osds each)
and a separate mon node (plus a separate client node running rados bench).
I have two pools populated with 4M objects.  The pools are replicated x2
with identical parameters, and the objects appear to be spread evenly
across the 12 osds.

In all cases I drop caches on all nodes before doing a rados bench seq test,
and I run rados bench seq for the same duration (30 seconds); in that time
we do not run out of objects to read from the pool.
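
For reference, "drop caches" here means the usual Linux page-cache flush
on each node, along these lines (sudo/tee is just one way to do it):

  # flush page cache, dentries and inodes on every node before each run
  sync; echo 3 | sudo tee /proc/sys/vm/drop_caches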

I am seeing significant bandwidth differences between the following:

   * running a single instance of rados bench reading from one pool with
     32 threads (bandwidth approx. 300 MB/s)

   * running two instances of rados bench, each reading from one of the two
     pools with 16 threads per instance (combined bandwidth approx. 450 MB/s)
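
Concretely, the two cases are invoked along these lines (pool1/pool2 stand
in for my two pre-populated pools):

  # case 1: single instance, one pool, 32 concurrent ops
  rados bench -p pool1 30 seq -t 32

  # case 2: two instances in parallel, 16 concurrent ops each
  rados bench -p pool1 30 seq -t 16 &
  rados bench -p pool2 30 seq -t 16 &
  wait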

I have already increased the following:
  objecter_inflight_op_bytes = 104857600000
  objecter_inflight_ops = 8192
  ms_dispatch_throttle_bytes = 1048576000  #didn't seem to have any effect
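
(These are plain ceph.conf options; the sketch below shows one way to set
them and then confirm the running value on an osd.  The [global] placement
and osd.0 are illustrative; the objecter_* options take effect on the
client side.)

  [global]
      objecter_inflight_op_bytes = 104857600000
      objecter_inflight_ops = 8192
      ms_dispatch_throttle_bytes = 1048576000

  # on the node hosting osd.0, confirm the value actually in effect
  ceph daemon osd.0 config show | grep ms_dispatch_throttle_bytes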

The disks and network are nowhere near 100% utilization during these runs.
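
(For concreteness, utilization can be watched on the osd nodes with the
usual tools, e.g.:)

  # per-disk utilization and latency on each osd node
  iostat -x 1

  # per-interface network throughput
  sar -n DEV 1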

What is the best way to diagnose what is throttling things in the
one-instance case?

-- Tom Deneau, AMD