Hi everybody,

I want to squeeze all the performance out of Ceph (we are running Jewel 10.2.7).
We have set up a test environment with 2 nodes, both with the same configuration:

 * CentOS 7.3
 * 24 CPUs (12 physical cores with hyper-threading)
 * 32 GB of RAM
 * 2x 100 Gbit/s Ethernet cards
 * 2x SSDs in RAID, dedicated to the OS
 * 4x SATA 6 Gbit/s SSDs for OSDs

We are already expecting the following bottlenecks:

 * [ SATA speed x no. of disks ] = 6 Gbit/s x 4 = 24 Gbit/s
 * [ Network speed x no. of bonded cards ] = 100 Gbit/s x 2 = 200 Gbit/s

So the minimum of the two is 24 Gbit/s per node (not taking protocol overhead into account).

24 Gbit/s per node x 2 nodes = 48 Gbit/s of maximum hypothetical gross speed (mhs).
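
As a sanity check on the disk side of that estimate, something along these lines can confirm the negotiated SATA link speed and the raw sequential read throughput of a single disk (the device name below is just an example for one of our OSD SSDs):

   # smartctl -i /dev/sdb | grep -i sata                        # /dev/sdb is a placeholder OSD disk
   # dd if=/dev/sdb of=/dev/null bs=4M count=2048 iflag=direct  # raw sequential read, bypassing the page cache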

Here are the tests:
///////IPERF2/////// Tests are quite good, reaching about 88% of a single link's line rate.
Note: iperf2 can only use one link of the bond (it's a well-known issue).

   [ ID] Interval       Transfer     Bandwidth
   [ 12]  0.0-10.0 sec  9.55 GBytes  8.21 Gbits/sec
   [  3]  0.0-10.0 sec  10.3 GBytes  8.81 Gbits/sec
   [  5]  0.0-10.0 sec  9.54 GBytes  8.19 Gbits/sec
   [  7]  0.0-10.0 sec  9.52 GBytes  8.18 Gbits/sec
   [  6]  0.0-10.0 sec  9.96 GBytes  8.56 Gbits/sec
   [  8]  0.0-10.0 sec  12.1 GBytes  10.4 Gbits/sec
   [  9]  0.0-10.0 sec  12.3 GBytes  10.6 Gbits/sec
   [ 10]  0.0-10.0 sec  10.2 GBytes  8.80 Gbits/sec
   [ 11]  0.0-10.0 sec  9.34 GBytes  8.02 Gbits/sec
   [  4]  0.0-10.0 sec  10.3 GBytes  8.82 Gbits/sec
   [SUM]  0.0-10.0 sec   103 GBytes  88.6 Gbits/sec
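
For reference, a per-stream table like the one above comes from a parallel iperf2 run along these lines (the server address is just a placeholder):

   # iperf -s                          # on the receiving node
   # iperf -c 192.168.0.2 -P 10 -t 10  # on the sending node; 192.168.0.2 is a placeholder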

///////RADOS BENCH///////

Taking into consideration the maximum hypothetical speed of 48 Gbit/s (due to the disk bottleneck), the results are not good enough:

 * Average write bandwidth is roughly 5-7 Gbit/s (12.5% of the mhs; see the quick MB/s-to-Gbit/s conversion after this list)
 * Average sequential read bandwidth is roughly 24 Gbit/s (50% of the mhs)
 * Average random read bandwidth is roughly 27 Gbit/s (56.25% of the mhs)
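
The Gbit/s figures are converted from the MB/sec values reported by rados bench; for example, for the write run below:

   # echo "601.406 * 8 / 1000" | bc -l   # MB/s * 8 / 1000 = Gbit/s, roughly 4.8 for this run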

Here are the reports.
Write:

   # rados bench -p scbench 10 write --no-cleanup
   Total time run:         10.229369
   Total writes made:      1538
   Write size:             4194304
   Object size:            4194304
   Bandwidth (MB/sec):     601.406
   Stddev Bandwidth:       357.012
   Max bandwidth (MB/sec): 1080
   Min bandwidth (MB/sec): 204
   Average IOPS:           150
   Stddev IOPS:            89
   Max IOPS:               270
   Min IOPS:               51
   Average Latency(s):     0.106218
   Stddev Latency(s):      0.198735
   Max latency(s):         1.87401
   Min latency(s):         0.0225438
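
The run above uses the rados bench defaults; a variant with the concurrency and object size set explicitly (e.g. 32 in-flight ops, 4 MB objects) would look like this:

   # rados bench -p scbench 10 write -t 32 -b 4194304 --no-cleanup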

Sequential read:

   # rados bench -p scbench 10 seq
   Total time run:       2.054359
   Total reads made:     1538
   Read size:            4194304
   Object size:          4194304
   Bandwidth (MB/sec):   2994.61
   Average IOPS          748
   Stddev IOPS:          67
   Max IOPS:             802
   Min IOPS:             707
   Average Latency(s):   0.0202177
   Max latency(s):       0.223319
   Min latency(s):       0.00589238

Random read:

   # rados bench -p scbench 10 rand
   Total time run:       10.036816
   Total reads made:     8375
   Read size:            4194304
   Object size:          4194304
   Bandwidth (MB/sec):   3337.71
   Average IOPS:         834
   Stddev IOPS:          78
   Max IOPS:             927
   Min IOPS:             741
   Average Latency(s):   0.0182707
   Max latency(s):       0.257397
   Min latency(s):       0.00469212
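
Since the write run used --no-cleanup (so the read benchmarks had objects to work with), the benchmark objects are left in the pool; they can be removed afterwards with:

   # rados -p scbench cleanup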

//------------------------------------

It seems like there is some bottleneck somewhere that we are underestimating.
Can you help me find it?



