> On Jan 13, 2017, at 12:41 PM, Wido den Hollander <w...@42on.com> wrote:
> 
> 
>> On 13 January 2017 at 18:39, Mohammed Naser <mna...@vexxhost.com> wrote:
>> 
>> 
>> 
>>> On Jan 13, 2017, at 12:37 PM, Wido den Hollander <w...@42on.com> wrote:
>>> 
>>> 
>>>> On 13 January 2017 at 18:18, Mohammed Naser <mna...@vexxhost.com> wrote:
>>>> 
>>>> 
>>>> Hi everyone,
>>>> 
>>>> We have an all-SSD deployment with 90 OSDs at the moment which, in my 
>>>> opinion, is not quite hitting the performance it should. A `rados bench` 
>>>> run gives numbers along these lines:
>>>> 
>>>> Maintaining 16 concurrent writes of 4194304 bytes to objects of size 
>>>> 4194304 for up to 10 seconds or 0 objects
>>>> Object prefix: benchmark_data_bench.vexxhost._30340
>>>> sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
>>>>   0       0         0         0         0         0           -           0
>>>>   1      16       158       142   568.513       568   0.0965336   0.0939971
>>>>   2      16       287       271   542.191       516   0.0291494    0.107503
>>>>   3      16       375       359    478.75       352   0.0892724    0.118463
>>>>   4      16       477       461   461.042       408   0.0243493    0.126649
>>>>   5      16       540       524   419.216       252    0.239123    0.132195
>>>>   6      16       644       628    418.67       416    0.347606    0.146832
>>>>   7      16       734       718   410.281       360   0.0534447    0.147413
>>>>   8      16       811       795   397.487       308   0.0311927     0.15004
>>>>   9      16       879       863   383.537       272   0.0894534    0.158513
>>>>  10      16       980       964   385.578       404   0.0969865    0.162121
>>>>  11       3       981       978   355.613        56    0.798949    0.171779
>>>> Total time run:         11.063482
>>>> Total writes made:      981
>>>> Write size:             4194304
>>>> Object size:            4194304
>>>> Bandwidth (MB/sec):     354.68
>>>> Stddev Bandwidth:       137.608
>>>> Max bandwidth (MB/sec): 568
>>>> Min bandwidth (MB/sec): 56
>>>> Average IOPS:           88
>>>> Stddev IOPS:            34
>>>> Max IOPS:               142
>>>> Min IOPS:               14
>>>> Average Latency(s):     0.175273
>>>> Stddev Latency(s):      0.294736
>>>> Max latency(s):         1.97781
>>>> Min latency(s):         0.0205769
>>>> Cleaning up (deleting benchmark objects)
>>>> Clean up completed and total clean up time :3.895293
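>>>> 
>>>> (For reference, output like the above comes from an invocation along 
>>>> these lines; the pool name is a placeholder, and 16 threads / 4 MB 
>>>> objects are the values shown in the run above:)
>>>> 
>>>>     rados bench -p <pool> 10 write -t 16 -b 4194304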
>>>> 
>>>> We’ve verified the network by running `iperf` across both the replication 
>>>> and public networks, and it reached 9.8 Gb/s (10G links for both).  The 
>>>> machine running the benchmark doesn’t even saturate its port.  The SSDs 
>>>> are S3520 960GB drives which we’ve benchmarked with fio and similar 
>>>> tools, and they can handle the load.  At this point I’m not really sure 
>>>> where to look next... is anyone running an all-SSD cluster who might be 
>>>> able to share their experience?
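>>>> 
>>>> (For context, a sketch of the sort of single-job sync-write fio test 
>>>> commonly used when judging SSDs for Ceph; the device path and runtime 
>>>> are placeholders, and it should only be pointed at an unused device or 
>>>> a scratch file:)
>>>> 
>>>>     fio --name=sync-write-test --filename=/dev/sdX --direct=1 --sync=1 \
>>>>         --rw=write --bs=4k --numjobs=1 --iodepth=1 \
>>>>         --runtime=60 --time_based --group_reporting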
>>> 
>>> I suggest that you search a bit on the ceph-users list since this topic has 
>>> been discussed multiple times in the past and even recently.
>>> 
>>> Ceph isn't your average storage system and you have to keep that in mind. 
>>> Nothing is free in this world. Ceph provides excellent consistency and 
>>> distribution of data, but that also means you pay for things like network 
>>> and CPU latency.
>>> 
>>> However, I suggest you look up a few threads on this list which have 
>>> valuable tips.
>>> 
>>> Wido
>> 
>> Thanks for the reply. I’ve actually done quite a lot of research and gone 
>> through many of the previous posts.  While I agree 100% with your 
>> statement, I’ve found that other people with similar setups have been able 
>> to reach numbers that I cannot, which leads me to believe that there is 
>> actually an issue here.  They have been able to max out at 1200 MB/s, 
>> which is the limit of their benchmarking host.  We’d like to reach that, 
>> and given the specifications of the cluster, I think it can do so without 
>> problems.
> 
> A few tips:
> 
> - Disable all logging in Ceph (debug_osd, debug_ms, debug_auth, etc, etc)

All logging is configured to the default settings; should those be turned down?
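
(My assumption is that "turned down" would look something like the ceph.conf 
sketch below; the subsystems listed are only the commonly mentioned ones, not 
an exhaustive set:)

    [global]
        debug osd = 0/0
        debug ms = 0/0
        debug auth = 0/0
        debug filestore = 0/0
        debug journal = 0/0

(Or, at runtime, something along the lines of 
`ceph tell osd.* injectargs '--debug_osd 0/0 --debug_ms 0/0'`.)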

> - Disable power saving on the CPUs

All disabled as well; everything is running in `performance` mode.
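
(Concretely, that amounts to the usual governor check/setting shown below; 
exact tooling differs per distro, so treat it as a sketch:)

    # check the current governor on every core
    cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

    # set it to performance, via cpupower or sysfs directly
    cpupower frequency-set -g performance
    for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
        echo performance > "$g"
    done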

> 
> Can you also share how the 90 OSDs are distributed in the cluster and what 
> CPUs you have?

There are 45 machines with 2 OSDs each.  The servers they’re located on have, 
on average, 24-core ~3 GHz Intel CPUs.  Both OSDs are pinned to two cores on 
the system.
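
(For context, pinning of that sort can be expressed with a systemd drop-in 
along these lines; the core numbers and unit name below are illustrative, 
not our exact values:)

    # /etc/systemd/system/ceph-osd@.service.d/cpuaffinity.conf
    [Service]
    CPUAffinity=0 1

    systemctl daemon-reload
    systemctl restart ceph-osd@<id>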

> 
> Wido
> 
>> 
>>>> 
>>>> Thanks,
>>>> Mohammed
>> 
