I'm using Crucial M500s.

On Sat, Jun 21, 2014 at 7:09 PM, Mark Kirkwood <[email protected]> wrote:

> I can reproduce this in:
>
> ceph version 0.81-423-g1fb4574
>
> on Ubuntu 14.04. I have a two osd cluster with data on two sata spinners
> (WD blacks) and journals on two ssd (Crucial m4's). I'm getting about 3.5
> MB/s (kernel and librbd) using your dd command with direct on. Leaving off
> direct I'm seeing about 140 MB/s (librbd) and 90 MB/s (kernel 3.11 [2]).
> The ssd's can do writes at about 180 MB/s each... which is something to
> look at another day[1].
>
> It would be interesting to know what version of Ceph Tyler is using, as his
> setup seems not nearly as impacted by adding direct. Also it might be useful
> to know what make and model of ssd you both are using (some of 'em do not
> like a series of essentially sync writes). Having said that, testing my
> Crucial m4's shows they can do the dd command (with direct *on*) at about
> 180 MB/s...hmmm...so it *is* the Ceph layer it seems.
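>
> (For anyone wanting to repeat that raw-ssd check, something along these
> lines should do it -- the mount point is just an example, adjust to wherever
> the journal ssd is mounted:
>
>   dd if=/dev/zero of=/mnt/ssd-test/ddfile bs=16k count=65535 oflag=direct
>   # add dsync as well to mimic the journal's sync write pattern:
>   dd if=/dev/zero of=/mnt/ssd-test/ddfile bs=16k count=65535 oflag=direct,dsync
> )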
>
> Regards
>
> Mark
>
> [1] I set filestore_max_sync_interval = 100 (30G journal...ssd able to do
> 180 MB/s etc), however I am still seeing writes to the spinners during the
> 8s or so that the above dd tests take.
> [2] Ubuntu 13.10 VM - I'll upgrade it to 14.04 and see if that helps at
> all.
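>
> (For reference, the [1] tweak lives in ceph.conf, roughly like so -- journal
> size in MB, values just mirroring what I described above:
>
>   [osd]
>   osd_journal_size = 30720
>   filestore_max_sync_interval = 100
> )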
>
>
> On 21/06/14 09:17, Greg Poirier wrote:
>
>> Thanks Tyler. So, I'm not totally crazy. There is something weird going
>> on.
>>
>> I've looked into things about as much as I can:
>>
>> - We have tested with collocated journals and dedicated journal disks.
>> - We have bonded 10Gb nics and have verified network configuration and
>> connectivity is sound
>> - We have run dd independently on the SSDs in the cluster and they are
>> performing fine
>> - We have tested both in a VM and with the RBD kernel module and get
>> identical performance
>> - We have pool size = 3, pool min size = 2 and have tested with min size
>> of 2 and 3 -- the performance impact is not bad
>> - osd_op times are approximately 6-12 ms (see the admin-socket example after this list)
>> - osd_sub_op times are 6-12 ms
>> - iostat reports service time of 6-12ms
>> - Latency between the storage and rbd client is approximately 0.1-0.2 ms
>> - Disabling replication entirely did not help significantly
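>>
>> (For reference, osd_op / osd_sub_op latencies can be read from each OSD's
>> admin socket, e.g. -- assuming the default socket path and osd.0 as an
>> example:
>>
>>   ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok perf dump
>>   ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok dump_historic_ops
>> )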
>>
>>
>>
>>
>> On Fri, Jun 20, 2014 at 2:13 PM, Tyler Wilson <[email protected]> wrote:
>>
>>     Greg,
>>
>>     Not a real fix for you but I too run a full-ssd cluster and am able
>>     to get 112 MB/s with your command:
>>
>>     [root@plesk-test ~]# dd if=/dev/zero of=testfilasde bs=16k
>>     count=65535 oflag=direct
>>     65535+0 records in
>>     65535+0 records out
>>     1073725440 bytes (1.1 GB) copied, 9.59092 s, 112 MB/s
>>
>>     This of course is in a VM, here is my ceph config
>>
>>     [global]
>>     fsid = <hidden>
>>     mon_initial_members = node-1 node-2 node-3
>>     mon_host = 192.168.0.3 192.168.0.4 192.168.0.5
>>     auth_supported = cephx
>>     osd_journal_size = 2048
>>     filestore_xattr_use_omap = true
>>     osd_pool_default_size = 2
>>     osd_pool_default_min_size = 1
>>     osd_pool_default_pg_num = 1024
>>     public_network = 192.168.0.0/24
>>     osd_mkfs_type = xfs
>>     cluster_network = 192.168.1.0/24
>>
>>
>>
>>
>>     On Fri, Jun 20, 2014 at 11:08 AM, Greg Poirier <[email protected]> wrote:
>>
>>         I recently created a 9-node Firefly cluster backed by all SSDs.
>>         We have had some pretty severe performance degradation when
>>         using O_DIRECT in our tests (as this is how MySQL will be
>>         interacting with RBD volumes, this makes the most sense for a
>>         preliminary test). Running the following test:
>>
>>         dd if=/dev/zero of=testfilasde bs=16k count=65535 oflag=direct
>>
>>         779829248 bytes (780 MB) copied, 604.333 s, 1.3 MB/s
>>
>>         Shows us only about 1.5 MB/s throughput and 100 IOPS from the
>>         single dd thread. Running a second dd process does show
>>         increased throughput which is encouraging, but I am still
>>         concerned by the low throughput of a single thread w/ O_DIRECT.
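>>
>>         (Two threads here just means two dd processes launched in parallel,
>>         e.g. roughly:
>>
>>         dd if=/dev/zero of=testfile1 bs=16k count=65535 oflag=direct &
>>         dd if=/dev/zero of=testfile2 bs=16k count=65535 oflag=direct &
>>         wait
>>         )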
>>
>>         Two threads:
>>         779829248 bytes (780 MB) copied, 604.333 s, 1.3 MB/s
>>         126271488 bytes (126 MB) copied, 99.2069 s, 1.3 MB/s
>>
>>         I am testing with an RBD volume mounted with the kernel module
>>         (I have also tested from within KVM, similar performance).
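>>
>>         (For the kernel-module case the mapping is the usual rbd map, e.g.
>>         -- pool/image names, the rbd device number and the mount point are
>>         just placeholders:
>>
>>         rbd map rbd/test-image
>>         mkfs.xfs /dev/rbd0
>>         mount /dev/rbd0 /mnt/rbd-test
>>         )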
>>
>>         If we allow caching, we start to see reasonable numbers from a
>>         single dd process:
>>
>>         dd if=/dev/zero of=testfilasde bs=16k count=65535
>>         65535+0 records in
>>         65535+0 records out
>>         1073725440 bytes (1.1 GB) copied, 2.05356 s, 523 MB/s
>>
>>         I can get >1GB/s from a single host with three threads.
>>
>>         Rados bench produces similar results.
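>>
>>         (e.g. something along the lines of the following, with pool name,
>>         duration and concurrency being arbitrary:
>>
>>         rados -p rbd bench 60 write -b 16384 -t 1
>>         )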
>>
>>         Is there something I can do to increase the performance of
>>         O_DIRECT? I expect performance degradation, but this much?
>>
>>         If I increase the blocksize to 4M, I'm able to get significantly
>>         higher throughput:
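>>         (i.e. the same dd with a 4M block size, e.g.
>>         dd if=/dev/zero of=testfilasde bs=4M count=1000 oflag=direct
>>         -- the count here is just illustrative)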
>>
>>         3833593856 bytes (3.8 GB) copied, 44.2964 s, 86.5 MB/s
>>
>>         This still seems very low.
>>
>>         I'm using the deadline scheduler in all places. With noop
>>         scheduler, I do not see a performance improvement.
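>>
>>         (In case it helps anyone reproduce: the scheduler can be checked
>>         and switched per device via sysfs, e.g. -- sdb is just a
>>         placeholder:
>>
>>         cat /sys/block/sdb/queue/scheduler
>>         echo noop > /sys/block/sdb/queue/scheduler
>>         )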
>>
>>         Suggestions?
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
