When executing ceph -w I see the following warning:

2013-04-09 22:38:07.288948 osd.2 [WRN] slow request 30.180683 seconds old, received at 2013-04-09 22:37:37.108178: osd_op(client.4107.1:9678 10000000002.000001df [write 0~4194304 [6@0]] 0.4e208174 snapc 1=[]) currently waiting for subops from [0]

So what could be causing this?
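
(In case it helps narrow things down, the in-flight ops on the OSDs named in that line can be dumped through the admin socket; this is only a sketch, and the socket path below assumes a default install.)

# on the node hosting osd.0, the replica the subop is waiting on
ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok dump_ops_in_flight
# cluster-wide detail on current warnings, including slow requests
ceph health detail

Running the same dump against the ceph-osd.2.asok socket should show the primary's side of the same op.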

On Tue, Apr 9, 2013 at 12:54 PM, Ziemowit Pierzycki <ziemo...@pierzycki.com> wrote:
> Neither made a difference. I also have a GlusterFS cluster with two nodes
> in replicating mode residing on 1TB drives:
>
> [root@triton speed]# dd conv=fdatasync if=/dev/zero of=/mnt/speed/test.out bs=512k count=10000
> 10000+0 records in
> 10000+0 records out
> 5242880000 bytes (5.2 GB) copied, 43.573 s, 120 MB/s
>
> ... and Ceph:
>
> [root@triton temp]# dd conv=fdatasync if=/dev/zero of=/mnt/temp/test.out bs=512k count=10000
> 10000+0 records in
> 10000+0 records out
> 5242880000 bytes (5.2 GB) copied, 366.911 s, 14.3 MB/s
>
>
> On Mon, Apr 8, 2013 at 4:29 PM, Mark Nelson <mark.nel...@inktank.com> wrote:
>> On 04/08/2013 04:12 PM, Ziemowit Pierzycki wrote:
>>> There is one SSD in each node. IPoIB performance is about 7 Gbps
>>> between each host. CephFS is mounted via the kernel client. The Ceph version
>>> is ceph-0.56.3-1. I have a 1GB journal on the same drive as the OSD but
>>> on a separate file system split via LVM.
>>>
>>> Here is the output of another test with fdatasync:
>>>
>>> [root@triton temp]# dd conv=fdatasync if=/dev/zero of=/mnt/temp/test.out bs=512k count=10000
>>> 10000+0 records in
>>> 10000+0 records out
>>> 5242880000 bytes (5.2 GB) copied, 359.307 s, 14.6 MB/s
>>> [root@triton temp]# dd if=/mnt/temp/test.out of=/dev/null bs=512k count=10000
>>> 10000+0 records in
>>> 10000+0 records out
>>> 5242880000 bytes (5.2 GB) copied, 14.0521 s, 373 MB/s
>>
>> Definitely seems off! How many SSDs are involved and how fast is each
>> one? The MTU idea might have merit, but I honestly don't know enough
>> about how well IPoIB handles giant MTUs like that. One thing I have
>> noticed on other IPoIB setups is that TCP autotuning can cause a ton of
>> problems. You may want to try disabling it on all of the hosts involved:
>>
>> echo 0 | tee /proc/sys/net/ipv4/tcp_moderate_rcvbuf
>>
>> If that doesn't work, maybe try setting the MTU to 9000 or 1500 if possible.
>>
>> Mark
>>
>>> The network traffic appears to match the transfer speeds shown here too.
>>> Writing is very slow.
>>>
>>> On Mon, Apr 8, 2013 at 3:04 PM, Mark Nelson <mark.nel...@inktank.com> wrote:
>>>
>>> Hi,
>>>
>>> How many drives? Have you tested your IPoIB performance with iperf?
>>> Is this CephFS with the kernel client? What version of Ceph? How
>>> are your journals configured? Etc. It's tough to make any
>>> recommendations without knowing more about what you are doing.
>>>
>>> Also, please use conv=fdatasync when doing buffered IO writes with dd.
>>>
>>> Thanks,
>>> Mark
>>>
>>> On 04/08/2013 03:00 PM, Ziemowit Pierzycki wrote:
>>>
>>> Hi,
>>>
>>> The first test was writing a 500 MB file and was clocked at 1.2 GB/s. The
>>> second test was writing a 5000 MB file at 17 MB/s. The third test was
>>> reading the file at ~400 MB/s.
>>>
>>> On Mon, Apr 8, 2013 at 2:56 PM, Gregory Farnum <g...@inktank.com> wrote:
>>>
>>> More details, please. You ran the same test twice and performance went
>>> up from 17.5 MB/s to 394 MB/s? How many drives in each node, and of what
>>> kind?
>>> -Greg
>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>
>>> On Mon, Apr 8, 2013 at 12:38 PM, Ziemowit Pierzycki <ziemo...@pierzycki.com> wrote:
>>> > Hi,
>>> >
>>> > I have a 3-node SSD-backed cluster connected over InfiniBand (16K MTU) and
>>> > here is the performance I am seeing:
>>> >
>>> > [root@triton temp]# !dd
>>> > dd if=/dev/zero of=/mnt/temp/test.out bs=512k count=1000
>>> > 1000+0 records in
>>> > 1000+0 records out
>>> > 524288000 bytes (524 MB) copied, 0.436249 s, 1.2 GB/s
>>> > [root@triton temp]# dd if=/dev/zero of=/mnt/temp/test.out bs=512k count=10000
>>> > 10000+0 records in
>>> > 10000+0 records out
>>> > 5242880000 bytes (5.2 GB) copied, 299.077 s, 17.5 MB/s
>>> > [root@triton temp]# dd if=/mnt/temp/test.out of=/dev/null bs=512k count=10000
>>> > 10000+0 records in
>>> > 10000+0 records out
>>> > 5242880000 bytes (5.2 GB) copied, 13.3015 s, 394 MB/s
>>> >
>>> > Does that look right? How do I check that this is not a network problem,
>>> > because I remember seeing a kernel issue related to large MTUs.
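
(For reference, a rough way to separate the IPoIB network from the SSDs, as discussed above; the IPoIB address and the OSD file system path below are placeholders, not taken from the thread.)

# raw IPoIB throughput: run the server on one OSD node...
iperf -s
# ...and the client on another node, against the first node's IPoIB address (placeholder)
iperf -c 192.168.100.1 -t 30

# raw write speed of the SSD file system backing the OSD, measured locally;
# the path is a placeholder for wherever the OSD data actually lives
dd conv=fdatasync if=/dev/zero of=/path/to/osd-ssd/ddtest bs=512k count=2000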

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com