Hi Steve and Kate,

Thanks again for the great suggestions.

Increasing the allocsize did not help with the issue I'm currently testing
(poor read performance). However, allocsize is a great parameter for
overall performance tuning and I intend to use it. :)

After discussing with colleagues and reading this article on the ubuntu
drive io scheduler
<http://askubuntu.com/questions/784442/why-does-ubuntu-16-04-set-all-drive-io-schedulers-to-deadline>,
I decided to try out the cfq io scheduler (ubuntu now defaults to deadline).

This made a significant difference - it actually doubled the overall read
performance.

I suggest that anyone running ubuntu 14.04 or higher with high density osd
nodes (we have 48 osds per osd node) test out cfq. It's a pretty easy test
to perform :) and can be done on the fly, as shown below.
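
For anyone who wants to try it, a minimal sketch of the on-the-fly switch
(sdb is just an example device; repeat for each osd data disk):

# show the available schedulers - the active one is shown in brackets
$ cat /sys/block/sdb/queue/scheduler

# switch that disk to cfq on the fly, no remount or reboot needed
$ echo cfq | sudo tee /sys/block/sdb/queue/scheduler

# to keep it across reboots, set elevator=cfq on the kernel command line
# (e.g. GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub) or use a udev rule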

Cheers,
Tom

On Wed, Nov 30, 2016 at 5:50 PM, Steve Taylor <steve.tay...@storagecraft.com
> wrote:

> We’re using Ubuntu 14.04 on x86_64. We just added ‘osd mount options xfs =
> rw,noatime,inode64,allocsize=1m’ to the [osd] section of our ceph.conf so
> XFS allocates 1M blocks for new files. That only affected new files, so
> manual defragmentation was still necessary to clean up older data, but once
> that was done everything got better and stayed better.
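>
> Spelled out in ceph.conf, that's simply the quoted mount options line
> placed in the [osd] section:
>
> [osd]
> osd mount options xfs = rw,noatime,inode64,allocsize=1m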
>
>
>
> You can use the xfs_db command to check fragmentation on an XFS volume and
> xfs_fsr to perform a defragmentation. The defragmentation can run on a
> mounted filesystem too, so you don’t even have to rely on Ceph to avoid
> downtime. I probably wouldn’t run it everywhere at once though for
> performance reasons. A single OSD at a time would be ideal, but that’s a
> matter of preference.
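>
> Concretely, something like the following (the device and mount point are
> examples only; substitute your own):
>
> # report the fragmentation factor of an XFS volume (read-only, so it's
> # safe to run against a mounted filesystem)
> $ sudo xfs_db -r -c frag /dev/sdb1
>
> # defragment the mounted filestore, one OSD at a time
> $ sudo xfs_fsr -v /var/lib/ceph/osd/ceph-0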
>
>
>
> *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf
> Of *Thomas Bennett
> *Sent:* Wednesday, November 30, 2016 5:58 AM
>
> *Cc:* ceph-users@lists.ceph.com
> *Subject:* Re: [ceph-users] Is there a setting on Ceph that we can use to
> fix the minimum read size?
>
>
>
> Hi Kate and Steve,
>
>
>
> Thanks for the replies. Always good to hear back from the community :)
>
>
>
> I'm using Linux on the x86_64 architecture, where the XFS block size is
> limited to the page size, which is 4k. So it looks like I'm hitting a hard
> limit on any attempt to increase the block size.
>
>
>
> I found this out by running the following commands:
>
>
>
> $ mkfs.xfs -f -b size=8192 /dev/sda1
>
>
>
> $ mount -v /dev/sda1 /tmp/disk/
>
> mount: Function not implemented #huh???
>
>
>
> Checking out the man page:
>
>
>
> $ man mkfs.xfs
>
>  -b block_size_options
>
>       ... XFS  on  Linux  currently  only  supports pagesize or smaller
> blocks.
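>
> The page size is easy to confirm, and on this x86_64 machine it is indeed
> 4k:
>
> $ getconf PAGE_SIZE
> 4096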
>
>
>
> I'm hesitant to implement btrfs as it's still experimental, and ext4 seems
> to have the same limitation.
>
>
>
> Our current approach is to exclude the hard drive model that gives us the
> poor read rates from our procurement process, but it would still be nice to
> find out how much control we have over how the ceph-osd daemons read from
> the drives. I may attempt an strace on an osd daemon while it's reading to
> see what read request sizes are actually being issued to the kernel.
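>
> Something along these lines is what I have in mind (the pid is a
> placeholder for whichever ceph-osd process is being traced):
>
> # log the read-family syscalls, with their sizes, from a running osd
> $ sudo strace -f -p <osd-pid> -e trace=read,pread64,preadv -o /tmp/osd-reads.log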
>
>
>
> Cheers,
>
> Tom
>
>
>
>
>
> On Tue, Nov 29, 2016 at 11:53 PM, Steve Taylor <
> steve.tay...@storagecraft.com> wrote:
>
> We configured XFS on our OSDs to use 1M blocks (our use case is RBDs with
> 1M blocks) due to massive fragmentation in our filestores a while back. We
> were having to defrag all the time and cluster performance was noticeably
> degraded. We also create and delete lots of RBD snapshots on a daily basis,
> so that likely contributed to the fragmentation as well. It’s been MUCH
> better since we switched XFS to use 1M allocations. Virtually no
> fragmentation and performance is consistently good.
>
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
