Re: [ceph-users] Is there a setting on Ceph that we can use to fix the minimum read size?

Thomas Bennett Wed, 30 Nov 2016 04:58:34 -0800

Hi Kate and Steve,

Thanks for the replies. Always good to hear back from a community :)


I'm using Linux on x86_64, where the filesystem block size is limited to the
page size, which is 4k. So it looks like I'm hitting a hard limit in any
attempt to increase the block size.

I found this out by running the following commands:

$ mkfs.xfs -f -b size=8192 /dev/sda1

$ mount -v /dev/sda1 /tmp/disk/
mount: Function not implemented #huh???

Checking out the man page:

$ man mkfs.xfs
 -b block_size_options
      ... XFS  on  Linux  currently  only  supports pagesize or smaller
      blocks.
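
For reference, a minimal sketch of checking a requested block size against the
page size before running mkfs.xfs. The helper name check_block_size is
hypothetical (it is not part of xfsprogs); getconf PAGESIZE is the standard way
to query the page size, and the 8192 value is the one tried above.

```shell
# Sketch: compare a requested XFS block size with the kernel page size.
# check_block_size is a hypothetical helper, not an xfsprogs command.
check_block_size() {
    requested="$1"
    page_size="$(getconf PAGESIZE)"   # typically 4096 on x86_64 Linux
    if [ "$requested" -gt "$page_size" ]; then
        echo "block size $requested is larger than page size $page_size; mount may fail"
        return 1
    fi
    echo "block size $requested fits within page size $page_size"
}

# The value tried above; '|| true' keeps a failing check from aborting a script.
check_block_size 8192 || true
```

Running the check first avoids reformatting a drive only to find that the
resulting filesystem cannot be mounted.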

I'm hesitant to implement btrfs as it's still experimental, and ext4 seems to
have the same limitation at present.

Our current approach is to exclude the hard drive that gives us the poor
read rates from our procurement process, but it would still be nice to
find out how much control we have over how ceph-osd daemons read from the
drives. I may attempt an strace on an OSD daemon while we read, to see what
read request size is actually being passed to the kernel.
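
A rough sketch of how that strace check could be summarised, under some
assumptions: that the OSD issues pread64 calls (adjust the syscall name if it
uses read or preadv instead), and that strace prints one call per line in its
usual form. The helper name summarize_read_sizes and the OSD_PID variable are
hypothetical.

```shell
# Sketch: histogram of requested read sizes from strace output on stdin.
# summarize_read_sizes is a hypothetical helper. It assumes lines such as
#   pread64(FD, "..."..., COUNT, OFFSET) = RETURN
# and counts how often each COUNT (the requested size in bytes) appears.
summarize_read_sizes() {
    awk '
        /pread64/ && match($0, /, [0-9]+, [0-9]+\) = [0-9]+$/) {
            tail = substr($0, RSTART + 2)   # "COUNT, OFFSET) = RETURN"
            split(tail, parts, ", ")
            sizes[parts[1]]++
        }
        END {
            for (s in sizes) print s, sizes[s]
        }
    ' | sort -n
}

# Possible usage (strace writes to stderr, hence 2>&1):
#   strace -f -e trace=pread64 -p "$OSD_PID" 2>&1 | summarize_read_sizes
```

Each output line is a request size in bytes followed by the number of calls
with that size, so a large count next to 4096 would point at small reads
reaching the kernel.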

Cheers,
Tom


On Tue, Nov 29, 2016 at 11:53 PM, Steve Taylor <
steve.tay...@storagecraft.com> wrote:

> We configured XFS on our OSDs to use 1M blocks (our use case is RBDs with
> 1M blocks) due to massive fragmentation in our filestores a while back. We
> were having to defrag all the time and cluster performance was noticeably
> degraded. We also create and delete lots of RBD snapshots on a daily basis,
> so that likely contributed to the fragmentation as well. It’s been MUCH
> better since we switched XFS to use 1M allocations. Virtually no
> fragmentation and performance is consistently good.
>
>
>
> *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf
> Of *Kate Ward
> *Sent:* Tuesday, November 29, 2016 2:02 PM
> *To:* Thomas Bennett <tho...@ska.ac.za>
> *Cc:* ceph-users@lists.ceph.com
> *Subject:* Re: [ceph-users] Is there a setting on Ceph that we can use to
> fix the minimum read size?
>
>
>
> I have no experience with XFS, but wouldn't expect poor behaviour with it.
> I use ZFS myself and know that it would combine writes, but btrfs might be
> an option.
>
>
>
> Do you know what block size was used to create the XFS filesystem? It
> looks like 4k is the default (reasonable) with a max of 64k. Perhaps a
> larger block size will give better performance for your particular use
> case. (I use a 1M block size with ZFS.)
>
> http://xfs.org/docs/xfsdocs-xml-dev/XFS_User_Guide/tmp/en-US/html/ch04s02.html
>
>
>
>
>
> On Tue, Nov 29, 2016 at 10:23 AM Thomas Bennett <tho...@ska.ac.za> wrote:
>
> Hi Kate,
>
>
>
> Thanks for your reply. We currently use xfs as created by ceph-deploy.
>
>
>
> What would you recommend we try?
>
>
>
> Kind regards,
>
> Tom
>
>
>
>
>
> On Tue, Nov 29, 2016 at 11:14 AM, Kate Ward <kate.w...@forestent.com>
> wrote:
>
> What filesystem do you use on the OSD? Have you considered a different
> filesystem that is better at combining requests before they get to the
> drive?
>
>
>
> k8
>
>
>
> On Tue, Nov 29, 2016 at 9:52 AM Thomas Bennett <tho...@ska.ac.za> wrote:
>
> Hi,
>
>
>
> We have a use case where we are reading 128MB objects off spinning disks.
>
>
>
> We've benchmarked a number of different hard drives and have noticed that
> one particular hard drive gives slow reads by comparison.
>
>
>
> This occurs when we have multiple readers (even just 2) reading objects
> off the OSD.
>
>
>
> We've recreated the effect using iozone and have noticed that once the
> record size drops to 4k, the hard drive misbehaves.
>
>
>
> Is there a setting on Ceph that we can change to fix the minimum read size
> when the ceph-osd daemon reads the object off the hard drives, to see if we
> can overcome the overall slow read rate?
>
>
>
> Cheers,
>
> Tom
>
> ------------------------------
>
> Steve Taylor | Senior Software Engineer | StorageCraft Technology
> Corporation <https://storagecraft.com>
> 380 Data Drive Suite 300 | Draper | Utah | 84020
> Office: 801.871.2799
>
> ------------------------------
>
> If you are not the intended recipient of this message or received it
> erroneously, please notify the sender and delete it, together with any
> attachments, and be advised that any dissemination or copying of this
> message is prohibited.
>
> ------------------------------
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
>
>
> --
>
> Thomas Bennett
>
>
>
> SKA South Africa
>
> Science Processing Team
>
>
>
> Office: +27 21 5067341
>
> Mobile: +27 79 5237105
>
>


-- 
Thomas Bennett

SKA South Africa
Science Processing Team

Office: +27 21 5067341
Mobile: +27 79 5237105
