Hi Nick,

A DB's IO pattern depends on its configuration; take MySQL, for example:

innodb_flush_log_at_trx_commit = 1: MySQL syncs after every
transaction, i.e. write sync write sync ...
innodb_flush_log_at_trx_commit = 5: write write write write write sync
innodb_flush_log_at_trx_commit = 0: write write ... then, about one
second later, sync

This may not be entirely accurate, but it is more or less the pattern.
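To make the three patterns concrete, here is a minimal C sketch of the
write()/fsync() sequences described above. This is only an illustration
(the file name, record size, and loop counts are made up), not InnoDB's
actual code:

    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        char rec[512];
        int fd = open("redo.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
        if (fd < 0)
            return 1;
        memset(rec, 0, sizeof(rec));

        /* = 1: one sync per transaction commit */
        for (int i = 0; i < 5; i++) {
            write(fd, rec, sizeof(rec));
            fsync(fd);
        }

        /* = 5 (as described above): sync once per batch of writes */
        for (int i = 0; i < 5; i++)
            write(fd, rec, sizeof(rec));
        fsync(fd);

        /* = 0: writes are buffered; the sync happens ~once per second */
        write(fd, rec, sizeof(rec));
        sleep(1);
        fsync(fd);

        close(fd);
        return 0;
    }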
We tested MySQL TPS with innodb_flush_log_at_trx_commit = 1 and got
very poor performance, even though we can reach very high O_DIRECT
randwrite IOPS with fio.

2016-02-26 16:59 GMT+08:00 Nick Fisk <n...@fisk.me.uk>:
>
> > -----Original Message-----
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> > Huan Zhang
> > Sent: 26 February 2016 06:50
> > To: Jason Dillaman <dilla...@redhat.com>
> > Cc: josh durgin <josh.dur...@inktank.com>; Nick Fisk <n...@fisk.me.uk>;
> > ceph-users <ceph-us...@ceph.com>
> > Subject: Re: [ceph-users] Guest sync write iops so poor.
> >
> > rbd engine with fsync=1 seems stuck.
> > Jobs: 1 (f=1): [w(1)] [0.0% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
> > 1244d:10h:39m:18s]
> >
> > But fio against /dev/rbd0 with sync=1 direct=1 ioengine=libaio
> > iodepth=64 gets very high IOPS (~35K), similar to direct writes.
> >
> > I'm confused by that result. IMHO, Ceph could just ignore the sync
> > cache command, since it always uses sync writes to the journal, right?
>
> Even if the data is not sync'd to the data storage part of the OSD, the
> data still has to be written to the journal, and this is where the
> performance limit lies.
>
> The very nature of SDS means that you are never going to achieve the
> same latency as you do to a local disk: even if the software side
> introduced no extra latency, the network latency alone will severely
> limit your sync performance.
>
> Do you know the IO pattern the DBs generate? I know you can switch most
> DBs to flush with O_DIRECT instead of sync; it might be that this helps
> in your case.
>
> Also check out the tech talk from last month about high-performance
> databases on Ceph. The presenter gave the impression that, at least in
> their case, not every write was a sync IO. So your results could
> possibly matter less than you think.
>
> Also, please search the lists and past presentations about reducing
> write latency. There are a few things you can do, like disabling
> logging and some kernel parameters to stop the CPUs entering sleep
> states/reducing frequency. One thing I witnessed is that if the Ceph
> cluster is only running at low queue depths, so it's only generating
> low CPU load, all the cores on the CPUs throttle themselves down to
> their lowest speeds, which really hurts latency.
>
> > Why do we get such bad sync IOPS, and how does Ceph handle it?
> > We would very much appreciate your reply!
> >
> > 2016-02-25 22:44 GMT+08:00 Jason Dillaman <dilla...@redhat.com>:
> > > 35K IOPS with ioengine=rbd sounds like the "sync=1" option doesn't
> > > actually work. Or it's not touching the same object (but I wonder
> > > whether write ordering is preserved at that rate?).
> >
> > The fio rbd engine does not support "sync=1"; however, it should
> > support "fsync=1" to accomplish roughly the same effect.
> >
> > Jason
>
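P.S. In case it is useful, here is a rough C sketch of the
queue-depth-1 sync write test fio performs, to show why IOPS here is
just 1/latency: each IO must complete (an O_DIRECT write plus fsync)
before the next is issued. The file path, block size, and iteration
count are placeholders; point it at scratch space you can overwrite:

    #define _GNU_SOURCE            /* for O_DIRECT on Linux */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>
    #include <unistd.h>

    int main(void)
    {
        /* Placeholder path: use a throwaway file or device. */
        int fd = open("./sync_test.bin", O_WRONLY | O_CREAT | O_DIRECT, 0644);
        if (fd < 0) { perror("open"); return 1; }

        void *buf;
        if (posix_memalign(&buf, 4096, 4096))  /* O_DIRECT needs alignment */
            return 1;
        memset(buf, 0xab, 4096);

        struct timespec t0, t1;
        int n = 1000;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < n; i++) {
            pwrite(fd, buf, 4096, 0);   /* one 4K write ...         */
            fsync(fd);                  /* ... forced stable per IO */
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double sec = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        printf("%.0f sync IOPS, %.2f ms avg latency\n", n / sec, sec / n * 1e3);
        close(fd);
        return 0;
    }

On a local SSD this loop reaches thousands of IOPS; over a network
round trip to a Ceph journal the same loop is bounded by latency,
which is the point made above.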