I don't have any SSDs in the cluster to test with. Also, without knowing the exact reason why having the write cache enabled has such a negative effect, I wouldn't be sure the same would hold for SSDs.
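
For anyone who does want to repeat the comparison on their own hardware, something along these lines should work (a rough, untested sketch; device names and paths will differ). `hdparm -W` with no value just reports the current cache state, `ceph osd perf` shows per-OSD commit/apply latency, and the rotational flag in /sys keeps the loop away from any SSDs:

    for dev in /dev/sd?; do
        disk=$(basename "$dev")
        # only touch rotational (HDD) devices, leave any SSDs as they are
        if [ "$(cat /sys/block/$disk/queue/rotational)" = "1" ]; then
            hdparm -W "$dev"      # report the current write cache state
            hdparm -W 0 "$dev"    # disable the volatile write cache
        fi
    done
    # then watch commit/apply latency per OSD while re-running the fio test quoted below
    ceph osd perf
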
On Sun, 11 Nov 2018 at 6:41 PM, Marc Roos <m.r...@f1-outsourcing.eu> wrote:
>
> Does it make sense to test disabling this on hdd cluster only?
>
>
> -----Original Message-----
> From: Ashley Merrick [mailto:singap...@amerrick.co.uk]
> Sent: zondag 11 november 2018 6:24
> To: vita...@yourcmc.ru
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Disabling write cache on SATA HDDs reduces
> write latency 7 times
>
> I've just worked out I had the same issue; I've been trying to work out
> the cause for the past few days!
>
> However, I am using brand new enterprise Toshiba drives with 256MB write
> cache, and was seeing I/O wait peaks of 40% even during a small write
> operation to Ceph, with commit/apply latencies of 40ms+.
>
> I just went through and disabled the write cache on each drive and ran a
> few tests: exact same write performance, but I/O wait is now under 1%
> and commit/apply latencies are 1-3ms max.
>
> Something somewhere definitely doesn't like the write cache being
> enabled on the disks. This is an EC pool on the latest Mimic version.
>
> On Sun, Nov 11, 2018 at 5:34 AM Vitaliy Filippov <vita...@yourcmc.ru>
> wrote:
>
> Hi,
>
> A weird thing happens in my test cluster made from desktop hardware.
> The command `for i in /dev/sd?; do hdparm -W 0 $i; done` increases
> single-thread write iops (reduces latency) 7 times!
>
> It is a 3-node cluster with Ryzen 2700 CPUs, 3x SATA 7200rpm HDDs plus
> 1x SATA desktop SSD for the system and ceph-mon and 1x SATA server SSD
> for block.db/WAL in each host. Hosts are linked by 10gbit ethernet (not
> the fastest one though; average RTT according to flood-ping is 0.098ms).
> Ceph and OpenNebula are installed on the same hosts, and the OSDs are
> prepared with ceph-volume and bluestore with default options. The SSDs
> have capacitors ('power-loss protection'), and their write cache has
> been turned off since the very beginning (hdparm -W 0 /dev/sdb). They're
> quite old, but each of them is capable of delivering ~22000 iops in
> journal mode (fio -sync=1 -direct=1 -iodepth=1 -bs=4k -rw=write).
>
> However, the RBD single-threaded random-write benchmark originally gave
> awful results - when testing with `fio -ioengine=libaio -size=10G
> -sync=1 -direct=1 -name=test -bs=4k -iodepth=1 -rw=randwrite -runtime=60
> -filename=./testfile` from inside a VM, the result was only 58 iops on
> average (17ms latency). This was not what I expected from the HDD+SSD
> setup.
>
> But today I tried to play with the cache settings for the data disks,
> and I was really surprised to discover that just disabling the HDD write
> cache (hdparm -W 0 /dev/sdX for all HDD devices) increases
> single-threaded performance ~7 times! The result from the same VM
> (without even rebooting it) is iops=405, avg lat=2.47ms. That's an order
> of magnitude faster, and in fact 2.5ms seems like an expected number.
>
> As I understand it, 4k writes are always deferred at the default setting
> of prefer_deferred_size_hdd=32768, which means they should only get
> written to the journal device before the OSD acks the write operation.
>
> So my question is WHY? Why does the HDD write cache affect commit
> latency with the WAL on an SSD?
>
> I would also appreciate it if anybody with a similar setup (HDD+SSD
> with desktop SATA controllers or HBA) could test the same thing...
>
> --
> With best regards,
> Vitaliy Filippov
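
One extra note in case anyone rolls this out more widely: as far as I know `hdparm -W 0` is not guaranteed to persist across a reboot or a hot-swapped drive, so the setting has to be reapplied at boot. A udev rule roughly like the following should do it (untested sketch; the rule file name is arbitrary and the hdparm path varies by distro, e.g. /sbin vs /usr/sbin):

    # /etc/udev/rules.d/99-hdd-write-cache.rules (example name)
    # disable the volatile write cache on rotational SATA/SAS disks as they appear
    ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", RUN+="/sbin/hdparm -W 0 /dev/%k"
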
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com