I've found that these settings give Ceph clients much more consistent latency than the default scheduler, and they also reduce the impact of backfills and recoveries. They may not give you better performance (although I have seen them allow all disks to be utilized at 100% rather than only as fast as the slowest disk), but they could help with the VM that's hitting SCSI resets.
Put this in your ceph.conf and restart your OSDs:

    osd op queue = wpq
    osd op queue cut off = high

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1


On Tue, Aug 13, 2019 at 6:12 PM Richard Bade <hitr...@gmail.com> wrote:
> Hi Everyone,
> There have been a few threads around about small HDD (spinning disk)
> clusters and performance on Bluestore.
> One recently from Christian
> (http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-August/036385.html)
> was particularly interesting to us, as we have a very similar setup to
> what Christian has and we see similar performance.
>
> We have a 6 node cluster, each node with 12x 4TB SATA HDD, IT mode LSI
> 3008, and wal/db on 33GB NVMe partitions. Each node has a single Xeon
> Gold 6132 CPU @ 2.60GHz and dual 10Gb network.
> We also use bcache, with one 180GB NVMe partition shared between 6 OSDs.
> Workload is via KVM (Proxmox).
>
> I did the same benchmark fio tests as Christian. Here are my results (M
> for me, C for Christian):
>
> direct=0
> ========
> M -- read : io=6008.0MB, bw=203264KB/s, iops=49, runt= 30267msec
> C -- read : IOPS=40, BW=163MiB/s (171MB/s)(7556MiB/46320msec)
>
> direct=1
> ========
> M -- read : io=32768MB, bw=1991.4MB/s, iops=497, runt= 16455msec
> C -- read : IOPS=314, BW=1257MiB/s (1318MB/s)(32.0GiB/26063msec)
>
> direct=0
> ========
> M -- write: io=32768MB, bw=471105KB/s, iops=115, runt= 71225msec
> C -- write: IOPS=119, BW=479MiB/s (503MB/s)(32.0GiB/68348msec)
>
> direct=1
> ========
> M -- write: io=32768MB, bw=479829KB/s, iops=117, runt= 69930msec
> C -- write: IOPS=139, BW=560MiB/s (587MB/s)(32.0GiB/58519msec)
>
> I should probably mention that there was some active workload on the
> cluster at the time, around 500 iops of writes and 100MB/s of
> throughput.
> The main problem we're having with this cluster is how easily it hits
> slow requests, and we have one particular VM that ends up doing SCSI
> resets because of the latency.
>
> So we're considering switching these OSDs to Filestore.
> We have two other clusters using Filestore/bcache/SSD journal, and the
> performance seems to be much better on those, taking the different
> sizes into account.
> What are people's thoughts on a cluster of this size? Is it just not a
> good fit for Bluestore and our type of workload?
> Also, does anyone have any knowledge of future support for Filestore?
> I'm concerned that we may have to migrate our other clusters off
> Filestore sometime in the future, and that'll hurt us with the current
> performance.
>
> Rich
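P.S. For anyone applying the above: a sketch of where the settings sit in ceph.conf, assuming a standard layout (I'm putting them under the [osd] section here; [global] works too):

    [osd]
    osd op queue = wpq
    osd op queue cut off = high

Both options are only read at OSD startup, so after editing the file, restart each OSD; on a systemd-based deployment that would be, e.g.:

    systemctl restart ceph-osd@<id>

substituting each OSD's id in turn.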
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io