[ceph-users] Re: Understanding Bluestore performance characteristics

2020-02-06 Thread vitalif
Hi Stefan, Do you mean more info than: Yes, there's more... I don't remember exactly; I think some information ends up included in the OSD perf counters and some is dumped into the OSD log, and maybe there's even a 'ceph daemon' command to trigger it... There are 4 options that enab
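A quick way to see which admin-socket commands a given OSD actually exposes (osd.0 is used here only as a placeholder) is:

    ceph daemon osd.0 help

which lists the available 'ceph daemon' commands, including the perf counter dumps mentioned above.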

[ceph-users] Re: Understanding Bluestore performance characteristics

2020-02-05 Thread Bradley Kite
Thanks Vitaliy. Posting here for the archives, in case anyone else sees the same problem; it might save them some work. After going through the code and logs (debug bluestore 20/5), it actually looks like the write-small-pre-read counter increases every time the WAL gets appended to (it reads the previ
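To repeat this kind of log analysis, the bluestore debug level can be raised at runtime on a single OSD (osd.0 as an example) without restarting it, roughly like:

    ceph daemon osd.0 config set debug_bluestore 20/5
    # run the workload, inspect /var/log/ceph/ceph-osd.0.log, then revert:
    ceph daemon osd.0 config set debug_bluestore 1/5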

[ceph-users] Re: Understanding Bluestore performance characteristics

2020-02-05 Thread Stefan Kooman
Quoting vita...@yourcmc.ru: > The SSD (block.db) partition contains object metadata in RocksDB, so it probably loads the metadata before modifying objects (if it's not in cache yet). It also sometimes performs compaction, which results in disk reads and writes. There are o

[ceph-users] Re: Understanding Bluestore performance characteristics

2020-02-05 Thread vitalif
Hi, This helped to disable deferred writes in my case: bluestore_min_alloc_size=4096 bluestore_prefer_deferred_size=0 bluestore_prefer_deferred_size_ssd=0. If you already deployed your OSDs with min_alloc_size=4K then you don't need to redeploy them again. Hi Vitaliy, I completely destroye
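For reference, these settings would look like this in ceph.conf (as noted elsewhere in the thread, bluestore_min_alloc_size only applies to newly created OSDs):

    [osd]
    bluestore_min_alloc_size = 4096
    bluestore_prefer_deferred_size = 0
    bluestore_prefer_deferred_size_ssd = 0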

[ceph-users] Re: Understanding Bluestore performance characteristics

2020-02-05 Thread Bradley Kite
Hi Vitaliy, I completely destroyed the test cluster and re-deployed it after changing these settings, but it did not make a difference: there is still a high number of deferred writes. Regards -- Brad. On Wed, 5 Feb 2020 at 10:55, wrote: > min_alloc_size can't be changed after formatting an

[ceph-users] Re: Understanding Bluestore performance characteristics

2020-02-05 Thread vitalif
min_alloc_size can't be changed after formatting an OSD, and yes, bluestore defers all writes that are < min_alloc_size. And the default min_alloc_size_ssd is 16KB.
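The effective values can be checked on a running OSD (osd.0 as an example), e.g.:

    ceph daemon osd.0 config get bluestore_min_alloc_size_ssd
    ceph daemon osd.0 config get bluestore_prefer_deferred_size_ssd

Keep in mind this shows the configured value, not necessarily the allocation size the OSD was actually formatted with.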

[ceph-users] Re: Understanding Bluestore performance characteristics

2020-02-04 Thread Bradley Kite
Hi Igor, This has been very helpful. I have identified (when numjobs=1, the least-worst case) that there are approximately as many bluestore_write_small_pre_read events per second as there are sequential-write IOPS: Tue 4 Feb 22:44:34 GMT 2020 "bluestore_write_small_pre_read":
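A simple way to sample that counter alongside a timestamp (osd.0 and the 10-second interval are just examples) is a loop like:

    while true; do
        date
        ceph daemon osd.0 perf dump | grep bluestore_write_small_pre_read
        sleep 10
    done

and then compare the counter delta against the IOPS reported by the benchmark over the same interval.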

[ceph-users] Re: Understanding Bluestore performance characteristics

2020-02-04 Thread vitalif
The SSD (block.db) partition contains object metadata in RocksDB, so it probably loads the metadata before modifying objects (if it's not in cache yet). It also sometimes performs compaction, which results in disk reads and writes. There are other things going on that I'm not completely aware of
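RocksDB compaction activity shows up in the same perf dump; counter names vary a bit between releases, but something along these lines gives a rough idea of whether compaction is happening during the test:

    ceph daemon osd.0 perf dump | grep -i compact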

[ceph-users] Re: Understanding Bluestore performance characteristics

2020-02-04 Thread Igor Fedotov
Hi Bradley, you might want to check the performance counters for this specific OSD. They are available via the 'ceph daemon osd.0 perf dump' command in Nautilus (a slightly different command in Luminous, AFAIR). Then look for the 'read' substring in the dump and try to find unexpectedly high read-related counter valu
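Concretely (osd.0 as in the example above), that boils down to something like:

    ceph daemon osd.0 perf dump | grep -i read

and watching which of the matching counters grows unexpectedly fast while the benchmark runs.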

[ceph-users] Re: Understanding Bluestore performance characteristics

2020-02-04 Thread Bradley Kite
Hi Vitaliy, Yes - I tried this and I can still see a number of reads (~110 IOPS, 440 KB/sec) on the SSD, so it is significantly better, but the result is still puzzling; I'm trying to understand what is causing the reads. The problem is amplified with numjobs >= 2, but it looks like it is still ther
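Device-level reads like these can be confirmed outside Ceph with iostat (sdc below is a placeholder for the actual block.db SSD):

    iostat -xm 1 sdc

The reported r/s and rMB/s figures should roughly match the ~110 IOPS / 440 KB/s observed above.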

[ceph-users] Re: Understanding Bluestore performance characteristics

2020-02-04 Thread Vitaliy Filippov
Hi, Try repeating your test with numjobs=1; I've already seen strange behaviour with parallel jobs to one RBD image. Also, as usual: https://yourcmc.ru/wiki/Ceph_performance :-) Hi, We have a production cluster of 27 OSDs across 5 servers (all SSDs running bluestore), and have started to
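A minimal fio invocation for that kind of single-job sequential-write test against an RBD image (pool and image names below are placeholders) could look like:

    fio --name=seqwrite --ioengine=rbd --clientname=admin --pool=rbd --rbdname=testimg \
        --rw=write --bs=4k --iodepth=16 --numjobs=1 \
        --time_based --runtime=60

Comparing the results of numjobs=1 against numjobs=2 on the same image should show whether the parallel-job effect mentioned above is in play.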