I'm not sure if I'm doing something wrong or just seeing an oddity, but
when my cache tier flushes dirty blocks out to the base tier, the writes
seem to hit the OSDs straight away instead of coalescing in the journals.
Is this expected?

For example, if I create an RBD image on a standard 3-way replica pool and
run fio via librbd with 128k writes, I see the journals take all the IO
until filestore_min_sync_interval is reached, and only then does writing to
the underlying disks start.

Doing the same on a full cache tier (to force flushing), I immediately see
the base disks at very high utilisation, while the journals also take some
write IO at the same time. The other odd thing I can see via iostat is that
for most of the fio run the underlying disks are doing very small writes of
around 16kB, with an occasional big burst of activity.

I know erasure coding plus a cache tier is slower than a plain replicated
pool, but even at various high queue depths I'm struggling to get much
above 100-150 IOPS, compared to a 3-way replica pool which can easily
achieve 1000-1500. The base tier comprises 40 disks. It seems quite a
marked difference, and I'm wondering whether this strange journal behaviour
is the cause.
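In case it helps anyone reproduce this, these are the journal/sync tunables I've been looking at, queried via an OSD's admin socket (osd.0 is just an example id; run on the host that carries that OSD):

```shell
# Dump the sync/journal tunables for one OSD via its admin socket.
# "osd.0" is a placeholder for any OSD id local to this host.
ceph daemon osd.0 config show | \
    egrep 'filestore_(min|max)_sync_interval|journal_max_write'
```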

Does anyone have any ideas?

Nick


_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com