I've just submitted a patch[1] upstream for review to help with this problem. The key is to speed up the writeback rate when fragmentation is high. Here are the comments from the patch:
The current way of calculating the writeback rate only considers the dirty sectors. This usually works fine when fragmentation is not high, but it gives an unreasonably small rate in situations where very few dirty sectors consume a lot of dirty buckets. In some cases the dirty buckets can reach CUTOFF_WRITEBACK_SYNC while the dirty data (sectors) has not even reached writeback_percent; the writeback rate then stays at the minimum value (4k), so all writes get stuck in non-writeback mode because of the slow writeback.

This patch tries to accelerate the writeback rate when fragmentation is high. It calculates the proportional_scaled value as follows:

  proportional_scaled = (dirty_sectors / writeback_rate_p_term_inverse) * fragment

As we can see, higher fragmentation results in a larger proportional_scaled value and thus a larger writeback rate. The fragment value is calculated as:

  fragment = (dirty_buckets * bucket_size) / dirty_sectors

If you think about it, the value of fragment will always be within [1, bucket_size].

This patch only considers fragmentation once the number of dirty buckets has reached a dirty threshold (configurable via writeback_fragment_percent, default 50), so bcache keeps its original behaviour until the dirty buckets reach that threshold. A rough standalone sketch of this calculation follows at the end of this message.

[1] https://marc.info/?l=linux-bcache&m=160441418209114&w=1

https://bugs.launchpad.net/bugs/1900438

Title:
  Bcache bypasses writeback on caching device with fragmentation

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Hello,

  An upstream bug has been open on this matter for quite some time now [0]. I can reproduce it easily on our production compute node instances, which are trusty hosts with xenial HWE kernels (4.15.0-101-generic). However, because of heavy backporting, doing real tracing there is a bit hard. I was able to reproduce the behaviour on a bionic HWE kernel as well.

  Since most of our critical deployments use bcache, this is a rather nasty bug to have.

  Reproducing the issue is relatively easy with the script provided in the bug [1]. The script used to capture the stats is this one [2].

  [0]: https://bugzilla.kernel.org/show_bug.cgi?id=206767
  [1]: https://pastebin.ubuntu.com/p/YnnvvSRhXK/
  [2]: https://pastebin.ubuntu.com/p/XfVpzg32sN/
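To make the patch comments above concrete, below is a minimal userspace sketch of the calculation, not the actual kernel code from the patch. The struct fields mirror the names used in the description (dirty_sectors, dirty_buckets, bucket_size, writeback_rate_p_term_inverse, writeback_fragment_percent); total_buckets and the interpretation of the threshold as a percentage of the cache's total buckets are my assumptions, and the numbers in main() are made up.

/*
 * Illustrative sketch of the writeback-rate scaling described above.
 * NOT the patch itself; it only mirrors the formulas:
 *   fragment            = (dirty_buckets * bucket_size) / dirty_sectors
 *   proportional_scaled = (dirty_sectors / writeback_rate_p_term_inverse) * fragment
 */
#include <stdint.h>
#include <stdio.h>

struct wb_state {
	uint64_t dirty_sectors;    /* dirty data, in sectors */
	uint64_t dirty_buckets;    /* buckets holding any dirty data */
	uint64_t total_buckets;    /* buckets on the cache device (assumption) */
	uint64_t bucket_size;      /* bucket size, in sectors */
	uint64_t p_term_inverse;   /* writeback_rate_p_term_inverse */
	uint64_t fragment_percent; /* writeback_fragment_percent, default 50 */
};

static uint64_t proportional_scaled(const struct wb_state *s)
{
	uint64_t p, fragment;

	if (!s->dirty_sectors)
		return 0;

	/* Original behaviour: proportional term from dirty sectors only. */
	p = s->dirty_sectors / s->p_term_inverse;

	/*
	 * Only consider fragmentation once enough buckets are dirty.
	 * Assumption: the threshold is a percentage of total buckets.
	 */
	if (s->dirty_buckets * 100 < s->fragment_percent * s->total_buckets)
		return p;

	/*
	 * fragment lies in [1, bucket_size]: 1 when every dirty bucket is
	 * completely full of dirty sectors, bucket_size when each dirty
	 * bucket holds only a single dirty sector.
	 */
	fragment = (s->dirty_buckets * s->bucket_size) / s->dirty_sectors;

	return p * fragment;
}

int main(void)
{
	/* Made-up example: few dirty sectors spread across many buckets. */
	struct wb_state s = {
		.dirty_sectors = 1000,
		.dirty_buckets = 800,
		.total_buckets = 1024,
		.bucket_size = 1024,
		.p_term_inverse = 40,
		.fragment_percent = 50,
	};

	printf("proportional_scaled = %llu\n",
	       (unsigned long long)proportional_scaled(&s));
	return 0;
}

In this made-up example the fragment factor is (800 * 1024) / 1000 = 819, so the proportional term 1000 / 40 = 25 is scaled up to 20475, which is the kind of boost needed to keep writeback from idling at the minimum rate while the cache is full of sparsely dirty buckets.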