Hi.

On Thu, Jun 07, 2018 at 11:05:27AM +0300, Abdullah Ramazanoğlu wrote:
> On Thu, 7 Jun 2018 10:52:01 +0300 Reco said:
>
> [--8<--]
>
> > I.e. 12309 bug is back. It's obscure and presumably fixed (at least four
> > times fixed) bug that happens with relatively slow filesystem (be it
> > SSD/HDD/NFS or whatever) and a large amount of free RAM. I first
> > encountered the thing back in 2.6.18 days, where it was presumably
> > implemented (as in - nobody complained before ;).
> >
> > The idea behind that bug is simple - first, the kernel accumulates a
> > certain amount of 'dirty' (i.e. changed) filesystem blocks. Since the
> > amount of free RAM is large, the amount of such blocks is huge too.
> > Next, kernel realizes that it's time for a 'barrier write' - everything
> > that was happening before the barrier must be written onto persistent
> > storage. And since it's 'barrier write time', everyone at userspace are
> > blocked from making new changes for the existing filesystems, i.e.
> > everyone are blocked on I/O.
> > Since the amount of dirty blocks is huge, and the filesystem is slow -
> > the kernel takes its time and writes dirty blocks. But - it writes them
> > slowly, and new I/O requests are accumulating faster than it's possible
> > for the kernel to write them. Hence the lookup.
> >
> > > So, as I think you suggested, it seems that OOM-killer isn't getting
> > > in quickly enough to kill a program and/or not working correctly.
> > >
> > > Suggestions?
> >
> > Limit the size of dirty blocks cache. Kernel defaults are insanely large.
> > What I'm using here is:
> >
> > $ cat /etc/sysctl.d/12309.conf
> > vm.dirty_ratio=5
> > vm.dirty_background_ratio=5
>
> I have added the line below to /etc/crontab for different reasons (better FS
> resilience), but it might help to circumvent this bug too.
>
> * * * * * root /bin/sync

Here our approaches differ. I'm a strong believer in the 'kernel does not
need userspace kludges' principle. Yours is what Red Hat is using these
days - 'we wrote a userspace tool for that'.

To each their own, I guess.

Reco
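
To put rough numbers on 'insanely large', here is a small Python sketch. It
assumes the dirty limit is a plain percentage of available memory, which is
a simplification: the kernel works from 'dirtyable' memory and the details
vary between versions. The 20% figure is the commonly documented upstream
default for vm.dirty_ratio (check yours with sysctl); the function name and
the 64 GiB machine are made up for illustration.

```python
GIB = 1024 ** 3


def dirty_limit_bytes(available_bytes: int, ratio_percent: int) -> int:
    """Approximate dirty-page limit as a percentage of available memory.

    Simplified model only; not the kernel's exact calculation.
    """
    if not 0 <= ratio_percent <= 100:
        raise ValueError("ratio must be between 0 and 100")
    return available_bytes * ratio_percent // 100


if __name__ == "__main__":
    available = 64 * GIB  # hypothetical machine with 64 GiB available
    settings = (
        ("assumed default, vm.dirty_ratio=20", 20),
        ("12309.conf above, vm.dirty_ratio=5", 5),
    )
    for label, ratio in settings:
        limit = dirty_limit_bytes(available, ratio)
        print(f"{label}: about {limit / GIB:.1f} GiB of dirty pages")
```

Under these assumptions the limit on that machine drops from roughly 12.8 GiB
to roughly 3.2 GiB, which is a lot less data for a slow filesystem to flush
in one go.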