On Tue, 10 Apr 2007 12:04:54 +0900 Tomoki Sekiyama <[EMAIL PROTECTED]> wrote:
> Hello Andrew,
> Thank you for your comments.
>
> Andrew Morton wrote:
> > On Tue, 03 Apr 2007 19:46:04 +0900
> > Tomoki Sekiyama <[EMAIL PROTECTED]> wrote:
> >> If % of Dirty+Writeback > `dirty_writeback_start_ratio', generators of
> >> dirty pages start writeback of dirty pages by themselves. At that time,
> >> these processes are not blocked in balance_dirty_pages(), but they may
> >> be blocked if the write-requests queue of the written disk is full
> >> (that is, the length of the queue > `nr_requests'). By this behavior,
> >> we can throttle only processes which write to the disks with heavy load,
> >> and can allow processes to write to the other disks without blocking.
> >>
> >> If % of Dirty+Writeback > `dirty_ratio', generators of dirty pages
> >> are throttled as current Linux does, so as not to fill up memory with
> >> dirty pages.
> >
> > Does this actually solve the problem? If the request queue is
> > sufficiently large (relative to the various dirty-memory thresholds)
> > then I'd expect that a heavy writer will be able to very quickly take
> > the total dirty+writeback memory up to the dirty_ratio (which should
> > have been named throttle_threshold, but it's too late for that).
> >
> > I suspect the reason why this patch was successful in your testing is
> > that dirty_writeback_start_ratio happens to exceed the size of the disk
> > request queues, so the heavy writer is getting stuck on disk request
> > queue exhaustion.
> >
> > But that won't work if we have a lot of processes writing to a lot of
> > disks, and it won't work if the request queue size is large, or if the
> > dirty-memory thresholds are small (relative to the request queue size).
> >
> > Do the patches still work after
> > `echo 10000 > /sys/block/sda/queue/nr_requests'?
>
> As you pointed out, this patch has no effect if nr_requests is too large,
> because it identifies heavily loaded disks by the length of each disk's
> write-requests queue.
>
> This patch is meant to give system administrators room to avoid the
> problem by adjusting the parameters appropriately, rather than to be an
> automatic solution for every possible situation.
>
> Could you please tell me some situations in which we should set
> nr_requests that large?

It's probably not a sensible thing to do. But it's _possible_ to do, and
the fact that the kernel will again misbehave indicates an overall
weakness in our design.

And there are other ways in which this situation could occur:

- The request queue has a fixed size (it is not scaled according to the
  amount of memory in the machine), so if the machine is small enough
  (say, 64MB) the problem can happen.

- The machine could have a large number of disks, so the aggregate
  capacity of all the request queues exceeds the dirty-memory thresholds.

- The queue size of 128 is in units of "number of requests", but it is
  independent of the _size_ of those requests. If someone comes up with a
  driver which wants to use 16MB-sized requests, the problem will recur.

For all these sorts of reasons, we have learned that we should avoid any
dependence upon request queue exhaustion within the VM/VFS/etc.
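
To put rough, purely illustrative numbers on that list (assuming common
defaults of the era: 128 requests per queue, ~512KB per request, and a
dirty_ratio of 40%; the exact figures vary by kernel and hardware):

    one queue:            128 * 512KB              ~= 64MB
    64MB machine:         40% of 64MB              ~= 26MB, well below 64MB
    ten disks, 1GB box:   10 * 64MB = 640MB   vs.  40% of 1GB = 400MB
    16MB-sized requests:  128 * 16MB               ~= 2GB for a single queue

In each case the dirty+writeback total can reach the hard throttle
threshold before any request queue fills, so queue exhaustion never
delivers the back-pressure the patch relies on.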
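For reference, here is a minimal sketch in C of the two-threshold scheme
quoted at the top of this message. It is not the actual patch: every
identifier below (the helpers, the struct, even the function name) is a
hypothetical stand-in for the kernel's real page accounting, per-disk
queue state, and throttling paths.

    struct backing_dev;     /* stand-in for a per-disk queue handle */

    /* Hypothetical helpers standing in for the kernel's real accounting. */
    extern long dirty_plus_writeback_pct(void);  /* % of memory dirty or under writeback */
    extern int  queue_is_full(struct backing_dev *bdi);  /* queue length > nr_requests? */
    extern void start_background_writeback(struct backing_dev *bdi);
    extern void throttle_writer(struct backing_dev *bdi);  /* block caller, as current Linux does */

    extern int dirty_writeback_start_ratio;  /* the new, lower threshold */
    extern int dirty_ratio;                  /* the existing hard threshold */

    static void balance_dirty_pages_sketch(struct backing_dev *bdi)
    {
            long pct = dirty_plus_writeback_pct();

            if (pct > dirty_ratio) {
                    /* Above dirty_ratio: throttle unconditionally, as today. */
                    throttle_writer(bdi);
                    return;
            }

            if (pct > dirty_writeback_start_ratio) {
                    /*
                     * Above the lower threshold: the dirtier kicks off
                     * writeback itself but is not blocked here.  In the
                     * patch the blocking below is implicit -- the writer
                     * simply stalls in the block layer if this disk's
                     * request queue is already full, so writers to
                     * lightly loaded disks proceed without blocking.
                     */
                    start_background_writeback(bdi);
                    if (queue_is_full(bdi))
                            throttle_writer(bdi);
            }
    }

The weakness discussed above is visible right in the sketch: between the
two thresholds nothing bounds dirtying except queue_is_full(), so whenever
queue capacity outruns the dirty-memory thresholds that check never fires.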