On Saturday 08 September 2007 17:25, Nick Piggin wrote:
> On Saturday 08 September 2007 07:12, Goswin von Brederlow wrote:
> > Nick Piggin <[EMAIL PROTECTED]> writes:
> > > On Saturday 08 September 2007 06:01, Goswin von Brederlow wrote:
> > >> b) a segment boundary
> > >
> > > This is done, as I said, because of the deadlock issue. While the issue
> > > is more completely fixed in -mm, a special case for kernel memory (eg.
> > > nfsd) is in the latest mainline kernels.
> >
> > Can you tell me where to get the fix from -mm? If it is completely
> > fixed there then that could make our patch obsolete.
>
> In the latest -mm series file, they start at
> mm-revert-kernel_ds-buffered-write-optimisation.patch
> ...
> and go to
> ocfs2-convert-to-new-aops.patch
>
> > >> What actually locks the page? Is it __grab_cache_page or
> > >> a_ops->prepare_write?
> > >
> > > prepare_write must be given a locked page.
> >
> > Then that means __grab_cache_page does return a locked page because
> > there is nothing between the two calls that would.
>
> That's right.
>
> > > No it would be included earlier. The "segment_eq" check should be
> > > allowing kernel writes (nfsd) to write multiple segments. If you have a
> > > patch which changes this significantly, then it would indicate the
> > > existing logic has a problem (or you've got a userspace application
> > > doing the writev, which should be fixed by the write_begin patches in
> > > -mm).
> >
> > I've got userspace application doing the writev. To be exact 14% of
> > the commits were saved by combining multiple segments into a single
> > prepare/write pair. Since the kernel segments don't fragment anymore
> > in 2.6.23-rc5 those savings must come from user space stuff.
> >
> > From the stats posted earlier you can see that there is a substantial
> > amount of calls with 6 segments all (a lot) smaller than a page. Lots
> > of calls our patch or the write_begin/end will save.
>
> OK. The write_begin/write_end patchset is intrusive, no question. I'm not
> sure what you're intending to do with it. They have been tested in -mm for
> quite a while now, but just going with a simple patch that tries to copy
> more segments might be OK for you if you're backporting. The deadlock is
> pretty uncommon.

Lustre would probably have to be ported over to write_begin/write_end in order
to use it, too. With the patches in -mm, if a filesystem is still using
prepare_write/commit_write, the VM reverts to a safe path which avoids the
deadlock (and allows multi-segment I/O copies), but copies the data twice.

OTOH, this is very likely to go upstream, so your filesystem will need to be
ported over sooner or later anyway.