Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-03 Thread Nick Piggin
Andrea Gelmini wrote: On Sun, Dec 31, 2006 at 02:55:58PM +1100, Nick Piggin wrote: This bug was only introduced in 2.6.19, due to a change that caused pte no, Linus said that with 2.6.19 it's easier to trigger this bug... That's when the bug was introduced -- 2.6.19. 2.6.18 doe

Re: [PATCH] 4/4 block: explicit plugging

2007-01-03 Thread Nick Piggin
Jens Axboe wrote: Nick writes: This is a patch to perform block device plugging explicitly in the submitting process context rather than implicitly by the block device. Hi Jens, Hey thanks for doing so much hard work with this, I couldn't have fixed all the block layer stuff myself. QRCU look

Re: [2.6 patch] the scheduled find_trylock_page() removal

2007-01-03 Thread Nick Piggin
to stay for another year because it is so unintrusive, but I don't like the fact it doesn't give one an explicit ref on the page -- it could be misused slightly more easily than find_lock_page or find_get_page. Anyone object? Otherwise: Acked-by: Nick Piggin <[EMAIL PROTECTED]> --

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-03 Thread Nick Piggin
Linus Torvalds wrote: On Thu, 4 Jan 2007, Nick Piggin wrote: That's when the bug was introduced -- 2.6.19. 2.6.18 does not have this bug, so it cannot be years old. Actually, I think 2.6.18 may have a subtle variation on it. In particular, I look back at the try_to_free_buffers()

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-03 Thread Nick Piggin
Andrew Morton wrote: On Wed, 3 Jan 2007 20:44:36 -0800 (PST) Linus Torvalds <[EMAIL PROTECTED]> wrote: Actually, I think 2.6.18 may have a subtle variation on it. In particular, I look back at the try_to_free_buffers() thing that I hated so much, and it makes me wonder.. It used to do:

Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write

2007-01-03 Thread Nick Piggin
Suparna Bhattacharya wrote: On Wed, Jan 03, 2007 at 02:15:56PM -0800, Andrew Morton wrote: Plus Jens's unplugging changes add more reliance upon context inside *current, for the plugging and unplugging operations. I expect that the fsaio patches will need to be aware of the protocol which tho

Re: [PATCHSET 1][PATCH 0/6] Filesystem AIO read/write

2007-01-03 Thread Nick Piggin
Suparna Bhattacharya wrote: On Thu, Jan 04, 2007 at 04:51:58PM +1100, Nick Piggin wrote: So long as AIO threads do the same, there would be no problem (plugging is optional, of course). Yup, the AIO threads run the same code as for regular IO, i.e in the rare situations where they actually

Re: [FSAIO][PATCH 6/8] Enable asynchronous wait page and lock page

2007-01-03 Thread Nick Piggin
Christoph Hellwig wrote: On Thu, Dec 28, 2006 at 08:17:17PM +0530, Suparna Bhattacharya wrote: I am really bad with names :( I tried using the _wq suffixes earlier and that seemed confusing to some, but if no one else objects I'm happy to use that. I thought aio_lock_page() might be misleading

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-03 Thread Nick Piggin
Andrew Morton wrote: On Wed, 03 Jan 2007 22:56:07 -0800 (PST) David Miller <[EMAIL PROTECTED]> wrote: Anyway that leaves us with the question of why Andrea's database is getting corrupted. Hopefully he can give us a minimal test-case. It's odd that stories of pre-2.6.19 BerkeleyDB corruption

[patch] mm: remove gcc workaround

2007-02-20 Thread Nick Piggin
Minimum gcc version is 3.2 now. However, with likely profiling, even modern gcc versions cannot always eliminate the call. Replace the placeholder functions with the more conventional empty static inlines, which should be optimal for everyone. Signed-off-by: Nick Piggin <[EMAIL PROTEC

[patch 0/6] fault vs truncate/invalidate race fix

2007-02-20 Thread Nick Piggin
The following set of patches are based on current git. These fix the fault vs invalidate and fault vs truncate_range race for filemap_nopage mappings, plus those and fault vs truncate race for nonlinear mappings. These patches fix silent data corruption that we've had several people hitting in SU

[patch 1/6] mm: debug check for the fault vs invalidate race

2007-02-20 Thread Nick Piggin
Add a bugcheck for Andrea's pagefault vs invalidate race. This is triggerable for both linear and nonlinear pages with a userspace test harness (using direct IO and truncate, respectively). Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> mm/filemap.c |2 ++ 1 file changed, 2

[patch 2/6] mm: simplify filemap_nopage

2007-02-20 Thread Nick Piggin
Identical block is duplicated twice: contrary to the comment, we have been re-reading the page *twice* in filemap_nopage rather than once. If any retry logic or anything is needed, it belongs in lower levels anyway. Only retry once. Linus agrees. Signed-off-by: Nick Piggin <[EMAIL PROTEC

[patch 3/6] mm: fix fault vs invalidate race for linear mappings

2007-02-20 Thread Nick Piggin
on is excluded because it holds the page lock during invalidation of each page (and ensures that the page is not mapped while holding the lock). This also allows significant simplifications in do_no_page, because we have the page locked in the right place in the pagecache from the start. Signe

[patch 5/6] mm: merge nopfn into fault

2007-02-20 Thread Nick Piggin
Remove ->nopfn and reimplement the existing handlers with ->fault Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> arch/powerpc/platforms/cell/spufs/file.c | 90 --- drivers/char/mspec.c | 29 ++--- includ

[patch 4/6] mm: merge populate and nopage into fault (fixes nonlinear)

2007-02-20 Thread Nick Piggin
d with ->fault, and no users have hit mainline yet. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> Documentation/feature-removal-schedule.txt | 27 ++ Documentation/filesystems/Locking |2 fs/gfs2/ops_address.c |2 fs/gfs2/ops_file.c

[patch 6/6] mm: remove legacy cruft

2007-02-20 Thread Nick Piggin
Remove legacy filemap_nopage and all of the .populate API cruft. This patch can be skipped if it will cause clashes in your tree, or you disagree with removing these guys right now. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> Documentation/feature-removal-schedule.txt | 18 -- i

Re: [patch 5/6] mm: merge nopfn into fault

2007-02-20 Thread Nick Piggin
On Wed, Feb 21, 2007 at 05:50:31AM +0100, Nick Piggin wrote: > Remove ->nopfn and reimplement the existing handlers with ->fault > > Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> Dang, forgot to quilt refresh after fixing spufs compile. -- Remove ->nopfn and reimpleme

[rfc][patch 2.6.20-git16] mm: replicated pagecache

2007-02-20 Thread Nick Piggin
Index: linux-2.6/mm/Makefile === --- linux-2.6.orig/mm/Makefile +++ linux-2.6/mm/Makefile @@ -29,3 +29,4 @@ obj-$(CONFIG_MEMORY_HOTPLUG) += memory_h obj-$(CONFIG_FS_XIP) += filemap_xip.o obj-$(CONFIG_MIGRATION) += migrate.o obj-$(CONFIG_SMP) += al

Re: [patch 2/2] sched: dynticks idle load balancing - v2

2007-02-21 Thread Nick Piggin
On Wed, Feb 21, 2007 at 12:23:44PM -0800, Andrew Morton wrote: > On Fri, 16 Feb 2007 18:08:42 -0800 > > +int select_nohz_load_balancer(int stop_tick) > > +{ > > + int cpu = smp_processor_id(); > > + > > + if (stop_tick) { > > + cpu_set(cpu, nohz.cpu_mask); > > + cpu_rq(cpu)

Re: [patch 2/2] sched: dynticks idle load balancing - v2

2007-02-21 Thread Nick Piggin
On Fri, Feb 16, 2007 at 06:08:42PM -0800, Suresh B wrote: > Changes since v1: > > - Move the idle load balancer selection from schedule() > to the first busy scheduler_tick() after restarting the tick. > This will avoid the unnecessary ownership changes when > softirq's(which are run

Re: [patch 2/2] sched: dynticks idle load balancing - v2

2007-02-22 Thread Nick Piggin
On Thu, Feb 22, 2007 at 02:33:00PM -0800, Suresh B wrote: > On Thu, Feb 22, 2007 at 04:26:54AM +0100, Nick Piggin wrote: > > This is really ugly, sorry :( > > hm. myself and others too thought it was a simple and nice idea. The idea is not bad. I won't guarantee mine will

[rfc][patch] dynamic resizing dentry hash using RCU

2007-02-23 Thread Nick Piggin
The dentry hash uses up 8MB for 1 million entries on my 4GB system, and is one of the biggest wasters of memory for me, because I rarely have more than one or two hundred thousand dentries. And that's with several kernel trees worth of entries. Most desktop and probably even many types of servers will

Re: [rfc][patch] dynamic resizing dentry hash using RCU

2007-02-23 Thread Nick Piggin
On Fri, Feb 23, 2007 at 05:31:17PM +0100, Eric Dumazet wrote: > On Friday 23 February 2007 16:37, Nick Piggin wrote: > > The dentry hash uses up 8MB for 1 million entries on my 4GB system is one > > of the biggest wasters of memory for me. Because I rarely have more than

Re: [rfc][patch] dynamic resizing dentry hash using RCU

2007-02-23 Thread Nick Piggin
On Fri, Feb 23, 2007 at 09:25:28AM -0800, Zach Brown wrote: > > On Feb 23, 2007, at 7:37 AM, Nick Piggin wrote: > > > > >The dentry hash uses up 8MB for 1 million entries on my 4GB system > >is one > >of the biggest wasters of memory for me. Because I rarely

Re: [rfc][patch] dynamic resizing dentry hash using RCU

2007-02-23 Thread Nick Piggin
On Fri, Feb 23, 2007 at 05:31:30PM -0800, Michael K. Edwards wrote: > On 2/23/07, Zach Brown <[EMAIL PROTECTED]> wrote: > >I'd love to see a generic implementation of RCU hashing that > >subsystems can then take advantage of. It's long been on the fun > >side of my todo list. The side I never get

Re: [rfc][patch] dynamic resizing dentry hash using RCU

2007-02-23 Thread Nick Piggin
On Sat, Feb 24, 2007 at 02:26:02AM +0100, Nick Piggin wrote: > On Fri, Feb 23, 2007 at 09:25:28AM -0800, Zach Brown wrote: > > > > On Feb 23, 2007, at 7:37 AM, Nick Piggin wrote: > > > > > > > >The dentry hash uses up 8MB for 1 million entries on my 4GB s

Re: [rfc][patch] dynamic resizing dentry hash using RCU

2007-02-23 Thread Nick Piggin
On Fri, Feb 23, 2007 at 08:24:44PM -0800, William Lee Irwin III wrote: > On Fri, Feb 23, 2007 at 04:37:43PM +0100, Nick Piggin wrote: > > The dentry hash uses up 8MB for 1 million entries on my 4GB system is > > one of the biggest wasters of memory for me. Because I rarely have

Re: [rfc][patch] dynamic resizing dentry hash using RCU

2007-02-23 Thread Nick Piggin
On Sat, Feb 24, 2007 at 01:07:23PM +0900, KAMEZAWA Hiroyuki wrote: > On Fri, 23 Feb 2007 16:37:43 +0100 > Nick Piggin <[EMAIL PROTECTED]> wrote: > > > +static void dcache_hash_resize(unsigned int new_shift); > > +static void mod_nr_dentry(int mod) > > +{

Re: [patch 3/3] mm: fix PageUptodate memorder

2007-02-25 Thread Nick Piggin
y tried to impose any synchronisation on parallel read vs write. A memory barrier in flush_dcache_page would do the trick as well, I think, but it is not really any better. It is misleading because it is not the canonical fix. And we'd still need the smp_rmb in the PageUptodate read-side. >

Re: BUG in 2.6.20-rt8

2007-02-25 Thread Nick Piggin
David Miller wrote: From: "Paul E. McKenney" <[EMAIL PROTECTED]> Date: Sun, 25 Feb 2007 17:52:30 -0800 Why doesn't the traditional hash table of locks work here? Use the cache-line address as input to the hash function, take the corresponding lock, do the compare-and-exchange by hand, and the
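The "hash table of locks" idea quoted above can be sketched in userspace C as follows. This is only an illustration under stated assumptions, not the code discussed in the thread: the table size, the 64-byte cache-line shift, and the names lock_table, lock_for, hashed_lock, hashed_unlock and emulated_cmpxchg are all hypothetical, and C11 test-and-set spinlocks stand in for whatever lock primitive the kernel would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_LOCKS         16  /* hypothetical table size; must be a power of two */
#define CACHE_LINE_SHIFT 6   /* assumes 64-byte cache lines */

/* Static atomics are zero-initialised, so every lock starts out free. */
static atomic_bool lock_table[NR_LOCKS];

/* Pick a lock by hashing the cache-line address of the target word. */
atomic_bool *lock_for(const void *addr)
{
	uintptr_t line = (uintptr_t)addr >> CACHE_LINE_SHIFT;

	return &lock_table[line & (NR_LOCKS - 1)];
}

void hashed_lock(atomic_bool *lock)
{
	/* Spin until the previous value was false, i.e. the lock was free. */
	while (atomic_exchange_explicit(lock, true, memory_order_acquire))
		;
}

void hashed_unlock(atomic_bool *lock)
{
	atomic_store_explicit(lock, false, memory_order_release);
}

/*
 * Emulated compare-and-exchange: store new_val only if *ptr == old.
 * Returns the value that was found in *ptr.
 */
int emulated_cmpxchg(int *ptr, int old, int new_val)
{
	atomic_bool *lock;
	int seen;

	assert(ptr != NULL);
	lock = lock_for(ptr);
	hashed_lock(lock);
	seen = *ptr;
	if (seen == old)
		*ptr = new_val;
	hashed_unlock(lock);
	return seen;
}
```

Note that such an emulation is only atomic with respect to other accesses that go through the same hashed lock; a plain store to the same word from elsewhere would not be excluded.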

Re: SMP performance degradation with sysbench

2007-02-26 Thread Nick Piggin
Rik van Riel wrote: Lorenzo Allegrucci wrote: Hi lkml, according to the test below (sysbench) Linux seems to have scalability problems beyond 8 client threads: http://jeffr-tech.livejournal.com/6268.html#cutid1 http://jeffr-tech.livejournal.com/5705.html Hardware is an 8-core amd64 system and

Re: SMP performance degradation with sysbench

2007-02-26 Thread Nick Piggin
Nick Piggin wrote: Rik van Riel wrote: Lorenzo Allegrucci wrote: Hi lkml, according to the test below (sysbench) Linux seems to have scalability problems beyond 8 client threads: http://jeffr-tech.livejournal.com/6268.html#cutid1 http://jeffr-tech.livejournal.com/5705.html Hardware is an 8

Re: [patch 0/6] fault vs truncate/invalidate race fix

2007-02-27 Thread Nick Piggin
On Mon, Feb 26, 2007 at 09:32:04PM -0800, Andrew Morton wrote: > > On Tue, 27 Feb 2007 15:36:03 +1100 "Dave Airlie" <[EMAIL PROTECTED]> wrote: > > > > > > I've also got rid of the horrible populate API, and integrated nonlinear > > > pages > > > properly with the page fault path. > > > > > > Downs

Re: SMP performance degradation with sysbench

2007-02-27 Thread Nick Piggin
Nish Aravamudan wrote: On 2/26/07, Nick Piggin <[EMAIL PROTECTED]> wrote: Rik van Riel wrote: > Lorenzo Allegrucci wrote: > >> Hi lkml, >> >> according to the test below (sysbench) Linux seems to have scalability >> problems beyond 8 client threads: >

Re: [patch 1/9] fs: libfs buffered write leak fix

2007-02-03 Thread Nick Piggin
On Sat, Feb 03, 2007 at 05:49:47PM +, Jörn Engel wrote: > On Sat, 3 February 2007 02:33:16 +0100, Nick Piggin wrote: > > > > If doing a partial-write, simply clear the whole page and set it uptodate > > (don't need to get too tricky). > > That sounds just like

[patch 0/9] buffered write deadlock fix

2007-02-04 Thread Nick Piggin
Have fixed a few issues since last time: - better comments for the SetPageUptodate race - actually fix the nobh problem rather than adding a comment - use kmap_atomic instead of kmap Patches against 2.6.20-rc7. Thanks, Nick -- SuSE Labs - To unsubscribe from this list: send the line "unsubscrib

[patch 1/9] fs: libfs buffered write leak fix

2007-02-04 Thread Nick Piggin
concurrent read can come in and copy the uninitialised memory into userspace before it is written to. Fix simple_readpage by simply initialising the whole page in the case of a partial-page write. In the case of a full-page write, we don't SetPageDirty until commit_write time. Signed-off-by: Nick

[patch 2/9] mm: revert "generic_file_buffered_write(): handle zero length iovec segments"

2007-02-04 Thread Nick Piggin
From: Andrew Morton <[EMAIL PROTECTED]> Revert 81b0c8713385ce1b1b9058e916edcf9561ad76d6. This was a bugfix against 6527c2bdf1f833cc18e8f42bd97973d583e4aa83, which we also revert. Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> I

[patch 4/9] mm: generic_file_buffered_write cleanup

2007-02-04 Thread Nick Piggin
From: Andrew Morton <[EMAIL PROTECTED]> Clean up buffered write code. Rename some variables and fix some types. Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> Index: linu

[patch 6/9] mm: be sure to trim blocks

2007-02-04 Thread Nick Piggin
If prepare_write fails with AOP_TRUNCATED_PAGE, or if commit_write fails, then we may have failed the write operation despite prepare_write having instantiated blocks past i_size. Fix this, and consolidate the trimming into one place. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> Index:

[patch 7/9] mm: cleanup pagecache insertion operations

2007-02-04 Thread Nick Piggin
very short time, in contrast with the per-CPU pagevecs that are persistent. Net result: 7.3 times fewer lru_lock acquisitions required to add the pages to pagecache for a bulk write (in 4K chunks). Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> Index: linux-

[patch 9/9] mm: fix pagecache write deadlocks

2007-02-04 Thread Nick Piggin
data via the kernel address space. (also, rename maxlen to seglen, because it was confusing) Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> Index: linux-2.6/mm/filemap.c === --- linux-2.6.orig/mm/filemap.c +++ linux-2.6/m

[patch 8/9] mm: generic_file_buffered_write iovec cleanup

2007-02-04 Thread Nick Piggin
Hide some of the open-coded nr_segs tests into the iovec helpers. This is all to simplify generic_file_buffered_write, because that gets more complex in the next patch. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> Index: linux-2.6/mm/fil

[patch 3/9] mm: revert "generic_file_buffered_write(): deadlock on vectored write"

2007-02-04 Thread Nick Piggin
e fixing the deadlock by other means. Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> Nick says: also it only ever actually papered over the bug, because after faulting in the pages, they might be unmapped or reclaimed. Signed-off-by: Nick Piggin <[EMAIL PROTECTED

[patch 5/9] mm: debug write deadlocks

2007-02-04 Thread Nick Piggin
: Nick Piggin <[EMAIL PROTECTED]> Index: linux-2.6/mm/filemap.c === --- linux-2.6.orig/mm/filemap.c +++ linux-2.6/mm/filemap.c @@ -2103,6 +2103,7 @@ generic_file_buffered_write(struct kiocb if (maxlen &

Re: [patch 9/9] mm: fix pagecache write deadlocks

2007-02-04 Thread Nick Piggin
On Sun, Feb 04, 2007 at 01:44:45AM -0800, Andrew Morton wrote: > On Sun, 4 Feb 2007 09:51:07 +0100 (CET) Nick Piggin <[EMAIL PROTECTED]> > wrote: > > > 2. If we find the destination page is non uptodate, unlock it (this could > > be > > made slightly mor

Re: [patch 9/9] mm: fix pagecache write deadlocks

2007-02-04 Thread Nick Piggin
On Sun, Feb 04, 2007 at 02:30:55AM -0800, Andrew Morton wrote: > On Sun, 4 Feb 2007 11:15:29 +0100 Nick Piggin <[EMAIL PROTECTED]> wrote: > > > The write path is broken. I prefer my kernels slow, than buggy. > > That won't fly. What won't fly? > > &g

Re: [patch 9/9] mm: fix pagecache write deadlocks

2007-02-04 Thread Nick Piggin
On Sun, Feb 04, 2007 at 11:46:09AM +0100, Nick Piggin wrote: > > > It's better than taking mmap_sem and walking pagetables... > > I'm not convinced. Though I am more convinced that looking at mm *at all* (either to take the mmap_sem and find the vma, or to

Re: [patch 9/9] mm: fix pagecache write deadlocks

2007-02-04 Thread Nick Piggin
On Sun, Feb 04, 2007 at 02:56:02AM -0800, Andrew Morton wrote: > On Sun, 4 Feb 2007 11:46:09 +0100 Nick Piggin <[EMAIL PROTECTED]> wrote: > > > On Sun, Feb 04, 2007 at 02:30:55AM -0800, Andrew Morton wrote: > > > On Sun, 4 Feb 2007 11:15:29 +0100 Nick Piggi

Re: [patch 9/9] mm: fix pagecache write deadlocks

2007-02-04 Thread Nick Piggin
On Sun, Feb 04, 2007 at 03:10:39AM -0800, Andrew Morton wrote: > On Sun, 4 Feb 2007 10:59:58 + (GMT) Anton Altaparmakov <[EMAIL > PROTECTED]> wrote: > > > > How about leaving the existing code with the following minor > > modifications: > > > > Instead of calling filemap_copy_from_user{,_io

Re: [patch 9/9] mm: fix pagecache write deadlocks

2007-02-04 Thread Nick Piggin
On Sun, Feb 04, 2007 at 03:15:49AM -0800, Andrew Morton wrote: > On Sun, 4 Feb 2007 12:03:17 +0100 Nick Piggin <[EMAIL PROTECTED]> wrote: > > > On Sun, Feb 04, 2007 at 02:56:02AM -0800, Andrew Morton wrote: > > > On Sun, 4 Feb 2007 11:46:09 +0100 Nick Piggi

[rfc] mm: PageUptodate memorder problem?

2007-02-04 Thread Nick Piggin
Hi, I think there might be a problem, but don't take this as a final patch because I can make it nicer if we are agreed there is a problem. One thing I like about it is that it ties in the anonymous page handling with the rest of the page management, by marking anon pages as uptodate when they _a

Re: [patch 9/9] mm: fix pagecache write deadlocks

2007-02-05 Thread Nick Piggin
On Sun, Feb 04, 2007 at 05:40:35PM +, Anton Altaparmakov wrote: > On Sun, 4 Feb 2007, Andrew Morton wrote: > > truncate's OK: we're holding i_mutex. > > How about excluding readpage() (in addition to truncate if Nick is right > and some cases of truncate do not hold i_mutex) with an extra pa

Re: [patch 9/9] mm: fix pagecache write deadlocks

2007-02-05 Thread Nick Piggin
On Sun, Feb 04, 2007 at 10:36:20AM -0800, Andrew Morton wrote: > On Sun, 4 Feb 2007 16:10:51 +0100 Nick Piggin <[EMAIL PROTECTED]> wrote: > > > They're not likely to hit the deadlocks, either. Probability gets more > > likely after my patch to lock the page in t

Re: [patch 9/9] mm: fix pagecache write deadlocks

2007-02-05 Thread Nick Piggin
On Tue, Feb 06, 2007 at 03:25:49AM +0100, Nick Piggin wrote: > On Sun, Feb 04, 2007 at 10:36:20AM -0800, Andrew Morton wrote: > > On Sun, 4 Feb 2007 16:10:51 +0100 Nick Piggin <[EMAIL PROTECTED]> wrote: > > > > > They're not likely to hit the deadlocks, either.

Re: [patch 9/9] mm: fix pagecache write deadlocks

2007-02-05 Thread Nick Piggin
On Mon, Feb 05, 2007 at 09:30:06PM -0800, Andrew Morton wrote: > On Tue, 6 Feb 2007 05:41:46 +0100 Nick Piggin <[EMAIL PROTECTED]> wrote: > > > > > > Not necessarily -- they could read from one part of a page and write to > > > another. I see this as the bigge

Re: [patch 9/9] mm: fix pagecache write deadlocks

2007-02-05 Thread Nick Piggin
On Tue, Feb 06, 2007 at 06:49:05AM +0100, Nick Piggin wrote: > > - If the get_user() doesn't fault, and if we're copying from and to the > > same page, we know that we've locked it, so nobody will be able to unmap > > it while we're copying from it. > &

[patch 0/3] 2.6.20 fix for PageUptodate memorder problem

2007-02-06 Thread Nick Piggin
Still no independent confirmation as to whether this is a problem or not. I think it is, so I'll propose this patchset to fix it. Patch 1/3 has a reasonable description of the problem. Thanks, Nick -- SuSE Labs - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body

[patch 1/3] mm: fix PageUptodate memorder

2007-02-06 Thread Nick Piggin
lete patch follows). Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> Index: linux-2.6/include/linux/highmem.h === --- linux-2.6.orig/include/linux/highmem.h +++ linux-2.6/include/linux/highmem.h @@ -57,8 +57,6 @@ static inline void cle

[patch 2/3] fs: buffer don't PageUptodate without page locked

2007-02-06 Thread Nick Piggin
uch a thing anyway). Instead just leave it to the read side to bring the page uptodate when it notices that all buffers are uptodate. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> Index: linux-2.6/fs/buffer.c === --- linux-

[patch 3/3] mm: make read_cache_page synchronous

2007-02-06 Thread Nick Piggin
ew in ecryptfs, 1 in jffs2, and a possible cleared data overwritten with readpage in block2mtd. All depending on whether the filler is async and/or can return with a !uptodate page. Also, a memory leak in sys_swapon(). Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> Index: linux-2.6/f

[patch] fs: fix __block_write_full_page error case buffer submission

2007-02-06 Thread Nick Piggin
case here. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> Index: linux-2.6/fs/buffer.c === --- linux-2.6.orig/fs/buffer.c +++ linux-2.6/fs/buffer.c @@ -1732,7 +1732,6 @@ recover: SetPageError(page); BUG_ON(Pa

Re: [patch 2/3] fs: buffer don't PageUptodate without page locked

2007-02-06 Thread Nick Piggin
On Tue, Feb 06, 2007 at 12:21:40AM -0800, Andrew Morton wrote: > On Tue, 6 Feb 2007 09:02:23 +0100 (CET) Nick Piggin <[EMAIL PROTECTED]> > wrote: > > > __block_write_full_page is calling SetPageUptodate without the page locked. > > This is unusual, but not incorrec

Re: [patch 3/3] mm: make read_cache_page synchronous

2007-02-06 Thread Nick Piggin
Andrew Morton wrote: On Tue, 6 Feb 2007 09:02:33 +0100 (CET) Nick Piggin <[EMAIL PROTECTED]> wrote: Ensure pages are uptodate after returning from read_cache_page, which allows us to cut out most of the filesystem-internal PageUptodate_NoLock calls. Normally it's good to renam

Re: [patch 3/3] mm: make read_cache_page synchronous

2007-02-06 Thread Nick Piggin
On Tue, Feb 06, 2007 at 12:28:39AM -0800, Andrew Morton wrote: > > > > Also, a memory leak in sys_swapon(). > > Separate patch? Gack, I'm an idiot, there is no memory leak :P - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] M

Re: [patch 1/3] mm: fix PageUptodate memorder

2007-02-06 Thread Nick Piggin
Andrew Morton wrote: On Tue, 6 Feb 2007 09:02:11 +0100 (CET) Nick Piggin <[EMAIL PROTECTED]> wrote: +static inline void __SetPageUptodate(struct page *page) +{ +#ifdef CONFIG_S390 if (!test_and_set_bit(PG_uptodate, &page->flags)) page_test_and_cle

Re: [patch 0/3] 2.6.20 fix for PageUptodate memorder problem

2007-02-06 Thread Nick Piggin
On Wed, Feb 07, 2007 at 09:58:57AM +1100, David Chinner wrote: > On Tue, Feb 06, 2007 at 09:02:01AM +0100, Nick Piggin wrote: > > Still no independent confirmation as to whether this is a problem or not. > > I think it is, so I'll propose this patchset to fix it. Patch 1/3

[patch 0/3] 2.6.20 fix for PageUptodate memorder problem (try 2)

2007-02-08 Thread Nick Piggin
Still no independent confirmation as to whether this is a problem or not. Updated some comments, added diffstats to patches, don't use __SetPageUptodate as an internal page-flags.h private function. I would like to eventually get an ack from Hugh regarding the anon memory and especially swap side

[patch 1/3] mm: fix PageUptodate memorder

2007-02-08 Thread Nick Piggin
lete patch follows). Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> fs/ext2/dir.c |2 - fs/namei.c |2 - fs/partitions/check.c |2 - fs/splice.c|4 +-- include/linux/highmem.h|4 --- include/linux/pag

[patch 2/3] fs: buffer don't PageUptodate without page locked

2007-02-08 Thread Nick Piggin
uch a thing anyway). Instead just leave it to the read side to bring the page uptodate when it notices that all buffers are uptodate. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> fs/buffer.c | 11 +-- 1 file changed, 1 insertion(+), 10 deletions(-) Index: linux-2.6/

[patch 3/3] mm: make read_cache_page synchronous

2007-02-08 Thread Nick Piggin
ew in ecryptfs, 1 in jffs2, and a possible cleared data overwritten with readpage in block2mtd. All depending on whether the filler is async and/or can return with a !uptodate page. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> drivers/mtd/devices/block2mtd.c |3 -- f

[rfc][patch 0/3] a faster buffered write deadlock fix?

2007-02-08 Thread Nick Piggin
In my last set of numbers for my buffered-write deadlock fix using 2 copies per page, I realised there is no real performance hit for !uptodate pages as opposed to uptodate ones. This is unexpected because the uptodate pages only require a single copy... The problem turns out to be operator error.

[patch 1/3] fs: add an iovec iterator

2007-02-08 Thread Nick Piggin
Add an iterator data structure to operate over an iovec. Add usercopy operators needed by generic_file_buffered_write, and convert that function over. include/linux/fs.h | 32 mm/filemap.c | 132 ++--- mm/filemap.h | 103

[patch 2/3] fs: introduce perform_write aop

2007-02-08 Thread Nick Piggin
Add a new "perform_write" aop, which replaces prepare_write and commit_write as a single call to copy a given amount of userdata at the given offset. This is more flexible, because the implementation can determine how to best handle errors, or multi-page ranges (eg. it may use a gang lookup), and o

[patch 3/3] ext2: use perform_write aop

2007-02-08 Thread Nick Piggin
Convert ext2 to use ->perform_write. This uses the main loop out of generic_perform_write, but when encountering a short usercopy, it zeroes out new uninitialised blocks, and passes in a short-length commit to __block_commit_write, which does the right thing (in terms of not setting things uptodate

Re: [patch 0/3] 2.6.20 fix for PageUptodate memorder problem (try 2)

2007-02-08 Thread Nick Piggin
On Fri, Feb 09, 2007 at 12:41:51AM +, Hugh Dickins wrote: > On Thu, 8 Feb 2007, Nick Piggin wrote: > > Still no independent confirmation as to whether this is a problem or not. > > I'm trying to convince myself none of your patch is necessary. Probably > shall fa

Re: [patch 1/3] fs: add an iovec iterator

2007-02-08 Thread Nick Piggin
On Thu, Feb 08, 2007 at 07:49:53PM +, Christoph Hellwig wrote: > On Thu, Feb 08, 2007 at 02:07:24PM +0100, Nick Piggin wrote: > > Add an iterator data structure to operate over an iovec. Add usercopy > > operators needed by generic_file_buffered_write, and convert that fu

Re: [rfc][patch 0/3] a faster buffered write deadlock fix?

2007-02-08 Thread Nick Piggin
On Thu, Feb 08, 2007 at 04:38:01PM -0800, Mark Fasheh wrote: > On Thu, Feb 08, 2007 at 02:07:15PM +0100, Nick Piggin wrote: > > The problem is that the existing aops interface is crap. "correct, fast, > > compatible -- choose any 2" > > Agreed. There's lots

Re: [patch 1/3] fs: add an iovec iterator

2007-02-08 Thread Nick Piggin
On Thu, Feb 08, 2007 at 06:03:50PM -0800, Nate Diller wrote: > On 2/8/07, Nick Piggin <[EMAIL PROTECTED]> wrote: > >On Thu, Feb 08, 2007 at 07:49:53PM +, Christoph Hellwig wrote: > >> On Thu, Feb 08, 2007 at 02:07:24PM +0100, Nick Piggin wrote: > >> > Add a

Re: [rfc][patch 0/3] a faster buffered write deadlock fix?

2007-02-09 Thread Nick Piggin
On Fri, Feb 09, 2007 at 12:41:01AM -0800, Andrew Morton wrote: > On Thu, 8 Feb 2007 14:07:15 +0100 (CET) Nick Piggin <[EMAIL PROTECTED]> > wrote: > > > So I have finally finished a first slightly-working draft of my new aops > > op (perform_write) proposal. I

Re: [rfc][patch 0/3] a faster buffered write deadlock fix?

2007-02-09 Thread Nick Piggin
On Fri, Feb 09, 2007 at 02:09:54AM -0800, Andrew Morton wrote: > On Fri, 9 Feb 2007 10:54:05 +0100 Nick Piggin <[EMAIL PROTECTED]> wrote: > > > > > That's still got a deadlock, > > It does? Yes, PG_lock vs mm->mmap_sem. > > and also it doesn

Re: [rfc][patch 0/3] a faster buffered write deadlock fix?

2007-02-09 Thread Nick Piggin
On Fri, Feb 09, 2007 at 02:52:49AM -0800, Andrew Morton wrote: > On Fri, 9 Feb 2007 11:32:58 +0100 Nick Piggin <[EMAIL PROTECTED]> wrote: > > > On Fri, Feb 09, 2007 at 02:09:54AM -0800, Andrew Morton wrote: > > > On Fri, 9 Feb 2007 10:54:05 +0100 Nick Piggi

Re: [rfc][patch 0/3] a faster buffered write deadlock fix?

2007-02-09 Thread Nick Piggin
On Fri, Feb 09, 2007 at 03:46:44AM -0800, Andrew Morton wrote: > On Fri, 9 Feb 2007 12:31:16 +0100 Nick Piggin <[EMAIL PROTECTED]> wrote: > > > > > > > We'll never, ever, ever update and test all filesystems. What you're > > > calling "lega

[patch 0/3] 2.6.20 fix for PageUptodate memorder problem (try 3)

2007-02-09 Thread Nick Piggin
OK, I have got rid of SetPageUptodate_nowarn, and removed the atomic op from SetNewPageUptodate. Made PageUptodate_NoLock only issue the memory barrier if the page was uptodate (hopefully the compiler can thread the branch into the caller's branch). SetNewPageUptodate does not do the S390 page_tes

[patch 1/3] mm: make read_cache_page synchronous

2007-02-09 Thread Nick Piggin
ew in ecryptfs, 1 in jffs2, and a possible cleared data overwritten with readpage in block2mtd. All depending on whether the filler is async and/or can return with a !uptodate page. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> drivers/mtd/devices/block2mtd.c |3 -- fs/afs/dir.c

[patch 2/3] fs: buffer don't PageUptodate without page locked

2007-02-09 Thread Nick Piggin
case (it is unusual that the write path does such a thing anyway). Instead just leave it to the read side to bring the page uptodate when it notices that all buffers are uptodate. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> fs/buffer.c | 11 +-- 1 file changed, 1 inser

[patch 3/3] mm: fix PageUptodate memorder

2007-02-09 Thread Nick Piggin
lete patch follows). Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> fs/ext2/dir.c |2 - fs/namei.c |2 - fs/partitions/check.c |2 - fs/splice.c|4 +-- include/linux/highmem.h|4 --- include/linux/pag

Re: [patch 3/3] ext2: use perform_write aop

2007-02-09 Thread Nick Piggin
On Fri, Feb 09, 2007 at 11:45:39AM -0800, Andrew Morton wrote: > On Fri, 9 Feb 2007 11:14:55 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote: > > > If so, that might be preventable by leaving the buffer nonuptodate. > > oh, OK, it was buffer_new(), so zeroes are the right thing for a reader to > se

Re: build error: allnoconfig fails on mincore/swapper_space

2007-02-12 Thread Nick Piggin
Andrew Morton wrote: On Mon, 12 Feb 2007 14:50:40 -0800 Randy Dunlap <[EMAIL PROTECTED]> wrote: 2.6.20-git8 on x86_64: LD init/built-in.o LD .tmp_vmlinux1 mm/built-in.o: In function `sys_mincore': (.text+0xe584): undefined reference to `swapper_space' make: *** [.tmp_vmlinux1] Error

Re: Coding style RFC: convert "for (i=0;i

2007-02-12 Thread Nick Piggin
Joe Perches wrote: On Tue, 2007-02-13 at 11:20 +1100, Ben Nizette wrote: #define array_for_each(element, array) \ for (int __idx = 0; __idx < ARRAY_SIZE((array)); \ __idx++, (element) = &(array[__idx])) This requires all interior loop code be changed. Ben is right

Re: [patch 0/3] 2.6.20 fix for PageUptodate memorder problem (try 3)

2007-02-12 Thread Nick Piggin
On Sat, Feb 10, 2007 at 11:44:55PM +0100, Martin Schwidefsky wrote: > On Sat, 2007-02-10 at 03:31 +0100, Nick Piggin wrote: > > SetNewPageUptodate does not do the S390 page_test_and_clear_dirty, so > > I'd like to make sure that's OK. > > An I/O operation on s390 wi

[patch] mm: NUMA replicated pagecache

2007-02-12 Thread Nick Piggin
Hi, Just tinkering around with this and got something working, so I'll see if anyone else wants to try it. Not proposing for inclusion, but I'd be interested in comments or results. Thanks, Nick -- Page-based NUMA pagecache replication. This is a scheme for page replication replicates read-on

Re: [patch] mm: NUMA replicated pagecache

2007-02-12 Thread Nick Piggin
On Tue, Feb 13, 2007 at 07:09:24AM +0100, Nick Piggin wrote: > Hi, > > Just tinkering around with this and got something working, so I'll see > if anyone else wants to try it. (patch against 2.6.20) - To unsubscribe from this list: send the line "unsubscribe linux-ke

Re: [PATCH] knfsd: Stop NFSD writes from being broken into lots of little writes to filesystem.

2007-02-12 Thread Nick Piggin
6527c2bdf1f833cc18e8f42bd97973d583e4aa83 Cc: Nick Piggin <[EMAIL PROTECTED]> FWIW, you can put Acked-by: me there if you'd like. Thanks, Nick -- SUSE Labs, Novell Inc. Send instant messages to your online friends http://au.messenger.yahoo.com - To unsubscribe from this list: send the line "unsubscr

Re: Coding style RFC: convert "for (i=0;i

2007-02-12 Thread Nick Piggin
Joe Perches wrote: On Tue, 2007-02-13 at 15:19 +1100, Nick Piggin wrote: #define array_for_each(element, array) \ for (int __idx = 0; __idx < ARRAY_SIZE((array)); \ __idx++, (element) = &(array[__idx])) If you really wanted to introduce your loop, then please

Re: Coding style RFC: convert "for (i=0;i

2007-02-13 Thread Nick Piggin
Bernd Petrovitsch wrote: On Tue, 2007-02-13 at 18:42 +1100, Nick Piggin wrote: Joe Perches wrote: [...] perhaps: #define array_for_each(element, array) \ for ((element) = (array); \ (element) < ((array) + ARRAY_SIZE((array))); \ (element)++) If you
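For illustration, the pointer-based array_for_each variant quoted above can be exercised in a small standalone program. This is a sketch under assumptions: ARRAY_SIZE is defined locally here rather than taken from the kernel headers, and sum_table and its data are hypothetical names made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the kernel's ARRAY_SIZE; valid for true arrays only. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* The macro as quoted: walk a pointer from the first element to one past the last. */
#define array_for_each(element, array) \
	for ((element) = (array); \
	     (element) < ((array) + ARRAY_SIZE((array))); \
	     (element)++)

/* Hypothetical caller: sums a fixed table using the macro. */
int sum_table(void)
{
	static const int table[] = { 1, 2, 3, 4 };
	const int *p;
	int sum = 0;
	int visited = 0;

	array_for_each(p, table) {
		sum += *p;
		visited++;
	}

	/* Every element is visited exactly once. */
	assert((size_t)visited == ARRAY_SIZE(table));
	return sum;
}
```

Because the macro relies on sizeof, it only works when the second argument is an actual array; passing a pointer would silently compute the wrong bound.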

Re: [patch] build error: allnoconfig fails on mincore/swapper_space

2007-02-13 Thread Nick Piggin
Hugh Dickins wrote: On Tue, 13 Feb 2007, Randy Dunlap wrote: From: Randy Dunlap <[EMAIL PROTECTED]> Don't check for pte swap entries when CONFIG_SWAP=n. And save 'present' in the vec array. mm/built-in.o: In function `sys_mincore': (.text+0xe584): undefined reference to `swapper_space' Signe

Re: [RESEND][PATCH] 9p: add write-cache support to loose cache mode

2007-02-13 Thread Nick Piggin
Andrew Morton wrote: On Tue, 13 Feb 2007 20:07:44 -0600 "Eric Van Hensbergen" <[EMAIL PROTECTED]> wrote: On 2/13/07, Andrew Morton <[EMAIL PROTECTED]> wrote: On Tue, 13 Feb 2007 17:55:31 -0600 Eric Van Hensbergen <[EMAIL PROTECTED]> wrote: +int v9fs_prepare_write(struct file *file, struct pag

Re: Help! How do I debug oopsing kernel workqueue threads?

2007-02-13 Thread Nick Piggin
Chuck Ebbert wrote: How the hell do I tell what kernel subsystem queued a bogus work item? Can you add a new field to struct list_head and add the caller's address in there? BUG: unable to handle kernel paging request at virtual address 2074 printing eip: c04f3b55 *pde = 71bb1067 Oops:
