Re: sysfs sys/kernel/ namespace (was Re: [PATCH 0/2] add new notifier function ,take2)

2007-10-24 Thread Nick Piggin
On Thursday 25 October 2007 15:45, Greg KH wrote: > On Thu, Oct 25, 2007 at 12:31:06PM +1000, Nick Piggin wrote: > > On Wednesday 24 October 2007 21:12, Kay Sievers wrote: > > > On 10/24/07, Nick Piggin <[EMAIL PROTECTED]> wrote: > > > It was intended to be some

Re: Linux 2.6.23

2007-10-11 Thread Nick Piggin
On Friday 12 October 2007 15:46, Ingo Molnar wrote: > * Nick Piggin <[EMAIL PROTECTED]> wrote: > > ;) I think you snipped the important bit: > > > > "the peak is terrible but it has virtually no dropoff and performs > > better under load than the default 2

Re: [PATCH] mm: avoid dirtying shared mappings on mlock

2007-10-12 Thread Nick Piggin
On Friday 12 October 2007 02:57, Nick Piggin wrote: > On Friday 12 October 2007 19:03, Peter Zijlstra wrote: > > Subject: mm: avoid dirtying shared mappings on mlock > > > > Suleiman noticed that shared mappings get dirtied when mlocked. > > Avoid this by teaching ma

Re: [PATCH 00/31] Remove iget() and read_inode() [try #4]

2007-10-12 Thread Nick Piggin
On Friday 12 October 2007 19:07, David Howells wrote: > Hi Linus, > > Here's a set of patches that remove all calls to iget() and all > read_inode() functions. They should be removed for two reasons: firstly > they don't lend themselves to good error handling, and secondly their > presence is a te

Re: [rfc][patch 3/3] x86: optimise barriers

2007-10-12 Thread Nick Piggin
On Fri, Oct 12, 2007 at 10:25:34AM +0200, Jarek Poplawski wrote: > On 04-10-2007 07:23, Nick Piggin wrote: > > According to latest memory ordering specification documents from Intel and > > AMD, both manufacturers are committed to in-order loads from cacheable > > m

Re: [PATCH] mm: avoid dirtying shared mappings on mlock

2007-10-12 Thread Nick Piggin
On Friday 12 October 2007 19:03, Peter Zijlstra wrote: > Subject: mm: avoid dirtying shared mappings on mlock > > Suleiman noticed that shared mappings get dirtied when mlocked. > Avoid this by teaching make_pages_present about this case. > > Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]> > Acke

Re: [rfc][patch 3/3] x86: optimise barriers

2007-10-12 Thread Nick Piggin
On Fri, Oct 12, 2007 at 11:12:13AM +0200, Jarek Poplawski wrote: > On Fri, Oct 12, 2007 at 10:42:34AM +0200, Helge Hafting wrote: > > Jarek Poplawski wrote: > > >On 04-10-2007 07:23, Nick Piggin wrote: > > > > > >>According to latest memory ordering

Re: [rfc][patch 3/3] x86: optimise barriers

2007-10-12 Thread Nick Piggin
On Fri, Oct 12, 2007 at 11:55:05AM +0200, Jarek Poplawski wrote: > On Fri, Oct 12, 2007 at 10:57:33AM +0200, Nick Piggin wrote: > > > > I don't know quite what you're saying... the CPUs could probably get > > performance by having weakly ordered loads, OTOH I

Re: [PATCH] mm: avoid dirtying shared mappings on mlock

2007-10-12 Thread Nick Piggin
On Friday 12 October 2007 20:37, Peter Zijlstra wrote: > On Fri, 2007-10-12 at 02:57 +1000, Nick Piggin wrote: > > On Friday 12 October 2007 19:03, Peter Zijlstra wrote: > > > Subject: mm: avoid dirtying shared mappings on mlock > > > > > > Suleiman noticed t

Re: [PATCH] mm: avoid dirtying shared mappings on mlock

2007-10-12 Thread Nick Piggin
On Friday 12 October 2007 20:50, Peter Zijlstra wrote: > On Fri, 2007-10-12 at 04:14 +1000, Nick Piggin wrote: > > On Friday 12 October 2007 20:37, Peter Zijlstra wrote: > > > The pages will still be read-only due to dirty tracking, so the first > > > write

[patch 1/2] hdaps: fix locking

2007-10-14 Thread Nick Piggin
no longer produces warnings, but I don't actually know if it does the right thing (because I don't really know what the driver does or how to test it anyway!). --- hdaps was using incorrect mutex_trylock return code. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> --- Index: linux
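
The snippet above is cut off before the diff, so the exact change to hdaps is not shown here. As a rough illustration of why a trylock return code is easy to get wrong, the following userspace sketch models the two conventions used in the kernel: mutex_trylock() returns 1 when the lock was taken, while down_trylock() returns 0 when it was taken. All `model_*` names are hypothetical stand-ins built on C11 atomics, not kernel APIs.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in for a lock; not the kernel's struct mutex. */
struct model_lock {
	atomic_int held;	/* 0 = free, 1 = held */
};

static void model_lock_init(struct model_lock *l)
{
	atomic_init(&l->held, 0);
}

/* mutex_trylock() convention: returns 1 if acquired, 0 on contention. */
static int model_mutex_trylock(struct model_lock *l)
{
	return atomic_exchange_explicit(&l->held, 1, memory_order_acquire) == 0;
}

/* down_trylock() convention: returns 0 if acquired, 1 on contention. */
static int model_down_trylock(struct model_lock *l)
{
	return atomic_exchange_explicit(&l->held, 1, memory_order_acquire) != 0;
}

static void model_unlock(struct model_lock *l)
{
	/* Debug check: only a held lock may be released. */
	assert(atomic_load_explicit(&l->held, memory_order_relaxed) == 1);
	atomic_store_explicit(&l->held, 0, memory_order_release);
}
```

Code converted from a semaphore to a mutex has to invert every such test, which is presumably the kind of slip the patch description refers to.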

Re: [patch 1/2] hdaps: fix locking

2007-10-14 Thread Nick Piggin
On Sun, Oct 14, 2007 at 09:25:23AM +0200, Nick Piggin wrote: > Here are a couple of fixes for the hdaps driver. I have kind of been > blocking out the bug traces caused by these (the 2nd patch, actually) > thinking that it's one of those transient / churn things... but it's >

Re: Interaction between Xen and XFS: stray RW mappings

2007-10-14 Thread Nick Piggin
On Monday 15 October 2007 09:12, Jeremy Fitzhardinge wrote: > David Chinner wrote: > > You mean xfs_buf.c. > > Yes, sorry. > > > And yes, we delay unmapping pages until we have a batch of them > > to unmap. vmap and vunmap do not scale, so this is batching helps > > alleviate some of the worst of t

Re: Interaction between Xen and XFS: stray RW mappings

2007-10-14 Thread Nick Piggin
On Monday 15 October 2007 10:57, Jeremy Fitzhardinge wrote: > Nick Piggin wrote: > > Yes, as Dave said, vmap (more specifically: vunmap) is very expensive > > because it generally has to invalidate TLBs on all CPUs. > > I see. > > > I'm looking at some more gene

Re: [RFC] vivi, videobuf_to_vmalloc() and related breakage

2007-10-14 Thread Nick Piggin
On Monday 15 October 2007 12:01, Al Viro wrote: > AFAICS, videobuf-vmalloc use of mem->vma and mem->vmalloc is > bogus. > > You obtain the latter with vmalloc_user(); so far, so good. Then you have > retval=remap_vmalloc_range(vma, mem->vmalloc,0); > where vma is given to you by mmap

Re: ARCH_FREE_PTE_NR 5350 on x86_64

2007-10-15 Thread Nick Piggin
On Monday 15 October 2007 16:54, Alok kataria wrote: > Hi, > > Looking at the tlb_flush code path and its co-relation with > ARCH_FREE_PTE_NR, on x86-64 architecture. I think we still don't use > the ARCH_FREE_PTE_NR of 5350 as the caching value for the mmu_gathers > structure, instead fallback to

OOM killer gripe (was Re: What still uses the block layer?)

2007-10-15 Thread Nick Piggin
On Monday 15 October 2007 18:04, Rob Landley wrote: > On Sunday 14 October 2007 8:45:03 pm Theodore Tso wrote: > > > excuse for conflating different categories of devices in the first > > > place. > > > > See the thinkpad Ultrabay drive example above. > > Last week I drove my laptop so deep into s

Re: [rfc][patch 3/3] x86: optimise barriers

2007-10-15 Thread Nick Piggin
On Mon, Oct 15, 2007 at 09:44:05AM +0200, Jarek Poplawski wrote: > On Fri, Oct 12, 2007 at 08:13:52AM -0700, Linus Torvalds wrote: > > > > > > On Fri, 12 Oct 2007, Jarek Poplawski wrote: > ... > > So no, there's no way a software person could have afforded to say "it > > seems to work on my setu

Re: [PATCH resend] ramdisk: fix zeroed ramdisk pages on memory pressure

2007-10-15 Thread Nick Piggin
On Monday 15 October 2007 18:28, Christian Borntraeger wrote: > Andrew, this is a resend of a bugfix patch. Ramdisk seems a bit > unmaintained, so decided to sent the patch to you :-). > I have CCed Ted, who did work on the code in the 90s. I found no current > email address of Chad Page. This rea

Re: [PATCH resend] ramdisk: fix zeroed ramdisk pages on memory pressure

2007-10-15 Thread Nick Piggin
On Monday 15 October 2007 19:05, Christian Borntraeger wrote: > Am Montag, 15. Oktober 2007 schrieb Nick Piggin: > > On Monday 15 October 2007 18:28, Christian Borntraeger wrote: > > > Andrew, this is a resend of a bugfix patch. Ramdisk seems a bit > > > unmaintained, s

Re: Interaction between Xen and XFS: stray RW mappings

2007-10-15 Thread Nick Piggin
On Monday 15 October 2007 19:36, Andi Kleen wrote: > David Chinner <[EMAIL PROTECTED]> writes: > > And yes, we delay unmapping pages until we have a batch of them > > to unmap. vmap and vunmap do not scale, so this is batching helps > > alleviate some of the worst of the problems. > > You're keepin

Re: OOM killer gripe (was Re: What still uses the block layer?)

2007-10-15 Thread Nick Piggin
On Monday 15 October 2007 19:52, Rob Landley wrote: > On Monday 15 October 2007 8:37:44 am Nick Piggin wrote: > > > Virtual memory isn't perfect. I've _always_ been able to come up with > > > examples where it just doesn't work for me. This doesn't mea

Re: [PATCH resend] ramdisk: fix zeroed ramdisk pages on memory pressure

2007-10-15 Thread Nick Piggin
On Monday 15 October 2007 19:16, Andrew Morton wrote: > On Tue, 16 Oct 2007 00:06:19 +1000 Nick Piggin <[EMAIL PROTECTED]> wrote: > > On Monday 15 October 2007 18:28, Christian Borntraeger wrote: > > > Andrew, this is a resend of a bugfix patch. Ramdisk seems a bit >

Re: Interaction between Xen and XFS: stray RW mappings

2007-10-15 Thread Nick Piggin
On Monday 15 October 2007 21:07, Andi Kleen wrote: > On Tue, Oct 16, 2007 at 12:56:46AM +1000, Nick Piggin wrote: > > Is this true even if you don't write through those old mappings? > > I think it happened for reads too. It is a little counter intuitive > because in theo

Re: nfs mmap adventure (was: 2.6.23-mm1)

2007-10-15 Thread Nick Piggin
On Tuesday 16 October 2007 00:06, David Howells wrote: > Peter Zijlstra <[EMAIL PROTECTED]> wrote: > > I get funny SIGBUS' like so: > > > > fault > > if (->page_mkwrite() < 0) > > nfs_vm_page_mkwrite() > > nfs_write_begin() > > nfs_flush_incompatible() > > nfs_wb_page(

Re: [git pull] scheduler updates for v2.6.24

2007-10-15 Thread Nick Piggin
On Tuesday 16 October 2007 00:17, Ingo Molnar wrote: > Linus, please pull the latest scheduler git tree from: > >git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched.git > > It contains lots of scheduler updates from lots of people - hopefully > the last big one for quite some ti

Re: LFENCE instruction (was: [rfc][patch 3/3] x86: optimise barriers)

2007-10-15 Thread Nick Piggin
On Tue, Oct 16, 2007 at 12:08:01AM +0200, Mikulas Patocka wrote: > > On Mon, 15 Oct 2007 22:47:42 +0200 (CEST) > > Mikulas Patocka <[EMAIL PROTECTED]> wrote: > > > > > > According to latest memory ordering specification documents from > > > > Intel and AMD, both manufacturers are committed to in-o

Re: [rfc][patch 3/3] x86: optimise barriers

2007-10-15 Thread Nick Piggin
On Mon, Oct 15, 2007 at 11:10:00AM +0200, Jarek Poplawski wrote: > On Mon, Oct 15, 2007 at 10:09:24AM +0200, Nick Piggin wrote: > ... > > Has performance really been much problem for you? (even before the > > lfence instruction, when you theoretically had to use a locked op

Re: [PATCH resend] ramdisk: fix zeroed ramdisk pages on memory pressure

2007-10-15 Thread Nick Piggin
On Tuesday 16 October 2007 13:14, Eric W. Biederman wrote: > Nick Piggin <[EMAIL PROTECTED]> writes: > > On Monday 15 October 2007 19:16, Andrew Morton wrote: > >> On Tue, 16 Oct 2007 00:06:19 +1000 Nick Piggin <[EMAIL PROTECTED]> > > > > wrote: >

Re: OOM killer gripe (was Re: What still uses the block layer?)

2007-10-15 Thread Nick Piggin
On Tuesday 16 October 2007 13:55, Eric W. Biederman wrote: > Nick Piggin <[EMAIL PROTECTED]> writes: > > How much swap do you have configured? You really shouldn't configure > > so much unless you do want the kernel to actually use it all, right? > > No.

Re: OOM killer gripe (was Re: What still uses the block layer?)

2007-10-15 Thread Nick Piggin
On Tuesday 16 October 2007 14:38, Eric W. Biederman wrote: > Nick Piggin <[EMAIL PROTECTED]> writes: > > On Tuesday 16 October 2007 13:55, Eric W. Biederman wrote: > > I don't follow your logic. We don't need SWAP > RAM in order to swap > > effectively, IM

Re: [PATCH resend] ramdisk: fix zeroed ramdisk pages on memory pressure

2007-10-15 Thread Nick Piggin
On Tuesday 16 October 2007 14:57, Eric W. Biederman wrote: > Nick Piggin <[EMAIL PROTECTED]> writes: > >> make_page_uptodate() is most hideous part I have run into. > >> It has to know details about other layers to now what not > >> to stomp. I think my inco

[patch][rfc] rewrite ramdisk

2007-10-16 Thread Nick Piggin
On Tuesday 16 October 2007 18:08, Nick Piggin wrote: > On Tuesday 16 October 2007 14:57, Eric W. Biederman wrote: > > > What magic restrictions on page allocations? Actually we have > > > fewer restrictions on page allocations because we can use > > > highmem! >

Re: [patch][rfc] rewrite ramdisk

2007-10-16 Thread Nick Piggin
On Tuesday 16 October 2007 17:52, Jan Engelhardt wrote: > On Oct 16 2007 17:47, Nick Piggin wrote: > >Here's a quick first hack... > > Inline patches preferred ;-) Thanks for reviewing it anyway ;) > >+config BLK_DEV_BRD > >+tristate "RAM

Re: [PATCH] rd: Preserve the dirty bit in init_page_buffers()

2007-10-16 Thread Nick Piggin
od idea. Was this causing the reiserfs problems? If so, I think we should be concentrating on what the real problem is with reiserfs... (or at least why this so obviously correct looking patch is wrong). Acked-by: Nick Piggin <[EMAIL PROTECTED]> > > Signed-off-by: Eric W. Biederma

Re: [PATCH] rd: Mark ramdisk buffers heads dirty

2007-10-16 Thread Nick Piggin
On Tuesday 16 October 2007 08:42, Eric W. Biederman wrote: > I have not observed this case but it is possible to get a dirty page > cache with clean buffer heads if we get a clean ramdisk page with > buffer heads generated by a filesystem calling __getblk and then write > to that page from user spa

Re: [patch][rfc] rewrite ramdisk

2007-10-16 Thread Nick Piggin
On Tuesday 16 October 2007 18:17, Jan Engelhardt wrote: > On Oct 16 2007 18:07, Nick Piggin wrote: > >Changed. But it will hopefully just completely replace rd.c, > >so I will probably just rename it to rd.c at some point (and > >change .config options to stay compatible). Unl

Re: [PATCH] rd: Mark ramdisk buffers heads dirty

2007-10-16 Thread Nick Piggin
On Wednesday 17 October 2007 05:06, Eric W. Biederman wrote: > Nick Piggin <[EMAIL PROTECTED]> writes: > > On Tuesday 16 October 2007 08:42, Eric W. Biederman wrote: > >> I have not observed this case but it is possible to get a dirty page > >> cache with cle

Re: [patch][rfc] rewrite ramdisk

2007-10-16 Thread Nick Piggin
On Wednesday 17 October 2007 07:28, Theodore Tso wrote: > On Tue, Oct 16, 2007 at 05:47:12PM +1000, Nick Piggin wrote: > > + /* > > +* ram device BLKFLSBUF has special semantics, we want to actually > > +* release and destroy the ramdisk data. > > +*/

Re: LFENCE instruction (was: [rfc][patch 3/3] x86: optimise barriers)

2007-10-16 Thread Nick Piggin
On Tue, Oct 16, 2007 at 12:33:54PM +0200, Mikulas Patocka wrote: > > > On Tue, 16 Oct 2007, Nick Piggin wrote: > > > > > The cpus also have an explicit set of instructions that deliberately do > > > > unordered stores/loads, and s/lfence etc are mostly des

Re: LFENCE instruction (was: [rfc][patch 3/3] x86: optimise barriers)

2007-10-16 Thread Nick Piggin
On Wed, Oct 17, 2007 at 01:05:16AM +0200, Mikulas Patocka wrote: > > > I see, AMD says that WC memory loads can be out-of-order. > > > > > > There is very little usability to it --- framebuffer and AGP aperture is > > > the only piece of memory that is WC and no kernel structures are placed > >

Re: [patch][rfc] rewrite ramdisk

2007-10-16 Thread Nick Piggin
On Wednesday 17 October 2007 09:48, Eric W. Biederman wrote: > Nick Piggin <[EMAIL PROTECTED]> writes: > > On Wednesday 17 October 2007 07:28, Theodore Tso wrote: > >> On Tue, Oct 16, 2007 at 05:47:12PM +1000, Nick Piggin wrote: > >> > +/* >

Re: [patch][rfc] rewrite ramdisk

2007-10-16 Thread Nick Piggin
On Wednesday 17 October 2007 11:13, Eric W. Biederman wrote: > Nick Piggin <[EMAIL PROTECTED]> writes: > > We have 2 problems. First is that, for testing/consistency, we > > don't want BLKFLSBUF to throw out the data. Maybe hardly anything > > uses BLKFLSBUF

Re: LFENCE instruction (was: [rfc][patch 3/3] x86: optimise barriers)

2007-10-17 Thread Nick Piggin
On Wed, Oct 17, 2007 at 02:30:32AM +0200, Mikulas Patocka wrote: > > > You already must not place any data structures into WC memory --- for > > > example, spinlocks wouldn't work there. > > > > What do you mean "already"? > > I mean "in current kernel" (I checked it in 2.6.22) Ahh, that's not

Re: LFENCE instruction (was: [rfc][patch 3/3] x86: optimise barriers)

2007-10-17 Thread Nick Piggin
On Wed, Oct 17, 2007 at 01:51:17PM +0800, Herbert Xu wrote: > Nick Piggin <[EMAIL PROTECTED]> wrote: > > > > Also, for non-wb memory. I don't think the Intel document referenced > > says anything about this, but the AMD document says that loads can pass > > lo

Re: [patch][rfc] rewrite ramdisk

2007-10-17 Thread Nick Piggin
On Wednesday 17 October 2007 20:30, Eric W. Biederman wrote: > Nick Piggin <[EMAIL PROTECTED]> writes: > > On Tuesday 16 October 2007 18:08, Nick Piggin wrote: > >> On Tuesday 16 October 2007 14:57, Eric W. Biederman wrote: > >> > > What magic restriction

Re: [patch][rfc] rewrite ramdisk

2007-10-17 Thread Nick Piggin
On Thursday 18 October 2007 04:45, Eric W. Biederman wrote: > At this point my concern is what makes a clean code change in the > kernel. Because user space can currently play with buffer_heads > by way of the block device and cause lots of havoc (see the recent Well if userspace is writing to th

Re: [RFC][PATCH] block: Isolate the buffer cache in it's own mappings.

2007-10-17 Thread Nick Piggin
On Thursday 18 October 2007 13:59, Eric W. Biederman wrote: > If filesystems care at all they want absolute control over the buffer > cache. Controlling which buffers are dirty and when. Because we > keep the buffer cache in the page cache for the block device we have > not quite been giving file

Re: How Inactive may be much greather than cached?

2007-10-17 Thread Nick Piggin
Hi, On Thursday 18 October 2007 16:24, Vasily Averin wrote: > Hi all, > > could anybody explain how "inactive" may be much greater than "cached"? > stress test (http://weather.ou.edu/~apw/projects/stress/) that writes into > removed files in cycle puts the node to the following state: > > MemTotal

Re: How Inactive may be much greather than cached?

2007-10-18 Thread Nick Piggin
On Thursday 18 October 2007 17:14, Vasily Averin wrote: > Nick Piggin wrote: > > Hi, > > > > On Thursday 18 October 2007 16:24, Vasily Averin wrote: > >> Hi all, > >> > >> could anybody explain how "inactive" may be much greater than "

Re: Kernel 2.6.21.6:"swapper: page allocation failure. order:1, mode:0x20"

2007-10-18 Thread Nick Piggin
On Thursday 18 October 2007 16:16, Andrew A. Razdolsky wrote: > Hello! > > In attachments i did pick all info i know about this failure. Hi, Does this actually cause problems for your system? Occasional page allocation failures from interrupt context are expected. If you are getting a lot of the

Re: [patch 1/4] x86: FIFO ticket spinlocks

2007-11-01 Thread Nick Piggin
On Thu, Nov 01, 2007 at 04:01:45PM -0400, Chuck Ebbert wrote: > On 11/01/2007 10:03 AM, Nick Piggin wrote: > > [edited to show the resulting code] > > > + __asm__ __volatile__ ( > > + LOCK_PREFIX "xaddw %w0, %1\n" > > +
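
The quoted assembly is truncated after the `xaddw`, so the rest of the patch is not visible in this listing. For readers unfamiliar with the idea, here is a minimal sketch of a FIFO ticket lock in portable C11. It is an assumption-laden illustration, not the x86 code from the thread: it keeps the two counters in separate fields rather than packing them into one word, and every `model_*` name is hypothetical. The atomic fetch-and-add that hands out tickets plays the role the `xadd` instruction plays in the quoted patch.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical ticket lock; the field layout is an assumption of this sketch. */
struct model_ticket_lock {
	atomic_uint next;	/* next ticket to hand out */
	atomic_uint owner;	/* ticket currently allowed to hold the lock */
};

static void model_ticket_init(struct model_ticket_lock *l)
{
	atomic_init(&l->next, 0);
	atomic_init(&l->owner, 0);
}

/* Take a ticket: a single atomic fetch-and-add. */
static unsigned int model_ticket_take(struct model_ticket_lock *l)
{
	return atomic_fetch_add_explicit(&l->next, 1, memory_order_relaxed);
}

/* Spin until our ticket comes up; waiters are served in arrival order. */
static void model_ticket_wait(struct model_ticket_lock *l, unsigned int ticket)
{
	while (atomic_load_explicit(&l->owner, memory_order_acquire) != ticket)
		;	/* busy-wait */
}

static void model_ticket_acquire(struct model_ticket_lock *l)
{
	model_ticket_wait(l, model_ticket_take(l));
}

static void model_ticket_release(struct model_ticket_lock *l)
{
	/* Debug check: someone must currently hold the lock. */
	assert(atomic_load_explicit(&l->owner, memory_order_relaxed) !=
	       atomic_load_explicit(&l->next, memory_order_relaxed));
	/* Pass the lock to the next ticket holder. */
	atomic_fetch_add_explicit(&l->owner, 1, memory_order_release);
}
```

Because each waiter spins on its own ticket number, a CPU that just released the lock cannot immediately retake it ahead of earlier waiters, which is the fairness property the thread is about.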

Re: 2.6.23 regression: accessing invalid mmap'ed memory from gdb causes unkillable spinning

2007-11-01 Thread Nick Piggin
On Thu, Nov 01, 2007 at 09:08:45AM -0700, Linus Torvalds wrote: > > > On Thu, 1 Nov 2007, Nick Piggin wrote: > > > > Untested patch follows > > Ok, this looks ok. > > Except I would remove the VM_MAYSHARE bit from the test. But we do want to allow forced CO

Re: 2.6.23 regression: accessing invalid mmap'ed memory from gdb causes unkillable spinning

2007-11-01 Thread Nick Piggin
On Thu, Nov 01, 2007 at 06:17:42PM -0700, Linus Torvalds wrote: > > > On Fri, 2 Nov 2007, Nick Piggin wrote: > > > > But we do want to allow forced COW faults for MAP_PRIVATE mappings. gdb > > uses this for inserting breakpoints (but fortunately, a COW page in a

Re: 2.6.23 regression: accessing invalid mmap'ed memory from gdb causes unkillable spinning

2007-11-02 Thread Nick Piggin
On Thu, Nov 01, 2007 at 10:02:04PM -0700, David Miller wrote: > From: David Miller <[EMAIL PROTECTED]> > Date: Wed, 31 Oct 2007 00:44:25 -0700 (PDT) > > > From: Nick Piggin <[EMAIL PROTECTED]> > > Date: Wed, 31 Oct 2007 08:41:06 +0100 > > > > > Y

Re: [patch 1/4] x86: FIFO ticket spinlocks

2007-11-01 Thread Nick Piggin
On Thu, Nov 01, 2007 at 06:19:41PM -0700, Linus Torvalds wrote: > > > On Thu, 1 Nov 2007, Rik van Riel wrote: > > > > Larry Woodman managed to wedge the VM into a state where, on his > > 4x dual core system, only 2 cores (on the same CPU) could get the > > zone->lru_lock overnight. The other 6

Re: [patch 1/4] x86: FIFO ticket spinlocks

2007-11-02 Thread Nick Piggin
On Fri, Nov 02, 2007 at 10:05:37AM -0400, Rik van Riel wrote: > On Fri, 2 Nov 2007 07:42:20 +0100 > Nick Piggin <[EMAIL PROTECTED]> wrote: > > > On Thu, Nov 01, 2007 at 06:19:41PM -0700, Linus Torvalds wrote: > > > > > > > > > On Thu, 1 Nov

Re: [patch 1/4] x86: FIFO ticket spinlocks

2007-11-02 Thread Nick Piggin
On Fri, Nov 02, 2007 at 09:51:27AM -0700, Linus Torvalds wrote: > > > On Fri, 2 Nov 2007, Chuck Ebbert wrote: > > > > There's also a very easy way to get better fairness with our current > > spinlocks: > > use xchg to release the lock instead of mov. > > That does nothing at all. > > Yes, it

Re: [patch 1/4] x86: FIFO ticket spinlocks

2007-11-02 Thread Nick Piggin
On Fri, Nov 02, 2007 at 08:56:46PM -0400, Chuck Ebbert wrote: > On 11/02/2007 07:01 PM, Nick Piggin wrote: > > > > In the contended multi-threaded tight loop, the xchg lock is slower than inc > > lock but still beats the fair xadd lock, but that's only because it is >

Re: Oom-killer error.

2007-11-06 Thread Nick Piggin
On Tuesday 06 November 2007 19:34, Jim van Wel wrote: > Hi there, > > I have a strange problem with like 10-15 servers right now. > We have here all HP DL380-G5 servers with kernel 2.6.22.6. System all > works normall. But after a uptime of like 15 a 25 days, we get these > messages, and the server

Re: VM/networking crash cause #1: page allocation failure (order:1, GFP_ATOMIC)

2007-11-06 Thread Nick Piggin
On Tuesday 06 November 2007 04:42, Frank van Maarseveen wrote: > For quite some time I'm seeing occasional lockups spread over 50 different > machines I'm maintaining. Symptom: a page allocation failure with order:1, > GFP_ATOMIC, while there is plenty of memory, as it seems (lots of free > pages,

Re: [RFC/PATCH] Optimize zone allocator synchronization

2007-11-06 Thread Nick Piggin
On Wednesday 07 November 2007 17:19, Andrew Morton wrote: > > On Tue, 06 Nov 2007 05:08:07 -0500 Chris Snook <[EMAIL PROTECTED]> wrote: > > > > Don Porter wrote: > > > From: Donald E. Porter <[EMAIL PROTECTED]> > > > > > > In the bulk page allocation/free routines in mm/page_alloc.c, the zone > > >

Re: [patch 1/4] x86: FIFO ticket spinlocks

2007-11-07 Thread Nick Piggin
On Fri, Nov 02, 2007 at 04:33:32PM +0100, Ingo Molnar wrote: > > * Nick Piggin <[EMAIL PROTECTED]> wrote: > > > Anyway, if this can make its way to the x86 tree, I think it will get > > pulled into -mm (?) and get some exposure... > > ok, we can certainly try

Re: VM/networking crash cause #1: page allocation failure (order:1, GFP_ATOMIC)

2007-11-08 Thread Nick Piggin
On Thursday 08 November 2007 00:48, Frank van Maarseveen wrote: > On Wed, Nov 07, 2007 at 09:01:17AM +1100, Nick Piggin wrote: > > On Tuesday 06 November 2007 04:42, Frank van Maarseveen wrote: > > > For quite some time I'm seeing occasional lockups spread over 50 >

2.6.24-rc2 slab vs slob tbench numbers

2007-11-09 Thread Nick Piggin
Hi, Just ran some tbench numbers (from dbench-3.04), on a 2 socket, 8 core x86 system, with 1 NUMA node per socket. With kernel 2.6.24-rc2, comparing slab vs slub allocators. I run from 1 to 16 client threads, 5 times each, and restarting the tbench server between every run. I'm just taking the h

Re: [PATCH] sched: avoid large irq-latencies in smp-balancing

2007-11-09 Thread Nick Piggin
On Thursday 08 November 2007 15:37, Gregory Haskins wrote: > Peter Zijlstra wrote: > > Bah, missed a hunk > > > > --- > > Subject: sched: avoid large irq-latencies in smp-balancing > > > > SMP balancing is done with IRQs disabled and can iterate the full rq. > > When rqs are large this can cause la

Re: [PATCH, RFC] improved hacks to allow -rt to run kernbench on POWER

2007-11-09 Thread Nick Piggin
On Saturday 10 November 2007 07:52, Benjamin Herrenschmidt wrote: > > diff -urpNa -X dontdiff linux-2.6.23.1-rt4/arch/powerpc/kernel/process.c > > linux-2.6.23.1-rt4-fix/arch/powerpc/kernel/process.c --- > > linux-2.6.23.1-rt4/arch/powerpc/kernel/process.c2007-10-12 > > 09:43:44.0 -0700

Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench

2007-11-09 Thread Nick Piggin
cc'ed linux-netdev On Saturday 10 November 2007 10:46, Christoph Lameter wrote: > commit deea84b0ae3d26b41502ae0a39fe7fe134e703d0 seems to cause a drop > in SLUB tbench performance: > > 8p x86_64 system: > > 2.6.24-rc2: > 1260.80 MB/sec > > After reverting the patch: > 2350.04 MB/sec >

[patch 1/2] mm: page trylock rename

2007-11-09 Thread Nick Piggin
=> trylock_page, SetPageLocked => set_page_locked). Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> --- drivers/scsi/sg.c |2 +- fs/afs/write.c |2 +- fs/cifs/file.c |2 +- fs/jbd/commit.c |2 +- fs/jbd2/commit.c|

[patch 2/2] fs: buffer trylock rename

2007-11-09 Thread Nick Piggin
fs: rename buffer trylock Converting the buffer lock to new bitops also requires name change, so convert the raw test_and_set bitop to a trylock. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> --- fs/buffer.c |4 ++-- fs/jbd/commit.c |2 +- fs/jbd2/co

Re: [patch 1/2] mm: page trylock rename

2007-11-09 Thread Nick Piggin
ss, but we can't do non-atomic access. Split this into add_to_page_cache_locked, for tmpfs. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> --- Index: linux-2.6/mm/filemap.c === --- linux-2.6.orig/mm/filemap.c +++ linux-2.6/mm

Re: 2.6.24-rc2: Network commit causes SLUB performance regression with tbench

2007-11-09 Thread Nick Piggin
On Saturday 10 November 2007 12:29, Nick Piggin wrote: > cc'ed linux-netdev Err, make that 'netdev' :P > On Saturday 10 November 2007 10:46, Christoph Lameter wrote: > > commit deea84b0ae3d26b41502ae0a39fe7fe134e703d0 seems to cause a drop > > in SLUB tbench perf

Re: [PATCH 0/11 v3] enable "make ARCH=x86"

2007-11-10 Thread Nick Piggin
On Saturday 10 November 2007 18:54, Sam Ravnborg wrote: > On Fri, Nov 09, 2007 at 10:23:23PM -0500, Jeff Garzik wrote: > > Sam Ravnborg wrote: > > >This is the patch that get rid of ARCH=i386 and ARCH=x86_64 > > >and introduce ARCH=x86. > > >It touches several files but the changes are all one or t

Re: [RPC][PATCH 2.6.20-rc5] limit total vfs page cache

2007-01-19 Thread Nick Piggin
Aubrey Li wrote: On 1/20/07, Vaidyanathan Srinivasan <[EMAIL PROTECTED]> wrote: If pagecache is overlimit, we expect old (cold) pagecache pages to be thrown out and reused for new file data. We do not expect to drop a few text or data pages to make room for new pagecache. Well, actually I t

Re: [RPC][PATCH 2.6.20-rc5] limit total vfs page cache

2007-01-19 Thread Nick Piggin
Mike Frysinger wrote: On 1/19/07, Nick Piggin <[EMAIL PROTECTED]> wrote: Luckily, there are actually good, robust solutions for your higher order allocation problem. Do higher order allocations at boot time, modifiy userspace applications, or set up otherwise-unused, or easily recla

Re: [RPC][PATCH 2.6.20-rc5] limit total vfs page cache

2007-01-19 Thread Nick Piggin
Aubrey Li wrote: So what's the right way to limit pagecache? Probably something a lot more complicated... if you can say there is a "right way". Secondly, your patch isn't actually very good. It unconditionally shrinks memory to below the given % mark each time a pagecache alloc occurs, rega

Re: [patch 6/10] mm: be sure to trim blocks

2007-01-19 Thread Nick Piggin
On Sun, Jan 14, 2007 at 05:25:44PM +0300, Dmitriy Monakhov wrote: > Nick Piggin <[EMAIL PROTECTED]> writes: > > > If prepare_write fails with AOP_TRUNCATED_PAGE, or if commit_write fails, > > then > > we may have failed the write operation despite prepare_write

Re: [patch 6/10] mm: be sure to trim blocks

2007-01-19 Thread Nick Piggin
On Tue, Jan 16, 2007 at 08:14:16PM +0100, Peter Zijlstra wrote: > On Tue, 2007-01-16 at 18:36 +0100, Peter Zijlstra wrote: > > buf, bytes); > > > @@ -1935,10 +1922,9 @@ generic_file_buffered_write(struct kiocb > > >

[patch] buffer: memorder fix

2007-01-19 Thread Nick Piggin
Anyone mind telling me why unlock_buffer, unlike unlock_page, thinks it can clear the lock without ensuring the critical section is closed (ie. with a barrier)? Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> Index: linux-2.6/fs/bu
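
The patch body is cut off in this listing. The concern it raises can be sketched in userspace C11 as follows: when a lock is a single bit in a state word, the operation that clears the bit needs release ordering, so that writes made inside the critical section are visible before another thread can see the lock as free. The `model_*` names and the bit value are hypothetical and only loosely mirror the buffer-head lock bit.

```c
#include <assert.h>
#include <stdatomic.h>

#define MODEL_BH_LOCK 0x1UL	/* hypothetical lock bit in the state word */

struct model_buffer {
	atomic_ulong state;
	int data;		/* protected by the lock bit */
};

static void model_buffer_init(struct model_buffer *b)
{
	atomic_init(&b->state, 0UL);
	b->data = 0;
}

/* Returns 1 if the lock bit was taken, 0 if it was already set. */
static int model_trylock_buffer(struct model_buffer *b)
{
	unsigned long old = atomic_fetch_or_explicit(&b->state, MODEL_BH_LOCK,
						     memory_order_acquire);
	return !(old & MODEL_BH_LOCK);
}

static void model_unlock_buffer(struct model_buffer *b)
{
	/* Debug check: the bit must be set before it is cleared. */
	assert(atomic_load_explicit(&b->state, memory_order_relaxed) & MODEL_BH_LOCK);
	/*
	 * Release ordering closes the critical section: earlier stores
	 * (such as to b->data) cannot be reordered after the clear.
	 * A relaxed clear here would be the analogue of the missing barrier.
	 */
	atomic_fetch_and_explicit(&b->state, ~MODEL_BH_LOCK, memory_order_release);
}
```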

Re: [PATCH] Introduce simple TRUE and FALSE boolean macros.

2007-01-22 Thread Nick Piggin
Robert P. J. Day wrote: by adding (temporarily) the definitions of TRUE and FALSE to types.h, you should then (theoretically) be able to delete over 100 instances of those same macros being *defined* throughout the source tree. you're not going to be deleting the hundreds and hundreds of *uses*

Re: [PATCH] Introduce simple TRUE and FALSE boolean macros.

2007-01-22 Thread Nick Piggin
Robert P. J. Day wrote: On Mon, 22 Jan 2007, Nick Piggin wrote: Robert P. J. Day wrote: by adding (temporarily) the definitions of TRUE and FALSE to types.h, you should then (theoretically) be able to delete over 100 instances of those same macros being *defined* throughout the source tree

Re: Why active list and inactive list?

2007-01-22 Thread Nick Piggin
Balbir Singh wrote: This makes me wonder if it makes sense to split up the LRU into page cache LRU and mapped pages LRU. I see two benefits 1. Currently based on swappiness, we might walk an entire list searching for page cache pages or mapped pages. With these lists separated, it should

Re: [patch] notifiers: fix blocking_notifier_call_chain() scalability

2007-01-23 Thread Nick Piggin
Peter Zijlstra wrote: On Tue, 2007-01-23 at 10:45 +0100, Ingo Molnar wrote: The fix is to enhance blocking_notifier_call_chain() to only take the lock if there appears to be work on the call-chain. With this patch applied i get nicely saturated system, and much higher munmap performance, on
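
The quoted description, taking the lock only if there appears to be work on the call-chain, can be sketched as below. This is a userspace model under stated assumptions, not the kernel's notifier code: a spin flag stands in for the chain's lock, `locked_calls` exists only to make the fast path observable, and all `model_*` names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct model_notifier {
	int (*call)(void *data);
	struct model_notifier *next;
};

struct model_chain {
	_Atomic(struct model_notifier *) head;
	atomic_flag lock;		/* stand-in for the chain's lock */
	unsigned long locked_calls;	/* instrumentation for this sketch only */
};

static void model_chain_init(struct model_chain *c)
{
	atomic_init(&c->head, (struct model_notifier *)NULL);
	atomic_flag_clear(&c->lock);
	c->locked_calls = 0;
}

static void model_chain_lock(struct model_chain *c)
{
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;	/* spin */
}

static void model_chain_unlock(struct model_chain *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

static void model_chain_register(struct model_chain *c, struct model_notifier *n)
{
	assert(n != NULL && n->call != NULL);
	model_chain_lock(c);
	n->next = atomic_load_explicit(&c->head, memory_order_relaxed);
	atomic_store_explicit(&c->head, n, memory_order_release);
	model_chain_unlock(c);
}

static int model_call_chain(struct model_chain *c, void *data)
{
	struct model_notifier *n;
	int ret = 0;

	/* Fast path: an empty chain is detected without touching the lock. */
	if (atomic_load_explicit(&c->head, memory_order_acquire) == NULL)
		return 0;

	model_chain_lock(c);
	c->locked_calls++;
	for (n = atomic_load_explicit(&c->head, memory_order_relaxed); n; n = n->next)
		ret = n->call(data);
	model_chain_unlock(c);
	return ret;
}

/* Example callback: counts how many times it was notified. */
static int model_count_event(void *data)
{
	++*(int *)data;
	return 0;
}
```

The unlocked check is only a hint: a notifier registered concurrently with a call may be missed by that call, which is the trade-off implied by "appears to be work".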

via irq quirk breakage

2007-01-23 Thread Nick Piggin
Recently updated an old box to a new kernel, and the USB mouse stops working. Well it sort of works, but stutters and is very unresponsive. This happens now and again when the IRQ routing for my board gets broken. Attached a dmesg from a bad 2.6.20-rc5, and a quick hack that gets everything worki

Re: [RFC] Limit the size of the pagecache

2007-01-23 Thread Nick Piggin
Christoph Lameter wrote: This is a patch using some of Aubrey's work plugging it in what is IMHO the right way. Feel free to improve on it. I have gotten repeatedly requests to be able to limit the pagecache. With the revised VM statistics this is now actually possile. I'd like to know more abo

Re: [PATCH 1/1] Page Table cleanup patch

2007-01-23 Thread Nick Piggin
Paul Davies wrote: This patch is a proposed cleanup of the current page table organisation. Such a cleanup would be a logical first step towards introducing at least a partial clean page table interface, geared towards providing enhanced virtualization oportunities for x86. It is also a common

Re: [RFC] Limit the size of the pagecache

2007-01-23 Thread Nick Piggin
Aubrey Li wrote: On 1/24/07, Christoph Lameter <[EMAIL PROTECTED]> wrote: On Wed, 24 Jan 2007, Nick Piggin wrote: > > 1. Insure that anonymous pages that may contain performance > >critical data is never subject to swap. > > > > 2. Insure rapid turnaround of

Re: [RFC] Limit the size of the pagecache

2007-01-24 Thread Nick Piggin
Peter Zijlstra wrote: On Tue, 2007-01-23 at 16:49 -0800, Christoph Lameter wrote: 2. Insure rapid turnaround of pages in the cache. [...] The only maybe valid point would be 2, and I'd like to see if we can't solve that differently - a better use-once logic comes to mind. There must be

Re: [PATCH 1/2]: Fix BUG in cancel_dirty_pages on XFS

2007-01-24 Thread Nick Piggin
Peter Zijlstra wrote: On Wed, 2007-01-24 at 09:37 +1100, David Chinner wrote: With the recent changes to cancel_dirty_pages(), XFS will dump warnings in the syslog because it can truncate_inode_pages() on dirty mapped pages. I've determined that this is indeed correct behaviour for XFS as this

Re: [PATCH 1/2]: Fix BUG in cancel_dirty_pages on XFS

2007-01-24 Thread Nick Piggin
Peter Zijlstra wrote: On Thu, 2007-01-25 at 00:43 +1100, Nick Piggin wrote: Have you seen the new launder_page() a_op? called from invalidate_inode_pages2_range() It would have been nice to make that one into a more potentially useful generic callback. That can still be done when the

Re: [PATCH 1/2]: Fix BUG in cancel_dirty_pages on XFS

2007-01-24 Thread Nick Piggin
David Chinner wrote: On Thu, Jan 25, 2007 at 12:43:23AM +1100, Nick Piggin wrote: And why not just leave it in the pagecache and be done with it? because what is in cache is then not coherent with what is on disk, and a direct read is supposed to read the data that is present in the file

Re: [PATCH 1/2]: Fix BUG in cancel_dirty_pages on XFS

2007-01-24 Thread Nick Piggin
David Chinner wrote: On Thu, Jan 25, 2007 at 11:12:41AM +1100, Nick Piggin wrote: ... so surely if you do a direct read followed by a buffered read, you should *not* get the same data if there has been some activity to modify that part of the file in the meantime (whether that be a buffered

Re: 2.6.20-rc5: cp 18gb 18gb.2 = OOM killer, reproducible just like 2.16.19.2

2007-01-24 Thread Nick Piggin
Justin Piszcz wrote: On Mon, 22 Jan 2007, Andrew Morton wrote: After the oom-killing, please see if you can free up the ZONE_NORMAL memory via a few `echo 3 > /proc/sys/vm/drop_caches' commands. See if you can work out what happened to the missing couple-of-hundred MB from ZONE_NORMAL. Ru

Re: [PATCH 1/2]: Fix BUG in cancel_dirty_pages on XFS

2007-01-24 Thread Nick Piggin
David Chinner wrote: On Thu, Jan 25, 2007 at 11:47:24AM +1100, Nick Piggin wrote: David Chinner wrote: On Thu, Jan 25, 2007 at 11:12:41AM +1100, Nick Piggin wrote: ... so surely if you do a direct read followed by a buffered read, you should *not* get the same data if there has been some

Re: [PATCH 1/2]: Fix BUG in cancel_dirty_pages on XFS

2007-01-24 Thread Nick Piggin
David Chinner wrote: On Thu, Jan 25, 2007 at 01:01:09PM +1100, Nick Piggin wrote: David Chinner wrote: No. The only thing that will happen here is that the direct read will see _none_ of the write because the mmap write occurred during the DIO read to a different set of pages in memory

Re: [PATCH 1/2]: Fix BUG in cancel_dirty_pages on XFS

2007-01-25 Thread Nick Piggin
David Chinner wrote: Only if we leave the page in the page cache. If we toss the page, the time it takes to do the I/O for the page fault is enough for the direct I/o to complete. Sure it's not an absolute guarantee, but if you want an absolute guarantee: So I guess you *could* relax it in the

Re: [PATCH] mm: fix page_mkclean_one (was: 2.6.19 file content corruption on ext3)

2006-12-26 Thread Nick Piggin
Linus Torvalds wrote: On Sun, 24 Dec 2006, Linus Torvalds wrote: Peter, tell me I'm crazy, but with the new rules, the following condition is a bug: - shared mapping - writable - not already marked dirty in the PTE Ok, so how about this diff. I'm actually feeling good about this one. It

Re: Ok, explained.. (was Re: [PATCH] mm: fix page_mkclean_one)

2006-12-29 Thread Nick Piggin
Hey nice work Linus! Linus Torvalds wrote: On Fri, 29 Dec 2006, Linus Torvalds wrote: Hmm? I'd love it if somebody else wrote the patch and tested it, because I'm getting sick and tired of this bug ;) Who the hell am I kidding? I haven't been able to sleep right for the last few days over

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2006-12-30 Thread Nick Piggin
Andrea Gelmini wrote: On Fri, Dec 29, 2006 at 06:59:02PM +, Linux Kernel Mailing List wrote: Gitweb: http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=7658cc289288b8ae7dd2c2224549a048431222b3 Commit: 7658cc289288b8ae7dd2c2224549a048431222b3 Parent:

Re: Page alignment issue

2006-12-30 Thread Nick Piggin
Aubrey wrote: As for the buddy system, much of docs mention the physical address of the first page frame of a block should be a multiple of the group size. For example, the initial address of a 16-page-frame block should be 16-page aligned. I happened to encounter an issue that the physical addre
