[PATCH v3] powerpc/ppc64: Use preempt_schedule_irq instead of preempt_schedule

2009-10-26 Thread Benjamin Herrenschmidt
> So I _think_ that the irqs on/off accounting for lockdep isn't quite > right. What do you think of this slightly modified version ? I've only > done a quick boot test on a G5 with lockdep enabled and played a bit, > nothing shows up so far but it's definitely not conclusive. > > The main diff

[6/6] Bring hugepage PTE accessor functions back into sync with normal accessors

2009-10-26 Thread David Gibson
The hugepage arch code provides a number of hook functions/macros which mirror the functionality of various normal page pte access functions. Various changes in the normal page accessors (in particular BenH's recent changes to the handling of lazy icache flushing and PAGE_EXEC) have caused the hug

[5/6] Split hash MMU specific hugepage code into a new file

2009-10-26 Thread David Gibson
This patch separates the parts of hugetlbpage.c which are inherently specific to the hash MMU into a new hugetlbpage-hash64.c file. Signed-off-by: David Gibson --- arch/powerpc/include/asm/hugetlb.h |3 arch/powerpc/mm/Makefile |5 - arch/powerpc/mm/hugetlbpage-hash64.c |

[4/6] Cleanup initialization of hugepages on powerpc

2009-10-26 Thread David Gibson
This patch simplifies the logic used to initialize hugepages on powerpc. The somewhat oddly named set_huge_psize() is renamed to add_huge_page_size() and now does all necessary verification of whether it's given a valid hugepage size (instead of just some) and instantiates the generic hstate stru

[3/6] Allow more flexible layouts for hugepage pagetables

2009-10-26 Thread David Gibson
Currently each available hugepage size uses a slightly different pagetable layout: that is, the bottom level table of pointers to hugepages is a different size, and may branch off from the normal page tables at a different level. Every hugepage aware path that needs to walk the pagetables must the

[1/6] Make hpte_need_flush() correctly mask for multiple page sizes

2009-10-26 Thread David Gibson
Currently, hpte_need_flush() only correctly flushes the given address for normal pages. Callers for hugepages are required to mask the address themselves. But hpte_need_flush() already looks up the page sizes for its own reasons, so this is a rather silly imposition on the callers. This patch al
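The masking step described above can be illustrated outside the kernel. This is a minimal sketch, assuming the page size is expressed as a shift (log2 of the size in bytes); `mask_to_page_size()` is a hypothetical helper written for illustration and is not the code from hpte_need_flush():

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: round addr down to the start of the page that
 * contains it, where the page size is (1 << shift) bytes.
 */
static uint64_t mask_to_page_size(uint64_t addr, unsigned int shift)
{
    assert(shift < 64);     /* a shift of 64 or more would be undefined */
    return addr & ~((UINT64_C(1) << shift) - 1);
}
```

With a shift of 12 (4K pages) the low 12 bits are cleared; with a shift of 24 (16M pages) the low 24 bits are cleared. Doing this in one place, using a page size that has already been looked up, is what saves each hugepage caller from masking the address itself.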

[2/6] Cleanup management of kmem_caches for pagetables

2009-10-26 Thread David Gibson
Currently we have a fair bit of rather fiddly code to manage the various kmem_caches used to store page tables of various levels. We generally have two caches holding some combination of PGD, PUD and PMD tables, plus several more for the special hugepage pagetables. This patch cleans this all up

[0/6] Assorted hugepage cleanups (v4)

2009-10-26 Thread David Gibson
Currently, ordinary pages use one pagetable layout, and each different hugepage size uses a slightly different variant layout. A number of places which need to walk the pagetable must first check the slice map to see what the pagetable layout is, then handle the various different forms. New hardware,

hypervisor call trace module

2009-10-26 Thread Anton Blanchard
Here is an example of using the hcall tracepoints. This kernel module provides strace-like functionality for hypervisor hcalls: -> 0x64(ff02, 1, 2, d34d7a71, f, c0a6f388, 1, c0989008, c0a3f480) <- 0x64() Which was an EOI (opcode 0x64) of 0xff02 There a

[PATCH 4/6] powerpc: tracing: Give hypervisor call tracepoints access to arguments

2009-10-26 Thread Anton Blanchard
While most users of the hcall tracepoints will only want the opcode and return code, some will want all the arguments. To avoid the complexity of using varargs we pass a pointer to the register save area which contains all arguments. Signed-off-by: Anton Blanchard --- Index: linux.trees.git/arch
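The idea of handing a probe a pointer to the saved arguments, rather than a variable argument list, might look roughly like the following. This is a user-space sketch only; `struct hcall_record` and `probe_hcall_entry()` are hypothetical names and do not come from the patch:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record a probe might fill in for later inspection. */
struct hcall_record {
    unsigned long opcode;
    unsigned long first_arg;
};

/*
 * Hypothetical probe: it receives the opcode plus a pointer to the saved
 * argument array. A probe that only cares about the opcode can ignore
 * args; one that wants the arguments indexes into the array directly.
 */
static void probe_hcall_entry(struct hcall_record *rec,
                              unsigned long opcode,
                              const unsigned long *args)
{
    assert(rec != NULL && args != NULL);
    rec->opcode = opcode;
    rec->first_arg = args[0];
}
```

The probe signature stays fixed no matter how many arguments a given call uses, which is the complexity that varargs would otherwise introduce.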

[PATCH 2/6] powerpc: tracing: Add powerpc tracepoints for timer entry and exit

2009-10-26 Thread Anton Blanchard
We can monitor the effectiveness of our power management of both the kernel and hypervisor by probing the timer interrupt. For example, on this box we see 10.37s timer interrupts on an idle core: -0 [010] 3900.671297: timer_interrupt_entry: pt_regs=c000ce1e7b10 -0 [010] 3900.671302:

[PATCH 5/6] powerpc: Disable HCALL_STATS by default

2009-10-26 Thread Anton Blanchard
The overhead of HCALL_STATS is quite high and the functionality is very rarely used. Key statistics are also missing (e.g. min/max). With the new hcall tracepoints much more powerful tracing can be done in a kernel module. Let's disable this by default. Signed-off-by: Anton Blanchard --- Index: l

[PATCH 1/6] powerpc: tracing: Add powerpc tracepoints for interrupt entry and exit

2009-10-26 Thread Anton Blanchard
This patch adds powerpc specific tracepoints for interrupt entry and exit. While we already have generic irq_handler_entry and irq_handler_exit tracepoints there are cases on our virtualised powerpc machines where an interrupt is presented to the OS, but subsequently handled by the hypervisor. Th

[PATCH 3/6] powerpc: tracing: Add hypervisor call tracepoints

2009-10-26 Thread Anton Blanchard
Add hcall_entry and hcall_exit tracepoints. This replaces the inline assembly HCALL_STATS code and converts it to use the new tracepoints. To keep the disabled case as quick as possible, we embed a status word in the TOC so we can get at it with a single load. By doing so we keep the overhead at
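The "status word" approach to keeping the disabled case cheap can be modelled in plain C. The sketch below is an assumption-laden illustration, not the patch's assembly: every name in it (`hcall_trace_refcount`, `hcall_trace_register()`, `hcall_sketch()`) is hypothetical, and the TOC placement that makes the check a single load on ppc64 has no equivalent here.

```c
#include <assert.h>

/* Hypothetical status word: non-zero while at least one probe is registered. */
static unsigned int hcall_trace_refcount;
/* Counts how often the stand-in tracing hook actually ran. */
static unsigned long traced_calls;

static void hcall_trace_register(void)
{
    hcall_trace_refcount++;
}

static void hcall_trace_unregister(void)
{
    assert(hcall_trace_refcount > 0);   /* unbalanced unregister is a bug */
    hcall_trace_refcount--;
}

/* Stand-in for the hypervisor call wrapper; returns a dummy status of 0. */
static long hcall_sketch(unsigned long opcode)
{
    (void)opcode;
    if (hcall_trace_refcount)   /* disabled case: one load and one branch */
        traced_calls++;         /* stand-in for firing the entry tracepoint */
    return 0;
}
```

When no probe is registered the wrapper only tests the status word and falls through, so the tracing cost is paid only while tracing is switched on.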

[PATCH 6/6] powerpc: Export powerpc_debugfs_root

2009-10-26 Thread Anton Blanchard
Kernel modules should be able to place their debug output inside our powerpc debugfs directory. Signed-off-by: Anton Blanchard --- Index: linux.trees.git/arch/powerpc/kernel/setup-common.c === --- linux.trees.git.orig/arch/powerpc/

Re: [3/6] Allow more flexible layouts for hugepage pagetables

2009-10-26 Thread David Gibson
On Tue, Oct 27, 2009 at 02:10:59PM +1100, Benjamin Herrenschmidt wrote: > On Fri, 2009-10-16 at 16:22 +1100, David Gibson wrote: > > So far haven't seen anything blatantly wrong, in fact, this patch > results in some nice cleanups. > > One thing tho... > > > -#ifdef CONFIG_HUGETLB_PAGE > > -

Re: [2/6] Cleanup management of kmem_caches for pagetables

2009-10-26 Thread Benjamin Herrenschmidt
On Tue, 2009-10-27 at 14:46 +1100, David Gibson wrote: > > The trick is that allocating the PGD and PMD caches is supposed to > also create the PUD cache, because the PUD index size is always the > same as either the PGD or PMD cache. If that's not true, we've broken > the assumptions the code is

Re: [2/6] Cleanup management of kmem_caches for pagetables

2009-10-26 Thread David Gibson
On Tue, Oct 27, 2009 at 01:28:19PM +1100, Benjamin Herrenschmidt wrote: > On Fri, 2009-10-16 at 16:22 +1100, David Gibson wrote: > > Minor nits... if you can respin today I should push it out to -next > > > +void pgtable_cache_add(unsigned shift, void (*ctor)(void *)) > > +{ > > + char *name; >

Is there a patch for MPC8548 XOR?

2009-10-26 Thread hank peng
I want to use its XOR engine to compute raid5 parity, but I can't find this function in 2.6.30 downloaded from kernel.org. Does anyone know if there is a patch? -- The simplest is not all best but the best is surely the simplest! ___ Linuxppc-dev mailing l

Re: [PATCH v4 4/4] pseries: Serialize cpu hotplug operations during deactivate Vs deallocate

2009-10-26 Thread Benjamin Herrenschmidt
On Fri, 2009-10-09 at 14:01 +0530, Gautham R Shenoy wrote: > Currently the cpu-allocation/deallocation process comprises two steps: > - Set the indicators and update the device tree with DLPAR node > information. > > - Online/offline the allocated/deallocated CPU. > > This is achieved by

Re: [PATCH 10/16] percpu: make percpu symbols in powerpc unique

2009-10-26 Thread Benjamin Herrenschmidt
On Wed, 2009-10-14 at 15:01 +0900, Tejun Heo wrote: > This patch updates percpu related symbols in powerpc such that percpu > symbols are unique and don't clash with local symbols. This serves > two purposes of decreasing the possibility of global percpu symbol > collision and allowing dropping pe

Re: [3/6] Allow more flexible layouts for hugepage pagetables

2009-10-26 Thread Benjamin Herrenschmidt
On Fri, 2009-10-16 at 16:22 +1100, David Gibson wrote: So far haven't seen anything blatantly wrong, in fact, this patch results in some nice cleanups. One thing tho... > -#ifdef CONFIG_HUGETLB_PAGE > - /* Handle hugepage regions */ > - if (HPAGE_SHIFT && mmu_huge_psizes[psize]) { >

Re: [2/6] Cleanup management of kmem_caches for pagetables

2009-10-26 Thread Benjamin Herrenschmidt
On Fri, 2009-10-16 at 16:22 +1100, David Gibson wrote: Minor nits... if you can respin today I should push it out to -next > +void pgtable_cache_add(unsigned shift, void (*ctor)(void *)) > +{ > + char *name; > + unsigned long table_size = sizeof(void *) << shift; > + unsigned long ali
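The size calculation visible in the quoted hunk (`sizeof(void *) << shift`) can be tried in isolation. This is a stand-alone sketch of that one expression; `pgtable_size_bytes()` is a hypothetical name, and the rest of pgtable_cache_add(), which is cut off above, is not reproduced or guessed at:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: size in bytes of a table with (1 << shift)
 * pointer-sized entries, mirroring the expression in the quoted patch.
 */
static size_t pgtable_size_bytes(unsigned int shift)
{
    /* Keep the shifted result representable in size_t. */
    assert(shift < 8 * sizeof(size_t) - 4);
    return sizeof(void *) << shift;
}
```

For example, a shift of 9 describes a table of 512 pointers, which is 4096 bytes when pointers are 8 bytes wide.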

Re: [PATCH 0/8] Fix 8xx MMU/TLB

2009-10-26 Thread Benjamin Herrenschmidt
On Mon, 2009-10-26 at 16:26 -0700, Dan Malek wrote: > Just be careful the get_user() doesn't regenerate the same > translation error you are trying to fix by being here.. It shouldn't since it will always come up with a proper DAR but you may want to double check beforehand that your instruct

Re: [PATCH] [RFC] PowerPC64: Use preempt_schedule_irq instead of preempt_schedule when returning from exceptions

2009-10-26 Thread Benjamin Herrenschmidt
On Mon, 2009-10-19 at 22:28 +0400, Valentine Barshak wrote: > Use preempt_schedule_irq to prevent infinite irq-entry and > eventual stack overflow problems with fast-paced IRQ sources. > This kind of problem has been observed on the PASemi Electra IDE > controller. We have to make sure we are soft

Re: [PATCH 0/8] Fix 8xx MMU/TLB

2009-10-26 Thread Dan Malek
On Oct 26, 2009, at 3:47 PM, Benjamin Herrenschmidt wrote: This whole thing would be a -lot- easier to do from C code. Why ? Simply because you could just use get_user() to load the instruction rather than doing this page table walking in asm, Just be careful the get_user() doesn't regenera

Re: [PATCH 0/8] Fix 8xx MMU/TLB

2009-10-26 Thread Benjamin Herrenschmidt
> Probably better to walk the kernel page table too. Does this > make a difference(needs the tophys() patch I posted earlier): This whole thing would be a -lot- easier to do from C code. Why ? Simply because you could just use get_user() to load the instruction rather than doing this page table w

Re: Network Stack SKB Reallocation

2009-10-26 Thread Michael Buesch
On Monday 26 October 2009 19:43:00 Jonathan Haws wrote: > Quick question about the network stack in general: > > Does the stack itself release an SKB allocated by the device driver back to > the heap upstream, or does it require that the device driver handle that? There's the concept of passing

RE: Network Stack SKB Reallocation

2009-10-26 Thread Jonathan Haws
So, in my case, I allocate a bunch of skb's that I want to be able to reuse during network operation (256 in fact). When I pass it up the stack, the stack will free that skb back to the system making any further use of it invalid until I call alloc_skb() again? Thanks. > On Monday 26 October

Network Stack SKB Reallocation

2009-10-26 Thread Jonathan Haws
Quick question about the network stack in general: Does the stack itself release an SKB allocated by the device driver back to the heap upstream, or does it require that the device driver handle that? Thanks! Jonathan

INIT: PANIC: segmentation violation! sleeping for 30 seconds.

2009-10-26 Thread Breno Leitao
Hi, I just put an upstream kernel (rc5) on a specific machine I have (Power5), and I got the following error: INIT: PANIC: segmentation violation! sleeping for 30 seconds. init has generated signal 11 but has no handler for it init used greatest stack depth: 6240 bytes left Kernel panic - not sy

Jumbo Frame bug in ibm_newemac driver (was Jumbo Frames, sil24 SATA driver, and kswapd0 page allocation failures)

2009-10-26 Thread Jonathan Haws
Okay, I need to revisit this issue. I have had my time taken away for other things the past couple of months, but I am now back at this network issue. Here is what I have done: 1. I modified the ibm_newemac driver to follow scatter-gather chains on the RX path. The idea was to set up the drive

Re: [v9 PATCH 4/9]: x86: refactor x86 idle power management code and remove all instances of pm_idle.

2009-10-26 Thread Arun R Bharadwaj
* Pavel Machek [2009-10-26 08:58:31]: > > > > > +static int local_idle_loop(struct cpuidle_device *dev, struct > > > > cpuidle_state *st) > > > > +{ > > > > + ktime_t t1, t2; > > > > + s64 diff; > > > > + int ret; > > > > + > > > > + t1 = ktime_get(); > > > > + loc

Re: [v9 PATCH 4/9]: x86: refactor x86 idle power management code and remove all instances of pm_idle.

2009-10-26 Thread Pavel Machek
> > > +static int local_idle_loop(struct cpuidle_device *dev, struct > > > cpuidle_state *st) > > > +{ > > > + ktime_t t1, t2; > > > + s64 diff; > > > + int ret; > > > + > > > + t1 = ktime_get(); > > > + local_idle(); > > > + t2 = ktime_get(); > > > + > > > + diff = ktime_to_us(ktime_sub(t2, t1))

Re: [v9 PATCH 4/9]: x86: refactor x86 idle power management code and remove all instances of pm_idle.

2009-10-26 Thread Arun R Bharadwaj
* Pavel Machek [2009-10-23 18:07:11]: > On Fri 2009-10-16 15:13:08, Arun R Bharadwaj wrote: > > * Arun R Bharadwaj [2009-10-16 15:08:50]: > > > > This patch cleans up x86 of all instances of pm_idle. > > > > pm_idle which was earlier called from cpu_idle() idle loop > > is replaced by cpuidle_