Looking at arch/powerpc/include/asm/percpu.h, I see that the percpu offset comes from a field in local_paca, and local_paca lives in r13. That means every percpu operation first has to load the offset from memory before it can compute the address of the variable.
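To make the cost concrete, here is a small userspace model of the lookup as I understand it. This is only a sketch: struct paca_model, local_paca_model, model_this_cpu_ptr() and the sizes are made-up stand-ins for illustration, not the kernel's definitions. The point is the extra load of data_offset before the address can be formed.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4      /* hypothetical CPU count for the model */
#define MODEL_NR_INTS 16     /* hypothetical size of one CPU's percpu area, in ints */

/* Stand-in for the paca; only the field that matters here. */
struct paca_model {
    unsigned long data_offset;   /* byte offset of this CPU's percpu area */
};

static int percpu_area[MODEL_NR_CPUS][MODEL_NR_INTS];
static struct paca_model pacas[MODEL_NR_CPUS];

/* Stand-in for r13: points at the current CPU's paca. */
static struct paca_model *local_paca_model;

static void model_init(void)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        pacas[cpu].data_offset = (unsigned long)cpu * sizeof(percpu_area[0]);
}

static void model_switch_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    local_paca_model = &pacas[cpu];
}

/* var_offset is the byte offset of the variable within a percpu area. */
static int *model_this_cpu_ptr(size_t var_offset)
{
    /* Step 1: a memory access just to learn where our percpu area is. */
    unsigned long off = local_paca_model->data_offset;

    /* Step 2: only now can the variable's address be computed. */
    return (int *)((char *)percpu_area + off + var_offset);
}
```

Every access through model_this_cpu_ptr() pays for the load in step 1, which is what I would like to get rid of.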
Would it be possible to put the paca at the beginning of the percpu data area and then have r31 point to the percpu area? Power has these nice instructions that fetch from an offset relative to a base register, which could be used throughout the kernel for percpu operations (similar to the x86 segment registers).

With that we may also be able to use the atomic ops for fast percpu access, so that we can avoid the irq disable/enable sequence that is now required for percpu atomics. That would result in fast and reliable percpu counters for powerpc.

I.e. the powerpc atomic inc

    static __inline__ void atomic_inc(atomic_t *v)
    {
        int t;

        __asm__ __volatile__(
    "1: lwarx   %0,0,%2     # atomic_inc\n\
        addic   %0,%0,1\n"
        PPC405_ERR77(0,%2)
    "   stwcx.  %0,0,%2 \n\
        bne-    1b"
        : "=&r" (t), "+m" (v->counter)
        : "r" (&v->counter)
        : "cc", "xer");
    }

could be used as a template to get:

    /*
     * Untested sketch. Assumes r31 holds this CPU's percpu base and
     * v is the offset of the variable within the percpu area.
     */
    static __inline__ void raw_cpu_inc_4(__percpu void *v)
    {
        int t;

        __asm__ __volatile__(
    "1: lwarx   %0,r31,%2   # percpu_inc\n\
        addic   %0,%0,1\n"
        PPC405_ERR77(r31,%2)
    "   stwcx.  %0,r31,%2 \n\
        bne-    1b"
        : "=&r" (t)
        : "r" (v)   /* the offset itself; a void * has no ->counter */
        /* the word at r31 + v cannot be named as an "m" operand */
        : "cc", "xer", "memory");
    }
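The layout I have in mind can be modelled in userspace roughly like this. Again only a sketch under assumptions: struct percpu_block, percpu_base (standing in for r31), SLOT_OFFSET() and the field names are hypothetical and only illustrate the idea. With the paca at the start of the percpu area, one base pointer reaches both the paca fields and the percpu variables without an intermediate load.

```c
#include <assert.h>
#include <stddef.h>

#define LAYOUT_NR_CPUS  4    /* hypothetical CPU count for the model */
#define LAYOUT_NR_SLOTS 16   /* hypothetical number of int-sized percpu variables */

/* Hypothetical combined layout: paca first, percpu variables right behind it. */
struct percpu_block {
    struct {
        unsigned long hw_cpu_id;   /* stand-in for the real paca fields */
    } paca;
    int slots[LAYOUT_NR_SLOTS];    /* stand-in for percpu variables */
};

/* Byte offset of slot i from the start of a block. */
#define SLOT_OFFSET(i) (offsetof(struct percpu_block, slots) + (i) * sizeof(int))

static struct percpu_block blocks[LAYOUT_NR_CPUS];

/* Stand-in for the base register (r31 in the proposal). */
static char *percpu_base;

static void layout_init(void)
{
    for (int cpu = 0; cpu < LAYOUT_NR_CPUS; cpu++)
        blocks[cpu].paca.hw_cpu_id = (unsigned long)cpu;
}

static void layout_switch_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < LAYOUT_NR_CPUS);
    percpu_base = (char *)&blocks[cpu];
}

/* Percpu variable: base + offset, no load needed to find the area. */
static int *layout_this_cpu_ptr(size_t var_offset)
{
    return (int *)(percpu_base + var_offset);
}

/* Paca fields remain reachable through the same base pointer. */
static unsigned long layout_cpu_id(void)
{
    return ((struct percpu_block *)percpu_base)->paca.hw_cpu_id;
}
```

In the real thing the base + offset addition would be done by the indexed form of the load/store instruction, so the address computation would not cost a separate instruction.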