Re: [PATCH] macintosh: move mac_hid driver to input/mouse.
On Tue, 9 May 2017 17:43:27 -0700 Dmitry Torokhov wrote:
> Hi Michal,
>
> On Tue, May 09, 2017 at 09:14:18PM +0200, Michal Suchanek wrote:
> > There is nothing mac-specific about this driver. Non-mac hardware
> > with suboptimal built-in pointer devices exists.
> >
> > This makes it possible to use this emulation not only on x86 and ppc
> > notebooks but also on arm and mips.
>
> I'd rather we did not promote from drivers/macintosh to other
> platforms, but rather removed it. The same functionality can be done
> from userspace.

What is the status of this?

Do you reply to every patch to drivers/input that is not core
infrastructure to say that you would rather drop the driver because it
can be done in userspace?

It sure can be done. Remove everything but the bus drivers and uinput
from drivers/input, and the rest can be done in userspace.

The question is who does it?

Are you saying that you will implement the userspace equivalent?

If not, then please do your job as maintainer and accept trivial patches
for the perfectly working drivers we have now.

If you want to move drivers/input into userspace I am not against it,
but I am not willing to do that for you either.

> > What hardware do you believe would benefit from this and why?

Any touchpad hardware where you cannot press two buttons at once to
emulate the third button due to hardware design. And any touchpad
hardware on which some of the buttons are broken, when it comes to it.

It is built into a notebook and works fine for moving the cursor, but
due to the lack of usable buttons you still need a mouse to use the
notebook.

Thanks

Michal

> Thanks.
> >
> > Signed-off-by: Michal Suchanek
> > ---
> >  drivers/input/mouse/Kconfig                  | 20
> >  drivers/input/mouse/Makefile                 |  1 +
> >  drivers/{macintosh => input/mouse}/mac_hid.c |  0
> >  drivers/macintosh/Kconfig                    | 17 -
> >  drivers/macintosh/Makefile                   |  1 -
> >  5 files changed, 21 insertions(+), 18 deletions(-)
> >  rename drivers/{macintosh => input/mouse}/mac_hid.c (100%)
> >
> > diff --git a/drivers/input/mouse/Kconfig b/drivers/input/mouse/Kconfig
> > index 89ebb8f39fee..5533fd3a113f 100644
> > --- a/drivers/input/mouse/Kconfig
> > +++ b/drivers/input/mouse/Kconfig
> > @@ -12,6 +12,26 @@ menuconfig INPUT_MOUSE
> >
> >  if INPUT_MOUSE
> >
> > +config MAC_EMUMOUSEBTN
> > +	tristate "Support for mouse button 2+3 emulation"
> > +	depends on SYSCTL && INPUT
> > +	help
> > +	  This provides generic support for emulating the 2nd and 3rd mouse
> > +	  button with keypresses. If you say Y here, the emulation is still
> > +	  disabled by default. The emulation is controlled by these sysctl
> > +	  entries:
> > +	  /proc/sys/dev/mac_hid/mouse_button_emulation
> > +	  /proc/sys/dev/mac_hid/mouse_button2_keycode
> > +	  /proc/sys/dev/mac_hid/mouse_button3_keycode
> > +
> > +	  If you have an Apple machine with a 1-button mouse, say Y here.
> > +
> > +	  This emulation can be useful on notebooks with suboptimal touchpad
> > +	  hardware as well.
> > +
> > +	  To compile this driver as a module, choose M here: the
> > +	  module will be called mac_hid.
> > +
> >  config MOUSE_PS2
> >  	tristate "PS/2 mouse"
> >  	default y
> > diff --git a/drivers/input/mouse/Makefile b/drivers/input/mouse/Makefile
> > index 56bf0ad877c6..dfaad1dd8857 100644
> > --- a/drivers/input/mouse/Makefile
> > +++ b/drivers/input/mouse/Makefile
> > @@ -4,6 +4,7 @@
> >
> >  # Each configuration option enables a list of files.
> >
> > +obj-$(CONFIG_MAC_EMUMOUSEBTN)	+= mac_hid.o
> >  obj-$(CONFIG_MOUSE_AMIGA)	+= amimouse.o
> >  obj-$(CONFIG_MOUSE_APPLETOUCH)	+= appletouch.o
> >  obj-$(CONFIG_MOUSE_ATARI)	+= atarimouse.o
> > diff --git a/drivers/macintosh/mac_hid.c b/drivers/input/mouse/mac_hid.c
> > similarity index 100%
> > rename from drivers/macintosh/mac_hid.c
> > rename to drivers/input/mouse/mac_hid.c
> > diff --git a/drivers/macintosh/Kconfig b/drivers/macintosh/Kconfig
> > index 97a420c11eed..011df09c5167 100644
> > --- a/drivers/macintosh/Kconfig
> > +++ b/drivers/macintosh/Kconfig
> > @@ -159,23 +159,6 @@ config INPUT_ADBHID
> >
> >  	  If unsure, say Y.
> >
> > -config MAC_EMUMOUSEBTN
> > -	tristate "Support for mouse button 2+3 emulation"
> > -	depends on SYSCTL && INPUT
> > -	help
> > -	  This provides generic support for emulating the 2nd and 3rd mouse
> > -	  button with keypresses. If you say Y here, the emulation is still
> > -	  disabled by default. The emulation is controlled by these sysctl
> > -	  entries:
> > -	  /proc/sys/dev/mac_hid/mouse_button_emulation
> > -	  /proc/sys/dev/mac_hid/mouse_button2_keycode
> > -	  /proc/sys/dev/mac_hid/mouse_button3_keycode
> > -
> > -	  If you have an Apple machine with a 1-button mouse, say Y here.
> > -
> > -	  To compile this driver as a module, [...]
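[Editor's note: for context, the sysctl entries named in the Kconfig help above are the runtime interface to this emulation. A minimal sketch of how a user would switch it on follows; the keycode values 87 and 88 (KEY_F11/KEY_F12) are arbitrary examples chosen for illustration, not driver defaults.]

```shell
# Turn the emulation on (the driver compiles in with it disabled by default).
echo 1 > /proc/sys/dev/mac_hid/mouse_button_emulation

# Choose which keys stand in for mouse buttons 2 and 3. The values are
# Linux input keycodes; 87 and 88 (KEY_F11/KEY_F12) are example choices.
echo 87 > /proc/sys/dev/mac_hid/mouse_button2_keycode
echo 88 > /proc/sys/dev/mac_hid/mouse_button3_keycode
```

These writes require root and only take effect on a kernel with MAC_EMUMOUSEBTN built in or loaded.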
Re: [PATCH v1 1/8] powerpc/lib/code-patching: Enhance code patching
Le 25/05/2017 à 05:36, Balbir Singh a écrit :
> Today our patching happens via direct copy and
> patch_instruction. The patching code is well
> contained in the sense that copying bits are limited.
>
> While considering implementation of CONFIG_STRICT_RWX,
> the first requirement is to create another mapping
> that will allow for patching. We create the window using
> text_poke_area, allocated via get_vm_area(), which might
> be overkill. We can do per-cpu stuff as well. The
> downside of these patches is that patch_instruction is
> now synchronized using a lock. Other arches do similar
> things, but use fixmaps. The reason for not using
> fixmaps is to make use of any randomization in the
> future. The code also relies on set_pte_at and pte_clear
> to do the appropriate tlb flushing.
>
> Signed-off-by: Balbir Singh

[...]

> +static int kernel_map_addr(void *addr)
> +{
> +	unsigned long pfn;
> 	int err;
>
> -	__put_user_size(instr, addr, 4, err);
> +	if (is_vmalloc_addr(addr))
> +		pfn = vmalloc_to_pfn(addr);
> +	else
> +		pfn = __pa_symbol(addr) >> PAGE_SHIFT;
> +
> +	err = map_kernel_page((unsigned long)text_poke_area->addr,
> +			(pfn << PAGE_SHIFT), _PAGE_KERNEL_RW | _PAGE_PRESENT);

map_kernel_page() doesn't exist on powerpc32, so compilation fails.
However a similar function exists and is called map_page()

Maybe the below modification could help (not tested yet)

Christophe

---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 2 ++
 arch/powerpc/include/asm/nohash/32/pgtable.h | 2 ++
 arch/powerpc/mm/8xx_mmu.c                    | 2 +-
 arch/powerpc/mm/dma-noncoherent.c            | 2 +-
 arch/powerpc/mm/mem.c                        | 4 ++--
 arch/powerpc/mm/mmu_decl.h                   | 1 -
 arch/powerpc/mm/pgtable_32.c                 | 8
 7 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 26ed228..7fb7558 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -297,6 +297,8 @@ static inline void __ptep_set_access_flags(struct mm_struct *mm,
 extern int get_pteptr(struct mm_struct *mm, unsigned long addr, pte_t **ptep,
 		      pmd_t **pmdp);
 
+int map_kernel_page(unsigned long va, phys_addr_t pa, int flags);
+
 /* Generic accessors to PTE bits */
 static inline int pte_write(pte_t pte)	{ return !!(pte_val(pte) & _PAGE_RW);}
 static inline int pte_dirty(pte_t pte)	{ return !!(pte_val(pte) & _PAGE_DIRTY); }
diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 5134ade..9131426 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -340,6 +340,8 @@ static inline void __ptep_set_access_flags(struct mm_struct *mm,
 extern int get_pteptr(struct mm_struct *mm, unsigned long addr, pte_t **ptep,
 		      pmd_t **pmdp);
 
+int map_kernel_page(unsigned long va, phys_addr_t pa, int flags);
+
 #endif /* !__ASSEMBLY__ */
 
 #endif /* __ASM_POWERPC_NOHASH_32_PGTABLE_H */
diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c
index 6c5025e..f4c6472 100644
--- a/arch/powerpc/mm/8xx_mmu.c
+++ b/arch/powerpc/mm/8xx_mmu.c
@@ -88,7 +88,7 @@ static void mmu_mapin_immr(void)
 	int offset;
 
 	for (offset = 0; offset < IMMR_SIZE; offset += PAGE_SIZE)
-		map_page(v + offset, p + offset, f);
+		map_kernel_page(v + offset, p + offset, f);
 }
 
 /* Address of instructions to patch */
diff --git a/arch/powerpc/mm/dma-noncoherent.c b/arch/powerpc/mm/dma-noncoherent.c
index 2dc74e5..3825284 100644
--- a/arch/powerpc/mm/dma-noncoherent.c
+++ b/arch/powerpc/mm/dma-noncoherent.c
@@ -227,7 +227,7 @@ __dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *handle, gfp_t
 
 	do {
 		SetPageReserved(page);
-		map_page(vaddr, page_to_phys(page),
+		map_kernel_page(vaddr, page_to_phys(page),
 			 pgprot_val(pgprot_noncached(PAGE_KERNEL)));
 		page++;
 		vaddr += PAGE_SIZE;
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 9ee536e..04f4c98 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -313,11 +313,11 @@ void __init paging_init(void)
 	unsigned long end = __fix_to_virt(FIX_HOLE);
 
 	for (; v < end; v += PAGE_SIZE)
-		map_page(v, 0, 0); /* XXX gross */
+		map_kernel_page(v, 0, 0); /* XXX gross */
 #endif
 
 #ifdef CONFIG_HIGHMEM
-	map_page(PKMAP_BASE, 0, 0);	/* XXX gross */
+	map_kernel_page(PKMAP_BASE, 0, 0);	/* XXX gross */
 	pkmap_page_table = virt_to_kpte(PKMAP_BASE);
 
 	kmap_pte = virt_to_kpte(__fix_to_virt(FIX_KMAP_BEGIN));
diff --git a/arch/powerpc/mm/mmu_decl.h b/[...]
Re: [PATCH v1 1/8] powerpc/lib/code-patching: Enhance code patching
Le 25/05/2017 à 05:36, Balbir Singh a écrit :
> Today our patching happens via direct copy and
> patch_instruction. The patching code is well
> contained in the sense that copying bits are limited.
>
> While considering implementation of CONFIG_STRICT_RWX,
> the first requirement is to create another mapping
> that will allow for patching. We create the window using
> text_poke_area, allocated via get_vm_area(), which might
> be overkill. We can do per-cpu stuff as well. The
> downside of these patches is that patch_instruction is
> now synchronized using a lock. Other arches do similar
> things, but use fixmaps. The reason for not using
> fixmaps is to make use of any randomization in the
> future. The code also relies on set_pte_at and pte_clear
> to do the appropriate tlb flushing.

Isn't it overkill to remap the text in another area?

Among the 6 arches implementing CONFIG_STRICT_KERNEL_RWX (arm, arm64,
parisc, s390, x86/32, x86/64):
- arm, x86/32 and x86/64 set the text RW during the modification
- s390 seems to use a special instruction which bypasses write protection
- parisc doesn't seem to implement any function which modifies kernel text.

Therefore it seems only arm64 does it via another mapping.

Wouldn't it be lighter to just unprotect the memory during the
modification, as done on arm and x86?

Or another alternative could be to disable the DMMU and do the write at
the physical address?
Christophe

> Signed-off-by: Balbir Singh
> ---
>  arch/powerpc/lib/code-patching.c | 88 ++--
>  1 file changed, 84 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> index 500b0f6..0a16b2f 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -16,19 +16,98 @@
>  #include
>  #include
>  #include
> +#include
> +#include
>
> +struct vm_struct *text_poke_area;
> +static DEFINE_RAW_SPINLOCK(text_poke_lock);
>
> -int patch_instruction(unsigned int *addr, unsigned int instr)
> +/*
> + * This is an early_initcall and early_initcalls happen at the right time
> + * for us, after slab is enabled and before we mark ro pages R/O. In the
> + * future if get_vm_area is randomized, this will be more flexible than
> + * fixmap
> + */
> +static int __init setup_text_poke_area(void)
>  {
> +	text_poke_area = get_vm_area(PAGE_SIZE, VM_ALLOC);
> +	if (!text_poke_area) {
> +		WARN_ONCE(1, "could not create area for mapping kernel addrs"
> +				" which allow for patching kernel code\n");
> +		return 0;
> +	}
> +	pr_info("text_poke area ready...\n");
> +	raw_spin_lock_init(&text_poke_lock);
> +	return 0;
> +}
> +
> +/*
> + * This can be called for kernel text or a module.
> + */
> +static int kernel_map_addr(void *addr)
> +{
> +	unsigned long pfn;
> 	int err;
>
> -	__put_user_size(instr, addr, 4, err);
> +	if (is_vmalloc_addr(addr))
> +		pfn = vmalloc_to_pfn(addr);
> +	else
> +		pfn = __pa_symbol(addr) >> PAGE_SHIFT;
> +
> +	err = map_kernel_page((unsigned long)text_poke_area->addr,
> +			(pfn << PAGE_SHIFT), _PAGE_KERNEL_RW | _PAGE_PRESENT);
> +	pr_devel("Mapped addr %p with pfn %lx\n", text_poke_area->addr, pfn);
> 	if (err)
> -		return err;
> -	asm ("dcbst 0, %0; sync; icbi 0,%0; sync; isync" : : "r" (addr));
> +		return -1;
> 	return 0;
>  }
>
> +static inline void kernel_unmap_addr(void *addr)
> +{
> +	pte_t *pte;
> +	unsigned long kaddr = (unsigned long)addr;
> +
> +	pte = pte_offset_kernel(pmd_offset(pud_offset(pgd_offset_k(kaddr),
> +				kaddr), kaddr), kaddr);
> +	pr_devel("clearing mm %p, pte %p, kaddr %lx\n", &init_mm, pte, kaddr);
> +	pte_clear(&init_mm, kaddr, pte);
> +}
> +
> +int patch_instruction(unsigned int *addr, unsigned int instr)
> +{
> +	int err;
> +	unsigned int *dest = NULL;
> +	unsigned long flags;
> +	unsigned long kaddr = (unsigned long)addr;
> +
> +	/*
> +	 * During early early boot patch_instruction is called
> +	 * when text_poke_area is not ready, but we still need
> +	 * to allow patching. We just do the plain old patching
> +	 */
> +	if (!text_poke_area) {
> +		__put_user_size(instr, addr, 4, err);
> +		asm ("dcbst 0, %0; sync; icbi 0,%0; sync; isync" :: "r" (addr));
> +		return 0;
> +	}
> +
> +	raw_spin_lock_irqsave(&text_poke_lock, flags);
> +	if (kernel_map_addr(addr)) {
> +		err = -1;
> +		goto out;
> +	}
> +
> +	dest = (unsigned int *)(text_poke_area->addr) +
> +		((kaddr & ~PAGE_MASK) / sizeof(unsigned int));
> +	__put_user_size(instr, dest, 4, err);
> +	asm ("dcbst 0, %0; sync; icbi 0,%0; sync; isync" :: "r" (dest));
> +	kernel_unmap_addr(text_poke_area->addr);
> +out:
> +	raw_spin_unlock_irqrestore(&text_poke_lock, flags);
> +	return err;
> +}
> +NOKPROBE_SYMBOL(patch_instruction);
> +
>  int patch_branch(unsigned [...]
Re: [PATCH] macintosh: move mac_hid driver to input/mouse.
On Sun, May 28, 2017 at 11:47:58AM +0200, Michal Suchanek wrote:
> On Tue, 9 May 2017 17:43:27 -0700
> Dmitry Torokhov wrote:
>
> > Hi Michal,
> >
> > On Tue, May 09, 2017 at 09:14:18PM +0200, Michal Suchanek wrote:
> > > There is nothing mac-specific about this driver. Non-mac hardware
> > > with suboptimal built-in pointer devices exists.
> > >
> > > This makes it possible to use this emulation not only on x86 and ppc
> > > notebooks but also on arm and mips.
> >
> > I'd rather we did not promote from drivers/macintosh to other
> > platforms, but rather removed it. The same functionality can be done
> > from userspace.
>
> What is the status of this?

The same as in the above paragraph.

> Do you reply to every patch to drivers/input that is not core
> infrastructure to say that you would rather drop the driver because it
> can be done in userspace?
>
> It sure can be done. Remove everything but the bus drivers and uinput
> from drivers/input, and the rest can be done in userspace.
>
> The question is who does it?
>
> Are you saying that you will implement the userspace equivalent?

No, I spend my time mostly with the kernel.

> If not, then please do your job as maintainer and accept trivial patches
> for the perfectly working drivers we have now.

I am doing my job as a maintainer right now. The driver might have been
beneficial 15 years ago, when we did not have better options, but I
would rather not continue expanding its use. The main problem with the
driver is that its functionality is not easily discoverable by end
users. And once you plumb it through userspace to present users with
options, you might as well handle it all in userspace.

> If you want to move drivers/input into userspace I am not against it,
> but I am not willing to do that for you either.

Then we are at an impasse.

> > What hardware do you believe would benefit from this and why?
>
> Any touchpad hardware where you cannot press two buttons at once to
> emulate the third button due to hardware design. And any touchpad
> hardware on which some of the buttons are broken, when it comes to it.
>
> It is built into a notebook and works fine for moving the cursor, but
> due to the lack of usable buttons you still need a mouse to use the
> notebook.

Have you tried simply redefining the keymap of your keyboard to emit
BTN_RIGHT/BTN_MIDDLE? Both atkbd and HID keyboards support keymap
updates from userspace/udev/hwdb, and if there is a driver that does
not support it I will take patches fixing that.

Thanks.

--
Dmitry
Re: [PATCH v1 1/8] powerpc/lib/code-patching: Enhance code patching
Le 25/05/2017 à 05:36, Balbir Singh a écrit :
> Today our patching happens via direct copy and
> patch_instruction. The patching code is well
> contained in the sense that copying bits are limited.
>
> While considering implementation of CONFIG_STRICT_RWX,
> the first requirement is to create another mapping
> that will allow for patching. We create the window using
> text_poke_area, allocated via get_vm_area(), which might
> be overkill. We can do per-cpu stuff as well. The
> downside of these patches is that patch_instruction is
> now synchronized using a lock. Other arches do similar
> things, but use fixmaps. The reason for not using
> fixmaps is to make use of any randomization in the
> future. The code also relies on set_pte_at and pte_clear
> to do the appropriate tlb flushing.
>
> Signed-off-by: Balbir Singh
> ---
>  arch/powerpc/lib/code-patching.c | 88 ++--
>  1 file changed, 84 insertions(+), 4 deletions(-)

[...]

> +static int kernel_map_addr(void *addr)
> +{
> +	unsigned long pfn;
> 	int err;
>
> -	__put_user_size(instr, addr, 4, err);
> +	if (is_vmalloc_addr(addr))
> +		pfn = vmalloc_to_pfn(addr);
> +	else
> +		pfn = __pa_symbol(addr) >> PAGE_SHIFT;
> +
> +	err = map_kernel_page((unsigned long)text_poke_area->addr,
> +			(pfn << PAGE_SHIFT), _PAGE_KERNEL_RW | _PAGE_PRESENT);

Why not use PAGE_KERNEL instead of _PAGE_KERNEL_RW | _PAGE_PRESENT?

From asm/pte-common.h:

#define PAGE_KERNEL	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)
#define _PAGE_BASE	(_PAGE_BASE_NC)
#define _PAGE_BASE_NC	(_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_PSIZE)

Also, in pte-common.h, maybe the following defines could/should be
reworked once your series is applied, shouldn't they?

/* Protection used for kernel text. We want the debuggers to be able to
 * set breakpoints anywhere, so don't write protect the kernel text
 * on platforms where such control is possible.
 */
#if defined(CONFIG_KGDB) || defined(CONFIG_XMON) || defined(CONFIG_BDI_SWITCH) ||\
	defined(CONFIG_KPROBES) || defined(CONFIG_DYNAMIC_FTRACE)
#define PAGE_KERNEL_TEXT	PAGE_KERNEL_X
#else
#define PAGE_KERNEL_TEXT	PAGE_KERNEL_ROX
#endif

Christophe
Re: [PATCH] macintosh: move mac_hid driver to input/mouse.
On Sun, 2017-05-28 at 11:47 +0200, Michal Suchanek wrote:
> On Tue, 9 May 2017 17:43:27 -0700
> Dmitry Torokhov wrote:
>
> > Hi Michal,
> >
> > On Tue, May 09, 2017 at 09:14:18PM +0200, Michal Suchanek wrote:
> > > There is nothing mac-specific about this driver. Non-mac hardware
> > > with suboptimal built-in pointer devices exists.
> > >
> > > This makes it possible to use this emulation not only on x86 and
> > > ppc notebooks but also on arm and mips.
> >
> > I'd rather we did not promote from drivers/macintosh to other
> > platforms, but rather removed it. The same functionality can be
> > done from userspace.
>
> What is the status of this?
>
> Do you reply to every patch to drivers/input that is not core
> infrastructure to say that you would rather drop the driver because
> it can be done in userspace?
>
> It sure can be done. Remove everything but the bus drivers and uinput
> from drivers/input, and the rest can be done in userspace.
>
> The question is who does it?
>
> Are you saying that you will implement the userspace equivalent?
>
> If not, then please do your job as maintainer and accept trivial
> patches for the perfectly working drivers we have now.
>
> If you want to move drivers/input into userspace I am not against it,
> but I am not willing to do that for you either.

I'd advise you to take it down a notch. We don't go yelling at each
other on this mailing list.
Re: [PATCH v1 1/8] powerpc/lib/code-patching: Enhance code patching
On Sun, 2017-05-28 at 20:00 +0200, christophe leroy wrote:
> Le 25/05/2017 à 05:36, Balbir Singh a écrit :
> > Today our patching happens via direct copy and
> > patch_instruction. The patching code is well
> > contained in the sense that copying bits are limited.
> >
> > While considering implementation of CONFIG_STRICT_RWX,
> > the first requirement is to create another mapping
> > that will allow for patching. We create the window using
> > text_poke_area, allocated via get_vm_area(), which might
> > be overkill. We can do per-cpu stuff as well. The
> > downside of these patches is that patch_instruction is
> > now synchronized using a lock. Other arches do similar
> > things, but use fixmaps. The reason for not using
> > fixmaps is to make use of any randomization in the
> > future. The code also relies on set_pte_at and pte_clear
> > to do the appropriate tlb flushing.
> >
> > Signed-off-by: Balbir Singh
> > ---
> >  arch/powerpc/lib/code-patching.c | 88 ++--
> >  1 file changed, 84 insertions(+), 4 deletions(-)
>
> [...]
>
> > +static int kernel_map_addr(void *addr)
> > +{
> > +	unsigned long pfn;
> > 	int err;
> >
> > -	__put_user_size(instr, addr, 4, err);
> > +	if (is_vmalloc_addr(addr))
> > +		pfn = vmalloc_to_pfn(addr);
> > +	else
> > +		pfn = __pa_symbol(addr) >> PAGE_SHIFT;
> > +
> > +	err = map_kernel_page((unsigned long)text_poke_area->addr,
> > +		(pfn << PAGE_SHIFT), _PAGE_KERNEL_RW | _PAGE_PRESENT);
>
> Why not use PAGE_KERNEL instead of _PAGE_KERNEL_RW | _PAGE_PRESENT?

Will do.

> From asm/pte-common.h:
>
> #define PAGE_KERNEL	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)
> #define _PAGE_BASE	(_PAGE_BASE_NC)
> #define _PAGE_BASE_NC	(_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_PSIZE)
>
> Also, in pte-common.h, maybe the following defines could/should be
> reworked once your series is applied, shouldn't they?
>
> /* Protection used for kernel text. We want the debuggers to be able to
>  * set breakpoints anywhere, so don't write protect the kernel text
>  * on platforms where such control is possible.
>  */
> #if defined(CONFIG_KGDB) || defined(CONFIG_XMON) || defined(CONFIG_BDI_SWITCH) ||\
> 	defined(CONFIG_KPROBES) || defined(CONFIG_DYNAMIC_FTRACE)
> #define PAGE_KERNEL_TEXT	PAGE_KERNEL_X
> #else
> #define PAGE_KERNEL_TEXT	PAGE_KERNEL_ROX
> #endif

Yes, I did see them and I want to rework them.

Thanks,
Balbir Singh.
Re: [PATCH v1 1/8] powerpc/lib/code-patching: Enhance code patching
On Sun, 2017-05-28 at 17:59 +0200, christophe leroy wrote:
> Le 25/05/2017 à 05:36, Balbir Singh a écrit :
> > Today our patching happens via direct copy and
> > patch_instruction. The patching code is well
> > contained in the sense that copying bits are limited.
> >
> > While considering implementation of CONFIG_STRICT_RWX,
> > the first requirement is to create another mapping
> > that will allow for patching. We create the window using
> > text_poke_area, allocated via get_vm_area(), which might
> > be overkill. We can do per-cpu stuff as well. The
> > downside of these patches is that patch_instruction is
> > now synchronized using a lock. Other arches do similar
> > things, but use fixmaps. The reason for not using
> > fixmaps is to make use of any randomization in the
> > future. The code also relies on set_pte_at and pte_clear
> > to do the appropriate tlb flushing.
>
> Isn't it overkill to remap the text in another area?
>
> Among the 6 arches implementing CONFIG_STRICT_KERNEL_RWX (arm, arm64,
> parisc, s390, x86/32, x86/64):
> - arm, x86/32 and x86/64 set the text RW during the modification

x86 uses set_fixmap() in text_poke(), am I missing something?

> - s390 seems to use a special instruction which bypasses write protection
> - parisc doesn't seem to implement any function which modifies kernel text.
>
> Therefore it seems only arm64 does it via another mapping.
>
> Wouldn't it be lighter to just unprotect the memory during the
> modification, as done on arm and x86?

I am not sure the trade-off is quite that simple. For security, I thought:

1. It would be better to randomize text_poke_area(), which is why I
   dynamically allocated it. If we start randomizing get_vm_area(), we
   get the benefit.
2. text_poke_area() is RW and the normal text is RX; for any attack to
   succeed, it would need to find text_poke_area() at the time of
   patching, patch the kernel in that small window, and use the normal
   mapping for execution.

Generally patch_instruction() is not a fast path, except for ftrace and
tracing. In my tests I did not find the slowdown noticeable.

> Or another alternative could be to disable the DMMU and do the write at
> the physical address?

That would be worse off, I think, but we were discussing doing something
like that for xmon. For other cases, I think it opens up a bigger
window.

> Christophe

Balbir Singh
Re: [PATCH v1 1/8] powerpc/lib/code-patching: Enhance code patching
On Sun, 2017-05-28 at 16:29 +0200, christophe leroy wrote:
> Le 25/05/2017 à 05:36, Balbir Singh a écrit :
> > Today our patching happens via direct copy and
> > patch_instruction. The patching code is well
> > contained in the sense that copying bits are limited.
> >
> > While considering implementation of CONFIG_STRICT_RWX,
> > the first requirement is to create another mapping
> > that will allow for patching. We create the window using
> > text_poke_area, allocated via get_vm_area(), which might
> > be overkill. We can do per-cpu stuff as well. The
> > downside of these patches is that patch_instruction is
> > now synchronized using a lock. Other arches do similar
> > things, but use fixmaps. The reason for not using
> > fixmaps is to make use of any randomization in the
> > future. The code also relies on set_pte_at and pte_clear
> > to do the appropriate tlb flushing.
> >
> > Signed-off-by: Balbir Singh
>
> [...]
>
> > +static int kernel_map_addr(void *addr)
> > +{
> > +	unsigned long pfn;
> > 	int err;
> >
> > -	__put_user_size(instr, addr, 4, err);
> > +	if (is_vmalloc_addr(addr))
> > +		pfn = vmalloc_to_pfn(addr);
> > +	else
> > +		pfn = __pa_symbol(addr) >> PAGE_SHIFT;
> > +
> > +	err = map_kernel_page((unsigned long)text_poke_area->addr,
> > +		(pfn << PAGE_SHIFT), _PAGE_KERNEL_RW | _PAGE_PRESENT);
>
> map_kernel_page() doesn't exist on powerpc32, so compilation fails.
>
> However a similar function exists and is called map_page()
>
> Maybe the below modification could help (not tested yet)
>
> Christophe

Thanks, I'll try and get a compile. As an alternative, how about:

#ifdef CONFIG_PPC32
#define map_kernel_page map_page
#endif

Balbir Singh.
[PATCH V2 2/2] KVM: PPC: Book3S HV: Enable guests to use large decrementer mode on POWER9
This allows userspace (e.g. QEMU) to enable large decrementer mode for
the guest, by setting the LPCR_LD bit in the guest LPCR value. With
this, the guest exit code saves 64 bits of the guest DEC value on exit.
Other places that use the guest DEC value check the LPCR_LD bit in the
guest LPCR value, and if it is set, omit the 32-bit sign extension that
would otherwise be done.

This doesn't change the DEC emulation used by PR KVM, because PR KVM is
not supported on POWER9 yet.

This is partly based on an earlier patch by Oliver O'Halloran.

Signed-off-by: Paul Mackerras
---
 arch/powerpc/include/asm/kvm_host.h     |  2 +-
 arch/powerpc/kvm/book3s_hv.c            |  2 ++
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 28
 arch/powerpc/kvm/emulate.c              |  4 ++--
 4 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 9c51ac4..3f879c8 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -579,7 +579,7 @@ struct kvm_vcpu_arch {
 	ulong mcsrr0;
 	ulong mcsrr1;
 	ulong mcsr;
-	u32 dec;
+	ulong dec;
 #ifdef CONFIG_BOOKE
 	u32 decar;
 #endif
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 42b7a4f..1f9c0ee 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1143,6 +1143,8 @@ static void kvmppc_set_lpcr(struct kvm_vcpu *vcpu, u64 new_lpcr,
 	mask = LPCR_DPFD | LPCR_ILE | LPCR_TC;
 	if (cpu_has_feature(CPU_FTR_ARCH_207S))
 		mask |= LPCR_AIL;
+	if (cpu_has_feature(CPU_FTR_LARGE_DEC))
+		mask |= LPCR_LD;
 
 	/* Broken 32-bit version of LPCR must not clear top bits */
 	if (preserve_top32)
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index bcb5401..e7a2c89 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -916,7 +916,7 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_300)
 	mftb	r7
 	subf	r3,r7,r8
 	mtspr	SPRN_DEC,r3
-	stw	r3,VCPU_DEC(r4)
+	std	r3,VCPU_DEC(r4)
 
 	ld	r5, VCPU_SPRG0(r4)
 	ld	r6, VCPU_SPRG1(r4)
@@ -1030,7 +1030,13 @@ kvmppc_cede_reentry:		/* r4 = vcpu, r13 = paca */
 	li	r0, BOOK3S_INTERRUPT_EXTERNAL
 	bne	cr1, 12f
 	mfspr	r0, SPRN_DEC
-	cmpwi	r0, 0
+BEGIN_FTR_SECTION
+	/* On POWER9 check whether the guest has large decrementer enabled */
+	andis.	r8, r8, LPCR_LD@h
+	bne	15f
+END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
+	extsw	r0, r0
+15:	cmpdi	r0, 0
 	li	r0, BOOK3S_INTERRUPT_DECREMENTER
 	bge	5f
 
@@ -1457,12 +1463,18 @@ mc_cont:
 	mtspr	SPRN_SPURR,r4
 
 	/* Save DEC */
+	ld	r3, HSTATE_KVM_VCORE(r13)
 	mfspr	r5,SPRN_DEC
 	mftb	r6
+	/* On P9, if the guest has large decr enabled, don't sign extend */
+BEGIN_FTR_SECTION
+	ld	r4, VCORE_LPCR(r3)
+	andis.	r4, r4, LPCR_LD@h
+	bne	16f
+END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
 	extsw	r5,r5
-	add	r5,r5,r6
+16:	add	r5,r5,r6
 	/* r5 is a guest timebase value here, convert to host TB */
-	ld	r3,HSTATE_KVM_VCORE(r13)
 	ld	r4,VCORE_TB_OFFSET(r3)
 	subf	r5,r4,r5
 	std	r5,VCPU_DEC_EXPIRES(r9)
@@ -2374,7 +2386,15 @@ END_FTR_SECTION_IFSET(CPU_FTR_TM)
 	mfspr	r3, SPRN_DEC
 	mfspr	r4, SPRN_HDEC
 	mftb	r5
+BEGIN_FTR_SECTION
+	/* On P9 check whether the guest has large decrementer mode enabled */
+	ld	r6, HSTATE_KVM_VCORE(r13)
+	ld	r6, VCORE_LPCR(r6)
+	andis.	r6, r6, LPCR_LD@h
+	bne	68f
+END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
 	extsw	r3, r3
+68:
 BEGIN_FTR_SECTION
 	extsw	r4, r4
 END_FTR_SECTION_IFSET(CPU_FTR_LARGE_DEC)
diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index c873ffe..4d8b4d6 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -39,7 +39,7 @@ void kvmppc_emulate_dec(struct kvm_vcpu *vcpu)
 	unsigned long dec_nsec;
 	unsigned long long dec_time;
 
-	pr_debug("mtDEC: %x\n", vcpu->arch.dec);
+	pr_debug("mtDEC: %lx\n", vcpu->arch.dec);
 	hrtimer_try_to_cancel(&vcpu->arch.dec_timer);
 
 #ifdef CONFIG_PPC_BOOK3S
@@ -109,7 +109,7 @@ static int kvmppc_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, int rs)
 	case SPRN_TBWU: break;
 
 	case SPRN_DEC:
-		vcpu->arch.dec = spr_val;
+		vcpu->arch.dec = (u32) spr_val;
 		kvmppc_emulate_dec(vcpu);
 		break;
-- 
2.7.4
[PATCH V2 0/2] KVM: PPC: Book3S HV: Support POWER9's large decrementer mode
One of the new features of POWER9 is that the decrementer (the facility
that provides an interrupt after a programmable length of time) has been
increased in size from 32 bits to 56 bits, allowing time intervals of up
to about 814 days, compared to about 4 seconds previously.

This patch series adds support for the large decrementer mode to HV
KVM. There is already code in the host kernel to enable large
decrementer mode for the host, which means that some of the KVM
entry/exit code is currently incorrect; the first patch fixes that. The
second patch allows userspace to enable large decrementer mode for the
guest, by setting the appropriate bit in the guest LPCR value.

Changes in v2: use the presence of the ibm,dec-bits property to set the
CPU_FTR_LARGE_DEC bit rather than the ibm,pa-features property, because
QEMU already sets the large decrementer bit in the ibm,pa-features
property (correctly, since ibm,pa-features describes the capabilities of
the CPU hardware, not the settings established by the host) but does not
currently enable large decrementer mode for the guest.

Paul.

---
 arch/powerpc/include/asm/cputable.h     |  4 ++-
 arch/powerpc/include/asm/kvm_host.h     |  2 +-
 arch/powerpc/kernel/prom.c              |  1 +
 arch/powerpc/kernel/time.c              |  7 ++---
 arch/powerpc/kvm/book3s_hv.c            |  2 ++
 arch/powerpc/kvm/book3s_hv_interrupts.S |  2 ++
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 51 ++---
 arch/powerpc/kvm/emulate.c              |  4 +--
 8 files changed, 54 insertions(+), 19 deletions(-)
[PATCH V2 1/2] KVM: PPC: Book3S HV: Cope with host using large decrementer mode
POWER9 introduces a new mode for the decrementer register, called large decrementer mode, in which the decrementer counter is 56 bits wide rather than 32, and reads are sign-extended rather than zero-extended. Since KVM code reads and writes the host decrementer value in a few places, it needs to be aware of the need to treat the decrementer value as a 64-bit quantity, and only do a 32-bit sign extension when large decrementer mode is not in effect. To enable the sign extension to be removed in large decrementer mode, we use a CPU feature bit to indicate that large decrementer mode is in effect. This CPU feature bit is derived from the presence of the ibm,dec-bits property in the cpu nodes of the firmware device tree. This property is already set by firmware in the device tree that the kernel uses when running as a host. We change the kernel timer code to use this bit and enable large decrementer mode whenever it is set (even if firmware tells us that the large decrementer mode only gives us 32 bits) so that we get the sign extension in hardware. This is partly based on an earlier patch by Oliver O'Halloran. 
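The fixup being described can be sketched like this (a hypothetical helper, not the kernel's actual code): in large decrementer mode the hardware already sign-extends the value on a read, so the explicit 32-bit sign extension (the `extsw`) must only be applied in legacy mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Convert a raw decrementer read to a signed 64-bit value. */
static int64_t dec_to_s64(uint64_t raw, bool large_dec_mode)
{
	if (large_dec_mode)
		return (int64_t)raw;	/* hardware already sign-extended */

	/*
	 * Legacy mode: only the low 32 bits are valid; sign-extend
	 * them, which is what the extsw instruction does.
	 */
	return (int64_t)(int32_t)(uint32_t)raw;
}
```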
Cc: sta...@vger.kernel.org # v4.10+ Signed-off-by: Paul Mackerras --- arch/powerpc/include/asm/cputable.h | 4 +++- arch/powerpc/kernel/prom.c | 1 + arch/powerpc/kernel/time.c | 7 ++- arch/powerpc/kvm/book3s_hv_interrupts.S | 2 ++ arch/powerpc/kvm/book3s_hv_rmhandlers.S | 23 +-- 5 files changed, 25 insertions(+), 12 deletions(-) diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h index c2d5095..99c3c56 100644 --- a/arch/powerpc/include/asm/cputable.h +++ b/arch/powerpc/include/asm/cputable.h @@ -216,6 +216,7 @@ enum { #define CPU_FTR_PMAO_BUG LONG_ASM_CONST(0x1000) #define CPU_FTR_SUBCORE LONG_ASM_CONST(0x2000) #define CPU_FTR_POWER9_DD1 LONG_ASM_CONST(0x4000) +#define CPU_FTR_LARGE_DEC LONG_ASM_CONST(0x8000) #ifndef __ASSEMBLY__ @@ -496,7 +497,8 @@ enum { (CPU_FTRS_POWER4 | CPU_FTRS_PPC970 | CPU_FTRS_POWER5 | \ CPU_FTRS_POWER6 | CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | \ CPU_FTRS_POWER8 | CPU_FTRS_POWER8_DD1 | CPU_FTRS_CELL | \ -CPU_FTRS_PA6T | CPU_FTR_VSX | CPU_FTRS_POWER9 | CPU_FTRS_POWER9_DD1) +CPU_FTRS_PA6T | CPU_FTR_VSX | CPU_FTRS_POWER9 | \ +CPU_FTRS_POWER9_DD1 | CPU_FTR_LARGE_DEC) #endif #else enum { diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c index 40c4887..987fcc5 100644 --- a/arch/powerpc/kernel/prom.c +++ b/arch/powerpc/kernel/prom.c @@ -259,6 +259,7 @@ static struct feature_property { {"ibm,dfp", 1, 0, PPC_FEATURE_HAS_DFP}, {"ibm,purr", 1, CPU_FTR_PURR, 0}, {"ibm,spurr", 1, CPU_FTR_SPURR, 0}, + {"ibm,dec-bits", 32, CPU_FTR_LARGE_DEC, 0}, #endif /* CONFIG_PPC64 */ }; diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c index 2b33cfa..5d13f06 100644 --- a/arch/powerpc/kernel/time.c +++ b/arch/powerpc/kernel/time.c @@ -946,10 +946,7 @@ static void register_decrementer_clockevent(int cpu) static void enable_large_decrementer(void) { - if (!cpu_has_feature(CPU_FTR_ARCH_300)) - return; - - if (decrementer_max <= DECREMENTER_DEFAULT_MAX) + if (!cpu_has_feature(CPU_FTR_LARGE_DEC)) return; 
/* @@ -966,7 +963,7 @@ static void __init set_decrementer_max(void) u32 bits = 32; /* Prior to ISAv3 the decrementer is always 32 bit */ - if (!cpu_has_feature(CPU_FTR_ARCH_300)) + if (!cpu_has_feature(CPU_FTR_LARGE_DEC)) return; cpu = of_find_node_by_type(NULL, "cpu"); diff --git a/arch/powerpc/kvm/book3s_hv_interrupts.S b/arch/powerpc/kvm/book3s_hv_interrupts.S index 0fdc4a2..6e1d75f 100644 --- a/arch/powerpc/kvm/book3s_hv_interrupts.S +++ b/arch/powerpc/kvm/book3s_hv_interrupts.S @@ -124,7 +124,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) mfspr r8,SPRN_DEC mftbr7 mtspr SPRN_HDEC,r8 +BEGIN_FTR_SECTION extsw r8,r8 +END_FTR_SECTION_IFCLR(CPU_FTR_LARGE_DEC) add r8,r8,r7 std r8,HSTATE_DECEXP(r13) diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S index bdb3f76..bcb5401 100644 --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S @@ -214,6 +214,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) kvmppc_primary_no_guest: /* We handle this much like a ceded vcpu */ /* put the HDEC into the DEC, since HDEC interrupts don't wake us */ + /* HDEC may be larger than DEC for arch >= v3.00, but since the */ + /* HDEC value came from DEC in the first place, it will fit */ mfspr r3, SPRN_HDEC mtspr SPRN_DEC, r3 /* @@ -295,8 +297,11 @@ kvm_nov
[PATCH] powerpc/64: Reclaim CPU_FTR_SUBCORE
We are running low on CPU feature bits, so we only want to use them when it's really necessary. CPU_FTR_SUBCORE is only used in one place, and only in C, so we don't need it in order to make asm patching work. It can only be set on "Power8" CPUs, which in practice means POWER8, POWER8E and POWER8NVL. There are no plans to implement it on future CPUs, but if there ever were we could retrofit it then. Although KVM uses subcores, it never looks at the CPU feature, it either looks at the ISA level or the threads_per_subcore value. So drop the CPU feature and do a PVR check instead. Drop the device tree "subcore" feature as we no longer support doing anything with it, and we will drop it from skiboot too. Signed-off-by: Michael Ellerman --- arch/powerpc/include/asm/cputable.h | 3 +-- arch/powerpc/kernel/dt_cpu_ftrs.c| 1 - arch/powerpc/platforms/powernv/subcore.c | 8 +++- 3 files changed, 8 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h index c2d509584a98..d02ad93bf708 100644 --- a/arch/powerpc/include/asm/cputable.h +++ b/arch/powerpc/include/asm/cputable.h @@ -214,7 +214,6 @@ enum { #define CPU_FTR_DAWR LONG_ASM_CONST(0x0400) #define CPU_FTR_DABRX LONG_ASM_CONST(0x0800) #define CPU_FTR_PMAO_BUG LONG_ASM_CONST(0x1000) -#define CPU_FTR_SUBCORE LONG_ASM_CONST(0x2000) #define CPU_FTR_POWER9_DD1 LONG_ASM_CONST(0x4000) #ifndef __ASSEMBLY__ @@ -463,7 +462,7 @@ enum { CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \ CPU_FTR_ICSWX | CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \ CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_DAWR | \ - CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_SUBCORE) + CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP) #define CPU_FTRS_POWER8E (CPU_FTRS_POWER8 | CPU_FTR_PMAO_BUG) #define CPU_FTRS_POWER8_DD1 (CPU_FTRS_POWER8 & ~CPU_FTR_DBELL) #define CPU_FTRS_POWER9 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \ diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c 
b/arch/powerpc/kernel/dt_cpu_ftrs.c index 050925b5b451..d6f05e4dc328 100644 --- a/arch/powerpc/kernel/dt_cpu_ftrs.c +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c @@ -642,7 +642,6 @@ static struct dt_cpu_feature_match __initdata {"processor-control-facility", feat_enable_dbell, CPU_FTR_DBELL}, {"processor-control-facility-v3", feat_enable_dbell, CPU_FTR_DBELL}, {"processor-utilization-of-resources-register", feat_enable_purr, 0}, - {"subcore", feat_enable, CPU_FTR_SUBCORE}, {"no-execute", feat_enable, 0}, {"strong-access-ordering", feat_enable, CPU_FTR_SAO}, {"cache-inhibited-large-page", feat_enable_large_ci, 0}, diff --git a/arch/powerpc/platforms/powernv/subcore.c b/arch/powerpc/platforms/powernv/subcore.c index 0babef11136f..8c6119280c13 100644 --- a/arch/powerpc/platforms/powernv/subcore.c +++ b/arch/powerpc/platforms/powernv/subcore.c @@ -407,7 +407,13 @@ static DEVICE_ATTR(subcores_per_core, 0644, static int subcore_init(void) { - if (!cpu_has_feature(CPU_FTR_SUBCORE)) + unsigned pvr_ver; + + pvr_ver = PVR_VER(mfspr(SPRN_PVR)); + + if (pvr_ver != PVR_POWER8 && + pvr_ver != PVR_POWER8E && + pvr_ver != PVR_POWER8NVL) return 0; /* -- 2.7.4
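The shape of the replacement check is simple. The PVR version values below mirror the kernel's definitions (0x004B for POWER8E, 0x004C for POWER8NVL, 0x004D for POWER8), redefined here only so the sketch is self-contained:

```c
#include <assert.h>

#define PVR_POWER8E	0x004B
#define PVR_POWER8NVL	0x004C
#define PVR_POWER8	0x004D

/*
 * Equivalent of the subcore_init() test after the patch: subcores
 * exist only on the POWER8 family, identified by the PVR version field.
 */
static int has_subcores(unsigned int pvr_ver)
{
	return pvr_ver == PVR_POWER8 ||
	       pvr_ver == PVR_POWER8E ||
	       pvr_ver == PVR_POWER8NVL;
}
```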
[PATCH v2] spin loop primitives for busy waiting
Current busy-wait loops are implemented by repeatedly calling cpu_relax(), which gives the arch a low-latency way to improve power consumption and/or SMT resource contention. This poses some difficulties for powerpc, which has SMT priority setting instructions (priorities determine how ifetch cycles are apportioned). powerpc's cpu_relax() is implemented by setting a low priority then setting normal priority. This has several problems:
- Changing thread priority can have some execution cost and a potential impact on other threads in the core. It's inefficient to execute these instructions every time around a busy-wait loop.
- Depending on implementation details, a `low ; medium` sequence may not have much, if any, effect. Some software with a similar pattern actually inserts a lot of nops in between, in order to cause a few fetch cycles at the low priority.
- The busy-wait loop itself runs at regular priority. This might only be a few fetch cycles, but if there are several threads running such loops, they could cause a noticeable impact on a non-idle thread.
Implement spin_begin and spin_end primitives that can be used around busy-wait loops; they default to no-ops. Also implement spin_cpu_relax, which defaults to cpu_relax(). This will allow architectures to hook the entry and exit of busy-wait loops, and will allow powerpc to set low SMT priority at entry and normal priority at exit. Suggested-by: Linus Torvalds Signed-off-by: Nicholas Piggin --- Since last time:
- Fixed spin_do_cond with an initial test, as suggested by Linus.
- Renamed it to spin_until_cond, which reads a little better.
include/linux/processor.h | 70 +++ 1 file changed, 70 insertions(+) create mode 100644 include/linux/processor.h diff --git a/include/linux/processor.h b/include/linux/processor.h new file mode 100644 index ..da0c5e56ca02 --- /dev/null +++ b/include/linux/processor.h @@ -0,0 +1,70 @@ +/* Misc low level processor primitives */ +#ifndef _LINUX_PROCESSOR_H +#define _LINUX_PROCESSOR_H + +#include + +/* + * spin_begin is used before beginning a busy-wait loop, and must be paired + * with spin_end when the loop is exited. spin_cpu_relax must be called + * within the loop. + * + * The loop body should be as small and fast as possible, on the order of + * tens of instructions/cycles as a guide. It should avoid calling + * cpu_relax, or any "spin" or sleep type of primitive including nested uses + * of these primitives. It should not lock or take any other resource. + * Violations of these guidelines will not cause a bug, but may cause + * suboptimal performance. + * + * These loops are optimized to be used where wait times are expected to be + * less than the cost of a context switch (and associated overhead). + * + * Detection of resource owner and decision to spin or sleep or guest-yield + * (e.g., spin lock holder vcpu preempted, or mutex owner not on CPU) can be + * tested within the loop body. + */ +#ifndef spin_begin +#define spin_begin() +#endif + +#ifndef spin_cpu_relax +#define spin_cpu_relax() cpu_relax() +#endif + +/* + * spin_cpu_yield may be called to yield (undirected) to the hypervisor if + * necessary. This should be used if the wait is expected to take longer + * than context switch overhead, but we can't sleep or do a directed yield. + */ +#ifndef spin_cpu_yield +#define spin_cpu_yield() cpu_relax_yield() +#endif + +#ifndef spin_end +#define spin_end() +#endif + +/* + * spin_until_cond can be used to wait for a condition to become true.
It + * may be expected that the first iteration will be true in the common case + * (no spinning), so that callers should not require a first "likely" test + * for the uncontended case before using this primitive. + * + * Usage and implementation guidelines are the same as for the spin_begin + * primitives, above. + */ +#ifndef spin_until_cond +#define spin_until_cond(cond) \ +do { \ + if (unlikely(!(cond))) {\ + spin_begin(); \ + do {\ + spin_cpu_relax(); \ + } while (!(cond)); \ + spin_end(); \ + } \ +} while (0) + +#endif + +#endif /* _LINUX_PROCESSOR_H */ -- 2.11.0
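A userspace mock showing the intended call pattern of the new primitives (the fallback definitions follow the patch's generic defaults, with unlikely() dropped and cpu_relax() stubbed out since this is not kernel code):

```c
#include <assert.h>

#define cpu_relax()		do { } while (0)	/* stub for this sketch */

#define spin_begin()
#define spin_cpu_relax()	cpu_relax()
#define spin_end()

#define spin_until_cond(cond)				\
do {							\
	if (!(cond)) {					\
		spin_begin();				\
		do {					\
			spin_cpu_relax();		\
		} while (!(cond));			\
		spin_end();				\
	}						\
} while (0)

/* Busy-wait until *flag becomes non-zero, then return its value. */
static int wait_for_flag(volatile int *flag)
{
	spin_until_cond(*flag != 0);
	return *flag;
}

/*
 * Single-threaded demo: the condition is already true, so this takes
 * the no-spin fast path that the initial test in spin_until_cond
 * optimizes for.
 */
static int demo(void)
{
	volatile int flag = 7;

	return wait_for_flag(&flag);
}
```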
[PATCH V5] hwmon: (ibmpowernv) Add highest/lowest attributes to sensors
OCC provides historical minimum and maximum value for the sensor readings. This patch exports them as highest and lowest attributes for the inband sensors copied by OCC to main memory. Signed-off-by: Shilpasri G Bhat --- Changes from V4: - Got rid of 'len' variable in populate_attr_groups drivers/hwmon/ibmpowernv.c | 68 +- 1 file changed, 61 insertions(+), 7 deletions(-) diff --git a/drivers/hwmon/ibmpowernv.c b/drivers/hwmon/ibmpowernv.c index 6d2e660..b562323 100644 --- a/drivers/hwmon/ibmpowernv.c +++ b/drivers/hwmon/ibmpowernv.c @@ -298,10 +298,14 @@ static int populate_attr_groups(struct platform_device *pdev) sensor_groups[type].attr_count++; /* -* add a new attribute for labels +* add attributes for labels, min and max */ if (!of_property_read_string(np, "label", &label)) sensor_groups[type].attr_count++; + if (of_find_property(np, "sensor-data-min", NULL)) + sensor_groups[type].attr_count++; + if (of_find_property(np, "sensor-data-max", NULL)) + sensor_groups[type].attr_count++; } of_node_put(opal); @@ -337,6 +341,41 @@ static void create_hwmon_attr(struct sensor_data *sdata, const char *attr_name, sdata->dev_attr.show = show; } +static void populate_sensor(struct sensor_data *sdata, int od, int hd, int sid, + const char *attr_name, enum sensors type, + const struct attribute_group *pgroup, + ssize_t (*show)(struct device *dev, + struct device_attribute *attr, + char *buf)) +{ + sdata->id = sid; + sdata->type = type; + sdata->opal_index = od; + sdata->hwmon_index = hd; + create_hwmon_attr(sdata, attr_name, show); + pgroup->attrs[sensor_groups[type].attr_count++] = &sdata->dev_attr.attr; +} + +static char *get_max_attr(enum sensors type) +{ + switch (type) { + case POWER_INPUT: + return "input_highest"; + default: + return "highest"; + } +} + +static char *get_min_attr(enum sensors type) +{ + switch (type) { + case POWER_INPUT: + return "input_lowest"; + default: + return "lowest"; + } +} + /* * Iterate through the device tree for each child of 'sensors' 
node, create * a sysfs attribute file, the file is named by translating the DT node name @@ -417,16 +456,31 @@ static int create_device_attrs(struct platform_device *pdev) * attribute. They are related to the same * sensor. */ - sdata[count].type = type; - sdata[count].opal_index = sdata[count - 1].opal_index; - sdata[count].hwmon_index = sdata[count - 1].hwmon_index; make_sensor_label(np, &sdata[count], label); + populate_sensor(&sdata[count], opal_index, + sdata[count - 1].hwmon_index, + sensor_id, "label", type, pgroups[type], + show_label); + count++; + } - create_hwmon_attr(&sdata[count], "label", show_label); + if (!of_property_read_u32(np, "sensor-data-max", &sensor_id)) { + attr_name = get_max_attr(type); + populate_sensor(&sdata[count], opal_index, + sdata[count - 1].hwmon_index, + sensor_id, attr_name, type, + pgroups[type], show_sensor); + count++; + } - pgroups[type]->attrs[sensor_groups[type].attr_count++] = - &sdata[count++].dev_attr.attr; + if (!of_property_read_u32(np, "sensor-data-min", &sensor_id)) { + attr_name = get_min_attr(type); + populate_sensor(&sdata[count], opal_index, + sdata[count - 1].hwmon_index, + sensor_id, attr_name, type, + pgroups[type], show_sensor); + count++; } } -- 1.8.3.1
Re: [linux-next] PPC Lpar fail to boot with error hid: module verification failed: signature and/or required key missing - tainting kernel
Rob Landley writes: > On 05/25/2017 04:24 PM, Stephen Rothwell wrote: >> Hi Michael, >> >> On Thu, 25 May 2017 23:02:06 +1000 Michael Ellerman >> wrote: >>> >>> It'll be: >>> >>> ee35011fd032 ("initramfs: make initramfs honor CONFIG_DEVTMPFS_MOUNT") >> >> And Andrew has asked me to drop that patch from linux-next which will >> happen today. > > What approach do the kernel developers suggest I take here? Well I'm just *a* kernel developer, but rule #1 is don't break userspace. > I would have thought letting it soak in linux-next for a release so > people could fix userspace bugs would be the next step, but this sounds > like that's not an option? You say they're userspace bugs, userspace will say it's a bug that the kernel has changed its behaviour. > Is the behavior the patch implements wrong? Yes, because it breaks existing setups for no particularly good reason. If CONFIG_DEVTMPFS_MOUNT had always meant devtmpfs was mounted in the initramfs then that would have been fine. But because it didn't, there are now systems out there that depend on the existing behaviour, and changing it is therefore wrong IMHO. As I said in another mail you can avoid breaking existing setups by adding a new config option to control mounting devtmpfs in the initramfs. It's a pity to need yet another config option, but such is life. cheers
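The suggested fix could look something like this (the option name and wording are invented here, purely to illustrate gating the new behaviour behind a default-off Kconfig option):

```
config DEVTMPFS_MOUNT_INITRAMFS
	bool "Automount devtmpfs when booting via an initramfs"
	depends on DEVTMPFS_MOUNT
	help
	  Mount devtmpfs at /dev inside the initramfs as well.
	  Defaults to off so that existing initramfs setups keep
	  the behaviour they rely on today.
```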
RE: [PATCH 3/3] powerpc/8xx: xmon compile fix
David Laight writes: > From: Michael Ellerman >> Sent: 26 May 2017 08:24 >> Nicholas Piggin writes: >> > diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c >> > index f11f65634aab..438fdb0fb142 100644 >> > --- a/arch/powerpc/xmon/xmon.c >> > +++ b/arch/powerpc/xmon/xmon.c >> > @@ -1242,14 +1242,16 @@ bpt_cmds(void) >> > { >> >int cmd; >> >unsigned long a; >> > - int mode, i; >> > + int i; >> >struct bpt *bp; >> > - const char badaddr[] = "Only kernel addresses are permitted " >> > - "for breakpoints\n"; >> > >> >cmd = inchar(); >> >switch (cmd) { >> > -#ifndef CONFIG_8xx >> > +#ifndef CONFIG_PPC_8xx >> > + int mode; >> > + const char badaddr[] = "Only kernel addresses are permitted " >> > + "for breakpoints\n"; >> > + >> >case 'd': /* bd - hardware data breakpoint */ >> >mode = 7; >> >cmd = inchar(); >> >> GCC 7 rejects this: >> >> arch/powerpc/xmon/xmon.c: In function bpt_cmds: >> arch/powerpc/xmon/xmon.c:1252:13: error: statement will never be executed >> [-Werror=switch- >> unreachable] >> const char badaddr[] = "Only kernel addresses are permitted for >> breakpoints\n"; >>^~~ > > Try 'static' ? Yep that works, will rebase this again ... O_o cheers
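The warning is a general C gotcha worth spelling out: an automatic variable declared after the switch's opening brace is in scope for all the cases, but its run-time initializer sits before the first case label and is jumped over, which GCC 7's -Wswitch-unreachable flags. Making the variable static moves initialization to compile time, so there is nothing left to skip. A reduced illustration (not the xmon code itself):

```c
#include <assert.h>
#include <string.h>

static const char *bpt_msg(int cmd)
{
	switch (cmd) {
		/*
		 * static storage: initialized at build time, so the
		 * switch jumping past this declaration is harmless.
		 */
		static const char badaddr[] =
			"Only kernel addresses are permitted for breakpoints\n";

	case 'd':
		return badaddr;
	default:
		return "";
	}
}
```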
Re: [Patch 2/2]: powerpc/hotplug/mm: Fix hot-add memory node assoc
Reza Arbab writes: > On Fri, May 26, 2017 at 01:46:58PM +1000, Michael Ellerman wrote: >>Reza Arbab writes: >> >>> On Thu, May 25, 2017 at 04:19:53PM +1000, Michael Ellerman wrote: The commit message for 3af229f2071f says: In practice, we never see a system with 256 NUMA nodes, and in fact, we do not support node hotplug on power in the first place, so the nodes ^^^ that are online when we come up are the nodes that will be present for the lifetime of this kernel. Is that no longer true? >>> >>> I don't know what the reasoning behind that statement was at the time, >>> but as far as I can tell, the only thing missing for node hotplug now is >>> Balbir's patchset [1]. He fixes the resource issue which motivated >>> 3af229f2071f and reverts it. >>> >>> With that set, I can instantiate a new numa node just by doing >>> add_memory(nid, ...) where nid doesn't currently exist. >> >>But does that actually happen on any real system? > > I don't know if anything currently tries to do this. My interest in > having this working is so that in the future, our coherent gpu memory > could be added as a distinct node by the device driver. Sure. If/when that happens, we would hopefully still have some way to limit the size of the possible map. That would ideally be a firmware property that tells us the maximum number of GPUs that might be hot-added, or we punt and cap it at some "sane" maximum number. But until that happens it's silly to say we can have up to 256 nodes when in practice most of our systems have 8 or less. So I'm still waiting for an explanation from Michael B on how he's seeing this bug in practice. cheers
Re: [PATCH v1 1/8] powerpc/lib/code-patching: Enhance code patching
On 29/05/2017 at 00:50, Balbir Singh wrote: On Sun, 2017-05-28 at 17:59 +0200, christophe leroy wrote: On 25/05/2017 at 05:36, Balbir Singh wrote: Today our patching happens via direct copy and patch_instruction. The patching code is well contained in the sense that copying bits are limited. While considering implementation of CONFIG_STRICT_RWX, the first requirement is to create another mapping that will allow for patching. We create the window using text_poke_area, allocated via get_vm_area(), which might be overkill. We can do per-cpu stuff as well. The downside of these patches is that patch_instruction is now synchronized using a lock. Other arches do similar things, but use fixmaps. The reason for not using fixmaps is to make use of any randomization in the future. The code also relies on set_pte_at and pte_clear to do the appropriate tlb flushing. Isn't it overkill to remap the text in another area? Among the 6 arches implementing CONFIG_STRICT_KERNEL_RWX (arm, arm64, parisc, s390, x86/32, x86/64): - arm, x86/32 and x86/64 set text RW during the modification x86 uses set_fixmap() in text_poke(), am I missing something? Indeed, I looked at how it is done in ftrace. On x86, text modifications are done using ftrace_write(), which calls probe_kernel_write(), which doesn't remap anything. It first calls ftrace_arch_code_modify_prepare(), which sets the kernel text to rw. Indeed you are right, text_poke() remaps via fixmap. However it looks like text_poke() is used only for kgdb and kprobes. Christophe
[PATCH] Documentation: networking: add DPAA Ethernet document
Signed-off-by: Madalin Bucur Signed-off-by: Camelia Groza --- Documentation/networking/dpaa.txt | 194 ++ 1 file changed, 194 insertions(+) create mode 100644 Documentation/networking/dpaa.txt diff --git a/Documentation/networking/dpaa.txt b/Documentation/networking/dpaa.txt new file mode 100644 index 000..76e016d --- /dev/null +++ b/Documentation/networking/dpaa.txt @@ -0,0 +1,194 @@ +The QorIQ DPAA Ethernet Driver +== + +Authors: +Madalin Bucur +Camelia Groza + +Contents + + + - DPAA Ethernet Overview + - DPAA Ethernet Supported SoCs + - Configuring DPAA Ethernet in your kernel + - DPAA Ethernet Frame Processing + - DPAA Ethernet Features + - Debugging + +DPAA Ethernet Overview +== + +DPAA stands for Data Path Acceleration Architecture and it is a +set of networking acceleration IPs that are available on several +generations of SoCs, both on PowerPC and ARM64. + +The Freescale DPAA architecture consists of a series of hardware blocks +that support Ethernet connectivity. The Ethernet driver depends upon the +following drivers in the Linux kernel: + + - Peripheral Access Memory Unit (PAMU) (* needed only for PPC platforms) +drivers/iommu/fsl_* + - Frame Manager (FMan) +drivers/net/ethernet/freescale/fman + - Queue Manager (QMan), Buffer Manager (BMan) +drivers/soc/fsl/qbman + +A simplified view of the dpaa_eth interfaces mapped to FMan MACs: + + dpaa_eth /eth0\ ... /ethN\ + driver| | | | + - --- - + -Ports / Tx Rx \.../ Tx Rx \ + FMan| | | | + -MACs | MAC0 | | MACN | + / dtsec0 \ ... 
/ dtsecN \ (or tgec) +/ \ / \(or memac) + - -- --- -- - + FMan, FMan Port, FMan SP, FMan MURAM drivers + - + FMan HW blocks: MURAM, MACs, Ports, SP + - + +The dpaa_eth relation to the QMan, BMan and FMan: + + dpaa_eth /eth0\ + driver/ \ + - -^- -^- -^- ---- + QMan driver / \ / \ / \ \ / | BMan| + |Rx | |Rx | |Tx | |Tx | | driver | + - |Dfl| |Err| |Cnf| |FQs| | | + QMan HW|FQ | |FQ | |FQs| | | | | + / \ / \ / \ \ / | | + - --- --- --- -v-- +|FMan QMI | | +| FMan HW FMan BMI | BMan HW | + --- + +where the acronyms used above (and in the code) are: +DPAA = Data Path Acceleration Architecture +FMan = DPAA Frame Manager +QMan = DPAA Queue Manager +BMan = DPAA Buffers Manager +QMI = QMan interface in FMan +BMI = BMan interface in FMan +FMan SP = FMan Storage Profiles +MURAM = Multi-user RAM in FMan +FQ = QMan Frame Queue +Rx Dfl FQ = default reception FQ +Rx Err FQ = Rx error frames FQ +Tx Cnf FQ = Tx confirmation FQs +Tx FQs = transmission frame queues +dtsec = datapath three speed Ethernet controller (10/100/1000 Mbps) +tgec = ten gigabit Ethernet controller (10 Gbps) +memac = multirate Ethernet MAC (10/100/1000/1) + +DPAA Ethernet Supported SoCs + + +The DPAA drivers enable the Ethernet controllers present on the following SoCs: + +# PPC +P1023 +P2041 +P3041 +P4080 +P5020 +P5040 +T1023 +T1024 +T1040 +T1042 +T2080 +T4240 +B4860 + +# ARM +LS1043A +LS1046A + +Configuring DPAA Ethernet in your kernel + + +To enable the DPAA Ethernet driver, the following Kconfig options are required: + +# common for arch/arm64 and arch/powerpc platforms +CONFIG_FSL_DPAA=y +CONFIG_FSL_FMAN=y +CONFIG_FSL_DPAA_ETH=y +CONFIG_FSL_XGMAC_MDIO=y + +# for arch/powerpc only +CONFIG_FSL_PAMU=y + +# common options needed for the PHYs used on the RDBs +CONFIG_VITESSE_PHY=y +CONFIG_REALTEK_PHY=y +CONFIG_AQUANTIA_PHY=y + +DPAA Ethernet Frame Processing +== + +On Rx, buffers for the incoming frames are retrieved from one of the three +existing buffers pools. 
The driver initializes and seeds these, each with +buffers of different sizes: 1KB, 2KB and 4KB. + +On Tx, all transmitted frames are returned to the driver through Tx +confirmation frame queues. The driver is then responsible for freeing the +buffers. In order to do this properly, a backpointer is added to the buffer +before transmission that points to the skb. When the buffer returns to the +driver on a confirmation FQ, the skb can be correctly consumed. + +DPAA Ethernet Features +== + +C
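The confirmation-path bookkeeping described above can be sketched as follows (invented minimal types, not the driver's real structures): before transmission the driver writes a backpointer to the skb into the buffer's headroom, and when the buffer comes back on a Tx confirmation FQ it reads the pointer back so the skb can be freed.

```c
#include <assert.h>

struct fake_skb {
	int len;
};

/* Stash a backpointer to the skb at the start of the buffer headroom. */
static void stash_skb(void *headroom, struct fake_skb *skb)
{
	*(struct fake_skb **)headroom = skb;
}

/* On Tx confirmation, recover the skb so it can be consumed/freed. */
static struct fake_skb *recover_skb(void *headroom)
{
	return *(struct fake_skb **)headroom;
}

static int roundtrip(void)
{
	void *headroom_slot;	/* stands in for the buffer headroom */
	struct fake_skb skb = { .len = 128 };

	stash_skb(&headroom_slot, &skb);
	return recover_skb(&headroom_slot)->len;
}
```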
[PATCH] powerpc/64s: machine check handle ifetch from foreign real address for POWER9
The i-side 0111b case was missed by 7b9f71f974 ("powerpc/64s: POWER9 machine check handler"). It is possible to trigger this exception by branching to a foreign real address (bits [8:12] != 0) with instruction relocation off, and verify the exception cause is found after this patch. Fixes: 7b9f71f974 ("powerpc/64s: POWER9 machine check handler") Reported-by: Mahesh Salgaonkar Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/mce.h | 15 --- arch/powerpc/kernel/mce.c | 1 + arch/powerpc/kernel/mce_power.c | 3 +++ 3 files changed, 12 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h index 81eff8631434..190d69a7f701 100644 --- a/arch/powerpc/include/asm/mce.h +++ b/arch/powerpc/include/asm/mce.h @@ -90,13 +90,14 @@ enum MCE_UserErrorType { enum MCE_RaErrorType { MCE_RA_ERROR_INDETERMINATE = 0, MCE_RA_ERROR_IFETCH = 1, - MCE_RA_ERROR_PAGE_TABLE_WALK_IFETCH = 2, - MCE_RA_ERROR_PAGE_TABLE_WALK_IFETCH_FOREIGN = 3, - MCE_RA_ERROR_LOAD = 4, - MCE_RA_ERROR_STORE = 5, - MCE_RA_ERROR_PAGE_TABLE_WALK_LOAD_STORE = 6, - MCE_RA_ERROR_PAGE_TABLE_WALK_LOAD_STORE_FOREIGN = 7, - MCE_RA_ERROR_LOAD_STORE_FOREIGN = 8, + MCE_RA_ERROR_IFETCH_FOREIGN = 2, + MCE_RA_ERROR_PAGE_TABLE_WALK_IFETCH = 3, + MCE_RA_ERROR_PAGE_TABLE_WALK_IFETCH_FOREIGN = 4, + MCE_RA_ERROR_LOAD = 5, + MCE_RA_ERROR_STORE = 6, + MCE_RA_ERROR_PAGE_TABLE_WALK_LOAD_STORE = 7, + MCE_RA_ERROR_PAGE_TABLE_WALK_LOAD_STORE_FOREIGN = 8, + MCE_RA_ERROR_LOAD_STORE_FOREIGN = 9, }; enum MCE_LinkErrorType { diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c index 5f9eada3519b..92f185875694 100644 --- a/arch/powerpc/kernel/mce.c +++ b/arch/powerpc/kernel/mce.c @@ -268,6 +268,7 @@ void machine_check_print_event_info(struct machine_check_event *evt, static const char *mc_ra_types[] = { "Indeterminate", "Instruction fetch (bad)", + "Instruction fetch (foreign)", "Page table walk ifetch (bad)", "Page table walk ifetch (foreign)", "Load (bad)", diff --git 
a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c index f913139bb0c2..d24e689e893f 100644 --- a/arch/powerpc/kernel/mce_power.c +++ b/arch/powerpc/kernel/mce_power.c @@ -236,6 +236,9 @@ static const struct mce_ierror_table mce_p9_ierror_table[] = { { 0x081c, 0x0018, true, MCE_ERROR_TYPE_UE, MCE_UE_ERROR_PAGE_TABLE_WALK_IFETCH, MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, }, +{ 0x081c, 0x001c, true, + MCE_ERROR_TYPE_RA, MCE_RA_ERROR_IFETCH_FOREIGN, + MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, }, { 0x081c, 0x0800, true, MCE_ERROR_TYPE_LINK,MCE_LINK_ERROR_IFETCH_TIMEOUT, MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, }, -- 2.11.0
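One thing the fix illustrates is that the MCE_RA_ERROR_* enum and the mc_ra_types[] string table must be renumbered together. A reduced model of that invariant (shortened names, not the kernel's actual definitions), with a compile-time size check guarding against the two drifting apart:

```c
#include <assert.h>
#include <string.h>

enum ra_error {
	RA_ERROR_INDETERMINATE = 0,
	RA_ERROR_IFETCH,
	RA_ERROR_IFETCH_FOREIGN,	/* the newly inserted 0111b case */
	RA_ERROR_COUNT
};

static const char *ra_types[] = {
	"Indeterminate",
	"Instruction fetch (bad)",
	"Instruction fetch (foreign)",
};

/* Fails the build if the enum and the string table fall out of sync. */
_Static_assert(sizeof(ra_types) / sizeof(ra_types[0]) == RA_ERROR_COUNT,
	       "ra_types[] out of sync with enum ra_error");

static const char *ra_type_str(enum ra_error e)
{
	return ra_types[e];
}
```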
[RFC] powerpc/powernv: machine check use kernel crash path
Use the normal kernel crash path in more cases (whenever we're not the init task), because it generally leads to much better Linux crash information. POWER9 has introduced more machine check conditions that can be triggered by programming errors (as opposed to hardware errors), which need to be debugged in Linux. It's unclear what the best way is to do this. Do we need to base the behaviour on the type of error? That might be impossible to do really well because some types of errors (e.g., translation multi hits) can be caused by software or hardware failures. Best would be to do something that works well for both. So what does BMC/OCC need here? Should we plumb OPAL_REBOOT_PLATFORM_ERROR into the generic crash path somehow (to be triggered by a special case of die()/panic()? This patch is just an RFC only, but when I test triggering a 0111b error from (kernel) process context after the previous patch, this patch changes the result from taking down the system with: w8l login: Severe Machine check interrupt [Not recovered] NIP []: 0x Initiator: CPU Error type: Real address [Instruction fetch (foreign)] [ 127.426651616,0] OPAL: Reboot requested due to Platform error. Effective[ 127.426693712,3] OPAL: Reboot requested due to Platform error. address: opal: Reboot type 1 not supported Kernel panic - not syncing: PowerNV Unrecovered Machine Check CPU: 56 PID: 4425 Comm: syscall Tainted: G M 4.12.0-rc1-13857-ga4700a261072-dirty #35 Call Trace: [ 128.017988928,4] IPMI: BUG: Dropping ESEL on the floor due to buggy/mising code in OPAL for this BMCRebooting in 10 seconds.. Trying to free IRQ 496 from IRQ context! 
To killing the process and continuing with: w8l login: Severe Machine check interrupt [Not recovered] NIP []: 0x Initiator: CPU Error type: Real address [Instruction fetch (foreign)] Effective address: Oops: Machine check, sig: 7 [#1] SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp tun bridge stp llc kvm_hv kvm iptable_filter binfmt_misc vmx_crypto ip_tables x_tables autofs4 crc32c_vpmsum CPU: 22 PID: 4436 Comm: syscall Tainted: G M 4.12.0-rc1-13857-ga4700a261072-dirty #36 task: c0093230 task.stack: c0093238 NIP: LR: 217706a4 CTR: REGS: cfc8fd80 TRAP: 0200 Tainted: G M (4.12.0-rc1-13857-ga4700a261072-dirty) MSR: 901c1003 CR: 24000484 XER: 2000 CFAR: c0004c80 DAR: 21770a90 DSISR: 0a00 SOFTE: 1 GPR00: 1ebe 7fffce4818b0 21797f00 GPR04: 7fff8007ac24 44000484 4000 7fff801405e8 GPR08: 9280f033 24000484 0030 GPR12: 90001003 7fff801bc370 GPR16: GPR20: GPR24: GPR28: 7fff801b 217707a0 7fffce481918 NIP [] 0x LR [217706a4] 0x217706a4 Call Trace: Instruction dump: ---[ end trace 32ae1dabb4f8dae6 ]--- --- arch/powerpc/platforms/powernv/opal.c | 29 + 1 file changed, 21 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/platforms/powernv/opal.c b/arch/powerpc/platforms/powernv/opal.c index 59684b4af4d1..67df76ac1fba 100644 --- a/arch/powerpc/platforms/powernv/opal.c +++ b/arch/powerpc/platforms/powernv/opal.c @@ -30,6 +30,7 @@ #include #include #include +#include #include "powernv.h" @@ -407,16 +408,28 @@ static int opal_recover_mce(struct pt_regs *regs, /* Fatal machine check */ pr_err("Machine check interrupt is fatal\n"); recovered = 0; - } else if ((evt->severity == MCE_SEV_ERROR_SYNC) && - (user_mode(regs) && !is_global_init(current))) { + } else if ((evt->severity == MCE_SEV_ERROR_SYNC) + && !is_global_init(current)) { /* -* For now, kill the task if we have received exception when -* in 
userspace. -* -* TODO: Queue up this address for hwpoisioning later. +* Try to kill processes if we get a synchronous machine check +* and are not "init" (see opal_machine_check() comment about +* not going via normal
Re: [PATCH v1 1/8] powerpc/lib/code-patching: Enhance code patching
On 29/05/2017 at 00:58, Balbir Singh wrote: On Sun, 2017-05-28 at 16:29 +0200, christophe leroy wrote: On 25/05/2017 at 05:36, Balbir Singh wrote: Today our patching happens via direct copy and patch_instruction. The patching code is well contained in the sense that copying bits are limited. While considering implementation of CONFIG_STRICT_RWX, the first requirement is to create another mapping that will allow for patching. We create the window using text_poke_area, allocated via get_vm_area(), which might be overkill. We can do per-cpu stuff as well. The downside of these patches is that patch_instruction is now synchronized using a lock. Other arches do similar things, but use fixmaps. The reason for not using fixmaps is to make use of any randomization in the future. The code also relies on set_pte_at and pte_clear to do the appropriate tlb flushing. Signed-off-by: Balbir Singh [...] +static int kernel_map_addr(void *addr) +{ + unsigned long pfn; int err; - __put_user_size(instr, addr, 4, err); + if (is_vmalloc_addr(addr)) + pfn = vmalloc_to_pfn(addr); + else + pfn = __pa_symbol(addr) >> PAGE_SHIFT; + + err = map_kernel_page((unsigned long)text_poke_area->addr, + (pfn << PAGE_SHIFT), _PAGE_KERNEL_RW | _PAGE_PRESENT); map_kernel_page() doesn't exist on powerpc32, so compilation fails. However, a similar function exists, called map_page(). Maybe the below modification could help (not tested yet). Christophe Thanks, I'll try to get a compile; as an alternative, how about #ifdef CONFIG_PPC32 #define map_kernel_page map_page #endif My preference goes to renaming the PPC32 function, first because the PPC64 name fits better, second because too many defines kill readability, third because two functions doing the same thing are worth being called the same, and fourth because we surely have an opportunity to merge both functions one day. Christophe