Re: [PATCH 5/5] kfifo: log based kfifo API
Hi Yuanhan,

On Tue, Jan 08, 2013 at 10:57:53PM +0800, Yuanhan Liu wrote:
> The current kfifo API takes the kfifo size as input, but rounds the
> size _down_ to a power of 2 in __kfifo_alloc. This may introduce a
> potential issue.
>
> Take the code at drivers/hid/hid-logitech-dj.c as an example:
>
>	if (kfifo_alloc(&djrcv_dev->notif_fifo,
>			DJ_MAX_NUMBER_NOTIFICATIONS * sizeof(struct dj_report),
>			GFP_KERNEL)) {
>
> Here DJ_MAX_NUMBER_NOTIFICATIONS is 8, and sizeof(struct dj_report)
> is 15.
>
> That means it wants to allocate a kfifo buffer which can store 8
> dj_report entries at once, so the expected kfifo buffer size would
> be 8 * 15 = 120. But in the end __kfifo_alloc turns the size into
> rounddown_power_of_2(120) = 64 and allocates a 64-byte buffer, which
> I don't think is what the original author wanted.
>
> With the new log API, we can do the following:
>
>	int kfifo_size_order = order_base_2(DJ_MAX_NUMBER_NOTIFICATIONS *
>					    sizeof(struct dj_report));
>
>	if (kfifo_alloc(&djrcv_dev->notif_fifo, kfifo_size_order, GFP_KERNEL)) {
>
> This makes sure we allocate a kfifo buffer big enough to hold
> DJ_MAX_NUMBER_NOTIFICATIONS dj_report entries.

Why don't you simply change __kfifo_alloc to round the allocation up
instead of down?

Thanks.

--
Dmitry
[PATCH 5/5] kfifo: log based kfifo API
The current kfifo API takes the kfifo size as input, but rounds the
size _down_ to a power of 2 in __kfifo_alloc. This may introduce a
potential issue.

Take the code at drivers/hid/hid-logitech-dj.c as an example:

	if (kfifo_alloc(&djrcv_dev->notif_fifo,
			DJ_MAX_NUMBER_NOTIFICATIONS * sizeof(struct dj_report),
			GFP_KERNEL)) {

Here DJ_MAX_NUMBER_NOTIFICATIONS is 8, and sizeof(struct dj_report)
is 15.

That means it wants to allocate a kfifo buffer which can store 8
dj_report entries at once, so the expected kfifo buffer size would
be 8 * 15 = 120. But in the end __kfifo_alloc turns the size into
rounddown_power_of_2(120) = 64 and allocates a 64-byte buffer, which
I don't think is what the original author wanted.

With the new log API, we can do the following:

	int kfifo_size_order = order_base_2(DJ_MAX_NUMBER_NOTIFICATIONS *
					    sizeof(struct dj_report));

	if (kfifo_alloc(&djrcv_dev->notif_fifo, kfifo_size_order, GFP_KERNEL)) {

This makes sure we allocate a kfifo buffer big enough to hold
DJ_MAX_NUMBER_NOTIFICATIONS dj_report entries.

Cc: Stefani Seibold
Cc: Andrew Morton
Cc: linux-o...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: platform-driver-...@vger.kernel.org
Cc: linux-in...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: linux-r...@vger.kernel.org
Cc: linux-me...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: linux-...@lists.infradead.org
Cc: libertas-...@lists.infradead.org
Cc: linux-wirel...@vger.kernel.org
Cc: net...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: open-is...@googlegroups.com
Cc: linux-s...@vger.kernel.org
Cc: de...@driverdev.osuosl.org
Cc: linux-ser...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: linux...@kvack.org
Cc: d...@vger.kernel.org
Cc: linux-s...@vger.kernel.org
Signed-off-by: Yuanhan Liu
---
 arch/arm/plat-omap/Kconfig                  |    2 +-
 arch/arm/plat-omap/mailbox.c                |    6 +++-
 arch/powerpc/sysdev/fsl_rmu.c               |    2 +-
 drivers/char/sonypi.c                       |    9 ---
 drivers/hid/hid-logitech-dj.c               |    7 +++--
 drivers/iio/industrialio-event.c            |    2 +-
 drivers/iio/kfifo_buf.c                     |    3 +-
 drivers/infiniband/hw/cxgb3/cxio_resource.c |    8 --
 drivers/media/i2c/cx25840/cx25840-ir.c      |    9 +--
 drivers/media/pci/cx23885/cx23888-ir.c      |    9 +--
 drivers/media/pci/meye/meye.c               |    6 +---
 drivers/media/pci/meye/meye.h               |    2 +
 drivers/media/rc/ir-raw.c                   |    7 +++--
 drivers/memstick/host/r592.h                |    2 +-
 drivers/mmc/card/sdio_uart.c                |    4 ++-
 drivers/mtd/sm_ftl.c                        |    5 +++-
 drivers/net/wireless/libertas/main.c        |    4 ++-
 drivers/net/wireless/rt2x00/rt2x00dev.c     |    5 +--
 drivers/pci/pcie/aer/aerdrv_core.c          |    3 +-
 drivers/platform/x86/fujitsu-laptop.c       |    5 ++-
 drivers/platform/x86/sony-laptop.c          |    6 ++--
 drivers/rapidio/devices/tsi721.c            |    5 ++-
 drivers/scsi/libiscsi_tcp.c                 |    6 +++-
 drivers/staging/omapdrm/omap_plane.c        |    5 +++-
 drivers/tty/n_gsm.c                         |    4 ++-
 drivers/tty/nozomi.c                        |    5 +--
 drivers/tty/serial/ifx6x60.c                |    2 +-
 drivers/tty/serial/ifx6x60.h                |    3 +-
 drivers/tty/serial/kgdb_nmi.c               |    7 +++--
 drivers/usb/host/fhci.h                     |    4 ++-
 drivers/usb/serial/cypress_m8.c             |    4 +-
 drivers/usb/serial/io_ti.c                  |    4 +-
 drivers/usb/serial/ti_usb_3410_5052.c       |    7 +++--
 drivers/usb/serial/usb-serial.c             |    2 +-
 include/linux/kfifo.h                       |   31 +--
 include/linux/rio.h                         |    1 +
 include/media/lirc_dev.h                    |    4 ++-
 kernel/kfifo.c                              |    9 +--
 mm/memory-failure.c                         |    3 +-
 net/dccp/probe.c                            |    6 +++-
 net/sctp/probe.c                            |    6 +++-
 samples/kfifo/bytestream-example.c          |    8 +++---
 samples/kfifo/dma-example.c                 |    5 ++-
 samples/kfifo/inttype-example.c             |    7 +++--
 samples/kfifo/record-example.c              |    6 ++--
 45 files changed, 142 insertions(+), 108 deletions(-)

diff --git a/arch/arm/plat-omap/Kconfig b/arch/arm/plat-omap/Kconfig
index 665870d..7eda02c 100644
--- a/arch/arm/plat-omap/Kconfig
+++ b/arch/arm/plat-omap/Kconfig
@@ -124,7 +124,7 @@ config OMAP_MBOX_FWK
 	  DSP, IVA1.0 and IVA2 in OMAP1/2/3.
 
 config OMAP_MBOX_KFIFO_SIZE
-	int "Mailbox kfifo default buffer size (bytes)"
+	int "Mailbox kfifo default buffer size (bytes, should be power of 2; if not, will round up to power of 2)"
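To make the rounding arithmetic above concrete, here is a minimal
stand-alone sketch (user-space C, not kernel code; the two helpers are
simplified stand-ins for the kernel's log2.h macros), run with the
hid-logitech-dj numbers:

	#include <stdio.h>

	/* Stand-in: clear low set bits until a single power of 2 remains. */
	static unsigned long rounddown_pow_of_two(unsigned long n)
	{
		while (n & (n - 1))
			n &= n - 1;
		return n;
	}

	/* Stand-in: smallest order such that (1 << order) >= n. */
	static unsigned int order_base_2(unsigned long n)
	{
		unsigned int order = 0;

		while ((1UL << order) < n)
			order++;
		return order;
	}

	int main(void)
	{
		/* DJ_MAX_NUMBER_NOTIFICATIONS * sizeof(struct dj_report) */
		unsigned long size = 8 * 15;

		/* current API: the requested 120 silently shrinks to 64 bytes */
		printf("rounddown: %lu\n", rounddown_pow_of_two(size));

		/* proposed log API: order 7, i.e. 128 bytes >= 120 */
		printf("order %u -> %lu\n", order_base_2(size),
		       1UL << order_base_2(size));
		return 0;
	}

In other words, the log-based API cannot lose capacity: the caller
passes the order explicitly, and order_base_2() rounds the requested
size up rather than down.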
Re: [PATCH 5/5] kfifo: log based kfifo API
Dmitry Torokhov wrote:
> Hi Yuanhan,
>
> On Tue, Jan 08, 2013 at 10:57:53PM +0800, Yuanhan Liu wrote:
>> The current kfifo API takes the kfifo size as input, but rounds the
>> size _down_ to a power of 2 in __kfifo_alloc. This may introduce a
>> potential issue.
>>
>> Take the code at drivers/hid/hid-logitech-dj.c as an example:
>>
>>	if (kfifo_alloc(&djrcv_dev->notif_fifo,
>>			DJ_MAX_NUMBER_NOTIFICATIONS * sizeof(struct dj_report),
>>			GFP_KERNEL)) {
>>
>> Here DJ_MAX_NUMBER_NOTIFICATIONS is 8, and sizeof(struct dj_report)
>> is 15.
>>
>> That means it wants to allocate a kfifo buffer which can store 8
>> dj_report entries at once, so the expected kfifo buffer size would
>> be 8 * 15 = 120. But in the end __kfifo_alloc turns the size into
>> rounddown_power_of_2(120) = 64 and allocates a 64-byte buffer, which
>> I don't think is what the original author wanted.
>>
>> With the new log API, we can do the following:
>>
>>	int kfifo_size_order = order_base_2(DJ_MAX_NUMBER_NOTIFICATIONS *
>>					    sizeof(struct dj_report));
>>
>>	if (kfifo_alloc(&djrcv_dev->notif_fifo, kfifo_size_order, GFP_KERNEL)) {
>>
>> This makes sure we allocate a kfifo buffer big enough to hold
>> DJ_MAX_NUMBER_NOTIFICATIONS dj_report entries.
>
> Why don't you simply change __kfifo_alloc to round the allocation up
> instead of down?
>
> Thanks.
>
> --
> Dmitry

Hi Dmitry,

I agree. I don't see the benefit in pushing a kfifo-internal
decision/problem out to many different places in the kernel.

Regards,
Andy
[PATCH 1/4] audit: Syscall rules are not applied to existing processes on non-x86
Commit b05d8447e782 (audit: inline audit_syscall_entry to reduce
burden on archs) changed audit_syscall_entry to check for a dummy
context before calling __audit_syscall_entry. Unfortunately the dummy
context state is maintained in __audit_syscall_entry, so once set it
never gets cleared, even if the audit rules change.

As a result, if there are no auditing rules when a process starts,
then it will never be subject to any rules added later. x86 doesn't
see this because it has an assembly fast path that calls directly
into __audit_syscall_entry.

I noticed this issue when working on audit performance optimisations.
I wrote a set of simple test cases, available at:

http://ozlabs.org/~anton/junkcode/audit_tests.tar.gz

02_new_rule.py fails without the patch and passes with it. The test
case clears all rules, starts a process, adds a rule, then verifies
the process produces a syscall audit record.

Signed-off-by: Anton Blanchard
Cc: <sta...@vger.kernel.org> # 3.3+
---

Index: b/include/linux/audit.h
===================================================================
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -119,7 +119,7 @@ static inline void audit_syscall_entry(i
 					unsigned long a1, unsigned long a2,
 					unsigned long a3)
 {
-	if (unlikely(!audit_dummy_context()))
+	if (unlikely(current->audit_context))
 		__audit_syscall_entry(arch, major, a0, a1, a2, a3);
 }
 static inline void audit_syscall_exit(void *pt_regs)
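For reference, the helper this patch routes around looks roughly like
this in 3.x-era kernels (a simplified sketch of include/linux/audit.h;
it assumes 'dummy' is the first member of struct audit_context, which
is what makes the cast below work):

	static inline int audit_dummy_context(void)
	{
		void *p = current->audit_context;
		return !p || *(int *)p;	/* no context, or ctx->dummy set */
	}

Since ctx->dummy is only recomputed inside __audit_syscall_entry,
gating the call on !audit_dummy_context() means a task whose first
syscall saw no rules can never re-enter __audit_syscall_entry to
refresh that state. Checking current->audit_context alone keeps the
fast-path test cheap while still letting the dummy state be
re-evaluated once rules appear.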
[PATCH 2/4] powerpc: Remove static branch prediction in 64bit traced syscall path
Some distros enable auditing by default which forces us through the
syscall trace path. Remove the static branch prediction in our 64bit
syscall handler and let the hardware do the prediction.

Signed-off-by: Anton Blanchard
---

Index: b/arch/powerpc/kernel/entry_64.S
===================================================================
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -149,7 +149,7 @@ END_FW_FTR_SECTION_IFSET(FW_FEATURE_SPLP
 	CURRENT_THREAD_INFO(r11, r1)
 	ld	r10,TI_FLAGS(r11)
 	andi.	r11,r10,_TIF_SYSCALL_T_OR_A
-	bne-	syscall_dotrace
+	bne	syscall_dotrace
 .Lsyscall_dotrace_cont:
 	cmpldi	0,r0,NR_syscalls
 	bge-	syscall_enosys
[PATCH 3/4] powerpc: Optimise 64bit syscall auditing entry path
Add an assembly fast path for the syscall audit entry path on 64bit.
Some distros enable auditing by default which forces us through the
syscall auditing path even if there are no rules.

I wrote some test cases to validate the patch:

http://ozlabs.org/~anton/junkcode/audit_tests.tar.gz

And to test the performance I ran a simple null syscall
microbenchmark on a POWER7 box:

http://ozlabs.org/~anton/junkcode/null_syscall.c

	Baseline: 949.2 cycles
	Patched:  920.6 cycles

An improvement of 3%. Most of the potential gains are masked by the
syscall audit exit path, which will be fixed in a subsequent patch.

Signed-off-by: Anton Blanchard
---

Index: b/arch/powerpc/kernel/entry_64.S
===================================================================
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -34,6 +34,12 @@
 #include
 #include
 
+/* Avoid __ASSEMBLER__'ifying <linux/audit.h> just for this. */
+#include <linux/elf-em.h>
+#define AUDIT_ARCH_PPC		(EM_PPC)
+#define AUDIT_ARCH_PPC64	(EM_PPC64|__AUDIT_ARCH_64BIT)
+#define __AUDIT_ARCH_64BIT	0x80000000
+
 /*
  * System calls.
  */
@@ -244,6 +250,10 @@ syscall_error:
 
 /* Traced system call support */
 syscall_dotrace:
+#ifdef CONFIG_AUDITSYSCALL
+	andi.	r11,r10,(_TIF_SYSCALL_T_OR_A & ~_TIF_SYSCALL_AUDIT)
+	beq	audit_entry
+#endif
 	bl	.save_nvgprs
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	.do_syscall_trace_enter
@@ -253,6 +263,7 @@ syscall_dotrace:
 	 * for the call number to look up in the table (r0).
 	 */
 	mr	r0,r3
+.Laudit_entry_return:
 	ld	r3,GPR3(r1)
 	ld	r4,GPR4(r1)
 	ld	r5,GPR5(r1)
@@ -264,6 +275,34 @@ syscall_dotrace:
 	ld	r10,TI_FLAGS(r10)
 	b	.Lsyscall_dotrace_cont
 
+#ifdef CONFIG_AUDITSYSCALL
+audit_entry:
+	ld	r4,GPR0(r1)
+	ld	r5,GPR3(r1)
+	ld	r6,GPR4(r1)
+	ld	r7,GPR5(r1)
+	ld	r8,GPR6(r1)
+
+	andi.	r11,r10,_TIF_32BIT
+	beq	1f
+
+	lis	r3,AUDIT_ARCH_PPC@h
+	ori	r3,r3,AUDIT_ARCH_PPC@l
+	clrldi	r5,r5,32
+	clrldi	r6,r6,32
+	clrldi	r7,r7,32
+	clrldi	r8,r8,32
+	bl	.__audit_syscall_entry
+	ld	r0,GPR0(r1)
+	b	.Laudit_entry_return
+
+1:	lis	r3,AUDIT_ARCH_PPC64@h
+	ori	r3,r3,AUDIT_ARCH_PPC64@l
+	bl	.__audit_syscall_entry
+	ld	r0,GPR0(r1)
+	b	.Laudit_entry_return
+#endif
+
 syscall_enosys:
 	li	r3,-ENOSYS
 	b	syscall_exit
[PATCH 4/4] powerpc: Optimise 64bit syscall auditing exit path
Add an assembly fast path for the syscall audit exit path on 64bit.
Some distros enable auditing by default which forces us through the
syscall auditing path even if there are no rules.

With syscall auditing enabled we currently disable interrupts, check
the threadinfo flags, then immediately re-enable interrupts and call
audit_syscall_exit. This patch splits the threadinfo flag check into
two so we can avoid the disable/re-enable of interrupts when handling
trace flags. We must do the user work flag check with interrupts off
to avoid returning to userspace without handling them.

The other big gain is that we don't have to save and restore the
non-volatile registers or exit via the slow ret_from_except path.

I wrote some test cases to validate the patch:

http://ozlabs.org/~anton/junkcode/audit_tests.tar.gz

And to test the performance I ran a simple null syscall
microbenchmark on a POWER7 box:

http://ozlabs.org/~anton/junkcode/null_syscall.c

	Baseline: 920.6 cycles
	Patched:  719.6 cycles

An improvement of 22%.

Signed-off-by: Anton Blanchard
---

Index: b/arch/powerpc/kernel/entry_64.S
===================================================================
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -195,6 +195,19 @@ syscall_exit:
 	andi.	r10,r8,MSR_RI
 	beq-	unrecov_restore
 #endif
+
+	/* We can handle some thread info flags with interrupts on */
+	ld	r9,TI_FLAGS(r12)
+	li	r11,-_LAST_ERRNO
+	andi.	r0,r9,(_TIF_SYSCALL_T_OR_A|_TIF_SINGLESTEP|_TIF_PERSYSCALL_MASK)
+	bne	syscall_exit_work
+
+	cmpld	r3,r11
+	ld	r5,_CCR(r1)
+	bge-	syscall_error
+
+.Lsyscall_exit_work_cont:
+
 	/*
 	 * Disable interrupts so current_thread_info()->flags can't change,
 	 * and so that we don't get interrupted after loading SRR0/1.
@@ -208,21 +221,19 @@ syscall_exit:
 	 * clear EE. We only need to clear RI just before we restore r13
 	 * below, but batching it with EE saves us one expensive mtmsrd call.
 	 * We have to be careful to restore RI if we branch anywhere from
-	 * here (eg syscall_exit_work).
+	 * here (eg syscall_exit_user_work).
 	 */
 	li	r9,MSR_RI
 	andc	r11,r10,r9
 	mtmsrd	r11,1
 #endif /* CONFIG_PPC_BOOK3E */
 
+	/* Recheck thread info flags with interrupts off */
 	ld	r9,TI_FLAGS(r12)
-	li	r11,-_LAST_ERRNO
-	andi.	r0,r9,(_TIF_SYSCALL_T_OR_A|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK)
-	bne-	syscall_exit_work
-	cmpld	r3,r11
-	ld	r5,_CCR(r1)
-	bge-	syscall_error
-.Lsyscall_error_cont:
+
+	andi.	r0,r9,_TIF_USER_WORK_MASK
+	bne-	syscall_exit_user_work
+
 	ld	r7,_NIP(r1)
 BEGIN_FTR_SECTION
 	stdcx.	r0,0,r1			/* to clear the reservation */
@@ -246,7 +257,7 @@ syscall_error:
 	oris	r5,r5,0x1000	/* Set SO bit in CR */
 	neg	r3,r3
 	std	r5,_CCR(r1)
-	b	.Lsyscall_error_cont
+	b	.Lsyscall_exit_work_cont
 
 /* Traced system call support */
 syscall_dotrace:
@@ -306,58 +317,79 @@ audit_entry:
 syscall_enosys:
 	li	r3,-ENOSYS
 	b	syscall_exit
-
+
 syscall_exit_work:
-#ifdef CONFIG_PPC_BOOK3S
-	mtmsrd	r10,1		/* Restore RI */
-#endif
-	/* If TIF_RESTOREALL is set, don't scribble on either r3 or ccr.
-	   If TIF_NOERROR is set, just save r3 as it is. */
+	li	r6,1		/* r6 contains syscall success */
+	mr	r7,r3
+	ld	r5,_CCR(r1)
+	/*
+	 * If TIF_RESTOREALL is set, don't scribble on either r3 or ccr.
+	 * If TIF_NOERROR is set, just save r3 as it is.
+	 */
 	andi.	r0,r9,_TIF_RESTOREALL
 	beq+	0f
 	REST_NVGPRS(r1)
 	b	2f
-0:	cmpld	r3,r11		/* r10 is -LAST_ERRNO */
+0:	cmpld	r3,r11		/* r11 is -LAST_ERRNO */
 	blt+	1f
 	andi.	r0,r9,_TIF_NOERROR
 	bne-	1f
-	ld	r5,_CCR(r1)
+	li	r6,0		/* syscall failed */
 	neg	r3,r3
 	oris	r5,r5,0x1000	/* Set SO bit in CR */
 	std	r5,_CCR(r1)
 1:	std	r3,GPR3(r1)
-2:	andi.	r0,r9,(_TIF_PERSYSCALL_MASK)
+
+2:	andi.	r0,r9,_TIF_SYSCALL_AUDIT
 	beq	4f
-	/* Clear per-syscall TIF flags if any are set. */
+	mr	r3,r6
+	mr	r4,r7
+	bl	.__audit_syscall_exit
+	CURRENT_THREAD_INFO(r12, r1)
+	ld	r9,TI_FLAGS(r12)
+	ld	r3,GPR3(r1)
+	ld	r5,_CCR(r1)
+	ld	r8,_MSR(r1)
+
+4:	andi.	r0,r9,(_TIF_PERSYSCALL_MASK)
+	beq	6f
+	/* Clear per-syscall TIF flags if any are set. */
 	li	r11,_TIF_PERSYSCALL_MASK
 	addi	r12,r12,TI_FLAGS
-3:	ldarx	r10,0,r12
+5:	ldarx	r10,0,r12
 	andc	r10,r1
[PATCH 1/8] mm: use vm_unmapped_area() on parisc architecture
Update the parisc arch_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.

Signed-off-by: Michel Lespinasse
---
 arch/parisc/kernel/sys_parisc.c |   46 ++++++++++++-----------------
 1 files changed, 17 insertions(+), 29 deletions(-)

diff --git a/arch/parisc/kernel/sys_parisc.c b/arch/parisc/kernel/sys_parisc.c
index f76c10863c62..6ab138088076 100644
--- a/arch/parisc/kernel/sys_parisc.c
+++ b/arch/parisc/kernel/sys_parisc.c
@@ -35,18 +35,15 @@
 
 static unsigned long get_unshared_area(unsigned long addr, unsigned long len)
 {
-	struct vm_area_struct *vma;
+	struct vm_unmapped_area_info info;
 
-	addr = PAGE_ALIGN(addr);
-
-	for (vma = find_vma(current->mm, addr); ; vma = vma->vm_next) {
-		/* At this point: (!vma || addr < vma->vm_end). */
-		if (TASK_SIZE - len < addr)
-			return -ENOMEM;
-		if (!vma || addr + len <= vma->vm_start)
-			return addr;
-		addr = vma->vm_end;
-	}
+	info.flags = 0;
+	info.length = len;
+	info.low_limit = PAGE_ALIGN(addr);
+	info.high_limit = TASK_SIZE;
+	info.align_mask = 0;
+	info.align_offset = 0;
+	return vm_unmapped_area(&info);
 }
 
 #define DCACHE_ALIGN(addr) (((addr) + (SHMLBA - 1)) &~ (SHMLBA - 1))
@@ -63,30 +60,21 @@ static unsigned long get_unshared_area(unsigned long addr, unsigned long len)
  */
 static int get_offset(struct address_space *mapping)
 {
-	int offset = (unsigned long) mapping << (PAGE_SHIFT - 8);
-	return offset & 0x3FF000;
+	return (unsigned long) mapping >> 8;
 }
 
 static unsigned long get_shared_area(struct address_space *mapping,
 		unsigned long addr, unsigned long len, unsigned long pgoff)
 {
-	struct vm_area_struct *vma;
-	int offset = mapping ? get_offset(mapping) : 0;
-
-	offset = (offset + (pgoff << PAGE_SHIFT)) & 0x3FF000;
+	struct vm_unmapped_area_info info;
 
-	addr = DCACHE_ALIGN(addr - offset) + offset;
-
-	for (vma = find_vma(current->mm, addr); ; vma = vma->vm_next) {
-		/* At this point: (!vma || addr < vma->vm_end). */
-		if (TASK_SIZE - len < addr)
-			return -ENOMEM;
-		if (!vma || addr + len <= vma->vm_start)
-			return addr;
-		addr = DCACHE_ALIGN(vma->vm_end - offset) + offset;
-		if (addr < vma->vm_end) /* handle wraparound */
-			return -ENOMEM;
-	}
+	info.flags = 0;
+	info.length = len;
+	info.low_limit = PAGE_ALIGN(addr);
+	info.high_limit = TASK_SIZE;
+	info.align_mask = PAGE_MASK & (SHMLBA - 1);
+	info.align_offset = (get_offset(mapping) + pgoff) << PAGE_SHIFT;
+	return vm_unmapped_area(&info);
 }
 
 unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr,
-- 
1.7.7.3
[PATCH 0/8] vm_unmapped_area: finish the mission
These patches, which apply on top of v3.8-rc kernels, complete the
VMA gap finding code I introduced (following Rik's initial proposal)
in v3.8-rc1.

The first 5 patches introduce the use of vm_unmapped_area() to replace
brute force searches on the parisc, alpha, frv and ia64 architectures
(all relatively trivial uses of the vm_unmapped_area() infrastructure).

The next 2 patches do the same for the powerpc architecture. This
change is not as trivial as for the other architectures, because we
need to account for each address space slice potentially having a
different page size.

The last patch removes the free_area_cache, which was used by all the
brute force searches before they got converted to the
vm_unmapped_area() infrastructure.

I did some basic testing on x86 and powerpc; however the first 5
(simpler) patches for the parisc, alpha, frv and ia64 architectures
are untested.

Michel Lespinasse (8):
  mm: use vm_unmapped_area() on parisc architecture
  mm: use vm_unmapped_area() on alpha architecture
  mm: use vm_unmapped_area() on frv architecture
  mm: use vm_unmapped_area() on ia64 architecture
  mm: use vm_unmapped_area() in hugetlbfs on ia64 architecture
  mm: remove free_area_cache use in powerpc architecture
  mm: use vm_unmapped_area() on powerpc architecture
  mm: remove free_area_cache

 arch/alpha/kernel/osf_sys.c              |   20 ++--
 arch/arm/mm/mmap.c                       |    2 -
 arch/arm64/mm/mmap.c                     |    2 -
 arch/frv/mm/elf-fdpic.c                  |   49 +++----
 arch/ia64/kernel/sys_ia64.c              |   37 ++----
 arch/ia64/mm/hugetlbpage.c               |   20 ++--
 arch/mips/mm/mmap.c                      |    2 -
 arch/parisc/kernel/sys_parisc.c          |   46 +++----
 arch/powerpc/include/asm/page_64.h       |    3 +-
 arch/powerpc/mm/hugetlbpage.c            |    2 +-
 arch/powerpc/mm/mmap_64.c                |    2 -
 arch/powerpc/mm/slice.c                  |  228 +-
 arch/powerpc/platforms/cell/spufs/file.c |    2 +-
 arch/s390/mm/mmap.c                      |    4 -
 arch/sparc/kernel/sys_sparc_64.c         |    2 -
 arch/tile/mm/mmap.c                      |    2 -
 arch/x86/ia32/ia32_aout.c                |    2 -
 arch/x86/mm/mmap.c                       |    2 -
 fs/binfmt_aout.c                         |    2 -
 fs/binfmt_elf.c                          |    2 -
 include/linux/mm_types.h                 |    3 -
 include/linux/sched.h                    |    2 -
 kernel/fork.c                            |    4 -
 mm/mmap.c                                |   28 ----
 mm/nommu.c                               |    4 -
 mm/util.c                                |    1 -
 26 files changed, 163 insertions(+), 310 deletions(-)

-- 
1.7.7.3
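For readers who haven't seen the v3.8-rc1 infrastructure the series
builds on, the conversions below all target this interface (a sketch
of include/linux/mm.h as merged in v3.8-rc1; the comments are added
here for explanation):

	struct vm_unmapped_area_info {
	#define VM_UNMAPPED_AREA_TOPDOWN 1
		unsigned long flags;		/* 0 or VM_UNMAPPED_AREA_TOPDOWN */
		unsigned long length;		/* size of the desired mapping */
		unsigned long low_limit;	/* search at or above this address */
		unsigned long high_limit;	/* search below this address */
		unsigned long align_mask;	/* required alignment of the start */
		unsigned long align_offset;	/* offset applied before masking */
	};

	extern unsigned long unmapped_area(struct vm_unmapped_area_info *info);
	extern unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info);

	/*
	 * Search for an unmapped address range: returns the start address
	 * of a suitable gap, or -ENOMEM. Callers test for failure with
	 * (addr & ~PAGE_MASK), as the patches below do.
	 */
	static inline unsigned long
	vm_unmapped_area(struct vm_unmapped_area_info *info)
	{
		if (!(info->flags & VM_UNMAPPED_AREA_TOPDOWN))
			return unmapped_area(info);
		else
			return unmapped_area_topdown(info);
	}

Internally both helpers walk the gap-augmented VMA rbtree, which is
what gives the O(log N) behaviour discussed later in this thread.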
[PATCH 2/8] mm: use vm_unmapped_area() on alpha architecture
Update the alpha arch_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.

Signed-off-by: Michel Lespinasse
---
 arch/alpha/kernel/osf_sys.c |   20 +---
 1 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
index 14db93e4c8a8..ba707e23ef37 100644
--- a/arch/alpha/kernel/osf_sys.c
+++ b/arch/alpha/kernel/osf_sys.c
@@ -1298,17 +1298,15 @@ static unsigned long
 arch_get_unmapped_area_1(unsigned long addr, unsigned long len,
 			 unsigned long limit)
 {
-	struct vm_area_struct *vma = find_vma(current->mm, addr);
-
-	while (1) {
-		/* At this point: (!vma || addr < vma->vm_end). */
-		if (limit - len < addr)
-			return -ENOMEM;
-		if (!vma || addr + len <= vma->vm_start)
-			return addr;
-		addr = vma->vm_end;
-		vma = vma->vm_next;
-	}
+	struct vm_unmapped_area_info info;
+
+	info.flags = 0;
+	info.length = len;
+	info.low_limit = addr;
+	info.high_limit = limit;
+	info.align_mask = 0;
+	info.align_offset = 0;
+	return vm_unmapped_area(&info);
 }
 
 unsigned long
-- 
1.7.7.3
[PATCH 3/8] mm: use vm_unmapped_area() on frv architecture
Update the frv arch_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.

Signed-off-by: Michel Lespinasse
---
 arch/frv/mm/elf-fdpic.c |   49 --
 1 files changed, 17 insertions(+), 32 deletions(-)

diff --git a/arch/frv/mm/elf-fdpic.c b/arch/frv/mm/elf-fdpic.c
index 385fd30b142f..836f14707a62 100644
--- a/arch/frv/mm/elf-fdpic.c
+++ b/arch/frv/mm/elf-fdpic.c
@@ -60,7 +60,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr, unsi
 			 unsigned long pgoff, unsigned long flags)
 {
 	struct vm_area_struct *vma;
-	unsigned long limit;
+	struct vm_unmapped_area_info info;
 
 	if (len > TASK_SIZE)
 		return -ENOMEM;
@@ -79,39 +79,24 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr, unsi
 	}
 
 	/* search between the bottom of user VM and the stack grow area */
-	addr = PAGE_SIZE;
-	limit = (current->mm->start_stack - 0x00200000);
-	if (addr + len <= limit) {
-		limit -= len;
-
-		if (addr <= limit) {
-			vma = find_vma(current->mm, PAGE_SIZE);
-			for (; vma; vma = vma->vm_next) {
-				if (addr > limit)
-					break;
-				if (addr + len <= vma->vm_start)
-					goto success;
-				addr = vma->vm_end;
-			}
-		}
-	}
+	info.flags = 0;
+	info.length = len;
+	info.low_limit = PAGE_SIZE;
+	info.high_limit = (current->mm->start_stack - 0x00200000);
+	info.align_mask = 0;
+	info.align_offset = 0;
+	addr = vm_unmapped_area(&info);
+	if (!(addr & ~PAGE_MASK))
+		goto success;
+	VM_BUG_ON(addr != -ENOMEM);
 
 	/* search from just above the WorkRAM area to the top of memory */
-	addr = PAGE_ALIGN(0x80000000);
-	limit = TASK_SIZE - len;
-	if (addr <= limit) {
-		vma = find_vma(current->mm, addr);
-		for (; vma; vma = vma->vm_next) {
-			if (addr > limit)
-				break;
-			if (addr + len <= vma->vm_start)
-				goto success;
-			addr = vma->vm_end;
-		}
-
-		if (!vma && addr <= limit)
-			goto success;
-	}
+	info.low_limit = PAGE_ALIGN(0x80000000);
+	info.high_limit = TASK_SIZE;
+	addr = vm_unmapped_area(&info);
+	if (!(addr & ~PAGE_MASK))
+		goto success;
+	VM_BUG_ON(addr != -ENOMEM);
 
 #if 0
 	printk("[area] l=%lx (ENOMEM) f='%s'\n",
-- 
1.7.7.3
[PATCH 4/8] mm: use vm_unmapped_area() on ia64 architecture
Update the ia64 arch_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.

Signed-off-by: Michel Lespinasse
---
 arch/ia64/kernel/sys_ia64.c |   37 -----
 1 files changed, 12 insertions(+), 25 deletions(-)

diff --git a/arch/ia64/kernel/sys_ia64.c b/arch/ia64/kernel/sys_ia64.c
index d9439ef2f661..41e33f84c185 100644
--- a/arch/ia64/kernel/sys_ia64.c
+++ b/arch/ia64/kernel/sys_ia64.c
@@ -25,9 +25,9 @@ arch_get_unmapped_area (struct file *filp, unsigned long addr, unsigned long len
 			unsigned long pgoff, unsigned long flags)
 {
 	long map_shared = (flags & MAP_SHARED);
-	unsigned long start_addr, align_mask = PAGE_SIZE - 1;
+	unsigned long align_mask = 0;
 	struct mm_struct *mm = current->mm;
-	struct vm_area_struct *vma;
+	struct vm_unmapped_area_info info;
 
 	if (len > RGN_MAP_LIMIT)
 		return -ENOMEM;
@@ -44,7 +44,7 @@ arch_get_unmapped_area (struct file *filp, unsigned long addr, unsigned long len
 		addr = 0;
 #endif
 	if (!addr)
-		addr = mm->free_area_cache;
+		addr = TASK_UNMAPPED_BASE;
 
 	if (map_shared && (TASK_SIZE > 0xfffffffful))
 		/*
		 * tasks, we prefer to avoid exhausting the address space too quickly by
		 * limiting alignment to a single page.
		 */
-		align_mask = SHMLBA - 1;
-
-  full_search:
-	start_addr = addr = (addr + align_mask) & ~align_mask;
-
-	for (vma = find_vma(mm, addr); ; vma = vma->vm_next) {
-		/* At this point: (!vma || addr < vma->vm_end). */
-		if (TASK_SIZE - len < addr || RGN_MAP_LIMIT - len < REGION_OFFSET(addr)) {
-			if (start_addr != TASK_UNMAPPED_BASE) {
-				/* Start a new search --- just in case we missed some holes. */
-				addr = TASK_UNMAPPED_BASE;
-				goto full_search;
-			}
-			return -ENOMEM;
-		}
-		if (!vma || addr + len <= vma->vm_start) {
-			/* Remember the address where we stopped this search: */
-			mm->free_area_cache = addr + len;
-			return addr;
-		}
-		addr = (vma->vm_end + align_mask) & ~align_mask;
-	}
+		align_mask = PAGE_MASK & (SHMLBA - 1);
+
+	info.flags = 0;
+	info.length = len;
+	info.low_limit = addr;
+	info.high_limit = TASK_SIZE;
+	info.align_mask = align_mask;
+	info.align_offset = 0;
+	return vm_unmapped_area(&info);
 }
 
 asmlinkage long
-- 
1.7.7.3
[PATCH 5/8] mm: use vm_unmapped_area() in hugetlbfs on ia64 architecture
Update the ia64 hugetlb_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.

Signed-off-by: Michel Lespinasse
---
 arch/ia64/mm/hugetlbpage.c |   20 +---
 1 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/arch/ia64/mm/hugetlbpage.c b/arch/ia64/mm/hugetlbpage.c
index 5ca674b74737..76069c18ee42 100644
--- a/arch/ia64/mm/hugetlbpage.c
+++ b/arch/ia64/mm/hugetlbpage.c
@@ -148,7 +148,7 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb,
 unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 		unsigned long pgoff, unsigned long flags)
 {
-	struct vm_area_struct *vmm;
+	struct vm_unmapped_area_info info;
 
 	if (len > RGN_MAP_LIMIT)
 		return -ENOMEM;
@@ -165,16 +165,14 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr, u
 	/* This code assumes that RGN_HPAGE != 0. */
 	if ((REGION_NUMBER(addr) != RGN_HPAGE) || (addr & (HPAGE_SIZE - 1)))
 		addr = HPAGE_REGION_BASE;
-	else
-		addr = ALIGN(addr, HPAGE_SIZE);
-	for (vmm = find_vma(current->mm, addr); ; vmm = vmm->vm_next) {
-		/* At this point: (!vmm || addr < vmm->vm_end). */
-		if (REGION_OFFSET(addr) + len > RGN_MAP_LIMIT)
-			return -ENOMEM;
-		if (!vmm || (addr + len) <= vmm->vm_start)
-			return addr;
-		addr = ALIGN(vmm->vm_end, HPAGE_SIZE);
-	}
+
+	info.flags = 0;
+	info.length = len;
+	info.low_limit = addr;
+	info.high_limit = HPAGE_REGION_BASE + RGN_MAP_LIMIT;
+	info.align_mask = PAGE_MASK & (HPAGE_SIZE - 1);
+	info.align_offset = 0;
+	return vm_unmapped_area(&info);
 }
 
 static int __init hugetlb_setup_sz(char *str)
-- 
1.7.7.3
[PATCH 6/8] mm: remove free_area_cache use in powerpc architecture
As all other architectures have been converted to use
vm_unmapped_area(), we are about to retire the free_area_cache.

This change simply removes the use of that cache in
slice_get_unmapped_area(), which will most certainly have a
performance cost. The next one will convert that function to use the
vm_unmapped_area() infrastructure and regain the performance.

Signed-off-by: Michel Lespinasse
---
 arch/powerpc/include/asm/page_64.h       |    3 +-
 arch/powerpc/mm/hugetlbpage.c            |    2 +-
 arch/powerpc/mm/slice.c                  |  108 +----
 arch/powerpc/platforms/cell/spufs/file.c |    2 +-
 4 files changed, 22 insertions(+), 93 deletions(-)

diff --git a/arch/powerpc/include/asm/page_64.h b/arch/powerpc/include/asm/page_64.h
index cd915d6b093d..88693cef4f3d 100644
--- a/arch/powerpc/include/asm/page_64.h
+++ b/arch/powerpc/include/asm/page_64.h
@@ -99,8 +99,7 @@ extern unsigned long slice_get_unmapped_area(unsigned long addr,
 					     unsigned long len,
 					     unsigned long flags,
 					     unsigned int psize,
-					     int topdown,
-					     int use_cache);
+					     int topdown);
 
 extern unsigned int get_slice_psize(struct mm_struct *mm,
 				    unsigned long addr);
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 1a6de0a7d8eb..5dc52d803ed8 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -742,7 +742,7 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 	struct hstate *hstate = hstate_file(file);
 	int mmu_psize = shift_to_mmu_psize(huge_page_shift(hstate));
 
-	return slice_get_unmapped_area(addr, len, flags, mmu_psize, 1, 0);
+	return slice_get_unmapped_area(addr, len, flags, mmu_psize, 1);
 }
 #endif
 
diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index cf9dada734b6..999a74f25ebe 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -240,23 +240,15 @@ static void slice_convert(struct mm_struct *mm, struct slice_mask mask, int psiz
 
 static unsigned long slice_find_area_bottomup(struct mm_struct *mm,
 					      unsigned long len,
 					      struct slice_mask available,
-					      int psize, int use_cache)
+					      int psize)
 {
 	struct vm_area_struct *vma;
-	unsigned long start_addr, addr;
+	unsigned long addr;
 	struct slice_mask mask;
 	int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
 
-	if (use_cache) {
-		if (len <= mm->cached_hole_size) {
-			start_addr = addr = TASK_UNMAPPED_BASE;
-			mm->cached_hole_size = 0;
-		} else
-			start_addr = addr = mm->free_area_cache;
-	} else
-		start_addr = addr = TASK_UNMAPPED_BASE;
+	addr = TASK_UNMAPPED_BASE;
 
-full_search:
 	for (;;) {
 		addr = _ALIGN_UP(addr, 1ul << pshift);
 		if ((TASK_SIZE - len) < addr)
@@ -272,63 +264,24 @@ full_search:
 			addr = _ALIGN_UP(addr + 1, 1ul << SLICE_HIGH_SHIFT);
 			continue;
 		}
-		if (!vma || addr + len <= vma->vm_start) {
-			/*
-			 * Remember the place where we stopped the search:
-			 */
-			if (use_cache)
-				mm->free_area_cache = addr + len;
+		if (!vma || addr + len <= vma->vm_start)
 			return addr;
-		}
-		if (use_cache && (addr + mm->cached_hole_size) < vma->vm_start)
-			mm->cached_hole_size = vma->vm_start - addr;
 		addr = vma->vm_end;
 	}
 
-	/* Make sure we didn't miss any holes */
-	if (use_cache && start_addr != TASK_UNMAPPED_BASE) {
-		start_addr = addr = TASK_UNMAPPED_BASE;
-		mm->cached_hole_size = 0;
-		goto full_search;
-	}
 	return -ENOMEM;
 }
 
 static unsigned long slice_find_area_topdown(struct mm_struct *mm,
 					     unsigned long len,
 					     struct slice_mask available,
-					     int psize, int use_cache)
+					     int psize)
 {
 	struct vm_area_struct *vma;
 	unsigned long addr;
 	struct slice_mask mask;
 	int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
 
-	/* check if free_area_cache is useful for us */
[PATCH 7/8] mm: use vm_unmapped_area() on powerpc architecture
Update the powerpc slice_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.

Signed-off-by: Michel Lespinasse
---
 arch/powerpc/mm/slice.c |  128 +-
 1 files changed, 81 insertions(+), 47 deletions(-)

diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index 999a74f25ebe..048346b7eed5 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -242,31 +242,51 @@ static unsigned long slice_find_area_bottomup(struct mm_struct *mm,
 					      struct slice_mask available,
 					      int psize)
 {
-	struct vm_area_struct *vma;
-	unsigned long addr;
-	struct slice_mask mask;
 	int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
+	unsigned long addr, found, slice;
+	struct vm_unmapped_area_info info;
 
-	addr = TASK_UNMAPPED_BASE;
+	info.flags = 0;
+	info.length = len;
+	info.align_mask = PAGE_MASK & ((1ul << pshift) - 1);
+	info.align_offset = 0;
 
-	for (;;) {
-		addr = _ALIGN_UP(addr, 1ul << pshift);
-		if ((TASK_SIZE - len) < addr)
-			break;
-		vma = find_vma(mm, addr);
-		BUG_ON(vma && (addr >= vma->vm_end));
+	addr = TASK_UNMAPPED_BASE;
+	while (addr < TASK_SIZE) {
+		info.low_limit = addr;
+		if (addr < SLICE_LOW_TOP) {
+			slice = GET_LOW_SLICE_INDEX(addr);
+			addr = (slice + 1) << SLICE_LOW_SHIFT;
+			if (!(available.low_slices & (1u << slice)))
+				continue;
+		} else {
+			slice = GET_HIGH_SLICE_INDEX(addr);
+			addr = (slice + 1) << SLICE_HIGH_SHIFT;
+			if (!(available.high_slices & (1u << slice)))
+				continue;
+		}
 
-		mask = slice_range_to_mask(addr, len);
-		if (!slice_check_fit(mask, available)) {
-			if (addr < SLICE_LOW_TOP)
-				addr = _ALIGN_UP(addr + 1, 1ul << SLICE_LOW_SHIFT);
-			else
-				addr = _ALIGN_UP(addr + 1, 1ul << SLICE_HIGH_SHIFT);
-			continue;
+	next_slice:
+		if (addr >= TASK_SIZE)
+			addr = TASK_SIZE;
+		else if (addr < SLICE_LOW_TOP) {
+			slice = GET_LOW_SLICE_INDEX(addr);
+			if (available.low_slices & (1u << slice)) {
+				addr = (slice + 1) << SLICE_LOW_SHIFT;
+				goto next_slice;
+			}
+		} else {
+			slice = GET_HIGH_SLICE_INDEX(addr);
+			if (available.high_slices & (1u << slice)) {
+				addr = (slice + 1) << SLICE_HIGH_SHIFT;
+				goto next_slice;
+			}
 		}
-		if (!vma || addr + len <= vma->vm_start)
-			return addr;
-		addr = vma->vm_end;
+		info.high_limit = addr;
+
+		found = vm_unmapped_area(&info);
+		if (!(found & ~PAGE_MASK))
+			return found;
 	}
 
 	return -ENOMEM;
@@ -277,39 +297,53 @@ static unsigned long slice_find_area_topdown(struct mm_struct *mm,
 					     struct slice_mask available,
 					     int psize)
 {
-	struct vm_area_struct *vma;
-	unsigned long addr;
-	struct slice_mask mask;
 	int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
+	unsigned long addr, found, slice;
+	struct vm_unmapped_area_info info;
 
-	addr = mm->mmap_base;
-	while (addr > len) {
-		/* Go down by chunk size */
-		addr = _ALIGN_DOWN(addr - len, 1ul << pshift);
+	info.flags = VM_UNMAPPED_AREA_TOPDOWN;
+	info.length = len;
+	info.align_mask = PAGE_MASK & ((1ul << pshift) - 1);
+	info.align_offset = 0;
 
-		/* Check for hit with different page size */
-		mask = slice_range_to_mask(addr, len);
-		if (!slice_check_fit(mask, available)) {
-			if (addr < SLICE_LOW_TOP)
-				addr = _ALIGN_DOWN(addr, 1ul << SLICE_LOW_SHIFT);
-			else if (addr < (1ul << SLICE_HIGH_SHIFT))
-				addr = SLICE_LOW_TOP;
-			else
-				addr = _ALIGN_DOWN(addr, 1ul << SLICE_HIGH_SHIFT);
-			continue;
+	addr = mm->mmap_base;
+	while (addr > PAGE_SIZE) {
+		info.high_limi
[PATCH 8/8] mm: remove free_area_cache
Since all architectures have been converted to use vm_unmapped_area(),
there is no remaining use for the free_area_cache.

Signed-off-by: Michel Lespinasse
---
 arch/arm/mm/mmap.c               |    2 --
 arch/arm64/mm/mmap.c             |    2 --
 arch/mips/mm/mmap.c              |    2 --
 arch/powerpc/mm/mmap_64.c        |    2 --
 arch/s390/mm/mmap.c              |    4 ----
 arch/sparc/kernel/sys_sparc_64.c |    2 --
 arch/tile/mm/mmap.c              |    2 --
 arch/x86/ia32/ia32_aout.c        |    2 --
 arch/x86/mm/mmap.c               |    2 --
 fs/binfmt_aout.c                 |    2 --
 fs/binfmt_elf.c                  |    2 --
 include/linux/mm_types.h         |    3 ---
 include/linux/sched.h            |    2 --
 kernel/fork.c                    |    4 ----
 mm/mmap.c                        |   28 ----------------------------
 mm/nommu.c                       |    4 ----
 mm/util.c                        |    1 -
 17 files changed, 0 insertions(+), 66 deletions(-)

diff --git a/arch/arm/mm/mmap.c b/arch/arm/mm/mmap.c
index 10062ceadd1c..0c6356255fe3 100644
--- a/arch/arm/mm/mmap.c
+++ b/arch/arm/mm/mmap.c
@@ -181,11 +181,9 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
 	if (mmap_is_legacy()) {
 		mm->mmap_base = TASK_UNMAPPED_BASE + random_factor;
 		mm->get_unmapped_area = arch_get_unmapped_area;
-		mm->unmap_area = arch_unmap_area;
 	} else {
 		mm->mmap_base = mmap_base(random_factor);
 		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
-		mm->unmap_area = arch_unmap_area_topdown;
 	}
 }
 
diff --git a/arch/arm64/mm/mmap.c b/arch/arm64/mm/mmap.c
index 7c7be7855638..8ed6cb1a900f 100644
--- a/arch/arm64/mm/mmap.c
+++ b/arch/arm64/mm/mmap.c
@@ -90,11 +90,9 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
 	if (mmap_is_legacy()) {
 		mm->mmap_base = TASK_UNMAPPED_BASE;
 		mm->get_unmapped_area = arch_get_unmapped_area;
-		mm->unmap_area = arch_unmap_area;
 	} else {
 		mm->mmap_base = mmap_base();
 		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
-		mm->unmap_area = arch_unmap_area_topdown;
 	}
 }
 EXPORT_SYMBOL_GPL(arch_pick_mmap_layout);
diff --git a/arch/mips/mm/mmap.c b/arch/mips/mm/mmap.c
index d9be7540a6be..f4e63c29d044 100644
--- a/arch/mips/mm/mmap.c
+++ b/arch/mips/mm/mmap.c
@@ -158,11 +158,9 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
 	if (mmap_is_legacy()) {
 		mm->mmap_base = TASK_UNMAPPED_BASE + random_factor;
 		mm->get_unmapped_area = arch_get_unmapped_area;
-		mm->unmap_area = arch_unmap_area;
 	} else {
 		mm->mmap_base = mmap_base(random_factor);
 		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
-		mm->unmap_area = arch_unmap_area_topdown;
 	}
 }
 
diff --git a/arch/powerpc/mm/mmap_64.c b/arch/powerpc/mm/mmap_64.c
index 67a42ed0d2fc..cb8bdbe4972f 100644
--- a/arch/powerpc/mm/mmap_64.c
+++ b/arch/powerpc/mm/mmap_64.c
@@ -92,10 +92,8 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
 	if (mmap_is_legacy()) {
 		mm->mmap_base = TASK_UNMAPPED_BASE;
 		mm->get_unmapped_area = arch_get_unmapped_area;
-		mm->unmap_area = arch_unmap_area;
 	} else {
 		mm->mmap_base = mmap_base();
 		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
-		mm->unmap_area = arch_unmap_area_topdown;
 	}
 }
diff --git a/arch/s390/mm/mmap.c b/arch/s390/mm/mmap.c
index c59a5efa58b1..f2a462625c9e 100644
--- a/arch/s390/mm/mmap.c
+++ b/arch/s390/mm/mmap.c
@@ -91,11 +91,9 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
 	if (mmap_is_legacy()) {
 		mm->mmap_base = TASK_UNMAPPED_BASE;
 		mm->get_unmapped_area = arch_get_unmapped_area;
-		mm->unmap_area = arch_unmap_area;
 	} else {
 		mm->mmap_base = mmap_base();
 		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
-		mm->unmap_area = arch_unmap_area_topdown;
 	}
 }
 
@@ -173,11 +171,9 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
 	if (mmap_is_legacy()) {
 		mm->mmap_base = TASK_UNMAPPED_BASE;
 		mm->get_unmapped_area = s390_get_unmapped_area;
-		mm->unmap_area = arch_unmap_area;
 	} else {
 		mm->mmap_base = mmap_base();
 		mm->get_unmapped_area = s390_get_unmapped_area_topdown;
-		mm->unmap_area = arch_unmap_area_topdown;
 	}
 }
 
diff --git a/arch/sparc/kernel/sys_sparc_64.c b/arch/sparc/kernel/sys_sparc_64.c
index 708bc29d36a8..f3c169f9d3a1 100644
--- a/arch/sparc/kernel/sys_sparc_64.c
+++ b/arch/sparc/kernel/sys_sparc_64.c
@@ -290,7 +290,6 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
 	    sysctl_legacy_va_layout)
Re: [PATCH 0/8] vm_unmapped_area: finish the mission
Whoops, I was supposed to find a more appropriate subject line before
sending this :]

On Tue, Jan 8, 2013 at 5:28 PM, Michel Lespinasse wrote:
> These patches, which apply on top of v3.8-rc kernels, are to complete
> the VMA gap finding code I introduced (following Rik's initial
> proposal) in v3.8-rc1.

--
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
Re: [PATCH 7/8] mm: use vm_unmapped_area() on powerpc architecture
On Tue, 2013-01-08 at 17:28 -0800, Michel Lespinasse wrote:
> Update the powerpc slice_get_unmapped_area function to make use of
> vm_unmapped_area() instead of implementing a brute force search.
>
> Signed-off-by: Michel Lespinasse
>
> ---
>  arch/powerpc/mm/slice.c |  128 +-
>  1 files changed, 81 insertions(+), 47 deletions(-)

That doesn't look good ... the resulting code is longer than the
original, which makes me wonder how it is an improvement ...

Now it could just be a matter of how the code is factored, I see quite
a bit of duplication of the whole slice mask test ...

Cheers,
Ben.
Re: [PATCH 7/8] mm: use vm_unmapped_area() on powerpc architecture
On Tue, Jan 8, 2013 at 6:15 PM, Benjamin Herrenschmidt wrote:
> On Tue, 2013-01-08 at 17:28 -0800, Michel Lespinasse wrote:
>> Update the powerpc slice_get_unmapped_area function to make use of
>> vm_unmapped_area() instead of implementing a brute force search.
>>
>> Signed-off-by: Michel Lespinasse
>>
>> ---
>>  arch/powerpc/mm/slice.c |  128 +-
>>  1 files changed, 81 insertions(+), 47 deletions(-)
>
> That doesn't look good ... the resulting code is longer than the
> original, which makes me wonder how it is an improvement ...

Well no fair, the previous patch (for powerpc as well) has 22
insertions and 93 deletions :)

The benefit is that the new code has lower algorithmic complexity, it
replaces a per-vma loop with O(N) complexity with an outer loop that
finds contiguous slice blocks and passes them to vm_unmapped_area()
which is only O(log N) complexity. So the new code will be faster for
workloads which use lots of vmas.

That said, I do agree that the code that looks for contiguous
available slices looks kinda ugly - just not sure how to make it look
nicer though.

--
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
Re: [PATCH 5/5] kfifo: log based kfifo API
On Tue, Jan 08, 2013 at 10:16:46AM -0800, Dmitry Torokhov wrote:
> Hi Yuanhan,
>
> On Tue, Jan 08, 2013 at 10:57:53PM +0800, Yuanhan Liu wrote:
> > The current kfifo API takes the kfifo size as input, but rounds the
> > size _down_ to a power of 2 in __kfifo_alloc. This may introduce a
> > potential issue.
> >
> > Take the code at drivers/hid/hid-logitech-dj.c as an example:
> >
> >	if (kfifo_alloc(&djrcv_dev->notif_fifo,
> >			DJ_MAX_NUMBER_NOTIFICATIONS * sizeof(struct dj_report),
> >			GFP_KERNEL)) {
> >
> > Here DJ_MAX_NUMBER_NOTIFICATIONS is 8, and sizeof(struct dj_report)
> > is 15.
> >
> > That means it wants to allocate a kfifo buffer which can store 8
> > dj_report entries at once, so the expected kfifo buffer size would
> > be 8 * 15 = 120. But in the end __kfifo_alloc turns the size into
> > rounddown_power_of_2(120) = 64 and allocates a 64-byte buffer, which
> > I don't think is what the original author wanted.
> >
> > With the new log API, we can do the following:
> >
> >	int kfifo_size_order = order_base_2(DJ_MAX_NUMBER_NOTIFICATIONS *
> >					    sizeof(struct dj_report));
> >
> >	if (kfifo_alloc(&djrcv_dev->notif_fifo, kfifo_size_order, GFP_KERNEL)) {
> >
> > This makes sure we allocate a kfifo buffer big enough to hold
> > DJ_MAX_NUMBER_NOTIFICATIONS dj_report entries.
>
> Why don't you simply change __kfifo_alloc to round the allocation up
> instead of down?

Hi Dmitry,

Yes, it would be neat, and that was my first reaction as well. I then
sent out a patch, but it was NACKed by Stefani (the original kfifo
author). Here is the link:

https://lkml.org/lkml/2012/10/26/144

Stefani then proposed to change the API to take the log of the size
as input, to fix this kind of issue at the root. And here it is.

Thanks.

--yliu
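For context, the NACKed alternative was essentially a one-line change
in __kfifo_alloc, along these lines (a sketch based on the thread's
description of the 3.x-era rounddown behaviour; see the lkml link
above for the actual rejected patch):

	--- a/kernel/kfifo.c
	+++ b/kernel/kfifo.c
	@@ ... @@ int __kfifo_alloc(struct __kfifo *fifo, unsigned int size,
	-	size = rounddown_pow_of_two(size);
	+	size = roundup_pow_of_two(size);

The log-based API takes the opposite approach: instead of changing the
rounding behind the caller's back, it makes the power-of-2 size
explicit at every call site.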
Re: [PATCH 7/8] mm: use vm_unmapped_area() on powerpc architecture
On Tue, 2013-01-08 at 18:38 -0800, Michel Lespinasse wrote:
>
> Well no fair, the previous patch (for powerpc as well) has 22
> insertions and 93 deletions :)
>
> The benefit is that the new code has lower algorithmic complexity, it
> replaces a per-vma loop with O(N) complexity with an outer loop that
> finds contiguous slice blocks and passes them to vm_unmapped_area()
> which is only O(log N) complexity. So the new code will be faster for
> workloads which use lots of vmas.
>
> That said, I do agree that the code that looks for contiguous
> available slices looks kinda ugly - just not sure how to make it look
> nicer though.

Ok. I think at least you can move that construct:

+		if (addr < SLICE_LOW_TOP) {
+			slice = GET_LOW_SLICE_INDEX(addr);
+			addr = (slice + 1) << SLICE_LOW_SHIFT;
+			if (!(available.low_slices & (1u << slice)))
+				continue;
+		} else {
+			slice = GET_HIGH_SLICE_INDEX(addr);
+			addr = (slice + 1) << SLICE_HIGH_SHIFT;
+			if (!(available.high_slices & (1u << slice)))
+				continue;
+		}

Into some kind of helper. It will probably compile to the same thing
but at least it's more readable and it will avoid a fuckup in the
future if somebody changes the algorithm and forgets to update one of
the copies :-)

Cheers,
Ben.
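One possible shape for such a helper (hypothetical name and signature;
a sketch of the suggested refactoring against the existing slice.c
macros, not committed code). The 'continue' in the quoted construct
becomes a test on the return value, and the slice-skipping advance is
returned through an out parameter:

	/*
	 * Is the slice containing 'addr' available? Either way, store the
	 * start of the following slice in '*next' so callers can advance.
	 */
	static bool slice_available(unsigned long addr,
				    struct slice_mask available,
				    unsigned long *next)
	{
		unsigned long slice;

		if (addr < SLICE_LOW_TOP) {
			slice = GET_LOW_SLICE_INDEX(addr);
			*next = (slice + 1) << SLICE_LOW_SHIFT;
			return available.low_slices & (1u << slice);
		} else {
			slice = GET_HIGH_SLICE_INDEX(addr);
			*next = (slice + 1) << SLICE_HIGH_SHIFT;
			return available.high_slices & (1u << slice);
		}
	}

Both the bottom-up and top-down walkers (and the duplicated next_slice
scan) could then share this single copy of the slice mask test.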
[PATCH] powerpc/ptrace: make #defines for all request numbers hex
We have a mix of decimal and hex here, so let's make them consistently
hex. Also, strace will print them in hex if it can't decode them, so
having them in hex here makes it easier to match up.

No functional change.

Signed-off-by: Michael Neuling

diff --git a/arch/powerpc/include/uapi/asm/ptrace.h b/arch/powerpc/include/uapi/asm/ptrace.h
index ee67a2b..7e1584a 100644
--- a/arch/powerpc/include/uapi/asm/ptrace.h
+++ b/arch/powerpc/include/uapi/asm/ptrace.h
@@ -146,34 +146,34 @@ struct pt_regs {
  * structures. This also simplifies the implementation of a bi-arch
  * (combined (32- and 64-bit) gdb.
  */
-#define PTRACE_GETVRREGS	18
-#define PTRACE_SETVRREGS	19
+#define PTRACE_GETVRREGS	0x12
+#define PTRACE_SETVRREGS	0x13
 
 /* Get/set all the upper 32-bits of the SPE registers, accumulator, and
  * spefscr, in one go */
-#define PTRACE_GETEVRREGS	20
-#define PTRACE_SETEVRREGS	21
+#define PTRACE_GETEVRREGS	0x14
+#define PTRACE_SETEVRREGS	0x15
 
 /* Get the first 32 128bit VSX registers */
-#define PTRACE_GETVSRREGS	27
-#define PTRACE_SETVSRREGS	28
+#define PTRACE_GETVSRREGS	0x1b
+#define PTRACE_SETVSRREGS	0x1c
 
 /*
  * Get or set a debug register. The first 16 are DABR registers and the
  * second 16 are IABR registers.
 */
-#define PTRACE_GET_DEBUGREG	25
-#define PTRACE_SET_DEBUGREG	26
+#define PTRACE_GET_DEBUGREG	0x19
+#define PTRACE_SET_DEBUGREG	0x1a
 
 /* (new) PTRACE requests using the same numbers as x86 and the same
  * argument ordering. Additionally, they support more registers too
 */
-#define PTRACE_GETREGS		12
-#define PTRACE_SETREGS		13
-#define PTRACE_GETFPREGS	14
-#define PTRACE_SETFPREGS	15
-#define PTRACE_GETREGS64	22
-#define PTRACE_SETREGS64	23
+#define PTRACE_GETREGS		0xc
+#define PTRACE_SETREGS		0xd
+#define PTRACE_GETFPREGS	0xe
+#define PTRACE_SETFPREGS	0xf
+#define PTRACE_GETREGS64	0x16
+#define PTRACE_SETREGS64	0x17
 
 /* Calls to trace a 64bit program from a 32bit program */
 #define PPC_PTRACE_PEEKTEXT_3264	0x95
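Since the patch claims no functional change, here is a quick
stand-alone check that the decimal-to-hex renumbering is
value-preserving (a user-space sketch, not part of the patch):

	#include <assert.h>

	int main(void)
	{
		/* decimal request numbers before the patch == hex after */
		assert(18 == 0x12 && 19 == 0x13);	/* PTRACE_GET/SETVRREGS */
		assert(20 == 0x14 && 21 == 0x15);	/* PTRACE_GET/SETEVRREGS */
		assert(27 == 0x1b && 28 == 0x1c);	/* PTRACE_GET/SETVSRREGS */
		assert(25 == 0x19 && 26 == 0x1a);	/* PTRACE_GET/SET_DEBUGREG */
		assert(12 == 0xc && 13 == 0xd);		/* PTRACE_GET/SETREGS */
		assert(14 == 0xe && 15 == 0xf);		/* PTRACE_GET/SETFPREGS */
		assert(22 == 0x16 && 23 == 0x17);	/* PTRACE_GET/SETREGS64 */
		return 0;
	}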
RE: [PATCH] Added device tree binding for TDM and TDM phy
A gentle reminder. Any comments are appreciated.

Regards,
Sandeep

> -----Original Message-----
> From: Singh Sandeep-B37400
> Sent: Wednesday, January 02, 2013 6:55 PM
> To: devicetree-disc...@lists.ozlabs.org; linuxppc-...@ozlabs.org
> Cc: Singh Sandeep-B37400; Aggrwal Poonam-B10812
> Subject: [PATCH] Added device tree binding for TDM and TDM phy
>
> This controller is available on many Freescale SOCs like MPC8315,
> P1020, P1010 and P1022.
>
> Signed-off-by: Sandeep Singh
> Signed-off-by: Poonam Aggrwal
> ---
>  .../devicetree/bindings/powerpc/fsl/fsl-tdm.txt    |   63
>  .../devicetree/bindings/powerpc/fsl/tdm-phy.txt    |   38
>  2 files changed, 101 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/fsl-tdm.txt
>  create mode 100644 Documentation/devicetree/bindings/powerpc/fsl/tdm-phy.txt
>
> diff --git a/Documentation/devicetree/bindings/powerpc/fsl/fsl-tdm.txt
> b/Documentation/devicetree/bindings/powerpc/fsl/fsl-tdm.txt
> new file mode 100644
> index 0000000..ceb2ef1
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/powerpc/fsl/fsl-tdm.txt
> @@ -0,0 +1,63 @@
> +TDM Device Tree Binding
> +
> +NOTE: The bindings described in this document are preliminary and
> +subject to change.
> +
> +TDM (Time Division Multiplexing)
> +
> +Description:
> +
> +The TDM is a full duplex serial port designed to allow various devices,
> +including digital signal processors (DSPs), to communicate with a
> +variety of serial devices including industry standard framers, codecs,
> +other DSPs and microprocessors.
> +
> +The below properties describe the device tree bindings for the Freescale
> +TDM controller. This TDM controller is available on various Freescale
> +processors like MPC8315, P1020, P1022 and P1010.
> +
> +Required properties:
> +
> +- compatible
> +    Value type: <string>
> +    Definition: Should contain "fsl,tdm1.0".
> +
> +- reg
> +    Definition: A standard property. The first reg specifier describes
> +    the TDM registers, and the second describes the TDM DMAC registers.
> +
> +- tdm_tx_clk
> +    Value type: <u32>
> +    Definition: This specifies the value of the transmit clock. It
> +    should not exceed 50MHz.
> +
> +- tdm_rx_clk
> +    Value type: <u32>
> +    Definition: This specifies the value of the receive clock. Its value
> +    could be zero, in which case the TDM will operate in shared mode.
> +    Its value should not exceed 50MHz.
> +
> +- interrupts
> +    Definition: Two interrupt specifiers. The first is TDM error, and
> +    the second is TDM DMAC.
> +
> +- phy-handle
> +    Value type: <phandle>
> +    Definition: Phandle of the line controller node or framer node,
> +    e.g. SLIC, E1/T1 etc. (Refer to
> +    Documentation/devicetree/bindings/powerpc/fsl/tdm-phy.txt)
> +
> +- fsl,max-time-slots
> +    Value type: <u32>
> +    Definition: Maximum number of 8-bit time slots in one TDM frame.
> +    This is the maximum number which the TDM hardware supports.
> +
> +Example:
> +
> +	tdm@16000 {
> +		compatible = "fsl,tdm1.0";
> +		reg = <0x16000 0x200 0x2c000 0x2000>;
> +		tdm_tx_clk = <2048000>;
> +		tdm_rx_clk = <0>;
> +		interrupts = <16 8 62 8>;
> +		phy-handle = <&tdm-phy>;
> +		fsl,max-time-slots = <128>;
> +	};
> diff --git a/Documentation/devicetree/bindings/powerpc/fsl/tdm-phy.txt
> b/Documentation/devicetree/bindings/powerpc/fsl/tdm-phy.txt
> new file mode 100644
> index 0000000..2563934
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/powerpc/fsl/tdm-phy.txt
> @@ -0,0 +1,38 @@
> +TDM PHY Device Tree Binding
> +
> +NOTE: The bindings described in this document are preliminary and
> +subject to change.
> +
> +Description:
> +TDM PHY is the terminal interface of the TDM subsystem. It is typically
> +a line control device like an E1/T1 framer or SLIC. A TDM device can
> +have multiple TDM PHYs.
> +
> +Required properties:
> +
> +- compatible
> +    Value type: <string>
> +    Definition: Should contain a generic compatible like "tdm-phy-slic",
> +    "tdm-phy-e1" or "tdm-phy-t1".
> +
> +- max-num-ports
> +    Value type: <u32>
> +    Definition: Defines the maximum number of ports supported by the
> +    SLIC device. Only required if the device is a SLIC. For E1/T1
> +    devices the number of ports is predefined, i.e. 24 in case of T1
> +    and 32 in case of E1.
> +
> +Apart from the above, there may be other properties required because of
> +the bus/interface this device is connected on. It could be SPI/local
> +bus, etc.
> +
> +Example:
> +
> +	tdm-phy@0 {
> +		compatible = "zarlink,le88266","tdm-phy-slic";
> +		reg = <0>;
> +		max-num-ports = <4>;
> +		spi-max-frequency = <800>;
> +	};
> +
> +In the above example, properties "reg" and "spi-max-frequency" are SPI
> +specific, as the SLIC device is connected on an SPI interface. These
> +properties might vary depending on the specific interface the