from:"Adam Litke"

[PATCH] hugetlb: follow_hugetlb_page for write access

2007-11-07 Thread Adam Litke


When calling get_user_pages(), a write flag is passed in by the caller to
indicate if write access is required on the faulted-in pages.  Currently,
follow_hugetlb_page() ignores this flag and always faults pages for
read-only access.  This can cause data corruption because a device driver
that calls get_user_pages() with write set will not expect COW faults to
occur on the returned pages.

This patch passes the write flag down to follow_hugetlb_page() and makes
sure hugetlb_fault() is called with the right write_access parameter.

Signed-off-by: Adam Litke <[EMAIL PROTECTED]>
---

 include/linux/hugetlb.h |    2 +-
 mm/hugetlb.c            |    5 +++--
 mm/memory.c             |    2 +-
 3 files changed, 5 insertions(+), 4 deletions(-)
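
For readers who want to see the failure mode in isolation, here is a minimal userspace sketch of it.  It is a toy model, not kernel code: struct toy_pte, toy_fault() and toy_follow_page() are names made up for illustration, and the "writable" bit only stands in for whatever the real fault path sets up.  It assumes that a read fault leaves the entry read-only, so a later store would take a COW fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_pte {
	bool present;	/* a page has been faulted in */
	bool writable;	/* stores will not take a further (COW) fault */
};

/* Stand-in for hugetlb_fault(): only a write fault yields a writable entry. */
static void toy_fault(struct toy_pte *pte, int write_access)
{
	pte->present = true;
	pte->writable = (write_access != 0);
}

/*
 * Stand-in for follow_hugetlb_page().  forward_flag selects the behaviour:
 * false models the old code (always faults with write_access = 0), true
 * models the patched code (passes the caller's write flag through).
 * Returns true if a later store by the caller would still hit a COW fault.
 */
static bool toy_follow_page(struct toy_pte *pte, int write, bool forward_flag)
{
	assert(pte != NULL);
	if (!pte->present)
		toy_fault(pte, forward_flag ? write : 0);
	return write && !pte->writable;
}
```

With forward_flag false, a caller that asked for write access gets back a page that is still read-only; with forward_flag true it does not.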

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 3a19b03..31fa0a0 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -19,7 +19,7 @@ static inline int is_vm_hugetlb_page(struct vm_area_struct *vma)
 int hugetlb_sysctl_handler(struct ctl_table *, int, struct file *, void __user *, size_t *, loff_t *);
 int hugetlb_treat_movable_handler(struct ctl_table *, int, struct file *, void __user *, size_t *, loff_t *);
 int copy_hugetlb_page_range(struct mm_struct *, struct mm_struct *, struct vm_area_struct *);
-int follow_hugetlb_page(struct mm_struct *, struct vm_area_struct *, struct page **, struct vm_area_struct **, unsigned long *, int *, int);
+int follow_hugetlb_page(struct mm_struct *, struct vm_area_struct *, struct page **, struct vm_area_struct **, unsigned long *, int *, int, int);
 void unmap_hugepage_range(struct vm_area_struct *, unsigned long, unsigned long);
 void __unmap_hugepage_range(struct vm_area_struct *, unsigned long, unsigned long);
 int hugetlb_prefault(struct address_space *, struct vm_area_struct *);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index eab8c42..b645985 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -621,7 +621,8 @@ int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 
 int follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 			struct page **pages, struct vm_area_struct **vmas,
-			unsigned long *position, int *length, int i)
+			unsigned long *position, int *length, int i,
+			int write)
 {
 	unsigned long pfn_offset;
 	unsigned long vaddr = *position;
@@ -643,7 +644,7 @@ int follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 			int ret;
 
 			spin_unlock(&mm->page_table_lock);
-			ret = hugetlb_fault(mm, vma, vaddr, 0);
+			ret = hugetlb_fault(mm, vma, vaddr, write);
 			spin_lock(&mm->page_table_lock);
 			if (!(ret & VM_FAULT_ERROR))
 				continue;
diff --git a/mm/memory.c b/mm/memory.c
index f82b359..1bcd444 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1039,7 +1039,7 @@ int get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 
 		if (is_vm_hugetlb_page(vma)) {
 			i = follow_hugetlb_page(mm, vma, pages, vmas,
-						&start, &len, i);
+						&start, &len, i, write);
 			continue;
 		}
 
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

[Documentation] Page Table Layout diagrams

2007-08-08 Thread Adam Litke

Hello all.  In an effort to understand how the page tables are laid out
across various architectures, I put together some diagrams.  I have
posted them on the linux-mm wiki: http://linux-mm.org/PageTableStructure
and I hope they will be useful to others.

Just to make sure I am not spreading misinformation, could a few of you
experts take a quick look at the three diagrams I've got finished so far
and point out any errors I have made?  Thanks.

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center


[RFC PATCH 0/2] Merge HUGETLB_PAGE and HUGETLBFS Kconfig options

2008-06-12 Thread Adam Litke

There are currently two global Kconfig options that enable/disable the
hugetlb code: CONFIG_HUGETLB_PAGE and CONFIG_HUGETLBFS.  This may have
made sense before hugetlbfs became ubiquitous, but now the pair of
options is redundant.  Merging these two options into one will simplify
the code slightly and will, more importantly, avoid confusion and
questions like: Which hugetlbfs CONFIG option should my code depend on?

CONFIG_HUGETLB_PAGE is aliased to the value of CONFIG_HUGETLBFS, so one
option can be removed without any effect.  The first patch merges the
two options into one option: CONFIG_HUGETLB.  The second patch updates
the defconfigs to set the one new option appropriately.

I have cross-compiled this on i386, x86_64, ia64, powerpc, sparc64 and
sh with the option enabled and disabled.  This is completely mechanical
but, due to the large number of files affected (especially defconfigs),
could do well with a review from several sets of eyeballs.  Thanks.

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center


[RFC PATCH 1/2] Merge options into CONFIG_HUGETLB

2008-06-12 Thread Adam Litke

Merge CONFIG_HUGETLB_PAGE and CONFIG_HUGETLBFS into one new config option:
CONFIG_HUGETLB.  CONFIG_HUGETLB_PAGE is aliased to the value of
CONFIG_HUGETLBFS, so one option can be removed without any effect.  This change
is pretty mechanical, but a little extra verification from arch maintainers
would be very helpful.  Thanks.

Signed-off-by: Adam Litke <[EMAIL PROTECTED]>

---

 Documentation/vm/hugetlbpage.txt   |6 ++
 arch/arm/mm/consistent.c   |2 +-
 arch/avr32/mm/dma-coherent.c   |2 +-
 arch/ia64/Kconfig  |8 
 arch/ia64/kernel/ivt.S |6 +++---
 arch/ia64/kernel/sys_ia64.c|2 +-
 arch/ia64/mm/Makefile  |2 +-
 arch/ia64/mm/init.c|2 +-
 arch/powerpc/Kconfig   |2 +-
 arch/powerpc/mm/Makefile   |2 +-
 arch/powerpc/mm/hash_utils_64.c|   10 +-
 arch/powerpc/mm/init_64.c  |2 +-
 arch/powerpc/mm/tlb_64.c   |2 +-
 arch/powerpc/platforms/Kconfig.cputype |2 +-
 arch/s390/mm/Makefile  |2 +-
 arch/sh/mm/Kconfig |2 +-
 arch/sh/mm/Makefile_32 |2 +-
 arch/sh/mm/Makefile_64 |2 +-
 arch/sparc64/Kconfig   |2 +-
 arch/sparc64/kernel/sun4v_tlb_miss.S   |2 +-
 arch/sparc64/kernel/tsb.S  |4 ++--
 arch/sparc64/mm/Makefile   |2 +-
 arch/sparc64/mm/fault.c|4 ++--
 arch/sparc64/mm/init.c |2 +-
 arch/sparc64/mm/tsb.c  |   14 +++---
 arch/x86/mm/Makefile   |2 +-
 fs/Kconfig |5 +
 fs/Makefile|2 +-
 fs/hugetlbfs/Makefile  |2 +-
 include/asm-ia64/mmu_context.h |2 +-
 include/asm-ia64/page.h|6 +++---
 include/asm-ia64/pgtable.h |2 +-
 include/asm-mn10300/page.h |2 +-
 include/asm-parisc/page.h  |2 +-
 include/asm-powerpc/mmu-hash64.h   |4 ++--
 include/asm-powerpc/page_64.h  |6 +++---
 include/asm-powerpc/pgtable-ppc64.h|2 +-
 include/asm-sh/page.h  |2 +-
 include/asm-sparc64/mmu.h  |2 +-
 include/asm-sparc64/mmu_context.h  |2 +-
 include/asm-sparc64/page.h |2 +-
 include/asm-sparc64/pgtable.h  |2 +-
 include/asm-x86/page_32.h  |2 +-
 include/linux/hugetlb.h|   12 ++--
 include/linux/pageblock-flags.h|6 +++---
 include/linux/vmstat.h |2 +-
 kernel/sysctl.c|2 +-
 mm/Makefile|2 +-
 mm/mempolicy.c |4 ++--
 mm/vmstat.c|2 +-
 50 files changed, 81 insertions(+), 86 deletions(-)

diff --git a/Documentation/vm/hugetlbpage.txt b/Documentation/vm/hugetlbpage.txt
index 3102b81..53604e9 100644
--- a/Documentation/vm/hugetlbpage.txt
+++ b/Documentation/vm/hugetlbpage.txt
@@ -13,10 +13,8 @@ This optimization is more critical now as bigger and bigger physical memories
 Users can use the huge page support in Linux kernel by either using the mmap
 system call or standard SYSv shared memory system calls (shmget, shmat).
 
-First the Linux kernel needs to be built with the CONFIG_HUGETLBFS
-(present under "File systems") and CONFIG_HUGETLB_PAGE (selected
-automatically when CONFIG_HUGETLBFS is selected) configuration
-options.
+First the Linux kernel needs to be built with the CONFIG_HUGETLB
+(present under "File systems") configuration option.
 
 The kernel built with hugepage support should show the number of configured
 hugepages in the system by running the "cat /proc/meminfo" command.
diff --git a/arch/arm/mm/consistent.c b/arch/arm/mm/consistent.c
index 333a82a..5931192 100644
--- a/arch/arm/mm/consistent.c
+++ b/arch/arm/mm/consistent.c
@@ -140,7 +140,7 @@ static struct vm_region *vm_region_find(struct vm_region *head, unsigned long ad
 	return c;
 }
 
-#ifdef CONFIG_HUGETLB_PAGE
+#ifdef CONFIG_HUGETLB
 #error ARM Coherent DMA allocator does not (yet) support huge TLB
 #endif
 
diff --git a/arch/avr32/mm/dma-coherent.c b/arch/avr32/mm/dma-coherent.c
index 6d8c794..0cf57c6 100644
--- a/arch/avr32/mm/dma-coherent.c
+++ b/arch/avr32/mm/dma-coherent.c
@@ -45,7 +45,7 @@ static struct page *__dma_alloc(struct device *dev, size_t size,
 	 * with __GFP_COMP being passed to split_page() which cannot
 	 * handle them.  The real problem is that this flag probably
 	 * should be 0 on AVR32 as it is not supported on this
-	 * platform--see CONFIG_HUGETLB_PAGE. */
+	 * platform--see CONFIG_HUGETLB. */
 	gfp &= ~(__GFP_COMP);
 
 	size = PAGE_ALIGN(size);
diff --git

Re: [RFC PATCH 0/2] Merge HUGETLB_PAGE and HUGETLBFS Kconfig options

2008-06-13 Thread Adam Litke

On Fri, 2008-06-13 at 14:46 +0100, Ralf Baechle wrote:
> MIPS doesn't do HUGETLB (at least not in-tree atm) so I'm not sure why
> [EMAIL PROTECTED] was cc'ed at all.  So feel free to add my
> Couldnt-care-less: ack line ;-)

Sorry :)  My patches touched your defconfigs so I felt it prudent to
include the mips list as an FYI.

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center


Re: [RFC PATCH 2/2] Update defconfigs for CONFIG_HUGETLB

2008-06-13 Thread Adam Litke

On Thu, 2008-06-12 at 22:36 +0300, Adrian Bunk wrote:
> On Thu, Jun 12, 2008 at 02:55:45PM -0400, Adam Litke wrote:
> > Update all defconfigs that specify a default configuration for hugetlbfs.
> > There is now only one option: CONFIG_HUGETLB.  Replace the old
> > CONFIG_HUGETLB_PAGE and CONFIG_HUGETLBFS options with the new one.  I found no
> > cases where CONFIG_HUGETLBFS and CONFIG_HUGETLB_PAGE had different values so
> > this patch is large but completely mechanical:
> >...
> >  335 files changed, 335 insertions(+), 385 deletions(-)
> >...
> 
> Please don't do this kind of patches - it doesn't bring any advantage 
> but can create tons of patch conflicts.
> 
> The next time a defconfig gets updated it will anyway automatically be 
> fixed, and for defconfigs that aren't updated it doesn't create any 
> problems to keep them as they are today until they might one day get 
> updated.

Thanks for taking a look.  I am not sure if I have ever seen a defconfig
patch hit the mailing list before and I was wondering how those changes
happen.  In any case I am perfectly happy to drop this huge patch and
stick with just the first one.

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center


Re: [Libhugetlbfs-devel] libhugetlbfs: Test case for powerpc huge_ptep_set_wrprotect() bug

2008-07-07 Thread Adam Litke

On Mon, 2008-07-07 at 17:19 +1000, David Gibson wrote:
> Until very recently (in fact, even now in mainline) powerpc kernels
> had a bug in huge_ptep_set_wrprotect() which meant the 'huge' flag was
> not passed down to pte_update() and hpte_need_flush().  This meant the
> hash ptes for hugepages would not be correctly flushed on fork(),
> allowing the parent to pollute the child's mapping after the fork().
> 
> This patch adds a testcase to libhugetlbfs for this behaviour, also
> doing some other checking of the COW semantics over a fork().
> 
> Signed-off-by: David Gibson <[EMAIL PROTECTED]>
Good test David, thanks...
Acked-by: Adam Litke <[EMAIL PROTECTED]>

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center


Re: [BUG] 2.6.25-rc3-mm1 kernel bug while running libhugetlbfs

2008-03-04 Thread Adam Litke

On Tue, 2008-03-04 at 11:51 -0800, Andrew Morton wrote:
> hugetlb-correct-page-count-for-surplus-huge-pages.patch adds:
> 
> 	if (page) {
> 		/*
> 		 * This page is now managed by the hugetlb allocator and has
> 		 * no users -- drop the buddy allocator's reference.
> 		 */
> 		int page_count = put_page_testzero(page);
> 		BUG_ON(page_count != 0);
> 
> 

Ugh, I got bitten by put_page_testzero().  It returns 1 when the page
count drops to zero; it does not return the page count itself.
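
To spell out the semantics, here is a small userspace sketch using C11 atomics.  It is only a model: struct toy_page, toy_put_testzero() and toy_page_count() are made-up names, and the only thing assumed about the real put_page_testzero() is the contract described above (drop one reference, return true if the count reached zero).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy page with only a reference count; not the kernel's struct page. */
struct toy_page {
	atomic_int count;
};

/*
 * Models the contract of put_page_testzero(): drop one reference and
 * return true if the count reached zero, false otherwise.
 */
static bool toy_put_testzero(struct toy_page *page)
{
	/* atomic_fetch_sub() returns the value held before the decrement */
	int old = atomic_fetch_sub(&page->count, 1);

	assert(old > 0);	/* dropping a reference nobody holds is a bug */
	return old == 1;
}

/* Separate accessor for the current count. */
static int toy_page_count(struct toy_page *page)
{
	return atomic_load(&page->count);
}
```

Storing the return value in a variable named page_count and then checking page_count != 0 therefore fires exactly when the last reference was dropped, which is the success case.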

My initial version had a BUG_ON() with side-effects.  When a reviewer
pointed it out, I thought I could fix the patch up on its way out the
door.  I have self-administered my punishment.  This patch will fix it:

Signed-off-by: Adam Litke <[EMAIL PROTECTED]>

--- mm/hugetlb.c.orig	2008-03-04 13:36:30.0 -0800
+++ mm/hugetlb.c	2008-03-04 13:39:30.0 -0800
@@ -291,8 +291,8 @@ static struct page *alloc_buddy_huge_pag
 		 * This page is now managed by the hugetlb allocator and has
 		 * no users -- drop the buddy allocator's reference.
 		 */
-		int page_count = put_page_testzero(page);
-		BUG_ON(page_count != 0);
+		put_page_testzero(page);
+		VM_BUG_ON(page_count(page));
 		nid = page_to_nid(page);
 		set_compound_page_dtor(page, free_huge_page);
 		/*
 
-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center


Re: [PATCH] properly reserve in bootmem the lmb reserved regions that cross numa nodes

2008-09-30 Thread Adam Litke

This seems like the right approach to me.  I have pointed out a few
stylistic issues below.

On Tue, 2008-09-30 at 09:53 -0500, Jon Tollefson wrote:

> +	/* Mark reserved regions */
> +	for (i = 0; i < lmb.reserved.cnt; i++) {
> +		unsigned long physbase = lmb.reserved.region[i].base;
> +		unsigned long size = lmb.reserved.region[i].size;
> +		unsigned long start_pfn = physbase >> PAGE_SHIFT;
> +		unsigned long end_pfn = ((physbase+size-1) >> PAGE_SHIFT);

CodingStyle dictates that this should be:
unsigned long end_pfn = ((physbase + size - 1) >> PAGE_SHIFT);
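
As an aside on what that expression computes: subtracting 1 before the shift gives the last page frame the region touches (inclusive), rather than one past it.  A minimal userspace sketch, assuming 4 KiB pages; TOY_PAGE_SHIFT and both helper names are made up for illustration:

```c
#include <assert.h>

#define TOY_PAGE_SHIFT 12	/* assumption: 4 KiB pages, for illustration */

/* First page frame touched by a region starting at physbase. */
static unsigned long region_start_pfn(unsigned long physbase)
{
	return physbase >> TOY_PAGE_SHIFT;
}

/* Last page frame touched (inclusive) by [physbase, physbase + size). */
static unsigned long region_end_pfn(unsigned long physbase, unsigned long size)
{
	assert(size > 0);	/* size - 1 would wrap for an empty region */
	return (physbase + size - 1) >> TOY_PAGE_SHIFT;
}
```

For example, a 0x1000-byte region starting at 0x1800 straddles two frames: it starts in pfn 1 and ends in pfn 2.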



> +/**
> + * get_node_active_region - Return active region containing start_pfn
> + * @start_pfn The page to return the region for.
> + *
> + * It will return NULL if active region is not found.
> + */
> +struct node_active_region *get_node_active_region(
> + unsigned long start_pfn)

Bad style.  I think the convention would be to write it like this:

struct node_active_region *
get_node_active_region(unsigned long start_pfn)

> +{
> +	int i;
> +	for (i = 0; i < nr_nodemap_entries; i++) {
> +		unsigned long node_start_pfn = early_node_map[i].start_pfn;
> +		unsigned long node_end_pfn = early_node_map[i].end_pfn;
> +
> +		if (node_start_pfn <= start_pfn && node_end_pfn > start_pfn)
> +			return &early_node_map[i];
> +	}
> +	return NULL;
> +}

Since this is using the early_node_map[], should we mark the function
__meminit?
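
For reference, the lookup quoted above can be tried out in userspace with a sketch like the following.  It is not the kernel function: struct toy_region and toy_find_region() are made-up names, the map is passed in rather than read from early_node_map[], and end_pfn is assumed to be exclusive, as the quoted comparison implies.  The declaration uses the layout suggested earlier, with the return type on its own line.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct node_active_region; fields mirror the quoted code. */
struct toy_region {
	unsigned long start_pfn;	/* first pfn in the region */
	unsigned long end_pfn;		/* one past the last pfn */
	int nid;			/* node the region belongs to */
};

/* Return the region containing pfn, or NULL if no region contains it. */
static const struct toy_region *
toy_find_region(const struct toy_region *map, size_t nr, unsigned long pfn)
{
	size_t i;

	assert(map != NULL || nr == 0);
	for (i = 0; i < nr; i++) {
		if (map[i].start_pfn <= pfn && pfn < map[i].end_pfn)
			return &map[i];
	}
	return NULL;
}
```

A pfn equal to a region's end_pfn belongs to the next region, if there is one, and a pfn past the last region yields NULL.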

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center

