Re: [RFC] Add mempressure cgroup

2012-11-28 Thread Kirill A. Shutemov
l for API to serve multiple users. > + > + ret = -EINVAL; > + for (i = 0; i < VMPRESSURE_NUM_LEVELS; i++) { > + if (strcmp(vmpressure_str_levels[i], args)) > + continue; > + mpc->eventfd = eventfd; > + mpc->

Re: [PATCHv2, RFC 00/30] Transparent huge page cache

2013-03-18 Thread Kirill A. Shutemov
Simon Jeons wrote: > On 03/18/2013 12:03 PM, Simon Jeons wrote: > > Hi Kirill, > > On 03/15/2013 01:50 AM, Kirill A. Shutemov wrote: > >> From: "Kirill A. Shutemov" > >> > >> Here's the second version of the patchset. > >> > &

Re: [PATCHv2, RFC 00/30] Transparent huge page cache

2013-03-18 Thread Kirill A. Shutemov
Simon Jeons wrote: > Hi Kirill, > On 03/18/2013 07:19 PM, Kirill A. Shutemov wrote: > > Simon Jeons wrote: > >> On 03/18/2013 12:03 PM, Simon Jeons wrote: > >>> Hi Kirill, > >>> On 03/15/2013 01:50 AM, Kirill A. Shutemov wrote: > >>>>

Re: [PATCH 1/4] eventfd: introduce eventfd_signal_hangup()

2013-02-04 Thread Kirill A. Shutemov
On Sat, Feb 02, 2013 at 05:58:58PM +0200, Kirill A. Shutemov wrote: > On Sat, Feb 02, 2013 at 02:50:44PM +0800, Li Zefan wrote: > > When an eventfd is closed, a wakeup with POLLHUP will be issued, > > but cgroup wants to issue wakeup explicitly, so when a cgroup is > > rem

Re: [PATCH 1/4] eventfd: introduce eventfd_signal_hangup()

2013-02-05 Thread Kirill A. Shutemov
On Tue, Feb 05, 2013 at 11:40:50AM +0800, Li Zefan wrote: > On 2013/2/4 18:15, Kirill A. Shutemov wrote: > > On Sat, Feb 02, 2013 at 05:58:58PM +0200, Kirill A. Shutemov wrote: > >> On Sat, Feb 02, 2013 at 02:50:44PM +0800, Li Zefan wrote: > >>> When an eventfd is

Re: [PATCH 1/4] eventfd: introduce eventfd_signal_hangup()

2013-02-06 Thread Kirill A. Shutemov
On Wed, Feb 06, 2013 at 09:48:20AM +0800, Li Zefan wrote: > On 2013/2/5 16:28, Kirill A. Shutemov wrote: > > On Tue, Feb 05, 2013 at 11:40:50AM +0800, Li Zefan wrote: > >> On 2013/2/4 18:15, Kirill A. Shutemov wrote: > >>> On Sat, Feb 02, 2013 at 05:58:58PM +

Re: [PATCHv2, RFC 07/30] thp, mm: introduce mapping_can_have_hugepages() predicate

2013-04-02 Thread Kirill A. Shutemov
Dave Hansen wrote: > On 03/22/2013 03:12 AM, Kirill A. Shutemov wrote: > > Dave Hansen wrote: > >> On 03/14/2013 10:50 AM, Kirill A. Shutemov wrote: > >>> +static inline bool mapping_can_have_hugepages(struct address_space *m) > >>> +{ > >

RE: [PATCHv2, RFC 20/30] ramfs: enable transparent huge page cache

2013-04-02 Thread Kirill A. Shutemov
Kirill A. Shutemov wrote: > From: "Kirill A. Shutemov" > > ramfs is the simplest fs from the page cache point of view. Let's start > transparent huge page cache enabling here. > > For now we allocate only non-movable huge pages. It's not yet clear if > m

Re: mm: BUG in do_huge_pmd_wp_page

2013-04-04 Thread Kirill A. Shutemov
Sasha Levin wrote: > Ping? I'm seeing a whole bunch of these with current -next. Do you have a way to reproduce? -- Kirill A. Shutemov

Re: mm: BUG in do_huge_pmd_wp_page

2013-04-04 Thread Kirill A. Shutemov
Sasha Levin wrote: > On 04/04/2013 10:30 AM, Kirill A. Shutemov wrote: > > Sasha Levin wrote: > >> Ping? I'm seeing a whole bunch of these with current -next. > > > > Do you have a way to reproduce? > > Not really, trinity just manages to make it happen

[PATCHv3, RFC 17/34] thp, mm: implement grab_thp_write_begin()

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" The function is grab_cache_page_write_begin() twin but it tries to allocate huge page at given position aligned to HPAGE_CACHE_NR. If, for some reason, it's not possible allocate a huge page at this possition, it returns NULL. Caller should take care

[PATCHv3, RFC 24/34] ramfs: enable transparent huge page cache

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" ramfs is the most simple fs from page cache point of view. Let's start transparent huge page cache enabling here. For now we allocate only non-movable huge page. ramfs pages cannot be moved yet. Signed-off-by: Kirill A. Shutemov --- fs/ramfs

[PATCHv3, RFC 25/34] x86-64, mm: proper alignment mappings with hugepages

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Make arch_get_unmapped_area() return unmapped area aligned to HPAGE_MASK if the file mapping can have huge pages. Signed-off-by: Kirill A. Shutemov --- arch/x86/kernel/sys_x86_64.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) di

[PATCHv3, RFC 21/34] thp: wait_split_huge_page(): serialize over i_mmap_mutex too

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Since we're going to have huge pages backed by files, wait_split_huge_page() has to serialize not only over anon_vma_lock, but over i_mmap_mutex too. Signed-off-by: Kirill A. Shutemov --- include/linux/huge_mm.h | 15 --- mm/huge_memory.

[PATCHv3, RFC 20/34] thp: handle file pages in split_huge_page()

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" The base scheme is the same as for anonymous pages, but we walk by mapping->i_mmap rather then anon_vma->rb_root. __split_huge_page_refcount() has been tunned a bit: we need to transfer PG_swapbacked to tail pages. Splitting mapped pages haven't

[PATCHv3, RFC 16/34] thp, mm: add event counters for huge page alloc on write to a file

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Existing stats specify source of thp page: fault or collapse. We're going allocate a new huge page with write(2). It's nither fault nor collapse. Let's introduce new events for that. Signed-off-by: Kirill A. Shutemov --- include/linux/vm_e

[PATCHv3, RFC 07/34] thp, mm: basic defines for transparent huge page cache

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Signed-off-by: Kirill A. Shutemov --- include/linux/huge_mm.h |8 1 file changed, 8 insertions(+) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index ee1c244..a54939c 100644 --- a/include/linux/huge_mm.h +++ b/include/linux

[PATCHv3, RFC 15/34] thp, mm: handle tail pages in page_cache_get_speculative()

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" For tail page we call __get_page_tail(). It has the same semantics, but for tail page. Signed-off-by: Kirill A. Shutemov --- include/linux/pagemap.h |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/include/linux/pagemap.h b/inc

[PATCHv3, RFC 31/34] thp: initial implementation of do_huge_linear_fault()

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" The function tries to create a new page mapping using huge pages. It only called for not yet mapped pages. As usual in THP, we fallback to small pages if we fail to allocate huge page. Signed-off-by: Kirill A. Shutemov --- include/linux/huge_mm.h |

[PATCHv3, RFC 32/34] thp: handle write-protect exception to file-backed huge pages

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Signed-off-by: Kirill A. Shutemov --- mm/huge_memory.c | 69 -- 1 file changed, 67 insertions(+), 2 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index ed4389b..6dde87f 10064

[PATCHv3, RFC 08/34] thp, mm: introduce mapping_can_have_hugepages() predicate

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Returns true if mapping can have huge pages. Just check for __GFP_COMP in gfp mask of the mapping for now. Signed-off-by: Kirill A. Shutemov --- include/linux/pagemap.h | 11 +++ 1 file changed, 11 insertions(+) diff --git a/include/linux/p

[PATCHv3, RFC 33/34] thp: call __vma_adjust_trans_huge() for file-backed VMA

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Since we're going to have huge pages in page cache, we need to call __vma_adjust_trans_huge() for file-backed VMA, which potentially can contain huge pages. For now we call it for all VMAs with vm_ops->huge_fault defined. Probably later we will n

[PATCHv3, RFC 34/34] thp: map file-backed huge pages on fault

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Look like all pieces are in place, we can map file-backed huge-pages now. Signed-off-by: Kirill A. Shutemov --- include/linux/huge_mm.h |4 +++- mm/memory.c |1 + 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/inc

[PATCHv3, RFC 30/34] thp: extract fallback path from do_huge_pmd_anonymous_page() to a function

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" The same fallback path will be reused by non-anonymous pages, so lets' extract it in separate function. Signed-off-by: Kirill A. Shutemov --- mm/huge_memory.c | 112 -- 1 file changed, 59 in

[PATCHv3, RFC 28/34] thp: move maybe_pmd_mkwrite() out of mk_huge_pmd()

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" It's confusing that mk_huge_pmd() has sematics different from mk_pte() or mk_pmd(). Let's move maybe_pmd_mkwrite() out of mk_huge_pmd() and adjust prototype to match mk_pte(). Signed-off-by: Kirill A. Shutemov --- mm/huge_memory.c | 14 ++

[PATCHv3, RFC 29/34] thp, mm: basic huge_fault implementation for generic_file_vm_ops

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" It provide enough functionality for simple cases like ramfs. Need to be extended later. Signed-off-by: Kirill A. Shutemov --- mm/filemap.c | 76 ++ 1 file changed, 76 insertions(+) diff --git a/mm

[PATCHv3, RFC 26/34] mm: add huge_fault() callback to vm_operations_struct

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" huge_fault() should try to setup huge page for the pgoff, if possbile. VM_FAULT_OOM return code means we need to fallback to small pages. Signed-off-by: Kirill A. Shutemov --- include/linux/mm.h |1 + 1 file changed, 1 insertion(+) diff --git a/inc

[PATCHv3, RFC 27/34] thp: prepare zap_huge_pmd() to uncharge file pages

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Uncharge pages from correct counter. Signed-off-by: Kirill A. Shutemov --- mm/huge_memory.c |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 7c48f58..4a1d8d7 100644 --- a/mm/huge_memory.

[PATCHv3, RFC 19/34] thp, libfs: initial support of thp in simple_read/write_begin/write_end

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" For now we try to grab a huge cache page if gfp_mask has __GFP_COMP. It's probably to weak condition and need to be reworked later. Signed-off-by: Kirill A. Shutemov --- fs/libfs.c | 48 ---

[PATCHv3, RFC 22/34] thp, mm: truncate support for transparent huge page cache

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" If we starting position of truncation is in tail page we have to spilit the huge page page first. We also have to split if end is within the huge page. Otherwise we can truncate whole huge page at once. Signed-off-by: Kirill A. Shutemov --- mm/truncat

[PATCHv3, RFC 14/34] thp, mm: locking tail page is a bug

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Locking head page means locking entire compound page. If we try to lock tail page, something went wrong. Signed-off-by: Kirill A. Shutemov --- mm/filemap.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/filemap.c b/mm/filemap.c index 1defa8

[PATCHv3, RFC 18/34] thp, mm: naive support of thp in generic read/write routines

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" For now we still write/read at most PAGE_CACHE_SIZE bytes a time. This implementation doesn't cover address spaces with backing store. Signed-off-by: Kirill A. Shutemov --- mm/filemap.c | 18 +- 1 file changed, 17 insertions

[PATCHv3, RFC 23/34] thp, mm: split huge page on mmap file page

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" We are not ready to mmap file-backed tranparent huge pages. Let's split them on fault attempt. Later in the patchset we'll implement mmap() properly and this code path be used for fallback cases. Signed-off-by: Kirill A. Shutemov --- mm/filemap

[PATCHv3, RFC 09/34] thp: represent file thp pages in meminfo and friends

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" The patch adds new zone stat to count file transparent huge pages and adjust related places. For now we don't count mapped or dirty file thp pages separately. Signed-off-by: Kirill A. Shutemov --- drivers/base/node.c| 10 ++ fs

[PATCHv3, RFC 06/34] thp, mm: avoid PageUnevictable on active/inactive lru lists

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" active/inactive lru lists can contain unevicable pages (i.e. ramfs pages that have been placed on the LRU lists when first allocated), but these pages must not have PageUnevictable set - otherwise shrink_active_list goes crazy: kernel BUG at /home/space/kas/

[PATCHv3, RFC 02/34] block: implement add_bdi_stat()

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" We're going to add/remove a number of page cache entries at once. This patch implements add_bdi_stat() which adjusts bdi stats by arbitrary amount. It's required for batched page cache manipulations. Signed-off-by: Kirill A. Shutemov --- include

[PATCHv3, RFC 13/34] thp, mm: trigger bug in replace_page_cache_page() on THP

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" replace_page_cache_page() is only used by FUSE. It's unlikely that we will support THP in FUSE page cache any soon. Let's pospone implemetation of THP handling in replace_page_cache_page() until any will use it. Signed-off-by: Kirill A. Shutemo

[PATCHv3, RFC 10/34] thp, mm: rewrite add_to_page_cache_locked() to support huge pages

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" For huge page we add to radix tree HPAGE_CACHE_NR pages at once: head page for the specified index and HPAGE_CACHE_NR-1 tail pages for following indexes. Signed-off-by: Kirill A. Shutemov --- mm/filema

[PATCHv3, RFC 12/34] thp, mm: rewrite delete_from_page_cache() to support huge pages

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" As with add_to_page_cache_locked() we handle HPAGE_CACHE_NR pages a time. Signed-off-by: Kirill A. Shutemov --- mm/filemap.c | 28 ++-- 1 file changed, 22 insertions(+), 6 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c ind

[PATCHv3, RFC 11/34] mm: trace filemap: dump page order

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Dump page order to trace to be able to distinguish between small page and huge page in page cache. Signed-off-by: Kirill A. Shutemov --- include/trace/events/filemap.h |7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/include/tr

[PATCHv3, RFC 01/34] mm: drop actor argument of do_generic_file_read()

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" There's only one caller of do_generic_file_read() and the only actor is file_read_actor(). No reason to have a callback parameter. Signed-off-by: Kirill A. Shutemov --- mm/filemap.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-)

[PATCHv3, RFC 03/34] mm: implement zero_huge_user_segment and friends

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Let's add helpers to clear huge page segment(s). They provide the same functionallity as zero_user_segment and zero_user, but for huge pages. Signed-off-by: Kirill A. Shutemov --- include/linux/mm.h |7 +++ mm/memory.

[PATCHv3, RFC 05/34] memcg, thp: charge huge cache pages

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" mem_cgroup_cache_charge() has check for PageCompound(). The check prevents charging huge cache pages. I don't see a reason why the check is present. Looks like it's just legacy (introduced in 52d4b9a memcg: allocate all page_cgroup at boot).

[PATCHv3, RFC 00/34] Transparent huge page cache

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Here's third RFC. Thanks everybody for feedback. The patchset is pretty big already and I want to stop generate new features to keep it reviewable. Next I'll concentrate on benchmarking and tuning. Therefore some features will be outside initial t

[PATCHv3, RFC 04/34] radix-tree: implement preload for multiple contiguous elements

2013-04-05 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" The radix tree is variable-height, so an insert operation not only has to build the branch to its corresponding item, it also has to build the branch to existing items if the size has to be increased (by radix_tree_extend). The worst case is a zero height

Re: [PATCH, RFC 06/16] thp, mm: rewrite add_to_page_cache_locked() to support huge pages

2013-01-29 Thread Kirill A. Shutemov
Hillf Danton wrote: > On Mon, Jan 28, 2013 at 5:24 PM, Kirill A. Shutemov > wrote: > > + page_cache_get(page); > > + spin_lock_irq(&mapping->tree_lock); > > + page->mapping = mapping; > > + if (PageTransHuge(page)) { > > +

Re: [PATCH, RFC 06/16] thp, mm: rewrite add_to_page_cache_locked() to support huge pages

2013-01-29 Thread Kirill A. Shutemov
Hillf Danton wrote: > On Mon, Jan 28, 2013 at 5:24 PM, Kirill A. Shutemov > wrote: > > @@ -443,6 +443,7 @@ int add_to_page_cache_locked(struct page *page, struct > > address_space *mapping, > > pgoff_t offset, gfp_t gfp_mask) > > { > >

Re: [PATCH, RFC 00/16] Transparent huge page cache

2013-01-29 Thread Kirill A. Shutemov
Hugh Dickins wrote: > On Mon, 28 Jan 2013, Kirill A. Shutemov wrote: > > From: "Kirill A. Shutemov" > > > > Here's first steps towards huge pages in page cache. > > > > The intend of the work is get code ready to enable transparent hug

Re: [PATCH, RFC 00/16] Transparent huge page cache

2013-02-02 Thread Kirill A. Shutemov
Hugh Dickins wrote: > On Tue, 29 Jan 2013, Kirill A. Shutemov wrote: > > Hugh Dickins wrote: > > > > > > Interesting. > > > > > > I was starting to think about Transparent Huge Pagecache a few > > > months ago, but then got washed away by i

Re: [PATCH 1/4] eventfd: introduce eventfd_signal_hangup()

2013-02-02 Thread Kirill A. Shutemov
On Sat, Feb 02, 2013 at 02:50:44PM +0800, Li Zefan wrote: > When an eventfd is closed, a wakeup with POLLHUP will be issued, > but cgroup wants to issue wakeup explicitly, so when a cgroup is > removed userspace can be notified. > > Signed-off-by: Li Zefan Acked-by: Kir

Re: [PATCH 2/4] cgroup: fix cgroup_rmdir() vs close(eventfd) race

2013-02-02 Thread Kirill A. Shutemov
eventfd_signal(event->eventfd, 1); > - schedule_work(&event->remove); > + while (true) { > + if (list_empty(&cgrp->event_list)) > + break; while (!list_empty(&cgrp->event_list)) ? Otherwise: Acked-by: Kirill A. Shu

Re: [PATCH 3/4] eventfd: make operations on eventfd return -EIDRM if it's hung up

2013-02-02 Thread Kirill A. Shutemov
el/cgroup.c > @@ -4373,7 +4373,6 @@ static int cgroup_destroy_locked(struct cgroup *cgrp) > ctx = eventfd_ctx_get(event->eventfd); > spin_unlock(&cgrp->event_list_lock); > > - eventfd_signal(ctx, 1); > eventfd_sig

[PATCH, RFC 01/16] block: implement add_bdi_stat()

2013-01-28 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" It's required for batched stats update. Signed-off-by: Kirill A. Shutemov --- include/linux/backing-dev.h | 10 ++ 1 file changed, 10 insertions(+) diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h index 3504599..b05d9

[PATCH, RFC 00/16] Transparent huge page cache

2013-01-28 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Here's first steps towards huge pages in page cache. The intend of the work is get code ready to enable transparent huge page cache for the most simple fs -- ramfs. It's not yet near feature-complete. It only provides basic infrastructure. At th

[PATCH, RFC 03/16] mm: drop actor argument of do_generic_file_read()

2013-01-28 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" There's only one caller of do_generic_file_read() and the only actor is file_read_actor(). No reason to have a callback parameter. Signed-off-by: Kirill A. Shutemov --- mm/filemap.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-)

[PATCH, RFC 08/16] thp, mm: locking tail page is a bug

2013-01-28 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Signed-off-by: Kirill A. Shutemov --- mm/filemap.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/filemap.c b/mm/filemap.c index a4b4fd5..f59eaa1 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -665,6 +665,7 @@ void __lock_page(struct

[PATCH, RFC 10/16] thp, mm: implement grab_cache_huge_page_write_begin()

2013-01-28 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" The function is grab_cache_page_write_begin() twin but it tries to allocate huge page at given position aligned to HPAGE_CACHE_NR. If, for some reason, it's not possible allocate a huge page at this possition, it returns NULL. Caller should take care

[PATCH, RFC 16/16] ramfs: enable transparent huge page cache

2013-01-28 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" ramfs is the most simple fs from page cache point of view. Let's start transparent huge page cache enabling here. For now we allocate only non-movable huge page. It's not yet clear if movable page is safe here and what need to be done to make it

[PATCH, RFC 07/16] thp, mm: rewrite delete_from_page_cache() to support huge pages

2013-01-28 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" As with add_to_page_cache_locked() we handle HPAGE_CACHE_NR pages a time. Signed-off-by: Kirill A. Shutemov --- mm/filemap.c | 27 +-- 1 file changed, 21 insertions(+), 6 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c ind

[PATCH, RFC 15/16] thp, mm: split huge page on mmap file page

2013-01-28 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" We are not ready to mmap file-backed tranparent huge pages. Let's split them on mmap() attempt. Signed-off-by: Kirill A. Shutemov --- mm/filemap.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/filemap.c b/mm/filemap.c index a7331fb.

[PATCH, RFC 14/16] thp, mm: truncate support for transparent huge page cache

2013-01-28 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" If we starting position of truncation is in tail page we have to spilit the huge page page first. We also have to split if end is within the huge page. Otherwise we can truncate whole huge page at once. Signed-off-by: Kirill A. Shutemov --- mm/truncat

[PATCH, RFC 11/16] thp, mm: naive support of thp in generic read/write routines

2013-01-28 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" For now we still write/read at most PAGE_CACHE_SIZE bytes a time. This implementation doesn't cover address spaces with backing store. Signed-off-by: Kirill A. Shutemov --- mm/filemap.c | 35 ++- 1 file changed, 30 i

[PATCH, RFC 13/16] thp: handle file pages in split_huge_page()

2013-01-28 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" The base scheme is the same as for anonymous pages, but we walk by mapping->i_mmap rather then anon_vma->rb_root. __split_huge_page_refcount() has been tunned a bit: we need to transfer PG_swapbacked to tail pages. Splitting mapped pages haven't

[PATCH, RFC 02/16] mm: implement zero_huge_user_segment and friends

2013-01-28 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Let's add helpers to clear huge page segment(s). They provide the same functionallity as zero_user_segment{,s} and zero_user, but for huge pages. Signed-off-by: Kirill A. Shutemov --- include/linux/mm.h | 15 +++ mm/memory.

[PATCH, RFC 12/16] thp, libfs: initial support of thp in simple_read/write_begin/write_end

2013-01-28 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" For now we try to grab a huge cache page if gfp_mask has __GFP_COMP. It's probably to weak condition and need to be reworked later. Signed-off-by: Kirill A. Shutemov --- fs/libfs.c | 54 ++ 1 f

[PATCH, RFC 05/16] thp, mm: basic defines for transparent huge page cache

2013-01-28 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Signed-off-by: Kirill A. Shutemov --- include/linux/huge_mm.h |8 1 file changed, 8 insertions(+) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index ee1c244..a54939c 100644 --- a/include/linux/huge_mm.h +++ b/include/linux

[PATCH, RFC 09/16] thp, mm: handle tail pages in page_cache_get_speculative()

2013-01-28 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" For tail page we call __get_page_tail(). It has the same semantics, but for tail page. Signed-off-by: Kirill A. Shutemov --- include/linux/pagemap.h |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/include/linux/pagemap.h b/inc

[PATCH, RFC 06/16] thp, mm: rewrite add_to_page_cache_locked() to support huge pages

2013-01-28 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" For huge page we add to radix tree HPAGE_CACHE_NR pages at once: head page for the specified index and HPAGE_CACHE_NR-1 tail pages for following indexes. Signed-off-by: Kirill A. Shutemov --- mm/filema

[PATCH, RFC 04/16] radix-tree: implement preload for multiple contiguous elements

2013-01-28 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Currently radix_tree_preload() only guarantees enough nodes to insert one element. It's a hard limit. You cannot batch a number insert under one tree_lock. This patch introduces radix_tree_preload_count(). It allows to preallocate nodes enough to ins

[PATCH] vt: add init_hide parameter to suppress boot output

2013-02-19 Thread Kirill A. Shutemov
: Kirill A. Shutemov --- drivers/tty/vt/vt.c|7 +++ include/linux/console_struct.h |3 ++- include/linux/vt.h |2 ++ kernel/power/console.c |2 -- 4 files changed, 11 insertions(+), 3 deletions(-) diff --git a/drivers/tty/vt/vt.c b/drivers/tty

[PATCH v2] vt: add init_hide parameter to suppress boot output

2013-02-19 Thread Kirill A. Shutemov
: Kirill A. Shutemov -- v2: - style: add space after if keyword; - change init_hide from int to bool; --- drivers/tty/vt/vt.c|7 +++ include/linux/console_struct.h |3 ++- include/linux/vt.h |2 ++ kernel/power/console.c |2 -- 4 files changed

[PATCHv2, RFC 01/30] block: implement add_bdi_stat()

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" It's required for batched stats update. Signed-off-by: Kirill A. Shutemov --- include/linux/backing-dev.h | 10 ++ 1 file changed, 10 insertions(+) diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h index 3504599..b05d9

[PATCHv2, RFC 25/30] thp, mm: basic huge_fault implementation for generic_file_vm_ops

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" It provide enough functionality for simple cases like ramfs. Need to be extended later. Signed-off-by: Kirill A. Shutemov --- mm/filemap.c | 75 ++ 1 file changed, 75 insertions(+) diff --git a/mm

[PATCHv2, RFC 27/30] thp: initial implementation of do_huge_linear_fault()

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" The function tries to create a new page mapping using huge pages. It only called for not yet mapped pages. As usual in THP, we fallback to small pages if we fail to allocate huge page. Signed-off-by: Kirill A. Shutemov --- include/linux/huge_mm.h |

[PATCHv2, RFC 30/30] thp: map file-backed huge pages on fault

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Look like all pieces are in place, we can map file-backed huge-pages now. Signed-off-by: Kirill A. Shutemov --- include/linux/huge_mm.h |4 +++- mm/memory.c |1 + 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/inc

[PATCHv2, RFC 22/30] mm: add huge_fault() callback to vm_operations_struct

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" huge_fault() should try to setup huge page for the pgoff, if possbile. VM_FAULT_OOM return code means we need to fallback to small pages. Signed-off-by: Kirill A. Shutemov --- include/linux/mm.h |1 + 1 file changed, 1 insertion(+) diff --git a/inc

[PATCHv2, RFC 19/30] thp, mm: split huge page on mmap file page

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" We are not ready to mmap file-backed tranparent huge pages. Let's split them on mmap() attempt. Signed-off-by: Kirill A. Shutemov --- mm/filemap.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/filemap.c b/mm/filemap.c index 79ba9cd.

[PATCHv2, RFC 28/30] thp: handle write-protect exception to file-backed huge pages

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Signed-off-by: Kirill A. Shutemov --- mm/huge_memory.c | 69 -- 1 file changed, 67 insertions(+), 2 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index d1adaea..a416a77 10064

[PATCHv2, RFC 21/30] x86-64, mm: proper alignment mappings with hugepages

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Make arch_get_unmapped_area() return unmapped area aligned to HPAGE_MASK if the file mapping can have huge pages. Signed-off-by: Kirill A. Shutemov --- arch/x86/kernel/sys_x86_64.c | 13 +++-- 1 file changed, 11 insertions(+), 2 deletions(-) di

[PATCHv2, RFC 24/30] thp: move maybe_pmd_mkwrite() out of mk_huge_pmd()

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" It's confusing that mk_huge_pmd() has sematics different from mk_pte() or mk_pmd(). Let's move maybe_pmd_mkwrite() out of mk_huge_pmd() and adjust prototype to match mk_pte(). Signed-off-by: Kirill A. Shutemov --- mm/huge_memory.c | 14 ++

[PATCHv2, RFC 23/30] thp: prepare zap_huge_pmd() to uncharge file pages

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Uncharge pages from correct counter. Signed-off-by: Kirill A. Shutemov --- mm/huge_memory.c |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index a23da8b..34e0385 100644 --- a/mm/huge_memory.

[PATCHv2, RFC 12/30] thp, mm: add event counters for huge page alloc on write to a file

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Signed-off-by: Kirill A. Shutemov --- include/linux/vm_event_item.h |2 ++ mm/vmstat.c |2 ++ 2 files changed, 4 insertions(+) diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h index d4b7a18..04587c4 10

[PATCHv2, RFC 20/30] ramfs: enable transparent huge page cache

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" ramfs is the most simple fs from page cache point of view. Let's start transparent huge page cache enabling here. For now we allocate only non-movable huge page. It's not yet clear if movable page is safe here and what need to be done to make it

[PATCHv2, RFC 17/30] thp: wait_split_huge_page(): serialize over i_mmap_mutex too

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Since we're going to have huge pages backed by files, wait_split_huge_page() has to serialize not only over anon_vma_lock, but over i_mmap_mutex too. Signed-off-by: Kirill A. Shutemov --- include/linux/huge_mm.h | 15 --- mm/huge_memory.

[PATCHv2, RFC 05/30] thp, mm: avoid PageUnevictable on active/inactive lru lists

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" active/inactive lru lists can contain unevicable pages (i.e. ramfs pages that have been placed on the LRU lists when first allocated), but these pages must not have PageUnevictable set - otherwise shrink_active_list goes crazy: kernel BUG at /home/space/kas/

[PATCHv2, RFC 10/30] thp, mm: locking tail page is a bug

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Signed-off-by: Kirill A. Shutemov --- mm/filemap.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/filemap.c b/mm/filemap.c index 0ff3403..38fdc92 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -669,6 +669,7 @@ void __lock_page(struct

[PATCHv2, RFC 08/30] thp, mm: rewrite add_to_page_cache_locked() to support huge pages

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" For huge page we add to radix tree HPAGE_CACHE_NR pages at once: head page for the specified index and HPAGE_CACHE_NR-1 tail pages for following indexes. Signed-off-by: Kirill A. Shutemov --- mm/filema

[PATCHv2, RFC 18/30] thp, mm: truncate support for transparent huge page cache

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" If the starting position of truncation is in a tail page we have to split the huge page first. We also have to split if the end is within the huge page. Otherwise we can truncate the whole huge page at once. Signed-off-by: Kirill A. Shutemov --- mm/truncat
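The split-or-drop decision the summary describes reduces to an alignment check on the truncation boundaries. A minimal sketch, assuming an illustrative `HPAGE_CACHE_NR` of 8 and invented helper names:

```c
#include <assert.h>

#define HPAGE_CACHE_NR 8UL /* illustrative value */

/* True if @index falls on a huge-page (head page) boundary. */
static int hpage_boundary(unsigned long index)
{
    return (index & (HPAGE_CACHE_NR - 1)) == 0;
}

/* Truncating indexes [start, end): a huge page must be split when
 * either boundary lands inside it; when both are huge-page aligned
 * the whole huge page can be dropped at once. */
static int truncate_needs_split(unsigned long start, unsigned long end)
{
    return !hpage_boundary(start) || !hpage_boundary(end);
}
```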

[PATCHv2, RFC 06/30] thp, mm: basic defines for transparent huge page cache

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Signed-off-by: Kirill A. Shutemov --- include/linux/huge_mm.h |8 1 file changed, 8 insertions(+) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index ee1c244..a54939c 100644 --- a/include/linux/huge_mm.h +++ b/include/linux

[PATCHv2, RFC 29/30] thp: call __vma_adjust_trans_huge() for file-backed VMA

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Since we're going to have huge pages in page cache, we need to call __vma_adjust_trans_huge() for file-backed VMA, which potentially can contain huge pages. For now we call it for all VMAs with vm_ops->huge_fault defined. Probably later we will n

[PATCHv2, RFC 03/30] mm: drop actor argument of do_generic_file_read()

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" There's only one caller of do_generic_file_read() and the only actor is file_read_actor(). No reason to have a callback parameter. Signed-off-by: Kirill A. Shutemov --- mm/filemap.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-)

[PATCHv2, RFC 26/30] thp: extract fallback path from do_huge_pmd_anonymous_page() to a function

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" The same fallback path will be reused by non-anonymous pages, so let's extract it into a separate function. Signed-off-by: Kirill A. Shutemov --- mm/huge_memory.c | 112 -- 1 file changed, 59 in

[PATCHv2, RFC 16/30] thp: handle file pages in split_huge_page()

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" The base scheme is the same as for anonymous pages, but we walk by mapping->i_mmap rather than anon_vma->rb_root. __split_huge_page_refcount() has been tuned a bit: we need to transfer PG_swapbacked to tail pages. Splitting mapped pages haven't

[PATCHv2, RFC 09/30] thp, mm: rewrite delete_from_page_cache() to support huge pages

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" As with add_to_page_cache_locked() we handle HPAGE_CACHE_NR pages at a time. Signed-off-by: Kirill A. Shutemov --- mm/filemap.c | 27 +-- 1 file changed, 21 insertions(+), 6 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c ind

[PATCHv2, RFC 02/30] mm: implement zero_huge_user_segment and friends

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Let's add helpers to clear huge page segment(s). They provide the same functionality as zero_user_segment{,s} and zero_user, but for huge pages. Signed-off-by: Kirill A. Shutemov --- include/linux/mm.h | 15 +++ mm/memory.
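A huge page cannot generally be zeroed with one flat memset, since the kernel maps and clears it one subpage at a time. The chunking loop can be sketched in userspace with a plain buffer standing in for the huge page; the sizes and the function name are illustrative, not the kernel's:

```c
#include <string.h>

#define SUBPAGE_SIZE 16UL /* illustrative stand-in for PAGE_SIZE */

/* Zero bytes [start, end) of a buffer standing in for a huge page,
 * one subpage-sized chunk at a time -- mirroring how a helper like
 * zero_huge_user_segment would work per subpage. */
static void zero_huge_segment(unsigned char *hpage,
                              unsigned long start, unsigned long end)
{
    while (start < end) {
        /* Distance to the next subpage boundary, capped by the
         * bytes remaining in the requested segment. */
        unsigned long chunk = SUBPAGE_SIZE - (start % SUBPAGE_SIZE);

        if (chunk > end - start)
            chunk = end - start;
        memset(hpage + start, 0, chunk);
        start += chunk;
    }
}
```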

[PATCHv2, RFC 15/30] thp, libfs: initial support of thp in simple_read/write_begin/write_end

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" For now we try to grab a huge cache page if gfp_mask has __GFP_COMP. It's probably too weak a condition and needs to be reworked later. Signed-off-by: Kirill A. Shutemov --- fs/libfs.c | 50 +++--- 1 f

[PATCHv2, RFC 14/30] thp, mm: naive support of thp in generic read/write routines

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" For now we still write/read at most PAGE_CACHE_SIZE bytes at a time. This implementation doesn't cover address spaces with backing store. Signed-off-by: Kirill A. Shutemov --- mm/filemap.c | 35 ++- 1 file changed, 30 i
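The "at most PAGE_CACHE_SIZE bytes at a time" bound means each loop iteration copies up to the next small-page boundary, even when the backing cache page is huge. A hedged sketch of that per-iteration cap (the function name is invented for illustration):

```c
#define PAGE_CACHE_SIZE 4096UL

/* Bytes to copy in one iteration of a naive read/write loop:
 * capped by the distance from @pos to the next small-page boundary
 * and by the bytes still @remaining in the request. */
static unsigned long copy_chunk(unsigned long pos, unsigned long remaining)
{
    unsigned long chunk = PAGE_CACHE_SIZE - (pos & (PAGE_CACHE_SIZE - 1));

    return chunk < remaining ? chunk : remaining;
}
```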

[PATCHv2, RFC 00/30] Transparent huge page cache

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" Here's the second version of the patchset. The intent of the work is to get the code ready to enable transparent huge page cache for the simplest fs -- ramfs. We have read()/write()/mmap() functionality now. Still plenty of work ahead. Any feedback is we

[PATCHv2, RFC 13/30] thp, mm: implement grab_cache_huge_page_write_begin()

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" The function is a grab_cache_page_write_begin() twin, but it tries to allocate a huge page at the given position aligned to HPAGE_CACHE_NR. If, for some reason, it's not possible to allocate a huge page at this position, it returns NULL. Caller should take care

[PATCHv2, RFC 11/30] thp, mm: handle tail pages in page_cache_get_speculative()

2013-03-14 Thread Kirill A. Shutemov
From: "Kirill A. Shutemov" For tail page we call __get_page_tail(). It has the same semantics, but for tail page. Signed-off-by: Kirill A. Shutemov --- include/linux/pagemap.h |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/include/linux/pagemap.h b/inc
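The key property behind taking a speculative reference on a tail page is that compound pages are refcounted through their head. A deliberately simplified toy model (real struct page and __get_page_tail() also juggle tail _mapcounts and lockless races, which are omitted here):

```c
#include <stddef.h>

/* Toy model of a compound page: refcount is meaningful on the head;
 * tails point back to their head via first_page. */
struct toy_page {
    int refcount;               /* used on the head page only */
    struct toy_page *first_page; /* NULL on head, head pointer on tails */
};

static struct toy_page *toy_compound_head(struct toy_page *page)
{
    return page->first_page ? page->first_page : page;
}

/* Taking a reference on any subpage pins the head page -- the effect
 * __get_page_tail() arranges for page_cache_get_speculative(). */
static void toy_get_page(struct toy_page *page)
{
    toy_compound_head(page)->refcount++;
}
```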
