Update the ia64 hugetlb_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.
Signed-off-by: Michel Lespinasse
---
arch/ia64/mm/hugetlbpage.c | 20 +---
1 files changed, 9 insertions(+), 11 deletions(-)
diff --git a/arch
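The conversion described above replaces per-architecture brute-force loops with the common vm_unmapped_area() gap search. As a rough userspace illustration only (none of these names are from the kernel; the real code walks an augmented VMA tree), the computation being centralized is a lowest-fit search for a free gap between sorted mappings:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of what vm_unmapped_area() computes:
 * the lowest address in [low, high) where a free gap of at least
 * `len` bytes exists between sorted, non-overlapping mappings. */
struct mapping { unsigned long start, end; };

static unsigned long find_lowest_gap(const struct mapping *maps, size_t n,
                                     unsigned long low, unsigned long high,
                                     unsigned long len)
{
    unsigned long addr = low;

    for (size_t i = 0; i < n; i++) {
        if (maps[i].end <= addr)
            continue;               /* mapping entirely below cursor */
        if (maps[i].start >= addr + len)
            break;                  /* gap before this mapping fits */
        addr = maps[i].end;         /* skip past this mapping */
    }
    return (addr + len <= high) ? addr : 0;   /* 0 == no fit */
}
```

The kernel version avoids this linear scan by keeping the largest gap size in each subtree of the VMA tree, which is what makes the conversion a performance win over the brute-force search.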
vm_unmapped_area() infrastructure and regain the performance.
Signed-off-by: Michel Lespinasse
---
arch/powerpc/include/asm/page_64.h | 3 +-
arch/powerpc/mm/hugetlbpage.c | 2 +-
arch/powerpc/mm/slice.c | 108 +
arch/powerpc
Since all architectures have been converted to use vm_unmapped_area(),
there is no remaining use for the free_area_cache.
Signed-off-by: Michel Lespinasse
---
arch/arm/mm/mmap.c | 2 --
arch/arm64/mm/mmap.c | 2 --
arch/mips/mm/mmap.c | 2
Update the powerpc slice_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.
Signed-off-by: Michel Lespinasse
---
arch/powerpc/mm/slice.c | 128 +-
1 files changed, 81 insertions(+), 47
Update the ia64 arch_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.
Signed-off-by: Michel Lespinasse
---
arch/ia64/kernel/sys_ia64.c | 37 -
1 files changed, 12 insertions(+), 25 deletions
ested.
Michel Lespinasse (8):
mm: use vm_unmapped_area() on parisc architecture
mm: use vm_unmapped_area() on alpha architecture
mm: use vm_unmapped_area() on frv architecture
mm: use vm_unmapped_area() on ia64 architecture
mm: use vm_unmapped_area() in hugetlbfs on ia64 architecture
mm: r
Update the frv arch_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.
Signed-off-by: Michel Lespinasse
---
arch/frv/mm/elf-fdpic.c | 49 --
1 files changed, 17 insertions(+), 32 deletions
Update the parisc arch_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.
Signed-off-by: Michel Lespinasse
---
arch/parisc/kernel/sys_parisc.c | 46 ++
1 files changed, 17 insertions(+), 29 deletions
Whoops, I was supposed to find a more appropriate subject line before
sending this :]
On Tue, Jan 8, 2013 at 5:28 PM, Michel Lespinasse wrote:
> These patches, which apply on top of v3.8-rc kernels, are to complete the
> VMA gap finding code I introduced (following Rik's initial p
On Tue, Jan 8, 2013 at 6:15 PM, Benjamin Herrenschmidt
wrote:
> On Tue, 2013-01-08 at 17:28 -0800, Michel Lespinasse wrote:
>> Update the powerpc slice_get_unmapped_area function to make use of
>> vm_unmapped_area() instead of implementing a brute force search.
>>
>
Like others before me, I have discovered how easy it is to DOS a
system by abusing the rwlock_t unfairness and causing the
tasklist_lock read side to be continuously held (my abuse code makes
use of the getpriority syscall, but there are plenty of other ways
anyway).
My understanding is that the i
eadable and it will avoid a fuckup in the future if
> somebody changes the algorithm and forgets to update one of the
> copies :-)
All right, does the following look more palatable, then?
(didn't re-test it, though)
Signed-off-by: Michel Lespinasse
---
arch/powerpc/mm/slice.c | 1
On Wed, Jan 9, 2013 at 9:49 AM, Oleg Nesterov wrote:
> On 01/08, Michel Lespinasse wrote:
>> Like others before me, I have discovered how easy it is to DOS a
>> system by abusing the rwlock_t unfairness and causing the
>> tasklist_lock read side to be continuously held
On Tue, Jan 8, 2013 at 2:30 PM, Rik van Riel wrote:
> v3: use fixed-point math for the delay calculations, suggested by Michel
> Lespinasse
>
> - if (head == ticket)
> + if (head == ticket) {
> + /*
> +
etecting hash collisions to protect us against varying hold times,
because this case could happen even with a single spinlock. So we need
to make sure the base algorithm is robust and converges towards using
the shorter of the spinlock hold times; if we have that then forcing a
reset to MIN_SPIN
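The fixed-point delay tracking under discussion can be sketched in userspace C as follows. This is an illustration of the technique, not Rik's patch: all names and the 1/64 averaging weight are invented for the example.

```c
#include <assert.h>

/* Hypothetical sketch of fixed-point delay averaging for spinlock
 * backoff: keep the delay in 8.8 fixed point so an exponential moving
 * average can adjust by fractions of a loop iteration without
 * floating point. Names and constants are made up for illustration. */
#define DELAY_SHIFT 8                       /* 8.8 fixed point */
#define DELAY_ONE   (1u << DELAY_SHIFT)
#define MIN_DELAY   (1u * DELAY_ONE)
#define MAX_DELAY   (1000u * DELAY_ONE)

static unsigned int delay_update(unsigned int avg, unsigned int sample_loops)
{
    int sample = (int)(sample_loops << DELAY_SHIFT);
    int diff = sample - (int)avg;
    int next = (int)avg + diff / 64;        /* EMA with weight 1/64 */

    if (next < (int)MIN_DELAY)
        next = MIN_DELAY;
    if (next > (int)MAX_DELAY)
        next = MAX_DELAY;
    return (unsigned int)next;
}
```

The point of the fixed-point representation is that a single short hold time can nudge the average down by less than one whole loop iteration, which an integer-only delay could not express.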
On Thu, Jan 10, 2013 at 5:05 AM, Rik van Riel wrote:
> Eric,
>
> with just patches 1-3, can you still reproduce the
> regression on your system?
>
> In other words, could we get away with dropping the
> complexity of patch 4, or do we still need it?
To be clear, I must say that I'm not opposing p
On Wed, Nov 15, 2000 at 12:12:15PM -0800, H. Peter Anvin wrote:
> Also, if a piece of software needs raw CPUID information (unlike the
> "cooked" one provided by recent kernels) it should use
> /dev/cpu/*/cpuid.
Is it also OK to use the cpuid opcode in userspace? (after checking
for its presence
On Fri, Dec 08, 2000 at 09:44:29AM +0900, Rainer Mager wrote:
> I've heard that signal 11 can be related to bad hardware, most
> often memory, but I've done a good bit of testing on this and the
> system seems ok. What I did was to run the VA Linux Cerberos(sp?)
> test for 15 hours+ with n
n be found at
http://lespinasse.org/config-2.6.24-rc6 if that's any help.
Thanks,
--
Michel Lespinasse
Hi,
I have a quick question about nfsroot in linux: Is there any way to
use nfsv3 in the nfsroot (nolock is OK there), and then mount other
directories with locking enabled?
The default nfs options when using nfsroot are to use nfsv2 without locking.
After booting, one can mount other filesystem
On Fri, May 04, 2007 at 11:25:43AM -0700, Kok, Auke wrote:
> can you try turning off the "management enable" function in the BIOS of the
> DQ965GF? That fixes this issue for us in our labs. A fix for this is also
> available in our standalone 7.5.5.1 driver (obtainable from e1000.sf.net),
> but
Hi,
Sorry if this is known, I am not on the list.
I'm having an issue with lost ticks, running Linux 2.6.20.10 on an
Intel DQ965GF motherboard. For some reason this occurs with clock-like
regularity, always exactly 24 lost ticks, about every two seconds.
This is running with 250-HZ ticks, and the
board E1000).
On Tue, May 01, 2007 at 11:34:28AM -0400, Chuck Ebbert wrote:
> Michel Lespinasse wrote:
> > running with report_lost_ticks, I see the following:
> >
> > May 1 12:58:57 server kernel: time.c: Lost 24 timer tick(s)! rip
> > _spin_unlock_irqrestore+0x8/0x9)
>
On Tue, May 01, 2007 at 03:08:48PM -0700, Kok, Auke wrote:
> Michel Lespinasse wrote:
> >(I've added the E1000 maintainers to the thread as I found the issue
> >seems to go away after I compile out that driver. For reference, I was
> >trying to figure out why I lose ex
On Wed, May 02, 2007 at 11:14:52AM -0700, Kok, Auke wrote:
> I just checked and the fix I was referring to earlier didn't make it into
> 2.6.21-final. You can get 2.6.21-git1 from kernel.org which has the fix. See
>
> http://www.kernel.org/pub/linux/kernel/v2.6/snapshots/patch-2.6.21-git1.log
Go
On Wed, Apr 07, 2021 at 04:48:44PM +0200, Peter Zijlstra wrote:
> On Tue, Apr 06, 2021 at 06:44:36PM -0700, Michel Lespinasse wrote:
> > --- a/arch/x86/mm/fault.c
> > +++ b/arch/x86/mm/fault.c
> > @@ -1219,6 +1219,8 @@ void do_user_addr_fault(struct pt_regs *regs,
>
On Wed, Apr 07, 2021 at 01:14:53PM -0700, Michel Lespinasse wrote:
> On Wed, Apr 07, 2021 at 04:48:44PM +0200, Peter Zijlstra wrote:
> > On Tue, Apr 06, 2021 at 06:44:36PM -0700, Michel Lespinasse wrote:
> > > --- a/arch/x86/mm/fault.c
> > > +++ b/arch/x86/mm/fault
On Wed, Apr 07, 2021 at 04:35:28PM +0100, Matthew Wilcox wrote:
> On Wed, Apr 07, 2021 at 04:48:44PM +0200, Peter Zijlstra wrote:
> > On Tue, Apr 06, 2021 at 06:44:36PM -0700, Michel Lespinasse wrote:
> > > --- a/arch/x86/mm/fault.c
> > > +++ b/arch/x86/mm/fault.c
>
On Wed, Apr 07, 2021 at 04:47:34PM +0200, Peter Zijlstra wrote:
> On Tue, Apr 06, 2021 at 06:44:34PM -0700, Michel Lespinasse wrote:
> > The counter's write side is hooked into the existing mmap locking API:
> > mmap_write_lock() increments the counter to the n
On Wed, Apr 07, 2021 at 04:40:34PM +0200, Peter Zijlstra wrote:
> On Tue, Apr 06, 2021 at 06:44:49PM -0700, Michel Lespinasse wrote:
> > In the speculative case, call the vm_ops->fault() method from within
> > an rcu read locked section, and verify the mmap sequence lock at th
On Wed, Apr 07, 2021 at 03:50:06AM +0100, Matthew Wilcox wrote:
> On Tue, Apr 06, 2021 at 06:44:59PM -0700, Michel Lespinasse wrote:
> > Performance tuning: as single threaded userspace does not use
> > speculative page faults, it does not require rcu safe vma freeing.
> > T
On Thu, Apr 08, 2021 at 08:13:43AM +0100, Matthew Wilcox wrote:
> On Thu, Apr 08, 2021 at 09:00:26AM +0200, Peter Zijlstra wrote:
> > On Wed, Apr 07, 2021 at 10:27:12PM +0100, Matthew Wilcox wrote:
> > > Doing I/O without any lock held already works; it just uses the file
> > > refcount. It would
On Fri, Jul 13, 2012 at 1:15 PM, Andrew Morton
wrote:
> On Thu, 12 Jul 2012 17:31:50 -0700 Michel Lespinasse
> wrote:
>> Makefile | 2 +-
>> lib/Kconfig.debug | 1 +
>> tests/Kconfig | 18 +++
>> tests/Makefile | 1 +
On Fri, Jul 13, 2012 at 3:45 PM, Andrew Morton
wrote:
> On Fri, 13 Jul 2012 15:33:35 -0700 Michel Lespinasse
> wrote:
>> Ah, I did not realize we had a precedent for in-tree kernel test modules.
>
> hm, well, just because that's what we do now doesn't mean that i
leaf nodes have the same number of black nodes,
- root node is black
Signed-off-by: Michel Lespinasse
---
lib/Kconfig.debug | 7 +++
lib/Makefile | 2 +
lib/rbtree_test.c | 135 +
3 files changed, 144 insertions(+), 0 deletions
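The invariants the test module verifies (equal black counts on every root-to-leaf path, black root, and implicitly no red node with a red child) can be checked with a short recursive walk. The sketch below is a standalone model of those checks, not the code from lib/rbtree_test.c; all names are invented:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal standalone model of the red-black invariants checked by a
 * test module: no red node has a red child, and every path from the
 * root to a NULL leaf crosses the same number of black nodes. */
enum color { RED, BLACK };

struct node {
    enum color color;
    struct node *left, *right;
};

/* Returns the black-height of the subtree, or -1 on violation. */
static int check(const struct node *n)
{
    if (!n)
        return 1;                   /* NULL leaves count as black */
    if (n->color == RED &&
        ((n->left && n->left->color == RED) ||
         (n->right && n->right->color == RED)))
        return -1;                  /* red node with red child */
    int lh = check(n->left), rh = check(n->right);
    if (lh < 0 || rh < 0 || lh != rh)
        return -1;                  /* unequal black heights */
    return lh + (n->color == BLACK);
}

static bool rb_valid(const struct node *root)
{
    return root && root->color == BLACK && check(root) > 0;
}
```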
d the fix in order to try out the patches. So here it is :)
- Forwarded message from Michel Lespinasse -
Date: Tue, 17 Jul 2012 17:30:35 -0700
From: Michel Lespinasse
To: Andrew Morton
Cc: Doug Ledford
Subject: [PATCH] ipc/mqueue: remove unnecessary rb_init_node calls
Commits d662985
here.
Signed-off-by: Michel Lespinasse
---
fs/jffs2/readinode.c | 6 --
1 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/jffs2/readinode.c b/fs/jffs2/readinode.c
index dc0437e..b00fc50 100644
--- a/fs/jffs2/readinode.c
+++ b/fs/jffs2/readinode.c
@@ -395,7 +395,9 @@ stati
) and case 3 (node to remove has 2 children,
successor is a left-descendant of the right child).
Signed-off-by: Michel Lespinasse
---
lib/rbtree.c | 115 --
1 files changed, 72 insertions(+), 43 deletions(-)
diff --git a/lib/rbtree.c b/lib
Signed-off-by: Michel Lespinasse
---
lib/rbtree_test.c | 103 +++-
1 files changed, 101 insertions(+), 2 deletions(-)
diff --git a/lib/rbtree_test.c b/lib/rbtree_test.c
index 4c6d250..2dfafe4 100644
--- a/lib/rbtree_test.c
+++ b/lib/rbtree_test.c
work, as my compiler output
is now *smaller* than before for that function. Speed wise, they seem
comparable though.
Signed-off-by: Michel Lespinasse
---
include/linux/rbtree.h | 5 +
lib/rbtree.c | 14 +-
lib/rbtree_test.c | 31 +++--
006e rb_replace_node
Signed-off-by: Michel Lespinasse
---
arch/x86/mm/pat_rbtree.c | 52 +
include/linux/rbtree.h | 8 -
lib/rbtree.c | 71 --
3 files changed, 33 insertions(+), 98 deletions
together, but we still call into a generic
__rb_erase_color() (passing a non-inlined callback function) for the
rebalancing work. This is intended to strike a reasonable compromise
between speed and compiled code size.
Signed-off-by: Michel Lespinasse
---
include/linux/rbtree.h | 5
all the way to the root (and it should be more efficient too - most of the
nodes in a balanced tree are on the last few levels, so having to go all
the way back to the root really is wasteful), I have not found a nice
elegant way to do that yet, let alone in a generic way. If someone wants
to try
fetched child
This avoids fetching the parent's left child when node is actually
that child. Saves a bit on code size, though it doesn't seem to make
a large difference in speed.
Signed-off-by: Michel Lespinasse
---
lib/rbtree.c | 21 +
1 files changed, 13 insertions(
On Mon, May 18, 2020 at 12:32:03PM +0200, Vlastimil Babka wrote:
> On 4/22/20 2:14 AM, Michel Lespinasse wrote:
> > Add API for nested write locks and convert the few call sites doing that.
> >
> > Signed-off-by: Michel Lespinasse
> > Reviewed-by: Daniel Jordan
>
On Mon, May 18, 2020 at 12:45:06PM +0200, Vlastimil Babka wrote:
> On 4/22/20 2:14 AM, Michel Lespinasse wrote:
> > Define a new initializer for the mmap locking api.
> > Initially this just evaluates to __RWSEM_INITIALIZER as the API
> > is defined as wrappers around rwsem.
&
On Mon, May 18, 2020 at 01:01:33PM +0200, Vlastimil Babka wrote:
> On 4/24/20 3:38 AM, Michel Lespinasse wrote:
> > +static inline void mmap_assert_locked(struct mm_struct *mm)
> > +{
> > + VM_BUG_ON_MM(!lockdep_is_held_type(&mm->mmap_sem, -1), mm);
> > +
On Mon, May 18, 2020 at 03:45:22PM +0200, Laurent Dufour wrote:
> Le 24/04/2020 à 03:39, Michel Lespinasse a écrit :
> > Rename the mmap_sem field to mmap_lock. Any new uses of this lock
> > should now go through the new mmap locking api. The mmap_lock is
> > still implement
On Mon, May 18, 2020 at 01:07:26PM +0200, Vlastimil Babka wrote:
> Any plan about all the code comments mentioning mmap_sem? :) Not urgent.
It's mostly a sed job, I'll add it in the next version as it seems
the patchset is getting ready for inclusion.
--
Michel "Walken" Lespinasse
A program is n
On Tue, May 19, 2020 at 11:15 AM John Hubbard wrote:
> On 2020-05-19 08:32, Matthew Wilcox wrote:
> > On Tue, May 19, 2020 at 03:20:40PM +0200, Laurent Dufour wrote:
> >> Le 19/05/2020 à 15:10, Michel Lespinasse a écrit :
> >>> On Mon, May 18, 2020 at 03:45:22
point for replacing the rwsem
implementation with a different one, such as range locks.
Signed-off-by: Michel Lespinasse
Reviewed-by: Daniel Jordan
Reviewed-by: Davidlohr Bueso
Reviewed-by: Laurent Dufour
Reviewed-by: Vlastimil Babka
---
include/linux/mm.h| 1 +
include/linux
Convert comments that reference mmap_sem to reference mmap_lock instead.
Signed-off-by: Michel Lespinasse
---
.../admin-guide/mm/numa_memory_policy.rst | 10 ++---
Documentation/admin-guide/mm/userfaultfd.rst | 2 +-
Documentation/filesystems/locking.rst | 2 +-
Documentation/vm
Convert the last few remaining mmap_sem rwsem calls to use the new
mmap locking API. These were missed by coccinelle for some reason
(I think coccinelle does not support some of the preprocessor
constructs in these files?)
Signed-off-by: Michel Lespinasse
Reviewed-by: Daniel Jordan
Reviewed-by
ould be delayed for
a bit, so that we'd get a chance to convert any new code that locks
mmap_sem in the -rc1 release before applying that last patch.
Michel Lespinasse (12):
mmap locking API: initial implementation as rwsem wrappers
MMU notifier: use the new mmap locking API
DMA reser
Add new APIs to assert that mmap_sem is held.
Using this instead of rwsem_is_locked and lockdep_assert_held[_write]
makes the assertions more tolerant of future changes to the lock type.
Signed-off-by: Michel Lespinasse
---
arch/x86/events/core.c| 2 +-
fs/userfaultfd.c | 6
This use is converted manually ahead of the next patch in the series,
as it requires including a new header which the automated conversion
would miss.
Signed-off-by: Michel Lespinasse
Reviewed-by: Daniel Jordan
Reviewed-by: Davidlohr Bueso
Reviewed-by: Laurent Dufour
Reviewed-by: Vlastimil
Convert comments that reference old mmap_sem APIs to reference
corresponding new mmap locking APIs instead.
Signed-off-by: Michel Lespinasse
---
Documentation/vm/hmm.rst | 6 +++---
arch/alpha/mm/fault.c | 2 +-
arch/ia64/mm/fault.c | 2 +-
arch/m68k/mm/fault.c
Add API for nested write locks and convert the few call sites doing that.
Signed-off-by: Michel Lespinasse
Reviewed-by: Daniel Jordan
Reviewed-by: Laurent Dufour
Reviewed-by: Vlastimil Babka
---
arch/um/include/asm/mmu_context.h | 3 ++-
include/linux/mmap_lock.h | 5 +
kernel
Rename the mmap_sem field to mmap_lock. Any new uses of this lock
should now go through the new mmap locking api. The mmap_lock is
still implemented as a rwsem, though this could change in the future.
Signed-off-by: Michel Lespinasse
Reviewed-by: Vlastimil Babka
---
arch/ia64/mm/fault.c
Define a new initializer for the mmap locking api.
Initially this just evaluates to __RWSEM_INITIALIZER as the API
is defined as wrappers around rwsem.
Signed-off-by: Michel Lespinasse
Reviewed-by: Laurent Dufour
Reviewed-by: Vlastimil Babka
---
arch/x86/kernel/tboot.c| 2 +-
drivers
least-ugly way of addressing this in the short term.
Signed-off-by: Michel Lespinasse
Reviewed-by: Daniel Jordan
Reviewed-by: Vlastimil Babka
---
include/linux/mmap_lock.h | 14 ++
kernel/bpf/stackmap.c | 17 +
2 files changed, 19 insertions(+), 12 deletions(-)
This use is converted manually ahead of the next patch in the series,
as it requires including a new header which the automated conversion
would miss.
Signed-off-by: Michel Lespinasse
Reviewed-by: Daniel Jordan
Reviewed-by: Laurent Dufour
Reviewed-by: Vlastimil Babka
---
drivers/dma-buf/dma
On Wed, May 20, 2020 at 12:32 AM John Hubbard wrote:
> On 2020-05-19 19:39, Michel Lespinasse wrote:
> >> That gives you additional options inside internal_get_user_pages_fast(),
> >> such
> >> as, approximately:
> >>
> >> if (!(gup_flags & F
Looks good. I'm not sure if you need a review, but just in case:
On Wed, May 20, 2020 at 8:23 PM Andrew Morton wrote:
> On Tue, 19 May 2020 22:29:01 -0700 Michel Lespinasse
> wrote:
>
> > Convert the last few remaining mmap_sem rwsem calls to use the new
> > mmap lock
Looks good, thanks !
On Wed, May 20, 2020 at 8:22 PM Andrew Morton wrote:
> On Tue, 19 May 2020 22:29:08 -0700 Michel Lespinasse
> wrote:
> > Convert comments that reference mmap_sem to reference mmap_lock instead.
>
> This may not be complete..
>
> From: Andrew Morton
On Thu, May 21, 2020 at 12:42 AM Vlastimil Babka wrote:
> On 5/20/20 7:29 AM, Michel Lespinasse wrote:
> > Convert comments that reference mmap_sem to reference mmap_lock instead.
> >
> > Signed-off-by: Michel Lespinasse
>
> Reviewed-by: Vlastimil Babka
>
Change do_anonymous_page() to handle the speculative case.
This involves aborting speculative faults if they have to allocate a new
anon_vma, and using pte_map_lock() instead of pte_offset_map_lock()
to complete the page fault.
Signed-off-by: Michel Lespinasse
---
mm/memory.c | 17
Change do_swap_page() to allow speculative fault execution to proceed.
Signed-off-by: Michel Lespinasse
---
mm/memory.c | 5 -
1 file changed, 5 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index ab3160719bf3..6eddd7b4e89c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3340,11
that point the page table lock serializes any further
races with concurrent mmap lock writers.
If the mmap sequence count check fails, both functions will return false
with the pte being left unmapped and unlocked.
Signed-off-by: Michel Lespinasse
---
include/linux/mm.h | 34 +
Set ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT so that the speculative fault
handling code can be compiled on this architecture.
Signed-off-by: Michel Lespinasse
---
arch/x86/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 2792879d398e
wp_pfn_shared() or wp_page_shared() (both unreachable as we only
handle anon vmas so far) or handle_userfault() (needs an explicit
abort to handle non-speculatively).
Signed-off-by: Michel Lespinasse
---
mm/memory.c | 12 ++--
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/mm/memory.c
ind less readable.
Signed-off-by: Michel Lespinasse
---
include/linux/mmap_lock.h | 32
1 file changed, 16 insertions(+), 16 deletions(-)
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index 4e27f755766b..8ff276a7560e 100644
--- a/include/l
tical between the two cases.
This change reduces the code duplication between the two cases.
Signed-off-by: Michel Lespinasse
---
mm/memory.c | 85 +++--
1 file changed, 37 insertions(+), 48 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
change do_numa_page() to use pte_spinlock() when locking the page table,
so that the mmap sequence counter will be validated in the speculative case.
Signed-off-by: Michel Lespinasse
---
mm/memory.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/memory.c b/mm
Define the new FAULT_FLAG_SPECULATIVE flag, which indicates when we are
attempting speculative fault handling (without holding the mmap lock).
Signed-off-by: Michel Lespinasse
---
include/linux/mm.h | 5 -
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/include/linux/mm.h b
Change handle_pte_fault() to allow speculative fault execution to proceed
through do_numa_page().
do_swap_page() does not implement speculative execution yet, so it
needs to abort with VM_FAULT_RETRY in that case.
Signed-off-by: Michel Lespinasse
---
mm/memory.c | 15 ++-
1 file
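The abort-and-retry structure described above (speculative handlers that cannot proceed return a retry code, and the caller redoes the fault non-speculatively) follows a common optimistic-fallback pattern. A sketch with invented names, not the actual mm code:

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch of the fallback pattern: attempt the fault speculatively;
 * any handler that does not support speculation (e.g. one needing
 * I/O, like swap-in) returns a retry code, and the caller redoes the
 * fault the classic way with the mmap lock held. Names are invented. */
enum fault_result { FAULT_OK, FAULT_RETRY };

static enum fault_result handle_fault(bool speculative, bool needs_io)
{
    if (speculative && needs_io)
        return FAULT_RETRY;     /* not implemented speculatively yet */
    return FAULT_OK;
}

static enum fault_result do_fault(bool needs_io)
{
    if (handle_fault(true, needs_io) == FAULT_OK)
        return FAULT_OK;                    /* fast path succeeded */
    return handle_fault(false, needs_io);   /* fall back under lock */
}
```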
Defer freeing of vma->vm_file when freeing vmas.
This is to allow speculative page faults in the mapped file case.
Signed-off-by: Michel Lespinasse
---
fs/exec.c | 1 +
kernel/fork.c | 17 +++--
mm/mmap.c | 11 +++
mm/nommu.c| 6 ++
4 files changed,
page faulting code, and some code has to
be added there to try speculative fault handling first.
Signed-off-by: Michel Lespinasse
---
mm/Kconfig | 22 ++
1 file changed, 22 insertions(+)
diff --git a/mm/Kconfig b/mm/Kconfig
index 24c045b24b95..322bda319dea 100644
--- a/mm/Kconfig
, and that readahead is not
necessary at this time. In all other cases, the fault is aborted to be
handled non-speculatively.
Signed-off-by: Michel Lespinasse
---
mm/filemap.c | 45 -
1 file changed, 44 insertions(+), 1 deletion(-)
diff --git a/mm
in order to satisfy pte_map_lock()'s preconditions.
Signed-off-by: Michel Lespinasse
---
mm/memory.c | 31 ++-
1 file changed, 22 insertions(+), 9 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index eea72bd78d06..547d9d0ee962 100644
--- a/mm/memory.c
() API is kept as a wrapper around
do_handle_mm_fault() so that we do not have to immediately update
every handle_mm_fault() call site.
Signed-off-by: Michel Lespinasse
---
include/linux/mm.h | 12 +---
mm/memory.c| 10 +++---
2 files changed, 16 insertions(+), 6 deletions
trying that unimplemented case.
Signed-off-by: Michel Lespinasse
---
arch/x86/mm/fault.c | 3 ++-
include/linux/mm.h | 14 ++
mm/memory.c | 17 -
3 files changed, 28 insertions(+), 6 deletions(-)
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
is set (the original pte was not
pte_none), catch speculative faults and return VM_FAULT_RETRY as
those cases are not implemented yet. Also assert that do_fault()
is not reached in the speculative case.
Signed-off-by: Michel Lespinasse
---
arch/x86/mm/fault.c | 2 +-
mm/memory.c |
anymore, as it is now running within an rcu read lock.
Signed-off-by: Michel Lespinasse
---
fs/xfs/xfs_file.c | 3 +++
mm/memory.c | 22 --
2 files changed, 23 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index a007ca0711d9..b360
when finally committing
the faulted page to the mm address space.
Signed-off-by: Michel Lespinasse
---
mm/memory.c | 74 ++---
1 file changed, 42 insertions(+), 32 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index fc555fae0844..ab3160719bf3
lative fault handling.
The speculative handling case also does not preallocate page tables,
as it is always called with a pre-existing page table.
Signed-off-by: Michel Lespinasse
---
mm/memory.c | 63 +++--
1 file changed, 42 insertions(+), 21 deleti
update_mmu_tlb() can be used instead of update_mmu_cache() when the
page fault handler detects that it lost the race to another page fault.
Signed-off-by: Michel Lespinasse
---
mm/memory.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/memory.c b/mm/memory.c
index
: Michel Lespinasse
---
arch/arm64/mm/fault.c | 52 +++
1 file changed, 52 insertions(+)
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index f37d4e3830b7..3757bfbb457a 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -25,6 +25,7
Add a speculative field to the vm_operations_struct, which indicates if
the associated file type supports speculative faults.
Initially this is set for files that implement fault() with filemap_fault().
Signed-off-by: Michel Lespinasse
---
fs/btrfs/file.c| 1 +
fs/cifs/file.c | 1 +
fs
the anon case, but maybe not as clear for the file cases.
- Is the Android use case compelling enough to merge the entire patchset ?
- Can we use this as a foundation for other mmap scalability work ?
I hear several proposals involving the idea of RCU based fault handling,
and hope this propo
We just need to make sure ext4_filemap_fault() doesn't block in the
speculative case as it is called with an rcu read lock held.
Signed-off-by: Michel Lespinasse
---
fs/ext4/file.c | 1 +
fs/ext4/inode.c | 7 ++-
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/f
We just need to make sure f2fs_filemap_fault() doesn't block in the
speculative case as it is called with an rcu read lock held.
Signed-off-by: Michel Lespinasse
---
fs/f2fs/file.c | 8 +++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
Set ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT so that the speculative fault
handling code can be compiled on this architecture.
Signed-off-by: Michel Lespinasse
---
arch/arm64/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index e4e1b6550115
tables.
Signed-off-by: Michel Lespinasse
---
include/linux/mm.h | 4 +++
mm/memory.c| 77 --
2 files changed, 79 insertions(+), 2 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index d5988e78e6ab..dee8a4833779 100644
--- a
tests that do not have any frequent
concurrent page faults ! This is because rcu safe vma freeing prevents
recently released vmas from being immediately reused in a new thread.
Signed-off-by: Michel Lespinasse
---
kernel/fork.c | 8 +---
1 file changed, 5 insertions(+), 3 deletions(-)
diff
In the speculative case, we want to avoid direct pmd checks (which
would require some extra synchronization to be safe), and rely on
pte_map_lock which will both lock the page table and verify that the
pmd has not changed from its initial value.
Signed-off-by: Michel Lespinasse
---
mm/memory.c
h any mmap writer.
This is very similar to a seqlock, but both the writer and speculative
readers are allowed to block. In the fail case, the speculative reader
does not spin on the sequence counter; instead it should fall back to
a different mechanism such as grabbing the mmap lock read side
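The counter protocol described above can be sketched with C11 atomics. This is a userspace illustration of the sequence-count pattern with invented names, not the patchset's code: writers bump the counter to odd while updating and back to even when done; a speculative reader snapshots it, does its work, and revalidates, falling back to the lock instead of spinning on failure.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace sketch of a sequence counter guarding speculative reads.
 * Odd value: a writer is mid-update. Names are invented. */
static _Atomic unsigned int mm_seq;

static unsigned int speculation_begin(void)
{
    return atomic_load_explicit(&mm_seq, memory_order_acquire);
}

static bool speculation_valid(unsigned int snap)
{
    if (snap & 1)
        return false;           /* writer was active when we started */
    atomic_thread_fence(memory_order_acquire);
    return atomic_load_explicit(&mm_seq, memory_order_relaxed) == snap;
}

static void writer_begin(void) { atomic_fetch_add(&mm_seq, 1); }
static void writer_end(void)   { atomic_fetch_add(&mm_seq, 1); }
```

Unlike a kernel seqlock reader, a failed speculative reader here would not loop on the counter; it would take the mmap lock read side and handle the fault the classic way.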
Add a new CONFIG_SPECULATIVE_PAGE_FAULT_STATS config option,
and dump extra statistics about executed spf cases and abort reasons
when the option is set.
Signed-off-by: Michel Lespinasse
---
arch/x86/mm/fault.c | 19 +++---
include/linux/mmap_lock.h | 19 +-
include
when finalizing the fault.
Signed-off-by: Michel Lespinasse
---
arch/x86/mm/fault.c | 36 +++
include/linux/vm_event_item.h | 4
mm/vmstat.c | 4
3 files changed, 44 insertions(+)
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm
Performance tuning: single threaded userspace does not benefit from
speculative page faults, so we turn them off to avoid any related
(small) extra overheads.
Signed-off-by: Michel Lespinasse
---
arch/x86/mm/fault.c | 5 +
1 file changed, 5 insertions(+)
diff --git a/arch/x86/mm/fault.c b