[PATCH 5/8] mm: use vm_unmapped_area() in hugetlbfs on ia64 architecture

2013-01-08 Thread Michel Lespinasse
Update the ia64 hugetlb_get_unmapped_area function to make use of vm_unmapped_area() instead of implementing a brute force search. Signed-off-by: Michel Lespinasse --- arch/ia64/mm/hugetlbpage.c | 20 +--- 1 files changed, 9 insertions(+), 11 deletions(-) diff --git a/arch

[PATCH 6/8] mm: remove free_area_cache use in powerpc architecture

2013-01-08 Thread Michel Lespinasse
vm_unmapped_area() infrastructure and regain the performance. Signed-off-by: Michel Lespinasse --- arch/powerpc/include/asm/page_64.h |3 +- arch/powerpc/mm/hugetlbpage.c|2 +- arch/powerpc/mm/slice.c | 108 + arch/powerpc

[PATCH 8/8] mm: remove free_area_cache

2013-01-08 Thread Michel Lespinasse
Since all architectures have been converted to use vm_unmapped_area(), there is no remaining use for the free_area_cache. Signed-off-by: Michel Lespinasse --- arch/arm/mm/mmap.c |2 -- arch/arm64/mm/mmap.c |2 -- arch/mips/mm/mmap.c |2

[PATCH 7/8] mm: use vm_unmapped_area() on powerpc architecture

2013-01-08 Thread Michel Lespinasse
Update the powerpc slice_get_unmapped_area function to make use of vm_unmapped_area() instead of implementing a brute force search. Signed-off-by: Michel Lespinasse --- arch/powerpc/mm/slice.c | 128 +- 1 files changed, 81 insertions(+), 47

[PATCH 4/8] mm: use vm_unmapped_area() on ia64 architecture

2013-01-08 Thread Michel Lespinasse
Update the ia64 arch_get_unmapped_area function to make use of vm_unmapped_area() instead of implementing a brute force search. Signed-off-by: Michel Lespinasse --- arch/ia64/kernel/sys_ia64.c | 37 - 1 files changed, 12 insertions(+), 25 deletions

[PATCH 0/8] vm_unmapped_area: finish the mission

2013-01-08 Thread Michel Lespinasse
ested. Michel Lespinasse (8): mm: use vm_unmapped_area() on parisc architecture mm: use vm_unmapped_area() on alpha architecture mm: use vm_unmapped_area() on frv architecture mm: use vm_unmapped_area() on ia64 architecture mm: use vm_unmapped_area() in hugetlbfs on ia64 architecture mm: r

[PATCH 3/8] mm: use vm_unmapped_area() on frv architecture

2013-01-08 Thread Michel Lespinasse
Update the frv arch_get_unmapped_area function to make use of vm_unmapped_area() instead of implementing a brute force search. Signed-off-by: Michel Lespinasse --- arch/frv/mm/elf-fdpic.c | 49 -- 1 files changed, 17 insertions(+), 32 deletions

[PATCH 1/8] mm: use vm_unmapped_area() on parisc architecture

2013-01-08 Thread Michel Lespinasse
Update the parisc arch_get_unmapped_area function to make use of vm_unmapped_area() instead of implementing a brute force search. Signed-off-by: Michel Lespinasse --- arch/parisc/kernel/sys_parisc.c | 46 ++ 1 files changed, 17 insertions(+), 29 deletions

Re: [PATCH 0/8] vm_unmapped_area: finish the mission

2013-01-08 Thread Michel Lespinasse
Whoops, I was supposed to find a more appropriate subject line before sending this :] On Tue, Jan 8, 2013 at 5:28 PM, Michel Lespinasse wrote: > These patches, which apply on top of v3.8-rc kernels, are to complete the > VMA gap finding code I introduced (following Rik's initial p

Re: [PATCH 7/8] mm: use vm_unmapped_area() on powerpc architecture

2013-01-08 Thread Michel Lespinasse
On Tue, Jan 8, 2013 at 6:15 PM, Benjamin Herrenschmidt wrote: > On Tue, 2013-01-08 at 17:28 -0800, Michel Lespinasse wrote: >> Update the powerpc slice_get_unmapped_area function to make use of >> vm_unmapped_area() instead of implementing a brute force search. >> >

rwlock_t unfairness and tasklist_lock

2013-01-08 Thread Michel Lespinasse
Like others before me, I have discovered how easy it is to DOS a system by abusing the rwlock_t unfairness and causing the tasklist_lock read side to be continuously held (my abuse code makes use of the getpriority syscall, but there are plenty of other ways anyway). My understanding is that the i

Re: [PATCH 7/8] mm: use vm_unmapped_area() on powerpc architecture

2013-01-09 Thread Michel Lespinasse
eadable and it will avoid a fuckup in the future if > somebody changes the algorithm and forgets to update one of the > copies :-) All right, does the following look more palatable then ? (didn't re-test it, though) Signed-off-by: Michel Lespinasse --- arch/powerpc/mm/slice.c | 1

Re: rwlock_t unfairness and tasklist_lock

2013-01-09 Thread Michel Lespinasse
On Wed, Jan 9, 2013 at 9:49 AM, Oleg Nesterov wrote: > On 01/08, Michel Lespinasse wrote: >> Like others before me, I have discovered how easy it is to DOS a >> system by abusing the rwlock_t unfairness and causing the >> tasklist_lock read side to be continuously held

Re: [PATCH 3/5] x86,smp: auto tune spinlock backoff delay factor

2013-01-10 Thread Michel Lespinasse
On Tue, Jan 8, 2013 at 2:30 PM, Rik van Riel wrote: > v3: use fixed-point math for the delay calculations, suggested by Michel > Lespinasse > > - if (head == ticket) > + if (head == ticket) { > + /* > +

Re: [PATCH 4/5] x86,smp: keep spinlock delay values per hashed spinlock address

2013-01-10 Thread Michel Lespinasse
etecting hash collisions to protect us against varying hold times, because this case could happen even with a single spinlock. So we need to make sure the base algorithm is robust and converges towards using the shorter of the spinlock hold times; if we have that then forcing a reset to MIN_SPIN

Re: [PATCH 4/5] x86,smp: keep spinlock delay values per hashed spinlock address

2013-01-10 Thread Michel Lespinasse
On Thu, Jan 10, 2013 at 5:05 AM, Rik van Riel wrote: > Eric, > > with just patches 1-3, can you still reproduce the > regression on your system? > > In other words, could we get away with dropping the > complexity of patch 4, or do we still need it? To be clear, I must say that I'm not opposing p

Re: test11-pre5 breaks vmware

2000-11-15 Thread Michel LESPINASSE
On Wed, Nov 15, 2000 at 12:12:15PM -0800, H. Peter Anvin wrote: > Also, if a piece of software needs raw CPUID information (unlike the > "cooked" one provided by recent kernels) it should use > /dev/cpu/*/cpuid. Is it also OK to use the cpuid opcode in userspace ? (after checking for its presence

Re: Signal 11

2000-12-07 Thread Michel LESPINASSE
On Fri, Dec 08, 2000 at 09:44:29AM +0900, Rainer Mager wrote: > I've heard that signal 11 can be related to bad hardware, most > often memory, but I've done a good bit of testing on this and the > system seems ok. What I did was to run the VA Linux Cerberos(sp?) > test for 15 hours+ with n

Asus P1-AH2 won't suspend (regression)

2008-01-04 Thread Michel Lespinasse
n be found at http://lespinasse.org/config-2.6.24-rc6 if that's any help. Thanks, -- Michel Lespinasse

nfsroot and locking

2007-06-04 Thread Michel Lespinasse
Hi, I have a quick question about nfsroot in linux: Is there any way to use nfsv3 in the nfsroot (nolock is OK there), and then mount other directories with locking enabled ? The default nfs options when using nfsroot are to use nfsv2 without locking. After booting, one can mount other filesystem

Re: e1000 issue on DQ965GF board (was 24 lost ticks with 2.6.20.10 kernel)

2007-05-04 Thread Michel Lespinasse
On Fri, May 04, 2007 at 11:25:43AM -0700, Kok, Auke wrote: > can you try turning off the "management enable" function in the BIOS of the > DQ965GF? That fixes this issue for us in our labs. A fix for this is also > available in our standalone 7.5.5.1 driver (obtainable from e1000.sf.net), > but

24 lost ticks with 2.6.20.10 kernel

2007-05-01 Thread Michel Lespinasse
Hi, Sorry if this is known, I am not on the list. I'm having an issue with lost ticks, running Linux 2.6.20.10 on an Intel DQ965GF motherboard. For some reason this occurs with clock-like regularity, always exactly 24 lost ticks, about every two seconds. This is running with 250-HZ ticks, and the

Re: 24 lost ticks with 2.6.20.10 kernel

2007-05-01 Thread Michel Lespinasse
board E1000). On Tue, May 01, 2007 at 11:34:28AM -0400, Chuck Ebbert wrote: > Michel Lespinasse wrote: > > running with report_lost_ticks, I see the following: > > > > May 1 12:58:57 server kernel: time.c: Lost 24 timer tick(s)! rip > > _spin_unlock_irqrestore+0x8/0x9)

Re: 24 lost ticks with 2.6.20.10 kernel

2007-05-02 Thread Michel Lespinasse
On Tue, May 01, 2007 at 03:08:48PM -0700, Kok, Auke wrote: > Michel Lespinasse wrote: > >(I've added the E1000 maintainers to the thread as I found the issue > >seems to go away after I compile out that driver. For reference, I was > >trying to figure out why I lose ex

e1000 issue on DQ965GF board (was 24 lost ticks with 2.6.20.10 kernel)

2007-05-02 Thread Michel Lespinasse
On Wed, May 02, 2007 at 11:14:52AM -0700, Kok, Auke wrote: > I just checked and the fix I was referring to earlier didn't make it into > 2.6.21-final. You can get 2.6.21-git1 from kernel.org which has the fix. See > > http://www.kernel.org/pub/linux/kernel/v2.6/snapshots/patch-2.6.21-git1.log Go

Re: [RFC PATCH 11/37] x86/mm: attempt speculative mm faults first

2021-04-07 Thread Michel Lespinasse
On Wed, Apr 07, 2021 at 04:48:44PM +0200, Peter Zijlstra wrote: > On Tue, Apr 06, 2021 at 06:44:36PM -0700, Michel Lespinasse wrote: > > --- a/arch/x86/mm/fault.c > > +++ b/arch/x86/mm/fault.c > > @@ -1219,6 +1219,8 @@ void do_user_addr_fault(struct pt_regs *regs, > >

Re: [RFC PATCH 11/37] x86/mm: attempt speculative mm faults first

2021-04-07 Thread Michel Lespinasse
On Wed, Apr 07, 2021 at 01:14:53PM -0700, Michel Lespinasse wrote: > On Wed, Apr 07, 2021 at 04:48:44PM +0200, Peter Zijlstra wrote: > > On Tue, Apr 06, 2021 at 06:44:36PM -0700, Michel Lespinasse wrote: > > > --- a/arch/x86/mm/fault.c > > > +++ b/arch/x86/mm/fault

Re: [RFC PATCH 11/37] x86/mm: attempt speculative mm faults first

2021-04-07 Thread Michel Lespinasse
On Wed, Apr 07, 2021 at 04:35:28PM +0100, Matthew Wilcox wrote: > On Wed, Apr 07, 2021 at 04:48:44PM +0200, Peter Zijlstra wrote: > > On Tue, Apr 06, 2021 at 06:44:36PM -0700, Michel Lespinasse wrote: > > > --- a/arch/x86/mm/fault.c > > > +++ b/arch/x86/mm/fault.c >

Re: [RFC PATCH 09/37] mm: add per-mm mmap sequence counter for speculative page fault handling.

2021-04-07 Thread Michel Lespinasse
On Wed, Apr 07, 2021 at 04:47:34PM +0200, Peter Zijlstra wrote: > On Tue, Apr 06, 2021 at 06:44:34PM -0700, Michel Lespinasse wrote: > > The counter's write side is hooked into the existing mmap locking API: > > mmap_write_lock() increments the counter to the n

Re: [RFC PATCH 24/37] mm: implement speculative handling in __do_fault()

2021-04-07 Thread Michel Lespinasse
On Wed, Apr 07, 2021 at 04:40:34PM +0200, Peter Zijlstra wrote: > On Tue, Apr 06, 2021 at 06:44:49PM -0700, Michel Lespinasse wrote: > > In the speculative case, call the vm_ops->fault() method from within > > an rcu read locked section, and verify the mmap sequence lock at th

Re: [RFC PATCH 34/37] mm: rcu safe vma freeing only for multithreaded user space

2021-04-08 Thread Michel Lespinasse
On Wed, Apr 07, 2021 at 03:50:06AM +0100, Matthew Wilcox wrote: > On Tue, Apr 06, 2021 at 06:44:59PM -0700, Michel Lespinasse wrote: > > Performance tuning: as single threaded userspace does not use > > speculative page faults, it does not require rcu safe vma freeing. > > T

Re: [RFC PATCH 24/37] mm: implement speculative handling in __do_fault()

2021-04-08 Thread Michel Lespinasse
On Thu, Apr 08, 2021 at 08:13:43AM +0100, Matthew Wilcox wrote: > On Thu, Apr 08, 2021 at 09:00:26AM +0200, Peter Zijlstra wrote: > > On Wed, Apr 07, 2021 at 10:27:12PM +0100, Matthew Wilcox wrote: > > > Doing I/O without any lock held already works; it just uses the file > > > refcount. It would

Re: [PATCH v2 05/12] rbtree: performance and correctness test

2012-07-13 Thread Michel Lespinasse
On Fri, Jul 13, 2012 at 1:15 PM, Andrew Morton wrote: > On Thu, 12 Jul 2012 17:31:50 -0700 Michel Lespinasse > wrote: >> Makefile|2 +- >> lib/Kconfig.debug |1 + >> tests/Kconfig | 18 +++ >> tests/Makefile |1 +

Re: [PATCH v2 05/12] rbtree: performance and correctness test

2012-07-13 Thread Michel Lespinasse
On Fri, Jul 13, 2012 at 3:45 PM, Andrew Morton wrote: > On Fri, 13 Jul 2012 15:33:35 -0700 Michel Lespinasse > wrote: >> Ah, I did not realize we had a precedent for in-tree kernel test modules. > > hm, well, just because that's what we do now doesn't mean that i

[PATCH v2 05/12] rbtree: performance and correctness test

2012-07-13 Thread Michel Lespinasse
leaf nodes have the same number of black nodes, - root node is black Signed-off-by: Michel Lespinasse --- lib/Kconfig.debug |7 +++ lib/Makefile |2 + lib/rbtree_test.c | 135 + 3 files changed, 144 insertions(+), 0 deletions
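The two invariants named in this snippet (equal black-node count on every path to a leaf, and a black root) can be illustrated with a small userspace check. This is only a sketch: `struct sketch_node`, `black_count()`, `sketch_tree_ok()` and `sketch_check()` are hypothetical names, not the contents of lib/rbtree_test.c, and the kernel's `struct rb_node` stores its color differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace node; not the kernel's struct rb_node. */
struct sketch_node {
    struct sketch_node *left, *right;
    bool red;
};

/*
 * Number of black nodes on any path from n down to a NULL leaf
 * (the NULL leaf itself counts as one), or -1 if two paths disagree.
 */
static int black_count(const struct sketch_node *n)
{
    if (n == NULL)
        return 1;
    int lh = black_count(n->left);
    int rh = black_count(n->right);
    if (lh < 0 || rh < 0 || lh != rh)
        return -1;
    return lh + (n->red ? 0 : 1);
}

/* True if the root is black and all paths carry the same black count. */
static bool sketch_tree_ok(const struct sketch_node *root)
{
    if (root == NULL)
        return true;
    return !root->red && black_count(root) > 0;
}

/* Assertion-style wrapper, in the spirit of a self-test. */
static void sketch_check(const struct sketch_node *root)
{
    assert(sketch_tree_ok(root));
}
```

Only the two properties visible in the snippet are checked here; whatever the truncated part of the message lists is deliberately left out.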

[PATCH] ipc/mqueue: remove unnecessary rb_init_node calls

2012-07-18 Thread Michel Lespinasse
d the fix in order to try out the patches. So here it is :) - Forwarded message from Michel Lespinasse - Date: Tue, 17 Jul 2012 17:30:35 -0700 From: Michel Lespinasse To: Andrew Morton Cc: Doug Ledford Subject: [PATCH] ipc/mqueue: remove unnecessary rb_init_node calls Commits d662985

[PATCH] rbtree: fix jffs2 build issue due to renamed __rb_parent_color field

2012-07-18 Thread Michel Lespinasse
here. Signed-off-by: Michel Lespinasse --- fs/jffs2/readinode.c |6 -- 1 files changed, 4 insertions(+), 2 deletions(-) diff --git a/fs/jffs2/readinode.c b/fs/jffs2/readinode.c index dc0437e..b00fc50 100644 --- a/fs/jffs2/readinode.c +++ b/fs/jffs2/readinode.c @@ -395,7 +395,9 @@ stati

[PATCH 1/6] rbtree: rb_erase updates and comments

2012-07-20 Thread Michel Lespinasse
) and case 3 (node to remove has 2 children, successor is a left-descendant of the right child). Signed-off-by: Michel Lespinasse --- lib/rbtree.c | 115 -- 1 files changed, 72 insertions(+), 43 deletions(-) diff --git a/lib/rbtree.c b/lib

[PATCH 3/6] augmented rbtree test

2012-07-20 Thread Michel Lespinasse
Signed-off-by: Michel Lespinasse --- lib/rbtree_test.c | 103 +++- 1 files changed, 101 insertions(+), 2 deletions(-) diff --git a/lib/rbtree_test.c b/lib/rbtree_test.c index 4c6d250..2dfafe4 100644 --- a/lib/rbtree_test.c +++ b/lib/rbtree_test.c

[PATCH 4/6] rbtree: faster augmented insert

2012-07-20 Thread Michel Lespinasse
work, as my compiler output is now *smaller* than before for that function. Speed wise, they seem comparable though. Signed-off-by: Michel Lespinasse --- include/linux/rbtree.h |5 + lib/rbtree.c | 14 +- lib/rbtree_test.c | 31 +++--

[PATCH 6/6] rbtree: remove prior augmented rbtree implementation

2012-07-20 Thread Michel Lespinasse
006e rb_replace_node Signed-off-by: Michel Lespinasse --- arch/x86/mm/pat_rbtree.c | 52 + include/linux/rbtree.h |8 - lib/rbtree.c | 71 -- 3 files changed, 33 insertions(+), 98 deletions

[PATCH 5/6] rbtree: faster augmented erase

2012-07-20 Thread Michel Lespinasse
together, but we still call into a generic __rb_erase_color() (passing a non-inlined callback function) for the rebalancing work. This is intended to strike a reasonable compromise between speed and compiled code size. Signed-off-by: Michel Lespinasse --- include/linux/rbtree.h |5

[RFC PATCH 0/6] augmented rbtree changes

2012-07-20 Thread Michel Lespinasse
all the way to the root (and it should be more efficient too - most of the nodes in a balanced tree are on the last few levels, so having to go all the way back to the root really is wasteful), I have not found a nice elegant way to do that yet, let alone in a generic way. If someone wants to try

[PATCH 2/6] rbtree: optimize fetching of sibling node

2012-07-20 Thread Michel Lespinasse
fetched child This avoids fetching the parent's left child when node is actually that child. Saves a bit on code size, though it doesn't seem to make a large difference in speed. Signed-off-by: Michel Lespinasse --- lib/rbtree.c | 21 + 1 files changed, 13 insertions(
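The idea of not loading the parent's left child when the node already is that child can be sketched roughly as follows. The type and function names are hypothetical and this is not the patch's code; it assumes the node and its sibling are distinct pointers, as they would be during rebalancing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-pointer node used only for this illustration. */
struct sib_node {
    struct sib_node *left, *right;
};

/*
 * Load the right child first. If it differs from node, node must be the
 * left child and the right child is the sibling, so parent->left is never
 * read. Only when node is the right child is the left pointer loaded.
 */
static struct sib_node *sibling_of(const struct sib_node *parent,
                                   const struct sib_node *node)
{
    assert(parent != NULL);
    struct sib_node *sibling = parent->right;
    if (node != sibling)
        return sibling;
    return parent->left;
}
```

A more direct version would compare node against parent->left first and then load parent->right; the sketch saves that first load in the left-child case.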

Re: [PATCH v5 06/10] mmap locking API: convert nested write lock sites

2020-05-19 Thread Michel Lespinasse
On Mon, May 18, 2020 at 12:32:03PM +0200, Vlastimil Babka wrote: > On 4/22/20 2:14 AM, Michel Lespinasse wrote: > > Add API for nested write locks and convert the few call sites doing that. > > > > Signed-off-by: Michel Lespinasse > > Reviewed-by: Daniel Jordan >

Re: [PATCH v5 08/10] mmap locking API: add MMAP_LOCK_INITIALIZER

2020-05-19 Thread Michel Lespinasse
On Mon, May 18, 2020 at 12:45:06PM +0200, Vlastimil Babka wrote: > On 4/22/20 2:14 AM, Michel Lespinasse wrote: > > Define a new initializer for the mmap locking api. > > Initially this just evaluates to __RWSEM_INITIALIZER as the API > > is defined as wrappers around rwsem.

Re: [PATCH v5.5 09/10] mmap locking API: add mmap_assert_locked() and mmap_assert_write_locked()

2020-05-19 Thread Michel Lespinasse
On Mon, May 18, 2020 at 01:01:33PM +0200, Vlastimil Babka wrote: > On 4/24/20 3:38 AM, Michel Lespinasse wrote: > > +static inline void mmap_assert_locked(struct mm_struct *mm) > > +{ > > + VM_BUG_ON_MM(!lockdep_is_held_type(&mm->mmap_sem, -1), mm); > > +

Re: [PATCH v5.5 10/10] mmap locking API: rename mmap_sem to mmap_lock

2020-05-19 Thread Michel Lespinasse
On Mon, May 18, 2020 at 03:45:22PM +0200, Laurent Dufour wrote: > Le 24/04/2020 à 03:39, Michel Lespinasse a écrit : > > Rename the mmap_sem field to mmap_lock. Any new uses of this lock > > should now go through the new mmap locking api. The mmap_lock is > > still implement

Re: [PATCH v5.5 10/10] mmap locking API: rename mmap_sem to mmap_lock

2020-05-19 Thread Michel Lespinasse
On Mon, May 18, 2020 at 01:07:26PM +0200, Vlastimil Babka wrote: > Any plan about all the code comments mentioning mmap_sem? :) Not urgent. It's mostly a sed job, I'll add it in the next version as it seems the patchset is getting ready for inclusion. -- Michel "Walken" Lespinasse A program is n

Re: [PATCH v5.5 10/10] mmap locking API: rename mmap_sem to mmap_lock

2020-05-19 Thread Michel Lespinasse
On Tue, May 19, 2020 at 11:15 AM John Hubbard wrote: > On 2020-05-19 08:32, Matthew Wilcox wrote: > > On Tue, May 19, 2020 at 03:20:40PM +0200, Laurent Dufour wrote: > >> Le 19/05/2020 à 15:10, Michel Lespinasse a écrit : > >>> On Mon, May 18, 2020 at 03:45:22

[PATCH v6 01/12] mmap locking API: initial implementation as rwsem wrappers

2020-05-19 Thread Michel Lespinasse
point for replacing the rwsem implementation with a different one, such as range locks. Signed-off-by: Michel Lespinasse Reviewed-by: Daniel Jordan Reviewed-by: Davidlohr Bueso Reviewed-by: Laurent Dufour Reviewed-by: Vlastimil Babka --- include/linux/mm.h| 1 + include/linux

[PATCH v6 12/12] mmap locking API: convert mmap_sem comments

2020-05-19 Thread Michel Lespinasse
Convert comments that reference mmap_sem to reference mmap_lock instead. Signed-off-by: Michel Lespinasse --- .../admin-guide/mm/numa_memory_policy.rst | 10 ++--- Documentation/admin-guide/mm/userfaultfd.rst | 2 +- Documentation/filesystems/locking.rst | 2 +- Documentation/vm

[PATCH v6 05/12] mmap locking API: convert mmap_sem call sites missed by coccinelle

2020-05-19 Thread Michel Lespinasse
Convert the last few remaining mmap_sem rwsem calls to use the new mmap locking API. These were missed by coccinelle for some reason (I think coccinelle does not support some of the preprocessor constructs in these files ?) Signed-off-by: Michel Lespinasse Reviewed-by: Daniel Jordan Reviewed-by

[PATCH v6 00/12] Add a new mmap locking API wrapping mmap_sem calls

2020-05-19 Thread Michel Lespinasse
ould be delayed for a bit, so that we'd get a chance to convert any new code that locks mmap_sem in the -rc1 release before applying that last patch. Michel Lespinasse (12): mmap locking API: initial implementation as rwsem wrappers MMU notifier: use the new mmap locking API DMA reser

[PATCH v6 09/12] mmap locking API: add mmap_assert_locked() and mmap_assert_write_locked()

2020-05-19 Thread Michel Lespinasse
Add new APIs to assert that mmap_sem is held. Using this instead of rwsem_is_locked and lockdep_assert_held[_write] makes the assertions more tolerant of future changes to the lock type. Signed-off-by: Michel Lespinasse --- arch/x86/events/core.c| 2 +- fs/userfaultfd.c | 6

[PATCH v6 02/12] MMU notifier: use the new mmap locking API

2020-05-19 Thread Michel Lespinasse
This use is converted manually ahead of the next patch in the series, as it requires including a new header which the automated conversion would miss. Signed-off-by: Michel Lespinasse Reviewed-by: Daniel Jordan Reviewed-by: Davidlohr Bueso Reviewed-by: Laurent Dufour Reviewed-by: Vlastimil

[PATCH v6 11/12] mmap locking API: convert mmap_sem API comments

2020-05-19 Thread Michel Lespinasse
Convert comments that reference old mmap_sem APIs to reference corresponding new mmap locking APIs instead. Signed-off-by: Michel Lespinasse --- Documentation/vm/hmm.rst | 6 +++--- arch/alpha/mm/fault.c | 2 +- arch/ia64/mm/fault.c | 2 +- arch/m68k/mm/fault.c

[PATCH v6 06/12] mmap locking API: convert nested write lock sites

2020-05-19 Thread Michel Lespinasse
Add API for nested write locks and convert the few call sites doing that. Signed-off-by: Michel Lespinasse Reviewed-by: Daniel Jordan Reviewed-by: Laurent Dufour Reviewed-by: Vlastimil Babka --- arch/um/include/asm/mmu_context.h | 3 ++- include/linux/mmap_lock.h | 5 + kernel

[PATCH v6 10/12] mmap locking API: rename mmap_sem to mmap_lock

2020-05-19 Thread Michel Lespinasse
Rename the mmap_sem field to mmap_lock. Any new uses of this lock should now go through the new mmap locking api. The mmap_lock is still implemented as a rwsem, though this could change in the future. Signed-off-by: Michel Lespinasse Reviewed-by: Vlastimil Babka --- arch/ia64/mm/fault.c

[PATCH v6 08/12] mmap locking API: add MMAP_LOCK_INITIALIZER

2020-05-19 Thread Michel Lespinasse
Define a new initializer for the mmap locking api. Initially this just evaluates to __RWSEM_INITIALIZER as the API is defined as wrappers around rwsem. Signed-off-by: Michel Lespinasse Reviewed-by: Laurent Dufour Reviewed-by: Vlastimil Babka --- arch/x86/kernel/tboot.c| 2 +- drivers

[PATCH v6 07/12] mmap locking API: add mmap_read_trylock_non_owner()

2020-05-19 Thread Michel Lespinasse
least-ugly way of addressing this in the short term. Signed-off-by: Michel Lespinasse Reviewed-by: Daniel Jordan Reviewed-by: Vlastimil Babka --- include/linux/mmap_lock.h | 14 ++ kernel/bpf/stackmap.c | 17 + 2 files changed, 19 insertions(+), 12 deletions(-)

[PATCH v6 03/12] DMA reservations: use the new mmap locking API

2020-05-19 Thread Michel Lespinasse
This use is converted manually ahead of the next patch in the series, as it requires including a new header which the automated conversion would miss. Signed-off-by: Michel Lespinasse Reviewed-by: Daniel Jordan Reviewed-by: Laurent Dufour Reviewed-by: Vlastimil Babka --- drivers/dma-buf/dma

Re: [PATCH v5.5 10/10] mmap locking API: rename mmap_sem to mmap_lock

2020-05-20 Thread Michel Lespinasse
On Wed, May 20, 2020 at 12:32 AM John Hubbard wrote: > On 2020-05-19 19:39, Michel Lespinasse wrote: > >> That gives you additional options inside internal_get_user_pages_fast(), > >> such > >> as, approximately: > >> > >> if (!(gup_flags & F

Re: [PATCH v6 05/12] mmap locking API: convert mmap_sem call sites missed by coccinelle

2020-05-20 Thread Michel Lespinasse
Looks good. I'm not sure if you need a review, but just in case: On Wed, May 20, 2020 at 8:23 PM Andrew Morton wrote: > On Tue, 19 May 2020 22:29:01 -0700 Michel Lespinasse > wrote: > > > Convert the last few remaining mmap_sem rwsem calls to use the new > > mmap lock

Re: [PATCH v6 12/12] mmap locking API: convert mmap_sem comments

2020-05-20 Thread Michel Lespinasse
Looks good, thanks ! On Wed, May 20, 2020 at 8:22 PM Andrew Morton wrote: > On Tue, 19 May 2020 22:29:08 -0700 Michel Lespinasse > wrote: > > Convert comments that reference mmap_sem to reference mmap_lock instead. > > This may not be complete.. > > From: Andrew Morton

Re: [PATCH v6 12/12] mmap locking API: convert mmap_sem comments

2020-05-21 Thread Michel Lespinasse
On Thu, May 21, 2020 at 12:42 AM Vlastimil Babka wrote: > On 5/20/20 7:29 AM, Michel Lespinasse wrote: > > Convert comments that reference mmap_sem to reference mmap_lock instead. > > > > Signed-off-by: Michel Lespinasse > > Reviewed-by: Vlastimil Babka >

[RFC PATCH 15/37] mm: implement speculative handling in do_anonymous_page()

2021-04-06 Thread Michel Lespinasse
Change do_anonymous_page() to handle the speculative case. This involves aborting speculative faults if they have to allocate a new anon_vma, and using pte_map_lock() instead of pte_offset_map_lock() to complete the page fault. Signed-off-by: Michel Lespinasse --- mm/memory.c | 17

[RFC PATCH 22/37] mm: enable speculative fault handling through do_swap_page()

2021-04-06 Thread Michel Lespinasse
Change do_swap_page() to allow speculative fault execution to proceed. Signed-off-by: Michel Lespinasse --- mm/memory.c | 5 - 1 file changed, 5 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index ab3160719bf3..6eddd7b4e89c 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3340,11

[RFC PATCH 14/37] mm: add pte_map_lock() and pte_spinlock()

2021-04-06 Thread Michel Lespinasse
that point the page table lock serializes any further races with concurrent mmap lock writers. If the mmap sequence count check fails, both functions will return false with the pte being left unmapped and unlocked. Signed-off-by: Michel Lespinasse --- include/linux/mm.h | 34 +
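The behaviour described here (take the page table lock, validate the mmap sequence count, and on failure return false with nothing left locked) might look roughly like the userspace sketch below. Everything in it is an assumption made for illustration: `struct fake_mm`, `struct fake_ptl` and `sketch_pte_spinlock()` are hypothetical stand-ins, the lock is modelled as a plain flag, and the real functions in the patch operate on kernel structures not shown in this snippet.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the mm; writers are assumed to bump mmap_seq. */
struct fake_mm {
    unsigned long mmap_seq;
};

/* Hypothetical stand-in for the page table lock, modelled as a flag. */
struct fake_ptl {
    bool locked;
};

/*
 * Take the page table lock, then compare the current sequence count with
 * the value sampled when the speculative fault started. On a mismatch the
 * lock is dropped again and false is returned, so the caller never holds
 * the lock after a failed check.
 */
static bool sketch_pte_spinlock(struct fake_mm *mm, struct fake_ptl *ptl,
                                unsigned long seq)
{
    assert(!ptl->locked);
    ptl->locked = true;
    if (mm->mmap_seq != seq) {
        ptl->locked = false;
        return false;
    }
    return true;
}
```

On success the caller is expected to release the lock itself once the fault has been completed; on failure it would fall back to the non-speculative path.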

[RFC PATCH 06/37] x86/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT

2021-04-06 Thread Michel Lespinasse
Set ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT so that the speculative fault handling code can be compiled on this architecture. Signed-off-by: Michel Lespinasse --- arch/x86/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 2792879d398e

[RFC PATCH 20/37] mm: implement and enable speculative fault handling in handle_pte_fault()

2021-04-06 Thread Michel Lespinasse
wp_pfn_shared() or wp_page_shared() (both unreachable as we only handle anon vmas so far) or handle_userfault() (needs an explicit abort to handle non-speculatively). Signed-off-by: Michel Lespinasse --- mm/memory.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/mm/memory.c

[RFC PATCH 02/37] mmap locking API: name the return values

2021-04-06 Thread Michel Lespinasse
ind less readable. Signed-off-by: Michel Lespinasse --- include/linux/mmap_lock.h | 32 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h index 4e27f755766b..8ff276a7560e 100644 --- a/include/l

[RFC PATCH 04/37] do_anonymous_page: reduce code duplication

2021-04-06 Thread Michel Lespinasse
tical between the two cases. This change reduces the code duplication between the two cases. Signed-off-by: Michel Lespinasse --- mm/memory.c | 85 +++-- 1 file changed, 37 insertions(+), 48 deletions(-) diff --git a/mm/memory.c b/mm/memory.c

[RFC PATCH 17/37] mm: implement speculative handling in do_numa_page()

2021-04-06 Thread Michel Lespinasse
Change do_numa_page() to use pte_spinlock() when locking the page table, so that the mmap sequence counter will be validated in the speculative case. Signed-off-by: Michel Lespinasse --- mm/memory.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/memory.c b/mm

[RFC PATCH 07/37] mm: add FAULT_FLAG_SPECULATIVE flag

2021-04-06 Thread Michel Lespinasse
Define the new FAULT_FLAG_SPECULATIVE flag, which indicates when we are attempting speculative fault handling (without holding the mmap lock). Signed-off-by: Michel Lespinasse --- include/linux/mm.h | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/include/linux/mm.h b

[RFC PATCH 18/37] mm: enable speculative fault handling in do_numa_page()

2021-04-06 Thread Michel Lespinasse
Change handle_pte_fault() to allow speculative fault execution to proceed through do_numa_page(). do_swap_page() does not implement speculative execution yet, so it needs to abort with VM_FAULT_RETRY in that case. Signed-off-by: Michel Lespinasse --- mm/memory.c | 15 ++- 1 file

[RFC PATCH 23/37] mm: rcu safe vma->vm_file freeing

2021-04-06 Thread Michel Lespinasse
Defer freeing of vma->vm_file when freeing vmas. This is to allow speculative page faults in the mapped file case. Signed-off-by: Michel Lespinasse --- fs/exec.c | 1 + kernel/fork.c | 17 +++-- mm/mmap.c | 11 +++ mm/nommu.c| 6 ++ 4 files changed,

[RFC PATCH 05/37] mm: introduce CONFIG_SPECULATIVE_PAGE_FAULT

2021-04-06 Thread Michel Lespinasse
page faulting code, and some code has to be added there to try speculative fault handling first. Signed-off-by: Michel Lespinasse --- mm/Kconfig | 22 ++ 1 file changed, 22 insertions(+) diff --git a/mm/Kconfig b/mm/Kconfig index 24c045b24b95..322bda319dea 100644 --- a/mm/Kconfig

[RFC PATCH 25/37] mm: implement speculative handling in filemap_fault()

2021-04-06 Thread Michel Lespinasse
, and that readahead is not necessary at this time. In all other cases, the fault is aborted to be handled non-speculatively. Signed-off-by: Michel Lespinasse --- mm/filemap.c | 45 - 1 file changed, 44 insertions(+), 1 deletion(-) diff --git a/mm

[RFC PATCH 19/37] mm: implement speculative handling in wp_page_copy()

2021-04-06 Thread Michel Lespinasse
in order to satisfy pte_map_lock()'s preconditions. Signed-off-by: Michel Lespinasse --- mm/memory.c | 31 ++- 1 file changed, 22 insertions(+), 9 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index eea72bd78d06..547d9d0ee962 100644 --- a/mm/memory.c

[RFC PATCH 08/37] mm: add do_handle_mm_fault()

2021-04-06 Thread Michel Lespinasse
() API is kept as a wrapper around do_handle_mm_fault() so that we do not have to immediately update every handle_mm_fault() call site. Signed-off-by: Michel Lespinasse --- include/linux/mm.h | 12 +--- mm/memory.c| 10 +++--- 2 files changed, 16 insertions(+), 6 deletions

[RFC PATCH 30/37] mm: enable speculative fault handling for supported file types.

2021-04-06 Thread Michel Lespinasse
trying that unimplemented case. Signed-off-by: Michel Lespinasse --- arch/x86/mm/fault.c | 3 ++- include/linux/mm.h | 14 ++ mm/memory.c | 17 - 3 files changed, 28 insertions(+), 6 deletions(-) diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c

[RFC PATCH 16/37] mm: enable speculative fault handling through do_anonymous_page()

2021-04-06 Thread Michel Lespinasse
is set (the original pte was not pte_none), catch speculative faults and return VM_FAULT_RETRY as those cases are not implemented yet. Also assert that do_fault() is not reached in the speculative case. Signed-off-by: Michel Lespinasse --- arch/x86/mm/fault.c | 2 +- mm/memory.c |

[RFC PATCH 27/37] mm: implement speculative handling in do_fault_around()

2021-04-06 Thread Michel Lespinasse
anymore, as it is now running within an rcu read lock. Signed-off-by: Michel Lespinasse --- fs/xfs/xfs_file.c | 3 +++ mm/memory.c | 22 -- 2 files changed, 23 insertions(+), 2 deletions(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index a007ca0711d9..b360

[RFC PATCH 21/37] mm: implement speculative handling in do_swap_page()

2021-04-06 Thread Michel Lespinasse
when finally committing the faulted page to the mm address space. Signed-off-by: Michel Lespinasse --- mm/memory.c | 74 ++--- 1 file changed, 42 insertions(+), 32 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index fc555fae0844..ab3160719bf3

[RFC PATCH 24/37] mm: implement speculative handling in __do_fault()

2021-04-06 Thread Michel Lespinasse
lative fault handling. The speculative handling case also does not preallocate page tables, as it is always called with a pre-existing page table. Signed-off-by: Michel Lespinasse --- mm/memory.c | 63 +++-- 1 file changed, 42 insertions(+), 21 deleti

[RFC PATCH 03/37] do_anonymous_page: use update_mmu_tlb()

2021-04-06 Thread Michel Lespinasse
update_mmu_tlb() can be used instead of update_mmu_cache() when the page fault handler detects that it lost the race to another page fault. Signed-off-by: Michel Lespinasse --- mm/memory.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/memory.c b/mm/memory.c index

[RFC PATCH 37/37] arm64/mm: attempt speculative mm faults first

2021-04-06 Thread Michel Lespinasse
: Michel Lespinasse --- arch/arm64/mm/fault.c | 52 +++ 1 file changed, 52 insertions(+) diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index f37d4e3830b7..3757bfbb457a 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -25,6 +25,7

[RFC PATCH 29/37] fs: list file types that support speculative faults.

2021-04-06 Thread Michel Lespinasse
Add a speculative field to the vm_operations_struct, which indicates if the associated file type supports speculative faults. Initially this is set for files that implement fault() with filemap_fault(). Signed-off-by: Michel Lespinasse --- fs/btrfs/file.c| 1 + fs/cifs/file.c | 1 + fs

[RFC PATCH 00/37] Speculative page faults

2021-04-06 Thread Michel Lespinasse
the anon case, but maybe not as clear for the file cases. - Is the Android use case compelling enough to merge the entire patchset? - Can we use this as a foundation for other mmap scalability work? I hear several proposals involving the idea of RCU-based fault handling, and hope this propo

[RFC PATCH 31/37] ext4: implement speculative fault handling

2021-04-06 Thread Michel Lespinasse
We just need to make sure ext4_filemap_fault() doesn't block in the speculative case as it is called with an rcu read lock held. Signed-off-by: Michel Lespinasse --- fs/ext4/file.c | 1 + fs/ext4/inode.c | 7 ++- 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/fs/ext4/f

[RFC PATCH 32/37] f2fs: implement speculative fault handling

2021-04-06 Thread Michel Lespinasse
We just need to make sure f2fs_filemap_fault() doesn't block in the speculative case as it is called with an rcu read lock held. Signed-off-by: Michel Lespinasse --- fs/f2fs/file.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c

[RFC PATCH 36/37] arm64/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT

2021-04-06 Thread Michel Lespinasse
Set ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT so that the speculative fault handling code can be compiled on this architecture. Signed-off-by: Michel Lespinasse --- arch/arm64/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index e4e1b6550115

[RFC PATCH 13/37] mm: implement speculative handling in __handle_mm_fault().

2021-04-06 Thread Michel Lespinasse
tables. Signed-off-by: Michel Lespinasse --- include/linux/mm.h | 4 +++ mm/memory.c| 77 -- 2 files changed, 79 insertions(+), 2 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index d5988e78e6ab..dee8a4833779 100644 --- a

[RFC PATCH 34/37] mm: rcu safe vma freeing only for multithreaded user space

2021-04-06 Thread Michel Lespinasse
tests that do not have any frequent concurrent page faults! This is because RCU-safe vma freeing prevents recently released vmas from being immediately reused in a new thread. Signed-off-by: Michel Lespinasse --- kernel/fork.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff

[RFC PATCH 26/37] mm: implement speculative fault handling in finish_fault()

2021-04-06 Thread Michel Lespinasse
In the speculative case, we want to avoid direct pmd checks (which would require some extra synchronization to be safe), and rely on pte_map_lock which will both lock the page table and verify that the pmd has not changed from its initial value. Signed-off-by: Michel Lespinasse --- mm/memory.c

[RFC PATCH 09/37] mm: add per-mm mmap sequence counter for speculative page fault handling.

2021-04-06 Thread Michel Lespinasse
h any mmap writer. This is very similar to a seqlock, but both the writer and speculative readers are allowed to block. In the fail case, the speculative reader does not spin on the sequence counter; instead it should fall back to a different mechanism such as grabbing the mmap lock read side
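
The non-spinning behaviour described in this snippet can be illustrated with a small user-space sketch. This is not the code from the patch: the names `seq_counter`, `writer_begin`, `writer_end`, `spec_read_begin` and `spec_read_validate` are hypothetical, and the odd/even convention for "writer active" is an assumption borrowed from ordinary seqlocks. The point is only that a reader which sees a writer, or a changed counter, gives up at once and lets its caller fall back to another mechanism.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a per-mm sequence counter. */
struct seq_counter {
    atomic_ulong seq;   /* assumed convention: odd while a writer is active */
};

/* A writer bumps the counter on entry and on exit. */
static void writer_begin(struct seq_counter *c)
{
    atomic_fetch_add(&c->seq, 1);
}

static void writer_end(struct seq_counter *c)
{
    atomic_fetch_add(&c->seq, 1);
}

/*
 * Snapshot the counter. Returns false if a writer is active; the
 * caller is expected to fall back instead of spinning.
 */
static bool spec_read_begin(struct seq_counter *c, unsigned long *snap)
{
    *snap = atomic_load(&c->seq);
    return (*snap & 1UL) == 0;
}

/* Returns true only if no writer ran since the snapshot was taken. */
static bool spec_read_validate(struct seq_counter *c, unsigned long snap)
{
    return atomic_load(&c->seq) == snap;
}
```

A caller would take a snapshot with `spec_read_begin()`, do its speculative work, and commit only if `spec_read_validate()` still returns true; on any failure it would take the slow path (in the snippet's terms, grabbing the mmap lock read side).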

[RFC PATCH 35/37] mm: spf statistics

2021-04-06 Thread Michel Lespinasse
Add a new CONFIG_SPECULATIVE_PAGE_FAULT_STATS config option, and dump extra statistics about executed spf cases and abort reasons when the option is set. Signed-off-by: Michel Lespinasse --- arch/x86/mm/fault.c | 19 +++--- include/linux/mmap_lock.h | 19 +- include

[RFC PATCH 11/37] x86/mm: attempt speculative mm faults first

2021-04-06 Thread Michel Lespinasse
when finalizing the fault. Signed-off-by: Michel Lespinasse --- arch/x86/mm/fault.c | 36 +++ include/linux/vm_event_item.h | 4 mm/vmstat.c | 4 3 files changed, 44 insertions(+) diff --git a/arch/x86/mm/fault.c b/arch/x86/mm

[RFC PATCH 33/37] mm: enable speculative fault handling only for multithreaded user space

2021-04-06 Thread Michel Lespinasse
Performance tuning: single threaded userspace does not benefit from speculative page faults, so we turn them off to avoid any related (small) extra overheads. Signed-off-by: Michel Lespinasse --- arch/x86/mm/fault.c | 5 + 1 file changed, 5 insertions(+) diff --git a/arch/x86/mm/fault.c b
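
The tuning described here amounts to a cheap check before the speculative path is attempted. The sketch below is an illustration only: `struct mm_sketch`, its `users` field and `should_try_speculative()` are hypothetical names, and using a plain user count as the proxy for "multithreaded" is an assumption, not something the snippet states.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an address space descriptor. */
struct mm_sketch {
    int users;  /* assumed: number of threads sharing this address space */
};

/*
 * Skip the speculative path when only one thread uses the address
 * space, since there is no concurrent faulting to benefit from it.
 */
static bool should_try_speculative(const struct mm_sketch *mm)
{
    return mm->users > 1;
}
```

A fault handler built this way would call the check first and go straight to the regular locked path when it returns false.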
