Re: [PATCH 01/23] userfaultfd: linux/Documentation/vm/userfaultfd.txt

2015-12-04 Thread Andrea Arcangeli
Hello Michael, On Fri, Dec 04, 2015 at 04:50:03PM +0100, Michael Kerrisk (man-pages) wrote: > Hi Andrea, > > On 09/11/2015 10:47 AM, Michael Kerrisk (man-pages) wrote: > > On 05/14/2015 07:30 PM, Andrea Arcangeli wrote: > >> Add documentation. > > > > Hi And

Re: [PATCH 00/11] KVM: x86: track guest page access

2015-12-01 Thread Andrea Arcangeli
handler for the tracked pages. The > > performance result of kernel building is as followings: > > > >before after > > real 461.63 real 455.48 > > user 4529.55 user 4557.88 > > sys 1995.39 sys 1922.57 > > For KVM-GT, as far a

Re: [PATCH] mm: Loosen MADV_NOHUGEPAGE to enable Qemu postcopy on s390

2015-11-11 Thread Andrea Arcangeli
On Wed, Nov 11, 2015 at 08:47:34PM +0100, Christian Borntraeger wrote: > Acked-by: Christian Borntraeger > Who is going to take this patch? If I should take the patch, I need an > ACK from the memory mgmt folks. I would suggest to resend in CC to Andrew to merge in -mm after taking care of the be

Re: [PATCH] mm: Loosen MADV_NOHUGEPAGE to enable Qemu postcopy on s390

2015-11-11 Thread Andrea Arcangeli
On Wed, Nov 11, 2015 at 09:01:44PM +0100, Christian Borntraeger wrote: > Am 11.11.2015 um 18:30 schrieb Andrea Arcangeli: > > Hi Jason, > > > > On Wed, Nov 11, 2015 at 10:35:16AM -0500, Jason J. Herne wrote: > >> MADV_NOHUGEPAGE processing is too restrictive. kvm

Re: [PATCH 14/23] userfaultfd: wake pending userfaults

2015-10-22 Thread Andrea Arcangeli
On Thu, Oct 22, 2015 at 05:15:09PM +0200, Peter Zijlstra wrote: > Indefinitely is such a long time, we should try and finish > computation before the computer dies etc. :-) Indefinitely as read_seqcount_retry, eventually it makes progress. Even returning 0 from the page fault can trigger it again

Re: [PATCH 14/23] userfaultfd: wake pending userfaults

2015-10-22 Thread Andrea Arcangeli
On Thu, Oct 22, 2015 at 03:38:24PM +0200, Peter Zijlstra wrote: > On Thu, Oct 22, 2015 at 03:20:15PM +0200, Andrea Arcangeli wrote: > > > If schedule spontaneously wakes up a task in TASK_KILLABLE state that > > would be a bug in the scheduler in my view. Luckily there doesn

Re: [PATCH 14/23] userfaultfd: wake pending userfaults

2015-10-22 Thread Andrea Arcangeli
On Thu, Oct 22, 2015 at 02:10:56PM +0200, Peter Zijlstra wrote: > On Thu, May 14, 2015 at 07:31:11PM +0200, Andrea Arcangeli wrote: > > @@ -255,21 +259,23 @@ int handle_userfault(struct vm_area_struct *vma, > > unsigned long address, > >

Re: [PATCH 0/7] userfault21 update

2015-10-19 Thread Andrea Arcangeli
Hello Patrick, On Mon, Oct 12, 2015 at 11:04:11AM -0400, Patrick Donnelly wrote: > Hello Andrea, > > On Mon, Jun 15, 2015 at 1:22 PM, Andrea Arcangeli wrote: > > This is an incremental update to the userfaultfd code in -mm. > > Sorry I'm late to this party. I'

Re: [Qemu-devel] [PATCH 19/23] userfaultfd: activate syscall

2015-08-11 Thread Andrea Arcangeli
nclude > > > -#define __NR_syscalls 364 > +#define __NR_syscalls 365 > > #define __NR__exit __NR_exit > #define NR_syscalls __NR_syscalls Reviewed-by: Andrea Arcangeli

Re: [PATCH 10/23] userfaultfd: add new syscall to provide memory externalization

2015-06-23 Thread Andrea Arcangeli
Hi Dave, On Tue, Jun 23, 2015 at 12:00:19PM -0700, Dave Hansen wrote: > Down in userfaultfd_wake_function(), it looks like you intended for a > len=0 to mean "wake all". But the validate_range() that we do from > userspace has a !len check in it, which keeps us from passing a len=0 in > from user

Re: [PATCH 5/7] userfaultfd: switch to exclusive wakeup for blocking reads

2015-06-16 Thread Andrea Arcangeli
On Mon, Jun 15, 2015 at 08:41:24PM -1000, Linus Torvalds wrote: > On Mon, Jun 15, 2015 at 12:19 PM, Andrea Arcangeli > wrote: > > > > Yes, it would leave the other blocked, how is it different from having > > just 1 reader and it gets killed? > > Either is complet

Re: [PATCH 5/7] userfaultfd: switch to exclusive wakeup for blocking reads

2015-06-15 Thread Andrea Arcangeli
On Mon, Jun 15, 2015 at 08:19:07AM -1000, Linus Torvalds wrote: > What if the process doing the polling never does anything with the end > result? Maybe it meant to, but it got killed before it could? Are you going > to leave everybody else blocked, even though there are pending events? Yes, it w

Re: [PATCH 1/7] userfaultfd: require UFFDIO_API before other ioctls

2015-06-15 Thread Andrea Arcangeli
On Mon, Jun 15, 2015 at 08:11:50AM -1000, Linus Torvalds wrote: > On Jun 15, 2015 7:22 AM, "Andrea Arcangeli" wrote: > > > > + if (cmd != UFFDIO_API) { > > + if (ctx->state == UFFD_STATE_WAIT_API) > > + return

[PATCH 5/7] userfaultfd: switch to exclusive wakeup for blocking reads

2015-06-15 Thread Andrea Arcangeli
635,219,658 branches # 256.660 M/sec ( +- 0.71% ) [83.69%] 59,203,898 branch-misses #0.51% of all branches ( +- 2.03% ) [83.54%] 2.600912438 seconds time elapsed ( +- 0.02% ) Signed

[PATCH 7/7] userfaultfd: selftest

2015-06-15 Thread Andrea Arcangeli
by userfaultfd. The fix for those two bugs was also straightforward and required no design change of any sort. Signed-off-by: Andrea Arcangeli --- tools/testing/selftests/vm/Makefile | 4 +- tools/testing/selftests/vm/userfaultfd.c | 669 +++ 2 files changed

[PATCH 4/7] userfaultfd: avoid missing wakeups during refile in userfaultfd_read

2015-06-15 Thread Andrea Arcangeli
During the refile in userfaultfd_read both waitqueues could look empty to the lockless wake_userfault(). Use a seqcount to prevent this false negative that could leave a userfault blocked. Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 26 -- 1 file changed, 24
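The race described in this patch can be pictured with a small user-space model. What follows is only a single-threaded sketch of the sequence-counter idea, with no memory barriers and no real concurrency; `struct toy_queues` and all the `toy_*` helpers are hypothetical names invented for the illustration and are not taken from fs/userfaultfd.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: a fault sits on "pending" until read, then moves to "read". */
struct toy_queues {
	unsigned seq; /* odd while a refile is in progress */
	int pending;  /* faults not yet read by userland */
	int read;     /* faults read but not yet resolved */
};

/* Writer side: bracket the refile so lockless readers can detect it. */
static inline void toy_refile_begin(struct toy_queues *q)
{
	q->seq++;
	assert(q->seq & 1); /* must be odd inside the refile */
}

static inline void toy_refile_end(struct toy_queues *q)
{
	q->seq++;
	assert(!(q->seq & 1)); /* even again once the refile is done */
}

/* Reader side: sample the counter before looking at the queues. */
static inline unsigned toy_read_begin(const struct toy_queues *q)
{
	return q->seq;
}

/* True if the reader's view may be inconsistent and must be re-sampled. */
static inline bool toy_read_retry(const struct toy_queues *q, unsigned start)
{
	return (start & 1) || q->seq != start;
}

static inline bool toy_queues_look_empty(const struct toy_queues *q)
{
	return q->pending == 0 && q->read == 0;
}
```

In this model a reader that sees both queues empty only trusts that result if `toy_read_retry()` returns false; otherwise it samples again, which is the false negative the patch description is about.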

[PATCH 2/7] userfaultfd: propagate the full address in THP faults

2015-06-15 Thread Andrea Arcangeli
IGBUS failure because the wrong page was being copied. For various reasons this wasn't easily reproducible in the qemu workload, but the stress test exposed the problem immediately. Signed-off-by: Andrea Arcangeli --- mm/huge_memory.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletion

[PATCH 1/7] userfaultfd: require UFFDIO_API before other ioctls

2015-06-15 Thread Andrea Arcangeli
(all but UFFDIO_API/struct uffdio_api) with a bump of uffdio_api.api. There's no actual plan or need to change the API or the ioctl; the current API should already cover even the non-cooperative usage, but this is for the longer-term future, just in case. Signed-off-by: Andrea Arca

[PATCH 6/7] userfaultfd: Revert "userfaultfd: waitqueue: add nr wake parameter to __wake_up_locked_key"

2015-06-15 Thread Andrea Arcangeli
ce as wakeall, has wait->flags WQ_FLAG_EXCLUSIVE set. Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 8 include/linux/wait.h | 5 ++--- kernel/sched/wait.c | 7 +++ net/sunrpc/sched.c | 2 +- 4 files changed, 10 insertions(+), 12 deletions(-) diff --git a/fs/userf

[PATCH 3/7] userfaultfd: allow signals to interrupt a userfault

2015-06-15 Thread Andrea Arcangeli
need to get signal processed, coredumps always worked perfectly with userfaults, no matter if the userfault is triggered by GUP, a kernel copy_user, or directly from userland. Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 35 --- 1 file changed, 32 inse

[PATCH 0/7] userfault21 update

2015-06-15 Thread Andrea Arcangeli
r CPUs. CPU bugs in SIMD cannot be ruled out either yet. Andrea Arcangeli (7): userfaultfd: require UFFDIO_API before other ioctls userfaultfd: propagate the full address in THP faults userfaultfd: allow signals to interrupt a userfault userfaultfd: avoid missing wakeups during refile

Re: [PATCH 22/23] userfaultfd: avoid mmap_sem read recursion in mcopy_atomic

2015-05-22 Thread Andrea Arcangeli
he buildbot was shut down recently? That buildbot was very useful to detect problems like this. === From 2f0a48670dc515932dec8b983871ec35caeba553 Mon Sep 17 00:00:00 2001 From: Andrea Arcangeli Date: Sat, 23 May 2015 02:26:32 +0200 Subject: [PATCH] userfaultfd: update the uffd_msg structure to be the same on 32

Re: [PATCH 22/23] userfaultfd: avoid mmap_sem read recursion in mcopy_atomic

2015-05-22 Thread Andrea Arcangeli
On Fri, May 22, 2015 at 01:18:22PM -0700, Andrew Morton wrote: > On Thu, 14 May 2015 19:31:19 +0200 Andrea Arcangeli > wrote: > > > If the rwsem starves writers it wasn't strictly a bug but lockdep > > doesn't like it and this avoids depending on lowlevel impleme

Re: [PATCH 00/23] userfaultfd v4

2015-05-21 Thread Andrea Arcangeli
Hi Kirill, On Thu, May 21, 2015 at 04:11:11PM +0300, Kirill Smelkov wrote: > Sorry for maybe speaking up too late, but here is additional real Not too late, in fact I don't think there's any change required for this at this stage, but it'd be great if you could help me to review. > Since arrays

[PATCH 20/23] userfaultfd: UFFDIO_COPY|UFFDIO_ZEROPAGE uAPI

2015-05-14 Thread Andrea Arcangeli
This implements the uABI of UFFDIO_COPY and UFFDIO_ZEROPAGE. Signed-off-by: Andrea Arcangeli --- include/uapi/linux/userfaultfd.h | 42 +++- 1 file changed, 41 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux

[PATCH 17/23] userfaultfd: solve the race between UFFDIO_COPY|ZEROPAGE and read

2015-05-14 Thread Andrea Arcangeli
rom userfault thread This patch removes the need of both UFFDIO_WAKE and of the associated per-page tristate as well. Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 81 +--- 1 file changed, 66 insertions(+), 15 deletions(-) diff --gi

[PATCH 06/23] userfaultfd: add VM_UFFD_MISSING and VM_UFFD_WP

2015-05-14 Thread Andrea Arcangeli
These two flags get set in vma->vm_flags to tell the VM common code if the userfaultfd is armed and in which mode (only tracking missing faults, only tracking wrprotect faults or both). If neither flag is set it means the userfaultfd is not armed on the vma. Signed-off-by: Andrea Arcang
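As a rough illustration of the flag scheme this patch describes (not the kernel code itself), the user-space C sketch below models a vma whose vm_flags carry two userfaultfd mode bits. `struct toy_vma`, the `toy_uffd_*` helpers and the `TOY_` bit values are all assumptions made up for this example; the real definitions live in the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, chosen for this sketch only. */
#define TOY_VM_UFFD_MISSING 0x1UL /* track missing-page faults */
#define TOY_VM_UFFD_WP      0x2UL /* track write-protect faults */
#define TOY_VM_UFFD_MASK    (TOY_VM_UFFD_MISSING | TOY_VM_UFFD_WP)

struct toy_vma {
	unsigned long vm_flags;
};

/* Arm the toy vma in one or both modes, replacing any previous mode. */
static inline void toy_uffd_arm(struct toy_vma *vma, unsigned long mode)
{
	assert(mode != 0 && (mode & ~TOY_VM_UFFD_MASK) == 0);
	vma->vm_flags = (vma->vm_flags & ~TOY_VM_UFFD_MASK) | mode;
}

/* Clear both mode bits: the toy vma is no longer armed. */
static inline void toy_uffd_disarm(struct toy_vma *vma)
{
	vma->vm_flags &= ~TOY_VM_UFFD_MASK;
}

static inline bool toy_uffd_missing(const struct toy_vma *vma)
{
	return (vma->vm_flags & TOY_VM_UFFD_MISSING) != 0;
}

static inline bool toy_uffd_wp(const struct toy_vma *vma)
{
	return (vma->vm_flags & TOY_VM_UFFD_WP) != 0;
}

/* Neither bit set means "not armed", as in the patch description. */
static inline bool toy_uffd_armed(const struct toy_vma *vma)
{
	return (vma->vm_flags & TOY_VM_UFFD_MASK) != 0;
}
```

A fault path in this model would check `toy_uffd_missing()` or `toy_uffd_wp()` depending on the kind of fault and fall back to normal handling when `toy_uffd_armed()` is false.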

[PATCH 18/21] userfaultfd: UFFDIO_REMAP uABI

2015-03-05 Thread Andrea Arcangeli
This implements the uABI of UFFDIO_REMAP. Notably one mode bitflag is also forwarded (and in turn known) by the lowlevel remap_pages method. Signed-off-by: Andrea Arcangeli --- include/uapi/linux/userfaultfd.h | 27 ++- 1 file changed, 26 insertions(+), 1 deletion

[PATCH 09/21] userfaultfd: prevent khugepaged to merge if userfaultfd is armed

2015-03-05 Thread Andrea Arcangeli
tiny corner case. Signed-off-by: Andrea Arcangeli --- mm/huge_memory.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 5374132..8f1b6a5 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2145,7 +2145,8 @@ static int __col

Re: [PATCH 19/21] userfaultfd: remap_pages: UFFDIO_REMAP preparation

2015-03-05 Thread Andrea Arcangeli
On Thu, Mar 05, 2015 at 09:39:48AM -0800, Linus Torvalds wrote: > Is this really worth it? On real loads? That people are expected to use? I fully agree that it's not worth merging upstream UFFDIO_REMAP until (and if) a real world usage for it shows up. To further clarify: would this not have b

[PATCH 05/21] userfaultfd: add vm_userfaultfd_ctx to the vm_area_struct

2015-03-05 Thread Andrea Arcangeli
This adds the vm_userfaultfd_ctx to the vm_area_struct. Signed-off-by: Andrea Arcangeli --- include/linux/mm_types.h | 11 +++ kernel/fork.c| 1 + 2 files changed, 12 insertions(+) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 199a03a..fbf21f5

[PATCH 03/21] userfaultfd: uAPI

2015-03-05 Thread Andrea Arcangeli
Defines the uAPI of the userfaultfd, notably the ioctl numbers and protocol. Signed-off-by: Andrea Arcangeli --- Documentation/ioctl/ioctl-number.txt | 1 + include/uapi/linux/userfaultfd.h | 81 2 files changed, 82 insertions(+) create mode 100644

[PATCH 11/21] userfaultfd: buildsystem activation

2015-03-05 Thread Andrea Arcangeli
This allows userfaultfd to be selected during configuration so that it gets built. Signed-off-by: Andrea Arcangeli --- fs/Makefile | 1 + init/Kconfig | 11 +++ 2 files changed, 12 insertions(+) diff --git a/fs/Makefile b/fs/Makefile index a88ac48..ba8ab62 100644 --- a/fs/Makefile +++ b/fs

[PATCH 20/21] userfaultfd: UFFDIO_REMAP

2015-03-05 Thread Andrea Arcangeli
MP. Especially if copying only a few pages at a time, copying without TLB flush is faster. Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 51 +++ 1 file changed, 51 insertions(+) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 6230f22

[PATCH 17/21] userfaultfd: remap_pages: swp_entry_swapcount() preparation

2015-03-05 Thread Andrea Arcangeli
some anon_vma. Signed-off-by: Andrea Arcangeli --- include/linux/swap.h | 6 ++ mm/swapfile.c| 13 + 2 files changed, 19 insertions(+) diff --git a/include/linux/swap.h b/include/linux/swap.h index 4759491..9adda11 100644 --- a/include/linux/swap.h +++ b/include/linux

[PATCH 07/21] userfaultfd: call handle_userfault() for userfaultfd_missing() faults

2015-03-05 Thread Andrea Arcangeli
as parameter so the "read|write" kind of fault can be passed to userland. Signed-off-by: Andrea Arcangeli --- mm/huge_memory.c | 68 ++-- mm/memory.c | 16 + 2 files changed, 62 insertions(+), 22 deletions(-) di

[PATCH 16/21] userfaultfd: remap_pages: rmap preparation

2015-03-05 Thread Andrea Arcangeli
before or while remap_pages runs. Signed-off-by: Andrea Arcangeli --- mm/huge_memory.c | 23 +++ mm/rmap.c| 9 + 2 files changed, 28 insertions(+), 4 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 8f1b6a5..1e25cb3 100644 --- a/mm/hug

[PATCH 21/21] userfaultfd: add userfaultfd_wp mm helpers

2015-03-05 Thread Andrea Arcangeli
These helpers will be used to know whether to call handle_userfault() during wrprotect faults in order to deliver the wrprotect faults to userland. Signed-off-by: Andrea Arcangeli --- include/linux/userfaultfd_k.h | 10 ++ 1 file changed, 10 insertions(+) diff --git a/include/linux

[PATCH 06/21] userfaultfd: add VM_UFFD_MISSING and VM_UFFD_WP

2015-03-05 Thread Andrea Arcangeli
These two flags get set in vma->vm_flags to tell the VM common code if the userfaultfd is armed and in which mode (only tracking missing faults, only tracking wrprotect faults or both). If neither flag is set it means the userfaultfd is not armed on the vma. Signed-off-by: Andrea Arcang

[PATCH 08/21] userfaultfd: teach vma_merge to merge across vma->vm_userfaultfd_ctx

2015-03-05 Thread Andrea Arcangeli
vma->vm_userfaultfd_ctx is yet another vma parameter that vma_merge must be aware of so that we can merge vmas back like they were originally before arming the userfaultfd on some memory range. Signed-off-by: Andrea Arcangeli --- include/linux/mm.h | 2 +- mm/madvise.c | 3 ++-

[PATCH 13/21] userfaultfd: UFFDIO_COPY|UFFDIO_ZEROPAGE uAPI

2015-03-05 Thread Andrea Arcangeli
This implements the uABI of UFFDIO_COPY and UFFDIO_ZEROPAGE. Signed-off-by: Andrea Arcangeli --- include/uapi/linux/userfaultfd.h | 46 +++- 1 file changed, 45 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux

[PATCH 00/21] RFC: userfaultfd v3

2015-03-05 Thread Andrea Arcangeli
a fully backwards compatible change and it's only strictly required by the wrprotect tracking mode, so it's no problem to solve this later. Because of its inherent racy nature, nobody could possibly depend on a racy SIGBUS being raised now, when it won't be raised anymore later. Andre

[PATCH 02/21] userfaultfd: linux/Documentation/vm/userfaultfd.txt

2015-03-05 Thread Andrea Arcangeli
Add documentation. Signed-off-by: Andrea Arcangeli --- Documentation/vm/userfaultfd.txt | 97 1 file changed, 97 insertions(+) create mode 100644 Documentation/vm/userfaultfd.txt diff --git a/Documentation/vm/userfaultfd.txt b/Documentation/vm

[PATCH 10/21] userfaultfd: add new syscall to provide memory externalization

2015-03-05 Thread Andrea Arcangeli
to know when there are new pending userfaults to be read (POLLIN). Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 977 +++ 1 file changed, 977 insertions(+) create mode 100644 fs/userfaultfd.c diff --git a/fs/userfaultfd.c b/fs

[PATCH 19/21] userfaultfd: remap_pages: UFFDIO_REMAP preparation

2015-03-05 Thread Andrea Arcangeli
remap_pages is the lowlevel mm helper needed to implement UFFDIO_REMAP. Signed-off-by: Andrea Arcangeli --- include/linux/userfaultfd_k.h | 17 ++ mm/huge_memory.c | 120 ++ mm/userfaultfd.c | 526 ++ 3 files changed

[PATCH 01/21] userfaultfd: waitqueue: add nr wake parameter to __wake_up_locked_key

2015-03-05 Thread Andrea Arcangeli
userfaultfd needs to wake all waitqueues (pass 0 as nr parameter), instead of the current hardcoded 1 (that would wake just the first waitqueue in the head list). Signed-off-by: Andrea Arcangeli --- include/linux/wait.h | 5 +++-- kernel/sched/wait.c | 7 --- net/sunrpc/sched.c | 2 +- 3
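The "0 means wake all" convention in this patch can be sketched with a simplified user-space model. It assumes that the count is decremented only for exclusive waiters and that the walk stops when the count reaches zero, so a starting value of 0 never stops the walk. `struct toy_waiter` and `toy_wake_up()` are hypothetical names for this illustration, not the kernel's wait-queue API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy waiter: "exclusive" mimics a waiter queued with an exclusive flag. */
struct toy_waiter {
	bool exclusive;
	bool woken;
};

/*
 * Wake waiters in queue order and return how many were woken.
 * nr_exclusive == 1 stops after the first exclusive waiter;
 * nr_exclusive == 0 goes negative on the first decrement and therefore
 * never hits zero, so every waiter in the array is woken.
 */
static inline int toy_wake_up(struct toy_waiter *w, size_t n, int nr_exclusive)
{
	int woken = 0;

	assert(nr_exclusive >= 0);
	for (size_t i = 0; i < n; i++) {
		w[i].woken = true;
		woken++;
		if (w[i].exclusive && !--nr_exclusive)
			break;
	}
	return woken;
}
```

With three exclusive waiters queued, passing 1 wakes only the first, while passing 0 wakes all three in this model.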

[PATCH 04/21] userfaultfd: linux/userfaultfd_k.h

2015-03-05 Thread Andrea Arcangeli
Kernel header defining the methods needed by the VM common code to interact with the userfaultfd. Signed-off-by: Andrea Arcangeli --- include/linux/userfaultfd_k.h | 79 +++ 1 file changed, 79 insertions(+) create mode 100644 include/linux

[PATCH 12/21] userfaultfd: activate syscall

2015-03-05 Thread Andrea Arcangeli
This activates the userfaultfd syscall. Signed-off-by: Andrea Arcangeli --- arch/powerpc/include/asm/systbl.h | 1 + arch/powerpc/include/asm/unistd.h | 2 +- arch/powerpc/include/uapi/asm/unistd.h | 1 + arch/x86/syscalls/syscall_32.tbl | 1 + arch/x86/syscalls/syscall_64.tbl

[PATCH 14/21] userfaultfd: mcopy_atomic|mfill_zeropage: UFFDIO_COPY|UFFDIO_ZEROPAGE preparation

2015-03-05 Thread Andrea Arcangeli
This implements mcopy_atomic and mfill_zeropage that are the lowlevel VM methods that are invoked respectively by the UFFDIO_COPY and UFFDIO_ZEROPAGE userfaultfd commands. Signed-off-by: Andrea Arcangeli --- include/linux/userfaultfd_k.h | 6 + mm/Makefile | 1 + mm

[PATCH 15/21] userfaultfd: UFFDIO_COPY and UFFDIO_ZEROPAGE

2015-03-05 Thread Andrea Arcangeli
These two ioctls allow either atomically copying or mapping zeropages into the virtual address space. This is used by the thread that opened the userfaultfd to resolve the userfaults. Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 100

Re: [PATCH 1/4] mm: Correct ordering of *_clear_flush_young_notify

2015-01-08 Thread Andrea Arcangeli
On Thu, Jan 08, 2015 at 11:59:06AM +, Marc Zyngier wrote: > From: Steve Capper > > ptep_clear_flush_young_notify and pmdp_clear_flush_young_notify both > call the notifiers *after* the pte/pmd has been made young. > On x86 on EPT without hardware access bit (!shadow_accessed_mask), we'll tr

Re: [Qemu-devel] [PATCH 00/17] RFC: userfault v2

2014-11-25 Thread Andrea Arcangeli
On Fri, Nov 21, 2014 at 11:05:45PM +, Peter Maydell wrote: > If it's mapped and readable-but-not-writable then it should still > fault on write accesses, though? These are cases we currently get > SEGV for, anyway. Yes then it'll work just fine. > Ah, I guess we have a terminology difference.

Re: [Qemu-devel] [PATCH 00/17] RFC: userfault v2

2014-11-21 Thread Andrea Arcangeli
Hi Peter, On Wed, Oct 29, 2014 at 05:56:59PM +, Peter Maydell wrote: > On 29 October 2014 17:46, Andrea Arcangeli wrote: > > After some chat during the KVMForum I've been already thinking it > > could be beneficial for some usage to give userland the information >

Re: [PATCH 00/17] RFC: userfault v2

2014-11-20 Thread Andrea Arcangeli
Hi, On Thu, Nov 20, 2014 at 10:54:29AM +0800, zhanghailiang wrote: > Yes, you are right. This is what i really want, bypass all non-present faults > and only track strict wrprotect faults. ;) > > So, do you plan to support that in the userfault API? Yes I think it's a good idea to support wrprotec

Re: [PATCH 00/17] RFC: userfault v2

2014-11-20 Thread Andrea Arcangeli
Hi, On Fri, Oct 31, 2014 at 12:39:32PM -0700, Peter Feiner wrote: > On Fri, Oct 31, 2014 at 11:29:49AM +0800, zhanghailiang wrote: > > Agreed, but for doing live memory snapshot (VM is running when do > > snapshot), > > we have to do this (block the write action), because we have to save the >

Re: [PATCH 00/17] RFC: userfault v2

2014-11-19 Thread Andrea Arcangeli
Hi Zhang, On Fri, Oct 31, 2014 at 09:26:09AM +0800, zhanghailiang wrote: > On 2014/10/30 20:49, Dr. David Alan Gilbert wrote: > > * zhanghailiang (zhang.zhanghaili...@huawei.com) wrote: > >> On 2014/10/30 1:46, Andrea Arcangeli wrote: > >>> Hi Zhanghailiang, > >

Re: [PATCH 00/17] RFC: userfault v2

2014-10-29 Thread Andrea Arcangeli
Hi Zhanghailiang, On Mon, Oct 27, 2014 at 05:32:51PM +0800, zhanghailiang wrote: > Hi Andrea, > > Thanks for your hard work on userfault;) > > This is really a useful API. > > I want to confirm a question: > Can we support distinguishing between writing and reading memory for > userfault? > Th

Re: [PATCH 2/4] mm: gup: add get_user_pages_locked and get_user_pages_unlocked

2014-10-29 Thread Andrea Arcangeli
On Thu, Oct 09, 2014 at 12:50:37PM +0200, Peter Zijlstra wrote: > On Wed, Oct 01, 2014 at 10:56:35AM +0200, Andrea Arcangeli wrote: > > > +static inline long __get_user_pages_locked(struct task_struct *tsk, > > + st

Re: [PATCH 2/4] mm: gup: add get_user_pages_locked and get_user_pages_unlocked

2014-10-29 Thread Andrea Arcangeli
On Thu, Oct 09, 2014 at 12:47:23PM +0200, Peter Zijlstra wrote: > On Wed, Oct 01, 2014 at 10:56:35AM +0200, Andrea Arcangeli wrote: > > +static inline long __get_user_pages_locked(struct task_struct *tsk, > > + struc

Re: [PATCH 3/4] mm: gup: use get_user_pages_fast and get_user_pages_unlocked

2014-10-12 Thread Andrea Arcangeli
On Thu, Oct 09, 2014 at 12:52:45PM +0200, Peter Zijlstra wrote: > On Wed, Oct 01, 2014 at 10:56:36AM +0200, Andrea Arcangeli wrote: > > Just an optimization. > > Does it make sense to split the thing in two? One where you apply > _unlocked and then one where you apply _fast?

Re: [PATCH 10/17] mm: rmap preparation for remap_anon_pages

2014-10-07 Thread Andrea Arcangeli
On Tue, Oct 07, 2014 at 04:19:13PM +0200, Andrea Arcangeli wrote: > mremap like interface, or file+commands protocol interface. I tend to > like mremap more, that's why I opted for a remap_anon_pages syscall > kept orthogonal to the userfaultfd functionality (remap_anon_pages > c

Re: [PATCH 10/17] mm: rmap preparation for remap_anon_pages

2014-10-07 Thread Andrea Arcangeli
Hello, On Tue, Oct 07, 2014 at 08:47:59AM -0400, Linus Torvalds wrote: > On Mon, Oct 6, 2014 at 12:41 PM, Andrea Arcangeli wrote: > > > > Of course if somebody has better ideas on how to resolve an anonymous > > userfault they're welcome. > > So I'd

Re: [Qemu-devel] [PATCH 10/17] mm: rmap preparation for remap_anon_pages

2014-10-07 Thread Andrea Arcangeli
Hi Kirill, On Tue, Oct 07, 2014 at 02:10:26PM +0300, Kirill A. Shutemov wrote: > On Fri, Oct 03, 2014 at 07:08:00PM +0200, Andrea Arcangeli wrote: > > There's one constraint enforced to allow this simplification: the > > source pages passed to remap_anon_pages must be mapped

Re: [PATCH 08/17] mm: madvise MADV_USERFAULT

2014-10-07 Thread Andrea Arcangeli
Hi Kirill, On Tue, Oct 07, 2014 at 01:36:45PM +0300, Kirill A. Shutemov wrote: > On Fri, Oct 03, 2014 at 07:07:58PM +0200, Andrea Arcangeli wrote: > > MADV_USERFAULT is a new madvise flag that will set VM_USERFAULT in the > > vma flags. Whenever VM_USERFAULT is set in an an

Re: [PATCH 08/17] mm: madvise MADV_USERFAULT

2014-10-06 Thread Andrea Arcangeli
Hi, On Sat, Oct 04, 2014 at 08:13:36AM +0900, Mike Hommey wrote: > On Fri, Oct 03, 2014 at 07:07:58PM +0200, Andrea Arcangeli wrote: > > MADV_USERFAULT is a new madvise flag that will set VM_USERFAULT in the > > vma flags. Whenever VM_USERFAULT is set in an anonymous vma, if >

Re: [PATCH 10/17] mm: rmap preparation for remap_anon_pages

2014-10-06 Thread Andrea Arcangeli
Hello, On Mon, Oct 06, 2014 at 09:55:41AM +0100, Dr. David Alan Gilbert wrote: > * Linus Torvalds (torva...@linux-foundation.org) wrote: > > On Fri, Oct 3, 2014 at 10:08 AM, Andrea Arcangeli > > wrote: > > > > > > Overall this looks a fairly small change to

Re: [PATCH 04/17] mm: gup: make get_user_pages_fast and __get_user_pages_fast latency conscious

2014-10-06 Thread Andrea Arcangeli
Hello, On Fri, Oct 03, 2014 at 11:23:53AM -0700, Linus Torvalds wrote: > On Fri, Oct 3, 2014 at 10:07 AM, Andrea Arcangeli wrote: > > This teaches gup_fast and __gup_fast to re-enable irqs and > > cond_resched() if possible every BATCH_PAGES. > > This is disgusting. >

[PATCH 08/17] mm: madvise MADV_USERFAULT

2014-10-03 Thread Andrea Arcangeli
exclusive if set. Signed-off-by: Andrea Arcangeli --- arch/alpha/include/uapi/asm/mman.h | 3 ++ arch/mips/include/uapi/asm/mman.h | 3 ++ arch/parisc/include/uapi/asm/mman.h| 3 ++ arch/xtensa/include/uapi/asm/mman.h| 3 ++ fs/proc/task_mmu.c | 1

[PATCH 17/17] userfaultfd: implement USERFAULTFD_RANGE_REGISTER|UNREGISTER

2014-10-03 Thread Andrea Arcangeli
er process that is calling ptrace). We could also decide to retain the current -EFAULT behavior of ptrace using get_user_pages_locked with a NULL locked parameter so the FAULT_FLAG_ALLOW_RETRY flag will not be set. Either way would be safe. Signed-off-by: Andrea Arcangeli ---

[PATCH 12/17] mm: sys_remap_anon_pages

2014-10-03 Thread Andrea Arcangeli
write MADV_USERFAULT */ c[i+1] = 0xbb; } if (c[i] != 0xaa) printf("error %x offset %lu\n", c[i], i), exit(1); } printf("remap_anon_pages functions correctly\n"); return 0; } === Signe

[PATCH 02/17] mm: gup: add get_user_pages_locked and get_user_pages_unlocked

2014-10-03 Thread Andrea Arcangeli
ent->mm. get_user_pages_unlocked varies from get_user_pages_fast only if mm is not current->mm (like when get_user_pages works on some other process mm). Whenever tsk and mm match current and current->mm get_user_pages_fast must always be used to increase performance and get the page loc

[PATCH 05/17] mm: gup: use get_user_pages_fast and get_user_pages_unlocked

2014-10-03 Thread Andrea Arcangeli
Just an optimization. Signed-off-by: Andrea Arcangeli --- drivers/dma/iovlock.c | 10 ++ drivers/iommu/amd_iommu_v2.c | 6 ++ drivers/media/pci/ivtv/ivtv-udma.c | 6 ++ drivers/scsi/st.c | 10 ++ drivers/video/fbdev/pvr2fb.c

[PATCH 10/17] mm: rmap preparation for remap_anon_pages

2014-10-03 Thread Andrea Arcangeli
p_anon_pages runs. Signed-off-by: Andrea Arcangeli --- mm/huge_memory.c | 24 mm/rmap.c| 9 + 2 files changed, 29 insertions(+), 4 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index b402d60..4277ed7 100644 --- a/mm/huge_memory.c +++ b/mm

[PATCH 07/17] mm: madvise MADV_USERFAULT: prepare vm_flags to allow more than 32bits

2014-10-03 Thread Andrea Arcangeli
We run out of 32bits in vm_flags, noop change for 64bit archs. Signed-off-by: Andrea Arcangeli --- fs/proc/task_mmu.c | 4 ++-- include/linux/huge_mm.h | 4 ++-- include/linux/ksm.h | 4 ++-- include/linux/mm_types.h | 2 +- mm/huge_memory.c | 2 +- mm/ksm.c

[PATCH 09/17] mm: PT lock: export double_pt_lock/unlock

2014-10-03 Thread Andrea Arcangeli
Those two helpers are needed by remap_anon_pages. Signed-off-by: Andrea Arcangeli --- include/linux/mm.h | 4 mm/fremap.c| 29 + 2 files changed, 33 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index bf3df07..71dbe03 100644 --- a

[PATCH 00/17] RFC: userfault v2

2014-10-03 Thread Andrea Arcangeli
ges should do it fine too, but it would create rmap nonlinearity which isn't optimal. The code can be found here: git clone --reference linux git://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git -b userfault The branch is rebased so you can get updates for example with: git fetch &

[PATCH 06/17] kvm: Faults which trigger IO release the mmap_sem

2014-10-03 Thread Andrea Arcangeli
, as other mmap semaphore users now stall as a function of swap or filemap latency. This patch ensures both the regular and async PF path re-enter the fault allowing for the mmap semaphore to be relinquished in the case of IO wait. Reviewed-by: Radim Krčmář Signed-off-by: Andres Lagar-Cavilla Signed

[PATCH 04/17] mm: gup: make get_user_pages_fast and __get_user_pages_fast latency conscious

2014-10-03 Thread Andrea Arcangeli
using get_user_pages_unlocked which would be slower). Signed-off-by: Andrea Arcangeli --- arch/x86/mm/gup.c | 234 ++ 1 file changed, 149 insertions(+), 85 deletions(-) diff --git a/arch/x86/mm/gup.c b/arch/x86/mm/gup.c index 2ab183b..917d8c1 100644 --- a/arch/x

[PATCH 14/17] userfaultfd: add new syscall to provide memory externalization

2014-10-03 Thread Andrea Arcangeli
userfaults to read (POLLIN) and when there are threads waiting a wakeup through a range write (POLLOUT). Signed-off-by: Andrea Arcangeli --- arch/x86/syscalls/syscall_32.tbl | 1 + arch/x86/syscalls/syscall_64.tbl | 1 + fs/Makefile | 1 + fs/userfaultfd.c

[PATCH 01/17] mm: gup: add FOLL_TRIED

2014-10-03 Thread Andrea Arcangeli
From: Andres Lagar-Cavilla Reviewed-by: Radim Krčmář Signed-off-by: Andres Lagar-Cavilla Signed-off-by: Andrea Arcangeli --- include/linux/mm.h | 1 + mm/gup.c | 4 2 files changed, 5 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index 8981cc8..0f4196a

[PATCH 15/17] userfaultfd: make userfaultfd_write non blocking

2014-10-03 Thread Andrea Arcangeli
same address. But we should still return an error so if the application thinks this occurrence can never happen it will know it hit a bug. So just return -ENOENT instead of blocking. Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 34 +- 1 file changed, 5 inser

[PATCH 03/17] mm: gup: use get_user_pages_unlocked within get_user_pages_fast

2014-10-03 Thread Andrea Arcangeli
Signed-off-by: Andrea Arcangeli --- arch/mips/mm/gup.c | 8 +++- arch/powerpc/mm/gup.c| 6 ++ arch/s390/kvm/kvm-s390.c | 4 +--- arch/s390/mm/gup.c | 6 ++ arch/sh/mm/gup.c | 6 ++ arch/sparc/mm/gup.c | 6 ++ arch/x86/mm/gup.c| 7

[PATCH 16/17] powerpc: add remap_anon_pages and userfaultfd

2014-10-03 Thread Andrea Arcangeli
Add the syscall numbers. Signed-off-by: Andrea Arcangeli --- arch/powerpc/include/asm/systbl.h | 2 ++ arch/powerpc/include/asm/unistd.h | 2 +- arch/powerpc/include/uapi/asm/unistd.h | 2 ++ 3 files changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm

[PATCH 13/17] waitqueue: add nr wake parameter to __wake_up_locked_key

2014-10-03 Thread Andrea Arcangeli
Userfaultfd needs to wake all waitqueues (pass 0 as nr parameter), instead of the current hardcoded 1 (that would wake just the first waitqueue in the head list). Signed-off-by: Andrea Arcangeli --- include/linux/wait.h | 5 +++-- kernel/sched/wait.c | 7 --- net/sunrpc/sched.c | 2 +- 3

[PATCH 11/17] mm: swp_entry_swapcount

2014-10-03 Thread Andrea Arcangeli
in some anon_vma. Signed-off-by: Andrea Arcangeli --- include/linux/swap.h | 6 ++ mm/swapfile.c| 13 + 2 files changed, 19 insertions(+) diff --git a/include/linux/swap.h b/include/linux/swap.h index 8197452..af9977c 100644 --- a/include/linux/swap.h +++ b/include/linux

Re: RFC: get_user_pages_locked|unlocked to leverage VM_FAULT_RETRY

2014-10-02 Thread Andrea Arcangeli
On Thu, Oct 02, 2014 at 02:56:38PM +0200, Peter Zijlstra wrote: > On Thu, Oct 02, 2014 at 02:50:52PM +0200, Peter Zijlstra wrote: > > On Thu, Oct 02, 2014 at 02:31:17PM +0200, Andrea Arcangeli wrote: > > > On Wed, Oct 01, 2014 at 05:36:11PM +0200, Peter Zijlstra wrote: > >

Re: [PATCH 2/4] mm: gup: add get_user_pages_locked and get_user_pages_unlocked

2014-10-02 Thread Andrea Arcangeli
On Wed, Oct 01, 2014 at 10:06:27AM -0700, Andres Lagar-Cavilla wrote: > On Wed, Oct 1, 2014 at 8:51 AM, Peter Feiner wrote: > > On Wed, Oct 01, 2014 at 10:56:35AM +0200, Andrea Arcangeli wrote: > >> + /* VM_FAULT_RETRY cannot return errors */ > >>

Re: RFC: get_user_pages_locked|unlocked to leverage VM_FAULT_RETRY

2014-10-02 Thread Andrea Arcangeli
On Wed, Oct 01, 2014 at 05:36:11PM +0200, Peter Zijlstra wrote: > For all these and the other _fast() users, is there an actual limit to > the nr_pages passed in? Because we used to have the 64 pages limit from > DIO, but without that we get rather long IRQ-off latencies. Ok, I would tend to think

Re: [PATCH 3/4] mm: gup: use get_user_pages_fast and get_user_pages_unlocked

2014-10-01 Thread Andrea Arcangeli
On Wed, Oct 01, 2014 at 10:56:36AM +0200, Andrea Arcangeli wrote: > diff --git a/drivers/misc/sgi-gru/grufault.c b/drivers/misc/sgi-gru/grufault.c > index f74fc0c..cd20669 100644 > --- a/drivers/misc/sgi-gru/grufault.c > +++ b/drivers/misc/sgi-gru/grufault.c > @@ -198,8 +198,

[PATCH 3/4] mm: gup: use get_user_pages_fast and get_user_pages_unlocked

2014-10-01 Thread Andrea Arcangeli
Just an optimization. Signed-off-by: Andrea Arcangeli --- drivers/dma/iovlock.c | 10 ++ drivers/iommu/amd_iommu_v2.c | 6 ++ drivers/media/pci/ivtv/ivtv-udma.c | 6 ++ drivers/misc/sgi-gru/grufault.c| 3 +-- drivers/scsi/st.c | 10

[PATCH 4/4] mm: gup: use get_user_pages_unlocked within get_user_pages_fast

2014-10-01 Thread Andrea Arcangeli
Signed-off-by: Andrea Arcangeli --- arch/mips/mm/gup.c | 8 +++- arch/powerpc/mm/gup.c| 6 ++ arch/s390/kvm/kvm-s390.c | 4 +--- arch/s390/mm/gup.c | 6 ++ arch/sh/mm/gup.c | 6 ++ arch/sparc/mm/gup.c | 6 ++ arch/x86/mm/gup.c| 7

[PATCH 0/4] leverage FAULT_FOLL_ALLOW_RETRY in get_user_pages

2014-10-01 Thread Andrea Arcangeli
serfaultfd backed memory. Reviews would be welcome, thanks, Andrea Andrea Arcangeli (3): mm: gup: add get_user_pages_locked and get_user_pages_unlocked mm: gup: use get_user_pages_fast and get_user_pages_unlocked mm: gup: use get_user_pages_unlocked within get_user_pages_fast Andres Laga

[PATCH 1/4] mm: gup: add FOLL_TRIED

2014-10-01 Thread Andrea Arcangeli
From: Andres Lagar-Cavilla Reviewed-by: Radim Krčmář Signed-off-by: Andres Lagar-Cavilla Signed-off-by: Andrea Arcangeli --- include/linux/mm.h | 1 + mm/gup.c | 4 2 files changed, 5 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index 8981cc8..0f4196a

[PATCH 2/4] mm: gup: add get_user_pages_locked and get_user_pages_unlocked

2014-10-01 Thread Andrea Arcangeli
ent->mm. get_user_pages_unlocked varies from get_user_pages_fast only if mm is not current->mm (like when get_user_pages works on some other process mm). Whenever tsk and mm match current and current->mm, get_user_pages_fast must always be used to increase performance and get the page

Re: RFC: get_user_pages_locked|unlocked to leverage VM_FAULT_RETRY

2014-09-28 Thread Andrea Arcangeli
On Fri, Sep 26, 2014 at 12:54:46PM -0700, Andres Lagar-Cavilla wrote: > On Fri, Sep 26, 2014 at 10:25 AM, Andrea Arcangeli > wrote: > > On Thu, Sep 25, 2014 at 02:50:29PM -0700, Andres Lagar-Cavilla wrote: > >> It's nearly impossible to name it right beca

RFC: get_user_pages_locked|unlocked to leverage VM_FAULT_RETRY

2014-09-26 Thread Andrea Arcangeli
ocked parameter) will not invoke the userfaultfd protocol. But I need gup_fast to use FAULT_FLAG_ALLOW_RETRY because core places like O_DIRECT use it. I tried to do an RFC patch below that goes in this direction and should be enough for a start to solve all my issues with the mmap_sem holding

Re: [PATCH] kvm: Fix kvm_get_page_retry_io __gup retval check

2014-09-25 Thread Andrea Arcangeli
On Thu, Sep 25, 2014 at 03:26:50PM -0700, Andres Lagar-Cavilla wrote: > Confusion around -EBUSY and zero (inside a BUG_ON no less). > > Reported-by: AndreA Arcangeli > Signed-off-by: Andres Lagar-Cavilla > --- > virt/kvm/kvm_main.c | 2 +- > 1 file changed, 1 ins

Re: [PATCH v2] kvm: Faults which trigger IO release the mmap_sem

2014-09-25 Thread Andrea Arcangeli
Hi Andres, On Wed, Sep 17, 2014 at 10:51:48AM -0700, Andres Lagar-Cavilla wrote: > + if (!locked) { > + VM_BUG_ON(npages != -EBUSY); > + Shouldn't this be VM_BUG_ON(npages)? Alternatively we could patch gup to do: case -EHWPOISON: +

Re: [PATCH 08/10] userfaultfd: add new syscall to provide memory externalization

2014-07-03 Thread Andrea Arcangeli
Hi Andy, thanks for CC'ing linux-api. On Wed, Jul 02, 2014 at 06:56:03PM -0700, Andy Lutomirski wrote: > On 07/02/2014 09:50 AM, Andrea Arcangeli wrote: > > Once an userfaultfd is created MADV_USERFAULT regions talks through > > the userfaultfd protocol with the thread respo

Re: [PATCH 0/2] KVM: async_pf: use_mm/mm_users fixes

2014-04-28 Thread Andrea Arcangeli
On Mon, Apr 28, 2014 at 01:06:05PM +0200, Paolo Bonzini wrote: > Patch 1 will be for 3.16 only, I'd like a review from Marcelo or Andrea > though (that's "KVM: async_pf: kill the unnecessary use_mm/unuse_mm > async_pf_execute()" for easier googling). Patch 1: Revi
