Re: [PATCH v5 19/30] arm64: add POE signal support

2024-10-15 Thread Catalin Marinas
On Tue, Oct 15, 2024 at 12:41:16PM +0100, Will Deacon wrote: > On Tue, Oct 15, 2024 at 10:59:11AM +0100, Joey Gouly wrote: > > On Mon, Oct 14, 2024 at 06:10:23PM +0100, Will Deacon wrote: > > > Looking a little more at this, I think we have quite a weird behaviour > > > on arm64 as it stands. It lo

Re: [PATCH v3] ftrace: Consolidate ftrace_regs accessor functions for archs using pt_regs

2024-10-11 Thread Catalin Marinas
@@ extern void return_to_handler(void); > unsigned long ftrace_call_adjust(unsigned long addr); > > #ifdef CONFIG_DYNAMIC_FTRACE_WITH_ARGS > +#define HAVE_ARCH_FTRACE_REGS > struct dyn_ftrace; > struct ftrace_ops; > struct ftrace_regs; In case you need an ack for

Re: [PATCH] ftrace: Make ftrace_regs abstract from direct use

2024-10-08 Thread Catalin Marinas
.h | 20 + > arch/arm64/kernel/asm-offsets.c | 22 +-- > arch/arm64/kernel/ftrace.c | 10 - For arm64: Acked-by: Catalin Marinas

Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits

2024-09-13 Thread Catalin Marinas
On Fri, Sep 13, 2024 at 11:08:23AM +0100, Catalin Marinas wrote: > On Thu, Sep 12, 2024 at 02:15:59PM -0700, Charlie Jenkins wrote: > > On Thu, Sep 12, 2024 at 11:53:49AM +0100, Catalin Marinas wrote: > > > On Wed, Sep 11, 2024 at 11:18:12PM -0700, Charlie Jenkins wrote: >

Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits

2024-09-13 Thread Catalin Marinas
On Thu, Sep 12, 2024 at 02:15:59PM -0700, Charlie Jenkins wrote: > On Thu, Sep 12, 2024 at 11:53:49AM +0100, Catalin Marinas wrote: > > On Wed, Sep 11, 2024 at 11:18:12PM -0700, Charlie Jenkins wrote: > > > Opting-in to the higher address space is reasonable. However, it is not

Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits

2024-09-12 Thread Catalin Marinas
On Wed, Sep 11, 2024 at 11:18:12PM -0700, Charlie Jenkins wrote: > Opting-in to the higher address space is reasonable. However, it is not > my preference, because the purpose of this flag is to ensure that > allocations do not exceed 47-bits, so it is a clearer ABI to have the > applications that

Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits

2024-09-11 Thread Catalin Marinas
On Tue, Sep 10, 2024 at 05:45:07PM -0700, Charlie Jenkins wrote: > On Tue, Sep 10, 2024 at 03:08:14PM -0400, Liam R. Howlett wrote: > > * Catalin Marinas [240906 07:44]: > > > On Fri, Sep 06, 2024 at 09:55:42AM +, Arnd Bergmann wrote: > > > > On Fri, Sep 6,

Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits

2024-09-06 Thread Catalin Marinas
On Fri, Sep 06, 2024 at 09:55:42AM +, Arnd Bergmann wrote: > On Fri, Sep 6, 2024, at 09:14, Guo Ren wrote: > > On Fri, Sep 6, 2024 at 3:18 PM Arnd Bergmann wrote: > >> It's also unclear to me how we want this flag to interact with > >> the existing logic in arch_get_mmap_end(), which attempts

Re: [PATCH v5 06/30] arm64: context switch POR_EL0 register

2024-09-04 Thread Catalin Marinas
On Wed, Sep 04, 2024 at 11:22:54AM +0100, Will Deacon wrote: > On Tue, Sep 03, 2024 at 03:54:13PM +0100, Joey Gouly wrote: > > commit 3141fb86bee8d48ae47cab1594dad54f974a8899 > > Author: Joey Gouly > > Date: Tue Sep 3 15:47:26 2024 +0100 > > > > fixup! arm64: context switch POR_EL0 register

Re: [PATCH v5 06/30] arm64: context switch POR_EL0 register

2024-09-02 Thread Catalin Marinas
On Tue, Aug 27, 2024 at 12:38:04PM +0100, Will Deacon wrote: > On Fri, Aug 23, 2024 at 07:40:52PM +0100, Catalin Marinas wrote: > > On Fri, Aug 23, 2024 at 06:08:36PM +0100, Will Deacon wrote: > > > On Fri, Aug 23, 2024 at 05:41:06PM +0100, Catalin Marinas wrote: > > >

Re: [PATCH v5 06/30] arm64: context switch POR_EL0 register

2024-08-23 Thread Catalin Marinas
On Fri, Aug 23, 2024 at 06:08:36PM +0100, Will Deacon wrote: > On Fri, Aug 23, 2024 at 05:41:06PM +0100, Catalin Marinas wrote: > > On Fri, Aug 23, 2024 at 03:45:32PM +0100, Will Deacon wrote: > > > On Thu, Aug 22, 2024 at 04:10:49PM +0100, Joey Gouly wrote:

Re: [PATCH v5 06/30] arm64: context switch POR_EL0 register

2024-08-23 Thread Catalin Marinas
On Fri, Aug 23, 2024 at 03:45:32PM +0100, Will Deacon wrote: > On Thu, Aug 22, 2024 at 04:10:49PM +0100, Joey Gouly wrote: > > +static void permission_overlay_switch(struct task_struct *next) > > +{ > > + if (!system_supports_poe()) > > + return; > > + > > + current->thread.por_el0 =

Re: [PATCH v4 18/29] arm64: add POE signal support

2024-08-19 Thread Catalin Marinas
On Thu, Aug 15, 2024 at 04:09:26PM +0100, Dave P Martin wrote: > On Thu, Aug 15, 2024 at 02:18:15PM +0100, Joey Gouly wrote: > > That's a lot of words to say, or ask, do you agree with the approach of only > > saving POR_EL0 in the signal frame if num_allocated_pkeys() > 1? > > > > Thanks, > > Joe

Re: [PATCH v4 18/29] arm64: add POE signal support

2024-08-14 Thread Catalin Marinas
Hi Joey, On Tue, Aug 06, 2024 at 03:31:03PM +0100, Joey Gouly wrote: > diff --git arch/arm64/kernel/signal.c arch/arm64/kernel/signal.c > index 561986947530..ca7d4e0be275 100644 > --- arch/arm64/kernel/signal.c > +++ arch/arm64/kernel/signal.c > @@ -1024,7 +1025,10 @@ static int setup_sigframe_lay

Re: [PATCH v6 RESED 1/2] dma: replace zone_dma_bits by zone_dma_limit

2024-08-12 Thread Catalin Marinas
On Sun, Aug 11, 2024 at 10:09:35AM +0300, Baruch Siach wrote: > From: Catalin Marinas > > Hardware DMA limit might not be power of 2. When RAM range starts above > 0, say 4GB, DMA limit of 30 bits should end at 5GB. A single high bit > can not encode this limit. > > Use
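The arithmetic behind the quoted commit message can be sketched as follows. This is only an illustration in Python, not kernel code: `dma_limit`, `limit_from_bits`, `ram_base` and `dma_bits` are hypothetical names, and the 4GB base and 30-bit window are the figures used in the message.

```python
GiB = 1 << 30

def dma_limit(ram_base: int, dma_bits: int) -> int:
    """Highest CPU address inside a DMA window of 2**dma_bits bytes
    that starts at ram_base (hypothetical helper, inclusive limit)."""
    return ram_base + (1 << dma_bits) - 1

def limit_from_bits(bits: int) -> int:
    """Highest address a bare bit count can describe; such a range
    implicitly starts at address 0."""
    return (1 << bits) - 1

ram_base = 4 * GiB                      # RAM starts above 0, at 4GB
limit = dma_limit(ram_base, 30)         # last byte below 5GB
bits_needed = limit.bit_length()        # smallest bit count that covers the limit
overshoot = limit_from_bits(bits_needed) - limit

print(hex(limit), bits_needed, overshoot // GiB)
```

Under these assumptions the limit is one byte below 5GB, while the smallest bit count that covers it also admits everything up to 8GB, which is why an explicit address limit is the more precise encoding.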

Re: [PATCH v5 2/3] dma: replace zone_dma_bits by zone_dma_limit

2024-08-08 Thread Catalin Marinas
On Thu, Aug 08, 2024 at 11:35:01AM +0200, Petr Tesařík wrote: > On Wed, 7 Aug 2024 19:14:58 +0100 > Catalin Marinas wrote: > > With ZONE_DMA32, since all the DMA code assumes that ZONE_DMA32 ends at > > 4GB CPU address, it doesn't really work for such platforms. If there

Re: [PATCH v5 2/3] dma: replace zone_dma_bits by zone_dma_limit

2024-08-07 Thread Catalin Marinas
On Wed, Aug 07, 2024 at 04:19:38PM +0200, Petr Tesařík wrote: > On Fri, 2 Aug 2024 10:37:38 +0100 > Catalin Marinas wrote: > > On Fri, Aug 02, 2024 at 09:03:47AM +0300, Baruch Siach wrote: > > > diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c > > > index 3b4

Re: [PATCH v12 02/84] KVM: arm64: Disallow copying MTE to guest memory while KVM is dirty logging

2024-08-07 Thread Catalin Marinas
out; > + } There are ways to actually log the page dirtying but I don't think it's worth it. AFAICT, reading the tags still works and that's what's used during migration (on the VM where dirty tracking takes place). Reviewed-by: Catalin Marinas

Re: [PATCH v12 01/84] KVM: arm64: Release pfn, i.e. put page, if copying MTE tags hits ZONE_DEVICE

2024-08-07 Thread Catalin Marinas
_DEVICE or not. gfn_to_pfn_prot() increased the page refcount via GUP, so it must be released before bailing out of this loop. Reviewed-by: Catalin Marinas

Re: [PATCH v5 1/3] dma: improve DMA zone selection

2024-08-07 Thread Catalin Marinas
Thanks Robin for having a look. On Wed, Aug 07, 2024 at 02:13:06PM +0100, Robin Murphy wrote: > On 2024-08-02 7:03 am, Baruch Siach wrote: > > When device DMA limit does not fit in DMA32 zone it should use DMA zone, > > even when DMA zone is stricter than needed. > > > > Same goes for devices tha

Re: [PATCH v5 2/3] dma: replace zone_dma_bits by zone_dma_limit

2024-08-02 Thread Catalin Marinas
On Fri, Aug 02, 2024 at 09:03:47AM +0300, Baruch Siach wrote: > diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c > index 3b4be4ca3b08..62b36fda44c9 100644 > --- a/kernel/dma/direct.c > +++ b/kernel/dma/direct.c > @@ -20,7 +20,7 @@ > * it for entirely different regions. In that case the arch

Re: [PATCH v4 2/2] dma: replace zone_dma_bits by zone_dma_limit

2024-08-01 Thread Catalin Marinas
On Thu, Aug 01, 2024 at 11:25:07AM +0300, Baruch Siach wrote: > diff --git a/kernel/dma/pool.c b/kernel/dma/pool.c > index d10613eb0f63..a6e15db9d1e7 100644 > --- a/kernel/dma/pool.c > +++ b/kernel/dma/pool.c > @@ -70,9 +70,10 @@ static bool cma_in_zone(gfp_t gfp) > /* CMA can't cross zone bo

Re: [PATCH v4 2/2] dma: replace zone_dma_bits by zone_dma_limit

2024-08-01 Thread Catalin Marinas
On Thu, Aug 01, 2024 at 11:25:07AM +0300, Baruch Siach wrote: > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c > index 9b5ab6818f7f..c45e2152ca9e 100644 > --- a/arch/arm64/mm/init.c > +++ b/arch/arm64/mm/init.c > @@ -115,35 +115,35 @@ static void __init arch_reserve_crashkernel(void) > }

Re: [PATCH v4 1/2] dma: improve DMA zone selection

2024-08-01 Thread Catalin Marinas
o DMA32 in that case. > > Reported-by: Catalin Marinas > Signed-off-by: Baruch Siach This looks fine to me now. Reviewed-by: Catalin Marinas

Re: [PATCH v3 3/3] dma-direct: use RAM start to offset zone_dma_limit

2024-07-31 Thread Catalin Marinas
On Mon, Jul 29, 2024 at 01:51:26PM +0300, Baruch Siach wrote: > diff --git a/kernel/dma/pool.c b/kernel/dma/pool.c > index 410a7b40e496..ded3d841c88c 100644 > --- a/kernel/dma/pool.c > +++ b/kernel/dma/pool.c > @@ -12,6 +12,7 @@ > #include > #include > #include > +#include > > static stru

Re: [PATCH v3 2/3] dma-mapping: replace zone_dma_bits by zone_dma_limit

2024-07-31 Thread Catalin Marinas
On Mon, Jul 29, 2024 at 01:51:25PM +0300, Baruch Siach wrote: > From: Catalin Marinas > > Hardware DMA limit might not be power of 2. When RAM range starts above > 0, say 4GB, DMA limit of 30 bits should end at 5GB. A single high bit > can not encode this limit. > > Use dir

Re: [PATCH RFC v2 2/5] of: get dma area lower limit

2024-07-30 Thread Catalin Marinas
On Thu, Jul 25, 2024 at 02:49:01PM +0300, Baruch Siach wrote: > Hi Catalin, > > On Tue, Jun 18 2024, Catalin Marinas wrote: > > On Tue, Apr 09, 2024 at 09:17:55AM +0300, Baruch Siach wrote: > >> of_dma_get_max_cpu_address() returns the highest CPU address that > >

Re: [PATCH v4 17/29] arm64: implement PKEYS support

2024-07-08 Thread Catalin Marinas
Hi Szabolcs, On Mon, Jun 17, 2024 at 03:51:35PM +0100, Szabolcs Nagy wrote: > The 06/17/2024 15:40, Florian Weimer wrote: > > >> A user can still set it by interacting with the register directly, but I > > >> guess > > >> we want something for the glibc interface.. > > >> > > >> Dave, any though

Re: [PATCH v4 13/29] arm64: convert protection key into vm_flags and pgprot values

2024-07-08 Thread Catalin Marinas
On Thu, Jul 04, 2024 at 01:47:04PM +0100, Joey Gouly wrote: > On Wed, Jun 19, 2024 at 05:45:29PM +0100, Catalin Marinas wrote: > > On Tue, May 28, 2024 at 12:24:57PM +0530, Amit Daniel Kachhap wrote: > > > On 5/3/24 18:31, Joey Gouly wrote: > > > > diff --git a

Re: [PATCH v4 22/29] arm64: add Permission Overlay Extension Kconfig

2024-07-05 Thread Catalin Marinas
On Fri, May 03, 2024 at 02:01:40PM +0100, Joey Gouly wrote: > Now that support for POE and Protection Keys has been implemented, add a > config to allow users to actually enable it. > > Signed-off-by: Joey Gouly > Cc: Catalin Marinas > Cc: Will Deacon Acked-by: Catalin Marinas

Re: [PATCH v4 18/29] arm64: add POE signal support

2024-07-05 Thread Catalin Marinas
On Fri, May 03, 2024 at 02:01:36PM +0100, Joey Gouly wrote: > Add PKEY support to signals, by saving and restoring POR_EL0 from the > stackframe. > > Signed-off-by: Joey Gouly > Cc: Catalin Marinas > Cc: Will Deacon > Reviewed-by: Mark Brown > Acked-by: Szabolcs Nag

Re: [PATCH v4 17/29] arm64: implement PKEYS support

2024-07-05 Thread Catalin Marinas
false)) I'm still not entirely convinced on checking the keys during fast GUP but that's what x86 and powerpc do already, so I guess we'll follow the same ABI. Reviewed-by: Catalin Marinas

Re: [PATCH v4 20/29] arm64: enable POE and PIE to coexist

2024-06-21 Thread Catalin Marinas
On Fri, May 03, 2024 at 02:01:38PM +0100, Joey Gouly wrote: > Set the EL0/userspace indirection encodings to be the overlay enabled > variants of the permissions. > > Signed-off-by: Joey Gouly > Cc: Catalin Marinas > Cc: Will Deacon Reviewed-by: Catalin Marinas

Re: [PATCH v4 16/29] arm64: add pte_access_permitted_no_overlay()

2024-06-21 Thread Catalin Marinas
On Fri, May 03, 2024 at 02:01:34PM +0100, Joey Gouly wrote: > We do not want to take POE into account when clearing the MTE tags. > > Signed-off-by: Joey Gouly > Cc: Catalin Marinas > Cc: Will Deacon Reviewed-by: Catalin Marinas

Re: [PATCH v4 06/29] arm64: context switch POR_EL0 register

2024-06-21 Thread Catalin Marinas
On Fri, May 03, 2024 at 02:01:24PM +0100, Joey Gouly wrote: > +static void flush_poe(void) > +{ > + if (!system_supports_poe()) > + return; > + > + write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0); > + /* ISB required for kernel uaccess routines when chaning POR_EL0 */ Nit: s/chan

Re: [PATCH v4 12/29] arm64: add POIndex defines

2024-06-21 Thread Catalin Marinas
On Fri, May 03, 2024 at 02:01:30PM +0100, Joey Gouly wrote: > The 3-bit POIndex is stored in the PTE at bits 60..62. > > Signed-off-by: Joey Gouly > Cc: Catalin Marinas > Cc: Will Deacon Acked-by: Catalin Marinas
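The field layout stated in the commit message (a 3-bit POIndex at PTE bits 60..62) can be sketched with plain bit arithmetic. This is a Python illustration only; `POINDEX_SHIFT`, `POINDEX_MASK`, `pte_set_poindex` and `pte_get_poindex` are hypothetical names, not the kernel's definitions.

```python
POINDEX_SHIFT = 60                        # lowest bit of the field
POINDEX_MASK = 0b111 << POINDEX_SHIFT     # covers bits 60, 61 and 62

def pte_set_poindex(pte: int, idx: int) -> int:
    """Return pte with its POIndex field replaced by idx (0..7)."""
    if not 0 <= idx <= 7:
        raise ValueError("POIndex is a 3-bit value")
    return (pte & ~POINDEX_MASK) | (idx << POINDEX_SHIFT)

def pte_get_poindex(pte: int) -> int:
    """Extract the 3-bit POIndex field from pte."""
    return (pte & POINDEX_MASK) >> POINDEX_SHIFT

pte = 0x0040000012345F43                  # arbitrary example value
tagged = pte_set_poindex(pte, 5)
print(hex(tagged), pte_get_poindex(tagged))
```

Three bits give eight possible index values, and setting the field leaves every other PTE bit untouched.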

Re: [PATCH v4 11/29] arm64: re-order MTE VM_ flags

2024-06-21 Thread Catalin Marinas
On Fri, May 03, 2024 at 02:01:29PM +0100, Joey Gouly wrote: > To make it easier to share the generic PKEYs flags, move the MTE flag. > > Signed-off-by: Joey Gouly > Cc: Catalin Marinas > Cc: Will Deacon Acked-by: Catalin Marinas

Re: [PATCH v4 10/29] arm64: enable the Permission Overlay Extension for EL0

2024-06-21 Thread Catalin Marinas
On Fri, May 03, 2024 at 02:01:28PM +0100, Joey Gouly wrote: > Expose a HWCAP and ID_AA64MMFR3_EL1_S1POE to userspace, so they can be used to > check if the CPU supports the feature. > > Signed-off-by: Joey Gouly > Cc: Catalin Marinas > Cc: Will Deacon Reviewed-by: Catalin Marinas

Re: [PATCH v4 06/29] arm64: context switch POR_EL0 register

2024-06-21 Thread Catalin Marinas
On Fri, May 03, 2024 at 02:01:24PM +0100, Joey Gouly wrote: > POR_EL0 is a register that can be modified by userspace directly, > so it must be context switched. > > Signed-off-by: Joey Gouly > Cc: Catalin Marinas > Cc: Will Deacon Reviewed-by: Catalin Marinas

Re: [PATCH v4 05/29] arm64: cpufeature: add Permission Overlay Extension cpucap

2024-06-21 Thread Catalin Marinas
On Fri, Jun 21, 2024 at 06:01:52PM +0100, Catalin Marinas wrote: > On Fri, May 03, 2024 at 02:01:23PM +0100, Joey Gouly wrote: > > This indicates if the system supports POE. This is a CPUCAP_BOOT_CPU_FEATURE > > as the boot CPU will enable POE if it has it, so secondary CPUs mus

Re: [PATCH v4 05/29] arm64: cpufeature: add Permission Overlay Extension cpucap

2024-06-21 Thread Catalin Marinas
On Fri, May 03, 2024 at 02:01:23PM +0100, Joey Gouly wrote: > This indicates if the system supports POE. This is a CPUCAP_BOOT_CPU_FEATURE > as the boot CPU will enable POE if it has it, so secondary CPUs must also > have this feature. > > Signed-off-by: Joey Gouly > Cc: Cat

Re: [PATCH v4 15/29] arm64: handle PKEY/POE faults

2024-06-21 Thread Catalin Marinas
rotect() call. > + > fault = __do_page_fault(mm, vma, addr, mm_flags, vm_flags, regs); You'll need to rebase this on 6.10-rcX since this function disappeared. Otherwise the patch looks fine. Reviewed-by: Catalin Marinas

Re: [PATCH v4 13/29] arm64: convert protection key into vm_flags and pgprot values

2024-06-19 Thread Catalin Marinas
On Tue, May 28, 2024 at 12:24:57PM +0530, Amit Daniel Kachhap wrote: > On 5/3/24 18:31, Joey Gouly wrote: > > diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h > > index 5966ee4a6154..ecb2d18dc4d7 100644 > > --- a/arch/arm64/include/asm/mman.h > > +++ b/arch/arm64/include/a

Re: [PATCH RFC v2 4/5] dma-direct: add base offset to zone_dma_bits

2024-06-18 Thread Catalin Marinas
On Tue, Apr 09, 2024 at 09:17:57AM +0300, Baruch Siach wrote: > Current code using zone_dma_bits assume that all addresses range in the > bits mask are suitable for DMA. For some existing platforms this > assumption is not correct. DMA range might have non zero lower limit. [...] > @@ -59,7 +60,7 @

Re: [PATCH RFC v2 2/5] of: get dma area lower limit

2024-06-18 Thread Catalin Marinas
On Tue, Apr 09, 2024 at 09:17:55AM +0300, Baruch Siach wrote: > of_dma_get_max_cpu_address() returns the highest CPU address that > devices can use for DMA. The implicit assumption is that all CPU > addresses below that limit are suitable for DMA. However the > 'dma-ranges' property this code uses

Re: [PATCH RFC v2 1/5] dma-mapping: replace zone_dma_bits by zone_dma_limit

2024-06-18 Thread Catalin Marinas
(finally getting around to looking at this series, sorry for the delay) On Tue, Apr 09, 2024 at 09:17:54AM +0300, Baruch Siach wrote: > From: Catalin Marinas > > Hardware DMA limit might not be power of 2. When RAM range starts above > 0, say 4GB, DMA limit of 30 bits should e

Re: [PATCH v2 2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS

2024-04-09 Thread Catalin Marinas
lat_sig -P 1 prot lat_sig' from lmbench testcase. > > Since the page fault is handled under per-VMA lock, count it as a vma lock > event with VMA_LOCK_SUCCESS. > > Reviewed-by: Suren Baghdasaryan > Signed-off-by: Kefeng Wang Reviewed-by: Catalin Marinas

Re: [PATCH v2 1/7] arm64: mm: cleanup __do_page_fault()

2024-04-09 Thread Catalin Marinas
yan > Signed-off-by: Kefeng Wang As I reviewed v1 and the changes are minimal: Reviewed-by: Catalin Marinas

Re: [PATCH 2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS

2024-04-03 Thread Catalin Marinas
TRY | VM_FAULT_COMPLETED))) I think this makes sense. A concurrent modification of vma->vm_flags (e.g. mprotect()) would do a vma_start_write(), so no need to recheck again with the mmap lock held. Reviewed-by: Catalin Marinas

Re: [PATCH 1/7] arm64: mm: cleanup __do_page_fault()

2024-04-03 Thread Catalin Marinas
On Tue, Apr 02, 2024 at 03:51:36PM +0800, Kefeng Wang wrote: > The __do_page_fault() only check vma->flags and call handle_mm_fault(), > and only called by do_page_fault(), let's squash it into do_page_fault() > to cleanup code. > > Signed-off-by: Kefeng Wang Reviewed-by: Catalin Marinas

Re: [PATCH 4/4] vdso: avoid including asm/page.h

2024-02-27 Thread Catalin Marinas
> Reported-by: Linux Kernel Functional Testing > Fixes: a0d2fcd62ac2 ("vdso/ARM: Make union vdso_data_store available for all > architectures") > Link: > https://lore.kernel.org/lkml/ca+g9fytrxxm_ko9fnpz3xarxhv7ud_yqp-teupqrnrhu+_0...@mail.gmail.com/ > Signed-off-by: Arnd Bergmann Acked-by: Catalin Marinas

Re: [PATCH 2/4] arch: simplify architecture specific page size configuration

2024-02-27 Thread Catalin Marinas
lly used, while > leaving the architecture specific ones as the user visible > place for configuring it, to avoid breaking user configs. > > Signed-off-by: Arnd Bergmann For arm64: Acked-by: Catalin Marinas

Re: [PATCH v6 12/18] arm64/mm: Wire up PTE_CONT for user mappings

2024-02-19 Thread Catalin Marinas
On Fri, Feb 16, 2024 at 12:53:43PM +, Ryan Roberts wrote: > On 16/02/2024 12:25, Catalin Marinas wrote: > > On Thu, Feb 15, 2024 at 10:31:59AM +, Ryan Roberts wrote: > >> +pte_t contpte_ptep_get_lockless(pte_t *orig_ptep) > >> +{ > >> + /* > >>

Re: [PATCH v6 12/18] arm64/mm: Wire up PTE_CONT for user mappings

2024-02-16 Thread Catalin Marinas
On Fri, Feb 16, 2024 at 12:53:43PM +, Ryan Roberts wrote: > On 16/02/2024 12:25, Catalin Marinas wrote: > > On Thu, Feb 15, 2024 at 10:31:59AM +, Ryan Roberts wrote: > >> arch/arm64/mm/contpte.c | 285 +++ > > > > Ni

Re: [PATCH v6 18/18] arm64/mm: Automatically fold contpte mappings

2024-02-16 Thread Catalin Marinas
perform the checks when an individual PTE is modified via mprotect > (ptep_modify_prot_commit() -> set_pte_at() -> set_ptes(nr=1)) and only > when we are setting the final PTE in a contpte-aligned block. > > Signed-off-by: Ryan Roberts Acked-by: Catalin Marinas

Re: [PATCH v6 17/18] arm64/mm: __always_inline to improve fork() perf

2024-02-16 Thread Catalin Marinas
is called by them, as __always_inline. This is worth ~1% on the > fork() microbenchmark with order-0 folios (the common case). > > Acked-by: Mark Rutland > Signed-off-by: Ryan Roberts Acked-by: Catalin Marinas

Re: [PATCH v6 16/18] arm64/mm: Implement pte_batch_hint()

2024-02-16 Thread Catalin Marinas
> Acked-by: Mark Rutland > Reviewed-by: David Hildenbrand > Tested-by: John Hubbard > Signed-off-by: Ryan Roberts Acked-by: Catalin Marinas

Re: [PATCH v6 14/18] arm64/mm: Implement new [get_and_]clear_full_ptes() batch APIs

2024-02-16 Thread Catalin Marinas
the contpte. This significantly reduces unfolding > operations, reducing the number of tlbis that must be issued. > > Tested-by: John Hubbard > Signed-off-by: Ryan Roberts Acked-by: Catalin Marinas

Re: [PATCH v6 13/18] arm64/mm: Implement new wrprotect_ptes() batch API

2024-02-16 Thread Catalin Marinas
wrprotect a whole contpte block without unfolding is > possible thanks to the tightening of the Arm ARM in respect to the > definition and behaviour when 'Misprogramming the Contiguous bit'. See > section D21194 at https://developer.arm.com/documentation/102105/ja-07/ > > Tested-by: John Hubbard > Signed-off-by: Ryan Roberts Acked-by: Catalin Marinas

Re: [PATCH v6 12/18] arm64/mm: Wire up PTE_CONT for user mappings

2024-02-16 Thread Catalin Marinas
istent and the only variation allowed is the dirty/young state to be passed to the orig_pte returned. The original pte may have been updated by the time this loop finishes but I don't think it matters, it wouldn't be any different than reading a single pte and returning it while it is being updated. If you can make this easier to parse (in a few years time) with an additional patch adding some more comments, that would be great. For this patch: Reviewed-by: Catalin Marinas -- Catalin

Re: [PATCH v6 11/18] arm64/mm: Split __flush_tlb_range() to elide trailing DSB

2024-02-15 Thread Catalin Marinas
been discussed that __flush_tlb_page() may be wrong though. > Regardless, both will be resolved separately if needed. > > Reviewed-by: David Hildenbrand > Tested-by: John Hubbard > Signed-off-by: Ryan Roberts Acked-by: Catalin Marinas

Re: [PATCH v6 10/18] arm64/mm: New ptep layer to manage contig bit

2024-02-15 Thread Catalin Marinas
ar_young > - ptep_clear_flush_young > - ptep_set_wrprotect > - ptep_set_access_flags > > Tested-by: John Hubbard > Signed-off-by: Ryan Roberts Acked-by: Catalin Marinas

Re: [PATCH v6 09/18] arm64/mm: Convert ptep_clear() to ptep_get_and_clear()

2024-02-15 Thread Catalin Marinas
transparent contpte work. We won't have a private version of > ptep_clear() so let's convert it to directly call ptep_get_and_clear(). > > Tested-by: John Hubbard > Signed-off-by: Ryan Roberts Acked-by: Catalin Marinas

Re: [PATCH v6 08/18] arm64/mm: Convert set_pte_at() to set_ptes(..., 1)

2024-02-15 Thread Catalin Marinas
since those call sites are acting on behalf of > core-mm and should continue to call into the public set_ptes() rather > than the arch-private __set_ptes(). > > Tested-by: John Hubbard > Signed-off-by: Ryan Roberts Acked-by: Catalin Marinas

Re: [PATCH v6 07/18] arm64/mm: Convert READ_ONCE(*ptep) to ptep_get(ptep)

2024-02-15 Thread Catalin Marinas
be the same. > > This will benefit us when we shortly introduce the transparent contpte > support. In this case, ptep_get() will become more complex so we now > have all the code abstracted through it. > > Tested-by: John Hubbard > Signed-off-by: Ryan Roberts Acked-by: Catalin Marinas

Re: [PATCH v6 04/18] arm64/mm: Convert pte_next_pfn() to pte_advance_pfn()

2024-02-15 Thread Catalin Marinas
On Thu, Feb 15, 2024 at 10:31:51AM +, Ryan Roberts wrote: > Core-mm needs to be able to advance the pfn by an arbitrary amount, so > override the new pte_advance_pfn() API to do so. > > Signed-off-by: Ryan Roberts Acked-by: Catalin Marinas

Re: get_user_pages() and EXEC_ONLY mapping.

2023-11-10 Thread Catalin Marinas
On Fri, Nov 10, 2023 at 08:19:23PM +0530, Aneesh Kumar K.V wrote: > Some architectures can now support EXEC_ONLY mappings and I am wondering > what get_user_pages() on those addresses should return. Earlier > PROT_EXEC implied PROT_READ and pte_access_permitted() returned true for > that. But arm64

Re: [PATCH v3 9/9] efi: move screen_info into efi init code

2023-10-10 Thread Catalin Marinas
> arch/arm64/kernel/efi.c | 4 > arch/arm64/kernel/image-vars.h| 2 ++ It's more Ard's thing and he reviewed it already but if you need another ack: Acked-by: Catalin Marinas

Re: [PATCH v2] arch: Reserve map_shadow_stack() syscall number for all architectures

2023-10-04 Thread Catalin Marinas
by the > perf folks. > - Map Powerpc to sys_ni_syscall (Rick Edgecombe) > --- > arch/alpha/kernel/syscalls/syscall.tbl | 1 + > arch/arm/tools/syscall.tbl | 1 + > arch/arm64/include/asm/unistd.h | 2 +- > arch/arm64/include/asm/unistd32.h | 2 ++ For arm64 (compat): Acked-by: Catalin Marinas

Re: [PATCH v1 0/8] Fix set_huge_pte_at() panic on arm64

2023-09-21 Thread Catalin Marinas
On Thu, Sep 21, 2023 at 05:35:54PM +0100, Ryan Roberts wrote: > On 21/09/2023 17:30, Andrew Morton wrote: > > On Thu, 21 Sep 2023 17:19:59 +0100 Ryan Roberts > > wrote: > >> Ryan Roberts (8): > >> parisc: hugetlb: Convert set_huge_pte_at() to take vma > >> powerpc: hugetlb: Convert set_huge_p

Re: [PATCH v3 5/5] mmu_notifiers: Rename invalidate_range notifier

2023-07-21 Thread Catalin Marinas
sh_tlb_range(struct > vm_area_struct *vma, > scale++; > } > dsb(ish); > - mmu_notifier_invalidate_range(vma->vm_mm, start, end); > + mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, start, end); > } For arm64: Acked-by: Catalin Marinas

Re: [PATCH v3 3/5] mmu_notifiers: Call invalidate_range() when invalidating TLBs

2023-07-21 Thread Catalin Marinas
D(mm)); > __tlbi(vale1is, addr); > __tlbi_user(vale1is, addr); > + mmu_notifier_invalidate_range(mm, uaddr & PAGE_MASK, > + (uaddr & PAGE_MASK) + > PAGE_SIZE); Nitpick: we have PAGE_ALIGN() for this. For arm64: Acked-by: Catalin Marinas

Re: [PATCH v11 4/4] arm64: support batched/deferred tlb shootdown during page reclamation/migration

2023-07-21 Thread Catalin Marinas
dsb(ish); > +} Nitpick: as an additional patch, I'd add some comment for these two functions that the TLBI has already been issued and only a DSB is needed to synchronise its effect on the other CPUs. Reviewed-by: Catalin Marinas

Re: [PATCH v11 3/4] mm/tlbbatch: Introduce arch_flush_tlb_batched_pending()

2023-07-21 Thread Catalin Marinas
64 may > only need a synchronization barrier(dsb) here rather than > a full mm flush. So add arch_flush_tlb_batched_pending() to > allow an arch-specific implementation here. This intends no > functional changes on x86 since still a full mm flush for > x86. > > Signed-off-by: Yicong Yang Reviewed-by: Catalin Marinas

Re: [PATCH v11 2/4] mm/tlbbatch: Rename and extend some functions

2023-07-21 Thread Catalin Marinas
> Tested-by: Yicong Yang > Tested-by: Xin Hao > Tested-by: Punit Agrawal > Signed-off-by: Barry Song > Signed-off-by: Yicong Yang > Reviewed-by: Kefeng Wang > Reviewed-by: Xin Hao > Reviewed-by: Anshuman Khandual Reviewed-by: Catalin Marinas

Re: [PATCH v11 1/4] mm/tlbbatch: Introduce arch_tlbbatch_should_defer()

2023-07-21 Thread Catalin Marinas
318-2-khand...@linux.vnet.ibm.com/] > Signed-off-by: Yicong Yang > [Rebase and fix incorrect return value type] > Reviewed-by: Kefeng Wang > Reviewed-by: Anshuman Khandual > Reviewed-by: Barry Song > Reviewed-by: Xin Hao > Tested-by: Punit Agrawal Reviewed-by: Catalin Marinas

Re: [PATCH v10 4/4] arm64: support batched/deferred tlb shootdown during page reclamation/migration

2023-07-16 Thread Catalin Marinas
On Mon, Jul 10, 2023 at 04:39:14PM +0800, Yicong Yang wrote: > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig > index 7856c3a3e35a..f0ce8208c57f 100644 > --- a/arch/arm64/Kconfig > +++ b/arch/arm64/Kconfig > @@ -96,6 +96,7 @@ config ARM64 > select ARCH_SUPPORTS_NUMA_BALANCING > se

Re: [RESEND PATCH v9 2/2] arm64: support batched/deferred tlb shootdown during page reclamation/migration

2023-06-29 Thread Catalin Marinas
On Thu, Jun 29, 2023 at 05:31:36PM +0100, Catalin Marinas wrote: > On Thu, May 18, 2023 at 02:59:34PM +0800, Yicong Yang wrote: > > From: Barry Song > > > > on x86, batched and deferred tlb shootdown has led to 90% > > performance increase on tlb shootdown. on arm64,

Re: [RESEND PATCH v9 2/2] arm64: support batched/deferred tlb shootdown during page reclamation/migration

2023-06-29 Thread Catalin Marinas
On Thu, May 18, 2023 at 02:59:34PM +0800, Yicong Yang wrote: > From: Barry Song > > on x86, batched and deferred tlb shootdown has led to 90% > performance increase on tlb shootdown. on arm64, HW can do > tlb shootdown without software IPI. But sync tlbi is still > quite expensive. [...] > .../

Re: [PATCH v4 21/34] arm64: Convert various functions to use ptdescs

2023-06-14 Thread Catalin Marinas
On Mon, Jun 12, 2023 at 02:04:10PM -0700, Vishal Moola (Oracle) wrote: > As part of the conversions to replace pgtable constructor/destructors with > ptdesc equivalents, convert various page table functions to use ptdescs. > > Signed-off-by: Vishal Moola (Oracle) Acked-by: Catalin Marinas

Re: [PATCH 0/3] Move the ARCH_DMA_MINALIGN definition to asm/cache.h

2023-06-13 Thread Catalin Marinas
On Tue, Jun 13, 2023 at 04:42:40PM +, Christophe Leroy wrote: > > > Le 13/06/2023 à 17:52, Catalin Marinas a écrit : > > Hi, > > > > The ARCH_KMALLOC_MINALIGN reduction series defines a generic > > ARCH_DMA_MINALIGN in linux/cache.h: > > > > ht

[PATCH 2/3] microblaze: Move the ARCH_{DMA,SLAB}_MINALIGN definitions to asm/cache.h

2023-06-13 Thread Catalin Marinas
The microblaze architecture defines ARCH_DMA_MINALIGN in asm/page.h. Move it to asm/cache.h to allow a generic ARCH_DMA_MINALIGN definition in linux/cache.h without redefine errors/warnings. While at it, also move ARCH_SLAB_MINALIGN to asm/cache.h for consistency. Signed-off-by: Catalin Marinas

[PATCH 3/3] sh: Move the ARCH_DMA_MINALIGN definition to asm/cache.h

2023-06-13 Thread Catalin Marinas
The sh architecture defines ARCH_DMA_MINALIGN in asm/page.h. Move it to asm/cache.h to allow a generic ARCH_DMA_MINALIGN definition in linux/cache.h without redefine errors/warnings. Signed-off-by: Catalin Marinas Cc: Yoshinori Sato Cc: Rich Felker Cc: John Paul Adrian Glaubitz Cc: linux

[PATCH 1/3] powerpc: Move the ARCH_DMA_MINALIGN definition to asm/cache.h

2023-06-13 Thread Catalin Marinas
The powerpc architecture defines ARCH_DMA_MINALIGN in asm/page_32.h and only if CONFIG_NOT_COHERENT_CACHE is enabled (32-bit platforms only). Move this macro to asm/cache.h to allow a generic ARCH_DMA_MINALIGN definition in linux/cache.h without redefine errors/warnings. Signed-off-by: Catalin

[PATCH 0/3] Move the ARCH_DMA_MINALIGN definition to asm/cache.h

2023-06-13 Thread Catalin Marinas
with the ARCH_KMALLOC_MINALIGN series? Thank you. Catalin Marinas (3): powerpc: Move the ARCH_DMA_MINALIGN definition to asm/cache.h microblaze: Move the ARCH_{DMA,SLAB}_MINALIGN definitions to asm/cache.h sh: Move the ARCH_DMA_MINALIGN definition to asm/cache.h arch/microblaze/includ

Re: linux-next: build failure after merge of the mm tree

2023-06-13 Thread Catalin Marinas
Hi Stephen, On Tue, Jun 13, 2023 at 04:21:19PM +1000, Stephen Rothwell wrote: > After merging the mm tree, today's linux-next build (powerpc > ppc44x_defconfig) failed like this: > > In file included from arch/powerpc/include/asm/page.h:247, > from arch/powerpc/include/asm/thread

Re: [PATCH] irq_work: consolidate arch_irq_work_raise prototypes

2023-05-25 Thread Catalin Marinas
> Fix this by providing it in only one place that is always visible. > > Signed-off-by: Arnd Bergmann Acked-by: Catalin Marinas

Re: [PATCH 03/23] arm64/hugetlb: pte_alloc_huge() pte_offset_huge()

2023-05-25 Thread Catalin Marinas
> > Signed-off-by: Hugh Dickins Acked-by: Catalin Marinas

Re: [PATCH 02/23] arm64: allow pte_offset_map() to fail

2023-05-25 Thread Catalin Marinas
On Tue, May 09, 2023 at 09:43:47PM -0700, Hugh Dickins wrote: > In rare transient cases, not yet made possible, pte_offset_map() and > pte_offset_map_lock() may not find a page table: handle appropriately. > > Signed-off-by: Hugh Dickins Acked-by: Catalin Marinas

Re: [PATCH 13/14] thread_info: move function declarations to linux/thread_info.h

2023-05-24 Thread Catalin Marinas
p_task_struct' [-Werror=missing-prototypes] > > There are already prototypes in a number of architecture specific headers > that have addressed those warnings before, but it's much better to have > these in a single place so the warning no longer shows up anywhere. > > Signed-off-by: Arnd Bergmann For arm64: Acked-by: Catalin Marinas

Re: [PATCH v3 02/14] arm64: drop ranges in definition of ARCH_FORCE_MAX_ORDER

2023-04-27 Thread Catalin Marinas
On Tue, Apr 25, 2023 at 11:09:58AM -0500, Justin Forbes wrote: > On Tue, Apr 18, 2023 at 5:22 PM Andrew Morton > wrote: > > On Wed, 12 Apr 2023 18:27:08 +0100 Catalin Marinas > > wrote: > > > > It sounds nice in theory. In practice. EXPERT hides too much. Whe

Re: [PATCH v3 02/14] arm64: drop ranges in definition of ARCH_FORCE_MAX_ORDER

2023-04-19 Thread Catalin Marinas
On Tue, Apr 18, 2023 at 03:05:57PM -0700, Andrew Morton wrote: > On Wed, 12 Apr 2023 18:27:08 +0100 Catalin Marinas > wrote: > > > It sounds nice in theory. In practice. EXPERT hides too much. When you > > > flip expert, you expose over a 175ish new config options which

Re: [PATCH v3 02/14] arm64: drop ranges in definition of ARCH_FORCE_MAX_ORDER

2023-04-12 Thread Catalin Marinas
On Tue, Apr 04, 2023 at 06:50:01AM -0500, Justin Forbes wrote: > On Tue, Apr 4, 2023 at 2:22 AM Mike Rapoport wrote: > > On Wed, Mar 29, 2023 at 10:55:37AM -0500, Justin Forbes wrote: > > > On Sat, Mar 25, 2023 at 1:09 AM Mike Rapoport wrote: > > > > > > > > From: "Mike Rapoport (IBM)" > > > > >

Re: [PATCH 18/21] ARM: drop SMP support for ARM11MPCore

2023-03-31 Thread Catalin Marinas
As a consequence, neither configuration is actually safe to use in a > general-purpose kernel that is used on both MPCore systems and ARM1176 > with prefetching enabled. As the author of this terrible hack (created under duress ;)) Acked-by: Catalin Marinas IIRC, RWFO is working in combinat

Re: [PATCH 00/21] dma-mapping: unify support for cache flushes

2023-03-31 Thread Catalin Marinas
On Mon, Mar 27, 2023 at 02:12:56PM +0200, Arnd Bergmann wrote: > Another difference that I do not address here is what cache invalidation > does for partial cache lines. On arm32, arm64 and powerpc, a partial > cache line always gets written back before invalidation in order to > ensure that data

Re: [PATCH 03/14] arm64: reword ARCH_FORCE_MAX_ORDER prompt and help text

2023-03-23 Thread Catalin Marinas
es. > > Signed-off-by: Mike Rapoport (IBM) Acked-by: Catalin Marinas

Re: [PATCH 02/14] arm64: drop ranges in definition of ARCH_FORCE_MAX_ORDER

2023-03-23 Thread Catalin Marinas
On Thu, Mar 23, 2023 at 11:21:44AM +0200, Mike Rapoport wrote: > From: "Mike Rapoport (IBM)" > > It is not a good idea to change fundamental parameters of core memory > management. Having predefined ranges suggests that the values within > those ranges are sensible, but one has to *really* unders

Re: [PATCH] mm: add PTE pointer parameter to flush_tlb_fix_spurious_fault()

2023-03-06 Thread Catalin Marinas
flush_tlb_fix_spurious_fault(vma, address, ptep) do { } while (0) For arm64: Acked-by: Catalin Marinas > diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h > index 2c70b4d1263d..c1f6b46ec555 100644 > --- a/arch/s390/include/asm/pgtable.h > +++ b/arch/s3

Re: [PATCH v3 02/24] arm64: Remove COMMAND_LINE_SIZE from uapi

2023-02-14 Thread Catalin Marinas
On Tue, Feb 14, 2023 at 08:49:03AM +0100, Alexandre Ghiti wrote: > From: Palmer Dabbelt > > As far as I can tell this is not used by userspace and thus should not > be part of the user-visible API. > > Signed-off-by: Palmer Dabbelt Acked-by: Catalin Marinas
