Re: [RFC PATCH v1 1/3] Revert "powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto"

2023-06-20 Thread Peter Zijlstra
On Tue, Jun 20, 2023 at 10:51:25AM +0530, Naveen N Rao wrote: > Christophe Leroy wrote: > > This reverts commit 1e688dd2a3d6759d416616ff07afc4bb836c4213. > > > > That commit aimed at optimising the code around generation of > > WARN_ON/BUG_ON but this leads to a lot of dead code erroneously > > ge

Re: [PATCH mm-unstable v2 01/10] mm/kvm: add mmu_notifier_ops->test_clear_young()

2023-06-20 Thread Nicholas Piggin
On Sat May 27, 2023 at 9:44 AM AEST, Yu Zhao wrote: > Add mmu_notifier_ops->test_clear_young() to supersede test_young() > and clear_young(). > > test_clear_young() has a fast path, which if supported, allows its > callers to safely clear the accessed bit without taking > kvm->mmu_lock. > > The fas

[PATCH v2 00/12] mm: free retracted page table by RCU

2023-06-20 Thread Hugh Dickins
Here is v2 third series of patches to mm (and a few architectures), based on v6.4-rc5 with the preceding two series applied: in which khugepaged takes advantage of pte_offset_map[_lock]() allowing for pmd transitions. Differences from v1 are noted patch by patch below This follows on from the v2 "

[PATCH v2 01/12] mm/pgtable: add rcu_read_lock() and rcu_read_unlock()s

2023-06-20 Thread Hugh Dickins
Before putting them to use (several commits later), add rcu_read_lock() to pte_offset_map(), and rcu_read_unlock() to pte_unmap(). Make this a separate commit, since it risks exposing imbalances: prior commits have fixed all the known imbalances, but we may find some have been missed. Signed-off-

[PATCH v2 02/12] mm/pgtable: add PAE safety to __pte_offset_map()

2023-06-20 Thread Hugh Dickins
There is a faint risk that __pte_offset_map(), on a 32-bit architecture with a 64-bit pmd_t e.g. x86-32 with CONFIG_X86_PAE=y, would succeed on a pmdval assembled from a pmd_low and a pmd_high which never belonged together: their combination not pointing to a page table at all, perhaps not even a v

Re: [PATCH] cxl/ocxl: Possible repeated word

2023-06-20 Thread Frederic Barrat
Hello, While the correction in the comment is of course ok, the patch was sent as html. You may want to check/fix how it was submitted. Fred On 18/06/2023 17:08, zhumao...@208suo.com wrote: Delete repeated word in comment. Signed-off-by: Zhu Mao --- drivers/misc/cxl/native.

[PATCH v2 03/12] arm: adjust_pte() use pte_offset_map_nolock()

2023-06-20 Thread Hugh Dickins
Instead of pte_lockptr(), use the recently added pte_offset_map_nolock() in adjust_pte(): because it gives the not-locked ptl for precisely that pte, which the caller can then safely lock; whereas pte_lockptr() is not so tightly coupled, because it dereferences the pmd pointer again. Signed-off-by

[PATCH v2 04/12] powerpc: assert_pte_locked() use pte_offset_map_nolock()

2023-06-20 Thread Hugh Dickins
Instead of pte_lockptr(), use the recently added pte_offset_map_nolock() in assert_pte_locked(). BUG if pte_offset_map_nolock() fails: this is stricter than the previous implementation, which skipped when pmd_none() (with a comment on khugepaged collapse transitions): but wouldn't we want to know,

[PATCH v2 05/12] powerpc: add pte_free_defer() for pgtables sharing page

2023-06-20 Thread Hugh Dickins
Add powerpc-specific pte_free_defer(), to call pte_free() via call_rcu(). pte_free_defer() will be called inside khugepaged's retract_page_tables() loop, where allocating extra memory cannot be relied upon. This precedes the generic version to avoid build breakage from incompatible pgtable_t. Thi

Re: [PATCH mm-unstable v2 07/10] kvm/powerpc: add kvm_arch_test_clear_young()

2023-06-20 Thread Nicholas Piggin
On Sat May 27, 2023 at 9:44 AM AEST, Yu Zhao wrote: > Implement kvm_arch_test_clear_young() to support the fast path in > mmu_notifier_ops->test_clear_young(). > > It focuses on a simple case, i.e., radix MMU sets the accessed bit in > KVM PTEs and VMs are not nested, where it can rely on RCU and >

[PATCH v2 06/12] sparc: add pte_free_defer() for pte_t *pgtable_t

2023-06-20 Thread Hugh Dickins
Add sparc-specific pte_free_defer(), to call pte_free() via call_rcu(). pte_free_defer() will be called inside khugepaged's retract_page_tables() loop, where allocating extra memory cannot be relied upon. This precedes the generic version to avoid build breakage from incompatible pgtable_t. sparc

[PATCH v2 07/12] s390: add pte_free_defer() for pgtables sharing page

2023-06-20 Thread Hugh Dickins
Add s390-specific pte_free_defer(), to call pte_free() via call_rcu(). pte_free_defer() will be called inside khugepaged's retract_page_tables() loop, where allocating extra memory cannot be relied upon. This precedes the generic version to avoid build breakage from incompatible pgtable_t. This v

[PATCH v2 08/12] mm/pgtable: add pte_free_defer() for pgtable as page

2023-06-20 Thread Hugh Dickins
Add the generic pte_free_defer(), to call pte_free() via call_rcu(). pte_free_defer() will be called inside khugepaged's retract_page_tables() loop, where allocating extra memory cannot be relied upon. This version suits all those architectures which use an unfragmented page for one page table (no

[PATCH v2 09/12] mm/khugepaged: retract_page_tables() without mmap or vma lock

2023-06-20 Thread Hugh Dickins
Simplify shmem and file THP collapse's retract_page_tables(), and relax its locking: to improve its success rate and to lessen impact on others. Instead of its MADV_COLLAPSE case doing set_huge_pmd() at target_addr of target_mm, leave that part of the work to madvise_collapse() calling collapse_pt

[PATCH v2 10/12] mm/khugepaged: collapse_pte_mapped_thp() with mmap_read_lock()

2023-06-20 Thread Hugh Dickins
Bring collapse_and_free_pmd() back into collapse_pte_mapped_thp(). It does need mmap_read_lock(), but it does not need mmap_write_lock(), nor vma_start_write() nor i_mmap lock nor anon_vma lock. All racing paths are relying on pte_offset_map_lock() and pmd_lock(), so use those. Follow the pattern

[PATCH v2 11/12] mm/khugepaged: delete khugepaged_collapse_pte_mapped_thps()

2023-06-20 Thread Hugh Dickins
Now that retract_page_tables() can retract page tables reliably, without depending on trylocks, delete all the apparatus for khugepaged to try again later: khugepaged_collapse_pte_mapped_thps() etc; and free up the per-mm memory which was set aside for that in the khugepaged_mm_slot. But one part

[PATCH v2 12/12] mm: delete mmap_write_trylock() and vma_try_start_write()

2023-06-20 Thread Hugh Dickins
mmap_write_trylock() and vma_try_start_write() were added just for khugepaged, but now it has no use for them: delete. Signed-off-by: Hugh Dickins --- include/linux/mm.h| 17 - include/linux/mmap_lock.h | 10 -- 2 files changed, 27 deletions(-) diff --git a/inclu

Re: [PATCH mm-unstable v2 06/10] kvm/powerpc: make radix page tables RCU safe

2023-06-20 Thread Yu Zhao
On Tue, Jun 20, 2023 at 12:33 AM Nicholas Piggin wrote: > > On Sat May 27, 2023 at 9:44 AM AEST, Yu Zhao wrote: > > KVM page tables are currently not RCU safe against remapping, i.e., > > kvmppc_unmap_free_pmd_entry_table() et al. The previous > > Minor nit but the "page table" is not RCU-safe aga

[PATCH mm 10/12] mm/khugepaged: collapse_pte_mapped_thp() with mmap_read_lock()

2023-06-20 Thread Hugh Dickins
Bring collapse_and_free_pmd() back into collapse_pte_mapped_thp(). It does need mmap_read_lock(), but it does not need mmap_write_lock(), nor vma_start_write() nor i_mmap lock nor anon_vma lock. All racing paths are relying on pte_offset_map_lock() and pmd_lock(), so use those. Follow the pattern

Re: [PATCH v9 00/14] pci: Work around ASMedia ASM2824 PCIe link training failures

2023-06-20 Thread Maciej W. Rozycki
On Fri, 16 Jun 2023, Bjorn Helgaas wrote: > I agree that as I rearranged it, the workaround doesn't apply in all > cases simultaneously. Maybe not ideal, but maybe not terrible either. > Looking at it again, maybe it would have made more sense to move the > pcie_wait_for_link_delay() change to th

Re: [PATCH mm-unstable v2 06/10] kvm/powerpc: make radix page tables RCU safe

2023-06-20 Thread Nicholas Piggin
On Tue Jun 20, 2023 at 6:00 PM AEST, Yu Zhao wrote: > On Tue, Jun 20, 2023 at 12:33 AM Nicholas Piggin wrote: > > > > On Sat May 27, 2023 at 9:44 AM AEST, Yu Zhao wrote: > > > KVM page tables are currently not RCU safe against remapping, i.e., > > > kvmppc_unmap_free_pmd_entry_table() et al. The p

Re: [PATCH v2 05/12] powerpc: add pte_free_defer() for pgtables sharing page

2023-06-20 Thread Jason Gunthorpe
On Tue, Jun 20, 2023 at 12:47:54AM -0700, Hugh Dickins wrote: > Add powerpc-specific pte_free_defer(), to call pte_free() via call_rcu(). > pte_free_defer() will be called inside khugepaged's retract_page_tables() > loop, where allocating extra memory cannot be relied upon. This precedes > the gen

[6.4.0-rc7-next-20230620] Boot failure on IBM Power LPAR

2023-06-20 Thread Sachin Sant
6.4.0-rc7-next-20230620 fails to boot on IBM Power LPAR with following [ 5.548368] BUG: Unable to handle kernel data access at 0x95bdcf954bc34e73 [ 5.548380] Faulting instruction address: 0xc0548090 [ 5.548384] Oops: Kernel access of bad area, sig: 11 [#1] [ 5.548387] LE PAGE_SIZE=64K MMU

Re: [PATCH] cxl/ocxl: Possible repeated word

2023-06-20 Thread Michael Ellerman
zhumao...@208suo.com writes: > Delete repeated word in comment. > > Signed-off-by: Zhu Mao > --- > drivers/misc/cxl/native.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/misc/cxl/native.c b/drivers/misc/cxl/native.c > index 50b0c44bb8d7..6957946a6463 100644 >

Re: [PATCH v2] security/integrity: fix pointer to ESL data and its size on pseries

2023-06-20 Thread R Nageswara Sastry
On 08/06/23 5:34 pm, Nayna Jain wrote: On PowerVM guest, variable data is prefixed with 8 bytes of timestamp. Extract ESL by stripping off the timestamp before passing to ESL parser. Fixes: 4b3e71e9a34c ("integrity/powerpc: Support loading keys from PLPKS") Cc: sta...@vger.kenrnel.org # v6.3

Re: [PATCH v2 2/2] powerpc/mm: Add memory_block_size as a kernel parameter

2023-06-20 Thread Michael Ellerman
David Hildenbrand writes: > On 09.06.23 08:08, Aneesh Kumar K.V wrote: >> Certain devices can possess non-standard memory capacities, not constrained >> to multiples of 1GB. Provide a kernel parameter so that we can map the >> device memory completely on memory hotplug. > > So, the unfortunate thi

Re: [PATCH v2 2/2] powerpc/mm: Add memory_block_size as a kernel parameter

2023-06-20 Thread David Hildenbrand
On 20.06.23 14:35, Michael Ellerman wrote: David Hildenbrand writes: On 09.06.23 08:08, Aneesh Kumar K.V wrote: Certain devices can possess non-standard memory capacities, not constrained to multiples of 1GB. Provide a kernel parameter so that we can map the device memory completely on memory

Re: [RFC PATCH 2/3] powerpc/pci: Remove MVE code

2023-06-20 Thread Michael Ellerman
Joel Stanley writes: > With IODA1 support gone the OPAL calls to set MVE are dead code. Remove > them. > > TODO: Do we have rules for removing unused OPAL APIs? Should we leave it > in opal.h? opal-call.c? I don't think we have any rules for removal. When skiboot was being actively developed Ste

[PATCH v7 17/19] powerpc: mm: Convert to GENERIC_IOREMAP

2023-06-20 Thread Baoquan He
From: Christophe Leroy By taking GENERIC_IOREMAP method, the generic generic_ioremap_prot(), generic_iounmap(), and their generic wrapper ioremap_prot(), ioremap() and iounmap() are all visible and available to arch. Arch needs to provide wrapper functions to override the generic versions if ther

Re: [PATCH v2 08/16] mm/vmemmap: Improve vmemmap_can_optimize and allow architectures to override

2023-06-20 Thread Aneesh Kumar K.V
Joao Martins writes: > On 16/06/2023 12:08, Aneesh Kumar K.V wrote: >> dax vmemmap optimization requires a minimum of 2 PAGE_SIZE area within >> vmemmap such that tail page mapping can point to the second PAGE_SIZE area. >> Enforce that in vmemmap_can_optimize() function. >> >> Architectures lik

Re: [PATCH v2 06/12] mm/execmem: introduce execmem_data_alloc()

2023-06-20 Thread Steven Rostedt
On Mon, 19 Jun 2023 02:43:58 +0200 Thomas Gleixner wrote: > Now you might argue that it _is_ a "hotpath" due to the BPF usage, but > then even more so as any intermediate wrapper which converts from one > data representation to another data representation is not going to > increase performance, r

[PATCH v2 0/2] send tlb_remove_table_smp_sync IPI only to necessary CPUs

2023-06-20 Thread Yair Podemsky
Currently the tlb_remove_table_smp_sync IPI is sent to all CPUs indiscriminately, this causes unnecessary work and delays notable in real-time use-cases and isolated cpus. By limiting the IPI to only be sent to cpus referencing the affected mm. a config to differentiate architectures that support m

[PATCH v2 1/2] arch: Introduce ARCH_HAS_CPUMASK_BITS

2023-06-20 Thread Yair Podemsky
Some architectures set and maintain the mm_cpumask bits when loading or removing process from cpu. This Kconfig will mark those to allow different behavior between kernels that maintain the mm_cpumask and those that do not. Signed-off-by: Yair Podemsky --- arch/Kconfig | 8 arch

[PATCH v2 2/2] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to MM CPUs

2023-06-20 Thread Yair Podemsky
Currently the tlb_remove_table_smp_sync IPI is sent to all CPUs indiscriminately, this causes unnecessary work and delays notable in real-time use-cases and isolated cpus. This patch will limit this IPI on systems with ARCH_HAS_CPUMASK_BITS, where the IPI will only be sent to cpus referencing the a

Re: [PATCH v2 06/12] mm/execmem: introduce execmem_data_alloc()

2023-06-20 Thread Alexei Starovoitov
On Tue, Jun 20, 2023 at 7:51 AM Steven Rostedt wrote: > > On Mon, 19 Jun 2023 02:43:58 +0200 > Thomas Gleixner wrote: > > > Now you might argue that it _is_ a "hotpath" due to the BPF usage, but > > then even more so as any intermediate wrapper which converts from one > > data representation to a

Re: [PATCH v2 02/12] mm: introduce execmem_text_alloc() and jit_text_alloc()

2023-06-20 Thread Andy Lutomirski
On Mon, Jun 19, 2023, at 1:18 PM, Nadav Amit wrote: >> On Jun 19, 2023, at 10:09 AM, Andy Lutomirski wrote: >> >> But jit_text_alloc() can't do this, because the order of operations doesn't >> match. With jit_text_alloc(), the executable mapping shows up before the >> text is populated, so

Re: [6.4.0-rc7-next-20230620] Boot failure on IBM Power LPAR

2023-06-20 Thread Yu Zhao
On Tue, Jun 20, 2023 at 05:41:57PM +0530, Sachin Sant wrote: > 6.4.0-rc7-next-20230620 fails to boot on IBM Power LPAR with following Sorry for hijacking this thread -- I've been seeing another crash on NV since -rc1 but I haven't had the time to bisect. Just FYI. [0.814500] B

Re: [PATCH v2 05/12] powerpc: add pte_free_defer() for pgtables sharing page

2023-06-20 Thread Hugh Dickins
On Tue, 20 Jun 2023, Jason Gunthorpe wrote: > On Tue, Jun 20, 2023 at 12:47:54AM -0700, Hugh Dickins wrote: > > Add powerpc-specific pte_free_defer(), to call pte_free() via call_rcu(). > > pte_free_defer() will be called inside khugepaged's retract_page_tables() > > loop, where allocating extra me

Re: [PATCH v4 04/34] pgtable: Create struct ptdesc

2023-06-20 Thread Vishal Moola
On Fri, Jun 16, 2023 at 5:38 AM Jason Gunthorpe wrote: > > On Mon, Jun 12, 2023 at 02:03:53PM -0700, Vishal Moola (Oracle) wrote: > > Currently, page table information is stored within struct page. As part > > of simplifying struct page, create struct ptdesc for page table > > information. > > > >

Re: [PATCH v4 04/34] pgtable: Create struct ptdesc

2023-06-20 Thread Jason Gunthorpe
On Tue, Jun 20, 2023 at 01:01:39PM -0700, Vishal Moola wrote: > On Fri, Jun 16, 2023 at 5:38 AM Jason Gunthorpe wrote: > > > > On Mon, Jun 12, 2023 at 02:03:53PM -0700, Vishal Moola (Oracle) wrote: > > > Currently, page table information is stored within struct page. As part > > > of simplifying s

Re: [PATCH v4 04/34] pgtable: Create struct ptdesc

2023-06-20 Thread Vishal Moola
On Tue, Jun 20, 2023 at 4:05 PM Jason Gunthorpe wrote: > > On Tue, Jun 20, 2023 at 01:01:39PM -0700, Vishal Moola wrote: > > On Fri, Jun 16, 2023 at 5:38 AM Jason Gunthorpe wrote: > > > > > > On Mon, Jun 12, 2023 at 02:03:53PM -0700, Vishal Moola (Oracle) wrote: > > > > Currently, page table info

Re: [PATCH v2 05/12] powerpc: add pte_free_defer() for pgtables sharing page

2023-06-20 Thread Jason Gunthorpe
On Tue, Jun 20, 2023 at 12:54:25PM -0700, Hugh Dickins wrote: > On Tue, 20 Jun 2023, Jason Gunthorpe wrote: > > On Tue, Jun 20, 2023 at 12:47:54AM -0700, Hugh Dickins wrote: > > > Add powerpc-specific pte_free_defer(), to call pte_free() via call_rcu(). > > > pte_free_defer() will be called inside

Re: [PATCH mm-unstable v2 07/10] kvm/powerpc: add kvm_arch_test_clear_young()

2023-06-20 Thread Yu Zhao
On Tue, Jun 20, 2023 at 1:48 AM Nicholas Piggin wrote: > > On Sat May 27, 2023 at 9:44 AM AEST, Yu Zhao wrote: > > Implement kvm_arch_test_clear_young() to support the fast path in > > mmu_notifier_ops->test_clear_young(). > > > > It focuses on a simple case, i.e., radix MMU sets the accessed bit

Re: [PATCH 15/17] perf tests task_analyzer: fix bad substitution ${$1}

2023-06-20 Thread Namhyung Kim
Hello, On Tue, Jun 13, 2023 at 1:06 PM Arnaldo Carvalho de Melo wrote: > > Em Tue, Jun 13, 2023 at 10:11:43PM +0530, Athira Rajeev escreveu: > > From: Aditya Gupta > > > > ${$1} gives bad substitution error on sh, bash, and zsh. This seems like > > a typo, and this patch modifies it to $1, since

Re: [PATCH mm-unstable v2 07/10] kvm/powerpc: add kvm_arch_test_clear_young()

2023-06-20 Thread Nicholas Piggin
On Wed Jun 21, 2023 at 10:38 AM AEST, Yu Zhao wrote: > On Tue, Jun 20, 2023 at 1:48 AM Nicholas Piggin wrote: > > > > On Sat May 27, 2023 at 9:44 AM AEST, Yu Zhao wrote: > > > Implement kvm_arch_test_clear_young() to support the fast path in > > > mmu_notifier_ops->test_clear_young(). > > > > > >

Re: [6.4.0-rc7-next-20230620] Boot failure on IBM Power LPAR

2023-06-20 Thread Michael Ellerman
Sachin Sant writes: > 6.4.0-rc7-next-20230620 fails to boot on IBM Power LPAR with following > > [ 5.548368] BUG: Unable to handle kernel data access at 0x95bdcf954bc34e73 > [ 5.548380] Faulting instruction address: 0xc0548090 > [ 5.548384] Oops: Kernel access of bad ar

Re: [PATCH 02/16] powerpc/book3s64/mm: mmu_vmemmap_psize is used by radix

2023-06-20 Thread Michael Ellerman
"Aneesh Kumar K.V" writes: > This should not be within CONFIG_PPC_64S_HASHS_MMU. We use mmu_vmemmap_psize > on radix while mapping the vmemmap area. > > Signed-off-by: Aneesh Kumar K.V > --- > arch/powerpc/mm/book3s64/radix_pgtable.c | 2 -- > 1 file changed, 2 deletions(-) This breaks microwat

[PATCH] powerpc/ftrace: Create a dummy stackframe to fix stack unwind

2023-06-20 Thread Naveen N Rao
With ppc64 -mprofile-kernel and ppc32 -pg, profiling instructions to call into ftrace are emitted right at function entry. The instruction sequence used is minimal to reduce overhead. Crucially, a stackframe is not created for the function being traced. This breaks stack unwinding since the functio

Re: [PATCH 02/16] powerpc/book3s64/mm: mmu_vmemmap_psize is used by radix

2023-06-20 Thread Aneesh Kumar K V
On 6/21/23 9:38 AM, Michael Ellerman wrote: > "Aneesh Kumar K.V" writes: >> This should not be within CONFIG_PPC_64S_HASHS_MMU. We use mmu_vmemmap_psize >> on radix while mapping the vmemmap area. >> >> Signed-off-by: Aneesh Kumar K.V >> --- >> arch/powerpc/mm/book3s64/radix_pgtable.c | 2 -- >>