Hi Christophe,
The latest series is
https://lore.kernel.org/linuxppc-dev/20231017022806.4523-1-pi...@redhat.com/
And Michael has his implement on:
https://lore.kernel.org/all/20231229120107.2281153-3-...@ellerman.id.au/T/#m46128446bce1095631162a1927415733a3bf0633
Thanks,
Pingfan
From: Baoquan He Sent: Monday, January 29, 2024 7:00 PM
>
> Michael pointed out that the CONFIG_CRASH_DUMP ifdef is nested inside
> the CONFIG_KEXEC_CORE ifdef scope in some Xen and Hyper-V code.
>
> Although the nesting works well too since CONFIG_CRASH_DUMP has a
> dependency on CONFIG_KEXEC_CORE, it
On 01/30/24 at 01:39am, Michael Kelley wrote:
> From: Baoquan He
> >
> > On 01/29/24 at 06:27pm, Michael Kelley wrote:
> > > From: Baoquan He Sent: Monday, January 29, 2024 5:51 AM
> > > >
> > > > Michael pointed out that the #ifdef CONFIG_CRASH_DUMP is nested inside
> > > > arch/x86/xen/enl
Michael pointed out that the CONFIG_CRASH_DUMP ifdef is nested inside
the CONFIG_KEXEC_CORE ifdef scope in some Xen and Hyper-V code.
Although the nesting works well too since CONFIG_CRASH_DUMP has a
dependency on CONFIG_KEXEC_CORE, it may cause confusion because there
are places where it's not nested, an
From: Baoquan He
>
> On 01/29/24 at 06:27pm, Michael Kelley wrote:
> > From: Baoquan He Sent: Monday, January 29, 2024 5:51 AM
> > >
> > > Michael pointed out that the #ifdef CONFIG_CRASH_DUMP is nested inside
> > > arch/x86/xen/enlighten_hvm.c.
> >
> > Did some words get left out in the above
On 01/29/24 at 06:27pm, Michael Kelley wrote:
> From: Baoquan He Sent: Monday, January 29, 2024 5:51 AM
> >
> > Michael pointed out that the #ifdef CONFIG_CRASH_DUMP is nested inside
> > arch/x86/xen/enlighten_hvm.c.
>
> Did some words get left out in the above sentence? It mentions the Xen
> c
On Mon, Jan 29, 2024 at 2:47 PM Tong Tiangen wrote:
>
> Currently, many scenarios that can tolerate memory errors when copying a page
> have been supported in the kernel[1][2][3], all of which are implemented by
> copy_mc_[user]_highpage(). arm64 should also support this mechanism.
>
> Due to mte, a
On Thu, Dec 21, 2023 at 10:02:46AM +0100, Christophe Leroy wrote:
> Declaring rodata_enabled and mark_rodata_ro() at all times
> helps remove related #ifdefery in C files.
>
> Signed-off-by: Christophe Leroy
Very nice cleanup, thanks! Applied and pushed.
Luis
From: Baoquan He Sent: Monday, January 29, 2024 5:51 AM
>
> Michael pointed out that the #ifdef CONFIG_CRASH_DUMP is nested inside
> arch/x86/xen/enlighten_hvm.c.
Did some words get left out in the above sentence? It mentions the Xen
case, but not the Hyper-V case. I'm not sure what you intend
On Mon, Jan 29, 2024 at 09:46:49PM +0800, Tong Tiangen wrote:
> If a user process's memory access fails due to a hardware memory error, only
> the relevant processes are affected, so it is more reasonable to kill the
> user process and isolate the corrupt page than to panic the kernel.
>
> Signed-off-by
On Mon, Jan 29, 2024 at 09:46:48PM +0800, Tong Tiangen wrote:
> For the arm64 kernel, when it processes hardware memory errors for
> synchronous notifications (do_sea()), if the error is consumed within the
> kernel, the current handling is to panic. However, it is not optimal.
>
> Take uaccess for
Similar to how we optimized fork(), let's implement PTE batching when
consecutive (present) PTEs map consecutive pages of the same large
folio.
Most infrastructure we need for batching (mmu gather, rmap) is already
there. We only have to add get_and_clear_full_ptes() and
clear_full_ptes(). Similar
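A minimal sketch of what such a batched variant can look like, assuming the
generic ptep_get_and_clear_full() primitive (simplified; the series' actual
code also handles arch overrides):

/*
 * Sketch: clear nr consecutive present PTEs mapping the same folio,
 * collecting the accessed/dirty bits into the returned PTE.
 */
static inline pte_t get_and_clear_full_ptes(struct mm_struct *mm,
		unsigned long addr, pte_t *ptep, unsigned int nr, int full)
{
	pte_t pte, tmp_pte;

	pte = ptep_get_and_clear_full(mm, addr, ptep, full);
	while (--nr) {
		ptep++;
		addr += PAGE_SIZE;
		tmp_pte = ptep_get_and_clear_full(mm, addr, ptep, full);
		if (pte_dirty(tmp_pte))
			pte = pte_mkdirty(pte);
		if (pte_young(tmp_pte))
			pte = pte_mkyoung(pte);
	}
	return pte;
}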
Let's add a helper that lets us batch-process multiple consecutive PTEs.
Note that the loop will get optimized out on all architectures except on
powerpc. We have to add an early define of __tlb_remove_tlb_entry() on
ppc to make the compiler happy (and avoid making tlb_remove_tlb_entries() a
macro
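A rough sketch of such a batched TLB helper, assuming the existing
tlb_flush_pte_range() and __tlb_remove_tlb_entry() primitives (simplified):

static inline void tlb_remove_tlb_entries(struct mmu_gather *tlb,
		pte_t *ptep, unsigned int nr, unsigned long address)
{
	/* Record the flush range once, then mark each entry as removed. */
	tlb_flush_pte_range(tlb, address, PAGE_SIZE * nr);
	for (;;) {
		__tlb_remove_tlb_entry(tlb, ptep, address);
		if (--nr == 0)
			break;
		ptep++;
		address += PAGE_SIZE;
	}
}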
Add __tlb_remove_folio_pages(), which will remove multiple consecutive
pages that belong to the same large folio, instead of only a single
page. We'll be using this function when optimizing unmapping/zapping of
large folios that are mapped by PTEs.
We're using the remaining spare bit in an encoded
Nowadays, encoded pages are only used in mmu_gather handling. Let's
update the documentation, and define ENCODED_PAGE_BIT_DELAY_RMAP. While at
it, rename ENCODE_PAGE_BITS to ENCODED_PAGE_BITS.
If encoded page pointers would ever be used in other context again, we'd
likely want to change the define
We have two bits available in the encoded page pointer to store
additional information. Currently, we use one bit to request delay of the
rmap removal until after a TLB flush.
We want to make use of the remaining bit internally for batching of
multiple pages of the same folio, specifying that the
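The encoding itself is plain pointer tagging in the low bits of the page
pointer, along these lines (simplified sketch based on the mm_types.h
helpers):

#define ENCODED_PAGE_BITS		3ul
/* Perform rmap removal after we have flushed the TLB. */
#define ENCODED_PAGE_BIT_DELAY_RMAP	1ul

static __always_inline struct encoded_page *encode_page(struct page *page,
		unsigned long flags)
{
	BUILD_BUG_ON(flags > ENCODED_PAGE_BITS);
	return (struct encoded_page *)(flags | (unsigned long)page);
}

static inline unsigned long encoded_page_flags(struct encoded_page *page)
{
	return ENCODED_PAGE_BITS & (unsigned long)page;
}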
Let's prepare for further changes by factoring it out into a separate
function.
Signed-off-by: David Hildenbrand
---
mm/memory.c | 53 ++++++++++++++++++++++++++++++++---------------------
1 file changed, 32 insertions(+), 21 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index 20bc13ab8db
We don't need up-to-date accessed-dirty information for anon folios and can
simply work with the ptent we already have. Also, we know the RSS counter
we want to update.
We can safely move arch_check_zapped_pte() + tlb_remove_tlb_entry() +
zap_install_uffd_wp_if_needed() after updating the folio an
We don't need up-to-date accessed/dirty bits, so in theory we could
replace ptep_get_and_clear_full() by an optimized ptep_clear_full()
function. Let's rely on the provided pte.
Further, there is no scenario where we would have to insert uffd-wp
markers when zapping something that is not a normal pa
Let's prepare for further changes by factoring out processing of present
PTEs.
Signed-off-by: David Hildenbrand
---
mm/memory.c | 92 ++++++++++++++++++++++++++++++++++++++++++++++++++++----------------------------------------
1 file changed, 52 insertions(+), 40 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index b05fd28dbce1
This series is based on [1] and must be applied on top of it.
Similar to what we did with fork(), let's implement PTE batching
during unmap/zap when processing PTE-mapped THPs.
We collect consecutive PTEs that map consecutive pages of the same large
folio, making sure that the other PTE bits are c
The copy_mc_to_kernel() helper is a memory copy implementation that handles
source exceptions. It can be used in memory copy scenarios that tolerate
hardware memory errors (e.g., pmem_read/dax_copy_to_iter).
Currently, only x86 and ppc support this helper; after arm64 supports
machine check safe frame
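For illustration, a pmem-style caller uses the bytes-not-copied return value
roughly like this (hypothetical sketch; dst/src/len stand in for the
driver's buffers):

unsigned long rem;

rem = copy_mc_to_kernel(dst, src, len);	/* returns bytes not copied */
if (rem)
	return BLK_STS_IOERR;	/* a hardware memory error was consumed */
return BLK_STS_OK;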
With the increase of memory capacity and density, the probability of memory
errors also increases. The increasing size and density of server RAM in data
centers and clouds has led to increased uncorrectable memory errors.
Currently, more and more scenarios that can tolerate memory errors, such as
For the arm64 kernel, when it processes hardware memory errors for
synchronous notifications (do_sea()), if the error is consumed within the
kernel, the current handling is to panic. However, it is not optimal.
Take uaccess for example, if the uaccess operation fails due to memory
error, only the u
If hardware errors are encountered during page copying, returning the bytes
not copied is not meaningful, and the caller cannot do any processing on
the remaining data. Returning -EFAULT is more reasonable, which represents
a hardware error encountered during the copying.
Signed-off-by: Tong Tiang
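Under the new convention, a caller sketch looks like this (hypothetical;
dst/src/addr/vma stand in for the caller's context, and the error value now
directly signals poison rather than a partial copy):

if (copy_mc_user_highpage(dst, src, addr, vma) == -EFAULT) {
	/* The source page has an uncorrectable error: isolate it */
	/* instead of retrying the copy. */
	return -EHWPOISON;
}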
If a user process's memory access fails due to a hardware memory error, only
the relevant processes are affected, so it is more reasonable to kill the
user process and isolate the corrupt page than to panic the kernel.
Signed-off-by: Tong Tiangen
---
arch/arm64/lib/copy_from_user.S | 10 +-
ar
Currently, many scenarios that can tolerate memory errors when copying a page
have been supported in the kernel[1][2][3], all of which are implemented by
copy_mc_[user]_highpage(). arm64 should also support this mechanism.
Due to mte, arm64 needs to have its own copy_mc_[user]_highpage()
architectur
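For contrast, the non-MTE generic fallback is essentially a kmap +
copy_mc_to_kernel() wrapper, roughly like the sketch below; arm64
additionally has to copy the MTE tags, which a generic version cannot do:

static inline int copy_mc_highpage(struct page *to, struct page *from)
{
	unsigned long ret;
	char *vfrom, *vto;

	vfrom = kmap_local_page(from);
	vto = kmap_local_page(to);
	ret = copy_mc_to_kernel(vto, vfrom, PAGE_SIZE);
	kunmap_local(vto);
	kunmap_local(vfrom);

	return ret ? -EFAULT : 0;
}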
x86 and powerpc have their own implementations of copy_mc_to_user(); add a
generic fallback in include/linux/uaccess.h to prepare for other
architectures to enable CONFIG_ARCH_HAS_COPY_MC.
Signed-off-by: Tong Tiangen
Acked-by: Michael Ellerman
---
arch/powerpc/include/asm/uaccess.h | 1 +
arch/x86/include/a
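The fallback can be as simple as a plain (unprotected) user copy, so that
common code can call copy_mc_to_user() unconditionally; a sketch along the
lines of the patch:

#ifndef copy_mc_to_user
static inline unsigned long __must_check
copy_mc_to_user(void __user *dst, const void *src, size_t cnt)
{
	check_object_size(src, cnt, true);
	return raw_copy_to_user(dst, src, cnt);
}
#endif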
Nathan reported the build error below:
=====================================
$ curl -LSso .config
https://git.alpinelinux.org/aports/plain/community/linux-edge/config-edge.armv7
$ make -skj"$(nproc)" ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- olddefconfig all
...
arm-linux-gnueabi-ld: arch/arm/kernel/machine_kexec.o: in function
Nathan reported some build errors on arm64, as below:
=====================================================
$ curl -LSso .config
https://github.com/archlinuxarm/PKGBUILDs/raw/master/core/linux-aarch64/config
$ make -skj"$(nproc)" ARCH=arm64 CROSS_COMPILE=aarch64-linux- olddefconfig all
...
aarch64-linux-ld: kernel/kexec_file.o: in funct
Michael pointed out that the #ifdef CONFIG_CRASH_DUMP is nested inside
arch/x86/xen/enlighten_hvm.c.
Although the nesting works well too since CONFIG_CRASH_DUMP has a
dependency on CONFIG_KEXEC_CORE, it may cause confusion because there
are places where it's not nested, and people may think it needs to be
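To illustrate with hypothetical declarations, the nested form:

#ifdef CONFIG_KEXEC_CORE
void kexec_stuff(void);
#ifdef CONFIG_CRASH_DUMP
void crash_stuff(void);
#endif
#endif

behaves the same as the flat form, precisely because CONFIG_CRASH_DUMP
cannot be set without CONFIG_KEXEC_CORE:

#ifdef CONFIG_KEXEC_CORE
void kexec_stuff(void);
#endif
#ifdef CONFIG_CRASH_DUMP
void crash_stuff(void);
#endif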
Commit 109303336a0c ("crypto: vmx - Move to arch/powerpc/crypto") moves the
crypto vmx files to arch/powerpc, but misses adjusting the file entries for
IBM Power VMX Cryptographic instructions and LINUX FOR POWERPC.
Hence, ./scripts/get_maintainer.pl --self-test=patterns complains about
broken ref
... and conditionally return to the caller if any PTE except the first one
is writable. fork() has to make sure to properly write-protect in case any
PTE is writable. Other users (e.g., page unmapping) are expected to not
care.
Reviewed-by: Ryan Roberts
Signed-off-by: David Hildenbrand
---
mm/me
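A simplified sketch of the fork()-side usage (based on the series'
copy_present_ptes(); variable setup omitted):

nr = folio_pte_batch(folio, addr, src_pte, pte, max_nr, flags,
		     &any_writable);
if (any_writable)
	pte = pte_mkwrite(pte, src_vma);
/* ... the existing logic then write-protects the whole batch if needed. */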
Let's always ignore the accessed/young bit: we'll always mark the PTE
as old in our child process during fork, and upcoming users will
similarly not care.
Ignore the dirty bit only if we don't want to duplicate the dirty bit
into the child process during fork. Maybe, we could just set all PTEs
in
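Concretely, the batch comparison can normalize the ignored bits before
comparing PTEs, e.g. (sketch using the series' naming; fpb_t and the
FPB_IGNORE_* flags come from the series):

static inline pte_t __pte_batch_clear_ignored(pte_t pte, fpb_t flags)
{
	if (flags & FPB_IGNORE_DIRTY)
		pte = pte_mkclean(pte);
	if (likely(flags & FPB_IGNORE_SOFT_DIRTY))
		pte = pte_clear_soft_dirty(pte);
	/* Always ignore the accessed/young bit. */
	return pte_mkold(pte);
}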
We already read it, let's just forward it.
This patch is based on work by Ryan Roberts.
Reviewed-by: Ryan Roberts
Signed-off-by: David Hildenbrand
---
mm/memory.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index a3bdb25f4c8d..41b24da5be
Let's prepare for further changes.
Reviewed-by: Ryan Roberts
Signed-off-by: David Hildenbrand
---
mm/memory.c | 63 +++++++++++++++++++++++++++++++++------------------------------
1 file changed, 33 insertions(+), 30 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index 8d14ba440929..a3bdb25f4c8d 10
Let's use our handy new helper. Note that the implementation is slightly
different, but shouldn't really make a difference in practice.
Reviewed-by: Christophe Leroy
Signed-off-by: David Hildenbrand
---
arch/powerpc/mm/pgtable.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --
Let's use our handy helper now that it's available on all archs.
Signed-off-by: David Hildenbrand
---
arch/arm/mm/mmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 674ed71573a8..c24e29c0b9a4 100644
--- a/arch/arm/mm/mmu.c
+++ b/
Let's provide pte_next_pfn(), independently of set_ptes(). This allows for
using the generic pte_next_pfn() version in some arch-specific set_ptes()
implementations, and prepares for reusing pte_next_pfn() in other contexts.
Reviewed-by: Christophe Leroy
Signed-off-by: David Hildenbrand
---
incl
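With PFN_PTE_SHIFT in place, the generic version can be as simple as
(sketch, per the series):

#ifndef pte_next_pfn
static inline pte_t pte_next_pfn(pte_t pte)
{
	return __pte(pte_val(pte) + (1UL << PFN_PTE_SHIFT));
}
#endif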
We want to make use of pte_next_pfn() outside of set_ptes(). Let's
simply define PFN_PTE_SHIFT, required by pte_next_pfn().
Signed-off-by: David Hildenbrand
---
arch/sparc/include/asm/pgtable_64.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/sparc/include/asm/pgtable_64.h
b/arch/s
We want to make use of pte_next_pfn() outside of set_ptes(). Let's
simply define PFN_PTE_SHIFT, required by pte_next_pfn().
Signed-off-by: David Hildenbrand
---
arch/s390/include/asm/pgtable.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/inclu
We want to make use of pte_next_pfn() outside of set_ptes(). Let's
simply define PFN_PTE_SHIFT, required by pte_next_pfn().
Reviewed-by: Alexandre Ghiti
Signed-off-by: David Hildenbrand
---
arch/riscv/include/asm/pgtable.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/riscv/include
We want to make use of pte_next_pfn() outside of set_ptes(). Let's
simply define PFN_PTE_SHIFT, required by pte_next_pfn().
Reviewed-by: Christophe Leroy
Signed-off-by: David Hildenbrand
---
arch/powerpc/include/asm/pgtable.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/powerpc/in
We want to make use of pte_next_pfn() outside of set_ptes(). Let's
simply define PFN_PTE_SHIFT, required by pte_next_pfn().
Signed-off-by: David Hildenbrand
---
arch/nios2/include/asm/pgtable.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/nios2/include/asm/pgtable.h b/arch/nios2/in
We want to make use of pte_next_pfn() outside of set_ptes(). Let's
simply define PFN_PTE_SHIFT, required by pte_next_pfn().
Signed-off-by: David Hildenbrand
---
arch/arm/include/asm/pgtable.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/
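On architectures that store the PFN starting right at the page-size
boundary of the PTE value, the define is a one-liner, e.g.:

/* PFN is encoded starting at this bit position in a PTE. */
#define PFN_PTE_SHIFT	PAGE_SHIFT

(The exact shift is architecture-specific; PAGE_SHIFT is only the common
case.)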
From: Ryan Roberts
Since the high bits [51:48] of an OA are not stored contiguously in the
PTE, there is a theoretical bug in set_ptes(), which just adds PAGE_SIZE
to the pte to get the pte with the next pfn. This works until the pfn
crosses the 48-bit boundary, at which point we overflow into th
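The fix is to advance the PFN via pte_pfn()/pfn_pte(), which correctly
(de)compose the split OA field; roughly:

#define pte_next_pfn pte_next_pfn
static inline pte_t pte_next_pfn(pte_t pte)
{
	/* pfn_pte() re-splits the PFN across the non-contiguous OA bits. */
	return pfn_pte(pte_pfn(pte) + 1, pte_pgprot(pte));
}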
Now that the rmap overhaul[1] is upstream that provides a clean interface
for rmap batching, let's implement PTE batching during fork when processing
PTE-mapped THPs.
This series is partially based on Ryan's previous work[2] to implement
cont-pte support on arm64, but it's a complete rewrite based
Add the Power11 PVR to the JSON mapfile to enable
JSON events. Power11 is PowerISA v3.1 compliant
and supports Power10 events.
Signed-off-by: Madhavan Srinivasan
---
tools/perf/pmu-events/arch/powerpc/mapfile.csv | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/perf/pmu-events/arch/powerp
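Each mapfile.csv row maps a PVR pattern to an event directory; the new
entry presumably points Power11 at the existing power10 events, along these
lines (the PVR pattern shown is illustrative, not taken from the patch):

# Format: PVR-regex,Version,JSON/file/pathname,Type
0x0082[[:xdigit:]]{4},1,power10,core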
On 25/01/2024 19:32, David Hildenbrand wrote:
> Let's always ignore the accessed/young bit: we'll always mark the PTE
> as old in our child process during fork, and upcoming users will
> similarly not care.
>
> Ignore the dirty bit only if we don't want to duplicate the dirty bit
> into the child
Hi Aneesh,
Thanks for looking into the patch. My comments are inline below.
On 2024/01/24 01:06 PM, Aneesh Kumar K.V wrote:
> Amit Machhiwal writes:
>
> > Currently, rebooting a pseries nested qemu-kvm guest (L2) results in
> > below error as L1 qemu sends PVR value 'arch_compat' == 0 via
> > p
On 25/01/2024 19:32, David Hildenbrand wrote:
> Let's implement PTE batching when consecutive (present) PTEs map
> consecutive pages of the same large folio, and all other PTE bits besides
> the PFNs are equal.
>
> We will optimize folio_pte_batch() separately, to ignore selected
> PTE bits. This
On Sun, Jan 28, 2024 at 08:58:54PM +0100, Alexander Gordeev wrote:
> There is no architecture-specific code or data left
> that the generic code needs to know about.
> Thus, avoid the inclusion of the header.
>
> Signed-off-by: Alexander Gordeev
> ---
> include/asm-generic/vtime.h | 1 -
> include/linux/vt
On Sun, Jan 28, 2024 at 08:58:53PM +0100, Alexander Gordeev wrote:
> update_timer_sys() and update_timer_mcck() are inlines used for
> CPU time accounting from the interrupt and machine-check handlers.
> These routines are specific to the s390 architecture, but declared
> via a header, which in turn inl
On Sun, Jan 28, 2024 at 08:58:52PM +0100, Alexander Gordeev wrote:
> __ARCH_HAS_VTIME_TASK_SWITCH macro is not used anymore.
>
> Signed-off-by: Alexander Gordeev
> ---
> arch/s390/include/asm/vtime.h | 2 --
> 1 file changed, 2 deletions(-)
Acked-by: Heiko Carstens
On 1/29/24 12:23 PM, Anshuman Khandual wrote:
>
>
> On 1/29/24 11:56, Aneesh Kumar K.V wrote:
>> On 1/29/24 11:52 AM, Anshuman Khandual wrote:
>>>
>>>
>>> On 1/29/24 11:30, Aneesh Kumar K.V (IBM) wrote:
Architectures like powerpc add debug checks to ensure we find only devmap
PUD pte en