This is the fourth revision of the series that fixes MF_DELAYED handling on memory failure. The series addresses an issue in the memory failure handling path where MF_DELAYED is incorrectly treated as an error. The issue was discovered while testing memory failure handling for guest_memfd.
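To make the issue above concrete, here is a minimal userspace sketch, not the kernel code. It contrasts a caller that treats every result other than full recovery as an error with one that accepts a delayed result as non-fatal. Only the MF_DELAYED name comes from this cover letter; the `sketch_` enum, both helper functions, and the choice of -EBUSY as the error value are hypothetical and exist only for illustration.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for the handler result codes. Only the MF_DELAYED
 * name is taken from the cover letter; everything else is an assumption.
 */
enum sketch_mf_result {
	SKETCH_MF_FAILED,
	SKETCH_MF_DELAYED,	/* handling only partially completed */
	SKETCH_MF_RECOVERED,
};

/* Problematic interpretation: anything short of full recovery is an error. */
int sketch_result_strict(enum sketch_mf_result res)
{
	return res == SKETCH_MF_RECOVERED ? 0 : -EBUSY;
}

/* Intended interpretation: a delayed result is not reported as a failure. */
int sketch_result_delayed_ok(enum sketch_mf_result res)
{
	switch (res) {
	case SKETCH_MF_RECOVERED:
	case SKETCH_MF_DELAYED:
		return 0;
	default:
		return -EBUSY;
	}
}
```

With the strict variant a delayed result surfaces as an error to the caller, while the second variant reports success for it and still fails on a genuine failure.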
The proposed solution involves:

1. Clarifying the definition of MF_DELAYED to mean that memory failure
   handling is only partially completed, and that the metadata for the
   memory that failed (as in struct page/folio) is still referenced.
2. Updating shmem’s handling to align with the clarified definition.
3. Updating how the result of .error_remove_folio() is interpreted.

Changes from v3:
+ Split an independent guest_memfd_memory_failure_test, as suggested by
  Ackerley and Sean
+ Align error logging style in truncate_error_folio, as suggested by
  Miaohe and David
+ Verify a clean shmem page can be read successfully after soft-offline
  memory failure, as suggested by Miaohe

Thanks!

+ RFC v3: https://lore.kernel.org/all/20260408-memory-failure-mf-delayed-fix-rfc-v3-v3-0-718f45eb7...@google.com/

Signed-off-by: Lisa Wang <[email protected]>
---
Lisa Wang (7):
      mm: memory_failure: Clarify the MF_DELAYED definition
      mm: memory_failure: Allow truncate_error_folio to return MF_DELAYED
      mm: shmem: Update shmem handler to the MF_DELAYED definition
      mm: memory_failure: Generalize extra_pins handling to all MF_DELAYED cases
      mm: selftests: Add shmem into memory failure test
      KVM: selftests: Add the guest_memfd memory failure test
      KVM: selftests: Test guest_memfd behavior with respect to stage 2 page tables

 mm/memory-failure.c                                |  29 +-
 mm/shmem.c                                         |   2 +-
 tools/testing/selftests/kvm/Makefile.kvm           |   2 +
 .../kvm/guest_memfd_memory_failure_test.c          | 402 +++++++++++++++++++++
 tools/testing/selftests/mm/memory-failure.c        | 111 +++++-
 5 files changed, 527 insertions(+), 19 deletions(-)
---
base-commit: 38741a8e3bc1b809d64f8c8885ab15c3e40700ff
change-id: 20260527-memory-failure-mf-delayed-fix-7d5a8f4a8a8b

Best regards,
-- 
Lisa Wang <[email protected]>

