[PATCH v2 0/8] mm/damon: remove DAMON debugfs interface

2025-01-06 Thread SeongJae Park
DAMON debugfs interface was the only user interface of DAMON at the beginning[1]. However, it turned out the interface would not be good enough for long-term flexibility and stability. In Feb 2022[2], we therefore introduced DAMON sysfs interface as an alternative user interface that aims long-te

Re: [PATCH v8 01/21] cxl/mbox: Flag support for Dynamic Capacity Devices (DCD)

2025-01-06 Thread Ira Weiny
Dan Williams wrote: > Ira Weiny wrote: > > Per the CXL 3.1 specification software must check the Command Effects > > Log (CEL) for dynamic capacity command support. > > > > Detect support for the DCD commands while reading the CEL, including: > > > > Get DC Config > > Get DC Extent List >

Re: [PATCH v4 14/14] iommu/arm-smmu-v3: Report events that belong to devices attached to vIOMMU

2025-01-06 Thread Nicolin Chen
On Mon, Jan 06, 2025 at 11:01:32AM +0800, Baolu Lu wrote: > On 1/4/25 03:43, Nicolin Chen wrote: > > diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h > > index 0a08aa82e7cc..55e3d5a14cca 100644 > > --- a/include/uapi/linux/iommufd.h > > +++ b/include/uapi/linux/iommufd.h > >

[PATCH v5 02/25] fs/dax: Return unmapped busy pages from dax_layout_busy_page_range()

2025-01-06 Thread Alistair Popple
dax_layout_busy_page_range() is used by file systems to scan the DAX page-cache to unmap mapping pages from user-space and to determine if any pages in the given range are busy, either due to ongoing DMA or other get_user_pages() usage. Currently it checks to see the file mapping is mapped into us

[PATCH v5 05/25] fs/dax: Create a common implementation to break DAX layouts

2025-01-06 Thread Alistair Popple
Prior to freeing a block, file systems supporting FS DAX must check that the associated pages are both unmapped from user-space and not undergoing DMA or other access from e.g. get_user_pages(). This is achieved by unmapping the file range and scanning the FS DAX page-cache to see if any pages within

[PATCH v5 01/25] fuse: Fix dax truncate/punch_hole fault path

2025-01-06 Thread Alistair Popple
FS DAX requires file systems to call into the DAX layout prior to unlinking inodes to ensure there is no ongoing DMA or other remote access to the direct mapped page. The fuse file system implements fuse_dax_break_layouts() to do this which includes a comment indicating that passing dmap_end == 0 l

[PATCH v5 16/25] huge_memory: Add vmf_insert_folio_pmd()

2025-01-06 Thread Alistair Popple
Currently DAX folio/page reference counts are managed differently to normal pages. To allow these to be managed the same as normal pages introduce vmf_insert_folio_pmd. This will map the entire PMD-sized folio and take references as it would for a normally mapped page. This is distinct from the cu

[PATCH v5 18/25] mm/gup: Don't allow FOLL_LONGTERM pinning of FS DAX pages

2025-01-06 Thread Alistair Popple
Longterm pinning of FS DAX pages should already be disallowed by various pXX_devmap checks. However a future change will cause these checks to be invalid for FS DAX pages so make folio_is_longterm_pinnable() return false for FS DAX pages. Signed-off-by: Alistair Popple Reviewed-by: John Hubbard

[PATCH v5 17/25] memremap: Add is_devdax_page() and is_fsdax_page() helpers

2025-01-06 Thread Alistair Popple
Add helpers to determine if a page or folio is a devdax or fsdax page or folio. Signed-off-by: Alistair Popple Acked-by: David Hildenbrand --- Changes for v5: - Renamed is_device_dax_page() to is_devdax_page() for consistency. --- include/linux/memremap.h | 22 ++ 1 file

[PATCH v5 23/25] mm: Remove pXX_devmap callers

2025-01-06 Thread Alistair Popple
The devmap PTE special bit was used to detect mappings of FS DAX pages. This tracking was required to ensure the generic mm did not manipulate the page reference counts as FS DAX implemented its own reference counting scheme. Now that FS DAX pages have their references counted the same way as nor

[PATCH v5 24/25] mm: Remove devmap related functions and page table bits

2025-01-06 Thread Alistair Popple
Now that DAX and all other reference counts to ZONE_DEVICE pages are managed normally there is no need for the special devmap PTE/PMD/PUD page table bits. So drop all references to these, freeing up a software defined page table bit on architectures supporting it. Signed-off-by: Alistair Popple A

[PATCH v5 25/25] Revert "riscv: mm: Add support for ZONE_DEVICE"

2025-01-06 Thread Alistair Popple
DEVMAP PTEs are no longer required to support ZONE_DEVICE so remove them. Signed-off-by: Alistair Popple Suggested-by: Chunyan Zhang Reviewed-by: Björn Töpel --- arch/riscv/Kconfig| 1 - arch/riscv/include/asm/pgtable-64.h | 20 arch/riscv/include/as

[PATCH v5 03/25] fs/dax: Don't skip locked entries when scanning entries

2025-01-06 Thread Alistair Popple
Several functions internal to FS DAX use the following pattern when trying to obtain an unlocked entry: xas_for_each(&xas, entry, end_idx) { if (dax_is_locked(entry)) entry = get_unlocked_entry(&xas, 0); This is problematic because get_unlocked_entry() will get the next pr

[PATCH v5 04/25] fs/dax: Refactor wait for dax idle page

2025-01-06 Thread Alistair Popple
A FS DAX page is considered idle when its refcount drops to one. This is currently open-coded in all file systems supporting FS DAX. Move the idle detection to a common function to make future changes easier. Signed-off-by: Alistair Popple Reviewed-by: Jan Kara Reviewed-by: Christoph Hellwig Re

[PATCH v5 15/25] huge_memory: Add vmf_insert_folio_pud()

2025-01-06 Thread Alistair Popple
Currently DAX folio/page reference counts are managed differently to normal pages. To allow these to be managed the same as normal pages introduce vmf_insert_folio_pud. This will map the entire PUD-sized folio and take references as it would for a normally mapped page. This is distinct from the cu

[PATCH v5 00/25] fs/dax: Fix ZONE_DEVICE page reference counts

2025-01-06 Thread Alistair Popple
Main updates since v4: - Removed most of the devdax/fsdax checks in fs/proc/task_mmu.c. This means smaps/pagemap may contain DAX pages. - Fixed rmap accounting of PUD mapped pages. - Minor code clean-ups. Main updates since v3: - Rebased onto next-20241216. The rebase wasn't too difficu

[PATCH v5 06/25] fs/dax: Always remove DAX page-cache entries when breaking layouts

2025-01-06 Thread Alistair Popple
Prior to any truncation operations file systems call dax_break_mapping() to ensure pages in the range are not undergoing DMA. Later DAX page-cache entries will be removed by truncate_folio_batch_exceptionals() in the generic page-cache code. However this makes it possible for folios to be removed

[PATCH v5 07/25] fs/dax: Ensure all pages are idle prior to filesystem unmount

2025-01-06 Thread Alistair Popple
File systems call dax_break_mapping() prior to reallocating file system blocks to ensure the page is not undergoing any DMA or other accesses. Generally this is needed when a file is truncated to ensure that if a block is reallocated nothing is writing to it. However filesystems currently don't cal

[PATCH v5 11/25] mm: Allow compound zone device pages

2025-01-06 Thread Alistair Popple
Zone device pages are used to represent various types of device memory managed by device drivers. Currently compound zone device pages are not supported. This is because MEMORY_DEVICE_FS_DAX pages are the only user of higher order zone device pages and have their own page reference counting. A futu

[PATCH v5 12/25] mm/memory: Enhance insert_page_into_pte_locked() to create writable mappings

2025-01-06 Thread Alistair Popple
In preparation for using insert_page() for DAX, enhance insert_page_into_pte_locked() to handle establishing writable mappings. Recall that DAX returns VM_FAULT_NOPAGE after installing a PTE which bypasses the typical set_pte_range() in finish_fault. Signed-off-by: Alistair Popple Suggested-by:

[PATCH v5 13/25] mm/memory: Add vmf_insert_page_mkwrite()

2025-01-06 Thread Alistair Popple
Currently to map a DAX page the DAX driver calls vmf_insert_pfn. This creates a special devmap PTE entry for the pfn but does not take a reference on the underlying struct page for the mapping. This is because DAX page refcounts are treated specially, as indicated by the presence of a devmap entry.

[PATCH v5 08/25] fs/dax: Remove PAGE_MAPPING_DAX_SHARED mapping flag

2025-01-06 Thread Alistair Popple
PAGE_MAPPING_DAX_SHARED is the same as PAGE_MAPPING_ANON. This isn't currently a problem because FS DAX pages are treated specially. However a future change will make FS DAX pages more like normal pages, so folio_test_anon() must not return true for a FS DAX page. We could explicitly test for a FS

[PATCH v5 10/25] mm/mm_init: Move p2pdma page refcount initialisation to p2pdma

2025-01-06 Thread Alistair Popple
Currently ZONE_DEVICE page reference counts are initialised by core memory management code in __init_zone_device_page() as part of the memremap() call which driver modules make to obtain ZONE_DEVICE pages. This initialises page refcounts to 1 before returning them to the driver. This was presumabl

[PATCH v5 09/25] mm/gup: Remove redundant check for PCI P2PDMA page

2025-01-06 Thread Alistair Popple
PCI P2PDMA pages are not mapped with pXX_devmap PTEs, therefore the check in __gup_device_huge() is redundant. Remove it. Signed-off-by: Alistair Popple Reviewed-by: Jason Gunthorpe Reviewed-by: Dan Williams Acked-by: David Hildenbrand --- mm/gup.c | 5 - 1 file changed, 5 deletions(-) diff

[PATCH v5 22/25] device/dax: Properly refcount device dax pages when mapping

2025-01-06 Thread Alistair Popple
Device DAX pages are currently not reference counted when mapped, instead relying on the devmap PTE bit to ensure mapping code will not get/put references. This requires special handling in various page table walkers, particularly GUP, to manage references on the underlying pgmap to ensure the page

[PATCH v5 21/25] fs/dax: Properly refcount fs dax pages

2025-01-06 Thread Alistair Popple
Currently fs dax pages are considered free when the refcount drops to one and their refcounts are not increased when mapped via PTEs or decreased when unmapped. This requires special logic in mm paths to detect that these pages should not be properly refcounted, and to detect when the refcount drop

[PATCH v5 19/25] proc/task_mmu: Mark devdax and fsdax pages as always unpinned

2025-01-06 Thread Alistair Popple
The procfs mmu files such as smaps and pagemap currently ignore devdax and fsdax pages because these pages are considered special. A future change will start treating these as normal pages, meaning they can be exposed via smaps and pagemap. The only difference is that devdax and fsdax pages can ne

[PATCH v5 20/25] mm/mlock: Skip ZONE_DEVICE PMDs during mlock

2025-01-06 Thread Alistair Popple
At present mlock skips ptes mapping ZONE_DEVICE pages. A future change to remove pmd_devmap will allow pmd_trans_huge_lock() to return ZONE_DEVICE folios so make sure we continue to skip those. Signed-off-by: Alistair Popple Acked-by: David Hildenbrand --- mm/mlock.c | 2 ++ 1 file changed, 2 i

[PATCH v5 14/25] rmap: Add support for PUD sized mappings to rmap

2025-01-06 Thread Alistair Popple
The rmap doesn't currently support adding a PUD mapping of a folio. This patch adds support for entire PUD mappings of folios, primarily to allow for more standard refcounting of device DAX folios. Currently DAX is the only user of this and it doesn't require support for partially mapped PUD-sized

Re: [PATCH v4 14/14] iommu/arm-smmu-v3: Report events that belong to devices attached to vIOMMU

2025-01-06 Thread Nicolin Chen
On Mon, Jan 06, 2025 at 10:46:21AM -0800, Nicolin Chen wrote: > On Mon, Jan 06, 2025 at 11:01:32AM +0800, Baolu Lu wrote: > > Nit: I think it would be more readable to add a check in the vevent > > reporting helper. > > > > diff --git a/drivers/iommu/iommufd/driver.c b/drivers/iommu/iommufd/driver

Re: [PATCH RFC 2/2] docs: process: submitting-patches: clarify imperative mood suggestion

2025-01-06 Thread Ahmad Fatoum
Hello Jon, On 06.01.25 15:57, Jonathan Corbet wrote: > Ahmad Fatoum writes: > >> Hello Jon, >> >> On 30.12.24 19:40, Jonathan Corbet wrote: >>> Ahmad Fatoum writes: >>> While we expect commit message titles to use the imperative mood, it's ok for commit message bodies to first include

[PATCH bpf-next v4 4/4] igc: Add launch time support to XDP ZC

2025-01-06 Thread Song Yoong Siang
Enable Launch Time Control (LTC) support to XDP zero copy via XDP Tx metadata framework. This patch is tested with tools/testing/selftests/bpf/xdp_hw_metadata on Intel Tiger Lake platform. Below are the test steps and result. Test Steps: 1. Add mqprio qdisc: $ sudo tc qdisc add dev enp2s0 hand

[PATCH bpf-next v4 2/4] selftests/bpf: Add Launch Time request to xdp_hw_metadata

2025-01-06 Thread Song Yoong Siang
Add Launch Time hw offload request to xdp_hw_metadata. Users can configure the delta between launch time and HW RX-time with the "-l" argument. The default delta is 100,000,000 nanoseconds. Signed-off-by: Song Yoong Siang --- tools/testing/selftests/bpf/xdp_hw_metadata.c | 30 +-- 1 file
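
The relationship described in the excerpt above (a launch time derived from the hardware RX timestamp plus a configurable delta) can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name launch_time_from_rx() and the macro DEFAULT_LAUNCH_DELTA_NS are hypothetical and are not taken from the patch; only the 100,000,000 ns default comes from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Default delta quoted in the patch description: 100,000,000 ns (100 ms). */
#define DEFAULT_LAUNCH_DELTA_NS UINT64_C(100000000)

/*
 * Hypothetical helper (not part of the patch): compute a launch time by
 * adding a delta to a hardware RX timestamp. Both values are in
 * nanoseconds. Returns false if the addition would overflow 64 bits,
 * in which case *launch_ns is left untouched.
 */
static bool launch_time_from_rx(uint64_t hw_rx_ns, uint64_t delta_ns,
                                uint64_t *launch_ns)
{
    assert(launch_ns != NULL);

    if (hw_rx_ns > UINT64_MAX - delta_ns)
        return false;

    *launch_ns = hw_rx_ns + delta_ns;
    return true;
}
```

A caller would pass the RX timestamp it received and either DEFAULT_LAUNCH_DELTA_NS or a user-supplied delta, and would skip the launch time request when the helper returns false.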

[PATCH bpf-next v4 3/4] net: stmmac: Add launch time support to XDP ZC

2025-01-06 Thread Song Yoong Siang
Enable launch time (Time-Based Scheduling) support to XDP zero copy via XDP Tx metadata framework. This patch is tested with tools/testing/selftests/bpf/xdp_hw_metadata on Intel Tiger Lake platform. Below are the test steps and result. Test Steps: 1. Add mqprio qdisc: $ sudo tc qdisc add dev e

[PATCH bpf-next v4 0/4] xsk: TX metadata Launch Time support

2025-01-06 Thread Song Yoong Siang
This series expands the XDP TX metadata framework to allow user applications to pass per packet 64-bit launch time directly to the kernel driver, requesting launch time hardware offload support. The XDP TX metadata framework will not perform any clock conversion or packet reordering. Please note t

Re: [PATCH RFC 2/2] docs: process: submitting-patches: clarify imperative mood suggestion

2025-01-06 Thread Ahmad Fatoum
Hello Jon, On 30.12.24 19:40, Jonathan Corbet wrote: > Ahmad Fatoum writes: > >> While we expect commit message titles to use the imperative mood, >> it's ok for commit message bodies to first include a blurb describing >> the background of the patch, before delving into what's being done >> to

Re: [PATCH v3 06/14] iommufd: Add IOMMUFD_OBJ_VIRQ and IOMMUFD_CMD_VIRQ_ALLOC

2025-01-06 Thread Jason Gunthorpe
On Thu, Jan 02, 2025 at 07:30:21PM -0800, Nicolin Chen wrote: > On Thu, Jan 02, 2025 at 04:52:46PM -0400, Jason Gunthorpe wrote: > > On Tue, Dec 17, 2024 at 09:00:19PM -0800, Nicolin Chen wrote: > > > +/* An iommufd_virq_header packs a vIOMMU interrupt in an iommufd_virq > > > queue */ > > > +stru

[PATCH bpf-next v4 1/4] xsk: Add launch time hardware offload support to XDP Tx metadata

2025-01-06 Thread Song Yoong Siang
Extend the XDP Tx metadata framework so that users can request launch time hardware offload, where the Ethernet device will schedule the packet for transmission at a pre-determined time called launch time. The value of launch time is communicated from user space to Ethernet driver via launch_time f

Re: [PATCH RFC 2/2] docs: process: submitting-patches: clarify imperative mood suggestion

2025-01-06 Thread Jonathan Corbet
Ahmad Fatoum writes: > Hello Jon, > > On 30.12.24 19:40, Jonathan Corbet wrote: >> Ahmad Fatoum writes: >> >>> While we expect commit message titles to use the imperative mood, >>> it's ok for commit message bodies to first include a blurb describing >>> the background of the patch, before delv

Re: [PATCH v4 14/14] iommu/arm-smmu-v3: Report events that belong to devices attached to vIOMMU

2025-01-06 Thread Baolu Lu
On 1/7/25 14:00, Nicolin Chen wrote: On Tue, Jan 07, 2025 at 01:54:00PM +0800, Baolu Lu wrote: On 1/7/25 12:36, Nicolin Chen wrote: +static bool arm_vsmmu_supports_veventq(unsigned int type) +{ + return type == IOMMU_VIOMMU_TYPE_ARM_SMMUV3; Do you need to check the hardware capabilities

Re: [PATCH v4 14/14] iommu/arm-smmu-v3: Report events that belong to devices attached to vIOMMU

2025-01-06 Thread Baolu Lu
On 1/7/25 12:36, Nicolin Chen wrote: On Mon, Jan 06, 2025 at 10:46:21AM -0800, Nicolin Chen wrote: On Mon, Jan 06, 2025 at 11:01:32AM +0800, Baolu Lu wrote: Nit: I think it would be more readable to add a check in the vevent reporting helper. diff --git a/drivers/iommu/iommufd/driver.c b/drive

Re: [PATCH v4 14/14] iommu/arm-smmu-v3: Report events that belong to devices attached to vIOMMU

2025-01-06 Thread Nicolin Chen
On Mon, Jan 06, 2025 at 08:37:04PM -0800, Nicolin Chen wrote: > I added something like this. Will send a v5. > > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c > b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c > index 0c7a5894ba07..348179f3cf2a 100644 > --- a/drivers/iommu

Re: [PATCH v4 14/14] iommu/arm-smmu-v3: Report events that belong to devices attached to vIOMMU

2025-01-06 Thread Nicolin Chen
On Tue, Jan 07, 2025 at 01:54:00PM +0800, Baolu Lu wrote: > On 1/7/25 12:36, Nicolin Chen wrote: > > +static bool arm_vsmmu_supports_veventq(unsigned int type) > > +{ > > + return type == IOMMU_VIOMMU_TYPE_ARM_SMMUV3; > > Do you need to check the hardware capabilities before reporting this? I >