> From: Nicolin Chen
> Sent: Wednesday, January 8, 2025 1:10 AM
> +
> +int iommufd_veventq_alloc(struct iommufd_ucmd *ucmd)
> +{
> + struct iommu_veventq_alloc *cmd = ucmd->cmd;
> + struct iommufd_veventq *veventq;
> + struct iommufd_viommu *viommu;
> + int fdno;
> + int rc;
>
> From: Nicolin Chen
> Sent: Wednesday, January 8, 2025 1:10 AM
> +
> + xa_lock(&viommu->vdevs);
> + xa_for_each(&viommu->vdevs, index, vdev) {
> + if (vdev && vdev->dev == dev) {
> + vdev_id = (unsigned long)vdev->id;
> + break;
> +
Main updates since v5:
- Reworked patch 1 based on Dan's feedback.
- Fixed build issues on PPC and when CONFIG_PGTABLE_HAS_HUGE_LEAVES
is not defined.
- Minor comment formatting and documentation fixes.
- Remove PTE_DEVMAP definitions from Loongarch which were added since
this series w
FS DAX requires file systems to call into the DAX layout prior to unlinking
inodes to ensure there is no ongoing DMA or other remote access to the
direct mapped page. The fuse file system implements
fuse_dax_break_layouts() to do this which includes a comment indicating
that passing dmap_end == 0 l
dax_layout_busy_page_range() is used by file systems to scan the DAX
page-cache to unmap mapping pages from user-space and to determine if
any pages in the given range are busy, either due to ongoing DMA or
other get_user_pages() usage.
Currently it checks to see the file mapping is mapped into us
Several functions internal to FS DAX use the following pattern when
trying to obtain an unlocked entry:
xas_for_each(&xas, entry, end_idx) {
if (dax_is_locked(entry))
entry = get_unlocked_entry(&xas, 0);
This is problematic because get_unlocked_entry() will get the next
pr
A FS DAX page is considered idle when its refcount drops to one. This
is currently open-coded in all file systems supporting FS DAX. Move
the idle detection to a common function to make future changes easier.
Signed-off-by: Alistair Popple
Reviewed-by: Jan Kara
Reviewed-by: Christoph Hellwig
Re
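The idle rule stated in this commit message can be illustrated with a minimal userspace sketch. The struct and function names below are hypothetical stand-ins, not the kernel's struct page or the helper the patch introduces; only the rule that a refcount of one means idle is taken from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page; only the reference count is modelled. */
struct sketch_page {
	int refcount;
};

/*
 * Assumed helper name, for illustration only. Per the commit message, an
 * FS DAX page is considered idle once its refcount has dropped to one.
 */
static bool sketch_dax_page_is_idle(const struct sketch_page *page)
{
	assert(page != NULL);
	return page->refcount == 1;
}
```

Keeping the comparison in a single function is the point of the change: a later series that alters the idle threshold would then touch one place rather than every file system.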
File systems call dax_break_mapping() prior to reallocating file
system blocks to ensure the page is not undergoing any DMA or other
accesses. Generally this is needed when a file is truncated to ensure
that if a block is reallocated nothing is writing to it. However
filesystems currently don't cal
Prior to any truncation operations file systems call
dax_break_mapping() to ensure pages in the range are not undergoing
DMA. Later DAX page-cache entries will be removed by
truncate_folio_batch_exceptionals() in the generic page-cache code.
However this makes it possible for folios to be removed
Prior to freeing a block file systems supporting FS DAX must check
that the associated pages are both unmapped from user-space and not
undergoing DMA or other access from e.g. get_user_pages(). This is
achieved by unmapping the file range and scanning the FS DAX
page-cache to see if any pages within
> From: Nicolin Chen
> Sent: Wednesday, January 8, 2025 1:10 AM
>
> With the introduction of the new objects, update the doc to reflect that.
>
> Reviewed-by: Lu Baolu
> Signed-off-by: Nicolin Chen
Reviewed-by: Kevin Tian
> From: Nicolin Chen
> Sent: Wednesday, January 8, 2025 1:10 AM
>
> +/*
> + * Typically called in driver's threaded IRQ handler.
> + * The @type and @event_data must be defined in include/uapi/linux/iommufd.h
> + */
> +int iommufd_viommu_report_event(struct iommufd_viommu *viommu,
> +
From: "Jiao, Joey"
The current design of KCOV risks frequent buffer overflows. To mitigate
this, new modes are introduced: KCOV_TRACE_UNIQ_PC, KCOV_TRACE_UNIQ_EDGE,
and KCOV_TRACE_UNIQ_CMP. These modes allow for the recording of unique
PCs, edges, and comparison operands (CMP).
Key changes inclu
> and fill full vnet header"
> > https://lore.kernel.org/r/20250109-tun-v2-0-388d7d5a2...@daynix.com
>
> As mentioned elsewhere: let's first handle that patch series and
> return to this series only when that is complete.
>
Alistair Popple wrote:
> Main updates since v5:
>
> - Reworked patch 1 based on Dan's feedback.
>
> - Fixed build issues on PPC and when CONFIG_PGTABLE_HAS_HUGE_LEAVES
> is not defined.
>
> - Minor comment formatting and documentation fixes.
>
> - Remove PTE_DEVMAP definitions from Loonga
Alistair Popple wrote:
> On Wed, Jan 08, 2025 at 04:14:20PM -0800, Dan Williams wrote:
> > Alistair Popple wrote:
> > > Prior to freeing a block file systems supporting FS DAX must check
> > > that the associated pages are both unmapped from user-space and not
> > > undergoing DMA or other access f
> From: Nicolin Chen
> Sent: Wednesday, January 8, 2025 1:10 AM
>
> The fault object was designed exclusively for hwpt's IO page faults (PRI).
> But its queue implementation can be reused for other purposes too, such as
> hardware IRQ and event injections to user space.
>
> Meanwhile, a fault ob
> From: Nicolin Chen
> Sent: Wednesday, January 8, 2025 1:10 AM
>
> Reorder the existing OBJ/IOCTL lists.
>
> Also run clang-format for the same coding style at line wrappings.
>
> No functional change.
>
> Reviewed-by: Lu Baolu
> Signed-off-by: Nicolin Chen
Reviewed-by: Kevin Tian
On 2025/01/09 21:46, Willem de Bruijn wrote:
Akihiko Odaki wrote:
On 2025/01/09 16:31, Michael S. Tsirkin wrote:
On Thu, Jan 09, 2025 at 03:58:44PM +0900, Akihiko Odaki wrote:
tun used to simply advance iov_iter when it needs to pad virtio header,
which leaves the garbage in the buffer as is.
On Thu, Jan 9, 2025 at 2:59 PM Akihiko Odaki wrote:
>
> The specification says the device MUST set num_buffers to 1 if
> VIRTIO_NET_F_MRG_RXBUF has not been negotiated.
Have we agreed on how to fix the spec or not?
As I replied in the spec patch, if we just remove this "MUST", it
looks like we a
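The requirement quoted above can be sketched in a few lines of C. This is only an illustration of the rule as the quoted text states it, not code from the patch; the helper name and parameters are hypothetical, and the only fact assumed from the virtio specification is that VIRTIO_NET_F_MRG_RXBUF is feature bit 15.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* VIRTIO_NET_F_MRG_RXBUF is feature bit 15 in the virtio-net specification. */
#define SKETCH_MRG_RXBUF_BIT 15

/*
 * Hypothetical helper: the value a device would place in num_buffers.
 * Without VIRTIO_NET_F_MRG_RXBUF the quoted rule says it must be 1;
 * with it, the count of merged buffers is reported.
 */
static uint16_t sketch_num_buffers(uint64_t negotiated_features,
				   uint16_t merged_count)
{
	bool mrg_rxbuf = negotiated_features & (1ULL << SKETCH_MRG_RXBUF_BIT);

	if (!mrg_rxbuf)
		return 1;

	/* A received packet always occupies at least one buffer. */
	assert(merged_count >= 1);
	return merged_count;
}
```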
On Thu, Jan 9, 2025 at 2:59 PM Akihiko Odaki wrote:
>
> tun used to simply advance iov_iter when it needs to pad virtio header,
> which leaves the garbage in the buffer as is. This is especially
> problematic when tun starts to allow enabling the hash reporting
> feature; even if the feature is en
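One way to avoid leaving garbage in the padded part of the header region is to zero it explicitly. The sketch below shows that idea on a flat buffer, assuming zero-fill is the desired behaviour. It does not use iov_iter, and the function name and parameters are hypothetical, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy a hdr_len-byte header into a region of
 * reserved_len bytes and zero the remaining padding, so that no stale
 * bytes are left behind after the header.
 */
static void sketch_fill_vnet_region(unsigned char *region, size_t reserved_len,
				    const void *hdr, size_t hdr_len)
{
	assert(hdr_len <= reserved_len);
	memcpy(region, hdr, hdr_len);
	memset(region + hdr_len, 0, reserved_len - hdr_len);
}
```

Simply skipping the padding, by contrast, would leave whatever was previously in the buffer visible to the reader.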
On Thu, Jan 9, 2025 at 2:59 PM Akihiko Odaki wrote:
>
> Both tun and tap expose the same set of virtio-net-related features.
> Unify their implementations to ease future changes.
>
> Signed-off-by: Akihiko Odaki
> ---
> MAINTAINERS| 1 +
> drivers/net/Kconfig| 5 ++
> driver
PCI P2PDMA pages are not mapped with pXX_devmap PTEs therefore the
check in __gup_device_huge() is redundant. Remove it.
Signed-off-by: Alistair Popple
Reviewed-by: Jason Gunthorpe
Reviewed-by: Dan Williams
Acked-by: David Hildenbrand
---
mm/gup.c | 5 -
1 file changed, 5 deletions(-)
diff
PAGE_MAPPING_DAX_SHARED is the same as PAGE_MAPPING_ANON. This isn't
currently a problem because FS DAX pages are treated
specially. However a future change will make FS DAX pages more like
normal pages, so folio_test_anon() must not return true for a FS DAX
page.
We could explicitly test for a FS
Currently DAX folio/page reference counts are managed differently to
normal pages. To allow these to be managed the same as normal pages
introduce vmf_insert_folio_pud. This will map the entire PUD-sized folio
and take references as it would for a normally mapped page.
This is distinct from the cu
The rmap doesn't currently support adding a PUD mapping of a
folio. This patch adds support for entire PUD mappings of folios,
primarily to allow for more standard refcounting of device DAX
folios. Currently DAX is the only user of this and it doesn't require
support for partially mapped PUD-sized
Currently DAX folio/page reference counts are managed differently to
normal pages. To allow these to be managed the same as normal pages
introduce vmf_insert_folio_pmd. This will map the entire PMD-sized folio
and take references as it would for a normally mapped page.
This is distinct from the cu
Add helpers to determine if a page or folio is a devdax or fsdax page
or folio.
Signed-off-by: Alistair Popple
Acked-by: David Hildenbrand
---
Changes for v5:
- Renamed is_device_dax_page() to is_devdax_page() for consistency.
---
include/linux/memremap.h | 22 ++
1 file
Currently fs dax pages are considered free when the refcount drops to
one and their refcounts are not increased when mapped via PTEs or
decreased when unmapped. This requires special logic in mm paths to
detect that these pages should not be properly refcounted, and to
detect when the refcount drop
The procfs mmu files such as smaps and pagemap currently ignore devdax and
fsdax pages because these pages are considered special. A future change
will start treating these as normal pages, meaning they can be exposed via
smaps and pagemap.
The only difference is that devdax and fsdax pages can ne
Longterm pinning of FS DAX pages should already be disallowed by
various pXX_devmap checks. However a future change will cause these
checks to be invalid for FS DAX pages so make
folio_is_longterm_pinnable() return false for FS DAX pages.
Signed-off-by: Alistair Popple
Reviewed-by: John Hubbard
At present mlock skips ptes mapping ZONE_DEVICE pages. A future change
to remove pmd_devmap will allow pmd_trans_huge_lock() to return
ZONE_DEVICE folios so make sure we continue to skip those.
Signed-off-by: Alistair Popple
Acked-by: David Hildenbrand
---
mm/mlock.c | 2 ++
1 file changed, 2 i
The devmap PTE special bit was used to detect mappings of FS DAX
pages. This tracking was required to ensure the generic mm did not
manipulate the page reference counts as FS DAX implemented its own
reference counting scheme.
Now that FS DAX pages have their references counted the same way as
nor
Now that DAX and all other reference counts to ZONE_DEVICE pages are
managed normally there is no need for the special devmap PTE/PMD/PUD
page table bits. So drop all references to these, freeing up a
software defined page table bit on architectures supporting it.
Signed-off-by: Alistair Popple
A
DEVMAP PTEs are no longer required to support ZONE_DEVICE so remove
them.
Signed-off-by: Alistair Popple
Suggested-by: Chunyan Zhang
Reviewed-by: Björn Töpel
---
arch/riscv/Kconfig| 1 -
arch/riscv/include/asm/pgtable-64.h | 20
arch/riscv/include/as
DEVMAP PTEs are no longer required to support ZONE_DEVICE so remove
them.
Signed-off-by: Alistair Popple
---
arch/loongarch/Kconfig| 1 -
arch/loongarch/include/asm/pgtable-bits.h | 6 ++
arch/loongarch/include/asm/pgtable.h | 19 ---
3 files change
Device DAX pages are currently not reference counted when mapped,
instead relying on the devmap PTE bit to ensure mapping code will not
get/put references. This requires special handling in various page
table walkers, particularly GUP, to manage references on the
underlying pgmap to ensure the page
Zone device pages are used to represent various types of device memory
managed by device drivers. Currently compound zone device pages are
not supported. This is because MEMORY_DEVICE_FS_DAX pages are the only
user of higher order zone device pages and have their own page
reference counting.
A futu
In preparation for using insert_page() for DAX, enhance
insert_page_into_pte_locked() to handle establishing writable
mappings. Recall that DAX returns VM_FAULT_NOPAGE after installing a
PTE which bypasses the typical set_pte_range() in finish_fault.
Signed-off-by: Alistair Popple
Suggested-by:
Currently to map a DAX page the DAX driver calls vmf_insert_pfn. This
creates a special devmap PTE entry for the pfn but does not take a
reference on the underlying struct page for the mapping. This is
because DAX page refcounts are treated specially, as indicated by the
presence of a devmap entry.
Currently ZONE_DEVICE page reference counts are initialised by core
memory management code in __init_zone_device_page() as part of the
memremap() call which driver modules make to obtain ZONE_DEVICE
pages. This initialises page refcounts to 1 before returning them to
the driver.
This was presumabl
On Wed, Jan 08, 2025 at 05:34:30PM -0800, Alison Schofield wrote:
> On Tue, Jan 07, 2025 at 02:42:16PM +1100, Alistair Popple wrote:
> > Main updates since v4:
> >
> > - Removed most of the devdax/fsdax checks in fs/proc/task_mmu.c. This
> >means smaps/pagemap may contain DAX pages.
> >
> >
On Thu, Jan 09, 2025 at 06:38:10PM +0900, Akihiko Odaki wrote:
> On 2025/01/09 16:40, Michael S. Tsirkin wrote:
> > On Thu, Jan 09, 2025 at 02:32:25AM -0500, Michael S. Tsirkin wrote:
> > > On Thu, Jan 09, 2025 at 03:58:45PM +0900, Akihiko Odaki wrote:
> > > > The specification says the device MUST
On Wed, Jan 08, 2025 at 09:47:11PM +0800, Luo Jie wrote:
> The BM (Buffer Management) config controls the pause frame generated
> on the PPE port. A maximum of 15 BM ports and 4 groups are supported;
> all BM ports are assigned to group 0 by default. The number of hardware
> buffers configured for
On Wed, Jan 08, 2025 at 09:47:14PM +0800, Luo Jie wrote:
> Configure unicast and multicast hardware queues for the PPE
> ports to enable packet forwarding between the ports.
>
> Each PPE port is assigned a range of queues. The queue ID
> selection for a packet is decided by the queue base and
On 01/09, Song, Yoong Siang wrote:
> On Wednesday, January 8, 2025 1:08 AM, Stanislav Fomichev
> wrote:
> >On 01/06, Song Yoong Siang wrote:
> >> Enable launch time (Time-Based Scheduling) support to XDP zero copy via XDP
> >> Tx metadata framework.
> >>
> >> This patch is tested with tools/testi
On 01/09, Song, Yoong Siang wrote:
> On Wednesday, January 8, 2025 12:50 AM, Stanislav Fomichev
> wrote:
> >On 01/06, Song Yoong Siang wrote:
> >> Extend the XDP Tx metadata framework so that users can request launch time
> >> hardware offload, where the Ethernet device will schedule the packet f
On Wed, Jan 08, 2025 at 09:47:13PM +0800, Luo Jie wrote:
> The PPE scheduler settings determine the priority of scheduling the
> packet across the different hardware queues per PPE port.
>
> Signed-off-by: Luo Jie
> ---
> drivers/net/ethernet/qualcomm/ppe/ppe_config.c | 789
> ++
On Wed, Jan 08, 2025 at 09:47:15PM +0800, Luo Jie wrote:
> PPE service code is a special code (0-255) that is defined by PPE for
> PPE's packet processing stages, as per the network functions required
> for the packet.
>
> For packets being sent out by ARM cores on Ethernet ports, the service
> cod
Akihiko Odaki wrote:
> Both tun and tap expose the same set of virtio-net-related features.
> Unify their implementations to ease future changes.
>
> Signed-off-by: Akihiko Odaki
> ---
> MAINTAINERS| 1 +
> drivers/net/Kconfig| 5 ++
> drivers/net/Makefile | 1 +
> drive
Akihiko Odaki wrote:
> This series depends on: "[PATCH v2 0/3] tun: Unify vnet implementation
> and fill full vnet header"
> https://lore.kernel.org/r/20250109-tun-v2-0-388d7d5a2...@daynix.com
As mentioned elsewhere: let's first handle that patch series and
return to this
Akihiko Odaki wrote:
> They are useful to implement VIRTIO_NET_F_RSS and
> VIRTIO_NET_F_HASH_REPORT.
Toeplitz potentially has users beyond virtio. I wonder if we should
from the start implement this as net/core/rss.c.
> Signed-off-by: Akihiko Odaki
> ---
> include/linux/virtio_net.h | 188
>
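For readers unfamiliar with the Toeplitz hash mentioned in this reply, the following is a minimal bit-by-bit sketch of the generic algorithm. It is not the code from the patch to include/linux/virtio_net.h; the function name and signature are hypothetical, and key bits beyond key_len are treated as zero as a simplifying assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Generic Toeplitz hash: for every set bit in the input, XOR in the
 * 32-bit window of the key that starts at the same bit offset.
 */
static uint32_t sketch_toeplitz_hash(const uint8_t *key, size_t key_len,
				     const uint8_t *data, size_t len)
{
	uint32_t hash = 0;
	uint32_t window;

	assert(key_len >= 4);

	/* The window initially holds the first 32 bits of the key. */
	window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
		 ((uint32_t)key[2] << 8) | (uint32_t)key[3];

	for (size_t i = 0; i < len; i++) {
		for (int bit = 7; bit >= 0; bit--) {
			if (data[i] & (1u << bit))
				hash ^= window;

			/* Slide the window one bit and pull in the next key bit. */
			window <<= 1;
			if (i + 4 < key_len && (key[i + 4] & (1u << bit)))
				window |= 1;
		}
	}

	return hash;
}
```

Because the hash is a XOR of key windows, it is linear: hashing the XOR of two inputs gives the XOR of their hashes. Nothing in the function is specific to virtio, which is consistent with the suggestion to place such a helper in common networking code.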
Akihiko Odaki wrote:
> When I implemented virtio's hash-related features to tun/tap [1],
> I found tun/tap does not fill the entire region reserved for the virtio
> header, leaving some uninitialized hole in the middle of the buffer
> after read()/recvmsg().
>
> This series fills the uninitialize
Akihiko Odaki wrote:
> The added tests confirm tun can perform RSS and hash reporting, and
> reject invalid configurations for them.
>
> Signed-off-by: Akihiko Odaki
> ---
> tools/testing/selftests/net/Makefile | 2 +-
> tools/testing/selftests/net/tun.c| 558
> +++
On Wed, Jan 08, 2025 at 09:47:08PM +0800, Luo Jie wrote:
> +required:
> + - clocks
> + - clock-names
> + - resets
> + - interrupts
> + - interrupt-names
> +
> + ethernet-ports:
This device really looks like DSA or other ethernet switch, so I would
really expect proper
On 2025/01/09 16:43, Michael S. Tsirkin wrote:
On Thu, Jan 09, 2025 at 04:41:50PM +0900, Akihiko Odaki wrote:
On 2025/01/09 16:31, Michael S. Tsirkin wrote:
On Thu, Jan 09, 2025 at 03:58:44PM +0900, Akihiko Odaki wrote:
tun used to simply advance iov_iter when it needs to pad virtio header,
wh
On 2025/01/09 16:40, Michael S. Tsirkin wrote:
On Thu, Jan 09, 2025 at 02:32:25AM -0500, Michael S. Tsirkin wrote:
On Thu, Jan 09, 2025 at 03:58:45PM +0900, Akihiko Odaki wrote:
The specification says the device MUST set num_buffers to 1 if
VIRTIO_NET_F_MRG_RXBUF has not been negotiated.
Signe
Akihiko Odaki wrote:
> On 2025/01/09 16:31, Michael S. Tsirkin wrote:
> > On Thu, Jan 09, 2025 at 03:58:44PM +0900, Akihiko Odaki wrote:
> >> tun used to simply advance iov_iter when it needs to pad virtio header,
> >> which leaves the garbage in the buffer as is. This is especially
> >> problemati
On Thu 09-01-25 18:36:52, Akihiko Odaki wrote:
> On 2025/01/09 16:43, Michael S. Tsirkin wrote:
> > On Thu, Jan 09, 2025 at 04:41:50PM +0900, Akihiko Odaki wrote:
> > > On 2025/01/09 16:31, Michael S. Tsirkin wrote:
> > > > On Thu, Jan 09, 2025 at 03:58:44PM +0900, Akihiko Odaki wrote:
> > > > > tu
On 1/9/2025 3:27 AM, Christophe JAILLET wrote:
Le 08/01/2025 à 14:47, Luo Jie a écrit :
The PPE scheduler settings determine the priority of scheduling the
packet across the different hardware queues per PPE port.
Signed-off-by: Luo Jie
...
+/* Scheduler configuration for dispatching pa
On 1/9/2025 3:29 AM, Christophe JAILLET wrote:
Le 08/01/2025 à 14:47, Luo Jie a écrit :
Configure unicast and multicast hardware queues for the PPE
ports to enable packet forwarding between the ports.
Each PPE port is assigned a range of queues. The queue ID
selection for a packet is de
On 1/9/2025 3:19 AM, Christophe JAILLET wrote:
Le 08/01/2025 à 14:47, Luo Jie a écrit :
The PPE (Packet Process Engine) hardware block is available
on Qualcomm IPQ SoCs that support the PPE architecture, such as
IPQ9574.
The PPE in IPQ9574 includes six integrated ethernet MAC
(for 6 PPE ports), b
On 1/9/2025 12:43 AM, Andrew Lunn wrote:
On Wed, Jan 08, 2025 at 09:47:20PM +0800, Luo Jie wrote:
The PPE hardware packet counters are made available through
the debugfs entry "/sys/kernel/debug/ppe/packet_counters".
Why?
Would it not be better to make them available via ethtool -S ?
Man
Akihiko Odaki wrote:
> Hash reporting
> --
>
> Allow the guest to reuse the hash value to make receive steering
> consistent between the host and guest, and to save hash computation.
>
> RSS
> ---
>
> RSS is a receive steering algorithm that can be negotiated to use with
> virtio_net
: e94dc6ddda8dd3770879a132d577accd2cce25f9
patch link:
https://lore.kernel.org/r/03c01be90e53f743a91b6c1376c408404b891867.1736237481.git.nicolinc%40nvidia.com
patch subject: [PATCH v5 14/14] iommu/arm-smmu-v3: Report events that belong to
devices attached to vIOMMU
config: arm64-randconfig-r131-20250109
On 2025/01/09 19:54, Michael S. Tsirkin wrote:
On Thu, Jan 09, 2025 at 06:38:10PM +0900, Akihiko Odaki wrote:
On 2025/01/09 16:40, Michael S. Tsirkin wrote:
On Thu, Jan 09, 2025 at 02:32:25AM -0500, Michael S. Tsirkin wrote:
On Thu, Jan 09, 2025 at 03:58:45PM +0900, Akihiko Odaki wrote:
The s
On Thu, Jan 09, 2025 at 05:27:14PM +, Simon Horman wrote:
> On Wed, Jan 08, 2025 at 09:47:11PM +0800, Luo Jie wrote:
> > The BM (Buffer Management) config controls the pause frame generated
> > on the PPE port. A maximum of 15 BM ports and 4 groups are supported;
> > all BM ports are assigned