[PATCH RFC v3 00/27] KVM: arm64: Implement support for SME in non-protected guests

2024-12-20 Thread Mark Brown
Given the time of year and point in the release cycle this is an RFC series, there's a few areas where I'm particularly expecting that people might have feedback: - The userspace ABI, in particular: - The vector length used for the SVE registers, access to the SVE registers and access to ZA

[PATCH RFC v3 01/27] arm64/fpsimd: Update FA64 and ZT0 enables when loading SME state

2024-12-20 Thread Mark Brown
Currently we enable EL0 and EL1 access to FA64 and ZT0 at boot and leave them enabled throughout the runtime of the system. When we add KVM support we will need to make this configuration dynamic, these features may be disabled for some KVM guests. Since the host kernel saves the floating point sta

[PATCH RFC v3 03/27] arm64/fpsimd: Check enable bit for FA64 when saving EFI state

2024-12-20 Thread Mark Brown
Currently when deciding if we need to save FFR when in streaming mode prior to EFI calls we check if FA64 is supported by the system. Since KVM guest support will mean that FA64 might be enabled and disabled at runtime switch to checking if traps for FA64 are enabled in SMCR_EL1 instead. Signed-of

[PATCH RFC v3 02/27] arm64/fpsimd: Decide to save ZT0 and streaming mode FFR at bind time

2024-12-20 Thread Mark Brown
Some parts of the SME state are optional, enabled by additional features on top of the base FEAT_SME and controlled with enable bits in SMCR_ELx. We unconditionally enable these for the host but for KVM we will allow the feature set exposed to guests to be restricted by the VMM. These are the FFR r

[PATCH RFC v3 04/27] arm64/fpsimd: Determine maximum virtualisable SME vector length

2024-12-20 Thread Mark Brown
As with SVE we can only virtualise SME vector lengths that are supported by all CPUs in the system, implement similar checks to those for SVE. Since unlike SVE there are no specific vector lengths that are architecturally required the handling is subtly different, we report a system where this happ

[PATCH RFC v3 07/27] KVM: arm64: Convert cpacr_clear_set() to a static inline

2024-12-20 Thread Mark Brown
Currently cpacr_clear_set() is defined as a macro in order to allow it to include a number of build time asserts that the bits being set and cleared are appropriate. While this check is welcome it only works when the arguments are constant which starts to scale poorly as we add SME unless we do mul

[PATCH RFC v3 06/27] KVM: arm64: Pull ctxt_has_ helpers to start of sysreg-sr.h

2024-12-20 Thread Mark Brown
Rather than add earlier prototypes of specific ctxt_has_ helpers let's just pull all their definitions to the top of sysreg-sr.h so they're all available to all the individual save/restore functions. Signed-off-by: Mark Brown --- arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 32 ++

[PATCH RFC v3 10/27] KVM: arm64: Rename SVE finalization constants to be more general

2024-12-20 Thread Mark Brown
Due to the overlap between SVE and SME vector length configuration created by streaming mode SVE we will finalize both at once. Rename the existing finalization to use _VEC (vector) for the naming to avoid confusion. Since this includes the userspace API we create an alias KVM_ARM_VCPU_VEC for th

[PATCH RFC v3 05/27] KVM: arm64: Introduce non-UNDEF FGT control

2024-12-20 Thread Mark Brown
We have support for determining a set of fine grained traps to enable for the guest which is tied to the support for injecting UNDEFs for undefined features. This means that we can't use the mechanism for system registers which should be present but need emulation, such as SMPRI_EL1 which should be

[PATCH RFC v3 08/27] KVM: arm64: Move SVE state access macros after feature test macros

2024-12-20 Thread Mark Brown
In preparation for SME support move the macros used to access SVE state after the feature test macros, we will need to test for SME subfeatures to determine the size of the SME state. Signed-off-by: Mark Brown --- arch/arm64/include/asm/kvm_host.h | 46 +++ 1

[PATCH RFC v3 12/27] KVM: arm64: Define internal features for SME

2024-12-20 Thread Mark Brown
In order to simplify interdependencies in the rest of the series define the feature detection for SME and its subfeatures. Due to the need for vector length configuration we define a flag for SME like for SVE. We also have two subfeatures which add architectural state, FA64 and SME2, which are c

[PATCH RFC v3 11/27] KVM: arm64: Document the KVM ABI for SME

2024-12-20 Thread Mark Brown
SME, the Scalable Matrix Extension, is an arm64 extension which adds support for matrix operations, with core concepts patterned after SVE. SVE introduced some complication in the ABI since it adds new vector floating point registers with runtime configurable size, the size being controlled by a p

[PATCH RFC v3 14/27] KVM: arm64: Store vector lengths in an array

2024-12-20 Thread Mark Brown
SME adds a second vector length configured in a very similar way to the SVE vector length, in order to facilitate future code sharing for SME refactor our storage of vector lengths to use an array like the host does. We do not yet take much advantage of this so the intermediate code is not as clean

[PATCH RFC v3 09/27] KVM: arm64: Factor SVE guest exit handling out into a function

2024-12-20 Thread Mark Brown
The SVE portion of kvm_vcpu_put() is quite large, especially given the comments required. When we add similar handling for SME the function will get even larger, in order to keep things manageable factor the SVE portion out of the main kvm_vcpu_put(). Signed-off-by: Mark Brown --- arch/arm64/kvm

[PATCH RFC v3 18/27] KVM: arm64: Support SMIDR_EL1 for guests

2024-12-20 Thread Mark Brown
SME adds an identification register SMIDR_EL1 which provides a basic description of the SME implementation, describing the implementation in a manner similar to MIDR_EL1 for the PE as well as indicating support for priority management. Since we do not currently support SME priority control we mask

[PATCH RFC v3 16/27] KVM: arm64: Add definitions for SME control register

2024-12-20 Thread Mark Brown
SME is configured by the system registers SMCR_EL1 and SMCR_EL2, add definitions and userspace access for them. They will be context switched together with the rest of SME state. In systems with SME priority support there are additional registers SMPRI_EL1 and SMPRIMAP_EL2 managing the priorities

[PATCH RFC v3 17/27] KVM: arm64: Support TPIDR2_EL0

2024-12-20 Thread Mark Brown
SME adds a new thread ID register, TPIDR2_EL0. This is used in userspace for delayed saving of the ZA state but in terms of the architecture is not really connected to SME other than being part of FEAT_SME. It has an independent fine grained trap and the runtime connection with the rest of SME is p

[PATCH RFC v3 23/27] KVM: arm64: Context switch SME state for normal guests

2024-12-20 Thread Mark Brown
If the guest has SME state we need to context switch that state, provide support for that for normal guests. SME has three sets of registers, ZA, ZT (only present for SME2) and also streaming SVE which replaces the standard floating point registers when active. The first two are fairly straightfor

[PATCH RFC v3 15/27] KVM: arm64: Implement SME vector length configuration

2024-12-20 Thread Mark Brown
SME implements a vector length which architecturally looks very similar to that for SVE, configured in a very similar manner. This controls the vector length used for the ZA matrix register, and for the SVE vector and predicate registers when in streaming mode. The only substantial difference is

[PATCH RFC v3 22/27] KVM: arm64: Expose SME specific state to userspace

2024-12-20 Thread Mark Brown
SME introduces two new registers, the ZA matrix register and the ZT0 LUT register. Both of these registers are only accessible when PSTATE.ZA is set and ZT0 is only present if SME2 is enabled for the guest. Provide support for configuring these from VMMs. The ZA matrix is a single SVL*SVL registe

[PATCH RFC v3 13/27] KVM: arm64: Rename sve_state_reg_region

2024-12-20 Thread Mark Brown
As for SVE we will need to pull parts of dynamically sized registers out of a block of memory for SME so we will use a similar code pattern for this. Rename the current struct sve_state_reg_region in preparation for this. No functional change. Signed-off-by: Mark Brown --- arch/arm64/kvm/guest.

[PATCH RFC v3 19/27] KVM: arm64: Support SME priority registers

2024-12-20 Thread Mark Brown
SME has optional support for configuring the relative priorities of PEs in systems where they share a single SME hardware block, known as a SMCU. Currently we do not have any support for this in Linux and will also hide it from KVM guests, pending experience with practical implementations. The inte

[PATCH RFC v3 21/27] KVM: arm64: Support Z and P registers in streaming mode

2024-12-20 Thread Mark Brown
SME introduces a mode called streaming mode where the Z, P and optionally FFR registers can be accessed using the SVE instructions but with the SME vector length. Reflect this in the ABI for accessing the guest registers by making the vector length for the vcpu reflect the vector length that would

[PATCH RFC v3 27/27] KVM: arm64: selftests: Add SME to set_id_regs test

2024-12-20 Thread Mark Brown
Add coverage of the SME ID registers to set_id_regs, ID_AA64PFR1_EL1.SME becomes writable and we add ID_AA64SMFR_EL1 and its subfields. Signed-off-by: Mark Brown --- tools/testing/selftests/kvm/aarch64/set_id_regs.c | 29 +-- 1 file changed, 27 insertions(+), 2 deletions(-)

[PATCH RFC v3 26/27] KVM: arm64: selftests: Add SME system registers to get-reg-list

2024-12-20 Thread Mark Brown
SME adds a number of new system registers, update get-reg-list to check for them based on the visibility of SME. Signed-off-by: Mark Brown --- tools/testing/selftests/kvm/aarch64/get-reg-list.c | 32 +- 1 file changed, 31 insertions(+), 1 deletion(-) diff --git a/tools/testi

[PATCH RFC v3 24/27] KVM: arm64: Handle SME exceptions

2024-12-20 Thread Mark Brown
The access control for SME follows the same structure as for the base FP and SVE extensions, with control being via CPACR_ELx.SMEN and CPTR_EL2.TSM mirroring the equivalent FPSIMD and SVE controls in those registers. Add handling for these controls and exceptions mirroring the existing handling for

[PATCH RFC v3 20/27] KVM: arm64: Provide assembly for SME state restore

2024-12-20 Thread Mark Brown
Provide a __sme_restore_state() for the hypervisor to allow it to restore ZA and ZT for guests. Signed-off-by: Mark Brown --- arch/arm64/include/asm/kvm_hyp.h | 2 ++ arch/arm64/kvm/hyp/fpsimd.S | 16 2 files changed, 18 insertions(+) diff --git a/arch/arm64/include/asm/k

[PATCH RFC v3 25/27] KVM: arm64: Provide interface for configuring and enabling SME for guests

2024-12-20 Thread Mark Brown
Since SME requires configuration of a vector length in order to know the size of both the streaming mode SVE state and ZA array we implement a capability for it and require that it be enabled and finalized before the SME specific state can be accessed, similarly to SVE. Due to the overlap with siz

Re: [PATCH v1 2/6] docs: 6.Followthrough.rst: when to involved Linus in regressions

2024-12-20 Thread Thorsten Leemhuis
On 13.12.24 17:17, Jonathan Corbet wrote: > Thorsten Leemhuis writes: > >> Add a few notes on when to involve Linus in regressions. Part of this >> spells out slightly obvious things infrequent developers might not be >> aware of, while others are based on a recent statement from Linus[1]. >> >>

Re: [PATCH v1 1/6] docs: more detailed instructions on handling regressions

2024-12-20 Thread Thorsten Leemhuis
On 13.12.24 17:14, Jonathan Corbet wrote: > Thorsten Leemhuis writes: > >> Add a few more specific guidelines on handling regressions to the >> kernel's two most prominent guides about contributing to Linux, as >> developers apparently work with quite different interpretations of what >> Linus ex

[PATCH RFC 1/2] docs: process: submitting-patches: split canonical patch format section

2024-12-20 Thread Ahmad Fatoum
To make it easier to reference specific parts of the patch format, let's add some headings for different parts. Doing that, it becomes clear that backtraces in commit messages is out of place being after Reply-To Headers, so move it next to the commit message body subsubsection. Signed-off-by: Ahm

[PATCH RFC 2/2] docs: process: submitting-patches: clarify imperative mood suggestion

2024-12-20 Thread Ahmad Fatoum
While we expect commit message titles to use the imperative mood, it's ok for commit message bodies to first include a blurb describing the background of the patch, before delving into what's being done to address the situation. Make this clearer by adding a clarification after the imperative mood

[PATCH RFC 0/2] docs: process: submitting-patches: clarify imperative mood suggestion

2024-12-20 Thread Ahmad Fatoum
Many commit message bodies start off with some background information, before explaining how they address the situation. This can be arguably easier to follow than having the imperative in the commit message title be followed directly by another differently worded or more verbose imperative in the

[PATCH RFC net-next v1 0/5] Device memory TCP TX

2024-12-20 Thread Mina Almasry
The TX path had been dropped from the Device Memory TCP patch series post RFCv1 [1], to make that series slightly easier to review. This series rebases the implementation of the TX path on top of the net_iov/netmem framework agreed upon and merged. The motivation for the feature is thoroughly descr

[PATCH RFC net-next v1 2/5] selftests: ncdevmem: Implement devmem TCP TX

2024-12-20 Thread Mina Almasry
Add support for devmem TX in ncdevmem. This is a combination of the ncdevmem from the devmem TCP series RFCv1 which included the TX path, and work by Stan to include the netlink API and refactored on top of his generic memory_provider support. Signed-off-by: Mina Almasry Signed-off-by: Stanislav

[PATCH RFC net-next v1 1/5] net: add devmem TCP TX documentation

2024-12-20 Thread Mina Almasry
Add documentation outlining the usage and details of the devmem TCP TX API. Signed-off-by: Mina Almasry --- Documentation/networking/devmem.rst | 140 +++- 1 file changed, 136 insertions(+), 4 deletions(-) diff --git a/Documentation/networking/devmem.rst b/Documentation

[PATCH RFC net-next v1 3/5] net: add get_netmem/put_netmem support

2024-12-20 Thread Mina Almasry
Currently net_iovs support only pp ref counts, and do not support a page ref equivalent. This is fine for the RX path as net_iovs are used exclusively with the pp and only pp refcounting is needed there. The TX path however does not use pp ref counts, thus, support for get_page/put_page equivalent

[PATCH RFC net-next v1 4/5] net: devmem TCP tx netlink api

2024-12-20 Thread Mina Almasry
From: Stanislav Fomichev Add bind-tx netlink call to attach dmabuf for TX; queue is not required, only ifindex and dmabuf fd for attachment. Signed-off-by: Stanislav Fomichev Signed-off-by: Mina Almasry --- Documentation/netlink/specs/netdev.yaml | 12 include/uapi/linux/netdev.

[PATCH RFC net-next v1 5/5] net: devmem: Implement TX path

2024-12-20 Thread Mina Almasry
Augment dmabuf binding to be able to handle TX. Additional to all the RX binding, we also create tx_vec and tx_iter needed for the TX path. Provide API for sendmsg to be able to send dmabufs bound to this device: - Provide a new dmabuf_tx_cmsg which includes the dmabuf to send from, and the off

Re: [PATCH v1 5/6] docs: 6.Followthrough.rst: more specific advice on fixing regressions

2024-12-20 Thread Thorsten Leemhuis
On 13.12.24 17:28, Jonathan Corbet wrote: >> + - Expedite fixing regressions that recently reached releases deemed for end >> + users through new mainline releases or stable backports. If the culprit >> + reached it in the past six weeks, aim to mainline a fix before the end >> of the >> +

Re: [PATCH v1 6/6] docs: 6.Followthrough.rst: advice on handling regressions fixes

2024-12-20 Thread Thorsten Leemhuis
On 13.12.24 17:30, Jonathan Corbet wrote: > Thorsten Leemhuis writes: > >> Add some advice on how to handle regressions as developer, reviewer, and >> maintainer, as resolving regression without unnecessary delays requires >> multiple people working hand in hand. >> >> This removes equivalent par

Re: [PATCH v1 4/6] docs: 6.Followthrough.rst: tags to use in regressions fixes

2024-12-20 Thread Thorsten Leemhuis
On 13.12.24 17:24, Jonathan Corbet wrote: > Thorsten Leemhuis writes: > [...] >> diff --git a/Documentation/process/6.Followthrough.rst >> b/Documentation/process/6.Followthrough.rst >> index 763a80d21240f0..2ba16a71aba9b4 100644 >> --- a/Documentation/process/6.Followthrough.rst >> +++ b/Documen

Re: [PATCH v1 3/6] docs: 6.Followthrough.rst: interaction with stable wrt to regressions

2024-12-20 Thread Thorsten Leemhuis
On 13.12.24 17:20, Jonathan Corbet wrote: > Thorsten Leemhuis writes: > >> Add a few notes on how the interaction with the stable team works when >> it comes to mainline regressions that also affect stable series. >> >> This removes equivalent paragraphs from a section in >> Documentation/proc

Re: [PATCH v4 17/25] memremap: Add is_device_dax_page() and is_fsdax_page() helpers

2024-12-20 Thread David Hildenbrand
On 17.12.24 06:13, Alistair Popple wrote: Add helpers to determine if a page or folio is a device dax or fs dax page or folio. ... why is it "device_dax" but "fsdax" ? In particular because you wrote "fs dax" above. I see "fsdax" getting used in some functions. But then, people usually say

Re: [PATCH v4 10/25] mm/mm_init: Move p2pdma page refcount initialisation to p2pdma

2024-12-20 Thread David Hildenbrand
But that's a bit weird: we call __init_single_page()->init_page_count() to initialize it to 1, to then set it back to 0. Maybe we can just pass to __init_single_page() the refcount we want to have directly? Can be a patch on top of course. Once the dust settles on this series we won't nee

Re: [PATCH v4 14/25] rmap: Add support for PUD sized mappings to rmap

2024-12-20 Thread David Hildenbrand
return -EBUSY; diff --git a/mm/rmap.c b/mm/rmap.c index c6c4d4e..39d0439 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1203,6 +1203,11 @@ static __always_inline unsigned int __folio_add_rmap(struct folio *folio, } atomic_inc(&folio->_la

Re: [PATCH v4 19/25] proc/task_mmu: Ignore ZONE_DEVICE pages

2024-12-20 Thread David Hildenbrand
On 19.12.24 00:11, Alistair Popple wrote: On Tue, Dec 17, 2024 at 11:31:25PM +0100, David Hildenbrand wrote: On 17.12.24 06:13, Alistair Popple wrote: The procfs mmu files such as smaps currently ignore device dax and fs dax pages because these pages are considered special. To maintain existing

Re: [PATCH v4 16/25] huge_memory: Add vmf_insert_folio_pmd()

2024-12-20 Thread David Hildenbrand
+vm_fault_t vmf_insert_folio_pmd(struct vm_fault *vmf, struct folio *folio, bool write) +{ + struct vm_area_struct *vma = vmf->vma; + unsigned long addr = vmf->address & PMD_MASK; + pfn_t pfn = pfn_to_pfn_t(folio_pfn(folio)); + struct mm_struct *mm = vma->vm_mm; +

Re: [PATCH v4 12/25] mm/memory: Enhance insert_page_into_pte_locked() to create writable mappings

2024-12-20 Thread David Hildenbrand
On 20.12.24 20:01, David Hildenbrand wrote: On 17.12.24 06:12, Alistair Popple wrote: In preparation for using insert_page() for DAX, enhance insert_page_into_pte_locked() to handle establishing writable mappings. Recall that DAX returns VM_FAULT_NOPAGE after installing a PTE which bypasses the

Re: [PATCH v4 15/25] huge_memory: Add vmf_insert_folio_pud()

2024-12-20 Thread David Hildenbrand
On 17.12.24 06:12, Alistair Popple wrote: Currently DAX folio/page reference counts are managed differently to normal pages. To allow these to be managed the same as normal pages introduce vmf_insert_folio_pud. This will map the entire PUD-sized folio and take references as it would for a normall

Re: [PATCH v4 12/25] mm/memory: Enhance insert_page_into_pte_locked() to create writable mappings

2024-12-20 Thread David Hildenbrand
On 17.12.24 06:12, Alistair Popple wrote: In preparation for using insert_page() for DAX, enhance insert_page_into_pte_locked() to handle establishing writable mappings. Recall that DAX returns VM_FAULT_NOPAGE after installing a PTE which bypasses the typical set_pte_range() in finish_fault. Si

Re: [PATCH RFC net-next v1 0/5] Device memory TCP TX

2024-12-20 Thread Stanislav Fomichev
On 12/21, Mina Almasry wrote: > The TX path had been dropped from the Device Memory TCP patch series > post RFCv1 [1], to make that series slightly easier to review. This > series rebases the implementation of the TX path on top of the > net_iov/netmem framework agreed upon and merged. The motivati

Re: [PATCH RFC net-next v1 1/5] net: add devmem TCP TX documentation

2024-12-20 Thread Stanislav Fomichev
On 12/21, Mina Almasry wrote: > Add documentation outlining the usage and details of the devmem TCP TX > API. > > Signed-off-by: Mina Almasry > --- > Documentation/networking/devmem.rst | 140 +++- > 1 file changed, 136 insertions(+), 4 deletions(-) > > diff --git a/Docu

Re: [PATCH RFC net-next v1 2/5] selftests: ncdevmem: Implement devmem TCP TX

2024-12-20 Thread Stanislav Fomichev
On 12/21, Mina Almasry wrote: > Add support for devmem TX in ncdevmem. > > This is a combination of the ncdevmem from the devmem TCP series RFCv1 > which included the TX path, and work by Stan to include the netlink API > and refactored on top of his generic memory_provider support. Do you plan t

Re: [PATCH RFC net-next v1 5/5] net: devmem: Implement TX path

2024-12-20 Thread Stanislav Fomichev
On 12/21, Mina Almasry wrote: > Augment dmabuf binding to be able to handle TX. Additional to all the RX > binding, we also create tx_vec and tx_iter needed for the TX path. > > Provide API for sendmsg to be able to send dmabufs bound to this device: > > - Provide a new dmabuf_tx_cmsg which inclu