Given the time of year and the point in the release cycle, this is an RFC
series. There are a few areas where I'm particularly expecting that people
might have feedback:
- The userspace ABI, in particular:
  - The vector length used for the SVE registers, access to the SVE
    registers and access to ZA
Currently we enable EL0 and EL1 access to FA64 and ZT0 at boot and leave
them enabled throughout the runtime of the system. When we add KVM support
we will need to make this configuration dynamic, since these features may be
disabled for some KVM guests. Since the host kernel saves the floating
point sta
Currently, when deciding if we need to save FFR in streaming mode prior
to EFI calls, we check if FA64 is supported by the system. Since KVM guest
support will mean that FA64 might be enabled and disabled at runtime, switch
to checking if traps for FA64 are enabled in SMCR_EL1 instead.
Signed-of
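The check described in the patch above can be illustrated with a minimal,
self-contained sketch. It assumes that FA64 is bit 31 of SMCR_ELx, and the
helper names smcr_fa64_enabled() and need_ffr_save() are hypothetical: they
operate on a plain value rather than reading the real system register, so
this shows the logic only and is not the code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: SMCR_ELx.FA64 is bit 31 (full A64 ISA in streaming mode). */
#define SMCR_ELx_FA64 (UINT64_C(1) << 31)

/* Hypothetical helper: smcr is a value previously read from SMCR_EL1. */
static inline bool smcr_fa64_enabled(uint64_t smcr)
{
	return (smcr & SMCR_ELx_FA64) != 0;
}

/*
 * Hypothetical helper: FFR is always accessible outside streaming mode,
 * but in streaming mode it is only accessible when FA64 is enabled.
 */
static inline bool need_ffr_save(bool streaming, uint64_t smcr)
{
	return !streaming || smcr_fa64_enabled(smcr);
}
```

Keying the decision off the current register value rather than a system-wide
capability is what lets the answer change when the feature is toggled at
runtime.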
Some parts of the SME state are optional, enabled by additional features
on top of the base FEAT_SME and controlled with enable bits in SMCR_ELx. We
unconditionally enable these for the host but for KVM we will allow the
feature set exposed to guests to be restricted by the VMM. These are the
FFR r
As with SVE we can only virtualise SME vector lengths that are supported by
all CPUs in the system, so implement similar checks to those for SVE. Since,
unlike SVE, there are no specific vector lengths that are architecturally
required, the handling is subtly different: we report a system where this
happ
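The "supported by all CPUs" rule above amounts to intersecting per-CPU sets
of vector lengths. A rough sketch, assuming each CPU's supported lengths are
held in a bitmask where bit (vq - 1) stands for a length of vq * 128 bits;
common_vqs() and max_common_vq() are hypothetical names, not functions from
the series:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Intersect the per-CPU masks; a bit survives only if every CPU sets it. */
static inline uint32_t common_vqs(const uint32_t *cpu_vqs, size_t ncpus)
{
	uint32_t common = UINT32_MAX;

	for (size_t i = 0; i < ncpus; i++)
		common &= cpu_vqs[i];

	return ncpus ? common : 0;
}

/*
 * Return the largest vq present in the mask, or 0 if the mask is empty.
 * Unlike SVE, an empty result is possible for SME since no vector
 * length is architecturally required.
 */
static inline unsigned int max_common_vq(uint32_t mask)
{
	unsigned int vq = 0;

	for (unsigned int bit = 0; bit < 32; bit++)
		if (mask & (UINT32_C(1) << bit))
			vq = bit + 1;

	return vq;
}
```

A caller would treat a return value of 0 from max_common_vq() as "no vector
length is usable on every CPU" and handle that case separately.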
Currently cpacr_clear_set() is defined as a macro in order to allow it to
include a number of build time asserts that the bits being set and cleared
are appropriate. While this check is welcome, it only works when the
arguments are constant, which starts to scale poorly as we add SME unless we
do mul
Rather than add earlier prototypes of specific ctxt_has_ helpers let's just
pull all their definitions to the top of sysreg-sr.h so they're all
available to all the individual save/restore functions.
Signed-off-by: Mark Brown
---
arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 32 ++
Due to the overlap between SVE and SME vector length configuration
created by streaming mode SVE we will finalize both at once. Rename the
existing finalization to use _VEC (vector) for the naming to avoid
confusion.
Since this includes the userspace API we create an alias
KVM_ARM_VCPU_VEC for th
We have support for determining a set of fine grained traps to enable for
the guest which is tied to the support for injecting UNDEFs for undefined
features. This means that we can't use the mechanism for system registers
which should be present but need emulation, such as SMPRI_EL1 which should
be
In preparation for SME support, move the macros used to access SVE state
after the feature test macros, since we will need to test for SME
subfeatures to determine the size of the SME state.
Signed-off-by: Mark Brown
---
arch/arm64/include/asm/kvm_host.h | 46 +++
1
In order to simplify interdependencies in the rest of the series define
the feature detection for SME and its subfeatures. Due to the need for
vector length configuration we define a flag for SME like for SVE. We
also have two subfeatures which add architectural state, FA64 and SME2,
which are c
SME, the Scalable Matrix Extension, is an arm64 extension which adds
support for matrix operations, with core concepts patterned after SVE.
SVE introduced some complication in the ABI since it adds new vector
floating point registers with runtime configurable size, the size being
controlled by a p
SME adds a second vector length configured in a very similar way to the
SVE vector length. In order to facilitate future code sharing for SME,
refactor our storage of vector lengths to use an array like the host does.
We do not yet take much advantage of this so the intermediate code is not
as clean
The SVE portion of kvm_vcpu_put() is quite large, especially given the
comments required. When we add similar handling for SME the function
will get even larger. In order to keep things manageable, factor the SVE
portion out of the main kvm_vcpu_put().
Signed-off-by: Mark Brown
---
arch/arm64/kvm
SME adds an identification register SMIDR_EL1 which provides a basic
description of the SME implementation, describing the implementation
in a manner similar to MIDR_EL1 for the PE as well as indicating support
for priority management.
Since we do not currently support SME priority control we mask
SME is configured by the system registers SMCR_EL1 and SMCR_EL2; add
definitions and userspace access for them. They will be context
switched together with the rest of SME state.
In systems with SME priority support there are additional registers
SMPRI_EL1 and SMPRIMAP_EL2 managing the priorities
SME adds a new thread ID register, TPIDR2_EL0. This is used in userspace
for delayed saving of the ZA state but in terms of the architecture is
not really connected to SME other than being part of FEAT_SME. It has an
independent fine grained trap and the runtime connection with the rest
of SME is p
If the guest has SME state we need to context switch that state; provide
support for this for normal guests.
SME has three sets of registers, ZA, ZT (only present for SME2) and also
streaming SVE which replaces the standard floating point registers when
active. The first two are fairly straightfor
SME implements a vector length which architecturally looks very similar
to that for SVE, configured in a very similar manner. This controls the
vector length used for the ZA matrix register, and for the SVE vector
and predicate registers when in streaming mode. The only substantial
difference is
SME introduces two new registers, the ZA matrix register and the ZT0 LUT
register. Both of these registers are only accessible when PSTATE.ZA is
set and ZT0 is only present if SME2 is enabled for the guest. Provide
support for configuring these from VMMs.
The ZA matrix is a single SVL*SVL registe
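To give a sense of the sizes involved, the following sketch computes the
storage needed for ZA from the streaming vector length. It assumes the
vector length is expressed in 128-bit quadwords (vq), as the kernel does for
SVE, and that ZT0 is a fixed 512-bit register; the helper names are
hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define VQ_BYTES  16	/* one quadword is 128 bits */
#define ZT0_BYTES 64	/* assumption: ZT0 is a fixed 512-bit register */

/* Streaming vector length in bytes for a given number of quadwords. */
static inline size_t svl_bytes(unsigned int vq)
{
	return (size_t)vq * VQ_BYTES;
}

/* ZA is a square array of SVL bytes by SVL bytes. */
static inline size_t za_bytes(unsigned int vq)
{
	return svl_bytes(vq) * svl_bytes(vq);
}

/* Hypothetical total for ZA plus ZT0; ZT0 only counts when SME2 is present. */
static inline size_t sme_state_bytes(unsigned int vq, int has_sme2)
{
	return za_bytes(vq) + (has_sme2 ? ZT0_BYTES : 0);
}
```

For example, a 512-bit streaming vector length (vq = 4) gives a 64 by 64
byte ZA, 4096 bytes in total, which is why the size has to be derived from
the configured vector length rather than fixed at build time.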
As for SVE we will need to pull parts of dynamically sized registers out of
a block of memory for SME so we will use a similar code pattern for this.
Rename the current struct sve_state_reg_region in preparation for this.
No functional change.
Signed-off-by: Mark Brown
---
arch/arm64/kvm/guest.
SME has optional support for configuring the relative priorities of PEs
in systems where they share a single SME hardware block, known as an
SMCU. Currently we do not have any support for this in Linux and will
also hide it from KVM guests, pending experience with practical
implementations. The inte
SME introduces a mode called streaming mode where the Z, P and optionally
FFR registers can be accessed using the SVE instructions but with the SME
vector length. Reflect this in the ABI for accessing the guest registers by
making the vector length for the vcpu reflect the vector length that would
Add coverage of the SME ID registers to set_id_regs: ID_AA64PFR1_EL1.SME
becomes writable and we add ID_AA64SMFR_EL1 and its subfields.
Signed-off-by: Mark Brown
---
tools/testing/selftests/kvm/aarch64/set_id_regs.c | 29 +--
1 file changed, 27 insertions(+), 2 deletions(-)
SME adds a number of new system registers; update get-reg-list to check for
them based on the visibility of SME.
Signed-off-by: Mark Brown
---
tools/testing/selftests/kvm/aarch64/get-reg-list.c | 32 +-
1 file changed, 31 insertions(+), 1 deletion(-)
diff --git a/tools/testi
The access control for SME follows the same structure as for the base FP
and SVE extensions, with control being via CPACR_ELx.SMEN and CPTR_EL2.TSM
mirroring the equivalent FPSIMD and SVE controls in those registers. Add
handling for these controls and exceptions mirroring the existing handling
for
Provide a __sme_restore_state() for the hypervisor to allow it to restore
ZA and ZT for guests.
Signed-off-by: Mark Brown
---
arch/arm64/include/asm/kvm_hyp.h | 2 ++
arch/arm64/kvm/hyp/fpsimd.S | 16
2 files changed, 18 insertions(+)
diff --git a/arch/arm64/include/asm/k
Since SME requires configuration of a vector length in order to know the
size of both the streaming mode SVE state and ZA array we implement a
capability for it and require that it be enabled and finalized before
the SME specific state can be accessed, similarly to SVE.
Due to the overlap with siz
On 13.12.24 17:17, Jonathan Corbet wrote:
> Thorsten Leemhuis writes:
>
>> Add a few notes on when to involve Linus in regressions. Part of this
>> spells out slightly obvious things infrequent developers might not be
>> aware of, while others are based on a recent statement from Linus[1].
>>
>>
On 13.12.24 17:14, Jonathan Corbet wrote:
> Thorsten Leemhuis writes:
>
>> Add a few more specific guidelines on handling regressions to the
>> kernel's two most prominent guides about contributing to Linux, as
>> developers apparently work with quite different interpretations of what
>> Linus ex
To make it easier to reference specific parts of the patch format,
let's add some headings for different parts.
Doing that, it becomes clear that the section on backtraces in commit
messages is out of place, coming after Reply-To Headers, so move it next to
the commit message body subsubsection.
Signed-off-by: Ahm
While we expect commit message titles to use the imperative mood,
it's ok for commit message bodies to first include a blurb describing
the background of the patch, before delving into what's being done
to address the situation.
Make this clearer by adding a clarification after the imperative mood
Many commit message bodies start off with some background information,
before explaining how they address the situation. This is arguably
easier to follow than having the imperative in the commit message title
be followed directly by another differently worded or more verbose
imperative in the
The TX path had been dropped from the Device Memory TCP patch series
post RFCv1 [1], to make that series slightly easier to review. This
series rebases the implementation of the TX path on top of the
net_iov/netmem framework agreed upon and merged. The motivation for
the feature is thoroughly descr
Add support for devmem TX in ncdevmem.
This is a combination of the ncdevmem from the devmem TCP series RFCv1
which included the TX path, and work by Stan to include the netlink API
and refactored on top of his generic memory_provider support.
Signed-off-by: Mina Almasry
Signed-off-by: Stanislav
Add documentation outlining the usage and details of the devmem TCP TX
API.
Signed-off-by: Mina Almasry
---
Documentation/networking/devmem.rst | 140 +++-
1 file changed, 136 insertions(+), 4 deletions(-)
diff --git a/Documentation/networking/devmem.rst
b/Documentation
Currently net_iovs support only pp ref counts, and do not support a
page ref equivalent.
This is fine for the RX path as net_iovs are used exclusively with the
pp and only pp refcounting is needed there. The TX path however does not
use pp ref counts, thus, support for get_page/put_page equivalent
From: Stanislav Fomichev
Add bind-tx netlink call to attach dmabuf for TX; queue is not
required, only ifindex and dmabuf fd for attachment.
Signed-off-by: Stanislav Fomichev
Signed-off-by: Mina Almasry
---
Documentation/netlink/specs/netdev.yaml | 12
include/uapi/linux/netdev.
Augment dmabuf binding to be able to handle TX. Additional to all the RX
binding, we also create tx_vec and tx_iter needed for the TX path.
Provide API for sendmsg to be able to send dmabufs bound to this device:
- Provide a new dmabuf_tx_cmsg which includes the dmabuf to send from,
and the off
On 13.12.24 17:28, Jonathan Corbet wrote:
>> + - Expedite fixing regressions that recently reached releases deemed for end
>> + users through new mainline releases or stable backports. If the culprit
>> + reached it in the past six weeks, aim to mainline a fix before the end
>> of the
>> +
On 13.12.24 17:30, Jonathan Corbet wrote:
> Thorsten Leemhuis writes:
>
>> Add some advice on how to handle regressions as developer, reviewer, and
>> maintainer, as resolving regression without unnecessary delays requires
>> multiple people working hand in hand.
>>
>> This removes equivalent par
On 13.12.24 17:24, Jonathan Corbet wrote:
> Thorsten Leemhuis writes:
> [...]
>> diff --git a/Documentation/process/6.Followthrough.rst
>> b/Documentation/process/6.Followthrough.rst
>> index 763a80d21240f0..2ba16a71aba9b4 100644
>> --- a/Documentation/process/6.Followthrough.rst
>> +++ b/Documen
On 13.12.24 17:20, Jonathan Corbet wrote:
> Thorsten Leemhuis writes:
>
>> Add a few notes on how the interaction with the stable team works when
>> it comes to mainline regressions that also affect stable series.
>>
>> This removes equivalent paragraphs from a section in
>> Documentation/proc
On 17.12.24 06:13, Alistair Popple wrote:
Add helpers to determine if a page or folio is a device dax or fs dax
page or folio.
... why is it "device_dax" but "fsdax" ? In particular because you wrote
"fs dax" above.
I see "fsdax" getting used in some functions. But then, people usually
say
But that's a bit weird: we call __init_single_page()->init_page_count() to
initialize it to 1, to then set it back to 0.
Maybe we can just pass to __init_single_page() the refcount we want to have
directly? Can be a patch on top of course.
Once the dust settles on this series we won't nee
return -EBUSY;
diff --git a/mm/rmap.c b/mm/rmap.c
index c6c4d4e..39d0439 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1203,6 +1203,11 @@ static __always_inline unsigned int
__folio_add_rmap(struct folio *folio,
}
atomic_inc(&folio->_la
On 19.12.24 00:11, Alistair Popple wrote:
On Tue, Dec 17, 2024 at 11:31:25PM +0100, David Hildenbrand wrote:
On 17.12.24 06:13, Alistair Popple wrote:
The procfs mmu files such as smaps currently ignore device dax and fs
dax pages because these pages are considered special. To maintain
existing
+vm_fault_t vmf_insert_folio_pmd(struct vm_fault *vmf, struct folio *folio,
bool write)
+{
+ struct vm_area_struct *vma = vmf->vma;
+ unsigned long addr = vmf->address & PMD_MASK;
+ pfn_t pfn = pfn_to_pfn_t(folio_pfn(folio));
+ struct mm_struct *mm = vma->vm_mm;
+
On 20.12.24 20:01, David Hildenbrand wrote:
On 17.12.24 06:12, Alistair Popple wrote:
In preparation for using insert_page() for DAX, enhance
insert_page_into_pte_locked() to handle establishing writable
mappings. Recall that DAX returns VM_FAULT_NOPAGE after installing a
PTE which bypasses the
On 17.12.24 06:12, Alistair Popple wrote:
Currently DAX folio/page reference counts are managed differently to
normal pages. To allow these to be managed the same as normal pages
introduce vmf_insert_folio_pud. This will map the entire PUD-sized folio
and take references as it would for a normall
On 17.12.24 06:12, Alistair Popple wrote:
In preparation for using insert_page() for DAX, enhance
insert_page_into_pte_locked() to handle establishing writable
mappings. Recall that DAX returns VM_FAULT_NOPAGE after installing a
PTE which bypasses the typical set_pte_range() in finish_fault.
Si
On 12/21, Mina Almasry wrote:
> The TX path had been dropped from the Device Memory TCP patch series
> post RFCv1 [1], to make that series slightly easier to review. This
> series rebases the implementation of the TX path on top of the
> net_iov/netmem framework agreed upon and merged. The motivati
On 12/21, Mina Almasry wrote:
> Add documentation outlining the usage and details of the devmem TCP TX
> API.
>
> Signed-off-by: Mina Almasry
> ---
> Documentation/networking/devmem.rst | 140 +++-
> 1 file changed, 136 insertions(+), 4 deletions(-)
>
> diff --git a/Docu
On 12/21, Mina Almasry wrote:
> Add support for devmem TX in ncdevmem.
>
> This is a combination of the ncdevmem from the devmem TCP series RFCv1
> which included the TX path, and work by Stan to include the netlink API
> and refactored on top of his generic memory_provider support.
Do you plan t
On 12/21, Mina Almasry wrote:
> Augment dmabuf binding to be able to handle TX. Additional to all the RX
> binding, we also create tx_vec and tx_iter needed for the TX path.
>
> Provide API for sendmsg to be able to send dmabufs bound to this device:
>
> - Provide a new dmabuf_tx_cmsg which inclu