PCI: Work around PCIe link training failures

2024-07-26 Thread Matthew W Carlis
On Mon, 22 Jul 2024, Maciej W. Rozycki wrote: > The main reason is it is believed that it is the downstream device > causing the issue, and obviously you can't fetch its ID if you can't > negotiate link so as to talk to it in the first place. Have had some more time to look into this issue. So, I

Re: [kvm-unit-tests PATCH] build: retain intermediate .aux.o targets

2024-07-26 Thread Thomas Huth
On 26/07/2024 06.15, Nicholas Piggin wrote: On Fri Jun 14, 2024 at 6:38 PM AEST, Nicholas Piggin wrote: On Fri Jun 14, 2024 at 11:08 AM AEST, Segher Boessenkool wrote: On Fri, Jun 14, 2024 at 10:43:39AM +1000, Nicholas Piggin wrote: On Wed Jun 12, 2024 at 6:28 PM AEST, Segher Boessenkool wrote

Re: [PATCH v2 00/25] mm: introduce numa_memblks

2024-07-26 Thread Mike Rapoport
On Wed, Jul 24, 2024 at 10:48:42PM -0400, Zi Yan wrote: > On 24 Jul 2024, at 20:35, Zi Yan wrote: > > On 24 Jul 2024, at 18:44, Zi Yan wrote: > >> > >> Hi, > >> > >> I have tested this series on both x86_64 and arm64. It works fine on > >> x86_64. > >> All numa=fake= options work as they did befor

Re: [PATCH v2] PCI: Fix crash during pci_dev hot-unplug on pseries KVM guest

2024-07-26 Thread Rob Herring
+ Ubuntu kernel list, again On Thu, Jul 25, 2024 at 11:15:39PM +0530, Amit Machhiwal wrote: > Hi Lizhi, Rob, > > Sorry for responding late. I got busy with some other things. > > On 2024/07/23 02:08 PM, Lizhi Hou wrote: > > > > On 7/23/24 12:54, Rob Herring wrote: > > > On Tue, Jul 23, 2024 at

Re: [PATCH v3 8/8] mm/mprotect: fix dax pud handlings

2024-07-26 Thread Peter Xu
On Thu, Jul 25, 2024 at 05:23:48PM -0700, James Houghton wrote: > On Thu, Jul 25, 2024 at 3:41 PM Peter Xu wrote: > > > > On Thu, Jul 25, 2024 at 11:29:49AM -0700, James Houghton wrote: > > > > - pages += change_pmd_range(tlb, vma, pud, addr, next, > > > > newprot, > > > > + > > > >

[PATCH 2/2] MAINTAINERS: Mark powerpc spufs as orphaned

2024-07-26 Thread Michael Ellerman
Jeremy is no longer actively maintaining spufs, mark it as orphan. Also drop the dead developerworks link. Signed-off-by: Michael Ellerman Acked-by: Jeremy Kerr --- CREDITS | 3 +++ MAINTAINERS | 4 +--- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/CREDITS b/CREDITS index

[PATCH 1/2] MAINTAINERS: Mark powerpc Cell as orphaned

2024-07-26 Thread Michael Ellerman
Arnd is no longer actively maintaining Cell, mark it as orphan. Also drop the dead developerworks link. Signed-off-by: Michael Ellerman --- CREDITS | 3 +++ MAINTAINERS | 4 +--- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/CREDITS b/CREDITS index 053e5a5003eb..65165dc80f0

Re: [PATCH V8 14/15] tools/perf: Add support to use libcapstone in powerpc

2024-07-26 Thread Athira Rajeev
> On 25 Jul 2024, at 2:30 AM, Arnaldo Carvalho de Melo wrote: > > On Thu, Jul 18, 2024 at 02:13:57PM +0530, Athira Rajeev wrote: >> Now perf uses the capstone library to disassemble the instructions in >> x86. capstone is used (if available) for perf annotate to speed up. >> Currently it only

Re: [PATCH 2/2] MAINTAINERS: Mark powerpc spufs as orphaned

2024-07-26 Thread Arnd Bergmann
On Fri, Jul 26, 2024, at 14:33, Michael Ellerman wrote: > Jeremy is no longer actively maintaining spufs, mark it as orphan. > > Also drop the dead developerworks link. > > Signed-off-by: Michael Ellerman > Acked-by: Jeremy Kerr Acked-by: Arnd Bergmann

Re: [PATCH v2] PCI: Fix crash during pci_dev hot-unplug on pseries KVM guest

2024-07-26 Thread Michael Ellerman
Amit Machhiwal writes: > Hi Bjorn, > > On 2024/07/25 03:55 PM, Bjorn Helgaas wrote: >> On Thu, Jul 25, 2024 at 11:15:39PM +0530, Amit Machhiwal wrote: >> > ... >> > The crash in question is a critical issue that we would want to have >> > a fix for soon. And while this is still being figured out,

Re: [PATCH 1/2] MAINTAINERS: Mark powerpc Cell as orphaned

2024-07-26 Thread Arnd Bergmann
On Fri, Jul 26, 2024, at 14:33, Michael Ellerman wrote: > Arnd is no longer actively maintaining Cell, mark it as orphan. > > Also drop the dead developerworks link. > > Signed-off-by: Michael Ellerman Acked-by: Arnd Bergmann The platform contains two separate bits, so we need to decide what to

Re: [PATCH] perf vendor events power10: Update JSON/events

2024-07-26 Thread Arnaldo Carvalho de Melo
On Tue, Jul 23, 2024 at 09:02:23AM -0700, Ian Rogers wrote: > On Mon, Jul 22, 2024 at 10:27 PM Kajol Jain wrote: > > > > Update JSON/events for power10 platform with additional events. > > Also move PM_VECTOR_LD_CMPL event from others.json to > > frontend.json file. > > > > Signed-off-by: Kajol Ja

[PATCH v1 0/3] mm: split PTE/PMD PT table Kconfig cleanups+clarifications

2024-07-26 Thread David Hildenbrand
This series is a follow up to the fixes: "[PATCH v1 0/2] mm/hugetlb: fix hugetlb vs. core-mm PT locking" When working on the fixes, I wondered why 8xx is fine (-> never uses split PT locks) and how PT locking even works properly with PMD page table sharing (-> always requires split PMD PT

[PATCH v1 1/3] mm: turn USE_SPLIT_PTE_PTLOCKS / USE_SPLIT_PMD_PTLOCKS into Kconfig options

2024-07-26 Thread David Hildenbrand
Let's clean that up a bit and prepare for depending on CONFIG_SPLIT_PMD_PTLOCKS in other Kconfig options. More cleanups would be reasonable (like the arch-specific "depends on" for CONFIG_SPLIT_PTE_PTLOCKS), but we'll leave that for another day. Signed-off-by: David Hildenbrand --- arch/arm/mm/

[PATCH v1 2/3] mm/hugetlb: enforce that PMD PT sharing has split PMD PT locks

2024-07-26 Thread David Hildenbrand
Sharing page tables between processes but falling back to per-MM page table locks cannot possibly work. So, let's make sure that we do have split PMD locks by adding a new Kconfig option and letting that depend on CONFIG_SPLIT_PMD_PTLOCKS. Signed-off-by: David Hildenbrand --- fs/Kconfig
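
The dependency described above can be sketched as a Kconfig fragment. This is an illustrative sketch only: the snippet confirms a new option depending on CONFIG_SPLIT_PMD_PTLOCKS, but the option name below is an assumption and should be checked against the posted diff to fs/Kconfig.

```kconfig
# Sketch: PMD page table sharing is only selectable when split PMD PT
# locks are available. The symbol name HUGETLB_PMD_PAGE_TABLE_SHARING
# is an assumption, not quoted from the patch.
config HUGETLB_PMD_PAGE_TABLE_SHARING
	def_bool HUGETLB_PAGE
	depends on ARCH_WANT_HUGE_PMD_SHARE && SPLIT_PMD_PTLOCKS
```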

[PATCH v1 3/3] powerpc/8xx: document and enforce that split PT locks are not used

2024-07-26 Thread David Hildenbrand
Right now, we cannot have split PT locks because 8xx does not support SMP. But for the sake of documentation *why* 8xx is fine regarding what we documented in huge_pte_lockptr(), let's just add code to enforce it at the same time as documenting it. This should also make everybody who wants to cop

Re: [PATCH v4 18/29] arm64: add POE signal support

2024-07-26 Thread Dave Martin
On Thu, Jul 25, 2024 at 07:11:41PM +0100, Mark Brown wrote: > On Thu, Jul 25, 2024 at 04:58:27PM +0100, Dave Martin wrote: > > > I'll post a draft patch separately, since I think the update could > > benefit from separate discussion, but my back-of-the-envelope > > calculation suggests that (befor

Re: [PATCH v4 18/29] arm64: add POE signal support

2024-07-26 Thread Mark Brown
On Fri, Jul 26, 2024 at 05:14:01PM +0100, Dave Martin wrote: > On Thu, Jul 25, 2024 at 07:11:41PM +0100, Mark Brown wrote: > > That'd have to be a variably sized structure with pairs of sysreg > > ID/value items in it I think which would be a bit of a pain to implement > > but doable. The per-rec

Re: [PATCH v2] PCI: Fix crash during pci_dev hot-unplug on pseries KVM guest

2024-07-26 Thread Rob Herring
On Thu, Jul 25, 2024 at 6:06 PM Lizhi Hou wrote: > > Hi Amit, > > > I tried to follow the option which adds an OF flag. If Rob is ok with this, > I would suggest using it instead of the V1 patch > > diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c > index dda6092e6d3a..a401ed0463d9 100644 > --- a

Re: [PATCH v2] PCI: Fix crash during pci_dev hot-unplug on pseries KVM guest

2024-07-26 Thread Lizhi Hou
On 7/26/24 10:52, Rob Herring wrote: On Thu, Jul 25, 2024 at 6:06 PM Lizhi Hou wrote: Hi Amit, I tried to follow the option which adds an OF flag. If Rob is ok with this, I would suggest using it instead of the V1 patch diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c index dda6092e6d3a..

Re: [PATCH 04/15] block: add an API to atomically update queue limits

2024-07-26 Thread Christian Lamparter
Hi, got a WARNING splat (=> boot hard drive is inaccessible - device fails to boot) [ cut here ] WARNING: CPU: 0 PID: 29 at block/blk-settings.c:185 blk_validate_limits+0x154/0x294 Modules linked in: CPU: 0 PID: 29 Comm: kworker/u4:2 Tainted: G W 6.10.0

[PATCH 07/20] soc/qman: test: Use kthread_run_on_cpu()

2024-07-26 Thread Frederic Weisbecker
Use the proper API instead of open coding it. However it looks like kthreads here could be replaced by the use of a per-cpu workqueue instead. Signed-off-by: Frederic Weisbecker --- drivers/soc/fsl/qbman/qman_test_stash.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a

[PATCH v12 00/84] KVM: Stop grabbing references to PFNMAP'd pages

2024-07-26 Thread Sean Christopherson
arm64 folks, the first two patches are bug fixes, but I have very low confidence that they are correct and/or desirable. If they are more or less correct, I can post them separately if that'd make life easier. I included them here to avoid conflicts, and because I'm pretty sure how KVM deals with

[PATCH v12 01/84] KVM: arm64: Release pfn, i.e. put page, if copying MTE tags hits ZONE_DEVICE

2024-07-26 Thread Sean Christopherson
Put the page reference acquired by gfn_to_pfn_prot() if kvm_vm_ioctl_mte_copy_tags() runs into ZONE_DEVICE memory. KVM's less-than-stellar heuristics for dealing with pfn-mapped memory means that KVM can get a page reference to ZONE_DEVICE memory. Fixes: f0376edb1ddc ("KVM: arm64: Add ioctl to f

[PATCH v12 02/84] KVM: arm64: Disallow copying MTE to guest memory while KVM is dirty logging

2024-07-26 Thread Sean Christopherson
Disallow copying MTE tags to guest memory while KVM is dirty logging, as writing guest memory without marking the gfn as dirty in the memslot could result in userspace failing to migrate the updated page. Ideally (maybe?), KVM would simply mark the gfn as dirty, but there is no vCPU to work with,

[PATCH v12 03/84] KVM: Drop KVM_ERR_PTR_BAD_PAGE and instead return NULL to indicate an error

2024-07-26 Thread Sean Christopherson
Remove KVM_ERR_PTR_BAD_PAGE and instead return NULL, as "bad page" is just a leftover bit of weirdness from days of old when KVM stuffed a "bad" page into the guest instead of actually handling missing pages. See commit cea7bb21280e ("KVM: MMU: Make gfn_to_page() always safe"). Signed-off-by: Sea

[PATCH v12 04/84] KVM: Allow calling kvm_release_page_{clean,dirty}() on a NULL page pointer

2024-07-26 Thread Sean Christopherson
Allow passing a NULL @page to kvm_release_page_{clean,dirty}(), there's no tangible benefit to forcing the callers to pre-check @page, and it ends up generating a lot of duplicate boilerplate code. Signed-off-by: Sean Christopherson --- virt/kvm/kvm_main.c | 4 ++-- 1 file changed, 2 insertions(

[PATCH v12 05/84] KVM: Add kvm_release_page_unused() API to put pages that KVM never consumes

2024-07-26 Thread Sean Christopherson
Add an API to release an unused page, i.e. to put a page without marking it accessed or dirty. The API will be used when KVM faults-in a page but bails before installing the guest mapping (and other similar flows). Signed-off-by: Sean Christopherson --- include/linux/kvm_host.h | 9 + 1

[PATCH v12 06/84] KVM: x86/mmu: Skip the "try unsync" path iff the old SPTE was a leaf SPTE

2024-07-26 Thread Sean Christopherson
Apply make_spte()'s optimization to skip trying to unsync shadow pages if and only if the old SPTE was a leaf SPTE, as non-leaf SPTEs in direct MMUs are always writable, i.e. could trigger a false positive and incorrectly lead to KVM creating a SPTE without write-protecting or marking shadow pages

[PATCH v12 07/84] KVM: x86/mmu: Mark folio dirty when creating SPTE, not when zapping/modifying

2024-07-26 Thread Sean Christopherson
Mark pages/folios dirty when creating SPTEs to map PFNs into the guest, not when zapping or modifying SPTEs, as marking folios dirty when zapping or modifying SPTEs can be extremely inefficient. E.g. when KVM is zapping collapsible SPTEs to reconstitute a hugepage after disabling dirty logging, KVM

[PATCH v12 08/84] KVM: x86/mmu: Mark page/folio accessed only when zapping leaf SPTEs

2024-07-26 Thread Sean Christopherson
Mark folios as accessed only when zapping leaf SPTEs, which is a rough heuristic for "only in response to an mmu_notifier invalidation". Page aging and LRUs are tolerant of false negatives, i.e. KVM doesn't need to be precise for correctness, and re-marking folios as accessed when zapping entire r

[PATCH v12 09/84] KVM: x86/mmu: Don't force flush if SPTE update clears Accessed bit

2024-07-26 Thread Sean Christopherson
Don't force a TLB flush if mmu_spte_update() clears the Accessed bit, as access tracking tolerates false negatives, as evidenced by the mmu_notifier hooks that explicitly test and age SPTEs without doing a TLB flush. In practice, this is very nearly a nop. spte_write_protect() and spte_clear_dirty() ne

[PATCH v12 10/84] KVM: x86/mmu: Use gfn_to_page_many_atomic() when prefetching indirect PTEs

2024-07-26 Thread Sean Christopherson
Use gfn_to_page_many_atomic() instead of gfn_to_pfn_memslot_atomic() when prefetching indirect PTEs (direct_pte_prefetch_many() already uses the "to page" APIs). Functionally, the two are subtly equivalent, as the "to pfn" API short-circuits hva_to_pfn() if hva_to_pfn_fast() fails, i.e. is just a

[PATCH v12 11/84] KVM: Rename gfn_to_page_many_atomic() to kvm_prefetch_pages()

2024-07-26 Thread Sean Christopherson
Rename gfn_to_page_many_atomic() to kvm_prefetch_pages() to try and communicate its true purpose, as the "atomic" aspect is essentially a side effect of the fact that x86 uses the API while holding mmu_lock. E.g. even if mmu_lock weren't held, KVM wouldn't want to fault-in pages, as the goal is to

[PATCH v12 12/84] KVM: Drop @atomic param from gfn=>pfn and hva=>pfn APIs

2024-07-26 Thread Sean Christopherson
Drop @atomic from the myriad "to_pfn" APIs now that all callers pass "false". No functional change intended. Signed-off-by: Sean Christopherson --- Documentation/virt/kvm/locking.rst | 4 +-- arch/arm64/kvm/mmu.c | 2 +- arch/powerpc/kvm/book3s_64_mmu_hv.c| 2 +- ar

[PATCH v12 13/84] KVM: Annotate that all paths in hva_to_pfn() might sleep

2024-07-26 Thread Sean Christopherson
Now that hva_to_pfn() no longer supports being called in atomic context, move the might_sleep() annotation from hva_to_pfn_slow() to hva_to_pfn(). Signed-off-by: Sean Christopherson --- virt/kvm/kvm_main.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/virt/kvm/kvm_main.

[PATCH v12 14/84] KVM: Replace "async" pointer in gfn=>pfn with "no_wait" and error code

2024-07-26 Thread Sean Christopherson
From: David Stevens Add a pfn error code to communicate that hva_to_pfn() failed because I/O was needed and disallowed, and convert @async to a constant @no_wait boolean. This will allow eliminating the @no_wait param by having callers pass in FOLL_NOWAIT along with other FOLL_* flags. Signed-o

[PATCH v12 15/84] KVM: x86/mmu: Drop kvm_page_fault.hva, i.e. don't track intermediate hva

2024-07-26 Thread Sean Christopherson
Remove kvm_page_fault.hva as it is never read, only written. This will allow removing the @hva param from __gfn_to_pfn_memslot(). Signed-off-by: Sean Christopherson --- arch/x86/kvm/mmu/mmu.c | 5 ++--- arch/x86/kvm/mmu/mmu_internal.h | 2 -- 2 files changed, 2 insertions(+), 5 deletio

[PATCH v12 16/84] KVM: Drop unused "hva" pointer from __gfn_to_pfn_memslot()

2024-07-26 Thread Sean Christopherson
Drop @hva from __gfn_to_pfn_memslot() now that all callers pass NULL. No functional change intended. Signed-off-by: Sean Christopherson --- arch/arm64/kvm/mmu.c | 2 +- arch/powerpc/kvm/book3s_64_mmu_hv.c| 2 +- arch/powerpc/kvm/book3s_64_mmu_radix.c | 2 +- arch/x86/kvm/m

[PATCH v12 17/84] KVM: Introduce kvm_follow_pfn() to eventually replace "gfn_to_pfn" APIs

2024-07-26 Thread Sean Christopherson
From: David Stevens Introduce kvm_follow_pfn() to eventually supplant the various "gfn_to_pfn" APIs, albeit by adding more wrappers. The primary motivation of the new helper is to pass a structure instead of an ever changing set of parameters, e.g. so that tweaking the behavior, inputs, and/or o

[PATCH v12 18/84] KVM: Remove pointless sanity check on @map param to kvm_vcpu_(un)map()

2024-07-26 Thread Sean Christopherson
Drop kvm_vcpu_{,un}map()'s useless checks on @map being non-NULL. The map is 100% kernel controlled, any caller that passes a NULL pointer is broken and needs to be fixed, i.e. a crash due to a NULL pointer dereference is desirable (though obviously not as desirable as not having a bug in the firs

[PATCH v12 19/84] KVM: Explicitly initialize all fields at the start of kvm_vcpu_map()

2024-07-26 Thread Sean Christopherson
Explicitly initialize the entire kvm_host_map structure when mapping a pfn, as some callers declare their struct on the stack, i.e. don't zero-initialize the struct, which makes the map->hva in kvm_vcpu_unmap() *very* suspect. Signed-off-by: Sean Christopherson --- virt/kvm/kvm_main.c | 40 +

[PATCH v12 20/84] KVM: Use NULL for struct page pointer to indicate mremapped memory

2024-07-26 Thread Sean Christopherson
Drop yet another unnecessary magic page value from KVM, as there's zero reason to use a poisoned pointer to indicate "no page". If KVM uses a NULL page pointer, the kernel will explode just as quickly as if KVM uses a poisoned pointer. Never mind the fact that such usage would be a blatant and eg

[PATCH v12 21/84] KVM: nVMX: Rely on kvm_vcpu_unmap() to track validity of eVMCS mapping

2024-07-26 Thread Sean Christopherson
Remove the explicit evmptr12 validity check when deciding whether or not to unmap the eVMCS pointer, and instead rely on kvm_vcpu_unmap() to play nice with a NULL map->hva, i.e. to do nothing if the map is invalid. Note, vmx->nested.hv_evmcs_map is zero-allocated along with the rest of vcpu_vmx, i

[PATCH v12 22/84] KVM: nVMX: Drop pointless msr_bitmap_map field from struct nested_vmx

2024-07-26 Thread Sean Christopherson
Remove vcpu_vmx.msr_bitmap_map and instead use an on-stack structure in the one function that uses the map, nested_vmx_prepare_msr_bitmap(). Signed-off-by: Sean Christopherson --- arch/x86/kvm/vmx/nested.c | 8 arch/x86/kvm/vmx/vmx.h| 2 -- 2 files changed, 4 insertions(+), 6 deleti

[PATCH v12 23/84] KVM: nVMX: Add helper to put (unmap) vmcs12 pages

2024-07-26 Thread Sean Christopherson
Add a helper to dedup unmapping the vmcs12 pages. This will reduce the amount of churn when a future patch refactors the kvm_vcpu_unmap() API. No functional change intended. Signed-off-by: Sean Christopherson --- arch/x86/kvm/vmx/nested.c | 32 ++-- 1 file changed,

[PATCH v12 24/84] KVM: Use plain "struct page" pointer instead of single-entry array

2024-07-26 Thread Sean Christopherson
Use a single pointer instead of a single-entry array for the struct page pointer in hva_to_pfn_fast(). Using an array makes the code unnecessarily annoying to read and update. No functional change intended. Signed-off-by: Sean Christopherson --- virt/kvm/kvm_main.c | 6 +++--- 1 file changed,

[PATCH v12 25/84] KVM: Provide refcounted page as output field in struct kvm_follow_pfn

2024-07-26 Thread Sean Christopherson
Add kvm_follow_pfn.refcounted_page as an output for the "to pfn" APIs to "return" the struct page that is associated with the returned pfn (if KVM acquired a reference to the page). This will eventually allow removing KVM's hacky kvm_pfn_to_refcounted_page() code, which is error prone and can't de

[PATCH v12 26/84] KVM: Move kvm_{set,release}_page_{clean,dirty}() helpers up in kvm_main.c

2024-07-26 Thread Sean Christopherson
Hoist the kvm_{set,release}_page_{clean,dirty}() APIs further up in kvm_main.c so that they can be used by the kvm_follow_pfn family of APIs. No functional change intended. Signed-off-by: Sean Christopherson --- virt/kvm/kvm_main.c | 82 ++--- 1 file chan

[PATCH v12 27/84] KVM: pfncache: Precisely track refcounted pages

2024-07-26 Thread Sean Christopherson
Track refcounted struct page memory using kvm_follow_pfn.refcounted_page instead of relying on kvm_release_pfn_clean() to correctly detect that the pfn is associated with a struct page. Signed-off-by: Sean Christopherson --- virt/kvm/pfncache.c | 11 +++ 1 file changed, 7 insertions(+),

[PATCH v12 28/84] KVM: Migrate kvm_vcpu_map() to kvm_follow_pfn()

2024-07-26 Thread Sean Christopherson
From: David Stevens Migrate kvm_vcpu_map() to kvm_follow_pfn(), and have it track whether or not the map holds a refcounted struct page. Precisely tracking struct page references will eventually allow removing kvm_pfn_to_refcounted_page() and its various wrappers. Signed-off-by: David Stevens

[PATCH v12 29/84] KVM: Pin (as in FOLL_PIN) pages during kvm_vcpu_map()

2024-07-26 Thread Sean Christopherson
Pin, as in FOLL_PIN, pages when mapping them for direct access by KVM. As per Documentation/core-api/pin_user_pages.rst, writing to a page that was gotten via FOLL_GET is explicitly disallowed. Correct (uses FOLL_PIN calls): pin_user_pages() write to the data within the pages u

[PATCH v12 30/84] KVM: nVMX: Mark vmcs12's APIC access page dirty when unmapping

2024-07-26 Thread Sean Christopherson
Mark the APIC access page as dirty when unmapping it from KVM. The fact that the page _shouldn't_ be written doesn't guarantee the page _won't_ be written. And while the contents are likely irrelevant, the values _are_ visible to the guest, i.e. dropping writes would be visible to the guest (thou

[PATCH v12 31/84] KVM: Pass in write/dirty to kvm_vcpu_map(), not kvm_vcpu_unmap()

2024-07-26 Thread Sean Christopherson
Now that all kvm_vcpu_{,un}map() users pass "true" for @dirty, have them pass "true" as a @writable param to kvm_vcpu_map(), and thus create a read-only mapping when possible. Note, creating read-only mappings can be theoretically slower, as they don't play nice with fast GUP due to the need to br

[PATCH v12 32/84] KVM: Get writable mapping for __kvm_vcpu_map() only when necessary

2024-07-26 Thread Sean Christopherson
When creating a memory map for read, don't request a writable pfn from the primary MMU. While creating read-only mappings can be theoretically slower, as they don't play nice with fast GUP due to the need to break CoW before mapping the underlying PFN, practically speaking, creating a mapping isn'

[PATCH v12 33/84] KVM: Disallow direct access (w/o mmu_notifier) to unpinned pfn by default

2024-07-26 Thread Sean Christopherson
Add an off-by-default module param to control whether or not KVM is allowed to map memory that isn't pinned, i.e. that KVM can't guarantee won't be freed while it is mapped into KVM and/or the guest. Don't remove the functionality entirely, as there are use cases where mapping unpinned memory is s

[PATCH v12 34/84] KVM: Add a helper to lookup a pfn without grabbing a reference

2024-07-26 Thread Sean Christopherson
Add a kvm_follow_pfn() wrapper, kvm_lookup_pfn(), to allow looking up a gfn=>pfn mapping without the caller getting a reference to any underlying page. The API will be used in flows that want to know if a gfn points at a valid pfn, but don't actually need to do anything with the pfn. Signed-off-b

[PATCH v12 35/84] KVM: x86: Use kvm_lookup_pfn() to check if retrying #PF is useful

2024-07-26 Thread Sean Christopherson
Use kvm_lookup_pfn() instead of an open coded equivalent when checking to see if KVM should exit to userspace or re-enter the guest after failed instruction emulation triggered by a guest page fault. Note, there is a small functional change as kvm_lookup_pfn() doesn't mark the page as accessed, wh

[PATCH v12 36/84] KVM: x86: Use kvm_lookup_pfn() to check if APIC access page was installed

2024-07-26 Thread Sean Christopherson
Use kvm_lookup_pfn() to verify that the APIC access page was allocated and installed as expected. The mapping is controlled by KVM, i.e. it's guaranteed to be backed by struct page, the purpose of the check is purely to ensure the page is allocated, i.e. that KVM doesn't point the guest at garbage

[PATCH v12 37/84] KVM: x86/mmu: Add "mmu" prefix fault-in helpers to free up generic names

2024-07-26 Thread Sean Christopherson
Prefix x86's faultin_pfn helpers with "mmu" so that the mmu-less names can be used by common KVM for similar APIs. No functional change intended. Signed-off-by: Sean Christopherson --- arch/x86/kvm/mmu/mmu.c | 19 ++- arch/x86/kvm/mmu/mmu_internal.h | 2 +- arch/x86/kv

[PATCH v12 38/84] KVM: x86/mmu: Put direct prefetched pages via kvm_release_page_clean()

2024-07-26 Thread Sean Christopherson
Use kvm_release_page_clean() to put prefetched pages instead of calling put_page() directly. This will allow de-duplicating the prefetch code between indirect and direct MMUs. Note, there's a small functional change as kvm_release_page_clean() marks the page/folio as accessed. While it's not st

[PATCH v12 39/84] KVM: x86/mmu: Add common helper to handle prefetching SPTEs

2024-07-26 Thread Sean Christopherson
Deduplicate the prefetching code for indirect and direct MMUs. The core logic is the same, the only difference is that indirect MMUs need to prefetch SPTEs one-at-a-time, as contiguous guest virtual addresses aren't guaranteed to yield contiguous guest physical addresses. Signed-off-by: Sean Chri

[PATCH v12 40/84] KVM: x86/mmu: Add helper to "finish" handling a guest page fault

2024-07-26 Thread Sean Christopherson
Add a helper to finish/complete the handling of a guest page fault, e.g. to mark the pages accessed and put any held references. In the near future, this will allow improving the logic without having to copy+paste changes into all page fault paths. And in the less near future, will allow sharing the "f

[PATCH v12 41/84] KVM: x86/mmu: Mark pages/folios dirty at the origin of make_spte()

2024-07-26 Thread Sean Christopherson
Move the marking of folios dirty from make_spte() out to its callers, which have access to the _struct page_, not just the underlying pfn. Once all architectures follow suit, this will allow removing KVM's ugly hack where KVM elevates the refcount of VM_MIXEDMAP pfns that happen to be struct page m

[PATCH v12 42/84] KVM: Move declarations of memslot accessors up in kvm_host.h

2024-07-26 Thread Sean Christopherson
Move the memslot lookup helpers further up in kvm_host.h so that they can be used by inlined "to pfn" wrappers. No functional change intended. Signed-off-by: Sean Christopherson --- include/linux/kvm_host.h | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/include/linu

[PATCH v12 43/84] KVM: Add kvm_faultin_pfn() to specifically service guest page faults

2024-07-26 Thread Sean Christopherson
Add a new dedicated API, kvm_faultin_pfn(), for servicing guest page faults, i.e. for getting pages/pfns that will be mapped into the guest via an mmu_notifier-protected KVM MMU. Keep struct kvm_follow_pfn buried in internal code, as having __kvm_faultin_pfn() take "out" params is actually cleaner

[PATCH v12 44/84] KVM: x86/mmu: Convert page fault paths to kvm_faultin_pfn()

2024-07-26 Thread Sean Christopherson
Convert KVM x86 to use the recently introduced __kvm_faultin_pfn(). Opportunistically capture the refcounted_page grabbed by KVM for use in future changes. No functional change intended. Signed-off-by: Sean Christopherson --- arch/x86/kvm/mmu/mmu.c | 14 ++ arch/x86/kvm/mmu/

[PATCH v12 45/84] KVM: guest_memfd: Provide "struct page" as output from kvm_gmem_get_pfn()

2024-07-26 Thread Sean Christopherson
Provide the "struct page" associated with a guest_memfd pfn as an output from __kvm_gmem_get_pfn() so that KVM guest page fault handlers can directly put the page instead of having to rely on kvm_pfn_to_refcounted_page(). Signed-off-by: Sean Christopherson --- arch/x86/kvm/mmu/mmu.c | 2 +- a

[PATCH v12 46/84] KVM: x86/mmu: Put refcounted pages instead of blindly releasing pfns

2024-07-26 Thread Sean Christopherson
Now that all x86 page fault paths precisely track refcounted pages, use kvm_page_fault.refcounted_page to put references to struct page memory when finishing page faults. This is a baby step towards eliminating kvm_pfn_to_refcounted_page(). Signed-off-by: Sean Christopherson --- arch/x86/kv

[PATCH v12 47/84] KVM: x86/mmu: Don't mark unused faultin pages as accessed

2024-07-26 Thread Sean Christopherson
When finishing guest page faults, don't mark pages as accessed if KVM is resuming the guest _without_ installing a mapping, i.e. if the page isn't being used. While it's possible that marking the page accessed could avoid minor thrashing due to reclaiming a page that the guest is about to access,

[PATCH v12 48/84] KVM: Move x86's API to release a faultin page to common KVM

2024-07-26 Thread Sean Christopherson
Move KVM x86's helper that "finishes" the faultin process to common KVM so that the logic can be shared across all architectures. Note, not all architectures implement a fast page fault path, but the gist of the comment applies to all architectures. Signed-off-by: Sean Christopherson --- arch/x

[PATCH v12 49/84] KVM: VMX: Hold mmu_lock until page is released when updating APIC access page

2024-07-26 Thread Sean Christopherson
Hold mmu_lock across kvm_release_pfn_clean() when refreshing the APIC access page address to ensure that KVM doesn't mark a page/folio as accessed after it has been unmapped. Practically speaking, marking a folio accessed is benign in this scenario, as KVM does hold a reference (it's really just ma

[PATCH v12 50/84] KVM: VMX: Use __kvm_faultin_page() to get APIC access page/pfn

2024-07-26 Thread Sean Christopherson
Use __kvm_faultin_page() to get the APIC access page so that KVM can precisely release the refcounted page, i.e. to remove yet another user of kvm_pfn_to_refcounted_page(). While the path isn't handling a guest page fault, the semantics are effectively the same; KVM just happens to be mapping the pfn

[PATCH v12 51/84] KVM: PPC: e500: Mark "struct page" dirty in kvmppc_e500_shadow_map()

2024-07-26 Thread Sean Christopherson
Mark the underlying page as dirty in kvmppc_e500_ref_setup()'s sole caller, kvmppc_e500_shadow_map(), which will allow converting e500 to __kvm_faultin_pfn() + kvm_release_faultin_page() without having to do a weird dance between ref_setup() and shadow_map(). Opportunistically drop the redundant k

[PATCH v12 52/84] KVM: PPC: e500: Mark "struct page" pfn accessed before dropping mmu_lock

2024-07-26 Thread Sean Christopherson
Mark pages accessed before dropping mmu_lock when faulting in guest memory so that shadow_map() can convert to kvm_release_faultin_page() without tripping its lockdep assertion on mmu_lock being held. Marking pages accessed outside of mmu_lock is ok (not great, but safe), but marking pages _dirty_

[PATCH v12 53/84] KVM: PPC: e500: Use __kvm_faultin_pfn() to handle page faults

2024-07-26 Thread Sean Christopherson
Convert PPC e500 to use __kvm_faultin_pfn()+kvm_release_faultin_page(), and continue the inexorable march towards the demise of kvm_pfn_to_refcounted_page(). Signed-off-by: Sean Christopherson --- arch/powerpc/kvm/e500_mmu_host.c | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) dif

[PATCH v12 54/84] KVM: arm64: Mark "struct page" pfns accessed/dirty before dropping mmu_lock

2024-07-26 Thread Sean Christopherson
Mark pages/folios accessed+dirty prior to dropping mmu_lock, as marking a page/folio dirty after it has been written back can make some filesystems unhappy (backing KVM guests with such filesystem files is uncommon, and the race is minuscule, hence the lack of complaints). See the link below for d

[PATCH v12 55/84] KVM: arm64: Use __kvm_faultin_pfn() to handle memory aborts

2024-07-26 Thread Sean Christopherson
Convert arm64 to use __kvm_faultin_pfn()+kvm_release_faultin_page(). Three down, six to go. Signed-off-by: Sean Christopherson --- arch/arm64/kvm/mmu.c | 15 ++- 1 file changed, 6 insertions(+), 9 deletions(-) diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index ce13c3d884

[PATCH v12 56/84] KVM: RISC-V: Mark "struct page" pfns dirty iff a stage-2 PTE is installed

2024-07-26 Thread Sean Christopherson
Don't mark pages dirty if KVM bails from the page fault handler without installing a stage-2 mapping, i.e. if the page is guaranteed to not be written by the guest. In addition to being a (very) minor fix, this paves the way for converting RISC-V to use kvm_release_faultin_page(). Signed-off-by:

[PATCH v12 57/84] KVM: RISC-V: Mark "struct page" pfns accessed before dropping mmu_lock

2024-07-26 Thread Sean Christopherson
Mark pages accessed before dropping mmu_lock when faulting in guest memory so that RISC-V can convert to kvm_release_faultin_page() without tripping its lockdep assertion on mmu_lock being held. Marking pages accessed outside of mmu_lock is ok (not great, but safe), but marking pages _dirty_ outsi

[PATCH v12 58/84] KVM: RISC-V: Use kvm_faultin_pfn() when mapping pfns into the guest

2024-07-26 Thread Sean Christopherson
Convert RISC-V to __kvm_faultin_pfn()+kvm_release_faultin_page(), which are new APIs to consolidate arch code and provide consistent behavior across all KVM architectures. Signed-off-by: Sean Christopherson --- arch/riscv/kvm/mmu.c | 11 --- 1 file changed, 4 insertions(+), 7 deletions(-

[PATCH v12 59/84] KVM: PPC: Use __kvm_faultin_pfn() to handle page faults on Book3s HV

2024-07-26 Thread Sean Christopherson
Replace Book3s HV's homebrewed fault-in logic with __kvm_faultin_pfn(), which functionally does pretty much the exact same thing. Note, when the code was written, KVM indeed didn't do fast GUP without "!atomic && !async", but that has long since changed (KVM tries fast GUP for all writable mapping

[PATCH v12 60/84] KVM: PPC: Use __kvm_faultin_pfn() to handle page faults on Book3s Radix

2024-07-26 Thread Sean Christopherson
Replace Book3s Radix's homebrewed (read: copy+pasted) fault-in logic with __kvm_faultin_pfn(), which functionally does pretty much the exact same thing. Note, when the code was written, KVM indeed didn't do fast GUP without "!atomic && !async", but that has long since changed (KVM tries fast GUP f

[PATCH v12 61/84] KVM: PPC: Drop unused @kvm_ro param from kvmppc_book3s_instantiate_page()

2024-07-26 Thread Sean Christopherson
Drop @kvm_ro from kvmppc_book3s_instantiate_page() as it is now only written, and never read. No functional change intended. Signed-off-by: Sean Christopherson --- arch/powerpc/include/asm/kvm_book3s.h | 2 +- arch/powerpc/kvm/book3s_64_mmu_radix.c | 6 ++ arch/powerpc/kvm/book3s_hv_nested

[PATCH v12 62/84] KVM: PPC: Book3S: Mark "struct page" pfns dirty/accessed after installing PTE

2024-07-26 Thread Sean Christopherson
Mark pages/folios dirty/accessed after installing a PTE, and more specifically after acquiring mmu_lock and checking for an mmu_notifier invalidation. Marking a page/folio dirty after it has been written back can make some filesystems unhappy (backing KVM guests with such filesystem files is uncom

[PATCH v12 63/84] KVM: PPC: Use kvm_faultin_pfn() to handle page faults on Book3s PR

2024-07-26 Thread Sean Christopherson
Convert Book3S PR to __kvm_faultin_pfn()+kvm_release_faultin_page(), which are new APIs to consolidate arch code and provide consistent behavior across all KVM architectures. Signed-off-by: Sean Christopherson --- arch/powerpc/include/asm/kvm_book3s.h | 2 +- arch/powerpc/kvm/book3s.c

[PATCH v12 64/84] KVM: LoongArch: Mark "struct page" pfns dirty only in "slow" page fault path

2024-07-26 Thread Sean Christopherson
Mark pages/folios dirty only in the slow page fault path, i.e. only when mmu_lock is held and the operation is mmu_notifier-protected, as marking a page/folio dirty after it has been written back can make some filesystems unhappy (backing KVM guests with such filesystem files is uncommon, and the race

[PATCH v12 65/84] KVM: LoongArch: Mark "struct page" pfns accessed only in "slow" page fault path

2024-07-26 Thread Sean Christopherson
Mark pages accessed only in the slow path, before dropping mmu_lock when faulting in guest memory so that LoongArch can convert to kvm_release_faultin_page() without tripping its lockdep assertion on mmu_lock being held. Signed-off-by: Sean Christopherson --- arch/loongarch/kvm/mmu.c | 20 ++

[PATCH v12 66/84] KVM: LoongArch: Mark "struct page" pfn accessed before dropping mmu_lock

2024-07-26 Thread Sean Christopherson
Mark pages accessed before dropping mmu_lock when faulting in guest memory so that LoongArch can convert to kvm_release_faultin_page() without tripping its lockdep assertion on mmu_lock being held. Signed-off-by: Sean Christopherson --- arch/loongarch/kvm/mmu.c | 2 +- 1 file changed, 1 insertio

[PATCH v12 67/84] KVM: LoongArch: Use kvm_faultin_pfn() to map pfns into the guest

2024-07-26 Thread Sean Christopherson
Convert LoongArch to kvm_faultin_pfn()+kvm_release_faultin_page(), which are new APIs to consolidate arch code and provide consistent behavior across all KVM architectures. Signed-off-by: Sean Christopherson --- arch/loongarch/kvm/mmu.c | 14 ++ 1 file changed, 6 insertions(+), 8 del

[PATCH v12 68/84] KVM: MIPS: Mark "struct page" pfns dirty only in "slow" page fault path

2024-07-26 Thread Sean Christopherson
Mark pages/folios dirty only in the slow page fault path, i.e. only when mmu_lock is held and the operation is mmu_notifier-protected, as marking a page/folio dirty after it has been written back can make some filesystems unhappy (backing KVM guests with such filesystem files is uncommon, and the race

[PATCH v12 69/84] KVM: MIPS: Mark "struct page" pfns accessed only in "slow" page fault path

2024-07-26 Thread Sean Christopherson
Mark pages accessed only in the slow page fault path in order to remove an unnecessary user of kvm_pfn_to_refcounted_page(). Marking pages accessed in the primary MMU during KVM page fault handling isn't harmful, but it's largely pointless and likely a waste of cycles since the primary MMU will

[PATCH v12 70/84] KVM: MIPS: Mark "struct page" pfns accessed prior to dropping mmu_lock

2024-07-26 Thread Sean Christopherson
Mark pages accessed before dropping mmu_lock when faulting in guest memory so that MIPS can convert to kvm_release_faultin_page() without tripping its lockdep assertion on mmu_lock being held. Signed-off-by: Sean Christopherson --- arch/mips/kvm/mmu.c | 5 ++--- 1 file changed, 2 insertions(+),

[PATCH v12 71/84] KVM: MIPS: Use kvm_faultin_pfn() to map pfns into the guest

2024-07-26 Thread Sean Christopherson
Convert MIPS to kvm_faultin_pfn()+kvm_release_faultin_page(), which are new APIs to consolidate arch code and provide consistent behavior across all KVM architectures. Signed-off-by: Sean Christopherson --- arch/mips/kvm/mmu.c | 14 ++ 1 file changed, 6 insertions(+), 8 deletions(-)

[PATCH v12 72/84] KVM: PPC: Remove extra get_page() to fix page refcount leak

2024-07-26 Thread Sean Christopherson
Don't manually do get_page() when patching dcbz, as gfn_to_page() gifts the caller a reference. I.e. doing get_page() will leak the page due to not putting all references. Signed-off-by: Sean Christopherson --- arch/powerpc/kvm/book3s_pr.c | 1 - 1 file changed, 1 deletion(-) diff --git a/arch

[PATCH v12 73/84] KVM: PPC: Use kvm_vcpu_map() to map guest memory to patch dcbz instructions

2024-07-26 Thread Sean Christopherson
Use kvm_vcpu_map() when patching dcbz in guest memory, as a regular GUP isn't technically sufficient when writing to data in the target pages. As per Documentation/core-api/pin_user_pages.rst: Correct (uses FOLL_PIN calls): pin_user_pages() write to the data within the pa

[PATCH v12 74/84] KVM: Convert gfn_to_page() to use kvm_follow_pfn()

2024-07-26 Thread Sean Christopherson
Convert gfn_to_page() to the new kvm_follow_pfn() internal API, which will eventually allow removing gfn_to_pfn() and kvm_pfn_to_refcounted_page(). Signed-off-by: Sean Christopherson --- virt/kvm/kvm_main.c | 16 +--- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/virt

[PATCH v12 75/84] KVM: Add support for read-only usage of gfn_to_page()

2024-07-26 Thread Sean Christopherson
Rework gfn_to_page() to support read-only accesses so that it can be used by arm64 to get MTE tags out of guest memory. Opportunistically rewrite the comment to be even more stern about using gfn_to_page(), as there are very few scenarios where requiring a struct page is actually the right thing t

[PATCH v12 76/84] KVM: arm64: Use __gfn_to_page() when copying MTE tags to/from userspace

2024-07-26 Thread Sean Christopherson
Use __gfn_to_page() instead when copying MTE tags between guest and userspace. This will eventually allow removing gfn_to_pfn_prot(), gfn_to_pfn(), kvm_pfn_to_refcounted_page(), and related APIs. Signed-off-by: Sean Christopherson --- arch/arm64/kvm/guest.c | 21 + 1 file ch

[PATCH v12 77/84] KVM: PPC: Explicitly require struct page memory for Ultravisor sharing

2024-07-26 Thread Sean Christopherson
Explicitly require "struct page" memory when sharing memory between guest and host via an Ultravisor. Given the number of pfn_to_page() calls in the code, it's safe to assume that KVM already requires that the pfn returned by gfn_to_pfn() is backed by struct page, i.e. this is likely a bug fix, no
