Re: [RFC PATCH 19/30] vfio/pci: Add TSM TDI bind/unbind IOCTLs for TEE-IO support

2025-06-06 Thread Jason Gunthorpe
On Fri, Jun 06, 2025 at 03:02:49PM +0530, Aneesh Kumar K.V wrote: > Jason Gunthorpe writes: > > > On Thu, Jun 05, 2025 at 09:47:01PM +0530, Aneesh Kumar K.V wrote: > >> Jason Gunthorpe writes: > >> > >> > On Thu, Jun 05, 2025 a

Re: [RFC PATCH 19/30] vfio/pci: Add TSM TDI bind/unbind IOCTLs for TEE-IO support

2025-06-05 Thread Jason Gunthorpe
On Thu, Jun 05, 2025 at 09:47:01PM +0530, Aneesh Kumar K.V wrote: > Jason Gunthorpe writes: > > > On Thu, Jun 05, 2025 at 05:33:52PM +0530, Aneesh Kumar K.V wrote: > > > >> > + > >> > +/* To ensure no host side MMIO

Re: [RFC PATCH 19/30] vfio/pci: Add TSM TDI bind/unbind IOCTLs for TEE-IO support

2025-06-05 Thread Jason Gunthorpe
On Thu, Jun 05, 2025 at 05:33:52PM +0530, Aneesh Kumar K.V wrote: > > + > > + /* To ensure no host side MMIO access is possible */ > > + ret = pci_request_regions_exclusive(pdev, "vfio-pci-tsm"); > > + if (ret) > > + goto out_unlock; > > + > > > > I am hitting failures here with s

Re: [RFC PATCH 19/30] vfio/pci: Add TSM TDI bind/unbind IOCTLs for TEE-IO support

2025-06-05 Thread Jason Gunthorpe
On Thu, Jun 05, 2025 at 05:41:17PM +0800, Xu Yilun wrote: > No, this is not device side TDISP requirement. It is host side > requirement to fix DMA silent drop issue. TDX enforces CPU S2 PT share > with IOMMU S2 PT (does ARM do the same?), so unmap CPU S2 PT in KVM equals > unmap IOMMU S2 PT. > >

Re: [PATCH 07/12] mm: Remove redundant pXd_devmap calls

2025-06-05 Thread Jason Gunthorpe
On Wed, Jun 04, 2025 at 07:35:24PM -0700, Dan Williams wrote: > If all dax pages are special, then vm_normal_page() should never find > them and gup should fail. > > ...oh, but vm_normal_page_p[mu]d() is not used in the gup path, and > 'special' is not set in the pte path. That seems really subo

Re: [bug report] drm/xe/svm: Implement prefetch support for SVM ranges

2025-06-04 Thread Jason Gunthorpe
On Wed, Jun 04, 2025 at 04:54:43PM +0200, Simona Vetter wrote: > On Tue, Jun 03, 2025 at 07:29:52PM -0300, Jason Gunthorpe wrote: > > On Mon, May 26, 2025 at 10:15:17PM +0530, Ghimiray, Himal Prasad wrote: > > > > > > > > > On 26-05-2025 20:36, Dan Carpent

Re: [RFC PATCH 17/30] iommufd/device: Add TSM Bind/Unbind for TIO support

2025-06-04 Thread Jason Gunthorpe
On Wed, Jun 04, 2025 at 02:10:43PM +0530, Aneesh Kumar K.V wrote: > Jason Gunthorpe writes: > > > On Tue, Jun 03, 2025 at 02:20:51PM +0800, Xu Yilun wrote: > >> > Wouldn’t it be simpler to skip the reference count increment altogether > >> > and just cal

Re: [bug report] drm/xe/svm: Implement prefetch support for SVM ranges

2025-06-03 Thread Jason Gunthorpe
On Mon, May 26, 2025 at 10:15:17PM +0530, Ghimiray, Himal Prasad wrote: > > > On 26-05-2025 20:36, Dan Carpenter wrote: > > Hello Himal Prasad Ghimiray, > > > > Commit 09ba0a8f06cd ("drm/xe/svm: Implement prefetch support for SVM > > ranges") from May 13, 2025 (linux-next), leads to the followin

Re: [PATCH 12/12] mm/memremap: Remove unused devmap_managed_key

2025-06-03 Thread Jason Gunthorpe
On Thu, May 29, 2025 at 04:32:13PM +1000, Alistair Popple wrote: > It's no longer used so remove it. > > Signed-off-by: Alistair Popple > --- > mm/memremap.c | 27 --- > 1 file changed, 27 deletions(-) Reviewed-by: Jason Gunthorpe Jason

Re: [PATCH 11/12] mm: Remove callers of pfn_t functionality

2025-06-03 Thread Jason Gunthorpe
toph Hellwig Yay! Reviewed-by: Jason Gunthorpe Jason

Re: [PATCH 10/12] mm: Remove devmap related functions and page table bits

2025-06-03 Thread Jason Gunthorpe
le.h | 19 +-- > mm/Kconfig| 4 +- > mm/debug_vm_pgtable.c | 59 + > mm/hmm.c | 3 +- > mm/madvise.c | 8 +-- > 25 files changed, 17 insertions(+), 318 deletions(-) Reviewed-by: Jason Gunthorpe Jason

Re: [PATCH 09/12] powerpc: Remove checks for devmap pages and PMDs/PUDs

2025-06-03 Thread Jason Gunthorpe
- > arch/powerpc/mm/book3s64/radix_pgtable.c | 5 ++--- > arch/powerpc/mm/pgtable.c| 2 +- > 6 files changed, 10 insertions(+), 14 deletions(-) Reviewed-by: Jason Gunthorpe Jason

Re: [PATCH 08/12] mm/khugepaged: Remove redundant pmd_devmap() check

2025-06-03 Thread Jason Gunthorpe
> > Signed-off-by: Alistair Popple > --- > mm/khugepaged.c | 2 -- > 1 file changed, 2 deletions(-) Reviewed-by: Jason Gunthorpe Jason

Re: [PATCH 07/12] mm: Remove redundant pXd_devmap calls

2025-06-03 Thread Jason Gunthorpe
2 +- > mm/mremap.c| 5 ++--- > mm/page_vma_mapped.c | 5 ++--- > mm/pagewalk.c | 8 +++- > mm/pgtable-generic.c | 7 +++ > mm/userfaultfd.c | 4 ++-- > mm/vmscan.c| 3 --- > 15 files changed, 40 insertions(+), 66 deletions(-) Reviewed-by: Jason Gunthorpe Jason

Re: [PATCH 06/12] mm/gup: Remove pXX_devmap usage from get_user_pages()

2025-06-03 Thread Jason Gunthorpe
yway so can be removed. > > Signed-off-by: Alistair Popple > --- > include/linux/huge_mm.h | 3 +- > mm/gup.c| 162 + > mm/huge_memory.c| 40 +-- > 3 files changed, 5 insertions(+), 200 deletions(-) Reviewed-by: Jason Gunthorpe Jason

Re: [PATCH 05/12] mm: Remove remaining uses of PFN_DEV

2025-06-03 Thread Jason Gunthorpe
| 2 +- > include/linux/pfn_t.h | 25 ++--- > mm/memory.c| 4 ++-- > 7 files changed, 11 insertions(+), 36 deletions(-) Reviewed-by: Jason Gunthorpe Jason

Re: [PATCH 04/12] mm: Convert vmf_insert_mixed() from using pte_devmap to pte_special

2025-06-03 Thread Jason Gunthorpe
.c| 3 --- > mm/memory.c | 20 ++-- > mm/vmscan.c | 2 +- > 3 files changed, 3 insertions(+), 22 deletions(-) Reviewed-by: Jason Gunthorpe Jason

Re: [PATCH 03/12] mm/pagewalk: Skip dax pages in pagewalk

2025-06-03 Thread Jason Gunthorpe
On Thu, May 29, 2025 at 04:32:04PM +1000, Alistair Popple wrote: > Previously dax pages were skipped by the pagewalk code as pud_special() or > vm_normal_page{_pmd}() would be false for DAX pages. Now that dax pages are > refcounted normally that is no longer the case, so add explicit checks to > s

Re: [PATCH 02/12] mm: Convert pXd_devmap checks to vma_is_dax

2025-06-03 Thread Jason Gunthorpe
2 +- > mm/userfaultfd.c | 2 +- > 3 files changed, 3 insertions(+), 3 deletions(-) Reviewed-by: Jason Gunthorpe Jason

Re: [PATCH 01/12] mm: Remove PFN_MAP, PFN_SG_CHAIN and PFN_SG_LAST

2025-06-03 Thread Jason Gunthorpe
Christoph Hellwig > --- > include/linux/pfn_t.h | 31 +++ > mm/memory.c | 2 -- > tools/testing/nvdimm/test/iomap.c | 4 > 3 files changed, 3 insertions(+), 34 deletions(-) Reviewed-by: Jason Gunthorpe Jason

Re: [RFC PATCH 17/30] iommufd/device: Add TSM Bind/Unbind for TIO support

2025-06-03 Thread Jason Gunthorpe
On Tue, Jun 03, 2025 at 02:20:51PM +0800, Xu Yilun wrote: > > Wouldn’t it be simpler to skip the reference count increment altogether > > and just call tsm_unbind in the virtual device’s destroy callback? > > (iommufd_vdevice_destroy()) > > The vdevice refcount is the main concern, there is also a

Re: [RFC PATCH 00/30] Host side (KVM/VFIO/IOMMUFD) support for TDISP using TSM

2025-06-02 Thread Jason Gunthorpe
On Thu, May 29, 2025 at 01:34:43PM +0800, Xu Yilun wrote: > This series has 3 sections: I really think this is too big to try to progress, even in RFC form. > Patch 1 - 11 deal with the private MMIO mapping in KVM MMU via DMABUF. > Leverage Jason & Vivek's latest VFIO dmabuf series [3], see Pat

Re: [RFC PATCH 10/30] vfio/pci: Export vfio dma-buf specific info for importers

2025-06-02 Thread Jason Gunthorpe
On Thu, May 29, 2025 at 01:34:53PM +0800, Xu Yilun wrote: > Export vfio dma-buf specific info by attaching vfio_dma_buf_data in > struct dma_buf::priv. Provide a helper vfio_dma_buf_get_data() for > importers to fetch these data. Exporters identify VFIO dma-buf by > successfully getting these data.

Re: [RFC PATCH 00/12] Private MMIO support for private assigned dev

2025-05-29 Thread Jason Gunthorpe
On Thu, May 29, 2025 at 10:41:15PM +0800, Xu Yilun wrote: > > On AMD, the host can "revoke" at any time, at worst it'll see RMP > > events from IOMMU. Thanks, > > Is the RMP event firstly detected by host or guest? If by host, > host could fool guest by just suppress the event. Guest thought the

Re: [RFC PATCH 00/12] Private MMIO support for private assigned dev

2025-05-16 Thread Jason Gunthorpe
On Fri, May 16, 2025 at 02:19:45PM +0800, Xu Yilun wrote: > > I don't know why you'd disable a viommu while the VM is running, > > doesn't make sense. > > Here it means remove the CC setup for viommu, shared setup is still > kept. That might make sense for the vPCI function, but not the vIOMMU.

Re: [RFC PATCH 00/12] Private MMIO support for private assigned dev

2025-05-15 Thread Jason Gunthorpe
On Fri, May 16, 2025 at 02:02:29AM +0800, Xu Yilun wrote: > > IMHO, I think it might be helpful that you can picture out what are the > > minimum requirements (function/life cycle) to the current IOMMUFD TSM > > bind architecture: > > > > 1.host tsm_bind (preparation) is in IOMMUFD, triggered by Q

Re: [RFC PATCH 00/12] Private MMIO support for private assigned dev

2025-05-15 Thread Jason Gunthorpe
On Fri, May 16, 2025 at 12:04:04AM +0800, Xu Yilun wrote: > > arches this was mostly invisible to the hypervisor? > > Attest & Accept can be invisible to hypervisor, or host just help pass > data blobs between guest, firmware & device. > > Bind cannot be host agnostic, host should be aware not to

Re: [RFC PATCH 00/12] Private MMIO support for private assigned dev

2025-05-14 Thread Jason Gunthorpe
On Wed, May 14, 2025 at 03:02:53PM +0800, Xu Yilun wrote: > > We have an awkward fit for what CCA people are doing to the various > > Linux APIs. Looking somewhat maximally across all the arches a "bind" > > for a CC vPCI device creation operation does: > > > > - Setup the CPU page tables for the

Re: [RFC PATCH 00/12] Private MMIO support for private assigned dev

2025-05-12 Thread Jason Gunthorpe
On Mon, May 12, 2025 at 07:30:21PM +1000, Alexey Kardashevskiy wrote: > > > I'm surprised by this.. iommufd shouldn't be doing PCI stuff, it is > > > just about managing the translation control of the device. > > > > I have a little difficulty to understand. Is TSM bind PCI stuff? To me > > it is

Re: [RFC PATCH 00/12] Private MMIO support for private assigned dev

2025-05-09 Thread Jason Gunthorpe
On Sat, May 10, 2025 at 12:28:48AM +0800, Xu Yilun wrote: > On Fri, May 09, 2025 at 07:12:46PM +0800, Xu Yilun wrote: > > On Fri, May 09, 2025 at 01:04:58PM +1000, Alexey Kardashevskiy wrote: > > > Ping? > > > > Sorry for late reply from vacation. > > > > > Also, since there is pushback on 01/12

Re: [RFC 9/9] {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl

2025-05-07 Thread Jason Gunthorpe
On Wed, May 07, 2025 at 03:49:15PM -0400, Rodrigo Vivi wrote: > One last thing since I have your attention here. Was any time in the previous > fwctl discussions talked about the possibility of some extra usages for like > FW flashing or in-field-repair/tests where big data needs to filled bypassi

Re: [RFC 9/9] {fwctl,drm}/xe/pcode: Introduce xe_pcode_fwctl

2025-05-06 Thread Jason Gunthorpe
On Tue, Apr 29, 2025 at 09:39:56PM +0530, Badal Nilawar wrote: > diff --git a/drivers/gpu/drm/xe/xe_pcode_fwctl.c > b/drivers/gpu/drm/xe/xe_pcode_fwctl.c > new file mode 100644 I really do prefer it if you can find a way to put the code in drivers/fwctl instead of in DRM subsystem. > +static int

Re: [PATCH v3 03/33] iommu/io-pgtable-arm: Add quirk to quiet WARN_ON()

2025-04-29 Thread Jason Gunthorpe
On Tue, Apr 29, 2025 at 06:58:32AM -0700, Rob Clark wrote: > On Tue, Apr 29, 2025 at 5:28 AM Jason Gunthorpe wrote: > > > > On Mon, Apr 28, 2025 at 01:54:10PM -0700, Rob Clark wrote: > > > From: Rob Clark > > > > > > In situations where mapp

Re: [PATCH v3 03/33] iommu/io-pgtable-arm: Add quirk to quiet WARN_ON()

2025-04-29 Thread Jason Gunthorpe
On Mon, Apr 28, 2025 at 01:54:10PM -0700, Rob Clark wrote: > From: Rob Clark > > In situations where mapping/unmapping sequence can be controlled by > userspace, attempting to map over a region that has not yet been > unmapped is an error. But not something that should spam dmesg. I think if you

Re: [PATCH 0/3] uio/dma-buf: Give UIO users access to DMA addresses.

2025-04-22 Thread Jason Gunthorpe
On Mon, Apr 14, 2025 at 09:21:25PM +0200, Thomas Petazzoni wrote: > > "UIO is a broken legacy mess, so let's add more broken things > > to it as broken + broken => still broken, so no harm done", am I > > getting that right? > > Who says UIO is a "broken legacy mess"? Only you says so. I don't se

Re: [PATCH v2 0/4] kbuild: resurrect generic header check facility

2025-04-08 Thread Jason Gunthorpe
On Tue, Apr 08, 2025 at 09:42:36PM +0300, Jani Nikula wrote: > On Tue, 08 Apr 2025, Jason Gunthorpe wrote: > > On Tue, Apr 08, 2025 at 11:27:58AM +0300, Jani Nikula wrote: > >> On Mon, 07 Apr 2025, Jason Gunthorpe wrote: > >> > On Mon, Apr 07, 2025 at 10:17

Re: [PATCH v2 0/4] kbuild: resurrect generic header check facility

2025-04-08 Thread Jason Gunthorpe
On Tue, Apr 08, 2025 at 11:27:58AM +0300, Jani Nikula wrote: > On Mon, 07 Apr 2025, Jason Gunthorpe wrote: > > On Mon, Apr 07, 2025 at 10:17:40AM +0300, Jani Nikula wrote: > > > >> Even with Jason's idea [1], you *still* have to start small and opt-in > >>

Re: [PATCH v2 0/4] kbuild: resurrect generic header check facility

2025-04-07 Thread Jason Gunthorpe
On Mon, Apr 07, 2025 at 10:17:40AM +0300, Jani Nikula wrote: > Even with Jason's idea [1], you *still* have to start small and opt-in > (i.e. the patch series at hand). You can't just start off by testing > every header in one go, because it's a flag day switch. You'd add something like 'make he

Re: [git pull] drm for 6.15-rc1

2025-04-04 Thread Jason Gunthorpe
On Mon, Mar 31, 2025 at 01:03:38PM +0200, Simona Vetter wrote: > Hi Linus, > > On Mon, Mar 31, 2025 at 01:17:28PM +0300, Jani Nikula wrote: > > On Fri, 28 Mar 2025, Linus Torvalds wrote: > > > If you want to do that hdrtest thing, do it as part of your *own* > > > checks. Don't make everybody els

Re: [git pull] drm for 6.15-rc1

2025-04-02 Thread Jason Gunthorpe
On Wed, Apr 02, 2025 at 04:41:44PM +0200, Simona Vetter wrote: > - Gradually roll this out, ideally with support in main Kbuild so it > doesn't have to be replicated. No one said flag day, you'd have to approach the same way everyone else has done when adding new compiler errors and warnings to

Re: [git pull] drm for 6.15-rc1

2025-04-02 Thread Jason Gunthorpe
On Wed, Apr 02, 2025 at 03:56:37PM +0300, Jani Nikula wrote: > On Tue, 01 Apr 2025, Jason Gunthorpe wrote: > > On Tue, Apr 01, 2025 at 10:42:35PM +0300, Jani Nikula wrote: > >> On Tue, 01 Apr 2025, Jason Gunthorpe wrote: > >> > So, I'd suggest a better way to

Re: [git pull] drm for 6.15-rc1

2025-04-01 Thread Jason Gunthorpe
On Tue, Apr 01, 2025 at 10:42:35PM +0300, Jani Nikula wrote: > On Tue, 01 Apr 2025, Jason Gunthorpe wrote: > > So, I'd suggest a better way to run this is first build the kernel, > > then mine the gcc -MD output (ie stored in the .XX.cmd files) to > > generate a list of

Re: [git pull] drm for 6.15-rc1

2025-04-01 Thread Jason Gunthorpe
On Wed, Apr 02, 2025 at 03:46:34AM +0900, Masahiro Yamada wrote: > However, it is annoying to make every header self-contained > "just because we are checking this". From my POV it is not "just because we are checking this", I have a very deliberate reason for wanting headers to be self contained:

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-03-21 Thread Jason Gunthorpe
On Fri, Mar 21, 2025 at 01:12:30PM +0100, Danilo Krummrich wrote: > Not all device resources are managed in the context of the subsystem, so > subsystem-level revokes do not apply. They could, you could say that these rust APIs are only safe to use for device drivers with C code providing a fence

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-03-21 Thread Jason Gunthorpe
On Fri, Mar 21, 2025 at 11:35:40AM +0100, Simona Vetter wrote: > On Wed, Mar 19, 2025 at 02:21:32PM -0300, Jason Gunthorpe wrote: > > On Thu, Mar 13, 2025 at 03:32:14PM +0100, Simona Vetter wrote: > > > > > So I think you can still achieve that building on top of revoca

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-03-19 Thread Jason Gunthorpe
On Thu, Mar 13, 2025 at 03:32:14PM +0100, Simona Vetter wrote: > So I think you can still achieve that building on top of revocable and a > few more abstractions that are internally unsafe. Or are you thinking of > different runtime checks? I'm thinking on the access side of the revocable you don

Re: [RFC PATCH 08/12] vfio/pci: Create host unaccessible dma-buf for private device

2025-03-17 Thread Jason Gunthorpe
On Tue, Mar 11, 2025 at 06:37:13PM -0700, Dan Williams wrote: > > There is a use case for using TDISP and getting devices up into an > > encrypted/attested state on pure bare metal without any KVM, VFIO > > should work in that use case too. > > Are you sure you are not confusing the use case for n

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-03-07 Thread Jason Gunthorpe
On Fri, Mar 07, 2025 at 02:09:12PM +0100, Simona Vetter wrote: > > A driver can do a health check immediately in remove() and make a > > decision if the device is alive or not to speed up removal in the > > hostile hot unplug case. > > Hm ... I guess when you get an all -1 read you check with a s

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-03-07 Thread Jason Gunthorpe
On Fri, Mar 07, 2025 at 04:19:30PM +0100, Greg KH wrote: > Just like other busses, if PCI can't handle this at the core hotplug > layer (i.e. by giving up new resources to new devices) then the bus core > for it should handle this type of locking scheme as really, that feels > wrong. A new device

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-03-07 Thread Jason Gunthorpe
On Fri, Mar 07, 2025 at 03:00:09PM +0100, Greg KH wrote: > On Fri, Mar 07, 2025 at 08:32:55AM -0400, Jason Gunthorpe wrote: > > On Fri, Mar 07, 2025 at 11:28:37AM +0100, Simona Vetter wrote: > > > > > > I wouldn't say it is wrong. It is still the correct thing t

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-03-07 Thread Jason Gunthorpe
On Fri, Mar 07, 2025 at 11:28:37AM +0100, Simona Vetter wrote: > > I wouldn't say it is wrong. It is still the correct thing to do, and > > following down the normal cleanup paths is a good way to ensure the > > special case doesn't have bugs. The primary difference is you want to > > understand t

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-03-06 Thread Jason Gunthorpe
On Thu, Mar 06, 2025 at 11:42:38AM +0100, Simona Vetter wrote: > > Further, I just remembered, (Danilo please notice!) there is another > > related issue here that DMA mappings *may not* outlive remove() > > either. netdev had a bug related to this recently and it was all > > agreed that it is not

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-03-05 Thread Jason Gunthorpe
On Wed, Mar 05, 2025 at 08:30:34AM +0100, Simona Vetter wrote: > - developers who want to quickly test new driver versions without full > reboot. They're often preferring convenience over correctness, like with > the removal of module refcounting that's strictly needed but means they > first

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-03-04 Thread Jason Gunthorpe
On Tue, Mar 04, 2025 at 05:10:45PM +0100, Simona Vetter wrote: > On Fri, Feb 28, 2025 at 02:40:13PM -0400, Jason Gunthorpe wrote: > > On Fri, Feb 28, 2025 at 11:52:57AM +0100, Simona Vetter wrote: > > > > > - Nuke the driver binding manually through sysfs with the unbind

Re: [PATCH 0/4] cover-letter: Allow MMIO regions to be exported through dmabuf

2025-03-04 Thread Jason Gunthorpe
On Tue, Mar 04, 2025 at 03:29:42PM +0100, Christian König wrote: > Am 26.02.25 um 14:38 schrieb Jason Gunthorpe: > > On Wed, Feb 26, 2025 at 07:55:07AM +, Kasireddy, Vivek wrote: > > > >>> Is there any update or ETA for the v3? Are there any ways we can help? >

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-03-03 Thread Jason Gunthorpe
On Mon, Mar 03, 2025 at 08:36:34PM +0100, Danilo Krummrich wrote: > > > And yes, for *device resources* it is unsound if we do not ensure that the > > > device resource is actually dropped at device unbind. > > > > Why not do a runtime validation then? > > > > It would be easy to have an atomic c

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-28 Thread Jason Gunthorpe
On Fri, Feb 28, 2025 at 11:52:57AM +0100, Simona Vetter wrote: > - Nuke the driver binding manually through sysfs with the unbind files. > - Nuke all userspace that might be holding files and other resources open. > - At this point the module refcount should be zero and you can unload it. > > Exce

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-28 Thread Jason Gunthorpe
On Thu, Feb 27, 2025 at 11:40:53PM +0100, Danilo Krummrich wrote: > On Thu, Feb 27, 2025 at 06:00:13PM -0400, Jason Gunthorpe wrote: > > On Thu, Feb 27, 2025 at 01:25:10PM -0800, Boqun Feng wrote: > > > > > > Most of the cases, it should be naturally achieved, because

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-28 Thread Jason Gunthorpe
, 2025 at 5:02 PM PST, Greg KH wrote: > > >> > On Wed, Feb 26, 2025 at 07:47:30PM -0400, Jason Gunthorpe wrote: > > ... > > > nova is just a drm driver, it's not a rewrite of the drm subsystem, > > > that sort of effort would entail a much larger commitment

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-27 Thread Jason Gunthorpe
On Thu, Feb 27, 2025 at 01:25:10PM -0800, Boqun Feng wrote: > > The design pattern says that 'share it with the rest of the world' is > > a bug. A driver following the pattern cannot do that, it must contain > > the driver objects within the driver scope and free them. In C we > > I cannot speak f

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-27 Thread Jason Gunthorpe
On Thu, Feb 27, 2025 at 06:32:15PM +0100, Danilo Krummrich wrote: > On Thu, Feb 27, 2025 at 08:55:09AM -0800, Boqun Feng wrote: > > On Thu, Feb 27, 2025 at 12:17:33PM -0400, Jason Gunthorpe wrote: > > > > > I still wonder why you couldn't also have these reliable ref

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-27 Thread Jason Gunthorpe
On Thu, Feb 27, 2025 at 07:18:02AM -0800, Boqun Feng wrote: > On Thu, Feb 27, 2025 at 10:46:18AM -0400, Jason Gunthorpe wrote: > > On Wed, Feb 26, 2025 at 04:41:08PM -0800, Boqun Feng wrote: > > > And if you don't store the HrTimerHandle anywhere, like you drop() it >

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-27 Thread Jason Gunthorpe
On Thu, Feb 27, 2025 at 12:32:45PM +0100, Danilo Krummrich wrote: > On Wed, Feb 26, 2025 at 07:47:30PM -0400, Jason Gunthorpe wrote: > > On Wed, Feb 26, 2025 at 10:31:10PM +0100, Danilo Krummrich wrote: > > > Let's take a step back and look again why we have Devres (an

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-27 Thread Jason Gunthorpe
On Wed, Feb 26, 2025 at 04:41:08PM -0800, Boqun Feng wrote: > And if you don't store the HrTimerHandle anywhere, like you drop() it > right after start a hrtimer, it will immediately stop the timer. Does > this make sense? Oh, I understand that, but it is not sufficient in the kernel. You are mak

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-27 Thread Jason Gunthorpe
On Wed, Feb 26, 2025 at 05:02:23PM -0800, Greg KH wrote: > On Wed, Feb 26, 2025 at 07:47:30PM -0400, Jason Gunthorpe wrote: > > The way misc device works you can't unload the module until all the > > FDs are closed and the misc code directly handles races with opening > >

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-26 Thread Jason Gunthorpe
On Wed, Feb 26, 2025 at 10:31:10PM +0100, Danilo Krummrich wrote: > Let's take a step back and look again why we have Devres (and Revocable) for > e.g. pci::Bar. > > The device / driver model requires that device resources are only held by a > driver, as long as the driver is bound to the device.

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-26 Thread Jason Gunthorpe
On Wed, Feb 26, 2025 at 02:16:58AM +0100, Danilo Krummrich wrote: > > DRM achieves this, in part, by using drm_dev_unplug(). > > No, DRM can have concurrent driver code running, which is why drm_dev_enter() > returns whether the device is unplugged already, such that subsequent > operations, (e.g.

Re: [PATCH 0/4] cover-letter: Allow MMIO regions to be exported through dmabuf

2025-02-26 Thread Jason Gunthorpe
On Wed, Feb 26, 2025 at 07:55:07AM +, Kasireddy, Vivek wrote: > > Is there any update or ETA for the v3? Are there any ways we can help? > I believe Leon's series is very close to getting merged. Once it > lands, this series can be revived. The recent drama has made what happens next unclear

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-25 Thread Jason Gunthorpe
On Wed, Feb 26, 2025 at 12:45:45AM +0100, Danilo Krummrich wrote: > On Tue, Feb 25, 2025 at 06:57:56PM -0400, Jason Gunthorpe wrote: > > The common driver shutdown process in the kernel, that is well tested > > and copied, makes the driver single threaded during the remove() > >

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-25 Thread Jason Gunthorpe
Another colleague told me RDMA also uses SRCU for a similar purpose as > > > well. > > > > See the reasoning against SRCU from Sima [1], what's the reasoning of RDMA? > > > > [1] https://lore.kernel.org/nouveau/Z7XVfnnrRKrtQbB6@phenom.ffwll.local/ > For RDMA, I

Re: On community influencing (was Re: [PATCH v8 2/2] rust: add dma coherent allocator abstraction.)

2025-02-20 Thread Jason Gunthorpe
On Thu, Feb 20, 2025 at 05:24:01PM +0100, Simona Vetter wrote: > Better analogy aside, I fundamentally disagree with understanding > maintainership as a gatekeeper role that exists to keep the chaos out. My > goal is to help build a community where people enjoy collaborating, and > then gtfo so I d

[PATCH rc] gpu: host1x: Do not assume that a NULL domain means no DMA IOMMU

2025-02-04 Thread Jason Gunthorpe
ssume that a NULL domain means no DMA IOMMU"). Fixes: c8cc2655cc6c ("iommu/tegra-smmu: Implement an IDENTITY domain") Reported-by: Diogo Ivo Closes: https://lore.kernel.org/all/c6a6f114-3acd-4d56-a13b-b88978e92...@tecnico.ulisboa.pt/ Tested-by: Diogo Ivo Signed-off-by: Jason Gu

Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

2025-02-04 Thread Jason Gunthorpe
On Tue, Feb 04, 2025 at 03:29:48PM +0100, Thomas Hellström wrote: > On Tue, 2025-02-04 at 09:26 -0400, Jason Gunthorpe wrote: > > On Tue, Feb 04, 2025 at 10:32:32AM +0100, Thomas Hellström wrote: > > > > > > > > 1) Existing users would never use the callb

Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

2025-02-04 Thread Jason Gunthorpe
On Tue, Feb 04, 2025 at 10:32:32AM +0100, Thomas Hellström wrote: > > I would not be happy to see this. Please improve pagemap directly if > > you think you need more things. > > These are mainly helpers to migrate and populate a range of cpu memory > space (struct mm_struct) with GPU device_priva

Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

2025-02-03 Thread Jason Gunthorpe
On Fri, Jan 31, 2025 at 05:59:26PM +0100, Simona Vetter wrote: > So one aspect where I don't like the pgmap->owner approach much is that > it's a big thing to get right, and it feels a bit to me that we don't yet > know the right questions. Well, I would say it isn't really complete yet. No drive

Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

2025-01-30 Thread Jason Gunthorpe
On Thu, Jan 30, 2025 at 05:09:44PM +0100, Simona Vetter wrote: > > You could also use an integer instead of a pointer to indicate the > > cluster of interconnect, I think there are many options.. > > Hm yeah I guess an integer allocator of the atomic_inc kind plus "surely > 32bit is enough" could

Re: [PATCH v1 08/12] mm/rmap: handle device-exclusive entries correctly in try_to_unmap_one()

2025-01-30 Thread Jason Gunthorpe
On Thu, Jan 30, 2025 at 02:06:12PM +0100, Simona Vetter wrote: > On Thu, Jan 30, 2025 at 12:08:42PM +0100, David Hildenbrand wrote: > > On 30.01.25 11:10, Simona Vetter wrote: > > > On Wed, Jan 29, 2025 at 12:54:06PM +0100, David Hildenbrand wrote: > > > > Ever since commit b756a3b5e7ea ("mm: devic

Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

2025-01-30 Thread Jason Gunthorpe
On Thu, Jan 30, 2025 at 11:50:27AM +0100, Simona Vetter wrote: > On Wed, Jan 29, 2025 at 09:47:57AM -0400, Jason Gunthorpe wrote: > > On Wed, Jan 29, 2025 at 02:38:58PM +0100, Simona Vetter wrote: > > > > > > The pgmap->owner doesn't *have* to fixed, certainl

Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

2025-01-29 Thread Jason Gunthorpe
On Wed, Jan 29, 2025 at 02:38:58PM +0100, Simona Vetter wrote: > > The pgmap->owner doesn't *have* to fixed, certainly during early boot before > > you hand out any page references it can be changed. I wouldn't be > > surprised if this is useful to some requirements to build up the > > private int

Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

2025-01-28 Thread Jason Gunthorpe
On Tue, Jan 28, 2025 at 05:32:23PM +0100, Thomas Hellström wrote: > > This series supports three case: > > > >  1) pgmap->owner == range->dev_private_owner > >     This is "driver private fast interconnect" in this case HMM > > should > >     immediately return the page. The calling driver underst

Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

2025-01-28 Thread Jason Gunthorpe
On Tue, Jan 28, 2025 at 03:48:54PM +0100, Thomas Hellström wrote: > On Tue, 2025-01-28 at 09:20 -0400, Jason Gunthorpe wrote: > > On Tue, Jan 28, 2025 at 09:51:52AM +0100, Thomas Hellström wrote: > > > > > How would the pgmap device know whether P2P is actually possible

Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages

2025-01-28 Thread Jason Gunthorpe
On Tue, Jan 28, 2025 at 09:51:52AM +0100, Thomas Hellström wrote: > How would the pgmap device know whether P2P is actually possible > without knowing the client device, (like calling pci_p2pdma_distance) > and also if looking into access control, whether it is allowed? The DMA API will do this,

Re: [Question] Are "device exclusive non-swap entries" / "SVM atomics in Nouveau" still getting used in practice?

2025-01-24 Thread Jason Gunthorpe
On Fri, Jan 24, 2025 at 11:44:28AM +0100, David Hildenbrand wrote: > There are other concerns I have (what if the page is pinned and access > outside of the user space page tables?). Maybe there was no need to handle > these cases so far. I think a lot of this depends on userspace following some

Re: [RFC PATCH 01/12] dma-buf: Introduce dma_buf_get_pfn_unlocked() kAPI

2025-01-23 Thread Jason Gunthorpe
On Thu, Jan 23, 2025 at 04:48:29PM +0100, Christian König wrote: >No, no there are much more cases where drivers simply assume that they >are in the same iommu domain for different devices. This is an illegal assumption and invalid way to use the DMA API. Do not do that, do not architect

Re: [RFC PATCH 01/12] dma-buf: Introduce dma_buf_get_pfn_unlocked() kAPI

2025-01-23 Thread Jason Gunthorpe
On Thu, Jan 23, 2025 at 03:35:21PM +0100, Christian König wrote: > Sending it as text mail once more. > > Am 23.01.25 um 15:32 schrieb Christian König: > > Am 23.01.25 um 14:59 schrieb Jason Gunthorpe: > > > On Wed, Jan 22, 2025 at 03:59:11PM +0100, Christian König wrote:

Re: [RFC PATCH 01/12] dma-buf: Introduce dma_buf_get_pfn_unlocked() kAPI

2025-01-23 Thread Jason Gunthorpe
On Wed, Jan 22, 2025 at 03:59:11PM +0100, Christian König wrote: > > > For example we have cases with multiple devices are in the same IOMMU > > > domain > > > and re-using their DMA address mappings. > > IMHO this is just another flavour of "private" address flow between > > two cooperating drive

Re: [RFC PATCH 08/12] vfio/pci: Create host unaccessible dma-buf for private device

2025-01-23 Thread Jason Gunthorpe
On Thu, Jan 23, 2025 at 03:41:58PM +0800, Xu Yilun wrote: > I don't have a complete idea yet. But the goal is not to make any > existing driver seamlessly work with secure device. It is to provide a > generic way for bind/attestation/accept, and may save driver's effort > if they don't care about

Re: [RFC PATCH 01/12] dma-buf: Introduce dma_buf_get_pfn_unlocked() kAPI

2025-01-22 Thread Jason Gunthorpe
On Wed, Jan 22, 2025 at 02:29:09PM +0100, Christian König wrote: > I'm having all kinds of funny phenomena with AMD's mail servers since coming > back from xmas vacation. :( A few years back our IT fully migrated our email into the Office 365 cloud and gave up all the crazy half on-prem stuff they

Re: [RFC PATCH 01/12] dma-buf: Introduce dma_buf_get_pfn_unlocked() kAPI

2025-01-22 Thread Jason Gunthorpe
On Wed, Jan 22, 2025 at 12:04:19PM +0100, Simona Vetter wrote: > I'm kinda leaning towards entirely separate dma-buf interfaces for the new > phyr stuff, because I fear that adding that to the existing ones will only > make the chaos worse. Let's see when some patches come up, if importers have t

Re: [RFC PATCH 08/12] vfio/pci: Create host unaccessible dma-buf for private device

2025-01-22 Thread Jason Gunthorpe
On Wed, Jan 22, 2025 at 12:32:56PM +0800, Xu Yilun wrote: > On Tue, Jan 21, 2025 at 01:43:03PM -0400, Jason Gunthorpe wrote: > > On Tue, Jun 25, 2024 at 05:12:10AM +0800, Xu Yilun wrote: > > > > > When VFIO works as a TEE user in VM, it means an attester (e.g. PCI >

Re: [RFC PATCH 08/12] vfio/pci: Create host unaccessible dma-buf for private device

2025-01-21 Thread Jason Gunthorpe
On Tue, Jun 25, 2024 at 05:12:10AM +0800, Xu Yilun wrote: > When VFIO works as a TEE user in VM, it means an attester (e.g. PCI > subsystem) has already moved the device to RUN state. So VFIO & DPDK > are all TEE users, no need to manipulate TDISP state between them. > AFAICS, this is the most pre

Re: [RFC PATCH 01/12] dma-buf: Introduce dma_buf_get_pfn_unlocked() kAPI

2025-01-21 Thread Jason Gunthorpe
On Tue, Jan 21, 2025 at 05:11:32PM +0100, Simona Vetter wrote: > On Mon, Jan 20, 2025 at 03:48:04PM -0400, Jason Gunthorpe wrote: > > On Mon, Jan 20, 2025 at 07:50:23PM +0100, Simona Vetter wrote: > > > On Mon, Jan 20, 2025 at 01:59:01PM -0400, Jason Gunthorpe wrote: > >

Re: [RFC PATCH 01/12] dma-buf: Introduce dma_buf_get_pfn_unlocked() kAPI

2025-01-20 Thread Jason Gunthorpe
On Mon, Jan 20, 2025 at 07:50:23PM +0100, Simona Vetter wrote: > On Mon, Jan 20, 2025 at 01:59:01PM -0400, Jason Gunthorpe wrote: > > On Mon, Jan 20, 2025 at 01:14:12PM +0100, Christian König wrote: > > What is going wrong with your email? You replied to Simona, but > > Simo

Re: [RFC PATCH 01/12] dma-buf: Introduce dma_buf_get_pfn_unlocked() kAPI

2025-01-20 Thread Jason Gunthorpe
On Mon, Jan 20, 2025 at 01:14:12PM +0100, Christian König wrote: What is going wrong with your email? You replied to Simona, but Simona Vetter is dropped from the To/CC list??? I added the address back, but seems like a weird thing to happen. > Please take another look at what is proposed here.

Re: [RFC PATCH 08/12] vfio/pci: Create host unaccessible dma-buf for private device

2025-01-20 Thread Jason Gunthorpe
On Mon, Jan 20, 2025 at 08:45:51PM +1100, Alexey Kardashevskiy wrote: > > For CC I'm expecting the KVM fd to be the handle for the cVM, so any > > RPCs that want to call into the secure world need the KVM FD to get > > the cVM's identifier. Ie a "bind to cVM" RPC will need the PCI > > information

Re: [RFC PATCH 08/12] vfio/pci: Create host unaccessible dma-buf for private device

2025-01-20 Thread Jason Gunthorpe
On Mon, Jun 24, 2024 at 03:59:53AM +0800, Xu Yilun wrote: > > But it also seems to me that VFIO should be able to support putting > > the device into the RUN state > > Firstly I think VFIO should support putting device into *LOCKED* state. > From LOCKED to RUN, there are many evidence fetching and

Re: [RFC PATCH 08/12] vfio/pci: Create host unaccessible dma-buf for private device

2025-01-17 Thread Jason Gunthorpe
On Fri, Jan 17, 2025 at 09:57:40AM +0800, Baolu Lu wrote: > On 1/15/25 21:01, Jason Gunthorpe wrote: > > On Wed, Jan 15, 2025 at 11:57:05PM +1100, Alexey Kardashevskiy wrote: > > > On 15/1/25 00:35, Jason Gunthorpe wrote: > > > > On Tue, Jun 18, 2024 at 07

Re: [RFC PATCH 01/12] dma-buf: Introduce dma_buf_get_pfn_unlocked() kAPI

2025-01-16 Thread Jason Gunthorpe
On Thu, Jan 16, 2025 at 04:13:13PM +0100, Christian König wrote: >> But this, fundamentally, is importers creating attachments and then >> *ignoring the lifetime rules of DMABUF*. If you created an attachment, >> got a move and *ignored the move* because you put the PFN in your own >> VMA, then you

Re: [RFC PATCH 01/12] dma-buf: Introduce dma_buf_get_pfn_unlocked() kAPI

2025-01-16 Thread Jason Gunthorpe
On Thu, Jan 16, 2025 at 06:33:48AM +0100, Christoph Hellwig wrote: > On Wed, Jan 15, 2025 at 09:34:19AM -0400, Jason Gunthorpe wrote: > > > Or do you mean some that don't have pages associated with them, and > > > thus have pfn_valid fail on them? They still have a PFN,

Re: [RFC PATCH 01/12] dma-buf: Introduce dma_buf_get_pfn_unlocked() kAPI

2025-01-15 Thread Jason Gunthorpe
On Wed, Jan 15, 2025 at 05:34:23PM +0100, Christian König wrote: > Granted, let me try to improve this. > Here is a real world example of one of the issues we ran into and why > CPU mappings of importers are redirected to the exporter. > We have a good bunch of different exporters who t
