Hi Tomeu, I have some bad news ... On Wed, 4 Jun 2025 at 08:57, Tomeu Vizoso <to...@tomeuvizoso.net> wrote: > +int rocket_ioctl_create_bo(struct drm_device *dev, void *data, struct > drm_file *file) > +{ > + [...] > + > + /* This will map the pages to the IOMMU linked to core 0 */ > + sgt = drm_gem_shmem_get_pages_sgt(shmem_obj); > + if (IS_ERR(sgt)) { > + ret = PTR_ERR(sgt); > + goto err; > + } > + > + /* Map the pages to the IOMMUs linked to the other cores, so all > cores can access this BO */
So, uh, this is not great. We only have a single IOMMU context (well, one per core, but one effective VMA) for the whole device. Every BO that gets created, gets mapped into the IOMMU until it's been destroyed. Given that there is no client isolation and no CS validation, that means that every client has RW access to every BO created by any other client, for the lifetime of that BO. I really don't think that this is tractable, given that anyone with access to the device can exfiltrate anything that anyone else has provided to the device. I also don't think that CS validation is tractable tbh. So I guess that leaves us with the third option: enforcing context separation within the kernel driver. The least preferable option I can think of is that rkt sets up and tears down MMU mappings for each job, according to the BO list provided for it. This seems like way too much overhead - especially with RK IOMMU ops having been slow enough within DRM that we expended a lot of effort in Weston doing caching of DRM BOs to avoid doing this unless completely necessary. It also seems risky wrt allocating memory in drm_sched paths to ensure forward progress. Slightly more preferable than this would be that rkt kept a per-context list of BOs and their VA mappings, and when switching between different contexts, would tear down all MMU mappings from the old context and set up mappings from the new. But this has the same issues with drm_sched. The most preferable option from where I sit is that we could have an explicit notion of driver-managed IOMMU contexts, such that rkt could prepare the IOMMU for each context, and then switching contexts at job-run time would be a matter of changing the root DTE pointer and issuing a flush. But I don't see that anywhere in the user-facing IOMMU API, and I'm sure Robin (CCed) will be along shortly to explain why it's not possible ... Either way, I wonder if we have fully per-context mappings, userspace should not manage IOVAs in the VM_BIND style common to newer drivers, rather than relying on the kernel to do VA allocation and inform userspace of them? I'm really sorry this has come so late in the game. Cheers, Daniel