On Mon, Sep 30, 2024 at 10:45:31AM +0000, Shameerali Kolothum Thodi wrote:
> > -----Original Message-----
> > From: Nicolin Chen <nicol...@nvidia.com>
> > Sent: Thursday, September 5, 2024 9:37 PM
> > To: Shameerali Kolothum Thodi <shameerali.kolothum.th...@huawei.com>
> > Cc: Eric Auger <eric.au...@redhat.com>; Mostafa Saleh
> > <smost...@google.com>; qemu-...@nongnu.org; qemu-
> > de...@nongnu.org; Peter Maydell <peter.mayd...@linaro.org>; Jason
> > Gunthorpe <j...@nvidia.com>; Jean-Philippe Brucker <jean-
> > phili...@linaro.org>; Moritz Fischer <m...@kernel.org>; Michael Shavit
> > <msha...@google.com>; Andrea Bolognani <abolo...@redhat.com>;
> > Michael S. Tsirkin <m...@redhat.com>; Peter Xu <pet...@redhat.com>
> > Subject: Re: nested-smmuv3 topic, Sep 2024
> >
> > Hi Shameer,
> >
> > Thanks for the reply!
> >
> > On Thu, Sep 05, 2024 at 12:55:52PM +0000, Shameerali Kolothum Thodi
> > wrote:
> > > > The main takeaway from the discussion is to
> > > > 1) Turn the vSMMU module into a pluggable one, like intel-iommu
> > > > 2) Move the per-SMMU pxb bus and device auto-assign into libvirt
> > > >
> > > > Apart from the multi-vSMMU thing, there's basic nesting series:
> > > > 0) Keep updating to the latest kernel uAPIs to support nesting
> > >
> > > By this you mean the old HWPT based nested-smmuv3 support?
> >
> > HWPT + vIOMMU. The for-viommu/virq branches that I shared in my
> > kernel series have those changes. Invalidations are done via the
> > vIOMMU infrastructure.
> >
> > > >
> > > > I was trying to do all these three, but apparently too ambitious.
> > > > The kernel side of work is still taking a lot of my bandwidth. So
> > > > far I had almost-zero progress on task (1) and completely-zero on
> > > > task (2).
> > > >
> > > > <-- Help Needed --->
> > > > So, I'm wondering if anyone(s) might have some extra bandwidth in
> > > > the following months helping these two tasks, either of which can
> > > > be a standalone project I think.
> > > >
> > > > For task (0), I think I can keep updating the uAPI part, although
> > > > it'd need some help for reviews, which I was hoping to occur after
> > > > Intel sends the QEMU nesting backend patches. Once we know how big
> > > > the rework is going to be, we may need to borrow some help at that
> > > > point once again..
> > >
> > > I might have some bandwidth starting October and can take a look at
> > > task 1 above. I haven't gone through the VIOMMU API model completely
> > > yet and plan to do that soon.
> >
> 
> I had an initial look at this and also had some discussions with Eric at KVM
> Forum (thanks, Eric!).

Wow, thanks to both of you!

> Going through the code, is it ok to introduce a "pci-bus" property for the proposed
> nested SMMUv3 device, which would create the link between the SMMUv3 device
> and the associated root complex (pxb-pcie)?
> 
> Something like below,
> 
> -device pxb-pcie,id=pcie.1,bus_nr=2,bus=pcie.0 \
> -device arm-nested-smmuv3,pci-bus=pcie.1 \
> -device pcie-root-port,id=pcie.port1,bus=pcie.1 \
> -device vfio-pci,host=0000:75:00.1,bus=pcie.port1 \
> ...
> -device pxb-pcie,id=pcie.2,bus_nr=8,bus=pcie.0 \
> -device arm-nested-smmuv3,pci-bus=pcie.2 \
> -device pcie-root-port,id=pcie.port2,bus=pcie.2 \
> -device vfio-pci,host=0000:75:00.2,bus=pcie.port2 \
> 
> This way we can invoke pci_setup_iommu() with the
> right PCIBus in the nested SMMUv3 device's realize fn.
> 
> Please let me know if this works/scales with all the use cases we have.

That looks nice to me. Hopefully, the IORT or Device Tree would be
easy to tie to the corresponding pci-bus as well.
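
For reference, a rough sketch of what that realize path could look like,
loosely modeled on the existing "primary-bus" link property and
smmu_base_realize() in hw/arm/smmu-common.c -- the type, state and
property names below are only illustrative, not an actual implementation:

static void arm_nested_smmuv3_realize(DeviceState *dev, Error **errp)
{
    /*
     * Hypothetical state struct; assumes a DEFINE_PROP_LINK("pci-bus", ...)
     * property of type TYPE_PCI_BUS, set on the command line as in the
     * example above: -device arm-nested-smmuv3,pci-bus=pcie.1
     */
    ARMNestedSMMUv3State *s = ARM_NESTED_SMMUV3(dev);

    if (!s->pci_bus) {
        error_setg(errp, "arm-nested-smmuv3: 'pci-bus' property not set");
        return;
    }

    /*
     * Register this vSMMU's IOMMU ops on that specific root complex only
     * (same call the current smmuv3 makes in smmu_base_realize()), so each
     * pxb-pcie gets its own vSMMU instance.
     */
    pci_setup_iommu(s->pci_bus, &smmu_ops, s);
}

That should also keep the IORT (or DT) side manageable, since each vSMMU
node can reference the bus range of its own pxb-pcie.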

> Also, Eric mentioned that when he initially added the SMMUv3 support, his
> first approach was a -device based solution, but he later changed it to a
> machine option based on review comments. I managed to find the link where
> this change was proposed (by Peter):
> 
> https://lore.kernel.org/all/cafeaca_h+srawnvhezc48es11n6dc9cyewtl44tperipbo+...@mail.gmail.com/
> 
> I hope the use cases we now have make it reasonable to introduce a "-device 
> arm-nested-smmuv3" model.
> Please let me know if there are still objections to going this way.

I assume so. With multiple SMMUv3 devices in the VM, we would need
that kind of flexibility to create them.

And FYI, I also found some resources at NVIDIA who will help me with
the QEMU workload, including our remaining task -- libvirt. I'll
align with them in the days ahead, and will keep all of us updated
afterwards.

Thanks!
Nicolin
