On Jan 29 14:13, Damien Hedde wrote:
>
>
> On 1/24/24 08:47, Hannes Reinecke wrote:
> > On 1/24/24 07:52, Philippe Mathieu-Daudé wrote:
> > > Hi Hannes,
> > >
> > > [+Markus as QOM/QDev rubber duck]
> > >
> > > On 23/1/24 13:40, Hannes Reinecke wrote:
> > > > On 1/23/24 11:59, Damien Hedde wrote:
> > > > > Hi all,
> > > > >
> > > > > We are currently looking into hotplugging nvme devices and
> > > > > it is currently not possible:
> > > > > When nvme was introduced 2 years ago, the feature was disabled.
> > > > > > commit cc6fb6bc506e6c47ed604fcb7b7413dff0b7d845
> > > > > > Author: Klaus Jensen
> > > > > > Date: Tue Jul 6 10:48:40 2021 +0200
> > > > > >
> > > > > > hw/nvme: mark nvme-subsys non-hotpluggable
> > > > > > We currently lack the infrastructure to handle
> > > > > > subsystem hotplugging, so
> > > > > > disable it.
> > > > >
> > > > > Do someone know what's lacking or anyone have some tips/idea
> > > > > of what we should develop to add the support ?
> > > > >
> > > > Problem is that the object model is messed up. In qemu
> > > > namespaces are attached to controllers, which in turn are
> > > > children of the PCI device.
> > > > There are subsystems, but these just reference the controller.
> > > >
> > > > So if you hotunplug the PCI device you detach/destroy the
> > > > controller and detach the namespaces from the controller.
> > > > But if you hotplug the PCI device again the NVMe controller will
> > > > be attached to the PCI device, but the namespace are still be
> > > > detached.
> > > >
> > > > Klaus said he was going to fix that, and I dimly remember some patches
> > > > floating around. But apparently it never went anywhere.
> > > >
> > > > Fundamental problem is that the NVMe hierarchy as per spec is
> > > > incompatible with the qemu object model; qemu requires a strict
> > > > tree model where every object has exactly _one_ parent.
> > >
> > > The modelling problem is not clear to me.
> > > Do you have an example of how the NVMe hierarchy should be?
> > >
> > Sure.
> >
> > As per NVMe spec we have this hierarchy:
> >
> >    ---> subsys ---
> >   |              |
> >   |              V
> > controller    namespaces
> >
> > There can be several controllers, and several
> > namespaces.
> > The initiator (ie the linux 'nvme' driver) connects
> > to a controller, queries the subsystem for the attached
> > namespaces, and presents each namespace as a block device.
> >
> > For Qemu we have the problem that every device _must_ be
> > a direct descendant of the parent (expressed by the fact
> > that each 'parent' object is embedded in the device object).
> >
> > So if we were to present a NVMe PCI device, the controller
> > must be derived from the PCI device:
> >
> > pci -> controller
> >
> > but now we have to express the NVMe hierarchy, too:
> >
> > pci -> ctrl1 -> subsys1 -> namespace1
> >
> > which actually works.
> > We can easily attach several namespaces:
> >
> > pci -> ctrl1 ->subsys1 -> namespace2
> >
> > For a single controller and a single subsystem.
> > However, as mentioned above, there can be _several_
> > controllers attached to the same subsystem.
> > So we can express the second controller:
> >
> > pci -> ctrl2
> >
> > but we cannot attach the controller to 'subsys1'
> > as then 'subsys1' would need to be derived from
> > 'ctrl2', and not (as it is now) from 'ctrl1'.
> >
> > The most logical step would be to have 'subsystems'
> > their own entity, independent of any controllers.
> > But then the block devices (which are derived from
> > the namespaces) could not be traced back
> > to the PCI device, and a PCI hotplug would not
> > 'automatically' disconnect the nvme block devices.
> >
> > Plus the subsystem would be independent from the NVMe
> > PCI devices, so you could have a subsystem with
> > no controllers attached. And one would wonder who
> > should be responsible for cleaning up that.
> >
>
> Thanks for the details !
>
> My use case is the simple one with no nvme subsystem/namespaces:
> - hotplug a pci nvme device (nvme controller) as in the following CLI (which
> automatically put the drive into a default namespace)
>
> ./qemu-system-aarch64 -nographic -M virt \
>    -drive file=nvme0.disk,if=none,id=nvme-drive0 \
>    -device nvme,serial=nvme0,id=nvmedev0,drive=nvme-drive0
>

AFAIK, you just need a pci root port to plug the device into.

  -drive file=nvme0.disk,if=none,id=nvme-drive0 \
  -device "pcie-root-port,id=pcie_root_port0,chassis=1,slot=0" \
  -device nvme,serial=nvme0,id=nvmedev0,drive=nvme-drive0

Then, you can use the qemu monitor to `device_del nvmedev0` and add it
with `device_add nvme,serial=nvme0,id=nvmedev0,drive=nvme-drive0`. The
"drive" (blockdev) will stick around after the device_del.
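
For reference, the whole unplug/replug sequence can be written down as a small script. This is only a sketch: it prints the QEMU arguments and the monitor commands rather than starting QEMU. The `bus=pcie_root_port0` property is not part of the command line above; it is an assumption added here so the controller is explicitly placed behind the root port, both at boot and on `device_add`. The helper function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: prints arguments and monitor commands, does not run QEMU.

ROOT_PORT=pcie_root_port0   # id of the pcie-root-port from the example
NVME_OPTS="nvme,serial=nvme0,id=nvmedev0,drive=nvme-drive0"

# Assumption (not in the original command): pin the controller to the
# root port with bus=<root port id>.
nvme_device_arg() {
    printf '%s,bus=%s\n' "$NVME_OPTS" "$ROOT_PORT"
}

# Arguments to append to the qemu-system-aarch64 invocation, one per line.
qemu_args() {
    printf '%s\n' \
        "-drive file=nvme0.disk,if=none,id=nvme-drive0" \
        "-device pcie-root-port,id=$ROOT_PORT,chassis=1,slot=0" \
        "-device $(nvme_device_arg)"
}

# Monitor (HMP) commands: unplug the controller, then plug it back in.
# The drive nvme-drive0 stays around in between, so it can be reused.
monitor_unplug() { printf 'device_del nvmedev0\n'; }
monitor_replug() { printf 'device_add %s\n' "$(nvme_device_arg)"; }

qemu_args
monitor_unplug
monitor_replug
```

The last two printed lines are meant to be typed at the monitor prompt (for example with `-monitor stdio`), one after the other, once the guest is up.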