On Tue, May 11, 2021 at 11:09:53AM +0100, Stefan Hajnoczi wrote:

> > > > +* *sub-regions* is the array of Sub-Region IO FD info structures
> > > > +
> > > > +The reply message will additionally include at least one file descriptor in the
> > > > +ancillary data. Note that more than one sub-region may share the same file
> > > > +descriptor.
> > >
> > > How does this interact with the maximum number of file descriptors,
> > > max_fds? It is possible that there are more sub-regions than max_fds
> > > allows...
> >
> > I think this would just be a matter of the client advertising a reasonably large
> > enough size for max_msg_fds. Do we need to worry about this?
>
> vhost-user historically only supported passing 8 fds and it became a
> problem there.
>
> I can imagine devices having 10s to 100s of sub-regions (e.g. 64 queue
> doorbells). Probably not 1000s.
>
> If I was implementing a server I would check the negotiated max_fds and
> refuse to start the vfio-user connection if the device has been
> configured to require more sub-regions. Failing early and printing an
> error would allow users to troubleshoot the issue and re-configure the
> client/server.
>
> This seems okay but the spec doesn't mention it explicitly so I wanted
> to check what you had in mind.

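For what it's worth, the early check described above could look roughly like
the sketch below. The struct and function names are made up for illustration
and are not libvfio-user API. It also assumes that a file descriptor shared by
several sub-regions only needs to be passed once per reply, which the spec text
quoted above does not state explicitly.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical description of one sub-region; not a libvfio-user type. */
struct subregion {
    int fd; /* backing file descriptor, may be shared between sub-regions */
};

/*
 * Count the distinct fds used by the sub-regions.
 * Assumption: a shared fd is sent only once in the ancillary data.
 * O(n^2), which is fine for tens to hundreds of sub-regions.
 */
static size_t count_distinct_fds(const struct subregion *sr, size_t n)
{
    size_t distinct = 0;

    for (size_t i = 0; i < n; i++) {
        size_t j;

        for (j = 0; j < i; j++) {
            if (sr[j].fd == sr[i].fd)
                break;
        }
        if (j == i) /* no earlier sub-region uses this fd */
            distinct++;
    }
    return distinct;
}

/*
 * Return 0 if all sub-region fds fit within the negotiated limit,
 * or -E2BIG (after printing an error) so the server can refuse to start.
 */
static int check_subregion_fds(const struct subregion *sr, size_t n,
                               size_t max_msg_fds)
{
    size_t needed = count_distinct_fds(sr, n);

    if (needed > max_msg_fds) {
        fprintf(stderr, "device needs %zu fds for sub-regions, but only "
                "%zu were negotiated (max_msg_fds)\n", needed, max_msg_fds);
        return -E2BIG;
    }
    return 0;
}
```

A server would call something like check_subregion_fds() once, right after
version negotiation, so a misconfiguration shows up at connection time rather
than when the client first asks for the region's IO fds.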
Not for the spec, but I filed
https://github.com/nutanix/libvfio-user/issues/489 to track this on the library
side. Thanks.

> Fleshing out irqs sounds like a 1.0 milestone to me. It will definitely
> be necessary but for now this can be dropped.

I could be wrong, and probably am, but I believe we're basically fine for IRQs
right now, until we want to support servers on separate hosts where we'll
obviously have to re-introduce something like the VM_INTERRUPT message.

> > > > +VFIO_USER_DEVICE_RESET
> > > > +----------------------
> > >
> > > Any requirements for how long VFIO_USER_DEVICE_RESET takes to complete?
> > > In some cases a reset involves the server communicating with other
> > > systems or components and this can take an unbounded amount of time.
> > > Therefore this message could hang. For example, if a vfio-user NVMe
> > > device was accessing data on a hung NFS export and there were I/O
> > > requests in flight that need to be aborted.
> >
> > I'm not sure this is something we could put in the generic spec. Perhaps a
> > caveat?
>
> It's up to you whether you want to discuss this in the spec or let
> client implementors figure it out themselves. Any vfio-user message can
> take an unbounded amount of time and we could assume that readers will
> think of this.

I'm going to start an "implementation notes" section.

regards
john