> -----Original Message----- > From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com] > Sent: Wednesday, July 12, 2017 3:48 PM > To: Bharat Bhushan <bharat.bhus...@nxp.com>; Auger Eric > <eric.au...@redhat.com>; eric.auger....@gmail.com; > peter.mayd...@linaro.org; alex.william...@redhat.com; m...@redhat.com; > qemu-...@nongnu.org; qemu-devel@nongnu.org > Cc: w...@redhat.com; kevin.t...@intel.com; marc.zyng...@arm.com; > t...@semihalf.com; will.dea...@arm.com; drjo...@redhat.com; > robin.mur...@arm.com; christoffer.d...@linaro.org > Subject: Re: [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device > > On 12/07/17 04:50, Bharat Bhushan wrote: > [...] > >> The size of the virtio_iommu_req_probe structure is variable, and > depends > >> what fields the device implements. So the device initially computes the > size it > >> needs to fill virtio_iommu_req_probe, describes it in probe_size, and the > >> driver allocates that many bytes for virtio_iommu_req_probe.content[] > >> > >>>> * When device offers VIRTIO_IOMMU_F_PROBE, the driver should > send > >> an > >>>> VIRTIO_IOMMU_T_PROBE request for each new endpoint. > >>>> * The driver allocates a device-writeable buffer of probe_size (plus > >>>> framing) and sends it as a VIRTIO_IOMMU_T_PROBE request. > >>>> * The device fills the buffer with various information. > >>>> > >>>> struct virtio_iommu_req_probe { > >>>> /* device-readable */ > >>>> struct virtio_iommu_req_head head; > >>>> le32 device; > >>>> le32 flags; > >>>> > >>>> /* maybe also le32 content_size, but it must be equal to > >>>> probe_size */ > >>> > >>> Can you please describe why we need to pass size of "probe_size" in > probe > >> request? > >> > >> We don't. I don't think we should add this 'content_size' field unless > >> there > is > >> a compelling reason to do so. > >> > >>>> > >>>> /* device-writeable */ > >>>> u8 content[]; > >>> > >>> I assume content_size above is the size of array "content[]" and max > value > >> can be equal to probe_size advertised by device? > >> > >> probe_size is exactly the size of array content[]. The driver must > >> allocate a > >> buffer of this size (plus the space needed for head, device, flags and > >> tail). > >> > >> Then the device is free to leave parts of content[] empty. Field 'type' 0 > >> will > be > >> reserved and mark the end of the array. > >> > >>>> struct virtio_iommu_req_tail tail; > >>>> }; > >>>> > >>>> I'm still struggling with the content and layout of the probe > >>>> request, and would appreciate any feedback. To be easily extended, I > >>>> think it should contain a list of fields of variable size: > >>>> > >>>> |0 15|16 31|32 N| > >>>> | type | length | values | > >>>> > >>>> 'length' might be made optional if it can be deduced from type, but > >>>> might make driver-side parsing more robust. > >>>> > >>>> The probe could either be done for each endpoint, or for each address > >>>> space. I much prefer endpoint because it is the smallest granularity. > >>>> The driver can then decide what endpoints to put together in the same > >>>> address space based on their individual capabilities. The > >>>> specification would described how each endpoint property is combined > >>>> when endpoints are put in the same address space. For example, take > >>>> the minimum of all PASID size, the maximum of all page granularities, > >>>> combine doorbell addresses, etc. > >>>> > >>>> If we did the probe on address spaces instead, the driver would have > >>>> to re-send a probe request each time a new endpoint is attached to an > >>>> existing address space, to see if it is still capable of page table > >>>> handover or if the driver just combined a VFIO and an emulated > >>>> endpoint by accident. > >>>> > >>>> *** > >>>> > >>>> Using this framework, the device can declare doorbell regions by > >>>> adding one or more RESV fields into the probe buffer: > >>>> > >>>> /* 'type' */ > >>>> #define VIRTIO_IOMMU_PROBE_T_RESV 0x1 > >>>> > >>>> /* 'values'. 'length' is sizeof(struct virtio_iommu_probe_resv) */ > >>>> struct virtio_iommu_probe_resv { > >>>> le64 gpa; > >>>> le64 size; > >>>> > >>>> #define VIRTIO_IOMMU_PROBE_RESV_MSI 0x1 > >>>> u8 type; > > > > To be sure I am understanding it correctly, Is this "type" in struct > virtio_iommu_req_head? > > No, virtio_iommu_req_head::type is the request type > (ATTACH/DETACH/MAP/UNMAP/PROBE). > > Then virtio_iommu_probe_property::type is the property type (only RESV > for > the moment). > > And this is virtio_iommu_probe_resv::type, which is the type of the resv > region (MSI). I renamed it to 'subtype' below, but I think it still is > pretty confusing. > > > I did a number of changes to structures and naming when trying to > integrate it to the specification: > > * Added 64 bytes of padding in virtio_iommu_req_probe, so that future > extensions can add fields in the device-readable part. > * renamed "RESV" to "RESV_MEM". > * The resv_mem property now looks like this: > struct virtio_iommu_probe_resv_mem { > u8 subtype; > u8 padding[3]; > le32 flags; > le64 addr; > le64 size; > }; > * subtype for MSI doorbells is now > VIRTIO_IOMMU_PROBE_RESV_MEM_T_BYPASS > (because transactions to this region bypass the IOMMU). 'flags' contain a > hint VIRTIO_IOMMU_PROBE_RESV_MEM_F_MSI, telling the driver that this > region is used for MSIs. > > Here is an example of a probe request returning an MSI doorbell property. > > 31 7 0 > +---------------------------------+ > | 0 | type | <- request type = PROBE (5) > +---------------------------------+ > | device | > +---------------------------------+ > : : > : (64B padding) : > : : > +---------------------------------+ > ^ | length = 24 | type = 1 | <- property type = RESV_MEM (1) > | +---------------------------------+ > | | 0 |subtype | <- RESV_MEM subtype = BYPASS (1) > p| +---------------------------------+ > r| | flags = MSI | > o| +---------------------------------+ > b| | addr = 0xfee00000 | > e| | | > _| +---------------------------------+ > s| | size = 0x00100000 | > i| | | > z| +---------------------------------+ > e| | length | type | <- another property may start > | : : here > v : ... : > +---------------------------------+ > | 0 | status | <- request tail > +---------------------------------+
So we want a single probe will return info of all "types" and each "subtype" of given "type"? I was of impression that based on flags there will be separate probe request for a type. Thanks -Bharat > > > I'll try to send the next version of the spec out as soon as possible. > > Thanks, > Jean > > > > Thanks > > -Bharat > > > >>> > >>> type is 16 bit above? > >> > >> Ah, the naming isn't great. This is not the same as above, and could be > called > >> 'subtype' to avoid confusion. The above 16-bit type describes the field > type, > >> e.g. struct virtio_iommu_probe_resv. I proposed 16-bit because it seems > >> easy to reach more than 255 kinds of endpoint properties, but > >> 65535 should do. > >> > >> This subtype describes which kind of resv region is described in the > structure. > >> For the moment there only is VIRTIO_IOMMU_PROBE_RESV_MSI, but we > >> could for example add resv regions that the driver should never use or > that it > >> should identity-map (equivalent to IOMMU_RESV_RESERVED/DIRECT in > >> Linux). I think 8 bits should be enough to contain any future types, unless > we > >> make this a bitfield. For identity-map, there may be an additional flags > field > >> describing the protection. > >> > >>>> }; > >>>> > >>>> Such a region would be subject to the following rules: > >>>> > >>>> * Driver should not use any IOVA declared as RESV_MSI in a mapping. > >>>> * Device should leave any transaction matching a RESV_MSI region pass > >>>> through untranslated. > >>>> * If the device does not advertise any RESV region, then the driver > >>>> should assume that MSI doorbells, like any other GPA, must be > mapped > >>>> with an arbitrary IOVA in order for the endpoint to access them. > >>>> * Given that the driver *should* perform a probe request if > >>>> available, and it *should* understand the > >> VIRTIO_IOMMU_PROBE_T_RESV > >>>> field, then this field tells the guest how it should handle MSI > >>>> doorbells, and whether it should map the address via MAP requests or > >> not. > >>>> > >>>> Does this make sense and did I overlook something? > >>> > >>> Overall it looks good to me. Do you have plans to implements this in > virtio- > >> iommu driver and kvmtool? > >> > >> Yes, if there is no objection I'll try to formalize it and implement it > >> right > away. > >> > >> Thanks, > >> Jean