On Tue, Nov 13, 2018 at 11:37:07AM +0800, Peter Xu wrote:
> On Mon, Nov 12, 2018 at 05:42:01PM +0800, Yu Zhang wrote:
> > On Mon, Nov 12, 2018 at 04:36:34PM +0800, Peter Xu wrote:
> > > On Fri, Nov 09, 2018 at 07:49:46PM +0800, Yu Zhang wrote:
> > > > A 5-level paging capable VM may choose to use 57-bit IOVA address width.
> > > > E.g. guest applications like DPDK prefer to use its VA as IOVA when
> > > > performing VFIO map/unmap operations, to avoid the burden of managing
> > > > the IOVA space.
> > >
> > > Since you mentioned about DPDK... I'm just curious that whether have
> > > you tested the patchset with the 57bit-enabled machines with DPDK VA
> > > mode running in the guest? That would be something nice to mention in
> > > the cover letter if you have.
> > >
> >
> > Hah. Maybe I shall not mention DPDK here.
> >
> > The story is that we heard the requirement, saying applications like DPDK
> > would need 5-level paging in IOMMU side. And I was convinced after checked
> > DPDK code, seeing it may use VA as IOVA directly. But I did not test this
> > patch with DPDK.
> >
> > Instead, I used kvm-unit-test to verify this patch series. And of course, I
> > also did some modification to the test case. Patch for the test also sent
> > out at https://www.spinics.net/lists/kvm/msg177425.html.
>
> Yeah that's perfectly fine for me. So instead maybe you can also
> mention the kvm-unit-test in the cover letter if you gonna repost.

Got it. Thanks!

>
> > >
> > > [...]
> > >
> > > > @@ -3264,11 +3286,19 @@ static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
> > > >          }
> > > >      }
> > > >
> > > > -    /* Currently only address widths supported are 39 and 48 bits */
> > > > +    /* Currently address widths supported are 39, 48, and 57 bits */
> > > >      if ((s->aw_bits != VTD_AW_39BIT) &&
> > > > -        (s->aw_bits != VTD_AW_48BIT)) {
> > > > -        error_setg(errp, "Supported values for x-aw-bits are: %d, %d",
> > > > -                   VTD_AW_39BIT, VTD_AW_48BIT);
> > > > +        (s->aw_bits != VTD_AW_48BIT) &&
> > > > +        (s->aw_bits != VTD_AW_57BIT)) {
> > > > +        error_setg(errp, "Supported values for x-aw-bits are: %d, %d, %d",
> > > > +                   VTD_AW_39BIT, VTD_AW_48BIT, VTD_AW_57BIT);
> > > > +        return false;
> > > > +    }
> > > > +
> > > > +    if ((s->aw_bits == VTD_AW_57BIT) &&
> > > > +        !(host_has_la57() && guest_has_la57())) {
> > > > +        error_setg(errp, "Do not support 57-bit DMA address, unless both "
> > > > +                   "host and guest are capable of 5-level paging.\n");
> > > >
> > > Is there any context (or pointer to previous discussions would work
> > > too) on explaining why we don't support some scenarios like
> > > host_paw=48,guest_paw=48,guest_gaw=57?
> > >
> >
> > Well, above check is only to make sure both the host and the guest can
> > use 57bit linear address, which requires 5-level paging. So I believe
> > we do support scenarios like host_paw=48,guest_paw=48,guest_gaw=57.
> > The guest_has_la57() means the guest can use 57-bit linear address,
> > regardless of its physical address width.
>
> Sorry for my incorrect wording. I mean when host/guest CPU only
> support 4-level LA then would/should we allow the guest IOMMU to
> support 5-level IOVA? Asked since I'm thinking whether I can run the
> series a bit with my laptop/servers.

Well, by "only support", I guess you mean the hardware capability, instead
of its paging mode.
So I do not think hardware will support 5-level IOVA for platforms
without 5-level VA. Therefore a 5-level vIOMMU is disallowed here. :)

>
> Since at it, another thing I thought about is making sure the IOMMU
> capabilities will match between host and guest IOMMU, which I think
> this series has ignored so far. E.g., when we're having assigned
> devices in the guest and with 5-level IOVA, we should make sure the
> host IOMMU supports 5-level as well before the guest starts since
> otherwise the shadow page synchronization could potentially fail when
> the requested IOVA address goes beyond 4-level. One simple solution
> is just to disable device assignment for now when we're with 57bits
> vIOMMU but I'm not sure whether that's what you want, especially you
> mentioned the DPDK case (who may use assigned devices).
>

Thanks, Peter. Replied in the follow-up mail. :)

> (sorry to have mentioned the dpdk case again :)
>
> Regards,
>
> --
> Peter Xu
>

B.R.
Yu
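
P.S. For anyone who wants to reason about the check discussed above without building QEMU, here is a minimal standalone sketch of the same decision logic. It is not the patch itself: check_aw_bits(), aw_to_levels() and the enum names are hypothetical, and the two bool parameters merely stand in for the results of host_has_la57() and guest_has_la57(). The level calculation assumes the usual 4KiB page layout (12 offset bits plus 9 bits per paging level).

```c
#include <stdbool.h>

/* Address widths discussed in this thread; the values are the bit counts. */
enum { AW_39BIT = 39, AW_48BIT = 48, AW_57BIT = 57 };

/* Hypothetical result codes, not taken from the QEMU sources. */
enum aw_check { AW_OK, AW_BAD_WIDTH, AW_NEEDS_LA57 };

/*
 * Mirror of the two tests in the hunk: the width must be one of the three
 * known values, and 57 bits is accepted only when both sides have LA57.
 * host_la57/guest_la57 stand in for host_has_la57()/guest_has_la57().
 */
enum aw_check check_aw_bits(int aw_bits, bool host_la57, bool guest_la57)
{
    if (aw_bits != AW_39BIT && aw_bits != AW_48BIT && aw_bits != AW_57BIT) {
        return AW_BAD_WIDTH;
    }
    if (aw_bits == AW_57BIT && !(host_la57 && guest_la57)) {
        return AW_NEEDS_LA57;
    }
    return AW_OK;
}

/* 12 offset bits plus 9 bits per level: 39 -> 3, 48 -> 4, 57 -> 5. */
int aw_to_levels(int aw_bits)
{
    return (aw_bits - 12) / 9;
}
```

With this sketch, a 48-bit width passes regardless of LA57 support, while a 57-bit width is rejected as soon as either the host or the guest lacks 5-level paging, which matches the behaviour described in the reply above.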