Ben-Ami Yassour wrote:
On Tue, 2008-06-17 at 16:29 -0500, Anthony Liguori wrote:
I think the current VT-d code needs some reworking.
We should build the table as the shadow page table gets built. We
should suppress iotlb flushes unless the table is actually being updated.
I'm not sure what you mean.
The current implementation of VT-d for passthrough is a direct map, which
means that we map the entire guest memory (and pin it).
In this case there are no iotlb flushes after the first initialization.
Right. But this is not ideal. Instead of pinning up-front, it would
make more sense IMHO to build the VT-d table as the shadow page table
gets faulted in. In certain circumstances, this will result in
extraneous updates (because a GPA=>HPA mapping is already present) and
that's where we should eliminate iotlb flushes.
For now, we should basically do this for all of physical memory but we
should have the right infrastructure such that we can be more clever
once we have a PVDMA API.
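To make the flush-suppression idea concrete, here is a minimal userspace sketch, not actual KVM or VT-d code. It assumes a toy I/O page table indexed by guest frame number; `toy_iommu`, `toy_iommu_map`, and the flush counter are hypothetical names invented for illustration. The map function is what would be called as the shadow page table gets faulted in, and it only counts an iotlb flush when the table really changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_GFNS   16           /* toy guest size in pages (assumption) */
#define NO_MAPPING UINT64_MAX   /* marker for "no entry present" */

/* Toy I/O page table: index is the guest frame, value is the host frame. */
struct toy_iommu {
    uint64_t hfn[NUM_GFNS];
    unsigned flushes;           /* number of simulated iotlb flushes */
};

static void toy_iommu_init(struct toy_iommu *io)
{
    for (size_t i = 0; i < NUM_GFNS; i++)
        io->hfn[i] = NO_MAPPING;
    io->flushes = 0;
}

/*
 * Hypothetical hook called when the shadow page table faults in gfn -> hfn.
 * Returns true if the I/O table was updated (and a flush was issued),
 * false if the same mapping was already present.
 */
static bool toy_iommu_map(struct toy_iommu *io, uint64_t gfn, uint64_t hfn)
{
    assert(gfn < NUM_GFNS);

    if (io->hfn[gfn] == hfn)
        return false;           /* extraneous update: skip the flush */

    io->hfn[gfn] = hfn;
    io->flushes++;              /* table changed: flush the iotlb */
    return true;
}
```

Faulting the same GPA=>HPA pair in twice leaves the flush count unchanged; only a new or changed mapping increments it.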
Obviously, pinning the entire guest is not desirable since we waste a
lot of memory resources, but this is the approach that we currently
have. Do you find it good enough to merge into the main KVM tree now and
optimize later?
No, it's not safe. What happens if someone does mmap(MAP_FIXED) into phys_ram_base? We
need to use MMU notifiers to handle such events and appropriately flush
the iotlb.
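As a rough sketch of what such an invalidation path has to do, the toy model below drops stale entries for a range of guest frames and flushes once. It does not use the real MMU notifier API; `toy_invalidate_range` and the table layout are hypothetical stand-ins, and the range is expressed in guest frame numbers purely for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_GFNS   16           /* toy guest size in pages (assumption) */
#define NO_MAPPING UINT64_MAX   /* marker for "no entry present" */

struct toy_iommu {
    uint64_t hfn[NUM_GFNS];     /* gfn -> hfn as seen by the device */
    unsigned flushes;           /* number of simulated iotlb flushes */
};

/*
 * Hypothetical invalidation callback: the host backing for guest frames
 * [start, end) is about to change (for example after a MAP_FIXED mmap).
 * Remove the stale entries, then flush once if anything was removed.
 * Returns the number of entries removed.
 */
static size_t toy_invalidate_range(struct toy_iommu *io,
                                   size_t start, size_t end)
{
    size_t removed = 0;

    assert(start <= end && end <= NUM_GFNS);

    for (size_t gfn = start; gfn < end; gfn++) {
        if (io->hfn[gfn] != NO_MAPPING) {
            io->hfn[gfn] = NO_MAPPING;
            removed++;
        }
    }

    if (removed > 0)
        io->flushes++;          /* one flush covers the whole range */

    return removed;
}
```

Invalidating a range that holds no entries costs no flush, which matches the goal of flushing only when the table is actually updated.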
When you mentioned building a table as the shadow page table, did you
mean that we should map the IOMMU on demand?
Yes, but in the absence of a PV guest, there's a very special case where
we pre-fault the entire table.
I'm not sure how we can do that... the guest can send a guest physical
address to the device for DMA, even without generating a page-fault on
the host for that address... which implies that the host must pin the
entire guest memory in advance. Agree?
See above. Ideally we would wait until the first PCI config space
access for a device before special casing the guest. Otherwise, there's
no way to allow a DMA-aware guest to avoid pinning up front.
The only way I can think of to avoid that is PVDMA with VT-d, which
means that there is a hypercall for each DMA request, but this is a
different solution, because it only applies to PV guests.
It doesn't strictly require a hypercall, but yes, that's the general
solution.
Do you see a way to avoid mapping (and pinning) the entire guest memory
for fully virtual guests (and without parsing every transaction between
the guest and the device to figure out the DMA addresses)?
The key is to support both cases with the same infrastructure. The
unmodified guest should just be a special case.
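Roughly, "same infrastructure" could look like the toy sketch below: one mapping path shared by both kinds of guest, with the unmodified guest handled by pre-faulting everything when the device is first touched. All names here (`toy_guest`, `toy_map_one`, `toy_assign_device`, the `dma_aware` flag) and the fixed-offset gfn-to-hfn translation are assumptions made for illustration, not existing KVM interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_GFNS   8            /* toy guest size in pages (assumption) */
#define NO_MAPPING UINT64_MAX   /* marker for "no entry present" */

struct toy_guest {
    uint64_t io_table[NUM_GFNS]; /* gfn -> hfn as seen by the device */
    bool dma_aware;              /* true for a hypothetical PVDMA guest */
    size_t pinned;               /* pages currently pinned */
};

static void toy_guest_init(struct toy_guest *g, bool dma_aware)
{
    for (size_t i = 0; i < NUM_GFNS; i++)
        g->io_table[i] = NO_MAPPING;
    g->dma_aware = dma_aware;
    g->pinned = 0;
}

/* Stand-in translation: a fixed offset replaces the real gfn -> hfn lookup. */
static uint64_t toy_gfn_to_hfn(uint64_t gfn)
{
    return gfn + 0x1000;
}

/* The single mapping path used by both kinds of guest. */
static void toy_map_one(struct toy_guest *g, uint64_t gfn)
{
    assert(gfn < NUM_GFNS);
    if (g->io_table[gfn] != NO_MAPPING)
        return;                  /* already mapped and pinned */
    g->io_table[gfn] = toy_gfn_to_hfn(gfn);
    g->pinned++;
}

/*
 * Called at the point where the device is first used (for example the
 * first PCI config space access). The unmodified guest is the special
 * case: pre-fault the whole table through the common path.
 */
static void toy_assign_device(struct toy_guest *g)
{
    if (g->dma_aware)
        return;                  /* guest requests mappings on demand */
    for (uint64_t gfn = 0; gfn < NUM_GFNS; gfn++)
        toy_map_one(g, gfn);
}
```

With this shape, the DMA-aware guest pins only what it asks for, while the unmodified guest ends up with every page pinned, without a separate code path for either.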
Regards,
Anthony Liguori
Regards,
Ben
Regards,
Anthony Liguori