Re: [Qemu-devel] [V6 0/4] AMD IOMMU

Jan Kiszka Tue, 01 Mar 2016 06:36:59 -0800

On 2016-03-01 15:30, Michael S. Tsirkin wrote:
> On Tue, Mar 01, 2016 at 03:18:05PM +0100, Jan Kiszka wrote:
>> On 2016-03-01 15:12, Jan Kiszka wrote:
>>> On 2016-03-01 14:55, Michael S. Tsirkin wrote:
>>>> On Tue, Mar 01, 2016 at 02:48:19PM +0100, Jan Kiszka wrote:
>>>>> On 2016-03-01 14:07, Michael S. Tsirkin wrote:
>>>>>> On Sun, Feb 21, 2016 at 09:10:56PM +0300, David Kiarie wrote:
>>>>>>> Hello there,
>>>>>>>
>>>>>>> Repost, AMD IOMMU patches version 6.
>>>>>>>
>>>>>>> Changes since version 5
>>>>>>>  -Fixed macro formatting issues
>>>>>>>  -Changed occurrences of IO MMU to IOMMU for consistency
>>>>>>>  -Fixed capability registers duplication
>>>>>>>  -Rebased to current master
>>>>>>>
>>>>>>> David Kiarie (4):
>>>>>>>   hw/i386: Introduce AMD IOMMU
>>>>>>>   hw/core: Add AMD IOMMU to machine properties
>>>>>>>   hw/i386: ACPI table for AMD IOMMU
>>>>>>>   hw/pci-host: Emulate AMD IOMMU
>>>>>>
>>>>>> I went over the AMD IOMMU spec.
>>>>>> I'm concerned that there appears to be no chance for it to
>>>>>> work correctly if the host caches invalid PTEs.
>>>>>>
>>>>>> The spec vaguely discusses write-protecting such PTEs but
>>>>>> that would be very complex if it can be made to work at all.
>>>>>>
>>>>>> This means that this can't work with e.g. VFIO.
>>>>>> It can only work with emulated devices.
>>>>>
>>>>> You mean it can't work if we program a real IOMMU (for VFIO) with
>>>>> translated data from the emulated one but cannot track any updates of
>>>>> the related page tables because the guest is not required to issue
>>>>> traceable flush requests? Hmm, too bad.
>>>>>
>>>>>>
>>>>>> OTOH VTD can easily support PTE shadowing by setting a flag.
>>>>>
>>>>> Do you mean RWBF=1 in the CAP register? Given that "Newer hardware
>>>>> implementations are expected to NOT require explicit software flushing
>>>>> of write buffers and report RWBF=0 in the Capability register", we may
>>>>> eventually run into guests that no longer check that flag if we expose
>>>>> something that looks like a "newer" implementation.
>>>>
>>>> Hopefully not, if that happens we'll have to do a PV IOMMU :)
>>>
>>> Please not.
>>>
>>>>
>>>>> However, this flag is not set right now in our VT-d model.
>>>>>>
>>>>>> I'd like us to find some way to avoid the possibility
>>>>>> of user error creating a configuration that mixes e.g.
>>>>>> vfio with the amd iommu.
>>>>>>
>>>>>> I'm not sure how to do this.
>>>>>>
>>>>>> Any idea?
>>>>>
>>>>> There is likely no way around write-protecting the IOMMU page tables (in
>>>>> KVM mode) once we have evaluated and cached them somewhere.
>>>>
>>>> Well for one, it's possible to use vt-d and not amd iommu.
>>>
>>> That would lead to nice combos of AMD CPUs with VT-d IOMMU. While it may
>>> be possible, I wouldn't rely on guests having tested that combination
>>> very well.
>>
>> To make the concern more concrete: I'm playing with code that will reuse
>> the MMU page tables for the IOMMU - the AMD architecture is designed for
>> that optimization (in contrast to Intel's). So, if the guest is not
>> foreseeing that artificial combo above (ours will definitely not)
>> because it is designed around the reuse, it will at least fail to run.
>>
>> Jan
> 
> So if you have an AMD iommu on the host and that is capable
> of 2-level translation, then the flushing problem
> can be fixed by a kind of iommu pass-through
> where you point the host's iommu to guest's page tables.


Yes, right, that could be another approach - provided the tables have
compatible entries. I haven't looked at the details of either so far, but
I wouldn't be overly optimistic. Usually, hardware is not designed very
well for interesting nesting purposes.

> 
> So maybe what you need to do is make it possible
> for a device to query the iommu and ask whether it
> supports devices caching invalid PTEs.
> If not, vfio could fail.

Makes sense.
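
A rough sketch of what such a query could look like, purely for
illustration. The struct, the field and the function names below are made
up for this mail and do not exist in QEMU; the only assumption carried over
from the discussion is that an emulated IOMMU either does or does not force
the guest to flush after touching a previously invalid PTE.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of an emulated IOMMU (not a QEMU type). */
typedef struct IOMMUModel {
    const char *name;
    /*
     * True if the model tells the guest that invalid PTEs may be cached,
     * so the guest has to issue a traceable flush after every mapping
     * change. Only then can a shadowing user such as vfio follow updates.
     */
    bool caches_invalid_ptes;
} IOMMUModel;

/*
 * Hypothetical check a vfio-like device would run when it is attached.
 * Returns 0 if the device can work behind the given IOMMU, -1 otherwise.
 * On failure, *errp (if provided) is set to a static message.
 */
static int vfio_check_iommu(const IOMMUModel *iommu, const char **errp)
{
    if (iommu == NULL) {
        /* No emulated IOMMU in front of the device: nothing to shadow. */
        return 0;
    }
    if (!iommu->caches_invalid_ptes) {
        if (errp != NULL) {
            *errp = "emulated IOMMU does not report mapping changes; "
                    "cannot be combined with vfio";
        }
        return -1;
    }
    return 0;
}
```

With something along these lines, the misconfiguration would be rejected
when the device is plugged instead of silently breaking DMA later.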

Jan

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux
