On 08/12/2014 03:30 AM, Alexander Graf wrote:
>
> On 11.08.14 17:01, Alexey Kardashevskiy wrote:
>> On 08/11/2014 10:02 PM, Alexander Graf wrote:
>>> On 31.07.14 11:34, Alexey Kardashevskiy wrote:
>>>> This implements DDW for VFIO. Host kernel support is required for this.
>>>>
>>>> Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
>>>> ---
>>>>   hw/ppc/spapr_pci_vfio.c | 75 +++++++++++++++++++++++++++++++++++++++++++++++++
>>>>   1 file changed, 75 insertions(+)
>>>>
>>>> diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
>>>> index d3bddf2..dc443e2 100644
>>>> --- a/hw/ppc/spapr_pci_vfio.c
>>>> +++ b/hw/ppc/spapr_pci_vfio.c
>>>> @@ -69,6 +69,77 @@ static void spapr_phb_vfio_finish_realize(sPAPRPHBState *sphb, Error **errp)
>>>>       /* Register default 32bit DMA window */
>>>>       memory_region_add_subregion(&sphb->iommu_root, tcet->bus_offset,
>>>>                                   spapr_tce_get_iommu(tcet));
>>>> +
>>>> +    sphb->ddw_supported = !!(info.flags & VFIO_IOMMU_SPAPR_TCE_FLAG_DDW);
>>>> +}
>>>> +
>>>> +static int spapr_pci_vfio_ddw_query(sPAPRPHBState *sphb,
>>>> +                                    uint32_t *windows_available,
>>>> +                                    uint32_t *page_size_mask)
>>>> +{
>>>> +    sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
>>>> +    struct vfio_iommu_spapr_tce_query query = { .argsz = sizeof(query) };
>>>> +    int ret;
>>>> +
>>>> +    ret = vfio_container_ioctl(&sphb->iommu_as, svphb->iommugroupid,
>>>> +                               VFIO_IOMMU_SPAPR_TCE_QUERY, &query);
>>>> +    if (ret) {
>>>> +        return ret;
>>>> +    }
>>>> +
>>>> +    *windows_available = query.windows_available;
>>>> +    *page_size_mask = query.page_size_mask;
>>>> +
>>>> +    return ret;
>>>> +}
>>>> +
>>>> +static int spapr_pci_vfio_ddw_create(sPAPRPHBState *sphb, uint32_t page_shift,
>>>> +                                     uint32_t window_shift, uint32_t liobn,
>>>> +                                     sPAPRTCETable **ptcet)
>>>> +{
>>>> +    sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
>>>> +    struct vfio_iommu_spapr_tce_create create = {
>>>> +        .argsz = sizeof(create),
>>>> +        .page_shift = page_shift,
>>>> +        .window_shift = window_shift,
>>>> +        .start_addr = 0
>>>> +    };
>>>> +    int ret;
>>>> +
>>>> +    ret = vfio_container_ioctl(&sphb->iommu_as, svphb->iommugroupid,
>>>> +                               VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
>>>> +    if (ret) {
>>>> +        return ret;
>>>> +    }
>>>> +
>>>> +    *ptcet = spapr_tce_new_table(DEVICE(sphb), liobn, create.start_addr,
>>>> +                                 page_shift, 1 << (window_shift - page_shift),
>>> I spot a 1 without ULL again - this time it might work out ok, but please
>>> just always use ULL when you pass around addresses.
>>
>> My bad. I keep forgetting this, I'll adjust my own checkpatch.py :)
>>
>>
>>> Please walk me through the abstraction levels on what each page size
>>> honoration means. If I use THP, what page size granularity can I use for
>>> TCE entries?
>>
>> [RFC PATCH 06/10] spapr_rtas: Add Dynamic DMA windows (DDW) RTAS calls support
>>
>> +    const struct { int shift; uint32_t mask; } masks[] = {
>> +        { 12, DDW_PGSIZE_4K },
>> +        { 16, DDW_PGSIZE_64K },
>> +        { 24, DDW_PGSIZE_16M },
>> +        { 25, DDW_PGSIZE_32M },
>> +        { 26, DDW_PGSIZE_64M },
>> +        { 27, DDW_PGSIZE_128M },
>> +        { 28, DDW_PGSIZE_256M },
>> +        { 34, DDW_PGSIZE_16G },
>> +    };
>>
>>
>> Supported page sizes are returned by the host kernel via "query". For 16MB
>> pages, page shift will return DDW_PGSIZE_4K|DDW_PGSIZE_64K|DDW_PGSIZE_16M.
>> Or I did not understand the question...
>
> Why do we care about the sizes? Anything bigger than what we support should
> always work, no? What happens if the guest creates a 16MB map but my pages
> are 4kb mapped?
> Wouldn't the same logic be able to deal with 16G pages?
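
To make the 16MB-map-on-4KB-host-pages case concrete (made-up numbers
below, this is only an illustration of the argument, not code from the
series):

    /*
     * With a 16MB IOMMU page size the DMA window keeps one TCE per 16MB
     * of bus address space, so mapping something at bus address 0x1000000
     * touches exactly one entry:
     *
     *     index      = 0x1000000 >> 24;            // 24 == shift for 16MB
     *     tce[index] = host_real_address | perms;  // room for ONE address
     *
     * The hardware simply adds the low 24 bits of the bus address to that
     * single real address, so the 16MB behind it must be one physically
     * contiguous chunk (e.g. a 16MB huge page); a set of scattered 4KB
     * pages cannot be described by that one entry.
     */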
It is DMA memory: if I split a "virtual" 16M IOMMU page into a bunch of
real 4K pages, I have to make sure those 16M are physically contiguous -
there is a single TCE entry for the whole of it and no further
translation besides the IOMMU. What am I missing?

--
Alexey
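
For reference, this is roughly how I expect the page_size_mask returned
by VFIO_IOMMU_SPAPR_TCE_QUERY to be consumed when picking the page shift
for a new window - just a sketch, the helper name and the exact policy
here are made up:

    static int ddw_best_page_shift(uint32_t query_mask, int max_shift)
    {
        /* Same shift <-> mask pairs as in patch 06/10 */
        const struct { int shift; uint32_t mask; } masks[] = {
            { 12, DDW_PGSIZE_4K },
            { 16, DDW_PGSIZE_64K },
            { 24, DDW_PGSIZE_16M },
            { 25, DDW_PGSIZE_32M },
            { 26, DDW_PGSIZE_64M },
            { 27, DDW_PGSIZE_128M },
            { 28, DDW_PGSIZE_256M },
            { 34, DDW_PGSIZE_16G },
        };
        int i, best = -1;

        /* Take the largest host-supported size that does not exceed what
         * the guest asked for; smaller supported sizes would work too,
         * they just mean more TCEs to maintain. */
        for (i = 0; i < ARRAY_SIZE(masks); ++i) {
            if ((query_mask & masks[i].mask) && masks[i].shift <= max_shift) {
                best = masks[i].shift;
            }
        }

        return best;   /* -1 if nothing matches */
    }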