Hi, On 9/4/19 7:46 AM, Tian, Kevin wrote: >> From: Peter Xu [mailto:pet...@redhat.com] >> Sent: Wednesday, September 4, 2019 1:37 PM >> >> On Wed, Sep 04, 2019 at 04:23:50AM +0000, Tian, Kevin wrote: >>>> From: Peter Xu [mailto:pet...@redhat.com] >>>> Sent: Wednesday, September 4, 2019 9:44 AM >>>> >>>> On Tue, Sep 03, 2019 at 01:37:11PM +0200, Auger Eric wrote: >>>>> Hi Peter, >>>>> >>>>> On 8/19/19 10:11 AM, Peter Xu wrote: >>>>>> On Tue, Jul 30, 2019 at 07:21:30PM +0200, Eric Auger wrote: >>>>>> >>>>>> [...] >>>>>> >>>>>>> + mapping = g_tree_lookup(domain->mappings, >> (gpointer)(&interval)); >>>>>>> + >>>>>>> + while (mapping) { >>>>>>> + viommu_interval current; >>>>>>> + uint64_t low = mapping->virt_addr; >>>>>>> + uint64_t high = mapping->virt_addr + mapping->size - 1; >>>>>>> + >>>>>>> + current.low = low; >>>>>>> + current.high = high; >>>>>>> + >>>>>>> + if (low == interval.low && size >= mapping->size) { >>>>>>> + g_tree_remove(domain->mappings, (gpointer)(¤t)); >>>>>>> + interval.low = high + 1; >>>>>>> + trace_virtio_iommu_unmap_left_interval(current.low, >>>> current.high, >>>>>>> + interval.low, interval.high); >>>>>>> + } else if (high == interval.high && size >= mapping->size) { >>>>>>> + trace_virtio_iommu_unmap_right_interval(current.low, >>>> current.high, >>>>>>> + interval.low, interval.high); >>>>>>> + g_tree_remove(domain->mappings, (gpointer)(¤t)); >>>>>>> + interval.high = low - 1; >>>>>>> + } else if (low > interval.low && high < interval.high) { >>>>>>> + trace_virtio_iommu_unmap_inc_interval(current.low, >>>> current.high); >>>>>>> + g_tree_remove(domain->mappings, (gpointer)(¤t)); >>>>>>> + } else { >>>>>>> + break; >>>>>>> + } >>>>>>> + if (interval.low >= interval.high) { >>>>>>> + return VIRTIO_IOMMU_S_OK; >>>>>>> + } else { >>>>>>> + mapping = g_tree_lookup(domain->mappings, >>>> (gpointer)(&interval)); >>>>>>> + } >>>>>>> + } >>>>>>> + >>>>>>> + if (mapping) { >>>>>>> + qemu_log_mask(LOG_GUEST_ERROR, >>>>>>> + "****** %s: Unmap 0x%"PRIx64" size=0x%"PRIx64 >>>>>>> + " from 0x%"PRIx64" size=0x%"PRIx64" is not >>>>>>> supported\n", >>>>>>> + __func__, interval.low, size, >>>>>>> + mapping->virt_addr, mapping->size); >>>>>>> + } else { >>>>>>> + return VIRTIO_IOMMU_S_OK; >>>>>>> + } >>>>>>> + >>>>>>> + return VIRTIO_IOMMU_S_INVAL; >>>>>> >>>>>> Could the above chunk be simplified as something like below? >>>>>> >>>>>> while ((mapping = g_tree_lookup(domain->mappings, &interval))) { >>>>>> g_tree_remove(domain->mappings, mapping); >>>>>> } >>>>> Indeed the code could be simplified. I only need to make sure I don't >>>>> split an existing mapping. >>>> >>>> Hmm... Do we need to still split an existing mapping if necessary? >>>> For example when with this mapping: >>>> >>>> iova=0x1000, size=0x2000, phys=ADDR1, flags=FLAGS1 >>>> >>>> And if we want to unmap the range (iova=0, size=0x2000), then we >>>> should split the existing mappping and leave this one: >>>> >>>> iova=0x2000, size=0x1000, phys=(ADDR1+0x1000), flags=FLAGS1 >>>> >>>> Right? >>>> >>> >>> virtio-iommu spec explicitly disallows partial unmap. >>> >>> 5.11.6.6.1 Driver Requirements: UNMAP request >>> >>> The first address of a range MUST either be the first address of a >>> mapping or be outside any mapping. The last address of a range >>> MUST either be the last address of a mapping or be outside any >>> mapping. >>> >>> 5.11.6.6.2 Device Requirements: UNMAP request >>> >>> If a mapping affected by the range is not covered in its entirety >>> by the range (the UNMAP request would split the mapping), >>> then the device SHOULD set the request status to VIRTIO_IOMMU >>> _S_RANGE, and SHOULD NOT remove any mapping. >> >> I see, thanks Kevin. >> >> Though why so strict? (Sorry if I missed some discussions >> ... pointers welcomed...) >> >> What I'm thinking is when we want to allocate a bunch of buffers >> (e.g., 1M) while we will also need to be able to free them with >> smaller chunks (e.g., 4K), then it would be even better that we allow >> to allocate a whole 1M buffer within the guest and map it as a whole, >> then we can selectively unmap the pages after used. If with the >> strict rule, we'll need to map one by one, that can be a total of >> 1M/4K roundtrips. >> > > Sorry I forgot the original discussion. Need Jean to respond. :-) > > A possible reason is that no such usage exists today, thus simplification > was made?
In https://virtualization.linux-foundation.narkive.com/q6XOkO76/rfc-0-3-virtio-iommu-a-paravirtualized-iommu I found " (Note: the semantics of unmap are chosen to be compatible with VFIO's type1 v2 IOMMU API. This way a device serving as intermediary between guest and VFIO doesn't have to keep an internal tree of mappings. They are a bit tighter than VFIO, in that they don't allow unmap spilling outside mapped regions. Spilling is 'undefined' at the moment, because it should work in most cases but I don't know if it's worth the added complexity in devices that are not simply transmitting requests to VFIO. Splitting mappings won't ever be allowed, but see the relaxed proposal in 3/3 for more lenient semantics) " Thanks Eric > > Thanks > Kevin >