* Igor Mammedov (imamm...@redhat.com) wrote:
> On Mon, 11 Dec 2017 11:03:00 +0000
> "Dr. David Alan Gilbert" <dgilb...@redhat.com> wrote:
>
> > * Igor Mammedov (imamm...@redhat.com) wrote:
> > > On Fri, 8 Dec 2017 17:51:56 +0000
> > > "Dr. David Alan Gilbert" <dgilb...@redhat.com> wrote:
> > >
> > > > * Igor Mammedov (imamm...@redhat.com) wrote:
> > > > > On Thu, 7 Dec 2017 18:17:51 +0000
> > > > > "Dr. David Alan Gilbert" <dgilb...@redhat.com> wrote:
> > > > >
> > > > > > * Igor Mammedov (imamm...@redhat.com) wrote:
> > > > > > > vhost_verify_ring_mappings() is used to verify that
> > > > > > > rings are still accessible and that related memory
> > > > > > > hasn't been moved after the flatview is updated.
> > > > > > >
> > > > > > > It does this by mapping the ring's GPA+len and
> > > > > > > checking that the HVA hasn't changed with the new
> > > > > > > memory map. To avoid a possibly expensive mapping
> > > > > > > call, we identify the address range that changed
> > > > > > > and do the mapping only if the ring is in the
> > > > > > > changed range.
> > > > > > >
> > > > > > > However, it's not necessary to perform the ring's
> > > > > > > GPA mapping, as we already have its current HVA;
> > > > > > > all we need is to verify that the ring's GPA
> > > > > > > translates to the same HVA in the updated flatview.
> > > > > > >
> > > > > > > That can be done while we are building the vhost
> > > > > > > memmap, where vhost_update_mem_cb() already maps
> > > > > > > every RAM MemoryRegionSection to get the section
> > > > > > > HVA. This fact can be used to check that the ring
> > > > > > > belongs to the same MemoryRegion in the same place,
> > > > > > > like this:
> > > > > > >
> > > > > > >   hva_ring_offset = GPA(ring) - GPA(MemoryRegionSection)
> > > > > > >   ring_hva == HVA(MemoryRegionSection) + hva_ring_offset
> > > > > > >
> > > > > > > Doing this allows us to drop ~100 LOC of code which
> > > > > > > figures out the changed memory range, and to avoid
> > > > > > > calling the unneeded reverse
> > > > > > > vhost_memory_map(is_write=true), which is overkill
> > > > > > > for the task at hand.
> > > > > >
> > > > > > There are a few things about this I don't understand; however if
> > > > > > it's right I agree that it means we can kill off my comparison
> > > > > > code.
> > > > > >
> > > > > > First, can I check what constraints 'verify_ring' needs to check;
> > > > > > if I'm understanding the original code it's checking that:
> > > > > >   a) If a queue falls within a region, that the whole queue can
> > > > > >      be mapped
> > > > > see vhost_verify_ring_part_mapping():
> > > > >
> > > > >     p = vhost_memory_map(dev, part_addr, &l, 1);
> > > > >
> > > > >     if (!p || l != part_size)
> > > > >         error_out
> > > > >
> > > > > 1st: (!p) the requested part_addr must be mapped,
> > > > >      i.e. be part of a MemorySection in the flatview
> > > > > AND
> > > > > 2nd: (l != part_size) the mapped range size must match what we
> > > > >      asked for, i.e. be mapped as a continuous range, so that
> > > > >      [GPA, GPA + part_size) can directly correspond to
> > > > >      [HVA, HVA + part_size); that's possible only within
> > > > >      1 MemorySection && 1 MemoryRegion,
> > > > >      if I read address_space_map() correctly:
> > > > >         flatview_translate() -> GPA's MemorySection
> > > > >         flatview_extend_translation() -> 1:1 GPA->HVA range size
> > > > >
> > > > > So the answer, in the case of the RAM memory regions we are
> > > > > interested in, would be:
> > > > >    the queue falls within a MemorySection and the whole queue
> > > > >    fits into it
> > > >
> > > > Yes, OK.
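(For concreteness, a minimal standalone sketch of the check the commit
message describes: record each RAM section's GPA/HVA pair while building
the vhost memmap, then verify the ring's cached HVA against it. The names
and the helper itself are invented for illustration; this is not the
patch's code.)

    #include <stdint.h>
    #include <stdbool.h>

    /*
     * hva_ring_offset = GPA(ring) - GPA(MemoryRegionSection)
     * and the ring is unchanged iff:
     *   ring_hva == HVA(MemoryRegionSection) + hva_ring_offset
     */
    static bool ring_hva_unchanged(uint64_t ring_gpa, void *ring_hva,
                                   uint64_t sec_gpa, void *sec_hva)
    {
        uint64_t hva_ring_offset = ring_gpa - sec_gpa;

        return ring_hva == (void *)((uint8_t *)sec_hva + hva_ring_offset);
    }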
> > > >
> > > > > >   b) And the start of the queue corresponds to where we thought
> > > > > >      it used to be (in GPA?)
> > > > > It checks that the queue HVA cached at vq->desc hasn't changed
> > > > > after the flatview change:
> > > > >      if (p != part)
> > > > >           error_out
> > > >
> > > > OK, so it's the HVA that's not changed, based on taking the
> > > > part_addr (which is a GPA) and checking the map?
> > > Yes, I think so.
> > >
> > > > > > so that doesn't have any constraint on the ordering of the regions
> > > > > > or whether the region is the same size or anything.
> > > > > > Also I don't think it would spot if there was a queue that had no
> > > > > > region associated with it at all.
> > > > > >
> > > > > > Now, comments on your code below:
> > > > > >
> > > > > > > Signed-off-by: Igor Mammedov <imamm...@redhat.com>
> > > > > > > ---
> > > > > > > PS:
> > > > > > >    should be applied on top of David's series
> > > > > > >
> > > > > > > ---
> > > > > > >  include/hw/virtio/vhost.h |   2 -
> > > > > > >  hw/virtio/vhost.c         | 195 ++++++++++++++--------------------------------
> > > > > > >  2 files changed, 57 insertions(+), 140 deletions(-)
> > > > > > >
> > > > > > > diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> > > > > > > index 467dc77..fc4af5c 100644
> > > > > > > --- a/include/hw/virtio/vhost.h
> > > > > > > +++ b/include/hw/virtio/vhost.h
> > > > > > > @@ -68,8 +68,6 @@ struct vhost_dev {
> > > > > > >      uint64_t log_size;
> > > > > > >      Error *migration_blocker;
> > > > > > >      bool memory_changed;
> > > > > > > -    hwaddr mem_changed_start_addr;
> > > > > > > -    hwaddr mem_changed_end_addr;
> > > > > > >      const VhostOps *vhost_ops;
> > > > > > >      void *opaque;
> > > > > > >      struct vhost_log *log;
> > > > > > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > > > > > > index 5b9a7d7..026bac3 100644
> > > > > > > --- a/hw/virtio/vhost.c
> > > > > > > +++ b/hw/virtio/vhost.c
> > > > > > > @@ -296,35 +296,43 @@ static void vhost_memory_unmap(struct vhost_dev *dev, void *buffer,
> > > > > > >      }
> > > > > > >  }
> > > > > > >
> > > > > > > -static int vhost_verify_ring_part_mapping(struct vhost_dev *dev,
> > > > > > > -                                          void *part,
> > > > > > > -                                          uint64_t part_addr,
> > > > > > > -                                          uint64_t part_size,
> > > > > > > -                                          uint64_t start_addr,
> > > > > > > -                                          uint64_t size)
> > > > > > > +static int vhost_verify_ring_part_mapping(void *ring_hva,
> > > > > > > +                                          uint64_t ring_gpa,
> > > > > > > +                                          uint64_t ring_size,
> > > > > > > +                                          void *reg_hva,
> > > > > > > +                                          uint64_t reg_gpa,
> > > > > > > +                                          uint64_t reg_size)
> > > > > > >  {
> > > > > > > -    hwaddr l;
> > > > > > > -    void *p;
> > > > > > > -    int r = 0;
> > > > > > > +    uint64_t hva_ring_offset;
> > > > > > > +    uint64_t ring_last = range_get_last(ring_gpa, ring_size);
> > > > > > > +    uint64_t reg_last = range_get_last(reg_gpa, reg_size);
> > > > > > >
> > > > > > > -    if (!ranges_overlap(start_addr, size, part_addr, part_size)) {
> > > > > > > +    /* start check from the end so that the rest of checks
> > > > > > > +     * are done when whole ring is in merged sections range
> > > > > > > +     */
> > > > > > > +    if (ring_last < reg_last || ring_gpa > reg_last) {
> > > > > > >          return 0;
> > > > > > >      }
> > > > > >
> > > > > > so does that mean if our region never grows to be as big as the ring
> > > > > > we won't spot the problem?
> > > > > I think there is a mistake here; it should be:
> > > > >     ring_last < reg_gpa || ring_gpa > reg_last
> > > > >
> > > > > so if the ring is outside of the continuous region, we do not
> > > > > care => no change
> > > >
> > > > OK, I don't see how that corresponds to your 'start check from the end'
> > > > comment - I thought it was doing something smarter to deal with this
> > > > being called from the _cb routine potentially before another part of the
> > > > range had been joined onto it.
> > > > In that case, we can just use ranges_overlap like the original
> > > > routine did.
> > > I suppose a range check will do as well; the reason for checking from
> > > the end was an optimization, to make fewer checks than ranges_overlap(),
> > > considering we are comparing against a vhost_memory_region.
> >
> > Have a look at the RFC v2 I've posted; I've reworked it a bit so
> > we call this code afterwards rather than from the _cb.
> I've skimmed through it already and it looks good to me at first glance
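(Returning to the range check above: a standalone sketch of the test with
Igor's fix applied. range_get_last() is reimplemented here as in QEMU's
include/qemu/range.h; the rest is invented for illustration and is not
the patch itself.)

    #include <stdint.h>
    #include <stdbool.h>

    /* as in QEMU's include/qemu/range.h: inclusive last byte of a range */
    static inline uint64_t range_get_last(uint64_t offset, uint64_t len)
    {
        return offset + len - 1;
    }

    /*
     * Corrected early-out: skip the region only when the ring lies
     * entirely below it (ring_last < reg_gpa) or entirely above it
     * (ring_gpa > reg_last), i.e. exactly when they don't overlap.
     * The posted patch compared ring_last against reg_last instead,
     * so the check was skipped whenever the region didn't (yet)
     * extend past the ring's end.
     */
    static bool ring_outside_region(uint64_t ring_gpa, uint64_t ring_size,
                                    uint64_t reg_gpa, uint64_t reg_size)
    {
        uint64_t ring_last = range_get_last(ring_gpa, ring_size);
        uint64_t reg_last = range_get_last(reg_gpa, reg_size);

        return ring_last < reg_gpa || ring_gpa > reg_last;
    }

(This is just !ranges_overlap(reg_gpa, reg_size, ring_gpa, ring_size), so
falling back to ranges_overlap(), as suggested above, loses nothing.)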
OK, great.

> /it needs spelling fixes at least/

Sounds normal for my code! I'll go through and try to clean them up;
I'll also try to make sure it's bisectable.

> I'll do a line-by-line review and testing this week

Thanks.

> > > But it seems like a bit of overkill, since by commit time the
> > > flatview would contain a 1:1 mapping of
> > > MemorySection:vhost_memory_region (modulo sections we are not
> > > interested in). It should be unlikely to get 2 MemoryRegions
> > > allocated one after another and merged in vhost_memory_region()
> > > into one vhost_memory_region. Maybe we would be better off just
> > > copying MemorySections into vhost_memory_region in
> > > vhost_update_mem_cb() and dropping merging altogether.
> >
> > Except I need the merging for the hugepage stuff that's coming in the
> > postcopy world.
> >
> > > I'd even consider 'non deterministic' merging of 2 MemoryRegions
> > > harmful to migration, since vhost might see a different memory map
> > > on the target vs the source.
> >
> > I think that's come up before and it was decided it's not a problem
> > as long as the regions are still mapped; the only consistency required
> > is between the qemu and the vhost (either the kernel or the vhost-user);
> > it shouldn't be a guest visible issue if I understand it correctly.
> Non-consistent merging will cause a change in the number of memory
> regions in the vhost memmap, and even if it's not visible to guests,
> it's still a migration blocker in borderline cases where the number of
> regions is around vhost_backend_memslots_limit(), i.e. the source QEMU
> starts fine by luck (merging happens) but the target fails as it hits
> the limit (merging fails).

Yep, but it's weighed against merging meaning that you take up fewer
slots, so you're less likely to run into the limit.
IMHO the real problem is having a fixed, small, arbitrary number of
slots rather than something dynamic.

> Anyway, it's a problem that exists in the current code as well and is
> not related to this series, so we can ponder it later
> (perhaps we don't see the issue because merging doesn't happen
> in practice).

I suspect it does, but you get very little variation and very rarely
get near the limit with merging; I think the worst case is there's
a VGA mapping case which can be a bit random, but that probably
happens rarely with a lot of regions.

I tested a simple world with real vhost (rather than vhost-user) and it
is surviving RAM hotplug.

Dave

> >
> > Dave
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
>

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK