Hi,

On Monday, August 12, 2024 12:01:00 PM GMT+5:30 you wrote:
> On Sun, Aug 11, 2024 at 7:20 PM Sahil <icegambi...@gmail.com> wrote:
> > On Wednesday, August 7, 2024 9:52:10 PM GMT+5:30 Eugenio Perez Martin wrote:
> > > On Fri, Aug 2, 2024 at 1:22 PM Sahil Siddiq <icegambi...@gmail.com> wrote:
> > > > [...]
> > > > @@ -726,17 +738,30 @@ void vhost_svq_start(VhostShadowVirtqueue *svq, VirtIODevice *vdev,
> > > >      svq->vring.num = virtio_queue_get_num(vdev, virtio_get_queue_index(vq));
> > > >      svq->num_free = svq->vring.num;
> > > >
> > > > -    svq->vring.desc = mmap(NULL, vhost_svq_driver_area_size(svq),
> > > > -                           PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS,
> > > > -                           -1, 0);
> > > > -    desc_size = sizeof(vring_desc_t) * svq->vring.num;
> > > > -    svq->vring.avail = (void *)((char *)svq->vring.desc + desc_size);
> > > > -    svq->vring.used = mmap(NULL, vhost_svq_device_area_size(svq),
> > > > -                           PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS,
> > > > -                           -1, 0);
> > > > -    svq->desc_state = g_new0(SVQDescState, svq->vring.num);
> > > > -    svq->desc_next = g_new0(uint16_t, svq->vring.num);
> > > > -    for (unsigned i = 0; i < svq->vring.num - 1; i++) {
> > > > +    svq->is_packed = virtio_vdev_has_feature(svq->vdev, VIRTIO_F_RING_PACKED);
> > > > +
> > > > +    if (virtio_vdev_has_feature(svq->vdev, VIRTIO_F_RING_PACKED)) {
> > > > +        svq->vring_packed.vring.desc = mmap(NULL, vhost_svq_memory_packed(svq),
> > > > +                                            PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS,
> > > > +                                            -1, 0);
> > > > +        desc_size = sizeof(struct vring_packed_desc) * svq->vring.num;
> > > > +        svq->vring_packed.vring.driver = (void *)((char *)svq->vring_packed.vring.desc + desc_size);
> > > > +        svq->vring_packed.vring.device = (void *)((char *)svq->vring_packed.vring.driver +
> > > > +                                                  sizeof(struct vring_packed_desc_event));
> > >
> > > This is a great start, but it will be problematic when you start
> > > mapping the areas to the vdpa device. The driver area should be
> > > read-only for the device, but it is placed in the same page as a RW one.
> > >
> > > More on this later.
> > >
> > > > +    } else {
> > > > +        svq->vring.desc = mmap(NULL, vhost_svq_driver_area_size(svq),
> > > > +                               PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS,
> > > > +                               -1, 0);
> > > > +        desc_size = sizeof(vring_desc_t) * svq->vring.num;
> > > > +        svq->vring.avail = (void *)((char *)svq->vring.desc + desc_size);
> > > > +        svq->vring.used = mmap(NULL, vhost_svq_device_area_size(svq),
> > > > +                               PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS,
> > > > +                               -1, 0);
> > > > +    }
> > >
> > > I think it will be beneficial to avoid "if (packed)" conditionals on
> > > the exposed functions that give information about the memory maps.
> > > These need to be replicated at
> > > hw/virtio/vhost-vdpa.c:vhost_vdpa_svq_map_rings.
> > >
> > > However, the current one depends on the driver area to live in the
> > > same page as the descriptor area, so it is not suitable for this.
> >
> > I haven't really understood this.
> >
> > In split vqs the descriptor, driver and device areas are mapped to RW pages.
> > In vhost_vdpa.c:vhost_vdpa_svq_map_rings, the regions are mapped with
> > the appropriate "perm" field that sets the R/W permissions in the DMAMap
> > object. Is this problematic for the split vq format because the avail ring is
> > anyway mapped to a RW page in "vhost_svq_start"?
> >
> 
> Ok, so maybe the word "map" is misleading here. The pages need to be
> allocated for the QEMU process with both PROT_READ | PROT_WRITE, as
> QEMU needs to write into them.
> 
> They are mapped to the device with vhost_vdpa_dma_map, and the last
> bool parameter indicates whether the device needs write permissions or not.
> You can see how hw/virtio/vhost-vdpa.c:vhost_vdpa_svq_map_ring checks
> the needle permission for this, and the needle permissions are stored
> at hw/virtio/vhost-vdpa.c:vhost_vdpa_svq_map_rings. This is the
> function that needs to check the map permissions.
>

I think I have understood what's going on in "vhost_vdpa_svq_map_rings",
"vhost_vdpa_svq_map_ring" and "vhost_vdpa_dma_map". But based on my
understanding, it looks like the driver area is getting mapped to an iova
that is read-only for vhost_vdpa. Please let me know where I am going wrong.

Consider the following implementation in hw/virtio/vhost-vdpa.c:

> size_t device_size = vhost_svq_device_area_size(svq);
> size_t driver_size = vhost_svq_driver_area_size(svq);

The driver size includes the descriptor area and the driver area. For a
packed vq, the driver area is the "driver event suppression" structure,
which should be read-only for the device according to the virtio spec
(section 2.8.10) [1].
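
To make sure we are talking about the same layout, this is how I visualize
the single mapping created for packed vqs in the patch (just my reading of
it, expressed as a comment):

/*
 * One anonymous RW mapping (QEMU process view), as set up in
 * vhost_svq_start:
 *
 * vring_packed.vring.desc:   descriptor ring,
 *                            svq->vring.num * sizeof(struct vring_packed_desc)
 * vring_packed.vring.driver: driver event suppression,
 *                            sizeof(struct vring_packed_desc_event)
 * vring_packed.vring.device: device event suppression,
 *                            sizeof(struct vring_packed_desc_event)
 *
 * All three areas are contiguous inside one page-aligned allocation.
 */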

> size_t avail_offset;
> bool ok;
> 
> vhost_svq_get_vring_addr(svq, &svq_addr);

Over here "svq_addr.desc_user_addr" will point to the descriptor area
while "svq_addr.avail_user_addr" will point to the driver area/driver
event suppression structure.
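
For reference, "vhost_svq_get_vring_addr" in
hw/virtio/vhost-shadow-virtqueue.c is, if I am not mistaken, roughly:

void vhost_svq_get_vring_addr(const VhostShadowVirtqueue *svq,
                              struct vhost_vring_addr *addr)
{
    addr->desc_user_addr = (uint64_t)(uintptr_t)svq->vring.desc;
    addr->avail_user_addr = (uint64_t)(uintptr_t)svq->vring.avail;
    addr->used_user_addr = (uint64_t)(uintptr_t)svq->vring.used;
}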

> driver_region = (DMAMap) {
>     .translated_addr = svq_addr.desc_user_addr,
>     .size = driver_size - 1,
>     .perm = IOMMU_RO,
> };

This region points to the descriptor area and its size encompasses the
driver area as well with RO permission.

> ok = vhost_vdpa_svq_map_ring(v, &driver_region, errp);

The above function checks the value of needle->perm and sees that it is RO.
It then calls "vhost_vdpa_dma_map" with the following arguments:

> r = vhost_vdpa_dma_map(v->shared, v->address_space_id, needle->iova,
>                        needle->size + 1,
>                        (void *)(uintptr_t)needle->translated_addr,
>                        needle->perm == IOMMU_RO);

Since needle->size includes the driver area as well, the driver area will be
mapped to a RO page in the device's address space, right?
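
As a concrete example (assuming svq->vring.num == 256 and a 4 KiB page
size): desc_size = 256 * 16 = 4096 bytes and avail_size = 518 bytes, so
vhost_svq_driver_area_size returns ROUND_UP(4614, 4096) = 8192, and the
RO mapping then spans both pages of the allocation, including everything
placed after the descriptor ring.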

> if (unlikely(!ok)) {
>     error_prepend(errp, "Cannot create vq driver region: ");
>     return false;
> }
> addr->desc_user_addr = driver_region.iova;
> avail_offset = svq_addr.avail_user_addr - svq_addr.desc_user_addr;
> addr->avail_user_addr = driver_region.iova + avail_offset;

I think "addr->desc_user_addr" and "addr->avail_user_addr" will both be
mapped to a RO page in the device's address space.

> device_region = (DMAMap) {
>     .translated_addr = svq_addr.used_user_addr,
>     .size = device_size - 1,
>     .perm = IOMMU_RW,
> };

The device area/device event suppression structure on the other hand will
be mapped to a RW page.

I also think there are other issues with the current state of the patch.
According to the virtio spec (section 2.8.10) [1], the "device event
suppression" structure needs to be write-only for the device, but it is
mapped to a RW page.
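
In code terms, a spec-accurate device region would need something like the
following, which I don't think can currently be expressed, since
"vhost_vdpa_dma_map" only takes a readonly bool as its last parameter
(purely illustrative):

/* Illustrative only: a spec-accurate device region. */
device_region = (DMAMap) {
    .translated_addr = svq_addr.used_user_addr,
    .size = device_size - 1,
    .perm = IOMMU_WO, /* write-only for the device, per section 2.8.10 */
};
/*
 * vhost_vdpa_dma_map(..., bool readonly) can only express RO vs RW,
 * so the region ends up RW in practice.
 */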

Another concern I have is regarding the driver area size for packed vq. In
"hw/virtio/vhost-shadow-virtqueue.c" of the current patch:
> size_t vhost_svq_driver_area_size(const VhostShadowVirtqueue *svq)
> {
>     size_t desc_size = sizeof(vring_desc_t) * svq->vring.num;
>     size_t avail_size = offsetof(vring_avail_t, ring[svq->vring.num]) +
>                         sizeof(uint16_t);
> 
>     return ROUND_UP(desc_size + avail_size, qemu_real_host_page_size());
> }
> 
> [...]
> 
> size_t vhost_svq_memory_packed(const VhostShadowVirtqueue *svq)
> {
>     size_t desc_size = sizeof(struct vring_packed_desc) * svq->num_free;
>     size_t driver_event_suppression = sizeof(struct vring_packed_desc_event);
>     size_t device_event_suppression = sizeof(struct vring_packed_desc_event);
> 
>     return ROUND_UP(desc_size + driver_event_suppression + device_event_suppression,
>                     qemu_real_host_page_size());
> }

The size returned by "vhost_svq_driver_area_size" might not be the actual
driver size, which is given by desc_size + driver_event_suppression, right?
Will this have to be changed too?
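
If so, maybe the packed case needs its own pair of helpers, something along
these lines (hypothetical, the names are made up by me; keeping the driver
and device parts in separate page-aligned allocations would also address the
RO/RW page-sharing issue you mentioned):

/* Hypothetical packed counterparts of the split-vq size helpers. */
size_t vhost_svq_driver_area_size_packed(const VhostShadowVirtqueue *svq)
{
    size_t desc_size = sizeof(struct vring_packed_desc) * svq->vring.num;
    size_t driver_event_suppression = sizeof(struct vring_packed_desc_event);

    return ROUND_UP(desc_size + driver_event_suppression,
                    qemu_real_host_page_size());
}

size_t vhost_svq_device_area_size_packed(const VhostShadowVirtqueue *svq)
{
    /* Only the device event suppression structure lives here. */
    return ROUND_UP(sizeof(struct vring_packed_desc_event),
                    qemu_real_host_page_size());
}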

Thanks,
Sahil

[1] https://docs.oasis-open.org/virtio/virtio/v1.3/csd01/virtio-v1.3-csd01.html#x1-720008
