On Fri, Jun 05, 2026 at 11:50:36AM -0700, Si-Wei Liu wrote:
>
>
> On 6/5/2026 10:43 AM, Michael S. Tsirkin wrote:
> > On Fri, Jun 05, 2026 at 09:03:36AM -0700, Si-Wei Liu wrote:
> > >
> > > On 6/1/2026 11:04 PM, Eugenio Perez Martin wrote:
> > > > On Tue, Jun 2, 2026 at 6:34 AM yangjiale <[email protected]> wrote:
> > > > > When a descriptor list spans across cache lines,
> > > > > updating the flag first can lead to a scenario where the device side
> > > > > perceives the flag as valid, yet the corresponding address and length
> > > > > fields remain unupdated—resulting in invalid values.
> > > > > Therefore, the flag field must be updated last.
> > > > >
> > > > > Signed-off-by: yangjiale <[email protected]>
> > > > > ---
> > > > >  drivers/virtio/virtio_ring.c | 8 ++++----
> > > > >  1 file changed, 4 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > index fbca7ce1c6bf..036b4f90d30f 100644
> > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > @@ -1688,6 +1688,10 @@ static inline int virtqueue_add_packed(struct vring_virtqueue *vq,
> > > > >                                               &addr, &len, premapped, attr))
> > > > >                                  goto unmap_release;
> > > > >
> > > > > +                        desc[i].addr = cpu_to_le64(addr);
> > > > > +                        desc[i].len = cpu_to_le32(len);
> > > > > +                        desc[i].id = cpu_to_le16(id);
> > > > > +
> > > > >                          flags = cpu_to_le16(vq->packed.avail_used_flags |
> > > > >                                      (++c == total_sg ? 0 : VRING_DESC_F_NEXT) |
> > > > >                                      (n < out_sgs ? 0 : VRING_DESC_F_WRITE));
> > > > > @@ -1696,10 +1700,6 @@ static inline int virtqueue_add_packed(struct vring_virtqueue *vq,
> > > > >                          else
> > > > >                                  desc[i].flags = flags;
> > > > >
> > > > > -                        desc[i].addr = cpu_to_le64(addr);
> > > > > -                        desc[i].len = cpu_to_le32(len);
> > > > > -                        desc[i].id = cpu_to_le16(id);
> > > > > -
> > > > >                          if (unlikely(vq->use_map_api)) {
> > > > >                                  vq->packed.desc_extra[curr].addr = premapped ?
> > > > >                                          DMA_MAPPING_ERROR : addr;
> > > > These flags are updated before the flags of the head descriptor at the
> > > > end of the function, at "vq->packed.vring.desc[head].flags =
> > > > head_flags", so the device should not see these. Because of that, the
> > > > relative order between the rest of the fields of the same descriptor
> > > > or other descriptors' fields, except for the head descriptor's flags,
> > > > should not matter. There is a write memory barrier just before
> > > > updating the head's flags.
> > > The above analysis is absolutely correct. Though one hardware vendor
> > > told me that this driver implementation kinda stops them from reading
> > > ahead of descriptors already posted beyond the available index., ending
> > > up with suboptimal performance that is hard to make up by other means.
> > > Would it be a bad idea to go with this change and add write barrier in
> > > a gentle way for a small flit in the batch, e.g. commit to memory after
> > > every cache line size worth of descriptors are posted? Would the memory
> > > barrier have negative performance overhead to other backend
> > > implementation variants than real hardware PCI device?
> > >
> > > -Siwei
> > this would need a new feature bit, won't it?
> Probably. This is to capture the device's expectation and behavior right?
> the driver change itself is not spec violating...

yes, device can't rely on this without a feature bit.

> >
> > > > Also, I don't get why the cache line matters here. Can you expand? Am
> > > > I missing something?
> > me too.
> >
> Just to avoid extra delay due to excessive coherency messages and frequent
> cache thrashing, device read over pci bus contends with host write/update on
> the descriptors in a same cache line..
>
> -Siwei

this should be infrequent, the whole idea is that there's parallelism:
device reads descriptors from X while host writes other ones to Y.

btw i can't say whether it's ok for device to just issue 2 reads,
or does it have to receive read result and only then issue the second
read.

-- 
MST
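
For reference, the ordering argument quoted above (all fields of a chain, including non-head flags, are written first; a write barrier follows; the head descriptor's flags are written last) can be sketched in userspace C. This is only an illustration under stated assumptions, not the code in drivers/virtio/virtio_ring.c: `sketch_desc`, `sketch_buf`, `sketch_publish_chain` and `sketch_head_is_avail` are hypothetical names, a C11 release fence stands in for the kernel's write barrier, endianness conversion is omitted, and the chain is assumed not to wrap around the ring.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Flag bits as laid out for packed virtqueues in the virtio spec. */
#define DESC_F_NEXT   (1u << 0)
#define DESC_F_WRITE  (1u << 1)
#define DESC_F_AVAIL  (1u << 7)
#define DESC_F_USED   (1u << 15)

/* Hypothetical stand-in for a packed descriptor (no endianness handling). */
struct sketch_desc {
    uint64_t addr;
    uint32_t len;
    uint16_t id;
    _Atomic uint16_t flags;
};

/* Hypothetical description of one buffer in the chain. */
struct sketch_buf {
    uint64_t addr;
    uint32_t len;
    int device_writes;
};

/*
 * Post a chain of n descriptors starting at ring[head].
 * Assumption: head + n does not run past the end of the ring.
 */
static void sketch_publish_chain(struct sketch_desc *ring, uint16_t head,
                                 const struct sketch_buf *bufs, uint16_t n,
                                 uint16_t id, uint16_t avail_used_flags)
{
    uint16_t head_flags = 0;

    assert(n >= 1);
    for (uint16_t k = 0; k < n; k++) {
        struct sketch_desc *d = &ring[head + k];
        uint16_t flags = (uint16_t)(avail_used_flags |
                         (k + 1 == n ? 0u : DESC_F_NEXT) |
                         (bufs[k].device_writes ? DESC_F_WRITE : 0u));

        d->addr = bufs[k].addr;
        d->len = bufs[k].len;
        d->id = id;

        /* Non-head flags may be written in any order relative to the
         * other fields: the device is not supposed to look at them
         * until the head descriptor is marked available. */
        if (k == 0)
            head_flags = flags;
        else
            atomic_store_explicit(&d->flags, flags, memory_order_relaxed);
    }

    /* Stand-in for the write barrier before the head flags update. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ring[head].flags, head_flags,
                          memory_order_relaxed);
}

/* Device-side view: head is available when AVAIL == wrap and USED != wrap. */
static int sketch_head_is_avail(struct sketch_desc *ring, uint16_t head,
                                int wrap)
{
    uint16_t f = atomic_load_explicit(&ring[head].flags,
                                      memory_order_acquire);
    int avail = !!(f & DESC_F_AVAIL);
    int used = !!(f & DESC_F_USED);

    return avail == wrap && used != wrap;
}
```

In this sketch a device that only acts on the head descriptor's flags never observes a half-written chain, whatever the per-descriptor field order is. A device that reads ahead past the head gets no such guarantee, which is why relying on the reordered writes would need a feature bit, as noted in the thread.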

