On 05.12.2018 12:49, Maxime Coquelin wrote:
> A read barrier is required to ensure the ordering between
> available index and the descriptor reads is enforced.
>
> Fixes: 4796ad63ba1f ("examples/vhost: import userspace vhost application")
> Cc: sta...@dpdk.org
>
> Reported-by: Jason Wang <jasow...@redhat.com>
> Signed-off-by: Maxime Coquelin <maxime.coque...@redhat.com>
> ---
>  lib/librte_vhost/virtio_net.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> index 5e1a1a727..f11ebb54f 100644
> --- a/lib/librte_vhost/virtio_net.c
> +++ b/lib/librte_vhost/virtio_net.c
> @@ -791,6 +791,12 @@ virtio_dev_rx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
>  	rte_prefetch0(&vq->avail->ring[vq->last_avail_idx & (vq->size - 1)]);
>  	avail_head = *((volatile uint16_t *)&vq->avail->idx);
>
> +	/*
> +	 * The ordering between avail index and
> +	 * desc reads needs to be enforced.
> +	 */
> +	rte_smp_rmb();
> +

Hmm. This looks weird to me. Could you please describe the bad scenario
here? (It would be good to have it in the commit message too.)

As I understand it, you're forcing the read of avail->idx to happen
before the read of avail->ring[avail_idx]. Is that correct?

But we have the following code sequence:

1. read avail->idx (avail_head).
2. check that last_avail_idx != avail_head.
3. read from the ring using last_avail_idx.

So there is a strict dependency between all 3 steps, and the memory
transaction will be finished at step #2 in any case. There is no way
to read the ring before reading avail->idx.

Am I missing something?

>  	for (pkt_idx = 0; pkt_idx < count; pkt_idx++) {
>  		uint32_t pkt_len = pkts[pkt_idx]->pkt_len + dev->vhost_hlen;
>  		uint16_t nr_vec = 0;
> @@ -1373,6 +1379,12 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
>  	if (free_entries == 0)
>  		return 0;
>
> +	/*
> +	 * The ordering between avail index and
> +	 * desc reads needs to be enforced.
> +	 */
> +	rte_smp_rmb();
> +

This one is strange too.

    free_entries = *((volatile uint16_t *)&vq->avail->idx) -
            vq->last_avail_idx;
    if (free_entries == 0)
            return 0;

The code reads the value of avail->idx and uses that value on the next
line, regardless of compiler optimizations. There is no way for the CPU
to postpone the actual read.

>  	VHOST_LOG_DEBUG(VHOST_DATA, "(%d) %s\n", dev->vid, __func__);
>
>  	count = RTE_MIN(count, MAX_PKT_BURST);
>
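
For reference, the three-step sequence discussed above can be sketched in
isolation as below. This is a simplified model and not the vhost code:
struct demo_avail, struct demo_vq and demo_next_avail() are hypothetical
names, the relaxed C11 atomic load stands in for the volatile read of
avail->idx, and atomic_thread_fence(memory_order_acquire) is only an
assumed stand-in that marks roughly where the patch places rte_smp_rmb().
It is not a statement about how rte_smp_rmb() is implemented.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the split-ring avail area. */
struct demo_avail {
	_Atomic uint16_t idx;	/* written by the producer (guest) */
	uint16_t ring[8];	/* descriptor head indexes */
};

/* Hypothetical, simplified stand-in for the consumer-side queue state. */
struct demo_vq {
	struct demo_avail *avail;
	uint16_t size;			/* ring size, must be a power of two */
	uint16_t last_avail_idx;	/* private to the consumer */
};

/*
 * Returns 1 and stores the next descriptor head in *head when an entry
 * is available, 0 when the ring is empty.
 */
static int
demo_next_avail(struct demo_vq *vq, uint16_t *head)
{
	uint16_t avail_head;

	assert(vq->size != 0 && (vq->size & (vq->size - 1)) == 0);

	/* Step 1: read avail->idx. */
	avail_head = atomic_load_explicit(&vq->avail->idx,
					  memory_order_relaxed);

	/* Step 2: check that last_avail_idx != avail_head. */
	if (vq->last_avail_idx == avail_head)
		return 0;

	/*
	 * Roughly where the patch adds rte_smp_rmb().
	 * Assumed stand-in only, see the note above.
	 */
	atomic_thread_fence(memory_order_acquire);

	/* Step 3: read from the ring using last_avail_idx. */
	*head = vq->avail->ring[vq->last_avail_idx & (vq->size - 1)];
	vq->last_avail_idx++;
	return 1;
}
```

Called in a loop, demo_next_avail() hands back one ring entry per call
until last_avail_idx catches up with the published index. The sketch only
shows where the barrier sits relative to the three steps; it does not by
itself show whether the barrier is required.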