On 06.12.2018 7:17, Jason Wang wrote:
>
> On 2018/12/5 7:30 PM, Ilya Maximets wrote:
>> On 05.12.2018 12:49, Maxime Coquelin wrote:
>>> A read barrier is required to ensure the ordering between
>>> available index and the descriptor reads is enforced.
>>>
>>> Fixes: 4796ad63ba1f ("examples/vhost: import userspace vhost application")
>>> Cc: sta...@dpdk.org
>>>
>>> Reported-by: Jason Wang <jasow...@redhat.com>
>>> Signed-off-by: Maxime Coquelin <maxime.coque...@redhat.com>
>>> ---
>>>   lib/librte_vhost/virtio_net.c | 12 ++++++++++++
>>>   1 file changed, 12 insertions(+)
>>>
>>> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
>>> index 5e1a1a727..f11ebb54f 100644
>>> --- a/lib/librte_vhost/virtio_net.c
>>> +++ b/lib/librte_vhost/virtio_net.c
>>> @@ -791,6 +791,12 @@ virtio_dev_rx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
>>>  	rte_prefetch0(&vq->avail->ring[vq->last_avail_idx & (vq->size - 1)]);
>>>  	avail_head = *((volatile uint16_t *)&vq->avail->idx);
>>> +	/*
>>> +	 * The ordering between avail index and
>>> +	 * desc reads needs to be enforced.
>>> +	 */
>>> +	rte_smp_rmb();
>>> +
>> Hmm. This looks weird to me.
>> Could you please describe the bad scenario here? (It'd be good to have it
>> in the commit message too.)
>>
>> As I understand, you're enforcing the read of avail->idx to happen before
>> reading avail->ring[avail_idx]. Is that correct?
>>
>> But we have the following code sequence:
>>
>> 1. read avail->idx (avail_head).
>> 2. check that last_avail_idx != avail_head.
>> 3. read from the ring using last_avail_idx.
>>
>> So, there is a strict dependency between all 3 steps and the memory
>> transaction will be finished at step #2 in any case. There is no
>> way to read the ring before reading avail->idx.
>>
>> Am I missing something?
>
>
> Nope, I kind of get what you mean now. And even if we
>
> 4. read the descriptor from the descriptor ring using the id read in 3
>
> 5. read the descriptor content according to the address from 4
>
> they are still dependent memory accesses. So there's no need for an rmb.
>
On a second glance I changed my mind. The code looks like this:

    1. read avail_head = avail->idx
    2. read cur_idx = last_avail_idx

       if (cur_idx != avail_head) {
    3.     read idx = avail->ring[cur_idx]
    4.     read desc[idx]
       }

There is an address (data) dependency 2 -> 3 -> 4, so these reads cannot
be reordered. But there is only a control dependency between 1 and (3, 4),
because 'avail_head' is not used to calculate 'cur_idx'. Under aggressive
speculative execution, 1 could be reordered with 3, resulting in reading
a not-yet-updated 'idx'.

Not sure whether speculative execution can go that far while 'avail_head'
has not been read yet, but it should be possible in theory.

Thoughts?

>>
>>>  	for (pkt_idx = 0; pkt_idx < count; pkt_idx++) {
>>>  		uint32_t pkt_len = pkts[pkt_idx]->pkt_len + dev->vhost_hlen;
>>>  		uint16_t nr_vec = 0;
>>> @@ -1373,6 +1379,12 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
>>>  	if (free_entries == 0)
>>>  		return 0;
>>> +	/*
>>> +	 * The ordering between avail index and
>>> +	 * desc reads needs to be enforced.
>>> +	 */
>>> +	rte_smp_rmb();
>>> +
>> This one is strange too.
>>
>> 	free_entries = *((volatile uint16_t *)&vq->avail->idx) -
>> 			vq->last_avail_idx;
>> 	if (free_entries == 0)
>> 		return 0;
>>
>> The code reads the value of avail->idx and uses it on the next line,
>> regardless of compiler optimizations. There is no way for the CPU to
>> postpone the actual read.
>
>
> Yes.

It's kind of a similar situation here, but 'avail_head' is involved somehow
in the 'cur_idx' calculation because of

    fill_vec_buf_split(..., vq->last_avail_idx + i, ...)

and 'i' depends on 'free_entries'. But we'd need to look at the exact asm
code to be sure. I think we may add the barrier here to avoid possible
issues.

> Thanks
>
>
>>
>>>  	VHOST_LOG_DEBUG(VHOST_DATA, "(%d) %s\n", dev->vid, __func__);
>>>  	count = RTE_MIN(count, MAX_PKT_BURST);
>>>
>
>