I agree with Maxime: just adding an error log doesn't help, since something may be wrong elsewhere. I don't have the context for this issue. If it can be reproduced in SPDK, I suggest submitting an issue to SPDK first.
> -----Original Message-----
> From: Xia, Chenbo <chenbo....@intel.com>
> Sent: Thursday, October 14, 2021 4:28 PM
> To: Maxime Coquelin <maxime.coque...@redhat.com>; Li Feng
> <fen...@smartx.com>; Liu, Changpeng <changpeng....@intel.com>
> Cc: dev@dpdk.org
> Subject: RE: [PATCH v1] vhost: add sanity check for resubmiting reqs in split
> ring
>
> Hi Changpeng,
>
> > -----Original Message-----
> > From: Maxime Coquelin <maxime.coque...@redhat.com>
> > Sent: Thursday, October 14, 2021 4:26 PM
> > To: Li Feng <fen...@smartx.com>; Xia, Chenbo <chenbo....@intel.com>
> > Cc: dev@dpdk.org
> > Subject: Re: [PATCH v1] vhost: add sanity check for resubmiting reqs in
> > split ring
> >
> > On 10/14/21 10:17, Maxime Coquelin wrote:
> > > Hi Li,
> > >
> > > Adding Jin Yu who introduced this function.
> >
> > Looks like Jin Yu has left Intel, Chenbo, could you find someone from
> > the Intel SPDK team to look at it?
>
> Could you or your team member help check this?
>
> Thanks,
> Chenbo
>
> >
> > > On 8/27/21 07:12, Li Feng wrote:
> > >> When getting reqs from the avail ring, the id may exceed inflight
> > >> queue size. Then the dpdk will crash forever.
> > >
> > > You need to add Fixes tag and Cc sta...@dpdk.org so that it can be
> > > backported.
> > >
> > >> Signed-off-by: Li Feng <fen...@smartx.com>
> > >> ---
> > >>  lib/vhost/vhost_user.c | 10 ++++++++--
> > >>  1 file changed, 8 insertions(+), 2 deletions(-)
> > >>
> > >> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> > >> index 29a4c9af60..f09d0f6a48 100644
> > >> --- a/lib/vhost/vhost_user.c
> > >> +++ b/lib/vhost/vhost_user.c
> > >> @@ -1823,8 +1823,14 @@ vhost_check_queue_inflights_split(struct
> > >> virtio_net *dev,
> > >>      last_io = inflight_split->last_inflight_io;
> > >>      if (inflight_split->used_idx != used->idx) {
> > >> -        inflight_split->desc[last_io].inflight = 0;
> > >> -        rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
> > >> +        if (unlikely(last_io >= inflight_split->desc_num)) {
> > >> +            VHOST_LOG_CONFIG(ERR, "last_inflight_io '%"PRIu16"'
> > >> exceeds inflight "
> > >> +                "queue size (%"PRIu16").\n", last_io,
> > >> +                inflight_split->desc_num);
> > >
> > > If such error happens, shouldn't we return RTE_VHOST_MSG_RESULT_ERR
> > > instead of just logging an error?
> > >
> > >> +        } else {
> > >> +            inflight_split->desc[last_io].inflight = 0;
> > >> +            rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
> > >> +        }
> > >>          inflight_split->used_idx = used->idx;
> > >>      }
> > >>
> > >
> > > Regards,
> > > Maxime
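
For reference, Maxime's suggestion (fail the request instead of only logging) could look roughly like the standalone sketch below. It is not the DPDK code: the struct layout, the `msg_result` enum and the `check_inflight_split()` name are hypothetical stand-ins for the real vhost types, and a C11 `atomic_thread_fence()` stands in for `rte_atomic_thread_fence()`. It only illustrates the control flow under those assumptions.

```c
#include <inttypes.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the vhost result codes and inflight structures. */
enum msg_result { MSG_RESULT_ERR = -1, MSG_RESULT_OK = 0 };

struct inflight_desc {
    uint8_t inflight;
};

struct inflight_split {
    uint16_t used_idx;
    uint16_t last_inflight_io;
    uint16_t desc_num;
    struct inflight_desc *desc;
};

/*
 * Reconcile the inflight state with the used ring index.
 * If last_inflight_io is out of range, report the error to the caller
 * and leave the state untouched instead of indexing past the array.
 */
enum msg_result
check_inflight_split(struct inflight_split *inf, uint16_t ring_used_idx)
{
    uint16_t last_io = inf->last_inflight_io;

    if (inf->used_idx != ring_used_idx) {
        if (last_io >= inf->desc_num) {
            fprintf(stderr,
                "last_inflight_io '%" PRIu16 "' exceeds inflight "
                "queue size (%" PRIu16 ").\n",
                last_io, inf->desc_num);
            return MSG_RESULT_ERR;
        }

        inf->desc[last_io].inflight = 0;
        /* Assumption: plays the role of the fence in the original patch. */
        atomic_thread_fence(memory_order_seq_cst);
        inf->used_idx = ring_used_idx;
    }

    return MSG_RESULT_OK;
}
```

Whether returning an error is the right fix still depends on why the index goes out of range in the first place, which is the open question in this thread.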