On Thu, Oct 14, 2021 at 7:38 PM Maxime Coquelin
<maxime.coque...@redhat.com> wrote:
>
>
>
> On 10/14/21 13:25, Li Feng wrote:
> > Thank you for your response.
> >
> > On Thu, Oct 14, 2021 at 4:17 PM Maxime Coquelin
> > <maxime.coque...@redhat.com> wrote:
> >>
> >> Hi Li,
> >>
> >> Adding Jin Yu who introduced this function.
> >>
> >> On 8/27/21 07:12, Li Feng wrote:
> >>> When getting reqs from the avail ring, the id may exceed inflight
> >>> queue size. Then the dpdk will crash forever.
> >>
> >> You need to add a Fixes tag and Cc sta...@dpdk.org so that it can be
> >> backported.
> > OK, I will send the v2 version.
> >
> >>
> >>> Signed-off-by: Li Feng <fen...@smartx.com>
> >>> ---
> >>>    lib/vhost/vhost_user.c | 10 ++++++++--
> >>>    1 file changed, 8 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> >>> index 29a4c9af60..f09d0f6a48 100644
> >>> --- a/lib/vhost/vhost_user.c
> >>> +++ b/lib/vhost/vhost_user.c
> >>> @@ -1823,8 +1823,14 @@ vhost_check_queue_inflights_split(struct virtio_net *dev,
> >>>        last_io = inflight_split->last_inflight_io;
> >>>
> >>>        if (inflight_split->used_idx != used->idx) {
> >>> -             inflight_split->desc[last_io].inflight = 0;
> >>> -             rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
> >>> +             if (unlikely(last_io >= inflight_split->desc_num)) {
> >>> +                     VHOST_LOG_CONFIG(ERR, "last_inflight_io '%"PRIu16"' exceeds inflight "
> >>> +                             "queue size (%"PRIu16").\n", last_io,
> >>> +                             inflight_split->desc_num);
> >>
> >> If such error happens, shouldn't we return RTE_VHOST_MSG_RESULT_ERR
> >> instead of just logging an error?
> > I think ignoring the error is ok. No one could handle this error correctly.
> > At this time the guest virtio driver of this virtqueue may be in an
> > incorrect state.
>
> I am not sure I understand how this can happen.
> But I see that last_io is actually vq->inflight_split->last_inflight_io,
> which is set only by rte_vhost_set_last_inflight_io_split() API.
The corrupted value comes from the frontend driver.
I hit this issue in my environment while a VM was hung, so I guess the
bad value came from that VM.

>
> Shouldn't there be a sanity check there to ensure that last_inflight_io
> is smaller than desc_num value set by the frontend?

Yes, adding a check in rte_vhost_set_last_inflight_io_split() is fine too.
I will send a v2 that includes it.
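
As a rough illustration of the kind of check I mean (this is only a
standalone sketch, not the actual DPDK code; the struct and function
names are made up for the example and the real inflight structures
carry more fields):

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the split-ring inflight region (hypothetical). */
struct inflight_split_sketch {
	uint16_t desc_num;         /* queue size set by the frontend */
	uint16_t last_inflight_io; /* index of the last inflight descriptor */
};

/*
 * Hypothetical setter: refuse an index that does not fit in the
 * inflight queue instead of storing it for later use as an array index.
 * Returns 0 on success, -1 on a bad argument.
 */
static int
set_last_inflight_io_checked(struct inflight_split_sketch *inflight,
		uint16_t idx)
{
	if (inflight == NULL)
		return -1;

	/* Valid indexes are 0 .. desc_num - 1. */
	if (idx >= inflight->desc_num)
		return -1;

	inflight->last_inflight_io = idx;
	return 0;
}
```

With a queue size of 8, index 7 would be accepted and index 8 rejected,
and a rejected call leaves the stored value untouched.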

Thanks.
>
> Returning an error is the right thing to do anyway.
OK.
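
For the check path, something along these lines is what I understand
you are asking for. It is a simplified standalone sketch: the types,
the fixed array size and the result enum are placeholders I made up
for the example, standing in for the real vhost structures and for
RTE_VHOST_MSG_RESULT_OK / RTE_VHOST_MSG_RESULT_ERR.

```c
#include <stdint.h>

/* Placeholder result codes for this sketch only. */
enum msg_result_sketch {
	MSG_RESULT_SKETCH_ERR = -1,
	MSG_RESULT_SKETCH_OK = 0,
};

struct desc_sketch {
	uint8_t inflight;
};

/* Simplified inflight region; desc_num must not exceed 8 here. */
struct inflight_region_sketch {
	uint16_t desc_num;
	uint16_t used_idx;
	uint16_t last_inflight_io;
	struct desc_sketch desc[8];
};

/*
 * If the used index moved, clear the inflight flag of the last I/O,
 * but fail instead of indexing past the end of desc[].
 */
static enum msg_result_sketch
check_inflights_sketch(struct inflight_region_sketch *r,
		uint16_t ring_used_idx)
{
	uint16_t last_io = r->last_inflight_io;

	if (r->used_idx != ring_used_idx) {
		if (last_io >= r->desc_num)
			return MSG_RESULT_SKETCH_ERR; /* report, do not touch desc[] */

		r->desc[last_io].inflight = 0;
		r->used_idx = ring_used_idx;
	}

	return MSG_RESULT_SKETCH_OK;
}
```

On the error path nothing in the region is modified, so the caller can
decide how to handle the bad state.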
>
> >>
> >>> +             } else {
> >>> +                     inflight_split->desc[last_io].inflight = 0;
> >>> +                     rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
> >>> +             }
> >>>                inflight_split->used_idx = used->idx;
> >>>        }
> >>>
> >>>
> >>
> >> Regards,
> >> Maxime
> >>
> >
>
