Hi Stephen:

If there is no intention to process the response message of NVS_TYPE_RNDIS, would it be better to not set the flags to VMBUS_CHANPKT_FLAG_RC so that it won't receive any response message?

Best Regards,
Min Tang

On Sun, Mar 1, 2020 at 12:54 PM Stephen Hemminger <step...@networkplumber.org> wrote:

> On Thu, 27 Feb 2020 11:16:01 -0500
> Min Tang <tommyt...@gmail.com> wrote:
>
> > Hi Stephen:
> >
> > I saw the following error messages when using DPDK 18.11.2 in Azure:
> >
> > hn_nvs_execute(): unexpected NVS resp 0x6b, expect 0x85
> > hn_dev_configure(): subchannel configuration failed
> >
> > It was not easy to reproduce it and it only occurred with multiple queues
> > enabled. In hn_nvs_execute it expects the response to match the request. In
> > the failed case, it was expecting NVS_TYPE_SUBCH_REQ (133 or 0x85) but
> > got NVS_TYPE_RNDIS(107 or 0x6b). Obviously somewhere the NVS_TYPE_RNDIS
> > message had been sent before the NVS_TYPE_SUBCH_REQ message was sent. I
> > looked at the code and found that the NVS_TYPE_RNDIS message needs
> > completion response but it does not receive the response message anywhere.
> > The fix would be receiving and discarding the wrong response message(s).
> >
> > I put the following patches and it has fixed the problem.
> >
> > --- a/drivers/net/netvsc/hn_nvs.c 2020-02-27 11:08:29.755530969 -0500
> > +++ b/drivers/net/netvsc/hn_nvs.c 2020-02-27 11:07:21.567371798 -0500
> > @@ -92,7 +92,7 @@
> >  	if (hdr->type != type) {
> >  		PMD_DRV_LOG(ERR, "unexpected NVS resp %#x, expect %#x",
> >  			    hdr->type, type);
> > -		goto retry;
> > +		return -EINVAL;
> >  	}
> >
> >  	if (len < resplen) {
> >
>
> The situation is that NVS_TYPE_RNDIS is a receive packet that is
> arriving while subchannel is being setup. For first channel this
> doesn't happen because control operations at that level happen
> before packets arrive.
>
> Needs some more research before coming up with a good fix.
> Either the processing of responses in nvs_execute needs to use
> the same receive processing function as normal data. Which
> means adding logic to wait for condition; or the incoming
> packets there could be dropped; or the device needs to be
> stopped before configuring sub channels.
>
> Dropping is probably the easiest to implement.
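
For reference, the "drop" option could look roughly like the sketch below. This is only a self-contained illustration of the idea, not the driver code: struct fake_chan, fake_recv() and wait_resp_drop_others() are made-up names standing in for the real channel and receive path, and the max_drop bound is my own assumption so that the loop cannot spin forever. Only the two message type values come from the log above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Message type values taken from the error log quoted above. */
#define NVS_TYPE_RNDIS     107u	/* 0x6b */
#define NVS_TYPE_SUBCH_REQ 133u	/* 0x85 */

/* Hypothetical stand-in for a channel: a fixed list of pending message types. */
struct fake_chan {
	const uint32_t *types;
	size_t count;
	size_t next;
};

/* Hypothetical receive: pop the next message type, -EAGAIN when empty. */
static int fake_recv(struct fake_chan *c, uint32_t *type)
{
	if (c->next >= c->count)
		return -EAGAIN;
	*type = c->types[c->next++];
	return 0;
}

/*
 * Wait for a response of type `expect`, dropping every other message.
 * `max_drop` (an assumption of this sketch) bounds how many unrelated
 * messages are discarded before giving up.
 * Returns 0 on a match, -EAGAIN if the channel ran empty, -EIO if more
 * than `max_drop` unrelated messages were seen.
 */
static int wait_resp_drop_others(struct fake_chan *c, uint32_t expect,
				 unsigned int max_drop, unsigned int *dropped)
{
	uint32_t type;
	int ret;

	assert(c != NULL && dropped != NULL);
	*dropped = 0;

	for (;;) {
		ret = fake_recv(c, &type);
		if (ret < 0)
			return ret;
		if (type == expect)
			return 0;
		if (*dropped >= max_drop)
			return -EIO;
		(*dropped)++;	/* discard, e.g. a stray NVS_TYPE_RNDIS */
	}
}
```

With two NVS_TYPE_RNDIS messages queued ahead of the NVS_TYPE_SUBCH_REQ response, the helper skips both and returns 0. Whether silently dropping those messages is safe in the real driver is exactly the open question in this thread.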