On Thu, 27 Feb 2020 11:16:01 -0500 Min Tang <tommyt...@gmail.com> wrote:
> Hi Stephen:
>
> I saw the following error messages when using DPDK 18.11.2 in Azure:
>
> hn_nvs_execute(): unexpected NVS resp 0x6b, expect 0x85
> hn_dev_configure(): subchannel configuration failed
>
> It was not easy to reproduce it and it only occurred with multiple queues
> enabled. In hn_nvs_execute it expects the response to match the request. In
> the failed case, it was expecting NVS_TYPE_SUBCH_REQ (133 or 0x85) but
> got NVS_TYPE_RNDIS(107 or 0x6b). Obviously somewhere the NVS_TYPE_RNDIS
> message had been sent before the NVS_TYPE_SUBCH_REQ message was sent. I
> looked at the code and found that the NVS_TYPE_RNDIS message needs
> completion response but it does not receive the response message anywhere.
> The fix would be receiving and discarding the wrong response message(s).
>
> I put the following patches and it has fixed the problem.
>
> --- a/drivers/net/netvsc/hn_nvs.c 2020-02-27 11:08:29.755530969 -0500
> +++ b/drivers/net/netvsc/hn_nvs.c 2020-02-27 11:07:21.567371798 -0500
> @@ -92,7 +92,7 @@
>  	if (hdr->type != type) {
>  		PMD_DRV_LOG(ERR, "unexpected NVS resp %#x, expect %#x",
>  			    hdr->type, type);
> -		goto retry;
> +		return -EINVAL;
>  	}
>
>  	if (len < resplen) {

The situation is that NVS_TYPE_RNDIS is a receive packet that arrives
while the subchannel is being set up. For the first channel this doesn't
happen, because control operations at that level happen before packets
arrive.

This needs some more research before coming up with a good fix. Either
the processing of responses in nvs_execute needs to use the same receive
processing function as normal data, which means adding logic to wait for
a condition; or the incoming packets there could be dropped; or the
device needs to be stopped before configuring subchannels.

Dropping is probably the easiest to implement.
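
To make the "drop" option concrete, here is a minimal standalone sketch of the idea. It is not the netvsc driver code. `struct fake_chan`, `fake_chan_recv()` and `wait_for_resp()` are hypothetical names invented for illustration, and the channel is just an in-memory array of message types. Only the two type values (107 and 133) come from the report above. The loop discards NVS_TYPE_RNDIS messages that arrive while it waits for the expected response, and still fails on any other unexpected type.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Message type values as quoted in the report. */
#define NVS_TYPE_RNDIS     107u /* 0x6b */
#define NVS_TYPE_SUBCH_REQ 133u /* 0x85 */

/* Hypothetical stand-in for a channel: a fixed list of pending message types. */
struct fake_chan {
	const uint32_t *types; /* queued message types, oldest first */
	size_t count;          /* number of queued messages */
	size_t next;           /* index of the next message to deliver */
};

/* Hypothetical receive: 0 and the next type on success, -EAGAIN when empty. */
static int fake_chan_recv(struct fake_chan *ch, uint32_t *type)
{
	assert(ch != NULL && type != NULL);

	if (ch->next >= ch->count)
		return -EAGAIN;

	*type = ch->types[ch->next++];
	return 0;
}

/*
 * Wait for a response of type 'expect'. Data packets (NVS_TYPE_RNDIS) that
 * arrive in the meantime are counted and dropped. Any other unexpected type
 * is treated as an error.
 */
static int wait_for_resp(struct fake_chan *ch, uint32_t expect,
			 unsigned int *dropped)
{
	uint32_t type;
	int ret;

	assert(dropped != NULL);

	for (;;) {
		ret = fake_chan_recv(ch, &type);
		if (ret < 0)
			return ret;     /* nothing left to read */

		if (type == expect)
			return 0;       /* got the response we wanted */

		if (type == NVS_TYPE_RNDIS) {
			(*dropped)++;   /* stray receive packet: discard */
			continue;
		}

		return -EINVAL;         /* genuinely unexpected message */
	}
}
```

In a real driver, a dropped RNDIS packet would most likely still have to be completed back to the host so that its receive buffer is released. That step is omitted here and would need checking against the actual channel code.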