On Fri, 2016-07-08 at 00:14 +0000, Lu, Wenzhuo wrote: > > > > > > > > > > Hello Wenzhuo, > > > > > > > > > > > > > > > > > > > > I'm testing this patchset, but I am sporadically running > > > > > > > > > > into an issue where the VFs reset fails after the PF flaps. > > > > > > > > > > > > > > > > > > > > I have a VM running on a KVM box with a X540-AT2, passing 2 > > > > > > > > > > VFs > > in. > > > > > > > > > > > > > > > > > > > > I am using calling rte_eth_dev_reset in response to a > > > > > > > > > > RTE_ETH_EVENT_INTR_RESET callback, and the following > > > > > > > > > > errors appear in the > > > > > > > > > > log: > > > > > > > > > > > > > > > > > > > > PMD: ixgbevf_dev_reset(): Ixgbe VF reset: Failed to update > > > > > > > > > > link. > > > > > > > > > > PMD: ixgbe_alloc_rx_queue_mbufs(): RX mbuf alloc failed > > > > > > > > > > queue_id=0 > > > > > > > > > > PMD: ixgbevf_dev_start(): Unable to initialize RX > > > > > > > > > > hardware > > > > > > > > > > (-12) > > > > > > > > > > PMD: ixgbevf_dev_reset(): Ixgbe VF reset: Failed to start > > > > > > > > > > device. > > > > > > > > > > > > > > > > > > > > Jumping in with GDB, it seems that the rte_rxmbuf_alloc > > > > > > > > > > call in ixgbe_alloc_rx_queue_mbufs returns NULL at > > > > > > > > > > iteration 64 out of > > > > 2048. > > > > > > > > > > The application has ~500 2MB hugepages, and there's 2GB > > > > > > > > > > of free memory available on top of that. > > > > > > > > > > > > > > > > > > > > Have you seen this before? Any pointer or suggestion for > > debugging? > > > > > > > > > > > > > > > > > > > > Thanks! > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > Kind regards, > > > > > > > > > > Luca Boccassi > > > > > > > > > I think the problem is the mbuf occupied by the packets is > > > > > > > > > not released. This > > > > > > > > memory has to be released by the APP, so my patches haven?t > > > > > > > > covered > > > > this. > > > > > > > > Actually an example is needed to show how to use the reset API. > > > > > > > > I plan to modify the testpmd. > > > > > > > > > You may notice this feature is postponed to 16.11. Would > > > > > > > > > you like to wait for > > > > > > > > the new version that will include an example? > > > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > Unfortunately we need the VF reset working sooner than that, > > > > > > > > so one way or the other I'll need to sort it out. Given I've > > > > > > > > got a use case where this is happening, if it can be helpful > > > > > > > > for you I'm more than happy to help as a guinea pig. If you > > > > > > > > could please give some guidance/guidelines with regards to > > > > > > > > which API to use to sort the mbuf > > > > > > problem, I can try it out and give back some feedback. > > > > > > > > > > > > > > > > Thanks! > > > > > > > I made a stupid mistake and deleted all my code. So, I have to > > > > > > > take some time to rewrite it :( Attached the example I used to > > > > > > > test the reset API. It's > > > > > > modified from the l2fwd example. So you can compare it with > > > > > > l2fwd to see what need to be added. > > > > > > > Hopefully it can help :) > > > > > > > > > > > > Thanks! That made me understand a couple of things more, and > > > > > > I've got past the problem. > > > > > > > > > > > > Unfortunately now there's a bigger issue - rte_eth_dev_reset is > > > > > > a blocking > > > > call. > > > > > > the _RESET event callback is fired when the PF goes down, but > > > > > > when I call rte_eth_dev_reset it will block until the PF goes back > > > > > > up. > > > > > > There is no way, as far as I can see, to know if the PF is back > > > > > > up before > > > > calling rte_eth_dev_reset. > > > > > > > > > > > > This is a problem because, as far as I understand, I have to > > > > > > call all the rte_eth_dev_ APIs from the same thread, in my case > > > > > > the master thread, and I can't have that block potentially > > > > > > indefinitely. > > > > > > > > > > > > Would it be possible to have 2 events instead of 1, one when the > > > > > > PF goes down and one when it goes up? This way an application > > > > > > would be able to soft-stop the port (drain queues, etc) when the > > > > > > PF is down, and then call the reset API when it goes back up. > > > > > > > > > > > > Thanks! > > > > > Sorry we cannot have 2 events now. There're 2 problems to have 2 > > > > > events. > > > > > 1, Normally we use kernel driver for PF. Now the kernel driver > > > > > only have one > > > > kind of message for link down and up. So we cannot tell if it's down or > > > > up. > > > > > 2, When the PF is down, if we don't reset the VF, VF is not working. > > > > > It cannot receive any message from PF. So we cannot know that when > > > > > PF is up. It means normally we have to reset VF twice when PF down > > > > > and up. (Surely we can wait a while when we receive the message > > > > > from PF until PF is up. But we cannot tell how long the time is > > > > > appropriate. > > > > > So this *wait a while* may work for flash.) > > > > > > > > Thanks for the clarification, I understand. > > > > > > > > The problem with a blocking call is that we basically need to spawn > > > > one thread per rte_eth_dev_reset call, since there is no way of > > > > knowing if a PF is down for good or just flapping, and we can't have > > > > a single thread managing all the interfaces being blocked forever > > > > (EG: PF 1 and 2 go down, thread blocks on PF 1 reset call but it > > > > never returns, meanwhile PF 2 goes back up but call is never made). > > > > > > > > A colleague of mine, Eric Kinzie, suggested to add a blocking > > > > boolean parameter to rte_eth_dev_reset API. If set to false, then > > > > the call will not block and just does one try and return an error > > > > (EAGAIN ?). > > Would this be an acceptable proposition? > > > It's a good suggestion. > > > And I think if the parameter is set to false and the link is not up after > > > trying > > once, it will be APP's responsibility to setup a timer or something like > > that to > > keep trying to bring up the link. > > > > That seems reasonable. I've thrown together a quick diff and played with it > > on > > top of your patches and DPDK 2.2, seems to work as intended, I'm attaching > > it > > for reference. Feel free to pick it up, adapt it or ignore it :-) > > > > Also I've noticed that the ixgbe is the only one that actually blocks, > > e1000 returns already immediately if the dev_start fails (perhaps it should > > be > > changed to be consistent?) and ixgb40 does weird things that I'm not sure > > about, > > but couldn't spot a loop in there :-) > > > > Also I've used int instead of bool because > > drivers/net/e1000/base/e1000_osdep.h redefines bool and true/false, so > > compilation fails when including stdbool.h and using bool in rte_ethdev.h > > > > -- > > Kind regards, > > Luca Boccassi > Glad to know it's working now. Thanks for your patch. Surely I'll try to > include it in the next version :)
Great, thanks! Unfortunately I found one issue: if PF is down, and then the VF on the guest is down as well (ip link down) and then goes back up before the PF, then calling rte_eth_dev_reset will return 0 (success), even though the PF is still down and it should fail. This is with ixgbe. Any idea what could be the problem? -- Kind regards, Luca Boccassi