Thanks a lot for the response, Yuanhan. I am using DPDK v16.07. So what you are saying is that in 16.07 we don't really need to call rte_eth_dev_close() on exit, because DPDK will ensure that it does a virtio reset before init when it comes up, right?
Regarding the vhost commits you mentioned - do we still need those fixes if we have the "virtio reset before init" mechanism? Or is that a separate problem altogether (and hence we would need those fixes)?

Rgds,
Gopa.

On Thu, Mar 16, 2017 at 7:06 PM, Yuanhan Liu <yuanhan....@linux.intel.com> wrote:

> On Thu, Mar 16, 2017 at 12:39:16PM -0700, Gopakumar Choorakkot Edakkunni wrote:
> > So the doc says we should call rte_eth_dev_close() *before* going down. And I
> > know that especially in dpdk-virtionet in the guest + ovs-dpdk in the host,
> > the ovs ends up getting stalled/stuck (!!) if I dont close the port before
> > starting() it when the guest dpdk process comes back up.
>
> I'm assuming you were using an old version, something like dpdk v2.2?
> IIRC, DPDK v16.04 should have fixed your issue.
>
> > Considering that this not done properly can screw up the HOST ovs, and I want
> > to do everything possible to avoid that, I want to be 200% sure that I call
> > close even if my process gets a kill -9 .. So obviously the only way of doing
> > that is to close the port when the dpdk process comes back up and *before* we
> > init the port. rte_eth_dev_close() is not capable of doing that as it expects
> > the port parameters to be initialized etc.. before it can be called.
>
> We do virtio reset before init, which is basically what rte_eth_dev_close()
> mainly does. So I see no big issue here.
>
> The stuck issue is due to hugepage reset by the guest DPDK application,
> leading all virtio vring elements being mem zeroed. The old vhost doesn't
> handle it well, as a result, it got stuck. And here are some relevant
> commits:
>
>     a436f53 vhost: avoid dead loop chain
>     c687b0b vhost: check for ring descriptors overflow
>     623bc47 vhost: do sanity check for ring descriptor length
>
>         --yliu
>
> > Any other
> > suggestions on what can be done to close on restart rather than close on going
> > down ? Thought of bouncing this by the alias before I add a version of close
> > myself that can do this close-on-restart
>
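
For anyone following the thread, here is a rough, self-contained sketch of the kind of defensive checks that the three vhost commit titles quoted above seem to describe: bounding the chain walk, range-checking descriptor indexes, and sanity-checking descriptor lengths. This is not the DPDK vhost code. `struct desc`, `DESC_F_NEXT` and `walk_chain()` are hypothetical names made up for illustration, and treating a zero-length descriptor as invalid is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag: "this descriptor continues via the 'next' field". */
#define DESC_F_NEXT 1u

/* Simplified descriptor with the same fields as a split-ring virtio descriptor. */
struct desc {
    uint64_t addr;   /* guest address of the buffer */
    uint32_t len;    /* buffer length in bytes */
    uint16_t flags;  /* DESC_F_NEXT if the chain continues */
    uint16_t next;   /* index of the next descriptor in the chain */
};

/*
 * Walk one descriptor chain starting at 'head'.
 * Returns the number of descriptors in the chain, or -1 if the chain looks
 * malformed: an index outside the ring, a zero-length descriptor (this
 * sketch's policy), or a chain longer than the ring (which implies a loop).
 */
static int walk_chain(const struct desc *ring, uint16_t ring_size, uint16_t head)
{
    assert(ring != NULL);

    uint16_t idx = head;
    int count = 0;

    for (;;) {
        if (idx >= ring_size)
            return -1;              /* index overflow */
        if (ring[idx].len == 0)
            return -1;              /* length sanity check */
        if (++count > ring_size)
            return -1;              /* more hops than entries: looping chain */
        if (!(ring[idx].flags & DESC_F_NEXT))
            return count;           /* end of chain */
        idx = ring[idx].next;
    }
}
```

With a ring that has been memset to zero (as after a hugepage reset), `walk_chain()` rejects the first descriptor instead of processing it, and a descriptor whose `next` points back at itself is rejected after at most `ring_size` hops rather than spinning forever.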