We hit a crash in ovs-vswitchd.
I started OVS from current master, following the instructions in
INSTALL.DPDK.md.
In my environment there are four CPU cores; the "real_n_txq" of the dpdk port
is 1 and "txq_needs_locking" is true.

    ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
    ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
    ovs-vsctl add-port br0 dpdkvhost0 -- set Interface dpdkvhost0 type=dpdkvhost

This works. I then brought up a guest and ran some tests; ovs-vswitchd hit a
segmentation fault when I ran iperf3 clients on the guest and the host at the
same time:

    guest: iperf client sends TCP packets -> dpdkvhost0 -> OVS -> dpdk0 -> outside
    host:  iperf client sends TCP packets -> br0 -> OVS -> dpdk0 -> outside

The segmentation fault looks like this:

    Program received signal SIGSEGV, Segmentation fault.
    eth_em_xmit_pkts (tx_queue=0x7f623d0f9880, tx_pkts=0x7f623d0ebee8,
        nb_pkts=65535) at dpdk-2.0.0/lib/librte_pmd_e1000/em_rxtx.c:436
    436             ol_flags = tx_pkt->ol_flags;
    (gdb) bt
    #0  eth_em_xmit_pkts (tx_queue=0x7f623d0f9880, tx_pkts=0x7f623d0ebee8,
        nb_pkts=65535) at dpdk-2.0.0/lib/librte_pmd_e1000/em_rxtx.c:436
    #1  0x0000000000625d3d in rte_eth_tx_burst
        at dpdk-2.0.0/x86_64-native-linuxapp-gcc/include/rte_ethdev.h:2572
    #2  dpdk_queue_flush__ (dev=dev@entry=0x7f623d0f9940, qid=qid@entry=0)
        at lib/netdev-dpdk.c:808
    #3  0x0000000000627324 in dpdk_queue_pkts (cnt=1, pkts=0x7fff5da5a8f0,
        qid=0, dev=0x7f623d0f9940) at lib/netdev-dpdk.c:1003
    #4  dpdk_do_tx_copy (netdev=netdev@entry=0x7f623d0f9940, qid=qid@entry=0,
        pkts=pkts@entry=0x7fff5da5ae80, cnt=cnt@entry=1)
        at lib/netdev-dpdk.c:1073
    #5  0x0000000000627e96 in netdev_dpdk_send__ (may_steal=<optimized out>,
        cnt=1, pkts=0x7fff5da5ae80, qid=0, dev=0x7f623d0f9940)
        at lib/netdev-dpdk.c:1116
    #6  netdev_dpdk_eth_send (netdev=0x7f623d0f9940, qid=<optimized out>,
        pkts=0x7fff5da5ae80, cnt=1, may_steal=<optimized out>)
        at lib/netdev-dpdk.c:1169
    ...

The guest traffic and the br0 traffic are processed by separate threads (a pmd
thread and the main thread). The br0 packets are sent out through the dpdk
port by netdev_dpdk_send__(), which holds the tx queue's spinlock
(rte_spinlock_lock), while the pmd thread flushes the same tx queue via
dpdk_queue_flush() without taking the lock. The crash seems to be caused by
this.
We fixed it with the patch below and the segmentation fault has not happened
again. The change does not affect phy-phy performance when the number of tx
queues equals the number of CPUs; we tested this on a server with an Intel
Xeon E5-2630 and an 82599 10GbE NIC. I expect it also does not hurt
performance in the opposite case (fewer tx queues than CPUs), because
"flush_tx" is true for such a shared txq, so it is already flushed on every
call to dpdk_queue_pkts().
We are not sure whether there is a better way to fix this; any help would be
greatly appreciated.
------------------------------------------------------------------
From: Gray, Mark D <mark.d.g...@intel.com>
Sent: Friday, June 5, 2015 22:53
To: 钢锁0310 <l...@dtdream.com>; d...@openvswitch.com <d...@openvswitch.com>
Subject: Re: [ovs-dev] [PATCH] netdev-dpdk: Do not flush tx queue which is
shared among CPUs since it is always flushed

> When tx queue is shared among CPUs, the pkts are always flushed in
> 'netdev_dpdk_eth_send'.
> So it is unnecessary to flush in netdev_dpdk_rxq_recv.
> Otherwise tx will be accessed without locking.

Are you seeing a specific bug or is this just to account for a device with
fewer queues than pmds?

> Signed-off-by: Wei li <l...@dtdream.com>
> ---
>  lib/netdev-dpdk.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 63243d8..25e3a73 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -892,8 +892,11 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet **packets,
>      int nb_rx;
>
>      /* There is only one tx queue for this core.  Do not flush other
> -     * queueus. */
> +     * queueus.

s/queueus/queues

> +     * Do not flush tx queue which is shared among CPUs
> +     * since it is always flushed */
> +    if (rxq_->queue_id == rte_lcore_id() &&
> +        OVS_LIKELY(!dev->txq_needs_locking)) {
>          dpdk_queue_flush(dev, rxq_->queue_id);

Do you see any drop in performance in a simple phy-phy case before
and after this patch?

>      }
>
> --
> 1.9.5.msysgit.1
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev