As per the testcase included in this patch it has been observed that ovs-vswtichd may segfault when deleting a bridge.
Analysing the output of valgrind and gdb it appears that this is caused by the connmgr of a ofproto being accessed after the latter has been freed. It appears that this is occurring in different threads and is the result of the following postponement arrangement in ofproto_destroy(); /* We should not postpone this because it involves deleting a listening * socket which we may want to reopen soon. 'connmgr' should not be used * by other threads */ connmgr_destroy(p->connmgr); /* Destroying rules is deferred, must have 'ofproto' around for them. */ ovsrcu_postpone(ofproto_destroy__, p); Report from valgrind resulting from make check-valgrind: ==10653== Invalid read of size 8 ==10653== at 0x458998: ofmonitor_flush (connmgr.c:2254) ==10653== by 0x423E17: handle_flow_mod__ (ofproto.c:4942) ==10653== by 0x42651E: delete_group__ (ofproto.c:6171) ==10653== by 0x426600: delete_group (ofproto.c:6194) ==10653== by 0x41BD37: ofproto_destroy__ (ofproto.c:1485) ==10653== by 0x4EF80E: ovsrcu_call_postponed (ovs-rcu.c:256) ==10653== by 0x4EF8B4: ovsrcu_postpone_thread (ovs-rcu.c:271) ==10653== by 0x4F1656: ovsthread_wrapper (ovs-thread.c:337) ==10653== by 0x569F0A3: start_thread (pthread_create.c:309) ==10653== by 0x5EA304C: clone (clone.S:111) ==10653== Address 0x645a910 is 64 bytes inside a block of size 184 free'd ==10653== at 0x4C29E90: free (vg_replace_malloc.c:473) ==10653== by 0x45494F: connmgr_destroy (connmgr.c:299) ==10653== by 0x41C0E6: ofproto_destroy (ofproto.c:1553) ==10653== by 0x40D5CF: bridge_destroy (bridge.c:3179) ==10653== by 0x409C2D: add_del_bridges (bridge.c:1698) ==10653== by 0x406CFA: bridge_reconfigure (bridge.c:582) ==10653== by 0x40CC4A: bridge_run (bridge.c:2950) ==10653== by 0x412818: main (ovs-vswitchd.c:116) Backtrace reported by gdb using core resulting from make check: at ./lib/list.h:257 257 return list->next == list; (gdb) bt at ./lib/list.h:257 at ofproto/connmgr.c:2261 fm=0x7f48936757e8, req=0x0) at ofproto/ofproto.c:4942 at ofproto/ofproto.c:6171 at ofproto/ofproto.c:6194 at ofproto/ofproto.c:1485 at lib/ovs-thread.c:337 at pthread_create.c:309 at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 CGLAGS=-g gcc (Debian 4.9.2-10) 4.9.2 Signed-off-by: Simon Horman <simon.hor...@netronome.com> --- ofproto/connmgr.c | 6 ++++++ tests/ofproto.at | 13 +++++++++++++ 2 files changed, 19 insertions(+) diff --git a/ofproto/connmgr.c b/ofproto/connmgr.c index 975ee335b9f6..0a662f6b4074 100644 --- a/ofproto/connmgr.c +++ b/ofproto/connmgr.c @@ -273,6 +273,8 @@ connmgr_destroy(struct connmgr *mgr) LIST_FOR_EACH_SAFE (ofconn, next_ofconn, node, &mgr->all_conns) { ofconn_destroy(ofconn); } + + mgr->ofproto->connmgr = NULL; /* Its going away! */ ovs_mutex_unlock(&ofproto_mutex); hmap_destroy(&mgr->controllers); @@ -2294,6 +2296,10 @@ ofmonitor_flush(struct connmgr *mgr) { struct ofconn *ofconn; + if (!mgr) { + return; + } + LIST_FOR_EACH (ofconn, node, &mgr->all_conns) { struct ofpbuf *msg; diff --git a/tests/ofproto.at b/tests/ofproto.at index 9c5f0bb4e530..fef315f0e968 100644 --- a/tests/ofproto.at +++ b/tests/ofproto.at @@ -679,6 +679,19 @@ OFPST_GROUP reply (OF1.5): OVS_VSWITCHD_STOP AT_CLEANUP +dnl This is really bare-bones. +dnl It at least checks request and reply serialization and deserialization. +AT_SETUP([ofproto - group add then bridge delete (OpenFlow 1.3)]) +OVS_VSWITCHD_START +AT_DATA([groups.txt], [dnl +group_id=1234,type=all,bucket=output:10 +group_id=1235,type=all,bucket=output:10 +]) +AT_CHECK([ovs-ofctl -O OpenFlow13 -vwarn add-groups br0 groups.txt]) +AT_CHECK([ovs-vsctl --if-exist del-br br0]) +OVS_VSWITCHD_STOP +AT_CLEANUP + AT_SETUP([ofproto - mod-port (OpenFlow 1.0)]) OVS_VSWITCHD_START for command_config_state in \ -- 2.1.4 _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev