Hi,

On 2.5, we are seeing the following problem when removing a bridge:

- ofproto_destroy calls ofproto_flush__, which eventually calls
  ovsrcu_postpone(remove_rules_rcu)

- ofproto_destroy also calls p->ofproto_class->destruct, which eventually
  leads to the release of the DPIF backer (close_dpif_backer)

- at some later point, remove_rules_rcu is picked up by the RCU thread.
  That calls rule_delete, which calls complete_operation, and that
  references the backer, which, however, is already gone:

      ofproto->backer->need_revalidate = REV_FLOW_TABLE;

My first idea is to do this:

modified   ofproto/ofproto.c
@@ -1588,9 +1588,16 @@ ofproto_destroy__(struct ofproto *ofproto)
  *   - 1st we defer the removal of the rules from the classifier
  *   - 2nd we defer the actual destruction of the rules. */
 static void
+ofproto_class_destruct__(struct ofproto *ofproto)
+{
+    ofproto->ofproto_class->destruct(ofproto);
+}
+
+static void
 ofproto_destroy_defer__(struct ofproto *ofproto)
     OVS_EXCLUDED(ofproto_mutex)
 {
+    ovsrcu_postpone(ofproto_class_destruct__, ofproto);
     ovsrcu_postpone(ofproto_destroy__, ofproto);
 }
 
@@ -1623,8 +1630,6 @@ ofproto_destroy(struct ofproto *p, bool del)
         free(usage);
     }
 
-    p->ofproto_class->destruct(p);
-
     /* We should not postpone this because it involves deleting a listening
      * socket which we may want to reopen soon. 'connmgr' should not be used
      * by other threads */

That seems to fix the issue. But "ovs-appctl exit" (or rather the
ovs-vswitchd exit action that "ovs-appctl exit" triggers) doesn't wait for
the RCU thread to finish all the deferred work, so with this change the
cleanup ends up not being called at all on exit.

We can work around that by writing something like
"ovs-appctl mlnx/wait-br-cleanup", but that's unsatisfactory. It seems
like ovs-vswitchd's exit handling should actually wait for deferred work
to get done.

Thoughts? How would I go about implementing this?

Thanks,
Petr
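
P.S. To make the ordering argument concrete, here is a toy model of what I
have in mind. It is only a sketch and not OVS code: toy_postpone() and
toy_drain() are hypothetical stand-ins for a deferred-callback queue, and
the toy_* structs only mimic the bridge/backer relationship. It assumes
that deferred callbacks run in the order they were queued, and that exit
handling drains the queue before the process goes away.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a deferred-callback queue; NOT the OVS RCU API. */
struct deferred {
    void (*cb)(void *);
    void *aux;
};

#define MAX_DEFERRED 16
static struct deferred queue[MAX_DEFERRED];
static size_t n_deferred;

/* Hypothetical analogue of postponing a callback: just remember it. */
static void
toy_postpone(void (*cb)(void *), void *aux)
{
    assert(n_deferred < MAX_DEFERRED);
    queue[n_deferred].cb = cb;
    queue[n_deferred].aux = aux;
    n_deferred++;
}

/* Hypothetical exit-time hook: runs every queued callback in FIFO order,
 * including callbacks queued while draining.  Returns how many ran. */
static size_t
toy_drain(void)
{
    size_t i;

    for (i = 0; i < n_deferred; i++) {
        queue[i].cb(queue[i].aux);
    }
    n_deferred = 0;
    return i;
}

struct toy_backer { int need_revalidate; };
struct toy_bridge { struct toy_backer *backer; int rules_removed; };

/* Plays the role of the deferred rule removal: it touches the backer. */
static void
toy_remove_rules(void *br_)
{
    struct toy_bridge *br = br_;

    assert(br->backer != NULL);   /* Fails if the backer went away first. */
    br->backer->need_revalidate = 1;
    br->rules_removed = 1;
}

/* Plays the role of the class destruct hook: it releases the backer. */
static void
toy_destruct(void *br_)
{
    struct toy_bridge *br = br_;

    free(br->backer);
    br->backer = NULL;
}

/* Queues rule removal first and destruct second, then drains the queue as
 * exit handling would have to.  Returns the number of callbacks run, or
 * -1 if the cleanup did not complete. */
static int
toy_demo(void)
{
    struct toy_bridge br;
    size_t ran;

    br.backer = malloc(sizeof *br.backer);
    assert(br.backer != NULL);
    br.rules_removed = 0;

    toy_postpone(toy_remove_rules, &br);
    toy_postpone(toy_destruct, &br);    /* Deferred, not called inline. */

    ran = toy_drain();
    return br.rules_removed && br.backer == NULL ? (int) ran : -1;
}
```

In this model, calling toy_destruct() directly before toy_drain() trips
the assertion in toy_remove_rules(), which corresponds to the crash above.
Skipping toy_drain() corresponds to the exit problem: neither callback
ever runs.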