On Mon, Aug 17, 2015 at 11:24:48AM -0700, Alex Wang wrote:
> Hey,
>
> Want to open a thread to discuss the following race I encountered while
> unit testing ovn.
>
> The most simple case is when I run ovn-nbctl to add a lport in unit test:
> 1. ovn-nbctl first creates/commits the logical_port entry in ovn-nb
>    database. the new entry's "up" column is empty,
> 2. then assume ovn-nbctl execution got suspended after
>    ovsdb_idl_txn_commit_block(),
> 3. next, ovn-northd will update the ovn-sb database and finds that the
>    new logical port is not bound. so it goes ahead update the "up"
>    column of the entry to "false"...
> 4. since ovn-nbctl is still running and is set to monitor everything, the
>    ovsdb-server will try sending the "update" to ovn-nbctl...
> 5. now consider this race: if ovn-nbctl execution resumes and exits right
>    before ovsdb-server sending the update,... the send will fail with
>    (Broken Pipe) error, resulting in a WARN log in ovsdb-server.log.
>
> Even if we set the "up" column to "false" at creation, we can still run into
> similar race if the ovn-controller quickly binds the lport to chassis and
> ovn-northd now updates "up" column to "true".
>
> I also found similar race for other command combinations... e.g.
> deleting vtep switch physical port and deleting ovs port while running
> ovs-vtep simulator...
>
> I'm thinking instead of trying to fix every case (which may not be even
> possible), we can try removing all monitor request right after
> ovsdb_idl_txn_commit_block() and try waiting until receiving the
> monitor request ack from ovsdb-server. After that ovsdb-server will
> never try sending anything to "*-*ctl" commands,
>
> Would like to hear what you think?
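For reference, the failure in step 5 of the quoted message can be reproduced in isolation. The following is only a minimal sketch, not OVS code: it assumes a POSIX system, uses a plain Unix socket pair in place of the ovsdb-server/ovn-nbctl connection, and the function name and payload are made up for illustration.

```python
import socket


def send_after_peer_exit():
    """Return the name of the error raised when writing to a peer that
    has already closed its end of the connection.

    Hypothetical stand-ins: `server` plays ovsdb-server, `client` plays
    ovn-nbctl. Nothing here uses the real OVSDB protocol or IDL.
    """
    server, client = socket.socketpair()  # connected stream pair (AF_UNIX on POSIX)
    client.close()  # the "ctl" command exits right after its commit
    try:
        # The late monitor "update" that the server still tries to deliver.
        server.send(b'{"method":"update"}')
    except OSError as exc:
        return type(exc).__name__
    finally:
        server.close()
    return None


if __name__ == "__main__":
    print(send_after_peer_exit())
```

On Linux I would expect this to report a broken-pipe error, which is the same condition ovsdb-server logs as a WARN; the write fails only because the reader is already gone, not because any data was lost.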
I think the warning is harmless (since we know the cause) so I'd be inclined to just ignore it in the testsuite.