Jarno, you're the expert on this stuff I think, do you have advice for Flavio?
On Mon, Apr 06, 2015 at 04:03:23PM -0300, Flavio Leitner wrote: > On Mon, 6 Apr 2015 11:09:58 -0700 > Ben Pfaff <b...@nicira.com> wrote: > > > On Mon, Apr 06, 2015 at 01:04:43PM -0300, Flavio Leitner wrote: > > > On Thu, 2 Apr 2015 20:17:19 -0700 > > > Ben Pfaff <b...@nicira.com> wrote: > > > > > > > On Thu, Apr 02, 2015 at 06:44:08PM -0300, Flavio Leitner wrote: > > > > > On Thu, 2 Apr 2015 13:47:56 -0700 > > > > > Ben Pfaff <b...@nicira.com> wrote: > > > > > > > > > > > On Thu, Apr 02, 2015 at 05:45:34PM -0300, Flavio Leitner > > > > > > wrote: > > > > > > > On Thu, 2 Apr 2015 13:37:06 -0700 > > > > > > > Ben Pfaff <b...@nicira.com> wrote: > > > > > > > > > > > > > > > On Thu, Apr 02, 2015 at 05:33:25PM -0300, Flavio Leitner > > > > > > > > wrote: > > > > > > > > > On Thu, 2 Apr 2015 12:58:37 -0700 > > > > > > > > > Ben Pfaff <b...@nicira.com> wrote: > > > > > > > > > > > > > > > > > > > On Thu, Apr 02, 2015 at 03:57:20PM -0300, Flavio > > > > > > > > > > Leitner wrote: > > > > > > > > > > > The ofproto-dpif creates dummies backed by sockets > > > > > > > > > > > so depending on the order of execution when bridge > > > > > > > > > > > is reconfiguring, an active socket may run first > > > > > > > > > > > and not find the file. That is usually not a > > > > > > > > > > > problem because it will try to reconnect one second > > > > > > > > > > > later. However, it breaks the testsuite. > > > > > > > > > > > > > > > > > > > > > > This patch fixes the issue splitting active and > > > > > > > > > > > passive sockets in different vsctl-ctl commands > > > > > > > > > > > that guarantees the proper ordering between them. > > > > > > > > > > > > > > > > > > > > WAIT_FOR_DUMMY_PORTS is supposed to avoid this > > > > > > > > > > problem, by waiting until the ports have connected. > > > > > > > > > > Is it busted? > > > > > > > > > > > > > > > > > > No, but it takes at least one extra second for the port > > > > > > > > > to reconnect. If we consider the four tests fixed by the > > > > > > > > > patch, it can be 4 extra seconds to complete the same > > > > > > > > > tests, so I'd just go with the proposed patch. > > > > > > > > > > > > > > > > OK, I buy that, but in that case the commit message goes > > > > > > > > overboard when it says that the current form "breaks the > > > > > > > > testsuite". Can you rephrase this as an optimization > > > > > > > > rather than a bug fix, then? > > > > > > > > > > > > > > Not really because it is a bug fix. Those tests break like > > > > > > > 7 out of 10 times on a s390x. > > > > > > > > > > > > OK, so WAIT_FOR_DUMMY_PORTS is buggy then, can we figure out > > > > > > why and fix it? Optimizations are fine too but I'd like to > > > > > > get to the root of the problem. > > > > > > > > > > Ok, I missed that WAIT_FOR_DUMMY_PORTS is not available on > > > > > branch-2.3 where it breaks most of the times. So, I think the > > > > > correct would be to backport these patches to branch-2.3 first: > > > > > > > > > > 93fa0de tests: Fix race in 'balance-tcp bonding' test. > > > > > 60187ac test: Remove explicit sleeps from ofproto-dpif bond > > > > > tests d611e23 test: add WAIT_FOR_DUMMY_PORTS helper macro for > > > > > writing tests > > > > > > > > > > and then hopefully my patch becomes just an optimization. > > > > > Anyway, I can do that next week if you or someone else didn't > > > > > do it before me. > > > > > > > > Thanks for the list of commits. I took care of the backport. I'd > > > > appreciate it if you'd verify it. > > > > > > I just did, and unfortunately it still fails because the macro > > > WAIT_FOR_DUMMY_PORTS() depends on the ovs-appctl > > > netdev-dummy/conn-state command which is not available in > > > branch-2.3 too. So, could you please backport this commit as well? > > > > > > commit 7d7fffe8a4dbe0aab2eb16eca5ce016518b652ad > > > Author: Andy Zhou <az...@nicira.com> > > > Date: Thu Jun 5 16:01:17 2014 -0700 > > > > > > netdev-dummy: add appctl netdev-dummy/conn-state command > > > > > > Using without any parameter, this command list the connection > > > state of all netdev-dummy devices that are configured to make > > > active connections. > > > [...] > > > > > > With the above commit applied, I could run 100 times the test 715 > > > without failures and I could run all ofproto-dpif tests 30 times > > > without failures too. > > > > Thanks for checking that! I applied this to branch-2.3 also. > > Thanks a lot! I found another issue with the following test: > > 717. ofproto-dpif.at:203: testing ofproto-dpif, balance-tcp bonding, > different recirc flow > [...] > --- - 2015-04-06 14:47:38.815107575 -0400 > +++ /root/ovs-2.3.git/tests/testsuite.dir/at-groups/717/stdout > 2015-04-06 14:47 :38.806520838 -0400 > @@ -1,2 +1,2 @@ > -table_id=254, n_packets=1, n_bytes=64, > priority=20,recirc_id=0x12d,dp_hash=0xcf /0xff,actions=output > +table_id=254, n_packets=1, n_bytes=64, > priority=20,recirc_id=0x12c,dp_hash=0x72 /0xff,actions=output > > Two things changed: dp_hash and recirc_id. There is a patch on master > that masks the dp_hash issue: > > commit 8ae8176fd0d8ed919e3301cc961dcf02b65ff49d > Author: Jarno Rajahalme <jrajaha...@nicira.com> > Date: Wed Jan 7 10:16:47 2015 -0800 > > tests: Make test independent of the hash function. > > Otherwise compiling with -msse4.2 (or -march=native on a SSE4.2 > capable CPU) will produce a test failure due to the CRC32-based hash > function being different from mhash. > > Signed-off-by: Jarno Rajahalme <jrajaha...@nicira.com> > Acked-by: Ben Pfaff <b...@nicira.com> > > Still the recirc_id changes, sometimes it doesn't: > --- - 2015-04-06 14:52:07.239712884 -0400 > +++ /root/ovs-2.3.git/tests/testsuite.dir/at-groups/717/stdout > 2015-04-06 14:52:07.226520838 -0400 @@ -1,2 +1,2 @@ > -table_id=254, n_packets=1, n_bytes=64, > priority=20,recirc_id=0x12d,dp_hash=0xcf/0xff,actions=output > +table_id=254, n_packets=1, n_bytes=64, > priority=20,recirc_id=0x12d,dp_hash=0x72/0xff,actions=output > > dp_hash is constant regardless of how many times I run the test. > > Any idea? > > Also, the above test is not included in the command: > # make check TESTSUITEFLAGS="-k ofproto-dpif" > > I am not expert on that language but I think we can't use the first > word followed by something like ','. So, we should change to: > > # Makes sure recirculation does not change the way packet is handled. > -AT_SETUP([ofproto-dpif, balance-tcp bonding, different recirc flow ]) > +AT_SETUP([ofproto-dpif - balance-tcp bonding, different recirc flow ]) > OVS_VSWITCHD_START( > > which then -k ofproto-dpif works for me. (there are other tests with > the same bug). > > Thanks, > fbl _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev