On Thu, 2 Apr 2015 20:17:19 -0700
Ben Pfaff <b...@nicira.com> wrote:

> On Thu, Apr 02, 2015 at 06:44:08PM -0300, Flavio Leitner wrote:
> > On Thu, 2 Apr 2015 13:47:56 -0700
> > Ben Pfaff <b...@nicira.com> wrote:
> > 
> > > On Thu, Apr 02, 2015 at 05:45:34PM -0300, Flavio Leitner wrote:
> > > > On Thu, 2 Apr 2015 13:37:06 -0700
> > > > Ben Pfaff <b...@nicira.com> wrote:
> > > > 
> > > > > On Thu, Apr 02, 2015 at 05:33:25PM -0300, Flavio Leitner
> > > > > wrote:
> > > > > > On Thu, 2 Apr 2015 12:58:37 -0700
> > > > > > Ben Pfaff <b...@nicira.com> wrote:
> > > > > > 
> > > > > > > On Thu, Apr 02, 2015 at 03:57:20PM -0300, Flavio Leitner
> > > > > > > wrote:
> > > > > > > > The ofproto-dpif creates dummies backed by sockets so
> > > > > > > > depending on the order of execution when bridge is
> > > > > > > > reconfiguring, an active socket may run first and not
> > > > > > > > find the file.  That is usually not a problem because
> > > > > > > > it will try to reconnect one second later. However, it
> > > > > > > > breaks the testsuite.
> > > > > > > > 
> > > > > > > > This patch fixes the issue by splitting active and
> > > > > > > > passive sockets into different vsctl-ctl commands, which
> > > > > > > > guarantees the proper ordering between them.
> > > > > > > 
> > > > > > > WAIT_FOR_DUMMY_PORTS is supposed to avoid this problem, by
> > > > > > > waiting until the ports have connected.  Is it busted?
> > > > > > 
> > > > > > No, but it takes at least one extra second for the port to
> > > > > > reconnect. If we consider the four tests fixed by the
> > > > > > patch, it can be 4 extra seconds to complete the same
> > > > > > tests, so I'd just go with the proposed patch.
> > > > > 
> > > > > OK, I buy that, but in that case the commit message goes
> > > > > overboard when it says that the current form "breaks the
> > > > > testsuite".  Can you rephrase this as an optimization rather
> > > > > than a bug fix, then?
> > > > 
> > > > Not really, because it is a bug fix.  Those tests break about 7
> > > > out of 10 times on s390x.
> > > 
> > > OK, so WAIT_FOR_DUMMY_PORTS is buggy then, can we figure out why
> > > and fix it?  Optimizations are fine too but I'd like to get to
> > > the root of the problem.
> > 
> > Ok, I missed that WAIT_FOR_DUMMY_PORTS is not available on
> > branch-2.3, where it breaks most of the time.  So, I think the
> > correct approach would be to backport these patches to branch-2.3 first:
> >  
> > 93fa0de tests: Fix race in 'balance-tcp bonding' test.
> > 60187ac test: Remove explicit sleeps from ofproto-dpif bond tests
> > d611e23 test: add WAIT_FOR_DUMMY_PORTS helper macro for writing
> > tests
> > 
> > and then hopefully my patch becomes just an optimization.  Anyway,
> > I can do that next week if you or someone else didn't do it before
> > me.
> 
> Thanks for the list of commits.  I took care of the backport.  I'd
> appreciate it if you'd verify it.

I just did, and unfortunately it still fails because the macro
WAIT_FOR_DUMMY_PORTS() depends on the ovs-appctl netdev-dummy/conn-state
command, which is not available in branch-2.3 either.  So, could you
please backport this commit as well?

commit 7d7fffe8a4dbe0aab2eb16eca5ce016518b652ad
Author: Andy Zhou <az...@nicira.com>
Date:   Thu Jun 5 16:01:17 2014 -0700

    netdev-dummy: add appctl netdev-dummy/conn-state command
    
    Using without any parameter, this command list the connection
    state of all netdev-dummy devices that are configured to make
    active connections.
[...]

With the above commit applied, I ran test 715 100 times without a
failure, and I ran all ofproto-dpif tests 30 times without a failure
as well.
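
For anyone following along, the general idea behind a wait-for-ports
helper (poll a state check until it succeeds or a deadline passes,
rather than sleeping a fixed time) can be sketched roughly as below.
This is only an illustration in Python, not the actual
WAIT_FOR_DUMMY_PORTS macro: wait_until() and make_fake_port() are
hypothetical names, and the fake check merely stands in for querying
the connection state of a dummy port.

```python
import time


def wait_until(check, timeout=10.0, interval=0.1):
    """Poll check() until it returns True or `timeout` seconds elapse.

    Returns True if the condition was met, False if the deadline passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def make_fake_port(connect_after):
    """Hypothetical stand-in for a dummy port.

    The returned callable reports "connected" only once `connect_after`
    seconds have passed, like an active socket that has to retry.
    """
    start = time.monotonic()
    return lambda: time.monotonic() - start >= connect_after


if __name__ == "__main__":
    # Port that connects shortly after creation: the wait succeeds.
    print(wait_until(make_fake_port(connect_after=0.3), timeout=2.0))
    # Port that connects too late: the wait gives up at the deadline.
    print(wait_until(make_fake_port(connect_after=5.0), timeout=0.5))
```

The point of polling is that a test only waits as long as the
reconnect actually takes, and a missing connection shows up as an
explicit timeout instead of a later, harder to read failure.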

> I'm happy enough to take optimizations (master only, please) now that
> we've figured out the root of the problem.

Ok, I will reword the commit message and post it again for the master
branch.

> It's funny that s390x passes in the Debian buildds (see
> https://buildd.debian.org/status/package.php?p=openvswitch) but maybe
> that's just luck.

I haven't dug any deeper than the lack of connection between the
sockets, but even on my test system it sometimes works just fine.

fbl
