I'm unlikely to continue along either of these paths because both of them are quite onerous for me. I think it's probably better if you debug the problem yourself.
If you have a simpler reproduction case then I'm happy to work on that, but I'm not going to install a 3-node ODL cluster to try to debug the problem. It sounds like you're not really a developer, so I guess that's part of the problem.

On Tue, May 10, 2016 at 04:40:23PM +0000, Peter Gubka -X (pgubka - PANTHEON TECHNOLOGIES at Cisco) wrote:
> Hi.
>
> How will we continue? I haven't noticed your answer.
>
> Peter Gubka
>
> -----Original Message-----
> From: Peter Gubka -X (pgubka - PANTHEON TECHNOLOGIES at Cisco)
> Sent: Wednesday, May 04, 2016 9:00 AM
> To: 'Ben Pfaff' <b...@ovn.org>
> Cc: b...@openvswitch.org
> Subject: RE: [ovs-discuss] controller's role mismatch?
>
> Hi.
>
> 1) For you to reproduce the problem:
>
> Download the ODL controller and install it as a cluster; you will need 3
> nodes (probably VMs or Docker containers). But if you haven't done it
> before, it can be time-consuming and you won't be sure that everything is
> configured correctly.
> https://nexus.opendaylight.org/content/repositories/opendaylight.release/org/opendaylight/integration/distribution-karaf/0.4.1-Beryllium-SR1/distribution-karaf-0.4.1-Beryllium-SR1.zip
> Then follow the steps I wrote at the beginning (the ovs-vsctl commands). If
> you want, I can prepare a Python script for you to test automatically, but
> you'll have to wait 1-2 days.
>
> 2) For me to reproduce the problem:
> Build an RPM for me for Fedora 22 (preferably) or 23 with improved logging.
>
> 3) Send me a zipped OVS with your changes and the steps to build it. I don't
> like to build things if I don't have to, but I can do it. I will use F22 for
> building.
>
> Peter Gubka
>
> -----Original Message-----
> From: Ben Pfaff [mailto:b...@ovn.org]
> Sent: Tuesday, May 03, 2016 7:35 PM
> To: Peter Gubka -X (pgubka - PANTHEON TECHNOLOGIES at Cisco) <pgu...@cisco.com>
> Cc: b...@openvswitch.org
> Subject: Re: [ovs-discuss] controller's role mismatch?
>
> On Tue, May 03, 2016 at 07:02:45AM +0000, Peter Gubka -X (pgubka - PANTHEON TECHNOLOGIES at Cisco) wrote:
> > Are you sure about "reconnecting" switches? As I wrote before, to
> > reproduce the problem, I had to use 2 switches/bridges.
>
> You're right, now that I look again. I missed the differences between the
> connections.
>
> > $ grep -r rconn ovs-vswitchd.log | grep 6653
> > 2016-04-22T08:48:52.725Z|00022|rconn|INFO|s2<->tcp:127.0.0.1:6653: connecting...
> > 2016-04-22T08:48:52.726Z|00023|rconn|WARN|s2<->tcp:127.0.0.1:6653: connection failed (Connection refused)
> > 2016-04-22T08:48:52.726Z|00024|rconn|INFO|s2<->tcp:127.0.0.1:6653: waiting 1 seconds before reconnect
> > 2016-04-22T08:48:52.726Z|00029|rconn|INFO|s1<->tcp:127.0.0.1:6653: connecting...
> > 2016-04-22T08:48:52.726Z|00030|rconn|WARN|s1<->tcp:127.0.0.1:6653: connection failed (Connection refused)
> > 2016-04-22T08:48:52.726Z|00031|rconn|INFO|s1<->tcp:127.0.0.1:6653: waiting 1 seconds before reconnect
> > 2016-04-22T08:48:52.811Z|00032|rconn|WARN|s2<->tcp:127.0.0.1:6653: connection failed (Connection refused)
> > 2016-04-22T08:48:52.811Z|00033|rconn|WARN|s1<->tcp:127.0.0.1:6653: connection failed (Connection refused)
> > 2016-04-22T08:48:53.317Z|00070|rconn|INFO|s1<->tcp:10.25.2.14:6653: connecting...
> > 2016-04-22T08:48:53.330Z|00075|rconn|INFO|s1<->tcp:10.25.2.14:6653: connected
> > 2016-04-22T08:48:53.449Z|00085|rconn|INFO|s2<->tcp:10.25.2.13:6653: connecting...
> > 2016-04-22T08:48:53.459Z|00090|rconn|INFO|s2<->tcp:10.25.2.13:6653: connected
> > 2016-04-22T08:48:56.690Z|00184|rconn|INFO|s1<->tcp:10.25.2.12:6653: connecting...
> > 2016-04-22T08:48:56.706Z|00189|rconn|INFO|s1<->tcp:10.25.2.12:6653: connected
> > 2016-04-22T08:48:56.854Z|00199|rconn|INFO|s1<->tcp:10.25.2.13:6653: connecting...
> > 2016-04-22T08:48:56.865Z|00204|rconn|INFO|s1<->tcp:10.25.2.13:6653: connected
> > 2016-04-22T08:48:57.039Z|00214|rconn|INFO|s2<->tcp:10.25.2.12:6653: connecting...
> > 2016-04-22T08:48:57.049Z|00219|rconn|INFO|s2<->tcp:10.25.2.12:6653: connected
> > 2016-04-22T08:48:57.184Z|00229|rconn|INFO|s2<->tcp:10.25.2.14:6653: connecting...
> > 2016-04-22T08:48:57.199Z|00234|rconn|INFO|s2<->tcp:10.25.2.14:6653: connected
> >
> > There are only 6 "connected" lines, so I believe there was no
> > reconnection: 2 bridges with 3 controllers each.
> > 1) Around 08:48:53, 14 became master for s1 and 13 for s2.
> > 2) After 08:48:56 I set up 2 more controllers for both s1 (12, 13) and
> > s2 (12, 14).
> >
> > If I see "vconn|DBG|tcp:10.25.2.14:6653: received: OFPT_ROLE_REQUEST
> > (OF1.3)", how do I know whether it is a request towards s1 or s2?
>
> You can't tell. This hasn't been an issue for me before, so probably, as one
> outcome here, we should improve the logging.
>
> I can think of two different ways to hunt down what you're seeing. One way
> would be for you to explain to me some simple way to reproduce it.
> If I can have that, I'm willing to spend some time trying to find or explain
> the problem. The other way would be to suggest some places that you can add
> additional logging to OVS, which would help to explain what is going on.
> That will probably take more back-and-forth and trial and error, but it
> wouldn't require that I be able to reproduce the problem here. What's your
> preference?
_______________________________________________
discuss mailing list
discuss@openvswitch.org
http://openvswitch.org/mailman/listinfo/discuss
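
The reconnection check from the quoted message (counting "connected" lines per bridge and controller) could be scripted roughly as follows. This is only a minimal sketch: it assumes the rconn log layout shown in the thread (timestamp|sequence|module|level|bridge<->target: message), and the names `RCONN_RE` and `count_connected` are illustrative, not part of OVS. The sample lines are the "connected" records quoted above, plus two non-matching records for contrast.

```python
import re
from collections import Counter

# Assumed layout of one rconn record, taken from the log excerpt in the thread:
#   timestamp|sequence|rconn|LEVEL|bridge<->target: message
RCONN_RE = re.compile(
    r"^(?P<ts>\S+)\|\d+\|rconn\|(?P<level>\w+)\|"
    r"(?P<bridge>[^<]+)<->(?P<target>\S+): (?P<msg>.*)$"
)


def count_connected(lines):
    """Count 'connected' events per (bridge, controller target) pair."""
    counts = Counter()
    for line in lines:
        m = RCONN_RE.match(line.strip())
        if m and m.group("msg") == "connected":
            counts[(m.group("bridge"), m.group("target"))] += 1
    return counts


SAMPLE = """\
2016-04-22T08:48:52.726Z|00023|rconn|WARN|s2<->tcp:127.0.0.1:6653: connection failed (Connection refused)
2016-04-22T08:48:53.317Z|00070|rconn|INFO|s1<->tcp:10.25.2.14:6653: connecting...
2016-04-22T08:48:53.330Z|00075|rconn|INFO|s1<->tcp:10.25.2.14:6653: connected
2016-04-22T08:48:53.459Z|00090|rconn|INFO|s2<->tcp:10.25.2.13:6653: connected
2016-04-22T08:48:56.706Z|00189|rconn|INFO|s1<->tcp:10.25.2.12:6653: connected
2016-04-22T08:48:56.865Z|00204|rconn|INFO|s1<->tcp:10.25.2.13:6653: connected
2016-04-22T08:48:57.049Z|00219|rconn|INFO|s2<->tcp:10.25.2.12:6653: connected
2016-04-22T08:48:57.199Z|00234|rconn|INFO|s2<->tcp:10.25.2.14:6653: connected
"""

counts = count_connected(SAMPLE.splitlines())

# A pair that connected more than once would indicate a reconnection.
reconnected = [pair for pair, n in counts.items() if n > 1]

for (bridge, target), n in sorted(counts.items()):
    print(f"{bridge} <-> {target}: connected {n} time(s)")
print("reconnections:", reconnected)
```

On a real system the input would be the lines of ovs-vswitchd.log rather than `SAMPLE`. Note that this only distinguishes bridges in rconn records; it does not help with the vconn records discussed above, which carry no bridge name.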