On Fri, Jul 22, 2016 at 01:52:18PM -0500, Ryan Moats wrote:
> Guru Shetty <g...@ovn.org> wrote on 07/22/2016 12:31:43 PM:
>
> > From: Guru Shetty <g...@ovn.org>
> > To: Ryan Moats/Omaha/IBM@IBMUS
> > Cc: Lance Richardson <lrich...@redhat.com>, ovs dev <dev@openvswitch.org>
> > Date: 07/22/2016 12:31 PM
> > Subject: Re: [ovs-dev] ovn test failures
> >
> > On 21 July 2016 at 06:05, Ryan Moats <rmo...@us.ibm.com> wrote:
> > > "dev" <dev-boun...@openvswitch.org> wrote on 07/21/2016 06:32:02 AM:
> > >
> > > > From: Lance Richardson <lrich...@redhat.com>
> > > > To: ovs dev <dev@openvswitch.org>
> > > > Date: 07/21/2016 06:32 AM
> > > > Subject: [ovs-dev] ovn test failures
> > > > Sent by: "dev" <dev-boun...@openvswitch.org>
> > > >
> > > > It seems the failure rate for OVN end-to-end tests went up
> > > > significantly when commit 70c7cfef188b5ae9940abd5b7d9fe46b1fa88c8e
> > > > was merged earlier this week.
> > > >
> > > > After this commit, 100 iterations of "make check
> > > > TESTSUITEFLAGS='-j8 -k ovn'" gave (number of failures in the
> > > > left-most column):
> > > >    2 2179: ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS FAILED (ovn.at:1312)
> > > >   10 2183: ovn -- 2 HVs, 2 LS, 1 lport/LS, 2 peer LRs FAILED (ovn.at:2416)
> > > >   52 2184: ovn -- 1 HV, 1 LS, 2 lport/LS, 1 LR FAILED (ovn.at:2529)
> > > >   45 2185: ovn -- 1 HV, 2 LSs, 1 lport/LS, 1 LR FAILED (ovn.at:2668)
> > > >   23 2186: ovn -- 2 HVs, 3 LS, 1 lport/LS, 2 peer LRs, static routes FAILED (ovn.at:2819)
> > > >   53 2188: ovn -- 2 HVs, 3 LRs connected via LS, static routes FAILED (ovn.at:3053)
> > > >   32 2189: ovn -- 2 HVs, 2 LRs connected via LS, gateway router FAILED (ovn.at:3237)
> > > >   50 2190: ovn -- icmp_reply: 1 HVs, 2 LSs, 1 lport/LS, 1 LR FAILED (ovn.at:3389)
> > > >
> > > > Immediately prior to this (at commit
> > > > 48ff3e25abe31b761d2d3f3a2fd6ccaa783c79dc), the number of failures
> > > > per 100 iterations was much lower:
> > > >    1 2178: ovn -- 2 HVs, 4 lports/HV, localnet ports FAILED (ovn.at:1020)
> > > >    1 2179: ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS FAILED (ovn.at:1307)
> > > >    1 2179: ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS FAILED (ovn.at:1312)
> > > >    9 2184: ovn -- 1 HV, 1 LS, 2 lport/LS, 1 LR FAILED (ovn.at:2529)
> > > >    7 2186: ovn -- 2 HVs, 3 LS, 1 lport/LS, 2 peer LRs, static routes FAILED (ovn.at:2819)
> > > >    1 2187: ovn -- send gratuitous arp on localnet FAILED (ovn.at:2874)
> > > >   16 2188: ovn -- 2 HVs, 3 LRs connected via LS, static routes FAILED (ovn.at:3053)
> > > >
> > > > Any ideas?
> > > >
> > > > Thanks,
> > > >
> > > > Lance
> > >
> > > As author of that patch, I will admit that those numbers are a bit
> > > disturbing, because they aren't consistent with what I was seeing
> > > while developing and testing the patch series.
> > >
> > > What they make me suspect is that the patches don't correctly catch
> > > all state transitions (similar to what you uncovered with commit
> > > f94705d729459d808fd139c8f95d5f1f8d8becc6).
> > >
> > > Two things come to mind:
> > > 1) Make sure that all of the places where the code needs to request
> > >    a full processing of tables are correctly handled.
> > > 2) If a later step in the process finds that an earlier step needs
> > >    to process the database rows fully during the next cycle, use
> > >    poll_immediate_wake so that processing happens sooner rather
> > >    than later.
> >
> > Ryan,
> >
> > Were you planning to look at the failures? Should we revert the patch?
>
> Guru-
>
> Yes, I have been looking at the failures since Wednesday and I have a
> patch set that addresses all of these failures. However, I'm traveling
> today, so I won't be able to mail it until either late tonight or
> tomorrow morning (US Central Time).
We'll look forward to it. I think that these are probably affecting
everyone who regularly runs the tests. It'll be nice to get them fixed
soon.
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev