Guru Shetty <g...@ovn.org> wrote on 07/22/2016 12:31:43 PM:
> From: Guru Shetty <g...@ovn.org>
> To: Ryan Moats/Omaha/IBM@IBMUS
> Cc: Lance Richardson <lrich...@redhat.com>, ovs dev <dev@openvswitch.org>
> Date: 07/22/2016 12:31 PM
> Subject: Re: [ovs-dev] ovn test failures
>
> On 21 July 2016 at 06:05, Ryan Moats <rmo...@us.ibm.com> wrote:
> "dev" <dev-boun...@openvswitch.org> wrote on 07/21/2016 06:32:02 AM:
>
> > From: Lance Richardson <lrich...@redhat.com>
> > To: ovs dev <dev@openvswitch.org>
> > Date: 07/21/2016 06:32 AM
> > Subject: [ovs-dev] ovn test failures
> > Sent by: "dev" <dev-boun...@openvswitch.org>
> >
> > It seems the failure rate for OVN end-to-end tests went up significantly
> > when commit 70c7cfef188b5ae9940abd5b7d9fe46b1fa88c8e was merged earlier
> > this week.
> >
> > After this commit, 100 iterations of
> > "make check TESTSUITEFLAGS='-j8 -k ovn'"
> > gave (number of failures in left-most column):
> >    2 2179: ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS FAILED (ovn.at:1312)
> >   10 2183: ovn -- 2 HVs, 2 LS, 1 lport/LS, 2 peer LRs FAILED (ovn.at:2416)
> >   52 2184: ovn -- 1 HV, 1 LS, 2 lport/LS, 1 LR FAILED (ovn.at:2529)
> >   45 2185: ovn -- 1 HV, 2 LSs, 1 lport/LS, 1 LR FAILED (ovn.at:2668)
> >   23 2186: ovn -- 2 HVs, 3 LS, 1 lport/LS, 2 peer LRs, static routes FAILED (ovn.at:2819)
> >   53 2188: ovn -- 2 HVs, 3 LRs connected via LS, static routes FAILED (ovn.at:3053)
> >   32 2189: ovn -- 2 HVs, 2 LRs connected via LS, gateway router FAILED (ovn.at:3237)
> >   50 2190: ovn -- icmp_reply: 1 HVs, 2 LSs, 1 lport/LS, 1 LR FAILED (ovn.at:3389)
> >
> > Immediately prior to this (at commit
> > 48ff3e25abe31b761d2d3f3a2fd6ccaa783c79dc),
> > the number of failures per 100 iterations was much lower:
> >    1 2178: ovn -- 2 HVs, 4 lports/HV, localnet ports FAILED (ovn.at:1020)
> >    1 2179: ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS FAILED (ovn.at:1307)
> >    1 2179: ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS FAILED (ovn.at:1312)
> >    9 2184: ovn -- 1 HV, 1 LS, 2 lport/LS, 1 LR FAILED (ovn.at:2529)
> >    7 2186: ovn -- 2 HVs, 3 LS, 1 lport/LS, 2 peer LRs, static routes FAILED (ovn.at:2819)
> >    1 2187: ovn -- send gratuitous arp on localnet FAILED (ovn.at:2874)
> >   16 2188: ovn -- 2 HVs, 3 LRs connected via LS, static routes FAILED (ovn.at:3053)
> >
> > Any ideas?
> >
> > Thanks,
> >
> > Lance
>
> As author of that patch, I will admit that those numbers are a
> bit disturbing, because they aren't consistent with what I was
> seeing while developing and testing the patch series.
>
> What they make me suspect is that the patches don't catch all
> state transitions correctly (similar to what you uncovered with
> commit f94705d729459d808fd139c8f95d5f1f8d8becc6).
>
> Two things come to mind:
> 1) Make sure that all of the places where the code needs to request
>    a full process of tables are correctly handled.
> 2) If a later step in the process finds that an earlier step in
>    the process needs to process the database rows fully during the
>    next cycle, use poll_immediate_wake so that processing happens
>    sooner rather than later.
>
> Ryan,
> Were you planning to look at the failures? Should we revert the patch?
>

Guru-

Yes, I have been looking at the failures since Wednesday and I have a
patch set that addresses all of them. However, I'm travelling today,
so I won't be able to mail it until either late tonight or tomorrow
morning (US Central Time).

Ryan
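
For anyone who wants to reproduce per-test counts like the ones Lance quoted, here is a minimal Python sketch of the tallying step. It assumes the output of all 100 runs has already been captured as text, and that each failing test appears on one line in the form "NNNN: title FAILED (file.at:line)", as in the tables above. The names tally_failures and format_report are made up for illustration; they are not part of the OVS tree.

```python
import re
from collections import Counter

# Assumed line shape, taken from the tables quoted in this thread:
#   2184: ovn -- 1 HV, 1 LS, 2 lport/LS, 1 LR FAILED (ovn.at:2529)
FAILED_RE = re.compile(r"^\s*(\d+):\s+(.*?)\s+FAILED\s+\((\S+?):(\d+)\)\s*$")


def tally_failures(lines):
    """Count failures per (test number, title, location) across all runs."""
    counts = Counter()
    for line in lines:
        m = FAILED_RE.match(line)
        if m:
            test_id, title, at_file, at_line = m.groups()
            counts[(int(test_id), title, f"{at_file}:{at_line}")] += 1
    return counts


def format_report(counts):
    """Render counts with the number of failures in the left-most column."""
    rows = sorted(counts.items(), key=lambda kv: (kv[0][0], kv[0][2]))
    return [
        f"{n:4d} {test_id}: {title} FAILED ({loc})"
        for (test_id, title, loc), n in rows
    ]
```

Feeding it the concatenated output of the runs (for example, the lines of a file that each "make check" iteration appended to) and printing the result of format_report gives a table in the same layout as above. Keying on the location as well as the test number keeps separate rows for one test failing at two different ovn.at lines, as test 2179 does.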