alright klement/dave

I'm a bit stuck again…

I get about a 60+% failure rate out of test-debug, even with higher-than-normal
CPU settings (higher than what I use for build verify).

It always fails right here:


19:43:38 ==============================================================================
19:43:38 ERROR: L2 FIB test 7 - flush bd_id
19:43:38 ------------------------------------------------------------------------------
19:43:38 Traceback (most recent call last):
19:43:38   File "/workspace/vpp-test-debug-master-ubuntu1604/test/test_l2_fib.py", line 508, in test_l2_fib_07
19:43:38     self.run_verify_negat_test(bd_id=1, dst_hosts=flushed)
19:43:38   File "/workspace/vpp-test-debug-master-ubuntu1604/test/test_l2_fib.py", line 418, in run_verify_negat_test
19:43:38     i.get_capture(0, timeout=timeout)
19:43:38   File "/workspace/vpp-test-debug-master-ubuntu1604/test/vpp_pg_interface.py", line 240, in get_capture
19:43:38     (len(capture.res), expected_count, name))
19:43:38 Exception: Captured packets mismatch, captured 9 packets, expected 0 packets on pg0
19:43:38
19:43:38 ==============================================================================
19:43:38 ERROR: L2 FIB test 8 - flush all
19:43:38 ------------------------------------------------------------------------------
19:43:38 Traceback (most recent call last):
19:43:38   File "/workspace/vpp-test-debug-master-ubuntu1604/test/test_l2_fib.py", line 522, in test_l2_fib_08
19:43:38     self.run_verify_negat_test(bd_id=1, dst_hosts=flushed)
19:43:38   File "/workspace/vpp-test-debug-master-ubuntu1604/test/test_l2_fib.py", line 418, in run_verify_negat_test
19:43:38     i.get_capture(0, timeout=timeout)
19:43:38   File "/workspace/vpp-test-debug-master-ubuntu1604/test/vpp_pg_interface.py", line 240, in get_capture
19:43:38     (len(capture.res), expected_count, name))
19:43:38 Exception: Captured packets mismatch, captured 9 packets, expected 0 packets on pg0
19:43:38


When it fails, it's always the same two tests, always with the same exception
(captured 9, expected 0).

It's so consistent in its ‘death’ but so intermittent in frequency that it's
freaking me out a bit…

any thoughts?

Ed



On May 24, 2017, at 8:42 AM, Klement Sekera -X (ksekera - PANTHEON TECHNOLOGIES
at Cisco) <ksek...@cisco.com> wrote:

I know that the functional BFD tests passed so unless there is a bug in
the tests, the failures are pretty much timing issues. From my
experience the load is the culprit as the BFD tests test interactive
sessions, which need to be kept alive. The timings currently are set at
300ms and for most tests two keep-alives can be missed before the session
goes down on vpp side and asserts start failing. While this might seem
like ample time, especially on loaded systems there is a high chance
that at least one test will derp ...
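
A rough sketch of the timing budget described above, assuming "two keep-alives
can be missed" means the session drops once the third consecutive keep-alive
interval passes with nothing received. The function names and parameters are
hypothetical and are not taken from the BFD tests or from VPP:

```python
def detection_budget_ms(interval_ms=300, tolerated_misses=2):
    """Approximate time without a keep-alive before the session goes down.

    Assumption: the session is declared down once (tolerated_misses + 1)
    consecutive keep-alive intervals pass with nothing received.
    """
    return interval_ms * (tolerated_misses + 1)


def survives_stall(stall_ms, interval_ms=300, tolerated_misses=2):
    """True if a test-side stall of stall_ms stays inside the budget."""
    return stall_ms < detection_budget_ms(interval_ms, tolerated_misses)
```

Under that assumption the budget with the quoted values comes to about 900ms,
so a single stall of a second or more on the test side would be enough to take
a session down.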

I've also seen derps even on idle systems, where a select() call (used
by python in its own sleep() implementation) with timeout of 100ms returns
after 1-3 seconds.
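
One way to check whether a given box shows this kind of overshoot is to time a
batch of short sleeps. This is only a minimal sketch and is not part of the VPP
test framework; the function name and defaults are made up for illustration:

```python
import time


def measure_sleep_overshoot(requested_s=0.1, rounds=20):
    """Return the worst observed overshoot (in seconds) of time.sleep()."""
    worst = 0.0
    for _ in range(rounds):
        start = time.monotonic()
        time.sleep(requested_s)
        elapsed = time.monotonic() - start
        # Overshoot is how much longer the sleep took than requested.
        worst = max(worst, elapsed - requested_s)
    return worst


if __name__ == "__main__":
    worst = measure_sleep_overshoot()
    print("worst overshoot: %.1f ms" % (worst * 1000))
```

Running it once on an idle box and once under build load should give a feel
for how much scheduling delay the tests have to tolerate.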

Try running the bfd tests only (make test-all TEST=bfd) while no other tasks
are running - I think they should pass on your box just fine.

Thanks,
Klement

Quoting Ed Kern (ejk) (2017-05-24 16:27:10)
  Right now it's a VERY intentional mix…but depending on loading I could
  easily see this coming up if those timings are strict.
  To not dodge your question: max loading on my slowest node would be 3
  concurrent builds on a Xeon™ E3-1240 v3 (4 cores @ 3.4GHz).
  Yeah yeah, stop laughing… Do you have suggested or even guesstimated
  minimums in this regard? I could pretty trivially route them towards
  the larger set that I have right now if you think magic will result :)
  Ed
  PS thanks though..for whatever reason the type of errors I was getting
  didn’t naturally steer my mind towards cpu/io binding.

    On May 24, 2017, at 12:57 AM, Klement Sekera -X (ksekera - PANTHEON
    TECHNOLOGIES at Cisco) <ksek...@cisco.com> wrote:
    Hi Ed,

    How fast are your boxes? And how many cores? The BFD tests struggle to
    meet the aggressive timings on slower boxes...

    Thanks,
    Klement

    Quoting Ed Kern (ejk) (2017-05-23 20:43:55)

        No problem.
        If anyone is curious about rubbernecking the accident that is the
        current test-all (at least for my build system),
        adding a comment of
        testall
        SHOULD trigger and fire it off on my end.
        Make it all pass and you win a beer (or beverage of your choice).
        Ed

          On May 23, 2017, at 11:34 AM, Dave Wallace <dwallac...@gmail.com>
          wrote:
          Ed,

          Thanks for adding this to the shadow build system.  Real data on
          the cost and effectiveness of this will be most useful.

          -daw-
          On 5/23/2017 1:30 PM, Ed Kern (ejk) wrote:

      In the vpp-dev call a couple hours ago there was a discussion of
      running test-debug on a regular/default? basis.
      As a trial I’ve added a new job to the shadow build system:

      vpp-test-debug-master-ubuntu1604

      It will do a make test-debug, as part of the verify set, as an
      ADDITIONAL job.

      I made a couple of passes with test-all but can’t ever get a clean run
      with it (errors in test_bfd and test_ip6).
      I don’t think this is unusual or unexpected.  I’ll leave it to someone
      else to say that ‘all’ throwing failures is a good thing.
      I’m happy to add another job for/with test-all if someone wants to
      actually debug those errors.

      Flames, comments, concerns welcome..

      Ed

      PS Please note/remember that all these tests are non-voting regardless
      of success or failure.
      _______________________________________________
      vpp-dev mailing list
      vpp-dev@lists.fd.io
      https://lists.fd.io/mailman/listinfo/vpp-dev

