Jan and Ed,

I recommended we go back into production when I saw the tests running consistently.

However, I did see SSH timeouts during my testing, but only when I had more than 3 simulations running simultaneously on VIRL3 with all the tests enabled.

I figured that was due to the unusual use case where the tests from all the simulations were running simultaneously on the same server, and I guessed it was less likely to happen once the tests were distributed across all 3 servers.

--Tom


On 02/14/2018 10:47 AM, Jan Gelety wrote:

Hello Ed,

The first occurrences of connection issues are in the logs below (including elapsed time).

Regards,

Jan

*From:*vpp-dev@lists.fd.io [mailto:vpp-dev@lists.fd.io] *On Behalf Of *Ed Kern (ejk)
*Sent:* Tuesday, February 13, 2018 6:59 PM
*To:* vpp-dev@lists.fd.io
*Cc:* csit-...@lists.fd.io; vpp-dev@lists.fd.io
*Subject:* Re: [vpp-dev] [csit-dev] t4-virl3 moved from testing back into production - SSH timouts on VIRL
*Importance:* High

OK, I'm looking (and for the time being I've flipped it back to testing).

I am not seeing issues on the server end. Don't get me wrong, I totally believe you that there is a problem, but I'm not seeing specific timeout issues related to virl3 in the logs below.

Could you schedule a WebEx or send pointers to help me track this down?

thanks,

Ed



    On Feb 13, 2018, at 6:23 AM, Jan Gelety -X (jgelety - PANTHEON
    TECHNOLOGIES at Cisco) <jgel...@cisco.com
    <mailto:jgel...@cisco.com>> wrote:

    Hello Ed,

    Unfortunately, we have been hitting SSH timeouts quite often since
    virl3 was moved to production:

    https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-virl-master/9418/console

    *00:40:09.239*10:46:26 [ ERROR ] Node 10.30.52.131 setup failed,
    error:''

    *00:40:09.239*10:47:11 Extracting tarball to /tmp/openvpp-testing
    on 10.30.52.130

    *00:40:09.239*10:47:12 Setup of node 10.30.52.130 done

    *00:40:09.239*10:47:12 All nodes are ready

    *00:40:09.239*10:47:37 Tests.Vpp.Func.Interfaces

    *00:40:09.239*10:47:37 ========================================================================================================================================

    *00:40:09.239*10:47:37 Tests.Vpp.Func.Interfaces.Api-Crud-Tap-Func
    :: *Tap Interface CRUD Tests*

    *00:40:09.239*10:47:37 ========================================================================================================================================

    *00:40:09.239*10:47:37 TC01: Tap Interface Modify And Delete ::
    [Top] TG-DUT1-TG. | FAIL |

    *00:40:09.239*10:47:37 Parent suite setup failed:

    *00:40:09.239*10:47:37 NoValidConnectionsError: [Errno None]
    Unable to connect to port 22 on  or 10.30.52.131

    *00:40:09.239*10:47:37 ----------------------------------------------------------------------------------------------------------------------------------------

    https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-virl-master/9420/console

    *00:47:19.083*12:12:41 TC01: IPv4 Equal-cost multipath routing ::
    [Top] TG=DUT [ WARN ] None

    *00:47:19.083*12:12:41 None

    *00:47:19.083*12:12:41 | FAIL |

    *00:47:19.083*12:12:41 Setup failed:

    *00:47:19.083*12:12:41 SSHTimeout: Timeout exception during
    execution of command: pidof vpp

    *00:47:19.083*12:12:41 Current contents of stdout buffer:

    *00:47:19.083*12:12:41 Current contents of stderr buffer:

    https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-virl-master/9423/console

    - Here it failed during startup of the simulation:

    *00:27:07.065*DEBUG: Node tg1 is of type tg and has mgmt IP
    10.30.54.141

    *00:27:07.065*DEBUG: Node sut1 is of type sut and has mgmt IP
    10.30.54.139

    *00:27:07.065*DEBUG: Node sut2 is of type sut and has mgmt IP
    10.30.54.140

    *00:27:07.065*DEBUG: Waiting for hosts to become reachable over SSH

    *00:27:12.070*DEBUG: Attempt 1 out of 48, waiting for 2 hosts

    ...

    *00:31:07.245*DEBUG: Attempt 48 out of 48, waiting for 2 hosts

    *00:31:07.245*ERROR: Simulation started OK but 2 hosts never
    mounted their NFS directory

    
    https://jenkins.fd.io/job/csit-vpp-functional-1801-ubuntu1604-virl/211/console

    *02:01:43.419*TC02: DUT with iACL MAC dst-addr drops matching pkts
    :: [Top] TG-DUT1-DUT2-TG. [ WARN ]
    Tests.Vpp.Func.L2Xc.Eth2P-Eth-L2Xcbase-Iaclbase-Func - TC02: DUT
    with iACL MAC dst-addr drops matching pkts

    *02:03:23.456*The VPP PIDs are not equal!

    *02:03:23.456*Test Setup VPP PIDs: {'10.30.53.219': 23424,
    '10.30.53.218': 5433}

    *02:03:23.456*Test Teardown VPP PIDs: None

    *02:03:23.456*Tests.Vpp.Func.L2Xc.Eth2P-Eth-L2Xcbase-Iaclbase-Func
    - TC02: DUT with iACL MAC dst-addr drops matching pkts

    *02:03:23.456*The VPP PIDs are not equal!

    *02:03:23.456*Test Setup VPP PIDs: {'10.30.53.219': 23424,
    '10.30.53.218': 5433}

    *02:03:23.456*Test Teardown VPP PIDs: None

    *02:03:23.458*| FAIL |

    *02:03:23.458*Expected error 'ICMP echo Rx timeout' but got
    'NoValidConnectionsError: [Errno None] Unable to connect to port
    22 on  or 10.30.53.220'.

    
    https://jenkins.fd.io/job/csit-vpp-functional-1801-ubuntu1604-virl/212/consoleFull

    *00:40:57.431*TC02: Process tagged send untagged [ ERROR ] VAT
    script execution timeout: sudo -S vpp_api_test  in
    /tmp/openvpp-testing/resources/templates/vat/show_trace.vat script

    *00:41:53.602*| FAIL |

    *00:41:53.603*Teardown failed:

    *00:41:53.603*SSHTimeout: Timeout exception during execution of
    command: sudo -S vpp_api_test  in
    /tmp/openvpp-testing/resources/templates/vat/show_trace.vat script

    
    https://jenkins.fd.io/job/csit-vpp-functional-1801-centos7-virl/212/consoleFull

    *02:00:59.621*TC07: DUT1 and DUT2 with L2BD and VLAN translate-2-2
    with wrong outer tag used (DUT1) switch ICMPv6 between two TG
    links :: ... [ WARN ]
    Tests.Vpp.Func.L2Bd.Eth2P-Dot1Ad-L2Bdbasemaclrn-Vlantrans22-Func -
    TC07: DUT1 and DUT2 with L2BD and VLAN translate-2-2 with wrong
    outer tag used (DUT1) switch ICMPv6 between two TG links

    *02:02:31.332*The VPP PIDs are not equal!

    *02:02:31.332*Test Setup VPP PIDs: {'10.30.54.206': 25207,
    '10.30.54.205': 26542}

    *02:02:31.332*Test Teardown VPP PIDs: None

    
    *02:02:31.332*Tests.Vpp.Func.L2Bd.Eth2P-Dot1Ad-L2Bdbasemaclrn-Vlantrans22-Func
    - TC07: DUT1 and DUT2 with L2BD and VLAN translate-2-2 with wrong
    outer tag used (DUT1) switch ICMPv6 between two TG links

    *02:02:31.332*The VPP PIDs are not equal!

    *02:02:31.332*Test Setup VPP PIDs: {'10.30.54.206': 25207,
    '10.30.54.205': 26542}

    *02:02:31.332*Test Teardown VPP PIDs: None

    *02:02:31.343*| FAIL |

    *02:02:31.343*Setup failed:

    *02:02:31.343*NoValidConnectionsError: [Errno None] Unable to
    connect to port 22 on  or 10.30.54.207

    Could you please have a look at it?

    Thanks,

    Jan

    *From:*csit-...@lists.fd.io
    <mailto:csit-...@lists.fd.io>[mailto:csit-...@lists.fd.io]*On
    Behalf Of*Thomas F Herbert
    *Sent:*Tuesday, February 13, 2018 2:14 PM
    *To:*csit-...@lists.fd.io <mailto:csit-...@lists.fd.io>
    *Subject:*Re: [csit-dev] t4-virl3 moved from testing back into
    production

    Yes,

    Everything should work fine with the 1.4 image.

    --Tom

    On 02/13/2018 12:27 AM, Peter Mikus wrote:

        Thank you Ed.

        Peter Mikus

        Engineer – Software

        Cisco Systems Limited

        -----Original Message-----

        From:csit-...@lists.fd.io <mailto:csit-...@lists.fd.io> [mailto:csit-...@lists.fd.io] On Behalf Of Ed Kern (ejk)

        Sent: Tuesday, February 13, 2018 12:06 AM

        To:csit-...@lists.fd.io <mailto:csit-...@lists.fd.io>

        Cc: Thomas F Herbert<therb...@redhat.com> <mailto:therb...@redhat.com>

        Subject: [csit-dev] FYI: t4-virl3 moved from testing back into production

        This is still with the ‘older’ centos image.  Thomas said the newer one isn’t quite ready yet.

        So this is just an FYI: if folks see something amiss with the virl jobs in the next few days, let me know.

        thanks,

        Ed

    --
    *Thomas F Herbert*
    NFV and Fast Data Planes
    Networking Group Office of the CTO
    *Red Hat*



--
*Thomas F Herbert*
NFV and Fast Data Planes
Networking Group Office of the CTO
*Red Hat*
