Jan and Ed,
I recommended we go back into production when I saw the tests running
consistently.
However, I did see SSH timeouts during my testing, but only when I had more
than 3 simulations running simultaneously on VIRL3 with all the tests
enabled.
I figured that was due to the unusual use case where all the
simultaneous tests were running from all the simulations on the same
server and guessed that was less likely to happen when the tests were
getting distributed across all 3 servers.
--Tom
On 02/14/2018 10:47 AM, Jan Gelety wrote:
Hello Ed,
See the first occurrences of the connection issues in the logs below
(including elapsed time).
Regards,
Jan
*From:* vpp-dev@lists.fd.io [mailto:vpp-dev@lists.fd.io] *On Behalf Of*
Ed Kern (ejk)
*Sent:* Tuesday, February 13, 2018 6:59 PM
*To:* vpp-dev@lists.fd.io
*Cc:* csit-...@lists.fd.io; vpp-dev@lists.fd.io
*Subject:* Re: [vpp-dev] [csit-dev] t4-virl3 moved from testing back
into production - SSH timouts on VIRL
*Importance:* High
OK, I'm looking (and for the time being I've flipped it back to testing).
I am not seeing issues on the server end. Don't get me wrong, I totally
believe you that there is a problem, but I'm not seeing specific timeout
issues related to virl3 in the logs below.
Could you schedule a WebEx or send pointers to help me track this down?
thanks,
Ed
On Feb 13, 2018, at 6:23 AM, Jan Gelety -X (jgelety - PANTHEON
TECHNOLOGIES at Cisco) <jgel...@cisco.com> wrote:
Hello Ed,
Unfortunately, we have been hitting SSH timeouts quite often since
virl3 was moved to production:
https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-virl-master/9418/console
*00:40:09.239*10:46:26 [ ERROR ] Node 10.30.52.131 setup failed,
error:''
*00:40:09.239*10:47:11 Extracting tarball to /tmp/openvpp-testing
on 10.30.52.130
*00:40:09.239*10:47:12 Setup of node 10.30.52.130 done
*00:40:09.239*10:47:12 All nodes are ready
*00:40:09.239*10:47:37 Tests.Vpp.Func.Interfaces
*00:40:09.239*10:47:37
========================================================================================================================================
*00:40:09.239*10:47:37 Tests.Vpp.Func.Interfaces.Api-Crud-Tap-Func
:: *Tap Interface CRUD Tests*
*00:40:09.239*10:47:37
========================================================================================================================================
*00:40:09.239*10:47:37 TC01: Tap Interface Modify And Delete ::
[Top] TG-DUT1-TG.
| FAIL |
*00:40:09.239*10:47:37 Parent suite setup failed:
*00:40:09.239*10:47:37 NoValidConnectionsError: [Errno None]
Unable to connect to port 22 on or 10.30.52.131
*00:40:09.239*10:47:37
----------------------------------------------------------------------------------------------------------------------------------------
https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-virl-master/9420/console
*00:47:19.083*12:12:41 TC01: IPv4 Equal-cost multipath routing ::
[Top] TG=DUT
[ WARN ] None
*00:47:19.083*12:12:41 None
*00:47:19.083*12:12:41 | FAIL |
*00:47:19.083*12:12:41 Setup failed:
*00:47:19.083*12:12:41 SSHTimeout: Timeout exception during
execution of command: pidof vpp
*00:47:19.083*12:12:41 Current contents of stdout buffer:
*00:47:19.083*12:12:41 Current contents of stderr buffer:
https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-virl-master/9423/console
- Here it failed during startup of the simulation:
*00:27:07.065*DEBUG: Node tg1 is of type tg and has mgmt IP
10.30.54.141
*00:27:07.065*DEBUG: Node sut1 is of type sut and has mgmt IP
10.30.54.139
*00:27:07.065*DEBUG: Node sut2 is of type sut and has mgmt IP
10.30.54.140
*00:27:07.065*DEBUG: Waiting for hosts to become reachable over SSH
*00:27:12.070*DEBUG: Attempt 1 out of 48, waiting for 2 hosts
...
*00:31:07.245*DEBUG: Attempt 48 out of 48, waiting for 2 hosts
*00:31:07.245*ERROR: Simulation started OK but 2 hosts never
mounted their NFS directory
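The wait loop in the excerpt above (48 attempts, roughly 5 seconds apart judging by the elapsed timestamps) can be approximated for debugging with a minimal sketch like the one below. This is not the actual VIRL start script: the function names `port_open` and `wait_for_ssh` are hypothetical, and the probe only checks that TCP port 22 accepts a connection, not that a login succeeds or that the NFS directory is mounted.

```python
import socket
import time


def port_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_ssh(hosts, attempts=48, interval=5.0, probe=port_open,
                 sleep=time.sleep):
    """Poll until every host accepts TCP connections on port 22.

    Returns the set of hosts still unreachable after the last attempt;
    an empty set means all hosts came up. The defaults (48 attempts,
    5 s interval) are assumptions read off the log above.
    """
    pending = set(hosts)
    for attempt in range(1, attempts + 1):
        # Keep only the hosts that still fail the probe.
        pending = {h for h in pending if not probe(h)}
        if not pending:
            break
        print(f"Attempt {attempt} out of {attempts}, "
              f"waiting for {len(pending)} hosts")
        if attempt < attempts:
            sleep(interval)
    return pending
```

Running `wait_for_ssh(["10.30.54.139", "10.30.54.140"])` from the Jenkins slave while a simulation starts would show whether the management addresses ever open port 22, which helps separate a network or boot problem from an NFS mount problem.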
https://jenkins.fd.io/job/csit-vpp-functional-1801-ubuntu1604-virl/211/console
*02:01:43.419*TC02: DUT with iACL MAC dst-addr drops matching pkts
:: [Top] TG-DUT1-DUT2-TG. [ WARN ]
Tests.Vpp.Func.L2Xc.Eth2P-Eth-L2Xcbase-Iaclbase-Func - TC02: DUT
with iACL MAC dst-addr drops matching pkts
*02:03:23.456*The VPP PIDs are not equal!
*02:03:23.456*Test Setup VPP PIDs: {'10.30.53.219': 23424,
'10.30.53.218': 5433}
*02:03:23.456*Test Teardown VPP PIDs: None
*02:03:23.456*Tests.Vpp.Func.L2Xc.Eth2P-Eth-L2Xcbase-Iaclbase-Func
- TC02: DUT with iACL MAC dst-addr drops matching pkts
*02:03:23.456*The VPP PIDs are not equal!
*02:03:23.456*Test Setup VPP PIDs: {'10.30.53.219': 23424,
'10.30.53.218': 5433}
*02:03:23.456*Test Teardown VPP PIDs: None
*02:03:23.458*| FAIL |
*02:03:23.458*Expected error 'ICMP echo Rx timeout' but got
'NoValidConnectionsError: [Errno None] Unable to connect to port
22 on or 10.30.53.220'.
https://jenkins.fd.io/job/csit-vpp-functional-1801-ubuntu1604-virl/212/consoleFull
*00:40:57.431*TC02: Process tagged send untagged [ ERROR ] VAT
script execution timeout: sudo -S vpp_api_test in
/tmp/openvpp-testing/resources/templates/vat/show_trace.vat script
*00:41:53.602*| FAIL |
*00:41:53.603*Teardown failed:
*00:41:53.603*SSHTimeout: Timeout exception during execution of
command: sudo -S vpp_api_test in
/tmp/openvpp-testing/resources/templates/vat/show_trace.vat script
https://jenkins.fd.io/job/csit-vpp-functional-1801-centos7-virl/212/consoleFull
*02:00:59.621*TC07: DUT1 and DUT2 with L2BD and VLAN translate-2-2
with wrong outer tag used (DUT1) switch ICMPv6 between two TG
links :: ... [ WARN ]
Tests.Vpp.Func.L2Bd.Eth2P-Dot1Ad-L2Bdbasemaclrn-Vlantrans22-Func -
TC07: DUT1 and DUT2 with L2BD and VLAN translate-2-2 with wrong
outer tag used (DUT1) switch ICMPv6 between two TG links
*02:02:31.332*The VPP PIDs are not equal!
*02:02:31.332*Test Setup VPP PIDs: {'10.30.54.206': 25207,
'10.30.54.205': 26542}
*02:02:31.332*Test Teardown VPP PIDs: None
*02:02:31.332*Tests.Vpp.Func.L2Bd.Eth2P-Dot1Ad-L2Bdbasemaclrn-Vlantrans22-Func
- TC07: DUT1 and DUT2 with L2BD and VLAN translate-2-2 with wrong
outer tag used (DUT1) switch ICMPv6 between two TG links
*02:02:31.332*The VPP PIDs are not equal!
*02:02:31.332*Test Setup VPP PIDs: {'10.30.54.206': 25207,
'10.30.54.205': 26542}
*02:02:31.332*Test Teardown VPP PIDs: None
*02:02:31.343*| FAIL |
*02:02:31.343*Setup failed:
*02:02:31.343*NoValidConnectionsError: [Errno None] Unable to
connect to port 22 on or 10.30.54.207
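The excerpts above show three recurring signatures: NoValidConnectionsError, SSHTimeout, and the simulation startup failure. To see which of them dominate, and which management addresses are affected, a saved console log can be tallied with a minimal sketch like this one. The category names and the `tally` function are made up for illustration; only the matched strings come from the logs.

```python
import re
from collections import Counter

# Signature strings are copied from the console excerpts above;
# the category names on the left are hypothetical labels.
SIGNATURES = {
    "ssh_connect": re.compile(r"NoValidConnectionsError"),
    "ssh_timeout": re.compile(r"SSHTimeout"),
    "sim_startup": re.compile(r"Simulation started OK but \d+ hosts never"),
}
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def tally(console_text):
    """Count failure signatures and the hosts named in connect errors."""
    counts = Counter()
    hosts = Counter()
    lines = console_text.splitlines()
    for i, line in enumerate(lines):
        for name, pattern in SIGNATURES.items():
            if not pattern.search(line):
                continue
            counts[name] += 1
            if name == "ssh_connect":
                # The address may be wrapped onto the following line,
                # as it is in the mail excerpts, so look at both.
                nxt = lines[i + 1] if i + 1 < len(lines) else ""
                for ip in IPV4.findall(line + " " + nxt):
                    hosts[ip] += 1
    return counts, hosts
```

Calling `tally(open("console.txt").read())` on a downloaded console log returns the two counters; grouping the host counter by subnet should show whether the failures cluster on one VIRL server.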
Could you please have a look at it?
Thanks,
Jan
*From:* csit-...@lists.fd.io [mailto:csit-...@lists.fd.io] *On Behalf Of*
Thomas F Herbert
*Sent:* Tuesday, February 13, 2018 2:14 PM
*To:* csit-...@lists.fd.io
*Subject:* Re: [csit-dev] t4-virl3 moved from testing back into
production
Yes,
Everything should work fine with the 1.4 image.
--Tom
On 02/13/2018 12:27 AM, Peter Mikus wrote:
Thank you Ed.
Peter Mikus
Engineer – Software
Cisco Systems Limited
-----Original Message-----
From: csit-...@lists.fd.io [mailto:csit-...@lists.fd.io] On Behalf Of Ed Kern (ejk)
Sent: Tuesday, February 13, 2018 12:06 AM
To: csit-...@lists.fd.io
Cc: Thomas F Herbert <therb...@redhat.com>
Subject: [csit-dev] FYI: t4-virl3 moved from testing back into
production
This is still with the ‘older’ CentOS image. Thomas said the newer one
isn’t quite ready yet.
So this is just an FYI: if folks see something amiss with the virl jobs
in the next few days, let me know.
thanks,
Ed
--
*Thomas F Herbert*
NFV and Fast Data Planes
Networking Group Office of the CTO
*Red Hat*