Hello,

I had a look at all three VIRL servers and deleted a few stuck sessions.
I think the main issue is that one of the VIRL servers is in TESTING status (I guess because of Thomas Herbert's CentOS 7 VIRL image tests), so it is not used for Jenkins jobs. This can lead to a situation where there are not enough free IP addresses available on a VIRL server to finish the simulation (session) startup when that server already has a high number of running sessions.

We should get the third VIRL server back to PRODUCTION status as soon as possible to increase VIRL capacity. @Thomas, when do you expect to finish your work on the CentOS 7 preparation for VIRL?

We should also look at the VIRL simulation startup procedure to improve handling of the case where a simulation does not start successfully, for example by trying another VIRL server if one is available.

Regards,
Jan

From: Dave Wallace [mailto:dwallac...@gmail.com]
Sent: Thursday, February 02, 2017 06:14
To: vpp-dev <vpp-dev@lists.fd.io>; csit-...@lists.fd.io; Jan Gelety -X (jgelety - PANTHEON TECHNOLOGIES at Cisco) <jgel...@cisco.com>
Subject: Re: Fix for vpp-verify-master-ubuntu1604 build failures

Jan, csit-dev,

There have been a number of failures of the vpp-csit-verify-virl-master jobs triggered by my rebasing of vpp patches (see thread below for details). Some of these failures may be valid test failures. However, I have looked at a couple of them that seem to indicate there may be an issue starting up the VIRL VMs. The error signature that I'm seeing is the following error after the three simulations are spun up.
[Excerpt from https://jenkins.fd.io/job/vpp-csit-verify-virl-master/3637/console]:

---- %< ----
04:01:32 + VIRL_SID[${index}]='ERROR: Simulation started OK but devices never changed to ACTIVE state
04:01:32 Last VIRL response:
04:01:32 {u'\''session-Pv076_'\'': {u'\''~mgmt-lxc'\'': {u'\''vnc-console'\'': False, u'\''subtype'\'': u'\''mgmt-lxc'\'', u'\''state'\'': u'\''ABSENT'\'', u'\''management-protocol'\'': u'\''ssh'\'', u'\''management-proxy'\'': u'\''self'\'', u'\''serial-ports'\'': 0}, u'\''tg1'\'': {u'\''vnc-console'\'': True, u'\''subtype'\'': u'\''server'\'', u'\''state'\'': u'\''ABSENT'\'', u'\''management-protocol'\'': u'\''ssh'\'', u'\''management-proxy'\'': u'\''lxc'\'', u'\''serial-ports'\'': 1}, u'\''sut1'\'': {u'\''vnc-console'\'': True, u'\''subtype'\'': u'\''vPP'\'', u'\''state'\'': u'\''ABSENT'\'', u'\''management-protocol'\'': u'\''ssh'\'', u'\''management-proxy'\'': u'\''lxc'\'', u'\''serial-ports'\'': 1}, u'\''sut2'\'': {u'\''vnc-console'\'': True, u'\''subtype'\'': u'\''vPP'\'', u'\''state'\'': u'\''ABSENT'\'', u'\''management-protocol'\'': u'\''ssh'\'', u'\''management-proxy'\'': u'\''lxc'\'', u'\''serial-ports'\'': 1}}}'
04:01:32 + retval=1
04:01:32 + '[' 1 -ne 0 ']'
04:01:32 + echo 'VIRL simulation start failed on 10.30.51.29'
04:01:32 VIRL simulation start failed on 10.30.51.29
---- %< ----

Can you please take a look at the most recent vpp-csit-verify-virl-master failures to see if this is a CSIT operational issue or a valid test failure?

Thanks,
-daw-

On 2/1/17 10:15 PM, Dave Wallace wrote:

On 2/1/17 9:55 PM, Dave Wallace wrote:

Folks,

After today's ubuntu mirror issue was resolved, Ed, Vanessa, and I discovered another failure mode for the vpp-verify-master-ubuntu1604 verify job. In the process of diagnosing the failure, Ed discovered and fixed a bug in "make verify" that was the root cause of this issue. See https://gerrit.fd.io/r/#/c/4993/ for details.
I merged this patch and verified that it resolved the vpp-verify-master-ubuntu1604 failure for https://gerrit.fd.io/r/#/c/4897. I have subsequently rebased all patches in the gerrit:vpp queue that were open and current. Any patch that has merge conflicts will need to be rebased manually.

Thanks to Ed for his keen eyesight and Vanessa for cancelling her after hours plans to stay and help resolve the issue.

I have also cherry-picked 4993 to stable/1701, but that still requires merging.

Please disregard the following, it appears that the other jobs are waiting in the build queue and have not been posted to gerrit yet.

I also noticed that stable/1701 only has verify jobs for ubuntu1404 and centos7. We should add a verify job for ubuntu1604 as well.

-daw-

Please help monitor the status of the verify jobs that are now in progress.

Thanks,
-daw-
_______________________________________________
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev
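
Jan's suggestion in the reply at the top of this thread (try another VIRL server when a simulation fails to start) could be sketched roughly as below. This is only an illustration under stated assumptions, not the CSIT implementation: start_simulation and stop_simulation are hypothetical callables standing in for whatever the real scripts do, the server names are placeholders, and the status dictionary merely mirrors the shape of the "Last VIRL response" in the console excerpt.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

# Assumed shape, mirroring the "Last VIRL response" in the console excerpt:
# {session_id: {node_name: {"state": "ACTIVE" | "ABSENT" | ...}}}
Status = Dict[str, Dict[str, Dict[str, object]]]


def inactive_nodes(status: Status) -> Dict[str, str]:
    """Return {node_name: state} for every node that is not ACTIVE."""
    bad: Dict[str, str] = {}
    for nodes in status.values():
        for name, info in nodes.items():
            state = str(info.get("state"))
            if state != "ACTIVE":
                bad[name] = state
    return bad


def start_with_failover(
    servers: Iterable[str],
    start_simulation: Callable[[str], Tuple[str, Status]],
    stop_simulation: Callable[[str, str], None],
) -> Optional[Tuple[str, str]]:
    """Try each server in turn; return (server, session_id) or None.

    start_simulation and stop_simulation are hypothetical hooks,
    not a real VIRL API.
    """
    for server in servers:
        session_id, status = start_simulation(server)
        bad = inactive_nodes(status)
        if not bad:
            return server, session_id
        print(f"VIRL simulation start failed on {server}: {bad}")
        # Remove the failed session so it does not stay stuck on the server.
        stop_simulation(server, session_id)
    return None


if __name__ == "__main__":
    # Fake hooks for demonstration only: "server-a" never reaches ACTIVE.
    def fake_start(server: str) -> Tuple[str, Status]:
        state = "ABSENT" if server == "server-a" else "ACTIVE"
        session_id = f"session-{server}"
        nodes = {"tg1": {"state": state}, "sut1": {"state": state}}
        return session_id, {session_id: nodes}

    def fake_stop(server: str, session_id: str) -> None:
        print(f"stopped {session_id} on {server}")

    print(start_with_failover(["server-a", "server-b"], fake_start, fake_stop))
```

The sketch stops a failed session before moving on, on the assumption that leftover sessions are what ties up free IP addresses on a busy server. Whether the real startup script can cleanly stop a half-started simulation is not something this thread confirms.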