Hello,

Just FYI: tb4 (10.31.51.28) has not been in production for a very long time (I think since the beginning), and that is not due to CentOS. You can use any of the 3 TBs to develop new images without impacting running simulations (you just need to be careful and know what you are doing).
We did not run simulations in parallel from the beginning, and since we now run 3 simulations at once we are quickly draining free IP addresses from the pool. To summarize, the root cause is the lack of a cleaning mechanism for "dead" sessions, which should be addressed. We also need to develop a monitoring system that reports the state of VIRL. Simply said, we can have an army of TBs, but if dead sessions are not deleted it is a waste of resources. TB4 has license issues that need to be fixed before we put it into production.

Peter Mikus
Engineer - Software
Cisco Systems Limited

From: csit-dev-boun...@lists.fd.io [mailto:csit-dev-boun...@lists.fd.io] On Behalf Of Thomas F Herbert
Sent: Thursday, February 02, 2017 4:51 PM
To: csit-...@lists.fd.io
Cc: vpp-dev <vpp-dev@lists.fd.io>
Subject: Re: [csit-dev] Fix for vpp-verify-master-ubuntu1604 build failures

On 02/02/2017 09:36 AM, Jan Gelety -X (jgelety - PANTHEON TECHNOLOGIES at Cisco) wrote:

Hello,

I had a look at all three VIRL servers. I deleted a few stuck sessions.

I think the main issue is that one of the VIRL servers is in TESTING status (I guess because of Thomas Herbert's Centos7 VIRL image tests) and so is not used for Jenkins jobs. This can lead to a situation where there are not enough free IP addresses on a VIRL server to finish simulation (session) start-up when that server already has a high number of running sessions.

Jan, I was using tb4, 10.31.51.28. I think we should put all servers back into production. I didn't realize that some were removed from production. I still have to create the NESTED image and fix a potential problem with the serial console, but I don't need the servers now. Resolve production issues first.

We should get the third VIRL server back to PRODUCTION status as soon as possible to increase VIRL capacity.

@Thomas - when do you expect to finish your work on Centos7 preparation for VIRL?
We should also look at the VIRL simulation start-up procedure to improve handling of the situation when a VIRL simulation does not start successfully - maybe try another VIRL server if one is available.

Regards,
Jan

From: Dave Wallace [mailto:dwallac...@gmail.com]
Sent: Thursday, February 02, 2017 06:14
To: vpp-dev <vpp-dev@lists.fd.io>; csit-...@lists.fd.io; Jan Gelety -X (jgelety - PANTHEON TECHNOLOGIES at Cisco) <jgel...@cisco.com>
Subject: Re: Fix for vpp-verify-master-ubuntu1604 build failures

Jan, csit-dev,

There have been a number of failures of the vpp-csit-verify-virl-master jobs triggered by my rebasing of vpp patches (see thread below for details). Some of these failures may be valid test failures. However, I have looked at a couple of them that seem to indicate there may be an issue starting up the VIRL VMs.

The error signature that I'm seeing is the following error after the three simulations are spun up.
[Excerpt from https://jenkins.fd.io/job/vpp-csit-verify-virl-master/3637/console]:

---- %< ----
04:01:32 + VIRL_SID[${index}]='ERROR: Simulation started OK but devices never changed to ACTIVE state
04:01:32 Last VIRL response:
04:01:32 {u'\''session-Pv076_'\'': {u'\''~mgmt-lxc'\'': {u'\''vnc-console'\'': False, u'\''subtype'\'': u'\''mgmt-lxc'\'', u'\''state'\'': u'\''ABSENT'\'', u'\''management-protocol'\'': u'\''ssh'\'', u'\''management-proxy'\'': u'\''self'\'', u'\''serial-ports'\'': 0}, u'\''tg1'\'': {u'\''vnc-console'\'': True, u'\''subtype'\'': u'\''server'\'', u'\''state'\'': u'\''ABSENT'\'', u'\''management-protocol'\'': u'\''ssh'\'', u'\''management-proxy'\'': u'\''lxc'\'', u'\''serial-ports'\'': 1}, u'\''sut1'\'': {u'\''vnc-console'\'': True, u'\''subtype'\'': u'\''vPP'\'', u'\''state'\'': u'\''ABSENT'\'', u'\''management-protocol'\'': u'\''ssh'\'', u'\''management-proxy'\'': u'\''lxc'\'', u'\''serial-ports'\'': 1}, u'\''sut2'\'': {u'\''vnc-console'\'': True, u'\''subtype'\'': u'\''vPP'\'', u'\''state'\'': u'\''ABSENT'\'', u'\''management-protocol'\'': u'\''ssh'\'', u'\''management-proxy'\'': u'\''lxc'\'', u'\''serial-ports'\'': 1}}}'
04:01:32 + retval=1
04:01:32 + '[' 1 -ne 0 ']'
04:01:32 + echo 'VIRL simulation start failed on 10.30.51.29'
04:01:32 VIRL simulation start failed on 10.30.51.29
---- %< ----

Can you please take a look at the most recent vpp-csit-verify-virl-master failures to see if this is a CSIT operational issue or a valid test failure?

Thanks,
-daw-

On 2/1/17 10:15 PM, Dave Wallace wrote:

On 2/1/17 9:55 PM, Dave Wallace wrote:

Folks,

After today's ubuntu mirror issue was resolved, Ed, Vanessa, and I discovered another failure mode for the vpp-verify-master-ubuntu1604 verify job. In the process of diagnosing the failure, Ed discovered and fixed a bug in "make verify" that was the root cause of this issue. See https://gerrit.fd.io/r/#/c/4993/ for details.
I merged this patch and verified that it resolved the vpp-verify-master-ubuntu1604 failure for https://gerrit.fd.io/r/#/c/4897. I have subsequently rebased all patches in the gerrit:vpp queue that were open and current. Any patch that has merge conflicts will need to be rebased manually.

Thanks to Ed for his keen eyesight and Vanessa for cancelling her after hours plans to stay and help resolve the issue.

I have also cherry-picked 4993 to stable/1701, but that still requires merging.

Please disregard the following, it appears that the other jobs are waiting in the build queue and have not been posted to gerrit yet.

I also noticed that stable/1701 only has verify jobs for ubuntu1404 and centos7. We should add a verify job for ubuntu1604 as well.

-daw-

Please help monitor the status of the verify jobs that are now in progress.

Thanks,
-daw-

_______________________________________________
csit-dev mailing list
csit-...@lists.fd.io
https://lists.fd.io/mailman/listinfo/csit-dev

--
Thomas F Herbert
SDN Group
Office of Technology
Red Hat
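A minimal sketch of two ideas raised in the thread: checking the "Last VIRL response" for nodes that never reached ACTIVE, and Jan's suggestion to try another VIRL server when simulation start-up fails. It only assumes the response shape visible in the console excerpt ({session: {node: {'state': ...}}}). The helper names inactive_nodes and start_with_fallback, and the start_simulation callable, are hypothetical and are not part of the CSIT scripts or of any VIRL API.

```python
def inactive_nodes(virl_response):
    """Return {node_name: state} for every node whose state is not ACTIVE.

    virl_response is assumed to have the shape seen in the console excerpt:
    {session_id: {node_name: {"state": "...", ...}}}.
    """
    return {
        node: info.get("state")
        for nodes in virl_response.values()
        for node, info in nodes.items()
        if info.get("state") != "ACTIVE"
    }


def start_with_fallback(servers, start_simulation):
    """Try each VIRL server in order and return (server, session_id).

    start_simulation is a hypothetical callable supplied by the caller:
    it takes a server address and returns a session id, or raises
    RuntimeError when the simulation does not start successfully.
    """
    errors = {}
    for server in servers:
        try:
            return server, start_simulation(server)
        except RuntimeError as exc:
            # Remember why this server failed and move on to the next one.
            errors[server] = str(exc)
    raise RuntimeError("simulation start failed on all servers: %r" % errors)
```

With a start-up function that raises on failure, start_with_fallback(list_of_servers, start_fn) returns the first server that accepted the simulation. This does not replace cleaning up dead sessions: if every server has run out of free IP addresses, all attempts fail and the collected errors are reported together.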