On 02/02/2017 09:36 AM, Jan Gelety -X (jgelety - PANTHEON TECHNOLOGIES at Cisco) wrote:

Hello,

I had a look at all three VIRL servers and deleted a few stuck sessions.

I think the main issue is that one of the VIRL servers is in TESTING status (I guess because of Thomas Herbert’s Centos7 VIRL image tests), so it is not used for Jenkins jobs. This could lead to a situation where there are not enough free IP addresses available on a VIRL server to finish the simulation (session) start-up when that server already has a high number of running sessions.

Jan, I was using tb4, 10.31.51.28. I think we should put all servers back into production. I didn't realize that some had been removed from production. I still have to create the NESTED image and fix a potential problem with the serial console, but I don't need the servers now.
Resolve production issues first.

We should get the third VIRL server back to PRODUCTION status as soon as possible to increase VIRL capacity.

*@Thomas* – when do you expect to finish your work on Centos7 preparation for VIRL?

We should also look at the VIRL simulation start-up procedure to improve handling of the case where a simulation does not start successfully, for example by trying another VIRL server if one is available.
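
As a rough illustration of the fallback idea, here is a minimal Python sketch. It is not the existing CSIT start-up script: the function names, the server labels, and the assumption that a failed start-up surfaces as a RuntimeError are all hypothetical.

```python
from typing import Callable, Dict, Sequence, Tuple


def start_with_fallback(
    servers: Sequence[str],
    start_simulation: Callable[[str], str],
) -> Tuple[str, str]:
    """Try each server in order; return (server, session_id) for the first success."""
    errors: Dict[str, str] = {}
    for server in servers:
        try:
            session_id = start_simulation(server)
        except RuntimeError as exc:
            # Remember why this server failed, then move on to the next one.
            errors[server] = str(exc)
            continue
        return server, session_id
    # Every server failed (or none was given): report all collected errors.
    raise RuntimeError("VIRL simulation start failed on all servers: %r" % (errors,))


def fake_start(server: str) -> str:
    """Hypothetical stand-in for the real start-up call; the first server always fails."""
    if server == "virl-a":
        raise RuntimeError("devices never changed to ACTIVE state")
    return "session-on-" + server


if __name__ == "__main__":
    print(start_with_fallback(["virl-a", "virl-b", "virl-c"], fake_start))
```

The job would fail only after every available server has been tried, and the final error lists the reason reported by each one.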

Regards,

Jan

*From:* Dave Wallace [mailto:dwallac...@gmail.com]
*Sent:* Thursday, February 02, 2017 06:14
*To:* vpp-dev <vpp-dev@lists.fd.io>; csit-...@lists.fd.io; Jan Gelety -X (jgelety - PANTHEON TECHNOLOGIES at Cisco) <jgel...@cisco.com>
*Subject:* Re: Fix for vpp-verify-master-ubuntu1604 build failures

Jan, csit-dev,

There have been a number of failures of the vpp-csit-verify-virl-master jobs triggered by my rebasing of vpp patches (see thread below for details). Some of these failures may be valid test failures. However, I have looked at a couple of them that seem to indicate there may be an issue starting up the VIRL VMs. The error signature I'm seeing is the following, which appears after the three simulations are spun up.

[Excerpt from https://jenkins.fd.io/job/vpp-csit-verify-virl-master/3637/console]:
---- %< ----
04:01:32 + VIRL_SID[${index}]='ERROR: Simulation started OK but devices never changed to ACTIVE state
04:01:32 Last VIRL response:
04:01:32 {u'\''session-Pv076_'\'': {u'\''~mgmt-lxc'\'': {u'\''vnc-console'\'': False, u'\''subtype'\'': u'\''mgmt-lxc'\'', u'\''state'\'': u'\''ABSENT'\'', u'\''management-protocol'\'': u'\''ssh'\'', u'\''management-proxy'\'': u'\''self'\'', u'\''serial-ports'\'': 0}, u'\''tg1'\'': {u'\''vnc-console'\'': True, u'\''subtype'\'': u'\''server'\'', u'\''state'\'': u'\''ABSENT'\'', u'\''management-protocol'\'': u'\''ssh'\'', u'\''management-proxy'\'': u'\''lxc'\'', u'\''serial-ports'\'': 1}, u'\''sut1'\'': {u'\''vnc-console'\'': True, u'\''subtype'\'': u'\''vPP'\'', u'\''state'\'': u'\''ABSENT'\'', u'\''management-protocol'\'': u'\''ssh'\'', u'\''management-proxy'\'': u'\''lxc'\'', u'\''serial-ports'\'': 1}, u'\''sut2'\'': {u'\''vnc-console'\'': True, u'\''subtype'\'': u'\''vPP'\'', u'\''state'\'': u'\''ABSENT'\'', u'\''management-protocol'\'': u'\''ssh'\'', u'\''management-proxy'\'': u'\''lxc'\'', u'\''serial-ports'\'': 1}}}'
04:01:32 + retval=1
04:01:32 + '[' 1 -ne 0 ']'
04:01:32 + echo 'VIRL simulation start failed on 10.30.51.29'
04:01:32 VIRL simulation start failed on 10.30.51.29
---- %< ----
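
For anyone triaging similar failures, the "Last VIRL response" above can be summarized per node. The following is only a sketch: it assumes the response is the Python dict repr shown in the log with the shell quoting ('\'' sequences) already removed, and the function name and the trimmed sample are hypothetical.

```python
import ast
from typing import Dict, Tuple

# Trimmed sample modeled on the excerpt above, with the shell escaping removed.
SAMPLE = (
    "{u'session-Pv076_': {"
    "u'tg1': {u'subtype': u'server', u'state': u'ABSENT'}, "
    "u'sut1': {u'subtype': u'vPP', u'state': u'ABSENT'}}}"
)


def nodes_not_active(response_text: str) -> Dict[Tuple[str, str], str]:
    """Map (session, node) to its reported state for every node that is not ACTIVE."""
    # literal_eval only parses literals, so it is safe on untrusted log text.
    sessions = ast.literal_eval(response_text)
    stuck = {}
    for session, nodes in sessions.items():
        for name, info in nodes.items():
            state = info.get("state")
            if state != "ACTIVE":
                stuck[(session, name)] = state
    return stuck


if __name__ == "__main__":
    for (session, node), state in sorted(nodes_not_active(SAMPLE).items()):
        print(session, node, state)
```

In the excerpt, every node reports ABSENT, which suggests the VMs were never created rather than that a test failed.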

Can you please take a look at the most recent vpp-csit-verify-virl-master failures to see if this is a CSIT operational issue or a valid test failure?

Thanks,
-daw-

On 2/1/17 10:15 PM, Dave Wallace wrote:

    On 2/1/17 9:55 PM, Dave Wallace wrote:

        Folks,

        After today's ubuntu mirror issue was resolved, Ed, Vanessa,
        and I discovered another failure mode for the
        vpp-verify-master-ubuntu1604 verify job.  In the process of
        diagnosing the failure, Ed discovered and fixed a bug in "make
        verify" that was the root cause of this issue.

        See https://gerrit.fd.io/r/#/c/4993/ for details.

        I merged this patch and verified that it resolved the
        vpp-verify-master-ubuntu1604 failure for
        https://gerrit.fd.io/r/#/c/4897. I have subsequently rebased
        all patches in the gerrit:vpp queue that were open and
        current.  Any patch that has merge conflicts will need to be
        rebased manually.

        Thanks to Ed for his keen eyesight and Vanessa for cancelling
        her after-hours plans to stay and help resolve the issue.

        I have also cherry-picked 4993 to stable/1701, but that still
        requires merging.


    Please disregard the following; it appears that the other jobs are
    waiting in the build queue and have not been posted to gerrit yet.

        I also noticed that stable/1701 only has verify jobs for
        ubuntu1404 and centos7.  We should add a verify job for
        ubuntu1604 as well.

    -daw-


        Please help monitor the status of the verify jobs that are now
        in progress.

        Thanks,
        -daw-



_______________________________________________
csit-dev mailing list
csit-...@lists.fd.io
https://lists.fd.io/mailman/listinfo/csit-dev

--
*Thomas F Herbert*
SDN Group
Office of Technology
*Red Hat*
_______________________________________________
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev
