Ah well, I assumed the default gateway router is running an IP routing stack compliant with RFC 791 and the associated best practices for selecting the best routes for forwarding. If this is not the case, then I would like the administrator of the aforementioned default gateway router to confirm the implementation and ensure that the configuration matches the test behaviour described in my email below.
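
The route selection assumed above can be sketched as a longest-prefix-match lookup. This is only a minimal illustration in Python, not the gateway's actual implementation: the three static routes are the ones from the test proposal quoted below, while the `lookup` helper and the `upstream` default next hop are hypothetical names made up for the example. Tie-breaking between equal-length prefixes (for example a connected route versus a static route for the same /24) is implementation-specific and deliberately not modelled here.

```python
import ipaddress

# Hypothetical routing table for illustration only. The three /24 static
# routes are taken from the test proposal in this thread; "upstream" is a
# placeholder next hop for the default route, not a real address.
ROUTES = [
    ("0.0.0.0/0", "upstream"),
    ("10.30.52.0/24", "10.30.51.28"),
    ("10.30.53.0/24", "10.30.51.29"),
    ("10.30.54.0/24", "10.30.51.30"),
]


def lookup(dst, routes=ROUTES):
    """Return the next hop of the longest prefix containing dst, or None."""
    addr = ipaddress.ip_address(dst)
    best = None  # (network, next_hop) of the most specific match so far
    for prefix, next_hop in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


if __name__ == "__main__":
    for dst in ("10.30.52.2", "10.30.53.2", "10.30.54.2", "10.30.48.5"):
        print(dst, "->", lookup(dst))
```

With this table, a destination such as 10.30.52.2 resolves to 10.30.51.28, while anything outside the three /24s falls through to the default route.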
-Maciek

> On 28 Jun 2017, at 14:03, Ed Kern (ejk) <e...@cisco.com> wrote:
>
> Hey maciek,
>
> We don’t need and would prefer NOT to remove that .1 addresses on those
> virtual routers.
>
> Just the addition of the static is whats needed right now..
>
> thanks,
>
> Ed
>
>> On Jun 28, 2017, at 5:07 AM, Maciek Konstantynowicz (mkonstan)
>> <mkons...@cisco.com> wrote:
>>
>> Anton and Team,
>>
>> The continued interruptions of IP connectivity to/from VIRL server
>> simulations on the management subnet have been impacting both CSIT and VPP
>> project operations. We decided to temporarily remove VPP VIRL based verify
>> jobs, job/vpp-csit-verify-virl-master/, from both per vpp patch auto-trigger
>> and the voting rights - Ed W. was kind enough to prepare the required
>> ci-mgmt patches, but they are not merged yet
>> (https://gerrit.fd.io/r/#/c/7319/, https://gerrit.fd.io/r/#/c/7320/).
>>
>> Before we proceed with the above step, we want to do one more set of network
>> infra focused tests per yesterday's exchange on #fdio-infra irc with
>> Vanessa/valderrv, Ed Kern/snergster and Mohammed/mnaser. Here is a quick
>> recap:
>>
>> Connectivity is affected between the following mgmt subnets added a few
>> weeks back as part of [FD.io Helpdesk #40733]:
>> 10.30.52.0/24
>> 10.30.53.0/24
>> 10.30.54.0/24
>>
>> The high packet drop rate (50..70%) problem seems to occur sporadically, but
>> only if packets are passing thru the default gateway router that has address
>> .1 in each of the above subnets. This affects all connectivity to jenkins
>> slaves, but also between tb4 virl hosts. The problem is never observed if
>> packets are sent directly between the hosts, it works fine.
>>
>> Test proposal:
>>
>> Configure the router that acts as default gateway for these subnets with the
>> following static routes:
>> 10.30.52.0/24 at 10.30.51.28 // tb-4virl1 mgmt addr
>> 10.30.53.0/24 at 10.30.51.29 // tb-4virl2 mgmt addr
>> 10.30.54.0/24 at 10.30.51.30 // tb-4virl3 mgmt addr
>> Meaning all packets to the above subnets will be routed through the main
>> management IP address on the respective tb4-virl host, per wiki [1].
>> This will remove the default gateway router from the problem domain under
>> investigation.
>>
>
> this is all well and good
>
>> Remove following IP addresses
>> from the default gateway router:
>> 10.30.52.1/24
>> 10.30.53.1/24
>> 10.30.54.1/24
>>
>
> Not sure how this got in there….we want to keep these right where they are
> unless there is some reason to remove them
>
>> Continue to advertise below routes into WAN to ensure reachability from
>> Jenkins slave and LF FD.io infra:
>> 10.30.52.0/24
>> 10.30.53.0/24
>> 10.30.54.0/24
>>
>
> this is also correct...
>
>> Could you pls advise when these can be conducted?
>>
>> -Maciek
>>
>> [1]
>> https://wiki.fd.io/view/CSIT/CSIT_LF_testbed#Management_VLAN_IP_Addresses_allocation
>>
>>> On 21 Jun 2017, at 16:00, Jan Gelety -X (jgelety - PANTHEON TECHNOLOGIES at
>>> Cisco) <jgel...@cisco.com> wrote:
>>>
>>> Hello Anton,
>>>
>>> We did some checks and here are results:
>>>
>>> 1. ping simulated node from the host itself - ping is OK
>>>
>>> 2. ping simulated node from other host (i.e. node simulated on virl2,
>>> executing ping command on virl3) - discovered packet loss (see e-mail from
>>> Peter below)
>>> - even for successful ping packet transition we can see the wide range of
>>> time - from cca 0,6ms to 45ms...
>>>
>>> We are still investigating VIRL settings but do you have some hints for us?
>>>
>>> Thanks,
>>> Jan
>>>
>>> -----Original Message-----
>>> From: Peter Mikus -X (pmikus - PANTHEON TECHNOLOGIES at Cisco)
>>> Sent: Wednesday, June 21, 2017 15:20
>>> To: Jan Gelety -X (jgelety - PANTHEON TECHNOLOGIES at Cisco)
>>> <jgel...@cisco.com>
>>> Subject: RE: [vpp-dev] [FD.io Helpdesk #41921] connection interruptiones
>>> between jenkins executor and VIRL servers
>>>
>>> virl@t4-virl3:/home/testuser$ ping 10.30.51.127
>>> PING 10.30.51.127 (10.30.51.127) 56(84) bytes of data.
>>> 64 bytes from 10.30.51.127: icmp_seq=54 ttl=64 time=1.86 ms
>>> ...
>>> ^C
>>> --- 10.30.51.127 ping statistics ---
>>> 1202 packets transmitted, 193 received, 83% packet loss, time 1202345ms
>>> rtt min/avg/max/mdev = 0.369/0.736/3.271/0.509 ms
>>> virl@t4-virl3:/home/testuser$ ping 10.30.51.29
>>>
>>> Peter Mikus
>>> Engineer - Software
>>> Cisco Systems Limited
>>>
>>> -----Original Message-----
>>> From: vpp-dev-boun...@lists.fd.io [mailto:vpp-dev-boun...@lists.fd.io] On
>>> Behalf Of Jan Gelety -X via RT
>>> Sent: Tuesday, June 20, 2017 5:20 PM
>>> Cc: csit-...@lists.fd.io; vpp-dev@lists.fd.io
>>> Subject: Re: [vpp-dev] [FD.io Helpdesk #41921] connection interruptiones
>>> between jenkins executor and VIRL servers
>>>
>>> Hello Anton,
>>>
>>> Thanks for the fast response. We will check local firewall setting as you
>>> proposed.
>>>
>>> Regards,
>>> Jan
>>>
>>> -----Original Message-----
>>> From: Anton Baranov via RT [mailto:fdio-helpd...@rt.linuxfoundation.org]
>>> Sent: Tuesday, June 20, 2017 17:13
>>> To: Jan Gelety -X (jgelety - PANTHEON TECHNOLOGIES at Cisco)
>>> <jgel...@cisco.com>
>>> Cc: csit-...@lists.fd.io; vpp-dev@lists.fd.io
>>> Subject: [FD.io Helpdesk #41921] connection interruptiones between jenkins
>>> executor and VIRL servers
>>>
>>> Jan:
>>>
>>> This is what I got from fdio jenkins server (i did the tests with
>>> 10.30.{52,53}.2 hosts):
>>>
>>> $ ip ro get 10.30.52.2
>>> 10.30.52.2 via 10.30.48.1 dev eth0 src 10.30.48.5
>>> cache
>>>
>>> The traffic is going directly through neutron router.. so we don't block
>>> any traffic on our firewall
>>>
>>> $ ping -q -c4 10.30.52.2
>>> PING 10.30.52.2 (10.30.52.2) 56(84) bytes of data.
>>>
>>> --- 10.30.52.2 ping statistics ---
>>> 4 packets transmitted, 4 received, 0% packet loss, time 3001ms
>>> rtt min/avg/max/mdev = 0.496/0.789/1.509/0.419 ms
>>>
>>> I was able to reach the host in 10.30.52.0/24 network from jenkins server
>>>
>>> $ nc -nv 10.30.52.2 22
>>> Ncat: Version 6.40 ( http://nmap.org/ncat )
>>> Ncat: Connection refused.
>>>
>>> Looks like access is blocked there. Could you check your local firewall
>>> setting and make sure you allow port 22/tcp ?
>>>
>>> The above is also true for 10.30.{53,54}.0/24 subnets
>>>
>>> Regards,
>>>
>>> On Tue Jun 20 10:51:15 2017, jgel...@cisco.com wrote:
>>>> Hello Vanessa,
>>>>
>>>> Thanks for the info.
>>>>
>>>> Just few remarks:
>>>>
>>>> 1. virl1 (10.30.51.28) - nodes of simulations started there are using
>>>> subnet 10.30.52.0/24 and we are experiencing ssh timeouts in this
>>>> subnet
>>>>
>>>> 2. virl2 (10.30.51.29) - nodes of simulations started there were
>>>> using subnet 10.30.53.0/24 and we were experiencing ssh timeouts in
>>>> this subnet;
>>>> - at the moment we switched the
>>>> subnet back to 10.30.51.0/24 and assigned there IP pool 10.30.51.106 -
>>>> 10.30.51.180
>>>> - new tests started - let you
>>>> know the result tomorrow
>>>>
>>>> 3. virl3 (10.30.51.30) - nodes of simulations started there were using
>>>> subnet 10.30.51.0/24 and IP pool is set to 10.30.51.181 - 10.30.51.254
>>>> and we didn't experience ssh timeouts in this subnet;
>>>>
>>>> So would it be possible to check routes for subnets 10.30.52.0/24,
>>>> 10.30.53.0/24 and also for 10.30.54.0/24 (that is planned for virl3
>>>> when it will be upgraded)?
>>>>
>>>> Thank you very much.
>>>>
>>>> Regards,
>>>> Jan
>>>>
>>>> -----Original Message-----
>>>> From: Vanessa Valderrama via RT
>>>> [mailto:fdio-helpd...@rt.linuxfoundation.org]
>>>> Sent: Friday, June 16, 2017 22:01
>>>> To: Jan Gelety -X (jgelety - PANTHEON TECHNOLOGIES at Cisco)
>>>> <jgel...@cisco.com>
>>>> Cc: csit-...@lists.fd.io; vpp-dev@lists.fd.io
>>>> Subject: [FD.io Helpdesk #41921] connection interruptiones between
>>>> jenkins executor and VIRL servers
>>>>
>>>> We did have the vendor run MTRs. I've attached the results.
>>>>
>>>> On Fri Jun 16 15:36:16 2017, valderrv wrote:
>>>>> Jan,
>>>>>
>>>>> I missed this conversation with abranov. Can this issue be resolved?
>>>>>
>>>>> <snip>
>>>>> <jgelety> abranov: unfortunatley I had no time to check test logs
>>>>> from last test cases before (because of meeting so I just had a look
>>>>> to console output) and I found out that ssh failures are not related
>>>>> to connection between jenkins and virl now (but it was this issue at
>>>>> the time I wrote the e-mail).
>>>>> <mnaser> [11:04:06] <jgelety> They are related to start up pf
>>>>> nested VM now - so I will ask VIRL support for the help here.
>>>>> </snip>
>>>>>
>>>>> Thank you,
>>>>> Vanessa
>>>>>
>>>>> On Fri Jun 16 12:12:43 2017, valderrv wrote:
>>>>>> Jan,
>>>>>>
>>>>>> We are looking into this issue.
>>>>>>
>>>>>> Thank you,
>>>>>> Vanessa
>>>>>>
>>>>>> On Fri Jun 16 09:12:55 2017, jgel...@cisco.com wrote:
>>>>>>> Hello Anton,
>>>>>>>
>>>>>>> Unfortunately we are still having issues with ssh connection
>>>>>>> timeouts during tests on virl. Could you, please, have a look on
>>>>>>> it?
>>>>>>>
>>>>>>> Thank you very much.
>>>>>>> Regards,
>>>>>>> Jan
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Anton Baranov via RT
>>>>>>> [mailto:fdio-helpd...@rt.linuxfoundation.org]
>>>>>>> Sent: Wednesday, June 14, 2017 15:45
>>>>>>> To: Jan Gelety -X (jgelety - PANTHEON TECHNOLOGIES at Cisco)
>>>>>>> <jgel...@cisco.com>
>>>>>>> Cc: csit-...@lists.fd.io; vpp-dev@lists.fd.io
>>>>>>> Subject: [FD.io Helpdesk #41921] connection interruptiones
>>>>>>> between jenkins executor and VIRL servers
>>>>>>>
>>>>>>> Jan:
>>>>>>>
>>>>>>> On my side I currently don't see any connectivity problems
>>>>>>> between jenkins and VIRL servers. Please let me know if you're
>>>>>>> still having that issue. I'll keep an eye on that problem and if
>>>>>>> it reapears I'll report that to our cloud provider to check
>>>>>>> further.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> --
>>>>>>> Anton Baranov
>>>>>>> Systems and Network Administrator
>>>>>>> The Linux Foundation
>>>>>>>
>>>>>>> On Wed Jun 14 08:12:45 2017, jgel...@cisco.com wrote:
>>>>>>>> Dear held...@fd.io<mailto:held...@fd.io>
>>>>>>>>
>>>>>>>> We are observing connection issues between Jenkins executors
>>>>>>>> and VIRL servers that leads to failures of verify jobs
>>>>>>>> (https://jenkins.fd.io/view/vpp/job/vpp-csit-verify-virl-master/,
>>>>>>>> https://jenkins.fd.io/view/csit/job/csit-vpp-functional-master-ubuntu1604-virl/,
>>>>>>>> https://jenkins.fd.io/view/csit/job/csit-vpp-functional-master-centos7-virl/)
>>>>>>>> because of ssh connection timeouts.
>>>>>>>>
>>>>>>>> Could you, please, have a look on it?
>>>>>>>>
>>>>>>>> Thank you very much.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Jan
>>>>>>>
>>>>>>
>>>>
>>>
>>> --
>>> Anton Baranov
>>> Systems and Network Administrator
>>> The Linux Foundation
>>>
>>> _______________________________________________
>>> vpp-dev mailing list
>>> vpp-dev@lists.fd.io
>>> https://lists.fd.io/mailman/listinfo/vpp-dev
>>> _______________________________________________
>>> csit-dev mailing list
>>> csit-...@lists.fd.io
>>> https://lists.fd.io/mailman/listinfo/csit-dev