Outlined below is an overview of the analysis and in-progress work for
adding Virtual Redundant Routing to CloudStack Virtual Private Clouds.


Current state:

Public networks allow for redundant VRs (Virtual Routers). The network
topology is static: it consists of a single guest network and a public
network. VR redundancy is implemented using a second VR and the KeepAlived
and Conntrackd software packages, configured to match the static network
topology. The static network topology is configured using parameters passed
to the VR Linux image via /proc/cmdline. The /proc/cmdline information is
parsed by a shell script called cloud-early-config.sh. The parameters
required for a redundant VR setup are used to configure the KeepAlived and
Conntrackd packages if a boolean named redundant_router is true (1). While
changes can be made to the guest network using the script guestnw.sh, this
shell script assumes that the topology of the network will not change.
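
As a rough illustration of the redundant_router check described above (this
is not the actual cloud-early-config.sh logic, which is a shell script), here
is a minimal Python sketch. It assumes the parameters arrive as space-separated
key=value tokens; every parameter name in the sample line other than
redundant_router is hypothetical.

```python
import shlex


def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into a dict of key=value parameters."""
    params = {}
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        # Bare flags without "=" are stored with an empty value.
        params[key] = value if sep else ""
    return params


def is_redundant(params: dict) -> bool:
    """Redundancy is enabled only when redundant_router is exactly "1"."""
    return params.get("redundant_router") == "1"


# Hypothetical sample line; on a real VR this would be read from /proc/cmdline.
sample = "console=tty0 type=router redundant_router=1 guestip=10.1.1.2"
params = parse_cmdline(sample)
if is_redundant(params):
    print("would configure KeepAlived and Conntrackd")
else:
    print("single-router setup")
```

A missing redundant_router key is treated the same as 0 here, which is an
assumption on my part.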

Desired state:

Redundant VRs should be available not only in Public Networks but also in
Virtual Private Clouds (VPCs) under CloudStack.

Issues:

VPCs allow for, among other things, networks with a dynamic topology, aka
tiers.

Current shell scripts that configure redundant VRs in public networks cannot
Create, Read, Update, or Delete (CRUD) virtual networks and VRs because the
scripts and programs used to change the guest network assume a static
network topology.

Minimal unit testing for CRUD functions.


Current Work in Progress:

Generate unit tests for Create, Read, Update, and Delete on both redundant
and non-redundant networks using a single System Virtual Machine (VM) image,
based on the changes to the script files below.

Generate unit tests to verify that VPCVirtualNetworkApplianceManagerImpl.java
works with the CRUD-enabled scripts.

Add the interface and ability to do network CRUD to guestnw.sh and
cloud-early-config.sh.
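
To make the intended CRUD interface a bit more concrete, here is a minimal
Python sketch of the four operations on a set of guest tiers. It is only a
model of the idea, not the guestnw.sh implementation; the names Tier and
GuestTopology, and the device/ip/netmask fields, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Tier:
    """One guest network (tier) attached to the router. Fields are illustrative."""
    device: str
    ip: str
    netmask: str


@dataclass
class GuestTopology:
    """In-memory model of a router whose set of tiers can change at runtime."""
    redundant: bool = False
    tiers: dict = field(default_factory=dict)

    def create(self, tier: Tier) -> None:
        if tier.device in self.tiers:
            raise ValueError(f"tier on {tier.device} already exists")
        self.tiers[tier.device] = tier

    def read(self, device: str) -> Tier:
        return self.tiers[device]

    def update(self, device: str, **changes) -> Tier:
        tier = self.tiers[device]
        for name, value in changes.items():
            if not hasattr(tier, name):
                raise AttributeError(f"unknown tier field: {name}")
            setattr(tier, name, value)
        return tier

    def delete(self, device: str) -> Tier:
        return self.tiers.pop(device)


topo = GuestTopology(redundant=True)
topo.create(Tier("eth2", "10.1.1.1", "255.255.255.0"))
topo.update("eth2", ip="10.1.1.254")
print(topo.read("eth2"))
topo.delete("eth2")
```

In a redundant setup each of these operations would presumably also have to
regenerate the KeepAlived and Conntrackd configuration; that part is left out
of the sketch.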

Modify VPCVirtualNetworkApplianceManagerImpl to allow changes to the guest
network using the new CRUD functionality of guestnw.sh mentioned above.

Modify the UI and other Java code as required for the implementation of VPC
redundant routers.

Karl


On Wed, Jun 11, 2014 at 3:16 PM, Sheng Yang <sh...@yasker.org> wrote:

> One note:
>
> In fact the split of MASTER is not a big issue, because that would only
> happen if the network runs badly enough, which already causes packet loss.
>
> The problem is that it should recover from that situation fast enough.
> Previously, due to ARP pings from the BACKUP router (which thought it would
> replace MASTER), the upstream switch would redirect the traffic to the
> original BACKUP router for a while; then, as soon as the network recovered,
> MASTER would preempt BACKUP once again. But it may take some time for the
> upstream switch to become aware that the MAC/Port/IP mapping has changed.
> We once tried different MACs for MASTER and BACKUP but found that it
> resulted in the upstream switch failing to recognize the MASTER again. Now
> we're still using the same MAC for MASTER and BACKUP, and the upstream
> switch can handle the situation better.
>
> --Sheng
>
>
> On Wed, Jun 11, 2014 at 12:48 AM, Daan Hoogland <
> dhoogl...@schubergphilis.com> wrote:
>
> > Hi,
> >
> > We had a little meeting on the state of this feature and the way to go. I
> > have no karma for ASFBot meetings so here is my excerpt from the
> > transcript:
> >
> > Attendance:
> > K3KH Karl Harris
> > Yasker Sheng Yang
> > Spark404 Hugo Trippaers
> > echaz Eric Chazas
> > LeoSimons Leo Simons
> > dahn Daan Hoogland
> >
> > Others were present in the room but not active in the meeting.
> >
> > Agenda:
> > - Feasibility experiment plans by Schuberg Philis
> > - Reusable work by Karl
> > - Problems Citrix encountered with the regular redundant router
> >   (and how to avoid them)
> > - Work division
> > - (next meeting needed?)
> >
> > We tried to follow the agenda but were not very strict on it. I'll
> > summarize outcome per agenda bullet:
> >
> > Schuberg Philis wants to implement a feasibility redundant router on a
> > simulated VPC environment using the operational expertise it has in
> > house. The outcome would then be backported to the device, its agent and
> > the management server.
> >
> > The implementation tactic is to create a JSON-like configuration
> > description and to let the device do its own configuration. The idea is
> > to have a single device for normal and VPC routers and to let the
> > redundancy be a mere property of it. This should lead to the ultimate
> > objective, which is to have a single, relatively simple, maintainable
> > device.
> >
> > Karl will describe his endeavors in adapting the existing device on the
> > list.
> >
> > Sheng described the QA problems Citrix had with the existing redundant
> > capabilities of the VR and assured us that only one real problem
> > persists. The failover time of 3 seconds occasionally leads to a split
> > brain, in which two VRs assume the role of master. As the management
> > server in a busy environment can take up to 30 seconds to detect a
> > failover, this can lead to an unacceptable outage. One possible solution,
> > to have the management server serve as negotiator on such occasions, will
> > be hard to implement due to this latency. Notably, both routers use the
> > same MAC address on the interface to the load balancer.
> >
> > The resources available from Citrix are uncertain. Planning and design
> > need to be done. It is agreed that we will work in parallel (Schuberg
> > Philis and Citrix) but keep in close contact. The amount of resources
> > Sungard has for this was not discussed. Karl will stay involved.
> >
> > We agreed to have the next meeting at 20:00 UTC on June 17th.
> >
> > Can someone give me Karma to use ASFBot for this one, please?
> >
> > \DaanH
> >
> >
>
