On Tue, May 4, 2021 at 9:25 AM Francois <rigault.franc...@gmail.com> wrote:
>
> On Tue, 4 May 2021 at 17:03, Numan Siddique <num...@ovn.org> wrote:
> >
> > On Sat, May 1, 2021 at 6:32 AM Francois <rigault.franc...@gmail.com> wrote:
> > >
> > > Hi Open vSwitch
> > > I am running an OVN stack with a dozen chassis, all of them able to
> > > act as gateways.
> > > I have many VMs without floating IPs on the same logical switch, doing
> > > a lot of external traffic. Today, this traffic has to go through the
> > > tunnel to the single chassis that claims the gateway, which performs
> > > the SNAT and sends the traffic outside the stack.
> > >
> > > With this current design, I see a lot of BFD traffic, and a clear
> > > bottleneck and single point of failure in the one chassis doing the
> > > SNAT. A workaround is to add a floating IP to each VM, but then the
> > > end user has to assign the floating IP themselves. It also means that
> > > if a single chassis runs 10 VMs, we need one floating IP per VM just
> > > for the SNAT, while we could instead use a single IP per chassis.
> > >
> > > I was thinking of adding a "br-snat" bridge on each OVS instance,
> > > adding to it one interface with a fixed IP, and (with some minimal
> > > development in ovn-northd) having the SNAT traffic of all its ports go
> > > out of that interface instead of through the tunnel towards the
> > > gateway. Ideally the IP used today for the tunnel could also be used
> > > for the SNAT traffic, but this seems less trivial to achieve.
> > >
> > > Before looking at the details of ddlog and the syntax of flows, I
> > > would love to get some feedback on the idea, maybe there is something
> > > fundamentally broken with my design, or maybe there is a smarter way
> > > to achieve this?
> >
> > This is an interesting idea.  In order to do snat, OVN should know what IP
> > to use.  But this IP should belong to the provider network subnet pool,
> > right?
> >
> > If you think this can be done, you can probably attempt a quick PoC with
> > just changes to the C version and post the patches as RFC. The ddlog part
> > can be done later if the approach seems to be fine for the reviewers.
>
>
> Ok! I have no idea if this can be done, but I will attempt something
> nevertheless. You need one IP per chassis so if it is set on a different
> interface, and statically (similar to the external-ids:ovn-encap-ip)
> it should be fine too.
>

This is interesting. One question here. Not sure if I understand the
proposal correctly, but if you want to use the chassis's IP for SNAT, then
how would the return traffic hit br-int? The return traffic (to the VM,
with the destination IP being the chassis IP and the destination MAC address
being the chassis MAC) would go directly to the host interface (e.g. br0)
without going through the virtual network pipelines, right?

The original problem was to avoid using floating IPs per VM, but if it is
ok to have an extra IP per chassis, then I think the current OVN
implementation already supports it: create a per-chassis Gateway Router and
configure SNAT using the Gateway Router's uplink IP. Each chassis then acts
as both a HV and a gateway, and you will need one extra IP per chassis as
the Gateway Router IP. I think this is similar to the ovn-kubernetes
topology. Of course there may be other drawbacks; for example, you may need
a logical switch per chassis so that you can route the traffic to the
chassis's own Gateway Router. Not sure if it is something that could help
in your use cases.
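
As a rough illustration of the per-chassis Gateway Router setup described
above, here is a minimal Python sketch that only builds (and prints) the
ovn-nbctl commands for one router per chassis; it does not run them. The
router and port naming scheme, the chassis names, and the addresses
(documentation ranges) are assumptions made up for the example. The sketch
also leaves out the rest of the topology: the per-chassis logical switch,
the localnet attachment to the provider network, and the routes.

```python
import shlex
from typing import List


def per_chassis_gateway_commands(
    chassis: str,
    uplink_ip: str,
    uplink_prefix_len: int,
    uplink_mac: str,
    internal_cidr: str,
) -> List[List[str]]:
    """Build ovn-nbctl commands for one chassis-pinned Gateway Router with SNAT."""
    router = f"gw-{chassis}"          # hypothetical naming scheme
    uplink_port = f"{router}-uplink"  # hypothetical naming scheme
    return [
        # Create the logical router.
        ["ovn-nbctl", "lr-add", router],
        # Pin it to a chassis; this is what makes it a Gateway Router.
        ["ovn-nbctl", "set", "Logical_Router", router,
         f"options:chassis={chassis}"],
        # Uplink port carrying the extra per-chassis IP.
        ["ovn-nbctl", "lrp-add", router, uplink_port, uplink_mac,
         f"{uplink_ip}/{uplink_prefix_len}"],
        # SNAT the internal subnet to the Gateway Router's uplink IP.
        ["ovn-nbctl", "lr-nat-add", router, "snat", uplink_ip, internal_cidr],
    ]


# Example inventory (assumed values, not taken from a real deployment).
chassis_inventory = [
    ("hv1", "192.0.2.11", "00:00:5e:00:53:0b"),
    ("hv2", "192.0.2.12", "00:00:5e:00:53:0c"),
]

for name, ip, mac in chassis_inventory:
    for cmd in per_chassis_gateway_commands(name, ip, 24, mac, "10.0.0.0/24"):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them keeps the sketch safe to
run anywhere and makes it easy to review what would be applied per chassis.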

> What will happen is that outgoing traffic will be seen from the outside
> as coming from an IP of the chassis (and the compute node running a VM is
> usually chosen at random).  If there are firewall rules to open (on a
> firewall sitting on the external network), they will need to be opened
> for all hypervisors, for all VMs (so firewall rules become less relevant
> in a sense).  It basically "works" from a security standpoint when all
> VMs belong to the same tenant.  Still, since we are going to open whole
> ranges in the firewall, the IPs should be limited to the IPs used for the
> SNAT, and not include any fIP, so it should probably be a different
> subnet.
>
> I think (I am still reading the doc!) somewhat similar work was done when
> addressing the MTU issue with redirect-type=bridged, where packets
> are sent through a different port, using statically set mac-mappings.
>
> ddlog looked funnier! Do you have a plan for the removal of the C
> version? Also is there still a plan to have ovn-controller rewritten in
> ddlog?

There is a plan to remove the C version once the DDlog version is stable
enough. The timeline is not clear (at least to me).
There is also a plan to rewrite ovn-controller in DDlog, but it is more
complex than northd, there are different options for moving forward, and
the timeline is even less clear.

Thanks,
Han

>
> Thanks a lot!
> Francois
>
> >
> > Thanks
> > Numan
> >
> > >
> > > Thanks
> > > Francois
> > > _______________________________________________
> > > discuss mailing list
> > > disc...@openvswitch.org
> > > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
> > >