> On Jan 8, 2016, at 2:03 PM, Joe Stringer <j...@ovn.org> wrote:
> 
> On 22 December 2015 at 22:05, Zang MingJie <zealot0...@gmail.com> wrote:
>> 
>> 
>> On Wed, Dec 23, 2015 at 3:10 AM, Joe Stringer <j...@ovn.org> wrote:
>>> 
>>> On 21 December 2015 at 23:52, Zang MingJie <zealot0...@gmail.com> wrote:
>>>> Hi:
>>>> 
>>>> Problem
>>>> =======
>>>> 
>>>> I'm glad to see that OVS has added conntrack support. It is great, but I
>>>> would like to push it further.
>>>> 
>>>> Consider this scenario:
>>>> Multiple tenants share a single global IP using NAT. IP addresses in
>>>> different tenants can overlap. Let's say tenant A's IP x and tenant B's
>>>> IP x both want to access the Internet via NAT.
>>>> 
>>>> Currently we accomplish this by using double-nat
>>>> tenant A:  x:port ---natA--> internal-ip-a:port ---nat--> global-ip:port
>>>> tenant B:  x:port ---natB--> internal-ip-b:port ---nat--> global-ip:port
>>>> 
>>>> natA and natB are done in their own per-tenant namespaces, so there is no
>>>> problem even if the tenants have the same IP. The second NAT translates
>>>> their assigned internal IPs to the public IP; these internal IPs don't
>>>> conflict.
>>>> 
>>>> Idea
>>>> ====
>>>> 
>>>> Now I want to simplify the process by using a single NAT in OVS: I want
>>>> to translate an ip:port pair from a tenant zone to the public zone directly:
>>>> ZoneA:x:port ---nat--> ZonePublic:global-ip:port
>>>> ZoneB:x:port ---nat--> ZonePublic:global-ip:port
>>>> 
>>>> 
>>>> Implementation consideration
>>>> ============================
>>>> Currently the kernel cf table is not zone/tenant aware; it can only handle
>>>> ip:port pairs. It could be extended to handle a zone id.
>>>> 
>>>> The cf table could then look similar to this:
>>>> 
>>>> zone:s-ip:s-port:d-ip:d-port <------> zone:s-ip:s-port:d-ip:d-port
>>>> 
>>>> 
>>>> For a new connection, the src/dst zones are specified by the flow:
>>>> 
>>>> Match:   in_port(1),tcp,conn_state=-tracked
>>>> Action:  nat(src_zone=10,dst_zone=20,masq=x.x.x.x)
>>>> 
>>>> then a new cf entry can be generated like this one:
>>>> 
>>>> 10:192.168.0.10:4562:8.8.8.8:53 <----> 20:masq-ip:random-port:8.8.8.8:53
>>>> 
>>>> The returning packets can be handled by another flow:
>>>> 
>>>> Match:   in_port(2),tcp,conn_state=+established
>>>> Action:  nat(reverse,zone=20)
>>>> 
>>>> By looking up the cf table using the 4-tuple plus zone, the cf entry can
>>>> easily be found. Zone id '10' can also be read from the cf entry, so OVS
>>>> knows the packet is now translated to zone 10.
>>>> _______________________________________________
>>>> dev mailing list
>>>> dev@openvswitch.org
>>>> http://openvswitch.org/mailman/listinfo/dev
>>> 
>>> 
>>> Would something like this work?
>>> 
>>> From tenant to outside:
>>> in_port=1,tcp,
>>> actions=ct(commit,zone=1,nat(src=GLOBAL)),ct(commit,zone=0,exec(set_field:1->ct_mark)),output:3
>>> in_port=2,tcp,
>>> actions=ct(commit,zone=2,nat(src=GLOBAL)),ct(commit,zone=0,exec(set_field:2->ct_mark)),output:3
>> 
>> 
>> Probably not. NAT also needs to choose an unused port, but in zone 1 it
>> doesn't know which ports are unused in zone 0.
> 
> OK, I see. Does the port allocation occur differently based on the
> zone? (If so, I agree this is a problem; if not, this approach seems
> plausible)
> 

Mapped port allocation checks for conflicts via nf_conntrack_tuple_taken(),
which checks for zone equality. Since overlapping addresses/ports can exist
in different zones, NAT can also allocate the same address/port in
different zones.
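
To make the zone-equality point concrete, here is a minimal Python sketch. It
is a toy model, not kernel code: ZoneAwareTable, tuple_taken() and
allocate_port() are hypothetical names that only mimic the behaviour described
above, and the address 203.0.113.1 and the port range are made up for
illustration.

```python
from typing import Iterable, Optional, Set, Tuple

# (src_ip, src_port, dst_ip, dst_port) after source NAT.
FourTuple = Tuple[str, int, str, int]


class ZoneAwareTable:
    """Toy model of a connection table whose entries are keyed by (zone, tuple)."""

    def __init__(self) -> None:
        self._entries: Set[Tuple[int, FourTuple]] = set()

    def tuple_taken(self, zone: int, tup: FourTuple) -> bool:
        # Only an entry in the same zone counts as a conflict.
        return (zone, tup) in self._entries

    def allocate_port(self, zone: int, nat_ip: str, dst_ip: str, dst_port: int,
                      ports: Iterable[int]) -> Optional[int]:
        # Pick the first source port whose translated tuple is free in this zone.
        for port in ports:
            tup = (nat_ip, port, dst_ip, dst_port)
            if not self.tuple_taken(zone, tup):
                self._entries.add((zone, tup))
                return port
        return None  # Port range exhausted in this zone.


table = ZoneAwareTable()
ports = range(1024, 1028)
port_a = table.allocate_port(1, "203.0.113.1", "8.8.8.8", 53, ports)
port_b = table.allocate_port(2, "203.0.113.1", "8.8.8.8", 53, ports)
port_a2 = table.allocate_port(1, "203.0.113.1", "8.8.8.8", 53, ports)
print(port_a, port_b, port_a2)  # 1024 1024 1025
```

Zones 1 and 2 both end up with port 1024 on the same global address, because
neither allocation sees the other zone's entries. Only the second allocation
within zone 1 is forced onto a different port. That is the collision Zang
describes once both translated tuples have to coexist in zone 0.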

>>> 
>>> From outside to tenant:
>>> table=0,in_port=3,tcp, actions=ct(zone=0,table=1)
>>> table=1,in_port=3,tcp,ct_state=+est,ct_zone=0,ct_mark=1,
>>> actions=ct(zone=1,nat,table=2)
>>> table=1,in_port=3,tcp,ct_state=+est,ct_zone=0,ct_mark=2,
>>> actions=ct(zone=2,nat,table=2)
>>> table=2,in_port=3,tcp,ct_state=+est,ct_zone=1 actions=output:1
>>> table=2,in_port=3,tcp,ct_state=+est,ct_zone=2 actions=output:2
>> 
>> And there will be lots of upcalls per connection.
> 
> On second thought, I think that tables 1 and 2 could be squashed
> together - the extra recirculation/upcall should be unnecessary unless
> you've got other logic that wants to do something based on that zone.
> This is dependent on my previous question though.
> 
> <snip>
> 
> I discussed this with Jarno offline. It seems that what you're really
> looking for is OVS support for this new feature that's in Linux 4.3:
> https://github.com/torvalds/linux/commit/deedb59039f111c41aa5a54ee384c8e7c08bc78a
> 

The approach in Linux 4.3 seems to be that, in addition to having both the
original and reply direction tuples (i.e., address/port pairs) in the
specified zone, it is now possible for only one of them to be in that zone, in
which case the other tuple is always considered to be in zone 0. One way to
expose this via the OVS API would be to add src_zone and dst_zone attributes
to the CT action as (mutually exclusive) alternatives to the existing zone
attribute. Using one of the alternatives would imply that the tuple for the
other direction is in zone 0.
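
As a rough illustration of those semantics, the Python sketch below models a
connection whose zone applies to only one direction. It is an assumption-laden
toy, not the kernel data structure or an OVS API: Conn, Table, zone_dir and
the "orig"/"reply"/"both" labels are hypothetical names, and the public
address 203.0.113.1 is invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

FourTuple = Tuple[str, int, str, int]  # (src_ip, src_port, dst_ip, dst_port)
DEFAULT_ZONE = 0


@dataclass(frozen=True)
class Conn:
    orig: FourTuple
    reply: FourTuple
    zone: int
    zone_dir: str  # "both", "orig" or "reply" (hypothetical labels)

    def zone_for(self, direction: str) -> int:
        # The zone applies to the named direction(s); the other one is zone 0.
        return self.zone if self.zone_dir in ("both", direction) else DEFAULT_ZONE


class Table:
    def __init__(self) -> None:
        self._by_key: Dict[Tuple[int, FourTuple], Tuple[Conn, str]] = {}

    def insert(self, conn: Conn) -> bool:
        keys = [((conn.zone_for("orig"), conn.orig), "orig"),
                ((conn.zone_for("reply"), conn.reply), "reply")]
        if any(key in self._by_key for key, _ in keys):
            return False  # One direction clashes with an existing entry.
        for key, direction in keys:
            self._by_key[key] = (conn, direction)
        return True

    def lookup(self, zone: int, tup: FourTuple) -> Optional[Tuple[Conn, str]]:
        return self._by_key.get((zone, tup))


orig = ("192.168.0.10", 4562, "8.8.8.8", 53)
reply_a = ("8.8.8.8", 53, "203.0.113.1", 1024)
reply_b = ("8.8.8.8", 53, "203.0.113.1", 1025)

table = Table()
conn_a = Conn(orig, reply_a, zone=10, zone_dir="orig")
conn_b = Conn(orig, reply_b, zone=20, zone_dir="orig")
ok_a = table.insert(conn_a)
clash = table.insert(Conn(orig, reply_a, zone=20, zone_dir="orig"))
ok_b = table.insert(conn_b)
hit = table.lookup(DEFAULT_ZONE, reply_b)
```

Here both tenants use the same original tuple in their own zones, while the
reply tuples share zone 0. A second tenant trying to reuse tenant A's
translated port is therefore rejected at insert time, and a reply packet
looked up in zone 0 leads back to the connection and hence to the tenant zone.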

> If you were to work on adding support for this feature to OVS, I don't
> see any particular reason that someone would block the change. The
> kernel changes would need to be made against upstream net-next kernel
> first (with OVS userspace code changes to test with). The API changes
> would need to make sure they don't break the existing functionality.
> In terms of the kernel module in the OVS tree, it looks like it would
> be a large amount of work to support this on kernels older than v4.3,
> so I wouldn't count on being able to run older kernels with this
> feature.
