On 22 December 2015 at 22:05, Zang MingJie <zealot0...@gmail.com> wrote:
> On Wed, Dec 23, 2015 at 3:10 AM, Joe Stringer <j...@ovn.org> wrote:
>> On 21 December 2015 at 23:52, Zang MingJie <zealot0...@gmail.com> wrote:
>> > Hi:
>> >
>> > Problem
>> > =======
>> >
>> > I'm glad to see that OVS has added conntrack support. The conntrack
>> > support is great, but I want to push it a bit further.
>> >
>> > Consider this scenario: multiple tenants share a single global IP via
>> > NAT, and IP addresses in different tenants may overlap. Say tenant A's
>> > IP x and tenant B's IP x both want to access the internet via NAT.
>> >
>> > Currently we accomplish this with double NAT:
>> >
>> > tenant A: x:port ---natA--> internal-ip-a:port ---nat--> global-ip:port
>> > tenant B: x:port ---natB--> internal-ip-b:port ---nat--> global-ip:port
>> >
>> > natA and natB are done in their own per-tenant namespaces, so there is
>> > no problem even if the tenants use the same IP. The second NAT then
>> > translates the assigned internal IPs to the public IP; those internal
>> > IPs don't conflict.
>> >
>> > Idea
>> > ====
>> >
>> > Now I want to simplify this to a single NAT done by OVS, translating an
>> > ip:port pair from a tenant zone to the public zone directly:
>> >
>> > ZoneA:x:port ---nat--> ZonePublic:global-ip:port
>> > ZoneB:x:port ---nat--> ZonePublic:global-ip:port
>> >
>> > Implementation consideration
>> > ============================
>> >
>> > Currently the kernel conntrack (cf) table is not zone/tenant aware; it
>> > only handles ip:port pairs. It could be extended to carry a zone-id, so
>> > a cf entry would look like this:
>> >
>> > zone:s-ip:s-port:d-ip:d-port <------> zone:s-ip:s-port:d-ip:d-port
>> >
>> > For a new connection, the src/dst zones are specified by the flow:
>> >
>> > Match: in_port(1),tcp,conn_state=-tracked
>> > Action: nat(src_zone=10,dst_zone=20,masq=x.x.x.x)
>> >
>> > which would generate a new cf entry like this one:
>> >
>> > 10:192.168.0.10:4562:8.8.8.8:53 <----> 20:masq-ip:random-port:8.8.8.8:53
>> >
>> > Returning packets can be handled by another flow:
>> >
>> > Match: in_port(2),tcp,conn_state=+established
>> > Action: nat(reverse,zone=20)
>> >
>> > By looking up the cf table with the 4-tuple plus zone, the cf entry can
>> > easily be found; the zone id '10' can also be read from the cf entry, so
>> > OVS knows the packet is now translated back into zone 10.
>>
>> Would something like this work?
>>
>> From tenant to outside:
>> in_port=1,tcp,
>>   actions=ct(commit,zone=1,nat(src=GLOBAL)),ct(commit,zone=0,exec(set_field:1->ct_mark)),output:3
>> in_port=2,tcp,
>>   actions=ct(commit,zone=2,nat(src=GLOBAL)),ct(commit,zone=0,exec(set_field:2->ct_mark)),output:3
>
> Probably not. NAT also needs to choose an unused port, but in zone 1 it
> doesn't know which port is unused in zone 0.
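For reference, the tenant-to-outside flows I suggested above would be
installed with ovs-ofctl roughly as follows (untested; br0 is a placeholder
bridge name and GLOBAL stands in for the real public address):

ovs-ofctl add-flow br0 "in_port=1,tcp,actions=ct(commit,zone=1,nat(src=GLOBAL)),ct(commit,zone=0,exec(set_field:1->ct_mark)),output:3"
ovs-ofctl add-flow br0 "in_port=2,tcp,actions=ct(commit,zone=2,nat(src=GLOBAL)),ct(commit,zone=0,exec(set_field:2->ct_mark)),output:3"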
OK, I see. Does the port allocation occur differently based on the zone?
(If so, I agree this is a problem; if not, this approach seems plausible.)

>> From outside to tenant:
>> table=0,in_port=3,tcp, actions=ct(zone=0,table=1)
>> table=1,in_port=3,tcp,ct_state=+est,ct_zone=0,ct_mark=1,
>>   actions=ct(zone=1,nat,table=2)
>> table=1,in_port=3,tcp,ct_state=+est,ct_zone=0,ct_mark=2,
>>   actions=ct(zone=2,nat,table=2)
>> table=2,in_port=3,tcp,ct_state=+est,ct_zone=1 actions=output:1
>> table=2,in_port=3,tcp,ct_state=+est,ct_zone=2 actions=output:2
>
> And there will be lots of upcalls per connection.

On second thought, I think that tables 1 and 2 could be squashed together -
the extra recirculation/upcall should be unnecessary unless you've got other
logic that wants to do something based on that zone; see the sketch at the
end of this mail. This is dependent on my previous question, though.

<snip>

I discussed this with Jarno offline. It seems that what you're really
looking for is OVS support for this new feature that's in Linux 4.3:
https://github.com/torvalds/linux/commit/deedb59039f111c41aa5a54ee384c8e7c08bc78a

If you were to work on adding support for this feature to OVS, I don't see
any particular reason that someone would block the change. The kernel
changes would need to be made against the upstream net-next kernel first
(with OVS userspace code changes to test with), and the API changes would
need to make sure they don't break the existing functionality.

In terms of the kernel module in the OVS tree, it looks like it would be a
large amount of work to support this on kernels older than v4.3, so I
wouldn't count on being able to run older kernels with this feature.
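To make the table-squashing idea concrete: assuming that ct(zone=N,nat)
without a table argument applies the reverse translation to the packet
before the actions that follow it (I haven't tested this), the return
direction could collapse to something like:

From outside to tenant (tables 1 and 2 squashed):
table=0,in_port=3,tcp, actions=ct(zone=0,table=1)
table=1,in_port=3,tcp,ct_state=+est,ct_zone=0,ct_mark=1, actions=ct(zone=1,nat),output:1
table=1,in_port=3,tcp,ct_state=+est,ct_zone=0,ct_mark=2, actions=ct(zone=2,nat),output:2

If that assumption holds, this saves one recirculation (and the matching
flows in table 2) per packet in the return direction; the tenant-to-outside
flows stay as before.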