While a zone per rule would be nice, since we could then delete connection state by referencing only the zone, that's probably overkill. We only need enough granularity to disambiguate between overlapping IPs so we can then delete connection state by matching the standard L3/L4 headers again, right?
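Concretely, the cleanup I have in mind would look something like the following once flows are tagged with a zone. This is an untested sketch; the zone number, addresses, and port are made-up values just for illustration:

    # delete only the state that matched the removed tcp/22 rule, scoped to
    # the zone that was assigned to this port (or rule)
    $ sudo conntrack -D -w 4097 -p tcp --orig-dst 10.0.0.5 --dport 22

Without the -w zone filter, the same delete would also wipe identical-looking entries that belong to another tenant reusing those addresses.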
I think a conntrack zone per port would be the easiest from an accounting perspective. We already set up an iptables chain per port, so the grouping is already there (/me sweeps the complexity of choosing zone numbers under the rug).

On Fri, Oct 24, 2014 at 2:25 AM, Salvatore Orlando <sorla...@nicira.com> wrote:

> Just like Kevin, I was considering using conntrack zones to segregate
> connections. However, I don't know whether this would be feasible, as I've
> never used the iptables CT target in real applications.
>
> Segregation should probably happen at the security group level - or even
> at the rule level - rather than the tenant level. Indeed, the same
> situation could occur even with two security groups belonging to the same
> tenant.
>
> Probably each rule can be associated with a different conntrack zone, so
> that when it is matched, the corresponding conntrack entries are added to
> the appropriate zone. When the rule is removed, the connections to kill
> can then be filtered by zone, as explained by Kevin.
>
> This approach will add a good number of rules to the RAW table, however,
> so its impact on control/data plane scalability should be assessed, as it
> might turn out as bad as the solution where connections were explicitly
> dropped with an ad-hoc iptables rule.
>
> Salvatore
>
> On 24 October 2014 09:32, Kevin Benton <blak...@gmail.com> wrote:
>
>> I think the root cause of the problem here is that we are losing
>> segregation between tenants at the conntrack level. The compute side
>> plugs everything into the same namespace, and we have no guarantees
>> about the uniqueness of any other fields kept by conntrack.
>>
>> Because of this loss of uniqueness, I think there may be another lurking
>> bug here as well. One tenant establishing connections between IPs that
>> overlap with another tenant's will create the possibility that a
>> connection the other tenant attempts will match the conntrack entry from
>> the original connection. Then whichever closes the connection first will
>> result in the conntrack entry being removed and the return traffic from
>> the remaining connection being dropped.
>>
>> I think the correct way forward here is to isolate each tenant (or even
>> each compute interface) into its own conntrack zone. [1] This will
>> provide isolation against that imaginary, unlikely scenario I just
>> presented. :-) More importantly, it will allow us to clear connections
>> for a specific tenant (or compute interface) without interfering with
>> others, because conntrack can delete by zone. [2]
>>
>> 1. https://github.com/torvalds/linux/commit/5d0aa2ccd4699a01cfdf14886191c249d7b45a01
>> 2. See the -w option: http://manpages.ubuntu.com/manpages/raring/man8/conntrack.8.html
>>
>> On Thu, Oct 23, 2014 at 3:22 AM, Elena Ezhova <eezh...@mirantis.com> wrote:
>>
>>> Hi!
>>>
>>> I am working on the bug "ping still working once connected even after
>>> related security group rule is deleted"
>>> (https://bugs.launchpad.net/neutron/+bug/1335375). The gist of the
>>> problem is the following: when we delete a security group rule, the
>>> corresponding rule in iptables is also deleted, but the connection that
>>> was allowed by that rule is not destroyed.
>>> The reason for this behavior is that in iptables we have the following
>>> structure of the chain that filters input packets for an instance's
>>> interface:
>>>
>>> Chain neutron-openvswi-i830fa99f-3 (1 references)
>>>  pkts bytes target  prot opt in  out  source     destination
>>>     0     0 DROP    all  --  *   *    0.0.0.0/0  0.0.0.0/0  state INVALID /* Drop packets that are not associated with a state. */
>>>     0     0 RETURN  all  --  *   *    0.0.0.0/0  0.0.0.0/0  state RELATED,ESTABLISHED /* Direct packets associated with a known session to the RETURN chain. */
>>>     0     0 RETURN  udp  --  *   *    10.0.0.3   0.0.0.0/0  udp spt:67 dpt:68
>>>     0     0 RETURN  all  --  *   *    0.0.0.0/0  0.0.0.0/0  match-set IPv43a0d3610-8b38-43f2-8 src
>>>     0     0 RETURN  tcp  --  *   *    0.0.0.0/0  0.0.0.0/0  tcp dpt:22    <---- rule that allows ssh on port 22
>>>     1    84 RETURN  icmp --  *   *    0.0.0.0/0  0.0.0.0/0
>>>     0     0 neutron-openvswi-sg-fallback  all  --  *  *  0.0.0.0/0  0.0.0.0/0  /* Send unmatched traffic to the fallback chain. */
>>>
>>> So, if we delete the rule that allows tcp on port 22, connections that
>>> are already established won't be closed, because all of their packets
>>> will still satisfy the rule:
>>>     0     0 RETURN  all  --  *   *    0.0.0.0/0  0.0.0.0/0  state RELATED,ESTABLISHED /* Direct packets associated with a known session to the RETURN chain. */
>>>
>>> I am seeking advice on how to deal with this problem. There are a
>>> couple of ideas for how to do it (more or less realistic):
>>>
>>> - Kill the connection using conntrack
>>>
>>>   The problem here is that it is sometimes impossible to tell which
>>>   connection should be killed. For example, there may be two instances
>>>   running in different namespaces that have the same IP addresses. As a
>>>   compute node doesn't know anything about namespaces, it cannot
>>>   distinguish between the two seemingly identical connections:
>>>
>>>   $ sudo conntrack -L | grep "10.0.0.5"
>>>   tcp 6 431954 ESTABLISHED src=10.0.0.3 dst=10.0.0.5 sport=60723 dport=22 src=10.0.0.5 dst=10.0.0.3 sport=22 dport=60723 [ASSURED] mark=0 use=1
>>>   tcp 6 431976 ESTABLISHED src=10.0.0.3 dst=10.0.0.5 sport=60729 dport=22 src=10.0.0.5 dst=10.0.0.3 sport=22 dport=60729 [ASSURED] mark=0 use=1
>>>
>>>   I wonder whether there is any way to search for a connection by
>>>   destination MAC?
>>>
>>> - Delete the iptables rule that directs packets associated with a known
>>>   session to the RETURN chain
>>>
>>>   This would force all packets to go through the full chain each time,
>>>   which would definitely make the connection close, but it would also
>>>   strongly affect performance. A timeout could be added after which the
>>>   rule is restored, but it is uncertain how long it should be.
>>>
>>> Please share your thoughts on how it would be best to handle this.
>>>
>>> Thanks in advance,
>>> Elena
>>
>> --
>> Kevin Benton
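To make the RAW-table idea above a bit more concrete, here is the kind of rule I'm picturing for the zone-per-port variant. This is a rough, untested sketch: the zone number is made up, and the exact match that cleanly selects one port's traffic in the raw table (physdev vs. -i, plus the return direction) still needs to be worked out.

    # tag traffic for port tap830fa99f-3 with its own conntrack zone
    iptables -t raw -A PREROUTING -m physdev --physdev-in tap830fa99f-3 -j CT --zone 4097

Once a port's state lives in its own zone, cleanup after a rule (or the whole port) is removed can be scoped with conntrack's -w/--zone filter, even when another tenant is reusing the same IP addresses. That should also cover the question about matching by destination MAC: as far as I can tell from conntrack(8), entries can only be filtered on L3/L4 fields (plus mark and zone), not on MACs, so a zone looks like the intended way to disambiguate.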
--
Kevin Benton

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev