On Thu, 27 Jul 2023 at 20:52, Francois <rigault.franc...@gmail.com> wrote:
>
> Hello!

> Our use case is to have fine grained policies for egress traffic, and there
> are existing products implementing this filtering using DNS names (the NGFW
> firewalls doing L7 filtering).
> As basically all the traffic is TLS encrypted, a NGFW works by examining the
> Server Name Indication field in the first packet of the TLS handshake, which
> is the only thing it is able to decode.
>
> This could be doable in OVN by having the first packet of a payload (first
> packet after a TCP handshake) sent to the ovn-controller, the ovn-controller
> would inspect that first packet and decides on the fate of the connection.
> I don't think there is a way to identify a "first packet of a TCP payload"
> using the conntrack states, so

as a learning exercise I gave it a shot

https://github.com/ovn-org/ovn/compare/main...freedge:inspectOpcode

> this could be a new ct_mark bit in the
> conntrack to set and clear, and then a new action for the controller to
> parse the packet and check the sni against a known list of DNS names

Done with an "OVN_CT_FIRST_BIT" that marks the connection until the first
TCP PSH, and an "inspect" action that looks like this:

match               : "reg0[8] == 1 && (inport == @egressNetpol && ip && ip4.src == 10.225.42.10 && ip4.dst == 192.168.42.100 && tcp.dst == 8080) && (ct_mark.first == 1 && ct.est && tcp.flags == 0x8/0x8)"
actions             : "reg8[16] = 1; reg0[1] = 1; reg0[18] = 1; reg0[19] = 0; reg8[18] = inspect(glob=\"www.example.com\"); next;"
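To spell out what the last clause of that match selects, here is a minimal Python sketch of a value/mask comparison on the TCP flags. The helper names are made up for illustration; only the 0x8/0x8 mask and the ct_mark.first / ct.est conditions come from the flow above.

```python
# TCP flag bits as laid out in the TCP header.
TCP_FIN, TCP_SYN, TCP_RST, TCP_PSH, TCP_ACK = 0x01, 0x02, 0x04, 0x08, 0x10


def masked_match(field: int, value: int, mask: int) -> bool:
    """value/mask match: compare only the bits selected by mask."""
    return (field & mask) == (value & mask)


def is_first_payload_packet(tcp_flags: int, ct_first: bool, ct_est: bool) -> bool:
    """Hypothetical helper mirroring
    'ct_mark.first == 1 && ct.est && tcp.flags == 0x8/0x8'."""
    return ct_first and ct_est and masked_match(tcp_flags, 0x8, 0x8)
```

A PSH|ACK segment on an established, still-marked connection matches; a bare ACK, or any segment after the mark has been cleared, does not.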

Depending on the outcome of handle_acl_sni, the packet is either allowed (and
the conntrack entry is committed to clear OVN_CT_FIRST_BIT), or a RST is sent
back (using the existing reject action). This is activated with:

set acl egressNetpol external_ids:inspect="www.example.com"
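For readers who want to see what such an inspection involves, here is a rough Python sketch of pulling the SNI out of a ClientHello and checking it against a glob. It is not the code from the branch (handle_acl_sni there is C inside ovn-controller); the function names are invented, and it assumes the whole ClientHello fits in the first TCP segment.

```python
import fnmatch
import struct
from typing import Optional


def extract_sni(payload: bytes) -> Optional[str]:
    """Return the SNI host name from a TLS ClientHello, or None."""
    try:
        # TLS record header (5 bytes): type 0x16 means handshake.
        if payload[0] != 0x16:
            return None
        # Handshake header follows: type 0x01 means ClientHello.
        if payload[5] != 0x01:
            return None
        pos = 5 + 4                                            # handshake type + 3-byte length
        pos += 2 + 32                                          # client_version + random
        pos += 1 + payload[pos]                                # session_id
        pos += 2 + struct.unpack_from("!H", payload, pos)[0]   # cipher_suites
        pos += 1 + payload[pos]                                # compression_methods
        ext_end = pos + 2 + struct.unpack_from("!H", payload, pos)[0]
        pos += 2
        while pos + 4 <= min(ext_end, len(payload)):
            ext_type, ext_len = struct.unpack_from("!HH", payload, pos)
            pos += 4
            if ext_type == 0:  # server_name extension
                # Layout: list length (2), name type (1), name length (2), name.
                name_type = payload[pos + 2]
                name_len = struct.unpack_from("!H", payload, pos + 3)[0]
                name = payload[pos + 5 : pos + 5 + name_len]
                if name_type != 0 or len(name) != name_len:
                    return None
                return name.decode("ascii")
            pos += ext_len
        return None
    except (IndexError, struct.error, UnicodeDecodeError):
        # Truncated or malformed packet: treat as "no SNI found".
        return None


def sni_allowed(payload: bytes, glob: str) -> bool:
    """Allow only if an SNI is present and matches the glob."""
    name = extract_sni(payload)
    return name is not None and fnmatch.fnmatchcase(name.lower(), glob.lower())
```

In this sketch a packet without a parsable SNI is rejected; whether that is the right default (plain HTTP, fragmented ClientHello, encrypted ClientHello) is a policy decision that the sketch does not try to settle.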

During the reject there is a minor issue: the reject action is intended for new
connections. For connections that are already established, a reject causes the
controller to send a RST and drop the packets, but the RST is malformed (the SEQ
computation is missing), nothing is sent to the server, and the conntrack entry
stays ESTABLISHED. So if a reject ACL is installed, the server can still freely
deliver packets to the client on existing connections. This can be observed with
the vanilla sources, without any crazy modification: add an ACL with
verdict=reject and look at the behavior of an already established TCP
connection.
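To illustrate the missing SEQ computation, here is a small Python sketch of the sequence numbers a reset would need on an established connection, following the usual TCP rule that a RST is only accepted if its sequence number is what the receiver expects next. The types and the function are hypothetical and are not what the reject action does today; the sketch also assumes the offending segment arrived in order and is dropped.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Segment:
    """The client->server segment that triggered the reject."""
    src: str
    dst: str
    seq: int   # client's sequence number for this segment
    ack: int   # next byte the client expects from the server


@dataclass
class Rst:
    src: str
    dst: str
    seq: int


def resets_for(seg: Segment) -> Tuple[Rst, Rst]:
    """Build RSTs for both peers so that neither side keeps the connection."""
    # Toward the client, on behalf of the server: the client expects the
    # server's next byte at seg.ack, so the RST uses that as its sequence number.
    to_client = Rst(src=seg.dst, dst=seg.src, seq=seg.ack & 0xFFFFFFFF)
    # Toward the server, on behalf of the client: the segment is dropped, so
    # the server still expects the client's data to start at seg.seq.
    to_server = Rst(src=seg.src, dst=seg.dst, seq=seg.seq & 0xFFFFFFFF)
    return to_client, to_server
```

Sending the second RST (toward the server) is what would stop the server from continuing to deliver data; the conntrack entry would still need to be dealt with separately.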

Another, more general issue: the "inspect" logic is coded within
ovn-controller, so any change in this logic (e.g. to support more protocols,
newer versions of TLS, etc.) would have to be coded in ovn-controller, while it
really ought to be an external module. Also, the controller must be up in order
to receive the packet-in messages, meaning ovn-controller must be basically
always up or new TCP connections won't get past the handshake.

For these reasons I was thinking of adding additional controllers: as coded,
ovn-controller -S starts a controller and configures it with an
nx_controller_id of 42. OVS is also changed so that if connmgr_send_async_msg
is not able to deliver the packet to any controller, it retries with
controller id 42 (I coded it like this for a proof of concept...).
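The retry logic can be modelled in a few lines of Python. This is only a toy model of the behaviour described above, not the OVS code: the connection table, the Sender callback type and send_packet_in are all invented for illustration, and only the fallback id 42 comes from the proof of concept.

```python
from typing import Callable, Dict, List, Optional

FALLBACK_CONTROLLER_ID = 42  # value used in the proof of concept

# Hypothetical stand-in for a controller connection:
# returns True if the packet-in was delivered.
Sender = Callable[[bytes], bool]


def send_packet_in(conns: Dict[int, List[Sender]],
                   controller_id: int,
                   pkt: bytes) -> Optional[int]:
    """Try the requested controller id first, then the fallback id.

    Returns the controller id that accepted the packet, or None if
    nobody could take it.
    """
    # dict.fromkeys() keeps order and avoids trying id 42 twice.
    for cid in dict.fromkeys((controller_id, FALLBACK_CONTROLLER_ID)):
        if any(send(pkt) for send in conns.get(cid, [])):
            return cid
    return None
```

A cleaner design would probably let the flow name the target controller id explicitly instead of relying on a hard-coded fallback.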

(btw, for out-of-band controllers there does not seem to be much information
that can be retrieved using ovs-appctl, apart from memory/show, which gives a
count of "ofconns" but no information on supported features or controller-id)

In the end I had this idea:
- keep ovn-controller solely responsible for flow modification
- move the inspect logic to a separate controller that only processes packet-in
  messages
- and even add more "plug-in points" for *all* ACLs:

Here the inspect action is called for the first PSH of the TCP payload, but it
could just as well have been called for the first SYN packet of the TCP
connection. That would mean that ACLs could be implemented using actions
executed by controllers. It would help for use cases where there are just too
many ACLs to install (we have deployments with over 100k ACLs and it just seems
inefficient to use flows for that).

tl;dr: what do you think of ovn-controller installing flows,
and users bringing their controllers to execute actions?

Thanks
François
_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
