On Tue, Aug 23, 2016 at 2:20 PM, Russell Bryant <rbry...@redhat.com> wrote:
>
> On Tue, Aug 23, 2016 at 5:05 PM, Darrell Ball <dlu...@gmail.com> wrote:
>
>> On Mon, Aug 22, 2016 at 1:08 PM, Lance Richardson <lrich...@redhat.com>
>> wrote:
>>
>>> > From: "Ben Pfaff" <b...@ovn.org>
>>> > To: "Russell Bryant" <russ...@ovn.org>
>>> > Cc: "Lance Richardson" <lrich...@redhat.com>, "ovs dev" <dev@openvswitch.org>, "Russell Bryant" <rbry...@redhat.com>
>>> > Sent: Monday, August 22, 2016 1:22:43 PM
>>> > Subject: Re: [ovs-dev] [RFC] ovn: minimize the impact of a compromised chassis
>>> >
>>> > On Mon, Aug 22, 2016 at 01:14:03PM -0400, Russell Bryant wrote:
>>> > > On Mon, Aug 22, 2016 at 12:30 PM, Ben Pfaff <b...@ovn.org> wrote:
>>> > >
>>> > > > On Tue, Aug 16, 2016 at 09:30:21AM -0400, Lance Richardson wrote:
>>> > > > > As described in ovn/TODO, these are the two main approaches that
>>> > > > > could be used to minimize the impact of a compromised chassis on
>>> > > > > the rest of an OVN network:
>>> > > > >
>>> > > > >     1) Implement a role- or identity-based access control
>>> > > > >        mechanism for ovsdb-server and use it to limit
>>> > > > >        ovn-controller write access to tables in the southbound
>>> > > > >        database.
>>> > > > >
>>> > > > >     or
>>> > > > >
>>> > > > >     2) Disallow all write access to the southbound database by
>>> > > > >        ovn-controller (as an optional mode or unconditionally)
>>> > > > >        and provide alternative mechanisms for updating the
>>> > > > >        southbound database for entries that are currently
>>> > > > >        updated by ovn-controller.
>>> > > > >
>>> > > > > It is believed that option (1) would require somewhat more effort
>>> > > > > than (2), and, because it would involve significant modifications
>>> > > > > to ovsdb-server, would also be more likely to add risk and burden
>>> > > > > to non-OVN users.
>>> > > > > Additionally, option (2) will likely place fewer requirements on
>>> > > > > alternative databases (such as etcd), so the following
>>> > > > > implementation discussion only considers option (2).
>>> > > >
>>> > > > I've always pushed back against adding granular access control
>>> > > > mechanisms to OVSDB because I didn't believe it was likely that
>>> > > > anything that was simple enough to be in the "spirit of OVSDB" (heh)
>>> > > > was also going to be sufficient to fit a real use case. However, if
>>> > > > we do now have specific requirements for OVN, then I'd invite
>>> > > > descriptions of what access control mechanism would be sufficient.
>>> > > > If it's simple and general enough, then implementing it in OVSDB
>>> > > > might totally make sense.
>>> > > >
>>> > > > I don't think that the "risk and burden" of a simple and general
>>> > > > mechanism is a real issue.
>>> > >
>>> > >
>>> > > I think that push back makes sense.
>>> > >
>>> > > The proposal here was to take route #2. The only OVSDB feature
>>> > > required in that case is to accept read-only connections, which could
>>> > > be on a per-socket basis. This seems much simpler all around, as long
>>> > > as we can all get on board with ovn-controller as a read-only client.
>>> >
>>> > I'm not actually saying we should choose #1. I'm saying a couple of
>>> > things. First, changing OVSDB is not a huge deal; we do it when it
>>> > makes sense. Second, that it is possible that our specific application
>>> > here is a better place to start for OVSDB access control than a blanket
>>> > "we need access control for OVSDB" that I've heard a couple of times.
>>> >
>>>
>>> Based on my own narrow view of the world, I think option #1 would need:
>>>
>>>   - The ability for ovsdb-server to associate a role/identity with each
>>>     client connection.
>>>     For simplicity this could be a binary "privileged" vs
>>>     "non-privileged" association, perhaps using per-role SSL certificates
>>>     for TLS connections and treating unix socket connections as
>>>     "privileged".
>>>   - A mechanism for mapping a role/identity to access rights on a
>>>     per-table and per-column basis.
>>>   - A mechanism for enforcing access rights on a per-table or per-column
>>>     basis, in some cases also considering the identity of the client that
>>>     created the row.
>>>
>>> This infrastructure would be applied to OVN to implement the following:
>>>
>>>   - These tables would be read-only for non-privileged clients:
>>>     SB_Global, Logical_Flow, Multicast_Group, Datapath_Binding,
>>>     Address_Set, DHCP_Options, and DHCPv6_Options.
>>>
>>>   - The Chassis and Encap tables would allow insertions by non-privileged
>>>     clients and updates to existing rows only for the clients that
>>>     inserted them.
>>>
>>>   - The Port_Binding table would be writable only by privileged clients
>>>     (ovn-northd) except for the "Chassis" column which should be writable
>>>     by any non-privileged client (note that this doesn't do a lot to
>>>     minimize harm from a compromised chassis).
>>>
>>>   - The MAC_Binding table should be writable by any non-privileged client
>>>     (which also doesn't do much to minimize harm from a compromised
>>>     chassis).
>>>
>>> > > Are you interested in looking closer at what #1 would look like, with
>>> > > details of what the access control policy would look like?
>>> >
>>> > It'll probably be obvious, or close to obvious, what would be needed
>>> > for #1 once we talk through what #2 needs.
>>> >
>>>
>>> Here's a slightly more detailed breakdown of the work needed for option
>>> #2:
>>>
>>>   ovsdb-server: Add support for "read-only" connections. Perhaps
>>>   something like "--remote ptcp:read-only:<port>[:<ip>]" and variations
>>>   on that theme for other connection types.
>>>
>>>   ovn-controller: Implement new approach for Chassis and Encap tables:
>>>     - Remove code from ovn-controller for creating rows in these tables.
>>>     - Document how administrators create rows using ovn-sbctl in
>>>       ovn-controller man page.
>>>     - Update all tests to manually create Chassis/Encap rows.
>>>
>>>   ovn-controller: Implement new approach for chassis column in
>>>   Port_Binding table:
>>>     - Remove the code to update the chassis column from ovn-controller.
>>>     - Add new key to options column of Logical_Switch_Port in
>>>       OVN_Northbound database to specify chassis binding.
>>>     - Change ovn-northd to update Port_Binding table in southbound db
>>>       based on chassis option from Logical_Switch_Port in northbound db.
>>>     - Write upgrade helper script that sets chassis option for existing
>>>       Logical_Switch_Ports based on current values in Port_Binding table
>>>       of southbound db.
>>>     - Document OVN upgrade procedure, including the use of the upgrade
>>>       helper script.
>>>
>>>   ovn-controller: Rework MAC_Binding table
>>>     - Propose details of chassis-local mac bindings storage, the two
>>>       main options are:
>>>         + In ovn-controller memory (simple, but cache reset on
>>>           ovn-controller restart).
>>>         + In Open_vSwitch database (more work, as we need cache
>>>           invalidation logic added).
>>>     - Change ovn-controller to use local store for learned mac bindings.
>>>     - Remove code for updating MAC_Binding table from ovn-controller.
>>>
>>
>> Regarding Option 2:
>>
>> Most distributed systems that share a common management plane would try
>> to share mac bindings via the common management plane, even if each node
>> maintains its own cache.
>>
>
> What specific systems are you referring to here?
>

Every hardware router with distributed line card modules, where the line
cards can be in a single physical chassis or across multiple physical
chassis. I will not be quoting specific products here or their specific
designs.

>
>> Throwing that out entirely because of a fear of a compromised chassis
>> seems out of proportion to the potential problem. There can be 1000s of
>> chassis part of the same logical network having packet flows needing the
>> same binding.
>>
>
> It's not a fear. It's a legitimate security issue.
>

I never said it was not possible to have an issue; if I did, my response
would have been different. The following text has the full context.

>
>> Furthermore, the risk of a compromised chassis may be very low in many
>> use cases. The "one known target environment" alluded to in the problem
>> description should not "rule all" by default.
>>
>
> The group that raised this to me was OpenShift (a kubernetes based
> platform). It's a show stopper for them, as I would expect for other
> container based systems.
>

Good, thanks.

>
> The same issue applies to OpenStack, though it's not quite as pressing of
> an issue as other OpenStack components have similar problems anyway.
>
>
>> Perhaps allowing ovn-controller to write to a candidate mac binding table
>> (with some limitations as well) and having northd (possibly as background
>> work) detect a consensus of binding from > X controller client sessions
>> and then populate the actual mac binding table might mitigate the exploit
>> concern. Only northd would be able to write to the actual mac binding
>> table.
>>
>> If there is no consensus yet on the binding, then the default is for the
>> interested controller to issue the arp request and use the local
>> controller cache. This includes the degenerate case where there is only
>> one controller interested in that particular mac binding.
>>
>
> That sounds like a potential improvement for dynamic mac bindings, at
> least. We still have Chassis, Encap, and Port_Binding to deal with. It
> would also require more complex RBAC capabilities to be added to ovsdb,
> which I was hoping to avoid.
>

It is nice to keep things simple, when the "cost" is not too high...

>
> --
> Russell Bryant
>
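
To make the candidate-table idea above a bit more concrete, here is a rough
sketch in Python. It is only an illustration under my own assumptions, not
code from OVN: the class CandidateBindings, the function promote, the
threshold parameter (the "X" above) and the in-memory dictionaries are all
hypothetical stand-ins for a candidate table and the real MAC_Binding table.
I also assume that two conflicting MACs that both pass the threshold should
block promotion.

```python
from collections import Counter, defaultdict


class CandidateBindings:
    """Hypothetical candidate table: one MAC report per chassis per (port, ip)."""

    def __init__(self, threshold):
        # Promote only when strictly more than `threshold` chassis agree ("> X").
        self.threshold = threshold
        self._reports = defaultdict(dict)  # (port, ip) -> {chassis: mac}

    def report(self, chassis, port, ip, mac):
        # A chassis can only overwrite its own report, so a single compromised
        # chassis contributes at most one vote per binding.
        self._reports[(port, ip)][chassis] = mac.lower()

    def keys(self):
        return list(self._reports)

    def consensus(self, port, ip):
        """Return the agreed MAC, or None when there is no consensus."""
        votes = Counter(self._reports.get((port, ip), {}).values())
        winners = [mac for mac, n in votes.items() if n > self.threshold]
        # Assumption: conflicting winners mean no consensus at all.
        return winners[0] if len(winners) == 1 else None


def promote(candidates, mac_binding):
    """Background step run by the privileged writer (northd in the proposal)."""
    for port, ip in candidates.keys():
        mac = candidates.consensus(port, ip)
        if mac is not None:
            mac_binding[(port, ip)] = mac
    return mac_binding
```

A binding reported by only one chassis never reaches the real table in this
sketch, which corresponds to the degenerate case where the controller keeps
using its local cache.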