Re: [ovs-dev] [RFC] ovn: minimize the impact of a compromised chassis

Darrell Ball Tue, 23 Aug 2016 14:35:06 -0700

On Tue, Aug 23, 2016 at 2:20 PM, Russell Bryant <[email protected]> wrote:


>
>
> On Tue, Aug 23, 2016 at 5:05 PM, Darrell Ball <[email protected]> wrote:
>
>>
>>
>> On Mon, Aug 22, 2016 at 1:08 PM, Lance Richardson <[email protected]>
>> wrote:
>>
>>> > From: "Ben Pfaff" <[email protected]>
>>> > To: "Russell Bryant" <[email protected]>
>>> > Cc: "Lance Richardson" <[email protected]>, "ovs dev" <
>>> [email protected]>, "Russell Bryant" <[email protected]>
>>> > Sent: Monday, August 22, 2016 1:22:43 PM
>>> > Subject: Re: [ovs-dev] [RFC] ovn: minimize the impact of a compromised
>>> chassis
>>> >
>>> > On Mon, Aug 22, 2016 at 01:14:03PM -0400, Russell Bryant wrote:
>>> > > On Mon, Aug 22, 2016 at 12:30 PM, Ben Pfaff <[email protected]> wrote:
>>> > >
>>> > > > On Tue, Aug 16, 2016 at 09:30:21AM -0400, Lance Richardson wrote:
>>> > > > > As described in ovn/TODO, these are the two main approaches that
>>> could
>>> > > > > be
>>> > > > > used to minimize the impact of a compromised chassis on the rest
>>> of an
>>> > > > > OVN network:
>>> > > > >
>>> > > > >   1) Implement a role- or identity-based access control
>>> mechanism for
>>> > > > >      ovsdb-server and use it to limit ovn-controller write
>>> access to
>>> > > > >      tables in the southbound database.
>>> > > > >
>>> > > > > or
>>> > > > >
>>> > > > >   2) Disallow all write access to the southbound database by
>>> > > > ovn-controller
>>> > > > >      (as an optional mode or unconditionally) and provide
>>> alternative
>>> > > > >      mechanisms for updating the southbound database for entries
>>> that
>>> > > > >      are
>>> > > > >      currently updated by ovn-controller.
>>> > > > >
>>> > > > > It is believed that option (1) would require somewhat more
>>> effort than
>>> > > > (2),
>>> > > > > and, because it would involve significant modifications to
>>> > > > > ovsdb-server,
>>> > > > > would also be more likely to add risk and burden to non-OVN
>>> users.
>>> > > > > Additionally, option (2) will likely place fewer requirements on
>>> > > > alternative
>>> > > > > databases (such as etcd), so the following implementation
>>> discussion
>>> > > > > only
>>> > > > > considers option (2).
>>> > > >
>>> > > > I've always pushed back against adding granular access control
>>> > > > mechanisms to OVSDB because I didn't believe it was likely that
>>> anything
>>> > > > that was simple enough to be in the "spirit of OVSDB" (heh) was
>>> also
>>> > > > going to be sufficient to fit a real use case.  However, if we do
>>> now
>>> > > > have specific requirements for OVN, then I'd invite descriptions
>>> of what
>>> > > > access control mechanism would be sufficient.  If it's simple and
>>> > > > general enough, then implementing it in OVSDB might totally make
>>> sense.
>>> > > >
>>> > > > I don't think that the "risk and burden" of a simple and general
>>> > > > mechanism is a real issue.
>>> > >
>>> > >
>>> > > I think that push back makes sense.
>>> > >
>>> > > The proposal here was to take route #2.  The only OVSDB feature
>>> required in
>>> > > that case is to accept read-only connections, which could be on a
>>> > > per-socket basis.  This seems much simpler all around, as long as we
>>> can
>>> > > all get on board with ovn-controller as a read-only client.
>>> >
>>> > I'm not actually saying we should choose #1.  I'm saying a couple of
>>> > things.  First, changing OVSDB is not a huge deal; we do it when it
>>> > makes sense.  Second, that it is possible that our specific application
>>> > here is a better place to start for OVSDB access control than a blanket
>>> > "we need access control for OVSDB" that I've heard a couple of times.
>>> >
>>>
>>> Based on my own narrow view of the world, I think option #1 would need:
>>>
>>>    - The ability for ovsdb-server to associate a role/identity with each
>>>      client connection.  For simplicity this could be a binary
>>> "privileged"
>>>      vs "non-privileged" association, perhaps using per-role SSL
>>> certificates
>>>      for TLS connections and treating unix socket connections as
>>> "privileged".
>>>    - A mechanism for mapping a role/identity to access rights on a
>>> per-table
>>>      and per-column basis.
>>>    - A mechanism for enforcing access rights on a per-table or
>>> per-column basis,
>>>      in some cases also considering the identity of the client that
>>> created
>>>      the row.
>>>
>>> This infrastructure would be applied to OVN to implement the following:
>>>     - These tables would be read-only for non-privileged clients:
>>>       SB_Global, Logical_Flow, Multicast_Group, Datapath_Binding,
>>> Address_Set,
>>>       DHCP_Options, and DHCPv6_Options.
>>>
>>>     - The Chassis and Encap tables would allow insertions by
>>> non-privileged clients
>>>       and updates to existing rows only for the clients that inserted
>>> them.
>>>
>>>     - The Port_Binding table would be writable only by privileged clients
>>>       (ovn-northd) except for the "Chassis" column which should be
>>> writable by any
>>>       non-privileged client (note that this doesn't do a lot to minimize
>>> harm from
>>>       a compromised chassis).
>>>
>>>     - The MAC_Binding table should be writable by any non-privileged
>>> client (which also
>>>       doesn't do much to minimize harm from a compromised chassis).
>>>
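For illustration only, the per-table and per-column rules listed above
could be sketched as a small policy check. This is a minimal sketch, not
an existing ovsdb-server API: the Write record, is_allowed(), and the way
a client identity reaches the check are all hypothetical. Treating
"delete" like "update" for Chassis/Encap rows is also an assumption, since
the list above only mentions insertions and updates.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical policy model; none of these names exist in ovsdb-server.
READ_ONLY_TABLES = {
    "SB_Global", "Logical_Flow", "Multicast_Group", "Datapath_Binding",
    "Address_Set", "DHCP_Options", "DHCPv6_Options",
}
OWNER_TABLES = {"Chassis", "Encap"}


@dataclass(frozen=True)
class Write:
    client: str                      # identity tied to the connection
    privileged: bool                 # e.g. ovn-northd
    table: str
    op: str                          # "insert", "update" or "delete"
    columns: frozenset = frozenset()
    row_owner: Optional[str] = None  # client that inserted the row


def is_allowed(w: Write) -> bool:
    """Return True if the write fits the rules sketched in the list."""
    if w.privileged:
        return True
    if w.table in READ_ONLY_TABLES:
        return False
    if w.table in OWNER_TABLES:
        # Anyone may insert; only the inserting client may change the row.
        # Assumption: "delete" is handled the same way as "update".
        if w.op == "insert":
            return True
        return w.op in ("update", "delete") and w.row_owner == w.client
    if w.table == "Port_Binding":
        # Only the "chassis" column is writable by non-privileged clients.
        return w.op == "update" and w.columns <= {"chassis"}
    if w.table == "MAC_Binding":
        return True
    return False  # assumption: default deny for tables not listed
```

With this sketch, a non-privileged client "hv2" updating a Chassis row
inserted by "hv1" is rejected, while "hv1" updating its own row is
accepted.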
>>> > > Are you interested in looking closer at what #1 would look like, with
>>> > > details of what the access control policy would look like?
>>> >
>>> > It'll probably be obvious, or close to obvious, what would be needed
>>> for
>>> > #1 once we talk through what #2 needs.
>>> >
>>>
>>> Here's a slightly more detailed breakdown of the work needed for option
>>> #2:
>>>
>>>     ovsdb-server: Add support for "read-only" connections. Perhaps
>>> something
>>>       like "--remote ptcp:read-only:<port>[:<ip>]" and variations on
>>> that theme
>>>       for other connection types.
>>>
>>>     ovn-controller: Implement new approach for Chassis and Encap tables:
>>>          - Remove code from ovn-controller for creating rows in these
>>> tables.
>>>          - Document how administrators create rows using ovn-sbctl in
>>> ovn-controller
>>>            man page.
>>>          - Update all tests to manually create Chassis/Encap rows.
>>>
>>>     ovn-controller: Implement new approach for chassis column in
>>> Port_Binding table:
>>>          - Remove the code to update the chassis column from
>>> ovn-controller.
>>>          - Add new key to options column of Logical_Switch_Port in
>>> OVN_Northbound
>>>            database to specify chassis binding.
>>>          - Change ovn-northd to update Port_Binding table in southbound
>>> db based
>>>            on chassis option from Logical_Switch_Port in northbound db.
>>>          - Write upgrade helper script that sets chassis option for
>>> existing
>>>            Logical_Switch_Ports based on current values in Port_Binding
>>> table of
>>>            southbound db
>>>          - Document OVN upgrade procedure, including the use of the
>>> upgrade helper
>>>            script.
>>>
>>>     ovn-controller: Rework MAC_Binding table
>>>          - Propose details of chassis-local mac bindings storage, the
>>> two main options
>>>            are:
>>>            + In ovn-controller memory (simple, but cache reset on
>>> ovn-controller restart).
>>>            + In Open_vSwitch database (more work, as we need cache
>>> invalidation logic added).
>>>          - Change ovn-controller to use local store for learned mac
>>> bindings.
>>>          - Remove code for updating MAC_Binding table from
>>> ovn-controller.
>>>
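As a rough illustration of the "read-only" connection item in the
breakdown above, the server-side check could amount to refusing any
"transact" request that carries a modifying operation. This is a minimal
sketch under stated assumptions: transact_permitted() is a hypothetical
helper, not ovsdb-server code, and classifying "mutate" as a write is my
reading. The operation names themselves are the ones defined by the OVSDB
protocol (RFC 7047).

```python
# OVSDB operations (RFC 7047) that modify the database.
WRITE_OPS = {"insert", "update", "mutate", "delete"}


def transact_permitted(params, read_only):
    """Decide whether a "transact" request may run on this connection.

    params is the request's "params" array: [db_name, op1, op2, ...],
    where each op is a JSON object with an "op" member.
    """
    if not read_only:
        return True
    # On a read-only connection, reject the whole transaction if any
    # single operation would write.
    return all(op.get("op") not in WRITE_OPS for op in params[1:])
```

A "select" on the Chassis table would pass on a read-only connection,
while an "insert" into the same table would be refused.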
>>
>> Regarding Option 2:
>>
>> Most distributed systems that share a common management plane would try
>> to share
>> mac bindings via the common management plane, even if each node maintains
>> its own cache.
>>
>
> What specific systems are you referring to here?
>

Every hardware router with distributed line card modules, where the line
cards can be in a single
physical chassis or across multiple physical chassis. I will not be quoting
specific products here
or their specific designs.



>
>
>> Throwing that out entirely because of a fear of a compromised chassis
>> seems out of
>> proportion to the potential problem. There can be thousands of chassis in
>> the same
>> logical network whose packet flows need the same binding.
>>
>
> It's not a fear.  It's a legitimate security issue.
>

I never said it was not possible to have an issue; if I had, my response
would have been different.
The following text has the full context.


>
>
>> Furthermore, the risk of a compromised chassis may be very low in many
>> use cases.
>> The "one known target environment" alluded to in the problem description
>> should not "rule all"
>> by default.
>>
>
> The group that raised this to me was OpenShift (a kubernetes based
> platform).  It's a show stopper for them, as I would expect for other
> container based systems.
>

Good, thanks.



>
> The same issue applies to OpenStack, though it's not quite as pressing of
> an issue as other OpenStack components have similar problems anyway.
>
>
>> Perhaps allowing ovn-controller to write to a candidate mac binding table
>> (with some limitations
>> as well) and having northd (possibly as background work) detect a
>> consensus of binding from > X controller
>> client sessions and then populate the actual mac binding table might
>> mitigate the exploit concern.
>> Only northd would be able to write to the actual mac binding table.
>>
>> If there is no consensus yet on the binding, then the default is
>> for the interested
>> controller to issue the arp request and use the local controller cache.
>> This includes the
>> degenerate case where there is only one controller interested in that
>> particular mac binding.
>>
>
> That sounds like a potential improvement for dynamic mac bindings, at
> least.  We still have Chassis, Encap, and Port_Binding to deal with.  It
> would also require more complex RBAC capabilities to be added to ovsdb,
> which I was hoping to avoid.
>

It is nice to keep things simple, when the "cost" is not too high...
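
To make the candidate-table idea quoted above a little more concrete,
here is a minimal sketch of the promotion step northd might run. Every
name in it is hypothetical (there is no candidate table in the southbound
schema today), and the handling of conflicting reports is an assumption:
a key is promoted only when all reporting chassis agree on the MAC and
more than `threshold` distinct chassis reported it.

```python
from collections import defaultdict


def promote_bindings(candidates, threshold):
    """Pick candidate MAC bindings that have reached consensus.

    candidates: iterable of (chassis, logical_port, ip, mac) rows, as
    written by ovn-controller instances into a hypothetical candidate
    table.  Returns {(logical_port, ip): mac} for the bindings that
    northd would copy into the actual MAC_Binding table.
    """
    # (logical_port, ip) -> mac -> set of chassis reporting that mac
    reporters = defaultdict(lambda: defaultdict(set))
    for chassis, port, ip, mac in candidates:
        reporters[(port, ip)][mac].add(chassis)

    promoted = {}
    for key, by_mac in reporters.items():
        if len(by_mac) != 1:
            # Assumption: conflicting reports block promotion entirely.
            continue
        (mac, chassis_set), = by_mac.items()
        if len(chassis_set) > threshold:
            promoted[key] = mac
    return promoted
```

Bindings that are not promoted stay chassis-local: the interested
controller issues its own ARP request and uses its local cache, which
also covers the case of a single interested controller.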



>
> --
> Russell Bryant
>
_______________________________________________
dev mailing list
[email protected]
http://openvswitch.org/mailman/listinfo/dev
