On Wed, Jul 6, 2016 at 7:24 AM, Aaron Conole <acon...@redhat.com> wrote: > Aaron Conole <acon...@redhat.com> writes: > >> Daniele Di Proietto <diproiet...@vmware.com> writes: >> >>> On 10/06/2016 10:51, "Aaron Conole" <acon...@redhat.com> wrote: >>> >>>>Aaron Conole <acon...@redhat.com> writes: >>>> >>>>> Christian Ehrhardt <christian.ehrha...@canonical.com> writes: >>>>> >>>>>> On Tue, May 24, 2016 at 4:10 PM, Aaron Conole <acon...@redhat.com> wrote: >>>>>> >>>>>>> Daniele Di Proietto <diproiet...@vmware.com> writes: >>>>>>> >>>>>>> > Hi Aaron, >>>>>>> > >>>>>>> > I'm still a little bit nervous about calling chown on a (partially) >>>>>>> > user controlled file name. >>>>>>> >>>>>>> I agree, that always seems scary. >>>>>>> >>>>>>> > Before moving forward I wanted to discuss a couple of other options: >>>>>>> > >>>>>>> > * Ansis (in CC) suggested using -runas parameter in qemu. This way >>>>>>> > qemu can open the socket as root and drop privileges before starting >>>>>>> > guest execution. >>>>>>> >>>>>>> I'm not sure how to do this with libvirt, or via the OpenStack Neutron >>>>>>> plugin. I also don't know if it would be an acceptable workaround for >>>>>>> users. Additionally, I recall there being something of a "don't even >>>>>>> know if this works" around it. Maybe Christian or Ansis (both in CC) >>>>>>> can expound on it. >>>>>>> >>>>>> >>>>>> Hi, >>>>>> IIRC we kind of agree that long term a proper MAC will be much better but >>>>>> most involved people needed something to get it working like "now". >>>>>> Since they are complementary (other than the fix removing a bit of the >>>>>> urgency for more MAC) it was kind of the least bad option. >>>>>> >>>>>> You have to be aware that I brought up the discussion on d...@dpdk.org - >>>>>> see >>>>>> [1] and [2]: >>>>>> But this will take time and eventually still be the applications task to >>>>>> "do something" - no matter if via API or via the chmod's right now. >>>>>> So Aaron is trying to get something that works now until the long term >>>>>> things are in place, which I appreciate. >>>>>> >>>>>> FYI - I was even more in a hurry as it was clear that OVS-2.5 won't get >>>>>> this in time I run with [3] for now. >>>>>> I never intended to suggest that, but with the discussion in place, one >>>>>> could ask if you (Aaron) want to pick up that instead. >>>>>> That would keep OVS free for now until DPDK made up the API (see [2]) for >>>>>> socket ownership control and this then could be implemented in OVS? >>>>>> >>>>>> (I hope) In some months/years we will all be happy to drop this bunch of >>>>>> interim solutions, never the less we need it for now. >>>>>> >>>>>> [1]: http://dpdk.org/dev/patchwork/patch/12222/ >>>>>> [2]: http://dpdk.org/ml/archives/dev/2015-December/030326.html >>>>>> [3]: >>>>>> https://git.launchpad.net/~ubuntu-server/dpdk/commit/?h=ubuntu-xenial-to-dpdk2.2&id=f3c7aa1b2ddea8e092ad4a89e41a0e19d01ed4e7 >>>>>> >>>>>> [...] >>>>>> >>>>>> >>>>>>> I think originally we quickly discussed 4 possible solutions (and >>>>>>> hopefully I captured them correctly): >>>>>>> >>>>>>> 1. OVS downgrades to the ovs user, and kvm runs under the ovs >>>>>>> group. I don't actually like this solution because kvm could then >>>>>>> pollute the ovs database. >>>>>>> >>>>>>> 2. OVS runs as some user and sets the user/group ownership of the socket >>>>>>> via chown/chmod where permissions come from the database (the >>>>>>> original context had ovs running as root - but as I described above >>>>>>> it doesn't need to be root provided ovs+DPDK can start without root). >>>>>>> >>>>>>> 3. OVS runs as some user, kvm starts as root, opens the socket and >>>>>>> downgrades. IIRC, this doesn't actually work, or it may have >>>>>>> implications on other projects. I don't remember exactly what was >>>>>>> not as great about this solution, TBH. >>>>>>> >>>>>>> 4. OVS and KVM run as whatever users; MAC is used to enforce the >>>>>>> layering between them. >>>>>>> >>>>>>> I think solution 2 and solution 4 don't actually interfere with each >>>>>>> other, and can be used to a complementary effect (if implemented >>>>>>> properly) so that the MAC layer enforces access, but even without MAC, >>>>>>> the DAC layer can provide appropriate whitelisting behavior. >>>>>>> >>>>>> >>>>>> I also remember several complex changes needed for the #1 and #3 that >>>>>> always would end up with huge effort and a high risk not being accepted. >>>>>> Probably that is what you refer to with "implications on other projects". >>>>>> >>>>>> Also keep in mind the position of dpdk out of the last few discussions >>>>>> which I'd like to summarize as "dpdk got this path from an app, so this >>>>>> app >>>>>> OWNS that path". >>>>> >>>>> I'd like to continue on, but I am not sure what the concerns are right >>>>> now. Is it possible to enumerate them point by point so that I can >>>>> understand them? I think there are two outstanding concerns right now: >>>>> >>>>> 1. the proposed approach is not good enough (vis-a-vis DAC vs. MAC) >>>>> >>>>> 2. the proposed approach would be better implemented in the utility >>>>> that wants access to the sockets (vis-a-vis the libvirt discussion) >>>>> >>>>> Am I understanding the concerns correctly? >>>> >>>>Ping? >>> >>> I found another theoretical problem with the chmod approach, let me try to >>> explain: >>> >>> There's an extremely small race window between the socket creation and the >>> chmod which could theoretically be exploited to change the owner of a socket >>> (e.g. ovnsb_db.sock) in ovs rundir, by controlling the name of the port: >>> >>> 1. There's no southbound database running, because it's not yet been >>> started or because it's being restarted. >>> 2. The user creates a vhost port, naming it ovnsb_db.sock. >>> rte_vhost_driver_register() succeeds and creates a socket in the file >>> system. >>> 3. The southbound database is started, it removes ovnsb_db.sock and >>> recreates >>> it. >>> 4. Now OVS changes the owner and the permission of what it thinks is a >>> vhost-user socket. >>> >>> If 3 manages to get between 2 and 4, we have a problem. It's a pretty small >>> window, and it's unlikely that an attacker can control when the southbound >>> database is restarted. >>> >>> I feel like I'm nitpicking, but I'm not sure how serious is the security >>> impact of what I'm describing. >>> >>> I suggested an alternative approach, and I've tried implementing a quick >>> POC on top on your patch: >>> >>> ---8<--- >>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c >>> index 24ebb41..d7adc66 100644 >>> --- a/lib/netdev-dpdk.c >>> +++ b/lib/netdev-dpdk.c >>> @@ -30,6 +30,7 @@ >>> #include <sys/types.h> >>> #include <sys/stat.h> >>> #include <getopt.h> >>> +#include <sys/fsuid.h> >>> >>> #include "chutil.h" >>> #include "dirs.h" >>> @@ -891,6 +892,17 @@ netdev_dpdk_vhost_user_construct(struct netdev *netdev) >>> */ >>> snprintf(dev->vhost_id, sizeof(dev->vhost_id), "%s/%s", >>> vhost_sock_dir, name); >>> + uid_t orig_u = geteuid(); >>> + gid_t orig_g = getegid(); >>> + if (vhost_sock_def_owner) { >>> + uid_t u; >>> + gid_t g; >>> + if (!ovs_strtousr(vhost_sock_def_owner, &u, NULL, &g, false)) { >>> + VLOG_INFO("UID: %d GID: %d", u, g); >>> + setfsuid(u); >>> + setfsgid(g); >>> + } >>> + } >>> >>> err = rte_vhost_driver_register(dev->vhost_id); >>> if (err) { >>> @@ -903,16 +915,12 @@ netdev_dpdk_vhost_user_construct(struct netdev >>> *netdev) >>> err = vhost_construct_helper(netdev); >>> } >>> >>> - ovs_mutex_unlock(&dpdk_mutex); >>> - if (!err && vhost_sock_def_owner && >>> - (err = ovs_chown(dev->vhost_id, vhost_sock_def_owner))) { >>> - VLOG_ERR("vhost-user socket device ownership change failed."); >>> + if (vhost_sock_def_owner) { >>> + setfsuid(orig_u); >>> + setfsgid(orig_g); >>> } >>> >>> - if (!err && vhost_sock_def_perms && >>> - (err = ovs_chmod(dev->vhost_id, vhost_sock_def_perms))) { >>> - VLOG_ERR("vhost-user socket device permission change failed."); >>> - } >>> + ovs_mutex_unlock(&dpdk_mutex); >>> >>> return err; >>> } >>> >>> ---8<--- >>> >>> Compared to the chmod approach this has some limitation: >>> >>> 1. It doesn't support changing permissions, only the owner. This >>> could be done with umask, but I couldn't find any system call >>> to change the umask for a single thread. >>> 2. Unless vhost-sock-dir is owned by the target owner, the socket >>> cannot be created. I'm not sure whether this is a reasonable >>> limitation for the use cases you have in mind. >>> 3. setfsuid() is Linux specific and somehow deprecated according >>> to the manpage: >>> >>> "Thus, setfsuid() is nowadays unneeded and should be avoided >>> in new applications" >>> >>> I haven't used seteuid, because it changes the euid of the whole >>> process and that may interfere with other operations on OVS. >> >> Thanks for this PoC, and explanation. I agree, there is a race, and I'd >> like to work on trying to solve the problem. >> >>> If these limitations are unacceptable, I can see how we can use >>> chmod. After all, as you point out, it's probably better to do it >>> in OVS than in some script. >> >> I think fchmod and fchown may actually be the correct calls to have, and >> will refactor these chown/chmod utils functions as such, which (I >> believe) avoids the race as you describe. >> >>> Thanks for your patience in solving this problem, >> >> Thanks for your reviews! >> >>> Daniele > > I've been trying to find a way to solve this in a portable fashion, but > it appears to either require changes in DPDK to let the application > create the unix server socket, or a linux-only guarantee with a > non-linux racy fallback. My attempt at the former change was NAK'd by > upstream DPDK. If the latter is unacceptable, then we can drop these > patches and hope that client mode will be a workable solution. > > Thoughts? Do you mean as if Open vSwitch would be the one connecting to DPDK socket (the client and server roles swapped)?
If so, then I sent a similar patch, but for OpenFlow controller sockets - http://openvswitch.org/pipermail/dev/2016-June/073733.html. Basically --no-self-confinement flag makes OVS to enter in mode where it can connect to sockets outside the /var/run/openvswitch directory. _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev