Hi Enrico,

I agree that vdev should represent as much of its state as possible through its filesystem.  This will include things like:
* the usual POSIX stuff for each device node (user, group, mode)
* OS-specific device parameters for each device node, as extended attributes (e.g. which subsystem the device belongs to, the sequence number of the netlink packet, etc.)
* the command used to generate a device node's path, as an extended attribute (this is vdev-specific)
* the commands used to create and update each node, as extended attributes (these are vdev-specific)
* the action(s) taken as a result of the node being created or updated, as extended attributes (these are vdev-specific)
* maybe some usage statistics?
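To make the list above a bit more concrete, here is a rough sketch of how a tool could read that state back from a node using nothing but ordinary filesystem calls.  It is only an illustration: the function name describe_node is made up, vdev does not define any particular attribute names here, and os.listxattr/os.getxattr are only available in Python on Linux.

```python
import os
import stat


def describe_node(path):
    """Collect the POSIX metadata and any extended attributes of one node.

    Hypothetical helper for illustration; not part of vdev.
    """
    st = os.lstat(path)
    info = {
        "uid": st.st_uid,
        "gid": st.st_gid,
        "mode": stat.filemode(st.st_mode),
        "xattrs": {},
    }
    # The xattr calls only exist in Python's os module on Linux.
    if hasattr(os, "listxattr"):
        try:
            for name in os.listxattr(path, follow_symlinks=False):
                info["xattrs"][name] = os.getxattr(
                    path, name, follow_symlinks=False
                )
        except OSError:
            # The filesystem may not support (or permit reading) xattrs;
            # keep whatever was collected so far.
            pass
    return info
```

Calling describe_node("/dev/null") would return the owner, group and mode string of that node, plus whatever extended attributes the underlying filesystem happens to expose.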
The reason for storing extra information as external config files (ACLs and actions) is that sometimes vdev must know how to handle device events *before* it can start processing requests from the OS (i.e. for correctly setting ownership and mode bits, and for avoiding name collisions).  However, once the initial ACLs and actions are loaded, vdev should expose them under, say, /dev/vdev/acls/ and /dev/vdev/actions/ as regular files.  Adding, editing, and removing these files would change vdev's runtime behavior accordingly, as would directly editing any of the above information already exposed via the filesystem.

Also, I think that as long as there's a simple policy in place that ensures that each device node is invisible by default (until the admin changes it), and that device name collisions get handled in a sane manner, you could get away with not having any initial ACLs at all, and simply treat vdev as a typical /dev filesystem that the admin sets up manually (or from a script) before letting users access it.

> One interesting question here is whether we should do our own
> namespacing (within vdev itself), or just use the kernel infrastructure
> for that. (by the way: does anybody here know how other kernels,
> like *bsd handle namespaces ?)

I think I could offer the admin a continuous trade-off between per-session device namespaces and doing all device namespacing in a global vdev, whatever his/her preference.  In the former case, the admin (or session manager) would mount a new vdev on the user's /dev mountpoint, and define for that user the set of ACLs that ensure that the user only sees the devices (s)he can access.  In the latter case, the admin would carefully craft a set of ACLs (or a script that programs vdev during boot) that ensures that each user sees only the devices (s)he can access.  Everyone wins.
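As a minimal sketch of the "invisible by default" policy in the global-vdev case, assume the ACLs boil down to (user, name pattern) pairs.  That representation and the function name visible_nodes are assumptions made for this example, not vdev's actual ACL format.

```python
from fnmatch import fnmatch


def visible_nodes(nodes, acls, user):
    """Return the device names `user` may see.

    `acls` is a list of (user, glob_pattern) grants.  This is a
    hypothetical format chosen for illustration.  A node that matches
    no grant for this user stays hidden (default deny).
    """
    visible = []
    for name in nodes:
        if any(u == user and fnmatch(name, pat) for u, pat in acls):
            visible.append(name)
    return visible


nodes = ["null", "sda", "sda1", "ttyUSB0"]
acls = [("alice", "null"), ("alice", "ttyUSB*")]
print(visible_nodes(nodes, acls, "alice"))  # ['null', 'ttyUSB0']
print(visible_nodes(nodes, acls, "bob"))    # []
```

With an empty ACL list nobody sees anything, which matches the idea of starting with no initial ACLs and letting the admin open things up afterwards.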
> One interesting question here is whether we should do our own
> namespacing (within vdev itself), or just use the kernel infrastructure
> for that. (by the way: does anybody here know how other kernels,
> like *bsd handle namespaces ?)
> Maybe we could go through some scenarios, where you'd currently use
> ACLs and check whether they could be done better w/ namespaces.
> (in fact, I prefer not to use ACLs, due to additional complexity)

I don't doubt that giving each user (or each session) its own /dev will offer the most flexibility in Linux.  However, it is hard to do this consistently across operating systems.  With Linux, you can give each session its own set of namespaces via unshare(2).  With FreeBSD, you could conceivably give each session its own jail, but the jail will offer limited networking options (i.e. no raw packets, so no ping or tcpdump or the like).  OpenBSD only offers chroot, which can be easily escaped.  Since I'm looking to port vdev to !Linux, vdev shouldn't rely on the OS's namespacing capabilities to provide different users different views of /dev.

> One example is session isolation: here I'm pretty sure that, on login
> or session start, a proper namespace should be constructed, before
> calling the login shell is started. Do you see any reason for not
> going that way ?

I must emphasize that containers alone shouldn't be relied upon as a security solution, since local privilege escalation attacks that could be used to circumvent them get discovered pretty regularly.  If your motivation for doing per-session containers is only namespace isolation (i.e. give each user a different view of the system, so programs can continue to work as if they had the whole system to themselves), then this approach looks sound.
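For the Linux case mentioned above, the unshare(2) step itself is small.  Here is a sketch that calls it through ctypes; the function name is made up for this example, and it covers only the mount namespace, not the mounting of a per-session /dev afterwards.

```python
import ctypes
import os

# Value of CLONE_NEWNS from <sched.h> on Linux.
CLONE_NEWNS = 0x00020000


def unshare_mount_namespace():
    """Move the calling process into a private mount namespace (Linux only).

    Hypothetical helper for illustration; raises OSError if the kernel
    refuses the request (for example, for lack of privileges).
    """
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(CLONE_NEWNS) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

A session manager would typically need CAP_SYS_ADMIN for this call, and would make it before mounting the session's /dev and exec'ing the login shell.  Note that mounts can still propagate to other namespaces unless mount propagation is set to private first.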
There will be a bit of legwork involved in giving each container a proper network interface, however, since you'll have many options (for example, do you want to put your containers behind a NAT, or do you want them to be able to bind to the root context's IP address?).

> By the way: does vdev's ACL handling also allow revoking permissions
> to some device even on already opened fd's ?

Not possible; you need the kernel to help you there.  FreeBSD offers it, but Linux does not (AFAIK; I know that there's been interest in adding it).

-Jude

On Wed, Dec 31, 2014 at 6:20 AM, Enrico Weigelt, metux IT consult <enrico.weig...@gr13.net> wrote:

> On 31.12.2014 01:56, Jude Nelson wrote:
>
> Hi,
>
> > A much more elegant solution would be to give each session its own
> > /dev like you were originally saying--it would allow users to
> > interact with different devices under the same name, while also
> > preserving POSIX filesystem semantics.
>
> Yes, I really think, separate namespaces are the correct way to do.
>
> Actually, I didn't even think about ACLs (which introduce extra
> dimensions orthogonal to the filesystem tree), but doing everything
> via separate /dev namespaces.
>
> One interesting question here is whether we should do our own
> namespacing (within vdev itself), or just use the kernel infrastructure
> for that. (by the way: does anybody here know how other kernels,
> like *bsd handle namespaces ?)
>
> Maybe we could go through some scenarios, where you'd currently use
> ACLs and check whether they could be done better w/ namespaces.
> (in fact, I prefer not to use ACLs, due to additional complexity)
>
> One example is session isolation: here I'm pretty sure that, on login
> or session start, a proper namespace should be constructed, before
> calling the login shell is started. Do you see any reason for not
> going that way ?
>
> By the way: does vdev's ACL handling also allow revoking permissions
> to some device even on already opened fd's ?
>
>
> cu
> --
> Enrico Weigelt,
> metux IT consulting
> +49-151-27565287
>
_______________________________________________
Dng mailing list
Dng@lists.dyne.org
https://mailinglists.dyne.org/cgi-bin/mailman/listinfo/dng