On Thu, May 08, 2025 at 01:18:20AM -0700, John Johansen wrote: > On 5/7/25 23:06, Song Liu wrote: > > On Wed, May 7, 2025 at 8:37 AM Maxime Bélair > > <[email protected]> wrote: > > [...] > > > > > > > > These two do not feel like real benefits: > > > > - One syscall cannot fit all use cases well... > > > > > > This syscall is not intended to cover every case, nor to replace existing > > > kernel > > > interfaces. > > > > > > Each LSM can decide which operations it wants to support (if any). For > > > example, when > > > loading policies, an LSM may choose to allow only policies that further > > > restrict > > > privileges. > > > > > > > - Not working in containers is often not an issue, but a feature. > > > > > > Indeed, using this syscall requires appropriate capabilities and will not > > > permit > > > unprivileged containers to manage policies arbitrarily. > > > > > > With this syscall, capability checks remain the responsibility of each > > > LSM. > > > > > > For instance, in the AppArmor patch, a profile can be loaded only if > > > aa_policy_admin_capable() succeeds (which requires CAP_MAC_ADMIN). > > > Moreover, by design, > > > policies can be loaded only in the current namespace. > > > > > > I see this syscall as a middle point between exposing the entire sysfs, > > > creating a large > > > attack surface, and blocking everything. > > > > > > Landlock’s existing syscalls already improve security by allowing > > > processes to further > > > restrict their ambient rights while adding only a modest attack surface. > > > > > > This syscall is a further step in that direction: it lets LSMs add > > > restrictive policies > > > without requiring exposing every other interface. > > > > I don't think a syscall makes the API more secure. If necessary, we can add > > It exposes a different attack surface. Requiring mounting of the fs to where > it is visible > in the container, provides attack surface, and requires additional external > configuration.
We should also keep in mind that syscalls could be accessible from everywhere, by everyone, which may increase the attack surface compared to a privileged filesystem interface. Adding a second interface may also introduce issues. Anyway, I'm definitely not against syscalls, but I don't see why the filesystem interface would be "less secure" in this context. > > Then there is the whole issue of getting the various LSMs to allow another > LSM in the > stack to be able manage its own policy. Right, and it's a similar issue with seccomp policies wrt syscalls. > > > permission check to each pseudo file. The downside of the syscall, however, > > is that all the permission checks are hard-coded in the kernel (except for > > The permission checks don't have to be hard coded. Each LSM can define how it > handles > or manages the syscall. The default is that it isn't supported, but if an lsm > decides > to support it, there is now reason that its policy can't determine the use of > the > syscall. >From an interface design point of view, it would be better to clearly specify the scope of a command (e.g. which components could be impacted by a command), and make sure the documentation reflect that as well. Even better, have a syscalls per required privileges and impact (e.g. privileged or unprivileged). Going this road, I'm not sure if a privileged syscall would make sense given the existing filesystem interface. > > > BPF LSM); while the sys admin can configure permissions of the pseudo > > files in user space. > > > Other LSMs also have policy that can control access to pseudo filesystems and > other resources. Again, the control doesn't have to be hard coded. And > seccomp can > be used to block the syscall. > > > > > > Again, each module decides which operations to expose through this > > > syscall. In many cases > > > the operation will still require CAP_SYS_ADMIN or a similar capability, > > > so environments > > > that choose this interface remain secure while gaining its advantages. > > > > > > > > - Avoids overhead of other kernel interfaces for better efficiency > > > > > > > > .. and it is is probably less efficient, because everything need to > > > > fit in the same API. > > > > > > As shown below, the syscall can significantly improve the performance of > > > policy management. > > > A more detailed benchmark is available in [1]. > > > > > > The following table presents the time required to load an AppArmor > > > profile. > > > > > > For every cell, the first value is the total time taken by aa-load, and > > > the value in > > > parentheses is the time spent to load the policy in the kernel only > > > (total - dry‑run). > > > > > > Results are in microseconds and are averaged over 10 000 runs to reduce > > > variance. > > > > > > > > > | t (µs) | syscall | pseudofs | Speedup | > > > |-----------|-------------|-------------|---------------| > > > | 1password | 4257 (1127) | 3333 (192) | x1.28 (x5.86) | > > > | Xorg | 6099 (2961) | 5167 (2020) | x1.18 (x1.47) | > > > > > > > I am not sure the performance of loading security policies is on any > > critical path. > > generally speaking I agree, but I am also not going to turn down a > performance improvement either. Its a nice to have, but not a strong > argument for need. > > > The implementation calls the hook for each LSM, which is why I think the > > syscall is not efficient. > > > it should only call the LSM identified by the lsmid in the call. > > > Overall, I am still not convinced a syscall for all LSMs is needed. To > > justify such > > its not needed by all LSMs, just a subset of them, and some nebulous > subset of potentially future LSMs that is entirely undefinable. > > If we had had appropriate LSM syscalls landlock wouldn't have needed > to have landlock specific syscalls. Having another LSM go that route > feels wrong especially now that we have some LSM syscalls. I don't agree. Dedicated syscalls are a good thing. See my other reply. > If a > syscall is needed by an LSM its better to try hashing something out > that might have utility for multiple LSMs or at the very least, > potentially have utility in the future. > > > > a syscall, I think we need to show that it is useful in multiple LSMs. > > Also, if we > > really want to have single set of APIs for all LSMs, we may also need > > get_policy, > > We are never going to get a single set of APIs for all LSMs. I will > settle for an api that has utility for a subset > > > remove_policy, etc. This set as-is appears to be an incomplete design. The > > To have a complete design, there needs to be feedback and discussion > from multiple LSMs. This is a starting point. > > > implementation, with call_int_hook, is also problematic. It can easily > > cause some> controversial behaviors. > > > agreed it shouldn't be doing a straight call_int_hook, it should only > call it against the lsm identified by the lsmid Yes, but then, I don't see the point of a "generic" LSM syscall.
