On 25.04.2025 20:39, Ludovic Courtès wrote:
Hi,

I committed the /etc/group fix in
0d3bc50b0cffeae05beb12d0c270c6599186c0d7 together with a test.

keinflue <keinf...@posteo.net> writes:

I think this happens if the user running guix-daemon has supplementary
groups. These are not mapped via /proc/gid_map in the build container
and therefore are reported as the overflow gid (65534) by getgroups.

The test cases assume that they can change ownership to this
additional group but that is not permitted on the overflow gid.

I think supplementary groups should be dropped in the user namespace
for the build container to make the behavior
reproducible. Unfortunately this may be impossible if the parent
namespace has set /proc/[...]/setgroups to "deny".

I came up with this test:

--8<---------------cut here---------------start------------->8---
(use-modules (guix)
             (gcrypt hash)
             (gnu packages bootstrap))

(computed-file "kvm-access"
               #~(begin
                   (pk '#$(gettimeofday))
                   (let ((st (stat "/dev/kvm")))
                     (pk '/dev/kvm st)
                     (pk '/dev/kvm:owner (stat:uid st) (stat:gid st))
                     (pk 'getgroups (getgroups))
                     ;; XXX: When running the daemon as root, /dev/kvm is
                     ;; owned by UID 0, which has no entry in /etc/passwd.
                     ;; (pk 'kvm-user (getpwuid (stat:uid st)))
                     ;; XXX: /etc/group never contained an entry to the "kvm"
                     ;; group so the thing below always failed.
                     ;; (pk 'kvm-group (getgrgid (stat:gid st)))
                     )
                   (when (open-fdes "/dev/kvm" O_RDWR)
                     (mkdir #$output)))
               #:guile %bootstrap-guile)
--8<---------------cut here---------------end--------------->8---

Privileged:

--8<---------------cut here---------------start------------->8---
$ guix build -f ~/src/guix-debugging/dev-kvm-access.scm
substitute: looking for substitutes on 'http://192.168.1.48:8123'...
0.0%guix substitute: warning: 192.168.1.48: connection failed:
Connection timed out
substitute:
substitute: looking for substitutes on 'https://ci.guix.gnu.org'... 100.0%
substitute: looking for substitutes on 'https://bordeaux.guix.gnu.org'... 100.0%
substitute: looking for substitutes on
'https://guix.bordeaux.inria.fr'... 100.0%
The following derivation will be built:
  /gnu/store/vc5p6bfrzr7khgp9jha8h6kplixcl5h6-kvm-access.drv
substitute: looking for substitutes on 'http://192.168.1.48:8123'... 0.0%
building /gnu/store/vc5p6bfrzr7khgp9jha8h6kplixcl5h6-kvm-access.drv...

;;; ((1745606160 . 233876))

;;; (/dev/kvm #(6 483 8624 1 0 984 2792 0 1745359386 1745359386
1745359386 4096 0 char-special 432 382791307 382791307 1745359386))

;;; (/dev/kvm:owner 0 984)

;;; (getgroups #(984 30000))
successfully built /gnu/store/vc5p6bfrzr7khgp9jha8h6kplixcl5h6-kvm-access.drv
/gnu/store/36fin1iw2fh9066jg0y2fjd78j9wyjwp-kvm-access
--8<---------------cut here---------------end--------------->8---

Unprivileged:

--8<---------------cut here---------------start------------->8---
$ ./test-env guix build -f ~/src/guix-debugging/dev-kvm-access.scm
accepted connection from pid 2591, user ludo
accepted connection from pid 2601, user ludo
substitute: guix substitute: warning: ACL for archive imports seems to
be uninitialized, substitutes may be unavailable
substitute: guix substitute: warning: authentication and authorization
of substitutes disabled!
The following derivation will be built:

/home/ludo/src/guix/test-tmp/store/5p4qn8d3bgnj60a2kwpliiwk81bvrcjp-kvm-access.drv
substitute: guix substitute: warning: authentication and authorization
of substitutes disabled!
building
/home/ludo/src/guix/test-tmp/store/5p4qn8d3bgnj60a2kwpliiwk81bvrcjp-kvm-access.drv...

;;; ((1745606200 . 636919))

;;; (/dev/kvm #(6 483 8624 1 65534 65534 2792 0 1745359386 1745359386
1745359386 4096 0 char-special 432 382791307 382791307 1745359386))

;;; (/dev/kvm:owner 65534 65534)

;;; (getgroups #(65534 65534 65534 65534 65534 65534 65534 30000 65534))
successfully built
/home/ludo/src/guix/test-tmp/store/5p4qn8d3bgnj60a2kwpliiwk81bvrcjp-kvm-access.drv
/home/ludo/src/guix/test-tmp/store/ffh8zaw279dgdsh6q54mlldh4nikxiqp-kvm-access
--8<---------------cut here---------------end--------------->8---

In both cases, /dev/kvm is accessible.

In both cases, only the primary group has an entry in /etc/group;
supplementary groups are lacking.

So:

  1. I don’t think we need to map the “kvm” UID/GID into the user
     namespace;

For the purpose of the passive permission checks that is not necessary, yes. No uids or gids are translated between the user namespaces for those. However, if all supplementary groups were dropped, that would include the kvm group, and this test would then fail to access /dev/kvm. That was the problem I saw with that first suggestion.

  2. I’m confused as to what makes the Coreutils test suite fail.

The result from getgroups includes both the primary gid 30000 and the gid 65534, repeated several times. Each 65534 is the overflow gid, which is how a supplementary gid appears when it isn't mapped into the user namespace via /proc/[pid]/gid_map.
Coreutils sees this and so assumes that it can do the equivalent of

touch testfile
chgrp 65534 testfile

to create a file initially owned by group 30000 and then change its group ownership to 65534. Normally an unprivileged user is allowed to change the group ownership of files they own to any group they are a member of, so this would always succeed outside a user namespace context.

However, any uid/gid used inside the user namespace is translated back to the host namespace via uid_map/gid_map before permission checks. In this case, 65534 doesn't map back to any gid in the host namespace, so the syscall fails.

If getgroups reports no supplementary group at all, then coreutils just skips the test and all is well again. The coreutils test case should probably remove any gid equal to the overflow gid from the getgroups result before making that decision.
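
As a rough sketch of that filtering (Python used purely for illustration; the function names are made up, and this is not what the Coreutils test suite does today):

```python
import os

# Kernel default; the real value is configurable via sysctl.
DEFAULT_OVERFLOW_GID = 65534


def mapped_groups(groups, overflow_gid=DEFAULT_OVERFLOW_GID):
    """Return the gids in `groups` that are not the overflow gid.

    Order is preserved and duplicates are dropped.
    """
    seen = set()
    result = []
    for gid in groups:
        if gid != overflow_gid and gid not in seen:
            seen.add(gid)
            result.append(gid)
    return result


def other_usable_groups(groups, primary_gid, overflow_gid=DEFAULT_OVERFLOW_GID):
    """Return mapped gids other than the primary gid.

    An empty result would mean "skip the chgrp test".
    """
    return [g for g in mapped_groups(groups, overflow_gid) if g != primary_gid]


if __name__ == "__main__":
    # Groups of the current process, minus unmapped ones and the primary gid.
    print(other_usable_groups(os.getgroups(), os.getgid()))
```

With the getgroups result from the unprivileged run above and primary gid 30000, the list of other usable groups comes out empty, so the test would be skipped rather than fail.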

Dropping all supplementary groups from the build process (after unshare and before writing "deny" to /proc/[pid]/setgroups) would make getgroups report only 30000, so this test case would always be skipped. However, that would also drop the kvm group, as mentioned above, and it is not permitted in all environments (e.g. when the parent namespace has already set /proc/[pid]/setgroups to "deny").
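
Whether that door is already closed can be read back from the same file. A minimal sketch, assuming only that /proc/[pid]/setgroups contains "allow" or "deny" (the helper names and the `proc_root` parameter are hypothetical, the latter only there to make the function testable):

```python
from pathlib import Path


def setgroups_state(pid="self", proc_root="/proc"):
    """Return the content of /proc/PID/setgroups ("allow" or "deny").

    Returns None if the file cannot be read, e.g. because the process is
    gone or the kernel does not provide the file.
    """
    path = Path(proc_root) / str(pid) / "setgroups"
    try:
        return path.read_text().strip()
    except OSError:
        return None


def setgroups_denied(pid="self", proc_root="/proc"):
    """True if setgroups is denied for PID's user namespace.

    Note that "allow" alone does not mean groups can be dropped: the
    caller still needs the right capability in that namespace.
    """
    return setgroups_state(pid, proc_root) == "deny"
```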

So I think that instead either all supplementary groups of the user, or at least the kvm group specifically, need to be mapped via /proc/[pid]/gid_map. With that, getgroups would report 30000 and 984 (assuming an identity gid map for 984) in your test case above, and the coreutils test case would work again, because

chgrp 984 testfile

would then succeed, with 984 mapping back to a supplementary group of the process in the host namespace.
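
To illustrate the translation, here is a small sketch that models the gid_map lookup in Python. It only parses the three-column format (id inside the namespace, id in the parent namespace, range length) and does not touch any real namespace. The host gid 1000 for the daemon user's primary group is a made-up value, and both map contents are hypothetical:

```python
def parse_id_map(text):
    """Parse uid_map/gid_map content into (inside, outside, count) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            entries.append(tuple(int(field) for field in fields))
    return entries


def to_parent_id(entries, inside_id):
    """Translate an id from inside the namespace to the parent namespace.

    Returns None when no map entry covers `inside_id`, which is the case
    where the kernel refuses the operation.
    """
    for inside_start, outside_start, count in entries:
        if inside_start <= inside_id < inside_start + count:
            return outside_start + (inside_id - inside_start)
    return None


# Hypothetical map with only the primary group mapped.
PRIMARY_ONLY = "30000 1000 1\n"
# Hypothetical map that additionally maps kvm (984) to itself.
WITH_KVM = "30000 1000 1\n984 984 1\n"

if __name__ == "__main__":
    print(to_parent_id(parse_id_map(PRIMARY_ONLY), 65534))  # no mapping
    print(to_parent_id(parse_id_map(WITH_KVM), 984))        # identity mapping
```

With the first map, 65534 has nothing to translate back to, which corresponds to the failing chgrp; with the second, 984 translates to host gid 984.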

From the point of view of reproducibility and information leakage into the build container, however, I think it would be preferable not to retain supplementary groups if possible. In contrast to the privileged build, where a distinct build user can be given the desired supplementary groups at will, the unprivileged environment may be one where the supplementary groups of the user running the daemon can't easily be changed to what is supposed to be seen in the build environment.

The contents of /etc/group are not relevant for this test case failure; they are never consulted.

But a few other asides (for which I don't necessarily think anything should be changed):

- I also noticed that the build container's /etc/group is written with 65534 assumed as the overflow gid. I am not sure whether anyone actually changes it, but the overflow uid/gid are technically configurable (and retrievable) via sysctl entries (/proc/sys/kernel/overflowuid and /proc/sys/kernel/overflowgid). 65534 is just the default value.

- I also noticed that the operating-system defaults do not write an entry for the overflow gid to /etc/group (while they do write one for the overflow uid to /etc/passwd). I think such an entry should exist by default as well. The /etc/passwd entry also assumes the default overflow uid of 65534. This isn't only relevant in a user namespace context, but also for file systems that can't represent the whole range of Linux uids/gids.
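
For reference, reading the configured overflow ids instead of assuming 65534 could look roughly like this. It is only a sketch: the function name and the `proc_root` parameter are hypothetical (the latter exists so the function can be exercised against a fake directory tree), and falling back to 65534 when the file is unreadable is my assumption.

```python
from pathlib import Path

DEFAULT_OVERFLOW_ID = 65534  # kernel default for both uid and gid


def read_overflow_id(kind, proc_root="/proc", default=DEFAULT_OVERFLOW_ID):
    """Read the overflow uid or gid from PROC_ROOT/sys/kernel/overflow{uid,gid}.

    `kind` must be "uid" or "gid". Falls back to `default` when the file
    is missing or does not contain an integer.
    """
    if kind not in ("uid", "gid"):
        raise ValueError("kind must be 'uid' or 'gid'")
    path = Path(proc_root) / "sys" / "kernel" / ("overflow" + kind)
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return default


if __name__ == "__main__":
    print(read_overflow_id("uid"), read_overflow_id("gid"))
```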

It would still be good to drop any supplementary group other than “kvm”
though.

WDYT?

Thanks,
Ludo’.