Ludovic Courtès <l...@gnu.org> writes:

> Hello,
>
> (Cc: Reepca.)
>
> keinflue <keinf...@posteo.net> writes:
>
>> It seems that the "chown to overflowgid" issue is somewhat
>> widespread. I also see the testsuite for go (bootstrap) failing in the
>> same way. I'd guess most implementations of "chown" system call
>> wrappers in various languages will have test cases like this that fail
>> to anticipate user namespaces. I will let my system build keep running
>> a bit longer and will then post the list of packages I found with log
>> excerpts here.
>
> I think it would be best to support chown-to-supplementary-group even
> with the unprivileged daemon (specifically for the case where
> guix-daemon runs as a dedicated user that has one supplementary
> group, kvm).
>
> The attached patch tries to do that, by calling out to ‘newuidmap’, and
> under the assumption that /etc/subgid allows mapping the ‘kvm’ group.
>
> It does the job (a build process can chown to ‘kvm’), but I couldn’t get
> the GID mapping preserved across the ‘unshare’ call (the call that is
> made to “lock” mounts), hence the “#if 0” there.
>
> The problem is that when we call ‘unshare’, the ‘newgidmap’ setuid
> binary is no longer accessible because we’re already in a chroot, so it
> seems that we cannot preserve the GID map.

... and even if we had the setuid binary accessible (for example via a
saved file descriptor that could be used with execveat, or a
bind-mount), it wouldn't be of any use at this point because (man
user_namespaces):

"if either the user or the group ID of the file has no mapping inside
the namespace, the set-user-ID (set-group-ID) bit is silently ignored:
the new program is executed, but the process's effective user (group) ID
is left unchanged."

Naturally, uid 0 isn't going to be mapped!  In fact, more generally,
newuidmap and newgidmap can't ever be used from within an uninitialized
user namespace, since by definition uid 0 isn't yet mapped in it.

So it falls to the parent process to do the initialization - that is, it
now has to do the initialization twice.  Of course, it's going to need
some way of knowing when the second user namespace has been created, and
the child is going to need some way of knowing when it's been
initialized, so we'll need to either use two pipes or switch to using a
socketpair.
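
To make the synchronization concrete, here is a rough sketch of the
two-way handshake over a socketpair, repeated once per user namespace.
It is not the daemon's code: the function name and the one-byte
messages are made up for illustration, and the places where the child
would call unshare and the parent would run newuidmap/newgidmap are
only comments, so that the sketch runs without any privileges.

```python
import os
import socket


def run_handshake(rounds=2):
    """Hypothetical sketch: parent and child synchronize once per namespace.

    Returns the ordered list of events as observed by the parent.
    """
    parent_sock, child_sock = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child side.
        parent_sock.close()
        try:
            for _ in range(rounds):
                # Placeholder: the child would create a user namespace here.
                child_sock.sendall(b"U")   # "namespace created"
                ack = child_sock.recv(1)   # wait until the parent mapped IDs
                if ack != b"M":
                    os._exit(1)
            os._exit(0)
        except BaseException:
            os._exit(2)

    # Parent side.
    child_sock.close()
    events = []
    for i in range(1, rounds + 1):
        if parent_sock.recv(1) != b"U":
            break
        events.append("unshared-%d" % i)
        # Placeholder: the parent would initialize the ID maps of the
        # child's new namespace here (for example via newgidmap).
        events.append("mapped-%d" % i)
        parent_sock.sendall(b"M")          # "namespace initialized"
    parent_sock.close()
    _, status = os.waitpid(pid, 0)
    events.append("exit-%d" % os.WEXITSTATUS(status))
    return events


if __name__ == "__main__":
    print(run_handshake())
```

A single socketpair is enough here because each side only ever waits
for the other's next message; two pipes would work just as well.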

Of some concern also is the ominous statement in "man newgidmap" that
"Note that newgidmap may be used only once for a given process."  I have
no idea how or why it would enforce this, and I'm going to assume for
now that what is actually meant is that "a given user namespace's gid
mapping cannot be written more than once", which is just a restatement
of what "man user_namespaces" says.
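
If that reading is right, the parent could check whether a namespace's
map has already been written before trying to write it.  A small
sketch, assuming that an unwritten gid_map reads back as empty and that
each line holds three numbers (ID inside the namespace, ID outside,
length of the range); the function names are hypothetical:

```python
def parse_id_map(text):
    """Parse uid_map/gid_map contents into (inside, outside, count) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if len(fields) != 3:
            raise ValueError("unexpected id map line: %r" % line)
        inside, outside, count = (int(field) for field in fields)
        entries.append((inside, outside, count))
    return entries


def is_initialized(text):
    """Assumption: a map with no entries has not been written yet."""
    return bool(parse_id_map(text))


def read_gid_map(pid="self"):
    """Read the raw gid_map of a process (Linux only)."""
    with open("/proc/%s/gid_map" % pid) as map_file:
        return map_file.read()
```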



It's too bad that the user namespaces implementation doesn't allow
unprivileged users to map their own supplementary groups.  I can't think
of any reason not to - a user can already switch their effective gid to
any supplementary gid they have by creating and executing a setgid
program for that gid.  Are there any operations that require both a
capability /and/ a (mapped) egid check to pass?

It wouldn't surprise me if the ultimate reason is that the kernel devs
wanted to reuse code for uid_map and gid_map, and that it's easier to
verify one line than try to verify an arbitrary set of gids, including
ones in potentially-large ranges.  It's just an unfortunate consequence
that this means that all unprivileged user namespaces have to carry
gibberish supplementary groups around that they can "use" but not
comprehend.

But as long as /etc/subgid is configured to allow each user to map all
of the groups they are a member of, it can at least be worked around.
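
For illustration, here is a sketch that produces such /etc/subgid
entries, one single-GID range per group, in the name:start:count format
that file uses.  The helper names are made up, and whether a given
newgidmap accepts entries like these for a user's own groups is the
assumption the workaround rests on:

```python
import os
import pwd


def subgid_lines(user, gids):
    """Return one 'user:gid:1' line per distinct GID (hypothetical helper)."""
    return ["%s:%d:1" % (user, gid) for gid in sorted(set(gids))]


def lines_for_current_user():
    """Build entries for the calling user from its supplementary groups."""
    user = pwd.getpwuid(os.getuid()).pw_name
    return subgid_lines(user, os.getgroups())
```

With a made-up GID of 990 for kvm, subgid_lines("guix-daemon", [990])
gives the single line "guix-daemon:990:1".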

- reepca
