Re: Is HURD's lack of HOST_NAME_MAX and PATH_MAX a good architectural approach

Guillem Jover Mon, 20 Jan 2025 18:26:34 -0800

Hi!

On Mon, 2025-01-20 at 13:21:32 -0700, Sam Hartman wrote:
> TL;DR: Is it time for the rest of Debian to stop conforming to HURD's
> lack of maximums for path and hostname? By this point I think we
> recognize those lack of maximums as an anti-pattern for DOS prevention
> and other security reasons.


While I agree with some of your premises below, I do not agree with the
conclusions; for a long time my own conclusion has actually been the
opposite.

Disclosure: I used to be part of the Debian Hurd porters team,
maintained mig and gnumach for a while, and even packaged L4 Pistachio
(an alternative microkernel) for a while, when there were discussions
about switching away from GNU Mach. I still try to follow what's going
on in the port, and sporadically review or port stuff. I still consider
myself a porter (in general) at heart.

> I will admit I was kind of disappointed that rather than working to make
> my package handle arbitrary hostnames, the patch simply introduced an
> arbitrary constant for HURD.

I have not checked this specific case, but I think in general there
have been several reasons for the different kinds of patches being sent.
These range from the general experience of the porter, their knowledge
of the code being ported, and the apparent urgency of getting something
fixed, to whether dropping arbitrary limits would break exposed ABI,
the receptiveness of the receiving maintainers and their desire for
more correct/robust but potentially more intrusive patches, or their
desire for more minimal patches to simply fix FTBFS problems, etc.

Without more details, I'd consider the patch you describe a
workaround, which I'd have asked to be reworked if I had reviewed
it, or for which I would have provided an alternative myself. And I
think these kinds of patches are frowned upon by the Hurd porters in
general.

> HURD managed to explore a lot of interesting ground. Through
> explorations like Plan9 and HURD, things like namespaces, fuse, and
> other features revolutionized the Linux world, spawning important
> innovations in themselves like containerization.

(Sadly, many of the things that got into Linux feel like they got
bolted on, in many cases in unnatural or complex ways, with harder
to grasp semantics and increased security exposure.)

> But I think that HURD's desire to  remove arbitrary limits like hostname
> and path maximums have proven not to be winners. We ran the experiment,
> and I at least think the conclusion is that we'd be better off with
> limits.

I disagree with both statements.

The way I've always interpreted the stance against arbitrary limits is
that the code should be robust and be prepared to handle data of any
length, while dynamically handling truncation and underlying limits
(say from specific filesystem implementations or protocols, allocation
failures, etc., because we live in a limited world anyway).
And not baking in such hardcoded limits means that the code is ready for
any such underlying limits to be bumped freely, or to change beneath us
due to underlying implementation changes. AFAIR this is currently a
problem on GNU/Linux for example, because even if we wanted to increase
the pathname or other such arbitrary limits, these are part of the ABI
now. :/
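
As a minimal sketch of what "handle data of any length while dynamically
handling truncation" can look like in C (this is an illustration, not
code from any package discussed here; the helper name read_link_any and
the starting size of 64 are assumptions made for the example):

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: read a symlink target of any length.
 * Returns a malloc()ed, NUL-terminated string that the caller frees,
 * or NULL on error with errno set by readlink() or realloc(). */
char *
read_link_any(const char *path)
{
	size_t size = 64;	/* Starting guess, not a limit. */
	char *buf = NULL;

	assert(path != NULL);

	for (;;) {
		char *tmp = realloc(buf, size);
		ssize_t len;

		if (tmp == NULL) {
			int saved = errno;

			free(buf);
			errno = saved;
			return NULL;
		}
		buf = tmp;

		len = readlink(path, buf, size);
		if (len < 0) {
			int saved = errno;

			free(buf);
			errno = saved;
			return NULL;
		}

		/* readlink() truncates silently, so a completely full
		 * buffer may hold a truncated result: grow and retry. */
		if ((size_t)len < size) {
			buf[len] = '\0';
			return buf;
		}
		size *= 2;
	}
}
```

The only limits left are the ones the system actually imposes at run
time: readlink() failing, or memory running out.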

And while I agree that specifically in security-sensitive contexts
it might unfortunately be necessary to add arbitrary limits for security
reasons as you mentioned, this to me still does not translate to the
examples given with pathnames and hostnames, and other similar system
resources. And a limit that might seem reasonable today will most
probably be a hindrance years ahead or with bigger equipment, and
might easily hamper functionality. So I'd still be very wary of such
limits.

> Here are some of the concerns:
> 
> * Having different limits in different parts of the system can lead to
>   security problems.   On Linux, when I have something that I know is
>   a valid path, say because it's coming from the kernel, I know it fits
>   in PATH_MAX. I don't need to worry that some other program has a
>   different idea of PATH_MAX and I might need to deal with bounds
>   checking or truncation attacks.
>   However, on HURD, it's probably not sufficient to just create a
>   PATH_MAX or HOST_NAME_MAX buffer.  I probably also need to think about
>   what happens when something gets truncated or fails to fit into that
>   buffer.  I need to think about DOS and other potential attacks.
>   I know I have not been diligent about reviewing the HURD compatibility
>   patches for these sorts of issues over the years.

I agree with this concern, but I actually see it in reverse. Because
GNU/Linux has this hardcoding, there is less care for truncation,
as you mention, which can still happen, because you might not control
what another program, a user on the same system, or even another
system with a different limit is passing to you. Even on GNU/Linux it's
still possible to have different limits and restrictions depending on
what you are working on, say odd filesystems, network filesystems, etc.

Check for example:

  https://en.wikipedia.org/wiki/Comparison_of_file_systems#Limits

There, for both NAME_MAX and PATH_MAX, there are cases of lower and
higher limits (or no limits at all). Even nesting multiple mount
points can easily take you past the limits.

> * To some extent, there are intrinsic limits that are related. The DNS
>   does have a maximum for a domain name, and while that's not strictly
>   the same thing as a host name, practically speaking, we want host names
>   to be able to fit in domain names.

As mentioned above, there will always be some limit somewhere, if
only due to available memory. But even focusing on current
protocol limits seems problematic, because that also ossifies what can
be done in the future.

> * The kind of dynamic memory handling required for avoiding arbitrary
>   limits introduces significant complexity. You need to have some limit
>   at some level to avoid resource exhaustion attacks. Having constant
>   size structures for things like stat buffers and unix domain sockets
>   is a lot simpler than dynamically allocating everything.  Bounds
>   checking at compile time has value. So does avoiding all dynamic
>   allocation in critical sections of system resources.

Over the years I've found that the changes to remove these limits
make (in general) the code more robust and future-proof, can even
simplify it, and in some cases push one to use better APIs to accomplish
the job.
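
One example of such an API, as a sketch: since POSIX.1-2008, realpath()
allocates a buffer of the right size when its second argument is NULL,
so no PATH_MAX-sized array is needed. The function
same_canonical_path below is a hypothetical helper written for this
illustration:

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: check whether two paths resolve to the same
 * canonical path, without any fixed-size path buffer.
 * Returns 1 if they match, 0 if they differ, -1 if either path
 * cannot be resolved. */
int
same_canonical_path(const char *a, const char *b)
{
	char *ra, *rb;
	int same;

	assert(a != NULL && b != NULL);

	/* With a NULL buffer, realpath() malloc()s the result. */
	ra = realpath(a, NULL);
	if (ra == NULL)
		return -1;

	rb = realpath(b, NULL);
	if (rb == NULL) {
		free(ra);
		return -1;
	}

	same = strcmp(ra, rb) == 0;
	free(ra);
	free(rb);

	return same;
}
```

Compared with a version built around char buf[PATH_MAX], there is no
buffer size to get wrong, and the code builds unchanged on systems that
do not define PATH_MAX.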

> The latest version of pam is not building on hurd-i386 and hurd-amd64.
> One of the issues is HOST_NAME_MAX in modules/pam_xauth/pam_xauth.c.
> I'm sure the hurd porters would send me a patch if I asked for one. I'm
> sure I could come up with a patch on my own.

For this case, notice how the concern that you mentioned above is
quite present here with GNU/Linux exposing _POSIX_HOST_NAME_MAX,
HOST_NAME_MAX and MAXHOSTNAMELEN with diverging values. If the code is
changed to remove arbitrary limits, then these divergences suddenly
disappear.
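
For illustration only (this is not the actual pam_xauth code nor a
proposed patch), a hostname can be fetched without any of those
constants by growing the buffer until the name fits. The helper name
get_hostname_any is hypothetical, and treating ENAMETOOLONG as "buffer
too small" is an assumption based on glibc behaviour; other systems
may report truncation differently:

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: return the hostname in a malloc()ed string,
 * without relying on HOST_NAME_MAX. Returns NULL on error. */
char *
get_hostname_any(void)
{
	size_t size = 64;	/* Starting guess, not a limit. */
	char *buf = NULL;

	for (;;) {
		char *tmp = realloc(buf, size);

		if (tmp == NULL) {
			free(buf);
			return NULL;
		}
		buf = tmp;

		/* Zero the buffer and pass size - 1, so the result is
		 * always NUL-terminated, even if gethostname() truncates
		 * without terminating (which POSIX allows). */
		memset(buf, 0, size);
		errno = 0;
		if (gethostname(buf, size - 1) == 0) {
			/* Last usable byte untouched: the name fit. */
			if (buf[size - 2] == '\0')
				return buf;
		} else if (errno != ENAMETOOLONG) {
			int saved = errno;

			free(buf);
			errno = saved;
			return NULL;
		}

		/* Possibly truncated: grow and retry. */
		size *= 2;
	}
}
```

The caller frees the result; the question of which of the three
constants to trust never comes up.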

> My question though is whether that's architecturally a good idea.

Yes, I think it's the better option, for robustness, for
future-proofness, for functionality, for security.

While it might seem annoying, because using a hardcoded limit seems
easier, this also looks like a trap, where one ends up ignoring cases
that are silently there but might not be obvious and will still
affect the code, and ossify the whole system impeding future
improvements (although for existing GNU/Linux ports, that's probably
too late anyway).

> As a maintainer, I'm willing to accept the patch if we believe that
> HURD's approach actually is a good one.
> For a non-release architecture that I perceive as more on the way out
> than on the way in,

hurd-i386 is probably on the way out, but that will eventually be
replaced by hurd-amd64 which seems to be coming along nicely.

> I'm not interested in accepting a patch if we think
> the architectural approach is an anti-pattern.
> Yes, that does put the hurd maintainers in an awkward position: pam is
> transitively essential.
> 
> * They could agree that particular aspect of the HURD experiment is not
>   a success and patch system include files.

I think that would be a mistake that would tie the port into a
position similar to GNU/Linux where backing out of it is hard to
impossible, and where that port then needs to live with the
consequences indefinitely.

> * They could find a way to patch pam only for HURD.  I think that would
>   be a bad precedent, but I couldn't stop them.

This has already been possible for a long time. Long ago we introduced
the "unreleased" suite to soft-fork required packages which
were urgent to fix to unblock the buildds, or for which there were
only workaround patches, or where the maintainer didn't want to accept a
patch, or for similar reasons. This is cumbersome though, as these
forks need to be kept in sync.

> * They could take the issue to TC, either as a question about this
>   specific issue, or asking the TC to set policy on what ports patches
>   maintainers should accept.

It's my long-standing stance that going to the TC is always detrimental
to the project: a failure of the community at large, one that ruptures
the social fabric of the project and that, given the TC's power and
structure, is unjust in nature. If this were ever needed, I think going
upstream directly or just using the "unreleased" suite is always
going to be socially better and more productive anyway.

Thanks,
Guillem
