Re: critique of gnulib

Jonas 'Sortie' Termansen Sun, 01 Sep 2019 14:53:35 -0700

Hi Bruno & Paul,

Thanks for your interest in my notes on gnulib. I appreciate your desire
to discuss them. Please excuse some of the resources that I'll link
below for overreacting in jest ("screaming in horror", "usual gnulib
infection"). They weren't really meant for upstream consumption, but
they do refer to true struggles that I've had to go through. I
understand and appreciate that you and I have different goals, but I
think we do have some common ground. I'll give you the context of my
notes below.


My goal is to make a good and clean POSIX-like operating system, and to
enable other people to do so as well because healthy competition is good.

I have contributed to gnulib in the past, although my main interest is
my operating systems project and I only work on ports as needed. This
year I am dedicating my time to my game development project, so I don't
have the resources to do OS work these days. Please excuse me if my
information is out of date.

I rather enjoy my operating systems work on Sortix, though, because I've
accomplished something extraordinary: A self-hosting POSIX-like system
made from scratch with key ports of third party software (about 75% of
ports compile natively, so I rely on cross-compilation). What really
surprised me pleasantly was that I did not have the same backwards
compatibility concerns of a GNU/Linux system. By doing things in a
simple way without historical mistakes, my project got a good baseline
quality.

This really accomplishes the best qualities of a free software platform,
because all the code and ports are integrated and are easy to modify
(you can 'cd /src && make sysmerge' out of the box and the system is
updated to the new sources of everything). I'm free to compete with
other POSIX-like systems by making better implementations of the
standard interfaces and in turn encouraging other systems to improve.
Meanwhile I have encountered technical debt in other projects that I
port, and I sometimes fix those issues and contribute fixes upstream,
improving the health of the free software ecosystem.

My method for porting software is to cross-compile the software to my OS
(see <https://wiki.osdev.org/Cross-Porting_Software>). Sometimes I have
to fix some build system bugs. Then I fix the compile errors and
warnings, if any. That may require implementing some new features in my
OS. Finally it compiles and I run it. It might not work or crash, and I
fix those bugs too. Finally I package up the thing and I might send a
patch to the upstream if I like the software and it's easy and the
upstream seems receptive.

I find BSD software can be easier to port than GNU software, even though
it often does not even attempt to be portable. I can easily deal with it
because it simply fails to compile and I can easily figure out what I
need to provide, or just change the software to use a different feature.
GNU software can be much harder to port because of complexities
in layers like autoconf or gnulib that cause problems that didn't need
to be there.

A big problem with gnulib is that good new operating systems are
unreasonably burdened because of the mistakes of buggy operating systems
(which may have been fixed long ago). A good example is
cross-compilation. For instance, an old Unix might have had a broken
fork() system call or something. When cross-compiling, gnulib might be
pessimistic and default to assuming the system call is broken, which may
be handled with a poor or inefficient fallback, disabling functionality
at runtime, or a compile time error. There is usually a whitelist of
systems without the problem and an environment variable to inject the
true answer. That means that it's harder to compete with a new unknown
operating system because I must set the environment variable, while
other operating systems just work, including the buggy one. That means
my good operating system is paying for the complexity caused by a bad
operating system. I'd rather the extra work of cross-compiling is moved
to the buggy operating systems.

Cross-compilation is inherently a bit more difficult when the host
system is buggy, so a more reasonable design would be to assume
the best about unknown operating systems, and to only invoke the
workarounds for known buggy systems (forcing them to set the
environment variable instead of me). That means the buggy operating
systems pay the cost and the good ones, including new systems,
do not. Making cross-compilation nice helps the development of new
operating systems, and not just for established things like glibc/musl.
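
To illustrate why a default is needed at all: a run-time probe like the
sketch below can only be executed when building natively, so a
cross-compiling configure script has to guess the answer instead. This
is a hypothetical probe written for illustration, not gnulib's actual
test, and the name fork_works is made up.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical probe: returns 1 if fork() creates a child whose exit
   status can be collected by the parent, 0 otherwise. */
int fork_works(void)
{
    pid_t child = fork();
    if (child < 0)
        return 0;       /* fork() failed outright */
    if (child == 0)
        _exit(42);      /* child: exit with a recognizable status */

    int status;
    if (waitpid(child, &status, 0) != child)
        return 0;       /* the child could not be collected */
    return WIFEXITED(status) && WEXITSTATUS(status) == 42;
}
```

A configure-style test program would return !fork_works() from main.
When cross-compiling, that program cannot be run on the build machine,
which is exactly where the choice between an optimistic and a
pessimistic default comes in.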

As you saw in my gnulib wiki page, I literally inject 120 environment
variables to make gnulib assume the very best about my operating system.
I'd rather be confronted with bugs up front than have them be secretly
hidden by a portability layer, or be told that it assumed the worst
about my unknown operating system so I have to teach it internals about
my OS. I'd love to be able to disable the gnulib bug replacements
and just get the bugs (if any). It doesn't help that gnulib contains 1)
implementations of missing functionality, 2) workarounds for bugs,
including fallback implementations, and 3) utility code shared between
projects. gnulib should really be restructured so these three are
separated. It's hard to find out if 1) or 2) got included, and I'd
rather know those things up front (so I can implement the features / fix
bugs).

It's fine that gnulib requires OS specific knowledge whenever POSIX
doesn't cover a feature (such as mountpoints) and the OS doesn't
implement any interface that gnulib knows about.

Sorry, I don't have a list of packages with stale gnulib files, and such
a list would probably be stale by now. I assume you scan all the GNU
packages and other packages known to embed gnulib and ping them whenever
they forget to update. This was certainly a bad problem some years ago
when gnulib was worse at cross-compilation and I found myself having to
work around the same bug time and again (trying to check for locale by
running cross-compiled programs), but it got better. But if config.sub
and config.guess are any indication, the delay for updating these things
can take years. For what it's worth, you can see the list of my patches
at <https://pub.sortix.org/sortix/release/nightly/patches/> although
many of my third party ports are a few years old by now. You can also
see my notes with personal opinions about each of my ports (including
gnulib) at <https://gitlab.com/sortix/sortix/wikis/Ports>.

I don't think I have a list of parts of gnulib that are troublesome on
modern systems. It is good to know that you're receptive to such
reports; I might report such cases when I come across them. Still, I
suggest you audit the bugs that gnulib works around and simply get rid of
anything that doesn't affect any relevant system release from the past 5
years (or even 10, or this millennium).

I object to the attitude that code analysis tools should only really be
supported on glibc systems. A lot of security features are being
pioneered on other systems and making it easier for everyone to use
these tools benefits everyone.

"Exploit mitigation counter-measures" is whenever a system has an
exploit mitigation and software goes out of its way not to use it.
A good example is the 2014 Heartbleed vulnerability where there was a
good old buffer over-read. OpenSSL was wrapping malloc with its own
allocation layer, which made use-after-free bugs worse and did not
support zeroing freed allocations. That meant that systems with a
hardened malloc (an exploit mitigation) such as OpenBSD, which would
have reduced the data leakage a lot, did not benefit from the exploit
mitigation. When gnulib wraps malloc, it's hardly as bad, but it does
replace the libc definition (which may contain attributes and other
features that help the compiler / code analysis tools detect bugs).
That's why I consider it really bad whenever software wraps malloc. I do
understand that assuming malloc(0) never returns NULL on success is a
deep assumption and it can be extremely difficult to audit GNU code for
such assumptions. Though I'd almost argue that systems whose malloc(0)
returns NULL on success might really bring that pain upon themselves.
Does any modern relevant system even do that? It might be fun to check
in my os-test project, where I test every POSIX system I could get my
hands on. I'd default to no for unknown systems when cross-compiling.
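
A native check for the malloc(0) behavior is tiny. The following is a
minimal sketch, not taken from gnulib or os-test, and the function name
is hypothetical. Note that a NULL result could also be a genuine
allocation failure, which this sketch cannot tell apart.

```c
#include <stdlib.h>

/* Hypothetical check: returns 1 if malloc(0) yields a non-null pointer
   on this system, 0 if it yields NULL. */
int malloc_zero_returns_pointer(void)
{
    /* volatile keeps the compiler from optimizing the malloc/free pair
       away and simply assuming the result is non-null. */
    void *volatile p = malloc(0);
    int non_null = (p != NULL);
    free(p);            /* free(NULL) is a no-op, so this is always safe */
    return non_null;
}
```

Like any run-time probe, this only answers the question for the system
it runs on, so it does not help when cross-compiling.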

I do think it's very reasonable to test for particular bugs and that
it's the right thing to do, when possible. When cross-compiling, that's
not possible and the best bet should be listing known buggy operating
systems, not assuming the worst about unrelated systems. The big problem
is that the gnulib replacements that got used because gnulib was
pessimistic might be worse (or less secure) than the features the
operating system provides, or might even require being ported (over and
over again, unless I upstream support and wait years for it to get
downstream). As someone making a new OS, I'd much rather know about bugs
than sweep them under the rug. In any case, gnulib can check
for these bugs when compiled natively; my point is about the
cross-compilation assumption which only really matters to developers
bootstrapping the new OS.

I recognize that different people (OS developers, packagers, users) have
slightly different needs here. In my case, packaging is done as part of
the OS development to get a fully consistent and integrated system. I'd
love if the users could download the upstream source code and build it,
although that's not so likely, as most software needs a patch (if only
to add my OS to config.sub; at least I got that upstream a couple of
years ago, so it should be in many downstreams by now). My users will
probably want to get my modified sources if they want to build the
software. Getting to a point where upstreams support me out of the box
is many years away,
if ever.

Re adding that to the gnulib manual: I'd rather have you restructure the
project, and make it possible to disable the bug workarounds with a
--disable-gnulib-workarounds option. That will make things easy for me
and other people too.

It's been a while since I looked at the stdio-ext functions, although
I'm not really sure why they need to exist. At least there is a way to
satisfy gnulib.

My libc warns about every sprintf use because I consider it an
inherently dangerous interface. Buffer allocation and string production
should not be decoupled as it leads to bugs. (Modern languages, such as
the one I develop at work, simply do not have these problems.) In C,
strdup should be used instead of strlen+malloc+strcpy/memcpy because
it's much less error prone. Generally asprintf should be used instead of
sprintf/snprintf because it does the buffer allocation for you and
significantly reduces the risk. At the very least snprintf should be
used, because the destination size must be known whenever sprintf is
used, or the code is a risk (I've seen plenty of such bugs or code
reeking of such bugs). See
<https://maxsi.org/coding/c-string-creation.html> for how I believe C
strings should be created.
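
On systems that provide asprintf, calling it directly is the simplest
way to couple allocation and formatting. Where it is not available, the
same coupling can be built from C99 vsnprintf. The following is a
minimal sketch under that assumption; format_alloc is a hypothetical
name, not an existing libc or gnulib function.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a freshly malloc'd string holding the
   formatted output, or NULL on failure. The caller frees the result. */
char *format_alloc(const char *format, ...)
{
    va_list ap;

    /* First pass: ask vsnprintf how many bytes the output needs. */
    va_start(ap, format);
    int needed = vsnprintf(NULL, 0, format, ap);
    va_end(ap);
    if (needed < 0)
        return NULL;

    char *result = malloc((size_t)needed + 1);
    if (result == NULL)
        return NULL;

    /* Second pass: format into a buffer of exactly the right size. */
    va_start(ap, format);
    int written = vsnprintf(result, (size_t)needed + 1, format, ap);
    va_end(ap);
    if (written < 0 || written > needed) {
        free(result);
        return NULL;
    }
    return result;
}
```

The caller never computes a buffer size, so there is no size to get
wrong: char *s = format_alloc("%s-%d", name, n); followed by free(s)
when done.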

I object to the notion that truncation is a worse outcome than a buffer
overflow. A buffer overflow is at worst a remote code execution
vulnerability, while a truncation is at worst a program bug (although
that may be exploitable in turn, it is not inherently exploitable). The
correct resolution is to not even have this class of problems in the
first place by not decoupling buffer allocation from the string
creation. That's why the resolution to the strcpy debate is not to have
a secure strcpy, but to instead use strdup. We need to be better than
to expose ourselves to the risk of 1990s (and way earlier) security
vulnerabilities.
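
Where a fixed-size buffer really is unavoidable, snprintf at least lets
the caller detect truncation and treat it as an ordinary error instead
of overflowing. A minimal sketch, with a hypothetical helper name:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes "name=value" into dst. Returns 0 on
   success and -1 if the output did not fit or formatting failed. */
int format_pair(char *dst, size_t dst_size,
                const char *name, const char *value)
{
    /* snprintf returns the length the full output would have had. */
    int needed = snprintf(dst, dst_size, "%s=%s", name, value);
    if (needed < 0)
        return -1;      /* formatting error */
    if ((size_t)needed >= dst_size)
        return -1;      /* truncated; dst is still NUL-terminated
                           as long as dst_size > 0 */
    return 0;
}
```

The truncated case is a bug the caller can see and handle, whereas the
equivalent sprintf call would have written past the end of dst.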

getgroups() is an interesting case. Often when making an OS, I have to
choose between having a feature, having a stub of a feature that doesn't
work, or not having the feature at all. It really depends on the
ecosystem. In this case, my OS doesn't properly implement users and
groups yet, and I simply don't have getgroups (although I do have
getgid). New systems will often be in this kind of inconsistent state.
Generally I prefer not to have features unless I really do implement
them, but sometimes a stub is required to get ports working that don't
truly need the feature. In any case, GETGROUPS_T should just check
whether gid_t is a type and then just use it instead of int.
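
For reference, this is roughly what using gid_t directly looks like on
a system that does provide getgroups() with the POSIX signature. It is
only a sketch under that assumption, and current_groups is a
hypothetical name.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns a malloc'd array with the supplementary
   group IDs of the calling process and stores their number in *count.
   Returns NULL on failure. The caller frees the result. */
gid_t *current_groups(int *count)
{
    /* With a size of 0, getgroups() only reports how many there are. */
    int n = getgroups(0, NULL);
    if (n < 0)
        return NULL;

    /* Allocate one extra slot so that malloc(0) is never requested. */
    gid_t *list = malloc(((size_t)n + 1) * sizeof(gid_t));
    if (list == NULL)
        return NULL;

    if (n > 0) {
        n = getgroups(n, list);
        if (n < 0) {
            free(list);
            return NULL;
        }
    }
    *count = n;
    return list;
}
```

The array element type is plain gid_t, with no GETGROUPS_T indirection.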

To recap, my primary requests are:

1) Categorizing gnulib into three parts (replacement functions for when
they don't exist, workarounds for bugs, and utility functions).

2) Making it possible to disable the gnulib bug replacements with a
configure command line option.

3) Defaulting to assume the best when cross-compiling to unknown systems.

Thanks for listening. I believe that making gnulib better for
new/unknown operating systems will benefit everyone in the long run and
improve the health of the free software ecosystem. I'm close with the
hobbyist OS community and there really are a bunch of people like me
starting new projects and porting GNU software to try out new ideas.

Jonas
