Hi Bruno & Paul,

Thanks for your interest in my notes on gnulib. I appreciate your desire to discuss them. Please excuse some of the resources that I'll link below for overreacting in jest ("screaming in horror", "usual gnulib infection"). They weren't really meant for upstream consumption, but they do refer to true struggles that I've had to go through. I understand and appreciate that you and I have different goals, but I think we do have some common ground. I'll give you the context of my notes below.

My goal is to make a good and clean POSIX-like operating system, and to enable other people to do so as well, because healthy competition is good. I have contributed to gnulib in the past, although my main interest is my operating systems project and I only work on ports as needed. This year I am dedicating my time to my game development project, so I don't have the resources to do OS work these days. Please excuse me if my information is out of date.

I rather enjoy my operating systems work on Sortix, though, because I've accomplished something extraordinary: a self-hosting POSIX-like system made from scratch with key ports of third party software (about 75% of the ports compile natively, so I still rely on cross-compilation). What pleasantly surprised me was that I did not have the same backwards compatibility concerns as a GNU/Linux system. By doing things in a simple way without historical mistakes, my project got a good baseline quality. This really achieves the best qualities of a free software platform, because all the code and ports are integrated and are easy to modify (you can 'cd /src && make sysmerge' out of the box and the system is updated to the new sources of everything). I'm free to compete with other POSIX-like systems by making better implementations of the standard interfaces, in turn encouraging other systems to improve. Meanwhile I have encountered technical debt in other projects that I port, and I sometimes fix those issues and contribute the fixes upstream, improving the health of the free software ecosystem.

My method for porting software is to cross-compile the software to my OS (see <https://wiki.osdev.org/Cross-Porting_Software>). Sometimes I have to fix some build system bugs. Then I fix the compile errors and warnings, if any. That may require implementing some new features in my OS. Eventually it compiles and I run it. It might not work, or it might crash, and I fix those bugs too.
Finally I package up the thing and I might send a patch to the upstream if I like the software and it's easy and the upstream seems receptive.

I find BSD software can be easier to port than GNU software, even though it often does not even attempt to be portable. I can easily deal with it because it simply fails to compile and I can easily figure out what I need to provide, or just change the software to use a different feature. GNU software can be much harder to port because of complexities in layers like autoconf or gnulib that cause problems that didn't need to be there.

A big problem with gnulib is that good new operating systems are unreasonably burdened because of the mistakes of buggy operating systems (which may have been fixed long ago). A good example is cross-compilation. For instance, an old Unix might have had a broken fork() system call or something. When cross-compiling, gnulib might be pessimistic and default to assuming the system call is broken, which may be handled with a poor or inefficient fallback, disabled functionality at runtime, or a compile time error. There is usually a whitelist of systems without the problem and an environment variable to inject the true answer. That means it's harder for a new unknown operating system to compete, because I must set the environment variable, while other operating systems just work, including the buggy one. My good operating system is paying for the complexity caused by a bad operating system.

I'd rather the extra work of cross-compiling be moved to the buggy operating systems. Cross-compilation is inherently a bit more difficult when the host system is buggy, so a more reasonable design would be to assume the best about unknown operating systems, and to only invoke the workarounds for known buggy systems (forcing them to set the environment variable instead of me).
That way the buggy operating systems pay the cost instead of the good ones (the current approach makes it harder for new systems). Making cross-compilation nice helps the development of new operating systems, and not just established things like glibc/musl. As you saw in my gnulib wiki page, I literally inject 120 environment variables to make gnulib assume the very best about my operating system. I'd rather be confronted with bugs up front than have them be secretly hidden by a portability layer, or be told that it assumed the worst about my unknown operating system so I have to teach it internals about my OS. I'd love to be able to disable the gnulib bug replacements and just get the bugs (if any).

It doesn't help that gnulib contains 1) implementations of missing functionality, 2) workarounds for bugs, including fallback implementations, and 3) utility code shared between projects. gnulib should really be restructured so these three are separated. It's hard to find out if 1) or 2) got included, and I'd rather know those things up front (so I can implement the features / fix bugs). It's fine that gnulib requires OS specific knowledge whenever POSIX doesn't cover a feature (such as mountpoints) and the OS doesn't implement any interface that gnulib knows about.

Sorry, I don't have a list of packages with stale gnulib files, and such a list would probably be stale by now. I assume you scan all the GNU packages and other packages known to embed gnulib and ping them whenever they forget to update. This was certainly a bad problem some years ago when gnulib was worse at cross-compilation and I found myself having to work around the same bug over and over again (trying to check for the locale by running cross-compiled programs), but it got better. But if config.sub and config.guess are any indication, the delay in updating these things can be years.

For what it's worth, you can see the list of my patches at <https://pub.sortix.org/sortix/release/nightly/patches/> although many of my third party ports are a few years old by now. You can also see my notes with personal opinions about each of my ports (including gnulib) at <https://gitlab.com/sortix/sortix/wikis/Ports>.

I don't think I have a list of parts of gnulib that are troublesome on modern systems. It is good to know that you're receptive to such reports, and I might report such cases when I come across them. I do suggest you audit the bugs that gnulib works around and simply get rid of anything that doesn't affect any relevant system release from the past 5 years (or even 10, or this millennium).

I object to the attitude that code analysis tools should only really be supported on glibc systems. A lot of security features are being pioneered on other systems, and making it easier for everyone to use these tools benefits everyone.

"Exploit mitigation counter-measures" is whenever a system has an exploit mitigation and software goes out of its way to not benefit from it. A good example is the 2014 Heartbleed vulnerability, where there was a good old buffer over-read. OpenSSL was wrapping malloc with its own allocation layer, which made use-after-free bugs worse and did not support zeroing freed allocations. That meant that systems with a hardened malloc (an exploit mitigation) such as OpenBSD, which would have reduced the data leakage a lot, did not benefit from the exploit mitigation. When gnulib wraps malloc, it's hardly as bad, but it does replace the libc definition (which may contain attributes and other features that help the compiler / code analysis tools detect bugs). That's why I consider it really bad whenever software wraps malloc.

I do understand that assuming malloc(0) never returns NULL on success is a deep assumption and it can be extremely difficult to audit GNU code for such assumptions.
Though I'd almost argue that systems whose malloc(0) returns NULL on success might really bring that pain upon themselves. Does any modern relevant system even do that? It might be fun to check in my os-test project, where I test every POSIX system I could get my hands on. I'd default to no for unknown systems when cross-compiling.

I do think it's very reasonable to test for particular bugs and that it's the right thing to do, when possible. When cross-compiling, that's not possible, and the best bet should be listing known buggy operating systems, not assuming the worst about unrelated systems. The big problem is that the gnulib replacements that got used because gnulib was pessimistic might be worse (or less secure) than the features provided by the operating system, or might even require being ported (over and over again, unless I upstream support and wait years for it to get downstream). As someone making a new OS, I'd much rather know about bugs than sweep them under the rug. In any case, gnulib can check for these bugs when compiled natively. My point is about the cross-compilation assumption, which only really matters to developers bootstrapping a new OS.

I recognize that different people (OS developers, packagers, users) have slightly different needs here. In my case, packaging is done as part of the OS development to get a fully consistent and integrated system. I'd love it if the users could download the upstream source code and build it, although that's not so likely, as most software needs a patch (if just to add my OS to config.sub; at least I got that upstream a couple of years ago, so it should be in many downstreams by now). My users will probably want to get my modified sources if they want to build the software. Getting to a point where upstreams support me out of the box is many years away, if ever.

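As an aside on the malloc(0) question: when compiling natively, the check itself is tiny. Below is a minimal sketch in C of such a probe. It is only an illustration, not what gnulib or os-test actually does, and the name malloc0_returns_nonnull is made up. It cannot tell a genuine out-of-memory failure apart from a libc that returns NULL for zero-sized requests, and it cannot be run at all when cross-compiling, which is exactly where the default assumption matters.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (name invented for this sketch).
   Returns 1 if malloc(0) yields a non-null pointer on this system,
   and 0 if it yields NULL. POSIX permits either behaviour, and a NULL
   result could also mean a real allocation failure. */
static int malloc0_returns_nonnull(void)
{
	/* Call through a volatile function pointer so an optimizing compiler
	   cannot recognize the malloc/free pair and fold the result. */
	void *(*volatile alloc)(size_t) = malloc;
	void *p = alloc(0);
	int nonnull = (p != NULL);
	free(p); /* free(NULL) is a no-op, so this is safe either way. */
	return nonnull;
}
```

A configure-style test would call this from main and exit with 0 or 1 accordingly; when cross-compiling there is nothing to run, so the answer has to come from a default or from a list of known systems.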
Re adding that to the gnulib manual: I'd rather have you restructure the project and make it possible to disable the bug workarounds with a --disable-gnulib-workarounds option. That will make things easy for me and other people too.

It's been a while since I looked at the stdio-ext functions, although I'm not really sure why they need to exist. At least there is a way to satisfy gnulib.

My libc warns about every sprintf use because I consider it an inherently dangerous interface. Buffer allocation and string production should not be decoupled, as it leads to bugs. (Modern languages, such as the one I develop at work, simply do not have these problems.) In C, strdup should be used instead of strlen+malloc+strcpy/memcpy because it's much less error prone. Generally asprintf should be used instead of sprintf/snprintf because it does the buffer allocation for you and significantly reduces the risk. At the very least snprintf should be used, because the destination size must be known whenever sprintf is used, or the code is a risk (I've seen plenty of such bugs or code reeking of such bugs). You can see <https://maxsi.org/coding/c-string-creation.html> about how I believe C strings should be created.

I object to the notion that truncation is a worse outcome than a buffer overflow. A buffer overflow is at worst a remote code execution vulnerability, while a truncation is at worst a program bug (although that may be exploitable in turn, it is not inherently exploitable). The correct resolution is to not even have this class of problems in the first place, by not decoupling buffer allocation from string creation. That's why the resolution to the strcpy debate is not to have a secure strcpy, but to use strdup instead. We need to be better than to expose ourselves to the risk of 1990s (and way earlier) security vulnerabilities.

getgroups() is an interesting case.
Often when making an OS, I have to choose between having a feature, having a stub of a feature that doesn't work, or not having the feature at all. It really depends on the ecosystem. In this case, my OS doesn't properly implement users and groups yet, and I simply don't have getgroups (although I do have getgid). New systems will often be in this kind of inconsistent state. Generally I prefer not to have features unless I really do implement them, but sometimes a stub is required to get ports working that don't truly need the feature. In any case, GETGROUPS_T should just check whether gid_t is a type and then use it instead of int.

To recap, my primary requests are:

1) Categorizing gnulib into three parts (replacement functions for when they don't exist, workarounds for bugs, and utility functions).
2) Making it possible to disable the gnulib bug replacements with a configure command line option.
3) Defaulting to assume the best when cross-compiling to unknown systems.

Thanks for listening. I believe that making gnulib better for new/unknown operating systems will benefit everyone in the long run and improve the health of the free software ecosystem. I'm close with the hobbyist OS community and there really are a bunch of people like me starting new projects and porting GNU software to try out new ideas.

Jonas