Date: Sun, 2 Jul 2023 15:51:06 -0400 (EDT)
From: Mouse <mo...@rodents-montreal.org>
Message-ID: <202307021951.paa07...@stone.rodents-montreal.org>

| For example, a program that calls printf but never uses any
| floating-point values at all will not, in theory, need floating point
| support. But we do not have any mechanism by which anything can
| discover that no floating-point printf formats are used and thus bring
| in a printf variant that doesn't actually support floating point; this
| means that a bunch of floating-point stuff will be brought in even
| though it will never actually get used.

First, a different printf that doesn't support floats isn't needed,
printf (itself) has essentially no knowledge of anything related to
floats.

When everything used to be static (ie: back in my time...) a lot of
effort was expended making small programs stay small (both RAM for the
executing binary, and disc space for the executable file, were scarce
resources) by careful crafting of what was in libc.a and the order it
all appeared. Keeping that correct took much work, and it was very easy
to end up with multiple symbol definition errors from linking an
innocuous (and correct) program that just happened to be slightly
different than had been expected.

The issue above was solved by having dummy versions of the floating
point to string conversion routines (which did nothing, and so were
very small). The compiler helped, by inserting a reference to a well
known symbol, if the program being compiled contained any float or
double references. The real floating conversion routines defined that
symbol, the dummy ones did not.

libc was constructed (as far as is relevant here) with the real
conversion routines first, then printf, then the dummy conversion
routines following.

If the program used any floating point then the compiler inserted
undefined reference to the magic symbol would cause the real conversion
routines to be linked (as they would be if explicitly called by the
program, but that's very unlikely). Then printf would be linked (we're
assuming the program uses printf, or this issue isn't relevant).
If the floating point conversion routines were already linked, they
satisfy the undefined symbols in the printf object file(s). If they
weren't, those remained undefined until the dummy routines were
encountered, later in libc, at which point they'd be loaded. Since we
know the program isn't using floating point to get to that point,
they'd never be called.

Note that this isn't quite "discover that no floating-point printf
formats are used" - there was never an attempt to do that, but if the
program does

    printf("%f", x);

what is 'x' in a valid program? What can it be that the compiler would
not know that the program is using floating point? Even
"*(double *)&long_var" is enough for floats to be considered used.

If you manage to call printf with a floating format, and pass it
something that the compiler does not believe is, or is to be treated
as, any kind of floating point data (even if it happens to be) and the
program uses no floats elsewhere, anywhere, then you lose... Trivial
to fix, you just declare some float variable, somewhere.

I don't know if current compilers provide this kind of assistance, or
not, but they could.

Similar, but case-specific, work can be done to handle all of the
other (largish) systems ... eg: when a program exits, exit() or
something it calls, needs to make sure all stdio buffers are flushed
(typically by doing fclose() on each of them, but the close part isn't
as important, the exit sys call accomplishes that - but that cannot
ensure that unwritten buffered data has been flushed to files first).
That means that you get large chunks of stdio linked, even if your
program doesn't include <stdio.h> or use any of it (and since stdio
uses malloc() you get that as well).
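
To make the libc.a ordering trick for the floating point conversion
routines (described above) concrete, here is a toy model in C of a
linker making one front-to-back pass over an archive. It is only a
sketch: the member names (fltcvt_real.o, printf.o, fltcvt_dummy.o) and
the symbol names (__float_used, __cvt_double) are made up for the
illustration, and no real linker or libc is claimed to be coded this
way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS    16  /* capacity of a symbol set (toy limit) */
#define MEMBER_SYMS 4   /* max symbols listed per archive member */

/* A tiny set of symbol names; the strings are borrowed, never copied. */
struct symset {
    const char *names[MAX_SYMS];
    size_t count;
};

static bool set_has(const struct symset *s, const char *name)
{
    for (size_t i = 0; i < s->count; i++)
        if (strcmp(s->names[i], name) == 0)
            return true;
    return false;
}

static void set_add(struct symset *s, const char *name)
{
    if (set_has(s, name))
        return;
    assert(s->count < MAX_SYMS);
    s->names[s->count++] = name;
}

/* One archive member: the symbols it defines and the ones it references.
 * Unused slots are NULL. */
struct member {
    const char *name;
    const char *defines[MEMBER_SYMS];
    const char *needs[MEMBER_SYMS];
};

/* Hypothetical libc.a order: real conversions, then printf, then dummies.
 * Only the real conversion member defines the marker symbol. */
static const struct member toy_libc[] = {
    { "fltcvt_real.o",  { "__float_used", "__cvt_double" }, { NULL } },
    { "printf.o",       { "printf" },                       { "__cvt_double" } },
    { "fltcvt_dummy.o", { "__cvt_double" },                 { NULL } },
};
#define TOY_LIBC_LEN (sizeof toy_libc / sizeof toy_libc[0])

/* Scan the archive once, in order. A member is pulled in only if, when it
 * is reached, it defines a symbol that is referenced but not yet defined.
 * linked[i] records whether member i was pulled in. */
static void link_once(const char *const *program_refs,
                      const struct member *archive, size_t nmembers,
                      bool *linked)
{
    struct symset defined = { .count = 0 };
    struct symset wanted = { .count = 0 };

    for (; *program_refs != NULL; program_refs++)
        set_add(&wanted, *program_refs);

    for (size_t i = 0; i < nmembers; i++) {
        const struct member *m = &archive[i];
        bool pull = false;

        for (size_t d = 0; d < MEMBER_SYMS && m->defines[d] != NULL; d++)
            if (set_has(&wanted, m->defines[d]) &&
                !set_has(&defined, m->defines[d]))
                pull = true;

        linked[i] = pull;
        if (!pull)
            continue;

        for (size_t d = 0; d < MEMBER_SYMS && m->defines[d] != NULL; d++)
            set_add(&defined, m->defines[d]);
        for (size_t n = 0; n < MEMBER_SYMS && m->needs[n] != NULL; n++)
            set_add(&wanted, m->needs[n]);
    }
}

/* Link a program that calls printf, with or without the compiler-inserted
 * marker reference. Bit i of the result is set if toy_libc[i] was linked. */
unsigned toy_link_mask(bool uses_float)
{
    static const char *const int_only[] = { "printf", NULL };
    static const char *const with_float[] = { "printf", "__float_used", NULL };
    bool linked[TOY_LIBC_LEN] = { false };
    unsigned mask = 0;

    link_once(uses_float ? with_float : int_only,
              toy_libc, TOY_LIBC_LEN, linked);
    for (size_t i = 0; i < TOY_LIBC_LEN; i++)
        if (linked[i])
            mask |= 1u << i;
    return mask;
}
```

In this model toy_link_mask(false) comes out as 0x6 (printf.o plus the
dummy conversions) and toy_link_mask(true) as 0x3 (the real conversions
plus printf.o, with the dummies skipped because their symbol is already
defined). Swapping the first and last members breaks the scheme, which
is the ordering sensitivity described above.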
You can attempt to avoid this by calling _exit() instead of exit(), but
as "falling off the end of main" is defined as a call of exit(0), the
run time support doesn't know that exit() won't be needed, and links it
anyway (even if the compiler knows the program will never simply fall
off the end of main()). With enough work that can be handled as well.

And then on to the next problem ... and the next ...

Since in practice, almost no-one uses static linking for almost
anything any more (except via crunchgen for /rescue, which has so much
linked in that the whole of libc is a drop in the bucket, and most of
it is needed, by something, anyway) there aren't many people willing to
attempt to manage all of this, and keep it working. Believe me, while
possible, it isn't easy - and the smallest changes in the oddest of
places can require a lot of work, and playing around, to keep it all
working properly.

For some of this you need linker/binary format support so the library
can have routines which define symbols, which resolve references in the
program if that routine is linked - but for which the presence of the
symbol is not advertised, so the routine will not be linked just
because the symbol is unresolved and it is defined in that routine -
something else needs to cause the routine to be linked first.

Personally, I don't see any point, and I know I won't be working on
that kind of thing, ever again - had enough of that, back when it
really mattered, long long ago.

If you link static binaries, without doing everything needed to avoid
it, and if no-one has done the work to make the static libc able to
handle all of this, and you're not willing to mangle your source code
with all the dummy routines that others have been suggesting, so that
the libc versions never get linked, then you're going to get big
binaries. Live with it, or do the work to fix it yourself.

kre