https://fedoraproject.org/wiki/Changes/Package_information_on_ELF_objects
== Summary ==
All binaries (executables and shared libraries) are annotated with an ELF note that identifies the rpm for which this file was built. This allows binaries to be identified when they are distributed without any of the rpm metadata. `systemd-coredump` uses this to log package versions when reporting crashes.

== Owner ==
* Name: [[User:Zbyszek|Zbigniew Jędrzejewski-Szmek]]
* Email: zbys...@in.waw.pl
* Name: Lennart Poettering
* Email: mzsrq...@0pointer.net

== Detailed Description ==
People mix binaries (programs and libraries) from different distributions (for example using Fedora containers on Debian or vice versa), distribute binaries without packaging metadata (for example by stripping everything except the binary from a container image, also removing `/usr/lib/.build-id/*`), compile their own rpm packages (for internal distribution and installation), and compile and distribute their own binaries.

Sometimes we need to introspect a binary and figure out its provenance, for example when a program crashes and we are looking at a core dump, but also when we have a binary without the packaging metadata.

When the need to introspect a binary arises, we have some very good mechanisms to show the provenance: when a file is installed through the package manager we can directly list the providing package, but even without this we can use build-ids embedded in the binary to uniquely identify the originating build. But those mechanisms work best when we're in the realm of a single distribution. In particular, build-ids can be easily tied to a source rpm, but only when the source rpm is part of the distribution and the build-id was registered in the appropriate database which maps build-ids to real package names. When we move outside of the realm of a single distribution, it can be hard to figure out where a given binary originates from.

If we know that a binary is from a given distribution, we may be able to use some distro-specific mechanism to figure out this information. But those mechanisms will be different for different distributions and will often require network access. With this change we aim to provide a mechanism that is very simple, provides "human-readable" origin information without further processing, is portable across distros, and works without network access.

The directly motivating use case is display of core dumps. Right now we have build-ids, but those are just opaque hexadecimal numbers that are not meaningful to users. We would like to immediately list versions of packages involved in the crash (including both the program and any libraries it links to). It is not enough to query the rpm database to do the equivalent of `rpm -qf …`: very often programs crash after some packages have been upgraded and the binaries loaded into memory are no longer the binaries present on disk, or, through some mishap, the binaries on disk do not match the installed rpms. A mechanism that works without rpm database lookup or network access allows this information to be shown immediately in `coredumpctl` listings and journal entries about the crash. This includes crashes that happen in the initrd and sandboxed containers.

A second motivating use case is when users distribute their own binaries and would like to collect crash information. Build-ids are a solution that is technically possible, but easy to get wrong in practice: users would need to record the build-id immediately after the build and store the mapping to program names, versions, and build number in some database. It's much easier to record something during the build in the build product itself.

A third motivating use case is the general mixing of Fedora binaries with programs and libraries from different distributions, both with our binaries being used as the base for foreign binaries, and the other way around. Whilst most distributions provide some mechanism to figure out the source build information, those mechanisms vary by distribution and may not be easy to access from a "foreign" system. Such mixing is expected with containers, flatpaks, snaps, Python binary wheels, anaconda packages, and quite often when somebody compiles a binary and puts it up on the web for other people to download.

We propose a new mechanism which is designed to be very simple but extensible: a small JSON document is embedded in a section in the ELF binary. This document can be easily read by a human if necessary, but it is also well-defined and can be processed programmatically. For example, `systemd-coredump` will immediately make use of this to display package ''nevra'' information for crashes. The format is also easy to generate, so it can be added to any build system, either using the helpers that we provide or even reimplemented from scratch.

For the case where we mix binaries from different distros (the third motivating use case above), this approach is the most useful when this system is used by all distros and even non-distro builds. The more widely it is used, the more useful it becomes. The specification was developed in collaboration with Debian developers, and we hope that Fedora and Debian will lead the way for this to become as widely used as build-ids. But even if the information is only available from some distros, it is still useful, except that fallback mechanisms need to be implemented.
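
To illustrate how little is needed to generate such a document, here is a minimal Python sketch that serializes the four fields that appear in the `hello` example in the Examples section. The helper name `package_metadata_json` is made up for illustration and is not one of the provided helpers; the authoritative list of fields is in the specification.

```python
import json


def package_metadata_json(pkg_type, name, version, os_cpe):
    """Serialize package metadata as compact JSON.

    Hypothetical helper for illustration only. The field names are the
    ones visible in the `hello` example of this proposal.
    """
    fields = {
        "type": pkg_type,
        "name": name,
        "version": version,
        "osCpe": os_cpe,
    }
    # Compact separators avoid spaces, keeping the embedded string small.
    return json.dumps(fields, separators=(",", ":"))


payload = package_metadata_json(
    "rpm", "hello", "0-1.fc35.x86_64", "cpe:/o:fedoraproject:fedora:33"
)
print(payload)
```

Because the payload is plain JSON, a build system in any language can produce it with its standard library.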

=== Existing system: `.note.gnu.build-id` ===
We already have build-ids: every ELF object has a `.note.gnu.build-id` note, and given a core file, we can read the build-id and look it up in the rpm database (`dnf repoquery --whatprovides debuginfo(build-id) = …`) to map it to a package name.

Build-ids are unique, compact, and very generic, and work as expected in general. But they have some downsides:
* build-ids are not very informative for users. Before the build-id is converted back to the appropriate package, it's completely opaque.
* build-ids require a working rpm database or an internet connection to map to the package name.

Three important cases:
* minimal containers: the rpm database is not installed in the containers. The information about build-ids needs to be stored externally, so package name information is not available immediately, but only after offline processing. The new note doesn't depend on the rpm db in any way.
* handling of a core from a container, where the container and host have different distros
* self-built and external packages: unless a lot of care is taken to keep access to the debuginfo packages, this information may be lost. The new note is available even if the repository metadata gets lost. Users can easily provide equivalent information in a format that makes sense in their own environment. It should work even when rpms and debs and other formats are mixed, e.g. during container image creation.

=== New system: `.note.package` ===
The new note is created and propagated similarly to `.note.gnu.build-id`. The difference is that we inject the information about package ''nevra'' from the build system.

The implementation is very simple: `%{build_ldflags}` are extended with a command to insert a custom note as a separate section in an ELF object. See [https://github.com/systemd/package-notes/blob/main/hello.spec hello.spec] for an example.
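
For readers unfamiliar with ELF notes, the following Python sketch packs and unpacks a note using the byte layout visible in the `objdump` output in the Examples section: three 32-bit words (name size, description size, type), the owner string `FDO`, and the NUL-terminated JSON, with name and description each padded to 4 bytes. It assumes little-endian byte order (as on x86-64), and the function names are hypothetical. It only illustrates the layout; it is not how the link flags insert the note.

```python
import json
import struct

NOTE_OWNER = b"FDO\0"      # owner name as seen in the objdump example
NOTE_TYPE = 0xCAFE1A7E     # bytes 7e 1a fe ca in the objdump example


def _pad4(data):
    """Pad with NUL bytes to a multiple of 4, as ELF notes require."""
    return data + b"\0" * (-len(data) % 4)


def pack_package_note(payload):
    """Build note bytes from a JSON string (assumes little-endian)."""
    desc = payload.encode("utf-8") + b"\0"  # descsz counts the trailing NUL
    header = struct.pack("<III", len(NOTE_OWNER), len(desc), NOTE_TYPE)
    return header + _pad4(NOTE_OWNER) + _pad4(desc)


def unpack_package_note(note):
    """Parse note bytes back into a dict, checking owner and type."""
    namesz, descsz, note_type = struct.unpack_from("<III", note, 0)
    owner = note[12:12 + namesz]
    if owner != NOTE_OWNER or note_type != NOTE_TYPE:
        raise ValueError("not a package metadata note")
    desc_start = 12 + namesz + (-namesz % 4)
    desc = note[desc_start:desc_start + descsz]
    return json.loads(desc.rstrip(b"\0").decode("utf-8"))


payload = ('{"type":"rpm","name":"hello","version":"0-1.fc35.x86_64",'
           '"osCpe":"cpe:/o:fedoraproject:fedora:33"}')
note = pack_package_note(payload)
print(note[:16].hex())
print(unpack_package_note(note))
```

With the `hello` payload, the first 16 bytes come out as `04000000 63000000 7e1afeca 46444f00`, the same as the first row of the `objdump` output.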
The extension of `%{build_ldflags}` is done in the default macros, so all packages that use the prescribed link flags will be affected.

The note is a compact JSON string. This makes the format trivially extensible (new fields can be added at will) and easy to process (JSON is extremely popular and parsers are widely available). Using a single field rather than a set of separate notes is more space-efficient. With multiple fields the padding and alignment requirements cause unnecessary overhead.

The system was designed with cross-distro collaboration and is flexible enough to identify binaries from different packaging formats and build systems (rpms, debs, custom binaries). See https://systemd.io/COREDUMP_PACKAGE_METADATA/ for a detailed description of the format.

One of the advantages of using an ELF note, as opposed to say a series of extended attributes on the binary itself, is that the ELF note gets automatically captured and copied into a core file by the kernel. Extended attributes would have to be copied manually, which might not even be possible because the binary on disk may have been removed by the time the crash is analyzed.

The overhead is about 200 bytes for each ELF object. Overall, we have about 33200 files in `/usr/s?bin/` and about 36600 `.so` files (F35, single architecture, results from `dnf repoquery -l 2>/dev/null | rg '^/usr/s?bin/' | sort -u | wc -l`, `dnf repoquery -l 2>/dev/null | rg '^/usr/lib64/.*\.so$' |sort -u|wc -l`). If we do this for the whole distro, we get 69800 × 200 bytes ≈ 13 MiB. For a typical installation, we can expect about 300–400 kB. Thus the overhead of additionally used space is negligible (also see the Feedback section for more discussion). Precise measurements TBD once this is turned on and we have real measurements for a larger number of builds.

=== Examples ===
<pre>
$ objdump -s -j .note.package build/libhello.so

build/libhello.so:     file format elf64-x86-64

Contents of section .note.package:
 02ec 04000000 63000000 7e1afeca 46444f00  ....c...~...FDO.
 02fc 7b227479 7065223a 2272706d 222c226e  {"type":"rpm","n
 030c 616d6522 3a226865 6c6c6f22 2c227665  ame":"hello","ve
 031c 7273696f 6e223a22 302d312e 66633335  rsion":"0-1.fc35
 032c 2e783836 5f363422 2c226f73 43706522  .x86_64","osCpe"
 033c 3a226370 653a2f6f 3a666564 6f726170  :"cpe:/o:fedorap
 034c 726f6a65 63743a66 65646f72 613a3333  roject:fedora:33
 035c 227d0000                             "}..
</pre>

<pre>
$ readelf --notes build/hello | grep "description data" | sed -e "s/\s*description data: //g" -e "s/ //g" | xxd -p -r | jq
readelf: build/hello: Warning: Gap in build notes detected from 0x1091 to 0x10de
readelf: build/hello: Warning: Gap in build notes detected from 0x1091 to 0x10af
readelf: build/hello: Warning: Gap in build notes detected from 0x1091 to 0x119f
{
  "type": "rpm",
  "name": "hello",
  "version": "0-1.fc35.x86_64",
  "osCpe": "cpe:/o:fedoraproject:fedora:33"
}
</pre>

<pre>
$ coredumpctl info
           PID: 44522 (fsverity)
...
       Package: fsverity-utils/1.3-1
      build-id: ac89bf7175b04d7eec7f6544a923f45be111f0be
       Message: Process 44522 (fsverity) of user 1000 dumped core.

                Found module /home/bluca/git/fsverity-utils/libfsverity.so.0 with build-id: fa40fdfb79aea84167c98ca8a89add9ac4f51069
                Metadata for module /home/bluca/git/fsverity-utils/libfsverity.so.0 owned by FDO found: {
                        "packageType" : "deb",
                        "package" : "fsverity-utils",
                        "packageVersion" : "1.3-1"
                }

                Found module linux-vdso.so.1 with build-id: aba08e06103f725e26f1d7c178fb6b76a564a35d
                Found module libpthread.so.0 with build-id: e91114987a0147bd050addbd591eb8994b29f4b3
                Found module libdl.so.2 with build-id: d3583c742dd47aaa860c5ae0c0c5bdbcd2d54f61
                Found module ld-linux-x86-64.so.2 with build-id: f25dfd7b95be4ba386fd71080accae8c0732b711
                Found module libcrypto.so.1.1 with build-id: 749142d5ee728a76e7cdc61fd79d2311a77405a2
                Found module libc.so.6 with build-id: 18b9a9a8c523e5cfe5b5d946d605d09242f09798
                Found module fsverity with build-id: ac89bf7175b04d7eec7f6544a923f45be111f0be
                Metadata for module fsverity owned by FDO found: {
                        "packageType" : "deb",
                        "package" : "fsverity-utils",
                        "packageVersion" : "1.3-1"
                }

                Stack trace of thread 44522:
                #0  0x00007fe7c8af26f4 __GI___nanosleep (libc.so.6 + 0xc66f4)
                #1  0x00007fe7c8af262a __sleep (libc.so.6 + 0xc662a)
                #2  0x00005608481407dd main (fsverity + 0x27dd)
                #3  0x00007fe7c8a5009b __libc_start_main (libc.so.6 + 0x2409b)
                #4  0x000056084814094a _start (fsverity + 0x294a)
</pre>

== Feedback ==
See [https://github.com/systemd/systemd/issues/18433 systemd issue #18433] for upstream discussion and implementation proposals.

=== Concerns about additional changes to files ===
<pre>
17:32:30 <Eighth_Doctor> I think zbyszek underestimates how much of a problem it is to stamp every ELF binary with ''nevra'' data
17:32:44 <mhroncok> zbyszek: so, assuming python has ~100 ELF .so files and I change one text file
17:33:22 <mhroncok> (ignore for the time being that the .so files often changed because of toolchain updates and assume they are stable)
</pre>

I tested this with python3.10.
So far there are 13 builds of that package in F35: `python3.10-3.10.0-1.fc35`, `python3.10-3.10.0~a6-1.fc35`, `python3.10-3.10.0~a6-2.fc35`, `python3.10-3.10.0~a7-1.fc35`, `python3.10-3.10.0~b1-1.fc35`, `python3.10-3.10.0~b2-2.fc35`, `python3.10-3.10.0~b2-3.fc35`, `python3.10-3.10.0~b3-1.fc35`, `python3.10-3.10.0~b4-1.fc35`, `python3.10-3.10.0~b4-2.fc35`, `python3.10-3.10.0~b4-3.fc35`, `python3.10-3.10.0~rc1-1.fc35`, `python3.10-3.10.0~rc2-1.fc35`. I extracted the builds (for `.x86_64`), made a list of all `.so` files (1368 files), and calculated sha256 hashes for them. No two files repeat: there are 1368 distinct hashes. So the files are '''already''' different between builds and the additional proposed metadata will not make a significant difference. Note that this range of Python versions encompasses periods when the package is under development and undergoes significant changes (alpha versions), and when it's only undergoing small changes (rc versions).

The fact that we get different files in each build is not surprising, because files embed build-ids which differ between builds. But even if we ignore those, binaries generally differ between builds. Even sizes tend to vary between builds: there are 636 distinct `.so` file sizes, i.e. on average any given size occurs only about twice (presumably most often for the same file). Running `diffoscope` on `.so` files from different builds shows minor changes in the assembly which I did not analyze further.

If people have specific questions, for example about overhead in some scenario, I'd be happy to answer them. Until now, the issues that were raised were very vague, so it's impossible to answer them.

=== Why not just use the rpm database? ===
<pre>
17:34:33 <dcantrell> The main reason for this appears to be that we need the RPM db locally to resolve build-ids to package names. But since containers wipe /var/lib/rpm, we can't do that. So the solution is to put the ''nevra'' in ELF metadata?
17:34:39 <dcantrell> That feels like the wrong approach.
</pre>

First, there are legitimate reasons to strip packaging metadata from images. For example, for an initrd image built from rpms, I get 117 MB of files (without compression), and out of this `/var/lib/rpm` is 5.9 MB, and `/var/lib/dnf` is 4.2 MB. This is an overhead of 9%. This is ''not much'', but still too much to keep in the image unless necessary. Similar ratios will happen for containers of similar size. Reducing image size by one tenth is important. There is no `rpm` or `dnf` in the image, so the package database is not even usable without external tools. As discussed on IRC (https://meetbot.fedoraproject.org/teams/fesco/fesco.2021-05-11-17.01.log.html), the containers ''we'' build don't wipe this metadata, but custom Dockerfiles do that.

Second, as described in the Detailed Description section above, not everybody and everything uses rpm. The Fedora motto is "we make an operating system and we make it easy for you to do useful stuff with it" (and yes, this is an actual quote from the official docs), and this stuff involves reusing our binaries in containers and custom installations and whatnot, not just straightforward installations with `dnf`. And in the other direction, people will build their own binaries that are not packaged as rpms. But it is still important to be able to figure out the exact version of a binary, especially after it crashes.

=== Why do this in Fedora? ===
<pre>
17:36:49 <mhroncok> I don't understand how non-rpm distros and custom built binaries are affected by our rpm-build environment :/
</pre>

The idea is that we inject this into our build system, and Debian injects this into their build system, and so on… As mentioned, this is a cross-distro effort. Also, people can use it in their custom build systems if they build and distribute binaries internally. The scheme would obviously be most useful if used comprehensively, but it's still useful when available partially.
We hope that Fedora can lead the way. (This is similar to build-ids: when initially adopted, they were used only by some distros, but were useful even then. Nowadays, with comprehensive adoption, they are even more useful.)

https://hpc.guix.info/blog/2021/09/whats-in-a-package/ contains a nice description of a pathological case of packaging hacks and binary redistribution. When trying to unravel something like this, information embedded directly in the binaries would be quite useful.

== Benefit to Fedora ==
A simple and reliable way to gather information about package versions of programs is added. It complements, rather than replaces, the existing mechanisms. It is particularly useful when reporting crash dumps, but can also be used for image introspection and forensics, license checks and version scans on containers, etc.

If we adopt this in Fedora, Fedora leads the way on implementing the standard. Fedora binaries used in any context can be easily recognized. Fedora binaries provide a better basis to build things.

If other distros adopt this, we can introspect and report on those binaries easily within the Fedora context. For example, when somebody is using a container with some programs that originate in the Debian ecosystem, we would be able to identify those programs without tools like `apt` or `dpkg-query`. Core dump analysis executed on the Fedora host can easily provide useful information about programs from foreign builds.

== Implementation in Other Distributions ==
=== Microsoft CBL-Mariner ===
[https://en.wikipedia.org/wiki/CBL-Mariner CBL-Mariner] is an [https://github.com/microsoft/CBL-Mariner open source] Linux distribution created by Microsoft, targeted at first-party and container workloads on Azure. It is used both as a container runner host and a base container image.
Mariner adopted the ELF stamping packaging metadata spec in [https://github.com/microsoft/CBL-Mariner/blob/1.0/SPECS/mariner-rpm-macros/gen-ld-script.sh version 1.0], initially to add OS metadata; package-level metadata will be added in a following release.

=== Debian ===
A package-level proof-of-concept is included in the [https://github.com/systemd/package-notes/blob/main/dh_package_notes package-notes] repository. A [https://salsa.debian.org/bluca/debhelper/-/tree/notes_metadata system-level proof-of-concept] that implicitly enables ELF stamping by default in all builds will be proposed for adoption in the future.

== Scope ==
* Proposal owners:
** create a specification (First version DONE: [https://systemd.io/COREDUMP_PACKAGE_METADATA COREDUMP_PACKAGE_METADATA]. We might need to make some adjustments based on the deployment in Fedora, but no big changes are expected.)
** write a script to generate the package note (First version DONE: [https://github.com/systemd/package-notes/blob/main/generate-package-notes.py generate-package-notes.py])
** provide a patch for `redhat-rpm-config` to insert appropriate compilation options
** extend systemd's coredumpctl to extract and display this information (DONE: [https://github.com/systemd/systemd/pull/19135 PR #19135], available in systemd-249)
** submit pull request to Packaging Guidelines
* Other developers:
** possibly add support in abrt?
* Release engineering: There should be no impact.
* Policies and guidelines: The new flags should be mentioned in Packaging Guidelines.
* Trademark approval: N/A (not needed for this Change)
* Alignment with Objectives: It might be relevant for Minimization. Even though it increases the image size a tiny bit, it makes minimized images work a bit better.

== Upgrade/compatibility impact ==
No impact.

== How To Test ==
<pre>
$ bash -c 'kill -SEGV $$'
$ coredumpctl
TIME                            PID   UID   GID SIG     COREFILE EXE            SIZE PACKAGE
Mon 2021-03-01 14:37:22 CET  855151  1000  1000 SIGSEGV present  /usr/bin/bash 51.7K bash-5.1.0-2.fc34.x86_64
</pre>

== User Experience ==
`coredumpctl` should display information about package versions. `readelf --notes` or similar tools can be used on `.so` files and compiled programs to extract the JSON blurb that describes the originating package.

== Dependencies ==
None.

== Contingency Plan ==
* Contingency mechanism: Remove the new compilation flags. Rebuild any packages that were built with the new flags.
* Contingency deadline: Beta freeze.
* Blocks release? No.

== Documentation ==
* https://systemd.io/COREDUMP_PACKAGE_METADATA/
* https://github.com/systemd/package-notes

See also [[Changes/DebuginfodByDefault]].

-- 
Ben Cotton
He / Him / His
Fedora Program Manager
Red Hat
TZ=America/Indiana/Indianapolis