[email protected] wrote:
> On 2024-03-30 18:25, Bruno Haible wrote:
>> Eric Gallager wrote:
>>> Hm, so should automake's `distcheck` target be updated to perform
>>> these checks as well, then?
>> The first mentioned check can not be automated. ...
>> The second mentioned check could be done by the maintainer, ...
> I agree that distcheck is good but not a cure all.  Any static system
> can be attacked when there is motive, and unit tests are easily gamed.
> The issue seems to be releases containing binary data for unit tests,
> instead of source or scripts to generate that data.  In this case, that
> binary data was used to smuggle in heavily obfuscated object code.

The best analysis in one place that I have found so far is
<URL:https://gynvael.coldwind.pl/?lang=en&id=782>. In brief, grep is
used to locate the main backdoor files by searching for marker strings.
After running tests/files/bad-3-corrupt_lzma2.xz through tr(1), it
becomes a /valid/ xz file that decompresses to a shell script that
extracts a second shell script from part of the compressed data in
tests/files/good-large_compressed.lzma and pipes it to a shell.
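
For concreteness, a sketch of that first stage (the tr arguments below
are the byte-swap reported in the public write-ups; treat them as
illustrative rather than authoritative):

    tr "\t \-_" " \t_\-" < tests/files/bad-3-corrupt_lzma2.xz \
        | xz -d > stage1.sh   # now a valid .xz stream; yields a script
    # the infected build effectively piped that script straight into sh
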
That second script has two major functions: first, it searches the test
files for four six-byte markers, and it then extracts and decrypts
(using a simple RC4-alike implemented in Awk) the binary backdoor also
found in tests/files/good-large_compressed.lzma. The six-byte markers
mark the beginning and end of raw LZMA2 streams obfuscated with a simple
substitution cipher. Any such streams found would be decompressed and
read by the shell, but neither of the known crocked releases had any
files containing those markers.
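
Purely as a structural illustration of that scan (the marker bytes and
the carving step below are placeholders, not the real ones):

    BEGIN_MARK='AAAAAA'   # hypothetical six-byte begin marker
    END_MARK='BBBBBB'     # hypothetical six-byte end marker
    for f in tests/files/*; do
        grep -q "$BEGIN_MARK" "$f" 2>/dev/null || continue
        echo "candidate: $f"
        # print whatever sits between the markers; the real script then
        # deciphered and decompressed that span before handing it to sh
        sed -n "s/.*${BEGIN_MARK}\(.*\)${END_MARK}.*/\1/p" "$f"
    done
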
The binary backdoor is an x86-64 object
that gets unpacked into liblzma_la-crc64-fast.o, unless m4/gettext.m4
contains "dnl Convert it to C string syntax.", which is a clever flag
because almost no one checks that the m4 files in release tarballs
actually match what the GNU project distributes. The object
itself is just the backdoor and presumably provides the symbol
_get_cpuid as its entrypoint, since the unpacker script patches the
src/liblzma/check/crc{64,32}_fast.c files in a pipeline to add calls to
that function and drops the compiled objects in .libs/. Running make
will then skip building those objects, since they are already
up-to-date, and the backdoored objects get linked into the final binary.
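
That last step relies on nothing more than make's timestamp check; a
toy demonstration with hypothetical file names (not the actual xz
build rules):

    printf 'int f(void) { return 0; }\n' > demo.c
    printf 'demo.o: demo.c\n\tcc -c demo.c -o demo.o\n' > Makefile
    # the "attacker" drops a pre-built object newer than demo.c
    printf 'int f(void) { return 1; }\n' | cc -x c -c -o demo.o -
    make    # reports "'demo.o' is up to date"; nothing is rebuilt
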
Commit 6e636819e8f070330d835fce46289a3ff72a7b89
(<URL:https://git.tukaani.org/?p=xz.git;a=commitdiff;h=6e636819e8f070330d835fce46289a3ff72a7b89>)
was an update to the backdoor. The commit message is suspicious,
claiming the use of "a constant seed" to generate reproducible test
files, but /not/ declaring how the files were produced, which of course
prevents reproducibility.
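
For contrast, a commit that actually wanted reproducible test data
would name its generator, along these lines (file name, seed, and
tools are invented for illustration; assumes sha256sum and xxd are
available):

    # tests/generate-random-file.sh: regenerate the binary test input
    seed='fixed-seed-42'
    i=0
    while [ "$i" -lt 256 ]; do
        printf '%s:%d' "$seed" "$i" | sha256sum | cut -d' ' -f1
        i=$((i + 1))
    done | xxd -r -p > tests/files/generated-random.bin
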
> With a reproducible build system, multiple maintainers can "make dist"
> and compare the output to cross-check for erroneous / malicious dist
> environments. Multiple signatures should be harder to compromise,
> assuming each is independent and generally trustworthy.

This can only work if a package /has/ multiple active maintainers.
You also have a small misunderstanding here: "make dist" prepares a
(source) release tarball, not a binary build, so this is a
closely-related issue but actually distinct from reproducible builds.
Also easier to solve, since we only have to make the source tarball
reproducible.
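
Concretely, each maintainer (or an independent verifier) would run
something like the following and compare notes; the tarball name is a
placeholder, and this presumes the dist rules themselves are
reproducible (pinned autotools versions, stable file ordering, clamped
timestamps):

    ./autogen.sh && ./configure
    make dist
    sha256sum foo-1.2.3.tar.gz   # publish this hash and compare it
                                 # against the other signers' results
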
> Maybe GNU should establish a cross-verification signing standard and
> "dist verification service" that automates this process? Point it to
> a repo and tag, request a signed hash of the dist package... Then
> downstream projects could check package signatures from both the
> maintainer and such third-party verifiers to check that nothing was
> inserted outside of version control.

Essentially, this would be an automated release building service: upon
request, make a Git checkout, run autogen.sh or equivalent, make dist,
and publish or hash the result. The problem is that an attacker who
manages to gain commit access to a repository may be able to launch
attacks on the release building service, since "make dist" can run
scripts. The service could probably mount its working filesystem noexec,
since preparing a source release should not require running (non-system)
binaries, and scripts can still be run by feeding them directly into
their interpreters even on a noexec filesystem. That mitigation still
leaves every installed interpreter and system tool available to a
hostile repository, however.
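
A rough outline of one run of such a service (the repository URL and
tag are placeholders standing in for the request parameters; this is
not a worked-out design):

    REPO_URL=https://example.org/project.git
    TAG=v1.2.3
    git clone --depth 1 --branch "$TAG" "$REPO_URL" work
    cd work
    sh ./autogen.sh   # run via the interpreter rather than relying on
    sh ./configure    # the executable bit
    make dist
    sha256sum *.tar.* > ../dist.sha256   # publish and/or sign this
    # Caveat from above: a noexec working mount only blocks direct
    # execution of checked-in binaries; a hostile repository can still
    # feed scripts to any installed interpreter during these steps.
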
-- Jacob