Hi,
Great findings by Qualys, as usual!
Below are some comments on my attempt at reproducing the issue against
Rocky Linux 9.5's systemd-coredump (systemd-252-46.el9_5.3.x86_64):
On Thu, May 29, 2025 at 05:17:08PM +0000, Qualys Security Advisory wrote:
Local information disclosure in systemd-coredump (CVE-2025-4598)
========================================================================
------------------------------------------------------------------------
Background
------------------------------------------------------------------------
While working on Ubuntu's apport, we remembered that various other
distributions (Red Hat Enterprise Linux 9 and Fedora for example) use
systemd-coredump as a core-dump handler in /proc/sys/kernel/core_pattern
(instead of apport). We began to wonder: how does systemd-coredump solve
the kill-and-replace race condition that we exploited against apport?
Similarly to apport, systemd-coredump writes all core files into a
hard-coded directory, /var/lib/systemd/coredump/. Before December 2022,
systemd-coredump allowed users to read all of their core files (through
file ACLs), including the core files of SUID or SGID programs, which of
course allowed local attackers to read the contents of /etc/shadow by
simply crashing su for example; this vulnerability was CVE-2022-4415,
discovered and published by Matthias Gerstner:
https://www.openwall.com/lists/oss-security/2022/12/21/3
FWIW, when run on Fedora 34, my reproducer trying to trigger the new bug
instead triggers the above older bug - file ACLs are in fact set to
enable the non-root user to read a coredump from a SUID program.
This old vulnerability was patched by introducing a new function,
grant_user_access(), which decides whether a user should be allowed to
read a core file or not, by analyzing the /proc/pid/auxv of the crashed
process: if its AT_UID and AT_EUID match, and if its AT_GID and AT_EGID
match, and if its AT_SECURE flag is 0, then read access is allowed;
otherwise (if the crashed process is SUID or SGID), read access is
denied (only root can read the core file).
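To make the check concrete, here is a minimal sketch of that decision
in Python. It is not systemd's C code: the names parse_auxv() and
user_may_read() are hypothetical, the 64-bit native-endian layout is an
assumption, and denying access when an entry is missing is my own
conservative choice. Only the AT_* comparisons come from the description
above.

```python
import struct

# Auxiliary-vector keys as defined in <elf.h>.
AT_NULL, AT_UID, AT_EUID, AT_GID, AT_EGID, AT_SECURE = 0, 11, 12, 13, 14, 23


def parse_auxv(raw: bytes, word: str = "Q") -> dict:
    """Decode raw /proc/<pid>/auxv bytes into a {key: value} dict.

    Assumes native byte order and 64-bit words ("Q"); pass "I" for a
    32-bit process. Trailing partial entries are ignored.
    """
    fmt = "=" + word * 2
    size = struct.calcsize(fmt)
    usable = len(raw) - len(raw) % size
    entries = {}
    for key, value in struct.iter_unpack(fmt, raw[:usable]):
        if key == AT_NULL:  # terminator entry
            break
        entries[key] = value
    return entries


def user_may_read(auxv: dict) -> bool:
    """Return True only if the process looks neither SUID nor SGID."""
    needed = (AT_UID, AT_EUID, AT_GID, AT_EGID, AT_SECURE)
    if any(key not in auxv for key in needed):
        return False  # assumption: missing data means deny
    return (
        auxv[AT_UID] == auxv[AT_EUID]
        and auxv[AT_GID] == auxv[AT_EGID]
        and auxv[AT_SECURE] == 0
    )
```

For example, user_may_read(parse_auxv(open("/proc/self/auxv",
"rb").read())) evaluates the check for the calling process. Note that
the result describes whichever process owns that PID at the moment the
file is read.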
------------------------------------------------------------------------
Analysis
------------------------------------------------------------------------
Unfortunately, we soon realized that systemd-coredump does not provide
any protection at all against the kill-and-replace race condition that
we exploited in apport. In other words, an attacker can simply crash a
SUID process such as unix_chkpwd, SIGKILL and replace it with a non-SUID
process (before its /proc/pid/auxv is analyzed by systemd-coredump), and
therefore gain read access to the core file of the crashed SUID process,
and hence to the contents of /etc/shadow.
On the one hand, exploiting systemd-coredump is easier than exploiting
apport, because we do not need to replace the crashed SUID process with
a namespaced process: we can replace it with any non-SUID process, whose
AT_UID and AT_EUID match, whose AT_GID and AT_EGID match, and whose
AT_SECURE flag is 0.
On the other hand, winning the kill-and-replace race condition against
systemd-coredump is harder: unlike apport, systemd-coredump is written
in C, and its initialization takes little time. To widen the window of
the race condition, we pass an argv[0] of 128K '\177' characters to the
SUID process: this slows down the analysis of its /proc/pid/cmdline (by
systemd-coredump, before the analysis of its /proc/pid/auxv) and gives
us enough time to replace the crashed SUID process with a non-SUID
process.
------------------------------------------------------------------------
Proof of concept
------------------------------------------------------------------------
$ grep PRETTY_NAME= /etc/os-release
PRETTY_NAME="Fedora Linux 41 (Server Edition)"
$ id
uid=1001(evey) gid=1001(evey) groups=1001(evey) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
$ while true; do
    pid="$(printf 'whatever\0' | ./CVE-2025-4598 /usr/sbin/unix_chkpwd "$USER" nullok)";
    pidwait -f /usr/lib/systemd/systemd-coredump;
    if coredumpctl -1 dump "$pid" 2>/dev/null | strings -a | grep '\$[0-9A-Za-z]\+\$[0-9A-Za-z./]'; then
      break;
    fi;
  done
...
pid 364536
tid 364521
tid 364540
died in main: 177
theadmin:$y$j9T$APKdqQO.brzhEbC2JFd.5zb7$Rz2q.0umBr8AmkwlozWr8/yphm/ckEHIOMo9vcj.Wj/::0:99999:7:::
evey:$y$j9T$QUW3HEErO9CYuGrRhiQjt.$.befySFW/nA48280u/Hk1XrcA2yDZ6Z1s7iRf91nJuA:20188:0:99999:7:::
I've attached my attempt at partially reconstructing Qualys' exploit
(which I haven't seen) that the above script uses, as well as the script
with some edits.
I think I implemented most of what Qualys described (of the parts
relevant to systemd-coredump rather than only to apport), except that I
simply use fork() rather than clone() (slower PID reuse) and I didn't
implement usage of inotify (harder to win the race leading to password
hashes in dump). I've been testing this after:
sysctl kernel.pid_max=2000
control unix_chkpwd public # Undo SIG/Security hardening
With the PID range reduced from the default of 4M down to 2K, PID reuse
is quick even with simple fork(). I am getting frequent unix_chkpwd
coredumps (without password hashes in them, which is as expected without
inotify), but none of them are getting ACLs set for read by the user
(unexpected - I thought I'd win this easier race once in a while), e.g.:
The PoC looks good to me overall, but the issue is that the replacement
is not really happening while the dump is being generated.
Since the exit of the SUID process is not handled by its parent, it
remains defunct for too long. Either set SIGCHLD to SIG_IGN or call
waitpid() on it right after the SIGKILL. Then you need to spawn the
extra processes to replace the PID "fast enough". I think fork() is too
slow for this; you may need to use clone() as Qualys mentioned. For me
it always works with clone(). After that, it should work!
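As a rough illustration of the reaping step only, here is a sketch in
Python rather than the C of the attached reproducer. The helper names
spawn_sleeping_child() and kill_and_reap() are made up for this example,
and the child here is an ordinary sleeping process, not a crashing SUID
program. The point is that waitpid() right after the SIGKILL collects
the zombie, so the kernel can release the PID for reuse.

```python
import os
import signal


def spawn_sleeping_child() -> int:
    """Fork a child that just sleeps until it is signalled; return its PID."""
    pid = os.fork()
    if pid == 0:
        signal.pause()  # child: block until a signal arrives
        os._exit(0)     # not reached when the child is killed by SIGKILL
    return pid


def kill_and_reap(pid: int) -> int:
    """SIGKILL a child and reap it immediately; return the raw wait status."""
    os.kill(pid, signal.SIGKILL)
    # Without this wait the child stays defunct and keeps its PID allocated.
    _, status = os.waitpid(pid, 0)
    return status
```

The alternative mentioned above, signal.signal(signal.SIGCHLD,
signal.SIG_IGN) in the parent before forking, makes Linux reap exited
children automatically, at the cost of no longer being able to collect
their exit status with waitpid().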
Target pid 1588, current pid 1589 - missed target, retrying
Target pid 1590, current pid 1591 - missed target, retrying
Replaced pid 1592
getfacl: Removing leading '/' from absolute path names
# file: var/lib/systemd/coredump/core.unix_chkpwd.1000.17099079ebb84acbbb2dc4d8dd38e858.1592.1748566368000000.zst
# owner: root
# group: root
user::rw-
group::r--
other::---
I'd appreciate any hints here.
Meanwhile, Red Hat confirms RHEL 9 and 10 are affected, and curiously
lists not only systemd, but also NetworkManager and rpm-ostree among
affected packages - I wonder why?
Alexander
I hope it helps!
David