On Thu, May 07, 2026 at 12:47:43PM +0200, Greg KH wrote:
On Thu, May 07, 2026 at 03:05:45AM -0400, Sasha Levin wrote:
When a (security) issue goes public, fleets stay exposed until a patched kernel
is built, distributed, and rebooted into.

For many such issues the simplest mitigation is to stop calling the buggy
function. Killswitch provides that. An admin writes:

    echo "engage af_alg_sendmsg -1" \
        > /sys/kernel/security/killswitch/control

After this, af_alg_sendmsg() returns -EPERM on every call without
running its body. The mitigation takes effect immediately, and is dropped on
the next reboot.
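
As a minimal userspace sketch of the command format shown above ("engage
<function> <retval>"), the helper below builds the string an admin would write
to the control file. The name ks_format_engage() is hypothetical and not part
of the patch; the only thing taken from the description is the command layout.
It also makes explicit that -1 is -EPERM on Linux.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): format an
 * "engage <function> <retval>" command into buf.
 * Returns 0 on success, -1 if the output was truncated or snprintf failed.
 */
static int ks_format_engage(char *buf, size_t len, const char *fn, long retval)
{
	int n = snprintf(buf, len, "engage %s %ld", fn, retval);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Calling ks_format_engage(buf, sizeof(buf), "af_alg_sendmsg", -EPERM) yields the
same string as the echo example, since EPERM is 1.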

A lot of recent kernel issues sit in code paths most installs only have enabled
to support a relative minority of users: AF_ALG, ksmbd, nf_tables, vsock, ax25,
and friends.

For most users, the cost of "this socket family stops working for the day" is
much smaller than the cost of running a known vulnerable kernel until the fix
lands.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Sasha Levin <[email protected]>

This is kind of funny, but understandable.  Odds are a distro would want
to pick this up so that they can enable this for when their kernel
updates do not get out to users quickly enough.

I figure that even if the new kernel does go out in a timely manner, there are
still days (weeks? months?) between when a new kernel is available and when the
user reboots.

Might as well try and improve their chances of survival during that period :)

One question:

+struct ks_attr {
+       struct list_head        list;
+       struct kprobe           kp;
+       atomic_long_t           retval;

Why is this an atomic value?  Shouldn't it be whatever the userspace
return type is?

The return register is `long` on every arch.

While testing this, I added the ability to modify the return value after we
create a killswitch, and figured that it could be a useful thing to keep in the
code.

But then I got worried about a race between a user changing the return value of
the killswitch and some program trying to execute the code, and getting some
combination of the old and the new return value.

Is that a real concern? I'm not sure - but making this atomic was cheap enough.
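
The pattern being described can be sketched in userspace with C11 atomics. This
is an analogy only, assuming a struct with a single atomic return value; the
names ks_retval_demo, ks_demo_set_retval() and ks_demo_get_retval() are
hypothetical, and the kernel code would use atomic_long_set()/atomic_long_read()
rather than <stdatomic.h>. Relaxed ordering is used because the only property
wanted here is that a reader sees either the old or the new value, never a mix.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the retval field of struct ks_attr. */
struct ks_retval_demo {
	atomic_long retval;
};

/* Writer side: an admin changes the value an engaged function returns. */
static void ks_demo_set_retval(struct ks_retval_demo *ks, long v)
{
	atomic_store_explicit(&ks->retval, v, memory_order_relaxed);
}

/* Reader side: the probe handler fetches the value to return. */
static long ks_demo_get_retval(struct ks_retval_demo *ks)
{
	return atomic_load_explicit(&ks->retval, memory_order_relaxed);
}
```

In the kernel, READ_ONCE()/WRITE_ONCE() on a plain long would be another common
way to get the same no-tearing guarantee for an aligned word-sized value.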

+       /* false once disengaged; per-fn file ops then return -EIDRM. */
+       bool                    engaged;
+       unsigned long __percpu  *hits;
+       struct dentry           *dir;
+       /* engaged_list holds one ref; each open per-fn fd holds one. */
+       refcount_t              refcnt;

Why is a refcnt needed?  Why not use a kref instead?

Ugh... no good reason, I can switch to a kref.
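
For context, struct kref is a thin wrapper around refcount_t that adds a
release callback invoked when the last reference is dropped. A userspace
sketch of that shape, assuming the two reference holders named in the comment
above (the engaged list and an open per-fn fd), could look like this. All
demo_* names are hypothetical; the real API is kref_init(), kref_get() and
kref_put() from <linux/kref.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace analogue of struct kref: a counter plus a release convention. */
struct demo_kref {
	atomic_int count;
};

static void demo_kref_init(struct demo_kref *k)
{
	atomic_init(&k->count, 1);	/* creator holds the first reference */
}

static void demo_kref_get(struct demo_kref *k)
{
	atomic_fetch_add(&k->count, 1);
}

/* Drop one reference; call release() and return 1 if it was the last. */
static int demo_kref_put(struct demo_kref *k,
			 void (*release)(struct demo_kref *))
{
	if (atomic_fetch_sub(&k->count, 1) == 1) {
		release(k);
		return 1;
	}
	return 0;
}

/* Hypothetical object embedding the reference count. */
struct demo_ks_attr {
	int released;
	struct demo_kref ref;
};

/* Release callback: recover the containing object from the embedded ref. */
static void demo_ks_release(struct demo_kref *k)
{
	struct demo_ks_attr *ks = (struct demo_ks_attr *)
		((char *)k - offsetof(struct demo_ks_attr, ref));

	ks->released = 1;	/* a real release would free the object */
}
```

Like kref_put(), demo_kref_put() returns 1 only for the call that dropped the
final reference, so the release logic lives in one place instead of being
repeated at every put site.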

--
Thanks,
Sasha
