On Thu, May 07, 2026 at 12:47:43PM +0200, Greg KH wrote:
On Thu, May 07, 2026 at 03:05:45AM -0400, Sasha Levin wrote:
When a (security) issue goes public, fleets stay exposed until a patched kernel
is built, distributed, and rebooted into.
For many such issues the simplest mitigation is to stop calling the buggy
function. Killswitch provides that. An admin writes:
echo "engage af_alg_sendmsg -1" \
> /sys/kernel/security/killswitch/control
After this, af_alg_sendmsg() returns -EPERM on every call without
running its body. The mitigation takes effect immediately, and is dropped on
the next reboot.
A lot of recent kernel issues sit in code paths most installs only have enabled
to support a relative minority of users: AF_ALG, ksmbd, nf_tables, vsock, ax25,
and friends.
For most users, the cost of "this socket family stops working for the day" is
much smaller than the cost of running a known vulnerable kernel until the fix
land.
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Sasha Levin <[email protected]>
This is kind of funny, but understandable. Odds are a distro would want
to pick this up so that they can enable this for when their kernel
updates do not get out to users quick enough.
I figure that even if the new kernel does go out in a timely manner, there are
still days (weeks? months?) between when a new kernel is available and when the
user reboots.
Might as well try and improve their chances of survival during that period :)
One question:
+struct ks_attr {
+ struct list_head list;
+ struct kprobe kp;
+ atomic_long_t retval;
Why is this an atomic value? Shouldn't it be whatever the userspace
return type is?
The return register is `long` on every arch.
While testing this, I added the ability to modify the return value after we
create a killswitch, and figured that it could be a useful thing to keep in the
code.
But then I got worried about a race between a user changing the return value of
the killswitch and some program trying to execute the code, and getting some
combination of the old and the new return value.
Is that a real concern? I'm not sure - but making this atomic was cheap enough.
+ /* false once disengaged; per-fn file ops then return -EIDRM. */
+ bool engaged;
+ unsigned long __percpu *hits;
+ struct dentry *dir;
+ /* engaged_list holds one ref; each open per-fn fd holds one. */
+ refcount_t refcnt;
Why is a refcnt needed? Why not use a kref instead?
Ugh... no good reason, I can switch to a kref.
--
Thanks,
Sasha