On 2026-05-19 12:51 +0200, Mateusz Guzik wrote:
> On Tue, May 19, 2026 at 11:49 AM Christian Brauner <[email protected]> wrote:
> >
> > On 2026-05-18 20:59 +0200, Jann Horn wrote:
> > > I feel like a sysctl for "disable all the splice-like interfaces and
> > > zerocopy TX" would be reasonable to have? Either by blocking such
> > > operations, or better, silently downgrading all such operations to
> > > normal copies.
> >
> [..]
> > I think blocking isn't going to be useful as it will make it harder for
> > distros to turn this on. So we should degrade.
> >
> [..]
> > Let's discuss the other aggressive alternative: Can we try and
> > unconditionally degrade to copy? This would affect sendfile(), splice(),
> > and vmsplice(). Worst-case we would have to introduce the sysctl
> > retroactively.
> >
> 
> I know at least nginx uses sendfile, but I never benchmarked how much it buys.
> 
> The original patch as proposed filters by rw perms on the file, which
> I expect to exclude nginx.
> 
> While kernel-internal copy is still going to beat a userspace-based
> read/write loop, this is still going to be a hit and I expect people
> are going to complain. Afterwards you may end up with tutorials how to
> re-enable pre-patch behavior, partially defeating the point.
> 
> How about denial of splice usage or degradation to copy are still on
> the table, but based on a different criterion: whether code involved
> is "known good" for lack of a better description. iow the kernel would
> maintain a whitelist of "safe" cases. Random-ass AF_NOBODYEVERHEARDOF
> does not make the cut.

I had thought about that too, but I felt a bit iffy about it. You could
envision an FOP_* flag for this:

  /* Module may use splice-like apis */
  #define FOP_MAY_SPLICE          ((__force fop_flags_t)(1 << 8))

But that doesn't address how fundamentally broken vmsplice(), for example,
really is, or that probably no one should get to use it in its current
form.
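
To make the flag idea a bit more concrete, here is a rough userspace
model of how such a check could gate the zero-copy path and otherwise
degrade to a plain copy. This is only a sketch: the struct names, the
enum, and pick_xfer_mode() are made up for illustration and are not
existing kernel interfaces, the sparse __force annotation is dropped
because this is not kernel code, and requiring the flag on both ends is
an assumption, not something settled in this thread.

```c
/* Userspace model only; not kernel code. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t fop_flags_t;

/* Bit position taken from the FOP_MAY_SPLICE proposal above. */
#define FOP_MAY_SPLICE ((fop_flags_t)(1u << 8))

/* Hypothetical stand-ins for struct file_operations and struct file. */
struct model_fops {
	fop_flags_t fop_flags;
};

struct model_file {
	const struct model_fops *f_op;
};

enum xfer_mode {
	XFER_COPY,	/* degrade: copy the data */
	XFER_SPLICE,	/* zero-copy path allowed */
};

/*
 * Hypothetical helper: allow the zero-copy path only when both ends
 * have opted in via FOP_MAY_SPLICE; everything else silently falls
 * back to a copy instead of failing.
 */
static enum xfer_mode pick_xfer_mode(const struct model_file *in,
				     const struct model_file *out)
{
	assert(in != NULL && out != NULL);
	assert(in->f_op != NULL && out->f_op != NULL);

	if ((in->f_op->fop_flags & FOP_MAY_SPLICE) &&
	    (out->f_op->fop_flags & FOP_MAY_SPLICE))
		return XFER_SPLICE;

	return XFER_COPY;
}
```

With this shape, a file whose f_op never sets the flag gets XFER_COPY
no matter what the other end is, so callers keep working and only lose
the zero-copy optimization.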

> Common-case usage would have to be audited of course, but this sounds
> rather actionable and would provide hardening without much friction.

And that's the usual problem where some rando module will just raise the
flag. Maybe that's fine and we will keep up.

> I can't stress enough that mucking around splice (even if worthwhile)
> is merely addressing the currently popular attack vector and not the
> general problem.
> 
> The general problem is that the kernel is expected to be able to run
> with untrusted unprivileged users, while it avoidably exposes a huge
> attack surface. Of course there is no way around providing a bunch of
> syscalls to users, so *some* danger will always be there and one has
> to expect that even core code has bugs which will be discovered by
> LLMs in the coming months. Even then, there is tons of code which is
> currently being audited by third parties and which has no use in most
> setups. Instead it gets autoloaded in response to an exploit wishing
> to take advantage of its bugs.
> 
> The huge attack surface was always a problematic position to be in,
> but with the advent of LLMs any unskilled person can drop a 0day and
> the position is straight up untenable. In the long run there is no way
> around blocking access to code by default, way beyond the current
> splice proposal.

I see this is the "let's become goat-farmers" portion of the message.

