On 2026-05-18 20:59 +0200, Jann Horn wrote:
> On Mon, May 18, 2026 at 2:30 PM Christian Brauner <[email protected]> wrote:
> > On Sat, May 16, 2026 at 07:21:26PM +0100, Pedro Falcato wrote:
> > > Since the advent of vulns like Dirty Pipe, Dirty Frag, Copy Fail
> > > and Fragnasia, splicing a read-only file is fundamentally unsafe.
> > >
> > > As such, as a mitigation, add a way for users to block splice() for
> > > files they cannot write to. This eliminates this whole class of exploits
> > > that use splice()+confusion in pipe/net/etc code to gain write-access to
> > > files they can only read.
> > >
> > > Users can simply toggle fs.splice_needs_write=1 and suddenly splice() will
> > > refuse perfectly legal splices() from files it can only read, but not
> > > write.
> [...]
> > At that point you can also just ENOSYS splice() and vmsplice() via
> > seccomp and force a fallback on non-splice codepaths that userspace has
> > to have anyway as splice() isn't supported unconditionally.
> >
> > It feels like a knee-jerk reaction to an exploit class originating in
> > buggy modules that we have little control over and we would extend an
> > API to users that is really difficult to use.
> >
> > What might make more sense is to add a splice specific security_*() hook
> > into the code so that an LSM can deny usage of splice in whatever way it
> > wants to - bpf lsm or in-tree lsm.
>
> I feel like a sysctl for "disable all the splice-like interfaces and
> zerocopy TX" would be reasonable to have? Either by blocking such
> operations, or better, silently downgrading all such operations to
> normal copies.

I think blocking isn't going to be useful as it will make it harder for
distros to turn this on. So we should degrade.

> FWIW, vmsplice() and splice() are also weird in how much memory they
> can implicitly pin - if you call vmsplice() on a single byte in a 2M
> THP page, I believe you'll implicitly pin 2M of memory...

You don't have to convince me that it's a problematic API. Let's discuss
the other, more aggressive alternative: can we try to unconditionally
degrade to copy? This would affect sendfile(), splice(), and vmsplice().
Worst case, we would have to introduce the sysctl retroactively. Thoughts?

> By the way, another bug a few years ago of a similar shape was this
> one - also a bug in networking code that can lead to accidental writes
> into spliced pages:
> https://project-zero.issues.chromium.org/issues/42451650 ("ktls writes
> into spliced readonly pages")
> with fix: https://git.kernel.org/linus/c5a595000e26
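
For illustration only, here is a rough userspace sketch of the "fallback on non-splice codepaths" mentioned in the quoted text: try splice(), and if it is refused, do a plain read/write copy instead. The helper name copy_to_pipe() and the set of errno values treated as "splice unavailable" are assumptions made for this sketch, not something proposed in the thread. If the kernel degraded to copy silently, callers would not need any of this.

```python
import errno
import os

# Errors treated as "splice is not usable here". This set is an assumption
# for the sketch; a seccomp filter could be configured to return others.
_FALLBACK_ERRNOS = {errno.ENOSYS, errno.EINVAL, errno.EPERM}


def copy_to_pipe(src_fd, pipe_w_fd, count):
    """Move up to `count` bytes from src_fd into the write end of a pipe.

    Returns (bytes_moved, used_splice). Hypothetical helper, for illustration.
    """
    splice = getattr(os, "splice", None)  # os.splice: Python 3.10+, Linux only
    if splice is not None:
        try:
            return splice(src_fd, pipe_w_fd, count), True
        except OSError as exc:
            if exc.errno not in _FALLBACK_ERRNOS:
                raise
    # Plain copy path: read into a userspace buffer, then write it out.
    data = os.read(src_fd, count)
    written = 0
    while written < len(data):
        written += os.write(pipe_w_fd, data[written:])
    return written, False
```

The caller only sees a byte count either way. Keep `count` below the pipe capacity in this sketch, since the copy path blocks on a full pipe.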

