On Mon, Feb 18, 2019 at 8:15 PM Andy Lutomirski <l...@amacapital.net> wrote:
> On Mon, Feb 18, 2019 at 5:04 AM Thomas Gleixner <t...@linutronix.de> wrote:
> > > Another would be to have the buffer passed to flush_buffer() (i.e.
> > > the callback of decompress_fn) allocated with 4 bytes of padding
> > > past the part where the unpacked piece of data is placed for the
> > > callback to find. As in,
> > >
> > > diff --git a/lib/decompress_inflate.c b/lib/decompress_inflate.c
> > > index 63b4b7eee138..ca3f7ecc9b35 100644
> > > --- a/lib/decompress_inflate.c
> > > +++ b/lib/decompress_inflate.c
> > > @@ -48,7 +48,7 @@ STATIC int INIT __gunzip(unsigned char *buf, long len,
> > >  	rc = -1;
> > >  	if (flush) {
> > >  		out_len = 0x8000; /* 32 K */
> > > -		out_buf = malloc(out_len);
> > > +		out_buf = malloc(out_len + 4);
> >
> > +8 actually.
> >
> > >  	} else {
> > >  		if (!out_len)
> > >  			out_len = ((size_t)~0) - (size_t)out_buf; /* no limit */
> > >
> > > for gunzip/decompress and similar ones for bzip2, etc. The contents
> > > layout doesn't have anything to do with that...
> >
> > Right. That works nicely.
>
> This seems like it's just papering over the underlying problem: with
> Jann's new checks in place, strncpy_from_user() is simply buggy. Does
> the patch below look decent? It's only compile-tested, but it's
> conceptually straightforward.
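(Regarding the "+8 actually" above: if I'm reading the word-at-a-time
code right, the load that contains the terminating NUL can extend up to
sizeof(long) - 1 bytes past the end of the data, so on 64-bit the
padding needs to be at least 7 bytes, and 8 also keeps the size
word-aligned. A minimal userspace sketch of that access pattern --
hypothetical helper names, 64-bit little-endian assumed, and not the
kernel code:

#include <stdio.h>
#include <string.h>

/*
 * Hypothetical illustration, not the kernel code: a word-at-a-time
 * scanner loads full native words, so the load containing the
 * terminating NUL can reach up to sizeof(long) - 1 bytes past it.
 * If the NUL is the last byte of the allocation, those bytes are an
 * out-of-bounds read unless the buffer is padded.
 */
static int has_zero_byte(unsigned long w)
{
	/* standard "haszero" bit trick, 64-bit little-endian */
	return ((w - 0x0101010101010101UL) & ~w &
		0x8080808080808080UL) != 0;
}

static size_t word_at_a_time_strlen(const char *s)
{
	size_t res = 0;
	unsigned long w;

	for (;;) {
		/* this load may run past the end of the allocation */
		memcpy(&w, s + res, sizeof(w));
		if (has_zero_byte(w))
			break;
		res += sizeof(w);
	}
	while (s[res])
		res++;
	return res;
}

int main(void)
{
	/* 9-byte string; sizeof(long) bytes of padding keep the
	 * second word load (bytes 8..15) in bounds */
	char buf[9 + 8] = "12345678x";

	printf("%zu\n", word_at_a_time_strlen(buf)); /* prints 9 */
	return 0;
}

Without the padding, that second load would touch bytes 9..15 of a
9-byte allocation.)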
Nit: The "This could be optimized" comment has unbalanced parentheses around it. Nit: Maybe switch the order of "max" and "IS_UNALIGNED(src+res)" in the loop for unaligned prefix, or tell the compiler that "max" is likely to be non-zero? That might be a tiny bit more efficient in the aligned case? But I don't know much about making code fast, so feel free to ignore this. I think there is still similar code in some other places in arch/x86; back when I did the conversion to _ASM_EXTABLE_UA, I stumbled over a few similar-looking things. In particular: load_unaligned_zeropad() (used in some VFS code that looks pretty performance-critical) still uses _ASM_EXTABLE for reading from kernel memory; it has a comment saying: /* * Load an unaligned word from kernel space. * * In the (very unlikely) case of the word being a page-crosser * and the next page not being mapped, take the exception and * return zeroes in the non-existing part. */ csum_partial_copy_generic() uses PREFETCHT0 together with the following macro; I think this can be used on both kernel and user addresses? /* * No _ASM_EXTABLE_UA; this is used for intentional prefetch on a * potentially unmapped kernel address. */ .macro ignore L=.Lignore 30: _ASM_EXTABLE(30b, \L) .endm (That comment says "kernel address", but I wrote that, and I think what I really meant is "kernel or userspace address".) > I was hoping I could get rid of the > check-maximum-address stuff, but it's needed for architectures where > the user range is adjacent to the kernel range (i.e. not x86_64). > Jann, I'm still unhappy that this code will write up to sizeof(long)-1 > user-controlled garbage bytes in-bounds past the null-terminator in > the kernel buffer. Do you think that's worth changing, too? I don't > think it's a bug per se, but it seems like a nifty little wart for an > attacker to try to abuse. Hm. I guess it might be interesting if some code path first memsets a kernel buffer to all-zeroes, then uses strncpy_from_user() to copy into it, and then exposes the entire buffer to a different user task (or, if strncpy_from_user() was called on kernel memory, to any user task). And if the source is kernel memory (might happen e.g. via splice, there are VFS write handlers that use strncpy_from_user()), it could potentially be used to make an uninitialized memory read bug (constrained to a specific field in some slab object, or something like that) more powerful, by shoving out-of-bounds kernel data around such that it is aligned properly for the leak. So, yeah, I think if it's not too much trouble, changing that would be nice.