>> On Mar 8, 2018, at 9:08 PM, Alexei Starovoitov <a...@fb.com> wrote: >> >> On 3/8/18 7:54 PM, Andy Lutomirski wrote: >> >> >> >>> On Mar 8, 2018, at 7:06 PM, Linus Torvalds <torva...@linux-foundation.org> >>> wrote: >>> >>> >>> Honestly, that "read twice" thing may be what scuttles this. >>> Initially, I thought it was a non-issue, because anybody who controls >>> the module subdirectory enough to rewrite files would be in a position >>> to just execute the file itself directly instead. >> >> On further consideration, I think there’s another showstopper. This patch is >> a potentially severe ABI break. Right now, loading a module *copies* it into >> memory and does not hold a reference to the underlying fs. With the patch >> applied, all kinds of use cases can break in gnarly ways. Initramfs is maybe >> okay, but initrd may be screwed. If you load an ET_EXEC module from initrd, >> then umount it, then clear the ramdisk, something will go horribly wrong. >> Exactly what goes wrong depends on whether userspace notices that umount() >> failed. Similarly, if you load one of these modules over a network and then >> lose your connection, you have a problem. > > there is not abi breakage and file cannot disappear from running task. > One cannot umount fs while file is still being used.
Sure it is. Without your patch, init_module doesn’t keep using the file, so it’s common practice to load a module and then delete or unmount it. With your patch, the unmount case breaks. This is likely to break existing userspace, so, in Linux speak it’s an ABI break. > >> >> The “read twice” thing is also bad for another reason: containers. Suppose I >> have a setup where a container can load a signed module blob. With the read >> twice code, the container can race and run an entirely different blob >> outside the container. > > Not only "read twice", but "read many". > If .text sections of elf that are not yet in memory can be modified > by malicious user, later they will be brought in with different code. > I think the easiest fix to tighten this "umh modules" to CAP_SYS_ADMIN. Given this issue, I think the patch would need Kees’s explicit ack. I had initially thought your patch had minimal security impact, but I was wrong Module security is very complicated and needs to satisfy a bunch of requirements. There is a lot of code in the kernel that assumes that it’s sufficient to verify a module once at load time, your patch changes that, and this has all kinds of nasty interactions with autoloading. Kees is very reasonable, and he’ll change his mind and ack a patch that he’s nacked when presented with a valid technical argument. But I think my ABI break observation is also a major problem, and Linus is going to be pissed if this thing lands in his tree and breaks systems due to an issue that was raised during review. So I think you need to either rework the patch or do a serious survey of how all the distros deal with modules (dracut, initramfs-tools, all the older stuff, and probably more) and make sure they can all handle your patch. I'd also be concerned about anyone who puts /lib/modules on less reliable storage than they use for the rest of their system. (I know it's quite common to have /boot be the only non-RAID partition on a system, but modules don't generally live in /boot.) Also, I think you really ought to explain how your approach will work with MODULES=n or convince Linus that it’s okay to start adding core networking features that don’t work with MODULES=n and can't be built into the main kernel image.