From: Andy Lutomirski
Sent: November 13, 2018 at 5:16:09 PM GMT
> To: Igor Stoppa <igor.sto...@gmail.com>
> Cc: Kees Cook <keesc...@chromium.org>, Peter Zijlstra <pet...@infradead.org>, 
> Nadav Amit <nadav.a...@gmail.com>, Mimi Zohar <zo...@linux.vnet.ibm.com>, 
> Matthew Wilcox <wi...@infradead.org>, Dave Chinner <da...@fromorbit.com>, 
> James Morris <jmor...@namei.org>, Michal Hocko <mho...@kernel.org>, Kernel 
> Hardening <kernel-harden...@lists.openwall.com>, linux-integrity 
> <linux-integr...@vger.kernel.org>, LSM List 
> <linux-security-mod...@vger.kernel.org>, Igor Stoppa 
> <igor.sto...@huawei.com>, Dave Hansen <dave.han...@linux.intel.com>, Jonathan 
> Corbet <cor...@lwn.net>, Laura Abbott <labb...@redhat.com>, Randy Dunlap 
> <rdun...@infradead.org>, Mike Rapoport <r...@linux.vnet.ibm.com>, open 
> list:DOCUMENTATION <linux-doc@vger.kernel.org>, LKML 
> <linux-ker...@vger.kernel.org>, Thomas Gleixner <t...@linutronix.de>
> Subject: Re: [PATCH 10/17] prmem: documentation
> 
> 
> On Tue, Nov 13, 2018 at 6:25 AM Igor Stoppa <igor.sto...@gmail.com> wrote:
>> Hello,
>> I've been studying v4 of the patch-set [1] that Nadav has been working on.
>> Incidentally, I think it would be useful to cc also the
>> security/hardening ml.
>> The patch-set seems to be close to final, so I am resuming this discussion.
>> 
>> On 30/10/2018 19:06, Andy Lutomirski wrote:
>> 
>>> I support the addition of a rare-write mechanism to the upstream kernel.  
>>> And I think that there is only one sane way to implement it: using an 
>>> mm_struct. That mm_struct, just like any sane mm_struct, should only differ 
>>> from init_mm in that it has extra mappings in the *user* region.
>> 
>> After reading the code, I see what you meant.
>> I think I can work with it.
>> 
>> But I have a couple of questions wrt the use of this mechanism, in the
>> context of write rare.
>> 
>> 
>> 1) mm_struct.
>> 
>> Iiuc, the purpose of the patchset is mostly (only?) to patch kernel code
>> (live patch?), which seems to happen sequentially and in a relatively
>> standardized way, like replacing the NOPs specifically placed in the
>> functions that need patching.
>> 
>> This is a bit different from the more generic write-rare case, applied
>> to data.
>> 
>> As example, I have in mind a system where both IMA and SELinux are in use.
>> 
>> In this system, a file is accessed for the first time.
>> 
>> That would trigger 2 things:
>> - evaluation of the SELinux rules and probably update of the AVC cache
>> - IMA measurement and update of the measurements
>> 
>> Both of them could be write protected, meaning that they would both have
>> to be modified through the write rare mechanism.
>> 
>> While the events, for 1 specific file, would be sequential, it's not
>> difficult to imagine that multiple files could be accessed at the same time.
>> 
>> If the update of the data structures in both IMA and SELinux must use
>> the same mm_struct, that would have to be somehow regulated and it would
>> introduce an unnecessary (imho) dependency.
>> 
>> How about having one mm_struct for each writer (core or thread)?
> 
> I don't think that helps anything.  I think the mm_struct used for
> prmem (or rare_write or whatever you want to call it) should be
> entirely abstracted away by an appropriate API, so neither SELinux nor
> IMA need to be aware that there's an mm_struct involved.  It's also
> entirely possible that some architectures won't even use an mm_struct
> behind the scenes -- x86, for example, could have avoided it if there
> were a kernel equivalent of PKRU.  Sadly, there isn't.
> 
>> 2) Iiuc, the purpose of the 2 pages being remapped is that the target of
>> the patch might spill across the page boundary, however if I deal with
>> the modification of generic data, I shouldn't (shouldn't I?) assume that
>> the data will not span across multiple pages.
> 
> The reason for the particular architecture of text_poke() is to avoid
> memory allocation to get it working.  i think that prmem/rare_write
> should have each rare-writable kernel address map to a unique user
> address, possibly just by offsetting everything by a constant.  For
> rare_write, you don't actually need it to work as such until fairly
> late in boot, since the rare_writable data will just be writable early
> on.
> 
>> If the data spans across multiple pages, in unknown amount, I suppose
>> that I should not keep interrupts disabled for an unknown time, as it
>> would hurt preemption.
>> 
>> What I thought, in my initial patch-set, was to iterate over each page
>> that must be written to, in a loop, re-enabling interrupts in-between
>> iterations, to give pending interrupts a chance to be served.
>> 
>> This would mean that the data being written to would not be consistent,
>> but it's a problem that would have to be addressed anyways, since it can
>> be still read by other cores, while the write is ongoing.
> 
> This probably makes sense, except that enabling and disabling
> interrupts means you also need to restore the original mm_struct (most
> likely), which is slow.  I don't think there's a generic way to check
> whether in interrupt is pending without turning interrupts on.

I guess that enabling IRQs might break some hidden assumptions in the code,
but is there a fundamental reason that IRQs need to be disabled? use_mm()
got them enabled, although it is only suitable for kernel threads.

Reply via email to