Re: [PATCH 03/10] mfd / platform: cros_ec: Miscellaneous character device to talk with the EC

2019-06-10 Thread Lee Jones
On Thu, 06 Jun 2019, Randy Dunlap wrote:
> > On Thu, 2019-06-06 at 16:51 +0200, Greg Kroah-Hartman wrote:
> >> Again, don't be noisy, it's not hard, and is how things have been
> >> trending for many years now.
> 
> Ack that.

Not to say that this particular print is acceptable, but there are
places where a low-level (dbg/info) print on successful probe is
helpful.  Driver initialisation is important!

There's a big difference between drivers 'being noisy', spilling all
sorts of information that may well be useful or interesting to a
driver developer, but has little value to anyone else, and providing a
single print to say that a device has been detected and successfully
initialised/probed.

I recently fell victim to a silent, but fully functional device.
Successful device initialisation should not be silent when debugging
has been set to the highest level IMHO.

And yes, of course turning on debugging for Driver Core works, but is
not practical for all cases and is certainly not the first port of
call when figuring out why initialisation seems to be failing for a
single particular device.

Truly surplus churn should absolutely be removed from the boot log, or
at the very least downgraded, leaving only truly useful information
such as highlighting a newly detected device for example.  If the user
wants an even more silent boot log, they should turn the log level
down a notch.  That is why we have log levels after all.  Simply
removing all useful prints regardless of log-level is not the way to
go IMHO.

-- 
Lee Jones [李琼斯]
Linaro Services Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog


[RESEND PATCH] Documentation/stackprotector: powerpc supports stack protector

2019-06-10 Thread Bhupesh Sharma
powerpc architecture (both 64-bit and 32-bit) supports stack protector
mechanism since some time now [see commit 06ec27aea9fc ("powerpc/64:
add stack protector support")].

Update stackprotector arch support documentation to reflect the same.

Cc: Jonathan Corbet 
Cc: Michael Ellerman 
Cc: linux-doc@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linuxppc-...@lists.ozlabs.org
Signed-off-by: Bhupesh Sharma 
---
Resend, this time Cc'ing Jonathan and doc-list.

 Documentation/features/debug/stackprotector/arch-support.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/features/debug/stackprotector/arch-support.txt 
b/Documentation/features/debug/stackprotector/arch-support.txt
index ea521f3e..32bbdfc64c32 100644
--- a/Documentation/features/debug/stackprotector/arch-support.txt
+++ b/Documentation/features/debug/stackprotector/arch-support.txt
@@ -22,7 +22,7 @@
 |   nios2: | TODO |
 |openrisc: | TODO |
 |  parisc: | TODO |
-| powerpc: | TODO |
+| powerpc: |  ok  |
 |   riscv: | TODO |
 |s390: | TODO |
 |  sh: |  ok  |
-- 
2.7.4



Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function

2019-06-10 Thread Yu-cheng Yu
On Fri, 2019-06-07 at 13:43 -0700, Andy Lutomirski wrote:
> > On Jun 7, 2019, at 12:49 PM, Yu-cheng Yu  wrote:
> > 
> > On Fri, 2019-06-07 at 11:29 -0700, Andy Lutomirski wrote:
> > > > On Jun 7, 2019, at 10:59 AM, Dave Hansen  wrote:
> > > > 
> > > > > On 6/7/19 10:43 AM, Peter Zijlstra wrote:
> > > > > I've no idea what the kernel should do; since you failed to answer the
> > > > > question what happens when you point this to garbage.
> > > > > 
> > > > > Does it then fault or what?
> > > > 
> > > > Yeah, I think you'll fault with a rather mysterious CR2 value since
> > > > you'll go look at the instruction that faulted and not see any
> > > > references to the CR2 value.
> > > > 
> > > > I think this new MSR probably needs to get included in oops output when
> > > > CET is enabled.
> > > 
> > > This shouldn’t be able to OOPS because it only happens at CPL 3,
> > > right?  We
> > > should put it into core dumps, though.
> > > 
> > > > 
> > > > Why don't we require that a VMA be in place for the entire bitmap?
> > > > Don't we need a "get" prctl function too in case something like a JIT is
> > > > running and needs to find the location of this bitmap to set bits
> > > > itself?
> > > > 
> > > > Or, do we just go whole-hog and have the kernel manage the bitmap
> > > > itself. Our interface here could be:
> > > > 
> > > >   prctl(PR_MARK_CODE_AS_LEGACY, start, size);
> > > > 
> > > > and then have the kernel allocate and set the bitmap for those code
> > > > locations.
> > > 
> > > Given that the format depends on the VA size, this might be a good
> > > idea.  I
> > > bet we can reuse the special mapping infrastructure for this — the VMA
> > > could
> > > be a MAP_PRIVATE special mapping named [cet_legacy_bitmap] or similar, and
> > > we
> > > can even make special rules to core dump it intelligently if needed.  And
> > > we
> > > can make mremap() on it work correctly if anyone (CRIU?) cares.
> > > 
> > > Hmm.  Can we be creative and skip populating it with zeros?  The CPU
> > > should
> > > only ever touch a page if we miss an ENDBR on it, so, in normal operation,
> > > we
> > > don’t need anything to be there.  We could try to prevent anyone from
> > > *reading* it outside of ENDBR tracking if we want to avoid people
> > > accidentally
> > > wasting lots of memory by forcing it to be fully populated when the read
> > > it.
> > > 
> > > The one downside is this forces it to be per-mm, but that seems like a
> > > generally reasonable model anyway.
> > > 
> > > This also gives us an excellent opportunity to make it read-only as seen
> > > from
> > > userspace to prevent exploits from just poking it full of ones before
> > > redirecting execution.
> > 
> > GLIBC sets bits only for legacy code, and then makes the bitmap read-
> > only.  That
> > avoids most issues:
> 
> How does glibc know the linear address space size?  We don’t want LA64 to
> break old binaries because the address calculation changed.

When an application starts, its highest stack address is determined.
It uses that as the maximum the bitmap needs to cover.

Yu-cheng


Re: [PATCH v3 00/33] Convert files to ReST - part 1

2019-06-10 Thread Heiko Carstens
On Sun, Jun 09, 2019 at 09:29:40AM -0300, Mauro Carvalho Chehab wrote:
> Em Sun, 9 Jun 2019 11:16:43 +0200
> > Will there be a web page (e.g. kernel.org), which contains always the
> > latest upstream version?
> 
> Yes:
> 
>   https://www.kernel.org/doc/html/latest/
> 
> I guess this one is based on Linus tree.
> 
> Jon also maintains a version at:
> 
>   https://static.lwn.net/kerneldoc/
> 
> I guess that one is based on docs-next branch from the Docs tree.
> 
> Btw, if you want to build it for yourself, you could use:
> 
>   make htmldocs

Thanks a lot!

> > I can pick these up for s390. Or do you want to send the whole series
> > in one go upstream?
> 
> Yeah, feel free to pick them via the s390 tree.

Ok, applied! :)



Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function

2019-06-10 Thread Yu-cheng Yu
On Sat, 2019-06-08 at 22:52 +0200, Pavel Machek wrote:
> Hi!
> 
> > > I've no idea what the kernel should do; since you failed to answer the
> > > question what happens when you point this to garbage.
> > > 
> > > Does it then fault or what?
> > 
> > Yeah, I think you'll fault with a rather mysterious CR2 value since
> > you'll go look at the instruction that faulted and not see any
> > references to the CR2 value.
> > 
> > I think this new MSR probably needs to get included in oops output when
> > CET is enabled.
> > 
> > Why don't we require that a VMA be in place for the entire bitmap?
> > Don't we need a "get" prctl function too in case something like a JIT is
> > running and needs to find the location of this bitmap to set bits itself?
> > 
> > Or, do we just go whole-hog and have the kernel manage the bitmap
> > itself. Our interface here could be:
> > 
> > prctl(PR_MARK_CODE_AS_LEGACY, start, size);
> > 
> > and then have the kernel allocate and set the bitmap for those code
> > locations.
> 
> For the record, that sounds like a better interface than userspace knowing
> about the bitmap formats...
>   Pavel

Initially we implemented the bitmap that way.  To manage the bitmap, every time
the application issues a syscall for a .so it loads, and the kernel does
copy_from_user() & copy_to_user() (or similar things).  If a system has a few
legacy .so files and every application does the same, it can take a long time to
boot up.

Yu-cheng



Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function

2019-06-10 Thread Yu-cheng Yu
On Fri, 2019-06-07 at 15:27 -0700, Andy Lutomirski wrote:
> > On Jun 7, 2019, at 2:09 PM, Dave Hansen  wrote:
> > 
> > On 6/7/19 1:06 PM, Yu-cheng Yu wrote:
> > > > Huh, how does glibc know about all possible past and future legacy code
> > > > in the application?
> > > 
> > > When dlopen() gets a legacy binary and the policy allows that, it will
> > > manage
> > > the bitmap:
> > > 
> > >  If a bitmap has not been created, create one.
> > >  Set bits for the legacy code being loaded.
> > 
> > I was thinking about code that doesn't go through GLIBC like JITs.
> 
> CRIU is another consideration: it would be rather annoying if CET programs
> can’t migrate between LA57 and normal machines.

When a machine migrates, does its applications' addresses change?
If no, then the bitmap should still work, right?

Yu-cheng


Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function

2019-06-10 Thread Yu-cheng Yu
On Fri, 2019-06-07 at 14:09 -0700, Dave Hansen wrote:
> On 6/7/19 1:06 PM, Yu-cheng Yu wrote:
> > > Huh, how does glibc know about all possible past and future legacy code
> > > in the application?
> > 
> > When dlopen() gets a legacy binary and the policy allows that, it will
> > manage
> > the bitmap:
> > 
> >   If a bitmap has not been created, create one.
> >   Set bits for the legacy code being loaded.
> 
> I was thinking about code that doesn't go through GLIBC like JITs.

If JIT manages the bitmap, it knows where it is.
It can always read the bitmap again, right?

Yu-cheng


Re: [PATCH v3 09/33] docs: fault-injection: convert docs to ReST and rename to *.rst

2019-06-10 Thread Federico Vaga
In data Sunday, June 9, 2019 4:26:59 AM CEST, Mauro Carvalho Chehab ha 
scritto:
> The conversion is actually:
>   - add blank lines and identation in order to identify paragraphs;
>   - fix tables markups;
>   - add some lists markups;
>   - mark literal blocks;
>   - adjust title markups.
> 
> At its new index.rst, let's add a :orphan: while this is not linked to
> the main index.rst file, in order to avoid build warnings.
> 
> Signed-off-by: Mauro Carvalho Chehab 
> ---
>  ...ault-injection.txt => fault-injection.rst} | 265 +-
>  Documentation/fault-injection/index.rst   |  20 ++
>  ...r-inject.txt => notifier-error-inject.rst} |  18 +-
>  ...injection.txt => nvme-fault-injection.rst} | 174 ++--
>  ...rovoke-crashes.txt => provoke-crashes.rst} |  40 ++-
>  Documentation/process/4.Coding.rst|   2 +-
>  .../translations/it_IT/process/4.Coding.rst   |   2 +-

Limited to translations/it_IT

Acked-by: Federico Vaga 

>  .../translations/zh_CN/process/4.Coding.rst   |   2 +-
>  drivers/misc/lkdtm/core.c |   2 +-
>  include/linux/fault-inject.h  |   2 +-
>  lib/Kconfig.debug |   2 +-
>  tools/testing/fault-injection/failcmd.sh  |   2 +-
>  12 files changed, 290 insertions(+), 241 deletions(-)
>  rename Documentation/fault-injection/{fault-injection.txt =>
> fault-injection.rst} (68%) create mode 100644
> Documentation/fault-injection/index.rst
>  rename Documentation/fault-injection/{notifier-error-inject.txt =>
> notifier-error-inject.rst} (83%) rename
> Documentation/fault-injection/{nvme-fault-injection.txt =>
> nvme-fault-injection.rst} (19%) rename
> Documentation/fault-injection/{provoke-crashes.txt => provoke-crashes.rst}
> (45%)
> 





Re: [PATCH v7 22/27] binfmt_elf: Extract .note.gnu.property from an ELF file

2019-06-10 Thread Yu-cheng Yu
On Fri, 2019-06-07 at 19:01 +0100, Dave Martin wrote:
> On Thu, Jun 06, 2019 at 01:06:41PM -0700, Yu-cheng Yu wrote:
> > An ELF file's .note.gnu.property indicates features the executable file
> > can support.  For example, the property GNU_PROPERTY_X86_FEATURE_1_AND
> > indicates the file supports GNU_PROPERTY_X86_FEATURE_1_IBT and/or
> > GNU_PROPERTY_X86_FEATURE_1_SHSTK.
> > 
> > With this patch, if an arch needs to setup features from ELF properties,
> > it needs CONFIG_ARCH_USE_GNU_PROPERTY to be set, and a specific
> > arch_setup_property().
> > 
> > For example, for X86_64:
> > 
> > int arch_setup_property(void *ehdr, void *phdr, struct file *f, bool inter)
> > {
> > int r;
> > uint32_t property;
> > 
> > r = get_gnu_property(ehdr, phdr, f, GNU_PROPERTY_X86_FEATURE_1_AND,
> >  &property);
> > ...
> > }
> 
> Although this code works for the simple case, I have some concerns about
> some aspects of the implementation here.  There appear to be some bounds
> checking / buffer overrun issues, and the code seems quite complex.
> 
> Maybe this patch tries too hard to be compatible with toolchains that do
> silly things such as embedding huge notes in an executable, or mixing
> NT_GNU_PROPERTY_TYPE_0 in a single PT_NOTE with a load of junk not
> relevant to the loader.  I wonder whether Linux can dictate what
> interpretation(s) of the ELF specs it is prepared to support, rather than
> trying to support absolutely anything.

To me, looking at PT_GNU_PROPERTY and not trying to support anything is a
logical choice.  And it breaks only a limited set of toolchains.

I will simplify the parser and leave this patch as-is for anyone who wants to
back-port.  Are there any objections or concerns?

Yu-cheng


Re: [PATCH v7 22/27] binfmt_elf: Extract .note.gnu.property from an ELF file

2019-06-10 Thread Dave Martin
On Mon, Jun 10, 2019 at 09:29:04AM -0700, Yu-cheng Yu wrote:
> On Fri, 2019-06-07 at 19:01 +0100, Dave Martin wrote:
> > On Thu, Jun 06, 2019 at 01:06:41PM -0700, Yu-cheng Yu wrote:
> > > An ELF file's .note.gnu.property indicates features the executable file
> > > can support.  For example, the property GNU_PROPERTY_X86_FEATURE_1_AND
> > > indicates the file supports GNU_PROPERTY_X86_FEATURE_1_IBT and/or
> > > GNU_PROPERTY_X86_FEATURE_1_SHSTK.
> > > 
> > > With this patch, if an arch needs to setup features from ELF properties,
> > > it needs CONFIG_ARCH_USE_GNU_PROPERTY to be set, and a specific
> > > arch_setup_property().
> > > 
> > > For example, for X86_64:
> > > 
> > > int arch_setup_property(void *ehdr, void *phdr, struct file *f, bool 
> > > inter)
> > > {
> > >   int r;
> > >   uint32_t property;
> > > 
> > >   r = get_gnu_property(ehdr, phdr, f, GNU_PROPERTY_X86_FEATURE_1_AND,
> > >&property);
> > >   ...
> > > }
> > 
> > Although this code works for the simple case, I have some concerns about
> > some aspects of the implementation here.  There appear to be some bounds
> > checking / buffer overrun issues, and the code seems quite complex.
> > 
> > Maybe this patch tries too hard to be compatible with toolchains that do
> > silly things such as embedding huge notes in an executable, or mixing
> > NT_GNU_PROPERTY_TYPE_0 in a single PT_NOTE with a load of junk not
> > relevant to the loader.  I wonder whether Linux can dictate what
> > interpretation(s) of the ELF specs it is prepared to support, rather than
> > trying to support absolutely anything.
> 
> To me, looking at PT_GNU_PROPERTY and not trying to support anything is a
> logical choice.  And it breaks only a limited set of toolchains.
> 
> I will simplify the parser and leave this patch as-is for anyone who wants to
> back-port.  Are there any objections or concerns?

No objection from me ;)  But I'm biased.

Hopefully this change should allow substantial simplification.  For one
thing, PT_GNU_PROPERTY tells its file offset and size directly in its
phdrs entry.  That should save us a lot of effort on the kernel side.

Cheers
---Dave


Re: [PATCH v7 22/27] binfmt_elf: Extract .note.gnu.property from an ELF file

2019-06-10 Thread Florian Weimer
* Yu-cheng Yu:

> To me, looking at PT_GNU_PROPERTY and not trying to support anything is a
> logical choice.  And it breaks only a limited set of toolchains.
>
> I will simplify the parser and leave this patch as-is for anyone who wants to
> back-port.  Are there any objections or concerns?

Red Hat Enterprise Linux 8 does not use PT_GNU_PROPERTY and is probably
the largest collection of CET-enabled binaries that exists today.

My hope was that we would backport the upstream kernel patches for CET,
port the glibc dynamic loader to the new kernel interface, and be ready
to run with CET enabled in principle (except that porting userspace
libraries such as OpenSSL has not really started upstream, so many
processes where CET is particularly desirable will still run without
it).

I'm not sure if it is a good idea to port the legacy support if it's not
part of the mainline kernel because it comes awfully close to creating
our own private ABI.

Thanks,
Florian


Re: [PATCH v3 13/33] docs: infiniband: convert docs to ReST and rename to *.rst

2019-06-10 Thread Jason Gunthorpe
On Sat, Jun 08, 2019 at 11:27:03PM -0300, Mauro Carvalho Chehab wrote:
> The InfiniBand docs are plain text with no markups.
> So, all we needed to do were to add the title markups and
> some markup sequences in order to properly parse tables,
> lists and literal blocks.
> 
> At its new index.rst, let's add a :orphan: while this is not linked to
> the main index.rst file, in order to avoid build warnings.
> 
> Signed-off-by: Mauro Carvalho Chehab 
> ---
>  .../{core_locking.txt => core_locking.rst}|  64 ++-
>  Documentation/infiniband/index.rst|  23 
>  .../infiniband/{ipoib.txt => ipoib.rst}   |  24 ++--
>  .../infiniband/{opa_vnic.txt => opa_vnic.rst} | 108 +-
>  .../infiniband/{sysfs.txt => sysfs.rst}   |   4 +-
>  .../{tag_matching.txt => tag_matching.rst}|   5 +
>  .../infiniband/{user_mad.txt => user_mad.rst} |  33 --
>  .../{user_verbs.txt => user_verbs.rst}|  12 +-
>  drivers/infiniband/core/user_mad.c|   2 +-
>  drivers/infiniband/ulp/ipoib/Kconfig  |   2 +-
>  10 files changed, 174 insertions(+), 103 deletions(-)
>  rename Documentation/infiniband/{core_locking.txt => core_locking.rst} (78%)
>  create mode 100644 Documentation/infiniband/index.rst
>  rename Documentation/infiniband/{ipoib.txt => ipoib.rst} (90%)
>  rename Documentation/infiniband/{opa_vnic.txt => opa_vnic.rst} (63%)
>  rename Documentation/infiniband/{sysfs.txt => sysfs.rst} (69%)
>  rename Documentation/infiniband/{tag_matching.txt => tag_matching.rst} (98%)
>  rename Documentation/infiniband/{user_mad.txt => user_mad.rst} (90%)
>  rename Documentation/infiniband/{user_verbs.txt => user_verbs.rst} (93%)

Looks OK to me, do you want to run these patches through the docs tree
or through RDMA?

Given that we've generally pushed doc updates through rdma, I think
I'd prefer the latter? Jonathan?

Thanks,
Jason


Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function

2019-06-10 Thread Florian Weimer
* Yu-cheng Yu:

> On Fri, 2019-06-07 at 14:09 -0700, Dave Hansen wrote:
>> On 6/7/19 1:06 PM, Yu-cheng Yu wrote:
>> > > Huh, how does glibc know about all possible past and future legacy code
>> > > in the application?
>> > 
>> > When dlopen() gets a legacy binary and the policy allows that, it will
>> > manage
>> > the bitmap:
>> > 
>> >   If a bitmap has not been created, create one.
>> >   Set bits for the legacy code being loaded.
>> 
>> I was thinking about code that doesn't go through GLIBC like JITs.
>
> If JIT manages the bitmap, it knows where it is.
> It can always read the bitmap again, right?

The problem are JIT libraries without assembler code which can be marked
non-CET, such as liborc.  Our builds (e.g., orc-0.4.29-2.fc30.x86_64)
currently carries the IBT and SHSTK flag, although the entry points into
the generated code do not start with ENDBR, so that a jump to them will
fault with the CET enabled.

Thanks,
Florian


Re: [PATCH v3 13/33] docs: infiniband: convert docs to ReST and rename to *.rst

2019-06-10 Thread Jonathan Corbet
On Mon, 10 Jun 2019 14:27:12 -0300
Jason Gunthorpe  wrote:

> Looks OK to me, do you want to run these patches through the docs tree
> or through RDMA?
> 
> Given that we've generally pushed doc updates through rdma, I think
> I'd prefer the latter? Jonathan?

Whichever works best for you is fine with me; go ahead and take them.

jon


Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function

2019-06-10 Thread Dave Hansen
On 6/10/19 9:05 AM, Yu-cheng Yu wrote:
> On Fri, 2019-06-07 at 14:09 -0700, Dave Hansen wrote:
>> On 6/7/19 1:06 PM, Yu-cheng Yu wrote:
 Huh, how does glibc know about all possible past and future legacy code
 in the application?
>>> When dlopen() gets a legacy binary and the policy allows that, it will
>>> manage
>>> the bitmap:
>>>
>>>   If a bitmap has not been created, create one.
>>>   Set bits for the legacy code being loaded.
>> I was thinking about code that doesn't go through GLIBC like JITs.
> If JIT manages the bitmap, it knows where it is.
> It can always read the bitmap again, right?

Let's just be clear:

The design proposed here is that all code mappers (anybody wanting to
get legacy non-CET code into the address space):

1. Know about CET
2. Know where the bitmap is, and identify the part that needs to be
   changed
3. Be able to mprotect() the bitmap to be writable (undoing glibc's
   PROT_READ)
4. Set the bits in the bitmap for the legacy code
5. mprotect() the bitmap back to PROT_READ

Do the non-glibc code mappers have glibc interfaces for this?
Otherwise, how could a bunch of JITs in a big multi-threaded application
possibly coordinate the mprotect()s?  Won't they race with each other?


Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function

2019-06-10 Thread Dave Hansen
On 6/10/19 8:22 AM, Yu-cheng Yu wrote:
>> How does glibc know the linear address space size?  We don’t want LA64 to
>> break old binaries because the address calculation changed.
> When an application starts, its highest stack address is determined.
> It uses that as the maximum the bitmap needs to cover.

Huh, I didn't think we ran code from the stack. ;)

Especially given the way that we implemented the new 5-level-paging
address space, I don't think that expecting code to be below the stack
is a good universal expectation.


Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function

2019-06-10 Thread Yu-cheng Yu
On Mon, 2019-06-10 at 11:02 -0700, Dave Hansen wrote:
> On 6/10/19 8:22 AM, Yu-cheng Yu wrote:
> > > How does glibc know the linear address space size?  We don’t want LA64 to
> > > break old binaries because the address calculation changed.
> > 
> > When an application starts, its highest stack address is determined.
> > It uses that as the maximum the bitmap needs to cover.
> 
> Huh, I didn't think we ran code from the stack. ;)
> 
> Especially given the way that we implemented the new 5-level-paging
> address space, I don't think that expecting code to be below the stack
> is a good universal expectation.

Yes, you make a good point.  However, allowing the application manage the bitmap
is the most efficient and flexible.  If the loader finds a legacy lib is beyond
the bitmap can cover, it can deal with the problem by moving the lib to a lower
address; or re-allocate the bitmap.  If the loader cannot allocate a big bitmap
to cover all 5-level address space (the bitmap will be large), it can put all
legacy lib's at lower address.  We cannot do these easily in the kernel.

Yu-cheng



Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function

2019-06-10 Thread Dave Hansen
On 6/10/19 12:38 PM, Yu-cheng Yu wrote:
>>> When an application starts, its highest stack address is determined.
>>> It uses that as the maximum the bitmap needs to cover.
>> Huh, I didn't think we ran code from the stack. ;)
>>
>> Especially given the way that we implemented the new 5-level-paging
>> address space, I don't think that expecting code to be below the stack
>> is a good universal expectation.
> Yes, you make a good point.  However, allowing the application manage the 
> bitmap
> is the most efficient and flexible.  If the loader finds a legacy lib is 
> beyond
> the bitmap can cover, it can deal with the problem by moving the lib to a 
> lower
> address; or re-allocate the bitmap.

How could the loader reallocate the bitmap and coordinate with other
users of the bitmap?

> If the loader cannot allocate a big bitmap to cover all 5-level
> address space (the bitmap will be large), it can put all legacy lib's
> at lower address.  We cannot do these easily in the kernel.

This is actually an argument to do it in the kernel.  The kernel can
always allocate the virtual space however it wants, no matter how large.
 If we hide the bitmap behind a kernel API then we can put it at high
5-level user addresses because we also don't have to worry about the
high bits confusing userspace.



Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function

2019-06-10 Thread Andy Lutomirski
On Mon, Jun 10, 2019 at 12:52 PM Dave Hansen  wrote:
>
> On 6/10/19 12:38 PM, Yu-cheng Yu wrote:
> >>> When an application starts, its highest stack address is determined.
> >>> It uses that as the maximum the bitmap needs to cover.
> >> Huh, I didn't think we ran code from the stack. ;)
> >>
> >> Especially given the way that we implemented the new 5-level-paging
> >> address space, I don't think that expecting code to be below the stack
> >> is a good universal expectation.
> > Yes, you make a good point.  However, allowing the application manage the 
> > bitmap
> > is the most efficient and flexible.  If the loader finds a legacy lib is 
> > beyond
> > the bitmap can cover, it can deal with the problem by moving the lib to a 
> > lower
> > address; or re-allocate the bitmap.
>
> How could the loader reallocate the bitmap and coordinate with other
> users of the bitmap?
>
> > If the loader cannot allocate a big bitmap to cover all 5-level
> > address space (the bitmap will be large), it can put all legacy lib's
> > at lower address.  We cannot do these easily in the kernel.
>
> This is actually an argument to do it in the kernel.  The kernel can
> always allocate the virtual space however it wants, no matter how large.
>  If we hide the bitmap behind a kernel API then we can put it at high
> 5-level user addresses because we also don't have to worry about the
> high bits confusing userspace.
>

That's a fairly compelling argument.

The bitmap is one bit per page, right?  So it's smaller than the
address space by a factor of 8*2^12 == 2^15.  This means that, if we
ever get full 64-bit linear addresses reserved entirely for userspace
(which could happen if my perennial request to Intel to split user and
kernel addresses completely happens), then we'll need 2^48 bytes for
the bitmap, which simply does not fit in the address space of a legacy
application.


Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function

2019-06-10 Thread Yu-cheng Yu
On Mon, 2019-06-10 at 12:52 -0700, Dave Hansen wrote:
> On 6/10/19 12:38 PM, Yu-cheng Yu wrote:
> > > > When an application starts, its highest stack address is determined.
> > > > It uses that as the maximum the bitmap needs to cover.
> > > 
> > > Huh, I didn't think we ran code from the stack. ;)
> > > 
> > > Especially given the way that we implemented the new 5-level-paging
> > > address space, I don't think that expecting code to be below the stack
> > > is a good universal expectation.
> > 
> > Yes, you make a good point.  However, allowing the application manage the
> > bitmap
> > is the most efficient and flexible.  If the loader finds a legacy lib is
> > beyond
> > the bitmap can cover, it can deal with the problem by moving the lib to a
> > lower
> > address; or re-allocate the bitmap.
> 
> How could the loader reallocate the bitmap and coordinate with other
> users of the bitmap?

Assuming the loader actually chooses to re-allocate, it can copy the old bitmap
over to the new before doing the switch.  But, I agree, the other choice is
easier; the loader can simply put the lib at lower address.  AFAIK, the loader
does not request high address in mmap().

> 
> > If the loader cannot allocate a big bitmap to cover all 5-level
> > address space (the bitmap will be large), it can put all legacy lib's
> > at lower address.  We cannot do these easily in the kernel.
> 
> This is actually an argument to do it in the kernel.  The kernel can
> always allocate the virtual space however it wants, no matter how large.
>  If we hide the bitmap behind a kernel API then we can put it at high
> 5-level user addresses because we also don't have to worry about the
> high bits confusing userspace.

We actually tried this.  The kernel needs to reserve the bitmap space in the
beginning for every CET-enabled app, regardless of actual needs.  On each memory
request, the kernel then must consider a percentage of allocated space in its
calculation, and on systems with less memory this quickly becomes a problem.



Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function

2019-06-10 Thread Dave Hansen
On 6/10/19 1:27 PM, Yu-cheng Yu wrote:
>>> If the loader cannot allocate a big bitmap to cover all 5-level
>>> address space (the bitmap will be large), it can put all legacy lib's
>>> at lower address.  We cannot do these easily in the kernel.
>> This is actually an argument to do it in the kernel.  The kernel can
>> always allocate the virtual space however it wants, no matter how large.
>>  If we hide the bitmap behind a kernel API then we can put it at high
>> 5-level user addresses because we also don't have to worry about the
>> high bits confusing userspace.
> We actually tried this.  The kernel needs to reserve the bitmap space in the
> beginning for every CET-enabled app, regardless of actual needs. 

I don't think this is a problem.  In fact, I think reserving the space
is actually the only sane behavior.  If you don't reserve it, you
fundamentally limit where future legacy instructions can go.

One idea is that we always size the bitmap for the 48-bit addressing
systems.  Legacy code probably doesn't _need_ to go in the new address
space, and if we do this we don't have to worry about the gigantic
57-bit address space bitmap.

> On each memory request, the kernel then must consider a percentage of
> allocated space in its calculation, and on systems with less memory
> this quickly becomes a problem.

I'm not sure what you're referring to here?  Are you referring to our
overcommit limits?


Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function

2019-06-10 Thread Yu-cheng Yu
On Mon, 2019-06-10 at 13:43 -0700, Dave Hansen wrote:
> On 6/10/19 1:27 PM, Yu-cheng Yu wrote:
> > > > If the loader cannot allocate a big bitmap to cover all 5-level
> > > > address space (the bitmap will be large), it can put all legacy lib's
> > > > at lower address.  We cannot do these easily in the kernel.
> > > 
> > > This is actually an argument to do it in the kernel.  The kernel can
> > > always allocate the virtual space however it wants, no matter how large.
> > >  If we hide the bitmap behind a kernel API then we can put it at high
> > > 5-level user addresses because we also don't have to worry about the
> > > high bits confusing userspace.
> > 
> > We actually tried this.  The kernel needs to reserve the bitmap space in the
> > beginning for every CET-enabled app, regardless of actual needs. 
> 
> I don't think this is a problem.  In fact, I think reserving the space
> is actually the only sane behavior.  If you don't reserve it, you
> fundamentally limit where future legacy instructions can go.
> 
> One idea is that we always size the bitmap for the 48-bit addressing
> systems.  Legacy code probably doesn't _need_ to go in the new address
> space, and if we do this we don't have to worry about the gigantic
> 57-bit address space bitmap.
> 
> > On each memory request, the kernel then must consider a percentage of
> > allocated space in its calculation, and on systems with less memory
> > this quickly becomes a problem.
> 
> I'm not sure what you're referring to here?  Are you referring to our
> overcommit limits?

Yes.


Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function

2019-06-10 Thread Dave Hansen
On 6/10/19 1:58 PM, Yu-cheng Yu wrote:
>>> On each memory request, the kernel then must consider a percentage of
>>> allocated space in its calculation, and on systems with less memory
>>> this quickly becomes a problem.
>> I'm not sure what you're referring to here?  Are you referring to our
>> overcommit limits?
> Yes.

My assumption has always been that these large, potentially sparse
hardware tables *must* be mmap()'d with MAP_NORESERVE specified.  That
should keep them from being problematic with respect to overcommit.



Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function

2019-06-10 Thread Yu-cheng Yu
On Mon, 2019-06-10 at 15:02 -0700, Dave Hansen wrote:
> On 6/10/19 1:58 PM, Yu-cheng Yu wrote:
> > > > On each memory request, the kernel then must consider a percentage of
> > > > allocated space in its calculation, and on systems with less memory
> > > > this quickly becomes a problem.
> > > 
> > > I'm not sure what you're referring to here?  Are you referring to our
> > > overcommit limits?
> > 
> > Yes.
> 
> My assumption has always been that these large, potentially sparse
> hardware tables *must* be mmap()'d with MAP_NORESERVE specified.  That
> should keep them from being problematic with respect to overcommit.

Ok, we will go back to do_mmap() with MAP_PRIVATE, MAP_NORESERVE and
VM_DONTDUMP.  The bitmap will cover only 48-bit address space.

We then create PR_MARK_CODE_AS_LEGACY.  The kernel will set the bitmap, but it
is going to be slow.

Perhaps we still let the app fill the bitmap?

Yu-cheng


Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function

2019-06-10 Thread Dave Hansen
On 6/10/19 3:40 PM, Yu-cheng Yu wrote:
> Ok, we will go back to do_mmap() with MAP_PRIVATE, MAP_NORESERVE and
> VM_DONTDUMP.  The bitmap will cover only 48-bit address space.

Could you make sure to discuss the downsides of only doing a 48-bit
address space?

What are the reasons behind and implications of VM_DONTDUMP?

> We then create PR_MARK_CODE_AS_LEGACY.  The kernel will set the bitmap, but it
> is going to be slow.

Slow compared to what?  We're effectively adding one (quick) system call
to a path that, today, has at *least* half a dozen syscalls and probably
a bunch of page faults.  Heck, we can probably avoid the actual page
fault to populate the bitmap if we're careful.  That alone would put a
syscall on equal footing with any other approach.  If the bit setting
crossed a page boundary it would probably win.

> Perhaps we still let the app fill the bitmap?

I think I'd want to see some performance data on it first.


Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function

2019-06-10 Thread H.J. Lu
On Mon, Jun 10, 2019 at 3:59 PM Dave Hansen  wrote:
>
> > We then create PR_MARK_CODE_AS_LEGACY.  The kernel will set the bitmap, but 
> > it
> > is going to be slow.
>
> Slow compared to what?  We're effectively adding one (quick) system call
> to a path that, today, has at *least* half a dozen syscalls and probably
> a bunch of page faults.  Heck, we can probably avoid the actual page
> fault to populate the bitmap if we're careful.  That alone would put a
> syscall on equal footing with any other approach.  If the bit setting
> crossed a page boundary it would probably win.
>
> > Perhaps we still let the app fill the bitmap?
>
> I think I'd want to see some performance data on it first.

Updating legacy bitmap in user space from kernel requires

long q;

get_user(q, ...);
q |= mask;
put_user(q, ...);

instead of

*p |= mask;

get_user + put_user was quite slow when we tried before.

-- 
H.J.


Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function

2019-06-10 Thread Dave Hansen
On 6/10/19 4:20 PM, H.J. Lu wrote:
>>> Perhaps we still let the app fill the bitmap?
>> I think I'd want to see some performance data on it first.
> Updating legacy bitmap in user space from kernel requires
> 
> long q;
> 
> get_user(q, ...);
> q |= mask;
> put_user(q, ...);
> 
> instead of
> 
> *p |= mask;
> 
> get_user + put_user was quite slow when we tried before.

Numbers, please.

There are *lots* of ways to speed something like that up if you have
actual issues with it.  For instance, you can skip the get_user() for
whole bytes.  You can write bits with 0's for unallocated address space.
 You can do user_access_begin/end() to avoid bunches of STAC/CLACs...

The list goes on and on. :)


Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function

2019-06-10 Thread Andy Lutomirski



> On Jun 10, 2019, at 3:59 PM, Dave Hansen  wrote:
> 
>> On 6/10/19 3:40 PM, Yu-cheng Yu wrote:
>> Ok, we will go back to do_mmap() with MAP_PRIVATE, MAP_NORESERVE and
>> VM_DONTDUMP.  The bitmap will cover only 48-bit address space.
> 
> Could you make sure to discuss the downsides of only doing a 48-bit
> address space?
> 
> What are the reasons behind and implications of VM_DONTDUMP?
> 
>> We then create PR_MARK_CODE_AS_LEGACY.  The kernel will set the bitmap, but 
>> it
>> is going to be slow.
> 
> Slow compared to what?  We're effectively adding one (quick) system call
> to a path that, today, has at *least* half a dozen syscalls and probably
> a bunch of page faults.  Heck, we can probably avoid the actual page
> fault to populate the bitmap if we're careful.  That alone would put a
> syscall on equal footing with any other approach.  If the bit setting
> crossed a page boundary it would probably win.
> 
>> Perhaps we still let the app fill the bitmap?
> 
> I think I'd want to see some performance data on it first.

Trying to summarize:

If we manage the whole thing in user space, we are basically committing to only 
covering 48 bits — otherwise the whole model falls apart in quite a few ways. 
We gain some simplicity in the kernel.

If we do it in the kernel, we still have to decide how much address space to 
cover. We get to play games like allocating the bitmap above 2^48, but then we 
might have CRIU issues if we migrate to a system with fewer BA bits.

I doubt that the performance matters much one way or another. I just don’t 
expect any of this to be a bottleneck.

Another benefit of kernel management: we could plausibly auto-clear the bits 
corresponding to munmapped regions. Is this worth it?

And a maybe-silly benefit: if we manage it in the kernel, we could optimize the 
inevitable case where the bitmap contains pages that are all ones :). If it’s 
in userspace, KSM could do the, but that will be inefficient at best.

Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function

2019-06-10 Thread Dave Hansen
On 6/10/19 4:54 PM, Andy Lutomirski wrote:
> Another benefit of kernel management: we could plausibly auto-clear
> the bits corresponding to munmapped regions. Is this worth it?

I did it for MPX.  I think I even went to the trouble of zapping the
whole pages that got unused.

But, MPX tables took 80% of the address space, worst-case.  This takes
0.003% :)  The only case it would really matter would be a task was
long-running, used legacy executables/JITs, and was mapping/unmapping
text all over the address space.  That seems rather unlikely.


Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function

2019-06-10 Thread Andy Lutomirski



> On Jun 10, 2019, at 5:08 PM, Dave Hansen  wrote:
> 
>> On 6/10/19 4:54 PM, Andy Lutomirski wrote:
>> Another benefit of kernel management: we could plausibly auto-clear
>> the bits corresponding to munmapped regions. Is this worth it?
> 
> I did it for MPX.  I think I even went to the trouble of zapping the
> whole pages that got unused.
> 
> But, MPX tables took 80% of the address space, worst-case.  This takes
> 0.003% :)  The only case it would really matter would be a task was
> long-running, used legacy executables/JITs, and was mapping/unmapping
> text all over the address space.  That seems rather unlikely.

Every wasted page still costs 4K plus page table overhead.  The worst case is a 
JIT that doesn’t clean up and leaks legacy bitmap memory all over. We can blame 
the JIT, but the actual attribution could be complicated.

It also matters when you unmap one thing, map something else, and are sad when 
the legacy bits are still set.

Admittedly, it’s a bit hard to imagine the exploit that takes advantage of this.