Re: [PATCH] x86/mm: annotate no_context with UNWIND_HINTS

2018-10-15 Thread Nathan Chancellor
On Mon, Oct 15, 2018 at 09:03:40AM -0700, Andy Lutomirski wrote:
> On Mon, Oct 15, 2018 at 8:31 AM Josh Poimboeuf  wrote:
> >
> > On Mon, Oct 15, 2018 at 08:22:21AM -0700, Nathan Chancellor wrote:
> > > > >>> @@ -760,9 +760,11 @@ no_context(struct pt_regs *regs, unsigned long 
> > > > >>> error_code,
> > > > >>> * and then double-fault, though, because we're 
> > > > >>> likely to
> > > > >>> * break the console driver and lose most of the 
> > > > >>> stack dump.
> > > > >>> */
> > > > >>> -   asm volatile ("movq %[stack], %%rsp\n\t"
> > > > >>> +   asm volatile (UNWIND_HINT_SAVE
> > > > >>> + "movq %[stack], %%rsp\n\t"
> > > > >>>  "call handle_stack_overflow\n\t"
> > > > >>> - "1: jmp 1b"
> > > > >>> + "1: jmp 1b\n\t"
> > > > >>> + UNWIND_HINT_RESTORE
> > > > >>>  : ASM_CALL_CONSTRAINT
> > > > >>>  : "D" ("kernel stack overflow (page 
> > > > >>> fault)"),
> > > > >>>"S" (regs), "d" (address),
> > > > >>
> > > > >> NAK.  Just below this snippet is unreachable();
> > > > >>
> > > > > >> Can you reply with objtool -dr output on a problematic fault.o?
> > > > > >> Josh, it *looks* like annotate_unreachable() should be doing the
> > > > > >> right thing, but something is clearly busted.
> > > > >>
> > > > >> Also, shouldn't compiler-clang.h contain a reasonable definition of
> > > > >> unreachable()?
> > > > >>
> > > > >> --Andy
> > > > >
> > > > > Hi Andy,
> > > > >
> > > > > > Did you mean 'objdump -dr'? If so, here you go (rather long, sorry
> > > > > > if I should have pasted it here instead):
> > > > > https://gist.github.com/nathanchance/f038bb0a6653b975bb8a4e64fcd5503e
> > > > >
> > > > >
> > > >
> > > > Hmm, -dr wasn’t quite enough to dump the .discard bits, assuming 
> > > > they’re there at all. Can you just put the whole .o file somewhere?
> > >
> > > Here you go: https://nathanchance.me/downloads/.tmp/fault.o
> >
> > $ eu-readelf -S /tmp/fault.o  |grep reachable
> > [12] .discard.reachable   PROGBITS  2bc0 0014 00   0  1
> > [13] .rela.discard.reachable RELA  2bd8 0078 24 I 32  12  8
> >
> > That confirms that you need a clang version of the unreachable() macro.
> >
> 
> Duh.
> 
> That being said, the generic macro is:
> 
> # define unreachable() do { annotate_reachable(); do { } while (1); } while (0)
> 
> I'm probably missing some subtlety here, but shouldn't that be
> annotate_*un*reachable()?
> 
> Of course, there are any number of reasons why there should be a real
> definition.  Nathan and Nick, does adding something like:
> 
> #define unreachable() \
> do {\
> annotate_unreachable(); \
> __builtin_unreachable();\
> } while (0)
> 
> to compiler-clang.h fix the problem?
> 
> --Andy

Ha, I was just typing out a message summarizing that that exact
definition fixed this warning.

Nathan


Re: [PATCH v1 2/2] sysctl: handle overflow for file-max

2018-10-15 Thread Kees Cook
On Mon, Oct 15, 2018 at 3:55 AM, Christian Brauner  wrote:
> Currently, when writing
>
> echo 18446744073709551616 > /proc/sys/fs/file-max
>
> /proc/sys/fs/file-max will overflow and be set to 0. That quickly
> crashes the system.
> This commit explicitly caps the value for file-max to ULONG_MAX.
>
> Note, this isn't technically necessary since proc_get_long() will already
> return ULONG_MAX. However, there are two reasons why we should still do this:
> 1. it makes it explicit what the upper bound of file-max is instead of
>    making readers of the code infer it from proc_get_long() themselves
> 2. tunables other than file-max may want to set a lower max value than
>    ULONG_MAX and we need to enable __do_proc_doulongvec_minmax() to handle
>    such cases too
>
> Cc: Kees Cook 
> Signed-off-by: Christian Brauner 
> ---
> v0->v1:
> - if max value is < than ULONG_MAX use max as upper bound
> - (Dominik) remove double "the" from commit message
> ---
>  kernel/sysctl.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 97551eb42946..226d4eaf4b0e 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -127,6 +127,7 @@ static int __maybe_unused one = 1;
>  static int __maybe_unused two = 2;
>  static int __maybe_unused four = 4;
>  static unsigned long one_ul = 1;
> +static unsigned long ulong_max = ULONG_MAX;
>  static int one_hundred = 100;
>  static int one_thousand = 1000;
>  #ifdef CONFIG_PRINTK
> @@ -1696,6 +1697,7 @@ static struct ctl_table fs_table[] = {
> .maxlen = sizeof(files_stat.max_files),
> .mode   = 0644,
> .proc_handler   = proc_doulongvec_minmax,
> +   .extra2 = &ulong_max,

Don't we want this capped lower? The percpu comparisons, for example,
are all signed long. And there is at least this test, which could
overflow:

if (atomic_long_read(&unix_nr_socks) > 2 * get_max_files())
goto out;

Seems like max-files should be  SLONG_MAX / 2 or something instead?

> },
> {
> .procname   = "nr_open",
> @@ -2795,6 +2797,8 @@ static int __do_proc_doulongvec_minmax(void *data, struct ctl_table *table, int
> break;
> if (neg)
> continue;
> +   if (max && val > *max)
> +   val = *max;
> val = convmul * val / convdiv;
> if ((min && val < *min) || (max && val > *max))
> continue;
> --
> 2.17.1
>

-Kees

-- 
Kees Cook
Pixel Security


RE: [PATCH v2 0/4] Port mxs-dcp to imx6ull and imx6sll

2018-10-15 Thread A.s. Dong
Hi Leonard,

> -Original Message-
> From: Leonard Crestez
> Sent: Monday, October 15, 2018 9:28 PM


[...]

> Subject: [PATCH v2 0/4] Port mxs-dcp to imx6ull and imx6sll
> 
> The DCP block is present on 6sll and 6ull but not enabled. The hardware is
> mostly compatible with 6sl; the only important difference is that explicit
> clock enabling is required.
> 
> There were several issues with the functionality of this driver (it didn't
> even probe properly) but they are fixed in cryptodev/master by this series:
> https://lore.kernel.org/patchwork/cover/994874/
> 

Thanks for the work.
I would be glad to help with testing if you could provide some test instructions. :-)

Regards
Dong Aisheng

> ---
> Changes since v1:
>  * Add devicetree maintainers for dt-bindings
>  * Add a patch enabling in imx_v6_v7_defconfig. Since tcrypt now passes this
> shouldn't cause any issues
>  * Link to v1: https://lore.kernel.org/patchwork/cover/994893/
> 
> Leonard Crestez (4):
>   dt-bindings: crypto: Mention clocks for mxs-dcp
>   crypto: mxs-dcp - Add support for dcp clk
>   ARM: dts: imx6ull: Add dcp node
>   ARM: imx_v6_v7_defconfig: Enable CRYPTO_DEV_MXS_DCP
> 
>  .../devicetree/bindings/crypto/fsl-dcp.txt |  2 ++
>  arch/arm/boot/dts/imx6ull.dtsi | 10 ++
>  arch/arm/configs/imx_v6_v7_defconfig   |  1 +
>  drivers/crypto/mxs-dcp.c   | 18
> ++
>  4 files changed, 31 insertions(+)
> 
> --
> 2.17.1



Re: [PATCH 0/7] staging: vc04_services: Some dead code removal

2018-10-15 Thread Eric Anholt
Stefan Wahren  writes:

> Hi Tuomas,
>
>> Tuomas Tynkkynen wrote on 4 October 2018 at 11:37:
>> 
>> 
>> Drop various pieces of dead code from here and there to get rid of
>> the remaining users of VCHI_CONNECTION_T. After that we get to drop
>> entire header files worth of unused code.
>> 
>> I've tested on a Raspberry Pi Model B (bcm2835_defconfig) that
>> snd-bcm2835 can still play analog audio just fine.
>> 
>
> thanks and i'm fine with your patch series:
>
> Acked-by: Stefan Wahren 
>
> Unfortunately this would break compilation of the downstream vchi
> drivers like vcsm [1]. Personally i don't want to maintain another
> one, because i cannot see the gain of the resulting effort.
>
> [1] - https://github.com/raspberrypi/linux/tree/rpi-4.14.y/drivers/char/broadcom/vc_sm

I think the main concern would be if we removed things necessary for
6by9's new vcsm (the one that will let us do dma-buf sharing between
media decode and DRM).

On the other hand, git revert is a thing, so it's not like we actually
lose anything.



Re: [PATCH v1 1/2] sysctl: cap to ULONG_MAX in proc_get_long()

2018-10-15 Thread Kees Cook
On Mon, Oct 15, 2018 at 3:55 AM, Christian Brauner  wrote:
> proc_get_long() is a funny function. It uses simple_strtoul() and for a
> good reason. proc_get_long() wants to always succeed the parse and return
> the maybe incorrect value and the trailing characters to check against a
> pre-defined list of acceptable trailing values.
> However, simple_strtoul() explicitly ignores overflows which can cause

What depends on simple_strtoul() ignoring overflows? Can we just cap
it to ULONG_MAX instead?

I note that both simple_strtoul() and simple_strtoull() are marked as
obsolete (more below).

> funny things like the following to happen:
>
> echo 18446744073709551616 > /proc/sys/fs/file-max
> cat /proc/sys/fs/file-max
> 0
>
> (Which will cause your system to silently die behind your back.)
>
> On the other hand kstrtoul() does do overflow detection but fails the parse
> in this case, does not return the trailing characters, and also fails the
> parse when anything other than '\n' is a trailing character whereas
> proc_get_long() wants to be more lenient.

This parsing strictness difference makes it seem like the simple_*()
shouldn't be considered obsolete...

and it's still very heavily used:

$ git grep -E 'simple_strtoull?\(' | wc -l
745

> Now, before adding another kstrtoul() function let's simply add a static
> parse strtoul_cap_erange() which does:
> - returns ULONG_MAX on ERANGE
> - returns the trailing characters to the caller
> This guarantees that we don't regress userspace in any way but also caps
> any parsed value to ULONG_MAX and prevents things like file-max from becoming 0
> on overflow.

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH] x86/mm: annotate no_context with UNWIND_HINTS

2018-10-15 Thread Josh Poimboeuf
On Mon, Oct 15, 2018 at 09:03:40AM -0700, Andy Lutomirski wrote:
> That being said, the generic macro is:
> 
> # define unreachable() do { annotate_reachable(); do { } while (1); } while (0)
> 
> I'm probably missing some subtlety here, but shouldn't that be
> annotate_*un*reachable()?

That code should have had a comment, but that subtlety was intentional.
As I mentioned earlier, that was a hack for old versions of GCC which
didn't have __builtin_unreachable().  In those cases, GCC doesn't treat
"ud2" as fatal, so this was a way of telling objtool that.  Luckily we
can get rid of this hack now that the minimum supported GCC version has
gone up to 4.6.
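
For reference, here is a minimal userspace sketch of the definition Andy
proposed. The empty annotate_unreachable() below is only a placeholder
assumed for illustration (in the kernel it records the location for
objtool), and parity() is a hypothetical caller. __builtin_unreachable()
is the real GCC/Clang builtin.

```c
#include <assert.h>

/* Placeholder (assumption): the kernel version emits an objtool annotation. */
#define annotate_unreachable() do { } while (0)

/* The definition suggested in this thread for compiler-clang.h. */
#define unreachable()                   \
	do {                            \
		annotate_unreachable(); \
		__builtin_unreachable();\
	} while (0)

/* Hypothetical caller: both cases return, so control never leaves the switch. */
static int parity(unsigned int x)
{
	switch (x & 1u) {
	case 0:
		return 0;
	case 1:
		return 1;
	}
	/* Tells the compiler that no code path falls off the end here. */
	unreachable();
}

static void unreachable_demo(void)
{
	assert(parity(4) == 0);
	assert(parity(7) == 1);
}
```

With the builtin, the compiler may drop everything after the switch. The
old "do { } while (1)" fallback instead emitted a real infinite loop, which
is why it was paired with a different annotation.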

-- 
Josh


Re: [PATCH v1 2/2] sysctl: handle overflow for file-max

2018-10-15 Thread Christian Brauner
On Mon, Oct 15, 2018 at 09:11:51AM -0700, Kees Cook wrote:
> On Mon, Oct 15, 2018 at 3:55 AM, Christian Brauner wrote:
> > Currently, when writing
> >
> > echo 18446744073709551616 > /proc/sys/fs/file-max
> >
> > /proc/sys/fs/file-max will overflow and be set to 0. That quickly
> > crashes the system.
> > This commit explicitly caps the value for file-max to ULONG_MAX.
> >
> > Note, this isn't technically necessary since proc_get_long() will already
> > return ULONG_MAX. However, there are two reasons why we should still do this:
> > 1. it makes it explicit what the upper bound of file-max is instead of
> >    making readers of the code infer it from proc_get_long() themselves
> > 2. tunables other than file-max may want to set a lower max value than
> >    ULONG_MAX and we need to enable __do_proc_doulongvec_minmax() to handle
> >    such cases too
> >
> > Cc: Kees Cook 
> > Signed-off-by: Christian Brauner 
> > ---
> > v0->v1:
> > - if max value is < than ULONG_MAX use max as upper bound
> > - (Dominik) remove double "the" from commit message
> > ---
> >  kernel/sysctl.c | 4 
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> > index 97551eb42946..226d4eaf4b0e 100644
> > --- a/kernel/sysctl.c
> > +++ b/kernel/sysctl.c
> > @@ -127,6 +127,7 @@ static int __maybe_unused one = 1;
> >  static int __maybe_unused two = 2;
> >  static int __maybe_unused four = 4;
> >  static unsigned long one_ul = 1;
> > +static unsigned long ulong_max = ULONG_MAX;
> >  static int one_hundred = 100;
> >  static int one_thousand = 1000;
> >  #ifdef CONFIG_PRINTK
> > @@ -1696,6 +1697,7 @@ static struct ctl_table fs_table[] = {
> > .maxlen = sizeof(files_stat.max_files),
> > .mode   = 0644,
> > .proc_handler   = proc_doulongvec_minmax,
> > +   .extra2 = &ulong_max,
> 
> Don't we want this capped lower? The percpu comparisons, for example,
> are all signed long. And there is at least this test, which could
> overflow:
> 
> if (atomic_long_read(&unix_nr_socks) > 2 * get_max_files())
> goto out;

Does that check even make sense?
Commit 518de9b39e854542de59bfb8b9f61c8f7ecf808b made get_max_files()
return a long to bump the number of allowed files to more than 2^31.

But assume a platform where unsigned long, which is what get_max_files()
returns, is 64 bits wide and atomic_long_read() is 64 bits wide too.
Isn't this then guaranteed to overflow?  I'm not clear what this is
trying to do.
Seems this should simply be:

if (atomic_long_read(&unix_nr_socks) > get_max_files())
goto out;

or am I missing a crucial point?
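
To make the overflow question concrete, here is a small userspace model of
the comparison. It is a sketch only: doubling_wraps() and
over_socket_limit() are hypothetical names, and the plain long argument
stands in for atomic_long_read(&unix_nr_socks). It assumes long and
unsigned long have the same width, as they do on Linux.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* True if 2 * max_files wraps around in unsigned long arithmetic. */
static bool doubling_wraps(unsigned long max_files)
{
	return max_files > ULONG_MAX / 2;
}

/* Model of: atomic_long_read(&unix_nr_socks) > 2 * get_max_files() */
static bool over_socket_limit(long nr_socks, unsigned long max_files)
{
	/* The signed count is converted to unsigned long for the comparison. */
	return (unsigned long)nr_socks > 2 * max_files;
}

static void file_max_demo(void)
{
	/* Up to LONG_MAX the doubled limit still fits. */
	assert(!doubling_wraps(LONG_MAX));
	assert(!over_socket_limit(1, LONG_MAX));

	/* One past LONG_MAX the doubled limit wraps to 0, so a single
	 * socket already exceeds it. */
	assert(doubling_wraps((unsigned long)LONG_MAX + 1));
	assert(over_socket_limit(1, (unsigned long)LONG_MAX + 1));
}
```

In this model the doubled limit only wraps for values above LONG_MAX, so
the test is not guaranteed to overflow for every setting. Whether LONG_MAX
is the right cap for the other users of file-max is a separate question.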

> 
> Seems like max-files should be  SLONG_MAX / 2 or something instead?

Hm. Isn't that a bit low? IIUC, this would mean cutting the maximum
number of open files in half? If anything, shouldn't it be LONG_MAX?

> 
> > },
> > {
> > .procname   = "nr_open",
> > @@ -2795,6 +2797,8 @@ static int __do_proc_doulongvec_minmax(void *data, struct ctl_table *table, int
> > break;
> > if (neg)
> > continue;
> > +   if (max && val > *max)
> > +   val = *max;
> > val = convmul * val / convdiv;
> > if ((min && val < *min) || (max && val > *max))
> > continue;
> > --
> > 2.17.1
> >
> 
> -Kees
> 
> -- 
> Kees Cook
> Pixel Security


[PATCH 2/2] arm64: dts: meson-axg: drop FW reserved memory

2018-10-15 Thread Jerome Brunet
As far as we know, the axg does not require the FW memory region.
This seems to be something we carried over from the gx family for no
reason.

Fixes: 9d59b708500f ("arm64: dts: meson-axg: add initial A113D SoC DT support")
Signed-off-by: Jerome Brunet 
---
 arch/arm64/boot/dts/amlogic/meson-axg.dtsi | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/arm64/boot/dts/amlogic/meson-axg.dtsi b/arch/arm64/boot/dts/amlogic/meson-axg.dtsi
index 06a06f11f114..d1beedc4fb0e 100644
--- a/arch/arm64/boot/dts/amlogic/meson-axg.dtsi
+++ b/arch/arm64/boot/dts/amlogic/meson-axg.dtsi
@@ -13,9 +13,6 @@
 #include 
 #include 
 
-/* 16 MiB reserved for Hardware ROM Firmware */
-/memreserve/ 0x0 0x100;
-
 /* 3 MiB reserved for ARM Trusted Firmware (BL31) */
 /memreserve/ 0x0500 0x30;
 
-- 
2.17.2



Re: [PATCH 08/14] ARM64: dts: hisilicon: Add tsensor interrupt name

2018-10-15 Thread Rob Herring
On Tue, Sep 25, 2018 at 11:03:06AM +0200, Daniel Lezcano wrote:
> Add the interrupt names for the sensors, so the code can rely on them
> instead of dealing with indexes, which are error-prone.
> 
> The names come from the Hisilicon documentation found on the internet.
> 
> Signed-off-by: Daniel Lezcano 
> ---
>  .../bindings/thermal/hisilicon-thermal.txt |  3 ++
>  arch/arm64/boot/dts/hisilicon/hi3660.dtsi  | 63 
> +++---
>  arch/arm64/boot/dts/hisilicon/hi6220.dtsi  |  1 +
>  3 files changed, 36 insertions(+), 31 deletions(-)

Lots of whitespace errors reported by checkpatch.pl.

> 
> diff --git a/Documentation/devicetree/bindings/thermal/hisilicon-thermal.txt b/Documentation/devicetree/bindings/thermal/hisilicon-thermal.txt
> index cef716a..3edfae3 100644
> --- a/Documentation/devicetree/bindings/thermal/hisilicon-thermal.txt
> +++ b/Documentation/devicetree/bindings/thermal/hisilicon-thermal.txt
> @@ -7,6 +7,7 @@
>region.
>  - interrupt: The interrupt number to the cpu. Defines the interrupt used
>by /SOCTHERM/tsensor.
> +- interrupt-names: The interrupt names for the different sensors

Need to define what the names are.

>  - clock-names: Input clock name, should be 'thermal_clk'.
>  - clocks: phandles for clock specified in "clock-names" property.
>  - #thermal-sensor-cells: Should be 1. See ./thermal.txt for a description.
> @@ -18,6 +19,7 @@ for Hi6220:
>   compatible = "hisilicon,tsensor";
>   reg = <0x0 0xf7030700 0x0 0x1000>;
>   interrupts = <0 7 0x4>;
> + interrupt-names = "tsensor_intr";

That name seems pretty pointless.

>   clocks = <&sys_ctrl HI6220_TSENSOR_CLK>;
>   clock-names = "thermal_clk";
>   #thermal-sensor-cells = <1>;
> @@ -28,5 +30,6 @@ for Hi3660:
>   compatible = "hisilicon,hi3660-tsensor";
>   reg = <0x0 0xfff3 0x0 0x1000>;
>   interrupts = ;
> + interrupt-names = "tsensor_a73";

Just 'a73' is sufficient.

>   #thermal-sensor-cells = <1>;
>   };


[PATCH 1/2] arm64: dts: meson: fix reserve memory regions

2018-10-15 Thread Jerome Brunet
Since commit 50d7ba36b916 ("arm64: export memblock_reserve()d regions via
/proc/iomem") was merged, Amlogic boards using mainline u-boot started
showing the following warning:

WARNING: CPU: 0 PID: 1 at arch/arm64/kernel/setup.c:271 reserve_memblock_reserved_regions+0xd8/0x144
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc7-00263-g385684b3eb27-dirty #254
pstate: 4005 (nZcv daif -PAN -UAO)
pc : reserve_memblock_reserved_regions+0xd8/0x144
lr : reserve_memblock_reserved_regions+0xd0/0x144
[...]

This is due to u-boot setting some /memreserve/ regions while our
dts declares reserved memory on the same regions with no-map.

The conflict produces the warning. This is fixed by using /memreserve/
in our dts as well, which is probably something we should have done from
the beginning.

Cc: sta...@vger.kernel.org
Cc: Neil Armstrong 
Signed-off-by: Jerome Brunet 
---

Hi Kevin,

I would have liked to put a Fixes tag above but I could not figure out
which commit to pick, considering how much we changed those regions in
the past. If you have a suggestion, I'll be happy to repost this patch.
If you prefer, feel free to amend this patch directly.

Cheers
Jerome

 arch/arm64/boot/dts/amlogic/meson-axg.dtsi | 24 +--
 arch/arm64/boot/dts/amlogic/meson-gx.dtsi  | 27 --
 2 files changed, 15 insertions(+), 36 deletions(-)

diff --git a/arch/arm64/boot/dts/amlogic/meson-axg.dtsi b/arch/arm64/boot/dts/amlogic/meson-axg.dtsi
index 178d8e8c56b8..06a06f11f114 100644
--- a/arch/arm64/boot/dts/amlogic/meson-axg.dtsi
+++ b/arch/arm64/boot/dts/amlogic/meson-axg.dtsi
@@ -13,6 +13,12 @@
 #include 
 #include 
 
+/* 16 MiB reserved for Hardware ROM Firmware */
+/memreserve/ 0x0 0x100;
+
+/* 3 MiB reserved for ARM Trusted Firmware (BL31) */
+/memreserve/ 0x0500 0x30;
+
 / {
compatible = "amlogic,meson-axg";
 
@@ -115,24 +121,6 @@
method = "smc";
};
 
-   reserved-memory {
-   #address-cells = <2>;
-   #size-cells = <2>;
-   ranges;
-
-   /* 16 MiB reserved for Hardware ROM Firmware */
-   hwrom_reserved: hwrom@0 {
-   reg = <0x0 0x0 0x0 0x100>;
-   no-map;
-   };
-
-   /* Alternate 3 MiB reserved for ARM Trusted Firmware (BL31) */
-   secmon_reserved: secmon@500 {
-   reg = <0x0 0x0500 0x0 0x30>;
-   no-map;
-   };
-   };
-
soc {
compatible = "simple-bus";
#address-cells = <2>;
diff --git a/arch/arm64/boot/dts/amlogic/meson-gx.dtsi b/arch/arm64/boot/dts/amlogic/meson-gx.dtsi
index 676a995fb912..23e879b29b1e 100644
--- a/arch/arm64/boot/dts/amlogic/meson-gx.dtsi
+++ b/arch/arm64/boot/dts/amlogic/meson-gx.dtsi
@@ -13,6 +13,15 @@
 #include 
 #include 
 
+/* 16 MiB reserved for Hardware ROM Firmware */
+/memreserve/ 0x0 0x100;
+
+/* 2 MiB reserved for ARM Trusted Firmware (BL31) */
+/memreserve/ 0x1000 0x20;
+
+/* Alternate 3 MiB reserved for ARM Trusted Firmware (BL31) */
+/memreserve/ 0x0500 0x30;
+
 / {
interrupt-parent = <&gic>;
#address-cells = <2>;
@@ -23,24 +32,6 @@
#size-cells = <2>;
ranges;
 
-   /* 16 MiB reserved for Hardware ROM Firmware */
-   hwrom_reserved: hwrom@0 {
-   reg = <0x0 0x0 0x0 0x100>;
-   no-map;
-   };
-
-   /* 2 MiB reserved for ARM Trusted Firmware (BL31) */
-   secmon_reserved: secmon@1000 {
-   reg = <0x0 0x1000 0x0 0x20>;
-   no-map;
-   };
-
-   /* Alternate 3 MiB reserved for ARM Trusted Firmware (BL31) */
-   secmon_reserved_alt: secmon@500 {
-   reg = <0x0 0x0500 0x0 0x30>;
-   no-map;
-   };
-
linux,cma {
compatible = "shared-dma-pool";
reusable;
-- 
2.17.2



[PATCH 0/2] arm64: dts: meson: reserved memory updates

2018-10-15 Thread Jerome Brunet
This patchset provides updates regarding reserved memory on
Amlogic platforms.

Jerome Brunet (2):
  arm64: dts: meson: fix reserve memory regions
  arm64: dts: meson-axg: drop FW reserved memory

 arch/arm64/boot/dts/amlogic/meson-axg.dtsi | 21 +++--
 arch/arm64/boot/dts/amlogic/meson-gx.dtsi  | 27 --
 2 files changed, 12 insertions(+), 36 deletions(-)

-- 
2.17.2



Re: [PATCH v1 1/2] sysctl: cap to ULONG_MAX in proc_get_long()

2018-10-15 Thread Christian Brauner
On Mon, Oct 15, 2018 at 09:18:40AM -0700, Kees Cook wrote:
> On Mon, Oct 15, 2018 at 3:55 AM, Christian Brauner wrote:
> > proc_get_long() is a funny function. It uses simple_strtoul() and for a
> > good reason. proc_get_long() wants to always succeed the parse and return
> > the maybe incorrect value and the trailing characters to check against a
> > pre-defined list of acceptable trailing values.
> > However, simple_strtoul() explicitly ignores overflows which can cause
> 
> What depends on simple_strtoul() ignoring overflows? Can we just cap
> it to ULONG_MAX instead?
> 
> I note that both simple_strtoul() and simple_strtoull() are marked as
> obsolete (more below).
> 
> > funny things like the following to happen:
> >
> > echo 18446744073709551616 > /proc/sys/fs/file-max
> > cat /proc/sys/fs/file-max
> > 0
> >
> > (Which will cause your system to silently die behind your back.)
> >
> > On the other hand kstrtoul() does do overflow detection but fails the parse
> > in this case, does not return the trailing characters, and also fails the
> > parse when anything other than '\n' is a trailing character whereas
> > proc_get_long() wants to be more lenient.
> 
> This parsing strictness difference makes it seem like the simple_*()
> shouldn't be considered obsolete...
> 
> and it's still very heavily used:
> 
> $ git grep -E 'simple_strtoull?\(' | wc -l
> 745

Maybe, but the intention is probably to phase it out and to not use it in
new code because it doesn't handle overflow.
Tbh, I'm wary of changing that to suddenly return ULONG_MAX on overflow
instead of what it is doing now. I have absolutely no idea what this
might break given how much it is still used in the kernel...
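
The difference between the two behaviours can be sketched in userspace C.
This is an illustration, not the kernel code: parse_wrapping() is a
hypothetical decimal-only model of an overflow-ignoring parser, and
parse_capped() models the proposed helper on top of the C library's
strtoul(), which reports ERANGE and hands back the trailing characters.
Unlike a kernel helper, strtoul() also accepts leading whitespace and a
sign.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Overflow-ignoring parse: the value silently wraps modulo 2^BITS. */
static unsigned long parse_wrapping(const char *s, const char **end)
{
	unsigned long val = 0;

	while (isdigit((unsigned char)*s))
		val = val * 10 + (unsigned long)(*s++ - '0');
	if (end)
		*end = s;
	return val;
}

/* Capping parse: ULONG_MAX on ERANGE, trailing characters returned. */
static unsigned long parse_capped(const char *s, const char **end)
{
	char *e;
	unsigned long val;

	errno = 0;
	val = strtoul(s, &e, 10);
	if (errno == ERANGE)
		val = ULONG_MAX; /* strtoul() already returns this; kept explicit */
	if (end)
		*end = e;
	return val;
}

static void parse_demo(void)
{
	const char *end;

	/* 2^64 wraps to 0 when overflow is ignored ... */
	assert(parse_wrapping("18446744073709551616", &end) == 0);
	/* ... but is capped when ERANGE is honoured. */
	assert(parse_capped("18446744073709551616\n", &end) == ULONG_MAX);
	assert(*end == '\n');
}
```

Both helpers leave the caller free to inspect the trailing characters, which
is the lenient behaviour proc_get_long() relies on.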

> 
> > Now, before adding another kstrtoul() function let's simply add a static
> > parse strtoul_cap_erange() which does:
> > - returns ULONG_MAX on ERANGE
> > - returns the trailing characters to the caller
> > This guarantees that we don't regress userspace in any way but also caps
> > any parsed value to ULONG_MAX and prevents things like file-max from becoming 0
> > on overflow.
> 
> -Kees
> 
> -- 
> Kees Cook
> Pixel Security


Re: [PATCH 10/14] ARM64: dts: hisilicon: Add interrupt names for the tsensors

2018-10-15 Thread Rob Herring
On Tue, Sep 25, 2018 at 11:03:08AM +0200, Daniel Lezcano wrote:
> Add the missing interrupts for the temperature sensors as well as
> their names.
> 
> Signed-off-by: Daniel Lezcano 
> ---
>  Documentation/devicetree/bindings/thermal/hisilicon-thermal.txt | 8 ++--

Combine this and the previous binding change to 1 patch.

>  arch/arm64/boot/dts/hisilicon/hi3660.dtsi   | 8 ++--
>  2 files changed, 12 insertions(+), 4 deletions(-)

And more checkpatch whitespace errors in this.


Re: [PATCH] rcu: Use cpus_read_lock() while looking at cpu_online_mask

2018-10-15 Thread Paul E. McKenney
On Mon, Oct 15, 2018 at 11:33:48PM +0800, Boqun Feng wrote:
> On Mon, Oct 15, 2018 at 05:09:03PM +0200, Sebastian Andrzej Siewior wrote:
> > On 2018-10-15 23:07:15 [+0800], Boqun Feng wrote:
> > > Hi, Sebastian
> > Hi Boqun,
> > 
> > > On Mon, Oct 15, 2018 at 04:42:17PM +0200, Sebastian Andrzej Siewior wrote:
> > > > On 2018-10-13 06:48:13 [-0700], Paul E. McKenney wrote:
> > > > > 
> > > > > My concern would be that it would queue it by default for the current
> > > > > CPU, which would serialize the processing, losing the concurrency of
> > > > > grace-period initialization.  But that was a long time ago, and
> > > > > perhaps workqueues have changed.
> > > > 
> > > > but the code here is always using the first CPU of a NUMA node or did I
> > > > miss something?
> > > > 
> > > 
> > > The thing is the original way is to pick one CPU for a *RCU* node to
> > > run the grace-period work, but with your proposal, if a RCU node is
> > > smaller than a NUMA node (having fewer CPUs), we could end up having two
> > > grace-period works running on one CPU. I think that's Paul's concern.
> > 
> > Ah. Okay. From what I observed, the RCU nodes and NUMA nodes were 1:1
> > here. Noted.
> 
> Ok, in that case, there should be no significant performance difference.
> 
> > Given that I can enqueue a work item on an offlined CPU I don't see why
> > commit fcc6354365015 ("rcu: Make expedited GPs handle CPU 0 being
> > offline") should make a difference. Any objections to just revert it?
> 
> Well, that commit is trying to avoid queueing work on an offlined CPU,
> because according to the workqueue API, it's the user's responsibility to
> make sure the CPU is online when a work item is enqueued. So there is a
> difference ;-)
> 
> But I don't have any objection to reverting it with your proposal, since
> yours is simpler and more straightforward, and doesn't perform worse if
> NUMA nodes and RCU nodes have a one-to-one correspondence.
> 
> Besides, I think even if we observe some performance difference in the
> future, the best way to solve that is to make workqueue have a more
> fine-grained affine group than a NUMA node.

Please keep in mind that there are computer systems out there with NUMA
topologies that are completely incompatible with RCU's rcu_node tree
structure.  According to Rik van Riel (CCed), there are even systems
out there where CPU 0 is on socket 0, CPU 1 on socket 1, and so on,
round-robining across the sockets.

The system that convinced me that the additional constraints on
the workqueue's CPU were needed had CPUs 0-7 on one socket and CPUs 8-15
on the second, with CPUs 0-15 sharing the same leaf rcu_node structure.
Unfortunately, I no longer have useful access to this system (dead disk
drive, apparently).

I am not saying that Sebastian's approach is bad, rather that it does
need to be tested on a variety of systems.
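
As a rough illustration of the mismatch, the two layouts mentioned above
can be modelled in a few lines of C. The constants are assumptions taken
from the description in this mail (16 CPUs per leaf rcu_node, 8 CPUs per
socket), and all function names are hypothetical.

```c
#include <assert.h>

/* Assumed layout: 16 CPUs per leaf rcu_node, 8 CPUs per socket. */
enum { CPUS_PER_LEAF = 16, CPUS_PER_SOCKET = 8 };

static int leaf_of(int cpu)
{
	return cpu / CPUS_PER_LEAF;
}

/* Consecutive numbering: CPUs 0-7 on socket 0, CPUs 8-15 on socket 1. */
static int socket_of(int cpu)
{
	return cpu / CPUS_PER_SOCKET;
}

/* Round-robin numbering: CPU n sits on socket n % nr_sockets. */
static int socket_of_round_robin(int cpu, int nr_sockets)
{
	return cpu % nr_sockets;
}

/* Number of sockets covered by one leaf under consecutive numbering. */
static int sockets_per_leaf(int leaf)
{
	int first = leaf * CPUS_PER_LEAF;
	int last = first + CPUS_PER_LEAF - 1;

	return socket_of(last) - socket_of(first) + 1;
}

static void topology_demo(void)
{
	assert(leaf_of(0) == leaf_of(15));    /* one leaf ...               */
	assert(socket_of(7) != socket_of(8)); /* ... spanning two sockets   */
	assert(sockets_per_leaf(0) == 2);
	/* Round-robin: neighbouring CPUs land on different sockets. */
	assert(socket_of_round_robin(0, 2) != socket_of_round_robin(1, 2));
}
```

In both layouts a choice made per NUMA node and a choice made per leaf
rcu_node can pick different CPUs, which is why testing on several
topologies matters.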

Thanx, Paul

> Regards,
> Boqun
> 
> > 
> > > Regards,
> > > Boqun
> > 
> > Sebastian




Re: [PATCH v2 3/5] clk: lochnagar: Add support for the Cirrus Logic Lochnagar

2018-10-15 Thread Stephen Boyd
Quoting Charles Keepax (2018-10-15 03:49:05)
> On Fri, Oct 12, 2018 at 08:59:56AM -0700, Stephen Boyd wrote:
> > Quoting Charles Keepax (2018-10-11 06:26:02)
> > > On Thu, Oct 11, 2018 at 12:00:46AM -0700, Stephen Boyd wrote:
> > > > Quoting Charles Keepax (2018-10-08 06:25:40)
> > > > > +struct lochnagar_clk_priv {
> > > > > +   struct device *dev;
> > > > > +   struct lochnagar *lochnagar;
> > > > 
> > > > Is this used for anything besides getting the regmap? Can you get the
> > > > pointer to the parent in probe and use that to get the regmap pointer
> > > > from dev_get_regmap() and also use the of_node of the parent to register
> > > > a clk provider? It would be nice to avoid including the mfd header file
> > > > unless it's providing something useful.
> > > > 
> > > 
> > > It is also used to find out which type of Lochnagar we have
> > > connected, which determines which clocks we should register. I
> > 
> > Can that be done through some device ID? So the driver can be untangled
> > from the MFD part.
> > 
> > > could perhaps pass that using another mechanism but we would
> > > still want to include the MFD stuff to get the register
> > > definitions. So this approach seems simplest.
> > 
> > Can the register definitions be moved to this clk driver?
> > 
> > Maybe you now get the hint, but I'd really like to be able to merge and
> > compile the clk driver all by itself without relying on the parent MFD
> > device to provide anything at compile time.
> > 
> 
> If you feel strongly about it we can, but since the MFD needs to hold
> the regmap (which needs to define the read/volatile regs and defaults),
> these would need to be duplicate defines, and personally I would
> rather only have one copy as it makes updating things much less
> error prone.

Ok if there's going to be read/volatile regs and defaults then it makes
sense to leave the defines in some shared header file, which is
unfortunate for the independent merge of driver bits. Either way, I
would prefer we don't use struct lochnagar in this driver and move to
more generic structures like struct regmap and express the type of MFD
to this device driver some other way.

> 
> > > > > +   if (lclk->regmap.dir_mask) {
> > > > > +   ret = regmap_update_bits(regmap, lclk->regmap.cfg_reg,
> > > > > +lclk->regmap.dir_mask,
> > > > > +lclk->regmap.dir_mask);
> > > > > +   if (ret < 0) {
> > > > > +   dev_err(priv->dev, "Failed to set %s 
> > > > > direction: %d\n",
> > > > 
> > > > What does direction mean?
> > > > 
> > > 
> > > Some of the clocks can both generate and receive a clock. For
> > > example the PSIA (external audio interface) MCLKs, the attached
> > > device could be expecting or providing a master audio clock. If
> > > the user assigns a parent to the clock we assume the attached
> > > device is providing a clock to us, otherwise we assume we are
> > > providing the clock.
> > 
> > And this directionality is determined by dir_mask? It would be great if
> > this sort of information was in the commit text or in a comment in the
> > driver so we know what's going on here.
> > 
> 
> No problem, will make this clearer.
> 

Thanks!
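
For readers following along, the direction handling discussed above comes
down to a masked read-modify-write. The sketch below models that in
userspace: update_bits() mirrors the masking that regmap_update_bits()
performs, while EXAMPLE_DIR_MASK and the "bit set means the attached device
provides the clock" convention are assumptions made for illustration, not
taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical direction bit; the real mask lives in the driver's tables. */
#define EXAMPLE_DIR_MASK 0x0100u

/* Masked read-modify-write: only the bits in mask take their value from val. */
static uint32_t update_bits(uint32_t old, uint32_t mask, uint32_t val)
{
	return (old & ~mask) | (val & mask);
}

/* Assumed convention: a clock with a parent is driven by the attached device. */
static uint32_t set_direction(uint32_t cfg, int has_parent)
{
	return update_bits(cfg, EXAMPLE_DIR_MASK,
			   has_parent ? EXAMPLE_DIR_MASK : 0);
}

static void direction_demo(void)
{
	/* Setting the direction bit leaves the other config bits alone. */
	assert(set_direction(0x0003u, 1) == 0x0103u);
	/* Clearing it restores the original value. */
	assert(set_direction(0x0103u, 0) == 0x0003u);
}
```

A comment along these lines next to the dir_mask handling would answer the
"what does direction mean" question for the next reader.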



Re: [PATCH 1/2] arm64: dts: meson: fix reserve memory regions

2018-10-15 Thread Mark Rutland
On Mon, Oct 15, 2018 at 06:28:32PM +0200, Jerome Brunet wrote:
> Since commit 50d7ba36b916 ("arm64: export memblock_reserve()d regions via
> /proc/iomem") was merged, Amlogic boards using mainline u-boot started
> showing the following warning:
>
> WARNING: CPU: 0 PID: 1 at arch/arm64/kernel/setup.c:271 reserve_memblock_reserved_regions+0xd8/0x144
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc7-00263-g385684b3eb27-dirty #254
> pstate: 4005 (nZcv daif -PAN -UAO)
> pc : reserve_memblock_reserved_regions+0xd8/0x144
> lr : reserve_memblock_reserved_regions+0xd0/0x144
> [...]
>
> This is due to u-boot setting some /memreserve/ regions while our
> dts declares reserved memory on the same regions with no-map.
>
> The conflict produces the warning. This is fixed by using /memreserve/
> in our dts as well, which is probably something we should have done from
> the beginning.

A /memreserve/ does not ensure no-map, and the kernel will map regions
which are described in a memory node and only protected with a
/memreserve/ entry.

Is it safe for the kernel to map these? e.g. speculative fetches won't
trigger a TrustZone controller to reboot the system?

... or are they not in memory nodes to begin with?

Thanks,
Mark.

>
> Cc: sta...@vger.kernel.org
> Cc: Neil Armstrong 
> Signed-off-by: Jerome Brunet 
> ---
>
> Hi Kevin,
>
> I would have liked to put a Fixes tag above but I could not figure out
> which commit to pick, considering how much we changed those regions in
> the past. If you have a suggestion, I'll be happy to repost this patch.
> If you prefer, feel free to amend this patch directly.
>
> Cheers
> Jerome
>
>  arch/arm64/boot/dts/amlogic/meson-axg.dtsi | 24 +--
>  arch/arm64/boot/dts/amlogic/meson-gx.dtsi  | 27 --
>  2 files changed, 15 insertions(+), 36 deletions(-)
>
> diff --git a/arch/arm64/boot/dts/amlogic/meson-axg.dtsi b/arch/arm64/boot/dts/amlogic/meson-axg.dtsi
> index 178d8e8c56b8..06a06f11f114 100644
> --- a/arch/arm64/boot/dts/amlogic/meson-axg.dtsi
> +++ b/arch/arm64/boot/dts/amlogic/meson-axg.dtsi
> @@ -13,6 +13,12 @@
>  #include 
>  #include 
>
> +/* 16 MiB reserved for Hardware ROM Firmware */
> +/memreserve/ 0x0 0x100;
> +
> +/* 3 MiB reserved for ARM Trusted Firmware (BL31) */
> +/memreserve/ 0x0500 0x30;
> +
>  / {
>   compatible = "amlogic,meson-axg";
>
> @@ -115,24 +121,6 @@
>   method = "smc";
>   };
>
> - reserved-memory {
> - #address-cells = <2>;
> - #size-cells = <2>;
> - ranges;
> -
> - /* 16 MiB reserved for Hardware ROM Firmware */
> - hwrom_reserved: hwrom@0 {
> - reg = <0x0 0x0 0x0 0x100>;
> - no-map;
> - };
> -
> - /* Alternate 3 MiB reserved for ARM Trusted Firmware (BL31) */
> - secmon_reserved: secmon@500 {
> - reg = <0x0 0x0500 0x0 0x30>;
> - no-map;
> - };
> - };
> -
>   soc {
>   compatible = "simple-bus";
>   #address-cells = <2>;
> diff --git a/arch/arm64/boot/dts/amlogic/meson-gx.dtsi 
> b/arch/arm64/boot/dts/amlogic/meson-gx.dtsi
> index 676a995fb912..23e879b29b1e 100644
> --- a/arch/arm64/boot/dts/amlogic/meson-gx.dtsi
> +++ b/arch/arm64/boot/dts/amlogic/meson-gx.dtsi
> @@ -13,6 +13,15 @@
>  #include 
>  #include 
>
> +/* 16 MiB reserved for Hardware ROM Firmware */
> +/memreserve/ 0x0 0x100;
> +
> +/* 2 MiB reserved for ARM Trusted Firmware (BL31) */
> +/memreserve/ 0x1000 0x20;
> +
> +/* Alternate 3 MiB reserved for ARM Trusted Firmware (BL31) */
> +/memreserve/ 0x0500 0x30;
> +
>  / {
>   interrupt-parent = <&gic>;
>   #address-cells = <2>;
> @@ -23,24 +32,6 @@
>   #size-cells = <2>;
>   ranges;
>
> - /* 16 MiB reserved for Hardware ROM Firmware */
> - hwrom_reserved: hwrom@0 {
> - reg = <0x0 0x0 0x0 0x100>;
> - no-map;
> - };
> -
> - /* 2 MiB reserved for ARM Trusted Firmware (BL31) */
> - secmon_reserved: secmon@1000 {
> - reg = <0x0 0x1000 0x0 0x20>;
> - no-map;
> - };
> -
> - /* Alternate 3 MiB reserved for ARM Trusted Firmware (BL31) */
> - secmon_reserved_alt: secmon@500 {
> - reg = <0x0 0x0500 0x0 0x30>;
> - no-map;
> - };
> -
>   linux,cma {
>   compatible = "shared-dma-pool";
>   reusable;
> --
> 2.17.2
>

[RFC][PATCH 0/3] pgtable bytes mis-accounting v2

2018-10-15 Thread Martin Schwidefsky
Greetings,

the first test patch to fix the pgtable_bytes mis-accounting on s390
still had a few problems. For one, it did not work on x86.

Changes v1 -> v2:

 - Split the patch into three parts: one patch to add the mm_pxd_folded
   helpers, one patch to use the helpers in mm_[dec|inc]_nr_[pmds|puds],
   and finally the fix for s390.

 - Drop the use of __is_defined, it does not work with the
   __PAGETABLE_PxD_FOLDED defines

 - Do not change the basic #ifdef'ery in mm.h, just add the calls
   to mm_pxd_folded to the pgtable_bytes accounting functions. This
   fixes the compile error on alpha (and potentially on other archs).

Martin Schwidefsky (3):
  mm: introduce mm_[p4d|pud|pmd]_folded
  mm: add mm_pxd_folded checks to pgtable_bytes accounting functions
  s390/mm: fix mis-accounting of pgtable_bytes

 arch/s390/include/asm/mmu_context.h |  5 
 arch/s390/include/asm/pgalloc.h |  6 ++---
 arch/s390/include/asm/pgtable.h | 18 ++
 arch/s390/include/asm/tlb.h |  6 ++---
 include/linux/mm.h  | 48 +
 5 files changed, 72 insertions(+), 11 deletions(-)

-- 
2.16.4



[PATCH 1/3] mm: introduce mm_[p4d|pud|pmd]_folded

2018-10-15 Thread Martin Schwidefsky
Add three architecture-overridable functions to test whether the
p4d, pud, or pmd layer of a page table is folded.

Signed-off-by: Martin Schwidefsky 
---
 include/linux/mm.h | 40 
 1 file changed, 40 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0416a7204be3..d1029972541c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -105,6 +105,46 @@ extern int mmap_rnd_compat_bits __read_mostly;
 #define mm_zero_struct_page(pp)  ((void)memset((pp), 0, sizeof(struct page)))
 #endif
 
+/*
+ * On some architectures it depends on the mm if the p4d/pud or pmd
+ * layer of the page table hierarchy is folded or not.
+ */
+#ifndef mm_p4d_folded
+#define mm_p4d_folded(mm) mm_p4d_folded(mm)
+static inline bool mm_p4d_folded(struct mm_struct *mm)
+{
+#ifdef __PAGETABLE_P4D_FOLDED
+   return 1;
+#else
+   return 0;
+#endif
+}
+#endif
+
+#ifndef mm_pud_folded
+#define mm_pud_folded(mm) mm_pud_folded(mm)
+static inline bool mm_pud_folded(struct mm_struct *mm)
+{
+#ifdef __PAGETABLE_PUD_FOLDED
+   return 1;
+#else
+   return 0;
+#endif
+}
+#endif
+
+#ifndef mm_pmd_folded
+#define mm_pmd_folded(mm) mm_pmd_folded(mm)
+static inline bool mm_pmd_folded(struct mm_struct *mm)
+{
+#ifdef __PAGETABLE_PMD_FOLDED
+   return 1;
+#else
+   return 0;
+#endif
+}
+#endif
+
 /*
  * Default maximum number of active map areas, this limits the number of vmas
  * per mm struct. Users can overwrite this number by sysctl but there is a
-- 
2.16.4
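
For readers unfamiliar with the override idiom used in this patch, here is
a minimal user-space sketch of it. It is not kernel code: struct mm_struct
is a hypothetical stand-in and DEMO_REGION3_SIZE is an illustrative
constant. The "architecture" side defines its helper plus a same-named
macro, so the "generic" default guarded by #ifndef is skipped for that
helper, while a helper without an override falls back to the default.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct mm_struct. */
struct mm_struct {
	unsigned long asce_limit;	/* upper bound of the address space */
};

/* Illustrative threshold; the real constant lives in the arch headers. */
#define DEMO_REGION3_SIZE	(1UL << 31)

/*
 * "Architecture" side: define the helper, then a macro of the same name
 * so that the generic code below can detect the override with #ifndef.
 */
static inline bool mm_pmd_folded(struct mm_struct *mm)
{
	return mm->asce_limit <= DEMO_REGION3_SIZE;
}
#define mm_pmd_folded(mm) mm_pmd_folded(mm)

/* "Generic" side: each default is compiled only without an override. */
#ifndef mm_pmd_folded
#define mm_pmd_folded(mm) mm_pmd_folded(mm)
static inline bool mm_pmd_folded(struct mm_struct *mm)
{
	(void)mm;
	return false;
}
#endif

#ifndef mm_pud_folded
#define mm_pud_folded(mm) mm_pud_folded(mm)
static inline bool mm_pud_folded(struct mm_struct *mm)
{
	(void)mm;	/* the default does not depend on the mm */
#ifdef __PAGETABLE_PUD_FOLDED
	return true;
#else
	return false;
#endif
}
#endif
```

The self-referential macro is harmless because the preprocessor does not
re-expand a macro inside its own expansion; it only serves as a marker
that #ifndef can test.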



[PATCH 2/3] mm: add mm_pxd_folded checks to pgtable_bytes accounting functions

2018-10-15 Thread Martin Schwidefsky
The common mm code calls mm_dec_nr_pmds() and mm_dec_nr_puds()
in free_pgtables() if the address range spans a full pud or pmd.
If mm_dec_nr_puds/mm_dec_nr_pmds are non-empty due to configuration
settings they blindly subtract the size of the pmd or pud table from
pgtable_bytes even if the pud or pmd page table layer is folded.

Add explicit mm_[pmd|pud]_folded checks to the four pgtable_bytes
accounting functions mm_inc_nr_puds, mm_inc_nr_pmds, mm_dec_nr_puds
and mm_dec_nr_pmds. As the check for folded page tables can be
overridden by the architecture, this makes it possible to keep a
correct pgtable_bytes value for platforms that use a dynamic number
of page table levels.

Signed-off-by: Martin Schwidefsky 
---
 include/linux/mm.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index d1029972541c..67f55c71e59a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1764,11 +1764,15 @@ int __pud_alloc(struct mm_struct *mm, p4d_t *p4d, 
unsigned long address);
 
 static inline void mm_inc_nr_puds(struct mm_struct *mm)
 {
+   if (mm_pud_folded(mm))
+   return;
atomic_long_add(PTRS_PER_PUD * sizeof(pud_t), &mm->pgtables_bytes);
 }
 
 static inline void mm_dec_nr_puds(struct mm_struct *mm)
 {
+   if (mm_pud_folded(mm))
+   return;
atomic_long_sub(PTRS_PER_PUD * sizeof(pud_t), &mm->pgtables_bytes);
 }
 #endif
@@ -1788,11 +1792,15 @@ int __pmd_alloc(struct mm_struct *mm, pud_t *pud, 
unsigned long address);
 
 static inline void mm_inc_nr_pmds(struct mm_struct *mm)
 {
+   if (mm_pmd_folded(mm))
+   return;
atomic_long_add(PTRS_PER_PMD * sizeof(pmd_t), &mm->pgtables_bytes);
 }
 
 static inline void mm_dec_nr_pmds(struct mm_struct *mm)
 {
+   if (mm_pmd_folded(mm))
+   return;
atomic_long_sub(PTRS_PER_PMD * sizeof(pmd_t), &mm->pgtables_bytes);
 }
 #endif
-- 
2.16.4
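
The effect of the added checks can be illustrated with a small user-space
model. This is only a sketch under assumptions: struct demo_mm, the
check_folded switch (false models the helpers before this patch) and the
table geometry are all made up for the example, and C11 atomics stand in
for atomic_long_add()/atomic_long_sub().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PTRS_PER_PUD	2048	/* illustrative table geometry */
typedef uint64_t demo_pud_t;
#define DEMO_PUD_TABLE_BYTES	((long)(DEMO_PTRS_PER_PUD * sizeof(demo_pud_t)))

/* Hypothetical stand-in for the accounting state in struct mm_struct. */
struct demo_mm {
	atomic_long pgtables_bytes;
	bool pud_folded;	/* what mm_pud_folded() would report */
	bool check_folded;	/* false models the helpers before the patch */
};

static void demo_mm_init(struct demo_mm *mm, bool pud_folded, bool check_folded)
{
	atomic_init(&mm->pgtables_bytes, 0);
	mm->pud_folded = pud_folded;
	mm->check_folded = check_folded;
}

static void demo_inc_nr_puds(struct demo_mm *mm)
{
	if (mm->check_folded && mm->pud_folded)
		return;
	atomic_fetch_add(&mm->pgtables_bytes, DEMO_PUD_TABLE_BYTES);
}

static void demo_dec_nr_puds(struct demo_mm *mm)
{
	if (mm->check_folded && mm->pud_folded)
		return;
	atomic_fetch_sub(&mm->pgtables_bytes, DEMO_PUD_TABLE_BYTES);
}

/* Teardown without a matching increment, as with a folded pud level. */
static long demo_teardown_only(bool pud_folded, bool check_folded)
{
	struct demo_mm mm;

	demo_mm_init(&mm, pud_folded, check_folded);
	demo_dec_nr_puds(&mm);
	return atomic_load(&mm.pgtables_bytes);
}

/* A real pud table: allocated once, freed once, counter ends at zero. */
static long demo_alloc_then_free(bool pud_folded, bool check_folded)
{
	struct demo_mm mm;

	demo_mm_init(&mm, pud_folded, check_folded);
	demo_inc_nr_puds(&mm);
	demo_dec_nr_puds(&mm);
	return atomic_load(&mm.pgtables_bytes);
}
```

With the check enabled, the teardown-only path leaves the counter at zero
for a folded level; without it, the counter goes negative by one table.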



[PATCH 3/3] s390/mm: fix mis-accounting of pgtable_bytes

2018-10-15 Thread Martin Schwidefsky
In case a fork or a clone system call fails in copy_process and the
error handling does the mmput() at the bad_fork_cleanup_mm label, the
following warning message will appear on the console:

  BUG: non-zero pgtables_bytes on freeing mm: 16384

The reason for that is the tricks we play with mm_inc_nr_puds() and
mm_inc_nr_pmds() in init_new_context().

A normal 64-bit process has 3 levels of page table; the p4d level and
the pud level are folded. On process termination the free_pud_range()
function in mm/memory.c will subtract 16KB from pgtable_bytes with a
mm_dec_nr_puds() call, even though there is no actual pud table.

One issue with this is that pgtable_bytes is usually off by a few
kilobytes, but the more severe problem is that for a failed fork or
clone the free_pgtables() function is not called. In this case there
is no mm_dec_nr_puds() or mm_dec_nr_pmds() to go together with the
mm_inc_nr_puds() and mm_inc_nr_pmds() in init_new_context().
The pgtable_bytes will be off by 16384 or 32768 bytes and we get the
BUG message. The message itself is purely cosmetic, but annoying.

To fix this, override the mm_pmd_folded, mm_pud_folded and mm_p4d_folded
functions to check for the true size of the address space.

Reported-by: Li Wang 
Signed-off-by: Martin Schwidefsky 
---
 arch/s390/include/asm/mmu_context.h |  5 -
 arch/s390/include/asm/pgalloc.h |  6 +++---
 arch/s390/include/asm/pgtable.h | 18 ++
 arch/s390/include/asm/tlb.h |  6 +++---
 4 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/arch/s390/include/asm/mmu_context.h 
b/arch/s390/include/asm/mmu_context.h
index 0717ee76885d..f1ab9420ccfb 100644
--- a/arch/s390/include/asm/mmu_context.h
+++ b/arch/s390/include/asm/mmu_context.h
@@ -45,8 +45,6 @@ static inline int init_new_context(struct task_struct *tsk,
mm->context.asce_limit = STACK_TOP_MAX;
mm->context.asce = __pa(mm->pgd) | _ASCE_TABLE_LENGTH |
   _ASCE_USER_BITS | _ASCE_TYPE_REGION3;
-   /* pgd_alloc() did not account this pud */
-   mm_inc_nr_puds(mm);
break;
case -PAGE_SIZE:
/* forked 5-level task, set new asce with new_mm->pgd */
@@ -62,9 +60,6 @@ static inline int init_new_context(struct task_struct *tsk,
/* forked 2-level compat task, set new asce with new mm->pgd */
mm->context.asce = __pa(mm->pgd) | _ASCE_TABLE_LENGTH |
   _ASCE_USER_BITS | _ASCE_TYPE_SEGMENT;
-   /* pgd_alloc() did not account this pmd */
-   mm_inc_nr_pmds(mm);
-   mm_inc_nr_puds(mm);
}
crst_table_init((unsigned long *) mm->pgd, pgd_entry_type(mm));
return 0;
diff --git a/arch/s390/include/asm/pgalloc.h b/arch/s390/include/asm/pgalloc.h
index f0f9bcf94c03..5ee733720a57 100644
--- a/arch/s390/include/asm/pgalloc.h
+++ b/arch/s390/include/asm/pgalloc.h
@@ -36,11 +36,11 @@ static inline void crst_table_init(unsigned long *crst, 
unsigned long entry)
 
 static inline unsigned long pgd_entry_type(struct mm_struct *mm)
 {
-   if (mm->context.asce_limit <= _REGION3_SIZE)
+   if (mm_pmd_folded(mm))
return _SEGMENT_ENTRY_EMPTY;
-   if (mm->context.asce_limit <= _REGION2_SIZE)
+   if (mm_pud_folded(mm))
return _REGION3_ENTRY_EMPTY;
-   if (mm->context.asce_limit <= _REGION1_SIZE)
+   if (mm_p4d_folded(mm))
return _REGION2_ENTRY_EMPTY;
return _REGION1_ENTRY_EMPTY;
 }
diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 0e7cb0dc9c33..de05466ce50c 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -485,6 +485,24 @@ static inline int is_module_addr(void *addr)
   _REGION_ENTRY_PROTECT | \
   _REGION_ENTRY_NOEXEC)
 
+static inline bool mm_p4d_folded(struct mm_struct *mm)
+{
+   return mm->context.asce_limit <= _REGION1_SIZE;
+}
+#define mm_p4d_folded(mm) mm_p4d_folded(mm)
+
+static inline bool mm_pud_folded(struct mm_struct *mm)
+{
+   return mm->context.asce_limit <= _REGION2_SIZE;
+}
+#define mm_pud_folded(mm) mm_pud_folded(mm)
+
+static inline bool mm_pmd_folded(struct mm_struct *mm)
+{
+   return mm->context.asce_limit <= _REGION3_SIZE;
+}
+#define mm_pmd_folded(mm) mm_pmd_folded(mm)
+
 static inline int mm_has_pgste(struct mm_struct *mm)
 {
 #ifdef CONFIG_PGSTE
diff --git a/arch/s390/include/asm/tlb.h b/arch/s390/include/asm/tlb.h
index 457b7ba0fbb6..b31c779cf581 100644
--- a/arch/s390/include/asm/tlb.h
+++ b/arch/s390/include/asm/tlb.h
@@ -136,7 +136,7 @@ static inline void pte_free_tlb(struct mmu_gather *tlb, 
pgtable_t pte,
 static inline void pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd,
unsigned long address)
 {
-   if (tlb->mm->context.asce_limit <= _REGION3_SIZE)
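
The numbers in the commit message above (16384 and 32768) can be
reproduced with a toy model of the accounting. This is a sketch, not the
s390 code: struct demo_mm, the "levels" field and all demo_* helpers are
hypothetical, and only the accounting for the folded pud/pmd levels is
modelled (real tables are assumed to be balanced elsewhere).

```c
#include <stdbool.h>

#define DEMO_TABLE_BYTES	16384L	/* one 16KB table, as in the message above */

/* Hypothetical stand-in for an mm with a given number of table levels. */
struct demo_mm {
	int levels;		/* 2 (compat), 3 (normal 64-bit) or more */
	long pgtables_bytes;
};

static bool demo_pud_folded(const struct demo_mm *mm)
{
	return mm->levels <= 3;
}

static bool demo_pmd_folded(const struct demo_mm *mm)
{
	return mm->levels <= 2;
}

/* Pre-patch init_new_context(): pre-charge the folded levels. */
static void demo_init_context_old(struct demo_mm *mm)
{
	if (mm->levels == 3)
		mm->pgtables_bytes += DEMO_TABLE_BYTES;		/* pud */
	else if (mm->levels == 2)
		mm->pgtables_bytes += 2 * DEMO_TABLE_BYTES;	/* pmd + pud */
}

/* What teardown subtracts for the folded levels. */
static void demo_free_pgtables(struct demo_mm *mm, bool patched)
{
	if (patched)
		return;	/* the folded checks make both decrements no-ops */
	if (demo_pud_folded(mm))
		mm->pgtables_bytes -= DEMO_TABLE_BYTES;
	if (demo_pmd_folded(mm))
		mm->pgtables_bytes -= DEMO_TABLE_BYTES;
}

/* Bytes still accounted when the mm is freed. */
static long demo_leftover(int levels, bool patched, bool fork_failed)
{
	struct demo_mm mm = { .levels = levels, .pgtables_bytes = 0 };

	if (!patched)
		demo_init_context_old(&mm);
	if (!fork_failed)
		demo_free_pgtables(&mm, patched);
	return mm.pgtables_bytes;
}
```

In this model the old scheme only balances when teardown actually runs;
a failed fork leaves one or two tables accounted, which matches the
values in the BUG message. With the folded checks nothing is charged or
subtracted for a folded level, so both paths end at zero.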

RE: [PATCH 2/2] clk: imx: imx7d: remove clks_init_on array

2018-10-15 Thread Stephen Boyd
Quoting Anson Huang (2018-10-15 02:33:35)
> > > > >
> > > >
> > > > Why can't we add clks to the op-tee node in DT's /firmware container?
> > > > Then any clks in there can be turned on forever and left enabled by
> > > > the linux driver?
> > >
> > > I did NOT run op-tee with Linux-next kernel before, can you advise more?
> > 
> > Neither have I, so I can't advise more.
> > 
> > > And I think if op-tee has such requirement, can we have another patch
> > > to cover it?
> > 
> > Yes.
> > 
> > 
> > > I believe all other i.MX platforms also have same requirements if
> > > considering op-tee support, so I think it should be another topic, what 
> > > do you
> > think?
> > >
> > 
> > I'm going to drop these patches from my review queue. Please resend them
> > and please include the op-tee patches too.
> 
> 
> I do NOT know how to include the op-tee patch to meet special requirement, 
> should
> the op-tee related patch be added later when someone actually add the op-tee 
> support for i.MX7?
> It should NOT block this patch set, do you think we can add this patch set 
> first?
> 

Please resend the two patches. In the commit text for the second patch,
describe the plan to remove CLK_IS_CRITICAL from these clks by adding an
OP-TEE device/driver to the kernel to keep these clks enabled. My
understanding is that there isn't an OP-TEE driver right now, but these
clks are used by the firmware and can't be turned off in Linux. If in
the future we want to be able to turn them on and off, we'll need to add
them to an OP-TEE device node and have that driver manage the clks.

How that will work when a system doesn't enable the OP-TEE driver I'm
not sure. We may need to develop some system whereby clks like this are
handed from the clk controller to the consumer driver when it's enabled
for further power management, but if they're never handed off, they're
kept on forever as is done here. Anyway, please resend with a note
about why these are marked CLK_IS_CRITICAL.
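
To make the effect of marking a clk critical concrete, here is a toy
user-space model. It is a sketch under assumptions, not the clk
framework: struct demo_clk, the demo_* functions, the flag value and the
clock names are all invented for illustration. The idea modelled is that
a critical clk gets a reference at registration, so a later pass that
gates unclaimed clks leaves it running.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_CLK_IS_CRITICAL	(1u << 0)	/* illustrative bit, not the kernel's value */

/* Hypothetical, heavily simplified clock descriptor. */
struct demo_clk {
	const char *name;
	unsigned int flags;
	unsigned int enable_count;	/* references holding the clock on */
	bool hw_enabled;		/* state of the gate in hardware */
};

/* Registration: a critical clock gets a reference that is never dropped. */
static void demo_clk_register(struct demo_clk *clk)
{
	if (clk->flags & DEMO_CLK_IS_CRITICAL) {
		clk->enable_count++;
		clk->hw_enabled = true;
	}
}

/* Late pass: gate every running clock that nobody holds a reference to. */
static size_t demo_clk_disable_unused(struct demo_clk *clks, size_t n)
{
	size_t gated = 0;

	for (size_t i = 0; i < n; i++) {
		if (clks[i].enable_count == 0 && clks[i].hw_enabled) {
			clks[i].hw_enabled = false;
			gated++;
		}
	}
	return gated;
}
```

A hand-off scheme like the one described above would amount to a
consumer driver taking over that permanent reference once it probes.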



Re: [PATCH] proc: fix proc-self-map-files selftest for arm

2018-10-15 Thread Rafael David Tinoco

On 10/11/18 7:00 PM, Cyrill Gorcunov wrote:

On Fri, Oct 12, 2018 at 12:30:06AM +0300, Alexey Dobriyan wrote:

On Fri, Oct 12, 2018 at 12:02:56AM +0300, Cyrill Gorcunov wrote:

On Thu, Oct 11, 2018 at 11:56:01PM +0300, Alexey Dobriyan wrote:


As the comment in the beginning says this test is specifically for address 0.
Maybe it should be ifdeffed with __arm__ then.


Is there some other reason than allocating non-mergeable VMA?


IIRC the reason is to test address 0 as it is effectively banned
for userspace so if it will be broken, it will be broken silently
for a long time.


This is rather a side effect of the test because the primary reason
was to check procfs numbers conversion, right? Don't get me wrong,
I don't mind about __arm__ define or similar, this is fine for
one architecture, but if there comes more we will get a number
of #ifdefs which is unrelated to procfs numeric routines at all.


That is what I also had in mind, thus the patch. I just realized we had
another issue on LKFT (our functional tests tool) for
proc-self-map-files-001.c. Test 001 does pretty much the same as 002,
but without the MAP_FIXED mmap flag.


Is it okay to consolidate both tests into just one, and focus on checking
procfs numbers conversion only, rather than whether mapping 0 is allowed
or not? Can I send a v2 with that in mind?





As for "unmergeable" libc here doesn't map /dev/zero. I know how to
avoid even theoretical breakage by creating binaries by hand but it
will be probably too much.


Sure.
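
As an aside on what "procfs numbers conversion" refers to: entries under
/proc/self/map_files are named after the mapping's start and end
addresses in hex, separated by a dash. The sketch below only builds and
parses such a name in user space and does not touch /proc; the demo_*
helpers are hypothetical, and the parser is more lenient than the kernel
(strtoul() accepts prefixes the kernel may reject).

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Build the "<start>-<end>" name for a mapping; returns the name length. */
static int demo_map_files_name(char *buf, size_t len,
			       unsigned long start, unsigned long end)
{
	return snprintf(buf, len, "%lx-%lx", start, end);
}

/* Parse "<start>-<end>" back into two numbers; false on malformed input. */
static bool demo_parse_map_files_name(const char *name,
				      unsigned long *start, unsigned long *end)
{
	char *p;

	*start = strtoul(name, &p, 16);
	if (p == name || *p != '-')
		return false;
	name = p + 1;
	*end = strtoul(name, &p, 16);
	return p != name && *p == '\0';
}
```

A test built around this round trip works for any mapped address, so it
would not need a fixed mapping at address 0.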





Re: [PATCH v2 3/5] clk: lochnagar: Add support for the Cirrus Logic Lochnagar

2018-10-15 Thread Mark Brown
On Mon, Oct 15, 2018 at 09:39:59AM -0700, Stephen Boyd wrote:

> Ok if there's going to be read/volatile regs and defaults then it makes
> sense to leave the defines in some shared header file, which is
> unfortunate for the independent merge of driver bits. Either way, I

Kconfig dependencies take care of that pretty well - if there's a
dependency on the MFD then the function driver just won't get built
until the MFD gets merged.  




Re: [PATCH] x86/mm: annotate no_context with UNWIND_HINTS

2018-10-15 Thread Nick Desaulniers
On Mon, Oct 15, 2018 at 9:03 AM Andy Lutomirski  wrote:
>
> On Mon, Oct 15, 2018 at 8:31 AM Josh Poimboeuf  wrote:
> >
> > On Mon, Oct 15, 2018 at 08:22:21AM -0700, Nathan Chancellor wrote:
> > > > >>> @@ -760,9 +760,11 @@ no_context(struct pt_regs *regs, unsigned long 
> > > > >>> error_code,
> > > > >>> * and then double-fault, though, because we're 
> > > > >>> likely to
> > > > >>> * break the console driver and lose most of the 
> > > > >>> stack dump.
> > > > >>> */
> > > > >>> -   asm volatile ("movq %[stack], %%rsp\n\t"
> > > > >>> +   asm volatile (UNWIND_HINT_SAVE
> > > > >>> + "movq %[stack], %%rsp\n\t"
> > > > >>>  "call handle_stack_overflow\n\t"
> > > > >>> - "1: jmp 1b"
> > > > >>> + "1: jmp 1b\n\t"
> > > > >>> + UNWIND_HINT_RESTORE
> > > > >>>  : ASM_CALL_CONSTRAINT
> > > > >>>  : "D" ("kernel stack overflow (page 
> > > > >>> fault)"),
> > > > >>>"S" (regs), "d" (address),
> > > > >>
> > > > >> NAK.  Just below this snippet is unreachable();
> > > > >>
> > > > >> Can you reply with objtool -dr output on a problematic fault.o?  
> > > > >> Josh,
> > > > >> it *looks* like annotate_unreachable() should be doing the right
> > > > >> thing, but something is clearly busted.
> > > > >>
> > > > >> Also, shouldn't compiler-clang.h contain a reasonable definition of
> > > > >> unreachable()?
> > > > >>
> > > > >> --Andy
> > > > >
> > > > > Hi Andy,
> > > > >
> > > > > Did you mean 'objdump -dr'? If so, here you go (rather long, sorry if 
> > > > > I
> > > > > should have pasted it here instead):
> > > > > https://gist.github.com/nathanchance/f038bb0a6653b975bb8a4e64fcd5503e
> > > > >
> > > > >
> > > >
> > > > Hmm, -dr wasn’t quite enough to dump the .discard bits, assuming 
> > > > they’re there at all. Can you just put the whole .o file somewhere?
> > >
> > > Here you go: https://nathanchance.me/downloads/.tmp/fault.o
> >
> > $ eu-readelf -S /tmp/fault.o  |grep reachable
> > [12] .discard.reachable   PROGBITS  2bc0 0014 00   0  1
> > [13] .rela.discard.reachable RELA  2bd8 0078 24 I 32  12  8
> >
> > That confirms that you need a clang version of the unreachable() macro.
> >
>
> Duh.
>
> That being said, the generic macro is:
>
> # define unreachable() do { annotate_reachable(); do { } while (1); } while (0)
>
> I'm probably missing some subtlety here, but shouldn't that be
> annotate_*un*reachable()?
>
> Of course, there are any number of reasons why there should be a real
> definition.  Nathan and Nick, does adding something like:
>
> #define unreachable() \
> do {\
> annotate_unreachable(); \
> __builtin_unreachable();\
> } while (0)
>
> to compiler-clang.h fix the problem?

I broke this myself in commit 815f0ddb346c
("include/linux/compiler*.h: make compiler-*.h mutually exclusive").
Thanks for the suggestion; I will verify, then send a patch with your
Suggested-by tag.  Thanks everyone for helping us sort this out!
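
For reference, a user-space sketch of the macro shape suggested above.
Assumptions: annotate_unreachable() is stubbed out as a no-op here (in
the kernel it emits an annotation for objtool), the macro is renamed
demo_unreachable() to avoid clashing with any unreachable() the C
library may provide, and the demo_dir enum is invented for the example.
It relies on __builtin_unreachable(), which GCC and Clang both provide.

```c
/* Placeholder: the kernel version emits an objtool annotation instead. */
#define annotate_unreachable() do { } while (0)

/* Same shape as the suggested macro, under a different name. */
#define demo_unreachable() \
	do { \
		annotate_unreachable(); \
		__builtin_unreachable(); \
	} while (0)

enum demo_dir { DEMO_READ, DEMO_WRITE };

/* Every enumerator returns from the switch, so the end is never reached. */
static int demo_dir_to_bit(enum demo_dir dir)
{
	switch (dir) {
	case DEMO_READ:
		return 1;
	case DEMO_WRITE:
		return 2;
	}
	demo_unreachable();
}
```

Unlike the generic fallback quoted above, which spins in an empty loop,
the builtin lets the compiler drop the code after the marked point
entirely.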


Re: [PATCH 0/7] NULL pointer deref fix for stm32-dma

2018-10-15 Thread Vinod
Hi Joel,

On 08-10-18, 22:47, Joel Fernandes (Google) wrote:
> Hi Greg,
> 
> While looking at android-4.14, I found a NULL pointer deref with
> stm32-dma driver using Coccicheck errors. I found that upstream had a
> bunch of patches on stm32-dma that have fixed this and other issues, I
> applied these patches cleanly onto Android 4.14. I believe these should
> go to stable and flow into Android 4.14 from there, but I haven't tested
> this since I have no hardware to do so.
> 
> At least I can say that the coccicheck error below goes away when running:
> make coccicheck MODE=report
> ./drivers/dma/stm32-dma.c:567:18-24: ERROR: chan -> desc is NULL but 
> dereferenced.
> 
> Anyway, please consider this series for 4.14 stable, I have CC'd the
> author and others, thanks.
> 
> Pierre Yves MORDRET (7):
>   dmaengine: stm32-dma: threshold manages with bitfield feature
>   dmaengine: stm32-dma: fix incomplete configuration in cyclic mode
>   dmaengine: stm32-dma: fix typo and reported checkpatch warnings
>   dmaengine: stm32-dma: Improve memory burst management
>   dmaengine: stm32-dma: fix DMA IRQ status handling
>   dmaengine: stm32-dma: fix max items per transfer
>   dmaengine: stm32-dma: properly mask irq bits

It would be good to cherry-pick only the fixes for this. I do not feel
that the patches which add features or enhance the driver belong in stable.

Thanks
-- 
~Vinod


[PATCH] arm64: dts: sdm845: Add PSCI cpuidle low power states

2018-10-15 Thread Raju P.L.S.S.S.N
Add device bindings for cpuidle states for cpu devices.

Cc: 
Signed-off-by: Raju P.L.S.S.S.N 
---
 arch/arm64/boot/dts/qcom/sdm845.dtsi | 62 
 1 file changed, 62 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi 
b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index 0c9a2aa..32262b0 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -96,6 +96,7 @@
reg = <0x0 0x0>;
enable-method = "psci";
next-level-cache = <&L2_0>;
+   cpu-idle-states = <&C0_CPU_SPC &C0_CPU_PC &CLUSTER_PC>;
L2_0: l2-cache {
compatible = "cache";
next-level-cache = <&L3_0>;
@@ -111,6 +112,7 @@
reg = <0x0 0x100>;
enable-method = "psci";
next-level-cache = <&L2_100>;
+   cpu-idle-states = <&C0_CPU_SPC &C0_CPU_PC &CLUSTER_PC>;
L2_100: l2-cache {
compatible = "cache";
next-level-cache = <&L3_0>;
@@ -123,6 +125,7 @@
reg = <0x0 0x200>;
enable-method = "psci";
next-level-cache = <&L2_200>;
+   cpu-idle-states = <&C0_CPU_SPC &C0_CPU_PC &CLUSTER_PC>;
L2_200: l2-cache {
compatible = "cache";
next-level-cache = <&L3_0>;
@@ -135,6 +138,7 @@
reg = <0x0 0x300>;
enable-method = "psci";
next-level-cache = <&L2_300>;
+   cpu-idle-states = <&C0_CPU_SPC &C0_CPU_PC &CLUSTER_PC>;
L2_300: l2-cache {
compatible = "cache";
next-level-cache = <&L3_0>;
@@ -147,6 +151,7 @@
reg = <0x0 0x400>;
enable-method = "psci";
next-level-cache = <&L2_400>;
+   cpu-idle-states = <&C4_CPU_SPC &C4_CPU_PC &CLUSTER_PC>;
L2_400: l2-cache {
compatible = "cache";
next-level-cache = <&L3_0>;
@@ -159,6 +164,7 @@
reg = <0x0 0x500>;
enable-method = "psci";
next-level-cache = <&L2_500>;
+   cpu-idle-states = <&C4_CPU_SPC &C4_CPU_PC &CLUSTER_PC>;
L2_500: l2-cache {
compatible = "cache";
next-level-cache = <&L3_0>;
@@ -171,6 +177,7 @@
reg = <0x0 0x600>;
enable-method = "psci";
next-level-cache = <&L2_600>;
+   cpu-idle-states = <&C4_CPU_SPC &C4_CPU_PC &CLUSTER_PC>;
L2_600: l2-cache {
compatible = "cache";
next-level-cache = <&L3_0>;
@@ -183,11 +190,66 @@
reg = <0x0 0x700>;
enable-method = "psci";
next-level-cache = <&L2_700>;
+   cpu-idle-states = <&C4_CPU_SPC &C4_CPU_PC &CLUSTER_PC>;
L2_700: l2-cache {
compatible = "cache";
next-level-cache = <&L3_0>;
};
};
+
+   idle-states {
+   entry-method = "psci";
+
+   C0_CPU_SPC: c0_spc {
+   compatible = "arm,idle-state";
+   arm,psci-suspend-param = <0x4003>;
+   entry-latency-us = <350>;
+   exit-latency-us = <461>;
+   min-residency-us = <1890>;
+   local-timer-stop;
+   idle-state-name = "pc";
+   };
+
+   C0_CPU_PC: c0_pc {
+   compatible = "arm,idle-state";
+   arm,psci-suspend-param = <0x4004>;
+   entry-latency-us = <360>;
+   exit-latency-us = <531>;
+   min-residency-us = <3934>;
+   local-timer-stop;
+   idle-state-name = "rail pc";
+   };
+
+   C4_CPU_SPC: c4_spc {
+   compatible = "arm,idle-state";
+   arm,psci-suspend-param = <0x4003>;
+   entry-latency-us = <264>;
+ 

Re: [PATCH v3 2/2] dmaengine: uniphier-mdmac: add UniPhier MIO DMAC driver

2018-10-15 Thread Vinod
On 12-10-18, 01:27, Masahiro Yamada wrote:
> On Sun, Oct 7, 2018 at 1:23 AM Vinod  wrote:
> > > > > +static int uniphier_mdmac_probe(struct platform_device *pdev)
> > > > > +{
> > > > > + struct device *dev = &pdev->dev;
> > > > > + struct uniphier_mdmac_device *mdev;
> > > > > + struct dma_device *ddev;
> > > > > + struct resource *res;
> > > > > + int nr_chans, ret, i;
> > > > > +
> > > > > + nr_chans = platform_irq_count(pdev);
> > > > > + if (nr_chans < 0)
> > > > > + return nr_chans;
> > > > > +
> > > > > + ret = dma_set_mask(dev, DMA_BIT_MASK(32));
> > > > > + if (ret)
> > > > > + return ret;
> > > > > +
> > > > > + mdev = devm_kzalloc(dev, struct_size(mdev, channels, nr_chans),
> > > > > + GFP_KERNEL);
> > > > > + if (!mdev)
> > > > > + return -ENOMEM;
> > > > > +
> > > > > + res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > > > > + mdev->reg_base = devm_ioremap_resource(dev, res);
> > > > > + if (IS_ERR(mdev->reg_base))
> > > > > + return PTR_ERR(mdev->reg_base);
> > > > > +
> > > > > + mdev->clk = devm_clk_get(dev, NULL);
> > > > > + if (IS_ERR(mdev->clk)) {
> > > > > + dev_err(dev, "failed to get clock\n");
> > > > > + return PTR_ERR(mdev->clk);
> > > > > + }
> > > > > +
> > > > > + ret = clk_prepare_enable(mdev->clk);
> > > > > + if (ret)
> > > > > + return ret;
> > > > > +
> > > > > + ddev = &mdev->ddev;
> > > > > + ddev->dev = dev;
> > > > > + dma_cap_set(DMA_PRIVATE, ddev->cap_mask);
> > > > > + ddev->src_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_UNDEFINED);
> > > > > + ddev->dst_addr_widths = BIT(DMA_SLAVE_BUSWIDTH_UNDEFINED);
> > > >
> > > > undefined?
> > >
> > > Precisely, I do not know the *_addr_widths.
> >
> > This is "your" controller, you know the capability!
> 
> No, I do not.
> 
> I wrote this driver, but the hardware-internal is not fully documented
> in the datasheet.
> I can see the functionality only from the software point of view.

Ah that is sad!

> > > As far as I read dmaengine/provider.rst
> > > this represents the data bytes that are read/written at a time.
> > >
> > > Really I do not know (care about) the transfer width.
> > >
> > > As I commented in v2, the connection of the device side is hard-wired.
> > > The transfer width cannot be observed from SW view.
> > >
> > > What should I do?
> >
> > Add the widths that are supported by the controller
> 
> To my best knowledge, this DMA engine is connected to a 32-bit bus.
> So, 4 bytes are read/written at a time.
> 
> This HW allows to set the transfer size by byte granularity.
> So, it would be possible to access the data bus
> by 1-byte, 2-bytes, 3-bytes as well.
> 
> I will set the OR of 1, 2, 3, 4 bytes.

That would be better. If you can also test these and add only the
widths you have verified, that would be even better.

> > > > > +static int uniphier_mdmac_remove(struct platform_device *pdev)
> > > > > +{
> > > > > + struct uniphier_mdmac_device *mdev = platform_get_drvdata(pdev);
> > > > > +
> > > > > + of_dma_controller_free(pdev->dev.of_node);
> > > > > + dma_async_device_unregister(&mdev->ddev);
> > > > > + clk_disable_unprepare(mdev->clk);
> > > >
> > > > at this point your irq is registered and can be fired, the tasklets are
> > > > not killed :(
> > >
> > >
> > > Please let me clarify the concerns here.
> > >
> > > Before the .remove hook is called, all the consumers should
> > > have already put the dma channels.
> > > So, no new descriptor is coming in.
> > >
> > > However,
> > >
> > > Some already-issued descriptors might be remaining, and being processed.
> > >
> > > [1] This DMA engine might be still running
> > > when clk_disable_unprepare() is being called.
> > > The register access with its clock disabled
> > > would cause the system crash.
> >
> > Yes and dmaengine may fire a spurious irq..
> > >
> > > [2] vchan_cookie_complete() might be called at this point
> > > and schedule the tasklet.
> > > It might call uniphier_mdmac_desc_free() after
> > > the reference disappears.
> > >
> > > Is this correct?
> >
> > Correct :)
> >
> > > Do you have recommendation
> > > for module removal guideline?
> >
> > Yes, please free up or disable the irq explicitly, ensure pending irqs
> > have completed and then ensure all the tasklets are killed, in this
> > order for obvious reasons
> 
> Also, need to free up the left-over descriptor(s) right?
> Just killing the tasklets may result in memory leak.

Yes I am assuming you would have done so in terminate calls

> Please let me know if the implementation in v4 is wrong.

Sure will do

-- 
~Vinod
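
To illustrate the "OR of 1, 2, 3, 4 bytes" discussed above: the
*_addr_widths fields are bit masks indexed by bus width. The sketch below
is user-space only and uses local demo_* copies of the width values,
assuming (as I understand enum dma_slave_buswidth) that each enumerator
equals its byte count and UNDEFINED is 0.

```c
/* Local copies for illustration; assumed to mirror enum dma_slave_buswidth. */
enum demo_buswidth {
	DEMO_BUSWIDTH_UNDEFINED = 0,
	DEMO_BUSWIDTH_1_BYTE = 1,
	DEMO_BUSWIDTH_2_BYTES = 2,
	DEMO_BUSWIDTH_3_BYTES = 3,
	DEMO_BUSWIDTH_4_BYTES = 4,
};

#define DEMO_BIT(n)	(1U << (n))

/* Mask advertising 1, 2, 3 and 4 byte accesses. */
static unsigned int demo_addr_widths(void)
{
	return DEMO_BIT(DEMO_BUSWIDTH_1_BYTE) |
	       DEMO_BIT(DEMO_BUSWIDTH_2_BYTES) |
	       DEMO_BIT(DEMO_BUSWIDTH_3_BYTES) |
	       DEMO_BIT(DEMO_BUSWIDTH_4_BYTES);
}

/* Check whether a given width is advertised in a mask. */
static int demo_width_supported(unsigned int mask, enum demo_buswidth width)
{
	return (mask & DEMO_BIT(width)) != 0;
}
```

Under these assumptions the combined mask is 0x1e, whereas
BIT(DMA_SLAVE_BUSWIDTH_UNDEFINED) from the quoted patch sets only bit 0.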


possible deadlock in process_measurement

2018-10-15 Thread syzbot

Hello,

syzbot found the following crash on:

HEAD commit:    bab5c80b2110 Merge tag 'armsoc-fixes-4.19' of git://git.ke..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=138e5f7640
kernel config:  https://syzkaller.appspot.com/x/.config?x=88e9a8a39dc0be2d
dashboard link: https://syzkaller.appspot.com/bug?extid=5ab61747675a87ea359d
compiler:   gcc (GCC) 8.0.1 20180413 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+5ab61747675a87ea3...@syzkaller.appspotmail.com

netlink: 8 bytes leftover after parsing attributes in process `syz-executor5'.

IPv6: ADDRCONF(NETDEV_CHANGE): vcan0: link becomes ready

==
WARNING: possible circular locking dependency detected
4.19.0-rc7+ #60 Not tainted
--
syz-executor4/22440 is trying to acquire lock:
8e8e5998 (&ovl_i_mutex_key[depth]){+.+.}, at: inode_lock include/linux/fs.h:738 [inline]
8e8e5998 (&ovl_i_mutex_key[depth]){+.+.}, at: process_measurement+0xc3e/0x1bf0 security/integrity/ima/ima_main.c:205


but task is already holding lock:
fd816d85 (&sig->cred_guard_mutex){+.+.}, at: prepare_bprm_creds+0x53/0x120 fs/exec.c:1404


which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #3 (&sig->cred_guard_mutex){+.+.}:
   __mutex_lock_common kernel/locking/mutex.c:925 [inline]
   __mutex_lock+0x166/0x1700 kernel/locking/mutex.c:1072
   mutex_lock_killable_nested+0x16/0x20 kernel/locking/mutex.c:1102
   lock_trace+0x4c/0xe0 fs/proc/base.c:384
   proc_pid_syscall+0xad/0x520 fs/proc/base.c:617
   proc_single_show+0x101/0x190 fs/proc/base.c:737
IPv6: ADDRCONF(NETDEV_UP): wlan11: link is not ready
   seq_read+0x4af/0x1150 fs/seq_file.c:229
   do_loop_readv_writev fs/read_write.c:700 [inline]
   do_iter_read+0x4a3/0x650 fs/read_write.c:924
   vfs_readv+0x175/0x1c0 fs/read_write.c:986
   do_preadv+0x1cc/0x280 fs/read_write.c:1070
   __do_sys_preadv fs/read_write.c:1120 [inline]
   __se_sys_preadv fs/read_write.c:1115 [inline]
   __x64_sys_preadv+0x9a/0xf0 fs/read_write.c:1115
IPv6: ADDRCONF(NETDEV_UP): wlan12: link is not ready
   do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #2 (&p->lock){+.+.}:
   __mutex_lock_common kernel/locking/mutex.c:925 [inline]
   __mutex_lock+0x166/0x1700 kernel/locking/mutex.c:1072
IPv6: ADDRCONF(NETDEV_UP): wlan13: link is not ready
   mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1087
   seq_read+0x71/0x1150 fs/seq_file.c:161
   do_loop_readv_writev fs/read_write.c:700 [inline]
   do_iter_read+0x4a3/0x650 fs/read_write.c:924
   vfs_readv+0x175/0x1c0 fs/read_write.c:986
   kernel_readv fs/splice.c:362 [inline]
   default_file_splice_read+0x53c/0xb20 fs/splice.c:417
   do_splice_to+0x12e/0x190 fs/splice.c:881
   splice_direct_to_actor+0x270/0x8f0 fs/splice.c:953
   do_splice_direct+0x2d4/0x420 fs/splice.c:1062
   do_sendfile+0x62a/0xe20 fs/read_write.c:1440
   __do_sys_sendfile64 fs/read_write.c:1495 [inline]
   __se_sys_sendfile64 fs/read_write.c:1487 [inline]
   __x64_sys_sendfile64+0x15d/0x250 fs/read_write.c:1487
   do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #1 (sb_writers#4){.+.+}:
   percpu_down_read_preempt_disable include/linux/percpu-rwsem.h:36 [inline]
   percpu_down_read include/linux/percpu-rwsem.h:59 [inline]
   __sb_start_write+0x214/0x370 fs/super.c:1387
   sb_start_write include/linux/fs.h:1566 [inline]
   mnt_want_write+0x3f/0xc0 fs/namespace.c:360
   ovl_want_write+0x76/0xa0 fs/overlayfs/util.c:24
   ovl_do_remove+0x174/0xfd0 fs/overlayfs/dir.c:823
   ovl_unlink+0x17/0x20 fs/overlayfs/dir.c:868
   vfs_unlink+0x2db/0x510 fs/namei.c:4000
   do_unlinkat+0x6cc/0xa30 fs/namei.c:4063
   __do_sys_unlink fs/namei.c:4110 [inline]
   __se_sys_unlink fs/namei.c:4108 [inline]
   __x64_sys_unlink+0x42/0x50 fs/namei.c:4108
   do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #0 (&ovl_i_mutex_key[depth]){+.+.}:
   lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3900
   down_write+0x8a/0x130 kernel/locking/rwsem.c:70
   inode_lock include/linux/fs.h:738 [inline]
   process_measurement+0xc3e/0x1bf0 security/integrity/ima/ima_main.c:205
   ima_file_check+0xe5/0x130 security/integrity/ima/ima_main.c:391
   do_last fs/namei.c:3422 [inline]
   path_openat+0x134d/0x5160 fs/namei.c:3534
   do_filp_open+0x255/0x380 fs/namei.c:3564
   do_open_execat+0x221/0x8e0 fs/exec.c:853
   __do_execve_file.isra.33+0x173

Re: [PATCH] clk: meson-gxbb: set fclk_div3 as CLK_IS_CRITICAL

2018-10-15 Thread Jerome Brunet
On Sat, 2018-10-13 at 18:08 +0200, Michael Turquette wrote:
> Quoting Christian Hewitt (2018-10-13 12:04:46)
> > On the Khadas VIM2 (GXM) and LePotato (GXL) board there are problems
> > with reboot; e.g. a ~60 second delay between issuing reboot and the
> > board power cycling (and in some OS configurations reboot will fail
> > and require manual power cycling).
> > 
> > Similar to 'commit c987ac6f1f088663b6dad39281071aeb31d450a8 ("clk:
> > meson-gxbb: set fclk_div2 as CLK_IS_CRITICAL")' the SCPI Cortex-M4
> > Co-Processor seems to depend on FCLK_DIV3 being operational.
> > 
> > Bisect gives 'commit 05f814402d6174369b3b29832cbb5eb5ed287059 ("clk:
> > meson: add fdiv clock gates") between 4.16 and 4.16-rc1 as the first
> > bad commit. This added support for the missing clock gates before the
> > fixed PLL fixed dividers (FCLK_DIVx) and the clock framework which
> > disabled all the unused fixed dividers, thus it disabled a critical
> > clock path for the SCPI Co-Processor.
> > 
> > This change simply sets the FCLK_DIV3 gate as critical to ensure
> > nothing can disable it.
> 
> I'm a bit skeptical of this. If FCLK_DIV3 is gated at run-time, there is
> no side effect other than long and/or failed reboot?
> 
> Seems like someone should be managing this clock, and simply leaving it
> on all the time isn't necessarily the right approach. Any chance that
> you can dig into this behavior to better understand it?
> 
> It's easy to solve issues by leaving clocks on all the time, but this
> becomes a problem later on when trying to tune a device for power.

Hi Mike,

I totally agree with you and, in a perfect world, I would prefer not to use this
CLK_IS_CRITICAL at all. It looks like a cheap fix for:

"this is required, I don't know for what but it is, so please leave it on"

The problem is that this issue is a regression. We added a few gates to better
model the clock tree a while ago. Before that, those clocks were left enabled by
the bootloader.

Now that Linux turns them off, we are learning a few things about what the FWs
are doing behind our backs.

Among the 5 clocks which got new gates, only 2 need to go back to the old way
using CLK_IS_CRITICAL, so this is still an improvement.

> 
> Also, if this commit really is the right fix, it should include a
> comment for FCLK_DIV3 stating why the CLK_IS_CRITICAL flag was set,
> which may be helpful later on to know if it is safe to remove it. Same
> is true for FCLK_DIV2 in c987ac6f1f088663b6dad39281071aeb31d450a8.

+1 There should be a FIXME notice in the driver explaining why we put that flag
in the first place, so we can remove it as soon as a driver properly handles
this clock.
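
For illustration only, such a notice might look roughly like the sketch below.
It uses stand-in types so that it builds on its own: sketch_clk_init,
SKETCH_CLK_IS_CRITICAL (including its bit value) and sketch_may_gate() are
assumptions made for this example, not the actual gxbb.c code, and the wording
of the comment is only a suggestion.

```c
/* Stand-in for the clock framework flag; the bit value is illustrative. */
#define SKETCH_CLK_IS_CRITICAL (1UL << 11)

/* Hypothetical, reduced version of a clock's init data. */
struct sketch_clk_init {
	const char *name;
	unsigned long flags;
};

struct sketch_clk_init sketch_fclk_div3 = {
	.name = "fclk_div3",
	/*
	 * FIXME: the SCPI co-processor firmware seems to depend on this
	 * clock. Gating it leads to very slow or failed reboots. Drop
	 * the flag once a driver properly claims and manages the clock.
	 */
	.flags = SKETCH_CLK_IS_CRITICAL,
};

/* Returns 1 if an "unused clock" cleanup pass would be allowed to gate it. */
int sketch_may_gate(const struct sketch_clk_init *init)
{
	return !(init->flags & SKETCH_CLK_IS_CRITICAL);
}
```

The point of the comment is that whoever later adds a proper consumer knows
the flag was a workaround and can remove it.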

Christian, is it OK if I amend your patch, or do you prefer to post a v2?
Mike, with that explained, is this change OK with you?

> 
> Regards,
> Mike
> 
> > 
> > Signed-off-by: Christian Hewitt 
> > ---
> >  drivers/clk/meson/gxbb.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/drivers/clk/meson/gxbb.c b/drivers/clk/meson/gxbb.c
> > index 86d3ae5..4c8925d 100644
> > --- a/drivers/clk/meson/gxbb.c
> > +++ b/drivers/clk/meson/gxbb.c
> > @@ -509,6 +509,7 @@ static struct clk_fixed_factor gxbb_fclk_div3_div = {
> > .ops = &clk_fixed_factor_ops,
> > .parent_names = (const char *[]){ "fixed_pll" },
> > .num_parents = 1,
> > +   .flags = CLK_IS_CRITICAL,
> > },
> >  };
> >  
> > -- 
> > 2.7.4




Re: [PATCH v5 09/12] x86/kvm/nVMX: allow bare VMXON state migration

2018-10-15 Thread Paolo Bonzini
On 13/09/2018 19:05, Vitaly Kuznetsov wrote:
> It is perfectly valid for a guest to do VMXON and not do VMPTRLD. This
> state needs to be preserved on migration.
> 
> Signed-off-by: Vitaly Kuznetsov 

Please cover this in state-test.c too.

Paolo

> ---
>  arch/x86/kvm/vmx.c | 15 ++++++++-------
>  1 file changed, 8 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index d3297fadf7ed..25a25fff8dd9 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -14482,13 +14482,6 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
>   if (!page_address_valid(vcpu, kvm_state->vmx.vmxon_pa))
>   return -EINVAL;
>  
> - if (kvm_state->size < sizeof(kvm_state) + sizeof(*vmcs12))
> - return -EINVAL;
> -
> - if (kvm_state->vmx.vmcs_pa == kvm_state->vmx.vmxon_pa ||
> - !page_address_valid(vcpu, kvm_state->vmx.vmcs_pa))
> - return -EINVAL;
> -
>   if ((kvm_state->vmx.smm.flags & KVM_STATE_NESTED_SMM_GUEST_MODE) &&
>   (kvm_state->flags & KVM_STATE_NESTED_GUEST_MODE))
>   return -EINVAL;
> @@ -14510,6 +14503,14 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
>   if (ret)
>   return ret;
>  
> + /* Empty 'VMXON' state is permitted */
> + if (kvm_state->size < sizeof(kvm_state) + sizeof(*vmcs12))
> + return 0;
> +
> + if (kvm_state->vmx.vmcs_pa == kvm_state->vmx.vmxon_pa ||
> + !page_address_valid(vcpu, kvm_state->vmx.vmcs_pa))
> + return -EINVAL;
> +
>   set_current_vmptr(vmx, kvm_state->vmx.vmcs_pa);
>  
>   if (kvm_state->vmx.smm.flags & KVM_STATE_NESTED_SMM_VMXON) {
> 
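
As a rough aid for writing such a test, the ordering the patch introduces can
be modelled in user space as below. This is a sketch under stated assumptions:
struct sketch_state, the two size constants and sketch_page_address_valid()
(which only checks 4 KiB alignment) are hypothetical stand-ins, not the KVM
definitions.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical sizes; the real header and vmcs12 sizes come from KVM. */
#define SKETCH_HDR_SIZE    128u
#define SKETCH_VMCS12_SIZE 4096u

/* Reduced stand-in for the nested state passed in by userspace. */
struct sketch_state {
	uint32_t size;
	uint64_t vmxon_pa;
	uint64_t vmcs_pa;
};

/* Simplified validity check: page-aligned addresses only. */
int sketch_page_address_valid(uint64_t pa)
{
	return (pa & 0xfff) == 0;
}

/* Sets *vmptr_loaded to 1 only when a current VMCS pointer was restored. */
int sketch_set_nested_state(const struct sketch_state *s, int *vmptr_loaded)
{
	*vmptr_loaded = 0;

	if (!sketch_page_address_valid(s->vmxon_pa))
		return -EINVAL;

	/* Empty 'VMXON' state is permitted: VMXON was done, VMPTRLD was not. */
	if (s->size < SKETCH_HDR_SIZE + SKETCH_VMCS12_SIZE)
		return 0;

	/* Only now is the VMCS address required to be sane. */
	if (s->vmcs_pa == s->vmxon_pa || !sketch_page_address_valid(s->vmcs_pa))
		return -EINVAL;

	*vmptr_loaded = 1;
	return 0;
}
```

A test would want to cover at least the three cases this distinguishes: bare
VMXON state, VMXON plus a valid VMCS, and a VMCS address equal to the VMXON
address.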



Re: [PATCH] dmaengine: owl: Fix warnings generated during build

2018-10-15 Thread Vinod
On 08-10-18, 22:46, Manivannan Sadhasivam wrote:
> Following warnings are generated when compiled with W=1,
> 
> drivers/dma/owl-dma.c:170: warning: Function parameter or member 'cyclic'
> not described in 'owl_dma_txd'
> drivers/dma/owl-dma.c:198: warning: Function parameter or member 'cfg' not
> described in 'owl_dma_vchan'
> drivers/dma/owl-dma.c:198: warning: Function parameter or member 'drq' not
> described in 'owl_dma_vchan'
> drivers/dma/owl-dma.c:225: warning: Function parameter or member 'irq' not
> described in 'owl_dma'
> 
> Fix this by adding comments for relevant struct members to appear in
> kernel-doc.
> 
> Fixes: d64e1b3f5cce ("dmaengine: owl: Add Slave and Cyclic mode support for
> Actions Semi Owl S900 SoC")
> 

This empty line is not required

> Reported-by: Vinod Koul 
> Signed-off-by: Manivannan Sadhasivam 


Applied after removing the bogus empty line, thanks
-- 
~Vinod


[RFC] Allow user namespace inside chroot

2018-10-15 Thread nagarathnam . muthusamy
From: Nagarathnam Muthusamy 

The following commit disables the creation of a user namespace inside
a chroot environment:

userns: Don't allow creation if the user is chrooted

commit 3151527ee007b73a0ebd296010f1c0454a919c7d

Consider a system in which a non-root user creates a combination
of user, pid and mount namespaces and confines a process to it.
The system will have multiple levels of nested namespaces.
The root namespace in the system will have lots of directories
which should not be exposed to the child confined to the set of
namespaces.

Without chroot, we will have to hide all unwanted directories
individually using bind mounts and mount namespace. Chroot enables
us to expose a handpicked list of directories which the child
can see, but if we use chroot we won't be able to create nested
namespaces.

Allowing a process to create user namespace within a chroot
environment will enable it to chroot, which in turn can be used
to escape the jail.

This patch drops the chroot privilege when a user namespace is
created within the chroot environment so the process cannot
use it to escape the chroot jail. The process can still modify
the view of the file system using a mount namespace, but for those
modifications to be useful it needs to run a setuid program with
the intended uid mapped as-is into the user namespace, which is
not possible for an unprivileged process.

If there were any other corner cases which were considered while
deciding to disable the creation of user namespace as a whole
within the chroot environment please let me know.

Signed-off-by: Nagarathnam Muthusamy
---
 kernel/user_namespace.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index e5222b5..83d2a70 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -44,7 +44,7 @@ static void dec_user_namespaces(struct ucounts *ucounts)
return dec_ucount(ucounts, UCOUNT_USER_NAMESPACES);
 }
 
-static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns)
+static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns, int is_chrooted)
 {
/* Start with the same capabilities as init but useless for doing
 * anything as the capabilities are bound to the new user namespace.
@@ -55,6 +55,11 @@ static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns)
cred->cap_effective = CAP_FULL_SET;
cred->cap_ambient = CAP_EMPTY_SET;
cred->cap_bset = CAP_FULL_SET;
+   if (is_chrooted) {
+   cap_lower(cred->cap_permitted, CAP_SYS_CHROOT);
+   cap_lower(cred->cap_effective, CAP_SYS_CHROOT);
+   cap_lower(cred->cap_bset, CAP_SYS_CHROOT);
+   }
 #ifdef CONFIG_KEYS
key_put(cred->request_key_auth);
cred->request_key_auth = NULL;
@@ -78,6 +83,7 @@ int create_user_ns(struct cred *new)
kgid_t group = new->egid;
struct ucounts *ucounts;
int ret, i;
+   int is_chrooted = 0;
 
ret = -ENOSPC;
if (parent_ns->level > 32)
@@ -88,14 +94,12 @@ int create_user_ns(struct cred *new)
goto fail;
 
/*
-* Verify that we can not violate the policy of which files
-* may be accessed that is specified by the root directory,
-* by verifing that the root directory is at the root of the
-* mount namespace which allows all files to be accessed.
+* Drop the chroot privilege when a user namespace is created inside
+* chrooted environment so that the file system view presented to a
+* non-admin process is preserved.
 */
-   ret = -EPERM;
if (current_chrooted())
-   goto fail_dec;
+   is_chrooted = 1;
 
/* The creator needs a mapping in the parent user namespace
 * or else we won't be able to reasonably tell userspace who
@@ -140,7 +144,7 @@ int create_user_ns(struct cred *new)
if (!setup_userns_sysctls(ns))
goto fail_keyring;
 
-   set_cred_user_ns(new, ns);
+   set_cred_user_ns(new, ns, is_chrooted);
return 0;
 fail_keyring:
 #ifdef CONFIG_PERSISTENT_KEYRINGS
@@ -1281,7 +1285,7 @@ static int userns_install(struct nsproxy *nsproxy, struct ns_common *ns)
return -ENOMEM;
 
put_user_ns(cred->user_ns);
-   set_cred_user_ns(cred, get_user_ns(user_ns));
+   set_cred_user_ns(cred, get_user_ns(user_ns), 0);
 
return commit_creds(cred);
 }
-- 
1.8.3.1



Re: [PATCH] kvm/x86 : fix some typo

2018-10-15 Thread Paolo Bonzini
On 04/10/2018 17:45, Peng Hao wrote:
> From: Peng Hao 
> 
> Signed-off-by: Peng Hao 
> ---
>  arch/x86/kvm/mmu.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index d7e9bce..281e20e 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -4546,7 +4546,7 @@ static void update_permission_bitmask(struct kvm_vcpu *vcpu,
>* SMAP:kernel-mode data accesses from user-mode
>* mappings should fault. A fault is considered
>* as a SMAP violation if all of the following
> -  * conditions are ture:
> +  * conditions are true:
>*   - X86_CR4_SMAP is set in CR4
>*   - A user page is accessed
>*   - The access is not a fetch
> @@ -5891,7 +5891,7 @@ int kvm_mmu_module_init(void)
>  }
>  
>  /*
> - * Caculate mmu pages needed for kvm.
> + * Calculate mmu pages needed for kvm.
>   */
>  unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm)
>  {
> 

Queued, thanks.

Paolo


Re: [PATCH] kvm/x86 : avoid shifting signed 32-bit value by 31 bits

2018-10-15 Thread Paolo Bonzini
On 08/10/2018 04:25, Wei Yang wrote:
> On Mon, Oct 08, 2018 at 09:04:34AM +0800, peng.h...@zte.com.cn wrote:
>>> On Sat, Oct 06, 2018 at 11:31:04AM +0800, peng.h...@zte.com.cn wrote:
> On Thu, Oct 04, 2018 at 01:47:18PM -0400, Peng Hao wrote:
>>
>> From: Peng Hao 
>>
>>  modify AVIC_LOGICAL_ID_ENTRY_VALID_MASK to unsigned
>>
>> Signed-off-by: Peng Hao 
>> ---
>> arch/x86/kvm/svm.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>> index d96092b..bf1ded4 100644
>> --- a/arch/x86/kvm/svm.c
>> +++ b/arch/x86/kvm/svm.c
>> @@ -262,7 +262,7 @@ struct amd_svm_iommu_ir {
>> };
>>
>> #define AVIC_LOGICAL_ID_ENTRY_GUEST_PHYSICAL_ID_MASK (0xFF)
>> -#define AVIC_LOGICAL_ID_ENTRY_VALID_MASK (1 << 31)
>> +#define AVIC_LOGICAL_ID_ENTRY_VALID_MASK (1UL << 31)

> It is reasonable to change to unsigned, while not necessary to unsigned
> long?
 AVIC_LOGICAL_ID_ENTRY_VALID_MASK is used in function avic_ldr_write.
 here I think it doesn't matter if you use unsigned or unsigned long.
 Do you have any suggestions?
>>
>>> In current case, AVIC_LOGICAL_ID_ENTRY_VALID_MASK is used to calculate
>>> the value of new_entry with type of u32. So the definition here is not
>>> harmful.
>>
>>> Also, I did a quick grep and found similar definition (1 << 31) is popular
>>> in the whole kernel tree.
>>
>>> The reason to make this change is not that strong to me. Would you
>>> minding sharing more reason behind this change?
>> oh, I'm just thinking logically, not more reason.
> 
> This definition may introduce problem when this value is used to
> calculate a 64bit data.
> 
> Since current entry is 32bit, we may leave it as it is for now.

I agree.

Paolo



Re: INFO: task hung in fanotify_handle_event

2018-10-15 Thread Dmitry Vyukov
On Mon, Oct 15, 2018 at 2:45 PM, Jan Kara  wrote:
> Hi Dmitry!
>
> On Mon 15-10-18 14:29:14, Dmitry Vyukov wrote:
>> On Mon, Oct 15, 2018 at 2:15 PM, Jan Kara  wrote:
>> > Hello,
>> >
>> > On Mon 15-10-18 04:32:02, syzbot wrote:
>> >> syzbot found the following crash on:
>> >>
>> >> HEAD commit:    90ad18418c2d Merge git://git.kernel.org/pub/scm/linux/kern..
>> >> git tree:   upstream
>> >> console output: https://syzkaller.appspot.com/x/log.txt?x=17f1776e40
>> >> kernel config:  https://syzkaller.appspot.com/x/.config?x=88e9a8a39dc0be2d
>> >> dashboard link: https://syzkaller.appspot.com/bug?extid=29143581b0ded3213e99
>> >> compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
>> >> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=123459d640
>> >>
>> >> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> >> Reported-by: syzbot+29143581b0ded3213...@syzkaller.appspotmail.com
>> >
>> > Syzbot has apparently generated fanotify watch for FAN_OPEN_PERM event and
>> > then the process got stuck waiting for userspace to respond to that event -
>> > which never happened. So everything works as designed here - the process
>> > placing FAN_OPEN_PERM mark is responsible for replying to the generated
>> > events as all opens hang waiting for responses. That's why the
>> > functionality is behind CAP_SYS_ADMIN after all... Could we fix syzbot to
>> > actually generate replies for these events?
>>
>> Is there a reliable way to kill such processes?
>> Or admins are never supposed to kill any root processes and have not
>> bugs whatsoever? :)
>
> Currently the wait is not killable but yes, we want to make it killable
> exactly because of userspace bugs :). But it is non-trivial because
> currently the waker has also other responsibilities and all that stuff has
> to be cleaned up when handling killed wait. Konstantin Khlebnikov was
> working on that so I might need to prod him.
>
>> syzkaller probably capable of generating replies in some cases, but
>> unfortunately it can't work this way. It's practically not possible to
>> ensure that it will always generate a proper reply and it will be
>> actually delivered and the process won't be killed in the middle, or
>> another thread won't crash or call exit_group concurrently, etc. The
>> thing either needs to be reliable, work without any but's and be
>> reliably killable, or it's not suitable for stress testing.
>> If there is no reliable way to kill it, I think we need to disable
>> FAN_OPEN_PERM entirely.
>
> Understood. Then just disable FAN_OPEN_PERM & FAN_ACCESS_PERM for now.
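
For reference, the reply protocol described above (every permission event must
be answered or the opener stays blocked) can be sketched as follows. This is a
minimal illustration, not syzkaller code: the helper names needs_reply(),
allow_reply() and answer_pending() are made up here, and fan_fd is assumed to
be a descriptor from fanotify_init() with a FAN_OPEN_PERM or FAN_ACCESS_PERM
mark already placed.

```c
#include <sys/fanotify.h>
#include <sys/types.h>
#include <unistd.h>

/* Permission events block the triggering task until userspace answers. */
int needs_reply(const struct fanotify_event_metadata *ev)
{
	return (ev->mask & (FAN_OPEN_PERM | FAN_ACCESS_PERM)) != 0;
}

/* Build the "allow" answer for one permission event. */
struct fanotify_response allow_reply(const struct fanotify_event_metadata *ev)
{
	struct fanotify_response resp = { .fd = ev->fd, .response = FAN_ALLOW };

	return resp;
}

/*
 * Read one batch of events from fan_fd and answer every permission event.
 * Returns the number of events answered, or -1 on a read/write error.
 */
int answer_pending(int fan_fd)
{
	struct fanotify_event_metadata buf[64];
	struct fanotify_event_metadata *ev;
	ssize_t len = read(fan_fd, buf, sizeof(buf));
	int answered = 0;

	if (len < 0)
		return -1;

	for (ev = buf; FAN_EVENT_OK(ev, len); ev = FAN_EVENT_NEXT(ev, len)) {
		int failed = 0;

		if (needs_reply(ev)) {
			struct fanotify_response resp = allow_reply(ev);

			if (write(fan_fd, &resp, sizeof(resp)) != (ssize_t)sizeof(resp))
				failed = 1;
			else
				answered++;
		}
		/* Each event carries an open fd that the listener must close. */
		if (ev->fd >= 0)
			close(ev->fd);
		if (failed)
			return -1;
	}
	return answered;
}
```

Note that a loop like this does not remove the objection raised above: if the
replying process is killed or crashes between read() and write(), the blocked
openers still hang.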


Disabled FAN_OPEN_PERM & FAN_ACCESS_PERM for now:
https://github.com/google/syzkaller/commit/6ce17935cb99fa11aaa2f2d1889261da6b298013


#syz invalid


Re: [PATCH] KVM: x86: fix failure of injecting exceptions in x86_emulate_instruction

2018-10-15 Thread Paolo Bonzini
On 09/10/2018 04:51, peng.h...@zte.com.cn wrote:
> ping
> patch: https://patchwork.kernel.org/patch/10604977/
> test case: https://patchwork.kernel.org/patch/10631781/

I need to understand the double fault I'm seeing before applying this patch.

Paolo


Re: [PATCH v3 4/7] dmaengine: stm32-dma: Add DMA/MDMA chaining support

2018-10-15 Thread Vinod
On 10-10-18, 09:02, Pierre Yves MORDRET wrote:
> 
> 
> On 10/10/2018 06:03 AM, Vinod wrote:
> > On 09-10-18, 10:40, Pierre Yves MORDRET wrote:
> >>
> >>
> >> On 10/07/2018 06:00 PM, Vinod wrote:
> >>> On 28-09-18, 15:01, Pierre-Yves MORDRET wrote:
>  This patch adds support of DMA/MDMA chaining support.
>  It introduces an intermediate transfer between peripherals and STM32 DMA.
>  This intermediate transfer is triggered by SW for single M2D transfer and
>  by STM32 DMA IP for all other modes (sg, cyclic) and direction (D2M).
> 
>  A generic SRAM allocator is used for this intermediate buffer
>  Each DMA channel will be able to define its SRAM needs to achieve chaining
>  feature : (2 ^ order) * PAGE_SIZE.
>  For cyclic, SRAM buffer is derived from period length (rounded on
>  PAGE_SIZE).
> >>>
> >>> So IIUC, you chain two dma txns together and transfer data via an SRAM?
> >>
> >> Correct. one DMA is DMAv2 (stm32-dma) and the other is MDMA(stm32-mdma).
> >> Intermediate transfer is between device and memory.
> >> This intermediate transfer is using SDRAM.
> > 
> > Ah so you use dma calls to setup mdma xtfers? I dont think that is a
> > good idea. How do you know you should use mdma for subsequent transfer?
> > 
> 
> When the user bindings request chaining, intermediate MDMA transfers are
> always triggered.
> For instance, suppose a user requests a Dev2Mem transfer with chaining. From
> the client's pov this is still a prep_slave_sg. Internally DMAv2 is set up in
> cyclic mode (in double buffer mode indeed => 2 buffers of PAGE_SIZE/2) and the
> destination is SDRAM.
> DMAv2 will flip/flop on those 2 buffers.
> At the same time the DMAv2 driver prepares an MDMA SG that will fetch data
> from those 2 buffers in SDRAM and fill the final destination memory.

What I am not able to follow is why it needs to be internal. Why should
the client not set up the two transfers and trigger them?

-- 
~Vinod


Re: [PATCH] kvm/x86 : avoid shifting signed 32-bit value by 31 bits

2018-10-15 Thread H. Peter Anvin
On 10/7/18 6:04 PM, peng.h...@zte.com.cn wrote:
>
> #define AVIC_LOGICAL_ID_ENTRY_GUEST_PHYSICAL_ID_MASK (0xFF)
> -#define AVIC_LOGICAL_ID_ENTRY_VALID_MASK (1 << 31)
> +#define AVIC_LOGICAL_ID_ENTRY_VALID_MASK (1UL << 31)
>>>
 It is reasonable to change to unsigned, while not necessary to unsigned
 long?
>>> AVIC_LOGICAL_ID_ENTRY_VALID_MASK is used in function avic_ldr_write.
>>> here I think it doesn't matter if you use unsigned or unsigned long.
>>> Do you have any suggestions?
> 
>> In current case, AVIC_LOGICAL_ID_ENTRY_VALID_MASK is used to calculate
>> the value of new_entry with type of u32. So the definition here is not
>> harmful.
> 
>> Also, I did a quick grep and found similar definition (1 << 31) is popular
>> in the whole kernel tree.
> 
>> The reason to make this change is not that strong to me. Would you
>> minding sharing more reason behind this change?
> oh, I'm just thinking logically, not more reason.

The right way to do this would be to use the _BITUL() (or _BITULL()) macro.
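
To make the concern concrete, here is a small user-space sketch. The
SKETCH_BITUL()/SKETCH_BITULL() macros are local stand-ins modelled on the
kernel helpers named above, not the kernel definitions themselves, and
INT32_MIN is used to represent the bit pattern a signed (1 << 31) typically
produces, since that shift itself is not well defined in C.

```c
#include <stdint.h>

/* Local stand-ins modelled on _BITUL()/_BITULL(); names are hypothetical. */
#define SKETCH_BITUL(n)  (1UL << (n))
#define SKETCH_BITULL(n) (1ULL << (n))

/* A signed 32-bit mask with only bit 31 set, widened to 64 bits. */
uint64_t widen_signed_mask(void)
{
	int32_t mask = INT32_MIN;	/* bit pattern 0x80000000 */

	return (uint64_t)mask;		/* sign-extends into the upper half */
}

/* The same mask built from an unsigned constant. */
uint64_t widen_unsigned_mask(void)
{
	return SKETCH_BITULL(31);	/* upper half stays zero */
}
```

With a u32 entry, as in the current code, both forms end up with the same
32-bit value; the difference only shows once the mask takes part in a 64-bit
computation.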

-hpa



Re: [PATCH v3] sched/rt : return accurate release rq lock info

2018-10-15 Thread Steven Rostedt
On Tue, 16 Oct 2018 00:09:43 +0800 (CST)
 wrote:

> >We only do the check if the immediate double_lock_balance() released
> >the current task rq lock, but we don't take into account if it was
> >released earlier, which means it could have migrated and we never
> >noticed!
> >  
> double_lock_balance may release the current rq's lock, but only to take the
> locks of the two rq's in order, and it immediately reacquires the current
> rq's lock before double_lock_balance returns.
> >I believe the code should look like this:
> >

Bah, I didn't even compile it. And thought it was
"double_lock_balance", and didn't notice it was double_unlock_balance()
(this is what I get for trying to do too much at once).

Sad part is, I noticed this back when I added reviewed-by, but then
looking at it again, I made the same mistake :-/

Yeah, never mind, it's fine, my original reviewed-by stands.

-- Steve


Re: [PATCH] proc: fix proc-self-map-files selftest for arm

2018-10-15 Thread Cyrill Gorcunov
On Mon, Oct 15, 2018 at 01:55:14PM -0300, Rafael David Tinoco wrote:
> That is what I also had in mind, thus the patch. I just realized we had
> another issue on LKFT (our functional tests tool) for
> proc-self-map-files-001.c. Test 001 does pretty much the same as 002, but
> without the MAP_FIXED mmap flag.
> 
> Is it okay to consolidate both tests into just 1, and focus in checking
> procfs numbers conversion only, rather than if mapping 0 is allowed or not ?
> Can I send a v2 with that in mind ?

As for me, yes: I would move the zero page testing to a separate memory testcase.
But since Alexey is the original author of the tests, it is better to wait for
his opinion.


Re: [PATCH V5 0/3] introduce coalesced pio support

2018-10-15 Thread Paolo Bonzini
On 14/10/2018 01:09, Peng Hao wrote:
> Coalesced pio is based on coalesced mmio and can be used for some port
> like rtc port, pci-host config port and so on.
> 
> Specially in case of rtc as coalesced pio, some versions of windows guest
> access rtc frequently because of rtc as system tick. guest access rtc like
> this: write register index to 0x70, then write or read data from 0x71.
> writing 0x70 port is just as index and do nothing else. So we can use
> coalesced pio to handle this scene to reduce VM-EXIT time.
> 
> When starting and closing a virtual machine, it will access pci-host config
> port frequently. So setting these port as coalesced pio can reduce startup 
> and shutdown time. 
> 
> without my patch, get the vm-exit time of accessing rtc 0x70 and piix 0xcf8
> using perf tools: (guest OS : windows 7 64bit)
> IO Port Access  Samples  Samples%  Time%   Min Time  Max Time  Avg time
> 0x70:POUT            86    30.99%  74.59%       9us      29us  10.75us (+- 3.41%)
> 0xcf8:POUT         1119     2.60%   2.12%    2.79us   56.83us   3.41us (+- 2.23%)
>
> with my patch
> IO Port Access  Samples  Samples%  Time%   Min Time  Max Time  Avg time
> 0x70:POUT           106    32.02%  29.47%       0us      10us   1.57us (+- 7.38%)
> 0xcf8:POUT         1065     1.67%   0.28%    0.41us   65.44us   0.66us (+- 10.55%)
> 
> 
> Peng Hao (3):
>   kvm/x86 : add coalesced pio support
>   kvm/x86 : add document for coalesced mmio
>   kvm/x86 : add document for coalesced pio
> 
>  Documentation/virtual/kvm/api.txt | 28 
> +++
>  include/uapi/linux/kvm.h  | 11 +--
>  virt/kvm/coalesced_mmio.c | 12 +---
>  virt/kvm/kvm_main.c   |  2 ++
>  4 files changed, 48 insertions(+), 5 deletions(-)
> 

Queued, thanks (squashing 1 and 3 together).
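
For readers new to the idea, the effect described in the cover letter can be
shown with a toy model. This is not the KVM implementation or its userspace
API; struct toy_vm, the ring size and all toy_* functions are invented for
the sketch. Writes to the RTC index port (0x70) are only recorded in a ring,
and the ring is drained on the next access that really has to exit (here,
the data port 0x71).

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_RING_SLOTS 8

struct toy_pio_write {
	uint16_t port;
	uint8_t val;
};

struct toy_vm {
	struct toy_pio_write ring[TOY_RING_SLOTS];
	size_t pending;		/* buffered writes not yet seen by the device model */
	unsigned int exits;	/* simulated exits to userspace */
	uint8_t rtc_index;	/* device model state */
	uint8_t cmos[128];
};

/* Replay buffered index writes into the device model. */
void toy_flush(struct toy_vm *vm)
{
	size_t i;

	for (i = 0; i < vm->pending; i++)
		if (vm->ring[i].port == 0x70)
			vm->rtc_index = vm->ring[i].val & 0x7f;
	vm->pending = 0;
}

/* Every real exit first drains the ring, preserving write ordering. */
void toy_exit(struct toy_vm *vm)
{
	vm->exits++;
	toy_flush(vm);
}

void toy_outb(struct toy_vm *vm, uint16_t port, uint8_t val)
{
	/* The index write has no side effect of its own, so just queue it. */
	if (port == 0x70 && vm->pending < TOY_RING_SLOTS) {
		vm->ring[vm->pending].port = port;
		vm->ring[vm->pending].val = val;
		vm->pending++;
		return;
	}

	toy_exit(vm);
	if (port == 0x70)
		vm->rtc_index = val & 0x7f;
	else if (port == 0x71)
		vm->cmos[vm->rtc_index] = val;
}

uint8_t toy_inb(struct toy_vm *vm, uint16_t port)
{
	toy_exit(vm);
	return port == 0x71 ? vm->cmos[vm->rtc_index] : 0xff;
}
```

In this model an index write followed by a data read costs one exit instead
of two, which is the kind of saving the numbers above are measuring.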

Paolo


[PATCH] compiler.h: update definition of unreachable()

2018-10-15 Thread ndesaulniers
Fixes the objtool warning seen with Clang:
arch/x86/mm/fault.o: warning: objtool: no_context()+0x220: unreachable
instruction

Fixes commit 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h
mutually exclusive")

Josh noted that the fallback definition was meant to work around a
pre-gcc-4.6 bug. GCC still needs to work around
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82365, so compiler-gcc.h
defines its own version of unreachable().  Clang and ICC can use this
shared definition.

Link: https://github.com/ClangBuiltLinux/linux/issues/204
Suggested-by: Andy Lutomirski 
Suggested-by: Josh Poimboeuf 
Tested-by: Nathan Chancellor 
Signed-off-by: Nick Desaulniers 
---
Miguel, would you mind taking this up in your new compiler attributes
tree?

 include/linux/compiler.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 681d866efb1e..8875fd3243fd 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -124,7 +124,10 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 # define ASM_UNREACHABLE
 #endif
 #ifndef unreachable
-# define unreachable() do { annotate_reachable(); do { } while (1); } while (0)
+# define unreachable() do {\
+   annotate_unreachable(); \
+   __builtin_unreachable();\
+} while (0)
 #endif
 
 /*
-- 
2.19.0.605.g01d371f741-goog
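
For anyone wanting to try the shape of the new definition outside the kernel
tree, a user-space approximation is sketched below. sketch_unreachable() and
sketch_annotate_unreachable() are stand-ins: the real annotate_unreachable()
emits an objtool annotation, which is replaced by a no-op here, and the enum
and function are invented for the example.

```c
/* Stand-in for the kernel's objtool annotation; a no-op in user space. */
#define sketch_annotate_unreachable() do { } while (0)

/* Same structure as the definition in the patch above. */
#define sketch_unreachable() do {		\
	sketch_annotate_unreachable();		\
	__builtin_unreachable();		\
} while (0)

enum level { LEVEL_LOW, LEVEL_HIGH };

int level_to_int(enum level l)
{
	switch (l) {
	case LEVEL_LOW:
		return 0;
	case LEVEL_HIGH:
		return 1;
	}
	/* Tells the compiler that control never reaches the end of the function. */
	sketch_unreachable();
}
```

Unlike the old fallback, which ended in an infinite loop, this gives the
compiler no code to emit after the switch.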



Re: [RFC] Allow user namespace inside chroot

2018-10-15 Thread Jann Horn
On Mon, Oct 15, 2018 at 7:10 PM  wrote:
> Following commit disables the creation of user namespace inside
> the chroot environment.
>
> userns: Don't allow creation if the user is chrooted
>
> commit 3151527ee007b73a0ebd296010f1c0454a919c7d
>
> Consider a system in which a non-root user creates a combination
> of user, pid and mount namespaces and confines a process to it.
> The system will have multiple levels of nested namespaces.
> The root namespace in the system will have lots of directories
> which should not be exposed to the child confined to the set of
> namespaces.
>
> Without chroot, we will have to hide all unwanted directories
> individually using bind mounts and mount namespace.

IMO what you really should be doing is to create a tmpfs, bind-mount
the directories you want into it, and then pivot_root() into that, not
the other way around.

> Chroot enables
> us to expose a handpicked list of directories which the child
> can see but if we use chroot we wont be able to create nested
> namespaces.

Uh, are you aware that pivot_root() exists? That's what you should be using.

The kernel makes pretty much no security guarantees about chroot(). If
you're using chroot() for security, you're almost certainly doing it
wrong. If you want security, use pivot_root().
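
As a rough sketch of that sequence (tmpfs, bind mounts, then pivot_root()),
assuming the caller already has CAP_SYS_ADMIN in its user namespace, that
root is an existing empty directory, and that each entry of dirs is a
single-level absolute path such as "/usr". The function names jail_path() and
enter_jail() are made up for this example, and raw syscalls are used for
unshare and pivot_root only to keep the sketch free of feature-test macros.

```c
#include <linux/sched.h>	/* CLONE_NEWNS */
#include <stdio.h>
#include <sys/mount.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Build "<root><dir>" into out; returns 0 on success, -1 if it does not fit. */
int jail_path(char *out, size_t outlen, const char *root, const char *dir)
{
	int n = snprintf(out, outlen, "%s%s", root, dir);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Returns 0 once the process sees only the tmpfs with the chosen directories. */
int enter_jail(const char *root, const char *const *dirs, size_t ndirs)
{
	char target[4096];
	size_t i;

	/* Private mount namespace, so nothing propagates back to the parent. */
	if (syscall(SYS_unshare, CLONE_NEWNS) < 0)
		return -1;
	if (mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL) < 0)
		return -1;

	/* The new root starts out as an empty tmpfs. */
	if (mount("tmpfs", root, "tmpfs", 0, NULL) < 0)
		return -1;

	/* Expose only the handpicked directories. */
	for (i = 0; i < ndirs; i++) {
		if (jail_path(target, sizeof(target), root, dirs[i]) < 0)
			return -1;
		if (mkdir(target, 0755) < 0)
			return -1;
		if (mount(dirs[i], target, NULL, MS_BIND | MS_REC, NULL) < 0)
			return -1;
	}

	/* Swap roots, then detach the old root stacked at the same place. */
	if (chdir(root) < 0)
		return -1;
	if (syscall(SYS_pivot_root, ".", ".") < 0)
		return -1;
	if (umount2(".", MNT_DETACH) < 0)
		return -1;
	return chdir("/");
}
```

After this, the old root is no longer reachable from the mount namespace, so
there is nothing for a later chroot() call to escape to.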

> Allowing a process to create user namespace within a chroot
> environment will enable it to chroot, which in turn can be used
> to escape the jail.
>
> This patch drops the chroot privilege when user namespace is
> created within the chroot environment so the process cannot
> use it to escape the chroot jail.

"cannot" is a strong expression. More like "might not be able to".

> The process can still modify
> the view of the file system using mount namespace but for those
> modifications to be useful, it needs to run a setuid program with
> that intented uid directly mapped into the user namespace as it is
> which is not possible for an unprivileged process.
>
> If there were any other corner cases which were considered while
> deciding to disable the creation of user namespace as a whole
> within the chroot environment please let me know.
>
> Signed-off-by: Nagarathnam Muthusamy
> ---
>  kernel/user_namespace.c | 22 +-
>  1 file changed, 13 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
> index e5222b5..83d2a70 100644
> --- a/kernel/user_namespace.c
> +++ b/kernel/user_namespace.c
> @@ -44,7 +44,7 @@ static void dec_user_namespaces(struct ucounts *ucounts)
> return dec_ucount(ucounts, UCOUNT_USER_NAMESPACES);
>  }
>
> -static void set_cred_user_ns(struct cred *cred, struct user_namespace 
> *user_ns)
> +static void set_cred_user_ns(struct cred *cred, struct user_namespace 
> *user_ns, int is_chrooted)
>  {
> /* Start with the same capabilities as init but useless for doing
>  * anything as the capabilities are bound to the new user namespace.
> @@ -55,6 +55,11 @@ static void set_cred_user_ns(struct cred *cred, struct 
> user_namespace *user_ns)
> cred->cap_effective = CAP_FULL_SET;
> cred->cap_ambient = CAP_EMPTY_SET;
> cred->cap_bset = CAP_FULL_SET;
> +   if (is_chrooted) {
> +   cap_lower(cred->cap_permitted, CAP_SYS_CHROOT);
> +   cap_lower(cred->cap_effective, CAP_SYS_CHROOT);
> +   cap_lower(cred->cap_bset, CAP_SYS_CHROOT);
> +   }

This isn't going to work. For example, if the attacker can use
pivot_root() (which checks for CAP_SYS_ADMIN), you're still screwed.
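
To make that objection concrete, here is a minimal userspace sketch. It is only a model: a single 64-bit mask stands in for the kernel's capability sets, and the model_* names are hypothetical. The two capability numbers are the values from linux/capability.h. Lowering CAP_SYS_CHROOT leaves CAP_SYS_ADMIN raised, so anything gated on CAP_SYS_ADMIN is still reachable.

```c
#include <stdint.h>

/* Values as defined in linux/capability.h. */
enum {
	MODEL_CAP_SYS_CHROOT = 18,
	MODEL_CAP_SYS_ADMIN  = 21,
};

/* Userspace model only: one 64-bit mask stands in for a capability set. */
typedef uint64_t cap_mask_t;

/* Model of CAP_FULL_SET: every bit raised. */
static inline cap_mask_t model_cap_full_set(void)
{
	return ~UINT64_C(0);
}

/* Model of cap_lower(): clear one capability bit. */
static inline cap_mask_t model_cap_lower(cap_mask_t caps, int cap)
{
	return caps & ~(UINT64_C(1) << cap);
}

/* Model of cap_raised(): test one capability bit. */
static inline int model_cap_raised(cap_mask_t caps, int cap)
{
	return (int)((caps >> cap) & 1);
}
```

Starting from the full set and lowering only MODEL_CAP_SYS_CHROOT, model_cap_raised() still reports MODEL_CAP_SYS_ADMIN as set.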

>  #ifdef CONFIG_KEYS
> key_put(cred->request_key_auth);
> cred->request_key_auth = NULL;
> @@ -78,6 +83,7 @@ int create_user_ns(struct cred *new)
> kgid_t group = new->egid;
> struct ucounts *ucounts;
> int ret, i;
> +   int is_chrooted = 0;
>
> ret = -ENOSPC;
> if (parent_ns->level > 32)
> @@ -88,14 +94,12 @@ int create_user_ns(struct cred *new)
> goto fail;
>
> /*
> -* Verify that we can not violate the policy of which files
> -* may be accessed that is specified by the root directory,
> -* by verifing that the root directory is at the root of the
> -* mount namespace which allows all files to be accessed.
> +* Drop the chroot privilege when a user namespace is created inside
> +* chrooted environment so that the file system view presented to a
> +* non-admin process is preserved.
>  */
> -   ret = -EPERM;
> if (current_chrooted())
> -   goto fail_dec;
> +   is_chrooted = 1;
>
> /* The creator needs a mapping in the parent user namespace
>  * or else we won't be able to reasonably tell userspace who
> @@ -140,7 +144,7 @@ int create_user_ns(struct cred *new)
> if (!setup_userns_sysctls(ns))
> goto fail_keyring;
>
> -   set_cred_user_ns(new, ns);
> +   set_cred_user_ns(new, ns, is_chro

Re: [PATCH] kvm/x86 : avoid shifting signed 32-bit value by 31 bits

2018-10-15 Thread Paolo Bonzini
On 15/10/2018 19:16, H. Peter Anvin wrote:
> On 10/7/18 6:04 PM, peng.h...@zte.com.cn wrote:
>>
>> #define AVIC_LOGICAL_ID_ENTRY_GUEST_PHYSICAL_ID_MASK (0xFF)
>> -#define AVIC_LOGICAL_ID_ENTRY_VALID_MASK (1 << 31)
>> +#define AVIC_LOGICAL_ID_ENTRY_VALID_MASK (1UL << 31)

> It is reasonable to change to unsigned, while not necessary to unsigned
> long?
 AVIC_LOGICAL_ID_ENTRY_VALID_MASK is used in function avic_ldr_write.
 here I think it doesn't matter if you use unsigned or unsigned long. Do 
 you have any suggestions?
>>
>>> In current case, AVIC_LOGICAL_ID_ENTRY_VALID_MASK is used to calculate
>>> the value of new_entry with type of u32. So the definition here is not
>>> harmful.
>>
>>> Also, I did a quick grep and found similar definition (1 << 31) is popular
>>> in the whole kernel tree.
>>
>>> The reason to make this change is not that strong to me. Would you
>>> mind sharing more of the reasoning behind this change?
>> Oh, I'm just thinking logically; there is no further reason.
> 
> The right way to do this would be to use the _BITUL() (or _BITULL()) macro.

Even for a value from a 32-bit register?  That would be _BIT, which
doesn't exist.

Paolo


Re: [PATCH] x86/mm: annotate no_context with UNWIND_HINTS

2018-10-15 Thread Nick Desaulniers
On Mon, Oct 15, 2018 at 9:57 AM Nick Desaulniers
 wrote:
> I broke this myself in commit 815f0ddb346c
> ("include/linux/compiler*.h: make compiler-*.h mutually exclusive").
> Thanks for the suggestion, will verify then send a patch with your
> Suggested-by tag.  Thanks everyone for helping us sort this out!

https://lkml.org/lkml/2018/10/15/626


Re: [PATCH] dt-bindings: ufs: Fix the compatible string definition

2018-10-15 Thread Doug Anderson
Vivek,

On Mon, Oct 15, 2018 at 8:23 AM Vivek Gautam
 wrote:
>
> Hi Doug,
>
> On Sat, Oct 13, 2018 at 3:09 AM Douglas Anderson  
> wrote:
> >
> > If you look at the bindings for the UFS Host Controller it says:
> >
> > - compatible: must contain "jedec,ufs-1.1" or "jedec,ufs-2.0", may
> >   also list one or more of the following:
> >  "qcom,msm8994-ufshc"
> >  "qcom,msm8996-ufshc"
> >  "qcom,ufshc"
> >
> > My reading of that is that it's fine to just have either of these:
> > 1. "qcom,msm8996-ufshc", "jedec,ufs-2.0"
> > 2. "qcom,ufshc", "jedec,ufs-2.0"
> >
> > As far as I can tell neither of the above is actually a good idea.
> >
> > For #1 it turns out that the driver currently only keys off the
> > compatible string "qcom,ufshc" so it won't actually probe.
> >
> > For #2 the driver will probe but it's not a good idea to keep the SoC
> > name out of the compatible string.
> >
> > Let's update the compatible string to make it really explicit.  We'll
> > include a nod to the existing driver and the old binding and say that
> > we should always include the "qcom,ufshc" string in addition to the
> > SoC compatible string.
> >
> > While we're at it we'll also include another example SoC known to have
> > UFS: sdm845.
> >
> > Fixes: 47555a5c8a11 ("scsi: ufs: make the UFS variant a platform device")
> > Signed-off-by: Douglas Anderson 
> > ---
> >
> >  .../devicetree/bindings/ufs/ufshcd-pltfrm.txt   | 13 -
> >  1 file changed, 8 insertions(+), 5 deletions(-)
> >
> > diff --git a/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt 
> > b/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt
> > index 2df00524bd21..69a06a1b732e 100644
> > --- a/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt
> > +++ b/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt
> > @@ -4,11 +4,14 @@ UFSHC nodes are defined to describe on-chip UFS host 
> > controllers.
> >  Each UFS controller instance should have its own node.
> >
> >  Required properties:
> > -- compatible   : must contain "jedec,ufs-1.1" or "jedec,ufs-2.0", 
> > may
> > - also list one or more of the following:
> > - "qcom,msm8994-ufshc"
> > - "qcom,msm8996-ufshc"
> > - "qcom,ufshc"
> > +- compatible   : must contain "jedec,ufs-1.1" or "jedec,ufs-2.0"
> > +
> > + For Qualcomm SoCs must contain, as below, an
> > + SoC-specific compatible along with "qcom,ufshc" 
> > and
> > + the appropriate jedec string:
> > +   "qcom,msm8994-ufshc", "qcom,ufshc", 
> > "jedec,ufs-2.0"
> > +   "qcom,msm8996-ufshc", "qcom,ufshc", 
> > "jedec,ufs-2.0"
> > +   "qcom,sdm845-ufshc", "qcom,ufshc", 
> > "jedec,ufs-2.0"
>
> Thanks for the patch. It looks good to me.
> Reviewed-by: Vivek Gautam 

Thanks for the review!


> P.S.: While you are at it, can you please move 'ufs-qcom.txt'
> to Documentation/devicetree/bindings/phy/qcom,ufs-phy.txt.
> The current name and file location is misleading.

I'd rather someone at Qualcomm do this.  Do you have a suggested
person?  The reason I feel that Qualcomm needs to get involved is that
the file you refer to says it's for:

  "qcom,ufs-phy-qmp-20nm" for 20nm ufs phy,
  "qcom,ufs-phy-qmp-14nm" for legacy 14nm ufs phy,
  "qcom,msm8996-ufs-phy-qmp-14nm" for 14nm ufs phy
  present on MSM8996 chipset.

...but there's another Qualcomm file, 'qcom-qmp-phy.txt'.  That
handles the compatible string:

   "qcom,sdm845-qmp-ufs-phy" for UFS QMP phy on sdm845.

So I'm a little confused.  Should the SDM845 UFS PHY have been handled by
the older UFS PHY driver?  ...or should all the older UFS PHYs be
moved to be handled by the newer QMP PHY driver?  ...or are they
really different hardware blocks, in which case how would you describe
the difference (both are described as UFS QMP PHYs, I think)?

BTW: I have a patch attempting to fixup the QMP PHY bindings at
.


-Doug
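
As an aside on the probing behaviour described in the quoted commit message, here is a simplified model. It assumes a driver that matches only "qcom,ufshc", as the commit message states. The function name is hypothetical, and the real kernel matches through an of_device_id table rather than this loop.

```c
#include <stddef.h>
#include <string.h>

/*
 * Model of device tree matching: a node binds to a driver if any string in
 * the node's "compatible" list equals any string the driver matches on.
 * Returns 1 on a match, 0 otherwise.
 */
static inline int model_compatible_match(const char *const *node, size_t n_node,
					 const char *const *ids, size_t n_ids)
{
	for (size_t i = 0; i < n_node; i++)
		for (size_t j = 0; j < n_ids; j++)
			if (strcmp(node[i], ids[j]) == 0)
				return 1;
	return 0;
}
```

With a driver table of just "qcom,ufshc", the list "qcom,msm8996-ufshc", "jedec,ufs-2.0" does not match, while the list the patch recommends ("qcom,msm8996-ufshc", "qcom,ufshc", "jedec,ufs-2.0") does.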


Re: [RFC] Allow user namespace inside chroot

2018-10-15 Thread Andy Lutomirski
On Mon, Oct 15, 2018 at 10:22 AM Jann Horn  wrote:
>
> On Mon, Oct 15, 2018 at 7:10 PM  wrote:
> > Following commit disables the creation of user namespace inside
> > the chroot environment.
> >
> > userns: Don't allow creation if the user is chrooted
> >
> > commit 3151527ee007b73a0ebd296010f1c0454a919c7d
> >
> > Consider a system in which a non-root user creates a combination
> > of user, pid and mount namespaces and confines a process to it.
> > The system will have multiple levels of nested namespaces.
> > The root namespace in the system will have lots of directories
> > which should not be exposed to the child confined to the set of
> > namespaces.
> >
> > Without chroot, we will have to hide all unwanted directories
> > individually using bind mounts and mount namespace.
>
> IMO what you really should be doing is to create a tmpfs, bind-mount
> the directories you want into it, and then pivot_root() into that, not
> the other way around.

Indeed.  Or you can just recursive bind-mount the subtree you want and
then pivot_root() into it.

--Andy


Re: [RFC][PATCH] perf: Rewrite core context handling

2018-10-15 Thread Alexey Budankov


Hi,
On 15.10.2018 11:34, Peter Zijlstra wrote:
> On Mon, Oct 15, 2018 at 10:26:06AM +0300, Alexey Budankov wrote:
>> Hi,
>>
>> On 10.10.2018 13:45, Peter Zijlstra wrote:
>>> Hi all,
>>>
>>> There have been various issues and limitations with the way perf uses
>>> (task) contexts to track events. Most notable is the single hardware PMU
>>> task context, which has resulted in a number of yucky things (both
>>> proposed and merged).
>>>
>>> Notably:
>>>
>>>  - HW breakpoint PMU
>>>  - ARM big.little PMU
>>>  - Intel Branch Monitoring PMU
>>>
>>> Since we now track the events in RB trees, we can 'simply' add a pmu
>>> order to them and have them grouped that way, reducing to a single
>>> context. Of course, reality never quite works out that simple, and below
>>> ends up adding an intermediate data structure to bridge the context ->
>>> pmu mapping.
>>>
>>> Something a little like:
>>>
>>>   ,[1:n]-.
>>>   V  V
>>> perf_event_context <-[1:n]-> perf_event_pmu_context <--- perf_event
>>>   ^  ^ | |
>>>   `[1:n]-' `-[n:1]-> pmu <-[1:n]-'
>>>
>>> This patch builds (provided you disable CGROUP_PERF), boots and survives
>>> perf-top without the machine catching fire.
>>>
>>> There's still a fair bit of loose ends (look for XXX), but I think this
>>> is the direction we should be going.
>>>
>>> Comments?
>>>
>>> Not-Quite-Signed-off-by: Peter Zijlstra (Intel) 
>>> ---
>>>  arch/powerpc/perf/core-book3s.c |4 
>>>  arch/x86/events/core.c  |4 
>>>  arch/x86/events/intel/core.c|6 
>>>  arch/x86/events/intel/ds.c  |6 
>>>  arch/x86/events/intel/lbr.c |   16 
>>>  arch/x86/events/perf_event.h|6 
>>>  include/linux/perf_event.h  |   80 +-
>>>  include/linux/sched.h   |2 
>>>  kernel/events/core.c| 1412 
>>> 
>>>  9 files changed, 815 insertions(+), 721 deletions(-)
>>
>> The rewrite is impressive; however, it doesn't result in code base reduction
>> as it is.
> 
> Yeah.. that seems to be nature of these things ..
> 
>> Nonetheless there is a clear demand for per-PMU event group tracking and
>> rotation in a single CPU context (HW breakpoints, ARM big.little, Intel LBRs),
>> and there is a supply through group ordering on the RB tree.
>>
>> This might be driven into the kernel by some new perf features that would
>> build on that RB-tree group ordering, or by refactoring existing code, but in
>> a way that would result in an overall code base reduction and thus lower the
>> support cost.
> 
> If you have a concrete suggestion on how to reduce complexity? I tried,
> but couldn't find any (without breaking something).

Could some of those PMUs (HW breakpoints, ARM big.little, Intel LBRs)
or other perf-related code be adjusted now so that the overall subsystem
code base shrinks?

Thanks,
Alexey

> 
> The active lists and pmu_ctx_list could arguably be replaced with
> (slower) iterations over the RB tree, but you'll still need the per pmu
> nr_events/nr_active counts to determine if rotation is required at all.
> 
> And like you know, performance is quite important here too. I'd love to
> reduce complexity while maintaining or improving performance, but that
> rarely if ever happens :/
> 


Re: [PATCH] kvm/x86 : avoid shifting signed 32-bit value by 31 bits

2018-10-15 Thread H. Peter Anvin
On 10/15/18 10:23 AM, Paolo Bonzini wrote:
> 
> Even for a value from a 32-bit register?  That would be _BIT, which
> doesn't exist.
> 

Just use _BITUL(). gcc is smart enough to know that the resulting value
is representable in 32 bits.

Or if you really care, submit a patch to create _BITU(), but I don't
personally see much of a point.

-hpa
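
For readers following the shift discussion, a small sketch under stated assumptions: MODEL_BITUL() is a hypothetical stand-in for the kernel's _BITUL(), and set_valid() is an invented helper that mimics OR-ing the valid bit into a u32 entry. With a 32-bit int, `1 << 31` shifts into the sign bit, which ISO C leaves undefined. The unsigned forms below are well defined and give the same mask once truncated to 32 bits.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's _BITUL() helper. */
#define MODEL_BITUL(n)	(1UL << (n))

/* Two well-defined spellings of the bit-31 mask. */
#define VALID_MASK_U	(1U << 31)
#define VALID_MASK_UL	MODEL_BITUL(31)

/* OR the valid bit into a 32-bit entry; the cast keeps the result 32 bits wide. */
static inline uint32_t set_valid(uint32_t entry)
{
	return entry | (uint32_t)VALID_MASK_UL;
}
```

Either spelling works for a u32 field; the difference is only the type of the constant before it is narrowed.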




Re: [PATCH] doc: rcu: Fix code listing in performance and scalability requirements

2018-10-15 Thread Paul E. McKenney
On Sun, Oct 14, 2018 at 07:29:42PM -0700, Joel Fernandes (Google) wrote:
> The code listing under this section has a quick quiz that says line 19
> uses rcu_access_pointer, but the code listing itself does not. Fix this.
> 
> Signed-off-by: Joel Fernandes (Google) 

Good eyes!  Queued for the merge window after this coming one,
thank you!

Thanx, Paul

> ---
>  .../RCU/Design/Requirements/Requirements.html|  2 +-
>  kernel/sys.c | 16 
>  2 files changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/RCU/Design/Requirements/Requirements.html 
> b/Documentation/RCU/Design/Requirements/Requirements.html
> index 4fae55056c1d..f74a2233865c 100644
> --- a/Documentation/RCU/Design/Requirements/Requirements.html
> +++ b/Documentation/RCU/Design/Requirements/Requirements.html
> @@ -1596,7 +1596,7 @@ used in place of synchronize_rcu() as follows:
>  16   struct foo *p;
>  17
>  18   spin_lock(&gp_lock);
> -19   p = rcu_dereference(gp);
> +19   p = rcu_access_pointer(gp);
>  20   if (!p) {
>  21 spin_unlock(&gp_lock);
>  22 return false;
> -- 
> 2.19.0.605.g01d371f741-goog
> 



Re: [PATCH 5/5] RISC-V: Implement sparsemem

2018-10-15 Thread Palmer Dabbelt

On Thu, 11 Oct 2018 05:18:20 PDT (-0700), sba...@raithlin.com wrote:

> Palmer
>
>> I don't really know anything about this, but you're welcome to add a
>>
>>    Reviewed-by: Palmer Dabbelt
>
> Thanks. I think it would be good to get someone who's familiar with linux/mm
> to take a look.
>
>> if you think it'll help.  I'm assuming you're targeting a different tree for
>> the patch set, in which case it's probably best to keep this together with
>> the rest of it.
>
> No, I think this series should be pulled by the RISC-V maintainer. The other
> patches in this series just refactor some code and need to be ACK'ed by
> their ARCH developers, but I suspect the series should be pulled into
> RISC-V. That said, since it does touch other arches, should it be pulled by mm?
>
> BTW note that RISC-V SPARSEMEM support is pretty useful for all manner of
> things and not just the p2pdma discussed in the cover.

Ah, OK -- I thought this was adding the support everywhere.  Do you mind
re-sending the patches with the various acks/reviews and I'll put it on
for-next?

>> Thanks for porting your stuff to RISC-V!
>
> You bet ;-)


Re: [RFC] Allow user namespace inside chroot

2018-10-15 Thread Eric W. Biederman


Have you considered using pivot_root to drop all of the pieces of the
filesystem you don't want to be visible?  That should be a much better
solution overall.

It is just a matter of:
mount --bind /path/you/would/chroot/to
pivot_root /path/you/would/chroot/to /put/old
umount -l /put/old

You might need to do something like mount --make-rprivate before calling
pivot_root to stop mount propagation to the parent.  But I can't
imagine it to be a practical problem.
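
The same sequence can be sketched in C. This is a minimal sketch, not a hardened implementation: pivot_into() is a hypothetical helper, it creates the user and mount namespaces itself, and it uses the pivot_root(".", ".") variant described in the pivot_root(2) manual page instead of a separate /put/old directory. Note that the unshare step is expected to fail for a chrooted caller, which is the restriction this thread is about.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <unistd.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <linux/sched.h>	/* CLONE_NEWUSER, CLONE_NEWNS */

/* Returns 0 on success, or -errno from the first step that fails. */
static int pivot_into(const char *new_root)
{
	/* New user + mount namespaces, so the mount calls below are permitted. */
	if (syscall(SYS_unshare, CLONE_NEWUSER | CLONE_NEWNS) < 0)
		return -errno;

	/* Stop mount events from propagating back to the parent namespace. */
	if (mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL) < 0)
		return -errno;

	/* pivot_root() needs new_root to be a mount point: bind it onto itself. */
	if (mount(new_root, new_root, NULL, MS_BIND | MS_REC, NULL) < 0)
		return -errno;

	if (chdir(new_root) < 0)
		return -errno;

	/* Make new_root the root; the old root ends up stacked at the same path. */
	if (syscall(SYS_pivot_root, ".", ".") < 0)
		return -errno;

	/* Lazily detach the old root so nothing outside new_root stays visible. */
	if (umount2(".", MNT_DETACH) < 0)
		return -errno;

	if (chdir("/") < 0)
		return -errno;

	return 0;
}
```

A caller would normally also write uid_map and gid_map for the new user namespace; that step is omitted here to keep the sketch focused on the mount operations.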


Also note that being in a chroot tends to indicate one of two things,
being in an old build system, or being in some kind of chroot jail.
Because of the jails created with chroot we want to be very careful
with enabling user namespaces in that context.

There have been some very clever people figuring out how to get out of
chroot jails by passing file descriptors between processes and using
things like pivot root.

Even if your analysis is semantically perfect there is the issue of
increasing the attack surface of preexisting chroot jails.  I believe
that would make the kernel more vulnerable overall, and for only
a very small simplification of implementation details.

So unless I am missing something I don't see the use case for this that
would not be better served by just properly setting up your mount
namespace, and the attack surface increase of chroot jails makes me
very reluctant to see a change like this.

Eric

nagarathnam.muthus...@oracle.com writes:

> From: Nagarathnam Muthusamy 
>
> Following commit disables the creation of user namespace inside
> the chroot environment.
>
> userns: Don't allow creation if the user is chrooted
>
> commit 3151527ee007b73a0ebd296010f1c0454a919c7d
>
> Consider a system in which a non-root user creates a combination
> of user, pid and mount namespaces and confines a process to it.
> The system will have multiple levels of nested namespaces.
> The root namespace in the system will have lots of directories
> which should not be exposed to the child confined to the set of
> namespaces.
>
> Without chroot, we will have to hide all unwanted directories
> individually using bind mounts and mount namespace. Chroot enables
> us to expose a handpicked list of directories which the child
> can see but if we use chroot we won't be able to create nested
> namespaces.
>
> Allowing a process to create user namespace within a chroot
> environment will enable it to chroot, which in turn can be used
> to escape the jail.
>
> This patch drops the chroot privilege when user namespace is
> created within the chroot environment so the process cannot
> use it to escape the chroot jail. The process can still modify
> the view of the file system using mount namespace but for those
> modifications to be useful, it needs to run a setuid program with
> that intended uid directly mapped into the user namespace as it is
> which is not possible for an unprivileged process.
>
> If there were any other corner cases which were considered while
> deciding to disable the creation of user namespace as a whole
> within the chroot environment please let me know.
>
> Signed-off-by: Nagarathnam Muthusamy
> ---
>  kernel/user_namespace.c | 22 +-
>  1 file changed, 13 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
> index e5222b5..83d2a70 100644
> --- a/kernel/user_namespace.c
> +++ b/kernel/user_namespace.c
> @@ -44,7 +44,7 @@ static void dec_user_namespaces(struct ucounts *ucounts)
>   return dec_ucount(ucounts, UCOUNT_USER_NAMESPACES);
>  }
>  
> -static void set_cred_user_ns(struct cred *cred, struct user_namespace 
> *user_ns)
> +static void set_cred_user_ns(struct cred *cred, struct user_namespace 
> *user_ns, int is_chrooted)
>  {
>   /* Start with the same capabilities as init but useless for doing
>* anything as the capabilities are bound to the new user namespace.
> @@ -55,6 +55,11 @@ static void set_cred_user_ns(struct cred *cred, struct 
> user_namespace *user_ns)
>   cred->cap_effective = CAP_FULL_SET;
>   cred->cap_ambient = CAP_EMPTY_SET;
>   cred->cap_bset = CAP_FULL_SET;
> + if (is_chrooted) {
> + cap_lower(cred->cap_permitted, CAP_SYS_CHROOT);
> + cap_lower(cred->cap_effective, CAP_SYS_CHROOT);
> + cap_lower(cred->cap_bset, CAP_SYS_CHROOT);
> + }
>  #ifdef CONFIG_KEYS
>   key_put(cred->request_key_auth);
>   cred->request_key_auth = NULL;
> @@ -78,6 +83,7 @@ int create_user_ns(struct cred *new)
>   kgid_t group = new->egid;
>   struct ucounts *ucounts;
>   int ret, i;
> + int is_chrooted = 0;
>  
>   ret = -ENOSPC;
>   if (parent_ns->level > 32)
> @@ -88,14 +94,12 @@ int create_user_ns(struct cred *new)
>   goto fail;
>  
>   /*
> -  * Verify that we can not violate the policy of which files
> -  * may be accessed that is specified by the root directory,
> -  * by verifing that the root directory is at the root of the
> -  * mount name

[PATCH v2 4/6] arm64: mm: make use of new memblocks_present() helper

2018-10-15 Thread Logan Gunthorpe
Clean up the arm64_memory_present() function, since it is very
similar to that of other arches.

memblocks_present() is a direct replacement for arm64_memory_present().

Signed-off-by: Logan Gunthorpe 
Acked-by: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm64/mm/init.c | 20 +---
 1 file changed, 1 insertion(+), 19 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 6a0b5c5a61af..c51a944fe19f 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -296,24 +296,6 @@ int pfn_valid(unsigned long pfn)
 EXPORT_SYMBOL(pfn_valid);
 #endif
 
-#ifndef CONFIG_SPARSEMEM
-static void __init arm64_memory_present(void)
-{
-}
-#else
-static void __init arm64_memory_present(void)
-{
-   struct memblock_region *reg;
-
-   for_each_memblock(memory, reg) {
-   int nid = memblock_get_region_node(reg);
-
-   memory_present(nid, memblock_region_memory_base_pfn(reg),
-   memblock_region_memory_end_pfn(reg));
-   }
-}
-#endif
-
 static phys_addr_t memory_limit = PHYS_ADDR_MAX;
 
 /*
@@ -506,7 +488,7 @@ void __init bootmem_init(void)
 * Sparsemem tries to allocate bootmem in memory_present(), so must be
 * done after the fixed reservations.
 */
-   arm64_memory_present();
+   memblocks_present();
 
sparse_init();
zone_sizes_init(min, max);
-- 
2.19.0



[PATCH v2 0/6] sparsemem support for RISC-V

2018-10-15 Thread Logan Gunthorpe
This patchset implements sparsemem on RISC-V. The first few patches
move some code in existing architectures into common helpers
so they can be used by the new RISC-V implementation. The final
patch actually adds sparsemem support to RISC-V.

This is the first small step in supporting P2P on RISC-V.

--

Changes in v2:

* Rebase on v4.19-rc8
* Move the STRUCT_PAGE_MAX_SHIFT define into a common header (near
  the definition of struct page). As suggested by Christoph.
* Clean up the unnecessary nid variable in the memblocks_present()
  function, per Christoph.
* Collected tags from Palmer and Catalin.

--
Logan Gunthorpe (6):
  mm: Introduce common STRUCT_PAGE_MAX_SHIFT define
  mm/sparse: add common helper to mark all memblocks present
  ARM: mm: make use of new memblocks_present() helper
  arm64: mm: make use of new memblocks_present() helper
  sh: mm: make use of new memblocks_present() helper
  RISC-V: Implement sparsemem

 arch/arm/mm/init.c | 17 +
 arch/arm64/include/asm/memory.h|  9 -
 arch/arm64/mm/init.c   | 28 +---
 arch/riscv/Kconfig | 23 +++
 arch/riscv/include/asm/pgtable.h   | 21 +
 arch/riscv/include/asm/sparsemem.h | 11 +++
 arch/riscv/kernel/setup.c  |  4 +++-
 arch/riscv/mm/init.c   |  8 
 arch/sh/mm/init.c  |  7 +--
 include/asm-generic/fixmap.h   |  1 +
 include/linux/mm_types.h   |  5 +
 include/linux/mmzone.h |  6 ++
 mm/sparse.c| 14 ++
 13 files changed, 91 insertions(+), 63 deletions(-)
 create mode 100644 arch/riscv/include/asm/sparsemem.h

--
2.19.0


[PATCH v2 3/6] ARM: mm: make use of new memblocks_present() helper

2018-10-15 Thread Logan Gunthorpe
Clean up the arm_memory_present() function, since it is very
similar to that of other arches.

The new memblocks_present() helper checks for node ids, which the
arm version did not. However, this is equivalent, since
HAVE_MEMBLOCK_NODE_MAP should be false on this arch and therefore
memblock_get_region_node() should return 0.

Signed-off-by: Logan Gunthorpe 
Cc: Russell King 
Cc: Kees Cook 
Cc: Philip Derrin 
Cc: "Steven Rostedt (VMware)" 
Cc: Nicolas Pitre 
---
 arch/arm/mm/init.c | 17 +
 1 file changed, 1 insertion(+), 16 deletions(-)

diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 0cc8e04295a4..e2710dd7446f 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -201,21 +201,6 @@ int pfn_valid(unsigned long pfn)
 EXPORT_SYMBOL(pfn_valid);
 #endif
 
-#ifndef CONFIG_SPARSEMEM
-static void __init arm_memory_present(void)
-{
-}
-#else
-static void __init arm_memory_present(void)
-{
-   struct memblock_region *reg;
-
-   for_each_memblock(memory, reg)
-   memory_present(0, memblock_region_memory_base_pfn(reg),
-  memblock_region_memory_end_pfn(reg));
-}
-#endif
-
 static bool arm_memblock_steal_permitted = true;
 
 phys_addr_t __init arm_memblock_steal(phys_addr_t size, phys_addr_t align)
@@ -317,7 +302,7 @@ void __init bootmem_init(void)
 * Sparsemem tries to allocate bootmem in memory_present(),
 * so must be done after the fixed reservations
 */
-   arm_memory_present();
+   memblocks_present();
 
/*
 * sparse_init() needs the bootmem allocator up and running.
-- 
2.19.0



[PATCH v2 5/6] sh: mm: make use of new memblocks_present() helper

2018-10-15 Thread Logan Gunthorpe
Clean up the open-coded for_each_memblock() loop that is equivalent
to the new memblocks_present() helper.

Signed-off-by: Logan Gunthorpe 
Cc: Yoshinori Sato 
Cc: Rich Felker 
Cc: Dan Williams 
Cc: Rob Herring 
---
 arch/sh/mm/init.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
index 7713c084d040..f601f96408ee 100644
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -235,12 +235,7 @@ static void __init do_init_bootmem(void)
 
plat_mem_setup();
 
-   for_each_memblock(memory, reg) {
-   int nid = memblock_get_region_node(reg);
-
-   memory_present(nid, memblock_region_memory_base_pfn(reg),
-   memblock_region_memory_end_pfn(reg));
-   }
+   memblocks_present();
sparse_init();
 }
 
-- 
2.19.0



[PATCH v2 2/6] mm/sparse: add common helper to mark all memblocks present

2018-10-15 Thread Logan Gunthorpe
Presently the arches arm64, arm and sh each have a function which loops
through each memblock and calls memory_present(). riscv will require a
similar function.

Introduce a common memblocks_present() function that can be used by
all the arches. Subsequent patches will cleanup the arches that
make use of this.
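
What the common helper does can be modelled in userspace as follows. Everything here is an assumption for illustration: the model_* names, the section size of 2^15 pages, and the fixed-size section table are not the kernel's. Each region marks every section it overlaps as present and records its node id.

```c
#include <stddef.h>

#define MODEL_SECTION_PFN_SHIFT	15	/* hypothetical: 2^15 pages per section */
#define MODEL_PAGES_PER_SECTION	(1UL << MODEL_SECTION_PFN_SHIFT)
#define MODEL_NR_SECTIONS	64

/* One memory region as a half-open page frame range [base_pfn, end_pfn). */
struct model_region {
	unsigned long base_pfn;
	unsigned long end_pfn;
	int nid;
};

static unsigned char model_section_present[MODEL_NR_SECTIONS];
static int model_section_nid[MODEL_NR_SECTIONS];

/* Mark every section overlapped by [start, end) as present on node nid. */
static void model_memory_present(int nid, unsigned long start, unsigned long end)
{
	unsigned long pfn = start & ~(MODEL_PAGES_PER_SECTION - 1);

	for (; pfn < end; pfn += MODEL_PAGES_PER_SECTION) {
		unsigned long sec = pfn >> MODEL_SECTION_PFN_SHIFT;

		if (sec >= MODEL_NR_SECTIONS)
			break;
		model_section_present[sec] = 1;
		model_section_nid[sec] = nid;
	}
}

/* Model of the common helper: one memory_present() call per region. */
static void model_memblocks_present(const struct model_region *regs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		model_memory_present(regs[i].nid, regs[i].base_pfn,
				     regs[i].end_pfn);
}
```

The per-arch functions removed later in the series all have this shape, which is why a single shared loop can replace them.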

Signed-off-by: Logan Gunthorpe 
Cc: Andrew Morton 
Cc: Michal Hocko 
Cc: Vlastimil Babka 
Cc: Pavel Tatashin 
Cc: Oscar Salvador 
---
 include/linux/mmzone.h |  6 ++
 mm/sparse.c| 14 ++
 2 files changed, 20 insertions(+)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index d4b0c79d2924..26a026a45857 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -784,6 +784,12 @@ void memory_present(int nid, unsigned long start, unsigned 
long end);
 static inline void memory_present(int nid, unsigned long start, unsigned long 
end) {}
 #endif
 
+#if defined(CONFIG_SPARSEMEM) && defined(CONFIG_HAVE_MEMBLOCK)
+void memblocks_present(void);
+#else
+static inline void memblocks_present(void) {}
+#endif
+
 #ifdef CONFIG_HAVE_MEMORYLESS_NODES
 int local_memory_node(int node_id);
 #else
diff --git a/mm/sparse.c b/mm/sparse.c
index 10b07eea9a6e..90aec8331a03 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -238,6 +239,19 @@ void __init memory_present(int nid, unsigned long start, 
unsigned long end)
}
 }
 
+#ifdef CONFIG_HAVE_MEMBLOCK
+void __init memblocks_present(void)
+{
+   struct memblock_region *reg;
+
+   for_each_memblock(memory, reg) {
+   memory_present(memblock_get_region_node(reg),
+  memblock_region_memory_base_pfn(reg),
+  memblock_region_memory_end_pfn(reg));
+   }
+}
+#endif
+
 /*
  * Subtle, we encode the real pfn into the mem_map such that
  * the identity pfn - section_mem_map will return the actual
-- 
2.19.0



[PATCH v2 1/6] mm: Introduce common STRUCT_PAGE_MAX_SHIFT define

2018-10-15 Thread Logan Gunthorpe
This define is used by arm64 to calculate the size of the vmemmap
region. It is defined as the log2 of the upper bound on the size
of a struct page.

We move it into mm_types.h so it can be defined properly instead of
set and checked with a build bug. This also allows us to use the same
define for riscv.
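
The arithmetic behind the define can be sketched in plain C. This is an illustration only: model_max_shift() is a hypothetical userspace stand-in for ilog2(roundup_pow_of_two(size)), and struct model_page is an invented layout, not the kernel's struct page.

```c
#include <stddef.h>

/*
 * Smallest shift s such that (1 << s) >= size, i.e. the log2 of size
 * rounded up to the next power of two.
 */
static inline unsigned int model_max_shift(size_t size)
{
	unsigned int shift = 0;

	while (((size_t)1 << shift) < size)
		shift++;
	return shift;
}

/* Hypothetical structure whose size need not be a power of two. */
struct model_page {
	unsigned long flags;
	void *ptrs[5];
	int refcount;
	int mapcount;
};
```

A 64-byte structure gives a shift of 6, the value arm64 previously hard-coded, and a 56-byte one also rounds up to 6. Deriving the shift from sizeof makes the old BUILD_BUG_ON() check unnecessary, because the bound can no longer be too small.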

Signed-off-by: Logan Gunthorpe 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Arnd Bergmann 
Cc: Andrew Morton 
Cc: Christoph Hellwig 
---
 arch/arm64/include/asm/memory.h | 9 -
 arch/arm64/mm/init.c| 8 
 include/asm-generic/fixmap.h| 1 +
 include/linux/mm_types.h| 5 +
 4 files changed, 6 insertions(+), 17 deletions(-)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index b96442960aea..f0a5c9531e8b 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -34,15 +34,6 @@
  */
 #define PCI_IO_SIZE		SZ_16M
 
-/*
- * Log2 of the upper bound of the size of a struct page. Used for sizing
- * the vmemmap region only, does not affect actual memory footprint.
- * We don't use sizeof(struct page) directly since taking its size here
- * requires its definition to be available at this point in the inclusion
- * chain, and it may not be a power of 2 in the first place.
- */
-#define STRUCT_PAGE_MAX_SHIFT  6
-
 /*
  * VMEMMAP_SIZE - allows the whole linear region to be covered by
  *a struct page array
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 787e27964ab9..6a0b5c5a61af 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -615,14 +615,6 @@ void __init mem_init(void)
BUILD_BUG_ON(TASK_SIZE_32   > TASK_SIZE_64);
 #endif
 
-#ifdef CONFIG_SPARSEMEM_VMEMMAP
-   /*
-* Make sure we chose the upper bound of sizeof(struct page)
-* correctly when sizing the VMEMMAP array.
-*/
-   BUILD_BUG_ON(sizeof(struct page) > (1 << STRUCT_PAGE_MAX_SHIFT));
-#endif
-
if (PAGE_SIZE >= 16384 && get_num_physpages() <= 128) {
extern int sysctl_overcommit_memory;
/*
diff --git a/include/asm-generic/fixmap.h b/include/asm-generic/fixmap.h
index 827e4d3bbc7a..8cc7b09c1bc7 100644
--- a/include/asm-generic/fixmap.h
+++ b/include/asm-generic/fixmap.h
@@ -16,6 +16,7 @@
 #define __ASM_GENERIC_FIXMAP_H
 
 #include 
+#include 
 
 #define __fix_to_virt(x)   (FIXADDR_TOP - ((x) << PAGE_SHIFT))
 #define __virt_to_fix(x)   ((FIXADDR_TOP - ((x)&PAGE_MASK)) >> PAGE_SHIFT)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 5ed8f6292a53..ec8c16d9396b 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -206,6 +206,11 @@ struct page {
 #endif
 } _struct_page_alignment;
 
+/*
+ * Used for sizing the vmemmap region on some architectures
+ */
+#define STRUCT_PAGE_MAX_SHIFT  ilog2(roundup_pow_of_two(sizeof(struct page)))
+
 #define PAGE_FRAG_CACHE_MAX_SIZE   __ALIGN_MASK(32768, ~PAGE_MASK)
 #define PAGE_FRAG_CACHE_MAX_ORDER  get_order(PAGE_FRAG_CACHE_MAX_SIZE)
 
-- 
2.19.0



[RFC PATCH v2 06/12] crypto: arm/chacha20 - refactor to allow varying number of rounds

2018-10-15 Thread Eric Biggers
From: Eric Biggers 

In preparation for adding XChaCha12 support, rename/refactor the NEON
implementation of ChaCha20 to support different numbers of rounds.
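
As a reference point for the refactoring, here is a portable C sketch of a ChaCha permutation that takes the round count as a parameter. It is not the NEON code from this patch; the model_* names are hypothetical, and the quarter-round follows the definition in RFC 7539. Each loop iteration performs one double round, so 20 rounds means 10 iterations and 12 rounds means 6.

```c
#include <stdint.h>

static inline uint32_t model_rol32(uint32_t v, int n)
{
	return (v << n) | (v >> (32 - n));
}

/* ChaCha quarter-round on four state words. */
static inline void model_qr(uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d)
{
	*a += *b; *d = model_rol32(*d ^ *a, 16);
	*c += *d; *b = model_rol32(*b ^ *c, 12);
	*a += *b; *d = model_rol32(*d ^ *a, 8);
	*c += *d; *b = model_rol32(*b ^ *c, 7);
}

/* Permute the 16-word state in place; nrounds is expected to be even. */
static void model_chacha_permute(uint32_t x[16], int nrounds)
{
	for (int i = 0; i < nrounds; i += 2) {
		/* Column round. */
		model_qr(&x[0], &x[4], &x[8],  &x[12]);
		model_qr(&x[1], &x[5], &x[9],  &x[13]);
		model_qr(&x[2], &x[6], &x[10], &x[14]);
		model_qr(&x[3], &x[7], &x[11], &x[15]);
		/* Diagonal round. */
		model_qr(&x[0], &x[5], &x[10], &x[15]);
		model_qr(&x[1], &x[6], &x[11], &x[12]);
		model_qr(&x[2], &x[7], &x[8],  &x[13]);
		model_qr(&x[3], &x[4], &x[9],  &x[14]);
	}
}
```

Passing 20 gives the ChaCha20 permutation and passing 12 gives the ChaCha12 one; the caller still has to add the original state back in to produce a keystream block.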

Signed-off-by: Eric Biggers 
---
 arch/arm/crypto/Makefile  |  4 +-
 ...hacha20-neon-core.S => chacha-neon-core.S} | 36 ++--
 ...hacha20-neon-glue.c => chacha-neon-glue.c} | 56 ++-
 3 files changed, 52 insertions(+), 44 deletions(-)
 rename arch/arm/crypto/{chacha20-neon-core.S => chacha-neon-core.S} (96%)
 rename arch/arm/crypto/{chacha20-neon-glue.c => chacha-neon-glue.c} (73%)

diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile
index bd5bceef0605f..005482ff95047 100644
--- a/arch/arm/crypto/Makefile
+++ b/arch/arm/crypto/Makefile
@@ -9,7 +9,7 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM) += sha1-arm.o
 obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o
 obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o
 obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o
-obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o
+obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha-neon.o
 
 ce-obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o
 ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += sha1-arm-ce.o
@@ -52,7 +52,7 @@ aes-arm-ce-y  := aes-ce-core.o aes-ce-glue.o
 ghash-arm-ce-y := ghash-ce-core.o ghash-ce-glue.o
 crct10dif-arm-ce-y := crct10dif-ce-core.o crct10dif-ce-glue.o
 crc32-arm-ce-y:= crc32-ce-core.o crc32-ce-glue.o
-chacha20-neon-y := chacha20-neon-core.o chacha20-neon-glue.o
+chacha-neon-y := chacha-neon-core.o chacha-neon-glue.o
 
 ifdef REGENERATE_ARM_CRYPTO
 quiet_cmd_perl = PERL    $@
diff --git a/arch/arm/crypto/chacha20-neon-core.S b/arch/arm/crypto/chacha-neon-core.S
similarity index 96%
rename from arch/arm/crypto/chacha20-neon-core.S
rename to arch/arm/crypto/chacha-neon-core.S
index db59f1fbc728b..4b12064449f78 100644
--- a/arch/arm/crypto/chacha20-neon-core.S
+++ b/arch/arm/crypto/chacha-neon-core.S
@@ -1,5 +1,5 @@
 /*
- * ChaCha20 256-bit cipher algorithm, RFC7539, ARM NEON functions
+ * ChaCha/XChaCha NEON helper functions
  *
  * Copyright (C) 2016 Linaro, Ltd. 
  *
@@ -53,18 +53,19 @@
.align  5
 
 /*
- * _chacha20_permute - permute one block
+ * _chacha_permute - permute one block
  *
  * Permute one 64-byte block where the state matrix is stored in the four NEON
  * registers q0-q3.  It performs matrix operation on four words in parallel, but
  * requires shuffling to rearrange the words after each round.
  *
+ * The round count is given in r3.
+ *
  * Clobbers: r3, q4-q5
  */
-.macro _chacha20_permute
+.macro _chacha_permute
 
adr ip, .Lrol8_table
-   mov r3, #10
vld1.8  {d10}, [ip, :64]
 
 .Ldoubleround_\@:
@@ -128,14 +129,15 @@
// x3 = shuffle32(x3, MASK(0, 3, 2, 1))
vext.8  q3, q3, q3, #4
 
-   subs    r3, r3, #1
+   subs    r3, r3, #2
bne .Ldoubleround_\@
 .endm
 
-ENTRY(chacha20_block_xor_neon)
+ENTRY(chacha_block_xor_neon)
// r0: Input state matrix, s
// r1: 1 data block output, o
// r2: 1 data block input, i
+   // r3: nrounds
 
// x0..3 = s0..3
add ip, r0, #0x20
@@ -147,7 +149,7 @@ ENTRY(chacha20_block_xor_neon)
vmov    q10, q2
vmov    q11, q3
 
-   _chacha20_permute
+   _chacha_permute
 
add ip, r2, #0x20
vld1.8  {q4-q5}, [r2]
@@ -174,29 +176,31 @@ ENTRY(chacha20_block_xor_neon)
vst1.8  {q2-q3}, [ip]
 
bx  lr
-ENDPROC(chacha20_block_xor_neon)
+ENDPROC(chacha_block_xor_neon)
 
-ENTRY(hchacha20_block_neon)
+ENTRY(hchacha_block_neon)
// r0: Input state matrix, s
// r1: output (8 32-bit words)
+   // r2: nrounds
 
vld1.32 {q0-q1}, [r0]!
vld1.32 {q2-q3}, [r0]
 
-   _chacha20_permute
+   mov r3, r2
+   _chacha_permute
 
vst1.32 {q0}, [r1]!
vst1.32 {q3}, [r1]
 
bx  lr
-ENDPROC(hchacha20_block_neon)
+ENDPROC(hchacha_block_neon)
 
.align  4
 .Lctrinc:  .word   0, 1, 2, 3
 .Lrol8_table:  .byte   3, 0, 1, 2, 7, 4, 5, 6
 
.align  5
-ENTRY(chacha20_4block_xor_neon)
+ENTRY(chacha_4block_xor_neon)
push    {r4-r5}
mov r4, sp  // preserve the stack pointer
sub ip, sp, #0x20   // allocate a 32 byte buffer
@@ -206,9 +210,10 @@ ENTRY(chacha20_4block_xor_neon)
// r0: Input state matrix, s
// r1: 4 data blocks output, o
// r2: 4 data blocks input, i
+   // r3: nrounds
 
//
-   // This function encrypts four consecutive ChaCha20 blocks by loading
+   // This function encrypts four consecutive ChaCha blocks by loading
// the state matrix in NEON registers four times. The algorithm performs
// each operation on the corresponding

[PATCH v2 6/6] RISC-V: Implement sparsemem

2018-10-15 Thread Logan Gunthorpe
This patch implements sparsemem support for RISC-V, which helps pave the
way for memory hotplug and eventually P2P support.

We introduce Kconfig options for virtual and physical address bits which
are used to calculate the size of the vmemmap and set the
MAX_PHYSMEM_BITS.

The vmemmap is located directly before the VMALLOC region and sized
such that we can allocate enough pages to populate all the virtual
address space in the system (similar to the way it's done in arm64).

During initialization, call memblocks_present() and sparse_init(),
and provide a stub for vmemmap_populate() (all of which is similar to
arm64).
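
As a worked example of the sizing described above, the following stand-alone C sketch reproduces the arithmetic outside the kernel. The helper names are hypothetical, and the inputs used in the usage note (a 64-byte struct page and a PAGE_SHIFT of 12) are assumptions chosen for illustration rather than values taken from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Smallest shift such that (1 << shift) >= size. This mirrors what
 * ilog2(roundup_pow_of_two(size)) evaluates to for size > 0.
 */
static unsigned int struct_page_max_shift(size_t size)
{
	unsigned int shift = 0;

	assert(size > 0);
	while (((size_t)1 << shift) < size)
		shift++;
	return shift;
}

/*
 * VMEMMAP_SHIFT = VA_BITS - PAGE_SHIFT - 1 + STRUCT_PAGE_MAX_SHIFT:
 * half the virtual address space, counted in pages, times the
 * (rounded-up) size of one struct page.
 */
static unsigned int vmemmap_shift(unsigned int va_bits,
				  unsigned int page_shift,
				  size_t struct_page_size)
{
	return va_bits - page_shift - 1 +
	       struct_page_max_shift(struct_page_size);
}

/* Size of the vmemmap region in bytes. */
static uint64_t vmemmap_size(unsigned int va_bits,
			     unsigned int page_shift,
			     size_t struct_page_size)
{
	return (uint64_t)1 << vmemmap_shift(va_bits, page_shift,
					    struct_page_size);
}
```

With VA_BITS = 39, PAGE_SHIFT = 12 and an assumed 64-byte struct page, the shift is 39 - 12 - 1 + 6 = 32, so vmemmap_size(39, 12, 64) is 4 GiB of virtual address space reserved for the array.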

Signed-off-by: Logan Gunthorpe 
Reviewed-by: Palmer Dabbelt 
Cc: Albert Ou 
Cc: Andrew Waterman 
Cc: Olof Johansson 
Cc: Michael Clark 
Cc: Rob Herring 
Cc: Zong Li 
---
 arch/riscv/Kconfig | 23 +++
 arch/riscv/include/asm/pgtable.h   | 21 +
 arch/riscv/include/asm/sparsemem.h | 11 +++
 arch/riscv/kernel/setup.c  |  4 +++-
 arch/riscv/mm/init.c   |  8 
 5 files changed, 62 insertions(+), 5 deletions(-)
 create mode 100644 arch/riscv/include/asm/sparsemem.h

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index a344980287a5..a1b5d758a542 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -52,12 +52,32 @@ config ZONE_DMA32
bool
default y if 64BIT
 
+config VA_BITS
+   int
+   default 32 if 32BIT
+   default 39 if 64BIT
+
+config PA_BITS
+   int
+   default 34 if 32BIT
+   default 56 if 64BIT
+
 config PAGE_OFFSET
hex
default 0xC000 if 32BIT && MAXPHYSMEM_2GB
default 0x8000 if 64BIT && MAXPHYSMEM_2GB
default 0xffe0 if 64BIT && MAXPHYSMEM_128GB
 
+config ARCH_FLATMEM_ENABLE
+   def_bool y
+
+config ARCH_SPARSEMEM_ENABLE
+   def_bool y
+   select SPARSEMEM_VMEMMAP_ENABLE
+
+config ARCH_SELECT_MEMORY_MODEL
+   def_bool ARCH_SPARSEMEM_ENABLE
+
 config STACKTRACE_SUPPORT
def_bool y
 
@@ -92,6 +112,9 @@ config PGTABLE_LEVELS
 config HAVE_KPROBES
def_bool n
 
+config HAVE_ARCH_PFN_VALID
+   def_bool y
+
 menu "Platform type"
 
 choice
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 16301966d65b..e1162336f5ea 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -89,6 +89,23 @@ extern pgd_t swapper_pg_dir[];
 #define __S110 PAGE_SHARED_EXEC
 #define __S111 PAGE_SHARED_EXEC
 
+#define VMALLOC_SIZE (KERN_VIRT_SIZE >> 1)
+#define VMALLOC_END  (PAGE_OFFSET - 1)
+#define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)
+
+/*
+ * Roughly size the vmemmap space to be large enough to fit enough
+ * struct pages to map half the virtual address space. Then
+ * position vmemmap directly below the VMALLOC region.
+ */
+#define VMEMMAP_SHIFT \
+   (CONFIG_VA_BITS - PAGE_SHIFT - 1 + STRUCT_PAGE_MAX_SHIFT)
+#define VMEMMAP_SIZE   (1UL << VMEMMAP_SHIFT)
+#define VMEMMAP_END    (VMALLOC_START - 1)
+#define VMEMMAP_START  (VMALLOC_START - VMEMMAP_SIZE)
+
+#define vmemmap        ((struct page *)VMEMMAP_START)
+
 /*
  * ZERO_PAGE is a global shared page that is always zero,
  * used for zero-mapped memory areas, etc.
@@ -411,10 +428,6 @@ static inline void pgtable_cache_init(void)
/* No page table caches to initialize */
 }
 
-#define VMALLOC_SIZE (KERN_VIRT_SIZE >> 1)
-#define VMALLOC_END  (PAGE_OFFSET - 1)
-#define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)
-
 /*
  * Task size is 0x400 for RV64 or 0xb80 for RV32.
  * Note that PGDIR_SIZE must evenly divide TASK_SIZE.
diff --git a/arch/riscv/include/asm/sparsemem.h b/arch/riscv/include/asm/sparsemem.h
new file mode 100644
index ..215530b24336
--- /dev/null
+++ b/arch/riscv/include/asm/sparsemem.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ASM_SPARSEMEM_H
+#define __ASM_SPARSEMEM_H
+
+#ifdef CONFIG_SPARSEMEM
+#define MAX_PHYSMEM_BITS   CONFIG_PA_BITS
+#define SECTION_SIZE_BITS  30
+#endif /* CONFIG_SPARSEMEM */
+
+#endif /* __ASM_SPARSEMEM_H */
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index b2d26d9d8489..494c380e4ea6 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -205,6 +205,9 @@ static void __init setup_bootmem(void)
  PFN_PHYS(end_pfn - start_pfn),
  &memblock.memory, 0);
}
+
+   memblocks_present();
+   sparse_init();
 }
 
 void __init setup_arch(char **cmdline_p)
@@ -239,4 +242,3 @@ void __init setup_arch(char **cmdline_p)
 
riscv_fill_hwcap();
 }
-
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 58a522f9bcc3..5d529878667c 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -70,3 +70,11 @@ void free_initrd_mem(unsigned long start, unsigned long end)
 {
 }
 #endif /* CONFIG_BLK_DEV_INITRD */
+
+#ifdef C

Re: [PATCH 08/14] ARM64: dts: hisilicon: Add tsensor interrupt name

2018-10-15 Thread Daniel Lezcano


Hi Rob,

thanks for the review.

On 15/10/2018 18:28, Rob Herring wrote:
> On Tue, Sep 25, 2018 at 11:03:06AM +0200, Daniel Lezcano wrote:
>> Add the interrupt names for the sensors, so the code can rely on them
>> instead of dealing with index which are prone to error.
>>
>> The name comes from the Hisilicon documentation found on internet.
>>
>> Signed-off-by: Daniel Lezcano 
>> ---
>>  .../bindings/thermal/hisilicon-thermal.txt |  3 ++
>>  arch/arm64/boot/dts/hisilicon/hi3660.dtsi  | 63 +++---
>>  arch/arm64/boot/dts/hisilicon/hi6220.dtsi  |  1 +
>>  3 files changed, 36 insertions(+), 31 deletions(-)
> 
> Lots of whitespace errors reported by checkpatch.pl.

Yeah, I don't know why but I have something bad with my emacs
configuration when changing DT files. Will look at this.

>> diff --git a/Documentation/devicetree/bindings/thermal/hisilicon-thermal.txt b/Documentation/devicetree/bindings/thermal/hisilicon-thermal.txt
>> index cef716a..3edfae3 100644
>> --- a/Documentation/devicetree/bindings/thermal/hisilicon-thermal.txt
>> +++ b/Documentation/devicetree/bindings/thermal/hisilicon-thermal.txt
>> @@ -7,6 +7,7 @@
>>region.
>>  - interrupt: The interrupt number to the cpu. Defines the interrupt used
>>by /SOCTHERM/tsensor.
>> +- interrupt-names: The interrupt names for the different sensors
> 
> Need to define what the names are.
> 
>>  - clock-names: Input clock name, should be 'thermal_clk'.
>>  - clocks: phandles for clock specified in "clock-names" property.
>>  - #thermal-sensor-cells: Should be 1. See ./thermal.txt for a description.
>> @@ -18,6 +19,7 @@ for Hi6220:
>>  compatible = "hisilicon,tsensor";
>>  reg = <0x0 0xf7030700 0x0 0x1000>;
>>  interrupts = <0 7 0x4>;
>> +interrupt-names = "tsensor_intr";
> 
> That name seems pretty pointless.
> 
>>  clocks = <&sys_ctrl HI6220_TSENSOR_CLK>;
>>  clock-names = "thermal_clk";
>>  #thermal-sensor-cells = <1>;
>> @@ -28,5 +30,6 @@ for Hi3660:
>>  compatible = "hisilicon,hi3660-tsensor";
>>  reg = <0x0 0xfff3 0x0 0x1000>;
>>  interrupts = ;
>> +interrupt-names = "tsensor_a73";
> 
> Just 'a73' is sufficient.

This is the name defined in the board documentation to give a name to
the interrupt when requesting it. This one appears in /proc/interrupts.

I can replace the 'tsensor_intr' by 'tsensor', but if 'tsensor_a73' is
replaced by 'a73', it may look odd to see an interrupt line with the
'a73' name in /proc/interrupts.

However this is the preparation for the multiple sensors support so we
will have more interrupt names.

Is it possible to keep these names?

 - tsensor
 - tsensor_a73
 - tsensor_a57
 - tsensor_gpu




-- 
  Linaro.org │ Open source software for ARM SoCs




Re: [PATCH 10/14] ARM64: dts: hisilicon: Add interrupt names for the tsensors

2018-10-15 Thread Daniel Lezcano
On 15/10/2018 18:31, Rob Herring wrote:
> On Tue, Sep 25, 2018 at 11:03:08AM +0200, Daniel Lezcano wrote:
>> Add the missing interrupts for the temperature sensors as well as
>> their names.
>>
>> Signed-off-by: Daniel Lezcano 
>> ---
>>  Documentation/devicetree/bindings/thermal/hisilicon-thermal.txt | 8 ++--
> 
> Combine this and the previous binding change to 1 patch.

Ok.

>>  arch/arm64/boot/dts/hisilicon/hi3660.dtsi   | 8 ++--
>>  2 files changed, 12 insertions(+), 4 deletions(-)
> 
> And more checkpatch whitespace errors in this.
> 





Re: [PATCH v4 1/2] Bluetooth: Add device_get_bd_address()

2018-10-15 Thread Matthias Kaehlcke
Hi Marcel,

please let me know if any changes are needed to get this patch applied
to bluetooth-next.

Thanks

Matthias

On Thu, Oct 04, 2018 at 10:33:38AM -0700, Matthias Kaehlcke wrote:
> On Thu, Sep 27, 2018 at 10:13:05AM -0700, Matthias Kaehlcke wrote:
> > On Thu, Sep 27, 2018 at 12:47:06PM -0400, Sinan Kaya wrote:
> > > On 9/27/2018 12:41 PM, Balakrishna Godavarthi wrote:
> > > >   void bt_sock_reclassify_lock(struct sock *sk, int proto);
> > > > 
> > > > +int device_get_bd_address(struct device *dev, bdaddr_t *bd_addr);
> > > 
> > > Maybe change the API name to start with bt_ and get rid of device_?
> > 
> > device_ indicates that we get the BD_ADDR for a 'struct device' and
> > not for e.g. a 'struct fwnode_handle'.
> > 
> > Anyway with this version of the patch fwnode_get_bd_address() has been
> > scrapped and it might never be introduced again, so I'm open to change
> > the name to bt_ if there is a general preference for it.
> 
> Marcel, can you live with this being added to the Bluetooth code base
> instead of property? Also if you'd prefer the function to be named
> bt_get_bd_address() let me know.
> 
> Cheers
> 
> Matthias


Re: [PATCH v4 1/2] Bluetooth: Add device_get_bd_address()

2018-10-15 Thread Marcel Holtmann
Hi Matthias,

>>>>   void bt_sock_reclassify_lock(struct sock *sk, int proto);
>>>> 
>>>> +int device_get_bd_address(struct device *dev, bdaddr_t *bd_addr);
>>> 
>>> Maybe change the API name to start with bt_ and get rid of device_?
>> 
>> device_ indicates that we get the BD_ADDR for a 'struct device' and
>> not for e.g. a 'struct fwnode_handle'.
>> 
>> Anyway with this version of the patch fwnode_get_bd_address() has been
>> scrapped and it might never be introduced again, so I'm open to change
>> the name to bt_ if there is a general preference for it.
> 
> Marcel, can you live with this being added to the Bluetooth code base
> instead of property? Also if you'd prefer the function to be named
> bt_get_bd_address() let me know.

explain to me again why this is useful? I am failing to see the benefit if this 
is not part of the property.h API.

Regards

Marcel



Re: [RFC] Allow user namespace inside chroot

2018-10-15 Thread Nagarathnam Muthusamy




On 10/15/2018 10:42 AM, ebied...@xmission.com wrote:

Have you considered using pivot_root to drop all of the pieces of the
filesystem you don't want to be visible?  That should be a much better
solution overall.

It is just a matter of:
mount --bind /path/you/would/chroot/to
pivot_root /path/you/would/chroot/to /put/old
umount -l /put/old

You might need to do something like mount --make-rprivate before calling
pivot_root to stop mount propagation to the parent.  But I can't
imagine it to be a practical problem.


Also note that being in a chroot tends to indicate one of two things,
being in an old build system, or being in some kind of chroot jail.
Because of the jails created with chroot we want to be very careful
with enabling user namespaces in that context.

There have been some very clever people figuring out how to get out of
chroot jails by passing file descriptors between processes and using
things like pivot root.

Even if your analysis is semantically perfect there is the issue of
increasing the attack surface of preexisting chroot jails.  I believe
that would make the kernel more vulnerable overall, and for only
a very small simplification of implementation details.

So unless I am missing something I don't see the use case for this that
would not be better served by just properly setting up your mount
namespace, and the attack surface increase of chroot jails makes me
very reluctant to see a change like this.


Thanks a lot for the feedback! I will work on solving the issue with
pivot_root and mount namespace combination.

Thanks,
Nagarathnam.

Eric

nagarathnam.muthus...@oracle.com writes:


From: Nagarathnam Muthusamy 

The following commit disables the creation of user namespace inside
the chroot environment.

userns: Don't allow creation if the user is chrooted

commit 3151527ee007b73a0ebd296010f1c0454a919c7d

Consider a system in which a non-root user creates a combination
of user, pid and mount namespaces and confines a process to it.
The system will have multiple levels of nested namespaces.
The root namespace in the system will have lots of directories
which should not be exposed to the child confined to the set of
namespaces.

Without chroot, we will have to hide all unwanted directories
individually using bind mounts and mount namespace. Chroot enables
us to expose a handpicked list of directories which the child
can see but if we use chroot we won't be able to create nested
namespaces.

Allowing a process to create user namespace within a chroot
environment will enable it to chroot, which in turn can be used
to escape the jail.

This patch drops the chroot privilege when user namespace is
created within the chroot environment so the process cannot
use it to escape the chroot jail. The process can still modify
the view of the file system using mount namespace but for those
modifications to be useful, it needs to run a setuid program with
that intended uid directly mapped into the user namespace as it is,
which is not possible for an unprivileged process.

If there were any other corner cases which were considered while
deciding to disable the creation of user namespace as a whole
within the chroot environment please let me know.

Signed-off-by: Nagarathnam Muthusamy
---
  kernel/user_namespace.c | 22 +-
  1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index e5222b5..83d2a70 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -44,7 +44,7 @@ static void dec_user_namespaces(struct ucounts *ucounts)
return dec_ucount(ucounts, UCOUNT_USER_NAMESPACES);
  }
  
-static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns)

+static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns, int is_chrooted)
  {
/* Start with the same capabilities as init but useless for doing
 * anything as the capabilities are bound to the new user namespace.
@@ -55,6 +55,11 @@ static void set_cred_user_ns(struct cred *cred, struct user_namespace *user_ns)
cred->cap_effective = CAP_FULL_SET;
cred->cap_ambient = CAP_EMPTY_SET;
cred->cap_bset = CAP_FULL_SET;
+   if (is_chrooted) {
+   cap_lower(cred->cap_permitted, CAP_SYS_CHROOT);
+   cap_lower(cred->cap_effective, CAP_SYS_CHROOT);
+   cap_lower(cred->cap_bset, CAP_SYS_CHROOT);
+   }
  #ifdef CONFIG_KEYS
key_put(cred->request_key_auth);
cred->request_key_auth = NULL;
@@ -78,6 +83,7 @@ int create_user_ns(struct cred *new)
kgid_t group = new->egid;
struct ucounts *ucounts;
int ret, i;
+   int is_chrooted = 0;
  
  	ret = -ENOSPC;

if (parent_ns->level > 32)
@@ -88,14 +94,12 @@ int create_user_ns(struct cred *new)
goto fail;
  
  	/*

-* Verify that we can not violate the policy of which files
-* may be accessed that is specified by the root directory,
-*

Re: [RFC] Allow user namespace inside chroot

2018-10-15 Thread Eric W. Biederman
Jann Horn  writes:

> On Mon, Oct 15, 2018 at 7:10 PM  wrote:
>> @@ -1281,7 +1285,7 @@ static int userns_install(struct nsproxy *nsproxy, struct ns_common *ns)
>> return -ENOMEM;
>>
>> put_user_ns(cred->user_ns);
>> -   set_cred_user_ns(cred, get_user_ns(user_ns));
>> +   set_cred_user_ns(cred, get_user_ns(user_ns), 0);
>
> This looks bogus. With this, I think your restriction can be bypassed
> if process A forks a child B, B creates a new user namespace, then A
> enters the user namespace with setns() and has full capabilities. Am I
> missing something?

Nope.  I feel silly for missing that angle.

Even without the full capabilities the userns_install angle will place
you at the root of the mount namespace outside of the chroot.

At which point I have visions of the special cases multiplying like
bunnies to make this work.  Without a very strong case I don't like this at all.

Eric





Re: WARNING in ext4_invalidatepage

2018-10-15 Thread Theodore Y. Ts'o
On Mon, Oct 15, 2018 at 03:22:42PM +0200, Dmitry Vyukov wrote:
> Now that you mention EXT4_IOC_SWAP_BOOT, I think I looked at the wrong
> program, there is a subsequent one that does ioctl(0x6611) where
> 0x6611 is in fact EXT4_IOC_SWAP_BOOT. So I think it's this one:
> 
> 05:23:28 executing program 5:
> r0 = creat(&(0x7f0001c0)='./file0\x00', 0x0)
> socketpair$unix(0x1, 0x1, 0x0, &(0x7f000380)={0x,
> 0x})
> write$RDMA_USER_CM_CMD_CREATE_ID(r0, &(0x7f000240)={0x0, 0x18,
> 0xfa00, {0x0, &(0x7f000200)}}, 0x20)
> ioctl$PERF_EVENT_IOC_ENABLE(r1, 0x8912, 0x400200)
> ioctl$EXT4_IOC_SETFLAGS(r0, 0x6611, &(0x7f00)=0x4000)

Ah, so it is a bug in Syzkaller that it is printing
ioctl$EXT4_IOC_SETFLAGS when 0x6611 is in fact EXT4_IOC_SWAP_BOOT,
right?
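
To make the mismatch concrete, here is a small stand-alone sketch that splits an ioctl command number into its fields. It assumes the asm-generic encoding (8-bit nr, 8-bit type, 14-bit size, 2-bit direction); some architectures lay out the upper bits differently, and the helper names are invented for this example rather than taken from any kernel header.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout of the asm-generic ioctl encoding (assumed here). */
enum {
	SK_NR_BITS = 8,
	SK_TYPE_BITS = 8,
	SK_SIZE_BITS = 14,
	SK_DIR_BITS = 2,
	SK_NR_SHIFT = 0,
	SK_TYPE_SHIFT = SK_NR_SHIFT + SK_NR_BITS,     /* 8 */
	SK_SIZE_SHIFT = SK_TYPE_SHIFT + SK_TYPE_BITS, /* 16 */
	SK_DIR_SHIFT = SK_SIZE_SHIFT + SK_SIZE_BITS   /* 30 */
};

/* Direction values in the asm-generic encoding. */
enum sk_ioctl_dir { SK_DIR_NONE = 0, SK_DIR_WRITE = 1, SK_DIR_READ = 2 };

struct sk_ioctl_fields {
	unsigned int dir;
	unsigned int size;
	unsigned int type;
	unsigned int nr;
};

/* Build a command number from its fields. */
static uint32_t sk_ioctl_encode(unsigned int dir, unsigned int type,
				unsigned int nr, unsigned int size)
{
	assert(dir < (1u << SK_DIR_BITS));
	assert(type < (1u << SK_TYPE_BITS));
	assert(nr < (1u << SK_NR_BITS));
	assert(size < (1u << SK_SIZE_BITS));

	return ((uint32_t)dir << SK_DIR_SHIFT) |
	       ((uint32_t)size << SK_SIZE_SHIFT) |
	       ((uint32_t)type << SK_TYPE_SHIFT) |
	       ((uint32_t)nr << SK_NR_SHIFT);
}

/* Split a command number back into its fields. */
static struct sk_ioctl_fields sk_ioctl_decode(uint32_t cmd)
{
	struct sk_ioctl_fields f;

	f.nr = (cmd >> SK_NR_SHIFT) & ((1u << SK_NR_BITS) - 1);
	f.type = (cmd >> SK_TYPE_SHIFT) & ((1u << SK_TYPE_BITS) - 1);
	f.size = (cmd >> SK_SIZE_SHIFT) & ((1u << SK_SIZE_BITS) - 1);
	f.dir = (cmd >> SK_DIR_SHIFT) & ((1u << SK_DIR_BITS) - 1);
	return f;
}
```

Decoding 0x6611 under this layout gives type 'f' (0x66), nr 17, no direction and no size, which is the shape of an _IO('f', 17) command. A write-direction command on 'f' with nr 2 and an 8-byte argument, which is how I recall the SETFLAGS ioctl being defined on 64-bit systems, would encode to 0x40086602 instead, so the two numbers cannot be confused by value.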

> I've tried to manually replay this program and the whole log too, but
> it does not reproduce. This may be related to the fact that filesystem
> accumulates too much global state, so probably first relevant part
> happened long time ago, and then second relevant part happened later
> and triggered the warning. But just re-doing the second part does not
> reproduce the bug.

It was probably some other process racing with EXT4_IOC_SWAP_BOOT.
The patch I referenced in my previous e-mail protects against
additional scenarios where someone might be trying to punch a hole
into a file that is being swapped into the bootloader ioctl.  This
particular ioctl isn't yet being used by anyone, so it had some other
issues as well, such as not interacting well with inline_data-enabled
file systems --- not that any bootloader would be small enough that it
would fit in an inline_data inode, but we're basically proofing the
code against a malicious (or buggy) root-privileged program... such as
syzbot.  :-)

- Ted


Re: [PATCH] of: overlay: user space synchronization

2018-10-15 Thread Frank Rowand
On 10/15/18 01:24, Geert Uytterhoeven wrote:
> Hi Frank,
> 
> On Mon, Oct 15, 2018 at 3:36 AM  wrote:
>> From: Frank Rowand 
>>
>> When an overlay is applied or removed, the live devicetree visible in
>> /proc/device-tree/, aka /sys/firmware/devicetree/base/, reflects the
>> changes.  There is no method for user space to determine whether the
>> live devicetree was modified by overlay actions.
>>
>> Provide a sysfs file, /sys/firmware/devicetree/tree_version,  to allow
>> user space to determine if the live devicetree has remained unchanged
>> while a series of one or more accesses of /proc/device-tree/ occur.
> 
> Thanks for your patch!
> 
>> The use of both dynamic devicetree modifications and overlay apply and
>> removal are not supported during the same boot cycle.  Thus non-overlay
>> dynamic modifications are not reflected in the value of tree_version.
> 
> What does this mean exactly, for users?
> I am used to applying and removing overlays at run time (still
> carrying Pantelis'
> overlay configfs patches), but when would I use non-overlay dynamic
> modifications?

To find some examples, 'git grep of_add_property'.  (This is not a
comprehensive list, there are other similar functions.)

Some Powerpc systems use dynamic modifications of device trees, for
example adding and removing cpus.

Kexec uses dynamic modifications just before booting a new
kernel (so no interference with overlays).  Devicetree
unittest uses it, with no interference with overlays.

There are also a few places platform code or a driver uses dynamic
modification.  Possible conflicts with overlays are:
   arch/arm/mach-mvebu/coherency.c
   arch/arm/mach-omap2/timer.c

rcar_du_of.c is a known use that is grandfathered in.  It currently
is not compatible with overlays.

drivers/macintosh/smu.c should not be an issue because I don't
expect any macintosh overlay.

Some of the reasons for not supporting both overlays and other
dynamic modifications on the same system might be possible to
resolve with additional code, but some might be difficult or
impossible to resolve.  So that restriction might be loosened
or removed in the future.  Some of the reasons are:

  - dynamic modifications do not use the same locking mechanism
as overlay apply and removal, thus the devicetree could be
corrupted

  - dynamic modifications of portions of the devicetree that
are the result of applying an overlay will (may?) cause
removal of the overlay to fail (or devicetree corruption?)

  - (future concern: ) static validation of an overlay (or set
of overlays) against a specific base devicetree would not
be valid if the base devicetree is further modified by
dynamic modifications - this is theoretical since validation
is a currently under development feature and we do not know
what the final feature will look like

The plan for overlays has been to add specific use models (or
functionality or features) in a limited fashion initially, to
ensure that each feature is implemented to a sufficient degree
(in other words is not a hack that only works in limited
circumstances, such as the correct phase of the moon), works
robustly and is maintainable.  Then as each feature or set
of features is found to be good enough, add more features.

I suspect that dynamic modification in general will remain not
compatible with overlays, with limited exceptions.  Possible
exceptions would require stringent review and auditing, and
could include devicetree unittest (even this one makes me
nervous) and some platform code (especially early boot code).


>> --- a/Documentation/ABI/testing/sysfs-firmware-ofw
>> +++ b/Documentation/ABI/testing/sysfs-firmware-ofw
>>
>> +What:  /sys/firmware/devicetree/tree_version
>> +Date:  October 2018
>> +KernelVersion: 4.20
>> +Contact:   Frank Rowand , devicet...@vger.kernel.org
>> +Description:
>> +   When an overlay is applied or removed, the live devicetree
>> +   visible in /proc/device-tree/, aka
>> +   /sys/firmware/devicetree/base/, reflects the changes.
>> +
>> +   tree_version provides a way for user space to determine if the
>> +   live devicetree has remained unchanged while a series of one
>> +   or more accesses of /proc/device-tree/ occur.
>> +
>> +   The use of both dynamic devicetree modifications and overlay
>> +   apply and removal are not supported during the same boot
>> +   cycle.  Thus non-overlay dynamic modifications are not
>> +   reflected in the value of tree_version.
>> +
>> +   Example shell use of tree_version:
>> +
>> +   done=1
>> +
>> +   while [ $done = 1 ] ; do
>> +
>> +  pre_version=$(cat /sys/firmware/devicetree/tree_version)
>> +  version=$pre_version
>> +  while [ $(( ${version} & 1 )) != 0 ] ; do
>> + # echo is optional, s
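
The shell example above is cut off in this archive. The same retry idea can be sketched in C under two assumptions that are inferred from the visible fragment rather than confirmed by it: that tree_version reads as a decimal counter, and that an odd value means an overlay change is in progress. The function names and the callback type are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Read an unsigned counter from a text file; returns 0 on success. */
static int read_version(const char *path, unsigned long *out)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%lu", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Caller-supplied function that walks the devicetree; 0 on success. */
typedef int (*scan_fn)(void *ctx);

/* Example callback for illustration: only counts how often it ran. */
static int count_scan(void *ctx)
{
	(*(int *)ctx)++;
	return 0;
}

/*
 * Run scan() and accept its result only if the version was even
 * (assumed: no change in progress) and identical before and after.
 * Returns 0 on a consistent scan, -1 on error or after max_retries.
 */
static int read_tree_consistently(const char *version_path, scan_fn scan,
				  void *ctx, int max_retries)
{
	int attempt;

	assert(version_path && scan);

	for (attempt = 0; attempt < max_retries; attempt++) {
		unsigned long before, after;

		if (read_version(version_path, &before))
			return -1;
		if (before & 1)
			continue; /* change in progress, try again */
		if (scan(ctx))
			return -1;
		if (read_version(version_path, &after))
			return -1;
		if (after == before)
			return 0; /* tree did not change during scan */
	}
	return -1;
}
```

A real caller would pass the sysfs path named in the patch and would probably sleep briefly between attempts instead of retrying immediately; this sketch leaves that out to stay short.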

Re: [PATCH] kernel/signal: Signal-based pre-coredump notification

2018-10-15 Thread Enke Chen
Hi, Greg:

> Shouldn't there also be a manpage update, and a kselftest added for this
> new user/kernel api that is being created?
> 

I will submit a patch for manpage update once the code is accepted.

Regarding the kselftest, I am not sure.  Once the prctl() is limited to
self (which I will do), the logic would be pretty straightforward. Not
sure if the selftest would add much value.

Thanks.  -- Enke

On 10/12/18 11:40 PM, Greg Kroah-Hartman wrote:
> On Fri, Oct 12, 2018 at 05:33:35PM -0700, Enke Chen wrote:
>> For simplicity and consistency, this patch provides an implementation
>> for signal-based fault notification prior to the coredump of a child
>> process. A new prctl command, PR_SET_PREDUMP_SIG, is defined that can
>> be used by an application to express its interest and to specify the
>> signal (SIGCHLD or SIGUSR1 or SIGUSR2) for such a notification. A new
>> signal code (si_code), CLD_PREDUMP, is also defined for SIGCHLD.
>>
>> Background:
>>
>> As the coredump of a process may take time, in certain time-sensitive
>> applications it is necessary for a parent process (e.g., a process
>> manager) to be notified of a child's imminent death before the coredump
>> so that the parent process can act sooner, such as re-spawning an
>> application process, or initiating a control-plane fail-over.
>>
>> Currently there are two ways for a parent process to be notified of a
>> child process's state change. One is to use the POSIX signal, and
>> another is to use the kernel connector module. The specific events and
>> actions are summarized as follows:
>>
> >> Process Event    POSIX Signal                Connector-based
> >> ----------------------------------------------------------------------
> >> ptrace_attach()  do_notify_parent_cldstop()  proc_ptrace_connector()
> >>                  SIGCHLD / CLD_STOPPED
> >>
> >> ptrace_detach()  do_notify_parent_cldstop()  proc_ptrace_connector()
> >>                  SIGCHLD / CLD_CONTINUED
> >>
> >> pre_coredump/    N/A                         proc_coredump_connector()
> >> get_signal()
> >>
> >> post_coredump/   do_notify_parent()          proc_exit_connector()
> >> do_exit()        SIGCHLD / exit_signal
> >> ----------------------------------------------------------------------
>>
>> As shown in the table, the signal-based pre-coredump notification is not
>> currently available. In some cases using a connector-based notification
>> can be quite complicated (e.g., when a process manager is written in shell
>> scripts and thus is subject to certain inherent limitations), and a
>> signal-based notification would be simpler and better suited.
>>
>> Signed-off-by: Enke Chen 
>> ---
>>  arch/x86/kernel/signal_compat.c|  2 +-
>>  include/linux/sched.h  |  4 ++
>>  include/linux/signal.h |  5 +++
>>  include/uapi/asm-generic/siginfo.h |  3 +-
>>  include/uapi/linux/prctl.h |  4 ++
>>  kernel/fork.c  |  1 +
>>  kernel/signal.c| 51 +
>>  kernel/sys.c   | 77 
>> ++
>>  8 files changed, 145 insertions(+), 2 deletions(-)
> 
> Shouldn't there also be a manpage update, and a kselftest added for this
> new user/kernel api that is being created?
> 
> thanks,
> 
> greg k-h
> 


Re: [PATCH] input: touchscreen: fix wm97xx-ts exit path

2018-10-15 Thread Dmitry Torokhov
On Sat, Oct 13, 2018 at 10:06:18AM -0700, Randy Dunlap wrote:
> From: Randy Dunlap 
> 
> Loading then unloading wm97xx-ts.ko when CONFIG_AC97_BUS=m
> causes a WARNING: from drivers/base/driver.c:
> 
> Unexpected driver unregister!
> WARNING: CPU: 0 PID: 1709 at ../drivers/base/driver.c:193 driver_unregister+0x30/0x40
> 
> Fix this by only calling driver_unregister() with the same
> condition that driver_register() is called.
> 
> Fixes: ae9d1b5fbd7b ("Input: wm97xx: add new AC97 bus support")
> 
> Signed-off-by: Randy Dunlap 
> Cc: Robert Jarzmik 
> Cc: Charles Keepax 
> Cc: Dmitry Torokhov 
> Cc: Mark Brown 
> Cc: patc...@opensource.cirrus.com
> Cc: linux-in...@vger.kernel.org

Applied, thank you.

> ---
>  drivers/input/touchscreen/wm97xx-core.c |3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> --- lnx-419-rc7.orig/drivers/input/touchscreen/wm97xx-core.c
> +++ lnx-419-rc7/drivers/input/touchscreen/wm97xx-core.c
> @@ -929,7 +929,8 @@ static int __init wm97xx_init(void)
>  
>  static void __exit wm97xx_exit(void)
>  {
> - driver_unregister(&wm97xx_driver);
> + if (IS_BUILTIN(CONFIG_AC97_BUS))
> + driver_unregister(&wm97xx_driver);
>   platform_driver_unregister(&wm97xx_mfd_driver);
>  }
>  
> 
> 

-- 
Dmitry


Re: [PATCH] Input: atmel_mxt_ts - mark expected switch fall-through

2018-10-15 Thread Dmitry Torokhov
On Mon, Oct 08, 2018 at 07:03:55PM +0200, Gustavo A. R. Silva wrote:
> In preparation to enabling -Wimplicit-fallthrough, mark switch cases
> where we are expecting to fall through.
> 
> Signed-off-by: Gustavo A. R. Silva 

Applied, thank you.

> ---
>  drivers/input/touchscreen/atmel_mxt_ts.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/input/touchscreen/atmel_mxt_ts.c b/drivers/input/touchscreen/atmel_mxt_ts.c
> index bbc122f..d3aacd5 100644
> --- a/drivers/input/touchscreen/atmel_mxt_ts.c
> +++ b/drivers/input/touchscreen/atmel_mxt_ts.c
> @@ -488,7 +488,7 @@ static int mxt_lookup_bootloader_address(struct mxt_data *data, bool retry)
>   bootloader = appmode - 0x24;
>   break;
>   }
> - /* Fall through for normal case */
> + /* Fall through - for normal case */
>   case 0x4c:
>   case 0x4d:
>   case 0x5a:
> -- 
> 2.7.4
> 

-- 
Dmitry
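
For context, the pattern being annotated can be reduced to a small stand-alone function. This is a simplified sketch and not the driver code: the function name and the use_alt_scheme parameter are hypothetical, and the case labels above the fall-through, the 0x5b label and the 0x26 offset are illustrative values rather than something visible in the hunk. As far as I understand GCC's -Wimplicit-fallthrough comment matching, a comment of the form "Fall through - ..." is recognised while free text directly after "Fall through" is not, which would explain the one-character change.

```c
/*
 * Map an application-mode address to a bootloader address.
 * Returns -1 for addresses this sketch does not know about.
 */
static int lookup_bootloader_address(int appmode, int use_alt_scheme)
{
	int bootloader;

	switch (appmode) {
	case 0x4a:
	case 0x4b:
		if (use_alt_scheme) {
			bootloader = appmode - 0x24;
			break;
		}
		/* Fall through - for normal case */
	case 0x4c:
	case 0x4d:
	case 0x5a:
	case 0x5b:
		bootloader = appmode - 0x26;
		break;
	default:
		return -1;
	}

	return bootloader;
}
```

The first group of cases only leaves the switch early when the alternative scheme applies; otherwise execution deliberately continues into the second group, and the comment documents that intent for both the reader and the compiler.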


Re: [PATCH] compiler.h: update definition of unreachable()

2018-10-15 Thread Miguel Ojeda
On Mon, Oct 15, 2018 at 7:22 PM  wrote:
>
> Fixes the objtool warning seen with Clang:
> arch/x86/mm/fault.o: warning: objtool: no_context()+0x220: unreachable
> instruction
>
> Fixes commit 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h
> mutually exclusive")
>
> Josh noted that the fallback definition was meant to work around a
> pre-gcc-4.6 bug. GCC still needs to work around
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82365, so compiler-gcc.h
> defines its own version of unreachable().  Clang and ICC can use this
> shared definition.

Could we, at the same time, update the comment on compiler-gcc.h as
well? i.e. remove the 4.5 comment, add the link to the GCC PR.
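
For context, a minimal userspace sketch of what an unreachable()
annotation buys you. This assumes a compiler that provides
__builtin_unreachable() (GCC and Clang both do); ex_unreachable() is a
hypothetical local macro and not the kernel's definition, which carries
extra objtool annotations.

```c
/* Hypothetical local macro; the kernel's unreachable() does more than this. */
#define ex_unreachable() __builtin_unreachable()

enum ex_color { EX_RED, EX_GREEN };

int ex_color_code(enum ex_color c)
{
	switch (c) {
	case EX_RED:
		return 1;
	case EX_GREEN:
		return 2;
	}

	/*
	 * Every enumerator returns above. Tell the compiler that control
	 * never gets here, so it neither warns about a missing return nor
	 * has to emit code for this path.
	 */
	ex_unreachable();
}
```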

>
> Link: https://github.com/ClangBuiltLinux/linux/issues/204
> Suggested-by: Andy Lutomirski 
> Suggested-by: Josh Poimboeuf 
> Tested-by: Nathan Chancellor 
> Signed-off-by: Nick Desaulniers 
> ---
> Miguel, would you mind taking this up in your new compiler attributes
> tree?

Sure, will do.

Thanks,

Cheers,
Miguel


Re: [PATCH] Input: cyapa - mark expected switch fall-throughs

2018-10-15 Thread Dmitry Torokhov
On Mon, Oct 08, 2018 at 05:38:24PM +0200, Gustavo A. R. Silva wrote:
> In preparation to enabling -Wimplicit-fallthrough, mark switch cases
> where we are expecting to fall through.
> 
> Notice that in this particular case, I replaced the "Fallthrough state"
> comment with a proper "Fall through", which is what GCC is expecting to
> find.
> 
> Addresses-Coverity-ID: 114758 ("Missing break in switch")
> Addresses-Coverity-ID: 114759 ("Missing break in switch")
> Signed-off-by: Gustavo A. R. Silva 

Applied, thank you.

> ---
>  drivers/input/mouse/cyapa_gen3.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/input/mouse/cyapa_gen3.c 
> b/drivers/input/mouse/cyapa_gen3.c
> index 076dda4..00e395d 100644
> --- a/drivers/input/mouse/cyapa_gen3.c
> +++ b/drivers/input/mouse/cyapa_gen3.c
> @@ -1067,7 +1067,7 @@ static int cyapa_gen3_do_operational_check(struct cyapa 
> *cyapa)
>   return error;
>   }
>  
> - /* Fallthrough state */
> + /* Fall through */
>   case CYAPA_STATE_BL_IDLE:
>   /* Try to get firmware version in bootloader mode. */
>   cyapa_gen3_bl_query_data(cyapa);
> @@ -1078,7 +1078,7 @@ static int cyapa_gen3_do_operational_check(struct cyapa 
> *cyapa)
>   return error;
>   }
>  
> - /* Fallthrough state */
> + /* Fall through */
>   case CYAPA_STATE_OP:
>   /*
>* Reading query data before going back to the full mode
> -- 
> 2.7.4
> 

-- 
Dmitry


Re: linux-next: Tree for Oct 15

2018-10-15 Thread Guenter Roeck
On Mon, Oct 15, 2018 at 07:25:46PM +1100, Stephen Rothwell wrote:
> Hi all,
> 
> Changes since 20181012:
> 
> My qemu boots of a powerpc pseries_le_defconfig kernel failed today.
> 

Same here. Interestingly, this only affects little endian pseries
boots; big endian works fine. I'll try to bisect later.

All ppc qemu tests (including big endian pseries) also generate a warning.

WARNING: CPU: 0 PID: 0 at mm/memblock.c:1301 .memblock_alloc_range_nid+0x20/0x68
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.0-rc7-next-20181015 #1
NIP:  c0f99198 LR: c0f99490 CTR: c0bb8364
REGS: c1217a78 TRAP: 0700   Not tainted  (4.19.0-rc7-next-20181015)
MSR:  80021000   CR: 24000422  XER: 2000
IRQMASK: 1 
GPR00: c0f99490 c1217d00 c121a500 00c0 
GPR04:     
GPR08:  00c0 0018 00b7 
GPR12: 0040 c0fe7840   
GPR16:     
GPR20:     
GPR24:     
GPR28: c304 c1262088 00c0 c0fea500 
NIP [c0f99198] .memblock_alloc_range_nid+0x20/0x68
LR [c0f99490] .memblock_alloc_base+0x18/0x48
Call Trace:
[c1217d00] [c2a0] 0xc2a0 (unreliable)
[c1217d80] [c0f99490] .memblock_alloc_base+0x18/0x48
[c1217df0] [c0f7a274] .allocate_paca_ptrs+0x3c/0x74
[c1217e70] [c0f78bf0] .early_init_devtree+0x288/0x320
[c1217f10] [c0f79b6c] .early_setup+0x80/0x130
[c1217f90] [c528] start_here_multiplatform+0x68/0x80


sparc images crash, starting with next-20181009. Bisect with
next-20181012 points to the merge of devicetree/for-next, though
devicetree/for-next itself does not have the problem (bisect log
attached below). The crash is in devicetree code.

Crash logs:
https://kerneltests.org/builders/qemu-sparc64-next/builds/981/steps/qemubuildcommand_1/logs/stdio
https://kerneltests.org/builders/qemu-sparc-next/builds/975/steps/qemubuildcommand_1/logs/stdio

Guenter

---
# bad: [774ea0551a2966c8fc29a6f675c3e28c5c6fa586] Add linux-next specific files 
for 20181012
# good: [0238df646e6224016a45505d2c111a24669ebe21] Linux 4.19-rc7
git bisect start 'HEAD' 'v4.19-rc7'
# good: [dfbf78faefa3c26d94208398e62bf25ea798e7f2] Merge remote-tracking branch 
'spi-nor/spi-nor/next'
git bisect good dfbf78faefa3c26d94208398e62bf25ea798e7f2
# bad: [3f296bb430327676912966c56d2f078f74e6b4ab] Merge remote-tracking branch 
'tip/auto-latest'
git bisect bad 3f296bb430327676912966c56d2f078f74e6b4ab
# good: [efad9cbc89fbef3c4b3905e1c01a8191eae4c772] Merge remote-tracking branch 
'sound/for-next'
git bisect good efad9cbc89fbef3c4b3905e1c01a8191eae4c772
# good: [7d12a265b24001fbff1ff260c2f6bd802224a7c0] Merge remote-tracking branch 
'iommu/next'
git bisect good 7d12a265b24001fbff1ff260c2f6bd802224a7c0
# good: [4fc72c0ef3c1e792caf06d25ef68c7c871730e31] Merge branch 'ras/core'
git bisect good 4fc72c0ef3c1e792caf06d25ef68c7c871730e31
# good: [d74865bd3996c7a6f3e8ce6e626c1fe474e39494] Merge branch 'x86/mm'
git bisect good d74865bd3996c7a6f3e8ce6e626c1fe474e39494
# bad: [1b1ab6a98adab8a0436024b369305a978e365a13] Merge remote-tracking branch 
'mailbox/mailbox-for-next'
git bisect bad 1b1ab6a98adab8a0436024b369305a978e365a13
# good: [389d0a8a7af8ff8bb6301382333c7e8f748d7cd6] Merge branch 
'dt/cpu-type-rework' into dt/next
git bisect good 389d0a8a7af8ff8bb6301382333c7e8f748d7cd6
# good: [4355151de47c2b4bc72c026ee743bd9ed7f71ba3] Merge branch 'all-dtbs' into 
dt/next
git bisect good 4355151de47c2b4bc72c026ee743bd9ed7f71ba3
# good: [60d744213fd9433b10b23afafb694a44c8e96cb8] Merge remote-tracking branch 
'vfio/next'
git bisect good 60d744213fd9433b10b23afafb694a44c8e96cb8
# good: [9f0a0a381c5db56e7922dbeea6831f27db58372f] mailbox: mediatek: Add check 
for possible failure of kzalloc
git bisect good 9f0a0a381c5db56e7922dbeea6831f27db58372f
# good: [157b4129ded8ba756ef17c058192e734889673e4] dt-bindings: arm: fsl: Move 
DCFG and SCFG bindings to their own docs
git bisect good 157b4129ded8ba756ef17c058192e734889673e4
# bad: [bed61948ea6c57bc73fb3ded9421c1bdd8cbe4d9] Merge remote-tracking branch 
'devicetree/for-next'
git bisect bad bed61948ea6c57bc73fb3ded9421c1bdd8cbe4d9
# good: [d81cc4a8e47219fbe60d49446f04ed3e9c1657d9] dt-bindings: arm: zte: Move 
sysctrl bindings to their own doc
git bisect good d81cc4a8e47219fbe60d49446f04ed3e9c1657d9
# first bad commit: [bed61948ea6c57bc73fb3ded9421c1bdd8cbe4d9] Merge 
remote-tracking branch 'devicetree/for-next'



Re: [PATCH 0/4] gsmi: Google specific firmware patches

2018-10-15 Thread Greg Kroah-Hartman
On Sat, Oct 13, 2018 at 03:44:45PM -0700, Guenter Roeck wrote:
> On Fri, Oct 12, 2018 at 9:04 AM Ross Zwisler  wrote:
> 
> > This series contains some Google specific firmware patches that we've
> > been carrying out of tree.  I've updated the changelog for each so that
> > it is suitable for upstream, and I've retested them to make sure I know
> > what the patches are doing.
> >
> > Duncan Laurie (3):
> >   gsmi: Fix bug in append_to_eventlog sysfs handler
> >   gsmi: Add coreboot to list of matching BIOS vendors
> >   gsmi: Remove autoselected dependency on EFI and EFI_VARS
> >
> > Furquan Shaikh (1):
> >   gsmi: Add GSMI commands to log S0ix info
> >
> >
> For the series: Reviewed-by: Guenter Roeck 
> 
> Greg, please let me know if I should send this for every patch.

Nope, this works.

greg k-h


Re: WARNING in usb_submit_urb (3)

2018-10-15 Thread syzbot

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger  
crash:


Reported-and-tested-by:  
syzbot+24a30223a4b609bb8...@syzkaller.appspotmail.com


Tested on:

commit: f0a7d1883d9f afs: Fix clearance of reply
git tree:   upstream
kernel config:  https://syzkaller.appspot.com/x/.config?x=b3f55cb3dfcc6c33
compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
patch:  https://syzkaller.appspot.com/x/patch.diff?x=164eab9140

Note: testing is done by a robot and is best-effort only.


Re: [RFC][PATCH] perf: Rewrite core context handling

2018-10-15 Thread Stephane Eranian
Hi,

On Mon, Oct 15, 2018 at 10:29 AM Alexey Budankov
 wrote:
>
>
> Hi,
> On 15.10.2018 11:34, Peter Zijlstra wrote:
> > On Mon, Oct 15, 2018 at 10:26:06AM +0300, Alexey Budankov wrote:
> >> Hi,
> >>
> >> On 10.10.2018 13:45, Peter Zijlstra wrote:
> >>> Hi all,
> >>>
> >>> There have been various issues and limitations with the way perf uses
> >>> (task) contexts to track events. Most notable is the single hardware PMU
> >>> task context, which has resulted in a number of yucky things (both
> >>> proposed and merged).
> >>>
> >>> Notably:
> >>>
> >>>  - HW breakpoint PMU
> >>>  - ARM big.little PMU
> >>>  - Intel Branch Monitoring PMU
> >>>
> >>> Since we now track the events in RB trees, we can 'simply' add a pmu
> >>> order to them and have them grouped that way, reducing to a single
> >>> context. Of course, reality never quite works out that simple, and below
> >>> ends up adding an intermediate data structure to bridge the context ->
> >>> pmu mapping.
> >>>
> >>> Something a little like:
> >>>
> >>>   ,[1:n]-.
> >>>   V  V
> >>> perf_event_context <-[1:n]-> perf_event_pmu_context <--- perf_event
> >>>   ^  ^ | |
> >>>   `[1:n]-' `-[n:1]-> pmu <-[1:n]-'
> >>>
> >>> This patch builds (provided you disable CGROUP_PERF), boots and survives
> >>> perf-top without the machine catching fire.
> >>>
> >>> There's still a fair bit of loose ends (look for XXX), but I think this
> >>> is the direction we should be going.
> >>>
> >>> Comments?
> >>>
> >>> Not-Quite-Signed-off-by: Peter Zijlstra (Intel) 
> >>> ---
> >>>  arch/powerpc/perf/core-book3s.c |4
> >>>  arch/x86/events/core.c  |4
> >>>  arch/x86/events/intel/core.c|6
> >>>  arch/x86/events/intel/ds.c  |6
> >>>  arch/x86/events/intel/lbr.c |   16
> >>>  arch/x86/events/perf_event.h|6
> >>>  include/linux/perf_event.h  |   80 +-
> >>>  include/linux/sched.h   |2
> >>>  kernel/events/core.c| 1412 
> >>> 
> >>>  9 files changed, 815 insertions(+), 721 deletions(-)
> >>
> >> Rewrite is impressive however it doesn't result in code base reduction as 
> >> it is.
> >
> > Yeah.. that seems to be nature of these things ..
> >
> >> Nonetheless there is a clear demand for per pmu events groups tracking and 
> >> rotation
> >> in single cpu context (HW breakpoints, ARM big.little, Intel LBRs) and 
> >> there is
> >> a supply thru groups ordering on RB-tree.
> >>
> >> This might be driven into the kernel by some new Perf features that would 
> >> base on
> >> that RB-tree groups ordering or by refactoring of existing code but in the 
> >> way it
> >> would result in overall code base reduction thus lowering support cost.
> >
> > If you have a concrete suggestion on how to reduce complexity? I tried,
> > but couldn't find any (without breaking something).
>
> Could some of those PMUs (HW breakpoints, ARM big.little, Intel LBRs)
> or other Perf related code be adjusted now so that overall subsystem
> code base would reduce?
>
I have always had a hard time understanding the role of all these structs in
the generic code. This is still very confusing and very hard to follow.

In my mind, you have per-task and per-cpu perf_events contexts.
And for each you can have multiple PMUs, some hw some sw.
Each PMU has its own list of events maintained in RB tree. There is
never any interactions between PMUs.

Maybe this is how this is done or proposed by your patches, but it
certainly is not obvious.

Also the Intel LBR is not a PMU on its own. Maybe you are talking about
the BTS in arch/x86/events/intel/bts.c.


> >
> > The active lists and pmu_ctx_list could arguably be replaced with
> > (slower) iterations over the RB tree, but you'll still need the per pmu
> > nr_events/nr_active counts to determine if rotation is required at all.
> >
> > And like you know, performance is quite important here too. I'd love to
> > reduce complexity while maintaining or improving performance, but that
> > rarely if ever happens :/
> >


Re: [PATCH] slimbus: ngd: QCOM_QMI_HELPERS has to be selected

2018-10-15 Thread Greg KH
On Wed, Oct 03, 2018 at 06:52:36PM +0100, Srinivas Kandagatla wrote:
> From: Niklas Cassel 
> 
> QCOM_QMI_HELPERS is a hidden kconfig, so the proper usage is
> to select it, not depend upon it.
> 
> Signed-off-by: Niklas Cassel 
> Reviewed-by: Alex Elder 
> Reviewed-by: Bjorn Andersson 
> Signed-off-by: Srinivas Kandagatla 
> ---
>  drivers/slimbus/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/slimbus/Kconfig b/drivers/slimbus/Kconfig
> index 9d73ad806698..9cca2c3559f2 100644
> --- a/drivers/slimbus/Kconfig
> +++ b/drivers/slimbus/Kconfig
> @@ -22,8 +22,8 @@ config SLIM_QCOM_CTRL
>  
>  config SLIM_QCOM_NGD_CTRL
>   tristate "Qualcomm SLIMbus Satellite Non-Generic Device Component"
> - depends on QCOM_QMI_HELPERS
>   depends on HAS_IOMEM && DMA_ENGINE
> + select QCOM_QMI_HELPERS
>   help
> Select driver if Qualcomm's SLIMbus Satellite Non-Generic Device
> Component is programmed using Linux kernel.
> -- 
> 2.19.0
> 

This adds the following build warning when applied:

WARNING: unmet direct dependencies detected for QCOM_QMI_HELPERS
  Depends on [n]: ARCH_QCOM && NET [=y]
  Selected by [m]:
  - SLIM_QCOM_NGD_CTRL [=m] && SLIMBUS [=m] && HAS_IOMEM [=y] && DMA_ENGINE [=y]




Re: [PATCH] spi: Make GPIO CSs honour the SPI_NO_CS flag

2018-10-15 Thread Trent Piepho
On Fri, 2018-10-12 at 10:23 +0100, Phil Elwell wrote:
> The SPI configuration state includes an SPI_NO_CS flag that disables
> all CS line manipulation, for applications that want to manage their
> own chip selects. However, this flag is ignored by the GPIO CS code
> in the SPI framework.

> @@ -729,7 +729,9 @@ static void spi_set_cs(struct spi_device *spi, bool 
> enable)
>   enable = !enable;
>  
>   if (gpio_is_valid(spi->cs_gpio)) {
> - gpio_set_value(spi->cs_gpio, !enable);
> + /* Honour the SPI_NO_CS flag */
> + if (!(spi->mode & SPI_NO_CS))
> + gpio_set_value(spi->cs_gpio, !enable);
>   /* Some SPI masters need both GPIO CS & slave_select */
>   if ((spi->controller->flags & SPI_MASTER_GPIO_SS) &&
>   spi->controller->set_cs)

What about the calls to spi->controller->set_cs() after this? Should a
driver provided set_cs method be responsible for checking SPI_NO_CS? 
Or should it not be called in the first place?

I imagine it depends on what set_cs needs to do, which might not be
solely related to changing the CS line.
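
To make the question concrete, here is a userspace model of the patched
logic. It is a sketch under assumptions: struct ex_spi_device, its fields
and EX_SPI_NO_CS are hypothetical stand-ins for the real structures, and
the counter only models the controller's set_cs() callback being invoked.

```c
#include <stdbool.h>

#define EX_SPI_NO_CS 0x40	/* hypothetical flag value for this sketch */

struct ex_spi_device {
	unsigned int mode;
	int gpio_level;		/* models the GPIO chip-select line */
	bool has_set_cs;	/* controller provides a set_cs() callback */
	int set_cs_calls;	/* counts modelled set_cs() invocations */
};

void ex_spi_set_cs(struct ex_spi_device *spi, bool enable)
{
	/* With the patch, the GPIO line is left alone when NO_CS is set. */
	if (!(spi->mode & EX_SPI_NO_CS))
		spi->gpio_level = !enable;

	/* The callback is still reached, which is the open question above. */
	if (spi->has_set_cs)
		spi->set_cs_calls++;
}
```

With EX_SPI_NO_CS set, gpio_level stays untouched but set_cs_calls still
increments, so whether the callback has to check the flag itself depends
on what it does besides driving the CS line.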

Re: [PATCH v6 2/9] PCI: mediatek: Fixup class ID for MT7622 as PCI_CLASS_BRIDGE_PCI

2018-10-15 Thread Bjorn Helgaas
On Mon, Oct 15, 2018 at 10:42:23AM +0800, Honghui Zhang wrote:
> On Fri, 2018-10-12 at 09:12 -0500, Bjorn Helgaas wrote:
> > On Fri, Oct 12, 2018 at 11:22:30AM +0100, Lorenzo Pieralisi wrote:
> > > On Fri, Oct 12, 2018 at 04:01:29PM +0800, Honghui Zhang wrote:
> > >> On Thu, 2018-10-11 at 12:38 +0100, Lorenzo Pieralisi wrote:
> > >>> On Tue, Oct 09, 2018 at 11:08:15AM +0800, Honghui Zhang wrote:
> >  On Mon, 2018-10-08 at 18:23 +0100, Lorenzo Pieralisi wrote:
> > > On Mon, Oct 08, 2018 at 11:24:41AM +0800, honghui.zh...@mediatek.com 
> > > wrote:
> > >> From: Honghui Zhang 
> > >> 
> > >> The PCIe controller of MT7622 has TYPE 1 configuration
> > >> space type, but the HW default class type values is
> > >> invalid.
> > >> 
> > >> The commit 101c92dc80c8 ("PCI: mediatek: Set up vendor ID
> > >> and class type for MT7622") have set the class ID for
> > > >> MT7622 as PCI_CLASS_BRIDGE_HOST, but it's not workable
> > >> for MT7622:
> > >> 
> > >> In __pci_bus_assign_resources, the framework only setup
> > >> bridge's resource window only if class type is
> > >> PCI_CLASS_BRIDGE_PCI. Or it will leave the subordinary PCIe
> > >> device's MMIO window un-touched.
> > 
> > I think __pci_bus_assign_resources() should be testing dev->hdr_type
> > instead of dev->class.  The connection between "Header Type" and the
> > layout of the rest of the header is very explicit (PCI r3.0 sec 6.1,
> > PCIe r4.0 sec 7.5.1.1.9), and the reason for the switch statement in
> > __pci_bus_assign_resources() is precisely to determine which layout to
> > use.
> > 
> > There are several other uses of dev->class in setup-bus.c that I think
> > should also be changed to use dev->hdr_type.
> > 
> > If we make these changes in setup-bus.c, I suspect the class code you
> > assign won't matter too much.  There are a few other tests of the
> > class code to figure out whether to leave certain things untouched.
> > These seem a little hacky to me, but we're probably stuck with them
> > for now, so you should look and see whether they apply to your
> > situation.
> 
> If these changes could be made in the PCI core, then any class code
> would be workable for MT7622.
> 
> As Lorenzo pointed out, it's more reasonable for MT7622 to be defined with a
> PCI-to-PCI class code since the IP is defined as that. I intend to
> follow Lorenzo's suggestion, update the commit message, and re-send
> this patch set with the current solution.
> 
> >  And for MT7622, it integrated with block of internal control
> >  registers, type 1 configuration space, and is considered as a
> >  root complex.
> > >>> 
> > >>> I assume you mean a type 1 config header here. I do not think it
> > >>> is mandatory for a host bridge to have a type 1 config header (and
> > >>> related bridge windows + primary/secondary/subordinate bus
> > >>> numbers) but I do not know how the IP you are programming is
> > >>> designed.
> > 
> > It is definitely not mandatory for a host bridge to have a type 1
> > header.  I'm not even sure that would make sense: the "Primary Bus
> > Number" would not apply to a host bridge (since a host bridge's
> > primary bus is some sort of CPU bus, not a PCI bus), and a type 1
> > device cannot perform address translation between its primary and
> > secondary buses, while a host bridge can.
> > 
> > A Root Port is a type 1 device where the primary bus is logically
> > internal to the Root Complex.  A host bridge bridges from the CPU bus
> > to that internal bus and might perform address translation.  The Root
> > Port must be a PCI device.  A host bridge, being a bridge *to* the PCI
> > domain, is not itself generally programmed via PCI config space and
> > might not even be visible as a device in PCI config space.
> > 
> Thanks for the explanation. Per my understanding, MT7622 is more like a
> complex of Root Port and PCI-to-PCI bridge. It has a type 1 header and has
> the ability to translate addresses between its primary and secondary
> buses.

Nope.  Logically speaking, the PCI device in question is a Root Port,
which is a bridge between a primary PCI bus (probably bus 0) that is
internal to the Root Complex, and a secondary PCI bus.  This bridge,
like all other PCI-to-PCI bridges, does no address translation.

The Root Complex *also* contains a host bridge between whatever the
upstream CPU bus is and the logical PCI bus 0 internal to the Root
Complex.  This host bridge can perform address translation.

The Root Port PCI device might also have device-specific PCI config
registers, and those might offer some control over the host bridge
part of the Root Complex.  There is probably an MMIO (non-PCI config
space) way to configure the Root Complex, since you may not be able to
access the PCI config space before the Root Complex is configured.

> I guess applying the class type PCI_CLASS_BRIDGE_PCI is a
> reasonable way to make its integrated internal bridge workable.

Nope, that's not

Re: [PATCH v3 06/15] platform: goldfish: pipe: Move memory allocation from probe to init

2018-10-15 Thread Greg KH
On Wed, Oct 03, 2018 at 10:17:11AM -0700, r...@google.com wrote:
> From: Roman Kiryanov 
> 
> There will be two separate init functions for v1 and v2
> (different driver versions) and they will allocate different
> state.

You should only allocate memory at probe time, not init time.  What
happens if the hardware is not present, yet your driver is loaded?

You should do almost nothing at init time except register with the
proper bus so that your probe function can be called if the hardware is
present.

So the patch here is going backwards from what it should be working
toward.
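
Roughly, the split looks like the userspace sketch below. All names here
are hypothetical (ex_init, ex_probe, ex_remove, struct ex_dev_state), and
the "registration" is just a flag standing in for registering with the
bus; the point is only where the allocation lives.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-device state. */
struct ex_dev_state {
	int pipes[4];
};

static struct ex_dev_state *ex_state;
static bool ex_registered;

/* init: register with the bus and nothing else; no allocation here. */
int ex_init(void)
{
	ex_registered = true;
	return 0;
}

/* probe: only runs when the hardware is present, so allocate here. */
int ex_probe(void)
{
	ex_state = calloc(1, sizeof(*ex_state));
	if (!ex_state)
		return -1;
	return 0;
}

/* remove: undo exactly what probe did. */
void ex_remove(void)
{
	free(ex_state);
	ex_state = NULL;
}
```

If the hardware never shows up, ex_probe() is never called and nothing is
ever allocated, which is the behaviour you want.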

thanks,

greg k-h


Re: [RESEND] dt-bindings: clock: samsung: Add SPDX license identifiers

2018-10-15 Thread Rob Herring
On Thu, Sep 27, 2018 at 07:03:47PM +0200, Krzysztof Kozlowski wrote:
> Replace GPL license statements with SPDX license identifiers (GPL-2.0).
> 
> Signed-off-by: Krzysztof Kozlowski 
> Acked-by: Chanwoo Choi 
> Reviewed-by: Rob Herring 
> ---
>  include/dt-bindings/clock/exynos3250.h | 5 +
>  include/dt-bindings/clock/exynos4.h| 7 ++-
>  include/dt-bindings/clock/exynos5250.h | 7 ++-
>  include/dt-bindings/clock/exynos5260-clk.h | 7 ++-
>  include/dt-bindings/clock/exynos5410.h | 7 ++-
>  include/dt-bindings/clock/exynos5420.h | 7 ++-
>  include/dt-bindings/clock/exynos5433.h | 5 +
>  include/dt-bindings/clock/exynos7-clk.h| 7 ++-
>  include/dt-bindings/clock/s3c2410.h| 5 +
>  include/dt-bindings/clock/s3c2412.h| 5 +
>  include/dt-bindings/clock/s3c2443.h| 5 +
>  11 files changed, 17 insertions(+), 50 deletions(-)

Applied.


Re: [RESEND] dt-bindings: thermal: samsung: Add SPDX license identifier

2018-10-15 Thread Rob Herring
On Thu, Sep 27, 2018 at 07:04:40PM +0200, Krzysztof Kozlowski wrote:
> Replace GPL license statement with SPDX license identifier (GPL-2.0+).
> 
> Signed-off-by: Krzysztof Kozlowski 
> 
> ---
> 
> We postponed this patch because in previous discussion [1] Bartlomiej
> expressed willingness to change the license... which did not happen
> since July 2018.  In such case let's take this.
> 
> [1] https://patchwork.kernel.org/patch/10533191/
> ---
>  include/dt-bindings/thermal/thermal_exynos.h | 12 +---
>  1 file changed, 1 insertion(+), 11 deletions(-)

Applied.


Re: [PATCH v3 07/15] platform: goldfish: pipe: Return status from "deinit" since "remove" does not do much

2018-10-15 Thread Greg KH
On Wed, Oct 03, 2018 at 10:17:12AM -0700, r...@google.com wrote:
> From: Roman Kiryanov 
> 
> This way deinit will have a chance to report an error.
> 
> Signed-off-by: Roman Kiryanov 
> ---
> Changes in v3:
>  - No change.
> Changes in v2:
>  - No change.
> 
>  drivers/platform/goldfish/goldfish_pipe.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/platform/goldfish/goldfish_pipe.c 
> b/drivers/platform/goldfish/goldfish_pipe.c
> index 445c0c0c66c4..1822d4146778 100644
> --- a/drivers/platform/goldfish/goldfish_pipe.c
> +++ b/drivers/platform/goldfish/goldfish_pipe.c
> @@ -888,13 +888,15 @@ static int goldfish_pipe_device_init(struct 
> platform_device *pdev,
>   return 0;
>  }
>  
> -static void goldfish_pipe_device_deinit(struct platform_device *pdev,
> - struct goldfish_pipe_dev *dev)
> +static int goldfish_pipe_device_deinit(struct platform_device *pdev,
> +struct goldfish_pipe_dev *dev)
>  {
>   misc_deregister(&dev->miscdev);
>   tasklet_kill(&dev->irq_tasklet);
>   kfree(dev->pipes);
>   free_page((unsigned long)dev->buffers);
> +
> + return 0;
>  }

This function can not fail, why are you returning 0 always?  That
doesn't make sense.

thanks,

greg k-h


Re: [PATCH v8 2/9] PCI: mediatek: Fix class type for MT7622 as PCI_CLASS_BRIDGE_PCI

2018-10-15 Thread Bjorn Helgaas
On Mon, Oct 15, 2018 at 04:08:53PM +0800, honghui.zh...@mediatek.com wrote:
> From: Honghui Zhang 
> 
> Commit 101c92dc80c8 ("PCI: mediatek: Set up vendor ID and class
> type for MT7622") set the class type for MT7622 to the improper
> value of PCI_CLASS_BRIDGE_HOST.
> 
> The PCIe controller of MT7622 is a complex of a Root Port and a PCI-to-PCI
> bridge; the bridge has a type 1 configuration space header and related bridge
> windows. The HW default value of this bridge's class type is invalid. Fix
> its class type as PCI_CLASS_BRIDGE_PCI since that is what the HW defines.
> 
> Making the bridge visible to the PCI framework by setting its class type
> properly will get its bridge windows configured during PCI device
> enumeration.
> 
> Fixes: 101c92dc80c8 ("PCI: mediatek: Set up vendor ID and class type for 
> MT7622")
> Signed-off-by: Honghui Zhang 
> Acked-by: Ryder Lee 

Nak until this patch is preceded by one that fixes the PCI core defect
I pointed out earlier [1].  It's OK to change the class code, but
not as a way of working around that PCI core defect.

[1] 
https://lore.kernel.org/linux-pci/20181012141202.gv5...@bhelgaas-glaptop.roam.corp.google.com
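
To illustrate the difference between the two tests, here is a standalone
sketch. It is not the setup-bus.c code: struct ex_pci_dev and the helper
names are hypothetical, and only the numeric values (header type 1 for a
PCI-to-PCI bridge, class codes 0x0600 and 0x0604, bit 7 of the header
type as the multi-function flag) follow the PCI spec.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_HEADER_TYPE_MASK	0x7f	/* bit 7 is the multi-function flag */
#define EX_HEADER_TYPE_BRIDGE	1	/* type 1 header: PCI-to-PCI bridge layout */
#define EX_CLASS_BRIDGE_HOST	0x0600
#define EX_CLASS_BRIDGE_PCI	0x0604

/* Hypothetical, minimal view of a device for this sketch. */
struct ex_pci_dev {
	uint8_t hdr_type;
	uint16_t class_code;
};

/* Class-based test: depends on the class code being programmed correctly. */
bool ex_has_bridge_windows_by_class(const struct ex_pci_dev *dev)
{
	return dev->class_code == EX_CLASS_BRIDGE_PCI;
}

/* Header-based test: follows the layout the header type actually declares. */
bool ex_has_bridge_windows_by_header(const struct ex_pci_dev *dev)
{
	return (dev->hdr_type & EX_HEADER_TYPE_MASK) == EX_HEADER_TYPE_BRIDGE;
}
```

A device with a type 1 header and class 0x0600 is skipped by the first
helper and handled by the second, which is the MT7622 situation before
this patch.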

> ---
>  drivers/pci/controller/pcie-mediatek.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/pcie-mediatek.c 
> b/drivers/pci/controller/pcie-mediatek.c
> index 288b8e2..bcdac9b 100644
> --- a/drivers/pci/controller/pcie-mediatek.c
> +++ b/drivers/pci/controller/pcie-mediatek.c
> @@ -432,7 +432,7 @@ static int mtk_pcie_startup_port_v2(struct mtk_pcie_port 
> *port)
>   val = PCI_VENDOR_ID_MEDIATEK;
>   writew(val, port->base + PCIE_CONF_VEND_ID);
>  
> - val = PCI_CLASS_BRIDGE_HOST;
> + val = PCI_CLASS_BRIDGE_PCI;
>   writew(val, port->base + PCIE_CONF_CLASS_ID);
>   }
>  
> -- 
> 2.6.4
> 
> 


Re: [PATCH v3 09/15] platform: goldfish: pipe: Move goldfish_pipe to goldfish_pipe_v2

2018-10-15 Thread Greg KH
On Wed, Oct 03, 2018 at 10:17:14AM -0700, r...@google.com wrote:
> From: Roman Kiryanov 
> 
> This is the v2 driver. v1 will be added later.

This does not make any sense.  Why are you renaming a driver that has
been in the tree already?  It's not "v2", it is what we have today.

If you want to add a new driver later, great, but don't change the name
of an existing driver for no good reason.  Userspace has a tendency to
break when that happens.

thanks,

greg k-h


[PATCH] mm: detect numbers of vmstat keys/values mismatch

2018-10-15 Thread Yu Zhao
There were mismatches between number of vmstat keys and number of
vmstat values. They were fixed recently by:
  commit 58bc4c34d249 ("mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly")
  commit 28e2c4bb99aa ("mm/vmstat.c: fix outdated vmstat_text")

Add a BUILD_BUG_ON to detect such mismatch and hopefully prevent
it from happening again.

Signed-off-by: Yu Zhao 
---
 include/linux/vmstat.h |  4 
 mm/vmstat.c| 18 --
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index f25cef84b41d..33fdd37124cb 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -78,6 +78,10 @@ extern void vm_events_fold_cpu(int cpu);
 
 #else
 
+struct vm_event_state {
+   unsigned long event[0];
+};
+
 /* Disable counters */
 static inline void count_vm_event(enum vm_event_item item)
 {
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 7878da76abf2..7ebf871b4cc9 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1647,23 +1647,21 @@ enum writeback_stat_item {
NR_VM_WRITEBACK_STAT_ITEMS,
 };
 
+#define NR_VM_STAT_ITEMS (NR_VM_ZONE_STAT_ITEMS + NR_VM_NUMA_STAT_ITEMS + \
+ NR_VM_NODE_STAT_ITEMS + NR_VM_WRITEBACK_STAT_ITEMS + \
+ ARRAY_SIZE(((struct vm_event_state *)0)->event))
+
 static void *vmstat_start(struct seq_file *m, loff_t *pos)
 {
+   int i;
unsigned long *v;
-   int i, stat_items_size;
+
+   BUILD_BUG_ON(ARRAY_SIZE(vmstat_text) != NR_VM_STAT_ITEMS);
 
if (*pos >= ARRAY_SIZE(vmstat_text))
return NULL;
-   stat_items_size = NR_VM_ZONE_STAT_ITEMS * sizeof(unsigned long) +
- NR_VM_NUMA_STAT_ITEMS * sizeof(unsigned long) +
- NR_VM_NODE_STAT_ITEMS * sizeof(unsigned long) +
- NR_VM_WRITEBACK_STAT_ITEMS * sizeof(unsigned long);
-
-#ifdef CONFIG_VM_EVENT_COUNTERS
-   stat_items_size += sizeof(struct vm_event_state);
-#endif
 
-   v = kmalloc(stat_items_size, GFP_KERNEL);
+   v = kmalloc_array(NR_VM_STAT_ITEMS, sizeof(unsigned long), GFP_KERNEL);
m->private = v;
if (!v)
return ERR_PTR(-ENOMEM);
-- 
2.19.1.331.ge82ca0e54c-goog



Re: [PATCH] kernel/signal: Signal-based pre-coredump notification

2018-10-15 Thread Enke Chen
Hi, Christian:

As I replied to Jann, I will remove the code that does the setting on others
to make the code simpler and more secure.

Thanks.  -- Enke

>> +static bool set_predump_signal_perm(struct task_struct *p)
>> +{
>> +const struct cred *cred = current_cred(), *pcred = __task_cred(p);
>> +
>> +return uid_eq(pcred->uid, cred->euid) ||
>> +   uid_eq(pcred->euid, cred->euid) ||
>> +   capable(CAP_SYS_ADMIN);
> 
> So before proceeding I'd like to discuss at least two points:
> - how does this interact with the dumpability of a process?
> - do we need the capable(CAP_SYS_ADMIN) restriction to init_user_ns?
>   Seems we could make this work per-user-ns just like
>   PRCTL_SET_PDEATHSIG does?
> 
>> +}



[PATCH v1 4/4] arm64: dts: qcom: qcs404: Add thermal zones for each sensor

2018-10-15 Thread Amit Kucheria
qcs404 has 10 sensors connected to the single TSENS IP. Define a thermal
zone for each of those sensors to expose the temperature of each zone.

Signed-off-by: Amit Kucheria 
---
 arch/arm64/boot/dts/qcom/qcs404.dtsi | 206 +++
 1 file changed, 206 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/qcs404.dtsi 
b/arch/arm64/boot/dts/qcom/qcs404.dtsi
index dfd65c53cf5f..ea882a9ce6e3 100644
--- a/arch/arm64/boot/dts/qcom/qcs404.dtsi
+++ b/arch/arm64/boot/dts/qcom/qcs404.dtsi
@@ -69,6 +69,7 @@
reg = <0x100>;
enable-method = "psci";
next-level-cache = <&L2_0>;
+   #cooling-cells= <2>;
};
 
CPU1: cpu@1 {
@@ -77,6 +78,7 @@
reg = <0x101>;
enable-method = "psci";
next-level-cache = <&L2_0>;
+   #cooling-cells= <2>;
};
 
CPU2: cpu@2 {
@@ -85,6 +87,7 @@
reg = <0x102>;
enable-method = "psci";
next-level-cache = <&L2_0>;
+   #cooling-cells= <2>;
};
 
CPU3: cpu@3 {
@@ -93,6 +96,7 @@
reg = <0x100>;
enable-method = "psci";
next-level-cache = <&L2_0>;
+   #cooling-cells= <2>;
};
 
L2_0: l2-cache {
@@ -484,4 +488,206 @@
label = "wcss";
};
};
+
+   thermal-zones {
+   aoss-thermal {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = <&tsens 0>;
+
+   trips {
+   aoss_alert: trip0 {
+   temperature = <75000>;
+   hysteresis = <2000>;
+   type = "passive";
+   };
+   aoss_crit: trip1 {
+   temperature = <95000>;
+   hysteresis = <2000>;
+   type = "critical";
+   };
+   };
+   };
+
+   dsp-thermal {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = <&tsens 1>;
+
+   trips {
+   dsp_alert: trip0 {
+   temperature = <75000>;
+   hysteresis = <2000>;
+   type = "passive";
+   };
+   dsp_crit: trip1 {
+   temperature = <95000>;
+   hysteresis = <2000>;
+   type = "critical";
+   };
+   };
+   };
+
+   lpass-thermal {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = <&tsens 2>;
+
+   trips {
+   lpass_alert: trip0 {
+   temperature = <75000>;
+   hysteresis = <2000>;
+   type = "passive";
+   };
+   lpass_crit: trip1 {
+   temperature = <95000>;
+   hysteresis = <2000>;
+   type = "critical";
+   };
+   };
+   };
+
+   wlan-thermal {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = <&tsens 3>;
+
+   trips {
+   wlan_alert: trip0 {
+   temperature = <75000>;
+   hysteresis = <2000>;
+   type = "passive";
+   };
+   wlan_crit: trip1 {
+   temperature = <95000>;
+   hysteresis = <2000>;
+   type = "critical";
+   };
+   };
+   };
+
+   cluster-thermal {
+   polling-delay-passive = <250>;
+   polling-de

[PATCH v1 1/4] dt: thermal: tsens: Add bindings for qcs404

2018-10-15 Thread Amit Kucheria
qcs404 uses v1 of the TSENS IP block. Create a fallback DT property
"qcom,tsens-v1" to gather common code.

Signed-off-by: Amit Kucheria 
---
 Documentation/devicetree/bindings/thermal/qcom-tsens.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/thermal/qcom-tsens.txt b/Documentation/devicetree/bindings/thermal/qcom-tsens.txt
index 1d9e8cf61018..799de3062352 100644
--- a/Documentation/devicetree/bindings/thermal/qcom-tsens.txt
+++ b/Documentation/devicetree/bindings/thermal/qcom-tsens.txt
@@ -8,9 +8,12 @@ Required properties:
 - "qcom,msm8996-tsens" (MSM8996)
 - "qcom,msm8998-tsens", "qcom,tsens-v2" (MSM8998)
 - "qcom,sdm845-tsens", "qcom,tsens-v2" (SDM845)
+- "qcom,qcs404-tsens", "qcom,tsens-v1" (QCS404)
   The generic "qcom,tsens-v2" property must be used as a fallback for any SoC
   with version 2 of the TSENS IP. MSM8996 is the only exception because the
   generic property did not exist when support was added.
+  Similarly, the generic "qcom,tsens-v1" property must be used as a fallback for
+  any SoC with version 1 of the TSENS IP.
 
 - reg: Address range of the thermal registers.
   New platforms containing v2.x.y of the TSENS IP must specify the SROT and TM
-- 
2.17.1



[PATCH v1 2/4] drivers: thermal: tsens: Add generic support for TSENS v1 IP

2018-10-15 Thread Amit Kucheria
qcs404 has a single TSENS IP block with 10 sensors. It uses version 1.4
of the TSENS IP, functionality for which is encapsulated inside
qcom,tsens-v1 compatible.

Signed-off-by: Amit Kucheria 
---
 drivers/thermal/qcom/Makefile   |   2 +-
 drivers/thermal/qcom/tsens-v1.c | 196 
 drivers/thermal/qcom/tsens.c|   3 +
 drivers/thermal/qcom/tsens.h|   2 +-
 4 files changed, 201 insertions(+), 2 deletions(-)
 create mode 100644 drivers/thermal/qcom/tsens-v1.c

diff --git a/drivers/thermal/qcom/Makefile b/drivers/thermal/qcom/Makefile
index a821929ede0b..60269ee90c43 100644
--- a/drivers/thermal/qcom/Makefile
+++ b/drivers/thermal/qcom/Makefile
@@ -1,2 +1,2 @@
 obj-$(CONFIG_QCOM_TSENS)   += qcom_tsens.o
-qcom_tsens-y   += tsens.o tsens-common.o tsens-8916.o tsens-8974.o tsens-8960.o tsens-v2.o
+qcom_tsens-y   += tsens.o tsens-common.o tsens-8916.o tsens-8974.o tsens-8960.o tsens-v2.o tsens-v1.o
diff --git a/drivers/thermal/qcom/tsens-v1.c b/drivers/thermal/qcom/tsens-v1.c
new file mode 100644
index ..297bcac3b94e
--- /dev/null
+++ b/drivers/thermal/qcom/tsens-v1.c
@@ -0,0 +1,196 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2018, Linaro Limited
+ */
+
+#include 
+#include 
+#include "tsens.h"
+
+/* eeprom layout data for qcs404 (v1) */
+#define BASE0_MASK      0x000007f8
+#define BASE1_MASK      0x0007f800
+#define BASE0_SHIFT     3
+#define BASE1_SHIFT     11
+
+#define S0_P1_MASK      0x0000003f
+#define S1_P1_MASK      0x0003f000
+#define S2_P1_MASK      0x3f000000
+#define S3_P1_MASK      0x000003f0
+#define S4_P1_MASK      0x003f0000
+#define S5_P1_MASK      0x0000003f
+#define S6_P1_MASK      0x0003f000
+#define S7_P1_MASK      0x3f000000
+#define S8_P1_MASK      0x000003f0
+#define S9_P1_MASK      0x003f0000
+
+#define S0_P2_MASK      0x00000fc0
+#define S1_P2_MASK      0x00fc0000
+#define S2_P2_MASK_1_0  0xc0000000
+#define S2_P2_MASK_5_2  0x0000000f
+#define S3_P2_MASK      0x0000fc00
+#define S4_P2_MASK      0x0fc00000
+#define S5_P2_MASK      0x00000fc0
+#define S6_P2_MASK      0x00fc0000
+#define S7_P2_MASK_1_0  0xc0000000
+#define S7_P2_MASK_5_2  0x0000000f
+#define S8_P2_MASK      0x0000fc00
+#define S9_P2_MASK      0x0fc00000
+
+#define S0_P1_SHIFT     0
+#define S0_P2_SHIFT     6
+#define S1_P1_SHIFT     12
+#define S1_P2_SHIFT     18
+#define S2_P1_SHIFT     24
+#define S2_P2_SHIFT_1_0 30
+
+#define S2_P2_SHIFT_5_2 0
+#define S3_P1_SHIFT     4
+#define S3_P2_SHIFT     10
+#define S4_P1_SHIFT     16
+#define S4_P2_SHIFT     22
+
+#define S5_P1_SHIFT     0
+#define S5_P2_SHIFT     6
+#define S6_P1_SHIFT     12
+#define S6_P2_SHIFT     18
+#define S7_P1_SHIFT     24
+#define S7_P2_SHIFT_1_0 30
+
+#define S7_P2_SHIFT_5_2 0
+#define S8_P1_SHIFT     4
+#define S8_P2_SHIFT     10
+#define S9_P1_SHIFT     16
+#define S9_P2_SHIFT     22
+
+#define CAL_SEL_MASK   7
+#define CAL_SEL_SHIFT  0
+
+static int calibrate_v1(struct tsens_device *tmdev)
+{
+   u32 base0 = 0, base1 = 0;
+   u32 p1[tmdev->num_sensors], p2[tmdev->num_sensors];
+   u32 mode = 0, lsb = 0, msb = 0;
+   u32 *qfprom_cdata;
+   int i;
+
+   qfprom_cdata = (u32 *)qfprom_read(tmdev->dev, "calib");
+   if (IS_ERR(qfprom_cdata))
+   return PTR_ERR(qfprom_cdata);
+
+   mode = (qfprom_cdata[4] & CAL_SEL_MASK) >> CAL_SEL_SHIFT;
+   dev_dbg(tmdev->dev, "calibration mode is %d\n", mode);
+
+   switch (mode) {
+   case TWO_PT_CALIB:
+   base1 = (qfprom_cdata[4] & BASE1_MASK) >> BASE1_SHIFT;
+   p2[0] = (qfprom_cdata[0] & S0_P2_MASK) >> S0_P2_SHIFT;
+   p2[1] = (qfprom_cdata[0] & S1_P2_MASK) >> S1_P2_SHIFT;
+   /* This value is split over two registers, 2 bits and 4 bits */
+   lsb   = (qfprom_cdata[0] & S2_P2_MASK_1_0) >> S2_P2_SHIFT_1_0;
+   msb   = (qfprom_cdata[1] & S2_P2_MASK_5_2) >> S2_P2_SHIFT_5_2;
+   p2[2] = msb << 2 | lsb;
+   p2[3] = (qfprom_cdata[1] & S3_P2_MASK) >> S3_P2_SHIFT;
+   p2[4] = (qfprom_cdata[1] & S4_P2_MASK) >> S4_P2_SHIFT;
+   p2[5] = (qfprom_cdata[2] & S5_P2_MASK) >> S5_P2_SHIFT;
+   p2[6] = (qfprom_cdata[2] & S6_P2_MASK) >> S6_P2_SHIFT;
+   /* This value is split over two registers, 2 bits and 4 bits */
+   lsb   = (qfprom_cdata[2] & S7_P2_MASK_1_0) >> S7_P2_SHIFT_1_0;
+   msb   = (qfprom_cdata[3] & S7_P2_MASK_5_2) >> S7_P2_SHIFT_5_2;
+   p2[7] = msb << 2 | lsb;
+   p2[8] = (qfprom_cdata[3] & S8_P2_MASK) >> S8_P2_SHIFT;
+   p2[9] = (qfprom_cdata[3] & S9_P2_MASK) >> S9_P2_SHIFT;
+   for (i = 0; i < tmdev->num_sensors; i++)
+   p2[i] = ((base1 + p2[i]) << 2);
+   /* Fall through */
+   case ONE_PT_CALIB2:
+   base0 = (qfprom_cdata[4] & BASE0_MASK) >> BASE0_SHIFT;
+   p1[0] = (qfprom_cdata[0] & S0_P1_MASK) >> S0_
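
Editorial note: the calibration code above rebuilds some 6-bit values whose low two bits and high four bits live in different 32-bit words (p2[2] and p2[7]). The following is a minimal user-space sketch of that reassembly, not the driver's code. The helper names are hypothetical, and the two mask constants are assumptions derived from the "2 bits and 4 bits" comment together with the shift values 30 and 0.

```c
#include <assert.h>
#include <stdint.h>

/* Extract a bit field: the mask selects the bits, the shift moves them to bit 0. */
static uint32_t field_get(uint32_t word, uint32_t mask, unsigned int shift)
{
	return (word & mask) >> shift;
}

/*
 * Reassemble a 6-bit value whose bits [1:0] sit in the top two bits of
 * word0 and whose bits [5:2] sit in the low four bits of word1.
 * (Hypothetical helper; masks are assumed, see the note above.)
 */
static uint32_t split_field_get(uint32_t word0, uint32_t word1)
{
	uint32_t lsb = field_get(word0, 0xc0000000u, 30); /* bits [1:0] */
	uint32_t msb = field_get(word1, 0x0000000fu, 0);  /* bits [5:2] */

	assert(lsb < 4 && msb < 16);
	return (msb << 2) | lsb;
}
```

With word0 = 0x40000000 and word1 = 0x00000002, lsb is 1 and msb is 2, so the helper returns (2 << 2) | 1 = 9.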

[PATCH v1 0/4] thermal: tsens: Add support for QCS404 platform

2018-10-15 Thread Amit Kucheria
Add support for the Qualcomm QCS404 platform that contains v1 of the TSENS
IP. Introduce a fallback binding to handle "v1" functionality.

These patches apply on top of previous tsens-related patches sent to the
list pending merge along with various qcs404 patches under review
(available in this branch[1] for convenience).

[1] https://git.linaro.org/landing-teams/working/qualcomm/kernel.git/log/?h=integration-linux-qcomlt

Amit Kucheria (4):
  dt: thermal: tsens: Add bindings for qcs404
  drivers: thermal: tsens: Add generic support for TSENS v1 IP
  arm64: dts: qcom: qcs404: Add tsens controller
  arm64: dts: qcom: qcs404: Add thermal zones for each sensor

 .../bindings/thermal/qcom-tsens.txt   |   3 +
 arch/arm64/boot/dts/qcom/qcs404.dtsi  | 226 ++
 drivers/thermal/qcom/Makefile |   2 +-
 drivers/thermal/qcom/tsens-v1.c   | 196 +++
 drivers/thermal/qcom/tsens.c  |   3 +
 drivers/thermal/qcom/tsens.h  |   2 +-
 6 files changed, 430 insertions(+), 2 deletions(-)
 create mode 100644 drivers/thermal/qcom/tsens-v1.c

-- 
2.17.1



[PATCH v1 3/4] arm64: dts: qcom: qcs404: Add tsens controller

2018-10-15 Thread Amit Kucheria
qcs404 has a single TSENS IP block with 10 sensors. The calibration data
is stored in an eeprom (qfprom) that is accessed through the nvmem
framework. We add the qfprom node to allow the tsens sensors to be
calibrated correctly.

Signed-off-by: Amit Kucheria 
---
 arch/arm64/boot/dts/qcom/qcs404.dtsi | 20 
 1 file changed, 20 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/qcs404.dtsi b/arch/arm64/boot/dts/qcom/qcs404.dtsi
index e1e2ba9cbfcd..dfd65c53cf5f 100644
--- a/arch/arm64/boot/dts/qcom/qcs404.dtsi
+++ b/arch/arm64/boot/dts/qcom/qcs404.dtsi
@@ -273,6 +273,26 @@
status = "okay";
};
 
+   qfprom: qfprom@a4000 {
+   compatible = "qcom,qfprom";
+   reg = <0xa4000 0x1000>;
+   #address-cells = <1>;
+   #size-cells = <1>;
+   tsens_caldata: caldata@d0 {
+   reg = <0x1f8 0x14>;
+   };
+   };
+
+   tsens: thermal-sensor@4a9000 {
+   compatible = "qcom,qcs404-tsens", "qcom,tsens-v1";
+   reg = <0x4a9000 0x1000>, /* TM */
+ <0x4a8000 0x1000>; /* SROT */
+   nvmem-cells = <&tsens_caldata>;
+   nvmem-cell-names = "calib";
+   #qcom,sensors = <10>;
+   #thermal-sensor-cells = <1>;
+   };
+
apcs_glb: mailbox@b011000 {
compatible = "qcom,qcs404-apcs-apps-global", "syscon";
reg = <0xb011000 0x1000>;
-- 
2.17.1



Re: [PATCH] mm: detect numbers of vmstat keys/values mismatch

2018-10-15 Thread Jann Horn
On Mon, Oct 15, 2018 at 8:38 PM Yu Zhao  wrote:
> There were mismatches between number of vmstat keys and number of
> vmstat values. They were fixed recently by:
>   commit 58bc4c34d249 ("mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly")
>   commit 28e2c4bb99aa ("mm/vmstat.c: fix outdated vmstat_text")
>
> Add a BUILD_BUG_ON to detect such mismatch and hopefully prevent
> it from happening again.

A BUILD_BUG_ON() like this is already in the mm tree:
https://ozlabs.org/~akpm/mmotm/broken-out/mm-vmstat-assert-that-vmstat_text-is-in-sync-with-stat_items_size.patch

> Signed-off-by: Yu Zhao 
> ---
>  include/linux/vmstat.h |  4 
>  mm/vmstat.c| 18 --
>  2 files changed, 12 insertions(+), 10 deletions(-)
>
> diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
> index f25cef84b41d..33fdd37124cb 100644
> --- a/include/linux/vmstat.h
> +++ b/include/linux/vmstat.h
> @@ -78,6 +78,10 @@ extern void vm_events_fold_cpu(int cpu);
>
>  #else
>
> +struct vm_event_state {
> +   unsigned long event[0];
> +};
> +
>  /* Disable counters */
>  static inline void count_vm_event(enum vm_event_item item)
>  {
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index 7878da76abf2..7ebf871b4cc9 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -1647,23 +1647,21 @@ enum writeback_stat_item {
> NR_VM_WRITEBACK_STAT_ITEMS,
>  };
>
> +#define NR_VM_STAT_ITEMS (NR_VM_ZONE_STAT_ITEMS + NR_VM_NUMA_STAT_ITEMS + \
> + NR_VM_NODE_STAT_ITEMS + NR_VM_WRITEBACK_STAT_ITEMS + \
> + ARRAY_SIZE(((struct vm_event_state *)0)->event))
> +
>  static void *vmstat_start(struct seq_file *m, loff_t *pos)
>  {
> +   int i;
> unsigned long *v;
> -   int i, stat_items_size;
> +
> +   BUILD_BUG_ON(ARRAY_SIZE(vmstat_text) != NR_VM_STAT_ITEMS);
>
> if (*pos >= ARRAY_SIZE(vmstat_text))
> return NULL;
> -   stat_items_size = NR_VM_ZONE_STAT_ITEMS * sizeof(unsigned long) +
> - NR_VM_NUMA_STAT_ITEMS * sizeof(unsigned long) +
> - NR_VM_NODE_STAT_ITEMS * sizeof(unsigned long) +
> - NR_VM_WRITEBACK_STAT_ITEMS * sizeof(unsigned long);
> -
> -#ifdef CONFIG_VM_EVENT_COUNTERS
> -   stat_items_size += sizeof(struct vm_event_state);
> -#endif
>
> -   v = kmalloc(stat_items_size, GFP_KERNEL);
> +   v = kmalloc_array(NR_VM_STAT_ITEMS, sizeof(unsigned long), GFP_KERNEL);
> m->private = v;
> if (!v)
> return ERR_PTR(-ENOMEM);
> --
> 2.19.1.331.ge82ca0e54c-goog
>
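
Editorial note: the idea discussed in this thread is a build-time check that a table of names stays in sync with the enum that indexes it. A minimal user-space sketch of that technique follows, using C11 static_assert as a stand-in for the kernel's BUILD_BUG_ON. All demo_* names are hypothetical and are not taken from mm/vmstat.c.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical counters; the last enumerator doubles as the item count. */
enum demo_stat_item {
	DEMO_ALLOC,
	DEMO_FREE,
	DEMO_RECLAIM,
	NR_DEMO_STAT_ITEMS,
};

/* One name per counter, in enum order. */
static const char * const demo_stat_text[] = {
	"demo_alloc",
	"demo_free",
	"demo_reclaim",
};

/* Fails the build if a counter is added without a matching name. */
static_assert(ARRAY_SIZE(demo_stat_text) == NR_DEMO_STAT_ITEMS,
	      "demo_stat_text out of sync with enum demo_stat_item");

/* Look up the name for a counter; valid because the sizes are checked above. */
static const char *demo_stat_name(enum demo_stat_item item)
{
	return demo_stat_text[item];
}
```

Adding a fourth enumerator before NR_DEMO_STAT_ITEMS without extending demo_stat_text would stop compilation at the static_assert rather than producing a mismatched table at run time.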


Re: [PATCH v3] staging: gasket: Fix sparse "incorrect type in assignment" warnings.

2018-10-15 Thread Todd Poynor
On Wed, Oct 10, 2018 at 2:14 PM Laurence Rochfort
 wrote:
>
> Remove the coherent buffer __iomem cookie because the buffer is
> allocated from dma_alloc_coherent().
>
> warning: incorrect type in assignment (different address spaces)
>expected unsigned char [noderef] [usertype] *virt_base
>got void *[assigned] mem
> warning: incorrect type in argument 3 (different address spaces)
>expected void *cpu_addr
>got unsigned char [noderef] [usertype] *virt_base
>
> Signed-off-by: Laurence Rochfort 
> ---
> Changes in v3:
>  - Just remove the __iomem cookie, don't alter type.
>
>  drivers/staging/gasket/gasket_core.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/staging/gasket/gasket_core.h b/drivers/staging/gasket/gasket_core.h
> index 275fd0b..e62adcd 100644
> --- a/drivers/staging/gasket/gasket_core.h
> +++ b/drivers/staging/gasket/gasket_core.h
> @@ -231,7 +231,7 @@ struct gasket_coherent_buffer_desc {
>  /* Coherent buffer structure. */
>  struct gasket_coherent_buffer {
> /* Virtual base address. */
> -   u8 __iomem *virt_base;
> +   u8 *virt_base;
>
> /* Physical base address. */
> ulong phys_base;
> --
> 2.9.5

Reviewed-by: Todd Poynor 

Thanks!


Re: [PATCH] kernel/signal: Signal-based pre-coredump notification

2018-10-15 Thread Greg Kroah-Hartman
On Mon, Oct 15, 2018 at 11:16:36AM -0700, Enke Chen wrote:
> Hi, Greg:
> 
> > Shouldn't there also be a manpage update, and a kselftest added for this
> > new user/kernel api that is being created?
> > 
> 
> I will submit a patch for manpage update once the code is accepted.

Writing a manpage update is key to see if what you are describing
actually matches the code you have submitted.  You should do both at the
same time so that they can be reviewed together.

> Regarding the kselftest, I am not sure.  Once the prctl() is limited to
> self (which I will do), the logic would be pretty straightforward. Not
> sure if the selftest would add much value.

If you do not have a test for this feature, how do you know it even
works at all?  How will you know if it breaks in a future kernel
release?  Have you tested this?  If so, how?

thanks,

greg k-h


Re: [PATCH] mm: detect numbers of vmstat keys/values mismatch

2018-10-15 Thread Yu Zhao
On Mon, Oct 15, 2018 at 08:41:52PM +0200, Jann Horn wrote:
> On Mon, Oct 15, 2018 at 8:38 PM Yu Zhao  wrote:
> > There were mismatches between number of vmstat keys and number of
> > vmstat values. They were fixed recently by:
> >   commit 58bc4c34d249 ("mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly")
> >   commit 28e2c4bb99aa ("mm/vmstat.c: fix outdated vmstat_text")
> >
> > Add a BUILD_BUG_ON to detect such mismatch and hopefully prevent
> > it from happening again.
> 
> A BUILD_BUG_ON() like this is already in the mm tree:
> https://ozlabs.org/~akpm/mmotm/broken-out/mm-vmstat-assert-that-vmstat_text-is-in-sync-with-stat_items_size.patch

My bad! Didn't notice this. Please disregard this patch.


Re: [PATCH] kernel/signal: Signal-based pre-coredump notification

2018-10-15 Thread Enke Chen
Hi, Jann:

Thanks a lot for your detailed review. Please see my replies/comments inline.

On 10/13/18 11:27 AM, Jann Horn wrote:
> On Sat, Oct 13, 2018 at 2:33 AM Enke Chen  wrote:
>> For simplicity and consistency, this patch provides an implementation
>> for signal-based fault notification prior to the coredump of a child
>> process. A new prctl command, PR_SET_PREDUMP_SIG, is defined that can
>> be used by an application to express its interest and to specify the
>> signal (SIGCHLD or SIGUSR1 or SIGUSR2) for such a notification. A new
>> signal code (si_code), CLD_PREDUMP, is also defined for SIGCHLD.
> 
> Your suggested API looks vaguely similar to PR_SET_PDEATHSIG, but with
> some important differences:
> 
>  - You don't reset the signal on setuid execution.
>  - You permit setting this not just on the current process, but also on others.
> 
> For both of these: Are these differences actually necessary, and if
> so, can you provide a specific rationale? From a security perspective,
> I would very much prefer it if this API had semantics closer to
> PR_SET_PDEATHSIG.

Regarding setting on others, I started with setting for self. But there is
a requirement for supporting the feature for a process manager written in
bash script. That's the reason for allowing the setting on others.

Given the feedback from you and others, I agree that it would be simpler and
more secure to remove the setting on others. We can submit a patch for bash
to support the setting natively.

Regarding the impact of "setuid": the "PR_SET_PREDUMP_SIG" property only
expresses whether the application/process has set up a signal handler to
receive such a notification.  If it has, the "uid" should not matter.
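
Editorial note: for comparison, the existing PR_SET_PDEATHSIG interface mentioned in the review applies only to the calling process. The following is a minimal sketch of setting and reading back that value with prctl(2) on Linux. The two wrapper function names are hypothetical, and the sketch does not cover the proposed PR_SET_PREDUMP_SIG command.

```c
#include <assert.h>
#include <signal.h>
#include <sys/prctl.h>

/* Ask the kernel to send `sig` to the calling process when its parent dies. */
static int set_parent_death_signal(int sig)
{
	return prctl(PR_SET_PDEATHSIG, (unsigned long)sig, 0, 0, 0);
}

/* Read back the current setting; returns the signal number, or -1 on error. */
static int get_parent_death_signal(void)
{
	int sig = 0;

	if (prctl(PR_GET_PDEATHSIG, (unsigned long)&sig, 0, 0, 0) != 0)
		return -1;
	return sig;
}
```

Passing 0 to the setter clears the setting again. There is no pid argument, so a process can only configure this for itself.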

> 
> [...]
>> diff --git a/kernel/signal.c b/kernel/signal.c
>> index 312b43e..eb4a483 100644
>> --- a/kernel/signal.c
>> +++ b/kernel/signal.c
>> @@ -2337,6 +2337,44 @@ static int ptrace_signal(int signr, kernel_siginfo_t *info)
>> return signr;
>>  }
>>
>> +/*
>> + * Let the parent, if so desired, know about the imminent death of a child
>> + * prior to its coredump.
>> + *
>> + * Locking logic is similar to do_notify_parent_cldstop().
>> + */
>> +static void do_notify_parent_predump(struct task_struct *tsk)
>> +{
>> +   struct sighand_struct *sighand;
>> +   struct task_struct *parent;
>> +   struct kernel_siginfo info;
>> +   unsigned long flags;
>> +   int sig;
>> +
>> +   parent = tsk->real_parent;
>> +   sig = parent->predump_signal;
>> +
>> +   /* Check again with "tasklist_lock" locked by the caller */
>> +   if (!valid_predump_signal(sig))
>> +   return;
>> +
>> +   clear_siginfo(&info);
>> +   info.si_signo = sig;
>> +   if (sig == SIGCHLD)
>> +   info.si_code = CLD_PREDUMP;
>> +
>> +   rcu_read_lock();
>> +   info.si_pid = task_pid_nr_ns(tsk, task_active_pid_ns(parent));
>> +   info.si_uid = from_kuid_munged(task_cred_xxx(parent, user_ns),
>> +  task_uid(tsk));
> 
> You're sending a signal from the current namespaces, but with IDs that
> have been mapped into the parent's namespaces? That looks wrong to me.

I am following the example "do_notify_parent_cldstop()" called in the same
routine "get_signal()". If there is a better way, sure I will use it.

> 
>> +   rcu_read_unlock();
>> +
>> +   sighand = parent->sighand;
>> +   spin_lock_irqsave(&sighand->siglock, flags);
>> +   __group_send_sig_info(sig, &info, parent);
>> +   spin_unlock_irqrestore(&sighand->siglock, flags);
>> +}
>> +
>>  bool get_signal(struct ksignal *ksig)
>>  {
>> struct sighand_struct *sighand = current->sighand;
>> @@ -2497,6 +2535,19 @@ bool get_signal(struct ksignal *ksig)
>> current->flags |= PF_SIGNALED;
>>
>> if (sig_kernel_coredump(signr)) {
>> +   /*
>> +* Notify the parent prior to the coredump if the
>> +* parent is interested in such a notification.
>> +*/
>> +   int p_sig = current->real_parent->predump_signal;
> 
> current->real_parent is an __rcu member. I think if you run the sparse
> checker against this patch, it's going to complain. Are you allowed to
> access current->real_parent in this context?

Let me check, and get back to you on this one.

> 
>> +   if (valid_predump_signal(p_sig)) {
>> +   read_lock(&tasklist_lock);
>> +   do_notify_parent_predump(current);
>> +   read_unlock(&tasklist_lock);
>> +   cond_resched();
>> +   }
>> +
>> if (print_fatal_signals)
>> print_fatal_signal(ksig->info.si_signo);
>> proc_coredump_connector(current);
>> diff --git a/kernel/sys.c b/kernel/sys.c
>> index 1
