Re: [PATCH net-next v2] ibmveth: Allow users to update reported speed and duplex

2019-08-07 Thread Michael Ellerman
Thomas Falcon  writes:
> Reported ethtool link settings for the ibmveth driver are currently
> hardcoded and no longer reflect the actual capabilities of supported
> hardware. There is no interface designed for retrieving this information
> from device firmware nor is there any way to update current settings
> to reflect observed or expected link speeds.
>
> To avoid breaking existing configurations, retain current values as
> default settings but let users update them to match the expected
> capabilities of underlying hardware if needed. This update would
> allow the use of configurations that rely on certain link speed
> settings, such as LACP. This patch is based on the implementation
> in virtio_net.
>
> Signed-off-by: Thomas Falcon 
> ---
> v2: Updated default driver speed/duplex settings to avoid
> breaking existing setups

Thanks.

I won't give you an ack because I don't know jack about network drivers
these days, but I think that alleviates my concern about breaking
existing setups. I'll leave the rest of the review up to the networking
folks.

cheers

> diff --git a/drivers/net/ethernet/ibm/ibmveth.c 
> b/drivers/net/ethernet/ibm/ibmveth.c
> index d654c23..5dc634f 100644
> --- a/drivers/net/ethernet/ibm/ibmveth.c
> +++ b/drivers/net/ethernet/ibm/ibmveth.c
> @@ -712,31 +712,68 @@ static int ibmveth_close(struct net_device *netdev)
>   return 0;
>  }
>  
> -static int netdev_get_link_ksettings(struct net_device *dev,
> -  struct ethtool_link_ksettings *cmd)
> +static bool
> +ibmveth_validate_ethtool_cmd(const struct ethtool_link_ksettings *cmd)
>  {
> - u32 supported, advertising;
> -
> - supported = (SUPPORTED_1000baseT_Full | SUPPORTED_Autoneg |
> - SUPPORTED_FIBRE);
> - advertising = (ADVERTISED_1000baseT_Full | ADVERTISED_Autoneg |
> - ADVERTISED_FIBRE);
> - cmd->base.speed = SPEED_1000;
> - cmd->base.duplex = DUPLEX_FULL;
> - cmd->base.port = PORT_FIBRE;
> - cmd->base.phy_address = 0;
> - cmd->base.autoneg = AUTONEG_ENABLE;
> -
> - ethtool_convert_legacy_u32_to_link_mode(cmd->link_modes.supported,
> - supported);
> - ethtool_convert_legacy_u32_to_link_mode(cmd->link_modes.advertising,
> - advertising);
> + struct ethtool_link_ksettings diff1 = *cmd;
> + struct ethtool_link_ksettings diff2 = {};
> +
> + diff2.base.port = PORT_OTHER;
> + diff1.base.speed = 0;
> + diff1.base.duplex = 0;
> + diff1.base.cmd = 0;
> + diff1.base.link_mode_masks_nwords = 0;
> + ethtool_link_ksettings_zero_link_mode(&diff1, advertising);
> +
> + return !memcmp(&diff1.base, &diff2.base, sizeof(diff1.base)) &&
> + bitmap_empty(diff1.link_modes.supported,
> +  __ETHTOOL_LINK_MODE_MASK_NBITS) &&
> + bitmap_empty(diff1.link_modes.advertising,
> +  __ETHTOOL_LINK_MODE_MASK_NBITS) &&
> + bitmap_empty(diff1.link_modes.lp_advertising,
> +  __ETHTOOL_LINK_MODE_MASK_NBITS);
> +}
> +
> +static int ibmveth_set_link_ksettings(struct net_device *dev,
> +   const struct ethtool_link_ksettings *cmd)
> +{
> + struct ibmveth_adapter *adapter = netdev_priv(dev);
> + u32 speed;
> + u8 duplex;
> +
> + speed = cmd->base.speed;
> + duplex = cmd->base.duplex;
> + /* don't allow custom speed and duplex */
> + if (!ethtool_validate_speed(speed) ||
> + !ethtool_validate_duplex(duplex) ||
> + !ibmveth_validate_ethtool_cmd(cmd))
> + return -EINVAL;
> + adapter->speed = speed;
> + adapter->duplex = duplex;
>  
>   return 0;
>  }
>  
> -static void netdev_get_drvinfo(struct net_device *dev,
> -struct ethtool_drvinfo *info)
> +static int ibmveth_get_link_ksettings(struct net_device *dev,
> +   struct ethtool_link_ksettings *cmd)
> +{
> + struct ibmveth_adapter *adapter = netdev_priv(dev);
> +
> + cmd->base.speed = adapter->speed;
> + cmd->base.duplex = adapter->duplex;
> + cmd->base.port = PORT_OTHER;
> +
> + return 0;
> +}
> +
> +static void ibmveth_init_link_settings(struct ibmveth_adapter *adapter)
> +{
> + adapter->duplex = DUPLEX_FULL;
> + adapter->speed = SPEED_1000;
> +}
> +
> +static void ibmveth_get_drvinfo(struct net_device *dev,
> + struct ethtool_drvinfo *info)
>  {
>   strlcpy(info->driver, ibmveth_driver_name, sizeof(info->driver));
>   strlcpy(info->version, ibmveth_driver_version, sizeof(info->version));
> @@ -965,12 +1002,13 @@ static void ibmveth_get_ethtool_stats(struct 
> net_device *dev,
>  }
>  
>  static const struct ethtool_ops netdev_ethtool_ops = {
> - .get_drvinfo= netdev_get_drvinfo,
> + .get_drvinfo= ibmveth_get_drvinfo,

Re: [PATCH 1/2] dma-mapping: fix page attributes for dma_mmap_*

2019-08-07 Thread Shawn Anastasio

On 8/7/19 8:04 AM, Christoph Hellwig wrote:

Actually it is typical modern Linux style to just provide a prototype
and then use "if (IS_ENABLED(CONFIG_FOO))" to guard the call(s) to it.
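A minimal sketch of that pattern (CONFIG_FOO and foo_init() are made-up names for illustration):

```
/* foo.h: the prototype is always visible; foo.c is only built when CONFIG_FOO=y */
void foo_init(void);

/* caller.c: no #ifdef needed at the call site */
static int bar_setup(void)
{
	/*
	 * When CONFIG_FOO is disabled, IS_ENABLED() evaluates to 0, the
	 * branch is compiled out and the reference to foo_init() is
	 * discarded, so the missing definition never reaches the linker.
	 */
	if (IS_ENABLED(CONFIG_FOO))
		foo_init();

	return 0;
}
```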


I see.


Also, like Will mentioned earlier, the function name isn't entirely
accurate anymore. I second the suggestion of using something like
arch_dma_noncoherent_pgprot().


As mentioned I plan to remove arch_dma_mmap_pgprot for 5.4, so I'd
rather avoid churn for the short period of time.


Yeah, fair enough.


As for your idea of defining
pgprot_dmacoherent for all architectures as

#ifndef pgprot_dmacoherent
#define pgprot_dmacoherent pgprot_noncached
#endif

I think that the name here is kind of misleading too, since this
definition will only be used when there is no support for proper
DMA coherency.


Do you have a suggestion for a better name?  I'm pretty bad at naming,
so just reusing the arm name seemed like a good way to avoid having
to make naming decisions myself.


Good question. Perhaps something like `pgprot_dmacoherent_fallback`
would better convey that this is only used for devices that don't
support DMA coherency? Or maybe `pgprot_dma_noncoherent`?



Re: [PATCH v5 02/10] powerpc: move memstart_addr and kernstart_addr to init-common.c

2019-08-07 Thread Michael Ellerman
Jason Yan  writes:
> These two variables are both defined in init_32.c and init_64.c. Move
> them to init-common.c.
>
> Signed-off-by: Jason Yan 
> Cc: Diana Craciun 
> Cc: Michael Ellerman 
> Cc: Christophe Leroy 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Nicholas Piggin 
> Cc: Kees Cook 
> Reviewed-by: Christophe Leroy 
> Reviewed-by: Diana Craciun 
> Tested-by: Diana Craciun 
> ---
>  arch/powerpc/mm/init-common.c | 5 +
>  arch/powerpc/mm/init_32.c | 5 -
>  arch/powerpc/mm/init_64.c | 5 -
>  3 files changed, 5 insertions(+), 10 deletions(-)
>
> diff --git a/arch/powerpc/mm/init-common.c b/arch/powerpc/mm/init-common.c
> index a84da92920f7..152ae0d21435 100644
> --- a/arch/powerpc/mm/init-common.c
> +++ b/arch/powerpc/mm/init-common.c
> @@ -21,6 +21,11 @@
>  #include 
>  #include 
>  
> +phys_addr_t memstart_addr = (phys_addr_t)~0ull;
> +EXPORT_SYMBOL_GPL(memstart_addr);
> +phys_addr_t kernstart_addr;
> +EXPORT_SYMBOL_GPL(kernstart_addr);

Would be nice if these could be __ro_after_init?
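i.e. something like this sketch of the suggested annotation (the variables are only written during early boot, so they can be made read-only afterwards):

```
phys_addr_t memstart_addr __ro_after_init = (phys_addr_t)~0ull;
EXPORT_SYMBOL_GPL(memstart_addr);
phys_addr_t kernstart_addr __ro_after_init;
EXPORT_SYMBOL_GPL(kernstart_addr);
```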

cheers


Re: [PATCH v5 03/10] powerpc: introduce kimage_vaddr to store the kernel base

2019-08-07 Thread Michael Ellerman
Jason Yan  writes:
> Now the kernel base is a fixed value - KERNELBASE. To support KASLR, we
> need a variable to store the kernel base.
>
> Signed-off-by: Jason Yan 
> Cc: Diana Craciun 
> Cc: Michael Ellerman 
> Cc: Christophe Leroy 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Nicholas Piggin 
> Cc: Kees Cook 
> Reviewed-by: Christophe Leroy 
> Reviewed-by: Diana Craciun 
> Tested-by: Diana Craciun 
> ---
>  arch/powerpc/include/asm/page.h | 2 ++
>  arch/powerpc/mm/init-common.c   | 2 ++
>  2 files changed, 4 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
> index 0d52f57fca04..60a68d3a54b1 100644
> --- a/arch/powerpc/include/asm/page.h
> +++ b/arch/powerpc/include/asm/page.h
> @@ -315,6 +315,8 @@ void arch_free_page(struct page *page, int order);
>  
>  struct vm_area_struct;
>  
> +extern unsigned long kimage_vaddr;
> +
>  #include 
>  #endif /* __ASSEMBLY__ */
>  #include 
> diff --git a/arch/powerpc/mm/init-common.c b/arch/powerpc/mm/init-common.c
> index 152ae0d21435..d4801ce48dc5 100644
> --- a/arch/powerpc/mm/init-common.c
> +++ b/arch/powerpc/mm/init-common.c
> @@ -25,6 +25,8 @@ phys_addr_t memstart_addr = (phys_addr_t)~0ull;
>  EXPORT_SYMBOL_GPL(memstart_addr);
>  phys_addr_t kernstart_addr;
>  EXPORT_SYMBOL_GPL(kernstart_addr);
> +unsigned long kimage_vaddr = KERNELBASE;
> +EXPORT_SYMBOL_GPL(kimage_vaddr);

The names of the #defines and variables we use for these values are not
very consistent already, but using kimage_vaddr makes it worse I think.

Isn't this going to have the same value as kernstart_addr, but the
virtual rather than physical address?

If so kernstart_virt_addr would seem better.

cheers


Re: [PATCH v5 07/10] powerpc/fsl_booke/32: randomize the kernel image offset

2019-08-07 Thread Michael Ellerman
Jason Yan  writes:
> After we have basic support for relocating the kernel to an
> appropriate place, we can start to randomize the offset now.
>
> Entropy is derived from the banner and timer, which will change every
> build and boot. This is not very safe, so additionally the bootloader
> may pass entropy via the /chosen/kaslr-seed node in the device tree.
>
> We will use the first 512M of the low memory to randomize the kernel
> image. The memory will be split into 64M zones. We will use the lower 8
> bits of the entropy to decide the index of the 64M zone. Then we choose
> a 16K-aligned offset inside the 64M zone to put the kernel in.
>
> KERNELBASE
>
>     |-->   64M   <--|
>     |               |
>     +---------------+    +----------------+---------------+
>     |               |....|    |kernel|    |               |
>     +---------------+    +----------------+---------------+
>     |                         |
>     |----->   offset    <-----|
>
>                           kimage_vaddr
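For readers, a rough sketch of the zone/offset selection described above (illustrative only, not the patch's actual code; the SZ_* constants are those from linux/sizes.h):

```
#include <linux/sizes.h>

/*
 * Illustrative only: split the first 512M into eight 64M zones, pick a
 * zone from the low 8 bits of the seed, then pick a 16K-aligned offset
 * that still leaves room for the kernel inside that zone.
 */
static unsigned long pick_kaslr_offset(unsigned long seed,
				       unsigned long kernel_sz)
{
	unsigned long zone = (seed & 0xff) % (SZ_512M / SZ_64M);
	unsigned long offset = (seed >> 8) % (SZ_64M - kernel_sz);

	offset &= ~(SZ_16K - 1);	/* 16K alignment */

	return zone * SZ_64M + offset;
}
```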

Can you drop this description / diagram and any other relevant design
details in eg. Documentation/powerpc/kaslr-booke32.rst please?

See cpu_families.rst for an example of how to incorporate the ASCII
diagram.

> diff --git a/arch/powerpc/kernel/kaslr_booke.c 
> b/arch/powerpc/kernel/kaslr_booke.c
> index 30f84c0321b2..52b59b05f906 100644
> --- a/arch/powerpc/kernel/kaslr_booke.c
> +++ b/arch/powerpc/kernel/kaslr_booke.c
> @@ -34,15 +36,329 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
> +#include 
> +#include 
> +
> +#ifdef DEBUG
> +#define DBG(fmt...) pr_info(fmt)
> +#else
> +#define DBG(fmt...)
> +#endif

Just use pr_debug()?

> +struct regions {
> + unsigned long pa_start;
> + unsigned long pa_end;
> + unsigned long kernel_size;
> + unsigned long dtb_start;
> + unsigned long dtb_end;
> + unsigned long initrd_start;
> + unsigned long initrd_end;
> + unsigned long crash_start;
> + unsigned long crash_end;
> + int reserved_mem;
> + int reserved_mem_addr_cells;
> + int reserved_mem_size_cells;
> +};
>  
>  extern int is_second_reloc;
>  
> +/* Simplified build-specific string for starting entropy. */
> +static const char build_str[] = UTS_RELEASE " (" LINUX_COMPILE_BY "@"
> + LINUX_COMPILE_HOST ") (" LINUX_COMPILER ") " UTS_VERSION;
> +
> +static __init void kaslr_get_cmdline(void *fdt)
> +{
> + int node = fdt_path_offset(fdt, "/chosen");
> +
> + early_init_dt_scan_chosen(node, "chosen", 1, boot_command_line);
> +}
> +
> +static unsigned long __init rotate_xor(unsigned long hash, const void *area,
> +size_t size)
> +{
> + size_t i;
> + const unsigned long *ptr = area;
> +
> + for (i = 0; i < size / sizeof(hash); i++) {
> + /* Rotate by odd number of bits and XOR. */
> + hash = (hash << ((sizeof(hash) * 8) - 7)) | (hash >> 7);
> + hash ^= ptr[i];
> + }
> +
> + return hash;
> +}

That looks suspiciously like the version Kees wrote in 2013 in
arch/x86/boot/compressed/kaslr.c ?

You should mention that in the change log at least.

> +
> +/* Attempt to create a simple but unpredictable starting entropy. */

It's simple, but I would argue unpredictable is not really true. A local
attacker can probably fingerprint the kernel version, and also has
access to the unflattened device tree, which means they can make
educated guesses about the flattened tree size.

Be careful when copying comments :)

> +static unsigned long __init get_boot_seed(void *fdt)
> +{
> + unsigned long hash = 0;
> +
> + hash = rotate_xor(hash, build_str, sizeof(build_str));
> + hash = rotate_xor(hash, fdt, fdt_totalsize(fdt));
> +
> + return hash;
> +}
> +
> +static __init u64 get_kaslr_seed(void *fdt)
> +{
> + int node, len;
> + fdt64_t *prop;
> + u64 ret;
> +
> + node = fdt_path_offset(fdt, "/chosen");
> + if (node < 0)
> + return 0;
> +
> + prop = fdt_getprop_w(fdt, node, "kaslr-seed", &len);
> + if (!prop || len != sizeof(u64))
> + return 0;
> +
> + ret = fdt64_to_cpu(*prop);
> + *prop = 0;
> + return ret;
> +}
> +
> +static __init bool regions_overlap(u32 s1, u32 e1, u32 s2, u32 e2)
> +{
> + return e1 >= s2 && e2 >= s1;
> +}

There's a generic helper called memory_intersects(), though it takes
void*. Might not be worth using, not sure.

...
>  static unsigned long __init kaslr_choose_location(void *dt_ptr, phys_addr_t 
> size,
> unsigned long kernel_sz)
>  {
> - /* return a fixed offset of 64M for now */
> - return SZ_64M;
> + unsigned long offset, random;
> + unsigned long ram, linear_sz;
> + unsigned long kaslr_offset;
> + u64 seed;
> + struct regions regions;

You pass that around to a lot of the functions; would it be simpler just
to make it static global and __initdata?
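i.e. roughly (sketch):

```
/* file scope instead of on-stack; discarded after init via __initdata */
static struct regions regions __initdata;
```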

cheers

Re: [PATCH v5 10/10] powerpc/fsl_booke/kaslr: dump out kernel offset information on panic

2019-08-07 Thread Michael Ellerman
Jason Yan  writes:
> When kaslr is enabled, the kernel offset is different for every boot.
> This makes it difficult to debug the kernel. Dump out the kernel
> offset on panic so that we can easily debug the kernel.

Some of this is taken from the arm64 version right? Please say so when
you copy other people's code.

> diff --git a/arch/powerpc/kernel/machine_kexec.c 
> b/arch/powerpc/kernel/machine_kexec.c
> index c4ed328a7b96..078fe3d76feb 100644
> --- a/arch/powerpc/kernel/machine_kexec.c
> +++ b/arch/powerpc/kernel/machine_kexec.c
> @@ -86,6 +86,7 @@ void arch_crash_save_vmcoreinfo(void)
>   VMCOREINFO_STRUCT_SIZE(mmu_psize_def);
>   VMCOREINFO_OFFSET(mmu_psize_def, shift);
>  #endif
> + vmcoreinfo_append_str("KERNELOFFSET=%lx\n", kaslr_offset());
>  }

There's no mention of that in the commit log.

Please split it into a separate patch and describe what you're doing and
why.

> diff --git a/arch/powerpc/kernel/setup-common.c 
> b/arch/powerpc/kernel/setup-common.c
> index 1f8db666468d..064075f02837 100644
> --- a/arch/powerpc/kernel/setup-common.c
> +++ b/arch/powerpc/kernel/setup-common.c
> @@ -715,12 +715,31 @@ static struct notifier_block ppc_panic_block = {
>   .priority = INT_MIN /* may not return; must be done last */
>  };
>  
> +/*
> + * Dump out kernel offset information on panic.
> + */
> +static int dump_kernel_offset(struct notifier_block *self, unsigned long v,
> +   void *p)
> +{
> + pr_emerg("Kernel Offset: 0x%lx from 0x%lx\n",
> +  kaslr_offset(), KERNELBASE);
> +
> + return 0;
> +}
> +
> +static struct notifier_block kernel_offset_notifier = {
> + .notifier_call = dump_kernel_offset
> +};
> +
>  void __init setup_panic(void)
>  {
>   /* PPC64 always does a hard irq disable in its panic handler */
>   if (!IS_ENABLED(CONFIG_PPC64) && !ppc_md.panic)
>   return;
>   atomic_notifier_chain_register(&panic_notifier_list, &ppc_panic_block);

> + if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_offset() > 0)
> + atomic_notifier_chain_register(&panic_notifier_list,
> +&kernel_offset_notifier);

Don't you want to do that before the return above?

>  }
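i.e. one way to address that, registering the offset notifier before the early return (a sketch based on the quoted hunk):

```
void __init setup_panic(void)
{
	/* Register this regardless of the ppc_md.panic check below. */
	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_offset() > 0)
		atomic_notifier_chain_register(&panic_notifier_list,
					       &kernel_offset_notifier);

	/* PPC64 always does a hard irq disable in its panic handler */
	if (!IS_ENABLED(CONFIG_PPC64) && !ppc_md.panic)
		return;
	atomic_notifier_chain_register(&panic_notifier_list, &ppc_panic_block);
}
```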

cheers


Re: [PATCH v5 09/10] powerpc/fsl_booke/kaslr: support nokaslr cmdline parameter

2019-08-07 Thread Michael Ellerman
Jason Yan  writes:
> diff --git a/arch/powerpc/kernel/kaslr_booke.c 
> b/arch/powerpc/kernel/kaslr_booke.c
> index c6b326424b54..436f9a03f385 100644
> --- a/arch/powerpc/kernel/kaslr_booke.c
> +++ b/arch/powerpc/kernel/kaslr_booke.c
> @@ -361,6 +361,18 @@ static unsigned long __init kaslr_choose_location(void 
> *dt_ptr, phys_addr_t size
>   return kaslr_offset;
>  }
>  
> +static inline __init bool kaslr_disabled(void)
> +{
> + char *str;
> +
> + str = strstr(boot_command_line, "nokaslr");
> + if (str == boot_command_line ||
> + (str > boot_command_line && *(str - 1) == ' '))
> + return true;

This extra logic doesn't work for "nokaslrfoo". Is it worth it?
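(For reference, making it watertight would also need a check on the character following the match — a sketch, which may or may not be worth the extra complexity:)

```
static inline __init bool kaslr_disabled(void)
{
	char *str = strstr(boot_command_line, "nokaslr");

	/* require a word boundary on both sides of "nokaslr" */
	return str &&
	       (str == boot_command_line || *(str - 1) == ' ') &&
	       (str[7] == '\0' || str[7] == ' ');
}
```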

cheers


Re: [PATCH v5 06/10] powerpc/fsl_booke/32: implement KASLR infrastructure

2019-08-07 Thread Michael Ellerman
Jason Yan  writes:
> This patch adds support for booting the kernel from places other than
> KERNELBASE. Since CONFIG_RELOCATABLE is already supported, what we need
> to do is map or copy the kernel to a proper place and relocate. Freescale
> Book-E parts expect lowmem to be mapped by fixed TLB entries (TLB1). The
> TLB1 entries are not suitable to map the kernel directly in a randomized
> region, so we choose to copy the kernel to a proper place and restart to
> relocate.

So to be 100% clear you are randomising the location of the kernel in
virtual and physical space, by the same amount, and retaining the 1:1
linear mapping.

> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 77f6ebf97113..755378887912 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -548,6 +548,17 @@ config RELOCATABLE
> setting can still be useful to bootwrappers that need to know the
> load address of the kernel (eg. u-boot/mkimage).
>  
> +config RANDOMIZE_BASE
> + bool "Randomize the address of the kernel image"
> + depends on (FSL_BOOKE && FLATMEM && PPC32)
> + select RELOCATABLE

I think this should depend on RELOCATABLE, rather than selecting it.

> diff --git a/arch/powerpc/kernel/kaslr_booke.c 
> b/arch/powerpc/kernel/kaslr_booke.c
> new file mode 100644
> index ..30f84c0321b2
> --- /dev/null
> +++ b/arch/powerpc/kernel/kaslr_booke.c
> @@ -0,0 +1,84 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (C) 2019 Jason Yan 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.

You don't need that paragraph now that you have the SPDX tag.

Rather than using a '//' comment followed by a single line block comment
you can format it as:

// SPDX-License-Identifier: GPL-2.0-only
//
// Copyright (C) 2019 Jason Yan 


> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 

Do you really need all those headers?

> +extern int is_second_reloc;

That should be in a header.

Any reason why it isn't a bool?

cheers



Re: [PATCH v5 00/10] implement KASLR for powerpc/fsl_booke/32

2019-08-07 Thread Michael Ellerman
Hi Jason,

Jason Yan  writes:
> This series implements KASLR for powerpc/fsl_booke/32, as a security
> feature that deters exploit attempts relying on knowledge of the location
> of kernel internals.

Thanks for doing this work.

Sorry I didn't get a chance to look at this until v5, I sent a few
comments just now. Nothing major though, I think this looks almost ready
to merge.

cheers

> Since CONFIG_RELOCATABLE is already supported, what we need to do is
> map or copy the kernel to a proper place and relocate. Freescale Book-E
> parts expect lowmem to be mapped by fixed TLB entries (TLB1). The TLB1
> entries are not suitable to map the kernel directly in a randomized
> region, so we choose to copy the kernel to a proper place and restart to
> relocate.
>
> Entropy is derived from the banner and timer base, which will change every
> build and boot. This is not very safe, so additionally the bootloader may
> pass entropy via the /chosen/kaslr-seed node in the device tree.
>
> We will use the first 512M of the low memory to randomize the kernel
> image. The memory will be split into 64M zones. We will use the lower 8
> bits of the entropy to decide the index of the 64M zone. Then we choose
> a 16K-aligned offset inside the 64M zone to put the kernel in.
>
> KERNELBASE
>
>     |-->   64M   <--|
>     |               |
>     +---------------+    +----------------+---------------+
>     |               |....|    |kernel|    |               |
>     +---------------+    +----------------+---------------+
>     |                         |
>     |----->   offset    <-----|
>
>                           kimage_vaddr
>
> We also check if we will overlap with some areas like the dtb area, the
> initrd area or the crashkernel area. If we cannot find a proper area,
> kaslr will be disabled and boot from the original kernel.
>
> Changes since v4:
>  - Add Reviewed-by tag from Christophe
>  - Remove an unnecessary cast
>  - Remove unnecessary parenthesis
>  - Fix checkpatch warning
>
> Changes since v3:
>  - Add Reviewed-by and Tested-by tag from Diana
>  - Change the comment in fsl_booke_entry_mapping.S to be consistent
>with the new code.
>
> Changes since v2:
>  - Remove unnecessary #ifdef
>  - Use SZ_64M instead of 0x400
>  - Call early_init_dt_scan_chosen() to init boot_command_line
>  - Rename kaslr_second_init() to kaslr_late_init()
>
> Changes since v1:
>  - Remove some useless 'extern' keyword.
>  - Replace EXPORT_SYMBOL with EXPORT_SYMBOL_GPL
>  - Improve some assembly code
>  - Use memzero_explicit instead of memset
>  - Use boot_command_line and remove early_command_line
>  - Do not print kaslr offset if kaslr is disabled
>
> Jason Yan (10):
>   powerpc: unify definition of M_IF_NEEDED
>   powerpc: move memstart_addr and kernstart_addr to init-common.c
>   powerpc: introduce kimage_vaddr to store the kernel base
>   powerpc/fsl_booke/32: introduce create_tlb_entry() helper
>   powerpc/fsl_booke/32: introduce reloc_kernel_entry() helper
>   powerpc/fsl_booke/32: implement KASLR infrastructure
>   powerpc/fsl_booke/32: randomize the kernel image offset
>   powerpc/fsl_booke/kaslr: clear the original kernel if randomized
>   powerpc/fsl_booke/kaslr: support nokaslr cmdline parameter
>   powerpc/fsl_booke/kaslr: dump out kernel offset information on panic
>
>  arch/powerpc/Kconfig  |  11 +
>  arch/powerpc/include/asm/nohash/mmu-book3e.h  |  10 +
>  arch/powerpc/include/asm/page.h   |   7 +
>  arch/powerpc/kernel/Makefile  |   1 +
>  arch/powerpc/kernel/early_32.c|   2 +-
>  arch/powerpc/kernel/exceptions-64e.S  |  10 -
>  arch/powerpc/kernel/fsl_booke_entry_mapping.S |  27 +-
>  arch/powerpc/kernel/head_fsl_booke.S  |  55 ++-
>  arch/powerpc/kernel/kaslr_booke.c | 427 ++
>  arch/powerpc/kernel/machine_kexec.c   |   1 +
>  arch/powerpc/kernel/misc_64.S |   5 -
>  arch/powerpc/kernel/setup-common.c|  19 +
>  arch/powerpc/mm/init-common.c |   7 +
>  arch/powerpc/mm/init_32.c |   5 -
>  arch/powerpc/mm/init_64.c |   5 -
>  arch/powerpc/mm/mmu_decl.h|  10 +
>  arch/powerpc/mm/nohash/fsl_booke.c|   8 +-
>  17 files changed, 560 insertions(+), 50 deletions(-)
>  create mode 100644 arch/powerpc/kernel/kaslr_booke.c
>
> -- 
> 2.17.2


Re: [PATCH v5 01/10] powerpc: unify definition of M_IF_NEEDED

2019-08-07 Thread Michael Ellerman
Jason Yan  writes:
> M_IF_NEEDED is defined too many times. Move it to a common place.

The name is not great, can you call it MAS2_M_IF_NEEDED, which at least
gives a clue what it's for?

cheers

> Signed-off-by: Jason Yan 
> Cc: Diana Craciun 
> Cc: Michael Ellerman 
> Cc: Christophe Leroy 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Nicholas Piggin 
> Cc: Kees Cook 
> Reviewed-by: Christophe Leroy 
> Reviewed-by: Diana Craciun 
> Tested-by: Diana Craciun 
> ---
>  arch/powerpc/include/asm/nohash/mmu-book3e.h  | 10 ++
>  arch/powerpc/kernel/exceptions-64e.S  | 10 --
>  arch/powerpc/kernel/fsl_booke_entry_mapping.S | 10 --
>  arch/powerpc/kernel/misc_64.S |  5 -
>  4 files changed, 10 insertions(+), 25 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/nohash/mmu-book3e.h 
> b/arch/powerpc/include/asm/nohash/mmu-book3e.h
> index 4c9777d256fb..0877362e48fa 100644
> --- a/arch/powerpc/include/asm/nohash/mmu-book3e.h
> +++ b/arch/powerpc/include/asm/nohash/mmu-book3e.h
> @@ -221,6 +221,16 @@
>  #define TLBILX_T_CLASS2  6
>  #define TLBILX_T_CLASS3  7
>  
> +/*
> + * The mapping only needs to be cache-coherent on SMP, except on
> + * Freescale e500mc derivatives where it's also needed for coherent DMA.
> + */
> +#if defined(CONFIG_SMP) || defined(CONFIG_PPC_E500MC)
> +#define M_IF_NEEDED  MAS2_M
> +#else
> +#define M_IF_NEEDED  0
> +#endif
> +
>  #ifndef __ASSEMBLY__
>  #include 
>  
> diff --git a/arch/powerpc/kernel/exceptions-64e.S 
> b/arch/powerpc/kernel/exceptions-64e.S
> index 1cfb3da4a84a..fd49ec07ce4a 100644
> --- a/arch/powerpc/kernel/exceptions-64e.S
> +++ b/arch/powerpc/kernel/exceptions-64e.S
> @@ -1342,16 +1342,6 @@ skpinv:addir6,r6,1 
> /* Increment */
>   sync
>   isync
>  
> -/*
> - * The mapping only needs to be cache-coherent on SMP, except on
> - * Freescale e500mc derivatives where it's also needed for coherent DMA.
> - */
> -#if defined(CONFIG_SMP) || defined(CONFIG_PPC_E500MC)
> -#define M_IF_NEEDED  MAS2_M
> -#else
> -#define M_IF_NEEDED  0
> -#endif
> -
>  /* 6. Setup KERNELBASE mapping in TLB[0]
>   *
>   * r3 = MAS0 w/TLBSEL & ESEL for the entry we started in
> diff --git a/arch/powerpc/kernel/fsl_booke_entry_mapping.S 
> b/arch/powerpc/kernel/fsl_booke_entry_mapping.S
> index ea065282b303..de0980945510 100644
> --- a/arch/powerpc/kernel/fsl_booke_entry_mapping.S
> +++ b/arch/powerpc/kernel/fsl_booke_entry_mapping.S
> @@ -153,16 +153,6 @@ skpinv:  addir6,r6,1 /* 
> Increment */
>   tlbivax 0,r9
>   TLBSYNC
>  
> -/*
> - * The mapping only needs to be cache-coherent on SMP, except on
> - * Freescale e500mc derivatives where it's also needed for coherent DMA.
> - */
> -#if defined(CONFIG_SMP) || defined(CONFIG_PPC_E500MC)
> -#define M_IF_NEEDED  MAS2_M
> -#else
> -#define M_IF_NEEDED  0
> -#endif
> -
>  #if defined(ENTRY_MAPPING_BOOT_SETUP)
>  
>  /* 6. Setup KERNELBASE mapping in TLB1[0] */
> diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
> index b55a7b4cb543..26074f92d4bc 100644
> --- a/arch/powerpc/kernel/misc_64.S
> +++ b/arch/powerpc/kernel/misc_64.S
> @@ -432,11 +432,6 @@ kexec_create_tlb:
>   rlwimi  r9,r10,16,4,15  /* Setup MAS0 = TLBSEL | ESEL(r9) */
>  
>  /* Set up a temp identity mapping v:0 to p:0 and return to it. */
> -#if defined(CONFIG_SMP) || defined(CONFIG_PPC_E500MC)
> -#define M_IF_NEEDED  MAS2_M
> -#else
> -#define M_IF_NEEDED  0
> -#endif
>   mtspr   SPRN_MAS0,r9
>  
>   lis r9,(MAS1_VALID|MAS1_IPROT)@h
> -- 
> 2.17.2


[PATCH v8 0/7] powerpc: implement machine check safe memcpy

2019-08-07 Thread Santosh Sivaraj
During a memcpy from a pmem device, if a machine check exception is
generated we end up in a panic. In the case of an fsdax read, this should
only result in -EIO. Avoid the panic by implementing memcpy_mcsafe.
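For context, a caller consumes the memcpy_mcsafe() return value roughly like this (a simplified sketch, not the actual pmem driver code):

```
/*
 * memcpy_mcsafe() returns the number of bytes left uncopied (0 on
 * success), so a pmem-style read path can turn a consumed machine
 * check into -EIO instead of a panic.
 */
static int mcsafe_read(void *dst, const void *pmem_src, size_t len)
{
	unsigned long rem = memcpy_mcsafe(dst, pmem_src, len);

	return rem ? -EIO : 0;
}
```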

Before this patch series:

```
bash-4.4# mount -o dax /dev/pmem0 /mnt/pmem/
[ 7621.714094] Disabling lock debugging due to kernel taint
[ 7621.714099] MCE: CPU0: machine check (Severe) Host UE Load/Store [Not 
recovered]
[ 7621.714104] MCE: CPU0: NIP: [c0088978] memcpy_power7+0x418/0x7e0
[ 7621.714107] MCE: CPU0: Hardware error
[ 7621.714112] opal: Hardware platform error: Unrecoverable Machine Check 
exception
[ 7621.714118] CPU: 0 PID: 1368 Comm: mount Tainted: G   M  
5.2.0-rc5-00239-g241e39004581
#50
[ 7621.714123] NIP:  c0088978 LR: c08e16f8 CTR: 01de
[ 7621.714129] REGS: c000fffbfd70 TRAP: 0200   Tainted: G   M  
(5.2.0-rc5-00239-g241e39004581)
[ 7621.714131] MSR:  92209033   CR: 
24428840  XER: 0004
[ 7621.714160] CFAR: c00889a8 DAR: deadbeefdeadbeef DSISR: 8000 
IRQMASK: 0
[ 7621.714171] GPR00: 0e00 c000f0b8b1e0 c12cf100 
c000ed8e1100 
[ 7621.714186] GPR04: c2001100 0001 0200 
03fff1272000 
[ 7621.714201] GPR08: 8000 0010 0020 
0030 
[ 7621.714216] GPR12: 0040 7fffb8c6d390 0050 
0060 
[ 7621.714232] GPR16: 0070  0001 
c000f0b8b960 
[ 7621.714247] GPR20: 0001 c000f0b8b940 0001 
0001 
[ 7621.714262] GPR24: c1382560 c00c003b6380 c00c003b6380 
0001 
[ 7621.714277] GPR28:  0001 c200 
0001 
[ 7621.714294] NIP [c0088978] memcpy_power7+0x418/0x7e0
[ 7621.714298] LR [c08e16f8] pmem_do_bvec+0xf8/0x430
...  ...
```

After this patch series:

```
bash-4.4# mount -o dax /dev/pmem0 /mnt/pmem/
[25302.883978] Buffer I/O error on dev pmem0, logical block 0, async page read
[25303.020816] EXT4-fs (pmem0): DAX enabled. Warning: EXPERIMENTAL, use at your 
own risk
[25303.021236] EXT4-fs (pmem0): Can't read superblock on 2nd try
[25303.152515] EXT4-fs (pmem0): DAX enabled. Warning: EXPERIMENTAL, use at your 
own risk
[25303.284031] EXT4-fs (pmem0): DAX enabled. Warning: EXPERIMENTAL, use at your 
own risk
[25304.084100] UDF-fs: bad mount option "dax" or missing value
mount: /mnt/pmem: wrong fs type, bad option, bad superblock on /dev/pmem0, 
missing codepage or helper
program, or other error.
```

MCE is injected on a pmem address using mambo. The last patch which adds a
nop is only for testing on mambo, where r13 is not restored upon hitting
vector 200.

The memcpy code can be optimised by adding VMX optimizations and GAS macros
can be used to enable code reusability, which I will send as another series.

---
Change-log:
v8:
* While ignoring UE events, return was used instead of continue.
* Checkpatch fixups for commit log

v7:
* Move schedule_work to be called from irq_work.

v6:
* Don't return pfn, all callees are expecting physical address anyway [nick]
* Patch re-ordering: move exception table patch before memcpy_mcsafe patch 
[nick]
* Reword commit log for search_exception_tables patch [nick]

v5:
* Don't use search_exception_tables since it searches for module exception 
tables
  also [Nicholas]
* Fix commit message for patch 2 [Nicholas]

v4:
* Squash return remaining bytes patch to memcpy_mcsafe implemtation patch 
[christophe]
* Access ok should be checked for copy_to_user_mcsafe() [christophe]

v3:
* Drop patch which enables DR/IR for external modules
* Drop notifier call chain, we don't want to do that in real mode
* Return remaining bytes from memcpy_mcsafe correctly
* We no longer restore r13 for simulator tests, rather use a nop at 
  vector 0x200 [workaround for simulator; not to be merged]

v2:
* Don't set RI bit explicitly [mahesh]
* Re-ordered series to get r13 workaround as the last patch
--
Balbir Singh (2):
  powerpc/mce: Fix MCE handling for huge pages
  powerpc/memcpy: Add memcpy_mcsafe for pmem

Reza Arbab (1):
  powerpc/mce: Make machine_check_ue_event() static

Santosh Sivaraj (4):
  powerpc/mce: Schedule work from irq_work
  extable: Add function to search only kernel exception table
  powerpc/mce: Handle UE event for memcpy_mcsafe
  powerpc: add machine check safe copy_to_user

 arch/powerpc/Kconfig |   1 +
 arch/powerpc/include/asm/mce.h   |   6 +-
 arch/powerpc/include/asm/string.h|   2 +
 arch/powerpc/include/asm/uaccess.h   |  14 ++
 arch/powerpc/kernel/mce.c|  26 ++-
 arch/powerpc/kernel/mce_power.c  |  65 ---
 arch/powerpc/lib/Makefile|   2 +-
 arch/powerpc/lib/memcpy_mcsafe_64.S  | 242 +++
 arch/powerpc/platforms/pseries/ras.c |   9 +-
 include/linux/extable.h  |   2 +
 kernel/extable.c

[PATCH v8 1/7] powerpc/mce: Schedule work from irq_work

2019-08-07 Thread Santosh Sivaraj
schedule_work() cannot be called from MCE exception context as MCE can
interrupt even in interrupt disabled context.

Fixes: 733e4a4c ("powerpc/mce: hookup memory_failure for UE errors")
Signed-off-by: Santosh Sivaraj 
---
 arch/powerpc/kernel/mce.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index b18df633eae9..0ab6fa7c 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -144,7 +144,6 @@ void save_mce_event(struct pt_regs *regs, long handled,
if (phys_addr != ULONG_MAX) {
mce->u.ue_error.physical_address_provided = true;
mce->u.ue_error.physical_address = phys_addr;
-   machine_check_ue_event(mce);
}
}
return;
@@ -275,8 +274,7 @@ static void machine_process_ue_event(struct work_struct 
*work)
}
 }
 /*
- * process pending MCE event from the mce event queue. This function will be
- * called during syscall exit.
+ * process pending MCE event from the mce event queue.
  */
 static void machine_check_process_queued_event(struct irq_work *work)
 {
@@ -292,6 +290,10 @@ static void machine_check_process_queued_event(struct 
irq_work *work)
while (__this_cpu_read(mce_queue_count) > 0) {
index = __this_cpu_read(mce_queue_count) - 1;
evt = this_cpu_ptr(&mce_event_queue[index]);
+
+   if (evt->error_type == MCE_ERROR_TYPE_UE)
+   machine_check_ue_event(evt);
+
machine_check_print_event_info(evt, false, false);
__this_cpu_dec(mce_queue_count);
}
-- 
2.20.1



[PATCH v8 2/7] powerpc/mce: Make machine_check_ue_event() static

2019-08-07 Thread Santosh Sivaraj
From: Reza Arbab 

The function doesn't get used outside this file, so make it static.

Signed-off-by: Reza Arbab 
Signed-off-by: Santosh Sivaraj 
Reviewed-by: Nicholas Piggin 
---
 arch/powerpc/kernel/mce.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index 0ab6fa7c..8c0b471658a7 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -33,7 +33,7 @@ static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT],
mce_ue_event_queue);
 
 static void machine_check_process_queued_event(struct irq_work *work);
-void machine_check_ue_event(struct machine_check_event *evt);
+static void machine_check_ue_event(struct machine_check_event *evt);
 static void machine_process_ue_event(struct work_struct *work);
 
 static struct irq_work mce_event_process_work = {
@@ -202,7 +202,7 @@ void release_mce_event(void)
 /*
  * Queue up the MCE event which then can be handled later.
  */
-void machine_check_ue_event(struct machine_check_event *evt)
+static void machine_check_ue_event(struct machine_check_event *evt)
 {
int index;
 
-- 
2.20.1



[PATCH v8 3/7] powerpc/mce: Fix MCE handling for huge pages

2019-08-07 Thread Santosh Sivaraj
From: Balbir Singh 

The current code would fail on huge page addresses, since the shift would
be incorrect. Use the correct page shift value returned by
__find_linux_pte() to get the correct physical address. The code is more
generic and can handle both regular and compound pages.

Fixes: ba41e1e1ccb9 ("powerpc/mce: Hookup derror (load/store) UE errors")
Signed-off-by: Balbir Singh 
[ar...@linux.ibm.com: Fixup pseries_do_memory_failure()]
Signed-off-by: Reza Arbab 
Co-developed-by: Santosh Sivaraj 
Signed-off-by: Santosh Sivaraj 
---
 arch/powerpc/include/asm/mce.h   |  2 +-
 arch/powerpc/kernel/mce_power.c  | 50 ++--
 arch/powerpc/platforms/pseries/ras.c |  9 ++---
 3 files changed, 29 insertions(+), 32 deletions(-)

diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index a4c6a74ad2fb..f3a6036b6bc0 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -209,7 +209,7 @@ extern void release_mce_event(void);
 extern void machine_check_queue_event(void);
 extern void machine_check_print_event_info(struct machine_check_event *evt,
   bool user_mode, bool in_guest);
-unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr);
+unsigned long addr_to_phys(struct pt_regs *regs, unsigned long addr);
 #ifdef CONFIG_PPC_BOOK3S_64
 void flush_and_reload_slb(void);
 #endif /* CONFIG_PPC_BOOK3S_64 */
diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
index a814d2dfb5b0..bed38a8e2e50 100644
--- a/arch/powerpc/kernel/mce_power.c
+++ b/arch/powerpc/kernel/mce_power.c
@@ -20,13 +20,14 @@
 #include 
 
 /*
- * Convert an address related to an mm to a PFN. NOTE: we are in real
- * mode, we could potentially race with page table updates.
+ * Convert an address related to an mm to a physical address.
+ * NOTE: we are in real mode, we could potentially race with page table 
updates.
  */
-unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr)
+unsigned long addr_to_phys(struct pt_regs *regs, unsigned long addr)
 {
-   pte_t *ptep;
-   unsigned long flags;
+   pte_t *ptep, pte;
+   unsigned int shift;
+   unsigned long flags, phys_addr;
struct mm_struct *mm;
 
if (user_mode(regs))
@@ -35,14 +36,21 @@ unsigned long addr_to_pfn(struct pt_regs *regs, unsigned 
long addr)
mm = &init_mm;
 
local_irq_save(flags);
-   if (mm == current->mm)
-   ptep = find_current_mm_pte(mm->pgd, addr, NULL, NULL);
-   else
-   ptep = find_init_mm_pte(addr, NULL);
+   ptep = __find_linux_pte(mm->pgd, addr, NULL, &shift);
local_irq_restore(flags);
+
if (!ptep || pte_special(*ptep))
return ULONG_MAX;
-   return pte_pfn(*ptep);
+
+   pte = *ptep;
+   if (shift > PAGE_SHIFT) {
+   unsigned long rpnmask = (1ul << shift) - PAGE_SIZE;
+
+   pte = __pte(pte_val(pte) | (addr & rpnmask));
+   }
+   phys_addr = pte_pfn(pte) << PAGE_SHIFT;
+
+   return phys_addr;
 }
 
 /* flush SLBs and reload */
@@ -354,18 +362,16 @@ static int mce_find_instr_ea_and_pfn(struct pt_regs 
*regs, uint64_t *addr,
 * faults
 */
int instr;
-   unsigned long pfn, instr_addr;
+   unsigned long instr_addr;
struct instruction_op op;
struct pt_regs tmp = *regs;
 
-   pfn = addr_to_pfn(regs, regs->nip);
-   if (pfn != ULONG_MAX) {
-   instr_addr = (pfn << PAGE_SHIFT) + (regs->nip & ~PAGE_MASK);
+   instr_addr = addr_to_phys(regs, regs->nip) + (regs->nip & ~PAGE_MASK);
+   if (instr_addr != ULONG_MAX) {
instr = *(unsigned int *)(instr_addr);
if (!analyse_instr(&op, &tmp, instr)) {
-   pfn = addr_to_pfn(regs, op.ea);
*addr = op.ea;
-   *phys_addr = (pfn << PAGE_SHIFT);
+   *phys_addr = addr_to_phys(regs, op.ea);
return 0;
}
/*
@@ -440,15 +446,9 @@ static int mce_handle_ierror(struct pt_regs *regs,
*addr = regs->nip;
if (mce_err->sync_error &&
table[i].error_type == MCE_ERROR_TYPE_UE) {
-   unsigned long pfn;
-
-   if (get_paca()->in_mce < MAX_MCE_DEPTH) {
-   pfn = addr_to_pfn(regs, regs->nip);
-   if (pfn != ULONG_MAX) {
-   *phys_addr =
-   (pfn << PAGE_SHIFT);
-   }
-   }
+   if (get_paca()->in_mce < MAX_MCE_DEPTH)
+   *phys_addr = addr_to_phys(regs,
+  

[PATCH v8 4/7] extable: Add function to search only kernel exception table

2019-08-07 Thread Santosh Sivaraj
In certain architecture-specific operating modes (e.g., the powerpc machine
check handler, which is unable to access vmalloc memory),
search_exception_tables() cannot be called because it also searches the
module exception tables if the entry is not found in the kernel exception
table.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Nicholas Piggin 
Signed-off-by: Santosh Sivaraj 
Reviewed-by: Nicholas Piggin 
---
 include/linux/extable.h |  2 ++
 kernel/extable.c| 11 +--
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/include/linux/extable.h b/include/linux/extable.h
index 41c5b3a25f67..81ecfaa83ad3 100644
--- a/include/linux/extable.h
+++ b/include/linux/extable.h
@@ -19,6 +19,8 @@ void trim_init_extable(struct module *m);
 
 /* Given an address, look for it in the exception tables */
 const struct exception_table_entry *search_exception_tables(unsigned long add);
+const struct exception_table_entry *
+search_kernel_exception_table(unsigned long addr);
 
 #ifdef CONFIG_MODULES
 /* For extable.c to search modules' exception tables. */
diff --git a/kernel/extable.c b/kernel/extable.c
index e23cce6e6092..f6c9406eec7d 100644
--- a/kernel/extable.c
+++ b/kernel/extable.c
@@ -40,13 +40,20 @@ void __init sort_main_extable(void)
}
 }
 
+/* Given an address, look for it in the kernel exception table */
+const
+struct exception_table_entry *search_kernel_exception_table(unsigned long addr)
+{
+   return search_extable(__start___ex_table,
+ __stop___ex_table - __start___ex_table, addr);
+}
+
 /* Given an address, look for it in the exception tables. */
 const struct exception_table_entry *search_exception_tables(unsigned long addr)
 {
const struct exception_table_entry *e;
 
-   e = search_extable(__start___ex_table,
-  __stop___ex_table - __start___ex_table, addr);
+   e = search_kernel_exception_table(addr);
if (!e)
e = search_module_extables(addr);
return e;
-- 
2.20.1



[PATCH v8 5/7] powerpc/memcpy: Add memcpy_mcsafe for pmem

2019-08-07 Thread Santosh Sivaraj
From: Balbir Singh 

The pmem infrastructure uses memcpy_mcsafe in the pmem layer so that a
machine check exception encountered during the memcpy is converted into
a return value on failure. The return value is the number of bytes
remaining to be copied.

This patch largely borrows from the copyuser_power7 logic and does not add
the VMX optimizations, largely to keep the patch simple. If needed those
optimizations can be folded in.

Signed-off-by: Balbir Singh 
[ar...@linux.ibm.com: Added symbol export]
Co-developed-by: Santosh Sivaraj 
Signed-off-by: Santosh Sivaraj 
---
 arch/powerpc/include/asm/string.h   |   2 +
 arch/powerpc/lib/Makefile   |   2 +-
 arch/powerpc/lib/memcpy_mcsafe_64.S | 242 
 3 files changed, 245 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/lib/memcpy_mcsafe_64.S

diff --git a/arch/powerpc/include/asm/string.h 
b/arch/powerpc/include/asm/string.h
index 9bf6dffb4090..b72692702f35 100644
--- a/arch/powerpc/include/asm/string.h
+++ b/arch/powerpc/include/asm/string.h
@@ -53,7 +53,9 @@ void *__memmove(void *to, const void *from, __kernel_size_t 
n);
 #ifndef CONFIG_KASAN
 #define __HAVE_ARCH_MEMSET32
 #define __HAVE_ARCH_MEMSET64
+#define __HAVE_ARCH_MEMCPY_MCSAFE
 
+extern int memcpy_mcsafe(void *dst, const void *src, __kernel_size_t sz);
 extern void *__memset16(uint16_t *, uint16_t v, __kernel_size_t);
 extern void *__memset32(uint32_t *, uint32_t v, __kernel_size_t);
 extern void *__memset64(uint64_t *, uint64_t v, __kernel_size_t);
diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
index eebc782d89a5..fa6b1b657b43 100644
--- a/arch/powerpc/lib/Makefile
+++ b/arch/powerpc/lib/Makefile
@@ -39,7 +39,7 @@ obj-$(CONFIG_PPC_BOOK3S_64) += copyuser_power7.o 
copypage_power7.o \
   memcpy_power7.o
 
 obj64-y += copypage_64.o copyuser_64.o mem_64.o hweight_64.o \
-  memcpy_64.o pmem.o
+  memcpy_64.o pmem.o memcpy_mcsafe_64.o
 
 obj64-$(CONFIG_SMP) += locks.o
 obj64-$(CONFIG_ALTIVEC) += vmx-helper.o
diff --git a/arch/powerpc/lib/memcpy_mcsafe_64.S 
b/arch/powerpc/lib/memcpy_mcsafe_64.S
new file mode 100644
index ..949976dc115d
--- /dev/null
+++ b/arch/powerpc/lib/memcpy_mcsafe_64.S
@@ -0,0 +1,242 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) IBM Corporation, 2011
+ * Derived from copyuser_power7.s by Anton Blanchard 
+ * Author - Balbir Singh 
+ */
+#include 
+#include 
+#include 
+
+   .macro err1
+100:
+   EX_TABLE(100b,.Ldo_err1)
+   .endm
+
+   .macro err2
+200:
+   EX_TABLE(200b,.Ldo_err2)
+   .endm
+
+   .macro err3
+300:   EX_TABLE(300b,.Ldone)
+   .endm
+
+.Ldo_err2:
+   ld  r22,STK_REG(R22)(r1)
+   ld  r21,STK_REG(R21)(r1)
+   ld  r20,STK_REG(R20)(r1)
+   ld  r19,STK_REG(R19)(r1)
+   ld  r18,STK_REG(R18)(r1)
+   ld  r17,STK_REG(R17)(r1)
+   ld  r16,STK_REG(R16)(r1)
+   ld  r15,STK_REG(R15)(r1)
+   ld  r14,STK_REG(R14)(r1)
+   addi r1,r1,STACKFRAMESIZE
+.Ldo_err1:
+   /* Do a byte by byte copy to get the exact remaining size */
+   mtctr   r7
+46:
+err3;  lbz r0,0(r4)
+   addi r4,r4,1
+err3;  stb r0,0(r3)
+   addi r3,r3,1
+   bdnz 46b
+   li  r3,0
+   blr
+
+.Ldone:
+   mfctr   r3
+   blr
+
+
+_GLOBAL(memcpy_mcsafe)
+   mr  r7,r5
+   cmpldi  r5,16
+   blt .Lshort_copy
+
+.Lcopy:
+   /* Get the source 8B aligned */
+   neg r6,r4
+   mtocrf  0x01,r6
+   clrldi  r6,r6,(64-3)
+
+   bf  cr7*4+3,1f
+err1;  lbz r0,0(r4)
+   addi r4,r4,1
+err1;  stb r0,0(r3)
+   addi r3,r3,1
+   subi r7,r7,1
+
+1: bf  cr7*4+2,2f
+err1;  lhz r0,0(r4)
+   addi r4,r4,2
+err1;  sth r0,0(r3)
+   addi r3,r3,2
+   subi r7,r7,2
+
+2: bf  cr7*4+1,3f
+err1;  lwz r0,0(r4)
+   addi r4,r4,4
+err1;  stw r0,0(r3)
+   addi r3,r3,4
+   subi r7,r7,4
+
+3: sub r5,r5,r6
+   cmpldi  r5,128
+   blt 5f
+
+   mflr r0
+   stdu r1,-STACKFRAMESIZE(r1)
+   std r14,STK_REG(R14)(r1)
+   std r15,STK_REG(R15)(r1)
+   std r16,STK_REG(R16)(r1)
+   std r17,STK_REG(R17)(r1)
+   std r18,STK_REG(R18)(r1)
+   std r19,STK_REG(R19)(r1)
+   std r20,STK_REG(R20)(r1)
+   std r21,STK_REG(R21)(r1)
+   std r22,STK_REG(R22)(r1)
+   std r0,STACKFRAMESIZE+16(r1)
+
+   srdi r6,r5,7
+   mtctr   r6
+
+   /* Now do cacheline (128B) sized loads and stores. */
+   .align  5
+4:
+err2;  ld  r0,0(r4)
+err2;  ld  r6,8(r4)
+err2;  ld  r8,16(r4)
+err2;  ld  r9,24(r4)
+err2;  ld  r10,32(r4)
+err2;  ld  r11,40(r4)
+err2;  ld  r12,48(r4)
+err2;  ld  r14,56(r4)
+err2;  ld  r15,64(r4)
+err2;  ld  r16,72(

[PATCH v8 6/7] powerpc/mce: Handle UE event for memcpy_mcsafe

2019-08-07 Thread Santosh Sivaraj
If we take a UE on one of the instructions with a fixup entry, set nip
to continue execution at the fixup entry. Do not process the event any
further or print it.

Co-developed-by: Reza Arbab 
Signed-off-by: Reza Arbab 
Cc: Mahesh Salgaonkar 
Signed-off-by: Santosh Sivaraj 
---
 arch/powerpc/include/asm/mce.h  |  4 +++-
 arch/powerpc/kernel/mce.c   | 18 --
 arch/powerpc/kernel/mce_power.c | 15 +--
 3 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index f3a6036b6bc0..e1931c8c2743 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -122,7 +122,8 @@ struct machine_check_event {
enum MCE_UeErrorType ue_error_type:8;
u8  effective_address_provided;
u8  physical_address_provided;
-   u8  reserved_1[5];
+   u8  ignore_event;
+   u8  reserved_1[4];
u64 effective_address;
u64 physical_address;
u8  reserved_2[8];
@@ -193,6 +194,7 @@ struct mce_error_info {
enum MCE_Initiator  initiator:8;
enum MCE_ErrorClass error_class:8;
boolsync_error;
+   boolignore_event;
 };
 
 #define MAX_MC_EVT 100
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index 8c0b471658a7..a80f5d6ef166 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -33,7 +33,6 @@ static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT],
mce_ue_event_queue);
 
 static void machine_check_process_queued_event(struct irq_work *work);
-static void machine_check_ue_event(struct machine_check_event *evt);
 static void machine_process_ue_event(struct work_struct *work);
 
 static struct irq_work mce_event_process_work = {
@@ -144,6 +143,7 @@ void save_mce_event(struct pt_regs *regs, long handled,
if (phys_addr != ULONG_MAX) {
mce->u.ue_error.physical_address_provided = true;
mce->u.ue_error.physical_address = phys_addr;
+   mce->u.ue_error.ignore_event = mce_err->ignore_event;
}
}
return;
@@ -256,8 +256,17 @@ static void machine_process_ue_event(struct work_struct 
*work)
/*
 * This should probably queued elsewhere, but
 * oh! well
+*
+* Don't report this machine check because the caller has a
+* asked us to ignore the event, it has a fixup handler which
+* will do the appropriate error handling and reporting.
 */
if (evt->error_type == MCE_ERROR_TYPE_UE) {
+   if (evt->u.ue_error.ignore_event) {
+   __this_cpu_dec(mce_ue_count);
+   continue;
+   }
+
if (evt->u.ue_error.physical_address_provided) {
unsigned long pfn;
 
@@ -291,8 +300,13 @@ static void machine_check_process_queued_event(struct 
irq_work *work)
index = __this_cpu_read(mce_queue_count) - 1;
evt = this_cpu_ptr(&mce_event_queue[index]);
 
-   if (evt->error_type == MCE_ERROR_TYPE_UE)
+   if (evt->error_type == MCE_ERROR_TYPE_UE) {
machine_check_ue_event(evt);
+   if (evt->u.ue_error.ignore_event) {
+   __this_cpu_dec(mce_queue_count);
+   continue;
+   }
+   }
 
machine_check_print_event_info(evt, false, false);
__this_cpu_dec(mce_queue_count);
diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
index bed38a8e2e50..36ca45bbb273 100644
--- a/arch/powerpc/kernel/mce_power.c
+++ b/arch/powerpc/kernel/mce_power.c
@@ -11,6 +11,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -18,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Convert an address related to an mm to a physical address.
@@ -558,9 +560,18 @@ static int mce_handle_derror(struct pt_regs *regs,
return 0;
 }
 
-static long mce_handle_ue_error(struct pt_regs *regs)
+static long mce_handle_ue_error(struct pt_regs *regs,
+   struct mce_error_info *mce_err)
 {
long handled = 0;
+   const struct exception_table_entry *entry;
+
+   entry = search_kernel_exception_table(regs->nip);
+   if (entry) {
+   mce_err->ignore_event = true;
+   regs->nip = extable_fixup(entry);
+   return 1;
+  

[PATCH v8 7/7] powerpc: add machine check safe copy_to_user

2019-08-07 Thread Santosh Sivaraj
Use the memcpy_mcsafe() implementation to define copy_to_user_mcsafe().

Signed-off-by: Santosh Sivaraj 
---
 arch/powerpc/Kconfig   |  1 +
 arch/powerpc/include/asm/uaccess.h | 14 ++
 2 files changed, 15 insertions(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 77f6ebf97113..4316e36095a2 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -137,6 +137,7 @@ config PPC
select ARCH_HAS_STRICT_KERNEL_RWX   if ((PPC_BOOK3S_64 || PPC32) && 
!RELOCATABLE && !HIBERNATION)
select ARCH_HAS_TICK_BROADCAST  if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HAS_UACCESS_FLUSHCACHE  if PPC64
+   select ARCH_HAS_UACCESS_MCSAFE  if PPC64
select ARCH_HAS_UBSAN_SANITIZE_ALL
select ARCH_HAVE_NMI_SAFE_CMPXCHG
select ARCH_KEEP_MEMBLOCK
diff --git a/arch/powerpc/include/asm/uaccess.h 
b/arch/powerpc/include/asm/uaccess.h
index 8b03eb44e876..15002b51ff18 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -387,6 +387,20 @@ static inline unsigned long raw_copy_to_user(void __user 
*to,
return ret;
 }
 
+static __always_inline unsigned long __must_check
+copy_to_user_mcsafe(void __user *to, const void *from, unsigned long n)
+{
+   if (likely(check_copy_size(from, n, true))) {
+   if (access_ok(to, n)) {
+   allow_write_to_user(to, n);
+   n = memcpy_mcsafe((void *)to, from, n);
+   prevent_write_to_user(to, n);
+   }
+   }
+
+   return n;
+}
+
 extern unsigned long __clear_user(void __user *addr, unsigned long size);
 
 static inline unsigned long clear_user(void __user *addr, unsigned long size)
-- 
2.20.1



Re: [PATCH 2/4] kasan: support instrumented bitops with generic non-atomic bitops

2019-08-07 Thread Christophe Leroy




On 07/08/2019 at 01:38, Daniel Axtens wrote:

Currently bitops-instrumented.h assumes that the architecture provides
both the atomic and non-atomic versions of the bitops (e.g. both
set_bit and __set_bit). This is true on x86, but is not always true:
there is a generic bitops/non-atomic.h header that provides generic
non-atomic versions. powerpc uses this generic version, so it does
not have its own e.g. __set_bit that could be renamed arch___set_bit.

Rearrange bitops-instrumented.h. As operations in bitops/non-atomic.h
will already be instrumented (they use regular memory accesses), put
the instrumenting wrappers for them behind an ifdef. Only include
these instrumentation wrappers if non-atomic.h has not been included.
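The guard being described looks roughly like this (a sketch; it assumes the include guard of bitops/non-atomic.h is _ASM_GENERIC_BITOPS_NON_ATOMIC_H_):

```
/*
 * Only provide instrumented __set_bit()/__clear_bit()/... wrappers when
 * the architecture did NOT pull in the generic bitops/non-atomic.h,
 * whose plain memory accesses KASAN instruments already.
 */
#ifndef _ASM_GENERIC_BITOPS_NON_ATOMIC_H_

static inline void __set_bit(long nr, volatile unsigned long *addr)
{
	kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
	arch___set_bit(nr, addr);
}

/* ... the other non-atomic wrappers follow the same pattern ... */

#endif /* _ASM_GENERIC_BITOPS_NON_ATOMIC_H_ */
```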


What about moving and splitting bitops-instrumented.h into:
bitops/atomic-instrumented.h
bitops/non-atomic-instrumented.h
bitops/lock-instrumented.h

I think that would be cleaner than hacking the file with the _GUARDS_ of 
another header file (is that method used anywhere else in header files?)


Christophe



Signed-off-by: Daniel Axtens 
---
  include/asm-generic/bitops-instrumented.h | 144 --
  1 file changed, 76 insertions(+), 68 deletions(-)

diff --git a/include/asm-generic/bitops-instrumented.h 
b/include/asm-generic/bitops-instrumented.h
index ddd1c6d9d8db..2fe8f7e12a11 100644
--- a/include/asm-generic/bitops-instrumented.h
+++ b/include/asm-generic/bitops-instrumented.h
@@ -29,21 +29,6 @@ static inline void set_bit(long nr, volatile unsigned long 
*addr)
arch_set_bit(nr, addr);
  }
  
-/**

- * __set_bit - Set a bit in memory
- * @nr: the bit to set
- * @addr: the address to start counting from
- *
- * Unlike set_bit(), this function is non-atomic. If it is called on the same
- * region of memory concurrently, the effect may be that only one operation
- * succeeds.
- */
-static inline void __set_bit(long nr, volatile unsigned long *addr)
-{
-   kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
-   arch___set_bit(nr, addr);
-}
-
  /**
   * clear_bit - Clears a bit in memory
   * @nr: Bit to clear
@@ -57,21 +42,6 @@ static inline void clear_bit(long nr, volatile unsigned long 
*addr)
arch_clear_bit(nr, addr);
  }
  
-/**

- * __clear_bit - Clears a bit in memory
- * @nr: the bit to clear
- * @addr: the address to start counting from
- *
- * Unlike clear_bit(), this function is non-atomic. If it is called on the same
- * region of memory concurrently, the effect may be that only one operation
- * succeeds.
- */
-static inline void __clear_bit(long nr, volatile unsigned long *addr)
-{
-   kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
-   arch___clear_bit(nr, addr);
-}
-
  /**
   * clear_bit_unlock - Clear a bit in memory, for unlock
   * @nr: the bit to set
@@ -116,21 +86,6 @@ static inline void change_bit(long nr, volatile unsigned 
long *addr)
arch_change_bit(nr, addr);
  }
  
-/**

- * __change_bit - Toggle a bit in memory
- * @nr: the bit to change
- * @addr: the address to start counting from
- *
- * Unlike change_bit(), this function is non-atomic. If it is called on the 
same
- * region of memory concurrently, the effect may be that only one operation
- * succeeds.
- */
-static inline void __change_bit(long nr, volatile unsigned long *addr)
-{
-   kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
-   arch___change_bit(nr, addr);
-}
-
  /**
   * test_and_set_bit - Set a bit and return its old value
   * @nr: Bit to set
@@ -144,20 +99,6 @@ static inline bool test_and_set_bit(long nr, volatile 
unsigned long *addr)
return arch_test_and_set_bit(nr, addr);
  }
  
-/**

- * __test_and_set_bit - Set a bit and return its old value
- * @nr: Bit to set
- * @addr: Address to count from
- *
- * This operation is non-atomic. If two instances of this operation race, one
- * can appear to succeed but actually fail.
- */
-static inline bool __test_and_set_bit(long nr, volatile unsigned long *addr)
-{
-   kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
-   return arch___test_and_set_bit(nr, addr);
-}
-
  /**
   * test_and_set_bit_lock - Set a bit and return its old value, for lock
   * @nr: Bit to set
@@ -187,30 +128,96 @@ static inline bool test_and_clear_bit(long nr, volatile 
unsigned long *addr)
  }
  
  /**

- * __test_and_clear_bit - Clear a bit and return its old value
- * @nr: Bit to clear
+ * test_and_change_bit - Change a bit and return its old value
+ * @nr: Bit to change
+ * @addr: Address to count from
+ *
+ * This is an atomic fully-ordered operation (implied full memory barrier).
+ */
+static inline bool test_and_change_bit(long nr, volatile unsigned long *addr)
+{
+   kasan_check_write(addr + BIT_WORD(nr), sizeof(long));
+   return arch_test_and_change_bit(nr, addr);
+}
+
+/*
+ * If the arch is using the generic non-atomic bit ops, they are already
+ * instrumented, and we don't need to create wrappers. Only wrap if we
+ * haven't included that header.
+ 

Re: [PATCH 3/4] powerpc: support KASAN instrumentation of bitops

2019-08-07 Thread Christophe Leroy




On 07/08/2019 at 01:38, Daniel Axtens wrote:

In KASAN development I noticed that the powerpc-specific bitops
were not being picked up by the KASAN test suite.

Instrumentation is done via the bitops-instrumented.h header. It
requires that arch-specific versions of bitop functions are renamed
to arch_*. Do this renaming.

For clear_bit_unlock_is_negative_byte, the current implementation
uses the PG_waiters constant. This works because it's a preprocessor
macro - so it's only actually evaluated in contexts where PG_waiters
is defined. With instrumentation however, it becomes a static inline
function, and all of a sudden we need the actual value of PG_waiters.
Because of the order of header includes, it's not available and we
fail to compile. Instead, manually specify that we care about bit 7.
This is still correct: bit 7 is the bit that would mark a negative
byte, but it does obscure the origin a little bit.

Cc: Nicholas Piggin  # clear_bit_unlock_negative_byte
Signed-off-by: Daniel Axtens 
---
  arch/powerpc/include/asm/bitops.h | 25 ++---
  1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/include/asm/bitops.h 
b/arch/powerpc/include/asm/bitops.h
index 603aed229af7..19dc16e62e6a 100644
--- a/arch/powerpc/include/asm/bitops.h
+++ b/arch/powerpc/include/asm/bitops.h
@@ -86,22 +86,22 @@ DEFINE_BITOP(clear_bits, andc, "")
  DEFINE_BITOP(clear_bits_unlock, andc, PPC_RELEASE_BARRIER)
  DEFINE_BITOP(change_bits, xor, "")
  
-static __inline__ void set_bit(int nr, volatile unsigned long *addr)

+static __inline__ void arch_set_bit(int nr, volatile unsigned long *addr)
  {
set_bits(BIT_MASK(nr), addr + BIT_WORD(nr));
  }
  
-static __inline__ void clear_bit(int nr, volatile unsigned long *addr)

+static __inline__ void arch_clear_bit(int nr, volatile unsigned long *addr)
  {
clear_bits(BIT_MASK(nr), addr + BIT_WORD(nr));
  }
  
-static __inline__ void clear_bit_unlock(int nr, volatile unsigned long *addr)

+static __inline__ void arch_clear_bit_unlock(int nr, volatile unsigned long 
*addr)
  {
clear_bits_unlock(BIT_MASK(nr), addr + BIT_WORD(nr));
  }
  
-static __inline__ void change_bit(int nr, volatile unsigned long *addr)

+static __inline__ void arch_change_bit(int nr, volatile unsigned long *addr)
  {
change_bits(BIT_MASK(nr), addr + BIT_WORD(nr));
  }
@@ -138,26 +138,26 @@ DEFINE_TESTOP(test_and_clear_bits, andc, 
PPC_ATOMIC_ENTRY_BARRIER,
  DEFINE_TESTOP(test_and_change_bits, xor, PPC_ATOMIC_ENTRY_BARRIER,
  PPC_ATOMIC_EXIT_BARRIER, 0)
  
-static __inline__ int test_and_set_bit(unsigned long nr,

+static __inline__ int arch_test_and_set_bit(unsigned long nr,
   volatile unsigned long *addr)
  {
return test_and_set_bits(BIT_MASK(nr), addr + BIT_WORD(nr)) != 0;
  }
  
-static __inline__ int test_and_set_bit_lock(unsigned long nr,

+static __inline__ int arch_test_and_set_bit_lock(unsigned long nr,
   volatile unsigned long *addr)
  {
return test_and_set_bits_lock(BIT_MASK(nr),
addr + BIT_WORD(nr)) != 0;
  }
  
-static __inline__ int test_and_clear_bit(unsigned long nr,

+static __inline__ int arch_test_and_clear_bit(unsigned long nr,
 volatile unsigned long *addr)
  {
return test_and_clear_bits(BIT_MASK(nr), addr + BIT_WORD(nr)) != 0;
  }
  
-static __inline__ int test_and_change_bit(unsigned long nr,

+static __inline__ int arch_test_and_change_bit(unsigned long nr,
  volatile unsigned long *addr)
  {
return test_and_change_bits(BIT_MASK(nr), addr + BIT_WORD(nr)) != 0;
@@ -186,14 +186,14 @@ static __inline__ unsigned long 
clear_bit_unlock_return_word(int nr,
  }
  
  /* This is a special function for mm/filemap.c */

-#define clear_bit_unlock_is_negative_byte(nr, addr)\
-   (clear_bit_unlock_return_word(nr, addr) & BIT_MASK(PG_waiters))
+#define arch_clear_bit_unlock_is_negative_byte(nr, addr)   \
+   (clear_bit_unlock_return_word(nr, addr) & BIT_MASK(7))


Maybe add a comment reminding that 7 is PG_waiters ?

Christophe
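
One way the suggested reminder could read (illustrative only, not part of
the patch; the macro body is copied from the hunk above):

/*
 * Bit 7 is PG_waiters; it is spelled as a literal because PG_waiters
 * is not visible here due to the order of header includes.
 */
#define arch_clear_bit_unlock_is_negative_byte(nr, addr)	\
	(clear_bit_unlock_return_word(nr, addr) & BIT_MASK(7))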

  
  #endif /* CONFIG_PPC64 */
  
  #include 
  
-static __inline__ void __clear_bit_unlock(int nr, volatile unsigned long *addr)

+static __inline__ void arch___clear_bit_unlock(int nr, volatile unsigned long 
*addr)
  {
__asm__ __volatile__(PPC_RELEASE_BARRIER "" ::: "memory");
__clear_bit(nr, addr);
@@ -239,6 +239,9 @@ unsigned long __arch_hweight64(__u64 w);
  
  #include 
  
+/* wrappers that deal with KASAN instrumentation */

+#include 
+
  /* Little-endian versions */
  #include 
  



[PATCH] powerpc: use

2019-08-07 Thread Christoph Hellwig
The powerpc version of dma-mapping.h only contains a version of
get_arch_dma_ops that always return NULL.  Replace it with the
asm-generic version that does the same.

Signed-off-by: Christoph Hellwig 
---
 arch/powerpc/include/asm/Kbuild|  1 +
 arch/powerpc/include/asm/dma-mapping.h | 18 --
 2 files changed, 1 insertion(+), 18 deletions(-)
 delete mode 100644 arch/powerpc/include/asm/dma-mapping.h

diff --git a/arch/powerpc/include/asm/Kbuild b/arch/powerpc/include/asm/Kbuild
index 9a1d2fc6ceb7..15bb09ad5dc2 100644
--- a/arch/powerpc/include/asm/Kbuild
+++ b/arch/powerpc/include/asm/Kbuild
@@ -4,6 +4,7 @@ generated-y += syscall_table_64.h
 generated-y += syscall_table_c32.h
 generated-y += syscall_table_spu.h
 generic-y += div64.h
+generic-y += dma-mapping.h
 generic-y += export.h
 generic-y += irq_regs.h
 generic-y += local64.h
diff --git a/arch/powerpc/include/asm/dma-mapping.h 
b/arch/powerpc/include/asm/dma-mapping.h
deleted file mode 100644
index 565d6f74b189..
--- a/arch/powerpc/include/asm/dma-mapping.h
+++ /dev/null
@@ -1,18 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * Copyright (C) 2004 IBM
- */
-#ifndef _ASM_DMA_MAPPING_H
-#define _ASM_DMA_MAPPING_H
-
-static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
-{
-   /* We don't handle the NULL dev case for ISA for now. We could
-* do it via an out of line call but it is not needed for now. The
-* only ISA DMA device we support is the floppy and we have a hack
-* in the floppy driver directly to get a device for us.
-*/
-   return NULL;
-}
-
-#endif /* _ASM_DMA_MAPPING_H */
-- 
2.20.1



Re: [PATCH 1/4] kasan: allow arches to provide their own early shadow setup

2019-08-07 Thread Christophe Leroy




Le 07/08/2019 à 01:38, Daniel Axtens a écrit :

powerpc supports several different MMUs. In particular, book3s
machines support both a hash-table based MMU and a radix MMU.
These MMUs support different numbers of entries per directory
level: the PTRS_PER_* defines evaluate to variables, not constants.
This leads to compiler errors as global variables must have constant
sizes.

Allow architectures to manage their own early shadow variables so we
can work around this on powerpc.


This seems rather strange, moving the early shadow tables out of 
mm/kasan/init.c although they are still used there.


What about doing for all what is already done for 
kasan_early_shadow_p4d[], in extenso define constant max sizes 
MAX_PTRS_PER_PTE, MAX_PTRS_PER_PMD and MAX_PTRS_PER_PUD ?


With a set of the following, it would remain transparent for other arches.
#ifndef MAX_PTRS_PER_PXX
#define MAX_PTRS_PER_PXX PTRS_PER_PXX
#endif

Then you would just need to do the following for Radix:

#define MAX_PTRS_PER_PTE	(1 << RADIX_PTE_INDEX_SIZE)
#define MAX_PTRS_PER_PMD	(1 << RADIX_PMD_INDEX_SIZE)
#define MAX_PTRS_PER_PUD	(1 << RADIX_PUD_INDEX_SIZE)
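
As a minimal sketch of how that would let the early shadow tables stay in
mm/kasan/init.c with fixed worst-case sizes (illustrative only, not code
from either patch; the MAX_PTRS_PER_* fallbacks are the ones assumed above):

#ifndef MAX_PTRS_PER_PTE
#define MAX_PTRS_PER_PTE PTRS_PER_PTE	/* transparent for other arches */
#endif
#ifndef MAX_PTRS_PER_PMD
#define MAX_PTRS_PER_PMD PTRS_PER_PMD
#endif
#ifndef MAX_PTRS_PER_PUD
#define MAX_PTRS_PER_PUD PTRS_PER_PUD
#endif

pte_t kasan_early_shadow_pte[MAX_PTRS_PER_PTE] __page_aligned_bss;
pmd_t kasan_early_shadow_pmd[MAX_PTRS_PER_PMD] __page_aligned_bss;
pud_t kasan_early_shadow_pud[MAX_PTRS_PER_PUD] __page_aligned_bss;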


For the kasan_early_shadow_page[], I don't think we have variable 
PAGE_SIZE, have we ?


Christophe




Signed-off-by: Daniel Axtens 

---
Changes from RFC:

  - To make checkpatch happy, move ARCH_HAS_KASAN_EARLY_SHADOW from
a random #define to a config option selected when building for
ppc64 book3s
---
  include/linux/kasan.h |  2 ++
  lib/Kconfig.kasan |  3 +++
  mm/kasan/init.c   | 10 ++
  3 files changed, 15 insertions(+)

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index ec81113fcee4..15933da52a3e 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -14,11 +14,13 @@ struct task_struct;
  #include 
  #include 
  
+#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW

  extern unsigned char kasan_early_shadow_page[PAGE_SIZE];
  extern pte_t kasan_early_shadow_pte[PTRS_PER_PTE];
  extern pmd_t kasan_early_shadow_pmd[PTRS_PER_PMD];
  extern pud_t kasan_early_shadow_pud[PTRS_PER_PUD];
  extern p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D];
+#endif
  
  int kasan_populate_early_shadow(const void *shadow_start,

const void *shadow_end);
diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan
index a320dc2e9317..0621a0129c04 100644
--- a/lib/Kconfig.kasan
+++ b/lib/Kconfig.kasan
@@ -9,6 +9,9 @@ config HAVE_ARCH_KASAN_SW_TAGS
  config HAVE_ARCH_KASAN_VMALLOC
bool
  
+config ARCH_HAS_KASAN_EARLY_SHADOW

+   bool
+
  config CC_HAS_KASAN_GENERIC
def_bool $(cc-option, -fsanitize=kernel-address)
  
diff --git a/mm/kasan/init.c b/mm/kasan/init.c

index ce45c491ebcd..7ef2b87a7988 100644
--- a/mm/kasan/init.c
+++ b/mm/kasan/init.c
@@ -31,10 +31,14 @@
   *   - Latter it reused it as zero shadow to cover large ranges of memory
   * that allowed to access, but not handled by kasan (vmalloc/vmemmap ...).
   */
+#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
  unsigned char kasan_early_shadow_page[PAGE_SIZE] __page_aligned_bss;
+#endif
  
  #if CONFIG_PGTABLE_LEVELS > 4

+#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
  p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D] __page_aligned_bss;
+#endif
  static inline bool kasan_p4d_table(pgd_t pgd)
  {
return pgd_page(pgd) == virt_to_page(lm_alias(kasan_early_shadow_p4d));
@@ -46,7 +50,9 @@ static inline bool kasan_p4d_table(pgd_t pgd)
  }
  #endif
  #if CONFIG_PGTABLE_LEVELS > 3
+#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
  pud_t kasan_early_shadow_pud[PTRS_PER_PUD] __page_aligned_bss;
+#endif
  static inline bool kasan_pud_table(p4d_t p4d)
  {
return p4d_page(p4d) == virt_to_page(lm_alias(kasan_early_shadow_pud));
@@ -58,7 +64,9 @@ static inline bool kasan_pud_table(p4d_t p4d)
  }
  #endif
  #if CONFIG_PGTABLE_LEVELS > 2
+#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
  pmd_t kasan_early_shadow_pmd[PTRS_PER_PMD] __page_aligned_bss;
+#endif
  static inline bool kasan_pmd_table(pud_t pud)
  {
return pud_page(pud) == virt_to_page(lm_alias(kasan_early_shadow_pmd));
@@ -69,7 +77,9 @@ static inline bool kasan_pmd_table(pud_t pud)
return false;
  }
  #endif
+#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
  pte_t kasan_early_shadow_pte[PTRS_PER_PTE] __page_aligned_bss;
+#endif
  
  static inline bool kasan_pte_table(pmd_t pmd)

  {



Re: [PATCH] nvdimm/of_pmem: Provide a unique name for bus provider

2019-08-07 Thread Dan Williams
On Tue, Aug 6, 2019 at 11:00 PM Aneesh Kumar K.V
 wrote:
>
> On 8/7/19 10:22 AM, Dan Williams wrote:
> > On Tue, Aug 6, 2019 at 9:17 PM Aneesh Kumar K.V
> >  wrote:
> >>
> >> On 8/7/19 9:43 AM, Dan Williams wrote:
> >>> On Tue, Aug 6, 2019 at 9:00 PM Aneesh Kumar K.V
> >>>  wrote:
> 
>  ndctl utility requires the ndbus to have unique names. If not while
>  enumerating the bus in userspace it drops bus with similar names.
>  This results in us not listing devices beneath the bus.
> >>>
> >>> It does?
> >>>
> 
>  Signed-off-by: Aneesh Kumar K.V 
>  ---
> drivers/nvdimm/of_pmem.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
>  diff --git a/drivers/nvdimm/of_pmem.c b/drivers/nvdimm/of_pmem.c
>  index a0c8dcfa0bf9..97187d6c0bdb 100644
>  --- a/drivers/nvdimm/of_pmem.c
>  +++ b/drivers/nvdimm/of_pmem.c
>  @@ -42,7 +42,7 @@ static int of_pmem_region_probe(struct platform_device 
>  *pdev)
>    return -ENOMEM;
> 
>    priv->bus_desc.attr_groups = bus_attr_groups;
>  -   priv->bus_desc.provider_name = "of_pmem";
>  +   priv->bus_desc.provider_name = kstrdup(pdev->name, GFP_KERNEL);
> >>>
> >>> This looks ok to me to address support for older ndctl binaries, but
> >>> I'd like to also fix the ndctl bug that makes non-unique provider
> >>> names fail.
> >>>
> >>
> >> 0462269ab121d323a016874ebdd42217f2911ee7 (ndctl: provide a method to
> >> invalidate the bus list)
> >>
> >> This hunk does the filtering.
> >>
> >> @@ -928,6 +929,14 @@ static int add_bus(void *parent, int id, const char
> >> *ctl_base)
> >>  goto err_read;
> >>  bus->buf_len = strlen(bus->bus_path) + 50;
> >>
> >> +   ndctl_bus_foreach(ctx, bus_dup)
> >> +   if (strcmp(ndctl_bus_get_provider(bus_dup),
> >> +   ndctl_bus_get_provider(bus)) == 0) 
> >> {
> >> +   free_bus(bus, NULL);
> >> +   free(path);
> >> +   return 1;
> >> +   }
> >> +
> >
> > Yup, that's broken, does this incremental fix work?
> >
> > diff --git a/ndctl/lib/libndctl.c b/ndctl/lib/libndctl.c
> > index 4d9cc7e29c6b..6596f94edef8 100644
> > --- a/ndctl/lib/libndctl.c
> > +++ b/ndctl/lib/libndctl.c
> > @@ -889,7 +889,9 @@ static void *add_bus(void *parent, int id, const
> > char *ctl_base)
> >
> >  ndctl_bus_foreach(ctx, bus_dup)
> >  if (strcmp(ndctl_bus_get_provider(bus_dup),
> > -   ndctl_bus_get_provider(bus)) == 0) {
> > +   ndctl_bus_get_provider(bus)) == 0
> > +   && strcmp(ndctl_bus_get_devname(bus_dup),
> > +   ndctl_bus_get_devname(bus)) == 0) {
> >  free_bus(bus, NULL);
> >  free(path);
> >  return bus_dup;
> >
>
> That worked.

Great. I'll make a formal patch, and I'll amend the changelog of the
proposed kernel change to say "older ndctl binaries mistakenly
require"
>
> -aneesh


Re: [PATCH 0/4] powerpc: KASAN for 64-bit Book3S on Radix

2019-08-07 Thread Christophe Leroy




Le 07/08/2019 à 01:38, Daniel Axtens a écrit :

Building on the work of Christophe, Aneesh and Balbir, I've ported
KASAN to 64-bit Book3S kernels running on the Radix MMU.

It builds on top Christophe's work on 32bit. It also builds on my
generic KASAN_VMALLOC series, available at:
https://patchwork.kernel.org/project/linux-mm/list/?series=153209


Would be good to send that one to the powerpc list as well.



This provides full inline instrumentation on radix, but does require
that you be able to specify the amount of memory on the system at
compile time. More details in patch 4.

Notable changes from the RFC:

  - I've dropped Book3E 64-bit for now.

  - Now instead of hacking into the KASAN core to disable module
allocations, we use KASAN_VMALLOC.

  - More testing, including on real hardware. This revealed that
discontiguous memory is a bit of a headache; at the moment we
must disable memory that is not contiguous from 0.

  - Update to deal with kasan bitops instrumentation that landed

between RFC and now.


This is rather independent and also applies to PPC32. Could it be a 
separate series that Michael could apply earlier ?


Christophe



  - Documentation!

  - Various cleanups and tweaks.

I am getting occasional problems on boot of real hardware where it
seems vmalloc space mappings don't get installed in time. (We get a
BUG that memory is not accessible, but by the time we hit xmon the
memory then is accessible!) It happens once every few boots. I haven't
yet been able to figure out what is happening and why. I'm going to
look in to it, but I think the patches are in good enough shape to
review while I work on it.

Regards,
Daniel

Daniel Axtens (4):
   kasan: allow arches to provide their own early shadow setup
   kasan: support instrumented bitops with generic non-atomic bitops
   powerpc: support KASAN instrumentation of bitops
   powerpc: Book3S 64-bit "heavyweight" KASAN support

  Documentation/dev-tools/kasan.rst|   7 +-
  Documentation/powerpc/kasan.txt  | 111 ++
  arch/powerpc/Kconfig |   4 +
  arch/powerpc/Kconfig.debug   |  21 +++
  arch/powerpc/Makefile|   7 +
  arch/powerpc/include/asm/bitops.h|  25 ++--
  arch/powerpc/include/asm/book3s/64/radix.h   |   5 +
  arch/powerpc/include/asm/kasan.h |  35 -
  arch/powerpc/kernel/process.c|   8 ++
  arch/powerpc/kernel/prom.c   |  57 +++-
  arch/powerpc/mm/kasan/Makefile   |   1 +
  arch/powerpc/mm/kasan/kasan_init_book3s_64.c |  76 ++
  include/asm-generic/bitops-instrumented.h| 144 ++-
  include/linux/kasan.h|   2 +
  lib/Kconfig.kasan|   3 +
  mm/kasan/init.c  |  10 ++
  16 files changed, 431 insertions(+), 85 deletions(-)
  create mode 100644 Documentation/powerpc/kasan.txt
  create mode 100644 arch/powerpc/mm/kasan/kasan_init_book3s_64.c



Re: [PATCH v2 0/3] arm/arm64: Add support for function error injection

2019-08-07 Thread Will Deacon
On Tue, Aug 06, 2019 at 06:00:12PM +0800, Leo Yan wrote:
> This small patch set is to add support for function error injection;
> this can be used to eanble more advanced debugging feature, e.g.
> CONFIG_BPF_KPROBE_OVERRIDE.
> 
> The patch 01/03 is to consolidate the function definition which can be
> suared cross architectures, patches 02,03/03 are used for enabling
> function error injection on arm64 and arm architecture respectively.
> 
> I tested on arm64 platform Juno-r2 and one of my laptop with x86
> architecture with below steps; I don't test for Arm architecture so
> only pass compilation.

Thanks. I've queued the first two patches up here:

https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=for-next/error-injection

Will


Re: [PATCH 4/4] powerpc: Book3S 64-bit "heavyweight" KASAN support

2019-08-07 Thread Christophe Leroy




Le 07/08/2019 à 01:38, Daniel Axtens a écrit :

KASAN support on powerpc64 is interesting:

  - We want to be able to support inline instrumentation so as to be
able to catch global and stack issues.

  - We run a lot of code at boot in real mode. This includes stuff like
printk(), so it's not feasible to just disable instrumentation
around it.


Have you definitely given up the idea of doing a standard implementation 
of KASAN like other 64 bits arches have done ?


Isn't it possible to setup an early 1:1 mapping and go in virtual mode 
earlier ? What is so different between book3s64 and book3e64 ?
On book3e64, we've been able to setup KASAN before printing anything 
(except when using EARLY_DEBUG). Isn't it feasible on book3s64 too ?




[For those not immersed in ppc64, in real mode, the top nibble or
2 bits (depending on radix/hash mmu) of the address is ignored. To
make things work, we put the linear mapping at
0xc000. This means that a pointer to part of the linear
mapping will work both in real mode, where it will be interpreted
as a physical address of the form 0x000..., and out of real mode,
where it will go via the linear mapping.]
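
A small standalone illustration of the arithmetic implied above (an
ordinary user-space program, only to show why one pointer value works in
both modes; it masks the top nibble, while the radix case would mask just
the top 2 bits):

#include <stdio.h>

int main(void)
{
	unsigned long va = 0xc000000000001000UL;	/* pointer into the linear map */
	unsigned long ra = va & 0x0fffffffffffffffUL;	/* what real mode effectively sees */

	printf("virtual %#lx -> real-mode view %#lx\n", va, ra);	/* 0x1000 */
	return 0;
}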

  - Inline instrumentation requires a fixed offset.

  - Because of our running things in real mode, the offset has to
point to valid memory both in and out of real mode.

This makes finding somewhere to put the KASAN shadow region a bit fun.

One approach is just to give up on inline instrumentation and override
the address->shadow calculation. This way we can delay all checking
until after we get everything set up to our satisfaction. However,
we'd really like to do better.

What we can do - if we know _at compile time_ how much contiguous
physical memory we have - is to set aside the top 1/8th of the memory
and use that. This is a big hammer (hence the "heavyweight" name) and
comes with 3 big consequences:

  - kernels will simply fail to boot on machines with less memory than
specified when compiling.

  - kernels running on machines with more memory than specified when
compiling will simply ignore the extra memory.

  - there's no nice way to handle physically discontiguous memory, so
you are restricted to the first physical memory block.

If you can bear all this, you get pretty full support for KASAN.
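
For a sense of the sizes involved, some back-of-envelope arithmetic for the
scheme described above (all names and the memory size are made up for
illustration; they are not the series' symbols):

#define ASSUMED_PHYS_MEM	(8UL << 30)			/* memory size fixed at compile time */
#define SHADOW_RESERVE		(ASSUMED_PHYS_MEM >> 3)		/* generic KASAN: 1 shadow byte per 8 bytes */
#define SHADOW_RESERVE_START	(ASSUMED_PHYS_MEM - SHADOW_RESERVE)	/* the reserved top 1/8th */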

Despite the limitations, it can still find bugs,
e.g. http://patchwork.ozlabs.org/patch/1103775/

The current implementation is Radix only. I am open to extending
it to hash at some point but I don't think it should hold up v1.

Massive thanks to mpe, who had the idea for the initial design.

Signed-off-by: Daniel Axtens 

---
Changes since the rfc:

  - Boots real and virtual hardware, kvm works.

  - disabled reporting when we're checking the stack for exception
frames. The behaviour isn't wrong, just incompatible with KASAN.


Does this applies to / impacts PPC32 at all ?



  - Documentation!

  - Dropped old module stuff in favour of KASAN_VMALLOC.


You said in the cover that this is done to avoid having to split modules 
out of VMALLOC area. Would it be an issue to perform that split ?
I can understand it is not easy on 32 bits because vmalloc space is 
rather small, but on 64 bits don't we have enough virtual space to 
confortably split modules out of vmalloc ? The 64 bits already splits 
ioremap away from vmalloc whereas 32 bits have them merged too.




The bugs with ftrace and kuap were due to kernel bloat pushing
prom_init calls to be done via the plt. Because we did not have
a relocatable kernel, and they are done very early, this caused
everything to explode. Compile with CONFIG_RELOCATABLE!

---
  Documentation/dev-tools/kasan.rst|   7 +-
  Documentation/powerpc/kasan.txt  | 111 +++
  arch/powerpc/Kconfig |   4 +
  arch/powerpc/Kconfig.debug   |  21 
  arch/powerpc/Makefile|   7 ++
  arch/powerpc/include/asm/book3s/64/radix.h   |   5 +
  arch/powerpc/include/asm/kasan.h |  35 +-
  arch/powerpc/kernel/process.c|   8 ++
  arch/powerpc/kernel/prom.c   |  57 +-
  arch/powerpc/mm/kasan/Makefile   |   1 +
  arch/powerpc/mm/kasan/kasan_init_book3s_64.c |  76 +
  11 files changed, 326 insertions(+), 6 deletions(-)
  create mode 100644 Documentation/powerpc/kasan.txt
  create mode 100644 arch/powerpc/mm/kasan/kasan_init_book3s_64.c

diff --git a/Documentation/dev-tools/kasan.rst 
b/Documentation/dev-tools/kasan.rst
index 35fda484a672..48d3b669e577 100644
--- a/Documentation/dev-tools/kasan.rst
+++ b/Documentation/dev-tools/kasan.rst
@@ -22,7 +22,9 @@ global variables yet.
  Tag-based KASAN is only supported in Clang and requires version 7.0.0 or 
later.
  
  Currently generic KASAN is supported for the x86_64, arm64, xtensa and s390

-architectures, and tag-based KASAN is supported only for arm64.

Re: [PATCH 1/4] kasan: allow arches to provide their own early shadow setup

2019-08-07 Thread Christophe Leroy




Le 07/08/2019 à 17:14, Christophe Leroy a écrit :



Le 07/08/2019 à 01:38, Daniel Axtens a écrit :

powerpc supports several different MMUs. In particular, book3s
machines support both a hash-table based MMU and a radix MMU.
These MMUs support different numbers of entries per directory
level: the PTRS_PER_* defines evaluate to variables, not constants.
This leads to compiler errors as global variables must have constant
sizes.

Allow architectures to manage their own early shadow variables so we
can work around this on powerpc.


This seems rather strange, moving the early shadow tables out of 
mm/kasan/init.c although they are still used there.


What about doing for all what is already done for 
kasan_early_shadow_p4d[], in extenso define constant max sizes 
MAX_PTRS_PER_PTE, MAX_PTRS_PER_PMD and MAX_PTRS_PER_PUD ?


To illustrate my suggestion, see commit c65e774fb3f6af21 ("x86/mm: Make 
PGDIR_SHIFT and PTRS_PER_P4D variable")


The same principle should apply on all variable powerpc PTRS_PER_XXX.

Christophe



With a set of the following, it would remain transparent for other arches.
#ifndef MAX_PTRS_PER_PXX
#define MAX_PTRS_PER_PXX PTRS_PER_PXX
#endif

Then you would just need to do the following for Radix:

#define MAX_PTRS_PER_PTE    (1 << RADIX_PTE_INDEX_SIZE)
#define MAX_PTRS_PER_PMD    (1 << RADIX_PMD_INDEX_SIZE)
#define MAX_PTRS_PER_PUD    (1 << RADIX_PUD_INDEX_SIZE)


For the kasan_early_shadow_page[], I don't think we have variable 
PAGE_SIZE, have we ?


Christophe




Signed-off-by: Daniel Axtens 

---
Changes from RFC:

  - To make checkpatch happy, move ARCH_HAS_KASAN_EARLY_SHADOW from
    a random #define to a config option selected when building for
    ppc64 book3s
---
  include/linux/kasan.h |  2 ++
  lib/Kconfig.kasan |  3 +++
  mm/kasan/init.c   | 10 ++
  3 files changed, 15 insertions(+)

diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index ec81113fcee4..15933da52a3e 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -14,11 +14,13 @@ struct task_struct;
  #include 
  #include 
+#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
  extern unsigned char kasan_early_shadow_page[PAGE_SIZE];
  extern pte_t kasan_early_shadow_pte[PTRS_PER_PTE];
  extern pmd_t kasan_early_shadow_pmd[PTRS_PER_PMD];
  extern pud_t kasan_early_shadow_pud[PTRS_PER_PUD];
  extern p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D];
+#endif
  int kasan_populate_early_shadow(const void *shadow_start,
  const void *shadow_end);
diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan
index a320dc2e9317..0621a0129c04 100644
--- a/lib/Kconfig.kasan
+++ b/lib/Kconfig.kasan
@@ -9,6 +9,9 @@ config HAVE_ARCH_KASAN_SW_TAGS
  config    HAVE_ARCH_KASAN_VMALLOC
  bool
+config ARCH_HAS_KASAN_EARLY_SHADOW
+    bool
+
  config CC_HAS_KASAN_GENERIC
  def_bool $(cc-option, -fsanitize=kernel-address)
diff --git a/mm/kasan/init.c b/mm/kasan/init.c
index ce45c491ebcd..7ef2b87a7988 100644
--- a/mm/kasan/init.c
+++ b/mm/kasan/init.c
@@ -31,10 +31,14 @@
   *   - Latter it reused it as zero shadow to cover large ranges of 
memory
   * that allowed to access, but not handled by kasan 
(vmalloc/vmemmap ...).

   */
+#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
  unsigned char kasan_early_shadow_page[PAGE_SIZE] __page_aligned_bss;
+#endif
  #if CONFIG_PGTABLE_LEVELS > 4
+#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
  p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D] __page_aligned_bss;
+#endif
  static inline bool kasan_p4d_table(pgd_t pgd)
  {
  return pgd_page(pgd) == 
virt_to_page(lm_alias(kasan_early_shadow_p4d));

@@ -46,7 +50,9 @@ static inline bool kasan_p4d_table(pgd_t pgd)
  }
  #endif
  #if CONFIG_PGTABLE_LEVELS > 3
+#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
  pud_t kasan_early_shadow_pud[PTRS_PER_PUD] __page_aligned_bss;
+#endif
  static inline bool kasan_pud_table(p4d_t p4d)
  {
  return p4d_page(p4d) == 
virt_to_page(lm_alias(kasan_early_shadow_pud));

@@ -58,7 +64,9 @@ static inline bool kasan_pud_table(p4d_t p4d)
  }
  #endif
  #if CONFIG_PGTABLE_LEVELS > 2
+#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
  pmd_t kasan_early_shadow_pmd[PTRS_PER_PMD] __page_aligned_bss;
+#endif
  static inline bool kasan_pmd_table(pud_t pud)
  {
  return pud_page(pud) == 
virt_to_page(lm_alias(kasan_early_shadow_pmd));

@@ -69,7 +77,9 @@ static inline bool kasan_pmd_table(pud_t pud)
  return false;
  }
  #endif
+#ifndef CONFIG_ARCH_HAS_KASAN_EARLY_SHADOW
  pte_t kasan_early_shadow_pte[PTRS_PER_PTE] __page_aligned_bss;
+#endif
  static inline bool kasan_pte_table(pmd_t pmd)
  {



Re: [PATCH] powerpc: convert put_page() to put_user_page*()

2019-08-07 Thread John Hubbard
On 8/7/19 4:24 PM, kbuild test robot wrote:
> Hi,
> 
> Thank you for the patch! Yet something to improve:
> 
> [auto build test ERROR on linus/master]
> [cannot apply to v5.3-rc3 next-20190807]
> [if your patch is applied to the wrong git tree, please drop us a note to 
> help improve the system]
> 
> url:
> https://github.com/0day-ci/linux/commits/john-hubbard-gmail-com/powerpc-convert-put_page-to-put_user_page/20190805-132131
> config: powerpc-allmodconfig (attached as .config)
> compiler: powerpc64-linux-gcc (GCC) 7.4.0
> reproduce:
> wget 
> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
> ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # save the attached .config to linux build tree
> GCC_VERSION=7.4.0 make.cross ARCH=powerpc 
> 
> If you fix the issue, kindly add following tag
> Reported-by: kbuild test robot 
> 
> All errors (new ones prefixed by >>):
> 
>arch/powerpc/kvm/book3s_64_mmu_radix.c: In function 
> 'kvmppc_book3s_instantiate_page':
>>> arch/powerpc/kvm/book3s_64_mmu_radix.c:879:4: error: too many arguments to 
>>> function 'put_user_pages_dirty_lock'
>put_user_pages_dirty_lock(&page, 1, dirty);
>^

Yep, I should have included the prerequisite patch. But this is obsolete now,
please refer to the larger patchset instead:

   https://lore.kernel.org/r/20190807013340.9706-1-jhubb...@nvidia.com

thanks,
-- 
John Hubbard
NVIDIA


Re: [PATCH] powerpc: convert put_page() to put_user_page*()

2019-08-07 Thread kbuild test robot
Hi,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[cannot apply to v5.3-rc3 next-20190807]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/john-hubbard-gmail-com/powerpc-convert-put_page-to-put_user_page/20190805-132131
config: powerpc-allmodconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 7.4.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.4.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot 

All errors (new ones prefixed by >>):

   arch/powerpc/kvm/book3s_64_mmu_radix.c: In function 
'kvmppc_book3s_instantiate_page':
>> arch/powerpc/kvm/book3s_64_mmu_radix.c:879:4: error: too many arguments to 
>> function 'put_user_pages_dirty_lock'
   put_user_pages_dirty_lock(&page, 1, dirty);
   ^
   In file included from arch/powerpc/include/asm/io.h:29:0,
from include/linux/io.h:13,
from include/linux/irq.h:20,
from arch/powerpc/include/asm/hardirq.h:6,
from include/linux/hardirq.h:9,
from include/linux/kvm_host.h:7,
from arch/powerpc/kvm/book3s_64_mmu_radix.c:10:
   include/linux/mm.h:1061:6: note: declared here
void put_user_pages_dirty_lock(struct page **pages, unsigned long npages);
 ^
--
   arch/powerpc/mm/book3s64/iommu_api.c: In function 'mm_iommu_unpin':
>> arch/powerpc/mm/book3s64/iommu_api.c:220:3: error: too many arguments to 
>> function 'put_user_pages_dirty_lock'
  put_user_pages_dirty_lock(&page, 1, dirty);
  ^
   In file included from include/linux/migrate.h:5:0,
from arch/powerpc/mm/book3s64/iommu_api.c:13:
   include/linux/mm.h:1061:6: note: declared here
void put_user_pages_dirty_lock(struct page **pages, unsigned long npages);
 ^

vim +/put_user_pages_dirty_lock +879 arch/powerpc/kvm/book3s_64_mmu_radix.c

   765  
   766  int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu,
   767 unsigned long gpa,
   768 struct kvm_memory_slot *memslot,
   769 bool writing, bool kvm_ro,
   770 pte_t *inserted_pte, unsigned int 
*levelp)
   771  {
   772  struct kvm *kvm = vcpu->kvm;
   773  struct page *page = NULL;
   774  unsigned long mmu_seq;
   775  unsigned long hva, gfn = gpa >> PAGE_SHIFT;
   776  bool upgrade_write = false;
   777  bool *upgrade_p = &upgrade_write;
   778  pte_t pte, *ptep;
   779  unsigned int shift, level;
   780  int ret;
   781  bool large_enable;
   782  
   783  /* used to check for invalidations in progress */
   784  mmu_seq = kvm->mmu_notifier_seq;
   785  smp_rmb();
   786  
   787  /*
   788   * Do a fast check first, since __gfn_to_pfn_memslot doesn't
   789   * do it with !atomic && !async, which is how we call it.
   790   * We always ask for write permission since the common case
   791   * is that the page is writable.
   792   */
   793  hva = gfn_to_hva_memslot(memslot, gfn);
   794  if (!kvm_ro && __get_user_pages_fast(hva, 1, 1, &page) == 1) {
   795  upgrade_write = true;
   796  } else {
   797  unsigned long pfn;
   798  
   799  /* Call KVM generic code to do the slow-path check */
   800  pfn = __gfn_to_pfn_memslot(memslot, gfn, false, NULL,
   801 writing, upgrade_p);
   802  if (is_error_noslot_pfn(pfn))
   803  return -EFAULT;
   804  page = NULL;
   805  if (pfn_valid(pfn)) {
   806  page = pfn_to_page(pfn);
   807  if (PageReserved(page))
   808  page = NULL;
   809  }
   810  }
   811  
   812  /*
   813   * Read the PTE from the process' radix tree and use that
   814   * so we get the shift and attribute bits.
   815   */
   816  local_irq_disable();
   817  ptep = __find_linux_pte(vcpu->arch.pgdir, hva, NULL, &shift);
   818  /*
   819   * If the PTE disappeared temporarily due to 

Re: [PATCH] scsi: ibmvfc: Mark expected switch fall-throughs

2019-08-07 Thread Martin K. Petersen


Gustavo,

> Mark switch cases where we are expecting to fall through.

Applied to 5.4/scsi-queue, thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH v5 00/10] implement KASLR for powerpc/fsl_booke/32

2019-08-07 Thread Jason Yan




On 2019/8/7 21:12, Michael Ellerman wrote:

Hi Jason,

Jason Yan  writes:

This series implements KASLR for powerpc/fsl_booke/32, as a security
feature that deters exploit attempts relying on knowledge of the location
of kernel internals.


Thanks for doing this work.

Sorry I didn't get a chance to look at this until v5, I sent a few
comments just now. Nothing major though, I think this looks almost ready
to merge.



Thank you. I will try my best to improve the code.


cheers


Since CONFIG_RELOCATABLE is already supported, what we need to do is
map or copy the kernel to a proper place and relocate. Freescale Book-E
parts expect lowmem to be mapped by fixed TLB entries (TLB1). The TLB1
entries are not suitable for mapping the kernel directly in a randomized
region, so we choose to copy the kernel to a proper place and restart to
relocate.

Entropy is derived from the banner and timer base, which will change every
build and boot. This is not so safe by itself, so additionally the
bootloader may pass entropy via the /chosen/kaslr-seed node in the device tree.

We will use the first 512M of the low memory to randomize the kernel
image. The memory will be split into 64M zones. We will use the lower 8
bits of the entropy to decide the index of the 64M zone. Then we choose a
16K-aligned offset inside the 64M zone to put the kernel in.

    KERNELBASE

        |-->   64M   <--|
        |               |
        +---------------+    +----------------+---------------+
        |               |....|    |kernel|    |               |
        +---------------+    +----------------+---------------+
        |                         |
        |----->   offset    <-----|

                              kimage_vaddr
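
A rough sketch of the zone/offset selection described above (the function
name is made up and the code is illustrative, not the series' actual
implementation):

#include <linux/sizes.h>
#include <linux/types.h>

static unsigned long pick_kaslr_offset(u64 seed)
{
	unsigned long zone   = (seed & 0xff) % (SZ_512M / SZ_64M);	/* 8 zones of 64M */
	unsigned long offset = (seed >> 8) % SZ_64M;

	offset &= ~((unsigned long)SZ_16K - 1);		/* keep the 16K alignment */
	return zone * SZ_64M + offset;			/* added to the base by the caller */
}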

We also check if we will overlap with some areas like the dtb area, the
initrd area or the crashkernel area. If we cannot find a proper area,
kaslr will be disabled and the kernel will boot from the original location.

Changes since v4:
  - Add Reviewed-by tag from Christophe
  - Remove an unnecessary cast
  - Remove unnecessary parenthesis
  - Fix checkpatch warning

Changes since v3:
  - Add Reviewed-by and Tested-by tag from Diana
  - Change the comment in fsl_booke_entry_mapping.S to be consistent
with the new code.

Changes since v2:
  - Remove unnecessary #ifdef
  - Use SZ_64M instead of 0x400
  - Call early_init_dt_scan_chosen() to init boot_command_line
  - Rename kaslr_second_init() to kaslr_late_init()

Changes since v1:
  - Remove some useless 'extern' keyword.
  - Replace EXPORT_SYMBOL with EXPORT_SYMBOL_GPL
  - Improve some assembly code
  - Use memzero_explicit instead of memset
  - Use boot_command_line and remove early_command_line
  - Do not print kaslr offset if kaslr is disabled

Jason Yan (10):
   powerpc: unify definition of M_IF_NEEDED
   powerpc: move memstart_addr and kernstart_addr to init-common.c
   powerpc: introduce kimage_vaddr to store the kernel base
   powerpc/fsl_booke/32: introduce create_tlb_entry() helper
   powerpc/fsl_booke/32: introduce reloc_kernel_entry() helper
   powerpc/fsl_booke/32: implement KASLR infrastructure
   powerpc/fsl_booke/32: randomize the kernel image offset
   powerpc/fsl_booke/kaslr: clear the original kernel if randomized
   powerpc/fsl_booke/kaslr: support nokaslr cmdline parameter
   powerpc/fsl_booke/kaslr: dump out kernel offset information on panic

  arch/powerpc/Kconfig  |  11 +
  arch/powerpc/include/asm/nohash/mmu-book3e.h  |  10 +
  arch/powerpc/include/asm/page.h   |   7 +
  arch/powerpc/kernel/Makefile  |   1 +
  arch/powerpc/kernel/early_32.c|   2 +-
  arch/powerpc/kernel/exceptions-64e.S  |  10 -
  arch/powerpc/kernel/fsl_booke_entry_mapping.S |  27 +-
  arch/powerpc/kernel/head_fsl_booke.S  |  55 ++-
  arch/powerpc/kernel/kaslr_booke.c | 427 ++
  arch/powerpc/kernel/machine_kexec.c   |   1 +
  arch/powerpc/kernel/misc_64.S |   5 -
  arch/powerpc/kernel/setup-common.c|  19 +
  arch/powerpc/mm/init-common.c |   7 +
  arch/powerpc/mm/init_32.c |   5 -
  arch/powerpc/mm/init_64.c |   5 -
  arch/powerpc/mm/mmu_decl.h|  10 +
  arch/powerpc/mm/nohash/fsl_booke.c|   8 +-
  17 files changed, 560 insertions(+), 50 deletions(-)
  create mode 100644 arch/powerpc/kernel/kaslr_booke.c

--
2.17.2


.





Re: [PATCH v5 01/10] powerpc: unify definition of M_IF_NEEDED

2019-08-07 Thread Jason Yan




On 2019/8/7 21:13, Michael Ellerman wrote:

Jason Yan  writes:

M_IF_NEEDED is defined too many times. Move it to a common place.


The name is not great, can you call it MAS2_M_IF_NEEDED, which at least
gives a clue what it's for?



OK.


cheers


Signed-off-by: Jason Yan 
Cc: Diana Craciun 
Cc: Michael Ellerman 
Cc: Christophe Leroy 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Nicholas Piggin 
Cc: Kees Cook 
Reviewed-by: Christophe Leroy 
Reviewed-by: Diana Craciun 
Tested-by: Diana Craciun 
---
  arch/powerpc/include/asm/nohash/mmu-book3e.h  | 10 ++
  arch/powerpc/kernel/exceptions-64e.S  | 10 --
  arch/powerpc/kernel/fsl_booke_entry_mapping.S | 10 --
  arch/powerpc/kernel/misc_64.S |  5 -
  4 files changed, 10 insertions(+), 25 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/mmu-book3e.h 
b/arch/powerpc/include/asm/nohash/mmu-book3e.h
index 4c9777d256fb..0877362e48fa 100644
--- a/arch/powerpc/include/asm/nohash/mmu-book3e.h
+++ b/arch/powerpc/include/asm/nohash/mmu-book3e.h
@@ -221,6 +221,16 @@
  #define TLBILX_T_CLASS2   6
  #define TLBILX_T_CLASS3   7
  
+/*

+ * The mapping only needs to be cache-coherent on SMP, except on
+ * Freescale e500mc derivatives where it's also needed for coherent DMA.
+ */
+#if defined(CONFIG_SMP) || defined(CONFIG_PPC_E500MC)
+#define M_IF_NEEDED	MAS2_M
+#else
+#define M_IF_NEEDED	0
+#endif
+
  #ifndef __ASSEMBLY__
  #include 
  
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S

index 1cfb3da4a84a..fd49ec07ce4a 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -1342,16 +1342,6 @@ skpinv:  addir6,r6,1 /* 
Increment */
sync
isync
  
-/*

- * The mapping only needs to be cache-coherent on SMP, except on
- * Freescale e500mc derivatives where it's also needed for coherent DMA.
- */
-#if defined(CONFIG_SMP) || defined(CONFIG_PPC_E500MC)
-#define M_IF_NEEDEDMAS2_M
-#else
-#define M_IF_NEEDED0
-#endif
-
  /* 6. Setup KERNELBASE mapping in TLB[0]
   *
   * r3 = MAS0 w/TLBSEL & ESEL for the entry we started in
diff --git a/arch/powerpc/kernel/fsl_booke_entry_mapping.S 
b/arch/powerpc/kernel/fsl_booke_entry_mapping.S
index ea065282b303..de0980945510 100644
--- a/arch/powerpc/kernel/fsl_booke_entry_mapping.S
+++ b/arch/powerpc/kernel/fsl_booke_entry_mapping.S
@@ -153,16 +153,6 @@ skpinv:addir6,r6,1 /* 
Increment */
tlbivax 0,r9
TLBSYNC
  
-/*

- * The mapping only needs to be cache-coherent on SMP, except on
- * Freescale e500mc derivatives where it's also needed for coherent DMA.
- */
-#if defined(CONFIG_SMP) || defined(CONFIG_PPC_E500MC)
-#define M_IF_NEEDEDMAS2_M
-#else
-#define M_IF_NEEDED0
-#endif
-
  #if defined(ENTRY_MAPPING_BOOT_SETUP)
  
  /* 6. Setup KERNELBASE mapping in TLB1[0] */

diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
index b55a7b4cb543..26074f92d4bc 100644
--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -432,11 +432,6 @@ kexec_create_tlb:
rlwimi  r9,r10,16,4,15  /* Setup MAS0 = TLBSEL | ESEL(r9) */
  
  /* Set up a temp identity mapping v:0 to p:0 and return to it. */

-#if defined(CONFIG_SMP) || defined(CONFIG_PPC_E500MC)
-#define M_IF_NEEDEDMAS2_M
-#else
-#define M_IF_NEEDED0
-#endif
mtspr   SPRN_MAS0,r9
  
  	lis	r9,(MAS1_VALID|MAS1_IPROT)@h

--
2.17.2


.





Re: [PATCH v5 02/10] powerpc: move memstart_addr and kernstart_addr to init-common.c

2019-08-07 Thread Jason Yan




On 2019/8/7 21:02, Michael Ellerman wrote:

Jason Yan  writes:

These two variables are both defined in init_32.c and init_64.c. Move
them to init-common.c.

Signed-off-by: Jason Yan 
Cc: Diana Craciun 
Cc: Michael Ellerman 
Cc: Christophe Leroy 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Nicholas Piggin 
Cc: Kees Cook 
Reviewed-by: Christophe Leroy 
Reviewed-by: Diana Craciun 
Tested-by: Diana Craciun 
---
  arch/powerpc/mm/init-common.c | 5 +
  arch/powerpc/mm/init_32.c | 5 -
  arch/powerpc/mm/init_64.c | 5 -
  3 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/mm/init-common.c b/arch/powerpc/mm/init-common.c
index a84da92920f7..152ae0d21435 100644
--- a/arch/powerpc/mm/init-common.c
+++ b/arch/powerpc/mm/init-common.c
@@ -21,6 +21,11 @@
  #include 
  #include 
  
+phys_addr_t memstart_addr = (phys_addr_t)~0ull;

+EXPORT_SYMBOL_GPL(memstart_addr);
+phys_addr_t kernstart_addr;
+EXPORT_SYMBOL_GPL(kernstart_addr);


Would be nice if these can be __ro_after_init ?



Good idea.


cheers

.





[PATCH v5 0/7] kvmppc: Paravirtualize KVM to support ultravisor

2019-08-07 Thread Claudio Carvalho
Protected Execution Facility (PEF) is an architectural change for POWER 9
that enables Secure Virtual Machines (SVMs). When enabled, PEF adds a new
higher privileged mode, called Ultravisor mode, to POWER architecture.
Along with the new mode there is new firmware called the Protected
Execution Ultravisor (or Ultravisor for short). Ultravisor mode is the
highest privileged mode in POWER architecture.

The Ultravisor calls allow the SVMs and Hypervisor to request services from
the Ultravisor such as accessing a register or memory region that can only
be accessed when running in Ultravisor-privileged mode.

This patch set adds support for Ultravisor calls and does some preparation
for running secure guests.

---
Changelog:
---

v4->v5:

- New patch "Documentation/powerpc: Ultravisor API"

- Patch "v4: KVM: PPC: Ultravisor: Add generic ultravisor call handler":
  - Made global the ucall_norets symbol without adding it to the TOC.
  - Implemented ucall_norets() rather than ucall().
  - Defined the ucall_norets in "asm/asm-prototypes.h" for symbol
versioning.
  - Renamed to "powerpc/kernel: Add ucall_norets() ultravisor call
handler".
  - Updated the commit message.

- Patch "v4: powerpc: Introduce FW_FEATURE_ULTRAVISOR":
  - Changed to scan for a node that is compatible with "ibm,ultravisor"
  - Renamed to "powerpc/powernv: Introduce FW_FEATURE_ULTRAVISOR".
  - Updated the commit message.

- Patch "v4: KVM: PPC: Ultravisor: Restrict flush of the partition tlb
  cache":
  - Merged into "v4: ... Use UV_WRITE_PATE ucall to register a PATE".

- Patch "v4: KVM: PPC: Ultravisor: Use UV_WRITE_PATE ucall to register a
  PATE":
  - Added back the missing "ptesync" instruction in flush_partition().
  - Updated source code comments for the partition table creation.
  - Factored out "powerpc/mm: Write to PTCR only if ultravisor disabled".
  - Cleaned up the code a bit.
  - Renamed to "powerpc/mm: Use UV_WRITE_PATE ucall to register a PATE".
  - Updated the commit message.

- Patch "v4: KVM: PPC: Ultravisor: Restrict LDBAR access":
  - Dropped the change that skips loading the IMC driver if ultravisor
enabled because skiboot will remove the IMC devtree nodes if
ultravisor enabled.
  - Dropped the BEGIN_{END_}FW_FTR_SECTION_NESTED in power8 code.
  - Renamed to "powerpc/powernv: Access LDBAR only if ultravisor
disabled".
  - Updated the commit message.

- Patch "v4: KVM: PPC: Ultravisor: Enter a secure guest":
  - Openned "LOAD_REG_IMMEDIATE(r3, UV_RETURN)" to save instructions
  - Used R2, rather than R11, to pass synthesized interrupts in
UV_RETURN ucall.
  - Dropped the change that preserves the MSR[S] bit in
"kvmppc_msr_interrupt" because that is done by the ultravisor.
  - Hoisted up the load of R6 and R7 to before "bne ret_to_ultra".
  - Cleaned up the code a bit.
  - Renamed to "powerpc/kvm: Use UV_RETURN ucall to return to ultravisor".
  - Updated the commit message.

- Patch "v4: KVM: PPC: Ultravisor: Check for MSR_S during hv_reset_msr":
  - Dropped from the patch set because "kvm_arch->secure_guest" rather
than MSR[S] is used to determine if we need to return to the
ultravisor.

- Patch "v4: KVM: PPC: Ultravisor: Introduce the MSR_S bit":
  - Moved to the patch set "Secure Virtual Machine Enablement" posted by
Thiago Bauermann. MSR[S] is no longer needed in this patch set.

- Rebased to powerpc/next

v3->v4:

- Patch "KVM: PPC: Ultravisor: Add PPC_UV config option":
  - Moved to the patchset "kvmppc: HMM driver to manage pages of secure
guest" v5 that will be posted by Bharata Rao.

- Patch "powerpc: Introduce FW_FEATURE_ULTRAVISOR":
  - Changed to depend only on CONFIG_PPC_POWERNV.

- Patch "KVM: PPC: Ultravisor: Add generic ultravisor call handler":
  - Fixed whitespaces in ucall.S and in ultravisor-api.h.
  - Changed to depend only on CONFIG_PPC_POWERNV.
  - Changed the ucall wrapper to pass the ucall number in R3.

- Patch "KVM: PPC: Ultravisor: Use UV_WRITE_PATE ucall to register a
  PATE:
  - Changed to depend only on CONFIG_PPC_POWERNV.

- Patch "KVM: PPC: Ultravisor: Restrict LDBAR access":
  - Fixed comment in opal-imc.c to be "Disable IMC devices, when
Ultravisor is enabled.
  - Fixed signed-off-by.

- Patch "KVM: PPC: Ultravisor: Enter a secure guest":
  - Changed the UV_RETURN assembly call to save the actual R3 in
R0 for the ultravisor and pass the UV_RETURN call number in R3.

- Patch "KVM: PPC: Ultravisor: Check for MSR_S during hv_reset_msr":
  - Fixed commit message.

- Rebased to powerpc/next.

v2->v3:

- Squashed patches:
  - "KVM: PPC: Ultravisor: Return to UV for hcalls from SVM"
  - "KVM: PPC: Book3S HV: Fixed for running secure guests"
- Renamed patch from/to:
  - "KVM: PPC: Ultravisor: Return to UV for hcalls from SVM"
  - "KVM: PPC: Ultravisor: Enter a secure guest
- Rebased
- Addressed comments from Paul Mackerras
  - Dropped ultravisor checks made in power8 code
  - Updated the commit message for:
   "KVM: PPC: Ultravisor: Enter a se

[PATCH v5 1/7] Documentation/powerpc: Ultravisor API

2019-08-07 Thread Claudio Carvalho
From: Sukadev Bhattiprolu 

POWER9 processor includes support for Protected Execution Facility (PEF).
Attached documentation provides an overview of PEF and defines the API
for various interfaces that must be implemented in the Ultravisor
firmware as well as in the KVM Hypervisor.

Based on input from Mike Anderson, Thiago Bauermann, Claudio Carvalho,
Ben Herrenschmidt, Guerney Hunt, Paul Mackerras.

Signed-off-by: Sukadev Bhattiprolu 
Signed-off-by: Ram Pai 
Signed-off-by: Guerney Hunt 
Reviewed-by: Claudio Carvalho 
Reviewed-by: Michael Anderson 
Reviewed-by: Thiago Bauermann 
Signed-off-by: Claudio Carvalho 
---
 Documentation/powerpc/ultravisor.rst | 1055 ++
 1 file changed, 1055 insertions(+)
 create mode 100644 Documentation/powerpc/ultravisor.rst

diff --git a/Documentation/powerpc/ultravisor.rst 
b/Documentation/powerpc/ultravisor.rst
new file mode 100644
index ..8d5246585b66
--- /dev/null
+++ b/Documentation/powerpc/ultravisor.rst
@@ -0,0 +1,1055 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. _ultravisor:
+
+
+Protected Execution Facility
+
+
+.. contents::
+:depth: 3
+
+.. sectnum::
+:depth: 3
+
+Protected Execution Facility
+
+
+Protected Execution Facility (PEF) is an architectural change for
+POWER 9 that enables Secure Virtual Machines (SVMs). When enabled,
+PEF adds a new higher privileged mode, called Ultravisor mode, to
+POWER architecture. Along with the new mode there is new firmware
+called the Protected Execution Ultravisor (or Ultravisor for short).
+Ultravisor mode is the highest privileged mode in POWER architecture.
+
+   +--+
+   | Privilege States |
+   +==+
+   |  Problem |
+   +--+
+   |  Supervisor  |
+   +--+
+   |  Hypervisor  |
+   +--+
+   |  Ultravisor  |
+   +--+
+
+PEF protects SVMs from the hypervisor, privileged users, and other
+VMs in the system. SVMs are protected while at rest and can only be
+executed by an authorized machine. All virtual machines utilize
+hypervisor services. The Ultravisor filters calls between the SVMs
+and the hypervisor to assure that information does not accidentally
+leak. All hypercalls except H_RANDOM are reflected to the hypervisor.
+H_RANDOM is not reflected to prevent the hypervisor from influencing
+random values in the SVM.
+
+To support this there is a refactoring of the ownership of resources
+in the CPU. Some of the resources which were previously hypervisor
+privileged are now ultravisor privileged.
+
+Hardware
+
+
+The hardware changes include the following:
+
+* There is a new bit in the MSR that determines whether the current
+  process is running in secure mode, MSR(S) bit 41. MSR(S)=1, process
+  is in secure mode, MSR(S)=0 process is in normal mode.
+
+* The MSR(S) bit can only be set by the Ultravisor.
+
+* HRFID cannot be used to set the MSR(S) bit. If the hypervisor needs
+  to return to a SVM it must use an ultracall. It can determine if
+  the VM it is returning to is secure.
+
+* There is a new Ultravisor privileged register, SMFCTRL, which has an
+  enable/disable bit SMFCTRL(E).
+
+* The privilege of a process is now determined by three MSR bits,
+  MSR(S, HV, PR). In each of the tables below the modes are listed
+  from least privilege to highest privilege. The higher privilege
+  modes can access all the resources of the lower privilege modes.
+
+  **Secure Mode MSR Settings**
+
+  +---+---+---+---+
+  | S | HV| PR|Privilege  |
+  +===+===+===+===+
+  | 1 | 0 | 1 | Problem   |
+  +---+---+---+---+
+  | 1 | 0 | 0 | Privileged(OS)|
+  +---+---+---+---+
+  | 1 | 1 | 0 | Ultravisor|
+  +---+---+---+---+
+  | 1 | 1 | 1 | Reserved  |
+  +---+---+---+---+
+
+  **Normal Mode MSR Settings**
+
+  +---+---+---+---+
+  | S | HV| PR|Privilege  |
+  +===+===+===+===+
+  | 0 | 0 | 1 | Problem   |
+  +---+---+---+---+
+  | 0 | 0 | 0 | Privileged(OS)|
+  +---+---+---+---+
+  | 0 | 1 | 0 | Hypervisor|
+  +---+---+---+---+
+  | 0 | 1 | 1 | Problem (HV)  |
+  +---+---+---+---+
+
+* Memory is partitioned into secure and normal memory. Only processes
+  that are running in secure mode can access secure memory.
+
+* The hardware does not allow anything that is not running secure to
+  access secure memory. This means that the Hypervisor cannot access
+  the memory of the SVM without using an ultracall (asking the
+  Ultravisor). The Ultrav

[PATCH v5 2/7] powerpc/kernel: Add ucall_norets() ultravisor call handler

2019-08-07 Thread Claudio Carvalho
The ultracalls (ucalls for short) allow the Secure Virtual Machines
(SVM)s and hypervisor to request services from the ultravisor such as
accessing a register or memory region that can only be accessed when
running in ultravisor-privileged mode.

This patch adds the ucall_norets() ultravisor call handler. Like
plpar_hcall_norets(), it also saves and restores the Condition
Register (CR).

The specific service needed from an ucall is specified in register
R3 (the first parameter to the ucall). Other parameters to the
ucall, if any, are specified in registers R4 through R12.

Return value of all ucalls is in register R3. Other output values
from the ucall, if any, are returned in registers R4 through R12.
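
A hypothetical caller, only to illustrate the convention (UV_WRITE_PATE and
U_SUCCESS are defined later in this series; the function name is made up):

static int example_write_pate(u64 lpid, u64 dw0, u64 dw1)
{
	/* the ucall number goes in r3, the inputs in r4 onwards */
	long rc = ucall_norets(UV_WRITE_PATE, lpid, dw0, dw1);

	if (rc != U_SUCCESS)
		pr_err("UV_WRITE_PATE failed: %ld\n", rc);
	return rc == U_SUCCESS ? 0 : -EIO;
}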

Each ucall returns specific error codes, applicable in the context
of the ucall. However, like with the PowerPC Architecture Platform
Reference (PAPR), if no specific error code is defined for a particular
situation, then the ucall will fall back to an error code based on the
position of the erroneous parameter, i.e. U_PARAMETER, U_P2, U_P3, etc.,
depending on the ucall parameter that may have caused the error.

Every host kernel (powernv) needs to be able to do ucalls in case it
ends up being run in a machine with ultravisor enabled. Otherwise, the
kernel may crash early in boot trying to access ultravisor resources,
for instance, trying to set the partition table entry 0. Secure guests
also need to be able to do ucalls, and their kernels may not have
CONFIG_PPC_POWERNV=y. For that reason, the ucall.S file is placed under
arch/powerpc/kernel.

If ultravisor is not enabled, the ucalls will be redirected to the
hypervisor which must handle/fail the call.

Thanks to inputs from Ram Pai and Michael Anderson.

Signed-off-by: Claudio Carvalho 

---
Ultravisor call support for secure guests is being proposed as part of
the patchset "Secure Virtual Machine Enablement" posted by Thiago
Bauermann.
---
 arch/powerpc/include/asm/asm-prototypes.h | 11 +++
 arch/powerpc/include/asm/ultravisor-api.h | 23 +++
 arch/powerpc/kernel/Makefile  |  1 +
 arch/powerpc/kernel/ucall.S   | 20 
 4 files changed, 55 insertions(+)
 create mode 100644 arch/powerpc/include/asm/ultravisor-api.h
 create mode 100644 arch/powerpc/kernel/ucall.S

diff --git a/arch/powerpc/include/asm/asm-prototypes.h 
b/arch/powerpc/include/asm/asm-prototypes.h
index 296584e6dd55..ee2e67d5a005 100644
--- a/arch/powerpc/include/asm/asm-prototypes.h
+++ b/arch/powerpc/include/asm/asm-prototypes.h
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -38,6 +39,16 @@ extern struct static_key hcall_tracepoint_key;
 void __trace_hcall_entry(unsigned long opcode, unsigned long *args);
 void __trace_hcall_exit(long opcode, long retval, unsigned long *retbuf);
 
+/* Ultravisor */
+#ifdef CONFIG_PPC_POWERNV
+long ucall_norets(unsigned long opcode, ...);
+#else
+static inline long ucall_norets(unsigned long opcode, ...)
+{
+   return U_NOT_AVAILABLE;
+}
+#endif
+
 /* OPAL */
 int64_t __opal_call(int64_t a0, int64_t a1, int64_t a2, int64_t a3,
int64_t a4, int64_t a5, int64_t a6, int64_t a7,
diff --git a/arch/powerpc/include/asm/ultravisor-api.h 
b/arch/powerpc/include/asm/ultravisor-api.h
new file mode 100644
index ..88ffa78f9d61
--- /dev/null
+++ b/arch/powerpc/include/asm/ultravisor-api.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Ultravisor API.
+ *
+ * Copyright 2019, IBM Corporation.
+ *
+ */
+#ifndef _ASM_POWERPC_ULTRAVISOR_API_H
+#define _ASM_POWERPC_ULTRAVISOR_API_H
+
+#include 
+
+/* Return codes */
+#define U_FUNCTION H_FUNCTION
+#define U_NOT_AVAILABLE	H_NOT_AVAILABLE
+#define U_P2   H_P2
+#define U_P3   H_P3
+#define U_P4   H_P4
+#define U_P5   H_P5
+#define U_PARAMETER	H_PARAMETER
+#define U_SUCCESS  H_SUCCESS
+
+#endif /* _ASM_POWERPC_ULTRAVISOR_API_H */
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 56dfa7a2a6f2..35379b632f3c 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -155,6 +155,7 @@ endif
 
 obj-$(CONFIG_EPAPR_PARAVIRT)   += epapr_paravirt.o epapr_hcalls.o
 obj-$(CONFIG_KVM_GUEST)+= kvm.o kvm_emul.o
+obj-$(CONFIG_PPC_POWERNV)  += ucall.o
 
 # Disable GCOV, KCOV & sanitizers in odd or sensitive code
 GCOV_PROFILE_prom_init.o := n
diff --git a/arch/powerpc/kernel/ucall.S b/arch/powerpc/kernel/ucall.S
new file mode 100644
index ..de9133e45d21
--- /dev/null
+++ b/arch/powerpc/kernel/ucall.S
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Generic code to perform an ultravisor call.
+ *
+ * Copyright 2019, IBM Corporation.
+ *
+ */
+#include 
+#include 
+
+_GLOBAL(ucall_norets)
+EXPORT_SYMBOL_GPL(ucall_norets)
+   mfcrr0
+   stw r0,8(r1)
+
+   sc  2   /* Invoke the ultravi

[PATCH v5 3/7] powerpc/powernv: Introduce FW_FEATURE_ULTRAVISOR

2019-08-07 Thread Claudio Carvalho
In PEF enabled systems, some of the resources which were previously
hypervisor privileged are now ultravisor privileged and controlled by
the ultravisor firmware.

This adds FW_FEATURE_ULTRAVISOR to indicate if PEF is enabled.

The host kernel can use FW_FEATURE_ULTRAVISOR, for instance, to skip
accessing resources (e.g. PTCR and LDBAR) in case PEF is enabled.
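
One plausible shape of such a guard (illustrative only; the real change is
the factored-out patch "powerpc/mm: Write to PTCR only if ultravisor
disabled" mentioned in the cover letter changelog):

static void example_set_ptcr(u64 val)
{
	/* PTCR is ultravisor-privileged when PEF is active, so skip it then */
	if (!firmware_has_feature(FW_FEATURE_ULTRAVISOR))
		mtspr(SPRN_PTCR, val);
}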

Signed-off-by: Claudio Carvalho 
[ andmike: Device node name to "ibm,ultravisor" ]
Signed-off-by: Michael Anderson 
---
 arch/powerpc/include/asm/firmware.h |  5 +++--
 arch/powerpc/include/asm/ultravisor.h   | 14 
 arch/powerpc/kernel/prom.c  |  4 
 arch/powerpc/platforms/powernv/Makefile |  1 +
 arch/powerpc/platforms/powernv/ultravisor.c | 24 +
 5 files changed, 46 insertions(+), 2 deletions(-)
 create mode 100644 arch/powerpc/include/asm/ultravisor.h
 create mode 100644 arch/powerpc/platforms/powernv/ultravisor.c

diff --git a/arch/powerpc/include/asm/firmware.h 
b/arch/powerpc/include/asm/firmware.h
index 00bc42d95679..43b48c4d3ca9 100644
--- a/arch/powerpc/include/asm/firmware.h
+++ b/arch/powerpc/include/asm/firmware.h
@@ -54,6 +54,7 @@
 #define FW_FEATURE_DRC_INFOASM_CONST(0x0008)
 #define FW_FEATURE_BLOCK_REMOVE ASM_CONST(0x0010)
 #define FW_FEATURE_PAPR_SCMASM_CONST(0x0020)
+#define FW_FEATURE_ULTRAVISOR  ASM_CONST(0x0040)
 
 #ifndef __ASSEMBLY__
 
@@ -72,9 +73,9 @@ enum {
FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN |
FW_FEATURE_HPT_RESIZE | FW_FEATURE_DRMEM_V2 |
FW_FEATURE_DRC_INFO | FW_FEATURE_BLOCK_REMOVE |
-   FW_FEATURE_PAPR_SCM,
+   FW_FEATURE_PAPR_SCM | FW_FEATURE_ULTRAVISOR,
FW_FEATURE_PSERIES_ALWAYS = 0,
-   FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL,
+   FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL | FW_FEATURE_ULTRAVISOR,
FW_FEATURE_POWERNV_ALWAYS = 0,
FW_FEATURE_PS3_POSSIBLE = FW_FEATURE_LPAR | FW_FEATURE_PS3_LV1,
FW_FEATURE_PS3_ALWAYS = FW_FEATURE_LPAR | FW_FEATURE_PS3_LV1,
diff --git a/arch/powerpc/include/asm/ultravisor.h 
b/arch/powerpc/include/asm/ultravisor.h
new file mode 100644
index ..dc6e1ea198f2
--- /dev/null
+++ b/arch/powerpc/include/asm/ultravisor.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Ultravisor definitions
+ *
+ * Copyright 2019, IBM Corporation.
+ *
+ */
+#ifndef _ASM_POWERPC_ULTRAVISOR_H
+#define _ASM_POWERPC_ULTRAVISOR_H
+
+int early_init_dt_scan_ultravisor(unsigned long node, const char *uname,
+ int depth, void *data);
+
+#endif /* _ASM_POWERPC_ULTRAVISOR_H */
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 4221527b082f..67a2c1b39252 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -59,6 +59,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -706,6 +707,9 @@ void __init early_init_devtree(void *params)
 #ifdef CONFIG_PPC_POWERNV
/* Some machines might need OPAL info for debugging, grab it now. */
of_scan_flat_dt(early_init_dt_scan_opal, NULL);
+
+   /* Scan tree for ultravisor feature */
+   of_scan_flat_dt(early_init_dt_scan_ultravisor, NULL);
 #endif
 
 #ifdef CONFIG_FA_DUMP
diff --git a/arch/powerpc/platforms/powernv/Makefile 
b/arch/powerpc/platforms/powernv/Makefile
index da2e99efbd04..2c27c8ac00c8 100644
--- a/arch/powerpc/platforms/powernv/Makefile
+++ b/arch/powerpc/platforms/powernv/Makefile
@@ -4,6 +4,7 @@ obj-y   += idle.o opal-rtc.o opal-nvram.o 
opal-lpc.o opal-flash.o
 obj-y  += rng.o opal-elog.o opal-dump.o opal-sysparam.o 
opal-sensor.o
 obj-y  += opal-msglog.o opal-hmi.o opal-power.o opal-irqchip.o
 obj-y  += opal-kmsg.o opal-powercap.o opal-psr.o 
opal-sensor-groups.o
+obj-y  += ultravisor.o
 
 obj-$(CONFIG_SMP)  += smp.o subcore.o subcore-asm.o
 obj-$(CONFIG_PCI)  += pci.o pci-ioda.o npu-dma.o pci-ioda-tce.o
diff --git a/arch/powerpc/platforms/powernv/ultravisor.c 
b/arch/powerpc/platforms/powernv/ultravisor.c
new file mode 100644
index ..02ac57b4bded
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/ultravisor.c
@@ -0,0 +1,24 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Ultravisor high level interfaces
+ *
+ * Copyright 2019, IBM Corporation.
+ *
+ */
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+int __init early_init_dt_scan_ultravisor(unsigned long node, const char *uname,
+int depth, void *data)
+{
+   if (!of_flat_dt_is_compatible(node, "ibm,ultravisor"))
+   return 0;
+
+   powerpc_firmware_features |= FW_FEATURE_ULTRAVISOR;
+   pr_debug("Ultravisor detected!\n");
+   return 1;
+}
-- 
2.20.1
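
[ Illustrative note: a minimal sketch of how a caller might consume this
  feature bit; the helper below is made up for the example.  Later
  patches in this series apply exactly this pattern to PTCR and LDBAR. ]

    #include <asm/firmware.h>
    #include <asm/reg.h>

    /* Hypothetical caller: skip an ultravisor-privileged SPR under PEF. */
    static void example_write_ldbar(unsigned long val)
    {
            if (!firmware_has_feature(FW_FEATURE_ULTRAVISOR))
                    mtspr(SPRN_LDBAR, val);
    }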



[PATCH v5 4/7] powerpc/mm: Use UV_WRITE_PATE ucall to register a PATE

2019-08-07 Thread Claudio Carvalho
From: Michael Anderson 

In ultravisor enabled systems, the ultravisor creates and maintains the
partition table in secure memory, which the hypervisor cannot access.
The hypervisor therefore has to issue the UV_WRITE_PATE ucall whenever
it wants to set a partition table entry (PATE).

This patch adds the UV_WRITE_PATE ucall and uses it to set a PATE if the
ultravisor is enabled. Additionally, it keeps a copy of the partition
table because the nestMMU does not have access to secure memory. This
copy has entries for the nonsecure and hypervisor partitions.

Signed-off-by: Michael Anderson 
Signed-off-by: Madhavan Srinivasan 
Signed-off-by: Ram Pai 
[ cclaudio: Write the PATE in HV's table before doing that in UV's ]
Signed-off-by: Claudio Carvalho 
Reviewed-by: Ryan Grimm 
---
 arch/powerpc/include/asm/ultravisor-api.h |  5 ++
 arch/powerpc/include/asm/ultravisor.h |  8 +++
 arch/powerpc/mm/book3s64/pgtable.c| 60 ---
 3 files changed, 56 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/include/asm/ultravisor-api.h 
b/arch/powerpc/include/asm/ultravisor-api.h
index 88ffa78f9d61..8cd49abff4f3 100644
--- a/arch/powerpc/include/asm/ultravisor-api.h
+++ b/arch/powerpc/include/asm/ultravisor-api.h
@@ -11,6 +11,7 @@
 #include 
 
 /* Return codes */
+#define U_BUSY H_BUSY
 #define U_FUNCTION H_FUNCTION
 #define U_NOT_AVAILABLEH_NOT_AVAILABLE
 #define U_P2   H_P2
@@ -18,6 +19,10 @@
 #define U_P4   H_P4
 #define U_P5   H_P5
 #define U_PARAMETERH_PARAMETER
+#define U_PERMISSION   H_PERMISSION
 #define U_SUCCESS  H_SUCCESS
 
+/* opcodes */
+#define UV_WRITE_PATE  0xF104
+
 #endif /* _ASM_POWERPC_ULTRAVISOR_API_H */
diff --git a/arch/powerpc/include/asm/ultravisor.h 
b/arch/powerpc/include/asm/ultravisor.h
index dc6e1ea198f2..6fe1f365dec8 100644
--- a/arch/powerpc/include/asm/ultravisor.h
+++ b/arch/powerpc/include/asm/ultravisor.h
@@ -8,7 +8,15 @@
 #ifndef _ASM_POWERPC_ULTRAVISOR_H
 #define _ASM_POWERPC_ULTRAVISOR_H
 
+#include 
+#include 
+
 int early_init_dt_scan_ultravisor(unsigned long node, const char *uname,
  int depth, void *data);
 
+static inline int uv_register_pate(u64 lpid, u64 dw0, u64 dw1)
+{
+   return ucall_norets(UV_WRITE_PATE, lpid, dw0, dw1);
+}
+
 #endif /* _ASM_POWERPC_ULTRAVISOR_H */
diff --git a/arch/powerpc/mm/book3s64/pgtable.c 
b/arch/powerpc/mm/book3s64/pgtable.c
index 85bc81abd286..033731f5dbaa 100644
--- a/arch/powerpc/mm/book3s64/pgtable.c
+++ b/arch/powerpc/mm/book3s64/pgtable.c
@@ -16,6 +16,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 #include 
@@ -198,7 +200,15 @@ void __init mmu_partition_table_init(void)
unsigned long ptcr;
 
BUILD_BUG_ON_MSG((PATB_SIZE_SHIFT > 36), "Partition table size too 
large.");
-   /* Initialize the Partition Table with no entries */
+   /*
+* Initialize the Partition Table with no entries, even in the presence
+* of an ultravisor firmware.
+*
+* In ultravisor enabled systems, the ultravisor creates and maintains
+* the partition table in secure memory. However, we keep a copy of the
+* partition table because nestMMU cannot access secure memory. Our copy
+* contains entries for nonsecure and hypervisor partition.
+*/
partition_tb = memblock_alloc(patb_size, patb_size);
if (!partition_tb)
panic("%s: Failed to allocate %lu bytes align=0x%lx\n",
@@ -213,34 +223,50 @@ void __init mmu_partition_table_init(void)
powernv_set_nmmu_ptcr(ptcr);
 }
 
-void mmu_partition_table_set_entry(unsigned int lpid, unsigned long dw0,
-  unsigned long dw1)
+/*
+ * Global flush of TLBs and partition table caches for this lpid. The type of
+ * flush (hash or radix) depends on what the previous use of this partition ID
+ * was, not the new use.
+ */
+static void flush_partition(unsigned int lpid, unsigned long old_patb0)
 {
-   unsigned long old = be64_to_cpu(partition_tb[lpid].patb0);
-
-   partition_tb[lpid].patb0 = cpu_to_be64(dw0);
-   partition_tb[lpid].patb1 = cpu_to_be64(dw1);
-
-   /*
-* Global flush of TLBs and partition table caches for this lpid.
-* The type of flush (hash or radix) depends on what the previous
-* use of this partition ID was, not the new use.
-*/
asm volatile("ptesync" : : : "memory");
-   if (old & PATB_HR) {
-   asm volatile(PPC_TLBIE_5(%0,%1,2,0,1) : :
+   if (old_patb0 & PATB_HR) {
+   asm volatile(PPC_TLBIE_5(%0, %1, 2, 0, 1) : :
 "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid));
-   asm volatile(PPC_TLBIE_5(%0,%1,2,1,1) : :
+   asm volatile(PPC_TLBIE_5(%0, %1, 2, 1, 1) : :
 "r" (TLBIEL_INV

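[ Illustrative note: a simplified sketch of the flow this patch
  implements: update the hypervisor's nestMMU-visible copy first, then
  ask the ultravisor to update the real table in secure memory via the
  UV_WRITE_PATE ucall.  The function name is made up; the real code is
  in the patch above. ]

    static void set_pate_sketch(unsigned int lpid, unsigned long dw0,
                                unsigned long dw1)
    {
            unsigned long old = be64_to_cpu(partition_tb[lpid].patb0);

            /* Keep the copy that the nestMMU can see. */
            partition_tb[lpid].patb0 = cpu_to_be64(dw0);
            partition_tb[lpid].patb1 = cpu_to_be64(dw1);

            /* Let the ultravisor update the table in secure memory. */
            if (firmware_has_feature(FW_FEATURE_ULTRAVISOR))
                    uv_register_pate(lpid, dw0, dw1);

            /* Flush type depends on the previous use of this lpid. */
            flush_partition(lpid, old);
    }
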
[PATCH v5 5/7] powerpc/mm: Write to PTCR only if ultravisor disabled

2019-08-07 Thread Claudio Carvalho
In ultravisor enabled systems, PTCR becomes ultravisor privileged only
for writing and an attempt to write to it will cause a Hypervisor
Emulation Assistance interrupt.

This patch adds the try_set_ptcr(val) macro as an accessor to
mtspr(SPRN_PTCR, val), which is executed only if the ultravisor is
disabled.

Signed-off-by: Claudio Carvalho 
---
 arch/powerpc/include/asm/reg.h   | 13 +
 arch/powerpc/mm/book3s64/hash_utils.c|  4 ++--
 arch/powerpc/mm/book3s64/pgtable.c   |  2 +-
 arch/powerpc/mm/book3s64/radix_pgtable.c |  6 +++---
 4 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 10caa145f98b..14139b1ebdb8 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* Pickup Book E specific registers. */
 #if defined(CONFIG_BOOKE) || defined(CONFIG_40x)
@@ -1452,6 +1453,18 @@ static inline void update_power8_hid0(unsigned long hid0)
 */
asm volatile("sync; mtspr %0,%1; isync":: "i"(SPRN_HID0), "r"(hid0));
 }
+
+/*
+ * In ultravisor enabled systems, PTCR becomes ultravisor privileged only for
+ * writing and an attempt to write to it will cause a Hypervisor Emulation
+ * Assistance interrupt.
+ */
+#define try_set_ptcr(val)  \
+   do {\
+   if (!firmware_has_feature(FW_FEATURE_ULTRAVISOR))   \
+   mtspr(SPRN_PTCR, val);  \
+   } while (0)
+
 #endif /* __ASSEMBLY__ */
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_REG_H */
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c 
b/arch/powerpc/mm/book3s64/hash_utils.c
index 25a2cf32d544..048b7f58deae 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -1079,8 +1079,8 @@ void hash__early_init_mmu_secondary(void)
if (!cpu_has_feature(CPU_FTR_ARCH_300))
mtspr(SPRN_SDR1, _SDR1);
else
-   mtspr(SPRN_PTCR,
- __pa(partition_tb) | (PATB_SIZE_SHIFT - 12));
+   try_set_ptcr(__pa(partition_tb) |
+(PATB_SIZE_SHIFT - 12));
}
/* Initialize SLB */
slb_initialize();
diff --git a/arch/powerpc/mm/book3s64/pgtable.c 
b/arch/powerpc/mm/book3s64/pgtable.c
index 033731f5dbaa..016c6ccb5b81 100644
--- a/arch/powerpc/mm/book3s64/pgtable.c
+++ b/arch/powerpc/mm/book3s64/pgtable.c
@@ -219,7 +219,7 @@ void __init mmu_partition_table_init(void)
 * 64 K size.
 */
ptcr = __pa(partition_tb) | (PATB_SIZE_SHIFT - 12);
-   mtspr(SPRN_PTCR, ptcr);
+   try_set_ptcr(ptcr);
powernv_set_nmmu_ptcr(ptcr);
 }
 
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c 
b/arch/powerpc/mm/book3s64/radix_pgtable.c
index e92c6472a20c..246b32550eab 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -654,8 +654,8 @@ void radix__early_init_mmu_secondary(void)
lpcr = mfspr(SPRN_LPCR);
mtspr(SPRN_LPCR, lpcr | LPCR_UPRT | LPCR_HR);
 
-   mtspr(SPRN_PTCR,
- __pa(partition_tb) | (PATB_SIZE_SHIFT - 12));
+   try_set_ptcr(__pa(partition_tb) | (PATB_SIZE_SHIFT - 12));
+
radix_init_amor();
}
 
@@ -671,7 +671,7 @@ void radix__mmu_cleanup_all(void)
if (!firmware_has_feature(FW_FEATURE_LPAR)) {
lpcr = mfspr(SPRN_LPCR);
mtspr(SPRN_LPCR, lpcr & ~LPCR_UPRT);
-   mtspr(SPRN_PTCR, 0);
+   try_set_ptcr(0);
powernv_set_nmmu_ptcr(0);
radix__flush_tlb_all();
}
-- 
2.20.1



[PATCH v5 6/7] powerpc/powernv: Access LDBAR only if ultravisor disabled

2019-08-07 Thread Claudio Carvalho
LDBAR is a per-thread SPR populated and used by the thread-imc pmu
driver to dump the data counter into memory. It holds a memory address
along with a few other configuration bits. LDBAR is populated and
enabled only when any of the thread imc pmu events are monitored.

In ultravisor enabled systems, LDBAR becomes ultravisor privileged and
an attempt to write to it will cause a Hypervisor Emulation Assistance
interrupt.

In ultravisor enabled systems, the ultravisor is responsible for
maintaining the LDBAR (e.g. saving and restoring it).

This restricts LDBAR access to only when the ultravisor is disabled.

Signed-off-by: Claudio Carvalho 
Reviewed-by: Ram Pai 
Reviewed-by: Ryan Grimm 
---
 arch/powerpc/platforms/powernv/idle.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/idle.c 
b/arch/powerpc/platforms/powernv/idle.c
index 210fb73a5121..14018463a8f0 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -679,7 +679,8 @@ static unsigned long power9_idle_stop(unsigned long psscr, 
bool mmu_on)
sprs.ptcr   = mfspr(SPRN_PTCR);
sprs.rpr= mfspr(SPRN_RPR);
sprs.tscr   = mfspr(SPRN_TSCR);
-   sprs.ldbar  = mfspr(SPRN_LDBAR);
+   if (!firmware_has_feature(FW_FEATURE_ULTRAVISOR))
+   sprs.ldbar = mfspr(SPRN_LDBAR);
 
sprs_saved = true;
 
@@ -793,7 +794,8 @@ static unsigned long power9_idle_stop(unsigned long psscr, 
bool mmu_on)
mtspr(SPRN_MMCR0,   sprs.mmcr0);
mtspr(SPRN_MMCR1,   sprs.mmcr1);
mtspr(SPRN_MMCR2,   sprs.mmcr2);
-   mtspr(SPRN_LDBAR,   sprs.ldbar);
+   if (!firmware_has_feature(FW_FEATURE_ULTRAVISOR))
+   mtspr(SPRN_LDBAR, sprs.ldbar);
 
mtspr(SPRN_SPRG3,   local_paca->sprg_vdso);
 
-- 
2.20.1



[PATCH v5 7/7] powerpc/kvm: Use UV_RETURN ucall to return to ultravisor

2019-08-07 Thread Claudio Carvalho
From: Sukadev Bhattiprolu 

When an SVM makes a hypercall or incurs some other exception, the
Ultravisor usually forwards (a.k.a. reflects) the exception to the
Hypervisor. After processing the exception, the Hypervisor uses the
UV_RETURN ultracall to return control back to the SVM.

The expected register state on entry to this ultracall is:

* Non-volatile registers are restored to their original values.
* If returning from a hypercall, register R0 contains the return value
  (unlike other ultracalls) and registers R4 through R12 contain any
  output values of the hypercall.
* R3 contains the ultracall number, i.e UV_RETURN.
* If returning with a synthesized interrupt, R2 contains the
  synthesized interrupt number.

Thanks to input from Paul Mackerras, Ram Pai and Mike Anderson.

Signed-off-by: Sukadev Bhattiprolu 
Signed-off-by: Claudio Carvalho 
---
 arch/powerpc/include/asm/kvm_host.h   |  1 +
 arch/powerpc/include/asm/ultravisor-api.h |  1 +
 arch/powerpc/kernel/asm-offsets.c |  1 +
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   | 39 +++
 4 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 013c76a0a03e..184becb62ea4 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -294,6 +294,7 @@ struct kvm_arch {
cpumask_t cpu_in_guest;
u8 radix;
u8 fwnmi_enabled;
+   u8 secure_guest;
bool threads_indep;
bool nested_enable;
pgd_t *pgtable;
diff --git a/arch/powerpc/include/asm/ultravisor-api.h 
b/arch/powerpc/include/asm/ultravisor-api.h
index 8cd49abff4f3..6a0f9c74f959 100644
--- a/arch/powerpc/include/asm/ultravisor-api.h
+++ b/arch/powerpc/include/asm/ultravisor-api.h
@@ -24,5 +24,6 @@
 
 /* opcodes */
 #define UV_WRITE_PATE  0xF104
+#define UV_RETURN  0xF11C
 
 #endif /* _ASM_POWERPC_ULTRAVISOR_API_H */
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 524a7bba0ee5..aadc6176824b 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -510,6 +510,7 @@ int main(void)
OFFSET(KVM_VRMA_SLB_V, kvm, arch.vrma_slb_v);
OFFSET(KVM_RADIX, kvm, arch.radix);
OFFSET(KVM_FWNMI, kvm, arch.fwnmi_enabled);
+   OFFSET(KVM_SECURE_GUEST, kvm, arch.secure_guest);
OFFSET(VCPU_DSISR, kvm_vcpu, arch.shregs.dsisr);
OFFSET(VCPU_DAR, kvm_vcpu, arch.shregs.dar);
OFFSET(VCPU_VPA, kvm_vcpu, arch.vpa.pinned_addr);
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index bc18366cd1ba..0a5b2a8236c7 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* Sign-extend HDEC if not on POWER9 */
 #define EXTEND_HDEC(reg)   \
@@ -1090,16 +1091,10 @@ BEGIN_FTR_SECTION
 END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 
ld  r5, VCPU_LR(r4)
-   ld  r6, VCPU_CR(r4)
mtlrr5
-   mtcrr6
 
ld  r1, VCPU_GPR(R1)(r4)
-   ld  r2, VCPU_GPR(R2)(r4)
-   ld  r3, VCPU_GPR(R3)(r4)
ld  r5, VCPU_GPR(R5)(r4)
-   ld  r6, VCPU_GPR(R6)(r4)
-   ld  r7, VCPU_GPR(R7)(r4)
ld  r8, VCPU_GPR(R8)(r4)
ld  r9, VCPU_GPR(R9)(r4)
ld  r10, VCPU_GPR(R10)(r4)
@@ -1117,10 +1112,42 @@ BEGIN_FTR_SECTION
mtspr   SPRN_HDSISR, r0
 END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
 
+   ld  r6, VCPU_KVM(r4)
+   lbz r7, KVM_SECURE_GUEST(r6)
+   cmpdi   r7, 0
+   ld  r6, VCPU_GPR(R6)(r4)
+   ld  r7, VCPU_GPR(R7)(r4)
+   bne ret_to_ultra
+
+   lwz r0, VCPU_CR(r4)
+   mtcrr0
+
ld  r0, VCPU_GPR(R0)(r4)
+   ld  r2, VCPU_GPR(R2)(r4)
+   ld  r3, VCPU_GPR(R3)(r4)
ld  r4, VCPU_GPR(R4)(r4)
HRFI_TO_GUEST
b   .
+/*
+ * Use UV_RETURN ultracall to return control back to the Ultravisor after
+ * processing an hypercall or interrupt that was forwarded (a.k.a. reflected)
+ * to the Hypervisor.
+ *
+ * All registers have already been loaded, except:
+ *   R0 = hcall result
+ *   R2 = SRR1, so UV can detect a synthesized interrupt (if any)
+ *   R3 = UV_RETURN
+ */
+ret_to_ultra:
+   lwz r0, VCPU_CR(r4)
+   mtcrr0
+
+   ld  r0, VCPU_GPR(R3)(r4)
+   mfspr   r2, SPRN_SRR1
+   li  r3, 0
+   ori r3, r3, UV_RETURN
+   ld  r4, VCPU_GPR(R4)(r4)
+   sc  2
 
 /*
  * Enter the guest on a P9 or later system where we have exactly
-- 
2.20.1



Re: [PATCH v5 03/10] powerpc: introduce kimage_vaddr to store the kernel base

2019-08-07 Thread Jason Yan




On 2019/8/7 21:03, Michael Ellerman wrote:

Jason Yan  writes:

Now the kernel base is a fixed value - KERNELBASE. To support KASLR, we
need a variable to store the kernel base.

Signed-off-by: Jason Yan 
Cc: Diana Craciun 
Cc: Michael Ellerman 
Cc: Christophe Leroy 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Nicholas Piggin 
Cc: Kees Cook 
Reviewed-by: Christophe Leroy 
Reviewed-by: Diana Craciun 
Tested-by: Diana Craciun 
---
  arch/powerpc/include/asm/page.h | 2 ++
  arch/powerpc/mm/init-common.c   | 2 ++
  2 files changed, 4 insertions(+)

diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index 0d52f57fca04..60a68d3a54b1 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -315,6 +315,8 @@ void arch_free_page(struct page *page, int order);
  
  struct vm_area_struct;
  
+extern unsigned long kimage_vaddr;

+
  #include 
  #endif /* __ASSEMBLY__ */
  #include 
diff --git a/arch/powerpc/mm/init-common.c b/arch/powerpc/mm/init-common.c
index 152ae0d21435..d4801ce48dc5 100644
--- a/arch/powerpc/mm/init-common.c
+++ b/arch/powerpc/mm/init-common.c
@@ -25,6 +25,8 @@ phys_addr_t memstart_addr = (phys_addr_t)~0ull;
  EXPORT_SYMBOL_GPL(memstart_addr);
  phys_addr_t kernstart_addr;
  EXPORT_SYMBOL_GPL(kernstart_addr);
+unsigned long kimage_vaddr = KERNELBASE;
+EXPORT_SYMBOL_GPL(kimage_vaddr);


The names of the #defines and variables we use for these values are not
very consistent already, but using kimage_vaddr makes it worse I think.

Isn't this going to have the same value as kernstart_addr, but the
virtual rather than physical address?



Yes, that's true.


If so kernstart_virt_addr would seem better.



OK, I will take kernstart_virt_addr if no better name appears.


cheers

.





Re: [PATCH v2 09/44] powerpc/64s/pseries: machine check convert to use common event code

2019-08-07 Thread kbuild test robot
Hi Nicholas,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[cannot apply to v5.3-rc3 next-20190807]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Nicholas-Piggin/powerpc-64s-exception-cleanup-and-macrofiy/20190802-11
config: powerpc-defconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 7.4.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.4.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot 

All errors (new ones prefixed by >>):

   arch/powerpc/platforms/pseries/ras.c: In function 'mce_handle_error':
>> arch/powerpc/platforms/pseries/ras.c:563:28: error: this statement may fall 
>> through [-Werror=implicit-fallthrough=]
   mce_err.u.ue_error_type = MCE_UE_ERROR_IFETCH;
   ^
   arch/powerpc/platforms/pseries/ras.c:564:3: note: here
  case MC_ERROR_UE_PAGE_TABLE_WALK_IFETCH:
  ^~~~
   arch/powerpc/platforms/pseries/ras.c:565:28: error: this statement may fall 
through [-Werror=implicit-fallthrough=]
   mce_err.u.ue_error_type = MCE_UE_ERROR_PAGE_TABLE_WALK_IFETCH;
   ^
   arch/powerpc/platforms/pseries/ras.c:566:3: note: here
  case MC_ERROR_UE_LOAD_STORE:
  ^~~~
   arch/powerpc/platforms/pseries/ras.c:567:28: error: this statement may fall 
through [-Werror=implicit-fallthrough=]
   mce_err.u.ue_error_type = MCE_UE_ERROR_LOAD_STORE;
   ^
   arch/powerpc/platforms/pseries/ras.c:568:3: note: here
  case MC_ERROR_UE_PAGE_TABLE_WALK_LOAD_STORE:
  ^~~~
   arch/powerpc/platforms/pseries/ras.c:569:28: error: this statement may fall 
through [-Werror=implicit-fallthrough=]
   mce_err.u.ue_error_type = MCE_UE_ERROR_PAGE_TABLE_WALK_LOAD_STORE;
   ^
   arch/powerpc/platforms/pseries/ras.c:570:3: note: here
  case MC_ERROR_UE_INDETERMINATE:
  ^~~~
   cc1: all warnings being treated as errors

vim +563 arch/powerpc/platforms/pseries/ras.c

   496  
   497  
   498  static int mce_handle_error(struct pt_regs *regs, struct rtas_error_log 
*errp)
   499  {
   500  struct mce_error_info mce_err = { 0 };
   501  unsigned long eaddr = 0, paddr = 0;
   502  struct pseries_errorlog *pseries_log;
   503  struct pseries_mc_errorlog *mce_log;
   504  int disposition = rtas_error_disposition(errp);
   505  int initiator = rtas_error_initiator(errp);
   506  int severity = rtas_error_severity(errp);
   507  u8 error_type, err_sub_type;
   508  
   509  if (initiator == RTAS_INITIATOR_UNKNOWN)
   510  mce_err.initiator = MCE_INITIATOR_UNKNOWN;
   511  else if (initiator == RTAS_INITIATOR_CPU)
   512  mce_err.initiator = MCE_INITIATOR_CPU;
   513  else if (initiator == RTAS_INITIATOR_PCI)
   514  mce_err.initiator = MCE_INITIATOR_PCI;
   515  else if (initiator == RTAS_INITIATOR_ISA)
   516  mce_err.initiator = MCE_INITIATOR_ISA;
   517  else if (initiator == RTAS_INITIATOR_MEMORY)
   518  mce_err.initiator = MCE_INITIATOR_MEMORY;
   519  else if (initiator == RTAS_INITIATOR_POWERMGM)
   520  mce_err.initiator = MCE_INITIATOR_POWERMGM;
   521  else
   522  mce_err.initiator = MCE_INITIATOR_UNKNOWN;
   523  
   524  if (severity == RTAS_SEVERITY_NO_ERROR)
   525  mce_err.severity = MCE_SEV_NO_ERROR;
   526  else if (severity == RTAS_SEVERITY_EVENT)
   527  mce_err.severity = MCE_SEV_WARNING;
   528  else if (severity == RTAS_SEVERITY_WARNING)
   529  mce_err.severity = MCE_SEV_WARNING;
   530  else if (severity == RTAS_SEVERITY_ERROR_SYNC)
   531  mce_err.severity = MCE_SEV_SEVERE;
   532  else if (severity == RTAS_SEVERITY_ERROR)
   533  mce_err.severity = MCE_SEV_SEVERE;
   534  else if (severity == RTAS_SEVERITY_FATAL)
   535  mce_err.severity = MCE_SEV_FATAL;
   536  else
   537  mce_err.severity = MCE_SEV_FATAL;
   538  
   539  if (severity <= RTAS_SEVERITY_ERROR_SYNC)
   540  mce_err.sync_error = true;
   541  else
   542  mce_err.sync_error = false;
   543  
   544  mce_err.error_type = MCE_ERROR_TYPE_UNKNOWN;
   545  mce_err.error_class = MCE_ECLASS_UNKNO
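
[ Illustrative note: if each sub-type is meant to select its own MCE_UE_*
  value, the usual resolution is an explicit break per case, sketched
  below; if the fallthrough is deliberate, GCC 7+ wants a
  "/* fall through */" annotation before the next label instead.  This
  is not the actual fix, just the common shape of one. ]

    switch (err_sub_type) {
    case MC_ERROR_UE_PAGE_TABLE_WALK_IFETCH:
            mce_err.u.ue_error_type = MCE_UE_ERROR_PAGE_TABLE_WALK_IFETCH;
            break;
    case MC_ERROR_UE_LOAD_STORE:
            mce_err.u.ue_error_type = MCE_UE_ERROR_LOAD_STORE;
            break;
    case MC_ERROR_UE_PAGE_TABLE_WALK_LOAD_STORE:
            mce_err.u.ue_error_type = MCE_UE_ERROR_PAGE_TABLE_WALK_LOAD_STORE;
            break;
    default:
            break;
    }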

Re: [PATCH v3 38/41] powerpc: convert put_page() to put_user_page*()

2019-08-07 Thread Michael Ellerman
Hi John,

john.hubb...@gmail.com writes:
> diff --git a/arch/powerpc/mm/book3s64/iommu_api.c 
> b/arch/powerpc/mm/book3s64/iommu_api.c
> index b056cae3388b..e126193ba295 100644
> --- a/arch/powerpc/mm/book3s64/iommu_api.c
> +++ b/arch/powerpc/mm/book3s64/iommu_api.c
> @@ -203,6 +202,7 @@ static void mm_iommu_unpin(struct 
> mm_iommu_table_group_mem_t *mem)
>  {
>   long i;
>   struct page *page = NULL;
> + bool dirty = false;

I don't think you need that initialisation do you?

>   if (!mem->hpas)
>   return;
> @@ -215,10 +215,9 @@ static void mm_iommu_unpin(struct 
> mm_iommu_table_group_mem_t *mem)
>   if (!page)
>   continue;
>  
> - if (mem->hpas[i] & MM_IOMMU_TABLE_GROUP_PAGE_DIRTY)
> - SetPageDirty(page);
> + dirty = mem->hpas[i] & MM_IOMMU_TABLE_GROUP_PAGE_DIRTY;
> - put_page(page);
> + put_user_pages_dirty_lock(&page, 1, dirty);
>   mem->hpas[i] = 0;
>   }
>  }

cheers


Re: [PATCH] powerpc/fadump: sysfs for fadump memory reservation

2019-08-07 Thread Sourabh Jain



On 8/7/19 8:40 AM, Michael Ellerman wrote:
> Sourabh Jain  writes:
>> Add a sys interface to allow querying the memory reserved by fadump
>> for saving the crash dump.
>>
>> Signed-off-by: Sourabh Jain 
>> ---
>>  Documentation/powerpc/firmware-assisted-dump.rst |  5 +
>>  arch/powerpc/kernel/fadump.c | 14 ++
>>  2 files changed, 19 insertions(+)
>>
>> diff --git a/Documentation/powerpc/firmware-assisted-dump.rst 
>> b/Documentation/powerpc/firmware-assisted-dump.rst
>> index 9ca12830a48e..4a7f6dc556f5 100644
>> --- a/Documentation/powerpc/firmware-assisted-dump.rst
>> +++ b/Documentation/powerpc/firmware-assisted-dump.rst
>> @@ -222,6 +222,11 @@ Here is the list of files under kernel sysfs:
>>  be handled and vmcore will not be captured. This interface can be
>>  easily integrated with kdump service start/stop.
>>  
>> +/sys/kernel/fadump_mem_reserved
>> +
>> +   This is used to display the memory reserved by fadump for saving the
>> +   crash dump.
>> +
>>   /sys/kernel/fadump_release_mem
>>  This file is available only when fadump is active during
>>  second kernel. This is used to release the reserved memory
> 
> Dumping these in /sys/kernel is pretty gross, but I guess that ship has
> sailed.
> 
> But please add it to Documentation/ABI, and Cc the appropriate lists/people.

Sure, I will write the ABI documentation and will send the next version.

> 
> cheers
> 
>> diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
>> index 4eab97292cc2..70d49013ebec 100644
>> --- a/arch/powerpc/kernel/fadump.c
>> +++ b/arch/powerpc/kernel/fadump.c
>> @@ -1514,6 +1514,13 @@ static ssize_t fadump_enabled_show(struct kobject 
>> *kobj,
>>  return sprintf(buf, "%d\n", fw_dump.fadump_enabled);
>>  }
>>  
>> +static ssize_t fadump_mem_reserved_show(struct kobject *kobj,
>> +struct kobj_attribute *attr,
>> +char *buf)
>> +{
>> +return sprintf(buf, "%ld\n", fw_dump.reserve_dump_area_size);
>> +}
>> +
>>  static ssize_t fadump_register_show(struct kobject *kobj,
>>  struct kobj_attribute *attr,
>>  char *buf)
>> @@ -1632,6 +1639,9 @@ static struct kobj_attribute fadump_attr = 
>> __ATTR(fadump_enabled,
>>  static struct kobj_attribute fadump_register_attr = 
>> __ATTR(fadump_registered,
>>  0644, fadump_register_show,
>>  fadump_register_store);
>> +static struct kobj_attribute fadump_mem_reserved_attr =
>> +__ATTR(fadump_mem_reserved, 0444,
>> +fadump_mem_reserved_show, NULL);
>>  
>>  DEFINE_SHOW_ATTRIBUTE(fadump_region);
>>  
>> @@ -1663,6 +1673,10 @@ static void fadump_init_files(void)
>>  printk(KERN_ERR "fadump: unable to create sysfs file"
>>  " fadump_release_mem (%d)\n", rc);
>>  }
>> +rc = sysfs_create_file(kernel_kobj, &fadump_mem_reserved_attr.attr);
>> +if (rc)
>> +pr_err("unable to create sysfs file fadump_mem_reserved (%d)\n",
>> +rc);
>>  return;
>>  }
>>  
>> -- 
>> 2.17.2
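
[ Illustrative note: a hypothetical userspace consumer of the new
  attribute.  The path and single-value format follow the patch above;
  everything else is made up for the example. ]

    #include <stdio.h>

    int main(void)
    {
            unsigned long reserved;
            FILE *f = fopen("/sys/kernel/fadump_mem_reserved", "r");

            if (!f)
                    return 1;
            if (fscanf(f, "%lu", &reserved) == 1)
                    printf("fadump_mem_reserved: %lu\n", reserved);
            fclose(f);
            return 0;
    }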



Re: [PATCH v2 09/44] powerpc/64s/pseries: machine check convert to use common event code

2019-08-07 Thread kbuild test robot
Hi Nicholas,

I love your patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[cannot apply to v5.3-rc3 next-20190807]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Nicholas-Piggin/powerpc-64s-exception-cleanup-and-macrofiy/20190802-11
config: powerpc-allmodconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 7.4.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.4.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot 

All warnings (new ones prefixed by >>):

   arch/powerpc/platforms/pseries/ras.c: In function 'mce_handle_error':
>> arch/powerpc/platforms/pseries/ras.c:563:28: warning: this statement may 
>> fall through [-Wimplicit-fallthrough=]
   mce_err.u.ue_error_type = MCE_UE_ERROR_IFETCH;
   ^
   arch/powerpc/platforms/pseries/ras.c:564:3: note: here
  case MC_ERROR_UE_PAGE_TABLE_WALK_IFETCH:
  ^~~~
   arch/powerpc/platforms/pseries/ras.c:565:28: warning: this statement may 
fall through [-Wimplicit-fallthrough=]
   mce_err.u.ue_error_type = MCE_UE_ERROR_PAGE_TABLE_WALK_IFETCH;
   ^
   arch/powerpc/platforms/pseries/ras.c:566:3: note: here
  case MC_ERROR_UE_LOAD_STORE:
  ^~~~
   arch/powerpc/platforms/pseries/ras.c:567:28: warning: this statement may 
fall through [-Wimplicit-fallthrough=]
   mce_err.u.ue_error_type = MCE_UE_ERROR_LOAD_STORE;
   ^
   arch/powerpc/platforms/pseries/ras.c:568:3: note: here
  case MC_ERROR_UE_PAGE_TABLE_WALK_LOAD_STORE:
  ^~~~
   arch/powerpc/platforms/pseries/ras.c:569:28: warning: this statement may 
fall through [-Wimplicit-fallthrough=]
   mce_err.u.ue_error_type = MCE_UE_ERROR_PAGE_TABLE_WALK_LOAD_STORE;
   ^
   arch/powerpc/platforms/pseries/ras.c:570:3: note: here
  case MC_ERROR_UE_INDETERMINATE:
  ^~~~

vim +563 arch/powerpc/platforms/pseries/ras.c

   496  
   497  
   498  static int mce_handle_error(struct pt_regs *regs, struct rtas_error_log 
*errp)
   499  {
   500  struct mce_error_info mce_err = { 0 };
   501  unsigned long eaddr = 0, paddr = 0;
   502  struct pseries_errorlog *pseries_log;
   503  struct pseries_mc_errorlog *mce_log;
   504  int disposition = rtas_error_disposition(errp);
   505  int initiator = rtas_error_initiator(errp);
   506  int severity = rtas_error_severity(errp);
   507  u8 error_type, err_sub_type;
   508  
   509  if (initiator == RTAS_INITIATOR_UNKNOWN)
   510  mce_err.initiator = MCE_INITIATOR_UNKNOWN;
   511  else if (initiator == RTAS_INITIATOR_CPU)
   512  mce_err.initiator = MCE_INITIATOR_CPU;
   513  else if (initiator == RTAS_INITIATOR_PCI)
   514  mce_err.initiator = MCE_INITIATOR_PCI;
   515  else if (initiator == RTAS_INITIATOR_ISA)
   516  mce_err.initiator = MCE_INITIATOR_ISA;
   517  else if (initiator == RTAS_INITIATOR_MEMORY)
   518  mce_err.initiator = MCE_INITIATOR_MEMORY;
   519  else if (initiator == RTAS_INITIATOR_POWERMGM)
   520  mce_err.initiator = MCE_INITIATOR_POWERMGM;
   521  else
   522  mce_err.initiator = MCE_INITIATOR_UNKNOWN;
   523  
   524  if (severity == RTAS_SEVERITY_NO_ERROR)
   525  mce_err.severity = MCE_SEV_NO_ERROR;
   526  else if (severity == RTAS_SEVERITY_EVENT)
   527  mce_err.severity = MCE_SEV_WARNING;
   528  else if (severity == RTAS_SEVERITY_WARNING)
   529  mce_err.severity = MCE_SEV_WARNING;
   530  else if (severity == RTAS_SEVERITY_ERROR_SYNC)
   531  mce_err.severity = MCE_SEV_SEVERE;
   532  else if (severity == RTAS_SEVERITY_ERROR)
   533  mce_err.severity = MCE_SEV_SEVERE;
   534  else if (severity == RTAS_SEVERITY_FATAL)
   535  mce_err.severity = MCE_SEV_FATAL;
   536  else
   537  mce_err.severity = MCE_SEV_FATAL;
   538  
   539  if (severity <= RTAS_SEVERITY_ERROR_SYNC)
   540  mce_err.sync_error = true;
   541  else
   542  mce_err.sync_error = false;
   543  
   544  mce_err.error_type = MCE_ERROR_TYPE_UNKNOWN;
   545  mce_err.error_class = MCE_ECLASS_UNKNOWN;
   546  
   547  if (!rtas_error_extended(errp))
 

Re: [PATCH v5 06/10] powerpc/fsl_booke/32: implement KASLR infrastructure

2019-08-07 Thread Jason Yan




On 2019/8/7 21:04, Michael Ellerman wrote:

Jason Yan  writes:

This patch adds support for booting the kernel from places other than
KERNELBASE. Since CONFIG_RELOCATABLE is already supported, what we need
to do is map or copy the kernel to a proper place and relocate.
Freescale Book-E parts expect lowmem to be mapped by fixed TLB entries
(TLB1). The TLB1 entries are not suitable for mapping the kernel
directly in a randomized region, so we chose to copy the kernel to a
proper place and restart to relocate.


So to be 100% clear you are randomising the location of the kernel in
virtual and physical space, by the same amount, and retaining the 1:1
linear mapping.



100% right :)


diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 77f6ebf97113..755378887912 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -548,6 +548,17 @@ config RELOCATABLE
  setting can still be useful to bootwrappers that need to know the
  load address of the kernel (eg. u-boot/mkimage).
  
+config RANDOMIZE_BASE

+   bool "Randomize the address of the kernel image"
+   depends on (FSL_BOOKE && FLATMEM && PPC32)
+   select RELOCATABLE


I think this should depend on RELOCATABLE, rather than selecting it.


diff --git a/arch/powerpc/kernel/kaslr_booke.c 
b/arch/powerpc/kernel/kaslr_booke.c
new file mode 100644
index ..30f84c0321b2
--- /dev/null
+++ b/arch/powerpc/kernel/kaslr_booke.c
@@ -0,0 +1,84 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2019 Jason Yan 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.


You don't need that paragraph now that you have the SPDX tag.

Rather than using a '//' comment followed by a single line block comment
you can format it as:

// SPDX-License-Identifier: GPL-2.0-only
//
// Copyright (C) 2019 Jason Yan 
>

+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 


Do you really need all those headers?



I will remove useless headers.


+extern int is_second_reloc;


That should be in a header.

Any reason why it isn't a bool?



Oh yes, it should be in a header. This variable is already defined 
before and also used in assembly code. I think it was not defined as a 
bool just because there is no 'bool' in assembly code.



cheers


.
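
[ Illustrative note: a minimal sketch of the declaration move being
  suggested.  The header chosen here is hypothetical; the point is one
  shared declaration instead of an ad-hoc extern, kept as an int because
  the variable is also set from assembly. ]

    /* e.g. in arch/powerpc/mm/mmu_decl.h (hypothetical location) */
    extern int is_second_reloc;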





powerpc flush_inval_dcache_range() was buggy until v5.3-rc1 (was Re: [PATCH 4/4] powerpc/64: reuse PPC32 static inline flush_dcache_range())

2019-08-07 Thread Michael Ellerman
[ deliberately broke threading so this doesn't get buried ]

Christophe Leroy  writes:
> diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
> index a4fd536efb44..1b0a42c50ef1 100644
> --- a/arch/powerpc/kernel/misc_64.S
> +++ b/arch/powerpc/kernel/misc_64.S
> @@ -115,35 +115,6 @@ _ASM_NOKPROBE_SYMBOL(flush_icache_range)
>  EXPORT_SYMBOL(flush_icache_range)
>  
>  /*
> - * Like above, but only do the D-cache.
> - *
> - * flush_dcache_range(unsigned long start, unsigned long stop)
> - *
> - *flush all bytes from start to stop-1 inclusive
> - */
> -
> -_GLOBAL_TOC(flush_dcache_range)
> - ld  r10,PPC64_CACHES@toc(r2)
> - lwz r7,DCACHEL1BLOCKSIZE(r10)   /* Get dcache block size */
> - addir5,r7,-1
> - andcr6,r3,r5/* round low to line bdy */
> - subfr8,r6,r4/* compute length */
> - add r8,r8,r5/* ensure we get enough */
> - lwz r9,DCACHEL1LOGBLOCKSIZE(r10)/* Get log-2 of dcache block size */
> - srw.r8,r8,r9/* compute line count */
  ^
> - beqlr   /* nothing to do? */

Alastair noticed that this was a 32-bit right shift.

Meaning if you called flush_dcache_range() with a range larger than 4GB,
it did nothing and returned.

That code (which was previously called flush_inval_dcache_range()) was
merged back in 2005:

  
https://github.com/mpe/linux-fullhistory/commit/faa5ee3743ff9b6df9f9a03600e34fdae596cfb2#diff-67c7ffa8e420c7d4206cae4a9e888e14


Back then it was only used by the smu.c driver, which presumably wasn't
flushing more than 4GB.

Over time it grew more users:

  v4.17 (Apr 2018): fb5924fddf9e ("powerpc/mm: Flush cache on memory 
hot(un)plug")
  v4.15 (Nov 2017): 6c44741d75a2 ("powerpc/lib: Implement UACCESS_FLUSHCACHE 
API")
  v4.15 (Nov 2017): 32ce3862af3c ("powerpc/lib: Implement PMEM API")
  v4.8  (Jul 2016): c40785ad305b ("powerpc/dart: Use a cachable DART")

The DART case doesn't matter, but the others probably could. I assume
the lack of bug reports is due to the fact that pmem stuff is still in
development and the lack of flushing usually doesn't actually matter? Or
are people flushing/hotplugging < 4G at a time?

Anyway we probably want to backport the fix below to various places?

cheers


diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
index 1ad4089dd110..802f5abbf061 100644
--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -148,7 +148,7 @@ _GLOBAL(flush_inval_dcache_range)
subfr8,r6,r4/* compute length */
add r8,r8,r5/* ensure we get enough */
lwz r9,DCACHEL1LOGBLOCKSIZE(r10)/* Get log-2 of dcache block size */
-   srw.r8,r8,r9/* compute line count */
+   srd.r8,r8,r9/* compute line count */
beqlr   /* nothing to do? */
sync
isync
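
[ Illustrative note: a standalone C sketch of the failure mode, assuming
  a 64-bit unsigned long as on ppc64 and 128-byte cache blocks.  With
  srw. only the low 32 bits of the length take part in the shift, so a
  4GB range yields a line count of zero and the beqlr returns without
  flushing anything; srd. computes the full count. ]

    #include <stdio.h>

    int main(void)
    {
            unsigned long len = 4UL << 30;          /* flush 4GB */
            unsigned int log_bsize = 7;             /* 128-byte cache blocks */

            unsigned long lines_srw = (unsigned int)len >> log_bsize;
            unsigned long lines_srd = len >> log_bsize;

            /* prints "srw: 0 lines, srd: 33554432 lines" */
            printf("srw: %lu lines, srd: %lu lines\n", lines_srw, lines_srd);
            return 0;
    }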