Re: [PATCH v2 0/5] cpu/speculation: Add 'mitigations=' cmdline option

2019-04-16 Thread Jiri Kosina
On Fri, 12 Apr 2019, Josh Poimboeuf wrote: > v2: > - docs improvements: [Randy, Michael] > - Rename to "mitigations=" [Michael] > - Add cpu_mitigations_off() function wrapper [Michael] > - x86: Simplify logic [Boris] > - powerpc: Fix no_rfi_flush checking bug (use '&&' instead of '||') > - arm64:
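
A minimal user-space sketch of the idea behind a 'mitigations=' switch with an "off" query wrapper, assuming only what the changelog above mentions. The helper name cmdline_mitigations_off() is hypothetical and this is not the kernel's implementation; it only scans a command-line string for the token "mitigations=off", letting the last occurrence win (an assumption of this sketch).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical user-space helper, not the kernel's code: returns true if
 * the given command line contains the token "mitigations=off". If the
 * option appears more than once, the last occurrence wins (an assumption
 * made for this sketch).
 */
static bool cmdline_mitigations_off(const char *cmdline)
{
    static const char key[] = "mitigations=";
    const size_t key_len = sizeof(key) - 1;
    const char *p = cmdline;
    bool off = false;

    while (*p) {
        /* Skip whitespace separating the tokens. */
        while (*p == ' ' || *p == '\t' || *p == '\n')
            p++;

        /* Length of the current token (0 at end of string). */
        size_t len = strcspn(p, " \t\n");

        if (len > key_len && strncmp(p, key, key_len) == 0) {
            const char *val = p + key_len;
            size_t val_len = len - key_len;

            off = (val_len == 3 && strncmp(val, "off", 3) == 0);
        }
        p += len;
    }
    return off;
}
```

Passing the contents of /proc/cmdline to this helper would answer the same question from user space; any value other than "off" is simply reported as not off.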

Re: [PATCH V32 01/27] Add the ability to lock down access to the running kernel image

2019-04-16 Thread Andrew Donnellan
On 4/4/19 11:32 am, Matthew Garrett wrote: diff --git a/Documentation/ABI/testing/lockdown b/Documentation/ABI/testing/lockdown new file mode 100644 index ..5bd51e20917a --- /dev/null +++ b/Documentation/ABI/testing/lockdown @@ -0,0 +1,19 @@ +What: security/lockdown +Date:

Re: Linux 5.1-rc5

2019-04-16 Thread Martin Schwidefsky
On Mon, 15 Apr 2019 09:17:10 -0700 Linus Torvalds wrote: > On Sun, Apr 14, 2019 at 10:19 PM Christoph Hellwig wrote: > > > > Can we please have the page refcount overflow fixes out on the list > > for review, even if it is after the fact? > > They were actually on a list for review long befor

Re: [PATCH v4 0/5] powerpc/perf: IMC trace-mode support

2019-04-16 Thread Anju T Sudhakar
Hi, Kindly ignore this series, since patch 5/5 in this series doesn't incorporate the event-format change that I've done in v4 of this series. Apologies for the inconvenience. I will post the updated v5 soon. Thanks, Anju On 4/15/19 3:41 PM, Anju T Sudhakar wrote: IMC (In-Memory collect

[PATCH v4 0/5] powerpc/perf: IMC trace-mode support

2019-04-16 Thread Anju T Sudhakar
IMC (In-Memory Collection counters) is a hardware monitoring facility that collects a large number of hardware performance events. POWER9 supports two modes for IMC: Accumulation mode and Trace mode. In Accumulation mode, event counts are accumulated in syste

[PATCH v4 1/5] powerpc/include: Add data structures and macros for IMC trace mode

2019-04-16 Thread Anju T Sudhakar
Add the macros needed for IMC (In-Memory Collection Counters) trace-mode and data structure to hold the trace-imc record data. Also, add the new type "OPAL_IMC_COUNTERS_TRACE" in 'opal-api.h', since there is a new switch case added in the opal-calls for IMC. Signed-off-by: Anju T Sudhakar Reviewe

[PATCH v4 3/5] powerpc/perf: Add privileged access check for thread_imc

2019-04-16 Thread Anju T Sudhakar
From: Madhavan Srinivasan Add code to restrict user access to the thread_imc PMU, since some events report privilege-level information. Fixes: f74c89bd80fb3 ('powerpc/perf: Add thread IMC PMU support') Signed-off-by: Madhavan Srinivasan Signed-off-by: Anju T Sudhakar --- arch/powerpc/perf/imc-pmu.c

[PATCH v4 2/5] powerpc/perf: Rearrange setting of ldbar for thread-imc

2019-04-16 Thread Anju T Sudhakar
LDBAR holds the memory address allocated for each CPU. For thread-imc, the mode bit (i.e. bit 1) of LDBAR is set to accumulation. Currently, LDBAR is loaded with the per-CPU memory address, with the mode set to accumulation, at boot time. To enable trace-imc, the mode bit of LDBAR should be set to 'trace'. So

[PATCH v4 4/5] powerpc/perf: Trace imc events detection and cpuhotplug

2019-04-16 Thread Anju T Sudhakar
Patch detects trace-imc events, does memory initializations for each online cpu, and registers cpuhotplug call-backs. Signed-off-by: Anju T Sudhakar Reviewed-by: Madhavan Srinivasan --- arch/powerpc/perf/imc-pmu.c | 104 ++ arch/powerpc/platforms/powernv/opal-im

[PATCH v4 5/5] powerpc/perf: Trace imc PMU functions

2019-04-16 Thread Anju T Sudhakar
Add PMU functions to support trace-imc. Signed-off-by: Anju T Sudhakar Reviewed-by: Madhavan Srinivasan --- arch/powerpc/perf/imc-pmu.c | 205 +++- 1 file changed, 204 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-

Re: [PATCH] Linux: Define struct termios2 in under _GNU_SOURCE [BZ #10339]

2019-04-16 Thread Florian Weimer
* hpa: > Using symbol versioning doesn't really help much since the real > problem is that struct termios can be passed around in userspace, and > the interfaces between user space libraries don't have any > versioning. However, my POC code deals with that too by only seeing > BOTHER when necessar

[PATCH v3 0/8] Update hash MMU kernel mapping to be in sync with radix

2019-04-16 Thread Aneesh Kumar K.V
This patch series maps all the kernel regions (vmalloc, IO and vmemmap) using addresses with a 0xc top nibble. This brings the hash translation kernel mapping in sync with radix. Each of these regions can now map 512TB. We use one context to map these regions and hence the 512TB limit. We also update radix to u

[PATCH v3 1/8] powerpc/mm/hash64: Add a variable to track the end of IO mapping

2019-04-16 Thread Aneesh Kumar K.V
This makes it easy to update the region mapping in the later patch Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/book3s/64/hash.h| 3 ++- arch/powerpc/include/asm/book3s/64/pgtable.h | 8 +--- arch/powerpc/include/asm/book3s/64/radix.h | 1 + arch/powerpc/mm/hash_utils_6

[PATCH v3 2/8] powerpc/mm/hash64: Map all the kernel regions in the same 0xc range

2019-04-16 Thread Aneesh Kumar K.V
This patch maps vmap, IO and vmemmap regions in the 0xc address range instead of the current 0xd and 0xf range. This brings the mapping closer to radix translation mode. With hash 64K page size each of these regions is 512TB whereas with 4K config we are limited by the max page table range of 64TB an

[PATCH v3 3/8] powerpc/mm: Validate address values against different region limits

2019-04-16 Thread Aneesh Kumar K.V
This adds an explicit check in various functions. Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/mm/hash_utils_64.c | 18 +++--- arch/powerpc/mm/pgtable-hash64.c | 13 ++--- arch/powerpc/mm/pgtable-radix.c | 16 arch/powerpc/mm/pgtable_64.c | 5 +

[PATCH v3 4/8] powerpc/mm: Drop the unnecessary region check

2019-04-16 Thread Aneesh Kumar K.V
All the regions are now mapped with top nibble 0xc. Hence the region id check is not needed for virt_addr_valid() Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/page.h | 12 1 file changed, 12 deletions(-) diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/in

[PATCH v3 5/8] powerpc/mm/hash: Simplify the region id calculation.

2019-04-16 Thread Aneesh Kumar K.V
This reduces multiple comparisons in get_region_id to a bit shift operation. Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/book3s/64/hash-4k.h | 4 ++- arch/powerpc/include/asm/book3s/64/hash-64k.h | 1 + arch/powerpc/include/asm/book3s/64/hash.h | 31 +-- a

[PATCH v3 6/8] powerpc/mm: Print kernel map details to dmesg

2019-04-16 Thread Aneesh Kumar K.V
This helps in debugging. We can look at the dmesg to find out different kernel mapping details. On 4K config this shows kernel vmalloc start = 0xc0001000 kernel IO start= 0xc0002000 kernel vmemmap start = 0xc0003000 On 64K config: kernel vmalloc start =

[PATCH v3 7/8] powerpc/mm: Consolidate radix and hash address map details

2019-04-16 Thread Aneesh Kumar K.V
We now have 4K page size config kernel_region_map_size = 16TB kernel vmalloc start = 0xc0001000 kernel IO start= 0xc0002000 kernel vmemmap start = 0xc0003000 with 64K page size config: kernel_region_map_size = 512TB kernel vmalloc start = 0xc00800

Re: [PATCH v4 0/5] powerpc/perf: IMC trace-mode support

2019-04-16 Thread Anju T Sudhakar
On 4/16/19 3:14 PM, Anju T Sudhakar wrote: Hi, Kindly ignore this series, since patch 5/5 in this series doesn't incorporate the event-format change that I've done in v4 of this series. Apologies for the inconvenience. I will post the updated v5 soon. s/v5/v4 Thanks, Anju On 4/15/

[PATCH v3 8/8] powerpc/mm/hash: Rename KERNEL_REGION_ID to LINEAR_MAP_REGION_ID

2019-04-16 Thread Aneesh Kumar K.V
The region actually points to the linear map. Rename the #define to clarify that. Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/book3s/64/hash.h | 4 ++-- arch/powerpc/include/asm/book3s/64/mmu-hash.h | 2 +- arch/powerpc/mm/copro_fault.c | 4 ++-- arch/powerpc/mm/

[PATCH v2 00/16] Add FADump support on PowerNV platform

2019-04-16 Thread Hari Bathini
Firmware-Assisted Dump (FADump) is currently supported only on pseries platform. This patch series adds support for powernv platform too. The first and third patches refactor the FADump code to make use of common code across multiple platforms. The fifth patch adds basic FADump support for powernv

[PATCH v2 01/16] powerpc/fadump: move internal fadump code to a new file

2019-04-16 Thread Hari Bathini
Refactoring fadump code means internal fadump code is referenced from different places. For ease, move internal code to a new file. Signed-off-by: Hari Bathini --- Changes in v2: * Using fadump-common.* instead of fadump_internal.* arch/powerpc/include/asm/fadump.h | 112 --

[PATCH v2 02/16] powerpc/fadump: Improve fadump documentation

2019-04-16 Thread Hari Bathini
The figures depicting FADump's (Firmware-Assisted Dump) memory layout are missing some finer details like different memory regions and what they represent. Improve the documentation by updating those details. Signed-off-by: Hari Bathini --- Documentation/powerpc/firmware-assisted-dump.txt | 65

[PATCH v2 03/16] pseries/fadump: move out platform specific support from generic code

2019-04-16 Thread Hari Bathini
Introduce callbacks for platform specific operations like register, unregister, invalidate & such, and move pseries specific code into platform code. Signed-off-by: Hari Bathini --- Changes in v2: * pSeries specific fadump code files are named rtas-fadump.* instead of pseries_fadump.* arch/

[PATCH v2 04/16] powerpc/fadump: use FADump instead of fadump for how it is pronounced

2019-04-16 Thread Hari Bathini
Signed-off-by: Hari Bathini --- Documentation/powerpc/firmware-assisted-dump.txt | 56 +++--- 1 file changed, 28 insertions(+), 28 deletions(-) diff --git a/Documentation/powerpc/firmware-assisted-dump.txt b/Documentation/powerpc/firmware-assisted-dump.txt index 059993b..62e75

[PATCH v2 05/16] powerpc/fadump: enable fadump support on OPAL based POWER platform

2019-04-16 Thread Hari Bathini
From: Hari Bathini Firmware-assisted dump support is enabled for OPAL based POWER platforms in P9 firmware. Make the corresponding updates in kernel to enable fadump support for such platforms. Signed-off-by: Hari Bathini --- Changes in v2: * Updated API number for FADump according to recent O

[PATCH v2 06/16] powerpc/fadump: Update documentation about OPAL platform support

2019-04-16 Thread Hari Bathini
With FADump support now available on both pseries and OPAL platforms, update FADump documentation with these details. Signed-off-by: Hari Bathini --- Documentation/powerpc/firmware-assisted-dump.txt | 90 -- 1 file changed, 51 insertions(+), 39 deletions(-) diff --git a/Do

[PATCH v2 07/16] powerpc/fadump: consider reserved ranges while reserving memory

2019-04-16 Thread Hari Bathini
Commit 0962e8004e97 ("powerpc/prom: Scan reserved-ranges node for memory reservations") enabled support to parse reserved-ranges DT node and reserve kernel memory falling in these ranges for F/W purposes. Ensure memory in these ranges is not overlapped with memory reserved for FADump. Also, use a

[PATCH v2 08/16] powerpc/fadump: consider reserved ranges while releasing memory

2019-04-16 Thread Hari Bathini
Commit 0962e8004e97 ("powerpc/prom: Scan reserved-ranges node for memory reservations") enabled support to parse 'reserved-ranges' DT node to reserve kernel memory falling in these ranges for firmware purposes. Along with the preserved area memory, also ensure memory in reserved ranges is not overl

[PATCH v2 09/16] powernv/fadump: process architected register state data provided by firmware

2019-04-16 Thread Hari Bathini
From: Hari Bathini Firmware provides architected register state data at the time of crash. Process this data and build CPU notes to append to ELF core. Signed-off-by: Hari Bathini Signed-off-by: Vasant Hegde --- Changes in v2: * Updated reg type values according to recent OPAL changes arch

[PATCH v2 10/16] powernv/fadump: add support to preserve crash data on FADUMP disabled kernel

2019-04-16 Thread Hari Bathini
Add a new kernel config option, CONFIG_PRESERVE_FA_DUMP that ensures that crash data, from previously crash'ed kernel, is preserved. This helps in cases where FADump is not enabled but the subsequent memory preserving kernel boot is likely to process this crash data. One typical usecase for this co

[PATCH v2 11/16] powerpc/fadump: update documentation about CONFIG_PRESERVE_FA_DUMP

2019-04-16 Thread Hari Bathini
Kernel config option CONFIG_PRESERVE_FA_DUMP is introduced to ensure crash data, from a previously crashed kernel, is preserved. Update documentation with these details. Signed-off-by: Hari Bathini --- Documentation/powerpc/firmware-assisted-dump.txt |9 + 1 file changed, 9 insertions(

[PATCH v2 12/16] powerpc/powernv: export /proc/opalcore for analysing opal crashes

2019-04-16 Thread Hari Bathini
From: Hari Bathini Export /proc/opalcore file to analyze opal crashes. Since opalcore can be generated independent of CONFIG_FA_DUMP support in kernel, add this support under a new kernel config option CONFIG_OPAL_CORE. Also, avoid code duplication by moving common code used for processing the re

[PATCH v2 13/16] powernv/fadump: Skip processing /proc/vmcore when only OPAL core exists

2019-04-16 Thread Hari Bathini
If OPAL crashes when the kernel is not registered for FADump, F/W still exports OPAL core through result-table DT node. Make sure '/proc/vmcore' processing is skipped as only data relevant to OPAL core is exported in such scenario. Signed-off-by: Hari Bathini --- arch/powerpc/platforms/powernv/o

[PATCH v2 14/16] powernv/opalcore: provide an option to invalidate /proc/opalcore file

2019-04-16 Thread Hari Bathini
Writing '1' to /sys/kernel/fadump_release_opalcore would release the memory held by kernel in exporting /proc/opalcore file. Signed-off-by: Hari Bathini --- arch/powerpc/platforms/powernv/opal-core.c | 39 1 file changed, 39 insertions(+) diff --git a/arch/powerpc

[PATCH v2 15/16] powernv/fadump: consider f/w load area

2019-04-16 Thread Hari Bathini
OPAL loads kernel & initrd at 512MB offset (256MB size), also exported as ibm,opal/dump/fw-load-area. So, if boot memory size of FADump is less than 768MB, kernel memory to be exported as '/proc/vmcore' would be overwritten by f/w while loading kernel & initrd. To avoid such a scenario, enforce a m
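
The arithmetic behind the 768MB figure above can be sketched as follows. The offset and size come from the patch description; the macro and function names are hypothetical and are not taken from the patch itself.

```c
#include <stdbool.h>
#include <stdint.h>

#define MB (1024ULL * 1024ULL)

/* Values quoted in the patch description above. */
#define FW_LOAD_AREA_OFFSET (512 * MB)
#define FW_LOAD_AREA_SIZE   (256 * MB)

/* End of the firmware load area: 512MB + 256MB = 768MB. */
static uint64_t fw_load_area_end(void)
{
    return FW_LOAD_AREA_OFFSET + FW_LOAD_AREA_SIZE;
}

/*
 * Hypothetical check: per the description, a boot memory size below the
 * end of the load area risks being overwritten when firmware loads the
 * kernel and initrd, so only sizes at or above that end are reported safe.
 */
static bool boot_mem_size_is_safe(uint64_t boot_mem_size)
{
    return boot_mem_size >= fw_load_area_end();
}
```

With these values, a 512MB boot memory size is reported as unsafe and a 768MB one as safe.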

[PATCH v2 16/16] powernv/fadump: update documentation about option to release opalcore

2019-04-16 Thread Hari Bathini
With /proc/opalcore support available on OPAL based machines and an option to release memory used by kernel in exporting /proc/opalcore, update FADump documentation with these details. Signed-off-by: Hari Bathini --- Documentation/powerpc/firmware-assisted-dump.txt | 19 +++ 1

Re: [PATCH] [v2] arch: add pidfd and io_uring syscalls everywhere

2019-04-16 Thread Catalin Marinas
On Mon, Apr 15, 2019 at 04:22:57PM +0200, Arnd Bergmann wrote: > Add the io_uring and pidfd_send_signal system calls to all architectures. > > These system calls are designed to handle both native and compat tasks, > so all entries are the same across architectures, only arm-compat and > the gener

Re: Linux 5.1-rc5

2019-04-16 Thread Martin Schwidefsky
On Tue, 16 Apr 2019 11:09:06 +0200 Martin Schwidefsky wrote: > On Mon, 15 Apr 2019 09:17:10 -0700 > Linus Torvalds wrote: > > > On Sun, Apr 14, 2019 at 10:19 PM Christoph Hellwig > > wrote: > > > > > > Can we please have the page refcount overflow fixes out on the list > > > for review, eve

Re: [PATCH] Linux: Define struct termios2 in under _GNU_SOURCE [BZ #10339]

2019-04-16 Thread Adhemerval Zanella
On 16/04/2019 06:59, Florian Weimer wrote: > * hpa: > >> Using symbol versioning doesn't really help much since the real >> problem is that struct termios can be passed around in userspace, and >> the interfaces between user space libraries don't have any >> versioning. However, my POC code dea

Re: [PATCH 1/5] arm64: Fix vDSO clock_getres()

2019-04-16 Thread Vincenzo Frascino
Hi Catalin, On 15/04/2019 18:35, Catalin Marinas wrote: > On Mon, Apr 01, 2019 at 12:51:48PM +0100, Vincenzo Frascino wrote: >> diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c >> index 2d419006ad43..47ba72345739 100644 >> --- a/arch/arm64/kernel/vdso.c >> +++ b/arch/arm64/kernel/v

Re: [PATCH v3 7/8] powerpc/mm: Consolidate radix and hash address map details

2019-04-16 Thread Nicholas Piggin
Aneesh Kumar K.V's on April 16, 2019 8:07 pm: > We now have > > 4K page size config > > kernel_region_map_size = 16TB > kernel vmalloc start = 0xc0001000 > kernel IO start= 0xc0002000 > kernel vmemmap start = 0xc0003000 > > with 64K page size config: > >

Re: [PATCH v2 1/5] cpu/speculation: Add 'mitigations=' cmdline option

2019-04-16 Thread Borislav Petkov
On Fri, Apr 12, 2019 at 03:39:28PM -0500, Josh Poimboeuf wrote: > diff --git a/kernel/cpu.c b/kernel/cpu.c > index 38890f62f9a8..aed9083f8eac 100644 > --- a/kernel/cpu.c > +++ b/kernel/cpu.c > @@ -2320,3 +2320,18 @@ void __init boot_cpu_hotplug_init(void) > #endif > this_cpu_write(cpuhp_stat

Re: [PATCH 1/5] arm64: Fix vDSO clock_getres()

2019-04-16 Thread Will Deacon
On Tue, Apr 16, 2019 at 01:42:58PM +0100, Vincenzo Frascino wrote: > On 15/04/2019 18:35, Catalin Marinas wrote: > > On Mon, Apr 01, 2019 at 12:51:48PM +0100, Vincenzo Frascino wrote: > >> +1:/* Get hrtimer_res */ > >> + seqcnt_acquire > >> + syscall_check fail=5f > >> + ldr x2, [vds

Re: [PATCH v12 04/31] arm64/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT

2019-04-16 Thread Mark Rutland
On Tue, Apr 16, 2019 at 03:44:55PM +0200, Laurent Dufour wrote: > From: Mahendran Ganesh > > Set ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT for arm64. This > enables Speculative Page Fault handler. > > Signed-off-by: Ganesh Mahendran This is missing your S-o-B. The first patch noted that the ARCH_S

Re: [PATCH v12 04/31] arm64/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT

2019-04-16 Thread Mark Rutland
On Tue, Apr 16, 2019 at 04:31:27PM +0200, Laurent Dufour wrote: > Le 16/04/2019 à 16:27, Mark Rutland a écrit : > > On Tue, Apr 16, 2019 at 03:44:55PM +0200, Laurent Dufour wrote: > > > From: Mahendran Ganesh > > > > > > Set ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT for arm64. This > > > enables Specu

Re: [PATCH v5 1/6] iommu: add generic boot option iommu.dma_mode

2019-04-16 Thread Will Deacon
On Fri, Apr 12, 2019 at 02:11:31PM +0100, Robin Murphy wrote: > On 12/04/2019 11:26, John Garry wrote: > > On 09/04/2019 13:53, Zhen Lei wrote: > > > +static int __init iommu_dma_mode_setup(char *str) > > > +{ > > > +    if (!str) > > > +    goto fail; > > > + > > > +    if (!strncmp(str, "pass

Re: [PATCH v2 1/5] cpu/speculation: Add 'mitigations=' cmdline option

2019-04-16 Thread Josh Poimboeuf
On Tue, Apr 16, 2019 at 04:13:35PM +0200, Borislav Petkov wrote: > On Fri, Apr 12, 2019 at 03:39:28PM -0500, Josh Poimboeuf wrote: > > diff --git a/kernel/cpu.c b/kernel/cpu.c > > index 38890f62f9a8..aed9083f8eac 100644 > > --- a/kernel/cpu.c > > +++ b/kernel/cpu.c > > @@ -2320,3 +2320,18 @@ void _

[PATCH v2 0/5] Fix vDSO clock_getres()

2019-04-16 Thread Vincenzo Frascino
clock_getres in the vDSO library has to preserve the same behaviour of posix_get_hrtimer_res(). In particular, posix_get_hrtimer_res() does: sec = 0; ns = hrtimer_resolution; and hrtimer_resolution depends on the enablement of the high resolution timers that can happen either at compile or

[PATCH v2 1/5] arm64: Fix vDSO clock_getres()

2019-04-16 Thread Vincenzo Frascino
clock_getres in the vDSO library has to preserve the same behaviour of posix_get_hrtimer_res(). In particular, posix_get_hrtimer_res() does: sec = 0; ns = hrtimer_resolution; and hrtimer_resolution depends on the enablement of the high resolution timers that can happen either at compile or

[PATCH v2 2/5] powerpc: Fix vDSO clock_getres()

2019-04-16 Thread Vincenzo Frascino
clock_getres in the vDSO library has to preserve the same behaviour of posix_get_hrtimer_res(). In particular, posix_get_hrtimer_res() does: sec = 0; ns = hrtimer_resolution; and hrtimer_resolution depends on the enablement of the high resolution timers that can happen either at compile or

[PATCH v2 3/5] s390: Fix vDSO clock_getres()

2019-04-16 Thread Vincenzo Frascino
clock_getres in the vDSO library has to preserve the same behaviour of posix_get_hrtimer_res(). In particular, posix_get_hrtimer_res() does: sec = 0; ns = hrtimer_resolution; and hrtimer_resolution depends on the enablement of the high resolution timers that can happen either at compile or

[PATCH v2 4/5] nds32: Fix vDSO clock_getres()

2019-04-16 Thread Vincenzo Frascino
clock_getres in the vDSO library has to preserve the same behaviour of posix_get_hrtimer_res(). In particular, posix_get_hrtimer_res() does: sec = 0; ns = hrtimer_resolution; and hrtimer_resolution depends on the enablement of the high resolution timers that can happen either at compile or

[PATCH v2 5/5] kselftest: Extend vDSO selftest to clock_getres

2019-04-16 Thread Vincenzo Frascino
The current version of the multiarch vDSO selftest verifies only gettimeofday. Extend the vDSO selftest to clock_getres, to verify that the syscall and the vDSO library function return the same information. The extension has been used to verify the hrtimer_resolution fix. Cc: Shuah Khan Signed-o
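
A minimal sketch of the comparison such a test makes, not the selftest from the patch. It assumes a 64-bit Linux target where the C library's struct timespec matches the layout the raw system call expects, and that the C library may route clock_getres() through the vDSO. The function name is hypothetical.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/*
 * Returns 1 if the C library's clock_getres() (which may be served by the
 * vDSO) and the raw system call report the same resolution for clk_id,
 * 0 if they differ, and -1 if either call fails.
 */
static int clock_getres_matches_syscall(clockid_t clk_id)
{
    struct timespec lib_res, sys_res;

    if (clock_getres(clk_id, &lib_res) != 0)
        return -1;
    if (syscall(SYS_clock_getres, clk_id, &sys_res) != 0)
        return -1;

    return lib_res.tv_sec == sys_res.tv_sec &&
           lib_res.tv_nsec == sys_res.tv_nsec;
}
```

Calling it for CLOCK_REALTIME and CLOCK_MONOTONIC and expecting 1 in both cases mirrors the property described above: both paths should report the same resolution.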

Re: Linux 5.1-rc5

2019-04-16 Thread Linus Torvalds
On Tue, Apr 16, 2019 at 5:08 AM Martin Schwidefsky wrote: > > This is not nice, would a patch like the following be acceptable? Umm. We actually already *have* this function. It's called "gup_fast_permitted()" and it's used by x86-64 to verify the proper address range. Exactly like s390 needs..

Re: [PATCH v2 1/5] arm64: Fix vDSO clock_getres()

2019-04-16 Thread Catalin Marinas
On Tue, Apr 16, 2019 at 05:14:30PM +0100, Vincenzo Frascino wrote: > diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c > index 2d419006ad43..5f5759d51c33 100644 > --- a/arch/arm64/kernel/vdso.c > +++ b/arch/arm64/kernel/vdso.c > @@ -245,6 +245,8 @@ void update_vsyscall(struct timekee

Re: Linux 5.1-rc5

2019-04-16 Thread Linus Torvalds
On Tue, Apr 16, 2019 at 9:16 AM Linus Torvalds wrote: > > We actually already *have* this function. > > It's called "gup_fast_permitted()" and it's used by x86-64 to verify > the proper address range. Exactly like s390 needs.. > > Could you please use that instead? IOW, something like the attache

Re: [PATCH v2 5/5] kselftest: Extend vDSO selftest to clock_getres

2019-04-16 Thread Will Deacon
On Tue, Apr 16, 2019 at 05:14:34PM +0100, Vincenzo Frascino wrote: > The current version of the multiarch vDSO selftest verifies only > gettimeofday. > > Extend the vDSO selftest to clock_getres, to verify that the > syscall and the vDSO library function return the same information. > > The exten

Re: [PATCH v2 1/5] arm64: Fix vDSO clock_getres()

2019-04-16 Thread Will Deacon
On Tue, Apr 16, 2019 at 05:24:33PM +0100, Catalin Marinas wrote: > On Tue, Apr 16, 2019 at 05:14:30PM +0100, Vincenzo Frascino wrote: > > diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c > > index 2d419006ad43..5f5759d51c33 100644 > > --- a/arch/arm64/kernel/vdso.c > > +++ b/arch/ar

Re: [PATCH v2 5/5] arm64/speculation: Support 'mitigations=' cmdline option

2019-04-16 Thread Thomas Gleixner
On Fri, 12 Apr 2019, Josh Poimboeuf wrote: > Configure arm64 runtime CPU speculation bug mitigations in accordance > with the 'mitigations=' cmdline option. This affects Meltdown, Spectre > v2, and Speculative Store Bypass. > > The default behavior is unchanged. > > Signed-off-by: Josh Poimboeu

Re: [PATCH v2 00/21] Convert hwmon documentation to ReST

2019-04-16 Thread Jonathan Corbet
On Fri, 12 Apr 2019 20:09:16 -0700 Guenter Roeck wrote: > The big real-world question is: Is the series good enough for you to accept, > or do you expect some level of user/kernel separation ? I guess it can go in; it's forward progress, even if it doesn't make the improvements I would like to s

Re: [PATCH v2 5/5] arm64/speculation: Support 'mitigations=' cmdline option

2019-04-16 Thread Josh Poimboeuf
On Tue, Apr 16, 2019 at 09:26:13PM +0200, Thomas Gleixner wrote: > On Fri, 12 Apr 2019, Josh Poimboeuf wrote: > > > Configure arm64 runtime CPU speculation bug mitigations in accordance > > with the 'mitigations=' cmdline option. This affects Meltdown, Spectre > > v2, and Speculative Store Bypass

[PATCH v3 00/26] compat_ioctl: cleanups

2019-04-16 Thread Arnd Bergmann
Hi Al, It took me way longer than I had hoped to revisit this series, see https://lore.kernel.org/lkml/20180912150142.157913-1-a...@arndb.de/ for the previously posted version. I've come to the point where all conversion handlers and most COMPATIBLE_IOCTL() entries are gone from this file, but fo

[PATCH v3 10/26] compat_ioctl: use correct compat_ptr() translation in drivers

2019-04-16 Thread Arnd Bergmann
A handful of drivers all have a trivial wrapper around their ioctl handler, but don't call the compat_ptr() conversion function at the moment. In practice this does not matter, since none of them are used on the s390 architecture and for all other architectures, compat_ptr() does not do anything, b

Re: [PATCH v2 00/21] Convert hwmon documentation to ReST

2019-04-16 Thread Guenter Roeck
On Tue, Apr 16, 2019 at 02:19:49PM -0600, Jonathan Corbet wrote: > On Fri, 12 Apr 2019 20:09:16 -0700 > Guenter Roeck wrote: > > > The big real-world question is: Is the series good enough for you to accept, > > or do you expect some level of user/kernel separation ? > > I guess it can go in; it

[PATCH v12 01/31] mm: introduce CONFIG_SPECULATIVE_PAGE_FAULT

2019-04-16 Thread Laurent Dufour
This configuration variable will be used to build the code needed to handle speculative page fault. By default it is turned off, and activated depending on architecture support, ARCH_HAS_PTE_SPECIAL, SMP and MMU. The architecture support is needed since the speculative page fault handler is calle

[PATCH v12 21/31] mm: Introduce find_vma_rcu()

2019-04-16 Thread Laurent Dufour
This allows searching for a VMA structure without holding the mmap_sem. The search is repeated while the mm seqlock is changing and until a valid VMA is found. While under RCU protection, a reference is taken on the VMA, so the caller must call put_vma() once it no longer needs the VMA structur

[PATCH v12 13/31] mm: cache some VMA fields in the vm_fault structure

2019-04-16 Thread Laurent Dufour
When handling a speculative page fault, the vma->vm_flags and vma->vm_page_prot fields are read once the page table lock is released. So there is no longer any guarantee that these fields will not change behind our back. They will be saved in the vm_fault structure before the VMA is checked for changes. In t

[PATCH v12 27/31] mm: add speculative page fault vmstats

2019-04-16 Thread Laurent Dufour
Add speculative_pgfault vmstat counter to count successful speculative page fault handling. Also fixing a minor typo in include/linux/vm_event_item.h. Signed-off-by: Laurent Dufour --- include/linux/vm_event_item.h | 3 +++ mm/memory.c | 3 +++ mm/vmstat.c |

[PATCH v12 16/31] mm: introduce __vm_normal_page()

2019-04-16 Thread Laurent Dufour
When dealing with the speculative fault path we should use the VMA's field cached value stored in the vm_fault structure. Currently vm_normal_page() is using the pointer to the VMA to fetch the vm_flags value. This patch provides a new __vm_normal_page() which is receiving the vm_flags flags value

[PATCH v12 03/31] powerpc/mm: set ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT

2019-04-16 Thread Laurent Dufour
Set ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT for BOOK3S_64. This enables the Speculative Page Fault handler. Support is only provided for BOOK3S_64 currently because: - it requires CONFIG_PPC_STD_MMU because of checks done in set_access_flags_filter() - it requires BOOK3S because we can't support for book3e_hug

[PATCH v12 14/31] mm/migrate: Pass vm_fault pointer to migrate_misplaced_page()

2019-04-16 Thread Laurent Dufour
migrate_misplaced_page() is only called during the page fault handling so it's better to pass the pointer to the struct vm_fault instead of the vma. This way during the speculative page fault path the saved vma->vm_flags could be used. Acked-by: David Rientjes Signed-off-by: Laurent Dufour ---

[PATCH v12 12/31] mm: protect SPF handler against anon_vma changes

2019-04-16 Thread Laurent Dufour
The speculative page fault handler must be protected against anon_vma changes. This is because page_add_new_anon_rmap() is called during the speculative path. In addition, don't try a speculative page fault if the VMA doesn't have an anon_vma structure allocated because its allocation should be protec

[PATCH v12 09/31] mm: VMA sequence count

2019-04-16 Thread Laurent Dufour
From: Peter Zijlstra Wrap the VMA modifications (vma_adjust/unmap_page_range) with sequence counts such that we can easily test if a VMA is changed. The calls to vm_write_begin/end() in unmap_page_range() are used to detect when a VMA is being unmapped and thus that new page faults should not be sat
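
The general sequence-count pattern described above can be sketched in user-space C11 as follows. This is a simplified illustration, not the patch's code: struct demo_vma and the demo_* helpers are hypothetical names, a single writer is assumed, and a real concurrent program would also need atomic accesses to the protected fields.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a VMA that carries a sequence count. */
struct demo_vma {
    atomic_uint seq;           /* odd while a writer is modifying */
    unsigned long start, end;  /* fields covered by the sequence count */
};

static void demo_vma_init(struct demo_vma *v, unsigned long start,
                          unsigned long end)
{
    atomic_init(&v->seq, 0);
    v->start = start;
    v->end = end;
}

/* Writer side: the count becomes odd before the fields are touched... */
static void demo_write_begin(struct demo_vma *v)
{
    atomic_fetch_add(&v->seq, 1);
}

/* ...and even again once the modification is complete. */
static void demo_write_end(struct demo_vma *v)
{
    atomic_fetch_add(&v->seq, 1);
}

/* Reader side: take a snapshot of the count before reading the fields. */
static unsigned int demo_read_begin(struct demo_vma *v)
{
    return atomic_load(&v->seq);
}

/*
 * Returns true if the values read since the snapshot cannot be trusted:
 * either a writer was active when the snapshot was taken (odd count) or
 * the count has moved since then.
 */
static bool demo_read_retry(struct demo_vma *v, unsigned int snapshot)
{
    return (snapshot & 1u) || atomic_load(&v->seq) != snapshot;
}
```

A reader takes a snapshot, reads the fields, then calls demo_read_retry(); a true result means the VMA changed in the meantime and the speculative work should be abandoned or retried.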

[PATCH v12 06/31] mm: introduce pte_spinlock for FAULT_FLAG_SPECULATIVE

2019-04-16 Thread Laurent Dufour
When handling a page fault without holding the mmap_sem, the fetch of the pte lock pointer and the locking will have to be done while ensuring that the VMA is not touched behind our back. So move the fetch and locking operations into a dedicated function. Signed-off-by: Laurent Dufour --- mm/memory.c |

[PATCH v12 29/31] powerpc/mm: add speculative page fault

2019-04-16 Thread Laurent Dufour
This patch enables the speculative page fault on the PowerPC architecture. This will try a speculative page fault without holding the mmap_sem; if it returns with VM_FAULT_RETRY, the mmap_sem is acquired and the traditional page fault processing is done. The speculative path is only tried for mult

[PATCH v12 02/31] x86/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT

2019-04-16 Thread Laurent Dufour
Set ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT which turns on the Speculative Page Fault handler when building for 64bit. Cc: Thomas Gleixner Signed-off-by: Laurent Dufour --- arch/x86/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 0f2ab09da060..

[PATCH v12 31/31] mm: Add a speculative page fault switch in sysctl

2019-04-16 Thread Laurent Dufour
This allows turning on/off the use of the speculative page fault handler. By default it's turned on. Signed-off-by: Laurent Dufour --- include/linux/mm.h | 3 +++ kernel/sysctl.c| 9 + mm/memory.c| 3 +++ 3 files changed, 15 insertions(+) diff --git a/include/linux/mm.h b/i

[PATCH v12 07/31] mm: make pte_unmap_same compatible with SPF

2019-04-16 Thread Laurent Dufour
pte_unmap_same() assumes that the page tables are still around because the mmap_sem is held. This is no longer the case when running a speculative page fault, and an additional check must be made to ensure that the final page table is still there. This is now done by calling pte_spinloc

[PATCH v12 30/31] arm64/mm: add speculative page fault

2019-04-16 Thread Laurent Dufour
From: Mahendran Ganesh This patch enables the speculative page fault on the arm64 architecture. I completed spf porting in 4.9. From the test result, we can see app launching time improved by about 10% in average. For the apps which have more than 50 threads, 15% or even more improvement can be

[PATCH v12 05/31] mm: prepare for FAULT_FLAG_SPECULATIVE

2019-04-16 Thread Laurent Dufour
From: Peter Zijlstra When speculating faults (without holding mmap_sem) we need to validate that the vma against which we loaded pages is still valid when we're ready to install the new PTE. Therefore, replace the pte_offset_map_lock() calls that (re)take the PTL with pte_map_lock() which can fa

[PATCH v12 26/31] perf tools: add support for the SPF perf event

2019-04-16 Thread Laurent Dufour
Add support for the new speculative faults event. Acked-by: David Rientjes Signed-off-by: Laurent Dufour --- tools/include/uapi/linux/perf_event.h | 1 + tools/perf/util/evsel.c | 1 + tools/perf/util/parse-events.c| 4 tools/perf/util/parse-events.l| 1 + too

[PATCH v12 19/31] mm: protect the RB tree with a sequence lock

2019-04-16 Thread Laurent Dufour
Introduce a per mm_struct seqlock, the mm_seq field, to protect the changes made in the MM RB tree. This allows walking the RB tree without grabbing the mmap_sem and, once the walk is done, double checking that the sequence counter was stable during the walk. The mm seqlock is held while inserting and remo
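The read-then-recheck pattern described in this excerpt can be sketched in userspace C. This is only an illustration under stated assumptions: `struct seq_sketch` and the `sketch_*` helpers are hypothetical names, not the kernel's seqlock API and not code from the patch, and the sketch assumes writers are already serialized by some other lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a sequence counter.
 * Assumption: only one writer runs at a time (serialized elsewhere). */
struct seq_sketch {
    atomic_uint seq;    /* odd while a writer is modifying the structure */
};

/* Writer side: make the counter odd before changing the protected data. */
static inline void sketch_write_begin(struct seq_sketch *s)
{
    atomic_fetch_add(&s->seq, 1);
}

/* Writer side: make the counter even again once the change is complete. */
static inline void sketch_write_end(struct seq_sketch *s)
{
    atomic_fetch_add(&s->seq, 1);
}

/* Reader side: sample the counter before walking the structure. */
static inline unsigned sketch_read_begin(struct seq_sketch *s)
{
    return atomic_load(&s->seq);
}

/* Reader side: returns true if the walk cannot be trusted, either because
 * a writer was active at the start (odd value) or because the counter
 * moved while the reader was walking. */
static inline bool sketch_read_retry(struct seq_sketch *s, unsigned start)
{
    return (start & 1u) || atomic_load(&s->seq) != start;
}
```

A lockless reader would call `sketch_read_begin()`, walk the tree, then call `sketch_read_retry()` and discard the result (or fall back to a locked walk) when it returns true.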

[PATCH v12 15/31] mm: introduce __lru_cache_add_active_or_unevictable

2019-04-16 Thread Laurent Dufour
The speculative page fault handler which is run without holding the mmap_sem is calling lru_cache_add_active_or_unevictable() but the vm_flags is not guaranteed to remain constant. Introducing __lru_cache_add_active_or_unevictable() which has the vma flags value parameter instead of the vma pointer

[PATCH v12 11/31] mm: protect mremap() against SPF hanlder

2019-04-16 Thread Laurent Dufour
If a thread is remapping an area while another one is faulting on the destination area, the SPF handler may fetch the vma from the RB tree before the pte has been moved by the other thread. This means that the moved ptes will overwrite those created by the page fault handler, leading to leaked pages.

[PATCH v12 28/31] x86/mm: add speculative pagefault handling

2019-04-16 Thread Laurent Dufour
From: Peter Zijlstra Try a speculative fault before acquiring mmap_sem, if it returns with VM_FAULT_RETRY continue with the mmap_sem acquisition and do the traditional fault. Signed-off-by: Peter Zijlstra (Intel) [Clearing of FAULT_FLAG_ALLOW_RETRY is now done in handle_speculative_fault()] [
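The control flow described in this excerpt (attempt the fault speculatively, and fall back to the locked path when a retry is signalled) can be sketched as follows. All names here are hypothetical stand-ins chosen for illustration; they are not the functions or return codes used by the patch, and the "lock" is only indicated by a comment.

```c
#include <stdbool.h>

/* Hypothetical result codes; the real code uses VM_FAULT_* values. */
enum fault_result { FAULT_DONE, FAULT_RETRY };

/* Hypothetical counters so the chosen path can be observed. */
struct fault_stats {
    unsigned speculative_ok;    /* faults completed without the lock */
    unsigned locked_path;       /* faults that fell back to the locked path */
};

/* Stand-in for the speculative attempt: it gives up with FAULT_RETRY
 * whenever the mapping changed underneath it. */
static enum fault_result try_speculative_sketch(bool vma_changed)
{
    return vma_changed ? FAULT_RETRY : FAULT_DONE;
}

/* Stand-in for the traditional fault path, run with the lock held. */
static enum fault_result locked_fault_sketch(void)
{
    return FAULT_DONE;
}

static enum fault_result handle_fault_sketch(bool vma_changed,
                                             struct fault_stats *st)
{
    /* First try without taking the lock. */
    if (try_speculative_sketch(vma_changed) == FAULT_DONE) {
        st->speculative_ok++;
        return FAULT_DONE;
    }

    /* Retry requested: the real handler would acquire mmap_sem here
     * and perform the traditional fault. */
    st->locked_path++;
    return locked_fault_sketch();
}
```

The point of the structure is that the slow, serialized path is only paid for when the speculative attempt detects a concurrent change.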

[PATCH v12 04/31] arm64/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT

2019-04-16 Thread Laurent Dufour
From: Mahendran Ganesh Set ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT for arm64. This enables Speculative Page Fault handler. Signed-off-by: Ganesh Mahendran --- arch/arm64/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 870ef86a64ed..8e86934

[PATCH v12 10/31] mm: protect VMA modifications using VMA sequence count

2019-04-16 Thread Laurent Dufour
The VMA sequence count has been introduced to allow fast detection of VMA modification when running a page fault handler without holding the mmap_sem. This patch provides protection against the VMA modification done in : - madvise() - mpol_rebind_policy() - vma_replace_poli

[PATCH v12 20/31] mm: introduce vma reference counter

2019-04-16 Thread Laurent Dufour
The final goal is to be able to use a VMA structure without holding the mmap_sem and to be sure that the structure will not be freed behind our back. The lockless use of the VMA will be done through RCU protection and thus a dedicated freeing service is required to manage it asynchronously. As report

[PATCH v12 25/31] perf: add a speculative page fault sw event

2019-04-16 Thread Laurent Dufour
Add a new software event to count successful speculative page faults. Acked-by: David Rientjes Signed-off-by: Laurent Dufour --- include/uapi/linux/perf_event.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index 7198ddd0c6b

[PATCH v12 23/31] mm: don't do swap readahead during speculative page fault

2019-04-16 Thread Laurent Dufour
Vinayak Menon faced a panic because one thread was page faulting a page in swap, while another one was mprotecting a part of the VMA leading to a VMA split. This raised a panic in swap_vma_readahead() because the VMA's boundaries no longer matched the faulting address. To avoid this, if the pa

[PATCH v12 18/31] mm: protect against PTE changes done by dup_mmap()

2019-04-16 Thread Laurent Dufour
Vinayak Menon and Ganesh Mahendran reported that the following scenario may lead to thread being blocked due to data corruption: CPU 1 CPU 2 CPU 3 Process 1, Process 1, Process 1, Thread A Thread B

[PATCH v12 22/31] mm: provide speculative fault infrastructure

2019-04-16 Thread Laurent Dufour
From: Peter Zijlstra Provide infrastructure to do a speculative fault (not holding mmap_sem). The not holding of mmap_sem means we can race against VMA change/removal and page-table destruction. We use the SRCU VMA freeing to keep the VMA around. We use the VMA seqcount to detect change (includi

Re: [PATCH v12 04/31] arm64/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT

2019-04-16 Thread Laurent Dufour
On 16/04/2019 at 16:27, Mark Rutland wrote: On Tue, Apr 16, 2019 at 03:44:55PM +0200, Laurent Dufour wrote: From: Mahendran Ganesh Set ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT for arm64. This enables Speculative Page Fault handler. Signed-off-by: Ganesh Mahendran This is missing your S-o-B.

[PATCH v12 17/31] mm: introduce __page_add_new_anon_rmap()

2019-04-16 Thread Laurent Dufour
When dealing with the speculative page fault handler, we may race with the VMA being split or merged. In this case the vma->vm_start and vma->vm_end fields may not match the address at which the page fault is occurring. This can only happen when the VMA is split but in that case, the anon_vma pointer of the new VM

[PATCH v12 24/31] mm: adding speculative page fault failure trace events

2019-04-16 Thread Laurent Dufour
This patch adds a set of new trace events to collect the speculative page fault event failures. Signed-off-by: Laurent Dufour --- include/trace/events/pagefault.h | 80 mm/memory.c | 57 ++- 2 files changed, 125 insertions(+),

[PATCH v12 00/31] Speculative page faults

2019-04-16 Thread Laurent Dufour
This is a port to kernel 5.1 of the work done by Peter Zijlstra to handle page faults without holding the mm semaphore [1]. The idea is to try to handle user space page faults without holding the mmap_sem. This should allow better concurrency for massively threaded processes since the page fault hand

[PATCH v12 08/31] mm: introduce INIT_VMA()

2019-04-16 Thread Laurent Dufour
Some VMA struct fields need to be initialized once the VMA structure is allocated. Currently this only concerns the anon_vma_chain field but some others will be added to support the speculative page fault. Instead of spreading the initialization calls all over the code, let's introduce a dedicated inli

[PATCH v3 3/5] powerpc: Use the correct style for SPDX License Identifier

2019-04-16 Thread Nishad Kamdar
This patch corrects the SPDX License Identifier style in the powerpc Hardware Architecture related files. Suggested-by: Joe Perches Signed-off-by: Nishad Kamdar --- arch/powerpc/include/asm/pnv-ocxl.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/p
