[PATCH] irqchip/gicv3: silence noisy DEBUG_PER_CPU_MAPS warning

2016-09-19 Thread James Morse
will return early as start >= nbits), this patch just silences the warning. Signed-off-by: James Morse --- drivers/irqchip/irq-gic-v3.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c index ede5672ab34d..9e37c

Re: [RFC] Arm64 boot fail with numa enable in BIOS

2016-09-19 Thread James Morse
On 19/09/16 15:07, Mark Rutland wrote: > On Mon, Sep 19, 2016 at 09:05:26PM +0800, Yisheng Xie wrote: >> For the crash log, it seems caused by error number of cpumask. >> Any ideas about it? > Much earlier in your log, there was a (non-fatal) warning, as below. Do > you see this without NUMA/SRAT

Re: [PATCH v3 2/3] hwmon: xgene: Add hwmon driver

2016-09-08 Thread James Morse
Hi, On 08/09/16 09:14, Arnd Bergmann wrote: > On Wednesday, September 7, 2016 3:37:05 PM CEST Guenter Roeck wrote: >> On Wed, Sep 07, 2016 at 11:41:44PM +0200, Arnd Bergmann wrote: >>> On Thursday, July 21, 2016 1:55:56 PM CEST Hoan Tran wrote: + ctx->comm_base_addr = cppc_ss->b

Re: [PATCH v3 2/3] hwmon: xgene: Add hwmon driver

2016-09-09 Thread James Morse
Hi, On 09/09/16 04:18, AKASHI Takahiro wrote: > On Thu, Sep 08, 2016 at 11:47:59AM +0100, James Morse wrote: >> On 08/09/16 09:14, Arnd Bergmann wrote: >>> On Wednesday, September 7, 2016 3:37:05 PM CEST Guenter Roeck wrote: >>>> On Wed, Sep 07, 2016 at 11:41:4

Re: [PATCH v5 0/3] arm64: hibernate: Resume when hibernate image created on non-boot CPU

2016-09-14 Thread James Morse
Hi Rafael, On 14/09/16 02:07, Rafael J. Wysocki wrote: > What's the status of this? Will has queued it in his for-next/core branch. Thanks, James

Re: [PATCH v6 5/7] arm64: kvm: route synchronous external abort exceptions to el2

2017-10-16 Thread James Morse
Hi gengdongjiu, On 14/09/17 12:12, gengdongjiu wrote: > On 2017/9/8 0:31, James Morse wrote: >> KVM already handles external aborts from lower exception levels, no more work >> needs doing for TEA. > If it is firmware first solution, that is SCR_EL3.EA=1, all SError interrupt

[RFC/RFT PATCH 0/6] Switch GHES ioremap_page_range() to use fixmap

2017-10-31 Thread James Morse
. RFC as I've only build-tested this on x86. For arm64 I've tested it on a software model. Any more testing would be welcome. These patches are based on rc7. Thanks, James Morse (6): arm64: fixmap: Add GHES fixmap entries x86/mm/fixmap: Add GHES fixmap entries ACPI / APEI

[RFC/RFT PATCH 1/6] arm64: fixmap: Add GHES fixmap entries

2017-10-31 Thread James Morse
GHES is switching to use fixmap for its dynamic mapping of CPER records, to avoid using ioremap_page_range() in IRQ/NMI context. Signed-off-by: James Morse --- arch/arm64/include/asm/fixmap.h | 5 + 1 file changed, 5 insertions(+) diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64

[RFC/RFT PATCH 2/6] x86/mm/fixmap: Add GHES fixmap entries

2017-10-31 Thread James Morse
GHES is switching to use fixmap for its dynamic mapping of CPER records, to avoid using ioremap_page_range() in IRQ/NMI context. Signed-off-by: James Morse --- arch/x86/include/asm/fixmap.h | 4 1 file changed, 4 insertions(+) diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include

[RFC/RFT PATCH 6/6] ACPI / APEI: Remove arch_apei_flush_tlb_one()

2017-10-31 Thread James Morse
Nothing calls arch_apei_flush_tlb_one() anymore, instead relying on __set_pte_vaddr() to do the invalidation when called from clear_fixmap() Remove arch_apei_flush_tlb_one(). Signed-off-by: James Morse --- arch/x86/kernel/acpi/apei.c | 5 - include/acpi/apei.h | 1 - 2 files changed

[RFC/RFT PATCH 5/6] arm64: mm: Remove arch_apei_flush_tlb_one()

2017-10-31 Thread James Morse
Nothing calls arch_apei_flush_tlb_one() anymore, instead relying on __set_fixmap() to do the invalidation. Remove it. Move the IPI-considered-harmful comment to __set_fixmap(). Signed-off-by: James Morse --- arch/arm64/include/asm/acpi.h | 12 arch/arm64/mm/mmu.c | 4

[RFC/RFT PATCH 3/6] ACPI / APEI: Replace ioremap_page_range() with fixmap

2017-10-31 Thread James Morse
arm64 and __set_pte_vaddr() for x86. In each case it's the same as the respective arch_apei_flush_tlb_one(). Reported-by: Fengguang Wu Suggested-by: Linus Torvalds Signed-off-by: James Morse CC: Tyler Baicar CC: Dongjiu Geng CC: Xie XiuQi --- CC'd people I've seen posting CPER log

[RFC/RFT PATCH 4/6] ACPI / APEI: Remove ghes_ioremap_area

2017-10-31 Thread James Morse
Now that nothing is using the ghes_ioremap_area pages, rip them out. Signed-off-by: James Morse --- drivers/acpi/apei/ghes.c | 39 ++- 1 file changed, 2 insertions(+), 37 deletions(-) diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index

Re: [PATCH v1 0/3] manually add Error Synchronization Barrier at exception handler entry and exit

2017-11-01 Thread James Morse
Hi Dongjiu Geng, On 01/11/17 19:14, Dongjiu Geng wrote: > Some hardware platform can support RAS Extension, but not support IESB, > such as Huawei's platform, so software need to insert Synchronization Barrier > operations at exception handler entry. > > This series patches are based on James's

Re: [RFC/RFT PATCH 3/6] ACPI / APEI: Replace ioremap_page_range() with fixmap

2017-11-01 Thread James Morse
Hi gengdonjiu, On 01/11/17 04:13, gengdongjiu wrote: > On 2017/10/31 23:38, James Morse wrote: >> CC'd people I've seen posting CPER log fragments, could you give this a >> test on your platforms? > Thanks for the fixing, not found obviously issue. Can I take that as

Re: [RFC/RFT PATCH 0/6] Switch GHES ioremap_page_range() to use fixmap

2017-11-01 Thread James Morse
Hi guys, (+CC: Chen Gong and Huang Ying from the git log of [0]) On 31/10/17 15:38, James Morse wrote: > RFC as I've only build-tested this on x86. Does anyone have an x86 machine that does firmware-first using NOTIFY_NMI? > Any more testing would be welcome. ('ls /sys/firm

Re: [RFC/RFT PATCH 0/6] Switch GHES ioremap_page_range() to use fixmap

2017-11-01 Thread James Morse
Hi Linus, On 31/10/17 15:52, Linus Torvalds wrote: > On Tue, Oct 31, 2017 at 8:38 AM, James Morse wrote: >> 7 files changed, 30 insertions(+), 85 deletions(-) > > Lovely. > > I obviously can't test it, but it looks fine. I *would* suggest just > making the "

Re: [PATCH v8 0/7] Support RAS virtualization in KVM

2017-11-14 Thread James Morse
Hi Dongjiu Geng, On 10/11/17 19:54, Dongjiu Geng wrote: > This series patches mainly do below things: > > 1. Trap RAS ERR* registers Accesses to EL2 from Non-secure EL1, >KVM will do a minimum simulation, there registers are simulated >to RAZ/WI in KVM. > 2. Route synchronous Externa

Re: [PATCH v8 7/7] arm64: kvm: handle SError Interrupt by categorization

2017-11-14 Thread James Morse
Hi Dongjiu Geng, On 10/11/17 19:54, Dongjiu Geng wrote: > If it is not RAS SError, directly inject virtual SError, > which will keep the old way. If it is RAS SError, firstly > let host ACPI module to handle it. > For the ACPI handling, > if the error address is invalid, APEI driver will not > id

get_online_cpus() from a preemptible() context (bug?)

2017-11-03 Thread James Morse
Hi Thomas, Peter, I'm trying to work out what stops a thread being pre-empted and migrated between calling get_online_cpus() and put_online_cpus(). According to __percpu_down_read(), it's the pre-empt count: > * Due to having preemption disabled the decrement happens on > * the same CPU as the i

Re: [RFC/RFT PATCH 3/6] ACPI / APEI: Replace ioremap_page_range() with fixmap

2017-11-06 Thread James Morse
Hi gengdongjiu On 02/11/17 12:01, gengdongjiu wrote: > James Morse wrote: >> Can I take that as a 'Tested-by:'? >> >> These tags also let us record who has a system that can test changes to this >> driver. > > sure. > Thanks for the fixing. >

Re: [RFC/RFT PATCH 0/6] Switch GHES ioremap_page_range() to use fixmap

2017-11-06 Thread James Morse
On 01/11/17 18:20, Kani, Toshimitsu wrote: > On Wed, 2017-11-01 at 16:30 +0100, Borislav Petkov wrote: >> On Wed, Nov 01, 2017 at 02:58:33PM +0000, James Morse wrote: >>> Does anyone have an x86 machine that does firmware-first using NOTIFY_NMI? >> AFAIK, the only

[PATCH 0/4] Switch GHES ioremap_page_range() to use fixmap

2017-11-06 Thread James Morse
irst three patches for improved history I've tried to be clear with who-acked-what when merging the patches. For reference, the arch-acks are here: https://lkml.org/lkml/2017/11/2/254 https://lkml.org/lkml/2017/10/31/780 Thanks, James Morse (4): ACPI / APEI: Replace ioremap_page_rang

[PATCH 2/4] ACPI / APEI: Remove ghes_ioremap_area

2017-11-06 Thread James Morse
Now that nothing is using the ghes_ioremap_area pages, rip them out. Signed-off-by: James Morse Reviewed-by: Borislav Petkov Tested-by: Tyler Baicar --- drivers/acpi/apei/ghes.c | 39 ++- 1 file changed, 2 insertions(+), 37 deletions(-) diff --git a

[PATCH 4/4] ACPI / APEI: Remove arch_apei_flush_tlb_one()

2017-11-06 Thread James Morse
Nothing calls arch_apei_flush_tlb_one() anymore, instead relying on __set_pte_vaddr() to do the invalidation when called from clear_fixmap() Remove arch_apei_flush_tlb_one(). Signed-off-by: James Morse Reviewed-by: Borislav Petkov --- arch/x86/kernel/acpi/apei.c | 5 - include/acpi/apei.h

[PATCH 1/4] ACPI / APEI: Replace ioremap_page_range() with fixmap

2017-11-06 Thread James Morse
Torvalds Signed-off-by: James Morse Reviewed-by: Borislav Petkov Tested-by: Tyler Baicar Tested-by: Toshi Kani [ For the arm64 bits: ] Acked-by: Will Deacon [ For the x86 bits: ] Acked-by: Ingo Molnar --- Changes since RFC: * Added #ifdefs around the entries in fixmap.h * Added a paragraph

[PATCH 3/4] arm64: mm: Remove arch_apei_flush_tlb_one()

2017-11-06 Thread James Morse
Nothing calls arch_apei_flush_tlb_one() anymore, instead relying on __set_fixmap() to do the invalidation. Remove it. Move the IPI-considered-harmful comment to __set_fixmap(). Signed-off-by: James Morse Acked-by: Will Deacon Tested-by: Tyler Baicar --- arch/arm64/include/asm/acpi.h | 12

Re: get_online_cpus() from a preemptible() context (bug?)

2017-11-06 Thread James Morse
Hi Peter, (combining your replies) On 06/11/17 10:32, Peter Zijlstra wrote: > On Fri, Nov 03, 2017 at 02:45:45PM +0000, James Morse wrote: >> I'm trying to work out what stops a thread being pre-empted and migrated >> between >> calling get_online_cpus() and put_onli

Re: [PATCH v5 2/2] acpi: apei: Add SEI notification type support for ARMv8

2017-10-18 Thread James Morse
Hi Borislav! On 18/10/17 10:25, Borislav Petkov wrote: > On Wed, Oct 18, 2017 at 05:17:27PM +0800, gengdongjiu wrote: >> Thanks Borislav, can I write it as asynchronous exception or >> asynchronous abort? > > WTF?! Yup. > The thing is abbreviated as "SEI" and apparently means "System Error > I

Re: [PATCH v5 2/2] acpi: apei: Add SEI notification type support for ARMv8

2017-10-18 Thread James Morse
to a > invalid value. This paragraph keeps cropping up. Who expects an address with an SError? We don't get one for IRQs, but that never needs stating. > Cc: Borislav Petkov > Cc: James Morse > Signed-off-by: Dongjiu Geng > Tested-by: Tyler Baicar > Tested-by: Dong

Re: [PATCH v7 1/4] arm64: kvm: route synchronous external abort exceptions to EL2

2017-10-18 Thread James Morse
Hi Dongjiu Geng, On 17/10/17 15:14, Dongjiu Geng wrote: > ARMv8.2 adds a new bit HCR_EL2.TEA which controls to > route synchronous external aborts to EL2, and adds a > trap control bit HCR_EL2.TERR which controls to > trap all Non-secure EL1&0 error record accesses to EL2. The bulk of this patch

Re: [PATCH v3] arm64: Introduce IRQ stack

2015-10-05 Thread James Morse
On 05/10/15 07:37, AKASHI Takahiro wrote: > On 10/04/2015 11:32 PM, Jungseok Lee wrote: >> On Oct 3, 2015, at 1:23 AM, James Morse wrote: >>> One observed change in behaviour: >>> Any stack-unwinding now stops at el1_irq(), which is the bottom of the irq >>> s

Re: [PATCH v5 1/2] arm64: kvm: allows kvm cpu hotplug

2015-10-12 Thread James Morse
On 29/05/15 06:38, AKASHI Takahiro wrote: > The current kvm implementation on arm64 does cpu-specific initialization > at system boot, and has no way to gracefully shutdown a core in terms of > kvm. This prevents, especially, kexec from rebooting the system on a boot > core in EL2. > > This patch

Re: [PATCH v4 2/2] arm64: Expand the stack trace feature to support IRQ stack

2015-10-12 Thread James Morse
Hi Jungseok, On 12/10/15 15:53, Jungseok Lee wrote: > On Oct 9, 2015, at 11:24 PM, James Morse wrote: >> I think unwind_frame() needs to walk the irq stack too. [2] is an example >> of perf tracing back to userspace, (and there are patches on the list to >> do/fix this), so

Re: [PATCH v5 1/2] arm64: kvm: allows kvm cpu hotplug

2015-10-13 Thread James Morse
Hi, On 13/10/15 06:38, AKASHI Takahiro wrote: > On 10/12/2015 10:28 PM, James Morse wrote: >> On 29/05/15 06:38, AKASHI Takahiro wrote: >>> The current kvm implementation on arm64 does cpu-specific initialization >>> at system boot, and has no way to gracefully shutdown

Re: [PATCH v4 2/2] arm64: Expand the stack trace feature to support IRQ stack

2015-10-13 Thread James Morse
Hi Jungseok, On 12/10/15 23:13, Jungseok Lee wrote: > On Oct 13, 2015, at 1:34 AM, James Morse wrote: >> Having two kmem_caches for 16K stacks on a 64K page system may be wasteful >> (especially for systems with few cpus)… > > This would be a single concern. To address t

Re: [RFC PATCH 0/3] Implement IRQ stack on ARM64

2015-09-07 Thread James Morse
a copy of the stack-pointer to find struct thread_info, whereas I was copying it between stacks (ends up as 2x ldp/stps), which keeps the change restricted to irq_stack setup code. We should get some feedback as to which approach is preferred. Thanks, James Morse

[PATCH] arm64: kernel: Use a separate stack for irq interrupts.

2015-09-07 Thread James Morse
stack usage (running ltp and generating usb+ethernet interrupts) was 7256 bytes. With this patch, the same workload gives a maximum stack usage of 5816 bytes. Signed-off-by: James Morse --- arch/arm64/include/asm/irq.h | 12 + arch/arm64/include/asm/thread_info.h | 8 -- arch

Re: [RFC PATCH 2/3] arm64: Introduce IRQ stack

2015-09-07 Thread James Morse
On 04/09/15 15:23, Jungseok Lee wrote: > Currently, kernel context and interrupts are handled using a single > kernel stack navigated by sp_el1. This forces many systems to use > 16KB stack, not 8KB one. Low memory platforms naturally suffer from > both memory pressure and performance degradation s

Re: [RFC PATCH 1/3] arm64: entry: Remove unnecessary calculation for S_SP in EL1h

2015-09-07 Thread James Morse
On 04/09/15 15:23, Jungseok Lee wrote: > Under EL1h, S_SP data is not seen in kernel_exit. Thus, x21 calculation > is not needed in kernel_entry. Currently, S_SP information is valid only > when sp_el0 is used. > > Signed-off-by: Jungseok Lee > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/

Re: [PATCH] arm64: kernel: Use a separate stack for irq interrupts.

2015-09-07 Thread James Morse
On 07/09/15 16:48, Jungseok Lee wrote: > On Sep 7, 2015, at 11:36 PM, James Morse wrote: > > Hi James, > >> Having to handle interrupts on top of an existing kernel stack means the >> kernel stack must be large enough to accommodate both the maximum kernel >> usag

Re: [PATCH] arm64: kernel: Use a separate stack for irq interrupts.

2015-09-08 Thread James Morse
On 08/09/15 15:54, Jungseok Lee wrote: > On Sep 7, 2015, at 11:36 PM, James Morse wrote: > > Hi James, > >> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S >> index e16351819fed..d42371f3f5a1 100644 >> --- a/arch/arm64/kernel/entry.S >

Re: [PATCH] arm64: kernel: Use a separate stack for irq interrupts.

2015-09-09 Thread James Morse
On 09/09/15 14:22, Jungseok Lee wrote: > On Sep 9, 2015, at 1:47 AM, James Morse wrote: >> On 08/09/15 15:54, Jungseok Lee wrote: >>> On Sep 7, 2015, at 11:36 PM, James Morse wrote: >>>> diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c >>>>

Re: [PATCH v4 2/2] arm64: Expand the stack trace feature to support IRQ stack

2015-10-15 Thread James Morse
On 14/10/15 13:12, Jungseok Lee wrote: > On Oct 14, 2015, at 12:00 AM, Jungseok Lee wrote: >> On Oct 13, 2015, at 8:00 PM, James Morse wrote: >>> On 12/10/15 23:13, Jungseok Lee wrote: >>>> On Oct 13, 2015, at 1:34 AM, James Morse wrote: >>>>> Having

Re: [PATCH v4 2/2] arm64: Expand the stack trace feature to support IRQ stack

2015-10-15 Thread James Morse
On 15/10/15 15:24, Jungseok Lee wrote: > On Oct 9, 2015, at 11:24 PM, James Morse wrote: >> I think unwind_frame() needs to walk the irq stack too. [2] is an example >> of perf tracing back to userspace, (and there are patches on the list to >> do/fix this), so we need to walk

Re: [PATCH v5] arm64: Introduce IRQ stack

2015-10-20 Thread James Morse
e zero. > --- > I've used Cc', not Tested-by tag, from James, since there is a gap > between v4 and v5. Re-tested, with both 4K and 64K pages. Tested-By: James Morse I also need to test this on top of Akashi Takahiros's series - in isolation this patch only lets perf/dump_s

Re: [PATCH v5] arm64: Introduce IRQ stack

2015-10-20 Thread James Morse
On 20/10/15 16:05, Jungseok Lee wrote: > On Oct 20, 2015, at 7:05 PM, James Morse wrote: >> On 17/10/15 15:27, Jungseok Lee wrote: >>> diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c >>> index 9f17ec0..13fe8f4 100644 >>> --- a/arch/arm64/kern

Re: [PATCH v3] arm64: Introduce IRQ stack

2015-10-02 Thread James Morse
Hi, On 22/09/15 13:11, Jungseok Lee wrote: > Currently, kernel context and interrupts are handled using a single > kernel stack navigated by sp_el1. This forces a system to use 16KB > stack, not 8KB one. This restriction makes low memory platforms suffer > from memory pressure accompanied by perfo

Re: KVM guest sometimes failed to boot because of kernel stack overflow if KPTI is enabled on a hisilicon ARM64 platform.

2018-06-20 Thread James Morse
Hi Wei, On 20/06/18 16:52, Wei Xu wrote: > On 2018/6/20 22:42, Will Deacon wrote: >> Hmm, I wonder if this is at all related to RAS, since we've just enabled >> that and if we take a fault whilst rewriting swapper then we're going to >> get stuck. What happens if you set CONFIG_ARM64_RAS_EXTN=n in

Re: KVM guest sometimes failed to boot because of kernel stack overflow if KPTI is enabled on a hisilicon ARM64 platform.

2018-06-21 Thread James Morse
Hi Will, Wei, On 20/06/18 17:25, Wei Xu wrote: > On 2018/6/20 23:54, James Morse wrote: > I have disabled CONFIG_ARM64_RAS_EXTN and reverted that commit. > But I still got the stack overflow issue sometimes. > Do you have more hint? > The log is as below: >     [    0.00

Re: [PATCH 1/1] arm64/mm: move {idmap_pg_dir,tramp_pg_dir,swapper_pg_dir} to .rodata section

2018-06-21 Thread James Morse
Hi guys, On 21/06/18 07:39, Ard Biesheuvel wrote: > On 21 June 2018 at 04:51, Jun Yao wrote: >> On Wed, Jun 20, 2018 at 12:09:49PM +0200, Ard Biesheuvel wrote: >>> On 20 June 2018 at 10:57, Jun Yao wrote: Move {idmap_pg_dir,tramp_pg_dir,swapper_pg_dir} to .rodata section. And update th

Re: [PATCH 1/1] arm64/mm: move {idmap_pg_dir,tramp_pg_dir,swapper_pg_dir} to .rodata section

2018-06-21 Thread James Morse
Hi Ard, On 21/06/18 10:29, Ard Biesheuvel wrote: > On 21 June 2018 at 10:59, James Morse wrote: >> On 21/06/18 07:39, Ard Biesheuvel wrote: >>> On 21 June 2018 at 04:51, Jun Yao wrote: >>>> On Wed, Jun 20, 2018 at 12:09:49PM +0200, Ard Biesheuvel wrote: >>&

Re: [RFC PATCH] EDAC, ghes: Enable per-layer error reporting for ARM

2018-08-28 Thread James Morse
Hi Boris, On 24/08/18 13:01, Borislav Petkov wrote: > On Fri, Aug 24, 2018 at 10:48:24AM +0100, James Morse wrote: >> so edac_raw_mc_handle_error() has no clue where the error happened. (I >> haven't >> read what it does with this information yet). > > See edac_

Re: [RFC PATCH] EDAC, ghes: Enable per-layer error reporting for ARM

2018-08-28 Thread James Morse
Hi Fan, On 24/08/18 15:30, wufan wrote: >> Why get avoid the layer stuff? Isn't counting DIMM/memory-devices what >> EDAC_MC_LAYER_SLOT is for? > > Borislav has explained it in his response. Here let me elaborate a little > more. > To use the layer information you need an accurate way to pinpoin

Re: [RFC PATCH] EDAC, ghes: Enable per-layer error reporting for ARM

2018-08-28 Thread James Morse
Hi Tyler, On 24/08/18 16:14, Tyler Baicar wrote: > On Fri, Aug 24, 2018 at 5:48 AM, James Morse wrote: >> On 23/08/18 16:46, Tyler Baicar wrote: >>> On Thu, Aug 23, 2018 at 5:29 AM James Morse wrote: >>>> On 19/07/18 19:36, Tyler Baicar wrote: >>>>>

Re: [RFC PATCH] EDAC, ghes: Enable per-layer error reporting for ARM

2018-08-29 Thread James Morse
Hi Boris, On 29/08/18 08:38, Borislav Petkov wrote: > On Tue, Aug 28, 2018 at 06:09:24PM +0100, James Morse wrote: >> Does x86 have another source of memory-topology information it needs to >> correlate smbios with? > > Bah, pinpointing the DIMM on x86 is a mess. There'

Re: [PATCH] EDAC, ghes: use CPER module handles to locate DIMMs

2018-08-30 Thread James Morse
Hi Fan, On 29/08/18 19:33, Fan Wu wrote: > The current ghes_edac driver does not update per-dimm error > counters when reporting memory errors, because there is no > platform-independent way to find DIMMs based on the error > information provided by firmware. I'd argue there is: its in the CPER r

Re: [RFC PATCH] EDAC, ghes: Enable per-layer error reporting for ARM

2018-08-23 Thread James Morse
Hi guys, (CC: +Fan Wu) On 19/07/18 19:36, Tyler Baicar wrote: > On 7/19/2018 10:46 AM, James Morse wrote: >> On 19/07/18 15:01, Borislav Petkov wrote: >>> On Mon, Jul 16, 2018 at 01:26:49PM -0400, Tyler Baicar wrote: >>>> Enable per-layer error reporting for

Re: [RFC PATCH] EDAC, ghes: Enable per-layer error reporting for ARM

2018-08-24 Thread James Morse
Hi Tyler, On 23/08/18 16:46, Tyler Baicar wrote: > On Thu, Aug 23, 2018 at 5:29 AM James Morse wrote: >> On 19/07/18 19:36, Tyler Baicar wrote: >>> On 7/19/2018 10:46 AM, James Morse wrote: >>>> On 19/07/18 15:01, Borislav Petkov wrote: >>>>> On

[RFC PATCH 12/20] x86/intel_rdt: Correct the closid when staging configuration changes

2018-08-24 Thread James Morse
Now that apply_config() and update_domains() know the code/data/both value of what they are writing, and ctrl_val is correctly sized: use the hardware closid slot, based on the configuration type. This means cbm_idx() and its illusionary cache-properties can go. Signed-off-by: James Morse

[RFC PATCH 13/20] x86/intel_rdt: Allow different CODE/DATA configurations to be staged

2018-08-24 Thread James Morse
for each schema. Use the cdp_type enum directly as an index. Signed-off-by: James Morse --- arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 4 ++-- include/linux/resctrl.h | 4 +++- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/arch/x86/kernel/cpu
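The idea of using the cdp_type enum directly as an array index can be illustrated with a minimal user-space sketch. Only the name cdp_type and the CODE/DATA/BOTH distinction come from the series; the enumerator names, struct layouts, and helper functions below are hypothetical and are not taken from the patch.

```c
/* Hypothetical enumerators: the series only names the type and CODE/DATA/BOTH. */
enum cdp_type {
    CDP_BOTH = 0,
    CDP_CODE,
    CDP_DATA,
    CDP_NUM_TYPES, /* number of slots, not a real configuration type */
};

/* Hypothetical staged-configuration slot. */
struct staged_config_sketch {
    unsigned int new_ctrl;  /* value waiting to be applied */
    int have_new_ctrl;      /* non-zero once a value has been staged */
};

/* One slot per configuration type, indexed by the enum itself. */
struct domain_sketch {
    struct staged_config_sketch staged[CDP_NUM_TYPES];
};

/* Stage a value for one configuration type; no lookup table is needed. */
static void stage_config(struct domain_sketch *d, enum cdp_type t,
                         unsigned int ctrl)
{
    d->staged[t].new_ctrl = ctrl;
    d->staged[t].have_new_ctrl = 1;
}

/* Count how many configuration types currently have a staged value. */
static int count_staged(const struct domain_sketch *d)
{
    int n = 0;

    for (int t = 0; t < CDP_NUM_TYPES; t++)
        if (d->staged[t].have_new_ctrl)
            n++;
    return n;
}
```

With this layout, CODE and DATA values for the same domain can be staged side by side without colliding, because each type owns its own slot.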

[RFC PATCH 02/20] x86/intel_rdt: Split struct rdt_domain

2018-08-24 Thread James Morse
touching a 'hw' struct indicates where an abstraction is needed. No change in behaviour, this patch just moves types around. Signed-off-by: James Morse --- arch/x86/kernel/cpu/intel_rdt.c | 87 +++-- arch/x86/kernel/cpu/intel_rdt.h | 30 ---

[RFC PATCH 01/20] x86/intel_rdt: Split struct rdt_resource

2018-08-24 Thread James Morse
n the next patch. No change in behaviour, this patch just moves types around. Signed-off-by: James Morse --- arch/x86/kernel/cpu/intel_rdt.c | 193 +++- arch/x86/kernel/cpu/intel_rdt.h | 112 +++- arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 6 +-

[RFC PATCH 06/20] x86/intel_rdt: Add a helper to read a closid's configuration for show_doms()

2018-08-24 Thread James Morse
rrent configuration. This will allow another architecture to scale the bitmaps if necessary, and possibly use controls that don't take a bitmap at all. Signed-off-by: James Morse --- arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 17 + arch/x86/kernel/cpu/intel_rdt_monitor.c | 2 +

[RFC PATCH 08/20] x86/intel_rdt: Make cdp enable/disable global

2018-08-24 Thread James Morse
ch only has one of the two. Signed-off-by: James Morse --- arch/x86/kernel/cpu/intel_rdt.c | 1 + arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 72 include/linux/resctrl.h | 7 +++ 3 files changed, 57 insertions(+), 23 deletions(-) diff --git

[RFC PATCH 00/20] x86/intel_rdt: Start abstraction for a second arch

2018-08-24 Thread James Morse
ons on what should be arch-specific, and what shouldn't. This series is based on v4.18, and can be retrieved from: git://linux-arm.org/linux-jm.git -b mpam/resctrl_rework/rfc_1 Thanks, James Morse (20): x86/intel_rdt: Split struct rdt_resource x86/intel_rdt: Split struct rdt_domain x86/

[RFC PATCH 05/20] x86/intel_rdt: make update_domains() learn the affected closids

2018-08-24 Thread James Morse
architectures that don't do this don't need to emulate it. Signed-off-by: James Morse --- arch/x86/kernel/cpu/intel_rdt.h | 4 ++-- arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 21 - 2 files changed, 18 insertions(+), 7 deletions(-) diff --git a/arch/x86/

[RFC PATCH 09/20] x86/intel_rdt: Track the actual number of closids separately

2018-08-24 Thread James Morse
is when a resource is reset(), both the code and data illusionary caches reset the full closid range. This disappears in a later patch that merges the caches together. Signed-off-by: James Morse --- arch/x86/kernel/cpu/intel_rdt.c | 19 ++- arch/x86/kernel/cpu/i

[RFC PATCH 10/20] x86/intel_rdt: Let resctrl change the resources's num_closid

2018-08-24 Thread James Morse
esctrl will see the value it changed here. Signed-off-by: James Morse --- arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 22 +- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c ind

[RFC PATCH 17/20] x86/intel_rdt: Stop using Lx CODE/DATA resources

2018-08-24 Thread James Morse
chema generating the names and setting the configuration type. We can now remove the initialisation of the illusionary hw_resources: 'cdp_capable' just requires setting a flag, resctrl knows what to do from there. Signed-off-by: James Morse --- arch/x86/kernel/cpu/intel_rdt.

[RFC PATCH 16/20] x86/intel_rdt: Move the schemata names into struct resctrl_schema

2018-08-24 Thread James Morse
code's max_name_width, this is now resctrl's problem. Signed-off-by: James Morse --- arch/x86/kernel/cpu/intel_rdt.c | 9 ++--- arch/x86/kernel/cpu/intel_rdt.h | 2 +- arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 4 ++-- arch/x86/kernel/cpu/intel_rdt_

[RFC PATCH 15/20] x86/intel_rdt: Walk the resctrl schema list instead of the arch's resource list

2018-08-24 Thread James Morse
are working with a per-resource property. Previously we littered resctrl_to_rdt() wherever we needed to know the cdp_type of a cache. Now that this has a home, fix all those callers to read the value from the relevant schema entry. Signed-off-by: James Morse --- arch/x86/kernel/cpu

[RFC PATCH 14/20] x86/intel_rdt: Add a separate resource list for resctrl

2018-08-24 Thread James Morse
o have the same resource represented twice as code/data, with the appropriate cdp_type for configuration. This will also let us generate the names in resctrl. Signed-off-by: James Morse --- arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 45 +++- include/linux/resctrl.h

[RFC PATCH 04/20] x86/intel_rdt: Add closid to the staged config

2018-08-24 Thread James Morse
the schema is being parsed. In the future this will be the hardware closid, with the CDP correction already applied by resctrl. This allows another architecture to work with resctrl, without having to emulate CDP. Signed-off-by: James Morse --- arch/x86/kernel/cpu/intel_rdt.h | 4

[RFC PATCH 11/20] x86/intel_rdt: Pass in the code/data/both configuration value when parsing

2018-08-24 Thread James Morse
this label up to become a property of the schema. A later patch will correct the closid for CDP when the configuration is staged, which will let us merge the three types of resource. Signed-off-by: James Morse --- arch/x86/kernel/cpu/intel_rdt.c | 6 ++ arch/x86/kernel/cpu

[RFC PATCH 18/20] x86/intel_rdt: Remove the CODE/DATA illusionary caches

2018-08-24 Thread James Morse
Now that nothing uses these caches, remove them. Signed-off-by: James Morse --- arch/x86/kernel/cpu/intel_rdt.c | 69 - arch/x86/kernel/cpu/intel_rdt.h | 4 -- 2 files changed, 73 deletions(-) diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu

[RFC PATCH 07/20] x86/intel_rdt: Expose update_domains() as an arch helper

2018-08-24 Thread James Morse
update_domains() applies the staged configuration to the hw_dom's configuration array and updates the hardware. Make it part of the interface between resctrl and the arch code. Signed-off-by: James Morse --- arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 4 ++-- include/linux/resc

[RFC PATCH 03/20] x86/intel_rdt: Group staged configuration into a separate struct

2018-08-24 Thread James Morse
eventually resctrl will use the array slots for CODE/DATA/BOTH to detect a duplicate schema being written. Signed-off-by: James Morse --- arch/x86/kernel/cpu/intel_rdt_ctrlmondata.c | 44 +++-- include/linux/resctrl.h | 16 ++-- 2 files changed, 43 insertions

[RFC PATCH 20/20] x86/intel_rdt: Merge cdp enable/disable calls

2018-08-24 Thread James Morse
Now that the cdp_enable() and cdp_disable() calls are basically the same, merge them into cdp_set_enabled(true/false). All these functions are behind resctrl_arch_set_cdp_enabled(), so they can take the rdt_hw_resource directly. Signed-off-by: James Morse --- arch/x86/kernel/cpu
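As a rough illustration of that kind of merge, two near-identical enable/disable helpers can collapse into one function taking a bool. This is a user-space sketch under stated assumptions: cdp_set_enabled() is the name from the patch description, but its real signature and body are not shown in this listing, and the struct below is a hypothetical stand-in for rdt_hw_resource.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the hardware resource; not the kernel's struct. */
struct hw_resource_sketch {
    bool cdp_enabled;         /* current CDP state */
    unsigned int hw_writes;   /* counts simulated hardware updates */
};

/*
 * One helper replaces the enable/disable pair: the only difference
 * between the two was the value being written.
 */
static int cdp_set_enabled(struct hw_resource_sketch *r, bool enable)
{
    if (r->cdp_enabled == enable)
        return 0;             /* already in the requested state */

    r->cdp_enabled = enable;
    r->hw_writes++;           /* stands in for the real register update */
    return 0;
}
```

Callers that previously chose between two functions pass true or false instead, which keeps the state check and the update in a single place.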

[RFC PATCH 19/20] x86/intel_rdt: Kill off alloc_enabled

2018-08-24 Thread James Morse
Now that the L2/L2CODE/L2DATA resources are merged together, alloc_enabled doesn't mean anything, it's the same as alloc_capable which indicates CAT is supported by this cache. Take the opportunity to kill off alloc_enabled and its helpers. Signed-off-by: James Morse --- arch/x86/kerne

Re: [RFC PATCH] EDAC, ghes: Enable per-layer error reporting for ARM

2018-07-19 Thread James Morse
Hi guys, On 19/07/18 15:01, Borislav Petkov wrote: > On Mon, Jul 16, 2018 at 01:26:49PM -0400, Tyler Baicar wrote: >> Enable per-layer error reporting for ARM systems so that the error >> counters are incremented per-DIMM. >> >> On ARM systems that use firmware first error handling it is understoo

Re: [PATCH v3 5/5] arm64/mm: Move {idmap_pg_dir, swapper_pg_dir} to .rodata section

2018-07-11 Thread James Morse
Hi Yun, On 02/07/18 12:16, Jun Yao wrote: > Move {idmap_pg_dir, swapper_pg_dir} to .rodata section and > populate swapper_pg_dir by fixmap. (any chance you could split the fixmap bits into a separate patch so that the rodata move comes last? This will make review and bisecting any problems easi

Re: [PATCH v9 5/7] arm64: kvm: Introduce KVM_ARM_SET_SERROR_ESR ioctl

2018-01-23 Thread James Morse
Hi Dongjiu Geng, On 06/01/18 16:02, Dongjiu Geng wrote: > The ARM64 RAS SError Interrupt(SEI) syndrome value is specific to the > guest and user space needs a way to tell KVM this value. So we add a > new ioctl. Before user space specifies the Exception Syndrome Register > ESR(ESR), it firstly che

Re: [PATCH v9 6/7] arm64: kvm: Set Virtual SError Exception Syndrome for guest

2018-01-23 Thread James Morse
xit. A version of this patch has been queued by Catalin. Now that the cpufeature bits are queued, I think this can be split up into two separate series for v4.16-rc1, one to tackle NOTIFY_SEI and the associated plumbing. The second for the KVM 'make SError pending' API. > Signed-off-by:

Re: [PATCH V15 06/11] acpi: apei: handle SEA notification type for ARMv8

2017-05-12 Thread James Morse
Hi Tyler, On 08/05/17 20:59, Baicar, Tyler wrote: > On 5/8/2017 11:28 AM, James Morse wrote: >> I was tidying up the masking/unmasking in entry.S, something I wasn't aware >> of >> that leads to a bug: >> entry.S will unmask interrupts for instruction/data aborts

Re: [PATCH v3 3/3] arm/arm64: signal SIBGUS and inject SEA Error

2017-05-12 Thread James Morse
Hi gengdongjiu, On 05/05/17 13:31, gengdongjiu wrote: > when guest OS happen an SEA, My current solution is shown below: > > (1) host EL3 firmware firstly handle the SEA error and generate the CPER > record. > (2) EL3 firmware separately copy the esr_el3, elr_el3, SPSR_el3, > far_el3 to the esr_

Re: [PATCH v3 3/3] arm/arm64: signal SIBGUS and inject SEA Error

2017-05-12 Thread James Morse
Hi gengdongjiu, On 10/05/17 09:44, gengdongjiu wrote: > On 2017/5/9 1:28, James Morse wrote: >>>> (hwpoison for KVM is a corner case as Qemu's memory effectively has two >>>> users, >>>> Qemu and KVM. This isn't the example of how user-space get

Re: [PATCH v8 7/7] arm64: kvm: handle SError Interrupt by categorization

2018-01-12 Thread James Morse
Hi gengdongjiu, On 15/12/17 03:30, gengdongjiu wrote: > On 2017/12/7 14:37, gengdongjiu wrote: >>> We need to tackle (1) and (3) separately. For (3) we need some API that lets >>> Qemu _trigger_ an SError in the guest, with a specified ESR. But, we don't >>> have >>> a way of migrating pending SE

Re: [PATCH v8 7/7] arm64: kvm: handle SError Interrupt by categorization

2018-01-12 Thread James Morse
Hi gengdongjiu, On 16/12/17 04:47, gengdongjiu wrote: > [...] >> >>> + case ESR_ELx_AET_UER: /* The error has not been propagated */ >>> + /* >>> + * Userspace only handle the guest SError Interrupt(SEI) if >>> the >>> + * error has not been propagated

Re: [PATCH v5 1/3] arm64/ras: support sea error recovery

2018-01-30 Thread James Morse
Hi Xie XiuQi, On 26/01/18 12:31, Xie XiuQi wrote: > With ARM v8.2 RAS Extension, SEA are usually triggered when memory errors > are consumed. According to the existing process, errors occurred in the > kernel, leading to direct panic, if it occurred the user-space, we should > just kill process. >

Re: [PATCH v9 5/7] arm64: kvm: Introduce KVM_ARM_SET_SERROR_ESR ioctl

2018-01-30 Thread James Morse
Hi gengdongjiu, On 24/01/18 20:06, gengdongjiu wrote: >> On 06/01/18 16:02, Dongjiu Geng wrote: >>> The ARM64 RAS SError Interrupt(SEI) syndrome value is specific to the >>> guest and user space needs a way to tell KVM this value. So we add a >>> new ioctl. Before user space specifies the Exceptio

Re: [PATCH v9 3/7] acpi: apei: Add SEI notification type support for ARMv8

2018-01-30 Thread James Morse
Hi gengdongjiu, On 23/01/18 09:23, gengdongjiu wrote: > On 2018/1/23 3:39, James Morse wrote: >> gengdongjiu wrote: >>> This error source parsing and handling method >>> is similar with the SEA. >> >> There are problems with doing this: >> >> Oc

Re: [PATCH v8 7/7] arm64: kvm: handle SError Interrupt by categorization

2018-01-22 Thread James Morse
Hi gengdongjiu, On 21/01/18 02:45, gengdongjiu wrote: > For the ESR_ELx_AET_UER, this exception is precise, closing the VM may > be better[1]. > But if you think panic is better until we support kernel-first, it is > also OK to me. I'm not convinced SError while a guest was running means only gue

Re: [PATCH v8 7/7] arm64: kvm: handle SError Interrupt by categorization

2018-01-22 Thread James Morse
Hi gengdongjiu, On 16/12/17 03:44, gengdongjiu wrote: > On 2017/12/16 2:52, James Morse wrote: >>> signal, it will record the CPER and trigger a IRQ to notify guest, as shown >>> below: >>> >>> SIGBUS_MCEERR_AR trigger Synchronous External Abort

Re: [PATCH v9 3/7] acpi: apei: Add SEI notification type support for ARMv8

2018-01-22 Thread James Morse
ort for NOTIFY_SEI as a GHES notification mechanism... ", its up to the arch code to spot a v8.2 RAS Error based on the cpu caps. > This error source parsing and handling method > is similar with the SEA. There are problems with doing this: Oct. 18, 2017, 10:26 a.m. James Morse wrote: |

Re: [PATCH v2 11/11] arm64: Implement branch predictor hardening for affected Cortex-A CPUs

2018-01-05 Thread James Morse
Hi Marc, Will, (SOB-chain suggests a missing From: tag on this and patch 7) On 05/01/18 13:12, Will Deacon wrote: > Cortex-A57, A72, A73 and A75 are susceptible to branch predictor aliasing > and can theoretically be attacked by malicious code. > > This patch implements a PSCI-based mitigation f

Re: [PATCH v2 07/11] arm64: Add skeleton to harden the branch predictor against aliasing attacks

2018-01-08 Thread James Morse
Hi Will, Marc, On 05/01/18 13:12, Will Deacon wrote: > Aliasing attacks against CPU branch predictors can allow an attacker to > redirect speculative control flow on some CPUs and potentially divulge > information from one context to another. > > This patch adds initial skeleton code behind a new

Re: [PATCH -next] firmware: arm_sdei: Fix return value check in sdei_present_dt()

2018-01-15 Thread James Morse
Hi Wei, On 15/01/18 10:41, Wei Yongjun wrote: > In case of error, the function of_platform_device_create() returns > NULL pointer not ERR_PTR(). The IS_ERR() test in the return value > check should be replaced with NULL test. Bother, so it does! Thanks for catching this. Acked-by: Ja
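The point made in this thread, that an IS_ERR() test never fires on a NULL return, can be sketched in userspace C. This is a minimal model and not the kernel code: is_err() and err_ptr() are assumed to mirror the kernel's error-pointer convention (the top 4095 addresses encode errno values), and create_device_stub(), probe_buggy() and probe_fixed() are hypothetical names invented for the illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: models the kernel convention where the last MAX_ERRNO
 * addresses of the address space encode negative errno values. */
#define MAX_ERRNO 4095

/* Userspace stand-in for IS_ERR(): true only for encoded errno pointers. */
static inline bool is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Userspace stand-in for ERR_PTR(): encode a negative errno as a pointer. */
static inline void *err_ptr(intptr_t error)
{
    return (void *)error;
}

/* Hypothetical stand-in for a function that signals failure with NULL,
 * as the thread says of_platform_device_create() does. */
static inline void *create_device_stub(bool fail)
{
    static int dev;
    return fail ? NULL : (void *)&dev;
}

/* Buggy check: NULL is not an error pointer, so failure looks like success. */
static inline bool probe_buggy(bool fail)
{
    return !is_err(create_device_stub(fail));
}

/* Fixed check: test for NULL, matching the callee's failure convention. */
static inline bool probe_fixed(bool fail)
{
    return create_device_stub(fail) != NULL;
}
```

With these helpers, probe_buggy(true) reports success even though the stub failed, while probe_fixed(true) reports the failure; that difference is the reason for replacing the IS_ERR() test with a NULL test.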

Re: [PATCH 07/11] signal/arm64: Document conflicts with SI_USER and SIGFPE, SIGTRAP, SIGBUS

2018-01-15 Thread James Morse
Hi Dave, Thanks for going through all these, On 15/01/18 16:30, Dave Martin wrote: > On Thu, Jan 11, 2018 at 06:59:36PM -0600, Eric W. Biederman wrote: >> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c >> index 9b7f89df49db..abe200587334 100644 >> --- a/arch/arm64/mm/fault.c >> +++ b/

Re: [PATCH 1/5] arm64: entry: isb in el1_irq

2018-04-06 Thread James Morse
Hi Yury, On 05/04/18 18:17, Yury Norov wrote: > Kernel text patching framework relies on IPI to ensure that other > SMP cores observe the change. Target core calls isb() in IPI handler (Odd, if its just to synchronize the CPU, taking the IPI should be enough). > path, but not at the beginning o
