On Thu, 2017-09-28 at 11:45 +1000, David Gibson wrote:
> On Tue, Sep 26, 2017 at 04:47:04PM +1000, Sam Bobroff wrote:
> > In KVM's XICS-on-XIVE emulation, kvmppc_xive_get_xive() returns the
> > value of state->guest_server as "server". However, this value is not
> > set by its counterpart kvmppc_x
Clearing very big IOMMU tables can trigger soft lockups. This adds
cond_resched() to allow the scheduler to do context switching when
it decides to.
Signed-off-by: Alexey Kardashevskiy
---
The testcase is a POWER9 box with a 264GB guest, 4 VFIO devices from
independent IOMMU groups, 64K IOMMU pages.
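The idea of yielding periodically inside a long clearing loop can be sketched in user space as below. This is only an analogue, not the patch itself: clear_table() and YIELD_EVERY are hypothetical names, and sched_yield() stands in for the kernel's cond_resched().

```c
#include <sched.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical batch size; the real patch may pick its yield points differently. */
#define YIELD_EVERY 4096

/*
 * Clear a large table, giving the scheduler a chance to run another task
 * after every YIELD_EVERY entries. Returns how many times we yielded.
 * In kernel code the yield point would be cond_resched().
 */
size_t clear_table(uint64_t *table, size_t nentries)
{
	size_t yields = 0;

	for (size_t i = 0; i < nentries; i++) {
		table[i] = 0;
		if ((i + 1) % YIELD_EVERY == 0) {
			sched_yield();
			yields++;
		}
	}
	return yields;
}
```

Yielding per batch rather than per entry keeps the overhead of the check small compared with the clearing work.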
From: Simon Guo
> Sent: 27 September 2017 19:34
...
> > On X86 all the AVX registers are caller saved, the system call
> > entry could issue the instruction that invalidates them all.
> > Kernel code running in the context of a user process could then
> > use the registers without saving them.
> >
Hi,
On 26/09/2017 01:34, Andrew Morton wrote:
On Mon, 25 Sep 2017 09:27:43 -0700 Alexei Starovoitov
wrote:
On Mon, Sep 18, 2017 at 12:15 AM, Laurent Dufour
wrote:
Despite the unprovable lockdep warning raised by Sergey, I didn't get any
feedback on this series.
Is there a chance to get it
Hi Andrew,
On 26/09/2017 01:34, Andrew Morton wrote:
On Mon, 25 Sep 2017 09:27:43 -0700 Alexei Starovoitov
wrote:
On Mon, Sep 18, 2017 at 12:15 AM, Laurent Dufour
wrote:
Despite the unprovable lockdep warning raised by Sergey, I didn't get any
feedback on this series.
Is there a chance to
Nick,
I applied your patch to Linux kernel 4.13, rebuilt it, installed it, then ran a test; the OS still cannot boot. The system hung here, the same as before.
Would you please have a look?
The system is going down NOW!
Sent SIGTERM to all processes
Sent SIGKILL to all processes
[ 104.755810] kexec_core
- On Sep 27, 2017, at 9:04 AM, Nicholas Piggin npig...@gmail.com wrote:
> On Tue, 26 Sep 2017 20:43:28 + (UTC)
> Mathieu Desnoyers wrote:
>
>> - On Sep 26, 2017, at 1:51 PM, Mathieu Desnoyers
>> mathieu.desnoy...@efficios.com wrote:
>>
>> > Provide a new command allowing processes t
Several callers to epapr_hypercall() pass an uninitialized stack
allocated array for the input arguments, presumably because they
have no input arguments. However this can produce errors like
this one
arch/powerpc/include/asm/epapr_hcalls.h:470:42: error: 'in' may be used
uninitialized in this f
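One common way to silence this kind of warning is to zero-initialise the argument array before passing it on. The sketch below assumes that approach; fake_hypercall() is a hypothetical stand-in and not the real epapr_hypercall() signature.

```c
#include <stddef.h>

/* Hypothetical stand-in for a call that reads eight input words. */
unsigned long fake_hypercall(const unsigned long in[8])
{
	unsigned long sum = 0;

	for (size_t i = 0; i < 8; i++)
		sum += in[i];
	return sum;
}

/* Caller with no real input arguments: pass a zeroed array, not stack garbage. */
unsigned long call_with_no_args(void)
{
	unsigned long in[8] = {0};	/* every element is initialised to zero */

	return fake_hypercall(in);
}
```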
Currently sprintf is used, and while paths should never exceed
the size of the buffer it is theoretically possible since
dirent.d_name is 256 bytes. As a result this trips
-Wformat-overflow, and since the test is built with -Wall -Werror
this causes the build to fail. Switch to using snprintf and sk
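A minimal sketch of the snprintf pattern, assuming the caller wants to detect paths that do not fit. join_path() is a hypothetical helper, not taken from the test in question.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Build "dir/name" into buf. Returns 0 on success, or -1 if the result
 * was truncated or snprintf reported an error, so the caller can decide
 * what to do with an over-long path.
 */
int join_path(char *buf, size_t size, const char *dir, const char *name)
{
	int n = snprintf(buf, size, "%s/%s", dir, name);

	/* snprintf returns the length it wanted to write, excluding the NUL. */
	if (n < 0 || (size_t)n >= size)
		return -1;
	return 0;
}
```

Unlike sprintf, snprintf never writes past size bytes, so the return value is the only place truncation shows up and it has to be checked.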
On Thu, 28 Sep 2017 13:31:36 + (UTC)
Mathieu Desnoyers wrote:
> - On Sep 27, 2017, at 9:04 AM, Nicholas Piggin npig...@gmail.com wrote:
>
> > On Tue, 26 Sep 2017 20:43:28 + (UTC)
> > Mathieu Desnoyers wrote:
> >
> >> - On Sep 26, 2017, at 1:51 PM, Mathieu Desnoyers
> >> mathi
- On Sep 28, 2017, at 11:01 AM, Nicholas Piggin npig...@gmail.com wrote:
> On Thu, 28 Sep 2017 13:31:36 + (UTC)
> Mathieu Desnoyers wrote:
>
>> - On Sep 27, 2017, at 9:04 AM, Nicholas Piggin npig...@gmail.com wrote:
>>
>> > On Tue, 26 Sep 2017 20:43:28 + (UTC)
>> > Mathieu Desno
On Fri, Sep 29, 2017 at 01:01:12AM +1000, Nicholas Piggin wrote:
> That's fine. If a user is not bound to a subset of CPUs, they could
> also cause disturbances with other syscalls and faults, taking locks,
> causing tlb flushes and IPIs and things.
So on the big SGI class machines we've had troub
On Wed, 2017-09-27 at 15:32 +, York Sun wrote:
> On 09/27/2017 04:03 AM, Joakim Tjernlund wrote:
> > On Mon, 2017-09-25 at 17:26 +, York Sun wrote:
> > > On 09/25/2017 09:55 AM, Joakim Tjernlund wrote:
> > > > We got some "broken" boards (mpx8321) where UART RX is held low (BREAK)
> > > > The
On Thu, 28 Sep 2017 15:29:50 + (UTC)
Mathieu Desnoyers wrote:
> - On Sep 28, 2017, at 11:01 AM, Nicholas Piggin npig...@gmail.com wrote:
>
> > On Thu, 28 Sep 2017 13:31:36 + (UTC)
> > Mathieu Desnoyers wrote:
> >
> >> - On Sep 27, 2017, at 9:04 AM, Nicholas Piggin npig...@gma
On Thu, 28 Sep 2017 17:51:15 +0200
Peter Zijlstra wrote:
> On Fri, Sep 29, 2017 at 01:01:12AM +1000, Nicholas Piggin wrote:
> > That's fine. If a user is not bound to a subset of CPUs, they could
> > also cause disturbances with other syscalls and faults, taking locks,
> > causing tlb flushes and
On Thu, 2017-09-28 at 17:54 +0200, Joakim Tjernlund wrote:
> On Wed, 2017-09-27 at 15:32 +, York Sun wrote:
> > On 09/27/2017 04:03 AM, Joakim Tjernlund wrote:
> > > On Mon, 2017-09-25 at 17:26 +, York Sun wrote:
> > > > On 09/25/2017 09:55 AM, Joakim Tjernlund wrote:
> > > > > We got some
- On Sep 28, 2017, at 12:16 PM, Nicholas Piggin npig...@gmail.com wrote:
> On Thu, 28 Sep 2017 15:29:50 + (UTC)
> Mathieu Desnoyers wrote:
>
>> - On Sep 28, 2017, at 11:01 AM, Nicholas Piggin npig...@gmail.com wrote:
>>
>> > On Thu, 28 Sep 2017 13:31:36 + (UTC)
>> > Mathieu Desn
The mmu context on the 40x, 44x does not define pte_frag
entry. This causes gcc to abort the compilation due to:
setup-common.c: In function ‘setup_arch’:
setup-common.c:908: error: ‘mm_context_t’ has no ‘pte_frag’
This patch fixes the issue by adding additional guard
conditions, that limit the init
Nick has a valid point that the sched_in() hook is a fast-path compared
to switch_mm(). Adding an extra TIF test in a fast-path to save a
barrier in a comparatively slow-path is therefore not such a good idea
overall.
Therefore, move the architecture hook to switch_mm() instead.
[ This patch is a
On Thu, 28 Sep 2017 14:29:02 +0200 Laurent Dufour
wrote:
> > Laurent's [0/n] provides some nice-looking performance benefits for
> > workloads which are chosen to show performance benefits(!) but, alas,
> > no quantitative testing results for workloads which we may suspect will
> > be harmed by
When a vdevice is DLPAR removed from the system the vio subsystem doesn't
bother unmapping the virq from the irq_domain. As a result we have a virq
mapped to a hardware irq that is no longer valid for the irq_domain. A side
effect is that we are left with /proc/irq/ affinity entries, and
attempts t
On Thu, Sep 28, 2017 at 07:16:12PM +1000, Alexey Kardashevskiy wrote:
> Clearing very big IOMMU tables can trigger soft lockups. This adds
> cond_resched() to allow the scheduler to do context switching when
> it decides to.
>
> Signed-off-by: Alexey Kardashevskiy
Reviewed-by: David Gibson
> -
Since last post, the first 4 patches are unchanged.
Split the last patch into 2, tidied up a few things, and removed
the DD1 workarounds because firmware is not going to support DD1.
Then re-tested with upstream firmware which has now merged support
for OPAL_SIGNAL_SYSTEM_RESET on POWER9 DD2.
N
The SMP watchdog will detect locked CPUs and IPI them to print a
backtrace and registers. If panic on hard lockup is enabled, do
not panic from this handler, because that can cause recursion into
the IPI layer during the panic.
The caller already panics in this case.
Signed-off-by: Nicholas Piggi
If sysctl_hardlockup_all_cpu_backtrace is enabled, there is no need to
IPI stuck CPUs for backtrace before trigger_allbutself_cpu_backtrace(),
which does the same thing again.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kernel/watchdog.c | 19 +++
1 file changed, 11 insertion
In xmon, touch_nmi_watchdog() is not expected to be checking that
other CPUs have not touched the watchdog, so the code will just
call touch_nmi_watchdog() once before re-enabling hard interrupts.
Just update our CPU's state, and ignore apparently stuck SMP threads.
Arguably touch_nmi_watchdog sh
The SMP hardlockup watchdog cross-checks other CPUs for lockups,
which causes xmon headaches because it's assuming interrupts
hard disabled means no watchdog troubles. Try to improve that by
calling touch_nmi_watchdog() in obvious places where secondaries
are spinning.
Also annotate these spin loo
It is possible to wake from idle due to a system reset exception, in
which case the CPU takes a system reset interrupt to wake from idle,
with system reset as the wakeup reason.
The regular (not idle wakeup) system reset interrupt handler must be
invoked in this case, otherwise the system reset in
This allows MSR[EE]=0 lockups to be detected on an OPAL (bare metal)
system similarly to the hcall NMI IPI on pseries guests, when the
platform/firmware supports it.
This is an example of CPU10 spinning with interrupts hard disabled:
Watchdog CPU:32 detected Hard LOCKUP other CPUS:10
Watchdog CPU
In the recent commit:
d8bd9f3f09 powerpc: Handle MCE on POWER9 with only DSISR bit 30 set
I screwed up the bit. It should be bit 25 (IBM bit 38).
Signed-off-by: Michael Neuling
---
arch/powerpc/kernel/mce_power.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/pow
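The two bit numbers in the message above refer to the same bit under different conventions. As a sketch, assuming a 64-bit register where IBM numbering counts from the most significant bit (the helper names here are hypothetical):

```c
#include <stdint.h>

/* IBM (MSB-0) bit n of a 64-bit register is LSB-0 bit 63 - n. */
unsigned ibm_to_lsb0(unsigned ibm_bit)
{
	return 63u - ibm_bit;
}

/* Mask with only the given IBM-numbered bit set. */
uint64_t ibm_bit_mask(unsigned ibm_bit)
{
	return (uint64_t)1 << ibm_to_lsb0(ibm_bit);
}
```

With this convention IBM bit 38 maps to bit 25, matching the pairing given in the commit message.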
In opal_event_shutdown() we free all the IRQs hanging off the
opal_event_irqchip. However it's not safe to do so if we're called
from IRQ context, because free_irq() wants to synchronise versus IRQ
context. This can lead to warnings and a stuck system.
For example from sysrq-b:
Trying to free I
This patch series is designed to hook up memory_failure on
UE errors; this is especially helpful for user_mode UE errors.
The first two patches clean up bits and remove dead code.
I could not find any users of get_mce_fault_addr().
The second one improves printing of physical address
The third patch w
There are no users of get_mce_fault_addr()
Fixes: b63a0ff ("powerpc/powernv: Machine check exception handling.")
Signed-off-by: Balbir Singh
Reviewed-by: Nicholas Piggin
---
arch/powerpc/include/asm/mce.h | 2 --
arch/powerpc/kernel/mce.c | 39 ---
2 f
Use the same alignment as Effective address and rename
physical address to Page Frame Number
Signed-off-by: Balbir Singh
Reviewed-by: Nicholas Piggin
---
arch/powerpc/kernel/mce.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel
Extract physical_address for UE errors by walking the page
tables for the mm and address at the NIP, to extract the
instruction. Then use the instruction to find the effective
address via analyse_instr().
We might have page table walking races, but we expect them to
be rare, the physical address e
Hook up instruction errors (UE) for memory offlining via memory_failure()
in a manner similar to load/store errors (derror). Since we have access
to the NIP, the conversion is a one step process in this case.
Signed-off-by: Balbir Singh
Reviewed-by: Nicholas Piggin
---
arch/powerpc/kernel/mce_powe
If we are in user space and hit a UE error, we now have the
basic infrastructure to walk the page tables and find out
the effective address that was accessed, since the DAR
is not valid.
We use a workqueue context to hook up the bad pfn; any
other context causes problems, since memory_failure itse
On Fri, 29 Sep 2017 13:58:02 +1000
Michael Ellerman wrote:
> In opal_event_shutdown() we free all the IRQs hanging off the
> opal_event_irqchip. However it's not safe to do so if we're called
> from IRQ context, because free_irq() wants to synchronise versus IRQ
> context. This can lead to warnin
This adds definitions for the OV32 and CA32 bits of XER that
were introduced in POWER ISA v3.0. There are some existing
instructions that currently set the OV and CA bits based on
certain conditions.
The emulation behaviour of all these instructions needs to
be updated to set these new bits accord
There are existing fixed-point arithmetic instructions that always set the
CA bit of XER to reflect the carry out of bit 0 in 64-bit mode and out of
bit 32 in 32-bit mode. In ISA v3.0, these instructions also always set the
CA32 bit of XER to reflect the carry out of bit 32.
This fixes the emulate
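The CA/CA32 rule described above can be sketched for a plain add with no carry-in. This is an illustration only, not the emulation code; struct carry and add_carry() are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

struct carry {
	bool ca;	/* carry as seen in the current mode */
	bool ca32;	/* carry out of the low 32 bits, in either mode */
};

/* Compute both carry flags for a + b (no carry-in). */
struct carry add_carry(uint64_t a, uint64_t b, bool is_64bit_mode)
{
	uint64_t sum = a + b;
	struct carry c;

	/* The low word wrapped if the truncated sum is below the truncated operand. */
	c.ca32 = (uint32_t)sum < (uint32_t)a;
	/* In 64-bit mode CA is the carry out of the full doubleword. */
	c.ca = is_64bit_mode ? (sum < a) : c.ca32;
	return c;
}
```

For example, 0xFFFFFFFF + 1 in 64-bit mode leaves CA clear but sets CA32, since only the low word overflows.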
This fixes the emulated behaviour of existing fixed-point shift right
algebraic instructions that are supposed to set both the CA and CA32
bits of XER when running on a system that is compliant with POWER ISA
v3.0 independent of whether the system is executing in 32-bit mode or
64-bit mode. The fol
S);
yield_count = be32_to_cpu(lppaca_of(holder_cpu).yield_count);
if ((yield_count & 1) == 0)
return; /* virtual cpu is currently running */
rmb();
Machine Type: Power 8 PowerVM LPAR
kernel : 4.14.0-rc2-next-20170928
gcc: version 6.3.1
Test : DLPAR Memory
config
This driver provides interface to mmap the OCC sensor area
to userspace to parse and read OCC inband sensors.
Signed-off-by: Shilpasri G Bhat
---
- The skiboot patch for this is posted here:
https://lists.ozlabs.org/pipermail/skiboot/2017-September/009209.html
arch/powerpc/platforms/powernv/Mak