On 06/07/2017 08:29 PM, Noam Camus wrote:
*From:* Noam Camus
*Sent:* Wednesday, June 7, 2017 8:06:17 PM
*To:* Vineet Gupta; linux-snps-...@lists.infradead.org
*Cc:* linux-kernel@vger.kernel.org; Elad Kanfi
*Subject:* Re: [PATCH v2 11/11] ARC: [plat-eznps] Handle memory error as an exception
*> From:*Vineet Gupta <vineet.gup...@synopsys.com>
*> Sent:* Wednesday, June 7, 2017 7:15 PM...
> So NPS *hardware* generates an exception, jumps to vector mem_service(), which you
> redirect to the machine check handler - which simply panics.
> But this redirection is under EZNPS_MEM_ERROR, which you have defaulted to "n". So
> how is the default working for hardware? Doesn't it need to be "y"?
The NPS400 architects changed the userspace bus error behavior to raise a machine
check instead of a level 2 interrupt.
The reason is that we are dealing with an imprecise exception: the result of a
memory request comes back to the core a long time after the bad instruction was
executed.
In the meantime the core may HW-schedule between threads, so the result may hit
another thread.
The core does not keep information on each such bus transaction, so it just
interferes with the current thread without knowing whether it was the initiator of
this bus transaction.
In such a case we prefer to raise a machine check and end with a PANIC.
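To make the imprecise-exception argument concrete, here is a minimal toy model in C. It is only a sketch: the strict round-robin scheduling, the fixed time slice, and the function names are assumptions made for illustration and do not describe the real NPS400 scheduler.

```c
#include <assert.h>

/*
 * Toy model, NOT NPS400 code. Assumption: the core switches between
 * hardware threads in strict round-robin order every `slice` cycles,
 * starting with thread `first` at cycle 0.
 * Returns the hardware thread that is running at `cycle`.
 */
int thread_running_at(int cycle, int num_threads, int slice, int first)
{
    assert(num_threads > 0 && slice > 0 && cycle >= 0);
    return (first + cycle / slice) % num_threads;
}

/*
 * Thread `issuer` issues a bad bus request at cycle 0 and the error
 * comes back `latency` cycles later. Returns nonzero when the error
 * arrives while a different thread is running, i.e. the core would
 * disturb a thread that did not initiate the transaction.
 */
int error_hits_wrong_thread(int issuer, int latency, int num_threads, int slice)
{
    return thread_running_at(latency, num_threads, slice, issuer) != issuer;
}
```

With 4 threads and a slice of 10 cycles, an error that returns after 25 cycles lands on thread 2 rather than on issuer 0, while one that returns after 5 cycles still lands on the issuer.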
Ok this makes sense !
With the simulator we just turn this configuration on, so we redirect the legacy
Synopsys L2 ISR from nSIM into the machine check.
This way we end up just like with silicon 😊
This doesn't make sense :-)
In simulation (where L2 interrupt is asserted), you need to handle it as such -
say reading out the banked regs for L2 interrupt. What you are doing here is
handling it like an exception, which won't work. I really don't see the point of
this "alignment" - hardware and simulation are different. Simulation semantics are
already supported by generic ARC code. And for the silicon case, the existing
MachineCheck vector would work for both K and U. So I'm not sure what we are trying
to achieve here !
> BTW it seems your patch is wrong otherwise too. So the userspace bus error will go
> to machine check handler which currently just panic's. You really want to kill the
> user space process and continue, thus need to call do_memory_error()
So I believe that we do the correct thing here when we deal with multi-threaded
cores.
Sure, the imprecise handling of bus error is an issue - but we should at least try
to recover. By just panic'ing unconditionally, you are enabling a one-liner user
program to panic the system (granted, in simulation only).
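Roughly, the policy I have in mind looks like the sketch below. This is a stand-alone illustration, not kernel code: the enum and bus_error_policy() are hypothetical names, and it does not reproduce the actual do_memory_error() implementation.

```c
#include <stdbool.h>

/* Hypothetical outcome of a bus error, for illustration only. */
enum bus_error_action {
    ACTION_KILL_TASK,  /* terminate the current user task and keep running */
    ACTION_PANIC       /* unrecoverable: stop the system */
};

/*
 * Hypothetical policy helper (not a kernel API): recover when the
 * error was taken in user mode, panic only when it was taken in
 * kernel mode. Caveat from this thread: with an imprecise exception
 * the task that gets killed may not be the one that issued the bad
 * access.
 */
enum bus_error_action bus_error_policy(bool from_user_mode)
{
    return from_user_mode ? ACTION_KILL_TASK : ACTION_PANIC;
}
```

The point is that a user-mode bus error then costs at most one process, instead of taking the whole system down.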
-Vineet