On Tue, 17 Aug 2021 13:08:46 +0530 Jerin Jacob <jerinjac...@gmail.com> wrote:
> On Tue, Aug 17, 2021 at 9:23 AM Stephen Hemminger
> <step...@networkplumber.org> wrote:
> >
> > On Tue, 17 Aug 2021 08:57:18 +0530
> > <jer...@marvell.com> wrote:
> >
> > > From: Jerin Jacob <jer...@marvell.com>
> > >
> > > Introducing oops handling API with following specification
> > > and enable stub implementation for Linux and FreeBSD.
> > >
> > > On rte_eal_init() invocation, the EAL library installs the
> > > oops handler for the essential signals.
> > > The rte_oops_signals_enabled() API provides the list
> > > of signals installed by the EAL.
> >
> > This is a big change, and many applications already handle these
> > signals themselves. Therefore adding this needs to be opt-in
> > and not enabled by default.
>
> In order to avoid every application explicitly registering this
> sighandler, and to cater to co-existing application-specific
> signal-handler usage, the following design has been chosen.
> (It is mentioned in the commit log; I will describe it here for
> more clarity.)
>
> Case 1:
> a) The application installs the signal handler prior to rte_eal_init().
> b) The implementation stores the application-specific signal handler
> and replaces it with the EAL oops handler.
> c) When the application/DPDK gets a segfault, the default EAL oops
> handler gets invoked.
> d) It dumps the EAL-specific message, then calls the
> application-specific signal handler installed in step a) by the
> application. This avoids breaking any contract with the application,
> i.e. the behavior is the same as the current EAL.
> That is the reason for not using SA_RESETHAND (which would call SIG_DFL
> after the EAL oops handler instead of the application-specific handler).
>
> Case 2:
> a) The application installs the signal handler after rte_eal_init().
> b) The EAL handler gets replaced with the application handler; the
> application can then call rte_oops_decode() to decode.
>
> In order to cater to the above use cases, rte_oops_signals_enabled() and
> rte_oops_decode() are provided.
>
> Here we are not breaking any contract with the application.
> Do you have concerns about this design?

In our application as a service, it is important not to do any
backtrace in production. We rely on other infrastructure to process
coredumps.

This should be enabled by a command-line argument.