> -----Original Message-----
> From: Burakov, Anatoly
> Sent: Tuesday, October 2, 2018 4:54 PM
> To: Guo, Jia <jia....@intel.com>; step...@networkplumber.org; Richardson,
> Bruce <bruce.richard...@intel.com>; Yigit, Ferruh
> <ferruh.yi...@intel.com>; Ananyev, Konstantin <konstantin.anan...@intel.com>;
> gaetan.ri...@6wind.com; Wu, Jingjing
> <jingjing...@intel.com>; tho...@monjalon.net; mo...@mellanox.com;
> ma...@mellanox.com; Van Haaren, Harry
> <harry.van.haa...@intel.com>; Zhang, Qi Z <qi.z.zh...@intel.com>; He,
> Shaopeng <shaopeng...@intel.com>; Iremonger, Bernard
> <bernard.iremon...@intel.com>; arybche...@solarflare.com; Lu, Wenzhuo
> <wenzhuo...@intel.com>; jerin.ja...@caviumnetworks.com
> Cc: jblu...@infradead.org; shreyansh.j...@nxp.com; dev@dpdk.org; Zhang, Helin
> <helin.zh...@intel.com>
> Subject: Re: [PATCH v12 6/7] eal: add failure handle mechanism for hot-unplug
>
> On 02-Oct-18 1:35 PM, Jeff Guo wrote:
> > The mechanism can initially register the sigbus handler after the device
> > event monitor is enabled. When a sigbus event is captured, it will check
> > the failure address and accordingly handle the memory failure of the
> > corresponding device by invoke the hot-unplug handler. It could prevent
> > the application from crashing when a device is hot-unplugged.
> >
> > By this patch, users could call below new added APIs to enable/disable
> > the device hotplug handle mechanism.
> > Note that it just implement the
> > hot-unplug handler in these functions, the other handler of hotplug, such
> > as handler for hotplug binding, could be add in the future if need:
> > - rte_dev_hotplug_handle_enable
> > - rte_dev_hotplug_handle_disable
> >
> > Signed-off-by: Jeff Guo <jia....@intel.com>
> > ---
>
> <snip>
>
> > +static void sigbus_handler(int signum, siginfo_t *info,
> > +				void *ctx __rte_unused)
> > +{
> > +	int ret;
> > +
> > +	RTE_LOG(INFO, EAL, "Thread[%d] catch SIGBUS, fault address:%p\n",
> > +		(int)pthread_self(), info->si_addr);
> > +
> > +	rte_spinlock_lock(&failure_handle_lock);
> > +	ret = rte_bus_sigbus_handler(info->si_addr);
> > +	rte_spinlock_unlock(&failure_handle_lock);
> > +	if (ret == -1) {
> > +		rte_exit(EXIT_FAILURE,
> > +			 "Failed to handle SIGBUS for hot-unplug, "
> > +			 "(rte_errno: %s)!", strerror(rte_errno));
>
> Do we really want to exit the application on sigbus handle failure?

I'd say yes :)
What else can we do in such a situation, other than die gracefully?
Konstantin

>
> > +	} else if (ret == 1) {
> > +		if (sigbus_action_old.sa_handler)
> > +			(*(sigbus_action_old.sa_handler))(signum);
> > +		else
> > +			rte_exit(EXIT_FAILURE,
> > +				 "Failed to handle generic SIGBUS!");
> > +	}
> > +
> > +	RTE_LOG(INFO, EAL, "Success to handle SIGBUS for hot-unplug!\n");
>
> Again, does this all need to be with INFO log level? IMO it should be DEBUG.
>
> > +}
> > +
> > +static int cmp_dev_name(const struct rte_device *dev,
> > +	const void *_name)
> > +{
> > +	const char *name = _name;
> > +
> > +	return strcmp(dev->name, name);
> > +}
> > +
> >  static int
>
> <snip>
>
> >
> >  int __rte_experimental
> > @@ -220,5 +320,67 @@ rte_dev_event_monitor_stop(void)
> >  	close(intr_handle.fd);
> >  	intr_handle.fd = -1;
> >  	monitor_started = false;
> > +
> >  	return 0;
>
> This looks like unintended change.
>
> >  }
> > +
> > +int __rte_experimental
> > +rte_dev_sigbus_handler_register(void)
> > +{
> > +	sigset_t mask;
> > +	struct sigaction action;
> > +
>
> <snip>
>
> > --- a/lib/librte_eal/rte_eal_version.map
> > +++ b/lib/librte_eal/rte_eal_version.map
> > @@ -281,6 +281,8 @@ EXPERIMENTAL {
> >  	rte_dev_event_callback_unregister;
> >  	rte_dev_event_monitor_start;
> >  	rte_dev_event_monitor_stop;
> > +	rte_dev_hotplug_handle_enable;
> > +	rte_dev_hotplug_handle_disable;
>
> Nitpicking - disable should be above enable, as E follows D in alphabet :)
>
> >  	rte_dev_iterator_init;
> >  	rte_dev_iterator_next;
> >  	rte_devargs_add;
> >
>
>
> --
> Thanks,
> Anatoly
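
For anyone following the thread, the dispatch pattern under review (install a SIGBUS handler, check whether the fault address belongs to a device, otherwise fall back to the previously installed handler) can be sketched in standalone C. This is only a sketch under assumptions, not the patch's code: dev_region, region_add, fault_addr_handle and sigbus_handler_register are hypothetical names, fault_addr_handle merely stands in for rte_bus_sigbus_handler, and no DPDK API is used.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for a device's mapped region (not a DPDK type). */
struct dev_region {
	uintptr_t start;
	size_t len;
};

#define MAX_REGIONS 8
static struct dev_region regions[MAX_REGIONS];
static size_t nb_regions;
static struct sigaction sigbus_action_old;

/* Record a region so that faults inside it can be recognised later. */
static int region_add(void *start, size_t len)
{
	assert(start != NULL);
	if (nb_regions == MAX_REGIONS)
		return -1;
	regions[nb_regions].start = (uintptr_t)start;
	regions[nb_regions].len = len;
	nb_regions++;
	return 0;
}

/*
 * Assumed return convention, modelled on the quoted patch:
 * 0 = address belongs to a known device region, 1 = generic SIGBUS.
 * A real handler must also repair the mapping before returning 0,
 * otherwise the faulting access simply traps again.
 */
static int fault_addr_handle(const void *addr)
{
	uintptr_t a = (uintptr_t)addr;
	size_t i;

	for (i = 0; i < nb_regions; i++) {
		if (a >= regions[i].start &&
		    a - regions[i].start < regions[i].len)
			return 0;
	}
	return 1;
}

static void sigbus_handler(int signum, siginfo_t *info, void *ctx)
{
	if (fault_addr_handle(info->si_addr) == 0)
		return;

	/* Generic SIGBUS: hand over to whatever was installed before us. */
	if (sigbus_action_old.sa_flags & SA_SIGINFO)
		sigbus_action_old.sa_sigaction(signum, info, ctx);
	else if (sigbus_action_old.sa_handler != SIG_DFL &&
		 sigbus_action_old.sa_handler != SIG_IGN)
		sigbus_action_old.sa_handler(signum);
	else
		_exit(EXIT_FAILURE);
}

/* Install the handler and remember the previous action for chaining. */
static int sigbus_handler_register(void)
{
	struct sigaction action;

	memset(&action, 0, sizeof(action));
	sigemptyset(&action.sa_mask);
	action.sa_sigaction = sigbus_handler;
	action.sa_flags = SA_SIGINFO;
	return sigaction(SIGBUS, &action, &sigbus_action_old);
}
```

A caller would add its regions with region_add() and then call sigbus_handler_register() once. Unlike the quoted hunk, the sketch checks SA_SIGINFO and SIG_DFL/SIG_IGN before chaining to the old action; that is a choice made for the sketch, not something taken from the patch.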