On 7/11/2018 5:55 AM, Stephen Hemminger wrote:
On Fri, 29 Jun 2018 18:30:42 +0800
Jeff Guo <jia....@intel.com> wrote:
When a device is hot-unplugged while the data path is still reading from or
writing to it, a SIGBUS error occurs, and this error needs to be handled. So
a handler is needed to capture the signal and handle it accordingly.
Handling a SIGBUS error is bus-specific behavior, so this patch introduces
a bus op so that each kind of bus can implement its own logic.
Signed-off-by: Jeff Guo <jia....@intel.com>
---
v4->v3:
split patches to be small and clear.
---
lib/librte_eal/common/include/rte_bus.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
index 3642aeb..231bd3d 100644
--- a/lib/librte_eal/common/include/rte_bus.h
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -181,6 +181,20 @@ typedef int (*rte_bus_parse_t)(const char *name, void *addr);
typedef int (*rte_bus_hotplug_handler_t)(struct rte_device *dev);
/**
+ * Implement a bus-specific SIGBUS handler, which is responsible for
+ * handling a SIGBUS error that is either a genuine memory error or a
+ * memory error caused by hot unplug.
+ * @param failure_addr
+ * Pointer to the fault address of the SIGBUS error.
+ *
+ * @return
+ * 0 if the SIGBUS was handled successfully.
+ * 1 if the SIGBUS was not handled.
+ * -1 if handling the SIGBUS failed.
+ */
+typedef int (*rte_bus_sigbus_handler_t)(const void *failure_addr);
+
+/**
* Bus scan policies
*/
enum rte_bus_scan_mode {
@@ -226,6 +240,8 @@ struct rte_bus {
rte_bus_get_iommu_class_t get_iommu_class; /**< Get iommu class */
rte_bus_hotplug_handler_t hotplug_handler;
/**< handle hot plug on bus */
+ rte_bus_sigbus_handler_t sigbus_handler; /**< handle sigbus error */
+
};
/**
One issue with handling sigbus is that you are going to trap program errors
as well as hotplug. How can you distinguish between a removed device and a
buggy userspace program (or worse, a compromised program)?
That is a problem I have considered in this mechanism, and it is handled in
another patch. The approach is to first check whether the faulting address
belongs to an MMIO device resource or not.
If it does, the new sigbus handler for hotplug is used; if not, it means
the fault comes from a buggy user space program, and the generic sigbus
handler is used to handle it.
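
To make the check described above concrete, here is a minimal sketch in plain C. It is not the code from the other patch: the `mmio_region` struct, the `addr_in_region()` helper and the `example_bus_sigbus_handler()` name are all hypothetical, and a real bus would walk its own device resource list rather than take an array argument. Only the 0 / 1 / -1 return convention is taken from the `rte_bus_sigbus_handler_t` comment in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one mapped MMIO range (e.g. a device BAR). */
struct mmio_region {
	const void *base; /* start of the mapping */
	size_t len;       /* length of the mapping in bytes */
};

/* Return values follow the contract documented for rte_bus_sigbus_handler_t. */
enum {
	SIGBUS_HANDLED = 0,  /* fault belongs to a device on this bus */
	SIGBUS_NOT_MINE = 1, /* fault is outside every known MMIO range */
	SIGBUS_FAILED = -1   /* fault is ours, but recovery failed */
};

/* Check whether addr falls inside [base, base + len). */
static int
addr_in_region(const struct mmio_region *r, const void *addr)
{
	uintptr_t a = (uintptr_t)addr;
	uintptr_t start = (uintptr_t)r->base;

	return a >= start && a - start < r->len;
}

/*
 * Hypothetical bus-level handler: classify the fault address against the
 * MMIO ranges the bus knows about. A real implementation would start its
 * hot-unplug recovery where the comment is and return SIGBUS_FAILED if
 * that recovery does not succeed.
 */
static int
example_bus_sigbus_handler(const struct mmio_region *regions, size_t n,
			   const void *failure_addr)
{
	size_t i;

	assert(regions != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (addr_in_region(&regions[i], failure_addr)) {
			/* hot-unplug recovery for the owning device goes here */
			return SIGBUS_HANDLED;
		}
	}

	/* Not a device address: leave it to the generic sigbus handler. */
	return SIGBUS_NOT_MINE;
}
```

With this shape, the signal handler would take the fault address (`si_addr` from `siginfo_t` when installed with `SA_SIGINFO`), offer it to each bus, and fall back to the generic sigbus handler only when every bus reports 1.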