hi,jan
On 9/20/2017 2:44 AM, Jan Blunck wrote:
On Wed, Sep 20, 2017 at 6:12 AM, Jeff Guo <jia....@intel.com> wrote:
So far, about hot plug in dpdk, we already have hot plug add/remove
api and fail-safe driver to offload the fail-safe work from the app
user. But there are still lack of a general event api, since the interrupt
event, which hot plug related with, is diversity between each device and
driver, such as mlx4, pci driver and others.
Use the hot removal event for example, pci drivers not all exposure the
remove interrupt, so in order to make user to easy use the hot plug feature
for pci driver, something must be done to detect the remove event at the
kernel level and offer a new line of interrupt to the user land.
Base on the uevent of kobject mechanism in kernel, we could use it to
benefit for monitoring the hot plug status of the device which not only
uio/vfio of pci bus devices, but also other, such as cpu/usb/pci-express
bus devices.
The idea is comming as bellow.
Jeff,
We already have libudev. Sorry for catching it that late but I don't
believe that replicating the uevent handling belongs in the DPDK. You
might want to look into a helper to find the corresponding rte_device
for a given devnode though.
i think the kobject netlink message could be used by bunch of app,such
as customize hotplug app or libudev, off course it could used by dpdk,
and since it would related with
some different behaviors of diversity drivers and user space complex
device management. so aim to let dpdk user easy to develop hotplug , we
want to offload the monitoring work and
coherenceprocession into eal layer.
Also the remap_device function should get removed. There is no
synchronization between the polling for the uevent and the rest of the
drivers. Therefore there is no guarantee that you can remap to "safe"
memory. You should fix the drivers instead.
agree with you that would no have sync guarantee, maybe we could add the
sigaction or other sync condition in drivers to handle the signal bus
error when unsafe memory control occur.
but i think it is the sync problem , would not affect the remap_device
functional, and would not affect to use a common bus layer memory
control to let caller to use to remap to "safe".
am i right? also i want to collect info that it there any other
conversion about that remap_device api?
Thanks,
Jan
a.The uevent message form FD monitoring which will be useful.
remove@/devices/pci0000:80/0000:80:02.2/0000:82:00.0/0000:83:03.0/0000:84:00.2/uio/uio2
ACTION=remove
DEVPATH=/devices/pci0000:80/0000:80:02.2/0000:82:00.0/0000:83:03.0/0000:84:00.2/uio/uio2
SUBSYSTEM=uio
MAJOR=243
MINOR=2
DEVNAME=uio2
SEQNUM=11366
b.add uevent monitoring machanism:
add several general api to enable uevent monitoring.
c.add common uevent handler and uevent failure handler
uevent of device should be handler at bus or device layer, and the memory read
and write failure when hot removal should be handle correctly before detach
behaviors.
d.show example how to use uevent monitor
enable uevent monitoring in testpmd or fail-safe to show usage.
patchset history:
v5->v4:
1.Move uevent monitor epolling from eal interrupt to eal device layer.
2.Redefine the eal device API for common, and distinguish between linux and bsd
3.Add failure handler helper api in bus layer.Add function of find device by
name.
4.Replace of individual fd bind with single device, use a common fd to polling
all device.
5.Add to register hot insertion monitoring and process, add function to auto
bind driver befor user add device
6.Refine some coding style and typos issue
7.add new callback to process hot insertion
v4->v3:
1.move uevent monitor api from eal interrupt to eal device layer.
2.create uevent type and struct in eal device.
3.move uevent handler for each driver to eal layer.
4.add uevent failure handler to process signal fault issue.
5.add example for request and use uevent monitoring in testpmd.
v3->v2:
1.refine some return error
2.refine the string searching logic to avoid memory issue
v2->v1:
1.remove global variables of hotplug_fd, add uevent_fd
in rte_intr_handle to let each pci device self maintain it fd,
to fix dual device fd issue.
2.refine some typo error.
Jeff Guo (2):
eal: add uevent monitor for hot plug
app/testpmd: use uevent to monitor hot removal
app/test-pmd/testpmd.c | 90 ++++++
lib/librte_eal/bsdapp/eal/eal_dev.c | 64 ++++
.../bsdapp/eal/include/exec-env/rte_dev.h | 105 ++++++
lib/librte_eal/common/eal_common_bus.c | 31 ++
lib/librte_eal/common/eal_common_dev.c | 223 ++++++++++++-
lib/librte_eal/common/eal_common_pci.c | 69 +++-
lib/librte_eal/common/eal_common_pci_uio.c | 31 +-
lib/librte_eal/common/eal_common_vdev.c | 29 +-
lib/librte_eal/common/eal_private.h | 13 +-
lib/librte_eal/common/include/rte_bus.h | 52 +++
lib/librte_eal/common/include/rte_dev.h | 102 +++++-
lib/librte_eal/common/include/rte_pci.h | 26 ++
lib/librte_eal/linuxapp/eal/Makefile | 3 +-
lib/librte_eal/linuxapp/eal/eal_dev.c | 351 +++++++++++++++++++++
lib/librte_eal/linuxapp/eal/eal_pci.c | 33 ++
.../linuxapp/eal/include/exec-env/rte_dev.h | 105 ++++++
16 files changed, 1318 insertions(+), 9 deletions(-)
create mode 100644 lib/librte_eal/bsdapp/eal/eal_dev.c
create mode 100644 lib/librte_eal/bsdapp/eal/include/exec-env/rte_dev.h
create mode 100644 lib/librte_eal/linuxapp/eal/eal_dev.c
create mode 100644 lib/librte_eal/linuxapp/eal/include/exec-env/rte_dev.h
--
2.7.4