Link to v1: [10]

:: Introduction ::

DPDK has been inherently a PCI inclined framework. Because of this, the
design of the device tree (or list) within DPDK is also PCI inclined. A
non-PCI device doesn't have a way of being expressed without using hooks
started from EAL to PMD.

(Check 'Version Changes' section for changes)

:: Overview of the Proposed Changes ::

Assuming the below graph for a computing node:

        device A1
          |
  +==.===='==============.============+ Bus A.
     |                    `--> driver A11     \
  device A2                `-> driver A12      \______
                                                |CPU |
                                                /`````
        device B1                              /
          |                                   /
  +==.===='==============.============+ Bus B`
     |                    `--> driver B11
  device B2                `-> driver B12

- One or more buses are connected to a CPU (or core)
- One or more devices are connected to a Bus
- Drivers are running instances which manage one or more devices
- Bus is responsible for identifying devices (and interrupt propagation)
- Driver is responsible for initializing the device

In context of DPDK EAL:
- rte_bus represents a Bus. An implementation of a physical bus would
  instantiate this class.
- Buses are registered just like a PMD - RTE_REGISTER_BUS()
  `- Thus, the implementation for PCI would instantiate an rte_bus, give
     it a name and provide scan/match hooks.
   - Currently, priority of the RTE_REGISTER_BUS constructor has been set
     to 101 to make sure the bus is registered *before* drivers are.
- Each registered bus is part of a doubly linked list.
  -- Each device refers to the rte_bus to which it belongs
  -- Each driver refers to the rte_bus with which it is associated
  -- Device and Driver lists are part of rte_bus
  -- NO global device/driver list would exist
- When a PMD wants to register itself, it would 'add' itself to an
  existing bus, which essentially translates to adding the driver to a
  bus-specific driver_list.
- Bus would perform a scan and 'add' the scanned devices to its list.
- Bus would perform a probe, linking devices and drivers on each bus and
  invoking a series of probes
  `-- There is some parallel work for combining scan/probe in EAL [5]
      and also for doing away with an independent scan function
      altogether [6].

The view would be almost like:

                                 __ rte_bus_list
                                /
                    +----------'---+
                    |rte_bus       |
                    | driver_list------> device_list for this bus
                    | device_list----
                    | scan()       | `-> driver_list for this bus
                    | match()      |
                    | probe()      |
                    |              |
                    +--|------|----+
             _________/        \_________
   +--------/----+                     +-\---------------+
   |rte_device   |                     |rte_driver       |
   | *rte_bus    |                     | *rte_bus        |
   | rte_driver  |                     | probe()         |
   |             |                     | remove()        |
   | devargs     |                     |                 |
   +---||--------+                     +---------|||-----+
       ||                                        '''
       | \                                        \\\
       |  \_____________                           \\\
       |                \                           |||
+------|---------+  +----|----------+               |||
|rte_pci_device  |  |rte_xxx_device |               |||
| PCI specific   |  | xxx device    |               |||
| info (mem,)    |  | specific fns  |              / | \
+----------------+  +---------------+             /  |  \
                            _____________________/  /    \
                           /                    ___/      \
            +-------------'--+    +------------'---+    +--'------------+
            |rte_pci_driver  |    |rte_vdev_driver |    |rte_xxx_driver |
            | PCI id table,  |    | <probably,     |    | ....          |
            | other driver   |    |  nothing>      |    +---------------+
            | data           |    |  ...           |
            | probe()        |    +----------------+
            | remove()       |
            +----------------+

In continuation to the RFC posted on 17/Nov [9], a series of patches is
being posted which attempts to create:

1. A basic bus model
   `- define rte_bus and associated methods/helpers
   `- test infrastructure to test the Bus infra
2. Changes in EAL to support PCI as a bus
   `- a "pci" bus is registered
   `- existing scan/match/probe are modified to allow for bus integration
   `- PCI Device and Driver lists, which were global entities, have been
      moved to rte_bus->[device/driver]_list

For v2 as well, I have sanity tested this series on a XeonD X552 available
to me, and also as part of a PoC for verifying NXP's DPAA2 PMD (being
pushed out in a separate series). Exhaustive testing is still pending.
-> Please help with MLX & BSD related changes.

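To make the registration and probe flow described in this overview easier to picture, here is a minimal standalone sketch. It is NOT code from the patches: every demo_* name, field and signature below is hypothetical. Only the per-bus device/driver lists, the absence of global lists, and the constructor priority of 101 are taken from the description above. It assumes GCC or Clang (for __attribute__((constructor))) and a libc that provides <sys/queue.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical, simplified stand-ins for rte_bus/rte_device/rte_driver. */
struct demo_bus;
struct demo_driver;

struct demo_device {
	TAILQ_ENTRY(demo_device) next;
	struct demo_bus *bus;        /* bus on which the device sits */
	struct demo_driver *driver;  /* set once a driver has claimed it */
	const char *name;
};

struct demo_driver {
	TAILQ_ENTRY(demo_driver) next;
	struct demo_bus *bus;        /* bus with which the driver is associated */
	const char *name;
	int (*probe)(struct demo_driver *drv, struct demo_device *dev);
};

struct demo_bus {
	TAILQ_ENTRY(demo_bus) next;
	const char *name;
	TAILQ_HEAD(, demo_device) device_list;  /* per-bus, no global list */
	TAILQ_HEAD(, demo_driver) driver_list;  /* per-bus, no global list */
	int (*scan)(struct demo_bus *bus);
	int (*match)(const struct demo_driver *drv, const struct demo_device *dev);
};

/* Statically initialized, so it is valid before any constructor runs. */
static TAILQ_HEAD(, demo_bus) demo_bus_list = TAILQ_HEAD_INITIALIZER(demo_bus_list);

struct demo_bus *demo_bus_find(const char *name)
{
	struct demo_bus *bus;

	TAILQ_FOREACH(bus, &demo_bus_list, next)
		if (strcmp(bus->name, name) == 0)
			return bus;
	return NULL;
}

/* Fake hooks: scan "discovers" one device, match accepts everything. */
static struct demo_device demo_dev0 = { .name = "dev0" };

static int demo_scan(struct demo_bus *bus)
{
	demo_dev0.bus = bus;
	TAILQ_INSERT_TAIL(&bus->device_list, &demo_dev0, next);
	return 0;
}

static int demo_match(const struct demo_driver *drv, const struct demo_device *dev)
{
	(void)drv;
	(void)dev;
	return 1; /* a real bus would compare IDs here */
}

static int demo_drv_probe(struct demo_driver *drv, struct demo_device *dev)
{
	dev->driver = drv;
	return 0;
}

static struct demo_bus demo_bus_inst = {
	.name = "demo", .scan = demo_scan, .match = demo_match,
};
static struct demo_driver demo_drv = { .name = "demo_drv", .probe = demo_drv_probe };

/* Driver registration: no priority, so it runs after prioritized constructors. */
__attribute__((constructor)) static void demo_drv_init(void)
{
	struct demo_bus *bus = demo_bus_find("demo");

	if (bus == NULL)
		return; /* would happen if the bus were not registered first */
	demo_drv.bus = bus;
	TAILQ_INSERT_TAIL(&bus->driver_list, &demo_drv, next);
}

/* Bus registration: priority 101 makes it run before demo_drv_init(). */
__attribute__((constructor(101))) static void demo_bus_init(void)
{
	TAILQ_INIT(&demo_bus_inst.device_list);
	TAILQ_INIT(&demo_bus_inst.driver_list);
	TAILQ_INSERT_TAIL(&demo_bus_list, &demo_bus_inst, next);
}

/* Scan every bus, then bind each unclaimed device to the first matching
 * driver. Returns the number of devices bound. Call once: demo_scan()
 * inserts a static device and must not run twice. */
int demo_probe_all(void)
{
	struct demo_bus *bus;
	struct demo_device *dev;
	struct demo_driver *drv;
	int bound = 0;

	TAILQ_FOREACH(bus, &demo_bus_list, next) {
		bus->scan(bus);
		TAILQ_FOREACH(dev, &bus->device_list, next)
			TAILQ_FOREACH(drv, &bus->driver_list, next)
				if (dev->driver == NULL && bus->match(drv, dev) &&
				    drv->probe(drv, dev) == 0)
					bound++;
	}
	return bound;
}
```

In this sketch the driver constructor is deliberately defined before the bus constructor, so it is the priority value and not source order that gets the bus onto the list first. An application would call demo_probe_all() once during init, in the place where rte_eal_init() would trigger the bus probe.
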
:: Brief about Patch Layout ::

0001:      Container_of patch from [3]
0002~0003: Introducing the basic Bus model and associated test case
0004~0005: Add scan, match and insert support for devices on bus
0006:      Add probe and remove for rte_driver
0007:      Enable probing of PCI Bus (devices) from EAL
0008:      Split the existing PCI probe into match and probe
0009:      Make PCI probe/match work on rte_driver/device rather than
           rte_pci_device/rte_pci_driver
0010:      Patch from Ben [8], part of series [2]
0011:      Enable Scan/Match/probe on Bus from EAL and remove unused
           functions and lists. PMDs still don't work (in fact, PCI PMDs
           don't work after this patch - but without any compilation
           issues).
0012:      Change PMDs to integrate with PCI bus

:: Pending Changes/Caveats ::

0. eth_dev still contains rte_pci_device. I am banking on Jan's patches
   [1] for conversion to a macro (ETH_DEV_PCI_DEV) and subsequent
   replacement of all pci_dev usage in eth_dev.

1. One of the major changes pending, as against what was proposed in the
   RFC, is the removal of eth_driver. Being a large and independent
   change, this would be done in a separate series of patches. []

2. app/test/test_pci.c is still not modified. It is rendered uncompilable
   as most of the APIs it was calling don't exist anymore. I will push a
   later version which would include a complete re-write of test_pci
   (or, if someone can help me with that, it would be awesome).

3. This patchset only moves PCI into a bus. And, that movement is also
   currently part of the EAL (lib/librte_eal/linux)
   - there was an open question in the RFC about where to place the PCI
     bus instance - whether in drivers/bus/... or in lib/librte_bus/...
     or lib/librte_eal/...; this patch uses the last option. But, the
     movement only impacts placement of Makefiles. Please convey your
     reservations about the current placement.
   - It also impacts point (8) about priority use in the constructor

4. Though the implementation for bus is common for Linux and BSD, the PCI
   bus implementation has been done/tested only for Linux.

5. There was a suggestion from Jan Blunck about a helper iterator within
   the rte_bus. I will send this out in v3.

6. Cryptodev and VDEV changes are still pending. Jan has already conveyed
   that he would be working on the VDEV part. I will work on the Crypto
   part and target that for v3.

7. The overall layout for driver probing has changed a little.
   Earlier, it was:
     rte_eal_init()
      `-> rte_eal_pci_probe() (and parallel for VDEV)
          `-> rte_pci_driver->probe()
              `-> eth_driver->eth_dev_init()

   Now, it would be:
     rte_eal_init()
      `-> rte_eal_bus_probe()              <- Iterator for PCI device/driver
          `-> rte_driver->probe()          <- devargs handling
           |                                  old rte_eal_pci_probe()
           `-> rte_xxx_driver->probe()     <- eth_dev allocation
               `-> eth_driver->eth_dev_init <- eth_dev init

   Open Questions:
   Also, rte_driver->probe() creating eth_dev certainly sounds a little
   wrong - but, I would like to get your opinion on which layer the
   ethernet device should correspond to.
   1) Which layer should allocate eth_dev?
      `-> My take: rte_driver->probe()
   2) Which layer should fill the eth_dev?
      `-> My take: rte_xxx_driver->probe()
   3) Is init/uninit a better name for rte_xxx_driver->probe() if all
      they do is initialize the ethernet device?

8. RTE_REGISTER_BUS has been declared with a constructor priority of 101.
   It is important that the Bus is registered *before* drivers are
   registered. The only way I could find to assure that was via
   __attribute__((constructor(priority))) of GCC. I am not sure how it
   would behave on other compilers. Any suggestions?
   - One suggestion from David Marchand was to use global bus object
     handles, which I have not implemented for now. If that is the common
     choice, I will change it in v3.

:: ToDo list ::

- app/test/test_pci.c compilation is failing, if enabled.
- Bump to librte_eal version
- Documentation continues to have references to some _old_ PCI symbols
- vdev changes
- eth_device, eth_driver changes

:: References ::

[1] http://dpdk.org/ml/archives/dev/2016-November/050186.html
[2] http://dpdk.org/ml/archives/dev/2016-November/050622.html
[3] http://dpdk.org/ml/archives/dev/2016-November/050416.html
[4] http://dpdk.org/ml/archives/dev/2016-November/050567.html
[5] http://dpdk.org/ml/archives/dev/2016-November/050628.html
[6] http://dpdk.org/ml/archives/dev/2016-November/050415.html
[7] http://dpdk.org/ml/archives/dev/2016-November/050443.html
[8] http://dpdk.org/ml/archives/dev/2016-November/050624.html
[9] http://dpdk.org/ml/archives/dev/2016-November/050296.html
[10] http://dpdk.org/ml/archives/dev/2016-December/051349.html

:: Version Changes ::

v2:
- No more bus->probe()
  Now, rte_eal_bus_probe() calls rte_driver->probe based on match output
- new functions, rte_eal_pci_probe and rte_eal_pci_remove have been
  added as glue code between PCI PMDs and PCI Bus
  `-> PMDs are updated to use these new functions as callbacks for
      rte_driver
- 'default' keyword has been removed from match and scan
- Fix for incorrect changes in mlx* and nicvf*
- Checkpatch fixes
- Some variable checks have been removed from internal functions;
  functions which are externally visible continue to have such checks
- Some rearrangement of patches:
  -- changes to drivers have been separated from EAL changes (but this
     does make PCI PMDs non-working for a particular patch)

Ben Walker (1):
  pci: Pass rte_pci_addr to functions instead of separate args

Jan Blunck (1):
  eal: define container_of macro

Shreyansh Jain (10):
  eal/bus: introduce bus abstraction
  test: add basic bus infrastructure tests
  eal/bus: add scan, match and insert support
  eal: integrate bus scan and probe with EAL
  eal: add probe and remove support for rte_driver
  eal: enable probe from bus infrastructure
  pci: split match and probe function
  eal/pci: generalize args of PCI scan/match towards RTE device/driver
  eal: enable PCI bus
  drivers: update PMDs to use rte_driver probe and remove

 app/test/Makefile                               |   2 +-
 app/test/test.h                                 |   2 +
 app/test/test_bus.c                             | 688 ++++++++++++++++++++++++
 app/test/test_pci.c                             |   2 +-
 drivers/net/bnx2x/bnx2x_ethdev.c                |   8 +
 drivers/net/bnxt/bnxt_ethdev.c                  |   4 +
 drivers/net/cxgbe/cxgbe_ethdev.c                |   4 +
 drivers/net/e1000/em_ethdev.c                   |   4 +
 drivers/net/e1000/igb_ethdev.c                  |   8 +
 drivers/net/ena/ena_ethdev.c                    |   4 +
 drivers/net/enic/enic_ethdev.c                  |   4 +
 drivers/net/fm10k/fm10k_ethdev.c                |   4 +
 drivers/net/i40e/i40e_ethdev.c                  |   4 +
 drivers/net/i40e/i40e_ethdev_vf.c               |   4 +
 drivers/net/ixgbe/ixgbe_ethdev.c                |   8 +
 drivers/net/mlx4/mlx4.c                         |   4 +-
 drivers/net/mlx5/mlx5.c                         |   1 +
 drivers/net/nfp/nfp_net.c                       |   4 +
 drivers/net/qede/qede_ethdev.c                  |   8 +
 drivers/net/szedata2/rte_eth_szedata2.c         |   4 +
 drivers/net/thunderx/nicvf_ethdev.c             |   4 +
 drivers/net/virtio/virtio_ethdev.c              |   2 +
 drivers/net/vmxnet3/vmxnet3_ethdev.c            |   4 +
 lib/librte_eal/bsdapp/eal/Makefile              |   1 +
 lib/librte_eal/bsdapp/eal/eal.c                 |  12 +-
 lib/librte_eal/bsdapp/eal/eal_pci.c             |  52 +-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map   |  22 +-
 lib/librte_eal/common/Makefile                  |   2 +-
 lib/librte_eal/common/eal_common_bus.c          | 285 ++++++++++
 lib/librte_eal/common/eal_common_pci.c          | 341 +++++++-----
 lib/librte_eal/common/eal_private.h             |  14 +-
 lib/librte_eal/common/include/rte_bus.h         | 257 +++++++++
 lib/librte_eal/common/include/rte_common.h      |  21 +
 lib/librte_eal/common/include/rte_dev.h         |  14 +
 lib/librte_eal/common/include/rte_pci.h         |  59 +-
 lib/librte_eal/linuxapp/eal/Makefile            |   1 +
 lib/librte_eal/linuxapp/eal/eal.c               |  12 +-
 lib/librte_eal/linuxapp/eal/eal_pci.c           |  81 +--
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |  22 +-
 39 files changed, 1737 insertions(+), 240 deletions(-)
 create mode 100644 app/test/test_bus.c
 create mode 100644 lib/librte_eal/common/eal_common_bus.c
 create mode 100644 lib/librte_eal/common/include/rte_bus.h

--
2.7.4