> >>>> We have been working on optimizing the latency of calls to > >>>> rte_eth_dev_start(), on ports spawned by mlx5 PMD. Most of the work > >>>> requires changes in the implementation of > >>>> .dev_start() PMD callback, but I also wanted to start a discussion > >>>> regarding configuration restore. > >>>> > >>>> rte_eth_dev_start() does a few things on top of calling .dev_start() > >>>> callback: > >>>> > >>>> - Before calling it: > >>>> - eth_dev_mac_restore() - if device supports > >>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR; > >>>> - After calling it: > >>>> - eth_dev_mac_restore() - if device does not support > >>> RTE_ETH_DEV_NOLIVE_MAC_ADDR; > >>>> - restore promiscuous config > >>>> - restore all multicast config > >>>> > >>>> eth_dev_mac_restore() iterates over all known MAC addresses - stored > >>>> in rte_eth_dev_data.mac_addrs array - and calls > >>>> .mac_addr_set() and .mac_addr_add() callbacks to apply these MAC > >>>> addresses. > >>>> > >>>> Promiscuous config restore checks if promiscuous mode is enabled or > >>>> not, and calls .promiscuous_enable() or .promiscuous_disable() callback. > >>>> > >>>> All multicast config restore checks if all multicast mode is enabled > >>>> or not, and calls .allmulticast_enable() or .allmulticast_disable() > >>>> callback. > >>>> > >>>> Callbacks are called directly in all of these cases, to bypass the > >>>> checks for applying the same configuration, which exist in relevant APIs. > >>>> Checks are bypassed to force drivers to reapply the configuration. > >>>> > >>>> Let's consider what happens in the following sequence of API calls. > >>>> > >>>> 1. rte_eth_dev_configure() > >>>> 2. rte_eth_tx_queue_setup() > >>>> 3. rte_eth_rx_queue_setup() > >>>> 4. rte_eth_promiscuous_enable() > >>>> - Call dev->dev_ops->promiscuous_enable() > >>>> - Stores promiscuous state in dev->data->promiscuous 5. > >>>> rte_eth_allmulticast_enable() > >>>> - Call dev->dev_ops->allmulticast_enable() > >>>> - Stores allmulticast state in dev->data->allmulticast 6. > >>>> rte_eth_dev_start() > >>>> - Call dev->dev_ops->dev_start() > >>>> - Call dev->dev_ops->mac_addr_set() - apply default MAC address > >>>> - Call dev->dev_ops->promiscuous_enable() > >>>> - Call dev->dev_ops->allmulticast_enable() > >>>> > >>>> Even though all configuration is available in dev->data after step 5, > >>>> library forces reapplying this configuration in step 6. > >>>> > >>>> In mlx5 PMD case all relevant callbacks require communication with the > >>>> kernel driver, to configure the device (mlx5 PMD must create/destroy > >>>> new kernel flow rules and/or change netdev config). > >>>> > >>>> mlx5 PMD handles applying all configuration in .dev_start(), so the > >>>> following forced callbacks force additional communication with the > >>>> kernel. The > >>> same configuration is applied multiple times. > >>>> > >>>> As an optimization, mlx5 PMD could check if a given configuration was > >>>> applied, but this would duplicate the functionality of the library > >>>> (for example rte_eth_promiscuous_enable() does not call the driver if > >>>> dev->data->promiscuous is set). > >>>> > >>>> Question: Since all of the configuration is available before > >>>> .dev_start() callback is called, why ethdev library does not expect > >>>> .dev_start() to > >>> take this configuration into account? > >>>> In other words, why library has to reapply the configuration? > >>>> > >>>> I could not find any particular reason why configuration restore > >>>> exists as part of the process (it was in the initial DPDK commit). > >>>> > >>> > >>> My assumption is .dev_stop() cause these values reset in some devices, so > >>> .dev_start() restores them back. > >>> @Bruce or @Konstantin may remember the history. > > > > Yep, as I remember, at least some Intel PMDs calling hw_reset() ad > > dec_stop() and > > even dev_start() to make sure that HW is in a clean (known) state. > > > >>> > >>> But I agree this is device specific behavior, and can be managed by what > >>> device > >>> requires. > > > > Probably yes. > > > >>> > >>>> The patches included in this RFC, propose a mechanism which would help > >>>> with managing which drivers rely on forceful configuration restore. > >>>> Drivers could advertise if forceful configuration restore is needed > >>>> through `RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If this flag is > >>>> set, then the driver in question requires ethdev to forcefully restore > >>> configuration. > >>>> > >>> > >>> OK to use flag for it, but not sure about using 'dev_info->dev_flags' > >>> (RTE_ETH_DEV_*) for this, as this flag is shared with user and this is > >>> all dpdk > >>> internal. > >>> > >>> What about to have a dedicated flag for it? We can have a dedicated set > >>> of flag > >>> values for restore. > >> > >> Agreed. What do you think about the following? > > > > Instead of exposing that, can we probably make it transparent to the user > > and probably ethdev layer too? > > > > +1 to make it transparent to user, but not sure if we can make it > transparent to ethdev layer.
Just to be clear: Let say, using example from above: rte_eth_dev_start() - Call dev->dev_ops->dev_start() - Call dev->dev_ops->mac_addr_set() - apply default MAC address - Call dev->dev_ops->promiscuous_enable() - Call dev->dev_ops->allmulticast_enable() We probably can introduce ethdev internal function (still visible to PMDs) that would do last 3 steps: ethdev_replay_user_conf(...) - Call dev->dev_ops->mac_addr_set() - apply default MAC address - Call dev->dev_ops->promiscuous_enable() - Call dev->dev_ops->allmulticast_enable() And let PMD itself to decide does it needs to call it at dev_start() or not. So it will become: rte_eth_dev_start() - Call dev->dev_ops->dev_start() -Call ethdev_replay_user_conf(.) - Call dev->dev_ops->mac_addr_set() - apply default MAC address - Call dev->dev_ops->promiscuous_enable() -Call dev->dev_ops->allmulticast_enable() For PMDs that do need to restore user provided config And rte_eth_dev_start() - Call dev->dev_ops->dev_start() For those who do not. > Suggested 'internal_flag' in "struct rte_eth_dev_data" can be confusing > and open to interpretation what to use it for and by time become source > of defect. Yes, same thoughts. > Instead what do you think to have a separate, dedicated data struct for it? Hmm... not sure I understood you here... > > > Might be we can move this restoration code into the new ethdev helper > > function,(ethdevd_user_config_restore() or so) > > that PMD can invoke during its dev_start() if needed? > > > >> > >> #define RTE_ETH_DEV_INTERNAL_PROMISC_FORCE_RESTORE RTE_BIT32(0) > >> #define RTE_ETH_DEV_INTERNAL_ALLMULTI_FORCE_RESTORE RTE_BIT32(1) > >> #define RTE_ETH_DEV_INTERNAL_MAC_ADDR_FORCE_RESTORE RTE_BIT32(2) > >> > >> struct rte_eth_dev_data { > >> /* snip */ > >> > >> uint32_t dev_flags; > >> > >> /** > >> * Internal device capabilities, used only by ethdev library. > >> * Certain functionalities provided by the library might > >> enabled/disabled, > >> * based on driver exposing certain capabilities. > >> */ > >> uint32_t internal_flags; > >> > >> /* snip */ > >> }; > >> > >>> Also perhaps we have go into details what needs to be restored after > >>> 'stop' and > >>> what needs to be restored after 'reset' and use similar mechanism etc... > >> > >> I think we should look into that. > >> Any 'codification' of semantics between drivers and ethdev library is good > >> in my opinion. > >> > >> At least right now, ethdev does not change any configuration in 'stop' and > >> 'reset' from what I see. > >> But that's on library side only. > >> > >>>> This way, if we would conclude that it makes sense for .dev_start() to > >>>> handle all starting configuration aspects, we could track which drivers > >>>> still rely > >>> on configuration restore. > >>>> > >>>> Dariusz Sosnowski (4): > >>>> ethdev: rework config restore > >>>> ethdev: omit promiscuous config restore if not required > >>>> ethdev: omit all multicast config restore if not required > >>>> ethdev: omit MAC address restore if not required > >>>> > >>>> lib/ethdev/rte_ethdev.c | 39 ++++++++++++++++++++++++++++++++++----- > >>>> lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++ > >>>> 2 files changed, 52 insertions(+), 5 deletions(-) > >>>> > >>>> -- > >>>> 2.39.5 > >>>> > >