> >>>>>> We have been working on optimizing the latency of calls to > >>>>>> rte_eth_dev_start(), on ports spawned by mlx5 PMD. Most of the work > >>>>>> requires changes in the implementation of > >>>>>> .dev_start() PMD callback, but I also wanted to start a discussion > >>>>>> regarding configuration restore. > >>>>>> > >>>>>> rte_eth_dev_start() does a few things on top of calling .dev_start() > >>>>>> callback: > >>>>>> > >>>>>> - Before calling it: > >>>>>> - eth_dev_mac_restore() - if device supports > >>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR; > >>>>>> - After calling it: > >>>>>> - eth_dev_mac_restore() - if device does not support > >>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR; > >>>>>> - restore promiscuous config > >>>>>> - restore all multicast config > >>>>>> > >>>>>> eth_dev_mac_restore() iterates over all known MAC addresses - stored > >>>>>> in rte_eth_dev_data.mac_addrs array - and calls > >>>>>> .mac_addr_set() and .mac_addr_add() callbacks to apply these MAC > >>>>>> addresses. > >>>>>> > >>>>>> Promiscuous config restore checks if promiscuous mode is enabled or > >>>>>> not, and calls .promiscuous_enable() or .promiscuous_disable() > >>>>>> callback. > >>>>>> > >>>>>> All multicast config restore checks if all multicast mode is enabled > >>>>>> or not, and calls .allmulticast_enable() or .allmulticast_disable() > >>>>>> callback. > >>>>>> > >>>>>> Callbacks are called directly in all of these cases, to bypass the > >>>>>> checks for applying the same configuration, which exist in relevant > >>>>>> APIs. > >>>>>> Checks are bypassed to force drivers to reapply the configuration. > >>>>>> > >>>>>> Let's consider what happens in the following sequence of API calls. > >>>>>> > >>>>>> 1. rte_eth_dev_configure() > >>>>>> 2. rte_eth_tx_queue_setup() > >>>>>> 3. rte_eth_rx_queue_setup() > >>>>>> 4. rte_eth_promiscuous_enable() > >>>>>> - Call dev->dev_ops->promiscuous_enable() > >>>>>> - Stores promiscuous state in dev->data->promiscuous 5. > >>>>>> rte_eth_allmulticast_enable() > >>>>>> - Call dev->dev_ops->allmulticast_enable() > >>>>>> - Stores allmulticast state in dev->data->allmulticast 6. > >>>>>> rte_eth_dev_start() > >>>>>> - Call dev->dev_ops->dev_start() > >>>>>> - Call dev->dev_ops->mac_addr_set() - apply default MAC address > >>>>>> - Call dev->dev_ops->promiscuous_enable() > >>>>>> - Call dev->dev_ops->allmulticast_enable() > >>>>>> > >>>>>> Even though all configuration is available in dev->data after step 5, > >>>>>> library forces reapplying this configuration in step 6. > >>>>>> > >>>>>> In mlx5 PMD case all relevant callbacks require communication with the > >>>>>> kernel driver, to configure the device (mlx5 PMD must create/destroy > >>>>>> new kernel flow rules and/or change netdev config). > >>>>>> > >>>>>> mlx5 PMD handles applying all configuration in .dev_start(), so the > >>>>>> following forced callbacks force additional communication with the > >>>>>> kernel. The > >>>>> same configuration is applied multiple times. > >>>>>> > >>>>>> As an optimization, mlx5 PMD could check if a given configuration was > >>>>>> applied, but this would duplicate the functionality of the library > >>>>>> (for example rte_eth_promiscuous_enable() does not call the driver if > >>>>>> dev->data->promiscuous is set). > >>>>>> > >>>>>> Question: Since all of the configuration is available before > >>>>>> .dev_start() callback is called, why ethdev library does not expect > >>>>>> .dev_start() to > >>>>> take this configuration into account? > >>>>>> In other words, why library has to reapply the configuration? > >>>>>> > >>>>>> I could not find any particular reason why configuration restore > >>>>>> exists as part of the process (it was in the initial DPDK commit). > >>>>>> > >>>>> > >>>>> My assumption is .dev_stop() cause these values reset in some devices, > >>>>> so > >>>>> .dev_start() restores them back. > >>>>> @Bruce or @Konstantin may remember the history. > >>> > >>> Yep, as I remember, at least some Intel PMDs calling hw_reset() ad > >>> dec_stop() and > >>> even dev_start() to make sure that HW is in a clean (known) state. > >>> > >>>>> > >>>>> But I agree this is device specific behavior, and can be managed by > >>>>> what device > >>>>> requires. > >>> > >>> Probably yes. > >>> > >>>>> > >>>>>> The patches included in this RFC, propose a mechanism which would help > >>>>>> with managing which drivers rely on forceful configuration restore. > >>>>>> Drivers could advertise if forceful configuration restore is needed > >>>>>> through `RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If this flag is > >>>>>> set, then the driver in question requires ethdev to forcefully restore > >>>>> configuration. > >>>>>> > >>>>> > >>>>> OK to use flag for it, but not sure about using 'dev_info->dev_flags' > >>>>> (RTE_ETH_DEV_*) for this, as this flag is shared with user and this is > >>>>> all dpdk > >>>>> internal. > >>>>> > >>>>> What about to have a dedicated flag for it? We can have a dedicated set > >>>>> of flag > >>>>> values for restore. > >>>> > >>>> Agreed. What do you think about the following? > >>> > >>> Instead of exposing that, can we probably make it transparent to the user > >>> and probably ethdev layer too? > >>> > >> > >> +1 to make it transparent to user, but not sure if we can make it > >> transparent to ethdev layer. > > > > Just to be clear: > > Let say, using example from above: > > > > rte_eth_dev_start() > > - Call dev->dev_ops->dev_start() > > - Call dev->dev_ops->mac_addr_set() - apply default MAC address > > - Call dev->dev_ops->promiscuous_enable() > > - Call dev->dev_ops->allmulticast_enable() > > > > We probably can introduce ethdev internal function (still visible to PMDs) > > that would do last 3 steps: > > ethdev_replay_user_conf(...) > > - Call dev->dev_ops->mac_addr_set() - apply default MAC address > > - Call dev->dev_ops->promiscuous_enable() > > - Call dev->dev_ops->allmulticast_enable() > > > > And let PMD itself to decide does it needs to call it at dev_start() or not. > > So it will become: > > rte_eth_dev_start() > > - Call dev->dev_ops->dev_start() > > -Call ethdev_replay_user_conf(.) > > - Call dev->dev_ops->mac_addr_set() - apply default MAC > > address > > - Call dev->dev_ops->promiscuous_enable() > > -Call dev->dev_ops->allmulticast_enable() > > > > For PMDs that do need to restore user provided config > > And > > rte_eth_dev_start() > > - Call dev->dev_ops->dev_start() > > > > For those who do not. > > > > OK, got it what you mean. > Pushing restore functionality to PMDs works, but this may be doing > redundant work on each PMD.
Yes, every PMD will have to be updated with that way. > Instead Dariusz suggests PMD to provide a flag to ehtdev to what to > restore and common code in ethdev does the work. > My below dedicated data struct comment is to have this flag in a new > struct, overall like following: > > rte_eth_dev_start() > - Call dev->dev_ops->dev_start() > - Call dev->dev_ops->get_restore_flags(ethdev, RTE_ETH_START, &flags) > - if (flags & MAC) dev->dev_ops->mac_addr_set() > - if (flags & PROMISC) dev->dev_ops->promiscuous_enable() > - ... > > So PMDs only will provide what to restore with an internal API and > common ethdev layer will restore it. > If no restore required PMD may not implement .get_restore_flags() at all. Makes sense to me... Probably less disruptive would be to have a dev_ops function that would tell what settings don't need to be replayed by ehtdev layer. > Additionally, RTE_ETH_START, RTE_ETH_RESET etc flag can be provided to > internal API to get what to restore in different states... > > >> Suggested 'internal_flag' in "struct rte_eth_dev_data" can be confusing > >> and open to interpretation what to use it for and by time become source > >> of defect. > > > > Yes, same thoughts. > > > >> Instead what do you think to have a separate, dedicated data struct for it? > > > > Hmm... not sure I understood you here... > > > >> > >>> Might be we can move this restoration code into the new ethdev helper > >>> function,(ethdevd_user_config_restore() or so) > >>> that PMD can invoke during its dev_start() if needed? > >>> > >>>> > >>>> #define RTE_ETH_DEV_INTERNAL_PROMISC_FORCE_RESTORE RTE_BIT32(0) > >>>> #define RTE_ETH_DEV_INTERNAL_ALLMULTI_FORCE_RESTORE RTE_BIT32(1) > >>>> #define RTE_ETH_DEV_INTERNAL_MAC_ADDR_FORCE_RESTORE RTE_BIT32(2) > >>>> > >>>> struct rte_eth_dev_data { > >>>> /* snip */ > >>>> > >>>> uint32_t dev_flags; > >>>> > >>>> /** > >>>> * Internal device capabilities, used only by ethdev library. > >>>> * Certain functionalities provided by the library might > >>>> enabled/disabled, > >>>> * based on driver exposing certain capabilities. > >>>> */ > >>>> uint32_t internal_flags; > >>>> > >>>> /* snip */ > >>>> }; > >>>> > >>>>> Also perhaps we have go into details what needs to be restored after > >>>>> 'stop' and > >>>>> what needs to be restored after 'reset' and use similar mechanism etc... > >>>> > >>>> I think we should look into that. > >>>> Any 'codification' of semantics between drivers and ethdev library is > >>>> good in my opinion. > >>>> > >>>> At least right now, ethdev does not change any configuration in 'stop' > >>>> and 'reset' from what I see. > >>>> But that's on library side only. > >>>> > >>>>>> This way, if we would conclude that it makes sense for .dev_start() to > >>>>>> handle all starting configuration aspects, we could track which > >>>>>> drivers still rely > >>>>> on configuration restore. > >>>>>> > >>>>>> Dariusz Sosnowski (4): > >>>>>> ethdev: rework config restore > >>>>>> ethdev: omit promiscuous config restore if not required > >>>>>> ethdev: omit all multicast config restore if not required > >>>>>> ethdev: omit MAC address restore if not required > >>>>>> > >>>>>> lib/ethdev/rte_ethdev.c | 39 ++++++++++++++++++++++++++++++++++----- > >>>>>> lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++ > >>>>>> 2 files changed, 52 insertions(+), 5 deletions(-) > >>>>>> > >>>>>> -- > >>>>>> 2.39.5 > >>>>>> > >>> > >