On 10/7/2024 10:27 AM, Konstantin Ananyev wrote: > > >>> External email: Use caution opening links or attachments >>> >>> >>> On 9/18/2024 10:21 AM, Dariusz Sosnowski wrote: >>>> Hi all, >>>> >>>> We have been working on optimizing the latency of calls to >>>> rte_eth_dev_start(), on ports spawned by mlx5 PMD. Most of the work >>>> requires changes in the implementation of >>>> .dev_start() PMD callback, but I also wanted to start a discussion >>>> regarding configuration restore. >>>> >>>> rte_eth_dev_start() does a few things on top of calling .dev_start() >>>> callback: >>>> >>>> - Before calling it: >>>> - eth_dev_mac_restore() - if device supports >>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR; >>>> - After calling it: >>>> - eth_dev_mac_restore() - if device does not support >>> RTE_ETH_DEV_NOLIVE_MAC_ADDR; >>>> - restore promiscuous config >>>> - restore all multicast config >>>> >>>> eth_dev_mac_restore() iterates over all known MAC addresses - stored >>>> in rte_eth_dev_data.mac_addrs array - and calls >>>> .mac_addr_set() and .mac_addr_add() callbacks to apply these MAC addresses. >>>> >>>> Promiscuous config restore checks if promiscuous mode is enabled or >>>> not, and calls .promiscuous_enable() or .promiscuous_disable() callback. >>>> >>>> All multicast config restore checks if all multicast mode is enabled >>>> or not, and calls .allmulticast_enable() or .allmulticast_disable() >>>> callback. >>>> >>>> Callbacks are called directly in all of these cases, to bypass the >>>> checks for applying the same configuration, which exist in relevant APIs. >>>> Checks are bypassed to force drivers to reapply the configuration. >>>> >>>> Let's consider what happens in the following sequence of API calls. >>>> >>>> 1. rte_eth_dev_configure() >>>> 2. rte_eth_tx_queue_setup() >>>> 3. rte_eth_rx_queue_setup() >>>> 4. rte_eth_promiscuous_enable() >>>> - Call dev->dev_ops->promiscuous_enable() >>>> - Stores promiscuous state in dev->data->promiscuous 5. >>>> rte_eth_allmulticast_enable() >>>> - Call dev->dev_ops->allmulticast_enable() >>>> - Stores allmulticast state in dev->data->allmulticast 6. >>>> rte_eth_dev_start() >>>> - Call dev->dev_ops->dev_start() >>>> - Call dev->dev_ops->mac_addr_set() - apply default MAC address >>>> - Call dev->dev_ops->promiscuous_enable() >>>> - Call dev->dev_ops->allmulticast_enable() >>>> >>>> Even though all configuration is available in dev->data after step 5, >>>> library forces reapplying this configuration in step 6. >>>> >>>> In mlx5 PMD case all relevant callbacks require communication with the >>>> kernel driver, to configure the device (mlx5 PMD must create/destroy >>>> new kernel flow rules and/or change netdev config). >>>> >>>> mlx5 PMD handles applying all configuration in .dev_start(), so the >>>> following forced callbacks force additional communication with the kernel. >>>> The >>> same configuration is applied multiple times. >>>> >>>> As an optimization, mlx5 PMD could check if a given configuration was >>>> applied, but this would duplicate the functionality of the library >>>> (for example rte_eth_promiscuous_enable() does not call the driver if >>>> dev->data->promiscuous is set). >>>> >>>> Question: Since all of the configuration is available before >>>> .dev_start() callback is called, why ethdev library does not expect >>>> .dev_start() to >>> take this configuration into account? >>>> In other words, why library has to reapply the configuration? >>>> >>>> I could not find any particular reason why configuration restore >>>> exists as part of the process (it was in the initial DPDK commit). >>>> >>> >>> My assumption is .dev_stop() cause these values reset in some devices, so >>> .dev_start() restores them back. >>> @Bruce or @Konstantin may remember the history. > > Yep, as I remember, at least some Intel PMDs calling hw_reset() ad dec_stop() > and > even dev_start() to make sure that HW is in a clean (known) state. > >>> >>> But I agree this is device specific behavior, and can be managed by what >>> device >>> requires. > > Probably yes. > >>> >>>> The patches included in this RFC, propose a mechanism which would help >>>> with managing which drivers rely on forceful configuration restore. >>>> Drivers could advertise if forceful configuration restore is needed >>>> through `RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If this flag is >>>> set, then the driver in question requires ethdev to forcefully restore >>> configuration. >>>> >>> >>> OK to use flag for it, but not sure about using 'dev_info->dev_flags' >>> (RTE_ETH_DEV_*) for this, as this flag is shared with user and this is all >>> dpdk >>> internal. >>> >>> What about to have a dedicated flag for it? We can have a dedicated set of >>> flag >>> values for restore. >> >> Agreed. What do you think about the following? > > Instead of exposing that, can we probably make it transparent to the user > and probably ethdev layer too? >
+1 to make it transparent to user, but not sure if we can make it transparent to ethdev layer. Suggested 'internal_flag' in "struct rte_eth_dev_data" can be confusing and open to interpretation what to use it for and by time become source of defect. Instead what do you think to have a separate, dedicated data struct for it? > Might be we can move this restoration code into the new ethdev helper > function,(ethdevd_user_config_restore() or so) > that PMD can invoke during its dev_start() if needed? > >> >> #define RTE_ETH_DEV_INTERNAL_PROMISC_FORCE_RESTORE RTE_BIT32(0) >> #define RTE_ETH_DEV_INTERNAL_ALLMULTI_FORCE_RESTORE RTE_BIT32(1) >> #define RTE_ETH_DEV_INTERNAL_MAC_ADDR_FORCE_RESTORE RTE_BIT32(2) >> >> struct rte_eth_dev_data { >> /* snip */ >> >> uint32_t dev_flags; >> >> /** >> * Internal device capabilities, used only by ethdev library. >> * Certain functionalities provided by the library might >> enabled/disabled, >> * based on driver exposing certain capabilities. >> */ >> uint32_t internal_flags; >> >> /* snip */ >> }; >> >>> Also perhaps we have go into details what needs to be restored after 'stop' >>> and >>> what needs to be restored after 'reset' and use similar mechanism etc... >> >> I think we should look into that. >> Any 'codification' of semantics between drivers and ethdev library is good >> in my opinion. >> >> At least right now, ethdev does not change any configuration in 'stop' and >> 'reset' from what I see. >> But that's on library side only. >> >>>> This way, if we would conclude that it makes sense for .dev_start() to >>>> handle all starting configuration aspects, we could track which drivers >>>> still rely >>> on configuration restore. >>>> >>>> Dariusz Sosnowski (4): >>>> ethdev: rework config restore >>>> ethdev: omit promiscuous config restore if not required >>>> ethdev: omit all multicast config restore if not required >>>> ethdev: omit MAC address restore if not required >>>> >>>> lib/ethdev/rte_ethdev.c | 39 ++++++++++++++++++++++++++++++++++----- >>>> lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++ >>>> 2 files changed, 52 insertions(+), 5 deletions(-) >>>> >>>> -- >>>> 2.39.5 >>>> >