> >> Currently, there is a hard limitation on the PMD power management
> >> support that only allows it to support a single queue per lcore. This is
> >> not ideal, as most DPDK use cases will poll multiple queues per core.
> >>
> >> The PMD power management mechanism relies on ethdev Rx callbacks, so it
> >> is very difficult to implement such support because callbacks are
> >> effectively stateless and have no visibility into what the other ethdev
> >> devices are doing. This places limitations on what we can do within the
> >> framework of Rx callbacks, but the basics of this implementation are as
> >> follows:
> >>
> >> - Replace per-queue structures with per-lcore ones, so that any device
> >>   polled from the same lcore can share data
> >> - Any queue that is going to be polled from a specific lcore has to be
> >>   added to the list of queues to poll, so that the callback is aware of
> >>   other queues being polled by the same lcore
> >> - Both the empty poll counter and the actual power saving mechanism are
> >>   shared between all queues polled on a particular lcore, and are only
> >>   activated when all queues in the list have been polled and determined
> >>   to have no traffic.
> >> - The limitation on UMWAIT-based polling is not removed, because UMWAIT
> >>   is incapable of monitoring more than one address.
> >>
> >> Also, while we're at it, update and improve the docs.
> >>
> >> Signed-off-by: Anatoly Burakov <anatoly.bura...@intel.com>
> >> ---
> >>
> >> Notes:
> >>     v6:
> >>     - Track each individual queue sleep status (Konstantin)
> >>     - Fix segfault (Dave)
> >>
> >>     v5:
> >>     - Remove the "power save queue" API and replace it with mechanism
> >>       suggested by Konstantin
> >>
> >>     v3:
> >>     - Move the list of supported NICs to NIC feature table
> >>
> >>     v2:
> >>     - Use a TAILQ for queues instead of a static array
> >>     - Address feedback from Konstantin
> >>     - Add additional checks for stopped queues
> >>
> > <snip>
> >
> ....
> >> +static inline void
> >> +queue_reset(struct pmd_core_cfg *cfg, struct queue_list_entry *qcfg)
> >> +{
> >> +	const bool is_ready_to_sleep = qcfg->n_empty_polls > EMPTYPOLL_MAX;
> >> +
> >> +	/* reset empty poll counter for this queue */
> >> +	qcfg->n_empty_polls = 0;
> >> +	/* reset the queue sleep counter as well */
> >> +	qcfg->n_sleeps = 0;
> >> +	/* remove the queue from the list of queues ready to sleep */
> >> +	if (is_ready_to_sleep)
> >> +		cfg->n_queues_ready_to_sleep--;
> >> +	/*
> >> +	 * no need to change the lcore sleep target counter because this lcore
> >> +	 * will reach n_sleeps anyway, and the other queues are already counted,
> >> +	 * so there's no need to do anything else.
> >> +	 */
> >> +}
> >> +
> >> +static inline bool
> >> +queue_can_sleep(struct pmd_core_cfg *cfg, struct queue_list_entry *qcfg)
> >> +{
> >> +	/* this function is called - that means we have an empty poll */
> >> +	qcfg->n_empty_polls++;
> >> +
> >> +	/* if we haven't reached the threshold for empty polls, we can't sleep */
> >> +	if (qcfg->n_empty_polls <= EMPTYPOLL_MAX)
> >> +		return false;
> >> +
> >> +	/*
> >> +	 * we've reached a point where we are able to sleep, but we still need
> >> +	 * to check if this queue has already been marked for sleeping.
> >> +	 */
> >> +	if (qcfg->n_sleeps == cfg->sleep_target)
> >> +		return true;
> >> +
> >> +	/* mark this queue as ready for sleep */
> >> +	qcfg->n_sleeps = cfg->sleep_target;
> >> +	cfg->n_queues_ready_to_sleep++;
> >
> > So, assuming there is no incoming traffic, should it be:
> > 1) poll_all_queues(times=EMPTYPOLL_MAX); sleep; poll_all_queues(times=1); sleep;
> >    poll_all_queues(times=1); sleep; ...
> > OR
> > 2) poll_all_queues(times=EMPTYPOLL_MAX); sleep; poll_all_queues(times=EMPTYPOLL_MAX); sleep;
> >    poll_all_queues(times=EMPTYPOLL_MAX); sleep; ...
> > ?
> >
> > My initial thought was 2) but might be the intention is 1)?
>
> The intent is 1), not 2). There's no need to wait for more empty polls
> once we pass the threshold - we keep sleeping until there's traffic.
>
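To make the above concrete, here is a minimal sketch of the lcore-level decision, assuming the fields visible in the diff (sleep_target, n_queues_ready_to_sleep, n_empty_polls, n_sleeps); the lcore_can_sleep() helper, the n_queues field and the EMPTYPOLL_MAX value used here are illustrative assumptions rather than the patch's actual code:

/* Illustrative sketch only - not taken from the patch. */
#include <stdbool.h>
#include <stdint.h>
#include <sys/queue.h>

#define EMPTYPOLL_MAX 512 /* assumed threshold value */

struct queue_list_entry {
	TAILQ_ENTRY(queue_list_entry) next;
	uint64_t n_empty_polls; /* consecutive empty polls on this queue */
	uint64_t n_sleeps;      /* last sleep round this queue signed up for */
};

struct pmd_core_cfg {
	TAILQ_HEAD(queue_list_head, queue_list_entry) head;
	uint64_t n_queues;                /* number of queues polled on this lcore */
	uint64_t n_queues_ready_to_sleep; /* queues past EMPTYPOLL_MAX */
	uint64_t sleep_target;            /* bumped once per lcore-wide sleep */
};

/*
 * The lcore may sleep only once *every* queue it polls has reported enough
 * consecutive empty polls. Bumping sleep_target afterwards is what produces
 * scheme 1): sleeping never resets n_empty_polls, so on the next iteration
 * each queue_can_sleep() call sees n_sleeps != sleep_target, re-registers
 * the queue, and the lcore can sleep again without waiting for another
 * EMPTYPOLL_MAX empty polls. Only real traffic (queue_reset()) ends this.
 */
static inline bool
lcore_can_sleep(struct pmd_core_cfg *cfg)
{
	if (cfg->n_queues_ready_to_sleep < cfg->n_queues)
		return false;

	/* start a new sleep round; queues will re-register for it */
	cfg->sleep_target++;
	cfg->n_queues_ready_to_sleep = 0;
	return true;
}

Under these assumptions, the lcore does exactly one full pass over its queues between sleeps once the initial threshold has been reached, which matches scheme 1) above.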
Ok, then:

Acked-by: Konstantin Ananyev <konstantin.anan...@intel.com>

Probably worth putting an extra explanation here or in the doc, to help people avoid wrong assumptions 😉
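As a starting point for that extra documentation, a hedged usage sketch of what the change means for applications (the port and queue numbers and the helper are made up; only rte_power_ethdev_pmgmt_queue_enable() and the mode constants are existing DPDK interfaces):

/* Usage sketch only - port/queue numbers are made up, error handling trimmed. */
#include <stdint.h>
#include <rte_power_pmd_mgmt.h>

static int
enable_pmgmt_on_lcore(unsigned int lcore_id)
{
	const uint16_t port_id = 0;
	uint16_t qid;
	int ret;

	/*
	 * With this patch, several queues polled from the same lcore can share
	 * one power-management state. MONITOR mode remains limited to a single
	 * queue per lcore (UMWAIT can only watch one address), so a multi-queue
	 * lcore should use PAUSE or SCALE.
	 */
	for (qid = 0; qid < 4; qid++) {
		ret = rte_power_ethdev_pmgmt_queue_enable(lcore_id, port_id,
				qid, RTE_POWER_MGMT_TYPE_PAUSE);
		if (ret < 0)
			return ret;
	}

	/*
	 * Sleeping starts only after all four queues have seen EMPTYPOLL_MAX
	 * consecutive empty polls; from then on the lcore sleeps on every
	 * empty iteration (scheme 1 above) until traffic arrives again.
	 */
	return 0;
}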