On 4/19/2023 10:54 AM, Sivaprasad Tummala wrote:
Add eventdev support to enable power saving when no events
are arriving. It is based on counting the number of empty
polls and, when the number reaches a certain threshold, entering
an architecture-defined optimized power state that will either wait
until a TSC timestamp expires, or when events arrive.
This API mandates a core-to-single-port mapping (i.e. one core polling
multiple ports of event device is not supported). This should be ok
as the general use case will have one CPU core using one port to
enqueue/dequeue events from an eventdev.
This design is using Eventdev PMD Dequeue callbacks.
1. MWAITX/MONITORX:
When a certain threshold of empty polls is reached, the core will go
into a power optimized sleep while waiting on an address of next RX
descriptor to be written to.
2. Pause instruction
This method uses the pause instruction to avoid busy polling.
Signed-off-by: Sivaprasad Tummala <sivaprasad.tumm...@amd.com>
---
Hi, few comments
+
+static uint16_t
+evt_clb_pause(uint8_t dev_id __rte_unused, uint8_t port_id __rte_unused,
+ struct rte_event *ev __rte_unused,
+ uint16_t nb_events, void *arg)
+{
+ const unsigned int lcore = rte_lcore_id();
+ struct queue_list_entry *queue_conf = arg;
+ struct pmd_core_cfg *lcore_conf;
+ const bool empty = nb_events == 0;
+ uint32_t pause_duration = rte_power_pmd_mgmt_get_pause_duration();
+
+ lcore_conf = &lcore_cfgs[lcore];
+
+ if (likely(!empty))
+ /* early exit */
+ queue_reset(lcore_conf, queue_conf);
+ else {
+ /* can this queue sleep? */
+ if (!queue_can_sleep(lcore_conf, queue_conf))
+ return nb_events;
+
+ /* can this lcore sleep? */
+ if (!lcore_can_sleep(lcore_conf))
+ return nb_events;
+
+ uint64_t i;
+ for (i = 0; i < global_data.pause_per_us * pause_duration; i++)
+ rte_pause();
Why not add support for TPAUSE? This is generic code, ethdev code path
supports it, and most of this function is duplicated from ethdev
implementation. I wish we could unify them somehow, but I can't think of
an elegant way to do it off the top of my head.
+
+ /* we need this in various places */
+ rte_cpu_get_intrinsics_support(&global_data.intrinsics_support);
+
+ switch (mode) {
+ case RTE_POWER_MGMT_TYPE_MONITOR:
+ /* check if we can add a new port */
+ ret = check_evt_monitor(lcore_cfg, &qdata);
+ if (ret < 0)
+ goto end;
+
+ clb = evt_clb_umwait;
+ break;
+ case RTE_POWER_MGMT_TYPE_PAUSE:
+ /* figure out various time-to-tsc conversions */
+ if (global_data.tsc_per_us == 0)
+ calc_tsc();
+
+ clb = evt_clb_pause;
+ break;
+ default:
+ RTE_LOG(DEBUG, POWER, "Invalid power management type\n");
Technically, if we specify "scale" here, the power management scheme
would be *unsupported* rather than *invalid*, and thus should return
-ENOTSUP rather than -EINVAL.
Also, since this is generic code, theoretically this code could in fact
support SCALE mode? Would it make sense for eventdev to use that scheme?
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice.
+ *
+ * Disable power management on a specified Ethernet device Rx queue and lcore.
+ *
+ * @note This function is not thread-safe.
+ *
+ * @warning This function must be called when all affected Ethernet queues are
+ * stopped and no Rx/Tx is in progress!
+ *
+ * @param lcore_id
+ * The lcore the Rx queue is polled from.
+ * @param dev_id
+ * The identifier of the device.
+ * @param port_id
+ * Event port identifier of the Event device.
+ * @return
+ * 0 on success
+ * <0 on error
+ */
+__rte_experimental
+int
+rte_power_eventdev_pmgmt_port_disable(unsigned int lcore_id,
+ uint8_t dev_id, uint8_t port_id);
It would've been nice if we didn't have to reimplement the same logic
for every new device type, but seeing how we do not have any unified
driver API, I don't have any bright ideas on how to do it better.
--
Thanks,
Anatoly