> -----Original Message-----
> From: Ma, Liang J
> Sent: Friday, August 31, 2018 11:04 PM
> To: Hunt, David <david.h...@intel.com>
> Cc: dev@dpdk.org; Yao, Lei A <lei.a....@intel.com>; Nicolau, Radu
> <radu.nico...@intel.com>; Burakov, Anatoly <anatoly.bura...@intel.com>;
> Geary, John <john.ge...@intel.com>; Ma, Liang J <liang.j...@intel.com>
> Subject: [PATCH v6 1/4] lib/librte_power: traffic pattern aware power control
>
> 1. Abstract
>
> For packet processing workloads such as DPDK polling is continuous.
> This means CPU cores always show 100% busy independent of how much
> work
> those cores are doing. It is critical to accurately determine how busy
> a core is hugely important for the following reasons:
>
> * No indication of overload conditions
>
> * User do not know how much real load is on a system meaning resulted in
> wasted energy as no power management is utilized
>
> Compared to the original l3fwd-power design, instead of going to sleep
> after detecting an empty poll, the new mechanism just lowers the core
> frequency. As a result, the application does not stop polling the device,
> which leads to improved handling of bursts of traffic.
>
> When the system become busy, the empty poll mechanism can also increase
> the core
> frequency (including turbo) to do best effort for intensive traffic. This
> gives
> us more flexible and balanced traffic awareness over the standard l3fwd-
> power
> application.
>
> 2. Proposed solution
>
> The proposed solution focuses on how many times empty polls are executed.
> The less
> the number of empty polls, means current core is busy with processing
> workload,
> therefore, the higher frequency is needed. The high empty poll number
> indicates
> the current core not doing any real work therefore, we can lower the
> frequency
> to safe power.
>
> In the current implementation, each core has 1 empty-poll counter which
> assume
> 1 core is dedicated to 1 queue. This will need to be expanded in the future
> to support multiple queues per core.
>
> 2.1 Power state definition:
>
> LOW: Not currently used, reserved for future use.
>
> MED: the frequency is used to process modest traffic workload.
>
> HIGH: the frequency is used to process busy traffic workload.
>
> 2.2 There are two phases to establish the power management system:
>
> a.Initialization/Training phase. The training phase is necessary
> in order to figure out the system polling baseline numbers from
> idle to busy. The highest poll count will be during idle, where all
> polls are empty. These poll counts will be different between
> systems due to the many possible processor micro-arch, cache
> and device configurations, hence the training phase.
> In the training phase, traffic is blocked so the training algorithm
> can average the empty-poll numbers for the LOW, MED and
> HIGH power states in order to create a baseline.
> The core's counter are collected every 10ms, and the Training
> phase will take 2 seconds.
>
> b.Normal phase. When the training phase is complete, traffic is
> started. The run-time poll counts are compared with the
> baseline and the decision will be taken to move to MED power
> state or HIGH power state. The counters are calculated every 10ms.
>
> 3. Proposed API
>
> 1. rte_power_empty_poll_stat_init(void);
> which is used to initialize the power management system.
>
> 2. rte_power_empty_poll_stat_free(void);
> which is used to free the resource hold by power management system.
>
> 3. rte_power_empty_poll_stat_update(unsigned int lcore_id);
> which is used to update specific core empty poll counter, not thread safe
>
> 4. rte_power_poll_stat_update(unsigned int lcore_id, uint8_t nb_pkt);
> which is used to update specific core valid poll counter, not thread safe
>
> 5. rte_power_empty_poll_stat_fetch(unsigned int lcore_id);
> which is used to get specific core empty poll counter.
>
> 6. rte_power_poll_stat_fetch(unsigned int lcore_id);
> which is used to get specific core valid poll counter.
>
> 7. rte_empty_poll_detection(void);
> which is used to detect empty poll state changes.
>
> ChangeLog:
> v2: fix some coding style issues
> v3: rename the filename, API name.
> v4: no change
> v5: no change
> v6: re-work the code layout, update API
>
> Signed-off-by: Liang Ma <liang.j...@intel.com>
Reviewed-by: Lei Yao <lei.a....@intel.com>
> ---
> lib/librte_power/Makefile | 6 +-
> lib/librte_power/meson.build | 5 +-
> lib/librte_power/rte_power_empty_poll.c | 500
> ++++++++++++++++++++++++++++++++
> lib/librte_power/rte_power_empty_poll.h | 205 +++++++++++++
> lib/librte_power/rte_power_version.map | 13 +
> 5 files changed, 725 insertions(+), 4 deletions(-)
> create mode 100644 lib/librte_power/rte_power_empty_poll.c
> create mode 100644 lib/librte_power/rte_power_empty_poll.h
>
> diff --git a/lib/librte_power/Makefile b/lib/librte_power/Makefile
> index 6f85e88..a8f1301 100644
> --- a/lib/librte_power/Makefile
> +++ b/lib/librte_power/Makefile
> @@ -6,8 +6,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> # library name
> LIB = librte_power.a
>
> +CFLAGS += -DALLOW_EXPERIMENTAL_API
> CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3 -fno-strict-aliasing
> -LDLIBS += -lrte_eal
> +LDLIBS += -lrte_eal -lrte_timer
>
> EXPORT_MAP := rte_power_version.map
>
> @@ -16,8 +17,9 @@ LIBABIVER := 1
> # all source are stored in SRCS-y
> SRCS-$(CONFIG_RTE_LIBRTE_POWER) := rte_power.c
> power_acpi_cpufreq.c
> SRCS-$(CONFIG_RTE_LIBRTE_POWER) += power_kvm_vm.c
> guest_channel.c
> +SRCS-$(CONFIG_RTE_LIBRTE_POWER) += rte_power_empty_poll.c
>
> # install this header file
> -SYMLINK-$(CONFIG_RTE_LIBRTE_POWER)-include := rte_power.h
> +SYMLINK-$(CONFIG_RTE_LIBRTE_POWER)-include := rte_power.h
> rte_power_empty_poll.h
>
> include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_power/meson.build b/lib/librte_power/meson.build
> index 253173f..63957eb 100644
> --- a/lib/librte_power/meson.build
> +++ b/lib/librte_power/meson.build
> @@ -5,5 +5,6 @@ if host_machine.system() != 'linux'
> build = false
> endif
> sources = files('rte_power.c', 'power_acpi_cpufreq.c',
> - 'power_kvm_vm.c', 'guest_channel.c')
> -headers = files('rte_power.h')
> + 'power_kvm_vm.c', 'guest_channel.c',
> + 'rte_power_empty_poll.c')
> +headers = files('rte_power.h','rte_power_empty_poll.h')
> diff --git a/lib/librte_power/rte_power_empty_poll.c
> b/lib/librte_power/rte_power_empty_poll.c
> new file mode 100644
> index 0000000..3dac654
> --- /dev/null
> +++ b/lib/librte_power/rte_power_empty_poll.c
> @@ -0,0 +1,500 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2010-2018 Intel Corporation
> + */
> +
> +#include <string.h>
> +
> +#include <rte_lcore.h>
> +#include <rte_cycles.h>
> +#include <rte_atomic.h>
> +#include <rte_malloc.h>
> +
> +#include "rte_power.h"
> +#include "rte_power_empty_poll.h"
> +
> +#define INTERVALS_PER_SECOND 100 /* (10ms) */
> +#define SECONDS_TO_TRAIN_FOR 2
> +#define DEFAULT_MED_TO_HIGH_PERCENT_THRESHOLD 70
> +#define DEFAULT_HIGH_TO_MED_PERCENT_THRESHOLD 30
> +#define DEFAULT_CYCLES_PER_PACKET 800
> +
> +static struct ep_params *ep_params;
> +static uint32_t med_to_high_threshold =
> DEFAULT_MED_TO_HIGH_PERCENT_THRESHOLD;
> +static uint32_t high_to_med_threshold =
> DEFAULT_HIGH_TO_MED_PERCENT_THRESHOLD;
> +
> +static uint32_t avail_freqs[RTE_MAX_LCORE][NUM_FREQS];
> +
> +static uint32_t total_avail_freqs[RTE_MAX_LCORE];
> +
> +static uint32_t freq_index[NUM_FREQ];
> +
> +static uint32_t
> +get_freq_index(enum freq_val index)
> +{
> + return freq_index[index];
> +}
> +
> +
> +static int
> +set_power_freq(int lcore_id, enum freq_val freq, bool specific_freq)
> +{
> + int err = 0;
> + uint32_t power_freq_index;
> + if (!specific_freq)
> + power_freq_index = get_freq_index(freq);
> + else
> + power_freq_index = freq;
> +
> + err = rte_power_set_freq(lcore_id, power_freq_index);
> +
> + return err;
> +}
> +
> +
> +static inline void __attribute__((always_inline))
> +exit_training_state(struct priority_worker *poll_stats)
> +{
> + RTE_SET_USED(poll_stats);
> +}
> +
> +static inline void __attribute__((always_inline))
> +enter_training_state(struct priority_worker *poll_stats)
> +{
> + poll_stats->iter_counter = 0;
> + poll_stats->cur_freq = LOW;
> + poll_stats->queue_state = TRAINING;
> +}
> +
> +static inline void __attribute__((always_inline))
> +enter_normal_state(struct priority_worker *poll_stats)
> +{
> + /* Clear the averages arrays and strs */
> + memset(poll_stats->edpi_av, 0, sizeof(poll_stats->edpi_av));
> + poll_stats->ec = 0;
> + memset(poll_stats->ppi_av, 0, sizeof(poll_stats->ppi_av));
> + poll_stats->pc = 0;
> +
> + poll_stats->cur_freq = MED;
> + poll_stats->iter_counter = 0;
> + poll_stats->threshold_ctr = 0;
> + poll_stats->queue_state = MED_NORMAL;
> + set_power_freq(poll_stats->lcore_id, MED, false);
> +
> + /* Try here */
> + poll_stats->thresh[MED].threshold_percent =
> med_to_high_threshold;
> + poll_stats->thresh[HGH].threshold_percent =
> high_to_med_threshold;
> +}
> +
> +static inline void __attribute__((always_inline))
> +enter_busy_state(struct priority_worker *poll_stats)
> +{
> + memset(poll_stats->edpi_av, 0, sizeof(poll_stats->edpi_av));
> + poll_stats->ec = 0;
> + memset(poll_stats->ppi_av, 0, sizeof(poll_stats->ppi_av));
> + poll_stats->pc = 0;
> +
> + poll_stats->cur_freq = HGH;
> + poll_stats->iter_counter = 0;
> + poll_stats->threshold_ctr = 0;
> + poll_stats->queue_state = HGH_BUSY;
> + set_power_freq(poll_stats->lcore_id, HGH, false);
> +}
> +
> +static inline void __attribute__((always_inline))
> +enter_purge_state(struct priority_worker *poll_stats)
> +{
> + poll_stats->iter_counter = 0;
> + poll_stats->queue_state = LOW_PURGE;
> +}
> +
> +static inline void __attribute__((always_inline))
> +set_state(struct priority_worker *poll_stats,
> + enum queue_state new_state)
> +{
> + enum queue_state old_state = poll_stats->queue_state;
> + if (old_state != new_state) {
> +
> + /* Call any old state exit functions */
> + if (old_state == TRAINING)
> + exit_training_state(poll_stats);
> +
> + /* Call any new state entry functions */
> + if (new_state == TRAINING)
> + enter_training_state(poll_stats);
> + if (new_state == MED_NORMAL)
> + enter_normal_state(poll_stats);
> + if (new_state == HGH_BUSY)
> + enter_busy_state(poll_stats);
> + if (new_state == LOW_PURGE)
> + enter_purge_state(poll_stats);
> + }
> +}
> +
> +
> +static void
> +update_training_stats(struct priority_worker *poll_stats,
> + uint32_t freq,
> + bool specific_freq,
> + uint32_t max_train_iter)
> +{
> + RTE_SET_USED(specific_freq);
> +
> + char pfi_str[32];
> + uint64_t p0_empty_deq;
> +
> + sprintf(pfi_str, "%02d", freq);
> +
> + if (poll_stats->cur_freq == freq &&
> + poll_stats->thresh[freq].trained == false) {
> + if (poll_stats->thresh[freq].cur_train_iter == 0) {
> +
> + set_power_freq(poll_stats->lcore_id,
> + freq, specific_freq);
> +
> + poll_stats->empty_dequeues_prev =
> + poll_stats->empty_dequeues;
> +
> + poll_stats->thresh[freq].cur_train_iter++;
> +
> + return;
> + } else if (poll_stats->thresh[freq].cur_train_iter
> + <= max_train_iter) {
> +
> + p0_empty_deq = poll_stats->empty_dequeues -
> + poll_stats->empty_dequeues_prev;
> +
> + poll_stats->empty_dequeues_prev =
> + poll_stats->empty_dequeues;
> +
> + poll_stats->thresh[freq].base_edpi +=
> p0_empty_deq;
> + poll_stats->thresh[freq].cur_train_iter++;
> +
> + } else {
> + if (poll_stats->thresh[freq].trained == false) {
> + poll_stats->thresh[freq].base_edpi =
> + poll_stats->thresh[freq].base_edpi /
> + max_train_iter;
> +
> + /* Add on a factor of 0.05%, this should
> remove any */
> + /* false negatives when the system is 0%
> busy */
> + poll_stats->thresh[freq].base_edpi +=
> + poll_stats->thresh[freq].base_edpi /
> 2000;
> +
> + poll_stats->thresh[freq].trained = true;
> + poll_stats->cur_freq++;
> +
> + }
> + }
> + }
> +}
> +
> +static inline uint32_t __attribute__((always_inline))
> +update_stats(struct priority_worker *poll_stats)
> +{
> + uint64_t tot_edpi = 0, tot_ppi = 0;
> + uint32_t j, percent;
> +
> + struct priority_worker *s = poll_stats;
> +
> + uint64_t cur_edpi = s->empty_dequeues - s-
> >empty_dequeues_prev;
> +
> + s->empty_dequeues_prev = s->empty_dequeues;
> +
> + uint64_t ppi = s->num_dequeue_pkts - s-
> >num_dequeue_pkts_prev;
> +
> + s->num_dequeue_pkts_prev = s->num_dequeue_pkts;
> +
> + if (s->thresh[s->cur_freq].base_edpi < cur_edpi) {
> + RTE_LOG(DEBUG, POWER, "cur_edpi is too large "
> + "cur edpi %ld "
> + "base epdi %ld\n",
> + cur_edpi,
> + s->thresh[s->cur_freq].base_edpi);
> + /* Value to make us fail need debug log*/
> + return 1000UL;
> + }
> +
> + s->edpi_av[s->ec++ % BINS_AV] = cur_edpi;
> + s->ppi_av[s->pc++ % BINS_AV] = ppi;
> +
> + for (j = 0; j < BINS_AV; j++) {
> + tot_edpi += s->edpi_av[j];
> + tot_ppi += s->ppi_av[j];
> + }
> +
> + tot_edpi = tot_edpi / BINS_AV;
> +
> + percent = 100 - (uint32_t)(((float)tot_edpi /
> + (float)s->thresh[s->cur_freq].base_edpi) * 100);
> +
> + return (uint32_t)percent;
> +}
> +
> +
> +static inline void __attribute__((always_inline))
> +update_stats_normal(struct priority_worker *poll_stats)
> +{
> + uint32_t percent;
> +
> + if (poll_stats->thresh[poll_stats->cur_freq].base_edpi == 0)
> + return;
> +
> + percent = update_stats(poll_stats);
> +
> + if (percent > 100)
> + return;
> +
> + if (poll_stats->cur_freq == LOW)
> + RTE_LOG(INFO, POWER, "Purge Mode is not supported\n");
> + else if (poll_stats->cur_freq == MED) {
> +
> + if (percent >
> + poll_stats->thresh[MED].threshold_percent) {
> +
> + if (poll_stats->threshold_ctr <
> INTERVALS_PER_SECOND)
> + poll_stats->threshold_ctr++;
> + else {
> + set_state(poll_stats, HGH_BUSY);
> + RTE_LOG(INFO, POWER, "MOVE to HGH\n");
> + }
> +
> + } else {
> + /* reset */
> + /* need debug log */
> + poll_stats->threshold_ctr = 0;
> + }
> +
> + } else if (poll_stats->cur_freq == HGH) {
> +
> + if (percent <
> + poll_stats->thresh[HGH].threshold_percent)
> {
> +
> + if (poll_stats->threshold_ctr <
> INTERVALS_PER_SECOND)
> + poll_stats->threshold_ctr++;
> + else
> + set_state(poll_stats, MED_NORMAL);
> + } else
> + /* reset */
> + /* need debug log */
> + poll_stats->threshold_ctr = 0;
> +
> +
> + }
> +}
> +
> +static int
> +empty_poll_training(struct priority_worker *poll_stats,
> + uint32_t max_train_iter)
> +{
> +
> + if (poll_stats->iter_counter < INTERVALS_PER_SECOND) {
> + poll_stats->iter_counter++;
> + return 0;
> + }
> +
> +
> + update_training_stats(poll_stats,
> + LOW,
> + false,
> + max_train_iter);
> +
> + update_training_stats(poll_stats,
> + MED,
> + false,
> + max_train_iter);
> +
> + update_training_stats(poll_stats,
> + HGH,
> + false,
> + max_train_iter);
> +
> +
> + if (poll_stats->thresh[LOW].trained == true
> + && poll_stats->thresh[MED].trained == true
> + && poll_stats->thresh[HGH].trained == true) {
> +
> + set_state(poll_stats, MED_NORMAL);
> +
> + RTE_LOG(INFO, POWER, "Training is Complete for %d\n",
> + poll_stats->lcore_id);
> + }
> +
> + return 0;
> +}
> +
> +void
> +rte_empty_poll_detection(struct rte_timer *tim,
> + void *arg)
> +{
> +
> + uint32_t i;
> +
> + struct priority_worker *poll_stats;
> +
> + RTE_SET_USED(tim);
> +
> + RTE_SET_USED(arg);
> +
> + for (i = 0; i < NUM_NODES; i++) {
> +
> + poll_stats = &(ep_params->wrk_data.wrk_stats[i]);
> +
> + if (rte_lcore_is_enabled(poll_stats->lcore_id) == 0)
> + continue;
> +
> + switch (poll_stats->queue_state) {
> + case(TRAINING):
> + empty_poll_training(poll_stats,
> + ep_params->max_train_iter);
> + break;
> +
> + case(HGH_BUSY):
> + case(MED_NORMAL):
> + update_stats_normal(poll_stats);
> +
> + break;
> +
> + case(LOW_PURGE):
> + break;
> + default:
> + break;
> +
> + }
> +
> + }
> +
> +}
> +
> +int __rte_experimental
> +rte_power_empty_poll_stat_init(struct ep_params **eptr, uint8_t
> *freq_tlb)
> +{
> + uint32_t i;
> + /* Allocate the ep_params structure */
> + ep_params = rte_zmalloc_socket(NULL,
> + sizeof(struct ep_params),
> + 0,
> + rte_socket_id());
> +
> + if (!ep_params)
> + rte_panic("Cannot allocate heap memory for ep_params "
> + "for socket %d\n", rte_socket_id());
> +
> + if (freq_tlb == NULL) {
> + freq_index[LOW] = 14;
> + freq_index[MED] = 9;
> + freq_index[HGH] = 1;
> + } else {
> + freq_index[LOW] = freq_tlb[LOW];
> + freq_index[MED] = freq_tlb[MED];
> + freq_index[HGH] = freq_tlb[HGH];
> + }
> +
> + RTE_LOG(INFO, POWER, "Initialize the Empty Poll\n");
> +
> + /* 5 seconds worth of training */
> + ep_params->max_train_iter = INTERVALS_PER_SECOND *
> SECONDS_TO_TRAIN_FOR;
> +
> + struct stats_data *w = &ep_params->wrk_data;
> +
> + *eptr = ep_params;
> +
> + /* initialize all wrk_stats state */
> + for (i = 0; i < NUM_NODES; i++) {
> +
> + if (rte_lcore_is_enabled(i) == 0)
> + continue;
> +
> + set_state(&w->wrk_stats[i], TRAINING);
> + /*init the freqs table */
> + total_avail_freqs[i] = rte_power_freqs(i,
> + avail_freqs[i],
> + NUM_FREQS);
> +
> + if (get_freq_index(LOW) > total_avail_freqs[i])
> + return -1;
> +
> + }
> +
> +
> + return 0;
> +}
> +
> +void __rte_experimental
> +rte_power_empty_poll_stat_free(void)
> +{
> +
> + RTE_LOG(INFO, POWER, "Close the Empty Poll\n");
> +
> + if (ep_params != NULL)
> + rte_free(ep_params);
> +}
> +
> +int __rte_experimental
> +rte_power_empty_poll_stat_update(unsigned int lcore_id)
> +{
> + struct priority_worker *poll_stats;
> +
> + if (lcore_id >= NUM_NODES)
> + return -1;
> +
> + poll_stats = &(ep_params->wrk_data.wrk_stats[lcore_id]);
> +
> + if (poll_stats->lcore_id == 0)
> + poll_stats->lcore_id = lcore_id;
> +
> + poll_stats->empty_dequeues++;
> +
> + return 0;
> +}
> +
> +int __rte_experimental
> +rte_power_poll_stat_update(unsigned int lcore_id, uint8_t nb_pkt)
> +{
> +
> + struct priority_worker *poll_stats;
> +
> + if (lcore_id >= NUM_NODES)
> + return -1;
> +
> + poll_stats = &(ep_params->wrk_data.wrk_stats[lcore_id]);
> +
> + if (poll_stats->lcore_id == 0)
> + poll_stats->lcore_id = lcore_id;
> +
> + poll_stats->num_dequeue_pkts += nb_pkt;
> +
> + return 0;
> +}
> +
> +
> +uint64_t __rte_experimental
> +rte_power_empty_poll_stat_fetch(unsigned int lcore_id)
> +{
> + struct priority_worker *poll_stats;
> +
> + if (lcore_id >= NUM_NODES)
> + return -1;
> +
> + poll_stats = &(ep_params->wrk_data.wrk_stats[lcore_id]);
> +
> + if (poll_stats->lcore_id == 0)
> + poll_stats->lcore_id = lcore_id;
> +
> + return poll_stats->empty_dequeues;
> +}
> +
> +uint64_t __rte_experimental
> +rte_power_poll_stat_fetch(unsigned int lcore_id)
> +{
> + struct priority_worker *poll_stats;
> +
> + if (lcore_id >= NUM_NODES)
> + return -1;
> +
> + poll_stats = &(ep_params->wrk_data.wrk_stats[lcore_id]);
> +
> + if (poll_stats->lcore_id == 0)
> + poll_stats->lcore_id = lcore_id;
> +
> + return poll_stats->num_dequeue_pkts;
> +}
> diff --git a/lib/librte_power/rte_power_empty_poll.h
> b/lib/librte_power/rte_power_empty_poll.h
> new file mode 100644
> index 0000000..e43981f
> --- /dev/null
> +++ b/lib/librte_power/rte_power_empty_poll.h
> @@ -0,0 +1,205 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2010-2018 Intel Corporation
> + */
> +
> +#ifndef _RTE_EMPTY_POLL_H
> +#define _RTE_EMPTY_POLL_H
> +
> +/**
> + * @file
> + * RTE Power Management
> + */
> +#include <stdint.h>
> +#include <stdbool.h>
> +
> +#include <rte_common.h>
> +#include <rte_byteorder.h>
> +#include <rte_log.h>
> +#include <rte_string_fns.h>
> +#include <rte_power.h>
> +#include <rte_timer.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#define NUM_FREQS 20
> +
> +#define BINS_AV 4 /* Has to be ^2 */
> +
> +#define DROP (NUM_DIRECTIONS * NUM_DEVICES)
> +
> +#define NUM_PRIORITIES 2
> +
> +#define NUM_NODES 31 /* Any reseanable prime number should
> work*/
> +
> +/* Processor Power State */
> +enum freq_val {
> + LOW,
> + MED,
> + HGH,
> + NUM_FREQ = NUM_FREQS
> +};
> +
> +
> +/* Queue Polling State */
> +enum queue_state {
> + TRAINING, /* NO TRAFFIC */
> + MED_NORMAL, /* MED */
> + HGH_BUSY, /* HIGH */
> + LOW_PURGE, /* LOW */
> +};
> +
> +/* Queue Stats */
> +struct freq_threshold {
> +
> + uint64_t base_edpi;
> + bool trained;
> + uint32_t threshold_percent;
> + uint32_t cur_train_iter;
> +};
> +
> +/* Each Worder Thread Empty Poll Stats */
> +struct priority_worker {
> +
> + /* Current dequeue and throughput counts */
> + /* These 2 are written to by the worker threads */
> + /* So keep them on their own cache line */
> + uint64_t empty_dequeues;
> + uint64_t num_dequeue_pkts;
> +
> + enum queue_state queue_state;
> +
> + uint64_t empty_dequeues_prev;
> + uint64_t num_dequeue_pkts_prev;
> +
> + /* Used for training only */
> + struct freq_threshold thresh[NUM_FREQ];
> + enum freq_val cur_freq;
> +
> + /* bucket arrays to calculate the averages */
> + uint64_t edpi_av[BINS_AV];
> + uint32_t ec;
> + uint64_t ppi_av[BINS_AV];
> + uint32_t pc;
> +
> + uint32_t lcore_id;
> + uint32_t iter_counter;
> + uint32_t threshold_ctr;
> + uint32_t display_ctr;
> + uint8_t dev_id;
> +
> +} __rte_cache_aligned;
> +
> +
> +struct stats_data {
> +
> + struct priority_worker wrk_stats[NUM_NODES];
> +
> + /* flag to stop rx threads processing packets until training over */
> + bool start_rx;
> +
> +};
> +
> +/* Empty Poll Parameters */
> +struct ep_params {
> +
> + /* Timer related stuff */
> + uint64_t interval_ticks;
> + uint32_t max_train_iter;
> +
> + struct rte_timer timer0;
> + struct stats_data wrk_data;
> +};
> +
> +
> +/**
> + * Initialize the power management system.
> + *
> + * @param eptr
> + * the structure of empty poll configuration
> + * @freq_tlb
> + * the power state/frequency mapping table
> + *
> + * @return
> + * - 0 on success.
> + * - Negative on error.
> + */
> +int __rte_experimental
> +rte_power_empty_poll_stat_init(struct ep_params **eptr, uint8_t
> *freq_tlb);
> +
> +/**
> + * Free the resource hold by power management system.
> + */
> +void __rte_experimental
> +rte_power_empty_poll_stat_free(void);
> +
> +/**
> + * Update specific core empty poll counter
> + * It's not thread safe.
> + *
> + * @param lcore_id
> + * lcore id
> + *
> + * @return
> + * - 0 on success.
> + * - Negative on error.
> + */
> +int __rte_experimental
> +rte_power_empty_poll_stat_update(unsigned int lcore_id);
> +
> +/**
> + * Update specific core valid poll counter, not thread safe.
> + *
> + * @param lcore_id
> + * lcore id.
> + * @param nb_pkt
> + * The packet number of one valid poll.
> + *
> + * @return
> + * - 0 on success.
> + * - Negative on error.
> + */
> +int __rte_experimental
> +rte_power_poll_stat_update(unsigned int lcore_id, uint8_t nb_pkt);
> +
> +/**
> + * Fetch specific core empty poll counter.
> + *
> + * @param lcore_id
> + * lcore id
> + *
> + * @return
> + * Current lcore empty poll counter value.
> + */
> +uint64_t __rte_experimental
> +rte_power_empty_poll_stat_fetch(unsigned int lcore_id);
> +
> +/**
> + * Fetch specific core valid poll counter.
> + *
> + * @param lcore_id
> + * lcore id
> + *
> + * @return
> + * Current lcore valid poll counter value.
> + */
> +uint64_t __rte_experimental
> +rte_power_poll_stat_fetch(unsigned int lcore_id);
> +
> +/**
> + * Empty poll state change detection function
> + *
> + * @param tim
> + * The timer structure
> + * @param arg
> + * The customized parameter
> + */
> +void __rte_experimental
> +rte_empty_poll_detection(struct rte_timer *tim, void *arg);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif
> diff --git a/lib/librte_power/rte_power_version.map
> b/lib/librte_power/rte_power_version.map
> index dd587df..11ffdfb 100644
> --- a/lib/librte_power/rte_power_version.map
> +++ b/lib/librte_power/rte_power_version.map
> @@ -33,3 +33,16 @@ DPDK_18.08 {
> rte_power_get_capabilities;
>
> } DPDK_17.11;
> +
> +EXPERIMENTAL {
> + global:
> +
> + rte_power_empty_poll_stat_init;
> + rte_power_empty_poll_stat_free;
> + rte_power_empty_poll_stat_update;
> + rte_power_empty_poll_stat_fetch;
> + rte_power_poll_stat_fetch;
> + rte_power_poll_stat_update;
> + rte_empty_poll_detection;
> +
> +};
> --
> 2.7.5