Hi Konstantin Ananyev,
Thanks for your review.
On 2024/10/14 16:29, Konstantin Ananyev wrote:
The deeper the idle state, the lower the power consumption, but the longer
the resume time. Some services are delay sensitive and expect a very low
resume time, like interrupt packet receiving mode.
The "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
interface is used to set and get the resume latency limit on cpuX from
userspace. Each cpuidle governor in Linux selects which idle state to enter
based on this CPU resume latency in its idle task.
The per-CPU PM QoS API can be used to control this CPU's idle state
selection and limit it to the shallowest idle state, lowering the delay
after sleep, by setting a strict resume latency (zero value).
Signed-off-by: Huisong Li <lihuis...@huawei.com>
Acked-by: Morten Brørup <m...@smartsharesystems.com>
---
doc/guides/prog_guide/power_man.rst | 24 ++++++
doc/guides/rel_notes/release_24_11.rst | 5 ++
lib/power/meson.build | 2 +
lib/power/rte_power_qos.c | 111 +++++++++++++++++++++++++
lib/power/rte_power_qos.h | 73 ++++++++++++++++
lib/power/version.map | 4 +
6 files changed, 219 insertions(+)
create mode 100644 lib/power/rte_power_qos.c
create mode 100644 lib/power/rte_power_qos.h
diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
index f6674efe2d..faa32b4320 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -249,6 +249,30 @@ Get Num Pkgs
Get Num Dies
Get the number of die's on a given package.
+
+PM QoS
+------
+
+The deeper the idle state, the lower the power consumption, but the longer
+the resume time. Some services are delay sensitive and expect a very low
+resume time, like interrupt packet receiving mode.
+
+The "/sys/devices/system/cpu/cpuX/power/pm_qos_resume_latency_us" sysfs
+interface is used to set and get the resume latency limit on cpuX from
+userspace. Each cpuidle governor in Linux selects which idle state to enter
+based on this CPU resume latency in its idle task.
+
+The per-CPU PM QoS API can be used to set and get the CPU resume latency based
+on this sysfs interface.
+
+The ``rte_power_qos_set_cpu_resume_latency()`` function can control the CPU's
+idle state selection in Linux and, by setting a strict resume latency (zero
+value), limit it to the shallowest idle state to lower the delay of resuming
+service after sleeping.
+
+The ``rte_power_qos_get_cpu_resume_latency()`` function can get the resume
+latency on the specified CPU.
+
References
----------
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 0ff70d9057..bd72d0a595 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -55,6 +55,11 @@ New Features
Also, make sure to start the actual text at the margin.
=======================================================
+* **Introduce per-CPU PM QoS interface.**
+
+ * Add per-CPU PM QoS interface to lower the delay after sleep by controlling
+ CPU idle state selection.
+
Removed Items
-------------
diff --git a/lib/power/meson.build b/lib/power/meson.build
index b8426589b2..8222e178b0 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -23,12 +23,14 @@ sources = files(
'rte_power.c',
'rte_power_uncore.c',
'rte_power_pmd_mgmt.c',
+ 'rte_power_qos.c',
)
headers = files(
'rte_power.h',
'rte_power_guest_channel.h',
'rte_power_pmd_mgmt.h',
'rte_power_uncore.h',
+ 'rte_power_qos.h',
)
if cc.has_argument('-Wno-cast-qual')
cflags += '-Wno-cast-qual'
diff --git a/lib/power/rte_power_qos.c b/lib/power/rte_power_qos.c
new file mode 100644
index 0000000000..8eb26cd41a
--- /dev/null
+++ b/lib/power/rte_power_qos.c
@@ -0,0 +1,111 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 HiSilicon Limited
+ */
+
+#include <errno.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_lcore.h>
+#include <rte_log.h>
+
+#include "power_common.h"
+#include "rte_power_qos.h"
+
+#define PM_QOS_SYSFILE_RESUME_LATENCY_US \
+ "/sys/devices/system/cpu/cpu%u/power/pm_qos_resume_latency_us"
+
+#define PM_QOS_CPU_RESUME_LATENCY_BUF_LEN 32
+
+int
+rte_power_qos_set_cpu_resume_latency(uint16_t lcore_id, int latency)
+{
+ char buf[PM_QOS_CPU_RESUME_LATENCY_BUF_LEN];
+ FILE *f;
+ int ret;
+
+ if (!rte_lcore_is_enabled(lcore_id)) {
+ POWER_LOG(ERR, "lcore id %u is not enabled", lcore_id);
+ return -EINVAL;
+ }
+
+ if (latency < 0) {
+ POWER_LOG(ERR, "latency should be greater than or equal to 0");
+ return -EINVAL;
+ }
+
+ ret = open_core_sysfs_file(&f, "w", PM_QOS_SYSFILE_RESUME_LATENCY_US, lcore_id);
That was already brought by Morten:
lcore_id is not always equal to cpu_core_id (cpu affinity).
Yes, Morten also mentioned it.
I tried to answer him; please see our previous discussion, thanks.
I think it's ok😁
Looking through power library it is not specific to that particular patch,
but sort of common limitation (bug?) in rte_power lib.
Yes, it is very common in the power lib.
+ if (ret != 0) {
+ POWER_LOG(ERR, "Failed to open "PM_QOS_SYSFILE_RESUME_LATENCY_US, lcore_id);
+ return ret;
+ }
+
<...>
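
For anyone following the thread without the full patch at hand, the sysfs access being discussed can be sketched in plain C roughly as below. This is only an illustration, not the code from the patch: the qos_sketch_* helpers and the cpu_root parameter are hypothetical names introduced here, and the sketch takes a CPU number directly rather than a DPDK lcore id, since the two are not always equal (the point raised above). It only handles plain decimal values; any special values the kernel accepts or reports for this file are deliberately left out.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helpers for illustration only; not part of the patch. */

/* Build "<cpu_root>/cpu<N>/power/pm_qos_resume_latency_us" into buf.
 * cpu_id is a CPU number, which may differ from a DPDK lcore id. */
static int
qos_sketch_path(char *buf, size_t len, const char *cpu_root, unsigned int cpu_id)
{
	int n = snprintf(buf, len, "%s/cpu%u/power/pm_qos_resume_latency_us",
			cpu_root, cpu_id);

	/* snprintf returns the length it wanted to write; reject truncation. */
	return (n < 0 || (size_t)n >= len) ? -ENAMETOOLONG : 0;
}

/* Write a non-negative latency limit (microseconds) as a decimal string. */
static int
qos_sketch_write(const char *path, int latency_us)
{
	FILE *f;
	int ret = 0;

	if (latency_us < 0)
		return -EINVAL;

	f = fopen(path, "w");
	if (f == NULL)
		return -errno;

	if (fprintf(f, "%d", latency_us) < 0)
		ret = -EIO;
	/* Buffered data is flushed on close, so check fclose() as well. */
	if (fclose(f) != 0 && ret == 0)
		ret = -errno;

	return ret;
}

/* Read the value back; returns -EINVAL if the content is not a decimal number. */
static int
qos_sketch_read(const char *path, int *latency_us)
{
	FILE *f = fopen(path, "r");
	int ret = 0;

	if (f == NULL)
		return -errno;
	if (fscanf(f, "%d", latency_us) != 1)
		ret = -EINVAL;
	fclose(f);

	return ret;
}
```

A caller would first build the path with qos_sketch_path(buf, sizeof(buf), "/sys/devices/system/cpu", cpu_id) and then pass it to the write or read helper. Writing to the real sysfs file normally needs sufficient privileges, so the root directory is kept as a parameter to allow trying the helpers against an ordinary file.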