On Wed, Mar 17, 2021 at 11:44 PM Pratik Sampat <psam...@linux.ibm.com> wrote:
>
> Hi Doug,
> Thanks for trying these patches out.
>
> On 18/03/21 2:30 am, Doug Smythies wrote:
> > Hi Pratik,
> >
> > It just so happens that I have been trying Artem's version this last
> > week, so I tried yours.
> >
> > On Mon, Mar 15, 2021 at 4:49 AM Pratik Rajesh Sampat
> > <psam...@linux.ibm.com> wrote:
> > ...
...
> > Other notes:
> >
> > No idle state for CPU 0 ever gets disabled.
> > I assume this is because CPU 0 can never be offline,
> > so that bit of code (Disable all stop states) doesn't find its state.
> > By the way, processor = Intel i5-9600K
>
> I had tried these patches on an IBM POWER 9 processor and disabling CPU0's
> idle state works there. However, it does make sense for some processors to
> treat CPU 0 differently.
> Maybe I could write in a case if idle state disabling fails for a CPU then we
> just skip it?

I didn't try that. I just did a hack so I could continue for this reply.
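
A minimal sketch of how either idea could look in the script, not what was
actually run for this reply. It assumes that CPU 0 gets skipped because it
has no "online" file in sysfs, and it treats a failed write as a reason to
skip rather than abort. CPU_SYSFS, cpu_is_online_or_fixed and
try_set_idle_state are made-up names for illustration only:

```bash
#!/bin/bash
# Sketch only. CPU_SYSFS, cpu_is_online_or_fixed and try_set_idle_state are
# hypothetical names, not part of the posted cpuidle.sh.
CPU_SYSFS=${CPU_SYSFS:-/sys/devices/system/cpu}

# Print 1 if the CPU is usable, 0 otherwise. A CPU directory without an
# "online" file is assumed to be a CPU that cannot be offlined (CPU 0 on
# some systems) and is therefore reported as online.
cpu_is_online_or_fixed()
{
	local dir="$CPU_SYSFS/cpu$1"

	if [ ! -d "$dir" ]; then
		echo 0
	elif [ ! -f "$dir/online" ]; then
		echo 1
	else
		cat "$dir/online"
	fi
}

# Write $1 (0 = enable, 1 = disable) to the "disable" file of idle state $3
# on CPU $2. Returns non-zero when the file is missing or the write is
# rejected, so the caller can skip that CPU instead of failing.
try_set_idle_state()
{
	local value=$1 cpu=$2 state=$3
	local path="$CPU_SYSFS/cpu$cpu/cpuidle/state$state/disable"

	[ -w "$path" ] || return 1
	{ echo "$value" > "$path"; } 2>/dev/null
}
```

Inside the per-CPU loops this would be used as
"try_set_idle_state 1 $cpu $state || continue".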

> > The system is left with all idle states disabled, well not for CPU 0
> > as per the above comment. The suggestion is to restore them,
> > otherwise my processor hogs 42 watts instead of 2.
> >
> > My results are highly variable per test.
>
> Question: Do you notice high variability with IPI test, Timer test or both?

The IPI test has less variability than the Timer test.

>
> I can think of two reasons for high run to run variance:
>
> 1. If you observe variance in timer tests, then I believe there could be a
> mechanism of "C-state pre-wake" on some Intel machines at play here, which can
> pre-wake a CPU from an idle state when timers are armed. I'm not sure if the
> Intel platform that you're running on does that or not.
>
> Artem had described this behavior to me a while ago and I think his wult page
> describes this behavior in more detail:
> https://intel.github.io/wult/#c-state-pre-wake

Yes, I have reviewed all the references.
And yes, I think my processors have the pre-wake stuff.

I do not have the proper hardware to do the Artem pre-wake workaround
method, but might buy it in future.

> 2. I have noticed variability in results when there are kernel book-keeping or
> jitter tasks scheduled from time to time on an otherwise idle core.
> In the full per-CPU logs at tools/testing/selftests/cpuidle/cpuidle.log can
> you spot any obvious outliers per-CPU state?

Yes.
I'll paste an example cpuidle.log file below, generated with the -v option,
along with my hack-job diff.

doug@s19:~/temp-k-git/linux/tools/testing/selftests/cpuidle$ cat cpuidle.log.v3-1
--IPI Latency Test---
--Baseline IPI Latency measurement: CPU Busy--
SRC_CPU   DEST_CPU IPI_Latency(ns)
0            0          140
0            1          632
0            2          675
0            3          671
0            4          675
0            5          767
0            6          653
0            7          826
0            8          819
0            9          615
0           10          758
0           11          758
Baseline Avg IPI latency(ns): 665
---Enabling state: 0---
SRC_CPU   DEST_CPU IPI_Latency(ns)
0            0           76
0            1          484
0            2          494
0            3          539
0            4          498
0            5          491
0            6          474
0            7          434
0            8          544
0            9          476
0           10          447
0           11          467
Expected IPI latency(ns): 0
Observed Avg IPI latency(ns) - State 0: 452
---Enabling state: 1---
SRC_CPU   DEST_CPU IPI_Latency(ns)
0            0           72
0            1         1081
0            2          821
0            3         1486
0            4         1022
0            5          960
0            6         1634
0            7          933
0            8         1032
0            9         1046
0           10         1430
0           11         1338
Expected IPI latency(ns): 1000
Observed Avg IPI latency(ns) - State 1: 1071
---Enabling state: 2---
SRC_CPU   DEST_CPU IPI_Latency(ns)
0            0          264
0            1        30836
0            2        30562
0            3        30748
0            4        35286
0            5        30978
0            6         1952
0            7        36066
0            8        30670
0            9        30605
0           10        30635
0           11        35423
Expected IPI latency(ns): 120000
Observed Avg IPI latency(ns) - State 2: 27002
---Enabling state: 3---
SRC_CPU   DEST_CPU IPI_Latency(ns)
0            0           71
0            1        30853
0            2        32095
0            3        32661
0            4        30230
0            5        34348
0            6         2012
0            7        30816
0            8        30908
0            9        31130
0           10        34150
0           11        32050
Expected IPI latency(ns): 1034000
Observed Avg IPI latency(ns) - State 3: 26777

--Timeout Latency Test--
--Baseline Timeout Latency measurement: CPU Busy--
Wakeup_src Baseline_delay(ns)
0             453
1             568
2             387
3             337
4             433
5             579
6             330
7             400
8             561
9             544
10            569
11            523
Baseline Avg timeout diff(ns): 473
---Enabling state: 0---
Wakeup_src Baseline_delay(ns)  Delay(ns)
0             399
1             388
2             352
3             385
4             334
5             415
6             320
7             356
8             401
9             379
10            339
11            384
Expected timeout(ns): 200
Observed Avg timeout diff(ns) - State 0: 371
---Enabling state: 1---
Wakeup_src Baseline_delay(ns)  Delay(ns)
0             666
1             575
2             608
3             590
4             608
5             552
6             582
7             593
8             597
9             587
10            588
11            610
Expected timeout(ns): 1200
Observed Avg timeout diff(ns) - State 1: 596
---Enabling state: 2---
Wakeup_src Baseline_delay(ns)  Delay(ns)
0           36386
1            1069
2             866
3             884
4             850
5           55642
6          408082
7            1184
8          406075
9          406830
10         414105
11         406594
Expected timeout(ns): 360200
Observed Avg timeout diff(ns) - State 2: 178213
---Enabling state: 3---
Wakeup_src Baseline_delay(ns)  Delay(ns)
0          406049
1             913
2          410134
3             921
4          406237
5             950
6          407181
7             920
8          407678
9             894
10         406320
11         304161
Expected timeout(ns): 3102200
Observed Avg timeout diff(ns) - State 3: 229363
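
To make the outliers in a log like the one above easier to spot, here is a
rough sketch that flags rows far from the median of their block. The
function name flag_outliers and the thresholds (below half or above twice
the median) are arbitrary illustrative choices, not part of the selftest:

```bash
#!/bin/bash
# Sketch only: flag_outliers and the 0.5x/2x median thresholds are
# illustrative choices, not part of the selftest.
#
# Usage: flag_outliers cpuidle.log
# For every block of numeric rows, print the rows whose last column is
# less than half or more than twice the block median.
flag_outliers()
{
	awk '
	function report(    i, j, t, med) {
		if (n == 0)
			return
		# Insertion sort on a copy so the original row order is kept.
		for (i = 1; i <= n; i++)
			s[i] = val[i]
		for (i = 2; i <= n; i++) {
			t = s[i]
			for (j = i - 1; j >= 1 && s[j] > t; j--)
				s[j + 1] = s[j]
			s[j + 1] = t
		}
		med = (n % 2) ? s[(n + 1) / 2] : (s[n / 2] + s[n / 2 + 1]) / 2
		for (i = 1; i <= n; i++)
			if (val[i] < med / 2 || val[i] > med * 2)
				printf "%s %s: cpu %s value %d (median %d)\n", test, title, cpu[i], val[i], med
		n = 0
	}
	# Section markers: remember which test and which state we are in.
	/^--/ { report(); if ($0 ~ /Test/) test = $0; else title = $0; next }
	# Data rows: IPI rows have 3 columns (CPU in column 2), timeout rows 2.
	$1 ~ /^[0-9]+$/ && $NF ~ /^[0-9]+$/ {
		n++; cpu[n] = (NF == 3) ? $2 : $1; val[n] = $NF + 0; next
	}
	# Anything else (headers, summary lines, blanks) closes the block.
	{ report() }
	END { report() }
	' "$1"
}
```

For the IPI "Enabling state: 2" block above, the median is about 30.7 us,
so destination CPUs 0 (264 ns, the source CPU itself) and 6 (1952 ns) are
reported.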


My hack job (it assumes all CPUs are always online):

diff --git a/tools/testing/selftests/cpuidle/cpuidle.sh b/tools/testing/selftests/cpuidle/cpuidle.sh
index de5141d5b76b..70bdacda5e91 100755
--- a/tools/testing/selftests/cpuidle/cpuidle.sh
+++ b/tools/testing/selftests/cpuidle/cpuidle.sh
@@ -86,10 +86,6 @@ disable_idle()
 {
        for ((cpu=0; cpu<NUM_CPUS; cpu++))
        do
-               local cpu_status=$(cpu_is_online $cpu)
-               if [ $cpu_status == 0 ]; then
-                       continue
-               fi
                for ((state=0; state<NUM_STATES; state++))
                do
                        echo 1 > /sys/devices/system/cpu/cpu$cpu/cpuidle/state$state/disable
@@ -104,10 +100,6 @@ op_state()
 {
        for ((cpu=0; cpu<NUM_CPUS; cpu++))
        do
-               local cpu_status=$(cpu_is_online $cpu)
-               if [ $cpu_status == 0 ]; then
-                       continue
-               fi
                echo $1 > /sys/devices/system/cpu/cpu$cpu/cpuidle/state$2/disable
        done
 }
@@ -124,17 +116,6 @@ cpuidle_disable_state()
        op_state 1 $state
 }

-cpu_is_online()
-{
-       cpu=$1
-       if [ ! -f "/sys/devices/system/cpu/cpu$cpu/online" ]; then
-               echo 0
-               return
-       fi
-       status=$(cat /sys/devices/system/cpu/cpu$cpu/online)
-       echo $status
-}
-
 # Extract latency in microseconds and convert to nanoseconds
 extract_latency()
 {
@@ -179,10 +160,6 @@ run_ipi_tests()
        printf "%s %10s %12s\n" "SRC_CPU" "DEST_CPU" "IPI_Latency(ns)" >> $LOG
        for ((cpu=0; cpu<NUM_CPUS; cpu+=SMT))
        do
-               local cpu_status=$(cpu_is_online $cpu)
-               if [ $cpu_status == 0 ]; then
-                       continue
-               fi
                ipi_test_once "baseline" $cpu
                printf "%-3s %10s %12s\n" $src_cpu $cpu $ipi_latency >> $LOG
                avg_arr+=($ipi_latency)
@@ -198,10 +175,6 @@ run_ipi_tests()
                printf "%s %10s %12s\n" "SRC_CPU" "DEST_CPU" "IPI_Latency(ns)" >> $LOG
                for ((cpu=0; cpu<NUM_CPUS; cpu+=SMT))
                do
-                       local cpu_status=$(cpu_is_online $cpu)
-                       if [ $cpu_status == 0 ]; then
-                               continue
-                       fi
                        # Running IPI test and logging results
                        sleep 1
                        ipi_test_once "test" $cpu
@@ -262,10 +235,6 @@ run_timeout_tests()
        printf "%s %10s %10s\n" "Wakeup_src" "Baseline_delay(ns)">> $LOG
        for ((cpu=0; cpu<NUM_CPUS; cpu+=SMT))
        do
-               local cpu_status=$(cpu_is_online $cpu)
-               if [ $cpu_status == 0 ]; then
-                       continue
-               fi
                timeout_test_once "baseline" $cpu 1000000
                printf "%-3s %13s\n" $src_cpu $timeout_diff >> $LOG
                avg_arr+=($timeout_diff)
@@ -281,10 +250,6 @@ run_timeout_tests()
                printf "%s %10s %10s\n" "Wakeup_src" "Baseline_delay(ns)" "Delay(ns)" >> $LOG
                for ((cpu=0; cpu<NUM_CPUS; cpu+=SMT))
                do
-                       local cpu_status=$(cpu_is_online $cpu)
-                       if [ $cpu_status == 0 ]; then
-                               continue
-                       fi
                        timeout_test_once "test" $cpu 1000000
                        printf "%-3s %13s %18s\n" $src_cpu $baseline_timeout_diff $timeout_diff >> $LOG
                        avg_arr+=($timeout_diff)
@@ -314,3 +279,7 @@ run_timeout_tests
 printf "Removing $MODULE module\n"
 printf "Full Output logged at: $LOG\n"
 rmmod $MODULE
+
+printf "enabling idle states\n"
+
+echo 0 | tee /sys/devices/system/cpu/cpu*/cpuidle/state*/disable
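
The last hunk above simply enables every idle state on exit. A slightly
more careful variant, sketched here under the assumption that the script
should put back whatever was configured before it ran, saves the "disable"
values first and restores them from an EXIT trap. CPU_SYSFS, SAVED_STATES,
save_idle_states and restore_idle_states are made-up names, not part of
cpuidle.sh:

```bash
#!/bin/bash
# Sketch only: CPU_SYSFS, SAVED_STATES, save_idle_states and
# restore_idle_states are hypothetical names, not part of cpuidle.sh.
CPU_SYSFS=${CPU_SYSFS:-/sys/devices/system/cpu}
SAVED_STATES=()

# Remember the current value of every cpuidle "disable" file as
# "path=value" entries.
save_idle_states()
{
	local f

	SAVED_STATES=()
	for f in "$CPU_SYSFS"/cpu*/cpuidle/state*/disable; do
		[ -r "$f" ] && SAVED_STATES+=("$f=$(cat "$f")")
	done
	return 0
}

# Write the remembered values back, ignoring files that reject the write.
restore_idle_states()
{
	local entry

	for entry in "${SAVED_STATES[@]}"; do
		{ echo "${entry##*=}" > "${entry%=*}"; } 2>/dev/null
	done
	return 0
}

# Intended use near the top of the test script:
#   save_idle_states
#   trap restore_idle_states EXIT
```

With the trap in place the states would also be restored when the script
is interrupted part way through a run, which matters given the 42 watts
versus 2 watts difference mentioned earlier.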
