On 03/20/2018 04:54 AM, Aaron Lu wrote:
> This series is meant to improve zone->lock scalability for order 0 pages.
> With the will-it-scale/page_fault1 workload on a 2-socket Intel Skylake
> server with 112 CPUs, the CPUs spend 80% of their time spinning on
> zone->lock. The perf profile shows that the most time-consuming part
> under zone->lock is the cache miss on "struct page", so here I'm trying
> to avoid those cache misses.

I ran page_fault1 to compare 4.16-rc5 against your recent work: these four
patches plus the three others from your GitHub branch zone_lock_rfc_v2
(lu-zone in the table). Out of curiosity I also threw in another 4.16-rc5
(4.16-rc5-nz in the table) with the pcp batch size set so high (10922
pages) that we always stay in the pcp lists and out of buddy completely.
I used your patch[*] in this last kernel.

This was on a 2-socket, 20-core Broadwell server.

There were some small regressions a bit outside the noise at low process
counts (2-5), but I'm not sure they're repeatable. Otherwise, the series
improves the microbenchmark across the board.

[*] lkml.kernel.org/r/20170919072342.GB7263@intel.com

kernel (#)       ntask     proc      thr        proc    proc        thr     thr
                        speedup  speedup       pgf/s   stdev      pgf/s   stdev

4.16-rc5 (1)         1                       586,305     747    587,731   1,766
lu-zone (2)          1     4.0%     3.4%     609,505   1,562    608,007   1,169
4.16-rc5-nz (3)      1     8.0%     5.9%     633,145   1,752    622,690   1,286

4.16-rc5 (1)         2                     1,131,428   7,396  1,022,890   7,333
lu-zone (2)          2    -1.0%    -0.3%   1,119,974   2,557  1,020,102   5,706
4.16-rc5-nz (3)      2     3.0%     3.3%   1,165,004   6,232  1,056,689   6,411

4.16-rc5 (1)         3                     1,590,413   6,411  1,346,900   8,924
lu-zone (2)          3    -0.7%     1.0%   1,579,816   7,216  1,360,376   4,418
4.16-rc5-nz (3)      3     2.3%     3.4%   1,626,925   5,321  1,392,515   8,180

4.16-rc5 (1)         4                     2,064,476   5,034  1,656,475  14,713
lu-zone (2)          4    -1.0%     0.9%   2,043,240   9,069  1,672,036   5,342
4.16-rc5-nz (3)      4     1.2%     3.9%   2,089,959   8,287  1,721,614   7,796

4.16-rc5 (1)         5                     2,486,090  11,178  1,878,085  15,286
lu-zone (2)          5    -0.1%     1.1%   2,483,021  15,100  1,898,459   9,295
4.16-rc5-nz (3)      5     1.3%     4.0%   2,517,602  13,717  1,952,995   7,481

4.16-rc5 (1)         6                     2,869,756   9,194  2,058,398  20,580
lu-zone (2)          6     0.4%     3.5%   2,882,220  20,583  2,129,444   9,689
4.16-rc5-nz (3)      6     2.4%     6.0%   2,937,618  14,126  2,182,859   9,650

4.16-rc5 (1)         7                     3,242,589  15,354  2,231,977  20,188
lu-zone (2)          7     0.9%     3.2%   3,270,796  15,124  2,303,780   6,607
4.16-rc5-nz (3)      7     2.0%     6.7%   3,306,528  17,683  2,381,279  16,507

4.16-rc5 (1)         8                     3,598,209  10,764  2,361,819  13,508
lu-zone (2)          8     1.2%     4.5%   3,642,407  14,893  2,469,191  16,250
4.16-rc5-nz (3)      8     2.0%     8.1%   3,671,834  17,501  2,552,786  12,112

4.16-rc5 (1)         9                     3,974,345  12,605  2,511,565  31,986
lu-zone (2)          9     2.3%     4.8%   4,067,553  11,069  2,632,608   8,158
4.16-rc5-nz (3)      9     2.5%     9.8%   4,073,111  12,463  2,758,075  31,432

4.16-rc5 (1)        10                     4,333,026  12,187  2,636,914  15,064
lu-zone (2)         10     2.4%     5.8%   4,435,852  21,691  2,789,949  16,399
4.16-rc5-nz (3)     10     3.2%    10.3%   4,470,666  23,663  2,907,263  15,052

4.16-rc5 (1)        11                     4,932,423  12,183  2,675,769  23,924
lu-zone (2)         11     2.7%     3.6%   5,064,666  18,600  2,771,476  22,438
4.16-rc5-nz (3)     11     3.6%     7.9%   5,110,434  21,459  2,888,419  18,180

4.16-rc5 (1)        12                     5,461,255  14,704  2,631,232  24,390
lu-zone (2)         12     1.7%     2.5%   5,554,957  20,978  2,697,554  19,369
4.16-rc5-nz (3)     12     3.1%     5.8%   5,629,143  22,781  2,782,902  20,346

4.16-rc5 (1)        13                     5,924,367  11,835  2,445,607  25,821
lu-zone (2)         13     1.4%     4.9%   6,004,723  19,070  2,566,547  26,031
4.16-rc5-nz (3)     13     2.9%     8.4%   6,094,087  17,676  2,651,793  20,050

4.16-rc5 (1)        14                     6,381,611  16,791  2,277,611  39,557
lu-zone (2)         14     1.2%     4.4%   6,459,693  18,869  2,377,795  25,093
4.16-rc5-nz (3)     14     2.4%    12.4%   6,534,837  24,990  2,560,085  12,638

4.16-rc5 (1)        15                     6,804,737  13,653  2,232,121  19,408
lu-zone (2)         15     1.1%     4.8%   6,881,730  18,995  2,338,868  18,922
4.16-rc5-nz (3)     15     2.4%    12.5%   6,970,318  28,677  2,510,594  25,890

4.16-rc5 (1)        16                     7,197,145  17,499  2,313,168  16,694
lu-zone (2)         16     1.2%     4.5%   7,287,072  28,727  2,418,232  23,120
4.16-rc5-nz (3)     16     2.6%     7.6%   7,383,613  17,991  2,489,382  28,385

4.16-rc5 (1)        17                     7,550,498  15,101  2,226,427  24,768
lu-zone (2)         17     1.6%     4.4%   7,667,641  24,305  2,324,675  16,265
4.16-rc5-nz (3)     17     2.8%     5.3%   7,761,917  29,854  2,345,195  20,659

4.16-rc5 (1)        18                     7,902,794  12,578  2,188,399  37,453
lu-zone (2)         18     1.7%     6.0%   8,033,876  21,158  2,320,641  13,053
4.16-rc5-nz (3)     18     2.8%     8.9%   8,126,732  27,619  2,383,083  25,172

4.16-rc5 (1)        19                     8,277,506  15,448  2,198,021  33,074
lu-zone (2)         19     2.1%     6.4%   8,453,255  17,411  2,339,395  20,220
4.16-rc5-nz (3)     19     3.0%     7.7%   8,529,130  29,852  2,366,800  20,139

4.16-rc5 (1)        20                     8,651,034  19,705  2,239,626  25,694
lu-zone (2)         20     2.2%     3.2%   8,840,988  24,387  2,311,966  25,693
4.16-rc5-nz (3)     20     3.1%     9.3%   8,918,721  34,761  2,448,589  26,227

4.16-rc5 (1)        21                     8,777,023  11,833  2,259,622  32,154
lu-zone (2)         21     2.5%     5.7%   8,993,348  25,480  2,389,000  29,464
4.16-rc5-nz (3)     21     3.5%     6.3%   9,085,319  27,790  2,401,713  28,706

4.16-rc5 (1)        22                     8,855,202  27,455  2,268,030  35,914
lu-zone (2)         22     3.0%     4.1%   9,123,705  34,288  2,361,876  31,917
4.16-rc5-nz (3)     22     4.2%    11.9%   9,228,843  33,483  2,536,867  26,000

4.16-rc5 (1)        23                     8,952,897  21,601  2,280,539  30,530
lu-zone (2)         23     3.5%    10.1%   9,268,365  30,803  2,510,883  26,312
4.16-rc5-nz (3)     23     4.6%    10.3%   9,367,048  30,681  2,514,740  32,896

4.16-rc5 (1)        24                     9,036,582  19,482  2,374,728  42,891
lu-zone (2)         24     3.7%     2.2%   9,369,541  33,253  2,425,993  45,952
4.16-rc5-nz (3)     24     5.0%     4.5%   9,489,640  28,066  2,482,179  33,820

4.16-rc5 (1)        25                     9,136,041  18,232  2,336,090  30,036
lu-zone (2)         25     4.0%     6.1%   9,497,501  34,408  2,478,602  31,959
4.16-rc5-nz (3)     25     4.7%    10.5%   9,563,832  34,506  2,581,056  38,516

4.16-rc5 (1)        26                     9,226,630  17,998  2,326,070  33,782
lu-zone (2)         26     3.7%     7.8%   9,570,547  27,052  2,508,396  41,848
4.16-rc5-nz (3)     26     4.4%    10.4%   9,634,192  40,217  2,566,842  31,115

4.16-rc5 (1)        27                     9,305,784  24,573  2,391,261  29,547
lu-zone (2)         27     3.8%     6.1%   9,656,252  39,164  2,536,624  45,738
4.16-rc5-nz (3)     27     4.5%     5.2%   9,720,210  31,171  2,516,447  32,347

4.16-rc5 (1)        28                     9,381,004  19,377  2,442,774  35,745
lu-zone (2)         28     2.6%     1.3%   9,626,125  65,187  2,474,560  29,977
4.16-rc5-nz (3)     28     4.1%     4.4%   9,766,045  54,298  2,549,227  31,199

4.16-rc5 (1)        29                     9,401,844  27,746  2,456,372  40,549
lu-zone (2)         29     2.7%     3.8%   9,652,161  51,629  2,549,004  39,990
4.16-rc5-nz (3)     29     3.5%     6.9%   9,731,681  51,852  2,625,589  27,821

4.16-rc5 (1)        30                     9,428,320  17,561  2,509,119  39,752
lu-zone (2)         30     2.1%     6.1%   9,630,472  50,106  2,662,447  44,347
4.16-rc5-nz (3)     30     3.1%     5.7%   9,722,152  50,348  2,651,891  28,518

4.16-rc5 (1)        31                     9,561,062  21,909  2,392,883  24,180
lu-zone (2)         31     2.0%     7.1%   9,755,774  60,381  2,563,573  32,132
4.16-rc5-nz (3)     31     3.5%    15.0%   9,894,735  45,966  2,752,506  28,516

4.16-rc5 (1)        32                     9,624,859  30,462  2,480,667  27,055
lu-zone (2)         32     2.7%     5.5%   9,883,943  61,850  2,618,326  36,655
4.16-rc5-nz (3)     32     4.5%    15.4%  10,057,320  46,352  2,863,788  34,021

4.16-rc5 (1)        33                     9,739,896  35,435  2,476,666  30,301
lu-zone (2)         33     3.1%     8.2%  10,043,706  60,943  2,680,570  42,346
4.16-rc5-nz (3)     33     4.6%    16.8%  10,191,082  51,348  2,893,385  32,717

4.16-rc5 (1)        34                     9,833,955  39,366  2,628,480  36,567
lu-zone (2)         34     3.5%     2.7%  10,180,871  50,050  2,699,941  42,804
4.16-rc5-nz (3)     34     5.0%     7.3%  10,323,136  50,767  2,820,628  30,551

4.16-rc5 (1)        35                     9,908,832  20,826  2,666,415  51,379
lu-zone (2)         35     3.5%     0.5%  10,251,385  58,551  2,679,925  49,143
4.16-rc5-nz (3)     35     5.1%     5.7%  10,418,155  51,726  2,817,192  34,042

4.16-rc5 (1)        36                     9,969,311  20,377  2,563,399  36,720
lu-zone (2)         36     3.5%     4.8%  10,314,449  60,867  2,686,176  42,925
4.16-rc5-nz (3)     36     5.4%     9.1%  10,504,881  53,100  2,796,816  37,461

4.16-rc5 (1)        37                    10,077,169  36,182  2,584,728  32,672
lu-zone (2)         37     3.1%     7.4%  10,393,453  63,048  2,775,523  39,745
4.16-rc5-nz (3)     37     4.7%    11.9%  10,549,870  45,280  2,893,000  39,102

4.16-rc5 (1)        38                    10,115,997  25,835  2,653,036  33,259
lu-zone (2)         38     2.7%     5.7%  10,388,901  63,402  2,803,290  36,020
4.16-rc5-nz (3)     38     4.6%    11.2%  10,580,796  63,516  2,949,834  31,421

4.16-rc5 (1)        39                    10,162,757  33,118  2,681,195  30,591
lu-zone (2)         39     2.5%     3.0%  10,413,010  76,719  2,761,752  32,471
4.16-rc5-nz (3)     39     4.0%     9.5%  10,568,061  65,614  2,935,127  38,463

4.16-rc5 (1)        40                    10,223,472  41,882  2,670,421  26,049
lu-zone (2)         40     2.4%     5.0%  10,470,977  58,008  2,803,478  37,111
4.16-rc5-nz (3)     40     4.1%     7.4%  10,646,450  54,810  2,868,986  52,724
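
For anyone who wants to re-check the speedup columns, here is a minimal
sketch in Python. It assumes that "speedup" is the percent change in pgf/s
relative to the 4.16-rc5 (1) row at the same ntask, which is consistent
with the ntask 1 and 2 rows of the table. The function names and the
"outside the noise" criterion (means differ by more than the sum of the
two stdevs) are illustrative assumptions, not something taken from
will-it-scale or from the patches.

```python
def speedup_pct(baseline: float, candidate: float) -> float:
    """Percent change of candidate throughput relative to baseline."""
    return (candidate / baseline - 1.0) * 100.0


def outside_noise(base_mean: float, base_stdev: float,
                  cand_mean: float, cand_stdev: float) -> bool:
    """Assumed criterion: the means differ by more than both stdevs combined."""
    return abs(cand_mean - base_mean) > base_stdev + cand_stdev


# proc pgf/s at ntask=1, copied from the table
print(f"{speedup_pct(586_305, 609_505):.1f}%")   # lu-zone:     4.0%
print(f"{speedup_pct(586_305, 633_145):.1f}%")   # 4.16-rc5-nz: 8.0%

# proc pgf/s and stdev at ntask=2 and ntask=5 (4.16-rc5 vs lu-zone)
print(f"{speedup_pct(1_131_428, 1_119_974):.1f}%")          # -1.0%
print(outside_noise(1_131_428, 7_396, 1_119_974, 2_557))     # True
print(outside_noise(2_486_090, 11_178, 2_483_021, 15_100))   # False
```

Under that assumed criterion the ntask 2 regression sits outside the noise
while the ntask 5 one does not, so the low-count results are mixed.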