On Mon, Apr 29, 2019 at 09:25:35PM +0800 Li, Aubrey wrote:
> On 2019/4/29 14:14, Ingo Molnar wrote:
> >
> > * Li, Aubrey <aubrey...@linux.intel.com> wrote:
> >
> >>> I suspect it's pretty low, below 1% for all rows?
> >>
> >> Hope my mailbox works for this...
> >>
> >> .------------------------------------------------------------------------------------------------------------.
> >> |NA/AVX  vanilla-SMT [std% / sem%] | coresched-SMT [std% / sem%]     +/- |      no-SMT [std% / sem%]     +/- |
> >> |------------------------------------------------------------------------------------------------------------|
> >> |    1/1       508.5 [ 0.2%/ 0.0%] |         504.7 [ 1.1%/ 0.1%]   -0.8% |       509.0 [ 0.2%/ 0.0%]    0.1% |
> >> |    2/2      1000.2 [ 1.4%/ 0.1%] |        1004.1 [ 1.6%/ 0.2%]    0.4% |       997.6 [ 1.2%/ 0.1%]   -0.3% |
> >> |    4/4      1912.1 [ 1.0%/ 0.1%] |        1904.2 [ 1.1%/ 0.1%]   -0.4% |      1914.9 [ 1.3%/ 0.1%]    0.1% |
> >> |    8/8      3753.5 [ 0.3%/ 0.0%] |        3748.2 [ 0.3%/ 0.0%]   -0.1% |      3751.3 [ 0.4%/ 0.0%]   -0.1% |
> >> |  16/16      7139.3 [ 2.4%/ 0.2%] |        7137.9 [ 1.8%/ 0.2%]   -0.0% |      7049.2 [ 2.4%/ 0.2%]   -1.3% |
> >> |  32/32     10899.0 [ 4.2%/ 0.4%] |       10780.3 [ 4.4%/ 0.4%]   -1.1% |     10339.2 [ 9.6%/ 0.9%]   -5.1% |
> >> |  64/64     15086.1 [11.5%/ 1.2%] |       14262.0 [ 8.2%/ 0.8%]   -5.5% |     11168.7 [22.2%/ 1.7%]  -26.0% |
> >> |128/128     15371.9 [22.0%/ 2.2%] |       14675.8 [14.4%/ 1.4%]   -4.5% |     10963.9 [18.5%/ 1.4%]  -28.7% |
> >> |256/256     15990.8 [22.0%/ 2.2%] |       12227.9 [10.3%/ 1.0%]  -23.5% |     10469.9 [19.6%/ 1.7%]  -34.5% |
> >> '------------------------------------------------------------------------------------------------------------'
> >
> > Perfectly presented, thank you very much!
>
> My pleasure! ;-)
>
> >
> > My final question would be about the environment:
> >
> >> Skylake server, 2 NUMA nodes, 104 CPUs (HT on)
> >
> > Is the typical nr_running value the sum of 'NA+AVX', i.e. is it ~256
> > threads for the 128/128 row for example - or is it 128 parallel tasks?
>
> That means 128 sysbench threads and 128 gemmbench tasks, so 256 threads in
> sum.
>
> > I.e. showing the approximate CPU thread-load figure column would be very
> > useful too, where '50%' shows half-loaded, '100%' fully-loaded, '200%'
> > over-saturated, etc. - for each row?
>
> See below, hope this helps.
>
> .------------------------------------------------------------------------------------------------------------------------------------.
> |NA/AVX  vanilla-SMT [std% / sem%]    cpu% | coresched-SMT [std% / sem%]     +/-    cpu% |      no-SMT [std% / sem%]     +/-    cpu% |
> |------------------------------------------------------------------------------------------------------------------------------------|
> |    1/1       508.5 [ 0.2%/ 0.0%]    2.1% |         504.7 [ 1.1%/ 0.1%]   -0.8%    2.1% |       509.0 [ 0.2%/ 0.0%]    0.1%    4.3% |
> |    2/2      1000.2 [ 1.4%/ 0.1%]    4.1% |        1004.1 [ 1.6%/ 0.2%]    0.4%    4.1% |       997.6 [ 1.2%/ 0.1%]   -0.3%    8.1% |
> |    4/4      1912.1 [ 1.0%/ 0.1%]    7.9% |        1904.2 [ 1.1%/ 0.1%]   -0.4%    7.9% |      1914.9 [ 1.3%/ 0.1%]    0.1%   15.1% |
> |    8/8      3753.5 [ 0.3%/ 0.0%]   14.9% |        3748.2 [ 0.3%/ 0.0%]   -0.1%   14.9% |      3751.3 [ 0.4%/ 0.0%]   -0.1%   30.5% |
> |  16/16      7139.3 [ 2.4%/ 0.2%]   30.3% |        7137.9 [ 1.8%/ 0.2%]   -0.0%   30.3% |      7049.2 [ 2.4%/ 0.2%]   -1.3%   60.4% |
> |  32/32     10899.0 [ 4.2%/ 0.4%]   60.3% |       10780.3 [ 4.4%/ 0.4%]   -1.1%   55.9% |     10339.2 [ 9.6%/ 0.9%]   -5.1%   97.7% |
> |  64/64     15086.1 [11.5%/ 1.2%]   97.7% |       14262.0 [ 8.2%/ 0.8%]   -5.5%   82.0% |     11168.7 [22.2%/ 1.7%]  -26.0%  100.0% |
> |128/128     15371.9 [22.0%/ 2.2%]  100.0% |       14675.8 [14.4%/ 1.4%]   -4.5%   82.8% |     10963.9 [18.5%/ 1.4%]  -28.7%  100.0% |
> |256/256     15990.8 [22.0%/ 2.2%]  100.0% |       12227.9 [10.3%/ 1.0%]  -23.5%   73.2% |     10469.9 [19.6%/ 1.7%]  -34.5%  100.0% |
> '------------------------------------------------------------------------------------------------------------------------------------'
>
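
For anyone who wants to re-derive the +/- columns or put a number on the
idle-CPU gap, here is a minimal Python sketch. It assumes the +/- values are
throughput changes relative to the vanilla-SMT column, which is consistent
with the figures quoted above. The names ROWS, rel_delta and summarize, and
the "cpu gap" metric, are made up for illustration and are not part of the
benchmark tooling. Only the rows from 32/32 upward are included.

```python
# Each entry is (throughput, cpu%) for vanilla-SMT, coresched-SMT and no-SMT,
# copied from the quoted table. The dict layout itself is an assumption.
ROWS = {
    "32/32":   ((10899.0, 60.3), (10780.3, 55.9), (10339.2, 97.7)),
    "64/64":   ((15086.1, 97.7), (14262.0, 82.0), (11168.7, 100.0)),
    "128/128": ((15371.9, 100.0), (14675.8, 82.8), (10963.9, 100.0)),
    "256/256": ((15990.8, 100.0), (12227.9, 73.2), (10469.9, 100.0)),
}


def rel_delta(value, baseline):
    """Percentage change of value relative to baseline."""
    return (value - baseline) / baseline * 100.0


def summarize(rows):
    """Return per-row throughput deltas vs. vanilla-SMT and the coresched cpu% gap."""
    out = {}
    for name, ((v_tput, v_cpu), (c_tput, c_cpu), (n_tput, _n_cpu)) in rows.items():
        out[name] = {
            # Throughput change of each configuration against vanilla-SMT.
            "coresched_delta": round(rel_delta(c_tput, v_tput), 1),
            "nosmt_delta": round(rel_delta(n_tput, v_tput), 1),
            # Hypothetical metric: percentage points of CPU left idle by
            # coresched-SMT compared with vanilla-SMT at the same load.
            "coresched_cpu_gap": round(c_cpu - v_cpu, 1),
        }
    return out


if __name__ == "__main__":
    for name, s in summarize(ROWS).items():
        print(f"{name:>8}  coresched {s['coresched_delta']:+6.1f}%  "
              f"no-SMT {s['nosmt_delta']:+6.1f}%  "
              f"cpu gap {s['coresched_cpu_gap']:+6.1f} pts")
```

Because the printed throughputs are already rounded, a recomputed delta can
differ from the table in the last digit (the 1/1 row, for instance, comes out
at -0.7% rather than -0.8%).
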
That's really nice and clear.

We start to see the penalty for coresched at 32/32, where it leaves some
CPUs more idle than they would otherwise be (55.9% vs. 60.3% cpu%). But it's
pretty good overall, for this benchmark at least.

Is this with stock v2 or with any of the fixes posted after? I wonder how
much the fixes for the race that violates the rule affect this, for example.

Cheers,
Phil

> Thanks,
> -Aubrey

--