On Thu, 10 Aug 2017 09:18:09 -0400
Eric Farman <far...@linux.vnet.ibm.com> wrote:

> On 08/08/2017 04:14 AM, Longpeng (Mike) wrote:
> > 
> > 
> > On 2017/8/8 15:41, Cornelia Huck wrote:
> >   
> >> On Tue, 8 Aug 2017 12:05:31 +0800
> >> "Longpeng(Mike)" <longpe...@huawei.com> wrote:
> >>  
> >>> This is a simple optimization for kvm_vcpu_on_spin, the
> >>> main idea is described in patch-1's commit msg.  
> >>
> >> I think this generally looks good now.
> >>  
> >>>
> >>> I did some tests based on the RFC version; the results show
> >>> that it can improve performance slightly.
> >>
> >> Did you re-run tests on this version?  
> > 
> > 
> > Hi Cornelia,
> > 
> > I didn't re-run tests on V2. But the major difference between RFC and V2
> > is that V2 only caches the result for X86 (s390/arm don't need it) and V2
> > saves an expensive operation (440-1400 cycles on my test machine) for X86/VMX.
> > 
> > So I think V2's performance is at least the same as RFC or even slightly
> > better. :)
> >   
> >>
> >> I would also like to see some s390 numbers; unfortunately I only have a
> >> z/VM environment and any performance numbers would be nearly useless
> >> there. Maybe somebody within IBM with a better setup can run a quick
> >> test?  
> 
> Won't swear I didn't screw something up, but here's some quick numbers. 
> Host was 4.12.0 with and without this series, running QEMU 2.10.0-rc0. 
> Created 4 guests, each with 4 CPU (unpinned) and 4GB RAM.  VM1 did full 
> kernel compiles with kernbench, which took averages of 5 runs of 
> different job sizes (I threw away the "-j 1" numbers). VM2-VM4 ran cpu 
> burners on 2 of their 4 cpus.
> 
> Numbers from VM1 kernbench output, and the delta between runs:
> 
> load -j 3             before          after           delta
> Elapsed Time          183.178         182.58          -0.598
> User Time             534.19          531.52          -2.67
> System Time           32.538          33.37           0.832
> Percent CPU           308.8           309             0.2
> Context Switches      98484.6         99001           516.4
> Sleeps                227347          228752          1405
> 
> load -j 16            before          after           delta
> Elapsed Time          153.352         147.59          -5.762
> User Time             545.829         533.41          -12.419
> System Time           34.289          34.85           0.561
> Percent CPU           347.6           348             0.4
> Context Switches      160518          159120          -1398
> Sleeps                240740          240536          -204

Thanks a lot, Eric!

The decreases in elapsed time look nice, and we probably should not
care about the increases reported.
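
For a rough sense of scale, the elapsed-time deltas can be expressed as
relative changes. This is only a minimal sketch: the numbers are copied
from Eric's kernbench output above, and the helper name pct_change is
made up for illustration, not part of kernbench or the series.

```python
# Relative change in kernbench elapsed time, before vs. after the series.
# Values are taken from the VM1 table quoted above.

def pct_change(before, after):
    """Return the change from before to after as a percentage of before."""
    return (after - before) / before * 100.0

# load label -> (elapsed before, elapsed after), in seconds
elapsed = {
    "-j 3": (183.178, 182.58),
    "-j 16": (153.352, 147.59),
}

changes = {load: pct_change(b, a) for load, (b, a) in elapsed.items()}

for load, pct in changes.items():
    print(f"load {load}: elapsed time {pct:+.2f}%")
```

That works out to roughly a third of a percent for -j 3 and a bit under
four percent for -j 16, from a single set of runs per configuration.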
