On Fri, 2011-09-23 at 17:29 +0200, Erik Rull wrote:
> Alex Williamson wrote:
> > On Thu, 2011-09-22 at 18:43 +0200, Erik Rull wrote:
> >> Alex Williamson wrote:
> >>> See the extended -smp options:
> >>>
> >>> -smp n[,maxcpus=cpus][,cores=cores][,threads=threads][,sockets=sockets]
> >>> set the number of CPUs to 'n' [default=1]
> >>> maxcpus= maximum number of total cpus, including
> >>> offline CPUs for hotplug, etc
> >>> cores= number of CPU cores on one socket
> >>> threads= number of threads on one CPU core
> >>> sockets= number of discrete sockets in the system
> >>>
> >>> Try something like:
> >>>
> >>> -smp 4,cores=2,threads=2,sockets=1
> >>>
> >>> Alex
> >>>
> >>
> >> Great - the correct combination made it :-)
> >>
> >> But the SMP-Performance-Benchmark is horrible :-(
> >> "Only" between 0.35 and 1.05 for the above combination.
> >
> > I'm not sure what that means...
> >
> >> I have the same architecture on the host (2 cores w/ ht enabled) so there
> >> are enough real cores available for computation.
> >>
> >> Any idea what could slow down the performance here?
> >
> > Note that threads aren't real "full" cores, so you're likely going to
> > see some scheduling mismatches between physical threads and virtual
> > threads. One thing that often helps smp guests is to pin vCPUs to
> > pCPUs. You can get the vCPU thread IDs from 'info cpu' in the monitor
> > and pin each to a physical CPU with taskset. If you're using libvirt,
> > virt-manager can configure it to do this too (as well as the cpu
> > topology).
> >
> > Alex
> >
>
> The SMP factor means how much speed improvement was gained using SMP
> against a single core with the same algorithm. My results showed a heavy
> performance breakdown when doing the SMP benchmark.
> Fixing the virtual cores to the physical ones worked with taskset but
> didn't bring that real performance improvements at all :-(
>
> Maybe the used algorithm for benchmarking is not the best one, I will try
> others.

So effectively you're getting 1.35x to 2.05x of the uniprocessor performance. On its own, without knowing how the test fares on the host, that doesn't mean a whole lot. If you're expecting 4x from a dual-core, hyperthreaded system, it's not going to happen. I'd guess something in the range of 2-2.5x would be typical.

Hyperthreading is effectively a hardware-managed overcommitment of the processor. Trying to virtualize that directly to a guest introduces scheduling issues, since we don't get to control the hardware thread scheduler (other than maybe giving it hints to switch). I expect any published SMP benchmarks for VMs would try not to exceed 1 vCPU per core and may even disable hyperthreading on the system.

Thanks,

Alex
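
To make the "1 vCPU per core" idea concrete, here is a rough Python sketch of how one might plan the pinning before running taskset. It is only an illustration, not something from QEMU or libvirt: the function names, the thread IDs (1001, 1002) and the host layout (logical CPUs 0 and 2 sharing one core, 1 and 3 sharing the other) are all made up. On a real host the sibling layout can be read from /sys/devices/system/cpu/cpuN/topology/thread_siblings_list, and the thread IDs come from the monitor.

```python
def plan_pinning(vcpu_tids, host_cores):
    """Map each vCPU thread ID to one host logical CPU.

    host_cores holds one tuple per physical core, listing the logical
    CPU numbers (hardware threads) that share that core. The layout
    passed in is an assumption supplied by the caller.
    """
    # Take the first hardware thread of every core, then the second,
    # and so on, so vCPUs only share a core once every core is in use.
    order = []
    depth = max(len(core) for core in host_cores)
    for i in range(depth):
        for core in host_cores:
            if i < len(core):
                order.append(core[i])
    if len(vcpu_tids) > len(order):
        raise ValueError("more vCPUs than host logical CPUs")
    return dict(zip(vcpu_tids, order))


def taskset_commands(plan):
    """Build one taskset argument list per vCPU thread (nothing is run)."""
    return [["taskset", "-c", "-p", str(cpu), str(tid)]
            for tid, cpu in plan.items()]


if __name__ == "__main__":
    # Hypothetical thread IDs and sibling layout, for illustration only.
    plan = plan_pinning([1001, 1002], [(0, 2), (1, 3)])
    for argv in taskset_commands(plan):
        print(" ".join(argv))
```

With two vCPUs on that assumed layout, each vCPU gets a physical core to itself. With four, the third and fourth land on the sibling threads, which is the case where I'd expect the scaling to flatten out.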