Possible approaches to limit csw overhead
Hello, I have a rather practical question: is it possible to limit the number of VM-initiated events for a single VM? As an example, a VM which experienced an OOM and is effectively stuck dead generates a lot of unnecessary context switches, triggering do_raw_spin_lock very often and therefore increasing the overall compute workload. This could be done via reactive limitation of the CPU quota via cgroups, but such a method is quite impractical because every orchestration solution will need to implement its own piece of code to detect such VM states and act properly. I wonder if there may be a proposal which would do this job better than a userspace-implemented perf statistics loop. Thanks! -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
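[Editor's illustration] As a concrete sketch of the reactive cgroup workaround mentioned in the message above: the snippet below caps a VM by writing a CPU bandwidth quota into its cgroup v1 "cpu" controller (cgroup v2 exposes the same knob as a single cpu.max file). The cgroup directory, quota values, and helper names are illustrative assumptions, not an existing tool or API.

/*
 * Hypothetical userspace sketch: throttle a runaway VM via the cgroup v1
 * CPU bandwidth controller. Paths and names are examples only.
 */
#include <stdio.h>

static int write_cgroup_value(const char *cgroup_dir, const char *file, long value)
{
    char path[256];
    FILE *f;

    snprintf(path, sizeof(path), "%s/%s", cgroup_dir, file);
    f = fopen(path, "w");
    if (!f)
        return -1;
    fprintf(f, "%ld\n", value);
    fclose(f);
    return 0;
}

/* e.g. limit_vm_cpu("/sys/fs/cgroup/cpu/machine/vm-123", 10000, 100000)
 * caps the VM at roughly 10% of one CPU. */
static int limit_vm_cpu(const char *cgroup_dir, long quota_us, long period_us)
{
    if (write_cgroup_value(cgroup_dir, "cpu.cfs_period_us", period_us))
        return -1;
    return write_cgroup_value(cgroup_dir, "cpu.cfs_quota_us", quota_us);
}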
Re: nested KVM slower than QEMU with gnumach guest kernel
Jan Kiszka, le Mon 17 Nov 2014 07:28:23 +0100, a écrit : > > AIUI, the external interrupt is 0xf6, i.e. Linux' IRQ_WORK_VECTOR. I > > however don't see any of them, neither in L0's /proc/interrupts, nor in > > L1's /proc/interrupts... > > I suppose this is a SMP host and guest? L0 is a hyperthreaded quad-core, but L1 is only 1 VCPU. In the trace, L1 apparently happens to have always been scheduled on the same L0 CPU: trace-cmd tells me that CPUs [0-2,4-7] are empty. Samuel -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: nested KVM slower than QEMU with gnumach guest kernel
On Sun, Nov 16, 2014 at 11:18:28PM +0100, Samuel Thibault wrote: > Hello, > > Jan Kiszka, le Wed 12 Nov 2014 00:42:52 +0100, a écrit : > > On 2014-11-11 19:55, Samuel Thibault wrote: > > > jenkins.debian.net is running inside a KVM VM, and it runs nested > > > KVM guests for its installation attempts. This goes fine with Linux > > > kernels, but it is extremely slow with gnumach kernels. > > > You can try to catch a trace (ftrace) on the physical host. > > > > I suspect the setup forces a lot of instruction emulation, either on L0 > > or L1. And that is slower than QEMU is KVM does not optimize like QEMU does. > > Here is a sample of trace-cmd output dump: the same kind of pattern > repeats over and over, with EXTERNAL_INTERRUPT happening mostly > every other microsecond: > > qemu-system-x86-9752 [003] 4106.187755: kvm_exit: reason > EXTERNAL_INTERRUPT rip 0xa02848b1 info 0 80f6 > qemu-system-x86-9752 [003] 4106.187756: kvm_entry:vcpu 0 > qemu-system-x86-9752 [003] 4106.187757: kvm_exit: reason > EXTERNAL_INTERRUPT rip 0xa02848b1 info 0 80f6 > qemu-system-x86-9752 [003] 4106.187758: kvm_entry:vcpu 0 > qemu-system-x86-9752 [003] 4106.187759: kvm_exit: reason > EXTERNAL_INTERRUPT rip 0xa02848b1 info 0 80f6 > qemu-system-x86-9752 [003] 4106.187760: kvm_entry:vcpu 0 > > The various functions being interrupted are vmx_vcpu_run > (0xa02848b1 and 0xa0284972), handle_io > (0xa027ee62), vmx_get_cpl (0xa027a7de), > load_vmc12_host_state (0xa027ea31), native_read_tscp > (0x81050a84), native_write_msr_safe (0x81050aa6), > vmx_decache_cr0_guest_bits (0xa027a384), > vmx_handle_external_intr (0xa027a54d). > > AIUI, the external interrupt is 0xf6, i.e. Linux' IRQ_WORK_VECTOR. I > however don't see any of them, neither in L0's /proc/interrupts, nor in > L1's /proc/interrupts... > Do you know how gnumach timekeeping works? Does it have a timer that fires each 1ms? Which clock device is it using? -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: nested KVM slower than QEMU with gnumach guest kernel
Gleb Natapov, le Mon 17 Nov 2014 10:58:45 +0200, a écrit : > Do you know how gnumach timekeeping works? Does it have a timer that fires > each 1ms? > Which clock device is it using? It uses the PIT every 10ms, in square mode (PIT_C0|PIT_SQUAREMODE|PIT_READMODE = 0x36). Samuel -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
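[Editor's illustration] For context, 0x36 selects PIT channel 0, low-byte/high-byte access, mode 3 (square wave). A minimal sketch of programming a 10 ms tick that way is shown below; it is illustrative userspace C, not gnumach source, and assumes x86 port I/O access has already been granted (e.g. via ioperm()).

#include <sys/io.h>             /* outb(); requires ioperm() and root */

#define PIT_INPUT_HZ  1193182   /* PIT input clock in Hz */
#define TICK_HZ       100       /* 100 Hz -> 10 ms period */
#define PIT_CMD       0x43      /* mode/command register */
#define PIT_CH0       0x40      /* channel 0 data port */
#define PIT_MODE      0x36      /* channel 0, lobyte/hibyte, mode 3 (square wave) */

static void pit_program_10ms_tick(void)
{
    unsigned short divisor = PIT_INPUT_HZ / TICK_HZ;  /* 11931 */

    outb(PIT_MODE, PIT_CMD);
    outb(divisor & 0xff, PIT_CH0);    /* low byte first ... */
    outb(divisor >> 8, PIT_CH0);      /* ... then high byte */
}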
Re: nested KVM slower than QEMU with gnumach guest kernel
On 2014-11-17 10:03, Samuel Thibault wrote: > Gleb Natapov, le Mon 17 Nov 2014 10:58:45 +0200, a écrit : >> Do you know how gnumach timekeeping works? Does it have a timer that fires >> each 1ms? >> Which clock device is it using? > > It uses the PIT every 10ms, in square mode > (PIT_C0|PIT_SQUAREMODE|PIT_READMODE = 0x36). Wow... how retro. That feature might be unsupported - does user space irqchip work better? Jan
Re: nested KVM slower than QEMU with gnumach guest kernel
Jan Kiszka, le Mon 17 Nov 2014 10:04:37 +0100, a écrit : > On 2014-11-17 10:03, Samuel Thibault wrote: > > Gleb Natapov, le Mon 17 Nov 2014 10:58:45 +0200, a écrit : > >> Do you know how gnumach timekeeping works? Does it have a timer that fires > >> each 1ms? > >> Which clock device is it using? > > > > It uses the PIT every 10ms, in square mode > > (PIT_C0|PIT_SQUAREMODE|PIT_READMODE = 0x36). > > Wow... how retro. That feature might be unsupported - does user space > irqchip work better? I had indeed tried giving -machine kernel_irqchip=off to the L2 kvm, with the same bad performance and external_interrupt in the trace. Samuel -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: nested KVM slower than QEMU with gnumach guest kernel
On Mon, Nov 17, 2014 at 10:10:25AM +0100, Samuel Thibault wrote: > Jan Kiszka, le Mon 17 Nov 2014 10:04:37 +0100, a écrit : > > On 2014-11-17 10:03, Samuel Thibault wrote: > > > Gleb Natapov, le Mon 17 Nov 2014 10:58:45 +0200, a écrit : > > >> Do you know how gnumach timekeeping works? Does it have a timer that > > >> fires each 1ms? > > >> Which clock device is it using? > > > > > > It uses the PIT every 10ms, in square mode > > > (PIT_C0|PIT_SQUAREMODE|PIT_READMODE = 0x36). > > > > Wow... how retro. That feature might be unsupported - does user space > > irqchip work better? > > I had indeed tried giving -machine kernel_irqchip=off to the L2 kvm, > with the same bad performance and external_interrupt in the trace. > They will always be in the trace, but do you see them each ms or each 10ms with user space irqchip? -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC][PATCH 1/2] kvm: x86: mmu: return zero if s > e in rsvd_bits()
On 17/11/2014 02:34, Chen, Tiejun wrote: > On 2014/11/14 18:06, Paolo Bonzini wrote: >> >> >> On 14/11/2014 10:31, Tiejun Chen wrote: >>> In some real scenarios 'start' may not be less than 'end' like >>> maxphyaddr = 52. >>> >>> Signed-off-by: Tiejun Chen >>> --- >>> arch/x86/kvm/mmu.h | 2 ++ >>> 1 file changed, 2 insertions(+) >>> >>> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h >>> index bde8ee7..0e98b5e 100644 >>> --- a/arch/x86/kvm/mmu.h >>> +++ b/arch/x86/kvm/mmu.h >>> @@ -58,6 +58,8 @@ >>> >>> static inline u64 rsvd_bits(int s, int e) >>> { >>> +if (unlikely(s > e)) >>> +return 0; >>> return ((1ULL << (e - s + 1)) - 1) << s; >>> } >>> >>> >> >> s == e + 1 is supported: >> >> (1ULL << (e - (e + 1) + 1)) - 1) << s == > > (1ULL << (e - (e + 1) + 1)) - 1) << s > = (1ULL << (e - e - 1) + 1)) - 1) << s > = (1ULL << (-1) + 1)) - 1) << s no, ((1ULL << (-1 + 1)) - 1) << s > = (1ULL << (0) - 1) << s ((1ULL << (0)) - 1) << s > = (1ULL << (- 1) << s (1 - 1) << s 0 << s Paolo > > Am I missing something? > > Thanks > Tiejun > >> (1ULL << 0) << s == >> 0 >> >> Is there any case where s is even bigger? >> >> Paolo >> -- >> To unsubscribe from this list: send the line "unsubscribe kvm" in >> the body of a message to majord...@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
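[Editor's illustration] Worked through as a standalone toy program (not kernel code), the arithmetic Paolo is pointing at looks like this: for s == e + 1 the shift count collapses to 0 and the mask is already 0, so no extra 's > e' check is needed for that case.

#include <stdio.h>

/* Same expression as the kernel helper under discussion. */
static unsigned long long rsvd_bits(int s, int e)
{
    return ((1ULL << (e - s + 1)) - 1) << s;
}

int main(void)
{
    /* maxphyaddr == 52 case: s == e + 1 already yields an empty mask. */
    printf("rsvd_bits(52, 51) = %#llx\n", rsvd_bits(52, 51)); /* prints 0 */

    /* Normal case: bits 12..51 set. */
    printf("rsvd_bits(12, 51) = %#llx\n", rsvd_bits(12, 51));
    return 0;
}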
Re: [PATCH 0/3] KVM: simplification to the memslots code
On 17/11/2014 02:56, Takuya Yoshikawa wrote: >> > here are a few small patches that simplify __kvm_set_memory_region >> > and associated code. Can you please review them? > Ah, already queued. Sorry for being late to respond. While they are not in kvm/next, there's time to add Reviewed-by's and all that. kvm/queue basically means "I want Fengguang to compile-test them, some testing done on x86_64". Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC][PATCH 1/2] kvm: x86: mmu: return zero if s > e in rsvd_bits()
On 2014/11/17 17:22, Paolo Bonzini wrote: On 17/11/2014 02:34, Chen, Tiejun wrote: On 2014/11/14 18:06, Paolo Bonzini wrote: On 14/11/2014 10:31, Tiejun Chen wrote: In some real scenarios 'start' may not be less than 'end' like maxphyaddr = 52. Signed-off-by: Tiejun Chen --- arch/x86/kvm/mmu.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index bde8ee7..0e98b5e 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -58,6 +58,8 @@ static inline u64 rsvd_bits(int s, int e) { +if (unlikely(s > e)) +return 0; return ((1ULL << (e - s + 1)) - 1) << s; } s == e + 1 is supported: (1ULL << (e - (e + 1) + 1)) - 1) << s == (1ULL << (e - (e + 1) + 1)) - 1) << s = (1ULL << (e - e - 1) + 1)) - 1) << s = (1ULL << (-1) + 1)) - 1) << s no, You're right since I'm seeing "()" wrongly. Sorry to bother you. Thanks Tiejun -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/3] KVM: simplification to the memslots code
On 2014/11/17 18:23, Paolo Bonzini wrote: > > > On 17/11/2014 02:56, Takuya Yoshikawa wrote: here are a few small patches that simplify __kvm_set_memory_region and associated code. Can you please review them? >> Ah, already queued. Sorry for being late to respond. > > While they are not in kvm/next, there's time to add Reviewed-by's and > all that. kvm/queue basically means "I want Fengguang to compile-test > them, some testing done on x86_64". > > Paolo > OK. I reviewed patches 2/3 and 3/3 and saw no problems there, only some improvements. Takuya -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: nested KVM slower than QEMU with gnumach guest kernel
Gleb Natapov, le Mon 17 Nov 2014 11:21:22 +0200, a écrit : > On Mon, Nov 17, 2014 at 10:10:25AM +0100, Samuel Thibault wrote: > > Jan Kiszka, le Mon 17 Nov 2014 10:04:37 +0100, a écrit : > > > On 2014-11-17 10:03, Samuel Thibault wrote: > > > > Gleb Natapov, le Mon 17 Nov 2014 10:58:45 +0200, a écrit : > > > >> Do you know how gnumach timekeeping works? Does it have a timer that > > > >> fires each 1ms? > > > >> Which clock device is it using? > > > > > > > > It uses the PIT every 10ms, in square mode > > > > (PIT_C0|PIT_SQUAREMODE|PIT_READMODE = 0x36). > > > > > > Wow... how retro. That feature might be unsupported - does user space > > > irqchip work better? > > > > I had indeed tried giving -machine kernel_irqchip=off to the L2 kvm, > > with the same bad performance and external_interrupt in the trace. > > > They will always be in the trace, but do you see them each ms or each 10ms > with user space irqchip? The external interrupts are every 1 *microsecond*, not millisecond, with irqchip=off or not. Samuel -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: nested KVM slower than QEMU with gnumach guest kernel
Also, I have made gnumach show a timer counter, it does get PIT interrupts every 10ms as expected, not more often. Samuel -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: vhost + multiqueue + RSS question.
On Mon, Nov 17, 2014 at 09:44:23AM +0200, Gleb Natapov wrote: > On Sun, Nov 16, 2014 at 08:56:04PM +0200, Michael S. Tsirkin wrote: > > On Sun, Nov 16, 2014 at 06:18:18PM +0200, Gleb Natapov wrote: > > > Hi Michael, > > > > > > I am playing with vhost multiqueue capability and have a question about > > > vhost multiqueue and RSS (receive side steering). My setup has Mellanox > > > ConnectX-3 NIC which supports multiqueue and RSS. Network related > > > parameters for qemu are: > > > > > >-netdev tap,id=hn0,script=qemu-ifup.sh,vhost=on,queues=4 > > >-device virtio-net-pci,netdev=hn0,id=nic1,mq=on,vectors=10 > > > > > > In a guest I ran "ethtool -L eth0 combined 4" to enable multiqueue. > > > > > > I am running one tcp stream into the guest using iperf. Since there is > > > only one tcp stream I expect it to be handled by one queue only but > > > this seams to be not the case. ethtool -S on a host shows that the > > > stream is handled by one queue in the NIC, just like I would expect, > > > but in a guest all 4 virtio-input interrupt are incremented. Am I > > > missing any configuration? > > > > I don't see anything obviously wrong with what you describe. > > Maybe, somehow, same irqfd got bound to multiple MSI vectors? > It does not look like this is what is happening judging by the way > interrupts are distributed between queues. They are not distributed > uniformly and often I see one queue gets most interrupt and others get > much less and then it changes. Weird. It would happen if you transmitted from multiple CPUs. You did pin iperf to a single CPU within guest, did you not? > -- > Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Seeking a KVM benchmark
Hi Paolo, On 11/11/14, 1:28 AM, Paolo Bonzini wrote: On 10/11/2014 15:23, Avi Kivity wrote: It's not surprising [1]. Since the meaning of some PTE bits change [2], the TLB has to be flushed. In VMX we have VPIDs, so we only need to flush if EFER changed between two invocations of the same VPID, which isn't the case. Is a TLB flush needed if the guest is UP? Regards, Wanpeng Li [1] after the fact [2] although those bits were reserved with NXE=0, so they shouldn't have any TLB footprint You're right that this is not that surprising after the fact, and that both Sandy Bridge and Ivy Bridge have VPIDs (even the non-Xeon ones). This is also why I'm curious about the Nehalem. However note that even toggling the SCE bit is flushing the TLB. The NXE bit is not being toggled here! That's the more surprising part. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Seeking a KVM benchmark
On 17/11/2014 12:17, Wanpeng Li wrote: >> >>> It's not surprising [1]. Since the meaning of some PTE bits change [2], >>> the TLB has to be flushed. In VMX we have VPIDs, so we only need to flush >>> if EFER changed between two invocations of the same VPID, which isn't >>> the case. > > Is a TLB flush needed if the guest is UP? The wrmsr is in the host, and the TLB flush is done in the processor microcode. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: vhost + multiqueue + RSS question.
On Mon, Nov 17, 2014 at 12:38:16PM +0200, Michael S. Tsirkin wrote: > On Mon, Nov 17, 2014 at 09:44:23AM +0200, Gleb Natapov wrote: > > On Sun, Nov 16, 2014 at 08:56:04PM +0200, Michael S. Tsirkin wrote: > > > On Sun, Nov 16, 2014 at 06:18:18PM +0200, Gleb Natapov wrote: > > > > Hi Michael, > > > > > > > > I am playing with vhost multiqueue capability and have a question about > > > > vhost multiqueue and RSS (receive side steering). My setup has Mellanox > > > > ConnectX-3 NIC which supports multiqueue and RSS. Network related > > > > parameters for qemu are: > > > > > > > >-netdev tap,id=hn0,script=qemu-ifup.sh,vhost=on,queues=4 > > > >-device virtio-net-pci,netdev=hn0,id=nic1,mq=on,vectors=10 > > > > > > > > In a guest I ran "ethtool -L eth0 combined 4" to enable multiqueue. > > > > > > > > I am running one tcp stream into the guest using iperf. Since there is > > > > only one tcp stream I expect it to be handled by one queue only but > > > > this seams to be not the case. ethtool -S on a host shows that the > > > > stream is handled by one queue in the NIC, just like I would expect, > > > > but in a guest all 4 virtio-input interrupt are incremented. Am I > > > > missing any configuration? > > > > > > I don't see anything obviously wrong with what you describe. > > > Maybe, somehow, same irqfd got bound to multiple MSI vectors? > > It does not look like this is what is happening judging by the way > > interrupts are distributed between queues. They are not distributed > > uniformly and often I see one queue gets most interrupt and others get > > much less and then it changes. > > Weird. It would happen if you transmitted from multiple CPUs. > You did pin iperf to a single CPU within guest, did you not? > No, I didn't because I didn't expect it to matter for input interrupts. When I run iperf on the host, the rx queue that receives all the packets depends only on the connection itself, not on the cpu iperf is running on (I tested that). When I pin iperf in a guest I do indeed see that all interrupts are arriving to the same irq vector. Is the number after virtio-input in /proc/interrupts any indication of the queue a packet arrived on (on a host I can use ethtool -S to check what queue receives packets, but unfortunately this does not work for a virtio nic in a guest)? Because if it is, the way RSS works in virtio is not how it works on the host and not what I would expect after reading about RSS. The queue a packet arrives on should be calculated by hashing fields from the packet header only. -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [RFC v2 0/9] KVM-VFIO IRQ forward control
> -Original Message- > From: linux-kernel-ow...@vger.kernel.org > [mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of Alex Williamson > Sent: Thursday, September 11, 2014 1:10 PM > To: Christoffer Dall > Cc: Eric Auger; eric.au...@st.com; marc.zyng...@arm.com; > linux-arm-ker...@lists.infradead.org; kvm...@lists.cs.columbia.edu; > kvm@vger.kernel.org; joel.sch...@amd.com; kim.phill...@freescale.com; > pau...@samba.org; g...@kernel.org; pbonz...@redhat.com; > linux-ker...@vger.kernel.org; patc...@linaro.org; will.dea...@arm.com; > a.mota...@virtualopensystems.com; a.r...@virtualopensystems.com; > john.li...@huawei.com > Subject: Re: [RFC v2 0/9] KVM-VFIO IRQ forward control > > On Thu, 2014-09-11 at 05:10 +0200, Christoffer Dall wrote: > > On Tue, Sep 02, 2014 at 03:05:41PM -0600, Alex Williamson wrote: > > > On Mon, 2014-09-01 at 14:52 +0200, Eric Auger wrote: > > > > This RFC proposes an integration of "ARM: Forwarding physical > > > > interrupts to a guest VM" (http://lwn.net/Articles/603514/) in > > > > KVM. > > > > > > > > It enables to transform a VFIO platform driver IRQ into a forwarded > > > > IRQ. The direct benefit is that, for a level sensitive IRQ, a VM > > > > switch can be avoided on guest virtual IRQ completion. Before this > > > > patch, a maintenance IRQ was triggered on the virtual IRQ completion. > > > > > > > > When the IRQ is forwarded, the VFIO platform driver does not need to > > > > disable the IRQ anymore. Indeed when returning from the IRQ handler > > > > the IRQ is not deactivated. Only its priority is lowered. This means > > > > the same IRQ cannot hit before the guest completes the virtual IRQ > > > > and the GIC automatically deactivates the corresponding physical IRQ. > > > > > > > > Besides, the injection still is based on irqfd triggering. The only > > > > impact on irqfd process is resamplefd is not called anymore on > > > > virtual IRQ completion since this latter becomes "transparent". > > > > > > > > The current integration is based on an extension of the KVM-VFIO > > > > device, previously used by KVM to interact with VFIO groups. The > > > > patch serie now enables KVM to directly interact with a VFIO > > > > platform device. The VFIO external API was extended for that purpose. > > > > > > > > Th KVM-VFIO device can get/put the vfio platform device, check its > > > > integrity and type, get the IRQ number associated to an IRQ index. > > > > > > > > The IRQ forward programming is architecture specific (virtual interrupt > > > > controller programming basically). However the whole infrastructure is > > > > kept generic. > > > > > > > > from a user point of view, the functionality is provided through new > > > > KVM-VFIO device commands, > KVM_DEV_VFIO_DEVICE_(UN)FORWARD_IRQ > > > > and the capability can be checked with KVM_HAS_DEVICE_ATTR. > > > > Assignment can only be changed when the physical IRQ is not active. > > > > It is the responsability of the user to do this check. 
> > > > > > > > This patch serie has the following dependencies: > > > > - "ARM: Forwarding physical interrupts to a guest VM" > > > > (http://lwn.net/Articles/603514/) in > > > > - [PATCH v3] irqfd for ARM > > > > - and obviously the VFIO platform driver serie: > > > > [RFC PATCH v6 00/20] VFIO support for platform devices on ARM > > > > https://www.mail-archive.com/kvm@vger.kernel.org/msg103247.html > > > > > > > > Integrated pieces can be found at > > > > ssh://git.linaro.org/people/eric.auger/linux.git > > > > on branch 3.17rc3_irqfd_forward_integ_v2 > > > > > > > > This was was tested on Calxeda Midway, assigning the xgmac main IRQ. > > > > > > > > v1 -> v2: > > > > - forward control is moved from architecture specific file into generic > > > > vfio.c module. > > > > only kvm_arch_set_fwd_state remains architecture specific > > > > - integrate Kim's patch which enables KVM-VFIO for ARM > > > > - fix vgic state bypass in vgic_queue_hwirq > > > > - struct kvm_arch_forwarded_irq moved from > arch/arm/include/uapi/asm/kvm.h > > > > to include/uapi/linux/kvm.h > > > > also irq_index renamed into index and guest_irq renamed into gsi > > > > - ASSIGN/DEASSIGN renamed into FORWARD/UNFORWARD > > > > - vfio_external_get_base_device renamed into vfio_external_base_device > > > > - vfio_external_get_type removed > > > > - kvm_vfio_external_get_base_device renamed into > kvm_vfio_external_base_device > > > > - __KVM_HAVE_ARCH_KVM_VFIO renamed into > __KVM_HAVE_ARCH_KVM_VFIO_FORWARD > > > > > > > > Eric Auger (8): > > > > KVM: ARM: VGIC: fix multiple injection of level sensitive forwarded > > > > IRQ > > > > KVM: ARM: VGIC: add forwarded irq rbtree lock > > > > VFIO: platform: handler tests whether the IRQ is forwarded > > > > KVM: KVM-VFIO: update user API to program forwarded IRQ > > > > VFIO: Extend external user API > > > > KVM: KVM-VFIO: add new VFIO external API hooks > > > > KVM: KVM-VFIO: generic KVM_DEV_VFIO_DEVICE command and IRQ > forwarding > > >
[v2][PATCH] kvm: x86: mmio: fix setting the present bit of mmio spte
In non-ept 64-bit of PAE case maxphyaddr may be 52bit as well, so we also need to disable mmio page fault. Here we can check MMIO_SPTE_GEN_HIGH_SHIFT directly to determine if we should set the present bit, and bring a little cleanup. Signed-off-by: Tiejun Chen --- v2: * Correct codes comments * Need to use "|=" to set the present bit arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/mmu.c | 25 + arch/x86/kvm/x86.c | 30 -- 3 files changed, 26 insertions(+), 30 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index dc932d3..667f2b6 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -809,6 +809,7 @@ void kvm_mmu_write_protect_pt_masked(struct kvm *kvm, struct kvm_memory_slot *slot, gfn_t gfn_offset, unsigned long mask); void kvm_mmu_zap_all(struct kvm *kvm); +void kvm_set_mmio_spte_mask(void); void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm); unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm); void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages); diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index ac1c4de..fe9a917 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -295,6 +295,31 @@ static bool check_mmio_spte(struct kvm *kvm, u64 spte) return likely(kvm_gen == spte_gen); } +/* + * Set the reserved bits and the present bit of an paging-structure + * entry to generate page fault with PFER.RSV = 1. + */ +void kvm_set_mmio_spte_mask(void) +{ + u64 mask; + int maxphyaddr = boot_cpu_data.x86_phys_bits; + + /* Mask the reserved physical address bits. */ + mask = rsvd_bits(maxphyaddr, MMIO_SPTE_GEN_HIGH_SHIFT - 1); + + /* Magic bits are always reserved to identify mmio spte. +* On 32 bit systems we have bit 62. +*/ + mask |= 0x3ull << 62; + + /* Set the present bit to enable mmio page fault. */ + if (maxphyaddr < MMIO_SPTE_GEN_HIGH_SHIFT) + mask |= PT_PRESENT_MASK; + + kvm_mmu_set_mmio_spte_mask(mask); +} +EXPORT_SYMBOL_GPL(kvm_set_mmio_spte_mask); + void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask, u64 dirty_mask, u64 nx_mask, u64 x_mask) { diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index f85da5c..550f179 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -5596,36 +5596,6 @@ void kvm_after_handle_nmi(struct kvm_vcpu *vcpu) } EXPORT_SYMBOL_GPL(kvm_after_handle_nmi); -static void kvm_set_mmio_spte_mask(void) -{ - u64 mask; - int maxphyaddr = boot_cpu_data.x86_phys_bits; - - /* -* Set the reserved bits and the present bit of an paging-structure -* entry to generate page fault with PFER.RSV = 1. -*/ -/* Mask the reserved physical address bits. */ - mask = rsvd_bits(maxphyaddr, 51); - - /* Bit 62 is always reserved for 32bit host. */ - mask |= 0x3ull << 62; - - /* Set the present bit. */ - mask |= 1ull; - -#ifdef CONFIG_X86_64 - /* -* If reserved bit is not supported, clear the present bit to disable -* mmio page fault. -*/ - if (maxphyaddr == 52) - mask &= ~1ull; -#endif - - kvm_mmu_set_mmio_spte_mask(mask); -} - #ifdef CONFIG_X86_64 static void pvclock_gtod_update_fn(struct work_struct *work) { -- 1.9.1 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [v2][PATCH] kvm: x86: mmio: fix setting the present bit of mmio spte
On 17/11/2014 12:31, Tiejun Chen wrote: > In non-ept 64-bit of PAE case maxphyaddr may be 52bit as well, There is no such thing as 64-bit PAE. On 32-bit PAE hosts, PTEs have bit 62 reserved, as in your patch: > + /* Magic bits are always reserved for 32bit host. */ > + mask |= 0x3ull << 62; so there is no need to disable the MMIO page fault. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: vhost + multiqueue + RSS question.
On Mon, Nov 17, 2014 at 01:22:07PM +0200, Gleb Natapov wrote: > On Mon, Nov 17, 2014 at 12:38:16PM +0200, Michael S. Tsirkin wrote: > > On Mon, Nov 17, 2014 at 09:44:23AM +0200, Gleb Natapov wrote: > > > On Sun, Nov 16, 2014 at 08:56:04PM +0200, Michael S. Tsirkin wrote: > > > > On Sun, Nov 16, 2014 at 06:18:18PM +0200, Gleb Natapov wrote: > > > > > Hi Michael, > > > > > > > > > > I am playing with vhost multiqueue capability and have a question > > > > > about > > > > > vhost multiqueue and RSS (receive side steering). My setup has > > > > > Mellanox > > > > > ConnectX-3 NIC which supports multiqueue and RSS. Network related > > > > > parameters for qemu are: > > > > > > > > > >-netdev tap,id=hn0,script=qemu-ifup.sh,vhost=on,queues=4 > > > > >-device virtio-net-pci,netdev=hn0,id=nic1,mq=on,vectors=10 > > > > > > > > > > In a guest I ran "ethtool -L eth0 combined 4" to enable multiqueue. > > > > > > > > > > I am running one tcp stream into the guest using iperf. Since there is > > > > > only one tcp stream I expect it to be handled by one queue only but > > > > > this seams to be not the case. ethtool -S on a host shows that the > > > > > stream is handled by one queue in the NIC, just like I would expect, > > > > > but in a guest all 4 virtio-input interrupt are incremented. Am I > > > > > missing any configuration? > > > > > > > > I don't see anything obviously wrong with what you describe. > > > > Maybe, somehow, same irqfd got bound to multiple MSI vectors? > > > It does not look like this is what is happening judging by the way > > > interrupts are distributed between queues. They are not distributed > > > uniformly and often I see one queue gets most interrupt and others get > > > much less and then it changes. > > > > Weird. It would happen if you transmitted from multiple CPUs. > > You did pin iperf to a single CPU within guest, did you not? > > > No, I didn't because I didn't expect it to matter for input interrupts. > When I run iperf on a host rx queue that receives all packets depends > only on a connection itself, not on a cpu iperf is running on (I tested > that). This really depends on the type of networking card you have on the host, and how it's configured. I think you will get something more closely resembling this behaviour if you enable RFS in host. > When I pin iperf in a guest I do indeed see that all interrupts > are arriving to the same irq vector. Is a number after virtio-input > in /proc/interrupt any indication of a queue a packet arrived to (on > a host I can use ethtool -S to check what queue receives packets, but > unfortunately this does not work for virtio nic in a guest)? I think it is. > Because if > it is the way RSS works in virtio is not how it works on a host and not > what I would expect after reading about RSS. The queue a packets arrives > to should be calculated by hashing fields from a packet header only. Yes, what virtio has is not RSS - it's an accelerated RFS really. The point is to try and take application locality into account. > -- > Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
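[Editor's illustration] To make the distinction concrete: with pure receive-side scaling (RSS) the rx queue is chosen from a hash of packet-header fields alone, so a single TCP stream always lands on one queue, whereas the accelerated-RFS-style steering described above also follows where the consuming application runs. Below is a rough sketch of the header-hash-only variant; the struct, hash, and function names are illustrative (real NICs typically use a Toeplitz hash with a secret key), not any driver's code.

#include <stdint.h>

struct flow_key {
    uint32_t saddr, daddr;      /* IPv4 source/destination address */
    uint16_t sport, dport;      /* TCP/UDP source/destination port */
};

/* Toy mixing function standing in for a real Toeplitz hash. */
static uint32_t flow_hash(const struct flow_key *k)
{
    uint32_t h = k->saddr ^ k->daddr ^ (((uint32_t)k->sport << 16) | k->dport);

    h ^= h >> 16;
    h *= 0x45d9f3bu;
    h ^= h >> 16;
    return h;
}

/* Pure RSS: the queue depends only on the packet header, never on the consumer. */
static unsigned int rss_pick_queue(const struct flow_key *k, unsigned int nqueues)
{
    return flow_hash(k) % nqueues;
}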
Re: Seeking a KVM benchmark
Hi Paolo, On 11/17/14, 7:18 PM, Paolo Bonzini wrote: On 17/11/2014 12:17, Wanpeng Li wrote: It's not surprising [1]. Since the meaning of some PTE bits change [2], the TLB has to be flushed. In VMX we have VPIDs, so we only need to flush if EFER changed between two invocations of the same VPID, which isn't the case. Is a TLB flush needed if the guest is UP? The wrmsr is in the host, and the TLB flush is done in the processor microcode. Sorry, maybe I didn't state my question clearly. As Avi mentioned above, "In VMX we have VPIDs, so we only need to flush if EFER changed between two invocations of the same VPID", so there is only one VPID if the guest is UP. My question is whether a TLB flush is needed when the guest's EFER has been changed. Regards, Wanpeng Li Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Seeking a KVM benchmark
On 17/11/2014 13:00, Wanpeng Li wrote: > Sorry, maybe I didn't state my question clearly. As Avi mentioned above > "In VMX we have VPIDs, so we only need to flush if EFER changed between > two invocations of the same VPID", so there is only one VPID if the > guest is UP, my question is if there need a TLB flush when guest's EFER > has been changed? Yes, because the meaning of the page table entries has changed. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Seeking a KVM benchmark
Hi Paolo, On 11/17/14, 8:04 PM, Paolo Bonzini wrote: On 17/11/2014 13:00, Wanpeng Li wrote: Sorry, maybe I didn't state my question clearly. As Avi mentioned above "In VMX we have VPIDs, so we only need to flush if EFER changed between two invocations of the same VPID", so there is only one VPID if the guest is UP, my question is if there need a TLB flush when guest's EFER has been changed? Yes, because the meaning of the page table entries has changed. So both VMX EFER writes and non-VMX EFER writes cause a TLB flush for UP guest, is there still a performance improvement in this case? Regards, Wanpeng Li Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Seeking a KVM benchmark
On 17/11/2014 13:14, Wanpeng Li wrote: >> >>> Sorry, maybe I didn't state my question clearly. As Avi mentioned above >>> "In VMX we have VPIDs, so we only need to flush if EFER changed between >>> two invocations of the same VPID", so there is only one VPID if the >>> guest is UP, my question is if there need a TLB flush when guest's EFER >>> has been changed? >> Yes, because the meaning of the page table entries has changed. > > So both VMX EFER writes and non-VMX EFER writes cause a TLB flush for UP > guest, is there still a performance improvement in this case? Note that the guest's EFER does not change, so no TLB flush happens. The guest EFER, however, is different from the host's, so if you change it with a wrmsr in the host you will get a TLB flush on every userspace exit. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
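[Editor's illustration] As a side note, the reason the flush is avoidable in principle is that an MSR write is only needed when the value actually changes, which is what Avi's earlier remark alludes to. The toy model below is plain C and not the real KVM code path; in the kernel the store would be a wrmsrl(MSR_EFER, ...), and that write is what triggers the implicit TLB flush.

#include <stdbool.h>
#include <stdint.h>

static uint64_t shadow_efer;    /* last value we believe is in the MSR */

/* Returns true when a real (flush-inducing) MSR write would be issued. */
static bool write_efer_if_changed(uint64_t new_efer)
{
    if (shadow_efer == new_efer)
        return false;           /* skip the wrmsr, so no implicit TLB flush */

    shadow_efer = new_efer;     /* real code: wrmsrl(MSR_EFER, new_efer); */
    return true;
}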
Re: vhost + multiqueue + RSS question.
On Mon, Nov 17, 2014 at 01:58:20PM +0200, Michael S. Tsirkin wrote: > On Mon, Nov 17, 2014 at 01:22:07PM +0200, Gleb Natapov wrote: > > On Mon, Nov 17, 2014 at 12:38:16PM +0200, Michael S. Tsirkin wrote: > > > On Mon, Nov 17, 2014 at 09:44:23AM +0200, Gleb Natapov wrote: > > > > On Sun, Nov 16, 2014 at 08:56:04PM +0200, Michael S. Tsirkin wrote: > > > > > On Sun, Nov 16, 2014 at 06:18:18PM +0200, Gleb Natapov wrote: > > > > > > Hi Michael, > > > > > > > > > > > > I am playing with vhost multiqueue capability and have a question > > > > > > about > > > > > > vhost multiqueue and RSS (receive side steering). My setup has > > > > > > Mellanox > > > > > > ConnectX-3 NIC which supports multiqueue and RSS. Network related > > > > > > parameters for qemu are: > > > > > > > > > > > >-netdev tap,id=hn0,script=qemu-ifup.sh,vhost=on,queues=4 > > > > > >-device virtio-net-pci,netdev=hn0,id=nic1,mq=on,vectors=10 > > > > > > > > > > > > In a guest I ran "ethtool -L eth0 combined 4" to enable multiqueue. > > > > > > > > > > > > I am running one tcp stream into the guest using iperf. Since there > > > > > > is > > > > > > only one tcp stream I expect it to be handled by one queue only but > > > > > > this seams to be not the case. ethtool -S on a host shows that the > > > > > > stream is handled by one queue in the NIC, just like I would expect, > > > > > > but in a guest all 4 virtio-input interrupt are incremented. Am I > > > > > > missing any configuration? > > > > > > > > > > I don't see anything obviously wrong with what you describe. > > > > > Maybe, somehow, same irqfd got bound to multiple MSI vectors? > > > > It does not look like this is what is happening judging by the way > > > > interrupts are distributed between queues. They are not distributed > > > > uniformly and often I see one queue gets most interrupt and others get > > > > much less and then it changes. > > > > > > Weird. It would happen if you transmitted from multiple CPUs. > > > You did pin iperf to a single CPU within guest, did you not? > > > > > No, I didn't because I didn't expect it to matter for input interrupts. > > When I run iperf on a host rx queue that receives all packets depends > > only on a connection itself, not on a cpu iperf is running on (I tested > > that). > > This really depends on the type of networking card you have > on the host, and how it's configured. > > I think you will get something more closely resembling this > behaviour if you enable RFS in host. > > > When I pin iperf in a guest I do indeed see that all interrupts > > are arriving to the same irq vector. Is a number after virtio-input > > in /proc/interrupt any indication of a queue a packet arrived to (on > > a host I can use ethtool -S to check what queue receives packets, but > > unfortunately this does not work for virtio nic in a guest)? > > I think it is. > > > Because if > > it is the way RSS works in virtio is not how it works on a host and not > > what I would expect after reading about RSS. The queue a packets arrives > > to should be calculated by hashing fields from a packet header only. > > Yes, what virtio has is not RSS - it's an accelerated RFS really. > OK, if what virtio has is RFS and not RSS my test results make sense. Thanks! -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [kvm-unit-tests PATCH 0/6] arm: enable MMU
On 30/10/2014 16:56, Andrew Jones wrote: > This first patch of this series fixes a bug caused by attempting > to use spinlocks without enabling the MMU. The next three do some > prep for the fifth, and also fix arm's PAGE_ALIGN. The fifth is > prep for the sixth, which finally turns the MMU on for arm unit > tests. > > Andrew Jones (6): > arm: fix crash on cubietruck > lib: add ALIGN() macro > lib: steal const.h from kernel > arm: apply ALIGN() and const.h to arm files > arm: import some Linux page table API > arm: turn on the MMU > > arm/cstart.S| 33 +++ > config/config-arm.mak | 3 ++- > lib/alloc.c | 4 +-- > lib/arm/asm/mmu.h | 43 ++ > lib/arm/asm/page.h | 43 +++--- > lib/arm/asm/pgtable-hwdef.h | 65 > + > lib/arm/mmu.c | 53 > lib/arm/processor.c | 11 > lib/arm/setup.c | 3 +++ > lib/arm/spinlock.c | 7 + > lib/asm-generic/page.h | 17 ++-- > lib/const.h | 11 > lib/libcflat.h | 4 +++ > 13 files changed, 275 insertions(+), 22 deletions(-) > create mode 100644 lib/arm/asm/mmu.h > create mode 100644 lib/arm/asm/pgtable-hwdef.h > create mode 100644 lib/arm/mmu.c > create mode 100644 lib/const.h > Tested on CubieTruck and applied, thanks. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC v2 0/9] KVM-VFIO IRQ forward control
Hi Feng, I will submit a PATCH v3 release end of this week. Best Regards Eric On 11/17/2014 12:25 PM, Wu, Feng wrote: > > >> -Original Message- >> From: linux-kernel-ow...@vger.kernel.org >> [mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of Alex Williamson >> Sent: Thursday, September 11, 2014 1:10 PM >> To: Christoffer Dall >> Cc: Eric Auger; eric.au...@st.com; marc.zyng...@arm.com; >> linux-arm-ker...@lists.infradead.org; kvm...@lists.cs.columbia.edu; >> kvm@vger.kernel.org; joel.sch...@amd.com; kim.phill...@freescale.com; >> pau...@samba.org; g...@kernel.org; pbonz...@redhat.com; >> linux-ker...@vger.kernel.org; patc...@linaro.org; will.dea...@arm.com; >> a.mota...@virtualopensystems.com; a.r...@virtualopensystems.com; >> john.li...@huawei.com >> Subject: Re: [RFC v2 0/9] KVM-VFIO IRQ forward control >> >> On Thu, 2014-09-11 at 05:10 +0200, Christoffer Dall wrote: >>> On Tue, Sep 02, 2014 at 03:05:41PM -0600, Alex Williamson wrote: On Mon, 2014-09-01 at 14:52 +0200, Eric Auger wrote: > This RFC proposes an integration of "ARM: Forwarding physical > interrupts to a guest VM" (http://lwn.net/Articles/603514/) in > KVM. > > It enables to transform a VFIO platform driver IRQ into a forwarded > IRQ. The direct benefit is that, for a level sensitive IRQ, a VM > switch can be avoided on guest virtual IRQ completion. Before this > patch, a maintenance IRQ was triggered on the virtual IRQ completion. > > When the IRQ is forwarded, the VFIO platform driver does not need to > disable the IRQ anymore. Indeed when returning from the IRQ handler > the IRQ is not deactivated. Only its priority is lowered. This means > the same IRQ cannot hit before the guest completes the virtual IRQ > and the GIC automatically deactivates the corresponding physical IRQ. > > Besides, the injection still is based on irqfd triggering. The only > impact on irqfd process is resamplefd is not called anymore on > virtual IRQ completion since this latter becomes "transparent". > > The current integration is based on an extension of the KVM-VFIO > device, previously used by KVM to interact with VFIO groups. The > patch serie now enables KVM to directly interact with a VFIO > platform device. The VFIO external API was extended for that purpose. > > Th KVM-VFIO device can get/put the vfio platform device, check its > integrity and type, get the IRQ number associated to an IRQ index. > > The IRQ forward programming is architecture specific (virtual interrupt > controller programming basically). However the whole infrastructure is > kept generic. > > from a user point of view, the functionality is provided through new > KVM-VFIO device commands, >> KVM_DEV_VFIO_DEVICE_(UN)FORWARD_IRQ > and the capability can be checked with KVM_HAS_DEVICE_ATTR. > Assignment can only be changed when the physical IRQ is not active. > It is the responsability of the user to do this check. > > This patch serie has the following dependencies: > - "ARM: Forwarding physical interrupts to a guest VM" > (http://lwn.net/Articles/603514/) in > - [PATCH v3] irqfd for ARM > - and obviously the VFIO platform driver serie: > [RFC PATCH v6 00/20] VFIO support for platform devices on ARM > https://www.mail-archive.com/kvm@vger.kernel.org/msg103247.html > > Integrated pieces can be found at > ssh://git.linaro.org/people/eric.auger/linux.git > on branch 3.17rc3_irqfd_forward_integ_v2 > > This was was tested on Calxeda Midway, assigning the xgmac main IRQ. > > v1 -> v2: > - forward control is moved from architecture specific file into generic > vfio.c module. 
> only kvm_arch_set_fwd_state remains architecture specific > - integrate Kim's patch which enables KVM-VFIO for ARM > - fix vgic state bypass in vgic_queue_hwirq > - struct kvm_arch_forwarded_irq moved from >> arch/arm/include/uapi/asm/kvm.h > to include/uapi/linux/kvm.h > also irq_index renamed into index and guest_irq renamed into gsi > - ASSIGN/DEASSIGN renamed into FORWARD/UNFORWARD > - vfio_external_get_base_device renamed into vfio_external_base_device > - vfio_external_get_type removed > - kvm_vfio_external_get_base_device renamed into >> kvm_vfio_external_base_device > - __KVM_HAVE_ARCH_KVM_VFIO renamed into >> __KVM_HAVE_ARCH_KVM_VFIO_FORWARD > > Eric Auger (8): > KVM: ARM: VGIC: fix multiple injection of level sensitive forwarded > IRQ > KVM: ARM: VGIC: add forwarded irq rbtree lock > VFIO: platform: handler tests whether the IRQ is forwarded > KVM: KVM-VFIO: update user API to program forwarded IRQ > VFIO: Extend external user API > KVM: KVM-VFIO: add new VFIO external API hooks > KVM: KVM-VFIO: generic KVM_DEV_VFIO_DEVICE command and IRQ >> forwarding > co
RE: [RFC v2 0/9] KVM-VFIO IRQ forward control
> -Original Message- > From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On > Behalf Of Eric Auger > Sent: Monday, November 17, 2014 9:42 PM > To: Wu, Feng; Alex Williamson; Christoffer Dall > Cc: eric.au...@st.com; marc.zyng...@arm.com; > linux-arm-ker...@lists.infradead.org; kvm...@lists.cs.columbia.edu; > kvm@vger.kernel.org; joel.sch...@amd.com; kim.phill...@freescale.com; > pau...@samba.org; g...@kernel.org; pbonz...@redhat.com; > linux-ker...@vger.kernel.org; patc...@linaro.org; will.dea...@arm.com; > a.mota...@virtualopensystems.com; a.r...@virtualopensystems.com; > john.li...@huawei.com > Subject: Re: [RFC v2 0/9] KVM-VFIO IRQ forward control > > Hi Feng, > > I will submit a PATCH v3 release end of this week. > > Best Regards > > Eric Thanks for the update, Eric! Thanks, Feng > > On 11/17/2014 12:25 PM, Wu, Feng wrote: > > > > > >> -Original Message- > >> From: linux-kernel-ow...@vger.kernel.org > >> [mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of Alex Williamson > >> Sent: Thursday, September 11, 2014 1:10 PM > >> To: Christoffer Dall > >> Cc: Eric Auger; eric.au...@st.com; marc.zyng...@arm.com; > >> linux-arm-ker...@lists.infradead.org; kvm...@lists.cs.columbia.edu; > >> kvm@vger.kernel.org; joel.sch...@amd.com; kim.phill...@freescale.com; > >> pau...@samba.org; g...@kernel.org; pbonz...@redhat.com; > >> linux-ker...@vger.kernel.org; patc...@linaro.org; will.dea...@arm.com; > >> a.mota...@virtualopensystems.com; a.r...@virtualopensystems.com; > >> john.li...@huawei.com > >> Subject: Re: [RFC v2 0/9] KVM-VFIO IRQ forward control > >> > >> On Thu, 2014-09-11 at 05:10 +0200, Christoffer Dall wrote: > >>> On Tue, Sep 02, 2014 at 03:05:41PM -0600, Alex Williamson wrote: > On Mon, 2014-09-01 at 14:52 +0200, Eric Auger wrote: > > This RFC proposes an integration of "ARM: Forwarding physical > > interrupts to a guest VM" (http://lwn.net/Articles/603514/) in > > KVM. > > > > It enables to transform a VFIO platform driver IRQ into a forwarded > > IRQ. The direct benefit is that, for a level sensitive IRQ, a VM > > switch can be avoided on guest virtual IRQ completion. Before this > > patch, a maintenance IRQ was triggered on the virtual IRQ completion. > > > > When the IRQ is forwarded, the VFIO platform driver does not need to > > disable the IRQ anymore. Indeed when returning from the IRQ handler > > the IRQ is not deactivated. Only its priority is lowered. This means > > the same IRQ cannot hit before the guest completes the virtual IRQ > > and the GIC automatically deactivates the corresponding physical IRQ. > > > > Besides, the injection still is based on irqfd triggering. The only > > impact on irqfd process is resamplefd is not called anymore on > > virtual IRQ completion since this latter becomes "transparent". > > > > The current integration is based on an extension of the KVM-VFIO > > device, previously used by KVM to interact with VFIO groups. The > > patch serie now enables KVM to directly interact with a VFIO > > platform device. The VFIO external API was extended for that purpose. > > > > Th KVM-VFIO device can get/put the vfio platform device, check its > > integrity and type, get the IRQ number associated to an IRQ index. > > > > The IRQ forward programming is architecture specific (virtual interrupt > > controller programming basically). However the whole infrastructure is > > kept generic. 
> > > > from a user point of view, the functionality is provided through new > > KVM-VFIO device commands, > >> KVM_DEV_VFIO_DEVICE_(UN)FORWARD_IRQ > > and the capability can be checked with KVM_HAS_DEVICE_ATTR. > > Assignment can only be changed when the physical IRQ is not active. > > It is the responsability of the user to do this check. > > > > This patch serie has the following dependencies: > > - "ARM: Forwarding physical interrupts to a guest VM" > > (http://lwn.net/Articles/603514/) in > > - [PATCH v3] irqfd for ARM > > - and obviously the VFIO platform driver serie: > > [RFC PATCH v6 00/20] VFIO support for platform devices on ARM > > > https://www.mail-archive.com/kvm@vger.kernel.org/msg103247.html > > > > Integrated pieces can be found at > > ssh://git.linaro.org/people/eric.auger/linux.git > > on branch 3.17rc3_irqfd_forward_integ_v2 > > > > This was was tested on Calxeda Midway, assigning the xgmac main IRQ. > > > > v1 -> v2: > > - forward control is moved from architecture specific file into generic > > vfio.c module. > > only kvm_arch_set_fwd_state remains architecture specific > > - integrate Kim's patch which enables KVM-VFIO for ARM > > - fix vgic state bypass in vgic_queue_hwirq > > - struct kvm_arch_forwarded_irq moved from > >> arch/arm/include/uapi/asm/kvm.h > > to include/uapi/linux/
[PATCH 1/3] kvm: add a memslot flag for incoherent memory regions
Memory regions may be incoherent with the caches, typically when the guest has mapped a host system RAM backed memory region as uncached. Add a flag KVM_MEMSLOT_INCOHERENT so that we can tag these memslots and handle them appropriately when mapping them. Signed-off-by: Ard Biesheuvel --- include/linux/kvm_host.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index a6059bdf7b03..e4d8f705fecd 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -43,6 +43,7 @@ * include/linux/kvm_h. */ #define KVM_MEMSLOT_INVALID(1UL << 16) +#define KVM_MEMSLOT_INCOHERENT (1UL << 17) /* Two fragments for cross MMIO pages. */ #define KVM_MAX_MMIO_FRAGMENTS 2 -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/3] arm, arm64: KVM: allow forced dcache flush on page faults
From: Laszlo Ersek To allow handling of incoherent memslots in a subsequent patch, this patch adds a paramater 'ipa_uncached' to cache_coherent_guest_page() so that we can instruct it to flush the page's contents to DRAM even if the guest has caching globally enabled. Signed-off-by: Laszlo Ersek Signed-off-by: Ard Biesheuvel --- arch/arm/include/asm/kvm_mmu.h | 5 +++-- arch/arm/kvm/mmu.c | 9 +++-- arch/arm64/include/asm/kvm_mmu.h | 5 +++-- 3 files changed, 13 insertions(+), 6 deletions(-) diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h index acb0d5712716..f867060035ec 100644 --- a/arch/arm/include/asm/kvm_mmu.h +++ b/arch/arm/include/asm/kvm_mmu.h @@ -161,9 +161,10 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu) } static inline void coherent_cache_guest_page(struct kvm_vcpu *vcpu, hva_t hva, -unsigned long size) +unsigned long size, +bool ipa_uncached) { - if (!vcpu_has_cache_enabled(vcpu)) + if (!vcpu_has_cache_enabled(vcpu) || ipa_uncached) kvm_flush_dcache_to_poc((void *)hva, size); /* diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c index b007438242e2..cb924c6d56a6 100644 --- a/arch/arm/kvm/mmu.c +++ b/arch/arm/kvm/mmu.c @@ -852,6 +852,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, struct vm_area_struct *vma; pfn_t pfn; pgprot_t mem_type = PAGE_S2; + bool fault_ipa_uncached; write_fault = kvm_is_write_fault(vcpu); if (fault_status == FSC_PERM && !write_fault) { @@ -918,6 +919,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, if (!hugetlb && !force_pte) hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa); + fault_ipa_uncached = false; + if (hugetlb) { pmd_t new_pmd = pfn_pmd(pfn, mem_type); new_pmd = pmd_mkhuge(new_pmd); @@ -925,7 +928,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, kvm_set_s2pmd_writable(&new_pmd); kvm_set_pfn_dirty(pfn); } - coherent_cache_guest_page(vcpu, hva & PMD_MASK, PMD_SIZE); + coherent_cache_guest_page(vcpu, hva & PMD_MASK, PMD_SIZE, + fault_ipa_uncached); ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd); } else { pte_t new_pte = pfn_pte(pfn, mem_type); @@ -933,7 +937,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, kvm_set_s2pte_writable(&new_pte); kvm_set_pfn_dirty(pfn); } - coherent_cache_guest_page(vcpu, hva, PAGE_SIZE); + coherent_cache_guest_page(vcpu, hva, PAGE_SIZE, + fault_ipa_uncached); ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE)); } diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h index 0caf7a59f6a1..123b521a9908 100644 --- a/arch/arm64/include/asm/kvm_mmu.h +++ b/arch/arm64/include/asm/kvm_mmu.h @@ -243,9 +243,10 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu) } static inline void coherent_cache_guest_page(struct kvm_vcpu *vcpu, hva_t hva, -unsigned long size) +unsigned long size, +bool ipa_uncached) { - if (!vcpu_has_cache_enabled(vcpu)) + if (!vcpu_has_cache_enabled(vcpu) || ipa_uncached) kvm_flush_dcache_to_poc((void *)hva, size); if (!icache_is_aliasing()) {/* PIPT */ -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/3] arm, arm64: KVM: handle potential incoherency of readonly memslots
Readonly memslots are often used to implement emulation of ROMs and NOR flashes, in which case the guest may legally map these regions as uncached. To deal with the incoherency associated with uncached guest mappings, treat all readonly memslots as incoherent, and ensure that pages that belong to regions tagged as such are flushed to DRAM before being passed to the guest. Signed-off-by: Ard Biesheuvel --- arch/arm/kvm/mmu.c | 20 +++- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c index cb924c6d56a6..f2a9874ff5cb 100644 --- a/arch/arm/kvm/mmu.c +++ b/arch/arm/kvm/mmu.c @@ -919,7 +919,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, if (!hugetlb && !force_pte) hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa); - fault_ipa_uncached = false; + fault_ipa_uncached = memslot->flags & KVM_MEMSLOT_INCOHERENT; if (hugetlb) { pmd_t new_pmd = pfn_pmd(pfn, mem_type); @@ -1298,11 +1298,12 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm, hva = vm_end; } while (hva < reg_end); - if (ret) { - spin_lock(&kvm->mmu_lock); + spin_lock(&kvm->mmu_lock); + if (ret) unmap_stage2_range(kvm, mem->guest_phys_addr, mem->memory_size); - spin_unlock(&kvm->mmu_lock); - } + else + stage2_flush_memslot(kvm, memslot); + spin_unlock(&kvm->mmu_lock); return ret; } @@ -1314,6 +1315,15 @@ void kvm_arch_free_memslot(struct kvm *kvm, struct kvm_memory_slot *free, int kvm_arch_create_memslot(struct kvm *kvm, struct kvm_memory_slot *slot, unsigned long npages) { + /* +* Readonly memslots are not incoherent with the caches by definition, +* but in practice, they are used mostly to emulate ROMs or NOR flashes +* that the guest may consider devices and hence map as uncached. +* To prevent incoherency issues in these cases, tag all readonly +* regions as incoherent. +*/ + if (slot->flags & KVM_MEM_READONLY) + slot->flags |= KVM_MEMSLOT_INCOHERENT; return 0; } -- 1.8.3.2 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 3/6] hw_random: use reference counts on each struct hwrng.
On Wed, Nov 12, 2014 at 02:11:23PM +1030, Rusty Russell wrote: > Amos Kong writes: > > From: Rusty Russell > > > > current_rng holds one reference, and we bump it every time we want > > to do a read from it. > > > > This means we only hold the rng_mutex to grab or drop a reference, > > so accessing /sys/devices/virtual/misc/hw_random/rng_current doesn't > > block on read of /dev/hwrng. > > > > Using a kref is overkill (we're always under the rng_mutex), but > > a standard pattern. > > > > This also solves the problem that the hwrng_fillfn thread was > > accessing current_rng without a lock, which could change (eg. to NULL) > > underneath it. > > > > v4: decrease last reference for triggering the cleanup > > This doesn't make any sense: > > > +static void drop_current_rng(void) > > +{ > > + struct hwrng *rng = current_rng; > > + > > + BUG_ON(!mutex_is_locked(&rng_mutex)); > > + if (!current_rng) > > + return; > > + > > + /* release current_rng reference */ > > + kref_put(&current_rng->ref, cleanup_rng); > > + current_rng = NULL; > > + > > + /* decrease last reference for triggering the cleanup */ > > + kref_put(&rng->ref, cleanup_rng); > > +} > > Why would it drop the refcount twice? This doesn't make sense. > > Hmm, because you added kref_init, which initializes the reference count > to 1, you created this bug. I saw some kernel code uses kref_* helper functions, the reference counter is initialized to 1. Some code didn't use the helper functions to increase/decrease the reference counter. So I will drop kref_init() and the second kref_put(). > Leave out the kref_init, and let it naturally be 0 (until, and if, it > becomes current_rng). Add a comment if you want. OK, thanks. > Thanks, > Rusty. > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Amos.
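[Editor's sketch] To make the refcounting shape under discussion concrete, here is a minimal sketch of drop_current_rng() as Rusty's suggestion reads (illustration only, not the final driver code): with the extra kref_init() gone, current_rng holds exactly one reference, so exactly one kref_put() is needed. cleanup_rng() is assumed to be the kref release callback, as in the snippet quoted above.

/*
 * Sketch only: current_rng holds exactly one reference, taken when the
 * rng becomes current_rng, so tearing it down needs exactly one
 * kref_put().  cleanup_rng() is assumed to be the kref release callback.
 */
static void drop_current_rng(void)
{
	BUG_ON(!mutex_is_locked(&rng_mutex));
	if (!current_rng)
		return;

	/* drop the single reference held on behalf of current_rng; any
	 * readers still holding their own reference delay cleanup_rng()
	 * until the last kref_put() */
	kref_put(&current_rng->ref, cleanup_rng);
	current_rng = NULL;
}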
Re: [PATCH 3/3] arm, arm64: KVM: handle potential incoherency of readonly memslots
On 17/11/2014 15:58, Ard Biesheuvel wrote: > Readonly memslots are often used to implement emulation of ROMs and > NOR flashes, in which case the guest may legally map these regions as > uncached. > To deal with the incoherency associated with uncached guest mappings, > treat all readonly memslots as incoherent, and ensure that pages that > belong to regions tagged as such are flushed to DRAM before being passed > to the guest. On x86, the processor combines the cacheability values from the two levels of page tables. Is there no way to do the same on ARM? Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] arm, arm64: KVM: handle potential incoherency of readonly memslots
Hi Paolo, On 17/11/14 15:29, Paolo Bonzini wrote: > > > On 17/11/2014 15:58, Ard Biesheuvel wrote: >> Readonly memslots are often used to implement emulation of ROMs and >> NOR flashes, in which case the guest may legally map these regions as >> uncached. >> To deal with the incoherency associated with uncached guest mappings, >> treat all readonly memslots as incoherent, and ensure that pages that >> belong to regions tagged as such are flushed to DRAM before being passed >> to the guest. > > On x86, the processor combines the cacheability values from the two > levels of page tables. Is there no way to do the same on ARM? ARM is broadly similar, but there's a number of gotchas: - uncacheable (guest level) + cacheable (host level) -> uncacheable: the read request is going to be directly sent to RAM, bypassing the caches. - Userspace is going to use a cacheable view of the "NOR" pages, which is going to stick around in the cache (this is just memory, after all). The net result is that we need to detect those cases and make sure the guest sees the latest bit of data written by userland. We already have a similar mechanism when we fault pages in, but the guest has not enabled its caches yet. M. -- Jazz is not dead. It just smells funny... -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
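[Editor's sketch] The combining rule Marc describes can be pictured as "the weaker attribute wins". The toy model below is an illustration only, with made-up names, not kernel code; it shows why a cacheable stage-2 (host) mapping cannot undo an uncached stage-1 (guest) mapping, which is exactly why the flush to PoC is needed.

/* Illustrative model only -- not actual KVM/ARM code.  The effective
 * attribute of a guest access is roughly the more restrictive of the
 * stage-1 (guest) and stage-2 (host) attributes. */
enum mem_attr { ATTR_DEVICE, ATTR_NORMAL_NC, ATTR_NORMAL_WB };

static enum mem_attr combine_s1_s2(enum mem_attr s1, enum mem_attr s2)
{
	/* the less cacheable (lower) attribute wins */
	return s1 < s2 ? s1 : s2;
}

/*
 * combine_s1_s2(ATTR_DEVICE, ATTR_NORMAL_WB) == ATTR_DEVICE: the guest
 * read bypasses the caches and goes straight to DRAM, missing whatever
 * host userspace (e.g. QEMU's flash emulation) left dirty in the data
 * cache -- hence the flush to PoC before handing pages to the guest.
 */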
Re: [PATCH 3/3] arm, arm64: KVM: handle potential incoherency of readonly memslots
On 11/17/14 16:29, Paolo Bonzini wrote: > > > On 17/11/2014 15:58, Ard Biesheuvel wrote: >> Readonly memslots are often used to implement emulation of ROMs and >> NOR flashes, in which case the guest may legally map these regions as >> uncached. >> To deal with the incoherency associated with uncached guest mappings, >> treat all readonly memslots as incoherent, and ensure that pages that >> belong to regions tagged as such are flushed to DRAM before being passed >> to the guest. > > On x86, the processor combines the cacheability values from the two > levels of page tables. Is there no way to do the same on ARM? Combining occurs on ARMv8 too. The Stage1 (guest) mapping is very strict (Device non-Gathering, non-Reordering, no Early Write Acknowledgement -- for EFI_MEMORY_UC), which basically "overrides" the Stage2 (very lax host) memory attributes. When qemu writes, as part of emulating the flash programming commands, to the RAMBlock that *otherwise* backs the flash range (as a r/o memslot), those writes (from host userspace) tend to end up in dcache. But, when the guest flips back the flash to romd mode, and tries to read back the values from the flash as plain ROM, the dcache is completely bypassed due to the strict stage1 mapping, and the guest goes directly to DRAM. Where qemu's earlier writes are not yet / necessarily visible. Please see my original patch (which was incomplete) in the attachment, it has a very verbose commit message. Anyway, I'll let others explain; they can word it better than I can :) FWIW, Series Reviewed-by: Laszlo Ersek I ported this series to a 3.17.0+ based kernel, and tested it. It works fine. The ROM-like view of the NOR flash now reflects the previously programmed contents. Series Tested-by: Laszlo Ersek Thanks! Laszlo >From a2b4da9b03f03ccdb8b0988a5cc64d1967f00398 Mon Sep 17 00:00:00 2001 From: Laszlo Ersek Date: Sun, 16 Nov 2014 01:43:11 +0100 Subject: [PATCH] arm, arm64: KVM: clean cache on page fault also when IPA is uncached (WIP) This patch builds on Marc Zyngier's commit 2d58b733c87689d3d5144e4ac94ea861cc729145. (1) The guest bypasses the cache *not only* when the VCPU's dcache is disabled (see bit 0 and bit 2 in SCTLR_EL1, "MMU enable" and "Cache enable", respectively -- vcpu_has_cache_enabled()). The guest bypasses the cache *also* when the Stage 1 memory attributes say "device memory" about the Intermediate Page Address in question, independently of the Stage 2 memory attributes. Refer to: Table D5-38 Combining the stage 1 and stage 2 memory type assignments in the ARM ARM. (This is likely similar to MTRRs on x86.) (2) In edk2 (EFI Development Kit II), the ARM NOR flash driver, ArmPlatformPkg/Drivers/NorFlashDxe/NorFlashFvbDxe.c uses the AddMemorySpace() and SetMemorySpaceAttributes() Global Coherency Domain Services of DXE (Driver eXecution Environment) to *justifiedly* set the attributes of the guest memory covering the flash chip to EFI_MEMORY_UC ("uncached"). 
According to the AArch64 bindings for UEFI (see "2.3.6.1 Memory types" in the UEFI-2.4A specification), EFI_MEMORY_UC is mapped as follows: EFI Memory Type EFI_MEMORY_UC (Not cacheable) -> ARM Memory Type Device-nGnRnE (Device non-Gathering, non-Reordering, no Early Write Acknowledgement), MAIR attribute encoding Attr [7:4] [3:0] = 0000 0000. This is correctly implemented in edk2, in the ArmConfigureMmu() function, via the ArmSetMAIR() call and the MAIR_ATTR() macro: The TT_ATTR_INDX_DEVICE_MEMORY (== 0) memory attribute index, which is used for EFI_MEMORY_UC memory, is associated with the MAIR_ATTR_DEVICE_MEMORY (== 0x00, see above) memory attribute value, in the MAIR_ELx register. As a consequence of (1) and (2), when edk2 code running in the guest accesses an IPA falling in the flash range, it will completely bypass the cache. Therefore, when such a page is faulted in in user_mem_abort(), we must flush the data cache; otherwise the guest will see stale data in the flash chip. This patch is not complete because I have no clue how to calculate the memory attribute for "fault_ipa" in user_mem_abort(). Right now I set "fault_ipa_uncached" to constant true, which might incur some performance penalty for data faults, but it certainly improves correctness -- the ArmVirtualizationPkg platform build of edk2 actually boots as a KVM guest on APM Mustang. Signed-off-by: Laszlo Ersek --- arch/arm/include/asm/kvm_mmu.h | 5 +++-- arch/arm64/include/asm/kvm_mmu.h | 5 +++-- arch/arm/kvm/mmu.c | 10 ++++++++-- 3 files changed, 14 insertions(+), 6 deletions(-) diff --git a/
Re: [PATCH 3/3] arm, arm64: KVM: handle potential incoherency of readonly memslots
On 17/11/2014 16:39, Marc Zyngier wrote: > ARM is broadly similar, but there's a number of gotchas: > - uncacheable (guest level) + cacheable (host level) -> uncacheable: the > read request is going to be directly sent to RAM, bypassing the caches. > - Userspace is going to use a cacheable view of the "NOR" pages, which > is going to stick around in the cache (this is just memory, after all). Ah, x86 also has uncacheable + cacheable -> uncacheable, but Intel also added a bit to ignore the guest-provided type. We use that bit for RAM-backed areas. Also, on x86 if the cache is disabled the processor will still snoop caches (including its own cache) and perform writeback+invalidate of the cache line before accessing main memory, if it's dirty. AMD does not have the aforementioned bit, but applies this same algorithm if the host says the page is writeback in the MTRR (memory type range register). The Intel solution is less tricky and has better performance. Paolo > The net result is that we need to detect those cases and make sure the > guest sees the latest bit of data written by userland. > > We already have a similar mechanism when we fault pages in, but the > guest has not enabled its caches yet. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
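[Editor's sketch] The "bit to ignore the guest-provided type" Paolo mentions is the ignore-PAT (IPAT) bit in the EPT entry. Below is a simplified sketch of that decision; ept_memtype_for() is a made-up name for illustration, while the real logic lives in KVM's vmx_get_mt_mask() and also accounts for cases like noncoherent DMA assignment, so treat this only as a picture of the idea.

/* Simplified sketch, not the verbatim kernel function: for RAM-backed
 * pages KVM forces writeback and sets the ignore-PAT bit so the
 * guest-programmed type is overridden; MMIO stays uncacheable. */
static u64 ept_memtype_for(bool is_mmio)
{
	if (is_mmio)
		return MTRR_TYPE_UNCACHABLE << VMX_EPT_MT_EPTE_SHIFT;

	/* RAM: force WB and ignore the guest PAT/MTRR type */
	return (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT;
}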
Where is the VM live migration code?
Hi, I saw this page: http://www.linux-kvm.org/page/Migration. It looks like Migration is a feature provided by KVM? But when I look at the Linux kernel source code, i.e., virt/kvm, and arch/x86/kvm, I don't see the code for this migration feature. So I wonder where is the source code for the live migration? Is it purely implemented in user space? Because I see there are the following files in the qemu source code: migration.c migration-exec.c migration-fd.c migration-rdma.c migration-tcp.c migration-unix.c If I wish to understand the implementation of migration in Qemu/KVM, are these above files the ones I should read? Thanks. -Jidong -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Where is the VM live migration code?
> Hi, > > I saw this page: > > http://www.linux-kvm.org/page/Migration. > > It looks like Migration is a feature provided by KVM? But when I look > at the Linux kernel source code, i.e., virt/kvm, and arch/x86/kvm, I > don't see the code for this migration feature. > Most of the live migration code is in QEMU: migration.c, savevm.c, arch_init.c, block-migration.c, and the save/load handlers of the individual devices, etc.; only dirty page logging/syncing is implemented in the kernel. The most important functions to read are migration_thread() and process_incoming_migration_co(). > So I wonder where is the source code for the live migration? Is it > purely implemented in user space? Because I see there are the > following files in the qemu source code: > > migration.c migration-exec.c migration-fd.c migration-rdma.c > migration-tcp.c migration-unix.c > > If I wish to understand the implementation of migration in Qemu/KVM, > are these above files the ones I should read? Thanks. > > -Jidong
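[Editor's sketch] For the kernel-side piece mentioned above (dirty page logging/syncing), the interface QEMU's migration code drives is the KVM memslot dirty-log API. The following is a bare-bones userspace sketch of that interface; error handling is omitted, and vm_fd, the slot geometry and the helper names are assumptions for illustration only.

#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Sketch only: vm_fd is an already-created KVM VM fd, and slot/gpa/size/hva
 * describe a memslot that was registered earlier without the logging flag. */
static void enable_dirty_logging(int vm_fd, __u32 slot, __u64 gpa,
				 __u64 size, __u64 hva)
{
	struct kvm_userspace_memory_region region = {
		.slot            = slot,
		.flags           = KVM_MEM_LOG_DIRTY_PAGES, /* start logging */
		.guest_phys_addr = gpa,
		.memory_size     = size,
		.userspace_addr  = hva,
	};

	ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}

/* Each call fills 'bitmap' with one bit per page dirtied since the previous
 * call -- this is what the migration thread iterates over and resends. */
static void sync_dirty_bitmap(int vm_fd, __u32 slot, void *bitmap)
{
	struct kvm_dirty_log log = {
		.slot         = slot,
		.dirty_bitmap = bitmap,
	};

	ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log);
}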
Re: vhost + multiqueue + RSS question.
> On Mon, Nov 17, 2014 at 01:58:20PM +0200, Michael S. Tsirkin wrote: > > On Mon, Nov 17, 2014 at 01:22:07PM +0200, Gleb Natapov wrote: > > > On Mon, Nov 17, 2014 at 12:38:16PM +0200, Michael S. Tsirkin wrote: > > > > On Mon, Nov 17, 2014 at 09:44:23AM +0200, Gleb Natapov wrote: > > > > > On Sun, Nov 16, 2014 at 08:56:04PM +0200, Michael S. Tsirkin wrote: > > > > > > On Sun, Nov 16, 2014 at 06:18:18PM +0200, Gleb Natapov wrote: > > > > > > > Hi Michael, > > > > > > > > > > > > > > I am playing with vhost multiqueue capability and have a > > > > > > > question about > > > > > > > vhost multiqueue and RSS (receive side steering). My setup has > > > > > > > Mellanox > > > > > > > ConnectX-3 NIC which supports multiqueue and RSS. Network related > > > > > > > parameters for qemu are: > > > > > > > > > > > > > >-netdev tap,id=hn0,script=qemu-ifup.sh,vhost=on,queues=4 > > > > > > >-device virtio-net-pci,netdev=hn0,id=nic1,mq=on,vectors=10 > > > > > > > > > > > > > > In a guest I ran "ethtool -L eth0 combined 4" to enable > > > > > > > multiqueue. > > > > > > > > > > > > > > I am running one tcp stream into the guest using iperf. Since > > > > > > > there is > > > > > > > only one tcp stream I expect it to be handled by one queue only > > > > > > > but > > > > > > > this seams to be not the case. ethtool -S on a host shows that the > > > > > > > stream is handled by one queue in the NIC, just like I would > > > > > > > expect, > > > > > > > but in a guest all 4 virtio-input interrupt are incremented. Am I > > > > > > > missing any configuration? > > > > > > > > > > > > I don't see anything obviously wrong with what you describe. > > > > > > Maybe, somehow, same irqfd got bound to multiple MSI vectors? > > > > > It does not look like this is what is happening judging by the way > > > > > interrupts are distributed between queues. They are not distributed > > > > > uniformly and often I see one queue gets most interrupt and others get > > > > > much less and then it changes. > > > > > > > > Weird. It would happen if you transmitted from multiple CPUs. > > > > You did pin iperf to a single CPU within guest, did you not? > > > > > > > No, I didn't because I didn't expect it to matter for input interrupts. > > > When I run iperf on a host rx queue that receives all packets depends > > > only on a connection itself, not on a cpu iperf is running on (I tested > > > that). > > > > This really depends on the type of networking card you have > > on the host, and how it's configured. > > > > I think you will get something more closely resembling this > > behaviour if you enable RFS in host. > > > > > When I pin iperf in a guest I do indeed see that all interrupts > > > are arriving to the same irq vector. Is a number after virtio-input > > > in /proc/interrupt any indication of a queue a packet arrived to (on > > > a host I can use ethtool -S to check what queue receives packets, but > > > unfortunately this does not work for virtio nic in a guest)? > > > > I think it is. > > > > > Because if > > > it is the way RSS works in virtio is not how it works on a host and not > > > what I would expect after reading about RSS. The queue a packets arrives > > > to should be calculated by hashing fields from a packet header only. > > > > Yes, what virtio has is not RSS - it's an accelerated RFS really. > > > OK, if what virtio has is RFS and not RSS my test results make sense. > Thanks! I think the RSS emulation for virtio-mq NIC is implemented in tun_select_queue(), am I missing something? 
Thanks, Zhang Haoyu -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Where is the VM live migration code?
On Mon, Nov 17, 2014 at 5:29 PM, Zhang Haoyu wrote: >> Hi, >> >> I saw this page: >> >> http://www.linux-kvm.org/page/Migration. >> >> It looks like Migration is a feature provided by KVM? But when I look >> at the Linux kernel source code, i.e., virt/kvm, and arch/x86/kvm, I >> don't see the code for this migration feature. >> > Most of live migration code is in qemu migration.c, savevm.c, arch_init.c, > block-migration.c, and the other devices's save/load handler, .etc, > only log/sync dirty page implemented in kernel. > You can read the most important function migration_thread(), > process_incoming_migration_co(). > Great, thanks Haoyu! I will try to understand these parts of code first. -Jidong >> So I wonder where is the source code for the live migration? Is it >>purely implemented in user space? Because I see there are the >> following files in the qemu source code: >> >> migration.c migration-exec.c migration-fd.c migration-rdma.c >> migration-tcp.c migration-unix.c >> >> If I wish to understand the implementation of migration in Qemu/KVM, >> are these above files the ones I should read? Thanks. >> >> -Jidong > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH 0/6] ARM64: KVM: PMU infrastructure support
On Tue, Nov 11, 2014 at 2:48 PM, Anup Patel wrote: > Hi All, > > I have second thoughts about rebasing KVM PMU patches > to Marc's irq-forwarding patches. > > The PMU IRQs (when virtualized by KVM) are not exactly > forwarded IRQs because they are shared between Host > and Guest. > > Scenario1 > - > > We might have perf running on Host and no KVM guest > running. In this scenario, we won't get interrupts on Host > because the kvm_pmu_hyp_init() (similar to the function > kvm_timer_hyp_init() of Marc's IRQ-forwarding > implementation) has put all host PMU IRQs in forwarding > mode. > > The only way to solve this problem is to not set forwarding > mode for PMU IRQs in kvm_pmu_hyp_init() and instead > have special routines to turn on and turn off the forwarding > mode of PMU IRQs. These routines will be called from > kvm_arch_vcpu_ioctl_run() for toggling the PMU IRQ > forwarding state. > > Scenario2 > - > > We might have perf running on Host and Guest simultaneously, > which means it is quite likely that the PMU HW triggers an IRQ meant > for the Host between "ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);" > and "kvm_pmu_sync_hwstate(vcpu);" (similar to the timer sync routine > of Marc's patchset which is called before local_irq_enable()). > > In this scenario, the updated kvm_pmu_sync_hwstate(vcpu) > will accidentally forward an IRQ meant for the Host to the Guest unless > we put in additional checks to inspect the VCPU PMU state. > > Am I missing any detail about IRQ forwarding for the above > scenarios? > > If not, then can we consider the current mask/unmask approach > for forwarding PMU IRQs? > > Marc?? Will?? > > Regards, > Anup Ping ??? -- Anup -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
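[Editor's sketch] To make the Scenario1 alternative concrete, the shape being proposed (toggle forwarding only around the actual guest run) would look roughly like the sketch below. kvm_pmu_set_irq_forward() is a hypothetical helper used purely for illustration; it stands in for whatever primitive flips the host PMU IRQ in and out of forwarding mode, while kvm_call_hyp()/__kvm_vcpu_run and kvm_pmu_sync_hwstate() are the entry points already named above.

/*
 * Illustration only: kvm_pmu_set_irq_forward() does not exist in
 * mainline.  Forwarding is enabled only while the guest is about to
 * run, so host perf keeps receiving PMU interrupts whenever no vcpu is
 * loaded (Scenario1).  Scenario2 still needs kvm_pmu_sync_hwstate()
 * to inspect the vcpu PMU state before deciding where a pending IRQ
 * belongs.
 */
static int run_vcpu_once(struct kvm_vcpu *vcpu)
{
	int ret;

	local_irq_disable();
	kvm_pmu_set_irq_forward(vcpu, true);	/* guest owns the PMU IRQ */

	ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);

	kvm_pmu_set_irq_forward(vcpu, false);	/* PMU IRQ back to the host */
	kvm_pmu_sync_hwstate(vcpu);
	local_irq_enable();

	return ret;
}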
Re: vhost + multiqueue + RSS question.
On 11/17/2014 07:58 PM, Michael S. Tsirkin wrote: > On Mon, Nov 17, 2014 at 01:22:07PM +0200, Gleb Natapov wrote: >> > On Mon, Nov 17, 2014 at 12:38:16PM +0200, Michael S. Tsirkin wrote: >>> > > On Mon, Nov 17, 2014 at 09:44:23AM +0200, Gleb Natapov wrote: > > > On Sun, Nov 16, 2014 at 08:56:04PM +0200, Michael S. Tsirkin wrote: > > > > > On Sun, Nov 16, 2014 at 06:18:18PM +0200, Gleb Natapov wrote: >> > > > > > Hi Michael, >> > > > > > >> > > > > > I am playing with vhost multiqueue capability and have a >> > > > > > question about >> > > > > > vhost multiqueue and RSS (receive side steering). My setup has >> > > > > > Mellanox >> > > > > > ConnectX-3 NIC which supports multiqueue and RSS. Network >> > > > > > related >> > > > > > parameters for qemu are: >> > > > > > >> > > > > >-netdev tap,id=hn0,script=qemu-ifup.sh,vhost=on,queues=4 >> > > > > >-device virtio-net-pci,netdev=hn0,id=nic1,mq=on,vectors=10 >> > > > > > >> > > > > > In a guest I ran "ethtool -L eth0 combined 4" to enable >> > > > > > multiqueue. >> > > > > > >> > > > > > I am running one tcp stream into the guest using iperf. Since >> > > > > > there is >> > > > > > only one tcp stream I expect it to be handled by one queue >> > > > > > only but >> > > > > > this seams to be not the case. ethtool -S on a host shows that >> > > > > > the >> > > > > > stream is handled by one queue in the NIC, just like I would >> > > > > > expect, >> > > > > > but in a guest all 4 virtio-input interrupt are incremented. >> > > > > > Am I >> > > > > > missing any configuration? > > > > > > > > > > I don't see anything obviously wrong with what you describe. > > > > > Maybe, somehow, same irqfd got bound to multiple MSI vectors? > > > It does not look like this is what is happening judging by the way > > > interrupts are distributed between queues. They are not distributed > > > uniformly and often I see one queue gets most interrupt and others > > > get > > > much less and then it changes. >>> > > >>> > > Weird. It would happen if you transmitted from multiple CPUs. >>> > > You did pin iperf to a single CPU within guest, did you not? >>> > > >> > No, I didn't because I didn't expect it to matter for input interrupts. >> > When I run iperf on a host rx queue that receives all packets depends >> > only on a connection itself, not on a cpu iperf is running on (I tested >> > that). > This really depends on the type of networking card you have > on the host, and how it's configured. > > I think you will get something more closely resembling this > behaviour if you enable RFS in host. > >> > When I pin iperf in a guest I do indeed see that all interrupts >> > are arriving to the same irq vector. Is a number after virtio-input >> > in /proc/interrupt any indication of a queue a packet arrived to (on >> > a host I can use ethtool -S to check what queue receives packets, but >> > unfortunately this does not work for virtio nic in a guest)? > I think it is. > >> > Because if >> > it is the way RSS works in virtio is not how it works on a host and not >> > what I would expect after reading about RSS. The queue a packets arrives >> > to should be calculated by hashing fields from a packet header only. > Yes, what virtio has is not RSS - it's an accelerated RFS really. Strictly speaking, not aRFS. aRFS requires a programmable filter and needs driver to fill the filter on demand. For virtio-net, this is done automatically in host side (tun/tap). There's no guest involvement. > > The point is to try and take application locality into account. 
Yes, the locality was done through (consider an N-vcpu guest with N queues): - the virtio-net driver will provide a default 1:1 mapping between vcpu and txq through XPS - the virtio-net driver will also suggest a default irq affinity hint for a 1:1 mapping between vcpu and txq/rxq With all these, each vcpu gets its private txq/rxq pair. And the host side implementation (tun/tap) will make sure that if the packets of a flow were received from queue N, it will also use queue N to transmit the packets of this flow to the guest. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
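[Editor's sketch] A toy model of the host-side bookkeeping described above -- remember which queue a flow last transmitted on and receive on the same queue -- might look like the sketch below. It only illustrates the idea behind tun/tap's automatic queue selection; the names and table layout are made up, not the actual tun.c code.

#include <stdint.h>
#include <string.h>

#define NUM_QUEUES   4
#define FLOW_BUCKETS 1024
#define NO_QUEUE     UINT16_MAX

/* per-flow "last TX queue" table, keyed by a hash of the packet headers */
static uint16_t flow_to_queue[FLOW_BUCKETS];

static void flow_table_init(void)
{
	memset(flow_to_queue, 0xff, sizeof(flow_to_queue)); /* all NO_QUEUE */
}

/* transmit path: remember which queue (i.e. which vcpu) this flow used */
static void record_tx_queue(uint32_t flow_hash, uint16_t txq)
{
	flow_to_queue[flow_hash % FLOW_BUCKETS] = txq;
}

/* receive path: steer the packet back to that queue, falling back to a
 * plain hash spread if the flow has not been seen transmitting yet */
static uint16_t select_rx_queue(uint32_t flow_hash)
{
	uint16_t q = flow_to_queue[flow_hash % FLOW_BUCKETS];

	return q == NO_QUEUE ? flow_hash % NUM_QUEUES : q;
}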
Re: vhost + multiqueue + RSS question.
On 11/18/2014 09:37 AM, Zhang Haoyu wrote: >> On Mon, Nov 17, 2014 at 01:58:20PM +0200, Michael S. Tsirkin wrote: >>> On Mon, Nov 17, 2014 at 01:22:07PM +0200, Gleb Natapov wrote: On Mon, Nov 17, 2014 at 12:38:16PM +0200, Michael S. Tsirkin wrote: > On Mon, Nov 17, 2014 at 09:44:23AM +0200, Gleb Natapov wrote: >> On Sun, Nov 16, 2014 at 08:56:04PM +0200, Michael S. Tsirkin wrote: >>> On Sun, Nov 16, 2014 at 06:18:18PM +0200, Gleb Natapov wrote: Hi Michael, I am playing with vhost multiqueue capability and have a question about vhost multiqueue and RSS (receive side steering). My setup has Mellanox ConnectX-3 NIC which supports multiqueue and RSS. Network related parameters for qemu are: -netdev tap,id=hn0,script=qemu-ifup.sh,vhost=on,queues=4 -device virtio-net-pci,netdev=hn0,id=nic1,mq=on,vectors=10 In a guest I ran "ethtool -L eth0 combined 4" to enable multiqueue. I am running one tcp stream into the guest using iperf. Since there is only one tcp stream I expect it to be handled by one queue only but this seams to be not the case. ethtool -S on a host shows that the stream is handled by one queue in the NIC, just like I would expect, but in a guest all 4 virtio-input interrupt are incremented. Am I missing any configuration? >>> I don't see anything obviously wrong with what you describe. >>> Maybe, somehow, same irqfd got bound to multiple MSI vectors? >> It does not look like this is what is happening judging by the way >> interrupts are distributed between queues. They are not distributed >> uniformly and often I see one queue gets most interrupt and others get >> much less and then it changes. > Weird. It would happen if you transmitted from multiple CPUs. > You did pin iperf to a single CPU within guest, did you not? > No, I didn't because I didn't expect it to matter for input interrupts. When I run iperf on a host rx queue that receives all packets depends only on a connection itself, not on a cpu iperf is running on (I tested that). >>> This really depends on the type of networking card you have >>> on the host, and how it's configured. >>> >>> I think you will get something more closely resembling this >>> behaviour if you enable RFS in host. >>> When I pin iperf in a guest I do indeed see that all interrupts are arriving to the same irq vector. Is a number after virtio-input in /proc/interrupt any indication of a queue a packet arrived to (on a host I can use ethtool -S to check what queue receives packets, but unfortunately this does not work for virtio nic in a guest)? >>> I think it is. >>> Because if it is the way RSS works in virtio is not how it works on a host and not what I would expect after reading about RSS. The queue a packets arrives to should be calculated by hashing fields from a packet header only. >>> Yes, what virtio has is not RSS - it's an accelerated RFS really. >>> >> OK, if what virtio has is RFS and not RSS my test results make sense. >> Thanks! > I think the RSS emulation for virtio-mq NIC is implemented in > tun_select_queue(), > am I missing something? > > Thanks, > Zhang Haoyu > Yes, if RSS is the short for Receive Side Steering which is a generic technology. But RSS is usually short for Receive Side Scaling which was commonly technology used by Windows, it was implemented through a indirection table in the card which is obviously not supported in tun currently. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
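[Editor's sketch] For contrast with the flow-table approach sketched earlier in this thread, "classic" RSS in the Receive Side Scaling sense is stateless: the NIC hashes the packet's header fields and looks the result up in a fixed indirection table. Below is a minimal, generic illustration of that lookup; it is not tied to any particular NIC or driver, and the table size is just a typical example.

#include <stdint.h>

#define RSS_TABLE_SIZE 128	/* indirection table size, device dependent */

/* programmed once, e.g. round-robin across the RX queues; the hardware
 * then picks the queue purely from a Toeplitz-style hash of the packet
 * headers -- no per-flow state is kept */
static uint8_t rss_indirection[RSS_TABLE_SIZE];

static void rss_table_init(unsigned int num_queues)
{
	for (unsigned int i = 0; i < RSS_TABLE_SIZE; i++)
		rss_indirection[i] = i % num_queues;
}

static unsigned int rss_select_queue(uint32_t header_hash)
{
	return rss_indirection[header_hash % RSS_TABLE_SIZE];
}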
can I make this work… (Foundation for accessibility project)
This is a rather different use case than what you've been thinking of for KVM. It could mean significant improvement of the quality of life of disabled programmers like myself. It's difficult to convey what it's like to try to use computers with speech recognition for something other than writing, so bear with me when I say something is real but don't quite prove it yet. Also, please take it as read that the only really usable speech recognition environment out there is NaturallySpeaking, with Google close behind in terms of accuracy but not even on the same planet for the ability to extend for speech enabled applications. I'm trying to figure out ways of making it possible to drive Linux from Windows speech recognition (NaturallySpeaking). The goal is a system where Windows runs in a virtual machine (Linux host), audio is passed through from a USB headset to the Windows environment. And the output of the recognition engine is piped through some magic back to the Linux host. The hardest part of all of this, without question, is getting clean uninterrupted audio from the USB device all the way through to the Windows virtual machine. VirtualBox and VMware mostly fail at delivering reliable audio to the virtual machine. I expect KVM to not work right with regard to getting clean audio/real-time USB, but I'm asking in case I'm wrong. If it doesn't work or can't work yet, what would it take to make it possible for clean audio to be passed through to a guest? --- Why this is important, approaches that failed, why I think this will work. Boring accessibility info --- Attempts to make Windows or DOS based speech recognition drive Linux have a long and tortured history. Almost all of them involve some form of an open loop system that ignores system context and counts on the grammar to specify the context and the subsequent keystrokes injected into the target system. This model fails because it is effectively just speaking keyboard functions, which wastes the majority of the power of a good grammar in a speech recognition environment. The most common configuration for speech recognition in a virtualized environment today is that Windows is the host with speech recognition and Linux is the guest. It's just a reimplementation of the open-loop system described above where your dictation results are keystrokes injected into the virtual machine console window. Sometimes it works, sometimes it drops characters. One big failing of the Windows host/Linux guest environments is that, in addition to dropping characters, it seems to drop segments of the audio stream on the Windows side. It's common but not frequent for this to happen anyway when running Windows with any sort of CPU utilization, but it's almost guaranteed as soon as a virtual machine starts up. Another failing is that the context the recognition application is aware of is the window of the console. It knows nothing about the internal context of the virtual machine (what application has focus). And unfortunately it can't know anything more because of the way that NaturallySpeaking uses the local Windows context. Inverting the relationship between guest and host, where Linux is the host and Windows is the guest, solves at least the focus problem. In the virtual machine, you have a portal application that can control the perception of context and tunnel the character stream from the recognition engine into the host OS to drive it open loop. The portal application[1] can also communicate which grammar sequence has been parsed and what action should be taken on the host side.
At this point, we now have the capabilities of a closed-loop speech recognition environment where a grammar can read context to generate a new grammar to fit the application's state. This means smaller utterances which can be disambiguated, versus the more traditional large-utterance disambiguation technique. A couple of other advantages of Windows as a guest are that it only runs speech recognition in the portal. There are no browsers, no Flash, no JavaScript, no viruses and no other "stuff" taking up resources and distracting from speech recognition working as well as possible. The downside is that the host running the virtual machine needs to give the VM very high, almost real-time priority[2] so that it doesn't stall and speech recognition works as quickly and as accurately as possible. Hope I didn't bore you too badly. Thank you for reading and I hope we can make this work. --- eric [1] should I call it cake? [2] I'm looking at you, Firefox, sucking down 30% of the CPU doing nothing -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: vhost + multiqueue + RSS question.
On Tue, Nov 18, 2014 at 11:41:11AM +0800, Jason Wang wrote: > On 11/18/2014 09:37 AM, Zhang Haoyu wrote: > >> On Mon, Nov 17, 2014 at 01:58:20PM +0200, Michael S. Tsirkin wrote: > >>> On Mon, Nov 17, 2014 at 01:22:07PM +0200, Gleb Natapov wrote: > On Mon, Nov 17, 2014 at 12:38:16PM +0200, Michael S. Tsirkin wrote: > > On Mon, Nov 17, 2014 at 09:44:23AM +0200, Gleb Natapov wrote: > >> On Sun, Nov 16, 2014 at 08:56:04PM +0200, Michael S. Tsirkin wrote: > >>> On Sun, Nov 16, 2014 at 06:18:18PM +0200, Gleb Natapov wrote: > Hi Michael, > > I am playing with vhost multiqueue capability and have a question > about > vhost multiqueue and RSS (receive side steering). My setup has > Mellanox > ConnectX-3 NIC which supports multiqueue and RSS. Network related > parameters for qemu are: > > -netdev tap,id=hn0,script=qemu-ifup.sh,vhost=on,queues=4 > -device virtio-net-pci,netdev=hn0,id=nic1,mq=on,vectors=10 > > In a guest I ran "ethtool -L eth0 combined 4" to enable multiqueue. > > I am running one tcp stream into the guest using iperf. Since there > is > only one tcp stream I expect it to be handled by one queue only but > this seams to be not the case. ethtool -S on a host shows that the > stream is handled by one queue in the NIC, just like I would expect, > but in a guest all 4 virtio-input interrupt are incremented. Am I > missing any configuration? > >>> I don't see anything obviously wrong with what you describe. > >>> Maybe, somehow, same irqfd got bound to multiple MSI vectors? > >> It does not look like this is what is happening judging by the way > >> interrupts are distributed between queues. They are not distributed > >> uniformly and often I see one queue gets most interrupt and others get > >> much less and then it changes. > > Weird. It would happen if you transmitted from multiple CPUs. > > You did pin iperf to a single CPU within guest, did you not? > > > No, I didn't because I didn't expect it to matter for input interrupts. > When I run iperf on a host rx queue that receives all packets depends > only on a connection itself, not on a cpu iperf is running on (I tested > that). > >>> This really depends on the type of networking card you have > >>> on the host, and how it's configured. > >>> > >>> I think you will get something more closely resembling this > >>> behaviour if you enable RFS in host. > >>> > When I pin iperf in a guest I do indeed see that all interrupts > are arriving to the same irq vector. Is a number after virtio-input > in /proc/interrupt any indication of a queue a packet arrived to (on > a host I can use ethtool -S to check what queue receives packets, but > unfortunately this does not work for virtio nic in a guest)? > >>> I think it is. > >>> > Because if > it is the way RSS works in virtio is not how it works on a host and not > what I would expect after reading about RSS. The queue a packets arrives > to should be calculated by hashing fields from a packet header only. > >>> Yes, what virtio has is not RSS - it's an accelerated RFS really. > >>> > >> OK, if what virtio has is RFS and not RSS my test results make sense. > >> Thanks! > > I think the RSS emulation for virtio-mq NIC is implemented in > > tun_select_queue(), > > am I missing something? > > > > Thanks, > > Zhang Haoyu > > > > Yes, if RSS is the short for Receive Side Steering which is a generic > technology. 
> But RSS is usually short for Receive Side Scaling which was > commonly technology used by Windows, it was implemented through a > indirection table in the card which is obviously not supported in tun > currently. Hmm, I had an impression that "Receive Side Steering" and "Receive Side Scaling" are interchangeable. The software implementation of RSS is called "Receive Packet Steering" according to Documentation/networking/scaling.txt, not "Receive Packet Scaling". Those damn TLAs are confusing. -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html