Re: [patch 0/3] fixes for kvm on s390

2009-06-24 Thread Avi Kivity

On 06/23/2009 06:24 PM, Christian Borntraeger wrote:

Hello Avi,

here are three patches against kvm.git that fix several issues in our
kvm port. Please review and consider all patches for 2.6.31.

   


Applied all, and queued the stfle patch for 2.6.31.  The commits 
referenced in patch 1 and 3 doesn't exist in 2.6.31.  Please correct me 
if I misread things.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch 0/3] fixes for kvm on s390

2009-06-24 Thread Christian Bornträger
Am Mittwoch 24 Juni 2009 10:09:18 schrieben Sie:
> On 06/23/2009 06:24 PM, Christian Borntraeger wrote:
> > Hello Avi,
> >
> > here are three patches against kvm.git that fix several issues in our
> > kvm port. Please review and consider all patches for 2.6.31.
>
> Applied all, and queued the stfle patch for 2.6.31.  The commits
> referenced in patch 1 and 3 doesn't exist in 2.6.31.  Please correct me
> if I misread things.

Yes, the stfle issue is present on linus git and should go to Linus.

Fixes 1 and 3 are only for your kvm.git-tree. They should go to Linus in as 
soon 
as you push the referenced commits to Linus.

Christian
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: virtio_blk fails for rhel5.3 guest with LVM: kernel panic

2009-06-24 Thread Avi Kivity

On 06/23/2009 08:21 PM, sudhir kumar wrote:

I see that Rhel5.3 32 and 64 bit guests fail to boot with virtio for
block device. The
guest can not find the root filesystem. The guest is using LVM. As per
the linux-kvm wiki  instructions I did change device.map and booted
with if=virtio.
The wiki link is http://www.linux-kvm.org/page/Boot_from_virtio_block_device

   


Does your initrd include virtio-pci.ko and virtio-blk.ko?

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/8 v3] KVM support for 1GB pages

2009-06-24 Thread Avi Kivity

On 06/19/2009 04:16 PM, Joerg Roedel wrote:

Hi,

this is the third version of the patches for KVM to support 1GB pages. Changes
to the last version include:

- changed reporting of 1GB page support to userspace to use
  SUPPORTED_CPUID ioctl
- added support for 1GB pages to the software page walker code too

This code still only can make use of 1GB pages with nested paging enabled. I
will give the shadow paging code another debug round soon. Please comment or
consider to apply these patches.

Thanks,

Joerg

Shortlog:

Joerg Roedel (8):
   hugetlbfs: export vma_kernel_pagsize to modules
   kvm: change memslot data structures for multiple hugepage sizes
   


At Marcelo's suggestion I've applied the first two patches, to reduce 
your rebase load.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm: remove in_range from kvm_io_device

2009-06-24 Thread Michael S. Tsirkin
On Tue, Jun 23, 2009 at 09:43:31PM -0400, Gregory Haskins wrote:
> Michael S. Tsirkin wrote:
> > Remove in_range from kvm_io_device and ask read/write callbacks, if
> > supplied, to perform range checks internally. This allows aliasing
> > (mostly for in-kernel virtio), as well as better error handling by
> > making it possible to pass errors up to userspace. And it's enough to
> > look at the diffstat to see that it's a better API anyway.
> >
> > While we are at it, document locking rules for kvm_io_device.
> >   
> 
> Hi Michael,
> 
> I just tried to apply this to kvm.git/master, and it blew up really
> badly.  What tree should I be using?

Ugh, this is against 2.6.30. I'll post kvm.git version soon.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: KVM: x86: missing locking in PIT/IRQCHIP/SET_BSP_CPU ioctl paths

2009-06-24 Thread Avi Kivity

On 06/23/2009 09:05 PM, Marcelo Tosatti wrote:

Correct missing locking in a few places in x86's vm_ioctl handling path.

   


Applied, thanks.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch 0/3] fixes for kvm on s390

2009-06-24 Thread Avi Kivity

On 06/24/2009 11:18 AM, Christian Bornträger wrote:

Yes, the stfle issue is present on linus git and should go to Linus.
Fixes 1 and 3 are only for your kvm.git-tree. They should go to Linus in as soon
as you push the referenced commits to Linus.

   


That'll be 2.6.32.  I generally fold commits that fix other commits in 
the same pull.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch 0/3] fixes for kvm on s390

2009-06-24 Thread Christian Bornträger
Am Mittwoch 24 Juni 2009 11:18:32 schrieb Avi Kivity:
> On 06/24/2009 11:18 AM, Christian Bornträger wrote:
> > Yes, the stfle issue is present on linus git and should go to Linus.
> > Fixes 1 and 3 are only for your kvm.git-tree. They should go to Linus in
> > as soon as you push the referenced commits to Linus.
>
> That'll be 2.6.32.  I generally fold commits that fix other commits in
> the same pull.

That would be fine with me.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: virtio_blk fails for rhel5.3 guest with LVM: kernel panic

2009-06-24 Thread Avi Kivity

On 06/24/2009 12:41 PM, sudhir kumar wrote:

initrd does not contain these modules. My kernel contains the above modules.
Is it mandatory for the initrd to contain these modules? It does not
contain virtio-net and the guest boots fine with virtio.
   


virtio-net isn't used for locating the disks.  If you want your root 
filesystem to use virtio, you must include these modules.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: virtio_blk fails for rhel5.3 guest with LVM: kernel panic

2009-06-24 Thread sudhir kumar
initrd does not contain these modules. My kernel contains the above modules.
Is it mandatory for the initrd to contain these modules? It does not
contain virtio-net and the guest boots fine with virtio.

On Wed, Jun 24, 2009 at 2:04 PM, Avi Kivity wrote:
> On 06/23/2009 08:21 PM, sudhir kumar wrote:
>>
>> I see that Rhel5.3 32 and 64 bit guests fail to boot with virtio for
>> block device. The
>> guest can not find the root filesystem. The guest is using LVM. As per
>> the linux-kvm wiki  instructions I did change device.map and booted
>> with if=virtio.
>> The wiki link is
>> http://www.linux-kvm.org/page/Boot_from_virtio_block_device
>>
>>
>
> Does your initrd include virtio-pci.ko and virtio-blk.ko?
>
> --
> error compiling committee.c: too many arguments to function
>
>



-- 
Sudhir Kumar
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] introduce -cpu host target

2009-06-24 Thread Avi Kivity

On 06/23/2009 12:47 AM, Andre Przywara wrote:

Should we ignore unhandled MSRs like QEMU or Xen do?
   


Ignoring unhandled msrs is dangerous.  If a write has some effect the 
guest depends on, and we're not emulating that effect, the guest will 
fail.  Similarly if you don't know what a register mean, who knows what 
returning zero for a read will do.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] ignore AMDs HWCR register access to set the FFDIS bit

2009-06-24 Thread Avi Kivity

On 06/23/2009 01:20 AM, Andre Przywara wrote:

Linux tries to disable the flush filter on all AMD K8 CPUs. Since KVM
does not handle the needed MSR, the injected #GP will panic the Linux
kernel. Ignore setting of the HWCR.FFDIS bit in this MSR to let Linux
boot with an AMD K8 family guest CPU.

Signed-off-by: Andre Przywara
---
  arch/x86/kvm/x86.c |7 +++
  1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5a66bb9..4c19c24 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -796,6 +796,13 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 
data)
case MSR_EFER:
set_efer(vcpu, data);
break;
+   case MSR_K7_HWCR:
+   if (data != 0x40) {
+   pr_unimpl(vcpu, "unimplemented HWCR wrmsr: 0x%llx\n",
+   data);
+   return 1;
+   }
+   break;
   


Won't that printk() if writing a zero?  Just mask out that bit.

I also see some HWCR handling in svm.c, can probably be removed.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: KVM: x86: verify MTRR/PAT validity

2009-06-24 Thread Avi Kivity

On 06/22/2009 09:27 PM, Marcelo Tosatti wrote:


This in the BIOS is writing 1's into the reserved address bits into
variable MTRR:

 wrmsr_smp(MTRRphysMask_MSR(0), ~(0x2000ull - 1) | 0x800);

So i'll leave just memory type validity checking and MSR_MTRRdefType
valid bit checks in for now:
   

Applied, thanks.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] Cleanup cpu loop

2009-06-24 Thread Avi Kivity

On 06/22/2009 05:27 PM, Gleb Natapov wrote:

Rearrange cpu loop to be (hopefully) more readable. Put difference
between kernel/userspace irqchip in one place.

   


Applied, thanks.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/4] ignore reads from AMDs C1E enabled MSR

2009-06-24 Thread Andre Przywara
If the Linux kernel detects an C1E capable AMD processor (K8 RevF and
higher), it will access a certain MSR on every attempt to go to halt.
Explicitly handle this read and return 0 to let KVM run a Linux guest
with the native AMD host CPU propagated to the guest.

Signed-off-by: Andre Przywara 
---
 arch/x86/kvm/x86.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c5a0d37..a148f4c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1001,6 +1001,7 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, 
u64 *pdata)
case MSR_P6_EVNTSEL0:
case MSR_P6_EVNTSEL1:
case MSR_K7_EVNTSEL0:
+   case MSR_K8_INT_PENDING_MSG:
data = 0;
break;
case MSR_MTRRcap:
-- 
1.6.1.3


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/4] ignore PCI ECS I/O enablement

2009-06-24 Thread Andre Przywara
Linux guests will try to enable access to the extended PCI config space
via the I/O ports 0xCF8/0xCFC on AMD Fam10h CPU. Since we (currently?)
don't use ECS, simply ignore this write attempt.

Signed-off-by: Andre Przywara 
---
 arch/x86/kvm/x86.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a148f4c..e6e61ee 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -804,6 +804,8 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 
data)
return 1;
}
break;
+   case MSR_AMD64_NB_CFG:
+   break;
case MSR_IA32_DEBUGCTLMSR:
if (!data) {
/* We support the non-activated case already */
-- 
1.6.1.3


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/4] dont trim the guest's hypervisor CPUID bit in KVM if the guest requests it

2009-06-24 Thread Andre Przywara
Signed-off-by: Andre Przywara 
---
 arch/x86/kvm/x86.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e6e61ee..6ad0f93 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1415,7 +1415,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, 
u32 function,
0 /* Reserved */ | F(CX16) | 0 /* xTPR Update, PDCM */ |
0 /* Reserved, DCA */ | F(XMM4_1) |
F(XMM4_2) | 0 /* x2APIC */ | F(MOVBE) | F(POPCNT) |
-   0 /* Reserved, XSAVE, OSXSAVE */;
+   0 /* Reserved, XSAVE, OSXSAVE */ | F(HYPERVISOR);
/* cpuid 0x8001.ecx */
const u32 kvm_supported_word6_x86_features =
F(LAHF_LM) | F(CMP_LEGACY) | F(SVM) | 0 /* ExtApicSpace */ |
-- 
1.6.1.3


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/4] ignore AMDs HWCR register access to set the FFDIS bit

2009-06-24 Thread Andre Przywara
Linux tries to disable the flush filter on all AMD K8 CPUs. Since KVM
does not handle the needed MSR, the injected #GP will panic the Linux
kernel. Ignore setting of the HWCR.FFDIS bit in this MSR to let Linux
boot with an AMD K8 family guest CPU.

Signed-off-by: Andre Przywara 
---
 arch/x86/kvm/svm.c |1 -
 arch/x86/kvm/x86.c |8 
 2 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 9b37043..610bcae 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2139,7 +2139,6 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, unsigned 
ecx, u64 data)
break;
case MSR_VM_CR:
case MSR_VM_IGNNE:
-   case MSR_K7_HWCR:
pr_unimpl(vcpu, "unimplemented wrmsr: 0x%x data 0x%llx\n", ecx, 
data);
break;
default:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5a66bb9..c5a0d37 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -796,6 +796,14 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 
data)
case MSR_EFER:
set_efer(vcpu, data);
break;
+   case MSR_K7_HWCR:
+   data &= ~0x40U; /* ignore flush filter disable */
+   if (data != 0) {
+   pr_unimpl(vcpu, "unimplemented HWCR wrmsr: 0x%llx\n",
+   data);
+   return 1;
+   }
+   break;
case MSR_IA32_DEBUGCTLMSR:
if (!data) {
/* We support the non-activated case already */
-- 
1.6.1.3


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] introduce -cpu host target

2009-06-24 Thread Andre Przywara

Avi Kivity wrote:

On 06/23/2009 12:47 AM, Andre Przywara wrote:

Should we ignore unhandled MSRs like QEMU or Xen do?
   


Ignoring unhandled msrs is dangerous.  If a write has some effect the 
guest depends on, and we're not emulating that effect, the guest will 
fail.  Similarly if you don't know what a register mean, who knows what 
returning zero for a read will do.

I agree - from an academic POV.
But if the pragmatic approach simply enables many guests to run, then 
it's at least worth considering it.
And with the current approach the guest fails, too (due to the injected 
#GP).
If I only look at AMD's list of MSRs (not to speak of the internal list 
;-), there will be a lot of work to emulate them. Even worse, most of 
them cannot be properly emulated (like disable Lock prefix).


But nevertheless I would like to continue the "patch-on-demand" path by 
catching those MSRs that in-the-wild OSes really touch and handle them 
appropriately. Hopefully that will cover most of the MSRs.
Maybe we could consider an (module? QEMU cmdline?) option to ignore 
unknown MSRs.


Regards,
Andre.

--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
Tel: +49 351 448 3567 12
to satisfy European Law for business letters:
Advanced Micro Devices GmbH
Karl-Hammerschmidt-Str. 34, 85609 Dornach b. Muenchen
Geschaeftsfuehrer: Thomas M. McCoy; Giuliano Meroni
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC] virtio_test: A module for testing virtio via userspace

2009-06-24 Thread Christian Bornträger
Am Mittwoch 24 Juni 2009 05:40:34 schrieb Rusty Russell:
> > o  the general idea of a virtio_test module
> > o  the user interface ioctls
> > o  further ideas and comments
>
> Not mugging real drivers would be a requirement, I think.

Ok, I try to find a proper way to avoid that virtio_test binds to devices that 
have real drivers available. That patch to virtio_dev_match should be 
relatively 
easy.
The open question I have:  Should virtio_test bind to a device if no other 
driver is (yet) available?

> > +config VIRTIO_TEST
> > +   tristate "Virtio test driver (EXPERIMENTAL)"
> > +   select VIRTIO
> > +   select VIRTIO_RING
>
> Perhaps these should be depends?  Plus, depends on EXPERIMENTAL.
>
> > +If unsure, say M.
>
> That's "N" I think.

Yes. 

>
> > +   case VIOTEST_IOCGETBUF:
> > +   ret = do_get_buf(vtest, (struct viotest_getbuf __user *) arg);
> > +   break;
> > +   case VIOTEST_IOCGETCBS:
> > +   ret = get_callbacks(vtest, (struct viotest_cbinfo __user *) 
> > arg);
> > +   break;
>
> Generally the point of callbacks is to tell you you have new buffers; in
> fact you're insulated from callbacks which don't show new buffers.  So I'm
> not sure these two need to be separate?
> In which case, a read/write interface starts to make sense (write for
> addbuf and kick, read for get_buf).  That fits nicely with O_NONBLOCK and
> poll().

Hmm - makes sense. I will try to propose a 2nd version of the interface. The 
interface must handle multiple virtqueues per device, should allow non-blocking 
mode etc. Lets see what ideas come to my mind.

Thanks for the comments

Christian
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] introduce -cpu host target

2009-06-24 Thread Avi Kivity

On 06/24/2009 02:04 PM, Andre Przywara wrote:

Avi Kivity wrote:

On 06/23/2009 12:47 AM, Andre Przywara wrote:

Should we ignore unhandled MSRs like QEMU or Xen do?


Ignoring unhandled msrs is dangerous.  If a write has some effect the 
guest depends on, and we're not emulating that effect, the guest will 
fail.  Similarly if you don't know what a register mean, who knows 
what returning zero for a read will do.

I agree - from an academic POV.
But if the pragmatic approach simply enables many guests to run, then 
it's at least worth considering it.
And with the current approach the guest fails, too (due to the 
injected #GP).
If I only look at AMD's list of MSRs (not to speak of the internal 
list ;-), there will be a lot of work to emulate them. Even worse, 
most of them cannot be properly emulated (like disable Lock prefix).


We don't need to emulate all values.  We can allow default values or 
bits that don't matter for virtualization, and warn on bits that don't 
emulate properly.  I'm not concerned about failure -- just silent failure.




But nevertheless I would like to continue the "patch-on-demand" path 
by catching those MSRs that in-the-wild OSes really touch and handle 
them appropriately. Hopefully that will cover most of the MSRs.
Maybe we could consider an (module? QEMU cmdline?) option to ignore 
unknown MSRs.


Good idea.  We can start with a module option, if it proves popular we 
can graduate it to a per-guest ioctl.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] dont trim the guest's hypervisor CPUID bit in KVM if the guest requests it

2009-06-24 Thread Avi Kivity

On 06/24/2009 01:44 PM, Andre Przywara wrote:

Signed-off-by: Andre Przywara
---
  arch/x86/kvm/x86.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e6e61ee..6ad0f93 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1415,7 +1415,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, 
u32 function,
0 /* Reserved */ | F(CX16) | 0 /* xTPR Update, PDCM */ |
0 /* Reserved, DCA */ | F(XMM4_1) |
F(XMM4_2) | 0 /* x2APIC */ | F(MOVBE) | F(POPCNT) |
-   0 /* Reserved, XSAVE, OSXSAVE */;
+   0 /* Reserved, XSAVE, OSXSAVE */ | F(HYPERVISOR);
   


I think this should be handled in qemu, since it isn't really a cpu bit.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4] ignore PCI ECS I/O enablement

2009-06-24 Thread Avi Kivity

On 06/24/2009 01:44 PM, Andre Przywara wrote:

Linux guests will try to enable access to the extended PCI config space
via the I/O ports 0xCF8/0xCFC on AMD Fam10h CPU. Since we (currently?)
don't use ECS, simply ignore this write attempt.

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a148f4c..e6e61ee 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -804,6 +804,8 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 
data)
return 1;
}
break;
+   case MSR_AMD64_NB_CFG:
+   break;
case MSR_IA32_DEBUGCTLMSR:
if (!data) {
/* We support the non-activated case already */
   



I see Linux does both rdmsr and wrmsr, don't we need to support both?

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/4] ignore AMDs HWCR register access to set the FFDIS bit

2009-06-24 Thread Avi Kivity

On 06/24/2009 01:44 PM, Andre Przywara wrote:

Linux tries to disable the flush filter on all AMD K8 CPUs. Since KVM
does not handle the needed MSR, the injected #GP will panic the Linux
kernel. Ignore setting of the HWCR.FFDIS bit in this MSR to let Linux
boot with an AMD K8 family guest CPU.

   


Applied first two patched, thanks.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[ kvm-Bugs-2811454 ] CPU ID Emulation bug: AMD Athlon

2009-06-24 Thread SourceForge.net
Bugs item #2811454, was opened at 2009-06-24 14:40
Message generated for change (Tracker Item Submitted) made by technologov
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2811454&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: qemu
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Technologov (technologov)
Assigned to: Nobody/Anonymous (nobody)
Summary: CPU ID Emulation bug: AMD Athlon

Initial Comment:
Qemu incorrectly emulates CPU ID of AMD Athlon CPU.

Namely:
1. It uses CPU Model of "Qemu CPU"
2. It shows SSE/SSE2 instructions as CPU flags, while AMD Athlons don't have it.
3. When running KVM on Intel CPU, it shows "CPU vendor" as Intel instead of AMD.
In my opinion KVM should refuse to emulate AMD Athlon CPUID on Intel CPUs, due 
to lacking 3Dnow! (unless those instructions can be emulated)

-Alexey, 24.6.2009.

--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2811454&group_id=180599
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[ kvm-Bugs-2811454 ] CPU ID Emulation bug: AMD Athlon

2009-06-24 Thread SourceForge.net
Bugs item #2811454, was opened at 2009-06-24 14:40
Message generated for change (Comment added) made by technologov
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2811454&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: qemu
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Technologov (technologov)
Assigned to: Nobody/Anonymous (nobody)
Summary: CPU ID Emulation bug: AMD Athlon

Initial Comment:
Qemu incorrectly emulates CPU ID of AMD Athlon CPU.

Namely:
1. It uses CPU Model of "Qemu CPU"
2. It shows SSE/SSE2 instructions as CPU flags, while AMD Athlons don't have it.
3. When running KVM on Intel CPU, it shows "CPU vendor" as Intel instead of AMD.
In my opinion KVM should refuse to emulate AMD Athlon CPUID on Intel CPUs, due 
to lacking 3Dnow! (unless those instructions can be emulated)

-Alexey, 24.6.2009.

--

>Comment By: Technologov (technologov)
Date: 2009-06-24 14:41

Message:
Qemu command:
./qemu-kvm -hda /isos/disks-vm/alexeye/fc9-32.qcow2 -snapshot -m 512 -cpu
athlon

--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2811454&group_id=180599
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[ kvm-Bugs-2811454 ] CPU ID Emulation bug: AMD Athlon

2009-06-24 Thread SourceForge.net
Bugs item #2811454, was opened at 2009-06-24 14:40
Message generated for change (Comment added) made by technologov
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2811454&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: qemu
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Technologov (technologov)
Assigned to: Nobody/Anonymous (nobody)
Summary: CPU ID Emulation bug: AMD Athlon

Initial Comment:
Qemu incorrectly emulates CPU ID of AMD Athlon CPU.

Namely:
1. It uses CPU Model of "Qemu CPU"
2. It shows SSE/SSE2 instructions as CPU flags, while AMD Athlons don't have it.
3. When running KVM on Intel CPU, it shows "CPU vendor" as Intel instead of AMD.
In my opinion KVM should refuse to emulate AMD Athlon CPUID on Intel CPUs, due 
to lacking 3Dnow! (unless those instructions can be emulated)

-Alexey, 24.6.2009.

--

>Comment By: Technologov (technologov)
Date: 2009-06-24 14:57

Message:
Tesed on KVM-87rc7 on Intel, AMD and -no-kvm (Qemu)

--

Comment By: Technologov (technologov)
Date: 2009-06-24 14:41

Message:
Qemu command:
./qemu-kvm -hda /isos/disks-vm/alexeye/fc9-32.qcow2 -snapshot -m 512 -cpu
athlon

--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2811454&group_id=180599
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] dont trim the guest's hypervisor CPUID bit in KVM if the guest requests it

2009-06-24 Thread Andre Przywara

Avi Kivity wrote:

On 06/24/2009 01:44 PM, Andre Przywara wrote:

Signed-off-by: Andre Przywara
---
  arch/x86/kvm/x86.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e6e61ee..6ad0f93 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1415,7 +1415,7 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 
*entry, u32 function,

  0 /* Reserved */ | F(CX16) | 0 /* xTPR Update, PDCM */ |
  0 /* Reserved, DCA */ | F(XMM4_1) |
  F(XMM4_2) | 0 /* x2APIC */ | F(MOVBE) | F(POPCNT) |
-0 /* Reserved, XSAVE, OSXSAVE */;
+0 /* Reserved, XSAVE, OSXSAVE */ | F(HYPERVISOR);
   


I think this should be handled in qemu, since it isn't really a cpu bit.


But this would require to make an exception for turning this bit on 
again after it has been trimmed.
I just made (QEMU) patches for make this trimming really work (and 
removing all the hacked bits), so I would like to not spoil this again 
by introducing another excecption. After all this is more a list of what 
KVM does _not_ support (should we make this a negative one? gets ugly 
with the reserved bits), so I would like to leave it here.

I will send out the patches to qemu-devel when I found the last bug.

Regards,
Andre.

--
Andre Przywara
AMD-OSRC (Dresden)
Tel: x29712

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: virtio-serial: A guest <-> host interface for simple communication

2009-06-24 Thread Amit Shah
On (Wed) Jun 24 2009 [13:45:01], Rusty Russell wrote:
> On Tue, 23 Jun 2009 10:12:31 pm Amit Shah wrote:
> > Hello,
> >
> > Here are two patches. One implements a virtio-serial device in qemu
> > and the other is the driver for a guest kernel.
> >
> > While working on a vmchannel interface that is needed for communication
> > between guest userspace and host userspace, I saw that most of the
> > interface can be abstracted out as a "serial" device with "ports".
> 
> OK, I don't think the "naming" idea works though.  A userspace user would 
> have 
> to open each one in turn to get its name.  I'd stick with numbers.

What if an ioctl were added to get the port number from the port name?
Userspace would do
ioctl(fd, VIRTIO_SERIAL_GET_PORT_FROM_NAME, &nr);
sprintf(port, "/dev/vmch%s", nr);
fd2 = open(port, ...);

?

> You also don't have dynamic creation and removal, except by hotpluging the 
> entire device (which was on your requirements page).

Actually we're more interested in hotplugging ports than the device
itself ("Dynamic channel creation").

> what ports exist.  Register on the change interrupt to get updates.  Drop the 
> control vq entirely.

If the ioctl mentioned above were added, it would justify the control
vq, right?

Thanks for the comments,
Amit
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] dont trim the guest's hypervisor CPUID bit in KVM if the guest requests it

2009-06-24 Thread Andre Przywara

Andre Przywara wrote:

Avi Kivity wrote:

--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1415,7 +1415,7 @@ static void do_cpuid_ent(struct 
-0 /* Reserved, XSAVE, OSXSAVE */;

+0 /* Reserved, XSAVE, OSXSAVE */ | F(HYPERVISOR);


I think this should be handled in qemu, since it isn't really a cpu bit.


But this would require to make an exception for turning this bit on 
again after it has been trimmed.
I just made (QEMU) patches for make this trimming really work (and 
removing all the hacked bits), so I would like to not spoil this again 
by introducing another exception.
I kind of tricked myself here. Since the hypervisor bit is (well, 
mostly) not set in the host, it will always be removed from the guest. 
So we need an exception for this anyway.

So please drop this patch, I will come up with some QEMU solution for this.

Regards,
Andre.

After all this is more a list of what 
KVM does _not_ support (should we make this a negative one? gets ugly 
with the reserved bits), so I would like to leave it here.

I will send out the patches to qemu-devel when I found the last bug.



--
Andre Przywara
AMD-OSRC (Dresden)
Tel: x29712

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/4] ignore PCI ECS I/O enablement

2009-06-24 Thread Andre Przywara
Linux guests will try to enable access to the extended PCI config space
via the I/O ports 0xCF8/0xCFC on AMD Fam10h CPU. Since we (currently?)
don't use ECS, simply ignore write and read attempts.

Signed-off-by: Andre Przywara 
---
 arch/x86/kvm/x86.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a148f4c..c717037 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -804,6 +804,8 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 
data)
return 1;
}
break;
+   case MSR_AMD64_NB_CFG:
+   break;
case MSR_IA32_DEBUGCTLMSR:
if (!data) {
/* We support the non-activated case already */
@@ -1002,6 +1004,7 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, 
u64 *pdata)
case MSR_P6_EVNTSEL1:
case MSR_K7_EVNTSEL0:
case MSR_K8_INT_PENDING_MSG:
+   case MSR_AMD64_NB_CFG:
data = 0;
break;
case MSR_MTRRcap:
-- 
1.6.1.3


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


BUG: using smp_processor_id() in preemptible

2009-06-24 Thread Johannes Berg
Hi,

I'm trying to run a test environment in kvm (because uml doesn't have
lockdep), and am running into the following problems:

1) I get the $subject warning a lot, when starting kvm:
[85763.262707] BUG: using smp_processor_id() in preemptible [] code: 
kvm/13877
[85763.262719] caller is kvm_write_guest_time+0x40/0x220 [kvm]
[85763.262722] Pid: 13877, comm: kvm Not tainted 2.6.30-wl-26837-g0ee651a-dirty 
#54
[85763.262725] Call Trace:
[85763.262729]  [] debug_smp_processor_id+0xf2/0x100
[85763.262741]  [] kvm_write_guest_time+0x40/0x220 [kvm]
[85763.262753]  [] vcpu_enter_guest+0x320/0x580 [kvm]
[85763.262780]  [] __vcpu_run+0x74/0x2f0 [kvm]
[85763.262792]  [] kvm_arch_vcpu_ioctl_run+0x8f/0x200 [kvm]
[85763.262804]  [] kvm_vcpu_ioctl+0x4b8/0x900 [kvm]
[85763.262816]  [] vfs_ioctl+0x36/0xb0
[85763.262819]  [] do_vfs_ioctl+0x89/0x320
[85763.262826]  [] sys_ioctl+0x4f/0x80
[85763.262830]  [] system_call_fastpath+0x16/0x1b

That kernel version is wireless-testing, which is currently based on
v2.6.30, and the -dirty is for some wireless patches I did.

2) The second problem is that it doesn't actually work. I use this
command line:
kvm -kernel arch/x86_64/boot/bzImage \
-hda ../uml/Ubuntu-IntrepidIbex-amd64-root_fs \
-append "root=/dev/hda console=ttyS0" -curses

and the system hangs after
  Plex86/Bochs VGABios (PCI) current-cvs 12 Jun 2009
  This VGA/VBE Bios is released under the GNU LGPL

  Please visit :
   . http://bochs.sourceforge.net
   . http://www.nongnu.org/vgabios

  cirrus-compatible VGA is detected

  QEMU BIOS - build: 06/12/09
  $Revision: 1.182 $ $Date: 2007/08/01 17:09:51 $
  Options: apmbios pcibios eltorito rombios32

  ata0 master: QEMU HARDDISK ATA-7 Hard-Disk (1024 MBytes)
  ata1 master: QEMU DVD-ROM ATAPI-4 CD-Rom/DVD-Rom

  Press F12 for boot menu.


  Decompressing Linux... Parsing ELF... done.
  Booting the kernel.

The guest kernel is the same as the host, but with somewhat different
config options.

The strange thing here is that the exact same command line, with
qemu-system-x86_64 instead of kvm works perfectly.

johannes


signature.asc
Description: This is a digitally signed message part


ia64: why is mmio bus access lockless?

2009-06-24 Thread Michael S. Tsirkin
Hi,
arch/ia64/kvm/kvm-ia64.c does this:

   mmio_dev = vcpu_find_mmio_dev(vcpu, p->addr, p->size, !p->dir);
   if (mmio_dev) {
   if (!p->dir)
   kvm_iodevice_write(mmio_dev, p->addr, p->size,
   &p->data);
   else
   kvm_iodevice_read(mmio_dev, p->addr, p->size,
   &p->data);
without holding kvm lock.

I know that find/read/write on the bus require locking on x86 -
are they safe to do without locking on ia64?

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] virtio-serial: A guest <-> host interface for simple communication

2009-06-24 Thread Jamie Lokier
Amit Shah wrote:
> A few sample uses for a vmchannel are to share the host and guest
> clipboards (to allow copy/paste between a host and a guest), to
> lock the screen of the guest session when the vnc viewer is closed,
> to find out which applications are installed on a guest OS even when
> the guest is powered down (using virt-inspector) and so on.

Those all look like useful features.

Can you run an application to provide those features on a guest which
_doesn't_ have a vmchannel/virtio-serial support in the kernel?

Or will it be restricted only to guests which have QEMU-specific
support in their kernel?

-- Jamie
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] Re: [PATCH 2/2] introduce -cpu host target

2009-06-24 Thread Jamie Lokier
Andre Przywara wrote:
> Even worse, most of them cannot be properly emulated (like disable
> Lock prefix).

Disable Lock prefix should be easy to emulate by ignoring it,
shouldn't it? :-)

-- Jamie
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] support colon in filenames

2009-06-24 Thread Ram Pai
Problem: It is impossible to feed filenames with the character colon because 
qemu interprets such names as a protocol. For example  a filename scsi:0,
is interpreted as a protocol by name "scsi".

This patch allows users to espace colon characters. For example the above 
filename
can now be expressed as 'scsi\:0'

Here are couple of examples:

ndb:\::  is treated as a ndb protocol with a hostname ':' on port 
scsi\:0\:abc is a local file scsi:0:abc
http\://myweb is a local file by name http://myweb
nbd\::localhost:2558  is a protocol by name nbd: 

Signed-off-by: Ram Pai 
---


 block.c   |   26 +-
 block/nbd.c   |   19 ++-
 block/raw-posix.c |   24 +---
 cutils.c  |   22 ++
 qemu-common.h |1 +
 vl.c  |3 +--
 6 files changed, 72 insertions(+), 23 deletions(-)

diff --git a/block.c b/block.c
index aca5a6d..80bded9 100644
--- a/block.c
+++ b/block.c
@@ -225,22 +225,30 @@ static BlockDriver *find_protocol(const char *filename)
 {
 BlockDriver *drv1;
 char protocol[128];
-int len;
-const char *p;
+char *p = protocol; 
+const char *f=filename;
+int len = strnlen(filename, 128);
 
 #ifdef _WIN32
 if (is_windows_drive(filename) ||
 is_windows_drive_prefix(filename))
 return bdrv_find_format("raw");
 #endif
-p = strchr(filename, ':');
-if (!p)
+while ( f < filename+len ) {
+   if ( *f == ':' )
+   break;
+   if ( *f == '\\') {
+   f++;
+   if ( *f == '\0')
+   break;
+   }
+   *p++ = *f++;
+}
+*p='\0';
+
+if (*f != ':')
 return bdrv_find_format("raw");
-len = p - filename;
-if (len > sizeof(protocol) - 1)
-len = sizeof(protocol) - 1;
-memcpy(protocol, filename, len);
-protocol[len] = '\0';
+
 for(drv1 = first_drv; drv1 != NULL; drv1 = drv1->next) {
 if (drv1->protocol_name &&
 !strcmp(drv1->protocol_name, protocol))
diff --git a/block/nbd.c b/block/nbd.c
index 47d4778..a011cc7 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -64,18 +64,27 @@ static int nbd_open(BlockDriverState *bs, const char* 
filename, int flags)
 
 } else {
 uint16_t port;
-char *p, *r;
+char *q, *p, *r;
 char hostname[128];
 
 pstrcpy(hostname, 128, host);
 
-p = strchr(hostname, ':');
-if (p == NULL)
+   q=p=hostname;
+   while ( p < hostname+128 ) {
+   if (*p == ':')
+   break;
+   if ( *p == '\\' )  {
+   p++;
+   if (*p =='\0') 
+   break;
+   }
+   *q++=*p++;
+   }
+if (*p != ':')
 return -EINVAL;
 
-*p = '\0';
+   *q='\0';
 p++;
-
 port = strtol(p, &r, 0);
 if (r == p)
 return -EINVAL;
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 41bfa37..98ede17 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -124,6 +124,16 @@ static int fd_open(BlockDriverState *bs);
 static int cdrom_reopen(BlockDriverState *bs);
 #endif
 
+static int _open(const char *filename, int flags, ...)
+{
+   char myfile[PATH_MAX];
+   va_list ap;
+   va_start(ap, flags);
+   return  open(esc_string(myfile, PATH_MAX, filename),
+   flags, ap);
+}
+
+
 static int raw_open_common(BlockDriverState *bs, const char *filename,
 int flags)
 {
@@ -151,7 +161,7 @@ static int raw_open_common(BlockDriverState *bs, const char 
*filename,
 s->open_flags |= O_DSYNC;
 
 s->fd = -1;
-fd = open(filename, s->open_flags, 0644);
+fd = _open(filename, s->open_flags, 0644);
 if (fd < 0) {
 ret = -errno;
 if (ret == -EROFS)
@@ -844,7 +854,7 @@ static int raw_create(const char *filename, 
QEMUOptionParameter *options)
 options++;
 }
 
-fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
+fd = _open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
   0644);
 if (fd < 0)
 return -EIO;
@@ -985,7 +995,7 @@ static int hdev_open(BlockDriverState *bs, const char 
*filename, int flags)
 if ( bsdPath[ 0 ] != '\0' ) {
 strcat(bsdPath,"s0");
 /* some CDs don't have a partition 0 */
-fd = open(bsdPath, O_RDONLY | O_BINARY | O_LARGEFILE);
+fd = _open(bsdPath, O_RDONLY | O_BINARY | O_LARGEFILE);
 if (fd < 0) {
 bsdPath[strlen(bsdPath)-1] = '1';
 } else {
@@ -1037,7 +1047,7 @@ static int fd_open(BlockDriverState *bs)
 #endif
 return -EIO;
 }
-s->fd = open(bs->filename, s->open_flags & ~O_NONBLOCK);
+s->fd = _open(bs->filename, s->open_flags & ~O_NONBLOCK);

Re: [PATCH] support colon in filenames

2009-06-24 Thread Balbir Singh
* Ram Pai  [2009-06-24 09:58:59]:

> Problem: It is impossible to feed filenames with the character colon because 
> qemu interprets such names as a protocol. For example  a filename scsi:0,
> is interpreted as a protocol by name "scsi".
> 
> This patch allows users to espace colon characters. For example the above 
> filename
> can now be expressed as 'scsi\:0'
> 
> Here are couple of examples:
> 
> ndb:\::  is treated as a ndb protocol with a hostname ':' on port 
> scsi\:0\:abc is a local file scsi:0:abc
> http\://myweb is a local file by name http://myweb
> nbd\::localhost:2558  is a protocol by name nbd: 
> 
> Signed-off-by: Ram Pai 

Are colons useful for filenames? Is there a common use case for this?

-- 
Balbir
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] virtio-serial: A guest <-> host interface for simple communication

2009-06-24 Thread Amit Shah
On (Wed) Jun 24 2009 [17:40:49], Jamie Lokier wrote:
> Amit Shah wrote:
> > A few sample uses for a vmchannel are to share the host and guest
> > clipboards (to allow copy/paste between a host and a guest), to
> > lock the screen of the guest session when the vnc viewer is closed,
> > to find out which applications are installed on a guest OS even when
> > the guest is powered down (using virt-inspector) and so on.
> 
> Those all look like useful features.
> 
> Can you run an application to provide those features on a guest which
> _doesn't_ have a vmchannel/virtio-serial support in the kernel?
> 
> Or will it be restricted only to guests which have QEMU-specific
> support in their kernel?

libguestfs currently uses the -net user based vmchannel interface that
exists in current qemu. That doesn't need a kernel that doesn't have
support for virtio-serial.

Amit
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] support colon in filenames

2009-06-24 Thread Amit Shah
On (Wed) Jun 24 2009 [09:58:59], Ram Pai wrote:
> Problem: It is impossible to feed filenames with the character colon because 
> qemu interprets such names as a protocol. For example  a filename scsi:0,
> is interpreted as a protocol by name "scsi".
> 
> This patch allows users to espace colon characters. For example the above 
> filename
> can now be expressed as 'scsi\:0'

Eduardo (CC'ed) had a patch as well:

http://thread.gmane.org/gmane.comp.emulators.qemu/39229/focus=39229

Amit
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] support colon in filenames

2009-06-24 Thread Amit Shah
(Actually add Eduardo to CC)

On (Wed) Jun 24 2009 [22:56:59], Amit Shah wrote:
> On (Wed) Jun 24 2009 [09:58:59], Ram Pai wrote:
> > Problem: It is impossible to feed filenames with the character colon 
> > because 
> > qemu interprets such names as a protocol. For example  a filename scsi:0,
> > is interpreted as a protocol by name "scsi".
> > 
> > This patch allows users to espace colon characters. For example the above 
> > filename
> > can now be expressed as 'scsi\:0'
> 
> Eduardo (CC'ed) had a patch as well:
> 
> http://thread.gmane.org/gmane.comp.emulators.qemu/39229/focus=39229

Amit
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] support colon in filenames

2009-06-24 Thread Ram Pai
On Wed, 2009-06-24 at 22:38 +0530, Balbir Singh wrote:
> * Ram Pai  [2009-06-24 09:58:59]:
> 
> > Problem: It is impossible to feed filenames with the character colon 
> > because 
> > qemu interprets such names as a protocol. For example  a filename scsi:0,
> > is interpreted as a protocol by name "scsi".
> > 
> > This patch allows users to espace colon characters. For example the above 
> > filename
> > can now be expressed as 'scsi\:0'
> > 
> > Here are couple of examples:
> > 
> > ndb:\::  is treated as a ndb protocol with a hostname ':' on port 
> > scsi\:0\:abc is a local file scsi:0:abc
> > http\://myweb is a local file by name http://myweb
> > nbd\::localhost:2558  is a protocol by name nbd: 
> > 
> > Signed-off-by: Ram Pai 
> 
> Are colons useful for filenames? Is there a common use case for this?

Yes. files like
 
/dev/disk/by-path/pci-:0b:00.0-sas-phy0:1-0x5000c5000c14b41d:0-lun0-part3

RP

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] Re: [PATCH 2/2] introduce -cpu host target

2009-06-24 Thread Filip Navara
On Wed, Jun 24, 2009 at 11:54 AM, Avi Kivity wrote:
> On 06/23/2009 12:47 AM, Andre Przywara wrote:
>>
>> Should we ignore unhandled MSRs like QEMU or Xen do?
>>
>
> Ignoring unhandled msrs is dangerous.  If a write has some effect the guest
> depends on, and we're not emulating that effect, the guest will fail.
>  Similarly if you don't know what a register mean, who knows what returning
> zero for a read will do.

It is definitely a bad idea to ignore unknown MSRs. Kernel patch
protection scheme used by certain operating system depend on them to
work properly and it's pretty hard to debug when you don't know what
failed (the MSR read in this case).

http://www.uninformed.org/?v=3&a=3
http://www.uninformed.org/?v=6&a=1
http://www.uninformed.org/?v=8&a=5
http://en.wikipedia.org/wiki/Kernel_Patch_Protection

Best regards,
Filip Navara
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] Re: [PATCH 2/2] introduce -cpu host target

2009-06-24 Thread Avi Kivity

On 06/24/2009 08:37 PM, Filip Navara wrote:

On Wed, Jun 24, 2009 at 11:54 AM, Avi Kivity  wrote:
   

On 06/23/2009 12:47 AM, Andre Przywara wrote:
 

Should we ignore unhandled MSRs like QEMU or Xen do?

   

Ignoring unhandled msrs is dangerous.  If a write has some effect the guest
depends on, and we're not emulating that effect, the guest will fail.
  Similarly if you don't know what a register mean, who knows what returning
zero for a read will do.
 


It is definitely a bad idea to ignore unknown MSRs. Kernel patch
protection scheme used by certain operating system depend on them to
work properly and it's pretty hard to debug when you don't know what
failed (the MSR read in this case).

http://www.uninformed.org/?v=3&a=3
http://www.uninformed.org/?v=6&a=1
http://www.uninformed.org/?v=8&a=5
http://en.wikipedia.org/wiki/Kernel_Patch_Protection

   


Which unknown msrs are used by kernel patch protection?

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] virtio-serial: A guest <-> host interface for simple communication

2009-06-24 Thread Jamie Lokier
Amit Shah wrote:
> On (Wed) Jun 24 2009 [17:40:49], Jamie Lokier wrote:
> > Amit Shah wrote:
> > > A few sample uses for a vmchannel are to share the host and guest
> > > clipboards (to allow copy/paste between a host and a guest), to
> > > lock the screen of the guest session when the vnc viewer is closed,
> > > to find out which applications are installed on a guest OS even when
> > > the guest is powered down (using virt-inspector) and so on.
> > 
> > Those all look like useful features.
> > 
> > Can you run an application to provide those features on a guest which
> > _doesn't_ have a vmchannel/virtio-serial support in the kernel?
> > 
> > Or will it be restricted only to guests which have QEMU-specific
> > support in their kernel?
> 
> libguestfs currently uses the -net user based vmchannel interface that
> exists in current qemu. That doesn't need a kernel that doesn't have
> support for virtio-serial.

That's great!

If that works fine, and guest apps/libraries are using that as a
fallback anyway, what benefit do they get from switching to
virtio-serial when they detect that instead, given they still have
code for the -net method?

Is the plan to remove -net user based support from libguestfs?

Is virtio-serial significantly simpler to use?

-- Jamie
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] support colon in filenames

2009-06-24 Thread Ram Pai
Copying the qemu-devel mailing list too.

On Wed, 2009-06-24 at 09:58 -0700, Ram Pai wrote:
> Problem: It is impossible to feed filenames with the character colon because 
> qemu interprets such names as a protocol. For example  a filename scsi:0,
> is interpreted as a protocol by name "scsi".
> 
> This patch allows users to espace colon characters. For example the above 
> filename
> can now be expressed as 'scsi\:0'
> 
> Here are couple of examples:
> 
> ndb:\::  is treated as a ndb protocol with a hostname ':' on port 
> scsi\:0\:abc is a local file scsi:0:abc
> http\://myweb is a local file by name http://myweb
> nbd\::localhost:2558  is a protocol by name nbd: 
> 
> Signed-off-by: Ram Pai 
> ---
> 
> 
>  block.c   |   26 +-
>  block/nbd.c   |   19 ++-
>  block/raw-posix.c |   24 +---
>  cutils.c  |   22 ++
>  qemu-common.h |1 +
>  vl.c  |3 +--
>  6 files changed, 72 insertions(+), 23 deletions(-)
> 
> diff --git a/block.c b/block.c
> index aca5a6d..80bded9 100644
> --- a/block.c
> +++ b/block.c
> @@ -225,22 +225,30 @@ static BlockDriver *find_protocol(const char *filename)
>  {
>  BlockDriver *drv1;
>  char protocol[128];
> -int len;
> -const char *p;
> +char *p = protocol; 
> +const char *f=filename;
> +int len = strnlen(filename, 128);
> 
>  #ifdef _WIN32
>  if (is_windows_drive(filename) ||
>  is_windows_drive_prefix(filename))
>  return bdrv_find_format("raw");
>  #endif
> -p = strchr(filename, ':');
> -if (!p)
> +while ( f < filename+len ) {
> + if ( *f == ':' )
> + break;
> + if ( *f == '\\') {
> + f++;
> + if ( *f == '\0')
> + break;
> + }
> + *p++ = *f++;
> +}
> +*p='\0';
> +
> +if (*f != ':')
>  return bdrv_find_format("raw");
> -len = p - filename;
> -if (len > sizeof(protocol) - 1)
> -len = sizeof(protocol) - 1;
> -memcpy(protocol, filename, len);
> -protocol[len] = '\0';
> +
>  for(drv1 = first_drv; drv1 != NULL; drv1 = drv1->next) {
>  if (drv1->protocol_name &&
>  !strcmp(drv1->protocol_name, protocol))
> diff --git a/block/nbd.c b/block/nbd.c
> index 47d4778..a011cc7 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -64,18 +64,27 @@ static int nbd_open(BlockDriverState *bs, const char* 
> filename, int flags)
> 
>  } else {
>  uint16_t port;
> -char *p, *r;
> +char *q, *p, *r;
>  char hostname[128];
> 
>  pstrcpy(hostname, 128, host);
> 
> -p = strchr(hostname, ':');
> -if (p == NULL)
> + q=p=hostname;
> + while ( p < hostname+128 ) {
> + if (*p == ':')
> + break;
> + if ( *p == '\\' )  {
> + p++;
> + if (*p =='\0') 
> + break;
> + }
> + *q++=*p++;
> + }
> +if (*p != ':')
>  return -EINVAL;
> 
> -*p = '\0';
> + *q='\0';
>  p++;
> -
>  port = strtol(p, &r, 0);
>  if (r == p)
>  return -EINVAL;
> diff --git a/block/raw-posix.c b/block/raw-posix.c
> index 41bfa37..98ede17 100644
> --- a/block/raw-posix.c
> +++ b/block/raw-posix.c
> @@ -124,6 +124,16 @@ static int fd_open(BlockDriverState *bs);
>  static int cdrom_reopen(BlockDriverState *bs);
>  #endif
> 
> +static int _open(const char *filename, int flags, ...)
> +{
> + char myfile[PATH_MAX];
> + va_list ap;
> + va_start(ap, flags);
> + return  open(esc_string(myfile, PATH_MAX, filename),
> + flags, ap);
> +}
> +
> +
>  static int raw_open_common(BlockDriverState *bs, const char *filename,
>  int flags)
>  {
> @@ -151,7 +161,7 @@ static int raw_open_common(BlockDriverState *bs, const 
> char *filename,
>  s->open_flags |= O_DSYNC;
> 
>  s->fd = -1;
> -fd = open(filename, s->open_flags, 0644);
> +fd = _open(filename, s->open_flags, 0644);
>  if (fd < 0) {
>  ret = -errno;
>  if (ret == -EROFS)
> @@ -844,7 +854,7 @@ static int raw_create(const char *filename, 
> QEMUOptionParameter *options)
>  options++;
>  }
> 
> -fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
> +fd = _open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
>0644);
>  if (fd < 0)
>  return -EIO;
> @@ -985,7 +995,7 @@ static int hdev_open(BlockDriverState *bs, const char 
> *filename, int flags)
>  if ( bsdPath[ 0 ] != '\0' ) {
>  strcat(bsdPath,"s0");
>  /* some CDs don't have a partition 0 */
> -fd = open(bsdPath, O_RDONLY | O_BINARY | O_LARGEFILE);
> +fd = _open(bsdPath, O_RDONLY | O_BINARY | O_LARG

Re: [Qemu-devel] Re: [PATCH 2/2] introduce -cpu host target

2009-06-24 Thread Filip Navara
On Wed, Jun 24, 2009 at 7:46 PM, Avi Kivity wrote:
> On 06/24/2009 08:37 PM, Filip Navara wrote:
>>
>> On Wed, Jun 24, 2009 at 11:54 AM, Avi Kivity  wrote:
>>
>>>
>>> On 06/23/2009 12:47 AM, Andre Przywara wrote:
>>>

 Should we ignore unhandled MSRs like QEMU or Xen do?


>>>
>>> Ignoring unhandled msrs is dangerous.  If a write has some effect the
>>> guest
>>> depends on, and we're not emulating that effect, the guest will fail.
>>>  Similarly if you don't know what a register mean, who knows what
>>> returning
>>> zero for a read will do.
>>>
>>
>> It is definitely a bad idea to ignore unknown MSRs. Kernel patch
>> protection scheme used by certain operating system depend on them to
>> work properly and it's pretty hard to debug when you don't know what
>> failed (the MSR read in this case).
>>
>> http://www.uninformed.org/?v=3&a=3
>> http://www.uninformed.org/?v=6&a=1
>> http://www.uninformed.org/?v=8&a=5
>> http://en.wikipedia.org/wiki/Kernel_Patch_Protection
>>
>>
>
> Which unknown msrs are used by kernel patch protection?

It's a moving target. At the time I first got Win64 running on QEMU it
was the one for getting number of implemented virtual address bits
(0x8008 iirc) and some other for getting cache sizes
(0x8005/0x8006 iirc). Both of them were documented in AMD
manuals and not implemented by QEMU. Also the higher bits of virtual
addresses must be treated as sign-extended (as per the information in
the 0x8008 MSR) even though there are actually bits stored in the
address. Me and Alex Ionescu have spent considerable time by reversing
the PatchGuard v1 and that information is described in more detail in
the first link above. I haven't looked at PatchGuard v2/v3 yet.

Best regards,
Filip Navara
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] virtio-serial: A guest <-> host interface for simple communication

2009-06-24 Thread Amit Shah
On (Wed) Jun 24 2009 [18:50:02], Jamie Lokier wrote:
> Amit Shah wrote:
> > On (Wed) Jun 24 2009 [17:40:49], Jamie Lokier wrote:
> > > Amit Shah wrote:
> > > > A few sample uses for a vmchannel are to share the host and guest
> > > > clipboards (to allow copy/paste between a host and a guest), to
> > > > lock the screen of the guest session when the vnc viewer is closed,
> > > > to find out which applications are installed on a guest OS even when
> > > > the guest is powered down (using virt-inspector) and so on.
> > > 
> > > Those all look like useful features.
> > > 
> > > Can you run an application to provide those features on a guest which
> > > _doesn't_ have a vmchannel/virtio-serial support in the kernel?
> > > 
> > > Or will it be restricted only to guests which have QEMU-specific
> > > support in their kernel?
> > 
> > libguestfs currently uses the -net user based vmchannel interface that
> > exists in current qemu. That doesn't need a kernel that doesn't have
> > support for virtio-serial.
> 
> That's great!
> 
> If that works fine, and guest apps/libraries are using that as a
> fallback anyway, what benefit do they get from switching to
> virtio-serial when they detect that instead, given they still have
> code for the -net method?

Speed is the biggest benefit.

> Is the plan to remove -net user based support from libguestfs?

I don't know what Richard's plan is, but if the kernel that libguestfs
deploys in the appliance gains support for virtio-serial, there's no
reason it shouldn't switch.

> Is virtio-serial significantly simpler to use?

I think the interface from the guest POV stays the same: reads / writes
to char devices. With virtio-serial, though, we can add a few other
interesting things like names to ports, ability to hot-add ports on
demand, request notifications when either end goes down, etc.

Amit
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Fix non-KVM build

2009-06-24 Thread Anthony Liguori
This introduces some #ifdefs in pcspk to fix the build when KVM isn't enabled.

Signed-off-by: Anthony Liguori 
---
 hw/pcspk.c |   15 +--
 1 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/hw/pcspk.c b/hw/pcspk.c
index 9e1b59a..236995a 100644
--- a/hw/pcspk.c
+++ b/hw/pcspk.c
@@ -80,11 +80,6 @@ static void kvm_set_pit_ch2(PITState *pit,
 kvm_set_pit(kvm_context, inkernel_state);
 }
 }
-#else
-static inline void kvm_get_pit_ch2(PITState *pit,
-   kvm_pit_state *inkernel_state) { }
-static inline void kvm_set_pit_ch2(PITState *pit,
-   kvm_pit_state *inkernel_state) { }
 #endif
 
 static inline void generate_samples(PCSpkState *s)
@@ -111,7 +106,9 @@ static void pcspk_callback(void *opaque, int free)
 PCSpkState *s = opaque;
 unsigned int n;
 
+#ifdef USE_KVM_PIT
 kvm_get_pit_ch2(s->pit, NULL);
+#endif
 
 if (pit_get_mode(s->pit, 2) != 3)
 return;
@@ -158,7 +155,9 @@ static uint32_t pcspk_ioport_read(void *opaque, uint32_t 
addr)
 PCSpkState *s = opaque;
 int out;
 
+#ifdef USE_KVM_PIT
 kvm_get_pit_ch2(s->pit, NULL);
+#endif
 
 s->dummy_refresh_clock ^= (1 << 4);
 out = pit_get_out(s->pit, 2, qemu_get_clock(vm_clock)) << 5;
@@ -168,11 +167,13 @@ static uint32_t pcspk_ioport_read(void *opaque, uint32_t 
addr)
 
 static void pcspk_ioport_write(void *opaque, uint32_t addr, uint32_t val)
 {
-struct kvm_pit_state inkernel_state;
 PCSpkState *s = opaque;
 const int gate = val & 1;
+#ifdef USE_KVM_PIT
+struct kvm_pit_state inkernel_state;
 
 kvm_get_pit_ch2(s->pit, &inkernel_state);
+#endif
 
 s->data_on = (val >> 1) & 1;
 pit_set_gate(s->pit, 2, gate);
@@ -182,7 +183,9 @@ static void pcspk_ioport_write(void *opaque, uint32_t addr, 
uint32_t val)
 AUD_set_active_out(s->voice, gate & s->data_on);
 }
 
+#ifdef USE_KVM_PIT
 kvm_set_pit_ch2(s->pit, &inkernel_state);
+#endif
 }
 
 void pcspk_init(PITState *pit)
-- 
1.6.2.5

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] support colon in filenames

2009-06-24 Thread Balbir Singh
* Ram Pai  [2009-06-24 10:30:00]:

> On Wed, 2009-06-24 at 22:38 +0530, Balbir Singh wrote:
> > * Ram Pai  [2009-06-24 09:58:59]:
> > 
> > > Problem: It is impossible to feed filenames with the character colon 
> > > because 
> > > qemu interprets such names as a protocol. For example  a filename scsi:0,
> > > is interpreted as a protocol by name "scsi".
> > > 
> > > This patch allows users to espace colon characters. For example the above 
> > > filename
> > > can now be expressed as 'scsi\:0'
> > > 
> > > Here are couple of examples:
> > > 
> > > ndb:\::  is treated as a ndb protocol with a hostname ':' on port 
> > > scsi\:0\:abc is a local file scsi:0:abc
> > > http\://myweb is a local file by name http://myweb
> > > nbd\::localhost:2558  is a protocol by name nbd: 
> > > 
> > > Signed-off-by: Ram Pai 
> > 
> > Are colons useful for filenames? Is there a common use case for this?
> 
> Yes. files like
> 
> /dev/disk/by-path/pci-:0b:00.0-sas-phy0:1-0x5000c5000c14b41d:0-lun0-part3
>

Yes, that does make sense. Thanks!
 

-- 
Balbir
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Fix non-KVM build

2009-06-24 Thread Dustin Kirkland
On Wed, Jun 24, 2009 at 1:13 PM, Anthony Liguori wrote:
> This introduces some #ifdefs in pcspk to fix the build when KVM isn't enabled.
>
> Signed-off-by: Anthony Liguori 
> ---
>  hw/pcspk.c |   15 +--
>  1 files changed, 9 insertions(+), 6 deletions(-)
...

Thanks, Anthony.  This patch does solve my pcspk build breakage, and
gets me a bit further in the build, before:

...
  CCppc-softmmu/disas.o
  CCppc-softmmu/i386-dis.o
  CCppc-softmmu/ppc-dis.o
  ARppc-softmmu/libqemu.a
  LINK  ppc-softmmu/qemu-system-ppc
machine.o: In function `cpu_load':
/tmp/tmp.wibDyxlHPK/qemu-kvm-0.10.50.20090624132948/target-ppc/machine.c:180:
undefined reference to `cpu_synchronize_state'
machine.o: In function `cpu_save':
/tmp/tmp.wibDyxlHPK/qemu-kvm-0.10.50.20090624132948/target-ppc/machine.c:10:
undefined reference to `cpu_synchronize_state'
ppc440.o: In function `ppc440ep_init':
/tmp/tmp.wibDyxlHPK/qemu-kvm-0.10.50.20090624132948/hw/ppc440.c:48:
undefined reference to `kvm_enabled'
ppc440_bamboo.o: In function `bamboo_init':
/tmp/tmp.wibDyxlHPK/qemu-kvm-0.10.50.20090624132948/hw/ppc440_bamboo.c:183:
undefined reference to `kvm_enabled'
/tmp/tmp.wibDyxlHPK/qemu-kvm-0.10.50.20090624132948/hw/ppc440_bamboo.c:184:
undefined reference to `kvmppc_init'
ppce500_mpc8544ds.o: In function `mpc8544ds_init':
/tmp/tmp.wibDyxlHPK/qemu-kvm-0.10.50.20090624132948/hw/ppce500_mpc8544ds.c:277:
undefined reference to `kvm_enabled'
/tmp/tmp.wibDyxlHPK/qemu-kvm-0.10.50.20090624132948/hw/ppce500_mpc8544ds.c:278:
undefined reference to `kvmppc_init'
libqemu.a(helper.o): In function `ppc_tlb_invalidate_all':
/tmp/tmp.wibDyxlHPK/qemu-kvm-0.10.50.20090624132948/target-ppc/helper.c:1861:
undefined reference to `kvm_enabled'
collect2: ld returned 1 exit status
make[2]: *** [qemu-system-ppc] Error 1
make[1]: *** [subdir-ppc-softmmu] Error 2


:-Dustin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] virtio-serial: A guest <-> host interface for simple communication

2009-06-24 Thread Jamie Lokier
Amit Shah wrote:
> On (Wed) Jun 24 2009 [18:50:02], Jamie Lokier wrote:
> > Amit Shah wrote:
> > > On (Wed) Jun 24 2009 [17:40:49], Jamie Lokier wrote:
> > > > Amit Shah wrote:
> > > > > A few sample uses for a vmchannel are to share the host and guest
> > > > > clipboards (to allow copy/paste between a host and a guest), to
> > > > > lock the screen of the guest session when the vnc viewer is closed,
> > > > > to find out which applications are installed on a guest OS even when
> > > > > the guest is powered down (using virt-inspector) and so on.
> > > > 
> > > > Those all look like useful features.
> > > > 
> > > > Can you run an application to provide those features on a guest which
> > > > _doesn't_ have a vmchannel/virtio-serial support in the kernel?
> > > > 
> > > > Or will it be restricted only to guests which have QEMU-specific
> > > > support in their kernel?
> > > 
> > > libguestfs currently uses the -net user based vmchannel interface that
> > > exists in current qemu. That doesn't need a kernel that doesn't have
> > > support for virtio-serial.
> > 
> > That's great!
> > 
> > If that works fine, and guest apps/libraries are using that as a
> > fallback anyway, what benefit do they get from switching to
> > virtio-serial when they detect that instead, given they still have
> > code for the -net method?
> 
> Speed is the biggest benefit.

Fair enough, sounds good, and I can see how it's more usable than a
network interface in many respects.

> > Is the plan to remove -net user based support from libguestfs?
> 
> I don't know what Richard's plan is, but if the kernel that libguestfs
> deploys in the appliance gains support for virtio-serial, there's no
> reason it shouldn't switch.
> 
> > Is virtio-serial significantly simpler to use?
> 
> I think the interface from the guest POV stays the same: reads / writes
> to char devices. With virtio-serial, though, we can add a few other
> interesting things like names to ports, ability to hot-add ports on
> demand, request notifications when either end goes down, etc.

Good features, useful for a lot of handy things.  I think it would be
handy if the same features were available to the guest application
generally, not just on guest kernels with a specific driver though.

As we talked before, about things like boot loaders and kernel
debuggers, and installing the support applications on old guests.

Is it possible to support access to the same capabilities through a
well known IO/MMIO address, in the same way that VGA and IDE are both
ordinary PCI devices, but also can be reached easily through well
known IO/MMIO addresses in simple code?

-- Jamie

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Fix ppc-softmmu build

2009-06-24 Thread Anthony Liguori
This gets ppc-softmmu building when KVM is not enabled.  It may be enough to get
it working with KVM enabled but I haven't checked.

Signed-off-by: Anthony Liguori 
---
 hw/ppc440.c|1 +
 hw/ppc440_bamboo.c |1 +
 hw/ppce500_mpc8544ds.c |1 +
 qemu-kvm.h |1 +
 target-ppc/helper.c|1 +
 target-ppc/machine.c   |1 +
 6 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/hw/ppc440.c b/hw/ppc440.c
index 00d82e4..c2c9e65 100644
--- a/hw/ppc440.c
+++ b/hw/ppc440.c
@@ -19,6 +19,7 @@
 #include "ppc405.h"
 #include "sysemu.h"
 #include "kvm.h"
+#include "qemu-kvm.h"
 
 #define PPC440EP_PCI_CONFIG 0xeec0
 #define PPC440EP_PCI_INTACK 0xeed0
diff --git a/hw/ppc440_bamboo.c b/hw/ppc440_bamboo.c
index 00aa2c7..e4aad39 100644
--- a/hw/ppc440_bamboo.c
+++ b/hw/ppc440_bamboo.c
@@ -22,6 +22,7 @@
 #include "kvm.h"
 #include "kvm_ppc.h"
 #include "device_tree.h"
+#include "qemu-kvm.h"
 
 #define BINARY_DEVICE_TREE_FILE "bamboo.dtb"
 
diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
index d9ed36c..1099b99 100644
--- a/hw/ppce500_mpc8544ds.c
+++ b/hw/ppce500_mpc8544ds.c
@@ -29,6 +29,7 @@
 #include "device_tree.h"
 #include "openpic.h"
 #include "ppce500.h"
+#include "qemu-kvm.h"
 
 #define BINARY_DEVICE_TREE_FILE"mpc8544ds.dtb"
 #define UIMAGE_LOAD_BASE   0
diff --git a/qemu-kvm.h b/qemu-kvm.h
index 68a5b40..9341d0c 100644
--- a/qemu-kvm.h
+++ b/qemu-kvm.h
@@ -169,6 +169,7 @@ int kvm_has_sync_mmu(void);
 void kvm_init_vcpu(CPUState *env);
 void kvm_load_tsc(CPUState *env);
 #else
+#define kvm_has_sync_mmu() (0)
 #define kvm_enabled() (0)
 #define kvm_nested 0
 #define qemu_kvm_irqchip_in_kernel() (0)
diff --git a/target-ppc/helper.c b/target-ppc/helper.c
index 5a7a935..3629c99 100644
--- a/target-ppc/helper.c
+++ b/target-ppc/helper.c
@@ -29,6 +29,7 @@
 #include "helper_regs.h"
 #include "qemu-common.h"
 #include "kvm.h"
+#include "qemu-kvm.h"
 
 //#define DEBUG_MMU
 //#define DEBUG_BATS
diff --git a/target-ppc/machine.c b/target-ppc/machine.c
index 99ba3eb..ec8e197 100644
--- a/target-ppc/machine.c
+++ b/target-ppc/machine.c
@@ -1,6 +1,7 @@
 #include "hw/hw.h"
 #include "hw/boards.h"
 #include "kvm.h"
+#include "qemu-kvm.h"
 
 void cpu_save(QEMUFile *f, void *opaque)
 {
-- 
1.6.2.5

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[KVM PATCH v4 3/4] slow-work: add (module*)work->owner to fix races with module clients

2009-06-24 Thread Gregory Haskins
The slow_work facility was designed to use reference counting instead of
barriers for synchronization.  The reference counting mechanism is
implemented as a vtable op (->get_ref, ->put_ref) callback.  This is
problematic for module use of the slow_work facility because it is impossible
to synchronize against the .text installed in the callbacks: There is
no way to ensure that the slow-work threads have completely exited the
text in question and rmmod may yank it out from under the slow_work thread.

This patch attempts to address this issue by transparently mapping "struct
module* owner" to the slow_work item, and maintaining a module reference
count coincident with the more externally visible reference count.  Since
the slow_work facility is resident in kernel, it should be a race-free
location to issue a module_put() call.  This will ensure that modules
can properly cleanup before exiting.

A module_get()/module_put() pair on slow_work_enqueue() and the subsequent
dequeue technically adds the overhead of the atomic operations for every
work item scheduled.  However, slow_work is designed for deferring
relatively long-running and/or sleepy tasks to begin with, so this
overhead will hopefully be negligible.

Signed-off-by: Gregory Haskins 
CC: David Howells 
---

 include/linux/slow-work.h |4 
 kernel/slow-work.c|5 +
 2 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/include/linux/slow-work.h b/include/linux/slow-work.h
index b65c888..9f48dab 100644
--- a/include/linux/slow-work.h
+++ b/include/linux/slow-work.h
@@ -17,6 +17,7 @@
 #ifdef CONFIG_SLOW_WORK
 
 #include 
+#include 
 
 struct slow_work;
 
@@ -42,6 +43,7 @@ struct slow_work_ops {
  *   queued
  */
 struct slow_work {
+   struct module  *owner;
unsigned long   flags;
 #define SLOW_WORK_PENDING  0   /* item pending (further) execution */
 #define SLOW_WORK_EXECUTING1   /* item currently executing */
@@ -61,6 +63,7 @@ struct slow_work {
 static inline void slow_work_init(struct slow_work *work,
  const struct slow_work_ops *ops)
 {
+   work->owner = THIS_MODULE;
work->flags = 0;
work->ops = ops;
INIT_LIST_HEAD(&work->link);
@@ -78,6 +81,7 @@ static inline void slow_work_init(struct slow_work *work,
 static inline void vslow_work_init(struct slow_work *work,
   const struct slow_work_ops *ops)
 {
+   work->owner = THIS_MODULE;
work->flags = 1 << SLOW_WORK_VERY_SLOW;
work->ops = ops;
INIT_LIST_HEAD(&work->link);
diff --git a/kernel/slow-work.c b/kernel/slow-work.c
index 521ed20..fa27500 100644
--- a/kernel/slow-work.c
+++ b/kernel/slow-work.c
@@ -220,6 +220,7 @@ static bool slow_work_execute(void)
}
 
work->ops->put_ref(work);
+   module_put(work->owner);
return true;
 
 auto_requeue:
@@ -299,6 +300,8 @@ int slow_work_enqueue(struct slow_work *work)
if (test_bit(SLOW_WORK_EXECUTING, &work->flags)) {
set_bit(SLOW_WORK_ENQ_DEFERRED, &work->flags);
} else {
+   if (!try_module_get(work->owner))
+   goto cant_get_mod;
if (work->ops->get_ref(work) < 0)
goto cant_get_ref;
if (test_bit(SLOW_WORK_VERY_SLOW, &work->flags))
@@ -313,6 +316,8 @@ int slow_work_enqueue(struct slow_work *work)
return 0;
 
 cant_get_ref:
+   module_put(work->owner);
+cant_get_mod:
spin_unlock_irqrestore(&slow_work_queue_lock, flags);
return -EAGAIN;
 }

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[KVM PATCH v4 0/4] irqfd fixes

2009-06-24 Thread Gregory Haskins
(Applies to kvm.git/master:4631e094)

The following is the latest attempt to fix the remaining races in
irqfd/eventfd.  For more details, please read the patch headers.

This series has been tested against the kvm-eventfd unit test, and
appears to be functioning properly.  You can download this test here:

ftp://ftp.novell.com/dev/ghaskins/kvm-eventfd.tar.bz2

I've included version 2 of Davide's eventfd patch (ported to kvm.git) so
that its a complete reviewable series.  Note, however, that there may be
later versions of his patch to consider for merging, so we should
coordinate with him.

Note also that I potentially found an issue in slow_work which I
subsequently fixed, so I have cc'd David Howells.  It is needed to close
some of the remaining issues.

-Greg

---

Davide Libenzi (1):
  eventfd - revised interface and cleanups (2nd rev)

Gregory Haskins (3):
  KVM: Fix races in irqfd using new eventfd_kref_get interface
  slow-work: add (module*)work->owner to fix races with module clients
  kvm: prepare irqfd for having interrupts disabled during eventfd->release


 drivers/lguest/lg.h  |2 
 drivers/lguest/lguest_user.c |4 -
 fs/aio.c |   24 +---
 fs/eventfd.c |  119 ++---
 include/linux/aio.h  |4 -
 include/linux/eventfd.h  |   18 +--
 include/linux/kvm_host.h |7 +
 include/linux/slow-work.h|4 +
 init/Kconfig |1 
 kernel/slow-work.c   |5 +
 virt/kvm/Kconfig |1 
 virt/kvm/eventfd.c   |  234 --
 12 files changed, 312 insertions(+), 111 deletions(-)

-- 
Signature
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[KVM PATCH v4 1/4] kvm: prepare irqfd for having interrupts disabled during eventfd->release

2009-06-24 Thread Gregory Haskins
We need to plug some race conditions on eventfd shutdown.  In order to
do this, we need to change the context in which the release notification
is delivered so that the wqh lock is now held.  However, there is currently
code in the release callback that assumes it can sleep.

We have a slight chicken and egg problem where we cant fix the race
without adding the lock, and we can't add the lock without breaking
the sleepy code.  Normally we could deal with this by making both
changes in an atomic changeset.  However, we want to keep the eventfd
and kvm specific changes isolated to ease the reviewer burden on upstream
eventfd (at the specific request of upstream).  Therefore, we have this
intermediate patch.

This intermediate patch allows the release() method to work in an atomic
context, at the expense of correctness w.r.t. memory-leaks.  Today we have
a race condition.  With this patch applied we leak.  Both issues will be
resolved later in the series.  It is the author's opinion that a leak is
better for bisectability than the hang would be should we leave the sleepy
code in place after the locking changeover.

Signed-off-by: Gregory Haskins 
---

 virt/kvm/eventfd.c |   89 
 1 files changed, 27 insertions(+), 62 deletions(-)

diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index a9e7de7..9656027 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -28,7 +28,6 @@
 #include 
 #include 
 #include 
-#include 
 
 /*
  * 
@@ -38,8 +37,6 @@
  * 
  */
 struct _irqfd {
-   struct mutex  lock;
-   struct srcu_structsrcu;
struct kvm   *kvm;
int   gsi;
struct list_head  list;
@@ -53,48 +50,12 @@ static void
 irqfd_inject(struct work_struct *work)
 {
struct _irqfd *irqfd = container_of(work, struct _irqfd, inject);
-   struct kvm *kvm;
-   int idx;
+   struct kvm *kvm = irqfd->kvm;
 
-   idx = srcu_read_lock(&irqfd->srcu);
-
-   kvm = rcu_dereference(irqfd->kvm);
-   if (kvm) {
-   mutex_lock(&kvm->irq_lock);
-   kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 1);
-   kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 0);
-   mutex_unlock(&kvm->irq_lock);
-   }
-
-   srcu_read_unlock(&irqfd->srcu, idx);
-}
-
-static void
-irqfd_disconnect(struct _irqfd *irqfd)
-{
-   struct kvm *kvm;
-
-   mutex_lock(&irqfd->lock);
-
-   kvm = rcu_dereference(irqfd->kvm);
-   rcu_assign_pointer(irqfd->kvm, NULL);
-
-   mutex_unlock(&irqfd->lock);
-
-   if (!kvm)
-   return;
-
-   mutex_lock(&kvm->lock);
-   list_del(&irqfd->list);
-   mutex_unlock(&kvm->lock);
-
-   /*
-* It is important to not drop the kvm reference until the next grace
-* period because there might be lockless references in flight up
-* until then
-*/
-   synchronize_srcu(&irqfd->srcu);
-   kvm_put_kvm(kvm);
+   mutex_lock(&kvm->irq_lock);
+   kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 1);
+   kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 0);
+   mutex_unlock(&kvm->irq_lock);
 }
 
 static int
@@ -103,26 +64,24 @@ irqfd_wakeup(wait_queue_t *wait, unsigned mode, int sync, 
void *key)
struct _irqfd *irqfd = container_of(wait, struct _irqfd, wait);
unsigned long flags = (unsigned long)key;
 
+   /*
+* Assume we will be called with interrupts disabled
+*/
if (flags & POLLIN)
/*
-* The POLLIN wake_up is called with interrupts disabled.
-* Therefore we need to defer the IRQ injection until later
-* since we need to acquire the kvm->lock to do so.
+* Defer the IRQ injection until later since we need to
+* acquire the kvm->lock to do so.
 */
schedule_work(&irqfd->inject);
 
if (flags & POLLHUP) {
/*
-* The POLLHUP is called unlocked, so it theoretically should
-* be safe to remove ourselves from the wqh using the locked
-* variant of remove_wait_queue()
+* for now, just remove ourselves from the list and let
+* the rest dangle.  We will fix this up later once
+* the races in eventfd are fixed
 */
-   remove_wait_queue(irqfd->wqh, &irqfd->wait);
-   flush_work(&irqfd->inject);
-   irqfd_disconnect(irqfd);
-
-   cleanup_srcu_struct(&irqfd->srcu);
-   kfree(irqfd);
+   __remove_wait_queue(irqfd->wqh, &irqfd->wait);
+   irqfd->wqh = NULL;
}
 
return 0;
@@ 

[KVM PATCH v4 4/4] KVM: Fix races in irqfd using new eventfd_kref_get interface

2009-06-24 Thread Gregory Haskins
This patch fixes all known races in irqfd, and paves the way to restore
DEASSIGN support.  For details of the eventfd races, please see the header
for patch 2/4, or the thread on which it was based on:

http://www.mail-archive.com/kvm@vger.kernel.org/msg17767.html

In a nutshell, we use eventfd_ctx_get/put() to properly manage the
lifetime of the underlying eventfd.  We also use careful coordination
with our workqueue to ensure that all irqfd objects have terminated
before we allow kvm to shutdown.  The logic used for shutdown walks
all open irqfds and releases them.  This logic can be generalized in
the future to allow a subset of irqfds to be released, thus allowing
DEASSIGN support.

Signed-off-by: Gregory Haskins 
---

 include/linux/kvm_host.h |7 +-
 virt/kvm/Kconfig |1 
 virt/kvm/eventfd.c   |  199 +-
 3 files changed, 183 insertions(+), 24 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 2451f48..d94ee72 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -141,7 +141,12 @@ struct kvm {
struct kvm_io_bus mmio_bus;
struct kvm_io_bus pio_bus;
 #ifdef CONFIG_HAVE_KVM_EVENTFD
-   struct list_head irqfds;
+   struct {
+   spinlock_tlock;
+   struct list_head  items;
+   atomic_t  outstanding;
+   wait_queue_head_t wqh;
+   } irqfds;
 #endif
struct kvm_vm_stat stat;
struct kvm_arch arch;
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index daece36..ab7848a 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -9,6 +9,7 @@ config HAVE_KVM_IRQCHIP
 config HAVE_KVM_EVENTFD
bool
select EVENTFD
+   select SLOW_WORK
 
 config KVM_APIC_ARCHITECTURE
bool
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 9656027..e9e74f4 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * 
@@ -36,17 +37,27 @@
  * Credit goes to Avi Kivity for the original idea.
  * 
  */
+
 struct _irqfd {
struct kvm   *kvm;
+   struct eventfd_ctx   *eventfd;
int   gsi;
struct list_head  list;
poll_tablept;
wait_queue_head_t*wqh;
wait_queue_t  wait;
struct work_structinject;
+   struct slow_work  shutdown;
 };
 
 static void
+irqfd_release(struct _irqfd *irqfd)
+{
+   eventfd_ctx_put(irqfd->eventfd);
+   kfree(irqfd);
+}
+
+static void
 irqfd_inject(struct work_struct *work)
 {
struct _irqfd *irqfd = container_of(work, struct _irqfd, inject);
@@ -58,6 +69,55 @@ irqfd_inject(struct work_struct *work)
mutex_unlock(&kvm->irq_lock);
 }
 
+static struct _irqfd *
+work_to_irqfd(struct slow_work *work)
+{
+   return container_of(work, struct _irqfd, shutdown);
+}
+
+static int
+irqfd_shutdown_get_ref(struct slow_work *work)
+{
+   struct _irqfd *irqfd = work_to_irqfd(work);
+   struct kvm *kvm = irqfd->kvm;
+
+   atomic_inc(&kvm->irqfds.outstanding);
+
+   return 0;
+}
+
+static void
+irqfd_shutdown_put_ref(struct slow_work *work)
+{
+   struct _irqfd *irqfd = work_to_irqfd(work);
+   struct kvm *kvm = irqfd->kvm;
+
+   irqfd_release(irqfd);
+
+   if (atomic_dec_and_test(&kvm->irqfds.outstanding))
+   wake_up(&kvm->irqfds.wqh);
+}
+
+static void
+irqfd_shutdown_execute(struct slow_work *work)
+{
+   struct _irqfd *irqfd = work_to_irqfd(work);
+
+   /*
+* Ensure there are no outstanding "inject" work-items before we blow
+* away our state.  Once this job completes, the slow_work
+* infrastructure will drop the irqfd object completely via put_ref
+*/
+   flush_work(&irqfd->inject);
+}
+
+const static struct slow_work_ops irqfd_shutdown_work_ops = {
+   .get_ref = irqfd_shutdown_get_ref,
+   .put_ref = irqfd_shutdown_put_ref,
+   .execute = irqfd_shutdown_execute,
+};
+
+
 static int
 irqfd_wakeup(wait_queue_t *wait, unsigned mode, int sync, void *key)
 {
@@ -65,25 +125,47 @@ irqfd_wakeup(wait_queue_t *wait, unsigned mode, int sync, 
void *key)
unsigned long flags = (unsigned long)key;
 
/*
-* Assume we will be called with interrupts disabled
+* Called with interrupts disabled
 */
if (flags & POLLIN)
-   /*
-* Defer the IRQ injection until later since we need to
-* acquire the kvm->lock to do so.
-*/
+   /* An event has been signaled, inject an interrupt */
schedule_work(&irqfd->inject);
 
if (flags & POLLHUP) {
-   /*
-* for now, jus

[KVM PATCH v4 2/4] eventfd - revised interface and cleanups (2nd rev)

2009-06-24 Thread Gregory Haskins
From: Davide Libenzi 

The following patch changes the eventfd interface to de-couple the eventfd
memory context, from the file pointer instance.
Without such change, there is no clean way to racely free handle the
POLLHUP event sent when the last instance of the file* goes away.
Also, now the internal eventfd APIs are using the eventfd context instead
of the file*.
Another cleanup this patch does, is making AIO select EVENTFD, instead of
adding a bunch of empty function stubs inside eventfd.h in order to
handle the (AIO && !EVENTFD) case.

[gmh: resolved trivial merge conflict with kvm's POLLHUP patch]

Signed-off-by: Davide Libenzi 
Signed-off-by: Gregory Haskins 
---

 drivers/lguest/lg.h  |2 -
 drivers/lguest/lguest_user.c |4 +
 fs/aio.c |   24 ++--
 fs/eventfd.c |  119 --
 include/linux/aio.h  |4 +
 include/linux/eventfd.h  |   18 ++
 init/Kconfig |1 
 7 files changed, 120 insertions(+), 52 deletions(-)

diff --git a/drivers/lguest/lg.h b/drivers/lguest/lg.h
index d4e8979..9c31382 100644
--- a/drivers/lguest/lg.h
+++ b/drivers/lguest/lg.h
@@ -82,7 +82,7 @@ struct lg_cpu {
 
 struct lg_eventfd {
unsigned long addr;
-   struct file *event;
+   struct eventfd_ctx *event;
 };
 
 struct lg_eventfd_map {
diff --git a/drivers/lguest/lguest_user.c b/drivers/lguest/lguest_user.c
index 32e2971..9f9a295 100644
--- a/drivers/lguest/lguest_user.c
+++ b/drivers/lguest/lguest_user.c
@@ -50,7 +50,7 @@ static int add_eventfd(struct lguest *lg, unsigned long addr, 
int fd)
 
/* Now append new entry. */
new->map[new->num].addr = addr;
-   new->map[new->num].event = eventfd_fget(fd);
+   new->map[new->num].event = eventfd_ctx_fdget(fd);
if (IS_ERR(new->map[new->num].event)) {
kfree(new);
return PTR_ERR(new->map[new->num].event);
@@ -357,7 +357,7 @@ static int close(struct inode *inode, struct file *file)
 
/* Release any eventfds they registered. */
for (i = 0; i < lg->eventfds->num; i++)
-   fput(lg->eventfds->map[i].event);
+   eventfd_ctx_put(lg->eventfds->map[i].event);
kfree(lg->eventfds);
 
/* If lg->dead doesn't contain an error code it will be NULL or a
diff --git a/fs/aio.c b/fs/aio.c
index 76da125..d065b2c 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -485,6 +485,8 @@ static inline void really_put_req(struct kioctx *ctx, 
struct kiocb *req)
 {
assert_spin_locked(&ctx->ctx_lock);
 
+   if (req->ki_eventfd != NULL)
+   eventfd_ctx_put(req->ki_eventfd);
if (req->ki_dtor)
req->ki_dtor(req);
if (req->ki_iovec != &req->ki_inline_vec)
@@ -509,8 +511,6 @@ static void aio_fput_routine(struct work_struct *data)
/* Complete the fput(s) */
if (req->ki_filp != NULL)
__fput(req->ki_filp);
-   if (req->ki_eventfd != NULL)
-   __fput(req->ki_eventfd);
 
/* Link the iocb into the context's free list */
spin_lock_irq(&ctx->ctx_lock);
@@ -528,8 +528,6 @@ static void aio_fput_routine(struct work_struct *data)
  */
 static int __aio_put_req(struct kioctx *ctx, struct kiocb *req)
 {
-   int schedule_putreq = 0;
-
dprintk(KERN_DEBUG "aio_put(%p): f_count=%ld\n",
req, atomic_long_read(&req->ki_filp->f_count));
 
@@ -549,24 +547,16 @@ static int __aio_put_req(struct kioctx *ctx, struct kiocb 
*req)
 * we would not be holding the last reference to the file*, so
 * this function will be executed w/out any aio kthread wakeup.
 */
-   if (unlikely(atomic_long_dec_and_test(&req->ki_filp->f_count)))
-   schedule_putreq++;
-   else
-   req->ki_filp = NULL;
-   if (req->ki_eventfd != NULL) {
-   if 
(unlikely(atomic_long_dec_and_test(&req->ki_eventfd->f_count)))
-   schedule_putreq++;
-   else
-   req->ki_eventfd = NULL;
-   }
-   if (unlikely(schedule_putreq)) {
+   if (unlikely(atomic_long_dec_and_test(&req->ki_filp->f_count))) {
get_ioctx(ctx);
spin_lock(&fput_lock);
list_add(&req->ki_list, &fput_head);
spin_unlock(&fput_lock);
queue_work(aio_wq, &fput_work);
-   } else
+   } else {
+   req->ki_filp = NULL;
really_put_req(ctx, req);
+   }
return 1;
 }
 
@@ -1622,7 +1612,7 @@ static int io_submit_one(struct kioctx *ctx, struct iocb 
__user *user_iocb,
 * an eventfd() fd, and will be signaled for each completed
 * event using the eventfd_signal() function.
 */
-   req->ki_eventfd = eventfd_fget((int) iocb->aio_resfd);
+   req->ki_eventfd = eventfd_ctx_

[PATCH] Use upstream guard for kvm support in ppcemb-softmmu

2009-06-24 Thread Anthony Liguori
ppcemb-softmmu uses upstream KVM support so put an appropriate guard around it.
This fixes the non-KVM ppcemb-softmmu build.

Signed-off-by: Anthony Liguori 
---
 configure |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/configure b/configure
index 420e101..4bad3c3 100755
--- a/configure
+++ b/configure
@@ -2261,11 +2261,13 @@ case "$target_cpu" in
 echo "#define TARGET_ARCH \"ppcemb\"" >> $config_h
 echo "#define TARGET_PPC 1" >> $config_h
 echo "#define TARGET_PPCEMB 1" >> $config_h
+if test use_upstream_kvm = yes ; then
 if test "$target_kvm" = "yes" ; then
   echo "USE_KVM=yes" >> $config_mak
   echo "KVM_CFLAGS=$kvm_cflags" >> $config_mak
   echo "#define USE_KVM 1" >> $config_h
 fi
+fi
 gdb_xml_files="power-core.xml power-fpu.xml power-altivec.xml 
power-spe.xml"
 target_phys_bits=64
   ;;
-- 
1.6.2.5

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] Re: [PATCH 1/2] allow hypervisor CPUID bit to be overriden

2009-06-24 Thread Jamie Lokier
Avi Kivity wrote:
> On 06/23/2009 02:31 PM, Paul Brook wrote:
> >On Tuesday 23 June 2009, Avi Kivity wrote:
> >   
> >>On 06/23/2009 12:47 AM, Andre Przywara wrote:
> >> 
> >>>KVM defaults to the hypervisor CPUID bit to be set, whereas pure QEMU
> >>>clears it. On some occasions one want to set or clear it the other way
> >>>round (for instance to get HyperV running inside a guest).
> >>>Allow the default to be overridden on the command line and fix some
> >>>whitespace damage on the way.
> >>>   
> >>It makes sense for qemu to set the hypervisor bit unconditionally.  A
> >>guest running under qemu is not bare metal.
> >> 
> >
> >I see no reason why a guest has to be told that it's running inside a VM.
> >In principle an appropriately configured qemu should be indistinguishable 
> >from
> >real hardware. In practice it's technically infeasible to cover absolutely
> >everything, but if we set this bit we're not even trying.
> >
> >I have no objection to the bit being set by default for the QEMU CPU types.
> >   
> 
> I agree it's pointless, but it is a Microsoft requirement for passing 
> their SVVP tests.  Enabling it by default makes life a little easier for 
> users who wish to validate their hypervisor and has no drawbacks.

Hold on.

Do the SVVP tests fail on a real (non-virtal) machine then?

Or is QEMU's machine emulation insufficiently accurate?

I see a drawback in setting the bit by default.

Something I expect from an emulator is that it behaves like a real
machine to the extent possible.  In particular, guest code which
attempts to check if it's running on a real machine should get the
answer "yes".  Unfriendly guest code which pops up a message like
"Sorry I refuse to work for you after 100 hours/ because you are
attempting to run me in a virtual machine, and don't even think of
trying to hide this from me now you know I look for it" should never
do so.

-- Jamie
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Only compile KVM support for matching CPU types

2009-06-24 Thread Anthony Liguori
I don't know why this is disabled for qemu-kvm.git.

Signed-off-by: Anthony Liguori 
---
 configure |5 +
 1 files changed, 1 insertions(+), 4 deletions(-)

diff --git a/configure b/configure
index 4bad3c3..1b73eaf 100755
--- a/configure
+++ b/configure
@@ -2117,13 +2117,12 @@ configure_kvm() {
   fi
 }
 
-if [ use_upstream_kvm = yes ]; then
-
 # Make sure the target and host cpus are compatible
 if test ! \( "$target_cpu" = "$cpu" -o \
   \( "$target_cpu" = "ppcemb" -a "$cpu" = "ppc" \) -o \
   \( "$target_cpu" = "x86_64" -a "$cpu" = "i386"   \) -o \
   \( "$target_cpu" = "i386"   -a "$cpu" = "x86_64" \) \) ; then
+  \( "$target_cpu" = "ia64"   -a "$cpu" = "ia64" \) \) ; then
   target_kvm="no"
 fi
 # Disable KVM for linux-user
@@ -2131,8 +2130,6 @@ if test "$target_softmmu" = "no" ; then
   target_kvm="no"
 fi
 
-fi
-
 case "$target_cpu" in
   i386)
 echo "TARGET_ARCH=i386" >> $config_mak
-- 
1.6.2.5

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-24 Thread Davide Libenzi
On Wed, 24 Jun 2009, Rusty Russell wrote:

> On Tue, 23 Jun 2009 03:33:22 am Davide Libenzi wrote:
> > What you're doing there, is setting up a kernel-to-kernel (since
> > userspace only role is to create the eventfd) communication, using a file*
> > as accessory. That IMO is plain wrong.
> 
> The most sensible is that userspace can use these fds; an in-kernel variant 
> is 
> possible too, but not primary IMHO.
> 
> It's nice that userspace create the fds; it can then use the same fd for 
> multiple event sources.
> 
> But I didn't see anything wrong with the way eventfd used to work: you have a 
> kvm ioctl to say "attach this eventfd to this guest notification" and that 
> does 
> the eventfd_fget.  A detach ioctl does the fput (as does release of the kvm 
> fd).
> 
> If they close the eventfd and don't do the detach ioctl, it's their problem.

Some components would like to know if userspace dropped the fd, and take 
proper action accordingly (release resources, drop module instances, etc...).
The POLLHUP helps with that, but w/out decoupling file* memory from 
eventfd_ctx memory, it becomes pretty tricky (if feasible at all) to 
handle the event in a race-free way.
Once the file* is decoupled from the eventfd_ctx, it becomes saner to 
expose the internal kernel API via the eventfd_ctx.
Another thing that comes in my mind (that for some components might not 
matter) is considering the effect of userspace doing things like:

for (;;) {
fd = eventfd(...);
ioctl(xfd, XXX_ADD, fd);
close(fd);
}

That might lead to unprivileged users drawing kernel memory w/out any 
userspace accountability, if not properly handled.



- Davide


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch] eventfd - revised interface and cleanups (2nd rev)

2009-06-24 Thread Davide Libenzi
On Tue, 23 Jun 2009, Andrew Morton wrote:

> Maybe there is such a reason, and it hasn't yet been beaten into my
> skull.  But I still think it should be done in a separate patch.  One
> which comes with a full description of the reasons for the change, btw.

Since your skull seems pretty hard to beat, would you like me to split it 
for ya? :)



- Davide


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Only compile KVM support for matching CPU types

2009-06-24 Thread Dustin Kirkland
On Wed, 2009-06-24 at 17:40 -0500, Anthony Liguori wrote:
> I don't know why this is disabled for qemu-kvm.git.
> 
> Signed-off-by: Anthony Liguori 
> ---
>  configure |5 +
>  1 files changed, 1 insertions(+), 4 deletions(-)
> 
> diff --git a/configure b/configure
> index 4bad3c3..1b73eaf 100755
> --- a/configure
> +++ b/configure
> @@ -2117,13 +2117,12 @@ configure_kvm() {
>fi
>  }
>  
> -if [ use_upstream_kvm = yes ]; then
> -
>  # Make sure the target and host cpus are compatible
>  if test ! \( "$target_cpu" = "$cpu" -o \
>\( "$target_cpu" = "ppcemb" -a "$cpu" = "ppc" \) -o \
>\( "$target_cpu" = "x86_64" -a "$cpu" = "i386"   \) -o \
>\( "$target_cpu" = "i386"   -a "$cpu" = "x86_64" \) \) ; then
> +  \( "$target_cpu" = "ia64"   -a "$cpu" = "ia64" \) \) ; then

Almost...

-  \( "$target_cpu" = "i386"   -a "$cpu" = "x86_64" \) \) ; then
+  \( "$target_cpu" = "i386"   -a "$cpu" = "x86_64" \) \) -o \
+  \( "$target_cpu" = "ia64"   -a "$cpu" = "ia64" \) \) ; then

:-Dustin


signature.asc
Description: This is a digitally signed message part


Re: [patch] eventfd - revised interface and cleanups (2nd rev)

2009-06-24 Thread Andrew Morton
On Wed, 24 Jun 2009 15:47:47 -0700 (PDT)
Davide Libenzi  wrote:

> On Tue, 23 Jun 2009, Andrew Morton wrote:
> 
> > Maybe there is such a reason, and it hasn't yet been beaten into my
> > skull.  But I still think it should be done in a separate patch.  One
> > which comes with a full description of the reasons for the change, btw.
> 
> Since your skull seems pretty hard to beat, would you like me to split it 
> for ya? :)

Split what?  My skull?

umm, yes please, I believe the patches should be split.  And I'm still
not seeing the justification for forcing CONFIG_EVENTFD onto all
CONFIG_AIO users!

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Only compile KVM support for matching CPU types

2009-06-24 Thread Dustin Kirkland
On Wed, 2009-06-24 at 18:09 -0500, Dustin Kirkland wrote:
> On Wed, 2009-06-24 at 17:40 -0500, Anthony Liguori wrote:
> >  # Make sure the target and host cpus are compatible
> >  if test ! \( "$target_cpu" = "$cpu" -o \
> >\( "$target_cpu" = "ppcemb" -a "$cpu" = "ppc" \) -o \
> >\( "$target_cpu" = "x86_64" -a "$cpu" = "i386"   \) -o \
> >\( "$target_cpu" = "i386"   -a "$cpu" = "x86_64" \) \) ; then
> > +  \( "$target_cpu" = "ia64"   -a "$cpu" = "ia64" \) \) ; then
> 
> Almost...
> 
> -  \( "$target_cpu" = "i386"   -a "$cpu" = "x86_64" \) \) ; then
> +  \( "$target_cpu" = "i386"   -a "$cpu" = "x86_64" \) \) -o \
> +  \( "$target_cpu" = "ia64"   -a "$cpu" = "ia64" \) \) ; then

Doh.  Still not quite right...

Clean patch below.

Signed-off-by: Dustin Kirkland 

=== modified file 'configure'
--- configure   2009-06-17 11:16:33 +
+++ configure   2009-06-24 23:13:00 +
@@ -2117,13 +2117,12 @@
   fi
 }
 
-if [ use_upstream_kvm = yes ]; then
-
 # Make sure the target and host cpus are compatible
 if test ! \( "$target_cpu" = "$cpu" -o \
   \( "$target_cpu" = "ppcemb" -a "$cpu" = "ppc" \) -o \
   \( "$target_cpu" = "x86_64" -a "$cpu" = "i386"   \) -o \
-  \( "$target_cpu" = "i386"   -a "$cpu" = "x86_64" \) \) ; then
+  \( "$target_cpu" = "i386"   -a "$cpu" = "x86_64" \) -o \
+  \( "$target_cpu" = "ia64"   -a "$cpu" = "ia64" \) \) ; then
   target_kvm="no"
 fi
 # Disable KVM for linux-user
@@ -2131,8 +2130,6 @@
   target_kvm="no"
 fi
 
-fi
-
 case "$target_cpu" in
   i386)
 echo "TARGET_ARCH=i386" >> $config_mak




signature.asc
Description: This is a digitally signed message part


Re: [PATCH] Only compile KVM support for matching CPU types

2009-06-24 Thread Anthony Liguori

Dustin Kirkland wrote:

On Wed, 2009-06-24 at 17:40 -0500, Anthony Liguori wrote:
  

I don't know why this is disabled for qemu-kvm.git.

Signed-off-by: Anthony Liguori 
---
 configure |5 +
 1 files changed, 1 insertions(+), 4 deletions(-)

diff --git a/configure b/configure
index 4bad3c3..1b73eaf 100755
--- a/configure
+++ b/configure
@@ -2117,13 +2117,12 @@ configure_kvm() {
   fi
 }
 
-if [ use_upstream_kvm = yes ]; then

-
 # Make sure the target and host cpus are compatible
 if test ! \( "$target_cpu" = "$cpu" -o \
   \( "$target_cpu" = "ppcemb" -a "$cpu" = "ppc" \) -o \
   \( "$target_cpu" = "x86_64" -a "$cpu" = "i386"   \) -o \
   \( "$target_cpu" = "i386"   -a "$cpu" = "x86_64" \) \) ; then
+  \( "$target_cpu" = "ia64"   -a "$cpu" = "ia64" \) \) ; then



Almost...

-  \( "$target_cpu" = "i386"   -a "$cpu" = "x86_64" \) \) ; then
+  \( "$target_cpu" = "i386"   -a "$cpu" = "x86_64" \) \) -o \
+  \( "$target_cpu" = "ia64"   -a "$cpu" = "ia64" \) \) ; then
  


Yeah, I have that locally but I forgot to commit --amend.  doh!

Regards,

Anthony Liguori


:-Dustin
  



--
Regards,

Anthony Liguori

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Only compile KVM support for matching CPU types (v2)

2009-06-24 Thread Anthony Liguori
I don't know why this is disabled for qemu-kvm.git.

Signed-off-by: Anthony Liguori 

--

v1 -> v2 Fix typo
---
 configure |7 ++-
 1 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/configure b/configure
index 4bad3c3..0e65edb 100755
--- a/configure
+++ b/configure
@@ -2117,13 +2117,12 @@ configure_kvm() {
   fi
 }
 
-if [ use_upstream_kvm = yes ]; then
-
 # Make sure the target and host cpus are compatible
 if test ! \( "$target_cpu" = "$cpu" -o \
   \( "$target_cpu" = "ppcemb" -a "$cpu" = "ppc" \) -o \
   \( "$target_cpu" = "x86_64" -a "$cpu" = "i386"   \) -o \
-  \( "$target_cpu" = "i386"   -a "$cpu" = "x86_64" \) \) ; then
+  \( "$target_cpu" = "i386"   -a "$cpu" = "x86_64" \) -o \
+  \( "$target_cpu" = "ia64"   -a "$cpu" = "ia64" \) \) ; then
   target_kvm="no"
 fi
 # Disable KVM for linux-user
@@ -2131,8 +2130,6 @@ if test "$target_softmmu" = "no" ; then
   target_kvm="no"
 fi
 
-fi
-
 case "$target_cpu" in
   i386)
 echo "TARGET_ARCH=i386" >> $config_mak
-- 
1.6.2.5

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch] eventfd - revised interface and cleanups (2nd rev)

2009-06-24 Thread Davide Libenzi
On Wed, 24 Jun 2009, Andrew Morton wrote:

> Split what?  My skull?

Heh :)


> umm, yes please, I believe the patches should be split.  And I'm still
> not seeing the justification for forcing CONFIG_EVENTFD onto all
> CONFIG_AIO users!

Eventfd notifications became part of the AIO API (it's not even delivered 
through a new syscall, from the AIO side - same existing aiocb struct and 
io_submit syscall) once we merged it, so IMHO (AIO && !EVENTFD) would be 
similar to split AIO in AIO_READ and AIO_WRITE and have (AIO && !AIO_WRITE).
Considering that the kernel config, once you unleash the CONFIG_EMBEDDED 
pandora box, allows you to select (AIO && !EVENTFD) w/out even a warning 
about possible userspace breakages, this makes it rather a confusing 
configuration if you ask me.
It's not a biggie from the kernel side, just a few ugly errors wrappers 
around functions. For me AIO (or whatever userspace visible kernel 
subsystem) should select all the components that are part of the userspace 
API, but my argument ends here.



- Davide


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[patch] eventfd - revised interface and cleanups (4th rev)

2009-06-24 Thread Davide Libenzi
The following patch changes the eventfd interface to de-couple the eventfd 
memory context, from the file pointer instance.
Without such change, there is no clean way to racely free handle the 
POLLHUP event sent when the last instance of the file* goes away.
Also, now the internal eventfd APIs are using the eventfd context instead 
of the file*.

Gregory, none of the APIs changed, so your patches do not need to be 
rebased over this one. The last three revisions just updated comments.

Andrew, I'm not posting the "AIO select EVENTFD" patch at this time. Will 
eventually try another push in your skull once I come back from vacation ;)


Signed-off-by: Davide Libenzi 


- Davide


---
 drivers/lguest/lg.h  |2 
 drivers/lguest/lguest_user.c |4 -
 fs/aio.c |   24 ++--
 fs/eventfd.c |  122 ++-
 include/linux/aio.h  |4 -
 include/linux/eventfd.h  |   35 +---
 6 files changed, 149 insertions(+), 42 deletions(-)

Index: linux-2.6.mod/fs/eventfd.c
===
--- linux-2.6.mod.orig/fs/eventfd.c 2009-06-23 19:04:38.0 -0700
+++ linux-2.6.mod/fs/eventfd.c  2009-06-24 16:13:18.0 -0700
@@ -14,35 +14,44 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
+#include 
+#include 
 
 struct eventfd_ctx {
+   struct kref kref;
wait_queue_head_t wqh;
/*
 * Every time that a write(2) is performed on an eventfd, the
 * value of the __u64 being written is added to "count" and a
 * wakeup is performed on "wqh". A read(2) will return the "count"
 * value to userspace, and will reset "count" to zero. The kernel
-* size eventfd_signal() also, adds to the "count" counter and
+* side eventfd_signal() also, adds to the "count" counter and
 * issue a wakeup.
 */
__u64 count;
unsigned int flags;
 };
 
-/*
- * Adds "n" to the eventfd counter "count". Returns "n" in case of
- * success, or a value lower then "n" in case of coutner overflow.
- * This function is supposed to be called by the kernel in paths
- * that do not allow sleeping. In this function we allow the counter
- * to reach the ULLONG_MAX value, and we signal this as overflow
- * condition by returining a POLLERR to poll(2).
+/**
+ * eventfd_signal - Adds @n to the eventfd counter.
+ * @ctx: [in] Pointer to the eventfd context.
+ * @n: [in] Value of the counter to be added to the eventfd internal counter.
+ *  The value cannot be negative.
+ *
+ * This function is supposed to be called by the kernel in paths that do not
+ * allow sleeping. In this function we allow the counter to reach the 
ULLONG_MAX
+ * value, and we signal this as overflow condition by returining a POLLERR
+ * to poll(2).
+ *
+ * Returns @n in case of success, a non-negative number lower than @n in case
+ * of overflow, or the following error codes:
+ *
+ * -EINVAL: The value of @n is negative.
  */
-int eventfd_signal(struct file *file, int n)
+int eventfd_signal(struct eventfd_ctx *ctx, int n)
 {
-   struct eventfd_ctx *ctx = file->private_data;
unsigned long flags;
 
if (n < 0)
@@ -59,9 +68,45 @@ int eventfd_signal(struct file *file, in
 }
 EXPORT_SYMBOL_GPL(eventfd_signal);
 
+static void eventfd_free(struct kref *kref)
+{
+   struct eventfd_ctx *ctx = container_of(kref, struct eventfd_ctx, kref);
+
+   kfree(ctx);
+}
+
+/**
+ * eventfd_ctx_get - Acquires a reference to the internal eventfd context.
+ * @ctx: [in] Pointer to the eventfd context.
+ *
+ * Returns: In case of success, returns a pointer to the eventfd context.
+ */
+struct eventfd_ctx *eventfd_ctx_get(struct eventfd_ctx *ctx)
+{
+   kref_get(&ctx->kref);
+   return ctx;
+}
+EXPORT_SYMBOL_GPL(eventfd_ctx_get);
+
+/**
+ * eventfd_ctx_put - Releases a reference to the internal eventfd context.
+ * @ctx: [in] Pointer to eventfd context.
+ *
+ * The eventfd context reference must have been previously acquired either
+ * with eventfd_ctx_get() or eventfd_ctx_fdget()).
+ */
+void eventfd_ctx_put(struct eventfd_ctx *ctx)
+{
+   kref_put(&ctx->kref, eventfd_free);
+}
+EXPORT_SYMBOL_GPL(eventfd_ctx_put);
+
 static int eventfd_release(struct inode *inode, struct file *file)
 {
-   kfree(file->private_data);
+   struct eventfd_ctx *ctx = file->private_data;
+
+   wake_up_poll(&ctx->wqh, POLLHUP);
+   eventfd_ctx_put(ctx);
return 0;
 }
 
@@ -185,6 +230,16 @@ static const struct file_operations even
.write  = eventfd_write,
 };
 
+/**
+ * eventfd_fget - Acquire a reference of an eventfd file descriptor.
+ * @fd: [in] Eventfd file descriptor.
+ *
+ * Returns a pointer to the eventfd file structure in case of success, or the
+ * following error pointer:
+ *
+ * -EBADF: Invalid @fd file descriptor.
+ * -EINVAL   : The @fd file descriptor is not an eventfd f

Re: [PATCH 3/3] eventfd: add internal reference counting to fix notifier race conditions

2009-06-24 Thread Davide Libenzi
On Tue, 23 Jun 2009, Davide Libenzi wrote:

> On Tue, 23 Jun 2009, Rusty Russell wrote:
> 
> > The first 'struct eventfd_ctx;' line is not required.
> 
> Will repost dropping that.

Almost forgot. While fixing lg.h to drop the fwd declaration, I noticed 
there's another one ;)


- Davide


--- a/drivers/lguest/lg.h   2009-06-15 15:35:24.0 -0700
+++ b/drivers/lguest/lg.h   2009-06-24 17:08:26.0 -0700
@@ -38,8 +38,6 @@
 #define CHANGED_GDT_TLS4 /* Actually a subset of CHANGED_GDT */
 #define CHANGED_ALL3
 
-struct lguest;
-
 struct lg_cpu {
unsigned int id;
struct lguest *lg;
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch] eventfd - revised interface and cleanups (2nd rev)

2009-06-24 Thread Andrew Morton
On Wed, 24 Jun 2009 16:52:06 -0700 (PDT)
Davide Libenzi  wrote:

> > umm, yes please, I believe the patches should be split.  And I'm still
> > not seeing the justification for forcing CONFIG_EVENTFD onto all
> > CONFIG_AIO users!
> 
> Eventfd notifications became part of the AIO API (it's not even delivered 
> through a new syscall, from the AIO side - same existing aiocb struct and 
> io_submit syscall) once we merged it,

That was a regression for existing embedded AIO users ;)

> so IMHO (AIO && !EVENTFD) would be 
> similar to split AIO in AIO_READ and AIO_WRITE and have (AIO && !AIO_WRITE).
> Considering that the kernel config, once you unleash the CONFIG_EMBEDDED 
> pandora box, allows you to select (AIO && !EVENTFD) w/out even a warning 
> about possible userspace breakages, this makes it rather a confusing 
> configuration if you ask me.

Sure.  But we do assume that one someone sets CONFIG_EMBEDDED, they
know what they're doing.  We prefer to give them maximum flexibility
and foot-shooting ability.  As long as the maintenance cost of doing so
in each case is reasonable, of course.

> It's not a biggie from the kernel side, just a few ugly errors wrappers 
> around functions. For me AIO (or whatever userspace visible kernel 
> subsystem) should select all the components that are part of the userspace 
> API, but my argument ends here.

Sure, it's not a biggie either way.  Doubtful if many tiny systems are
using AIO anyway.  Heck, lots of them are running 2.4.18...

But from the general this-is-the-way-we-do-things POV, it's preferred
that the two features be separable under CONFIG_EMBEDDED if poss.


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ESX on KVM requirements

2009-06-24 Thread Ben Sanders
I applied the patch I found here:
http://thread.gmane.org/gmane.comp.emulators.qemu/35419
and here are my new debug results: http://pastebin.com/m60d53e9d

It looks like it is still crashing in the same place, so I double
checked that the patch was applied correctly.  I also didn't have the
"npt=1" originally on in the module, so I added that as well.

Any ideas?

Thanks,

Ben

On Fri, Jun 19, 2009 at 1:30 AM, Alexander Graf wrote:
>
> Looks like you're missing my VMware interface extension patches that
> implement TSC fake value passing to the guest? The series was called "VMware
> ESX guest bringup (partial)".
>
> Also, try to keep the ML in CC and simply pastebin instead of attaching the
> log file. There are probably more people out there who have the same
> question ;-).
>
> Alex
>
>
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] virtio-serial: A guest <-> host interface for simple communication

2009-06-24 Thread Amit Shah
On (Wed) Jun 24 2009 [20:20:27], Jamie Lokier wrote:
> Amit Shah wrote:
> > 
> > I think the interface from the guest POV stays the same: reads / writes
> > to char devices. With virtio-serial, though, we can add a few other
> > interesting things like names to ports, ability to hot-add ports on
> > demand, request notifications when either end goes down, etc.
> 
> Good features, useful for a lot of handy things.  I think it would be
> handy if the same features were available to the guest application
> generally, not just on guest kernels with a specific driver though.
> 
> As we talked before, about things like boot loaders and kernel
> debuggers, and installing the support applications on old guests.

The -net vmchannel offers that option. Can't see how to do it other way.

> Is it possible to support access to the same capabilities through a
> well known IO/MMIO address, in the same way that VGA and IDE are both
> ordinary PCI devices, but also can be reached easily through well
> known IO/MMIO addresses in simple code?

The exits caused by accessing IO/MMIO regions are costlier than using
PV. So using the current vmchannel option would be almost similar in
those respects.

Also, virtio is the almost standard way these days to get performance.
We already have block, net drivers over virtio and they have been ported
to older kernels as well.

Amit
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] Re: [PATCH 1/2] allow hypervisor CPUID bit to be overriden

2009-06-24 Thread Gleb Natapov
On Wed, Jun 24, 2009 at 10:20:41PM +0100, Jamie Lokier wrote:
> Avi Kivity wrote:
> > On 06/23/2009 02:31 PM, Paul Brook wrote:
> > >On Tuesday 23 June 2009, Avi Kivity wrote:
> > >   
> > >>On 06/23/2009 12:47 AM, Andre Przywara wrote:
> > >> 
> > >>>KVM defaults to the hypervisor CPUID bit to be set, whereas pure QEMU
> > >>>clears it. On some occasions one want to set or clear it the other way
> > >>>round (for instance to get HyperV running inside a guest).
> > >>>Allow the default to be overridden on the command line and fix some
> > >>>whitespace damage on the way.
> > >>>   
> > >>It makes sense for qemu to set the hypervisor bit unconditionally.  A
> > >>guest running under qemu is not bare metal.
> > >> 
> > >
> > >I see no reason why a guest has to be told that it's running inside a VM.
> > >In principle an appropriately configured qemu should be indistinguishable 
> > >from
> > >real hardware. In practice it's technically infeasible to cover absolutely
> > >everything, but if we set this bit we're not even trying.
> > >
> > >I have no objection to the bit being set by default for the QEMU CPU types.
> > >   
> > 
> > I agree it's pointless, but it is a Microsoft requirement for passing 
> > their SVVP tests.  Enabling it by default makes life a little easier for 
> > users who wish to validate their hypervisor and has no drawbacks.
> 
> Hold on.
> 
> Do the SVVP tests fail on a real (non-virtal) machine then?
> 
SVVP -> Server Virtualization Validation Program

Definitely fails on a real machine :)

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html