date:20111004

[Qemu-devel] buildbot failure in qemu on monitor_x86_64_debian_6_0

2011-10-04 Thread qemu

The Buildbot has detected a new failure on builder monitor_x86_64_debian_6_0 
while building qemu.
Full details are available at:
 http://buildbot.b1-systems.de/qemu/builders/monitor_x86_64_debian_6_0/builds/50

Buildbot URL: http://buildbot.b1-systems.de/qemu/

Buildslave for this Build: yuzuki

Build Reason: The Nightly scheduler named 'nightly_monitor' triggered this build
Build Source Stamp: [branch queue/monitor] HEAD
Blamelist: 

BUILD FAILED: failed git

sincerely,
 -The Buildbot

Re: [Qemu-devel] [PATCH 1/2] linux-user: Remove unused code

2011-10-04 Thread Peter Maydell

On 3 October 2011 21:43, Stefan Weil  wrote:
> The code is unused since 8 years, so remove it.
>
> Signed-off-by: Stefan Weil 
> ---
>  linux-user/signal.c |    5 +
>  1 files changed, 1 insertions(+), 4 deletions(-)
>
> diff --git a/linux-user/signal.c b/linux-user/signal.c
> index 89276eb..40c5eb1 100644
> --- a/linux-user/signal.c
> +++ b/linux-user/signal.c
> @@ -1274,10 +1274,7 @@ setup_return(CPUState *env, struct target_sigaction 
> *ka,
>
>                if (__put_user(retcodes[idx], rc))
>                        return 1;
> -#if 0
> -               flush_icache_range((abi_ulong)rc,
> -                                  (abi_ulong)(rc + 1));
> -#endif
> +
>                retcode = rc_addr + thumb;
>        }

(since this is in the ARM section of this file)

Since this is writing guest code instructions, flushing the host icache
is clearly not the right thing. I'm just wondering how this ensures that
we don't have stale translated code for this address, though...

-- PMM

Re: [Qemu-devel] buildbot failure in qemu on default_openbsd_4.9

2011-10-04 Thread Stefan Hajnoczi

On Mon, Oct 03, 2011 at 05:57:28PM -0500, Anthony Liguori wrote:
> On 10/03/2011 06:15 PM, q...@buildbot.b1-systems.de wrote:
> >The Buildbot has detected a new failure on builder default_openbsd_4.9 while 
> >building qemu.
> >Full details are available at:
> >  http://buildbot.b1-systems.de/qemu/builders/default_openbsd_4.9/builds/44
> >
> >Buildbot URL: http://buildbot.b1-systems.de/qemu/
> >
> >Buildslave for this Build: kraxel_openbsd49
> >
> >Build Reason: The Nightly scheduler named 'nightly_default' triggered this 
> >build
> >Build Source Stamp: [branch master] HEAD
> >Blamelist:
> >
> >BUILD FAILED: failed git
> 
> Is there a way to suppress buildbot failures when the failed stage
> is git?  Not a huge deal but seems like a potential improvement.

We've discussed adding custom Python configuration code to ignore these
failures.

Looking at the buildbot documentation there is a 'retry' parameter for
version control which "specifies a tuple of (delay, repeats) which means
that when a full VC checkout fails, it should be retried up to repeats
times, waiting delay seconds between attempts".  This sounds like it
could work.

I suggest setting retry = (5 * 60, 3) so that we wait 5 minutes after a
failed git fetch and try 3 times.

Daniel: In the buildbot 0.8.3p1-1 docs I found the 'retry' parameter
under 4.11.3 Source Checkout.  Is it possible to add this parameter?

Thanks,
Stefan

Re: [Qemu-devel] New option for snapshot_blkdev to avoid image creation

2011-10-04 Thread Stefan Hajnoczi

On Mon, Oct 03, 2011 at 04:09:01PM +, Federico Simoncelli wrote:
> In some situations might be useful to let qemu use an image that was
> prepared for a live snapshot.
> The advantage is that creating the snapshot file outside of the qemu
> process we can use the whole range of options provided by the format
> (eg for qcow2: encryption, cluster_size and preallocation).
> It is also possible to pre-set a relative path to the backing file
> (now it is created by default as absolute path).
> In the long run it can also avoid the danger of reimplementing qemu-img
> inside qemu (if we wanted to expose such options when a snapshot is
> requested).

When the image file is created based on the backing file size:

$ qemu-img create -f qcow2 -o backing_file=master.img vm001.qcow2

It turns out that bdrv_img_create() opens the backing file with
read/write permissions.  This is generally a bad idea but especially
dangerous when the VM currently has the image file open already since
image formats are not designed for multiple initiators (clustering).  We
wouldn't want any caches being written out or startup fsck-style
operations to be performed on the backing file while the VM has it open.

Please make sure to use read-only before applying this patch.

Stefan

Re: [Qemu-devel] buildbot failure in qemu on default_x86_64_rhel5

2011-10-04 Thread Stefan Hajnoczi

On Mon, Oct 03, 2011 at 04:10:50PM +0530, Aneesh Kumar K.V wrote:
> On Mon, 3 Oct 2011 08:37:52 +0100, Stefan Hajnoczi  wrote:
> > On Sat, Oct 01, 2011 at 03:03:23PM +0530, Aneesh Kumar K.V wrote:
> > > +#ifndef CONFIG_UTIMENSAT
> > > +/*
> > > + * We support handle fs driver only if all related
> > > + * syscalls are provided by host.
> > > + */
> > 
> > Perhaps a ./configure check should be added to see whether the handle
> > syscalls are supported instead of using CONFIG_UTIMENSAT.
> > 
> 
> We already do check for handle syscall. Since glibc doesn't have the
> this syscall yet, I added the check in virtio-9p-handle.c as below

CONFIG_UTIMENSAT is defined when the host has glibc >= 2.6.

Handle syscalls are available on Linux 2.6.39 but not exposed by glibc.

Therefore CONFIG_UTIMENSAT has nothing to do with handle syscalls and
does not mean handle syscalls are supported.  I would drop that hunk of
the patch or test for the actual handle syscalls in ./configure.

Stefan

Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/2] tcg/ppc*: Move cache initialization to ppc specific code

2011-10-04 Thread Peter Maydell

On 4 October 2011 07:29, Paolo Bonzini  wrote:
> On 10/04/2011 07:55 AM, Stefan Weil wrote:
>> I learned now that ppc will need flush_icache_range() for kvm, too.
>> So it won't be possible to implement a uniform handling of
>> flush_icache_range() for all host architectures.
>
> x86 and IIRC s390 flush_icache_range is a no-op, so it is possible to "call"
> it for all KVM architectures without incurring penalties.

...but presumably when ARM KVM actually lands we'll need to make
flush_icache_range() available for it too.

If flushing the icache isn't a TCG-specific operation, then we should
have implementations in some other per-target-architecture file.
Doesn't particularly matter where as long as we're consistent for
all architectures.

(I'd also like to see more of the ifdef-target-foo ladders in common
files broken out into more cleanly defined interfaces provided by
per-target source files, personally.)

-- PMM

Re: [Qemu-devel] buildbot failure in qemu on default_openbsd_4.9

2011-10-04 Thread Daniel Gollub

Hi Stefan,

On Tuesday, October 04, 2011 08:45:28 AM Stefan Hajnoczi wrote:
> Looking at the buildbot documentation there is a 'retry' parameter for
> version control which "specifies a tuple of (delay, repeats) which means
> that when a full VC checkout fails, it should be retried up to repeats
> times, waiting delay seconds between attempts".  This sounds like it
> could work.
> 
> I suggest setting retry = (5 * 60, 3) so that we wait 5 minutes after a
> failed git fetch and try 3 times.

Perfect!

Thanks for finding this! I just haven't had time yet to look into this in 
detail.

> 
> Daniel: In the buildbot 0.8.3p1-1 docs I found the 'retry' parameter
> under 4.11.3 Source Checkout.  Is it possible to add this parameter?

I applied this for the qemu buildbot master instance for all Git checkouts.

Hopefully there are now less "failed git" buildbot mails ...

Best Regards,
Daniel



-- 
Daniel Gollub
Linux Consultant & Developer
Tel.: +49-160 47 73 970 
Mail: gol...@b1-systems.de

B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de
GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537


signature.asc
Description: This is a digitally signed message part.

Re: [Qemu-devel] New option for snapshot_blkdev to avoid image creation

2011-10-04 Thread Federico Simoncelli

- Original Message -
> From: "Stefan Hajnoczi" 
> To: "Federico Simoncelli" 
> Cc: qemu-devel@nongnu.org, aba...@redhat.com, dl...@redhat.com
> Sent: Tuesday, October 4, 2011 9:33:48 AM
> Subject: Re: [Qemu-devel] New option for snapshot_blkdev to avoid image 
> creation
> 
> On Mon, Oct 03, 2011 at 04:09:01PM +, Federico Simoncelli wrote:
> > In some situations might be useful to let qemu use an image that
> > was
> > prepared for a live snapshot.
> > The advantage is that creating the snapshot file outside of the
> > qemu
> > process we can use the whole range of options provided by the
> > format
> > (eg for qcow2: encryption, cluster_size and preallocation).
> > It is also possible to pre-set a relative path to the backing file
> > (now it is created by default as absolute path).
> > In the long run it can also avoid the danger of reimplementing
> > qemu-img
> > inside qemu (if we wanted to expose such options when a snapshot is
> > requested).
> 
> When the image file is created based on the backing file size:
> 
> $ qemu-img create -f qcow2 -o backing_file=master.img vm001.qcow2
> 
> It turns out that bdrv_img_create() opens the backing file with
> read/write permissions.  This is generally a bad idea but especially
> dangerous when the VM currently has the image file open already since
> image formats are not designed for multiple initiators (clustering).

Hi Stefan, are you sure that bdrv_img_create opens the backing file
with read/write permissions?
After a quick check I can't see that happening:

$ ./qemu-img create -f qcow2 /tmp/master.img 1G
Formatting '/tmp/master.img', fmt=qcow2 size=1073741824 encryption=off 
cluster_size=65536

$ strace -e trace=open ./qemu-img create -f qcow2 -o 
backing_file=/tmp/master.img /tmp/vm001.qcow2
[...]
open("/tmp/vm001.qcow2", O_RDONLY|O_NONBLOCK) = -1 ENOENT (No such file or 
directory)
open("/tmp/vm001.qcow2", O_RDONLY|O_NONBLOCK) = -1 ENOENT (No such file or 
directory)
open("/tmp/master.img", O_RDONLY|O_NONBLOCK) = 3
open("/tmp/master.img", O_RDONLY|O_NONBLOCK) = 3
open("/tmp/master.img", O_RDONLY|O_DSYNC|O_CLOEXEC) = 3
open("/tmp/master.img", O_RDONLY|O_NONBLOCK) = 3
open("/tmp/master.img", O_RDONLY|O_NONBLOCK) = 3
open("/tmp/master.img", O_RDONLY|O_CLOEXEC) = 3
Formatting '/tmp/vm001.qcow2', fmt=qcow2 size=1073741824 
backing_file='/tmp/master.img' encryption=off cluster_size=65536 
open("/tmp/vm001.qcow2", O_RDONLY|O_NONBLOCK) = -1 ENOENT (No such file or 
directory)
open("/tmp/vm001.qcow2", O_RDONLY|O_NONBLOCK) = -1 ENOENT (No such file or 
directory)
open("/tmp/vm001.qcow2", O_WRONLY|O_CREAT|O_TRUNC, 0644) = 6
open("/tmp/vm001.qcow2", O_RDONLY|O_NONBLOCK) = 6
open("/tmp/vm001.qcow2", O_RDONLY|O_NONBLOCK) = 6
open("/tmp/vm001.qcow2", O_RDWR|O_DSYNC|O_CLOEXEC) = 6
open("/tmp/vm001.qcow2", O_RDONLY|O_NONBLOCK) = 6
open("/tmp/vm001.qcow2", O_RDONLY|O_NONBLOCK) = 6
open("/tmp/vm001.qcow2", O_RDWR|O_CLOEXEC) = 6

-- 
Federico

Re: [Qemu-devel] [PATCH V4] Add stdio char device on windows

2011-10-04 Thread Yann Quelen

Hello,

2011/10/3 Fabien Chouteau :
> Simple implementation of an stdio char device on Windows.
>
> Signed-off-by: Fabien Chouteau 
> ---
>  qemu-char.c |  227 
> ++-
>  1 files changed, 225 insertions(+), 2 deletions(-)
>

hStdIn and hStdOut should be declared HANDLE instead of HANDLE*
(in practice it works because HANDLE is just a void*)

Regards,

Yann Quelen

Re: [Qemu-devel] buildbot failure in qemu on default_x86_64_rhel5

2011-10-04 Thread Aneesh Kumar K.V

On Tue, 4 Oct 2011 08:18:07 +0100, Stefan Hajnoczi  wrote:
> On Mon, Oct 03, 2011 at 04:10:50PM +0530, Aneesh Kumar K.V wrote:
> > On Mon, 3 Oct 2011 08:37:52 +0100, Stefan Hajnoczi  
> > wrote:
> > > On Sat, Oct 01, 2011 at 03:03:23PM +0530, Aneesh Kumar K.V wrote:
> > > > +#ifndef CONFIG_UTIMENSAT
> > > > +/*
> > > > + * We support handle fs driver only if all related
> > > > + * syscalls are provided by host.
> > > > + */
> > > 
> > > Perhaps a ./configure check should be added to see whether the handle
> > > syscalls are supported instead of using CONFIG_UTIMENSAT.
> > > 
> > 
> > We already do check for handle syscall. Since glibc doesn't have the
> > this syscall yet, I added the check in virtio-9p-handle.c as below
> 
> CONFIG_UTIMENSAT is defined when the host has glibc >= 2.6.
> 
> Handle syscalls are available on Linux 2.6.39 but not exposed by glibc.
> 
> Therefore CONFIG_UTIMENSAT has nothing to do with handle syscalls and
> does not mean handle syscalls are supported.  I would drop that hunk of
> the patch or test for the actual handle syscalls in ./configure.

Here is what i am trying to achieve with the patch. For handle based fs
driver to work we need to have 

1) support for handle syscall
2) support for fd based syscalls like futimens, fstatat, readlinkat,
fchmod, fchownat, openat etc.

Now handle syscalls are recently added to kernel and glibc doesn't have
support for that. So what we did is to add handle syscall in
virtio-9p-handle.c via syscall(2). Now if the syscall is not supported
by the host kernel we will get ENOSYS. I only added support for i386 and
x86_64, because most the syscall number varies with different archs. For
other archs the wrapper returns ENOSYS. So instead of checking for
handle syscalls in configure script we did the above to make sure it
work without failure in most of the case. Once we have glibc support for
handle syscall the above changes should be dropped in favor of
configure script test.

Now for the fd based syscall dependency, we didn't initially had any
check for that because my expectation was most glibc should
have support for them. But RHEL 5 build failure indicate that futimens
is not supported. We were already checking for futimens in configure so
i added changes to make sure if futimens is not supported
handle_utimensat returns error. (That was not added as a run time
check, but rather a compile error fix). Now should we allow handle based fs
driver if futimens is not supported. I was suggesting we should not;
hence the check in init to return error when we don't support
futimens. The later part of init routine do check whether handle
syscalls are supported and disable handle fs driver if they are not.

-aneesh

Re: [Qemu-devel] buildbot failure in qemu on default_openbsd_4.9

2011-10-04 Thread Stefan Hajnoczi

On Tue, Oct 04, 2011 at 10:12:15AM +0200, Daniel Gollub wrote:
> > Daniel: In the buildbot 0.8.3p1-1 docs I found the 'retry' parameter
> > under 4.11.3 Source Checkout.  Is it possible to add this parameter?
> 
> I applied this for the qemu buildbot master instance for all Git checkouts.
> 
> Hopefully there are now less "failed git" buildbot mails ...

Great, thank you!

Stefan

Re: [Qemu-devel] buildbot failure in qemu on default_x86_64_rhel5

2011-10-04 Thread Stefan Hajnoczi

On Tue, Oct 04, 2011 at 02:18:20PM +0530, Aneesh Kumar K.V wrote:
> On Tue, 4 Oct 2011 08:18:07 +0100, Stefan Hajnoczi  wrote:
> > On Mon, Oct 03, 2011 at 04:10:50PM +0530, Aneesh Kumar K.V wrote:
> > > On Mon, 3 Oct 2011 08:37:52 +0100, Stefan Hajnoczi  
> > > wrote:
> > > > On Sat, Oct 01, 2011 at 03:03:23PM +0530, Aneesh Kumar K.V wrote:
> > > > > +#ifndef CONFIG_UTIMENSAT
> > > > > +/*
> > > > > + * We support handle fs driver only if all related
> > > > > + * syscalls are provided by host.
> > > > > + */
> > > > 
> > > > Perhaps a ./configure check should be added to see whether the handle
> > > > syscalls are supported instead of using CONFIG_UTIMENSAT.
> > > > 
> > > 
> > > We already do check for handle syscall. Since glibc doesn't have the
> > > this syscall yet, I added the check in virtio-9p-handle.c as below
> > 
> > CONFIG_UTIMENSAT is defined when the host has glibc >= 2.6.
> > 
> > Handle syscalls are available on Linux 2.6.39 but not exposed by glibc.
> > 
> > Therefore CONFIG_UTIMENSAT has nothing to do with handle syscalls and
> > does not mean handle syscalls are supported.  I would drop that hunk of
> > the patch or test for the actual handle syscalls in ./configure.
> 
> Here is what i am trying to achieve with the patch. For handle based fs
> driver to work we need to have 
> 
> 1) support for handle syscall
> 2) support for fd based syscalls like futimens, fstatat, readlinkat,
> fchmod, fchownat, openat etc.
> 
> Now handle syscalls are recently added to kernel and glibc doesn't have
> support for that. So what we did is to add handle syscall in
> virtio-9p-handle.c via syscall(2). Now if the syscall is not supported
> by the host kernel we will get ENOSYS. I only added support for i386 and
> x86_64, because most the syscall number varies with different archs. For
> other archs the wrapper returns ENOSYS. So instead of checking for
> handle syscalls in configure script we did the above to make sure it
> work without failure in most of the case. Once we have glibc support for
> handle syscall the above changes should be dropped in favor of
> configure script test.
> 
> Now for the fd based syscall dependency, we didn't initially had any
> check for that because my expectation was most glibc should
> have support for them. But RHEL 5 build failure indicate that futimens
> is not supported. We were already checking for futimens in configure so
> i added changes to make sure if futimens is not supported
> handle_utimensat returns error. (That was not added as a run time
> check, but rather a compile error fix). Now should we allow handle based fs
> driver if futimens is not supported. I was suggesting we should not;
> hence the check in init to return error when we don't support
> futimens. The later part of init routine do check whether handle
> syscalls are supported and disable handle fs driver if they are not.

Okay, then the comment should be:

/*
 * We support handle fs driver only if futimens is provided by the host
 */

The scenario where it might be possible to hit the CONFIG_UTIMENSAT is
with a modern kernel paired with an old userspace.  The handle syscall
which we call directly would succeed but the futimens(2) would not be
available.

On a sane system the handle syscall fails because the kernel doesn't
support it (and futimens isn't there either).

Stefan

Re: [Qemu-devel] New option for snapshot_blkdev to avoid image creation

2011-10-04 Thread Stefan Hajnoczi

On Tue, Oct 04, 2011 at 04:27:42AM -0400, Federico Simoncelli wrote:
> - Original Message -
> > From: "Stefan Hajnoczi" 
> > To: "Federico Simoncelli" 
> > Cc: qemu-devel@nongnu.org, aba...@redhat.com, dl...@redhat.com
> > Sent: Tuesday, October 4, 2011 9:33:48 AM
> > Subject: Re: [Qemu-devel] New option for snapshot_blkdev to avoid image 
> > creation
> > 
> > On Mon, Oct 03, 2011 at 04:09:01PM +, Federico Simoncelli wrote:
> > > In some situations might be useful to let qemu use an image that
> > > was
> > > prepared for a live snapshot.
> > > The advantage is that creating the snapshot file outside of the
> > > qemu
> > > process we can use the whole range of options provided by the
> > > format
> > > (eg for qcow2: encryption, cluster_size and preallocation).
> > > It is also possible to pre-set a relative path to the backing file
> > > (now it is created by default as absolute path).
> > > In the long run it can also avoid the danger of reimplementing
> > > qemu-img
> > > inside qemu (if we wanted to expose such options when a snapshot is
> > > requested).
> > 
> > When the image file is created based on the backing file size:
> > 
> > $ qemu-img create -f qcow2 -o backing_file=master.img vm001.qcow2
> > 
> > It turns out that bdrv_img_create() opens the backing file with
> > read/write permissions.  This is generally a bad idea but especially
> > dangerous when the VM currently has the image file open already since
> > image formats are not designed for multiple initiators (clustering).
> 
> Hi Stefan, are you sure that bdrv_img_create opens the backing file
> with read/write permissions?

You are absolutely right.  BDRV_O_FLAGS does not have BDRV_O_RDWR so it
is opening read-only.  I missed this when reading the code earlier
today.

Stefan

Re: [Qemu-devel] [RFC] Adding new filesystem 'proxy' to 9p

2011-10-04 Thread M. Mohan Kumar

> 
> That is the case if the proxy helper code is perfectly written. I am trying
> to think about the scenario where there is a bug (eg heap corruption /
> stack overflow) which allows a malicious non-root QEMU process to exploit
> the proxy helper to run code that it was *not* intended to run.
> 
> If the proxy helper is running root with all capabilities, then a bug in
> the proxy helper can easily turn into a full root exploit.
> 
> If the proxy helper starts as root, chroots, and then immediately drops to
> a non-root user, keeping only the CAP_CHOWN, CAP_DAC_OVERRIDE, CAP_FOWNER
> and CAP_DAC_READ_SEARCH capabilities, then a bug in the proxy helper can
> only be used to access files within the designated 9pfs export. If the
> exported directory does not contain any important host system files, then
> it is unlikely it can be used to create a full root exploit.
> 

Thanks Daniel, I will add 'capabiliies' to proxy helper. CAP_FOWNER capability 
also need.

I am working on the patches. I will post them in few days.

[Qemu-devel] [PATCH 1/2] ACPI: Update cpu hotplug event for guest

2011-10-04 Thread pingfank

From: Liu Ping Fan 

Need to update cpu hotplug event for guest when updating SCI event

Signed-off-by: Liu Ping Fan 
---
 hw/acpi_piix4.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
index 44eb8ae..f585226 100644
--- a/hw/acpi_piix4.c
+++ b/hw/acpi_piix4.c
@@ -94,7 +94,8 @@ static void pm_update_sci(PIIX4PMState *s)
ACPI_BITMASK_POWER_BUTTON_ENABLE |
ACPI_BITMASK_GLOBAL_LOCK_ENABLE |
ACPI_BITMASK_TIMER_ENABLE)) != 0) ||
-(((s->gpe.sts[0] & s->gpe.en[0]) & PIIX4_PCI_HOTPLUG_STATUS) != 0);
+(((s->gpe.sts[0] & s->gpe.en[0]) &
+  (PIIX4_PCI_HOTPLUG_STATUS | PIIX4_CPU_HOTPLUG_STATUS)) != 0);
 
 qemu_set_irq(s->irq, sci_level);
 /* schedule a timer interruption if needed */
-- 
1.7.4.4

[Qemu-devel] [PATCH] ACPI: Call ACPI remove handler when handling ACPI eject event

2011-10-04 Thread pingfank

From: Liu Ping Fan 

Call the remove handler for ACPI_NOTIFY_EJECT_REQUEST

Signed-off-by: Liu Ping Fan 
---
 drivers/acpi/bus.c  |2 +-
 drivers/acpi/scan.c |2 +-
 include/acpi/acpi_bus.h |2 ++
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
index 437ddbf..d06ec6d 100644
--- a/drivers/acpi/bus.c
+++ b/drivers/acpi/bus.c
@@ -764,7 +764,7 @@ static void acpi_bus_notify(acpi_handle handle, u32 type, 
void *data)
break;
 
case ACPI_NOTIFY_EJECT_REQUEST:
-   /* TBD */
+   acpi_os_hotplug_execute(acpi_bus_hot_remove_device, handle);
break;
 
case ACPI_NOTIFY_DEVICE_CHECK_LIGHT:
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index 449c556..3b97b61 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -83,7 +83,7 @@ acpi_device_modalias_show(struct device *dev, struct 
device_attribute *attr, cha
 }
 static DEVICE_ATTR(modalias, 0444, acpi_device_modalias_show, NULL);
 
-static void acpi_bus_hot_remove_device(void *context)
+void acpi_bus_hot_remove_device(void *context)
 {
struct acpi_device *device;
acpi_handle handle = context;
diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
index 6cd5b64..b19c09d 100644
--- a/include/acpi/acpi_bus.h
+++ b/include/acpi/acpi_bus.h
@@ -310,6 +310,8 @@ extern int unregister_acpi_notifier(struct notifier_block 
*);
 
 extern int register_acpi_bus_notifier(struct notifier_block *nb);
 extern void unregister_acpi_bus_notifier(struct notifier_block *nb);
+extern void acpi_bus_hot_remove_device(void *context);
+
 /*
  * External Functions
  */
-- 
1.7.4.4

[Qemu-devel] [PATCH 2/2] LAPIC: make lapic support cpu hotplug

2011-10-04 Thread pingfank

From: Liu Ping Fan 

Separate apic from qbus to icc bus which supports hotplug feature.
And make the creation of apic as part of cpu initialization, so
apic's state has been ready, before setting kvm_apic.

Signed-off-by: Liu Ping Fan 
---
 Makefile.target  |1 +
 hw/apic.c|7 -
 hw/apic.h|1 +
 hw/icc_bus.c |   62 ++
 hw/icc_bus.h |   24 +++
 hw/pc.c  |   11 -
 target-i386/cpu.h|1 +
 target-i386/helper.c |7 +-
 target-i386/kvm.c|1 -
 9 files changed, 105 insertions(+), 10 deletions(-)
 create mode 100644 hw/icc_bus.c
 create mode 100644 hw/icc_bus.h

diff --git a/Makefile.target b/Makefile.target
index 9011f28..5607c6d 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -241,6 +241,7 @@ obj-i386-$(CONFIG_KVM) += kvmclock.o
 obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
 obj-i386-y += testdev.o
 obj-i386-y += acpi.o acpi_piix4.o
+obj-i386-y += icc_bus.o
 
 obj-i386-y += pcspk.o i8254.o
 obj-i386-$(CONFIG_KVM_PIT) += i8254-kvm.o
diff --git a/hw/apic.c b/hw/apic.c
index 69d6ac5..95a1664 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -24,6 +24,7 @@
 #include "sysbus.h"
 #include "trace.h"
 #include "kvm.h"
+#include "icc_bus.h"
 
 /* APIC Local Vector Table */
 #define APIC_LVT_TIMER   0
@@ -304,11 +305,13 @@ void cpu_set_apic_base(DeviceState *d, uint64_t val)
 
 if (!s)
 return;
+
 if (kvm_enabled() && kvm_irqchip_in_kernel())
 s->apicbase = val;
 else
 s->apicbase = (val & 0xf000) |
 (s->apicbase & (MSR_IA32_APICBASE_BSP | MSR_IA32_APICBASE_ENABLE));
+
 /* if disabled, cannot be enabled again */
 if (!(val & MSR_IA32_APICBASE_ENABLE)) {
 s->apicbase &= ~MSR_IA32_APICBASE_ENABLE;
@@ -1075,7 +1078,7 @@ static const VMStateDescription vmstate_apic = {
 }
 };
 
-static void apic_reset(DeviceState *d)
+void apic_reset(DeviceState *d)
 {
 APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
 int bsp;
@@ -1138,7 +1141,7 @@ static SysBusDeviceInfo apic_info = {
 
 static void apic_register_devices(void)
 {
-sysbus_register_withprop(&apic_info);
+iccbus_register(&apic_info);
 }
 
 device_init(apic_register_devices)
diff --git a/hw/apic.h b/hw/apic.h
index c857d52..e258efa 100644
--- a/hw/apic.h
+++ b/hw/apic.h
@@ -24,5 +24,6 @@ void apic_sipi(DeviceState *s);
 /* pc.c */
 int cpu_is_bsp(CPUState *env);
 DeviceState *cpu_get_current_apic(void);
+void apic_reset(DeviceState *d);
 
 #endif
diff --git a/hw/icc_bus.c b/hw/icc_bus.c
new file mode 100644
index 000..360ca2a
--- /dev/null
+++ b/hw/icc_bus.c
@@ -0,0 +1,62 @@
+/*
+*/
+#define ICC_BUS_PLUG
+#ifdef ICC_BUS_PLUG
+#include "icc_bus.h"
+
+
+
+struct icc_bus_info icc_info = {
+.qinfo.name = "icc",
+.qinfo.size = sizeof(struct icc_bus),
+.qinfo.props = (Property[]) {
+DEFINE_PROP_END_OF_LIST(),
+}
+
+};
+
+
+static const VMStateDescription vmstate_icc_bus = {
+.name = "icc_bus",
+.version_id = 1,
+.minimum_version_id = 1,
+.minimum_version_id_old = 1,
+.pre_save = NULL,
+.post_load = NULL,
+};
+
+struct icc_bus *g_iccbus;
+
+struct icc_bus *icc_init_bus(DeviceState *parent, const char *name)
+{
+struct icc_bus *bus;
+
+bus = FROM_QBUS(icc_bus, qbus_create(&icc_info.qinfo, parent, name));
+bus->qbus.allow_hotplug = 1; /* Yes, we can */
+bus->qbus.name = "icc";
+vmstate_register(NULL, -1, &vmstate_icc_bus, bus);
+g_iccbus = bus;
+return bus;
+}
+
+
+
+
+
+static int iccbus_device_init(DeviceState *dev, DeviceInfo *base)
+{
+SysBusDeviceInfo *info = container_of(base, SysBusDeviceInfo, qdev);
+
+return info->init(sysbus_from_qdev(dev));
+}
+
+void iccbus_register(SysBusDeviceInfo *info)
+{
+info->qdev.init = iccbus_device_init;
+info->qdev.bus_info = &icc_info.qinfo;
+
+assert(info->qdev.size >= sizeof(SysBusDevice));
+qdev_register(&info->qdev);
+}
+
+#endif
diff --git a/hw/icc_bus.h b/hw/icc_bus.h
new file mode 100644
index 000..94d9242
--- /dev/null
+++ b/hw/icc_bus.h
@@ -0,0 +1,24 @@
+#ifndef QEMU_ICC_H
+#define QEMU_ICC_H
+
+#include "qdev.h"
+#include "sysbus.h"
+
+typedef struct icc_bus icc_bus;
+typedef struct icc_bus_info icc_bus_info;
+
+
+struct icc_bus {
+BusState qbus;
+};
+
+struct icc_bus_info {
+BusInfo qinfo;
+};
+
+extern struct icc_bus_info icc_info;
+extern struct icc_bus *g_iccbus;
+extern struct icc_bus *icc_init_bus(DeviceState *parent, const char *name);
+extern void iccbus_register(SysBusDeviceInfo *info);
+
+#endif
diff --git a/hw/pc.c b/hw/pc.c
index 6b3662e..10371d8 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -24,6 +24,7 @@
 #include "hw.h"
 #include "pc.h"
 #include "apic.h"
+#include "icc_bus.h"
 #include "fdc.h"
 #include "ide.h"
 #include "pci.h"
@@ -91,6 +92,7 @@ struct e820_table {
 static struct e820_table e820_table;
 struct hpet_fw_config hpet_cfg = {.count = UINT8_

[Qemu-devel] Enable x86 linux guest with cpu hotplug feature.

2011-10-04 Thread pingfank

From: Liu Ping Fan 

A bunch of patches, which are applied separately for kernel and qemu-kvm,
and make x86 based linux guest with cpu hotplug feature.

For kernel:
0001-ACPI-Call-ACPI-remove-handler-when-handling-ACPI-eje.patch
--call acpi eject event handler in acpi_bus_notify()

For qemu-kvm:
0001-ACPI-Update-cpu-hotplug-event-for-guest.patch
--trigger acpi cpu hotplug event for guest, so qemu-kvm can bring present cpu
  to online state
0002-LAPIC-make-lapic-support-cpu-hotplug.patch
--make qemu-kvm support to bring possible cpu to present state.

After applying these patches, we can bring possible cpu online 
by the following step.
1.Boot up qemu-kvm with the param "-smp 1,maxcpu=4", 
2.Use "cpu_set 2 online" cmd in qemu monitor to bring possible cpu 
  to present state. 
3.In guest,echo 1 > cpu2/online to bring the present cpu to online 
  state.

[Qemu-devel] [PATCH] ui/spice-core: fix segfault in monitor

2011-10-04 Thread Alon Levy

Fix segfault if a qxl device is present but no spice command line
argument is given.

RHBZ 743251.

Signed-off-by: Alon Levy 
---
 ui/spice-core.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/ui/spice-core.c b/ui/spice-core.c
index 3cbc721..b14d1d9 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -413,7 +413,7 @@ void do_info_spice(Monitor *mon, QObject **ret_data)
 int port, tls_port;
 char version_string[20]; /* 12 = |255.255.255\0| is the max */
 
-if (!spice_server) {
+if (!spice_server || !opts) {
 *ret_data = qobject_from_jsonf("{ 'enabled': false }");
 return;
 }
-- 
1.7.6.4

[Qemu-devel] s390: Fix cpu shutdown for KVM

2011-10-04 Thread Christian Borntraeger

qemu/kvm on s390 currently hangs on panic (doesnt exit on disabled
wait) and also shuts down on cpu hot-unplug (SIGP stop).
This patch tries to fix these simple cases.
On s390 a shutdown is the state of all CPUs being either stopped
or disabled (for interrupts) waiting. We have to track this number
to call the shutdown sequence accordingly. This patch implements
the counting and shutdown handling for the kvm path in qemu.

Signed-off-by: Christian Borntraeger 
---
 hw/s390-virtio.c   |4 
 target-s390x/kvm.c |   29 +
 2 files changed, 29 insertions(+), 4 deletions(-)

Index: b/hw/s390-virtio.c
===
--- a/hw/s390-virtio.c
+++ b/hw/s390-virtio.c
@@ -130,6 +130,9 @@ int s390_virtio_hypercall(CPUState *env,
 return r;
 }
 
+/* defined in target-s390x/kvm.c */
+extern int s390_running_cpus;
+
 /* PC hardware initialisation */
 static void s390_init(ram_addr_t my_ram_size,
   const char *boot_device,
@@ -189,6 +192,7 @@ static void s390_init(ram_addr_t my_ram_
 
 env->halted = 0;
 env->exception_index = 0;
+s390_running_cpus = 1;
 
 if (kernel_filename) {
 kernel_size = load_image(kernel_filename, qemu_get_ram_ptr(0));
Index: b/target-s390x/kvm.c
===
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -63,6 +63,14 @@
 #define SCLP_CMDW_READ_SCP_INFO 0x00020001
 #define SCLP_CMDW_READ_SCP_INFO_FORCED  0x00120001
 
+/*
+ * The number of running CPUs. On s390 a shutdown is the state of all CPUs
+ * being either stopped or disabled (for interrupts) waiting. We have to
+ * track this number to call the shutdown sequence accordingly. This
+ * number is modified either on startup or while holding the big qemu lock.
+ */
+int s390_running_cpus;
+
 const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
 KVM_CAP_LAST_INFO
 };
@@ -185,6 +193,12 @@ void kvm_s390_interrupt_internal(CPUStat
 return;
 }
 
+/*
+ * We can only deliver interrupts to (interrupt) enabled CPUs.
+ * We dont modify s390_running_cpus here, since CPUs in enabled wait
+ * will wait inside the kernel (no exit). Therefore, the targeted
+ * CPUs was neither disabled waiting or stopped for qemu.
+ */
 env->halted = 0;
 env->exception_index = -1;
 qemu_cpu_kick(env);
@@ -301,6 +315,7 @@ static int s390_cpu_restart(CPUState *en
 kvm_s390_interrupt(env, KVM_S390_RESTART, 0);
 env->halted = 0;
 env->exception_index = -1;
+s390_running_cpus++;
 qemu_cpu_kick(env);
 dprintf("DONE: SIGP cpu restart: %p\n", env);
 return 0;
@@ -425,16 +440,24 @@ static int handle_intercept(CPUState *en
 r = handle_instruction(env, run);
 break;
 case ICPT_WAITPSW:
-/* XXX What to do on system shutdown? */
+if (--s390_running_cpus == 0) {
+qemu_system_shutdown_request();
+}
 env->halted = 1;
 env->exception_index = EXCP_HLT;
+r = EXCP_HALTED;
 break;
 case ICPT_SOFT_INTERCEPT:
 fprintf(stderr, "KVM unimplemented icpt SOFT\n");
 exit(1);
 break;
 case ICPT_CPU_STOP:
-qemu_system_shutdown_request();
+if (--s390_running_cpus == 0) {
+qemu_system_shutdown_request();
+}
+env->halted = 1;
+env->exception_index = EXCP_HLT;
+r = EXCP_HALTED;
 break;
 case ICPT_IO:
 fprintf(stderr, "KVM unimplemented icpt IO\n");
@@ -468,8 +491,6 @@ int kvm_arch_handle_exit(CPUState *env, 
 
 if (ret == 0) {
 ret = EXCP_INTERRUPT;
-} else if (ret > 0) {
-ret = 0;
 }
 return ret;
 }

[Qemu-devel] [PATCH] runstate: do not discard runstate changes when paused

2011-10-04 Thread Paolo Bonzini

Trying to migrate a paused machine fails.  The reason is that
the RSTATE_PRE_MIGRATE is reached with vm_stop, and this
transition is eaten when the vm is already paused.  This patch
fixes the problem by always going through runstate_set and
always notifying the new state.

Signed-off-by: Paolo Bonzini 
---
 cpus.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/cpus.c b/cpus.c
index 8978779..eab8ff6 100644
--- a/cpus.c
+++ b/cpus.c
@@ -128,6 +128,8 @@ static void do_vm_stop(RunState state)
 qemu_aio_flush();
 bdrv_flush_all();
 monitor_protocol_event(QEVENT_STOP, NULL);
+} else {
+runstate_set(state);
 }
 }
 
-- 
1.7.6

Re: [Qemu-devel] [PATCH] s390: fix short kernel command lines

2011-10-04 Thread Alexander Graf


On 09/23/2011 03:38 PM, Christian Borntraeger wrote:

The default kernel command line for s390 is
"root=/dev/ram0 ro"

When overriding this line, we have to ensure to also copy the \0 to
avoid false lines, for example, -append "root=/dev/vda" will result in
"root=/dev/vda0 ro" with the current code.

Signed-off-by: Christian Borntraeger


Thanks, applied to s390-next.


Alex

[Qemu-devel] build errors on s390 tcg

2011-10-04 Thread Christian Borntraeger

With todays qemu git I get the following error when compiling for s390:

In file included from /home/cborntra/REPOS/qemu/tcg/tcg.c:175:0:
/home/cborntra/REPOS/qemu/tcg/s390/tcg-target.c:677:13: error: conflicting 
types for ‘tcg_out_mov’
/home/cborntra/REPOS/qemu/tcg/tcg.c:76:13: note: previous declaration of 
‘tcg_out_mov’ was here
/home/cborntra/REPOS/qemu/tcg/s390/tcg-target.c:689:13: error: conflicting 
types for ‘tcg_out_movi’
/home/cborntra/REPOS/qemu/tcg/tcg.c:77:13: note: previous declaration of 
‘tcg_out_movi’ was here
/home/cborntra/REPOS/qemu/tcg/s390/tcg-target.c:829:20: error: conflicting 
types for ‘tcg_out_ld’
/home/cborntra/REPOS/qemu/tcg/tcg.c:74:13: note: previous declaration of 
‘tcg_out_ld’ was here
/home/cborntra/REPOS/qemu/tcg/s390/tcg-target.c:839:20: error: conflicting 
types for ‘tcg_out_st’
/home/cborntra/REPOS/qemu/tcg/tcg.c:81:13: note: previous declaration of 
‘tcg_out_st’ was here
make[1]: *** [tcg/tcg.o] Error 1

reverting c0ad3001bf12292b137b05e1c4643f31c6b0a727 ""tcg: Add forward 
declarations for local functions"
solves the problem, but it just hides the problem. Any ideas howto solve that 
properly?

Christian

Re: [Qemu-devel] s390: Fix cpu shutdown for KVM

2011-10-04 Thread Alexander Graf


On 10/04/2011 01:28 PM, Christian Borntraeger wrote:

qemu/kvm on s390 currently hangs on panic (doesnt exit on disabled
wait) and also shuts down on cpu hot-unplug (SIGP stop).
This patch tries to fix these simple cases.
On s390 a shutdown is the state of all CPUs being either stopped
or disabled (for interrupts) waiting. We have to track this number
to call the shutdown sequence accordingly. This patch implements
the counting and shutdown handling for the kvm path in qemu.

Signed-off-by: Christian Borntraeger
---
  hw/s390-virtio.c   |4 
  target-s390x/kvm.c |   29 +
  2 files changed, 29 insertions(+), 4 deletions(-)

Index: b/hw/s390-virtio.c
===
--- a/hw/s390-virtio.c
+++ b/hw/s390-virtio.c
@@ -130,6 +130,9 @@ int s390_virtio_hypercall(CPUState *env,
  return r;
  }

+/* defined in target-s390x/kvm.c */
+extern int s390_running_cpus;


This breaks non-kvm builds. Please put it into s390-virtio.c and export 
functions that get called from target-s390x/kvm.c which add/remove 
running CPUs from that counter.



Alex


+
  /* PC hardware initialisation */
  static void s390_init(ram_addr_t my_ram_size,
const char *boot_device,
@@ -189,6 +192,7 @@ static void s390_init(ram_addr_t my_ram_

  env->halted = 0;
  env->exception_index = 0;
+s390_running_cpus = 1;

  if (kernel_filename) {
  kernel_size = load_image(kernel_filename, qemu_get_ram_ptr(0));
Index: b/target-s390x/kvm.c
===
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -63,6 +63,14 @@
  #define SCLP_CMDW_READ_SCP_INFO 0x00020001
  #define SCLP_CMDW_READ_SCP_INFO_FORCED  0x00120001

+/*
+ * The number of running CPUs. On s390 a shutdown is the state of all CPUs
+ * being either stopped or disabled (for interrupts) waiting. We have to
+ * track this number to call the shutdown sequence accordingly. This
+ * number is modified either on startup or while holding the big qemu lock.
+ */
+int s390_running_cpus;
+
  const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
  KVM_CAP_LAST_INFO
  };
@@ -185,6 +193,12 @@ void kvm_s390_interrupt_internal(CPUStat
  return;
  }

+/*
+ * We can only deliver interrupts to (interrupt) enabled CPUs.
+ * We dont modify s390_running_cpus here, since CPUs in enabled wait
+ * will wait inside the kernel (no exit). Therefore, the targeted
+ * CPUs was neither disabled waiting or stopped for qemu.
+ */
  env->halted = 0;
  env->exception_index = -1;
  qemu_cpu_kick(env);
@@ -301,6 +315,7 @@ static int s390_cpu_restart(CPUState *en
  kvm_s390_interrupt(env, KVM_S390_RESTART, 0);
  env->halted = 0;
  env->exception_index = -1;
+s390_running_cpus++;
  qemu_cpu_kick(env);
  dprintf("DONE: SIGP cpu restart: %p\n", env);
  return 0;
@@ -425,16 +440,24 @@ static int handle_intercept(CPUState *en
  r = handle_instruction(env, run);
  break;
  case ICPT_WAITPSW:
-/* XXX What to do on system shutdown? */
+if (--s390_running_cpus == 0) {
+qemu_system_shutdown_request();
+}
  env->halted = 1;
  env->exception_index = EXCP_HLT;
+r = EXCP_HALTED;
  break;
  case ICPT_SOFT_INTERCEPT:
  fprintf(stderr, "KVM unimplemented icpt SOFT\n");
  exit(1);
  break;
  case ICPT_CPU_STOP:
-qemu_system_shutdown_request();
+if (--s390_running_cpus == 0) {
+qemu_system_shutdown_request();
+}
+env->halted = 1;
+env->exception_index = EXCP_HLT;
+r = EXCP_HALTED;
  break;
  case ICPT_IO:
  fprintf(stderr, "KVM unimplemented icpt IO\n");
@@ -468,8 +491,6 @@ int kvm_arch_handle_exit(CPUState *env,

  if (ret == 0) {
  ret = EXCP_INTERRUPT;
-} else if (ret>  0) {
-ret = 0;
  }
  return ret;
  }

Re: [Qemu-devel] [PATCH 11/22] i8259: Update IRQ state after reset

2011-10-04 Thread Avi Kivity

On 10/02/2011 10:36 PM, Blue Swirl wrote:

On Sun, Oct 2, 2011 at 8:31 PM, Avi Kivity  wrote:
>
>>  >
>>  >  In fact these aren't problems.  The packet may be sent or data
>>  >  written, as long as they aren't corrupted.  A device is allowed to
>>  >  "delay" a reset (but not indefinitely).
>>
>>  Oh, but corruption could easily happen. Consider for example a disk
>>  controller waiting for DMA ready signal a device separate from the
>>  DMA
>>  controller. Due to reset glitches in the device or signal chain, the
>>  DMA ready signal arrives but the DMA controller still contains old
>>  information, writing the data to disk from wrong memory location.
>
>
>  Would not this corruption also happen on real hardware?  If reset to the 
disk controller is delayed by a slow gate or extra capacitance on a line?

Maybe, but the delays are probably too short on real HW before any
packets are sent or disk gets written. On QEMU, I/O can be
instantaneous.

As an anecdote, while reading a chip errata I came upon this:

15. CPU May Record Signal Glitches When MCH is Being Reset

Problem: When the MCH is reset via RSTIN# the CPU may record any one or 
more of the
following errors: address strobe glitch (MSR IA32_MCi_STATUS bit 23), 
data strobe

glitch (MSR IA32_MCi_STATUS bit 22), P/N data strobes out of sync (MSR
IA32_MCi_STATUS bit 21), PIC & FSB data parity (MSR IA32_MCi_STATUS bit 
19), RSP
parity (MSR IA32_MCi_STATUS bit 18), or FSB address parity (MSR 
IA32_MCi_STATUS
bit 16).  This can happen when the MCH asserts CPURST# just after the 
MCH drives an

FSB transaction.  This may happen because RSTIN# and CPURST# maintain an
asynchronous relationship with each other.

Workaround: None.

Status: No Fix

(of course the situation there is different, there is no global reset 
signal)

--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] build errors on s390 tcg

2011-10-04 Thread Alexander Graf


On 10/04/2011 01:54 PM, Christian Borntraeger wrote:

With todays qemu git I get the following error when compiling for s390:

In file included from /home/cborntra/REPOS/qemu/tcg/tcg.c:175:0:
/home/cborntra/REPOS/qemu/tcg/s390/tcg-target.c:677:13: error: conflicting 
types for ‘tcg_out_mov’
/home/cborntra/REPOS/qemu/tcg/tcg.c:76:13: note: previous declaration of 
‘tcg_out_mov’ was here
/home/cborntra/REPOS/qemu/tcg/s390/tcg-target.c:689:13: error: conflicting 
types for ‘tcg_out_movi’
/home/cborntra/REPOS/qemu/tcg/tcg.c:77:13: note: previous declaration of 
‘tcg_out_movi’ was here
/home/cborntra/REPOS/qemu/tcg/s390/tcg-target.c:829:20: error: conflicting 
types for ‘tcg_out_ld’
/home/cborntra/REPOS/qemu/tcg/tcg.c:74:13: note: previous declaration of 
‘tcg_out_ld’ was here
/home/cborntra/REPOS/qemu/tcg/s390/tcg-target.c:839:20: error: conflicting 
types for ‘tcg_out_st’
/home/cborntra/REPOS/qemu/tcg/tcg.c:81:13: note: previous declaration of 
‘tcg_out_st’ was here
make[1]: *** [tcg/tcg.o] Error 1

reverting c0ad3001bf12292b137b05e1c4643f31c6b0a727 ""tcg: Add forward declarations 
for local functions"
solves the problem, but it just hides the problem. Any ideas howto solve that 
properly?


The problem is that tcg.c defines the functions with int arguments for 
TCG register indexes, while s390/tcg-target.c takes TCGReg parameters. 
I'm not sure which way is better, but using TCGReg feels more type safe 
to me.



Alex

Re: [Qemu-devel] build errors on s390 tcg

2011-10-04 Thread Peter Maydell

On 4 October 2011 13:23, Alexander Graf  wrote:
> The problem is that tcg.c defines the functions with int arguments for TCG
> register indexes, while s390/tcg-target.c takes TCGReg parameters. I'm not
> sure which way is better, but using TCGReg feels more type safe to me.

You're not consistent, though; for example the s390 tcg_target_reg_alloc_order
array is declared as int[], not TCGReg[].

-- PMM

Re: [Qemu-devel] [PATCH 2/2] LAPIC: make lapic support cpu hotplug

2011-10-04 Thread Anthony Liguori


On 10/04/2011 06:13 AM, pingf...@linux.vnet.com wrote:

From: Liu Ping Fan

Separate apic from qbus to icc bus which supports hotplug feature.
And make the creation of apic as part of cpu initialization, so
apic's state has been ready, before setting kvm_apic.

Signed-off-by: Liu Ping Fan
---
  Makefile.target  |1 +
  hw/apic.c|7 -
  hw/apic.h|1 +
  hw/icc_bus.c |   62 ++
  hw/icc_bus.h |   24 +++
  hw/pc.c  |   11 -
  target-i386/cpu.h|1 +
  target-i386/helper.c |7 +-
  target-i386/kvm.c|1 -
  9 files changed, 105 insertions(+), 10 deletions(-)
  create mode 100644 hw/icc_bus.c
  create mode 100644 hw/icc_bus.h

diff --git a/Makefile.target b/Makefile.target
index 9011f28..5607c6d 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -241,6 +241,7 @@ obj-i386-$(CONFIG_KVM) += kvmclock.o
  obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
  obj-i386-y += testdev.o
  obj-i386-y += acpi.o acpi_piix4.o
+obj-i386-y += icc_bus.o

  obj-i386-y += pcspk.o i8254.o
  obj-i386-$(CONFIG_KVM_PIT) += i8254-kvm.o
diff --git a/hw/apic.c b/hw/apic.c
index 69d6ac5..95a1664 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -24,6 +24,7 @@
  #include "sysbus.h"
  #include "trace.h"
  #include "kvm.h"
+#include "icc_bus.h"

  /* APIC Local Vector Table */
  #define APIC_LVT_TIMER   0
@@ -304,11 +305,13 @@ void cpu_set_apic_base(DeviceState *d, uint64_t val)

  if (!s)
  return;
+
  if (kvm_enabled()&&  kvm_irqchip_in_kernel())
  s->apicbase = val;
  else
  s->apicbase = (val&  0xf000) |
  (s->apicbase&  (MSR_IA32_APICBASE_BSP | 
MSR_IA32_APICBASE_ENABLE));
+
  /* if disabled, cannot be enabled again */
  if (!(val&  MSR_IA32_APICBASE_ENABLE)) {
  s->apicbase&= ~MSR_IA32_APICBASE_ENABLE;
@@ -1075,7 +1078,7 @@ static const VMStateDescription vmstate_apic = {
  }
  };


Please don't introduce extra whitespace in unrelated code.



-static void apic_reset(DeviceState *d)
+void apic_reset(DeviceState *d)
  {
  APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
  int bsp;
@@ -1138,7 +1141,7 @@ static SysBusDeviceInfo apic_info = {

  static void apic_register_devices(void)
  {
-sysbus_register_withprop(&apic_info);
+iccbus_register(&apic_info);
  }

  device_init(apic_register_devices)
diff --git a/hw/apic.h b/hw/apic.h
index c857d52..e258efa 100644
--- a/hw/apic.h
+++ b/hw/apic.h
@@ -24,5 +24,6 @@ void apic_sipi(DeviceState *s);
  /* pc.c */
  int cpu_is_bsp(CPUState *env);
  DeviceState *cpu_get_current_apic(void);
+void apic_reset(DeviceState *d);

  #endif
diff --git a/hw/icc_bus.c b/hw/icc_bus.c
new file mode 100644
index 000..360ca2a
--- /dev/null
+++ b/hw/icc_bus.c
@@ -0,0 +1,62 @@
+/*
+*/


New files need to include copyright/licenses.


+#define ICC_BUS_PLUG
+#ifdef ICC_BUS_PLUG


Drop these guards here and at the end of the file.  We conditionally build files 
by having an:


obj-$(CONFIG_ICC_BUS) += icc_bus.o

In the Makefile.


+#include "icc_bus.h"



+
+
+
+struct icc_bus_info icc_info = {
+.qinfo.name = "icc",
+.qinfo.size = sizeof(struct icc_bus),
+.qinfo.props = (Property[]) {
+DEFINE_PROP_END_OF_LIST(),
+}
+
+};


Structure name is not following Coding Style.


+
+static const VMStateDescription vmstate_icc_bus = {
+.name = "icc_bus",
+.version_id = 1,
+.minimum_version_id = 1,
+.minimum_version_id_old = 1,
+.pre_save = NULL,
+.post_load = NULL,
+};
+
+struct icc_bus *g_iccbus;
+
+struct icc_bus *icc_init_bus(DeviceState *parent, const char *name)
+{
+struct icc_bus *bus;
+
+bus = FROM_QBUS(icc_bus, qbus_create(&icc_info.qinfo, parent, name));
+bus->qbus.allow_hotplug = 1; /* Yes, we can */
+bus->qbus.name = "icc";
+vmstate_register(NULL, -1,&vmstate_icc_bus, bus);
+g_iccbus = bus;
+return bus;
+}
+
+
+
+
+
+static int iccbus_device_init(DeviceState *dev, DeviceInfo *base)
+{
+SysBusDeviceInfo *info = container_of(base, SysBusDeviceInfo, qdev);
+
+return info->init(sysbus_from_qdev(dev));
+}
+
+void iccbus_register(SysBusDeviceInfo *info)
+{
+info->qdev.init = iccbus_device_init;
+info->qdev.bus_info =&icc_info.qinfo;
+
+assert(info->qdev.size>= sizeof(SysBusDevice));
+qdev_register(&info->qdev);
+}
+


It's not obvious to me why you need the g_iccbus variable.


+#endif
diff --git a/hw/icc_bus.h b/hw/icc_bus.h
new file mode 100644
index 000..94d9242
--- /dev/null
+++ b/hw/icc_bus.h
@@ -0,0 +1,24 @@
+#ifndef QEMU_ICC_H
+#define QEMU_ICC_H



Needs copyright and a blurb explaining what this file is and what the ICC bus 
is.

Same comments re: whitespace below.

Regards,

Anthony Liguori


+#include "qdev.h"
+#include "sysbus.h"
+
+typedef struct icc_bus icc_bus;
+typedef struct icc_bus_info icc_bus_info;
+
+
+struct icc_bus {
+BusState qbus;
+};
+
+s

[Qemu-devel] [PATCH] runstate: add more valid transitions

2011-10-04 Thread Paolo Bonzini

This patch adds more valid transitions to the table, and avoids
that the VM remains stuck in RSTATE_SAVEVM state when savevm is
done on a paused virtual machine.

Signed-off-by: Paolo Bonzini 
---
 savevm.c |   11 ---
 vl.c |5 +
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/savevm.c b/savevm.c
index 46f2447..ba5beb0 100644
--- a/savevm.c
+++ b/savevm.c
@@ -1599,7 +1599,7 @@ void qemu_savevm_state_cancel(Monitor *mon, QEMUFile *f)
 
 static int qemu_savevm_state(Monitor *mon, QEMUFile *f)
 {
-int saved_vm_running;
+int saved_vm_state;
 int ret;
 
 saved_vm_running = runstate_is_running();
@@ -1626,8 +1626,13 @@ out:
 if (qemu_file_has_error(f))
 ret = -EIO;
 
-if (!ret && saved_vm_running)
-vm_start();
+if (!ret) {
+if (saved_vm_running) {
+vm_start();
+} else {
+runstate_set(RSTATE_PAUSED);
+}
+}
 
 return ret;
 }
diff --git a/vl.c b/vl.c
index bd4a5ce..165712e 100644
--- a/vl.c
+++ b/vl.c
@@ -346,8 +346,12 @@ static const RunStateTransition runstate_transitions_def[] 
= {
 { RSTATE_IO_ERROR, RSTATE_RUNNING },
 
 { RSTATE_PAUSED, RSTATE_RUNNING },
+{ RSTATE_PAUSED, RSTATE_PRE_MIGRATE },
+{ RSTATE_PAUSED, RSTATE_SAVEVM },
 
 { RSTATE_POST_MIGRATE, RSTATE_RUNNING },
+{ RSTATE_POST_MIGRATE, RSTATE_PRE_MIGRATE },
+{ RSTATE_POST_MIGRATE, RSTATE_SAVEVM },
 
 { RSTATE_PRE_LAUNCH, RSTATE_RUNNING },
 { RSTATE_PRE_LAUNCH, RSTATE_POST_MIGRATE },
@@ -368,6 +372,7 @@ static const RunStateTransition runstate_transitions_def[] 
= {
 { RSTATE_RUNNING, RSTATE_WATCHDOG },
 
 { RSTATE_SAVEVM, RSTATE_RUNNING },
+{ RSTATE_SAVEVM, RSTATE_PAUSED },
 
 { RSTATE_SHUTDOWN, RSTATE_PAUSED },
 
-- 
1.7.6

[Qemu-devel] [PATCH v2] runstate: add more valid transitions

2011-10-04 Thread Paolo Bonzini

This patch adds more valid transitions to the table, and avoids
that the VM remains stuck in RSTATE_SAVEVM state when savevm is
done on a paused virtual machine.

Signed-off-by: Paolo Bonzini 
---
 savevm.c |9 +++--
 vl.c |5 +
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/savevm.c b/savevm.c
index 46f2447..ff37968 100644
--- a/savevm.c
+++ b/savevm.c
@@ -1626,8 +1626,13 @@ out:
 if (qemu_file_has_error(f))
 ret = -EIO;
 
-if (!ret && saved_vm_running)
-vm_start();
+if (!ret) {
+if (saved_vm_running) {
+vm_start();
+   } else {
+   runstate_set(RSTATE_PAUSED);
+   }
+}
 
 return ret;
 }
diff --git a/vl.c b/vl.c
index bd4a5ce..165712e 100644
--- a/vl.c
+++ b/vl.c
@@ -346,8 +346,12 @@ static const RunStateTransition runstate_transitions_def[] 
= {
 { RSTATE_IO_ERROR, RSTATE_RUNNING },
 
 { RSTATE_PAUSED, RSTATE_RUNNING },
+{ RSTATE_PAUSED, RSTATE_PRE_MIGRATE },
+{ RSTATE_PAUSED, RSTATE_SAVEVM },
 
 { RSTATE_POST_MIGRATE, RSTATE_RUNNING },
+{ RSTATE_POST_MIGRATE, RSTATE_PRE_MIGRATE },
+{ RSTATE_POST_MIGRATE, RSTATE_SAVEVM },
 
 { RSTATE_PRE_LAUNCH, RSTATE_RUNNING },
 { RSTATE_PRE_LAUNCH, RSTATE_POST_MIGRATE },
@@ -368,6 +372,7 @@ static const RunStateTransition runstate_transitions_def[] 
= {
 { RSTATE_RUNNING, RSTATE_WATCHDOG },
 
 { RSTATE_SAVEVM, RSTATE_RUNNING },
+{ RSTATE_SAVEVM, RSTATE_PAUSED },
 
 { RSTATE_SHUTDOWN, RSTATE_PAUSED },
 
-- 
1.7.6

Re: [Qemu-devel] KVM call agenda for October 4th

2011-10-04 Thread Juan Quintela

Juan Quintela  wrote:
> Hi
>
> Please send in any agenda items you are interested in covering.

As there is no agenda, this week call gets cancelled.

Happy hacking, Juan.

Re: [Qemu-devel] s390: Fix cpu shutdown for KVM

2011-10-04 Thread Christian Borntraeger

Something like the following? 

s390: Fix cpu shutdown for KVM

On s390 a shutdown is the state of all CPUs being either stopped
or disabled (for interrupts) waiting. We have to track this number
to call the shutdown sequence accordingly. This patch implements
the counting and shutdown handling for the kvm path in qemu.

Signed-off-by: Christian Borntraeger 
---
 hw/s390-virtio.c   |   21 +
 target-s390x/cpu.h |   12 
 target-s390x/kvm.c |   21 +
 3 files changed, 50 insertions(+), 4 deletions(-)

Index: b/hw/s390-virtio.c
===
--- a/hw/s390-virtio.c
+++ b/hw/s390-virtio.c
@@ -130,6 +130,26 @@ int s390_virtio_hypercall(CPUState *env,
 return r;
 }
 
+/*
+ * The number of running CPUs. On s390 a shutdown is the state of all CPUs
+ * being either stopped or disabled (for interrupts) waiting. We have to
+ * track this number to call the shutdown sequence accordingly. This
+ * number is modified either on startup or while holding the big qemu lock.
+ */
+static unsigned s390_running_cpus;
+
+void kvm_s390_add_running_cpu(CPUState *env)
+{
+assert(env->halted == 0);
+s390_running_cpus++;
+}
+
+unsigned kvm_s390_del_running_cpu(CPUState *env)
+{
+assert(s390_running_cpus >= 1);
+return --s390_running_cpus;
+}
+
 /* PC hardware initialisation */
 static void s390_init(ram_addr_t my_ram_size,
   const char *boot_device,
@@ -189,6 +209,7 @@ static void s390_init(ram_addr_t my_ram_
 
 env->halted = 0;
 env->exception_index = 0;
+kvm_s390_add_running_cpu(env);
 
 if (kernel_filename) {
 kernel_size = load_image(kernel_filename, qemu_get_ram_ptr(0));
Index: b/target-s390x/cpu.h
===
--- a/target-s390x/cpu.h
+++ b/target-s390x/cpu.h
@@ -292,6 +292,8 @@ void kvm_s390_interrupt(CPUState *env, i
 void kvm_s390_virtio_irq(CPUState *env, int config_change, uint64_t token);
 void kvm_s390_interrupt_internal(CPUState *env, int type, uint32_t parm,
  uint64_t parm64, int vm);
+void kvm_s390_add_running_cpu(CPUState *env);
+unsigned kvm_s390_del_running_cpu(CPUState *env);
 #else
 static inline void kvm_s390_interrupt(CPUState *env, int type, uint32_t code)
 {
@@ -307,6 +309,16 @@ static inline void kvm_s390_interrupt_in
int vm)
 {
 }
+
+static inline void kvm_s390_add_running_cpu(CPUState *env)
+{
+}
+
+static inline unsigned kvm_s390_del_running_cpu(CPUState *env)
+{
+return 0;
+}
+
 #endif
 CPUState *s390_cpu_addr2state(uint16_t cpu_addr);
 
Index: b/target-s390x/kvm.c
===
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -185,6 +185,12 @@ void kvm_s390_interrupt_internal(CPUStat
 return;
 }
 
+/*
+ * We can only deliver interrupts to (interrupt) enabled CPUs.
+ * We dont call kvm_s390_add_running_cpu here, since CPUs in enabled wait
+ * will wait inside the kernel (no exit). Therefore, the targeted
+ * CPUs was neither disabled waiting or stopped for qemu.
+ */
 env->halted = 0;
 env->exception_index = -1;
 qemu_cpu_kick(env);
@@ -301,6 +307,7 @@ static int s390_cpu_restart(CPUState *en
 kvm_s390_interrupt(env, KVM_S390_RESTART, 0);
 env->halted = 0;
 env->exception_index = -1;
+kvm_s390_add_running_cpu(env);
 qemu_cpu_kick(env);
 dprintf("DONE: SIGP cpu restart: %p\n", env);
 return 0;
@@ -425,16 +432,24 @@ static int handle_intercept(CPUState *en
 r = handle_instruction(env, run);
 break;
 case ICPT_WAITPSW:
-/* XXX What to do on system shutdown? */
+if (kvm_s390_del_running_cpu(env) == 0) {
+qemu_system_shutdown_request();
+}
 env->halted = 1;
 env->exception_index = EXCP_HLT;
+r = EXCP_HALTED;
 break;
 case ICPT_SOFT_INTERCEPT:
 fprintf(stderr, "KVM unimplemented icpt SOFT\n");
 exit(1);
 break;
 case ICPT_CPU_STOP:
-qemu_system_shutdown_request();
+if (kvm_s390_del_running_cpu(env) == 0) {
+qemu_system_shutdown_request();
+}
+env->halted = 1;
+env->exception_index = EXCP_HLT;
+r = EXCP_HALTED;
 break;
 case ICPT_IO:
 fprintf(stderr, "KVM unimplemented icpt IO\n");
@@ -468,8 +483,6 @@ int kvm_arch_handle_exit(CPUState *env, 
 
 if (ret == 0) {
 ret = EXCP_INTERRUPT;
-} else if (ret > 0) {
-ret = 0;
 }
 return ret;
 }

Re: [Qemu-devel] [PATCH] runstate: do not discard runstate changes when paused

2011-10-04 Thread Luiz Capitulino

On Tue,  4 Oct 2011 14:04:45 +0200
Paolo Bonzini  wrote:

> Trying to migrate a paused machine fails.  The reason is that
> the RSTATE_PRE_MIGRATE is reached with vm_stop, and this
> transition is eaten when the vm is already paused.  This patch
> fixes the problem by always going through runstate_set and
> always notifying the new state.

There's a semantic change which I'm not completely sure it won't generate
unexpected side-effects: today vm_stop() will only carry any action if the
machine is running, otherwise it's no-op. This patch changes that.

We probably can fix it by adding a new transition to the transition table
too...

> 
> Signed-off-by: Paolo Bonzini 
> ---
>  cpus.c |2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/cpus.c b/cpus.c
> index 8978779..eab8ff6 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -128,6 +128,8 @@ static void do_vm_stop(RunState state)
>  qemu_aio_flush();
>  bdrv_flush_all();
>  monitor_protocol_event(QEVENT_STOP, NULL);
> +} else {
> +runstate_set(state);
>  }
>  }
>

Re: [Qemu-devel] [PATCH 0/1] Make the help info more friendly in monitor

2011-10-04 Thread Luiz Capitulino

On Wed, 28 Sep 2011 10:16:19 +0100
Stefan Hajnoczi  wrote:

> On Wed, Sep 28, 2011 at 10:00 AM, Wayne Xia  
> wrote:
> > During my test, I found it inconvenient when I type "help" or "help info",
> > because the information was shown without orderliness. This patch would just
> > show the help information in sorted order.
> >
> > For eg:
> > (qemu)help
> > acl_add
> > acl_policy
> > acl_remove
> > acl_reset
> > acl_show
> > balloon
> > block_passwd
> > ...
> > the command list is sorted.
> >
> > Wayne Xia (1):
> >  Sort the help info shown in monitor
> >
> >  monitor.c |   97 
> > ++--
> >  1 files changed, 93 insertions(+), 4 deletions(-)
> 
> This is a nice idea.  We could keep hmp/qmp-commands.hx in sorted
> order but that prevents us from keeping related commands together in
> those files (and the generated documentation?).  So sorting at
> run-time makes sense.

The info help command reads from hmp-commands.hx and the info_cmds array,
I would prefer to get those sorted.

Re: [Qemu-devel] s390: Fix cpu shutdown for KVM

2011-10-04 Thread Peter Maydell

On 4 October 2011 14:47, Christian Borntraeger  wrote:
> +     * We can only deliver interrupts to (interrupt) enabled CPUs.
> +     * We dont call kvm_s390_add_running_cpu here, since CPUs in enabled wait
> +     * will wait inside the kernel (no exit). Therefore, the targeted
> +     * CPUs was neither disabled waiting or stopped for qemu.

Grammar nits, since I'm commenting anyway:
 "don't"
 "neither disabled waiting nor stopped"

> @@ -425,16 +432,24 @@ static int handle_intercept(CPUState *en
>             r = handle_instruction(env, run);
>             break;
>         case ICPT_WAITPSW:
> -            /* XXX What to do on system shutdown? */
> +            if (kvm_s390_del_running_cpu(env) == 0) {
> +                qemu_system_shutdown_request();
> +            }
>             env->halted = 1;
>             env->exception_index = EXCP_HLT;
> +            r = EXCP_HALTED;
>             break;
>         case ICPT_SOFT_INTERCEPT:
>             fprintf(stderr, "KVM unimplemented icpt SOFT\n");
>             exit(1);
>             break;
>         case ICPT_CPU_STOP:
> -            qemu_system_shutdown_request();
> +            if (kvm_s390_del_running_cpu(env) == 0) {
> +                qemu_system_shutdown_request();
> +            }
> +            env->halted = 1;
> +            env->exception_index = EXCP_HLT;
> +            r = EXCP_HALTED;
>             break;
>         case ICPT_IO:
>             fprintf(stderr, "KVM unimplemented icpt IO\n");

This makes the ICPT_CPU_STOP and ICPT_WAITPSW cases identical,
right? You should just fold them together.

-- PMM

Re: [Qemu-devel] [PATCH] runstate: do not discard runstate changes when paused

2011-10-04 Thread Paolo Bonzini


On 10/04/2011 03:49 PM, Luiz Capitulino wrote:

There's a semantic change which I'm not completely sure it won't generate
unexpected side-effects: today vm_stop() will only carry any action if the
machine is running, otherwise it's no-op. This patch changes that.


More or less, yes.  I tried to limit the semantic change by not running 
notifiers, which again could be better or worse.


I don't think adding a new transition is a good solution, because you'll 
have to add a transition from PAUSED to anything that uses runstate_set 
instead of vm_stop.


However, you could change all vm_stop() to vm_stop(RSTATE_PAUSED) 
followed by runstate_set(), adjust the transition table consequently and 
possibly drop the argument to vm_stop.  I tried to get the smallest 
patch, but I did need to follow-up with changes to the transition table.


In any case, can I assume this to be in your hands now? :)

Paolo

Re: [Qemu-devel] s390: Fix cpu shutdown for KVM

2011-10-04 Thread Christian Borntraeger

On 04/10/11 15:56, Peter Maydell wrote:
> On 4 October 2011 14:47, Christian Borntraeger  wrote:
>> + * We can only deliver interrupts to (interrupt) enabled CPUs.
>> + * We dont call kvm_s390_add_running_cpu here, since CPUs in enabled 
>> wait
>> + * will wait inside the kernel (no exit). Therefore, the targeted
>> + * CPUs was neither disabled waiting or stopped for qemu.
> 
> Grammar nits, since I'm commenting anyway:
>  "don't"
>  "neither disabled waiting nor stopped"

Ok. Alex can you fix that up or do you want a new patch?

> This makes the ICPT_CPU_STOP and ICPT_WAITPSW cases identical,
> right? You should just fold them together.

Yes, at the moment they are identical.
I am still thinking about some additional changes that will make them separate 
again
due to their usage in Linux:
- disabled wait usually indicates a kernel panic
- stop is called during cpu hot unplug and during shutdown for the last cpu

So on disabled wait we might want to perform extra logging etc, but I dont know 
yet.
Should I merge them anyway?

Christian

Re: [Qemu-devel] [PATCH 05/23] migration: Refactor MigrationState creation

2011-10-04 Thread Anthony Liguori


On 09/23/2011 07:56 AM, Juan Quintela wrote:

Signed-off-by: Juan Quintela
---
  migration-exec.c |   16 +---
  migration-fd.c   |   16 +---
  migration-tcp.c  |   15 +--
  migration-unix.c |   15 +--
  migration.c  |   29 +
  migration.h  |   11 +++
  6 files changed, 32 insertions(+), 70 deletions(-)

diff --git a/migration-exec.c b/migration-exec.c
index 0ed5976..db11374 100644
--- a/migration-exec.c
+++ b/migration-exec.c
@@ -71,7 +71,7 @@ MigrationState *exec_start_outgoing_migration(Monitor *mon,
  MigrationState *s;
  FILE *f;

-s = g_malloc0(sizeof(*s));
+s = migrate_create_state(mon, bandwidth_limit, detach, blk, inc);

  f = popen(command, "w");
  if (f == NULL) {
@@ -92,20 +92,6 @@ MigrationState *exec_start_outgoing_migration(Monitor *mon,
  s->close = exec_close;
  s->get_error = file_errno;
  s->write = file_write;
-s->cancel = migrate_fd_cancel;
-s->get_status = migrate_fd_get_status;
-s->release = migrate_fd_release;
-
-s->blk = blk;
-s->shared = inc;
-
-s->state = MIG_STATE_ACTIVE;
-s->mon = NULL;
-s->bandwidth_limit = bandwidth_limit;
-
-if (!detach) {
-migrate_fd_monitor_suspend(s, mon);
-}

  migrate_fd_connect(s);
  return s;
diff --git a/migration-fd.c b/migration-fd.c
index e78fd4e..2235a2f 100644
--- a/migration-fd.c
+++ b/migration-fd.c
@@ -59,7 +59,7 @@ MigrationState *fd_start_outgoing_migration(Monitor *mon,
  {
  MigrationState *s;

-s = g_malloc0(sizeof(*s));
+s = migrate_create_state(mon, bandwidth_limit, detach, blk, inc);

  s->fd = monitor_get_fd(mon, fdname);
  if (s->fd == -1) {
@@ -75,20 +75,6 @@ MigrationState *fd_start_outgoing_migration(Monitor *mon,
  s->get_error = fd_errno;
  s->write = fd_write;
  s->close = fd_close;
-s->cancel = migrate_fd_cancel;
-s->get_status = migrate_fd_get_status;
-s->release = migrate_fd_release;
-
-s->blk = blk;
-s->shared = inc;
-
-s->state = MIG_STATE_ACTIVE;
-s->mon = NULL;
-s->bandwidth_limit = bandwidth_limit;
-
-if (!detach) {
-migrate_fd_monitor_suspend(s, mon);
-}

  migrate_fd_connect(s);
  return s;
diff --git a/migration-tcp.c b/migration-tcp.c
index d6feb23..db8cea2 100644
--- a/migration-tcp.c
+++ b/migration-tcp.c
@@ -89,21 +89,12 @@ MigrationState *tcp_start_outgoing_migration(Monitor *mon,
  if (parse_host_port(&addr, host_port)<  0)
  return NULL;

-s = g_malloc0(sizeof(*s));
+s = migrate_create_state(mon, bandwidth_limit, detach, blk, inc);

  s->get_error = socket_errno;
  s->write = socket_write;
  s->close = tcp_close;
-s->cancel = migrate_fd_cancel;
-s->get_status = migrate_fd_get_status;
-s->release = migrate_fd_release;

-s->blk = blk;
-s->shared = inc;
-
-s->state = MIG_STATE_ACTIVE;
-s->mon = NULL;
-s->bandwidth_limit = bandwidth_limit;
  s->fd = qemu_socket(PF_INET, SOCK_STREAM, 0);
  if (s->fd == -1) {
  g_free(s);
@@ -112,10 +103,6 @@ MigrationState *tcp_start_outgoing_migration(Monitor *mon,

  socket_set_nonblock(s->fd);

-if (!detach) {
-migrate_fd_monitor_suspend(s, mon);
-}
-
  do {
  ret = connect(s->fd, (struct sockaddr *)&addr, sizeof(addr));
  if (ret == -1)
diff --git a/migration-unix.c b/migration-unix.c
index 3b9017b..74c6dde 100644
--- a/migration-unix.c
+++ b/migration-unix.c
@@ -88,21 +88,12 @@ MigrationState *unix_start_outgoing_migration(Monitor *mon,
  addr.sun_family = AF_UNIX;
  snprintf(addr.sun_path, sizeof(addr.sun_path), "%s", path);

-s = g_malloc0(sizeof(*s));
+s = migrate_create_state(mon, bandwidth_limit, detach, blk, inc);

  s->get_error = unix_errno;
  s->write = unix_write;
  s->close = unix_close;
-s->cancel = migrate_fd_cancel;
-s->get_status = migrate_fd_get_status;
-s->release = migrate_fd_release;

-s->blk = blk;
-s->shared = inc;
-
-s->state = MIG_STATE_ACTIVE;
-s->mon = NULL;
-s->bandwidth_limit = bandwidth_limit;
  s->fd = qemu_socket(PF_UNIX, SOCK_STREAM, 0);
  if (s->fd<  0) {
  DPRINTF("Unable to open socket");
@@ -125,10 +116,6 @@ MigrationState *unix_start_outgoing_migration(Monitor *mon,
  goto err_after_open;
  }

-if (!detach) {
-migrate_fd_monitor_suspend(s, mon);
-}
-
  if (ret>= 0)
  migrate_fd_connect(s);

diff --git a/migration.c b/migration.c
index b27a322..990667e 100644
--- a/migration.c
+++ b/migration.c
@@ -263,7 +263,7 @@ void do_info_migrate(Monitor *mon, QObject **ret_data)

  /* shared migration helpers */

-void migrate_fd_monitor_suspend(MigrationState *s, Monitor *mon)
+static void migrate_fd_monitor_suspend(MigrationState *s, Monitor *mon)
  {
  s->mon = mon;
  if (monitor_suspend(mon) == 0) {
@@ -404,12 +404,12 @@ void migrate_fd_put_r

Re: [Qemu-devel] [PATCH 14/23] migration: Move exported functions to the end of the file

2011-10-04 Thread Anthony Liguori


On 09/23/2011 07:57 AM, Juan Quintela wrote:

This means we can remove the two forward declarations.

Signed-off-by: Juan Quintela


Reviewed-by: Anthony Liguori 

Regards,

Anthony Liguori


---
  migration.c |  188 --
  1 files changed, 91 insertions(+), 97 deletions(-)

diff --git a/migration.c b/migration.c
index 9bc7ffa..9bb089a 100644
--- a/migration.c
+++ b/migration.c
@@ -77,91 +77,6 @@ void process_incoming_migration(QEMUFile *f)
  }
  }

-static MigrationState *migrate_create_state(Monitor *mon,
-int64_t bandwidth_limit,
-int detach, int blk, int inc);
-
-int do_migrate(Monitor *mon, const QDict *qdict, QObject **ret_data)
-{
-MigrationState *s = NULL;
-const char *p;
-int detach = qdict_get_try_bool(qdict, "detach", 0);
-int blk = qdict_get_try_bool(qdict, "blk", 0);
-int inc = qdict_get_try_bool(qdict, "inc", 0);
-const char *uri = qdict_get_str(qdict, "uri");
-int ret;
-
-if (current_migration&&
-current_migration->state == MIG_STATE_ACTIVE) {
-monitor_printf(mon, "migration already in progress\n");
-return -1;
-}
-
-if (qemu_savevm_state_blocked(mon)) {
-return -1;
-}
-
-s = migrate_create_state(mon, max_throttle, detach, blk, inc);
-
-if (strstart(uri, "tcp:",&p)) {
-ret = tcp_start_outgoing_migration(s, p);
-#if !defined(WIN32)
-} else if (strstart(uri, "exec:",&p)) {
-ret = exec_start_outgoing_migration(s, p);
-} else if (strstart(uri, "unix:",&p)) {
-ret = unix_start_outgoing_migration(s, p);
-} else if (strstart(uri, "fd:",&p)) {
-ret = fd_start_outgoing_migration(s, p);
-#endif
-} else {
-monitor_printf(mon, "unknown migration protocol: %s\n", uri);
-ret  = -EINVAL;
-goto free_migrate_state;
-}
-
-if (ret<  0) {
-monitor_printf(mon, "migration failed\n");
-goto free_migrate_state;
-}
-
-g_free(current_migration);
-current_migration = s;
-notifier_list_notify(&migration_state_notifiers, NULL);
-return 0;
-free_migrate_state:
-g_free(s);
-return -1;
-}
-
-static void migrate_fd_cancel(MigrationState *s);
-
-int do_migrate_cancel(Monitor *mon, const QDict *qdict, QObject **ret_data)
-{
-if (current_migration) {
-migrate_fd_cancel(current_migration);
-}
-return 0;
-}
-
-int do_migrate_set_speed(Monitor *mon, const QDict *qdict, QObject **ret_data)
-{
-int64_t d;
-MigrationState *s;
-
-d = qdict_get_int(qdict, "value");
-if (d<  0) {
-d = 0;
-}
-max_throttle = d;
-
-s = current_migration;
-if (s&&  s->file) {
-qemu_file_set_rate_limit(s->file, max_throttle);
-}
-
-return 0;
-}
-
  /* amount of nanoseconds we are willing to wait for migration to be down.
   * the choice of nanoseconds is because it is the maximum resolution that
   * get_clock() can achieve. It is an internal measure. All user-visible
@@ -173,18 +88,6 @@ uint64_t migrate_max_downtime(void)
  return max_downtime;
  }

-int do_migrate_set_downtime(Monitor *mon, const QDict *qdict,
-QObject **ret_data)
-{
-double d;
-
-d = qdict_get_double(qdict, "value") * 1e9;
-d = MAX(0, MIN(UINT64_MAX, d));
-max_downtime = (uint64_t)d;
-
-return 0;
-}
-
  static void migrate_print_status(Monitor *mon, const char *name,
   const QDict *status_dict)
  {
@@ -502,3 +405,94 @@ static MigrationState *migrate_create_state(Monitor *mon,

  return s;
  }
+
+int do_migrate(Monitor *mon, const QDict *qdict, QObject **ret_data)
+{
+MigrationState *s = NULL;
+const char *p;
+int detach = qdict_get_try_bool(qdict, "detach", 0);
+int blk = qdict_get_try_bool(qdict, "blk", 0);
+int inc = qdict_get_try_bool(qdict, "inc", 0);
+const char *uri = qdict_get_str(qdict, "uri");
+int ret;
+
+if (current_migration&&
+current_migration->state == MIG_STATE_ACTIVE) {
+monitor_printf(mon, "migration already in progress\n");
+return -1;
+}
+
+if (qemu_savevm_state_blocked(mon)) {
+return -1;
+}
+
+s = migrate_create_state(mon, max_throttle, detach, blk, inc);
+
+if (strstart(uri, "tcp:",&p)) {
+ret = tcp_start_outgoing_migration(s, p);
+#if !defined(WIN32)
+} else if (strstart(uri, "exec:",&p)) {
+ret = exec_start_outgoing_migration(s, p);
+} else if (strstart(uri, "unix:",&p)) {
+ret = unix_start_outgoing_migration(s, p);
+} else if (strstart(uri, "fd:",&p)) {
+ret = fd_start_outgoing_migration(s, p);
+#endif
+} else {
+monitor_printf(mon, "unknown migration protocol: %s\n", uri);
+ret  = -EINVAL;
+goto free_migrate_state;
+}
+
+if (ret<  0) {
+monitor_printf(mon, "migration failed\n");
+

[Qemu-devel] [PATCH] i386: wire up MSR_IA32_MISC_ENABLE

2011-10-04 Thread Avi Kivity

It's needed for its default value - bit 0 specifies that "rep movs" is
good enough for memcpy, and Linux may use a slower memcpu if it is not set,
depending on cpu family/model.

Signed-off-by: Avi Kivity 
---
 target-i386/cpu.h   |5 +
 target-i386/helper.c|1 +
 target-i386/kvm.c   |   15 +++
 target-i386/machine.c   |   21 +
 target-i386/op_helper.c |6 ++
 5 files changed, 48 insertions(+), 0 deletions(-)

diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index ae36489..5416809 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -299,6 +299,10 @@
 
 #define MSR_IA32_PERF_STATUS0x198
 
+#define MSR_IA32_MISC_ENABLE   0x1a0
+/* Indicates good rep/movs microcode on some processors: */
+#define MSR_IA32_MISC_ENABLE_DEFAULT1
+
 #define MSR_MTRRphysBase(reg)  (0x200 + 2 * (reg))
 #define MSR_MTRRphysMask(reg)  (0x200 + 2 * (reg) + 1)
 
@@ -689,6 +693,7 @@ typedef struct CPUX86State {
 uint64_t tsc;
 
 uint64_t mcg_status;
+uint64_t msr_ia32_misc_enable;
 
 /* exception/interrupt handling */
 int error_code;
diff --git a/target-i386/helper.c b/target-i386/helper.c
index 5df40d4..6c6a167 100644
--- a/target-i386/helper.c
+++ b/target-i386/helper.c
@@ -98,6 +98,7 @@ void cpu_reset(CPUX86State *env)
 env->mxcsr = 0x1f80;
 
 env->pat = 0x0007040600070406ULL;
+env->msr_ia32_misc_enable = MSR_IA32_MISC_ENABLE_DEFAULT;
 
 memset(env->dr, 0, sizeof(env->dr));
 env->dr[6] = DR6_FIXED_1;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index b6eef04..b3a650b 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -60,6 +60,7 @@
 static bool has_msr_star;
 static bool has_msr_hsave_pa;
 static bool has_msr_async_pf_en;
+static bool has_msr_misc_enable;
 static int lm_capable_kernel;
 
 static struct kvm_cpuid2 *try_get_cpuid(KVMState *s, int max)
@@ -568,6 +569,10 @@ static int kvm_get_supported_msrs(KVMState *s)
 has_msr_hsave_pa = true;
 continue;
 }
+if (kvm_msr_list->indices[i] == MSR_IA32_MISC_ENABLE) {
+has_msr_misc_enable = true;
+continue;
+}
 }
 }
 
@@ -881,6 +886,10 @@ static int kvm_put_msrs(CPUState *env, int level)
 if (has_msr_hsave_pa) {
 kvm_msr_entry_set(&msrs[n++], MSR_VM_HSAVE_PA, env->vm_hsave);
 }
+if (has_msr_misc_enable) {
+kvm_msr_entry_set(&msrs[n++], MSR_IA32_MISC_ENABLE,
+  env->msr_ia32_misc_enable);
+}
 #ifdef TARGET_X86_64
 if (lm_capable_kernel) {
 kvm_msr_entry_set(&msrs[n++], MSR_CSTAR, env->cstar);
@@ -1127,6 +1136,9 @@ static int kvm_get_msrs(CPUState *env)
 if (has_msr_hsave_pa) {
 msrs[n++].index = MSR_VM_HSAVE_PA;
 }
+if (has_msr_misc_enable) {
+msrs[n++].index = MSR_IA32_MISC_ENABLE;
+}
 
 if (!env->tsc_valid) {
 msrs[n++].index = MSR_IA32_TSC;
@@ -1210,6 +1222,9 @@ static int kvm_get_msrs(CPUState *env)
 case MSR_MCG_CTL:
 env->mcg_ctl = msrs[i].data;
 break;
+case MSR_IA32_MISC_ENABLE:
+env->msr_ia32_misc_enable = msrs[i].data;
+break;
 default:
 if (msrs[i].index >= MSR_MC0_CTL &&
 msrs[i].index < MSR_MC0_CTL + (env->mcg_cap & 0xff) * 4) {
diff --git a/target-i386/machine.c b/target-i386/machine.c
index 9aca8e0..73c1ffe 100644
--- a/target-i386/machine.c
+++ b/target-i386/machine.c
@@ -310,6 +310,24 @@ static bool fpop_ip_dp_needed(void *opaque)
 }
 };
 
+static bool misc_enable_needed(void *opaque)
+{
+CPUState *env = opaque;
+
+return env->msr_ia32_misc_enable != MSR_IA32_MISC_ENABLE_DEFAULT;
+}
+
+static const VMStateDescription vmstate_msr_ia32_misc_enable = {
+.name = "cpu/msr_ia32_misc_enable",
+.version_id = 1,
+.minimum_version_id = 1,
+.minimum_version_id_old = 1,
+.fields  = (VMStateField []) {
+VMSTATE_UINT64(msr_ia32_misc_enable, CPUState),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static const VMStateDescription vmstate_cpu = {
 .name = "cpu",
 .version_id = CPU_SAVE_VERSION,
@@ -420,6 +438,9 @@ static bool fpop_ip_dp_needed(void *opaque)
 } , {
 .vmsd = &vmstate_fpop_ip_dp,
 .needed = fpop_ip_dp_needed,
+}, {
+.vmsd = &vmstate_msr_ia32_misc_enable,
+.needed = misc_enable_needed,
 } , {
 /* empty */
 }
diff --git a/target-i386/op_helper.c b/target-i386/op_helper.c
index 3bb5a91..c89e4a4 100644
--- a/target-i386/op_helper.c
+++ b/target-i386/op_helper.c
@@ -3280,6 +3280,9 @@ void helper_wrmsr(void)
 case MSR_TSC_AUX:
 env->tsc_aux = val;
 break;
+case MSR_IA32_MISC_ENABLE:
+env->msr_ia32_misc_enable = val;
+break;
 default:
 if ((uint32_t)ECX >= MSR_MC0_CTL

Re: [Qemu-devel] [PATCH 15/23] migration: use global variable directly

2011-10-04 Thread Anthony Liguori


On 09/23/2011 07:57 AM, Juan Quintela wrote:

We are setting a pointer to a local variable in the previous line, just use
the global variable directly.  We remove the ->file test because it is already
done inside qemu_file_set_rate_limit() function.

Signed-off-by: Juan Quintela
---
  migration.c |6 ++
  1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/migration.c b/migration.c
index 9bb089a..d5e0eb0 100644
--- a/migration.c
+++ b/migration.c
@@ -469,7 +469,6 @@ int do_migrate_cancel(Monitor *mon, const QDict *qdict, 
QObject **ret_data)
  int do_migrate_set_speed(Monitor *mon, const QDict *qdict, QObject **ret_data)
  {
  int64_t d;
-MigrationState *s;

  d = qdict_get_int(qdict, "value");
  if (d<  0) {
@@ -477,9 +476,8 @@ int do_migrate_set_speed(Monitor *mon, const QDict *qdict, 
QObject **ret_data)
  }
  max_throttle = d;

-s = current_migration;
-if (s&&  s->file) {
-qemu_file_set_rate_limit(s->file, max_throttle);
+if (current_migration) {
+qemu_file_set_rate_limit(current_migration->file, max_throttle);
  }


How about we compromise and add a:

MigrationState *migrate_get_current(void);

I'm strongly opposed to propagating direct usage of a global.  If it's at least 
a function call, that's a bit nicer.


Regards,

Anthony Liguori



  return 0;

Re: [Qemu-devel] [PATCH 17/23] migration: make sure we always have a migration state

2011-10-04 Thread Anthony Liguori


On 09/23/2011 07:57 AM, Juan Quintela wrote:

This cleans up a lot the code as we don't have to check anymore if
the variable is NULL or not.

We don't make it static, because when we integrate fault tolerance, we
can have several migrations at once.

Signed-off-by: Juan Quintela
---
  migration.c |  126 +-
  1 files changed, 54 insertions(+), 72 deletions(-)

diff --git a/migration.c b/migration.c
index 1cc21f0..c382383 100644
--- a/migration.c
+++ b/migration.c
@@ -34,7 +34,13 @@
  /* Migration speed throttling */
  static int64_t max_throttle = (32<<  20);

-static MigrationState *current_migration;
+/* When we add fault tolerance, we could have several
+   migrations at once.  For now we don't need to add
+   dynamic creation of migration */
+static MigrationState current_migration_static = {
+.state = MIG_STATE_NONE,
+};
+static MigrationState *current_migration =¤t_migration_static;


This doesn't make sense to me.  We can have two migration states that are both 
in the MIG_STATE_NONE?  What does that actually mean?


I don't see the point in making migration state nullable via the state member 
other than avoiding the if (s == NULL) check.


Regards,

Anthony Liguori



  static NotifierList migration_state_notifiers =
  NOTIFIER_LIST_INITIALIZER(migration_state_notifiers);
@@ -136,37 +142,34 @@ void do_info_migrate(Monitor *mon, QObject **ret_data)
  {
  QDict *qdict;

-if (current_migration) {
-
-switch (current_migration->state) {
-case MIG_STATE_NONE:
-/* no migration has happened ever */
-break;
-case MIG_STATE_ACTIVE:
-qdict = qdict_new();
-qdict_put(qdict, "status", qstring_from_str("active"));
-
-migrate_put_status(qdict, "ram", ram_bytes_transferred(),
-   ram_bytes_remaining(), ram_bytes_total());
-
-if (blk_mig_active()) {
-migrate_put_status(qdict, "disk", blk_mig_bytes_transferred(),
-   blk_mig_bytes_remaining(),
-   blk_mig_bytes_total());
-}
-
-*ret_data = QOBJECT(qdict);
-break;
-case MIG_STATE_COMPLETED:
-*ret_data = qobject_from_jsonf("{ 'status': 'completed' }");
-break;
-case MIG_STATE_ERROR:
-*ret_data = qobject_from_jsonf("{ 'status': 'failed' }");
-break;
-case MIG_STATE_CANCELLED:
-*ret_data = qobject_from_jsonf("{ 'status': 'cancelled' }");
-break;
+switch (current_migration->state) {
+case MIG_STATE_NONE:
+/* no migration has happened ever */
+break;
+case MIG_STATE_ACTIVE:
+qdict = qdict_new();
+qdict_put(qdict, "status", qstring_from_str("active"));
+
+migrate_put_status(qdict, "ram", ram_bytes_transferred(),
+   ram_bytes_remaining(), ram_bytes_total());
+
+if (blk_mig_active()) {
+migrate_put_status(qdict, "disk", blk_mig_bytes_transferred(),
+   blk_mig_bytes_remaining(),
+   blk_mig_bytes_total());
  }
+
+*ret_data = QOBJECT(qdict);
+break;
+case MIG_STATE_COMPLETED:
+*ret_data = qobject_from_jsonf("{ 'status': 'completed' }");
+break;
+case MIG_STATE_ERROR:
+*ret_data = qobject_from_jsonf("{ 'status': 'failed' }");
+break;
+case MIG_STATE_CANCELLED:
+*ret_data = qobject_from_jsonf("{ 'status': 'cancelled' }");
+break;
  }
  }

@@ -357,11 +360,7 @@ void remove_migration_state_change_notifier(Notifier 
*notify)

  int get_migration_state(void)
  {
-if (current_migration) {
-return current_migration->state;
-} else {
-return MIG_STATE_ERROR;
-}
+return current_migration->state;
  }

  void migrate_fd_connect(MigrationState *s)
@@ -386,28 +385,23 @@ void migrate_fd_connect(MigrationState *s)
  migrate_fd_put_ready(s);
  }

-static MigrationState *migrate_create_state(Monitor *mon,
-int64_t bandwidth_limit,
-int detach, int blk, int inc)
+static void migrate_init_state(Monitor *mon, int64_t bandwidth_limit,
+   int detach, int blk, int inc)
  {
-MigrationState *s = g_malloc0(sizeof(*s));
-
-s->blk = blk;
-s->shared = inc;
-s->mon = NULL;
-s->bandwidth_limit = bandwidth_limit;
-s->state = MIG_STATE_NONE;
+memset(current_migration, 0, sizeof(current_migration));
+current_migration->blk = blk;
+current_migration->shared = inc;
+current_migration->mon = NULL;
+current_migration->bandwidth_limit = bandwidth_limit;
+current_migration->state = MIG_STATE_NONE;

  if (!detach) {
-migrate_fd_monitor_suspend(s, mon);
+migrate_fd_

Re: [Qemu-devel] [PATCH 18/23] migration: Use bandwidth_limit directly

2011-10-04 Thread Anthony Liguori


On 09/23/2011 07:57 AM, Juan Quintela wrote:

Now that current_migration always exist, there is no reason for
max_throotle variable.

Signed-off-by: Juan Quintela


If we can have multiple MigrationStates, this doesn't really make sense.

Regards,

Anthony Liguori


---
  migration.c |   19 ++-
  1 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/migration.c b/migration.c
index c382383..ea50a6f 100644
--- a/migration.c
+++ b/migration.c
@@ -31,14 +31,14 @@
  do { } while (0)
  #endif

-/* Migration speed throttling */
-static int64_t max_throttle = (32<<  20);
+#define MAX_THROTTLE  (32<<  20)  /* Migration speed throttling */

  /* When we add fault tolerance, we could have several
 migrations at once.  For now we don't need to add
 dynamic creation of migration */
  static MigrationState current_migration_static = {
  .state = MIG_STATE_NONE,
+.bandwidth_limit = MAX_THROTTLE,
  };
  static MigrationState *current_migration =¤t_migration_static;

@@ -385,14 +385,14 @@ void migrate_fd_connect(MigrationState *s)
  migrate_fd_put_ready(s);
  }

-static void migrate_init_state(Monitor *mon, int64_t bandwidth_limit,
-   int detach, int blk, int inc)
+static void migrate_init_state(Monitor *mon, int detach, int blk, int inc)
  {
+int64_t bandwidth_limit = current_migration->bandwidth_limit;
+
  memset(current_migration, 0, sizeof(current_migration));
+current_migration->bandwidth_limit = bandwidth_limit;
  current_migration->blk = blk;
  current_migration->shared = inc;
-current_migration->mon = NULL;
-current_migration->bandwidth_limit = bandwidth_limit;
  current_migration->state = MIG_STATE_NONE;

  if (!detach) {
@@ -418,7 +418,7 @@ int do_migrate(Monitor *mon, const QDict *qdict, QObject 
**ret_data)
  return -1;
  }

-migrate_init_state(mon, max_throttle, detach, blk, inc);
+migrate_init_state(mon, detach, blk, inc);

  if (strstart(uri, "tcp:",&p)) {
  ret = tcp_start_outgoing_migration(current_migration, p);
@@ -458,9 +458,10 @@ int do_migrate_set_speed(Monitor *mon, const QDict *qdict, 
QObject **ret_data)
  if (d<  0) {
  d = 0;
  }
-max_throttle = d;
+current_migration->bandwidth_limit = d;

-qemu_file_set_rate_limit(current_migration->file, max_throttle);
+qemu_file_set_rate_limit(current_migration->file,
+ current_migration->bandwidth_limit);
  return 0;
  }

Re: [Qemu-devel] [PATCH] runstate: do not discard runstate changes when paused

2011-10-04 Thread Luiz Capitulino

On Tue, 04 Oct 2011 16:09:03 +0200
Paolo Bonzini  wrote:

> On 10/04/2011 03:49 PM, Luiz Capitulino wrote:
> > There's a semantic change which I'm not completely sure it won't generate
> > unexpected side-effects: today vm_stop() will only carry any action if the
> > machine is running, otherwise it's no-op. This patch changes that.
> 
> More or less, yes.  I tried to limit the semantic change by not running 
> notifiers, which again could be better or worse.
> 
> I don't think adding a new transition is a good solution, because you'll 
> have to add a transition from PAUSED to anything that uses runstate_set 
> instead of vm_stop.
> 
> However, you could change all vm_stop() to vm_stop(RSTATE_PAUSED) 
> followed by runstate_set(), adjust the transition table consequently and 
> possibly drop the argument to vm_stop.  I tried to get the smallest 
> patch, but I did need to follow-up with changes to the transition table.
> 
> In any case, can I assume this to be in your hands now? :)

Yes. I'm not sure what's the best solution here, but I'll decide soon :)

Re: [Qemu-devel] [PATCH 19/23] migration: Export a function that tells if the migration has finished correctly

2011-10-04 Thread Anthony Liguori


On 09/23/2011 07:57 AM, Juan Quintela wrote:

This will allows us to hide the status values.

Signed-off-by: Juan Quintela
---
  migration.c |4 ++--
  migration.h |2 +-
  ui/spice-core.c |4 +---
  3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/migration.c b/migration.c
index ea50a6f..580f546 100644
--- a/migration.c
+++ b/migration.c
@@ -358,9 +358,9 @@ void remove_migration_state_change_notifier(Notifier 
*notify)
  notifier_list_remove(&migration_state_notifiers, notify);
  }

-int get_migration_state(void)
+bool migration_has_finished(void)
  {
-return current_migration->state;
+return current_migration->state == MIG_STATE_COMPLETED;
  }

  void migrate_fd_connect(MigrationState *s)
diff --git a/migration.h b/migration.h
index f1a7452..6641a26 100644
--- a/migration.h
+++ b/migration.h
@@ -84,7 +84,7 @@ void migrate_fd_connect(MigrationState *s);

  void add_migration_state_change_notifier(Notifier *notify);
  void remove_migration_state_change_notifier(Notifier *notify);
-int get_migration_state(void);
+bool migration_has_finished(void);

  uint64_t ram_bytes_remaining(void);
  uint64_t ram_bytes_transferred(void);
diff --git a/ui/spice-core.c b/ui/spice-core.c
index 3cbc721..1202993 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -447,9 +447,7 @@ void do_info_spice(Monitor *mon, QObject **ret_data)

  static void migration_state_notifier(Notifier *notifier, void *data)
  {
-int state = get_migration_state();
-
-if (state == MIG_STATE_COMPLETED) {
+if (migration_has_finished()) {
  #if SPICE_SERVER_VERSION>= 0x000701 /* 0.7.1 */
  spice_server_migrate_switch(spice_server);
  #endif


I think the bug here is migration_state_notifier.  It should take an additional 
argument of MigrationState.  Otherwise, how does this code work with FT?


Regards,

Anthony Liguori

Re: [Qemu-devel] [PATCH 21/23] migration: Don't use callback on file defining it

2011-10-04 Thread Anthony Liguori


On 09/23/2011 07:57 AM, Juan Quintela wrote:

Signed-off-by: Juan Quintela


I don't think this is better.  It just eliminates the possibility of having 
useful trace points in get_error.


Regards,

Anthony Liguori


---
  migration-tcp.c  |4 ++--
  migration-unix.c |6 +++---
  2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/migration-tcp.c b/migration-tcp.c
index f6b2288..bd3aa3a 100644
--- a/migration-tcp.c
+++ b/migration-tcp.c
@@ -58,7 +58,7 @@ static void tcp_wait_for_connect(void *opaque)
  DPRINTF("connect completed\n");
  do {
  ret = getsockopt(s->fd, SOL_SOCKET, SO_ERROR, (void *)&val,&valsize);
-} while (ret == -1&&  (s->get_error(s)) == EINTR);
+} while (ret == -1&&  (socket_error()) == EINTR);

  if (ret<  0) {
  migrate_fd_error(s);
@@ -98,7 +98,7 @@ int tcp_start_outgoing_migration(MigrationState *s, const 
char *host_port)
  do {
  ret = connect(s->fd, (struct sockaddr *)&addr, sizeof(addr));
  if (ret == -1)
-ret = -(s->get_error(s));
+ret = -(socket_error());

  if (ret == -EINPROGRESS || ret == -EWOULDBLOCK)
  qemu_set_fd_handler2(s->fd, NULL, NULL, tcp_wait_for_connect, s);
diff --git a/migration-unix.c b/migration-unix.c
index d6f315a..7205b25 100644
--- a/migration-unix.c
+++ b/migration-unix.c
@@ -57,7 +57,7 @@ static void unix_wait_for_connect(void *opaque)
  DPRINTF("connect completed\n");
  do {
  ret = getsockopt(s->fd, SOL_SOCKET, SO_ERROR, (void *)&val,&valsize);
-} while (ret == -1&&  (s->get_error(s)) == EINTR);
+} while (ret == -1&&  errno == EINTR);

  if (ret<  0) {
  migrate_fd_error(s);
@@ -97,7 +97,7 @@ int unix_start_outgoing_migration(MigrationState *s, const 
char *path)
  do {
  ret = connect(s->fd, (struct sockaddr *)&addr, sizeof(addr));
  if (ret == -1)
-   ret = -(s->get_error(s));
+ret = -errno;

  if (ret == -EINPROGRESS || ret == -EWOULDBLOCK)
qemu_set_fd_handler2(s->fd, NULL, NULL, unix_wait_for_connect, s);
@@ -130,7 +130,7 @@ static void unix_accept_incoming_migration(void *opaque)

  do {
  c = qemu_accept(s, (struct sockaddr *)&addr,&addrlen);
-} while (c == -1&&  socket_error() == EINTR);
+} while (c == -1&&  errno == EINTR);

  DPRINTF("accepted migration\n");

Re: [Qemu-devel] [PATCH 22/23] migration: propagate error correctly

2011-10-04 Thread Anthony Liguori


On 09/23/2011 07:57 AM, Juan Quintela wrote:

unix and tcp outgoing migration have error values, but didn't returned
it.  Make them return the error.  Notice that EINPROGRESS&  EWOULDBLOCK
are not considered errors as callwill be retry later.

Signed-off-by: Juan Quintela


Reviewed-by: Anthony Liguori 

Regards,

Anthony Liguori


---
  migration-tcp.c  |   20 +++-
  migration-unix.c |   26 ++
  2 files changed, 21 insertions(+), 25 deletions(-)

diff --git a/migration-tcp.c b/migration-tcp.c
index bd3aa3a..619df21 100644
--- a/migration-tcp.c
+++ b/migration-tcp.c
@@ -90,26 +90,28 @@ int tcp_start_outgoing_migration(MigrationState *s, const 
char *host_port)

  s->fd = qemu_socket(PF_INET, SOCK_STREAM, 0);
  if (s->fd == -1) {
-return -1;
+return -socket_error();
  }

  socket_set_nonblock(s->fd);

  do {
  ret = connect(s->fd, (struct sockaddr *)&addr, sizeof(addr));
-if (ret == -1)
-ret = -(socket_error());
-
-if (ret == -EINPROGRESS || ret == -EWOULDBLOCK)
+if (ret == -1) {
+ret = -socket_error();
+}
+if (ret == -EINPROGRESS || ret == -EWOULDBLOCK) {
  qemu_set_fd_handler2(s->fd, NULL, NULL, tcp_wait_for_connect, s);
+return 0;
+}
  } while (ret == -EINTR);

-if (ret<  0&&  ret != -EINPROGRESS&&  ret != -EWOULDBLOCK) {
+if (ret<  0) {
  DPRINTF("connect failed\n");
  migrate_fd_error(s);
-} else if (ret>= 0)
-migrate_fd_connect(s);
-
+return ret;
+}
+migrate_fd_connect(s);
  return 0;
  }

diff --git a/migration-unix.c b/migration-unix.c
index 7205b25..428fe66 100644
--- a/migration-unix.c
+++ b/migration-unix.c
@@ -89,35 +89,29 @@ int unix_start_outgoing_migration(MigrationState *s, const 
char *path)
  s->fd = qemu_socket(PF_UNIX, SOCK_STREAM, 0);
  if (s->fd<  0) {
  DPRINTF("Unable to open socket");
-goto err_after_socket;
+return -errno;
  }

  socket_set_nonblock(s->fd);

  do {
  ret = connect(s->fd, (struct sockaddr *)&addr, sizeof(addr));
-if (ret == -1)
+if (ret == -1) {
  ret = -errno;
-
-if (ret == -EINPROGRESS || ret == -EWOULDBLOCK)
+}
+if (ret == -EINPROGRESS || ret == -EWOULDBLOCK) {
qemu_set_fd_handler2(s->fd, NULL, NULL, unix_wait_for_connect, s);
+return 0;
+}
  } while (ret == -EINTR);

-if (ret<  0&&  ret != -EINPROGRESS&&  ret != -EWOULDBLOCK) {
+if (ret<  0) {
  DPRINTF("connect failed\n");
-goto err_after_open;
+migrate_fd_error(s);
+return ret;
  }
-
-if (ret>= 0)
-migrate_fd_connect(s);
-
+migrate_fd_connect(s);
  return 0;
-
-err_after_open:
-close(s->fd);
-
-err_after_socket:
-return -1;
  }

  static void unix_accept_incoming_migration(void *opaque)

Re: [Qemu-devel] [PATCH 23/23] migration: make migration-{tcp, unix} consistent

2011-10-04 Thread Anthony Liguori


On 09/23/2011 07:57 AM, Juan Quintela wrote:

Files are almost identical in functionality, just remove the
differences that make no sense.

Signed-off-by: Juan Quintela


Reviewed-by: Anthony Liguori 

Regards,

Anthony Liguori


---
  migration-tcp.c  |   15 ++-
  migration-unix.c |   46 +-
  2 files changed, 35 insertions(+), 26 deletions(-)

diff --git a/migration-tcp.c b/migration-tcp.c
index 619df21..5aa742c 100644
--- a/migration-tcp.c
+++ b/migration-tcp.c
@@ -48,7 +48,6 @@ static int tcp_close(MigrationState *s)
  return 0;
  }

-
  static void tcp_wait_for_connect(void *opaque)
  {
  MigrationState *s = opaque;
@@ -84,12 +83,14 @@ int tcp_start_outgoing_migration(MigrationState *s, const 
char *host_port)
  if (ret<  0) {
  return ret;
  }
+
  s->get_error = socket_errno;
  s->write = socket_write;
  s->close = tcp_close;

  s->fd = qemu_socket(PF_INET, SOCK_STREAM, 0);
  if (s->fd == -1) {
+DPRINTF("Unable to open socket");
  return -socket_error();
  }

@@ -155,23 +156,27 @@ int tcp_start_incoming_migration(const char *host_port)
  int val;
  int s;

+DPRINTF("Attempting to start an incoming migration\n");
+
  if (parse_host_port(&addr, host_port)<  0) {
  fprintf(stderr, "invalid host/port combination: %s\n", host_port);
  return -EINVAL;
  }

  s = qemu_socket(PF_INET, SOCK_STREAM, 0);
-if (s == -1)
+if (s == -1) {
  return -socket_error();
+}

  val = 1;
  setsockopt(s, SOL_SOCKET, SO_REUSEADDR, (const char *)&val, sizeof(val));

-if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) == -1)
+if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
  goto err;
-
-if (listen(s, 1) == -1)
+}
+if (listen(s, 1) == -1) {
  goto err;
+}

  qemu_set_fd_handler2(s, NULL, tcp_accept_incoming_migration, NULL,
   (void *)(intptr_t)s);
diff --git a/migration-unix.c b/migration-unix.c
index 428fe66..f9d34b9 100644
--- a/migration-unix.c
+++ b/migration-unix.c
@@ -87,7 +87,7 @@ int unix_start_outgoing_migration(MigrationState *s, const 
char *path)
  s->close = unix_close;

  s->fd = qemu_socket(PF_UNIX, SOCK_STREAM, 0);
-if (s->fd<  0) {
+if (s->fd == -1) {
  DPRINTF("Unable to open socket");
  return -errno;
  }
@@ -130,7 +130,7 @@ static void unix_accept_incoming_migration(void *opaque)

  if (c == -1) {
  fprintf(stderr, "could not accept migration connection\n");
-return;
+goto out2;
  }

  f = qemu_fopen_socket(c);
@@ -142,45 +142,49 @@ static void unix_accept_incoming_migration(void *opaque)
  process_incoming_migration(f);
  qemu_fclose(f);
  out:
+close(c);
+out2:
  qemu_set_fd_handler2(s, NULL, NULL, NULL, NULL);
  close(s);
-close(c);
  }

  int unix_start_incoming_migration(const char *path)
  {
-struct sockaddr_un un;
-int sock;
+struct sockaddr_un addr;
+int s;
+int ret;

  DPRINTF("Attempting to start an incoming migration\n");

-sock = qemu_socket(PF_UNIX, SOCK_STREAM, 0);
-if (sock<  0) {
+s = qemu_socket(PF_UNIX, SOCK_STREAM, 0);
+if (s == -1) {
  fprintf(stderr, "Could not open unix socket: %s\n", strerror(errno));
-return -EINVAL;
+return -errno;
  }

-memset(&un, 0, sizeof(un));
-un.sun_family = AF_UNIX;
-snprintf(un.sun_path, sizeof(un.sun_path), "%s", path);
+memset(&addr, 0, sizeof(addr));
+addr.sun_family = AF_UNIX;
+snprintf(addr.sun_path, sizeof(addr.sun_path), "%s", path);

-unlink(un.sun_path);
-if (bind(sock, (struct sockaddr*)&un, sizeof(un))<  0) {
-fprintf(stderr, "bind(unix:%s): %s\n", un.sun_path, strerror(errno));
+unlink(addr.sun_path);
+if (bind(s, (struct sockaddr *)&addr, sizeof(addr))<  0) {
+ret = -errno;
+fprintf(stderr, "bind(unix:%s): %s\n", addr.sun_path, strerror(errno));
  goto err;
  }
-if (listen(sock, 1)<  0) {
-fprintf(stderr, "listen(unix:%s): %s\n", un.sun_path, strerror(errno));
+if (listen(s, 1) == -1) {
+fprintf(stderr, "listen(unix:%s): %s\n", addr.sun_path,
+strerror(errno));
+ret = -errno;
  goto err;
  }

-qemu_set_fd_handler2(sock, NULL, unix_accept_incoming_migration, NULL,
-(void *)(intptr_t)sock);
+qemu_set_fd_handler2(s, NULL, unix_accept_incoming_migration, NULL,
+ (void *)(intptr_t)s);

  return 0;

  err:
-close(sock);
-
-return -EINVAL;
+close(s);
+return ret;
  }

Re: [Qemu-devel] [PATCH v3 00/23] Refactor and cleanup migration code

2011-10-04 Thread Anthony Liguori


I peeled off the first 4 patches to apply.  Once I finish testing I'll push 
them.

Most of the rest of the series looks good.  I really don't like making 
MigrationState nullable with MIG_STATE_NONE.  I don't think it makes sense with 
multiple MigrationStates.


The direct use of globals is not something I'm happy about, but I'd be inclined 
to hold my nose and apply if it weren't for MIG_STATE_NONE.


Regards,

Anthony Liguori

On 09/23/2011 07:56 AM, Juan Quintela wrote:

Hi

v4:

rebase on top of new qemu and new migration-errors series


v3:
this patch applies on top of my previous "migration error"
patches.  All error handling has been moved to that series,
except for "propagate error correctly", without this
refactoring, it is quite complicated to apply it.

Please, review.

Later, Juan.

v3:
- more checkpatch.pl happines
- split error handling in a previous series
- make Anthony happy.  current_migration is still a pointer, but points to
   a static variable.  We can change current_migration when we integrate
   kemari.

v2:
- make Jan^Wcheckpatch.pl happy
- Yoshiaki Tamura suggestions:
   - include its two patches to clean things
   - MAX_THROTTLE define
   - migration_state enum
- I removed spurious differences between migration-{tcp,unix}
- better error propagation, after this patch:
migrate -d "tcp:name_don_exist:port"
migrate -d "tcp:name:port_dont_exist"
migrate -d "exec: prog_dont_exist"
migrate -d "exec: gzip>  /path/dont/exist"
  fail as expected.  Last two used to enter an infinite loop.

The fixes part should be backported to 0.14, waiting for the review to do that.

Later, Juan.

v1:
This series:
- Fold MigrationState into FdMigrationState (and then rename)
- Factorize migration statec creation in a single place
- Make use of MIG_STATE_*, setup through helpers and make them local
- remove relase&  cancel callbacks (where used only one in same
   file than defined)
- get_status() is no more, just access directly to .state
- current_migration use cleanup, and make variable static
- max_throotle is gone, now inside current_migration
- change get_migration_status() to migration_has_finished()
   and actualize single user.

Please review.

Later, Juan.

Juan Quintela (23):
   migration: Make *start_outgoing_migration return FdMigrationState
   migration: Use FdMigrationState instead of MigrationState when
 possible
   migration: Fold MigrationState into FdMigrationState
   migration: Rename FdMigrationState MigrationState
   migration: Refactor MigrationState creation
   migration: Make all posible migration functions static
   migration: move migrate_create_state to do_migrate
   migration: Introduce MIG_STATE_NONE
   migration: Refactor and simplify error checking in
 migrate_fd_put_ready
   migration: Introduce migrate_fd_completed() for consistency
   migration: Our release callback was just free
   migration: Remove get_status() accessor
   migration: Remove migration cancel() callback
   migration: Move exported functions to the end of the file
   migration: use global variable directly
   migration: another case of global variable assigned to local one
   migration: make sure we always have a migration state
   migration: Use bandwidth_limit directly
   migration: Export a function that tells if the migration has finished
 correctly
   migration: Make state definitions local
   migration: Don't use callback on file defining it
   migration: propagate error correctly
   migration: make migration-{tcp,unix} consistent

  migration-exec.c |   39 +
  migration-fd.c   |   42 ++-
  migration-tcp.c  |   76 --
  migration-unix.c |  112 ++-
  migration.c  |  404 ++---
  migration.h  |   85 ++--
  ui/spice-core.c  |4 +-
  7 files changed, 303 insertions(+), 459 deletions(-)

Re: [Qemu-devel] s390: Fix cpu shutdown for KVM

2011-10-04 Thread Alexander Graf


On 10/04/2011 03:47 PM, Christian Borntraeger wrote:

Something like the following?

s390: Fix cpu shutdown for KVM

On s390 a shutdown is the state of all CPUs being either stopped
or disabled (for interrupts) waiting. We have to track this number
to call the shutdown sequence accordingly. This patch implements
the counting and shutdown handling for the kvm path in qemu.

Signed-off-by: Christian Borntraeger
---
  hw/s390-virtio.c   |   21 +
  target-s390x/cpu.h |   12 
  target-s390x/kvm.c |   21 +
  3 files changed, 50 insertions(+), 4 deletions(-)

Index: b/hw/s390-virtio.c
===
--- a/hw/s390-virtio.c
+++ b/hw/s390-virtio.c
@@ -130,6 +130,26 @@ int s390_virtio_hypercall(CPUState *env,
  return r;
  }

+/*
+ * The number of running CPUs. On s390 a shutdown is the state of all CPUs
+ * being either stopped or disabled (for interrupts) waiting. We have to
+ * track this number to call the shutdown sequence accordingly. This
+ * number is modified either on startup or while holding the big qemu lock.
+ */
+static unsigned s390_running_cpus;
+
+void kvm_s390_add_running_cpu(CPUState *env)
+{
+assert(env->halted == 0);
+s390_running_cpus++;
+}
+
+unsigned kvm_s390_del_running_cpu(CPUState *env)
+{
+assert(s390_running_cpus>= 1);
+return --s390_running_cpus;
+}


Almost. It has nothing to do with KVM though. Please keep the 
abstraction intact!



+
  /* PC hardware initialisation */
  static void s390_init(ram_addr_t my_ram_size,
const char *boot_device,
@@ -189,6 +209,7 @@ static void s390_init(ram_addr_t my_ram_

  env->halted = 0;
  env->exception_index = 0;
+kvm_s390_add_running_cpu(env);

  if (kernel_filename) {
  kernel_size = load_image(kernel_filename, qemu_get_ram_ptr(0));
Index: b/target-s390x/cpu.h
===
--- a/target-s390x/cpu.h
+++ b/target-s390x/cpu.h
@@ -292,6 +292,8 @@ void kvm_s390_interrupt(CPUState *env, i
  void kvm_s390_virtio_irq(CPUState *env, int config_change, uint64_t token);
  void kvm_s390_interrupt_internal(CPUState *env, int type, uint32_t parm,
   uint64_t parm64, int vm);
+void kvm_s390_add_running_cpu(CPUState *env);
+unsigned kvm_s390_del_running_cpu(CPUState *env);
  #else
  static inline void kvm_s390_interrupt(CPUState *env, int type, uint32_t code)
  {
@@ -307,6 +309,16 @@ static inline void kvm_s390_interrupt_in
 int vm)
  {
  }
+
+static inline void kvm_s390_add_running_cpu(CPUState *env)
+{
+}
+
+static inline unsigned kvm_s390_del_running_cpu(CPUState *env)
+{
+return 0;
+}


The functions are defined in s390-virtio.c and thus are available 
anywhere now. No need to special-case KVM anywhere.



+
  #endif
  CPUState *s390_cpu_addr2state(uint16_t cpu_addr);

Index: b/target-s390x/kvm.c
===
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -185,6 +185,12 @@ void kvm_s390_interrupt_internal(CPUStat
  return;
  }

+/*
+ * We can only deliver interrupts to (interrupt) enabled CPUs.
+ * We dont call kvm_s390_add_running_cpu here, since CPUs in enabled wait
+ * will wait inside the kernel (no exit). Therefore, the targeted
+ * CPUs was neither disabled waiting or stopped for qemu.
+ */
  env->halted = 0;
  env->exception_index = -1;
  qemu_cpu_kick(env);
@@ -301,6 +307,7 @@ static int s390_cpu_restart(CPUState *en
  kvm_s390_interrupt(env, KVM_S390_RESTART, 0);
  env->halted = 0;
  env->exception_index = -1;
+kvm_s390_add_running_cpu(env);


Hrm. Is add_running correct here? Maybe we should add a wrapper around 
the halted = 0 parts in s390-virtio.c and do the increase of the counter 
there? That way we can make sure that transitioning from 0 -> 0 doesn't 
increase the counter.



Alex

[Qemu-devel] [PATCH 1/4] savevm: teach qemu_fill_buffer to do partial refills

2011-10-04 Thread Juan Quintela

We will need on next patch to be able to lookahead on next patch

Signed-off-by: Juan Quintela 
---
 savevm.c |   14 +++---
 1 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/savevm.c b/savevm.c
index 46f2447..31131df 100644
--- a/savevm.c
+++ b/savevm.c
@@ -455,6 +455,7 @@ void qemu_fflush(QEMUFile *f)
 static void qemu_fill_buffer(QEMUFile *f)
 {
 int len;
+int used;

 if (!f->get_buffer)
 return;
@@ -462,10 +463,17 @@ static void qemu_fill_buffer(QEMUFile *f)
 if (f->is_write)
 abort();

-len = f->get_buffer(f->opaque, f->buf, f->buf_offset, IO_BUF_SIZE);
+used = f->buf_size - f->buf_index;
+if (used > 0) {
+memmove(f->buf, f->buf + f->buf_index, used);
+}
+f->buf_index = 0;
+f->buf_size = used;
+
+len = f->get_buffer(f->opaque, f->buf + used, f->buf_offset,
+IO_BUF_SIZE - used);
 if (len > 0) {
-f->buf_index = 0;
-f->buf_size = len;
+f->buf_size += len;
 f->buf_offset += len;
 } else if (len != -EAGAIN)
 f->has_error = 1;
-- 
1.7.6.4

[Qemu-devel] [PATCH 2/4] savevm: some coding style cleanups

2011-10-04 Thread Juan Quintela

This patch will make moving code on next patches and having checkpatch
happy easier.

Signed-off-by: Juan Quintela 
---
 savevm.c |   21 ++---
 1 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/savevm.c b/savevm.c
index 31131df..5fee4e2 100644
--- a/savevm.c
+++ b/savevm.c
@@ -536,8 +536,9 @@ int qemu_get_buffer(QEMUFile *f, uint8_t *buf, int size1)
 {
 int size, l;

-if (f->is_write)
+if (f->is_write) {
 abort();
+}

 size = size1;
 while (size > 0) {
@@ -545,11 +546,13 @@ int qemu_get_buffer(QEMUFile *f, uint8_t *buf, int size1)
 if (l == 0) {
 qemu_fill_buffer(f);
 l = f->buf_size - f->buf_index;
-if (l == 0)
+if (l == 0) {
 break;
+}
 }
-if (l > size)
+if (l > size) {
 l = size;
+}
 memcpy(buf, f->buf + f->buf_index, l);
 f->buf_index += l;
 buf += l;
@@ -560,26 +563,30 @@ int qemu_get_buffer(QEMUFile *f, uint8_t *buf, int size1)

 static int qemu_peek_byte(QEMUFile *f)
 {
-if (f->is_write)
+if (f->is_write) {
 abort();
+}

 if (f->buf_index >= f->buf_size) {
 qemu_fill_buffer(f);
-if (f->buf_index >= f->buf_size)
+if (f->buf_index >= f->buf_size) {
 return 0;
+}
 }
 return f->buf[f->buf_index];
 }

 int qemu_get_byte(QEMUFile *f)
 {
-if (f->is_write)
+if (f->is_write) {
 abort();
+}

 if (f->buf_index >= f->buf_size) {
 qemu_fill_buffer(f);
-if (f->buf_index >= f->buf_size)
+if (f->buf_index >= f->buf_size) {
 return 0;
+}
 }
 return f->buf[f->buf_index++];
 }
-- 
1.7.6.4

[Qemu-devel] [PATCH 4/4] Revert "savevm: fix corruption in vmstate_subsection_load()."

2011-10-04 Thread Juan Quintela

This reverts commit eb60260de0b050a5e8ab725e84d377d0b44c43ae.

Conflicts:

savevm.c

We changed qemu_peek_byte() prototype, just fixed the rejects.

Signed-off-by: Juan Quintela 
---
 savevm.c |   10 +-
 1 files changed, 1 insertions(+), 9 deletions(-)

diff --git a/savevm.c b/savevm.c
index db6ea12..aaa303e 100644
--- a/savevm.c
+++ b/savevm.c
@@ -1715,12 +1715,6 @@ static const VMStateDescription 
*vmstate_get_subsection(const VMStateSubsection
 static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd,
void *opaque)
 {
-const VMStateSubsection *sub = vmsd->subsections;
-
-if (!sub || !sub->needed) {
-return 0;
-}
-
 while (qemu_peek_byte(f, 0) == QEMU_VM_SUBSECTION) {
 char idstr[256];
 int ret;
@@ -1742,7 +1736,7 @@ static int vmstate_subsection_load(QEMUFile *f, const 
VMStateDescription *vmsd,
 /* it don't have a valid subsection name */
 return 0;
 }
-sub_vmsd = vmstate_get_subsection(sub, idstr);
+sub_vmsd = vmstate_get_subsection(vmsd->subsections, idstr);
 if (sub_vmsd == NULL) {
 return -ENOENT;
 }
@@ -1752,7 +1746,6 @@ static int vmstate_subsection_load(QEMUFile *f, const 
VMStateDescription *vmsd,
 idstr[len] = 0;
 version_id = qemu_get_be32(f);

-assert(!sub_vmsd->subsections);
 ret = vmstate_load_state(f, sub_vmsd, opaque, version_id);
 if (ret) {
 return ret;
@@ -1776,7 +1769,6 @@ static void vmstate_subsection_save(QEMUFile *f, const 
VMStateDescription *vmsd,
 qemu_put_byte(f, len);
 qemu_put_buffer(f, (uint8_t *)vmsd->name, len);
 qemu_put_be32(f, vmsd->version_id);
-assert(!vmsd->subsections);
 vmstate_save_state(f, vmsd, opaque);
 }
 sub++;
-- 
1.7.6.4

[Qemu-devel] [PATCH 0/4] migration: Improve subsections detection

2011-10-04 Thread Juan Quintela

Hi

This series move the subsections detection code form:
- Look that it starts form 5
To:
- Look that it starts form 5 (SUBSECTION)
- Look at the length
- Look that length is bigger than section name
- Look at the idstr and see that it starts with the subsection name.

Please review.

Later, Juan.

Juan Quintela (4):
  savevm: teach qemu_fill_buffer to do partial refills
  savevm: some coding style cleanups
  savevm: improve subsections detection on load
  Revert "savevm: fix corruption in vmstate_subsection_load()."

 savevm.c |  112 ++---
 1 files changed, 84 insertions(+), 28 deletions(-)

-- 
1.7.6.4

Re: [Qemu-devel] [PATCH 05/23] migration: Refactor MigrationState creation

2011-10-04 Thread Juan Quintela

Anthony Liguori  wrote:
> On 09/23/2011 07:56 AM, Juan Quintela wrote:
>> Signed-off-by: Juan Quintela
>
>
> I think this would fit better if it were migrate_new() or migrate_open().
>
> It definitely should be in the form noun_verb and not noun_verb_noun().
>
> Regards,

You are the native speaker, so I change this one O:-)

Later, Juan.

Re: [Qemu-devel] [PATCH 15/23] migration: use global variable directly

2011-10-04 Thread Juan Quintela

Anthony Liguori  wrote:

> How about we compromise and add a:
>
> MigrationState *migrate_get_current(void);

I can agree with this one.

> I'm strongly opposed to propagating direct usage of a global.  If it's
> at least a function call, that's a bit nicer.

What I am against is with trying to "hide" the fact that we are using a
global.  function accessor looks ok enough.

Re: [Qemu-devel] [PATCH 17/23] migration: make sure we always have a migration state

2011-10-04 Thread Juan Quintela

Anthony Liguori  wrote:
> On 09/23/2011 07:57 AM, Juan Quintela wrote:
>> This cleans up a lot the code as we don't have to check anymore if
>> the variable is NULL or not.
>>
>> We don't make it static, because when we integrate fault tolerance, we
>> can have several migrations at once.
>>
>> Signed-off-by: Juan Quintela
>> ---
>>   migration.c |  126 
>> +-
>>   1 files changed, 54 insertions(+), 72 deletions(-)
>>
>> diff --git a/migration.c b/migration.c
>> index 1cc21f0..c382383 100644
>> --- a/migration.c
>> +++ b/migration.c
>> @@ -34,7 +34,13 @@
>>   /* Migration speed throttling */
>>   static int64_t max_throttle = (32<<  20);
>>
>> -static MigrationState *current_migration;
>> +/* When we add fault tolerance, we could have several
>> +   migrations at once.  For now we don't need to add
>> +   dynamic creation of migration */
>> +static MigrationState current_migration_static = {
>> +.state = MIG_STATE_NONE,
>> +};
>> +static MigrationState *current_migration =¤t_migration_static;
>
> This doesn't make sense to me.  We can have two migration states that
> are both in the MIG_STATE_NONE?  What does that actually mean?
>
> I don't see the point in making migration state nullable via the state
> member other than avoiding the if (s == NULL) check.

It avoids s==NULL checks, and it also avoids having to have new
variables (max_throotle) just because we don't have a migration state
handy.

Obviously the problem can't be the size (the variable is smaller that
all the code tests for NULL).

For me, it is easier to understand that we have an state (migration
never attempted) with the same states that the other migrations states.

Later, Juan.

[Qemu-devel] [PATCH RFC V1 02/11] qemu-timer: Introduce qemu_run_one_timer

2011-10-04 Thread Anthony PERARD

Used by the Xen PCI Passthrough code to run the timer about the power
state transition.

Signed-off-by: Anthony PERARD 
---
 qemu-timer.c |   15 +++
 qemu-timer.h |3 +++
 2 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index 46dd483..15e659b 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -1163,3 +1163,18 @@ int qemu_calculate_timeout(void)
 return 1000;
 }
 
+/* run the specified timer */
+void qemu_run_one_timer(QEMUTimer *ts)
+{
+uint64_t current_time;
+
+/* remove timer from the list before calling the callback */
+qemu_del_timer(ts);
+
+while ((current_time = qemu_get_clock_ms(rt_clock)) < ts->expire_time)
+/* sleep until the expire time */
+usleep((ts->expire_time - current_time) * 1000);
+
+/* run the callback */
+ts->cb(ts->opaque);
+}
diff --git a/qemu-timer.h b/qemu-timer.h
index 0a43469..b7b907b 100644
--- a/qemu-timer.h
+++ b/qemu-timer.h
@@ -338,4 +338,7 @@ extern int64_t tlb_flush_time;
 extern int64_t dev_time;
 #endif
 
+/* run the specified timer */
+void qemu_run_one_timer(QEMUTimer *ts);
+
 #endif
-- 
Anthony PERARD

[Qemu-devel] [PATCH RFC V1 01/11] Introduce HostPCIDevice to access a pci device on the host.

2011-10-04 Thread Anthony PERARD

Signed-off-by: Anthony PERARD 
---
 hw/host-pci-device.c |  192 ++
 hw/host-pci-device.h |   36 +
 2 files changed, 228 insertions(+), 0 deletions(-)
 create mode 100644 hw/host-pci-device.c
 create mode 100644 hw/host-pci-device.h

diff --git a/hw/host-pci-device.c b/hw/host-pci-device.c
new file mode 100644
index 000..b3f2899
--- /dev/null
+++ b/hw/host-pci-device.c
@@ -0,0 +1,192 @@
+#include "qemu-common.h"
+#include "host-pci-device.h"
+
+static int path_to(const HostPCIDevice *d,
+   const char *name, char *buf, ssize_t size)
+{
+return snprintf(buf, size, "/sys/bus/pci/devices/%04x:%02x:%02x.%x/%s",
+d->domain, d->bus, d->dev, d->func, name);
+}
+
+static int get_resource(HostPCIDevice *d)
+{
+int i;
+FILE *f;
+char path[PATH_MAX];
+unsigned long long start, end, flags, size;
+
+path_to(d, "resource", path, sizeof (path));
+f = fopen(path, "r");
+if (!f) {
+fprintf(stderr, "Error: Can't open %s: %s\n", path, strerror(errno));
+return -1;
+}
+
+for (i = 0; i < PCI_NUM_REGIONS; i++) {
+if (fscanf(f, "%llx %llx %llx", &start, &end, &flags) != 3) {
+fprintf(stderr, "Error: Syntax error in %s\n", path);
+break;
+}
+if (start) {
+size = end - start + 1;
+} else {
+size = 0;
+}
+
+flags &= 0xf;
+
+if (i < PCI_ROM_SLOT) {
+d->base_addr[i] = start | flags;
+d->size[i] = size;
+} else {
+d->rom_base_addr = start | flags;
+d->rom_size = size;
+}
+}
+
+fclose(f);
+return 0;
+}
+
+static unsigned long get_value(HostPCIDevice *d, const char *name)
+{
+char path[PATH_MAX];
+FILE *f;
+unsigned long value;
+
+path_to(d, name, path, sizeof (path));
+f = fopen(path, "r");
+if (!f) {
+fprintf(stderr, "Error: Can't open %s: %s\n", path, strerror(errno));
+return -1;
+}
+if (fscanf(f, "%lx\n", &value) != 1) {
+fprintf(stderr, "Error: Syntax error in %s\n", path);
+value = -1;
+}
+fclose(f);
+return value;
+}
+
+static int pci_dev_is_virtfn(HostPCIDevice *d)
+{
+int rc;
+char path[PATH_MAX];
+struct stat buf;
+
+path_to(d, "physfn", path, sizeof (path));
+rc = !stat(path, &buf);
+
+return rc;
+}
+
+static int host_pci_config_fd(HostPCIDevice *d)
+{
+char path[PATH_MAX];
+
+if (d->config_fd < 0) {
+path_to(d, "config", path, sizeof (path));
+d->config_fd = open(path, O_RDWR);
+if (d->config_fd < 0) {
+fprintf(stderr, "HostPCIDevice: Can not open '%s': %s\n",
+path, strerror(errno));
+}
+}
+return d->config_fd;
+}
+static int host_pci_config_read(HostPCIDevice *d, int pos, void *buf, int len)
+{
+int fd = host_pci_config_fd(d);
+int res = 0;
+
+res = pread(fd, buf, len, pos);
+if (res < 0) {
+fprintf(stderr, "host_pci_config: read failed: %s (fd: %i)\n",
+strerror(errno), fd);
+return -1;
+}
+return res;
+}
+static int host_pci_config_write(HostPCIDevice *d,
+ int pos, const void *buf, int len)
+{
+int fd = host_pci_config_fd(d);
+int res = 0;
+
+res = pwrite(fd, buf, len, pos);
+if (res < 0) {
+fprintf(stderr, "host_pci_config: write failed: %s\n",
+strerror(errno));
+return -1;
+}
+return res;
+}
+
+uint8_t host_pci_read_byte(HostPCIDevice *d, int pos)
+{
+  uint8_t buf;
+  host_pci_config_read(d, pos, &buf, 1);
+  return buf;
+}
+uint16_t host_pci_read_word(HostPCIDevice *d, int pos)
+{
+  uint16_t buf;
+  host_pci_config_read(d, pos, &buf, 2);
+  return le16_to_cpu(buf);
+}
+uint32_t host_pci_read_long(HostPCIDevice *d, int pos)
+{
+  uint32_t buf;
+  host_pci_config_read(d, pos, &buf, 4);
+  return le32_to_cpu(buf);
+}
+int host_pci_read_block(HostPCIDevice *d, int pos, uint8_t *buf, int len)
+{
+  return host_pci_config_read(d, pos, buf, len);
+}
+
+int host_pci_write_byte(HostPCIDevice *d, int pos, uint8_t data)
+{
+  return host_pci_config_write(d, pos, &data, 1);
+}
+int host_pci_write_word(HostPCIDevice *d, int pos, uint16_t data)
+{
+  return host_pci_config_write(d, pos, &data, 2);
+}
+int host_pci_write_long(HostPCIDevice *d, int pos, uint32_t data)
+{
+  return host_pci_config_write(d, pos, &data, 4);
+}
+int host_pci_write_block(HostPCIDevice *d, int pos, uint8_t *buf, int len)
+{
+  return host_pci_config_write(d, pos, buf, len);
+}
+
+HostPCIDevice *host_pci_device_get(uint8_t bus, uint8_t dev, uint8_t func)
+{
+HostPCIDevice *d = NULL;
+
+d = g_new0(HostPCIDevice, 1);
+
+d->config_fd = -1;
+d->domain = 0;
+d->bus = bus;
+d->dev = dev;
+d->func = func;
+
+if (host_pci_config_fd(d) == -1)
+goto error;
+if (get_resource(d)

[Qemu-devel] [PATCH RFC V1 11/11] config/make: Introduce --enable-xen-pci-passthrough, built it.

2011-10-04 Thread Anthony PERARD

Signed-off-by: Anthony PERARD 
---
 Makefile.target |7 +++
 configure   |   21 +
 2 files changed, 28 insertions(+), 0 deletions(-)

diff --git a/Makefile.target b/Makefile.target
index f708453..b5fbc18 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -208,6 +208,13 @@ obj-$(CONFIG_NO_XEN) += xen-stub.o
 
 obj-i386-$(CONFIG_XEN) += xen_platform.o
 
+# Xen PCI Passthrough
+obj-i386-$(CONFIG_XEN_PCI_PASSTHROUGH) += host-pci-device.o
+obj-i386-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pci_passthrough_helpers.o
+obj-i386-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pci_passthrough.o
+obj-i386-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pci_passthrough_config_init.o
+obj-i386-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pci_passthrough_msi.o
+
 # Inter-VM PCI shared memory
 CONFIG_IVSHMEM =
 ifeq ($(CONFIG_KVM), y)
diff --git a/configure b/configure
index 0875f95..b90cfe1 100755
--- a/configure
+++ b/configure
@@ -127,6 +127,7 @@ vnc_png=""
 vnc_thread="no"
 xen=""
 xen_ctrl_version=""
+xen_pci_passthrough=""
 linux_aio=""
 attr=""
 xfs=""
@@ -635,6 +636,10 @@ for opt do
   ;;
   --enable-xen) xen="yes"
   ;;
+  --disable-xen-pci-passthrough) xen_pci_passthrough="no"
+  ;;
+  --enable-xen-pci-passthrough) xen_pci_passthrough="yes"
+  ;;
   --disable-brlapi) brlapi="no"
   ;;
   --enable-brlapi) brlapi="yes"
@@ -972,6 +977,8 @@ echo "   (affects only QEMU, not 
qemu-img)"
 echo "  --enable-mixemu  enable mixer emulation"
 echo "  --disable-xendisable xen backend driver support"
 echo "  --enable-xen enable xen backend driver support"
+echo "  --disable-xen-pci-passthrough"
+echo "  --enable-xen-pci-passthrough"
 echo "  --disable-brlapi disable BrlAPI"
 echo "  --enable-brlapi  enable BrlAPI"
 echo "  --disable-vnc-tlsdisable TLS encryption for VNC server"
@@ -1335,6 +1342,17 @@ EOF
   fi
 fi
 
+if test "$xen_pci_passthrough" != "no"; then
+  if test "$xen" = "yes"; then
+xen_pci_passthrough=yes
+  else
+if test "$xen_pci_passthrough" = "yes"; then
+  feature_not_found "Xen PCI Passthrough without Xen"
+fi
+xen_pci_passthrough=no
+  fi
+fi
+
 ##
 # pkg-config probe
 
@@ -3378,6 +3396,9 @@ case "$target_arch2" in
 if test "$xen" = "yes" -a "$target_softmmu" = "yes" ; then
   target_phys_bits=64
   echo "CONFIG_XEN=y" >> $config_target_mak
+  if test "$xen_pci_passthrough" = yes; then
+echo "CONFIG_XEN_PCI_PASSTHROUGH=y" >> "$config_target_mak"
+  fi
 else
   echo "CONFIG_NO_XEN=y" >> $config_target_mak
 fi
-- 
Anthony PERARD

[Qemu-devel] [PATCH RFC V1 10/11] Introduce Xen PCI Passthrough, MSI (3/3)

2011-10-04 Thread Anthony PERARD

Signed-off-by: Anthony PERARD 
---
 hw/xen_pci_passthrough_msi.c |  674 ++
 1 files changed, 674 insertions(+), 0 deletions(-)
 create mode 100644 hw/xen_pci_passthrough_msi.c

diff --git a/hw/xen_pci_passthrough_msi.c b/hw/xen_pci_passthrough_msi.c
new file mode 100644
index 000..be18ff1
--- /dev/null
+++ b/hw/xen_pci_passthrough_msi.c
@@ -0,0 +1,674 @@
+/*
+ * Copyright (c) 2007, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ *
+ * Jiang Yunhong 
+ *
+ * This file implements direct PCI assignment to a HVM guest
+ */
+
+#include 
+
+#include "xen_backend.h"
+#include "xen_pci_passthrough.h"
+
+void msi_set_enable(XenPCIPassthroughState *dev, int en)
+{
+uint16_t val = 0;
+uint32_t address = 0;
+PT_LOG("enable: %i\n", en);
+
+if (!dev->msi) {
+return;
+}
+
+address = dev->msi->ctrl_offset;
+if (!address) {
+return;
+}
+
+val = host_pci_read_word(dev->real_device, address);
+val &= ~PCI_MSI_FLAGS_ENABLE;
+val |= en & PCI_MSI_FLAGS_ENABLE;
+host_pci_write_word(dev->real_device, address, val);
+
+PT_LOG("done, address: %#x, val: %#x\n", address, val);
+}
+
+static void msix_set_enable(XenPCIPassthroughState *dev, int en)
+{
+uint16_t val = 0;
+uint32_t address = 0;
+
+if (!dev->msix) {
+return;
+}
+
+address = dev->msix->ctrl_offset;
+if (!address) {
+return;
+}
+
+val = host_pci_read_word(dev->real_device, address);
+val &= ~PCI_MSIX_FLAGS_ENABLE;
+if (en) {
+val |= PCI_MSIX_FLAGS_ENABLE;
+}
+host_pci_write_word(dev->real_device, address, val);
+}
+
+/*/
+/* MSI virtuailization functions */
+
+/*
+ * setup physical msi, but didn't enable it
+ */
+int pt_msi_setup(XenPCIPassthroughState *dev)
+{
+int pirq = -1;
+uint8_t gvec = 0;
+
+if (!(dev->msi->flags & MSI_FLAG_UNINIT)) {
+PT_LOG("Error: setup physical after initialized?? \n");
+return -1;
+}
+
+gvec = dev->msi->data & 0xFF;
+if (!gvec) {
+/* if gvec is 0, the guest is asking for a particular pirq that
+ * is passed as dest_id */
+pirq = (dev->msi->addr_hi & 0xff00) |
+   ((dev->msi->addr_lo >> MSI_TARGET_CPU_SHIFT) & 0xff);
+if (!pirq) {
+/* this probably identifies an misconfiguration of the guest,
+ * try the emulated path */
+pirq = -1;
+} else {
+PT_LOG("pt_msi_setup requested pirq = %d\n", pirq);
+}
+}
+
+if (xc_physdev_map_pirq_msi(xen_xc, xen_domid, AUTO_ASSIGN, &pirq,
+PCI_DEVFN(dev->real_device->dev,
+  dev->real_device->func),
+dev->real_device->bus, 0, 0)) {
+PT_LOG("Error: Mapping of MSI failed.\n");
+return -1;
+}
+
+if (pirq < 0) {
+PT_LOG("Error: Invalid pirq number\n");
+return -1;
+}
+
+dev->msi->pirq = pirq;
+PT_LOG("msi mapped with pirq %x\n", pirq);
+
+return 0;
+}
+
+static uint32_t __get_msi_gflags(uint32_t data, uint64_t addr)
+{
+uint32_t result = 0;
+int rh, dm, dest_id, deliv_mode, trig_mode;
+
+rh = (addr >> MSI_ADDR_REDIRECTION_SHIFT) & 0x1;
+dm = (addr >> MSI_ADDR_DESTMODE_SHIFT) & 0x1;
+dest_id = (addr >> MSI_TARGET_CPU_SHIFT) & 0xff;
+deliv_mode = (data >> MSI_DATA_DELIVERY_SHIFT) & 0x7;
+trig_mode = (data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
+
+result |= dest_id | (rh << GFLAGS_SHIFT_RH) | (dm << GFLAGS_SHIFT_DM) | \
+  (deliv_mode << GLFAGS_SHIFT_DELIV_MODE) |
+  (trig_mode << GLFAGS_SHIFT_TRG_MODE);
+
+return result;
+}
+
+int pt_msi_update(XenPCIPassthroughState *d)
+{
+uint8_t gvec = 0;
+uint32_t gflags = 0;
+uint64_t addr = 0;
+int ret = 0;
+
+/* get vector, address, flags info, etc. */
+gvec = d->msi->data & 0xFF;
+addr = (uint64_t)d->msi->addr_hi << 32 | d->msi->addr_lo;
+gflags = __get_msi_gflags(d->msi->data, addr);
+
+PT_LOG("Update msi with pirq %x gvec %x gflags %x\n",
+   d->msi->pirq, gvec, gflags);
+
+ret = xc_domain_update_msi_irq(xen_xc, xen_domid, gvec,
+   d->msi->pirq, gflags, 0);
+
+if (ret) {
+

[Qemu-devel] [PATCH RFC V1 08/11] Introduce Xen PCI Passthrough, qdevice (1/3)

2011-10-04 Thread Anthony PERARD

Signed-off-by: Anthony PERARD 
---
 hw/xen_pci_passthrough.c |  763 ++
 hw/xen_pci_passthrough.h |  335 +
 hw/xen_pci_passthrough_helpers.c |   46 +++
 3 files changed, 1144 insertions(+), 0 deletions(-)
 create mode 100644 hw/xen_pci_passthrough.c
 create mode 100644 hw/xen_pci_passthrough.h
 create mode 100644 hw/xen_pci_passthrough_helpers.c

diff --git a/hw/xen_pci_passthrough.c b/hw/xen_pci_passthrough.c
new file mode 100644
index 000..bfbe042
--- /dev/null
+++ b/hw/xen_pci_passthrough.c
@@ -0,0 +1,763 @@
+#include 
+
+#include "pci.h"
+#include "xen.h"
+#include "xen_backend.h"
+#include "xen_pci_passthrough.h"
+
+#define PCI_BAR_ENTRIES (6)
+
+typedef struct PlugDevice {
+uint8_t r_bus;
+uint8_t r_slot;
+uint8_t r_func;
+int bus;
+int slot;
+int func;
+} PlugDevice;
+
+QLIST_HEAD(php_dev_list, PlugDevice) php_dev_list = 
+QLIST_HEAD_INITIALIZER(php_dev_list);
+
+#define PT_BAR_ALLF 0x  /* BAR ALLF value */
+
+/* #define PT_NR_IRQS  (256) */
+/* char mapped_machine_irq[PT_NR_IRQS] = {0}; */
+
+/* Config Space */
+static int pt_pci_config_access_check(PCIDevice *d, uint32_t address, int len)
+{
+/* check offset range */
+if (address >= 0xFF) {
+PT_LOG("Error: Failed to access register with offset exceeding FFh. "
+   "[%02x:%02x.%x][Offset:%02xh][Length:%d]\n",
+   pci_bus_num(d->bus), PCI_SLOT(d->devfn), PCI_FUNC(d->devfn),
+   address, len);
+return -1;
+}
+
+/* check read size */
+if ((len != 1) && (len != 2) && (len != 4)) {
+PT_LOG("Error: Failed to access register with invalid access length. "
+   "[%02x:%02x.%x][Offset:%02xh][Length:%d]\n",
+   pci_bus_num(d->bus), PCI_SLOT(d->devfn), PCI_FUNC(d->devfn),
+   address, len);
+return -1;
+}
+
+/* check offset alignment */
+if (address & (len - 1)) {
+PT_LOG("Error: Failed to access register with invalid access size "
+"alignment. [%02x:%02x.%x][Offset:%02xh][Length:%d]\n",
+pci_bus_num(d->bus), PCI_SLOT(d->devfn), PCI_FUNC(d->devfn),
+address, len);
+return -1;
+}
+
+return 0;
+}
+
+int pt_bar_offset_to_index(uint32_t offset)
+{
+int index = 0;
+
+/* check Exp ROM BAR */
+if (offset == PCI_ROM_ADDRESS) {
+return PCI_ROM_SLOT;
+}
+
+/* calculate BAR index */
+index = (offset - PCI_BASE_ADDRESS_0) >> 2;
+if (index >= PCI_NUM_REGIONS) {
+return -1;
+}
+
+return index;
+}
+
+static uint32_t pt_pci_read_config(PCIDevice *d, uint32_t address, int len)
+{
+XenPCIPassthroughState *s = DO_UPCAST(XenPCIPassthroughState, dev, d);
+uint32_t val = 0;
+XenPTRegGroup *reg_grp_entry = NULL;
+XenPTReg *reg_entry = NULL;
+int rc = 0;
+int emul_len = 0;
+uint32_t find_addr = address;
+
+if (pt_pci_config_access_check(d, address, len)) {
+goto exit;
+}
+
+/* check power state transition flags */
+if (s->pm_state != NULL && s->pm_state->flags & PT_FLAG_TRANSITING) {
+/* can't accept untill previous power state transition is completed.
+ * so finished previous request here.
+ */
+qemu_run_one_timer(s->pm_state->pm_timer);
+}
+
+/* find register group entry */
+reg_grp_entry = pt_find_reg_grp(s, address);
+if (reg_grp_entry) {
+/* check 0 Hardwired register group */
+if (reg_grp_entry->reg_grp->grp_type == GRP_TYPE_HARDWIRED) {
+/* no need to emulate, just return 0 */
+val = 0;
+goto exit;
+}
+}
+
+/* read I/O device register value */
+rc = host_pci_read_block(s->real_device, address, (uint8_t *)&val, len);
+if (!rc) {
+PT_LOG("Error: pci_read_block failed. return value[%d].\n", rc);
+memset(&val, 0xff, len);
+}
+
+/* just return the I/O device register value for
+ * passthrough type register group */
+if (reg_grp_entry == NULL) {
+goto exit;
+}
+
+/* adjust the read value to appropriate CFC-CFF window */
+val <<= (address & 3) << 3;
+emul_len = len;
+
+/* loop Guest request size */
+while (emul_len > 0) {
+/* find register entry to be emulated */
+reg_entry = pt_find_reg(reg_grp_entry, find_addr);
+if (reg_entry) {
+XenPTRegInfo *reg = reg_entry->reg;
+uint32_t real_offset = reg_grp_entry->base_offset + reg->offset;
+uint32_t valid_mask = 0x >> ((4 - emul_len) << 3);
+uint8_t *ptr_val = NULL;
+
+valid_mask <<= (find_addr - real_offset) << 3;
+ptr_val = (uint8_t *)&val + (real_offset & 3);
+
+/* do emulation depend on register size */
+switch (reg->size) {
+case 1:
+if (reg->u.b.read) {
+r

[Qemu-devel] [PATCH RFC V1 06/11] pci.c: Add pci_check_bar_overlap

2011-10-04 Thread Anthony PERARD

This function help Xen PCI Passthrough device to check for overlap.

Signed-off-by: Anthony PERARD 
---
 hw/pci.c |   46 ++
 hw/pci.h |3 +++
 2 files changed, 49 insertions(+), 0 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index af74003..56e94d1 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -2196,3 +2196,49 @@ MemoryRegion *pci_address_space(PCIDevice *dev)
 {
 return dev->bus->address_space_mem;
 }
+
+int pci_check_bar_overlap(PCIDevice *dev,
+  pcibus_t addr, pcibus_t size, uint8_t type)
+{
+PCIBus *bus = dev->bus;
+PCIDevice *devices = NULL;
+PCIIORegion *r;
+int i, j;
+int rc = 0;
+
+/* check Overlapped to Base Address */
+for (i = 0; i < ARRAY_SIZE(bus->devices); i++) {
+if (!(devices = bus->devices[i])) {
+continue;
+}
+
+/* skip itself */
+if (devices->devfn == dev->devfn) {
+continue;
+}
+
+for (j = 0; j < PCI_NUM_REGIONS; j++) {
+r = &devices->io_regions[j];
+
+/* skip different resource type, but don't skip when
+ * prefetch and non-prefetch memory are compared.
+ */
+if (type != r->type) {
+if (type == PCI_BASE_ADDRESS_SPACE_IO ||
+r->type == PCI_BASE_ADDRESS_SPACE_IO) {
+continue;
+}
+}
+
+if ((addr < (r->addr + r->size)) && ((addr + size) > r->addr)) {
+printf("Overlapped to device[%02x:%02x.%x][Region:%d]"
+   "[Address:%"PRIx64"h][Size:%"PRIx64"h]\n",
+   pci_bus_num(bus), PCI_SLOT(devices->devfn),
+   PCI_FUNC(devices->devfn), j, r->addr, r->size);
+rc = 1;
+}
+}
+}
+
+return rc;
+}
diff --git a/hw/pci.h b/hw/pci.h
index c04b169..97996b5 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -490,4 +490,7 @@ static inline uint32_t pci_config_size(const PCIDevice *d)
 return pci_is_express(d) ? PCIE_CONFIG_SPACE_SIZE : PCI_CONFIG_SPACE_SIZE;
 }
 
+int pci_check_bar_overlap(PCIDevice *dev,
+  pcibus_t addr, pcibus_t size, uint8_t type);
+
 #endif
-- 
Anthony PERARD

Re: [Qemu-devel] [PATCH 05/23] migration: Refactor MigrationState creation

2011-10-04 Thread Anthony Liguori


On 10/04/2011 09:44 AM, Juan Quintela wrote:

Anthony Liguori  wrote:

On 09/23/2011 07:56 AM, Juan Quintela wrote:

Signed-off-by: Juan Quintela



I think this would fit better if it were migrate_new() or migrate_open().

It definitely should be in the form noun_verb and not noun_verb_noun().

Regards,


You are the native speaker, so I change this one O:-)


It's just consistency, not about what sounds good.

bdrv_new(), bdrv_open(), qemu_chr_open(), memory_region_init(), etc.

Regards,

Anthony Liguori



Later, Juan.

Re: [Qemu-devel] [Xen-devel] [PATCH RFC V1 00/11] Xen PCI Passthrough

2011-10-04 Thread Avi Kivity


On 10/04/2011 04:51 PM, Anthony PERARD wrote:

Hi all,

This patch series introduce the PCI passthrough for Xen.

First, we have HostPCIDevice that help to access one PCI device of the host.

Then, there are several additions in the QEMU code. One is qemu_run_one_timer
to run a specific timer. It is used by PCI passthrough to run a timer about
power management. Another is pci_check_bar_overlap.

There are also several change in pci_ids and pci_regs.

Last part, but not least, the PCI passthrough device himself. Cut in 3 parts
(or file), there is one to take care of the initialisation of a passthrough
device. The second one handle everything about the config address space, there
are specifics functions for every config register. The third one is to handle
MSI.

I'm still working on setting a PCI passthrough device through QMP from libxl
(xen tool stack). It is just a call to device_add, with the driver parametter
hostaddr=":00:1b.0".

There is some minor things missing:
  - copyright header
  - PCI IO space multiplexer




We also have pci passthrough in qemu-kvm (I think based on the same 
Neocleus code).  Rather than having two pci assignment implementations, 
I think we should have just one, with the differences (programming the 
hypervisor) abstracted at that level.


--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [Xen-devel] [PATCH RFC V1 00/11] Xen PCI Passthrough

2011-10-04 Thread Anthony Liguori


On 10/04/2011 09:58 AM, Avi Kivity wrote:

On 10/04/2011 04:51 PM, Anthony PERARD wrote:

Hi all,

This patch series introduce the PCI passthrough for Xen.

First, we have HostPCIDevice that help to access one PCI device of the host.

Then, there are several additions in the QEMU code. One is qemu_run_one_timer
to run a specific timer. It is used by PCI passthrough to run a timer about
power management. Another is pci_check_bar_overlap.

There are also several change in pci_ids and pci_regs.

Last part, but not least, the PCI passthrough device himself. Cut in 3 parts
(or file), there is one to take care of the initialisation of a passthrough
device. The second one handle everything about the config address space, there
are specifics functions for every config register. The third one is to handle
MSI.

I'm still working on setting a PCI passthrough device through QMP from libxl
(xen tool stack). It is just a call to device_add, with the driver parametter
hostaddr=":00:1b.0".

There is some minor things missing:
- copyright header
- PCI IO space multiplexer




We also have pci passthrough in qemu-kvm (I think based on the same Neocleus
code). Rather than having two pci assignment implementations, I think we should
have just one, with the differences (programming the hypervisor) abstracted at
that level.


I agree in principle but how close is qemu-kvm pci passthrough to a mergable 
state?  Would it make sense to merge the Xen code first and then abstract it?


Regards,

Anthony Liguori

Re: [Qemu-devel] [Xen-devel] [PATCH RFC V1 00/11] Xen PCI Passthrough

2011-10-04 Thread Stefano Stabellini

On Tue, 4 Oct 2011, Anthony Liguori wrote:
> On 10/04/2011 09:58 AM, Avi Kivity wrote:
> > On 10/04/2011 04:51 PM, Anthony PERARD wrote:
> >> Hi all,
> >>
> >> This patch series introduce the PCI passthrough for Xen.
> >>
> >> First, we have HostPCIDevice that help to access one PCI device of the 
> >> host.
> >>
> >> Then, there are several additions in the QEMU code. One is 
> >> qemu_run_one_timer
> >> to run a specific timer. It is used by PCI passthrough to run a timer about
> >> power management. Another is pci_check_bar_overlap.
> >>
> >> There are also several change in pci_ids and pci_regs.
> >>
> >> Last part, but not least, the PCI passthrough device himself. Cut in 3 
> >> parts
> >> (or file), there is one to take care of the initialisation of a passthrough
> >> device. The second one handle everything about the config address space, 
> >> there
> >> are specifics functions for every config register. The third one is to 
> >> handle
> >> MSI.
> >>
> >> I'm still working on setting a PCI passthrough device through QMP from 
> >> libxl
> >> (xen tool stack). It is just a call to device_add, with the driver 
> >> parametter
> >> hostaddr=":00:1b.0".
> >>
> >> There is some minor things missing:
> >> - copyright header
> >> - PCI IO space multiplexer
> >>
> >>
> >
> > We also have pci passthrough in qemu-kvm (I think based on the same Neocleus
> > code). Rather than having two pci assignment implementations, I think we 
> > should
> > have just one, with the differences (programming the hypervisor) abstracted 
> > at
> > that level.
> 
> I agree in principle but how close is qemu-kvm pci passthrough to a mergable 
> state?  Would it make sense to merge the Xen code first and then abstract it?

I think it should be fairly easy to abstract the current xen code: just
a matter of providing memory, ioport and interrupt mapping functions.

Re: [Qemu-devel] [PATCH] Make cpu_single_env thread local (Linux only for now)

2011-10-04 Thread Lluís Vilanova

Jan Kiszka writes:

> On 2011-10-03 18:33, Dr. David Alan Gilbert wrote:
>> Make cpu_single_env thread local (Linux only for now)
>> * Fixes some user space threading issues (esp those triggered
>> by bug 823902)
>> 
>> Against rev d11cf8cc..., tested on ARM user mode, and ARM Vexpress
>> system mode (with Blue Swirl's fix from yesterday) - only
>> tested on Linux host.  Lets me run ARM userspace firefox.
>> 
>> Signed-off-by: Dr. David Alan Gilbert 
>> 
>> diff --git a/cpu-all.h b/cpu-all.h
>> index 42a5fa0..d895ee6 100644
>> --- a/cpu-all.h
>> +++ b/cpu-all.h
>> @@ -334,7 +334,13 @@ void cpu_dump_statistics(CPUState *env, FILE *f, 
>> fprintf_function cpu_fprintf,
>> void QEMU_NORETURN cpu_abort(CPUState *env, const char *fmt, ...)
>> GCC_FMT_ATTR(2, 3);
>> extern CPUState *first_cpu;
>> +
>> +#ifdef __linux__
>> +/* DAG: Only tested thread local on Linux, feel free to add others */
>> +extern __thread CPUState *cpu_single_env;
>> +#else
>> extern CPUState *cpu_single_env;
>> +#endif

> We need this for all platforms in order to skip qemu_global_mutex while
> manipulating some CPUState. And leaving some platforms with non-TLS will
> eventually break them when code is added that assumes TLS.

> However, it's not unlikely that some weird platforms / ancient
> toolchains still have problems with __thread - even on Linux. We may
> want to play safe and use pthread_key on POSIX.

Why not make this kind of annotations available in qemu-commmon.h; or even
better somewhere less... "generic".

#if defined(CONFIG_SUPPORTS_THREAD_KEYWORD)

// use __thread
#define QEMU_DECL_TLS(type, name) \
extern __thread type name;
#define QEMU_DEF_TLS(type, name) \
__thread type name;
#define QEMU_SET_TLS(name, value) \
do { name = value; } while (0)
#define QEMU_GET_TLS(name) \
name

#elif defined(CONFIG_SUPPORTS_PTHREAD_KEY)

// use pthread_key_t
#define QEMU_DECL_TLS(type, name) \
extern type _type_##name; \
extern pthread_key_t name;
#define QEMU_DEF_TLS(type, name) \
pthread_key_t name; \
void _init_##name (void) __attribute__((constructor)); \
void _init_##name (void) { pthread_key_create(&name, NULL); }
#define QEMU_SET_TLS(name, value) \
do { pthread_setspecific(name, (void*)value); } while (0)
#define QEMU_GET_TLS(name) \
((typeof(_type_##name))pthread_getspecific(name))

#else
#error Go home
#endif


Lluis

-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth

Re: [Qemu-devel] [PATCH 0/4] migration: Improve subsections detection

2011-10-04 Thread Paolo Bonzini


On 10/04/2011 04:38 PM, Juan Quintela wrote:

Hi

This series move the subsections detection code form:
- Look that it starts form 5
To:
- Look that it starts form 5 (SUBSECTION)
- Look at the length
- Look that length is bigger than section name
- Look at the idstr and see that it starts with the subsection name.

Please review.


Looks good as an alternative to the new migration format.

Thanks,

Paolo

[Qemu-devel] [PATCH] ppc64: Fix linker script

2011-10-04 Thread Andreas Färber

Since commit 8733f609 (Fix linker scripts) linking on Linux/ppc64 fails:

  LINK  ppc64-linux-user/qemu-ppc64
/usr/lib64/gcc/powerpc64-suse-linux/4.3/../../../../powerpc64-suse-linux/bin/ld:/home/afaerber/qemu/ppc64.ld:84:
 syntax error
collect2: ld gab 1 als Ende-Status zurück
make[1]: *** [qemu-ppc64] Fehler 1
make: *** [subdir-ppc64-linux-user] Fehler 2

Fix by removing a leftover line in the ppc64 linker script.

Cc: Gerd Hoffmann 
Cc: Blue Swirl 
Signed-off-by: Andreas Färber 
---
 ppc64.ld |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/ppc64.ld b/ppc64.ld
index 0059ee5..0a7c0dd 100644
--- a/ppc64.ld
+++ b/ppc64.ld
@@ -81,8 +81,8 @@ SECTIONS
   .sdata2 : { *(.sdata2 .sdata2.* .gnu.linkonce.s2.*) }
   .sbss2  : { *(.sbss2 .sbss2.* .gnu.linkonce.sb2.*) }
   .eh_frame_hdr : { *(.eh_frame_hdr) }
-*(.gcc_except_table.*) } /* Adjust the address for the data segment.  We want 
to
-adjust up to + the same address within the page on the next page up.  */
+  /* Adjust the address for the data segment.  We want to adjust up to
+ the same address within the page on the next page up.  */
   . = ALIGN (0x1) - ((0x1 - .) & (0x1 - 1)); . = DATA_SEGMENT_ALIGN
 (0x1, 0x1000);   /* Exception handling  */
   .eh_frame   : { KEEP (*(.eh_frame)) }
-- 
1.6.0.2

Re: [Qemu-devel] [Qemu-ppc] [PATCH] ppc64: Fix linker script

2011-10-04 Thread Alexander Graf


On 10/04/2011 05:14 PM, Andreas Färber wrote:

Since commit 8733f609 (Fix linker scripts) linking on Linux/ppc64 fails:

   LINK  ppc64-linux-user/qemu-ppc64
/usr/lib64/gcc/powerpc64-suse-linux/4.3/../../../../powerpc64-suse-linux/bin/ld:/home/afaerber/qemu/ppc64.ld:84:
 syntax error
collect2: ld gab 1 als Ende-Status zurück
make[1]: *** [qemu-ppc64] Fehler 1
make: *** [subdir-ppc64-linux-user] Fehler 2

Fix by removing a leftover line in the ppc64 linker script.

Cc: Gerd Hoffmann
Cc: Blue Swirl
Signed-off-by: Andreas Färber


Thanks, applied to ppc-next.


Alex

[Qemu-devel] [PATCH RFC V1 04/11] pci_regs: Fix value of PCI_EXP_TYPE_RC_EC.

2011-10-04 Thread Anthony PERARD

Value check in PCI Express Base Specification rev 1.1

Signed-off-by: Anthony PERARD 
---
 hw/pci_regs.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hw/pci_regs.h b/hw/pci_regs.h
index e884096..dad7d9a 100644
--- a/hw/pci_regs.h
+++ b/hw/pci_regs.h
@@ -392,7 +392,7 @@
 #define  PCI_EXP_TYPE_DOWNSTREAM 0x6   /* Downstream Port */
 #define  PCI_EXP_TYPE_PCI_BRIDGE 0x7   /* PCI/PCI-X Bridge */
 #define  PCI_EXP_TYPE_RC_END   0x9 /* Root Complex Integrated Endpoint */
-#define  PCI_EXP_TYPE_RC_EC0x10/* Root Complex Event Collector */
+#define  PCI_EXP_TYPE_RC_EC0xa /* Root Complex Event Collector */
 #define PCI_EXP_FLAGS_SLOT 0x0100  /* Slot implemented */
 #define PCI_EXP_FLAGS_IRQ  0x3e00  /* Interrupt message number */
 #define PCI_EXP_DEVCAP 4   /* Device capabilities */
-- 
Anthony PERARD

[Qemu-devel] [PATCH RFC V1 00/11] Xen PCI Passthrough

2011-10-04 Thread Anthony PERARD

Hi all,

This patch series introduce the PCI passthrough for Xen.

First, we have HostPCIDevice that help to access one PCI device of the host.

Then, there are several additions in the QEMU code. One is qemu_run_one_timer
to run a specific timer. It is used by PCI passthrough to run a timer about
power management. Another is pci_check_bar_overlap.

There are also several change in pci_ids and pci_regs.

Last part, but not least, the PCI passthrough device himself. Cut in 3 parts
(or file), there is one to take care of the initialisation of a passthrough
device. The second one handle everything about the config address space, there
are specifics functions for every config register. The third one is to handle
MSI.

I'm still working on setting a PCI passthrough device through QMP from libxl
(xen tool stack). It is just a call to device_add, with the driver parametter
hostaddr=":00:1b.0".

There is some minor things missing:
 - copyright header
 - PCI IO space multiplexer

Regards,


Anthony PERARD (11):
  Introduce HostPCIDevice to access a pci device on the host.
  qemu-timer: Introduce qemu_run_one_timer
  pci_ids: Add INTEL_82599_VF id.
  pci_regs: Fix value of PCI_EXP_TYPE_RC_EC.
  pci_regs: Add PCI_EXP_TYPE_PCIE_BRIDGE
  pci.c: Add pci_check_bar_overlap
  host-pci-device: Add host_pci_find_ext_cap_offset
  Introduce Xen PCI Passthrough, qdevice (1/3)
  Introduce Xen PCI Passthrough, PCI config space helpers (2/3)
  Introduce Xen PCI Passthrough, MSI (3/3)
  config/make: Introduce --enable-xen-pci-passthrough, built it.

 Makefile.target  |7 +
 configure|   21 +
 hw/host-pci-device.c |  223 +++
 hw/host-pci-device.h |   38 +
 hw/pci.c |   46 +
 hw/pci.h |3 +
 hw/pci_ids.h |1 +
 hw/pci_regs.h|3 +-
 hw/xen_pci_passthrough.c |  763 +++
 hw/xen_pci_passthrough.h |  335 +
 hw/xen_pci_passthrough_config_init.c | 2489 ++
 hw/xen_pci_passthrough_helpers.c |   46 +
 hw/xen_pci_passthrough_msi.c |  674 +
 qemu-timer.c |   15 +
 qemu-timer.h |3 +
 15 files changed, 4666 insertions(+), 1 deletions(-)
 create mode 100644 hw/host-pci-device.c
 create mode 100644 hw/host-pci-device.h
 create mode 100644 hw/xen_pci_passthrough.c
 create mode 100644 hw/xen_pci_passthrough.h
 create mode 100644 hw/xen_pci_passthrough_config_init.c
 create mode 100644 hw/xen_pci_passthrough_helpers.c
 create mode 100644 hw/xen_pci_passthrough_msi.c

-- 
Anthony PERARD

[Qemu-devel] [PATCHv3] s390: Fix cpu shutdown for KVM

2011-10-04 Thread Christian Borntraeger

On s390 a shutdown is the state of all CPUs being either stopped
or disabled (for interrupts) waiting. We have to track the overall
number of running CPUs to call the shutdown sequence accordingly.
This patch implements the counting and shutdown handling for the 
kvm path in qemu.
Lets also wrap changes to env->halted and env->exception_index.

Signed-off-by: Christian Borntraeger 
---
 hw/s390-virtio.c   |   32 ++--
 target-s390x/cpu.h |2 ++
 target-s390x/kvm.c |   19 +++
 3 files changed, 39 insertions(+), 14 deletions(-)

Index: b/hw/s390-virtio.c
===
--- a/hw/s390-virtio.c
+++ b/hw/s390-virtio.c
@@ -130,6 +130,34 @@ int s390_virtio_hypercall(CPUState *env,
 return r;
 }
 
+/*
+ * The number of running CPUs. On s390 a shutdown is the state of all CPUs
+ * being either stopped or disabled (for interrupts) waiting. We have to
+ * track this number to call the shutdown sequence accordingly. This
+ * number is modified either on startup or while holding the big qemu lock.
+ */
+static unsigned s390_running_cpus;
+
+void s390_add_running_cpu(CPUState *env)
+{
+if (env->halted) {
+s390_running_cpus++;
+env->halted = 0;
+env->exception_index = -1;
+}
+}
+
+unsigned s390_del_running_cpu(CPUState *env)
+{
+if (env->halted == 0) {
+assert(s390_running_cpus >= 1);
+s390_running_cpus--;
+env->halted = 1;
+env->exception_index = EXCP_HLT;
+}
+return s390_running_cpus;
+}
+
 /* PC hardware initialisation */
 static void s390_init(ram_addr_t my_ram_size,
   const char *boot_device,
@@ -187,8 +215,8 @@ static void s390_init(ram_addr_t my_ram_
 tmp_env->storage_keys = storage_keys;
 }
 
-env->halted = 0;
-env->exception_index = 0;
+/* One CPU has to run */
+s390_add_running_cpu(env);
 
 if (kernel_filename) {
 kernel_size = load_image(kernel_filename, qemu_get_ram_ptr(0));
Index: b/target-s390x/cpu.h
===
--- a/target-s390x/cpu.h
+++ b/target-s390x/cpu.h
@@ -309,6 +309,8 @@ static inline void kvm_s390_interrupt_in
 }
 #endif
 CPUState *s390_cpu_addr2state(uint16_t cpu_addr);
+void s390_add_running_cpu(CPUState *env);
+unsigned s390_del_running_cpu(CPUState *env);
 
 /* from s390-virtio-bus */
 extern const target_phys_addr_t virtio_size;
Index: b/target-s390x/kvm.c
===
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -185,8 +185,7 @@ void kvm_s390_interrupt_internal(CPUStat
 return;
 }
 
-env->halted = 0;
-env->exception_index = -1;
+s390_add_running_cpu(env);
 qemu_cpu_kick(env);
 
 kvmint.type = type;
@@ -299,8 +298,7 @@ static int handle_diag(CPUState *env, st
 static int s390_cpu_restart(CPUState *env)
 {
 kvm_s390_interrupt(env, KVM_S390_RESTART, 0);
-env->halted = 0;
-env->exception_index = -1;
+s390_add_running_cpu(env);
 qemu_cpu_kick(env);
 dprintf("DONE: SIGP cpu restart: %p\n", env);
 return 0;
@@ -425,17 +423,16 @@ static int handle_intercept(CPUState *en
 r = handle_instruction(env, run);
 break;
 case ICPT_WAITPSW:
-/* XXX What to do on system shutdown? */
-env->halted = 1;
-env->exception_index = EXCP_HLT;
+case ICPT_CPU_STOP:
+if (s390_del_running_cpu(env) == 0) {
+qemu_system_shutdown_request();
+}
+r = EXCP_HALTED;
 break;
 case ICPT_SOFT_INTERCEPT:
 fprintf(stderr, "KVM unimplemented icpt SOFT\n");
 exit(1);
 break;
-case ICPT_CPU_STOP:
-qemu_system_shutdown_request();
-break;
 case ICPT_IO:
 fprintf(stderr, "KVM unimplemented icpt IO\n");
 exit(1);
@@ -468,8 +465,6 @@ int kvm_arch_handle_exit(CPUState *env, 
 
 if (ret == 0) {
 ret = EXCP_INTERRUPT;
-} else if (ret > 0) {
-ret = 0;
 }
 return ret;
 }

[Qemu-devel] [PATCH RFC V1 05/11] pci_regs: Add PCI_EXP_TYPE_PCIE_BRIDGE

2011-10-04 Thread Anthony PERARD

Signed-off-by: Anthony PERARD 
---
 hw/pci_regs.h |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/hw/pci_regs.h b/hw/pci_regs.h
index dad7d9a..091a749 100644
--- a/hw/pci_regs.h
+++ b/hw/pci_regs.h
@@ -391,6 +391,7 @@
 #define  PCI_EXP_TYPE_UPSTREAM 0x5 /* Upstream Port */
 #define  PCI_EXP_TYPE_DOWNSTREAM 0x6   /* Downstream Port */
 #define  PCI_EXP_TYPE_PCI_BRIDGE 0x7   /* PCI/PCI-X Bridge */
+#define  PCI_EXP_TYPE_PCIE_BRIDGE 0x8  /* PCI/PCI-X to PCIE Bridge */
 #define  PCI_EXP_TYPE_RC_END   0x9 /* Root Complex Integrated Endpoint */
 #define  PCI_EXP_TYPE_RC_EC0xa /* Root Complex Event Collector */
 #define PCI_EXP_FLAGS_SLOT 0x0100  /* Slot implemented */
-- 
Anthony PERARD

[Qemu-devel] [PATCH RFC V1 07/11] host-pci-device: Add host_pci_find_ext_cap_offset

2011-10-04 Thread Anthony PERARD

Signed-off-by: Anthony PERARD 
---
 hw/host-pci-device.c |   31 +++
 hw/host-pci-device.h |2 ++
 2 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/hw/host-pci-device.c b/hw/host-pci-device.c
index b3f2899..2a889d5 100644
--- a/hw/host-pci-device.c
+++ b/hw/host-pci-device.c
@@ -162,6 +162,37 @@ int host_pci_write_block(HostPCIDevice *d, int pos, 
uint8_t *buf, int len)
   return host_pci_config_write(d, pos, buf, len);
 }
 
+uint32_t host_pci_find_ext_cap_offset(HostPCIDevice *d, uint32_t cap)
+{
+uint32_t header = 0;
+int max_cap = 480;
+int pos = 0x100;
+
+do {
+header = host_pci_read_long(d, pos);
+/*
+ * If we have no capabilities, this is indicated by cap ID,
+ * cap version and next pointer all being 0.
+ */
+if (header == 0) {
+break;
+}
+
+if (PCI_EXT_CAP_ID(header) == cap) {
+return pos;
+}
+
+pos = PCI_EXT_CAP_NEXT(header);
+if (pos < 0x100) {
+break;
+}
+
+max_cap--;
+} while (max_cap > 0);
+
+return 0;
+}
+
 HostPCIDevice *host_pci_device_get(uint8_t bus, uint8_t dev, uint8_t func)
 {
 HostPCIDevice *d = NULL;
diff --git a/hw/host-pci-device.h b/hw/host-pci-device.h
index 0137507..2734eb3 100644
--- a/hw/host-pci-device.h
+++ b/hw/host-pci-device.h
@@ -33,4 +33,6 @@ int host_pci_write_word(HostPCIDevice *d, int pos, uint16_t 
data);
 int host_pci_write_long(HostPCIDevice *d, int pos, uint32_t data);
 int host_pci_write_block(HostPCIDevice *d, int pos, uint8_t *buf, int len);
 
+uint32_t host_pci_find_ext_cap_offset(HostPCIDevice *s, uint32_t cap);
+
 #endif /* !HW_HOST_PCI_DEVICE */
-- 
Anthony PERARD

Re: [Qemu-devel] [PATCH] ppc64: Fix linker script

2011-10-04 Thread Richard Henderson

On 10/04/2011 08:14 AM, Andreas Färber wrote:
> Since commit 8733f609 (Fix linker scripts) linking on Linux/ppc64 fails:

It occurs to me to wonder if we ought to simply auto-detect the presence
of the -Ttext-segment ADDR option.  If that's present, don't override the
linker script and all the changes that might have been introduced therein.


r~

[Qemu-devel] [PATCH RFC V1 03/11] pci_ids: Add INTEL_82599_VF id.

2011-10-04 Thread Anthony PERARD

Signed-off-by: Anthony PERARD 
---
 hw/pci_ids.h |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/hw/pci_ids.h b/hw/pci_ids.h
index 83f3893..2ea5ec2 100644
--- a/hw/pci_ids.h
+++ b/hw/pci_ids.h
@@ -117,6 +117,7 @@
 #define PCI_DEVICE_ID_INTEL_82801I_UHCI6 0x2939
 #define PCI_DEVICE_ID_INTEL_82801I_EHCI1 0x293a
 #define PCI_DEVICE_ID_INTEL_82801I_EHCI2 0x293c
+#define PCI_DEVICE_ID_INTEL_82599_VF 0x10ed
 
 #define PCI_VENDOR_ID_XEN   0x5853
 #define PCI_DEVICE_ID_XEN_PLATFORM  0x0001
-- 
Anthony PERARD

[Qemu-devel] [PATCH 3/4] savevm: improve subsections detection on load

2011-10-04 Thread Juan Quintela

We add qemu_peek_buffer, that is identical to qemu_get_buffer, just
that it don't update f->buf_index.

We add a paramenter to qemu_peek_byte() to be able to peek more than
one byte.

Once this is done, to see if we have a subsection we look:
- 1st byte is QEMU_VM_SUBSECTION
- 2nd byte is a length, and is bigger than section name
- 3rd element is a string that starts with section_name

So, we shouldn't have false positives (yes, content could still get us
wrong but probabilities are really low).

Signed-off-by: Juan Quintela 
---
 savevm.c |   71 -
 1 files changed, 60 insertions(+), 11 deletions(-)

diff --git a/savevm.c b/savevm.c
index 5fee4e2..db6ea12 100644
--- a/savevm.c
+++ b/savevm.c
@@ -532,6 +532,37 @@ void qemu_put_byte(QEMUFile *f, int v)
 qemu_fflush(f);
 }

+static int qemu_peek_buffer(QEMUFile *f, uint8_t *buf, int size1, int offset)
+{
+int size, l;
+int index = f->buf_index + offset;
+
+if (f->is_write) {
+abort();
+}
+
+size = size1;
+while (size > 0) {
+l = f->buf_size - index;
+if (l == 0) {
+qemu_fill_buffer(f);
+index = f->buf_index + offset;
+l = f->buf_size - index;
+if (l == 0) {
+break;
+}
+}
+if (l > size) {
+l = size;
+}
+memcpy(buf, f->buf + index, l);
+index += l;
+buf += l;
+size -= l;
+}
+return size1 - size;
+}
+
 int qemu_get_buffer(QEMUFile *f, uint8_t *buf, int size1)
 {
 int size, l;
@@ -561,19 +592,22 @@ int qemu_get_buffer(QEMUFile *f, uint8_t *buf, int size1)
 return size1 - size;
 }

-static int qemu_peek_byte(QEMUFile *f)
+static int qemu_peek_byte(QEMUFile *f, int offset)
 {
+int index = f->buf_index + offset;
+
 if (f->is_write) {
 abort();
 }

-if (f->buf_index >= f->buf_size) {
+if (index >= f->buf_size) {
 qemu_fill_buffer(f);
-if (f->buf_index >= f->buf_size) {
+index = f->buf_index + offset;
+if (index >= f->buf_size) {
 return 0;
 }
 }
-return f->buf[f->buf_index];
+return f->buf[index];
 }

 int qemu_get_byte(QEMUFile *f)
@@ -1687,22 +1721,37 @@ static int vmstate_subsection_load(QEMUFile *f, const 
VMStateDescription *vmsd,
 return 0;
 }

-while (qemu_peek_byte(f) == QEMU_VM_SUBSECTION) {
+while (qemu_peek_byte(f, 0) == QEMU_VM_SUBSECTION) {
 char idstr[256];
 int ret;
-uint8_t version_id, len;
+uint8_t version_id, len, size;
 const VMStateDescription *sub_vmsd;

-qemu_get_byte(f); /* subsection */
-len = qemu_get_byte(f);
-qemu_get_buffer(f, (uint8_t *)idstr, len);
-idstr[len] = 0;
-version_id = qemu_get_be32(f);
+len = qemu_peek_byte(f, 1);
+if (len < strlen(vmsd->name) + 1) {
+/* subsection name has be be "section_name/a" */
+return 0;
+}
+size = qemu_peek_buffer(f, (uint8_t *)idstr, len, 2);
+if (size != len) {
+return 0;
+}
+idstr[size] = 0;

+if (strncmp(vmsd->name, idstr, strlen(vmsd->name)) != 0) {
+/* it don't have a valid subsection name */
+return 0;
+}
 sub_vmsd = vmstate_get_subsection(sub, idstr);
 if (sub_vmsd == NULL) {
 return -ENOENT;
 }
+qemu_get_byte(f); /* subsection */
+qemu_get_byte(f); /* len  */
+qemu_get_buffer(f, (uint8_t *)idstr, len);
+idstr[len] = 0;
+version_id = qemu_get_be32(f);
+
 assert(!sub_vmsd->subsections);
 ret = vmstate_load_state(f, sub_vmsd, opaque, version_id);
 if (ret) {
-- 
1.7.6.4

Re: [Qemu-devel] [PATCHv3] s390: Fix cpu shutdown for KVM

2011-10-04 Thread Alexander Graf


On 10/04/2011 05:20 PM, Christian Borntraeger wrote:

On s390 a shutdown is the state of all CPUs being either stopped
or disabled (for interrupts) waiting. We have to track the overall
number of running CPUs to call the shutdown sequence accordingly.
This patch implements the counting and shutdown handling for the
kvm path in qemu.
Lets also wrap changes to env->halted and env->exception_index.

Signed-off-by: Christian Borntraeger
---
  hw/s390-virtio.c   |   32 ++--
  target-s390x/cpu.h |2 ++
  target-s390x/kvm.c |   19 +++
  3 files changed, 39 insertions(+), 14 deletions(-)

Index: b/hw/s390-virtio.c
===
--- a/hw/s390-virtio.c
+++ b/hw/s390-virtio.c
@@ -130,6 +130,34 @@ int s390_virtio_hypercall(CPUState *env,
  return r;
  }

+/*
+ * The number of running CPUs. On s390 a shutdown is the state of all CPUs
+ * being either stopped or disabled (for interrupts) waiting. We have to
+ * track this number to call the shutdown sequence accordingly. This
+ * number is modified either on startup or while holding the big qemu lock.
+ */
+static unsigned s390_running_cpus;
+
+void s390_add_running_cpu(CPUState *env)
+{
+if (env->halted) {
+s390_running_cpus++;
+env->halted = 0;
+env->exception_index = -1;
+}
+}
+
+unsigned s390_del_running_cpu(CPUState *env)
+{
+if (env->halted == 0) {
+assert(s390_running_cpus>= 1);
+s390_running_cpus--;
+env->halted = 1;
+env->exception_index = EXCP_HLT;
+}
+return s390_running_cpus;
+}
+
  /* PC hardware initialisation */
  static void s390_init(ram_addr_t my_ram_size,
const char *boot_device,
@@ -187,8 +215,8 @@ static void s390_init(ram_addr_t my_ram_
  tmp_env->storage_keys = storage_keys;
  }

-env->halted = 0;
-env->exception_index = 0;
+/* One CPU has to run */
+s390_add_running_cpu(env);

  if (kernel_filename) {
  kernel_size = load_image(kernel_filename, qemu_get_ram_ptr(0));
Index: b/target-s390x/cpu.h
===
--- a/target-s390x/cpu.h
+++ b/target-s390x/cpu.h
@@ -309,6 +309,8 @@ static inline void kvm_s390_interrupt_in
  }
  #endif
  CPUState *s390_cpu_addr2state(uint16_t cpu_addr);
+void s390_add_running_cpu(CPUState *env);
+unsigned s390_del_running_cpu(CPUState *env);

  /* from s390-virtio-bus */
  extern const target_phys_addr_t virtio_size;
Index: b/target-s390x/kvm.c
===
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -185,8 +185,7 @@ void kvm_s390_interrupt_internal(CPUStat
  return;
  }

-env->halted = 0;
-env->exception_index = -1;
+s390_add_running_cpu(env);
  qemu_cpu_kick(env);

  kvmint.type = type;
@@ -299,8 +298,7 @@ static int handle_diag(CPUState *env, st
  static int s390_cpu_restart(CPUState *env)
  {
  kvm_s390_interrupt(env, KVM_S390_RESTART, 0);
-env->halted = 0;
-env->exception_index = -1;
+s390_add_running_cpu(env);
  qemu_cpu_kick(env);
  dprintf("DONE: SIGP cpu restart: %p\n", env);
  return 0;
@@ -425,17 +423,16 @@ static int handle_intercept(CPUState *en
  r = handle_instruction(env, run);
  break;
  case ICPT_WAITPSW:
-/* XXX What to do on system shutdown? */
-env->halted = 1;
-env->exception_index = EXCP_HLT;
+case ICPT_CPU_STOP:
+if (s390_del_running_cpu(env) == 0) {
+qemu_system_shutdown_request();
+}
+r = EXCP_HALTED;
  break;
  case ICPT_SOFT_INTERCEPT:
  fprintf(stderr, "KVM unimplemented icpt SOFT\n");
  exit(1);
  break;
-case ICPT_CPU_STOP:
-qemu_system_shutdown_request();
-break;
  case ICPT_IO:
  fprintf(stderr, "KVM unimplemented icpt IO\n");
  exit(1);
@@ -468,8 +465,6 @@ int kvm_arch_handle_exit(CPUState *env,

  if (ret == 0) {
  ret = EXCP_INTERRUPT;
-} else if (ret>  0) {
-ret = 0;


Hrm. Are you sure that this doesn't break anything? The rest looks good.


Alex

Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/2] tcg/ppc*: Move cache initialization to ppc specific code

2011-10-04 Thread Scott Wood

On 10/04/2011 12:55 AM, Stefan Weil wrote:
> Am 03.10.2011 23:40, schrieb Scott Wood:
>> The interface isn't powerpc-specific. It just happens to be the only
>> arch so far that qemu supports that needs the implementation to do
>> something (or possibly just the only one where that need has been
>> discovered).
>>
>> What problem is it causing the way it is?
> 
> My patch was triggered by this discussion thread and a remark from
> Peter Maydell:
> 
> http://lists.nongnu.org/archive/html/qemu-devel/2011-09/msg02272.html
> 
> The only problem is that ppc differs from all other hosts:
> 
> * No other host needs qemu_cache_utils_init() today.
> 
> * All other hosts define flush_icache_range() in tcg-target.h.
> 
> * All other hosts only call flush_icache_range() from tcg code.
> 
> I learned now that ppc will need flush_icache_range() for kvm, too.
> So it won't be possible to implement a uniform handling of
> flush_icache_range() for all host architectures.

This doesn't smell right... why would other arches need it for TCG, but
not KVM?  Either you need to flush when writing code to memory, or you
don't.

Is it just that KVM isn't implemented yet on those architectures?

-Scott

Re: [Qemu-devel] [PATCH] i386: wire up MSR_IA32_MISC_ENABLE

2011-10-04 Thread Jan Kiszka

On 2011-10-04 16:26, Avi Kivity wrote:
> It's needed for its default value - bit 0 specifies that "rep movs" is
> good enough for memcpy, and Linux may use a slower memcpu if it is not set,
> depending on cpu family/model.
> 
> Signed-off-by: Avi Kivity 
> ---
>  target-i386/cpu.h   |5 +
>  target-i386/helper.c|1 +
>  target-i386/kvm.c   |   15 +++
>  target-i386/machine.c   |   21 +
>  target-i386/op_helper.c |6 ++
>  5 files changed, 48 insertions(+), 0 deletions(-)
> 
> diff --git a/target-i386/cpu.h b/target-i386/cpu.h
> index ae36489..5416809 100644
> --- a/target-i386/cpu.h
> +++ b/target-i386/cpu.h
> @@ -299,6 +299,10 @@
>  
>  #define MSR_IA32_PERF_STATUS0x198
>  
> +#define MSR_IA32_MISC_ENABLE 0x1a0

I smell tabs...

> +/* Indicates good rep/movs microcode on some processors: */
> +#define MSR_IA32_MISC_ENABLE_DEFAULT1
> +
>  #define MSR_MTRRphysBase(reg)(0x200 + 2 * (reg))
>  #define MSR_MTRRphysMask(reg)(0x200 + 2 * (reg) + 1)
>  
> @@ -689,6 +693,7 @@ typedef struct CPUX86State {
>  uint64_t tsc;
>  
>  uint64_t mcg_status;
> +uint64_t msr_ia32_misc_enable;
>  
>  /* exception/interrupt handling */
>  int error_code;
> diff --git a/target-i386/helper.c b/target-i386/helper.c
> index 5df40d4..6c6a167 100644
> --- a/target-i386/helper.c
> +++ b/target-i386/helper.c
> @@ -98,6 +98,7 @@ void cpu_reset(CPUX86State *env)
>  env->mxcsr = 0x1f80;
>  
>  env->pat = 0x0007040600070406ULL;
> +env->msr_ia32_misc_enable = MSR_IA32_MISC_ENABLE_DEFAULT;
>  
>  memset(env->dr, 0, sizeof(env->dr));
>  env->dr[6] = DR6_FIXED_1;
> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> index b6eef04..b3a650b 100644
> --- a/target-i386/kvm.c
> +++ b/target-i386/kvm.c
> @@ -60,6 +60,7 @@
>  static bool has_msr_star;
>  static bool has_msr_hsave_pa;
>  static bool has_msr_async_pf_en;
> +static bool has_msr_misc_enable;
>  static int lm_capable_kernel;
>  
>  static struct kvm_cpuid2 *try_get_cpuid(KVMState *s, int max)
> @@ -568,6 +569,10 @@ static int kvm_get_supported_msrs(KVMState *s)
>  has_msr_hsave_pa = true;
>  continue;
>  }
> +if (kvm_msr_list->indices[i] == MSR_IA32_MISC_ENABLE) {
> +has_msr_misc_enable = true;
> +continue;
> +}
>  }
>  }
>  
> @@ -881,6 +886,10 @@ static int kvm_put_msrs(CPUState *env, int level)
>  if (has_msr_hsave_pa) {
>  kvm_msr_entry_set(&msrs[n++], MSR_VM_HSAVE_PA, env->vm_hsave);
>  }
> +if (has_msr_misc_enable) {
> +kvm_msr_entry_set(&msrs[n++], MSR_IA32_MISC_ENABLE,
> +  env->msr_ia32_misc_enable);
> +}
>  #ifdef TARGET_X86_64
>  if (lm_capable_kernel) {
>  kvm_msr_entry_set(&msrs[n++], MSR_CSTAR, env->cstar);
> @@ -1127,6 +1136,9 @@ static int kvm_get_msrs(CPUState *env)
>  if (has_msr_hsave_pa) {
>  msrs[n++].index = MSR_VM_HSAVE_PA;
>  }
> +if (has_msr_misc_enable) {
> +msrs[n++].index = MSR_IA32_MISC_ENABLE;
> +}
>  
>  if (!env->tsc_valid) {
>  msrs[n++].index = MSR_IA32_TSC;
> @@ -1210,6 +1222,9 @@ static int kvm_get_msrs(CPUState *env)
>  case MSR_MCG_CTL:
>  env->mcg_ctl = msrs[i].data;
>  break;
> +case MSR_IA32_MISC_ENABLE:
> +env->msr_ia32_misc_enable = msrs[i].data;
> +break;
>  default:
>  if (msrs[i].index >= MSR_MC0_CTL &&
>  msrs[i].index < MSR_MC0_CTL + (env->mcg_cap & 0xff) * 4) {
> diff --git a/target-i386/machine.c b/target-i386/machine.c
> index 9aca8e0..73c1ffe 100644
> --- a/target-i386/machine.c
> +++ b/target-i386/machine.c
> @@ -310,6 +310,24 @@ static bool fpop_ip_dp_needed(void *opaque)
>  }
>  };
>  
> +static bool misc_enable_needed(void *opaque)
> +{
> +CPUState *env = opaque;
> +
> +return env->msr_ia32_misc_enable != MSR_IA32_MISC_ENABLE_DEFAULT;
> +}
> +
> +static const VMStateDescription vmstate_msr_ia32_misc_enable = {
> +.name = "cpu/msr_ia32_misc_enable",
> +.version_id = 1,
> +.minimum_version_id = 1,
> +.minimum_version_id_old = 1,
> +.fields  = (VMStateField []) {
> +VMSTATE_UINT64(msr_ia32_misc_enable, CPUState),
> +VMSTATE_END_OF_LIST()
> +}
> +};
> +

We are about to bump the CPU_SAVE_VERSION for the sake of APIC deadline
timer, so you can jump on that train and avoid this subsection.

>  static const VMStateDescription vmstate_cpu = {
>  .name = "cpu",
>  .version_id = CPU_SAVE_VERSION,
> @@ -420,6 +438,9 @@ static bool fpop_ip_dp_needed(void *opaque)
>  } , {
>  .vmsd = &vmstate_fpop_ip_dp,
>  .needed = fpop_ip_dp_needed,
> +}, {
> +.vmsd = &vmstate_msr_ia32_misc_enable,
> +.needed = misc_enable_nee

[Qemu-devel] [PATCH 0/2] HDA fixes for Windows 7

2011-10-04 Thread Marc-André Lureau

Hi,

There are two related bugs in stream state and xfer handling due to the
conflict of stream number (on Windows 7, input stream and output stream
both use the full range of 1..15 values. on Linux, the handling is
different and it works fine because there is no clash)

The patches handle stream number for input and output stream
distinctively to avoid conflict, and it seems to be in accordance
to the spec.

Marc-André Lureau (2):
  hda: do not mix output and input streams, RHBZ #740493
  hda: do not mix output and input stream states, RHBZ #740493

 hw/hda-audio.c |   15 +--
 hw/intel-hda.c |   18 ++
 hw/intel-hda.h |2 +-
 3 files changed, 20 insertions(+), 15 deletions(-)

-- 
1.7.6.2

[Qemu-devel] [PATCH 1/2] hda: do not mix output and input streams, RHBZ #740493

2011-10-04 Thread Marc-André Lureau

Windows 7 may use the same stream number for input and output.
That will result in lot of garbage on playback.

The hardcoded value of 4 needs to be in sync with GCAP streams
description and IN/OUT registers.
---
 hw/intel-hda.c |9 +
 1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/hw/intel-hda.c b/hw/intel-hda.c
index 4272204..c6a3fec 100644
--- a/hw/intel-hda.c
+++ b/hw/intel-hda.c
@@ -389,14 +389,15 @@ static bool intel_hda_xfer(HDACodecDevice *dev, uint32_t 
stnr, bool output,
 {
 HDACodecBus *bus = DO_UPCAST(HDACodecBus, qbus, dev->qdev.parent_bus);
 IntelHDAState *d = container_of(bus, IntelHDAState, codecs);
-IntelHDAStream *st = NULL;
 target_phys_addr_t addr;
 uint32_t s, copy, left;
+IntelHDAStream *st;
 bool irq = false;
 
-for (s = 0; s < ARRAY_SIZE(d->st); s++) {
-if (stnr == ((d->st[s].ctl >> 20) & 0x0f)) {
-st = d->st + s;
+st = output ? d->st + 4 : d->st;
+for (s = 0; s < 4; s++) {
+if (stnr == ((st[s].ctl >> 20) & 0x0f)) {
+st = st + s;
 break;
 }
 }
-- 
1.7.6.2

[Qemu-devel] [PATCH 2/2] hda: do not mix output and input stream states, RHBZ #740493

2011-10-04 Thread Marc-André Lureau

Windows 7 may use the same stream number for input and output.
Current code will confuse streams.

NB: I wonder if this patch breaks migration code because of
this change:
-VMSTATE_BOOL_ARRAY(running, HDAAudioState, 16),
+VMSTATE_BOOL_ARRAY(running, HDAAudioState, 2 * 16),
---
 hw/hda-audio.c |   15 +--
 hw/intel-hda.c |9 +
 hw/intel-hda.h |2 +-
 3 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/hw/hda-audio.c b/hw/hda-audio.c
index 03c0a24..dacf290 100644
--- a/hw/hda-audio.c
+++ b/hw/hda-audio.c
@@ -462,7 +462,7 @@ struct HDAAudioState {
 QEMUSoundCard card;
 const desc_codec *desc;
 HDAAudioStream st[4];
-bool running[16];
+bool running[2 * 16];
 
 /* properties */
 uint32_t debug;
@@ -659,7 +659,7 @@ static void hda_audio_command(HDACodecDevice *hda, uint32_t 
nid, uint32_t data)
 st->channel = payload & 0x0f;
 dprint(a, 2, "%s: stream %d, channel %d\n",
st->node->name, st->stream, st->channel);
-hda_audio_set_running(st, a->running[st->stream]);
+hda_audio_set_running(st, a->running[st->output * 16 + st->stream]);
 hda_codec_response(hda, true, 0);
 break;
 case AC_VERB_GET_CONV:
@@ -742,16 +742,19 @@ fail:
 hda_codec_response(hda, true, 0);
 }
 
-static void hda_audio_stream(HDACodecDevice *hda, uint32_t stnr, bool running)
+static void hda_audio_stream(HDACodecDevice *hda, uint32_t stnr, bool running, 
bool output)
 {
 HDAAudioState *a = DO_UPCAST(HDAAudioState, hda, hda);
 int s;
 
-a->running[stnr] = running;
+a->running[output * 16 + stnr] = running;
 for (s = 0; s < ARRAY_SIZE(a->st); s++) {
 if (a->st[s].node == NULL) {
 continue;
 }
+if (a->st[s].output != output) {
+continue;
+}
 if (a->st[s].stream != stnr) {
 continue;
 }
@@ -840,7 +843,7 @@ static int hda_audio_post_load(void *opaque, int version)
 hda_codec_parse_fmt(st->format, &st->as);
 hda_audio_setup(st);
 hda_audio_set_amp(st);
-hda_audio_set_running(st, a->running[st->stream]);
+hda_audio_set_running(st, a->running[st->output * 16 + st->stream]);
 }
 return 0;
 }
@@ -870,7 +873,7 @@ static const VMStateDescription vmstate_hda_audio = {
 VMSTATE_STRUCT_ARRAY(st, HDAAudioState, 4, 0,
  vmstate_hda_audio_stream,
  HDAAudioStream),
-VMSTATE_BOOL_ARRAY(running, HDAAudioState, 16),
+VMSTATE_BOOL_ARRAY(running, HDAAudioState, 2 * 16),
 VMSTATE_END_OF_LIST()
 }
 };
diff --git a/hw/intel-hda.c b/hw/intel-hda.c
index c6a3fec..f97775c 100644
--- a/hw/intel-hda.c
+++ b/hw/intel-hda.c
@@ -485,7 +485,7 @@ static void intel_hda_parse_bdl(IntelHDAState *d, 
IntelHDAStream *st)
 st->bp= 0;
 }
 
-static void intel_hda_notify_codecs(IntelHDAState *d, uint32_t stream, bool 
running)
+static void intel_hda_notify_codecs(IntelHDAState *d, uint32_t stream, bool 
running, bool output)
 {
 DeviceState *qdev;
 HDACodecDevice *cdev;
@@ -493,7 +493,7 @@ static void intel_hda_notify_codecs(IntelHDAState *d, 
uint32_t stream, bool runn
 QLIST_FOREACH(qdev, &d->codecs.qbus.children, sibling) {
 cdev = DO_UPCAST(HDACodecDevice, qdev, qdev);
 if (cdev->info->stream) {
-cdev->info->stream(cdev, stream, running);
+cdev->info->stream(cdev, stream, running, output);
 }
 }
 }
@@ -567,6 +567,7 @@ static void intel_hda_set_ics(IntelHDAState *d, const 
IntelHDAReg *reg, uint32_t
 
 static void intel_hda_set_st_ctl(IntelHDAState *d, const IntelHDAReg *reg, 
uint32_t old)
 {
+bool output = reg->stream >= 4;
 IntelHDAStream *st = d->st + reg->stream;
 
 if (st->ctl & 0x01) {
@@ -582,11 +583,11 @@ static void intel_hda_set_st_ctl(IntelHDAState *d, const 
IntelHDAReg *reg, uint3
 dprint(d, 1, "st #%d: start %d (ring buf %d bytes)\n",
reg->stream, stnr, st->cbl);
 intel_hda_parse_bdl(d, st);
-intel_hda_notify_codecs(d, stnr, true);
+intel_hda_notify_codecs(d, stnr, true, output);
 } else {
 /* stop */
 dprint(d, 1, "st #%d: stop %d\n", reg->stream, stnr);
-intel_hda_notify_codecs(d, stnr, false);
+intel_hda_notify_codecs(d, stnr, false, output);
 }
 }
 intel_hda_update_irq(d);
diff --git a/hw/intel-hda.h b/hw/intel-hda.h
index 4e44e38..65fd2a8 100644
--- a/hw/intel-hda.h
+++ b/hw/intel-hda.h
@@ -34,7 +34,7 @@ struct HDACodecDeviceInfo {
 int (*init)(HDACodecDevice *dev);
 int (*exit)(HDACodecDevice *dev);
 void (*command)(HDACodecDevice *dev, uint32_t nid, uint32_t data);
-void (*stream)(HDACodecDevice *dev, uint32_t stnr, bool running);
+void (*stream)(HDACodecDevice *dev, uint32_t stnr, bool running, bool 
output);
 };
 
 void hda_codec_bus_

Re: [Qemu-devel] [Xen-devel] [PATCH RFC V1 00/11] Xen PCI Passthrough

2011-10-04 Thread Alex Williamson

On Tue, 2011-10-04 at 16:05 +0100, Stefano Stabellini wrote:
> On Tue, 4 Oct 2011, Anthony Liguori wrote:
> > On 10/04/2011 09:58 AM, Avi Kivity wrote:
> > > On 10/04/2011 04:51 PM, Anthony PERARD wrote:
> > >> Hi all,
> > >>
> > >> This patch series introduce the PCI passthrough for Xen.
> > >>
> > >> First, we have HostPCIDevice that help to access one PCI device of the 
> > >> host.
> > >>
> > >> Then, there are several additions in the QEMU code. One is 
> > >> qemu_run_one_timer
> > >> to run a specific timer. It is used by PCI passthrough to run a timer 
> > >> about
> > >> power management. Another is pci_check_bar_overlap.
> > >>
> > >> There are also several change in pci_ids and pci_regs.
> > >>
> > >> Last part, but not least, the PCI passthrough device himself. Cut in 3 
> > >> parts
> > >> (or file), there is one to take care of the initialisation of a 
> > >> passthrough
> > >> device. The second one handle everything about the config address space, 
> > >> there
> > >> are specifics functions for every config register. The third one is to 
> > >> handle
> > >> MSI.
> > >>
> > >> I'm still working on setting a PCI passthrough device through QMP from 
> > >> libxl
> > >> (xen tool stack). It is just a call to device_add, with the driver 
> > >> parametter
> > >> hostaddr=":00:1b.0".
> > >>
> > >> There is some minor things missing:
> > >> - copyright header
> > >> - PCI IO space multiplexer
> > >>
> > >>
> > >
> > > We also have pci passthrough in qemu-kvm (I think based on the same 
> > > Neocleus
> > > code). Rather than having two pci assignment implementations, I think we 
> > > should
> > > have just one, with the differences (programming the hypervisor) 
> > > abstracted at
> > > that level.
> > 
> > I agree in principle but how close is qemu-kvm pci passthrough to a 
> > mergable 
> > state?  Would it make sense to merge the Xen code first and then abstract 
> > it?
> 
> I think it should be fairly easy to abstract the current xen code: just
> a matter of providing memory, ioport and interrupt mapping functions.

I thought we were potentially looking at vfio as a convergence point.
I'm still a bit off from having a vfio re-write ready to submit, but is
this still a possibility?  Thanks,

Alex

Re: [Qemu-devel] build errors on s390 tcg

2011-10-04 Thread Stefan Weil


Am 04.10.2011 14:44, schrieb Peter Maydell:

On 4 October 2011 13:23, Alexander Graf  wrote:
The problem is that tcg.c defines the functions with int arguments 
for TCG
register indexes, while s390/tcg-target.c takes TCGReg parameters. 
I'm not

sure which way is better, but using TCGReg feels more type safe to me.


You're not consistent, though; for example the s390 
tcg_target_reg_alloc_order

array is declared as int[], not TCGReg[].

-- PMM


I did not notice that s390 was different when I wrote my patch,
but I agree with Alex that TCGReg is more type safe. I also
think that using TCGReg documents the meaning of a parameter
better than using a simple int.

So there are two tasks to be done:

- Look for inconsistencies and fix them.

- Add and use TCGReg for all other targets, too.

Or are there any reasons why we should keep the int type parameters?

Regards,
Stefan Weil

Re: [Qemu-devel] [PATCH 2/2] LAPIC: make lapic support cpu hotplug

2011-10-04 Thread Jan Kiszka

On 2011-10-04 13:13, pingf...@linux.vnet.com wrote:
> From: Liu Ping Fan 
> 
> Separate apic from qbus to icc bus which supports hotplug feature.

Modeling the ICC bus looks like a step in the right direction. The
IOAPIC could be attached to it as well to get rid of "ioapics[MAX_IOAPICS]".

> And make the creation of apic as part of cpu initialization, so
> apic's state has been ready, before setting kvm_apic.

There is no kvm-apic upstream yet, so it's hard to judge why we need
this here. If we do, this has to be a separate patch. But I seriously
doubt we need it (my hack worked without it, and that was not because of
its hack nature).

> 
> Signed-off-by: Liu Ping Fan 
> ---
>  Makefile.target  |1 +
>  hw/apic.c|7 -
>  hw/apic.h|1 +
>  hw/icc_bus.c |   62 
> ++
>  hw/icc_bus.h |   24 +++
>  hw/pc.c  |   11 -
>  target-i386/cpu.h|1 +
>  target-i386/helper.c |7 +-
>  target-i386/kvm.c|1 -
>  9 files changed, 105 insertions(+), 10 deletions(-)
>  create mode 100644 hw/icc_bus.c
>  create mode 100644 hw/icc_bus.h
> 
> diff --git a/Makefile.target b/Makefile.target
> index 9011f28..5607c6d 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -241,6 +241,7 @@ obj-i386-$(CONFIG_KVM) += kvmclock.o
>  obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
>  obj-i386-y += testdev.o
>  obj-i386-y += acpi.o acpi_piix4.o
> +obj-i386-y += icc_bus.o
>  
>  obj-i386-y += pcspk.o i8254.o
>  obj-i386-$(CONFIG_KVM_PIT) += i8254-kvm.o
> diff --git a/hw/apic.c b/hw/apic.c
> index 69d6ac5..95a1664 100644
> --- a/hw/apic.c
> +++ b/hw/apic.c
> @@ -24,6 +24,7 @@
>  #include "sysbus.h"
>  #include "trace.h"
>  #include "kvm.h"
> +#include "icc_bus.h"
>  
>  /* APIC Local Vector Table */
>  #define APIC_LVT_TIMER   0
> @@ -304,11 +305,13 @@ void cpu_set_apic_base(DeviceState *d, uint64_t val)
>  
>  if (!s)
>  return;
> +
>  if (kvm_enabled() && kvm_irqchip_in_kernel())
>  s->apicbase = val;
>  else
>  s->apicbase = (val & 0xf000) |
>  (s->apicbase & (MSR_IA32_APICBASE_BSP | 
> MSR_IA32_APICBASE_ENABLE));
> +
>  /* if disabled, cannot be enabled again */
>  if (!(val & MSR_IA32_APICBASE_ENABLE)) {
>  s->apicbase &= ~MSR_IA32_APICBASE_ENABLE;
> @@ -1075,7 +1078,7 @@ static const VMStateDescription vmstate_apic = {
>  }
>  };
>  
> -static void apic_reset(DeviceState *d)
> +void apic_reset(DeviceState *d)

Still unused outside of this file, so keep it private.

>  {
>  APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
>  int bsp;
> @@ -1138,7 +1141,7 @@ static SysBusDeviceInfo apic_info = {
>  
>  static void apic_register_devices(void)
>  {
> -sysbus_register_withprop(&apic_info);
> +iccbus_register(&apic_info);
>  }
>  
>  device_init(apic_register_devices)
> diff --git a/hw/apic.h b/hw/apic.h
> index c857d52..e258efa 100644
> --- a/hw/apic.h
> +++ b/hw/apic.h
> @@ -24,5 +24,6 @@ void apic_sipi(DeviceState *s);
>  /* pc.c */
>  int cpu_is_bsp(CPUState *env);
>  DeviceState *cpu_get_current_apic(void);
> +void apic_reset(DeviceState *d);
>  
>  #endif
> diff --git a/hw/icc_bus.c b/hw/icc_bus.c
> new file mode 100644
> index 000..360ca2a
> --- /dev/null
> +++ b/hw/icc_bus.c
> @@ -0,0 +1,62 @@
> +/*
> +*/
> +#define ICC_BUS_PLUG
> +#ifdef ICC_BUS_PLUG
> +#include "icc_bus.h"
> +
> +
> +
> +struct icc_bus_info icc_info = {
> +.qinfo.name = "icc",
> +.qinfo.size = sizeof(struct icc_bus),
> +.qinfo.props = (Property[]) {
> +DEFINE_PROP_END_OF_LIST(),
> +}
> +
> +};
> +
> +
> +static const VMStateDescription vmstate_icc_bus = {
> +.name = "icc_bus",
> +.version_id = 1,
> +.minimum_version_id = 1,
> +.minimum_version_id_old = 1,
> +.pre_save = NULL,
> +.post_load = NULL,
> +};
> +
> +struct icc_bus *g_iccbus;
> +
> +struct icc_bus *icc_init_bus(DeviceState *parent, const char *name)
> +{
> +struct icc_bus *bus;
> +
> +bus = FROM_QBUS(icc_bus, qbus_create(&icc_info.qinfo, parent, name));
> +bus->qbus.allow_hotplug = 1; /* Yes, we can */
> +bus->qbus.name = "icc";
> +vmstate_register(NULL, -1, &vmstate_icc_bus, bus);

The chipset is the owner of this bus and instantiates it. So it also
provides a vmstate. You can drop this unneeded one here (it's created
via an obsolete API anyway).

> +g_iccbus = bus;
> +return bus;
> +}
> +
> +
> +
> +
> +
> +static int iccbus_device_init(DeviceState *dev, DeviceInfo *base)
> +{
> +SysBusDeviceInfo *info = container_of(base, SysBusDeviceInfo, qdev);
> +
> +return info->init(sysbus_from_qdev(dev));
> +}
> +
> +void iccbus_register(SysBusDeviceInfo *info)
> +{
> +info->qdev.init = iccbus_device_init;
> +info->qdev.bus_info = &icc_info.qinfo;
> +
> +assert(info->qdev.size >= sizeof(SysBusDevice));
> +qdev_register(&info->qdev);
> +}

This

Re: [Qemu-devel] [Xen-devel] [PATCH RFC V1 00/11] Xen PCI Passthrough

2011-10-04 Thread Stefano Stabellini

On Tue, 4 Oct 2011, Alex Williamson wrote:
> On Tue, 2011-10-04 at 16:05 +0100, Stefano Stabellini wrote:
> > On Tue, 4 Oct 2011, Anthony Liguori wrote:
> > > On 10/04/2011 09:58 AM, Avi Kivity wrote:
> > > > On 10/04/2011 04:51 PM, Anthony PERARD wrote:
> > > >> Hi all,
> > > >>
> > > >> This patch series introduce the PCI passthrough for Xen.
> > > >>
> > > >> First, we have HostPCIDevice that help to access one PCI device of the 
> > > >> host.
> > > >>
> > > >> Then, there are several additions in the QEMU code. One is 
> > > >> qemu_run_one_timer
> > > >> to run a specific timer. It is used by PCI passthrough to run a timer 
> > > >> about
> > > >> power management. Another is pci_check_bar_overlap.
> > > >>
> > > >> There are also several change in pci_ids and pci_regs.
> > > >>
> > > >> Last part, but not least, the PCI passthrough device himself. Cut in 3 
> > > >> parts
> > > >> (or file), there is one to take care of the initialisation of a 
> > > >> passthrough
> > > >> device. The second one handle everything about the config address 
> > > >> space, there
> > > >> are specifics functions for every config register. The third one is to 
> > > >> handle
> > > >> MSI.
> > > >>
> > > >> I'm still working on setting a PCI passthrough device through QMP from 
> > > >> libxl
> > > >> (xen tool stack). It is just a call to device_add, with the driver 
> > > >> parametter
> > > >> hostaddr=":00:1b.0".
> > > >>
> > > >> There is some minor things missing:
> > > >> - copyright header
> > > >> - PCI IO space multiplexer
> > > >>
> > > >>
> > > >
> > > > We also have pci passthrough in qemu-kvm (I think based on the same 
> > > > Neocleus
> > > > code). Rather than having two pci assignment implementations, I think 
> > > > we should
> > > > have just one, with the differences (programming the hypervisor) 
> > > > abstracted at
> > > > that level.
> > > 
> > > I agree in principle but how close is qemu-kvm pci passthrough to a 
> > > mergable 
> > > state?  Would it make sense to merge the Xen code first and then abstract 
> > > it?
> > 
> > I think it should be fairly easy to abstract the current xen code: just
> > a matter of providing memory, ioport and interrupt mapping functions.
> 
> I thought we were potentially looking at vfio as a convergence point.
> I'm still a bit off from having a vfio re-write ready to submit, but is
> this still a possibility?  Thanks,
 

As I said at KVM Forum 2011, I am OK with vfio: it is a nice thing to
have in the long term.
However having seen how much time it took to make this code work
reliably, I think is going to take some time before we have a stable
vfio implementation.
Considering that we have just introduced upstream qemu to xen-unstable,
I would like to reach feature parity with the old qemu-xen sooner rather
than later, so that we can kill the qemu-xen fork once and for all. In
order to reach feature parity we need PCI passthrough.  Once that is
done, we could gradually modify this code to introduce vfio support,
throwing away what is not necessary and keeping what can be reused.
If you are interested, we can easily make this code work for KVM too,
otherwise we can keep it Xen specific and wait for vfio to be ready.

Re: [Qemu-devel] [Xen-devel] [PATCH RFC V1 00/11] Xen PCI Passthrough

2011-10-04 Thread Avi Kivity


On 10/04/2011 06:33 PM, Alex Williamson wrote:

I thought we were potentially looking at vfio as a convergence point.
I'm still a bit off from having a vfio re-write ready to submit, but is
this still a possibility?  Thanks,



vfio leaves out users of current and past kernels; relying on it would 
force me to maintain qemu-kvm for a while.


--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [Xen-devel] [PATCH RFC V1 00/11] Xen PCI Passthrough

2011-10-04 Thread Avi Kivity


On 10/04/2011 05:01 PM, Anthony Liguori wrote:
We also have pci passthrough in qemu-kvm (I think based on the same 
Neocleus
code). Rather than having two pci assignment implementations, I think 
we should
have just one, with the differences (programming the hypervisor) 
abstracted at

that level.



I agree in principle but how close is qemu-kvm pci passthrough to a 
mergable state?  Would it make sense to merge the Xen code first and 
then abstract it?


Merging either implementation and abstracting it would risk regressions 
in the other.


How about merging both, with the ABIs (command line and qmp) tagged as 
experimental, and then doing a merge in the same style as 
i386+x86_64->x86 or the two kvm implementations in qemu?  We can pick 
one implementation as the merge target and port fixes from the other.


--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH] i386: wire up MSR_IA32_MISC_ENABLE

2011-10-04 Thread Avi Kivity

On 10/04/2011 06:05 PM, Jan Kiszka wrote:

On 2011-10-04 16:26, Avi Kivity wrote:
>  It's needed for its default value - bit 0 specifies that "rep movs" is
>  good enough for memcpy, and Linux may use a slower memcpu if it is not set,
>  depending on cpu family/model.
>
>  Signed-off-by: Avi Kivity
>  ---
>   target-i386/cpu.h   |5 +
>   target-i386/helper.c|1 +
>   target-i386/kvm.c   |   15 +++
>   target-i386/machine.c   |   21 +
>   target-i386/op_helper.c |6 ++
>   5 files changed, 48 insertions(+), 0 deletions(-)
>
>  diff --git a/target-i386/cpu.h b/target-i386/cpu.h
>  index ae36489..5416809 100644
>  --- a/target-i386/cpu.h
>  +++ b/target-i386/cpu.h
>  @@ -299,6 +299,10 @@
>
>   #define MSR_IA32_PERF_STATUS0x198
>
>  +#define MSR_IA32_MISC_ENABLE 0x1a0

I smell tabs...

Oops.  Cut'n'paste flew underneath the emacs radar.

>  +
>  +static const VMStateDescription vmstate_msr_ia32_misc_enable = {
>  +.name = "cpu/msr_ia32_misc_enable",
>  +.version_id = 1,
>  +.minimum_version_id = 1,
>  +.minimum_version_id_old = 1,
>  +.fields  = (VMStateField []) {
>  +VMSTATE_UINT64(msr_ia32_misc_enable, CPUState),
>  +VMSTATE_END_OF_LIST()
>  +}
>  +};
>  +

We are about to bump the CPU_SAVE_VERSION for the sake of APIC deadline
timer, so you can jump on that train and avoid this subsection.

Must we do that?  Considering that no guest will use the deadline timer, 
it seems to be an excellent candidates for subsections.

>  diff --git a/target-i386/op_helper.c b/target-i386/op_helper.c
>  index 3bb5a91..c89e4a4 100644
>  --- a/target-i386/op_helper.c
>  +++ b/target-i386/op_helper.c
>  @@ -3280,6 +3280,9 @@ void helper_wrmsr(void)
>   case MSR_TSC_AUX:
>   env->tsc_aux = val;
>   break;
>  +case MSR_IA32_MISC_ENABLE:
>  +env->msr_ia32_misc_enable = val;
>  +break;

This MSR is Intel-specific, isn't it? Then I guess it should be limited
to Intel CPU types.

It's an "architectural MSR" that is only available on some Intel 
models.  Either we do a full cpuid qualification of accessible MSRs (and 
bits within MSRs), or not.  Qualifying just by vendor ID is pointless.

--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH] i386: wire up MSR_IA32_MISC_ENABLE

2011-10-04 Thread Jan Kiszka

On 2011-10-04 19:08, Avi Kivity wrote:
> On 10/04/2011 06:05 PM, Jan Kiszka wrote:
>> On 2011-10-04 16:26, Avi Kivity wrote:
>> >  It's needed for its default value - bit 0 specifies that "rep movs" is
>> >  good enough for memcpy, and Linux may use a slower memcpu if it is
>> not set,
>> >  depending on cpu family/model.
>> >
>> >  Signed-off-by: Avi Kivity
>> >  ---
>> >   target-i386/cpu.h   |5 +
>> >   target-i386/helper.c|1 +
>> >   target-i386/kvm.c   |   15 +++
>> >   target-i386/machine.c   |   21 +
>> >   target-i386/op_helper.c |6 ++
>> >   5 files changed, 48 insertions(+), 0 deletions(-)
>> >
>> >  diff --git a/target-i386/cpu.h b/target-i386/cpu.h
>> >  index ae36489..5416809 100644
>> >  --- a/target-i386/cpu.h
>> >  +++ b/target-i386/cpu.h
>> >  @@ -299,6 +299,10 @@
>> >
>> >   #define MSR_IA32_PERF_STATUS0x198
>> >
>> >  +#define MSR_IA32_MISC_ENABLE0x1a0
>>
>> I smell tabs...
> 
> Oops.  Cut'n'paste flew underneath the emacs radar.
> 
>> >  +
>> >  +static const VMStateDescription vmstate_msr_ia32_misc_enable = {
>> >  +.name = "cpu/msr_ia32_misc_enable",
>> >  +.version_id = 1,
>> >  +.minimum_version_id = 1,
>> >  +.minimum_version_id_old = 1,
>> >  +.fields  = (VMStateField []) {
>> >  +VMSTATE_UINT64(msr_ia32_misc_enable, CPUState),
>> >  +VMSTATE_END_OF_LIST()
>> >  +}
>> >  +};
>> >  +
>>
>> We are about to bump the CPU_SAVE_VERSION for the sake of APIC deadline
>> timer, so you can jump on that train and avoid this subsection.
> 
> Must we do that?  Considering that no guest will use the deadline timer,
> it seems to be an excellent candidates for subsections.

I don't know, it was sent out for pull like that. And I thought
subsections are still broken, aren't they?

> 
>> >  diff --git a/target-i386/op_helper.c b/target-i386/op_helper.c
>> >  index 3bb5a91..c89e4a4 100644
>> >  --- a/target-i386/op_helper.c
>> >  +++ b/target-i386/op_helper.c
>> >  @@ -3280,6 +3280,9 @@ void helper_wrmsr(void)
>> >   case MSR_TSC_AUX:
>> >   env->tsc_aux = val;
>> >   break;
>> >  +case MSR_IA32_MISC_ENABLE:
>> >  +env->msr_ia32_misc_enable = val;
>> >  +break;
>>
>> This MSR is Intel-specific, isn't it? Then I guess it should be limited
>> to Intel CPU types.
> 
> It's an "architectural MSR" that is only available on some Intel
> models.  Either we do a full cpuid qualification of accessible MSRs (and
> bits within MSRs), or not.  Qualifying just by vendor ID is pointless.

Given that, when in conflict, we rather model after AMD than Intel for
TCG, I would hesitate to expose this by default. Or are there
precedences already?

Jan



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH] i386: wire up MSR_IA32_MISC_ENABLE

2011-10-04 Thread Avi Kivity


On 10/04/2011 07:14 PM, Jan Kiszka wrote:

>
>>  >   +
>>  >   +static const VMStateDescription vmstate_msr_ia32_misc_enable = {
>>  >   +.name = "cpu/msr_ia32_misc_enable",
>>  >   +.version_id = 1,
>>  >   +.minimum_version_id = 1,
>>  >   +.minimum_version_id_old = 1,
>>  >   +.fields  = (VMStateField []) {
>>  >   +VMSTATE_UINT64(msr_ia32_misc_enable, CPUState),
>>  >   +VMSTATE_END_OF_LIST()
>>  >   +}
>>  >   +};
>>  >   +
>>
>>  We are about to bump the CPU_SAVE_VERSION for the sake of APIC deadline
>>  timer, so you can jump on that train and avoid this subsection.
>
>  Must we do that?  Considering that no guest will use the deadline timer,
>  it seems to be an excellent candidates for subsections.

I don't know, it was sent out for pull like that. And I thought
subsections are still broken, aren't they?


Well let's fix subsections instead of disabling migration to older versions.


>>
>>  This MSR is Intel-specific, isn't it? Then I guess it should be limited
>>  to Intel CPU types.
>
>  It's an "architectural MSR" that is only available on some Intel
>  models.  Either we do a full cpuid qualification of accessible MSRs (and
>  bits within MSRs), or not.  Qualifying just by vendor ID is pointless.

Given that, when in conflict, we rather model after AMD than Intel for
TCG, I would hesitate to expose this by default. Or are there
precedences already?


Practically all MSRs.  i486 doesn't have any, IIRC, for example.

(and given this MSR has no effect, the only difference it makes to 
guests is the #GP we take or not; still it may be worthwhile to 
construct some table-driven thing to allow or reject MSR accesses, both 
for kvm and qemu)


--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [Xen-devel] [PATCH RFC V1 00/11] Xen PCI Passthrough

2011-10-04 Thread Jan Kiszka

On 2011-10-04 17:01, Anthony Liguori wrote:
> On 10/04/2011 09:58 AM, Avi Kivity wrote:
>> On 10/04/2011 04:51 PM, Anthony PERARD wrote:
>>> Hi all,
>>>
>>> This patch series introduce the PCI passthrough for Xen.
>>>
>>> First, we have HostPCIDevice that help to access one PCI device of
>>> the host.
>>>
>>> Then, there are several additions in the QEMU code. One is
>>> qemu_run_one_timer
>>> to run a specific timer. It is used by PCI passthrough to run a timer
>>> about
>>> power management. Another is pci_check_bar_overlap.
>>>
>>> There are also several change in pci_ids and pci_regs.
>>>
>>> Last part, but not least, the PCI passthrough device himself. Cut in
>>> 3 parts
>>> (or file), there is one to take care of the initialisation of a
>>> passthrough
>>> device. The second one handle everything about the config address
>>> space, there
>>> are specifics functions for every config register. The third one is
>>> to handle
>>> MSI.
>>>
>>> I'm still working on setting a PCI passthrough device through QMP
>>> from libxl
>>> (xen tool stack). It is just a call to device_add, with the driver
>>> parametter
>>> hostaddr=":00:1b.0".
>>>
>>> There is some minor things missing:
>>> - copyright header
>>> - PCI IO space multiplexer
>>>
>>>
>>
>> We also have pci passthrough in qemu-kvm (I think based on the same
>> Neocleus
>> code). Rather than having two pci assignment implementations, I think
>> we should
>> have just one, with the differences (programming the hypervisor)
>> abstracted at
>> that level.
> 
> I agree in principle but how close is qemu-kvm pci passthrough to a
> mergable state?  Would it make sense to merge the Xen code first and
> then abstract it?

What is missing to get qemu-kvm device assignment ready:
 - MSI/MSI-X refactoring in QEMU (specifically config notifiers, also
   relevant for virtio)
 - switch device assignment to generic MSI/MSI-X
 - get in-kernel irqchip support upstream
 - minor code cleanups

I was hoping to complete this for 1.0, but it looks unrealistic now.
Maybe I'll find the time for the MSI stuff at least.

From a first glance at these patches, I think there are already some
synergies in the host-pci access layer. And I bet the Xen bits could
also make use of our MSI/MSI-X code once it's generalized. But that will
be a bit work for them (based on my experience with qemu-kvm).

Jan



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 2/2] hda: do not mix output and input stream states, RHBZ #740493

2011-10-04 Thread Juan Quintela

"Marc-André Lureau"  wrote:
> Windows 7 may use the same stream number for input and output.
> Current code will confuse streams.
>
> NB: I wonder if this patch breaks migration code because of
> this change:
> -VMSTATE_BOOL_ARRAY(running, HDAAudioState, 16),
> +VMSTATE_BOOL_ARRAY(running, HDAAudioState, 2 * 16),

It does :-(

Two questions to know how to handle this.
a) My understanding of this patch is that we move from an array of 16
bools representing anything to one array where the 1st 16 represent if
there are input and the 2nd 16's reprosenting if there are output for
that channel.

So, what we should do if we migrate from one old version that only has
16 bools?  My understanding is that copying directly is not gonna work?

b) Just in case it makes "compatibility" easier.

Is the running array normally all false or similar?  If so, we could
just sent the other 16th values only if they are different of false.
(if this is true, it is interesting to put in the 1st slots the ones
that are more probable to be true).

Later, Juan.

Re: [Qemu-devel] [PATCH] i386: wire up MSR_IA32_MISC_ENABLE

2011-10-04 Thread Jan Kiszka

On 2011-10-04 19:21, Avi Kivity wrote:
> On 10/04/2011 07:14 PM, Jan Kiszka wrote:
>> >
>> >>  >   +
>> >>  >   +static const VMStateDescription vmstate_msr_ia32_misc_enable = {
>> >>  >   +.name = "cpu/msr_ia32_misc_enable",
>> >>  >   +.version_id = 1,
>> >>  >   +.minimum_version_id = 1,
>> >>  >   +.minimum_version_id_old = 1,
>> >>  >   +.fields  = (VMStateField []) {
>> >>  >   +VMSTATE_UINT64(msr_ia32_misc_enable, CPUState),
>> >>  >   +VMSTATE_END_OF_LIST()
>> >>  >   +}
>> >>  >   +};
>> >>  >   +
>> >>
>> >>  We are about to bump the CPU_SAVE_VERSION for the sake of APIC
>> deadline
>> >>  timer, so you can jump on that train and avoid this subsection.
>> >
>> >  Must we do that?  Considering that no guest will use the deadline
>> timer,
>> >  it seems to be an excellent candidates for subsections.
>>
>> I don't know, it was sent out for pull like that. And I thought
>> subsections are still broken, aren't they?
> 
> Well let's fix subsections instead of disabling migration to older
> versions.
> 
>> >>
>> >>  This MSR is Intel-specific, isn't it? Then I guess it should be
>> limited
>> >>  to Intel CPU types.
>> >
>> >  It's an "architectural MSR" that is only available on some Intel
>> >  models.  Either we do a full cpuid qualification of accessible MSRs
>> (and
>> >  bits within MSRs), or not.  Qualifying just by vendor ID is pointless.
>>
>> Given that, when in conflict, we rather model after AMD than Intel for
>> TCG, I would hesitate to expose this by default. Or are there
>> precedences already?
> 
> Practically all MSRs.  i486 doesn't have any, IIRC, for example.

Pre-Pentiums don't have instructions to access them as well, so that
doesn't cause any harm.

> 
> (and given this MSR has no effect, the only difference it makes to
> guests is the #GP we take or not; still it may be worthwhile to
> construct some table-driven thing to allow or reject MSR accesses, both
> for kvm and qemu)

Right. If this MSR is not the first bogus one on AMD, we can do this
later. If it is, it should be done first.

Jan



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH] Make cpu_single_env thread local (Linux only for now)

2011-10-04 Thread Jan Kiszka

On 2011-10-04 17:10, Lluís Vilanova wrote:
> Jan Kiszka writes:
> 
>> On 2011-10-03 18:33, Dr. David Alan Gilbert wrote:
>>> Make cpu_single_env thread local (Linux only for now)
>>> * Fixes some user space threading issues (esp those triggered
>>> by bug 823902)
>>>
>>> Against rev d11cf8cc..., tested on ARM user mode, and ARM Vexpress
>>> system mode (with Blue Swirl's fix from yesterday) - only
>>> tested on Linux host.  Lets me run ARM userspace firefox.
>>>
>>> Signed-off-by: Dr. David Alan Gilbert 
>>>
>>> diff --git a/cpu-all.h b/cpu-all.h
>>> index 42a5fa0..d895ee6 100644
>>> --- a/cpu-all.h
>>> +++ b/cpu-all.h
>>> @@ -334,7 +334,13 @@ void cpu_dump_statistics(CPUState *env, FILE *f, 
>>> fprintf_function cpu_fprintf,
>>> void QEMU_NORETURN cpu_abort(CPUState *env, const char *fmt, ...)
>>> GCC_FMT_ATTR(2, 3);
>>> extern CPUState *first_cpu;
>>> +
>>> +#ifdef __linux__
>>> +/* DAG: Only tested thread local on Linux, feel free to add others */
>>> +extern __thread CPUState *cpu_single_env;
>>> +#else
>>> extern CPUState *cpu_single_env;
>>> +#endif
> 
>> We need this for all platforms in order to skip qemu_global_mutex while
>> manipulating some CPUState. And leaving some platforms with non-TLS will
>> eventually break them when code is added that assumes TLS.
> 
>> However, it's not unlikely that some weird platforms / ancient
>> toolchains still have problems with __thread - even on Linux. We may
>> want to play safe and use pthread_key on POSIX.
> 
> Why not make this kind of annotations available in qemu-commmon.h; or even
> better somewhere less... "generic".

Yes, we will need some encapsulation for this.

> 
> #if defined(CONFIG_SUPPORTS_THREAD_KEYWORD)
> 
> // use __thread
> #define QEMU_DECL_TLS(type, name) \
> extern __thread type name;
> #define QEMU_DEF_TLS(type, name) \
> __thread type name;
> #define QEMU_SET_TLS(name, value) \
> do { name = value; } while (0)
> #define QEMU_GET_TLS(name) \
> name
> 
> #elif defined(CONFIG_SUPPORTS_PTHREAD_KEY)
> 
> // use pthread_key_t
> #define QEMU_DECL_TLS(type, name) \
> extern type _type_##name; \
> extern pthread_key_t name;
> #define QEMU_DEF_TLS(type, name) \
> pthread_key_t name; \
> void _init_##name (void) __attribute__((constructor)); \
> void _init_##name (void) { pthread_key_create(&name, NULL); }
> #define QEMU_SET_TLS(name, value) \
> do { pthread_setspecific(name, (void*)value); } while (0)
> #define QEMU_GET_TLS(name) \
> ((typeof(_type_##name))pthread_getspecific(name))
> 
> #else
> #error Go home
> #endif

Looks like a start. But I would avoid macros and go for (static inline)
functions where possible. And initialization should be explicit (so that
you can start using TLS already inside constructors).

Jan



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH RFC V1 02/11] qemu-timer: Introduce qemu_run_one_timer

2011-10-04 Thread Jan Kiszka

On 2011-10-04 16:51, Anthony PERARD wrote:
> Used by the Xen PCI Passthrough code to run the timer about the power
> state transition.
> 
> Signed-off-by: Anthony PERARD 
> ---
>  qemu-timer.c |   15 +++
>  qemu-timer.h |3 +++
>  2 files changed, 18 insertions(+), 0 deletions(-)
> 
> diff --git a/qemu-timer.c b/qemu-timer.c
> index 46dd483..15e659b 100644
> --- a/qemu-timer.c
> +++ b/qemu-timer.c
> @@ -1163,3 +1163,18 @@ int qemu_calculate_timeout(void)
>  return 1000;
>  }
>  
> +/* run the specified timer */
> +void qemu_run_one_timer(QEMUTimer *ts)
> +{
> +uint64_t current_time;
> +
> +/* remove timer from the list before calling the callback */
> +qemu_del_timer(ts);
> +
> +while ((current_time = qemu_get_clock_ms(rt_clock)) < ts->expire_time)
> +/* sleep until the expire time */
> +usleep((ts->expire_time - current_time) * 1000);
> +
> +/* run the callback */
> +ts->cb(ts->opaque);
> +}

This looks funny. I can't imagine that this could ever fit into the
standard (asynchronous) QEMU execution model for I/O. Keep it private to
Xen?

Jan



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [Xen-devel] [PATCH RFC V1 00/11] Xen PCI Passthrough

2011-10-04 Thread Jan Kiszka

On 2011-10-04 19:01, Avi Kivity wrote:
> On 10/04/2011 06:33 PM, Alex Williamson wrote:
>> I thought we were potentially looking at vfio as a convergence point.
>> I'm still a bit off from having a vfio re-write ready to submit, but is
>> this still a possibility?  Thanks,
>>
> 
> vfio leaves out users of current and past kernels; relying on it would
> force me to maintain qemu-kvm for a while.

VFIO is a great thing, and we all want it. But it's also a big project
now as it has to please more and more use cases. I think it will require
some more time to settle its interfaces, and it should likely take this
time. So we need KVM device assignment upstream as well until comparable
VFIO support is in every recent distro kernel out there.

Jan

signature.asc
Description: OpenPGP digital signature

1 2 >

1 - 100 of 128 matches

Mail list logo