[Qemu-devel] buildbot failure in qemu on monitor_x86_64_debian_6_0
The Buildbot has detected a new failure on builder monitor_x86_64_debian_6_0 while building qemu. Full details are available at: http://buildbot.b1-systems.de/qemu/builders/monitor_x86_64_debian_6_0/builds/50 Buildbot URL: http://buildbot.b1-systems.de/qemu/ Buildslave for this Build: yuzuki Build Reason: The Nightly scheduler named 'nightly_monitor' triggered this build Build Source Stamp: [branch queue/monitor] HEAD Blamelist: BUILD FAILED: failed git sincerely, -The Buildbot
Re: [Qemu-devel] [PATCH 1/2] linux-user: Remove unused code
On 3 October 2011 21:43, Stefan Weil wrote: > The code is unused since 8 years, so remove it. > > Signed-off-by: Stefan Weil > --- > linux-user/signal.c | 5 + > 1 files changed, 1 insertions(+), 4 deletions(-) > > diff --git a/linux-user/signal.c b/linux-user/signal.c > index 89276eb..40c5eb1 100644 > --- a/linux-user/signal.c > +++ b/linux-user/signal.c > @@ -1274,10 +1274,7 @@ setup_return(CPUState *env, struct target_sigaction > *ka, > > if (__put_user(retcodes[idx], rc)) > return 1; > -#if 0 > - flush_icache_range((abi_ulong)rc, > - (abi_ulong)(rc + 1)); > -#endif > + > retcode = rc_addr + thumb; > } (since this is in the ARM section of this file) Since this is writing guest code instructions, flushing the host icache is clearly not the right thing. I'm just wondering how this ensures that we don't have stale translated code for this address, though... -- PMM
Re: [Qemu-devel] buildbot failure in qemu on default_openbsd_4.9
On Mon, Oct 03, 2011 at 05:57:28PM -0500, Anthony Liguori wrote: > On 10/03/2011 06:15 PM, q...@buildbot.b1-systems.de wrote: > >The Buildbot has detected a new failure on builder default_openbsd_4.9 while > >building qemu. > >Full details are available at: > > http://buildbot.b1-systems.de/qemu/builders/default_openbsd_4.9/builds/44 > > > >Buildbot URL: http://buildbot.b1-systems.de/qemu/ > > > >Buildslave for this Build: kraxel_openbsd49 > > > >Build Reason: The Nightly scheduler named 'nightly_default' triggered this > >build > >Build Source Stamp: [branch master] HEAD > >Blamelist: > > > >BUILD FAILED: failed git > > Is there a way to suppress buildbot failures when the failed stage > is git? Not a huge deal but seems like a potential improvement. We've discussed adding custom Python configuration code to ignore these failures. Looking at the buildbot documentation there is a 'retry' parameter for version control which "specifies a tuple of (delay, repeats) which means that when a full VC checkout fails, it should be retried up to repeats times, waiting delay seconds between attempts". This sounds like it could work. I suggest setting retry = (5 * 60, 3) so that we wait 5 minutes after a failed git fetch and try 3 times. Daniel: In the buildbot 0.8.3p1-1 docs I found the 'retry' parameter under 4.11.3 Source Checkout. Is it possible to add this parameter? Thanks, Stefan
Re: [Qemu-devel] New option for snapshot_blkdev to avoid image creation
On Mon, Oct 03, 2011 at 04:09:01PM +, Federico Simoncelli wrote: > In some situations might be useful to let qemu use an image that was > prepared for a live snapshot. > The advantage is that creating the snapshot file outside of the qemu > process we can use the whole range of options provided by the format > (eg for qcow2: encryption, cluster_size and preallocation). > It is also possible to pre-set a relative path to the backing file > (now it is created by default as absolute path). > In the long run it can also avoid the danger of reimplementing qemu-img > inside qemu (if we wanted to expose such options when a snapshot is > requested). When the image file is created based on the backing file size: $ qemu-img create -f qcow2 -o backing_file=master.img vm001.qcow2 It turns out that bdrv_img_create() opens the backing file with read/write permissions. This is generally a bad idea but especially dangerous when the VM currently has the image file open already since image formats are not designed for multiple initiators (clustering). We wouldn't want any caches being written out or startup fsck-style operations to be performed on the backing file while the VM has it open. Please make sure to use read-only before applying this patch. Stefan
Re: [Qemu-devel] buildbot failure in qemu on default_x86_64_rhel5
On Mon, Oct 03, 2011 at 04:10:50PM +0530, Aneesh Kumar K.V wrote: > On Mon, 3 Oct 2011 08:37:52 +0100, Stefan Hajnoczi wrote: > > On Sat, Oct 01, 2011 at 03:03:23PM +0530, Aneesh Kumar K.V wrote: > > > +#ifndef CONFIG_UTIMENSAT > > > +/* > > > + * We support handle fs driver only if all related > > > + * syscalls are provided by host. > > > + */ > > > > Perhaps a ./configure check should be added to see whether the handle > > syscalls are supported instead of using CONFIG_UTIMENSAT. > > > > We already do check for handle syscall. Since glibc doesn't have the > this syscall yet, I added the check in virtio-9p-handle.c as below CONFIG_UTIMENSAT is defined when the host has glibc >= 2.6. Handle syscalls are available on Linux 2.6.39 but not exposed by glibc. Therefore CONFIG_UTIMENSAT has nothing to do with handle syscalls and does not mean handle syscalls are supported. I would drop that hunk of the patch or test for the actual handle syscalls in ./configure. Stefan
Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/2] tcg/ppc*: Move cache initialization to ppc specific code
On 4 October 2011 07:29, Paolo Bonzini wrote: > On 10/04/2011 07:55 AM, Stefan Weil wrote: >> I learned now that ppc will need flush_icache_range() for kvm, too. >> So it won't be possible to implement a uniform handling of >> flush_icache_range() for all host architectures. > > x86 and IIRC s390 flush_icache_range is a no-op, so it is possible to "call" > it for all KVM architectures without incurring penalties. ...but presumably when ARM KVM actually lands we'll need to make flush_icache_range() available for it too. If flushing the icache isn't a TCG-specific operation, then we should have implementations in some other per-target-architecture file. Doesn't particularly matter where as long as we're consistent for all architectures. (I'd also like to see more of the ifdef-target-foo ladders in common files broken out into more cleanly defined interfaces provided by per-target source files, personally.) -- PMM
Re: [Qemu-devel] buildbot failure in qemu on default_openbsd_4.9
Hi Stefan, On Tuesday, October 04, 2011 08:45:28 AM Stefan Hajnoczi wrote: > Looking at the buildbot documentation there is a 'retry' parameter for > version control which "specifies a tuple of (delay, repeats) which means > that when a full VC checkout fails, it should be retried up to repeats > times, waiting delay seconds between attempts". This sounds like it > could work. > > I suggest setting retry = (5 * 60, 3) so that we wait 5 minutes after a > failed git fetch and try 3 times. Perfect! Thanks for finding this! I just haven't had time yet to look into this in detail. > > Daniel: In the buildbot 0.8.3p1-1 docs I found the 'retry' parameter > under 4.11.3 Source Checkout. Is it possible to add this parameter? I applied this for the qemu buildbot master instance for all Git checkouts. Hopefully there are now less "failed git" buildbot mails ... Best Regards, Daniel -- Daniel Gollub Linux Consultant & Developer Tel.: +49-160 47 73 970 Mail: gol...@b1-systems.de B1 Systems GmbH Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537 signature.asc Description: This is a digitally signed message part.
Re: [Qemu-devel] New option for snapshot_blkdev to avoid image creation
- Original Message - > From: "Stefan Hajnoczi" > To: "Federico Simoncelli" > Cc: qemu-devel@nongnu.org, aba...@redhat.com, dl...@redhat.com > Sent: Tuesday, October 4, 2011 9:33:48 AM > Subject: Re: [Qemu-devel] New option for snapshot_blkdev to avoid image > creation > > On Mon, Oct 03, 2011 at 04:09:01PM +, Federico Simoncelli wrote: > > In some situations might be useful to let qemu use an image that > > was > > prepared for a live snapshot. > > The advantage is that creating the snapshot file outside of the > > qemu > > process we can use the whole range of options provided by the > > format > > (eg for qcow2: encryption, cluster_size and preallocation). > > It is also possible to pre-set a relative path to the backing file > > (now it is created by default as absolute path). > > In the long run it can also avoid the danger of reimplementing > > qemu-img > > inside qemu (if we wanted to expose such options when a snapshot is > > requested). > > When the image file is created based on the backing file size: > > $ qemu-img create -f qcow2 -o backing_file=master.img vm001.qcow2 > > It turns out that bdrv_img_create() opens the backing file with > read/write permissions. This is generally a bad idea but especially > dangerous when the VM currently has the image file open already since > image formats are not designed for multiple initiators (clustering). Hi Stefan, are you sure that bdrv_img_create opens the backing file with read/write permissions? After a quick check I can't see that happening: $ ./qemu-img create -f qcow2 /tmp/master.img 1G Formatting '/tmp/master.img', fmt=qcow2 size=1073741824 encryption=off cluster_size=65536 $ strace -e trace=open ./qemu-img create -f qcow2 -o backing_file=/tmp/master.img /tmp/vm001.qcow2 [...] open("/tmp/vm001.qcow2", O_RDONLY|O_NONBLOCK) = -1 ENOENT (No such file or directory) open("/tmp/vm001.qcow2", O_RDONLY|O_NONBLOCK) = -1 ENOENT (No such file or directory) open("/tmp/master.img", O_RDONLY|O_NONBLOCK) = 3 open("/tmp/master.img", O_RDONLY|O_NONBLOCK) = 3 open("/tmp/master.img", O_RDONLY|O_DSYNC|O_CLOEXEC) = 3 open("/tmp/master.img", O_RDONLY|O_NONBLOCK) = 3 open("/tmp/master.img", O_RDONLY|O_NONBLOCK) = 3 open("/tmp/master.img", O_RDONLY|O_CLOEXEC) = 3 Formatting '/tmp/vm001.qcow2', fmt=qcow2 size=1073741824 backing_file='/tmp/master.img' encryption=off cluster_size=65536 open("/tmp/vm001.qcow2", O_RDONLY|O_NONBLOCK) = -1 ENOENT (No such file or directory) open("/tmp/vm001.qcow2", O_RDONLY|O_NONBLOCK) = -1 ENOENT (No such file or directory) open("/tmp/vm001.qcow2", O_WRONLY|O_CREAT|O_TRUNC, 0644) = 6 open("/tmp/vm001.qcow2", O_RDONLY|O_NONBLOCK) = 6 open("/tmp/vm001.qcow2", O_RDONLY|O_NONBLOCK) = 6 open("/tmp/vm001.qcow2", O_RDWR|O_DSYNC|O_CLOEXEC) = 6 open("/tmp/vm001.qcow2", O_RDONLY|O_NONBLOCK) = 6 open("/tmp/vm001.qcow2", O_RDONLY|O_NONBLOCK) = 6 open("/tmp/vm001.qcow2", O_RDWR|O_CLOEXEC) = 6 -- Federico
Re: [Qemu-devel] [PATCH V4] Add stdio char device on windows
Hello, 2011/10/3 Fabien Chouteau : > Simple implementation of an stdio char device on Windows. > > Signed-off-by: Fabien Chouteau > --- > qemu-char.c | 227 > ++- > 1 files changed, 225 insertions(+), 2 deletions(-) > hStdIn and hStdOut should be declared HANDLE instead of HANDLE* (in practice it works because HANDLE is just a void*) Regards, Yann Quelen
Re: [Qemu-devel] buildbot failure in qemu on default_x86_64_rhel5
On Tue, 4 Oct 2011 08:18:07 +0100, Stefan Hajnoczi wrote: > On Mon, Oct 03, 2011 at 04:10:50PM +0530, Aneesh Kumar K.V wrote: > > On Mon, 3 Oct 2011 08:37:52 +0100, Stefan Hajnoczi > > wrote: > > > On Sat, Oct 01, 2011 at 03:03:23PM +0530, Aneesh Kumar K.V wrote: > > > > +#ifndef CONFIG_UTIMENSAT > > > > +/* > > > > + * We support handle fs driver only if all related > > > > + * syscalls are provided by host. > > > > + */ > > > > > > Perhaps a ./configure check should be added to see whether the handle > > > syscalls are supported instead of using CONFIG_UTIMENSAT. > > > > > > > We already do check for handle syscall. Since glibc doesn't have the > > this syscall yet, I added the check in virtio-9p-handle.c as below > > CONFIG_UTIMENSAT is defined when the host has glibc >= 2.6. > > Handle syscalls are available on Linux 2.6.39 but not exposed by glibc. > > Therefore CONFIG_UTIMENSAT has nothing to do with handle syscalls and > does not mean handle syscalls are supported. I would drop that hunk of > the patch or test for the actual handle syscalls in ./configure. Here is what i am trying to achieve with the patch. For handle based fs driver to work we need to have 1) support for handle syscall 2) support for fd based syscalls like futimens, fstatat, readlinkat, fchmod, fchownat, openat etc. Now handle syscalls are recently added to kernel and glibc doesn't have support for that. So what we did is to add handle syscall in virtio-9p-handle.c via syscall(2). Now if the syscall is not supported by the host kernel we will get ENOSYS. I only added support for i386 and x86_64, because most the syscall number varies with different archs. For other archs the wrapper returns ENOSYS. So instead of checking for handle syscalls in configure script we did the above to make sure it work without failure in most of the case. Once we have glibc support for handle syscall the above changes should be dropped in favor of configure script test. Now for the fd based syscall dependency, we didn't initially had any check for that because my expectation was most glibc should have support for them. But RHEL 5 build failure indicate that futimens is not supported. We were already checking for futimens in configure so i added changes to make sure if futimens is not supported handle_utimensat returns error. (That was not added as a run time check, but rather a compile error fix). Now should we allow handle based fs driver if futimens is not supported. I was suggesting we should not; hence the check in init to return error when we don't support futimens. The later part of init routine do check whether handle syscalls are supported and disable handle fs driver if they are not. -aneesh
Re: [Qemu-devel] buildbot failure in qemu on default_openbsd_4.9
On Tue, Oct 04, 2011 at 10:12:15AM +0200, Daniel Gollub wrote: > > Daniel: In the buildbot 0.8.3p1-1 docs I found the 'retry' parameter > > under 4.11.3 Source Checkout. Is it possible to add this parameter? > > I applied this for the qemu buildbot master instance for all Git checkouts. > > Hopefully there are now less "failed git" buildbot mails ... Great, thank you! Stefan
Re: [Qemu-devel] buildbot failure in qemu on default_x86_64_rhel5
On Tue, Oct 04, 2011 at 02:18:20PM +0530, Aneesh Kumar K.V wrote: > On Tue, 4 Oct 2011 08:18:07 +0100, Stefan Hajnoczi wrote: > > On Mon, Oct 03, 2011 at 04:10:50PM +0530, Aneesh Kumar K.V wrote: > > > On Mon, 3 Oct 2011 08:37:52 +0100, Stefan Hajnoczi > > > wrote: > > > > On Sat, Oct 01, 2011 at 03:03:23PM +0530, Aneesh Kumar K.V wrote: > > > > > +#ifndef CONFIG_UTIMENSAT > > > > > +/* > > > > > + * We support handle fs driver only if all related > > > > > + * syscalls are provided by host. > > > > > + */ > > > > > > > > Perhaps a ./configure check should be added to see whether the handle > > > > syscalls are supported instead of using CONFIG_UTIMENSAT. > > > > > > > > > > We already do check for handle syscall. Since glibc doesn't have the > > > this syscall yet, I added the check in virtio-9p-handle.c as below > > > > CONFIG_UTIMENSAT is defined when the host has glibc >= 2.6. > > > > Handle syscalls are available on Linux 2.6.39 but not exposed by glibc. > > > > Therefore CONFIG_UTIMENSAT has nothing to do with handle syscalls and > > does not mean handle syscalls are supported. I would drop that hunk of > > the patch or test for the actual handle syscalls in ./configure. > > Here is what i am trying to achieve with the patch. For handle based fs > driver to work we need to have > > 1) support for handle syscall > 2) support for fd based syscalls like futimens, fstatat, readlinkat, > fchmod, fchownat, openat etc. > > Now handle syscalls are recently added to kernel and glibc doesn't have > support for that. So what we did is to add handle syscall in > virtio-9p-handle.c via syscall(2). Now if the syscall is not supported > by the host kernel we will get ENOSYS. I only added support for i386 and > x86_64, because most the syscall number varies with different archs. For > other archs the wrapper returns ENOSYS. So instead of checking for > handle syscalls in configure script we did the above to make sure it > work without failure in most of the case. Once we have glibc support for > handle syscall the above changes should be dropped in favor of > configure script test. > > Now for the fd based syscall dependency, we didn't initially had any > check for that because my expectation was most glibc should > have support for them. But RHEL 5 build failure indicate that futimens > is not supported. We were already checking for futimens in configure so > i added changes to make sure if futimens is not supported > handle_utimensat returns error. (That was not added as a run time > check, but rather a compile error fix). Now should we allow handle based fs > driver if futimens is not supported. I was suggesting we should not; > hence the check in init to return error when we don't support > futimens. The later part of init routine do check whether handle > syscalls are supported and disable handle fs driver if they are not. Okay, then the comment should be: /* * We support handle fs driver only if futimens is provided by the host */ The scenario where it might be possible to hit the CONFIG_UTIMENSAT is with a modern kernel paired with an old userspace. The handle syscall which we call directly would succeed but the futimens(2) would not be available. On a sane system the handle syscall fails because the kernel doesn't support it (and futimens isn't there either). Stefan
Re: [Qemu-devel] New option for snapshot_blkdev to avoid image creation
On Tue, Oct 04, 2011 at 04:27:42AM -0400, Federico Simoncelli wrote: > - Original Message - > > From: "Stefan Hajnoczi" > > To: "Federico Simoncelli" > > Cc: qemu-devel@nongnu.org, aba...@redhat.com, dl...@redhat.com > > Sent: Tuesday, October 4, 2011 9:33:48 AM > > Subject: Re: [Qemu-devel] New option for snapshot_blkdev to avoid image > > creation > > > > On Mon, Oct 03, 2011 at 04:09:01PM +, Federico Simoncelli wrote: > > > In some situations might be useful to let qemu use an image that > > > was > > > prepared for a live snapshot. > > > The advantage is that creating the snapshot file outside of the > > > qemu > > > process we can use the whole range of options provided by the > > > format > > > (eg for qcow2: encryption, cluster_size and preallocation). > > > It is also possible to pre-set a relative path to the backing file > > > (now it is created by default as absolute path). > > > In the long run it can also avoid the danger of reimplementing > > > qemu-img > > > inside qemu (if we wanted to expose such options when a snapshot is > > > requested). > > > > When the image file is created based on the backing file size: > > > > $ qemu-img create -f qcow2 -o backing_file=master.img vm001.qcow2 > > > > It turns out that bdrv_img_create() opens the backing file with > > read/write permissions. This is generally a bad idea but especially > > dangerous when the VM currently has the image file open already since > > image formats are not designed for multiple initiators (clustering). > > Hi Stefan, are you sure that bdrv_img_create opens the backing file > with read/write permissions? You are absolutely right. BDRV_O_FLAGS does not have BDRV_O_RDWR so it is opening read-only. I missed this when reading the code earlier today. Stefan
Re: [Qemu-devel] [RFC] Adding new filesystem 'proxy' to 9p
> > That is the case if the proxy helper code is perfectly written. I am trying > to think about the scenario where there is a bug (eg heap corruption / > stack overflow) which allows a malicious non-root QEMU process to exploit > the proxy helper to run code that it was *not* intended to run. > > If the proxy helper is running root with all capabilities, then a bug in > the proxy helper can easily turn into a full root exploit. > > If the proxy helper starts as root, chroots, and then immediately drops to > a non-root user, keeping only the CAP_CHOWN, CAP_DAC_OVERRIDE, CAP_FOWNER > and CAP_DAC_READ_SEARCH capabilities, then a bug in the proxy helper can > only be used to access files within the designated 9pfs export. If the > exported directory does not contain any important host system files, then > it is unlikely it can be used to create a full root exploit. > Thanks Daniel, I will add 'capabiliies' to proxy helper. CAP_FOWNER capability also need. I am working on the patches. I will post them in few days.
[Qemu-devel] [PATCH 1/2] ACPI: Update cpu hotplug event for guest
From: Liu Ping Fan Need to update cpu hotplug event for guest when updating SCI event Signed-off-by: Liu Ping Fan --- hw/acpi_piix4.c |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c index 44eb8ae..f585226 100644 --- a/hw/acpi_piix4.c +++ b/hw/acpi_piix4.c @@ -94,7 +94,8 @@ static void pm_update_sci(PIIX4PMState *s) ACPI_BITMASK_POWER_BUTTON_ENABLE | ACPI_BITMASK_GLOBAL_LOCK_ENABLE | ACPI_BITMASK_TIMER_ENABLE)) != 0) || -(((s->gpe.sts[0] & s->gpe.en[0]) & PIIX4_PCI_HOTPLUG_STATUS) != 0); +(((s->gpe.sts[0] & s->gpe.en[0]) & + (PIIX4_PCI_HOTPLUG_STATUS | PIIX4_CPU_HOTPLUG_STATUS)) != 0); qemu_set_irq(s->irq, sci_level); /* schedule a timer interruption if needed */ -- 1.7.4.4
[Qemu-devel] [PATCH] ACPI: Call ACPI remove handler when handling ACPI eject event
From: Liu Ping Fan Call the remove handler for ACPI_NOTIFY_EJECT_REQUEST Signed-off-by: Liu Ping Fan --- drivers/acpi/bus.c |2 +- drivers/acpi/scan.c |2 +- include/acpi/acpi_bus.h |2 ++ 3 files changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c index 437ddbf..d06ec6d 100644 --- a/drivers/acpi/bus.c +++ b/drivers/acpi/bus.c @@ -764,7 +764,7 @@ static void acpi_bus_notify(acpi_handle handle, u32 type, void *data) break; case ACPI_NOTIFY_EJECT_REQUEST: - /* TBD */ + acpi_os_hotplug_execute(acpi_bus_hot_remove_device, handle); break; case ACPI_NOTIFY_DEVICE_CHECK_LIGHT: diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c index 449c556..3b97b61 100644 --- a/drivers/acpi/scan.c +++ b/drivers/acpi/scan.c @@ -83,7 +83,7 @@ acpi_device_modalias_show(struct device *dev, struct device_attribute *attr, cha } static DEVICE_ATTR(modalias, 0444, acpi_device_modalias_show, NULL); -static void acpi_bus_hot_remove_device(void *context) +void acpi_bus_hot_remove_device(void *context) { struct acpi_device *device; acpi_handle handle = context; diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h index 6cd5b64..b19c09d 100644 --- a/include/acpi/acpi_bus.h +++ b/include/acpi/acpi_bus.h @@ -310,6 +310,8 @@ extern int unregister_acpi_notifier(struct notifier_block *); extern int register_acpi_bus_notifier(struct notifier_block *nb); extern void unregister_acpi_bus_notifier(struct notifier_block *nb); +extern void acpi_bus_hot_remove_device(void *context); + /* * External Functions */ -- 1.7.4.4
[Qemu-devel] [PATCH 2/2] LAPIC: make lapic support cpu hotplug
From: Liu Ping Fan Separate apic from qbus to icc bus which supports hotplug feature. And make the creation of apic as part of cpu initialization, so apic's state has been ready, before setting kvm_apic. Signed-off-by: Liu Ping Fan --- Makefile.target |1 + hw/apic.c|7 - hw/apic.h|1 + hw/icc_bus.c | 62 ++ hw/icc_bus.h | 24 +++ hw/pc.c | 11 - target-i386/cpu.h|1 + target-i386/helper.c |7 +- target-i386/kvm.c|1 - 9 files changed, 105 insertions(+), 10 deletions(-) create mode 100644 hw/icc_bus.c create mode 100644 hw/icc_bus.h diff --git a/Makefile.target b/Makefile.target index 9011f28..5607c6d 100644 --- a/Makefile.target +++ b/Makefile.target @@ -241,6 +241,7 @@ obj-i386-$(CONFIG_KVM) += kvmclock.o obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o obj-i386-y += testdev.o obj-i386-y += acpi.o acpi_piix4.o +obj-i386-y += icc_bus.o obj-i386-y += pcspk.o i8254.o obj-i386-$(CONFIG_KVM_PIT) += i8254-kvm.o diff --git a/hw/apic.c b/hw/apic.c index 69d6ac5..95a1664 100644 --- a/hw/apic.c +++ b/hw/apic.c @@ -24,6 +24,7 @@ #include "sysbus.h" #include "trace.h" #include "kvm.h" +#include "icc_bus.h" /* APIC Local Vector Table */ #define APIC_LVT_TIMER 0 @@ -304,11 +305,13 @@ void cpu_set_apic_base(DeviceState *d, uint64_t val) if (!s) return; + if (kvm_enabled() && kvm_irqchip_in_kernel()) s->apicbase = val; else s->apicbase = (val & 0xf000) | (s->apicbase & (MSR_IA32_APICBASE_BSP | MSR_IA32_APICBASE_ENABLE)); + /* if disabled, cannot be enabled again */ if (!(val & MSR_IA32_APICBASE_ENABLE)) { s->apicbase &= ~MSR_IA32_APICBASE_ENABLE; @@ -1075,7 +1078,7 @@ static const VMStateDescription vmstate_apic = { } }; -static void apic_reset(DeviceState *d) +void apic_reset(DeviceState *d) { APICState *s = DO_UPCAST(APICState, busdev.qdev, d); int bsp; @@ -1138,7 +1141,7 @@ static SysBusDeviceInfo apic_info = { static void apic_register_devices(void) { -sysbus_register_withprop(&apic_info); +iccbus_register(&apic_info); } device_init(apic_register_devices) diff --git a/hw/apic.h b/hw/apic.h index c857d52..e258efa 100644 --- a/hw/apic.h +++ b/hw/apic.h @@ -24,5 +24,6 @@ void apic_sipi(DeviceState *s); /* pc.c */ int cpu_is_bsp(CPUState *env); DeviceState *cpu_get_current_apic(void); +void apic_reset(DeviceState *d); #endif diff --git a/hw/icc_bus.c b/hw/icc_bus.c new file mode 100644 index 000..360ca2a --- /dev/null +++ b/hw/icc_bus.c @@ -0,0 +1,62 @@ +/* +*/ +#define ICC_BUS_PLUG +#ifdef ICC_BUS_PLUG +#include "icc_bus.h" + + + +struct icc_bus_info icc_info = { +.qinfo.name = "icc", +.qinfo.size = sizeof(struct icc_bus), +.qinfo.props = (Property[]) { +DEFINE_PROP_END_OF_LIST(), +} + +}; + + +static const VMStateDescription vmstate_icc_bus = { +.name = "icc_bus", +.version_id = 1, +.minimum_version_id = 1, +.minimum_version_id_old = 1, +.pre_save = NULL, +.post_load = NULL, +}; + +struct icc_bus *g_iccbus; + +struct icc_bus *icc_init_bus(DeviceState *parent, const char *name) +{ +struct icc_bus *bus; + +bus = FROM_QBUS(icc_bus, qbus_create(&icc_info.qinfo, parent, name)); +bus->qbus.allow_hotplug = 1; /* Yes, we can */ +bus->qbus.name = "icc"; +vmstate_register(NULL, -1, &vmstate_icc_bus, bus); +g_iccbus = bus; +return bus; +} + + + + + +static int iccbus_device_init(DeviceState *dev, DeviceInfo *base) +{ +SysBusDeviceInfo *info = container_of(base, SysBusDeviceInfo, qdev); + +return info->init(sysbus_from_qdev(dev)); +} + +void iccbus_register(SysBusDeviceInfo *info) +{ +info->qdev.init = iccbus_device_init; +info->qdev.bus_info = &icc_info.qinfo; + +assert(info->qdev.size >= sizeof(SysBusDevice)); +qdev_register(&info->qdev); +} + +#endif diff --git a/hw/icc_bus.h b/hw/icc_bus.h new file mode 100644 index 000..94d9242 --- /dev/null +++ b/hw/icc_bus.h @@ -0,0 +1,24 @@ +#ifndef QEMU_ICC_H +#define QEMU_ICC_H + +#include "qdev.h" +#include "sysbus.h" + +typedef struct icc_bus icc_bus; +typedef struct icc_bus_info icc_bus_info; + + +struct icc_bus { +BusState qbus; +}; + +struct icc_bus_info { +BusInfo qinfo; +}; + +extern struct icc_bus_info icc_info; +extern struct icc_bus *g_iccbus; +extern struct icc_bus *icc_init_bus(DeviceState *parent, const char *name); +extern void iccbus_register(SysBusDeviceInfo *info); + +#endif diff --git a/hw/pc.c b/hw/pc.c index 6b3662e..10371d8 100644 --- a/hw/pc.c +++ b/hw/pc.c @@ -24,6 +24,7 @@ #include "hw.h" #include "pc.h" #include "apic.h" +#include "icc_bus.h" #include "fdc.h" #include "ide.h" #include "pci.h" @@ -91,6 +92,7 @@ struct e820_table { static struct e820_table e820_table; struct hpet_fw_config hpet_cfg = {.count = UINT8_
[Qemu-devel] Enable x86 linux guest with cpu hotplug feature.
From: Liu Ping Fan A bunch of patches, which are applied separately for kernel and qemu-kvm, and make x86 based linux guest with cpu hotplug feature. For kernel: 0001-ACPI-Call-ACPI-remove-handler-when-handling-ACPI-eje.patch --call acpi eject event handler in acpi_bus_notify() For qemu-kvm: 0001-ACPI-Update-cpu-hotplug-event-for-guest.patch --trigger acpi cpu hotplug event for guest, so qemu-kvm can bring present cpu to online state 0002-LAPIC-make-lapic-support-cpu-hotplug.patch --make qemu-kvm support to bring possible cpu to present state. After applying these patches, we can bring possible cpu online by the following step. 1.Boot up qemu-kvm with the param "-smp 1,maxcpu=4", 2.Use "cpu_set 2 online" cmd in qemu monitor to bring possible cpu to present state. 3.In guest,echo 1 > cpu2/online to bring the present cpu to online state.
[Qemu-devel] [PATCH] ui/spice-core: fix segfault in monitor
Fix segfault if a qxl device is present but no spice command line argument is given. RHBZ 743251. Signed-off-by: Alon Levy --- ui/spice-core.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/ui/spice-core.c b/ui/spice-core.c index 3cbc721..b14d1d9 100644 --- a/ui/spice-core.c +++ b/ui/spice-core.c @@ -413,7 +413,7 @@ void do_info_spice(Monitor *mon, QObject **ret_data) int port, tls_port; char version_string[20]; /* 12 = |255.255.255\0| is the max */ -if (!spice_server) { +if (!spice_server || !opts) { *ret_data = qobject_from_jsonf("{ 'enabled': false }"); return; } -- 1.7.6.4
[Qemu-devel] s390: Fix cpu shutdown for KVM
qemu/kvm on s390 currently hangs on panic (doesnt exit on disabled wait) and also shuts down on cpu hot-unplug (SIGP stop). This patch tries to fix these simple cases. On s390 a shutdown is the state of all CPUs being either stopped or disabled (for interrupts) waiting. We have to track this number to call the shutdown sequence accordingly. This patch implements the counting and shutdown handling for the kvm path in qemu. Signed-off-by: Christian Borntraeger --- hw/s390-virtio.c |4 target-s390x/kvm.c | 29 + 2 files changed, 29 insertions(+), 4 deletions(-) Index: b/hw/s390-virtio.c === --- a/hw/s390-virtio.c +++ b/hw/s390-virtio.c @@ -130,6 +130,9 @@ int s390_virtio_hypercall(CPUState *env, return r; } +/* defined in target-s390x/kvm.c */ +extern int s390_running_cpus; + /* PC hardware initialisation */ static void s390_init(ram_addr_t my_ram_size, const char *boot_device, @@ -189,6 +192,7 @@ static void s390_init(ram_addr_t my_ram_ env->halted = 0; env->exception_index = 0; +s390_running_cpus = 1; if (kernel_filename) { kernel_size = load_image(kernel_filename, qemu_get_ram_ptr(0)); Index: b/target-s390x/kvm.c === --- a/target-s390x/kvm.c +++ b/target-s390x/kvm.c @@ -63,6 +63,14 @@ #define SCLP_CMDW_READ_SCP_INFO 0x00020001 #define SCLP_CMDW_READ_SCP_INFO_FORCED 0x00120001 +/* + * The number of running CPUs. On s390 a shutdown is the state of all CPUs + * being either stopped or disabled (for interrupts) waiting. We have to + * track this number to call the shutdown sequence accordingly. This + * number is modified either on startup or while holding the big qemu lock. + */ +int s390_running_cpus; + const KVMCapabilityInfo kvm_arch_required_capabilities[] = { KVM_CAP_LAST_INFO }; @@ -185,6 +193,12 @@ void kvm_s390_interrupt_internal(CPUStat return; } +/* + * We can only deliver interrupts to (interrupt) enabled CPUs. + * We dont modify s390_running_cpus here, since CPUs in enabled wait + * will wait inside the kernel (no exit). Therefore, the targeted + * CPUs was neither disabled waiting or stopped for qemu. + */ env->halted = 0; env->exception_index = -1; qemu_cpu_kick(env); @@ -301,6 +315,7 @@ static int s390_cpu_restart(CPUState *en kvm_s390_interrupt(env, KVM_S390_RESTART, 0); env->halted = 0; env->exception_index = -1; +s390_running_cpus++; qemu_cpu_kick(env); dprintf("DONE: SIGP cpu restart: %p\n", env); return 0; @@ -425,16 +440,24 @@ static int handle_intercept(CPUState *en r = handle_instruction(env, run); break; case ICPT_WAITPSW: -/* XXX What to do on system shutdown? */ +if (--s390_running_cpus == 0) { +qemu_system_shutdown_request(); +} env->halted = 1; env->exception_index = EXCP_HLT; +r = EXCP_HALTED; break; case ICPT_SOFT_INTERCEPT: fprintf(stderr, "KVM unimplemented icpt SOFT\n"); exit(1); break; case ICPT_CPU_STOP: -qemu_system_shutdown_request(); +if (--s390_running_cpus == 0) { +qemu_system_shutdown_request(); +} +env->halted = 1; +env->exception_index = EXCP_HLT; +r = EXCP_HALTED; break; case ICPT_IO: fprintf(stderr, "KVM unimplemented icpt IO\n"); @@ -468,8 +491,6 @@ int kvm_arch_handle_exit(CPUState *env, if (ret == 0) { ret = EXCP_INTERRUPT; -} else if (ret > 0) { -ret = 0; } return ret; }
[Qemu-devel] [PATCH] runstate: do not discard runstate changes when paused
Trying to migrate a paused machine fails. The reason is that the RSTATE_PRE_MIGRATE is reached with vm_stop, and this transition is eaten when the vm is already paused. This patch fixes the problem by always going through runstate_set and always notifying the new state. Signed-off-by: Paolo Bonzini --- cpus.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/cpus.c b/cpus.c index 8978779..eab8ff6 100644 --- a/cpus.c +++ b/cpus.c @@ -128,6 +128,8 @@ static void do_vm_stop(RunState state) qemu_aio_flush(); bdrv_flush_all(); monitor_protocol_event(QEVENT_STOP, NULL); +} else { +runstate_set(state); } } -- 1.7.6
Re: [Qemu-devel] [PATCH] s390: fix short kernel command lines
On 09/23/2011 03:38 PM, Christian Borntraeger wrote: The default kernel command line for s390 is "root=/dev/ram0 ro" When overriding this line, we have to ensure to also copy the \0 to avoid false lines, for example, -append "root=/dev/vda" will result in "root=/dev/vda0 ro" with the current code. Signed-off-by: Christian Borntraeger Thanks, applied to s390-next. Alex
[Qemu-devel] build errors on s390 tcg
With todays qemu git I get the following error when compiling for s390: In file included from /home/cborntra/REPOS/qemu/tcg/tcg.c:175:0: /home/cborntra/REPOS/qemu/tcg/s390/tcg-target.c:677:13: error: conflicting types for ‘tcg_out_mov’ /home/cborntra/REPOS/qemu/tcg/tcg.c:76:13: note: previous declaration of ‘tcg_out_mov’ was here /home/cborntra/REPOS/qemu/tcg/s390/tcg-target.c:689:13: error: conflicting types for ‘tcg_out_movi’ /home/cborntra/REPOS/qemu/tcg/tcg.c:77:13: note: previous declaration of ‘tcg_out_movi’ was here /home/cborntra/REPOS/qemu/tcg/s390/tcg-target.c:829:20: error: conflicting types for ‘tcg_out_ld’ /home/cborntra/REPOS/qemu/tcg/tcg.c:74:13: note: previous declaration of ‘tcg_out_ld’ was here /home/cborntra/REPOS/qemu/tcg/s390/tcg-target.c:839:20: error: conflicting types for ‘tcg_out_st’ /home/cborntra/REPOS/qemu/tcg/tcg.c:81:13: note: previous declaration of ‘tcg_out_st’ was here make[1]: *** [tcg/tcg.o] Error 1 reverting c0ad3001bf12292b137b05e1c4643f31c6b0a727 ""tcg: Add forward declarations for local functions" solves the problem, but it just hides the problem. Any ideas howto solve that properly? Christian
Re: [Qemu-devel] s390: Fix cpu shutdown for KVM
On 10/04/2011 01:28 PM, Christian Borntraeger wrote: qemu/kvm on s390 currently hangs on panic (doesnt exit on disabled wait) and also shuts down on cpu hot-unplug (SIGP stop). This patch tries to fix these simple cases. On s390 a shutdown is the state of all CPUs being either stopped or disabled (for interrupts) waiting. We have to track this number to call the shutdown sequence accordingly. This patch implements the counting and shutdown handling for the kvm path in qemu. Signed-off-by: Christian Borntraeger --- hw/s390-virtio.c |4 target-s390x/kvm.c | 29 + 2 files changed, 29 insertions(+), 4 deletions(-) Index: b/hw/s390-virtio.c === --- a/hw/s390-virtio.c +++ b/hw/s390-virtio.c @@ -130,6 +130,9 @@ int s390_virtio_hypercall(CPUState *env, return r; } +/* defined in target-s390x/kvm.c */ +extern int s390_running_cpus; This breaks non-kvm builds. Please put it into s390-virtio.c and export functions that get called from target-s390x/kvm.c which add/remove running CPUs from that counter. Alex + /* PC hardware initialisation */ static void s390_init(ram_addr_t my_ram_size, const char *boot_device, @@ -189,6 +192,7 @@ static void s390_init(ram_addr_t my_ram_ env->halted = 0; env->exception_index = 0; +s390_running_cpus = 1; if (kernel_filename) { kernel_size = load_image(kernel_filename, qemu_get_ram_ptr(0)); Index: b/target-s390x/kvm.c === --- a/target-s390x/kvm.c +++ b/target-s390x/kvm.c @@ -63,6 +63,14 @@ #define SCLP_CMDW_READ_SCP_INFO 0x00020001 #define SCLP_CMDW_READ_SCP_INFO_FORCED 0x00120001 +/* + * The number of running CPUs. On s390 a shutdown is the state of all CPUs + * being either stopped or disabled (for interrupts) waiting. We have to + * track this number to call the shutdown sequence accordingly. This + * number is modified either on startup or while holding the big qemu lock. + */ +int s390_running_cpus; + const KVMCapabilityInfo kvm_arch_required_capabilities[] = { KVM_CAP_LAST_INFO }; @@ -185,6 +193,12 @@ void kvm_s390_interrupt_internal(CPUStat return; } +/* + * We can only deliver interrupts to (interrupt) enabled CPUs. + * We dont modify s390_running_cpus here, since CPUs in enabled wait + * will wait inside the kernel (no exit). Therefore, the targeted + * CPUs was neither disabled waiting or stopped for qemu. + */ env->halted = 0; env->exception_index = -1; qemu_cpu_kick(env); @@ -301,6 +315,7 @@ static int s390_cpu_restart(CPUState *en kvm_s390_interrupt(env, KVM_S390_RESTART, 0); env->halted = 0; env->exception_index = -1; +s390_running_cpus++; qemu_cpu_kick(env); dprintf("DONE: SIGP cpu restart: %p\n", env); return 0; @@ -425,16 +440,24 @@ static int handle_intercept(CPUState *en r = handle_instruction(env, run); break; case ICPT_WAITPSW: -/* XXX What to do on system shutdown? */ +if (--s390_running_cpus == 0) { +qemu_system_shutdown_request(); +} env->halted = 1; env->exception_index = EXCP_HLT; +r = EXCP_HALTED; break; case ICPT_SOFT_INTERCEPT: fprintf(stderr, "KVM unimplemented icpt SOFT\n"); exit(1); break; case ICPT_CPU_STOP: -qemu_system_shutdown_request(); +if (--s390_running_cpus == 0) { +qemu_system_shutdown_request(); +} +env->halted = 1; +env->exception_index = EXCP_HLT; +r = EXCP_HALTED; break; case ICPT_IO: fprintf(stderr, "KVM unimplemented icpt IO\n"); @@ -468,8 +491,6 @@ int kvm_arch_handle_exit(CPUState *env, if (ret == 0) { ret = EXCP_INTERRUPT; -} else if (ret> 0) { -ret = 0; } return ret; }
Re: [Qemu-devel] [PATCH 11/22] i8259: Update IRQ state after reset
On 10/02/2011 10:36 PM, Blue Swirl wrote: On Sun, Oct 2, 2011 at 8:31 PM, Avi Kivity wrote: > >> > >> > In fact these aren't problems. The packet may be sent or data >> > written, as long as they aren't corrupted. A device is allowed to >> > "delay" a reset (but not indefinitely). >> >> Oh, but corruption could easily happen. Consider for example a disk >> controller waiting for DMA ready signal a device separate from the >> DMA >> controller. Due to reset glitches in the device or signal chain, the >> DMA ready signal arrives but the DMA controller still contains old >> information, writing the data to disk from wrong memory location. > > > Would not this corruption also happen on real hardware? If reset to the disk controller is delayed by a slow gate or extra capacitance on a line? Maybe, but the delays are probably too short on real HW before any packets are sent or disk gets written. On QEMU, I/O can be instantaneous. As an anecdote, while reading a chip errata I came upon this: 15. CPU May Record Signal Glitches When MCH is Being Reset Problem: When the MCH is reset via RSTIN# the CPU may record any one or more of the following errors: address strobe glitch (MSR IA32_MCi_STATUS bit 23), data strobe glitch (MSR IA32_MCi_STATUS bit 22), P/N data strobes out of sync (MSR IA32_MCi_STATUS bit 21), PIC & FSB data parity (MSR IA32_MCi_STATUS bit 19), RSP parity (MSR IA32_MCi_STATUS bit 18), or FSB address parity (MSR IA32_MCi_STATUS bit 16). This can happen when the MCH asserts CPURST# just after the MCH drives an FSB transaction. This may happen because RSTIN# and CPURST# maintain an asynchronous relationship with each other. Workaround: None. Status: No Fix (of course the situation there is different, there is no global reset signal) -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] build errors on s390 tcg
On 10/04/2011 01:54 PM, Christian Borntraeger wrote: With todays qemu git I get the following error when compiling for s390: In file included from /home/cborntra/REPOS/qemu/tcg/tcg.c:175:0: /home/cborntra/REPOS/qemu/tcg/s390/tcg-target.c:677:13: error: conflicting types for ‘tcg_out_mov’ /home/cborntra/REPOS/qemu/tcg/tcg.c:76:13: note: previous declaration of ‘tcg_out_mov’ was here /home/cborntra/REPOS/qemu/tcg/s390/tcg-target.c:689:13: error: conflicting types for ‘tcg_out_movi’ /home/cborntra/REPOS/qemu/tcg/tcg.c:77:13: note: previous declaration of ‘tcg_out_movi’ was here /home/cborntra/REPOS/qemu/tcg/s390/tcg-target.c:829:20: error: conflicting types for ‘tcg_out_ld’ /home/cborntra/REPOS/qemu/tcg/tcg.c:74:13: note: previous declaration of ‘tcg_out_ld’ was here /home/cborntra/REPOS/qemu/tcg/s390/tcg-target.c:839:20: error: conflicting types for ‘tcg_out_st’ /home/cborntra/REPOS/qemu/tcg/tcg.c:81:13: note: previous declaration of ‘tcg_out_st’ was here make[1]: *** [tcg/tcg.o] Error 1 reverting c0ad3001bf12292b137b05e1c4643f31c6b0a727 ""tcg: Add forward declarations for local functions" solves the problem, but it just hides the problem. Any ideas howto solve that properly? The problem is that tcg.c defines the functions with int arguments for TCG register indexes, while s390/tcg-target.c takes TCGReg parameters. I'm not sure which way is better, but using TCGReg feels more type safe to me. Alex
Re: [Qemu-devel] build errors on s390 tcg
On 4 October 2011 13:23, Alexander Graf wrote: > The problem is that tcg.c defines the functions with int arguments for TCG > register indexes, while s390/tcg-target.c takes TCGReg parameters. I'm not > sure which way is better, but using TCGReg feels more type safe to me. You're not consistent, though; for example the s390 tcg_target_reg_alloc_order array is declared as int[], not TCGReg[]. -- PMM
Re: [Qemu-devel] [PATCH 2/2] LAPIC: make lapic support cpu hotplug
On 10/04/2011 06:13 AM, pingf...@linux.vnet.com wrote: From: Liu Ping Fan Separate apic from qbus to icc bus which supports hotplug feature. And make the creation of apic as part of cpu initialization, so apic's state has been ready, before setting kvm_apic. Signed-off-by: Liu Ping Fan --- Makefile.target |1 + hw/apic.c|7 - hw/apic.h|1 + hw/icc_bus.c | 62 ++ hw/icc_bus.h | 24 +++ hw/pc.c | 11 - target-i386/cpu.h|1 + target-i386/helper.c |7 +- target-i386/kvm.c|1 - 9 files changed, 105 insertions(+), 10 deletions(-) create mode 100644 hw/icc_bus.c create mode 100644 hw/icc_bus.h diff --git a/Makefile.target b/Makefile.target index 9011f28..5607c6d 100644 --- a/Makefile.target +++ b/Makefile.target @@ -241,6 +241,7 @@ obj-i386-$(CONFIG_KVM) += kvmclock.o obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o obj-i386-y += testdev.o obj-i386-y += acpi.o acpi_piix4.o +obj-i386-y += icc_bus.o obj-i386-y += pcspk.o i8254.o obj-i386-$(CONFIG_KVM_PIT) += i8254-kvm.o diff --git a/hw/apic.c b/hw/apic.c index 69d6ac5..95a1664 100644 --- a/hw/apic.c +++ b/hw/apic.c @@ -24,6 +24,7 @@ #include "sysbus.h" #include "trace.h" #include "kvm.h" +#include "icc_bus.h" /* APIC Local Vector Table */ #define APIC_LVT_TIMER 0 @@ -304,11 +305,13 @@ void cpu_set_apic_base(DeviceState *d, uint64_t val) if (!s) return; + if (kvm_enabled()&& kvm_irqchip_in_kernel()) s->apicbase = val; else s->apicbase = (val& 0xf000) | (s->apicbase& (MSR_IA32_APICBASE_BSP | MSR_IA32_APICBASE_ENABLE)); + /* if disabled, cannot be enabled again */ if (!(val& MSR_IA32_APICBASE_ENABLE)) { s->apicbase&= ~MSR_IA32_APICBASE_ENABLE; @@ -1075,7 +1078,7 @@ static const VMStateDescription vmstate_apic = { } }; Please don't introduce extra whitespace in unrelated code. -static void apic_reset(DeviceState *d) +void apic_reset(DeviceState *d) { APICState *s = DO_UPCAST(APICState, busdev.qdev, d); int bsp; @@ -1138,7 +1141,7 @@ static SysBusDeviceInfo apic_info = { static void apic_register_devices(void) { -sysbus_register_withprop(&apic_info); +iccbus_register(&apic_info); } device_init(apic_register_devices) diff --git a/hw/apic.h b/hw/apic.h index c857d52..e258efa 100644 --- a/hw/apic.h +++ b/hw/apic.h @@ -24,5 +24,6 @@ void apic_sipi(DeviceState *s); /* pc.c */ int cpu_is_bsp(CPUState *env); DeviceState *cpu_get_current_apic(void); +void apic_reset(DeviceState *d); #endif diff --git a/hw/icc_bus.c b/hw/icc_bus.c new file mode 100644 index 000..360ca2a --- /dev/null +++ b/hw/icc_bus.c @@ -0,0 +1,62 @@ +/* +*/ New files need to include copyright/licenses. +#define ICC_BUS_PLUG +#ifdef ICC_BUS_PLUG Drop these guards here and at the end of the file. We conditionally build files by having an: obj-$(CONFIG_ICC_BUS) += icc_bus.o In the Makefile. +#include "icc_bus.h" + + + +struct icc_bus_info icc_info = { +.qinfo.name = "icc", +.qinfo.size = sizeof(struct icc_bus), +.qinfo.props = (Property[]) { +DEFINE_PROP_END_OF_LIST(), +} + +}; Structure name is not following Coding Style. + +static const VMStateDescription vmstate_icc_bus = { +.name = "icc_bus", +.version_id = 1, +.minimum_version_id = 1, +.minimum_version_id_old = 1, +.pre_save = NULL, +.post_load = NULL, +}; + +struct icc_bus *g_iccbus; + +struct icc_bus *icc_init_bus(DeviceState *parent, const char *name) +{ +struct icc_bus *bus; + +bus = FROM_QBUS(icc_bus, qbus_create(&icc_info.qinfo, parent, name)); +bus->qbus.allow_hotplug = 1; /* Yes, we can */ +bus->qbus.name = "icc"; +vmstate_register(NULL, -1,&vmstate_icc_bus, bus); +g_iccbus = bus; +return bus; +} + + + + + +static int iccbus_device_init(DeviceState *dev, DeviceInfo *base) +{ +SysBusDeviceInfo *info = container_of(base, SysBusDeviceInfo, qdev); + +return info->init(sysbus_from_qdev(dev)); +} + +void iccbus_register(SysBusDeviceInfo *info) +{ +info->qdev.init = iccbus_device_init; +info->qdev.bus_info =&icc_info.qinfo; + +assert(info->qdev.size>= sizeof(SysBusDevice)); +qdev_register(&info->qdev); +} + It's not obvious to me why you need the g_iccbus variable. +#endif diff --git a/hw/icc_bus.h b/hw/icc_bus.h new file mode 100644 index 000..94d9242 --- /dev/null +++ b/hw/icc_bus.h @@ -0,0 +1,24 @@ +#ifndef QEMU_ICC_H +#define QEMU_ICC_H Needs copyright and a blurb explaining what this file is and what the ICC bus is. Same comments re: whitespace below. Regards, Anthony Liguori +#include "qdev.h" +#include "sysbus.h" + +typedef struct icc_bus icc_bus; +typedef struct icc_bus_info icc_bus_info; + + +struct icc_bus { +BusState qbus; +}; + +s
[Qemu-devel] [PATCH] runstate: add more valid transitions
This patch adds more valid transitions to the table, and avoids that the VM remains stuck in RSTATE_SAVEVM state when savevm is done on a paused virtual machine. Signed-off-by: Paolo Bonzini --- savevm.c | 11 --- vl.c |5 + 2 files changed, 13 insertions(+), 3 deletions(-) diff --git a/savevm.c b/savevm.c index 46f2447..ba5beb0 100644 --- a/savevm.c +++ b/savevm.c @@ -1599,7 +1599,7 @@ void qemu_savevm_state_cancel(Monitor *mon, QEMUFile *f) static int qemu_savevm_state(Monitor *mon, QEMUFile *f) { -int saved_vm_running; +int saved_vm_state; int ret; saved_vm_running = runstate_is_running(); @@ -1626,8 +1626,13 @@ out: if (qemu_file_has_error(f)) ret = -EIO; -if (!ret && saved_vm_running) -vm_start(); +if (!ret) { +if (saved_vm_running) { +vm_start(); +} else { +runstate_set(RSTATE_PAUSED); +} +} return ret; } diff --git a/vl.c b/vl.c index bd4a5ce..165712e 100644 --- a/vl.c +++ b/vl.c @@ -346,8 +346,12 @@ static const RunStateTransition runstate_transitions_def[] = { { RSTATE_IO_ERROR, RSTATE_RUNNING }, { RSTATE_PAUSED, RSTATE_RUNNING }, +{ RSTATE_PAUSED, RSTATE_PRE_MIGRATE }, +{ RSTATE_PAUSED, RSTATE_SAVEVM }, { RSTATE_POST_MIGRATE, RSTATE_RUNNING }, +{ RSTATE_POST_MIGRATE, RSTATE_PRE_MIGRATE }, +{ RSTATE_POST_MIGRATE, RSTATE_SAVEVM }, { RSTATE_PRE_LAUNCH, RSTATE_RUNNING }, { RSTATE_PRE_LAUNCH, RSTATE_POST_MIGRATE }, @@ -368,6 +372,7 @@ static const RunStateTransition runstate_transitions_def[] = { { RSTATE_RUNNING, RSTATE_WATCHDOG }, { RSTATE_SAVEVM, RSTATE_RUNNING }, +{ RSTATE_SAVEVM, RSTATE_PAUSED }, { RSTATE_SHUTDOWN, RSTATE_PAUSED }, -- 1.7.6
[Qemu-devel] [PATCH v2] runstate: add more valid transitions
This patch adds more valid transitions to the table, and avoids that the VM remains stuck in RSTATE_SAVEVM state when savevm is done on a paused virtual machine. Signed-off-by: Paolo Bonzini --- savevm.c |9 +++-- vl.c |5 + 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/savevm.c b/savevm.c index 46f2447..ff37968 100644 --- a/savevm.c +++ b/savevm.c @@ -1626,8 +1626,13 @@ out: if (qemu_file_has_error(f)) ret = -EIO; -if (!ret && saved_vm_running) -vm_start(); +if (!ret) { +if (saved_vm_running) { +vm_start(); + } else { + runstate_set(RSTATE_PAUSED); + } +} return ret; } diff --git a/vl.c b/vl.c index bd4a5ce..165712e 100644 --- a/vl.c +++ b/vl.c @@ -346,8 +346,12 @@ static const RunStateTransition runstate_transitions_def[] = { { RSTATE_IO_ERROR, RSTATE_RUNNING }, { RSTATE_PAUSED, RSTATE_RUNNING }, +{ RSTATE_PAUSED, RSTATE_PRE_MIGRATE }, +{ RSTATE_PAUSED, RSTATE_SAVEVM }, { RSTATE_POST_MIGRATE, RSTATE_RUNNING }, +{ RSTATE_POST_MIGRATE, RSTATE_PRE_MIGRATE }, +{ RSTATE_POST_MIGRATE, RSTATE_SAVEVM }, { RSTATE_PRE_LAUNCH, RSTATE_RUNNING }, { RSTATE_PRE_LAUNCH, RSTATE_POST_MIGRATE }, @@ -368,6 +372,7 @@ static const RunStateTransition runstate_transitions_def[] = { { RSTATE_RUNNING, RSTATE_WATCHDOG }, { RSTATE_SAVEVM, RSTATE_RUNNING }, +{ RSTATE_SAVEVM, RSTATE_PAUSED }, { RSTATE_SHUTDOWN, RSTATE_PAUSED }, -- 1.7.6
Re: [Qemu-devel] KVM call agenda for October 4th
Juan Quintela wrote: > Hi > > Please send in any agenda items you are interested in covering. As there is no agenda, this week call gets cancelled. Happy hacking, Juan.
Re: [Qemu-devel] s390: Fix cpu shutdown for KVM
Something like the following? s390: Fix cpu shutdown for KVM On s390 a shutdown is the state of all CPUs being either stopped or disabled (for interrupts) waiting. We have to track this number to call the shutdown sequence accordingly. This patch implements the counting and shutdown handling for the kvm path in qemu. Signed-off-by: Christian Borntraeger --- hw/s390-virtio.c | 21 + target-s390x/cpu.h | 12 target-s390x/kvm.c | 21 + 3 files changed, 50 insertions(+), 4 deletions(-) Index: b/hw/s390-virtio.c === --- a/hw/s390-virtio.c +++ b/hw/s390-virtio.c @@ -130,6 +130,26 @@ int s390_virtio_hypercall(CPUState *env, return r; } +/* + * The number of running CPUs. On s390 a shutdown is the state of all CPUs + * being either stopped or disabled (for interrupts) waiting. We have to + * track this number to call the shutdown sequence accordingly. This + * number is modified either on startup or while holding the big qemu lock. + */ +static unsigned s390_running_cpus; + +void kvm_s390_add_running_cpu(CPUState *env) +{ +assert(env->halted == 0); +s390_running_cpus++; +} + +unsigned kvm_s390_del_running_cpu(CPUState *env) +{ +assert(s390_running_cpus >= 1); +return --s390_running_cpus; +} + /* PC hardware initialisation */ static void s390_init(ram_addr_t my_ram_size, const char *boot_device, @@ -189,6 +209,7 @@ static void s390_init(ram_addr_t my_ram_ env->halted = 0; env->exception_index = 0; +kvm_s390_add_running_cpu(env); if (kernel_filename) { kernel_size = load_image(kernel_filename, qemu_get_ram_ptr(0)); Index: b/target-s390x/cpu.h === --- a/target-s390x/cpu.h +++ b/target-s390x/cpu.h @@ -292,6 +292,8 @@ void kvm_s390_interrupt(CPUState *env, i void kvm_s390_virtio_irq(CPUState *env, int config_change, uint64_t token); void kvm_s390_interrupt_internal(CPUState *env, int type, uint32_t parm, uint64_t parm64, int vm); +void kvm_s390_add_running_cpu(CPUState *env); +unsigned kvm_s390_del_running_cpu(CPUState *env); #else static inline void kvm_s390_interrupt(CPUState *env, int type, uint32_t code) { @@ -307,6 +309,16 @@ static inline void kvm_s390_interrupt_in int vm) { } + +static inline void kvm_s390_add_running_cpu(CPUState *env) +{ +} + +static inline unsigned kvm_s390_del_running_cpu(CPUState *env) +{ +return 0; +} + #endif CPUState *s390_cpu_addr2state(uint16_t cpu_addr); Index: b/target-s390x/kvm.c === --- a/target-s390x/kvm.c +++ b/target-s390x/kvm.c @@ -185,6 +185,12 @@ void kvm_s390_interrupt_internal(CPUStat return; } +/* + * We can only deliver interrupts to (interrupt) enabled CPUs. + * We dont call kvm_s390_add_running_cpu here, since CPUs in enabled wait + * will wait inside the kernel (no exit). Therefore, the targeted + * CPUs was neither disabled waiting or stopped for qemu. + */ env->halted = 0; env->exception_index = -1; qemu_cpu_kick(env); @@ -301,6 +307,7 @@ static int s390_cpu_restart(CPUState *en kvm_s390_interrupt(env, KVM_S390_RESTART, 0); env->halted = 0; env->exception_index = -1; +kvm_s390_add_running_cpu(env); qemu_cpu_kick(env); dprintf("DONE: SIGP cpu restart: %p\n", env); return 0; @@ -425,16 +432,24 @@ static int handle_intercept(CPUState *en r = handle_instruction(env, run); break; case ICPT_WAITPSW: -/* XXX What to do on system shutdown? */ +if (kvm_s390_del_running_cpu(env) == 0) { +qemu_system_shutdown_request(); +} env->halted = 1; env->exception_index = EXCP_HLT; +r = EXCP_HALTED; break; case ICPT_SOFT_INTERCEPT: fprintf(stderr, "KVM unimplemented icpt SOFT\n"); exit(1); break; case ICPT_CPU_STOP: -qemu_system_shutdown_request(); +if (kvm_s390_del_running_cpu(env) == 0) { +qemu_system_shutdown_request(); +} +env->halted = 1; +env->exception_index = EXCP_HLT; +r = EXCP_HALTED; break; case ICPT_IO: fprintf(stderr, "KVM unimplemented icpt IO\n"); @@ -468,8 +483,6 @@ int kvm_arch_handle_exit(CPUState *env, if (ret == 0) { ret = EXCP_INTERRUPT; -} else if (ret > 0) { -ret = 0; } return ret; }
Re: [Qemu-devel] [PATCH] runstate: do not discard runstate changes when paused
On Tue, 4 Oct 2011 14:04:45 +0200 Paolo Bonzini wrote: > Trying to migrate a paused machine fails. The reason is that > the RSTATE_PRE_MIGRATE is reached with vm_stop, and this > transition is eaten when the vm is already paused. This patch > fixes the problem by always going through runstate_set and > always notifying the new state. There's a semantic change which I'm not completely sure it won't generate unexpected side-effects: today vm_stop() will only carry any action if the machine is running, otherwise it's no-op. This patch changes that. We probably can fix it by adding a new transition to the transition table too... > > Signed-off-by: Paolo Bonzini > --- > cpus.c |2 ++ > 1 files changed, 2 insertions(+), 0 deletions(-) > > diff --git a/cpus.c b/cpus.c > index 8978779..eab8ff6 100644 > --- a/cpus.c > +++ b/cpus.c > @@ -128,6 +128,8 @@ static void do_vm_stop(RunState state) > qemu_aio_flush(); > bdrv_flush_all(); > monitor_protocol_event(QEVENT_STOP, NULL); > +} else { > +runstate_set(state); > } > } >
Re: [Qemu-devel] [PATCH 0/1] Make the help info more friendly in monitor
On Wed, 28 Sep 2011 10:16:19 +0100 Stefan Hajnoczi wrote: > On Wed, Sep 28, 2011 at 10:00 AM, Wayne Xia > wrote: > > During my test, I found it inconvenient when I type "help" or "help info", > > because the information was shown without orderliness. This patch would just > > show the help information in sorted order. > > > > For eg: > > (qemu)help > > acl_add > > acl_policy > > acl_remove > > acl_reset > > acl_show > > balloon > > block_passwd > > ... > > the command list is sorted. > > > > Wayne Xia (1): > > Sort the help info shown in monitor > > > > monitor.c | 97 > > ++-- > > 1 files changed, 93 insertions(+), 4 deletions(-) > > This is a nice idea. We could keep hmp/qmp-commands.hx in sorted > order but that prevents us from keeping related commands together in > those files (and the generated documentation?). So sorting at > run-time makes sense. The info help command reads from hmp-commands.hx and the info_cmds array, I would prefer to get those sorted.
Re: [Qemu-devel] s390: Fix cpu shutdown for KVM
On 4 October 2011 14:47, Christian Borntraeger wrote: > + * We can only deliver interrupts to (interrupt) enabled CPUs. > + * We dont call kvm_s390_add_running_cpu here, since CPUs in enabled wait > + * will wait inside the kernel (no exit). Therefore, the targeted > + * CPUs was neither disabled waiting or stopped for qemu. Grammar nits, since I'm commenting anyway: "don't" "neither disabled waiting nor stopped" > @@ -425,16 +432,24 @@ static int handle_intercept(CPUState *en > r = handle_instruction(env, run); > break; > case ICPT_WAITPSW: > - /* XXX What to do on system shutdown? */ > + if (kvm_s390_del_running_cpu(env) == 0) { > + qemu_system_shutdown_request(); > + } > env->halted = 1; > env->exception_index = EXCP_HLT; > + r = EXCP_HALTED; > break; > case ICPT_SOFT_INTERCEPT: > fprintf(stderr, "KVM unimplemented icpt SOFT\n"); > exit(1); > break; > case ICPT_CPU_STOP: > - qemu_system_shutdown_request(); > + if (kvm_s390_del_running_cpu(env) == 0) { > + qemu_system_shutdown_request(); > + } > + env->halted = 1; > + env->exception_index = EXCP_HLT; > + r = EXCP_HALTED; > break; > case ICPT_IO: > fprintf(stderr, "KVM unimplemented icpt IO\n"); This makes the ICPT_CPU_STOP and ICPT_WAITPSW cases identical, right? You should just fold them together. -- PMM
Re: [Qemu-devel] [PATCH] runstate: do not discard runstate changes when paused
On 10/04/2011 03:49 PM, Luiz Capitulino wrote: There's a semantic change which I'm not completely sure it won't generate unexpected side-effects: today vm_stop() will only carry any action if the machine is running, otherwise it's no-op. This patch changes that. More or less, yes. I tried to limit the semantic change by not running notifiers, which again could be better or worse. I don't think adding a new transition is a good solution, because you'll have to add a transition from PAUSED to anything that uses runstate_set instead of vm_stop. However, you could change all vm_stop() to vm_stop(RSTATE_PAUSED) followed by runstate_set(), adjust the transition table consequently and possibly drop the argument to vm_stop. I tried to get the smallest patch, but I did need to follow-up with changes to the transition table. In any case, can I assume this to be in your hands now? :) Paolo
Re: [Qemu-devel] s390: Fix cpu shutdown for KVM
On 04/10/11 15:56, Peter Maydell wrote: > On 4 October 2011 14:47, Christian Borntraeger wrote: >> + * We can only deliver interrupts to (interrupt) enabled CPUs. >> + * We dont call kvm_s390_add_running_cpu here, since CPUs in enabled >> wait >> + * will wait inside the kernel (no exit). Therefore, the targeted >> + * CPUs was neither disabled waiting or stopped for qemu. > > Grammar nits, since I'm commenting anyway: > "don't" > "neither disabled waiting nor stopped" Ok. Alex can you fix that up or do you want a new patch? > This makes the ICPT_CPU_STOP and ICPT_WAITPSW cases identical, > right? You should just fold them together. Yes, at the moment they are identical. I am still thinking about some additional changes that will make them separate again due to their usage in Linux: - disabled wait usually indicates a kernel panic - stop is called during cpu hot unplug and during shutdown for the last cpu So on disabled wait we might want to perform extra logging etc, but I dont know yet. Should I merge them anyway? Christian
Re: [Qemu-devel] [PATCH 05/23] migration: Refactor MigrationState creation
On 09/23/2011 07:56 AM, Juan Quintela wrote: Signed-off-by: Juan Quintela --- migration-exec.c | 16 +--- migration-fd.c | 16 +--- migration-tcp.c | 15 +-- migration-unix.c | 15 +-- migration.c | 29 + migration.h | 11 +++ 6 files changed, 32 insertions(+), 70 deletions(-) diff --git a/migration-exec.c b/migration-exec.c index 0ed5976..db11374 100644 --- a/migration-exec.c +++ b/migration-exec.c @@ -71,7 +71,7 @@ MigrationState *exec_start_outgoing_migration(Monitor *mon, MigrationState *s; FILE *f; -s = g_malloc0(sizeof(*s)); +s = migrate_create_state(mon, bandwidth_limit, detach, blk, inc); f = popen(command, "w"); if (f == NULL) { @@ -92,20 +92,6 @@ MigrationState *exec_start_outgoing_migration(Monitor *mon, s->close = exec_close; s->get_error = file_errno; s->write = file_write; -s->cancel = migrate_fd_cancel; -s->get_status = migrate_fd_get_status; -s->release = migrate_fd_release; - -s->blk = blk; -s->shared = inc; - -s->state = MIG_STATE_ACTIVE; -s->mon = NULL; -s->bandwidth_limit = bandwidth_limit; - -if (!detach) { -migrate_fd_monitor_suspend(s, mon); -} migrate_fd_connect(s); return s; diff --git a/migration-fd.c b/migration-fd.c index e78fd4e..2235a2f 100644 --- a/migration-fd.c +++ b/migration-fd.c @@ -59,7 +59,7 @@ MigrationState *fd_start_outgoing_migration(Monitor *mon, { MigrationState *s; -s = g_malloc0(sizeof(*s)); +s = migrate_create_state(mon, bandwidth_limit, detach, blk, inc); s->fd = monitor_get_fd(mon, fdname); if (s->fd == -1) { @@ -75,20 +75,6 @@ MigrationState *fd_start_outgoing_migration(Monitor *mon, s->get_error = fd_errno; s->write = fd_write; s->close = fd_close; -s->cancel = migrate_fd_cancel; -s->get_status = migrate_fd_get_status; -s->release = migrate_fd_release; - -s->blk = blk; -s->shared = inc; - -s->state = MIG_STATE_ACTIVE; -s->mon = NULL; -s->bandwidth_limit = bandwidth_limit; - -if (!detach) { -migrate_fd_monitor_suspend(s, mon); -} migrate_fd_connect(s); return s; diff --git a/migration-tcp.c b/migration-tcp.c index d6feb23..db8cea2 100644 --- a/migration-tcp.c +++ b/migration-tcp.c @@ -89,21 +89,12 @@ MigrationState *tcp_start_outgoing_migration(Monitor *mon, if (parse_host_port(&addr, host_port)< 0) return NULL; -s = g_malloc0(sizeof(*s)); +s = migrate_create_state(mon, bandwidth_limit, detach, blk, inc); s->get_error = socket_errno; s->write = socket_write; s->close = tcp_close; -s->cancel = migrate_fd_cancel; -s->get_status = migrate_fd_get_status; -s->release = migrate_fd_release; -s->blk = blk; -s->shared = inc; - -s->state = MIG_STATE_ACTIVE; -s->mon = NULL; -s->bandwidth_limit = bandwidth_limit; s->fd = qemu_socket(PF_INET, SOCK_STREAM, 0); if (s->fd == -1) { g_free(s); @@ -112,10 +103,6 @@ MigrationState *tcp_start_outgoing_migration(Monitor *mon, socket_set_nonblock(s->fd); -if (!detach) { -migrate_fd_monitor_suspend(s, mon); -} - do { ret = connect(s->fd, (struct sockaddr *)&addr, sizeof(addr)); if (ret == -1) diff --git a/migration-unix.c b/migration-unix.c index 3b9017b..74c6dde 100644 --- a/migration-unix.c +++ b/migration-unix.c @@ -88,21 +88,12 @@ MigrationState *unix_start_outgoing_migration(Monitor *mon, addr.sun_family = AF_UNIX; snprintf(addr.sun_path, sizeof(addr.sun_path), "%s", path); -s = g_malloc0(sizeof(*s)); +s = migrate_create_state(mon, bandwidth_limit, detach, blk, inc); s->get_error = unix_errno; s->write = unix_write; s->close = unix_close; -s->cancel = migrate_fd_cancel; -s->get_status = migrate_fd_get_status; -s->release = migrate_fd_release; -s->blk = blk; -s->shared = inc; - -s->state = MIG_STATE_ACTIVE; -s->mon = NULL; -s->bandwidth_limit = bandwidth_limit; s->fd = qemu_socket(PF_UNIX, SOCK_STREAM, 0); if (s->fd< 0) { DPRINTF("Unable to open socket"); @@ -125,10 +116,6 @@ MigrationState *unix_start_outgoing_migration(Monitor *mon, goto err_after_open; } -if (!detach) { -migrate_fd_monitor_suspend(s, mon); -} - if (ret>= 0) migrate_fd_connect(s); diff --git a/migration.c b/migration.c index b27a322..990667e 100644 --- a/migration.c +++ b/migration.c @@ -263,7 +263,7 @@ void do_info_migrate(Monitor *mon, QObject **ret_data) /* shared migration helpers */ -void migrate_fd_monitor_suspend(MigrationState *s, Monitor *mon) +static void migrate_fd_monitor_suspend(MigrationState *s, Monitor *mon) { s->mon = mon; if (monitor_suspend(mon) == 0) { @@ -404,12 +404,12 @@ void migrate_fd_put_r
Re: [Qemu-devel] [PATCH 14/23] migration: Move exported functions to the end of the file
On 09/23/2011 07:57 AM, Juan Quintela wrote: This means we can remove the two forward declarations. Signed-off-by: Juan Quintela Reviewed-by: Anthony Liguori Regards, Anthony Liguori --- migration.c | 188 -- 1 files changed, 91 insertions(+), 97 deletions(-) diff --git a/migration.c b/migration.c index 9bc7ffa..9bb089a 100644 --- a/migration.c +++ b/migration.c @@ -77,91 +77,6 @@ void process_incoming_migration(QEMUFile *f) } } -static MigrationState *migrate_create_state(Monitor *mon, -int64_t bandwidth_limit, -int detach, int blk, int inc); - -int do_migrate(Monitor *mon, const QDict *qdict, QObject **ret_data) -{ -MigrationState *s = NULL; -const char *p; -int detach = qdict_get_try_bool(qdict, "detach", 0); -int blk = qdict_get_try_bool(qdict, "blk", 0); -int inc = qdict_get_try_bool(qdict, "inc", 0); -const char *uri = qdict_get_str(qdict, "uri"); -int ret; - -if (current_migration&& -current_migration->state == MIG_STATE_ACTIVE) { -monitor_printf(mon, "migration already in progress\n"); -return -1; -} - -if (qemu_savevm_state_blocked(mon)) { -return -1; -} - -s = migrate_create_state(mon, max_throttle, detach, blk, inc); - -if (strstart(uri, "tcp:",&p)) { -ret = tcp_start_outgoing_migration(s, p); -#if !defined(WIN32) -} else if (strstart(uri, "exec:",&p)) { -ret = exec_start_outgoing_migration(s, p); -} else if (strstart(uri, "unix:",&p)) { -ret = unix_start_outgoing_migration(s, p); -} else if (strstart(uri, "fd:",&p)) { -ret = fd_start_outgoing_migration(s, p); -#endif -} else { -monitor_printf(mon, "unknown migration protocol: %s\n", uri); -ret = -EINVAL; -goto free_migrate_state; -} - -if (ret< 0) { -monitor_printf(mon, "migration failed\n"); -goto free_migrate_state; -} - -g_free(current_migration); -current_migration = s; -notifier_list_notify(&migration_state_notifiers, NULL); -return 0; -free_migrate_state: -g_free(s); -return -1; -} - -static void migrate_fd_cancel(MigrationState *s); - -int do_migrate_cancel(Monitor *mon, const QDict *qdict, QObject **ret_data) -{ -if (current_migration) { -migrate_fd_cancel(current_migration); -} -return 0; -} - -int do_migrate_set_speed(Monitor *mon, const QDict *qdict, QObject **ret_data) -{ -int64_t d; -MigrationState *s; - -d = qdict_get_int(qdict, "value"); -if (d< 0) { -d = 0; -} -max_throttle = d; - -s = current_migration; -if (s&& s->file) { -qemu_file_set_rate_limit(s->file, max_throttle); -} - -return 0; -} - /* amount of nanoseconds we are willing to wait for migration to be down. * the choice of nanoseconds is because it is the maximum resolution that * get_clock() can achieve. It is an internal measure. All user-visible @@ -173,18 +88,6 @@ uint64_t migrate_max_downtime(void) return max_downtime; } -int do_migrate_set_downtime(Monitor *mon, const QDict *qdict, -QObject **ret_data) -{ -double d; - -d = qdict_get_double(qdict, "value") * 1e9; -d = MAX(0, MIN(UINT64_MAX, d)); -max_downtime = (uint64_t)d; - -return 0; -} - static void migrate_print_status(Monitor *mon, const char *name, const QDict *status_dict) { @@ -502,3 +405,94 @@ static MigrationState *migrate_create_state(Monitor *mon, return s; } + +int do_migrate(Monitor *mon, const QDict *qdict, QObject **ret_data) +{ +MigrationState *s = NULL; +const char *p; +int detach = qdict_get_try_bool(qdict, "detach", 0); +int blk = qdict_get_try_bool(qdict, "blk", 0); +int inc = qdict_get_try_bool(qdict, "inc", 0); +const char *uri = qdict_get_str(qdict, "uri"); +int ret; + +if (current_migration&& +current_migration->state == MIG_STATE_ACTIVE) { +monitor_printf(mon, "migration already in progress\n"); +return -1; +} + +if (qemu_savevm_state_blocked(mon)) { +return -1; +} + +s = migrate_create_state(mon, max_throttle, detach, blk, inc); + +if (strstart(uri, "tcp:",&p)) { +ret = tcp_start_outgoing_migration(s, p); +#if !defined(WIN32) +} else if (strstart(uri, "exec:",&p)) { +ret = exec_start_outgoing_migration(s, p); +} else if (strstart(uri, "unix:",&p)) { +ret = unix_start_outgoing_migration(s, p); +} else if (strstart(uri, "fd:",&p)) { +ret = fd_start_outgoing_migration(s, p); +#endif +} else { +monitor_printf(mon, "unknown migration protocol: %s\n", uri); +ret = -EINVAL; +goto free_migrate_state; +} + +if (ret< 0) { +monitor_printf(mon, "migration failed\n"); +
[Qemu-devel] [PATCH] i386: wire up MSR_IA32_MISC_ENABLE
It's needed for its default value - bit 0 specifies that "rep movs" is good enough for memcpy, and Linux may use a slower memcpu if it is not set, depending on cpu family/model. Signed-off-by: Avi Kivity --- target-i386/cpu.h |5 + target-i386/helper.c|1 + target-i386/kvm.c | 15 +++ target-i386/machine.c | 21 + target-i386/op_helper.c |6 ++ 5 files changed, 48 insertions(+), 0 deletions(-) diff --git a/target-i386/cpu.h b/target-i386/cpu.h index ae36489..5416809 100644 --- a/target-i386/cpu.h +++ b/target-i386/cpu.h @@ -299,6 +299,10 @@ #define MSR_IA32_PERF_STATUS0x198 +#define MSR_IA32_MISC_ENABLE 0x1a0 +/* Indicates good rep/movs microcode on some processors: */ +#define MSR_IA32_MISC_ENABLE_DEFAULT1 + #define MSR_MTRRphysBase(reg) (0x200 + 2 * (reg)) #define MSR_MTRRphysMask(reg) (0x200 + 2 * (reg) + 1) @@ -689,6 +693,7 @@ typedef struct CPUX86State { uint64_t tsc; uint64_t mcg_status; +uint64_t msr_ia32_misc_enable; /* exception/interrupt handling */ int error_code; diff --git a/target-i386/helper.c b/target-i386/helper.c index 5df40d4..6c6a167 100644 --- a/target-i386/helper.c +++ b/target-i386/helper.c @@ -98,6 +98,7 @@ void cpu_reset(CPUX86State *env) env->mxcsr = 0x1f80; env->pat = 0x0007040600070406ULL; +env->msr_ia32_misc_enable = MSR_IA32_MISC_ENABLE_DEFAULT; memset(env->dr, 0, sizeof(env->dr)); env->dr[6] = DR6_FIXED_1; diff --git a/target-i386/kvm.c b/target-i386/kvm.c index b6eef04..b3a650b 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -60,6 +60,7 @@ static bool has_msr_star; static bool has_msr_hsave_pa; static bool has_msr_async_pf_en; +static bool has_msr_misc_enable; static int lm_capable_kernel; static struct kvm_cpuid2 *try_get_cpuid(KVMState *s, int max) @@ -568,6 +569,10 @@ static int kvm_get_supported_msrs(KVMState *s) has_msr_hsave_pa = true; continue; } +if (kvm_msr_list->indices[i] == MSR_IA32_MISC_ENABLE) { +has_msr_misc_enable = true; +continue; +} } } @@ -881,6 +886,10 @@ static int kvm_put_msrs(CPUState *env, int level) if (has_msr_hsave_pa) { kvm_msr_entry_set(&msrs[n++], MSR_VM_HSAVE_PA, env->vm_hsave); } +if (has_msr_misc_enable) { +kvm_msr_entry_set(&msrs[n++], MSR_IA32_MISC_ENABLE, + env->msr_ia32_misc_enable); +} #ifdef TARGET_X86_64 if (lm_capable_kernel) { kvm_msr_entry_set(&msrs[n++], MSR_CSTAR, env->cstar); @@ -1127,6 +1136,9 @@ static int kvm_get_msrs(CPUState *env) if (has_msr_hsave_pa) { msrs[n++].index = MSR_VM_HSAVE_PA; } +if (has_msr_misc_enable) { +msrs[n++].index = MSR_IA32_MISC_ENABLE; +} if (!env->tsc_valid) { msrs[n++].index = MSR_IA32_TSC; @@ -1210,6 +1222,9 @@ static int kvm_get_msrs(CPUState *env) case MSR_MCG_CTL: env->mcg_ctl = msrs[i].data; break; +case MSR_IA32_MISC_ENABLE: +env->msr_ia32_misc_enable = msrs[i].data; +break; default: if (msrs[i].index >= MSR_MC0_CTL && msrs[i].index < MSR_MC0_CTL + (env->mcg_cap & 0xff) * 4) { diff --git a/target-i386/machine.c b/target-i386/machine.c index 9aca8e0..73c1ffe 100644 --- a/target-i386/machine.c +++ b/target-i386/machine.c @@ -310,6 +310,24 @@ static bool fpop_ip_dp_needed(void *opaque) } }; +static bool misc_enable_needed(void *opaque) +{ +CPUState *env = opaque; + +return env->msr_ia32_misc_enable != MSR_IA32_MISC_ENABLE_DEFAULT; +} + +static const VMStateDescription vmstate_msr_ia32_misc_enable = { +.name = "cpu/msr_ia32_misc_enable", +.version_id = 1, +.minimum_version_id = 1, +.minimum_version_id_old = 1, +.fields = (VMStateField []) { +VMSTATE_UINT64(msr_ia32_misc_enable, CPUState), +VMSTATE_END_OF_LIST() +} +}; + static const VMStateDescription vmstate_cpu = { .name = "cpu", .version_id = CPU_SAVE_VERSION, @@ -420,6 +438,9 @@ static bool fpop_ip_dp_needed(void *opaque) } , { .vmsd = &vmstate_fpop_ip_dp, .needed = fpop_ip_dp_needed, +}, { +.vmsd = &vmstate_msr_ia32_misc_enable, +.needed = misc_enable_needed, } , { /* empty */ } diff --git a/target-i386/op_helper.c b/target-i386/op_helper.c index 3bb5a91..c89e4a4 100644 --- a/target-i386/op_helper.c +++ b/target-i386/op_helper.c @@ -3280,6 +3280,9 @@ void helper_wrmsr(void) case MSR_TSC_AUX: env->tsc_aux = val; break; +case MSR_IA32_MISC_ENABLE: +env->msr_ia32_misc_enable = val; +break; default: if ((uint32_t)ECX >= MSR_MC0_CTL
Re: [Qemu-devel] [PATCH 15/23] migration: use global variable directly
On 09/23/2011 07:57 AM, Juan Quintela wrote: We are setting a pointer to a local variable in the previous line, just use the global variable directly. We remove the ->file test because it is already done inside qemu_file_set_rate_limit() function. Signed-off-by: Juan Quintela --- migration.c |6 ++ 1 files changed, 2 insertions(+), 4 deletions(-) diff --git a/migration.c b/migration.c index 9bb089a..d5e0eb0 100644 --- a/migration.c +++ b/migration.c @@ -469,7 +469,6 @@ int do_migrate_cancel(Monitor *mon, const QDict *qdict, QObject **ret_data) int do_migrate_set_speed(Monitor *mon, const QDict *qdict, QObject **ret_data) { int64_t d; -MigrationState *s; d = qdict_get_int(qdict, "value"); if (d< 0) { @@ -477,9 +476,8 @@ int do_migrate_set_speed(Monitor *mon, const QDict *qdict, QObject **ret_data) } max_throttle = d; -s = current_migration; -if (s&& s->file) { -qemu_file_set_rate_limit(s->file, max_throttle); +if (current_migration) { +qemu_file_set_rate_limit(current_migration->file, max_throttle); } How about we compromise and add a: MigrationState *migrate_get_current(void); I'm strongly opposed to propagating direct usage of a global. If it's at least a function call, that's a bit nicer. Regards, Anthony Liguori return 0;
Re: [Qemu-devel] [PATCH 17/23] migration: make sure we always have a migration state
On 09/23/2011 07:57 AM, Juan Quintela wrote: This cleans up a lot the code as we don't have to check anymore if the variable is NULL or not. We don't make it static, because when we integrate fault tolerance, we can have several migrations at once. Signed-off-by: Juan Quintela --- migration.c | 126 +- 1 files changed, 54 insertions(+), 72 deletions(-) diff --git a/migration.c b/migration.c index 1cc21f0..c382383 100644 --- a/migration.c +++ b/migration.c @@ -34,7 +34,13 @@ /* Migration speed throttling */ static int64_t max_throttle = (32<< 20); -static MigrationState *current_migration; +/* When we add fault tolerance, we could have several + migrations at once. For now we don't need to add + dynamic creation of migration */ +static MigrationState current_migration_static = { +.state = MIG_STATE_NONE, +}; +static MigrationState *current_migration =¤t_migration_static; This doesn't make sense to me. We can have two migration states that are both in the MIG_STATE_NONE? What does that actually mean? I don't see the point in making migration state nullable via the state member other than avoiding the if (s == NULL) check. Regards, Anthony Liguori static NotifierList migration_state_notifiers = NOTIFIER_LIST_INITIALIZER(migration_state_notifiers); @@ -136,37 +142,34 @@ void do_info_migrate(Monitor *mon, QObject **ret_data) { QDict *qdict; -if (current_migration) { - -switch (current_migration->state) { -case MIG_STATE_NONE: -/* no migration has happened ever */ -break; -case MIG_STATE_ACTIVE: -qdict = qdict_new(); -qdict_put(qdict, "status", qstring_from_str("active")); - -migrate_put_status(qdict, "ram", ram_bytes_transferred(), - ram_bytes_remaining(), ram_bytes_total()); - -if (blk_mig_active()) { -migrate_put_status(qdict, "disk", blk_mig_bytes_transferred(), - blk_mig_bytes_remaining(), - blk_mig_bytes_total()); -} - -*ret_data = QOBJECT(qdict); -break; -case MIG_STATE_COMPLETED: -*ret_data = qobject_from_jsonf("{ 'status': 'completed' }"); -break; -case MIG_STATE_ERROR: -*ret_data = qobject_from_jsonf("{ 'status': 'failed' }"); -break; -case MIG_STATE_CANCELLED: -*ret_data = qobject_from_jsonf("{ 'status': 'cancelled' }"); -break; +switch (current_migration->state) { +case MIG_STATE_NONE: +/* no migration has happened ever */ +break; +case MIG_STATE_ACTIVE: +qdict = qdict_new(); +qdict_put(qdict, "status", qstring_from_str("active")); + +migrate_put_status(qdict, "ram", ram_bytes_transferred(), + ram_bytes_remaining(), ram_bytes_total()); + +if (blk_mig_active()) { +migrate_put_status(qdict, "disk", blk_mig_bytes_transferred(), + blk_mig_bytes_remaining(), + blk_mig_bytes_total()); } + +*ret_data = QOBJECT(qdict); +break; +case MIG_STATE_COMPLETED: +*ret_data = qobject_from_jsonf("{ 'status': 'completed' }"); +break; +case MIG_STATE_ERROR: +*ret_data = qobject_from_jsonf("{ 'status': 'failed' }"); +break; +case MIG_STATE_CANCELLED: +*ret_data = qobject_from_jsonf("{ 'status': 'cancelled' }"); +break; } } @@ -357,11 +360,7 @@ void remove_migration_state_change_notifier(Notifier *notify) int get_migration_state(void) { -if (current_migration) { -return current_migration->state; -} else { -return MIG_STATE_ERROR; -} +return current_migration->state; } void migrate_fd_connect(MigrationState *s) @@ -386,28 +385,23 @@ void migrate_fd_connect(MigrationState *s) migrate_fd_put_ready(s); } -static MigrationState *migrate_create_state(Monitor *mon, -int64_t bandwidth_limit, -int detach, int blk, int inc) +static void migrate_init_state(Monitor *mon, int64_t bandwidth_limit, + int detach, int blk, int inc) { -MigrationState *s = g_malloc0(sizeof(*s)); - -s->blk = blk; -s->shared = inc; -s->mon = NULL; -s->bandwidth_limit = bandwidth_limit; -s->state = MIG_STATE_NONE; +memset(current_migration, 0, sizeof(current_migration)); +current_migration->blk = blk; +current_migration->shared = inc; +current_migration->mon = NULL; +current_migration->bandwidth_limit = bandwidth_limit; +current_migration->state = MIG_STATE_NONE; if (!detach) { -migrate_fd_monitor_suspend(s, mon); +migrate_fd_
Re: [Qemu-devel] [PATCH 18/23] migration: Use bandwidth_limit directly
On 09/23/2011 07:57 AM, Juan Quintela wrote: Now that current_migration always exist, there is no reason for max_throotle variable. Signed-off-by: Juan Quintela If we can have multiple MigrationStates, this doesn't really make sense. Regards, Anthony Liguori --- migration.c | 19 ++- 1 files changed, 10 insertions(+), 9 deletions(-) diff --git a/migration.c b/migration.c index c382383..ea50a6f 100644 --- a/migration.c +++ b/migration.c @@ -31,14 +31,14 @@ do { } while (0) #endif -/* Migration speed throttling */ -static int64_t max_throttle = (32<< 20); +#define MAX_THROTTLE (32<< 20) /* Migration speed throttling */ /* When we add fault tolerance, we could have several migrations at once. For now we don't need to add dynamic creation of migration */ static MigrationState current_migration_static = { .state = MIG_STATE_NONE, +.bandwidth_limit = MAX_THROTTLE, }; static MigrationState *current_migration =¤t_migration_static; @@ -385,14 +385,14 @@ void migrate_fd_connect(MigrationState *s) migrate_fd_put_ready(s); } -static void migrate_init_state(Monitor *mon, int64_t bandwidth_limit, - int detach, int blk, int inc) +static void migrate_init_state(Monitor *mon, int detach, int blk, int inc) { +int64_t bandwidth_limit = current_migration->bandwidth_limit; + memset(current_migration, 0, sizeof(current_migration)); +current_migration->bandwidth_limit = bandwidth_limit; current_migration->blk = blk; current_migration->shared = inc; -current_migration->mon = NULL; -current_migration->bandwidth_limit = bandwidth_limit; current_migration->state = MIG_STATE_NONE; if (!detach) { @@ -418,7 +418,7 @@ int do_migrate(Monitor *mon, const QDict *qdict, QObject **ret_data) return -1; } -migrate_init_state(mon, max_throttle, detach, blk, inc); +migrate_init_state(mon, detach, blk, inc); if (strstart(uri, "tcp:",&p)) { ret = tcp_start_outgoing_migration(current_migration, p); @@ -458,9 +458,10 @@ int do_migrate_set_speed(Monitor *mon, const QDict *qdict, QObject **ret_data) if (d< 0) { d = 0; } -max_throttle = d; +current_migration->bandwidth_limit = d; -qemu_file_set_rate_limit(current_migration->file, max_throttle); +qemu_file_set_rate_limit(current_migration->file, + current_migration->bandwidth_limit); return 0; }
Re: [Qemu-devel] [PATCH] runstate: do not discard runstate changes when paused
On Tue, 04 Oct 2011 16:09:03 +0200 Paolo Bonzini wrote: > On 10/04/2011 03:49 PM, Luiz Capitulino wrote: > > There's a semantic change which I'm not completely sure it won't generate > > unexpected side-effects: today vm_stop() will only carry any action if the > > machine is running, otherwise it's no-op. This patch changes that. > > More or less, yes. I tried to limit the semantic change by not running > notifiers, which again could be better or worse. > > I don't think adding a new transition is a good solution, because you'll > have to add a transition from PAUSED to anything that uses runstate_set > instead of vm_stop. > > However, you could change all vm_stop() to vm_stop(RSTATE_PAUSED) > followed by runstate_set(), adjust the transition table consequently and > possibly drop the argument to vm_stop. I tried to get the smallest > patch, but I did need to follow-up with changes to the transition table. > > In any case, can I assume this to be in your hands now? :) Yes. I'm not sure what's the best solution here, but I'll decide soon :)
Re: [Qemu-devel] [PATCH 19/23] migration: Export a function that tells if the migration has finished correctly
On 09/23/2011 07:57 AM, Juan Quintela wrote: This will allows us to hide the status values. Signed-off-by: Juan Quintela --- migration.c |4 ++-- migration.h |2 +- ui/spice-core.c |4 +--- 3 files changed, 4 insertions(+), 6 deletions(-) diff --git a/migration.c b/migration.c index ea50a6f..580f546 100644 --- a/migration.c +++ b/migration.c @@ -358,9 +358,9 @@ void remove_migration_state_change_notifier(Notifier *notify) notifier_list_remove(&migration_state_notifiers, notify); } -int get_migration_state(void) +bool migration_has_finished(void) { -return current_migration->state; +return current_migration->state == MIG_STATE_COMPLETED; } void migrate_fd_connect(MigrationState *s) diff --git a/migration.h b/migration.h index f1a7452..6641a26 100644 --- a/migration.h +++ b/migration.h @@ -84,7 +84,7 @@ void migrate_fd_connect(MigrationState *s); void add_migration_state_change_notifier(Notifier *notify); void remove_migration_state_change_notifier(Notifier *notify); -int get_migration_state(void); +bool migration_has_finished(void); uint64_t ram_bytes_remaining(void); uint64_t ram_bytes_transferred(void); diff --git a/ui/spice-core.c b/ui/spice-core.c index 3cbc721..1202993 100644 --- a/ui/spice-core.c +++ b/ui/spice-core.c @@ -447,9 +447,7 @@ void do_info_spice(Monitor *mon, QObject **ret_data) static void migration_state_notifier(Notifier *notifier, void *data) { -int state = get_migration_state(); - -if (state == MIG_STATE_COMPLETED) { +if (migration_has_finished()) { #if SPICE_SERVER_VERSION>= 0x000701 /* 0.7.1 */ spice_server_migrate_switch(spice_server); #endif I think the bug here is migration_state_notifier. It should take an additional argument of MigrationState. Otherwise, how does this code work with FT? Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH 21/23] migration: Don't use callback on file defining it
On 09/23/2011 07:57 AM, Juan Quintela wrote: Signed-off-by: Juan Quintela I don't think this is better. It just eliminates the possibility of having useful trace points in get_error. Regards, Anthony Liguori --- migration-tcp.c |4 ++-- migration-unix.c |6 +++--- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/migration-tcp.c b/migration-tcp.c index f6b2288..bd3aa3a 100644 --- a/migration-tcp.c +++ b/migration-tcp.c @@ -58,7 +58,7 @@ static void tcp_wait_for_connect(void *opaque) DPRINTF("connect completed\n"); do { ret = getsockopt(s->fd, SOL_SOCKET, SO_ERROR, (void *)&val,&valsize); -} while (ret == -1&& (s->get_error(s)) == EINTR); +} while (ret == -1&& (socket_error()) == EINTR); if (ret< 0) { migrate_fd_error(s); @@ -98,7 +98,7 @@ int tcp_start_outgoing_migration(MigrationState *s, const char *host_port) do { ret = connect(s->fd, (struct sockaddr *)&addr, sizeof(addr)); if (ret == -1) -ret = -(s->get_error(s)); +ret = -(socket_error()); if (ret == -EINPROGRESS || ret == -EWOULDBLOCK) qemu_set_fd_handler2(s->fd, NULL, NULL, tcp_wait_for_connect, s); diff --git a/migration-unix.c b/migration-unix.c index d6f315a..7205b25 100644 --- a/migration-unix.c +++ b/migration-unix.c @@ -57,7 +57,7 @@ static void unix_wait_for_connect(void *opaque) DPRINTF("connect completed\n"); do { ret = getsockopt(s->fd, SOL_SOCKET, SO_ERROR, (void *)&val,&valsize); -} while (ret == -1&& (s->get_error(s)) == EINTR); +} while (ret == -1&& errno == EINTR); if (ret< 0) { migrate_fd_error(s); @@ -97,7 +97,7 @@ int unix_start_outgoing_migration(MigrationState *s, const char *path) do { ret = connect(s->fd, (struct sockaddr *)&addr, sizeof(addr)); if (ret == -1) - ret = -(s->get_error(s)); +ret = -errno; if (ret == -EINPROGRESS || ret == -EWOULDBLOCK) qemu_set_fd_handler2(s->fd, NULL, NULL, unix_wait_for_connect, s); @@ -130,7 +130,7 @@ static void unix_accept_incoming_migration(void *opaque) do { c = qemu_accept(s, (struct sockaddr *)&addr,&addrlen); -} while (c == -1&& socket_error() == EINTR); +} while (c == -1&& errno == EINTR); DPRINTF("accepted migration\n");
Re: [Qemu-devel] [PATCH 22/23] migration: propagate error correctly
On 09/23/2011 07:57 AM, Juan Quintela wrote: unix and tcp outgoing migration have error values, but didn't returned it. Make them return the error. Notice that EINPROGRESS& EWOULDBLOCK are not considered errors as callwill be retry later. Signed-off-by: Juan Quintela Reviewed-by: Anthony Liguori Regards, Anthony Liguori --- migration-tcp.c | 20 +++- migration-unix.c | 26 ++ 2 files changed, 21 insertions(+), 25 deletions(-) diff --git a/migration-tcp.c b/migration-tcp.c index bd3aa3a..619df21 100644 --- a/migration-tcp.c +++ b/migration-tcp.c @@ -90,26 +90,28 @@ int tcp_start_outgoing_migration(MigrationState *s, const char *host_port) s->fd = qemu_socket(PF_INET, SOCK_STREAM, 0); if (s->fd == -1) { -return -1; +return -socket_error(); } socket_set_nonblock(s->fd); do { ret = connect(s->fd, (struct sockaddr *)&addr, sizeof(addr)); -if (ret == -1) -ret = -(socket_error()); - -if (ret == -EINPROGRESS || ret == -EWOULDBLOCK) +if (ret == -1) { +ret = -socket_error(); +} +if (ret == -EINPROGRESS || ret == -EWOULDBLOCK) { qemu_set_fd_handler2(s->fd, NULL, NULL, tcp_wait_for_connect, s); +return 0; +} } while (ret == -EINTR); -if (ret< 0&& ret != -EINPROGRESS&& ret != -EWOULDBLOCK) { +if (ret< 0) { DPRINTF("connect failed\n"); migrate_fd_error(s); -} else if (ret>= 0) -migrate_fd_connect(s); - +return ret; +} +migrate_fd_connect(s); return 0; } diff --git a/migration-unix.c b/migration-unix.c index 7205b25..428fe66 100644 --- a/migration-unix.c +++ b/migration-unix.c @@ -89,35 +89,29 @@ int unix_start_outgoing_migration(MigrationState *s, const char *path) s->fd = qemu_socket(PF_UNIX, SOCK_STREAM, 0); if (s->fd< 0) { DPRINTF("Unable to open socket"); -goto err_after_socket; +return -errno; } socket_set_nonblock(s->fd); do { ret = connect(s->fd, (struct sockaddr *)&addr, sizeof(addr)); -if (ret == -1) +if (ret == -1) { ret = -errno; - -if (ret == -EINPROGRESS || ret == -EWOULDBLOCK) +} +if (ret == -EINPROGRESS || ret == -EWOULDBLOCK) { qemu_set_fd_handler2(s->fd, NULL, NULL, unix_wait_for_connect, s); +return 0; +} } while (ret == -EINTR); -if (ret< 0&& ret != -EINPROGRESS&& ret != -EWOULDBLOCK) { +if (ret< 0) { DPRINTF("connect failed\n"); -goto err_after_open; +migrate_fd_error(s); +return ret; } - -if (ret>= 0) -migrate_fd_connect(s); - +migrate_fd_connect(s); return 0; - -err_after_open: -close(s->fd); - -err_after_socket: -return -1; } static void unix_accept_incoming_migration(void *opaque)
Re: [Qemu-devel] [PATCH 23/23] migration: make migration-{tcp, unix} consistent
On 09/23/2011 07:57 AM, Juan Quintela wrote: Files are almost identical in functionality, just remove the differences that make no sense. Signed-off-by: Juan Quintela Reviewed-by: Anthony Liguori Regards, Anthony Liguori --- migration-tcp.c | 15 ++- migration-unix.c | 46 +- 2 files changed, 35 insertions(+), 26 deletions(-) diff --git a/migration-tcp.c b/migration-tcp.c index 619df21..5aa742c 100644 --- a/migration-tcp.c +++ b/migration-tcp.c @@ -48,7 +48,6 @@ static int tcp_close(MigrationState *s) return 0; } - static void tcp_wait_for_connect(void *opaque) { MigrationState *s = opaque; @@ -84,12 +83,14 @@ int tcp_start_outgoing_migration(MigrationState *s, const char *host_port) if (ret< 0) { return ret; } + s->get_error = socket_errno; s->write = socket_write; s->close = tcp_close; s->fd = qemu_socket(PF_INET, SOCK_STREAM, 0); if (s->fd == -1) { +DPRINTF("Unable to open socket"); return -socket_error(); } @@ -155,23 +156,27 @@ int tcp_start_incoming_migration(const char *host_port) int val; int s; +DPRINTF("Attempting to start an incoming migration\n"); + if (parse_host_port(&addr, host_port)< 0) { fprintf(stderr, "invalid host/port combination: %s\n", host_port); return -EINVAL; } s = qemu_socket(PF_INET, SOCK_STREAM, 0); -if (s == -1) +if (s == -1) { return -socket_error(); +} val = 1; setsockopt(s, SOL_SOCKET, SO_REUSEADDR, (const char *)&val, sizeof(val)); -if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) == -1) +if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) == -1) { goto err; - -if (listen(s, 1) == -1) +} +if (listen(s, 1) == -1) { goto err; +} qemu_set_fd_handler2(s, NULL, tcp_accept_incoming_migration, NULL, (void *)(intptr_t)s); diff --git a/migration-unix.c b/migration-unix.c index 428fe66..f9d34b9 100644 --- a/migration-unix.c +++ b/migration-unix.c @@ -87,7 +87,7 @@ int unix_start_outgoing_migration(MigrationState *s, const char *path) s->close = unix_close; s->fd = qemu_socket(PF_UNIX, SOCK_STREAM, 0); -if (s->fd< 0) { +if (s->fd == -1) { DPRINTF("Unable to open socket"); return -errno; } @@ -130,7 +130,7 @@ static void unix_accept_incoming_migration(void *opaque) if (c == -1) { fprintf(stderr, "could not accept migration connection\n"); -return; +goto out2; } f = qemu_fopen_socket(c); @@ -142,45 +142,49 @@ static void unix_accept_incoming_migration(void *opaque) process_incoming_migration(f); qemu_fclose(f); out: +close(c); +out2: qemu_set_fd_handler2(s, NULL, NULL, NULL, NULL); close(s); -close(c); } int unix_start_incoming_migration(const char *path) { -struct sockaddr_un un; -int sock; +struct sockaddr_un addr; +int s; +int ret; DPRINTF("Attempting to start an incoming migration\n"); -sock = qemu_socket(PF_UNIX, SOCK_STREAM, 0); -if (sock< 0) { +s = qemu_socket(PF_UNIX, SOCK_STREAM, 0); +if (s == -1) { fprintf(stderr, "Could not open unix socket: %s\n", strerror(errno)); -return -EINVAL; +return -errno; } -memset(&un, 0, sizeof(un)); -un.sun_family = AF_UNIX; -snprintf(un.sun_path, sizeof(un.sun_path), "%s", path); +memset(&addr, 0, sizeof(addr)); +addr.sun_family = AF_UNIX; +snprintf(addr.sun_path, sizeof(addr.sun_path), "%s", path); -unlink(un.sun_path); -if (bind(sock, (struct sockaddr*)&un, sizeof(un))< 0) { -fprintf(stderr, "bind(unix:%s): %s\n", un.sun_path, strerror(errno)); +unlink(addr.sun_path); +if (bind(s, (struct sockaddr *)&addr, sizeof(addr))< 0) { +ret = -errno; +fprintf(stderr, "bind(unix:%s): %s\n", addr.sun_path, strerror(errno)); goto err; } -if (listen(sock, 1)< 0) { -fprintf(stderr, "listen(unix:%s): %s\n", un.sun_path, strerror(errno)); +if (listen(s, 1) == -1) { +fprintf(stderr, "listen(unix:%s): %s\n", addr.sun_path, +strerror(errno)); +ret = -errno; goto err; } -qemu_set_fd_handler2(sock, NULL, unix_accept_incoming_migration, NULL, -(void *)(intptr_t)sock); +qemu_set_fd_handler2(s, NULL, unix_accept_incoming_migration, NULL, + (void *)(intptr_t)s); return 0; err: -close(sock); - -return -EINVAL; +close(s); +return ret; }
Re: [Qemu-devel] [PATCH v3 00/23] Refactor and cleanup migration code
I peeled off the first 4 patches to apply. Once I finish testing I'll push them. Most of the rest of the series looks good. I really don't like making MigrationState nullable with MIG_STATE_NONE. I don't think it makes sense with multiple MigrationStates. The direct use of globals is not something I'm happy about, but I'd be inclined to hold my nose and apply if it weren't for MIG_STATE_NONE. Regards, Anthony Liguori On 09/23/2011 07:56 AM, Juan Quintela wrote: Hi v4: rebase on top of new qemu and new migration-errors series v3: this patch applies on top of my previous "migration error" patches. All error handling has been moved to that series, except for "propagate error correctly", without this refactoring, it is quite complicated to apply it. Please, review. Later, Juan. v3: - more checkpatch.pl happines - split error handling in a previous series - make Anthony happy. current_migration is still a pointer, but points to a static variable. We can change current_migration when we integrate kemari. v2: - make Jan^Wcheckpatch.pl happy - Yoshiaki Tamura suggestions: - include its two patches to clean things - MAX_THROTTLE define - migration_state enum - I removed spurious differences between migration-{tcp,unix} - better error propagation, after this patch: migrate -d "tcp:name_don_exist:port" migrate -d "tcp:name:port_dont_exist" migrate -d "exec: prog_dont_exist" migrate -d "exec: gzip> /path/dont/exist" fail as expected. Last two used to enter an infinite loop. The fixes part should be backported to 0.14, waiting for the review to do that. Later, Juan. v1: This series: - Fold MigrationState into FdMigrationState (and then rename) - Factorize migration statec creation in a single place - Make use of MIG_STATE_*, setup through helpers and make them local - remove relase& cancel callbacks (where used only one in same file than defined) - get_status() is no more, just access directly to .state - current_migration use cleanup, and make variable static - max_throotle is gone, now inside current_migration - change get_migration_status() to migration_has_finished() and actualize single user. Please review. Later, Juan. Juan Quintela (23): migration: Make *start_outgoing_migration return FdMigrationState migration: Use FdMigrationState instead of MigrationState when possible migration: Fold MigrationState into FdMigrationState migration: Rename FdMigrationState MigrationState migration: Refactor MigrationState creation migration: Make all posible migration functions static migration: move migrate_create_state to do_migrate migration: Introduce MIG_STATE_NONE migration: Refactor and simplify error checking in migrate_fd_put_ready migration: Introduce migrate_fd_completed() for consistency migration: Our release callback was just free migration: Remove get_status() accessor migration: Remove migration cancel() callback migration: Move exported functions to the end of the file migration: use global variable directly migration: another case of global variable assigned to local one migration: make sure we always have a migration state migration: Use bandwidth_limit directly migration: Export a function that tells if the migration has finished correctly migration: Make state definitions local migration: Don't use callback on file defining it migration: propagate error correctly migration: make migration-{tcp,unix} consistent migration-exec.c | 39 + migration-fd.c | 42 ++- migration-tcp.c | 76 -- migration-unix.c | 112 ++- migration.c | 404 ++--- migration.h | 85 ++-- ui/spice-core.c |4 +- 7 files changed, 303 insertions(+), 459 deletions(-)
Re: [Qemu-devel] s390: Fix cpu shutdown for KVM
On 10/04/2011 03:47 PM, Christian Borntraeger wrote: Something like the following? s390: Fix cpu shutdown for KVM On s390 a shutdown is the state of all CPUs being either stopped or disabled (for interrupts) waiting. We have to track this number to call the shutdown sequence accordingly. This patch implements the counting and shutdown handling for the kvm path in qemu. Signed-off-by: Christian Borntraeger --- hw/s390-virtio.c | 21 + target-s390x/cpu.h | 12 target-s390x/kvm.c | 21 + 3 files changed, 50 insertions(+), 4 deletions(-) Index: b/hw/s390-virtio.c === --- a/hw/s390-virtio.c +++ b/hw/s390-virtio.c @@ -130,6 +130,26 @@ int s390_virtio_hypercall(CPUState *env, return r; } +/* + * The number of running CPUs. On s390 a shutdown is the state of all CPUs + * being either stopped or disabled (for interrupts) waiting. We have to + * track this number to call the shutdown sequence accordingly. This + * number is modified either on startup or while holding the big qemu lock. + */ +static unsigned s390_running_cpus; + +void kvm_s390_add_running_cpu(CPUState *env) +{ +assert(env->halted == 0); +s390_running_cpus++; +} + +unsigned kvm_s390_del_running_cpu(CPUState *env) +{ +assert(s390_running_cpus>= 1); +return --s390_running_cpus; +} Almost. It has nothing to do with KVM though. Please keep the abstraction intact! + /* PC hardware initialisation */ static void s390_init(ram_addr_t my_ram_size, const char *boot_device, @@ -189,6 +209,7 @@ static void s390_init(ram_addr_t my_ram_ env->halted = 0; env->exception_index = 0; +kvm_s390_add_running_cpu(env); if (kernel_filename) { kernel_size = load_image(kernel_filename, qemu_get_ram_ptr(0)); Index: b/target-s390x/cpu.h === --- a/target-s390x/cpu.h +++ b/target-s390x/cpu.h @@ -292,6 +292,8 @@ void kvm_s390_interrupt(CPUState *env, i void kvm_s390_virtio_irq(CPUState *env, int config_change, uint64_t token); void kvm_s390_interrupt_internal(CPUState *env, int type, uint32_t parm, uint64_t parm64, int vm); +void kvm_s390_add_running_cpu(CPUState *env); +unsigned kvm_s390_del_running_cpu(CPUState *env); #else static inline void kvm_s390_interrupt(CPUState *env, int type, uint32_t code) { @@ -307,6 +309,16 @@ static inline void kvm_s390_interrupt_in int vm) { } + +static inline void kvm_s390_add_running_cpu(CPUState *env) +{ +} + +static inline unsigned kvm_s390_del_running_cpu(CPUState *env) +{ +return 0; +} The functions are defined in s390-virtio.c and thus are available anywhere now. No need to special-case KVM anywhere. + #endif CPUState *s390_cpu_addr2state(uint16_t cpu_addr); Index: b/target-s390x/kvm.c === --- a/target-s390x/kvm.c +++ b/target-s390x/kvm.c @@ -185,6 +185,12 @@ void kvm_s390_interrupt_internal(CPUStat return; } +/* + * We can only deliver interrupts to (interrupt) enabled CPUs. + * We dont call kvm_s390_add_running_cpu here, since CPUs in enabled wait + * will wait inside the kernel (no exit). Therefore, the targeted + * CPUs was neither disabled waiting or stopped for qemu. + */ env->halted = 0; env->exception_index = -1; qemu_cpu_kick(env); @@ -301,6 +307,7 @@ static int s390_cpu_restart(CPUState *en kvm_s390_interrupt(env, KVM_S390_RESTART, 0); env->halted = 0; env->exception_index = -1; +kvm_s390_add_running_cpu(env); Hrm. Is add_running correct here? Maybe we should add a wrapper around the halted = 0 parts in s390-virtio.c and do the increase of the counter there? That way we can make sure that transitioning from 0 -> 0 doesn't increase the counter. Alex
[Qemu-devel] [PATCH 1/4] savevm: teach qemu_fill_buffer to do partial refills
We will need on next patch to be able to lookahead on next patch Signed-off-by: Juan Quintela --- savevm.c | 14 +++--- 1 files changed, 11 insertions(+), 3 deletions(-) diff --git a/savevm.c b/savevm.c index 46f2447..31131df 100644 --- a/savevm.c +++ b/savevm.c @@ -455,6 +455,7 @@ void qemu_fflush(QEMUFile *f) static void qemu_fill_buffer(QEMUFile *f) { int len; +int used; if (!f->get_buffer) return; @@ -462,10 +463,17 @@ static void qemu_fill_buffer(QEMUFile *f) if (f->is_write) abort(); -len = f->get_buffer(f->opaque, f->buf, f->buf_offset, IO_BUF_SIZE); +used = f->buf_size - f->buf_index; +if (used > 0) { +memmove(f->buf, f->buf + f->buf_index, used); +} +f->buf_index = 0; +f->buf_size = used; + +len = f->get_buffer(f->opaque, f->buf + used, f->buf_offset, +IO_BUF_SIZE - used); if (len > 0) { -f->buf_index = 0; -f->buf_size = len; +f->buf_size += len; f->buf_offset += len; } else if (len != -EAGAIN) f->has_error = 1; -- 1.7.6.4
[Qemu-devel] [PATCH 2/4] savevm: some coding style cleanups
This patch will make moving code on next patches and having checkpatch happy easier. Signed-off-by: Juan Quintela --- savevm.c | 21 ++--- 1 files changed, 14 insertions(+), 7 deletions(-) diff --git a/savevm.c b/savevm.c index 31131df..5fee4e2 100644 --- a/savevm.c +++ b/savevm.c @@ -536,8 +536,9 @@ int qemu_get_buffer(QEMUFile *f, uint8_t *buf, int size1) { int size, l; -if (f->is_write) +if (f->is_write) { abort(); +} size = size1; while (size > 0) { @@ -545,11 +546,13 @@ int qemu_get_buffer(QEMUFile *f, uint8_t *buf, int size1) if (l == 0) { qemu_fill_buffer(f); l = f->buf_size - f->buf_index; -if (l == 0) +if (l == 0) { break; +} } -if (l > size) +if (l > size) { l = size; +} memcpy(buf, f->buf + f->buf_index, l); f->buf_index += l; buf += l; @@ -560,26 +563,30 @@ int qemu_get_buffer(QEMUFile *f, uint8_t *buf, int size1) static int qemu_peek_byte(QEMUFile *f) { -if (f->is_write) +if (f->is_write) { abort(); +} if (f->buf_index >= f->buf_size) { qemu_fill_buffer(f); -if (f->buf_index >= f->buf_size) +if (f->buf_index >= f->buf_size) { return 0; +} } return f->buf[f->buf_index]; } int qemu_get_byte(QEMUFile *f) { -if (f->is_write) +if (f->is_write) { abort(); +} if (f->buf_index >= f->buf_size) { qemu_fill_buffer(f); -if (f->buf_index >= f->buf_size) +if (f->buf_index >= f->buf_size) { return 0; +} } return f->buf[f->buf_index++]; } -- 1.7.6.4
[Qemu-devel] [PATCH 4/4] Revert "savevm: fix corruption in vmstate_subsection_load()."
This reverts commit eb60260de0b050a5e8ab725e84d377d0b44c43ae. Conflicts: savevm.c We changed qemu_peek_byte() prototype, just fixed the rejects. Signed-off-by: Juan Quintela --- savevm.c | 10 +- 1 files changed, 1 insertions(+), 9 deletions(-) diff --git a/savevm.c b/savevm.c index db6ea12..aaa303e 100644 --- a/savevm.c +++ b/savevm.c @@ -1715,12 +1715,6 @@ static const VMStateDescription *vmstate_get_subsection(const VMStateSubsection static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd, void *opaque) { -const VMStateSubsection *sub = vmsd->subsections; - -if (!sub || !sub->needed) { -return 0; -} - while (qemu_peek_byte(f, 0) == QEMU_VM_SUBSECTION) { char idstr[256]; int ret; @@ -1742,7 +1736,7 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd, /* it don't have a valid subsection name */ return 0; } -sub_vmsd = vmstate_get_subsection(sub, idstr); +sub_vmsd = vmstate_get_subsection(vmsd->subsections, idstr); if (sub_vmsd == NULL) { return -ENOENT; } @@ -1752,7 +1746,6 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd, idstr[len] = 0; version_id = qemu_get_be32(f); -assert(!sub_vmsd->subsections); ret = vmstate_load_state(f, sub_vmsd, opaque, version_id); if (ret) { return ret; @@ -1776,7 +1769,6 @@ static void vmstate_subsection_save(QEMUFile *f, const VMStateDescription *vmsd, qemu_put_byte(f, len); qemu_put_buffer(f, (uint8_t *)vmsd->name, len); qemu_put_be32(f, vmsd->version_id); -assert(!vmsd->subsections); vmstate_save_state(f, vmsd, opaque); } sub++; -- 1.7.6.4
[Qemu-devel] [PATCH 0/4] migration: Improve subsections detection
Hi This series move the subsections detection code form: - Look that it starts form 5 To: - Look that it starts form 5 (SUBSECTION) - Look at the length - Look that length is bigger than section name - Look at the idstr and see that it starts with the subsection name. Please review. Later, Juan. Juan Quintela (4): savevm: teach qemu_fill_buffer to do partial refills savevm: some coding style cleanups savevm: improve subsections detection on load Revert "savevm: fix corruption in vmstate_subsection_load()." savevm.c | 112 ++--- 1 files changed, 84 insertions(+), 28 deletions(-) -- 1.7.6.4
Re: [Qemu-devel] [PATCH 05/23] migration: Refactor MigrationState creation
Anthony Liguori wrote: > On 09/23/2011 07:56 AM, Juan Quintela wrote: >> Signed-off-by: Juan Quintela > > > I think this would fit better if it were migrate_new() or migrate_open(). > > It definitely should be in the form noun_verb and not noun_verb_noun(). > > Regards, You are the native speaker, so I change this one O:-) Later, Juan.
Re: [Qemu-devel] [PATCH 15/23] migration: use global variable directly
Anthony Liguori wrote: > How about we compromise and add a: > > MigrationState *migrate_get_current(void); I can agree with this one. > I'm strongly opposed to propagating direct usage of a global. If it's > at least a function call, that's a bit nicer. What I am against is with trying to "hide" the fact that we are using a global. function accessor looks ok enough.
Re: [Qemu-devel] [PATCH 17/23] migration: make sure we always have a migration state
Anthony Liguori wrote: > On 09/23/2011 07:57 AM, Juan Quintela wrote: >> This cleans up a lot the code as we don't have to check anymore if >> the variable is NULL or not. >> >> We don't make it static, because when we integrate fault tolerance, we >> can have several migrations at once. >> >> Signed-off-by: Juan Quintela >> --- >> migration.c | 126 >> +- >> 1 files changed, 54 insertions(+), 72 deletions(-) >> >> diff --git a/migration.c b/migration.c >> index 1cc21f0..c382383 100644 >> --- a/migration.c >> +++ b/migration.c >> @@ -34,7 +34,13 @@ >> /* Migration speed throttling */ >> static int64_t max_throttle = (32<< 20); >> >> -static MigrationState *current_migration; >> +/* When we add fault tolerance, we could have several >> + migrations at once. For now we don't need to add >> + dynamic creation of migration */ >> +static MigrationState current_migration_static = { >> +.state = MIG_STATE_NONE, >> +}; >> +static MigrationState *current_migration =¤t_migration_static; > > This doesn't make sense to me. We can have two migration states that > are both in the MIG_STATE_NONE? What does that actually mean? > > I don't see the point in making migration state nullable via the state > member other than avoiding the if (s == NULL) check. It avoids s==NULL checks, and it also avoids having to have new variables (max_throotle) just because we don't have a migration state handy. Obviously the problem can't be the size (the variable is smaller that all the code tests for NULL). For me, it is easier to understand that we have an state (migration never attempted) with the same states that the other migrations states. Later, Juan.
[Qemu-devel] [PATCH RFC V1 02/11] qemu-timer: Introduce qemu_run_one_timer
Used by the Xen PCI Passthrough code to run the timer about the power state transition. Signed-off-by: Anthony PERARD --- qemu-timer.c | 15 +++ qemu-timer.h |3 +++ 2 files changed, 18 insertions(+), 0 deletions(-) diff --git a/qemu-timer.c b/qemu-timer.c index 46dd483..15e659b 100644 --- a/qemu-timer.c +++ b/qemu-timer.c @@ -1163,3 +1163,18 @@ int qemu_calculate_timeout(void) return 1000; } +/* run the specified timer */ +void qemu_run_one_timer(QEMUTimer *ts) +{ +uint64_t current_time; + +/* remove timer from the list before calling the callback */ +qemu_del_timer(ts); + +while ((current_time = qemu_get_clock_ms(rt_clock)) < ts->expire_time) +/* sleep until the expire time */ +usleep((ts->expire_time - current_time) * 1000); + +/* run the callback */ +ts->cb(ts->opaque); +} diff --git a/qemu-timer.h b/qemu-timer.h index 0a43469..b7b907b 100644 --- a/qemu-timer.h +++ b/qemu-timer.h @@ -338,4 +338,7 @@ extern int64_t tlb_flush_time; extern int64_t dev_time; #endif +/* run the specified timer */ +void qemu_run_one_timer(QEMUTimer *ts); + #endif -- Anthony PERARD
[Qemu-devel] [PATCH RFC V1 01/11] Introduce HostPCIDevice to access a pci device on the host.
Signed-off-by: Anthony PERARD --- hw/host-pci-device.c | 192 ++ hw/host-pci-device.h | 36 + 2 files changed, 228 insertions(+), 0 deletions(-) create mode 100644 hw/host-pci-device.c create mode 100644 hw/host-pci-device.h diff --git a/hw/host-pci-device.c b/hw/host-pci-device.c new file mode 100644 index 000..b3f2899 --- /dev/null +++ b/hw/host-pci-device.c @@ -0,0 +1,192 @@ +#include "qemu-common.h" +#include "host-pci-device.h" + +static int path_to(const HostPCIDevice *d, + const char *name, char *buf, ssize_t size) +{ +return snprintf(buf, size, "/sys/bus/pci/devices/%04x:%02x:%02x.%x/%s", +d->domain, d->bus, d->dev, d->func, name); +} + +static int get_resource(HostPCIDevice *d) +{ +int i; +FILE *f; +char path[PATH_MAX]; +unsigned long long start, end, flags, size; + +path_to(d, "resource", path, sizeof (path)); +f = fopen(path, "r"); +if (!f) { +fprintf(stderr, "Error: Can't open %s: %s\n", path, strerror(errno)); +return -1; +} + +for (i = 0; i < PCI_NUM_REGIONS; i++) { +if (fscanf(f, "%llx %llx %llx", &start, &end, &flags) != 3) { +fprintf(stderr, "Error: Syntax error in %s\n", path); +break; +} +if (start) { +size = end - start + 1; +} else { +size = 0; +} + +flags &= 0xf; + +if (i < PCI_ROM_SLOT) { +d->base_addr[i] = start | flags; +d->size[i] = size; +} else { +d->rom_base_addr = start | flags; +d->rom_size = size; +} +} + +fclose(f); +return 0; +} + +static unsigned long get_value(HostPCIDevice *d, const char *name) +{ +char path[PATH_MAX]; +FILE *f; +unsigned long value; + +path_to(d, name, path, sizeof (path)); +f = fopen(path, "r"); +if (!f) { +fprintf(stderr, "Error: Can't open %s: %s\n", path, strerror(errno)); +return -1; +} +if (fscanf(f, "%lx\n", &value) != 1) { +fprintf(stderr, "Error: Syntax error in %s\n", path); +value = -1; +} +fclose(f); +return value; +} + +static int pci_dev_is_virtfn(HostPCIDevice *d) +{ +int rc; +char path[PATH_MAX]; +struct stat buf; + +path_to(d, "physfn", path, sizeof (path)); +rc = !stat(path, &buf); + +return rc; +} + +static int host_pci_config_fd(HostPCIDevice *d) +{ +char path[PATH_MAX]; + +if (d->config_fd < 0) { +path_to(d, "config", path, sizeof (path)); +d->config_fd = open(path, O_RDWR); +if (d->config_fd < 0) { +fprintf(stderr, "HostPCIDevice: Can not open '%s': %s\n", +path, strerror(errno)); +} +} +return d->config_fd; +} +static int host_pci_config_read(HostPCIDevice *d, int pos, void *buf, int len) +{ +int fd = host_pci_config_fd(d); +int res = 0; + +res = pread(fd, buf, len, pos); +if (res < 0) { +fprintf(stderr, "host_pci_config: read failed: %s (fd: %i)\n", +strerror(errno), fd); +return -1; +} +return res; +} +static int host_pci_config_write(HostPCIDevice *d, + int pos, const void *buf, int len) +{ +int fd = host_pci_config_fd(d); +int res = 0; + +res = pwrite(fd, buf, len, pos); +if (res < 0) { +fprintf(stderr, "host_pci_config: write failed: %s\n", +strerror(errno)); +return -1; +} +return res; +} + +uint8_t host_pci_read_byte(HostPCIDevice *d, int pos) +{ + uint8_t buf; + host_pci_config_read(d, pos, &buf, 1); + return buf; +} +uint16_t host_pci_read_word(HostPCIDevice *d, int pos) +{ + uint16_t buf; + host_pci_config_read(d, pos, &buf, 2); + return le16_to_cpu(buf); +} +uint32_t host_pci_read_long(HostPCIDevice *d, int pos) +{ + uint32_t buf; + host_pci_config_read(d, pos, &buf, 4); + return le32_to_cpu(buf); +} +int host_pci_read_block(HostPCIDevice *d, int pos, uint8_t *buf, int len) +{ + return host_pci_config_read(d, pos, buf, len); +} + +int host_pci_write_byte(HostPCIDevice *d, int pos, uint8_t data) +{ + return host_pci_config_write(d, pos, &data, 1); +} +int host_pci_write_word(HostPCIDevice *d, int pos, uint16_t data) +{ + return host_pci_config_write(d, pos, &data, 2); +} +int host_pci_write_long(HostPCIDevice *d, int pos, uint32_t data) +{ + return host_pci_config_write(d, pos, &data, 4); +} +int host_pci_write_block(HostPCIDevice *d, int pos, uint8_t *buf, int len) +{ + return host_pci_config_write(d, pos, buf, len); +} + +HostPCIDevice *host_pci_device_get(uint8_t bus, uint8_t dev, uint8_t func) +{ +HostPCIDevice *d = NULL; + +d = g_new0(HostPCIDevice, 1); + +d->config_fd = -1; +d->domain = 0; +d->bus = bus; +d->dev = dev; +d->func = func; + +if (host_pci_config_fd(d) == -1) +goto error; +if (get_resource(d)
[Qemu-devel] [PATCH RFC V1 11/11] config/make: Introduce --enable-xen-pci-passthrough, built it.
Signed-off-by: Anthony PERARD --- Makefile.target |7 +++ configure | 21 + 2 files changed, 28 insertions(+), 0 deletions(-) diff --git a/Makefile.target b/Makefile.target index f708453..b5fbc18 100644 --- a/Makefile.target +++ b/Makefile.target @@ -208,6 +208,13 @@ obj-$(CONFIG_NO_XEN) += xen-stub.o obj-i386-$(CONFIG_XEN) += xen_platform.o +# Xen PCI Passthrough +obj-i386-$(CONFIG_XEN_PCI_PASSTHROUGH) += host-pci-device.o +obj-i386-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pci_passthrough_helpers.o +obj-i386-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pci_passthrough.o +obj-i386-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pci_passthrough_config_init.o +obj-i386-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pci_passthrough_msi.o + # Inter-VM PCI shared memory CONFIG_IVSHMEM = ifeq ($(CONFIG_KVM), y) diff --git a/configure b/configure index 0875f95..b90cfe1 100755 --- a/configure +++ b/configure @@ -127,6 +127,7 @@ vnc_png="" vnc_thread="no" xen="" xen_ctrl_version="" +xen_pci_passthrough="" linux_aio="" attr="" xfs="" @@ -635,6 +636,10 @@ for opt do ;; --enable-xen) xen="yes" ;; + --disable-xen-pci-passthrough) xen_pci_passthrough="no" + ;; + --enable-xen-pci-passthrough) xen_pci_passthrough="yes" + ;; --disable-brlapi) brlapi="no" ;; --enable-brlapi) brlapi="yes" @@ -972,6 +977,8 @@ echo " (affects only QEMU, not qemu-img)" echo " --enable-mixemu enable mixer emulation" echo " --disable-xendisable xen backend driver support" echo " --enable-xen enable xen backend driver support" +echo " --disable-xen-pci-passthrough" +echo " --enable-xen-pci-passthrough" echo " --disable-brlapi disable BrlAPI" echo " --enable-brlapi enable BrlAPI" echo " --disable-vnc-tlsdisable TLS encryption for VNC server" @@ -1335,6 +1342,17 @@ EOF fi fi +if test "$xen_pci_passthrough" != "no"; then + if test "$xen" = "yes"; then +xen_pci_passthrough=yes + else +if test "$xen_pci_passthrough" = "yes"; then + feature_not_found "Xen PCI Passthrough without Xen" +fi +xen_pci_passthrough=no + fi +fi + ## # pkg-config probe @@ -3378,6 +3396,9 @@ case "$target_arch2" in if test "$xen" = "yes" -a "$target_softmmu" = "yes" ; then target_phys_bits=64 echo "CONFIG_XEN=y" >> $config_target_mak + if test "$xen_pci_passthrough" = yes; then +echo "CONFIG_XEN_PCI_PASSTHROUGH=y" >> "$config_target_mak" + fi else echo "CONFIG_NO_XEN=y" >> $config_target_mak fi -- Anthony PERARD
[Qemu-devel] [PATCH RFC V1 10/11] Introduce Xen PCI Passthrough, MSI (3/3)
Signed-off-by: Anthony PERARD --- hw/xen_pci_passthrough_msi.c | 674 ++ 1 files changed, 674 insertions(+), 0 deletions(-) create mode 100644 hw/xen_pci_passthrough_msi.c diff --git a/hw/xen_pci_passthrough_msi.c b/hw/xen_pci_passthrough_msi.c new file mode 100644 index 000..be18ff1 --- /dev/null +++ b/hw/xen_pci_passthrough_msi.c @@ -0,0 +1,674 @@ +/* + * Copyright (c) 2007, Intel Corporation. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple + * Place - Suite 330, Boston, MA 02111-1307 USA. + * + * Jiang Yunhong + * + * This file implements direct PCI assignment to a HVM guest + */ + +#include + +#include "xen_backend.h" +#include "xen_pci_passthrough.h" + +void msi_set_enable(XenPCIPassthroughState *dev, int en) +{ +uint16_t val = 0; +uint32_t address = 0; +PT_LOG("enable: %i\n", en); + +if (!dev->msi) { +return; +} + +address = dev->msi->ctrl_offset; +if (!address) { +return; +} + +val = host_pci_read_word(dev->real_device, address); +val &= ~PCI_MSI_FLAGS_ENABLE; +val |= en & PCI_MSI_FLAGS_ENABLE; +host_pci_write_word(dev->real_device, address, val); + +PT_LOG("done, address: %#x, val: %#x\n", address, val); +} + +static void msix_set_enable(XenPCIPassthroughState *dev, int en) +{ +uint16_t val = 0; +uint32_t address = 0; + +if (!dev->msix) { +return; +} + +address = dev->msix->ctrl_offset; +if (!address) { +return; +} + +val = host_pci_read_word(dev->real_device, address); +val &= ~PCI_MSIX_FLAGS_ENABLE; +if (en) { +val |= PCI_MSIX_FLAGS_ENABLE; +} +host_pci_write_word(dev->real_device, address, val); +} + +/*/ +/* MSI virtuailization functions */ + +/* + * setup physical msi, but didn't enable it + */ +int pt_msi_setup(XenPCIPassthroughState *dev) +{ +int pirq = -1; +uint8_t gvec = 0; + +if (!(dev->msi->flags & MSI_FLAG_UNINIT)) { +PT_LOG("Error: setup physical after initialized?? \n"); +return -1; +} + +gvec = dev->msi->data & 0xFF; +if (!gvec) { +/* if gvec is 0, the guest is asking for a particular pirq that + * is passed as dest_id */ +pirq = (dev->msi->addr_hi & 0xff00) | + ((dev->msi->addr_lo >> MSI_TARGET_CPU_SHIFT) & 0xff); +if (!pirq) { +/* this probably identifies an misconfiguration of the guest, + * try the emulated path */ +pirq = -1; +} else { +PT_LOG("pt_msi_setup requested pirq = %d\n", pirq); +} +} + +if (xc_physdev_map_pirq_msi(xen_xc, xen_domid, AUTO_ASSIGN, &pirq, +PCI_DEVFN(dev->real_device->dev, + dev->real_device->func), +dev->real_device->bus, 0, 0)) { +PT_LOG("Error: Mapping of MSI failed.\n"); +return -1; +} + +if (pirq < 0) { +PT_LOG("Error: Invalid pirq number\n"); +return -1; +} + +dev->msi->pirq = pirq; +PT_LOG("msi mapped with pirq %x\n", pirq); + +return 0; +} + +static uint32_t __get_msi_gflags(uint32_t data, uint64_t addr) +{ +uint32_t result = 0; +int rh, dm, dest_id, deliv_mode, trig_mode; + +rh = (addr >> MSI_ADDR_REDIRECTION_SHIFT) & 0x1; +dm = (addr >> MSI_ADDR_DESTMODE_SHIFT) & 0x1; +dest_id = (addr >> MSI_TARGET_CPU_SHIFT) & 0xff; +deliv_mode = (data >> MSI_DATA_DELIVERY_SHIFT) & 0x7; +trig_mode = (data >> MSI_DATA_TRIGGER_SHIFT) & 0x1; + +result |= dest_id | (rh << GFLAGS_SHIFT_RH) | (dm << GFLAGS_SHIFT_DM) | \ + (deliv_mode << GLFAGS_SHIFT_DELIV_MODE) | + (trig_mode << GLFAGS_SHIFT_TRG_MODE); + +return result; +} + +int pt_msi_update(XenPCIPassthroughState *d) +{ +uint8_t gvec = 0; +uint32_t gflags = 0; +uint64_t addr = 0; +int ret = 0; + +/* get vector, address, flags info, etc. */ +gvec = d->msi->data & 0xFF; +addr = (uint64_t)d->msi->addr_hi << 32 | d->msi->addr_lo; +gflags = __get_msi_gflags(d->msi->data, addr); + +PT_LOG("Update msi with pirq %x gvec %x gflags %x\n", + d->msi->pirq, gvec, gflags); + +ret = xc_domain_update_msi_irq(xen_xc, xen_domid, gvec, + d->msi->pirq, gflags, 0); + +if (ret) { +
[Qemu-devel] [PATCH RFC V1 08/11] Introduce Xen PCI Passthrough, qdevice (1/3)
Signed-off-by: Anthony PERARD --- hw/xen_pci_passthrough.c | 763 ++ hw/xen_pci_passthrough.h | 335 + hw/xen_pci_passthrough_helpers.c | 46 +++ 3 files changed, 1144 insertions(+), 0 deletions(-) create mode 100644 hw/xen_pci_passthrough.c create mode 100644 hw/xen_pci_passthrough.h create mode 100644 hw/xen_pci_passthrough_helpers.c diff --git a/hw/xen_pci_passthrough.c b/hw/xen_pci_passthrough.c new file mode 100644 index 000..bfbe042 --- /dev/null +++ b/hw/xen_pci_passthrough.c @@ -0,0 +1,763 @@ +#include + +#include "pci.h" +#include "xen.h" +#include "xen_backend.h" +#include "xen_pci_passthrough.h" + +#define PCI_BAR_ENTRIES (6) + +typedef struct PlugDevice { +uint8_t r_bus; +uint8_t r_slot; +uint8_t r_func; +int bus; +int slot; +int func; +} PlugDevice; + +QLIST_HEAD(php_dev_list, PlugDevice) php_dev_list = +QLIST_HEAD_INITIALIZER(php_dev_list); + +#define PT_BAR_ALLF 0x /* BAR ALLF value */ + +/* #define PT_NR_IRQS (256) */ +/* char mapped_machine_irq[PT_NR_IRQS] = {0}; */ + +/* Config Space */ +static int pt_pci_config_access_check(PCIDevice *d, uint32_t address, int len) +{ +/* check offset range */ +if (address >= 0xFF) { +PT_LOG("Error: Failed to access register with offset exceeding FFh. " + "[%02x:%02x.%x][Offset:%02xh][Length:%d]\n", + pci_bus_num(d->bus), PCI_SLOT(d->devfn), PCI_FUNC(d->devfn), + address, len); +return -1; +} + +/* check read size */ +if ((len != 1) && (len != 2) && (len != 4)) { +PT_LOG("Error: Failed to access register with invalid access length. " + "[%02x:%02x.%x][Offset:%02xh][Length:%d]\n", + pci_bus_num(d->bus), PCI_SLOT(d->devfn), PCI_FUNC(d->devfn), + address, len); +return -1; +} + +/* check offset alignment */ +if (address & (len - 1)) { +PT_LOG("Error: Failed to access register with invalid access size " +"alignment. [%02x:%02x.%x][Offset:%02xh][Length:%d]\n", +pci_bus_num(d->bus), PCI_SLOT(d->devfn), PCI_FUNC(d->devfn), +address, len); +return -1; +} + +return 0; +} + +int pt_bar_offset_to_index(uint32_t offset) +{ +int index = 0; + +/* check Exp ROM BAR */ +if (offset == PCI_ROM_ADDRESS) { +return PCI_ROM_SLOT; +} + +/* calculate BAR index */ +index = (offset - PCI_BASE_ADDRESS_0) >> 2; +if (index >= PCI_NUM_REGIONS) { +return -1; +} + +return index; +} + +static uint32_t pt_pci_read_config(PCIDevice *d, uint32_t address, int len) +{ +XenPCIPassthroughState *s = DO_UPCAST(XenPCIPassthroughState, dev, d); +uint32_t val = 0; +XenPTRegGroup *reg_grp_entry = NULL; +XenPTReg *reg_entry = NULL; +int rc = 0; +int emul_len = 0; +uint32_t find_addr = address; + +if (pt_pci_config_access_check(d, address, len)) { +goto exit; +} + +/* check power state transition flags */ +if (s->pm_state != NULL && s->pm_state->flags & PT_FLAG_TRANSITING) { +/* can't accept untill previous power state transition is completed. + * so finished previous request here. + */ +qemu_run_one_timer(s->pm_state->pm_timer); +} + +/* find register group entry */ +reg_grp_entry = pt_find_reg_grp(s, address); +if (reg_grp_entry) { +/* check 0 Hardwired register group */ +if (reg_grp_entry->reg_grp->grp_type == GRP_TYPE_HARDWIRED) { +/* no need to emulate, just return 0 */ +val = 0; +goto exit; +} +} + +/* read I/O device register value */ +rc = host_pci_read_block(s->real_device, address, (uint8_t *)&val, len); +if (!rc) { +PT_LOG("Error: pci_read_block failed. return value[%d].\n", rc); +memset(&val, 0xff, len); +} + +/* just return the I/O device register value for + * passthrough type register group */ +if (reg_grp_entry == NULL) { +goto exit; +} + +/* adjust the read value to appropriate CFC-CFF window */ +val <<= (address & 3) << 3; +emul_len = len; + +/* loop Guest request size */ +while (emul_len > 0) { +/* find register entry to be emulated */ +reg_entry = pt_find_reg(reg_grp_entry, find_addr); +if (reg_entry) { +XenPTRegInfo *reg = reg_entry->reg; +uint32_t real_offset = reg_grp_entry->base_offset + reg->offset; +uint32_t valid_mask = 0x >> ((4 - emul_len) << 3); +uint8_t *ptr_val = NULL; + +valid_mask <<= (find_addr - real_offset) << 3; +ptr_val = (uint8_t *)&val + (real_offset & 3); + +/* do emulation depend on register size */ +switch (reg->size) { +case 1: +if (reg->u.b.read) { +r
[Qemu-devel] [PATCH RFC V1 06/11] pci.c: Add pci_check_bar_overlap
This function help Xen PCI Passthrough device to check for overlap. Signed-off-by: Anthony PERARD --- hw/pci.c | 46 ++ hw/pci.h |3 +++ 2 files changed, 49 insertions(+), 0 deletions(-) diff --git a/hw/pci.c b/hw/pci.c index af74003..56e94d1 100644 --- a/hw/pci.c +++ b/hw/pci.c @@ -2196,3 +2196,49 @@ MemoryRegion *pci_address_space(PCIDevice *dev) { return dev->bus->address_space_mem; } + +int pci_check_bar_overlap(PCIDevice *dev, + pcibus_t addr, pcibus_t size, uint8_t type) +{ +PCIBus *bus = dev->bus; +PCIDevice *devices = NULL; +PCIIORegion *r; +int i, j; +int rc = 0; + +/* check Overlapped to Base Address */ +for (i = 0; i < ARRAY_SIZE(bus->devices); i++) { +if (!(devices = bus->devices[i])) { +continue; +} + +/* skip itself */ +if (devices->devfn == dev->devfn) { +continue; +} + +for (j = 0; j < PCI_NUM_REGIONS; j++) { +r = &devices->io_regions[j]; + +/* skip different resource type, but don't skip when + * prefetch and non-prefetch memory are compared. + */ +if (type != r->type) { +if (type == PCI_BASE_ADDRESS_SPACE_IO || +r->type == PCI_BASE_ADDRESS_SPACE_IO) { +continue; +} +} + +if ((addr < (r->addr + r->size)) && ((addr + size) > r->addr)) { +printf("Overlapped to device[%02x:%02x.%x][Region:%d]" + "[Address:%"PRIx64"h][Size:%"PRIx64"h]\n", + pci_bus_num(bus), PCI_SLOT(devices->devfn), + PCI_FUNC(devices->devfn), j, r->addr, r->size); +rc = 1; +} +} +} + +return rc; +} diff --git a/hw/pci.h b/hw/pci.h index c04b169..97996b5 100644 --- a/hw/pci.h +++ b/hw/pci.h @@ -490,4 +490,7 @@ static inline uint32_t pci_config_size(const PCIDevice *d) return pci_is_express(d) ? PCIE_CONFIG_SPACE_SIZE : PCI_CONFIG_SPACE_SIZE; } +int pci_check_bar_overlap(PCIDevice *dev, + pcibus_t addr, pcibus_t size, uint8_t type); + #endif -- Anthony PERARD
Re: [Qemu-devel] [PATCH 05/23] migration: Refactor MigrationState creation
On 10/04/2011 09:44 AM, Juan Quintela wrote: Anthony Liguori wrote: On 09/23/2011 07:56 AM, Juan Quintela wrote: Signed-off-by: Juan Quintela I think this would fit better if it were migrate_new() or migrate_open(). It definitely should be in the form noun_verb and not noun_verb_noun(). Regards, You are the native speaker, so I change this one O:-) It's just consistency, not about what sounds good. bdrv_new(), bdrv_open(), qemu_chr_open(), memory_region_init(), etc. Regards, Anthony Liguori Later, Juan.
Re: [Qemu-devel] [Xen-devel] [PATCH RFC V1 00/11] Xen PCI Passthrough
On 10/04/2011 04:51 PM, Anthony PERARD wrote: Hi all, This patch series introduce the PCI passthrough for Xen. First, we have HostPCIDevice that help to access one PCI device of the host. Then, there are several additions in the QEMU code. One is qemu_run_one_timer to run a specific timer. It is used by PCI passthrough to run a timer about power management. Another is pci_check_bar_overlap. There are also several change in pci_ids and pci_regs. Last part, but not least, the PCI passthrough device himself. Cut in 3 parts (or file), there is one to take care of the initialisation of a passthrough device. The second one handle everything about the config address space, there are specifics functions for every config register. The third one is to handle MSI. I'm still working on setting a PCI passthrough device through QMP from libxl (xen tool stack). It is just a call to device_add, with the driver parametter hostaddr=":00:1b.0". There is some minor things missing: - copyright header - PCI IO space multiplexer We also have pci passthrough in qemu-kvm (I think based on the same Neocleus code). Rather than having two pci assignment implementations, I think we should have just one, with the differences (programming the hypervisor) abstracted at that level. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [Xen-devel] [PATCH RFC V1 00/11] Xen PCI Passthrough
On 10/04/2011 09:58 AM, Avi Kivity wrote: On 10/04/2011 04:51 PM, Anthony PERARD wrote: Hi all, This patch series introduce the PCI passthrough for Xen. First, we have HostPCIDevice that help to access one PCI device of the host. Then, there are several additions in the QEMU code. One is qemu_run_one_timer to run a specific timer. It is used by PCI passthrough to run a timer about power management. Another is pci_check_bar_overlap. There are also several change in pci_ids and pci_regs. Last part, but not least, the PCI passthrough device himself. Cut in 3 parts (or file), there is one to take care of the initialisation of a passthrough device. The second one handle everything about the config address space, there are specifics functions for every config register. The third one is to handle MSI. I'm still working on setting a PCI passthrough device through QMP from libxl (xen tool stack). It is just a call to device_add, with the driver parametter hostaddr=":00:1b.0". There is some minor things missing: - copyright header - PCI IO space multiplexer We also have pci passthrough in qemu-kvm (I think based on the same Neocleus code). Rather than having two pci assignment implementations, I think we should have just one, with the differences (programming the hypervisor) abstracted at that level. I agree in principle but how close is qemu-kvm pci passthrough to a mergable state? Would it make sense to merge the Xen code first and then abstract it? Regards, Anthony Liguori
Re: [Qemu-devel] [Xen-devel] [PATCH RFC V1 00/11] Xen PCI Passthrough
On Tue, 4 Oct 2011, Anthony Liguori wrote: > On 10/04/2011 09:58 AM, Avi Kivity wrote: > > On 10/04/2011 04:51 PM, Anthony PERARD wrote: > >> Hi all, > >> > >> This patch series introduce the PCI passthrough for Xen. > >> > >> First, we have HostPCIDevice that help to access one PCI device of the > >> host. > >> > >> Then, there are several additions in the QEMU code. One is > >> qemu_run_one_timer > >> to run a specific timer. It is used by PCI passthrough to run a timer about > >> power management. Another is pci_check_bar_overlap. > >> > >> There are also several change in pci_ids and pci_regs. > >> > >> Last part, but not least, the PCI passthrough device himself. Cut in 3 > >> parts > >> (or file), there is one to take care of the initialisation of a passthrough > >> device. The second one handle everything about the config address space, > >> there > >> are specifics functions for every config register. The third one is to > >> handle > >> MSI. > >> > >> I'm still working on setting a PCI passthrough device through QMP from > >> libxl > >> (xen tool stack). It is just a call to device_add, with the driver > >> parametter > >> hostaddr=":00:1b.0". > >> > >> There is some minor things missing: > >> - copyright header > >> - PCI IO space multiplexer > >> > >> > > > > We also have pci passthrough in qemu-kvm (I think based on the same Neocleus > > code). Rather than having two pci assignment implementations, I think we > > should > > have just one, with the differences (programming the hypervisor) abstracted > > at > > that level. > > I agree in principle but how close is qemu-kvm pci passthrough to a mergable > state? Would it make sense to merge the Xen code first and then abstract it? I think it should be fairly easy to abstract the current xen code: just a matter of providing memory, ioport and interrupt mapping functions.
Re: [Qemu-devel] [PATCH] Make cpu_single_env thread local (Linux only for now)
Jan Kiszka writes: > On 2011-10-03 18:33, Dr. David Alan Gilbert wrote: >> Make cpu_single_env thread local (Linux only for now) >> * Fixes some user space threading issues (esp those triggered >> by bug 823902) >> >> Against rev d11cf8cc..., tested on ARM user mode, and ARM Vexpress >> system mode (with Blue Swirl's fix from yesterday) - only >> tested on Linux host. Lets me run ARM userspace firefox. >> >> Signed-off-by: Dr. David Alan Gilbert >> >> diff --git a/cpu-all.h b/cpu-all.h >> index 42a5fa0..d895ee6 100644 >> --- a/cpu-all.h >> +++ b/cpu-all.h >> @@ -334,7 +334,13 @@ void cpu_dump_statistics(CPUState *env, FILE *f, >> fprintf_function cpu_fprintf, >> void QEMU_NORETURN cpu_abort(CPUState *env, const char *fmt, ...) >> GCC_FMT_ATTR(2, 3); >> extern CPUState *first_cpu; >> + >> +#ifdef __linux__ >> +/* DAG: Only tested thread local on Linux, feel free to add others */ >> +extern __thread CPUState *cpu_single_env; >> +#else >> extern CPUState *cpu_single_env; >> +#endif > We need this for all platforms in order to skip qemu_global_mutex while > manipulating some CPUState. And leaving some platforms with non-TLS will > eventually break them when code is added that assumes TLS. > However, it's not unlikely that some weird platforms / ancient > toolchains still have problems with __thread - even on Linux. We may > want to play safe and use pthread_key on POSIX. Why not make this kind of annotations available in qemu-commmon.h; or even better somewhere less... "generic". #if defined(CONFIG_SUPPORTS_THREAD_KEYWORD) // use __thread #define QEMU_DECL_TLS(type, name) \ extern __thread type name; #define QEMU_DEF_TLS(type, name) \ __thread type name; #define QEMU_SET_TLS(name, value) \ do { name = value; } while (0) #define QEMU_GET_TLS(name) \ name #elif defined(CONFIG_SUPPORTS_PTHREAD_KEY) // use pthread_key_t #define QEMU_DECL_TLS(type, name) \ extern type _type_##name; \ extern pthread_key_t name; #define QEMU_DEF_TLS(type, name) \ pthread_key_t name; \ void _init_##name (void) __attribute__((constructor)); \ void _init_##name (void) { pthread_key_create(&name, NULL); } #define QEMU_SET_TLS(name, value) \ do { pthread_setspecific(name, (void*)value); } while (0) #define QEMU_GET_TLS(name) \ ((typeof(_type_##name))pthread_getspecific(name)) #else #error Go home #endif Lluis -- "And it's much the same thing with knowledge, for whenever you learn something new, the whole world becomes that much richer." -- The Princess of Pure Reason, as told by Norton Juster in The Phantom Tollbooth
Re: [Qemu-devel] [PATCH 0/4] migration: Improve subsections detection
On 10/04/2011 04:38 PM, Juan Quintela wrote: Hi This series move the subsections detection code form: - Look that it starts form 5 To: - Look that it starts form 5 (SUBSECTION) - Look at the length - Look that length is bigger than section name - Look at the idstr and see that it starts with the subsection name. Please review. Looks good as an alternative to the new migration format. Thanks, Paolo
[Qemu-devel] [PATCH] ppc64: Fix linker script
Since commit 8733f609 (Fix linker scripts) linking on Linux/ppc64 fails: LINK ppc64-linux-user/qemu-ppc64 /usr/lib64/gcc/powerpc64-suse-linux/4.3/../../../../powerpc64-suse-linux/bin/ld:/home/afaerber/qemu/ppc64.ld:84: syntax error collect2: ld gab 1 als Ende-Status zurück make[1]: *** [qemu-ppc64] Fehler 1 make: *** [subdir-ppc64-linux-user] Fehler 2 Fix by removing a leftover line in the ppc64 linker script. Cc: Gerd Hoffmann Cc: Blue Swirl Signed-off-by: Andreas Färber --- ppc64.ld |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/ppc64.ld b/ppc64.ld index 0059ee5..0a7c0dd 100644 --- a/ppc64.ld +++ b/ppc64.ld @@ -81,8 +81,8 @@ SECTIONS .sdata2 : { *(.sdata2 .sdata2.* .gnu.linkonce.s2.*) } .sbss2 : { *(.sbss2 .sbss2.* .gnu.linkonce.sb2.*) } .eh_frame_hdr : { *(.eh_frame_hdr) } -*(.gcc_except_table.*) } /* Adjust the address for the data segment. We want to -adjust up to + the same address within the page on the next page up. */ + /* Adjust the address for the data segment. We want to adjust up to + the same address within the page on the next page up. */ . = ALIGN (0x1) - ((0x1 - .) & (0x1 - 1)); . = DATA_SEGMENT_ALIGN (0x1, 0x1000); /* Exception handling */ .eh_frame : { KEEP (*(.eh_frame)) } -- 1.6.0.2
Re: [Qemu-devel] [Qemu-ppc] [PATCH] ppc64: Fix linker script
On 10/04/2011 05:14 PM, Andreas Färber wrote: Since commit 8733f609 (Fix linker scripts) linking on Linux/ppc64 fails: LINK ppc64-linux-user/qemu-ppc64 /usr/lib64/gcc/powerpc64-suse-linux/4.3/../../../../powerpc64-suse-linux/bin/ld:/home/afaerber/qemu/ppc64.ld:84: syntax error collect2: ld gab 1 als Ende-Status zurück make[1]: *** [qemu-ppc64] Fehler 1 make: *** [subdir-ppc64-linux-user] Fehler 2 Fix by removing a leftover line in the ppc64 linker script. Cc: Gerd Hoffmann Cc: Blue Swirl Signed-off-by: Andreas Färber Thanks, applied to ppc-next. Alex
[Qemu-devel] [PATCH RFC V1 04/11] pci_regs: Fix value of PCI_EXP_TYPE_RC_EC.
Value check in PCI Express Base Specification rev 1.1 Signed-off-by: Anthony PERARD --- hw/pci_regs.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/hw/pci_regs.h b/hw/pci_regs.h index e884096..dad7d9a 100644 --- a/hw/pci_regs.h +++ b/hw/pci_regs.h @@ -392,7 +392,7 @@ #define PCI_EXP_TYPE_DOWNSTREAM 0x6 /* Downstream Port */ #define PCI_EXP_TYPE_PCI_BRIDGE 0x7 /* PCI/PCI-X Bridge */ #define PCI_EXP_TYPE_RC_END 0x9 /* Root Complex Integrated Endpoint */ -#define PCI_EXP_TYPE_RC_EC0x10/* Root Complex Event Collector */ +#define PCI_EXP_TYPE_RC_EC0xa /* Root Complex Event Collector */ #define PCI_EXP_FLAGS_SLOT 0x0100 /* Slot implemented */ #define PCI_EXP_FLAGS_IRQ 0x3e00 /* Interrupt message number */ #define PCI_EXP_DEVCAP 4 /* Device capabilities */ -- Anthony PERARD
[Qemu-devel] [PATCH RFC V1 00/11] Xen PCI Passthrough
Hi all, This patch series introduce the PCI passthrough for Xen. First, we have HostPCIDevice that help to access one PCI device of the host. Then, there are several additions in the QEMU code. One is qemu_run_one_timer to run a specific timer. It is used by PCI passthrough to run a timer about power management. Another is pci_check_bar_overlap. There are also several change in pci_ids and pci_regs. Last part, but not least, the PCI passthrough device himself. Cut in 3 parts (or file), there is one to take care of the initialisation of a passthrough device. The second one handle everything about the config address space, there are specifics functions for every config register. The third one is to handle MSI. I'm still working on setting a PCI passthrough device through QMP from libxl (xen tool stack). It is just a call to device_add, with the driver parametter hostaddr=":00:1b.0". There is some minor things missing: - copyright header - PCI IO space multiplexer Regards, Anthony PERARD (11): Introduce HostPCIDevice to access a pci device on the host. qemu-timer: Introduce qemu_run_one_timer pci_ids: Add INTEL_82599_VF id. pci_regs: Fix value of PCI_EXP_TYPE_RC_EC. pci_regs: Add PCI_EXP_TYPE_PCIE_BRIDGE pci.c: Add pci_check_bar_overlap host-pci-device: Add host_pci_find_ext_cap_offset Introduce Xen PCI Passthrough, qdevice (1/3) Introduce Xen PCI Passthrough, PCI config space helpers (2/3) Introduce Xen PCI Passthrough, MSI (3/3) config/make: Introduce --enable-xen-pci-passthrough, built it. Makefile.target |7 + configure| 21 + hw/host-pci-device.c | 223 +++ hw/host-pci-device.h | 38 + hw/pci.c | 46 + hw/pci.h |3 + hw/pci_ids.h |1 + hw/pci_regs.h|3 +- hw/xen_pci_passthrough.c | 763 +++ hw/xen_pci_passthrough.h | 335 + hw/xen_pci_passthrough_config_init.c | 2489 ++ hw/xen_pci_passthrough_helpers.c | 46 + hw/xen_pci_passthrough_msi.c | 674 + qemu-timer.c | 15 + qemu-timer.h |3 + 15 files changed, 4666 insertions(+), 1 deletions(-) create mode 100644 hw/host-pci-device.c create mode 100644 hw/host-pci-device.h create mode 100644 hw/xen_pci_passthrough.c create mode 100644 hw/xen_pci_passthrough.h create mode 100644 hw/xen_pci_passthrough_config_init.c create mode 100644 hw/xen_pci_passthrough_helpers.c create mode 100644 hw/xen_pci_passthrough_msi.c -- Anthony PERARD
[Qemu-devel] [PATCHv3] s390: Fix cpu shutdown for KVM
On s390 a shutdown is the state of all CPUs being either stopped or disabled (for interrupts) waiting. We have to track the overall number of running CPUs to call the shutdown sequence accordingly. This patch implements the counting and shutdown handling for the kvm path in qemu. Lets also wrap changes to env->halted and env->exception_index. Signed-off-by: Christian Borntraeger --- hw/s390-virtio.c | 32 ++-- target-s390x/cpu.h |2 ++ target-s390x/kvm.c | 19 +++ 3 files changed, 39 insertions(+), 14 deletions(-) Index: b/hw/s390-virtio.c === --- a/hw/s390-virtio.c +++ b/hw/s390-virtio.c @@ -130,6 +130,34 @@ int s390_virtio_hypercall(CPUState *env, return r; } +/* + * The number of running CPUs. On s390 a shutdown is the state of all CPUs + * being either stopped or disabled (for interrupts) waiting. We have to + * track this number to call the shutdown sequence accordingly. This + * number is modified either on startup or while holding the big qemu lock. + */ +static unsigned s390_running_cpus; + +void s390_add_running_cpu(CPUState *env) +{ +if (env->halted) { +s390_running_cpus++; +env->halted = 0; +env->exception_index = -1; +} +} + +unsigned s390_del_running_cpu(CPUState *env) +{ +if (env->halted == 0) { +assert(s390_running_cpus >= 1); +s390_running_cpus--; +env->halted = 1; +env->exception_index = EXCP_HLT; +} +return s390_running_cpus; +} + /* PC hardware initialisation */ static void s390_init(ram_addr_t my_ram_size, const char *boot_device, @@ -187,8 +215,8 @@ static void s390_init(ram_addr_t my_ram_ tmp_env->storage_keys = storage_keys; } -env->halted = 0; -env->exception_index = 0; +/* One CPU has to run */ +s390_add_running_cpu(env); if (kernel_filename) { kernel_size = load_image(kernel_filename, qemu_get_ram_ptr(0)); Index: b/target-s390x/cpu.h === --- a/target-s390x/cpu.h +++ b/target-s390x/cpu.h @@ -309,6 +309,8 @@ static inline void kvm_s390_interrupt_in } #endif CPUState *s390_cpu_addr2state(uint16_t cpu_addr); +void s390_add_running_cpu(CPUState *env); +unsigned s390_del_running_cpu(CPUState *env); /* from s390-virtio-bus */ extern const target_phys_addr_t virtio_size; Index: b/target-s390x/kvm.c === --- a/target-s390x/kvm.c +++ b/target-s390x/kvm.c @@ -185,8 +185,7 @@ void kvm_s390_interrupt_internal(CPUStat return; } -env->halted = 0; -env->exception_index = -1; +s390_add_running_cpu(env); qemu_cpu_kick(env); kvmint.type = type; @@ -299,8 +298,7 @@ static int handle_diag(CPUState *env, st static int s390_cpu_restart(CPUState *env) { kvm_s390_interrupt(env, KVM_S390_RESTART, 0); -env->halted = 0; -env->exception_index = -1; +s390_add_running_cpu(env); qemu_cpu_kick(env); dprintf("DONE: SIGP cpu restart: %p\n", env); return 0; @@ -425,17 +423,16 @@ static int handle_intercept(CPUState *en r = handle_instruction(env, run); break; case ICPT_WAITPSW: -/* XXX What to do on system shutdown? */ -env->halted = 1; -env->exception_index = EXCP_HLT; +case ICPT_CPU_STOP: +if (s390_del_running_cpu(env) == 0) { +qemu_system_shutdown_request(); +} +r = EXCP_HALTED; break; case ICPT_SOFT_INTERCEPT: fprintf(stderr, "KVM unimplemented icpt SOFT\n"); exit(1); break; -case ICPT_CPU_STOP: -qemu_system_shutdown_request(); -break; case ICPT_IO: fprintf(stderr, "KVM unimplemented icpt IO\n"); exit(1); @@ -468,8 +465,6 @@ int kvm_arch_handle_exit(CPUState *env, if (ret == 0) { ret = EXCP_INTERRUPT; -} else if (ret > 0) { -ret = 0; } return ret; }
[Qemu-devel] [PATCH RFC V1 05/11] pci_regs: Add PCI_EXP_TYPE_PCIE_BRIDGE
Signed-off-by: Anthony PERARD --- hw/pci_regs.h |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/hw/pci_regs.h b/hw/pci_regs.h index dad7d9a..091a749 100644 --- a/hw/pci_regs.h +++ b/hw/pci_regs.h @@ -391,6 +391,7 @@ #define PCI_EXP_TYPE_UPSTREAM 0x5 /* Upstream Port */ #define PCI_EXP_TYPE_DOWNSTREAM 0x6 /* Downstream Port */ #define PCI_EXP_TYPE_PCI_BRIDGE 0x7 /* PCI/PCI-X Bridge */ +#define PCI_EXP_TYPE_PCIE_BRIDGE 0x8 /* PCI/PCI-X to PCIE Bridge */ #define PCI_EXP_TYPE_RC_END 0x9 /* Root Complex Integrated Endpoint */ #define PCI_EXP_TYPE_RC_EC0xa /* Root Complex Event Collector */ #define PCI_EXP_FLAGS_SLOT 0x0100 /* Slot implemented */ -- Anthony PERARD
[Qemu-devel] [PATCH RFC V1 07/11] host-pci-device: Add host_pci_find_ext_cap_offset
Signed-off-by: Anthony PERARD --- hw/host-pci-device.c | 31 +++ hw/host-pci-device.h |2 ++ 2 files changed, 33 insertions(+), 0 deletions(-) diff --git a/hw/host-pci-device.c b/hw/host-pci-device.c index b3f2899..2a889d5 100644 --- a/hw/host-pci-device.c +++ b/hw/host-pci-device.c @@ -162,6 +162,37 @@ int host_pci_write_block(HostPCIDevice *d, int pos, uint8_t *buf, int len) return host_pci_config_write(d, pos, buf, len); } +uint32_t host_pci_find_ext_cap_offset(HostPCIDevice *d, uint32_t cap) +{ +uint32_t header = 0; +int max_cap = 480; +int pos = 0x100; + +do { +header = host_pci_read_long(d, pos); +/* + * If we have no capabilities, this is indicated by cap ID, + * cap version and next pointer all being 0. + */ +if (header == 0) { +break; +} + +if (PCI_EXT_CAP_ID(header) == cap) { +return pos; +} + +pos = PCI_EXT_CAP_NEXT(header); +if (pos < 0x100) { +break; +} + +max_cap--; +} while (max_cap > 0); + +return 0; +} + HostPCIDevice *host_pci_device_get(uint8_t bus, uint8_t dev, uint8_t func) { HostPCIDevice *d = NULL; diff --git a/hw/host-pci-device.h b/hw/host-pci-device.h index 0137507..2734eb3 100644 --- a/hw/host-pci-device.h +++ b/hw/host-pci-device.h @@ -33,4 +33,6 @@ int host_pci_write_word(HostPCIDevice *d, int pos, uint16_t data); int host_pci_write_long(HostPCIDevice *d, int pos, uint32_t data); int host_pci_write_block(HostPCIDevice *d, int pos, uint8_t *buf, int len); +uint32_t host_pci_find_ext_cap_offset(HostPCIDevice *s, uint32_t cap); + #endif /* !HW_HOST_PCI_DEVICE */ -- Anthony PERARD
Re: [Qemu-devel] [PATCH] ppc64: Fix linker script
On 10/04/2011 08:14 AM, Andreas Färber wrote: > Since commit 8733f609 (Fix linker scripts) linking on Linux/ppc64 fails: It occurs to me to wonder if we ought to simply auto-detect the presence of the -Ttext-segment ADDR option. If that's present, don't override the linker script and all the changes that might have been introduced therein. r~
[Qemu-devel] [PATCH RFC V1 03/11] pci_ids: Add INTEL_82599_VF id.
Signed-off-by: Anthony PERARD --- hw/pci_ids.h |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/hw/pci_ids.h b/hw/pci_ids.h index 83f3893..2ea5ec2 100644 --- a/hw/pci_ids.h +++ b/hw/pci_ids.h @@ -117,6 +117,7 @@ #define PCI_DEVICE_ID_INTEL_82801I_UHCI6 0x2939 #define PCI_DEVICE_ID_INTEL_82801I_EHCI1 0x293a #define PCI_DEVICE_ID_INTEL_82801I_EHCI2 0x293c +#define PCI_DEVICE_ID_INTEL_82599_VF 0x10ed #define PCI_VENDOR_ID_XEN 0x5853 #define PCI_DEVICE_ID_XEN_PLATFORM 0x0001 -- Anthony PERARD
[Qemu-devel] [PATCH 3/4] savevm: improve subsections detection on load
We add qemu_peek_buffer, that is identical to qemu_get_buffer, just that it don't update f->buf_index. We add a paramenter to qemu_peek_byte() to be able to peek more than one byte. Once this is done, to see if we have a subsection we look: - 1st byte is QEMU_VM_SUBSECTION - 2nd byte is a length, and is bigger than section name - 3rd element is a string that starts with section_name So, we shouldn't have false positives (yes, content could still get us wrong but probabilities are really low). Signed-off-by: Juan Quintela --- savevm.c | 71 - 1 files changed, 60 insertions(+), 11 deletions(-) diff --git a/savevm.c b/savevm.c index 5fee4e2..db6ea12 100644 --- a/savevm.c +++ b/savevm.c @@ -532,6 +532,37 @@ void qemu_put_byte(QEMUFile *f, int v) qemu_fflush(f); } +static int qemu_peek_buffer(QEMUFile *f, uint8_t *buf, int size1, int offset) +{ +int size, l; +int index = f->buf_index + offset; + +if (f->is_write) { +abort(); +} + +size = size1; +while (size > 0) { +l = f->buf_size - index; +if (l == 0) { +qemu_fill_buffer(f); +index = f->buf_index + offset; +l = f->buf_size - index; +if (l == 0) { +break; +} +} +if (l > size) { +l = size; +} +memcpy(buf, f->buf + index, l); +index += l; +buf += l; +size -= l; +} +return size1 - size; +} + int qemu_get_buffer(QEMUFile *f, uint8_t *buf, int size1) { int size, l; @@ -561,19 +592,22 @@ int qemu_get_buffer(QEMUFile *f, uint8_t *buf, int size1) return size1 - size; } -static int qemu_peek_byte(QEMUFile *f) +static int qemu_peek_byte(QEMUFile *f, int offset) { +int index = f->buf_index + offset; + if (f->is_write) { abort(); } -if (f->buf_index >= f->buf_size) { +if (index >= f->buf_size) { qemu_fill_buffer(f); -if (f->buf_index >= f->buf_size) { +index = f->buf_index + offset; +if (index >= f->buf_size) { return 0; } } -return f->buf[f->buf_index]; +return f->buf[index]; } int qemu_get_byte(QEMUFile *f) @@ -1687,22 +1721,37 @@ static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription *vmsd, return 0; } -while (qemu_peek_byte(f) == QEMU_VM_SUBSECTION) { +while (qemu_peek_byte(f, 0) == QEMU_VM_SUBSECTION) { char idstr[256]; int ret; -uint8_t version_id, len; +uint8_t version_id, len, size; const VMStateDescription *sub_vmsd; -qemu_get_byte(f); /* subsection */ -len = qemu_get_byte(f); -qemu_get_buffer(f, (uint8_t *)idstr, len); -idstr[len] = 0; -version_id = qemu_get_be32(f); +len = qemu_peek_byte(f, 1); +if (len < strlen(vmsd->name) + 1) { +/* subsection name has be be "section_name/a" */ +return 0; +} +size = qemu_peek_buffer(f, (uint8_t *)idstr, len, 2); +if (size != len) { +return 0; +} +idstr[size] = 0; +if (strncmp(vmsd->name, idstr, strlen(vmsd->name)) != 0) { +/* it don't have a valid subsection name */ +return 0; +} sub_vmsd = vmstate_get_subsection(sub, idstr); if (sub_vmsd == NULL) { return -ENOENT; } +qemu_get_byte(f); /* subsection */ +qemu_get_byte(f); /* len */ +qemu_get_buffer(f, (uint8_t *)idstr, len); +idstr[len] = 0; +version_id = qemu_get_be32(f); + assert(!sub_vmsd->subsections); ret = vmstate_load_state(f, sub_vmsd, opaque, version_id); if (ret) { -- 1.7.6.4
Re: [Qemu-devel] [PATCHv3] s390: Fix cpu shutdown for KVM
On 10/04/2011 05:20 PM, Christian Borntraeger wrote: On s390 a shutdown is the state of all CPUs being either stopped or disabled (for interrupts) waiting. We have to track the overall number of running CPUs to call the shutdown sequence accordingly. This patch implements the counting and shutdown handling for the kvm path in qemu. Lets also wrap changes to env->halted and env->exception_index. Signed-off-by: Christian Borntraeger --- hw/s390-virtio.c | 32 ++-- target-s390x/cpu.h |2 ++ target-s390x/kvm.c | 19 +++ 3 files changed, 39 insertions(+), 14 deletions(-) Index: b/hw/s390-virtio.c === --- a/hw/s390-virtio.c +++ b/hw/s390-virtio.c @@ -130,6 +130,34 @@ int s390_virtio_hypercall(CPUState *env, return r; } +/* + * The number of running CPUs. On s390 a shutdown is the state of all CPUs + * being either stopped or disabled (for interrupts) waiting. We have to + * track this number to call the shutdown sequence accordingly. This + * number is modified either on startup or while holding the big qemu lock. + */ +static unsigned s390_running_cpus; + +void s390_add_running_cpu(CPUState *env) +{ +if (env->halted) { +s390_running_cpus++; +env->halted = 0; +env->exception_index = -1; +} +} + +unsigned s390_del_running_cpu(CPUState *env) +{ +if (env->halted == 0) { +assert(s390_running_cpus>= 1); +s390_running_cpus--; +env->halted = 1; +env->exception_index = EXCP_HLT; +} +return s390_running_cpus; +} + /* PC hardware initialisation */ static void s390_init(ram_addr_t my_ram_size, const char *boot_device, @@ -187,8 +215,8 @@ static void s390_init(ram_addr_t my_ram_ tmp_env->storage_keys = storage_keys; } -env->halted = 0; -env->exception_index = 0; +/* One CPU has to run */ +s390_add_running_cpu(env); if (kernel_filename) { kernel_size = load_image(kernel_filename, qemu_get_ram_ptr(0)); Index: b/target-s390x/cpu.h === --- a/target-s390x/cpu.h +++ b/target-s390x/cpu.h @@ -309,6 +309,8 @@ static inline void kvm_s390_interrupt_in } #endif CPUState *s390_cpu_addr2state(uint16_t cpu_addr); +void s390_add_running_cpu(CPUState *env); +unsigned s390_del_running_cpu(CPUState *env); /* from s390-virtio-bus */ extern const target_phys_addr_t virtio_size; Index: b/target-s390x/kvm.c === --- a/target-s390x/kvm.c +++ b/target-s390x/kvm.c @@ -185,8 +185,7 @@ void kvm_s390_interrupt_internal(CPUStat return; } -env->halted = 0; -env->exception_index = -1; +s390_add_running_cpu(env); qemu_cpu_kick(env); kvmint.type = type; @@ -299,8 +298,7 @@ static int handle_diag(CPUState *env, st static int s390_cpu_restart(CPUState *env) { kvm_s390_interrupt(env, KVM_S390_RESTART, 0); -env->halted = 0; -env->exception_index = -1; +s390_add_running_cpu(env); qemu_cpu_kick(env); dprintf("DONE: SIGP cpu restart: %p\n", env); return 0; @@ -425,17 +423,16 @@ static int handle_intercept(CPUState *en r = handle_instruction(env, run); break; case ICPT_WAITPSW: -/* XXX What to do on system shutdown? */ -env->halted = 1; -env->exception_index = EXCP_HLT; +case ICPT_CPU_STOP: +if (s390_del_running_cpu(env) == 0) { +qemu_system_shutdown_request(); +} +r = EXCP_HALTED; break; case ICPT_SOFT_INTERCEPT: fprintf(stderr, "KVM unimplemented icpt SOFT\n"); exit(1); break; -case ICPT_CPU_STOP: -qemu_system_shutdown_request(); -break; case ICPT_IO: fprintf(stderr, "KVM unimplemented icpt IO\n"); exit(1); @@ -468,8 +465,6 @@ int kvm_arch_handle_exit(CPUState *env, if (ret == 0) { ret = EXCP_INTERRUPT; -} else if (ret> 0) { -ret = 0; Hrm. Are you sure that this doesn't break anything? The rest looks good. Alex
Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/2] tcg/ppc*: Move cache initialization to ppc specific code
On 10/04/2011 12:55 AM, Stefan Weil wrote: > Am 03.10.2011 23:40, schrieb Scott Wood: >> The interface isn't powerpc-specific. It just happens to be the only >> arch so far that qemu supports that needs the implementation to do >> something (or possibly just the only one where that need has been >> discovered). >> >> What problem is it causing the way it is? > > My patch was triggered by this discussion thread and a remark from > Peter Maydell: > > http://lists.nongnu.org/archive/html/qemu-devel/2011-09/msg02272.html > > The only problem is that ppc differs from all other hosts: > > * No other host needs qemu_cache_utils_init() today. > > * All other hosts define flush_icache_range() in tcg-target.h. > > * All other hosts only call flush_icache_range() from tcg code. > > I learned now that ppc will need flush_icache_range() for kvm, too. > So it won't be possible to implement a uniform handling of > flush_icache_range() for all host architectures. This doesn't smell right... why would other arches need it for TCG, but not KVM? Either you need to flush when writing code to memory, or you don't. Is it just that KVM isn't implemented yet on those architectures? -Scott
Re: [Qemu-devel] [PATCH] i386: wire up MSR_IA32_MISC_ENABLE
On 2011-10-04 16:26, Avi Kivity wrote: > It's needed for its default value - bit 0 specifies that "rep movs" is > good enough for memcpy, and Linux may use a slower memcpu if it is not set, > depending on cpu family/model. > > Signed-off-by: Avi Kivity > --- > target-i386/cpu.h |5 + > target-i386/helper.c|1 + > target-i386/kvm.c | 15 +++ > target-i386/machine.c | 21 + > target-i386/op_helper.c |6 ++ > 5 files changed, 48 insertions(+), 0 deletions(-) > > diff --git a/target-i386/cpu.h b/target-i386/cpu.h > index ae36489..5416809 100644 > --- a/target-i386/cpu.h > +++ b/target-i386/cpu.h > @@ -299,6 +299,10 @@ > > #define MSR_IA32_PERF_STATUS0x198 > > +#define MSR_IA32_MISC_ENABLE 0x1a0 I smell tabs... > +/* Indicates good rep/movs microcode on some processors: */ > +#define MSR_IA32_MISC_ENABLE_DEFAULT1 > + > #define MSR_MTRRphysBase(reg)(0x200 + 2 * (reg)) > #define MSR_MTRRphysMask(reg)(0x200 + 2 * (reg) + 1) > > @@ -689,6 +693,7 @@ typedef struct CPUX86State { > uint64_t tsc; > > uint64_t mcg_status; > +uint64_t msr_ia32_misc_enable; > > /* exception/interrupt handling */ > int error_code; > diff --git a/target-i386/helper.c b/target-i386/helper.c > index 5df40d4..6c6a167 100644 > --- a/target-i386/helper.c > +++ b/target-i386/helper.c > @@ -98,6 +98,7 @@ void cpu_reset(CPUX86State *env) > env->mxcsr = 0x1f80; > > env->pat = 0x0007040600070406ULL; > +env->msr_ia32_misc_enable = MSR_IA32_MISC_ENABLE_DEFAULT; > > memset(env->dr, 0, sizeof(env->dr)); > env->dr[6] = DR6_FIXED_1; > diff --git a/target-i386/kvm.c b/target-i386/kvm.c > index b6eef04..b3a650b 100644 > --- a/target-i386/kvm.c > +++ b/target-i386/kvm.c > @@ -60,6 +60,7 @@ > static bool has_msr_star; > static bool has_msr_hsave_pa; > static bool has_msr_async_pf_en; > +static bool has_msr_misc_enable; > static int lm_capable_kernel; > > static struct kvm_cpuid2 *try_get_cpuid(KVMState *s, int max) > @@ -568,6 +569,10 @@ static int kvm_get_supported_msrs(KVMState *s) > has_msr_hsave_pa = true; > continue; > } > +if (kvm_msr_list->indices[i] == MSR_IA32_MISC_ENABLE) { > +has_msr_misc_enable = true; > +continue; > +} > } > } > > @@ -881,6 +886,10 @@ static int kvm_put_msrs(CPUState *env, int level) > if (has_msr_hsave_pa) { > kvm_msr_entry_set(&msrs[n++], MSR_VM_HSAVE_PA, env->vm_hsave); > } > +if (has_msr_misc_enable) { > +kvm_msr_entry_set(&msrs[n++], MSR_IA32_MISC_ENABLE, > + env->msr_ia32_misc_enable); > +} > #ifdef TARGET_X86_64 > if (lm_capable_kernel) { > kvm_msr_entry_set(&msrs[n++], MSR_CSTAR, env->cstar); > @@ -1127,6 +1136,9 @@ static int kvm_get_msrs(CPUState *env) > if (has_msr_hsave_pa) { > msrs[n++].index = MSR_VM_HSAVE_PA; > } > +if (has_msr_misc_enable) { > +msrs[n++].index = MSR_IA32_MISC_ENABLE; > +} > > if (!env->tsc_valid) { > msrs[n++].index = MSR_IA32_TSC; > @@ -1210,6 +1222,9 @@ static int kvm_get_msrs(CPUState *env) > case MSR_MCG_CTL: > env->mcg_ctl = msrs[i].data; > break; > +case MSR_IA32_MISC_ENABLE: > +env->msr_ia32_misc_enable = msrs[i].data; > +break; > default: > if (msrs[i].index >= MSR_MC0_CTL && > msrs[i].index < MSR_MC0_CTL + (env->mcg_cap & 0xff) * 4) { > diff --git a/target-i386/machine.c b/target-i386/machine.c > index 9aca8e0..73c1ffe 100644 > --- a/target-i386/machine.c > +++ b/target-i386/machine.c > @@ -310,6 +310,24 @@ static bool fpop_ip_dp_needed(void *opaque) > } > }; > > +static bool misc_enable_needed(void *opaque) > +{ > +CPUState *env = opaque; > + > +return env->msr_ia32_misc_enable != MSR_IA32_MISC_ENABLE_DEFAULT; > +} > + > +static const VMStateDescription vmstate_msr_ia32_misc_enable = { > +.name = "cpu/msr_ia32_misc_enable", > +.version_id = 1, > +.minimum_version_id = 1, > +.minimum_version_id_old = 1, > +.fields = (VMStateField []) { > +VMSTATE_UINT64(msr_ia32_misc_enable, CPUState), > +VMSTATE_END_OF_LIST() > +} > +}; > + We are about to bump the CPU_SAVE_VERSION for the sake of APIC deadline timer, so you can jump on that train and avoid this subsection. > static const VMStateDescription vmstate_cpu = { > .name = "cpu", > .version_id = CPU_SAVE_VERSION, > @@ -420,6 +438,9 @@ static bool fpop_ip_dp_needed(void *opaque) > } , { > .vmsd = &vmstate_fpop_ip_dp, > .needed = fpop_ip_dp_needed, > +}, { > +.vmsd = &vmstate_msr_ia32_misc_enable, > +.needed = misc_enable_nee
[Qemu-devel] [PATCH 0/2] HDA fixes for Windows 7
Hi, There are two related bugs in stream state and xfer handling due to the conflict of stream number (on Windows 7, input stream and output stream both use the full range of 1..15 values. on Linux, the handling is different and it works fine because there is no clash) The patches handle stream number for input and output stream distinctively to avoid conflict, and it seems to be in accordance to the spec. Marc-André Lureau (2): hda: do not mix output and input streams, RHBZ #740493 hda: do not mix output and input stream states, RHBZ #740493 hw/hda-audio.c | 15 +-- hw/intel-hda.c | 18 ++ hw/intel-hda.h |2 +- 3 files changed, 20 insertions(+), 15 deletions(-) -- 1.7.6.2
[Qemu-devel] [PATCH 1/2] hda: do not mix output and input streams, RHBZ #740493
Windows 7 may use the same stream number for input and output. That will result in lot of garbage on playback. The hardcoded value of 4 needs to be in sync with GCAP streams description and IN/OUT registers. --- hw/intel-hda.c |9 + 1 files changed, 5 insertions(+), 4 deletions(-) diff --git a/hw/intel-hda.c b/hw/intel-hda.c index 4272204..c6a3fec 100644 --- a/hw/intel-hda.c +++ b/hw/intel-hda.c @@ -389,14 +389,15 @@ static bool intel_hda_xfer(HDACodecDevice *dev, uint32_t stnr, bool output, { HDACodecBus *bus = DO_UPCAST(HDACodecBus, qbus, dev->qdev.parent_bus); IntelHDAState *d = container_of(bus, IntelHDAState, codecs); -IntelHDAStream *st = NULL; target_phys_addr_t addr; uint32_t s, copy, left; +IntelHDAStream *st; bool irq = false; -for (s = 0; s < ARRAY_SIZE(d->st); s++) { -if (stnr == ((d->st[s].ctl >> 20) & 0x0f)) { -st = d->st + s; +st = output ? d->st + 4 : d->st; +for (s = 0; s < 4; s++) { +if (stnr == ((st[s].ctl >> 20) & 0x0f)) { +st = st + s; break; } } -- 1.7.6.2
[Qemu-devel] [PATCH 2/2] hda: do not mix output and input stream states, RHBZ #740493
Windows 7 may use the same stream number for input and output. Current code will confuse streams. NB: I wonder if this patch breaks migration code because of this change: -VMSTATE_BOOL_ARRAY(running, HDAAudioState, 16), +VMSTATE_BOOL_ARRAY(running, HDAAudioState, 2 * 16), --- hw/hda-audio.c | 15 +-- hw/intel-hda.c |9 + hw/intel-hda.h |2 +- 3 files changed, 15 insertions(+), 11 deletions(-) diff --git a/hw/hda-audio.c b/hw/hda-audio.c index 03c0a24..dacf290 100644 --- a/hw/hda-audio.c +++ b/hw/hda-audio.c @@ -462,7 +462,7 @@ struct HDAAudioState { QEMUSoundCard card; const desc_codec *desc; HDAAudioStream st[4]; -bool running[16]; +bool running[2 * 16]; /* properties */ uint32_t debug; @@ -659,7 +659,7 @@ static void hda_audio_command(HDACodecDevice *hda, uint32_t nid, uint32_t data) st->channel = payload & 0x0f; dprint(a, 2, "%s: stream %d, channel %d\n", st->node->name, st->stream, st->channel); -hda_audio_set_running(st, a->running[st->stream]); +hda_audio_set_running(st, a->running[st->output * 16 + st->stream]); hda_codec_response(hda, true, 0); break; case AC_VERB_GET_CONV: @@ -742,16 +742,19 @@ fail: hda_codec_response(hda, true, 0); } -static void hda_audio_stream(HDACodecDevice *hda, uint32_t stnr, bool running) +static void hda_audio_stream(HDACodecDevice *hda, uint32_t stnr, bool running, bool output) { HDAAudioState *a = DO_UPCAST(HDAAudioState, hda, hda); int s; -a->running[stnr] = running; +a->running[output * 16 + stnr] = running; for (s = 0; s < ARRAY_SIZE(a->st); s++) { if (a->st[s].node == NULL) { continue; } +if (a->st[s].output != output) { +continue; +} if (a->st[s].stream != stnr) { continue; } @@ -840,7 +843,7 @@ static int hda_audio_post_load(void *opaque, int version) hda_codec_parse_fmt(st->format, &st->as); hda_audio_setup(st); hda_audio_set_amp(st); -hda_audio_set_running(st, a->running[st->stream]); +hda_audio_set_running(st, a->running[st->output * 16 + st->stream]); } return 0; } @@ -870,7 +873,7 @@ static const VMStateDescription vmstate_hda_audio = { VMSTATE_STRUCT_ARRAY(st, HDAAudioState, 4, 0, vmstate_hda_audio_stream, HDAAudioStream), -VMSTATE_BOOL_ARRAY(running, HDAAudioState, 16), +VMSTATE_BOOL_ARRAY(running, HDAAudioState, 2 * 16), VMSTATE_END_OF_LIST() } }; diff --git a/hw/intel-hda.c b/hw/intel-hda.c index c6a3fec..f97775c 100644 --- a/hw/intel-hda.c +++ b/hw/intel-hda.c @@ -485,7 +485,7 @@ static void intel_hda_parse_bdl(IntelHDAState *d, IntelHDAStream *st) st->bp= 0; } -static void intel_hda_notify_codecs(IntelHDAState *d, uint32_t stream, bool running) +static void intel_hda_notify_codecs(IntelHDAState *d, uint32_t stream, bool running, bool output) { DeviceState *qdev; HDACodecDevice *cdev; @@ -493,7 +493,7 @@ static void intel_hda_notify_codecs(IntelHDAState *d, uint32_t stream, bool runn QLIST_FOREACH(qdev, &d->codecs.qbus.children, sibling) { cdev = DO_UPCAST(HDACodecDevice, qdev, qdev); if (cdev->info->stream) { -cdev->info->stream(cdev, stream, running); +cdev->info->stream(cdev, stream, running, output); } } } @@ -567,6 +567,7 @@ static void intel_hda_set_ics(IntelHDAState *d, const IntelHDAReg *reg, uint32_t static void intel_hda_set_st_ctl(IntelHDAState *d, const IntelHDAReg *reg, uint32_t old) { +bool output = reg->stream >= 4; IntelHDAStream *st = d->st + reg->stream; if (st->ctl & 0x01) { @@ -582,11 +583,11 @@ static void intel_hda_set_st_ctl(IntelHDAState *d, const IntelHDAReg *reg, uint3 dprint(d, 1, "st #%d: start %d (ring buf %d bytes)\n", reg->stream, stnr, st->cbl); intel_hda_parse_bdl(d, st); -intel_hda_notify_codecs(d, stnr, true); +intel_hda_notify_codecs(d, stnr, true, output); } else { /* stop */ dprint(d, 1, "st #%d: stop %d\n", reg->stream, stnr); -intel_hda_notify_codecs(d, stnr, false); +intel_hda_notify_codecs(d, stnr, false, output); } } intel_hda_update_irq(d); diff --git a/hw/intel-hda.h b/hw/intel-hda.h index 4e44e38..65fd2a8 100644 --- a/hw/intel-hda.h +++ b/hw/intel-hda.h @@ -34,7 +34,7 @@ struct HDACodecDeviceInfo { int (*init)(HDACodecDevice *dev); int (*exit)(HDACodecDevice *dev); void (*command)(HDACodecDevice *dev, uint32_t nid, uint32_t data); -void (*stream)(HDACodecDevice *dev, uint32_t stnr, bool running); +void (*stream)(HDACodecDevice *dev, uint32_t stnr, bool running, bool output); }; void hda_codec_bus_
Re: [Qemu-devel] [Xen-devel] [PATCH RFC V1 00/11] Xen PCI Passthrough
On Tue, 2011-10-04 at 16:05 +0100, Stefano Stabellini wrote: > On Tue, 4 Oct 2011, Anthony Liguori wrote: > > On 10/04/2011 09:58 AM, Avi Kivity wrote: > > > On 10/04/2011 04:51 PM, Anthony PERARD wrote: > > >> Hi all, > > >> > > >> This patch series introduce the PCI passthrough for Xen. > > >> > > >> First, we have HostPCIDevice that help to access one PCI device of the > > >> host. > > >> > > >> Then, there are several additions in the QEMU code. One is > > >> qemu_run_one_timer > > >> to run a specific timer. It is used by PCI passthrough to run a timer > > >> about > > >> power management. Another is pci_check_bar_overlap. > > >> > > >> There are also several change in pci_ids and pci_regs. > > >> > > >> Last part, but not least, the PCI passthrough device himself. Cut in 3 > > >> parts > > >> (or file), there is one to take care of the initialisation of a > > >> passthrough > > >> device. The second one handle everything about the config address space, > > >> there > > >> are specifics functions for every config register. The third one is to > > >> handle > > >> MSI. > > >> > > >> I'm still working on setting a PCI passthrough device through QMP from > > >> libxl > > >> (xen tool stack). It is just a call to device_add, with the driver > > >> parametter > > >> hostaddr=":00:1b.0". > > >> > > >> There is some minor things missing: > > >> - copyright header > > >> - PCI IO space multiplexer > > >> > > >> > > > > > > We also have pci passthrough in qemu-kvm (I think based on the same > > > Neocleus > > > code). Rather than having two pci assignment implementations, I think we > > > should > > > have just one, with the differences (programming the hypervisor) > > > abstracted at > > > that level. > > > > I agree in principle but how close is qemu-kvm pci passthrough to a > > mergable > > state? Would it make sense to merge the Xen code first and then abstract > > it? > > I think it should be fairly easy to abstract the current xen code: just > a matter of providing memory, ioport and interrupt mapping functions. I thought we were potentially looking at vfio as a convergence point. I'm still a bit off from having a vfio re-write ready to submit, but is this still a possibility? Thanks, Alex
Re: [Qemu-devel] build errors on s390 tcg
Am 04.10.2011 14:44, schrieb Peter Maydell: On 4 October 2011 13:23, Alexander Graf wrote: The problem is that tcg.c defines the functions with int arguments for TCG register indexes, while s390/tcg-target.c takes TCGReg parameters. I'm not sure which way is better, but using TCGReg feels more type safe to me. You're not consistent, though; for example the s390 tcg_target_reg_alloc_order array is declared as int[], not TCGReg[]. -- PMM I did not notice that s390 was different when I wrote my patch, but I agree with Alex that TCGReg is more type safe. I also think that using TCGReg documents the meaning of a parameter better than using a simple int. So there are two tasks to be done: - Look for inconsistencies and fix them. - Add and use TCGReg for all other targets, too. Or are there any reasons why we should keep the int type parameters? Regards, Stefan Weil
Re: [Qemu-devel] [PATCH 2/2] LAPIC: make lapic support cpu hotplug
On 2011-10-04 13:13, pingf...@linux.vnet.com wrote: > From: Liu Ping Fan > > Separate apic from qbus to icc bus which supports hotplug feature. Modeling the ICC bus looks like a step in the right direction. The IOAPIC could be attached to it as well to get rid of "ioapics[MAX_IOAPICS]". > And make the creation of apic as part of cpu initialization, so > apic's state has been ready, before setting kvm_apic. There is no kvm-apic upstream yet, so it's hard to judge why we need this here. If we do, this has to be a separate patch. But I seriously doubt we need it (my hack worked without it, and that was not because of its hack nature). > > Signed-off-by: Liu Ping Fan > --- > Makefile.target |1 + > hw/apic.c|7 - > hw/apic.h|1 + > hw/icc_bus.c | 62 > ++ > hw/icc_bus.h | 24 +++ > hw/pc.c | 11 - > target-i386/cpu.h|1 + > target-i386/helper.c |7 +- > target-i386/kvm.c|1 - > 9 files changed, 105 insertions(+), 10 deletions(-) > create mode 100644 hw/icc_bus.c > create mode 100644 hw/icc_bus.h > > diff --git a/Makefile.target b/Makefile.target > index 9011f28..5607c6d 100644 > --- a/Makefile.target > +++ b/Makefile.target > @@ -241,6 +241,7 @@ obj-i386-$(CONFIG_KVM) += kvmclock.o > obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o > obj-i386-y += testdev.o > obj-i386-y += acpi.o acpi_piix4.o > +obj-i386-y += icc_bus.o > > obj-i386-y += pcspk.o i8254.o > obj-i386-$(CONFIG_KVM_PIT) += i8254-kvm.o > diff --git a/hw/apic.c b/hw/apic.c > index 69d6ac5..95a1664 100644 > --- a/hw/apic.c > +++ b/hw/apic.c > @@ -24,6 +24,7 @@ > #include "sysbus.h" > #include "trace.h" > #include "kvm.h" > +#include "icc_bus.h" > > /* APIC Local Vector Table */ > #define APIC_LVT_TIMER 0 > @@ -304,11 +305,13 @@ void cpu_set_apic_base(DeviceState *d, uint64_t val) > > if (!s) > return; > + > if (kvm_enabled() && kvm_irqchip_in_kernel()) > s->apicbase = val; > else > s->apicbase = (val & 0xf000) | > (s->apicbase & (MSR_IA32_APICBASE_BSP | > MSR_IA32_APICBASE_ENABLE)); > + > /* if disabled, cannot be enabled again */ > if (!(val & MSR_IA32_APICBASE_ENABLE)) { > s->apicbase &= ~MSR_IA32_APICBASE_ENABLE; > @@ -1075,7 +1078,7 @@ static const VMStateDescription vmstate_apic = { > } > }; > > -static void apic_reset(DeviceState *d) > +void apic_reset(DeviceState *d) Still unused outside of this file, so keep it private. > { > APICState *s = DO_UPCAST(APICState, busdev.qdev, d); > int bsp; > @@ -1138,7 +1141,7 @@ static SysBusDeviceInfo apic_info = { > > static void apic_register_devices(void) > { > -sysbus_register_withprop(&apic_info); > +iccbus_register(&apic_info); > } > > device_init(apic_register_devices) > diff --git a/hw/apic.h b/hw/apic.h > index c857d52..e258efa 100644 > --- a/hw/apic.h > +++ b/hw/apic.h > @@ -24,5 +24,6 @@ void apic_sipi(DeviceState *s); > /* pc.c */ > int cpu_is_bsp(CPUState *env); > DeviceState *cpu_get_current_apic(void); > +void apic_reset(DeviceState *d); > > #endif > diff --git a/hw/icc_bus.c b/hw/icc_bus.c > new file mode 100644 > index 000..360ca2a > --- /dev/null > +++ b/hw/icc_bus.c > @@ -0,0 +1,62 @@ > +/* > +*/ > +#define ICC_BUS_PLUG > +#ifdef ICC_BUS_PLUG > +#include "icc_bus.h" > + > + > + > +struct icc_bus_info icc_info = { > +.qinfo.name = "icc", > +.qinfo.size = sizeof(struct icc_bus), > +.qinfo.props = (Property[]) { > +DEFINE_PROP_END_OF_LIST(), > +} > + > +}; > + > + > +static const VMStateDescription vmstate_icc_bus = { > +.name = "icc_bus", > +.version_id = 1, > +.minimum_version_id = 1, > +.minimum_version_id_old = 1, > +.pre_save = NULL, > +.post_load = NULL, > +}; > + > +struct icc_bus *g_iccbus; > + > +struct icc_bus *icc_init_bus(DeviceState *parent, const char *name) > +{ > +struct icc_bus *bus; > + > +bus = FROM_QBUS(icc_bus, qbus_create(&icc_info.qinfo, parent, name)); > +bus->qbus.allow_hotplug = 1; /* Yes, we can */ > +bus->qbus.name = "icc"; > +vmstate_register(NULL, -1, &vmstate_icc_bus, bus); The chipset is the owner of this bus and instantiates it. So it also provides a vmstate. You can drop this unneeded one here (it's created via an obsolete API anyway). > +g_iccbus = bus; > +return bus; > +} > + > + > + > + > + > +static int iccbus_device_init(DeviceState *dev, DeviceInfo *base) > +{ > +SysBusDeviceInfo *info = container_of(base, SysBusDeviceInfo, qdev); > + > +return info->init(sysbus_from_qdev(dev)); > +} > + > +void iccbus_register(SysBusDeviceInfo *info) > +{ > +info->qdev.init = iccbus_device_init; > +info->qdev.bus_info = &icc_info.qinfo; > + > +assert(info->qdev.size >= sizeof(SysBusDevice)); > +qdev_register(&info->qdev); > +} This
Re: [Qemu-devel] [Xen-devel] [PATCH RFC V1 00/11] Xen PCI Passthrough
On Tue, 4 Oct 2011, Alex Williamson wrote: > On Tue, 2011-10-04 at 16:05 +0100, Stefano Stabellini wrote: > > On Tue, 4 Oct 2011, Anthony Liguori wrote: > > > On 10/04/2011 09:58 AM, Avi Kivity wrote: > > > > On 10/04/2011 04:51 PM, Anthony PERARD wrote: > > > >> Hi all, > > > >> > > > >> This patch series introduce the PCI passthrough for Xen. > > > >> > > > >> First, we have HostPCIDevice that help to access one PCI device of the > > > >> host. > > > >> > > > >> Then, there are several additions in the QEMU code. One is > > > >> qemu_run_one_timer > > > >> to run a specific timer. It is used by PCI passthrough to run a timer > > > >> about > > > >> power management. Another is pci_check_bar_overlap. > > > >> > > > >> There are also several change in pci_ids and pci_regs. > > > >> > > > >> Last part, but not least, the PCI passthrough device himself. Cut in 3 > > > >> parts > > > >> (or file), there is one to take care of the initialisation of a > > > >> passthrough > > > >> device. The second one handle everything about the config address > > > >> space, there > > > >> are specifics functions for every config register. The third one is to > > > >> handle > > > >> MSI. > > > >> > > > >> I'm still working on setting a PCI passthrough device through QMP from > > > >> libxl > > > >> (xen tool stack). It is just a call to device_add, with the driver > > > >> parametter > > > >> hostaddr=":00:1b.0". > > > >> > > > >> There is some minor things missing: > > > >> - copyright header > > > >> - PCI IO space multiplexer > > > >> > > > >> > > > > > > > > We also have pci passthrough in qemu-kvm (I think based on the same > > > > Neocleus > > > > code). Rather than having two pci assignment implementations, I think > > > > we should > > > > have just one, with the differences (programming the hypervisor) > > > > abstracted at > > > > that level. > > > > > > I agree in principle but how close is qemu-kvm pci passthrough to a > > > mergable > > > state? Would it make sense to merge the Xen code first and then abstract > > > it? > > > > I think it should be fairly easy to abstract the current xen code: just > > a matter of providing memory, ioport and interrupt mapping functions. > > I thought we were potentially looking at vfio as a convergence point. > I'm still a bit off from having a vfio re-write ready to submit, but is > this still a possibility? Thanks, As I said at KVM Forum 2011, I am OK with vfio: it is a nice thing to have in the long term. However having seen how much time it took to make this code work reliably, I think is going to take some time before we have a stable vfio implementation. Considering that we have just introduced upstream qemu to xen-unstable, I would like to reach feature parity with the old qemu-xen sooner rather than later, so that we can kill the qemu-xen fork once and for all. In order to reach feature parity we need PCI passthrough. Once that is done, we could gradually modify this code to introduce vfio support, throwing away what is not necessary and keeping what can be reused. If you are interested, we can easily make this code work for KVM too, otherwise we can keep it Xen specific and wait for vfio to be ready.
Re: [Qemu-devel] [Xen-devel] [PATCH RFC V1 00/11] Xen PCI Passthrough
On 10/04/2011 06:33 PM, Alex Williamson wrote: I thought we were potentially looking at vfio as a convergence point. I'm still a bit off from having a vfio re-write ready to submit, but is this still a possibility? Thanks, vfio leaves out users of current and past kernels; relying on it would force me to maintain qemu-kvm for a while. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [Xen-devel] [PATCH RFC V1 00/11] Xen PCI Passthrough
On 10/04/2011 05:01 PM, Anthony Liguori wrote: We also have pci passthrough in qemu-kvm (I think based on the same Neocleus code). Rather than having two pci assignment implementations, I think we should have just one, with the differences (programming the hypervisor) abstracted at that level. I agree in principle but how close is qemu-kvm pci passthrough to a mergable state? Would it make sense to merge the Xen code first and then abstract it? Merging either implementation and abstracting it would risk regressions in the other. How about merging both, with the ABIs (command line and qmp) tagged as experimental, and then doing a merge in the same style as i386+x86_64->x86 or the two kvm implementations in qemu? We can pick one implementation as the merge target and port fixes from the other. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH] i386: wire up MSR_IA32_MISC_ENABLE
On 10/04/2011 06:05 PM, Jan Kiszka wrote: On 2011-10-04 16:26, Avi Kivity wrote: > It's needed for its default value - bit 0 specifies that "rep movs" is > good enough for memcpy, and Linux may use a slower memcpu if it is not set, > depending on cpu family/model. > > Signed-off-by: Avi Kivity > --- > target-i386/cpu.h |5 + > target-i386/helper.c|1 + > target-i386/kvm.c | 15 +++ > target-i386/machine.c | 21 + > target-i386/op_helper.c |6 ++ > 5 files changed, 48 insertions(+), 0 deletions(-) > > diff --git a/target-i386/cpu.h b/target-i386/cpu.h > index ae36489..5416809 100644 > --- a/target-i386/cpu.h > +++ b/target-i386/cpu.h > @@ -299,6 +299,10 @@ > > #define MSR_IA32_PERF_STATUS0x198 > > +#define MSR_IA32_MISC_ENABLE 0x1a0 I smell tabs... Oops. Cut'n'paste flew underneath the emacs radar. > + > +static const VMStateDescription vmstate_msr_ia32_misc_enable = { > +.name = "cpu/msr_ia32_misc_enable", > +.version_id = 1, > +.minimum_version_id = 1, > +.minimum_version_id_old = 1, > +.fields = (VMStateField []) { > +VMSTATE_UINT64(msr_ia32_misc_enable, CPUState), > +VMSTATE_END_OF_LIST() > +} > +}; > + We are about to bump the CPU_SAVE_VERSION for the sake of APIC deadline timer, so you can jump on that train and avoid this subsection. Must we do that? Considering that no guest will use the deadline timer, it seems to be an excellent candidates for subsections. > diff --git a/target-i386/op_helper.c b/target-i386/op_helper.c > index 3bb5a91..c89e4a4 100644 > --- a/target-i386/op_helper.c > +++ b/target-i386/op_helper.c > @@ -3280,6 +3280,9 @@ void helper_wrmsr(void) > case MSR_TSC_AUX: > env->tsc_aux = val; > break; > +case MSR_IA32_MISC_ENABLE: > +env->msr_ia32_misc_enable = val; > +break; This MSR is Intel-specific, isn't it? Then I guess it should be limited to Intel CPU types. It's an "architectural MSR" that is only available on some Intel models. Either we do a full cpuid qualification of accessible MSRs (and bits within MSRs), or not. Qualifying just by vendor ID is pointless. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH] i386: wire up MSR_IA32_MISC_ENABLE
On 2011-10-04 19:08, Avi Kivity wrote: > On 10/04/2011 06:05 PM, Jan Kiszka wrote: >> On 2011-10-04 16:26, Avi Kivity wrote: >> > It's needed for its default value - bit 0 specifies that "rep movs" is >> > good enough for memcpy, and Linux may use a slower memcpu if it is >> not set, >> > depending on cpu family/model. >> > >> > Signed-off-by: Avi Kivity >> > --- >> > target-i386/cpu.h |5 + >> > target-i386/helper.c|1 + >> > target-i386/kvm.c | 15 +++ >> > target-i386/machine.c | 21 + >> > target-i386/op_helper.c |6 ++ >> > 5 files changed, 48 insertions(+), 0 deletions(-) >> > >> > diff --git a/target-i386/cpu.h b/target-i386/cpu.h >> > index ae36489..5416809 100644 >> > --- a/target-i386/cpu.h >> > +++ b/target-i386/cpu.h >> > @@ -299,6 +299,10 @@ >> > >> > #define MSR_IA32_PERF_STATUS0x198 >> > >> > +#define MSR_IA32_MISC_ENABLE0x1a0 >> >> I smell tabs... > > Oops. Cut'n'paste flew underneath the emacs radar. > >> > + >> > +static const VMStateDescription vmstate_msr_ia32_misc_enable = { >> > +.name = "cpu/msr_ia32_misc_enable", >> > +.version_id = 1, >> > +.minimum_version_id = 1, >> > +.minimum_version_id_old = 1, >> > +.fields = (VMStateField []) { >> > +VMSTATE_UINT64(msr_ia32_misc_enable, CPUState), >> > +VMSTATE_END_OF_LIST() >> > +} >> > +}; >> > + >> >> We are about to bump the CPU_SAVE_VERSION for the sake of APIC deadline >> timer, so you can jump on that train and avoid this subsection. > > Must we do that? Considering that no guest will use the deadline timer, > it seems to be an excellent candidates for subsections. I don't know, it was sent out for pull like that. And I thought subsections are still broken, aren't they? > >> > diff --git a/target-i386/op_helper.c b/target-i386/op_helper.c >> > index 3bb5a91..c89e4a4 100644 >> > --- a/target-i386/op_helper.c >> > +++ b/target-i386/op_helper.c >> > @@ -3280,6 +3280,9 @@ void helper_wrmsr(void) >> > case MSR_TSC_AUX: >> > env->tsc_aux = val; >> > break; >> > +case MSR_IA32_MISC_ENABLE: >> > +env->msr_ia32_misc_enable = val; >> > +break; >> >> This MSR is Intel-specific, isn't it? Then I guess it should be limited >> to Intel CPU types. > > It's an "architectural MSR" that is only available on some Intel > models. Either we do a full cpuid qualification of accessible MSRs (and > bits within MSRs), or not. Qualifying just by vendor ID is pointless. Given that, when in conflict, we rather model after AMD than Intel for TCG, I would hesitate to expose this by default. Or are there precedences already? Jan signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH] i386: wire up MSR_IA32_MISC_ENABLE
On 10/04/2011 07:14 PM, Jan Kiszka wrote: > >> > + >> > +static const VMStateDescription vmstate_msr_ia32_misc_enable = { >> > +.name = "cpu/msr_ia32_misc_enable", >> > +.version_id = 1, >> > +.minimum_version_id = 1, >> > +.minimum_version_id_old = 1, >> > +.fields = (VMStateField []) { >> > +VMSTATE_UINT64(msr_ia32_misc_enable, CPUState), >> > +VMSTATE_END_OF_LIST() >> > +} >> > +}; >> > + >> >> We are about to bump the CPU_SAVE_VERSION for the sake of APIC deadline >> timer, so you can jump on that train and avoid this subsection. > > Must we do that? Considering that no guest will use the deadline timer, > it seems to be an excellent candidates for subsections. I don't know, it was sent out for pull like that. And I thought subsections are still broken, aren't they? Well let's fix subsections instead of disabling migration to older versions. >> >> This MSR is Intel-specific, isn't it? Then I guess it should be limited >> to Intel CPU types. > > It's an "architectural MSR" that is only available on some Intel > models. Either we do a full cpuid qualification of accessible MSRs (and > bits within MSRs), or not. Qualifying just by vendor ID is pointless. Given that, when in conflict, we rather model after AMD than Intel for TCG, I would hesitate to expose this by default. Or are there precedences already? Practically all MSRs. i486 doesn't have any, IIRC, for example. (and given this MSR has no effect, the only difference it makes to guests is the #GP we take or not; still it may be worthwhile to construct some table-driven thing to allow or reject MSR accesses, both for kvm and qemu) -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [Xen-devel] [PATCH RFC V1 00/11] Xen PCI Passthrough
On 2011-10-04 17:01, Anthony Liguori wrote: > On 10/04/2011 09:58 AM, Avi Kivity wrote: >> On 10/04/2011 04:51 PM, Anthony PERARD wrote: >>> Hi all, >>> >>> This patch series introduce the PCI passthrough for Xen. >>> >>> First, we have HostPCIDevice that help to access one PCI device of >>> the host. >>> >>> Then, there are several additions in the QEMU code. One is >>> qemu_run_one_timer >>> to run a specific timer. It is used by PCI passthrough to run a timer >>> about >>> power management. Another is pci_check_bar_overlap. >>> >>> There are also several change in pci_ids and pci_regs. >>> >>> Last part, but not least, the PCI passthrough device himself. Cut in >>> 3 parts >>> (or file), there is one to take care of the initialisation of a >>> passthrough >>> device. The second one handle everything about the config address >>> space, there >>> are specifics functions for every config register. The third one is >>> to handle >>> MSI. >>> >>> I'm still working on setting a PCI passthrough device through QMP >>> from libxl >>> (xen tool stack). It is just a call to device_add, with the driver >>> parametter >>> hostaddr=":00:1b.0". >>> >>> There is some minor things missing: >>> - copyright header >>> - PCI IO space multiplexer >>> >>> >> >> We also have pci passthrough in qemu-kvm (I think based on the same >> Neocleus >> code). Rather than having two pci assignment implementations, I think >> we should >> have just one, with the differences (programming the hypervisor) >> abstracted at >> that level. > > I agree in principle but how close is qemu-kvm pci passthrough to a > mergable state? Would it make sense to merge the Xen code first and > then abstract it? What is missing to get qemu-kvm device assignment ready: - MSI/MSI-X refactoring in QEMU (specifically config notifiers, also relevant for virtio) - switch device assignment to generic MSI/MSI-X - get in-kernel irqchip support upstream - minor code cleanups I was hoping to complete this for 1.0, but it looks unrealistic now. Maybe I'll find the time for the MSI stuff at least. From a first glance at these patches, I think there are already some synergies in the host-pci access layer. And I bet the Xen bits could also make use of our MSI/MSI-X code once it's generalized. But that will be a bit work for them (based on my experience with qemu-kvm). Jan signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH 2/2] hda: do not mix output and input stream states, RHBZ #740493
"Marc-André Lureau" wrote: > Windows 7 may use the same stream number for input and output. > Current code will confuse streams. > > NB: I wonder if this patch breaks migration code because of > this change: > -VMSTATE_BOOL_ARRAY(running, HDAAudioState, 16), > +VMSTATE_BOOL_ARRAY(running, HDAAudioState, 2 * 16), It does :-( Two questions to know how to handle this. a) My understanding of this patch is that we move from an array of 16 bools representing anything to one array where the 1st 16 represent if there are input and the 2nd 16's reprosenting if there are output for that channel. So, what we should do if we migrate from one old version that only has 16 bools? My understanding is that copying directly is not gonna work? b) Just in case it makes "compatibility" easier. Is the running array normally all false or similar? If so, we could just sent the other 16th values only if they are different of false. (if this is true, it is interesting to put in the 1st slots the ones that are more probable to be true). Later, Juan.
Re: [Qemu-devel] [PATCH] i386: wire up MSR_IA32_MISC_ENABLE
On 2011-10-04 19:21, Avi Kivity wrote: > On 10/04/2011 07:14 PM, Jan Kiszka wrote: >> > >> >> > + >> >> > +static const VMStateDescription vmstate_msr_ia32_misc_enable = { >> >> > +.name = "cpu/msr_ia32_misc_enable", >> >> > +.version_id = 1, >> >> > +.minimum_version_id = 1, >> >> > +.minimum_version_id_old = 1, >> >> > +.fields = (VMStateField []) { >> >> > +VMSTATE_UINT64(msr_ia32_misc_enable, CPUState), >> >> > +VMSTATE_END_OF_LIST() >> >> > +} >> >> > +}; >> >> > + >> >> >> >> We are about to bump the CPU_SAVE_VERSION for the sake of APIC >> deadline >> >> timer, so you can jump on that train and avoid this subsection. >> > >> > Must we do that? Considering that no guest will use the deadline >> timer, >> > it seems to be an excellent candidates for subsections. >> >> I don't know, it was sent out for pull like that. And I thought >> subsections are still broken, aren't they? > > Well let's fix subsections instead of disabling migration to older > versions. > >> >> >> >> This MSR is Intel-specific, isn't it? Then I guess it should be >> limited >> >> to Intel CPU types. >> > >> > It's an "architectural MSR" that is only available on some Intel >> > models. Either we do a full cpuid qualification of accessible MSRs >> (and >> > bits within MSRs), or not. Qualifying just by vendor ID is pointless. >> >> Given that, when in conflict, we rather model after AMD than Intel for >> TCG, I would hesitate to expose this by default. Or are there >> precedences already? > > Practically all MSRs. i486 doesn't have any, IIRC, for example. Pre-Pentiums don't have instructions to access them as well, so that doesn't cause any harm. > > (and given this MSR has no effect, the only difference it makes to > guests is the #GP we take or not; still it may be worthwhile to > construct some table-driven thing to allow or reject MSR accesses, both > for kvm and qemu) Right. If this MSR is not the first bogus one on AMD, we can do this later. If it is, it should be done first. Jan signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH] Make cpu_single_env thread local (Linux only for now)
On 2011-10-04 17:10, Lluís Vilanova wrote: > Jan Kiszka writes: > >> On 2011-10-03 18:33, Dr. David Alan Gilbert wrote: >>> Make cpu_single_env thread local (Linux only for now) >>> * Fixes some user space threading issues (esp those triggered >>> by bug 823902) >>> >>> Against rev d11cf8cc..., tested on ARM user mode, and ARM Vexpress >>> system mode (with Blue Swirl's fix from yesterday) - only >>> tested on Linux host. Lets me run ARM userspace firefox. >>> >>> Signed-off-by: Dr. David Alan Gilbert >>> >>> diff --git a/cpu-all.h b/cpu-all.h >>> index 42a5fa0..d895ee6 100644 >>> --- a/cpu-all.h >>> +++ b/cpu-all.h >>> @@ -334,7 +334,13 @@ void cpu_dump_statistics(CPUState *env, FILE *f, >>> fprintf_function cpu_fprintf, >>> void QEMU_NORETURN cpu_abort(CPUState *env, const char *fmt, ...) >>> GCC_FMT_ATTR(2, 3); >>> extern CPUState *first_cpu; >>> + >>> +#ifdef __linux__ >>> +/* DAG: Only tested thread local on Linux, feel free to add others */ >>> +extern __thread CPUState *cpu_single_env; >>> +#else >>> extern CPUState *cpu_single_env; >>> +#endif > >> We need this for all platforms in order to skip qemu_global_mutex while >> manipulating some CPUState. And leaving some platforms with non-TLS will >> eventually break them when code is added that assumes TLS. > >> However, it's not unlikely that some weird platforms / ancient >> toolchains still have problems with __thread - even on Linux. We may >> want to play safe and use pthread_key on POSIX. > > Why not make this kind of annotations available in qemu-commmon.h; or even > better somewhere less... "generic". Yes, we will need some encapsulation for this. > > #if defined(CONFIG_SUPPORTS_THREAD_KEYWORD) > > // use __thread > #define QEMU_DECL_TLS(type, name) \ > extern __thread type name; > #define QEMU_DEF_TLS(type, name) \ > __thread type name; > #define QEMU_SET_TLS(name, value) \ > do { name = value; } while (0) > #define QEMU_GET_TLS(name) \ > name > > #elif defined(CONFIG_SUPPORTS_PTHREAD_KEY) > > // use pthread_key_t > #define QEMU_DECL_TLS(type, name) \ > extern type _type_##name; \ > extern pthread_key_t name; > #define QEMU_DEF_TLS(type, name) \ > pthread_key_t name; \ > void _init_##name (void) __attribute__((constructor)); \ > void _init_##name (void) { pthread_key_create(&name, NULL); } > #define QEMU_SET_TLS(name, value) \ > do { pthread_setspecific(name, (void*)value); } while (0) > #define QEMU_GET_TLS(name) \ > ((typeof(_type_##name))pthread_getspecific(name)) > > #else > #error Go home > #endif Looks like a start. But I would avoid macros and go for (static inline) functions where possible. And initialization should be explicit (so that you can start using TLS already inside constructors). Jan signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH RFC V1 02/11] qemu-timer: Introduce qemu_run_one_timer
On 2011-10-04 16:51, Anthony PERARD wrote: > Used by the Xen PCI Passthrough code to run the timer about the power > state transition. > > Signed-off-by: Anthony PERARD > --- > qemu-timer.c | 15 +++ > qemu-timer.h |3 +++ > 2 files changed, 18 insertions(+), 0 deletions(-) > > diff --git a/qemu-timer.c b/qemu-timer.c > index 46dd483..15e659b 100644 > --- a/qemu-timer.c > +++ b/qemu-timer.c > @@ -1163,3 +1163,18 @@ int qemu_calculate_timeout(void) > return 1000; > } > > +/* run the specified timer */ > +void qemu_run_one_timer(QEMUTimer *ts) > +{ > +uint64_t current_time; > + > +/* remove timer from the list before calling the callback */ > +qemu_del_timer(ts); > + > +while ((current_time = qemu_get_clock_ms(rt_clock)) < ts->expire_time) > +/* sleep until the expire time */ > +usleep((ts->expire_time - current_time) * 1000); > + > +/* run the callback */ > +ts->cb(ts->opaque); > +} This looks funny. I can't imagine that this could ever fit into the standard (asynchronous) QEMU execution model for I/O. Keep it private to Xen? Jan signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [Xen-devel] [PATCH RFC V1 00/11] Xen PCI Passthrough
On 2011-10-04 19:01, Avi Kivity wrote: > On 10/04/2011 06:33 PM, Alex Williamson wrote: >> I thought we were potentially looking at vfio as a convergence point. >> I'm still a bit off from having a vfio re-write ready to submit, but is >> this still a possibility? Thanks, >> > > vfio leaves out users of current and past kernels; relying on it would > force me to maintain qemu-kvm for a while. VFIO is a great thing, and we all want it. But it's also a big project now as it has to please more and more use cases. I think it will require some more time to settle its interfaces, and it should likely take this time. So we need KVM device assignment upstream as well until comparable VFIO support is in every recent distro kernel out there. Jan signature.asc Description: OpenPGP digital signature