date:20121218

Re: [Qemu-devel] [PATCH v2 1/4] usb/ehci: Clean up SysBus and PCI EHCI split

2012-12-18 Thread Gerd Hoffmann

  Hi,

>> I fail to see the point.  EHCIPCIState should not be needed outside of
>> hcd-ehci-pci.c and I'd prefer to leave it there.  Likewise for sysbus.
> 
> It is exactly what I commented on my v1 for needing a v2 and you seemed
> to concur... In C, to embed a struct in another struct the compiler
> needs the full struct definition (compare i440fx, prep_pci series) and
> it thus needs to be in an #include'able header.

Sure.

> @@ -115,6 +115,9 @@ typedef struct Tegra2State {
>  TegraClocksState clocks;
>  SDHCIState sdhci[4];
>  TegraI2CState i2c[4];
> +#if 0
> +EHCISysBusState usb[3];
> +#endif
>  } Tegra2State;

Ah, *that* is the place where you need it (outside hcd-ehci-sysbus.c).
Makes sense indeed.

I'll go put the bits as-is into the usb queue.

cheers,
  Gerd
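
The embedding argument above can be shown with a minimal sketch. The names ChildState, OpaqueState, BoardState and board_init are hypothetical stand-ins chosen for illustration, not QEMU types; the only point is that a by-value member needs the complete struct definition, while a pointer member gets by with a forward declaration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical child device state. To be embedded elsewhere, this full
 * definition has to be visible, i.e. live in an #include'able header. */
typedef struct ChildState {
    int irq;
    unsigned regs[4];
} ChildState;

/* Forward declaration only: enough for a pointer, not for embedding. */
typedef struct OpaqueState OpaqueState;

/* Hypothetical board state embedding the child by value. */
typedef struct BoardState {
    ChildState usb[3];     /* requires the complete type ChildState */
    OpaqueState *other;    /* a pointer works with an incomplete type */
} BoardState;

/* Initialise the embedded children in place; no allocation is needed. */
static void board_init(BoardState *s)
{
    for (size_t i = 0; i < sizeof(s->usb) / sizeof(s->usb[0]); i++) {
        s->usb[i].irq = (int)i;
    }
    s->other = NULL;
}
```

Changing `OpaqueState *other` into `OpaqueState other` would fail to compile, because the compiler cannot size an incomplete type.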

[Qemu-devel] [PATCH qom-cpu 1/4] cpu: Introduce CPUListState struct

2012-12-18 Thread Andreas Färber

This generalizes {ARM,M68k,Alpha}CPUListState to avoid declaring it for
each target.

Signed-off-by: Andreas Färber 
---
 include/qemu/cpu.h   |   12 
 target-alpha/cpu.c   |9 ++---
 target-arm/helper.c  |9 ++---
 target-m68k/helper.c |9 ++---
 4 files changed, 18 insertions(+), 21 deletions(-)

diff --git a/include/qemu/cpu.h b/include/qemu/cpu.h
index 61b7698..5fbb3f9 100644
--- a/include/qemu/cpu.h
+++ b/include/qemu/cpu.h
@@ -21,6 +21,7 @@
 #define QEMU_CPU_H
 
 #include "qemu/object.h"
+#include "qemu-common.h"
 #include "qemu-thread.h"
 
 /**
@@ -80,6 +81,17 @@ struct CPUState {
 /* TODO Move common fields from CPUArchState here. */
 };
 
+/**
+ * CPUListState:
+ * @cpu_fprintf: Print function.
+ * @file: File to print to using @cpu_fprintf.
+ *
+ * State commonly used for iterating over CPU models.
+ */
+typedef struct CPUListState {
+fprintf_function cpu_fprintf;
+FILE *file;
+} CPUListState;
 
 /**
  * cpu_reset:
diff --git a/target-alpha/cpu.c b/target-alpha/cpu.c
index d065085..915278f 100644
--- a/target-alpha/cpu.c
+++ b/target-alpha/cpu.c
@@ -33,11 +33,6 @@ static void alpha_cpu_realize(Object *obj, Error **err)
 #endif
 }
 
-typedef struct AlphaCPUListState {
-fprintf_function cpu_fprintf;
-FILE *file;
-} AlphaCPUListState;
-
 /* Sort alphabetically by type name. */
 static gint alpha_cpu_list_compare(gconstpointer a, gconstpointer b)
 {
@@ -53,7 +48,7 @@ static gint alpha_cpu_list_compare(gconstpointer a, gconstpointer b)
 static void alpha_cpu_list_entry(gpointer data, gpointer user_data)
 {
 ObjectClass *oc = data;
-AlphaCPUListState *s = user_data;
+CPUListState *s = user_data;
 
 (*s->cpu_fprintf)(s->file, "  %s\n",
   object_class_get_name(oc));
@@ -61,7 +56,7 @@ static void alpha_cpu_list_entry(gpointer data, gpointer user_data)
 
 void alpha_cpu_list(FILE *f, fprintf_function cpu_fprintf)
 {
-AlphaCPUListState s = {
+CPUListState s = {
 .file = f,
 .cpu_fprintf = cpu_fprintf,
 };
diff --git a/target-arm/helper.c b/target-arm/helper.c
index ab8b734..d2f2fb4 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -1291,11 +1291,6 @@ ARMCPU *cpu_arm_init(const char *cpu_model)
 return cpu;
 }
 
-typedef struct ARMCPUListState {
-fprintf_function cpu_fprintf;
-FILE *file;
-} ARMCPUListState;
-
 /* Sort alphabetically by type name, except for "any". */
 static gint arm_cpu_list_compare(gconstpointer a, gconstpointer b)
 {
@@ -1317,7 +1312,7 @@ static gint arm_cpu_list_compare(gconstpointer a, gconstpointer b)
 static void arm_cpu_list_entry(gpointer data, gpointer user_data)
 {
 ObjectClass *oc = data;
-ARMCPUListState *s = user_data;
+CPUListState *s = user_data;
 
 (*s->cpu_fprintf)(s->file, "  %s\n",
   object_class_get_name(oc));
@@ -1325,7 +1320,7 @@ static void arm_cpu_list_entry(gpointer data, gpointer user_data)
 
 void arm_cpu_list(FILE *f, fprintf_function cpu_fprintf)
 {
-ARMCPUListState s = {
+CPUListState s = {
 .file = f,
 .cpu_fprintf = cpu_fprintf,
 };
diff --git a/target-m68k/helper.c b/target-m68k/helper.c
index a5d0100..875a71a 100644
--- a/target-m68k/helper.c
+++ b/target-m68k/helper.c
@@ -25,11 +25,6 @@
 
 #define SIGNBIT (1u << 31)
 
-typedef struct M68kCPUListState {
-fprintf_function cpu_fprintf;
-FILE *file;
-} M68kCPUListState;
-
 /* Sort alphabetically, except for "any". */
 static gint m68k_cpu_list_compare(gconstpointer a, gconstpointer b)
 {
@@ -51,7 +46,7 @@ static gint m68k_cpu_list_compare(gconstpointer a, gconstpointer b)
 static void m68k_cpu_list_entry(gpointer data, gpointer user_data)
 {
 ObjectClass *c = data;
-M68kCPUListState *s = user_data;
+CPUListState *s = user_data;
 
 (*s->cpu_fprintf)(s->file, "%s\n",
   object_class_get_name(c));
@@ -59,7 +54,7 @@ static void m68k_cpu_list_entry(gpointer data, gpointer user_data)
 
 void m68k_cpu_list(FILE *f, fprintf_function cpu_fprintf)
 {
-M68kCPUListState s = {
+CPUListState s = {
 .file = f,
 .cpu_fprintf = cpu_fprintf,
 };
-- 
1.7.10.4
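
For readers skimming the diff, here is a minimal standalone sketch of the pattern the patch generalizes: one shared state struct handed to a per-entry callback through a void pointer. The struct and the typedef mirror the patch (without the GCC format attribute); example_cpu_list_entry and example_cpu_list are hypothetical names, and a plain array stands in for the GLib list the real targets iterate over.

```c
#include <assert.h>
#include <stdio.h>

/* Same shape as the typedef in the patch, minus GCC_FMT_ATTR. */
typedef int (*fprintf_function)(FILE *f, const char *fmt, ...);

/* State commonly used for iterating over CPU models. */
typedef struct CPUListState {
    fprintf_function cpu_fprintf;
    FILE *file;
} CPUListState;

/* Per-entry callback: receives the shared state through a void pointer,
 * the way a foreach-style callback would. */
static void example_cpu_list_entry(const void *data, void *user_data)
{
    const char *name = data;
    CPUListState *s = user_data;

    (*s->cpu_fprintf)(s->file, "  %s\n", name);
}

/* Hypothetical target-independent lister over a plain array of names.
 * Returns the number of entries printed. */
static int example_cpu_list(FILE *f, fprintf_function cpu_fprintf,
                            const char *const *names, int count)
{
    CPUListState s = {
        .file = f,
        .cpu_fprintf = cpu_fprintf,
    };
    int i;

    for (i = 0; i < count; i++) {
        example_cpu_list_entry(names[i], &s);
    }
    return i;
}
```

The standard fprintf can be passed directly as the print function, e.g. example_cpu_list(stdout, fprintf, names, 2).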

[Qemu-devel] [PATCH RFC qom-cpu 2/4] qemu-common.h: Move fprintf_function to qemu-types.h

2012-12-18 Thread Andreas Färber

This avoids a dependency on qemu-common.h from qemu/cpu.h.

Signed-off-by: Andreas Färber 
---
 include/qemu/cpu.h |2 +-
 qemu-common.h  |5 -
 qemu-types.h   |6 ++
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/include/qemu/cpu.h b/include/qemu/cpu.h
index 5fbb3f9..d737fcd 100644
--- a/include/qemu/cpu.h
+++ b/include/qemu/cpu.h
@@ -21,7 +21,7 @@
 #define QEMU_CPU_H
 
 #include "qemu/object.h"
-#include "qemu-common.h"
+#include "qemu-types.h"
 #include "qemu-thread.h"
 
 /**
diff --git a/qemu-common.h b/qemu-common.h
index e674786..98ab78d 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -12,7 +12,6 @@
 #ifndef QEMU_COMMON_H
 #define QEMU_COMMON_H
 
-#include "compiler.h"
 #include "config-host.h"
 #include "qemu-types.h"
 
@@ -24,7 +23,6 @@
 
 /* we put basic includes here to avoid repeating them in device drivers */
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -95,9 +93,6 @@ struct iovec {
 #include 
 #endif
 
-typedef int (*fprintf_function)(FILE *f, const char *fmt, ...)
-GCC_FMT_ATTR(2, 3);
-
 #ifdef _WIN32
 #define fsync _commit
 #if !defined(lseek)
diff --git a/qemu-types.h b/qemu-types.h
index fd532a2..f7a7194 100644
--- a/qemu-types.h
+++ b/qemu-types.h
@@ -1,6 +1,12 @@
 #ifndef QEMU_TYPEDEFS_H
 #define QEMU_TYPEDEFS_H
 
+#include "compiler.h"
+#include 
+
+typedef int (*fprintf_function)(FILE *f, const char *fmt, ...)
+GCC_FMT_ATTR(2, 3);
+
 /* A load of opaque types so that device init declarations don't have to
pull in all the real definitions.  */
 typedef struct QEMUTimer QEMUTimer;
-- 
1.7.10.4

[Qemu-devel] [PATCH qom-cpu 0/4] CPU cleanup and PPC subclasses

2012-12-18 Thread Andreas Färber

Hello,

This series starts with unifying the various structs for -cpu ? implementation.

I'm guessing the second patch will be necessary for CPU-as-a-device,
but it breaks the paradigm of having only typedefs in qemu-types.h.

Based on that, by demand from David, here's a quick and dirty introduction of
CPU subclasses as proposed some time ago. It's been redone, so no change log.
My proposal is to leave ppc_def_t in place for now, adding a pointer to it in
the CPU class for instance_init and for David.

Plus, it seems that my "POWER5+ (gs)" CPU is #ifdef TODO'ed out, so, lacking an
immediate fix for how to fall back to another CPU model, I'm proposing an error.

The series is based on the current qom-cpu queue and will need to be slightly
rebased when I apply the KVM CPUState series.

Regards,
Andreas

Cc: Eduardo Habkost 
Cc: Igor Mammedov 
Cc: Richard Henderson 
Cc: Peter Maydell 

Cc: Alexander Graf 
Cc: qemu-ppc 
Cc: David Gibson 

Andreas Färber (4):
  cpu: Introduce CPUListState struct
  qemu-common.h: Move fprintf_function to qemu-types.h
  target-ppc: Slim conversion of model definitions to QOM subclasses
  target-ppc: Error out for -cpu host on unknown PVR

 include/qemu/cpu.h  |   12 ++
 qemu-common.h   |5 -
 qemu-types.h|6 +
 target-alpha/cpu.c  |9 +-
 target-arm/helper.c |9 +-
 target-m68k/helper.c|9 +-
 target-ppc/Makefile.objs|3 +-
 target-ppc/cpu-qom.h|5 +
 target-ppc/cpu.h|4 -
 target-ppc/helper.c |   50 ---
 target-ppc/kvm.c|   44 +-
 target-ppc/kvm_ppc.h|8 +-
 target-ppc/translate_init.c |  344 +--
 13 files changed, 306 insertions(+), 202 deletions(-)
 delete mode 100644 target-ppc/helper.c

-- 
1.7.10.4

Re: [Qemu-devel] [PATCHv2] virtio: make bindings typesafe

2012-12-18 Thread Michael S. Tsirkin

On Mon, Dec 17, 2012 at 06:42:58PM -0600, Anthony Liguori wrote:
> Why don't you just use a static inline and then you get even more type
> safety and don't confuse with QOM cast macros...
> 
> Regards,
> 
> Anthony Liguori

OK.
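
A small sketch of the difference being suggested, under the assumption that the conversion is a container-style downcast. DeviceState here is a one-field stand-in, and ProxyState, PROXY_CAST and to_proxy are hypothetical names, not QEMU code.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the generic device type. */
typedef struct DeviceState {
    int id;
} DeviceState;

/* Hypothetical proxy embedding the device; deliberately not first. */
typedef struct ProxyState {
    int flags;
    DeviceState qdev;
} ProxyState;

/* Macro form: the bare casts accept any pointer, so passing a pointer
 * of the wrong type compiles without a diagnostic. */
#define PROXY_CAST(p) \
    ((ProxyState *)((char *)(p) - offsetof(ProxyState, qdev)))

/* Inline form: the parameter type lets the compiler diagnose callers
 * that pass anything other than a DeviceState pointer. */
static inline ProxyState *to_proxy(DeviceState *d)
{
    return (ProxyState *)((char *)d - offsetof(ProxyState, qdev));
}
```

Both forms compute the same address for a valid argument; only the inline function checks the argument type, and its lower-case name cannot be confused with an upper-case QOM cast macro.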

Re: [Qemu-devel] [PATCH 00/15] qdev: make reset semantics more clear and consistent, reset qbuses under virtio devices

2012-12-18 Thread Paolo Bonzini

On 18/12/2012 08:27, Paolo Bonzini wrote:
> On 17/12/2012 22:43, Michael S. Tsirkin wrote:
>> On Mon, Dec 17, 2012 at 05:24:35PM +0100, Paolo Bonzini wrote:
>>> After discussion with mst on the topic of resetting virtio devices,
>>> here is a series that hopefully clarifies the semantics of bus and
>>> device resets.
>>>
>>> After this series, there are two kinds of resets:
>>
>> So just to clarify, what I proposed was this
>> (on top of my type safety patch). Then
>> all transports can call virtio_config_reset
>> when appropriate (e.g. when PA is set to 0).
>>
>> Signed-off-by: Michael S. Tsirkin 
>>
>> diff --git a/hw/virtio.c b/hw/virtio.c
>> index f40a8c5..e65d7c8 100644
>> --- a/hw/virtio.c
>> +++ b/hw/virtio.c
>> @@ -554,6 +554,14 @@ void virtio_reset(void *opaque)
>>  }
>>  }
>>  
>> +/* Device-specific reset through virtio config space.
>> + * Reset virtio config and backend child devices if any.
>> + */
>> +void virtio_config_reset(VirtIODevice *vdev)
>> +{
>> +qdev_reset_all(vdev->binding_opaque);
>> +}
> 
> Yes, I had understood.  As I said, this is the wrong direction.
> Resetting happens from vdev->binding_opaque, it can just do
> qdev_reset_all(myself).

... besides, this only works if the reset callback of
vdev->binding_opaque remembers to call virtio_reset (in the s390
bindings, it doesn't and this series fixes it).  So IMO it is not only
useless, it is also misleading.

Paolo

Re: [Qemu-devel] [PATCH] virtio: make bindings typesafe

2012-12-18 Thread Michael S. Tsirkin

On Tue, Dec 18, 2012 at 01:48:44AM +0100, Andreas Färber wrote:
> On 18.12.2012 01:30, Michael S. Tsirkin wrote:
> > On Tue, Dec 18, 2012 at 01:13:18AM +0100, Andreas Färber wrote:
> >> On 17.12.2012 23:58, Michael S. Tsirkin wrote:
> >>> On Mon, Dec 17, 2012 at 11:08:43PM +0100, Andreas Färber wrote:
>  On 17.12.2012 22:18, Michael S. Tsirkin wrote:
> > On Mon, Dec 17, 2012 at 10:13:11PM +0100, Andreas Färber wrote:
> >> On 17.12.2012 21:48, Michael S. Tsirkin wrote:
> >>> On Mon, Dec 17, 2012 at 07:25:08PM +0100, Andreas Färber wrote:
>  On 17.12.2012 19:21, Paolo Bonzini wrote:
> > On 17/12/2012 18:55, Andreas Färber wrote:
> >> On 17.12.2012 16:45, Michael S. Tsirkin wrote:
> >>> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> >>> index 3ea4140..63ae888 100644
> >>> --- a/hw/virtio-pci.c
> >>> +++ b/hw/virtio-pci.c
> >>> @@ -98,34 +98,34 @@ bool virtio_is_big_endian(void);
> >>>  
> >>>  /* virtio device */
> >>>  
> >>> -static void virtio_pci_notify(void *opaque, uint16_t vector)
> >>> +static void virtio_pci_notify(DeviceState *d, uint16_t vector)
> >>>  {
> >>> -VirtIOPCIProxy *proxy = opaque;
> >>> +VirtIOPCIProxy *proxy = container_of(d, VirtIOPCIProxy, pci_dev.qdev);
> >>
> >> Nack. This is going the wrong direction QOM-wise and you among all
> >> others know that from PCI host bridges!
> >
> > Well, that's just a difference of VIRTIO_PCI_PROXY(d) vs. 
> > container_of.
> 
>  VIRTIO_PCI_PROXY(d) would be acceptable, sure. But as-is this patch 
>  just
>  pushes unnecessary work on Fred, me, you or anyone else who works 
>  with QOM.
> >>>
> >>> What's VIRTIO_PCI_PROXY? Note this is data path we do not want extra
> >>> code.
> >>
> >> My complaint is the direct access of pci_dev, qdev, etc. parent fields
> >> in many places as the main change of this patch. Those mean more places
> >> to touch in a future patch.
> >>
> >> Use of any new-style macro hiding these - wherever the particular one
> >> suggested may be defined or whether it needs to be added - is better.
> >>
> >> If performance of dynamic_cast is an issue - something I'd leave you to
> >> discuss with Anthony - you can just do a C cast directly. Just don't
> >> spread this qdev paradigm further please.
> >
> > OK so just
> >
> > #define VIRTIO_PCI_PROXY(d) container_of(d, VirtIOPCIProxy, pci_dev.qdev)
> >
> > is OK with you?
> 
>  Well, at least it's better than inlining it...
> 
>  I would've expected to see VIRTIO_PCI_PROXY(obj) defined as
>  OBJECT_CHECK(VirtIOPCIProxy, (obj), TYPE_something) somewhere.
> 
>  If, as you imply with "data path", this were a problem, you could just
>  do VirtIOPCIProxy *proxy = (VirtIOPCIProxy *)d inline to allow for
>  VIRTIO_PCI_PROXY() to be used in the QOM sense elsewhere.
> >>>
> >>> I don't get it - where?
> >>> Since we don't do runtime checks we need container_of -
> >>> safer than a plain cast.
> >>>
> >>> Anyway, when you start doing your QOM conversions it will be
> >>> easy to do what you like.
> >>
> >> I don't get what you don't get
> > 
> > What's the QOM way to get virtio pci proxy from
> > devicestate?
> > C cast is not what I am looking for.
> 
> Looking into virtio-pci.c it looks like virtio has a similar deficiency
> as EHCI USB (my recent series): We lack an abstract intermediate type
> TYPE_VIRTIO_PCI_PROXY to match the struct VirtIOPCIProxy shared by its
> subtypes:
> 
> Object
> - DeviceState
>   - PCIDevice
> - VirtIOPCIProxy
>   - virtio-scsi-pci
>   - virtio-rng-pci
>   ...
> 
> Not sure if that can be extracted from Fred's series already; otherwise
> I can send you a patch.

I'd like to avoid the dependency - let me do the rework
using simple container_of meanwhile, then Fred's series
can change the cast in a single place.

> Then you can do the mentioned:
> 
> #define VIRTIO_PCI_PROXY(obj) \
> OBJECT_CHECK(VirtIOPCIProxy, (obj), TYPE_VIRTIO_PCI_PROXY)
> 
> DeviceState *dev = ...;
> VirtIOPCIProxy *proxy = VIRTIO_PCI_PROXY(dev);
> 
> where you consider it acceptable performance-wise and a
> FAST_VIRTIO_PCI_PROXY_FROM_DEVICE(dev) or so elsewhere.
> 
> Andreas
> 
> -- 
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
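
To make the "container_of is safer than a plain cast" point concrete, here is a hedged sketch with made-up types. The layout is an assumption for illustration only: the parent field is placed after another member so that the two approaches visibly differ. The container_of macro below is a simplified version without the member type check that production versions add, and EXAMPLE_PROXY plays the role of the single wrapper whose body could later be switched to a QOM cast.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of; no type check on the member. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical types mirroring the nesting discussed in the thread. */
typedef struct ExampleDeviceState {
    int realized;
} ExampleDeviceState;

typedef struct ExamplePCIDevice {
    int devfn;                  /* placed first on purpose */
    ExampleDeviceState qdev;    /* parent is NOT at offset 0 here */
} ExamplePCIDevice;

typedef struct ExampleProxy {
    ExamplePCIDevice pci_dev;
    int host_features;
} ExampleProxy;

/* Single place that knows the field path; callers never spell out
 * pci_dev.qdev themselves. */
#define EXAMPLE_PROXY(d) container_of(d, ExampleProxy, pci_dev.qdev)
```

With this layout a plain `(ExampleProxy *)d` would point a few bytes into the structure, whereas EXAMPLE_PROXY(d) subtracts the member offset and stays correct if fields are reordered.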

Re: [Qemu-devel] [PATCH 1/1] virtio-serial-bus: assert port is non-null in remove_port()

2012-12-18 Thread Markus Armbruster

Amit Shah  writes:

> remove_port() is called from qdev's unplug callback, and we're certain
> the port will be found in our list of ports.  Adding an assert()
> documents this.
>
> This was flagged by Coverity, fix suggested by Markus.
>
> CC: Markus Armbruster 
> Signed-off-by: Amit Shah 
> ---
>  hw/virtio-serial-bus.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
> index 3ea95b8..ce4556f 100644
> --- a/hw/virtio-serial-bus.c
> +++ b/hw/virtio-serial-bus.c
> @@ -852,6 +852,12 @@ static void remove_port(VirtIOSerial *vser, uint32_t port_id)
>  vser->ports_map[i] &= ~(1U << (port_id % 32));
>  
>  port = find_port_by_id(vser, port_id);
> +/*
> + * This function is only called from qdev's unplug callback; if we
> + * get a NULL port here, we're in trouble.
> + */
> +assert(port);
> +
>  /* Flush out any unconsumed buffers first */
>  discard_vq_data(port->ovq, &port->vser->vdev);

Leaving it to you got me a nice comment!

Reviewed-by: Markus Armbruster
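
The idea of documenting a caller guarantee with assert() can be sketched outside QEMU as follows. Port, Bus, example_find_port and example_remove_port are hypothetical names; the lookup is a plain linear search, not the virtio-serial implementation.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Port {
    unsigned id;
    int open;
} Port;

typedef struct Bus {
    Port ports[4];
    size_t nports;
} Bus;

/* Lookup that may legitimately fail for unknown ids. */
static Port *example_find_port(Bus *bus, unsigned id)
{
    for (size_t i = 0; i < bus->nports; i++) {
        if (bus->ports[i].id == id) {
            return &bus->ports[i];
        }
    }
    return NULL;
}

/* Only ever called for ports that are known to be plugged in. */
static void example_remove_port(Bus *bus, unsigned id)
{
    Port *port = example_find_port(bus, id);

    /* A NULL port here means a caller broke the contract; the assert
     * states that invariant for readers and static analysers before
     * the pointer is dereferenced. */
    assert(port);

    port->open = 0;
}
```

The assert does not add error handling; it records that a NULL result is considered impossible on this path, which is what silences the analyser warning.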

Re: [Qemu-devel] [PATCH v2 3/5] s390: Add new channel I/O based virtio transport.

2012-12-18 Thread Paolo Bonzini

On 04/09/2012 17:13, Cornelia Huck wrote:
> +VirtioCcwBus *virtio_ccw_bus_init(void)
> +{
> +VirtioCcwBus *cbus;
> +BusState *bus;
> +DeviceState *dev;
> +
> +/* Create bridge device */
> +dev = qdev_create(NULL, "virtio-ccw-bridge");
> +qdev_init_nofail(dev);
> +
> +/* Create bus on bridge device */
> +bus = qbus_create(TYPE_VIRTIO_CCW_BUS, dev, "virtio-ccw");
> +cbus = DO_UPCAST(VirtioCcwBus, bus, bus);
> +
> +/* Enable hotplugging */
> +bus->allow_hotplug = 1;
> +
> +qemu_register_reset(virtio_ccw_reset_subchannels, cbus);

Please use qdev device-reset and bus-reset callbacks instead of this.

In particular, when writing the status you should call
qdev_reset_all(DEVICE(sch)), and whatever state should be reset will
have to be cleared by the device-reset callback of SubchDev, including
calling virtio_reset.

Everything else will be cleared instead by the bus-reset callback of
virtio-ccw-bus, similar to what you are doing in
virtio_ccw_reset_subchannels.

Paolo


> +return cbus;
> +}
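
As a rough model of the split described above (device-level state cleared by a device-reset callback, everything else by a bus-reset callback), consider this sketch. It is not qdev: ExBus, ExDevice and the two functions are hypothetical, and the traversal order is an arbitrary choice made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define EX_MAX_CHILDREN 4

typedef struct ExDevice ExDevice;

typedef struct ExBus {
    ExDevice *children[EX_MAX_CHILDREN];
    size_t nchildren;
    int pending;          /* bus-level state, cleared by the bus reset */
} ExBus;

struct ExDevice {
    ExBus *child_bus;     /* optional bus below this device */
    int status;           /* device-level state, cleared by device reset */
};

static void ex_device_reset_all(ExDevice *dev);

/* Reset every device on the bus, then the bus's own state. */
static void ex_bus_reset_all(ExBus *bus)
{
    for (size_t i = 0; i < bus->nchildren; i++) {
        ex_device_reset_all(bus->children[i]);
    }
    bus->pending = 0;     /* stands in for the bus-reset callback */
}

/* Reset one device and everything below it. */
static void ex_device_reset_all(ExDevice *dev)
{
    dev->status = 0;      /* stands in for the device-reset callback */
    if (dev->child_bus) {
        ex_bus_reset_all(dev->child_bus);
    }
}
```

With such a walk, a single call on the bridge device reaches all subchannel-like children without a separately registered global reset handler.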

Re: [Qemu-devel] [PATCH 00/15] qdev: make reset semantics more clear and consistent, reset qbuses under virtio devices

2012-12-18 Thread Michael S. Tsirkin

On Tue, Dec 18, 2012 at 09:35:02AM +0100, Paolo Bonzini wrote:
> On 18/12/2012 08:27, Paolo Bonzini wrote:
> > On 17/12/2012 22:43, Michael S. Tsirkin wrote:
> >> On Mon, Dec 17, 2012 at 05:24:35PM +0100, Paolo Bonzini wrote:
> >>> After discussion with mst on the topic of resetting virtio devices,
> >>> here is a series that hopefully clarifies the semantics of bus and
> >>> device resets.
> >>>
> >>> After this series, there are two kinds of resets:
> >>
> >> So just to clarify, what I proposed was this
> >> (on top of my type safety patch). Then
> >> all transports can call virtio_config_reset
> >> when appropriate (e.g. when PA is set to 0).
> >>
> >> Signed-off-by: Michael S. Tsirkin 
> >>
> >> diff --git a/hw/virtio.c b/hw/virtio.c
> >> index f40a8c5..e65d7c8 100644
> >> --- a/hw/virtio.c
> >> +++ b/hw/virtio.c
> >> @@ -554,6 +554,14 @@ void virtio_reset(void *opaque)
> >>  }
> >>  }
> >>  
> >> +/* Device-specific reset through virtio config space.
> >> + * Reset virtio config and backend child devices if any.
> >> + */
> >> +void virtio_config_reset(VirtIODevice *vdev)
> >> +{
> >> +qdev_reset_all(vdev->binding_opaque);
> >> +}
> > 
> > Yes, I had understood.  As I said, this is the wrong direction.
> > Resetting happens from vdev->binding_opaque, it can just do
> > qdev_reset_all(myself).

It can but it's the wrong thing for transport to know about.
Let PCI worry about PCI things. This is not
a transport specific thing so belongs in virtio.c

> ... besides, this only works if the reset callback of
> vdev->binding_opaque remembers to call virtio_reset (in the s390
> bindings, it doesn't and this series fixes it).

That's a separate bug I think.

>  So IMO it is not only
> useless, it is also misleading.
> 
> Paolo

Re: [Qemu-devel] [BUG] qemu-1.1.2 [FIXED-BY] qcow2: Fix avail_sectors in cluster allocation code

2012-12-18 Thread Philipp Hahn

Hello Kevin, hello Michael,

On Wednesday 12 December 2012 17:54:58 Kevin Wolf wrote:
> On 12.12.2012 15:09, Philipp Hahn wrote:
> > On Wednesday 12 December 2012 14:41:49, Kevin Wolf wrote:
> >> As you can see in the commit message of that patch I was convinced that
> >> no bug did exist in practice and this was only dangerous with respect to
> >> future changes. Therefore my first question is if you're using an
> >> unmodified upstream qemu or if some backported patches are applied to
> >> it? If it's indeed unmodified, we should probably review the code once
> >> again to understand why it makes a difference.
> >
> > These were all unmodified versions directly from git between
> > "qemu-kvm-1.1.0" and "qemu-kvm-1.2.0"
> >
> > "git checkout b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c" works,
> > "git checkout b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c~1" is broken.
> > "git checkout qemu-kvm-1.1.2"  is broken,
> > "git checkout qemu-kvm-1.1.2 ; git cherry-pick
> > b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c"  works
>
> Ok, thanks for clarifying. Then I must have missed some interesting case
> while doing the patch.

I think I found your missing link:
After filling in "QCowL2Meta *m", that request is queued:
  QLIST_INSERT_HEAD(&s->cluster_allocs, m, next_in_flight);
to prevent double allocation of the same cluster for overlapping requests, which
is checked in do_alloc_cluster_offset().

I guess that since the sector count was wrong, the overlap detection didn't 
work and the two concurrent write requests to the same cluster overwrote each 
other.
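
To illustrate the guess above (this is a generic interval check, not the qcow2 code; ExampleAlloc and example_overlaps are hypothetical names): if an in-flight request records too small a sector count, a later request touching the tail of the same cluster no longer appears to overlap it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record of an in-flight allocation. */
typedef struct ExampleAlloc {
    uint64_t offset;      /* start of the request, in sectors */
    uint64_t nb_sectors;  /* length of the request, in sectors */
} ExampleAlloc;

/* True if the half-open ranges [offset, offset + nb_sectors) intersect. */
static bool example_overlaps(const ExampleAlloc *a, const ExampleAlloc *b)
{
    return a->offset < b->offset + b->nb_sectors &&
           b->offset < a->offset + a->nb_sectors;
}
```

For example, an in-flight request covering sectors 0..15 overlaps a new request at sectors 8..15, but if the first one was queued with a length of only 4 sectors the check reports no overlap and both writes proceed.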

> Ideally we would find a sequence of qemu-io commands to reliably
> reproduce this.

You're the block guru, so I leave that to you (or anybody else who knows more 
about the working of qemu-io.) ;-)

Sincerely
Philipp
-- 
Philipp Hahn   Open Source Software Engineer  h...@univention.de
Univention GmbHbe open.   fon: +49 421 22 232- 0
Mary-Somerville-Str.1  D-28359 Bremen fax: +49 421 22 232-99
   http://www.univention.de/



Re: [Qemu-devel] [PATCH v2] Added uapi directory into linux-header

2012-12-18 Thread Alexander Graf



On 18.12.2012, at 03:07, Bhushan Bharat-R65777  wrote:

> 
> 
>> -Original Message-
>> From: Alexander Graf [mailto:ag...@suse.de]
>> Sent: Tuesday, December 18, 2012 7:00 AM
>> To: Bhushan Bharat-R65777
>> Cc: qemu-devel qemu-devel; Peter Maydell; Jan Kiszka; qemu-...@nongnu.org 
>> List;
>> Marcelo Tosatti; David Howells
>> Subject: Re: [PATCH v2] Added uapi directory into linux-header
>> 
>> 
>> On 18.12.2012, at 02:27, Bhushan Bharat-R65777 wrote:
>> 
>>> 
>>> 
 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Tuesday, December 18, 2012 6:51 AM
 To: Bhushan Bharat-R65777
 Cc: qemu-devel qemu-devel; Peter Maydell; Jan Kiszka;
 qemu-...@nongnu.org List; Marcelo Tosatti; David Howells
 Subject: Re: [PATCH v2] Added uapi directory into linux-header
 
 
 On 18.12.2012, at 02:14, Bhushan Bharat-R65777 wrote:
 
>>> +++ b/scripts/update-linux-headers.sh
>>> @@ -46,14 +46,26 @@ for arch in $ARCHLIST; do
>>> 
>>>  make -C "$linux" INSTALL_HDR_PATH="$tmpdir" SRCARCH=$arch
>>> headers_install
>>> 
>>> +if [ -e "$linux/arch/$arch/include/uapi" ] &&
>>> +! [ -e "$output/linux-headers/uapi" ] ; then
>>> +mkdir "$output/linux-headers/uapi"
>> 
>> mkdir -p
>> 
>> But looking through this whole thing, it seems like the root cause
>> is actually different. We don't want any uapi directories exposed
>> to user space. So let's go back a step:
>> 
>> Why do we need the uapi include dir? Because some header is using it.
>> 
>> linux-headers/asm-powerpc/kvm_para.h:
> 
> The kvm_para.h (also kvm.h) are now defined in include/uapi/asm/
> 
> Is not this the correct thing that any header file in
> include/uapi/asm/ (in
 this case kvm_para.h) includes another header file (epapr_hcalls.h)
 in same directory?
> 
> Also I think now only the uapi/asm/*.h files should be exposed to
> userspace
 (QEMU here).
 
 make headers_install should basically remove all the uapi magic and
 give us normal backwards-compatible asm trees :).
>>> 
>>> I am perfectly fine, How we can do this now :)
>> 
>> Well, for starters, do the headers work if you apply the patch I sent in a
>> previous mail plus the epapr_hcall.h copy? If so, then that's the way to go 
>> :)
> 
> Are you really sure that applying a patch and then syncing (or other way 
> round)  is the way you want to go ?

Yes, because I'm quite confident we're generating broken headers right now.

Alex

> 
> To me it does not look good, I think we can go with the script changes to 
> make install_header is updated to do the work.
> 
> -Bharat
> 
>> 
>> 
>> Alex
>> 
> 
>

Re: [Qemu-devel] [PATCH 09/26] ehci: Use uframe precision for interrupt threshold checking

2012-12-18 Thread Hans de Goede


Hi,

On 12/17/2012 03:51 PM, Gerd Hoffmann wrote:
> On 12/17/12 15:47, Hans de Goede wrote:
>> Hi,
>>
>> On 12/17/2012 03:39 PM, Gerd Hoffmann wrote:
>>> On 12/17/12 15:23, Hans de Goede wrote:
>>>> Hi,
>>>>
>>>> On 12/17/2012 02:16 PM, Gerd Hoffmann wrote:
>>>>> On 12/14/12 14:35, Hans de Goede wrote:
>>>>>> Note that a shadow variable is used instead of changing frindex to
>>>>>> uframe accuracy because we must send a frindex which is a multiple
>>>>>> of 8 during migration for migration compatibility, and rounding it
>>>>>> down to a multiple of 8 pre-migration, can lead to frindex going
>>>>>> backwards from the guest pov.
>>>>>
>>>>> Jumping forward instead?
>>>>
>>>> You mean rounding the send frindex up pre-migration, I didn't really
>>>> consider that as it will cause us skipping processing an entry in the
>>>> periodic frame list. I guess doing that on migration isn't too bad.
>>>> OTOH giving the guest only frame accuracy like we've been doing till
>>>> now also works fine...
>>>>
>>>> Your choice :)
>>>
>>> I'm looking for a way to avoid the shadow variable, but of course
>>> without breaking migration.  giving the guest only frame accuracy looks
>>> good to me.
>>
>> Ok, but then we need the shadow variable, iow then the patch stays as is
>> ...
>
> Can't we (a) switch frindex to microframe resolution, (b) round to frame
> resolution in pre_save and (c) return frindex & ~7 on guest register reads?

Ah yes we can. I thought about that myself, but I was under the impression
that we were using mmap tricks to allow the guest to read the ioregs directly
without going through a vmexit. But it turns out we've:

static uint64_t ehci_opreg_read(void *ptr, hwaddr addr,
unsigned size)
{
EHCIState *s = ptr;
uint32_t val;

val = s->opreg[addr >> 2];
trace_usb_ehci_opreg_read(addr + s->opregbase, addr2str(addr), val);
return val;
}

Can qemu not handle an mmio range where writes are trapped, but reads are
not? That would force the use of the shadow variable, but should otherwise
provide a nice speedup.

Regards,

Hans
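
Gerd's (a)/(b)/(c) suggestion can be sketched in isolation like this. ExampleEHCI and the three functions are hypothetical and ignore register wrap-around and the real opreg array; the sketch only shows why masking on read makes the pre-save rounding invisible to the guest.

```c
#include <assert.h>
#include <stdint.h>

/* (a) Keep frindex at microframe resolution (8 microframes per frame). */
typedef struct ExampleEHCI {
    uint32_t frindex;
} ExampleEHCI;

/* Advance by one microframe (wrap-around omitted in this sketch). */
static void example_uframe_tick(ExampleEHCI *s)
{
    s->frindex++;
}

/* (b) Round down to a frame boundary before the state is saved, so the
 * migrated value stays a multiple of 8. */
static void example_pre_save(ExampleEHCI *s)
{
    s->frindex &= ~7u;
}

/* (c) Guest register reads only ever see frame accuracy. */
static uint32_t example_frindex_read(const ExampleEHCI *s)
{
    return s->frindex & ~7u;
}
```

Because the guest-visible value is always frindex & ~7, rounding the internal counter down in pre_save cannot make the value the guest reads go backwards.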

Re: [Qemu-devel] [PATCH v2] Added uapi directory into linux-header

2012-12-18 Thread Bhushan Bharat-R65777

> >>> +++ b/scripts/update-linux-headers.sh
> >>> @@ -46,14 +46,26 @@ for arch in $ARCHLIST; do
> >>>
> >>>  make -C "$linux" INSTALL_HDR_PATH="$tmpdir" SRCARCH=$arch
> >>> headers_install
> >>>
> >>> +if [ -e "$linux/arch/$arch/include/uapi" ] &&
> >>> +! [ -e "$output/linux-headers/uapi" ] ; then
> >>> +mkdir "$output/linux-headers/uapi"
> >>
> >> mkdir -p
> >>
> >> But looking through this whole thing, it seems like the root
> >> cause is actually different. We don't want any uapi directories
> >> exposed to user space. So let's go back a step:
> >>
> >> Why do we need the uapi include dir? Because some header is using it.
> >>
> >> linux-headers/asm-powerpc/kvm_para.h:
> >
> > The kvm_para.h (also kvm.h) are now defined in include/uapi/asm/
> >
> > Is not this the correct thing that any header file in
> > include/uapi/asm/ (in
>  this case kvm_para.h) includes another header file (epapr_hcalls.h)
>  in same directory?
> >
> > Also I think now only the uapi/asm/*.h files should be exposed to
> > userspace
>  (QEMU here).
> 
>  make headers_install should basically remove all the uapi magic and
>  give us normal backwards-compatible asm trees :).
> >>>
> >>> I am perfectly fine, How we can do this now :)
> >>
> >> Well, for starters, do the headers work if you apply the patch I sent
> >> in a previous mail plus the epapr_hcall.h copy? If so, then that's
> >> the way to go :)
> >
> > Are you really sure that applying a patch and then syncing (or other way
> round)  is the way you want to go ?
> 
> Yes, because I'm quite confident we're generating broken headers right now.

Ok, so every time someone does the sync, he/she has to do this? Also, do we think
that sometime in the future this will be taken care of by make headers_install?

Thanks
-Bharat

> 
> Alex
> 
> >
> > To me it does not look good, I think we can go with the script changes to 
> > make
> install_header is updated to do the work.
> >
> > -Bharat
> >
> >>
> >>
> >> Alex
> >>
> >
> >

[Qemu-devel] [PULL] virtio-serial: fixes, cleanups

2012-12-18 Thread Amit Shah

Hi Anthony,

Please pull to get fixes and cleanups to virtio-serial code.

Thanks,

The following changes since commit 1c97e303d4ea80a2691334b0febe87a50660f99d:

  Merge remote-tracking branch 'afaerber/qom-cpu' into staging (2012-12-10 08:35:15 -0600)

are available in the git repository at:


  git://git.kernel.org/pub/scm/virt/qemu/amit/virtio-serial.git master

for you to fetch changes up to 91bdd1cf08f65b7a127c22d4d65ff9d16dcac870:

  virtio-serial-bus: assert port is non-null in remove_port() (2012-12-18 14:28:50 +0530)


Amit Shah (6):
  virtio-serial: use uint32_t to count ports
  virtio-serial: move active ports loading to separate function
  virtio-serial: allocate post_load only at load-time
  virtio-serial: delete timer if active during exit
  virtio-serial-bus: send_control_msg() should not deal with cpkts
  virtio-serial-bus: assert port is non-null in remove_port()

 hw/virtio-serial-bus.c | 195 
++---
 1 file changed, 113 insertions(+), 82 deletions(-)

Amit

[Qemu-devel] [PATCHv3] virtio: make bindings typesafe

2012-12-18 Thread Michael S. Tsirkin

Move bindings from opaque to DeviceState.
This gives us better type safety with no performance cost.
Add macros to make future QOM work easier.
Note: this code is not replacing QOM use with non QOM -
it is replacing unsafe void * use with type-checked use.
Switch of the implementation to QOM where feasible
can be done by a follow-up patch.

Signed-off-by: Michael S. Tsirkin 
---

Changes from v2:
- More comments by Andreas Färber:
  Document which code is datapath, create wrapper
  to make future QOM changes easier.
- Address comment by Anthony:
  use inline functions for type safety and to
  avoid conflicting with QOM
Changes from v1:
- Address comment by Andreas Färber: wrap container_of
  macros to make future QOM work easier
- make a couple of bindings that v1 missed typesafe:
  virtio doesn't use any void * now

diff --git a/hw/s390-virtio-bus.c b/hw/s390-virtio-bus.c
index e0ac2d1..8570b76 100644
--- a/hw/s390-virtio-bus.c
+++ b/hw/s390-virtio-bus.c
@@ -137,7 +137,7 @@ static int s390_virtio_device_init(VirtIOS390Device *dev, VirtIODevice *vdev)
 
 bus->dev_offs += dev_len;
 
-virtio_bind_device(vdev, &virtio_s390_bindings, dev);
+virtio_bind_device(vdev, &virtio_s390_bindings, DEVICE(dev));
 dev->host_features = vdev->get_features(vdev, dev->host_features);
 s390_virtio_device_sync(dev);
 s390_virtio_reset_idx(dev);
@@ -364,9 +364,23 @@ VirtIOS390Device *s390_virtio_bus_find_mem(VirtIOS390Bus *bus, ram_addr_t mem)
 return NULL;
 }
 
-static void virtio_s390_notify(void *opaque, uint16_t vector)
+/* DeviceState to VirtIOS390Device. Note: used on datapath,
+ * be careful and test performance if you change this.
+ */
+static inline VirtIOS390Device *to_virtio_s390_device_fast(DeviceState *d)
+{
+return container_of(d, VirtIOS390Device, qdev);
+}
+
+/* DeviceState to VirtIOS390Device. TODO: use QOM. */
+static inline VirtIOS390Device *to_virtio_s390_device(DeviceState *d)
+{
+return container_of(d, VirtIOS390Device, qdev);
+}
+
+static void virtio_s390_notify(DeviceState *d, uint16_t vector)
 {
-VirtIOS390Device *dev = (VirtIOS390Device*)opaque;
+VirtIOS390Device *dev = to_virtio_s390_device_fast(d);
 uint64_t token = s390_virtio_device_vq_token(dev, vector);
 S390CPU *cpu = s390_cpu_addr2state(0);
 CPUS390XState *env = &cpu->env;
@@ -374,9 +388,9 @@ static void virtio_s390_notify(void *opaque, uint16_t vector)
 s390_virtio_irq(env, 0, token);
 }
 
-static unsigned virtio_s390_get_features(void *opaque)
+static unsigned virtio_s390_get_features(DeviceState *d)
 {
-VirtIOS390Device *dev = (VirtIOS390Device*)opaque;
+VirtIOS390Device *dev = to_virtio_s390_device(d);
 return dev->host_features;
 }
 
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 3ea4140..1c03bb5 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -97,35 +97,48 @@
 bool virtio_is_big_endian(void);
 
 /* virtio device */
+/* DeviceState to VirtIOPCIProxy. For use off data-path. TODO: use QOM. */
+static inline VirtIOPCIProxy *to_virtio_pci_proxy(DeviceState *d)
+{
+return container_of(d, VirtIOPCIProxy, pci_dev.qdev);
+}
 
-static void virtio_pci_notify(void *opaque, uint16_t vector)
+/* DeviceState to VirtIOPCIProxy. Note: used on datapath,
+ * be careful and test performance if you change this.
+ */
+static inline VirtIOPCIProxy *to_virtio_pci_proxy_fast(DeviceState *d)
 {
-VirtIOPCIProxy *proxy = opaque;
+return container_of(d, VirtIOPCIProxy, pci_dev.qdev);
+}
+
+static void virtio_pci_notify(DeviceState *d, uint16_t vector)
+{
+VirtIOPCIProxy *proxy = to_virtio_pci_proxy_fast(d);
 if (msix_enabled(&proxy->pci_dev))
 msix_notify(&proxy->pci_dev, vector);
 else
 qemu_set_irq(proxy->pci_dev.irq[0], proxy->vdev->isr & 1);
 }
 
-static void virtio_pci_save_config(void * opaque, QEMUFile *f)
+static void virtio_pci_save_config(DeviceState *d, QEMUFile *f)
 {
-VirtIOPCIProxy *proxy = opaque;
+VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
 pci_device_save(&proxy->pci_dev, f);
 msix_save(&proxy->pci_dev, f);
 if (msix_present(&proxy->pci_dev))
 qemu_put_be16(f, proxy->vdev->config_vector);
 }
 
-static void virtio_pci_save_queue(void * opaque, int n, QEMUFile *f)
+static void virtio_pci_save_queue(DeviceState *d, int n, QEMUFile *f)
 {
-VirtIOPCIProxy *proxy = opaque;
+VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
 if (msix_present(&proxy->pci_dev))
 qemu_put_be16(f, virtio_queue_vector(proxy->vdev, n));
 }
 
-static int virtio_pci_load_config(void * opaque, QEMUFile *f)
+static int virtio_pci_load_config(DeviceState *d, QEMUFile *f)
 {
-VirtIOPCIProxy *proxy = opaque;
+VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
 int ret;
 ret = pci_device_load(&proxy->pci_dev, f);
 if (ret) {
@@ -144,9 +157,9 @@ static int virtio_pci_load_config(void * opaque, QEMUFile 
*f)
 return 0;
 }
 
-static int virtio_pci_load_

Re: [Qemu-devel] [PATCH 4/6] snapshot: implemention of common API to take snapshots

2012-12-18 Thread Wenchao Xia


On 2012-12-17 18:32, Dietmar Maurer wrote:

>>> For example, nexenta storage provides an API to create snapshots. We
>>> want to use that. Another example would be to use lvcreate to create lvm
>>> snapshots.
>>
>> I am not familiar with those tools
>
> You do not know LVM?

  Honestly speaking, I rarely use LVM tools. I wonder why lvcreate can't
be used; for block internal snapshots I think this patch has the same
function as your previous patch. What is missing?






--
Best Regards

Wenchao Xia

Re: [Qemu-devel] [RFC PATCH v6 0/6] Virtio refactoring.

2012-12-18 Thread Peter Maydell

On 17 December 2012 15:45, Michael S. Tsirkin  wrote:
> Is the point to allow virtio-mmio?  Why can't virtio-mmio be just
> another bus, like a pci bus, and another binding, like the virtio-pci
> binding?

(a) the current code is really not very nice because it's not
actually a proper set of QOM/qdev devices
(b) unlike PCI, you can't create sysbus devices on the
command line, because they don't correspond to a user
pluggable bit of hardware. We don't want users to have to know
an address and IRQ number for each virtio-mmio device (especially
since these are board specific); instead the board can create
and wire up transport devices wherever is suitable, and the
user just creates the backend (which is plugged into the virtio bus).

-- PMM

Re: [Qemu-devel] [PATCH 4/6] snapshot: implemention of common API to take snapshots

2012-12-18 Thread Dietmar Maurer

> >> I am not familiar with those tools
> >
> > You do not know LVM?
> >
> Honestly speaking, I rarely use LVM tools. I wonder why lvcreate can't be
> used; for block internal snapshots I think this patch has the same function
> as your previous patch. What is missing?

QEMU has no information about the underlying storage, so an LVM snapshot
must be created by the management software.

Re: [Qemu-devel] [PATCH v2] Added uapi directory into linux-header

2012-12-18 Thread Alexander Graf

On 18.12.2012, at 11:19, Bhushan Bharat-R65777  wrote:

> +++ b/scripts/update-linux-headers.sh
> @@ -46,14 +46,26 @@ for arch in $ARCHLIST; do
> 
> make -C "$linux" INSTALL_HDR_PATH="$tmpdir" SRCARCH=$arch
> headers_install
> 
> +if [ -e "$linux/arch/$arch/include/uapi" ] &&
> +! [ -e "$output/linux-headers/uapi" ] ; then
> +mkdir "$output/linux-headers/uapi"

 mkdir -p

 But looking through this whole thing, it seems like the root
 cause is actually different. We don't want any uapi directories
 exposed to user space. So let's go back a step:

 Why do we need the uapi include dir? Because some header is using it.

 linux-headers/asm-powerpc/kvm_para.h:
>>> 
>>> The kvm_para.h (also kvm.h) are now defined in include/uapi/asm/
>>> 
>>> Is not this the correct thing that any header file in
>>> include/uapi/asm/ (in
>> this case kvm_para.h) includes another header file (epapr_hcalls.h)
>> in same directory?
>>> 
>>> Also I think now only the uapi/asm/*.h files should be exposed to
>>> userspace
>> (QEMU here).
>> 
>> make headers_install should basically remove all the uapi magic and
>> give us normal backwards-compatible asm trees :).
> 
> I am perfectly fine, How we can do this now :)

 Well, for starters, do the headers work if you apply the patch I sent
 in a previous mail plus the epapr_hcall.h copy? If so, then that's
 the way to go :)
>>> 
>>> Are you really sure that applying a patch and then syncing (or other way
>> round)  is the way you want to go ?
>> 
>> Yes, because I'm quite confident we're generating broken headers right now.
> 
> Ok, so every time someone does the sync he/she has to do this? Also, do we 
> think that at some point this will be taken care of by make headers_install?

That's the point.

The QEMU header sync is a hack that allows us to not use system headers. System 
headers could be outdated wrt features we want to support.

However, system headers get generated using make headers_install. And there we 
generate a header that refers to a uapi directory that doesn't exist after 
headers_install.

So the next time a distro updates their system headers, they get broken ones. 
That's what the patch against Linux is trying to fix.

Alex

Re: [Qemu-devel] [PATCHv2] virtio: make bindings typesafe

2012-12-18 Thread Michael S. Tsirkin

On Mon, Dec 17, 2012 at 06:42:58PM -0600, Anthony Liguori wrote:
> "Michael S. Tsirkin"  writes:
> 
> > On Mon, Dec 17, 2012 at 11:59:15PM +0100, Andreas Färber wrote:
> >> On 17.12.2012 22:40, Michael S. Tsirkin wrote:
> >> > Move bindings from opaque to DeviceState.
> >> > This gives us better type safety with no performance cost.
> >> > Add macros to make future QOM work easier, document
> >> > which ones are data-path sensitive.
> >> > 
> >> > Signed-off-by: Michael S. Tsirkin 
> >> > ---
> >> > 
> >> > Changes from v1:
> >> > - Address comment by Andreas Färber: wrap container_of
> >> >   macros to make future QOM work easier
> >> > - make a couple of bindings that v1 missed typesafe:
> >> >   virtio doesn't use any void * now
> >> > 
> >> > diff --git a/hw/s390-virtio-bus.c b/hw/s390-virtio-bus.c
> >> > index e0ac2d1..8c693b4 100644
> >> > --- a/hw/s390-virtio-bus.c
> >> > +++ b/hw/s390-virtio-bus.c
> >> > @@ -137,7 +137,7 @@ static int s390_virtio_device_init(VirtIOS390Device 
> >> > *dev, VirtIODevice *vdev)
> >> >  
> >> >  bus->dev_offs += dev_len;
> >> >  
> >> > -virtio_bind_device(vdev, &virtio_s390_bindings, dev);
> >> > +virtio_bind_device(vdev, &virtio_s390_bindings, 
> >> > VIRTIO_S390_TO_QDEV(dev));
> >> 
> >> DEVICE(dev) exists for exactly that purpose, and device init is
> >> certainly no hot path. Please don't reinvent the wheel for virtio.
> >
> > OK.
> > Though my beef with DEVICE is that it ignores the type
> > passed in completely. You can give it int * and it will
> > happily cast to devicestate. Your only hope is to
> > catch the error at runtime.
> 
> That's a feature.  DEVICE can do upcasting and downcasting.  There's no
> way to do compile time checking of upcasting when
> 
> > It would be better if DEVICE got the name of the
> > qdev field, then we could check it's actually DeviceState
> > before casting. Yes it would mean a bit of churn if you rename the
> > field but it's very rare and trivial to change by a regexp.
> 
> No, it would be much, much worse.  You shouldn't have to know what the
> layout of the structure is to convert between types.

Still I'm pointing out the problems, they are real.
Illegal code like
 DEVICE("foobar")
compiles fine and it shouldn't.

-- 
MST
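
To make the type-safety point concrete, here is a minimal sketch of the
container_of wrapper style used in the patch. The struct names are
hypothetical stand-ins, not the real QEMU definitions. Because the wrapper
takes a DeviceState *, a call such as to_proxy_device("foobar") draws a
compiler diagnostic, whereas a macro that only casts accepts it silently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the QEMU structures, reduced to what the
 * example needs; the real definitions are much larger. */
typedef struct DeviceState {
    int id;
} DeviceState;

typedef struct ProxyDevice {
    int vector_count;
    DeviceState qdev;   /* embedded base object, deliberately not first */
} ProxyDevice;

/* Classic container_of: step back from a member to its enclosing struct. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Typed wrapper: the parameter type lets the compiler reject callers
 * that pass anything other than a DeviceState pointer. */
static inline ProxyDevice *to_proxy_device(DeviceState *d)
{
    return container_of(d, ProxyDevice, qdev);
}
```

The cost is the one raised in the thread: the wrapper has to name the
embedded field, so it depends on the structure layout.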

Re: [Qemu-devel] [RFC PATCH v6 0/6] Virtio refactoring.

2012-12-18 Thread Michael S. Tsirkin

On Tue, Dec 18, 2012 at 10:33:37AM +0000, Peter Maydell wrote:
> On 17 December 2012 15:45, Michael S. Tsirkin  wrote:
> > Is the point to allow virtio-mmio?  Why can't virtio-mmio be just
> > another bus, like a pci bus, and another binding, like the virtio-pci
> > binding?
> 
> (a) the current code is really not very nice because it's not
> actually a proper set of QOM/qdev devices
> (b) unlike PCI, you can't create sysbus devices on the
> command line, because they don't correspond to a user
> pluggable bit of hardware. We don't want users to have to know
> an address and IRQ number for each virtio-mmio device (especially
> since these are board specific); instead the board can create
> and wire up transport devices wherever is suitable, and the
> user just creates the backend (which is plugged into the virtio bus).
> 
> -- PMM

This is what I am saying: create your own bus and put
your devices there. Allocate resources when you init
a device.

Instead you seem to want to expose a virtio device as two devices to
user - if true this is not reasonable.

-- 
MST

Re: [Qemu-devel] [PATCH 09/26] ehci: Use uframe precision for interrupt threshold checking

2012-12-18 Thread Gerd Hoffmann

  Hi,

> Can qemu not handle an mmio range where writes are trapped, but reads are
> not? That would force the use of the shadow variable, but should otherwise
> provide a nice speedup.

No.  vmexit is needed anyway btw, but the round-trip to qemu userspace
could be short-cutted in theory.  It's non-trivial though.  Alex had a
talk about it at kvm forum (covering ide).

First a simple read directly + write via qemu isn't that useful.  You
need a policy per register.

For most reads it would work, but there are exceptions.  Registers
holding timers for example.  frindex is actually an example of that.
With async_stepdown active frindex updates are quite jumpy.  We might
want to update the register on guest reads (and maybe also reset async
stepdown in that case).

Likewise the other way around: Not all register writes have some effect
which qemu must emulate, some are just storage (like ehci frame list
address).

Locking is an unsolved issue (in-kernel register reads/writes don't grab
the qemu lock and thus would race with iothread accessing the register
variables).

cheers,
  Gerd
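
As an illustration of the "policy per register" idea, a table like the
following could describe which accesses have to reach the device model.
This is only a sketch: the flag names and the table are hypothetical, and
nothing like it exists in QEMU or KVM. The offsets are relative to the
EHCI operational register base.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical policy flags, invented for this example. */
enum {
    REG_READ_TRAPS  = 1 << 0,  /* read must reach the device model */
    REG_WRITE_TRAPS = 1 << 1,  /* write has side effects to emulate */
};

struct reg_policy {
    uint32_t offset;   /* offset from the EHCI operational register base */
    unsigned flags;
};

static const struct reg_policy ehci_policy[] = {
    { 0x00, REG_WRITE_TRAPS },                  /* USBCMD */
    { 0x0c, REG_READ_TRAPS | REG_WRITE_TRAPS }, /* FRINDEX: acts like a timer */
    { 0x14, 0 },                                /* PERIODICLISTBASE: storage */
};

/* Returns true if the access needs the slow path through the emulator. */
static bool access_traps(uint32_t offset, bool is_write)
{
    unsigned want = is_write ? REG_WRITE_TRAPS : REG_READ_TRAPS;
    for (size_t i = 0; i < sizeof(ehci_policy) / sizeof(ehci_policy[0]); i++) {
        if (ehci_policy[i].offset == offset) {
            return (ehci_policy[i].flags & want) != 0;
        }
    }
    return true;   /* unknown register: take the slow path to be safe */
}
```

The table says nothing about the locking problem mentioned above; a fast
path would still race with the iothread unless that is solved separately.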

Re: [Qemu-devel] [RFC PATCH v6 0/6] Virtio refactoring.

2012-12-18 Thread Peter Maydell

On 18 December 2012 11:01, Michael S. Tsirkin  wrote:
> This is what I am saying: create your own bus and put
> your devices there.

What bus?

-- PMM

Re: [Qemu-devel] [PATCH 09/26] ehci: Use uframe precision for interrupt threshold checking

2012-12-18 Thread Hans de Goede


Hi,

On 12/18/2012 12:03 PM, Gerd Hoffmann wrote:
>   Hi,
>
>> Can qemu not handle an mmio range where writes are trapped, but reads are
>> not? That would force the use of the shadow variable, but should otherwise
>> provide a nice speedup.
>
> No.  vmexit is needed anyway btw, but the round-trip to qemu userspace
> could be short-cutted in theory.  It's non-trivial though.  Alex had a
> talk about it at kvm forum (covering ide).


Ok, then I'll respin the patch getting rid of the shadow variable.

Regards,

Hans

Re: [Qemu-devel] [RFC PATCH v6 0/6] Virtio refactoring.

2012-12-18 Thread KONRAD Frédéric


On 18/12/2012 12:01, Michael S. Tsirkin wrote:
> On Tue, Dec 18, 2012 at 10:33:37AM +0000, Peter Maydell wrote:
>> On 17 December 2012 15:45, Michael S. Tsirkin  wrote:
>>> Is the point to allow virtio-mmio?  Why can't virtio-mmio be just
>>> another bus, like a pci bus, and another binding, like the virtio-pci
>>> binding?
>>
>> (a) the current code is really not very nice because it's not
>> actually a proper set of QOM/qdev devices
>> (b) unlike PCI, you can't create sysbus devices on the
>> command line, because they don't correspond to a user
>> pluggable bit of hardware. We don't want users to have to know
>> an address and IRQ number for each virtio-mmio device (especially
>> since these are board specific); instead the board can create
>> and wire up transport devices wherever is suitable, and the
>> user just creates the backend (which is plugged into the virtio bus).
>>
>> -- PMM
>
> This is what I am saying: create your own bus and put
> your devices there. Allocate resources when you init
> a device.
>
> Instead you seem to want to expose a virtio device as two devices to
> user - if true this is not reasonable.

The modifications will be transparent to the user, as we will keep 
virtio-x-pci devices.

[Qemu-devel] [PATCH] powerpc: linux header sync script includes epapr_hcalls.h

2012-12-18 Thread Bharat Bhushan

kvm_para.h now references epapr_hcalls.h, so the header is needed for
QEMU to compile on powerpc.

Signed-off-by: Bharat Bhushan 
---

This patch follows the discussion we had on the patch with the subject
"Added uapi directory in linux-header"; that patch is
no longer needed.

 scripts/update-linux-headers.sh |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
index 4c7b566..120a694 100755
--- a/scripts/update-linux-headers.sh
+++ b/scripts/update-linux-headers.sh
@@ -54,6 +54,9 @@ for arch in $ARCHLIST; do
 if [ $arch = x86 ]; then
 cp "$tmpdir/include/asm/hyperv.h" "$output/linux-headers/asm-x86"
 fi
+if [ $arch = powerpc ]; then
+cp "$tmpdir/include/asm/epapr_hcalls.h" 
"$output/linux-headers/asm-powerpc/"
+fi
 done
 
 rm -rf "$output/linux-headers/linux"
-- 
1.7.0.4

Re: [Qemu-devel] [PATCH 00/15] qdev: make reset semantics more clear and consistent, reset qbuses under virtio devices

2012-12-18 Thread Paolo Bonzini

On 18/12/2012 10:49, Michael S. Tsirkin wrote:
 +/* Device-specific reset through virtio config space.
 + * Reset virtio config and backend child devices if any.
 + */
 +void virtio_config_reset(VirtIODevice *vdev)
 +{
 +qdev_reset_all(vdev->binding_opaque);
 +}
>>>
>>> Yes, I had understood.  As I said, this is the wrong direction.
>>> Resetting happens from vdev->binding_opaque, it can just do
>>> qdev_reset_all(myself).
> 
> It can but it's the wrong thing for transport to know about.

The transport provides an implementation of dc->reset, not virtio.c
(e.g. virtio_pci_reset).  Surely it knows what the effects of
qdev_reset_all are on itself.

> Let PCI worry about PCI things. This is not
> a transport specific thing so belongs in virtio.c

This _is_ a transport specific thing.  Sure it will reset the virtio
device (virtio_reset), but it will also reset things such as MSI-X
vectors and VIRTIO_PCI_FLAG_BUS_MASTER_BUG.  It doesn't belong in virtio.c.

>> ... besides, this only works if the reset callback of
>> vdev->binding_opaque remembers to call virtio_reset (in the s390
>> bindings, it doesn't and this series fixes it).
> 
> That's a separate bug I think.

Yes, I agree.

Paolo

Re: [Qemu-devel] [RFC PATCH v6 0/6] Virtio refactoring.

2012-12-18 Thread Paolo Bonzini

On 18/12/2012 12:26, Peter Maydell wrote:
> On 18 December 2012 11:01, Michael S. Tsirkin  wrote:
>> This is what I am saying: create your own bus and put
>> your devices there.
> 
> What bus?

A virtio bus like the one in these patches.  But mst is suggesting to
leave virtio.c aside, and only use the virtio bus in the virtio-x-mmio
devices, to connect to the virtio-mmio device provided by the board.

Either way is fine for me.

Paolo

Re: [Qemu-devel] [PATCH] powerpc: linux header sync script includes epapr_hcalls.h

2012-12-18 Thread Alexander Graf


On 18.12.2012, at 12:13, Bharat Bhushan wrote:

> epapr_hcalls.h is now referenced by kvm_para.h. so this is needed for
> QEMU to get compiled on powerpc.
> 
> Signed-off-by: Bharat Bhushan 

Thanks, applied to ppc-next.

Alex

> ---
> 
> This patch is after all the discussion we had on patch with subject
> "Added uapi directory in linux-header" and the mentioned patch is
> no more needed.
> 
> scripts/update-linux-headers.sh |3 +++
> 1 files changed, 3 insertions(+), 0 deletions(-)
> 
> diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
> index 4c7b566..120a694 100755
> --- a/scripts/update-linux-headers.sh
> +++ b/scripts/update-linux-headers.sh
> @@ -54,6 +54,9 @@ for arch in $ARCHLIST; do
> if [ $arch = x86 ]; then
> cp "$tmpdir/include/asm/hyperv.h" "$output/linux-headers/asm-x86"
> fi
> +if [ $arch = powerpc ]; then
> +cp "$tmpdir/include/asm/epapr_hcalls.h" 
> "$output/linux-headers/asm-powerpc/"
> +fi
> done
> 
> rm -rf "$output/linux-headers/linux"
> -- 
> 1.7.0.4
> 
>

Re: [Qemu-devel] [RFC PATCH v6 0/6] Virtio refactoring.

2012-12-18 Thread Peter Maydell

On 18 December 2012 11:50, Paolo Bonzini  wrote:
> On 18/12/2012 12:26, Peter Maydell wrote:
>> On 18 December 2012 11:01, Michael S. Tsirkin  wrote:
>>> This is what I am saying: create your own bus and put
>>> your devices there.
>>
>> What bus?
>
> A virtio bus like the one in these patches.  But mst is suggesting to
> leave virtio.c aside, and only use the virtio bus in the virtio-x-mmio
> devices, to connect to the virtio-mmio device provided by the board.

That doesn't make any sense to me -- why would you want to make
mmio be randomly different from the other virtio transports? The code
as it stands is a mess which ought to be cleaned up anyway...

To the extent that it's painful for users to manipulate qdev devices on
the command line, that's a problem we ought to address at some point
anyhow.

-- PMM

Re: [Qemu-devel] [BUG] qemu-1.1.2 [FIXED-BY] qcow2: Fix avail_sectors in cluster allocation code

2012-12-18 Thread Michael Tokarev

On 18.12.2012 13:46, Philipp Hahn wrote:

> I think I found your missing link:
> After filling in "QCowL2Meta *m", that request is queued:
>   QLIST_INSERT_HEAD(&s->cluster_allocs, m, next_in_flight);
> to prevent double allocation of the same cluster for overlapping requests, which 
> is checked in do_alloc_cluster_offset().
> 
> I guess that since the sector count was wrong, the overlap detection didn't 
> work and the two concurrent write requests to the same cluster overwrote each 
> other.

Meh.  And I already closed the debian bugreport... :)

But thank you Philipp for your excellent work on the matter!

/mjt
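
For readers following the analysis, here is a rough sketch of why an
undercounted sector range can defeat overlap detection. The struct and
function are hypothetical and much simpler than the real qcow2 code; they
only show the interval test that such a check relies on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record of an in-flight allocation, in sectors. */
struct inflight {
    uint64_t first_sector;
    uint64_t nb_sectors;
};

/* Two half-open ranges [first, first + nb) overlap if and only if each
 * one starts before the other ends. */
static bool inflight_overlaps(const struct inflight *a,
                              const struct inflight *b)
{
    return a->first_sector < b->first_sector + b->nb_sectors &&
           b->first_sector < a->first_sector + a->nb_sectors;
}
```

If the queued request covers sectors 0..127 but is recorded with only 16
sectors, a second request starting at sector 64 no longer appears to
overlap it, and both writes can proceed against the same cluster.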

Re: [Qemu-devel] [PATCH 01/10] ide: Break all non-qdevified controllers

2012-12-18 Thread Peter Maydell

On 17 December 2012 14:05, Markus Armbruster  wrote:
> The writing has been on the wall for a few years.

...behind a filing cabinet in a disused lavatory with a sign on the door
saying "beware of the leopard"?

We really need a better way to mark devices as "obsolete, will be
removed/broken/etc in a future version"...

-- PMM

[Qemu-devel] [Bug 1025244] Re: qcow2 image increasing disk size above the virtual limit

2012-12-18 Thread Andy Menzel

Is there any solution yet? I have a problem similar to Todor Andreev's.
Our daily backup of some virtual machines (qcow2) works like this:

1. shutdown the VM
2. create a snapshot via: "qemu-img snapshot -c nameofsnapshot..."
3. boot the VM
4. backup the snapshot to another virtual disk via: "qemu-img convert  -f qcow2 
-O qcow2 -s nameofsnapshot..."
5. DELETE the snapshot from VM via: qemu-img snapshot -d nameofsnapshot...

The problem is that the size of our original VM image grows steadily,
even though few changes were made.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1025244

Title:
  qcow2 image increasing disk size above the virtual limit

Status in QEMU:
  New
Status in “qemu-kvm” package in Ubuntu:
  Triaged

Bug description:
  Using qemu/kvm, qcow2 images, ext4 file systems on both guest and host
   Host and Guest: Ubuntu server 12.04 64bit
  To create an image I did this:

  qemu-img create -f qcow2 -o preallocation=metadata ubuntu-pdc-vda.img 
10737418240 (not sure about the exact bytes, but around this)
  ls -l ubuntu-pdc-vda.img
  fallocate -l theSizeInBytesFromAbove ubuntu-pdc-vda.img

  The problem is that the image is growing progressively and has
  obviously no limit, although I gave it one. The root filesystem's
  image is the same case:

  qemu-img info ubuntu-pdc-vda.img
   image: ubuntu-pdc-vda.img
   file format: qcow2
   virtual size: 10G (10737418240 bytes)
   disk size: 14G
   cluster_size: 65536

  and for confirmation:
   du -sh ubuntu-pdc-vda.img
   15G ubuntu-pdc-vda.img

  I made a test and saw that when I delete something from the guest, the real 
size of the image does not decrease (I read that this is normal). But when I write 
something again, it doesn't reuse the freed space and instead grows the image. 
So for example:
   1. The initial physical size of the image is 1GB.
   2. I copy 1GB of data in the guest. Its physical size becomes 2GB.
   3. I delete this data (1GB). The physical size of the image remains 2GB.
   4. I copy another 1GB of data to the guest.
   5. The physical size of the image becomes 3GB.
   6. And so on with no limit. It doesn't care if the virtual size is less.

  Is this normal - the real/physical size of the image to be larger than
  the virtual limit???

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1025244/+subscriptions

[Qemu-devel] [PATCH 0/3] virtio: don't poll masked vectors with irqfd

2012-12-18 Thread Michael S. Tsirkin

At the moment when vector is masked virtio will poll it
in userspace, even if it is handled by irqfd.
This is done in order to update pending bits, but
it's not really required until someone reads the pending bits.
On the other hand this read results in extra io thread wakeups.

As we only implement the pending bits as a compatibility
feature (read - real drivers don't use it), we can defer
the irqfd poll until the read actually happens.

This does not seem to affect vhost-net speed
in simple benchmarks but could help block: both
vhost-blk and dataplane when using irqfd,
and I also think this is cleaner than enabling/disabling
notifiers all the time.

This will also be the basis for future optimizations.

Michael S. Tsirkin (3):
  msi: add API to get notified about pending bit poll
  msix: expose access to masked/pending state
  virtio-pci: don't poll masked vectors

 hw/pci/msix.c   | 19 +++
 hw/pci/msix.h   |  6 +-
 hw/pci/pci.h|  4 
 hw/vfio_pci.c   |  2 +-
 hw/virtio-pci.c | 53 +
 5 files changed, 66 insertions(+), 18 deletions(-)

-- 
MST
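
The deferral described in this cover letter can be pictured with a small
model. This is a sketch under stated assumptions, not the code from the
series: the struct and function names are hypothetical, and the event
counter merely stands in for an irqfd-like source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one MSI-X vector backed by an event counter. */
struct vector_state {
    bool masked;
    bool pending;     /* value the guest sees in the pending bit array */
    unsigned queued;  /* events signalled while masked, not yet polled */
    unsigned polls;   /* how often userspace had to look at the counter */
};

/* Device side: signalling does not involve userspace at all. */
static void vector_signal(struct vector_state *v)
{
    v->queued++;
}

/* Guest reads the pending bit: only now is the counter polled. */
static bool vector_read_pending(struct vector_state *v)
{
    v->polls++;
    if (v->masked && v->queued) {
        v->pending = true;
        v->queued = 0;
    }
    return v->pending;
}
```

In this model any number of signals on a masked vector costs no userspace
work until the guest actually reads the pending bit.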

Re: [Qemu-devel] [PATCH 0/3] virtio: don't poll masked vectors with irqfd

2012-12-18 Thread Michael S. Tsirkin

On Tue, Dec 18, 2012 at 02:20:20PM +0200, Michael S. Tsirkin wrote:
> At the moment when vector is masked virtio will poll it
> in userspace, even if it is handled by irqfd.
> This is done in order to update pending bits, but
> it's not really required until someone reads the pending bits.
> On the other hand this read results in extra io thread wakeups.
> 
> As we only implement the pending bits as a compatibility
> feature (read - real drivers don't use it), we can defer
> the irqfd poll until the read actually happens.
> 
> This does not seem to affect vhost-net speed
> in simple benchmarks but could help block: both
> vhost-blk and dataplane when using irqfd,
> and I also think this is cleaner than enabling/disabling
> notifiers all the time.
> 
> This will also be the basis for future optimizations.

Note: this is on top of the typesafe bindings patch v3
I sent previously.
You can get the whole bundle from:
git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git pci

-- 
MST

[Qemu-devel] [RFC PATCH v4 01/30] [SeaBIOS] Add ACPI_EXTRACT_DEVICE* macros

2012-12-18 Thread Vasilis Liaskovitis

This allows extracting the beginning, end and name of a Device object.
---
 tools/acpi_extract.py |   28 
 1 files changed, 28 insertions(+), 0 deletions(-)

diff --git a/tools/acpi_extract.py b/tools/acpi_extract.py
index 3295678..3191f53 100755
--- a/tools/acpi_extract.py
+++ b/tools/acpi_extract.py
@@ -217,6 +217,28 @@ def aml_package_start(offset):
 offset += 1
 return offset + aml_pkglen_bytes(offset) + 1
 
+def aml_device_start(offset):
+#0x5B 0x82 DeviceOp PkgLength NameString
+if ((aml[offset] != 0x5B) or (aml[offset + 1] != 0x82)):
+die( "Name offset 0x%x: expected 0x5B 0x82 actual 0x%x 0x%x" %
+ (offset, aml[offset], aml[offset + 1]));
+return offset
+
+def aml_device_string(offset):
+#0x5B 0x82 DeviceOp PkgLength NameString
+start = aml_device_start(offset)
+offset += 2
+pkglenbytes = aml_pkglen_bytes(offset)
+offset += pkglenbytes
+return offset
+
+def aml_device_end(offset):
+start = aml_device_start(offset)
+offset += 2
+pkglenbytes = aml_pkglen_bytes(offset)
+pkglen = aml_pkglen(offset)
+return offset + pkglen
+
 lineno = 0
 for line in fileinput.input():
 # Strip trailing newline
@@ -307,6 +329,12 @@ for i in range(len(asl)):
 offset = aml_processor_end(offset)
 elif (directive == "ACPI_EXTRACT_PKG_START"):
 offset = aml_package_start(offset)
+elif (directive == "ACPI_EXTRACT_DEVICE_START"):
+offset = aml_device_start(offset)
+elif (directive == "ACPI_EXTRACT_DEVICE_STRING"):
+offset = aml_device_string(offset)
+elif (directive == "ACPI_EXTRACT_DEVICE_END"):
+offset = aml_device_end(offset)
 else:
 die("Unsupported directive %s" % directive)
 
-- 
1.7.9

[Qemu-devel] [RFC PATCH v4 06/30] qapi: make visit_type_size fallback to type_int

2012-12-18 Thread Vasilis Liaskovitis

Currently visit_type_size checks whether the visitor's type_size function
pointer is set. If so, it calls it; otherwise it calls v->type_uint64(). But
neither of these pointers is ever set. Fall back to calling v->type_int() in
this third (default) case.

Signed-off-by: Vasilis Liaskovitis 
---
 qapi/qapi-visit-core.c |   11 ++-
 1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
index 7a82b63..497e693 100644
--- a/qapi/qapi-visit-core.c
+++ b/qapi/qapi-visit-core.c
@@ -236,8 +236,17 @@ void visit_type_int64(Visitor *v, int64_t *obj, const char 
*name, Error **errp)
 
 void visit_type_size(Visitor *v, uint64_t *obj, const char *name, Error **errp)
 {
+int64_t value;
 if (!error_is_set(errp)) {
-(v->type_size ? v->type_size : v->type_uint64)(v, obj, name, errp);
+if (v->type_size) {
+v->type_size(v, obj, name, errp);
+} else if (v->type_uint64) {
+v->type_uint64(v, obj, name, errp);
+} else {
+value = *obj;
+v->type_int(v, &value, name, errp);
+*obj = value;
+}
 }
 }
 
-- 
1.7.9
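
The fallback chain from this patch can be tried out in isolation with a
cut-down visitor. The Visitor struct below is a hypothetical reduction
(no name or Error arguments) and the two callbacks exist only for the
example; the control flow mirrors the hunk above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced visitor: only the three callbacks of interest. */
typedef struct Visitor Visitor;
struct Visitor {
    void (*type_size)(Visitor *v, uint64_t *obj);
    void (*type_uint64)(Visitor *v, uint64_t *obj);
    void (*type_int)(Visitor *v, int64_t *obj);
};

/* Prefer type_size, then type_uint64, and finally fall back to type_int. */
static void visit_size(Visitor *v, uint64_t *obj)
{
    if (v->type_size) {
        v->type_size(v, obj);
    } else if (v->type_uint64) {
        v->type_uint64(v, obj);
    } else {
        int64_t value = (int64_t)*obj;
        v->type_int(v, &value);
        *obj = (uint64_t)value;
    }
}

/* Example callbacks that produce fixed values. */
static void int_4096(Visitor *v, int64_t *obj) { (void)v; *obj = 4096; }
static void size_512(Visitor *v, uint64_t *obj) { (void)v; *obj = 512; }
```

One consequence of going through int64_t is that sizes above INT64_MAX
cannot be represented on the fallback path.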

[Qemu-devel] [RFC PATCH v4 14/30] piix_pci: Add i440fx dram controller initialization

2012-12-18 Thread Vasilis Liaskovitis

Also introduce function to adjust memory map for hotplug-able dimms.

Signed-off-by: Vasilis Liaskovitis 
---
 hw/pc_piix.c  |6 +++---
 hw/piix_pci.c |   30 --
 2 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 6a9b508..fe995b9 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -95,9 +95,9 @@ static void pc_init1(MemoryRegion *system_memory,
 kvmclock_create();
 }
 
-if (ram_size >= 0xe0000000 ) {
-above_4g_mem_size = ram_size - 0xe0000000;
-below_4g_mem_size = 0xe0000000;
+if (ram_size >= I440FX_PCI_HOLE_START) {
+above_4g_mem_size = ram_size - I440FX_PCI_HOLE_START;
+below_4g_mem_size = I440FX_PCI_HOLE_START;
 } else {
 above_4g_mem_size = 0;
 below_4g_mem_size = ram_size;
diff --git a/hw/piix_pci.c b/hw/piix_pci.c
index 7ca3c73..9866b1d 100644
--- a/hw/piix_pci.c
+++ b/hw/piix_pci.c
@@ -125,6 +125,25 @@ static const VMStateDescription vmstate_i440fx = {
 }
 };
 
+hwaddr i440fx_pmc_dimm_offset(DeviceState *dev, uint64_t size)
+{
+PCII440FXState *d = I440FX_PCI_DEVICE(dev);
+hwaddr ret;
+
+/* if dimm fits before pci hole, append it normally */
+if (d->below_4g_mem_size + size <= I440FX_PCI_HOLE_START) {
+ret = d->below_4g_mem_size;
+d->below_4g_mem_size += size;
+}
+/* otherwise place it above 4GB */
+else {
+ret = 0x100000000LL + d->above_4g_mem_size;
+d->above_4g_mem_size += size;
+}
+
+return ret;
+}
+
 static void i440fx_pcihost_initfn(Object *obj)
 {
 I440FXState *s = I440FX_HOST_DEVICE(obj);
@@ -148,8 +167,8 @@ static int i440fx_pcihost_init(SysBusDevice *dev)
 sysbus_add_io(dev, 0xcfc, &pci->data_mem);
 sysbus_init_ioports(&pci->busdev, 0xcfc, 4);
 
-b = pci_bus_new(&s->parent_obj.busdev.qdev, NULL, s->mch.pci_address_space,
-s->mch.address_space_io, 0);
+b = pci_bus_new(&s->parent_obj.busdev.qdev, "pci.0",
+s->mch.pci_address_space, s->mch.address_space_io, 0);
 s->parent_obj.bus = b;
 qdev_set_parent_bus(DEVICE(&s->mch), BUS(b));
 qdev_init_nofail(DEVICE(&s->mch));
@@ -169,6 +188,13 @@ static int i440fx_initfn(PCIDevice *dev)
 
 pci_hole64_size = (sizeof(hwaddr) == 4 ? 0 :
((uint64_t)1 << 62));
+
+/* Initialize i440fx's DRAM channel, it can hold up to 8 DRAM ranks */
+f->dram_channel0 = dimm_bus_create(OBJECT(f), "membus.0", 8,
+i440fx_pmc_dimm_offset);
+/* Initialize paravirtual memory bus */
+f->pv_dram_channel = dimm_bus_create(OBJECT(f), "membus.pv", 0,
+i440fx_pmc_dimm_offset);
 memory_region_init_alias(&f->pci_hole, "pci-hole", f->pci_address_space,
  f->below_4g_mem_size,
                          0x100000000LL - f->below_4g_mem_size);
-- 
1.7.9
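
A standalone sketch of the placement rule in i440fx_pmc_dimm_offset()
may help when reviewing it. The constant PCI_HOLE_START is assumed here
to be 0xe0000000, the literal that the pc_piix.c hunk replaces with
I440FX_PCI_HOLE_START; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_HOLE_START 0xe0000000ULL   /* assumed start of the PCI hole */
#define FOUR_GB        0x100000000ULL

/* Hypothetical reduction of the chipset state to the two counters. */
struct mem_layout {
    uint64_t below_4g;
    uint64_t above_4g;
};

/* Return the guest physical address for a new DIMM and update the layout. */
static uint64_t dimm_offset(struct mem_layout *m, uint64_t size)
{
    uint64_t ret;

    if (m->below_4g + size <= PCI_HOLE_START) {
        /* DIMM fits before the PCI hole: append it to low memory. */
        ret = m->below_4g;
        m->below_4g += size;
    } else {
        /* Otherwise place it above 4GB. */
        ret = FOUR_GB + m->above_4g;
        m->above_4g += size;
    }
    return ret;
}
```

With 0xd0000000 bytes of low memory, a first 256MB DIMM lands at
0xd0000000 and fills low memory up to the hole; a second one is placed at
the 4GB boundary.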

[Qemu-devel] [RFC PATCH v4 15/30] q35: Add i440fx dram controller initialization

2012-12-18 Thread Vasilis Liaskovitis

Create memory buses and introduce function to adjust memory map for
hotplug-able dimms.

Signed-off-by: Vasilis Liaskovitis 
---
 hw/pc_q35.c |1 +
 hw/q35.c|   27 +++
 hw/q35.h|5 +
 3 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/hw/pc_q35.c b/hw/pc_q35.c
index 3429a9a..e6375bf 100644
--- a/hw/pc_q35.c
+++ b/hw/pc_q35.c
@@ -41,6 +41,7 @@
 #include "hw/ide/pci.h"
 #include "hw/ide/ahci.h"
 #include "hw/usb.h"
+#include "fw_cfg.h"
 
 /* ICH9 AHCI has 6 ports */
 #define MAX_SATA_PORTS 6
diff --git a/hw/q35.c b/hw/q35.c
index efebc27..cc27d72 100644
--- a/hw/q35.c
+++ b/hw/q35.c
@@ -236,12 +236,39 @@ static void mch_reset(DeviceState *qdev)
 mch_update(mch);
 }
 
+static hwaddr mch_dimm_offset(DeviceState *dev, uint64_t size)
+{
+MCHPCIState *d = MCH_PCI_DEVICE(dev);
+hwaddr ret;
+
+/* if dimm fits before pci hole, append it normally */
+if (d->below_4g_mem_size + size <= MCH_HOST_BRIDGE_PCIEXBAR_DEFAULT) {
+ret = d->below_4g_mem_size;
+d->below_4g_mem_size += size;
+}
+/* otherwise place it above 4GB */
+else {
+ret = 0x100000000LL + d->above_4g_mem_size;
+d->above_4g_mem_size += size;
+}
+
+return ret;
+}
+
 static int mch_init(PCIDevice *d)
 {
 int i;
 hwaddr pci_hole64_size;
 MCHPCIState *mch = MCH_PCI_DEVICE(d);
 
+/* Initialize 2 GMC DRAM channels x 4 DRAM ranks each */
+mch->dram_channel[0] = dimm_bus_create(OBJECT(d), "membus.0", 4,
+mch_dimm_offset);
+mch->dram_channel[1] = dimm_bus_create(OBJECT(d), "membus.1", 4,
+mch_dimm_offset);
+/* Initialize paravirtual memory bus */
+mch->pv_dram_channel = dimm_bus_create(OBJECT(d), "membus.pv", 0,
+mch_dimm_offset);
 /* setup pci memory regions */
 memory_region_init_alias(&mch->pci_hole, "pci-hole",
  mch->pci_address_space,
diff --git a/hw/q35.h b/hw/q35.h
index e34f7c1..bf76dc8 100644
--- a/hw/q35.h
+++ b/hw/q35.h
@@ -34,6 +34,7 @@
 #include "acpi.h"
 #include "acpi_ich9.h"
 #include "pam.h"
+#include "dimm.h"
 
 #define TYPE_Q35_HOST_DEVICE "q35-pcihost"
 #define Q35_HOST_DEVICE(obj) \
@@ -56,6 +57,10 @@ typedef struct MCHPCIState {
 uint8_t smm_enabled;
 ram_addr_t below_4g_mem_size;
 ram_addr_t above_4g_mem_size;
+/* GMCH allows for 2 DRAM channels x 4 DRAM ranks each */
+DimmBus * dram_channel[2];
+/* paravirtual memory bus */
+DimmBus *pv_dram_channel;
 } MCHPCIState;
 
 typedef struct Q35PCIHost {
-- 
1.7.9

[Qemu-devel] [RFC PATCH v4 27/30] [SeaBIOS] Add _OST dimm method

2012-12-18 Thread Vasilis Liaskovitis

Add support for the _OST method. _OST writes the dimm slot number to the
appropriate I/O byte to signal success or failure of a hot-add or hot-remove
to qemu.
---
 src/acpi-dsdt-mem-hotplug.dsl |   51 -
 src/ssdt-mem.dsl  |4 +++
 2 files changed, 54 insertions(+), 1 deletions(-)

diff --git a/src/acpi-dsdt-mem-hotplug.dsl b/src/acpi-dsdt-mem-hotplug.dsl
index fd73ea7..a648bee 100644
--- a/src/acpi-dsdt-mem-hotplug.dsl
+++ b/src/acpi-dsdt-mem-hotplug.dsl
@@ -27,7 +27,28 @@ Scope(\_SB) {
 {
 MPE, 8
 }
-
+
+/* Memory hot-remove notify failure byte */
+OperationRegion(MEEF, SystemIO, 0xafa1, 1)
+Field (MEEF, ByteAcc, NoLock, Preserve)
+{
+MEF, 8
+}
+
+/* Memory hot-add notify success byte */
+OperationRegion(MPIS, SystemIO, 0xafa2, 1)
+Field (MPIS, ByteAcc, NoLock, Preserve)
+{
+MIS, 8
+}
+
+/* Memory hot-add notify failure byte */
+OperationRegion(MPIF, SystemIO, 0xafa3, 1)
+Field (MPIF, ByteAcc, NoLock, Preserve)
+{
+MIF, 8
+}
+
 Method(MESC, 0) {
 // Local5 = active memdevice bitmap
 Store (MES, Local5)
@@ -69,4 +90,32 @@ Scope(\_SB) {
 Sleep(200)
 }
 
+Method (MOST, 3, Serialized) {
+// _OST method - OS status indication
+Switch (And(Arg0, 0xFF)) {
+Case(0x3)
+{
+Switch(And(Arg1, 0xFF)) {
+Case(0x1) {
+Store(Arg2, MEF)
+// Revert MEON flag for this memory device to one
+Store(One, Index(MEON, Arg2))
+}
+}
+}
+Case(0x1)
+{
+Switch(And(Arg1, 0xFF)) {
+Case(0x0) {
+Store(Arg2, MIS)
+}
+Case(0x1) {
+Store(Arg2, MIF)
+// Revert MEON flag for this memory device to zero
+Store(Zero, Index(MEON, Arg2))
+}
+}
+}
+}
+}
 }
diff --git a/src/ssdt-mem.dsl b/src/ssdt-mem.dsl
index eef84b6..47a3b4f 100644
--- a/src/ssdt-mem.dsl
+++ b/src/ssdt-mem.dsl
@@ -38,6 +38,7 @@ DefinitionBlock ("ssdt-mem.aml", "SSDT", 0x02, "BXPC", "CSSDT", 0x1)
 
 External(CMST, MethodObj)
 External(MPEJ, MethodObj)
+External(MOST, MethodObj)
 
 Name(_CRS, ResourceTemplate() {
 QwordMemory(
@@ -60,6 +61,9 @@ DefinitionBlock ("ssdt-mem.aml", "SSDT", 0x02, "BXPC", "CSSDT", 0x1)
 Method (_EJ0, 1, NotSerialized) {
 MPEJ(ID, Arg0)
 }
+Method (_OST, 3) {
+MOST(Arg0, Arg1, ID)
+}
 }
 }
 
-- 
1.7.9
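
For readers less used to ASL, the dispatch done by MOST above can be written
as a small C function. This is a sketch, not part of the patch: ost_port and
the enum names are hypothetical, and reading Arg0 value 1 as "device check
(hot-add)", Arg0 value 3 as "eject request" and Arg1 value 0/1 as
success/failure is my interpretation of the _OST arguments. The port numbers
are the ones the patch declares for MEF, MIS and MIF.

```c
#include <stdint.h>

/* I/O ports declared as operation regions in the patch. */
enum {
    PORT_REMOVE_FAIL = 0xafa1, /* MEF */
    PORT_ADD_SUCCESS = 0xafa2, /* MIS */
    PORT_ADD_FAIL    = 0xafa3, /* MIF */
};

/*
 * Return the port the slot number would be written to for a given
 * _OST (event, status) pair, or 0 if the pair is ignored.
 */
static uint16_t ost_port(uint32_t event, uint32_t status)
{
    switch (event & 0xff) {
    case 0x3: /* assumed: eject request */
        return (status & 0xff) == 0x1 ? PORT_REMOVE_FAIL : 0;
    case 0x1: /* assumed: device check, i.e. hot-add */
        switch (status & 0xff) {
        case 0x0:
            return PORT_ADD_SUCCESS;
        case 0x1:
            return PORT_ADD_FAIL;
        }
        return 0;
    }
    return 0;
}
```

A successful eject is not reported here; as far as I can tell it is signalled
through the existing _EJ0 path instead.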

[Qemu-devel] [RFC PATCH v4 30/30] Implement _PS3 for dimm

2012-12-18 Thread Vasilis Liaskovitis

This will allow us to update dimm state on OSPM-initiated eject operations e.g.
with "echo 1 > /sys/bus/acpi/devices/PNP0C80\:00/eject"

v3->v4: Add support for ich9
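
To make the register layout easier to follow, here is a rough sketch of how a
write offset inside the memhp I/O window maps to a dimm event. It assumes the
ICH9 port numbers from this series (base 0xaf80, eject byte at 0xafa0) and
that DIMM_BITMAP_BYTES is 32, which is only inferred from those two values.
memhp_decode and the DimmEvent enum are hypothetical names, not qemu code.

```c
#include <stdint.h>

#define MEM_BASE            0xaf80
#define MEM_EJ_BASE         0xafa0
#define MEM_OST_REMOVE_FAIL 0xafa1
#define MEM_OST_ADD_SUCCESS 0xafa2
#define MEM_OST_ADD_FAIL    0xafa3
#define MEM_PS3             0xafa4
#define DIMM_BITMAP_BYTES   32      /* assumed: MEM_EJ_BASE - MEM_BASE */

typedef enum {
    EV_NONE,
    EV_REMOVE_SUCCESS,
    EV_REMOVE_FAIL,
    EV_ADD_SUCCESS,
    EV_ADD_FAIL,
    EV_OSPM_POWEROFF,
} DimmEvent;

/* Map an offset relative to MEM_BASE to the event it signals. */
static DimmEvent memhp_decode(uint32_t addr)
{
    switch (addr) {
    case MEM_EJ_BASE - MEM_BASE:         return EV_REMOVE_SUCCESS;
    case MEM_OST_REMOVE_FAIL - MEM_BASE: return EV_REMOVE_FAIL;
    case MEM_OST_ADD_SUCCESS - MEM_BASE: return EV_ADD_SUCCESS;
    case MEM_OST_ADD_FAIL - MEM_BASE:    return EV_ADD_FAIL;
    case MEM_PS3 - MEM_BASE:             return EV_OSPM_POWEROFF;
    default:                             return EV_NONE;
    }
}
```

With five notification bytes after the bitmap, the window has to span
DIMM_BITMAP_BYTES + 5 bytes, which is why both .len and the region size grow
from 4 to 5 in this patch.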
---
 docs/specs/acpi_hotplug.txt |7 +++
 hw/acpi_ich9.c  |7 +--
 hw/acpi_ich9.h  |1 +
 hw/acpi_piix4.c |9 ++---
 hw/dimm.c   |4 
 hw/dimm.h   |3 ++-
 6 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/docs/specs/acpi_hotplug.txt b/docs/specs/acpi_hotplug.txt
index 536da16..69868fe 100644
--- a/docs/specs/acpi_hotplug.txt
+++ b/docs/specs/acpi_hotplug.txt
@@ -45,3 +45,10 @@ insertion failed.
 Written by ACPI memory device _OST method to notify qemu of failed
 hot-add.  Write-only.
 
+Memory Dimm _PS3 power-off initiated by OSPM (IO port 0xafa4, 1-byte access):
+---
+Dimm power-off (_PS3) initiated by OSPM. Byte value indicates Dimm slot which
+entered D3 state.
+
+Written by ACPI memory device _PS3 method to notify qemu of power-off state for
+the dimm.  Write-only.
diff --git a/hw/acpi_ich9.c b/hw/acpi_ich9.c
index 2705230..5e7fca6 100644
--- a/hw/acpi_ich9.c
+++ b/hw/acpi_ich9.c
@@ -120,6 +120,9 @@ static void memhp_writeb(void *opaque, uint32_t addr, uint32_t val)
 case ICH9_MEM_OST_ADD_FAIL - ICH9_MEM_BASE:
 dimm_notify(val, DIMM_ADD_FAIL);
 break;
+case ICH9_MEM_PS3 - ICH9_MEM_BASE:
+dimm_notify(val, DIMM_OSPM_POWEROFF);
+break;
 default:
 ICH9_DEBUG("memhp write invalid %x <== %d\n", addr, val);
 }
@@ -134,7 +137,7 @@ static const MemoryRegionOps ich9_memhp_ops = {
 },
 {
 .offset = ICH9_MEM_EJ_BASE - ICH9_MEM_BASE,
-.len = 4, .size = 1,
+.len = 5, .size = 1,
 .write = memhp_writeb,
 },
 PORTIO_END_OF_LIST()
@@ -321,7 +324,7 @@ void ich9_pm_init(void *device, qemu_irq sci_irq, qemu_irq cmos_s3)
 memory_region_add_subregion(&pm->io, ICH9_PMIO_SMI_EN, &pm->io_smi);
 
 memory_region_init_io(&pm->io_memhp, &ich9_memhp_ops, pm, "apci-memhp0",
-  DIMM_BITMAP_BYTES + 4);
+  DIMM_BITMAP_BYTES + 5);
 memory_region_add_subregion(get_system_io(), ICH9_MEM_BASE, &pm->io_memhp);
 
 dimm_bus_hotplug(ich9_dimm_hotplug, ich9_dimm_revert, &lpc->d.qdev);
diff --git a/hw/acpi_ich9.h b/hw/acpi_ich9.h
index 8f57cd8..816d453 100644
--- a/hw/acpi_ich9.h
+++ b/hw/acpi_ich9.h
@@ -29,6 +29,7 @@
 #define ICH9_MEM_OST_REMOVE_FAIL 0xafa1
 #define ICH9_MEM_OST_ADD_SUCCESS 0xafa2
 #define ICH9_MEM_OST_ADD_FAIL 0xafa3
+#define ICH9_MEM_PS3 0xafa4
 
 typedef struct ICH9LPCPMRegs {
 /*
diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
index 70aa480..6c953c2 100644
--- a/hw/acpi_piix4.c
+++ b/hw/acpi_piix4.c
@@ -54,6 +54,7 @@
 #define MEM_OST_REMOVE_FAIL 0xafa1
 #define MEM_OST_ADD_SUCCESS 0xafa2
 #define MEM_OST_ADD_FAIL 0xafa3
+#define MEM_PS3 0xafa4
 
 #define PIIX4_MEM_HOTPLUG_STATUS 8
 #define PIIX4_PCI_HOTPLUG_STATUS 2
@@ -564,6 +565,9 @@ static void memhp_writeb(void *opaque, uint32_t addr, uint32_t val)
 case MEM_OST_ADD_FAIL - MEM_BASE:
 dimm_notify(val, DIMM_ADD_FAIL);
 break;
+case MEM_PS3 - MEM_BASE:
+dimm_notify(val, DIMM_OSPM_POWEROFF);
+break;
 default:
 PIIX4_DPRINTF("memhp write invalid %x <== %d\n", addr, val);
 }
@@ -577,7 +581,7 @@ static const MemoryRegionOps piix4_memhp_ops = {
 .read = memhp_readb,
 },
 {
-.offset = MEM_EJ_BASE - MEM_BASE, .len = 4,
+.offset = MEM_EJ_BASE - MEM_BASE, .len = 5,
 .size = 1,
 .write = memhp_writeb,
 },
@@ -666,7 +670,7 @@ static void piix4_acpi_system_hot_add_init(PCIBus *bus, PIIX4PMState *s)
 memory_region_add_subregion(get_system_io(), PCI_HOTPLUG_ADDR,
 &s->io_pci);
 memory_region_init_io(&s->io_memhp, &piix4_memhp_ops, s, "apci-memhp0",
-  DIMM_BITMAP_BYTES + 4);
+  DIMM_BITMAP_BYTES + 5);
 memory_region_add_subregion(get_system_io(), MEM_BASE, &s->io_memhp);
 
 for (i = 0; i < DIMM_BITMAP_BYTES; i++) {
@@ -726,7 +730,6 @@ static int piix4_dimm_revert(DeviceState *qdev, DimmDevice *dev, int add)
 struct gpe_regs *g = &s->gperegs;
 DimmDevice *slot = DIMM(dev);
 int idx = slot->idx;
-
 if (add) {
 g->mems_sts[idx/8] &= ~(1 << (idx%8));
 } else {
diff --git a/hw/dimm.c b/hw/dimm.c
index 69b97b6..2454e38 100644
--- a/hw/dimm.c
+++ b/hw/dimm.c
@@ -407,6 +407,10 @@ void dimm_notify(uint32_t idx, uint32_t event)
 qdev_unplug_complete((DeviceState *)slot, NULL);
 QTAILQ_REMOVE(&bus->dimmlist, slot, nextdimm);
 QTAILQ_INSERT_TAIL(&bus->dimm_hp_result_queue, result, next);
+case DIMM_OSPM_POWEROFF:
+if (bus->dimm_revert) {
+bus->dimm_revert

[Qemu-devel] [RFC PATCH v4 07/30] Add SIZE type to qdev properties

2012-12-18 Thread Vasilis Liaskovitis

This patch adds a 'SIZE' type property to qdev.

It makes dimm descriptions more convenient by allowing sizes to be specified
with K, M, G or T suffixes instead of as a raw number of bytes, e.g.:
-device dimm,id=mem0,size=2G,bus=membus.0

Credits go to Ian Molton for original patch. See:
http://patchwork.ozlabs.org/patch/38835/

Signed-off-by: Vasilis Liaskovitis 
---
 hw/qdev-properties.c |   60 ++
 hw/qdev-properties.h |3 ++
 qemu-option.c|2 +-
 qemu-option.h|2 +
 4 files changed, 66 insertions(+), 1 deletions(-)

diff --git a/hw/qdev-properties.c b/hw/qdev-properties.c
index 81d901c..a77f760 100644
--- a/hw/qdev-properties.c
+++ b/hw/qdev-properties.c
@@ -1279,3 +1279,63 @@ void qemu_add_globals(void)
 {
 qemu_opts_foreach(qemu_find_opts("global"), qdev_add_one_global, NULL, 0);
 }
+
+/* --- 64bit unsigned int 'size' type --- */
+
+static void get_size(Object *obj, Visitor *v, void *opaque,
+   const char *name, Error **errp)
+{
+DeviceState *dev = DEVICE(obj);
+Property *prop = opaque;
+uint64_t *ptr = qdev_get_prop_ptr(dev, prop);
+
+visit_type_size(v, ptr, name, errp);
+}
+
+static void set_size(Object *obj, Visitor *v, void *opaque,
+   const char *name, Error **errp)
+{
+DeviceState *dev = DEVICE(obj);
+Property *prop = opaque;
+uint64_t *ptr = qdev_get_prop_ptr(dev, prop);
+
+if (dev->state != DEV_STATE_CREATED) {
+error_set(errp, QERR_PERMISSION_DENIED);
+return;
+}
+
+visit_type_size(v, ptr, name, errp);
+}
+
+static int parse_size(DeviceState *dev, Property *prop, const char *str)
+{
+uint64_t *ptr = qdev_get_prop_ptr(dev, prop);
+Error *errp = NULL;
+
+if (str != NULL) {
+parse_option_size(prop->name, str, ptr, &errp);
+}
+assert_no_error(errp);
+return 0;
+}
+
+static int print_size(DeviceState *dev, Property *prop, char *dest, size_t len)
+{
+uint64_t *ptr = qdev_get_prop_ptr(dev, prop);
+char suffixes[] = {'T', 'G', 'M', 'K', 'B'};
+int i = 0;
+uint64_t div;
+
+for (div = (uint64_t)1 << 40; div > 1 && !(*ptr / div); div >>= 10) {
+i++;
+}
+return snprintf(dest, len, "%0.03f%c", (double)*ptr/div, suffixes[i]);
+}
+
+PropertyInfo qdev_prop_size = {
+.name  = "size",
+.parse = parse_size,
+.print = print_size,
+.get = get_size,
+.set = set_size,
+};
diff --git a/hw/qdev-properties.h b/hw/qdev-properties.h
index 5b046ab..0182bef 100644
--- a/hw/qdev-properties.h
+++ b/hw/qdev-properties.h
@@ -14,6 +14,7 @@ extern PropertyInfo qdev_prop_uint64;
 extern PropertyInfo qdev_prop_hex8;
 extern PropertyInfo qdev_prop_hex32;
 extern PropertyInfo qdev_prop_hex64;
+extern PropertyInfo qdev_prop_size;
 extern PropertyInfo qdev_prop_string;
 extern PropertyInfo qdev_prop_chr;
 extern PropertyInfo qdev_prop_ptr;
@@ -67,6 +68,8 @@ extern PropertyInfo qdev_prop_pci_host_devaddr;
 DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_hex32, uint32_t)
 #define DEFINE_PROP_HEX64(_n, _s, _f, _d)   \
 DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_hex64, uint64_t)
+#define DEFINE_PROP_SIZE(_n, _s, _f, _d)   \
+DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_size, uint64_t)
 #define DEFINE_PROP_PCI_DEVFN(_n, _s, _f, _d)   \
 DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_pci_devfn, int32_t)
 
diff --git a/qemu-option.c b/qemu-option.c
index 27891e7..38e0a11 100644
--- a/qemu-option.c
+++ b/qemu-option.c
@@ -203,7 +203,7 @@ static void parse_option_number(const char *name, const char *value,
 }
 }
 
-static void parse_option_size(const char *name, const char *value,
+void parse_option_size(const char *name, const char *value,
   uint64_t *ret, Error **errp)
 {
 char *postfix;
diff --git a/qemu-option.h b/qemu-option.h
index ca72986..b8ee5b3 100644
--- a/qemu-option.h
+++ b/qemu-option.h
@@ -152,5 +152,7 @@ typedef int (*qemu_opts_loopfunc)(QemuOpts *opts, void *opaque);
 int qemu_opts_print(QemuOpts *opts, void *dummy);
 int qemu_opts_foreach(QemuOptsList *list, qemu_opts_loopfunc func, void *opaque,
   int abort_on_failure);
+void parse_option_size(const char *name, const char *value,
+  uint64_t *ret, Error **errp);
 
 #endif
-- 
1.7.9
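
As a standalone illustration of what a 'size' property has to do, the sketch
below parses a suffixed size and prints it back in the same "%.3f" plus suffix
style as print_size() above. It is not the qemu code: parse_size_suffix and
format_size are hypothetical helpers, and unlike parse_option_size() this
version only accepts whole numbers (so "1.5G" is rejected) and treats a
missing suffix as bytes.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse strings such as "2G" or "512M". Returns 0 on success, -1 on error. */
static int parse_size_suffix(const char *str, uint64_t *out)
{
    char *end;
    unsigned long long val;
    unsigned shift;

    if (str[0] < '0' || str[0] > '9') {
        return -1;                      /* reject signs, blanks, empty input */
    }
    val = strtoull(str, &end, 10);
    switch (*end) {
    case 'T': shift = 40; break;
    case 'G': shift = 30; break;
    case 'M': shift = 20; break;
    case 'K': case 'k': shift = 10; break;
    case 'B': case 'b': case '\0': shift = 0; break;
    default: return -1;
    }
    if (*end != '\0' && end[1] != '\0') {
        return -1;                      /* trailing garbage after the suffix */
    }
    if (val > (UINT64_MAX >> shift)) {
        return -1;                      /* would overflow 64 bits */
    }
    *out = (uint64_t)val << shift;
    return 0;
}

/* Print `size` scaled to the largest unit that keeps the value >= 1. */
static int format_size(uint64_t size, char *dest, size_t len)
{
    static const char suffixes[] = { 'T', 'G', 'M', 'K', 'B' };
    uint64_t div = (uint64_t)1 << 40;
    int i = 0;

    /* Stop at div == 1 so that a size of 0 cannot run past 'B'. */
    while (div > 1 && size / div == 0) {
        div >>= 10;
        i++;
    }
    return snprintf(dest, len, "%.3f%c", (double)size / (double)div,
                    suffixes[i]);
}
```

With this, the size=2G from the example command line would be stored as
2147483648 and printed back as "2.000G".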

Re: [Qemu-devel] [PATCH 01/10] ide: Break all non-qdevified controllers

2012-12-18 Thread Markus Armbruster

Andreas Färber  writes:

> Am 17.12.2012 15:43, schrieb Markus Armbruster:
>> Alexander Graf  writes:
>> 
>>> On 17.12.2012, at 15:05, Markus Armbruster wrote:
>>>
>>>> They complicate IDE data structures and keep getting in the way.
>>>> Also, TRIM support (commit d353fb72) is broken for them, because
>>>> ide_identify() accesses IDEDevice member conf, but IDEDevice exists
>>>> only with qdevified controllers.
>>>>
>>>> The non-qdevified controllers are still there, but attempting to
>>>> connect devices to them fails with "IDE controller not qdevified yet;
>>>> drive  ignored".
>>>>
>>>> Affected machines:
>>>>
>>>> * g3beige's first IDE channel (MacIO)
>>>>   -hda, -hdb are on first channel, and no longer work
>>>>   -hdc, -hdd are on second channel, and still work
>>>> * mac99's second and third IDE channel (MacIO)
>>>>   All four IDE drives no longer work
>>>
>>> Nack. This breaks the default targets of qemu-system-ppc and
>>> qemu-system-ppc64.
>> 
>> Please tell us how much more time you want to qdevify IDE for these
>> targets.  Thanks!
>
> I believe I have a branch with macio QOM'ifications somewhere that I
> could revive.

I'd appreciate that.

>   Note that I know little about IDE or block layer and
> mainly care about consistent infrastructure there; I vaguely remember
> something about the mac's IDE channels being mixed together from two
> devices unlike real hardware, guess I would be unable to fix that.

Yes, g3beige's first IDE channel is MacIO (not qdevified), second is
CMD646 (qdevified), and mac99's first IDE channel is unimplemented,
second and third are MacIO (not qdevified).  To what extent that matches
real hardware, I don't know.

> As for your question, 2013 and a gentle reminder to all involved would
> be nice. :)

In my experience with IDE qdevification, gentle reminders do not work.
But I'd be delighted to be proven wrong.

> In particular we have the Soft Freeze coming up shortly
> after the holidays, so is this needed for 1.4 Soft Freeze or can it be
> deferred to 1.5 or done during the 1.4 Soft Freeze?
>
> If Aurélien (CC'ed) doesn't manage, I can look at r2d as well.
> CC'ing Peter and Andrzej for the arm devices.

In time for 1.4 would be good, because it's in the way of the block
backend configuration work I'd like to get into 1.4.  If I can pull it
off in time.  Having to hack around IDE silliness that cannot be cleaned
up while we still have non-qdevified controllers isn't helping :)

[Qemu-devel] [RFC PATCH v4 17/30] [SeaBIOS] pci: Use paravirt interface for pcimem_start and pcimem64_start

2012-12-18 Thread Vasilis Liaskovitis

Initialize the 32-bit and 64-bit pci starting offsets from values passed in by
the qemu paravirt interface QEMU_CFG_PCI_WINDOW. Qemu calculates the starting
offsets based on initial memory and hotplug-able dimms.
It's possible to avoid the new paravirt interface and calculate the pci ranges
from SRAT entries instead, but the required code changes are ugly, see:
http://lists.gnu.org/archive/html/qemu-devel/2012-09/msg03548.html
---
 src/paravirt.c |6 ++
 src/paravirt.h |2 ++
 src/pciinit.c  |9 +
 3 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/src/paravirt.c b/src/paravirt.c
index 4b5c441..f7517b9 100644
--- a/src/paravirt.c
+++ b/src/paravirt.c
@@ -347,3 +347,9 @@ void qemu_cfg_romfile_setup(void)
 dprintf(3, "Found fw_cfg file: %s (size=%d)\n", file->name, file->size);
 }
 }
+
+void qemu_cfg_get_pci_offsets(u64 *pcimem_start, u64 *pcimem64_start)
+{
+qemu_cfg_read_entry(pcimem_start, QEMU_CFG_PCI_WINDOW, sizeof(u64));
+qemu_cfg_read((u8*)(pcimem64_start), sizeof(u64));
+}
diff --git a/src/paravirt.h b/src/paravirt.h
index a284c41..b53ff88 100644
--- a/src/paravirt.h
+++ b/src/paravirt.h
@@ -35,6 +35,7 @@ static inline int kvm_para_available(void)
 #define QEMU_CFG_BOOT_MENU  0x0e
 #define QEMU_CFG_MAX_CPUS   0x0f
 #define QEMU_CFG_FILE_DIR   0x19
+#define QEMU_CFG_PCI_WINDOW 0x1a
 #define QEMU_CFG_ARCH_LOCAL 0x8000
 #define QEMU_CFG_ACPI_TABLES(QEMU_CFG_ARCH_LOCAL + 0)
 #define QEMU_CFG_SMBIOS_ENTRIES (QEMU_CFG_ARCH_LOCAL + 1)
@@ -65,5 +66,6 @@ struct e820_reservation {
 u32 qemu_cfg_e820_entries(void);
 void* qemu_cfg_e820_load_next(void *addr);
 void qemu_cfg_romfile_setup(void);
+void qemu_cfg_get_pci_offsets(u64 *pcimem_start, u64 *pcimem64_start);
 
 #endif
diff --git a/src/pciinit.c b/src/pciinit.c
index a406bbd..4103d2d 100644
--- a/src/pciinit.c
+++ b/src/pciinit.c
@@ -734,6 +734,7 @@ static void pci_bios_map_devices(struct pci_bus *busses)
 void
 pci_setup(void)
 {
+u64 pv_pcimem_start, pv_pcimem64_start;
 if (CONFIG_COREBOOT || usingXen()) {
 // PCI setup already done by coreboot or Xen - just do probe.
 pci_probe_devices();
@@ -769,5 +770,13 @@ pci_setup(void)
 
 pci_bios_init_devices();
 
+/* if qemu gives us other pci window values, it means there are hotplug-able
+ * dimms. Adjust accordingly */
+qemu_cfg_get_pci_offsets(&pv_pcimem_start, &pv_pcimem64_start);
+if (pv_pcimem_start > pcimem_start)
+pcimem_start = pv_pcimem_start;
+if (pv_pcimem64_start > pcimem64_start)
+pcimem64_start = pv_pcimem64_start;
+
 free(busses);
 }
-- 
1.7.9

Re: [Qemu-devel] [PATCH v8 00/12] virtio: virtio-blk data plane

2012-12-18 Thread Stefan Hajnoczi

On Mon, Dec 17, 2012 at 11:05 AM, Stefan Hajnoczi  wrote:
> Note: v8 is a small change, if you have reviewed v7 then the code is almost
> totally unchanged.
>
> This series adds the -device virtio-blk-pci,x-data-plane=on property that
> enables a high performance I/O codepath.  A dedicated thread is used to
> process virtio-blk requests outside the global mutex and without going
> through the QEMU block layer.
>
> Khoa Huynh  reported an increase from 140,000 IOPS to 600,000
> IOPS for a single VM using virtio-blk-data-plane in July:
>
>   http://comments.gmane.org/gmane.comp.emulators.kvm.devel/94580
>
> The virtio-blk-data-plane approach was originally presented at Linux Plumbers
> Conference 2010.  The following slides contain a brief overview:
>
>   http://linuxplumbersconf.org/2010/ocw/system/presentations/651/original/Optimizing_the_QEMU_Storage_Stack.pdf
>
> The basic approach is:
> 1. Each virtio-blk device has a thread dedicated to handling ioeventfd
>    signalling when the guest kicks the virtqueue.
> 2. Requests are processed without going through the QEMU block layer using
>    Linux AIO directly.
> 3. Completion interrupts are injected via irqfd from the dedicated thread.
>
> To try it out:
>
>   qemu -drive if=none,id=drive0,cache=none,aio=native,format=raw,file=...
>        -device virtio-blk-pci,drive=drive0,scsi=off,config-wce=off,x-data-plane=on
>
> Limitations:
>  * Only format=raw is supported
>  * Live migration is not supported
>  * Block jobs, hot unplug, and other operations fail with -EBUSY
>  * I/O throttling limits are ignored
>  * Only Linux hosts are supported due to Linux AIO usage
>
> The code has reached a stage where I feel it is ready to merge.  Users have
> been playing with it for some time and want the significant performance boost.
>
> We are refactoring QEMU to get rid of the global mutex.  I believe that
> virtio-blk-data-plane can eventually become the default mode of operation.
>
> Instead of waiting for global mutex removal efforts to finish, I want to use
> virtio-blk-data-plane as an example device for AioContext and threaded hw
> dispatch refactoring.  This means:
>
> 1. When the block layer can bind to an AioContext and execute I/O outside the
>    global mutex, virtio-blk-data-plane can use this (and gain image format
>    support).
>
> 2. When hw dispatch no longer needs the global mutex we can use hw/virtio.c
>    again and perhaps run a pool of iothreads instead of dedicated data plane
>    threads.
>
> But in the meantime, I have cleaned up the virtio-blk-data-plane code so that
> it can be merged as an experimental feature.
>
> v8:
>  * Fix VIRTIO_BLK_T_GET_ID support - use "in" bufs, not "out" bufs in
>    hw/dataplane/virtio-blk.c
>  * Hostmem -> HostMem rename in hw/dataplane/hostmem.[ch] and
>    hw/dataplane/vring.h [Blue]
>
> v7:
>  * VIRTIO_BLK_T_GET_ID support
>  * Replace lock/condvar with drain operation that stops data plane thread
>    [Michael, Paolo, Laszlo]
>  * Add vring_pop() TODO about crossing memory region boundaries [Michael]
>  * Move #ifdef CONFIG_VIRTIO_BLK_DATA_PLANE to hw/virtio-blk.c [Michael]
>  * Typo s/there is/there is no/ in hostmem.h [Paolo]
>  * Avoid potential integer overflow in hostmem.c [Laszlo]
>  * Retry epoll_wait() on EINTR so gdb works
>
> v6:
>  * Move hw/Makefile.objs dataplane/ inclusion from Patch 4 to Patch 3 [Kevin]
>  * Split discard() with front/back and switch ssize_t to size_t [Michael]
>  * Disable WCE config feature [Michael]
>  * Assert on ioq underflow/overflow, it can never happen [Kevin]
>  * Propagate fdatasync() errors [Kevin]
>  * Remember to init/destroy hostmem mutex
>  * Declare VirtIOBlkConf->data_plane in the right patch so building works
>
> v5:
>  * Omit memory regions with dirty logging enabled from hostmem [Michael]
>  * Add doc comment about quiescing requests across memory hot unplug [Michael]
>  * Clarify which Linux vhost version the vring code originates from [Michael]
>  * Break up indirect vring buffer into 1 hostmem_lookup() per descriptor
>    [Michael]
>  * Barriers in hw/dataplane/vring.c to force fields to be loaded [Michael]
>  * split vring_set_notification() into enable/disable [Paolo]
>  * barriers in vring.c instead of virtio-blk.c [Michael]
>  * move setup code from hw/virtio-blk.c into hw/dataplane/virtio-blk.c
>    [Michael]
>
>  * Note I did not get rid of the mutex+condvar approach to draining requests.
>    I've had good feedback on the performance of the patch series so I'm not
>    worried about eliminating the lock (it's very rarely contended).  Hope
>    Michael and Paolo are okay with this approach.
>
> v4:
>  * Add qemu_iovec_concat_iov() [Paolo]
>  * Use QEMUIOVector to copy out virtio_blk_inhdr [Michael, Paolo]
>
> v3:
>  * Don't assume iovec layout [Michael]
>  * Better naming for hostmem.c MemoryListener callbacks [Don]
>  * More vring quarantining if commands are bogus instead of exiting [Blue]
>
> v2:
>  * Use MemoryListener for thread-safe memor

[Qemu-devel] [RFC PATCH v4 28/30] Add _OST dimm support

2012-12-18 Thread Vasilis Liaskovitis

This allows qemu to receive notifications from the guest OS on success or
failure of a memory hotplug request. The guest OS needs to implement the _OST
functionality for this to work (linux-next: http://lkml.org/lkml/2012/6/25/321)

This patch also updates the dimm bitmap state and the hot-remove pending flag
when a hot-remove fails. This allows failed hot operations to be retried at
any time (only works for guests that use _OST notification).
It also documents the new _OST registers in docs/specs/acpi_hotplug.txt
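
The bitmap update done by the revert callbacks boils down to clearing or
setting one bit per dimm slot. A minimal sketch, assuming a plain byte array
in place of gperegs.mems_sts (dimm_revert_bit is a hypothetical name):

```c
#include <stdint.h>

/*
 * Undo the status-bit change of a failed hot operation for slot `idx`.
 * add != 0: a hot-add failed, so clear the bit again.
 * add == 0: a hot-remove failed, so set the bit again.
 */
static void dimm_revert_bit(uint8_t *mems_sts, int idx, int add)
{
    uint8_t mask = (uint8_t)(1u << (idx % 8));

    if (add) {
        mems_sts[idx / 8] &= (uint8_t)~mask;
    } else {
        mems_sts[idx / 8] |= mask;
    }
}
```

For slot 10 this touches bit 2 of byte 1, matching the idx/8 and idx%8
arithmetic used by ich9_dimm_revert() and piix4_dimm_revert().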
---
 docs/specs/acpi_hotplug.txt |   25 +
 hw/acpi_ich9.c  |   31 ---
 hw/acpi_ich9.h  |3 +++
 hw/acpi_piix4.c |   35 ---
 hw/dimm.c   |   28 +++-
 hw/dimm.h   |   11 ++-
 6 files changed, 125 insertions(+), 8 deletions(-)

diff --git a/docs/specs/acpi_hotplug.txt b/docs/specs/acpi_hotplug.txt
index cf86242..536da16 100644
--- a/docs/specs/acpi_hotplug.txt
+++ b/docs/specs/acpi_hotplug.txt
@@ -20,3 +20,28 @@ ejected.
 
 Written by ACPI memory device _EJ0 method to notify qemu of successfull
 hot-removal.  Write-only.
+
+Memory Dimm ejection failure notification (IO port 0xafa1, 1-byte access):
+---
+Dimm hot-remove _OST notification. Byte value indicates Dimm slot for which
+ejection failed.
+
+Written by ACPI memory device _OST method to notify qemu of failed
+hot-removal.  Write-only.
+
+Memory Dimm insertion success notification (IO port 0xafa2, 1-byte access):
+---
+Dimm hot-add _OST notification. Byte value indicates Dimm slot for which
+insertion succeeded.
+
+Written by ACPI memory device _OST method to notify qemu of successful
+hot-add.  Write-only.
+
+Memory Dimm insertion failure notification (IO port 0xafa3, 1-byte access):
+---
+Dimm hot-add _OST notification. Byte value indicates Dimm slot for which
+insertion failed.
+
+Written by ACPI memory device _OST method to notify qemu of failed
+hot-add.  Write-only.
+
diff --git a/hw/acpi_ich9.c b/hw/acpi_ich9.c
index f5dc1c9..2705230 100644
--- a/hw/acpi_ich9.c
+++ b/hw/acpi_ich9.c
@@ -111,6 +111,15 @@ static void memhp_writeb(void *opaque, uint32_t addr, uint32_t val)
 case ICH9_MEM_EJ_BASE - ICH9_MEM_BASE:
 dimm_notify(val, DIMM_REMOVE_SUCCESS);
 break;
+case ICH9_MEM_OST_REMOVE_FAIL - ICH9_MEM_BASE:
+dimm_notify(val, DIMM_REMOVE_FAIL);
+break;
+case ICH9_MEM_OST_ADD_SUCCESS - ICH9_MEM_BASE:
+dimm_notify(val, DIMM_ADD_SUCCESS);
+break;
+case ICH9_MEM_OST_ADD_FAIL - ICH9_MEM_BASE:
+dimm_notify(val, DIMM_ADD_FAIL);
+break;
 default:
 ICH9_DEBUG("memhp write invalid %x <== %d\n", addr, val);
 }
@@ -125,7 +134,7 @@ static const MemoryRegionOps ich9_memhp_ops = {
 },
 {
 .offset = ICH9_MEM_EJ_BASE - ICH9_MEM_BASE,
-.len = 1, .size = 1,
+.len = 4, .size = 1,
 .write = memhp_writeb,
 },
 PORTIO_END_OF_LIST()
@@ -274,6 +283,22 @@ static int ich9_dimm_hotplug(DeviceState *qdev, DimmDevice *dev, int
 return 0;
 }
 
+static int ich9_dimm_revert(DeviceState *qdev, DimmDevice *dev, int add)
+{
+PCIDevice *pci_dev = DO_UPCAST(PCIDevice, qdev, qdev);
+ICH9LPCState *s = DO_UPCAST(ICH9LPCState, d, pci_dev);
+struct gpe_regs *g = &s->pm.gperegs;
+DimmDevice *slot = DIMM(dev);
+int idx = slot->idx;
+
+if (add) {
+g->mems_sts[idx/8] &= ~(1 << (idx%8));
+} else {
+g->mems_sts[idx/8] |= (1 << (idx%8));
+}
+return 0;
+}
+
 void ich9_pm_init(void *device, qemu_irq sci_irq, qemu_irq cmos_s3)
 {
 ICH9LPCState *lpc = (ICH9LPCState *)device;
@@ -296,10 +321,10 @@ void ich9_pm_init(void *device, qemu_irq sci_irq, qemu_irq cmos_s3)
 memory_region_add_subregion(&pm->io, ICH9_PMIO_SMI_EN, &pm->io_smi);
 
 memory_region_init_io(&pm->io_memhp, &ich9_memhp_ops, pm, "apci-memhp0",
-  DIMM_BITMAP_BYTES + 1);
+  DIMM_BITMAP_BYTES + 4);
 memory_region_add_subregion(get_system_io(), ICH9_MEM_BASE, &pm->io_memhp);
 
-dimm_bus_hotplug(ich9_dimm_hotplug, &lpc->d.qdev);
+dimm_bus_hotplug(ich9_dimm_hotplug, ich9_dimm_revert, &lpc->d.qdev);
 
 pm->irq = sci_irq;
 qemu_register_reset(pm_reset, pm);
diff --git a/hw/acpi_ich9.h b/hw/acpi_ich9.h
index af61a2d..8f57cd8 100644
--- a/hw/acpi_ich9.h
+++ b/hw/acpi_ich9.h
@@ -26,6 +26,9 @@
 #define ICH9_MEM_BASE0xaf80
 #define ICH9_MEM_EJ_BASE0xafa0
 #define ICH9_MEM_HOTPLUG_STATUS 8
+#define ICH9_MEM_OST_REMOVE_FAIL 0xafa1
+#define ICH9_MEM_OST_ADD_SUCCESS 0xafa2
+#define ICH9_MEM_OST_ADD_FAIL 0xafa3
 
 typedef struct ICH9LPCPMRegs {
 /*
diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4

Re: [Qemu-devel] [RFC PATCH v6 0/6] Virtio refactoring.

2012-12-18 Thread Michael S. Tsirkin

On Tue, Dec 18, 2012 at 12:06:39PM +, Peter Maydell wrote:
> On 18 December 2012 11:50, Paolo Bonzini  wrote:
> > Il 18/12/2012 12:26, Peter Maydell ha scritto:
> >> On 18 December 2012 11:01, Michael S. Tsirkin  wrote:
> >>> This is what I am saying: create your own bus and put
> >>> your devices there.
> >>
> >> What bus?
> >
> > A virtio bus like the one in these patches.  But mst is suggesting to
> > leave virtio.c aside, and only use the virtio bus in the virtio-x-mmio
> > devices, to connect to the virtio-mmio device provided by the board.
> 
> That doesn't make any sense to me -- why would you want to make
> mmio be randomly different from the other virtio transports?

Not different at all. virtio-pci uses the pci bus.
It's you who's saying mmio is different and can not
work with existing bindings.

> The code
> as it stands is a mess which ought to be cleaned up anyway...

My experience is all working code is somewhat messy.  This is working
code.

I'm sure we should not add more ways to create devices;
we have two for each device already, and that's too many.
You can't seriously add yet another way, keep the two
existing ones around as legacy and claim it's a cleanup.
You are single-handedly making the testing matrix bigger by 1/3.

And what makes virtio so special anyway? e1000 can be used without
exposing users to internal buses and all kind of nastiness like this.
People just want to install a driver and have a faster IO.

> To the extent that it's painful for users to manipulate qdev devices on
> the command line, that's a problem we ought to address at some point
> anyhow.
> 
> -- PMM

It's pretty painful, yes, but adding yet another way to do it is not
addressing the problem.

-- 
MST

[Qemu-devel] [RFC PATCH v4 21/30] Implement dimm-info

2012-12-18 Thread Vasilis Liaskovitis

"query-dimm-info" and "info dimm" will give current state of all dimms in the
system e.g.

dimm0: on
dimm1: off
dimm2: off
dimm3: on
etc.

Signed-off-by: Vasilis Liaskovitis 
---
 hmp-commands.hx  |2 ++
 hmp.c|   17 +
 hmp.h|1 +
 hw/dimm.c|   43 +++
 monitor.c|7 +++
 qapi-schema.json |   26 ++
 6 files changed, 96 insertions(+), 0 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 3fbd975..65d799e 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1572,6 +1572,8 @@ show qdev device model list
 show roms
 @item info memory-total
 show memory-total
+@item info dimm
+show dimm
 @end table
 ETEXI
 
diff --git a/hmp.c b/hmp.c
index fb39b0d..f8456fd 100644
--- a/hmp.c
+++ b/hmp.c
@@ -635,6 +635,23 @@ void hmp_info_memory_total(Monitor *mon)
 monitor_printf(mon, "MemTotal: %lu\n", ram_total);
 }
 
+void hmp_info_dimm(Monitor *mon)
+{
+DimmInfoList *info;
+DimmInfoList *item;
+DimmInfo *dimm;
+
+info = qmp_query_dimm_info(NULL);
+for (item = info; item; item = item->next) {
+dimm = item->value;
+monitor_printf(mon, "dimm %s : %s\n", dimm->dimm,
+dimm->state ? "on" : "off");
+dimm->dimm = NULL;
+}
+
+qapi_free_DimmInfoList(info);
+}
+
 void hmp_quit(Monitor *mon, const QDict *qdict)
 {
 monitor_suspend(mon);
diff --git a/hmp.h b/hmp.h
index 25a3a70..74ac061 100644
--- a/hmp.h
+++ b/hmp.h
@@ -37,6 +37,7 @@ void hmp_info_balloon(Monitor *mon);
 void hmp_info_pci(Monitor *mon);
 void hmp_info_block_jobs(Monitor *mon);
 void hmp_info_memory_total(Monitor *mon);
+void hmp_info_dimm(Monitor *mon);
 void hmp_quit(Monitor *mon, const QDict *qdict);
 void hmp_stop(Monitor *mon, const QDict *qdict);
 void hmp_system_reset(Monitor *mon, const QDict *qdict);
diff --git a/hw/dimm.c b/hw/dimm.c
index f181e54..e79f23d 100644
--- a/hw/dimm.c
+++ b/hw/dimm.c
@@ -174,6 +174,18 @@ static DimmConfig *dimmcfg_find_from_name(DimmBus *bus, const char *name)
 return NULL;
 }
 
+static DimmDevice *dimm_find_from_name(DimmBus *bus, const char *name)
+{
+DimmDevice *slot;
+
+QTAILQ_FOREACH(slot, &bus->dimmlist, nextdimm) {
+if (!strcmp(slot->qdev.id, name)) {
+return slot;
+}
+}
+return NULL;
+}
+
 void dimm_setup_fwcfg_layout(uint64_t *fw_cfg_slots)
 {
 DimmConfig *slot;
@@ -203,6 +215,37 @@ uint64_t get_hp_memory_total(void)
 return info;
 }
 
+DimmInfoList *qmp_query_dimm_info(Error **errp)
+{
+DimmBus *bus;
+DimmConfig *slot;
+DimmInfoList *head = NULL, *info, *cur_item = NULL;
+
+QLIST_FOREACH(bus, &memory_buses, next) {
+QTAILQ_FOREACH(slot, &bus->dimmconfig_list, nextdimmcfg) {
+
+info = g_malloc0(sizeof(*info));
+info->value = g_malloc0(sizeof(*info->value));
+info->value->dimm = g_malloc0(sizeof(char) * 32);
+strcpy(info->value->dimm, slot->name);
+if (dimm_find_from_name(bus, slot->name)) {
+info->value->state = 1;
+} else {
+info->value->state = 0;
+}
+/* XXX: waiting for the qapi to support GSList */
+if (!cur_item) {
+head = cur_item = info;
+} else {
+cur_item->next = info;
+cur_item = info;
+}
+}
+}
+
+return head;
+}
+
 static int dimm_init(DeviceState *s)
 {
 DimmBus *bus = DIMM_BUS(qdev_get_parent_bus(s));
diff --git a/monitor.c b/monitor.c
index 6e87d0d..de1dcf1 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2743,6 +2743,13 @@ static mon_cmd_t info_cmds[] = {
 .mhandler.info = do_trace_print_events,
 },
 {
+.name   = "dimm",
+.args_type  = "",
+.params = "",
+.help   = "show active and non active dimms",
+.mhandler.info = hmp_info_dimm,
+},
+{
 .name   = NULL,
 },
 };
diff --git a/qapi-schema.json b/qapi-schema.json
index 33f88d6..5a20577 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2914,6 +2914,32 @@
 { 'command': 'query-memory-total', 'returns': 'int' }
 
 ##
+# @DimmInfo:
+#
+# Information about the current state of a memory dimm
+#
+# @dimm: the name of the dimm
+#
+# @state: true if the dimm is active (on), false otherwise (off)
+#
+# Since: 1.4
+#
+##
+{ 'type': 'DimmInfo',
+  'data': {'dimm': 'str', 'state': 'bool'} }
+
+##
+# @query-dimm-info:
+#
+# Returns the current state of all dimms in the system
+#
+# Returns: a list of @DimmInfo, one per configured dimm
+#
+# Since: 1.4
+##
+{ 'command': 'query-dimm-info', 'returns': ['DimmInfo'] }
+
+##
 # @QKeyCode:
 #
 # An enumeration of key name.
-- 
1.7.9
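
The list construction in qmp_query_dimm_info() above follows the usual
head/tail append pattern. The sketch below shows the same pattern in plain C
without glib or the generated QAPI types; DimmEntry, dimm_list_append and
dimm_list_free are hypothetical names. The name is copied into a buffer sized
from the string itself, which is one way to avoid depending on a fixed-size
allocation.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

typedef struct DimmEntry {
    char *name;               /* owned copy of the dimm name */
    bool on;                  /* true if the dimm is currently active */
    struct DimmEntry *next;
} DimmEntry;

/* Append a new entry at the tail; returns the entry, or NULL on failure. */
static DimmEntry *dimm_list_append(DimmEntry **head, DimmEntry **tail,
                                   const char *name, bool on)
{
    DimmEntry *e = calloc(1, sizeof(*e));

    if (!e) {
        return NULL;
    }
    e->name = malloc(strlen(name) + 1);
    if (!e->name) {
        free(e);
        return NULL;
    }
    strcpy(e->name, name);
    e->on = on;

    if (*tail) {
        (*tail)->next = e;    /* non-empty list: link after current tail */
    } else {
        *head = e;            /* empty list: new entry becomes the head */
    }
    *tail = e;
    return e;
}

/* Free every entry together with its name. */
static void dimm_list_free(DimmEntry *head)
{
    while (head) {
        DimmEntry *next = head->next;

        free(head->name);
        free(head);
        head = next;
    }
}
```

Because the list owns its strings, the consumer can print and then free the
whole list in one call without resetting any field first.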

[Qemu-devel] [RFC PATCH v4 29/30] [SeaBIOS] Implement _PS3 method for memory device

2012-12-18 Thread Vasilis Liaskovitis

---
 src/acpi-dsdt-mem-hotplug.dsl |   15 +++
 src/ssdt-mem.dsl  |4 
 2 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/src/acpi-dsdt-mem-hotplug.dsl b/src/acpi-dsdt-mem-hotplug.dsl
index a648bee..7d7c078 100644
--- a/src/acpi-dsdt-mem-hotplug.dsl
+++ b/src/acpi-dsdt-mem-hotplug.dsl
@@ -49,6 +49,13 @@ Scope(\_SB) {
 MIF, 8
 }
 
+/* Memory _PS3 byte */
+OperationRegion(MPSB, SystemIO, 0xafa4, 1)
+Field (MPSB, ByteAcc, NoLock, Preserve)
+{
+MPS, 8
+}
+
 Method(MESC, 0) {
 // Local5 = active memdevice bitmap
 Store (MES, Local5)
@@ -90,6 +97,14 @@ Scope(\_SB) {
 Sleep(200)
 }
 
+
+Method (MPS3, 1, NotSerialized) {
+// _PS3 method - power-off method
+Store(Arg0, MPS)
+Store(Zero, Index(MEON, Arg0))
+Sleep(200)
+}
+
 Method (MOST, 3, Serialized) {
 // _OST method - OS status indication
 Switch (And(Arg0, 0xFF)) {
diff --git a/src/ssdt-mem.dsl b/src/ssdt-mem.dsl
index 47a3b4f..9827a58 100644
--- a/src/ssdt-mem.dsl
+++ b/src/ssdt-mem.dsl
@@ -39,6 +39,7 @@ DefinitionBlock ("ssdt-mem.aml", "SSDT", 0x02, "BXPC", 
"CSSDT", 0x1)
 External(CMST, MethodObj)
 External(MPEJ, MethodObj)
 External(MOST, MethodObj)
+External(MPS3, MethodObj)
 
 Name(_CRS, ResourceTemplate() {
 QwordMemory(
@@ -64,6 +65,9 @@ DefinitionBlock ("ssdt-mem.aml", "SSDT", 0x02, "BXPC", 
"CSSDT", 0x1)
 Method (_OST, 3) {
 MOST(Arg0, Arg1, ID)
 }
+Method (_PS3, 0) {
+MPS3(ID)
+}
 }
 }
 
-- 
1.7.9

[Qemu-devel] [RFC PATCH v4 23/30] dimm: add hot-remove capability

2012-12-18 Thread Vasilis Liaskovitis

On a successful _EJ0 operation, unmap the device from the guest by using the new
qdev function qdev_unplug_complete, see:
https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg02699.html

The memory of the device should be freed when the last subsystem using it
unmaps it, see the following two series:
https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg00728.html
https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg02697.html

Needs testing. Other subsystems (e.g. virtio-blk) may have to install new
memorylisteners to complete pending I/O before device memory can be freed.
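
A minimal sketch of the notification flow described above, assuming a flat
array of slots rather than the bus lists used by the patch. All sketch_* names
and the failure event are hypothetical; the point is only that a successful
eject unmaps the dimm from the guest's view, while freeing the memory is left
to whoever drops the last reference:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event codes; only the success case appears in the patch. */
enum sketch_event { SKETCH_REMOVE_SUCCESS, SKETCH_REMOVE_FAIL };

struct sketch_dimm {
    uint32_t idx;   /* slot index reported by the guest */
    bool mapped;    /* true while the dimm is visible to the guest */
};

/* Linear search for the slot with the given index; NULL if not found. */
static struct sketch_dimm *sketch_find_from_idx(struct sketch_dimm *slots,
                                                size_t n, uint32_t idx)
{
    assert(slots != NULL);
    for (size_t i = 0; i < n; i++) {
        if (slots[i].idx == idx) {
            return &slots[i];
        }
    }
    return NULL;
}

/* Returns true if the slot exists; unmaps it only on a successful eject. */
static bool sketch_notify(struct sketch_dimm *slots, size_t n,
                          uint32_t idx, enum sketch_event event)
{
    struct sketch_dimm *slot = sketch_find_from_idx(slots, n, idx);

    if (slot == NULL) {
        return false;
    }
    if (event == SKETCH_REMOVE_SUCCESS) {
        slot->mapped = false;   /* unmap only; freeing happens elsewhere */
    }
    return true;
}
```

Unlike this sketch, the patch asserts that the slot exists instead of
reporting a missing slot to the caller.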

Signed-off-by: Vasilis Liaskovitis 
---
 hw/dimm.c |   51 +++
 hw/dimm.h |1 +
 2 files changed, 52 insertions(+), 0 deletions(-)

diff --git a/hw/dimm.c b/hw/dimm.c
index e79f23d..0b4e22d 100644
--- a/hw/dimm.c
+++ b/hw/dimm.c
@@ -120,6 +120,18 @@ static void dimm_populate(DimmDevice *s)
 s->mr = new;
 }
 
+static int dimm_depopulate(DeviceState *dev)
+{
+DimmDevice *s = DIMM(dev);
+assert(s);
+vmstate_unregister_ram(s->mr, NULL);
+memory_region_del_subregion(get_system_memory(), s->mr);
+memory_region_destroy(s->mr);
+s->populated = false;
+s->mr = NULL;
+return 0;
+}
+
 void dimm_config_create(char *id, uint64_t size, const char *bus, uint64_t 
node,
 uint32_t dimm_idx, uint32_t populated)
 {
@@ -159,6 +171,11 @@ static void dimm_plug_device(DimmDevice *slot)
 
 static int dimm_unplug_device(DeviceState *qdev)
 {
+DimmBus *bus = DIMM_BUS(qdev_get_parent_bus(qdev));
+
+if (bus->dimm_hotplug) {
+bus->dimm_hotplug(bus->dimm_hotplug_qdev, DIMM(qdev), 0);
+}
 return 1;
 }
 
@@ -186,6 +203,21 @@ static DimmDevice *dimm_find_from_name(DimmBus *bus, const 
char *name)
 return NULL;
 }
 
+static DimmDevice *dimm_find_from_idx(uint32_t idx)
+{
+DimmDevice *slot;
+DimmBus *bus;
+
+QLIST_FOREACH(bus, &memory_buses, next) {
+QTAILQ_FOREACH(slot, &bus->dimmlist, nextdimm) {
+if (slot->idx == idx) {
+return slot;
+}
+}
+}
+return NULL;
+}
+
 void dimm_setup_fwcfg_layout(uint64_t *fw_cfg_slots)
 {
 DimmConfig *slot;
@@ -275,6 +307,24 @@ static int dimm_init(DeviceState *s)
 return 0;
 }
 
+void dimm_notify(uint32_t idx, uint32_t event)
+{
+DimmBus *bus;
+DimmDevice *slot;
+
+slot = dimm_find_from_idx(idx);
+assert(slot != NULL);
+bus = DIMM_BUS(qdev_get_parent_bus(&slot->qdev));
+
+switch (event) {
+case DIMM_REMOVE_SUCCESS:
+qdev_unplug_complete((DeviceState *)slot, NULL);
+QTAILQ_REMOVE(&bus->dimmlist, slot, nextdimm);
+break;
+default:
+break;
+}
+}
 
 static void dimm_class_init(ObjectClass *klass, void *data)
 {
@@ -283,6 +333,7 @@ static void dimm_class_init(ObjectClass *klass, void *data)
 dc->props = dimm_properties;
 dc->unplug = dimm_unplug_device;
 dc->init = dimm_init;
+dc->exit = dimm_depopulate;
 dc->bus_type = TYPE_DIMM_BUS;
 }
 
diff --git a/hw/dimm.h b/hw/dimm.h
index 5130b2c..86c7cd5 100644
--- a/hw/dimm.h
+++ b/hw/dimm.h
@@ -86,5 +86,6 @@ DimmBus *dimm_bus_create(Object *parent, const char *name, 
uint32_t max_dimms,
 void dimm_config_create(char *id, uint64_t size, const char *bus, uint64_t 
node,
 uint32_t dimm_idx, uint32_t populated);
 uint64_t get_hp_memory_total(void);
+void dimm_notify(uint32_t idx, uint32_t event);
 
 #endif
-- 
1.7.9

[Qemu-devel] [PATCH] ehci: Use uframe precision for interrupt threshold checking (v2)

2012-12-18 Thread Hans de Goede

Before this patch, the following could happen:
1) Transfer completes, raises interrupt
2) .5 ms later we check if the guest has queued up any new transfers
3) We find and execute a new transfer
4) .2 ms later the new transfer completes
5) We re-run our frame_timer to write back the completion, but less than
   1 ms has passed since our last run, so frindex is not changed, so the
   interrupt threshold code delays the interrupt
6) 1 ms from the re-run our frame-timer runs again and finally delivers
   the interrupt

This leads to unnecessarily large interrupt delays. This patch fixes that
by changing frindex to uframe precision and using that for interrupt threshold
control, making the interrupt fire at step 5 for guests which have low interrupt
threshold settings (like Linux).

Note that the guest still sees the frindex move in steps of 8 for migration
compatibility.

This boosts Linux read speed of a simple cheap USB thumb drive by 6 %.
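
As a rough illustration of the arithmetic (not the patch itself): in the
scenario above, .5 ms + .2 ms = .7 ms elapses between timer runs, which is 0
whole frames but 5 whole micro-frames. The sketch_* names are hypothetical,
and the rollover interrupt handling of the real code is not modeled:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_FRAME_NS   1000000u               /* 1 ms frame */
#define SKETCH_UFRAME_NS  (SKETCH_FRAME_NS / 8)  /* 125 us micro-frame */

/* Whole micro-frames contained in ns_elapsed nanoseconds. */
static unsigned sketch_elapsed_uframes(uint64_t ns_elapsed)
{
    return (unsigned)(ns_elapsed / SKETCH_UFRAME_NS);
}

/* Advance the 14-bit internal counter by a number of micro-frames. */
static uint32_t sketch_frindex_advance(uint32_t frindex, unsigned uframes)
{
    assert(frindex <= 0x3fffu);
    return (frindex + uframes) & 0x3fffu;
}

/* Value presented to the guest: rounded down to a multiple of 8. */
static uint32_t sketch_frindex_guest_view(uint32_t frindex)
{
    return frindex & ~7u;
}
```

With these helpers, sketch_elapsed_uframes(700000) yields 5, so the internal
counter moves and the threshold check can pass, while the guest-visible value
for an internal frindex of 5 is still 0.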

Changes in v2:
-Make the guest see frindex move in steps of 8 by modifying ehci_opreg_read,
 rather than using a shadow variable

Signed-off-by: Hans de Goede 
---
 hw/usb/hcd-ehci.c | 70 +--
 1 file changed, 47 insertions(+), 23 deletions(-)

diff --git a/hw/usb/hcd-ehci.c b/hw/usb/hcd-ehci.c
index c7a9a7c..320b7e7 100644
--- a/hw/usb/hcd-ehci.c
+++ b/hw/usb/hcd-ehci.c
@@ -109,12 +109,13 @@
 
 #define FRAME_TIMER_FREQ 1000
 #define FRAME_TIMER_NS   (1000000000 / FRAME_TIMER_FREQ)
+#define UFRAME_TIMER_NS  (FRAME_TIMER_NS / 8)
 
 #define NB_MAXINTRATE8// Max rate at which controller issues ints
 #define BUFF_SIZE5*4096   // Max bytes to transfer per transaction
 #define MAX_QH   100  // Max allowable queue heads in a chain
-#define MIN_FR_PER_TICK  3// Min frames to process when catching up
-#define PERIODIC_ACTIVE  64
+#define MIN_UFR_PER_TICK 24   /* Min frames to process when catching up */
+#define PERIODIC_ACTIVE  512  /* Micro-frames */
 
 /*  Internal periodic / asynchronous schedule state machine states
  */
@@ -967,7 +968,15 @@ static uint64_t ehci_opreg_read(void *ptr, hwaddr addr,
 EHCIState *s = ptr;
 uint32_t val;
 
-val = s->opreg[addr >> 2];
+switch (addr) {
+case FRINDEX:
+/* Round down to mult of 8, else it can go backwards on migration */
+val = s->frindex & ~7;
+break;
+default:
+val = s->opreg[addr >> 2];
+}
+
 trace_usb_ehci_opreg_read(addr + s->opregbase, addr2str(addr), val);
 return val;
 }
@@ -1118,7 +1127,8 @@ static void ehci_opreg_write(void *ptr, hwaddr addr,
 break;
 
 case FRINDEX:
-val &= 0x3ff8; /* frindex is 14bits and always a multiple of 8 */
+val &= 0x3fff; /* frindex is 14bits */
+s->usbsts_frindex = val;
 break;
 
 case CONFIGFLAG:
@@ -2237,16 +2247,16 @@ static void ehci_advance_periodic_state(EHCIState *ehci)
 }
 }
 
-static void ehci_update_frindex(EHCIState *ehci, int frames)
+static void ehci_update_frindex(EHCIState *ehci, int uframes)
 {
 int i;
 
-if (!ehci_enabled(ehci)) {
+if (!ehci_enabled(ehci) && ehci->pstate == EST_INACTIVE) {
 return;
 }
 
-for (i = 0; i < frames; i++) {
-ehci->frindex += 8;
+for (i = 0; i < uframes; i++) {
+ehci->frindex++;
 
 if (ehci->frindex == 0x2000) {
 ehci_raise_irq(ehci, USBSTS_FLR);
@@ -2270,33 +2280,33 @@ static void ehci_frame_timer(void *opaque)
 int need_timer = 0;
 int64_t expire_time, t_now;
 uint64_t ns_elapsed;
-int frames, skipped_frames;
+int uframes, skipped_uframes;
 int i;
 
 t_now = qemu_get_clock_ns(vm_clock);
 ns_elapsed = t_now - ehci->last_run_ns;
-frames = ns_elapsed / FRAME_TIMER_NS;
+uframes = ns_elapsed / UFRAME_TIMER_NS;
 
 if (ehci_periodic_enabled(ehci) || ehci->pstate != EST_INACTIVE) {
 need_timer++;
 
-if (frames > ehci->maxframes) {
-skipped_frames = frames - ehci->maxframes;
-ehci_update_frindex(ehci, skipped_frames);
-ehci->last_run_ns += FRAME_TIMER_NS * skipped_frames;
-frames -= skipped_frames;
-DPRINTF("WARNING - EHCI skipped %d frames\n", skipped_frames);
+if (uframes > (ehci->maxframes * 8)) {
+skipped_uframes = uframes - (ehci->maxframes * 8);
+ehci_update_frindex(ehci, skipped_uframes);
+ehci->last_run_ns += UFRAME_TIMER_NS * skipped_uframes;
+uframes -= skipped_uframes;
+DPRINTF("WARNING - EHCI skipped %d uframes\n", skipped_uframes);
 }
 
-for (i = 0; i < frames; i++) {
+for (i = 0; i < uframes; i++) {
 /*
  * If we're running behind schedule, we should not catch up
  * too fast, as that will make some guests unhappy:
- * 1) We must process a minimum of MIN_FR_PER_TICK frames,
+ * 1) We must

Re: [Qemu-devel] [RFC PATCH v6 0/6] Virtio refactoring.

2012-12-18 Thread Michael S. Tsirkin

On Tue, Dec 18, 2012 at 12:30:20PM +0100, KONRAD Frédéric wrote:
> On 18/12/2012 12:01, Michael S. Tsirkin wrote:
> >On Tue, Dec 18, 2012 at 10:33:37AM +, Peter Maydell wrote:
> >>On 17 December 2012 15:45, Michael S. Tsirkin  wrote:
> >>>Is the point to allow virtio-mmio?  Why can't virtio-mmio be just
> >>>another bus, like a pci bus, and another binding, like the virtio-pci
> >>>binding?
> >>(a) the current code is really not very nice because it's not
> >>actually a proper set of QOM/qdev devices
> >>(b) unlike PCI, you can't create sysbus devices on the
> >>command line, because they don't correspond to a user
> >>pluggable bit of hardware. We don't want users to have to know
> >>an address and IRQ number for each virtio-mmio device (especially
> >>since these are board specific); instead the board can create
> >>and wire up transport devices wherever is suitable, and the
> >>user just creates the backend (which is plugged into the virtio bus).
> >>
> >>-- PMM
> >This is what I am saying: create your own bus and put
> >your devices there. Allocate resources when you init
> >a device.
> >
> >Instead you seem to want to expose a virtio device as two devices to
> >user - if true this is not reasonable.
> >
> The modifications will be transparent to the user, as we will keep
> virtio-x-pci devices.

So there are three ways to add virtio pci devices now:
legacy -device virtio-net-pci, legacy legacy -net nic,model=virtio,
and the new one with two devices.
If so, it's not transparent; it's user visible.
Or did I misunderstand?

Look we can have a virtio network device on a PCI bus.
A very similar device can be created on XXX bus, and
we can and do share a lot of code.
This makes it two devices? Why not 4?
One for TX one for RX one for control one for PCI.
I hope I'm not giving anyone ideas ...

-- 
MST

[Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug

2012-12-18 Thread Vasilis Liaskovitis

This is v4 of the ACPI memory hotplug functionality. Only x86_64 target is
supported (both i440fx and q35). There are still several issues, but it's
been a while since v3 and I wanted to get some more feedback on the current
state of the patchseries.

Overview:

Dimm device layout is modeled with a normal qemu device:

"-device dimm,id=name,size=sz,node=pxm,populated=on|off,bus=membus.0"

The starting physical address for all dimms is calculated from top of memory,
during memory controller init, skipping the pci hole at [PCI_HOLE_START, 4G).
e.g.
"-device dimm,id=dimm0,size=512M,node=0,populated=off,bus=membus.0"
will define a 512M memory dimm belonging to numa node 0, on bus membus.0.
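
A minimal sketch of how such a layout could be computed, assuming dimms are
placed one after another and that a dimm which would overlap the pci hole is
moved entirely above 4G. The hole start value and all sketch_* names are
assumptions for illustration, not taken from the series:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PCI_HOLE_START 0xe0000000ULL   /* assumed value */
#define SKETCH_4G             0x100000000ULL

/*
 * Return the start address for a dimm of `size` bytes and advance *next
 * to the first address after it.
 */
static uint64_t sketch_place_dimm(uint64_t *next, uint64_t size)
{
    uint64_t start = *next;

    assert(size > 0);
    /* A dimm that would overlap [PCI_HOLE_START, 4G) goes above 4G. */
    if (start < SKETCH_4G && start + size > SKETCH_PCI_HOLE_START) {
        start = SKETCH_4G;
    }
    *next = start + size;
    return start;
}
```

Starting from a 3G top of memory, a 256M dimm lands at 0xc0000000 and a
following 512M dimm, which would otherwise reach into the hole, lands at 4G.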

Because dimm layout needs to be configured on machine-boot, all dimm devices
need to be specified on startup command line (either with populated=on or with
populated=off). The dimm information is stored in dimm configuration structures.

After machine startup, dimms are hot-added or removed with normal device_add
and device_del operations e.g.:
Hot-add syntax: "device_add dimm,id=mydimm0,bus=membus.0"
Hot-remove syntax: "device_del dimm,id=mydimm0"

Changes v3->v4

- Dimms added with normal -device argument (extra -dimm arg dropped).
- multiple memory buses can be registered. Memory buses of the real hw/chipset
  or a paravirtual memory bus can be added.
- acpi implementation uses memory API instead of old ioports.
- Support for q35/ich9 added (still buggy, see patch 12/31).
- piix4/i440fx initialization code has been refactored to resemble q35. This
will allow memory map initialization at chipset qdev init time for both
machines, as well as more similar code.
- Hot-remove functionality has been moved to separate patches. Hot-remove no
longer frees memory but unmaps the dimm/qdev device from the guest's view.
Freeing the memory should happen when the last user unrefs/unmaps the memory,
see also (work in progress):
https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg00728.html
https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg02697.html
- new qmp/hmp command for the state of each dimm (on/off)

Changes v2->v3

- qdev integration. Dimms are attached to a dimmbus. The dimmbus is a child
  of i440fx device in the pc machine. Hot-add and remove are done with normal
  device_add / device_del operations on the dimmbus. New commands "dimm_add" and
  "dimm_del" are obsolete.
- Add _PS3 method to allow OSPM-induced hot operations.
- pci-window calculation in Seabios takes dimms into account(for both 32-bit and
  64-bit windows)
- rename new qmp commands: query-memory-total and query-memory-hotplug
- balloon driver can see the hotplugged memory

Changes v1->v2

- memory map is automatically calculated for hotplug dimms. Dimms are added from
top-of-memory skipping the pci hole at [PCI_HOLE_START, 4G).
- Renamed from "-memslot" to "-dimm". Commands changed to "dimm_add", "dimm_del"
- Seabios ejection array reduced to a byte. Use extraction macros for dimm ssdt.
- additional SRAT paravirt info does not break previous SRAT fw_cfg layout.
- Documentation of new acpi_piix4 registers and paravirt data.
- add ACPI _OST support for _OST enabled guests. This allows qemu to receive
notification for success / failure of memory hot-add and hot-remove operations.
Guest needs to support _OST (https://lkml.org/lkml/2012/6/25/321)
- add monitor info command to report total guest memory (initial + hot-added)

Issues:

- hot-remove needs to only unmap the dimm device from guest's view. Freeing the
memory should happen when the last user of the device (e.g. virtio-blk) unrefs
the device. A testcase is needed for this.

- Live Migration: Ramblocks are migrated before qdev VMStates are migrated. So
the DimmDevice is handled differently than other devices. Should this be
reworked? (The DimmDevice structure currently does not define a VMStateDescription.)
Live migration works as long as the dimm layout (command line args) is
identical at the source and destination qemu command line, and destination takes
into account hot-operations that have occurred on source. (v3 patch 10/19
created the DimmDevice that corresponds to an unknown incoming ramblock, e.g.
for a dimm that was hot-added on source, but it has been dropped for the moment.)

- A main blocker issue is windows guest functionality. The patchset does not
work for windows currently.  Testing on win2012 server RC or windows2008
consumer prerelease, when adding a DIMM, there is a BSOD with ACPI_BIOS_ERROR
message. After this, the VM keeps rebooting with ACPI_BIOS_ERROR. The windows
pnpmem driver obviously has a problem with the seabios dimm implementation
(or the seabios dimm implementation is not fully ACPI-compliant). If someone
can review the seabios patches or has any ideas to debug this, let me know.

- hot-operation notification lists need to be added to migration state.

series is based on:
- qemu master (commit a8a826a3) + patch:
https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg02699.html

Re: [Qemu-devel] [PATCH] block/raw-win32: Fix compiler warnings (wrong format specifiers)

2012-12-18 Thread Fabien Chouteau

On 12/17/2012 08:40 PM, Stefan Weil wrote:
> Commit fbcad04d6bfdff937536eb23088a01a280a1a3af added fprintf statements
> with wrong format specifiers.
> 
> GetLastError() returns a DWORD which is unsigned long, so %lu must be used.
> 

That's right, I didn't see that.

Thanks Stefan,

-- 
Fabien Chouteau

[Qemu-devel] [PATCH 3/3] virtio-pci: don't poll masked vectors

2012-12-18 Thread Michael S. Tsirkin

At the moment, when irqfd is in use but a vector is masked,
qemu will poll it and handle vector masks in userspace.
Since almost no one ever looks at the pending bits,
it is better to defer this until pending bits
are actually read.
Implement this optimization using the new poll notifier.
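
The idea can be sketched as follows, under simplifying assumptions: each
queue carries its own flags instead of an event notifier and a shared pending
bit array, and all sketch_* names are hypothetical. Only masked vectors in
the polled range have their notification converted into a pending bit:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_queue {
    unsigned vector;  /* MSI-X vector assigned to this queue */
    bool masked;      /* vector is currently masked by the guest */
    bool notified;    /* stand-in for an unread event notifier */
    bool pending;     /* stand-in for the vector's pending bit */
};

/* Called only when the guest actually reads pending bits in the range. */
static void sketch_vector_poll(struct sketch_queue *q, size_t n,
                               unsigned vector_start, unsigned vector_end)
{
    assert(q != NULL);
    for (size_t i = 0; i < n; i++) {
        if (q[i].vector < vector_start || q[i].vector >= vector_end ||
            !q[i].masked) {
            continue;   /* outside the range, or not masked */
        }
        if (q[i].notified) {        /* test ... */
            q[i].notified = false;  /* ... and clear */
            q[i].pending = true;
        }
    }
}
```

Nothing happens for a masked vector until this function runs, which is what
lets the masked case avoid polling in userspace.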

Signed-off-by: Michael S. Tsirkin 
---
 hw/virtio-pci.c | 52 
 1 file changed, 40 insertions(+), 12 deletions(-)

diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 1c03bb5..bc6b4e0 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -509,8 +509,6 @@ static int kvm_virtio_pci_vq_vector_use(VirtIOPCIProxy 
*proxy,
 }
 return ret;
 }
-
-virtio_queue_set_guest_notifier_fd_handler(vq, true, true);
 return 0;
 }
 
@@ -529,8 +527,6 @@ static void kvm_virtio_pci_vq_vector_release(VirtIOPCIProxy 
*proxy,
 if (--irqfd->users == 0) {
 kvm_irqchip_release_virq(kvm_state, irqfd->virq);
 }
-
-virtio_queue_set_guest_notifier_fd_handler(vq, true, false);
 }
 
 static int kvm_virtio_pci_vector_use(PCIDevice *dev, unsigned vector,
@@ -581,7 +577,36 @@ static void kvm_virtio_pci_vector_release(PCIDevice *dev, 
unsigned vector)
 }
 }
 
-static int virtio_pci_set_guest_notifier(DeviceState *d, int n, bool assign)
+static void kvm_virtio_pci_vector_poll(PCIDevice *dev,
+   unsigned int vector_start,
+   unsigned int vector_end)
+{
+VirtIOPCIProxy *proxy = container_of(dev, VirtIOPCIProxy, pci_dev);
+VirtIODevice *vdev = proxy->vdev;
+int queue_no;
+unsigned int vector;
+EventNotifier *notifier;
+VirtQueue *vq;
+
+for (queue_no = 0; queue_no < VIRTIO_PCI_QUEUE_MAX; queue_no++) {
+if (!virtio_queue_get_num(vdev, queue_no)) {
+break;
+}
+vector = virtio_queue_vector(vdev, queue_no);
+if (vector < vector_start || vector >= vector_end ||
+!msix_is_masked(dev, vector)) {
+continue;
+}
+vq = virtio_get_queue(vdev, queue_no);
+notifier = virtio_queue_get_guest_notifier(vq);
+if (event_notifier_test_and_clear(notifier)) {
+msix_set_pending(dev, vector);
+}
+}
+}
+
+static int virtio_pci_set_guest_notifier(DeviceState *d, int n, bool assign,
+ bool with_irqfd)
 {
 VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
 VirtQueue *vq = virtio_get_queue(proxy->vdev, n);
@@ -592,9 +617,9 @@ static int virtio_pci_set_guest_notifier(DeviceState *d, 
int n, bool assign)
 if (r < 0) {
 return r;
 }
-virtio_queue_set_guest_notifier_fd_handler(vq, true, false);
+virtio_queue_set_guest_notifier_fd_handler(vq, true, with_irqfd);
 } else {
-virtio_queue_set_guest_notifier_fd_handler(vq, false, false);
+virtio_queue_set_guest_notifier_fd_handler(vq, false, with_irqfd);
 event_notifier_cleanup(notifier);
 }
 
@@ -612,9 +637,11 @@ static int virtio_pci_set_guest_notifiers(DeviceState *d, 
bool assign)
 VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
 VirtIODevice *vdev = proxy->vdev;
 int r, n;
+bool with_irqfd = msix_enabled(&proxy->pci_dev) &&
+kvm_msi_via_irqfd_enabled();
 
 /* Must unset vector notifier while guest notifier is still assigned */
-if (kvm_msi_via_irqfd_enabled() && !assign) {
+if (with_irqfd && !assign) {
 msix_unset_vector_notifiers(&proxy->pci_dev);
 g_free(proxy->vector_irqfd);
 proxy->vector_irqfd = NULL;
@@ -625,21 +652,22 @@ static int virtio_pci_set_guest_notifiers(DeviceState *d, 
bool assign)
 break;
 }
 
-r = virtio_pci_set_guest_notifier(d, n, assign);
+r = virtio_pci_set_guest_notifier(d, n, assign,
+  kvm_msi_via_irqfd_enabled());
 if (r < 0) {
 goto assign_error;
 }
 }
 
 /* Must set vector notifier after guest notifier has been assigned */
-if (kvm_msi_via_irqfd_enabled() && assign) {
+if (with_irqfd && assign) {
 proxy->vector_irqfd =
 g_malloc0(sizeof(*proxy->vector_irqfd) *
   msix_nr_vectors_allocated(&proxy->pci_dev));
 r = msix_set_vector_notifiers(&proxy->pci_dev,
   kvm_virtio_pci_vector_use,
   kvm_virtio_pci_vector_release,
-  NULL);
+  kvm_virtio_pci_vector_poll);
 if (r < 0) {
 goto assign_error;
 }
@@ -651,7 +679,7 @@ assign_error:
 /* We get here on assignment failure. Recover by undoing for VQs 0 .. n. */
 assert(assign);
 while (--n >= 0) {
-virtio_pci_set_guest_notifier(d, n, !assign);
+virtio_pci_set_guest_notifier(d, n, !assign, with_irqfd);
 }

Re: [Qemu-devel] [PATCH 01/10] ide: Break all non-qdevified controllers

2012-12-18 Thread Markus Armbruster

Peter Maydell  writes:

> On 17 December 2012 14:05, Markus Armbruster  wrote:
>> The writing has been on the wall for a few years.
>
> ...behind a filing cabinet in a disused lavatory with a sign on the door
> saying "beware of the leopard"?
>
> We really need a better way to mark devices as "obsolete, will be
> removed/broken/etc in a future version"...

Yes, we do.

These devices, however, are a slightly different case: "need
maintenance, will be removed unless they get it".  I've asked for them
to be updated on several occasions, on the list[*], and in person at the
last three KVM Forums.

Having your board in the tree is a privilege, not a right.  You earn the
privilege by doing your share of the work.  In my opinion, that includes
moving them off obsolete infrastructure in a timely manner, at least
when keeping the obsolete infrastructure around just for you becomes a
drag.  Which it definitely has been in case of IDE.

[*] http://lists.nongnu.org/archive/html/qemu-devel/2012-07/msg00621.html

[Qemu-devel] [PATCH 2/3] msix: expose access to masked/pending state

2012-12-18 Thread Michael S. Tsirkin

For use by poll handler.

Signed-off-by: Michael S. Tsirkin 
---
 hw/pci/msix.c | 6 +++---
 hw/pci/msix.h | 3 +++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/hw/pci/msix.c b/hw/pci/msix.c
index 1f31975..9df0ffb 100644
--- a/hw/pci/msix.c
+++ b/hw/pci/msix.c
@@ -65,7 +65,7 @@ static int msix_is_pending(PCIDevice *dev, int vector)
 return *msix_pending_byte(dev, vector) & msix_pending_mask(vector);
 }
 
-static void msix_set_pending(PCIDevice *dev, int vector)
+void msix_set_pending(PCIDevice *dev, unsigned int vector)
 {
 *msix_pending_byte(dev, vector) |= msix_pending_mask(vector);
 }
@@ -75,13 +75,13 @@ static void msix_clr_pending(PCIDevice *dev, int vector)
 *msix_pending_byte(dev, vector) &= ~msix_pending_mask(vector);
 }
 
-static bool msix_vector_masked(PCIDevice *dev, int vector, bool fmask)
+static bool msix_vector_masked(PCIDevice *dev, unsigned int vector, bool fmask)
 {
 unsigned offset = vector * PCI_MSIX_ENTRY_SIZE + 
PCI_MSIX_ENTRY_VECTOR_CTRL;
 return fmask || dev->msix_table[offset] & PCI_MSIX_ENTRY_CTRL_MASKBIT;
 }
 
-static bool msix_is_masked(PCIDevice *dev, int vector)
+bool msix_is_masked(PCIDevice *dev, unsigned int vector)
 {
 return msix_vector_masked(dev, vector, dev->msix_function_masked);
 }
diff --git a/hw/pci/msix.h b/hw/pci/msix.h
index ea85d02..d0c4429 100644
--- a/hw/pci/msix.h
+++ b/hw/pci/msix.h
@@ -26,6 +26,9 @@ void msix_load(PCIDevice *dev, QEMUFile *f);
 int msix_enabled(PCIDevice *dev);
 int msix_present(PCIDevice *dev);
 
+bool msix_is_masked(PCIDevice *dev, unsigned vector);
+void msix_set_pending(PCIDevice *dev, unsigned vector);
+
 int msix_vector_use(PCIDevice *dev, unsigned vector);
 void msix_vector_unuse(PCIDevice *dev, unsigned vector);
 void msix_unuse_all_vectors(PCIDevice *dev);
-- 
MST

[Qemu-devel] [RFC PATCH v4 11/30] acpi_piix4 : Implement memory device hotplug registers

2012-12-18 Thread Vasilis Liaskovitis

A 32-byte register is used to present up to 256 hotplug-able memory devices
to BIOS and OSPM. Hot-add and hot-remove functions trigger an ACPI hotplug
event through these. Only reads are allowed from these registers.

An ACPI hot-remove event needs to wait for OSPM to eject the device.
We use a single-byte register to know when OSPM has called the _EJ function
for a particular dimm. A write to this byte will depopulate the respective dimm.
Only writes are allowed to this byte.
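
A minimal sketch of the status bitmap, assuming one bit per slot with the
least significant bit of each byte holding the lowest slot number (the bit
order is an assumption). The sketch_* names are hypothetical stand-ins for
the structures in the patch:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_DIMMS    256
#define SKETCH_BITMAP_BYTES (SKETCH_MAX_DIMMS / 8)   /* 32 bytes */

struct sketch_gpe_regs {
    uint8_t mems_sts[SKETCH_BITMAP_BYTES];
};

/* Set or clear the status bit for slot idx. */
static void sketch_set_dimm(struct sketch_gpe_regs *g, unsigned idx, int on)
{
    assert(idx < SKETCH_MAX_DIMMS);
    if (on) {
        g->mems_sts[idx / 8] |= (uint8_t)(1u << (idx % 8));
    } else {
        g->mems_sts[idx / 8] &= (uint8_t)~(1u << (idx % 8));
    }
}

/* Byte-wide read; offsets past the bitmap read as zero. */
static uint32_t sketch_memhp_readb(const struct sketch_gpe_regs *g,
                                   uint32_t addr)
{
    return addr < SKETCH_BITMAP_BYTES ? g->mems_sts[addr] : 0;
}
```

For example, marking slot 10 sets bit 2 of byte 1, so a read at offset 1
returns 0x04.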

v1->v2:
mems_sts address moved from 0xaf20 to 0xaf80 (to accommodate more space for
cpu-hotplugging in the future).
_EJ array is reduced to a single byte.
Add documentation in docs/specs/acpi_hotplug.txt

v3->v4: Removed hot-remove functions, will be added separately. Updated for
memory API.

Signed-off-by: Vasilis Liaskovitis 
---
 docs/specs/acpi_hotplug.txt |   14 +
 hw/acpi.h   |5 +++
 hw/acpi_piix4.c |   65 +-
 3 files changed, 82 insertions(+), 2 deletions(-)
 create mode 100644 docs/specs/acpi_hotplug.txt

diff --git a/docs/specs/acpi_hotplug.txt b/docs/specs/acpi_hotplug.txt
new file mode 100644
index 000..8391713
--- /dev/null
+++ b/docs/specs/acpi_hotplug.txt
@@ -0,0 +1,14 @@
+QEMU<->ACPI BIOS hotplug interface
+--
+This document describes the interface between QEMU and the ACPI BIOS for 
non-PCI
+space. For the PCI interface please look at docs/specs/acpi_pci_hotplug.txt
+
+QEMU<->ACPI BIOS memory hotplug interface
+--
+
+Memory Dimm status array (IO port 0xaf80-0xaf9f, 1-byte access):
+---
+Dimm hot-plug notification pending. One bit per slot.
+
+Read by ACPI BIOS GPE.3 handler to notify OS of memory hot-add or hot-remove
+events.  Read-only.
diff --git a/hw/acpi.h b/hw/acpi.h
index afda153..dc617d3 100644
--- a/hw/acpi.h
+++ b/hw/acpi.h
@@ -120,6 +120,11 @@ struct ACPIREGS {
 Notifier wakeup;
 };
 
+#include "dimm.h"
+struct gpe_regs {
+uint8_t mems_sts[DIMM_BITMAP_BYTES];
+};
+
 /* PM_TMR */
 void acpi_pm_tmr_update(ACPIREGS *ar, bool enable);
 void acpi_pm_tmr_calc_overflow_time(ACPIREGS *ar);
diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
index 0b5b0d3..879d8a0 100644
--- a/hw/acpi_piix4.c
+++ b/hw/acpi_piix4.c
@@ -29,6 +29,8 @@
 #include "ioport.h"
 #include "fw_cfg.h"
 #include "exec-memory.h"
+#include "sysbus.h"
+#include "dimm.h"
 
 //#define DEBUG
 
@@ -47,7 +49,9 @@
 #define PCI_DOWN_BASE 0xae04
 #define PCI_EJ_BASE 0xae08
 #define PCI_RMV_BASE 0xae0c
+#define MEM_BASE 0xaf80
 
+#define PIIX4_MEM_HOTPLUG_STATUS 8
 #define PIIX4_PCI_HOTPLUG_STATUS 2
 
 struct pci_status {
@@ -60,6 +64,7 @@ typedef struct PIIX4PMState {
 MemoryRegion io;
 MemoryRegion io_gpe;
 MemoryRegion io_pci;
+MemoryRegion io_memhp;
 ACPIREGS ar;
 
 APMState apm;
@@ -74,6 +79,7 @@ typedef struct PIIX4PMState {
 Notifier powerdown_notifier;
 
 /* for pci hotplug */
+struct gpe_regs gperegs;
 struct pci_status pci0_status;
 uint32_t pci0_hotplug_enable;
 uint32_t pci0_slot_device_present;
@@ -98,8 +104,8 @@ static void pm_update_sci(PIIX4PMState *s)
ACPI_BITMASK_POWER_BUTTON_ENABLE |
ACPI_BITMASK_GLOBAL_LOCK_ENABLE |
ACPI_BITMASK_TIMER_ENABLE)) != 0) ||
-(((s->ar.gpe.sts[0] & s->ar.gpe.en[0])
-  & PIIX4_PCI_HOTPLUG_STATUS) != 0);
+(((s->ar.gpe.sts[0] & s->ar.gpe.en[0]) &
+  (PIIX4_PCI_HOTPLUG_STATUS | PIIX4_MEM_HOTPLUG_STATUS)) != 0);
 
 qemu_set_irq(s->irq, sci_level);
 /* schedule a timer interruption if needed */
@@ -526,6 +532,29 @@ static const MemoryRegionOps piix4_gpe_ops = {
 .endianness = DEVICE_LITTLE_ENDIAN,
 };
 
+static uint32_t memhp_readb(void *opaque, uint32_t addr)
+{
+PIIX4PMState *s = opaque;
+uint32_t val = 0;
+struct gpe_regs *g = &s->gperegs;
+if (addr < DIMM_BITMAP_BYTES) {
+val = (uint32_t) g->mems_sts[addr];
+}
+PIIX4_DPRINTF(stderr, "memhp read %x == %x\n", addr, val);
+return val;
+}
+
+static const MemoryRegionOps piix4_memhp_ops = {
+.old_portio = (MemoryRegionPortio[]) {
+{
+.offset = 0,   .len = DIMM_BITMAP_BYTES, .size = 1,
+.read = memhp_readb,
+},
+PORTIO_END_OF_LIST()
+},
+.endianness = DEVICE_LITTLE_ENDIAN,
+};
+
 static uint32_t pci_up_read(void *opaque, uint32_t addr)
 {
 PIIX4PMState *s = opaque;
@@ -592,9 +621,11 @@ static const MemoryRegionOps piix4_pci_ops = {
 
 static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev,
 PCIHotplugState state);
+static int piix4_dimm_hotplug(DeviceState *qdev, DimmDevice *dev, int add);
 
 static void piix4_acpi_system_hot_add_init(PCIBus *bus, PIIX4PMState *s)
 {
+int i = 0;
 memory_region_init_io(&s->io_gpe, &piix4_gpe_ops, s, "apci-gpe0",

[Qemu-devel] [RFC PATCH v4 02/30] [SeaBIOS] Add SSDT memory device support

2012-12-18 Thread Vasilis Liaskovitis

Define SSDT hotplug-able memory devices in _SB namespace. The dynamically
generated SSDT includes per memory device hotplug methods. These methods
just call methods defined in the DSDT. Also dynamically generate a MTFY
method and a MEON array of the online/available memory devices.  ACPI
extraction macros are used to place the AML code in variables later used by
src/acpi. The design is taken from SSDT cpu generation.
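
As an illustration of the per-device naming only (the real code patches the
AML template in place; this helper and its name are hypothetical), each
memory device id could be turned into its 4-character name like this,
assuming two uppercase hex digits after the "MP" prefix:

```c
#include <assert.h>
#include <stdio.h>

/* Write the 4-character device name for a memory device id into out. */
static void sketch_mem_device_name(char out[5], unsigned id)
{
    assert(id <= 0xff);
    snprintf(out, 5, "MP%02X", id);
}
```

Under that assumption id 0x00 gives "MP00", and id 0xAA gives "MPAA", the
placeholder name used in the template.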

v3->v4: EJ0 operation will be provided separately
---
 Makefile |2 +-
 src/ssdt-mem.dsl |   62 ++
 2 files changed, 63 insertions(+), 1 deletions(-)
 create mode 100644 src/ssdt-mem.dsl

diff --git a/Makefile b/Makefile
index f28d86c..c8fcc57 100644
--- a/Makefile
+++ b/Makefile
@@ -220,7 +220,7 @@ $(OUT)%.hex: src/%.dsl ./tools/acpi_extract_preprocess.py 
./tools/acpi_extract.p
$(Q)$(PYTHON) ./tools/acpi_extract.py $(OUT)$*.lst > $(OUT)$*.off
$(Q)cat $(OUT)$*.off > $@
 
-$(OUT)acpi.o: $(OUT)acpi-dsdt.hex $(OUT)ssdt-proc.hex $(OUT)ssdt-pcihp.hex 
$(OUT)ssdt-susp.hex $(OUT)q35-acpi-dsdt.hex
+$(OUT)acpi.o: $(OUT)acpi-dsdt.hex $(OUT)ssdt-proc.hex $(OUT)ssdt-pcihp.hex 
$(OUT)ssdt-susp.hex $(OUT)q35-acpi-dsdt.hex $(OUT)ssdt-mem.hex
 
  Kconfig rules
 
diff --git a/src/ssdt-mem.dsl b/src/ssdt-mem.dsl
new file mode 100644
index 000..dbac33f
--- /dev/null
+++ b/src/ssdt-mem.dsl
@@ -0,0 +1,62 @@
+/* This file is the basis for the ssdt_mem[] variable in src/acpi.c.
+ * It is similar in design to the ssdt_proc variable.
+ * It defines the contents of the per-dimm QWordMemory() object.  At
+ * runtime, a dynamically generated SSDT will contain one copy of this
+ * AML snippet for every possible memory device in the system.  The
+ * objects will * be placed in the \_SB_ namespace.
+ *
+ * In addition to the aml code generated from this file, the
+ * src/acpi.c file creates a MTFY method with an entry for each memdevice:
+ * Method(MTFY, 2) {
+ * If (LEqual(Arg0, 0x00)) { Notify(MP00, Arg1) }
+ * If (LEqual(Arg0, 0x01)) { Notify(MP01, Arg1) }
+ * ...
+ * }
+ * and a MEON array with the list of active and inactive memory devices:
+ * Name(MEON, Package() { One, One, ..., Zero, Zero, ... })
+ */
+ACPI_EXTRACT_ALL_CODE ssdm_mem_aml
+
+DefinitionBlock ("ssdt-mem.aml", "SSDT", 0x02, "BXPC", "CSSDT", 0x1)
+/*  v-- DO NOT EDIT --v */
+{
+ACPI_EXTRACT_DEVICE_START ssdt_mem_start
+ACPI_EXTRACT_DEVICE_END ssdt_mem_end
+ACPI_EXTRACT_DEVICE_STRING ssdt_mem_name
+Device(MPAA) {
+ACPI_EXTRACT_NAME_BYTE_CONST ssdt_mem_id
+Name(ID, 0xAA)
+/*  ^-- DO NOT EDIT --^
+ *
+ * The src/acpi.c code requires the above layout so that it can update
+ * MPAA and 0xAA with the appropriate MEMDEVICE id (see
+ * SD_OFFSET_MEMHEX/MEMID1/MEMID2).  Don't change the above without
+ * also updating the C code.
+ */
+Name(_HID, EISAID("PNP0C80"))
+Name(_PXM, 0xAA)
+
+External(CMST, MethodObj)
+External(MPEJ, MethodObj)
+
+Name(_CRS, ResourceTemplate() {
+QwordMemory(
+   ResourceConsumer,
+   ,
+   MinFixed,
+   MaxFixed,
+   Cacheable,
+   ReadWrite,
+   0x0,
+   0xDEADBEEF,
+   0xE6ADBEEE,
+   0x,
+   0x0800,
+   )
+})
+Method (_STA, 0) {
+Return(CMST(ID))
+}
+}
+}
+
-- 
1.7.9

Re: [Qemu-devel] [PATCH 00/10] Drop code for non-qdevified IDE, and clean up

2012-12-18 Thread Anthony Liguori

Markus Armbruster  writes:

> *** Important ***
> This *breaks* all non-qdevified controllers, see PATCH 01/10.
> Maintainers are cc'ed.
>
> If you want still more time to qdevify your controller, please speak
> up now, and tell us how much.
>
> The rest of the series is obvious cleanups enabled by dropping the
> special case of a non-qdevified controller.  The block configuration
> stuff I'm working on also profits from it, and is real reason I'm
> posting this.

Breaking is not the right approach.  If you're asserting that the code
is unused and unloved, then remove it entirely from the tree.

Just breaking something is always wrong though.

I only see three users of ide_init2_with_non_qdev_drives.  Is there any
reason you didn't just convert these users to qdev?

Regards,

Anthony Liguori

>
> Markus Armbruster (10):
>   ide: Break all non-qdevified controllers
>   ide: Move IDEDevice pointer from IDEBus to IDEState
>   ide: Use IDEState member dev for "device connected" test
>   ide: Don't block-align IDEState member smart_selftest_data
>   ide: Drop redundant IDEState member bs
>   ide: Drop redundant IDEState geometry members
>   ide: Drop redundant IDEState member version
>   ide: Drop redundant IDEState member drive_serial_str
>   ide: Drop redundant IDEState member model
>   ide: Drop redundant IDEState member wwn
>
>  hw/ide/ahci.c   |  19 ++--
>  hw/ide/atapi.c  |  39 
>  hw/ide/core.c   | 278 
> +---
>  hw/ide/internal.h   |  15 +--
>  hw/ide/macio.c  |  20 ++--
>  hw/ide/microdrive.c |   5 +-
>  hw/ide/piix.c   |   1 -
>  hw/ide/qdev.c   |  50 +-
>  8 files changed, 189 insertions(+), 238 deletions(-)
>
> -- 
> 1.7.11.7

Re: [Qemu-devel] [PATCH V9 0/4] replace QEMUOptionParameter with QemuOpts parser

2012-12-18 Thread Stefan Hajnoczi

On Mon, Dec 17, 2012 at 02:42:25PM +0800, Dong Xu Wang wrote:
> Patch 1-3 are from Luiz, added Markus's comments, discussion could be found
> here:
> http://lists.nongnu.org/archive/html/qemu-devel/2012-07/msg02716.html
> Patch 3 was changed according to Paolo's comments.
> 
> Patch 4-5: qemu_opts_create cannot fail when id is NULL, so create a
> function qemu_opts_create_nofail and use it.
> 
> Patch 6: create function qemu_opt_set_number, like qemu_opt_set_bool.
> 
> Patch 7: add def_value and use it in qemu_opts_print.
> 
> Patch 8: Create functions to pair with QEMUOptionParameter parser.
> 
> Patch 9: Use QemuOpts parser in Block.
> 
> Patch 10: Remove QEMUOptionParameter parser related code.
> 
> Patches 1 - 6 have been merged into block branch, so only patches 8 to 10
> are included.
> 
> v8->v9)
> 1) add qemu_ prefix to gluster_create_opts.
> 2) fix bug: bdrv_gluster_unix and bdrv_gluster_rdma should also be
>converted.
> 
> v7->v8)
> 1) print "elements => accept any params" while opts_accepts_any() ==
> true.
> 2) since def_print_str is the default value if an option isn't set,
> so rename it to def_value_str.
> 3) rebase to upstream source tree.
> 4) add gluster.c, raw-win32.c, and rbd.c.
> 
> v6->v7:
> 1) Fix typo: enouth->enough.
> 2) use osdep.h:stringify(), not redefining new macro.
> 3) preserve TODO comment.
> 4) fix typo: BLOCK_OPT_ENCRYPT->BLOCK_OPT_STATIC.
> 5) initialize disk_type even when opts is NULL.
> 
> v5->v6:
> 1) allocate enough space in append_opts_list function.
> 2) check whether opts == NULL in block layer create functions.
> 3) use bdrv_create_file(filename, NULL) in qcow_create function.
> 4) made more readable while using qemu_opt_get_number function.
> 
> v4->v5:
> 1) Rewrite qemu_opts_create_nofail function based on Peter Maydell's comments.
> 2) Use g_strdup_printf in qemu_opt_set_number.
> 3) Rewrite qemu_opts_print.
> 4) .bdrv_create_options returns pointer directly. Fix a bug about 
> "encryption".
> 5) Check qemu_opt_get_number in raw-posix.c.
> 
> v3->v4:
> 1) Rebased to the newest source tree.
> 2) Remove redundant "#include "block-cache.h"
> 3) Other small changes.
> 
> v2->v3:
> 1) rewrite qemu_opt_set_bool and qemu_opt_set_number according to Paolo's
> comments.
> 2) split patches to make review easier.
> 
> v1->v2:
> 1) add Luiz's patches.
> 2) create qemu_opt_set_number() and qemu_opts_create_nofail() functions.
> 3) add QemuOptsList map to drivers.
> 4) use original opts parser, not creating new ones.
> 5) fix other bugs.
> 
> Dong Xu Wang (4):
>   add def_print_str and use it in qemu_opts_print.
>   Create four opts list related functions
>   Use QemuOpts support in block layer
>   remove QEMUOptionParameter related functions and struct
> 
>  block.c   |   91 ++---
>  block.h   |4 +-
>  block/cow.c   |   46 +++---
>  block/gluster.c   |   37 +++---
>  block/qcow.c  |   60 
>  block/qcow2.c |  171 ---
>  block/qed.c   |   86 ++--
>  block/raw-posix.c |   65 -
>  block/raw-win32.c |   30 ++--
>  block/raw.c   |   30 +++--
>  block/rbd.c   |   63 
>  block/sheepdog.c  |   75 +-
>  block/vdi.c   |   69 +-
>  block/vmdk.c  |   74 +-
>  block/vpc.c   |   67 +
>  block/vvfat.c |   11 +-
>  block_int.h   |6 +-
>  qemu-img.c|   61 
>  qemu-option.c |  406 
> +++--
>  qemu-option.h |   37 +
>  20 files changed, 641 insertions(+), 848 deletions(-)

block/rbd.c: In function ‘qemu_rbd_create’:
block/rbd.c:315:9: error: ‘cluster_size’ undeclared (first use in this function)
block/rbd.c:315:9: note: each undeclared identifier is reported only once for 
each function it appears in
block/rbd.c: At top level:
block/rbd.c:945:5: error: unknown field ‘create_options’ specified in 
initializer
block/rbd.c:945:5: error: initialization from incompatible pointer type 
[-Werror]
block/rbd.c:945:5: error: (near initialization for 
‘bdrv_rbd.bdrv_save_vmstate’) [-Werror]

Stefan

Re: [Qemu-devel] [PATCH 26/26] usbredir: Add support for buffered bulk input

2012-12-18 Thread Gerd Hoffmann

  Hi,

Added patches 1-25 to the usb queue (using v2 of patch 10).

>  hw/usb/redirect-ftdi-ids.h   | 1255 
> ++
>  hw/usb/redirect-pl2303-ids.h |  150 +
>  hw/usb/redirect-usb-ids.h|  910 ++

Where does this come from?  Linux kernel I guess?  What is the procedure
to update them?

I also think this shouldn't be tied to redir, I think it is better to
have a hw/usb/quirks.c file where the device id database and helper
functions to match devices against the list are living.

cheers,
  Gerd

[Qemu-devel] [RFC PATCH v4 09/30] Implement dimm device abstraction

2012-12-18 Thread Vasilis Liaskovitis

Each hotplug-able memory slot is a DimmDevice. All DimmDevices are attached
to a new bus called DimmBus. This bus is introduced so that we no longer
depend on hotplug-capability of main system bus (the main bus does not allow
hotplugging). The DimmBus should be attached to a chipset Device (i440fx in case
of the pc)

A hot-add operation for a particular dimm:
- creates a new DimmDevice and attaches it to the DimmBus
- creates a new MemoryRegion of the given physical address offset, size and
node proximity, and attaches it to main system memory as a sub_region.

Hotplug operations are done through normal device_add commands.
Also add properties to DimmDevice.

v3->v4: Removed hot-remove functions. Will be offered in separate patches.

Signed-off-by: Vasilis Liaskovitis 
---
 hw/Makefile.objs |2 +-
 hw/dimm.c|  245 ++
 hw/dimm.h|   89 
 3 files changed, 335 insertions(+), 1 deletions(-)
 create mode 100644 hw/dimm.c
 create mode 100644 hw/dimm.h

diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index d581d8d..51494c9 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -29,7 +29,7 @@ common-obj-$(CONFIG_I8254) += i8254_common.o i8254.o
 common-obj-$(CONFIG_PCSPK) += pcspk.o
 common-obj-$(CONFIG_PCKBD) += pckbd.o
 common-obj-$(CONFIG_FDC) += fdc.o
-common-obj-$(CONFIG_ACPI) += acpi.o acpi_piix4.o acpi_ich9.o smbus_ich9.o
+common-obj-$(CONFIG_ACPI) += acpi.o acpi_piix4.o acpi_ich9.o smbus_ich9.o 
dimm.o
 common-obj-$(CONFIG_APM) += pm_smbus.o apm.o
 common-obj-$(CONFIG_DMA) += dma.o
 common-obj-$(CONFIG_I82374) += i82374.o
diff --git a/hw/dimm.c b/hw/dimm.c
new file mode 100644
index 000..e384952
--- /dev/null
+++ b/hw/dimm.c
@@ -0,0 +1,245 @@
+/*
+ * Dimm device for Memory Hotplug
+ *
+ * Copyright ProfitBricks GmbH 2012
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see 
+ */
+
+#include "trace.h"
+#include "qdev.h"
+#include "dimm.h"
+#include 
+#include "../exec-memory.h"
+#include "qmp-commands.h"
+
+/* the following list is used to hold dimm config info before machine
+ * is initialized. After machine init, the list is not used anymore.*/
+static DimmConfiglist dimmconfig_list =
+   QTAILQ_HEAD_INITIALIZER(dimmconfig_list);
+
+/* the list of memory buses */
+static QLIST_HEAD(, DimmBus) memory_buses;
+
+static void dimmbus_dev_print(Monitor *mon, DeviceState *dev, int indent);
+static char *dimmbus_get_fw_dev_path(DeviceState *dev);
+
+static Property dimm_properties[] = {
+DEFINE_PROP_UINT64("start", DimmDevice, start, 0),
+DEFINE_PROP_SIZE("size", DimmDevice, size, DEFAULT_DIMMSIZE),
+DEFINE_PROP_UINT32("node", DimmDevice, node, 0),
+DEFINE_PROP_BIT("populated", DimmDevice, populated, 0, false),
+DEFINE_PROP_END_OF_LIST(),
+};
+
+static void dimmbus_dev_print(Monitor *mon, DeviceState *dev, int indent)
+{
+}
+
+static char *dimmbus_get_fw_dev_path(DeviceState *dev)
+{
+char path[40];
+
+snprintf(path, sizeof(path), "%s", qdev_fw_name(dev));
+return strdup(path);
+}
+
+static void dimm_bus_class_init(ObjectClass *klass, void *data)
+{
+BusClass *k = BUS_CLASS(klass);
+
+k->print_dev = dimmbus_dev_print;
+k->get_fw_dev_path = dimmbus_get_fw_dev_path;
+}
+
+static void dimm_bus_initfn(Object *obj)
+{
+DimmBus *bus = DIMM_BUS(obj);
+QTAILQ_INIT(&bus->dimmconfig_list);
+QTAILQ_INIT(&bus->dimmlist);
+}
+
+static const TypeInfo dimm_bus_info = {
+.name = TYPE_DIMM_BUS,
+.parent = TYPE_BUS,
+.instance_size = sizeof(DimmBus),
+.instance_init = dimm_bus_initfn,
+.class_init = dimm_bus_class_init,
+};
+
+DimmBus *dimm_bus_create(Object *parent, const char *name, uint32_t max_dimms,
+dimm_calcoffset_fn pmc_set_offset)
+{
+DimmBus *memory_bus;
+DimmConfig *dimm_cfg, *next_cfg;
+uint32_t num_dimms = 0;
+
+memory_bus = g_malloc0(dimm_bus_info.instance_size);
+memory_bus->qbus.name = name ? g_strdup(name) : "membus.0";
+qbus_create_inplace(&memory_bus->qbus, TYPE_DIMM_BUS, DEVICE(parent),
+ name);
+
+QTAILQ_FOREACH_SAFE(dimm_cfg, &dimmconfig_list, nextdimmcfg, next_cfg) {
+if (!strcmp(memory_bus->qbus.name, dimm_cfg->bus_name)) {
+if (max_dimms && (num_dimms == max_dimms)) {
+fprintf(stderr, "Bus %s can only accept %u number of DIMMs\n",
+

[Qemu-devel] [RFC PATCH v4 04/30] [SeaBIOS] acpi: generate hotplug memory devices

2012-12-18 Thread Vasilis Liaskovitis

The memory device generation is guided by qemu paravirt info. Seabios
first uses the info to setup SRAT entries for the hotplug-able memory slots.
Afterwards, build_memssdt uses the created SRAT entries to generate
appropriate memory device objects. One memory device (and corresponding SRAT
entry) is generated for each hotplug-able qemu memslot. Currently no SSDT
memory device is created for initial system memory.

We only support up to 255 DIMMs for now (PackageOp used for the MEON array can
only describe an array of at most 255 elements. VarPackageOp would be needed to
support more than 255 devices)

v1->v2:
Seabios reads mems_sts from qemu to build e820_map
SSDT size and some offsets are calculated with extraction macros.
---
 src/acpi.c |  158 +--
 1 files changed, 152 insertions(+), 6 deletions(-)

diff --git a/src/acpi.c b/src/acpi.c
index 6267d7b..82231da 100644
--- a/src/acpi.c
+++ b/src/acpi.c
@@ -14,6 +14,7 @@
 #include "ioport.h" // inl
 #include "paravirt.h" // qemu_cfg_irq0_override
 #include "dev-q35.h" // qemu_cfg_irq0_override
+#include "memmap.h"
 
 //
 /* ACPI tables init */
@@ -446,11 +447,26 @@ encodeLen(u8 *ssdt_ptr, int length, int bytes)
 #define PCIHP_AML (ssdp_pcihp_aml + *ssdt_pcihp_start)
 #define PCI_SLOTS 32
 
+/* 0x5B 0x82 DeviceOp PkgLength NameString DimmID */
+#define MEM_BASE 0xaf80
+#define MEM_AML (ssdm_mem_aml + *ssdt_mem_start)
+#define MEM_SIZEOF (*ssdt_mem_end - *ssdt_mem_start)
+#define MEM_OFFSET_HEX (*ssdt_mem_name - *ssdt_mem_start + 2)
+#define MEM_OFFSET_ID (*ssdt_mem_id - *ssdt_mem_start)
+#define MEM_OFFSET_PXM 31
+#define MEM_OFFSET_START 55
+#define MEM_OFFSET_END   63
+#define MEM_OFFSET_SIZE  79
+
+u64 nb_hp_memslots = 0;
+struct srat_memory_affinity *mem;
+
 #define SSDT_SIGNATURE 0x54445353 // SSDT
 #define SSDT_HEADER_LENGTH 36
 
 #include "ssdt-susp.hex"
 #include "ssdt-pcihp.hex"
+#include "ssdt-mem.hex"
 
 #define PCI_RMV_BASE 0xae0c
 
@@ -502,6 +518,111 @@ static void patch_pcihp(int slot, u8 *ssdt_ptr, u32 eject)
 }
 }
 
+static void build_memdev(u8 *ssdt_ptr, int i, u64 mem_base, u64 mem_len, u8 
node)
+{
+memcpy(ssdt_ptr, MEM_AML, MEM_SIZEOF);
+ssdt_ptr[MEM_OFFSET_HEX] = getHex(i >> 4);
+ssdt_ptr[MEM_OFFSET_HEX+1] = getHex(i);
+ssdt_ptr[MEM_OFFSET_ID] = i;
+ssdt_ptr[MEM_OFFSET_PXM] = node;
+*(u64*)(ssdt_ptr + MEM_OFFSET_START) = mem_base;
+*(u64*)(ssdt_ptr + MEM_OFFSET_END) = mem_base + mem_len;
+*(u64*)(ssdt_ptr + MEM_OFFSET_SIZE) = mem_len;
+}
+
+static void*
+build_memssdt(void)
+{
+u64 mem_base;
+u64 mem_len;
+u8  node;
+int i;
+struct srat_memory_affinity *entry = mem;
+u64 nb_memdevs = nb_hp_memslots;
+u8  memslot_status, enabled;
+
+int length = ((1+3+4)
+  + (nb_memdevs * MEM_SIZEOF)
+  + (1+2+5+(12*nb_memdevs))
+  + (6+2+1+(1*nb_memdevs)));
+u8 *ssdt = malloc_high(sizeof(struct acpi_table_header) + length);
+if (! ssdt) {
+warn_noalloc();
+return NULL;
+}
+u8 *ssdt_ptr = ssdt + sizeof(struct acpi_table_header);
+
+// build Scope(_SB_) header
+*(ssdt_ptr++) = 0x10; // ScopeOp
+ssdt_ptr = encodeLen(ssdt_ptr, length-1, 3);
+*(ssdt_ptr++) = '_';
+*(ssdt_ptr++) = 'S';
+*(ssdt_ptr++) = 'B';
+*(ssdt_ptr++) = '_';
+
+for (i = 0; i < nb_memdevs; i++) {
+mem_base = (((u64)(entry->base_addr_high) << 32 )| 
entry->base_addr_low);
+mem_len = (((u64)(entry->length_high) << 32 )| entry->length_low);
+node = entry->proximity[0];
+build_memdev(ssdt_ptr, i, mem_base, mem_len, node);
+ssdt_ptr += MEM_SIZEOF;
+entry++;
+}
+
+// build "Method(MTFY, 2) {If (LEqual(Arg0, 0x00)) {Notify(CM00, Arg1)} 
...}"
+*(ssdt_ptr++) = 0x14; // MethodOp
+ssdt_ptr = encodeLen(ssdt_ptr, 2+5+(12*nb_memdevs), 2);
+*(ssdt_ptr++) = 'M';
+*(ssdt_ptr++) = 'T';
+*(ssdt_ptr++) = 'F';
+*(ssdt_ptr++) = 'Y';
+*(ssdt_ptr++) = 0x02;
+for (i=0; i> 4);
+*(ssdt_ptr++) = getHex(i);
+*(ssdt_ptr++) = 0x69; // Arg1Op
+}
+
+// build "Name(MEON, Package() { One, One, ..., Zero, Zero, ... })"
+*(ssdt_ptr++) = 0x08; // NameOp
+*(ssdt_ptr++) = 'M';
+*(ssdt_ptr++) = 'E';
+*(ssdt_ptr++) = 'O';
+*(ssdt_ptr++) = 'N';
+*(ssdt_ptr++) = 0x12; // PackageOp
+ssdt_ptr = encodeLen(ssdt_ptr, 2+1+(1*nb_memdevs), 2);
+*(ssdt_ptr++) = nb_memdevs;
+
+entry = mem;
+memslot_status = 0;
+
+for (i = 0; i < nb_memdevs; i++) {
+enabled = 0;
+if (i % 8 == 0)
+memslot_status = inb(MEM_BASE + i/8);
+enabled = memslot_status & 1;
+mem_base = (((u64)(entry->base_addr_high) << 32 )| 
entry->base_addr_low);
+mem_len = (((u64)(entry->length_high) << 32 )| entry->length_low);
+*(ssdt_ptr++) = enabled ? 0x01 : 0x00;
+if (enabled

Re: [Qemu-devel] [RFC V3 01/24] qcow2: Add deduplication to the qcow2 specification.

2012-12-18 Thread Stefan Hajnoczi

On Wed, Dec 12, 2012 at 04:57:38PM +0100, Benoît Canet wrote:
> > Can you foresee the need to use a different hash algorithm in the future
> > and should we add a hash_algo enum field to the dedup QCOW2 header
> > extension?
> 
> Yes, I foresee the future use of faster hash functions like SHA3 or Skein.
> 
> I also think an alternate deduplication mechanism, where lookups are done
> on disk in order to be able to deduplicate very large volumes, could be added.
> 
> What would be the cleanest way to store this in the header extension:
> bitmaps or two char fields?

The header extension could have a uint8_t hash_algo field (and 3
reserved bytes that can be used in the future).

0 - SHA256
1 - Skein
...

Stefan

[Qemu-devel] [RFC PATCH v4 20/30] balloon: update with hotplugged memory

2012-12-18 Thread Vasilis Liaskovitis

query-balloon and "info balloon" should report total memory available to the
guest.

balloon inflate/deflate can also use all memory available to the guest (initial
+ hotplugged memory)

The balloon driver has been minimally tested with the patch; please review and test.

Caveat: if the guest does not online hotplugged-memory, it's easy for a balloon
inflate command to OOM a guest.
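
The arithmetic behind the change can be sketched as follows. This is an
illustration only, not the patch itself: balloon_num_pages() and
SKETCH_PFN_SHIFT are hypothetical names, and the shift of 12 (4 KiB balloon
pages) is assumed to match VIRTIO_BALLOON_PFN_SHIFT. Unlike the patch, the
sketch does not special-case a target of zero.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PFN_SHIFT 12  /* assumption: 4 KiB balloon pages */

/* Number of pages the balloon must hold so that the guest is left with
 * `target` bytes out of initial plus hotplugged memory. */
static uint64_t balloon_num_pages(uint64_t ram_size, uint64_t hotplugged,
                                  uint64_t target)
{
    uint64_t total = ram_size + hotplugged;

    assert(total >= ram_size);  /* the sum must not wrap around */
    if (target > total) {
        target = total;         /* clamp, as the patch does */
    }
    return (total - target) >> SKETCH_PFN_SHIFT;
}
```

With 1 GiB of initial RAM, 512 MiB hotplugged and a 1 GiB target, the balloon
would have to hold 512 MiB, i.e. 131072 pages of 4 KiB.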

Signed-off-by: Vasilis Liaskovitis 
---
 hw/virtio-balloon.c |   13 +
 1 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/hw/virtio-balloon.c b/hw/virtio-balloon.c
index dd1a650..149e8ba 100644
--- a/hw/virtio-balloon.c
+++ b/hw/virtio-balloon.c
@@ -22,6 +22,7 @@
 #include "virtio-balloon.h"
 #include "kvm.h"
 #include "exec-memory.h"
+#include "dimm.h"
 
 #if defined(__linux__)
 #include 
@@ -147,10 +148,11 @@ static void virtio_balloon_set_config(VirtIODevice *vdev,
 VirtIOBalloon *dev = to_virtio_balloon(vdev);
 struct virtio_balloon_config config;
 uint32_t oldactual = dev->actual;
+uint64_t hotplugged_ram_size = get_hp_memory_total();
 memcpy(&config, config_data, 8);
 dev->actual = le32_to_cpu(config.actual);
 if (dev->actual != oldactual) {
-qemu_balloon_changed(ram_size -
+qemu_balloon_changed(ram_size + hotplugged_ram_size -
  (dev->actual << VIRTIO_BALLOON_PFN_SHIFT));
 }
 }
@@ -188,17 +190,20 @@ static void virtio_balloon_stat(void *opaque, BalloonInfo 
*info)
 
 info->actual = ram_size - ((uint64_t) dev->actual <<
VIRTIO_BALLOON_PFN_SHIFT);
+info->actual += get_hp_memory_total();
 }
 
 static void virtio_balloon_to_target(void *opaque, ram_addr_t target)
 {
 VirtIOBalloon *dev = opaque;
+uint64_t hotplugged_ram_size = get_hp_memory_total();
 
-if (target > ram_size) {
-target = ram_size;
+if (target > ram_size + hotplugged_ram_size) {
+target = ram_size + hotplugged_ram_size;
 }
 if (target) {
-dev->num_pages = (ram_size - target) >> VIRTIO_BALLOON_PFN_SHIFT;
+dev->num_pages = (ram_size + hotplugged_ram_size - target) >>
+ VIRTIO_BALLOON_PFN_SHIFT;
 virtio_notify_config(&dev->vdev);
 }
 }
-- 
1.7.9

Re: [Qemu-devel] [RFC V3 00/24] QCOW2 deduplication

2012-12-18 Thread Stefan Hajnoczi

On Wed, Dec 12, 2012 at 05:14:28PM +0100, Benoît Canet wrote:
> 
> Hi Stefan,
> 
> I have a few questions
> 
> 1) overlapping sequential sub-cluster writes
> 
> The current code passes most of the tests and behaves well with a 4KB
> cluster sized ext3 volume on the deduplicated image.
> 
> But sequential writes smaller than the cluster size are troublesome.
> They fail with xfstest.
> The problem is that the lock is released twice, so coherency is not
> guaranteed when two sub-cluster-size writes are done on the same area.
> (a deduplication attempt is done while the first write is not yet on disk)
> 
> My understanding is that a wait_for_overlapping_cluster_write function called
> before the writev loop in order to serialize such writes would solve the
> problem.
> What do you think of this idea?

Yes, it's the same problem that copy-on-read has.  We can serialize I/O
requests, if necessary, in order to prevent them racing with each other.

> 2) Internal snapshot
> I don't fully understand if the current deduplication implementation is
> compatible with internal snapshots. If not, could it be done in a later
> patchset?

Let's figure out how hard it is to support internal snapshots for dedup.

Internal snapshot creation is simple:

1. Copy the current L1 table for the internal snapshot.
2. Increment refcounts for L2 and data clusters.
3. Finalize the internal snapshot.
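
The three steps can be sketched on a toy in-memory model. Everything here
(ToyImage, toy_snapshot_create(), the table sizes) is hypothetical and much
simpler than the real qcow2 code; it only illustrates why a data cluster that
is shared by several L2 entries, as dedup would produce, gets one extra
reference per entry.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_L1_ENTRIES 2
#define TOY_L2_ENTRIES 4
#define TOY_CLUSTERS   16

typedef struct ToyImage {
    uint32_t l1[TOY_L1_ENTRIES];                 /* cluster of L2 table i, 0 = none */
    uint32_t l2[TOY_L1_ENTRIES][TOY_L2_ENTRIES]; /* data cluster per entry, 0 = none */
    uint16_t refcount[TOY_CLUSTERS];             /* one counter per cluster */
    uint32_t snap_l1[TOY_L1_ENTRIES];            /* L1 copy owned by the snapshot */
    int      snap_valid;
} ToyImage;

static void toy_snapshot_create(ToyImage *img)
{
    assert(img != NULL);

    /* 1. Copy the current L1 table for the internal snapshot. */
    memcpy(img->snap_l1, img->l1, sizeof(img->l1));

    /* 2. Increment refcounts for L2 and data clusters. */
    for (int i = 0; i < TOY_L1_ENTRIES; i++) {
        if (img->l1[i] == 0) {
            continue;
        }
        img->refcount[img->l1[i]]++;
        for (int j = 0; j < TOY_L2_ENTRIES; j++) {
            uint32_t data = img->l2[i][j];
            if (data != 0) {
                img->refcount[data]++;  /* once per referencing entry */
            }
        }
    }

    /* 3. Finalize the internal snapshot. */
    img->snap_valid = 1;
}
```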

Where do you see an issue - do you think the refcount manipulations
you're doing for dedup might conflict with internal snapshots?

Stefan

Re: [Qemu-devel] [PATCH] e1000: Discard oversized packets based on SBP|LPE

2012-12-18 Thread Stefan Hajnoczi

On Wed, Dec 05, 2012 at 01:31:30PM -0500, Michael Contreras wrote:
> Discard packets longer than 16384 when !SBP to match the hardware behavior.
> 
> Signed-off-by: Michael Contreras 
> ---
>  hw/e1000.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)

Thanks, applied to the net tree:
https://github.com/stefanha/qemu/commits/net

Stefan

[Qemu-devel] [RFC PATCH v4 18/30] Introduce paravirt interface QEMU_CFG_PCI_WINDOW

2012-12-18 Thread Vasilis Liaskovitis

Qemu calculates the 32-bit and 64-bit PCI starting offsets based on
initial memory and hotplug-able dimms. This info needs to be passed to Seabios
for PCI initialization.

Signed-off-by: Vasilis Liaskovitis 
---
 hw/fw_cfg.h  |1 +
 hw/pc_piix.c |   10 ++
 hw/pc_q35.c  |9 +
 3 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/hw/fw_cfg.h b/hw/fw_cfg.h
index 619a394..8b48493 100644
--- a/hw/fw_cfg.h
+++ b/hw/fw_cfg.h
@@ -27,6 +27,7 @@
 #define FW_CFG_SETUP_SIZE   0x17
 #define FW_CFG_SETUP_DATA   0x18
 #define FW_CFG_FILE_DIR 0x19
+#define FW_CFG_PCI_WINDOW   0x1a
 
 #define FW_CFG_FILE_FIRST   0x20
 #define FW_CFG_FILE_SLOTS   0x10
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 1a99852..b6633e8 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -48,6 +48,7 @@
 #  include 
 #endif
 #include "piix_pci.h"
+#include "fw_cfg.h"
 
 #define MAX_IDE_BUS 2
 
@@ -86,6 +87,7 @@ static void pc_init1(MemoryRegion *system_memory,
 MemoryRegion *pci_memory;
 MemoryRegion *rom_memory;
 void *fw_cfg = NULL;
+uint64_t *pci_window_fw_cfg;
 I440FXState *i440fx_host;
 PIIX3State *piix3;
 
@@ -141,6 +143,14 @@ static void pc_init1(MemoryRegion *system_memory,
 
 qdev_init_nofail(DEVICE(i440fx_host));
 bochs_meminfo_bios_init(fw_cfg);
+
+pci_window_fw_cfg = g_malloc0(2 * 8);
+pci_window_fw_cfg[0] = cpu_to_le64(i440fx_host->mch.below_4g_mem_size);
+pci_window_fw_cfg[1] = cpu_to_le64(0x1ULL +
+i440fx_host->mch.above_4g_mem_size);
+fw_cfg_add_bytes(fw_cfg, FW_CFG_PCI_WINDOW,
+(uint8_t *)pci_window_fw_cfg, 2 * 8);
+
 i440fx_state = &i440fx_host->mch;
 pci_bus = i440fx_host->parent_obj.bus;
 /* Xen supports additional interrupt routes from the PCI devices to
diff --git a/hw/pc_q35.c b/hw/pc_q35.c
index 7ce0b53..e35814a 100644
--- a/hw/pc_q35.c
+++ b/hw/pc_q35.c
@@ -87,6 +87,7 @@ static void pc_q35_init(QEMUMachineInitArgs *args)
 PCIDevice *ahci;
 qemu_irq *cmos_s3;
 void *fw_cfg = NULL;
+uint64_t *pci_window_fw_cfg;
 
 pc_cpus_init(cpu_model);
 
@@ -139,6 +140,14 @@ static void pc_q35_init(QEMUMachineInitArgs *args)
 /* pci */
 qdev_init_nofail(DEVICE(q35_host));
 bochs_meminfo_bios_init(fw_cfg);
+
+pci_window_fw_cfg = g_malloc0(2 * 8);
+pci_window_fw_cfg[0] = cpu_to_le64(MCH_HOST_BRIDGE_PCIEXBAR_DEFAULT);
+pci_window_fw_cfg[1] = cpu_to_le64(0x1ULL +
+q35_host->mch.above_4g_mem_size);
+fw_cfg_add_bytes(fw_cfg, FW_CFG_PCI_WINDOW,
+(uint8_t *)pci_window_fw_cfg, 2 * 8);
+
 host_bus = q35_host->host.pci.bus;
 /* create ISA bus */
 lpc = pci_create_simple_multifunction(host_bus, PCI_DEVFN(ICH9_LPC_DEV,
-- 
1.7.9

[Qemu-devel] [RFC PATCH v4 10/30] vl: handle "-device dimm"

2012-12-18 Thread Vasilis Liaskovitis

Signed-off-by: Vasilis Liaskovitis 
---
 vl.c |   51 +++
 1 files changed, 51 insertions(+), 0 deletions(-)

diff --git a/vl.c b/vl.c
index a3ab384..8406933 100644
--- a/vl.c
+++ b/vl.c
@@ -169,6 +169,7 @@ int main(int argc, char **argv)
 
 #include "ui/qemu-spice.h"
 #include "qapi/string-input-visitor.h"
+#include "hw/dimm.h"
 
 //#define DEBUG_NET
 //#define DEBUG_SLIRP
@@ -249,6 +250,7 @@ static QTAILQ_HEAD(, FWBootEntry) fw_boot_order =
 int nb_numa_nodes;
 uint64_t node_mem[MAX_NODES];
 unsigned long *node_cpumask[MAX_NODES];
+int nb_hp_dimms;
 
 uint8_t qemu_uuid[16];
 
@@ -2065,6 +2067,50 @@ static int chardev_init_func(QemuOpts *opts, void 
*opaque)
 return 0;
 }
 
+static int dimmcfg_init_func(QemuOpts *opts, void *opaque)
+{
+const char *driver;
+const char *id;
+uint64_t node, size;
+uint32_t populated;
+const char *buf, *busbuf;
+
+/* DimmDevice configuration needs to be known in order to initialize chipset
+ * with correct memory and pci ranges. But all devices are created after
+ * chipset / machine initialization. In order to avoid this problem, we
+ * parse dimm information earlier into dimmcfg structs. */
+
+driver = qemu_opt_get(opts, "driver");
+if (!strcmp(driver, "dimm")) {
+
+id = qemu_opts_id(opts);
+buf = qemu_opt_get(opts, "size");
+parse_option_size("size", buf, &size, NULL);
+buf = qemu_opt_get(opts, "node");
+parse_option_number("node", buf, &node, NULL);
+busbuf = qemu_opt_get(opts, "bus");
+buf = qemu_opt_get(opts, "populated");
+if (!buf) {
+populated = 0;
+} else {
+populated = strcmp(buf, "on") ? 0 : 1;
+}
+
+dimm_config_create((char *)id, size, busbuf ? busbuf : "membus.0",
+node, nb_hp_dimms, populated);
+
+/* if !populated, we just keep the config. The real device
+ * will be created in the future with a normal device_add
+ * command. */
+if (!populated) {
+qemu_opts_del(opts);
+}
+nb_hp_dimms++;
+}
+
+return 0;
+}
+
 #ifdef CONFIG_VIRTFS
 static int fsdev_init_func(QemuOpts *opts, void *opaque)
 {
@@ -3859,6 +3905,11 @@ int main(int argc, char **argv, char **envp)
 }
 qemu_add_globals();
 
+/* init generic devices */
+if (qemu_opts_foreach(qemu_find_opts("device"),
+   dimmcfg_init_func, NULL, 1) != 0) {
+exit(1);
+}
 qdev_machine_init();
 
 QEMUMachineInitArgs args = { .ram_size = ram_size,
-- 
1.7.9

[Qemu-devel] [RFC PATCH v4 25/30] acpi_ich9: add hot-remove capability

2012-12-18 Thread Vasilis Liaskovitis

---
 hw/acpi_ich9.c |   28 +++-
 hw/acpi_ich9.h |1 +
 2 files changed, 28 insertions(+), 1 deletions(-)

diff --git a/hw/acpi_ich9.c b/hw/acpi_ich9.c
index abafbb5..f5dc1c9 100644
--- a/hw/acpi_ich9.c
+++ b/hw/acpi_ich9.c
@@ -105,12 +105,29 @@ static uint32_t memhp_readb(void *opaque, uint32_t addr)
 return val;
 }
 
+static void memhp_writeb(void *opaque, uint32_t addr, uint32_t val)
+{
+switch (addr) {
+case ICH9_MEM_EJ_BASE - ICH9_MEM_BASE:
+dimm_notify(val, DIMM_REMOVE_SUCCESS);
+break;
+default:
+ICH9_DEBUG("memhp write invalid %x <== %d\n", addr, val);
+}
+ICH9_DEBUG("memhp write %x <== %d\n", addr, val);
+}
+
 static const MemoryRegionOps ich9_memhp_ops = {
 .old_portio = (MemoryRegionPortio[]) {
 {
 .offset = 0,   .len = DIMM_BITMAP_BYTES, .size = 1,
 .read = memhp_readb,
 },
+{
+.offset = ICH9_MEM_EJ_BASE - ICH9_MEM_BASE,
+.len = 1, .size = 1,
+.write = memhp_writeb,
+},
 PORTIO_END_OF_LIST()
 },
 .endianness = DEVICE_LITTLE_ENDIAN,
@@ -234,6 +251,13 @@ static void enable_mem_device(ICH9LPCState *s, int 
memdevice)
 g->mems_sts[memdevice/8] |= (1 << (memdevice%8));
 }
 
+static void disable_mem_device(ICH9LPCState *s, int memdevice)
+{
+struct gpe_regs *g = &s->pm.gperegs;
+s->pm.acpi_regs.gpe.sts[0] |= ICH9_MEM_HOTPLUG_STATUS;
+g->mems_sts[memdevice/8] &= ~(1 << (memdevice%8));
+}
+
 static int ich9_dimm_hotplug(DeviceState *qdev, DimmDevice *dev, int
 add)
 {
@@ -243,6 +267,8 @@ static int ich9_dimm_hotplug(DeviceState *qdev, DimmDevice 
*dev, int
 
 if (add) {
 enable_mem_device(s, slot->idx);
+} else {
+disable_mem_device(s, slot->idx);
 }
 pm_update_sci(&s->pm);
 return 0;
@@ -270,7 +296,7 @@ void ich9_pm_init(void *device, qemu_irq sci_irq, qemu_irq 
cmos_s3)
 memory_region_add_subregion(&pm->io, ICH9_PMIO_SMI_EN, &pm->io_smi);
 
 memory_region_init_io(&pm->io_memhp, &ich9_memhp_ops, pm, "apci-memhp0",
-  DIMM_BITMAP_BYTES);
+  DIMM_BITMAP_BYTES + 1);
 memory_region_add_subregion(get_system_io(), ICH9_MEM_BASE, &pm->io_memhp);
 
 dimm_bus_hotplug(ich9_dimm_hotplug, &lpc->d.qdev);
diff --git a/hw/acpi_ich9.h b/hw/acpi_ich9.h
index 4419247..af61a2d 100644
--- a/hw/acpi_ich9.h
+++ b/hw/acpi_ich9.h
@@ -24,6 +24,7 @@
 #include "acpi.h"
 
 #define ICH9_MEM_BASE0xaf80
+#define ICH9_MEM_EJ_BASE0xafa0
 #define ICH9_MEM_HOTPLUG_STATUS 8
 
 typedef struct ICH9LPCPMRegs {
-- 
1.7.9

Re: [Qemu-devel] [PATCH v2] net: clean up network at qemu process termination

2012-12-18 Thread Stefan Hajnoczi

On Tue, Dec 11, 2012 at 10:20:15PM +0800, Amos Kong wrote:
> We don't clean up the network if QEMU fails to parse "-device" parameters,
> because it exits without calling net_cleanup(). I hit a problem where the
> tap device created by the qemu-ifup script could not be removed by the
> qemu-ifdown script. Some similar problems also exist in vl.c.
> 
> In this patch, if network initialization succeeds, a cleanup function
> will be registered to be called at qemu process termination.
> 
> Signed-off-by: Amos Kong 
> ---
> v2: register cleanup function before network initialization
> ---
>  vl.c |4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)

Thanks, applied to the net tree:
https://github.com/stefanha/qemu/commits/net

Stefan

Re: [Qemu-devel] [PATCH 0/2] i2c: Add AT24Cxx EEPROM model

2012-12-18 Thread Stefan Hajnoczi

On Wed, Dec 12, 2012 at 10:44:10AM +0100, Jan Kiszka wrote:
> On 2012-11-19 15:24, Jan Kiszka wrote:
> > See patches for details.
> > 
> > Jan Kiszka (2):
> >   i2c: Introduce device address mask
> >   Add AT24Cxx I2C EEPROM device model
> > 
> >  hw/Makefile.objs |2 +-
> >  hw/at24.c|  363 
> > ++
> >  hw/ds1338.c  |2 +-
> >  hw/i2c.c |9 +-
> >  hw/i2c.h |3 +-
> >  hw/lm832x.c  |2 +-
> >  hw/max7310.c |2 +-
> >  hw/pxa2xx.c  |3 +-
> >  hw/smbus.c   |2 +-
> >  hw/ssd0303.c |2 +-
> >  hw/tmp105.c  |2 +-
> >  hw/tosa.c|2 +-
> >  hw/twl92230.c|2 +-
> >  hw/wm8750.c  |2 +-
> >  hw/z2.c  |2 +-
> >  15 files changed, 383 insertions(+), 17 deletions(-)
> >  create mode 100644 hw/at24.c
> > 
> 
> Ping. Both still apply over latest master, and addressable review
> comments are not pending according to my understanding.

Hi Jan,
Just going through my piled up qemu-devel mailbox.  You're asking for a
qemu.git commit to merge this, right?

Stefan

Re: [Qemu-devel] [PATCH 01/10] ide: Break all non-qdevified controllers

2012-12-18 Thread Peter Maydell

On 18 December 2012 12:44, Markus Armbruster  wrote:
> Peter Maydell  writes:
>> We really need a better way to mark devices as "obsolete, will be
>> removed/broken/etc in a future version"...
>
> Yes, we do.
>
> These devices, however, are a slightly different case: "need
> maintenance, will be removed unless they get it".  I've asked for them
> to be updated on several occasions, on the list[*], and in person at the
> last three KVM Forums.

My concern is basically that I suspect there's a user community
out there that doesn't necessarily read the list or go to KVM Forum.
I actually have no idea which of the various elderly ARM boards are
really used and which we could just delete. If back in July we'd
marked the relevant boards as "this may disappear in the next version"
we might have got some feedback about if anybody was actually using
tosa, spitz, etc.

> Having your board in the tree is a privilege, not a right.  You earn the
> privilege by doing your share of the work.

I agree completely with this.

-- PMM

Re: [Qemu-devel] [PATCH] virtio-blk: Return UNSUPP for unknown request types

2012-12-18 Thread Stefan Hajnoczi

On Thu, Dec 13, 2012 at 09:03:43AM +0200, Alexey Zaytsev wrote:
> Currently, all unknown requests are treated as VIRTIO_BLK_T_IN
> 
> Signed-off-by: Alexey Zaytsev 
> ---
> Sorry, made a typo when formatting the patch, please consider this one.
> 
>  hw/virtio-blk.c |6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan

[Qemu-devel] [RFC PATCH v4 03/30] [SeaBIOS] acpi-dsdt: Implement functions for memory hotplug

2012-12-18 Thread Vasilis Liaskovitis

Extend the DSDT to include methods for handling memory hot-add and hot-remove
notifications and memory device status requests. These functions are called
from the memory device SSDT methods.
---
 src/acpi-dsdt-mem-hotplug.dsl |   57 +
 src/acpi-dsdt.dsl |5 +++-
 2 files changed, 61 insertions(+), 1 deletions(-)
 create mode 100644 src/acpi-dsdt-mem-hotplug.dsl

diff --git a/src/acpi-dsdt-mem-hotplug.dsl b/src/acpi-dsdt-mem-hotplug.dsl
new file mode 100644
index 000..0e7ced3
--- /dev/null
+++ b/src/acpi-dsdt-mem-hotplug.dsl
@@ -0,0 +1,57 @@
+/
+ * Memory hotplug
+ /
+
+Scope(\_SB) {
+/* Objects filled in by run-time generated SSDT */
+External(MTFY, MethodObj)
+External(MEON, PkgObj)
+
+Method (CMST, 1, NotSerialized) {
+// _STA method - return ON status of memdevice
+// Local0 = MEON flag for this cpu
+Store(DerefOf(Index(MEON, Arg0)), Local0)
+If (Local0) { Return(0xF) } Else { Return(0x0) }
+}
+
+/* Memory hotplug notify array */
+OperationRegion(MEST, SystemIO, 0xaf80, 32)
+Field (MEST, ByteAcc, NoLock, Preserve)
+{
+MES, 256
+}
+ 
+Method(MESC, 0) {
+// Local5 = active memdevice bitmap
+Store (MES, Local5)
+// Local2 = last read byte from bitmap
+Store (Zero, Local2)
+// Local0 = memory device iterator
+Store (Zero, Local0)
+While (LLess(Local0, SizeOf(MEON))) {
+// Local1 = MEON flag for this memory device
+Store(DerefOf(Index(MEON, Local0)), Local1)
+If (And(Local0, 0x07)) {
+// Shift down previously read bitmap byte
+ShiftRight(Local2, 1, Local2)
+} Else {
+// Read next byte from memdevice bitmap
+Store(DerefOf(Index(Local5, ShiftRight(Local0, 3))), Local2)
+}
+// Local3 = active state for this memory device
+Store(And(Local2, 1), Local3)
+
+If (LNotEqual(Local1, Local3)) {
+// State change - update MEON with new state
+Store(Local3, Index(MEON, Local0))
+// Do MEM notify
+If (LEqual(Local3, 1)) {
+MTFY(Local0, 1)
+}
+}
+Increment(Local0)
+}
+Return(One)
+}
+
+}
diff --git a/src/acpi-dsdt.dsl b/src/acpi-dsdt.dsl
index 158f6b4..98c9413 100644
--- a/src/acpi-dsdt.dsl
+++ b/src/acpi-dsdt.dsl
@@ -294,6 +294,7 @@ DefinitionBlock (
 }
 
 #include "acpi-dsdt-cpu-hotplug.dsl"
+#include "acpi-dsdt-mem-hotplug.dsl"
 
 
 /****************************************************************
@@ -313,7 +314,9 @@ DefinitionBlock (
 // CPU hotplug event
 \_SB.PRSC()
 }
-Method(_L03) {
+Method(_E03) {
+// Memory hotplug event
+\_SB.MESC()
 }
 Method(_L04) {
 }
-- 
1.7.9

Re: [Qemu-devel] [RFC PATCH v6 0/6] Virtio refactoring.

2012-12-18 Thread Peter Maydell

On 18 December 2012 13:10, Michael S. Tsirkin  wrote:
> And what makes virtio so special anyway? e1000 can be used without
> exposing users to internal buses and all kind of nastiness like this.

Congratulations, you're using an architecture that has a pluggable
discoverable bus implemented by just about all machines using that
architecture. That makes things much easier for you.

-- PMM

[Qemu-devel] [RFC PATCH v4 08/30] qemu-option: export parse_option_number

2012-12-18 Thread Vasilis Liaskovitis

Signed-off-by: Vasilis Liaskovitis 
---
 qemu-option.c |2 +-
 qemu-option.h |2 ++
 2 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/qemu-option.c b/qemu-option.c
index 38e0a11..88fd370 100644
--- a/qemu-option.c
+++ b/qemu-option.c
@@ -185,7 +185,7 @@ static void parse_option_bool(const char *name, const char *value, bool *ret,
 }
 }
 
-static void parse_option_number(const char *name, const char *value,
+void parse_option_number(const char *name, const char *value,
 uint64_t *ret, Error **errp)
 {
 char *postfix;
diff --git a/qemu-option.h b/qemu-option.h
index b8ee5b3..8b7235f 100644
--- a/qemu-option.h
+++ b/qemu-option.h
@@ -154,5 +154,7 @@ int qemu_opts_foreach(QemuOptsList *list, qemu_opts_loopfunc func, void *opaque,
   int abort_on_failure);
 void parse_option_size(const char *name, const char *value,
   uint64_t *ret, Error **errp);
+void parse_option_number(const char *name, const char *value,
+uint64_t *ret, Error **errp);
 
 #endif
-- 
1.7.9

[Qemu-devel] [PATCH 1/3] msi: add API to get notified about pending bit poll

2012-12-18 Thread Michael S. Tsirkin

Update all users.
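
As a rough illustration of what the poll notifier receives: the PBA holds one
pending bit per vector, so a read of size bytes at byte offset addr touches a
contiguous range of vectors. The helper below is a standalone sketch, not code
from this patch. pba_read_vector_range and VectorRange are hypothetical names,
and the sketch assumes the end of the range is meant to be (addr + size) * 8,
clamped to the number of table entries.

```c
#include <assert.h>
#include <stdint.h>

/* Half-open range [start, end) of MSI-X vectors. */
typedef struct VectorRange {
    unsigned start;
    unsigned end;
} VectorRange;

/*
 * Map a PBA read (byte offset + length in bytes) to the vectors whose
 * pending bits it covers. Each PBA byte carries 8 pending bits.
 */
static VectorRange pba_read_vector_range(uint64_t addr, unsigned size,
                                         unsigned nr_entries)
{
    VectorRange r;
    uint64_t first = addr * 8;
    uint64_t last = (addr + size) * 8;

    assert(size > 0);
    /* Clamp in 64 bits before narrowing to unsigned. */
    r.start = (unsigned)(first < nr_entries ? first : nr_entries);
    r.end = (unsigned)(last < nr_entries ? last : nr_entries);
    return r;
}
```

For a 4-byte read at offset 4 this gives vectors 32..63, which is the range a
device would need to refresh before the read returns.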

Signed-off-by: Michael S. Tsirkin 
---
 hw/pci/msix.c   | 13 -
 hw/pci/msix.h   |  3 ++-
 hw/pci/pci.h|  4 
 hw/vfio_pci.c   |  2 +-
 hw/virtio-pci.c |  3 ++-
 5 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/hw/pci/msix.c b/hw/pci/msix.c
index 917327b..1f31975 100644
--- a/hw/pci/msix.c
+++ b/hw/pci/msix.c
@@ -192,6 +192,11 @@ static uint64_t msix_pba_mmio_read(void *opaque, hwaddr addr,
unsigned size)
 {
 PCIDevice *dev = opaque;
+if (dev->msix_vector_poll_notifier) {
+unsigned vector_start = addr * 8;
+unsigned vector_end = MIN(addr + size * 8, dev->msix_entries_nr);
+dev->msix_vector_poll_notifier(dev, vector_start, vector_end);
+}
 
 return pci_get_long(dev->msix_pba + addr);
 }
@@ -515,7 +520,8 @@ static void msix_unset_notifier_for_vector(PCIDevice *dev, unsigned int vector)
 
 int msix_set_vector_notifiers(PCIDevice *dev,
   MSIVectorUseNotifier use_notifier,
-  MSIVectorReleaseNotifier release_notifier)
+  MSIVectorReleaseNotifier release_notifier,
+  MSIVectorPollNotifier poll_notifier)
 {
 int vector, ret;
 
@@ -523,6 +529,7 @@ int msix_set_vector_notifiers(PCIDevice *dev,
 
 dev->msix_vector_use_notifier = use_notifier;
 dev->msix_vector_release_notifier = release_notifier;
+dev->msix_vector_poll_notifier = poll_notifier;
 
 if ((dev->config[dev->msix_cap + MSIX_CONTROL_OFFSET] &
 (MSIX_ENABLE_MASK | MSIX_MASKALL_MASK)) == MSIX_ENABLE_MASK) {
@@ -533,6 +540,9 @@ int msix_set_vector_notifiers(PCIDevice *dev,
 }
 }
 }
+if (dev->msix_vector_poll_notifier) {
+dev->msix_vector_poll_notifier(dev, 0, dev->msix_entries_nr);
+}
 return 0;
 
 undo:
@@ -559,4 +569,5 @@ void msix_unset_vector_notifiers(PCIDevice *dev)
 }
 dev->msix_vector_use_notifier = NULL;
 dev->msix_vector_release_notifier = NULL;
+dev->msix_vector_poll_notifier = NULL;
 }
diff --git a/hw/pci/msix.h b/hw/pci/msix.h
index ff07ae2..ea85d02 100644
--- a/hw/pci/msix.h
+++ b/hw/pci/msix.h
@@ -36,6 +36,7 @@ void msix_reset(PCIDevice *dev);
 
 int msix_set_vector_notifiers(PCIDevice *dev,
   MSIVectorUseNotifier use_notifier,
-  MSIVectorReleaseNotifier release_notifier);
+  MSIVectorReleaseNotifier release_notifier,
+  MSIVectorPollNotifier poll_notifier);
 void msix_unset_vector_notifiers(PCIDevice *dev);
 #endif
diff --git a/hw/pci/pci.h b/hw/pci/pci.h
index 41e5ddd..f80f8fb 100644
--- a/hw/pci/pci.h
+++ b/hw/pci/pci.h
@@ -187,6 +187,9 @@ typedef void (*PCIINTxRoutingNotifier)(PCIDevice *dev);
 typedef int (*MSIVectorUseNotifier)(PCIDevice *dev, unsigned int vector,
   MSIMessage msg);
 typedef void (*MSIVectorReleaseNotifier)(PCIDevice *dev, unsigned int vector);
+typedef void (*MSIVectorPollNotifier)(PCIDevice *dev,
+  unsigned int vector_start,
+  unsigned int vector_end);
 
 struct PCIDevice {
 DeviceState qdev;
@@ -271,6 +274,7 @@ struct PCIDevice {
 /* MSI-X notifiers */
 MSIVectorUseNotifier msix_vector_use_notifier;
 MSIVectorReleaseNotifier msix_vector_release_notifier;
+MSIVectorPollNotifier msix_vector_poll_notifier;
 };
 
 void pci_register_bar(PCIDevice *pci_dev, int region_num,
diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
index 45d90ab..80c11de 100644
--- a/hw/vfio_pci.c
+++ b/hw/vfio_pci.c
@@ -697,7 +697,7 @@ static void vfio_enable_msix(VFIODevice *vdev)
 vdev->interrupt = VFIO_INT_MSIX;
 
 if (msix_set_vector_notifiers(&vdev->pdev, vfio_msix_vector_use,
-  vfio_msix_vector_release)) {
+  vfio_msix_vector_release, NULL)) {
 error_report("vfio: msix_set_vector_notifiers failed\n");
 }
 
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 518fb8a..1c03bb5 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -638,7 +638,8 @@ static int virtio_pci_set_guest_notifiers(DeviceState *d, bool assign)
   msix_nr_vectors_allocated(&proxy->pci_dev));
 r = msix_set_vector_notifiers(&proxy->pci_dev,
   kvm_virtio_pci_vector_use,
-  kvm_virtio_pci_vector_release);
+  kvm_virtio_pci_vector_release,
+  NULL);
 if (r < 0) {
 goto assign_error;
 }
-- 
MST

Re: [Qemu-devel] KVM call agenda for 2012-12-18

2012-12-18 Thread Anthony Liguori

Juan Quintela  writes:

> Hi
>
> Please send in any agenda topics that you have.

I have a conflicting call today so I can't attend.

Regards,

Anthony Liguori

>
> Thanks, Juan.

[Qemu-devel] [RFC PATCH v4 05/30] [SeaBIOS] q35: Add memory hotplug handler

2012-12-18 Thread Vasilis Liaskovitis

---
 src/q35-acpi-dsdt.dsl |6 --
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/q35-acpi-dsdt.dsl b/src/q35-acpi-dsdt.dsl
index c031d83..5b28d72 100644
--- a/src/q35-acpi-dsdt.dsl
+++ b/src/q35-acpi-dsdt.dsl
@@ -403,7 +403,7 @@ DefinitionBlock (
 }
 
 #include "acpi-dsdt-cpu-hotplug.dsl"
-
+#include "acpi-dsdt-mem-hotplug.dsl"
 
 /****************************************************************
  * General purpose events
@@ -418,7 +418,9 @@ DefinitionBlock (
 // CPU hotplug event
 \_SB.PRSC()
 }
-Method(_L02) {
+Method(_E02) {
+// Memory hotplug event
+\_SB.MESC()
 }
 Method(_L03) {
 }
-- 
1.7.9

[Qemu-devel] [RFC PATCH v4 19/30] Implement "info memory-total" and "query-memory-total"

2012-12-18 Thread Vasilis Liaskovitis

Returns total physical memory available to guest in bytes, including hotplugged
memory. Note that the number reported here may be different from what the guest
sees e.g. if the guest has not logically onlined hotplugged memory.

This functionality is provided independently of a balloon device, since a
guest can be using ACPI memory hotplug without using a balloon device.
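
The reported quantity can be thought of as the boot-time RAM size plus the size
of every plugged DIMM. A minimal sketch of that sum follows, under exactly that
assumption. Dimm, memory_total and base_ram are illustrative names, not the
structures used in hw/dimm.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one hotpluggable DIMM slot. */
typedef struct Dimm {
    uint64_t size;   /* size in bytes */
    int populated;   /* non-zero if the DIMM is currently plugged */
} Dimm;

/* Total guest-physical memory in bytes: base RAM plus plugged DIMMs. */
static uint64_t memory_total(uint64_t base_ram, const Dimm *dimms, size_t n)
{
    uint64_t total = base_ram;

    assert(dimms != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (dimms[i].populated) {
            total += dimms[i].size;
        }
    }
    return total;
}
```

As noted above, this is the amount made available by the machine, not
necessarily what the guest has onlined.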

v3->v4: Moved qmp command implementation to vl.c. This prevents a circular
header dependency problem.

Signed-off-by: Vasilis Liaskovitis 
---
 hmp-commands.hx  |2 ++
 hmp.c|7 +++
 hmp.h|1 +
 hw/dimm.c|   14 ++
 hw/dimm.h|1 +
 monitor.c|7 +++
 qapi-schema.json |   11 +++
 qmp-commands.hx  |   20 
 vl.c |9 +
 9 files changed, 72 insertions(+), 0 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 010b8c9..3fbd975 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1570,6 +1570,8 @@ show device tree
 show qdev device model list
 @item info roms
 show roms
+@item info memory-total
+show memory-total
 @end table
 ETEXI
 
diff --git a/hmp.c b/hmp.c
index 180ba2b..fb39b0d 100644
--- a/hmp.c
+++ b/hmp.c
@@ -628,6 +628,13 @@ void hmp_info_block_jobs(Monitor *mon)
 }
 }
 
+void hmp_info_memory_total(Monitor *mon)
+{
+uint64_t ram_total;
+ram_total = (uint64_t)qmp_query_memory_total(NULL);
+monitor_printf(mon, "MemTotal: %" PRIu64 "\n", ram_total);
+}
+
 void hmp_quit(Monitor *mon, const QDict *qdict)
 {
 monitor_suspend(mon);
diff --git a/hmp.h b/hmp.h
index 0ab03be..25a3a70 100644
--- a/hmp.h
+++ b/hmp.h
@@ -36,6 +36,7 @@ void hmp_info_spice(Monitor *mon);
 void hmp_info_balloon(Monitor *mon);
 void hmp_info_pci(Monitor *mon);
 void hmp_info_block_jobs(Monitor *mon);
+void hmp_info_memory_total(Monitor *mon);
 void hmp_quit(Monitor *mon, const QDict *qdict);
 void hmp_stop(Monitor *mon, const QDict *qdict);
 void hmp_system_reset(Monitor *mon, const QDict *qdict);
diff --git a/hw/dimm.c b/hw/dimm.c
index e384952..f181e54 100644
--- a/hw/dimm.c
+++ b/hw/dimm.c
@@ -189,6 +189,20 @@ void dimm_setup_fwcfg_layout(uint64_t *fw_cfg_slots)
 }
 }
 
+uint64_t get_hp_memory_total(void)
+{
+DimmBus *bus;
+DimmDevice *slot;
+uint64_t info = 0;
+
+QLIST_FOREACH(bus, &memory_buses, next) {
+QTAILQ_FOREACH(slot, &bus->dimmlist, nextdimm) {
+info += slot->size;
+}
+}
+return info;
+}
+
 static int dimm_init(DeviceState *s)
 {
 DimmBus *bus = DIMM_BUS(qdev_get_parent_bus(s));
diff --git a/hw/dimm.h b/hw/dimm.h
index 75a6911..5130b2c 100644
--- a/hw/dimm.h
+++ b/hw/dimm.h
@@ -85,5 +85,6 @@ DimmBus *dimm_bus_create(Object *parent, const char *name, uint32_t max_dimms,
 dimm_calcoffset_fn pmc_set_offset);
 void dimm_config_create(char *id, uint64_t size, const char *bus, uint64_t node,
 uint32_t dimm_idx, uint32_t populated);
+uint64_t get_hp_memory_total(void);
 
 #endif
diff --git a/monitor.c b/monitor.c
index c0e32d6..6e87d0d 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2708,6 +2708,13 @@ static mon_cmd_t info_cmds[] = {
 .mhandler.info = hmp_info_balloon,
 },
 {
+.name   = "memory-total",
+.args_type  = "",
+.params = "",
+.help   = "show total memory size",
+.mhandler.info = hmp_info_memory_total,
+},
+{
 .name   = "qtree",
 .args_type  = "",
 .params = "",
diff --git a/qapi-schema.json b/qapi-schema.json
index 5dfa052..33f88d6 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2903,6 +2903,17 @@
 { 'command': 'query-target', 'returns': 'TargetInfo' }
 
 ##
+# @query-memory-total:
+#
+# Returns total memory in bytes, including hotplugged dimms
+#
+# Returns: int
+#
+# Since: 1.4
+##
+{ 'command': 'query-memory-total', 'returns': 'int' }
+
+##
 # @QKeyCode:
 #
 # An enumeration of key name.
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 5c692d0..a99117a 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -2654,3 +2654,23 @@ EQMP
 .args_type  = "",
 .mhandler.cmd_new = qmp_marshal_input_query_target,
 },
+
+{
+.name   = "query-memory-total",
+.args_type  = "",
+.mhandler.cmd_new = qmp_marshal_input_query_memory_total
+},
+SQMP
+query-memory-total
+--
+
+Return total memory in bytes, including hotplugged dimms
+
+Example:
+
+-> { "execute": "query-memory-total" }
+<- {
+  "return": 1073741824
+   }
+
+EQMP
diff --git a/vl.c b/vl.c
index 8406933..80803c5 100644
--- a/vl.c
+++ b/vl.c
@@ -126,6 +126,7 @@ int main(int argc, char **argv)
 #include "hw/xen.h"
 #include "hw/qdev.h"
 #include "hw/loader.h"
+#include "hw/dimm.h"
 #include "bt-host.h"
 #include "net.h"
 #include "net/slirp.h"
@@ -442,6 +443,14 @@ StatusInfo *qmp_query_status(Error **errp)
 return info;
 }
 
+int64_t qmp_query_memory_total(Error **errp)
+{
+uint64_t info;
+

[Qemu-devel] [RFC PATCH v4 12/30] acpi_ich9 : Implement memory device hotplug registers

2012-12-18 Thread Vasilis Liaskovitis

This implements acpi dimm hot-add capability for q35 (ich9). The logic is the
same as for the pc machine (piix4).
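
The hot-add path boils down to two steps: set the slot's bit in a status bitmap
that the guest reads back, and set a GPE status bit so that the SCI is raised
when that event is enabled. A simplified model of this is sketched below. The
names MemHpRegs, memhp_enable_slot and memhp_sci_level are hypothetical, the
32-byte bitmap size is an assumption, and the value 8 mirrors
ICH9_MEM_HOTPLUG_STATUS from this patch.

```c
#include <assert.h>
#include <stdint.h>

#define MEMHP_BITMAP_BYTES 32   /* assumed bitmap size: 256 slots */
#define MEMHP_GPE_STATUS   8    /* same value as ICH9_MEM_HOTPLUG_STATUS */

/* Hypothetical, reduced register model. */
typedef struct MemHpRegs {
    uint8_t gpe_sts;                        /* GPE0 status byte */
    uint8_t gpe_en;                         /* GPE0 enable byte */
    uint8_t mems_sts[MEMHP_BITMAP_BYTES];   /* one bit per memory slot */
} MemHpRegs;

/* Mark a slot as present and latch the hotplug GPE status bit. */
static void memhp_enable_slot(MemHpRegs *r, unsigned slot)
{
    assert(slot < MEMHP_BITMAP_BYTES * 8);
    r->gpe_sts |= MEMHP_GPE_STATUS;
    r->mems_sts[slot / 8] |= (uint8_t)(1u << (slot % 8));
}

/* The SCI line is asserted only if the event is both pending and enabled. */
static int memhp_sci_level(const MemHpRegs *r)
{
    return (r->gpe_sts & r->gpe_en & MEMHP_GPE_STATUS) != 0;
}
```

This model leaves out the PM1 event sources that also feed the SCI level in
pm_update_sci().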

TODO: Fix acpi irq delivery bug. Currently there is a flood of irqs when
delivering an acpi interrupt (should be just one). Guest complains as follows:
"irq 9: nobody cared
[...]
Disabling IRQ #9"
where #9 is the acpi irq

Signed-off-by: Vasilis Liaskovitis 
---
 hw/acpi_ich9.c |   61 +--
 hw/acpi_ich9.h |7 +-
 hw/lpc_ich9.c  |2 +-
 3 files changed, 65 insertions(+), 5 deletions(-)

diff --git a/hw/acpi_ich9.c b/hw/acpi_ich9.c
index c5978d3..abafbb5 100644
--- a/hw/acpi_ich9.c
+++ b/hw/acpi_ich9.c
@@ -48,11 +48,14 @@ static void pm_update_sci(ICH9LPCPMRegs *pm)
 
 pm1a_sts = acpi_pm1_evt_get_sts(&pm->acpi_regs);
 
-sci_level = (((pm1a_sts & pm->acpi_regs.pm1.evt.en) &
+sci_level = ((((pm1a_sts & pm->acpi_regs.pm1.evt.en) &
   (ACPI_BITMASK_RT_CLOCK_ENABLE |
ACPI_BITMASK_POWER_BUTTON_ENABLE |
ACPI_BITMASK_GLOBAL_LOCK_ENABLE |
-   ACPI_BITMASK_TIMER_ENABLE)) != 0);
+   ACPI_BITMASK_TIMER_ENABLE)) != 0) ||
+(((pm->acpi_regs.gpe.sts[0] & pm->acpi_regs.gpe.en[0]) &
+  (ICH9_MEM_HOTPLUG_STATUS)) != 0));
+
 qemu_set_irq(pm->irq, sci_level);
 
 /* schedule a timer interruption if needed */
@@ -90,6 +93,29 @@ static const MemoryRegionOps ich9_gpe_ops = {
 .endianness = DEVICE_LITTLE_ENDIAN,
 };
 
+static uint32_t memhp_readb(void *opaque, uint32_t addr)
+{
+ICH9LPCPMRegs *s = opaque;
+uint32_t val = 0;
+struct gpe_regs *g = &s->gperegs;
+if (addr < DIMM_BITMAP_BYTES) {
+val = (uint32_t) g->mems_sts[addr];
+}
+ICH9_DEBUG("memhp read %x == %x\n", addr, val);
+return val;
+}
+
+static const MemoryRegionOps ich9_memhp_ops = {
+.old_portio = (MemoryRegionPortio[]) {
+{
+.offset = 0,   .len = DIMM_BITMAP_BYTES, .size = 1,
+.read = memhp_readb,
+},
+PORTIO_END_OF_LIST()
+},
+.endianness = DEVICE_LITTLE_ENDIAN,
+};
+
 static uint64_t ich9_smi_readl(void *opaque, hwaddr addr, unsigned width)
 {
 ICH9LPCPMRegs *pm = opaque;
@@ -201,8 +227,31 @@ static void pm_powerdown_req(Notifier *n, void *opaque)
 acpi_pm1_evt_power_down(&pm->acpi_regs);
 }
 
-void ich9_pm_init(ICH9LPCPMRegs *pm, qemu_irq sci_irq, qemu_irq cmos_s3)
+static void enable_mem_device(ICH9LPCState *s, int memdevice)
 {
+struct gpe_regs *g = &s->pm.gperegs;
+s->pm.acpi_regs.gpe.sts[0] |= ICH9_MEM_HOTPLUG_STATUS;
+g->mems_sts[memdevice/8] |= (1 << (memdevice%8));
+}
+
+static int ich9_dimm_hotplug(DeviceState *qdev, DimmDevice *dev, int
+add)
+{
+PCIDevice *pci_dev = DO_UPCAST(PCIDevice, qdev, qdev);
+ICH9LPCState *s = DO_UPCAST(ICH9LPCState, d, pci_dev);
+DimmDevice *slot = DIMM(dev);
+
+if (add) {
+enable_mem_device(s, slot->idx);
+}
+pm_update_sci(&s->pm);
+return 0;
+}
+
+void ich9_pm_init(void *device, qemu_irq sci_irq, qemu_irq cmos_s3)
+{
+ICH9LPCState *lpc = (ICH9LPCState *)device;
+ICH9LPCPMRegs *pm = &lpc->pm;
 memory_region_init(&pm->io, "ich9-pm", ICH9_PMIO_SIZE);
 memory_region_set_enabled(&pm->io, false);
 memory_region_add_subregion(get_system_io(), 0, &pm->io);
@@ -220,6 +269,12 @@ void ich9_pm_init(ICH9LPCPMRegs *pm, qemu_irq sci_irq, qemu_irq cmos_s3)
   8);
 memory_region_add_subregion(&pm->io, ICH9_PMIO_SMI_EN, &pm->io_smi);
 
+memory_region_init_io(&pm->io_memhp, &ich9_memhp_ops, pm, "apci-memhp0",
+  DIMM_BITMAP_BYTES);
+memory_region_add_subregion(get_system_io(), ICH9_MEM_BASE, &pm->io_memhp);
+
+dimm_bus_hotplug(ich9_dimm_hotplug, &lpc->d.qdev);
+
 pm->irq = sci_irq;
 qemu_register_reset(pm_reset, pm);
 pm->powerdown_notifier.notify = pm_powerdown_req;
diff --git a/hw/acpi_ich9.h b/hw/acpi_ich9.h
index bc221d3..4419247 100644
--- a/hw/acpi_ich9.h
+++ b/hw/acpi_ich9.h
@@ -23,6 +23,9 @@
 
 #include "acpi.h"
 
+#define ICH9_MEM_BASE0xaf80
+#define ICH9_MEM_HOTPLUG_STATUS 8
+
 typedef struct ICH9LPCPMRegs {
 /*
  * In ich9 spec says that pm1_cnt register is 32bit width and
@@ -33,16 +36,18 @@ typedef struct ICH9LPCPMRegs {
 MemoryRegion io;
 MemoryRegion io_gpe;
 MemoryRegion io_smi;
+MemoryRegion io_memhp;
 uint32_t smi_en;
 uint32_t smi_sts;
 
 qemu_irq irq;  /* SCI */
 
+struct gpe_regs gperegs;
 uint32_t pm_io_base;
 Notifier powerdown_notifier;
 } ICH9LPCPMRegs;
 
-void ich9_pm_init(ICH9LPCPMRegs *pm,
+void ich9_pm_init(void *lpc,
   qemu_irq sci_irq, qemu_irq cmos_s3_resume);
 void ich9_pm_iospace_update(ICH9LPCPMRegs *pm, uint32_t pm_io_base);
 extern const VMStateDescription vmstate_ich9_pm;
diff --git a/hw/lpc_ich9.c b/hw/lpc_ich9.c
index 878a43e..0ef7af6 100644
--- a/hw/lpc_ich9.c
+++ b/hw/lpc_ich9.c
@@ -352,

Re: [Qemu-devel] KVM call agenda for 2012-12-18

2012-12-18 Thread Juan Quintela

Juan Quintela  wrote:
> Hi
>
> Please send in any agenda topics that you have.

As there are no topics, call is cancelled.

Later, Juan.

[Qemu-devel] [RFC PATCH v4 13/30] piix_pci and pc_piix: refactor

2012-12-18 Thread Vasilis Liaskovitis

Refactor code so that chipset initialization is similar to q35. This will
allow memory map initialization at chipset qdev init time for both
machines, as well as more similar code structure overall.

Signed-off-by: Vasilis Liaskovitis 
---
 hw/pc_piix.c  |   57 ---
 hw/piix_pci.c |  225 ++---
 2 files changed, 100 insertions(+), 182 deletions(-)

diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 19e342a..6a9b508 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -47,6 +47,7 @@
 #ifdef CONFIG_XEN
 #  include 
 #endif
+#include "piix_pci.h"
 
 #define MAX_IDE_BUS 2
 
@@ -85,6 +86,8 @@ static void pc_init1(MemoryRegion *system_memory,
 MemoryRegion *pci_memory;
 MemoryRegion *rom_memory;
 void *fw_cfg = NULL;
+I440FXState *i440fx_host;
+PIIX3State *piix3;
 
 pc_cpus_init(cpu_model);
 
@@ -127,21 +130,53 @@ static void pc_init1(MemoryRegion *system_memory,
 }
 
 if (pci_enabled) {
-pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, &isa_bus, gsi,
-  system_memory, system_io, ram_size,
-  below_4g_mem_size,
-  0x1ULL - below_4g_mem_size,
-  0x1ULL + above_4g_mem_size,
-  (sizeof(hwaddr) == 4
-   ? 0
-   : ((uint64_t)1 << 62)),
-  pci_memory, ram_memory);
+i440fx_host = I440FX_HOST_DEVICE(qdev_create(NULL,
+TYPE_I440FX_HOST_DEVICE));
+i440fx_host->mch.ram_memory = ram_memory;
+i440fx_host->mch.pci_address_space = pci_memory;
+i440fx_host->mch.system_memory = get_system_memory();
+i440fx_host->mch.address_space_io = get_system_io();
+i440fx_host->mch.below_4g_mem_size = below_4g_mem_size;
+i440fx_host->mch.above_4g_mem_size = above_4g_mem_size;
+
+qdev_init_nofail(DEVICE(i440fx_host));
+i440fx_state = &i440fx_host->mch;
+pci_bus = i440fx_host->parent_obj.bus;
+/* Xen supports additional interrupt routes from the PCI devices to
+ * the IOAPIC: the four pins of each PCI device on the bus are also
+ * connected to the IOAPIC directly.
+ * These additional routes can be discovered through ACPI. */
+if (xen_enabled()) {
+piix3 = DO_UPCAST(PIIX3State, dev,
+pci_create_simple_multifunction(pci_bus, -1, true,
+"PIIX3-xen"));
+pci_bus_irqs(pci_bus, xen_piix3_set_irq, xen_pci_slot_get_pirq,
+piix3, XEN_PIIX_NUM_PIRQS);
+} else {
+piix3 = DO_UPCAST(PIIX3State, dev,
+pci_create_simple_multifunction(pci_bus, -1, true,
+"PIIX3"));
+pci_bus_irqs(pci_bus, piix3_set_irq, pci_slot_get_pirq, piix3,
+PIIX_NUM_PIRQS);
+pci_bus_set_route_irq_fn(pci_bus, piix3_route_intx_pin_to_irq);
+}
+piix3->pic = gsi;
+isa_bus = DO_UPCAST(ISABus, qbus,
+qdev_get_child_bus(&piix3->dev.qdev, "isa.0"));
+
+piix3_devfn = piix3->dev.devfn;
+
+ram_size = ram_size / 8 / 1024 / 1024;
+if (ram_size > 255) {
+ram_size = 255;
+}
+i440fx_state->dev.config[0x57] = ram_size;
 } else {
 pci_bus = NULL;
-i440fx_state = NULL;
 isa_bus = isa_bus_new(NULL, system_io);
 no_hpet = 1;
 }
+
 isa_bus_irqs(isa_bus, gsi);
 
 if (kvm_irqchip_in_kernel()) {
@@ -157,7 +192,7 @@ static void pc_init1(MemoryRegion *system_memory,
 gsi_state->i8259_irq[i] = i8259[i];
 }
 if (pci_enabled) {
-ioapic_init_gsi(gsi_state, "i440fx");
+ioapic_init_gsi(gsi_state, NULL);
 }
 
 pc_register_ferr_irq(gsi[13]);
diff --git a/hw/piix_pci.c b/hw/piix_pci.c
index ba1b3de..7ca3c73 100644
--- a/hw/piix_pci.c
+++ b/hw/piix_pci.c
@@ -31,70 +31,15 @@
 #include "range.h"
 #include "xen.h"
 #include "pam.h"
+#include "piix_pci.h"
 
-/*
- * I440FX chipset data sheet.
- * http://download.intel.com/design/chipsets/datashts/29054901.pdf
- */
-
-typedef struct I440FXState {
-PCIHostState parent_obj;
-} I440FXState;
-
-#define PIIX_NUM_PIC_IRQS   16  /* i8259 * 2 */
-#define PIIX_NUM_PIRQS  4ULL/* PIRQ[A-D] */
-#define XEN_PIIX_NUM_PIRQS  128ULL
-#define PIIX_PIRQC  0x60
-
-typedef struct PIIX3State {
-PCIDevice dev;
-
-/*
- * bitmap to track pic levels.
- * The pic level is the logical OR of all the PCI irqs mapped to it
- * So one PIC level is tracked by PIIX_NUM_PIRQS bits.
- *
- * PIRQ is mapped to PIC pins, we track it by
- * PIIX_NUM_PIRQS * PIIX_NUM_PIC_IRQS = 64 bits with
- * pic_irq * PIIX_NUM_PIRQS + pirq
- */
-#if PIIX_NUM_PIC_IRQS * PIIX_NUM_PIRQS > 64
-#error "unable to encode pic

[Qemu-devel] [RFC PATCH v4 16/30] pc: Add dimm paravirt SRAT info

2012-12-18 Thread Vasilis Liaskovitis

The numa_fw_cfg paravirt interface is extended to include SRAT information for
all hotplug-able dimms. There are 3 words for each hotplug-able memory slot,
denoting start address, size and node proximity. The new info is appended after
existing numa info, so that the fw_cfg layout does not break.  This information
is used by SeaBIOS to build hotplug memory device objects at runtime.
nb_numa_nodes is set to 1 by default (not 0), so that we always pass srat info
to SeaBIOS.
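
To make the layout concrete, the word indices can be computed as below. This is
a sketch derived from the description above, with each entry assumed to be one
64-bit word. The helper names are hypothetical and do not exist in QEMU or
SeaBIOS.

```c
#include <assert.h>
#include <stddef.h>

/* Total number of 64-bit words in the NUMA fw_cfg blob. */
static size_t numa_fw_cfg_words(size_t max_cpus, size_t nb_numa_nodes,
                                size_t nb_hp_dimms)
{
    /* node count + per-CPU words + per-node words + dimm count + triplets */
    return 2 + max_cpus + nb_numa_nodes + 3 * nb_hp_dimms;
}

/* Index of the word that holds the number of hotplug-able dimms. */
static size_t numa_fw_cfg_dimm_count_index(size_t max_cpus,
                                           size_t nb_numa_nodes)
{
    return 1 + max_cpus + nb_numa_nodes;
}

/*
 * Index of the first word (start address) of a dimm's triplet; the size
 * and node proximity follow at +1 and +2.
 */
static size_t numa_fw_cfg_dimm_index(size_t max_cpus, size_t nb_numa_nodes,
                                     size_t nb_hp_dimms, size_t dimm)
{
    assert(dimm < nb_hp_dimms);
    return 2 + max_cpus + nb_numa_nodes + 3 * dimm;
}
```

With 4 CPUs, 2 nodes and 3 dimms, the dimm count sits at word 7 and the last
triplet occupies words 14 to 16 of a 17-word blob.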

v3->v4: numa_fw_cfg needs to be initialized after memory controller sets up dimm
ranges.  Make changes for pc_piix and pc_q35 to set numa_fw_cfg after i440fx
initialization.

v2->v3: setting nb_numa_nodes to 1 is not needed

v1->v2:
Dimm SRAT info (#dimms) is appended at end of existing numa fw_cfg in order not
to break existing layout
Documentation of the new fwcfg layout is included in docs/specs/fwcfg.txt

Signed-off-by: Vasilis Liaskovitis 
---
 docs/specs/fwcfg.txt |   28 
 hw/pc.c  |   28 +++-
 hw/pc.h  |1 +
 hw/pc_piix.c |1 +
 hw/pc_q35.c  |8 +---
 sysemu.h |1 +
 6 files changed, 59 insertions(+), 8 deletions(-)
 create mode 100644 docs/specs/fwcfg.txt

diff --git a/docs/specs/fwcfg.txt b/docs/specs/fwcfg.txt
new file mode 100644
index 0000000..e6fcd8f
--- /dev/null
+++ b/docs/specs/fwcfg.txt
@@ -0,0 +1,28 @@
+QEMU<->BIOS Paravirt Documentation
+--
+
+This document describes paravirt data structures passed from QEMU to BIOS.
+
+fw_cfg SRAT paravirt info
+
+The SRAT info passed from QEMU to BIOS has the following layout:
+
+---
+#nodes | cpu0_pxm | cpu1_pxm | ... | cpulast_pxm | node0_mem | node1_mem | ... | nodelast_mem
+
+---
+#dimms | dimm0_start | dimm0_sz | dimm0_pxm | ... | dimmlast_start | dimmlast_sz | dimmlast_pxm
+
+Entry 0 contains the number of numa nodes (nb_numa_nodes).
+
+Entries 1..max_cpus: The next max_cpus entries describe node proximity for each
+one of the vCPUs in the system.
+
+Entries max_cpus+1..max_cpus+nb_numa_nodes: The next nb_numa_nodes entries
+describe the memory size for each one of the NUMA nodes in the system.
+
+Entry max_cpus+nb_numa_nodes+1 contains the number of memory dimms (nb_hp_dimms).
+
+The last 3 * nb_hp_dimms entries are organized in triplets: Each triplet contains
+the physical address offset, size (in bytes), and node proximity for the
+respective dimm.
diff --git a/hw/pc.c b/hw/pc.c
index b11e7c4..025c356 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -51,6 +51,7 @@
 #include "exec-memory.h"
 #include "arch_init.h"
 #include "bitmap.h"
+#include "hw/dimm.h"
 
 /* debug PC/ISA interrupts */
 //#define DEBUG_IRQ
@@ -582,8 +583,6 @@ static void *bochs_bios_init(void)
 void *fw_cfg;
 uint8_t *smbios_table;
 size_t smbios_len;
-uint64_t *numa_fw_cfg;
-int i, j;
 PortioList *bochs_bios_port_list = g_new(PortioList, 1);
 
 portio_list_init(bochs_bios_port_list, bochs_bios_portio_list,
@@ -607,11 +606,24 @@ static void *bochs_bios_init(void)
 
 fw_cfg_add_bytes(fw_cfg, FW_CFG_HPET, (uint8_t *)&hpet_cfg,
  sizeof(struct hpet_fw_config));
+
+return fw_cfg;
+}
+
+void bochs_meminfo_bios_init(void *fw_cfg)
+{
+uint64_t *numa_fw_cfg;
+uint64_t *hp_dimms_fw_cfg;
+int i, j;
+
 /* allocate memory for the NUMA channel: one (64bit) word for the number
  * of nodes, one word for each VCPU->node and one word for each node to
  * hold the amount of memory.
+ * Finally one word for the number of hotplug memory slots and three words
+ * for each hotplug memory slot (start address, size and node proximity).
  */
-numa_fw_cfg = g_malloc0((1 + max_cpus + nb_numa_nodes) * 8);
+numa_fw_cfg = g_malloc0((2 + max_cpus + nb_numa_nodes + 3 * nb_hp_dimms)
+* 8);
 numa_fw_cfg[0] = cpu_to_le64(nb_numa_nodes);
 for (i = 0; i < max_cpus; i++) {
 for (j = 0; j < nb_numa_nodes; j++) {
@@ -624,10 +636,16 @@ static void *bochs_bios_init(void)
 for (i = 0; i < nb_numa_nodes; i++) {
 numa_fw_cfg[max_cpus + 1 + i] = cpu_to_le64(node_mem[i]);
 }
+
+numa_fw_cfg[1 + max_cpus + nb_numa_nodes] = cpu_to_le64(nb_hp_dimms);
+
+hp_dimms_fw_cfg = numa_fw_cfg + 2 + max_cpus + nb_numa_nodes;
+if (nb_hp_dimms) {
+dimm_setup_fwcfg_layout(hp_dimms_fw_cfg);
+}
 fw_cfg_add_bytes(fw_cfg, FW_CFG_NUMA, (uint8_t *)numa_fw_cfg,
- (1 + max_cpus + nb_numa_nodes) * 8);
+ (2 + max_cpus + nb_numa_nodes + 3 * nb_hp_dimms) * 8);
 
-return fw_cfg;
 }
 
 static long get_file_size(FILE *f)
diff --git a/hw/pc.h b/hw/pc.h
index 2237e86..075514f 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -185,5 +185,6

Re: [Qemu-devel] [PATCH v3 1/2] cutils:change strtosz_suffix_unit function

2012-12-18 Thread Stefan Hajnoczi

On Mon, Dec 17, 2012 at 09:49:22AM +0800, liguang wrote:
> if value to be translated is larger than INT64_MAX,
> this function will not be convenient for caller to
> be aware of it, so change a little for this.
> 
> Signed-off-by: liguang 
> ---
>  cutils.c |6 --
>  1 files changed, 4 insertions(+), 2 deletions(-)

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Please follow the ": " convention for commit
messages with a space after the ':'.

There was a conflict with a9300911 that was easy to resolve.

Stefan

[Qemu-devel] [RFC PATCH v4 26/30] Implement qmp and hmp commands for notification lists

2012-12-18 Thread Vasilis Liaskovitis

Guest can respond to ACPI hotplug events e.g. with _EJ or _OST method.
This patch implements a tail queue to store guest notifications for memory
hot-add and hot-remove requests.

Guest responses for memory hotplug command on a per-dimm basis can be detected
with the new hmp command "info memory-hotplug" or the new qmp command
"query-memory-hotplug"

Examples:

(qemu) device_add dimm,id=ram0
(qemu) info memory-hotplug
dimm: ram0 hot-add success
or
dimm: ram0 hot-add failure

(qemu) device_del ram3
(qemu) info memory-hotplug
dimm: ram3 hot-remove success
or
dimm: ram3 hot-remove failure

Results are removed from the queue once read.
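
The read-once behaviour can be illustrated with a small FIFO in plain C. This is
a sketch only: the patch itself uses QEMU's QTAILQ macros, and HpResult, HpQueue
and the hp_queue_* functions below are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One queued guest response for a dimm hotplug request. */
typedef struct HpResult {
    char dimm[32];          /* dimm id, NUL-terminated */
    int ret;                /* result code reported by the guest */
    struct HpResult *next;
} HpResult;

/* FIFO with a pointer to the last 'next' field for O(1) append. */
typedef struct HpQueue {
    HpResult *head;
    HpResult **tail;
} HpQueue;

static void hp_queue_init(HpQueue *q)
{
    q->head = NULL;
    q->tail = &q->head;
}

/* Append a result; returns 0 on success, -1 if allocation fails. */
static int hp_queue_push(HpQueue *q, const char *dimm, int ret)
{
    HpResult *r = calloc(1, sizeof(*r));

    if (!r) {
        return -1;
    }
    /* calloc zeroed the buffer, so the copy stays NUL-terminated. */
    strncpy(r->dimm, dimm, sizeof(r->dimm) - 1);
    r->ret = ret;
    *q->tail = r;
    q->tail = &r->next;
    return 0;
}

/* Copy the oldest result to *out and drop it; returns 0 if empty. */
static int hp_queue_pop(HpQueue *q, HpResult *out)
{
    HpResult *r = q->head;

    assert(out != NULL);
    if (!r) {
        return 0;
    }
    q->head = r->next;
    if (!q->head) {
        q->tail = &q->head;
    }
    *out = *r;
    out->next = NULL;
    free(r);
    return 1;
}
```

A monitor command built on this would pop until the queue is empty, so a second
query returns nothing until the guest responds again.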

This patch only queues _EJ events that signal hot-remove success.
For _OST event queuing, which covers the hot-remove failure and
hot-add success/failure cases, the _OST patches in this series are also
needed.

These notification items should probably be part of migration state (not yet
implemented).

Signed-off-by: Vasilis Liaskovitis 
---
 hmp-commands.hx  |2 +
 hmp.c|   17 +++
 hmp.h|1 +
 hw/dimm.c|   61 ++
 hw/dimm.h|1 +
 monitor.c|7 ++
 qapi-schema.json |   26 +++
 qmp-commands.hx  |   37 
 8 files changed, 152 insertions(+), 0 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 65d799e..b94b7a2 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1574,6 +1574,8 @@ show roms
 show memory-total
 @item info dimm
 show dimm
+@item info memory-hotplug
+show memory-hotplug
 @end table
 ETEXI
 
diff --git a/hmp.c b/hmp.c
index f8456fd..727ed80 100644
--- a/hmp.c
+++ b/hmp.c
@@ -652,6 +652,23 @@ void hmp_info_dimm(Monitor *mon)
 qapi_free_DimmInfoList(info);
 }
 
+void hmp_info_memory_hotplug(Monitor *mon)
+{
+MemHpInfoList *info;
+MemHpInfoList *item;
+MemHpInfo *dimm;
+
+info = qmp_query_memory_hotplug(NULL);
+for (item = info; item; item = item->next) {
+dimm = item->value;
+monitor_printf(mon, "dimm: %s %s %s\n", dimm->dimm,
+dimm->request, dimm->result);
+dimm->dimm = NULL;
+}
+
+qapi_free_MemHpInfoList(info);
+}
+
 void hmp_quit(Monitor *mon, const QDict *qdict)
 {
 monitor_suspend(mon);
diff --git a/hmp.h b/hmp.h
index 74ac061..92095df 100644
--- a/hmp.h
+++ b/hmp.h
@@ -38,6 +38,7 @@ void hmp_info_pci(Monitor *mon);
 void hmp_info_block_jobs(Monitor *mon);
 void hmp_info_memory_total(Monitor *mon);
 void hmp_info_dimm(Monitor *mon);
+void hmp_info_memory_hotplug(Monitor *mon);
 void hmp_quit(Monitor *mon, const QDict *qdict);
 void hmp_stop(Monitor *mon, const QDict *qdict);
 void hmp_system_reset(Monitor *mon, const QDict *qdict);
diff --git a/hw/dimm.c b/hw/dimm.c
index 0b4e22d..4670ae6 100644
--- a/hw/dimm.c
+++ b/hw/dimm.c
@@ -67,6 +67,7 @@ static void dimm_bus_initfn(Object *obj)
 DimmBus *bus = DIMM_BUS(obj);
 QTAILQ_INIT(&bus->dimmconfig_list);
 QTAILQ_INIT(&bus->dimmlist);
+QTAILQ_INIT(&bus->dimm_hp_result_queue);
 }
 
 static const TypeInfo dimm_bus_info = {
@@ -278,6 +279,58 @@ DimmInfoList *qmp_query_dimm_info(Error **errp)
 return head;
 }
 
+MemHpInfoList *qmp_query_memory_hotplug(Error **errp)
+{
+DimmBus *bus;
+MemHpInfoList *head = NULL, *cur_item = NULL, *info;
+struct dimm_hp_result *item, *nextitem;
+
+QLIST_FOREACH(bus, &memory_buses, next) {
+QTAILQ_FOREACH_SAFE(item, &bus->dimm_hp_result_queue, next, nextitem) {
+
+info = g_malloc0(sizeof(*info));
+info->value = g_malloc0(sizeof(*info->value));
+info->value->dimm = g_malloc0(sizeof(char) * 32);
+info->value->request = g_malloc0(sizeof(char) * 16);
+info->value->result = g_malloc0(sizeof(char) * 16);
+switch (item->ret) {
+case DIMM_REMOVE_SUCCESS:
+strcpy(info->value->request, "hot-remove");
+strcpy(info->value->result, "success");
+break;
+case DIMM_REMOVE_FAIL:
+strcpy(info->value->request, "hot-remove");
+strcpy(info->value->result, "failure");
+break;
+case DIMM_ADD_SUCCESS:
+strcpy(info->value->request, "hot-add");
+strcpy(info->value->result, "success");
+break;
+case DIMM_ADD_FAIL:
+strcpy(info->value->request, "hot-add");
+strcpy(info->value->result, "failure");
+break;
+default:
+break;
+}
+strcpy(info->value->dimm, item->dimmname);
+/* XXX: waiting for the qapi to support GSList */
+if (!cur_item) {
+head = cur_item = info;
+} else {
+cur_item->next = info;
+cur_item = info;
+}
+
+/* hotplug notification copi

Re: [Qemu-devel] [PATCH 0/2] i2c: Add AT24Cxx EEPROM model

2012-12-18 Thread Andreas Färber

Am 12.12.2012 10:44, schrieb Jan Kiszka:
> On 2012-11-19 15:24, Jan Kiszka wrote:
>> See patches for details.
>>
>> Jan Kiszka (2):
>>   i2c: Introduce device address mask
>>   Add AT24Cxx I2C EEPROM device model
>>
>>  hw/Makefile.objs |2 +-
>>  hw/at24.c|  363 ++
>>  hw/ds1338.c  |2 +-
>>  hw/i2c.c |9 +-
>>  hw/i2c.h |3 +-
>>  hw/lm832x.c  |2 +-
>>  hw/max7310.c |2 +-
>>  hw/pxa2xx.c  |3 +-
>>  hw/smbus.c   |2 +-
>>  hw/ssd0303.c |2 +-
>>  hw/tmp105.c  |2 +-
>>  hw/tosa.c|2 +-
>>  hw/twl92230.c|2 +-
>>  hw/wm8750.c  |2 +-
>>  hw/z2.c  |2 +-
>>  15 files changed, 383 insertions(+), 17 deletions(-)
>>  create mode 100644 hw/at24.c
>>
> 
> Ping. Both still apply over latest master, and addressable review
> comments are not pending according to my understanding.

Hello Jan, did you see my recent proposal of an I2C qtest framework? :)
For one I'd be interested in feedback from someone who knows I2C and for
another maybe you could follow up your patch with some test case (I2C
master shouldn't matter, I guess).

Cheers,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [PATCH 1/4] block: Add special error code for wrong format

2012-12-18 Thread Stefan Hajnoczi

On Sat, Dec 15, 2012 at 03:09:30PM +0100, Stefan Weil wrote:
> The block drivers normally return -errno for typical errors.
> There is no appropriate error code for "wrong format", so
> use a special error code which does not conflict with system
> error codes.

ENOTTY is used when something is of the wrong type.  Since the name
"ENOTTY" is not clear, defining a new error code makes sense though.

Stefan
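
A minimal sketch of the idea under discussion, assuming a probe routine
that returns 0 on success, -errno for ordinary errors, and a reserved
sentinel for "wrong format". The name ERR_WRONG_FORMAT, its value
INT_MIN, and the helpers probe_image and should_try_next_driver are
hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sentinel. errno values are positive ints, so on
 * two's-complement targets -errno can never equal INT_MIN.
 */
#define ERR_WRONG_FORMAT INT_MIN

/* Hypothetical probe: checks a 4-byte magic at the start of a header. */
static int probe_image(const unsigned char *hdr, size_t len,
                       const char *magic)
{
    assert(magic != NULL && strlen(magic) == 4);
    if (hdr == NULL) {
        return -EINVAL;            /* ordinary error: -errno */
    }
    if (len < 4) {
        return -EIO;               /* short read: ordinary error */
    }
    if (memcmp(hdr, magic, 4) != 0) {
        return ERR_WRONG_FORMAT;   /* readable, but not this format */
    }
    return 0;
}

/* A caller moves on to the next driver only on a format mismatch. */
static int should_try_next_driver(int ret)
{
    return ret == ERR_WRONG_FORMAT;
}
```

With a distinct code, a caller can tell "not my format" apart from a
real I/O failure without overloading an errno name such as ENOTTY.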

[Qemu-devel] [RFC PATCH v4 22/30] [SeaBIOS] acpi: add _EJ0 operation and eject port for memory devices

2012-12-18 Thread Vasilis Liaskovitis

This will allow hot-remove signalling from/to qemu and acpi-enabled guest.
---
 src/acpi-dsdt-mem-hotplug.dsl |   15 +++
 src/ssdt-mem.dsl  |3 +++
 2 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/src/acpi-dsdt-mem-hotplug.dsl b/src/acpi-dsdt-mem-hotplug.dsl
index 0e7ced3..fd73ea7 100644
--- a/src/acpi-dsdt-mem-hotplug.dsl
+++ b/src/acpi-dsdt-mem-hotplug.dsl
@@ -21,6 +21,13 @@ Scope(\_SB) {
 MES, 256
 }
  
+/* Memory eject byte */
+OperationRegion(MEMJ, SystemIO, 0xafa0, 1)
+Field (MEMJ, ByteAcc, NoLock, Preserve)
+{
+MPE, 8
+}
+
 Method(MESC, 0) {
 // Local5 = active memdevice bitmap
 Store (MES, Local5)
@@ -47,6 +54,8 @@ Scope(\_SB) {
 // Do MEM notify
 If (LEqual(Local3, 1)) {
 MTFY(Local0, 1)
+} Else {
+MTFY(Local0, 3)
 }
 }
 Increment(Local0)
@@ -54,4 +63,10 @@ Scope(\_SB) {
 Return(One)
 }
 
+Method (MPEJ, 2, NotSerialized) {
+// _EJ0 method - eject callback
+Store(Arg0, MPE)
+Sleep(200)
+}
+
 }
diff --git a/src/ssdt-mem.dsl b/src/ssdt-mem.dsl
index dbac33f..eef84b6 100644
--- a/src/ssdt-mem.dsl
+++ b/src/ssdt-mem.dsl
@@ -57,6 +57,9 @@ DefinitionBlock ("ssdt-mem.aml", "SSDT", 0x02, "BXPC", "CSSDT", 0x1)
 Method (_STA, 0) {
 Return(CMST(ID))
 }
+Method (_EJ0, 1, NotSerialized) {
+MPEJ(ID, Arg0)
+}
 }
 }
 
-- 
1.7.9

[Qemu-devel] [RFC PATCH v4 24/30] acpi_piix4: add hot-remove capability

2012-12-18 Thread Vasilis Liaskovitis

---
 docs/specs/acpi_hotplug.txt |8 
 hw/acpi_piix4.c |   29 -
 2 files changed, 36 insertions(+), 1 deletions(-)

diff --git a/docs/specs/acpi_hotplug.txt b/docs/specs/acpi_hotplug.txt
index 8391713..cf86242 100644
--- a/docs/specs/acpi_hotplug.txt
+++ b/docs/specs/acpi_hotplug.txt
@@ -12,3 +12,11 @@ Dimm hot-plug notification pending. One bit per slot.
 
 Read by ACPI BIOS GPE.3 handler to notify OS of memory hot-add or hot-remove
 events.  Read-only.
+
+Memory Dimm ejection success notification (IO port 0xafa0, 1-byte access):
+---
+Dimm hot-remove _EJ0 notification. Byte value indicates Dimm slot that was
+ejected.
+
+Written by ACPI memory device _EJ0 method to notify qemu of successful
+hot-removal.  Write-only.
diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
index 879d8a0..6e4718e 100644
--- a/hw/acpi_piix4.c
+++ b/hw/acpi_piix4.c
@@ -50,6 +50,7 @@
 #define PCI_EJ_BASE 0xae08
 #define PCI_RMV_BASE 0xae0c
 #define MEM_BASE 0xaf80
+#define MEM_EJ_BASE 0xafa0
 
 #define PIIX4_MEM_HOTPLUG_STATUS 8
 #define PIIX4_PCI_HOTPLUG_STATUS 2
@@ -544,12 +545,29 @@ static uint32_t memhp_readb(void *opaque, uint32_t addr)
 return val;
 }
 
+static void memhp_writeb(void *opaque, uint32_t addr, uint32_t val)
+{
+switch (addr) {
+case MEM_EJ_BASE - MEM_BASE:
+dimm_notify(val, DIMM_REMOVE_SUCCESS);
+break;
+default:
+PIIX4_DPRINTF("memhp write invalid %x <== %d\n", addr, val);
+}
+PIIX4_DPRINTF("memhp write %x <== %d\n", addr, val);
+}
+
 static const MemoryRegionOps piix4_memhp_ops = {
 .old_portio = (MemoryRegionPortio[]) {
 {
 .offset = 0,   .len = DIMM_BITMAP_BYTES, .size = 1,
 .read = memhp_readb,
 },
+{
+.offset = MEM_EJ_BASE - MEM_BASE, .len = 1,
+.size = 1,
+.write = memhp_writeb,
+},
 PORTIO_END_OF_LIST()
 },
 .endianness = DEVICE_LITTLE_ENDIAN,
@@ -635,7 +653,7 @@ static void piix4_acpi_system_hot_add_init(PCIBus *bus, PIIX4PMState *s)
 memory_region_add_subregion(get_system_io(), PCI_HOTPLUG_ADDR,
 &s->io_pci);
 memory_region_init_io(&s->io_memhp, &piix4_memhp_ops, s, "apci-memhp0",
-  DIMM_BITMAP_BYTES);
+  DIMM_BITMAP_BYTES + 1);
 memory_region_add_subregion(get_system_io(), MEM_BASE, &s->io_memhp);
 
 for (i = 0; i < DIMM_BITMAP_BYTES; i++) {
@@ -665,6 +683,13 @@ static void enable_mem_device(PIIX4PMState *s, int memdevice)
 g->mems_sts[memdevice/8] |= (1 << (memdevice%8));
 }
 
+static void disable_mem_device(PIIX4PMState *s, int memdevice)
+{
+struct gpe_regs *g = &s->gperegs;
+s->ar.gpe.sts[0] |= PIIX4_MEM_HOTPLUG_STATUS;
+g->mems_sts[memdevice/8] &= ~(1 << (memdevice%8));
+}
+
 static int piix4_dimm_hotplug(DeviceState *qdev, DimmDevice *dev, int
 add)
 {
@@ -674,6 +699,8 @@ static int piix4_dimm_hotplug(DeviceState *qdev, DimmDevice *dev, int
 
 if (add) {
 enable_mem_device(s, slot->idx);
+} else {
+disable_mem_device(s, slot->idx);
 }
 pm_update_sci(s);
 return 0;
-- 
1.7.9
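
A minimal sketch, not part of the patch series, of the bookkeeping the
hot-remove patches rely on: one status bit per DIMM slot (set on
hot-add, cleared on hot-remove) and a one-byte eject port whose written
value is the slot the guest ejected. MEM_BASE and MEM_EJ_BASE are the
values from the patch. DIMM_BITMAP_BYTES is assumed to be 32 here,
which matches the 256-bit MES field in the DSDT; struct memhp_model and
its functions are hypothetical stand-ins for the PIIX4PMState fields.
The GPE status bit and the SCI update are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_BASE          0xaf80
#define MEM_EJ_BASE       0xafa0
#define DIMM_BITMAP_BYTES 32      /* assumed: 256 slots, one bit each */

/* Hypothetical stand-in for the relevant device state. */
struct memhp_model {
    uint8_t mems_sts[DIMM_BITMAP_BYTES]; /* one bit per DIMM slot */
    int last_ejected;                    /* slot of last eject write, -1 if none */
};

static void memhp_init(struct memhp_model *m)
{
    memset(m->mems_sts, 0, sizeof(m->mems_sts));
    m->last_ejected = -1;
}

/* Hot-add: set the slot's bit. */
static void enable_slot(struct memhp_model *m, unsigned slot)
{
    assert(slot < DIMM_BITMAP_BYTES * 8);
    m->mems_sts[slot / 8] |= (uint8_t)(1u << (slot % 8));
}

/* Hot-remove request: clear the slot's bit. */
static void disable_slot(struct memhp_model *m, unsigned slot)
{
    assert(slot < DIMM_BITMAP_BYTES * 8);
    m->mems_sts[slot / 8] &= (uint8_t)~(1u << (slot % 8));
}

/* Byte read at an offset relative to MEM_BASE (the status bitmap). */
static uint8_t memhp_read(const struct memhp_model *m, uint32_t offset)
{
    return offset < DIMM_BITMAP_BYTES ? m->mems_sts[offset] : 0;
}

/*
 * Byte write at an offset relative to MEM_BASE. Only the eject offset
 * is accepted. Returns 0 if handled, -1 otherwise.
 */
static int memhp_write(struct memhp_model *m, uint32_t offset, uint8_t val)
{
    if (offset == MEM_EJ_BASE - MEM_BASE) {
        m->last_ejected = val;
        return 0;
    }
    return -1;
}
```

Under these assumptions the eject byte sits at offset 0x20, directly
after the bitmap, which is consistent with the patch growing the I/O
region from DIMM_BITMAP_BYTES to DIMM_BITMAP_BYTES + 1.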

Re: [Qemu-devel] [RFC PATCH v6 0/6] Virtio refactoring.

2012-12-18 Thread Paolo Bonzini

On 18/12/2012 15:00, Peter Maydell wrote:
> On 18 December 2012 13:10, Michael S. Tsirkin  wrote:
>> > And what makes virtio so special anyway? e1000 can be used without
>> > exposing users to internal buses and all kind of nastiness like this.
> Congratulations, you're using an architecture that has a pluggable
> discoverable bus implemented by just about all machines using that
> architecture. That makes things much easier for you.

Yes, that's true.  And you're basically using virtio as the pluggable
discoverable bus, which is actually a pretty good idea.

However, what you are doing is very similar to what virtio-s390 does,
and it manages to do it just fine with the existing virtio.c
infrastructure.  The only difference is that you have a 1:1 relationship
between virtio-mmio "slots" described by the board and virtio-mmio
devices added by the user.

True, it is not pure qdev, but it is much simpler and doesn't require
convincing grumpy maintainers. :)

Paolo

Re: [Qemu-devel] [RFC PATCH v6 0/6] Virtio refactoring.

2012-12-18 Thread Michael S. Tsirkin

On Tue, Dec 18, 2012 at 02:00:11PM +, Peter Maydell wrote:
> On 18 December 2012 13:10, Michael S. Tsirkin  wrote:
> > And what makes virtio so special anyway? e1000 can be used without
> > exposing users to internal buses and all kind of nastiness like this.
> 
> Congratulations, you're using an architecture that has a pluggable
> discoverable bus implemented by just about all machines using that
> architecture. That makes things much easier for you.
> 
> -- PMM

That's on the guest side. Yes, drivers are a problem but they are not
a qemu problem as such.
But on the user/qemu side I don't yet see why it's so different.

Re: [Qemu-devel] [RFC PATCH v6 0/6] Virtio refactoring.

2012-12-18 Thread Peter Maydell

On 18 December 2012 14:36, Paolo Bonzini  wrote:
> Yes, that's true.  And you're basically using virtio as the pluggable
> discoverable bus, which is actually a pretty good idea.
>
> However, what you are doing is very similar to what virtio-s390 does,
> and it manages to do it just fine with the existing virtio.c
> infrastructure.  The only difference is that you have a 1:1 relationship
> between virtio-mmio "slots" described by the board and virtio-mmio
> devices added by the user.

Also it looks like the board model and the 'bridge' and the transport
implementation are all collaborating to get the virtio memory sorted
out, rather than it just being "instantiate a bridge here"...

> True, it is not pure qdev, but it is much simpler and doesn't require
> convincing grumpy maintainers. :)

I'm not actually personally all that attached to this design -- it's just
trying to implement a suggestion by Anthony.

It does seem frankly bizarre that adding a new transport requires
knowing about all the backends (notice how s390-virtio-bus.c has
to register types for each backend). The kernel gets the transport
vs backend separation much cleaner and it was much easier to
add the virtio support there.

-- PMM

Re: [Qemu-devel] [PATCH v6 12/12] virtio-blk: add x-data-plane=on|off performance feature

2012-12-18 Thread Stefan Hajnoczi

On Sun, Dec 16, 2012 at 06:08:53PM +0200, Michael S. Tsirkin wrote:
> On Mon, Dec 10, 2012 at 02:09:45PM +0100, Stefan Hajnoczi wrote:
> > @@ -33,6 +34,7 @@ typedef struct VirtIOBlock
> >  VirtIOBlkConf *blk;
> >  unsigned short sector_mask;
> >  DeviceState *qdev;
> > +VirtIOBlockDataPlane *dataplane;
> >  } VirtIOBlock;
> >  
> >  static VirtIOBlock *to_virtio_blk(VirtIODevice *vdev)
> > @@ -407,6 +409,14 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
> >  .num_writes = 0,
> >  };
> >  
> > +/* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
> > + * dataplane here instead of waiting for .set_status().
> > + */
> 
> By the way which guests are these?

I ran a Windows 8 guest today with build 48 virtio-win drivers.  It
notifies before the device gets its .set_status() callback invoked.

But I could swear I've seen Linux guests do this too.

> > +if (s->dataplane) {
> > +virtio_blk_data_plane_start(s->dataplane);
> > +return;
> > +}
> > +
> 
> By the way it's chunks such as this that I meant: it's not
> compiled out even if dataplane is disabled by configure.
> Neither is the extra field in the struct.

Okay.

Stefan
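
A minimal sketch of the start-on-first-kick pattern in the quoted hunk,
assuming a start routine that is safe to call more than once. The
struct and function names below are hypothetical and simplified; only
the idea (start from the notify handler because some guests kick before
DRIVER_OK is set) comes from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified data-plane state. */
struct data_plane {
    bool started;
    unsigned start_count;    /* how many times the real start work ran */
};

/* Hypothetical device: dataplane is NULL when the feature is off. */
struct blk_dev {
    struct data_plane *dataplane;
    unsigned handled_inline; /* kicks processed on the normal path */
};

/* Idempotent start: calls after the first one do nothing. */
static void data_plane_start(struct data_plane *dp)
{
    assert(dp != NULL);
    if (dp->started) {
        return;
    }
    dp->started = true;
    dp->start_count++;
}

/* Guest notification ("kick") handler. */
static void handle_output(struct blk_dev *dev)
{
    if (dev->dataplane) {
        /* May run before the status callback, so start here as well. */
        data_plane_start(dev->dataplane);
        return;
    }
    dev->handled_inline++;
}

/* Status callback: starts the data plane once the driver reports OK. */
static void set_status(struct blk_dev *dev, bool driver_ok)
{
    if (dev->dataplane && driver_ok) {
        data_plane_start(dev->dataplane);
    }
}
```

Because the start routine is idempotent, the order in which the kick
and the status update arrive does not matter.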

Re: [Qemu-devel] [PATCH v2 3/5] s390: Add new channel I/O based virtio transport.

2012-12-18 Thread Cornelia Huck

On Tue, 18 Dec 2012 09:45:27 +0100
Paolo Bonzini  wrote:

> On 04/09/2012 17:13, Cornelia Huck wrote:
> > +VirtioCcwBus *virtio_ccw_bus_init(void)
> > +{
> > +VirtioCcwBus *cbus;
> > +BusState *bus;
> > +DeviceState *dev;
> > +
> > +/* Create bridge device */
> > +dev = qdev_create(NULL, "virtio-ccw-bridge");
> > +qdev_init_nofail(dev);
> > +
> > +/* Create bus on bridge device */
> > +bus = qbus_create(TYPE_VIRTIO_CCW_BUS, dev, "virtio-ccw");
> > +cbus = DO_UPCAST(VirtioCcwBus, bus, bus);
> > +
> > +/* Enable hotplugging */
> > +bus->allow_hotplug = 1;
> > +
> > +qemu_register_reset(virtio_ccw_reset_subchannels, cbus);
> 
> Please use qdev device-reset and bus-reset callbacks instead of this.

Will do for the next version.
> 
> In particular, when writing the status you should call
> qdev_reset_all(DEVICE(sch)), and whatever state should be reset will
> have to be cleared by the device-reset callback of SubchDev, including
> calling virtio_reset.

With "writing the status" you mean "the guest sets the status to 0",
right?

> 
> Everything else will be cleared instead by the bus-reset callback of
> virtio-ccw-bus, similar to what you are doing in
> virtio_ccw_reset_subchannels.

Looking at the reset handler, css_reset() is a bit oddly placed, as it
doesn't really have anything to do with virtio-ccw; virtio-ccw is just
the only current creator of channel subsystem images. I'll try to come
up with a better model.

> 
> Paolo
> 
> 
> > +return cbus;
> > +}
>

Re: [Qemu-devel] [PATCH v2 3/5] s390: Add new channel I/O based virtio transport.

2012-12-18 Thread Paolo Bonzini

On 18/12/2012 15:58, Cornelia Huck wrote:
> On Tue, 18 Dec 2012 09:45:27 +0100
> Paolo Bonzini  wrote:
> 
>> On 04/09/2012 17:13, Cornelia Huck wrote:
>>> +VirtioCcwBus *virtio_ccw_bus_init(void)
>>> +{
>>> +VirtioCcwBus *cbus;
>>> +BusState *bus;
>>> +DeviceState *dev;
>>> +
>>> +/* Create bridge device */
>>> +dev = qdev_create(NULL, "virtio-ccw-bridge");
>>> +qdev_init_nofail(dev);
>>> +
>>> +/* Create bus on bridge device */
>>> +bus = qbus_create(TYPE_VIRTIO_CCW_BUS, dev, "virtio-ccw");
>>> +cbus = DO_UPCAST(VirtioCcwBus, bus, bus);
>>> +
>>> +/* Enable hotplugging */
>>> +bus->allow_hotplug = 1;
>>> +
>>> +qemu_register_reset(virtio_ccw_reset_subchannels, cbus);
>>
>> Please use qdev device-reset and bus-reset callbacks instead of this.
> 
> Will do for the next version.
>>
>> In particular, when writing the status you should call
>> qdev_reset_all(DEVICE(sch)), and whatever state should be reset will
>> have to be cleared by the device-reset callback of SubchDev, including
>> calling virtio_reset.
> 
> With "writing the status" you mean "the guest sets the status to 0",
> right?

Yes.

Paolo

>> Everything else will be cleared instead by the bus-reset callback of
>> virtio-ccw-bus, similar to what you are doing in
>> virtio_ccw_reset_subchannels.
> 
> Looking at the reset handler, css_reset() is a bit oddly placed, as it
> doesn't really have anything to do with virtio-ccw; virtio-ccw is just
> the only current creator of channel subsystem images. I'll try to come
> up with a better model.
> 
>>
>> Paolo
>>
>>
>>> +return cbus;
>>> +}
>>
>
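
A minimal sketch of the split discussed in this thread, assuming two
levels of reset: a per-device reset triggered when the guest writes
status 0, and a bus-wide reset that resets every child and then clears
bus-level state. All types and functions below are hypothetical
stand-ins and do not use the qdev API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DEVS 4

/* Hypothetical per-subchannel device state. */
struct sub_dev {
    uint8_t status;
    uint32_t guest_features;
    unsigned reset_count;
};

/* Hypothetical bus holding its child devices plus bus-wide state. */
struct ccw_bus {
    struct sub_dev *devs[MAX_DEVS];
    size_t ndevs;
    unsigned bus_state;
};

/* Device-reset callback: clears only this device's state. */
static void sub_dev_reset(struct sub_dev *d)
{
    assert(d != NULL);
    d->status = 0;
    d->guest_features = 0;
    d->reset_count++;
}

/* Guest status write: a value of 0 requests a device reset. */
static void sub_dev_write_status(struct sub_dev *d, uint8_t status)
{
    if (status == 0) {
        sub_dev_reset(d);
        return;
    }
    d->status = status;
}

/* Bus-reset callback: reset each child, then clear bus-wide state. */
static void ccw_bus_reset(struct ccw_bus *b)
{
    size_t i;

    assert(b->ndevs <= MAX_DEVS);
    for (i = 0; i < b->ndevs; i++) {
        sub_dev_reset(b->devs[i]);
    }
    b->bus_state = 0;
}
```

A guest-initiated reset then touches one device only, while a system
reset goes through the bus and reaches all of them.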

Re: [Qemu-devel] compile tcm_vhost kernel module as built-in

2012-12-18 Thread Stefan Hajnoczi

On Tue, Dec 11, 2012 at 09:18:57PM +0800, ching wrote:
> is there any virtio-scsi developer here?
> 
> I take a look at the tcm_vhost module of kernel 3.7 and want to compile it as built-in.
> 
> However, CONFIG_TCM_VHOST only allows user to build as module.
> 
> I wonder why this restriction exists? thx a lot.

Hi Ching,
I'm probably the one who added that restriction.  I think the target
core loaded fabric modules explicitly when the tcm_vhost code was
written and it was not possible to build it into the kernel.

Perhaps that has changed.  The correct mailing list for drivers/target/
discussion is target-de...@vger.kernel.org.

Stefan
