date:20150430

Re: [Qemu-devel] [PATCH v14 09/10] sysbus: add irq_routing_notifier

2015-04-30 Thread Peter Crosthwaite

On Wed, Apr 29, 2015 at 7:52 AM, Eric Auger  wrote:
> Add a new connect_irq_notifier notifier in the SysBusDeviceClass. This
> notifier, if populated, is called after sysbus_connect_irq.
>
> This mechanism is used to setup VFIO signaling once VFIO platform
> devices get attached to their platform bus, on a machine init done
> notifier.
>
> Signed-off-by: Eric Auger 

Reviewed-by: Peter Crosthwaite 

>
> ---
> v2 -> v3:
> - rename irq_routing_notifier into connect_irq_notifier
>
> v1 -> v2:
> - duly put the notifier in the class and not in the device
> ---
>  hw/core/sysbus.c| 6 ++
>  include/hw/sysbus.h | 1 +
>  2 files changed, 7 insertions(+)
>
> diff --git a/hw/core/sysbus.c b/hw/core/sysbus.c
> index b53c351..2d22aec 100644
> --- a/hw/core/sysbus.c
> +++ b/hw/core/sysbus.c
> @@ -109,7 +109,13 @@ qemu_irq sysbus_get_connected_irq(SysBusDevice *dev, int n)
>
>  void sysbus_connect_irq(SysBusDevice *dev, int n, qemu_irq irq)
>  {
> +SysBusDeviceClass *sbd = SYS_BUS_DEVICE_GET_CLASS(dev);
> +
>  qdev_connect_gpio_out_named(DEVICE(dev), SYSBUS_DEVICE_GPIO_IRQ, n, irq);
> +
> +if (sbd->connect_irq_notifier) {
> +sbd->connect_irq_notifier(dev, irq);
> +}
>  }
>
>  /* Check whether an MMIO region exists */
> diff --git a/include/hw/sysbus.h b/include/hw/sysbus.h
> index d1f3f00..e80b26d 100644
> --- a/include/hw/sysbus.h
> +++ b/include/hw/sysbus.h
> @@ -41,6 +41,7 @@ typedef struct SysBusDeviceClass {
>  /*< public >*/
>
>  int (*init)(SysBusDevice *dev);
> +void (*connect_irq_notifier)(SysBusDevice *dev, qemu_irq irq);
>  } SysBusDeviceClass;
>
>  struct SysBusDevice {
> --
> 1.8.3.2
>
>
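
For illustration, a minimal standalone sketch of the class-level hook
pattern the patch adds: the connect helper wires up the IRQ first and
then calls the notifier only if the class populated it. ModelIrq,
ModelDevice, ModelDeviceClass and the model_* functions are hypothetical
stand-ins so the example compiles outside the QEMU tree; they are not
QEMU's types or API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for qemu_irq. */
typedef struct ModelIrq { int line; } ModelIrq;

typedef struct ModelDevice ModelDevice;

/* Class-level data: one instance shared by all devices of a type. */
typedef struct ModelDeviceClass {
    /* Optional hook, called after an IRQ has been connected. */
    void (*connect_irq_notifier)(ModelDevice *dev, ModelIrq *irq);
} ModelDeviceClass;

struct ModelDevice {
    const ModelDeviceClass *klass;
    ModelIrq *irqs[4];
    int notified;               /* counts notifier calls, for illustration */
};

/* Same shape as the patched sysbus_connect_irq(): connect, then notify. */
static void model_connect_irq(ModelDevice *dev, int n, ModelIrq *irq)
{
    assert(n >= 0 && n < 4);
    dev->irqs[n] = irq;
    if (dev->klass->connect_irq_notifier) {
        dev->klass->connect_irq_notifier(dev, irq);
    }
}

/* Example notifier; a VFIO-like device would set up signaling here. */
static void model_vfio_notifier(ModelDevice *dev, ModelIrq *irq)
{
    (void)irq;
    dev->notified++;
}

/* A class that leaves the hook unset, and one that populates it. */
static const ModelDeviceClass model_plain_class = { NULL };
static const ModelDeviceClass model_vfio_class = { model_vfio_notifier };
```

Because the hook lives in the class rather than in each device (the
v1 -> v2 change above), every device of the type shares it, and types
that leave it NULL pay only the cost of the pointer check.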

[Qemu-devel] [PULL 37/42] qmp-commands: Fix typo

2015-04-30 Thread Michael Tokarev

From: John Snow 

Just a trivial patch to correct a QMP example in qmp-commands.hx.

Signed-off-by: John Snow 
Reviewed-by: Eric Blake 
Signed-off-by: Michael Tokarev 
---
 qmp-commands.hx | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index 213508f..d4a837c 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -2380,7 +2380,7 @@ Example:
   "virtual-size":2048000,
   "backing_file":"base.qcow2",
   "full-backing-filename":"disks/base.qcow2",
-  "backing-filename-format:"qcow2",
+  "backing-filename-format":"qcow2",
   "snapshots":[
  {
 "id": "1",
@@ -3847,7 +3847,7 @@ Example:
   "virtual-size":2048000,
   "backing_file":"base.qcow2",
   "full-backing-filename":"disks/base.qcow2",
-  "backing-filename-format:"qcow2",
+  "backing-filename-format":"qcow2",
   "snapshots":[
  {
 "id": "1",
-- 
2.1.4

[Qemu-devel] [PULL 28/42] microblaze: cpu: Delete EXCP_NMI

2015-04-30 Thread Michael Tokarev

From: Peter Crosthwaite 

This define is unused. Remove.

Signed-off-by: Peter Crosthwaite 
Reviewed-by: Edgar E. Iglesias 
Signed-off-by: Michael Tokarev 
---
 target-microblaze/cpu.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/target-microblaze/cpu.h b/target-microblaze/cpu.h
index f21da2f..6522af7 100644
--- a/target-microblaze/cpu.h
+++ b/target-microblaze/cpu.h
@@ -36,7 +36,6 @@ typedef struct CPUMBState CPUMBState;
 
 #define ELF_MACHINEEM_MICROBLAZE
 
-#define EXCP_NMI1
 #define EXCP_MMU2
 #define EXCP_IRQ3
 #define EXCP_BREAK  4
-- 
2.1.4

[Qemu-devel] [PULL 06/42] ui/console : remove 'struct' from 'typedef struct' type

2015-04-30 Thread Michael Tokarev

From: Chih-Min Chao 

Signed-off-by: Chih-Min Chao 
Reviewed-by: Gerd Hoffmann 
Signed-off-by: Michael Tokarev 
---
 ui/console.c   | 4 ++--
 ui/spice-display.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/ui/console.c b/ui/console.c
index 2927513..f5295c4 100644
--- a/ui/console.c
+++ b/ui/console.c
@@ -269,7 +269,7 @@ void graphic_hw_invalidate(QemuConsole *con)
 }
 }
 
-static void ppm_save(const char *filename, struct DisplaySurface *ds,
+static void ppm_save(const char *filename, DisplaySurface *ds,
  Error **errp)
 {
 int width = pixman_image_get_width(ds->image);
@@ -1535,7 +1535,7 @@ void dpy_text_update(QemuConsole *con, int x, int y, int w, int h)
 void dpy_text_resize(QemuConsole *con, int w, int h)
 {
 DisplayState *s = con->ds;
-struct DisplayChangeListener *dcl;
+DisplayChangeListener *dcl;
 
 if (!qemu_console_is_visible(con)) {
 return;
diff --git a/ui/spice-display.c b/ui/spice-display.c
index c71a059..e293ec2 100644
--- a/ui/spice-display.c
+++ b/ui/spice-display.c
@@ -718,7 +718,7 @@ static void display_update(DisplayChangeListener *dcl,
 }
 
 static void display_switch(DisplayChangeListener *dcl,
-   struct DisplaySurface *surface)
+   DisplaySurface *surface)
 {
 SimpleSpiceDisplay *ssd = container_of(dcl, SimpleSpiceDisplay, dcl);
 qemu_spice_display_switch(ssd, surface);
-- 
2.1.4

[Qemu-devel] [PULL 08/42] misc: Fix new collection of typos

2015-04-30 Thread Michael Tokarev

From: Stefan Weil 

All of them were reported by codespell.
Most typos are in comments, one is in an error message.

Signed-off-by: Stefan Weil 
Reviewed-by: Peter Maydell 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Michael Tokarev 
---
 hw/block/virtio-blk.c | 2 +-
 hw/misc/edu.c | 2 +-
 hw/net/virtio-net.c   | 2 +-
 hw/ppc/spapr.c| 2 +-
 qga/qapi-schema.json  | 2 +-
 target-s390x/mmu_helper.c | 8 
 target-s390x/translate.c  | 2 +-
 tests/libqos/ahci.c   | 4 ++--
 8 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 9546fd2..e6afe97 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -515,7 +515,7 @@ void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb)
 type = virtio_ldl_p(VIRTIO_DEVICE(req->dev), &req->out.type);
 
 /* VIRTIO_BLK_T_OUT defines the command direction. VIRTIO_BLK_T_BARRIER
- * is an optional flag. Altough a guest should not send this flag if
+ * is an optional flag. Although a guest should not send this flag if
  * not negotiated we ignored it in the past. So keep ignoring it. */
 switch (type & ~(VIRTIO_BLK_T_OUT | VIRTIO_BLK_T_BARRIER)) {
 case VIRTIO_BLK_T_IN:
diff --git a/hw/misc/edu.c b/hw/misc/edu.c
index f601069..fe50b42 100644
--- a/hw/misc/edu.c
+++ b/hw/misc/edu.c
@@ -279,7 +279,7 @@ static const MemoryRegionOps edu_mmio_ops = {
 };
 
 /*
- * We purposedly use a thread, so that users are forced to wait for the status
+ * We purposely use a thread, so that users are forced to wait for the status
  * register.
  */
 static void *edu_fact_thread(void *opaque)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 59f76bc..67ab228 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1590,7 +1590,7 @@ static void virtio_net_device_realize(DeviceState *dev, Error **errp)
 n->max_queues = MAX(n->nic_conf.peers.queues, 1);
 if (n->max_queues * 2 + 1 > VIRTIO_PCI_QUEUE_MAX) {
 error_setg(errp, "Invalid number of queues (= %" PRIu32 "), "
-   "must be a postive integer less than %d.",
+   "must be a positive integer less than %d.",
n->max_queues, (VIRTIO_PCI_QUEUE_MAX - 1) / 2);
 virtio_cleanup(vdev);
 return;
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 61ddc79..644689a 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1029,7 +1029,7 @@ static int spapr_post_load(void *opaque, int version_id)
 sPAPREnvironment *spapr = (sPAPREnvironment *)opaque;
 int err = 0;
 
-/* In earlier versions, there was no seperate qdev for the PAPR
+/* In earlier versions, there was no separate qdev for the PAPR
  * RTC, so the RTC offset was stored directly in sPAPREnvironment.
  * So when migrating from those versions, poke the incoming offset
  * value into the RTC device */
diff --git a/qga/qapi-schema.json b/qga/qapi-schema.json
index 95f49e3..5c4cd40 100644
--- a/qga/qapi-schema.json
+++ b/qga/qapi-schema.json
@@ -808,7 +808,7 @@
 #
 # An enumeration of memory block operation result.
 #
-# @sucess: the operation of online/offline memory block is successful.
+# @success: the operation of online/offline memory block is successful.
 # @not-found: can't find the corresponding memoryXXX directory in sysfs.
 # @operation-not-supported: for some old kernels, it does not support
 #   online or offline memory block.
diff --git a/target-s390x/mmu_helper.c b/target-s390x/mmu_helper.c
index b061c85..7baf5e9 100644
--- a/target-s390x/mmu_helper.c
+++ b/target-s390x/mmu_helper.c
@@ -303,8 +303,8 @@ static int mmu_translate_asce(CPUS390XState *env, target_ulong vaddr,
  * @param ascaddress space control (one of the PSW_ASC_* modes)
  * @param raddr  the translated address is stored to this pointer
  * @param flags  the PAGE_READ/WRITE/EXEC flags are stored to this pointer
- * @param exctrue = inject a program check if a fault occured
- * @return   0 if the translation was successfull, -1 if a fault occured
+ * @param exctrue = inject a program check if a fault occurred
+ * @return   0 if the translation was successful, -1 if a fault occurred
  */
 int mmu_translate(CPUS390XState *env, target_ulong vaddr, int rw, uint64_t asc,
   target_ulong *raddr, int *flags, bool exc)
@@ -436,9 +436,9 @@ static int translate_pages(S390CPU *cpu, vaddr addr, int nr_pages,
  * s390_cpu_virt_mem_rw:
  * @laddr: the logical start address
  * @hostbuf:   buffer in host memory. NULL = do only checks w/o copying
- * @len:   length that should be transfered
+ * @len:   length that should be transferred
  * @is_write:  true = write, false = read
- * Returns:0 on success, non-zero if an exception occured
+ * Returns:0 on success, non-zero if an exception occurred
  *
  * Copy from/to guest memory using logical addresses. Note that we inject a
  * program i

[Qemu-devel] [PULL 23/42] libcacard: do not use full paths for include files in the same dir

2015-04-30 Thread Michael Tokarev

Signed-off-by: Michael Tokarev 
Reviewed-by: Paolo Bonzini 
---
 libcacard/vcard_emul_nss.c | 2 +-
 libcacard/vcardt.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/libcacard/vcard_emul_nss.c b/libcacard/vcard_emul_nss.c
index 6955f69..d9761ee 100644
--- a/libcacard/vcard_emul_nss.c
+++ b/libcacard/vcard_emul_nss.c
@@ -33,7 +33,7 @@
 #include "vreader.h"
 #include "vevent.h"
 
-#include "libcacard/vcardt_internal.h"
+#include "vcardt_internal.h"
 
 
 typedef enum {
diff --git a/libcacard/vcardt.c b/libcacard/vcardt.c
index 9ce4648..c67de2f 100644
--- a/libcacard/vcardt.c
+++ b/libcacard/vcardt.c
@@ -2,9 +2,9 @@
 #include 
 #include 
 
-#include "libcacard/vcardt.h"
+#include "vcardt.h"
 
-#include "libcacard/vcardt_internal.h"
+#include "vcardt_internal.h"
 
 /* create an ATR with appropriate historical bytes */
 #define ATR_TS_DIRECT_CONVENTION 0x3b
-- 
2.1.4

[Qemu-devel] [PULL 04/42] ui/vnc : fix coding style

2015-04-30 Thread Michael Tokarev

From: Chih-Min Chao 

reported by checkpatch.pl

Signed-off-by: Chih-Min Chao 
Reviewed-by: Gerd Hoffmann 
Signed-off-by: Michael Tokarev 
---
 ui/vnc-auth-vencrypt.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/ui/vnc-auth-vencrypt.c b/ui/vnc-auth-vencrypt.c
index a420ccb..65f1afa 100644
--- a/ui/vnc-auth-vencrypt.c
+++ b/ui/vnc-auth-vencrypt.c
@@ -65,7 +65,8 @@ static void start_auth_vencrypt_subauth(VncState *vs)
 
 static void vnc_tls_handshake_io(void *opaque);
 
-static int vnc_start_vencrypt_handshake(struct VncState *vs) {
+static int vnc_start_vencrypt_handshake(struct VncState *vs)
+{
 int ret;
 
 if ((ret = gnutls_handshake(vs->tls.session)) < 0) {
@@ -100,7 +101,8 @@ static int vnc_start_vencrypt_handshake(struct VncState *vs) {
 return 0;
 }
 
-static void vnc_tls_handshake_io(void *opaque) {
+static void vnc_tls_handshake_io(void *opaque)
+{
 struct VncState *vs = (struct VncState *)opaque;
 
 VNC_DEBUG("Handshake IO continue\n");
-- 
2.1.4

[Qemu-devel] [PATCH] openrisc: cpu: Remove unused cpu_get_pc

2015-04-30 Thread Peter Crosthwaite

This function is not used by anything. Remove.

Signed-off-by: Peter Crosthwaite 
---
 target-openrisc/cpu.h | 5 -
 1 file changed, 5 deletions(-)

diff --git a/target-openrisc/cpu.h b/target-openrisc/cpu.h
index b25324b..9e23cd0 100644
--- a/target-openrisc/cpu.h
+++ b/target-openrisc/cpu.h
@@ -415,9 +415,4 @@ static inline int cpu_mmu_index(CPUOpenRISCState *env)
 
 #include "exec/exec-all.h"
 
-static inline target_ulong cpu_get_pc(CPUOpenRISCState *env)
-{
-return env->pc;
-}
-
 #endif /* CPU_OPENRISC_H */
-- 
1.9.1

Re: [Qemu-devel] [PATCH V2] vhost: logs sharing

2015-04-30 Thread Jason Wang




On Tue, Apr 28, 2015 at 6:30 PM, Michael S. Tsirkin wrote:

On Tue, Apr 28, 2015 at 05:58:28PM +0800, Jason Wang wrote:
 
 
 On Tue, Apr 28, 2015 at 5:37 PM, Michael S. Tsirkin 
 wrote:

 >On Fri, Apr 10, 2015 at 05:33:35PM +0800, Jason Wang wrote:
 >> Currently we allocate one vhost log per vhost device. This is sub
 >> optimal when:
 >> - Guest has several device with vhost as backend
 >> - Guest has multiqueue devices
 >> In the above cases, we can avoid the memory allocation by sharing a
 >> single vhost log among all the vhost devices. This is done through:
 >> - Introducing a new vhost_log structure with refcnt inside.
 >> - Using a global pointer to vhost_log structure that will be used. And
 >>   introduce helper to get the log with expected log size and helper to
 >>   drop the refcnt to the old log.
 >> - Each vhost device still keep track of a pointer to the log that was
 >>   used.
 >> With above, if no resize happens, all vhost device will share a single
 >> vhost log. During resize, a new vhost_log structure will be allocated
 >> and made for the global pointer. And each vhost devices will drop the
 >> refcnt to the old log.
 >> Tested by doing scp during migration for a 2 queues virtio-net-pci.
 >> Cc: Michael S. Tsirkin 
 >> Signed-off-by: Jason Wang 
 >> ---
 >> Changes from V1:
 >> - Drop the list of vhost log, instead, using a global pointer instead
 >
 >I don't think it works like this. If you have a global pointer,
 >you also need a global listener, have that sync all devices.
 
 It doesn't conflict, see my comments below.

 >
 >
 >
 >> ---
 >>  hw/virtio/vhost.c | 66
 >>++-
 >>  include/hw/virtio/vhost.h |  9 ++-
 >>  2 files changed, 62 insertions(+), 13 deletions(-)
 >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
 >> index 5a12861..e16c2db 100644
 >> --- a/hw/virtio/vhost.c
 >> +++ b/hw/virtio/vhost.c
 >> @@ -22,15 +22,19 @@
 >>  #include "hw/virtio/virtio-bus.h"
 >>  #include "migration/migration.h"
 >> +static struct vhost_log *vhost_log;
 >> +
 >>  static void vhost_dev_sync_region(struct vhost_dev *dev,
 >>MemoryRegionSection *section,
 >>uint64_t mfirst, uint64_t mlast,
 >>uint64_t rfirst, uint64_t rlast)
 >>  {
 >> +vhost_log_chunk_t *log = dev->log->log;
 >> +
 >>  uint64_t start = MAX(mfirst, rfirst);
 >>  uint64_t end = MIN(mlast, rlast);
 >> -vhost_log_chunk_t *from = dev->log + start / VHOST_LOG_CHUNK;
 >> -vhost_log_chunk_t *to = dev->log + end / VHOST_LOG_CHUNK + 1;
 >> +vhost_log_chunk_t *from = log + start / VHOST_LOG_CHUNK;
 >> +vhost_log_chunk_t *to = log + end / VHOST_LOG_CHUNK + 1;
 >>  uint64_t addr = (start / VHOST_LOG_CHUNK) * VHOST_LOG_CHUNK;
 >>  if (end < start) {
 >> @@ -280,22 +284,55 @@ static uint64_t vhost_get_log_size(struct
 >>vhost_dev *dev)
 >>  }
 >>  return log_size;
 >>  }
 >> +static struct vhost_log *vhost_log_alloc(uint64_t size)
 >> +{
 >> +struct vhost_log *log = g_malloc0(sizeof *log + size *
 >>sizeof(*(log->log)));
 >> +
 >> +log->size = size;
 >> +log->refcnt = 1;
 >> +
 >> +return log;
 >> +}
 >> +
 >> +static struct vhost_log *vhost_log_get(uint64_t size)
 >> +{
 >> +if (!vhost_log || vhost_log->size != size) {
 >> +vhost_log = vhost_log_alloc(size);
 >
 >This just leaks the old log if size != size.
 
 But old log is reference counted and will be freed during vhost_log_put() if
 refcnt drops to zero.

You need a pointer to reference-count it.
You return pointer to new object, no one references the old one.

The pointer is just vhost_log->log. The old pointer will be kept in
vhost_dev_log_resize() until 1) new log was set through ioctl and 2)
the old log was synced. Then vhost_log_put() will drop the reference
count to the old log.

 >
 >> +} else {
 >> +++vhost_log->refcnt;
 >> +}
 >> +
 >> +return vhost_log;
 >> +}
 >> +
 >> +static void vhost_log_put(struct vhost_log *log)
 >> +{
 >> +if (!log) {
 >> +return;
 >> +}
 >> +
 >> +--log->refcnt;
 >> +if (log->refcnt == 0) {
 >> +if (vhost_log == log) {
 >> +vhost_log = NULL;
 >> +}
 >> +g_free(log);
 >> +}
 >> +}
 >> static inline void vhost_dev_log_resize(struct vhost_dev* dev,
 >>uint64_t size)
 >>  {
 >> -vhost_log_chunk_t *log;
 >> -uint64_t log_base;
 >> +struct vhost_log *log = vhost_log_get(size);
 >
 >At this point next device will try to use the
 >new log, but there is nothing there.
 
 vhost_log_get() will either allocate and return a new log if size is change
 or just increase the refcnt and use current log. So it works in fact?

But old log is lost and is not synced.

As I replied above, it was not lost and kept in vhost_log->log.
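
To make the lifetime question concrete, here is a minimal standalone
sketch of the get/put scheme from the quoted patch, with the log buffer
itself left out. It assumes, as the commit message says, that each
device keeps its own pointer to the log it last used. struct model_log,
struct model_dev, global_log and the model_* helpers are hypothetical
names for this example only, and calloc/free stand in for
g_malloc0/g_free.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct model_log {
    uint64_t size;
    int refcnt;
};

struct model_dev {
    struct model_log *log;      /* the log this device currently uses */
};

/* Global pointer to the log that new users should share. */
static struct model_log *global_log;

/* Return the shared log if the size matches, else allocate a new one. */
static struct model_log *model_log_get(uint64_t size)
{
    if (!global_log || global_log->size != size) {
        struct model_log *log = calloc(1, sizeof(*log));
        if (!log) {
            abort();
        }
        log->size = size;
        log->refcnt = 1;
        /* The previous log is not freed here; its users still hold it. */
        global_log = log;
    } else {
        ++global_log->refcnt;
    }
    return global_log;
}

/* Drop one reference; returns 1 if the log was freed, 0 otherwise. */
static int model_log_put(struct model_log *log)
{
    if (!log) {
        return 0;
    }
    if (--log->refcnt == 0) {
        if (global_log == log) {
            global_log = NULL;
        }
        free(log);
        return 1;
    }
    return 0;
}

/* Per-device resize: take the new log, then release the old one. */
static int model_dev_resize(struct model_dev *dev, uint64_t size)
{
    struct model_log *old = dev->log;

    dev->log = model_log_get(size);
    /* In the patch, the new log is set via ioctl and the old log is
     * synced at this point, before the old reference is dropped. */
    return model_log_put(old);
}
```

In this model, when only the first of two devices resizes, the old log
stays alive through the second device's pointer and is freed when that
device resizes too. The sketch only covers object lifetime; it does not
model syncing, so it does not settle whether every device's dirty bits
in the old log get synced.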

Re: [Qemu-devel] [PATCH V2] vhost: logs sharing

2015-04-30 Thread Michael S. Tsirkin

On Thu, Apr 30, 2015 at 04:05:09PM +0800, Jason Wang wrote:
> 
> 
> On Tue, Apr 28, 2015 at 6:30 PM, Michael S. Tsirkin  wrote:
> >On Tue, Apr 28, 2015 at 05:58:28PM +0800, Jason Wang wrote:
> >> On Tue, Apr 28, 2015 at 5:37 PM, Michael S. Tsirkin
> >> wrote:
> >> >On Fri, Apr 10, 2015 at 05:33:35PM +0800, Jason Wang wrote:
> >> >> Currently we allocate one vhost log per vhost device. This is sub
> >> >> optimal when:
> >> >> - Guest has several device with vhost as backend
> >> >> - Guest has multiqueue devices
> >> >> In the above cases, we can avoid the memory allocation by sharing a
> >> >> single vhost log among all the vhost devices. This is done through:
> >> >> - Introducing a new vhost_log structure with refcnt inside.
> >> >> - Using a global pointer to vhost_log structure that will be used.
> >>And
> >> >>   introduce helper to get the log with expected log size and helper
> >>to
> >> >> - drop the refcnt to the old log.
> >> >> - Each vhost device still keep track of a pointer to the log that
> >>was
> >> >>   used.
> >> >>   With above, if no resize happens, all vhost device will share a
> >> >>single
> >> >> vhost log. During resize, a new vhost_log structure will be
> >>allocated
> >> >> and made for the global pointer. And each vhost devices will drop
> >>the
> >> >> refcnt to the old log.
> >> >> Tested by doing scp during migration for a 2 queues virtio-net-pci.
> >> >> Cc: Michael S. Tsirkin 
> >> >> Signed-off-by: Jason Wang 
> >> >> ---
> >> >> Changes from V1:
> >> >> - Drop the list of vhost log, instead, using a global pointer
> >>instead
> >> >
> >> >I don't think it works like this. If you have a global pointer,
> >> >you also need a global listener, have that sync all devices.
> >> It doesn't conflict, see my comments below.
> >> >
> >> >
> >> >
> >> >> ---
> >> >>  hw/virtio/vhost.c | 66
> >> >>++-
> >> >>  include/hw/virtio/vhost.h |  9 ++-
> >> >>  2 files changed, 62 insertions(+), 13 deletions(-)
> >> >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> >> >> index 5a12861..e16c2db 100644
> >> >> --- a/hw/virtio/vhost.c
> >> >> +++ b/hw/virtio/vhost.c
> >> >> @@ -22,15 +22,19 @@
> >> >>  #include "hw/virtio/virtio-bus.h"
> >> >>  #include "migration/migration.h"
> >> >> +static struct vhost_log *vhost_log;
> >> >> +
> >> >>  static void vhost_dev_sync_region(struct vhost_dev *dev,
> >> >>MemoryRegionSection *section,
> >> >>uint64_t mfirst, uint64_t mlast,
> >> >>uint64_t rfirst, uint64_t rlast)
> >> >>  {
> >> >> +vhost_log_chunk_t *log = dev->log->log;
> >> >> +
> >> >>  uint64_t start = MAX(mfirst, rfirst);
> >> >>  uint64_t end = MIN(mlast, rlast);
> >> >> -vhost_log_chunk_t *from = dev->log + start / VHOST_LOG_CHUNK;
> >> >> -vhost_log_chunk_t *to = dev->log + end / VHOST_LOG_CHUNK + 1;
> >> >> +vhost_log_chunk_t *from = log + start / VHOST_LOG_CHUNK;
> >> >> +vhost_log_chunk_t *to = log + end / VHOST_LOG_CHUNK + 1;
> >> >>  uint64_t addr = (start / VHOST_LOG_CHUNK) * VHOST_LOG_CHUNK;
> >> >>  if (end < start) {
> >> >> @@ -280,22 +284,55 @@ static uint64_t vhost_get_log_size(struct
> >> >>vhost_dev *dev)
> >> >>  }
> >> >>  return log_size;
> >> >>  }
> >> >> +static struct vhost_log *vhost_log_alloc(uint64_t size)
> >> >> +{
> >> >> +struct vhost_log *log = g_malloc0(sizeof *log + size *
> >> >>sizeof(*(log->log)));
> >> >> +
> >> >> +log->size = size;
> >> >> +log->refcnt = 1;
> >> >> +
> >> >> +return log;
> >> >> +}
> >> >> +
> >> >> +static struct vhost_log *vhost_log_get(uint64_t size)
> >> >> +{
> >> >> +if (!vhost_log || vhost_log->size != size) {
> >> >> +vhost_log = vhost_log_alloc(size);
> >> >
> >> >This just leaks the old log if size != size.
> >>   But old log is reference counted and will be freed during
> >>vhost_log_put() if
> >> refcnt drops to zero.
> >
> >You need a pointer to reference-count it.
> >You return pointer to new object, no one references the old one.
> 
> The pointer is just vhost_log->log. The old pointer will be kept in
> vhost_dev_log_resize() until 1) new log was set through ioctl and 2) the old
> log was synced.

vhost_dev_log_resize is per device, isn't it?
So the log is synced in device 1, but not in device 2.

> Then vhost_log_put() will drop the reference count to the
> old log.
> 
> >
> >> >
> >> >> +} else {
> >> >> +++vhost_log->refcnt;
> >> >> +}
> >> >> +
> >> >> +return vhost_log;
> >> >> +}
> >> >> +
> >> >> +static void vhost_log_put(struct vhost_log *log)
> >> >> +{
> >> >> +if (!log) {
> >> >> +return;
> >> >> +}
> >> >> +
> >> >> +--log->refcnt;
> >> >> +if (log->refcnt == 0) {
> >> >> +if (vhost_log == log) {
> >> >> +vhost_log = NULL;
> >> >> +}
> >> >> +g_free(log);
> >> >> +}
> >> >> +}
> >>

Re: [Qemu-devel] [PATCH] openrisc: cpu: Remove unused cpu_get_pc

2015-04-30 Thread Alex Bennée


Peter Crosthwaite  writes:

> This function is not used by anything. Remove.
>
> Signed-off-by: Peter Crosthwaite 
> ---
>  target-openrisc/cpu.h | 5 -
>  1 file changed, 5 deletions(-)
>
> diff --git a/target-openrisc/cpu.h b/target-openrisc/cpu.h
> index b25324b..9e23cd0 100644
> --- a/target-openrisc/cpu.h
> +++ b/target-openrisc/cpu.h
> @@ -415,9 +415,4 @@ static inline int cpu_mmu_index(CPUOpenRISCState *env)
>  
>  #include "exec/exec-all.h"
>  
> -static inline target_ulong cpu_get_pc(CPUOpenRISCState *env)
> -{
> -return env->pc;
> -}
> -
>  #endif /* CPU_OPENRISC_H */

Are you going to clean up the microblaze one as well?

Reviewed-by: Alex Bennée 

-- 
Alex Bennée

Re: [Qemu-devel] [PATCH microblaze v1 3/6] mb: cpu: Remote unused cpu_get_pc

2015-04-30 Thread Alex Bennée


Peter Crosthwaite  writes:

> This function is not used by anything. Remove.
>
> Signed-off-by: Peter Crosthwaite 
> ---
>  target-microblaze/cpu.h | 5 -
>  1 file changed, 5 deletions(-)
>
> diff --git a/target-microblaze/cpu.h b/target-microblaze/cpu.h
> index 7d06227..2c18b49 100644
> --- a/target-microblaze/cpu.h
> +++ b/target-microblaze/cpu.h
> @@ -333,11 +333,6 @@ static inline int cpu_interrupts_enabled(CPUMBState *env)
>  
>  #include "exec/cpu-all.h"
>  
> -static inline target_ulong cpu_get_pc(CPUMBState *env)
> -{
> -return env->sregs[SR_PC];
> -}
> -
>  static inline void cpu_get_tb_cpu_state(CPUMBState *env, target_ulong *pc,
>  target_ulong *cs_base, int *flags)
>  {

Ahh I see that now ;-)

Reviewed-by: Alex Bennée 


-- 
Alex Bennée

[Qemu-devel] [PULL 34/42] coroutine: remove unnecessary parentheses in qemu_co_queue_empty

2015-04-30 Thread Michael Tokarev

From: "Emilio G. Cota" 

Signed-off-by: Emilio G. Cota 
Signed-off-by: Michael Tokarev 
---
 qemu-coroutine-lock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qemu-coroutine-lock.c b/qemu-coroutine-lock.c
index e4860ae..6b49033 100644
--- a/qemu-coroutine-lock.c
+++ b/qemu-coroutine-lock.c
@@ -108,7 +108,7 @@ bool qemu_co_enter_next(CoQueue *queue)
 
 bool qemu_co_queue_empty(CoQueue *queue)
 {
-return (QTAILQ_FIRST(&queue->entries) == NULL);
+return QTAILQ_FIRST(&queue->entries) == NULL;
 }
 
 void qemu_co_mutex_init(CoMutex *mutex)
-- 
2.1.4

[Qemu-devel] [PULL 33/42] qemu-char: remove unused list node from FDCharDriver

2015-04-30 Thread Michael Tokarev

From: "Emilio G. Cota" 

Signed-off-by: Emilio G. Cota 
Signed-off-by: Michael Tokarev 
---
 qemu-char.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/qemu-char.c b/qemu-char.c
index a405d76..d0c1564 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -973,7 +973,6 @@ typedef struct FDCharDriver {
 CharDriverState *chr;
 GIOChannel *fd_in, *fd_out;
 int max_size;
-QTAILQ_ENTRY(FDCharDriver) node;
 } FDCharDriver;
 
 /* Called with chr_write_lock held.  */
-- 
2.1.4

[Qemu-devel] [PULL 15/42] pci: Remove unused function ich9_d2pbr_init()

2015-04-30 Thread Michael Tokarev

From: Thomas Huth 

The function ich9_d2pbr_init() is completely unused and
thus can be deleted.

Signed-off-by: Thomas Huth 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael Tokarev 
---
 hw/pci-bridge/i82801b11.c | 21 -
 include/hw/i386/ich9.h|  1 -
 2 files changed, 22 deletions(-)

diff --git a/hw/pci-bridge/i82801b11.c b/hw/pci-bridge/i82801b11.c
index 14cd7fd..7e79bc0 100644
--- a/hw/pci-bridge/i82801b11.c
+++ b/hw/pci-bridge/i82801b11.c
@@ -101,27 +101,6 @@ static const TypeInfo i82801b11_bridge_info = {
 .class_init= i82801b11_bridge_class_init,
 };
 
-PCIBus *ich9_d2pbr_init(PCIBus *bus, int devfn, int sec_bus)
-{
-PCIDevice *d;
-PCIBridge *br;
-char buf[16];
-DeviceState *qdev;
-
-d = pci_create_multifunction(bus, devfn, true, "i82801b11-bridge");
-if (!d) {
-return NULL;
-}
-br = PCI_BRIDGE(d);
-qdev = DEVICE(d);
-
-snprintf(buf, sizeof(buf), "pci.%d", sec_bus);
-pci_bridge_map_irq(br, buf, pci_swizzle_map_irq_fn);
-qdev_init_nofail(qdev);
-
-return pci_bridge_get_sec_bus(br);
-}
-
 static void d2pbr_register(void)
 {
 type_register_static(&i82801b11_bridge_info);
diff --git a/include/hw/i386/ich9.h b/include/hw/i386/ich9.h
index c171578..f4e522c 100644
--- a/include/hw/i386/ich9.h
+++ b/include/hw/i386/ich9.h
@@ -18,7 +18,6 @@ void ich9_lpc_set_irq(void *opaque, int irq_num, int level);
 int ich9_lpc_map_irq(PCIDevice *pci_dev, int intx);
 PCIINTxRoute ich9_route_intx_pin_to_irq(void *opaque, int pirq_pin);
 void ich9_lpc_pm_init(PCIDevice *pci_lpc);
-PCIBus *ich9_d2pbr_init(PCIBus *bus, int devfn, int sec_bus);
 I2CBus *ich9_smb_init(PCIBus *bus, int devfn, uint32_t smb_io_base);
 
 #define ICH9_CC_SIZE(16 * 1024) /* 16KB */
-- 
2.1.4

[Qemu-devel] [PULL 36/42] i440fx-test: remove ARRAY_SIZE redefinition

2015-04-30 Thread Michael Tokarev

From: "Emilio G. Cota" 

It's defined in osdep.h and shouldn't be redefined here.

Signed-off-by: Emilio G. Cota 
Reviewed-by: Peter Crosthwaite 
Reviewed-by: Paolo Bonzini 
Signed-off-by: Michael Tokarev 
---
 tests/i440fx-test.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/tests/i440fx-test.c b/tests/i440fx-test.c
index d0bc8de..33a7ecb 100644
--- a/tests/i440fx-test.c
+++ b/tests/i440fx-test.c
@@ -27,8 +27,6 @@
 
 #define BROKEN 1
 
-#define ARRAY_SIZE(array) (sizeof(array) / sizeof((array)[0]))
-
 typedef struct TestData
 {
 int num_cpus;
-- 
2.1.4

[Qemu-devel] [PULL 11/42] tpm: fix coding style

2015-04-30 Thread Michael Tokarev

From: Stefan Berger 

Fix coding style in one instance.

Signed-off-by: Stefan Berger 
Signed-off-by: Michael Tokarev 
---
 hw/tpm/tpm_tis.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/tpm/tpm_tis.c b/hw/tpm/tpm_tis.c
index 4b6d601..b8235d5 100644
--- a/hw/tpm/tpm_tis.c
+++ b/hw/tpm/tpm_tis.c
@@ -842,7 +842,7 @@ static void tpm_tis_mmio_write_intern(void *opaque, hwaddr addr,
 (tis->loc[locty].sts & TPM_TIS_STS_EXPECT)) {
 /* we have a packet length - see if we have all of it */
 #ifdef RAISE_STS_IRQ
-bool needIrq = !(tis->loc[locty].sts & TPM_TIS_STS_VALID);
+bool need_irq = !(tis->loc[locty].sts & TPM_TIS_STS_VALID);
 #endif
 len = tpm_tis_get_size_from_buffer(&tis->loc[locty].w_buffer);
 if (len > tis->loc[locty].w_offset) {
@@ -853,7 +853,7 @@ static void tpm_tis_mmio_write_intern(void *opaque, hwaddr addr,
 tpm_tis_sts_set(&tis->loc[locty], TPM_TIS_STS_VALID);
 }
 #ifdef RAISE_STS_IRQ
-if (needIrq) {
+if (need_irq) {
 tpm_tis_raise_irq(s, locty, TPM_TIS_INT_STS_VALID);
 }
 #endif
-- 
2.1.4

[Qemu-devel] [PULL 22/42] libcacard: stop including qemu-common.h

2015-04-30 Thread Michael Tokarev

From: Paolo Bonzini 

This is a small step towards making libcacard standalone.

Signed-off-by: Paolo Bonzini 
Signed-off-by: Michael Tokarev 
---
 libcacard/cac.c| 5 -
 libcacard/card_7816.c  | 4 +++-
 libcacard/event.c  | 2 +-
 libcacard/vcard.c  | 4 +++-
 libcacard/vcard_emul_nss.c | 2 +-
 libcacard/vreader.c| 4 +++-
 libcacard/vscclient.c  | 8 +++-
 7 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/libcacard/cac.c b/libcacard/cac.c
index f38fdce..bc84534 100644
--- a/libcacard/cac.c
+++ b/libcacard/cac.c
@@ -5,7 +5,10 @@
  * See the COPYING.LIB file in the top-level directory.
  */
 
-#include "qemu-common.h"
+#include "glib-compat.h"
+
+#include 
+#include 
 
 #include "cac.h"
 #include "vcard.h"
diff --git a/libcacard/card_7816.c b/libcacard/card_7816.c
index 814fa16..22fd334 100644
--- a/libcacard/card_7816.c
+++ b/libcacard/card_7816.c
@@ -5,7 +5,9 @@
  * See the COPYING.LIB file in the top-level directory.
  */
 
-#include "qemu-common.h"
+#include "glib-compat.h"
+
+#include 
 
 #include "vcard.h"
 #include "vcard_emul.h"
diff --git a/libcacard/event.c b/libcacard/event.c
index 4c551e4..63f4057 100644
--- a/libcacard/event.c
+++ b/libcacard/event.c
@@ -5,7 +5,7 @@
  * See the COPYING.LIB file in the top-level directory.
  */
 
-#include "qemu-common.h"
+#include "glib-compat.h"
 
 #include "vcard.h"
 #include "vreader.h"
diff --git a/libcacard/vcard.c b/libcacard/vcard.c
index d140a8e..1a87208 100644
--- a/libcacard/vcard.c
+++ b/libcacard/vcard.c
@@ -5,7 +5,9 @@
  * See the COPYING.LIB file in the top-level directory.
  */
 
-#include "qemu-common.h"
+#include "glib-compat.h"
+
+#include 
 
 #include "vcard.h"
 #include "vcard_emul.h"
diff --git a/libcacard/vcard_emul_nss.c b/libcacard/vcard_emul_nss.c
index 950edee..6955f69 100644
--- a/libcacard/vcard_emul_nss.c
+++ b/libcacard/vcard_emul_nss.c
@@ -25,7 +25,7 @@
 #include 
 #include 
 
-#include "qemu-common.h"
+#include "glib-compat.h"
 
 #include "vcard.h"
 #include "card_7816t.h"
diff --git a/libcacard/vreader.c b/libcacard/vreader.c
index 0315dd8..9725f46 100644
--- a/libcacard/vreader.c
+++ b/libcacard/vreader.c
@@ -10,7 +10,9 @@
 #endif
 #define G_LOG_DOMAIN "libcacard"
 
-#include "qemu-common.h"
+#include "glib-compat.h"
+
+#include 
 
 #include "vcard.h"
 #include "vcard_emul.h"
diff --git a/libcacard/vscclient.c b/libcacard/vscclient.c
index fa6041d..0652684 100644
--- a/libcacard/vscclient.c
+++ b/libcacard/vscclient.c
@@ -10,14 +10,20 @@
  * See the COPYING.LIB file in the top-level directory.
  */
 
+#include 
+#include 
+#include 
 #ifndef _WIN32
 #include 
 #include 
 #include 
+#include 
 #define closesocket(x) close(x)
+#else
+#include 
 #endif
 
-#include "qemu-common.h"
+#include "glib-compat.h"
 
 #include "vscard_common.h"
 
-- 
2.1.4

Re: [Qemu-devel] [RFC PATCH 6/8] tap: Drop tap_can_send

2015-04-30 Thread Jason Wang




On Wed, Apr 29, 2015 at 6:37 PM, Fam Zheng  wrote:
This callback is called by main loop before polling s->fd, if it returns
false, the fd will not be polled in this iteration.

This is redundant with checks inside read callback. After this patch,
the data will be sent to peer when it arrives. If the device can't
receive, it will be queued to incoming_queue, and when the device status
changes, this queue will be flushed.

Signed-off-by: Fam Zheng 
---
 net/tap.c | 23 +++
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/net/tap.c b/net/tap.c
index 968df46..2ddf570 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -61,14 +61,12 @@ typedef struct TAPState {
 
 static int launch_script(const char *setup_script, const char *ifname, int fd);
 
-static int tap_can_send(void *opaque);
 static void tap_send(void *opaque);
 static void tap_writable(void *opaque);
 
 static void tap_update_fd_handler(TAPState *s)
 {
-qemu_set_fd_handler2(s->fd,
- s->read_poll && s->enabled ? tap_can_send : NULL,
+qemu_set_fd_handler2(s->fd, NULL,
  s->read_poll && s->enabled ? tap_send : NULL,
  s->write_poll && s->enabled ? tap_writable : NULL,
  s);
@@ -165,13 +163,6 @@ static ssize_t tap_receive(NetClientState *nc, const uint8_t *buf, size_t size)
 return tap_write_packet(s, iov, 1);
 }
 
-static int tap_can_send(void *opaque)
-{
-TAPState *s = opaque;
-
-return qemu_can_send_packet(&s->nc);
-}
-
 #ifndef __sun__
 ssize_t tap_read_packet(int tapfd, uint8_t *buf, int maxlen)
 {
@@ -190,10 +181,13 @@ static void tap_send(void *opaque)
 TAPState *s = opaque;
 int size;
 int packets = 0;
+bool can_send = true;
 
-while (qemu_can_send_packet(&s->nc)) {

+while (can_send) {
 uint8_t *buf = s->buf;
 
+can_send = qemu_can_send_packet(&s->nc);

+
 size = tap_read_packet(s->fd, s->buf, sizeof(s->buf));
 if (size <= 0) {
 break;
@@ -204,8 +198,13 @@ static void tap_send(void *opaque)
 size -= s->host_vnet_hdr_len;
 }
 
+/* If !can_send, we will want to disable the read poll, but 
we still
+ * need the send completion callback to enable it again, 
which is a

+ * sign of peer becoming ready.  So call the send function
+ * regardlessly of can_send.
+ */


It is probably not safe to depend on sent_cb to re-enable the polling,
since the packet could be purged under some conditions (e.g.
net_vm_change_state_handler()). In that case tap_send_completed won't
be called.




>  size = qemu_send_packet_async(&s->nc, buf, size, tap_send_completed);
> -if (size == 0) {
> +if (size == 0 || !can_send) {
>  tap_read_poll(s, false);
>  break;
>  } else if (size < 0) {
> --
> 1.9.3
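
The concern raised in this reply can be illustrated with a small model. This is only a sketch under assumptions: ToyQueue, ToyTap and the toy_* functions are hypothetical stand-ins and not QEMU's net queue API. It shows that if polling is re-enabled only from the completion callback, a purge that drops the queued packet without invoking the callback leaves polling disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are QEMU's real types. */
typedef void (*toy_sent_cb)(void *opaque);

typedef struct {
    toy_sent_cb sent_cb;  /* completion callback of the queued packet */
    void *opaque;
    bool pending;         /* true while one packet sits in the queue */
} ToyQueue;

typedef struct {
    bool read_poll;       /* models "is the tap fd being polled?" */
} ToyTap;

static void toy_send_completed(void *opaque)
{
    ToyTap *s = opaque;
    s->read_poll = true;  /* peer is ready again: resume polling */
}

/* Queue a packet the peer cannot take yet and stop polling the fd. */
static void toy_send_async(ToyQueue *q, ToyTap *s)
{
    q->sent_cb = toy_send_completed;
    q->opaque = s;
    q->pending = true;
    s->read_poll = false;
}

/* Delivery path: the packet is handed over and the callback fires. */
static void toy_flush(ToyQueue *q)
{
    if (q->pending) {
        assert(q->sent_cb != NULL);
        q->pending = false;
        q->sent_cb(q->opaque);
    }
}

/* Purge path: the packet is dropped and the callback never fires. */
static void toy_purge(ToyQueue *q)
{
    q->pending = false;
}
```

After toy_send_async() followed by toy_flush(), read_poll is true again; after toy_send_async() followed by toy_purge(), nothing ever sets it back.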

Re: [Qemu-devel] [PATCH v4 0/4] scripts: qmp-shell: add transaction support

2015-04-30 Thread Kashyap Chamarthy

On Wed, Apr 29, 2015 at 03:14:00PM -0400, John Snow wrote:
> The qmp-shell is a little rudimentary, but it can be hacked
> to give us some transactional support without too much difficulty.
> 
> (1) Prep.
> (2) Add support for serializing json arrays and
> improve the robustness of QMP parsing
> (3) Add a special transaction( ... ) syntax that lets users
> build up transactional commands using the existing qmp shell
> syntax to define each action.
> (4) Add a verbose flag to display generated QMP commands.
> 
> The parsing is not as robust as one would like, but this suffices
> without adding a proper parser.
> 
> Design considerations:
> 
> (1) Try not to disrupt the existing design of the qmp-shell. The existing
> API is not disturbed.
> (2) Pick a "magic token" such that it could not be confused for legitimate
> QMP/JSON syntax. Parentheses are used for this purpose.
> 
> For convenience, this branch is available at:
> https://github.com/jnsnow/qemu.git branch qmp-shell++
> This version is tagged qmp-shell++-v4.
> 
> ===
> v++
> ===
> 
>  - Use the AST to allow 'true', 'false' and 'null' within QMP expressions
>  - Fix a bunch of stupid junk I broke in v2, apparently.
> 
> ===
> v3:
> ===
> 
>  - Folding in hotfix from list (import ast)
> 
> ===
> v2:
> ===
> 
>  - Squash patches 2 & 3:
>  - Remove wholesale replacement of single quotes, in favor of try blocks
>that attempt to parse as pure JSON, then as Python.
>  - Factored out the value parser block to accomplish the above.
>  - Allow both true/True and false/False for values.
>  - Fix typo in patch 3 cover letter. (was patch 4.)
> 
> John Snow (4):
>   scripts: qmp-shell: refactor helpers
>   scripts: qmp-shell: Expand support for QMP expressions
>   scripts: qmp-shell: add transaction subshell
>   scripts: qmp-shell: Add verbose flag
> 
>  scripts/qmp/qmp-shell | 147 +++---
>  1 file changed, 116 insertions(+), 31 deletions(-)


Quick test, works as advertised. This time, I ran this series on top of
your incremental backup branch:

A positive test (sorry for the un-wrapped long lines). I already had the
target image pre-created:

$ ./qmp-shell -v ./qmp-sock
Welcome to the QMP low-level shell!
Connected to QEMU 2.2.94
(QEMU)
(QEMU) transaction(
TRANS> blockdev-snapshot-internal-sync device=drive-ide0-0-0 name=snapshot5
TRANS> block-dirty-bitmap-add node=drive-ide0-0-0 name=bitmap1
TRANS> block-dirty-bitmap-clear node=drive-ide0-0-0 name=bitmap0
TRANS> drive-backup device=drive-ide0-0-0 bitmap=bitmap1 sync=dirty-bitmap target=./incremental.0.img mode=existing format=qcow2
TRANS> )
{"execute": "transaction", "arguments": {"actions": [{"data": {"device": "drive-ide0-0-0", "name": "snapshot5"}, "type": "blockdev-snapshot-internal-sync"}, {"data": {"node": "drive-ide0-0-0", "name": "bitmap1"}, "type": "block-dirty-bitmap-add"}, {"data": {"node": "drive-ide0-0-0", "name": "bitmap0"}, "type": "block-dirty-bitmap-clear"}, {"data": {"target": "./incremental.0.img", "format": "qcow2", "sync": "dirty-bitmap", "bitmap": "bitmap1", "mode": "existing", "device": "drive-ide0-0-0"}, "type": "drive-backup"}]}}
{"return": {}}
(QEMU)

And a quick negative test: don't pre-create the target image when
running the `drive-backup` command, appropriate error is thrown.


So, FWIW:

  Tested-by: Kashyap Chamarthy 

-- 
/kashyap

[Qemu-devel] Question about block driver

2015-04-30 Thread Wen Congyang

Hi, all

Some drivers use bdrv_open(), while the other drivers use bdrv_file_open().
What is the difference between bdrv_open() and bdrv_file_open()?

Thanks
Wen Congyang

Re: [Qemu-devel] [PULL 0/3] VFIO fixes + AMD GPU reset workaround

2015-04-30 Thread Peter Maydell

On 28 April 2015 at 18:52, Alex Williamson  wrote:
> The following changes since commit 84cbd63f87c1d246f51ec8eee5367a5588f367fd:
>
>   Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into 
> staging (2015-04-28 12:22:20 +0100)
>
> are available in the git repository at:
>
>
>   git://github.com/awilliam/qemu-vfio.git tags/vfio-update-20150428.0
>
> for you to fetch changes up to 5655f931abcfa5f100d12d021eaed606c2d4ef52:
>
>   vfio-pci: Reset workaround for AMD Bonaire and Hawaii GPUs (2015-04-28 
> 11:14:02 -0600)
>
> 
> VFIO updates
>  - Correction to BAR overflow
>  - Fix error sign
>  - Reset workaround for AMD Bonaire & Hawaii GPUs
>
> 

Applied, thanks.

-- PMM

Re: [Qemu-devel] [REBASE PATCH v5 1/2] machine: add default_ram_size to machine class

2015-04-30 Thread Alexander Graf



On 30.04.15 06:41, Nikunj A Dadhania wrote:
> 
> Hi Paolo,
> 
> Paolo Bonzini  writes:
>> On 29/04/2015 11:06, Nikunj A Dadhania wrote:
>>>> so David can push both patches.
>>>>
>>>> But isn't 1G a bit too much?  At least on x86 you can easily boot with
>>>> 512M.
>>>
>>> I understood this number as not the _minimum memory_ to boot the
>>> VM. And this will only come in picture when the user has not specified
>>> any memory.
>>
>> This in turn will basically only happen for QEMU developers.  So keeping
>> the default on the low side would make sense.
>>
>> On my (4G memory) laptop I might not even be able to boot a PPC64 VM
>> with 1G and TCG, but I can do that nicely with 256M.
> 
> That will be fine with me as well, i.e. 256M
> 
> David/Alex, Do you have comments on this before we change it?

I've seen RAM size combinations that seemed to work ok, but then failed
during grub2 execution for example. Please verify with all reasonably
realistically executed distributions that 256MB is enough.


Alex

Re: [Qemu-devel] [PATCH V2] vhost: logs sharing

2015-04-30 Thread Jason Wang




On Thu, Apr 30, 2015 at 4:09 PM, Michael S. Tsirkin  wrote:
> On Thu, Apr 30, 2015 at 04:05:09PM +0800, Jason Wang wrote:
>> On Tue, Apr 28, 2015 at 6:30 PM, Michael S. Tsirkin wrote:
>> >On Tue, Apr 28, 2015 at 05:58:28PM +0800, Jason Wang wrote:
>> >> On Tue, Apr 28, 2015 at 5:37 PM, Michael S. Tsirkin wrote:
>> >> >On Fri, Apr 10, 2015 at 05:33:35PM +0800, Jason Wang wrote:
>> >> >> Currently we allocate one vhost log per vhost device. This is sub
>> >> >> optimal when:
>> >> >> - Guest has several device with vhost as backend
>> >> >> - Guest has multiqueue devices
>> >> >> In the above cases, we can avoid the memory allocation by sharing a
>> >> >> single vhost log among all the vhost devices. This is done through:
>> >> >> - Introducing a new vhost_log structure with refcnt inside.
>> >> >> - Using a global pointer to vhost_log structure that will be used. And
>> >> >>   introduce helper to get the log with expected log size and helper to
>> >> >> - drop the refcnt to the old log.
>> >> >> - Each vhost device still keep track of a pointer to the log that was
>> >> >>   used.
>> >> >>   With above, if no resize happens, all vhost device will share a single
>> >> >> vhost log. During resize, a new vhost_log structure will be allocated
>> >> >> and made for the global pointer. And each vhost devices will drop the
>> >> >> refcnt to the old log.
>> >> >> Tested by doing scp during migration for a 2 queues virtio-net-pci.
>> >> >> Cc: Michael S. Tsirkin 
>> >> >> Signed-off-by: Jason Wang 
>> >> >> ---
>> >> >> Changes from V1:
>> >> >> - Drop the list of vhost log, instead, using a global pointer instead
>> >> >
>> >> >I don't think it works like this. If you have a global pointer,
>> >> >you also need a global listener, have that sync all devices.
>> >> It doesn't conflict, see my comments below.
>> >> >
>> >> >
>> >> >
>> >> >> ---
>> >> >>  hw/virtio/vhost.c | 66 ++-
>> >> >>  include/hw/virtio/vhost.h |  9 ++-
>> >> >>  2 files changed, 62 insertions(+), 13 deletions(-)
>> >> >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>> >> >> index 5a12861..e16c2db 100644
>> >> >> --- a/hw/virtio/vhost.c
>> >> >> +++ b/hw/virtio/vhost.c
>> >> >> @@ -22,15 +22,19 @@
>> >> >>  #include "hw/virtio/virtio-bus.h"
>> >> >>  #include "migration/migration.h"
>> >> >> +static struct vhost_log *vhost_log;
>> >> >> +
>> >> >>  static void vhost_dev_sync_region(struct vhost_dev *dev,
>> >> >>MemoryRegionSection *section,
>> >> >>uint64_t mfirst, uint64_t mlast,
>> >> >>uint64_t rfirst, uint64_t rlast)
>> >> >>  {
>> >> >> +vhost_log_chunk_t *log = dev->log->log;
>> >> >> +
>> >> >>  uint64_t start = MAX(mfirst, rfirst);
>> >> >>  uint64_t end = MIN(mlast, rlast);
>> >> >> -vhost_log_chunk_t *from = dev->log + start / VHOST_LOG_CHUNK;
>> >> >> -vhost_log_chunk_t *to = dev->log + end / VHOST_LOG_CHUNK + 1;
>> >> >> +vhost_log_chunk_t *from = log + start / VHOST_LOG_CHUNK;
>> >> >> +vhost_log_chunk_t *to = log + end / VHOST_LOG_CHUNK + 1;
>> >> >>  uint64_t addr = (start / VHOST_LOG_CHUNK) * VHOST_LOG_CHUNK;
>> >> >>  if (end < start) {
>> >> >> @@ -280,22 +284,55 @@ static uint64_t vhost_get_log_size(struct vhost_dev *dev)
>> >> >>  }
>> >> >>  return log_size;
>> >> >>  }
>> >> >> +static struct vhost_log *vhost_log_alloc(uint64_t size)
>> >> >> +{
>> >> >> +struct vhost_log *log = g_malloc0(sizeof *log + size * sizeof(*(log->log)));
>> >> >> +
>> >> >> +log->size = size;
>> >> >> +log->refcnt = 1;
>> >> >> +
>> >> >> +return log;
>> >> >> +}
>> >> >> +
>> >> >> +static struct vhost_log *vhost_log_get(uint64_t size)
>> >> >> +{
>> >> >> +if (!vhost_log || vhost_log->size != size) {
>> >> >> +vhost_log = vhost_log_alloc(size);
>> >> >
>> >> >This just leaks the old log if size != size.
>> >>   But old log is reference counted and will be freed during
>> >> vhost_log_put() if refcnt drops to zero.
>> >
>> >You need a pointer to reference-count it.
>> >You return pointer to new object, no one references the old one.
>>
>> The pointer is just vhost_log->log. The old pointer will be kept in
>> vhost_dev_log_resize() until 1) new log was set through ioctl and 2) the old
>> log was synced.
>
> vhost_dev_log_resize is per device, isn't it?

Yes.

> So the log is synced in device 1, but not in device 2.

But we will do resizing one by one for all listeners. So the sync of
device 2 will happen soon afterwards.
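
The lifetime argument in this exchange can be followed with a small model. This is a sketch under assumptions, not the patch itself: ToyLog, ToyDev and the toy_* functions are hypothetical names, and the sync step is only marked by a comment. It models a global pointer to the current log, per-device pointers to the log each device still uses, and a refcount that frees the old log once the last device has moved over.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model of a shared, reference-counted log. */
typedef struct ToyLog {
    uint64_t size;
    int refcnt;
} ToyLog;

typedef struct ToyDev {
    ToyLog *log;                 /* the log this device currently uses */
} ToyDev;

static ToyLog *toy_global_log;   /* log handed out to new users */
static int toy_live_logs;        /* allocation counter for the demo */

static ToyLog *toy_log_get(uint64_t size)
{
    if (!toy_global_log || toy_global_log->size != size) {
        /* Size changed: allocate a new log and make it the global one.
         * The old log is not freed here; devices still point at it. */
        ToyLog *log = calloc(1, sizeof(*log));
        assert(log != NULL);
        log->size = size;
        log->refcnt = 1;
        toy_global_log = log;
        toy_live_logs++;
    } else {
        toy_global_log->refcnt++;
    }
    return toy_global_log;
}

static void toy_log_put(ToyLog *log)
{
    assert(log != NULL && log->refcnt > 0);
    if (--log->refcnt == 0) {
        if (log == toy_global_log) {
            toy_global_log = NULL;
        }
        free(log);
        toy_live_logs--;
    }
}

/* Per-device resize: switch to the new log, then drop the old one. */
static void toy_dev_resize(ToyDev *dev, uint64_t size)
{
    ToyLog *old = dev->log;
    dev->log = toy_log_get(size);
    /* A real device would sync the old log here before dropping it. */
    if (old) {
        toy_log_put(old);
    }
}
```

With two devices, resizing the first one leaves two logs alive (the second device still holds the old one); resizing the second one frees the old log and both devices share the new one.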

Re: [Qemu-devel] Question about block driver

2015-04-30 Thread Kevin Wolf

[Cc: qemu-block]

Am 30.04.2015 um 11:11 hat Wen Congyang geschrieben:
> Some drivers use bdrv_open, while the other drivers use bdrv_file_open().
> What is the difference between bdrv_open() and bdrv_file_open()?

bdrv_file_open() is used by protocol drivers that don't need any other
driver to work (e.g. raw-posix), bdrv_open() is used by format drivers
that need a bs->file (e.g. raw or qcow2).

They used to have different parameters originally, but nowadays that's
the only difference.
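
A toy model of the layering described above, as a sketch only: ToyDriver, ToyBDS and the toy_* functions are hypothetical and do not mirror QEMU's BlockDriver definition. A protocol-style driver reaches the storage itself, while a format-style driver expects an already opened node below it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct ToyBDS ToyBDS;

typedef struct ToyDriver {
    const char *name;
    /* Set by protocol-style drivers: they reach the storage themselves. */
    int (*file_open)(ToyBDS *bs, const char *filename);
    /* Set by format-style drivers: they sit on top of an opened bs->file. */
    int (*open)(ToyBDS *bs);
} ToyDriver;

struct ToyBDS {
    const ToyDriver *drv;
    ToyBDS *file;            /* underlying node, NULL for protocol nodes */
    char filename[64];
};

static int toy_posix_file_open(ToyBDS *bs, const char *filename)
{
    strncpy(bs->filename, filename, sizeof(bs->filename) - 1);
    bs->filename[sizeof(bs->filename) - 1] = '\0';
    return 0;
}

static int toy_qcow_open(ToyBDS *bs)
{
    /* A format driver expects the layer below to be open already. */
    return bs->file ? 0 : -1;
}

static const ToyDriver toy_protocol = { "toy-posix", toy_posix_file_open, NULL };
static const ToyDriver toy_format = { "toy-qcow", NULL, toy_qcow_open };

/* Open a format node on top of a protocol node. */
static int toy_open_stack(ToyBDS *fmt, ToyBDS *proto, const char *filename)
{
    assert(fmt != NULL && proto != NULL && filename != NULL);
    proto->drv = &toy_protocol;
    proto->file = NULL;
    if (proto->drv->file_open(proto, filename) < 0) {
        return -1;
    }
    fmt->drv = &toy_format;
    fmt->file = proto;       /* the generic layer, not the driver, sets this */
    return fmt->drv->open(fmt);
}
```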

Kevin

Re: [Qemu-devel] [PATCH V2] vhost: logs sharing

2015-04-30 Thread Michael S. Tsirkin

On Thu, Apr 30, 2015 at 05:22:33PM +0800, Jason Wang wrote:
> 
> 
> On Thu, Apr 30, 2015 at 4:09 PM, Michael S. Tsirkin  wrote:
> >On Thu, Apr 30, 2015 at 04:05:09PM +0800, Jason Wang wrote:
> >> On Tue, Apr 28, 2015 at 6:30 PM, Michael S. Tsirkin
> >> wrote:
> >> >On Tue, Apr 28, 2015 at 05:58:28PM +0800, Jason Wang wrote:
> >> >> On Tue, Apr 28, 2015 at 5:37 PM, Michael S. Tsirkin
> >> >> wrote:
> >> >> >On Fri, Apr 10, 2015 at 05:33:35PM +0800, Jason Wang wrote:
> >> >> >> Currently we allocate one vhost log per vhost device. This is sub
> >> >> >> optimal when:
> >> >> >> - Guest has several device with vhost as backend
> >> >> >> - Guest has multiqueue devices
> >> >> >> In the above cases, we can avoid the memory allocation by sharing
> >>a
> >> >> >> single vhost log among all the vhost devices. This is done
> >>through:
> >> >> >> - Introducing a new vhost_log structure with refcnt inside.
> >> >> >> - Using a global pointer to vhost_log structure that will be
> >>used.
> >> >>And
> >> >> >>   introduce helper to get the log with expected log size and
> >>helper
> >> >>to
> >> >> >> - drop the refcnt to the old log.
> >> >> >> - Each vhost device still keep track of a pointer to the log that
> >> >>was
> >> >> >>   used.
> >> >> >>   With above, if no resize happens, all vhost device will share a
> >> >> >>single
> >> >> >> vhost log. During resize, a new vhost_log structure will be
> >> >>allocated
> >> >> >> and made for the global pointer. And each vhost devices will drop
> >> >>the
> >> >> >> refcnt to the old log.
> >> >> >> Tested by doing scp during migration for a 2 queues
> >>virtio-net-pci.
> >> >> >> Cc: Michael S. Tsirkin 
> >> >> >> Signed-off-by: Jason Wang 
> >> >> >> ---
> >> >> >> Changes from V1:
> >> >> >> - Drop the list of vhost log, instead, using a global pointer
> >> >>instead
> >> >> >
> >> >> >I don't think it works like this. If you have a global pointer,
> >> >> >you also need a global listener, have that sync all devices.
> >> >> It doesn't conflict, see my comments below.
> >> >> >
> >> >> >
> >> >> >
> >> >> >> ---
> >> >> >>  hw/virtio/vhost.c | 66
> >> >> >>++-
> >> >> >>  include/hw/virtio/vhost.h |  9 ++-
> >> >> >>  2 files changed, 62 insertions(+), 13 deletions(-)
> >> >> >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> >> >> >> index 5a12861..e16c2db 100644
> >> >> >> --- a/hw/virtio/vhost.c
> >> >> >> +++ b/hw/virtio/vhost.c
> >> >> >> @@ -22,15 +22,19 @@
> >> >> >>  #include "hw/virtio/virtio-bus.h"
> >> >> >>  #include "migration/migration.h"
> >> >> >> +static struct vhost_log *vhost_log;
> >> >> >> +
> >> >> >>  static void vhost_dev_sync_region(struct vhost_dev *dev,
> >> >> >>MemoryRegionSection *section,
> >> >> >>uint64_t mfirst, uint64_t
> >>mlast,
> >> >> >>uint64_t rfirst, uint64_t
> >>rlast)
> >> >> >>  {
> >> >> >> +vhost_log_chunk_t *log = dev->log->log;
> >> >> >> +
> >> >> >>  uint64_t start = MAX(mfirst, rfirst);
> >> >> >>  uint64_t end = MIN(mlast, rlast);
> >> >> >> -vhost_log_chunk_t *from = dev->log + start /
> >>VHOST_LOG_CHUNK;
> >> >> >> -vhost_log_chunk_t *to = dev->log + end / VHOST_LOG_CHUNK +
> >>1;
> >> >> >> +vhost_log_chunk_t *from = log + start / VHOST_LOG_CHUNK;
> >> >> >> +vhost_log_chunk_t *to = log + end / VHOST_LOG_CHUNK + 1;
> >> >> >>  uint64_t addr = (start / VHOST_LOG_CHUNK) * VHOST_LOG_CHUNK;
> >> >> >>  if (end < start) {
> >> >> >> @@ -280,22 +284,55 @@ static uint64_t vhost_get_log_size(struct
> >> >> >>vhost_dev *dev)
> >> >> >>  }
> >> >> >>  return log_size;
> >> >> >>  }
> >> >> >> +static struct vhost_log *vhost_log_alloc(uint64_t size)
> >> >> >> +{
> >> >> >> +struct vhost_log *log = g_malloc0(sizeof *log + size *
> >> >> >>sizeof(*(log->log)));
> >> >> >> +
> >> >> >> +log->size = size;
> >> >> >> +log->refcnt = 1;
> >> >> >> +
> >> >> >> +return log;
> >> >> >> +}
> >> >> >> +
> >> >> >> +static struct vhost_log *vhost_log_get(uint64_t size)
> >> >> >> +{
> >> >> >> +if (!vhost_log || vhost_log->size != size) {
> >> >> >> +vhost_log = vhost_log_alloc(size);
> >> >> >
> >> >> >This just leaks the old log if size != size.
> >> >>   But old log is reference counted and will be freed during
> >> >>vhost_log_put() if
> >> >> refcnt drops to zero.
> >> >
> >> >You need a pointer to reference-count it.
> >> >You return pointer to new object, no one references the old one.
> >> The pointer is just vhost_log->log. The old pointer will be kept in
> >> vhost_dev_log_resize() until 1) new log was set through ioctl and 2)
> >>the old
> >> log was synced.
> >
> >vhost_dev_log_resize is per device, isn't it?
> 
> Yes.
> 
> >
> >So the log is synced in device 1, but not in device 2.
> 
> But we will do resizing one by one for all listeners. So the sync of device
> 2 will hap

Re: [Qemu-devel] Question about block driver

2015-04-30 Thread Wen Congyang

On 04/30/2015 05:33 PM, Kevin Wolf wrote:
> [Cc: qemu-block]
> 
> Am 30.04.2015 um 11:11 hat Wen Congyang geschrieben:
>> Some drivers use bdrv_open, while the other drivers use bdrv_file_open().
>> What is the difference between bdrv_open() and bdrv_file_open()?
> 
> bdrv_file_open() is used by protocol drivers that don't need any other
> driver to work (e.g. raw-posix), bdrv_open() is used by format drivers
> that need a bs->file (e.g. raw or qcow2).
> 
> They used to have different parameters originally, but nowadays that's
> the only difference.

So, if the driver wants to open bs->file itself, it should use bdrv_file_open(),
and if the driver expects bs->file to have been opened already before its own
open, it should use bdrv_open(). Is that right?

Thanks
Wen Congyang

> 
> Kevin
> .
>

Re: [Qemu-devel] [REBASE PATCH v5 1/2] machine: add default_ram_size to machine class

2015-04-30 Thread Thomas Huth

On Thu, 30 Apr 2015 11:18:05 +0200
Alexander Graf  wrote:

> 
> 
> On 30.04.15 06:41, Nikunj A Dadhania wrote:
> > 
> > Hi Paolo,
> > 
> > Paolo Bonzini  writes:
> >> On 29/04/2015 11:06, Nikunj A Dadhania wrote:
> >>>> so David can push both patches.
> >>>>
> >>>> But isn't 1G a bit too much?  At least on x86 you can easily boot with
> >>>> 512M.
> >>>
> >>> I understood this number as not the _minimum memory_ to boot the
> >>> VM. And this will only come in picture when the user has not specified
> >>> any memory.
> >>
> >> This in turn will basically only happen for QEMU developers.  So keeping
> >> the default on the low side would make sense.
> >>
> >> On my (4G memory) laptop I might not even be able to boot a PPC64 VM
> >> with 1G and TCG, but I can do that nicely with 256M.
> > 
> > That will be fine with me as well, i.e. 256M
> > 
> > David/Alex, Do you have comments on this before we change it?
> 
> I've seen RAM size combinations that seemed to work ok, but then failed
> during grub2 execution for example. Please verify with all reasonably
> realistically executed distributions that 256MB is enough.

Since this default value will likely be there for the next couple of
years, it's maybe better to use a slightly higher value than one that
is too low - the amount of RAM that a guest requires likely rather
increases in the next years instead of going down again. So I think
using 512 MB instead is maybe a good compromise?

 Thomas

Re: [Qemu-devel] [REBASE PATCH v5 1/2] machine: add default_ram_size to machine class

2015-04-30 Thread Alexander Graf



> Am 30.04.2015 um 11:40 schrieb Thomas Huth :
> 
> On Thu, 30 Apr 2015 11:18:05 +0200
> Alexander Graf  wrote:
> 
>> 
>> 
>>> On 30.04.15 06:41, Nikunj A Dadhania wrote:
>>> 
>>> Hi Paolo,
>>> 
>>> Paolo Bonzini  writes:
>>>> On 29/04/2015 11:06, Nikunj A Dadhania wrote:
>>>>>> so David can push both patches.
>>>>>>
>>>>>> But isn't 1G a bit too much?  At least on x86 you can easily boot with
>>>>>> 512M.
>>>>>
>>>>> I understood this number as not the _minimum memory_ to boot the
>>>>> VM. And this will only come in picture when the user has not specified
>>>>> any memory.
>>>>
>>>> This in turn will basically only happen for QEMU developers.  So keeping
>>>> the default on the low side would make sense.
>>>>
>>>> On my (4G memory) laptop I might not even be able to boot a PPC64 VM
>>>> with 1G and TCG, but I can do that nicely with 256M.
>>> 
>>> That will be fine with me as well, i.e. 256M
>>> 
>>> David/Alex, Do you have comments on this before we change it?
>> 
>> I've seen RAM size combinations that seemed to work ok, but then failed
>> during grub2 execution for example. Please verify with all reasonably
>> realistically executed distributions that 256MB is enough.
> 
> Since this default value will likely be there for the next couple of
> years, it's maybe better to use a slightly higher value than one that
> is too low - the amount of RAM that a guest requires likely rather
> increases in the next years instead of going down again. So I think
> using 512 MB instead is maybe a good compromise?

Again, even with 512, please verify a few different distros and check that
they run.

Alex

[Qemu-devel] [PATCH] nvme: support NVME_VOLATILE_WRITE_CACHE feature

2015-04-30 Thread Christoph Hellwig

The SCSI emulation in the Linux NVMe driver really wants to know
if a device has a volatile write cache.  Given that qemu has moved
away from a model where we report the backing store WCE bit to
one where the WCE bit is supposed to be part of the migratable
guest-visible state, we always return 1 here.

Signed-off-by: Christoph Hellwig 

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 1e07166..50d76f1 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -479,6 +479,9 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
 req->cqe.result =
 cpu_to_le32((n->num_queues - 1) | ((n->num_queues - 1) << 16));
 break;
+case NVME_VOLATILE_WRITE_CACHE:
+req->cqe.result = cpu_to_le32(1);
+break;
 default:
 return NVME_INVALID_FIELD | NVME_DNR;
 }

Re: [Qemu-devel] Question about block driver

2015-04-30 Thread Kevin Wolf

Am 30.04.2015 um 11:43 hat Wen Congyang geschrieben:
> On 04/30/2015 05:33 PM, Kevin Wolf wrote:
> > [Cc: qemu-block]
> > 
> > Am 30.04.2015 um 11:11 hat Wen Congyang geschrieben:
> >> Some drivers use bdrv_open, while the other drivers use bdrv_file_open().
> >> What is the difference between bdrv_open() and bdrv_file_open()?
> > 
> > bdrv_file_open() is used by protocol drivers that don't need any other
> > driver to work (e.g. raw-posix), bdrv_open() is used by format drivers
> > that need a bs->file (e.g. raw or qcow2).
> > 
> > They used to have different parameters originally, but nowadays that's
> > the only difference.
> 
> So, if the driver wants to open bs->file itself, it should use bdrv_file_open(),
> and if the driver expects bs->file to have been opened already before its own
> open, it should use bdrv_open(). Is that right?

Yes, that's how it works, even though I don't think a driver exists that
opens bs->file by itself.

Kevin

[Qemu-devel] [PULL 21/42] docs/atomics.txt: fix two typos

2015-04-30 Thread Michael Tokarev

From: Laszlo Ersek 

Cc: Paolo Bonzini 
Signed-off-by: Laszlo Ersek 
Signed-off-by: Michael Tokarev 
---
 docs/atomics.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/atomics.txt b/docs/atomics.txt
index 6f2997b..ef285e3 100644
--- a/docs/atomics.txt
+++ b/docs/atomics.txt
@@ -281,7 +281,7 @@ note that the other barrier may actually be in a driver that runs in
 the guest!
 
 For the purposes of pairing, smp_read_barrier_depends() and smp_rmb()
-both count as read barriers.  A read barriers shall pair with a write
+both count as read barriers.  A read barrier shall pair with a write
 barrier or a full barrier; a write barrier shall pair with a read
 barrier or a full barrier.  A full barrier can pair with anything.
 For example:
@@ -294,7 +294,7 @@ For example:
  smp_rmb();
  y = a;
 
-Note that the "writing" thread are accessing the variables in the
+Note that the "writing" thread is accessing the variables in the
 opposite order as the "reading" thread.  This is expected: stores
 before the write barrier will normally match the loads after the
 read barrier, and vice versa.  The same is true for more than 2
-- 
2.1.4

Re: [Qemu-devel] [PULL 00/10] Ide patches

2015-04-30 Thread Peter Maydell

On 29 April 2015 at 00:25, John Snow  wrote:
> The following changes since commit a9392bc93c8615ad1983047e9f91ee3fa8aae75f:
>
>   Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging 
> (2015-04-28 16:55:03 +0100)
>
> are available in the git repository at:
>
>   https://github.com/jnsnow/qemu.git tags/ide-pull-request
>
> for you to fetch changes up to c8368674980b299604e3cfe9215c4105acefa516:
>
>   qtest: Add assertion that required environment variable is set (2015-04-28 
> 15:27:51 -0400)
>
> 
>
> 

Applied, thanks.

-- PMM

[Qemu-devel] [PATCH 1/6] qcow2: use one single memory block for the L2/refcount cache tables

2015-04-30 Thread Alberto Garcia

The qcow2 L2/refcount cache contains one separate table for each cache
entry. Doing one allocation per table adds unnecessary overhead and it
also requires us to store the address of each table separately.

Since the size of the cache is constant during its lifetime, it's
better to have an array that contains all the tables using one single
allocation.

In my tests measuring freshly created caches with sizes 128MB (L2) and
32MB (refcount) this uses around 10MB of RAM less.

Signed-off-by: Alberto Garcia 
---
 block/qcow2-cache.c | 48 +---
 1 file changed, 21 insertions(+), 27 deletions(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index b115549..586880b 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -28,7 +28,6 @@
 #include "trace.h"
 
 typedef struct Qcow2CachedTable {
-void*   table;
 int64_t offset;
 bool dirty;
 int cache_hits;
@@ -40,39 +39,34 @@ struct Qcow2Cache {
 struct Qcow2Cache*  depends;
 int size;
 bool depends_on_flush;
+void   *table_array;
+int table_size;
 };
 
+static inline void *table_addr(Qcow2Cache *c, int table)
+{
+return c->table_array + table * c->table_size;
+}
+
 Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables)
 {
 BDRVQcowState *s = bs->opaque;
 Qcow2Cache *c;
-int i;
 
 c = g_new0(Qcow2Cache, 1);
 c->size = num_tables;
+c->table_size = s->cluster_size;
 c->entries = g_try_new0(Qcow2CachedTable, num_tables);
-if (!c->entries) {
-goto fail;
-}
+c->table_array = qemu_try_blockalign(bs->file, num_tables * c->table_size);
 
-for (i = 0; i < c->size; i++) {
-c->entries[i].table = qemu_try_blockalign(bs->file, s->cluster_size);
-if (c->entries[i].table == NULL) {
-goto fail;
-}
+if (!c->entries || !c->table_array) {
+qemu_vfree(c->table_array);
+g_free(c->entries);
+g_free(c);
+c = NULL;
 }
 
 return c;
-
-fail:
-if (c->entries) {
-for (i = 0; i < c->size; i++) {
-qemu_vfree(c->entries[i].table);
-}
-}
-g_free(c->entries);
-g_free(c);
-return NULL;
 }
 
 int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c)
@@ -81,9 +75,9 @@ int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c)
 
 for (i = 0; i < c->size; i++) {
 assert(c->entries[i].ref == 0);
-qemu_vfree(c->entries[i].table);
 }
 
+qemu_vfree(c->table_array);
 g_free(c->entries);
 g_free(c);
 
@@ -151,8 +145,8 @@ static int qcow2_cache_entry_flush(BlockDriverState *bs, Qcow2Cache *c, int i)
 BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE);
 }
 
-ret = bdrv_pwrite(bs->file, c->entries[i].offset, c->entries[i].table,
-s->cluster_size);
+ret = bdrv_pwrite(bs->file, c->entries[i].offset, table_addr(c, i),
+  s->cluster_size);
 if (ret < 0) {
 return ret;
 }
@@ -304,7 +298,7 @@ static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
 BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD);
 }
 
-ret = bdrv_pread(bs->file, offset, c->entries[i].table, s->cluster_size);
+ret = bdrv_pread(bs->file, offset, table_addr(c, i), s->cluster_size);
 if (ret < 0) {
 return ret;
 }
@@ -319,7 +313,7 @@ static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
 found:
 c->entries[i].cache_hits++;
 c->entries[i].ref++;
-*table = c->entries[i].table;
+*table = table_addr(c, i);
 
 trace_qcow2_cache_get_done(qemu_coroutine_self(),
c == s->l2_table_cache, i);
@@ -344,7 +338,7 @@ int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table)
 int i;
 
 for (i = 0; i < c->size; i++) {
-if (c->entries[i].table == *table) {
+if (table_addr(c, i) == *table) {
 goto found;
 }
 }
@@ -363,7 +357,7 @@ void qcow2_cache_entry_mark_dirty(Qcow2Cache *c, void *table)
 int i;
 
 for (i = 0; i < c->size; i++) {
-if (c->entries[i].table == table) {
+if (table_addr(c, i) == table) {
 goto found;
 }
 }
-- 
2.1.4
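
The idea of this patch, one allocation for all tables plus an address helper, can be shown in isolation. A minimal sketch, assuming plain calloc() instead of an aligned block allocator; ToyCache and the toy_* functions are hypothetical names. The sketch uses an unsigned char pointer for the arithmetic because arithmetic on void pointers is a compiler extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct ToyCache {
    int size;                     /* number of tables */
    int table_size;               /* bytes per table */
    unsigned char *table_array;   /* one allocation holding every table */
} ToyCache;

/* Address of table number `table` inside the shared block. */
static void *toy_table_addr(const ToyCache *c, int table)
{
    assert(table >= 0 && table < c->size);
    return c->table_array + (size_t)table * c->table_size;
}

static ToyCache *toy_cache_create(int num_tables, int table_size)
{
    ToyCache *c = calloc(1, sizeof(*c));
    if (!c) {
        return NULL;
    }
    c->size = num_tables;
    c->table_size = table_size;
    c->table_array = calloc((size_t)num_tables, (size_t)table_size);
    if (!c->table_array) {
        free(c);
        return NULL;
    }
    return c;
}

static void toy_cache_destroy(ToyCache *c)
{
    free(c->table_array);   /* a single free replaces one free per table */
    free(c);
}
```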

[Qemu-devel] [PATCH 6/6] qcow2: style fixes in qcow2-cache.c

2015-04-30 Thread Alberto Garcia

Fix pointer declaration to make it consistent with the rest of the
code.

Signed-off-by: Alberto Garcia 
---
 block/qcow2-cache.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index c0e0278..dd591ef 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -35,8 +35,8 @@ typedef struct Qcow2CachedTable {
 } Qcow2CachedTable;
 
 struct Qcow2Cache {
-Qcow2CachedTable*   entries;
-struct Qcow2Cache*  depends;
+Qcow2CachedTable   *entries;
+struct Qcow2Cache  *depends;
 int size;
 bool depends_on_flush;
 void   *table_array;
@@ -70,7 +70,7 @@ Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables)
 return c;
 }
 
-int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c)
+int qcow2_cache_destroy(BlockDriverState *bs, Qcow2Cache *c)
 {
 int i;
 
-- 
2.1.4

[Qemu-devel] [PATCH 2/6] qcow2: simplify qcow2_cache_put() and qcow2_cache_entry_mark_dirty()

2015-04-30 Thread Alberto Garcia

Since all tables are now stored together, it is possible to obtain
the position of a particular table directly from its address, so the
operation becomes O(1).

Signed-off-by: Alberto Garcia 
---
 block/qcow2-cache.c | 22 +-
 1 file changed, 5 insertions(+), 17 deletions(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index 586880b..d3274f4 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -335,16 +335,12 @@ int qcow2_cache_get_empty(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,
 
 int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table)
 {
-int i;
+int i = (*table - c->table_array) / c->table_size;
 
-for (i = 0; i < c->size; i++) {
-if (table_addr(c, i) == *table) {
-goto found;
-}
+if (c->entries[i].offset == 0) {
+return -ENOENT;
 }
-return -ENOENT;
 
-found:
 c->entries[i].ref--;
 *table = NULL;
 
@@ -354,15 +350,7 @@ found:
 
 void qcow2_cache_entry_mark_dirty(Qcow2Cache *c, void *table)
 {
-int i;
-
-for (i = 0; i < c->size; i++) {
-if (table_addr(c, i) == table) {
-goto found;
-}
-}
-abort();
-
-found:
+int i = (table - c->table_array) / c->table_size;
+assert(c->entries[i].offset != 0);
 c->entries[i].dirty = true;
 }
-- 
2.1.4
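
The O(1) lookup described in this patch is plain pointer arithmetic. A minimal sketch with hypothetical names (ToyTables, toy_table_index); unlike the patch, it also asserts that the pointer lies inside the block and on a table boundary, which is an addition made here for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ToyTables {
    unsigned char *table_array;   /* start of the shared block */
    int table_size;               /* bytes per table */
    int size;                     /* number of tables */
} ToyTables;

/* Recover the table index from a pointer into the shared block. */
static int toy_table_index(const ToyTables *c, const void *table)
{
    ptrdiff_t off = (const unsigned char *)table - c->table_array;
    int idx = (int)(off / c->table_size);

    assert(off >= 0 && idx < c->size);      /* pointer is inside the block */
    assert(off % c->table_size == 0);       /* and starts a table */
    return idx;
}
```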

[Qemu-devel] [PATCH 5/6] qcow2: use a hash to look for entries in the L2 cache

2015-04-30 Thread Alberto Garcia

The current cache algorithm always traverses the array starting from
the beginning, so the average number of comparisons needed to perform
a lookup is proportional to the size of the array.

By using a hash of the offset as the starting point, lookups are
faster and independent from the array size.

The hash is computed using the cluster number of the table, multiplied
by 4 to make it perform better when there are collisions.

In my tests, using a cache with 2048 entries, this reduces the average
number of comparisons per lookup from 430 to 2.5.

Signed-off-by: Alberto Garcia 
---
 block/qcow2-cache.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index e1bba20..c0e0278 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -237,6 +237,7 @@ static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
 BDRVQcowState *s = bs->opaque;
 int i;
 int ret;
+int lookup_index;
 uint64_t min_lru_counter = UINT64_MAX;
 int min_lru_index = -1;
 
@@ -244,7 +245,8 @@ static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
   offset, read_from_disk);
 
 /* Check if the table is already cached */
-for (i = 0; i < c->size; i++) {
+i = lookup_index = (offset / c->table_size * 4) % c->size;
+do {
 const Qcow2CachedTable *t = &c->entries[i];
 if (t->offset == offset) {
 goto found;
@@ -253,7 +255,10 @@ static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
 min_lru_counter = t->lru_counter;
 min_lru_index = i;
 }
-}
+if (++i == c->size) {
+i = 0;
+}
+} while (i != lookup_index);
 
 if (min_lru_index == -1) {
 /* This can't happen in current synchronous code, but leave the check
-- 
2.1.4

[Qemu-devel] [PATCH 3/6] qcow2: use an LRU algorithm to replace entries from the L2 cache

2015-04-30 Thread Alberto Garcia

The current algorithm to evict entries from the cache always gives
preference to those in the lowest positions. As the size of the cache
increases, the chances of the later elements being removed decrease
exponentially.

In a scenario with random I/O and lots of cache misses, entries in
positions 8 and higher are rarely (if ever) evicted. This can be seen
even with the default cache size, but with larger caches the problem
becomes more obvious.

Using an LRU algorithm makes the chances of an entry being removed
from the cache independent of its position.
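
The counter-based LRU scheme can be sketched roughly as follows. This is a
simplified illustration, not the patch itself: the Demo* types, the fixed
four-entry array and the demo_* function names are assumptions made for the
example. An entry is stamped with a global, monotonically increasing counter
when its last reference is dropped, and the victim is the unreferenced entry
with the smallest stamp.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int      ref;          /* users currently holding the table */
    uint64_t lru_counter;  /* global counter value at last release */
} DemoLruEntry;

typedef struct {
    DemoLruEntry entries[4];
    int          size;
    uint64_t     lru_counter;  /* bumped on every final release */
} DemoLruCache;

/* Take a reference on entry i. */
static void demo_get(DemoLruCache *c, int i)
{
    c->entries[i].ref++;
}

/* Drop a reference; stamp the entry once nobody uses it any more. */
static void demo_put(DemoLruCache *c, int i)
{
    assert(c->entries[i].ref > 0);
    if (--c->entries[i].ref == 0) {
        c->entries[i].lru_counter = ++c->lru_counter;
    }
}

/* Pick the least recently released entry that is not in use,
 * or -1 if every entry is still referenced. */
static int demo_find_victim(const DemoLruCache *c)
{
    uint64_t min_lru_counter = UINT64_MAX;
    int min_index = -1;

    for (int i = 0; i < c->size; i++) {
        if (c->entries[i].ref) {
            continue;
        }
        if (c->entries[i].lru_counter < min_lru_counter) {
            min_lru_counter = c->entries[i].lru_counter;
            min_index = i;
        }
    }
    return min_index;
}
```

Because the stamp depends only on release order, an entry's position in the
array no longer influences how likely it is to be evicted.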

Signed-off-by: Alberto Garcia 
---
 block/qcow2-cache.c | 31 +++
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index d3274f4..477a209 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -28,10 +28,10 @@
 #include "trace.h"
 
 typedef struct Qcow2CachedTable {
-int64_t offset;
-bool dirty;
-int cache_hits;
-int ref;
+int64_t  offset;
+bool dirty;
+uint64_t lru_counter;
+int  ref;
 } Qcow2CachedTable;
 
 struct Qcow2Cache {
@@ -41,6 +41,7 @@ struct Qcow2Cache {
 bool depends_on_flush;
 void   *table_array;
 int table_size;
+uint64_t lru_counter;
 };
 
 static inline void *table_addr(Qcow2Cache *c, int table)
@@ -222,16 +223,18 @@ int qcow2_cache_empty(BlockDriverState *bs, Qcow2Cache *c)
 for (i = 0; i < c->size; i++) {
 assert(c->entries[i].ref == 0);
 c->entries[i].offset = 0;
-c->entries[i].cache_hits = 0;
+c->entries[i].lru_counter = 0;
 }
 
+c->lru_counter = 0;
+
 return 0;
 }
 
 static int qcow2_cache_find_entry_to_replace(Qcow2Cache *c)
 {
 int i;
-int min_count = INT_MAX;
+uint64_t min_lru_counter = UINT64_MAX;
 int min_index = -1;
 
 
@@ -240,15 +243,9 @@ static int qcow2_cache_find_entry_to_replace(Qcow2Cache *c)
 continue;
 }
 
-if (c->entries[i].cache_hits < min_count) {
+if (c->entries[i].lru_counter < min_lru_counter) {
 min_index = i;
-min_count = c->entries[i].cache_hits;
-}
-
-/* Give newer hits priority */
-/* TODO Check how to optimize the replacement strategy */
-if (c->entries[i].cache_hits > 1) {
-c->entries[i].cache_hits /= 2;
+min_lru_counter = c->entries[i].lru_counter;
 }
 }
 
@@ -306,12 +303,10 @@ static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
 
 /* Give the table some hits for the start so that it won't be replaced
  * immediately. The number 32 is completely arbitrary. */
-c->entries[i].cache_hits = 32;
 c->entries[i].offset = offset;
 
 /* And return the right table */
 found:
-c->entries[i].cache_hits++;
 c->entries[i].ref++;
 *table = table_addr(c, i);
 
@@ -344,6 +339,10 @@ int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table)
 c->entries[i].ref--;
 *table = NULL;
 
+if (c->entries[i].ref == 0) {
+c->entries[i].lru_counter = ++c->lru_counter;
+}
+
 assert(c->entries[i].ref >= 0);
 return 0;
 }
-- 
2.1.4

[Qemu-devel] [PATCH 0/6] qcow2 L2/refcount cache improvements

2015-04-30 Thread Alberto Garcia

Here are some improvements to the qcow2 L2/refcount cache code.

The first one is that all cache tables are now allocated using a
single memory block, as we discussed last week.

Apart from a more efficient use of memory, this allows some additional
optimizations so I took the chance to make other changes.

- qcow2_cache_put() and qcow2_cache_entry_mark_dirty() are now O(1)
- The eviction algorithm is now LRU. The previous one only works well
  with very small cache sizes.
- qcow2_cache_find_entry_to_replace() is no longer necessary.
- Lookups are faster now.

In my tests with a preallocated 128MB L2 cache in an empty drive the
new code is ~13% faster than the previous one (~43% if compiled
without optimizations). This is a best-case scenario: if the cache is
smaller or the drive is full of data, the improvements are not so
visible, but I believe the code is simpler now so I hope you find the
changes worthwhile.

Regards,

Berto

Alberto Garcia (6):
  qcow2: use one single memory block for the L2/refcount cache tables
  qcow2: simplify qcow2_cache_put() and qcow2_cache_entry_mark_dirty()
  qcow2: use an LRU algorithm to replace entries from the L2 cache
  qcow2: remove qcow2_cache_find_entry_to_replace()
  qcow2: use a hash to look for entries in the L2 cache
  qcow2: style fixes in qcow2-cache.c

 block/qcow2-cache.c | 149 +---
 1 file changed, 61 insertions(+), 88 deletions(-)

-- 
2.1.4

[Qemu-devel] [PATCH 4/6] qcow2: remove qcow2_cache_find_entry_to_replace()

2015-04-30 Thread Alberto Garcia

A cache miss means that the whole array was traversed and the entry
we were looking for was not found, so there's no need to traverse it
again in order to select an entry to replace.
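
Roughly, the merged traversal looks like the sketch below. DemoSlot and
demo_find_or_pick() are hypothetical names used only for illustration; the
real code lives inline in qcow2_cache_do_get(). A single pass either finds
the requested offset or, having visited every slot, already knows which
unreferenced slot is the least recently used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t offset;       /* table offset cached in this slot */
    int      ref;          /* non-zero while the table is in use */
    uint64_t lru_counter;  /* smaller means released longer ago */
} DemoSlot;

/* One traversal for both jobs.
 * Hit:  returns true,  *index = slot holding 'offset'.
 * Miss: returns false, *index = LRU unreferenced slot, or -1 if
 *       every slot is currently referenced. */
static bool demo_find_or_pick(const DemoSlot *slots, int n,
                              uint64_t offset, int *index)
{
    uint64_t min_lru_counter = UINT64_MAX;
    int min_lru_index = -1;

    assert(n > 0);
    for (int i = 0; i < n; i++) {
        const DemoSlot *t = &slots[i];
        if (t->offset == offset) {
            *index = i;
            return true;
        }
        if (t->ref == 0 && t->lru_counter < min_lru_counter) {
            min_lru_counter = t->lru_counter;
            min_lru_index = i;
        }
    }

    *index = min_lru_index;
    return false;
}
```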

Signed-off-by: Alberto Garcia 
---
 block/qcow2-cache.c | 45 -
 1 file changed, 16 insertions(+), 29 deletions(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index 477a209..e1bba20 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -231,51 +231,38 @@ int qcow2_cache_empty(BlockDriverState *bs, Qcow2Cache *c)
 return 0;
 }
 
-static int qcow2_cache_find_entry_to_replace(Qcow2Cache *c)
-{
-int i;
-uint64_t min_lru_counter = UINT64_MAX;
-int min_index = -1;
-
-
-for (i = 0; i < c->size; i++) {
-if (c->entries[i].ref) {
-continue;
-}
-
-if (c->entries[i].lru_counter < min_lru_counter) {
-min_index = i;
-min_lru_counter = c->entries[i].lru_counter;
-}
-}
-
-if (min_index == -1) {
-/* This can't happen in current synchronous code, but leave the check
- * here as a reminder for whoever starts using AIO with the cache */
-abort();
-}
-return min_index;
-}
-
 static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c,
 uint64_t offset, void **table, bool read_from_disk)
 {
 BDRVQcowState *s = bs->opaque;
 int i;
 int ret;
+uint64_t min_lru_counter = UINT64_MAX;
+int min_lru_index = -1;
 
 trace_qcow2_cache_get(qemu_coroutine_self(), c == s->l2_table_cache,
   offset, read_from_disk);
 
 /* Check if the table is already cached */
 for (i = 0; i < c->size; i++) {
-if (c->entries[i].offset == offset) {
+const Qcow2CachedTable *t = &c->entries[i];
+if (t->offset == offset) {
 goto found;
 }
+if (t->ref == 0 && t->lru_counter < min_lru_counter) {
+min_lru_counter = t->lru_counter;
+min_lru_index = i;
+}
+}
+
+if (min_lru_index == -1) {
+/* This can't happen in current synchronous code, but leave the check
+ * here as a reminder for whoever starts using AIO with the cache */
+abort();
 }
 
-/* If not, write a table back and replace it */
-i = qcow2_cache_find_entry_to_replace(c);
+/* Cache miss: write a table back and replace it */
+i = min_lru_index;
 trace_qcow2_cache_get_replace_entry(qemu_coroutine_self(),
 c == s->l2_table_cache, i);
 if (i < 0) {
-- 
2.1.4

Re: [Qemu-devel] [RFC PATCH 6/8] tap: Drop tap_can_send

2015-04-30 Thread Paolo Bonzini



On 30/04/2015 10:59, Jason Wang wrote:
>>
>>  
>> +/* If !can_send, we will want to disable the read poll, but
>> we still
>> + * need the send completion callback to enable it again,
>> which is a
>> + * sign of peer becoming ready.  So call the send function
>> + * regardlessly of can_send.
>> + */
> 
> It was probably not safe to depend on sent_cb to re-enable the polling.
> Since the packet could be purged in some conditions (e.g
> net_vm_change_state_handler()). So tap_send_completed won't be called in
> this case.

Doesn't qemu_net_queue_purge also call the sent_cb?

Paolo

[Qemu-devel] [v2 0/2] Generic PCIe host bridge INTx determination for INTx routing

2015-04-30 Thread Pranavkumar Sawargaonkar

This patch adds a routine for GPEX to implement the PCI bus specific
function pointer "route_intx_to_irq", which is used during INTx routing.

ChangeLog:

V2:
- Drop a patch about adding an API to get irq number from qemu_irq
- Store a GPEX INTx information from board specific code (virt.c)
V1:
- Initial patchset
- https://lists.gnu.org/archive/html/qemu-devel/2015-04/msg01986.html

Pranavkumar Sawargaonkar (2):
  pci: GPEX: Add a function to determine interrupt number for INTx
routing
  arm: hw: virt: Store information about GPEX legacy interrupt numbers

 hw/arm/virt.c  |  4 
 hw/pci-host/gpex.c | 12 
 include/hw/pci-host/gpex.h |  1 +
 3 files changed, 17 insertions(+)

-- 
1.9.1

[Qemu-devel] [v2 2/2] arm: hw: virt: Store information about GPEX legacy interrupt numbers

2015-04-30 Thread Pranavkumar Sawargaonkar

This patch stores information about the assigned legacy interrupt
numbers in the GPEX host structure.
This is used to determine the GPEX INTx number from a pin during
INTx routing.
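
To make the data flow concrete, here is a stand-alone sketch of the two
halves: the board code records which interrupt number backs each INTx pin,
and the routing callback later looks it up. All Demo* names are hypothetical
stand-ins (for GPEXHost, PCIINTxRoute and so on), four INTx pins are assumed,
and the range check on the pin is an addition for the sketch that the patches
do not contain.

```c
#include <assert.h>
#include <stdint.h>

enum { DEMO_NUM_IRQS = 4 };                 /* assumed: INTA..INTD */
enum { DEMO_INTX_ENABLED, DEMO_INTX_DISABLED };

typedef struct {
    int mode;   /* DEMO_INTX_ENABLED or DEMO_INTX_DISABLED */
    int irq;    /* interrupt number, -1 if not routable */
} DemoRoute;

typedef struct {
    uint32_t irq_num[DEMO_NUM_IRQS];   /* filled in by board code */
} DemoHost;

/* Board side: remember the interrupt number wired to each pin. */
static void demo_store_irqs(DemoHost *s, int first_irq)
{
    assert(first_irq >= 0);
    for (int i = 0; i < DEMO_NUM_IRQS; i++) {
        s->irq_num[i] = (uint32_t)(first_irq + i);
    }
}

/* Host bridge side: translate an INTx pin into a route. */
static DemoRoute demo_route_pin(const DemoHost *s, int pin)
{
    DemoRoute route = { DEMO_INTX_DISABLED, -1 };

    /* Bounds check added for the sketch only. */
    if (pin >= 0 && pin < DEMO_NUM_IRQS) {
        route.mode = DEMO_INTX_ENABLED;
        route.irq = (int)s->irq_num[pin];
    }
    return route;
}
```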

Signed-off-by: Pranavkumar Sawargaonkar 
Signed-off-by: Tushar Jagad 
---
 hw/arm/virt.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 565f573..fdafdcc 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -658,6 +658,7 @@ static void create_pcie(const VirtBoardInfo *vbi, qemu_irq *pic,
 MemoryRegion *ecam_alias;
 MemoryRegion *ecam_reg;
 DeviceState *dev;
+GPEXHost *s;
 char *nodename;
 int i;
 
@@ -689,8 +690,11 @@ static void create_pcie(const VirtBoardInfo *vbi, qemu_irq *pic,
 /* Map IO port space */
 sysbus_mmio_map(SYS_BUS_DEVICE(dev), 2, base_ioport);
 
+s = GPEX_HOST(dev);
+
 for (i = 0; i < GPEX_NUM_IRQS; i++) {
 sysbus_connect_irq(SYS_BUS_DEVICE(dev), i, pic[irq + i]);
+s->irq_num[i] = irq + i;
 }
 
 nodename = g_strdup_printf("/pcie@%" PRIx64, base);
-- 
1.9.1

[Qemu-devel] [v2 1/2] pci: GPEX: Add a function to determine interrupt number for INTx routing

2015-04-30 Thread Pranavkumar Sawargaonkar

This patch adds a PCI bus specific function pointer "route_intx_to_irq"
for GPEX.
This is used in determining the PCI INTx number from the pin.

Signed-off-by: Pranavkumar Sawargaonkar 
Signed-off-by: Tushar Jagad 
---
 hw/pci-host/gpex.c | 12 
 include/hw/pci-host/gpex.h |  1 +
 2 files changed, 13 insertions(+)

diff --git a/hw/pci-host/gpex.c b/hw/pci-host/gpex.c
index 9d8fb5a..ed96053 100644
--- a/hw/pci-host/gpex.c
+++ b/hw/pci-host/gpex.c
@@ -42,6 +42,17 @@ static void gpex_set_irq(void *opaque, int irq_num, int level)
 qemu_set_irq(s->irq[irq_num], level);
 }
 
+static PCIINTxRoute gpex_route_intx_pin_to_irq(void *opaque, int pin)
+{
+PCIINTxRoute route;
+GPEXHost *s = opaque;
+
+route.mode = PCI_INTX_ENABLED;
+route.irq = (int) s->irq_num[pin];
+
+return route;
+}
+
 static void gpex_host_realize(DeviceState *dev, Error **errp)
 {
 PCIHostState *pci = PCI_HOST_BRIDGE(dev);
@@ -66,6 +77,7 @@ static void gpex_host_realize(DeviceState *dev, Error **errp)
 &s->io_ioport, 0, 4, TYPE_PCIE_BUS);
 
 qdev_set_parent_bus(DEVICE(&s->gpex_root), BUS(pci->bus));
+pci_bus_set_route_irq_fn(pci->bus, gpex_route_intx_pin_to_irq);
 qdev_init_nofail(DEVICE(&s->gpex_root));
 }
 
diff --git a/include/hw/pci-host/gpex.h b/include/hw/pci-host/gpex.h
index 68c9348..7df1c16 100644
--- a/include/hw/pci-host/gpex.h
+++ b/include/hw/pci-host/gpex.h
@@ -51,6 +51,7 @@ typedef struct GPEXHost {
 MemoryRegion io_ioport;
 MemoryRegion io_mmio;
 qemu_irq irq[GPEX_NUM_IRQS];
+uint32_t irq_num[GPEX_NUM_IRQS];
 } GPEXHost;
 
 #endif /* HW_GPEX_H */
-- 
1.9.1

[Qemu-devel] [PULL 03/10] sclp: sort into categories

2015-04-30 Thread Cornelia Huck

Sort the sclp consoles into the input category, just as virtio-serial.
Various other sclp devices don't have an obvious category, sort them
into misc.

Reviewed-by: David Hildenbrand 
Acked-by: Christian Borntraeger 
Signed-off-by: Cornelia Huck 
---
 hw/char/sclpconsole-lm.c  | 1 +
 hw/char/sclpconsole.c | 1 +
 hw/s390x/event-facility.c | 1 +
 hw/s390x/sclp.c   | 9 +
 hw/s390x/sclpcpu.c| 2 ++
 hw/s390x/sclpquiesce.c| 1 +
 6 files changed, 15 insertions(+)

diff --git a/hw/char/sclpconsole-lm.c b/hw/char/sclpconsole-lm.c
index a9f5e62..02ac80b 100644
--- a/hw/char/sclpconsole-lm.c
+++ b/hw/char/sclpconsole-lm.c
@@ -364,6 +364,7 @@ static void console_class_init(ObjectClass *klass, void *data)
 ec->can_handle_event = can_handle_event;
 ec->read_event_data = read_event_data;
 ec->write_event_data = write_event_data;
+set_bit(DEVICE_CATEGORY_INPUT, dc->categories);
 }
 
 static const TypeInfo sclp_console_info = {
diff --git a/hw/char/sclpconsole.c b/hw/char/sclpconsole.c
index 79891df..b014c7f 100644
--- a/hw/char/sclpconsole.c
+++ b/hw/char/sclpconsole.c
@@ -266,6 +266,7 @@ static void console_class_init(ObjectClass *klass, void *data)
 ec->can_handle_event = can_handle_event;
 ec->read_event_data = read_event_data;
 ec->write_event_data = write_event_data;
+set_bit(DEVICE_CATEGORY_INPUT, dc->categories);
 }
 
 static const TypeInfo sclp_console_info = {
diff --git a/hw/s390x/event-facility.c b/hw/s390x/event-facility.c
index 78da718..1cb116a 100644
--- a/hw/s390x/event-facility.c
+++ b/hw/s390x/event-facility.c
@@ -362,6 +362,7 @@ static void init_event_facility_class(ObjectClass *klass, void *data)
 
 dc->reset = reset_event_facility;
 dc->vmsd = &vmstate_event_facility;
+set_bit(DEVICE_CATEGORY_MISC, dc->categories);
 k->init = init_event_facility;
 k->command_handler = command_handler;
 k->event_pending = event_pending;
diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
index a969975..b3a6c5e 100644
--- a/hw/s390x/sclp.c
+++ b/hw/s390x/sclp.c
@@ -457,10 +457,19 @@ sclpMemoryHotplugDev *get_sclp_memory_hotplug_dev(void)
TYPE_SCLP_MEMORY_HOTPLUG_DEV, NULL));
 }
 
+static void sclp_memory_hotplug_dev_class_init(ObjectClass *klass,
+   void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+
+set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+}
+
 static TypeInfo sclp_memory_hotplug_dev_info = {
 .name = TYPE_SCLP_MEMORY_HOTPLUG_DEV,
 .parent = TYPE_SYS_BUS_DEVICE,
 .instance_size = sizeof(sclpMemoryHotplugDev),
+.class_init = sclp_memory_hotplug_dev_class_init,
 };
 
 static void register_types(void)
diff --git a/hw/s390x/sclpcpu.c b/hw/s390x/sclpcpu.c
index 3600fe2..2fe8b5a 100644
--- a/hw/s390x/sclpcpu.c
+++ b/hw/s390x/sclpcpu.c
@@ -88,12 +88,14 @@ static int irq_cpu_hotplug_init(SCLPEvent *event)
 static void cpu_class_init(ObjectClass *oc, void *data)
 {
 SCLPEventClass *k = SCLP_EVENT_CLASS(oc);
+DeviceClass *dc = DEVICE_CLASS(oc);
 
 k->init = irq_cpu_hotplug_init;
 k->get_send_mask = send_mask;
 k->get_receive_mask = receive_mask;
 k->read_event_data = read_event_data;
 k->write_event_data = NULL;
+set_bit(DEVICE_CATEGORY_MISC, dc->categories);
 }
 
 static const TypeInfo sclp_cpu_info = {
diff --git a/hw/s390x/sclpquiesce.c b/hw/s390x/sclpquiesce.c
index 1a399bd..ffa5553 100644
--- a/hw/s390x/sclpquiesce.c
+++ b/hw/s390x/sclpquiesce.c
@@ -116,6 +116,7 @@ static void quiesce_class_init(ObjectClass *klass, void *data)
 
 dc->reset = quiesce_reset;
 dc->vmsd = &vmstate_sclpquiesce;
+set_bit(DEVICE_CATEGORY_MISC, dc->categories);
 k->init = quiesce_init;
 
 k->get_send_mask = send_mask;
-- 
2.3.7

[Qemu-devel] [PULL 07/10] s390x/kvm: Put vm name, extended name and UUID into STSI322 SYSIB

2015-04-30 Thread Cornelia Huck

From: Ekaterina Tumanova 

KVM prefills the SYSIB returned by STSI 3.2.2. This patch allows
userspace to intercept execution and fill in the values that are
known to qemu: machine name (8 chars), extended machine name (256
chars), extended machine name encoding (equals 2 for UTF-8) and UUID.

The STSI322 qemu handler also finds the highest virtualization level in
the level-3 virtualization stack that doesn't support Extended Names
(Ext Name delimiter) and propagates a zero Ext Name to all levels below,
because this level is not capable of managing Extended Names of lower
levels.

Signed-off-by: Ekaterina Tumanova 
Reviewed-by: Christian Borntraeger 
Reviewed-by: Thomas Huth 
Signed-off-by: Cornelia Huck 
---
 target-s390x/cpu.h |  8 --
 target-s390x/kvm.c | 71 ++
 2 files changed, 77 insertions(+), 2 deletions(-)

diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
index 8135dda..79bc80b 100644
--- a/target-s390x/cpu.h
+++ b/target-s390x/cpu.h
@@ -865,9 +865,13 @@ struct sysib_322 {
 uint8_t  name[8];
 uint32_t caf;
 uint8_t  cpi[16];
-uint8_t  res3[24];
+uint8_t res5[3];
+uint8_t ext_name_encoding;
+uint32_t res3;
+uint8_t uuid[16];
 } vm[8];
-uint8_t res4[3552];
+uint8_t res4[1504];
+uint8_t ext_names[8][256];
 };
 
 /* MMU defines */
diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
index b48c643..619684b 100644
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -44,6 +44,7 @@
 #include "hw/s390x/s390-pci-inst.h"
 #include "hw/s390x/s390-pci-bus.h"
 #include "hw/s390x/ipl.h"
+#include "hw/s390x/ebcdic.h"
 
 /* #define DEBUG_KVM */
 
@@ -255,6 +256,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
 }
 
 kvm_vm_enable_cap(s, KVM_CAP_S390_USER_SIGP, 0);
+kvm_vm_enable_cap(s, KVM_CAP_S390_USER_STSI, 0);
 
 return 0;
 }
@@ -1723,6 +1725,72 @@ static int handle_tsch(S390CPU *cpu)
 return ret;
 }
 
+static void insert_stsi_3_2_2(S390CPU *cpu, __u64 addr)
+{
+struct sysib_322 sysib;
+int del;
+
+if (s390_cpu_virt_mem_read(cpu, addr, &sysib, sizeof(sysib))) {
+return;
+}
+/* Shift the stack of Extended Names to prepare for our own data */
+memmove(&sysib.ext_names[1], &sysib.ext_names[0],
+sizeof(sysib.ext_names[0]) * (sysib.count - 1));
+/* First virt level, that doesn't provide Ext Names delimits stack. It is
+ * assumed it's not capable of managing Extended Names for lower levels.
+ */
+for (del = 1; del < sysib.count; del++) {
+if (!sysib.vm[del].ext_name_encoding || !sysib.ext_names[del][0]) {
+break;
+}
+}
+if (del < sysib.count) {
+memset(sysib.ext_names[del], 0,
+   sizeof(sysib.ext_names[0]) * (sysib.count - del));
+}
+/* Insert short machine name in EBCDIC, padded with blanks */
+if (qemu_name) {
+memset(sysib.vm[0].name, 0x40, sizeof(sysib.vm[0].name));
+ebcdic_put(sysib.vm[0].name, qemu_name, MIN(sizeof(sysib.vm[0].name),
+strlen(qemu_name)));
+}
+sysib.vm[0].ext_name_encoding = 2; /* 2 = UTF-8 */
+memset(sysib.ext_names[0], 0, sizeof(sysib.ext_names[0]));
+/* If hypervisor specifies zero Extended Name in STSI322 SYSIB, it's
+ * considered by s390 as not capable of providing any Extended Name.
+ * Therefore if no name was specified on qemu invocation, we go with the
+ * same "KVMguest" default, which KVM has filled into short name field.
+ */
+if (qemu_name) {
+strncpy((char *)sysib.ext_names[0], qemu_name,
+sizeof(sysib.ext_names[0]));
+} else {
+strcpy((char *)sysib.ext_names[0], "KVMguest");
+}
+/* Insert UUID */
+memcpy(sysib.vm[0].uuid, qemu_uuid, sizeof(sysib.vm[0].uuid));
+
+s390_cpu_virt_mem_write(cpu, addr, &sysib, sizeof(sysib));
+}
+
+static int handle_stsi(S390CPU *cpu)
+{
+CPUState *cs = CPU(cpu);
+struct kvm_run *run = cs->kvm_run;
+
+switch (run->s390_stsi.fc) {
+case 3:
+if (run->s390_stsi.sel1 != 2 || run->s390_stsi.sel2 != 2) {
+return 0;
+}
+/* Only sysib 3.2.2 needs post-handling for now. */
+insert_stsi_3_2_2(cpu, run->s390_stsi.addr);
+return 0;
+default:
+return 0;
+}
+}
+
 static int kvm_arch_handle_debug_exit(S390CPU *cpu)
 {
 CPUState *cs = CPU(cpu);
@@ -1772,6 +1840,9 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
 case KVM_EXIT_S390_TSCH:
 ret = handle_tsch(cpu);
 break;
+case KVM_EXIT_S390_STSI:
+ret = handle_stsi(cpu);
+break;
 case KVM_EXIT_DEBUG:
 ret = kvm_arch_handle_debug_exit(cpu);
 break;
-- 
2.3.7

[Qemu-devel] [PULL 04/10] s390x/ipl: sort into categories

2015-04-30 Thread Cornelia Huck

The s390 ipl device has no real home (it's not really a storage device),
so let's sort it into the misc category.

Reviewed-by: David Hildenbrand 
Acked-by: Christian Borntraeger 
Signed-off-by: Cornelia Huck 
---
 hw/s390x/ipl.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c
index 2e26d2a..132004a 100644
--- a/hw/s390x/ipl.c
+++ b/hw/s390x/ipl.c
@@ -315,6 +315,7 @@ static void s390_ipl_class_init(ObjectClass *klass, void *data)
 dc->props = s390_ipl_properties;
 dc->reset = s390_ipl_reset;
 dc->vmsd = &vmstate_ipl;
+set_bit(DEVICE_CATEGORY_MISC, dc->categories);
 }
 
 static const TypeInfo s390_ipl_info = {
-- 
2.3.7

[Qemu-devel] [PULL 01/10] virtio-ccw: sort into categories

2015-04-30 Thread Cornelia Huck

Sort the various virtio-ccw devices into the same categories as their
virtio-pci counterparts.

Reviewed-by: David Hildenbrand 
Acked-by: Christian Borntraeger 
Signed-off-by: Cornelia Huck 
---
 hw/s390x/virtio-ccw.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
index ed75c63..c1d8288 100644
--- a/hw/s390x/virtio-ccw.c
+++ b/hw/s390x/virtio-ccw.c
@@ -1453,6 +1453,7 @@ static void virtio_ccw_net_class_init(ObjectClass *klass, void *data)
 k->exit = virtio_ccw_exit;
 dc->reset = virtio_ccw_reset;
 dc->props = virtio_ccw_net_properties;
+set_bit(DEVICE_CATEGORY_NETWORK, dc->categories);
 }
 
 static const TypeInfo virtio_ccw_net = {
@@ -1479,6 +1480,7 @@ static void virtio_ccw_blk_class_init(ObjectClass *klass, void *data)
 k->exit = virtio_ccw_exit;
 dc->reset = virtio_ccw_reset;
 dc->props = virtio_ccw_blk_properties;
+set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
 }
 
 static const TypeInfo virtio_ccw_blk = {
@@ -1505,6 +1507,7 @@ static void virtio_ccw_serial_class_init(ObjectClass *klass, void *data)
 k->exit = virtio_ccw_exit;
 dc->reset = virtio_ccw_reset;
 dc->props = virtio_ccw_serial_properties;
+set_bit(DEVICE_CATEGORY_INPUT, dc->categories);
 }
 
 static const TypeInfo virtio_ccw_serial = {
@@ -1531,6 +1534,7 @@ static void virtio_ccw_balloon_class_init(ObjectClass *klass, void *data)
 k->exit = virtio_ccw_exit;
 dc->reset = virtio_ccw_reset;
 dc->props = virtio_ccw_balloon_properties;
+set_bit(DEVICE_CATEGORY_MISC, dc->categories);
 }
 
 static const TypeInfo virtio_ccw_balloon = {
@@ -1558,6 +1562,7 @@ static void virtio_ccw_scsi_class_init(ObjectClass *klass, void *data)
 k->exit = virtio_ccw_exit;
 dc->reset = virtio_ccw_reset;
 dc->props = virtio_ccw_scsi_properties;
+set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
 }
 
 static const TypeInfo virtio_ccw_scsi = {
@@ -1583,6 +1588,7 @@ static void vhost_ccw_scsi_class_init(ObjectClass *klass, void *data)
 k->exit = virtio_ccw_exit;
 dc->reset = virtio_ccw_reset;
 dc->props = vhost_ccw_scsi_properties;
+set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
 }
 
 static const TypeInfo vhost_ccw_scsi = {
@@ -1620,6 +1626,7 @@ static void virtio_ccw_rng_class_init(ObjectClass *klass, void *data)
 k->exit = virtio_ccw_exit;
 dc->reset = virtio_ccw_reset;
 dc->props = virtio_ccw_rng_properties;
+set_bit(DEVICE_CATEGORY_MISC, dc->categories);
 }
 
 static const TypeInfo virtio_ccw_rng = {
@@ -1706,9 +1713,11 @@ static void virtual_css_bridge_class_init(ObjectClass *klass, void *data)
 {
 SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass);
 HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(klass);
+DeviceClass *dc = DEVICE_CLASS(klass);
 
 k->init = virtual_css_bridge_init;
 hc->unplug = virtio_ccw_busdev_unplug;
+set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
 }
 
 static const TypeInfo virtual_css_bridge_info = {
-- 
2.3.7

[Qemu-devel] [PULL 02/10] s390-virtio: sort into categories

2015-04-30 Thread Cornelia Huck

Sort the various s390-virtio devices into the same categories as their
virtio-pci counterparts.

Reviewed-by: David Hildenbrand 
Acked-by: Christian Borntraeger 
Signed-off-by: Cornelia Huck 
---
 hw/s390x/s390-virtio-bus.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/hw/s390x/s390-virtio-bus.c b/hw/s390x/s390-virtio-bus.c
index 0f93a64..c27f8a5 100644
--- a/hw/s390x/s390-virtio-bus.c
+++ b/hw/s390x/s390-virtio-bus.c
@@ -542,6 +542,7 @@ static void s390_virtio_net_class_init(ObjectClass *klass, void *data)
 
 k->realize = s390_virtio_net_realize;
 dc->props = s390_virtio_net_properties;
+set_bit(DEVICE_CATEGORY_NETWORK, dc->categories);
 }
 
 static const TypeInfo s390_virtio_net = {
@@ -555,8 +556,10 @@ static const TypeInfo s390_virtio_net = {
 static void s390_virtio_blk_class_init(ObjectClass *klass, void *data)
 {
 VirtIOS390DeviceClass *k = VIRTIO_S390_DEVICE_CLASS(klass);
+DeviceClass *dc = DEVICE_CLASS(klass);
 
 k->realize = s390_virtio_blk_realize;
+set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
 }
 
 static const TypeInfo s390_virtio_blk = {
@@ -578,6 +581,7 @@ static void s390_virtio_serial_class_init(ObjectClass *klass, void *data)
 
 k->realize = s390_virtio_serial_realize;
 dc->props = s390_virtio_serial_properties;
+set_bit(DEVICE_CATEGORY_INPUT, dc->categories);
 }
 
 static const TypeInfo s390_virtio_serial = {
@@ -600,6 +604,7 @@ static void s390_virtio_rng_class_init(ObjectClass *klass, void *data)
 
 k->realize = s390_virtio_rng_realize;
 dc->props = s390_virtio_rng_properties;
+set_bit(DEVICE_CATEGORY_MISC, dc->categories);
 }
 
 static const TypeInfo s390_virtio_rng = {
@@ -658,6 +663,7 @@ static void s390_virtio_scsi_class_init(ObjectClass *klass, void *data)
 
 k->realize = s390_virtio_scsi_realize;
 dc->props = s390_virtio_scsi_properties;
+set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
 }
 
 static const TypeInfo s390_virtio_scsi = {
@@ -681,6 +687,7 @@ static void s390_vhost_scsi_class_init(ObjectClass *klass, void *data)
 
 k->realize = s390_vhost_scsi_realize;
 dc->props = s390_vhost_scsi_properties;
+set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
 }
 
 static const TypeInfo s390_vhost_scsi = {
@@ -704,8 +711,10 @@ static int s390_virtio_bridge_init(SysBusDevice *dev)
 static void s390_virtio_bridge_class_init(ObjectClass *klass, void *data)
 {
 SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass);
+DeviceClass *dc = DEVICE_CLASS(klass);
 
 k->init = s390_virtio_bridge_init;
+set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
 }
 
 static const TypeInfo s390_virtio_bridge_info = {
-- 
2.3.7

[Qemu-devel] [PULL 09/10] s390x/kvm: Support access register mode for KVM_S390_MEM_OP ioctl

2015-04-30 Thread Cornelia Huck

From: Alexander Yarygin 

Access register mode is one of the modes that control dynamic address
translation. In this mode the address space is specified by values of
the access registers. The effective address-space-control element is
obtained from the result of the access register translation. See
the "Access-Register Introduction" section of chapter 5, "Program
Execution", in the "Principles of Operation" for more details.

When the CPU is in AR mode, the s390_cpu_virt_mem_rw() function must
know which access register number to use for address translation.
This patch does several things:
- add new parameter 'uint8_t ar' to that function
- decode ar number from intercepted instructions
- pass the ar number to s390_cpu_virt_mem_rw(), which in turn passes it
to the KVM_S390_MEM_OP ioctl.

Signed-off-by: Alexander Yarygin 
Reviewed-by: Thomas Huth 
Reviewed-by: David Hildenbrand 
Signed-off-by: Cornelia Huck 
---
 hw/s390x/s390-pci-inst.c  | 21 +++--
 hw/s390x/s390-pci-inst.h  |  7 ---
 target-s390x/cpu.h| 30 +-
 target-s390x/ioinst.c | 42 +-
 target-s390x/kvm.c| 46 ++
 target-s390x/mmu_helper.c |  5 +++--
 6 files changed, 90 insertions(+), 61 deletions(-)

diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 8f7288f..f9151a9 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -155,7 +155,7 @@ int clp_service_call(S390CPU *cpu, uint8_t r2)
 return 0;
 }
 
-if (s390_cpu_virt_mem_read(cpu, env->regs[r2], buffer, sizeof(*reqh))) {
+if (s390_cpu_virt_mem_read(cpu, env->regs[r2], r2, buffer, sizeof(*reqh))) {
 return 0;
 }
 reqh = (ClpReqHdr *)buffer;
@@ -165,7 +165,7 @@ int clp_service_call(S390CPU *cpu, uint8_t r2)
 return 0;
 }
 
-if (s390_cpu_virt_mem_read(cpu, env->regs[r2], buffer,
+if (s390_cpu_virt_mem_read(cpu, env->regs[r2], r2, buffer,
req_len + sizeof(*resh))) {
 return 0;
 }
@@ -180,7 +180,7 @@ int clp_service_call(S390CPU *cpu, uint8_t r2)
 return 0;
 }
 
-if (s390_cpu_virt_mem_read(cpu, env->regs[r2], buffer,
+if (s390_cpu_virt_mem_read(cpu, env->regs[r2], r2, buffer,
req_len + res_len)) {
 return 0;
 }
@@ -277,7 +277,7 @@ int clp_service_call(S390CPU *cpu, uint8_t r2)
 }
 
 out:
-if (s390_cpu_virt_mem_write(cpu, env->regs[r2], buffer,
+if (s390_cpu_virt_mem_write(cpu, env->regs[r2], r2, buffer,
 req_len + res_len)) {
 return 0;
 }
@@ -546,7 +546,8 @@ out:
 return 0;
 }
 
-int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr)
+int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr,
+uint8_t ar)
 {
 CPUS390XState *env = &cpu->env;
 S390PCIBusDevice *pbdev;
@@ -603,7 +604,7 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr)
 return 0;
 }
 
-if (s390_cpu_virt_mem_read(cpu, gaddr, buffer, len)) {
+if (s390_cpu_virt_mem_read(cpu, gaddr, ar, buffer, len)) {
 return 0;
 }
 
@@ -698,7 +699,7 @@ static void dereg_ioat(S390PCIBusDevice *pbdev)
 pbdev->g_iota = 0;
 }
 
-int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba)
+int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar)
 {
 CPUS390XState *env = &cpu->env;
 uint8_t oc;
@@ -727,7 +728,7 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba)
 return 0;
 }
 
-if (s390_cpu_virt_mem_read(cpu, fiba, (uint8_t *)&fib, sizeof(fib))) {
+if (s390_cpu_virt_mem_read(cpu, fiba, ar, (uint8_t *)&fib, sizeof(fib))) {
 return 0;
 }
 
@@ -773,7 +774,7 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba)
 return 0;
 }
 
-int stpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba)
+int stpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar)
 {
 CPUS390XState *env = &cpu->env;
 uint32_t fh;
@@ -829,7 +830,7 @@ int stpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba)
 fib.fc |= 0x10;
 }
 
-if (s390_cpu_virt_mem_write(cpu, fiba, (uint8_t *)&fib, sizeof(fib))) {
+if (s390_cpu_virt_mem_write(cpu, fiba, ar, (uint8_t *)&fib, sizeof(fib))) {
 return 0;
 }
 
diff --git a/hw/s390x/s390-pci-inst.h b/hw/s390x/s390-pci-inst.h
index 7e6c804..70fa713 100644
--- a/hw/s390x/s390-pci-inst.h
+++ b/hw/s390x/s390-pci-inst.h
@@ -281,8 +281,9 @@ int clp_service_call(S390CPU *cpu, uint8_t r2);
 int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2);
 int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2);
 int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2);
-int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr);
-int mpcifc

[Qemu-devel] [PULL 05/10] s390x/mmu: Use access type definitions instead of magic values

2015-04-30 Thread Cornelia Huck

From: Thomas Huth 

Since there are now proper definitions for the MMU access type,
let's use them in the s390x MMU code, too, instead of the
hard-to-understand magic values.
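
For readers unfamiliar with the encoding, a small sketch of what the named
values buy. DemoAccessType and the helpers are hypothetical; the values 1
(store) and 2 (instruction fetch) are the ones this patch replaces, while 0
for a data load is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the MMU access type definitions. */
typedef enum {
    DEMO_MMU_DATA_LOAD  = 0,   /* assumed value */
    DEMO_MMU_DATA_STORE = 1,   /* was the magic value 1 */
    DEMO_MMU_INST_FETCH = 2    /* was the magic value 2 */
} DemoAccessType;

/* "rw == 1" becomes a self-describing comparison. */
static bool demo_is_store(DemoAccessType rw)
{
    return rw == DEMO_MMU_DATA_STORE;
}

/* Code accesses use a fixed instruction length of 2; for data
 * accesses the caller's placeholder value is kept. */
static int demo_fault_ilen(DemoAccessType rw, int ilen_later)
{
    assert(rw >= DEMO_MMU_DATA_LOAD && rw <= DEMO_MMU_INST_FETCH);
    return rw == DEMO_MMU_INST_FETCH ? 2 : ilen_later;
}
```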

Signed-off-by: Thomas Huth 
Reviewed-by: Jens Freimann 
Acked-by: Cornelia Huck 
Signed-off-by: Cornelia Huck 
---
 target-s390x/helper.c |  2 +-
 target-s390x/mmu_helper.c | 10 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/target-s390x/helper.c b/target-s390x/helper.c
index f1060c2..041c9c7 100644
--- a/target-s390x/helper.c
+++ b/target-s390x/helper.c
@@ -162,7 +162,7 @@ hwaddr s390_cpu_get_phys_page_debug(CPUState *cs, vaddr vaddr)
 vaddr &= 0x7fff;
 }
 
-mmu_translate(env, vaddr, 2, asc, &raddr, &prot, false);
+mmu_translate(env, vaddr, MMU_INST_FETCH, asc, &raddr, &prot, false);
 
 return raddr;
 }
diff --git a/target-s390x/mmu_helper.c b/target-s390x/mmu_helper.c
index b061c85..9b88498 100644
--- a/target-s390x/mmu_helper.c
+++ b/target-s390x/mmu_helper.c
@@ -68,7 +68,7 @@ static void trigger_prot_fault(CPUS390XState *env, target_ulong vaddr,
 {
 uint64_t tec;
 
-tec = vaddr | (rw == 1 ? FS_WRITE : FS_READ) | 4 | asc >> 46;
+tec = vaddr | (rw == MMU_DATA_STORE ? FS_WRITE : FS_READ) | 4 | asc >> 46;
 
 DPRINTF("%s: trans_exc_code=%016" PRIx64 "\n", __func__, tec);
 
@@ -85,7 +85,7 @@ static void trigger_page_fault(CPUS390XState *env, target_ulong vaddr,
 int ilen = ILEN_LATER;
 uint64_t tec;
 
-tec = vaddr | (rw == 1 ? FS_WRITE : FS_READ) | asc >> 46;
+tec = vaddr | (rw == MMU_DATA_STORE ? FS_WRITE : FS_READ) | asc >> 46;
 
 DPRINTF("%s: vaddr=%016" PRIx64 " bits=%d\n", __func__, vaddr, bits);
 
@@ -94,7 +94,7 @@ static void trigger_page_fault(CPUS390XState *env, target_ulong vaddr,
 }
 
 /* Code accesses have an undefined ilc.  */
-if (rw == 2) {
+if (rw == MMU_INST_FETCH) {
 ilen = 2;
 }
 
@@ -288,7 +288,7 @@ static int mmu_translate_asce(CPUS390XState *env, target_ulong vaddr,
 
 r = mmu_translate_region(env, vaddr, asc, asce, level, raddr, flags, rw,
  exc);
-if ((rw == 1) && !(*flags & PAGE_WRITE)) {
+if (rw == MMU_DATA_STORE && !(*flags & PAGE_WRITE)) {
 trigger_prot_fault(env, vaddr, asc, rw, exc);
 return -1;
 }
@@ -338,7 +338,7 @@ int mmu_translate(CPUS390XState *env, target_ulong vaddr, int rw, uint64_t asc,
  * Instruction: Primary
  * Data: Secondary
  */
-if (rw == 2) {
+if (rw == MMU_INST_FETCH) {
 r = mmu_translate_asce(env, vaddr, PSW_ASC_PRIMARY, env->cregs[1],
raddr, flags, rw, exc);
 *flags &= ~(PAGE_READ | PAGE_WRITE);
-- 
2.3.7
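
For readers skimming the archive, the effect of the patch above can be sketched in isolation. This is a minimal illustration, not QEMU code: the enum is declared locally so the block compiles on its own (the patch only confirms that 1 meant a data store and 2 an instruction fetch; 0 for a load is assumed here), and the `EX_`/`ex_` names and bit values are hypothetical placeholders.

```c
#include <stdint.h>

/* Local stand-in for the access-type definitions the patch refers to.
 * The values 1 and 2 match the magic numbers removed by the patch;
 * 0 for a data load is an assumption made for this sketch. */
typedef enum {
    MMU_DATA_LOAD  = 0,
    MMU_DATA_STORE = 1,
    MMU_INST_FETCH = 2,
} MMUAccessType;

/* Hypothetical placeholder bits, standing in for FS_READ / FS_WRITE. */
#define EX_FS_READ  0x800ULL
#define EX_FS_WRITE 0x400ULL

/* Same shape as the expression in trigger_prot_fault(): only a store
 * selects the write bit, every other access type counts as a read. */
static uint64_t ex_fetch_store_bits(MMUAccessType rw)
{
    return rw == MMU_DATA_STORE ? EX_FS_WRITE : EX_FS_READ;
}

/* Same shape as the "code access" checks in the patch. */
static int ex_is_code_access(MMUAccessType rw)
{
    return rw == MMU_INST_FETCH;
}
```

The named constants make each comparison self-describing, which is the whole point of the change; behaviour is unchanged as long as the numeric values stay the same.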

[Qemu-devel] [PULL 08/10] s390x/mmu: Use ioctl for reading and writing from/to guest memory

2015-04-30 Thread Cornelia Huck

From: Thomas Huth 

Add code to make use of the new ioctl for reading from / writing to
virtual guest memory. By using the ioctl, the memory accesses are now
protected with the so-called ipte-lock in the kernel.

[CH: moved error message into kvm_s390_mem_op()]
Signed-off-by: Thomas Huth 
Acked-by: Christian Borntraeger 
Signed-off-by: Cornelia Huck 
---
 target-s390x/cpu.h|  7 +++
 target-s390x/kvm.c| 40 
 target-s390x/mmu_helper.c |  7 +++
 3 files changed, 54 insertions(+)

diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
index 79bc80b..9c42743 100644
--- a/target-s390x/cpu.h
+++ b/target-s390x/cpu.h
@@ -401,6 +401,8 @@ void kvm_s390_vcpu_interrupt(S390CPU *cpu, struct kvm_s390_irq *irq);
 void kvm_s390_floating_interrupt(struct kvm_s390_irq *irq);
 int kvm_s390_inject_flic(struct kvm_s390_irq *irq);
 void kvm_s390_access_exception(S390CPU *cpu, uint16_t code, uint64_t te_code);
+int kvm_s390_mem_op(S390CPU *cpu, vaddr addr, void *hostbuf, int len,
+bool is_write);
 int kvm_s390_get_clock(uint8_t *tod_high, uint64_t *tod_clock);
 int kvm_s390_set_clock(uint8_t *tod_high, uint64_t *tod_clock);
 #else
@@ -418,6 +420,11 @@ static inline int kvm_s390_set_clock(uint8_t *tod_high, uint64_t *tod_low)
 {
 return -ENOSYS;
 }
+static inline int kvm_s390_mem_op(S390CPU *cpu, vaddr addr, void *hostbuf,
+  int len, bool is_write)
+{
+return -ENOSYS;
+}
 static inline void kvm_s390_access_exception(S390CPU *cpu, uint16_t code,
  uint64_t te_code)
 {
diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
index 619684b..1c0e78c 100644
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -123,6 +123,7 @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
 
 static int cap_sync_regs;
 static int cap_async_pf;
+static int cap_mem_op;
 
 static void *legacy_s390_alloc(size_t size, uint64_t *align);
 
@@ -247,6 +248,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
 {
 cap_sync_regs = kvm_check_extension(s, KVM_CAP_SYNC_REGS);
 cap_async_pf = kvm_check_extension(s, KVM_CAP_ASYNC_PF);
+cap_mem_op = kvm_check_extension(s, KVM_CAP_S390_MEM_OP);
 
 kvm_s390_enable_cmma(s);
 
@@ -550,6 +552,44 @@ int kvm_s390_set_clock(uint8_t *tod_high, uint64_t *tod_low)
 return kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, &attr);
 }
 
+/**
+ * kvm_s390_mem_op:
+ * @addr:  the logical start address in guest memory
+ * @hostbuf:   buffer in host memory. NULL = do only checks w/o copying
+ * @len:   length that should be transfered
+ * @is_write:  true = write, false = read
+ * Returns:0 on success, non-zero if an exception or error occured
+ *
+ * Use KVM ioctl to read/write from/to guest memory. An access exception
+ * is injected into the vCPU in case of translation errors.
+ */
+int kvm_s390_mem_op(S390CPU *cpu, vaddr addr, void *hostbuf, int len,
+bool is_write)
+{
+struct kvm_s390_mem_op mem_op = {
+.gaddr = addr,
+.flags = KVM_S390_MEMOP_F_INJECT_EXCEPTION,
+.size = len,
+.op = is_write ? KVM_S390_MEMOP_LOGICAL_WRITE
+   : KVM_S390_MEMOP_LOGICAL_READ,
+.buf = (uint64_t)hostbuf,
+};
+int ret;
+
+if (!cap_mem_op) {
+return -ENOSYS;
+}
+if (!hostbuf) {
+mem_op.flags |= KVM_S390_MEMOP_F_CHECK_ONLY;
+}
+
+ret = kvm_vcpu_ioctl(CPU(cpu), KVM_S390_MEM_OP, &mem_op);
+if (ret < 0) {
+error_printf("KVM_S390_MEM_OP failed: %s\n", strerror(-ret));
+}
+return ret;
+}
+
 /*
  * Legacy layout for s390:
  * Older S390 KVM requires the topmost vma of the RAM to be
diff --git a/target-s390x/mmu_helper.c b/target-s390x/mmu_helper.c
index 9b88498..cd2cb51 100644
--- a/target-s390x/mmu_helper.c
+++ b/target-s390x/mmu_helper.c
@@ -450,6 +450,13 @@ int s390_cpu_virt_mem_rw(S390CPU *cpu, vaddr laddr, void *hostbuf,
 target_ulong *pages;
 int ret;
 
+if (kvm_enabled()) {
+ret = kvm_s390_mem_op(cpu, laddr, hostbuf, len, is_write);
+if (ret >= 0) {
+return ret;
+}
+}
+
 nr_pages = (((laddr & ~TARGET_PAGE_MASK) + len - 1) >> TARGET_PAGE_BITS)
+ 1;
 pages = g_malloc(nr_pages * sizeof(*pages));
-- 
2.3.7
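
The control flow added to s390_cpu_virt_mem_rw() can be sketched as a standalone "try the fast path, fall back on error" pattern. Everything below is a hypothetical stand-in: the `ex_` functions, the 64-byte fake guest RAM and the flag that simulates the capability check are assumptions for illustration, and no KVM ioctl is issued.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simulates cap_mem_op: whether the fast path is available at all. */
static bool ex_fast_path_available;
/* Tiny fake guest memory so the sketch is self-contained. */
static unsigned char ex_guest_ram[64];

static bool ex_in_range(size_t addr, size_t len)
{
    return addr <= sizeof(ex_guest_ram) && len <= sizeof(ex_guest_ram) - addr;
}

static void ex_copy(size_t addr, void *hostbuf, size_t len, bool is_write)
{
    if (!hostbuf) {
        return;                 /* NULL buffer: only the checks are done */
    }
    if (is_write) {
        memcpy(ex_guest_ram + addr, hostbuf, len);
    } else {
        memcpy(hostbuf, ex_guest_ram + addr, len);
    }
}

/* Stand-in for the ioctl-based helper: negative means "could not be
 * used", zero means success, positive means the access faulted and the
 * fault was already delivered to the guest. */
static int ex_fast_mem_op(size_t addr, void *hostbuf, size_t len, bool is_write)
{
    if (!ex_fast_path_available) {
        return -ENOSYS;
    }
    if (!ex_in_range(addr, len)) {
        return 1;
    }
    ex_copy(addr, hostbuf, len, is_write);
    return 0;
}

/* Stand-in for the pre-existing page-walking slow path. */
static int ex_slow_mem_rw(size_t addr, void *hostbuf, size_t len, bool is_write)
{
    if (!ex_in_range(addr, len)) {
        return -EFAULT;
    }
    ex_copy(addr, hostbuf, len, is_write);
    return 0;
}

/* Mirrors the patch: any non-negative fast-path result is final,
 * a negative one falls through to the old code. */
static int ex_virt_mem_rw(size_t addr, void *hostbuf, size_t len, bool is_write)
{
    int ret = ex_fast_mem_op(addr, hostbuf, len, is_write);

    if (ret >= 0) {
        return ret;
    }
    return ex_slow_mem_rw(addr, hostbuf, len, is_write);
}
```

Treating only negative values as "fall back" keeps older kernels working, since a missing capability surfaces as -ENOSYS, while a result that already accounts for an injected exception is never retried.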

[Qemu-devel] [PULL 06/10] linux-headers: update

2015-04-30 Thread Cornelia Huck

This updates linux-headers against master 4.1-rc1 (commit
b787f68c36d49bb1d9236f403813641efa74a031).

Signed-off-by: Cornelia Huck 
---
 include/standard-headers/linux/virtio_balloon.h |  28 +++-
 include/standard-headers/linux/virtio_ids.h |   1 +
 include/standard-headers/linux/virtio_input.h   |  76 +++
 linux-headers/asm-arm/kvm.h |   9 +-
 linux-headers/asm-arm64/kvm.h   |   9 +-
 linux-headers/asm-mips/kvm.h| 164 +++-
 linux-headers/asm-s390/kvm.h|   4 +
 linux-headers/asm-x86/hyperv.h  |   2 +
 linux-headers/linux/kvm.h   |  66 +-
 linux-headers/linux/vfio.h  |   2 +
 10 files changed, 293 insertions(+), 68 deletions(-)
 create mode 100644 include/standard-headers/linux/virtio_input.h

diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h
index 799376d..88ada1d 100644
--- a/include/standard-headers/linux/virtio_balloon.h
+++ b/include/standard-headers/linux/virtio_balloon.h
@@ -25,6 +25,7 @@
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE. */
+#include "standard-headers/linux/types.h"
 #include "standard-headers/linux/virtio_ids.h"
 #include "standard-headers/linux/virtio_config.h"
 
@@ -51,9 +52,32 @@ struct virtio_balloon_config {
 #define VIRTIO_BALLOON_S_MEMTOT   5   /* Total amount of memory */
 #define VIRTIO_BALLOON_S_NR   6
 
+/*
+ * Memory statistics structure.
+ * Driver fills an array of these structures and passes to device.
+ *
+ * NOTE: fields are laid out in a way that would make compiler add padding
+ * between and after fields, so we have to use compiler-specific attributes to
+ * pack it, to disable this padding. This also often causes compiler to
+ * generate suboptimal code.
+ *
+ * We maintain this statistics structure format for backwards compatibility,
+ * but don't follow this example.
+ *
+ * If implementing a similar structure, do something like the below instead:
+ * struct virtio_balloon_stat {
+ * __virtio16 tag;
+ * uint8_t reserved[6];
+ * __virtio64 val;
+ * };
+ *
+ * In other words, add explicit reserved fields to align field and
+ * structure boundaries at field size, avoiding compiler padding
+ * without the packed attribute.
+ */
 struct virtio_balloon_stat {
-   uint16_t tag;
-   uint64_t val;
+   __virtio16 tag;
+   __virtio64 val;
 } QEMU_PACKED;
 
 #endif /* _LINUX_VIRTIO_BALLOON_H */
diff --git a/include/standard-headers/linux/virtio_ids.h b/include/standard-headers/linux/virtio_ids.h
index 284fc3a..5f60aa4 100644
--- a/include/standard-headers/linux/virtio_ids.h
+++ b/include/standard-headers/linux/virtio_ids.h
@@ -39,5 +39,6 @@
 #define VIRTIO_ID_9P   9 /* 9p virtio console */
 #define VIRTIO_ID_RPROC_SERIAL 11 /* virtio remoteproc serial link */
 #define VIRTIO_ID_CAIF12 /* Virtio caif */
+#define VIRTIO_ID_INPUT18 /* virtio input */
 
 #endif /* _LINUX_VIRTIO_IDS_H */
diff --git a/include/standard-headers/linux/virtio_input.h b/include/standard-headers/linux/virtio_input.h
new file mode 100644
index 000..a98a797
--- /dev/null
+++ b/include/standard-headers/linux/virtio_input.h
@@ -0,0 +1,76 @@
+#ifndef _LINUX_VIRTIO_INPUT_H
+#define _LINUX_VIRTIO_INPUT_H
+/* This header is BSD licensed so anyone can use the definitions to implement
+ * compatible drivers/servers.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of IBM nor the names of its contributors
+ *may be used to endorse or promote products derived from this software
+ *without specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
+ * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+ * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
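
The layout advice in the virtio_balloon.h comment earlier in this patch can be checked with a small sketch. The struct names are hypothetical, plain fixed-width integers replace the __virtio16/__virtio64 types, and `__attribute__((packed))` assumes a GCC-compatible compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* Legacy style: natural alignment would insert padding after "tag",
 * so the structure has to be packed to keep a 10-byte wire format. */
struct ex_stat_packed {
    uint16_t tag;
    uint64_t val;
} __attribute__((packed));

/* Recommended style: explicit reserved bytes place every field on a
 * boundary equal to its size, so no packing attribute is needed. */
struct ex_stat_reserved {
    uint16_t tag;
    uint8_t  reserved[6];
    uint64_t val;
};

/* Compile-time checks of the two layouts (C11). */
_Static_assert(sizeof(struct ex_stat_packed) == 10, "packed layout");
_Static_assert(offsetof(struct ex_stat_packed, val) == 2, "unaligned val");
_Static_assert(sizeof(struct ex_stat_reserved) == 16, "reserved layout");
_Static_assert(offsetof(struct ex_stat_reserved, val) == 8, "aligned val");
```

The second form costs six bytes per entry but lets the compiler use ordinary aligned loads, which is the trade-off the header comment describes.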

[Qemu-devel] [PULL 00/10] first pile of s390x patches for 2.4

2015-04-30 Thread Cornelia Huck

More to come, but I'd like to get the first batch off my plate now.

The following changes since commit 52b7aba62f02cf90f57ee7e02f67d2d8445e7e40:

  Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20150428.0' into staging (2015-04-28 18:58:15 +0100)

are available in the git repository at:

  git://github.com/cohuck/qemu tags/s390x-20150430

for you to fetch changes up to 2c80e996e427ae31982f3405a762859578a6261d:

  kvm: better advice for failed s390x startup (2015-04-30 13:21:42 +0200)


First pile of s390x patches for 2.4, including:
- some cleanup patches
- sort most of the s390x devices into categories
- support for the new STSI post handler, used to insert vm name and
  friends
- support for the new MEM_OP ioctl (including access register mode)
  for accessing guest memory



Alexander Yarygin (1):
  s390x/kvm: Support access register mode for KVM_S390_MEM_OP ioctl

Cornelia Huck (6):
  virtio-ccw: sort into categories
  s390-virtio: sort into categories
  sclp: sort into categories
  s390x/ipl: sort into categories
  linux-headers: update
  kvm: better advice for failed s390x startup

Ekaterina Tumanova (1):
  s390x/kvm: Put vm name, extended name and UUID into STSI322 SYSIB

Thomas Huth (2):
  s390x/mmu: Use access type definitions instead of magic values
  s390x/mmu: Use ioctl for reading and writing from/to guest memory

 hw/char/sclpconsole-lm.c|   1 +
 hw/char/sclpconsole.c   |   1 +
 hw/s390x/event-facility.c   |   1 +
 hw/s390x/ipl.c  |   1 +
 hw/s390x/s390-pci-inst.c|  21 +--
 hw/s390x/s390-pci-inst.h|   7 +-
 hw/s390x/s390-virtio-bus.c  |   9 ++
 hw/s390x/sclp.c |   9 ++
 hw/s390x/sclpcpu.c  |   2 +
 hw/s390x/sclpquiesce.c  |   1 +
 hw/s390x/virtio-ccw.c   |   9 ++
 include/standard-headers/linux/virtio_balloon.h |  28 +++-
 include/standard-headers/linux/virtio_ids.h |   1 +
 include/standard-headers/linux/virtio_input.h   |  76 +++
 kvm-all.c   |  13 +-
 linux-headers/asm-arm/kvm.h |   9 +-
 linux-headers/asm-arm64/kvm.h   |   9 +-
 linux-headers/asm-mips/kvm.h| 164 +++-
 linux-headers/asm-s390/kvm.h|   4 +
 linux-headers/asm-x86/hyperv.h  |   2 +
 linux-headers/linux/kvm.h   |  66 +-
 linux-headers/linux/vfio.h  |   2 +
 target-s390x/cpu.h  |  37 --
 target-s390x/helper.c   |   2 +-
 target-s390x/ioinst.c   |  42 +++---
 target-s390x/kvm.c  | 145 +++--
 target-s390x/mmu_helper.c   |  20 ++-
 27 files changed, 554 insertions(+), 128 deletions(-)
 create mode 100644 include/standard-headers/linux/virtio_input.h

-- 
2.3.7

[Qemu-devel] [PULL 10/10] kvm: better advice for failed s390x startup

2015-04-30 Thread Cornelia Huck

If KVM_CREATE failed on s390x, we print a hint to enable the switch_amode
kernel parameter. This only applies to old kernels, and only if the
error was -EINVAL. Moreover, with new kernels, the most likely reason
for -EINVAL is that pgstes were not enabled.

Let's update the error message to give a better hint on where things
may need fixing.

Acked-by: Christian Borntraeger 
Signed-off-by: Cornelia Huck 
---
 kvm-all.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 2a717e5..3f7061a 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1544,8 +1544,17 @@ static int kvm_init(MachineState *ms)
 strerror(-ret));
 
 #ifdef TARGET_S390X
-fprintf(stderr, "Please add the 'switch_amode' kernel parameter to "
-"your host kernel command line\n");
+if (ret == -EINVAL) {
+fprintf(stderr,
+"Host kernel setup problem detected. Please verify:\n");
+fprintf(stderr, "- for kernels supporting the switch_amode or"
+" user_mode parameters, whether\n");
+fprintf(stderr,
+"  user space is running in primary address space\n");
+fprintf(stderr,
+"- for kernels supporting the vm.allocate_pgste sysctl, "
+"whether it is enabled\n");
+}
 #endif
 goto err;
 }
-- 
2.3.7
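
The decision made by the new error path can be sketched as a pure function. The helper name is hypothetical and the message text is copied from the patch; the real code prints with fprintf() inside kvm_init() rather than returning a string.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Returns the s390x-specific hint for a failed VM creation, or NULL
 * when the error code does not point at a host kernel setup problem. */
static const char *ex_create_vm_hint(int ret)
{
    if (ret != -EINVAL) {
        return NULL;
    }
    return "Host kernel setup problem detected. Please verify:\n"
           "- for kernels supporting the switch_amode or user_mode "
           "parameters, whether\n"
           "  user space is running in primary address space\n"
           "- for kernels supporting the vm.allocate_pgste sysctl, "
           "whether it is enabled\n";
}
```

Gating the hint on -EINVAL avoids sending users after kernel parameters when the failure has an unrelated cause such as -ENOMEM.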

Re: [Qemu-devel] [PATCH v2 2/3] qobject: Add a special null QObject

2015-04-30 Thread Markus Armbruster

Eric Blake  writes:

> From: Markus Armbruster 
>
> I'm going to fix the JSON parser to recognize null.  The obvious
> representation of JSON null as (QObject *)NULL doesn't work, because
> the parser already uses it as an error value.  Perhaps we should
> change it to free NULL for null, but that's more than I can do right
> now.  Create a special null QObject instead.

Keeping my S-o-B here would clarify commit message authorship.

> The existing QDict, QList, and QString all represent something that
> is a pointer in C and could therefore be associated with NULL.  But
> right now, all three of these sub-types are always non-null once
> created, so the new null sentinel object is intentionally unrelated
> to them.
>
> Signed-off-by: Markus Armbruster 
> Signed-off-by: Eric Blake 

You fixed my pasto and updated a copyright note.  Thanks.
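
For context, the design discussed in the commit message can be sketched as follows. This is an illustration under assumptions, not the code from the patch: all `Ex`/`ex_` names are hypothetical, the reference counting is reduced to a bare counter, and the "parser" only recognizes the literal null.

```c
#include <stddef.h>
#include <string.h>

typedef enum { EX_QTYPE_NULL, EX_QTYPE_OTHER } ExQType;

typedef struct ExQObject {
    ExQType type;
    size_t refcnt;
} ExQObject;

/* One statically allocated object represents JSON null. It starts with
 * a reference of its own, so callers dropping theirs never free it. */
static ExQObject ex_qnull_ = { EX_QTYPE_NULL, 1 };

/* Hands out a new reference to the shared null object; no malloc. */
static ExQObject *ex_qnull(void)
{
    ex_qnull_.refcnt++;
    return &ex_qnull_;
}

static void ex_qobject_decref(ExQObject *obj)
{
    if (obj && obj->refcnt > 0) {
        obj->refcnt--;
    }
}

/* Toy parser: a C NULL return still means "parse error", so JSON null
 * has to be a real object to be distinguishable from failure. */
static ExQObject *ex_parse(const char *text)
{
    if (strcmp(text, "null") == 0) {
        return ex_qnull();
    }
    return NULL;
}
```

Because every parsed null is the same object, callers can compare against the sentinel's type without the parser allocating anything.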

Re: [Qemu-devel] [PATCH v2 0/3] parse 'null' literal in QMP

2015-04-30 Thread Markus Armbruster

Eric Blake  writes:

> Here's my attempt to merge the best points of Markus' approach [1]
> (patches 16-18 of that series - benefit of smaller patches and fewer
> malloc calls) and my approach [2] (benefit of a testsuite addition
> and more detailed commit messages), while fixing the typos that both
> of us had in v1.
>
> [1]https://lists.gnu.org/archive/html/qemu-devel/2015-04/msg00342.html
> [2]https://lists.gnu.org/archive/html/qemu-devel/2015-04/msg00623.html

Looks ready.  Luiz?

Re: [Qemu-devel] [PATCH v2 0/3] parse 'null' literal in QMP

2015-04-30 Thread Luiz Capitulino

On Thu, 30 Apr 2015 14:21:02 +0200
Markus Armbruster  wrote:

> Eric Blake  writes:
> 
> > Here's my attempt to merge the best points of Markus' approach [1]
> > (patches 16-18 of that series - benefit of smaller patches and fewer
> > malloc calls) and my approach [2] (benefit of a testsuite addition
> > and more detailed commit messages), while fixing the typos that both
> > of us had in v1.
> >
> > [1]https://lists.gnu.org/archive/html/qemu-devel/2015-04/msg00342.html
> > [2]https://lists.gnu.org/archive/html/qemu-devel/2015-04/msg00623.html
> 
> Looks ready.  Luiz?

It's in my queue.

Re: [Qemu-devel] [PULL v2 00/22] Memory, TCG, NBD, build system changes for 2015-04-27

2015-04-30 Thread Peter Maydell

On 29 April 2015 at 09:35, Paolo Bonzini  wrote:
> The following changes since commit e1a5476354d396773e4c555f126d752d4ae58fa9:
>
>   Open 2.4 development tree (2015-04-25 22:05:07 +0100)
>
> are available in the git repository at:
>
>   git://github.com/bonzini/qemu.git tags/for-upstream
>
> for you to fetch changes up to d064d9f381b00538e41f14104b88a1ae85d78865:
>
>   nbd/trivial: fix type cast for ioctl (2015-04-28 22:14:15 +0200)
>
> 
> - miscellaneous cleanups for TCG (Emilio) and NBD (Bogdan)
> - next part in the thread-safe address_space_* saga: atomic access
>   to the bounce buffer and the map_clients list, from Fam
> - optional support for linking with tcmalloc, also from Fam
> - reapplying Peter Crosthwaite's "Respect as_translate_internal
>   length clamp" after fixing the SPARC fallout.
> - build system fix from Wei Liu
> - small acpi-build and ioport cleanup by myself
>
> 

Applied, thanks.

-- PMM

[Qemu-devel] [PATCH v15 02/10] hw/vfio/platform: vfio-platform skeleton

2015-04-30 Thread Eric Auger

Minimal VFIO platform implementation supporting register space
user mapping but not IRQ assignment.

Signed-off-by: Kim Phillips 
Signed-off-by: Eric Auger 

---
v14 -> v15:
- vfio_platform_compute_needs_reset now returns true while
  vfio_platform_hot_reset_multi returns -1
- adjust g_malloc0_n usage

v13 -> v14:
- fix ENAMETOOLONG error path sign

v12 -> v13:
- check device name does not contain any /
- handle case where readlink fully fills the buffer
- in vfio_map_region declare size as uint64_t

v11 -> v12:
- add x-mmap property definition, without which the default value of
  vbasedev.allow_mmap is false, hence preventing the reg space from
  being mapped.

v10 -> v11:
x Take into account Alex Bennee's comments:
- use g_malloc0_n instead of g_malloc0
- use block declarations when possible
- rework readlink returned value treatment
- use g_strlcat in place of strncat
x use g_snprintf in place of snprintf
x correct error handling in vfio_populate_device,
  in case of flag not corresponding to platform device
x various cosmetic changes

v9 -> v10:
- vfio_populate_device is no longer called in the common vfio_get_device
  but in vfio_base_device_init

v8 -> v9:
- irq management is moved into a separate patch to ease the review
- VFIO_DEVICE_FLAGS_PLATFORM is checked in vfio_populate_device
- g_free of regions added in vfio_populate_device error label
- virtualID becomes 32b

v7 -> v8:
- change prototype of vfio_platform_compute_needs_reset and set
  vbasedev->needs_reset to false there
- vfio_[un]mask_irqindex renamed into vfio_[un]mask_single_irqindex
- vfio_register_irq_starter renamed into vfio_kick_irqs.
  We now use a reset notifier instead of a machine init done notifier,
  which gets rid of the VfioIrqStarterNotifierParams dangling
  pointer. Previously we used the pbus first_irq. This is no longer
  possible since the reset notifier takes a void * and first_irq is a
  field of a const struct. So now we pass the DeviceState handle of the
  interrupt controller. I tried to keep the code generic, which is why
  I did not rely on an architecture-specific accessor to retrieve
  the gsi number (gic accessor as proposed by Alex). I would like to
  avoid creating an ARM VFIO device model. I hope this model
  can work on archs other than ARM (no multiple intc?).
  Wouldn't it be simpler to keep the previous first_irq parameter and
  relax the const constraint?

v6 -> v7:
- compat is not exposed anymore as a user option. Rationale is
  the vfio device became abstract and a specialization is needed
  anyway. The derived device must set the compat string.
- in v6 vfio_start_irq_injection was exposed in vfio-platform.h.
  A new function dubbed vfio_register_irq_starter replaces it. It
  registers a machine init done notifier that programs & starts
  all dynamic VFIO device IRQs. This function is supposed to be
  called by the machine file. A set of static helper routines are
  added too. It must be called before the creation of the platform
  bus device.

v5 -> v6:
- vfio_device property renamed into host property
- correct error handling of VFIO_DEVICE_GET_IRQ_INFO ioctl
  and remove PCI related comment
- remove declaration of vfio_setup_irqfd and irqfd_allowed
  property. Both belong to next patch (irqfd)
- remove declaration of vfio_intp_interrupt in vfio-platform.h
- functions that can be static get this characteristic
- remove declarations of vfio_region_ops, vfio_memory_listener,
  group_list, vfio_address_spaces. All are moved to vfio-common.h
- remove vfio_put_device declaration and definition
- print_regions removed. code moved into vfio_populate_regions
- replace DPRINTF by trace events
- new helper routine to set the trigger eventfd
- dissociate intp init from the injection enablement:
  vfio_enable_intp renamed into vfio_init_intp and new function
  named vfio_start_eventfd_injection
- injection start moved to vfio_start_irq_injection (not anymore
  in vfio_populate_interrupt)
- new start_irq_fn field in VFIOPlatformDevice corresponding to
  the function that will be used for starting injection
- user handled eventfd:
  x add mutex to protect IRQ state & list manipulation,
  x correct misleading comment in vfio_intp_interrupt.
  x Fix bugs thanks to fake interrupt modality
- VFIOPlatformDeviceClass becomes abstract
- add error_setg in vfio_platform_realize

v4 -> v5:
- vfio-platform.h included first
- cleanup error handling in *populate*, vfio_get_device,
  vfio_enable_intp
- vfio_put_device not called anymore
- add some includes to follow vfio policy

v3 -> v4:
[Eric Auger]
- merge of "vfio: Add initial IRQ support in platform device"
  to get a full functional patch although perfs are limited.
- removal of unrealize function since I currently understand
  it is only used with device hot-plug feature.

v2 -> v3:
[Eric Auger]
- further factorization between PCI and platform (VFIORegion,
  VFIODevice). same level of functionality.

<= v2:
[Kim Phillips]
- Initial Creation of the device supporting register space
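
Two of the changelog items above (rejecting device names that contain a '/' and handling a readlink result that fully fills the buffer) can be sketched as small helpers. The function names are hypothetical and this is not the code from the patch; only the POSIX readlink() call itself is real.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* A device name is later pasted into a sysfs path, so a '/' in it
 * would let the path point somewhere else. */
static int ex_check_device_name(const char *name)
{
    if (name == NULL || name[0] == '\0' || strchr(name, '/') != NULL) {
        return -EINVAL;
    }
    return 0;
}

/* readlink() does not NUL-terminate and silently truncates. A length
 * equal to the buffer size therefore has to be treated as "too long",
 * which also leaves room for the terminator otherwise. */
static int ex_read_link(const char *path, char *buf, size_t size)
{
    ssize_t len;

    if (size == 0) {
        return -EINVAL;
    }
    len = readlink(path, buf, size);
    if (len < 0) {
        return -errno;
    }
    if ((size_t)len >= size) {
        return -ENAMETOOLONG;   /* negative, matching the v14 sign fix */
    }
    buf[len] = '\0';
    return 0;
}
```

Returning negative errno values throughout keeps the error paths consistent, which is the kind of sign mix-up the v13 -> v14 entry mentions.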

[Qemu-devel] [PATCH v15 00/10] KVM platform device passthrough

2015-04-30 Thread Eric Auger

This series aims at enabling KVM platform device passthrough.

On the kernel side, the vfio platform driver is needed, available from
4.1-rc1 onwards.

This series now only relies on the following QEMU series, for
dynamic instantiation of the VFIO platform device from qemu command
line:

[1] [PATCH v12 0/4] machvirt dynamic sysbus device instantiation
http://comments.gmane.org/gmane.comp.emulators.kvm.arm.devel/886

Both series are candidates for QEMU 2.4 and available at

http://git.linaro.org/people/eric.auger/qemu.git
(branch vfio_integ_v15)

The series was tested on Calxeda Midway (ARMv7) where one xgmac
is assigned to the KVM host while the second one is assigned to the guest.

Wiki for Calxeda Midway setup:
https://wiki.linaro.org/LEG/Engineering/Virtualization/Platform_Device_Passthrough_on_Midway

History:

v14 -> v15:
- add Peter R-b on sysbus: add irq_routing_notifier
- correct g_malloc0_n usage in skeleton
- correct return values of reset related functions
- include Cornelia's patch for header update

v13 -> v14:
- remove v13 9, 10, 11 patch files and replace them by a single patch file
  "sysbus: add irq_routing_notifier".
- in skeleton, fix ENAMETOOLONG sign
- remove VFIOINTp virtualID in "add irq assignment" patch file
- removed trace_vfio_platform_start_eventfd

v12 -> v13:
- header update but same update was already sent by Cornelia
- Rework VFIO signaling & irqfd setup: restored 2-step setup featuring
  eventfd setup on realize and then irqfd setup on irq binding.
- irqfd setup now uses kvm_irqchip_add_irqfd_notifier and
  sysbus irq_set_hook override. This leads to the introduction of 6 patch
  files enabling those 2 features. Paolo advised to introduce
  kvm_irqchip_add_irqfd_notifier series in the VFIO one. I did the
  same for irq_set_hook series but if it is better I can submit it aside.
- above changes made it possible to remove
  x "hw/vfio/platform: add capability to start IRQ propagation"
  x "hw/arm/virt: start VFIO IRQ propagation"
- in sysbus-fdt.c, use platform_bus_get_mmio_addr instead of deprecated
  mmio[0] property. Thanks to Bharat who pointed this issue out. Also,
  cpu_to_be32 is used for size and base (Vikram's input).
- in skeleton misc corrections following Alex review.

v11->v12:
- add x-mmap property definition, without which the default value of
  vbasedev.allow_mmap is false, hence preventing the reg space from
  being mmapped.

v10->v11:
- rebase onto v2.3.0-rc0 (mainly related to PCIe support in virt)
- add dma-coherent property for calxeda midway (fix revealed by removal
  of kernel-side "vfio: type1: support for ARM SMMUS with VFIO_IOMMU_TYPE1")
- virt modifications to start VFIO IRQ forwarding are now in a separate
  patch
- rearrange linux header exports (those are still partial exports
  waiting for definitive 4.1-rc0)
- take into account Alex Bennée's comments:
  - use g_malloc0_n instead of g_malloc0
  - use block declarations when possible
  - rework readlink returned value treatment
  - use g_strlcat in place of strncat
  - re-arrange mutex locking for multiple IRQ support (user-side handled
eventfds)
- use g_snprintf instead of snprintf
- change the order of functions to avoid pre-declaration in platform.c
- add flags in VFIOINTp struct to detect whether the IRQ is automasked
- some comment rewriting

v9->v10:
- rebase on "vfio: cleanup vfio_get_device error path, remove
  vfio_populate_device": vfio_populate_device is no longer called in
  vfio_get_device but in vfio_base_device_init
- update VFIO header according to vfio platform driver v13 (no AMBA)

v8->v9:
- rebase on 2.2.0 and machvirt dynamic sysbus instantiation v10
- v8 1-11 were pulled
- patch files related to forwarding are moved into a separate series since
  they depend on a kernel series still in RFC.
- introduction of basic VFIO platform device split into 3 patch files to
  ease the review (hope it will help).
- add an author in platform.c
- add deallocation in vfio_populate_device error case
- add patch file doing the VFIO header sync
- use VFIO_DEVICE_FLAGS_PLATFORM in vfio_populate_device
- rename calxeda_xgmac.c into calxeda-xgmac.c
- sysbus-fdt: add_calxeda_midway_xgmac_fdt_node g_free in case of errors
- reword of linux-headers patch files

v7->v8:
- rebase on v2.2.0-rc3 and integrate
  "Add skip_dump flag to ignore memory region during dump"
- KVM header evolution with subindex addition in kvm_arch_forwarded_irq
- split [PATCH v7 03/16] hw/vfio/pci: introduce VFIODevice into 4 patches
- vfio_compute_needs_reset does not return bool anymore
- add some comments about exposed MMIO region and IRQ in calxeda xgmac
  device
- vfio_[un]mask_irqindex renamed into vfio_[un]mask_single_irqindex
- rework IRQ startup: former machine init done notifier is replaced by a
  reset notifier. machine file passes the interrupt controller
  DeviceState handle (not the platform bus first irq parameter).
- sysbus-fdt:
  - move the add_fdt_node_functions array declaration between the device
specific code and the generic code

[Qemu-devel] [PATCH v15 06/10] kvm: rename kvm_irqchip_[add, remove]_irqfd_notifier with gsi suffix

2015-04-30 Thread Eric Auger

In anticipation of new add/remove functions taking
a qemu_irq parameter, let's rename the existing ones with a gsi suffix.

Signed-off-by: Eric Auger 
---
 hw/s390x/virtio-ccw.c  | 8 
 hw/vfio/pci.c  | 6 +++---
 hw/virtio/virtio-pci.c | 4 ++--
 include/sysemu/kvm.h   | 7 ---
 kvm-all.c  | 7 ---
 kvm-stub.c | 7 ---
 6 files changed, 21 insertions(+), 18 deletions(-)

diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
index ed75c63..55a74fa 100644
--- a/hw/s390x/virtio-ccw.c
+++ b/hw/s390x/virtio-ccw.c
@@ -1238,8 +1238,8 @@ static int virtio_ccw_add_irqfd(VirtioCcwDevice *dev, int n)
 VirtQueue *vq = virtio_get_queue(vdev, n);
 EventNotifier *notifier = virtio_queue_get_guest_notifier(vq);
 
-return kvm_irqchip_add_irqfd_notifier(kvm_state, notifier, NULL,
-  dev->routes.gsi[n]);
+return kvm_irqchip_add_irqfd_notifier_gsi(kvm_state, notifier, NULL,
+  dev->routes.gsi[n]);
 }
 
 static void virtio_ccw_remove_irqfd(VirtioCcwDevice *dev, int n)
@@ -1249,8 +1249,8 @@ static void virtio_ccw_remove_irqfd(VirtioCcwDevice *dev, int n)
 EventNotifier *notifier = virtio_queue_get_guest_notifier(vq);
 int ret;
 
-ret = kvm_irqchip_remove_irqfd_notifier(kvm_state, notifier,
-dev->routes.gsi[n]);
+ret = kvm_irqchip_remove_irqfd_notifier_gsi(kvm_state, notifier,
+dev->routes.gsi[n]);
 assert(ret == 0);
 }
 
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index cd15b20..938f584 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -596,7 +596,7 @@ static void vfio_add_kvm_msi_virq(VFIOMSIVector *vector, MSIMessage *msg,
 return;
 }
 
-if (kvm_irqchip_add_irqfd_notifier(kvm_state, &vector->kvm_interrupt,
+if (kvm_irqchip_add_irqfd_notifier_gsi(kvm_state, &vector->kvm_interrupt,
NULL, virq) < 0) {
 kvm_irqchip_release_virq(kvm_state, virq);
 event_notifier_cleanup(&vector->kvm_interrupt);
@@ -608,8 +608,8 @@ static void vfio_add_kvm_msi_virq(VFIOMSIVector *vector, MSIMessage *msg,
 
 static void vfio_remove_kvm_msi_virq(VFIOMSIVector *vector)
 {
-kvm_irqchip_remove_irqfd_notifier(kvm_state, &vector->kvm_interrupt,
-  vector->virq);
+kvm_irqchip_remove_irqfd_notifier_gsi(kvm_state, &vector->kvm_interrupt,
+  vector->virq);
 kvm_irqchip_release_virq(kvm_state, vector->virq);
 vector->virq = -1;
 event_notifier_cleanup(&vector->kvm_interrupt);
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index c7c3f72..3be7fad 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -477,7 +477,7 @@ static int kvm_virtio_pci_irqfd_use(VirtIOPCIProxy *proxy,
 VirtQueue *vq = virtio_get_queue(vdev, queue_no);
 EventNotifier *n = virtio_queue_get_guest_notifier(vq);
 int ret;
-ret = kvm_irqchip_add_irqfd_notifier(kvm_state, n, NULL, irqfd->virq);
+ret = kvm_irqchip_add_irqfd_notifier_gsi(kvm_state, n, NULL, irqfd->virq);
 return ret;
 }
 
@@ -491,7 +491,7 @@ static void kvm_virtio_pci_irqfd_release(VirtIOPCIProxy *proxy,
 VirtIOIRQFD *irqfd = &proxy->vector_irqfd[vector];
 int ret;
 
-ret = kvm_irqchip_remove_irqfd_notifier(kvm_state, n, irqfd->virq);
+ret = kvm_irqchip_remove_irqfd_notifier_gsi(kvm_state, n, irqfd->virq);
 assert(ret == 0);
 }
 
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 197e6c0..0f28d6f 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -413,9 +413,10 @@ void kvm_irqchip_release_virq(KVMState *s, int virq);
 
 int kvm_irqchip_add_adapter_route(KVMState *s, AdapterInfo *adapter);
 
-int kvm_irqchip_add_irqfd_notifier(KVMState *s, EventNotifier *n,
-   EventNotifier *rn, int virq);
-int kvm_irqchip_remove_irqfd_notifier(KVMState *s, EventNotifier *n, int virq);
+int kvm_irqchip_add_irqfd_notifier_gsi(KVMState *s, EventNotifier *n,
+   EventNotifier *rn, int virq);
+int kvm_irqchip_remove_irqfd_notifier_gsi(KVMState *s, EventNotifier *n,
+  int virq);
 void kvm_pc_gsi_handler(void *opaque, int n, int level);
 void kvm_pc_setup_irq_routing(bool pci_enabled);
 void kvm_init_irq_routing(KVMState *s);
diff --git a/kvm-all.c b/kvm-all.c
index 4ec153d..42bb923 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1368,14 +1368,15 @@ int kvm_irqchip_update_msi_route(KVMState *s, int virq, 
MSIMessage msg)
 }
 #endif /* !KVM_CAP_IRQ_ROUTING */
 
-int kvm_irqchip_add_irqfd_notifier(KVMState *s, EventNotifier *n,
-   EventNotifier *rn, int virq)
+int kvm_irqchip_add_irqfd_notifier_gsi(KVMState *s, EventNotifier *n,
+   EventNotifier

[Qemu-devel] [PATCH v15 01/10] linux-headers: update

2015-04-30 Thread Eric Auger

From: Cornelia Huck 

This updates linux-headers against master 4.1-rc1 (commit
b787f68c36d49bb1d9236f403813641efa74a031).

Signed-off-by: Cornelia Huck 
---
 include/standard-headers/linux/virtio_balloon.h |  28 +++-
 include/standard-headers/linux/virtio_blk.h |   8 +-
 include/standard-headers/linux/virtio_ids.h |   1 +
 include/standard-headers/linux/virtio_input.h   |  76 +++
 linux-headers/asm-arm/kvm.h |   9 +-
 linux-headers/asm-arm64/kvm.h   |   9 +-
 linux-headers/asm-mips/kvm.h| 164 +++-
 linux-headers/asm-s390/kvm.h|   4 +
 linux-headers/asm-x86/hyperv.h  |   2 +
 linux-headers/linux/kvm.h   |  66 +-
 linux-headers/linux/vfio.h  |   2 +
 11 files changed, 299 insertions(+), 70 deletions(-)
 create mode 100644 include/standard-headers/linux/virtio_input.h

diff --git a/include/standard-headers/linux/virtio_balloon.h 
b/include/standard-headers/linux/virtio_balloon.h
index 799376d..88ada1d 100644
--- a/include/standard-headers/linux/virtio_balloon.h
+++ b/include/standard-headers/linux/virtio_balloon.h
@@ -25,6 +25,7 @@
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE. */
+#include "standard-headers/linux/types.h"
 #include "standard-headers/linux/virtio_ids.h"
 #include "standard-headers/linux/virtio_config.h"
 
@@ -51,9 +52,32 @@ struct virtio_balloon_config {
 #define VIRTIO_BALLOON_S_MEMTOT   5   /* Total amount of memory */
 #define VIRTIO_BALLOON_S_NR   6
 
+/*
+ * Memory statistics structure.
+ * Driver fills an array of these structures and passes to device.
+ *
+ * NOTE: fields are laid out in a way that would make compiler add padding
+ * between and after fields, so we have to use compiler-specific attributes to
+ * pack it, to disable this padding. This also often causes compiler to
+ * generate suboptimal code.
+ *
+ * We maintain this statistics structure format for backwards compatibility,
+ * but don't follow this example.
+ *
+ * If implementing a similar structure, do something like the below instead:
+ * struct virtio_balloon_stat {
+ * __virtio16 tag;
+ * uint8_t reserved[6];
+ * __virtio64 val;
+ * };
+ *
+ * In other words, add explicit reserved fields to align field and
+ * structure boundaries at field size, avoiding compiler padding
+ * without the packed attribute.
+ */
 struct virtio_balloon_stat {
-   uint16_t tag;
-   uint64_t val;
+   __virtio16 tag;
+   __virtio64 val;
 } QEMU_PACKED;
 
 #endif /* _LINUX_VIRTIO_BALLOON_H */
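
The layout note in the virtio_balloon.h hunk above can be checked with a
small standalone sketch. The struct names below (stat_packed,
stat_reserved) are hypothetical, plain fixed-width integers stand in for
the __virtio16/__virtio64 types, and __attribute__((packed)) is only
assumed to be what QEMU_PACKED expands to on GCC/Clang.

```c
#include <stddef.h>
#include <stdint.h>

/* Legacy-style layout: packing removes the padding after 'tag'. */
struct stat_packed {
    uint16_t tag;
    uint64_t val;
} __attribute__((packed));

/* Recommended style: explicit reserved bytes align 'val' naturally. */
struct stat_reserved {
    uint16_t tag;
    uint8_t  reserved[6];
    uint64_t val;
};

/* Compile-time checks of the two layouts (C11). */
_Static_assert(sizeof(struct stat_packed) == 10, "packed: 2 + 8 bytes");
_Static_assert(offsetof(struct stat_reserved, val) == 8, "val at offset 8");
_Static_assert(sizeof(struct stat_reserved) == 16, "no hidden padding");
```

With the reserved field the compiler has nothing left to pad, so the
packed attribute and the unaligned accesses it can cause are not needed.
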
diff --git a/include/standard-headers/linux/virtio_blk.h 
b/include/standard-headers/linux/virtio_blk.h
index 12016b4..cd601f4 100644
--- a/include/standard-headers/linux/virtio_blk.h
+++ b/include/standard-headers/linux/virtio_blk.h
@@ -58,7 +58,7 @@ struct virtio_blk_config {
uint32_t size_max;
/* The maximum number of segments (if VIRTIO_BLK_F_SEG_MAX) */
uint32_t seg_max;
-   /* geometry the device (if VIRTIO_BLK_F_GEOMETRY) */
+   /* geometry of the device (if VIRTIO_BLK_F_GEOMETRY) */
struct virtio_blk_geometry {
uint16_t cylinders;
uint8_t heads;
@@ -117,7 +117,11 @@ struct virtio_blk_config {
 #define VIRTIO_BLK_T_BARRIER   0x8000
 #endif /* !VIRTIO_BLK_NO_LEGACY */
 
-/* This is the first element of the read scatter-gather list. */
+/*
+ * This comes first in the read scatter-gather list.
+ * For legacy virtio, if VIRTIO_F_ANY_LAYOUT is not negotiated,
+ * this is the first element of the read scatter-gather list.
+ */
 struct virtio_blk_outhdr {
/* VIRTIO_BLK_T* */
__virtio32 type;
diff --git a/include/standard-headers/linux/virtio_ids.h 
b/include/standard-headers/linux/virtio_ids.h
index 284fc3a..5f60aa4 100644
--- a/include/standard-headers/linux/virtio_ids.h
+++ b/include/standard-headers/linux/virtio_ids.h
@@ -39,5 +39,6 @@
 #define VIRTIO_ID_9P   9 /* 9p virtio console */
 #define VIRTIO_ID_RPROC_SERIAL 11 /* virtio remoteproc serial link */
 #define VIRTIO_ID_CAIF12 /* Virtio caif */
+#define VIRTIO_ID_INPUT18 /* virtio input */
 
 #endif /* _LINUX_VIRTIO_IDS_H */
diff --git a/include/standard-headers/linux/virtio_input.h 
b/include/standard-headers/linux/virtio_input.h
new file mode 100644
index 000..a98a797
--- /dev/null
+++ b/include/standard-headers/linux/virtio_input.h
@@ -0,0 +1,76 @@
+#ifndef _LINUX_VIRTIO_INPUT_H
+#define _LINUX_VIRTIO_INPUT_H
+/* This header is BSD licensed so anyone can use the definitions to implement
+ * compatible drivers/servers.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the a

[Qemu-devel] [PATCH v15 03/10] hw/vfio/platform: add irq assignment

2015-04-30 Thread Eric Auger

This patch adds the code required to assign interrupts to
a guest. The interrupts are mediated through user-handled
eventfds only.
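
For readers unfamiliar with eventfds, here is a minimal Linux-only sketch
of the user-side pattern. The helper names signal_irq and consume_irq are
hypothetical; in the patch the fd comes from event_notifier_get_fd() and
the handler is registered with qemu_set_fd_handler(), and the signalling
side is the VFIO driver rather than user code.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Signal the eventfd once, standing in for the driver reporting an IRQ. */
static int signal_irq(int fd)
{
    uint64_t one = 1;

    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* User-side handler: drain the counter and return how many signals
 * were pending. Returns 0 if the read fails. */
static uint64_t consume_irq(int fd)
{
    uint64_t count = 0;

    if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
        return 0;
    }
    return count;
}
```

A read on a non-semaphore eventfd returns the accumulated count and
resets it, so several signals may be coalesced into one handler run.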

Signed-off-by: Eric Auger 

---
v13 -> v14:
- remove virtualID field in header

v12 -> v13:
- start user-side eventfd handling at realize time
- remove start_irq_fn

v10 -> v11:
- use block declaration when possible
- change order of vfio_platform_eoi vs vfio_intp_interrupt
- introduce vfio_intp_inject_pending_lockheld following Alex Bennee
  comments
- remove unmasking/masked when setting up VFIO signaling
- remove unused kvm_accel member in VFIOINTp struct
- add flags member in VFIOINTp in order to properly discriminate
  edge/level-sensitive IRQs; unmask the physical IRQ only in case of
  level-sensitive IRQ
- some comment rewording

v8 -> v9:
- free irq related resources in case of error in vfio_populate_device
---
 hw/vfio/platform.c  | 331 +++-
 include/hw/vfio/vfio-platform.h |  31 
 trace-events|   7 +
 3 files changed, 368 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index 569a675..35266a8 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -22,10 +22,299 @@
 #include "qemu/range.h"
 #include "sysemu/sysemu.h"
 #include "exec/memory.h"
+#include "qemu/queue.h"
 #include "hw/sysbus.h"
 #include "trace.h"
 #include "hw/platform-bus.h"
 
+/*
+ * Functions used whatever the injection method
+ */
+
+/**
+ * vfio_init_intp - allocate, initialize the IRQ struct pointer
+ * and add it into the list of IRQs
+ * @vbasedev: the VFIO device handle
+ * @info: irq info struct retrieved from VFIO driver
+ */
+static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev,
+struct vfio_irq_info info)
+{
+int ret;
+VFIOPlatformDevice *vdev =
+container_of(vbasedev, VFIOPlatformDevice, vbasedev);
+SysBusDevice *sbdev = SYS_BUS_DEVICE(vdev);
+VFIOINTp *intp;
+
+intp = g_malloc0(sizeof(*intp));
+intp->vdev = vdev;
+intp->pin = info.index;
+intp->flags = info.flags;
+intp->state = VFIO_IRQ_INACTIVE;
+
+sysbus_init_irq(sbdev, &intp->qemuirq);
+
+/* Get an eventfd for trigger */
+ret = event_notifier_init(&intp->interrupt, 0);
+if (ret) {
+g_free(intp);
+error_report("vfio: Error: trigger event_notifier_init failed ");
+return NULL;
+}
+
+QLIST_INSERT_HEAD(&vdev->intp_list, intp, next);
+return intp;
+}
+
+/**
+ * vfio_set_trigger_eventfd - set VFIO eventfd handling
+ *
+ * @intp: IRQ struct handle
+ * @handler: handler to be called on eventfd signaling
+ *
+ * Setup VFIO signaling and attach an optional user-side handler
+ * to the eventfd
+ */
+static int vfio_set_trigger_eventfd(VFIOINTp *intp,
+eventfd_user_side_handler_t handler)
+{
+VFIODevice *vbasedev = &intp->vdev->vbasedev;
+struct vfio_irq_set *irq_set;
+int argsz, ret;
+int32_t *pfd;
+
+argsz = sizeof(*irq_set) + sizeof(*pfd);
+irq_set = g_malloc0(argsz);
+irq_set->argsz = argsz;
+irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
+irq_set->index = intp->pin;
+irq_set->start = 0;
+irq_set->count = 1;
+pfd = (int32_t *)&irq_set->data;
+*pfd = event_notifier_get_fd(&intp->interrupt);
+qemu_set_fd_handler(*pfd, (IOHandler *)handler, NULL, intp);
+ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+g_free(irq_set);
+if (ret < 0) {
+error_report("vfio: Failed to set trigger eventfd: %m");
+qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
+}
+return ret;
+}
+
+/*
+ * Functions only used when eventfds are handled on user-side
+ * ie. without irqfd
+ */
+
+/**
+ * vfio_mmap_set_enabled - enable/disable the fast path mode
+ * @vdev: the VFIO platform device
+ * @enabled: the target mmap state
+ *
+ * enabled = true ~ fast path = MMIO region is mmaped (no KVM TRAP);
+ * enabled = false ~ slow path = MMIO region is trapped and region callbacks
+ * are called; slow path enables to trap the device IRQ status register reset
+*/
+
+static void vfio_mmap_set_enabled(VFIOPlatformDevice *vdev, bool enabled)
+{
+int i;
+
+trace_vfio_platform_mmap_set_enabled(enabled);
+
+for (i = 0; i < vdev->vbasedev.num_regions; i++) {
+VFIORegion *region = vdev->regions[i];
+
+memory_region_set_enabled(&region->mmap_mem, enabled);
+}
+}
+
+/**
+ * vfio_intp_mmap_enable - timer function, restores the fast path
+ * if there is no more active IRQ
+ * @opaque: actually points to the VFIO platform device
+ *
+ * Called on mmap timer timeout, this function checks whether the
+ * IRQ is still active and if not, restores the fast path.
+ * By construction a single eventfd is handled at a time.
+ * If the IRQ is still active, the timer is re-programmed.
+ */
+static void vfio_intp_mmap_enable(void *opaque)
+{
+VFIOINTp *tmp;
+VFIOPlatformDevice *vdev

[Qemu-devel] [PATCH v15 10/10] hw/vfio/platform: add irqfd support

2015-04-30 Thread Eric Auger

This patch aims at optimizing IRQ handling using irqfd framework.

Instead of handling the eventfds on user-side they are handled on
kernel side using
- the KVM irqfd framework,
- the VFIO driver virqfd framework.

The virtual IRQ completion is trapped at the interrupt controller.
This removes the need for the fast/slow path swap.

Overall this brings significant performance improvements.

Signed-off-by: Alvise Rigo 
Signed-off-by: Eric Auger 
Reviewed-by: Alex Bennee

---
v13 -> v14:
- use connect_irq_notifier
- remove trace_vfio_platform_start_eventfd

v12 -> v13:
- setup the new mechanism for starting irqfd, based on
  LinkPropertySetter override
- use kvm_irqchip_[add,remove]_irqfd_notifier new functions: no need
  to bother about gsi (hence virtualID could be removed with small
  change in trace-events)

v10 -> v11:
- Add Alex' Reviewed-by
- introduce kvm_accel in this patch and initialize it

v5 -> v6
- rely on kvm_irqfds_enabled() and kvm_resamplefds_enabled()
- guard KVM code with #ifdef CONFIG_KVM

v3 -> v4:
[Alvise Rigo]
Use of VFIO Platform driver v6 unmask/virqfd feature and removal
of resamplefd handler. Physical IRQ unmasking is now done in
VFIO driver.

v3:
[Eric Auger]
initial support with resamplefd handled on QEMU side since the
unmask was not supported on VFIO platform driver v5.

Conflicts:
hw/vfio/platform.c
---
 hw/vfio/platform.c  | 107 
 include/hw/vfio/vfio-platform.h |   2 +
 trace-events|   1 +
 3 files changed, 110 insertions(+)

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index 35266a8..901b98e 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -26,6 +26,7 @@
 #include "hw/sysbus.h"
 #include "trace.h"
 #include "hw/platform-bus.h"
+#include "sysemu/kvm.h"
 
 /*
  * Functions used whatever the injection method
@@ -51,6 +52,7 @@ static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev,
 intp->pin = info.index;
 intp->flags = info.flags;
 intp->state = VFIO_IRQ_INACTIVE;
+intp->kvm_accel = false;
 
 sysbus_init_irq(sbdev, &intp->qemuirq);
 
@@ -61,6 +63,13 @@ static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev,
 error_report("vfio: Error: trigger event_notifier_init failed ");
 return NULL;
 }
+/* Get an eventfd for resample/unmask */
+ret = event_notifier_init(&intp->unmask, 0);
+if (ret) {
+g_free(intp);
+error_report("vfio: Error: resample event_notifier_init failed eoi");
+return NULL;
+}
 
 QLIST_INSERT_HEAD(&vdev->intp_list, intp, next);
 return intp;
@@ -315,6 +324,95 @@ static int vfio_start_eventfd_injection(VFIOINTp *intp)
 return ret;
 }
 
+/*
+ * Functions used for irqfd
+ */
+
+#ifdef CONFIG_KVM
+
+/**
+ * vfio_set_resample_eventfd - sets the resamplefd for an IRQ
+ * @intp: the IRQ struct handle
+ * programs the VFIO driver to unmask this IRQ when the
+ * intp->unmask eventfd is triggered
+ */
+static int vfio_set_resample_eventfd(VFIOINTp *intp)
+{
+VFIODevice *vbasedev = &intp->vdev->vbasedev;
+struct vfio_irq_set *irq_set;
+int argsz, ret;
+int32_t *pfd;
+
+argsz = sizeof(*irq_set) + sizeof(*pfd);
+irq_set = g_malloc0(argsz);
+irq_set->argsz = argsz;
+irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_UNMASK;
+irq_set->index = intp->pin;
+irq_set->start = 0;
+irq_set->count = 1;
+pfd = (int32_t *)&irq_set->data;
+*pfd = event_notifier_get_fd(&intp->unmask);
+qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
+ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
+g_free(irq_set);
+if (ret < 0) {
+error_report("vfio: Failed to set resample eventfd: %m");
+}
+return ret;
+}
+
+static void vfio_start_irqfd_injection(SysBusDevice *sbdev, qemu_irq irq)
+{
+VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev);
+VFIOINTp *intp;
+bool found = false;
+
+QLIST_FOREACH(intp, &vdev->intp_list, next) {
+if (intp->qemuirq == irq) {
+found  = true;
+break;
+}
+}
+assert(found);
+
+/* Get to a known interrupt state */
+qemu_set_fd_handler(event_notifier_get_fd(&intp->interrupt),
+NULL, NULL, vdev);
+
+vfio_mask_single_irqindex(&vdev->vbasedev, intp->pin);
+qemu_set_irq(intp->qemuirq, 0);
+
+if (kvm_irqchip_add_irqfd_notifier(kvm_state, &intp->interrupt,
+   &intp->unmask, irq) < 0) {
+goto fail_irqfd;
+}
+
+if (vfio_set_trigger_eventfd(intp, NULL) < 0) {
+goto fail_vfio;
+}
+if (vfio_set_resample_eventfd(intp) < 0) {
+goto fail_vfio;
+}
+
+/* Let'em rip */
+vfio_unmask_single_irqindex(&vdev->vbasedev, intp->pin);
+
+intp->kvm_accel = true;
+
+trace_vfio_platform_start_irqfd_injection(intp->pin,
+ event_notifier_get_fd(&intp->interrupt),
+ ev

[Qemu-devel] [PATCH v15 04/10] hw/vfio/platform: calxeda xgmac device

2015-04-30 Thread Eric Auger

The platform device class has become abstract. This patch introduces
a calxeda xgmac device that derives from it.
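
The derivation relies on the usual "save the parent's realize, install
your own, then chain" idiom. Below is a minimal sketch of that idiom in
plain C, without QOM. The Device and DeviceClass structs here are
simplified stand-ins and not the real QEMU types.

```c
#include <stddef.h>
#include <string.h>

typedef struct Device Device;
typedef void (*RealizeFn)(Device *dev);

struct Device {
    const char *compat;   /* set by the derived device */
    int realized;         /* set by the base realize */
};

typedef struct DeviceClass {
    RealizeFn realize;         /* currently installed realize hook */
    RealizeFn parent_realize;  /* saved base implementation */
} DeviceClass;

static DeviceClass xgmac_class;

/* Base (abstract) behaviour shared by all platform devices. */
static void base_realize(Device *dev)
{
    dev->realized = 1;
}

/* Derived realize: set the device-specific data, then chain up. */
static void xgmac_realize(Device *dev)
{
    dev->compat = "calxeda,hb-xgmac";
    xgmac_class.parent_realize(dev);
}

/* Class init: remember the inherited hook before overriding it. */
static void xgmac_class_init(DeviceClass *k)
{
    k->parent_realize = k->realize;
    k->realize = xgmac_realize;
}
```

The derived class only supplies what differs (here the compat string)
and leaves the rest of the realize work to the abstract base.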

Signed-off-by: Eric Auger 
Reviewed-by: Alex Bennee

---
v10 -> v11:
- add Alex Reviewed-by
- move virt modifications in a separate patch

v8 -> v9:
- renamed calxeda_xgmac.c into calxeda-xgmac.c

v7 -> v8:
- add a comment in the header about the MMIO regions and IRQ which
  are exposed by the device

v5 -> v6
- back again following Alex Graf advises
- fix a bug related to compat override

v4 -> v5:
removed since device tree was moved to hw/arm/dyn_sysbus_devtree.c

v4: creation for device tree specialization

Conflicts:
hw/arm/virt.c
---
 hw/vfio/Makefile.objs|  1 +
 hw/vfio/calxeda-xgmac.c  | 54 
 include/hw/vfio/vfio-calxeda-xgmac.h | 46 ++
 3 files changed, 101 insertions(+)
 create mode 100644 hw/vfio/calxeda-xgmac.c
 create mode 100644 include/hw/vfio/vfio-calxeda-xgmac.h

diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index c5c76fe..d540c9d 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -2,4 +2,5 @@ ifeq ($(CONFIG_LINUX), y)
 obj-$(CONFIG_SOFTMMU) += common.o
 obj-$(CONFIG_PCI) += pci.o
 obj-$(CONFIG_SOFTMMU) += platform.o
+obj-$(CONFIG_SOFTMMU) += calxeda-xgmac.o
 endif
diff --git a/hw/vfio/calxeda-xgmac.c b/hw/vfio/calxeda-xgmac.c
new file mode 100644
index 000..c4b8fef
--- /dev/null
+++ b/hw/vfio/calxeda-xgmac.c
@@ -0,0 +1,54 @@
+/*
+ * calxeda xgmac VFIO device
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ *  Eric Auger 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "hw/vfio/vfio-calxeda-xgmac.h"
+
+static void calxeda_xgmac_realize(DeviceState *dev, Error **errp)
+{
+VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
+VFIOCalxedaXgmacDeviceClass *k = VFIO_CALXEDA_XGMAC_DEVICE_GET_CLASS(dev);
+
+vdev->compat = g_strdup("calxeda,hb-xgmac");
+
+k->parent_realize(dev, errp);
+}
+
+static const VMStateDescription vfio_platform_vmstate = {
+.name = TYPE_VFIO_CALXEDA_XGMAC,
+.unmigratable = 1,
+};
+
+static void vfio_calxeda_xgmac_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+VFIOCalxedaXgmacDeviceClass *vcxc =
+VFIO_CALXEDA_XGMAC_DEVICE_CLASS(klass);
+vcxc->parent_realize = dc->realize;
+dc->realize = calxeda_xgmac_realize;
+dc->desc = "VFIO Calxeda XGMAC";
+}
+
+static const TypeInfo vfio_calxeda_xgmac_dev_info = {
+.name = TYPE_VFIO_CALXEDA_XGMAC,
+.parent = TYPE_VFIO_PLATFORM,
+.instance_size = sizeof(VFIOCalxedaXgmacDevice),
+.class_init = vfio_calxeda_xgmac_class_init,
+.class_size = sizeof(VFIOCalxedaXgmacDeviceClass),
+};
+
+static void register_calxeda_xgmac_dev_type(void)
+{
+type_register_static(&vfio_calxeda_xgmac_dev_info);
+}
+
+type_init(register_calxeda_xgmac_dev_type)
diff --git a/include/hw/vfio/vfio-calxeda-xgmac.h 
b/include/hw/vfio/vfio-calxeda-xgmac.h
new file mode 100644
index 000..f994775
--- /dev/null
+++ b/include/hw/vfio/vfio-calxeda-xgmac.h
@@ -0,0 +1,46 @@
+/*
+ * VFIO calxeda xgmac device
+ *
+ * Copyright Linaro Limited, 2014
+ *
+ * Authors:
+ *  Eric Auger 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef HW_VFIO_VFIO_CALXEDA_XGMAC_H
+#define HW_VFIO_VFIO_CALXEDA_XGMAC_H
+
+#include "hw/vfio/vfio-platform.h"
+
+#define TYPE_VFIO_CALXEDA_XGMAC "vfio-calxeda-xgmac"
+
+/**
+ * This device exposes:
+ * - a single MMIO region corresponding to its register space
+ * - 3 IRQS (main and 2 power related IRQs)
+ */
+typedef struct VFIOCalxedaXgmacDevice {
+VFIOPlatformDevice vdev;
+} VFIOCalxedaXgmacDevice;
+
+typedef struct VFIOCalxedaXgmacDeviceClass {
+/*< private >*/
+VFIOPlatformDeviceClass parent_class;
+/*< public >*/
+DeviceRealize parent_realize;
+} VFIOCalxedaXgmacDeviceClass;
+
+#define VFIO_CALXEDA_XGMAC_DEVICE(obj) \
+ OBJECT_CHECK(VFIOCalxedaXgmacDevice, (obj), TYPE_VFIO_CALXEDA_XGMAC)
+#define VFIO_CALXEDA_XGMAC_DEVICE_CLASS(klass) \
+ OBJECT_CLASS_CHECK(VFIOCalxedaXgmacDeviceClass, (klass), \
+TYPE_VFIO_CALXEDA_XGMAC)
+#define VFIO_CALXEDA_XGMAC_DEVICE_GET_CLASS(obj) \
+ OBJECT_GET_CLASS(VFIOCalxedaXgmacDeviceClass, (obj), \
+  TYPE_VFIO_CALXEDA_XGMAC)
+
+#endif
-- 
1.8.3.2

[Qemu-devel] [PATCH v15 05/10] hw/arm/sysbus-fdt: enable vfio-calxeda-xgmac dynamic instantiation

2015-04-30 Thread Eric Auger

This patch allows the instantiation of the vfio-calxeda-xgmac device
from the QEMU command line (-device vfio-calxeda-xgmac,host="").

A specialized device tree node is created for the guest, containing
compat, dma-coherent, reg and interrupts properties.
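
As an illustration of how the reg and interrupts properties are laid
out, here is a small standalone sketch. The helpers put_be32, encode_reg
and encode_irq are hypothetical and only mirror what the patch does with
cpu_to_be32: every cell is a 32-bit big-endian value, reg is a
<base size> pair, and each interrupt is a <0 irq 0x4> triplet (in the
usual ARM GIC binding these first and last cells denote a shared
peripheral interrupt and an active-high level trigger).

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value as one big-endian device tree cell. */
static void put_be32(uint8_t *cell, uint32_t v)
{
    cell[0] = (uint8_t)(v >> 24);
    cell[1] = (uint8_t)(v >> 16);
    cell[2] = (uint8_t)(v >> 8);
    cell[3] = (uint8_t)v;
}

/* "reg" entry: <base size>, one cell each. Returns bytes written. */
static size_t encode_reg(uint8_t *out, uint32_t base, uint32_t size)
{
    put_be32(out, base);
    put_be32(out + 4, size);
    return 8;
}

/* "interrupts" entry: <type number flags>, here <0 irq 0x4>. */
static size_t encode_irq(uint8_t *out, uint32_t irq)
{
    put_be32(out, 0);
    put_be32(out + 4, irq);
    put_be32(out + 8, 0x4);
    return 12;
}
```

Writing the bytes explicitly keeps the sketch independent of host
endianness, which is the reason the patch goes through cpu_to_be32.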

Signed-off-by: Eric Auger 

---
v12 -> v13:
- use platform_bus_get_mmio_addr instead of deprecated mmio[0] property.
  Thanks to Bharat who pointed this issue out.
- use cpu_to_be32 to mmio_base & size (Vikram report)

v10 -> v11:
- add dma-coherent property to calxeda midway xgmac node (fix)
- use qemu_fdt_setprop to add reg property instead of
  qemu_fdt_setprop_sized_cells_from_array
- commit message rewording

v8 -> v9:
- properly free resources in case of errors in
  add_calxeda_midway_xgmac_fdt_node

v7 -> v8:
- move the add_fdt_node_functions array declaration between the device
  specific code and the generic code to avoid forward declarations of
  device-specific functions
- rename add_basic_vfio_fdt_node into
  add_calxeda_midway_xgmac_fdt_node

v6 -> v7:
- compat string re-formatting removed since compat string is not exposed
  anymore as a user option
- VFIO IRQ kick-off removed from sysbus-fdt and moved to VFIO platform
  device
---
 hw/arm/sysbus-fdt.c | 72 +
 1 file changed, 72 insertions(+)

diff --git a/hw/arm/sysbus-fdt.c b/hw/arm/sysbus-fdt.c
index 3038b94..3d67acf 100644
--- a/hw/arm/sysbus-fdt.c
+++ b/hw/arm/sysbus-fdt.c
@@ -26,6 +26,8 @@
 #include "sysemu/device_tree.h"
 #include "hw/platform-bus.h"
 #include "sysemu/sysemu.h"
+#include "hw/vfio/vfio-platform.h"
+#include "hw/vfio/vfio-calxeda-xgmac.h"
 
 /*
  * internal struct that contains the information to create dynamic
@@ -53,11 +55,81 @@ typedef struct NodeCreationPair {
 int (*add_fdt_node_fn)(SysBusDevice *sbdev, void *opaque);
 } NodeCreationPair;
 
+/* Device Specific Code */
+
+/**
+ * add_calxeda_midway_xgmac_fdt_node
+ *
+ * Generates a simple node with following properties:
+ * compatible string, regs, interrupts, dma-coherent
+ */
+static int add_calxeda_midway_xgmac_fdt_node(SysBusDevice *sbdev, void *opaque)
+{
+PlatformBusFDTData *data = opaque;
+PlatformBusDevice *pbus = data->pbus;
+void *fdt = data->fdt;
+const char *parent_node = data->pbus_node_name;
+int compat_str_len, i, ret = -1;
+char *nodename;
+uint32_t *irq_attr, *reg_attr;
+uint64_t mmio_base, irq_number;
+VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(sbdev);
+VFIODevice *vbasedev = &vdev->vbasedev;
+
+mmio_base = platform_bus_get_mmio_addr(pbus, sbdev, 0);
+nodename = g_strdup_printf("%s/%s@%" PRIx64, parent_node,
+   vbasedev->name, mmio_base);
+qemu_fdt_add_subnode(fdt, nodename);
+
+compat_str_len = strlen(vdev->compat) + 1;
+qemu_fdt_setprop(fdt, nodename, "compatible",
+  vdev->compat, compat_str_len);
+
+qemu_fdt_setprop(fdt, nodename, "dma-coherent", "", 0);
+
+reg_attr = g_new(uint32_t, vbasedev->num_regions*2);
+for (i = 0; i < vbasedev->num_regions; i++) {
+mmio_base = platform_bus_get_mmio_addr(pbus, sbdev, i);
+reg_attr[2*i] = cpu_to_be32(mmio_base);
+reg_attr[2*i+1] = cpu_to_be32(
+memory_region_size(&vdev->regions[i]->mem));
+}
+ret = qemu_fdt_setprop(fdt, nodename, "reg", reg_attr,
+   vbasedev->num_regions*2*sizeof(uint32_t));
+if (ret) {
+error_report("could not set reg property of node %s", nodename);
+goto fail_reg;
+}
+
+irq_attr = g_new(uint32_t, vbasedev->num_irqs*3);
+for (i = 0; i < vbasedev->num_irqs; i++) {
+irq_number = platform_bus_get_irqn(pbus, sbdev , i)
+ + data->irq_start;
+irq_attr[3*i] = cpu_to_be32(0);
+irq_attr[3*i+1] = cpu_to_be32(irq_number);
+irq_attr[3*i+2] = cpu_to_be32(0x4);
+}
+   ret = qemu_fdt_setprop(fdt, nodename, "interrupts",
+ irq_attr, vbasedev->num_irqs*3*sizeof(uint32_t));
+if (ret) {
+error_report("could not set interrupts property of node %s",
+ nodename);
+}
+g_free(irq_attr);
+fail_reg:
+g_free(reg_attr);
+g_free(nodename);
+return ret;
+}
+
 /* list of supported dynamic sysbus devices */
 static const NodeCreationPair add_fdt_node_functions[] = {
+{TYPE_VFIO_CALXEDA_XGMAC, add_calxeda_midway_xgmac_fdt_node},
 {"", NULL}, /* last element */
 };
 
+/* Generic Code */
+
 /**
  * add_fdt_node - add the device tree node of a dynamic sysbus device
  *
-- 
1.8.3.2

[Qemu-devel] [PATCH v15 09/10] sysbus: add irq_routing_notifier

2015-04-30 Thread Eric Auger

Add a new connect_irq_notifier notifier in the SysBusDeviceClass. This
notifier, if populated, is called after sysbus_connect_irq.

This mechanism is used to set up VFIO signaling once VFIO platform
devices get attached to their platform bus, on a machine init done
notifier.
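
The shape of the hook, reduced to plain C, looks roughly like the sketch
below. The Sketch* types and the integer irq are hypothetical stand-ins
for SysBusDevice, SysBusDeviceClass and qemu_irq; only the "call the
class hook after wiring, if it is populated" logic is taken from the
patch.

```c
#include <stddef.h>

typedef struct SketchDevice SketchDevice;

typedef struct SketchDeviceClass {
    /* Optional hook, left NULL by classes that do not care. */
    void (*connect_irq_notifier)(SketchDevice *dev, int irq);
} SketchDeviceClass;

struct SketchDevice {
    const SketchDeviceClass *klass;
    int connected_irq;   /* -1 until wired */
    int notified_irq;    /* -1 until the hook ran */
};

/* Wire the IRQ first, then notify the class if it asked for it. */
static void sketch_connect_irq(SketchDevice *dev, int irq)
{
    dev->connected_irq = irq;
    if (dev->klass->connect_irq_notifier) {
        dev->klass->connect_irq_notifier(dev, irq);
    }
}

/* Example hook: a device class that records the IRQ it was given. */
static void record_irq(SketchDevice *dev, int irq)
{
    dev->notified_irq = irq;
}
```

Because the hook lives in the class, devices that do not set it pay
only a NULL check on each connect.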

Signed-off-by: Eric Auger 
Reviewed-by: Peter Crosthwaite 

---
v14 -> v15:
add Peter R-b

v2 -> v3 (integrated into this series v14):
- rename irq_routing_notifier into connect_irq_notifier

v1 -> v2:
- duly put the notifier in the class and not in the device
---
 hw/core/sysbus.c| 6 ++
 include/hw/sysbus.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/hw/core/sysbus.c b/hw/core/sysbus.c
index b53c351..2d22aec 100644
--- a/hw/core/sysbus.c
+++ b/hw/core/sysbus.c
@@ -109,7 +109,13 @@ qemu_irq sysbus_get_connected_irq(SysBusDevice *dev, int n)
 
 void sysbus_connect_irq(SysBusDevice *dev, int n, qemu_irq irq)
 {
+SysBusDeviceClass *sbd = SYS_BUS_DEVICE_GET_CLASS(dev);
+
 qdev_connect_gpio_out_named(DEVICE(dev), SYSBUS_DEVICE_GPIO_IRQ, n, irq);
+
+if (sbd->connect_irq_notifier) {
+sbd->connect_irq_notifier(dev, irq);
+}
 }
 
 /* Check whether an MMIO region exists */
diff --git a/include/hw/sysbus.h b/include/hw/sysbus.h
index d1f3f00..e80b26d 100644
--- a/include/hw/sysbus.h
+++ b/include/hw/sysbus.h
@@ -41,6 +41,7 @@ typedef struct SysBusDeviceClass {
 /*< public >*/
 
 int (*init)(SysBusDevice *dev);
+void (*connect_irq_notifier)(SysBusDevice *dev, qemu_irq irq);
 } SysBusDeviceClass;
 
 struct SysBusDevice {
-- 
1.8.3.2

[Qemu-devel] [PATCH v15 08/10] intc: arm_gic_kvm: set the qemu_irq/gsi mapping

2015-04-30 Thread Eric Auger

arm_gic_kvm now calls kvm_irqchip_set_qemuirq_gsi to build
the hash table storing qemu_irq/gsi mappings. From that point on,
irqfd can be set up directly from the qemu_irq using
kvm_irqchip_add_irqfd_notifier.

Signed-off-by: Eric Auger 

---

v2 -> v3:
- kvm_irqchip_add_qemuirq_irqfd_notifier renamed into
  kvm_irqchip_add_irqfd_notifier
---
 hw/intc/arm_gic_kvm.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/intc/arm_gic_kvm.c b/hw/intc/arm_gic_kvm.c
index e1952ad..28506e3 100644
--- a/hw/intc/arm_gic_kvm.c
+++ b/hw/intc/arm_gic_kvm.c
@@ -554,6 +554,11 @@ static void kvm_arm_gic_realize(DeviceState *dev, Error 
**errp)
  */
 i += (GIC_INTERNAL * s->num_cpu);
 qdev_init_gpio_in(dev, kvm_arm_gic_set_irq, i);
+
+for (i = 0; i < s->num_irq - GIC_INTERNAL; i++) {
+qemu_irq irq = qdev_get_gpio_in(dev, i);
+kvm_irqchip_set_qemuirq_gsi(kvm_state, irq, i);
+}
 /* We never use our outbound IRQ lines but provide them so that
  * we maintain the same interface as the non-KVM GIC.
  */
-- 
1.8.3.2

[Qemu-devel] [PATCH v15 07/10] kvm-all.c: add qemu_irq/gsi hash table and utility routines

2015-04-30 Thread Eric Auger

A VFIO platform device needs to set up irqfd, but it does not know the
gsi corresponding to the device qemu_irq. This series proposes to
store a hash table in kvm_state, using the qemu_irq as key and the gsi
as value.

kvm_irqchip_set_qemuirq_gsi inserts such a pair. The interrupt
controller is expected to call it.

kvm_irqchip_[add, remove]_irqfd_notifier set up and tear down
irqfd directly from the qemu_irq.
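
To make the lookup flow concrete, here is a self-contained sketch of a
pointer-to-gsi map. The patch itself uses a GLib GHashTable with
g_direct_hash; the GsiMap type and the gsimap_* helpers below are
hypothetical and use a small linear array instead, purely to show the
insert/lookup contract and the -ENXIO result for an unknown qemu_irq.

```c
#include <errno.h>
#include <stddef.h>

#define GSIMAP_MAX 64

typedef struct GsiMap {
    const void *irq[GSIMAP_MAX];  /* key: opaque IRQ handle */
    int gsi[GSIMAP_MAX];          /* value: KVM GSI number */
    size_t used;
} GsiMap;

/* Insert or update a pair; returns 0, or -ENOSPC when full. */
static int gsimap_set(GsiMap *m, const void *irq, int gsi)
{
    for (size_t i = 0; i < m->used; i++) {
        if (m->irq[i] == irq) {
            m->gsi[i] = gsi;
            return 0;
        }
    }
    if (m->used == GSIMAP_MAX) {
        return -ENOSPC;
    }
    m->irq[m->used] = irq;
    m->gsi[m->used] = gsi;
    m->used++;
    return 0;
}

/* Look up the GSI; returns -ENXIO for an unknown handle. */
static int gsimap_lookup(const GsiMap *m, const void *irq, int *gsi)
{
    for (size_t i = 0; i < m->used; i++) {
        if (m->irq[i] == irq) {
            *gsi = m->gsi[i];
            return 0;
        }
    }
    return -ENXIO;
}
```

The interrupt controller would fill the map once at realize time, and
callers such as the VFIO platform device would only ever look up.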

Signed-off-by: Eric Auger 

---
v13 -> v14:
- correct checkpatch warning

v2 -> v3 (integration into VFIO series v13):
- rename kvm_irqchip_[add, remove]_qemuirq_irqfd_notifier into
  kvm_irqchip_[add, remove]_irqfd_notifier. Possible since legacy
  functions were also renamed with _gsi suffix.

V1 -> v2:
- qemu_irq get_gsi callback replaced by hash table stored in kvm
---
 include/sysemu/kvm.h |  6 ++
 kvm-all.c| 37 +
 2 files changed, 43 insertions(+)

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 0f28d6f..bc3f230 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -18,6 +18,7 @@
 #include "config-host.h"
 #include "qemu/queue.h"
 #include "qom/cpu.h"
+#include "hw/irq.h"
 
 #ifdef CONFIG_KVM
 #include 
@@ -417,6 +418,11 @@ int kvm_irqchip_add_irqfd_notifier_gsi(KVMState *s, 
EventNotifier *n,
EventNotifier *rn, int virq);
 int kvm_irqchip_remove_irqfd_notifier_gsi(KVMState *s, EventNotifier *n,
   int virq);
+int kvm_irqchip_add_irqfd_notifier(KVMState *s, EventNotifier *n,
+   EventNotifier *rn, qemu_irq irq);
+int kvm_irqchip_remove_irqfd_notifier(KVMState *s, EventNotifier *n,
+  qemu_irq irq);
+void kvm_irqchip_set_qemuirq_gsi(KVMState *s, qemu_irq irq, int gsi);
 void kvm_pc_gsi_handler(void *opaque, int n, int level);
 void kvm_pc_setup_irq_routing(bool pci_enabled);
 void kvm_init_irq_routing(KVMState *s);
diff --git a/kvm-all.c b/kvm-all.c
index 42bb923..d2cb7ed 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -37,6 +37,7 @@
 #include "exec/address-spaces.h"
 #include "qemu/event_notifier.h"
 #include "trace.h"
+#include "hw/irq.h"
 
 #include "hw/boards.h"
 
@@ -99,6 +100,7 @@ struct KVMState
  * unsigned, and treating them as signed here can break things */
 unsigned irq_set_ioctl;
 unsigned int sigmask_len;
+GHashTable *gsimap;
 #ifdef KVM_CAP_IRQ_ROUTING
 struct kvm_irq_routing *irq_routes;
 int nr_allocated_irq_routes;
@@ -1382,6 +1384,39 @@ int kvm_irqchip_remove_irqfd_notifier_gsi(KVMState *s, 
EventNotifier *n,
false);
 }
 
+int kvm_irqchip_add_irqfd_notifier(KVMState *s, EventNotifier *n,
+   EventNotifier *rn, qemu_irq irq)
+{
+gpointer key, gsi;
+gboolean found = g_hash_table_lookup_extended(s->gsimap, irq, &key, &gsi);
+
+if (!found) {
+return -ENXIO;
+} else {
+return kvm_irqchip_add_irqfd_notifier_gsi(s, n, rn,
+  GPOINTER_TO_UINT(gsi));
+}
+}
+
+int kvm_irqchip_remove_irqfd_notifier(KVMState *s, EventNotifier *n,
+  qemu_irq irq)
+{
+gpointer key, gsi;
+gboolean found = g_hash_table_lookup_extended(s->gsimap, irq, &key, &gsi);
+
+if (!found) {
+return -ENXIO;
+} else {
+return kvm_irqchip_remove_irqfd_notifier_gsi(s, n,
+ GPOINTER_TO_INT(gsi));
+}
+}
+
+void kvm_irqchip_set_qemuirq_gsi(KVMState *s, qemu_irq irq, int gsi)
+{
+g_hash_table_insert(s->gsimap, irq, GINT_TO_POINTER(gsi));
+}
+
 static int kvm_irqchip_create(MachineState *machine, KVMState *s)
 {
 int ret;
@@ -1414,6 +1449,8 @@ static int kvm_irqchip_create(MachineState *machine, 
KVMState *s)
 
 kvm_init_irq_routing(s);
 
+s->gsimap = g_hash_table_new(g_direct_hash, g_direct_equal);
+
 return 0;
 }
 
-- 
1.8.3.2

Re: [Qemu-devel] [PULL 00/42] Trivial patches for 2015-04-30

2015-04-30 Thread Peter Maydell

On 30 April 2015 at 06:08, Michael Tokarev  wrote:
> Hello.
>
> This is the first pull request for the trivial-patches tree since 2.3 was
> released.  During the freeze many patches accumulated, and even more
> were received after 2.4 development opened.
>
> So here we have 42 trivial patches, which is kinda too much, but that's
> what we have :)
>
> Please consider applying.
>
> Thanks,
>
> /mjt
>
> The following changes since commit a9392bc93c8615ad1983047e9f91ee3fa8aae75f:
>
>   Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging 
> (2015-04-28 16:55:03 +0100)
>
> are available in the git repository at:
>
>   git://git.corpit.ru/qemu.git tags/pull-trivial-patches-2015-04-30
>
> for you to fetch changes up to fe31d3e6e6945b509e5b4ab03d38aae1266fe64e:
>
>   microblaze: fix memory leak (2015-04-30 08:06:20 +0300)
>
> 
> trivial patches for 2015-04-30

Hi. I'm afraid this fails to build the tests on OSX:
  CC    tests/i440fx-test.o
/Users/pm215/src/qemu/tests/i440fx-test.c:229:21: warning: implicit declaration
  of function 'ARRAY_SIZE' is invalid in C99
  [-Wimplicit-function-declaration]
for (i = 0; i < ARRAY_SIZE(pam_area); i++) {
^
1 warning generated.
  LINK  tests/i440fx-test
Undefined symbols for architecture x86_64:
  "_ARRAY_SIZE", referenced from:
  _test_i440fx_pam in i440fx-test.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [tests/i440fx-test] Error 1

'i440fx-test: remove ARRAY_SIZE redefinition' has removed
this macro, but this test file doesn't include osdep.h so
it presumably works on Linux only because we're picking
something up from a system header file somewhere...
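
For anyone puzzled by the failure mode: with no macro in scope, the
compiler treats ARRAY_SIZE(pam_area) as a call to an undeclared
function, hence the warning and then the undefined symbol at link time.
A guarded definition along these lines avoids both the redefinition and
the missing symbol. This is only a sketch; the exact text in osdep.h is
assumed to be similar, and sample_area is a hypothetical stand-in for
pam_area.

```c
#include <stddef.h>

/* Guarded so that a definition from another header is not redefined. */
#ifndef ARRAY_SIZE
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
#endif

/* Hypothetical table standing in for pam_area in the test. */
static const int sample_area[] = { 10, 20, 30, 40 };

/* Number of elements, computed at compile time from the array type. */
static size_t count_entries(void)
{
    return ARRAY_SIZE(sample_area);
}
```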

-- PMM

Re: [Qemu-devel] [REBASE PATCH v5 1/2] machine: add default_ram_size to machine class

2015-04-30 Thread Paolo Bonzini



On 30/04/2015 11:40, Thomas Huth wrote:
>>>> On 29/04/2015 11:06, Nikunj A Dadhania wrote:
>>>>>> so David can push both patches.
>>>>>>
>>>>>> But isn't 1G a bit too much?  At least on x86 you can easily boot with
>>>>>> 512M.
>>>>>
>>>>> I understood this number as not the _minimum memory_ to boot the
>>>>> VM. And this will only come in picture when the user has not specified
>>>>> any memory.
>>>>
>>>> This in turn will basically only happen for QEMU developers.  So keeping
>>>> the default on the low side would make sense.
>>>>
>>>> On my (4G memory) laptop I might not even be able to boot a PPC64 VM
>>>> with 1G and TCG, but I can do that nicely with 256M.
>>>
>>> That will be fine with me as well, i.e. 256M
>>>
>>> David/Alex, Do you have comments on this before we change it?
>>
>> I've seen RAM size combinations that seemed to work ok, but then failed
>> during grub2 execution for example. Please verify with all reasonably
>> realistically executed distributions that 256MB is enough.
> 
> Since this default value will likely be there for the next couple of
> years, it's maybe better to use a slightly higher value than one that
> is too low - the amount of RAM that a guest requires likely rather
> increases in the next years instead of going down again. So I think
> using 512 MB instead is maybe a good compromise?

Sure, 512 is okay with me.

Paolo

Re: [Qemu-devel] [PULL 00/42] Trivial patches for 2015-04-30

2015-04-30 Thread Michael Tokarev

30.04.2015 15:58, Peter Maydell wrote:
[]
> Hi. I'm afraid this fails to build the tests on OSX:
>   CC    tests/i440fx-test.o
> /Users/pm215/src/qemu/tests/i440fx-test.c:229:21: warning: implicit declaration
>   of function 'ARRAY_SIZE' is invalid in C99

So much for trivial ;)

Let's remove this one and add another tiny dead code removal patch
instead, which went in meanwhile, to complete half-removal of unused
functions.  I'll resend another pull request in a moment.

Thanks, and sorry for the noise.  I wanted to verify whether this ARRAY_SIZE
definition is really unnecessary here, but something distracted me.

/mjt

[Qemu-devel] [PULL 00/42] Trivial patches for 2015-04-30

2015-04-30 Thread Michael Tokarev

Another attempt, now without the problematic ARRAY_SIZE removal,
but with an additional patch removing the unused cpu_get_pc(), and
with an additional Reviewed-by.

Resending only the newly added patch, not the whole series.

Thanks,

/mjt

The following changes since commit 06feaacfb4cfef10cc0c93d97df7bfc8a71dbc7e:

  Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging 
(2015-04-30 12:04:11 +0100)

are available in the git repository at:

  git://git.corpit.ru/qemu.git tags/pull-trivial-patches-2015-04-30

for you to fetch changes up to 553029351bac9f5b4f9ea72793e55f02e7677ec2:

  openrisc: cpu: Remove unused cpu_get_pc (2015-04-30 16:06:18 +0300)


trivial patches for 2015-04-30


Chih-Min Chao (5):
  bitops : fix coding style
  ui/vnc : fix coding style
  ui/vnc : remove 'struct' of 'typedef struct'
  ui/console : remove 'struct' from 'typedef struct' type
  hw/display : remove 'struct' from 'typedef QXL struct'

Emilio G. Cota (5):
  cpus: use first_cpu macro instead of QTAILQ_FIRST(&cpus)
  input: remove unused mouse_handlers list
  qemu-char: remove unused list node from FDCharDriver
  coroutine: remove unnecessary parentheses in qemu_co_queue_empty
  linux-user/elfload: use QTAILQ_FOREACH instead of open-coding it

Gonglei (3):
  target-mips: fix memory leak
  vhost-user: remove superfluous '\n' around error_report()
  microblaze: fix memory leak

Jan Kiszka (1):
  hostmem: Fix mem-path property name in error report

John Snow (1):
  qmp-commands: Fix typo

Laszlo Ersek (1):
  docs/atomics.txt: fix two typos

Michael Tokarev (3):
  qemu-options: trivial spelling fix (messsage)
  libcacard: do not use full paths for include files in the same dir
  microblaze: cpu: Renumber EXCP_* constants to close gap

Paolo Bonzini (3):
  range: remove useless inclusions
  qemu-config: remove stray inclusions of hw/ files
  libcacard: stop including qemu-common.h

Peter Crosthwaite (10):
  arm: cpu.h: Remove unused typdefs
  configure: alphabetize tricore in target list
  defconfigs: Piggyback microblazeel on microblaze
  microblaze: mmu: Delete flip_um fn prototype
  microblaze: cpu: Remote unused cpu_get_pc
  microblaze: cpu: Remove unused CC_OP enum
  microblaze: cpu: Delete EXCP_NMI
  microblaze: cpu: delete unused cpu_interrupts_enabled
  tcg: Delete unused cpu_pc_from_tb()
  openrisc: cpu: Remove unused cpu_get_pc

Stefan Berger (3):
  tpm: Cast 64bit variables to int when used in DPRINTF
  tpm: Modify DPRINTF to enable -Wformat checking
  tpm: fix coding style

Stefan Weil (1):
  misc: Fix new collection of typos

Thomas Huth (6):
  vmxnet: Remove unused function vmxnet_rx_pkt_get_num_frags()
  pci: Remove unused function ich9_d2pbr_init()
  monitor: Remove unused functions
  usb: Remove unused functions
  util: Remove unused functions
  kvm: Silence warning from valgrind

 backends/hostmem-file.c  |  2 +-
 configure|  4 +-
 cpus.c   |  2 +-
 default-configs/microblazeel-softmmu.mak | 10 +
 dma-helpers.c|  1 -
 docs/atomics.txt |  4 +-
 hmp.h|  1 -
 hw/acpi/pcihp.c  |  1 -
 hw/block/virtio-blk.c|  2 +-
 hw/display/qxl.c |  2 +-
 hw/i386/acpi-build.c |  1 -
 hw/microblaze/boot.c | 13 ---
 hw/mips/mips_fulong2e.c  |  1 +
 hw/mips/mips_malta.c |  1 +
 hw/mips/mips_r4k.c   |  1 +
 hw/misc/edu.c|  2 +-
 hw/net/virtio-net.c  |  2 +-
 hw/net/vmxnet_rx_pkt.c   |  7 
 hw/net/vmxnet_rx_pkt.h   |  9 -
 hw/pci-bridge/i82801b11.c| 21 --
 hw/ppc/spapr.c   |  2 +-
 hw/tpm/tpm_passthrough.c | 16 
 hw/tpm/tpm_tis.c | 26 ++---
 hw/usb/core.c| 41 
 hw/virtio/vhost-user.c   | 22 +--
 include/hw/i386/ich9.h   |  2 -
 include/hw/pci-host/q35.h|  1 -
 include/hw/usb.h |  5 ---
 include/monitor/monitor.h|  1 -
 include/qemu-common.h|  4 --
 include/qemu/bitops.h| 61 ++---
 include/qemu/compatfd.h  |  1 -
 kvm-all.c| 14 +++
 libcacard/cac.c  |  5 ++-
 libcacard/card_7816.c|  4 +-
 libcacard/event.c

[Qemu-devel] [PULL 42/42] openrisc: cpu: Remove unused cpu_get_pc

2015-04-30 Thread Michael Tokarev

From: Peter Crosthwaite 

This function is not used by anything. Remove.

Signed-off-by: Peter Crosthwaite 
Reviewed-by: Alex Bennée 
Signed-off-by: Michael Tokarev 
---
 target-openrisc/cpu.h | 5 -
 1 file changed, 5 deletions(-)

diff --git a/target-openrisc/cpu.h b/target-openrisc/cpu.h
index b25324b..9e23cd0 100644
--- a/target-openrisc/cpu.h
+++ b/target-openrisc/cpu.h
@@ -415,9 +415,4 @@ static inline int cpu_mmu_index(CPUOpenRISCState *env)
 
 #include "exec/exec-all.h"
 
-static inline target_ulong cpu_get_pc(CPUOpenRISCState *env)
-{
-return env->pc;
-}
-
 #endif /* CPU_OPENRISC_H */
-- 
2.1.4

[Qemu-devel] [PATCH] MAINTAINERS: Add qemu-block list where missing

2015-04-30 Thread Kevin Wolf

Signed-off-by: Kevin Wolf 
---
 MAINTAINERS | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 25fd2b5..0b67c48 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -735,12 +735,14 @@ F: backends/rng*.c
 
 nvme
 M: Keith Busch 
+L: qemu-bl...@nongnu.org
 S: Supported
 F: hw/block/nvme*
 F: tests/nvme-test.c
 
 megasas
 M: Hannes Reinecke 
+L: qemu-bl...@nongnu.org
 S: Supported
 F: hw/scsi/megasas.c
 F: hw/scsi/mfi.h
@@ -772,6 +774,7 @@ F: tests/intel-hda-test.c
 
 Block layer core
 M: Kevin Wolf 
+L: qemu-bl...@nongnu.org
 S: Supported
 F: block*
 F: block/
@@ -,6 +1114,7 @@ Block drivers
 -
 VMDK
 M: Fam Zheng 
+L: qemu-bl...@nongnu.org
 S: Supported
 F: block/vmdk.c
 
@@ -1141,6 +1145,7 @@ T: git git://github.com/codyprime/qemu-kvm-jtc.git block
 
 VDI
 M: Stefan Weil 
+L: qemu-bl...@nongnu.org
 S: Maintained
 F: block/vdi.c
 
@@ -1148,6 +1153,7 @@ iSCSI
 M: Ronnie Sahlberg 
 M: Paolo Bonzini 
 M: Peter Lieven 
+L: qemu-bl...@nongnu.org
 S: Supported
 F: block/iscsi.c
 
-- 
1.8.3.1

Re: [Qemu-devel] [PATCH 0/5] MAINTAINERS: split block layer

2015-04-30 Thread Kevin Wolf

Am 29.04.2015 um 16:13 hat Stefan Hajnoczi geschrieben:
> Kevin and I have been maintaining the block layer together.  We take weekly
> turns reviewing/merging patches.  The volume of traffic is so high that we
> struggle to give timely code reviews.
> 
> This series adjusts MAINTAINERS to reflect how we are splitting up the block
> layer.  Once this series is merged we will permanently be on duty in our
> respective areas instead of taking weekly turns with reviews.  Hopefully this
> will improve code review times.

Thanks, applied to the block branch.

Kevin

[Qemu-devel] [PATCH v4 1/7] vmport.c: Fix vmport_cmd_ram_size

2015-04-30 Thread Don Slutz

Based on

https://sites.google.com/site/chitchatvmback/backdoor

and testing on ESXi, this should be in MB not bytes.
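
As a quick illustration of the unit change, here is a standalone sketch
(bytes_to_mb is a hypothetical helper, not a function in vmport.c).
Shifting right by 20 divides by 2^20, and the result fits a 32-bit
return value even for sizes whose byte count would not:

```c
#include <assert.h>
#include <stdint.h>

/* Convert a byte count to whole megabytes, where 1 MB is 2^20 bytes. */
static uint32_t bytes_to_mb(uint64_t bytes)
{
    return (uint32_t)(bytes >> 20);
}

/*
 * What a 32-bit return of the raw byte count would yield: the value is
 * reduced modulo 2^32, so exactly 4 GB of RAM comes back as 0.
 */
static uint32_t bytes_truncated_to_32bit(uint64_t bytes)
{
    return (uint32_t)bytes;
}
```

For a 4 GB guest, bytes_to_mb() gives 4096 while the truncated byte
count gives 0.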

Signed-off-by: Don Slutz 
---
 hw/misc/vmport.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/misc/vmport.c b/hw/misc/vmport.c
index 7fcc00d..6b350ce 100644
--- a/hw/misc/vmport.c
+++ b/hw/misc/vmport.c
@@ -110,7 +110,7 @@ static uint32_t vmport_cmd_ram_size(void *opaque, uint32_t 
addr)
 X86CPU *cpu = X86_CPU(current_cpu);
 
 cpu->env.regs[R_EBX] = 0x1177;
-return ram_size;
+return ram_size >> 20; /* in MB */
 }
 
 static uint32_t vmport_cmd_xtest(void *opaque, uint32_t addr)
-- 
1.8.4

[Qemu-devel] [PATCH v4 2/7] vmport_rpc: Add the object vmport_rpc

2015-04-30 Thread Don Slutz

This is the 1st part of "Add limited support of VMware's hyper-call
rpc".

This patch uses existing infrastructure used by vmmouse.c (provided
by vmport.c) to handle the VMware backdoor command 30.

One of the better on-line references is:

https://sites.google.com/site/chitchatvmback/backdoor

More in next patch.

Signed-off-by: Don Slutz 
---
 hw/i386/pc.c  |   6 +++
 hw/misc/Makefile.objs |   1 +
 hw/misc/vmport_rpc.c  | 126 ++
 3 files changed, 133 insertions(+)
 create mode 100644 hw/misc/vmport_rpc.c

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index a8e6be1..e5b7167 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1471,8 +1471,14 @@ void pc_basic_device_init(ISABus *isa_bus, qemu_irq *gsi,
 i8042 = isa_create_simple(isa_bus, "i8042");
 i8042_setup_a20_line(i8042, &a20_line[0]);
 if (!no_vmport) {
+ISADevice *vmport_rpc;
+
 vmport_init(isa_bus);
 vmmouse = isa_try_create(isa_bus, "vmmouse");
+vmport_rpc = isa_try_create(isa_bus, "vmport_rpc");
+if (vmport_rpc) {
+qdev_init_nofail(DEVICE(vmport_rpc));
+}
 } else {
 vmmouse = NULL;
 }
diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index 4aa76ff..e04c8ac 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -7,6 +7,7 @@ common-obj-$(CONFIG_ISA_TESTDEV) += pc-testdev.o
 common-obj-$(CONFIG_PCI_TESTDEV) += pci-testdev.o
 
 obj-$(CONFIG_VMPORT) += vmport.o
+obj-$(CONFIG_VMPORT) += vmport_rpc.o
 
 # ARM devices
 common-obj-$(CONFIG_PL310) += arm_l2x0.o
diff --git a/hw/misc/vmport_rpc.c b/hw/misc/vmport_rpc.c
new file mode 100644
index 000..b7cd355
--- /dev/null
+++ b/hw/misc/vmport_rpc.c
@@ -0,0 +1,126 @@
+/*
+ * QEMU VMPORT RPC emulation
+ *
+ * Copyright (C) 2015 Verizon Corporation
+ *
+ * This file is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License Version 2 (GPLv2)
+ * as published by the Free Software Foundation.
+ *
+ * This file is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details. .
+ */
+
+/*
+ * One of the better on-line references is:
+ *
+ * https://sites.google.com/site/chitchatvmback/backdoor
+ *
+ * Which points you to:
+ *
+ * http://open-vm-tools.sourceforge.net/
+ *
+ * as a place to get more accurate information by studying.
+ */
+
+#include "hw/hw.h"
+#include "hw/i386/pc.h"
+#include "hw/qdev.h"
+#include "trace.h"
+#include "qmp-commands.h"
+#include "qapi/qmp/qerror.h"
+
+/* #define VMPORT_RPC_DEBUG */
+
+#define TYPE_VMPORT_RPC "vmport_rpc"
+#define VMPORT_RPC(obj) OBJECT_CHECK(VMPortRpcState, (obj), TYPE_VMPORT_RPC)
+
+/* VMPORT RPC Command */
+#define VMPORT_RPC_COMMAND  30
+
+/* The vmport_rpc object. */
+typedef struct VMPortRpcState {
+ISADevice parent_obj;
+
+/* Properties */
+uint64_t reset_time;
+uint64_t build_number_value;
+uint64_t build_number_time;
+
+/* Private data */
+} VMPortRpcState;
+
+typedef struct {
+uint32_t eax;
+uint32_t ebx;
+uint32_t ecx;
+uint32_t edx;
+uint32_t esi;
+uint32_t edi;
+} vregs;
+
+static uint32_t vmport_rpc_ioport_read(void *opaque, uint32_t addr)
+{
+VMPortRpcState *s = opaque;
+union {
+uint32_t data[6];
+vregs regs;
+} ur;
+
+vmmouse_get_data(ur.data);
+
+s->build_number_time++;
+
+vmmouse_set_data(ur.data);
+return ur.data[0];
+}
+
+static void vmport_rpc_reset(DeviceState *d)
+{
+VMPortRpcState *s = VMPORT_RPC(d);
+
+s->reset_time = 14;
+s->build_number_value = 0;
+s->build_number_time = 0;
+}
+
+static void vmport_rpc_realize(DeviceState *dev, Error **errp)
+{
+VMPortRpcState *s = VMPORT_RPC(dev);
+
+vmport_register(VMPORT_RPC_COMMAND, vmport_rpc_ioport_read, s);
+}
+
+static Property vmport_rpc_properties[] = {
+DEFINE_PROP_UINT64("reset-time", VMPortRpcState, reset_time, 14),
+DEFINE_PROP_UINT64("build-number-value", VMPortRpcState,
+   build_number_value, 0),
+DEFINE_PROP_UINT64("build-number-time", VMPortRpcState,
+   build_number_time, 0),
+DEFINE_PROP_END_OF_LIST(),
+};
+
+static void vmport_rpc_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+
+dc->realize = vmport_rpc_realize;
+dc->reset = vmport_rpc_reset;
+dc->props = vmport_rpc_properties;
+}
+
+static const TypeInfo vmport_rpc_info = {
+.name  = TYPE_VMPORT_RPC,
+.parent= TYPE_ISA_DEVICE,
+.instance_size = sizeof(VMPortRpcState),
+.class_init= vmport_rpc_class_init,
+};
+
+static void vmport_rpc_register_types(void)
+{
+type_register_static(&vmport_rpc_info);
+}
+
+type_init(vmport_rpc_register_types)
-- 
1.8.4

[Qemu-devel] [PATCH v4 6/7] vmport: Add VMware all ring hack

2015-04-30 Thread Don Slutz

This is done by adding a new machine property vmware-port-ring3 that
needs to be enabled to have any effect.  It only affects accel=tcg
mode.  It is needed if you want to use VMware tools in accel=tcg
mode.

Signed-off-by: Don Slutz 
(cherry picked from commit 6d99c91fc9ae27b476e89a8cc880b4a46e237536)
---
 hw/i386/pc.c | 28 +++-
 hw/i386/pc_piix.c|  2 +-
 hw/i386/pc_q35.c |  2 +-
 include/hw/i386/pc.h |  6 +-
 target-i386/cpu.c|  4 
 target-i386/cpu.h|  2 ++
 target-i386/seg_helper.c |  6 ++
 7 files changed, 46 insertions(+), 4 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index e5b7167..ec78c76 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1056,7 +1056,9 @@ void pc_hot_add_cpu(const int64_t id, Error **errp)
 pc_new_cpu(current_cpu_model, apic_id, icc_bridge, errp);
 }
 
-void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge)
+/* vmware_port_ring3 true says enable VMware port access in ring3. */
+void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge,
+  bool vmware_port_ring3)
 {
 int i;
 X86CPU *cpu = NULL;
@@ -1087,6 +1089,9 @@ void pc_cpus_init(const char *cpu_model, DeviceState 
*icc_bridge)
 error_report_err(error);
 exit(1);
 }
+if (vmware_port_ring3) {
+cpu->env.hflags2 |= HF2_VMPORT_HACK_MASK;
+}
 }
 
 /* map APIC MMIO area if CPU has APIC */
@@ -1824,6 +1829,21 @@ static bool pc_machine_get_aligned_dimm(Object *obj, 
Error **errp)
 return pcms->enforce_aligned_dimm;
 }
 
+static bool pc_machine_get_vmware_port_ring3(Object *obj, Error **errp)
+{
+PCMachineState *pcms = PC_MACHINE(obj);
+
+return pcms->vmware_port_ring3;
+}
+
+static void pc_machine_set_vmware_port_ring3(Object *obj, bool value,
+ Error **errp)
+{
+PCMachineState *pcms = PC_MACHINE(obj);
+
+pcms->vmware_port_ring3 = value;
+}
+
 static void pc_machine_initfn(Object *obj)
 {
 PCMachineState *pcms = PC_MACHINE(obj);
@@ -1854,6 +1874,12 @@ static void pc_machine_initfn(Object *obj)
 object_property_add_bool(obj, PC_MACHINE_ENFORCE_ALIGNED_DIMM,
  pc_machine_get_aligned_dimm,
  NULL, NULL);
+
+pcms->vmware_port_ring3 = false;
+object_property_add_bool(obj, PC_MACHINE_VMWARE_PORT_RING3,
+ pc_machine_get_vmware_port_ring3,
+ pc_machine_set_vmware_port_ring3,
+ NULL);
 }
 
 static unsigned pc_cpu_index_to_socket_id(unsigned cpu_index)
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 1fe7bfb..4fa21c9 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -147,7 +147,7 @@ static void pc_init1(MachineState *machine,
 object_property_add_child(qdev_get_machine(), "icc-bridge",
   OBJECT(icc_bridge), NULL);
 
-pc_cpus_init(machine->cpu_model, icc_bridge);
+pc_cpus_init(machine->cpu_model, icc_bridge, 
pc_machine->vmware_port_ring3);
 
 if (kvm_enabled() && kvmclock_enabled) {
 kvmclock_create();
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index dcc17c0..1e47b97 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -136,7 +136,7 @@ static void pc_q35_init(MachineState *machine)
 object_property_add_child(qdev_get_machine(), "icc-bridge",
   OBJECT(icc_bridge), NULL);
 
-pc_cpus_init(machine->cpu_model, icc_bridge);
+pc_cpus_init(machine->cpu_model, icc_bridge, 
pc_machine->vmware_port_ring3);
 pc_acpi_init("q35-acpi-dsdt.aml");
 
 kvmclock_create();
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 1b35168..2119d5d 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -40,6 +40,7 @@ struct PCMachineState {
 
 uint64_t max_ram_below_4g;
 OnOffAuto vmport;
+bool vmware_port_ring3;
 bool enforce_aligned_dimm;
 };
 
@@ -48,6 +49,7 @@ struct PCMachineState {
 #define PC_MACHINE_MAX_RAM_BELOW_4G "max-ram-below-4g"
 #define PC_MACHINE_VMPORT   "vmport"
 #define PC_MACHINE_ENFORCE_ALIGNED_DIMM "enforce-aligned-dimm"
+#define PC_MACHINE_VMWARE_PORT_RING3 "vmware-port-ring3"
 
 /**
  * PCMachineClass:
@@ -163,7 +165,9 @@ extern int fd_bootchk;
 void pc_register_ferr_irq(qemu_irq irq);
 void pc_acpi_smi_interrupt(void *opaque, int irq, int level);
 
-void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge);
+/* vmware_port_ring3 true says enable VMware port access in ring3. */
+void pc_cpus_init(const char *cpu_model, DeviceState *icc_bridge,
+  bool vmware_port_ring3);
 void pc_hot_add_cpu(const int64_t id, Error **errp);
 void pc_acpi_init(const char *default_dsdt);
 
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 3305e09..5085f29 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -2592,6 +2592,7 @@ static void x86_cpu_res

[Qemu-devel] [PATCH v4 0/7] Add limited support of VMware's hyper-call rpc

2015-04-30 Thread Don Slutz

Changes v3 to v4:

  Paolo Bonzini on "vmport_rpc: Add QMP access to vmport_rpc"
Does this compile on non-x86 targets?
  Nope.  Fixed.

Changes v2 to v3:

  s/2.3/2.4

Changes v1 to v2:

   Added live migration code.
   Adjust data structures for migration.
   Switch to GHashTable.

  Eric Blake
s/spawened/spawned/
  Done
s/traceing/tracing/
  Done
Change "error_set(errp, ERROR_CLASS_GENERIC_ERROR, " to
"error_setg(errp, "
  Done
Why two commands (inject-vmport-reboot, inject-vmport-halt)?
  Switched to inject-vmport-action.
format=base64 "bug" statements.
  Dropped.

Much more on format=base64:

If there is a bug it is in GLIB.  However the Glib reference manual
refers to RFC 1421 and RFC 2045 and MIME encoding.  Based on all
that (which seems to match:

http://en.wikipedia.org/wiki/Base64

) MIME states that all characters outside the (base64) alphabet are
to be ignored.  Testing shows that g_base64_decode() does this.

The confusion is that most non-MIME uses reject a base64 string that
contains characters outside the alphabet.  I was just following the
other uses of base64 in this file.

DataFormat refers to RFC 3548, which has the info:

"
   Implementations MUST reject the encoding if it contains
   characters outside the base alphabet when interpreting base
   encoded data, unless the specification referring to this document
   explicitly states otherwise.  Such specifications may, as MIME
   does, instead state that characters outside the base encoding
   alphabet should simply be ignored when interpreting data ("be
   liberal in what you accept").
"

So with GLIB going the MIME way, I do not think this is a QEMU bug
(you could consider this a GLIB bug, but the document I found says
that GLIB goes the MIME way and so does not reject anything).
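
For comparison, a caller that wanted the stricter RFC 3548 behaviour
could validate the string itself before handing it to
g_base64_decode().  A minimal sketch, assuming the standard alphabet
with '=' padding; in_base64_alphabet and is_strict_base64 are
hypothetical helpers, not part of this series or of GLib:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if c is one of the 64 characters of the standard alphabet. */
static bool in_base64_alphabet(char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') || c == '+' || c == '/';
}

/*
 * Strict check: the length must be a multiple of 4, at most two '='
 * padding characters may appear and only at the very end, and every
 * other character must belong to the alphabet.
 */
static bool is_strict_base64(const char *s)
{
    size_t len = strlen(s);
    size_t pad = 0;
    size_t i;

    if (len % 4 != 0) {
        return false;
    }
    /* Count trailing padding, capped at two characters. */
    while (pad < 2 && pad < len && s[len - 1 - pad] == '=') {
        pad++;
    }
    for (i = 0; i < len - pad; i++) {
        if (!in_base64_alphabet(s[i])) {
            return false;
        }
    }
    return true;
}
```

With this check, "YmFy" and "YmE=" pass, while "Ym*y" is rejected
instead of having the '*' silently skipped.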

---


The support included is enough to allow VMware tools to install in a
guest and provide guestinfo support.  guestinfo support is provided
by what is known as VMware RPC support.

One of the better on-line references is:

https://sites.google.com/site/chitchatvmback/backdoor

More accurate information can be found by studying:

http://open-vm-tools.sourceforge.net/

With VMware tools installed, you can do:

---
Last login: Fri Jan 30 16:03:08 2015
[root@C63-min-tools ~]# vmtoolsd --cmd "info-get guestinfo.joejoel"
No value found
[root@C63-min-tools ~]# vmtoolsd --cmd "info-set guestinfo.joejoel bar"

[root@C63-min-tools ~]# vmtoolsd --cmd "info-get guestinfo.joejoel"
bar
[root@C63-min-tools ~]# 
---

to access guest info.  QMP access is also provided.

The live migration code is still in progress.

Don Slutz (7):
  vmport.c: Fix vmport_cmd_ram_size
  vmport_rpc: Add the object vmport_rpc
  vmport_rpc: Add limited support of VMware's hyper-call rpc
  vmport_rpc: Add QMP access to vmport_rpc object.
  vmport_rpc: Add migration
  vmport:  Add VMware all ring hack
  MAINTAINERS: add VMware port

 MAINTAINERS  |7 +
 hw/i386/pc.c |   34 +-
 hw/i386/pc_piix.c|2 +-
 hw/i386/pc_q35.c |2 +-
 hw/misc/Makefile.objs|1 +
 hw/misc/vmport.c |2 +-
 hw/misc/vmport_rpc.c | 1442 ++
 include/hw/i386/pc.h |6 +-
 monitor.c|   23 +
 qapi-schema.json |   90 +++
 qmp-commands.hx  |  120 
 target-i386/cpu.c|4 +
 target-i386/cpu.h|2 +
 target-i386/seg_helper.c |6 +
 trace-events |   24 +
 15 files changed, 1760 insertions(+), 5 deletions(-)
 create mode 100644 hw/misc/vmport_rpc.c

-- 
1.8.4

[Qemu-devel] [PATCH v4 5/7] vmport_rpc: Add migration

2015-04-30 Thread Don Slutz

Signed-off-by: Don Slutz 
---
 hw/misc/vmport_rpc.c | 250 +++
 trace-events |   8 +-
 2 files changed, 255 insertions(+), 3 deletions(-)

diff --git a/hw/misc/vmport_rpc.c b/hw/misc/vmport_rpc.c
index 0ba3319..a147561 100644
--- a/hw/misc/vmport_rpc.c
+++ b/hw/misc/vmport_rpc.c
@@ -171,6 +171,14 @@ typedef struct VMPortRpcState {
 uint32_t open_cookie;
 channel_t chans[GUESTMSG_MAX_CHANNEL];
 GHashTable *guestinfo;
+/* Temporary cache for migration purposes */
+int32_t mig_chan_num;
+int32_t mig_bucket_num;
+uint32_t mig_guestinfo_size;
+uint32_t mig_guestinfo_off;
+uint8_t *mig_guestinfo_buf;
+channel_control_t *mig_chans;
+bucket_control_t *mig_buckets;
 #ifdef VMPORT_RPC_DEBUG
 unsigned int end;
 unsigned int last;
@@ -1168,6 +1176,247 @@ static Property vmport_rpc_properties[] = {
 DEFINE_PROP_END_OF_LIST(),
 };
 
+static const VMStateDescription vmstate_vmport_rpc_chan = {
+.name = "vmport_rpc/chan",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField [])
+{
+VMSTATE_UINT64(active_time, channel_control_t),
+VMSTATE_UINT32(chan_id, channel_control_t),
+VMSTATE_UINT32(cookie, channel_control_t),
+VMSTATE_UINT32(proto_num, channel_control_t),
+VMSTATE_UINT16(send_len, channel_control_t),
+VMSTATE_UINT16(send_idx, channel_control_t),
+VMSTATE_UINT16(send_buf_max, channel_control_t),
+VMSTATE_UINT8(recv_read, channel_control_t),
+VMSTATE_UINT8(recv_write, channel_control_t),
+VMSTATE_END_OF_LIST()
+},
+};
+
+static const VMStateDescription vmstate_vmport_rpc_bucket = {
+.name = "vmport_rpc/bucket",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField [])
+{
+VMSTATE_UINT16(recv_len, bucket_control_t),
+VMSTATE_UINT16(recv_idx, bucket_control_t),
+VMSTATE_UINT16(recv_buf_max, bucket_control_t),
+VMSTATE_END_OF_LIST()
+},
+};
+
+static void vmport_rpc_size_mig_guestinfo(gpointer key, gpointer value,
+  gpointer opaque)
+{
+VMPortRpcState *s = opaque;
+unsigned int key_len = strlen(key) + 1;
+guestinfo_t *gi = value;
+
+s->mig_guestinfo_size += 1 + key_len + 4 + gi->val_max;
+}
+
+static void vmport_rpc_fill_mig_guestinfo(gpointer key, gpointer value,
+  gpointer opaque)
+{
+VMPortRpcState *s = opaque;
+unsigned int key_len = strlen(key) + 1;
+guestinfo_t *gi = value;
+
+assert(gi->val_len <= gi->val_max);
+trace_vmport_rpc_fill_mig_guestinfo(key_len, key_len, key, gi->val_len,
+gi->val_len, gi->val_data);
+s->mig_guestinfo_buf[s->mig_guestinfo_off++] = key_len;
+memcpy(s->mig_guestinfo_buf + s->mig_guestinfo_off, key, key_len);
+s->mig_guestinfo_off += key_len;
+s->mig_guestinfo_buf[s->mig_guestinfo_off++] = gi->val_len >> 8;
+s->mig_guestinfo_buf[s->mig_guestinfo_off++] = gi->val_len;
+s->mig_guestinfo_buf[s->mig_guestinfo_off++] = gi->val_max >> 8;
+s->mig_guestinfo_buf[s->mig_guestinfo_off++] = gi->val_max;
+memcpy(s->mig_guestinfo_buf + s->mig_guestinfo_off, gi->val_data,
+   gi->val_max);
+s->mig_guestinfo_off += gi->val_max;
+}
+
+static int vmport_rpc_pre_load(void *opaque)
+{
+VMPortRpcState *s = opaque;
+
+g_free(s->mig_guestinfo_buf);
+s->mig_guestinfo_buf = NULL;
+s->mig_guestinfo_size = 0;
+s->mig_guestinfo_off = 0;
+g_free(s->mig_chans);
+s->mig_chans = NULL;
+s->mig_chan_num = 0;
+g_free(s->mig_buckets);
+s->mig_buckets = NULL;
+s->mig_bucket_num = 0;
+
+return 0;
+}
+
+static void vmport_rpc_pre_save(void *opaque)
+{
+VMPortRpcState *s = opaque;
+unsigned int i;
+unsigned int mig_chan_idx = 0;
+unsigned int mig_bucket_idx = 0;
+
+(void)vmport_rpc_pre_load(opaque);
+for (i = 0; i < GUESTMSG_MAX_CHANNEL; ++i) {
+channel_t *c = &s->chans[i];
+
+if (c->ctl.proto_num) {
+unsigned int j;
+
+s->mig_chan_num++;
+for (j = 0; j < MAX_BKTS; ++j) {
+bucket_t *b = &c->recv_bkt[j];
+
+s->mig_bucket_num++;
+s->mig_guestinfo_size +=
+(b->ctl.recv_buf_max + 1) * CHAR_PER_CALL;
+}
+s->mig_guestinfo_size += (c->ctl.send_buf_max + 1) * CHAR_PER_CALL;
+}
+}
+g_hash_table_foreach(s->guestinfo, vmport_rpc_size_mig_guestinfo, s);
+s->mig_guestinfo_size++;
+s->mig_guestinfo_buf = g_malloc(s->mig_guestinfo_size);
+s->mig_chans = g_malloc(s->mig_chan_num * sizeof(channel_control_t));
+s->mig_buckets = g_malloc(s->mig_bucket_num * sizeof(bucket_control_t));
+
+for (i = 0; i < GUESTMSG_MAX_CHANNEL; ++i) {
+channel_t *c = &s->chans[i];
+
+if (c->ctl.proto_num) {
+

Re: [Qemu-devel] [RFC PATCH 02/15] qdev: store DeviceState's canonical path to use when unparenting

2015-04-30 Thread Paolo Bonzini



On 29/04/2015 21:20, Michael Roth wrote:
> If the parent is finalized as a result of object_unparent(), it
> will still be attached to the composition tree at the time any
> children are unparented as a result of that same call to
> object_unparent(). However, in some cases, object_unparent()
> will complete without finalizing the parent device, due to
> lingering references that won't be released till some time later.
> One such example is if the parent has MemoryRegion children (which
> take a ref on their parent), who in turn have AddressSpace's (which
> take a ref on their regions), since those AddressSpaces get cleaned
> up asynchronously by the RCU thread.
> 
> In this case qdev:device_unparent() may be called for a child Device
> that no longer has a path to the root/machine container, causing
> object_get_canonical_path() to assert.

This doesn't seem right.  Unparent callbacks are _not_ called when you 
finalize, they are called in post-order as soon as you unplug a device 
(the call tree is object_unparent ==> device_unparent(parent) ==> 
bus_unparent(parent->bus) ==> device_unparent(parent->bus->child[0]) 
and so on).

DEVICE_DELETED is called after a device's children have been 
unparented.  It could be called after a bus is dead though.  Could it 
be that the patch you want is simply this:

diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 6e6a65d..46019c4 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -1241,11 +1241,6 @@ static void device_unparent(Object *obj)
 bus = QLIST_FIRST(&dev->child_bus);
 object_unparent(OBJECT(bus));
 }
-if (dev->parent_bus) {
-bus_remove_child(dev->parent_bus, dev);
-object_unref(OBJECT(dev->parent_bus));
-dev->parent_bus = NULL;
-}
 
 /* Only send event if the device had been completely realized */
 if (dev->pending_deleted_event) {
@@ -1254,6 +1249,12 @@ static void device_unparent(Object *obj)
 qapi_event_send_device_deleted(!!dev->id, dev->id, path, &error_abort);
 g_free(path);
 }
+
+if (dev->parent_bus) {
+bus_remove_child(dev->parent_bus, dev);
+object_unref(OBJECT(dev->parent_bus));
+dev->parent_bus = NULL;
+}
 }
 
 static void device_class_init(ObjectClass *class, void *data)

?

Paolo

[Qemu-devel] [PATCH v4 4/7] vmport_rpc: Add QMP access to vmport_rpc object.

2015-04-30 Thread Don Slutz

This adds one new inject command:

inject-vmport-action

And three guest info commands:

vmport-guestinfo-set
vmport-guestinfo-get
query-vmport-guestinfo

More details in qmp-commands.hx

Signed-off-by: Don Slutz 
---
 hw/misc/vmport_rpc.c | 269 +++
 monitor.c|  23 +
 qapi-schema.json |  90 +
 qmp-commands.hx  | 120 +++
 4 files changed, 502 insertions(+)

diff --git a/hw/misc/vmport_rpc.c b/hw/misc/vmport_rpc.c
index 927c0bf..0ba3319 100644
--- a/hw/misc/vmport_rpc.c
+++ b/hw/misc/vmport_rpc.c
@@ -215,6 +215,56 @@ typedef struct {
 uint32_t edi;
 } vregs;
 
+/*
+ * Run func() for every VMPortRpc device, traverse the tree for
+ * everything else.  Note: This routine expects that opaque is a
+ * VMPortRpcFind pointer and not NULL.
+ */
+static int find_VMPortRpc_device(Object *obj, void *opaque)
+{
+VMPortRpcFind *find = opaque;
+Object *dev;
+VMPortRpcState *s;
+
+if (find->found) {
+return 0;
+}
+dev = object_dynamic_cast(obj, TYPE_VMPORT_RPC);
+s = (VMPortRpcState *)dev;
+
+if (!s) {
+/* Container, traverse it for children */
+return object_child_foreach(obj, find_VMPortRpc_device, opaque);
+}
+
+find->found++;
+find->rc = find->func(s, find->arg);
+
+return 0;
+}
+
+/*
+ * Loop through all dynamically created VMPortRpc devices and call
+ * func() for each instance.
+ */
+static int foreach_dynamic_vmport_rpc_device(FindVMPortRpcDeviceFunc *func,
+ void *arg)
+{
+VMPortRpcFind find = {
+.func = func,
+.arg = arg,
+};
+
+/* Loop through all VMPortRpc devices that were spawned outside
+ * the machine */
+find_VMPortRpc_device(qdev_get_machine(), &find);
+if (find.found) {
+return find.rc;
+} else {
+return VMPORT_DEVICE_NOT_FOUND;
+}
+}
+
 #ifdef VMPORT_RPC_DEBUG
 /*
  * Add helper function for tracing.  This routine will convert
@@ -464,6 +514,23 @@ static int get_guestinfo(VMPortRpcState *s,
 return GUESTINFO_NOTFOUND;
 }
 
+static int get_qmp_guestinfo(VMPortRpcState *s,
+ unsigned int a_key_len, char *a_info_key,
+ unsigned int *a_value_len, void **a_value_data)
+{
+gpointer key = g_strndup(a_info_key, a_key_len);
+guestinfo_t *gi = (guestinfo_t *)g_hash_table_lookup(s->guestinfo, key);
+
+g_free(key);
+if (gi) {
+*a_value_len = gi->val_len;
+*a_value_data = gi->val_data;
+return 0;
+}
+
+return GUESTINFO_NOTFOUND;
+}
+
 static int set_guestinfo(VMPortRpcState *s, int a_key_len,
  unsigned int a_val_len, char *a_info_key, char *val)
 {
@@ -851,6 +918,208 @@ static uint32_t vmport_rpc_ioport_read(void *opaque, 
uint32_t addr)
 return ur.data[0];
 }
 
+static int vmport_rpc_find_send(VMPortRpcState *s, void *arg)
+{
+return vmport_rpc_ctrl_send(s, arg);
+}
+
+static void convert_local_rc(Error **errp, int rc)
+{
+switch (rc) {
+case 0:
+break;
+case VMPORT_DEVICE_NOT_FOUND:
+error_set(errp, QERR_DEVICE_NOT_FOUND, TYPE_VMPORT_RPC);
+break;
+case SEND_NOT_OPEN:
+error_setg(errp, "VMWare rpc not open");
+break;
+case SEND_SKIPPED:
+error_setg(errp, "VMWare rpc send skipped");
+break;
+case SEND_TRUCATED:
+error_setg(errp, "VMWare rpc send trucated");
+break;
+case SEND_NO_MEMORY:
+error_setg(errp, "VMWare rpc send out of memory");
+break;
+case GUESTINFO_NOTFOUND:
+error_setg(errp, "VMWare guestinfo not found");
+break;
+case GUESTINFO_VALTOOLONG:
+error_setg(errp, "VMWare guestinfo value too long");
+break;
+case GUESTINFO_KEYTOOLONG:
+error_setg(errp, "VMWare guestinfo key too long");
+break;
+case GUESTINFO_TOOMANYKEYS:
+error_setg(errp, "VMWare guestinfo too many keys");
+break;
+case GUESTINFO_NO_MEMORY:
+error_setg(errp, "VMWare guestinfo out of memory");
+break;
+default:
+error_setg(errp, "VMWare rpc send rc=%d unknown", rc);
+break;
+}
+}
+
+void qmp_inject_vmport_action(enum VmportAction action, Error **errp)
+{
+int rc;
+
+switch (action) {
+case VMPORT_ACTION_REBOOT:
+rc = foreach_dynamic_vmport_rpc_device(vmport_rpc_find_send,
+   (void *)"OS_Reboot");
+break;
+case VMPORT_ACTION_HALT:
+rc = foreach_dynamic_vmport_rpc_device(vmport_rpc_find_send,
+   (void *)"OS_Halt");
+break;
+case VMPORT_ACTION_MAX:
+assert(action != VMPORT_ACTION_MAX);
+rc = 0; /* Should be impossible to get here. */
+break;
+}
+convert_local_rc(errp, rc);
+}
+
+typedef struct keyValue {
+voi

[Qemu-devel] [PATCH v4 3/7] vmport_rpc: Add limited support of VMware's hyper-call rpc

2015-04-30 Thread Don Slutz

The support included is enough to allow VMware tools to install in a
guest and provide guestinfo support.  guestinfo support is provided
by what is known as VMware RPC support.

If the guest is running VMware tools, then the "build version" of
the tools is also available via the property build-number-value of
the vmport_rpc object.  The build-number-time property is changed
every time the VMware tools running in the guest sends the
"build version" via the rpc.

The property reset-time controls how often the "build version" is
requested, in seconds minus 1.  The minus 1 handles the 0 case, i.e.
the fastest rate that can be selected is once per second.  The
default is 4 times a minute.
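
A minimal sketch of the "seconds minus 1" encoding described above.  The
helper name is hypothetical and does not appear in the patch; under this
encoding, "4 times a minute" (every 15 seconds) would correspond to a
stored value of 14.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the property stores "seconds minus 1", so a
 * stored value of 0 means one request per second.
 */
static uint32_t reset_time_to_seconds(uint32_t reset_time)
{
    assert(reset_time < UINT32_MAX);    /* avoid wrap-around on + 1 */
    return reset_time + 1;
}
```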

The VMware RPC support includes the notion of channels that are
opened, active and closed.  All RPC messages sent via a channel
start with normal ASCII text.  A message sometimes also includes
binary data.

Currently there are 2 protocols defined for VMware RPC.  They
determine the direction for data flow, guest to tool stack or tool
stack to guest.

There is no provided interrupt for VMware RPC.

Defining VMPORT_RPC_DEBUG enables higher-level trace calls that are
simpler to understand.

Signed-off-by: Don Slutz 
---
 hw/misc/vmport_rpc.c | 799 ++-
 trace-events |  22 ++
 2 files changed, 820 insertions(+), 1 deletion(-)

diff --git a/hw/misc/vmport_rpc.c b/hw/misc/vmport_rpc.c
index b7cd355..927c0bf 100644
--- a/hw/misc/vmport_rpc.c
+++ b/hw/misc/vmport_rpc.c
@@ -40,6 +40,123 @@
 /* VMPORT RPC Command */
 #define VMPORT_RPC_COMMAND  30
 
+/* Limits on amount of non guest memory to use */
+#define MAX_KEY_LEN  128
+#define MIN_VAL_LEN  64
+#define MAX_VAL_LEN  8192
+#define MAX_NUM_KEY  256
+#define MAX_BKTS 4
+/* Max number of channels. */
+#define GUESTMSG_MAX_CHANNEL 8
+
+/*
+ * All of VMware's rpc is based on 32bit registers.  So this is the
+ * number of bytes in ebx.
+ */
+#define CHAR_PER_CALL   sizeof(uint32_t)
+/* Room for basic commands */
+#define EXTRA_SEND 22
+/* Status code and NULL */
+#define EXTRA_RECV 2
+#define MAX_SEND_BUF DIV_ROUND_UP(EXTRA_SEND + MAX_KEY_LEN + MAX_VAL_LEN, \
+  CHAR_PER_CALL)
+#define MAX_RECV_BUF DIV_ROUND_UP(EXTRA_RECV + MAX_VAL_LEN, CHAR_PER_CALL)
+#define MIN_SEND_BUF DIV_ROUND_UP(EXTRA_SEND + MAX_KEY_LEN + MIN_VAL_LEN, \
+  CHAR_PER_CALL)
+
+/* Reply statuses */
+/*  The basic request succeeded */
+#define MESSAGE_STATUS_SUCCESS  0x0001
+/*  vmware has a message available for its party */
+#define MESSAGE_STATUS_DORECV   0x0002
+/*  The channel has been closed */
+#define MESSAGE_STATUS_CLOSED   0x0004
+/*  vmware removed the message before the party fetched it */
+#define MESSAGE_STATUS_UNSENT   0x0008
+/*  A checkpoint occurred */
+#define MESSAGE_STATUS_CPT  0x0010
+/*  An underlying device is powering off */
+#define MESSAGE_STATUS_POWEROFF 0x0020
+/*  vmware has detected a timeout on the channel */
+#define MESSAGE_STATUS_TIMEOUT  0x0040
+/*  vmware supports high-bandwidth for sending and receiving the payload */
+#define MESSAGE_STATUS_HB   0x0080
+
+/* Max number of channels. */
+#define GUESTMSG_MAX_CHANNEL 8
+
+/* Flags to open a channel. */
+#define GUESTMSG_FLAG_COOKIE 0x8000
+
+/* Data to guest */
+#define VMWARE_PROTO_TO_GUEST   0x4f4c4354
+/* Data from guest */
+#define VMWARE_PROTO_FROM_GUEST 0x49435052
+
+/*
+ * Error return values used only in this file.  The routine
+ * convert_local_rc() is used to convert these to an Error
+ * object.
+ */
+#define VMPORT_DEVICE_NOT_FOUND -1
+#define SEND_NOT_OPEN   -2
+#define SEND_SKIPPED-3
+#define SEND_TRUCATED   -4
+#define SEND_NO_MEMORY  -5
+#define GUESTINFO_NOTFOUND  -6
+#define GUESTINFO_VALTOOLONG-7
+#define GUESTINFO_KEYTOOLONG-8
+#define GUESTINFO_TOOMANYKEYS   -9
+#define GUESTINFO_NO_MEMORY -10
+
+
+/* The VMware RPC guest info storage . */
+typedef struct {
+char *val_data;
+uint16_t val_len;
+uint16_t val_max;
+} guestinfo_t;
+
+/* The VMware RPC bucket control. */
+typedef struct {
+uint16_t recv_len;
+uint16_t recv_idx;
+uint16_t recv_buf_max;
+} bucket_control_t;
+
+/* The VMware RPC bucket info. */
+typedef struct {
+union {
+uint32_t *words;
+char *bytes;
+} recv;
+bucket_control_t ctl;
+} bucket_t;
+
+
+/* The VMware RPC channel control. */
+typedef struct {
+uint64_t active_time;
+uint32_t chan_id;
+uint32_t cookie;
+uint32_t proto_num;
+uint16_t send_len;
+uint16_t send_idx;
+uint16_t send_buf_max;
+uint8_t recv_read;
+uint8_t recv_write;
+} channel_control_t;
+
+/* The VMware RPC channel info. */
+typedef struct {
+union {
+uint32_t *words;
+char *bytes;
+} send;
+channel_control_t ctl;
+bucket_t recv_bkt[MAX_BKTS];
+} channel_t;
+
 /* The vmport_rpc object. */
 type

Re: [Qemu-devel] [PATCH] Enable NVMe start controller for Windows guest.

2015-04-30 Thread Kevin Wolf

[Cc: qemu-block]

Am 24.04.2015 um 21:19 hat Keith Busch geschrieben:
> On Fri, 24 Apr 2015, Daniel Stekloff wrote:
> >Windows seems to send two separate calls to NVMe controller configuration.
> >The first sends configuration info and the second the enable bit. I couldn't
> >enable the Windows 8.1 in-box NVMe driver with base Qemu. I made the
> >following change to store the configuration data and then handle enable and
> >NVMe driver works on Windows 8.1.
> 
> Hm, Microsoft's driver must be issuing MMIO reads to mask in the enable
> bit rather than keep the state known. Sounds odd, but thanks for the fix.
> 
> Acked-by: Keith Busch 

Thanks, applied to the block branch.

Kevin

Re: [Qemu-devel] [PATCH v4 0/3] block: Fix unaligned bdrv_aio_write_zeroes

2015-04-30 Thread Kevin Wolf

Am 27.04.2015 um 15:18 hat Fam Zheng geschrieben:
> An unaligned zero write causes NULL dereferencing in bdrv_co_do_pwritev. That
> path is reachable from bdrv_co_write_zeroes and bdrv_aio_write_zeroes.
> 
> You can easily trigger it through the former with qemu-io, as in the test case
> added by 61815d6e0aa. For bdrv_aio_write_zeroes, in common cases there's always a
> format driver (which uses 512 alignment), so it would be much rarer to have
> unaligned requests (only concerning top level here, when the request goes down
> to bs->file, where for example the alignment is 4k, it would then be calling
> bdrv_co_write_zeroes because it's in a coroutine).
> 
> fc3959e4669a1c fixed bdrv_co_write_zeroes but not bdrv_aio_write_zeroes.  The
> latter is the one actually used by the device model. Revert the previous fix
> and do it in bdrv_co_do_pwritev, to cover both paths.

Hi Fam,

Stefan's patch to split out block/io.c conflicts with this. Can you
please rebase?

Kevin

[Qemu-devel] [PATCH v4 7/7] MAINTAINERS: add VMware port

2015-04-30 Thread Don Slutz

Signed-off-by: Don Slutz 
---
 MAINTAINERS | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index b5ab755..4bbda42 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -755,6 +755,13 @@ S: Maintained
 F: hw/net/vmxnet*
 F: hw/scsi/vmw_pvscsi*
 
+VMware port
+M: Don Slutz 
+S: Maintained
+F: hw/misc/vmport.c
+F: hw/input/vmmouse.c
+F: hw/misc/vmport_rpc.c
+
 Subsystems
 --
 Audio
-- 
1.8.4

Re: [Qemu-devel] [RESEND PATCH v4 2/4] apic: convert ->busdev.qdev casts to DEVICE() casts

2015-04-30 Thread Paolo Bonzini



On 30/04/2015 07:43, Andreas Färber wrote:
> Am 30.04.2015 um 03:33 schrieb Zhu Guihua:
>> > Use DEVICE() casts to avoid accessing ICCDevice's qdev field
>> > directly.
>> > 
>> > Signed-off-by: Zhu Guihua 
>> > ---
>> >  hw/intc/apic.c | 6 +++---
>> >  1 file changed, 3 insertions(+), 3 deletions(-)
>> > 
>> > diff --git a/hw/intc/apic.c b/hw/intc/apic.c
>> > index 0f97b47..00ae0ec 100644
>> > --- a/hw/intc/apic.c
>> > +++ b/hw/intc/apic.c
>> > @@ -376,7 +376,7 @@ static void apic_update_irq(APICCommonState *s)
>> >  cpu_interrupt(cpu, CPU_INTERRUPT_POLL);
>> >  } else if (apic_irq_pending(s) > 0) {
>> >  cpu_interrupt(cpu, CPU_INTERRUPT_HARD);
>> > -} else if (!apic_accept_pic_intr(&s->busdev.qdev) || !pic_get_output(isa_pic)) {
>> > +} else if (!apic_accept_pic_intr(DEVICE(s)) || !pic_get_output(isa_pic)) {
>> >  cpu_reset_interrupt(cpu, CPU_INTERRUPT_HARD);
>> >  }
>> >  }
>> > @@ -549,10 +549,10 @@ static void apic_deliver(DeviceState *dev, uint8_t dest, uint8_t dest_mode,
>> >  
>> >  static bool apic_check_pic(APICCommonState *s)
>> >  {
>> > -if (!apic_accept_pic_intr(&s->busdev.qdev) || !pic_get_output(isa_pic)) {
>> > +if (!apic_accept_pic_intr(DEVICE(s)) || !pic_get_output(isa_pic)) {
>> >  return false;
>> >  }
>> > -apic_deliver_pic_intr(&s->busdev.qdev, 1);
>> > +apic_deliver_pic_intr(DEVICE(s), 1);
> Please use a local DeviceState *dev = DEVICE(s); variable instead of
> doing the cast inline twice.
> 
> Please also check the hunk above - "irq" indicates to me we need to keep
> QOM impact low.

Yes, it's probably better to use a C cast here, so "DeviceState *dev =
(DeviceState *)s;".

Paolo

> Otherwise patch looks good, thanks for splitting out.
> Not sure who is going to handle this - CC'ing Paolo.
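
The suggestion above (one local variable, initialised with a plain C
cast, instead of two inline casts) can be sketched outside QEMU as
follows.  All Toy* names are hypothetical stand-ins; the sketch assumes
the parent structure is the first member, which is what makes the plain
cast valid.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ToyDevice {
    int id;
} ToyDevice;

typedef struct ToyApic {
    ToyDevice parent;   /* must stay the first member for the cast below */
    int pending;
} ToyApic;

static int toy_deliver_calls;

static int toy_accept(ToyDevice *dev)
{
    return dev->id != 0;
}

static void toy_deliver(ToyDevice *dev, int level)
{
    (void)dev;
    (void)level;
    toy_deliver_calls++;
}

static int toy_check(ToyApic *s)
{
    ToyDevice *dev = (ToyDevice *)s;    /* plain C cast, done once */

    assert(offsetof(ToyApic, parent) == 0);
    if (!toy_accept(dev)) {
        return 0;
    }
    toy_deliver(dev, 1);
    return 1;
}
```

The plain cast skips any run-time type check, which is the point on a
hot path such as interrupt delivery; the price is that the compiler no
longer catches a wrong argument type.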

Re: [Qemu-devel] [PATCH 1/7] virtio-net: move qdev properties into virtio-net.c

2015-04-30 Thread Cornelia Huck

On Wed, 29 Apr 2015 23:24:03 +0800
Shannon Zhao  wrote:

> As only one place in virtio-net.c uses DEFINE_VIRTIO_NET_FEATURES,
> there is no need to expose it. Inline it into virtio-net.c to avoid
> incorrect use.
> 
> Signed-off-by: Shannon Zhao 
> Signed-off-by: Shannon Zhao 
> ---
>  hw/net/virtio-net.c| 42 +-
>  include/hw/virtio/virtio-net.h | 24 
>  2 files changed, 41 insertions(+), 25 deletions(-)
> 
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 5c38ac2..6ed2e78 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -1725,7 +1725,47 @@ static void virtio_net_instance_init(Object *obj)
>  }
> 
>  static Property virtio_net_properties[] = {
> -DEFINE_VIRTIO_NET_FEATURES(VirtIONet, host_features),
> +DEFINE_PROP_BIT("any_layout", VirtIONet, host_features, VIRTIO_F_ANY_LAYOUT,
> +true),

Hm, the indentation after the line break looks a bit off here (same for
some of the feature bits further down).

> +DEFINE_PROP_BIT("csum", VirtIONet, host_features, VIRTIO_NET_F_CSUM, true),
> +DEFINE_PROP_BIT("guest_csum", VirtIONet, host_features,
> + VIRTIO_NET_F_GUEST_CSUM, true),
> +DEFINE_PROP_BIT("gso", VirtIONet, host_features, VIRTIO_NET_F_GSO, true),
> +DEFINE_PROP_BIT("guest_tso4", VirtIONet, host_features,
> + VIRTIO_NET_F_GUEST_TSO4, true),
> +DEFINE_PROP_BIT("guest_tso6", VirtIONet, host_features,
> + VIRTIO_NET_F_GUEST_TSO6, true),
> +DEFINE_PROP_BIT("guest_ecn", VirtIONet, host_features,
> +VIRTIO_NET_F_GUEST_ECN, true),
> +DEFINE_PROP_BIT("guest_ufo", VirtIONet, host_features,
> +VIRTIO_NET_F_GUEST_UFO, true),
> +DEFINE_PROP_BIT("guest_announce", VirtIONet, host_features,
> +  VIRTIO_NET_F_GUEST_ANNOUNCE, true),
> +DEFINE_PROP_BIT("host_tso4", VirtIONet, host_features,
> +VIRTIO_NET_F_HOST_TSO4, true),
> +DEFINE_PROP_BIT("host_tso6", VirtIONet, host_features,
> +VIRTIO_NET_F_HOST_TSO6, true),
> +DEFINE_PROP_BIT("host_ecn", VirtIONet, host_features, VIRTIO_NET_F_HOST_ECN,
> +  true),
> +DEFINE_PROP_BIT("host_ufo", VirtIONet, host_features, VIRTIO_NET_F_HOST_UFO,
> +  true),
> +DEFINE_PROP_BIT("mrg_rxbuf", VirtIONet, host_features,
> +VIRTIO_NET_F_MRG_RXBUF, true),
> +DEFINE_PROP_BIT("status", VirtIONet, host_features, VIRTIO_NET_F_STATUS,
> +true),
> +DEFINE_PROP_BIT("ctrl_vq", VirtIONet, host_features, VIRTIO_NET_F_CTRL_VQ,
> + true),
> +DEFINE_PROP_BIT("ctrl_rx", VirtIONet, host_features, VIRTIO_NET_F_CTRL_RX,
> + true),
> +DEFINE_PROP_BIT("ctrl_vlan", VirtIONet, host_features,
> +VIRTIO_NET_F_CTRL_VLAN, true),
> +DEFINE_PROP_BIT("ctrl_rx_extra", VirtIONet, host_features,
> + VIRTIO_NET_F_CTRL_RX_EXTRA, true),
> +DEFINE_PROP_BIT("ctrl_mac_addr", VirtIONet, host_features,
> + VIRTIO_NET_F_CTRL_MAC_ADDR, true),
> +DEFINE_PROP_BIT("ctrl_guest_offloads", VirtIONet, host_features,
> +VIRTIO_NET_F_CTRL_GUEST_OFFLOADS, true),
> +DEFINE_PROP_BIT("mq", VirtIONet, host_features, VIRTIO_NET_F_MQ, false),
>  DEFINE_NIC_PROPERTIES(VirtIONet, nic_conf),
>  DEFINE_PROP_UINT32("x-txtimer", VirtIONet, net_conf.txtimer,
> TX_TIMER_INTERVAL),

Re: [Qemu-devel] [PATCH 0/7] virtio: inline private qdev properties into virtio devices

2015-04-30 Thread Cornelia Huck

On Wed, 29 Apr 2015 23:24:02 +0800
Shannon Zhao  wrote:

> The private qdev properties of virtio devices are only used by
> themselves. As Peter suggested and like what virtio-blk has done, we
> should move the private qdev properties into the devices and not expose
> them, to avoid incorrect use.
> 
> This patchset is based on following patchset which moves host features
> to backends.
> http://lists.gnu.org/archive/html/qemu-devel/2015-04/msg03785.html
> 
> Shannon Zhao (7):
>   virtio-net: move qdev properties into virtio-net.c
>   virtio-net.h: Remove unsed DEFINE_VIRTIO_NET_PROPERTIES
>   virtio-scsi: move qdev properties into virtio-scsi.c
>   virtio-rng: move qdev properties into virtio-rng.c
>   virtio-serial-bus: move qdev properties into virtio-serial-bus.c
>   virtio-9p-device: move qdev properties into virtio-9p-device.c
>   vhost-scsi: move qdev properties into vhost-scsi.c
> 
>  hw/9pfs/virtio-9p-device.c|  3 ++-
>  hw/9pfs/virtio-9p.h   |  4 
>  hw/char/virtio-serial-bus.c   |  3 ++-
>  hw/net/virtio-net.c   | 42 ++-
>  hw/scsi/vhost-scsi.c  |  9 -
>  hw/scsi/virtio-scsi.c | 13 ++--
>  hw/virtio/virtio-rng.c|  8 +++-
>  include/hw/virtio/vhost-scsi.h|  9 -
>  include/hw/virtio/virtio-net.h| 31 +
>  include/hw/virtio/virtio-rng.h| 10 --
>  include/hw/virtio/virtio-scsi.h   | 13 
>  include/hw/virtio/virtio-serial.h |  3 ---
>  12 files changed, 72 insertions(+), 76 deletions(-)
> 

Other than my minor nit regarding indentation for virtio-net:

Acked-by: Cornelia Huck

Re: [Qemu-devel] [PATCH v4 4/7] vmport_rpc: Add QMP access to vmport_rpc object.

2015-04-30 Thread Paolo Bonzini



On 30/04/2015 15:32, Don Slutz wrote:
> +#ifdef VMPORT_SHORT
> +info->value->key = g_strdup(ckey);
> +#else
> +info->value->key = g_strdup_printf("guestinfo.%s", ckey);
> +#endif

What is VMPORT_SHORT for?

Paolo

Re: [Qemu-devel] [PATCH v4 6/7] vmport: Add VMware all ring hack

2015-04-30 Thread Paolo Bonzini



On 30/04/2015 15:32, Don Slutz wrote:
> This is done by adding a new machine property vmware-port-ring3 that
> needs to be enabled to have any effect.  It only affects accel=tcg
> mode.  It is needed if you want to use VMware tools in accel=tcg
> mode.

How does it work on KVM or Xen?

> 
> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
> index 3305e09..5085f29 100644
> --- a/target-i386/cpu.c
> +++ b/target-i386/cpu.c
> @@ -2592,6 +2592,7 @@ static void x86_cpu_reset(CPUState *s)
>  X86CPUClass *xcc = X86_CPU_GET_CLASS(cpu);
>  CPUX86State *env = &cpu->env;
>  int i;
> +bool save_vmware_port_ring3 = env->hflags2 & HF2_VMPORT_HACK_MASK;
>  
>  xcc->parent_reset(s);
>  
> @@ -2607,6 +2608,9 @@ static void x86_cpu_reset(CPUState *s)
>  env->hflags |= HF_SOFTMMU_MASK;
>  #endif
>  env->hflags2 |= HF2_GIF_MASK;
> +if (save_vmware_port_ring3) {
> +env->hflags2 |= HF2_VMPORT_HACK_MASK;
> +}
>  
>  cpu_x86_update_cr0(env, 0x6010);
>  env->a20_mask = ~0x0;

The save/restore suggests that you want a new bool-typed field in
CPUX86State instead of a bit in env->hflags2.

Paolo
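
A rough sketch of the idea behind this suggestion, assuming reset zeroes
only a leading portion of the CPU state: a field placed after that
portion survives reset without any explicit save and restore.  The
ToyCpuState layout and the field name vmware_port_ring3 are hypothetical
and are not taken from the patch or from CPUX86State.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct ToyCpuState {
    uint32_t hflags2;           /* cleared on every reset */
    uint32_t regs[4];           /* cleared on every reset */
    /* Fields from here on are not touched by reset. */
    bool vmware_port_ring3;     /* hypothetical configuration field */
} ToyCpuState;

/* Zero everything that precedes the first field meant to survive reset. */
static void toy_cpu_reset(ToyCpuState *env)
{
    assert(env != NULL);
    memset(env, 0, offsetof(ToyCpuState, vmware_port_ring3));
}
```

Keeping the setting outside the flag word also avoids mixing a
machine-level configuration choice with flags that describe transient
CPU state.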

Re: [Qemu-devel] [PATCH v4 0/7] Add limited support of VMware's hyper-call rpc

2015-04-30 Thread Paolo Bonzini

On 30/04/2015 15:32, Don Slutz wrote:
> Changes v3 to v4:
> 
>   Paolo Bonzini on "vmport_rpc: Add QMP access to vmport_rpc"
> Does this compile on non-x86 targets?
>   Nope.  Fixed.

Only have a couple more questions, but apart from this it seems ready to
go in.

Thanks for your persistence!

Paolo

> Changes v2 to v3:
> 
>   s/2.3/2.4
> 
> Changes v1 to v2:
> 
>Added live migration code.
>Adjust data structures for migration.
>Switch to GHashTable.
> 
>   Eric Blake
> s/spawened/spawned/
>   Done
> s/traceing/tracing/
>   Done
> Change "error_set(errp, ERROR_CLASS_GENERIC_ERROR, " to
> "error_setg(errp, "
>   Done
> Why two commands (inject-vmport-reboot, inject-vmport-halt)?
>   Switched to inject-vmport-action.
> format=base64 "bug" statements.
>   Dropped.
> 
> Much more on format=base64:
> 
> If there is a bug it is in GLIB.  However the Glib reference manual
> refers to RFC 1421 and RFC 2045 and MIME encoding.  Based on all
> that (which seems to match:
> 
> http://en.wikipedia.org/wiki/Base64
> 
> ) MIME states that all characters outside the (base64) alphabet are
> to be ignored.  Testing shows that g_base64_decode() does this.
> 
> The confusion is that most non-MIME uses reject a base64 string that
> contains characters outside the alphabet.  I was just following the
> other uses of base64 in this file.
> 
> DataFormat refers to RFC 3548, which has the info:
> 
> "
>Implementations MUST reject the encoding if it contains
>characters outside the base alphabet when interpreting base
>encoded data, unless the specification referring to this document
>explicitly states otherwise.  Such specifications may, as MIME
>does, instead state that characters outside the base encoding
>alphabet should simply be ignored when interpreting data ("be
>liberal in what you accept").
> "
> 
> So with GLIB going the MIME way, I do not think this is a QEMU bug
> (you could consider this a GLIB bug, but the document I found says
> that GLIB goes the MIME way and so does not reject anything).
> 
> ---
> 
> 
> The support included is enough to allow VMware tools to install in a
> guest and provide guestinfo support.  guestinfo support is provided
> by what is known as VMware RPC support.
> 
> One of the better on-line references is:
> 
> https://sites.google.com/site/chitchatvmback/backdoor
> 
> More accurate information can be obtained by studying:
> 
> http://open-vm-tools.sourceforge.net/
> 
> With vmware tools installed, you can do:
> 
> ---
> Last login: Fri Jan 30 16:03:08 2015
> [root@C63-min-tools ~]# vmtoolsd --cmd "info-get guestinfo.joejoel"
> No value found
> [root@C63-min-tools ~]# vmtoolsd --cmd "info-set guestinfo.joejoel bar"
> 
> [root@C63-min-tools ~]# vmtoolsd --cmd "info-get guestinfo.joejoel"
> bar
> [root@C63-min-tools ~]# 
> ---
> 
> to access guest info.  QMP access is also provided.
> 
> The live migration code is still in progress.
> 
> Don Slutz (7):
>   vmport.c: Fix vmport_cmd_ram_size
>   vmport_rpc: Add the object vmport_rpc
>   vmport_rpc: Add limited support of VMware's hyper-call rpc
>   vmport_rpc: Add QMP access to vmport_rpc object.
>   vmport_rpc: Add migration
>   vmport:  Add VMware all ring hack
>   MAINTAINERS: add VMware port
> 
>  MAINTAINERS  |7 +
>  hw/i386/pc.c |   34 +-
>  hw/i386/pc_piix.c|2 +-
>  hw/i386/pc_q35.c |2 +-
>  hw/misc/Makefile.objs|1 +
>  hw/misc/vmport.c |2 +-
>  hw/misc/vmport_rpc.c | 1442 ++
>  include/hw/i386/pc.h |6 +-
>  monitor.c|   23 +
>  qapi-schema.json |   90 +++
>  qmp-commands.hx  |  120 
>  target-i386/cpu.c|4 +
>  target-i386/cpu.h|2 +
>  target-i386/seg_helper.c |6 +
>  trace-events |   24 +
>  15 files changed, 1760 insertions(+), 5 deletions(-)
>  create mode 100644 hw/misc/vmport_rpc.c
>
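
The base64 discussion quoted above contrasts two policies for characters
outside the alphabet: reject the input (the RFC 3548 default) or skip
them (the MIME behaviour).  The sketch below shows only that policy
difference as a pre-filter; it does not decode anything and it is not
GLib code.  Both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool in_base64_alphabet(char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') || c == '+' || c == '/' || c == '=';
}

/*
 * Copy the base64 characters of in to out, which must have room for
 * strlen(in) + 1 bytes.  In strict mode the first foreign character
 * makes the call fail with -1; in lenient mode foreign characters are
 * skipped.  Returns the number of characters kept.
 */
static int b64_prefilter(const char *in, char *out, bool strict)
{
    int n = 0;

    assert(in != NULL && out != NULL);
    for (; *in; in++) {
        if (in_base64_alphabet(*in)) {
            out[n++] = *in;
        } else if (strict) {
            return -1;
        }
    }
    out[n] = '\0';
    return n;
}
```

For the input "Ym!Fy", strict mode fails, while lenient mode keeps
"YmFy", which is the base64 encoding of "bar".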

Re: [Qemu-devel] [RFC PATCH 03/15] spapr_drc: pass object ownership to parent/owner

2015-04-30 Thread Paolo Bonzini



On 29/04/2015 21:20, Michael Roth wrote:
> DRC objects attach themselves to an owner as a child
> property. unref afterward to allow them to be finalized
> when their owner is finalized.
> 
> Signed-off-by: Michael Roth 
> ---
>  hw/ppc/spapr_drc.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> index 48bf193..396a03b 100644
> --- a/hw/ppc/spapr_drc.c
> +++ b/hw/ppc/spapr_drc.c
> @@ -456,6 +456,7 @@ sPAPRDRConnector *spapr_dr_connector_new(Object *owner,
>  drc->id = id;
>  drc->owner = owner;
>  object_property_add_child(owner, "dr-connector[*]", OBJECT(drc), NULL);
> +object_unref(OBJECT(drc));
>  object_property_set_bool(OBJECT(drc), true, "realized", NULL);
>  
>  /* human-readable name for a DRC to encode into the DT
> 

Reviewed-by: Paolo Bonzini
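
The ownership hand-off in this patch can be modelled with a toy
reference count: a new object starts with one reference held by its
creator, adding it as a child takes a second one, and the creator then
drops its own so that the parent is the only holder.  The Toy* names
below are hypothetical and only mimic that counting, not the QOM API.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ToyObj {
    int ref;
    int finalized;
    struct ToyObj *child;
} ToyObj;

static void toy_init(ToyObj *o)
{
    o->ref = 1;         /* reference held by the creator */
    o->finalized = 0;
    o->child = NULL;
}

static void toy_ref(ToyObj *o)
{
    o->ref++;
}

static void toy_unref(ToyObj *o)
{
    assert(o->ref > 0);
    if (--o->ref == 0) {
        if (o->child) {
            toy_unref(o->child);    /* drop the parent's reference */
            o->child = NULL;
        }
        o->finalized = 1;
    }
}

/* The parent takes its own reference on the child. */
static void toy_add_child(ToyObj *parent, ToyObj *child)
{
    toy_ref(child);
    parent->child = child;
}
```

Without the creator's unref the child would still have a count of 1
after the parent is finalized, so it would never be finalized itself.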

Re: [Qemu-devel] [RFC PATCH 04/15] spapr_iommu: pass object ownership to parent/owner

2015-04-30 Thread Paolo Bonzini



On 29/04/2015 21:20, Michael Roth wrote:
> DRC objects attach themselves to an owner as a child
> property. unref afterward to allow them to be finalized
> when their owner is finalized.
> 
> Signed-off-by: Michael Roth 
> ---
>  hw/ppc/spapr_iommu.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
> index a14cdc4..79e998b 100644
> --- a/hw/ppc/spapr_iommu.c
> +++ b/hw/ppc/spapr_iommu.c
> @@ -182,6 +182,7 @@ sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn,
>  
>  snprintf(tmp, sizeof(tmp), "tce-table-%x", liobn);
>  object_property_add_child(OBJECT(owner), tmp, OBJECT(tcet), NULL);
> +object_unref(OBJECT(tcet));
>  
>  object_property_set_bool(OBJECT(tcet), true, "realized", NULL);
>  
> 

Reviewed-by: Paolo Bonzini

Re: [Qemu-devel] [PATCH] nvme: support NVME_VOLATILE_WRITE_CACHE feature

2015-04-30 Thread Keith Busch


On Thu, 30 Apr 2015, Christoph Hellwig wrote:

The SCSI emulation in the Linux NVMe driver really wants to know
if a device has a volatile write cache.  Given that qemu has moved
away from a model where we report the backing store WCE bit to
one where the WCE bit is supposed to be part of the migratable
guest-visible state we always return 1 here.


Thanks, this fix was long overdue and already incorporated in my tree. I
really need to get my act together for a legit pull request with all the
other 1.0, 1.1 and 1.2 features.

Acked-by: Keith Busch 


Signed-off-by: Christoph Hellwig 

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 1e07166..50d76f1 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -479,6 +479,9 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeCmd *cmd, NvmeRequest *req)
req->cqe.result =
cpu_to_le32((n->num_queues - 1) | ((n->num_queues - 1) << 16));
break;
+case NVME_VOLATILE_WRITE_CACHE:
+req->cqe.result = cpu_to_le32(1);
+break;
default:
return NVME_INVALID_FIELD | NVME_DNR;
}

Re: [Qemu-devel] [RFC PATCH 05/15] spapr_pci: add PHB unrealize

2015-04-30 Thread Paolo Bonzini



On 29/04/2015 21:20, Michael Roth wrote:
> To support PHB hotplug we need to clean up lingering references,
> memory, child properties, etc. prior to the PHB object being
> finalized. Generally this will be called as a result of calling
> object_unref() on the PHB object, which in turn would normally

s/object_unref/object_unparent/

> be called as the result of an unplug() operation.
> 
> When the PHB is finalized, child objects will be unparented in
> turn, and finalized if the PHB was the only reference holder. So
> we don't bother to explicitly unparent child objects of the PHB
> (spapr_iommu, spapr_drc, etc).
> 
> We do need to handle memory regions explicitly however, since
> they also take a reference on the PHB, and won't allow it to
> be finalized otherwise.

They shouldn't hold a reference anymore as soon as the regions are not
visible in an AddressSpace (and the RCU thread has picked up the changes).

In fact, docs/memory.txt documents (!) that you must call
object_unparent() for memory regions in the instance_finalize function,
not in the unrealize function.

> Signed-off-by: Michael Roth 
> ---
>  hw/ppc/spapr_pci.c | 32 
>  1 file changed, 32 insertions(+)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 2e7590c..25a738c 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -1108,6 +1108,37 @@ static void spapr_phb_hot_unplug_child(HotplugHandler *plug_handler,
>  }
>  }
>  
> +static void spapr_phb_unrealize(DeviceState *dev, Error **errp)
> +{
> +SysBusDevice *s = SYS_BUS_DEVICE(dev);
> +PCIHostState *phb = PCI_HOST_BRIDGE(s);
> +sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(phb);
> +sPAPRTCETable *tcet;
> +
> +pci_unregister_bus(phb->bus);
> +
> +g_free(sphb->dtbusname);
> +sphb->dtbusname = NULL;

This g_free can probably also be moved for simplicity to instance_finalize.

> +/* remove IO/MMIO subregions and aliases, rest should get cleaned
> + * via PHB's unrealize->object_finalize
> + */
> +memory_region_del_subregion(get_system_memory(), &sphb->iowindow);

^^ You should indeed do this here.

> +object_unparent(OBJECT(&sphb->iowindow));
> +object_unparent(OBJECT(&sphb->iospace));
> +
> +memory_region_del_subregion(get_system_memory(), &sphb->memwindow);

^^ and this

> +object_unparent(OBJECT(&sphb->memwindow));
> +object_unparent(OBJECT(&sphb->memspace));
> +
> +tcet = spapr_tce_find_by_liobn(sphb->dma_liobn);
> +memory_region_del_subregion(&sphb->iommu_root, &sphb->msiwindow);
> +memory_region_del_subregion(&sphb->iommu_root, spapr_tce_get_iommu(tcet));
> +address_space_destroy(&sphb->iommu_as);

^^ and these three.  However, the object_unparents should be in
instance_finalize.

Paolo

> +QLIST_REMOVE(sphb, list);
> +}
> +
>  static void spapr_phb_realize(DeviceState *dev, Error **errp)
>  {
>  SysBusDevice *s = SYS_BUS_DEVICE(dev);
> @@ -1442,6 +1473,7 @@ static void spapr_phb_class_init(ObjectClass *klass, void *data)
>  
>  hc->root_bus_path = spapr_phb_root_bus_path;
>  dc->realize = spapr_phb_realize;
> +dc->unrealize = spapr_phb_unrealize;
>  dc->props = spapr_phb_properties;
>  dc->reset = spapr_phb_reset;
>  dc->vmsd = &vmstate_spapr_pci;
>

Re: [Qemu-devel] [RFC PATCH 08/15] spapr: create DR connectors for PHBs and register reset hooks

2015-04-30 Thread Paolo Bonzini

On 29/04/2015 21:20, Michael Roth wrote:
> +if (spapr->dr_phb_enabled) {
> +for (i = 0; i < SPAPR_DRC_MAX_PHB; i++) {
> +sPAPRDRConnector *drc =
> +spapr_dr_connector_new(OBJECT(machine),
> +   SPAPR_DR_CONNECTOR_TYPE_PHB, i);
> +qemu_register_reset(spapr_drc_reset, drc);
> +}
> +}

Is this needed because drc is busless?  Then I think it should be done
in device_set_realized (and the matching qemu_unregister_reset too).

Paolo

Re: [Qemu-devel] [RFC PATCH 11/15] qdev: add qbus_set_hotplug_handler_generic()

2015-04-30 Thread Paolo Bonzini

On 29/04/2015 21:20, Michael Roth wrote:
>  void qbus_set_hotplug_handler(BusState *bus, DeviceState *handler, Error **errp)
>  {
> -qbus_set_hotplug_handler_internal(bus, OBJECT(handler), errp);
> +qbus_set_hotplug_handler_generic(bus, OBJECT(handler), errp);
>  }
>  

I think it's okay to just change the type of qbus_set_hotplug_handler's
handler argument, and get rid altogether of
qbus_set_hotplug_handler_internal.

Paolo
