Re: [Qemu-devel] [PATCH v18 11/21] replay: recording and replaying clock ticks

2015-09-23 Thread Pavel Dovgaluk
> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> On 17/09/2015 18:24, Pavel Dovgalyuk wrote:
> > +if ((now < last || now > (last + get_max_clock_jump()))
> > +&& replay_mode == REPLAY_MODE_NONE) {
> >  notifier_list_notify(&clock->reset_notifiers, &now);
> 
> This seems wrong.  You will have different timers in the record and the
> replay if you do not invoke e.g. rtc_notify_clock_reset.

That will "work" - record and replay will be equivalent.
But they will differ from normal mode. It'll better remove it from the patch.

Pavel Dovgalyuk




Re: [Qemu-devel] [PATCH v18 13/21] icount: improve counting for record/replay

2015-09-23 Thread Pavel Dovgaluk
> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> On 17/09/2015 18:24, Pavel Dovgalyuk wrote:
> >  #endif
> >
> > +/* CPU thread can infinitely wait for event after
> > +   missing the warp */
> > +qemu_clock_warp(QEMU_CLOCK_VIRTUAL);
> >  qemu_clock_run_all_timers();
> 
> It is still not clear to me why the call in timerlist_rearm is not
> sufficient.  Can you explain this (again probably)?

Sometimes tcg thread halts in qemu_tcg_wait_io_event function, waiting
for any external event. Virtual clock does not run, because warp is not called.
warp call in main_loop_wait proceeds virtual clock and allows tcg thread to run 
further.

Pavel Dovgalyuk




[Qemu-devel] [PATCH V6 1/2] sd.h: Move sd.h to include/hw/sd/

2015-09-23 Thread Sai Pavan Boddu
Create a sd directory under include/hw/ and move sd.h to same.

Signed-off-by: Sai Pavan Boddu 
Reviewed-by: Alistair Francis 
---
Changes for V6:
Fix commit message.
Changes for V5:
None
Changes for V4:
Fix commit message.
Changes for V3:
None.
---
 hw/sd/milkymist-memcard.c | 2 +-
 hw/sd/omap_mmc.c  | 2 +-
 hw/sd/pl181.c | 2 +-
 hw/sd/pxa2xx_mmci.c   | 2 +-
 hw/sd/sd.c| 2 +-
 hw/sd/sdhci.h | 2 +-
 hw/sd/ssi-sd.c| 2 +-
 include/hw/{ => sd}/sd.h  | 0
 8 files changed, 7 insertions(+), 7 deletions(-)
 rename include/hw/{ => sd}/sd.h (100%)

diff --git a/hw/sd/milkymist-memcard.c b/hw/sd/milkymist-memcard.c
index 2209ef1..b430d56 100644
--- a/hw/sd/milkymist-memcard.c
+++ b/hw/sd/milkymist-memcard.c
@@ -28,7 +28,7 @@
 #include "qemu/error-report.h"
 #include "sysemu/block-backend.h"
 #include "sysemu/blockdev.h"
-#include "hw/sd.h"
+#include "hw/sd/sd.h"
 
 enum {
 ENABLE_CMD_TX   = (1<<0),
diff --git a/hw/sd/omap_mmc.c b/hw/sd/omap_mmc.c
index 35d8033..5bc4719 100644
--- a/hw/sd/omap_mmc.c
+++ b/hw/sd/omap_mmc.c
@@ -18,7 +18,7 @@
  */
 #include "hw/hw.h"
 #include "hw/arm/omap.h"
-#include "hw/sd.h"
+#include "hw/sd/sd.h"
 
 struct omap_mmc_s {
 qemu_irq irq;
diff --git a/hw/sd/pl181.c b/hw/sd/pl181.c
index 11fcd47..ddd9b6f 100644
--- a/hw/sd/pl181.c
+++ b/hw/sd/pl181.c
@@ -10,7 +10,7 @@
 #include "sysemu/block-backend.h"
 #include "sysemu/blockdev.h"
 #include "hw/sysbus.h"
-#include "hw/sd.h"
+#include "hw/sd/sd.h"
 
 //#define DEBUG_PL181 1
 
diff --git a/hw/sd/pxa2xx_mmci.c b/hw/sd/pxa2xx_mmci.c
index d1fe6d5..b217080 100644
--- a/hw/sd/pxa2xx_mmci.c
+++ b/hw/sd/pxa2xx_mmci.c
@@ -12,7 +12,7 @@
 
 #include "hw/hw.h"
 #include "hw/arm/pxa.h"
-#include "hw/sd.h"
+#include "hw/sd/sd.h"
 #include "hw/qdev.h"
 
 struct PXA2xxMMCIState {
diff --git a/hw/sd/sd.c b/hw/sd/sd.c
index a1ff465..0787e33 100644
--- a/hw/sd/sd.c
+++ b/hw/sd/sd.c
@@ -31,7 +31,7 @@
 
 #include "hw/hw.h"
 #include "sysemu/block-backend.h"
-#include "hw/sd.h"
+#include "hw/sd/sd.h"
 #include "qemu/bitmap.h"
 
 //#define DEBUG_SD 1
diff --git a/hw/sd/sdhci.h b/hw/sd/sdhci.h
index 3352d23..a45593f 100644
--- a/hw/sd/sdhci.h
+++ b/hw/sd/sdhci.h
@@ -28,7 +28,7 @@
 #include "qemu-common.h"
 #include "hw/pci/pci.h"
 #include "hw/sysbus.h"
-#include "hw/sd.h"
+#include "hw/sd/sd.h"
 
 /* R/W SDMA System Address register 0x0 */
 #define SDHC_SYSAD 0x00
diff --git a/hw/sd/ssi-sd.c b/hw/sd/ssi-sd.c
index e4b2d4f..c49ff62 100644
--- a/hw/sd/ssi-sd.c
+++ b/hw/sd/ssi-sd.c
@@ -13,7 +13,7 @@
 #include "sysemu/block-backend.h"
 #include "sysemu/blockdev.h"
 #include "hw/ssi.h"
-#include "hw/sd.h"
+#include "hw/sd/sd.h"
 
 //#define DEBUG_SSI_SD 1
 
diff --git a/include/hw/sd.h b/include/hw/sd/sd.h
similarity index 100%
rename from include/hw/sd.h
rename to include/hw/sd/sd.h
-- 
2.1.4




[Qemu-devel] [PATCH V6 0/2] Move sdhci.h to include/hw/sd

2015-09-23 Thread Sai Pavan Boddu
Move sdhci.h splitting it into common and internal.
Create a new directory for sd in include/hw/.
Correct paths of sd.h in at every instance of #include.

Sai Pavan Boddu (2):
  sd.h: Move sd.h to include/hw/sd/
  sdhci: Split sdhci.h for public and internal device usage

 hw/sd/milkymist-memcard.c   |  2 +-
 hw/sd/omap_mmc.c|  2 +-
 hw/sd/pl181.c   |  2 +-
 hw/sd/pxa2xx_mmci.c |  2 +-
 hw/sd/sd.c  |  2 +-
 hw/sd/{sdhci.h => sdhci-internal.h} | 67 +--
 hw/sd/sdhci.c   |  3 +-
 hw/sd/ssi-sd.c  |  2 +-
 include/hw/{ => sd}/sd.h|  0
 include/hw/sd/sdhci.h   | 92 +
 10 files changed, 101 insertions(+), 73 deletions(-)
 rename hw/sd/{sdhci.h => sdhci-internal.h} (75%)
 rename include/hw/{ => sd}/sd.h (100%)
 create mode 100644 include/hw/sd/sdhci.h

-- 
2.1.4




[Qemu-devel] [PATCH V6 2/2] sdhci: Split sdhci.h for public and internal device usage

2015-09-23 Thread Sai Pavan Boddu
Split sdhci.h into Pubilc version (i.e include/hw/sd/sdhci.h) and
Internal version (i.e hw/sd/sdhci-interna.h) based on register
declarations and object declaration.

Signed-off-by: Sai Pavan Boddu 
Reviewed-by: Alistair Francis 
---
Changes for V6:
Fix commit message.
Chages for V5:
Rename pubilc header version as sdhci.h and internal version to
sdhci-internal.h
Changes for V4:
Remain the name of internal version of sdchi.h as same. And change
Re-Adding qemu-common.h header.
the one which is in includes/ to sdhci-common.h
Changes for V2:
Create new area in includes for sd. And move sdhci.h to same.
Changes for V3:
Split the headers to public and common.
---
 hw/sd/{sdhci.h => sdhci-internal.h} | 67 +--
 hw/sd/sdhci.c   |  3 +-
 include/hw/sd/sdhci.h   | 92 +
 3 files changed, 95 insertions(+), 67 deletions(-)
 rename hw/sd/{sdhci.h => sdhci-internal.h} (75%)
 create mode 100644 include/hw/sd/sdhci.h

diff --git a/hw/sd/sdhci.h b/hw/sd/sdhci-internal.h
similarity index 75%
rename from hw/sd/sdhci.h
rename to hw/sd/sdhci-internal.h
index a45593f..c40ae2b 100644
--- a/hw/sd/sdhci.h
+++ b/hw/sd/sdhci-internal.h
@@ -21,14 +21,10 @@
  * You should have received a copy of the GNU General Public License along
  * with this program; if not, see .
  */
-
 #ifndef SDHCI_H
 #define SDHCI_H
 
-#include "qemu-common.h"
-#include "hw/pci/pci.h"
-#include "hw/sysbus.h"
-#include "hw/sd/sd.h"
+#include "hw/sd/sdhci.h"
 
 /* R/W SDMA System Address register 0x0 */
 #define SDHC_SYSAD 0x00
@@ -231,65 +227,6 @@ enum {
 sdhc_gap_write  = 2   /* SDHC stopped at block gap during write operation 
*/
 };
 
-/* SD/MMC host controller state */
-typedef struct SDHCIState {
-union {
-PCIDevice pcidev;
-SysBusDevice busdev;
-};
-SDState *card;
-MemoryRegion iomem;
-
-QEMUTimer *insert_timer;   /* timer for 'changing' sd card. */
-QEMUTimer *transfer_timer;
-qemu_irq eject_cb;
-qemu_irq ro_cb;
-qemu_irq irq;
-
-uint32_t sdmasysad;/* SDMA System Address register */
-uint16_t blksize;  /* Host DMA Buff Boundary and Transfer BlkSize Reg 
*/
-uint16_t blkcnt;   /* Blocks count for current transfer */
-uint32_t argument; /* Command Argument Register */
-uint16_t trnmod;   /* Transfer Mode Setting Register */
-uint16_t cmdreg;   /* Command Register */
-uint32_t rspreg[4];/* Response Registers 0-3 */
-uint32_t prnsts;   /* Present State Register */
-uint8_t  hostctl;  /* Host Control Register */
-uint8_t  pwrcon;   /* Power control Register */
-uint8_t  blkgap;   /* Block Gap Control Register */
-uint8_t  wakcon;   /* WakeUp Control Register */
-uint16_t clkcon;   /* Clock control Register */
-uint8_t  timeoutcon;   /* Timeout Control Register */
-uint8_t  admaerr;  /* ADMA Error Status Register */
-uint16_t norintsts;/* Normal Interrupt Status Register */
-uint16_t errintsts;/* Error Interrupt Status Register */
-uint16_t norintstsen;  /* Normal Interrupt Status Enable Register */
-uint16_t errintstsen;  /* Error Interrupt Status Enable Register */
-uint16_t norintsigen;  /* Normal Interrupt Signal Enable Register */
-uint16_t errintsigen;  /* Error Interrupt Signal Enable Register */
-uint16_t acmd12errsts; /* Auto CMD12 error status register */
-uint64_t admasysaddr;  /* ADMA System Address Register */
-
-uint32_t capareg;  /* Capabilities Register */
-uint32_t maxcurr;  /* Maximum Current Capabilities Register */
-uint8_t  *fifo_buffer; /* SD host i/o FIFO buffer */
-uint32_t buf_maxsz;
-uint16_t data_count;   /* current element in FIFO buffer */
-uint8_t  stopped_state;/* Current SDHC state */
-/* Buffer Data Port Register - virtual access point to R and W buffers */
-/* Software Reset Register - always reads as 0 */
-/* Force Event Auto CMD12 Error Interrupt Reg - write only */
-/* Force Event Error Interrupt Register- write only */
-/* RO Host Controller Version Register always reads as 0x2401 */
-} SDHCIState;
-
 extern const VMStateDescription sdhci_vmstate;
 
-#define TYPE_PCI_SDHCI "sdhci-pci"
-#define PCI_SDHCI(obj) OBJECT_CHECK(SDHCIState, (obj), TYPE_PCI_SDHCI)
-
-#define TYPE_SYSBUS_SDHCI "generic-sdhci"
-#define SYSBUS_SDHCI(obj)   \
- OBJECT_CHECK(SDHCIState, (obj), TYPE_SYSBUS_SDHCI)
-
-#endif /* SDHCI_H */
+#endif
diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
index e63367b..34018fd 100644
--- a/hw/sd/sdhci.c
+++ b/hw/sd/sdhci.c
@@ -28,8 +28,7 @@
 #include "sysemu/dma.h"
 #include "qemu/timer.h"
 #include "qemu/bitops.h"
-
-#include "sdhci.h"
+#include "sdhci-internal.h"
 
 /* host controller debug messages */
 #ifndef SDHC_DEBUG
diff --git a/include/hw/sd/sdhci.h b/incl

Re: [Qemu-devel] [PATCH 1/1] target-i386: get/put MSR_TSC_AUX across reset and migration

2015-09-23 Thread Paolo Bonzini


On 23/09/2015 08:27, Amit Shah wrote:
> There's one report of migration breaking due to missing MSR_TSC_AUX
> save/restore.  Fix this by adding a new subsection that saves the state
> of this MSR.
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1261797

It turns out that the MSR is already saved/restored into the migration
stream!  However, the commit that introduced RDTSCP support (commit
1b05007, "target-i386: add RDTSCP support", 2009-09-19) was written for
TCG, and we ended up forgetting to fish the value out of KVM and send it
back in.

The KVM bits are okay.  Eduardo, can you undo the machine.c hunk or
should Amit send v2?

Paolo

> Reported-by: Xiaoqing Wei 
> Signed-off-by: Amit Shah 
> CC: Paolo Bonzini 
> CC: Juan Quintela 
> CC: "Dr. David Alan Gilbert" 
> CC: Marcelo Tosatti 
> CC: Richard Henderson 
> CC: Eduardo Habkost 
> ---
>  target-i386/kvm.c | 14 ++
>  target-i386/machine.c | 20 
>  2 files changed, 34 insertions(+)
> 
> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
> index 7b0ba17..80d1a7e 100644
> --- a/target-i386/kvm.c
> +++ b/target-i386/kvm.c
> @@ -67,6 +67,7 @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
>  
>  static bool has_msr_star;
>  static bool has_msr_hsave_pa;
> +static bool has_msr_tsc_aux;
>  static bool has_msr_tsc_adjust;
>  static bool has_msr_tsc_deadline;
>  static bool has_msr_feature_control;
> @@ -825,6 +826,10 @@ static int kvm_get_supported_msrs(KVMState *s)
>  has_msr_hsave_pa = true;
>  continue;
>  }
> +if (kvm_msr_list->indices[i] == MSR_TSC_AUX) {
> +has_msr_tsc_aux = true;
> +continue;
> +}
>  if (kvm_msr_list->indices[i] == MSR_TSC_ADJUST) {
>  has_msr_tsc_adjust = true;
>  continue;
> @@ -1299,6 +1304,9 @@ static int kvm_put_msrs(X86CPU *cpu, int level)
>  if (has_msr_hsave_pa) {
>  kvm_msr_entry_set(&msrs[n++], MSR_VM_HSAVE_PA, env->vm_hsave);
>  }
> +if (has_msr_tsc_aux) {
> +kvm_msr_entry_set(&msrs[n++], MSR_TSC_AUX, env->tsc_aux);
> +}
>  if (has_msr_tsc_adjust) {
>  kvm_msr_entry_set(&msrs[n++], MSR_TSC_ADJUST, env->tsc_adjust);
>  }
> @@ -1671,6 +1679,9 @@ static int kvm_get_msrs(X86CPU *cpu)
>  if (has_msr_hsave_pa) {
>  msrs[n++].index = MSR_VM_HSAVE_PA;
>  }
> +if (has_msr_tsc_aux) {
> +msrs[n++].index = MSR_TSC_AUX;
> +}
>  if (has_msr_tsc_adjust) {
>  msrs[n++].index = MSR_TSC_ADJUST;
>  }
> @@ -1820,6 +1831,9 @@ static int kvm_get_msrs(X86CPU *cpu)
>  case MSR_IA32_TSC:
>  env->tsc = msrs[i].data;
>  break;
> +case MSR_TSC_AUX:
> +env->tsc_aux = msrs[i].data;
> +break;
>  case MSR_TSC_ADJUST:
>  env->tsc_adjust = msrs[i].data;
>  break;
> diff --git a/target-i386/machine.c b/target-i386/machine.c
> index 9fa0563..116693d 100644
> --- a/target-i386/machine.c
> +++ b/target-i386/machine.c
> @@ -453,6 +453,25 @@ static const VMStateDescription vmstate_fpop_ip_dp = {
>  }
>  };
>  
> +static bool tsc_aux_needed(void *opaque)
> +{
> +X86CPU *cpu = opaque;
> +CPUX86State *env = &cpu->env;
> +
> +return env->tsc_aux != 0;
> +}
> +
> +static const VMStateDescription vmstate_msr_tsc_aux = {
> +.name = "cpu/msr_tsc_aux",
> +.version_id = 1,
> +.minimum_version_id = 1,
> +.needed = tsc_aux_needed,
> +.fields = (VMStateField[]) {
> +VMSTATE_UINT64(env.tsc_aux, X86CPU),
> +VMSTATE_END_OF_LIST()
> +}
> +};
> +
>  static bool tsc_adjust_needed(void *opaque)
>  {
>  X86CPU *cpu = opaque;
> @@ -871,6 +890,7 @@ VMStateDescription vmstate_x86_cpu = {
>  &vmstate_msr_hyperv_crash,
>  &vmstate_avx512,
>  &vmstate_xss,
> +&vmstate_msr_tsc_aux,
>  NULL
>  }
>  };
> 



Re: [Qemu-devel] [PATCH 0/3] pc: Set hw_version on all machine classes

2015-09-23 Thread Igor Mammedov
On Tue, 22 Sep 2015 17:16:24 -0300
Eduardo Habkost  wrote:

> In 2012, QEMU had a bug where it exposed QEMU version information
> to the guest, meaning a QEMU upgrade would expose different
> hardware to the guest OS even if the same machine-type is being
> used.
> 
> The bug was fixed by commit 93bfef4c6e4b23caea9d51e1099d06433d8835a4,
> on all machines up to pc-1.0. But we kept introducing the same
> bug on all newer machines since then. That means we are breaking
> guest ABI every time QEMU was upgraded.
> 
> Fix this by setting the hw_version on all PC machines, making
> sure the hardware won't change when upgrading QEMU.
> 
> This series is based on Michael's PCI tree, plus the "Set
> broken_reserved_end on pc-*-2.4, not 2.5" patch I submitted
> earlier today.
I haven't seen this patch on list, perhaps it needs to be resubmitted?





Re: [Qemu-devel] [PATCH v18 13/21] icount: improve counting for record/replay

2015-09-23 Thread Paolo Bonzini


On 23/09/2015 09:22, Pavel Dovgaluk wrote:
> Sometimes tcg thread halts in qemu_tcg_wait_io_event function,
> waiting for any external event. Virtual clock does not run, because
> warp is not called. warp call in main_loop_wait proceeds virtual
> clock and allows tcg thread to run further.

Ok, this makes sense!

Would this work too as a replacement for this patch?

Paolo

diff --git a/cpus.c b/cpus.c
index fbbd17f..9480acc 100644
--- a/cpus.c
+++ b/cpus.c
@@ -926,6 +926,7 @@ static void qemu_tcg_wait_io_event(CPUState *cpu)
 }

 while (iothread_requesting_mutex) {
+qemu_clock_warp(QEMU_CLOCK_VIRTUAL);
 qemu_cond_wait(&qemu_io_proceeded_cond, &qemu_global_mutex);
 }




Re: [Qemu-devel] [PATCH v2] spapr: generate DT node names

2015-09-23 Thread Thomas Huth
On 21/09/15 18:53, Laurent Vivier wrote:
> When DT node names for PCI devices are generated by SLOF,
> they are generated according to the type of the device
> (for instance, ethernet for virtio-net-pci device).
> 
> Node name for hotplugged devices is generated by QEMU.
> This patch adds the mechanic to QEMU to create the node
> name according to the device type too.
> 
> The data structure has been roughly copied from OpenBIOS/OpenHackware,
> node names from SLOF.
...
> Signed-off-by: Laurent Vivier 
> ---
> v2: Use CamelCase name, remove misc-* name,
> remove _OTHER entries to fallback to class name (as SLOF does).
> Fix typo (IPMI-bltr).
> 
>  hw/ppc/spapr_pci.c | 490 
> +++--
>  1 file changed, 476 insertions(+), 14 deletions(-)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index a2feb4c..eb53719 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -944,6 +944,475 @@ static void populate_resource_props(PCIDevice *d, 
> ResourceProps *rp)
>  rp->assigned_len = assigned_idx * sizeof(ResourceFields);
>  }
>  
> +/* Device classes and subclasses */
> +
> +#define PCI_BASE_CLASS_STORAGE   0x01
> +#define PCI_SUBCLASS_STORAGE_SCSI0x00
> +#define PCI_SUBCLASS_STORAGE_IDE 0x01
> +#define PCI_SUBCLASS_STORAGE_FLOPPY  0x02
> +#define PCI_SUBCLASS_STORAGE_IPI 0x03
> +#define PCI_SUBCLASS_STORAGE_RAID0x04
> +#define PCI_SUBCLASS_STORAGE_ATA 0x05
> +#define PCI_SUBCLASS_STORAGE_SATA0x06
> +#define PCI_SUBCLASS_STORAGE_SAS 0x07
> +#define PCI_SUBCLASS_STORAGE_OTHER   0x80
> +
> +#define PCI_BASE_CLASS_NETWORK   0x02
> +#define PCI_SUBCLASS_NETWORK_ETHERNET0x00
> +#define PCI_SUBCLASS_NETWORK_TOKEN_RING  0x01
> +#define PCI_SUBCLASS_NETWORK_FDDI0x02
> +#define PCI_SUBCLASS_NETWORK_ATM 0x03
> +#define PCI_SUBCLASS_NETWORK_ISDN0x04
> +#define PCI_SUBCLASS_NETWORK_WORDFIP 0x05
> +#define PCI_SUBCLASS_NETWORK_PICMG2140x06
> +#define PCI_SUBCLASS_NETWORK_OTHER   0x80
> +
> +#define PCI_BASE_CLASS_DISPLAY   0x03
> +#define PCI_SUBCLASS_DISPLAY_VGA 0x00
> +#define PCI_SUBCLASS_DISPLAY_XGA 0x01
> +#define PCI_SUBCLASS_DISPLAY_3D  0x02
> +#define PCI_SUBCLASS_DISPLAY_OTHER   0x80
> +
> +#define PCI_BASE_CLASS_MULTIMEDIA0x04
> +#define PCI_SUBCLASS_MULTIMEDIA_VIDEO0x00
> +#define PCI_SUBCLASS_MULTIMEDIA_AUDIO0x01
> +#define PCI_SUBCLASS_MULTIMEDIA_PHONE0x02
> +#define PCI_SUBCLASS_MULTIMEDIA_OTHER0x80
> +
> +#define PCI_BASE_CLASS_MEMORY0x05
> +#define PCI_SUBCLASS_MEMORY_RAM  0x00
> +#define PCI_SUBCLASS_MEMORY_FLASH0x01
> +
> +#define PCI_BASE_CLASS_BRIDGE0x06
> +#define PCI_SUBCLASS_BRIDGE_HOST 0x00
> +#define PCI_SUBCLASS_BRIDGE_ISA  0x01
> +#define PCI_SUBCLASS_BRIDGE_EISA 0x02
> +#define PCI_SUBCLASS_BRIDGE_MC   0x03
> +#define PCI_SUBCLASS_BRIDGE_PCI  0x04
> +#define PCI_SUBCLASS_BRIDGE_PCMCIA   0x05
> +#define PCI_SUBCLASS_BRIDGE_NUBUS0x06
> +#define PCI_SUBCLASS_BRIDGE_CARDBUS  0x07
> +#define PCI_SUBCLASS_BRIDGE_RACEWAY  0x08
> +#define PCI_SUBCLASS_BRIDGE_PCI_SEMITP   0x09
> +#define PCI_SUBCLASS_BRIDGE_IB_PCI   0x0a
> +#define PCI_SUBCLASS_BRIDGE_OTHER0x80
> +
> +#define PCI_BASE_CLASS_COMMUNICATION 0x07
> +#define PCI_SUBCLASS_COMMUNICATION_SERIAL 0x00
> +#define PCI_SUBCLASS_COMMUNICATION_PARALLEL 0x01
> +#define PCI_SUBCLASS_COMMUNICATION_MULTISERIAL 0x02
> +#define PCI_SUBCLASS_COMMUNICATION_MODEM 0x03
> +#define PCI_SUBCLASS_COMMUNICATION_GPIB  0x04
> +#define PCI_SUBCLASS_COMMUNICATION_SC0x05
> +#define PCI_SUBCLASS_COMMUNICATION_OTHER 0x80
> +
> +#define PCI_BASE_CLASS_SYSTEM0x08
> +#define PCI_SUBCLASS_SYSTEM_PIC  0x00
> +#define PCI_SUBCLASS_SYSTEM_DMA  0x01
> +#define PCI_SUBCLASS_SYSTEM_TIMER0x02
> +#define PCI_SUBCLASS_SYSTEM_RTC  0x03
> +#define PCI_SUBCLASS_SYSTEM_HOTPLUG  0x04
> +#define PCI_SUBCLASS_SYSTEM_SD   0x05
> +#define PCI_SUBCLASS_SYSTEM_OTHER0x80
> +
> +#define PCI_BASE_CLASS_INPUT 0x09
> +#define PCI_SUBCLASS_INPUT_KEYBOARD  0x00
> +#define PCI_SUBCLASS_INPUT_PEN   0x01
> +#define PCI_SUBCLASS_INPUT_MOUSE 0x02
> +#define PCI_SUBCLASS_INPUT_SCANNER   0x03
> +#define PCI_SUBCLASS_INPUT_GAMEPORT  0x04
> +#define PCI_SUBCLASS_INPUT_OTHER 0x80
> +
> +#define PCI_BASE_CLASS_DOCKING   0x0a
> +#define PCI_SUBCLASS_DOCKING_GENERIC 0x00
> +#define PCI_SUBCLASS_DOCKING_OTHER   0x80
> +
> +#define PCI_BASE_CLASS_PROCESSOR 0x0b
> +#define PCI_SUBCLASS_PROCESSOR_386   0x00
> +#define PCI_SUBCLASS_PROCESSOR_486   0x01
> +#define PCI_SUBCLASS_PROCESSOR_PENTIUM   0x02
> +#define PCI_SUBCLASS_PROCESSOR_ALPHA 0x10
> +#define PCI_SUBCLASS_PROCESSOR_POWERPC   0x20
> +#define PCI_SUBC

Re: [Qemu-devel] MAINTAINERS leaves too many files uncovered

2015-09-23 Thread Kevin Wolf
Am 22.09.2015 um 20:07 hat Eric Blake geschrieben:
> On 09/22/2015 03:13 AM, Markus Armbruster wrote:
> 
> >> Full list of unmaintained files now:
> > [...]
> > 
> 
> > docs/bitmaps.md
> > docs/blkdebug.txt
> > docs/blkverify.txt
> 
> > docs/live-block-ops.txt
> 
> > docs/qcow2-cache.txt
> > docs/qdev-device-use.txt
> 
> > docs/specs/qcow2.txt
> > docs/specs/qed_spec.txt
> 
> Kevin, Stefan - should these be added to Block maintainers?

Except for docs/qdev-device-use.txt, I guess.

Also, they shouldn't be added to block in general, but to the specific
block drivers/subsystems like qcow2, qed, block jobs etc.

> > include/qemu/throttle.h
> 
> Block?

Would be covered if it were in the correct place (include/block/).

Should we add a separate section for I/O throttling, maintained by
Berto, as well? So far it's part of the block layer core - that is,
officially maintained by me and I know next to nothing about it.

Kevin


pgpHHzYp8j38p.pgp
Description: PGP signature


Re: [Qemu-devel] MAINTAINERS leaves too many files uncovered

2015-09-23 Thread Alberto Garcia
On Wed 23 Sep 2015 10:24:55 AM CEST, Kevin Wolf  wrote:

>> > include/qemu/throttle.h
>> 
>> Block?
>
> Would be covered if it were in the correct place (include/block/).
>
> Should we add a separate section for I/O throttling, maintained by
> Berto, as well? So far it's part of the block layer core - that is,
> officially maintained by me and I know next to nothing about it.

Yes, you can list me as maintainer of the throttling code.

Thanks for telling,

Berto



Re: [Qemu-devel] [PATCH v2 REPOST] oslib-win32: only provide localtime_r/gmtime_r if missing

2015-09-23 Thread Daniel P. Berrange
On Tue, Sep 22, 2015 at 07:49:40PM +0200, Stefan Weil wrote:
> Hi,
> 
> I suggest cleaning some comments, mostly using the "official"
> spellings for MinGW and Mingw-w64.
> 
> Am 22.09.2015 um 16:13 schrieb Daniel P. Berrange:
> > The oslib-win32 file currently provides a localtime_r and
> > gmtime_r replacement unconditionally. Some versions of
> > Mingw64 would provide crude macros for localtime_r/gmtime_r
> 
> MinGW / Mingw-w64
> 
> > which QEMU takes care to disable. Latest versions of Mingw64
> 
> Mingw-w64
> 
> > now provide actual functions for localtime_r/gmtime_r, but
> > with a twist that you have to include unistd.h or pthread.h
> > before including time.h.  By luck some files in QEMU have
> > such an include order, resulting in compile errors:
> > 
> >   CCutil/osdep.o
> > In file included from include/qemu-common.h:48:0,
> >  from util/osdep.c:48:
> > include/sysemu/os-win32.h:77:12: error: redundant redeclaration of 
> > 'gmtime_r' [-Werror=redundant-decls]
> >  struct tm *gmtime_r(const time_t *timep, struct tm *result);
> > ^
> > In file included from include/qemu-common.h:35:0,
> >  from util/osdep.c:48:
> > /usr/i686-w64-mingw32/sys-root/mingw/include/time.h:272:107: note: previous 
> > definition of 'gmtime_r' was here
> > In file included from include/qemu-common.h:48:0,
> >  from util/osdep.c:48:
> > include/sysemu/os-win32.h:79:12: error: redundant redeclaration of 
> > 'localtime_r' [-Werror=redundant-decls]
> >  struct tm *localtime_r(const time_t *timep, struct tm *result);
> > ^
> > In file included from include/qemu-common.h:35:0,
> >  from util/osdep.c:48:
> > /usr/i686-w64-mingw32/sys-root/mingw/include/time.h:269:107: note: previous 
> > definition of 'localtime_r' was here
> > 
> > This change adds a configure test to see if localtime_r
> > exits, and only enables the QEMU impl if missing. We also
> > re-arrange qemu-common.h try attempt to guarantee that all
> > source files get unistd.h before time.h and thus see the
> > localtime_r/gmtime_r defs.
> > 
> > Signed-off-by: Daniel P. Berrange 
> > ---
> >  configure | 34 ++
> >  include/qemu/osdep.h  |  4 +++-
> >  include/sysemu/os-win32.h |  2 ++
> >  util/oslib-win32.c|  2 ++
> >  4 files changed, 41 insertions(+), 1 deletion(-)
> > 
> > diff --git a/configure b/configure
> > index 52f5b79..4654be8 100755
> > --- a/configure
> > +++ b/configure
> > @@ -1737,6 +1737,37 @@ else
> >  fi
> >  
> >  ##
> > +# Mingw64 localtime_r/gmtime_r check
> 
> MinGW / Mingw-w64
> 
> > +
> > +if test "$mingw32" = "yes"; then
> > +# Some versions of Mingw32/64 lack localtime_r
> 
> MinGW / Mingw-w64
> 
> > +# and gmtime_r entirely
> 
> Missing .
> 
> > +#
> > +# Some versions of Mingw64 define a macro for
> 
> Mingw-w64
> 
> > +# localtime_r/gmtime_r/etc
> 
> Why etc? Missing .

They also define macros for some other functions, but they
don't cause us problems, so we're not dealing with them
here. Simpler to just remove the /etc though.

> 
> > +#
> > +# Some versions of Ming64 will define functions
> > +# for localtime_r/gmtime_r, but only if you have
> > +# _POSIX_THREAD_SAFE_FUNCTIONS defined. For fun
> > +# though, unistd.h and pthread.h both define
> > +# that for you.
> > +#
> > +# So this #undef localtime_r and #include 
> > +# are not in fact redundant
> 
> 
> Otherwise this patch looks good.
> 
> If you agree, I'd clean the comments before I add
> the patch to my patch queue for Windows
> (git://qemu.weilnetz.de/qemu.git wxx).

Yes, I'm fine with you applying it to your queue and adding the fixes
mentioned.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH] docs: describe the QEMU build system structure / design

2015-09-23 Thread Daniel P. Berrange
On Tue, Sep 22, 2015 at 05:35:59PM +0100, Daniel P. Berrange wrote:
> Developers who are new to QEMU, or have a background familiarity
> with GNU autotools can have trouble getting their head around the
> home-grown QEMU build system. This document attempts to explain
> the structure / design of the configure script and the various
> Makefile pieces that live across the source tree.
> 
> Signed-off-by: Daniel P. Berrange 
> ---
>  docs/build-system.txt | 493 
> ++
>  1 file changed, 493 insertions(+)
>  create mode 100644 docs/build-system.txt

Thanks for all the feedback on this. I won't reply to all the mails
individually, I'll simply address all the feedback given and post a
v2 later.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH v18 13/21] icount: improve counting for record/replay

2015-09-23 Thread Pavel Dovgaluk
> From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo 
> Bonzini
> On 23/09/2015 09:22, Pavel Dovgaluk wrote:
> > Sometimes tcg thread halts in qemu_tcg_wait_io_event function,
> > waiting for any external event. Virtual clock does not run, because
> > warp is not called. warp call in main_loop_wait proceeds virtual
> > clock and allows tcg thread to run further.
> 
> Ok, this makes sense!
> 
> Would this work too as a replacement for this patch?

No, it doesn't help.
It seems that tcg is waiting within qemu_cond_wait function without leaving it.

> 
> diff --git a/cpus.c b/cpus.c
> index fbbd17f..9480acc 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -926,6 +926,7 @@ static void qemu_tcg_wait_io_event(CPUState *cpu)
>  }
> 
>  while (iothread_requesting_mutex) {
> +qemu_clock_warp(QEMU_CLOCK_VIRTUAL);
>  qemu_cond_wait(&qemu_io_proceeded_cond, &qemu_global_mutex);
>  }


Pavel Dovgalyuk




Re: [Qemu-devel] [PATCH v6 3/4] block: add a 'blockdev-snapshot' QMP command

2015-09-23 Thread Alberto Garcia
On Tue 22 Sep 2015 06:38:27 PM CEST, Max Reitz wrote:

>> +.name   = "blockdev-snapshot",
>> +.args_type  = "node:s,overlay:s",
>> +.mhandler.cmd_new = qmp_marshal_input_blockdev_snapshot,
>
> As of 7fad30f06eb6aa57aaa8f3d264288f24ae7646f0, this needs to be
> qmp_marshal_blockdev_snapshot.

Sure, that needs to be part of the next rebase, otherwise it won't
build.

Berto



Re: [Qemu-devel] [RFC PATCH 02/10] vfio: Generalize vfio_listener_region_add failure path

2015-09-23 Thread Thomas Huth
On 17/09/15 15:09, David Gibson wrote:
> If a DMA mapping operation fails in vfio_listener_region_add() it
> checks to see if we've already completed initial setup of the
> container.  If so it reports an error so the setup code can fail
> gracefully, otherwise throws a hw_error().
> 
> There are other potential failure cases in vfio_listener_region_add()
> which could benefit from the same logic, so move it to its own
> fail: block.  Later patches can use this to extend other failure cases
> to fail as gracefully as possible under the circumstances.
> 
> Signed-off-by: David Gibson 
> ---
>  hw/vfio/common.c | 26 +++---
>  1 file changed, 15 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index e3152f6..9953b9c 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -400,19 +400,23 @@ static void vfio_listener_region_add(MemoryListener 
> *listener,
>  error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
>   "0x%"HWADDR_PRIx", %p) = %d (%m)",
>   container, iova, end - iova, vaddr, ret);
> +goto fail;
> +}
>  
> -/*
> - * On the initfn path, store the first error in the container so we
> - * can gracefully fail.  Runtime, there's not much we can do other
> - * than throw a hardware error.
> - */
> -if (!container->iommu_data.initialized) {
> -if (!container->iommu_data.error) {
> -container->iommu_data.error = ret;
> -}
> -} else {
> -hw_error("vfio: DMA mapping failed, unable to continue");
> +return;
> +
> +fail:
> +/*
> + * On the initfn path, store the first error in the container so we
> + * can gracefully fail.  Runtime, there's not much we can do other
> + * than throw a hardware error.
> + */
> +if (!container->iommu_data.initialized) {
> +if (!container->iommu_data.error) {
> +container->iommu_data.error = ret;
>  }
> +} else {
> +hw_error("vfio: DMA mapping failed, unable to continue");
>  }
>  }

Reviewed-by: Thomas Huth 






Re: [Qemu-devel] [PATCH v5 02/46] qapi: Clean up qapi.py per pep8

2015-09-23 Thread Markus Armbruster
Eric Blake  writes:

> On 09/22/2015 08:00 AM, Markus Armbruster wrote:
>> Eric Blake  writes:
>> 
>>> Silence pep8, and make pylint a bit happier.  Just style cleanups;
>>> no semantic changes.
>> 
>> I had planned to clean it up after resolving the TODO fold into
>> QAPISchema, but I'm fine with doing it right away.  I think we'll want
>> to do a bit more for pylint, but limiting ourselves to really obvious
>> changes now makes sense.
>> 
>>> Signed-off-by: Eric Blake 
>>> ---
>>>  scripts/qapi.py | 165 
>>> 
>>>  1 file changed, 107 insertions(+), 58 deletions(-)
>>>
>
>>> +
>>>  class QAPISchemaError(Exception):
>>>  def __init__(self, schema, msg):
>>> +Exception.__init__(self)
>> 
>> Not a style change.  One more below.  Separate patch?
>
> pep8 didn't mind, but pylint did.  Personally, I don't know what happens
> if you don't call the superclass constructor.  But since pep8 didn't
> flag it, I can agree to split into a separate patch.

I figure that'll be easier than explaining the fixing the commit message
not to claim "just style" ;)

>>>  if not isinstance(include, str):
>>> -raise QAPIExprError(expr_info,
>>> - 'Expected a file name (string), got: %s'
>>> -% include)
>>> +raise QAPIExprError(expr_info, 'Expected a file name '
>>> +'(string), got: %s' % include)
>> 
>> I don't like breaking lines in the middle of an argument when another
>> argument shares a line with a part.  Perhaps:
>> 
>> raise QAPIExprError(expr_info,
>> 'Expected a file name (string), got: %s'
>> % include)
>
> Hmm - I touch the same lines again in patch 6:
>
> |  include = expr["include"]
> |  if not isinstance(include, str):
> | -raise QAPIExprError(expr_info, 'Expected a file name '
> | -'(string), got: %s' % include)
> | +raise QAPIExprError(expr_info,
> | +"Expected a string for 'include'")
>
> Looks like I should reshuffle the series to avoid the churn.

I appreciate less churn, but I'm aware of diminishing returns.
Reshuffling may be more trouble than it's worth.  Your call.

>>> @@ -1308,12 +1340,13 @@ def camel_to_upper(value):
>>>  if c.isupper() and (i > 0) and c_fun_str[i - 1] != "_":
>>>  # Case 1: next string is lower
>>>  # Case 2: previous string is digit
>>> -if (i < (l - 1) and c_fun_str[i + 1].islower()) or \
>>> -c_fun_str[i - 1].isdigit():
>>> +next_lower = i < (l - 1) and c_fun_str[i + 1].islower()
>>> +if next_lower or c_fun_str[i - 1].isdigit():
>>>  new_name += '_'
>>>  new_name += c
>> 
>> Dunno.  Perhaps:
>> 
>> if i < (l - 1) and c_fun_str[i + 1].islower():
>> new_name += '_'
>> elif c_fun_str[i - 1].isdigit():
>> new_name += '_'
>> 
>> Differently ugly, I guess.
>
> The complaints from pep8 were the \ continuation, and the fact that the
> continued if condition was at the same indentation as the body of the
> if; another ugly solution would be another layer of ():
>
> if (((i < (l - 1) and c_fun_str[i + 1].islower()) or
>  c_fun_str[i - 1].isdigit())):
> new_name += '_'
>
> or maybe reverse the conditional:
>
> Your suggestion looks the least bad to me.
>
>> 
>> The two comment lines are pretty worthless.
>
> I can drop them if you'd like.

Yes, please.



Re: [Qemu-devel] [PATCH v5 0/4] qapi: child add/delete support

2015-09-23 Thread Dr. David Alan Gilbert
* Wen Congyang (we...@cn.fujitsu.com) wrote:
> On 09/22/2015 07:15 PM, Dr. David Alan Gilbert wrote:
> > * Wen Congyang (we...@cn.fujitsu.com) wrote:
> >> If quorum's child is broken, we can use mirror job to replace it.
> >> But sometimes, the user only need to remove the broken child, and
> >> add it later when the problem is fixed.
> >>
> > 
> > Hi,
> >   Two questions:
> > 1) Do you have an example of a pair of add/remove commands that work
> >   together? (I'm not quite sure I understand where the ID for the remove
> >   comes from).
> 
> The command line:
> -drive 
> if=virtio,id=disk1,driver=quorum,read-pattern=fifo,vote-threshold=1,children.0.file.filename=/data/images/kvm/suse/suse11_3.img,children.0.driver=raw
> 
> And the QMP monitor command:
> {'execute':'blockdev-add', 'arguments':{'options':{'driver': 'raw', 
> 'node-name': 'test1', 'file': {'driver': 'file', 'filename': '/dev/null'}, 
> 'id': 'test11' }  } }
> {'execute': 'human-monitor-command', 'arguments': {'command-line': 'drive_add 
> buddy 
> driver=nbd,host=192.168.3.1,port=8889,export=colo-disk1,node-name=test2,if=none'}}
> {'execute':'x-blockdev-child-add', 'arguments':{'parent': 'disk1', 'child': 
> 'test1' } }
> {'execute':'x-blockdev-child-add', 'arguments':{'parent': 'disk1', 'child': 
> 'test2' } }
> {'execute': 'x-blockdev-child-del', 'arguments': {'parent': 'disk1', 'child': 
> 'test1' } }
> {'execute': 'x-blockdev-child-del', 'arguments': {'parent': 'disk1', 'child': 
> 'test2' } }
> 
> Note: the qmp monitor command doesn't support nbd now, and I use the hmp 
> command to add a BDS.

Thank you; OK I see the format has changed quite a bit from the older version; 
this version
is a lot nicer.

> > 2) If the child has failed and is not responding to block operations
> >at all (e.g a networking failure to an nbd device which may take 
> > minutes
> >to time out); how do you recover - flush or drain on the devices
> >hang at that point.
> 
> If the network fails, the kernel doesn't notify the application...
> 
> > 
> > (I was trying to test recovery from a failed secondary using the July COLO
> > release; but the primary gets stuck in bdrv_drain or bdrv_flush if I kill
> > the secondary in the right way).
> 
> IIRC, if the qemu is killed, the connection is closed at the same time. 
> bdrv_drain()
> or bdrv_flush() should not get stuck.

I use kill -SIGSTOP to the secondary qemu so I think that behaves like the 
network fails,
or if the secondary host just failed completely.  You do need some way to 
recover from the
NBD server dieing like that.

It sounds like we need some way to be able to remove a blockdev that's failed 
like that;
Paolo suggested the 'disk deadline' series could be used to time something like 
that
out eventually, but maybe you need something that allows you to remove
a child more forcibly.

Dave

> 
> Thanks
> Wen Congyang
> 
> > 
> > Dave
> > 
> > 
> >> It is based on the following patch:
> >> http://lists.nongnu.org/archive/html/qemu-devel/2015-09/msg04579.html
> >>
> >> ChangLog:
> >> v5:
> >> 1. Address Eric Blake's comments
> >> v4:
> >> 1. drop nbd driver's implementation. We can use human-monitor-command
> >>to do it.
> >> 2. Rename the command name.
> >> v3:
> >> 1. Don't open BDS in bdrv_add_child(). Use the existing BDS which is
> >>created by the QMP command blockdev-add.
> >> 2. The driver NBD can support filename, path, host:port now.
> >> v2:
> >> 1. Use bdrv_get_device_or_node_name() instead of new function
> >>bdrv_get_id_or_node_name()
> >> 2. Update the error message
> >> 3. Update the documents in block-core.json
> >>
> >> Wen Congyang (4):
> >>   Add new block driver interface to add/delete a BDS's child
> >>   quorum: implement bdrv_add_child() and bdrv_del_child()
> >>   qmp: add monitor command to add/remove a child
> >>   hmp: add monitor command to add/remove a child
> >>
> >>  block.c   | 56 ++--
> >>  block/quorum.c| 72 
> >> +--
> >>  blockdev.c| 48 +++
> >>  hmp-commands.hx   | 28 ++
> >>  hmp.c | 20 +
> >>  hmp.h |  2 ++
> >>  include/block/block.h |  8 ++
> >>  include/block/block_int.h |  5 
> >>  qapi/block-core.json  | 34 ++
> >>  qmp-commands.hx   | 61 +++
> >>  10 files changed, 329 insertions(+), 5 deletions(-)
> >>
> >> -- 
> >> 2.4.3
> >>
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> > .
> > 
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v5 0/4] qapi: child add/delete support

2015-09-23 Thread Wen Congyang
On 09/23/2015 05:21 PM, Dr. David Alan Gilbert wrote:
> * Wen Congyang (we...@cn.fujitsu.com) wrote:
>> On 09/22/2015 07:15 PM, Dr. David Alan Gilbert wrote:
>>> * Wen Congyang (we...@cn.fujitsu.com) wrote:
 If quorum's child is broken, we can use mirror job to replace it.
 But sometimes, the user only need to remove the broken child, and
 add it later when the problem is fixed.

>>>
>>> Hi,
>>>   Two questions:
>>> 1) Do you have an example of a pair of add/remove commands that work
>>>   together? (I'm not quite sure I understand where the ID for the remove
>>>   comes from).
>>
>> The command line:
>> -drive 
>> if=virtio,id=disk1,driver=quorum,read-pattern=fifo,vote-threshold=1,children.0.file.filename=/data/images/kvm/suse/suse11_3.img,children.0.driver=raw
>>
>> And the QMP monitor command:
>> {'execute':'blockdev-add', 'arguments':{'options':{'driver': 'raw', 
>> 'node-name': 'test1', 'file': {'driver': 'file', 'filename': '/dev/null'}, 
>> 'id': 'test11' }  } }
>> {'execute': 'human-monitor-command', 'arguments': {'command-line': 
>> 'drive_add buddy 
>> driver=nbd,host=192.168.3.1,port=8889,export=colo-disk1,node-name=test2,if=none'}}
>> {'execute':'x-blockdev-child-add', 'arguments':{'parent': 'disk1', 'child': 
>> 'test1' } }
>> {'execute':'x-blockdev-child-add', 'arguments':{'parent': 'disk1', 'child': 
>> 'test2' } }
>> {'execute': 'x-blockdev-child-del', 'arguments': {'parent': 'disk1', 
>> 'child': 'test1' } }
>> {'execute': 'x-blockdev-child-del', 'arguments': {'parent': 'disk1', 
>> 'child': 'test2' } }
>>
>> Note: the qmp monitor command doesn't support nbd now, and I use the hmp 
>> command to add a BDS.
> 
> Thank you; OK I see the format has changed quite a bit from the older 
> version; this version
> is a lot nicer.
> 
>>> 2) If the child has failed and is not responding to block operations
>>>at all (e.g a networking failure to an nbd device which may take 
>>> minutes
>>>to time out); how do you recover - flush or drain on the devices
>>>hang at that point.
>>
>> If the network fails, the kernel doesn't notify the application...
>>
>>>
>>> (I was trying to test recovery from a failed secondary using the July COLO
>>> release; but the primary gets stuck in bdrv_drain or bdrv_flush if I kill
>>> the secondary in the right way).
>>
>> IIRC, if the qemu is killed, the connection is closed at the same time. 
>> bdrv_drain()
>> or bdrv_flush() should not get stuck.
> 
> I use kill -SIGSTOP to the secondary qemu so I think that behaves like the 
> network fails,
> or if the secondary host just failed completely.  You do need some way to 
> recover from the
> NBD server dieing like that.

You use SIGSTOP, so there is no error in the connection, and the nbd client 
will wait the
reply. bdrv_drain() will never end in this case.

> 
> It sounds like we need some way to be able to remove a blockdev that's failed 
> like that;
> Paolo suggested the 'disk deadline' series could be used to time something 
> like that
> out eventually, but maybe you need something that allows you to remove
> a child more forcibly.

Yes, but quorum will wait bdrv_co_write() return. It is very hard to implement 
it now...


I guess 'disk deadline' can fix these two problems.

Thanks
Wen Congyang

> 
> Dave
> 
>>
>> Thanks
>> Wen Congyang
>>
>>>
>>> Dave
>>>
>>>
 It is based on the following patch:
 http://lists.nongnu.org/archive/html/qemu-devel/2015-09/msg04579.html

 ChangLog:
 v5:
 1. Address Eric Blake's comments
 v4:
 1. drop nbd driver's implementation. We can use human-monitor-command
to do it.
 2. Rename the command name.
 v3:
 1. Don't open BDS in bdrv_add_child(). Use the existing BDS which is
created by the QMP command blockdev-add.
 2. The driver NBD can support filename, path, host:port now.
 v2:
 1. Use bdrv_get_device_or_node_name() instead of new function
bdrv_get_id_or_node_name()
 2. Update the error message
 3. Update the documents in block-core.json

 Wen Congyang (4):
   Add new block driver interface to add/delete a BDS's child
   quorum: implement bdrv_add_child() and bdrv_del_child()
   qmp: add monitor command to add/remove a child
   hmp: add monitor command to add/remove a child

  block.c   | 56 ++--
  block/quorum.c| 72 
 +--
  blockdev.c| 48 +++
  hmp-commands.hx   | 28 ++
  hmp.c | 20 +
  hmp.h |  2 ++
  include/block/block.h |  8 ++
  include/block/block_int.h |  5 
  qapi/block-core.json  | 34 ++
  qmp-commands.hx   | 61 +++
  10 files changed, 329 insertions(+), 5 dele

Re: [Qemu-devel] [PATCH] pc: memhotplug: rise minimum DIMM addr/size alignment to 128Mb

2015-09-23 Thread Igor Mammedov
On Mon, 21 Sep 2015 14:22:23 +0200
Paolo Bonzini  wrote:

> 
> 
> On 21/09/2015 13:50, Igor Mammedov wrote:
> > it's attempt to workaround virtio bug reported earlier:
> > http://lists.nongnu.org/archive/html/qemu-devel/2015-08/msg00522.html
> > where virtio can't handle buffer that crosses border
> > between 2 DIMM's (i.e. 2 MemoryRegions).
> > 
> > Testing showed that virtio doesn't hit above bug
> > with 128Mb DIMM's granularity. Also linux memory
> > hotplug can handle hotplugged memory starting with
> > 128Mb memory sections so lets rise minimum size limit
> > to 128Mb and align starting DIMM address on 128Mb.
> > 
> > It's certainly not the fix but it reduces risk of
> > crashing VM till virtio is fixed.
> > It also could be improved in guest's virtio side if it
> > would align buffers on 128Mb border and limit max  buffer
> > size to the same value.
> > 
> > Signed-off-by: Igor Mammedov 
> 
> This seems to be easily handled at a level above QEMU---and the fix
> would be available to older machine types as well.  This patch would
> also make it quite a bit harder to test the real fix with QEMU. It is
> not alone a reason to NACK it but should also be kept in mind.
Patch makes it easy to change enforced alignment for future machine types,
so lowering alignment for testing isn't hard when virtio is fixed.

Handling it at libvirt level is a bit hard since currently it doesn't
deal with DIMM.addr allocation and there isn't any interface
to communicate hotplug address range to libvirt. Libvirt doesn't need
to know anything about DIMM.addr except of migration when it needs
to replicate state on target side.

Also it's QEMU bug/fault and pushing workaround to upper layers
doesn't seem right when it's much easier to do it in QEMU itself.


> Aligning to 4K makes some sense, since 4K is the page size, but
> enforcing an arbitrary alignment above 4K is policy that does not belong
> in QEMU.
> 
> To some extend, enforcing natural alignment would be okay as a
> workaround for the virtio bug as well.  It would also make it easier to
> ensure that hotplugged hugetlbfs-backed memory can use hugepages in the
> guest.  Does it make sense to you?
> 
> Paolo
> 
> > ---
> > Based on PCI tree as it has patches that add
> > 2.5 machine type.
> > ---
> >  hw/i386/pc.c |  8 +---
> >  hw/i386/pc_piix.c| 12 ++--
> >  hw/i386/pc_q35.c | 12 ++--
> >  include/hw/i386/pc.h |  5 ++---
> >  4 files changed, 27 insertions(+), 10 deletions(-)
> > 
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index b5107f7..ddb6710 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -1645,8 +1645,9 @@ static void pc_dimm_plug(HotplugHandler *hotplug_dev,
> >  MemoryRegion *mr = ddc->get_memory_region(dimm);
> >  uint64_t align = TARGET_PAGE_SIZE;
> >  
> > -if (memory_region_get_alignment(mr) && pcms->enforce_aligned_dimm) {
> > -align = memory_region_get_alignment(mr);
> > +if (pcms->enforce_aligned_dimm) {
> > +align = MAX(memory_region_get_alignment(mr),
> > +pcms->enforce_aligned_dimm);
> >  }
> >  
> >  if (!pcms->acpi_dev) {
> > @@ -1936,7 +1937,8 @@ static void pc_machine_initfn(Object *obj)
> >  "Enable vmport (pc & q35)",
> >  &error_abort);
> >  
> > -pcms->enforce_aligned_dimm = true;
> > +/* align DIMM starting address/size by 128Mb */
> > +pcms->enforce_aligned_dimm = 1ULL << 27;
> >  object_property_add_bool(obj, PC_MACHINE_ENFORCE_ALIGNED_DIMM,
> >   pc_machine_get_aligned_dimm,
> >   NULL, &error_abort);
> > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> > index caa4edc..7671905 100644
> > --- a/hw/i386/pc_piix.c
> > +++ b/hw/i386/pc_piix.c
> > @@ -301,9 +301,17 @@ static void pc_init1(MachineState *machine,
> >  }
> >  }
> >  
> > +static void pc_compat_2_4(MachineState *machine)
> > +{
> > +PCMachineState *pcms = PC_MACHINE(machine);
> > +
> > +pcms->enforce_aligned_dimm = TARGET_PAGE_SIZE;
> > +}
> > +
> >  static void pc_compat_2_3(MachineState *machine)
> >  {
> >  PCMachineState *pcms = PC_MACHINE(machine);
> > +pc_compat_2_4(machine);
> >  savevm_skip_section_footers();
> >  if (kvm_enabled()) {
> >  pcms->smm = ON_OFF_AUTO_OFF;
> > @@ -326,7 +334,7 @@ static void pc_compat_2_1(MachineState *machine)
> >  pc_compat_2_2(machine);
> >  smbios_uuid_encoded = false;
> >  x86_cpu_compat_kvm_no_autodisable(FEAT_8000_0001_ECX, CPUID_EXT3_SVM);
> > -pcms->enforce_aligned_dimm = false;
> > +pcms->enforce_aligned_dimm = 0;
> >  }
> >  
> >  static void pc_compat_2_0(MachineState *machine)
> > @@ -485,7 +493,7 @@ static void pc_i440fx_2_4_machine_options(MachineClass 
> > *m)
> >  SET_MACHINE_COMPAT(m, PC_COMPAT_2_4);
> >  }
> >  
> > -DEFINE_I440FX_MACHINE(v2_4, "pc-i440fx-2.4", NULL,
> > +DEFINE_I440FX_MACHINE(v2_4, "pc-i440f

Re: [Qemu-devel] [PATCH] pc: memhotplug: rise minimum DIMM addr/size alignment to 128Mb

2015-09-23 Thread Paolo Bonzini


On 23/09/2015 11:36, Igor Mammedov wrote:
> Also it's QEMU bug/fault and pushing workaround to upper layers
> doesn't seem right when it's much easier to do it in QEMU itself.

No, it's virtio's bug.  It makes sense to push workaround for guest bugs
as far from the hypervisor as possible.

If we want to increase the alignment in QEMU, I would prefer to have
natural alignment which makes some sense and happens to work around the
bug as well.  Michael, Eduardo, any third opinions?

Paolo



Re: [Qemu-devel] [PATCH v5 03/46] qapi: Test for C member name collisions

2015-09-23 Thread Markus Armbruster
Eric Blake  writes:

> On 09/22/2015 09:23 AM, Markus Armbruster wrote:
>> Eric Blake  writes:
>> 
>>> Expose some weaknesses in the generator: we don't always forbid
>>> the generation of structs that contain multiple members that map
>> 
>> Slightly misleading.  args-name-clash is a clash between command
>> arguments.  These are a struct internally, but we don't currently
>> generate an actual struct for it, only an argument list.
>
> Maybe struct-member-clash?  Renames are easy enough, but only if patch
> 1/46 is okay to go in first. :)

Naming the test args-member-clash is fine, because that's what it tests.
It also covers struct member clashes due to they way arguments work.
The slightly misleading part is the commit message.

>>> to the same C name.  This has already been marked FIXME in
>>> qapi.py, but having more tests will make sure future patches
>>> produce desired behavior.
>> 
>> Point to commit d90675f?
>
> Sure, now that it finally landed.
>
>> 
>>> Some of these tests will be deleted later, and a positive test
>>> added to qapi-schema-test.json in its place, when the code is
>> 
>> "in their place"?
>> 
>
> Yep. (Perils of editing, I started with one test, then added more later
> and merged into one patch)
>
>>> reworked so that the collision no longer occurs.
>>>
>>> Signed-off-by: Eric Blake 
>>> ---
>
>>> +++ b/tests/qapi-schema/flat-union-branch-clash.json
>>> @@ -1,4 +1,4 @@
>>> -# we check for no duplicate keys between branches and base
>>> +# we check for no duplicate keys between branch members and base
>>>  { 'enum': 'TestEnum',
>>>'data': [ 'value1', 'value2' ] }
>>>  { 'struct': 'Base',
>> 
>> This clashing business is awfully confusing as soon as unions come into
>> play.  When I'm confused, I need to think in writing.
>
> No kidding.  We already attempted to detect clashes, and caught some,
> but not all, types of clashes.  And there are indeed two types of member
> name clashes: those where the generated C struct has duplicate members
> (either because 2 user names map to the same C name, or because the
> generated code injects a C name for a purpose other than a "key":value
> name), and those where the qapi type would specify the same "key":value
> name more than once in the same {} object on the wire (even if the names
> would not collide in C because one is accessed through a box pointer).
> By patch 16/46, we should be catching all cases of member name clashes,
> but there's still work to do to catch collisions in 'command' and/or
> 'event' names.
>
> Also, by the time 16/46 is in, there are cases where we reject "clashes"
> where two member names with different spellings would map to the same C
> name, but where the corresponding C struct does not have a clash because
> the members are boxed behind different pointers.  Technically, we would
> not have to reject such cases, but the case is still confusing enough
> that rejecting it forces the qapi writer to consider a naming convention
> that is less confusing in the first place.

That's fair.

>> The basic case is clash between local, non-variant members.  Needs test
>> coverage.  args-name-clash.json provides it, because internally the
>> arguments are just another object type.
>
> Correct.  The test proves we don't yet catch the clash, and is fixed
> when later commits add the check.
>
>> 
>> With a base, the members inherited from base get added to the mix.  We
>> need to test a clash betwen local, non-variant member and a member
>> inherited from base.
>
> True for both structs and flat unions (the two places where we use
> 'base').  More on this below.
>
>> 
>> With unions, things get complicated, because we have multiple kinds of
>> clashes.  Best explained with an example.  Let's use UserDefFlatUnion
>> from qapi-schema-test.json.
>> 
>> { 'union': 'UserDefFlatUnion',
>>   'base': 'UserDefUnionBase',   # intentional forward reference
>>   'discriminator': 'enum1',
>>   'data': { 'value1' : 'UserDefA',
>> 'value2' : 'UserDefB',
>> 'value3' : 'UserDefB' } }
>> 
>> { 'struct': 'UserDefUnionBase',
>>   'base': 'UserDefZero',
>>   'data': { 'string': 'str', 'enum1': 'EnumOne' } }
>> 
>> Generated C looks like this:
>> 
>> struct UserDefFlatUnion {
>> /* Members inherited from UserDefUnionBase: */
>> int64_t integer;
>> char *string;
>> EnumOne enum1;
>> /* Own members: */
>> // if the schema language supported adding non-variant local
>> // members, they'd go right here
>> union { /* union tag is @enum1 */
>> void *data;
>> UserDefA *value1;
>> UserDefB *value2;
>> UserDefB *value3;
>> };
>> };
>> 
>> Thus, what can clash in C is the tag values value1, value2, value3 with
>> the non-variant members integer, string, enum1.
>
> That is, the tag values now appear as C member names, even though they
> did not correspond to QMP "key":v

Re: [Qemu-devel] [PATCH v3 0/5] add ACPI node for fw_cfg on pc and arm

2015-09-23 Thread Igor Mammedov
On Thu, 17 Sep 2015 10:56:29 -0400
"Gabriel L. Somlo"  wrote:

> New since v2:
> 
>   - pc/i386 node in ssdt only on machine types *newer* than 2.4
> (as suggested by Eduardo)
> 
> I appreciate any further comments and reviews. Hopefully we can make
> this palatable for upstream, modulo the lingering concerns about whether
> "QEMU0002" is ok to use as the value of _HID, which I'll hopefully get
> sorted out with the kernel crew...
> 
> Thanks much,
>   --Gabriel
> 
> >New since v1:
> >
> > - expose control register size (suggested by Marc Marí)
> >
> > - leaving out _UID and _STA fields (thanks Shannon & Igor)
> >
> > - using "QEMU0002" as the value of _HID (thanks Michael)
> >
> > - added documentation blurb to docs/specs/fw_cfg.txt
> >   (mainly to record usage of the "QEMU0002" string with fw_cfg).
> >
> >> This series adds a fw_cfg device node to the SSDT (on pc), or to the
> >> DSDT (on arm).
> >>
> >>- Patch 1/3 moves (and renames) the BIOS_CFG_IOPORT (0x510)
> >>  define from pc.c to pc.h, so that it could be used from
> >>  acpi-build.c in patch 2/3.
> >> 

> >>- Patch 2/3 adds a fw_cfg node to the pc SSDT.
> >> 
> >>- Patch 3/3 adds a fw_cfg node to the arm DSDT.
ACPI patches 2-3 look fine to me
although there is one but, it will make Windows complain about
unknown device and prompt user to install driver.

> >>
> >> I made up some names - "FWCF" for the node name, and "FWCF0001"
> >> for _HID; no idea whether that's appropriate, or how else I should
> >> figure out what to use instead...
> >>
> >> Also, using scope "\\_SB", based on where fw_cfg shows up in the
> >> output of "info qtree". Again, if that's wrong, please point me in
> >> the right direction.
> >>
> >> Re. 3/3 (also mentioned after the commit blurb in the patch itself),
> >> I noticed none of the other DSDT entries contain a _STA field, wondering
> >> why it would (not) make sense to include that, same as on the PC.
> 
> Gabriel L. Somlo (5):
>   fw_cfg: expose control register size in fw_cfg.h
>   pc: fw_cfg: move ioport base constant to pc.h
>   acpi: pc: add fw_cfg device node to ssdt
>   acpi: arm: add fw_cfg device node to dsdt
>   fw_cfg: document ACPI device node information
> 
>  docs/specs/fw_cfg.txt |  9 +
>  hw/arm/virt-acpi-build.c  | 13 +
>  hw/i386/acpi-build.c  | 21 +
>  hw/i386/pc.c  |  5 ++---
>  hw/i386/pc_piix.c |  1 +
>  hw/i386/pc_q35.c  |  1 +
>  hw/nvram/fw_cfg.c |  8 +---
>  include/hw/i386/pc.h  |  3 +++
>  include/hw/nvram/fw_cfg.h |  3 +++
>  9 files changed, 58 insertions(+), 6 deletions(-)
> 




Re: [Qemu-devel] [PATCH v2 0/5] monitor: throttle VSERPORT_CHANGED by "id"

2015-09-23 Thread Marc-André Lureau
ping

On Thu, Sep 17, 2015 at 6:08 PM,   wrote:
> From: Marc-André Lureau 
>
> QAPI_EVENT_VSERPORT_CHANGE reports changes of a virtio serial port
> state. However, the events may be for different ports, but the throttle
> mechanism may replace the event for a different port, since it only
> checks the event type.
>
> The following series implements throttling of events based on the "id"
> field. Hopefully this hash table approach can be later extended if
> other fields or combination of fields have to be used.
>
> v1->v2:
> - split first patch in 2 to ease review
> - remove some extra space
> - add some comments above delay handler function, and struct fields
> - rename the delay handler data "delay_data"
> - add a trace in monitor_protocol_event_delay()
> - improve some commit messages
> - simplify monitor_qapi_event_delay()
> - add some comment assert code in monitor_qapi_event_id_delay() to
>   ensure the given pending struct is valid
> - fixed hashtable key leak
> - rename qdict "data" argument to "qdict"
> - removed superfluous parenthesis
> - use a single timer handler for doing "id" filtering cleanup
>
> Marc-André Lureau (5):
>   monitor: split MonitorQAPIEventState
>   monitor: introduce MonitorQAPIEventDelay callback
>   monitor: rename QDict *data->qdict
>   monitor: throttle QAPI_EVENT_VSERPORT_CHANGE by "id"
>   monitor: remove old entries from event hash table
>
>  monitor.c| 256 
> ++-
>  trace-events |   3 +-
>  2 files changed, 203 insertions(+), 56 deletions(-)
>
> --
> 2.4.3
>



-- 
Marc-André Lureau



Re: [Qemu-devel] [PATCH] README: fill out some useful quickstart information

2015-09-23 Thread Markus Armbruster
Paolo Bonzini  writes:

> On 22/09/2015 10:16, Markus Armbruster wrote:
>
>> John Snow  writes:
>>
>>> "To build QEMU, you'd generally [some basic steps that have generally
>>> worked for a long time], but for up-to-date authoritative information,
>>> please do:
>>>
>>> mkdir path_to/build_dir
>>> cd path_to/build_dir
>>> ../../configure
>>> make qemu-doc.html"
>>>
>>> We wind up documenting how to almost do a build anyway.
>
> I don't think this is necessary.  We should remove duplicate information
> from qemu-doc.texi, developer documentation doesn't belong in manuals.

Just to avoid misunderstandings: I wouldn't object to splitting
qemu-doc.texi into a user manual and developer documentation.



[Qemu-devel] [PATCH v2] docs: describe the QEMU build system structure / design

2015-09-23 Thread Daniel P. Berrange
Developers who are new to QEMU, or have a background familiarity
with GNU autotools, can have trouble getting their head around the
home-grown QEMU build system. This document attempts to explain
the structure / design of the configure script and the various
Makefile pieces that live across the source tree.

Signed-off-by: Daniel P. Berrange 
---

Changed in v2:

 - Misc speling eror fixes
 - Rephrased some paragraphs as suggested
 - Added note about config-host.h file generation & use

 docs/build-system.txt | 506 ++
 1 file changed, 506 insertions(+)
 create mode 100644 docs/build-system.txt

diff --git a/docs/build-system.txt b/docs/build-system.txt
new file mode 100644
index 000..36e4aa0
--- /dev/null
+++ b/docs/build-system.txt
@@ -0,0 +1,506 @@
+The QEMU build system architecture
+==
+
+This document aims to help developers understand the architecture of the
+QEMU build system. As with projects using GNU autotools, the QEMU build
+system has two stages, first the developer runs the "configure" script
+to determine the local build environment characteristics, then they run
+"make" to build the project. There is about where the similarities with
+GNU autotools end, so try to forget what you know about them.
+
+
+Stage 1: configure
+==
+
+The QEMU configure script is written directly in shell, and should be
+compatible with any POSIX shell, hence it uses #!/bin/sh. An important
+implication of this is that it is important to avoid using bash-isms on
+development platforms where bash is the primary host.
+
+In contrast to autoconf scripts, QEMU's configure is expected to be
+silent while it is checking for features. It will only display output
+when an error occurs, or to show the final feature enablement summary
+on completion.
+
+Adding new checks to the configure script usually comprises the
+following tasks:
+
+ - Initialize one or more variables with the default feature state.
+
+   Ideally features should auto-detect whether they are present,
+   so try to avoid hardcoding the initial state to either enabled
+   or disabled, as that forces the user to pass a --enable-XXX
+   / --disable-XXX flag on every invocation of configure.
+
+ - Add support to the command line arg parser to handle any new
+   --enable-XXX / --disable-XXX flags required by the feature XXX.
+
+ - Add information to the help output message to report on the new
+   feature flag.
+
+ - Add code to perform the actual feature check. As noted above, try to
+   be fully dynamic in checking enablement/disablement.
+
+ - Add code to print out the feature status in the configure summary
+   upon completion.
+
+ - Add any new makefile variables to $config_host_mak on completion.
+
+
+Taking (a simplified version of) the probe for gnutls from configure,
+we have the following pieces:
+
+  # Initial variable state
+  gnutls=""
+
+  ..snip..
+
+  # Configure flag processing
+  --disable-gnutls) gnutls="no"
+  ;;
+  --enable-gnutls) gnutls="yes"
+  ;;
+
+  ..snip..
+
+  # Help output feature message
+  gnutls  GNUTLS cryptography support
+
+  ..snip..
+
+  # Test for gnutls
+  if test "$gnutls" != "no"; then
+ if ! $pkg_config --exists "gnutls"; then
+gnutls_cflags=`$pkg_config --cflags gnutls`
+gnutls_libs=`$pkg_config --libs gnutls`
+libs_softmmu="$gnutls_libs $libs_softmmu"
+libs_tools="$gnutls_libs $libs_tools"
+QEMU_CFLAGS="$QEMU_CFLAGS $gnutls_cflags"
+gnutls="yes"
+ elif test "$gnutls" = "yes"; then
+feature_not_found "gnutls" "Install gnutls devel"
+ else
+gnutls="no"
+ fi
+  fi
+
+  ..snip..
+
+  # Completion feature summary
+  echo "GNUTLS support$gnutls"
+
+  ..snip..
+
+  # Define make variables
+  if test "$gnutls" = "yes" ; then
+ echo "CONFIG_GNUTLS=y" >> $config_host_mak
+  fi
+
+
+Helper functions
+
+
+The configure script provides a variety of helper functions to assist
+developers in checking for system features:
+
+ - do_cc $ARGS...
+
+   Attempt to run the system C compiler passing it $ARGS...
+
+ - do_cxx $ARGS...
+
+   Attempt to run the system C++ compiler passing it $ARGS...
+
+ - compile_object $CFLAGS
+
+   Attempt to compile a test program with the system C compiler using
+   $CFLAGS. The test program must have been previously written to a file
+   called $TMPC.
+
+ - compile_prog $CFLAGS $LDFLAGS
+
+   Attempt to compile a test program with the system C compiler using
+   $CFLAGS and link it with the system linker using $LDFLAGS. The test
+   program must have been previously written to a file called $TMPC.
+
+ - has $COMMAND
+
+   Determine if $COMMAND exists in the current environment, either as a
+   shell builtin, or executable binary, returning 0 on success.
+
+ - path_of $COMMAND
+
+   Return the fully qualified path of $COMMAND, printing it to stdout,
+   and returning 0 on success.
+
+ - ch

Re: [Qemu-devel] [PATCH] README: fill out some useful quickstart information

2015-09-23 Thread Paolo Bonzini


On 23/09/2015 11:52, Markus Armbruster wrote:
> > I don't think this is necessary.  We should remove duplicate information
> > from qemu-doc.texi, developer documentation doesn't belong in manuals.
>
> Just to avoid misunderstandings: I wouldn't object to splitting
> qemu-doc.texi into a user manual and developer documentation.

That really boils down to moving chapter 6 somewhere else; everything
else is user documentation.  I think we should start with Dan's patch,
and then integrate it with any missing information from chapter 6.

Together with his build system docs, the non-texi developer
documentation is better than anything qemu-doc.texi has ever had
(*insert usual rant about perfect being the enemy of good*).

We also have the opposite problem (user documentation placed in docs/
instead of qemu-doc.texi) but that is separate.

Paolo



Re: [Qemu-devel] [PATCH v2 1/5] monitor: split MonitorQAPIEventState

2015-09-23 Thread Daniel P. Berrange
On Thu, Sep 17, 2015 at 06:08:46PM +0200, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau 
> 
> Create a separate structure MonitorQAPIEventPending for holding the data
> related to pending event.
> 
> The next commits are going to reuse this structure for different
> throttling implementations.
> 
> Signed-off-by: Marc-André Lureau 
> ---
>  monitor.c | 85 
> +--
>  1 file changed, 50 insertions(+), 35 deletions(-)

Reviewed-By: Daniel P. Berrange 

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH v2 2/5] monitor: introduce MonitorQAPIEventDelay callback

2015-09-23 Thread Daniel P. Berrange
On Thu, Sep 17, 2015 at 06:08:47PM +0200, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau 
> 
> Move the delay handling in a seperate function that can be changed for
> different throttling implementations, as will be done in following commits.
> 
> Signed-off-by: Marc-André Lureau 
> ---
>  monitor.c| 85 
> +++-
>  trace-events |  3 ++-
>  2 files changed, 57 insertions(+), 31 deletions(-)

Reviewed-by: Daniel P. Berrange 

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH v2] docs: describe the QEMU build system structure / design

2015-09-23 Thread Paolo Bonzini

A few nits that missed my first review:

On 23/09/2015 11:59, Daniel P. Berrange wrote:
> +A further complication for the system and userspace emulator binaries is
> +that two separate binaries need to be generated.

A further complication for the system emulator binaries (Windows does
not support userspace emulation) is that...

> There are no
> +corresponding $(QEMU_LIBS)/$(QEMU_LDFLAGS) variables, instead there are
> +a couple of more targeted variables.

The corresponding variable for linker flags is $(LIBS), but usually more
targeted variables are used instead.

> $(libs_softmmu) is used for
> +libraries that must be linked to system emulator targets, $(libs_tools)

$(LIBS_TOOLS)

> +is used for tools like qemu-img, qemu-nbd, etc and $(libs_qga) is used

$(LIBS_QGA)

> +for the QEMU guest agent. There is currently no variable for the
> +userspace emulator targets.

; they only use the generic $(LIBS) variable.

Paolo



Re: [Qemu-devel] [PATCH v2 3/5] monitor: rename QDict *data->qdict

2015-09-23 Thread Daniel P. Berrange
On Thu, Sep 17, 2015 at 06:08:48PM +0200, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau 
> 
> As suggested by Markus Armbruster, this is a bit more specific for the
> event qdict than 'data'.
> 
> Signed-off-by: Marc-André Lureau 
> ---
>  monitor.c | 36 ++--
>  1 file changed, 18 insertions(+), 18 deletions(-)

Reviewed-by: Daniel P. Berrange 


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH v2 4/5] monitor: throttle QAPI_EVENT_VSERPORT_CHANGE by "id"

2015-09-23 Thread Daniel P. Berrange
On Thu, Sep 17, 2015 at 06:08:49PM +0200, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau 
> 
> Use a hash table to lookup the pending event corresponding to the "id"
> field. The hash table may grow without limit here, the following patch
> will add some cleaning.
> 
> Signed-off-by: Marc-André Lureau 
> Reviewed-by: Eric Blake 
> ---
>  monitor.c | 104 
> --
>  1 file changed, 81 insertions(+), 23 deletions(-)

Reviewed-by: Daniel P. Berrange 


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH v2 5/5] monitor: remove old entries from event hash table

2015-09-23 Thread Daniel P. Berrange
On Thu, Sep 17, 2015 at 06:08:50PM +0200, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau 
> 
> Do not let the hash table grow without limit, schedule a cleanup for
> outdated event.
> 
> Signed-off-by: Marc-André Lureau 
> ---
>  monitor.c | 58 +-
>  1 file changed, 53 insertions(+), 5 deletions(-)

Reviewed-by: Daniel P. Berrange 


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [RFC PATCH 03/10] vfio: Check guest IOVA ranges against host IOMMU capabilities

2015-09-23 Thread Thomas Huth
On 17/09/15 15:09, David Gibson wrote:
> The current vfio core code assumes that the host IOMMU is capable of
> mapping any IOVA the guest wants to use to where we need.  However, real
> IOMMUs generally only support translating a certain range of IOVAs (the
> "DMA window") not a full 64-bit address space.
> 
> The common x86 IOMMUs support a wide enough range that guests are very
> unlikely to go beyond it in practice, however the IOMMU used on IBM Power
> machines - in the default configuration - supports only a much more limited
> IOVA range, usually 0..2GiB.
> 
> If the guest attempts to set up an IOVA range that the host IOMMU can't
> map, qemu won't report an error until it actually attempts to map a bad
> IOVA.  If guest RAM is being mapped directly into the IOMMU (i.e. no guest
> visible IOMMU) then this will show up very quickly.  If there is a guest
> visible IOMMU, however, the problem might not show up until much later when
> the guest actually attempt to DMA with an IOVA the host can't handle.
> 
> This patch adds a test so that we will detect earlier if the guest is
> attempting to use IOVA ranges that the host IOMMU won't be able to deal
> with.
> 
> For now, we assume that "Type1" (x86) IOMMUs can support any IOVA, this is
> incorrect, but no worse than what we have already.  We can't do better for
> now because the Type1 kernel interface doesn't tell us what IOVA range the
> IOMMU actually supports.
> 
> For the Power "sPAPR TCE" IOMMU, however, we can retrieve the supported
> IOVA range and validate guest IOVA ranges against it, and this patch does
> so.
> 
> Signed-off-by: David Gibson 
> ---
>  hw/vfio/common.c  | 42 +++---
>  include/hw/vfio/vfio-common.h |  6 ++
>  2 files changed, 45 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 9953b9c..c37f1a1 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -344,14 +344,23 @@ static void vfio_listener_region_add(MemoryListener 
> *listener,
>  if (int128_ge(int128_make64(iova), llend)) {
>  return;
>  }
> +end = int128_get64(llend);
> +
> +if ((iova < container->iommu_data.min_iova)
> +|| ((end - 1) > container->iommu_data.max_iova)) {

(Too much paranthesis for my taste ;-))

> +error_report("vfio: IOMMU container %p can't map guest IOVA region"
> + " 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx,
> + container, iova, end - 1);
> +ret = -EFAULT; /* FIXME: better choice here? */

Maybe -EINVAL? ... but -EFAULT also sounds ok for me.

> +goto fail;
> +}
...
> @@ -712,6 +732,22 @@ static int vfio_connect_container(VFIOGroup *group, 
> AddressSpace *as)
>  ret = -errno;
>  goto free_container_exit;
>  }
> +
> +/*
> + * FIXME: This only considers the host IOMMU' 32-bit window.
> + * At some point we need to add support for the optional
> + * 64-bit window and dynamic windows
> + */
> +info.argsz = sizeof(info);
> +ret = ioctl(fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
> +if (ret) {
> +error_report("vfio: VFIO_IOMMU_SPAPR_TCE_GET_INFO failed: %m");

Isn't that %m a glibc extension only? ... Well, this code likely only
runs on Linux with a glibc, so it likely doesn't matter, I guess...

> +ret = -errno;
> +goto free_container_exit;
> +}
> +container->iommu_data.min_iova = info.dma32_window_start;
> +container->iommu_data.max_iova = container->iommu_data.min_iova
> ++ info.dma32_window_size - 1;
>  } else {
>  error_report("vfio: No available IOMMU models");
>  ret = -EINVAL;
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index aff18cd..88ec213 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -71,6 +71,12 @@ typedef struct VFIOContainer {
>  MemoryListener listener;
>  int error;
>  bool initialized;
> +/*
> + * FIXME: This assumes the host IOMMU can support only a
> + * single contiguous IOVA window.  We may need to generalize
> + * that in future
> + */
> +hwaddr min_iova, max_iova;

Should that maybe be dma_addr_t instead of hwaddr ?

>  } iommu_data;
>  QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
>  QLIST_HEAD(, VFIOGroup) group_list;
> 

 Thomas




Re: [Qemu-devel] [PATCH] README: fill out some useful quickstart information

2015-09-23 Thread Daniel P. Berrange
On Wed, Sep 23, 2015 at 12:00:08PM +0200, Paolo Bonzini wrote:
> 
> 
> On 23/09/2015 11:52, Markus Armbruster wrote:
> > > I don't think this is necessary.  We should remove duplicate information
> > > from qemu-doc.texi, developer documentation doesn't belong in manuals.
> >
> > Just to avoid misunderstandings: I wouldn't object to splitting
> > qemu-doc.texi into a user manual and developer documentation.
> 
> That really boils down to moving chapter 6 somewhere else; everything
> else is user documentation.  I think we should start with Dan's patch,
> and then integrate it with any missing information from chapter 6.

With GNU autotools you get a generic INSTALL file which describe the
way you use configure to build. I'd suggest we just move chapter 6
to be an INSTALL file in plain text, and just have the README file
point people in that direction. Keep qemu-doc.texi as purely the
runtime usage manual


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH v3 04/46] ivshmem: fix number of bytes to push to fifo

2015-09-23 Thread Marc-André Lureau
Hi

On Wed, Sep 16, 2015 at 11:28 AM, Claudio Fontana
 wrote:
> (should we rely on sizeof(long) here?)

This is unrelated to the fix (MIN/MAX). I will add a patch to change
the protocol.

-- 
Marc-André Lureau



Re: [Qemu-devel] [PATCH v3 05/46] ivshmem: factor out the incoming fifo handling

2015-09-23 Thread Marc-André Lureau
Hi

On Tue, Sep 22, 2015 at 4:01 PM, Claudio Fontana
 wrote:
> On 15.09.2015 18:07, marcandre.lur...@redhat.com wrote:
>> From: Marc-André Lureau 
>>
>> Make a new function fifo_update_and_get() that can be reused by other
>> functions (in next commits).
>>
>> Signed-off-by: Marc-André Lureau 
>> ---
>>  hw/misc/ivshmem.c | 59 
>> ---
>>  1 file changed, 39 insertions(+), 20 deletions(-)
>>
>> diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
>> index 2162d02..dd15f0e 100644
>> --- a/hw/misc/ivshmem.c
>> +++ b/hw/misc/ivshmem.c
>> @@ -441,6 +441,42 @@ static int increase_dynamic_storage(IVShmemState *s, 
>> int new_min_size)
>>  return 0;
>>  }
>>
>> +static bool fifo_update_and_get(IVShmemState *s, const uint8_t *buf, int 
>> size,
>> +void *data, size_t len)
>> +{
>> +const uint8_t *p;
>> +uint32_t num;
>> +
>> +assert(len <= sizeof(long)); /* limitation of the fifo */
>> +if (fifo8_is_empty(&s->incoming_fifo) && size == len) {
>> +memcpy(data, buf, size);
>> +return true;
>> +}
>> +
>> +IVSHMEM_DPRINTF("short read of %d bytes\n", size);
>> +
>> +num = MIN(size, sizeof(long) - fifo8_num_used(&s->incoming_fifo));
>> +fifo8_push_all(&s->incoming_fifo, buf, num);
>> +
>> +if (fifo8_num_used(&s->incoming_fifo) < len) {
>> +assert(num == 0);
>> +return false;
>> +}
>> +
>> +size -= num;
>> +buf += num;
>> +p = fifo8_pop_buf(&s->incoming_fifo, len, &num);
>> +assert(num == len);
>> +
>> +memcpy(data, p, len);
>> +
>> +if (size > 0) {
>> +fifo8_push_all(&s->incoming_fifo, buf, size);
>> +}
>> +
>> +return true;
>> +}
>> +
>>  static void ivshmem_read(void *opaque, const uint8_t *buf, int size)
>>  {
>>  IVShmemState *s = opaque;
>> @@ -448,26 +484,9 @@ static void ivshmem_read(void *opaque, const uint8_t 
>> *buf, int size)
>>  int guest_max_eventfd;
>>  long incoming_posn;
>>
>> -if (fifo8_is_empty(&s->incoming_fifo) && size == sizeof(incoming_posn)) 
>> {
>> -memcpy(&incoming_posn, buf, size);
>> -} else {
>> -const uint8_t *p;
>> -uint32_t num;
>> -
>> -IVSHMEM_DPRINTF("short read of %d bytes\n", size);
>> -num = MIN(size, sizeof(long) - fifo8_num_used(&s->incoming_fifo));
>> -fifo8_push_all(&s->incoming_fifo, buf, num);
>> -if (fifo8_num_used(&s->incoming_fifo) < sizeof(incoming_posn)) {
>> -return;
>> -}
>> -size -= num;
>> -buf += num;
>> -p = fifo8_pop_buf(&s->incoming_fifo, sizeof(incoming_posn), &num);
>> -g_assert(num == sizeof(incoming_posn));
>> -memcpy(&incoming_posn, p, sizeof(incoming_posn));
>> -if (size > 0) {
>> -fifo8_push_all(&s->incoming_fifo, buf, size);
>> -}
>> +if (!fifo_update_and_get(s, buf, size,
>> + &incoming_posn, sizeof(incoming_posn))) {
>> +return;
>>  }
>>
>>  if (incoming_posn < -1) {
>>
>
>
> Fine in principle, I have that reservation about using sizeof(long) as part 
> of the interface.


This will be treated in a seperate patch.

-- 
Marc-André Lureau



Re: [Qemu-devel] [PATCH v3 09/46] ivshmem: more qdev conversion

2015-09-23 Thread Marc-André Lureau
On Tue, Sep 22, 2015 at 4:00 PM, Claudio Fontana
 wrote:
> On 15.09.2015 18:07, marcandre.lur...@redhat.com wrote:
>> From: Marc-André Lureau 
>>
>> Use the latest qemu device modeling API, in particular, convert to
>> realize to fix the error handling; right now a botched device_add
>> ivhsmem command kills the VM.
>>
>> Signed-off-by: Marc-André Lureau 
>> ---
>>  hw/misc/ivshmem.c | 119 
>> +++---
>>  1 file changed, 68 insertions(+), 51 deletions(-)
>>
>> diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
>> index 1b8204e..3af73a5 100644
>> --- a/hw/misc/ivshmem.c
>> +++ b/hw/misc/ivshmem.c
>> @@ -319,22 +319,23 @@ static CharDriverState* create_eventfd_chr_device(void 
>> * opaque, EventNotifier *
>>
>>  }
>>
>> -static int check_shm_size(IVShmemState *s, int fd) {
>> +static int check_shm_size(IVShmemState *s, int fd, Error **errp)
>> +{
>>  /* check that the guest isn't going to try and map more memory than the
>>   * the object has allocated return -1 to indicate error */
>>
>>  struct stat buf;
>>
>>  if (fstat(fd, &buf) < 0) {
>> -error_report("exiting: fstat on fd %d failed: %s",
>> - fd, strerror(errno));
>> +error_setg(errp, "exiting: fstat on fd %d failed: %s",
>> +   fd, strerror(errno));
>>  return -1;
>>  }
>>
>>  if (s->ivshmem_size > buf.st_size) {
>> -error_report("Requested memory size greater"
>> - " than shared object size (%" PRIu64 " > %" PRIu64")",
>> - s->ivshmem_size, (uint64_t)buf.st_size);
>> +error_setg(errp, "Requested memory size greater"
>> +   " than shared object size (%" PRIu64 " > %" PRIu64")",
>> +   s->ivshmem_size, (uint64_t)buf.st_size);
>>  return -1;
>>  } else {
>>  return 0;
>> @@ -343,13 +344,18 @@ static int check_shm_size(IVShmemState *s, int fd) {
>>
>>  /* create the shared memory BAR when we are not using the server, so we can
>>   * create the BAR and map the memory immediately */
>> -static void create_shared_memory_BAR(IVShmemState *s, int fd, uint8_t attr) 
>> {
>> -
>> +static int create_shared_memory_BAR(IVShmemState *s, int fd, uint8_t attr,
>> +Error **errp)
>> +{
>>  void * ptr;
>>
>> -s->shm_fd = fd;
>> -
>>  ptr = mmap(0, s->ivshmem_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
>> +if (ptr == MAP_FAILED) {
>> +error_setg_errno(errp, errno, "Failed to mmap shared memory");
>> +return -1;
>> +}
>> +
>> +s->shm_fd = fd;
>>
>>  memory_region_init_ram_ptr(&s->ivshmem, OBJECT(s), "ivshmem.bar2",
>> s->ivshmem_size, ptr);
>> @@ -358,6 +364,8 @@ static void create_shared_memory_BAR(IVShmemState *s, 
>> int fd, uint8_t attr) {
>>
>>  /* region for shared memory */
>>  pci_register_bar(PCI_DEVICE(s), 2, attr, &s->bar);
>> +
>> +return 0;
>>  }
>>
>>  static void ivshmem_add_eventfd(IVShmemState *s, int posn, int i)
>> @@ -481,6 +489,7 @@ static void ivshmem_read(void *opaque, const uint8_t 
>> *buf, int size)
>>  int incoming_fd;
>>  int guest_max_eventfd;
>>  long incoming_posn;
>> +Error *err = NULL;
>>
>>  if (!fifo_update_and_get(s, buf, size,
>>   &incoming_posn, sizeof(incoming_posn))) {
>> @@ -524,18 +533,24 @@ static void ivshmem_read(void *opaque, const uint8_t 
>> *buf, int size)
>>
>>  /* if the position is -1, then it's shared memory region fd */
>>  if (incoming_posn == -1) {
>> -
>>  void * map_ptr;
>>
>>  s->max_peer = 0;
>>
>> -if (check_shm_size(s, incoming_fd) == -1) {
>> -exit(1);
>> +if (check_shm_size(s, incoming_fd, &err) == -1) {
>> +error_report_err(err);
>> +close(incoming_fd);
>> +return;
>>  }
>>
>>  /* mmap the region and map into the BAR2 */
>>  map_ptr = mmap(0, s->ivshmem_size, PROT_READ|PROT_WRITE, MAP_SHARED,
>>  incoming_fd, 0);
>> +if (map_ptr == MAP_FAILED) {
>> +error_report("Failed to mmap shared memory %s", 
>> strerror(errno));
>> +close(incoming_fd);
>> +return;
>> +}
>>  memory_region_init_ram_ptr(&s->ivshmem, OBJECT(s),
>> "ivshmem.bar2", s->ivshmem_size, 
>> map_ptr);
>>  vmstate_register_ram(&s->ivshmem, DEVICE(s));
>> @@ -610,7 +625,7 @@ static void ivshmem_reset(DeviceState *d)
>>  ivshmem_use_msix(s);
>>  }
>>
>> -static uint64_t ivshmem_get_size(IVShmemState * s) {
>> +static uint64_t ivshmem_get_size(IVShmemState * s, Error **errp) {
>>
>>  uint64_t value;
>>  char *ptr;
>> @@ -624,24 +639,23 @@ static uint64_t ivshmem_get_size(IVShmemState * s) {
>>  value <<= 30;
>>  break;
>>  defau

Re: [Qemu-devel] [PATCH] pc: memhotplug: rise minimum DIMM addr/size alignment to 128Mb

2015-09-23 Thread Igor Mammedov
On Wed, 23 Sep 2015 11:38:56 +0200
Paolo Bonzini  wrote:

> 
> 
> On 23/09/2015 11:36, Igor Mammedov wrote:
> > Also it's QEMU bug/fault and pushing workaround to upper layers
> > doesn't seem right when it's much easier to do it in QEMU itself.
> 
> No, it's virtio's bug.  It makes sense to push workaround for guest bugs
> as far from the hypervisor as possible.
> 
> If we want to increase the alignment in QEMU, I would prefer to have
> natural alignment which makes some sense and happens to work around the
> bug as well.  Michael, Eduardo, any third opinions?
Natural alignment we have now is 2Mb, we can leave hipervisor side
as is and just try to align on 2Mb in guest + limit buffer size to 2Mb as we
that would make sure that buffer never crosses DIMM borders.

> 
> Paolo
> 




Re: [Qemu-devel] [RFC PATCH 04/10] vfio: Record host IOMMU's available IO page sizes

2015-09-23 Thread Thomas Huth
On 17/09/15 15:09, David Gibson wrote:
> Depending on the host IOMMU type we determine and record the available page
> sizes for IOMMU translation.  We'll need this for other validation in
> future patches.
> 
> Signed-off-by: David Gibson 
> ---
>  hw/vfio/common.c  | 13 +
>  include/hw/vfio/vfio-common.h |  1 +
>  2 files changed, 14 insertions(+)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index c37f1a1..daaac48 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -680,6 +680,7 @@ static int vfio_connect_container(VFIOGroup *group, 
> AddressSpace *as)
>  if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU) ||
>  ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1v2_IOMMU)) {
>  bool v2 = !!ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1v2_IOMMU);
> +struct vfio_iommu_type1_info info;
>  
>  ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &fd);
>  if (ret) {
> @@ -705,6 +706,15 @@ static int vfio_connect_container(VFIOGroup *group, 
> AddressSpace *as)
>   */
>  container->iommu_data.min_iova = 0;
>  container->iommu_data.max_iova = (hwaddr)-1;
> +
> +/* Assume just 4K IOVA page size */
> +container->iommu_data.iova_pgsizes = 0x1000;
> +info.argsz = sizeof(info);
> +ret = ioctl(fd, VFIO_IOMMU_GET_INFO, &info);
> +/* Ignore errors */
> +if ((ret == 0) && (info.flags & VFIO_IOMMU_INFO_PGSIZES)) {
> +container->iommu_data.iova_pgsizes = info.iova_pgsizes;
> +}
>  } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU)) {
>  struct vfio_iommu_spapr_tce_info info;
>  
> @@ -748,6 +758,9 @@ static int vfio_connect_container(VFIOGroup *group, 
> AddressSpace *as)
>  container->iommu_data.min_iova = info.dma32_window_start;
>  container->iommu_data.max_iova = container->iommu_data.min_iova
>  + info.dma32_window_size - 1;
> +
> +/* Assume just 4K IOVA pages for now */
> +container->iommu_data.iova_pgsizes = 0x1000;
>  } else {
>  error_report("vfio: No available IOMMU models");
>  ret = -EINVAL;
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index 88ec213..c09db6d 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -77,6 +77,7 @@ typedef struct VFIOContainer {
>   * that in future
>   */
>  hwaddr min_iova, max_iova;
> +uint64_t iova_pgsizes;
>  } iommu_data;
>  QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
>  QLIST_HEAD(, VFIOGroup) group_list;

Reviewed-by: Thomas Huth 





Re: [Qemu-devel] [PATCH v3 17/46] ivshmem: improve debug messages

2015-09-23 Thread Marc-André Lureau
On Tue, Sep 22, 2015 at 4:23 PM, Claudio Fontana
 wrote:
> This MSI-X use vs not use is a bit confusing to me.
> I see that the use of MSI is controlled mainly by IVSHMEM_MSI (Property 
> "msi"),
> but then there are if (msix_present()) checks spread around.
>
> Could this printf be a bit more clear, possibly adding other DPRINTFs as 
> necessary?
>
> Is your IVSHMEM_DPRINTF("use msix\n"); actually intended to mean ("using 
> MSIX\n")? But then why is the check for if (!msix_present(d)) only afterwards?


I don't remember precisely why it's there, only I probably wanted to
trace the entering of function ivshmem_use_msix().

Let's change it for IVSHMEM_DPRINTF("use msix, present: %d\n",
msix_present(d)); ok?

-- 
Marc-André Lureau



Re: [Qemu-devel] [PATCH v3 18/46] ivshmem: improve error

2015-09-23 Thread Marc-André Lureau
On Tue, Sep 22, 2015 at 4:26 PM, Claudio Fontana
 wrote:
> Seems an interrupted sentence...
>
> ... improve error handling? improve error messages?


ack

-- 
Marc-André Lureau



Re: [Qemu-devel] [RFC PATCH 01/10] vfio: Remove unneeded union from VFIOContainer

2015-09-23 Thread Thomas Huth
On 17/09/15 15:09, David Gibson wrote:
> Currently the VFIOContainer iommu_data field contains a union with
> different information for different host iommu types.  However:
>* It only actually contains information for the x86-like "Type1" iommu
>* Because we have a common listener the Type1 fields are actually used
> on all IOMMU types, including the SPAPR TCE type as well
>* There's no tag in the VFIOContainer to tell you which union member is
> valid anyway.
> 
> In fact we now have a general structure for the listener which is unlikely
> to ever need per-iommu-type information, so this patch removes the union.
> 
> In a similar way we can unify the setup of the vfio memory listener in
> vfio_connect_container() that is currently split across a switch on iommu
> type, but is effectively the same in both cases.
> 
> The iommu_data.release pointer was only needed as a cleanup function
> which would handle potentially different data in the union.  With the
> union gone, it too can be removed.
> 
> Signed-off-by: David Gibson 
> ---
>  hw/vfio/common.c  | 51 
> +--
>  include/hw/vfio/vfio-common.h | 14 +++-
>  2 files changed, 23 insertions(+), 42 deletions(-)
...
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index 59a321d..aff18cd 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -64,21 +64,13 @@ typedef struct VFIOAddressSpace {
>  
>  struct VFIOGroup;
>  
> -typedef struct VFIOType1 {
> -MemoryListener listener;
> -int error;
> -bool initialized;
> -} VFIOType1;
> -
>  typedef struct VFIOContainer {
>  VFIOAddressSpace *space;
>  int fd; /* /dev/vfio/vfio, empowered by the attached groups */
>  struct {
> -/* enable abstraction to support various iommu backends */
> -union {
> -VFIOType1 type1;
> -};
> -void (*release)(struct VFIOContainer *);
> +MemoryListener listener;
> +int error;
> +bool initialized;
>  } iommu_data;
>  QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
>  QLIST_HEAD(, VFIOGroup) group_list;
> 

I think I agree with Alexey here ... keeping the iommu_data struct
around those fields looks cumbersome. Is there a reason you did not
remove the struct completely?

 Thomas




Re: [Qemu-devel] [PATCH] pc: memhotplug: rise minimum DIMM addr/size alignment to 128Mb

2015-09-23 Thread Dr. David Alan Gilbert
* Paolo Bonzini (pbonz...@redhat.com) wrote:
> 
> 
> On 23/09/2015 11:36, Igor Mammedov wrote:
> > Also it's QEMU bug/fault and pushing workaround to upper layers
> > doesn't seem right when it's much easier to do it in QEMU itself.
> 
> No, it's virtio's bug.  It makes sense to push workaround for guest bugs
> as far from the hypervisor as possible.

But you really don't want to have higher level things having to align
addresses themselves.  I could see adding an option for the required
alignment would be OK.

> If we want to increase the alignment in QEMU, I would prefer to have
> natural alignment which makes some sense and happens to work around the
> bug as well.  Michael, Eduardo, any third opinions?
> 

By natural alignment do you mean that an 'n MB' DIMM gets aligned to 'n MB' ?

Dave

> Paolo
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH 0/3] pc: Set hw_version on all machine classes

2015-09-23 Thread Laszlo Ersek
On 09/23/15 10:04, Igor Mammedov wrote:
> On Tue, 22 Sep 2015 17:16:24 -0300
> Eduardo Habkost  wrote:
> 
>> In 2012, QEMU had a bug where it exposed QEMU version information
>> to the guest, meaning a QEMU upgrade would expose different
>> hardware to the guest OS even if the same machine-type is being
>> used.
>>
>> The bug was fixed by commit 93bfef4c6e4b23caea9d51e1099d06433d8835a4,
>> on all machines up to pc-1.0. But we kept introducing the same
>> bug on all newer machines since then. That means we are breaking
>> guest ABI every time QEMU was upgraded.
>>
>> Fix this by setting the hw_version on all PC machines, making
>> sure the hardware won't change when upgrading QEMU.
>>
>> This series is based on Michael's PCI tree, plus the "Set
>> broken_reserved_end on pc-*-2.4, not 2.5" patch I submitted
>> earlier today.
> I haven't seen this patch on list, perhaps it needs to be resubmitted?

http://lists.nongnu.org/archive/html/qemu-devel/2015-09/msg05778.html




Re: [Qemu-devel] [RFC PATCH 05/10] memory: Allow replay of IOMMU mapping notifications

2015-09-23 Thread Thomas Huth
On 17/09/15 15:09, David Gibson wrote:
> When we have guest visible IOMMUs, we allow notifiers to be registered
> which will be informed of all changes to IOMMU mappings.  This is used by
> vfio to keep the host IOMMU mappings in sync with guest IOMMU mappings.
> 
> However, unlike with a memory region listener, an iommu notifier won't be
> told about any mappings which already exist in the (guest) IOMMU at the
> time it is registered.  This can cause problems if hotplugging a VFIO
> device onto a guest bus which had existing guest IOMMU mappings, but didn't
> previously have an VFIO devices (and hence no host IOMMU mappings).
> 
> This adds a memory_region_register_iommu_notifier_replay() function to
> handle this case.  As well as registering the new notifier it replays
> existing mappings.  Because the IOMMU memory region doesn't internally
> remember the granularity of the guest IOMMU it has a small hack where the
> caller must specify a granularity at which to replay mappings.
> 
> If there are finer mappings in the guest IOMMU these will be reported in
> the iotlb structures passed to the notifier which it must handle (probably
> causing it to flag an error).  This isn't new - the VFIO iommu notifier
> must already handle notifications about guest IOMMU mappings too short
> for it to represent in the host IOMMU.
> 
> Signed-off-by: David Gibson 
> ---
>  include/exec/memory.h | 16 
>  memory.c  | 18 ++
>  2 files changed, 34 insertions(+)

> diff --git a/memory.c b/memory.c
> index 0d8b2d9..6b5a2f1 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -1403,6 +1403,24 @@ void 
> memory_region_register_iommu_notifier(MemoryRegion *mr, Notifier *n)
>  notifier_list_add(&mr->iommu_notify, n);
>  }
>  
> +void memory_region_register_iommu_notifier_replay(MemoryRegion *mr, Notifier 
> *n,
> +  hwaddr granularity, bool 
> is_write)

granularity itself is not an address, but a size, isn't it? So using
"hwaddr" sounds wrong here.

> +{
> +hwaddr addr;

dma_addr_t ?

> +IOMMUTLBEntry iotlb;
> +
> +memory_region_register_iommu_notifier(mr, n);
> +
> +for (addr = 0;
> + int128_lt(int128_make64(addr), mr->size);
> + addr += granularity) {
> +
> +iotlb = mr->iommu_ops->translate(mr, addr, is_write);
> +if (iotlb.perm != IOMMU_NONE)
> +n->notify(n, &iotlb);

Missing curly braces.

> +}
> +}
> +

 Thomas




Re: [Qemu-devel] [PATCH v3 29/46] ivshmem: error on too many eventfd received

2015-09-23 Thread Marc-André Lureau
Hi

On Wed, Sep 16, 2015 at 2:14 PM, Claudio Fontana
 wrote:
> On 15.09.2015 18:07, marcandre.lur...@redhat.com wrote:
>> From: Marc-André Lureau 
>>
>> The number of eventfd that can be handled per peer is limited by the
>> number of vectors. Return an error when receiving too many of them.
>>
>> Signed-off-by: Marc-André Lureau 
>> ---
>>  hw/misc/ivshmem.c | 7 +++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
>> index b9c78cd..63e4c4f 100644
>> --- a/hw/misc/ivshmem.c
>> +++ b/hw/misc/ivshmem.c
>> @@ -569,6 +569,13 @@ static void ivshmem_read(void *opaque, const uint8_t 
>> *buf, int size)
>>  }
>>
>>  /* get a new eventfd */
>> +if (peer->nb_eventfds >= s->vectors) {
>> +error_report("Too many eventfd received, device has %d vectors",
>> + s->vectors);
>> +close(incoming_fd);
>> +return;
>> +}
>> +
>>  nth_eventfd = peer->nb_eventfds++;
>>
>>  /* this is an eventfd for a particular peer VM */
>>
>
> can the device still operate if we detect these errors at ivshmem_read time?
>
> I am referring also to the other checks happening in ivshmem_read doing
>
> if ([something fails]) {
> error_report("abcabc");
> /* close(), ... */
> return;
> }
>
> Can the device stop operating if these conditions happen?

Yes, it simply closes the "new" incoming_fd. So if the server sends
extra eventfd for peers that we can't handle, we won't be able to
notify those peers. But the rest of the peers and function is still
working.

This is btw, what the current code is doing (only the variable is
called tmp_fd before I removed it in "remove unnecessary dup()" patch)

> If so, do we have to put the device into a non-operating state, where all 
> read/writes are ignored? Should there be a ivshmem status flag for ERROR?
> Should we exit(1)?
>
> note: I don't know what the "proper" behavior should be, but I am concerned 
> about the runtime stability of the software which uses the device.

It's likely a misconfiguration. So having error reported seems like a
sane and enough thing to do.

-- 
Marc-André Lureau



Re: [Qemu-devel] [PATCH v3 30/46] ivshmem: reset mask on device reset

2015-09-23 Thread Marc-André Lureau
Hi

On Wed, Sep 16, 2015 at 2:15 PM, Claudio Fontana
 wrote:
> probably you wanted to say "it should be reset, like the interrupt status".


yes, thanks

-- 
Marc-André Lureau



Re: [Qemu-devel] [RFC PATCH 06/10] vfio: Allow hotplug of containers onto existing guest IOMMU mappings

2015-09-23 Thread Thomas Huth
On 18/09/15 01:31, David Gibson wrote:
> On Thu, Sep 17, 2015 at 10:54:24AM -0600, Alex Williamson wrote:
>> On Thu, 2015-09-17 at 23:09 +1000, David Gibson wrote:
>>> At present the memory listener used by vfio to keep host IOMMU mappings
>>> in sync with the guest memory image assumes that if a guest IOMMU
>>> appears, then it has no existing mappings.
>>>
>>> This may not be true if a VFIO device is hotplugged onto a guest bus
>>> which didn't previously include a VFIO device, and which has existing
>>> guest IOMMU mappings.
>>>
>>> Therefore, use the memory_region_register_iommu_notifier_replay()
>>> function in order to fix this case, replaying existing guest IOMMU
>>> mappings, bringing the host IOMMU into sync with the guest IOMMU.
>>>
>>> Signed-off-by: David Gibson 
>>> ---
>>>  hw/vfio/common.c | 34 +++---
>>>  1 file changed, 19 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>>> index daaac48..543c38e 100644
>>> --- a/hw/vfio/common.c
>>> +++ b/hw/vfio/common.c
>>> @@ -312,6 +312,22 @@ out:
>>>  rcu_read_unlock();
>>>  }
>>>  
>>> +static hwaddr vfio_container_granularity(VFIOContainer *container)
>>> +{
>>> +uint64_t pgsize;
>>> +
>>> +assert(container->iommu_data.iova_pgsizes);
>>
>> return (hwaddr)1 << (ffsl(container->iommu_data.iova_pgsizes) - 1;
> 
> Ah, yes, that should work.  I didn't do it that way mostly because I
> tend to confuse myself when I try to remember exactly how ffs
> semantics work.

Maybe use ffsll instead of ffsl, in case the code ever runs on a 32-bit
host instead of a 64-bit host ?

 Thomas





signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [RFC PATCH 03/10] vfio: Check guest IOVA ranges against host IOMMU capabilities

2015-09-23 Thread David Gibson
On Wed, Sep 23, 2015 at 12:10:46PM +0200, Thomas Huth wrote:
> On 17/09/15 15:09, David Gibson wrote:
> > The current vfio core code assumes that the host IOMMU is capable of
> > mapping any IOVA the guest wants to use to where we need.  However, real
> > IOMMUs generally only support translating a certain range of IOVAs (the
> > "DMA window") not a full 64-bit address space.
> > 
> > The common x86 IOMMUs support a wide enough range that guests are very
> > unlikely to go beyond it in practice, however the IOMMU used on IBM Power
> > machines - in the default configuration - supports only a much more limited
> > IOVA range, usually 0..2GiB.
> > 
> > If the guest attempts to set up an IOVA range that the host IOMMU can't
> > map, qemu won't report an error until it actually attempts to map a bad
> > IOVA.  If guest RAM is being mapped directly into the IOMMU (i.e. no guest
> > visible IOMMU) then this will show up very quickly.  If there is a guest
> > visible IOMMU, however, the problem might not show up until much later when
> > the guest actually attempt to DMA with an IOVA the host can't handle.
> > 
> > This patch adds a test so that we will detect earlier if the guest is
> > attempting to use IOVA ranges that the host IOMMU won't be able to deal
> > with.
> > 
> > For now, we assume that "Type1" (x86) IOMMUs can support any IOVA, this is
> > incorrect, but no worse than what we have already.  We can't do better for
> > now because the Type1 kernel interface doesn't tell us what IOVA range the
> > IOMMU actually supports.
> > 
> > For the Power "sPAPR TCE" IOMMU, however, we can retrieve the supported
> > IOVA range and validate guest IOVA ranges against it, and this patch does
> > so.
> > 
> > Signed-off-by: David Gibson 
> > ---
> >  hw/vfio/common.c  | 42 
> > +++---
> >  include/hw/vfio/vfio-common.h |  6 ++
> >  2 files changed, 45 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> > index 9953b9c..c37f1a1 100644
> > --- a/hw/vfio/common.c
> > +++ b/hw/vfio/common.c
> > @@ -344,14 +344,23 @@ static void vfio_listener_region_add(MemoryListener 
> > *listener,
> >  if (int128_ge(int128_make64(iova), llend)) {
> >  return;
> >  }
> > +end = int128_get64(llend);
> > +
> > +if ((iova < container->iommu_data.min_iova)
> > +|| ((end - 1) > container->iommu_data.max_iova)) {
> 
> (Too much paranthesis for my taste ;-))

Yes, well, we've already established our tastes differ on that point.

> > +error_report("vfio: IOMMU container %p can't map guest IOVA region"
> > + " 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx,
> > + container, iova, end - 1);
> > +ret = -EFAULT; /* FIXME: better choice here? */
> 
> Maybe -EINVAL? ... but -EFAULT also sounds ok for me.

I try to avoid EINVAL unless it's clearly the only right choice.  So
many things use it that it tends to be very unhelpful when you get one.

> > +goto fail;
> > +}
> ...
> > @@ -712,6 +732,22 @@ static int vfio_connect_container(VFIOGroup *group, 
> > AddressSpace *as)
> >  ret = -errno;
> >  goto free_container_exit;
> >  }
> > +
> > +/*
> > + * FIXME: This only considers the host IOMMU' 32-bit window.
> > + * At some point we need to add support for the optional
> > + * 64-bit window and dynamic windows
> > + */
> > +info.argsz = sizeof(info);
> > +ret = ioctl(fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
> > +if (ret) {
> > +error_report("vfio: VFIO_IOMMU_SPAPR_TCE_GET_INFO failed: %m");
> 
> Isn't that %m a glibc extension only? ... Well, this code likely only
> runs on Linux with a glibc, so it likely doesn't matter, I guess...

Yes, it is, but it's already used extensively within qemu.

> > +ret = -errno;
> > +goto free_container_exit;
> > +}
> > +container->iommu_data.min_iova = info.dma32_window_start;
> > +container->iommu_data.max_iova = container->iommu_data.min_iova
> > ++ info.dma32_window_size - 1;
> >  } else {
> >  error_report("vfio: No available IOMMU models");
> >  ret = -EINVAL;
> > diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> > index aff18cd..88ec213 100644
> > --- a/include/hw/vfio/vfio-common.h
> > +++ b/include/hw/vfio/vfio-common.h
> > @@ -71,6 +71,12 @@ typedef struct VFIOContainer {
> >  MemoryListener listener;
> >  int error;
> >  bool initialized;
> > +/*
> > + * FIXME: This assumes the host IOMMU can support only a
> > + * single contiguous IOVA window.  We may need to generalize
> > + * that in future
> > + */
> > +hwaddr min_iova, max_iova;
> 
> Should that maybe be dma_addr_t instead of hwaddr ?

Ah, yes it probably should.

> >  } iommu_data;
> >  

Re: [Qemu-devel] [RFC PATCH 07/10] spapr_pci: Allow PCI host bridge DMA window to be configured

2015-09-23 Thread Thomas Huth
On 17/09/15 15:09, David Gibson wrote:
> At present the PCI host bridge (PHB) for the pseries machine type has a
> fixed DMA window from 0..1GB (in PCI address space) which is mapped to real
> memory via the PAPR paravirtualized IOMMU.
> 
> For better support of VFIO devices, we're going to want to allow for
> different configurations of the DMA window.
> 
> Eventually we'll want to allow the guest itself to reconfigure the window
> via the PAPR dynamic DMA window interface, but as a preliminary this patch
> allows the user to reconfigure the window with new properties on the PHB
> device.
> 
> Signed-off-by: David Gibson 
> ---
>  hw/ppc/spapr_pci.c  | 7 +--
>  include/hw/pci-host/spapr.h | 3 +--
>  2 files changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index b088491..622c4ac 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -1394,7 +1394,7 @@ static void spapr_phb_finish_realize(sPAPRPHBState 
> *sphb, Error **errp)
>  sPAPRTCETable *tcet;
>  uint32_t nb_table;
>  
> -nb_table = SPAPR_PCI_DMA32_SIZE >> SPAPR_TCE_PAGE_SHIFT;
> +nb_table = sphb->dma_win_size >> SPAPR_TCE_PAGE_SHIFT;
>  tcet = spapr_tce_new_table(DEVICE(sphb), sphb->dma_liobn,
> 0, SPAPR_TCE_PAGE_SHIFT, nb_table, false);
>  if (!tcet) {
> @@ -1404,7 +1404,7 @@ static void spapr_phb_finish_realize(sPAPRPHBState 
> *sphb, Error **errp)
>  }
>  
>  /* Register default 32bit DMA window */
> -memory_region_add_subregion(&sphb->iommu_root, 0,
> +memory_region_add_subregion(&sphb->iommu_root, sphb->dma_win_addr,
>  spapr_tce_get_iommu(tcet));
>  }
>  
> @@ -1437,6 +1437,9 @@ static Property spapr_phb_properties[] = {
> SPAPR_PCI_IO_WIN_SIZE),
>  DEFINE_PROP_BOOL("dynamic-reconfiguration", sPAPRPHBState, dr_enabled,
>   true),
> +/* Default DMA window is 0..1GB */
> +DEFINE_PROP_UINT64("dma_win_addr", sPAPRPHBState, dma_win_addr, 0),
> +DEFINE_PROP_UINT64("dma_win_size", sPAPRPHBState, dma_win_size, 
> 0x4000),
>  DEFINE_PROP_END_OF_LIST(),
>  };
>  
> diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
> index 5322b56..7de5e02 100644
> --- a/include/hw/pci-host/spapr.h
> +++ b/include/hw/pci-host/spapr.h
> @@ -78,6 +78,7 @@ struct sPAPRPHBState {
>  MemoryRegion memwindow, iowindow, msiwindow;
>  
>  uint32_t dma_liobn;
> +hwaddr dma_win_addr, dma_win_size;

Maybe use dma_addr_t for dma_win_addr? And dma_win_size isn't an
address, so I'd maybe use uint64_t here instead.

>  AddressSpace iommu_as;
>  MemoryRegion iommu_root;
>  
> @@ -115,8 +116,6 @@ struct sPAPRPHBVFIOState {
>  
>  #define SPAPR_PCI_MSI_WINDOW 0x400ULL
>  
> -#define SPAPR_PCI_DMA32_SIZE 0x4000
> -
>  static inline qemu_irq spapr_phb_lsi_qirq(struct sPAPRPHBState *phb, int pin)
>  {
>  sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());

Anyway, patch looks fine to me, so:

Reviewed-by: Thomas Huth 





Re: [Qemu-devel] [PATCH v7 11/14] block/backup: support block job transactions

2015-09-23 Thread Markus Armbruster
John, your MUA turned the QMP examples to mush.  You may want to teach
it manners.

John Snow  writes:

> On 09/22/2015 06:34 PM, Eric Blake wrote:
>> On 09/22/2015 03:08 PM, John Snow wrote:
>>> Eric, Markus: I've CC'd you for some crazy what-if QAPI questions
>>> after my R-B on this patch.
>>>
>>
>>> The current design of this patch is such that the blockdev-backup
>>> and drive-backup QMP commands are extended for use in the
>>> transaction action. This means that as part of the arguments for
>>> the action, we can specify "transactional-cancel" as on/off,
>>> defaulting to off. This maintains backwards compatible behavior.
>>>
>>>
>>> As an example of a lone command (for simplicity...)
>>>
>>> { "execute": "transaction",
>>>   "arguments": {
>>> "actions": [
>>>   {"type": "drive-backup",
>>>"data": {"device": "drive0",
>>> "target": "/path/to/new_full_backup.img",
>>> "sync": "full", "format": "qcow2",
>>> "transactional-cancel": true
>>>}
>>>   }
>>> ]
>>>   }
>>> }
>>>
>>> This means that this command should fail if any of the other
>>> block jobs in the transaction (that have also set
>>> transactional-cancel(!)) fail.
>>
>> Just wondering - is there ever a case where someone would want to
>> create a transaction with multiple sub-groups?  That is, I want
>> actions A, B, C, D to all occur at the same point in time, but with
>> grouping [A,B] and [C, D] (so that if A fails B gets cancelled, but
>> C and D can still

You make my head hurt.

> Only two sub-groups:
>
> (A) actions that live and die together
> (B) actions that proceed no matter what.
>
> The only reason we even allow these two is because the default
> behavior has been B historically, so we need to allow that to continue on.
>
> I don't think we need to architect multiple subgroups of "live and die
> together" semantics.
>
>> continue)?  Worse, is there a series of actions to accomplish
>> something that cannot happen unless it is interleaved across
>> multiple such subgroups?
>>
>
> Not sure we'll need to address this.
>
>> Here's hoping that, for design simplicity, we only ever need one
>> subgroup per 'transaction' (auto-cancel semantics applies to all
>> commands in the group that opted in, with no way to further refine
>> into sub-groups of commands).
>>
>
> I think this is correct.

Puh!

>>>
>>> This is nice because it allows us to specify per-action which
>>> actions should live or die on the success of the transaction as a
>>> whole.
>>>
>>> What I was wondering is if we wanted to pidgeon-hole ourselves
>>> like this by adding arguments per-action and instead opt for a
>>> more generic, transaction-wide setting.
>>>
>>> In my mind, the transactional cancel has nothing to do with each
>>> /specific/ action, but has more to do with a property of
>>> transactions and actions in general.
>>>
>>>
>>> So I was thinking of two things:
>>>
>>> (1) Transactional cancel, while false by default, could be
>>> toggled to true by default for an entire transaction all-at-once
>>>
>>> { "execute": "transaction",
>>>   "arguments": {
>>> "actions": [ ... ],
>>> "transactional-cancel": true
>>>   }
>>> }
>>>
>>> This would toggle the default state for all actions to "fail if
>>> anything else in the transaction does."
>>
>> Or worded in another way - what is the use case for having a batch
>> of actions where some commands are grouped-cancel, but the
>> remaining actions are independent?  Is there ever a case where you
>> would supply "transactional-cancel":true to one action, but not all
>> of them?  If not, then this idea of hoisting the bool to the
>> transaction arguments level makes more sense (it's an all-or-none
>> switch, not a per-action switch).
>>
>> Qapi-wise, here's how you would do this:
>>
>> { 'command': 'transaction',
>>   'data': { 'actions': [ 'TransactionAction' ],
>> '*transactional-cancel': 'bool' } }
>>
>
> Right. If there's no need for us to specify per-action settings at
> all, this becomes the obvious "correct" solution.
>
> If we do want to be able to specify this sub-grouping per-action (for
> whatever reason), this might still be nice as a convenience feature.

A common flag is semantically simpler than a per-action flag.  As
always, the more complex solution needs to be justified with an actual
use case.

A common @transactional-cancel defaulting to false suffices for backward
compatibility.

We think users will generally want to set it to true for all actions.
Again, a common flag suffices.

Unless someone comes up with actual use case for a per-action flag,
let's stick to the simpler common flag.

>>> Of course, as of right now, not all actions actually support this
>>>feature, but they might need to in the future -- and as more and more
>>>actions use this feature, it might be nice to have the global
>>>setting.

Uh-oh, you mean the flag is currently per-action because only some kinds
of actions actually 

Re: [Qemu-devel] [PATCH 2/2] target-ppc: fix xscmpodp and xscmpudp decoding

2015-09-23 Thread Tom Musta
The modern versions of the ISA require that reserved instruction bits be
ignored for server class processors (see Book I, section 1.3.3).  I believe
older versions of the ISA allowed for Illegal Instruction Interrupt or
"Boundedly Undefined", which is, of course, much less specific.  QEMU
supports implementations from both eras and, as a general rule, flags this
situation as an illegal instruction.

So I would expect that real hardware will ignore the bit.  You will still
be left with the choice of making this decoder consistent with the hardware
or consistent with the rest of QEMU :)   When I added support for VSX, the
intent was the latter.

On Tue, Sep 22, 2015 at 4:28 PM, Aurelien Jarno 
wrote:

> On 2015-09-22 12:26, Thomas Huth wrote:
> > On 13/09/15 23:03, Aurelien Jarno wrote:
> > > The xscmpodp and xscmpudp instructions only have the AX, BX bits in
> > > there encoding, the lowest bit (usually TX) is marked as an invalid
> > > bit. We therefore can't decode them with GEN_XX2FORM, which decodes
> > > the two lowest bit.
> > >
> > > Introduce a new form GEN_XX2FORM, which decodes AX and BX and mark
> > > the lowest bit as invalid.
> > >
> > > Cc: Tom Musta 
> > > Cc: Alexander Graf 
> > > Cc: qemu-sta...@nongnu.org
> > > Signed-off-by: Aurelien Jarno 
> > > ---
> > >  target-ppc/translate.c | 11 +--
> > >  1 file changed, 9 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> > > index 84c5cea..c0eed13 100644
> > > --- a/target-ppc/translate.c
> > > +++ b/target-ppc/translate.c
> > > @@ -10670,6 +10670,13 @@ GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 1,
> opc3, 0, PPC_NONE, fl2), \
> > >  GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 2, opc3, 0, PPC_NONE, fl2), \
> > >  GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 3, opc3, 0, PPC_NONE, fl2)
> > >
> > > +#undef GEN_XX2IFORM
> > > +#define GEN_XX2IFORM(name, opc2, opc3, fl2)
>  \
> > > +GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 0, opc3, 1, PPC_NONE, fl2), \
> > > +GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 1, opc3, 1, PPC_NONE, fl2), \
> > > +GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 2, opc3, 1, PPC_NONE, fl2), \
> > > +GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 3, opc3, 1, PPC_NONE, fl2)
> > > +
> > >  #undef GEN_XX3_RC_FORM
> > >  #define GEN_XX3_RC_FORM(name, opc2, opc3, fl2)
>   \
> > >  GEN_HANDLER2_E(name, #name, 0x3C, opc2 | 0x00, opc3 | 0x00, 0,
> PPC_NONE, fl2), \
> > > @@ -10731,8 +10738,8 @@ GEN_XX3FORM(xsnmaddadp, 0x04, 0x14, PPC2_VSX),
> > >  GEN_XX3FORM(xsnmaddmdp, 0x04, 0x15, PPC2_VSX),
> > >  GEN_XX3FORM(xsnmsubadp, 0x04, 0x16, PPC2_VSX),
> > >  GEN_XX3FORM(xsnmsubmdp, 0x04, 0x17, PPC2_VSX),
> > > -GEN_XX2FORM(xscmpodp,  0x0C, 0x05, PPC2_VSX),
> > > -GEN_XX2FORM(xscmpudp,  0x0C, 0x04, PPC2_VSX),
> > > +GEN_XX2IFORM(xscmpodp,  0x0C, 0x05, PPC2_VSX),
> > > +GEN_XX2IFORM(xscmpudp,  0x0C, 0x04, PPC2_VSX),
> >
> > According to PowerISA 2.07, xscmpodp and xscmpudp are of type XX3, not
> > of type XX2 ... so should this macro maybe rather be named XX3IFORM
> instead?
>
> Indeed, I have chosen the name without actually realizing the manual
> also give names. Then I do wonder if the lower bit is really decoded as
> invalid, I wouldn't be surprised it is actually just ignored.
>
> I'll try to do some tests on real hardware and come up with a fixup
> patch.
>
> Aurelien
>
> --
> Aurelien Jarno  GPG: 4096R/1DDD8C9B
> aurel...@aurel32.net http://www.aurel32.net
>


Re: [Qemu-devel] [RFC PATCH 09/10] spapr_iommu: Provide a function to switch a TCE table to allowing VFIO

2015-09-23 Thread Thomas Huth
On 17/09/15 18:54, Alex Williamson wrote:
> On Thu, 2015-09-17 at 23:09 +1000, David Gibson wrote:
>> Because of the way non-VFIO guest IOMMU operations are KVM accelerated, not
>> all TCE tables (guest IOMMU contexts) can support VFIO devices.  Currently,
>> this is decided at creation time.
>>
>> To support hotplug of VFIO devices, we need to allow a TCE table which
>> previously didn't allow VFIO devices to be switched so that it can.  This
>> patch adds an spapr_tce_need_vfio() function to do this, by reallocating
>> the table in userspace if necessary.
>>
>> Currently this doesn't allow the KVM acceleration to be re-enabled if all
>> the VFIO devices are removed.  That's an optimization for another time.
> 
> 
> Same comment as previous patch, spapr_tce_need_userspace_table() (or
> something) makes the code much more self documenting.

May I also add the using the word "need" in a function name is could be
a little bit confusing here? It's maybe just me, but if I read
..._need_...() I first think of a function that just checks something
and returns a boolean for yes or no, not of a function that changes
something.
Could you maybe use something like "..._change_to_..." instead?

 Thomas




Re: [Qemu-devel] [PATCH v3 40/46] tests: add ivshmem qtest

2015-09-23 Thread Marc-André Lureau
Hi

On Tue, Sep 22, 2015 at 4:44 PM, Claudio Fontana
 wrote:
> I find this kind of macro use an aberration, but it is common use in QEMU 
> (unfortunately), and becoming worse.
>
> If somebody else wants to step in and add his own tag on this I would feel 
> like less of an accomplice in this crime.

That won't be necessary, I replaced it with regular functions for v4

thanks


-- 
Marc-André Lureau



Re: [Qemu-devel] [RFC PATCH 00/10] pseries: Allow VFIO devices on spapr-pci-host-bridge

2015-09-23 Thread Thomas Huth
On 17/09/15 15:09, David Gibson wrote:
> Currently the pseries machine type uses two types of PCI Host Bridge
> (PHB) devices: "spapr-pci-host-bridge" the 'normal' variant intended
> for emulated PCI devices, and "spapr-pci-vfio-host-bridge" intended
> for VFIO devices.
> 
> When using VFIO with pseries, a separate spapr-pci-vfio-host-bridge
> device is needed for every host IOMMU group from which you're using
> VFIO devices.  This is quite awkward for the user and/or management
> tools.  It's especially awkward since the current code makes
> essentially no attempt to detect and warn the user if the wrong sorts
> of devices are connected to the wrong PHB.
> 
> It turns out that the VFIO core code is actually general enough that
> VFIO devices almost work on the normal spapr-pci-host-bridge device.
> In fact with the right combination of circumstances they *can* work
> right now.
> 
> spapr-pci-vfio-host-bridge does 3 additional things:
> 
> 1) It disables KVM acceleration of the guest IOMMU.  That
>acceleration breaks VFIO because it means guest IOMMU updates
>bypass the VFIO infrastructure which keeps the host IOMMU in
>sync.
> 
> 2) It automatically configures the guest PHB's DMA window to match
>the capabilities of the host IOMMU, and advertises that to the
>guest.
> 
> 3) It provides additional handling of EEH (Enhanced Error
>Handling) functions.
> 
> This patch series:
> * Allows VFIO devices to be used on the spapr-pci-host-bridge by
>   auto-switching the KVM TCE acceleration
> 
> * Adds verification that the host IOMMU can handle the DMA windows
>   used by guest PHBs
> 
> * Allows the DMA window on the guest PHB to be configured with
>   device properties.  This can be used to make sure it matches a
>   host window, but in practice the default setting will already
>   work with the host IOMMU on all current systems.
> 
> * Adds support to the VFIO core to allow a VFIO device to be
>   hotplugged onto a bus which doesn't yet have VFIO devices.  This
>   already worked for systems without a guest-visible IOMMU
>   (i.e. x86), this series makes it work even with a guest visible
>   IOMMU.
> 
> * Makes a few related cleanups along the way
> 
> This series does NOT allow EEH operations on VFIO devices on the
> spapr-pci-host-bridge device, so the spapr-pci-vfio-host-bridge device
> is left in for now.  It turns out there are some serious existing
> problems in both the qemu EEH implementation and (worse) in the
> EEH/VFIO kernel interface.  Fixing those is a problem for another day.
> Maybe tomorrow.
> 
> 
> I've tested basic assignment of an xHCI to a pseries guest, both at
> startup and with hotplug.  I haven't (yet) tested VFIO on x86 with
> this series.
> 
> This series probably needs to be merged via several different trees.
> I'm intending to split up as necessary once it's had some review.
> 
> David Gibson (10):
>   vfio: Remove unneeded union from VFIOContainer
>   vfio: Generalize vfio_listener_region_add failure path
>   vfio: Check guest IOVA ranges against host IOMMU capabilities
>   vfio: Record host IOMMU's available IO page sizes
>   memory: Allow replay of IOMMU mapping notifications
>   vfio: Allow hotplug of containers onto existing guest IOMMU mappings
>   spapr_pci: Allow PCI host bridge DMA window to be configured
>   spapr_iommu: Rename vfio_accel parameter
>   spapr_iommu: Provide a function to switch a TCE table to allowing VFIO
>   spapr_pci: Allow VFIO devices to work on the normal PCI host bridge
> 
>  hw/ppc/spapr_iommu.c  |  25 ++-
>  hw/ppc/spapr_pci.c|  13 +++-
>  hw/vfio/common.c  | 152 
> +++---
>  include/exec/memory.h |  16 +
>  include/hw/pci-host/spapr.h   |   3 +-
>  include/hw/ppc/spapr.h|   6 +-
>  include/hw/vfio/vfio-common.h |  21 +++---
>  memory.c  |  18 +
>  target-ppc/kvm.c  |   4 +-
>  target-ppc/kvm_ppc.h  |   2 +-
>  10 files changed, 184 insertions(+), 76 deletions(-)

Apart from some (mostly cosmetic) nits, the patch series looks fine to me.

 Thomas




[Qemu-devel] [PATCH v2] qom: allow properties to be registered against classes

2015-09-23 Thread Daniel P. Berrange
When there are many instances of a given class, registering
properties against the instance is wasteful of resources. The
majority of objects have a statically defined list of possible
properties, so most of the properties are easily registerable
against the class. Only those properties which are conditionally
registered at runtime need be recorded against the klass.

Registering properties against classes also makes it possible
to provide static introspection of QOM - currently introspection
is only possible after creating an instance of a class, which
severely limits its usefulness.

This impl only supports simple scalar properties. It does not
attempt to allow child object / link object properties against
the class. There are ways to support those too, but it would
make this patch more complicated, so it is left as an exercise
for the future.

There is no equivalent to object_property_del provided, since
classes must be immutable once they are defined.

Signed-off-by: Daniel P. Berrange 
---

Changed in v2:

 - Clarify in commit message that prop deletion is intentionally
   not provided against classes
 - Remove automatic expansion of "[*]" property names, leaving
   it to the caller todo this if desired.

 include/qom/object.h |  44 +++
 qom/object.c | 212 +--
 2 files changed, 249 insertions(+), 7 deletions(-)

diff --git a/include/qom/object.h b/include/qom/object.h
index be7280c..d177158 100644
--- a/include/qom/object.h
+++ b/include/qom/object.h
@@ -383,6 +383,8 @@ struct ObjectClass
 const char *class_cast_cache[OBJECT_CLASS_CAST_CACHE];
 
 ObjectUnparent *unparent;
+
+QTAILQ_HEAD(, ObjectProperty) properties;
 };
 
 /**
@@ -949,6 +951,13 @@ ObjectProperty *object_property_add(Object *obj, const 
char *name,
 
 void object_property_del(Object *obj, const char *name, Error **errp);
 
+ObjectProperty *object_class_property_add(ObjectClass *klass, const char *name,
+  const char *type,
+  ObjectPropertyAccessor *get,
+  ObjectPropertyAccessor *set,
+  ObjectPropertyRelease *release,
+  void *opaque, Error **errp);
+
 /**
  * object_property_find:
  * @obj: the object
@@ -959,6 +968,8 @@ void object_property_del(Object *obj, const char *name, 
Error **errp);
  */
 ObjectProperty *object_property_find(Object *obj, const char *name,
  Error **errp);
+ObjectProperty *object_class_property_find(ObjectClass *klass, const char 
*name,
+   Error **errp);
 
 void object_unparent(Object *obj);
 
@@ -1327,6 +1338,12 @@ void object_property_add_str(Object *obj, const char 
*name,
  void (*set)(Object *, const char *, Error **),
  Error **errp);
 
+void object_class_property_add_str(ObjectClass *klass, const char *name,
+   char *(*get)(Object *, Error **),
+   void (*set)(Object *, const char *,
+   Error **),
+   Error **errp);
+
 /**
  * object_property_add_bool:
  * @obj: the object to add a property to
@@ -1343,6 +1360,11 @@ void object_property_add_bool(Object *obj, const char 
*name,
   void (*set)(Object *, bool, Error **),
   Error **errp);
 
+void object_class_property_add_bool(ObjectClass *klass, const char *name,
+bool (*get)(Object *, Error **),
+void (*set)(Object *, bool, Error **),
+Error **errp);
+
 /**
  * object_property_add_enum:
  * @obj: the object to add a property to
@@ -1362,6 +1384,13 @@ void object_property_add_enum(Object *obj, const char 
*name,
   void (*set)(Object *, int, Error **),
   Error **errp);
 
+void object_class_property_add_enum(ObjectClass *klass, const char *name,
+const char *typename,
+const char * const *strings,
+int (*get)(Object *, Error **),
+void (*set)(Object *, int, Error **),
+Error **errp);
+
 /**
  * object_property_add_tm:
  * @obj: the object to add a property to
@@ -1376,6 +1405,10 @@ void object_property_add_tm(Object *obj, const char 
*name,
 void (*get)(Object *, struct tm *, Error **),
 Error **errp);
 
+void object_class_property_add_tm(ObjectClass *klass, const char *name,
+  void (*get)(Object *, struct tm *, Error **),
+

Re: [Qemu-devel] [PATCH 0/3] pc: Set hw_version on all machine classes

2015-09-23 Thread Igor Mammedov
On Wed, 23 Sep 2015 12:37:08 +0200
Laszlo Ersek  wrote:

> On 09/23/15 10:04, Igor Mammedov wrote:
> > On Tue, 22 Sep 2015 17:16:24 -0300
> > Eduardo Habkost  wrote:
> > 
> >> In 2012, QEMU had a bug where it exposed QEMU version information
> >> to the guest, meaning a QEMU upgrade would expose different
> >> hardware to the guest OS even if the same machine-type is being
> >> used.
> >>
> >> The bug was fixed by commit 93bfef4c6e4b23caea9d51e1099d06433d8835a4,
> >> on all machines up to pc-1.0. But we kept introducing the same
> >> bug on all newer machines since then. That means we are breaking
> >> guest ABI every time QEMU was upgraded.
> >>
> >> Fix this by setting the hw_version on all PC machines, making
> >> sure the hardware won't change when upgrading QEMU.
> >>
> >> This series is based on Michael's PCI tree, plus the "Set
> >> broken_reserved_end on pc-*-2.4, not 2.5" patch I submitted
> >> earlier today.
> > I haven't seen this patch on list, perhaps it needs to be resubmitted?
> 
> http://lists.nongnu.org/archive/html/qemu-devel/2015-09/msg05778.html
If I'm not mistaken it's link to this series and not to
"Set broken_reserved_end on pc-*-2.4, not 2.5" patch.



Re: [Qemu-devel] No error report when using the qemu-img.exe toconvert a disk to vmdk format which is saved on a disk that has no morespace

2015-09-23 Thread Kevin Wolf
Am 23.09.2015 um 13:30 hat Guangmu Zhu geschrieben:
> If the "BlockDriver" is "bdrv_vmdk", the function "vmdk_co_write" will be
> called instead. In function "vmdk_write_extent" I see "ret = bdrv_pwrite
> (extent->file, write_offset, write_buf, write_len);". So the "extend->file" is
> "bdrv_file", is it?

Yes, exactly. You'll go through bdrv_vmdk first, and then the nested
call goes to bdrv_file.

> -
> 
> Correct a mistake:
> So though the "count" would be "-EINVAL" if error occurred while writing some
> file, the return value will always be zero. Maybe I missed something?

I think you're right. Instead of setting count = 0/-EINVAL in
aio_worker, we should be setting ret.

Can you try the patch below and report back?

> 3. The "bs->drv->bdrv_aio_writev" is function "raw_aio_writev" in file
> "raw-win32.c" and the quemu-img uses synchronous IO always, so the function
> "paio_submit" in the same file will be called. This function submits the "aio"
> to "worker_thread" with the callback "aio_worker". There are some codes in
> "aio_worker":
> 
> ssize_t ret = 0;
> ..
> case QEMU_AIO_WRITE:
> count = handle_aiocb_rw(aiocb);
> if (count == aiocb->aio_nbytes) {
> count = 0;
> } else {
> count = -EINVAL;
> }
> break;
> ..
> return ret;

Independently of your problem, the code in aio_worker() looks a bit
fishy, because handle_aiocb_rw() can't distinguish between an error
and 0 bytes transferred.

For writes, that probably doesn't matter, but for reads, I think we
return a successful read of zeroes instead of signalling an error. This
might need another patch.

Generally, the Windows backend is not getting a lot of attention and
could use someone who checks it, cleans it up and fixes bugs.

Kevin


diff --git a/block/raw-win32.c b/block/raw-win32.c
index 68f2338..b562c94 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -119,9 +119,9 @@ static int aio_worker(void *arg)
 case QEMU_AIO_WRITE:
 count = handle_aiocb_rw(aiocb);
 if (count == aiocb->aio_nbytes) {
-count = 0;
+ret = 0;
 } else {
-count = -EINVAL;
+ret = -EINVAL;
 }
 break;
 case QEMU_AIO_FLUSH:



Re: [Qemu-devel] [PATCH v3 17/46] ivshmem: improve debug messages

2015-09-23 Thread Claudio Fontana
On 23.09.2015 12:29, Marc-André Lureau wrote:
> On Tue, Sep 22, 2015 at 4:23 PM, Claudio Fontana
>  wrote:
>> This MSI-X use vs not use is a bit confusing to me.
>> I see that the use of MSI is controlled mainly by IVSHMEM_MSI (Property 
>> "msi"),
>> but then there are if (msix_present()) checks spread around.
>>
>> Could this printf be a bit more clear, possibly adding other DPRINTFs as 
>> necessary?
>>
>> Is your IVSHMEM_DPRINTF("use msix\n"); actually intended to mean ("using 
>> MSIX\n")? But then why is the check for if (!msix_present(d)) only 
>> afterwards?
> 
> 
> I don't remember precisely why it's there, only I probably wanted to
> trace the entering of function ivshmem_use_msix().
> 
> Let's change it for IVSHMEM_DPRINTF("use msix, present: %d\n",
> msix_present(d)); ok?
> 

what about something like

IVSHMEM_DPRINTF("%susing MSI-X\n", msix_present(d) ? "" : "not ");

or just

if (!msix_present(d) {
   IVSHMEM_DPRINTF("not using MSI-X");
   return;
}

IVSHMEM_DPRINTF("using MSI-X");

Ciao

CLaudio



[Qemu-devel] [PATCH v3] spapr: generate DT node names

2015-09-23 Thread Laurent Vivier
When DT node names for PCI devices are generated by SLOF,
they are generated according to the type of the device
(for instance, ethernet for virtio-net-pci device).

Node name for hotplugged devices is generated by QEMU.
This patch adds the mechanic to QEMU to create the node
name according to the device type too.

The data structure has been roughly copied from OpenBIOS/OpenHackware,
node names from SLOF.

Example:

Hotplugging some PCI cards with QEMU monitor:

device_add virtio-tablet-pci
device_add virtio-serial-pci
device_add virtio-mouse-pci
device_add virtio-scsi-pci
device_add virtio-gpu-pci
device_add ne2k_pci
device_add nec-usb-xhci
device_add intel-hda

What we can see in linux device tree:

for dir in /proc/device-tree/pci@8002000/*@*/; do
echo $dir
cat $dir/name
echo
done

WITHOUT this patch:

/proc/device-tree/pci@8002000/pci@0/
pci
/proc/device-tree/pci@8002000/pci@1/
pci
/proc/device-tree/pci@8002000/pci@2/
pci
/proc/device-tree/pci@8002000/pci@3/
pci
/proc/device-tree/pci@8002000/pci@4/
pci
/proc/device-tree/pci@8002000/pci@5/
pci
/proc/device-tree/pci@8002000/pci@6/
pci
/proc/device-tree/pci@8002000/pci@7/
pci

WITH this patch:

/proc/device-tree/pci@8002000/communication-controller@1/
communication-controller
/proc/device-tree/pci@8002000/display@4/
display
/proc/device-tree/pci@8002000/ethernet@5/
ethernet
/proc/device-tree/pci@8002000/input-controller@0/
input-controller
/proc/device-tree/pci@8002000/mouse@2/
mouse
/proc/device-tree/pci@8002000/multimedia-device@7/
multimedia-device
/proc/device-tree/pci@8002000/scsi@3/
scsi
/proc/device-tree/pci@8002000/usb-xhci@6/
usb-xhci

Signed-off-by: Laurent Vivier 
---
v3: use values from pci_ids.h, update pci_ids.h values
keep only details for USB (xhci, ohci, ...) and PIC (IO-APIC, IO-XAPIC)
v2: Use CamelCase name, remove misc-* name,
remove _OTHER entries to fallback to class name (as SLOF does).
Fix typo (IPMI-bltr).

 hw/ppc/spapr_pci.c   | 296 ---
 include/hw/pci/pci_ids.h | 115 --
 2 files changed, 388 insertions(+), 23 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index a2feb4c..c521d31 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -38,6 +38,7 @@
 
 #include "hw/pci/pci_bridge.h"
 #include "hw/pci/pci_bus.h"
+#include "hw/pci/pci_ids.h"
 #include "hw/ppc/spapr_drc.h"
 #include "sysemu/device_tree.h"
 
@@ -944,6 +945,280 @@ static void populate_resource_props(PCIDevice *d, 
ResourceProps *rp)
 rp->assigned_len = assigned_idx * sizeof(ResourceFields);
 }
 
+typedef struct PCIClass PCIClass;
+typedef struct PCISubClass PCISubClass;
+typedef struct PCIIFace PCIIFace;
+
+struct PCIIFace {
+uint8_t iface;
+const char *name;
+};
+
+struct PCISubClass {
+uint8_t subclass;
+const char *name;
+const PCIIFace *iface;
+};
+#define SUBCLASS(a) ((uint8_t)a)
+#define IFACE(a)((uint8_t)a)
+
+struct PCIClass {
+const char *name;
+const PCISubClass *subc;
+};
+
+static const PCISubClass undef_subclass[] = {
+{ IFACE(PCI_CLASS_NOT_DEFINED_VGA), "display", NULL },
+{ 0xFF, NULL, NULL, NULL },
+};
+
+static const PCISubClass mass_subclass[] = {
+{ SUBCLASS(PCI_CLASS_STORAGE_SCSI), "scsi", NULL },
+{ SUBCLASS(PCI_CLASS_STORAGE_IDE), "ide", NULL },
+{ SUBCLASS(PCI_CLASS_STORAGE_FLOPPY), "fdc", NULL },
+{ SUBCLASS(PCI_CLASS_STORAGE_IPI), "ipi", NULL },
+{ SUBCLASS(PCI_CLASS_STORAGE_RAID), "raid", NULL },
+{ SUBCLASS(PCI_CLASS_STORAGE_ATA), "ata", NULL },
+{ SUBCLASS(PCI_CLASS_STORAGE_SATA), "sata", NULL },
+{ SUBCLASS(PCI_CLASS_STORAGE_SAS), "sas", NULL },
+{ 0xFF, NULL, NULL },
+};
+
+static const PCISubClass net_subclass[] = {
+{ SUBCLASS(PCI_CLASS_NETWORK_ETHERNET), "ethernet", NULL },
+{ SUBCLASS(PCI_CLASS_NETWORK_TOKEN_RING), "token-ring", NULL },
+{ SUBCLASS(PCI_CLASS_NETWORK_FDDI), "fddi", NULL },
+{ SUBCLASS(PCI_CLASS_NETWORK_ATM), "atm", NULL },
+{ SUBCLASS(PCI_CLASS_NETWORK_ISDN), "isdn", NULL },
+{ SUBCLASS(PCI_CLASS_NETWORK_WORDFIP), "worldfip", NULL },
+{ SUBCLASS(PCI_CLASS_NETWORK_PICMG214), "picmg", NULL },
+{ 0xFF, NULL, NULL },
+};
+
+static const PCISubClass displ_subclass[] = {
+{ SUBCLASS(PCI_CLASS_DISPLAY_VGA), "vga", NULL },
+{ SUBCLASS(PCI_CLASS_DISPLAY_XGA), "xga", NULL },
+{ SUBCLASS(PCI_CLASS_DISPLAY_3D), "3d-controller", NULL },
+{ 0xFF, NULL, NULL },
+};
+
+static const PCISubClass media_subclass[] = {
+{ SUBCLASS(PCI_CLASS_MULTIMEDIA_VIDEO), "video", NULL },
+{ SUBCLASS(PCI_CLASS_MULTIMEDIA_AUDIO), "sound", NULL },
+{ SUBCLASS(PCI_CLASS_MULTIMEDIA_PHONE), "telephony", NULL },
+{ 0xFF, NULL, NULL },
+};
+
+static const PCISubClass mem_subclass[] = {
+{ SUBCLASS(PCI_CLASS_MEMORY_RAM), "memory", NULL },
+{ SUBCLASS(PCI_CLASS_MEMORY

Re: [Qemu-devel] [PATCH v3 20/46] ivshmem: simplify a bit the code

2015-09-23 Thread Claudio Fontana
On 22.09.2015 16:56, Marc-André Lureau wrote:
> 
> 
> - Original Message -
>> On 15.09.2015 18:07, marcandre.lur...@redhat.com wrote:
>>> From: Marc-André Lureau 
>>>
>>> Use some more explicit variables to simplify the code.
>>>
>>> nth_eventfd variable is the current eventfd to be manipulated.
>>
>> well maybe a silly question, but then why not call it current_eventfd?
> 
> Either way, ok.
> 
> current_eventfd is the nth eventfd to be added :)
> 
>>
>>> Signed-off-by: Marc-André Lureau 
>>> ---
>>>  hw/misc/ivshmem.c | 26 --
>>>  1 file changed, 12 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
>>> index 1c98ec3..a60454f 100644
>>> --- a/hw/misc/ivshmem.c
>>> +++ b/hw/misc/ivshmem.c
>>> @@ -488,9 +488,10 @@ static void ivshmem_read(void *opaque, const uint8_t
>>> *buf, int size)
>>>  {
>>>  IVShmemState *s = opaque;
>>>  int incoming_fd;
>>> -int guest_max_eventfd;
>>> +int nth_eventfd;
>>>  long incoming_posn;
>>>  Error *err = NULL;
>>> +Peer *peer;
>>>  
>>>  if (!fifo_update_and_get(s, buf, size,
>>>   &incoming_posn, sizeof(incoming_posn))) {
>>> @@ -517,6 +518,8 @@ static void ivshmem_read(void *opaque, const uint8_t
>>> *buf, int size)
>>>  }
>>>  }
>>>  
>>> +peer = &s->peers[incoming_posn];
>>> +
>>>  if (incoming_fd == -1) {
>>>  /* if posn is positive and unseen before then this is our posn*/
>>>  if (incoming_posn >= 0 && s->vm_id == -1) {
>>> @@ -564,27 +567,22 @@ static void ivshmem_read(void *opaque, const uint8_t
>>> *buf, int size)
>>>  return;
>>>  }
>>>  
>>> -/* each guest has an array of eventfds, and we keep track of how many
>>> - * guests for each VM */
>>
>> you removed a few comments, do they no longer apply?
>> If so do they need to be replaced with better ones mentioning how it works in
>> contrast with the previous?
> 
> That comment didn't make much sense to me, especially the second part,
> what about:
> 
> "each peer has an associated array of eventfds, and we keep track of how many 
> eventfd received so far"

ok, "... of how many eventfds have been received so far".

> 
>>
>>> -guest_max_eventfd = s->peers[incoming_posn].nb_eventfds;
>>> +/* get a new eventfd */
>>> +nth_eventfd = peer->nb_eventfds++;
>>>  
>>>  /* this is an eventfd for a particular guest VM */
>>>  IVSHMEM_DPRINTF("eventfds[%ld][%d] = %d\n", incoming_posn,
>>> -guest_max_eventfd, incoming_fd);
>>> -
>>> event_notifier_init_fd(&s->peers[incoming_posn].eventfds[guest_max_eventfd],
>>> -   incoming_fd);
>>> -
>>> -/* increment count for particular guest */
>>> -s->peers[incoming_posn].nb_eventfds++;
>>> +nth_eventfd, incoming_fd);
>>> +event_notifier_init_fd(&peer->eventfds[nth_eventfd], incoming_fd);
>>>  
>>>  if (incoming_posn == s->vm_id) {
>>> -s->eventfd_chr[guest_max_eventfd] = create_eventfd_chr_device(s,
>>> -   &s->peers[s->vm_id].eventfds[guest_max_eventfd],
>>> -   guest_max_eventfd);
>>> +s->eventfd_chr[nth_eventfd] = create_eventfd_chr_device(s,
>>> +   &s->peers[s->vm_id].eventfds[nth_eventfd],
>>> +   nth_eventfd);
>>>  }
>>>  
>>>  if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD)) {
>>> -ivshmem_add_eventfd(s, incoming_posn, guest_max_eventfd);
>>> +ivshmem_add_eventfd(s, incoming_posn, nth_eventfd);
>>>  }
>>>  }
>>>  
>>>
>>
>> Ciao
>> C.
>>
>>





Re: [Qemu-devel] [PATCH 1/2] Move dirty page search state into separate structure

2015-09-23 Thread Amit Shah
On (Wed) 16 Sep 2015 [19:11:53], Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> Pull the sarch state for one iteration of the dirty page

typo in 'search'

> search into a structure.
> 
> Signed-off-by: Dr. David Alan Gilbert 

> @@ -923,26 +933,29 @@ static int ram_save_compressed_page(QEMUFile *f, 
> RAMBlock *block,
>  static int ram_find_and_save_block(QEMUFile *f, bool last_stage,
> uint64_t *bytes_transferred)
>  {
> -RAMBlock *block = last_seen_block;
> -ram_addr_t offset = last_offset;
> -bool complete_round = false;
> +PageSearchStatus pss;
>  int pages = 0;
>  
> -if (!block)
> -block = QLIST_FIRST_RCU(&ram_list.blocks);
> +pss.block = last_seen_block;
> +pss.offset = last_offset;
> +pss.complete_round = false;
> +
> +if (!pss.block)
> +pss.block = QLIST_FIRST_RCU(&ram_list.blocks);

Missing {} around if

With those fixed:

Reviewed-by: Amit Shah 


Amit



Re: [Qemu-devel] [PATCH v3 41/46] ivshmem: do not keep shm_fd open

2015-09-23 Thread Claudio Fontana
On 22.09.2015 16:59, Marc-André Lureau wrote:
> Hi
> 
> - Original Message -
>> On 15.09.2015 18:07, marcandre.lur...@redhat.com wrote:
>>> From: Marc-André Lureau 
>>>
>>> Remove shm_fd from device state, closing it as early as possible to avoid
>>> leaks.
>>>
>>> Signed-off-by: Marc-André Lureau 
>>> ---
>>>  hw/misc/ivshmem.c | 14 +-
>>>  1 file changed, 5 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
>>> index 4adcac5..f9ac955 100644
>>> --- a/hw/misc/ivshmem.c
>>> +++ b/hw/misc/ivshmem.c
>>> @@ -88,7 +88,6 @@ typedef struct IVShmemState {
>>>  MemoryRegion ivshmem;
>>>  uint64_t ivshmem_size; /* size of shared memory region */
>>>  uint32_t ivshmem_64bit;
>>> -int shm_fd; /* shared memory file descriptor */
>>
>> is it in no way useful during debugging to have access to this field?
>> Or is it easily available elsewhere?
> 
> How would it be useful during debugging? Once the memory is mapped there 
> isn't much you can do with it, it's just keeping a fd open, isn't it?

all right.

> 
>>
>> Ciao C.
>>
>>>  
>>>  Peer *peers;
>>>  int nb_peers; /* how many peers we have space for */
>>> @@ -235,7 +234,7 @@ static uint64_t ivshmem_io_read(void *opaque, hwaddr
>>> addr,
>>>  
>>>  case IVPOSITION:
>>>  /* return my VM ID if the memory is mapped */
>>> -if (s->shm_fd >= 0) {
>>> +if (memory_region_is_mapped(&s->ivshmem)) {
>>>  ret = s->vm_id;
>>>  } else {
>>>  ret = -1;
>>> @@ -356,8 +355,6 @@ static int create_shared_memory_BAR(IVShmemState *s,
>>> int fd, uint8_t attr,
>>>  return -1;
>>>  }
>>>  
>>> -s->shm_fd = fd;
>>> -
>>>  memory_region_init_ram_ptr(&s->ivshmem, OBJECT(s), "ivshmem.bar2",
>>> s->ivshmem_size, ptr);
>>>  vmstate_register_ram(&s->ivshmem, DEVICE(s));
>>> @@ -535,7 +532,7 @@ static void ivshmem_read(void *opaque, const uint8_t
>>> *buf, int size)
>>>  if (incoming_posn == -1) {
>>>  void * map_ptr;
>>>  
>>> -if (s->shm_fd >= 0) {
>>> +if (memory_region_is_mapped(&s->ivshmem)) {
>>>  error_report("shm already initialized");
>>>  close(incoming_fd);
>>>  return;
>>> @@ -564,9 +561,7 @@ static void ivshmem_read(void *opaque, const uint8_t
>>> *buf, int size)
>>>  
>>>  memory_region_add_subregion(&s->bar, 0, &s->ivshmem);
>>>  
>>> -/* only store the fd if it is successfully mapped */
>>> -s->shm_fd = incoming_fd;
>>> -
>>> +close(incoming_fd);
>>>  return;
>>>  }
>>>  
>>> @@ -827,6 +822,7 @@ static void pci_ivshmem_realize(PCIDevice *dev, Error
>>> **errp)
>>>  }
>>>  
>>>  create_shared_memory_BAR(s, fd, attr, errp);
>>> +close(fd);
>>>  }
>>>  }
>>>  
>>> @@ -842,7 +838,7 @@ static void pci_ivshmem_exit(PCIDevice *dev)
>>>  error_free(s->migration_blocker);
>>>  }
>>>  
>>> -if (s->shm_fd >= 0) {
>>> +if (memory_region_is_mapped(&s->ivshmem)) {
>>>  void *addr = memory_region_get_ram_ptr(&s->ivshmem);
>>>  
>>>  vmstate_unregister_ram(&s->ivshmem, DEVICE(dev));
>>>






Re: [Qemu-devel] [PATCH 2/2] ram_find_and_save_block: Split out the finding

2015-09-23 Thread Amit Shah
On (Wed) 16 Sep 2015 [19:11:54], Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> Split out the finding of the dirty page and all the wrap detection
> into a separate function since it was getting a bit hairy.
> 
> Signed-off-by: Dr. David Alan Gilbert 
> ---
>  migration/ram.c | 87 
> -
>  1 file changed, 61 insertions(+), 26 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index 1fadf52..d2982e9 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -236,7 +236,7 @@ struct PageSearchStatus {
>  /* Set once we wrap around */
>  bool complete_round;
>  };
> -typdef struct PageSearchStatus PageSearchStatus;
> +typedef struct PageSearchStatus PageSearchStatus;

:-)

> +/*
> + * Find the next dirty page and update any state associated with
> + * the search process.
> + *
> + * Returns: True if a page is found
> + *
> + * @f: Current migration stream.
> + * @pss: Data about the state of the current dirty page scan.
> + * @*again: Set to false if the search has scanned the whole of RAM
> + */
> +static bool find_dirty_block(QEMUFile *f, PageSearchStatus *pss,
> + bool *again)
> +{
> +pss->offset = migration_bitmap_find_and_reset_dirty(pss->block,
> +   pss->offset);
> +if (pss->complete_round && pss->block == last_seen_block &&
> +pss->offset >= last_offset) {
> +/*
> + * We've been once around the RAM and haven't found anything
> + * give up.
> + */
> +*again = false;
> +return false;
> +}
> +if (pss->offset >= pss->block->used_length) {
> +/* Didn't find anything in this RAM Block */
> +pss->offset = 0;
> +pss->block = QLIST_NEXT_RCU(pss->block, next);
> +if (!pss->block) {
> +/* Hit the end of the list */
> +pss->block = QLIST_FIRST_RCU(&ram_list.blocks);
> +/* Flag that we've looped */
> +pss->complete_round = true;
> +ram_bulk_stage = false;
> +if (migrate_use_xbzrle()) {
> +/* If xbzrle is on, stop using the data compression at this
> + * point. In theory, xbzrle can do better than compression.
> + */
> +flush_compressed_data(f);
> +compression_switch = false;
> +}
> +}
> +/* Didn't find anything this time, but try again on the new block */
> +*again = true;
> +return false;
> +} else {
> +/* Can go around again, but... */
> +*again = true;
> +/* We've found something so probably don't need to */
> +return true;

These comments are very good.  Had you not introduced them, I'd have
recommended to just set *again to true before the if; and get the
'return true' case handled earlier so that all the nesting too goes
off (each case has a return, so no point in continuing with if..else).

Also, the dance with the return value and the 'again' is also slightly
complex -- but I doubt it can be made any simpler.

> +}
> +}
> +
>  /**
>   * ram_find_and_save_block: Finds a dirty page and sends it to f
>   *
> @@ -935,6 +988,7 @@ static int ram_find_and_save_block(QEMUFile *f, bool 
> last_stage,
>  {
>  PageSearchStatus pss;
>  int pages = 0;
> +bool again, found;
>  
>  pss.block = last_seen_block;
>  pss.offset = last_offset;
> @@ -943,29 +997,11 @@ static int ram_find_and_save_block(QEMUFile *f, bool 
> last_stage,
>  if (!pss.block)
>  pss.block = QLIST_FIRST_RCU(&ram_list.blocks);
>  
> -while (true) {
> -pss.offset = migration_bitmap_find_and_reset_dirty(pss.block,
> -   pss.offset);
> -if (pss.complete_round && pss.block == last_seen_block &&
> -pss.offset >= last_offset) {
> -break;
> -}
> -if (pss.offset >= pss.block->used_length) {
> -pss.offset = 0;
> -pss.block = QLIST_NEXT_RCU(pss.block, next);
> -if (!pss.block) {
> -pss.block = QLIST_FIRST_RCU(&ram_list.blocks);
> -pss.complete_round = true;
> -ram_bulk_stage = false;
> -if (migrate_use_xbzrle()) {
> -/* If xbzrle is on, stop using the data compression at 
> this
> - * point. In theory, xbzrle can do better than 
> compression.
> - */
> -flush_compressed_data(f);
> -compression_switch = false;
> -}
> -}
> -} else {
> +do {
> +again = true;

This assignment isn't needed -- did any tool complain?

> +found = find_dirty_block(f, &pss, &again);
> +
> +if (found) {


Amit



Re: [Qemu-devel] [PATCH v10 5/7] vhost: introduce vhost_backend_get_vq_index method

2015-09-23 Thread Eric Blake
On 09/18/2015 08:58 AM, Yuanhan Liu wrote:
> Minusing the idx with the base(dev->vq_index) for vhost-kernel, and

s/Minusing/Subtracting/

> then adding it back for vhost-user doesn't seem right. Here introduces
> a new method vhost_backend_get_vq_index() for getting the right vq
> index for following vhost messages calls.
> 
> Suggested-by: Jason Wang 
> Signed-off-by: Yuanhan Liu 
> ---



-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 0/3] pc: Set hw_version on all machine classes

2015-09-23 Thread Laszlo Ersek
On 09/23/15 13:47, Igor Mammedov wrote:
> On Wed, 23 Sep 2015 12:37:08 +0200
> Laszlo Ersek  wrote:
> 
>> On 09/23/15 10:04, Igor Mammedov wrote:
>>> On Tue, 22 Sep 2015 17:16:24 -0300
>>> Eduardo Habkost  wrote:
>>>
 In 2012, QEMU had a bug where it exposed QEMU version information
 to the guest, meaning a QEMU upgrade would expose different
 hardware to the guest OS even if the same machine-type is being
 used.

 The bug was fixed by commit 93bfef4c6e4b23caea9d51e1099d06433d8835a4,
 on all machines up to pc-1.0. But we kept introducing the same
 bug on all newer machines since then. That means we are breaking
 guest ABI every time QEMU was upgraded.

 Fix this by setting the hw_version on all PC machines, making
 sure the hardware won't change when upgrading QEMU.

 This series is based on Michael's PCI tree, plus the "Set
 broken_reserved_end on pc-*-2.4, not 2.5" patch I submitted
 earlier today.
>>> I haven't seen this patch on list, perhaps it needs to be resubmitted?
>>
>> http://lists.nongnu.org/archive/html/qemu-devel/2015-09/msg05778.html
> If I'm not mistaken it's link to this series and not to
> "Set broken_reserved_end on pc-*-2.4, not 2.5" patch.

Ugh, right. Sorry. I misunderstood what you meant with "this".

Laszlo




Re: [Qemu-devel] [PATCH 2/2] ram_find_and_save_block: Split out the finding

2015-09-23 Thread Dr. David Alan Gilbert
* Amit Shah (amit.s...@redhat.com) wrote:
> On (Wed) 16 Sep 2015 [19:11:54], Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" 
> > 
> > Split out the finding of the dirty page and all the wrap detection
> > into a separate function since it was getting a bit hairy.
> > 
> > Signed-off-by: Dr. David Alan Gilbert 
> > ---
> >  migration/ram.c | 87 
> > -
> >  1 file changed, 61 insertions(+), 26 deletions(-)
> > 
> > diff --git a/migration/ram.c b/migration/ram.c
> > index 1fadf52..d2982e9 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -236,7 +236,7 @@ struct PageSearchStatus {
> >  /* Set once we wrap around */
> >  bool complete_round;
> >  };
> > -typdef struct PageSearchStatus PageSearchStatus;
> > +typedef struct PageSearchStatus PageSearchStatus;
> 
> :-)

Ahem - I didn't retest after I split it into two subpatches; thanks; I'll recut 
this soon.

> > +/*
> > + * Find the next dirty page and update any state associated with
> > + * the search process.
> > + *
> > + * Returns: True if a page is found
> > + *
> > + * @f: Current migration stream.
> > + * @pss: Data about the state of the current dirty page scan.
> > + * @*again: Set to false if the search has scanned the whole of RAM
> > + */
> > +static bool find_dirty_block(QEMUFile *f, PageSearchStatus *pss,
> > + bool *again)
> > +{
> > +pss->offset = migration_bitmap_find_and_reset_dirty(pss->block,
> > +   pss->offset);
> > +if (pss->complete_round && pss->block == last_seen_block &&
> > +pss->offset >= last_offset) {
> > +/*
> > + * We've been once around the RAM and haven't found anything
> > + * give up.
> > + */
> > +*again = false;
> > +return false;
> > +}
> > +if (pss->offset >= pss->block->used_length) {
> > +/* Didn't find anything in this RAM Block */
> > +pss->offset = 0;
> > +pss->block = QLIST_NEXT_RCU(pss->block, next);
> > +if (!pss->block) {
> > +/* Hit the end of the list */
> > +pss->block = QLIST_FIRST_RCU(&ram_list.blocks);
> > +/* Flag that we've looped */
> > +pss->complete_round = true;
> > +ram_bulk_stage = false;
> > +if (migrate_use_xbzrle()) {
> > +/* If xbzrle is on, stop using the data compression at this
> > + * point. In theory, xbzrle can do better than compression.
> > + */
> > +flush_compressed_data(f);
> > +compression_switch = false;
> > +}
> > +}
> > +/* Didn't find anything this time, but try again on the new block 
> > */
> > +*again = true;
> > +return false;
> > +} else {
> > +/* Can go around again, but... */
> > +*again = true;
> > +/* We've found something so probably don't need to */
> > +return true;
> 
> These comments are very good.  Had you not introduced them, I'd have
> recommended to just set *again to true before the if; and get the
> 'return true' case handled earlier so that all the nesting too goes
> off (each case has a return, so no point in continuing with if..else).

Yes, I needed to add them to help me understand what was going on.

> Also, the dance with the return value and the 'again' is also slightly
> complex -- but I doubt it can be made any simpler.
> 
> > +}
> > +}
> > +
> >  /**
> >   * ram_find_and_save_block: Finds a dirty page and sends it to f
> >   *
> > @@ -935,6 +988,7 @@ static int ram_find_and_save_block(QEMUFile *f, bool 
> > last_stage,
> >  {
> >  PageSearchStatus pss;
> >  int pages = 0;
> > +bool again, found;
> >  
> >  pss.block = last_seen_block;
> >  pss.offset = last_offset;
> > @@ -943,29 +997,11 @@ static int ram_find_and_save_block(QEMUFile *f, bool 
> > last_stage,
> >  if (!pss.block)
> >  pss.block = QLIST_FIRST_RCU(&ram_list.blocks);
> >  
> > -while (true) {
> > -pss.offset = migration_bitmap_find_and_reset_dirty(pss.block,
> > -   pss.offset);
> > -if (pss.complete_round && pss.block == last_seen_block &&
> > -pss.offset >= last_offset) {
> > -break;
> > -}
> > -if (pss.offset >= pss.block->used_length) {
> > -pss.offset = 0;
> > -pss.block = QLIST_NEXT_RCU(pss.block, next);
> > -if (!pss.block) {
> > -pss.block = QLIST_FIRST_RCU(&ram_list.blocks);
> > -pss.complete_round = true;
> > -ram_bulk_stage = false;
> > -if (migrate_use_xbzrle()) {
> > -/* If xbzrle is on, stop using the data compression at 
> > this
> > - * point. In theory, xbzrle can d

Re: [Qemu-devel] [PATCH v5 03/46] qapi: Test for C member name collisions

2015-09-23 Thread Eric Blake
On 09/23/2015 03:43 AM, Markus Armbruster wrote:

>> Commit 1e6c1616 was where we quit burning the C member name 'base'.
>> Prior to that time, members of base classes did not clash with variant
>> names because of the C boxing.
> 
> For union types.  For struct types, we still box the base.  I'd like to
> get rid of that.

Patch 34/46 :)

> 
> Even when the base is boxed, the members still clash in QMP.
> 
> We also box the variants (e.g. UserDefA *value1 in the example above).
> Would be nice to get rid of that, too.

What do you mean? Here's an example of current boxed code:

enum EnumType {
ENUM_TYPE_ONE,
ENUM_TYPE_TWO,
};
struct One {
int a;
};
struct Two {
char *a;
};
struct Union {
EnumType type;
/* union tag is @type */
union {
One *one;
Two *two;
};
};

Is this what you envision for unboxed? Note that we still have to
namespace things properly (we have to have union.one.a and union.two.a,
and not a direct union.a), so all we'd be saving is the additional
allocation of the variant pointers.

struct Union {
EnumType type;
/* union tag is @type */
union {
struct {
int a;
} one;
struct {
char *a;
} two;
};
};

However, I'm not sure it would always help.  The conversion of
netdev_add to full qapi relies on being able to access the variant
through a named struct (such as NetdevTapOptions); unboxing the variant
would get rid of the convenient access to these named sub-structs.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 04/16] quorum: Convert to BdrvChild

2015-09-23 Thread Alberto Garcia
On Thu 17 Sep 2015 03:48:08 PM CEST, Kevin Wolf wrote:
> Signed-off-by: Kevin Wolf 
> ---
>  block/quorum.c | 63 
> ++

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-devel] No error report when using the qemu-img.exe toconvert a disk to vmdk format which is saved on a disk that has no morespace

2015-09-23 Thread Guangmu Zhu
Thanks for your reply.


I read the source code again and have some question:


1. qume-img wrote the target file in the function "bdrv_aligned_pwritev" which 
called the "drv->bdrv_co_writev" function. I haven't know how the qume-img 
found the right driver for "drv", but I guessed it's "BlockDriver bdrv_file" 
for the file on Windows, is it?

2. If the above is correct, the member "bdrv_aio_writev" of "BlockDriver" is 
the function "raw_aio_writev" while the member "bdrv_co_writev" is NULL. The 
function "bdrv_register" registers the "BlockDriver" and checks whether the 
"bdrv_co_readv" or "bdrv_aio_readv" is NULL. If the "bdrv_co_readv" is NULL, 
the member "bdrv_co_writev" is "bdrv_co_writev_em" which calls the function 
"bdrv_co_io_em". There are some codes of it:

static int coroutine_fn bdrv_co_io_em(BlockDriverState *bs, int64_t sector_num,
  int nb_sectors, QEMUIOVector *iov,
  bool is_write)
{
CoroutineIOCompletion co = {
.coroutine = qemu_coroutine_self(),
};
BlockAIOCB *acb;


if (is_write) {
acb = bs->drv->bdrv_aio_writev(bs, sector_num, iov, nb_sectors,
   bdrv_co_io_em_complete, &co);
} else {
acb = bs->drv->bdrv_aio_readv(bs, sector_num, iov, nb_sectors,
  bdrv_co_io_em_complete, &co);
}


trace_bdrv_co_io_em(bs, sector_num, nb_sectors, is_write, acb);
if (!acb) {
return -EIO;
}
qemu_coroutine_yield();


return co.ret;
}



3. The "bs->drv->bdrv_aio_writev" is function "raw_aio_writev" in file 
"raw-win32.c" and the quemu-img uses synchronous IO always, so the function 
"paio_submit" in the same file will be called. This function submits the "aio" 
to "worker_thread" with the callback "aio_worker". There are some codes in 
"aio_worker":


ssize_t ret = 0;
..
case QEMU_AIO_WRITE:
count = handle_aiocb_rw(aiocb);
if (count == aiocb->aio_nbytes) {
count = 0;
} else {
count = -EINVAL;
}
break;

..
return ret;


So though the "count" would be zero if error occurred while writing some file, 
the return value will always be zero. Maybe I missed something?


4. The function "worker_thread" calls the callback:


ret = req->func(req->arg);


req->ret = ret;
/* Write ret before state.  */
smp_wmb();
req->state = THREAD_DONE;



qemu_mutex_lock(&pool->lock);


qemu_bh_schedule(pool->completion_bh);



The "pool->completion_bh" is function "thread_pool_completion_bh", which calls 
"elem->common.cb(elem->common.opaque, elem->ret);". And the "elem->common.cb" 
is function "bdrv_co_io_em_complete":


static void bdrv_co_io_em_complete(void *opaque, int ret)
{
CoroutineIOCompletion *co = opaque;


co->ret = ret;
qemu_coroutine_enter(co->coroutine, NULL);
}



5. Finally the return value(zero) will be stored in "co.ret" in function 
"bdrv_co_io_em". However, what would happen if the req haven't done but the 
function "bdrv_co_io_em" returned? The write operation would return an 
uninitialized value(is it zero? I don't know.), is it?


Maybe these made the program report nothing with writer errors, I think.


Thanks again in advance.
Guangmu Zhu


-
Am 22.09.2015 um 08:09 hat Guangmu Zhu geschrieben:
> I used the qemu-img.exe to convert a disk to vmdk format and the output file
> size could be 300 MB. However the left space of the disk the output file
> located on was about 200 MB. After a while, the left space had been zero but
> the program didn't stop or report any error. It was just going on as normal.
> 
> I read the source code and found the error report was controlled by
> "BlockdevOnError on_read_error, on_write_error" in "struct BlockDriverState",
> which had the default value "BLOCKDEV_ON_ERROR_REPORT" for "on_read_error" and
> "BLOCKDEV_ON_ERROR_ENOSPC" for "on_writer_error". The qemu-img.exe had no
> option to change the default behavior of the error report.
> 
> So I think if there were some ways to change the default value of the error
> report, it might be better. Further more, I suggest we could just add some
> codes to the "img_convert" function:
> 
>   1827:out_blk = img_open("target", out_filename, out_fmt, flags, 
> true,
> quiet);
>   1828:if (!out_blk) {
>   1829:ret = -1;
>   1830:goto out;
>   1831:}
>   1832:out_bs = blk_bs(out_blk);
> ++ 1833:
> ++ 1834:bdrv_set_on_error
> (out_bs, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT);

This shouldn't make any difference for qemu-img. The error handling mode
is only for emulated devices in qemu proper.

It looks more like VMDK is somehow failing to report an error at all whn
it's running out of free disk space (though I couldn't spot an error in
the code at 

Re: [Qemu-devel] No error report when using the qemu-img.exe toconvert a disk to vmdk format which is saved on a disk that has no morespace

2015-09-23 Thread Guangmu Zhu
Correct a mistake:
So though the "count" would be "-EINVAL" if error occurred while writing some 
file, the return value will always be zero. Maybe I missed something?


Sorry.
Guangmu Zhu
-


Thanks for your reply.


I read the source code again and have some question:


1. qume-img wrote the target file in the function "bdrv_aligned_pwritev" which 
called the "drv->bdrv_co_writev" function. I haven't know how the qume-img 
found the right driver for "drv", but I guessed it's "BlockDriver bdrv_file" 
for the file on Windows, is it?

2. If the above is correct, the member "bdrv_aio_writev" of "BlockDriver" is 
the function "raw_aio_writev" while the member "bdrv_co_writev" is NULL. The 
function "bdrv_register" registers the "BlockDriver" and checks whether the 
"bdrv_co_readv" or "bdrv_aio_readv" is NULL. If the "bdrv_co_readv" is NULL, 
the member "bdrv_co_writev" is "bdrv_co_writev_em" which calls the function 
"bdrv_co_io_em". There are some codes of it:

static int coroutine_fn bdrv_co_io_em(BlockDriverState *bs, int64_t sector_num,
  int nb_sectors, QEMUIOVector *iov,
  bool is_write)
{
CoroutineIOCompletion co = {
.coroutine = qemu_coroutine_self(),
};
BlockAIOCB *acb;


if (is_write) {
acb = bs->drv->bdrv_aio_writev(bs, sector_num, iov, nb_sectors,
   bdrv_co_io_em_complete, &co);
} else {
acb = bs->drv->bdrv_aio_readv(bs, sector_num, iov, nb_sectors,
  bdrv_co_io_em_complete, &co);
}


trace_bdrv_co_io_em(bs, sector_num, nb_sectors, is_write, acb);
if (!acb) {
return -EIO;
}
qemu_coroutine_yield();


return co.ret;
}



3. The "bs->drv->bdrv_aio_writev" is function "raw_aio_writev" in file 
"raw-win32.c" and the quemu-img uses synchronous IO always, so the function 
"paio_submit" in the same file will be called. This function submits the "aio" 
to "worker_thread" with the callback "aio_worker". There are some codes in 
"aio_worker":


ssize_t ret = 0;
..
case QEMU_AIO_WRITE:
count = handle_aiocb_rw(aiocb);
if (count == aiocb->aio_nbytes) {
count = 0;
} else {
count = -EINVAL;
}
break;

..
return ret;


So though the "count" would be zero if error occurred while writing some file, 
the return value will always be zero. Maybe I missed something?


4. The function "worker_thread" calls the callback:


ret = req->func(req->arg);


req->ret = ret;
/* Write ret before state.  */
smp_wmb();
req->state = THREAD_DONE;



qemu_mutex_lock(&pool->lock);


qemu_bh_schedule(pool->completion_bh);



The "pool->completion_bh" is function "thread_pool_completion_bh", which calls 
"elem->common.cb(elem->common.opaque, elem->ret);". And the "elem->common.cb" 
is function "bdrv_co_io_em_complete":


static void bdrv_co_io_em_complete(void *opaque, int ret)
{
CoroutineIOCompletion *co = opaque;


co->ret = ret;
qemu_coroutine_enter(co->coroutine, NULL);
}



5. Finally the return value(zero) will be stored in "co.ret" in function 
"bdrv_co_io_em". However, what would happen if the req haven't done but the 
function "bdrv_co_io_em" returned? The write operation would return an 
uninitialized value(is it zero? I don't know.), is it?


Maybe these made the program report nothing with writer errors, I think.


Thanks again in advance.
Guangmu Zhu


-
Am 22.09.2015 um 08:09 hat Guangmu Zhu geschrieben:
> I used the qemu-img.exe to convert a disk to vmdk format and the output file
> size could be 300 MB. However the left space of the disk the output file
> located on was about 200 MB. After a while, the left space had been zero but
> the program didn't stop or report any error. It was just going on as normal.
> 
> I read the source code and found the error report was controlled by
> "BlockdevOnError on_read_error, on_write_error" in "struct BlockDriverState",
> which had the default value "BLOCKDEV_ON_ERROR_REPORT" for "on_read_error" and
> "BLOCKDEV_ON_ERROR_ENOSPC" for "on_writer_error". The qemu-img.exe had no
> option to change the default behavior of the error report.
> 
> So I think if there were some ways to change the default value of the error
> report, it might be better. Further more, I suggest we could just add some
> codes to the "img_convert" function:
> 
>   1827:out_blk = img_open("target", out_filename, out_fmt, flags, 
> true,
> quiet);
>   1828:if (!out_blk) {
>   1829:ret = -1;
>   1830:goto out;
>   1831:}
>   1832:out_bs = blk_bs(out_blk);
> ++ 1833:
> ++ 1834:bdrv_set_on_error
> (out_bs, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT);

This shouldn't make any difference 

Re: [Qemu-devel] No error report when using the qemu-img.exe toconvert a disk to vmdk format which is saved on a disk that has no morespace

2015-09-23 Thread Guangmu Zhu
If the "BlockDriver" is "bdrv_vmdk", the function "vmdk_co_write" will be 
called instead. In function "vmdk_write_extent" I see "ret = 
bdrv_pwrite(extent->file, write_offset, write_buf, write_len);". So the 
"extend->file" is "bdrv_file", is it?


Thanks.
Guangmu Zhu


-


Correct a mistake:
So though the "count" would be "-EINVAL" if error occurred while writing some 
file, the return value will always be zero. Maybe I missed something?


Sorry.
Guangmu Zhu
-


Thanks for your reply.


I read the source code again and have some question:


1. qume-img wrote the target file in the function "bdrv_aligned_pwritev" which 
called the "drv->bdrv_co_writev" function. I haven't know how the qume-img 
found the right driver for "drv", but I guessed it's "BlockDriver bdrv_file" 
for the file on Windows, is it?

2. If the above is correct, the member "bdrv_aio_writev" of "BlockDriver" is 
the function "raw_aio_writev" while the member "bdrv_co_writev" is NULL. The 
function "bdrv_register" registers the "BlockDriver" and checks whether the 
"bdrv_co_readv" or "bdrv_aio_readv" is NULL. If the "bdrv_co_readv" is NULL, 
the member "bdrv_co_writev" is "bdrv_co_writev_em" which calls the function 
"bdrv_co_io_em". There are some codes of it:

static int coroutine_fn bdrv_co_io_em(BlockDriverState *bs, int64_t sector_num,
  int nb_sectors, QEMUIOVector *iov,
  bool is_write)
{
CoroutineIOCompletion co = {
.coroutine = qemu_coroutine_self(),
};
BlockAIOCB *acb;


if (is_write) {
acb = bs->drv->bdrv_aio_writev(bs, sector_num, iov, nb_sectors,
   bdrv_co_io_em_complete, &co);
} else {
acb = bs->drv->bdrv_aio_readv(bs, sector_num, iov, nb_sectors,
  bdrv_co_io_em_complete, &co);
}


trace_bdrv_co_io_em(bs, sector_num, nb_sectors, is_write, acb);
if (!acb) {
return -EIO;
}
qemu_coroutine_yield();


return co.ret;
}



3. The "bs->drv->bdrv_aio_writev" is function "raw_aio_writev" in file 
"raw-win32.c" and the quemu-img uses synchronous IO always, so the function 
"paio_submit" in the same file will be called. This function submits the "aio" 
to "worker_thread" with the callback "aio_worker". There are some codes in 
"aio_worker":


ssize_t ret = 0;
..
case QEMU_AIO_WRITE:
count = handle_aiocb_rw(aiocb);
if (count == aiocb->aio_nbytes) {
count = 0;
} else {
count = -EINVAL;
}
break;

..
return ret;


So though the "count" would be zero if error occurred while writing some file, 
the return value will always be zero. Maybe I missed something?


4. The function "worker_thread" calls the callback:


ret = req->func(req->arg);


req->ret = ret;
/* Write ret before state.  */
smp_wmb();
req->state = THREAD_DONE;



qemu_mutex_lock(&pool->lock);


qemu_bh_schedule(pool->completion_bh);



The "pool->completion_bh" is function "thread_pool_completion_bh", which calls 
"elem->common.cb(elem->common.opaque, elem->ret);". And the "elem->common.cb" 
is function "bdrv_co_io_em_complete":


static void bdrv_co_io_em_complete(void *opaque, int ret)
{
CoroutineIOCompletion *co = opaque;


co->ret = ret;
qemu_coroutine_enter(co->coroutine, NULL);
}



5. Finally the return value(zero) will be stored in "co.ret" in function 
"bdrv_co_io_em". However, what would happen if the req haven't done but the 
function "bdrv_co_io_em" returned? The write operation would return an 
uninitialized value(is it zero? I don't know.), is it?


Maybe these made the program report nothing with writer errors, I think.


Thanks again in advance.
Guangmu Zhu


-
Am 22.09.2015 um 08:09 hat Guangmu Zhu geschrieben:
> I used the qemu-img.exe to convert a disk to vmdk format and the output file
> size could be 300 MB. However the left space of the disk the output file
> located on was about 200 MB. After a while, the left space had been zero but
> the program didn't stop or report any error. It was just going on as normal.
> 
> I read the source code and found the error report was controlled by
> "BlockdevOnError on_read_error, on_write_error" in "struct BlockDriverState",
> which had the default value "BLOCKDEV_ON_ERROR_REPORT" for "on_read_error" and
> "BLOCKDEV_ON_ERROR_ENOSPC" for "on_writer_error". The qemu-img.exe had no
> option to change the default behavior of the error report.
> 
> So I think if there were some ways to change the default value of the error
> report, it might be better. Further more, I suggest we could just add some
> codes to the "img_convert" function:
> 
>   1827:out_blk = img_open("target", out_filename, out_fmt, fl

Re: [Qemu-devel] [PATCH 03/16] blkverify: Convert s->test_file to BdrvChild

2015-09-23 Thread Alberto Garcia
On Thu 17 Sep 2015 03:48:07 PM CEST, Kevin Wolf wrote:

> @@ -151,7 +151,7 @@ static void blkverify_close(BlockDriverState *bs)
>  {
>  BDRVBlkverifyState *s = bs->opaque;
>  
> -bdrv_unref(s->test_file);
> +bdrv_unref_child(bs, s->test_file);
>  s->test_file = NULL;
>  }

You are using bdrv_unref_child() here whereas in quorum you kept
bdrv_unref().

In principle both seem correct because if you don't detach the children
in the driver's close function then bdrv_close() will take care of doing
it, but is there a reason why you are using different methods?

Berto



Re: [Qemu-devel] [PATCH v3 0/5] add ACPI node for fw_cfg on pc and arm

2015-09-23 Thread Gabriel L. Somlo
On Wed, Sep 23, 2015 at 11:50:05AM +0200, Igor Mammedov wrote:
> On Thu, 17 Sep 2015 10:56:29 -0400
> "Gabriel L. Somlo"  wrote:
> 
> > New since v2:
> > 
> > - pc/i386 node in ssdt only on machine types *newer* than 2.4
> >   (as suggested by Eduardo)
> > 
> > I appreciate any further comments and reviews. Hopefully we can make
> > this palatable for upstream, modulo the lingering concerns about whether
> > "QEMU0002" is ok to use as the value of _HID, which I'll hopefully get
> > sorted out with the kernel crew...
> > 
> > Thanks much,
> >   --Gabriel
> > 
> > >New since v1:
> > >
> > >   - expose control register size (suggested by Marc Marí)
> > >
> > >   - leaving out _UID and _STA fields (thanks Shannon & Igor)
> > >
> > >   - using "QEMU0002" as the value of _HID (thanks Michael)
> > >
> > >   - added documentation blurb to docs/specs/fw_cfg.txt
> > > (mainly to record usage of the "QEMU0002" string with fw_cfg).
> > >
> > >> This series adds a fw_cfg device node to the SSDT (on pc), or to the
> > >> DSDT (on arm).
> > >>
> > >>  - Patch 1/3 moves (and renames) the BIOS_CFG_IOPORT (0x510)
> > >>define from pc.c to pc.h, so that it could be used from
> > >>acpi-build.c in patch 2/3.
> > >> 
> 
> > >>  - Patch 2/3 adds a fw_cfg node to the pc SSDT.
> > >> 
> > >>  - Patch 3/3 adds a fw_cfg node to the arm DSDT.
> ACPI patches 2-3 look fine to me
> although there is one but, it will make Windows complain about
> unknown device and prompt user to install driver.

I wonder if the implicit _STA has bit 2 (visible to u/i) implicitly
enabled (0x0F vs. 0x0B)... I'll try to reproduce this myself, and see
if adding (back) the explicit _STA value helps.

Thanks!
--Gabriel

> 
> > >>
> > >> I made up some names - "FWCF" for the node name, and "FWCF0001"
> > >> for _HID; no idea whether that's appropriate, or how else I should
> > >> figure out what to use instead...
> > >>
> > >> Also, using scope "\\_SB", based on where fw_cfg shows up in the
> > >> output of "info qtree". Again, if that's wrong, please point me in
> > >> the right direction.
> > >>
> > >> Re. 3/3 (also mentioned after the commit blurb in the patch itself),
> > >> I noticed none of the other DSDT entries contain a _STA field, wondering
> > >> why it would (not) make sense to include that, same as on the PC.
> > 
> > Gabriel L. Somlo (5):
> >   fw_cfg: expose control register size in fw_cfg.h
> >   pc: fw_cfg: move ioport base constant to pc.h
> >   acpi: pc: add fw_cfg device node to ssdt
> >   acpi: arm: add fw_cfg device node to dsdt
> >   fw_cfg: document ACPI device node information
> > 
> >  docs/specs/fw_cfg.txt |  9 +
> >  hw/arm/virt-acpi-build.c  | 13 +
> >  hw/i386/acpi-build.c  | 21 +
> >  hw/i386/pc.c  |  5 ++---
> >  hw/i386/pc_piix.c |  1 +
> >  hw/i386/pc_q35.c  |  1 +
> >  hw/nvram/fw_cfg.c |  8 +---
> >  include/hw/i386/pc.h  |  3 +++
> >  include/hw/nvram/fw_cfg.h |  3 +++
> >  9 files changed, 58 insertions(+), 6 deletions(-)
> > 
> 



Re: [Qemu-devel] [PATCH 06/16] block: Remove bdrv_open_image()

2015-09-23 Thread Alberto Garcia
On Thu 17 Sep 2015 03:48:10 PM CEST, Kevin Wolf wrote:
> It is unused now.
>
> Signed-off-by: Kevin Wolf 

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-devel] [RFC PATCH 01/10] vfio: Remove unneeded union from VFIOContainer

2015-09-23 Thread Laurent Vivier


On 17/09/2015 15:09, David Gibson wrote:
> Currently the VFIOContainer iommu_data field contains a union with
> different information for different host iommu types.  However:
>* It only actually contains information for the x86-like "Type1" iommu
>* Because we have a common listener the Type1 fields are actually used
> on all IOMMU types, including the SPAPR TCE type as well
>* There's no tag in the VFIOContainer to tell you which union member is
> valid anyway.
> 
> In fact we now have a general structure for the listener which is unlikely
> to ever need per-iommu-type information, so this patch removes the union.
> 
> In a similar way we can unify the setup of the vfio memory listener in
> vfio_connect_container() that is currently split across a switch on iommu
> type, but is effectively the same in both cases.
> 
> The iommu_data.release pointer was only needed as a cleanup function
> which would handle potentially different data in the union.  With the
> union gone, it too can be removed.
> 
> Signed-off-by: David Gibson 
> ---
>  hw/vfio/common.c  | 51 
> +--
>  include/hw/vfio/vfio-common.h | 14 +++-
>  2 files changed, 23 insertions(+), 42 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 6d21311..e3152f6 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -316,7 +316,7 @@ static void vfio_listener_region_add(MemoryListener 
> *listener,
>   MemoryRegionSection *section)
>  {
>  VFIOContainer *container = container_of(listener, VFIOContainer,
> -iommu_data.type1.listener);
> +iommu_data.listener);
>  hwaddr iova, end;
>  Int128 llend;
>  void *vaddr;
> @@ -406,9 +406,9 @@ static void vfio_listener_region_add(MemoryListener 
> *listener,
>   * can gracefully fail.  Runtime, there's not much we can do other
>   * than throw a hardware error.
>   */
> -if (!container->iommu_data.type1.initialized) {
> -if (!container->iommu_data.type1.error) {
> -container->iommu_data.type1.error = ret;
> +if (!container->iommu_data.initialized) {
> +if (!container->iommu_data.error) {
> +container->iommu_data.error = ret;
>  }
>  } else {
>  hw_error("vfio: DMA mapping failed, unable to continue");
> @@ -420,7 +420,7 @@ static void vfio_listener_region_del(MemoryListener 
> *listener,
>   MemoryRegionSection *section)
>  {
>  VFIOContainer *container = container_of(listener, VFIOContainer,
> -iommu_data.type1.listener);
> +iommu_data.listener);
>  hwaddr iova, end;
>  int ret;
>  
> @@ -485,7 +485,7 @@ static const MemoryListener vfio_memory_listener = {
>  
>  static void vfio_listener_release(VFIOContainer *container)
>  {
> -memory_listener_unregister(&container->iommu_data.type1.listener);
> +memory_listener_unregister(&container->iommu_data.listener);
>  }
>  
>  int vfio_mmap_region(Object *obj, VFIORegion *region,
> @@ -683,21 +683,6 @@ static int vfio_connect_container(VFIOGroup *group, 
> AddressSpace *as)
>  ret = -errno;
>  goto free_container_exit;
>  }
> -
> -container->iommu_data.type1.listener = vfio_memory_listener;
> -container->iommu_data.release = vfio_listener_release;
> -
> -memory_listener_register(&container->iommu_data.type1.listener,
> - container->space->as);
> -
> -if (container->iommu_data.type1.error) {
> -ret = container->iommu_data.type1.error;
> -error_report("vfio: memory listener initialization failed for 
> container");
> -goto listener_release_exit;
> -}
> -
> -container->iommu_data.type1.initialized = true;
> -
>  } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU)) {
>  ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &fd);
>  if (ret) {
> @@ -723,19 +708,25 @@ static int vfio_connect_container(VFIOGroup *group, 
> AddressSpace *as)
>  ret = -errno;
>  goto free_container_exit;
>  }
> -
> -container->iommu_data.type1.listener = vfio_memory_listener;
> -container->iommu_data.release = vfio_listener_release;
> -
> -memory_listener_register(&container->iommu_data.type1.listener,
> - container->space->as);
> -
>  } else {
>  error_report("vfio: No available IOMMU models");
>  ret = -EINVAL;
>  goto free_container_exit;
>  }
>  
> +container->iommu_data.listener = vfio_memory_listener;
> +
> +memory_listener_register(&container->iommu_data.listener,
> + 

Re: [Qemu-devel] [PATCH 05/16] block: Convert bs->file to BdrvChild

2015-09-23 Thread Kevin Wolf
Am 22.09.2015 um 20:36 hat Max Reitz geschrieben:
> On 17.09.2015 15:48, Kevin Wolf wrote:
> > This patch removes the temporary duplication between bs->file and
> > bs->file_child by converting everything to BdrvChild.
> > 
> > Signed-off-by: Kevin Wolf 
> > ---
> >  block.c   | 61 
> > ++
> >  block/blkdebug.c  | 32 
> >  block/blkverify.c | 33 ++---
> >  block/bochs.c |  8 +++---
> >  block/cloop.c | 10 
> >  block/dmg.c   | 20 +++
> >  block/io.c| 50 +++---
> >  block/parallels.c | 38 +++--
> >  block/qapi.c  |  2 +-
> >  block/qcow.c  | 42 +---
> >  block/qcow2-cache.c   | 11 +
> >  block/qcow2-cluster.c | 38 +
> >  block/qcow2-refcount.c| 45 ++
> >  block/qcow2-snapshot.c| 30 ---
> >  block/qcow2.c | 62 
> > ---
> >  block/qed-table.c |  4 +--
> >  block/qed.c   | 32 
> >  block/raw_bsd.c   | 40 +++---
> >  block/snapshot.c  | 12 -
> >  block/vdi.c   | 17 +++--
> >  block/vhdx-log.c  | 25 ++-
> >  block/vhdx.c  | 36 ++-
> >  block/vmdk.c  | 27 +++--
> >  block/vpc.c   | 34 ++
> >  include/block/block.h |  8 +-
> >  include/block/block_int.h |  3 +--
> >  26 files changed, 377 insertions(+), 343 deletions(-)
> 
> Reviewed-by: Max Reitz 
> 
> > diff --git a/block.c b/block.c
> > index 2e9e5e2..93d713b 100644
> > --- a/block.c
> > +++ b/block.c
> 
> [...]
> 
> > @@ -1929,6 +1925,11 @@ void bdrv_close(BlockDriverState *bs)
> >  bdrv_unref(backing_hd);
> >  }
> >  
> > +if (bs->file != NULL) {
> > +bdrv_unref(bs->file->bs);
> 
> I think we can use bdrv_unref_child(bs->file) here. I'd personally like
> it more, at least (because using bdrv_unref() and relying on the loop
> below is basically what the comment inside of the loop advises against).
> 
> Yes, I know, eventually, we want to drop this block anyway and let the
> loop below handle everything using bdrv_unref_child(). But when we do
> that conversion, it'll be obvious to drop a bdrv_unref_child() call,
> whereas dropping bdrv_unref() alone isn't so obvious.

Seems safe to do, and reads a bit nicer, so I'll change the patch
accordingly. Thanks.

Kevin

> > +bs->file = NULL;
> > +}
> > +
> >  QLIST_FOREACH_SAFE(child, &bs->children, next, next) {
> >  /* TODO Remove bdrv_unref() from drivers' close function and 
> > use
> >   * bdrv_unref_child() here */
> > @@ -1953,11 +1954,6 @@ void bdrv_close(BlockDriverState *bs)
> >  bs->options = NULL;
> >  QDECREF(bs->full_open_options);
> >  bs->full_open_options = NULL;
> > -
> > -if (bs->file != NULL) {
> > -bdrv_unref(bs->file);
> > -bs->file = NULL;
> > -}
> >  }
> >  
> >  if (bs->blk) {
> 




pgpMzb_ObEBe8.pgp
Description: PGP signature


Re: [Qemu-devel] [PATCH 02/16] vmdk: Use BdrvChild instead of BDS for references to extents

2015-09-23 Thread Alberto Garcia
On Thu 17 Sep 2015 03:48:06 PM CEST, Kevin Wolf wrote:
> Signed-off-by: Kevin Wolf 
> ---
>  block/vmdk.c | 99 
> +++-
>  1 file changed, 51 insertions(+), 48 deletions(-)

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-devel] [RFC PATCH 02/10] vfio: Generalize vfio_listener_region_add failure path

2015-09-23 Thread Laurent Vivier


On 17/09/2015 15:09, David Gibson wrote:
> If a DMA mapping operation fails in vfio_listener_region_add() it
> checks to see if we've already completed initial setup of the
> container.  If so it reports an error so the setup code can fail
> gracefully, otherwise throws a hw_error().
> 
> There are other potential failure cases in vfio_listener_region_add()
> which could benefit from the same logic, so move it to its own
> fail: block.  Later patches can use this to extend other failure cases
> to fail as gracefully as possible under the circumstances.
> 
> Signed-off-by: David Gibson 
> ---
>  hw/vfio/common.c | 26 +++---
>  1 file changed, 15 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index e3152f6..9953b9c 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -400,19 +400,23 @@ static void vfio_listener_region_add(MemoryListener 
> *listener,
>  error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", "
>   "0x%"HWADDR_PRIx", %p) = %d (%m)",
>   container, iova, end - iova, vaddr, ret);
> +goto fail;
> +}
>  
> -/*
> - * On the initfn path, store the first error in the container so we
> - * can gracefully fail.  Runtime, there's not much we can do other
> - * than throw a hardware error.
> - */
> -if (!container->iommu_data.initialized) {
> -if (!container->iommu_data.error) {
> -container->iommu_data.error = ret;
> -}
> -} else {
> -hw_error("vfio: DMA mapping failed, unable to continue");
> +return;
> +
> +fail:
> +/*
> + * On the initfn path, store the first error in the container so we
> + * can gracefully fail.  Runtime, there's not much we can do other
> + * than throw a hardware error.
> + */
> +if (!container->iommu_data.initialized) {
> +if (!container->iommu_data.error) {
> +container->iommu_data.error = ret;
>  }
> +} else {
> +hw_error("vfio: DMA mapping failed, unable to continue");
>  }
>  }
>  
> 
Reviewed-by: Laurent Vivier 



Re: [Qemu-devel] [PATCH v2 2/2] PCI-e device multi-function hot-add support

2015-09-23 Thread Cao jin

Hi Alex,

On 09/23/2015 01:51 AM, Alex Williamson wrote:

On Tue, 2015-09-22 at 18:08 +0800, Cao jin wrote:

Hi Alex

On 09/22/2015 02:00 AM, Alex Williamson wrote:


Please use different subjects that uniquely identify what each patch
does, don't simply re-use the subject for the cover patch on each.


OK, will change it in next version.


On Wed, 2015-09-16 at 10:02 +0800, Cao jin wrote:

In case user regret when hot-add multi-function, we should roll back,
device_del the function added but still not worked.

Signed-off-by: Cao jin 
---
   hw/pci/pcie.c | 18 ++
   1 file changed, 18 insertions(+)

diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index 89bf61b..497f390 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -265,9 +265,27 @@ void pcie_cap_slot_hot_unplug_request_cb(HotplugHandler 
*hotplug_dev,
DeviceState *dev, Error **errp)
   {
   uint8_t *exp_cap;
+PCIDevice *pci_dev = PCI_DEVICE(dev);
+PCIBus *bus = pci_dev->bus;

   pcie_cap_slot_hotplug_common(PCI_DEVICE(hotplug_dev), dev, &exp_cap, 
errp);

+/* handle the condition: user hot-add multi function, but regret before
+ * finish it, and want to delete the added but not worked function. Fake
+ * the condition: the slot is polulated, power indicator is off and power
+ * controller is off, so device can be detached when OS write config space.
+ */
+if (PCI_FUNC(pci_dev->devfn) > 0 &&
+bus->devices[PCI_DEVFN(0, 0)] == NULL) {
+pci_word_test_and_set_mask(exp_cap + PCI_EXP_SLTSTA,
+PCI_EXP_SLTSTA_PDS);


AFAICT, we're only setting this to make pcie_cap_slot_write_config()
consider this device for being unplugged.  Would it not be cleaner to
flag the device as unexposed to the guest and also use that flag to
prevent config reads and writes to the device until function 0 is
populated, so we know that the guest hasn't interacted with the device?


Yes, set PDS bit here, for the purpose that fake the unplug condition in
pcie_cap_slot_write_config(), which means, let guest decide when to
unplug device. So I think setting PDS bit here is necessary, am I right?


I would consider it a hack.  You're setting up the device a certain way
to make it appear as if the guest has configured it that way, then
effectively sending the guest a spurious hotplug request for a device
that it theoretically doesn't know about.  If we were to prevent access
to the device, couldn't we remove it directly?



I agree with the judgement "hack", but I am confused about the last 
sentence. please correct me if I understand it wrong.
I design the hot-add feature via executing device_add cmd several times 
with func 0 added last. Assume we have a solution implemented to prevent 
access to the device before adding func 0, but we mustn`t remove other 
func directly, because we don`t know whether user want to add func 0 at 
last or just regret.



I am not quite clear about "flag device as unexposed", does the flag
means PCI_EXP_SLTSTA_PDS bit, or anything else? Could you give more
hints about it?


If function 0 doesn't exist in the slot, should the guest be able to
perform PCI config accesses to the device?  If the guest cannot do
config cycle accesses to the device, then we know the device is unused
and we don't need to involve the guest in removing it.


if func 0 doesn`t exist, theoretically as I think, guest has no reason 
to perform PCI config access to the device, but as you said before, if 
guest does do a gratuitous full PCI bus scan(actually I am not aware in 
what condition it will happen), guest is able to find the device without 
func 0 exist.


in the condition you said above, assume we already have the solution to 
forbidden the access to device before func 0 added, does that means the 
result: guest think there is no device in the slot, but in qemu, we 
still have device data structure in, and won`t destroy it?


or I have another solution of this feature: make multi-function hot-add 
atomic, which means creating a new api, with all func params following, 
like "multifunction_device_add func0,func1,func2...", but it will be 
more and more complicated, which maybe the last solution I prefer to choose.


another question: in what way do we flag the device unexposed to guest 
before func 0 populated? My thoughts is: return 0x as vendor id when 
being accessed, do you think it is a effective way?

+
+pcie_cap_slot_event(PCI_DEVICE(hotplug_dev),
+PCI_EXP_HP_EV_PDC | PCI_EXP_HP_EV_ABP);


Why do we need to test both vs just ABP, which is signaled in the
existing patch below?



Test the two hotplug event, yes, ABP is enough for device_del. will
remove PDC in next version.


+
+return;
+}
+
   pcie_cap_slot_push_attention_button(PCI_DEVICE(hotplug_dev));
   }




.







.



--
Yours Sincerely,

Cao Jin



Re: [Qemu-devel] [PATCH 03/16] blkverify: Convert s->test_file to BdrvChild

2015-09-23 Thread Kevin Wolf
Am 23.09.2015 um 15:01 hat Alberto Garcia geschrieben:
> On Thu 17 Sep 2015 03:48:07 PM CEST, Kevin Wolf wrote:
> 
> > @@ -151,7 +151,7 @@ static void blkverify_close(BlockDriverState *bs)
> >  {
> >  BDRVBlkverifyState *s = bs->opaque;
> >  
> > -bdrv_unref(s->test_file);
> > +bdrv_unref_child(bs, s->test_file);
> >  s->test_file = NULL;
> >  }
> 
> You are using bdrv_unref_child() here whereas in quorum you kept
> bdrv_unref().
> 
> In principle both seem correct because if you don't detach the children
> in the driver's close function then bdrv_close() will take care of doing
> it, but is there a reason why you are using different methods?

Because consistency is overrated? Or simply carelessness?

In the end (not in this series), I'd like to remove all of this from the
close functions (as the comment in block.c says) and let it all be
handled in bdrv_close(). But until then, we should stay reasonably
consistent indeed.

VMDK uses bdrv_unref_child() as well, so I guess quorum is the one that
should be changed?

Kevin



Re: [Qemu-devel] [PATCH v5 03/46] qapi: Test for C member name collisions

2015-09-23 Thread Markus Armbruster
Eric Blake  writes:

> On 09/23/2015 03:43 AM, Markus Armbruster wrote:
>
>>> Commit 1e6c1616 was where we quit burning the C member name 'base'.
>>> Prior to that time, members of base classes did not clash with variant
>>> names because of the C boxing.
>> 
>> For union types.  For struct types, we still box the base.  I'd like to
>> get rid of that.
>
> Patch 34/46 :)

Okay :)

>> Even when the base is boxed, the members still clash in QMP.
>> 
>> We also box the variants (e.g. UserDefA *value1 in the example above).
>> Would be nice to get rid of that, too.
>
> What do you mean? Here's an example of current boxed code:
>
> enum EnumType {
> ENUM_TYPE_ONE,
> ENUM_TYPE_TWO,
> };
> struct One {
> int a;
> };
> struct Two {
> char *a;
> };
> struct Union {
> EnumType type;
> /* union tag is @type */
> union {
> One *one;
> Two *two;
> };
> };
>
> Is this what you envision for unboxed? Note that we still have to
> namespace things properly (we have to have union.one.a and union.two.a,
> and not a direct union.a), so all we'd be saving is the additional
> allocation of the variant pointers.
>
> struct Union {
> EnumType type;
> /* union tag is @type */
> union {
> struct {
> int a;
> } one;
> struct {
> char *a;
> } two;
> };
> };
>
> However, I'm not sure it would always help.  The conversion of
> netdev_add to full qapi relies on being able to access the variant
> through a named struct (such as NetdevTapOptions); unboxing the variant
> would get rid of the convenient access to these named sub-structs.

struct Union {
EnumType type;
/* union tag is @type */
union {
One one;
Two two;
};
};

For base, we go one step further and peel off the struct, to save some
notational overhead.  Pointless for unions.



Re: [Qemu-devel] [PATCH 03/16] blkverify: Convert s->test_file to BdrvChild

2015-09-23 Thread Alberto Garcia
On Wed 23 Sep 2015 03:58:23 PM CEST, Kevin Wolf wrote:

>> > @@ -151,7 +151,7 @@ static void blkverify_close(BlockDriverState *bs)
>> >  {
>> >  BDRVBlkverifyState *s = bs->opaque;
>> >  
>> > -bdrv_unref(s->test_file);
>> > +bdrv_unref_child(bs, s->test_file);
>> >  s->test_file = NULL;
>> >  }
>> 
>> You are using bdrv_unref_child() here whereas in quorum you kept
>> bdrv_unref().
>> 
>> In principle both seem correct because if you don't detach the
>> children in the driver's close function then bdrv_close() will take
>> care of doing it, but is there a reason why you are using different
>> methods?
>
> Because consistency is overrated? Or simply carelessness?

Ah ok :-)

> VMDK uses bdrv_unref_child() as well, so I guess quorum is the one
> that should be changed?

You can keep my R-b if you do so.

Berto



Re: [Qemu-devel] [PATCH 7/7] tests: Simplify how qom-test is run

2015-09-23 Thread Markus Armbruster
Andreas Färber  writes:

> Am 18.09.2015 um 16:24 schrieb Markus Armbruster:
>> Andreas Färber  writes:
>>> Am 18.09.2015 um 14:00 schrieb Markus Armbruster:
 Add it to check-qtest-generic-y instead of check-qtest-$(target)-y for
 every target.

 Signed-off-by: Markus Armbruster 
 ---
  tests/Makefile | 5 +
  1 file changed, 1 insertion(+), 4 deletions(-)

 diff --git a/tests/Makefile b/tests/Makefile
 index 4559045..28c5f93 100644
 --- a/tests/Makefile
 +++ b/tests/Makefile
 @@ -219,10 +219,7 @@ gcov-files-ppc64-y += ppc64-softmmu/hw/ppc/spapr_pci.c
  check-qtest-microblazeel-y = $(check-qtest-microblaze-y)
  check-qtest-xtensaeb-y = $(check-qtest-xtensa-y)
  
 -# qom-test works for all sysemu architectures:
 -$(foreach target,$(SYSEMU_TARGET_LIST), \
 - $(if $(findstring tests/qom-test$(EXESUF),
 $(check-qtest-$(target)-y)),, \
 -  $(eval check-qtest-$(target)-y += tests/qom-test$(EXESUF
 +check-qtest-generic-y += tests/qom-test$(EXESUF)
>>>
>>> Does this -generic- have the same filtering code to avoid running the
>>> tests twice for x86_64, aarch64, ppc64, etc.? Please don't regress.
>> 
>> I'm dense today.  Can you explain the filtering code to me?
>
> For practical purpose,s x86_64 adds all tests from i386, that included
> qom-test then. If we now add it for x86_64 too, it got executed twice,
> which the above $(if ...) fixes by not adding it for x86_64 if it's
> already in. Just checking whether -generic- has equivalent filtering or
> other code somewhere else?

The code in master works only sometimes.  Here's the explanation copied
from my revised patch's commit message:

We want to run qom-test for every architecture, without having to
manually add it to every architecture's list of tests.  Commit 3687d53
accomplished this by adding it to every architecture's list
automatically.

However, some architectures inherit their tests from others, like this:

check-qtest-x86_64-y = $(check-qtest-i386-y)
check-qtest-microblazeel-y = $(check-qtest-microblaze-y)
check-qtest-xtensaeb-y = $(check-qtest-xtensa-y)

For such architectures, we ended up running the (slow!) test twice.
Commit 2b8419c attempted to avoid this by adding the test only when
it's not already present.  Works only as long as we consider adding
the test to the architectures on the left hand side *after* the ones
on the right hand side: x86_64 after i386, microblazeel after
microblaze, xtensaeb after xtensa.

Turns out we consider them in $(SYSEMU_TARGET_LIST) order.  Defined as

SYSEMU_TARGET_LIST := $(subst -softmmu.mak,,$(notdir \
   $(wildcard $(SRC_PATH)/default-configs/*-softmmu.mak)))

On my machine, this results in the oder xtensa, x86_64, microblazeel,
microblaze, i386.  Consequently, qom-test runs twice for microblazeel
and x86_64.

After my patch v2 (to be sent soon), it runs exactly once per target.



[Qemu-devel] [PATCH v2 0/7] Fix device introspection regressions

2015-09-23 Thread Markus Armbruster
QMP command device-list-properties regressed in 2.1: it can crash or
leave dangling pointers behind.

-device FOO,help regressed in 2.2: it no longer works for
non-pluggable devices.  I tried to fix that some time ago[*], but my
fix failed review.  This is my second, more comprehensive try.

PATCH 1,2 are preliminaries, PATCH 3 adds tests to demonstrate the
bugs, PATCH 4-6 fix them to a degree (see PATCH 5 for limitations),
and PATCH 7 cleans up.

[*] [PATCH] qdev: Make -device FOO,help help again when FOO is not
pluggable
https://lists.gnu.org/archive/html/qemu-devel/2015-03/msg03459.html
Message-Id: <1426527232-15044-1-git-send-email-arm...@redhat.com>

v2:
* PATCH 1: New, made from old PATCH 7 and relevant Makefile parts of
  old PATCH 3, with a much improved commit message [Andreas]
* PATCH 3: Fix hmp() [Eric]
* PATCH 4: Tweak commit message and comments [Eric]
* PATCH 6: Mark only the CPUs that are actually broken [Eduardo]

Markus Armbruster (7):
  tests: Fix how qom-test is run
  libqtest: Clean up unused QTestState member sigact_old
  libqtest: New hmp() & friends
  device-introspect-test: New, covering device introspection
  qmp: Fix device-list-properties not to crash for abstract device
  qdev: Protect device-list-properties against broken devices
  Revert "qdev: Use qdev_get_device_class() for -device ,help"

 hw/arm/allwinner-a10.c |   2 +
 hw/arm/digic.c |   2 +
 hw/arm/fsl-imx25.c |   2 +
 hw/arm/fsl-imx31.c |   2 +
 hw/arm/xlnx-zynqmp.c   |   2 +
 hw/pci-host/versatile.c|  11 
 hw/pcmcia/pxa2xx.c |   9 
 hw/s390x/event-facility.c  |   3 ++
 hw/s390x/sclp.c|   3 ++
 include/hw/qdev-core.h |  13 +
 qdev-monitor.c |   9 ++--
 qmp.c  |  11 
 target-alpha/cpu.c |   7 +++
 target-arm/cpu.c   |   7 +++
 target-cris/cpu.c  |   7 +++
 target-i386/cpu.c  |   8 +++
 target-lm32/cpu.c  |   7 +++
 target-m68k/cpu.c  |   7 +++
 target-microblaze/cpu.c|   6 +++
 target-mips/cpu.c  |   7 +++
 target-moxie/cpu.c |   7 +++
 target-openrisc/cpu.c  |   7 +++
 target-ppc/kvm.c   |   4 ++
 target-s390x/cpu.c |   7 +++
 target-sh4/cpu.c   |   7 +++
 target-sparc/cpu.c |   7 +++
 target-tricore/cpu.c   |   6 +++
 target-unicore32/cpu.c |   7 +++
 target-xtensa/cpu.c|   7 +++
 tests/Makefile |  20 ---
 tests/device-introspect-test.c | 117 +
 tests/drive_del-test.c |  22 +++-
 tests/ide-test.c   |   8 +--
 tests/libqtest.c   |  38 -
 tests/libqtest.h   |  33 
 35 files changed, 388 insertions(+), 34 deletions(-)
 create mode 100644 tests/device-introspect-test.c

-- 
2.4.3




[Qemu-devel] [PATCH v2 1/7] tests: Fix how qom-test is run

2015-09-23 Thread Markus Armbruster
We want to run qom-test for every architecture, without having to
manually add it to every architecture's list of tests.  Commit 3687d53
accomplished this by adding it to every architecture's list
automatically.

However, some architectures inherit their tests from others, like this:

check-qtest-x86_64-y = $(check-qtest-i386-y)
check-qtest-microblazeel-y = $(check-qtest-microblaze-y)
check-qtest-xtensaeb-y = $(check-qtest-xtensa-y)

For such architectures, we ended up running the (slow!) test twice.
Commit 2b8419c attempted to avoid this by adding the test only when
it's not already present.  Works only as long as we consider adding
the test to the architectures on the left hand side *after* the ones
on the right hand side: x86_64 after i386, microblazeel after
microblaze, xtensaeb after xtensa.

Turns out we consider them in $(SYSEMU_TARGET_LIST) order.  Defined as

SYSEMU_TARGET_LIST := $(subst -softmmu.mak,,$(notdir \
   $(wildcard $(SRC_PATH)/default-configs/*-softmmu.mak)))

On my machine, this results in the oder xtensa, x86_64, microblazeel,
microblaze, i386.  Consequently, qom-test runs twice for microblazeel
and x86_64.

Replace this complex and flawed machinery with a much simpler one: add
generic tests (currently just qom-test) to check-qtest-generic-y
instead of check-qtest-$(target)-y for every target, then run
$(check-qtest-generic-y) for every target.

Signed-off-by: Markus Armbruster 
---
 tests/Makefile | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/tests/Makefile b/tests/Makefile
index 4063639..9380e14 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -86,6 +86,8 @@ check-block-$(CONFIG_POSIX) += tests/qemu-iotests-quick.sh
 # All QTests for now are POSIX-only, but the dependencies are
 # really in libqtest, not in the testcases themselves.
 
+check-qtest-generic-y =
+
 gcov-files-ipack-y += hw/ipack/ipack.c
 check-qtest-ipack-y += tests/ipoctal232-test$(EXESUF)
 gcov-files-ipack-y += hw/char/ipoctal232.c
@@ -216,10 +218,7 @@ gcov-files-ppc64-y += ppc64-softmmu/hw/ppc/spapr_pci.c
 check-qtest-microblazeel-y = $(check-qtest-microblaze-y)
 check-qtest-xtensaeb-y = $(check-qtest-xtensa-y)
 
-# qom-test works for all sysemu architectures:
-$(foreach target,$(SYSEMU_TARGET_LIST), \
-   $(if $(findstring tests/qom-test$(EXESUF), 
$(check-qtest-$(target)-y)),, \
-   $(eval check-qtest-$(target)-y += tests/qom-test$(EXESUF
+check-qtest-generic-y += tests/qom-test$(EXESUF)
 
 check-qapi-schema-y := $(addprefix tests/qapi-schema/, \
comments.json empty.json enum-empty.json enum-missing-data.json \
@@ -446,8 +445,11 @@ CFLAGS += $(TEST_CFLAGS)
 
 TARGETS=$(patsubst %-softmmu,%, $(filter %-softmmu,$(TARGET_DIRS)))
 ifeq ($(CONFIG_POSIX),y)
-QTEST_TARGETS=$(foreach TARGET,$(TARGETS), $(if $(check-qtest-$(TARGET)-y), 
$(TARGET),))
+QTEST_TARGETS = $(TARGETS)
 check-qtest-y=$(foreach TARGET,$(TARGETS), $(check-qtest-$(TARGET)-y))
+check-qtest-y += $(check-qtest-generic-y)
+else
+QTEST_TARGETS =
 endif
 
 qtest-obj-y = tests/libqtest.o $(test-util-obj-y)
@@ -485,7 +487,7 @@ $(patsubst %, check-qtest-%, $(QTEST_TARGETS)): 
check-qtest-%: $(check-qtest-y)
$(call quiet-command,QTEST_QEMU_BINARY=$*-softmmu/qemu-system-$* \
QTEST_QEMU_IMG=qemu-img$(EXESUF) \
MALLOC_PERTURB_=$${MALLOC_PERTURB_:-$$((RANDOM % 255 + 1))} \
-   gtester $(GTESTER_OPTIONS) -m=$(SPEED) 
$(check-qtest-$*-y),"GTESTER $@")
+   gtester $(GTESTER_OPTIONS) -m=$(SPEED) $(check-qtest-$*-y) 
$(check-qtest-generic-y),"GTESTER $@")
$(if $(CONFIG_GCOV),@for f in $(gcov-files-$*-y); do \
  echo Gcov report for $$f:;\
  $(GCOV) $(GCOV_OPTIONS) $$f -o `dirname $$f`; \
-- 
2.4.3




[Qemu-devel] [PATCH v2 5/7] qmp: Fix device-list-properties not to crash for abstract device

2015-09-23 Thread Markus Armbruster
Broken in commit f4eb32b "qmp: show QOM properties in
device-list-properties", v2.1.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Markus Armbruster 
Reviewed-by: Eric Blake 
---
 qmp.c  |  6 ++
 tests/device-introspect-test.c | 15 ---
 2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/qmp.c b/qmp.c
index 057a7cb..1413de4 100644
--- a/qmp.c
+++ b/qmp.c
@@ -515,6 +515,12 @@ DevicePropertyInfoList *qmp_device_list_properties(const 
char *typename,
 return NULL;
 }
 
+if (object_class_is_abstract(klass)) {
+error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "name",
+   "non-abstract device type");
+return NULL;
+}
+
 obj = object_new(typename);
 
 QTAILQ_FOREACH(prop, &obj->properties, node) {
diff --git a/tests/device-introspect-test.c b/tests/device-introspect-test.c
index 44da30e..6c7366f 100644
--- a/tests/device-introspect-test.c
+++ b/tests/device-introspect-test.c
@@ -45,17 +45,10 @@ static void test_one_device(const char *type)
 QDict *resp;
 char *help;
 
-/*
- * Skip this part for the abstract device test case, because
- * device-list-properties crashes for such devices.
- * FIXME fix it not to crash
- */
-if (strcmp(type, "device")) {
-resp = qmp("{'execute': 'device-list-properties',"
-   " 'arguments': {'typename': %s}}",
-   type);
-QDECREF(resp);
-}
+resp = qmp("{'execute': 'device-list-properties',"
+   " 'arguments': {'typename': %s}}",
+   type);
+QDECREF(resp);
 
 help = hmp("device_add \"%s,help\"", type);
 g_free(help);
-- 
2.4.3




[Qemu-devel] [PATCH v2 2/7] libqtest: Clean up unused QTestState member sigact_old

2015-09-23 Thread Markus Armbruster
Unused since commit d766825.

Signed-off-by: Markus Armbruster 
Reviewed-by: Eric Blake 
---
 tests/libqtest.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tests/libqtest.c b/tests/libqtest.c
index e5188e0..8dede56 100644
--- a/tests/libqtest.c
+++ b/tests/libqtest.c
@@ -46,7 +46,6 @@ struct QTestState
 bool irq_level[MAX_IRQ];
 GString *rx;
 pid_t qemu_pid;  /* our child QEMU process */
-struct sigaction sigact_old; /* restored on exit */
 };
 
 static GList *qtest_instances;
-- 
2.4.3




[Qemu-devel] [PATCH v2 7/7] Revert "qdev: Use qdev_get_device_class() for -device , help"

2015-09-23 Thread Markus Armbruster
This reverts commit 31bed5509dfcbdfc293154ce81086a4dbd7a80b6.

The reverted commit changed qdev_device_help() to reject abstract
devices and devices that have cannot_instantiate_with_device_add_yet
set, to fix crash bugs like -device x86_64-cpu,help.

Rejecting abstract devices makes sense: they're purely internal, and
the implementation of the help feature can't cope with them.

Rejecting non-pluggable devices makes less sense: even though you
can't use them with -device, the help may still be useful elsewhere,
for instance with -global.  This is a regression: -device FOO,help
used to help even for FOO that aren't pluggable.

The previous two commits fixed the crash bug at a lower layer, so
reverting this one is now safe.  Fixes the -device FOO,help
regression, except for the broken devices marked
cannot_even_create_with_object_new_yet.  For those, the error message
is improved.

Example of a device where the regression is fixed:

$ qemu-system-x86_64 -device PIIX4_PM,help
PIIX4_PM.command_serr_enable=bool (on/off)
PIIX4_PM.multifunction=bool (on/off)
PIIX4_PM.rombar=uint32
PIIX4_PM.romfile=str
PIIX4_PM.addr=int32 (Slot and optional function number, example: 06.0 or 06)
PIIX4_PM.memory-hotplug-support=bool
PIIX4_PM.acpi-pci-hotplug-with-bridge-support=bool
PIIX4_PM.s4_val=uint8
PIIX4_PM.disable_s4=uint8
PIIX4_PM.disable_s3=uint8
PIIX4_PM.smb_io_base=uint32

Example of a device where it isn't fixed:

$ qemu-system-x86_64 -device host-x86_64-cpu,help
Can't list properties of device 'host-x86_64-cpu'

Both failed with "Parameter 'driver' expects pluggable device type"
before.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Markus Armbruster 
Reviewed-by: Eric Blake 
---
 qdev-monitor.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/qdev-monitor.c b/qdev-monitor.c
index 0bf7f83..ebe9004 100644
--- a/qdev-monitor.c
+++ b/qdev-monitor.c
@@ -237,9 +237,12 @@ int qdev_device_help(QemuOpts *opts)
 return 0;
 }
 
-qdev_get_device_class(&driver, &local_err);
-if (local_err) {
-goto error;
+if (!object_class_by_name(driver)) {
+const char *typename = find_typename_by_alias(driver);
+
+if (typename) {
+driver = typename;
+}
 }
 
 prop_list = qmp_device_list_properties(driver, &local_err);
-- 
2.4.3




[Qemu-devel] [PATCH v2 6/7] qdev: Protect device-list-properties against broken devices

2015-09-23 Thread Markus Armbruster
Several devices don't survive object_unref(object_new(T)): they crash
or hang during cleanup, or they leave dangling pointers behind.

This breaks at least device-list-properties, because
qmp_device_list_properties() needs to create a device to find its
properties.  Broken in commit f4eb32b "qmp: show QOM properties in
device-list-properties", v2.1.  Example reproducer:

$ qemu-system-aarch64 -nodefaults -display none -machine none -S -qmp stdio
{"QMP": {"version": {"qemu": {"micro": 50, "minor": 4, "major": 2}, 
"package": ""}, "capabilities": []}}
{ "execute": "qmp_capabilities" }
{"return": {}}
{ "execute": "device-list-properties", "arguments": { "typename": 
"pxa2xx-pcmcia" } }
qemu-system-aarch64: /home/armbru/work/qemu/memory.c:1307: 
memory_region_finalize: Assertion `((&mr->subregions)->tqh_first == ((void 
*)0))' failed.
Aborted (core dumped)
[Exit 134 (SIGABRT)]

Unfortunately, I can't fix the problems in these devices right now.
Instead, add DeviceClass member cannot_even_create_with_object_new_yet
to mark them:

* Crash or hang during cleanup (didn't debug them, so I can't say
  why): "pxa2xx-pcmcia", "realview_pci", "versatile_pci",
  "s390-sclp-event-facility", "sclp"

* Dangling pointers: most CPUs, plus "allwinner-a10", "digic",
  "fsl,imx25", "fsl,imx31", "xlnx,zynqmp", because they create such
  CPUs

* Assert kvm_enabled(): "host-x86_64-cpu", host-i386-cpu",
  "host-powerpc64-cpu", "host-embedded-powerpc-cpu",
  "host-powerpc-cpu" (the powerpc ones can't currently reach the
  assertion, because the CPUs are only registered when KVM is enabled,
  but the assertion is arguably in the wrong place all the same)

Make qmp_device_list_properties() fail cleanly when the device is so
marked.  This improves device-list-properties from "crashes or hangs"
to "fails".  Not a complete fix, just a better-than-nothing
work-around.  In the above reproducer, device-list-properties now
fails with "Can't list properties of device 'pxa2xx-pcmcia'".

This also protects -device FOO,help, which uses the same machinery
since commit ef52358 "qdev-monitor: include QOM properties in -device
FOO, help output", v2.2.  Example reproducer:

$ qemu-system-* -machine none -device pxa2xx-pcmcia,help

Before:

qemu-system-aarch64: .../memory.c:1307: memory_region_finalize: Assertion 
`((&mr->subregions)->tqh_first == ((void *)0))' failed.

After:

Can't list properties of device 'pxa2xx-pcmcia'

Cc: "Andreas Färber" 
Cc: Alexander Graf 
Cc: Alistair Francis 
Cc: Antony Pavlov 
Cc: Christian Borntraeger 
Cc: Cornelia Huck 
Cc: Eduardo Habkost 
Cc: Li Guang 
Cc: Paolo Bonzini 
Cc: Peter Crosthwaite 
Cc: Peter Maydell 
Cc: Richard Henderson 
Cc: qemu-...@nongnu.org
Cc: qemu-sta...@nongnu.org
Signed-off-by: Markus Armbruster 
---
 hw/arm/allwinner-a10.c |  2 ++
 hw/arm/digic.c |  2 ++
 hw/arm/fsl-imx25.c |  2 ++
 hw/arm/fsl-imx31.c |  2 ++
 hw/arm/xlnx-zynqmp.c   |  2 ++
 hw/pci-host/versatile.c| 11 +++
 hw/pcmcia/pxa2xx.c |  9 +
 hw/s390x/event-facility.c  |  3 +++
 hw/s390x/sclp.c|  3 +++
 include/hw/qdev-core.h | 13 +
 qmp.c  |  5 +
 target-alpha/cpu.c |  7 +++
 target-arm/cpu.c   |  7 +++
 target-cris/cpu.c  |  7 +++
 target-i386/cpu.c  |  8 
 target-lm32/cpu.c  |  7 +++
 target-m68k/cpu.c  |  7 +++
 target-microblaze/cpu.c|  6 ++
 target-mips/cpu.c  |  7 +++
 target-moxie/cpu.c |  7 +++
 target-openrisc/cpu.c  |  7 +++
 target-ppc/kvm.c   |  4 
 target-s390x/cpu.c |  7 +++
 target-sh4/cpu.c   |  7 +++
 target-sparc/cpu.c |  7 +++
 target-tricore/cpu.c   |  6 ++
 target-unicore32/cpu.c |  7 +++
 target-xtensa/cpu.c|  7 +++
 tests/device-introspect-test.c | 29 -
 29 files changed, 169 insertions(+), 29 deletions(-)

diff --git a/hw/arm/allwinner-a10.c b/hw/arm/allwinner-a10.c
index ff249af..7692090 100644
--- a/hw/arm/allwinner-a10.c
+++ b/hw/arm/allwinner-a10.c
@@ -103,6 +103,8 @@ static void aw_a10_class_init(ObjectClass *oc, void *data)
 DeviceClass *dc = DEVICE_CLASS(oc);
 
 dc->realize = aw_a10_realize;
+/* Reason: creates a CPU, thus use after free(), see cpu_class_init() */
+dc->cannot_even_create_with_object_new_yet = true;
 }
 
 static const TypeInfo aw_a10_type_info = {
diff --git a/hw/arm/digic.c b/hw/arm/digic.c
index ec8c330..3decef4 100644
--- a/hw/arm/digic.c
+++ b/hw/arm/digic.c
@@ -97,6 +97,8 @@ static void digic_class_init(ObjectClass *oc, void *data)
 DeviceClass *dc = DEVICE_CLASS(oc);
 
 dc->realize = digic_realize;
+/* Reason: creates a CPU, thus use after free(), see cpu_class_init() */
+ 

[Qemu-devel] [PATCH v2 4/7] device-introspect-test: New, covering device introspection

2015-09-23 Thread Markus Armbruster
The test doesn't check that the output makes any sense, only that QEMU
survives.  Useful since we've had an astounding number of crash bugs
around there.

In fact, we have a bunch of them right now: several devices crash or
hang, and all CPUs leave a dangling pointer behind.  The test skips
testing the broken parts.  The next commits will fix them, and drop
the skipping.

Signed-off-by: Markus Armbruster 
---
 tests/Makefile |   8 ++-
 tests/device-introspect-test.c | 153 +
 2 files changed, 158 insertions(+), 3 deletions(-)
 create mode 100644 tests/device-introspect-test.c

diff --git a/tests/Makefile b/tests/Makefile
index 9380e14..2bf7ba1 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -86,7 +86,8 @@ check-block-$(CONFIG_POSIX) += tests/qemu-iotests-quick.sh
 # All QTests for now are POSIX-only, but the dependencies are
 # really in libqtest, not in the testcases themselves.
 
-check-qtest-generic-y =
+check-qtest-generic-y = tests/device-introspect-test$(EXESUF)
+gcov-files-generic-y = qdev-monitor.c qmp.c
 
 gcov-files-ipack-y += hw/ipack/ipack.c
 check-qtest-ipack-y += tests/ipoctal232-test$(EXESUF)
@@ -381,6 +382,7 @@ libqos-imx-obj-y = $(libqos-obj-y) tests/libqos/i2c-imx.o
 libqos-usb-obj-y = $(libqos-pc-obj-y) tests/libqos/usb.o
 libqos-virtio-obj-y = $(libqos-pc-obj-y) tests/libqos/virtio.o 
tests/libqos/virtio-pci.o tests/libqos/virtio-mmio.o 
tests/libqos/malloc-generic.o
 
+tests/device-introspect-test$(EXESUF): tests/device-introspect-test.o
 tests/rtc-test$(EXESUF): tests/rtc-test.o
 tests/m48t59-test$(EXESUF): tests/m48t59-test.o
 tests/endianness-test$(EXESUF): tests/endianness-test.o
@@ -488,7 +490,7 @@ $(patsubst %, check-qtest-%, $(QTEST_TARGETS)): 
check-qtest-%: $(check-qtest-y)
QTEST_QEMU_IMG=qemu-img$(EXESUF) \
MALLOC_PERTURB_=$${MALLOC_PERTURB_:-$$((RANDOM % 255 + 1))} \
gtester $(GTESTER_OPTIONS) -m=$(SPEED) $(check-qtest-$*-y) 
$(check-qtest-generic-y),"GTESTER $@")
-   $(if $(CONFIG_GCOV),@for f in $(gcov-files-$*-y); do \
+   $(if $(CONFIG_GCOV),@for f in $(gcov-files-$*-y) 
$(gcov-files-generic-y); do \
  echo Gcov report for $$f:;\
  $(GCOV) $(GCOV_OPTIONS) $$f -o `dirname $$f`; \
done,)
@@ -499,7 +501,7 @@ $(patsubst %, check-%, $(check-unit-y)): check-%: %
$(call quiet-command, \
MALLOC_PERTURB_=$${MALLOC_PERTURB_:-$$((RANDOM % 255 + 1))} \
gtester $(GTESTER_OPTIONS) -m=$(SPEED) $*,"GTESTER $*")
-   $(if $(CONFIG_GCOV),@for f in $(gcov-files-$(subst tests/,,$*)-y); do \
+   $(if $(CONFIG_GCOV),@for f in $(gcov-files-$(subst tests/,,$*)-y) 
$(gcov-files-generic-y); do \
  echo Gcov report for $$f:;\
  $(GCOV) $(GCOV_OPTIONS) $$f -o `dirname $$f`; \
done,)
diff --git a/tests/device-introspect-test.c b/tests/device-introspect-test.c
new file mode 100644
index 000..44da30e
--- /dev/null
+++ b/tests/device-introspect-test.c
@@ -0,0 +1,153 @@
+/*
+ * Device introspection test cases
+ *
+ * Copyright (c) 2015 Red Hat Inc.
+ *
+ * Authors:
+ *  Markus Armbruster ,
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+/*
+ * Covers QMP device-list-properties and HMP device_add help.  We
+ * currently don't check that their output makes sense, only that QEMU
+ * survives.  Useful since we've had an astounding number of crash
+ * bugs around here.
+ */
+
+#include 
+#include 
+#include "qemu-common.h"
+#include "qapi/qmp/qstring.h"
+#include "libqtest.h"
+
+const char common_args[] = "-nodefaults -machine none";
+
+static QList *device_type_list(bool abstract)
+{
+QDict *resp;
+QList *ret;
+
+resp = qmp("{'execute': 'qom-list-types',"
+   " 'arguments': {'implements': 'device', 'abstract': %i}}",
+   abstract);
+g_assert(qdict_haskey(resp, "return"));
+ret = qdict_get_qlist(resp, "return");
+QINCREF(ret);
+QDECREF(resp);
+return ret;
+}
+
+static void test_one_device(const char *type)
+{
+QDict *resp;
+char *help;
+
+/*
+ * Skip this part for the abstract device test case, because
+ * device-list-properties crashes for such devices.
+ * FIXME fix it not to crash
+ */
+if (strcmp(type, "device")) {
+resp = qmp("{'execute': 'device-list-properties',"
+   " 'arguments': {'typename': %s}}",
+   type);
+QDECREF(resp);
+}
+
+help = hmp("device_add \"%s,help\"", type);
+g_free(help);
+}
+
+static void test_device_intro_list(void)
+{
+QList *types;
+char *help;
+
+qtest_start(common_args);
+
+types = device_type_list(true);
+QDECREF(types);
+
+help = hmp("device_add help");
+g_free(help);
+
+qtest_end();
+}
+
+static void test_device_intro_none(void)
+{
+qtest_start(common_args);
+test_one_device("nonexiste

[Qemu-devel] [PATCH v2 3/7] libqtest: New hmp() & friends

2015-09-23 Thread Markus Armbruster
New convenience function hmp() to facilitate use of
human-monitor-command in tests.  Use it to simplify its existing uses.

To blend into existing libqtest code, also add qtest_hmpv() and
qtest_hmp().  That, and the egregiously verbose GTK-Doc comment format
make this patch look bigger than it is.

Signed-off-by: Markus Armbruster 
Reviewed-by: Eric Blake 
---
 tests/drive_del-test.c | 22 ++
 tests/ide-test.c   |  8 ++--
 tests/libqtest.c   | 37 +
 tests/libqtest.h   | 33 +
 4 files changed, 78 insertions(+), 22 deletions(-)

diff --git a/tests/drive_del-test.c b/tests/drive_del-test.c
index 8951f6f..3390946 100644
--- a/tests/drive_del-test.c
+++ b/tests/drive_del-test.c
@@ -16,28 +16,18 @@
 
 static void drive_add(void)
 {
-QDict *response;
+char *resp = hmp("drive_add 0 if=none,id=drive0");
 
-response = qmp("{'execute': 'human-monitor-command',"
-   " 'arguments': {"
-   "   'command-line': 'drive_add 0 if=none,id=drive0'"
-   "}}");
-g_assert(response);
-g_assert_cmpstr(qdict_get_try_str(response, "return"), ==, "OK\r\n");
-QDECREF(response);
+g_assert_cmpstr(resp, ==, "OK\r\n");
+g_free(resp);
 }
 
 static void drive_del(void)
 {
-QDict *response;
+char *resp = hmp("drive_del drive0");
 
-response = qmp("{'execute': 'human-monitor-command',"
-   " 'arguments': {"
-   "   'command-line': 'drive_del drive0'"
-   "}}");
-g_assert(response);
-g_assert_cmpstr(qdict_get_try_str(response, "return"), ==, "");
-QDECREF(response);
+g_assert_cmpstr(resp, ==, "");
+g_free(resp);
 }
 
 static void device_del(void)
diff --git a/tests/ide-test.c b/tests/ide-test.c
index 5594738..6173dfa 100644
--- a/tests/ide-test.c
+++ b/tests/ide-test.c
@@ -510,9 +510,7 @@ static void test_flush(void)
 tmp_path);
 
 /* Delay the completion of the flush request until we explicitly do it */
-qmp_discard_response("{'execute':'human-monitor-command', 'arguments': {"
- " 'command-line':"
- " 'qemu-io ide0-hd0 \"break flush_to_os A\"'} }");
+g_free(hmp("qemu-io ide0-hd0 \"break flush_to_os A\""));
 
 /* FLUSH CACHE command on device 0*/
 outb(IDE_BASE + reg_device, 0);
@@ -524,9 +522,7 @@ static void test_flush(void)
 assert_bit_clear(data, DF | ERR | DRQ);
 
 /* Complete the command */
-qmp_discard_response("{'execute':'human-monitor-command', 'arguments': {"
- " 'command-line':"
- " 'qemu-io ide0-hd0 \"resume A\"'} }");
+g_free(hmp("qemu-io ide0-hd0 \"resume A\""));
 
 /* Check registers */
 data = inb(IDE_BASE + reg_device);
diff --git a/tests/libqtest.c b/tests/libqtest.c
index 8dede56..2a396ba 100644
--- a/tests/libqtest.c
+++ b/tests/libqtest.c
@@ -483,6 +483,33 @@ void qtest_qmp_eventwait(QTestState *s, const char *event)
 }
 }
 
+char *qtest_hmpv(QTestState *s, const char *fmt, va_list ap)
+{
+char *cmd;
+QDict *resp;
+char *ret;
+
+cmd = g_strdup_vprintf(fmt, ap);
+resp = qtest_qmp(s, "{'execute': 'human-monitor-command',"
+ " 'arguments': {'command-line': %s}}",
+ cmd);
+ret = g_strdup(qdict_get_try_str(resp, "return"));
+g_assert(ret);
+QDECREF(resp);
+g_free(cmd);
+return ret;
+}
+
+char *qtest_hmp(QTestState *s, const char *fmt, ...)
+{
+va_list ap;
+char *ret;
+
+va_start(ap, fmt);
+ret = qtest_hmpv(s, fmt, ap);
+va_end(ap);
+return ret;
+}
 
 const char *qtest_get_arch(void)
 {
@@ -774,6 +801,16 @@ void qmp_discard_response(const char *fmt, ...)
 qtest_qmpv_discard_response(global_qtest, fmt, ap);
 va_end(ap);
 }
+char *hmp(const char *fmt, ...)
+{
+va_list ap;
+char *ret;
+
+va_start(ap, fmt);
+ret = qtest_hmpv(global_qtest, fmt, ap);
+va_end(ap);
+return ret;
+}
 
 bool qtest_big_endian(void)
 {
diff --git a/tests/libqtest.h b/tests/libqtest.h
index ec42031..b270f7b 100644
--- a/tests/libqtest.h
+++ b/tests/libqtest.h
@@ -120,6 +120,29 @@ QDict *qtest_qmp_receive(QTestState *s);
 void qtest_qmp_eventwait(QTestState *s, const char *event);
 
 /**
+ * qtest_hmpv:
+ * @s: #QTestState instance to operate on.
+ * @fmt...: HMP command to send to QEMU
+ *
+ * Send HMP command to QEMU via QMP's human-monitor-command.
+ *
+ * Returns: the command's output.
+ */
+char *qtest_hmp(QTestState *s, const char *fmt, ...);
+
+/**
+ * qtest_hmpv:
+ * @s: #QTestState instance to operate on.
+ * @fmt: HMP command to send to QEMU
+ * @ap: HMP command arguments
+ *
+ * Send HMP command to QEMU via QMP's human-monitor-command.
+ *
+ * Returns: the command's output.
+ */
+char *qtest_hmpv(QTestState *s, const char *fmt, va_list ap);
+
+/**
  * qtest_get_irq:
  * @s: #QTestState instance to operate on.

Re: [Qemu-devel] [PATCH 03/16] blkverify: Convert s->test_file to BdrvChild

2015-09-23 Thread Alberto Garcia
On Thu 17 Sep 2015 03:48:07 PM CEST, Kevin Wolf wrote:
> Signed-off-by: Kevin Wolf 
> ---
>  block/blkverify.c | 41 +
>  1 file changed, 21 insertions(+), 20 deletions(-)
>

Reviewed-by: Alberto Garcia 

Berto



[Qemu-devel] [PATCH] hw/arm/virt: smbios: inform guest of kvm

2015-09-23 Thread Andrew Jones
ARM/AArch64 KVM guests don't have any way to identify
themselves as KVM guests (x86 guests use a CPUID leaf). Now, we
could discuss all sorts of reasons why guests shouldn't need to
know that, but then there's always some case where it'd be
nice... Anyway, now that we have SMBIOS tables in ARM guests,
it's easy for the guest to know that it's a QEMU instance. This
patch takes that one step further, also identifying KVM, when
appropriate. Again, we could debate why generally nothing
should care whether it's of type QEMU or QEMU/KVM, but again,
sometimes it's nice to know...

Signed-off-by: Andrew Jones 
---
 hw/arm/virt.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 6bf0d6d591d6c..607d448354a8c 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -855,12 +855,17 @@ static void virt_build_smbios(VirtGuestInfo *guest_info)
 FWCfgState *fw_cfg = guest_info->fw_cfg;
 uint8_t *smbios_tables, *smbios_anchor;
 size_t smbios_tables_len, smbios_anchor_len;
+const char *product = "QEMU Virtual Machine";
 
 if (!fw_cfg) {
 return;
 }
 
-smbios_set_defaults("QEMU", "QEMU Virtual Machine",
+if (kvm_enabled()) {
+product = "KVM Virtual Machine";
+}
+
+smbios_set_defaults("QEMU", product,
 "1.0", false, true, SMBIOS_ENTRY_POINT_30);
 
 smbios_get_tables(NULL, 0, &smbios_tables, &smbios_tables_len,
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH v5 03/46] qapi: Test for C member name collisions

2015-09-23 Thread Eric Blake
On 09/23/2015 08:02 AM, Markus Armbruster wrote:

>> However, I'm not sure it would always help.  The conversion of
>> netdev_add to full qapi relies on being able to access the variant
>> through a named struct (such as NetdevTapOptions); unboxing the variant
>> would get rid of the convenient access to these named sub-structs.
> 
> struct Union {
> EnumType type;
> /* union tag is @type */
> union {
> One one;
> Two two;
> };
> };
> 
> For base, we go one step further and peel off the struct, to save some
> notational overhead.  Pointless for unions.

Ah, I see. Instead of malloc'ing a sub-struct and calling (roughly)

ptr = visit_start_struct(Union) // mallocs
subptr = visit_start_implicit_struct(One) // also mallocs
visit_type_fields(subptr)
visit_end_implicit_struct()

we would instead use inline allocation, with:

ptr = visit_start_struct(Union) // mallocs
visit_type_fields(&ptr->one)

seems straightforward enough; I'll play with the idea on top of my series.


-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v5 01/46] qapi: Sort qapi-schema tests

2015-09-23 Thread Eric Blake
On 09/21/2015 03:57 PM, Eric Blake wrote:
> Recent changes to qapi have provided quite a bit of churn in
> the makefile, because we are inconsistent on what order test
> names appear in, and on whether to re-wrap the list of tests or
> just add arbitrary line lengths.  Writing the list in a sorted
> fashion, one test per line, will make future patches easier
> to see what tests are being added or removed by a patch.
> 
> Signed-off-by: Eric Blake 
> ---
>  tests/Makefile | 160 
> -
>  1 file changed, 114 insertions(+), 46 deletions(-)
> 

> +qapi-schema += alternate-array.json
> +qapi-schema += alternate-base.json

Hmm, I just realized we require GNU make, and that we already use
$(wildcard) when building up other tests.  Would it be worth writing
this patch to merely use $(wildcard qapi-tests/*.json)?  Then further
additions (and removals) of .json files would automatically be picked up
without requiring Makefile tweaking.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [PATCH v2 1/3] qapi: convert NumaOptions into a flat union

2015-09-23 Thread Kővágó, Zoltán
Changes the NumaOptions to flat union from a simple one.  This is
required by my later OptsVisitor patch to preserve backward
compatibility.

Strictly speaking this would break QMP compatibility (as specified in
docs/qapi-code-gen.txt), but since no QMP command use this structure,
it's not an issue.  The -numa option syntax doesn't change.  There are
some changes in the C api, but this patch fixes them.

Signed-off-by: Kővágó, Zoltán 
Reviewed-by: Eric Blake 

---

Changes from v1:
* fixed documentation

 numa.c   |  2 +-
 qapi-schema.json | 49 ++---
 2 files changed, 39 insertions(+), 12 deletions(-)

diff --git a/numa.c b/numa.c
index 16a8c41..9cd0c84 100644
--- a/numa.c
+++ b/numa.c
@@ -227,7 +227,7 @@ static int parse_numa(void *opaque, QemuOpts *opts, Error 
**errp)
 }
 
 switch (object->type) {
-case NUMA_OPTIONS_KIND_NODE:
+case NUMA_OPTION_TYPE_NODE:
 numa_node_parse(object->node, opts, &err);
 if (err) {
 goto error;
diff --git a/qapi-schema.json b/qapi-schema.json
index 263053d..72827f8 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -3573,17 +3573,6 @@
   'data': { '*console':'int', 'events': [ 'InputEvent' ] } }
 
 ##
-# @NumaOptions
-#
-# A discriminated record of NUMA options. (for OptsVisitor)
-#
-# Since 2.1
-##
-{ 'union': 'NumaOptions',
-  'data': {
-'node': 'NumaNodeOptions' }}
-
-##
 # @NumaNodeOptions
 #
 # Create a guest NUMA node. (for OptsVisitor)
@@ -3610,6 +3599,44 @@
'*memdev': 'str' }}
 
 ##
+# @NumaOptionType
+#
+# NUMA command-line option type.
+#
+# @node: Create a guest NUMA node. See @NumaNodeOptions.
+#
+# Since: 2.5
+##
+{ 'enum': 'NumaOptionType',
+  'data': [ 'node' ] }
+
+##
+# @NumaCommonOptions
+#
+# Common set of numa options.
+#
+# @type: NUMA command-line option type.
+#
+# Since: 2.5
+##
+{ 'struct': 'NumaCommonOptions',
+  'data': {
+'type': 'NumaOptionType' } }
+
+##
+# @NumaOptions
+#
+# A discriminated record of NUMA options. (for OptsVisitor)
+#
+# Since 2.1
+##
+{ 'union': 'NumaOptions',
+  'base': 'NumaCommonOptions',
+  'discriminator': 'type',
+  'data': {
+'node': 'NumaNodeOptions' }}
+
+##
 # @HostMemPolicy
 #
 # Host memory policy types
-- 
2.5.2




[Qemu-devel] [PATCH v2 0/3] qapi-flattening and preparation of -audiodev option

2015-09-23 Thread Kővágó, Zoltán
Here is the second version of my qapi flattening attempts.  This is now
based on Eric Blake's "post-introspection cleanups, and qapi-ify
netdev_add" patch series[1], which contains some of my previous commits.

What remains here: NumaOptions flattening (with proposed documentation
changes from Eduardo) and OptsVisitor.  The Netdev part was mostly taken
care of by Eric, I only had to convert NetLegacy into a flat union.

Please review.

[1]: http://lists.nongnu.org/archive/html/qemu-devel/2015-09/msg05410.html

Kővágó, Zoltán (3):
  qapi: convert NumaOptions into a flat union
  qapi: change NetLegacy into a flat union
  qapi: support nested structs in OptsVisitor

 net/net.c   |  47 +++--
 numa.c  |   2 +-
 qapi-schema.json|  78 +++--
 qapi/opts-visitor.c | 116 ++--
 tests/qapi-schema/qapi-schema-test.json |   9 ++-
 tests/qapi-schema/qapi-schema-test.out  |   4 ++
 tests/test-opts-visitor.c   |  34 ++
 7 files changed, 224 insertions(+), 66 deletions(-)

-- 
2.5.2




Re: [Qemu-devel] [PATCH 01/16] block: Introduce BDS.file_child

2015-09-23 Thread Alberto Garcia
On Thu 17 Sep 2015 03:48:05 PM CEST, Kevin Wolf wrote:
> Store the BdrvChild for bs->file. At this point, bs->file_child->bs just
> duplicates the bs->file pointer. Later, it will completely replace it.
>
> Signed-off-by: Kevin Wolf 
> ---

Reviewed-by: Alberto Garcia 

Berto



  1   2   3   4   >