date:20140826

Re: [Qemu-devel] Using cache=writeback safely on qemu 1.4.0 and later

2014-08-26 Thread Paolo Bonzini

Il 25/08/2014 20:13, Andrew Martin ha scritto:
> Even if the python daemon or apache2 did not fsync the modified files, isn't 
> there some action that the OS takes periodically to flush dirty pages to 
> disk? 
> This seems to be implied in the SuSE documentation:
> https://www.suse.com/documentation/sles11/book_kvm/data/sect1_1_chapter_book_kvm.html
> "the normal page cache management will handle commitment to the storage 
> device."

... using fsync :)

> In the case of the files uploaded by apache2, they were added to the server 
> days 
> before  the power outage, so it seems like there would have been ample time 
> for 
> those changes to have been flushed.

Remember that cache=writeback has a huge cache, whose size is several
gigabytes.

At some point there was a QEMU bug that caused the guest not to do
fsyncs.  It was fixed by commit ef5bc96268.

Paolo

Re: [Qemu-devel] [RFC PATCH v2 06/13] spapr_iommu: Implement free_table() helper

2014-08-26 Thread Alexey Kardashevskiy

On 08/26/2014 04:16 PM, David Gibson wrote:
> On Fri, Aug 15, 2014 at 08:12:28PM +1000, Alexey Kardashevskiy wrote:
>> Every sPAPRTCETable object holds an IOMMU memory region which holds
>> a referenced to the sPAPRTCETable instance. So if we want to free
>> an sPAPRTCETable instance, calling object_unref() will not be enough
>> as embedded memory region will hold the reference and we need to break
>> the loop.
>>
>> This adds a spapr_tce_free_table() helper which destroys the embedded
>> memory region and then calls object_unref() on the sPAPRTCETable instance
>> which will succeed now. The helper adds a new child under unique name
>> derived from LIOBN.
>>
>> As we are here, fix spapr_tce_new_table() callers.
>> At the moment spapr_tce_new_table() references sPAPRTCETable twice -
>> once in object_new() and second time in object_property_add_child().
>> The callers of spapr_tce_new_table() should take care of correct
>> referencing.
> 
> So I've been trying to determine if there's any way to avoid this by
> not constructing the reference loop in the first place, but all the
> qom is breaking my brain.

Well. I could have added an additional object to sPAPRTCETable (just a
pointer in the struct but not a QOM-child!) and use it as an owner for
MemoryRegion. It would not hold reference to the table. But since I do not
grasp the idea of having an owner for a MemoryRegion, this can break
something which I cannot even imagine :)



-- 
Alexey

Re: [Qemu-devel] [PATCH 5/5] target-ppc: Handle cases when multi-processors get machine-check

2014-08-26 Thread Aravinda Prasad



On Tuesday 26 August 2014 11:34 AM, David Gibson wrote:
> On Mon, Aug 25, 2014 at 07:15:54PM +0530, Aravinda Prasad wrote:
>> It is possible for multi-processors to experience machine
>> check at or about the same time. As per PAPR, subsequent
>> processors serialize waiting for the first processor to
>> issue the ibm,nmi-interlock call.
>>
>> The second processor retries if the first processor which
>> received a machine check is still reading the error log
>> and is yet to issue ibm,nmi-interlock call.
>>
>> This patch implements this functionality.
> 
> The previous patch is broken without this, so just fold the two
> together.

sure.

Thanks for the review.

Regards,
Aravinda

> 

-- 
Regards,
Aravinda

Re: [Qemu-devel] [PATCH] spapr_pci: Fix config space corruption

2014-08-26 Thread Alexey Kardashevskiy

On 08/13/2014 05:20 PM, Alexey Kardashevskiy wrote:
> When disabling MSI/MSIX via "ibm,change-msi" RTAS call, no check was made
> if MSI or MSIX is actually supported and the MSI message was reset
> unconditionally. If this happened on a device which does not support MSI
> (but does support MSIX, otherwise "ibm,change-msi" would not be called),
> this device would have PCIDevice::msi_cap field (MSI capability offset)
> set to zero and writing a vector would actually clear PCI status.
> 
> This clears MSI message only if MSI or MSIX is present on a device.

Ping?


> Signed-off-by: Alexey Kardashevskiy 
> ---
>  hw/ppc/spapr_pci.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index d1f4c86..c0e703f 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -263,7 +263,6 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, 
> sPAPREnvironment *spapr,
>  unsigned int irq, max_irqs = 0, num = 0;
>  sPAPRPHBState *phb = NULL;
>  PCIDevice *pdev = NULL;
> -bool msix = false;
>  spapr_pci_msi *msi;
>  int *config_addr_key;
>  
> @@ -301,7 +300,12 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, 
> sPAPREnvironment *spapr,
>  }
>  
>  xics_free(spapr->icp, msi->first_irq, msi->num);
> -spapr_msi_setmsg(pdev, 0, msix, 0, num);
> +if (msi_present(pdev)) {
> +spapr_msi_setmsg(pdev, 0, false, 0, num);
> +}
> +if (msix_present(pdev)) {
> +spapr_msi_setmsg(pdev, 0, true, 0, num);
> +}
>  g_hash_table_remove(phb->msi, &config_addr);
>  
>  trace_spapr_pci_msi("Released MSIs", config_addr);
> 


-- 
Alexey

[Qemu-devel] [PATCH 02/12] pcspk: adding vmstate for save/restore

2014-08-26 Thread Pavel Dovgalyuk

VMState added by this patch preserves correct
loading of the PC speaker device state.

Signed-off-by: Pavel Dovgalyuk 
---
 hw/audio/pcspk.c |   18 --
 1 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/hw/audio/pcspk.c b/hw/audio/pcspk.c
index 1d81bbe..8b22dbf 100644
--- a/hw/audio/pcspk.c
+++ b/hw/audio/pcspk.c
@@ -50,8 +50,8 @@ typedef struct {
 unsigned int pit_count;
 unsigned int samples;
 unsigned int play_pos;
-int data_on;
-int dummy_refresh_clock;
+uint8_t data_on;
+uint8_t dummy_refresh_clock;
 } PCSpkState;
 
 static const char *s_spk = "pcspk";
@@ -163,6 +163,18 @@ static const MemoryRegionOps pcspk_io_ops = {
 },
 };
 
+static const VMStateDescription vmstate_spk = {
+.name = "pcspk",
+.version_id = 1,
+.minimum_version_id = 1,
+.minimum_version_id_old = 1,
+.fields  = (VMStateField[]) {
+VMSTATE_UINT8(data_on, PCSpkState),
+VMSTATE_UINT8(dummy_refresh_clock, PCSpkState),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static void pcspk_initfn(Object *obj)
 {
 PCSpkState *s = PC_SPEAKER(obj);
@@ -175,6 +187,8 @@ static void pcspk_realizefn(DeviceState *dev, Error **errp)
 ISADevice *isadev = ISA_DEVICE(dev);
 PCSpkState *s = PC_SPEAKER(dev);
 
+vmstate_register(NULL, 0, &vmstate_spk, s);
+
 isa_register_ioport(isadev, &s->ioport, s->iobase);
 
 pcspk_state = s;

[Qemu-devel] [PATCH 04/12] parallel: adding vmstate for save/restore

2014-08-26 Thread Pavel Dovgalyuk

VMState added by this patch preserves correct
loading of the parallel port controller state.

Signed-off-by: Pavel Dovgalyuk 
---
 hw/char/parallel.c |   20 
 1 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/hw/char/parallel.c b/hw/char/parallel.c
index 7ac90a5..26c03d7 100644
--- a/hw/char/parallel.c
+++ b/hw/char/parallel.c
@@ -477,6 +477,24 @@ static const MemoryRegionPortio 
isa_parallel_portio_sw_list[] = {
 PORTIO_END_OF_LIST(),
 };
 
+
+static const VMStateDescription vmstate_parallel_isa = {
+.name = "parallel_isa",
+.version_id = 1,
+.minimum_version_id = 1,
+.minimum_version_id_old = 1,
+.fields  = (VMStateField[]) {
+VMSTATE_UINT8(state.dataw, ISAParallelState),
+VMSTATE_UINT8(state.datar, ISAParallelState),
+VMSTATE_UINT8(state.status, ISAParallelState),
+VMSTATE_UINT8(state.control, ISAParallelState),
+VMSTATE_INT32(state.irq_pending, ISAParallelState),
+VMSTATE_INT32(state.epp_timeout, ISAParallelState),
+VMSTATE_END_OF_LIST()
+}
+};
+
+
 static void parallel_isa_realizefn(DeviceState *dev, Error **errp)
 {
 static int index;
@@ -518,6 +536,8 @@ static void parallel_isa_realizefn(DeviceState *dev, Error 
**errp)
   ? &isa_parallel_portio_hw_list[0]
   : &isa_parallel_portio_sw_list[0]),
  s, "parallel");
+
+vmstate_register(NULL, -1, &vmstate_parallel_isa, isa);
 }
 
 /* Memory mapped interface */

[Qemu-devel] [PATCH 00/12] Fixing hardware migration issues

2014-08-26 Thread Pavel Dovgalyuk

This set of patches is related to migration issues in hardware devices.
Some of the devices had fields in their states that didn't saved and restored.
These patches add missed fields to the new subsections of the vmstates.
For several devices (like integratorcp) the patches add new vmstates, that
didn't exist at all.

---

Pavel Dovgalyuk (12):
  integratorcp: adding vmstate for save/restore
  pcspk: adding vmstate for save/restore
  fdc: adding vmstate for save/restore
  parallel: adding vmstate for save/restore
  serial: fixing vmstate for save/restore
  kvmvapic: fixing loading vmstate
  hpet: fixing saving and loading process
  pckbd: adding new fields to vmstate
  rtl8139: adding new fields to vmstate
  piix: do not raise irq while loading vmstate
  mc146818rtc: add missed field to vmstate
  pl031: add missed field to vmstate


 hw/arm/integratorcp.c   |   38 +-
 hw/audio/pcspk.c|   18 ++-
 hw/block/fdc.c  |   81 
 hw/char/parallel.c  |   20 +++
 hw/char/serial.c|  264 ---
 hw/i386/kvmvapic.c  |   22 +++
 hw/input/pckbd.c|   51 
 hw/intc/apic_common.c   |   44 +++
 hw/net/rtl8139.c|   50 +++
 hw/pci-host/piix.c  |   22 +++
 hw/timer/hpet.c |   15 --
 hw/timer/mc146818rtc.c  |   32 +
 hw/timer/pl031.c|3 
 include/hw/i386/apic_internal.h |2 
 14 files changed, 594 insertions(+), 68 deletions(-)

-- 
Pavel Dovgalyuk

[Qemu-devel] [PATCH 01/12] integratorcp: adding vmstate for save/restore

2014-08-26 Thread Pavel Dovgalyuk

VMState added by this patch preserves correct
loading of the integratorcp device state.

Signed-off-by: Pavel Dovgalyuk 
---
 hw/arm/integratorcp.c |   38 +-
 1 files changed, 37 insertions(+), 1 deletions(-)

diff --git a/hw/arm/integratorcp.c b/hw/arm/integratorcp.c
index 0e476c3..e5d5751 100644
--- a/hw/arm/integratorcp.c
+++ b/hw/arm/integratorcp.c
@@ -42,6 +42,27 @@ typedef struct IntegratorCMState {
 uint32_t fiq_enabled;
 } IntegratorCMState;
 
+static const VMStateDescription vmstate_integratorcm = {
+.name = "integratorcm",
+.version_id = 1,
+.minimum_version_id = 1,
+.minimum_version_id_old = 1,
+.fields  = (VMStateField[]) {
+VMSTATE_UINT32(cm_osc, IntegratorCMState),
+VMSTATE_UINT32(cm_ctrl, IntegratorCMState),
+VMSTATE_UINT32(cm_lock, IntegratorCMState),
+VMSTATE_UINT32(cm_auxosc, IntegratorCMState),
+VMSTATE_UINT32(cm_sdram, IntegratorCMState),
+VMSTATE_UINT32(cm_init, IntegratorCMState),
+VMSTATE_UINT32(cm_flags, IntegratorCMState),
+VMSTATE_UINT32(cm_nvflags, IntegratorCMState),
+VMSTATE_UINT32(int_level, IntegratorCMState),
+VMSTATE_UINT32(irq_enabled, IntegratorCMState),
+VMSTATE_UINT32(fiq_enabled, IntegratorCMState),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static uint8_t integrator_spd[128] = {
128, 8, 4, 11, 9, 1, 64, 0,  2, 0xa0, 0xa0, 0, 0, 8, 0, 1,
0xe, 4, 0x1c, 1, 2, 0x20, 0xc0, 0, 0, 0, 0, 0x30, 0x28, 0x30, 0x28, 0x40
@@ -272,7 +293,7 @@ static int integratorcm_init(SysBusDevice *dev)
 sysbus_init_mmio(dev, &s->iomem);
 
 integratorcm_do_remap(s);
-/* ??? Save/restore.  */
+vmstate_register(NULL, -1, &vmstate_integratorcm, s);
 return 0;
 }
 
@@ -296,6 +317,20 @@ typedef struct icp_pic_state {
 qemu_irq parent_fiq;
 } icp_pic_state;
 
+
+static const VMStateDescription vmstate_icp_pic = {
+.name = "icp_pic_state",
+.version_id = 1,
+.minimum_version_id = 1,
+.minimum_version_id_old = 1,
+.fields  = (VMStateField[]) {
+VMSTATE_UINT32(level, icp_pic_state),
+VMSTATE_UINT32(irq_enabled, icp_pic_state),
+VMSTATE_UINT32(fiq_enabled, icp_pic_state),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static void icp_pic_update(icp_pic_state *s)
 {
 uint32_t flags;
@@ -399,6 +434,7 @@ static int icp_pic_init(SysBusDevice *sbd)
 memory_region_init_io(&s->iomem, OBJECT(s), &icp_pic_ops, s,
   "icp-pic", 0x0080);
 sysbus_init_mmio(sbd, &s->iomem);
+vmstate_register(NULL, -1, &vmstate_icp_pic, s);
 return 0;
 }

[Qemu-devel] [PATCH 09/12] rtl8139: adding new fields to vmstate

2014-08-26 Thread Pavel Dovgalyuk

This patch adds virtual clock-dependent timers to VMState to allow correct
saving and restoring the state of RTL8139 network controller.

Signed-off-by: Pavel Dovgalyuk 
---
 hw/net/rtl8139.c |   50 --
 1 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
index 6e59f38..cc2f705 100644
--- a/hw/net/rtl8139.c
+++ b/hw/net/rtl8139.c
@@ -3246,10 +3246,17 @@ static uint32_t rtl8139_mmio_readl(void *opaque, hwaddr 
addr)
 return val;
 }
 
+static int rtl8139_pre_load(void *opaque)
+{
+RTL8139State *s = opaque;
+s->TimerExpire = 0;
+timer_del(s->timer);
+return 0;
+}
+
 static int rtl8139_post_load(void *opaque, int version_id)
 {
 RTL8139State* s = opaque;
-rtl8139_set_next_tctr_time(s, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL));
 if (version_id < 4) {
 s->cplus_enabled = s->CpCmd != 0;
 }
@@ -3275,6 +3282,38 @@ static const VMStateDescription 
vmstate_rtl8139_hotplug_ready ={
 }
 };
 
+static bool rtl8139_TimerExpire_needed(void *opaque)
+{
+RTL8139State *s = (RTL8139State *)opaque;
+return s->TimerExpire != 0;
+}
+
+static const VMStateDescription vmstate_rtl8139_TimerExpire = {
+.name = "rtl8139/TimerExpire",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_INT64(TimerExpire, RTL8139State),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static bool rtl8139_timer_needed(void *opaque)
+{
+RTL8139State *s = (RTL8139State *)opaque;
+return timer_pending(s->timer);
+}
+
+static const VMStateDescription vmstate_rtl8139_timer = {
+.name = "rtl8139/timer",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_TIMER(timer, RTL8139State),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static void rtl8139_pre_save(void *opaque)
 {
 RTL8139State* s = opaque;
@@ -3289,8 +3328,9 @@ static void rtl8139_pre_save(void *opaque)
 
 static const VMStateDescription vmstate_rtl8139 = {
 .name = "rtl8139",
-.version_id = 4,
+.version_id = 5,
 .minimum_version_id = 3,
+.pre_load = rtl8139_pre_load,
 .post_load = rtl8139_post_load,
 .pre_save  = rtl8139_pre_save,
 .fields = (VMStateField[]) {
@@ -3371,6 +3411,12 @@ static const VMStateDescription vmstate_rtl8139 = {
 .vmsd = &vmstate_rtl8139_hotplug_ready,
 .needed = rtl8139_hotplug_ready_needed,
 }, {
+.vmsd = &vmstate_rtl8139_TimerExpire,
+.needed = rtl8139_TimerExpire_needed,
+}, {
+.vmsd = &vmstate_rtl8139_timer,
+.needed = rtl8139_timer_needed,
+}, {
 /* empty */
 }
 }

[Qemu-devel] [PATCH 05/12] serial: fixing vmstate for save/restore

2014-08-26 Thread Pavel Dovgalyuk

Some fields were added to VMState by this patch to preserve correct
loading of the serial port controller state.
Updating FCR value while loading was also modified to disable generating
an interrupt by loadvm.

Signed-off-by: Pavel Dovgalyuk 
---
 hw/char/serial.c |  264 +-
 1 files changed, 219 insertions(+), 45 deletions(-)

diff --git a/hw/char/serial.c b/hw/char/serial.c
index 764e184..192f9a7 100644
--- a/hw/char/serial.c
+++ b/hw/char/serial.c
@@ -272,6 +272,64 @@ static gboolean serial_xmit(GIOChannel *chan, GIOCondition 
cond, void *opaque)
 }
 
 
+/* Setter for FCR.
+   is_load flag means, that value is set while loading VM state
+   and interrupt should not be invoked */
+static void serial_write_fcr(void *opaque, uint32_t val, int is_load)
+{
+SerialState *s = opaque;
+val = val & 0xFF;
+
+if (s->fcr == val) {
+return;
+}
+
+/* Did the enable/disable flag change? If so, make sure FIFOs get flushed 
*/
+if ((val ^ s->fcr) & UART_FCR_FE) {
+val |= UART_FCR_XFR | UART_FCR_RFR;
+}
+
+/* FIFO clear */
+
+if (val & UART_FCR_RFR) {
+timer_del(s->fifo_timeout_timer);
+s->timeout_ipending = 0;
+fifo8_reset(&s->recv_fifo);
+}
+
+if (val & UART_FCR_XFR) {
+fifo8_reset(&s->xmit_fifo);
+}
+
+if (val & UART_FCR_FE) {
+s->iir |= UART_IIR_FE;
+/* Set recv_fifo trigger Level */
+switch (val & 0xC0) {
+case UART_FCR_ITL_1:
+s->recv_fifo_itl = 1;
+break;
+case UART_FCR_ITL_2:
+s->recv_fifo_itl = 4;
+break;
+case UART_FCR_ITL_3:
+s->recv_fifo_itl = 8;
+break;
+case UART_FCR_ITL_4:
+s->recv_fifo_itl = 14;
+break;
+}
+} else {
+s->iir &= ~UART_IIR_FE;
+}
+
+/* Set fcr - or at least the bits in it that are supposed to "stick" */
+s->fcr = val & 0xC9;
+
+if (!is_load) {
+serial_update_irq(s);
+}
+}
+
 static void serial_ioport_write(void *opaque, hwaddr addr, uint64_t val,
 unsigned size)
 {
@@ -327,50 +385,7 @@ static void serial_ioport_write(void *opaque, hwaddr addr, 
uint64_t val,
 }
 break;
 case 2:
-val = val & 0xFF;
-
-if (s->fcr == val)
-break;
-
-/* Did the enable/disable flag change? If so, make sure FIFOs get 
flushed */
-if ((val ^ s->fcr) & UART_FCR_FE)
-val |= UART_FCR_XFR | UART_FCR_RFR;
-
-/* FIFO clear */
-
-if (val & UART_FCR_RFR) {
-timer_del(s->fifo_timeout_timer);
-s->timeout_ipending=0;
-fifo8_reset(&s->recv_fifo);
-}
-
-if (val & UART_FCR_XFR) {
-fifo8_reset(&s->xmit_fifo);
-}
-
-if (val & UART_FCR_FE) {
-s->iir |= UART_IIR_FE;
-/* Set recv_fifo trigger Level */
-switch (val & 0xC0) {
-case UART_FCR_ITL_1:
-s->recv_fifo_itl = 1;
-break;
-case UART_FCR_ITL_2:
-s->recv_fifo_itl = 4;
-break;
-case UART_FCR_ITL_3:
-s->recv_fifo_itl = 8;
-break;
-case UART_FCR_ITL_4:
-s->recv_fifo_itl = 14;
-break;
-}
-} else
-s->iir &= ~UART_IIR_FE;
-
-/* Set fcr - or at least the bits in it that are supposed to "stick" */
-s->fcr = val & 0xC9;
-serial_update_irq(s);
+serial_write_fcr(s, val, 0);
 break;
 case 3:
 {
@@ -590,6 +605,17 @@ static void serial_pre_save(void *opaque)
 s->fcr_vmstate = s->fcr;
 }
 
+static int serial_pre_load(void *opaque)
+{
+SerialState *s = (SerialState *)opaque;
+s->thr_ipending = -1;
+timer_del(s->fifo_timeout_timer);
+s->timeout_ipending = 0;
+s->poll_msl = -1;
+timer_del(s->modem_status_poll);
+return 0;
+}
+
 static int serial_post_load(void *opaque, int version_id)
 {
 SerialState *s = opaque;
@@ -597,17 +623,139 @@ static int serial_post_load(void *opaque, int version_id)
 if (version_id < 3) {
 s->fcr_vmstate = 0;
 }
+if (s->thr_ipending == -1) {
+s->thr_ipending = ((s->iir & UART_IIR_ID) == UART_IIR_THRI);
+}
+s->last_break_enable = (s->lcr >> 6) & 1;
 /* Initialize fcr via setter to perform essential side-effects */
-serial_ioport_write(s, 0x02, s->fcr_vmstate, 1);
+serial_write_fcr(s, s->fcr_vmstate, 1);
 serial_update_parameters(s);
 return 0;
 }
 
+static bool serial_thr_ipending_needed(void *opaque)
+{
+SerialState *s = opaque;
+bool expected_value = ((s->iir & UART_IIR_ID) == UART_IIR_THRI);
+return s->thr_ipending != expected_value;
+}
+
+const VMStateDescription vmstate_serial_thr_ipending = {
+.name = "serial/thr_ipending",
+

[Qemu-devel] [PATCH 07/12] hpet: fixing saving and loading process

2014-08-26 Thread Pavel Dovgalyuk

VM clock does not run while saving, so there is no need for saving the ticks
in HPET. Also added saving of hpet_offset field.

Signed-off-by: Pavel Dovgalyuk 
---
 hw/timer/hpet.c |   15 ++-
 1 files changed, 2 insertions(+), 13 deletions(-)

diff --git a/hw/timer/hpet.c b/hw/timer/hpet.c
index e160e8f..4cdda64 100644
--- a/hw/timer/hpet.c
+++ b/hw/timer/hpet.c
@@ -222,14 +222,6 @@ static void update_irq(struct HPETTimer *timer, int set)
 }
 }
 
-static void hpet_pre_save(void *opaque)
-{
-HPETState *s = opaque;
-
-/* save current counter value */
-s->hpet_counter = hpet_get_ticks(s);
-}
-
 static int hpet_pre_load(void *opaque)
 {
 HPETState *s = opaque;
@@ -255,9 +247,6 @@ static int hpet_post_load(void *opaque, int version_id)
 {
 HPETState *s = opaque;
 
-/* Recalculate the offset between the main counter and guest time */
-s->hpet_offset = ticks_to_ns(s->hpet_counter) - 
qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
-
 /* Push number of timers into capability returned via HPET_ID */
 s->capability &= ~HPET_ID_NUM_TIM_MASK;
 s->capability |= (s->num_timers - 1) << HPET_ID_NUM_TIM_SHIFT;
@@ -306,15 +295,15 @@ static const VMStateDescription vmstate_hpet_timer = {
 
 static const VMStateDescription vmstate_hpet = {
 .name = "hpet",
-.version_id = 2,
+.version_id = 3,
 .minimum_version_id = 1,
-.pre_save = hpet_pre_save,
 .pre_load = hpet_pre_load,
 .post_load = hpet_post_load,
 .fields = (VMStateField[]) {
 VMSTATE_UINT64(config, HPETState),
 VMSTATE_UINT64(isr, HPETState),
 VMSTATE_UINT64(hpet_counter, HPETState),
+VMSTATE_UINT64_V(hpet_offset, HPETState, 3),
 VMSTATE_UINT8_V(num_timers, HPETState, 2),
 VMSTATE_VALIDATE("num_timers in range", hpet_validate_num_timers),
 VMSTATE_STRUCT_VARRAY_UINT8(timer, HPETState, num_timers, 0,

[Qemu-devel] [PATCH 03/12] fdc: adding vmstate for save/restore

2014-08-26 Thread Pavel Dovgalyuk

VMState added by this patch preserves correct
loading of the FDC device state.

Signed-off-by: Pavel Dovgalyuk 
---
 hw/block/fdc.c |   81 
 1 files changed, 81 insertions(+), 0 deletions(-)

diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index 490d127..a10c458 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -695,10 +695,34 @@ static const VMStateDescription vmstate_fdrive_media_rate 
= {
 }
 };
 
+static bool fdrive_perpendicular_needed(void *opaque)
+{
+FDrive *drive = opaque;
+
+return drive->perpendicular != 0;
+}
+
+static const VMStateDescription vmstate_fdrive_perpendicular = {
+.name = "fdrive/perpendicular",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_UINT8(perpendicular, FDrive),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static int fdrive_post_load(void *opaque, int version_id)
+{
+fd_revalidate(opaque);
+return 0;
+}
+
 static const VMStateDescription vmstate_fdrive = {
 .name = "fdrive",
 .version_id = 1,
 .minimum_version_id = 1,
+.post_load = fdrive_post_load,
 .fields = (VMStateField[]) {
 VMSTATE_UINT8(head, FDrive),
 VMSTATE_UINT8(track, FDrive),
@@ -713,6 +737,9 @@ static const VMStateDescription vmstate_fdrive = {
 .vmsd = &vmstate_fdrive_media_rate,
 .needed = &fdrive_media_rate_needed,
 } , {
+.vmsd = &vmstate_fdrive_perpendicular,
+.needed = &fdrive_perpendicular_needed,
+} , {
 /* empty */
 }
 }
@@ -725,6 +752,14 @@ static void fdc_pre_save(void *opaque)
 s->dor_vmstate = s->dor | GET_CUR_DRV(s);
 }
 
+static int fdc_pre_load(void *opaque)
+{
+FDCtrl *s = opaque;
+s->reset_sensei = 0;
+timer_del(s->result_timer);
+return 0;
+}
+
 static int fdc_post_load(void *opaque, int version_id)
 {
 FDCtrl *s = opaque;
@@ -734,11 +769,46 @@ static int fdc_post_load(void *opaque, int version_id)
 return 0;
 }
 
+static bool fdc_reset_sensei_needed(void *opaque)
+{
+FDCtrl *s = opaque;
+
+return s->reset_sensei != 0;
+}
+
+static const VMStateDescription vmstate_fdc_reset_sensei = {
+.name = "fdc/reset_sensei",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_INT32(reset_sensei, FDCtrl),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static bool fdc_result_timer_needed(void *opaque)
+{
+FDCtrl *s = opaque;
+
+return timer_pending(s->result_timer);
+}
+
+static const VMStateDescription vmstate_fdc_result_timer = {
+.name = "fdc/result_timer",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_TIMER(result_timer, FDCtrl),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static const VMStateDescription vmstate_fdc = {
 .name = "fdc",
 .version_id = 2,
 .minimum_version_id = 2,
 .pre_save = fdc_pre_save,
+.pre_load = fdc_pre_load,
 .post_load = fdc_post_load,
 .fields = (VMStateField[]) {
 /* Controller State */
@@ -770,6 +840,17 @@ static const VMStateDescription vmstate_fdc = {
 VMSTATE_STRUCT_ARRAY(drives, FDCtrl, MAX_FD, 1,
  vmstate_fdrive, FDrive),
 VMSTATE_END_OF_LIST()
+},
+.subsections = (VMStateSubsection[]) {
+{
+.vmsd = &vmstate_fdc_reset_sensei,
+.needed = fdc_reset_sensei_needed,
+} , {
+.vmsd = &vmstate_fdc_result_timer,
+.needed = fdc_result_timer_needed,
+} , {
+/* empty */
+}
 }
 };

[Qemu-devel] [PATCH 06/12] kvmvapic: fixing loading vmstate

2014-08-26 Thread Pavel Dovgalyuk

vapic state should not be synchronized with APIC while loading,
because APIC state could be not loaded yet at that moment.
We just save vapic_paddr in APIC VMState instead of synchronization.

Signed-off-by: Pavel Dovgalyuk 
---
 hw/i386/kvmvapic.c  |   22 +++-
 hw/intc/apic_common.c   |   44 +++
 include/hw/i386/apic_internal.h |2 +-
 3 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
index ee95963..3c403c5 100644
--- a/hw/i386/kvmvapic.c
+++ b/hw/i386/kvmvapic.c
@@ -351,6 +351,24 @@ static int get_kpcr_number(X86CPU *cpu)
 return kpcr.number;
 }
 
+static int vapic_enable_post_load(VAPICROMState *s, X86CPU *cpu)
+{
+int cpu_number = get_kpcr_number(cpu);
+hwaddr vapic_paddr;
+static const uint8_t enabled = 1;
+
+if (cpu_number < 0) {
+return -1;
+}
+vapic_paddr = s->vapic_paddr +
+(((hwaddr)cpu_number) << VAPIC_CPU_SHIFT);
+cpu_physical_memory_rw(vapic_paddr + offsetof(VAPICState, enabled),
+   (void *)&enabled, sizeof(enabled), 1);
+s->state = VAPIC_ACTIVE;
+
+return 0;
+}
+
 static int vapic_enable(VAPICROMState *s, X86CPU *cpu)
 {
 int cpu_number = get_kpcr_number(cpu);
@@ -731,7 +749,9 @@ static void do_vapic_enable(void *data)
 VAPICROMState *s = data;
 X86CPU *cpu = X86_CPU(first_cpu);
 
-vapic_enable(s, cpu);
+/* Do not synchronize with APIC, because it was not loaded yet.
+   Just call the enable function which does not have synchronization. */
+vapic_enable_post_load(s, cpu);
 }
 
 static int vapic_post_load(void *opaque, int version_id)
diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index ce3d903..8d672bd 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -345,6 +345,39 @@ static int apic_dispatch_post_load(void *opaque, int 
version_id)
 return 0;
 }
 
+static bool apic_common_sipi_needed(void *opaque)
+{
+APICCommonState *s = APIC_COMMON(opaque);
+return s->wait_for_sipi != 0;
+}
+
+static const VMStateDescription vmstate_apic_common_sipi = {
+.name = "apic_sipi",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_INT32(sipi_vector, APICCommonState),
+VMSTATE_INT32(wait_for_sipi, APICCommonState),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static bool apic_common_vapic_paddr_needed(void *opaque)
+{
+APICCommonState *s = APIC_COMMON(opaque);
+return s->vapic_paddr != 0;
+}
+
+static const VMStateDescription vmstate_apic_common_vapic_paddr = {
+.name = "apic_vapic_paddr",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_UINT64(vapic_paddr, APICCommonState),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static const VMStateDescription vmstate_apic_common = {
 .name = "apic",
 .version_id = 3,
@@ -375,6 +408,17 @@ static const VMStateDescription vmstate_apic_common = {
 VMSTATE_INT64(timer_expiry,
   APICCommonState), /* open-coded timer state */
 VMSTATE_END_OF_LIST()
+},
+.subsections = (VMStateSubsection[]) {
+{
+.vmsd = &vmstate_apic_common_sipi,
+.needed = apic_common_sipi_needed,
+},
+{
+.vmsd = &vmstate_apic_common_vapic_paddr,
+.needed = apic_common_vapic_paddr_needed,
+},
+VMSTATE_END_OF_LIST()
 }
 };
 
diff --git a/include/hw/i386/apic_internal.h b/include/hw/i386/apic_internal.h
index 83e2a42..df4381c 100644
--- a/include/hw/i386/apic_internal.h
+++ b/include/hw/i386/apic_internal.h
@@ -124,7 +124,7 @@ struct APICCommonState {
 
 uint32_t vapic_control;
 DeviceState *vapic;
-hwaddr vapic_paddr; /* note: persistence via kvmvapic */
+hwaddr vapic_paddr;
 };
 
 typedef struct VAPICState {

[Qemu-devel] [PATCH] coroutine: Drop co_sleep_ns

2014-08-26 Thread Fam Zheng

block_job_sleep_ns is the only user. Since we are moving towards
AioContext aware code, it's better to use the explict version and drop
the old one.

Signed-off-by: Fam Zheng 
---
 blockjob.c|  2 +-
 include/block/coroutine.h |  8 
 qemu-coroutine-sleep.c| 12 
 3 files changed, 1 insertion(+), 21 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index ca0b4e2..0689fdd 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -205,7 +205,7 @@ void block_job_sleep_ns(BlockJob *job, QEMUClockType type, 
int64_t ns)
 if (block_job_is_paused(job)) {
 qemu_coroutine_yield();
 } else {
-co_sleep_ns(type, ns);
+co_aio_sleep_ns(bdrv_get_aio_context(job->bs), type, ns);
 }
 job->busy = true;
 }
diff --git a/include/block/coroutine.h b/include/block/coroutine.h
index b9b7f48..793df0e 100644
--- a/include/block/coroutine.h
+++ b/include/block/coroutine.h
@@ -203,14 +203,6 @@ void qemu_co_rwlock_unlock(CoRwlock *lock);
 /**
  * Yield the coroutine for a given duration
  *
- * Note this function uses timers and hence only works when a main loop is in
- * use.  See main-loop.h and do not use from qemu-tool programs.
- */
-void coroutine_fn co_sleep_ns(QEMUClockType type, int64_t ns);
-
-/**
- * Yield the coroutine for a given duration
- *
  * Behaves similarly to co_sleep_ns(), but the sleeping coroutine will be
  * resumed when using aio_poll().
  */
diff --git a/qemu-coroutine-sleep.c b/qemu-coroutine-sleep.c
index ad78fba..9abb7fd 100644
--- a/qemu-coroutine-sleep.c
+++ b/qemu-coroutine-sleep.c
@@ -27,18 +27,6 @@ static void co_sleep_cb(void *opaque)
 qemu_coroutine_enter(sleep_cb->co, NULL);
 }
 
-void coroutine_fn co_sleep_ns(QEMUClockType type, int64_t ns)
-{
-CoSleepCB sleep_cb = {
-.co = qemu_coroutine_self(),
-};
-sleep_cb.ts = timer_new(type, SCALE_NS, co_sleep_cb, &sleep_cb);
-timer_mod(sleep_cb.ts, qemu_clock_get_ns(type) + ns);
-qemu_coroutine_yield();
-timer_del(sleep_cb.ts);
-timer_free(sleep_cb.ts);
-}
-
 void coroutine_fn co_aio_sleep_ns(AioContext *ctx, QEMUClockType type,
   int64_t ns)
 {
-- 
2.1.0

[Qemu-devel] [PATCH 08/12] pckbd: adding new fields to vmstate

2014-08-26 Thread Pavel Dovgalyuk

This patch adds outport to VMState to allow correct saving and restoring
the state of PC keyboard controller.

Signed-off-by: Pavel Dovgalyuk 
---
 hw/input/pckbd.c |   51 +++
 1 files changed, 51 insertions(+), 0 deletions(-)

diff --git a/hw/input/pckbd.c b/hw/input/pckbd.c
index ca1cffc..95d8767 100644
--- a/hw/input/pckbd.c
+++ b/hw/input/pckbd.c
@@ -131,6 +131,7 @@ typedef struct KBDState {
 uint8_t status;
 uint8_t mode;
 uint8_t outport;
+bool outport_present;
 /* Bitmask of devices with data available.  */
 uint8_t pending;
 void *kbd;
@@ -367,18 +368,68 @@ static void kbd_reset(void *opaque)
 s->mode = KBD_MODE_KBD_INT | KBD_MODE_MOUSE_INT;
 s->status = KBD_STAT_CMD | KBD_STAT_UNLOCKED;
 s->outport = KBD_OUT_RESET | KBD_OUT_A20;
+s->outport_present = false;
+}
+
+static uint8_t kbd_outport_default(KBDState *s)
+{
+return KBD_OUT_RESET | KBD_OUT_A20
+   | (s->status & KBD_STAT_OBF ? KBD_OUT_OBF : 0)
+   | (s->status & KBD_STAT_MOUSE_OBF ? KBD_OUT_MOUSE_OBF : 0);
+}
+
+static int kbd_outport_post_load(void *opaque, int version_id)
+{
+KBDState *s = opaque;
+s->outport_present = true;
+return 0;
+}
+
+static const VMStateDescription vmstate_kbd_outport = {
+.name = "pckbd_outport",
+.version_id = 1,
+.minimum_version_id = 1,
+.post_load = kbd_outport_post_load,
+.fields = (VMStateField[]) {
+VMSTATE_UINT8(outport, KBDState),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static bool kbd_outport_needed(void *opaque)
+{
+KBDState *s = opaque;
+return s->outport != kbd_outport_default(s);
+}
+
+static int kbd_post_load(void *opaque, int version_id)
+{
+KBDState *s = opaque;
+if (!s->outport_present) {
+s->outport = kbd_outport_default(s);
+}
+s->outport_present = false;
+return 0;
 }
 
 static const VMStateDescription vmstate_kbd = {
 .name = "pckbd",
 .version_id = 3,
 .minimum_version_id = 3,
+.post_load = kbd_post_load,
 .fields = (VMStateField[]) {
 VMSTATE_UINT8(write_cmd, KBDState),
 VMSTATE_UINT8(status, KBDState),
 VMSTATE_UINT8(mode, KBDState),
 VMSTATE_UINT8(pending, KBDState),
 VMSTATE_END_OF_LIST()
+},
+.subsections = (VMStateSubsection[]) {
+{
+.vmsd = &vmstate_kbd_outport,
+.needed = kbd_outport_needed,
+},
+VMSTATE_END_OF_LIST()
 }
 };

Re: [Qemu-devel] [RFC PATCH v2 09/13] spapr_pci_vfio: Call spapr_pci::reset on reset

2014-08-26 Thread David Gibson

On Fri, Aug 15, 2014 at 08:12:31PM +1000, Alexey Kardashevskiy wrote:
> This enables use of the parent class rest() callback in VFIO.
> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
> 
> I honestly do not remember why I did not do this when I added VFIO
> at the first place...

This or equivalent probably belongs in the main commit message,
otherwise it has no justification for the change at all.

Reviewed-by: David Gibson 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgp3RDvu7MFlw.pgp
Description: PGP signature

Re: [Qemu-devel] [RFC PATCH v2 08/13] spapr_pci: Enable DDW

2014-08-26 Thread David Gibson

On Fri, Aug 15, 2014 at 08:12:30PM +1000, Alexey Kardashevskiy wrote:
> This implements DDW for emulated PHB.
> 
> This advertises DDW in device tree.
> 
> Since QEMU does not implement any 64bit DMA capable device, this hack
> has been used to enable 64bit DMA on E1000:
> 
> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> index 0fc29a0..131f80a 100644
> --- a/hw/net/e1000.c
> +++ b/hw/net/e1000.c
> @@ -240,6 +240,7 @@ static const uint32_t mac_reg_init[] = {
>  [STATUS] =  0x8000 | E1000_STATUS_GIO_MASTER_ENABLE |
>  E1000_STATUS_ASDV | E1000_STATUS_MTXCKOK |
>  E1000_STATUS_SPEED_1000 | E1000_STATUS_FD |
> +E1000_STATUS_PCIX_MODE |
>  E1000_STATUS_LU,
>  [MANC] =E1000_MANC_EN_MNG2HOST | E1000_MANC_RCV_TCO_EN |
>  E1000_MANC_ARP_EN | E1000_MANC_0298_EN |

Are you planning to send a patch to enable 64-bit DMA in the e1000
device properly?

> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
> Changes:
> v2:
> * tested on hacked emulated E1000
> * implemented DDW reset on the PHB reset
> * spapr_pci_ddw_remove/spapr_pci_ddw_reset are public for reuse by VFIO
> ---
>  hw/ppc/spapr_pci.c  | 96 
> +
>  include/hw/pci-host/spapr.h |  7 
>  2 files changed, 103 insertions(+)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 9b03d0d..cba8d9d 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -22,6 +22,7 @@
>   * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
>   * THE SOFTWARE.
>   */
> +#include "sysemu/sysemu.h"
>  #include "hw/hw.h"
>  #include "hw/pci/pci.h"
>  #include "hw/pci/msi.h"
> @@ -498,6 +499,70 @@ void spapr_pci_msi_init(sPAPREnvironment *spapr, hwaddr 
> addr)
>  }
>  
>  /*
> + * Dynamic DMA windows
> + */
> +static int spapr_pci_ddw_query(sPAPRPHBState *sphb,
> +   uint32_t *windows_available,
> +   uint32_t *page_size_mask)
> +{
> +*windows_available = 1;
> +*page_size_mask = DDW_PGSIZE_16M;
> +
> +return 0;
> +}
> +
> +static int spapr_pci_ddw_create(sPAPRPHBState *sphb, uint32_t page_shift,
> +uint32_t window_shift, uint32_t liobn,
> +sPAPRTCETable **ptcet)
> +{
> +*ptcet = spapr_tce_new_table(DEVICE(sphb), liobn,
> + SPAPR_PCI_TCE64_START, page_shift,
> + 1ULL << (window_shift - page_shift),
> + true);

Does anything actually validate that the specified page_shift is one
of the permitted/advertised ones?

> +if (!*ptcet) {
> +return -1;
> +}
> +memory_region_add_subregion(&sphb->iommu_root, (*ptcet)->bus_offset,
> +spapr_tce_get_iommu(*ptcet));
> +
> +return 0;
> +}
> +
> +int spapr_pci_ddw_remove(sPAPRPHBState *sphb, sPAPRTCETable *tcet)
> +{
> +memory_region_del_subregion(&sphb->iommu_root,
> +spapr_tce_get_iommu(tcet));
> +spapr_tce_free_table(tcet);

Ok, relating to my comment in the previous patch, ddw_num doesn't seem
to be decremented here either.

> +
> +return 0;
> +}
> +
> +static int spapr_pci_remove_ddw_cb(Object *child, void *opaque)
> +{
> +sPAPRTCETable *tcet;
> +
> +tcet = (sPAPRTCETable *) object_dynamic_cast(child, 
> TYPE_SPAPR_TCE_TABLE);
> +
> +/* Delete all dynamic windows, i.e. every except the default one with #0 
> */
> +if (tcet && SPAPR_PCI_DMA_WINDOW_NUM(tcet->liobn)) {
> +sPAPRPHBState *sphb = opaque;
> +sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
> +
> +spc->ddw_remove(sphb, tcet);
> +}
> +
> +return 0;
> +}
> +
> +int spapr_pci_ddw_reset(sPAPRPHBState *sphb)
> +{
> +object_child_foreach(OBJECT(sphb), spapr_pci_remove_ddw_cb, sphb);
> +sphb->ddw_num = 0;

So, you do reset ddw_num here, but since it is incremented in the
generic RTAS code, this smells like a layering violation.

> +
> +return 0;
> +}
> +
> +/*
>   * PHB PCI device
>   */
>  static AddressSpace *spapr_pci_dma_iommu(PCIBus *bus, void *opaque, int 
> devfn)
> @@ -671,6 +736,12 @@ static int spapr_phb_children_reset(Object *child, void 
> *opaque)
>  
>  static void spapr_phb_reset(DeviceState *qdev)
>  {
> +sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(qdev);
> +
> +if (spc->ddw_reset) {
> +spc->ddw_reset(SPAPR_PCI_HOST_BRIDGE(qdev));
> +}
> +
>  /* Reset the IOMMU state */
>  object_child_foreach(OBJECT(qdev), spapr_phb_children_reset, NULL);
>  }
> @@ -685,6 +756,7 @@ static Property spapr_phb_properties[] = {
>  DEFINE_PROP_UINT64("io_win_addr", sPAPRPHBState, io_win_addr, -1),
>  DEFINE_PROP_UINT64("io_win_size", sPAPRPHBState, io_win_size,
> SPAPR_PCI_IO_WIN_SIZE),
> +DEFINE_PROP_BOOL("ddw", sPAPRPHBState, ddw_enabled, false

[Qemu-devel] [PATCH 12/12] pl031: add missed field to vmstate

2014-08-26 Thread Pavel Dovgalyuk

This patch adds timer which uses virtual clock to the VMState.
Such timers are required for saving because virtual clock is the part
of the virtual machine state.

Signed-off-by: Pavel Dovgalyuk 
---
 hw/timer/pl031.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/hw/timer/pl031.c b/hw/timer/pl031.c
index 34d9b44..f8e5abc 100644
--- a/hw/timer/pl031.c
+++ b/hw/timer/pl031.c
@@ -230,12 +230,13 @@ static int pl031_post_load(void *opaque, int version_id)
 
 static const VMStateDescription vmstate_pl031 = {
 .name = "pl031",
-.version_id = 1,
+.version_id = 2,
 .minimum_version_id = 1,
 .pre_save = pl031_pre_save,
 .post_load = pl031_post_load,
 .fields = (VMStateField[]) {
 VMSTATE_UINT32(tick_offset_vmstate, PL031State),
+VMSTATE_TIMER_V(timer, PL031State, 2),
 VMSTATE_UINT32(mr, PL031State),
 VMSTATE_UINT32(lr, PL031State),
 VMSTATE_UINT32(cr, PL031State),

[Qemu-devel] [PATCH 10/12] piix: do not raise irq while loading vmstate

2014-08-26 Thread Pavel Dovgalyuk

This patch disables raising an irq while loading the state of PCI bridge.

Signed-off-by: Pavel Dovgalyuk 
---
 hw/pci-host/piix.c |   22 --
 1 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index e0e0946..86d6d20 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -409,7 +409,7 @@ static void piix3_set_irq_pic(PIIX3State *piix3, int 
pic_irq)
  (pic_irq * PIIX_NUM_PIRQS;
 }
 
-static void piix3_set_irq_level(PIIX3State *piix3, int pirq, int level)
+static void piix3_set_irq_level_internal(PIIX3State *piix3, int pirq, int 
level)
 {
 int pic_irq;
 uint64_t mask;
@@ -422,6 +422,18 @@ static void piix3_set_irq_level(PIIX3State *piix3, int 
pirq, int level)
 mask = 1ULL << ((pic_irq * PIIX_NUM_PIRQS) + pirq);
 piix3->pic_levels &= ~mask;
 piix3->pic_levels |= mask * !!level;
+}
+
+static void piix3_set_irq_level(PIIX3State *piix3, int pirq, int level)
+{
+int pic_irq;
+
+pic_irq = piix3->dev.config[PIIX_PIRQC + pirq];
+if (pic_irq >= PIIX_NUM_PIC_IRQS) {
+return;
+}
+
+piix3_set_irq_level_internal(piix3, pirq, level);
 
 piix3_set_irq_pic(piix3, pic_irq);
 }
@@ -527,7 +539,13 @@ static void piix3_reset(void *opaque)
 static int piix3_post_load(void *opaque, int version_id)
 {
 PIIX3State *piix3 = opaque;
-piix3_update_irq_levels(piix3);
+int pirq;
+
+piix3->pic_levels = 0;
+for (pirq = 0; pirq < PIIX_NUM_PIRQS; pirq++) {
+piix3_set_irq_level_internal(piix3, pirq,
+pci_bus_get_irq_level(piix3->dev.bus, pirq));
+}
 return 0;
 }

Re: [Qemu-devel] [RFC PATCH v2 07/13] spapr_rtas: Add Dynamic DMA windows (DDW) RTAS calls support

2014-08-26 Thread David Gibson

On Fri, Aug 15, 2014 at 08:12:29PM +1000, Alexey Kardashevskiy wrote:
> This adds support for Dynamic DMA Windows (DDW) option defined by
> the SPAPR specification which allows to have additional DMA window(s)
> which can support page sizes other than 4K.
> 
> The existing implementation of DDW in the guest tries to create one huge
> DMA window with 64K or 16MB pages and map the entire guest RAM to. If it
> succeeds, the guest switches to dma_direct_ops and never calls
> TCE hypercalls (H_PUT_TCE,...) again. This enables VFIO devices to use
> the entire RAM and not waste time on map/unmap.
> 
> This adds 4 RTAS handlers:
> * ibm,query-pe-dma-window
> * ibm,create-pe-dma-window
> * ibm,remove-pe-dma-window
> * ibm,reset-pe-dma-window
> These are registered from type_init() callback.
> 
> These RTAS handlers are implemented in a separate file to avoid polluting
> spapr_iommu.c with PHB.
> 
> Since no PHB class implements new callback in this patch, no functional
> change is expected.

[snip]
> +static void rtas_ibm_create_pe_dma_window(PowerPCCPU *cpu,
> +  sPAPREnvironment *spapr,
> +  uint32_t token, uint32_t nargs,
> +  target_ulong args,
> +  uint32_t nret, target_ulong rets)
> +{
> +sPAPRPHBState *sphb;
> +sPAPRPHBClass *spc;
> +sPAPRTCETable *tcet = NULL;
> +uint32_t addr, page_shift, window_shift, liobn;
> +uint64_t buid;
> +long ret;
> +
> +if ((nargs != 5) || (nret != 4)) {
> +goto param_error_exit;
> +}
> +
> +buid = ((uint64_t)rtas_ld(args, 1) << 32) | rtas_ld(args, 2);
> +addr = rtas_ld(args, 0);
> +sphb = spapr_pci_find_phb(spapr, buid);
> +if (!sphb) {
> +goto param_error_exit;
> +}
> +
> +spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
> +if (!spc->ddw_create) {
> +goto hw_error_exit;
> +}
> +
> +page_shift = rtas_ld(args, 3);
> +window_shift = rtas_ld(args, 4);
> +/* Default 32bit window#0 is always there so +1 */
> +liobn = SPAPR_PCI_LIOBN(sphb->index, sphb->ddw_num + 1);
> +
> +ret = spc->ddw_create(sphb, page_shift, window_shift, liobn, &tcet);
> +trace_spapr_iommu_ddw_create(buid, addr, 1ULL << page_shift,
> + 1ULL << window_shift,
> + tcet ? tcet->bus_offset : 0xbaadf00d,
> + liobn, ret);
> +if (ret || !tcet) {
> +goto hw_error_exit;
> +}
> +
> +sphb->ddw_num++;

You increment ddw_num here...

[snip]
> +static void rtas_ibm_remove_pe_dma_window(PowerPCCPU *cpu,
> +  sPAPREnvironment *spapr,
> +  uint32_t token, uint32_t nargs,
> +  target_ulong args,
> +  uint32_t nret, target_ulong rets)
> +{
> +sPAPRPHBState *sphb;
> +sPAPRPHBClass *spc;
> +sPAPRTCETable *tcet;
> +uint32_t liobn;
> +long ret;
> +
> +if ((nargs != 1) || (nret != 1)) {
> +goto param_error_exit;
> +}
> +
> +liobn = rtas_ld(args, 0);
> +tcet = spapr_tce_find_by_liobn(liobn);
> +if (!tcet) {
> +goto param_error_exit;
> +}
> +
> +sphb = SPAPR_PCI_HOST_BRIDGE(OBJECT(tcet)->parent);
> +if (!sphb) {
> +goto param_error_exit;
> +}
> +
> +spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
> +if (!spc->ddw_remove) {
> +goto hw_error_exit;
> +}
> +
> +ret = spc->ddw_remove(sphb, tcet);
> +trace_spapr_iommu_ddw_remove(liobn, ret);
> +if (ret) {
> +goto hw_error_exit;
> +}
> +
> +rtas_st(rets, 0, RTAS_OUT_SUCCESS);
> +return;

.. but don't decrement it here.  Is that a bug?

[snip]
> +static void rtas_ibm_reset_pe_dma_window(PowerPCCPU *cpu,
> + sPAPREnvironment *spapr,
> + uint32_t token, uint32_t nargs,
> + target_ulong args,
> + uint32_t nret, target_ulong rets)
> +{
> +sPAPRPHBState *sphb;
> +sPAPRPHBClass *spc;
> +uint64_t buid;
> +uint32_t addr;
> +long ret;
> +
> +if ((nargs != 3) || (nret != 1)) {
> +goto param_error_exit;
> +}
> +
> +buid = ((uint64_t)rtas_ld(args, 1) << 32) | rtas_ld(args, 2);
> +addr = rtas_ld(args, 0);
> +sphb = spapr_pci_find_phb(spapr, buid);
> +if (!sphb) {
> +goto param_error_exit;
> +}
> +
> +spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
> +if (!spc->ddw_reset) {
> +goto hw_error_exit;
> +}
> +
> +ret = spc->ddw_reset(sphb);
> +trace_spapr_iommu_ddw_reset(buid, addr, ret);
> +if (ret) {
> +goto hw_error_exit;
> +}

Likewise ddw_num doesn't seem to be reset here.

-- 
David Gibson

Re: [Qemu-devel] [RFC PATCH v2 12/13] vfio: Enable DDW ioctls to VFIO IOMMU driver

2014-08-26 Thread David Gibson

On Fri, Aug 15, 2014 at 08:12:34PM +1000, Alexey Kardashevskiy wrote:
> This enables DDW RTAS-related ioctls in VFIO.
> 
> Signed-off-by: Alexey Kardashevskiy 

This should probably just be folded into the previous patch.  It's
broken without this change.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgp9kWbLbrwTa.pgp
Description: PGP signature

[Qemu-devel] [PATCH 11/12] mc146818rtc: add missed field to vmstate

2014-08-26 Thread Pavel Dovgalyuk

This patch adds irq_reinject_on_ack_count field to VMState to allow correct
saving/loading the state of MC146818 RTC.

Signed-off-by: Pavel Dovgalyuk 
---
 hw/timer/mc146818rtc.c |   32 
 1 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/hw/timer/mc146818rtc.c b/hw/timer/mc146818rtc.c
index 233fc70..6ba360a 100644
--- a/hw/timer/mc146818rtc.c
+++ b/hw/timer/mc146818rtc.c
@@ -698,6 +698,13 @@ int rtc_get_memory(ISADevice *dev, int addr)
 return s->cmos_data[addr];
 }
 
+static int rtc_pre_load(void *opaque)
+{
+RTCState *s = (RTCState *)opaque;
+s->irq_reinject_on_ack_count = 0;
+return 0;
+}
+
 static void rtc_set_date_from_host(ISADevice *dev)
 {
 RTCState *s = MC146818_RTC(dev);
@@ -733,10 +740,27 @@ static int rtc_post_load(void *opaque, int version_id)
 return 0;
 }
 
+static const VMStateDescription vmstate_rtc_irq_reinject_on_ack_count = {
+.name = "irq_reinject_on_ack_count",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_UINT16(irq_reinject_on_ack_count, RTCState),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static bool rtc_irq_reinject_on_ack_count_needed(void *opaque)
+{
+RTCState *s = (RTCState *)opaque;
+return s->irq_reinject_on_ack_count != 0;
+}
+
 static const VMStateDescription vmstate_rtc = {
 .name = "mc146818rtc",
 .version_id = 3,
 .minimum_version_id = 1,
+.pre_load = rtc_pre_load,
 .post_load = rtc_post_load,
 .fields = (VMStateField[]) {
 VMSTATE_BUFFER(cmos_data, RTCState),
@@ -753,6 +777,14 @@ static const VMStateDescription vmstate_rtc = {
 VMSTATE_TIMER_V(update_timer, RTCState, 3),
 VMSTATE_UINT64_V(next_alarm_time, RTCState, 3),
 VMSTATE_END_OF_LIST()
+},
+.subsections = (VMStateSubsection[]) {
+{
+.vmsd = &vmstate_rtc_irq_reinject_on_ack_count,
+.needed = rtc_irq_reinject_on_ack_count_needed,
+}, {
+/* empty */
+}
 }
 };

Re: [Qemu-devel] [RFC PATCH v2 11/13] spapr_pci_vfio: Enable DDW

2014-08-26 Thread David Gibson

On Fri, Aug 15, 2014 at 08:12:33PM +1000, Alexey Kardashevskiy wrote:
> This implements DDW for VFIO. Host kernel support is required for this.
> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
> Changes:
> v2:
> * remove()/reset() callbacks use spapr_pci's ones
> ---
>  hw/ppc/spapr_pci_vfio.c | 86 
> +
>  1 file changed, 86 insertions(+)
> 
> diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
> index 11b4272..79df716 100644
> --- a/hw/ppc/spapr_pci_vfio.c
> +++ b/hw/ppc/spapr_pci_vfio.c
> @@ -71,6 +71,88 @@ static void spapr_phb_vfio_finish_realize(sPAPRPHBState 
> *sphb, Error **errp)
>  spapr_tce_get_iommu(tcet));
>  
>  object_unref(OBJECT(tcet));
> +
> +if (sphb->ddw_enabled) {
> +sphb->ddw_enabled = !!(info.flags & VFIO_IOMMU_SPAPR_TCE_FLAG_DDW);

This overrides an explicit ddw= set by the user, which is a bit
counter-intuitive.

> +}
> +}
> +
> +static int spapr_pci_vfio_ddw_query(sPAPRPHBState *sphb,
> +uint32_t *windows_available,
> +uint32_t *page_size_mask)
> +{
> +sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
> +struct vfio_iommu_spapr_tce_query query = { .argsz = sizeof(query) };
> +int ret;
> +
> +ret = vfio_container_ioctl(&sphb->iommu_as, svphb->iommugroupid,
> +   VFIO_IOMMU_SPAPR_TCE_QUERY, &query);
> +if (ret) {
> +return ret;
> +}
> +
> +*windows_available = query.windows_available;
> +*page_size_mask = query.page_size_mask;
> +
> +return ret;
> +}
> +
> +static int spapr_pci_vfio_ddw_create(sPAPRPHBState *sphb, uint32_t 
> page_shift,
> + uint32_t window_shift, uint32_t liobn,
> + sPAPRTCETable **ptcet)
> +{
> +sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
> +struct vfio_iommu_spapr_tce_create create = {
> +.argsz = sizeof(create),
> +.page_shift = page_shift,
> +.window_shift = window_shift,
> +.start_addr = 0
> +};
> +int ret;
> +
> +ret = vfio_container_ioctl(&sphb->iommu_as, svphb->iommugroupid,
> +   VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
> +if (ret) {
> +return ret;
> +}
> +
> +*ptcet = spapr_tce_new_table(DEVICE(sphb), liobn,
> + create.start_addr, page_shift,
> + 1ULL << (window_shift - page_shift),
> + true);
> +memory_region_add_subregion(&sphb->iommu_root, (*ptcet)->bus_offset,
> +spapr_tce_get_iommu(*ptcet));
> +
> +return ret;
> +}
> +
> +static int spapr_pci_vfio_ddw_remove(sPAPRPHBState *sphb, sPAPRTCETable 
> *tcet)
> +{
> +sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
> +struct vfio_iommu_spapr_tce_remove remove = {
> +.argsz = sizeof(remove),
> +.start_addr = tcet->bus_offset
> +};
> +int ret;
> +
> +spapr_pci_ddw_remove(sphb, tcet);
> +ret = vfio_container_ioctl(&sphb->iommu_as, svphb->iommugroupid,
> +   VFIO_IOMMU_SPAPR_TCE_REMOVE, &remove);
> +
> +return ret;
> +}
> +
> +static int spapr_pci_vfio_ddw_reset(sPAPRPHBState *sphb)
> +{
> +sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
> +struct vfio_iommu_spapr_tce_reset reset = { .argsz = sizeof(reset) };
> +int ret;
> +
> +spapr_pci_ddw_reset(sphb);
> +ret = vfio_container_ioctl(&sphb->iommu_as, svphb->iommugroupid,
> +   VFIO_IOMMU_SPAPR_TCE_RESET, &reset);

Unlike the non-VFIO version, this doesn't appear to reset ddw_num.

Also, there isn't call to explicitly call DDW reset on system reset.
Is that handled in kernel by the overall VFIO reset?

> +return ret;
>  }
>  
>  static void spapr_phb_vfio_class_init(ObjectClass *klass, void *data)
> @@ -80,6 +162,10 @@ static void spapr_phb_vfio_class_init(ObjectClass *klass, 
> void *data)
>  
>  dc->props = spapr_phb_vfio_properties;
>  spc->finish_realize = spapr_phb_vfio_finish_realize;
> +spc->ddw_query = spapr_pci_vfio_ddw_query;
> +spc->ddw_create = spapr_pci_vfio_ddw_create;
> +spc->ddw_remove = spapr_pci_vfio_ddw_remove;
> +spc->ddw_reset = spapr_pci_vfio_ddw_reset;
>  }
>  
>  static const TypeInfo spapr_phb_vfio_info = {

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgpx0zYnAyeXJ.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH 1/5] target-ppc: Extend rtas-blob

2014-08-26 Thread David Gibson

On Tue, Aug 26, 2014 at 12:04:27PM +0530, Aravinda Prasad wrote:
> 
> 
> On Tuesday 26 August 2014 11:08 AM, David Gibson wrote:
> > On Mon, Aug 25, 2014 at 07:15:16PM +0530, Aravinda Prasad wrote:
> >> Extend rtas-blob to accommodate error log. Error log
> >> structure is saved in rtas space upon a machine check
> >> exception.
> > 
> > Hrm.  Putting the reserved space into the actual firmware image file
> > seems really clumsy to me.  Wouldn't it be better to reserve the space
> > directly from qemu, then just paste the 20 bytes of code into that
> > area.
> 
> Hmm. It is possible to it from qemu. In that case we can get rid of
> spapr-rtas.S. Let me think about that.

I think getting rid of spapr-rtas.S is probably a good idea.  It
complicates building things for the sake of 5 instructions.
Especially since in your new code you construct the vector
instructions directly, it makes sense to do the same thing for
spapr-rtas.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgpN2VWB8lSiQ.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH 2/5] target-ppc: Register and handle HCALL to receive updated RTAS region

2014-08-26 Thread David Gibson

On Tue, Aug 26, 2014 at 04:15:40PM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2014-08-26 at 15:39 +1000, David Gibson wrote:
> > On Mon, Aug 25, 2014 at 07:15:26PM +0530, Aravinda Prasad wrote:
> > > Receive updates from SLOF about the updated rtas-base.
> > > A separate patch for SLOF [1] adds functionality to invoke a
> > > a private HCALL whenever OS issues instantiate-rtas with
> > > a new rtas-base.
> > > 
> > > This is required as qemu needs to know the updated rtas-base
> > > as it allocates error reporting structure in RTAS space upon
> > > a machine check exception.
> > 
> > This also seems really awkward.  Specifically it seems like a rather
> > arbitrary and complex division of what qemu's responsible for and what
> > SLOF is responsible for.
> > 
> > Instead I'd suggest that we add an H_INSTANTIATE_RTAS hcall, and we
> > move the loading of the spapr-rtas blob from normal reset to when that
> > hcall is invoked.
> 
> Beware that SLOF needs to call RTAS for its own reasons...
> 
> So it would be instanciated twice.

I don't see that that would cause a big problem, or am I missing
something?

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgpfjrLSpSvr6.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH v6 0/3] linux-aio: introduce submit I/O asa batch

2014-08-26 Thread Zhang Haoyu

>> Hi,
>> 
>> The commit 580b6b2aa2(dataplane: use the QEMU block layer for I/O)
>> introduces ~40% throughput regression on virtio-blk dataplane, and
>> one of causes is that submitting I/O as a batch is removed.
>> 
>> This patchset trys to introduce this mechanism on block, at least,
>> linux-aio can benefit from that.
>> 
>> With these patches, it is observed that thoughout on virtio-blk
>> dataplane can be improved a lot, see data in commit log of patch
>> 3/3.
>> 
>> It should be possible to apply the batch mechanism to other devices
>> (such as virtio-scsi) too.
>> 
>> TODO:
>>  - support queuing I/O to multi files for scsi devies, which
>> need some changes to linux-aio
>> 
>> V6:
>>  - fix requests leak if part of them arn't submitted successfully,
>>  pointed by Stefan
>>  - linux-aio.c coding style fix
>> 
>> V5:
>>  - rebase on v2.1.0-rc0 of qemu.git/master
>>  - block/linux-aio.c code style fix
>>  - don't flush io queue before flush, pointed by Paolo
>> 
>> V4:
>>  - support other non-raw formats with under-optimized performance
>>  - use reference counter for plug & unplug
>>  - flush io queue before sending flush command
>> 
>> V3:
>>  - only support submitting I/O as a batch for raw format, pointed by
>> Kevin
>> 
>> V2:
>>  - define return value of bdrv_io_unplug as void, suggested by Paolo
>>  - avoid busy-wait for handling io_submit
>> V1:
>>  - move queuing io stuff into linux-aio.c as suggested by Paolo
>
>Thanks, applied to my block tree:
>https://github.com/stefanha/qemu/commits/block
>
Can we use the queued io data as caches, 
io write will directly return and tell the guest the io is completed after the 
io is enqueued, 
better user experience for burst io,
and io-read will firstly search the io queue, if matched data found, directly 
get the data from the queue, 
if not, then read the data from the disk or host page cache.
Any ideas?

Thanks,
Zhang Haoyu

>In Patch 2 we should complete requests with -EIO if io_submit() returned
>0 <= ret < len.  I fixed this up when applying because the patch was
>completing with a bogus ret value.
>
>Stefan

Re: [Qemu-devel] [PATCH v6 0/3] linux-aio: introduce submit I/O asa batch

2014-08-26 Thread Fam Zheng

On Tue, 08/26 15:31, Zhang Haoyu wrote:
> Can we use the queued io data as caches, 
> io write will directly return and tell the guest the io is completed after 
> the io is enqueued, 
> better user experience for burst io,
> and io-read will firstly search the io queue, if matched data found, directly 
> get the data from the queue, 
> if not, then read the data from the disk or host page cache.
> Any ideas?

Guest kernel already has a page cache that exactly does this, also keeping a
copy of guest request data in qemu may hurt the bandwidth in some cases.

Fam

Re: [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node

2014-08-26 Thread Alexey Kardashevskiy

On 08/19/2014 10:21 AM, Michael Roth wrote:
> From: Nathan Fontenot 
> 
> This add entries to the root OF node to advertise our PHBs as being
> DR-capable in according with PAPR specification.
> 
> Each PHB is given a name of PHB, advertised as a PHB type,
> and associated with a power domain of -1 (indicating to guests that
> power management is handled automatically by hardware).
> 
> We currently allocate entries for up to 32 DR-capable PHBs, though
> this limit can be increased later.
> 
> DrcEntry objects to track the state of the DR-connector associated
> with each PHB are stored in a 32-entry array, and each DrcEntry has
> in turn have a dynamically-sized number of child DR-connectors,
> which we will use later to track the state of DR-connectors
> associated with a PHB's physical slots.
> 
> Signed-off-by: Nathan Fontenot 
> Signed-off-by: Michael Roth 
> ---
>  hw/ppc/spapr.c | 143 
> +
>  hw/ppc/spapr_pci.c |   1 +
>  include/hw/ppc/spapr.h |  35 
>  3 files changed, 179 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 5c92707..d5e46c3 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -296,6 +296,143 @@ static hwaddr spapr_node0_size(void)
>  return ram_size;
>  }
>  
> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid)
> +{
> +int i;
> +
> +for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +if (spapr->drc_table[i].phb_buid == buid) {
> +return &spapr->drc_table[i];
> +}
> + }
> +
> + return NULL;
> +}
> +
> +static void spapr_init_drc_table(void)
> +{
> +int i;
> +
> +memset(spapr->drc_table, 0, sizeof(spapr->drc_table));
> +
> +/* For now we only care about PHB entries */
> +for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +spapr->drc_table[i].drc_index = 0x201 + i;
> +}
> +}
> +
> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state)
> +{
> +sPAPRDrcEntry *empty_drc = NULL;
> +sPAPRDrcEntry *found_drc = NULL;
> +int i, phb_index;
> +
> +for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +if (spapr->drc_table[i].phb_buid == 0) {
> +empty_drc = &spapr->drc_table[i];
> +}
> +
> +if (spapr->drc_table[i].phb_buid == buid) {
> +found_drc = &spapr->drc_table[i];

return &spapr->drc_table[i];
?


> +break;
> +}
> +}
> +
> +if (found_drc) {
> +return found_drc;
> +}

   if (!empty_drc) {
return NULL;
   }

?


> +
> +if (empty_drc) {

and no need in this :)


> +empty_drc->phb_buid = buid;
> +empty_drc->state = state;
> +empty_drc->cc_state.fdt = NULL;
> +empty_drc->cc_state.offset = 0;
> +empty_drc->cc_state.depth = 0;
> +empty_drc->cc_state.state = CC_STATE_IDLE;
> +empty_drc->child_entries =
> +g_malloc0(sizeof(sPAPRDrcEntry) * SPAPR_DRC_PHB_SLOT_MAX);
> +phb_index = buid - SPAPR_PCI_BASE_BUID;
> +for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +empty_drc->child_entries[i].drc_index =
> +SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + (i << 3);
> +}
> +return empty_drc;
> +}
> +
> +return NULL;
> +}
> +
> +static void spapr_create_drc_dt_entries(void *fdt)
> +{
> +char char_buf[1024];
> +uint32_t int_buf[SPAPR_DRC_TABLE_SIZE + 1];
> +uint32_t *entries;
> +int offset, fdt_offset;
> +int i, ret;
> +
> +fdt_offset = fdt_path_offset(fdt, "/");
> +
> +/* ibm,drc-indexes */
> +memset(int_buf, 0, sizeof(int_buf));
> +int_buf[0] = SPAPR_DRC_TABLE_SIZE;
> +
> +for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
> +int_buf[i] = spapr->drc_table[i-1].drc_index;
> +}
> +
> +ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-indexes", int_buf,
> +  sizeof(int_buf));
> +if (ret) {
> +fprintf(stderr, "Couldn't finalize ibm,drc-indexes property\n");

return here and below in the same error cases?

> +}
> +
> +/* ibm,drc-power-domains */
> +memset(int_buf, 0, sizeof(int_buf));
> +int_buf[0] = SPAPR_DRC_TABLE_SIZE;
> +
> +for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
> +int_buf[i] = 0x;
> +}
> +
> +ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-power-domains", int_buf,
> +  sizeof(int_buf));
> +if (ret) {
> +fprintf(stderr, "Couldn't finalize ibm,drc-power-domains 
> property\n");
> +}
> +
> +/* ibm,drc-names */
> +memset(char_buf, 0, sizeof(char_buf));
> +entries = (uint32_t *)&char_buf[0];
> +*entries = SPAPR_DRC_TABLE_SIZE;
> +offset = sizeof(*entries);
> +
> +for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +offset += sprintf(char_buf + offset, "PHB %d", i + 1);
> +char_buf[offset++] = '\0';
> +}
> +
> +ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-names", char_buf, offset);
> +if

Re: [Qemu-devel] [PATCH v6 0/3] linux-aio: introduce submit I/O as a batch

2014-08-26 Thread Zhang Haoyu

>>> Hi,
>>> 
>>> The commit 580b6b2aa2(dataplane: use the QEMU block layer for I/O)
>>> introduces ~40% throughput regression on virtio-blk dataplane, and
>>> one of causes is that submitting I/O as a batch is removed.
>>> 
>>> This patchset trys to introduce this mechanism on block, at least,
>>> linux-aio can benefit from that.
>>> 
>>> With these patches, it is observed that thoughout on virtio-blk
>>> dataplane can be improved a lot, see data in commit log of patch
>>> 3/3.
>>> 
>>> It should be possible to apply the batch mechanism to other devices
>>> (such as virtio-scsi) too.
>>> 
>>> TODO:
>>> - support queuing I/O to multi files for scsi devies, which
>>> need some changes to linux-aio
>>> 
>>> V6:
>>> - fix requests leak if part of them arn't submitted successfully,
>>> pointed by Stefan
>>> - linux-aio.c coding style fix
>>> 
>>> V5:
>>> - rebase on v2.1.0-rc0 of qemu.git/master
>>> - block/linux-aio.c code style fix
>>> - don't flush io queue before flush, pointed by Paolo
>>> 
>>> V4:
>>> - support other non-raw formats with under-optimized performance
>>> - use reference counter for plug & unplug
>>> - flush io queue before sending flush command
>>> 
>>> V3:
>>> - only support submitting I/O as a batch for raw format, pointed by
>>> Kevin
>>> 
>>> V2:
>>> - define return value of bdrv_io_unplug as void, suggested by Paolo
>>> - avoid busy-wait for handling io_submit
>>> V1:
>>> - move queuing io stuff into linux-aio.c as suggested by Paolo
>>
>>Thanks, applied to my block tree:
>>https://github.com/stefanha/qemu/commits/block
>>
>Can we use the queued io data as caches, 
>io write will directly return and tell the guest the io is completed after the 
>io is enqueued, 
>better user experience for burst io,
>and io-read will firstly search the io queue, if matched data found, directly 
>get the data from the queue, 
>if not, then read the data from the disk or host page cache.

Because host page cache cannot assure the order of io-write, leading to VM 
image data corruption if suddenly poweroff happened,
but we can assure the order of io-write in queue, and use 
cache=none/writethrough/directsync to submit the batched IO to disk directly, 
bypass host page cache.
IO merging also can be performed in the queue.

>Any ideas?
>
>Thanks,
>Zhang Haoyu
>
>>In Patch 2 we should complete requests with -EIO if io_submit() returned
>>0 <= ret < len.  I fixed this up when applying because the patch was
>>completing with a bogus ret value.
>>
>>Stefan

Re: [Qemu-devel] [RFC PATCH v3 15/49] softmmu: fixing usage of cpu_st/ld* from helpers

2014-08-26 Thread Pavel Dovgaluk

> From: Alex Bennée [mailto:alex.ben...@linaro.org]
> Pavel Dovgalyuk writes:
> 
> > MMU helper functions are called from generated code and other helper
> > functions. In both cases they try to get function's return address for
> > using it while restoring virtual CPU state.
> >
> > When MMU helper is called from some other helper function
> > (like helper_maskmov_xmm) through cpu_st* function, the return address
> > will point to that helper. That is why CPU state cannot be restored in
> > the case of MMU fault.
> >
> > This patch introduces several inline helpers to load return address
> > which points to the right place.
> >
> 
> 
> OK I find it fairly hard to follow all the glue magic (not your fault
> ;-) we have in QEMU. However wouldn't it be simpler for the helper
> pre-amble code to ensure the subject pc is updated in the CPU
> environment?

Then I'll need to rewrite all helper calls or change their structure
by adding code which restores the PC.

> Can QEMU only rectify the processor state from a TranlationBlock tc address?

Current guest PC is not known during execution of the TB. When memory access
exception occurs, helpers have to evaluate guest PC using the host one.
Host PC should point to the translated block and this patch eliminates reading
wrong host PC value during such recovery.

Pavel Dovgalyuk

[Qemu-devel] [PATCH V4] net: Forbid dealing with packets when VM is not running

2014-08-26 Thread zhanghailiang

For all NICs(except virtio-net) emulated by qemu,
Such as e1000, rtl8139, pcnet and ne2k_pci,
Qemu can still receive packets when VM is not running.

If this happened in *migration's* last PAUSE VM stage, but
before the end of the migration, the new receiving packets will possibly dirty
parts of RAM which has been cached in *iovec*(will be sent asynchronously) and
dirty parts of new RAM which will be missed.
This will lead serious network fault in VM.

To avoid this, we forbid receiving packets in generic net code when
VM is not running.

Bug reproduction steps:
(1) Start a VM which configured at least one NIC
(2) In VM, open several Terminal and do *Ping IP -i 0.1*
(3) Migrate the VM repeatedly between two Hosts
And the *PING* command in VM will very likely fail with message:
'Destination HOST Unreachable', the NIC in VM will stay unavailable unless you
run 'service network restart'

Signed-off-by: zhanghailiang 
---
 v4:
- The action of flush queued packets is unnecessary, remove this.
  (Thanks for the help of Jason Wang and Stefan). 

 v3:
- change the 'vmstate' to 'vm_running'

 v2:
- remove the superfluous check of nc->received_disabled 
---

 net/net.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/net/net.c b/net/net.c
index 6d930ea..4cb92c0 100644
--- a/net/net.c
+++ b/net/net.c
@@ -41,6 +41,7 @@
 #include "qapi-visit.h"
 #include "qapi/opts-visitor.h"
 #include "qapi/dealloc-visitor.h"
+#include "sysemu/sysemu.h"
 
 /* Net bridge is currently not supported for W32. */
 #if !defined(_WIN32)
@@ -452,6 +453,12 @@ void qemu_set_vnet_hdr_len(NetClientState *nc, int len)
 
 int qemu_can_send_packet(NetClientState *sender)
 {
+int vm_running = runstate_is_running();
+
+if (!vm_running) {
+return 0;
+}
+
 if (!sender->peer) {
 return 1;
 }
-- 
1.7.12.4

Re: [Qemu-devel] [RFC PATCH v2 08/13] spapr_pci: Enable DDW

2014-08-26 Thread Alexey Kardashevskiy

On 08/26/2014 05:14 PM, David Gibson wrote:
> On Fri, Aug 15, 2014 at 08:12:30PM +1000, Alexey Kardashevskiy wrote:
>> This implements DDW for emulated PHB.
>>
>> This advertises DDW in device tree.
>>
>> Since QEMU does not implement any 64bit DMA capable device, this hack
>> has been used to enable 64bit DMA on E1000:
>>
>> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
>> index 0fc29a0..131f80a 100644
>> --- a/hw/net/e1000.c
>> +++ b/hw/net/e1000.c
>> @@ -240,6 +240,7 @@ static const uint32_t mac_reg_init[] = {
>>  [STATUS] =  0x8000 | E1000_STATUS_GIO_MASTER_ENABLE |
>>  E1000_STATUS_ASDV | E1000_STATUS_MTXCKOK |
>>  E1000_STATUS_SPEED_1000 | E1000_STATUS_FD |
>> +E1000_STATUS_PCIX_MODE |
>>  E1000_STATUS_LU,
>>  [MANC] =E1000_MANC_EN_MNG2HOST | E1000_MANC_RCV_TCO_EN |
>>  E1000_MANC_ARP_EN | E1000_MANC_0298_EN |
> 
> Are you planning to send a patch to enable 64-bit DMA in the e1000
> device properly?

Nope. The e1000 family has pci and pcix devices and so far QEMU emulated
pci one with its own vendor/device ids. I would either have to change
device id to pcix version and enable that pcix bit or add new e1000-pcix
device and I do not see any real demand on this as we have virtio-pci which
is way cooler and faster and everything :)


> 
>>
>> Signed-off-by: Alexey Kardashevskiy 
>> ---
>> Changes:
>> v2:
>> * tested on hacked emulated E1000
>> * implemented DDW reset on the PHB reset
>> * spapr_pci_ddw_remove/spapr_pci_ddw_reset are public for reuse by VFIO
>> ---
>>  hw/ppc/spapr_pci.c  | 96 
>> +
>>  include/hw/pci-host/spapr.h |  7 
>>  2 files changed, 103 insertions(+)
>>
>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>> index 9b03d0d..cba8d9d 100644
>> --- a/hw/ppc/spapr_pci.c
>> +++ b/hw/ppc/spapr_pci.c
>> @@ -22,6 +22,7 @@
>>   * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
>>   * THE SOFTWARE.
>>   */
>> +#include "sysemu/sysemu.h"
>>  #include "hw/hw.h"
>>  #include "hw/pci/pci.h"
>>  #include "hw/pci/msi.h"
>> @@ -498,6 +499,70 @@ void spapr_pci_msi_init(sPAPREnvironment *spapr, hwaddr 
>> addr)
>>  }
>>  
>>  /*
>> + * Dynamic DMA windows
>> + */
>> +static int spapr_pci_ddw_query(sPAPRPHBState *sphb,
>> +   uint32_t *windows_available,
>> +   uint32_t *page_size_mask)
>> +{
>> +*windows_available = 1;
>> +*page_size_mask = DDW_PGSIZE_16M;
>> +
>> +return 0;
>> +}
>> +
>> +static int spapr_pci_ddw_create(sPAPRPHBState *sphb, uint32_t page_shift,
>> +uint32_t window_shift, uint32_t liobn,
>> +sPAPRTCETable **ptcet)
>> +{
>> +*ptcet = spapr_tce_new_table(DEVICE(sphb), liobn,
>> + SPAPR_PCI_TCE64_START, page_shift,
>> + 1ULL << (window_shift - page_shift),
>> + true);
> 
> Does anything actually validate that the specified page_shift is one
> of the permitted/advertised ones?


Nope. Will fix.



>> +if (!*ptcet) {
>> +return -1;
>> +}
>> +memory_region_add_subregion(&sphb->iommu_root, (*ptcet)->bus_offset,
>> +spapr_tce_get_iommu(*ptcet));
>> +
>> +return 0;
>> +}
>> +
>> +int spapr_pci_ddw_remove(sPAPRPHBState *sphb, sPAPRTCETable *tcet)
>> +{
>> +memory_region_del_subregion(&sphb->iommu_root,
>> +spapr_tce_get_iommu(tcet));
>> +spapr_tce_free_table(tcet);
> 
> Ok, relating to my comment in the previous patch, ddw_num doesn't seem
> to be decremented here either.


@ddw_num is an id, it just have to be unique. If I decrement it, then I'll
have to track what numbers are in use.


>> +
>> +return 0;
>> +}
>> +
>> +static int spapr_pci_remove_ddw_cb(Object *child, void *opaque)
>> +{
>> +sPAPRTCETable *tcet;
>> +
>> +tcet = (sPAPRTCETable *) object_dynamic_cast(child, 
>> TYPE_SPAPR_TCE_TABLE);
>> +
>> +/* Delete all dynamic windows, i.e. every except the default one with 
>> #0 */
>> +if (tcet && SPAPR_PCI_DMA_WINDOW_NUM(tcet->liobn)) {
>> +sPAPRPHBState *sphb = opaque;
>> +sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
>> +
>> +spc->ddw_remove(sphb, tcet);
>> +}
>> +
>> +return 0;
>> +}
>> +
>> +int spapr_pci_ddw_reset(sPAPRPHBState *sphb)
>> +{
>> +object_child_foreach(OBJECT(sphb), spapr_pci_remove_ddw_cb, sphb);
>> +sphb->ddw_num = 0;
> 
> So, you do reset ddw_num here, but since it is incremented in the
> generic RTAS code, this smells like a layering violation.

Yeah, good idea to keep in one file at least. Will fix.



>> +
>> +return 0;
>> +}
>> +
>> +/*
>>   * PHB PCI device
>>   */
>>  static AddressSpace *spapr_pci_dma_iommu(PCIBus *bus, void *opaque, int 
>> devfn)
>> @@ -671,6 +736,12 @@ static int

Re: [Qemu-devel] [RFC PATCH v2 11/13] spapr_pci_vfio: Enable DDW

2014-08-26 Thread Alexey Kardashevskiy

On 08/26/2014 05:19 PM, David Gibson wrote:
> On Fri, Aug 15, 2014 at 08:12:33PM +1000, Alexey Kardashevskiy wrote:
>> This implements DDW for VFIO. Host kernel support is required for this.
>>
>> Signed-off-by: Alexey Kardashevskiy 
>> ---
>> Changes:
>> v2:
>> * remove()/reset() callbacks use spapr_pci's ones
>> ---
>>  hw/ppc/spapr_pci_vfio.c | 86 
>> +
>>  1 file changed, 86 insertions(+)
>>
>> diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
>> index 11b4272..79df716 100644
>> --- a/hw/ppc/spapr_pci_vfio.c
>> +++ b/hw/ppc/spapr_pci_vfio.c
>> @@ -71,6 +71,88 @@ static void spapr_phb_vfio_finish_realize(sPAPRPHBState 
>> *sphb, Error **errp)
>>  spapr_tce_get_iommu(tcet));
>>  
>>  object_unref(OBJECT(tcet));
>> +
>> +if (sphb->ddw_enabled) {
>> +sphb->ddw_enabled = !!(info.flags & VFIO_IOMMU_SPAPR_TCE_FLAG_DDW);
> 
> This overrides an explicit ddw= set by the user, which is a bit
> counter-intuitive.


For the user it is rather "try ddw when available" than "do ddw". This was
suggested by Alex Graf or I misunderstood his suggestion :)



> 
>> +}
>> +}
>> +
>> +static int spapr_pci_vfio_ddw_query(sPAPRPHBState *sphb,
>> +uint32_t *windows_available,
>> +uint32_t *page_size_mask)
>> +{
>> +sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
>> +struct vfio_iommu_spapr_tce_query query = { .argsz = sizeof(query) };
>> +int ret;
>> +
>> +ret = vfio_container_ioctl(&sphb->iommu_as, svphb->iommugroupid,
>> +   VFIO_IOMMU_SPAPR_TCE_QUERY, &query);
>> +if (ret) {
>> +return ret;
>> +}
>> +
>> +*windows_available = query.windows_available;
>> +*page_size_mask = query.page_size_mask;
>> +
>> +return ret;
>> +}
>> +
>> +static int spapr_pci_vfio_ddw_create(sPAPRPHBState *sphb, uint32_t 
>> page_shift,
>> + uint32_t window_shift, uint32_t liobn,
>> + sPAPRTCETable **ptcet)
>> +{
>> +sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
>> +struct vfio_iommu_spapr_tce_create create = {
>> +.argsz = sizeof(create),
>> +.page_shift = page_shift,
>> +.window_shift = window_shift,
>> +.start_addr = 0
>> +};
>> +int ret;
>> +
>> +ret = vfio_container_ioctl(&sphb->iommu_as, svphb->iommugroupid,
>> +   VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
>> +if (ret) {
>> +return ret;
>> +}
>> +
>> +*ptcet = spapr_tce_new_table(DEVICE(sphb), liobn,
>> + create.start_addr, page_shift,
>> + 1ULL << (window_shift - page_shift),
>> + true);
>> +memory_region_add_subregion(&sphb->iommu_root, (*ptcet)->bus_offset,
>> +spapr_tce_get_iommu(*ptcet));
>> +
>> +return ret;
>> +}
>> +
>> +static int spapr_pci_vfio_ddw_remove(sPAPRPHBState *sphb, sPAPRTCETable 
>> *tcet)
>> +{
>> +sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
>> +struct vfio_iommu_spapr_tce_remove remove = {
>> +.argsz = sizeof(remove),
>> +.start_addr = tcet->bus_offset
>> +};
>> +int ret;
>> +
>> +spapr_pci_ddw_remove(sphb, tcet);
>> +ret = vfio_container_ioctl(&sphb->iommu_as, svphb->iommugroupid,
>> +   VFIO_IOMMU_SPAPR_TCE_REMOVE, &remove);
>> +
>> +return ret;
>> +}
>> +
>> +static int spapr_pci_vfio_ddw_reset(sPAPRPHBState *sphb)
>> +{
>> +sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
>> +struct vfio_iommu_spapr_tce_reset reset = { .argsz = sizeof(reset) };
>> +int ret;
>> +
>> +spapr_pci_ddw_reset(sphb);
>> +ret = vfio_container_ioctl(&sphb->iommu_as, svphb->iommugroupid,
>> +   VFIO_IOMMU_SPAPR_TCE_RESET, &reset);
> 
> Unlike the non-VFIO version, this doesn't appear to reset ddw_num.
> 
> Also, there isn't call to explicitly call DDW reset on system reset.
> Is that handled in kernel by the overall VFIO reset?


"[RFC PATCH v2 09/13] spapr_pci_vfio: Call spapr_pci::reset on reset"
effectively enables spapr_phb_reset() as a reset handler for VFIO PHB and
that function does spc->ddw_reset(). This was the real trigger-reason for
the 09/13 patch.


>> +return ret;
>>  }
>>  
>>  static void spapr_phb_vfio_class_init(ObjectClass *klass, void *data)
>> @@ -80,6 +162,10 @@ static void spapr_phb_vfio_class_init(ObjectClass 
>> *klass, void *data)
>>  
>>  dc->props = spapr_phb_vfio_properties;
>>  spc->finish_realize = spapr_phb_vfio_finish_realize;
>> +spc->ddw_query = spapr_pci_vfio_ddw_query;
>> +spc->ddw_create = spapr_pci_vfio_ddw_create;
>> +spc->ddw_remove = spapr_pci_vfio_ddw_remove;
>> +spc->ddw_reset = spapr_pci_vfio_ddw_res

Re: [Qemu-devel] [RFC PATCH v2 12/13] vfio: Enable DDW ioctls to VFIO IOMMU driver

2014-08-26 Thread Alexey Kardashevskiy

On 08/26/2014 05:20 PM, David Gibson wrote:
> On Fri, Aug 15, 2014 at 08:12:34PM +1000, Alexey Kardashevskiy wrote:
>> This enables DDW RTAS-related ioctls in VFIO.
>>
>> Signed-off-by: Alexey Kardashevskiy 
> 
> This should probably just be folded into the previous patch.  It's
> broken without this change.

It won't work but it is not broken - guest will fail to create DDW and
continue using the default windows.

And since the series needs attention of 2 maintainers (A. Williamson and A.
Graf), it is better to draw bold line between areas :)

-- 
Alexey

Re: [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node

2014-08-26 Thread Alexey Kardashevskiy

On 08/26/2014 05:55 PM, Alexey Kardashevskiy wrote:
> On 08/19/2014 10:21 AM, Michael Roth wrote:
>> From: Nathan Fontenot 
>>
>> This add entries to the root OF node to advertise our PHBs as being
>> DR-capable in according with PAPR specification.
>>
>> Each PHB is given a name of PHB, advertised as a PHB type,
>> and associated with a power domain of -1 (indicating to guests that
>> power management is handled automatically by hardware).
>>
>> We currently allocate entries for up to 32 DR-capable PHBs, though
>> this limit can be increased later.
>>
>> DrcEntry objects to track the state of the DR-connector associated
>> with each PHB are stored in a 32-entry array, and each DrcEntry has
>> in turn have a dynamically-sized number of child DR-connectors,
>> which we will use later to track the state of DR-connectors
>> associated with a PHB's physical slots.
>>
>> Signed-off-by: Nathan Fontenot 
>> Signed-off-by: Michael Roth 
>> ---
>>  hw/ppc/spapr.c | 143 
>> +
>>  hw/ppc/spapr_pci.c |   1 +
>>  include/hw/ppc/spapr.h |  35 
>>  3 files changed, 179 insertions(+)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 5c92707..d5e46c3 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -296,6 +296,143 @@ static hwaddr spapr_node0_size(void)
>>  return ram_size;
>>  }
>>  
>> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid)
>> +{
>> +int i;
>> +
>> +for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>> +if (spapr->drc_table[i].phb_buid == buid) {
>> +return &spapr->drc_table[i];
>> +}
>> + }
>> +
>> + return NULL;
>> +}
>> +
>> +static void spapr_init_drc_table(void)
>> +{
>> +int i;
>> +
>> +memset(spapr->drc_table, 0, sizeof(spapr->drc_table));
>> +
>> +/* For now we only care about PHB entries */
>> +for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>> +spapr->drc_table[i].drc_index = 0x201 + i;
>> +}
>> +}
>> +
>> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state)
>> +{
>> +sPAPRDrcEntry *empty_drc = NULL;
>> +sPAPRDrcEntry *found_drc = NULL;
>> +int i, phb_index;
>> +
>> +for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
>> +if (spapr->drc_table[i].phb_buid == 0) {
>> +empty_drc = &spapr->drc_table[i];
>> +}
>> +
>> +if (spapr->drc_table[i].phb_buid == buid) {
>> +found_drc = &spapr->drc_table[i];
> 
> return &spapr->drc_table[i];
> ?
> 
> 
>> +break;
>> +}
>> +}
>> +
>> +if (found_drc) {
>> +return found_drc;
>> +}
> 
>if (!empty_drc) {
> return NULL;
>}
> 
> ?
> 
> 
>> +
>> +if (empty_drc) {
> 
> and no need in this :)
> 
> 
>> +empty_drc->phb_buid = buid;
>> +empty_drc->state = state;
>> +empty_drc->cc_state.fdt = NULL;
>> +empty_drc->cc_state.offset = 0;
>> +empty_drc->cc_state.depth = 0;
>> +empty_drc->cc_state.state = CC_STATE_IDLE;
>> +empty_drc->child_entries =
>> +g_malloc0(sizeof(sPAPRDrcEntry) * SPAPR_DRC_PHB_SLOT_MAX);
>> +phb_index = buid - SPAPR_PCI_BASE_BUID;
>> +for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
>> +empty_drc->child_entries[i].drc_index =
>> +SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + (i << 3);
>> +}
>> +return empty_drc;
>> +}
>> +
>> +return NULL;
>> +}
>> +
>> +static void spapr_create_drc_dt_entries(void *fdt)
>> +{
>> +char char_buf[1024];
>> +uint32_t int_buf[SPAPR_DRC_TABLE_SIZE + 1];
>> +uint32_t *entries;
>> +int offset, fdt_offset;
>> +int i, ret;
>> +
>> +fdt_offset = fdt_path_offset(fdt, "/");
>> +
>> +/* ibm,drc-indexes */
>> +memset(int_buf, 0, sizeof(int_buf));
>> +int_buf[0] = SPAPR_DRC_TABLE_SIZE;
>> +
>> +for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
>> +int_buf[i] = spapr->drc_table[i-1].drc_index;
>> +}
>> +
>> +ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-indexes", int_buf,
>> +  sizeof(int_buf));
>> +if (ret) {
>> +fprintf(stderr, "Couldn't finalize ibm,drc-indexes property\n");
> 
> return here and below in the same error cases?
> 
>> +}
>> +
>> +/* ibm,drc-power-domains */
>> +memset(int_buf, 0, sizeof(int_buf));
>> +int_buf[0] = SPAPR_DRC_TABLE_SIZE;
>> +
>> +for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
>> +int_buf[i] = 0x;
>> +}
>> +
>> +ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-power-domains", int_buf,
>> +  sizeof(int_buf));
>> +if (ret) {
>> +fprintf(stderr, "Couldn't finalize ibm,drc-power-domains 
>> property\n");
>> +}
>> +
>> +/* ibm,drc-names */
>> +memset(char_buf, 0, sizeof(char_buf));
>> +entries = (uint32_t *)&char_buf[0];
>> +*entries = SPAPR_DRC_TABLE_SIZE;
>> +offset = sizeof(*entries);
>> +
>> +for (i = 0; i

Re: [Qemu-devel] [PATCH 02/12] spapr_pci: populate DRC dt entries for PHBs

2014-08-26 Thread Alexey Kardashevskiy

On 08/19/2014 10:21 AM, Michael Roth wrote:
> Reserve 32 entries of type PCI in each PHB's initial FDT. This
> advertises to guests that each PHB is DR-capable device with
> physical hotpluggable slots. This is necessary for allowing
> hotplugging of devices to it later via bus rescan or guest rpaphp
> hotplug module.
> 
> Each entry is assigned a name of "Slot <*32 +1>",
> advertised as a hotpluggable PCI slot, and assigned to power domain
> -1 to indicate to the guest that power management is handled by the
> hardware.
> 
> This models a DR-capable PCI expansion device attached to a host/lpar
> via a single PHB with 32 physical hotpluggable slots (as opposed to a
> virtual bridge device with external management console). Hotplug will
> be handled by the guest via bus rescan or the rpaphp hotplug module.
> 
> Signed-off-by: Michael Roth 
> ---
>  hw/ppc/spapr.c  |   3 +-
>  hw/ppc/spapr_pci.c  | 102 
> 
>  include/hw/pci-host/spapr.h |   1 +
>  3 files changed, 105 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index d5e46c3..90b25b3 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -890,7 +890,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>  QLIST_FOREACH(phb, &spapr->phbs, list) {
>  drc_entry = spapr_phb_to_drc_entry(phb->buid);
>  g_assert(drc_entry);
> -ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
> +ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, drc_entry->drc_index,
> +fdt);
>  }
>  
>  if (ret < 0) {
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index e85134f..924d488 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -851,8 +851,104 @@ static int spapr_phb_children_dt(Object *child, void 
> *opaque)
>  return 1;
>  }
>  
> +static void spapr_create_drc_phb_dt_entries(void *fdt, int bus_off, int 
> phb_index)
> +{
> +char char_buf[1024];
> +uint32_t int_buf[SPAPR_DRC_PHB_SLOT_MAX + 1];
> +uint32_t *entries;
> +int i, ret, offset;
> +
> +/* ibm,drc-indexes */
> +memset(int_buf, 0 , sizeof(int_buf));
> +int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> +
> +for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +int_buf[i] = SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + ((i - 1) << 
> 3);
> +}
> +
> +ret = fdt_setprop(fdt, bus_off, "ibm,drc-indexes", int_buf,
> +  sizeof(int_buf));
> +if (ret) {
> +fprintf(stderr, "error adding 'ibm,drc-indexes' field for PHB FDT");
> +}
> +
> +/* ibm,drc-power-domains */
> +memset(int_buf, 0, sizeof(int_buf));
> +int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> +
> +for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +int_buf[i] = 0x;
> +}
> +
> +ret = fdt_setprop(fdt, bus_off, "ibm,drc-power-domains", int_buf,
> +  sizeof(int_buf));
> +if (ret) {
> +fprintf(stderr,
> +"error adding 'ibm,drc-power-domains' field for PHB FDT");

As before - return here and below.

> +}
> +
> +/* ibm,drc-names */
> +memset(char_buf, 0, sizeof(char_buf));
> +entries = (uint32_t *)&char_buf[0];
> +*entries = SPAPR_DRC_PHB_SLOT_MAX;
> +offset = sizeof(*entries);
> +
> +for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +offset += sprintf(char_buf + offset, "Slot %d",
> +  (phb_index * SPAPR_DRC_PHB_SLOT_MAX) + i - 1);

Mmmm. From 1 to <=MAX and (i-1) inside the loop when it could be
traditional  0 to  +char_buf[offset++] = '\0';


sprintf() puts zero there itself, no? And as we are here, should not it be
snprintf()?


> +}
> +
> +ret = fdt_setprop(fdt, bus_off, "ibm,drc-names", char_buf, offset);
> +if (ret) {
> +fprintf(stderr, "error adding 'ibm,drc-names' field for PHB FDT");
> +}
> +
> +/* ibm,drc-types */
> +memset(char_buf, 0, sizeof(char_buf));
> +entries = (uint32_t *)&char_buf[0];
> +*entries = SPAPR_DRC_PHB_SLOT_MAX;
> +offset = sizeof(*entries);
> +
> +for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +offset += sprintf(char_buf + offset, "28");
> +char_buf[offset++] = '\0';
> +}
> +
> +ret = fdt_setprop(fdt, bus_off, "ibm,drc-types", char_buf, offset);
> +if (ret) {
> +fprintf(stderr, "error adding 'ibm,drc-types' field for PHB FDT");
> +}
> +
> +/* we want the initial indicator state to be 0 - "empty", when we
> + * hot-plug an adaptor in the slot, we need to set the indicator
> + * to 1 - "present."
> + */
> +
> +/* ibm,indicator-9003 */
> +memset(int_buf, 0, sizeof(int_buf));
> +int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> +
> +ret = fdt_setprop(fdt, bus_off, "ibm,indicator-9003", int_buf,
> +  sizeof(int_buf));
> +if (ret) {
> +fprintf(stderr, "error adding 'ibm,indicator-9003' field for PH

Re: [Qemu-devel] [0/3] target-ppc Fixes for some missing config dependencies

2014-08-26 Thread Paolo Bonzini

Il 26/08/2014 06:30, David Gibson ha scritto:
> These 3 patches fix some places where things ought to depend on an
> existing config variable, but don't.

Which header provdides the #defines?

Paolo

Re: [Qemu-devel] [PATCH v6 0/3] linux-aio: introduce submit I/O asabatch

2014-08-26 Thread Zhang Haoyu

>> Can we use the queued io data as caches, 
>> io write will directly return and tell the guest the io is completed after 
>> the io is enqueued, 
>> better user experience for burst io,
>> and io-read will firstly search the io queue, if matched data found, 
>> directly get the data from the queue, 
>> if not, then read the data from the disk or host page cache.
>> Any ideas?
>
>Guest kernel already has a page cache that exactly does this, also keeping a
>copy of guest request data in qemu may hurt the bandwidth in some cases.
>
You are right.
Does the io merging in queue is worthy to be performed ?

Thanks,
Zhang Haoyu
>Fam

Re: [Qemu-devel] [RFC PATCH v2 5/8] thread-pool: Implement .cancel_async

2014-08-26 Thread Paolo Bonzini

Il 26/08/2014 08:08, Fam Zheng ha scritto:
> +qemu_mutex_lock(&pool->lock);
> +if (thread_pool_cancel_from_queue(elem)) {
> +elem->state = THREAD_CANCELED_ASYNC;
> +}

Can you simply set it to THREAD_DONE (and set elem->ret to -ECANCELED)?

Paolo

Re: [Qemu-devel] [RFC PATCH v2 6/8] dma: Implement .cancel_async

2014-08-26 Thread Paolo Bonzini

Il 26/08/2014 08:08, Fam Zheng ha scritto:
> +if (dbs->cancelled) {
> +ret = -ECANCELED;
> +}

Why is dbs->cancelled necessary?

>  dma_bdrv_unmap(dbs);
>  if (dbs->common.cb) {
>  dbs->common.cb(dbs->common.opaque, ret);
> @@ -141,6 +148,9 @@ static void dma_bdrv_cb(void *opaque, int ret)
>  
>  trace_dma_bdrv_cb(dbs, ret);
>  
> +if (dbs->cancelled) {
> +ret = -ECANCELED;
> +}
>  dbs->acb = NULL;
>  dbs->sector_num += dbs->iov.size / 512;
>  
> @@ -185,6 +195,7 @@ static void dma_aio_cancel(BlockDriverAIOCB *acb)
>  
>  trace_dma_aio_cancel(dbs);
>  
> +dbs->cancelled = true;
>  if (dbs->acb) {
>  BlockDriverAIOCB *acb = dbs->acb;
>  dbs->acb = NULL;
> @@ -196,9 +207,25 @@ static void dma_aio_cancel(BlockDriverAIOCB *acb)
>  dma_complete(dbs, 0);
>  }
>  
> +static void dma_aio_cancel_async(BlockDriverAIOCB *acb)
> +{
> +DMAAIOCB *dbs = container_of(acb, DMAAIOCB, common);
> +
> +trace_dma_aio_cancel(dbs);
> +
> +dbs->cancelled = true;
> +if (dbs->acb) {
> +acb = dbs->acb;
> +dbs->acb = NULL;

Why do you need to set dbs->acb to NULL, since the callback is going to
be called?

Paolo

> +bdrv_aio_cancel_async(acb);
> +}
> +}

Re: [Qemu-devel] [PATCH 09/12] rtl8139: adding new fields to vmstate

2014-08-26 Thread Paolo Bonzini

Il 26/08/2014 09:15, Pavel Dovgalyuk ha scritto:
> This patch adds virtual clock-dependent timers to VMState to allow correct
> saving and restoring the state of RTL8139 network controller.
> 
> Signed-off-by: Pavel Dovgalyuk 
> ---
>  hw/net/rtl8139.c |   50 --
>  1 files changed, 48 insertions(+), 2 deletions(-)

Again, this is only needed in your record/replay system (and you haven't
yet quite explained why the design has this limitation), so it should
not be a part of this series.

I'll review the other patches separately, no need to resubmit.

Paolo

Re: [Qemu-devel] [PATCH 11/12] mc146818rtc: add missed field to vmstate

2014-08-26 Thread Paolo Bonzini

Il 26/08/2014 09:15, Pavel Dovgalyuk ha scritto:
> +static int rtc_pre_load(void *opaque)
> +{
> +RTCState *s = (RTCState *)opaque;
> +s->irq_reinject_on_ack_count = 0;
> +return 0;
> +}
> +

You found a real bug here, in that the field has to be cleared at reset
time.  But reinitializing stuff in pre-load hooks is another quirk of
your record/replay patches that does not belong in these patches (and
that you have not justified yet, either).

Paolo

Re: [Qemu-devel] [PATCH v5 0/8] modify boot order of guest, and take effect after rebooting

2014-08-26 Thread Gonglei (Arei)

Hi, Gerd

 Nice to meet you again in maillist. :)

> -Original Message-
> From: Gerd Hoffmann [mailto:kra...@redhat.com]
> Sent: Tuesday, August 26, 2014 2:36 PM
> Subject: Re: [PATCH v5 0/8] modify boot order of guest, and take effect after
> rebooting
> 
> > The patchsets add one qmp interface, and add an fw_cfg_machine_reset()
> > to achieve it.
> 
> > (qemu) set-bootindex ide0-0-1 1
> > The bootindex 1 has already been used
> 
> What happened to the idea to use qom-set instead?  I liked that
> suggestion.  Solves the suffix issue in a nice way.
> 
I have discussed with Makus about qom-set in pervious confabulation.
The main problem is that qom-set's function is simple, which just change
a device's property value, but not can do any other logic. In my case,
I should change global fw_boot_orde for devices's bootindex taking effect.

Please see the details as below, thanks a lot!

http://thread.gmane.org/gmane.comp.emulators.qemu/287636

Best regards,
-Gonglei

Re: [Qemu-devel] [PATCH 02/12] spapr_pci: populate DRC dt entries for PHBs

2014-08-26 Thread Alexey Kardashevskiy

On 08/19/2014 10:21 AM, Michael Roth wrote:
> Reserve 32 entries of type PCI in each PHB's initial FDT. This
> advertises to guests that each PHB is DR-capable device with
> physical hotpluggable slots. This is necessary for allowing
> hotplugging of devices to it later via bus rescan or guest rpaphp
> hotplug module.
> 
> Each entry is assigned a name of "Slot <*32 +1>",
> advertised as a hotpluggable PCI slot, and assigned to power domain
> -1 to indicate to the guest that power management is handled by the
> hardware.
> 
> This models a DR-capable PCI expansion device attached to a host/lpar
> via a single PHB with 32 physical hotpluggable slots (as opposed to a
> virtual bridge device with external management console). Hotplug will
> be handled by the guest via bus rescan or the rpaphp hotplug module.
> 
> Signed-off-by: Michael Roth 
> ---
>  hw/ppc/spapr.c  |   3 +-
>  hw/ppc/spapr_pci.c  | 102 
> 
>  include/hw/pci-host/spapr.h |   1 +
>  3 files changed, 105 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index d5e46c3..90b25b3 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -890,7 +890,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>  QLIST_FOREACH(phb, &spapr->phbs, list) {
>  drc_entry = spapr_phb_to_drc_entry(phb->buid);
>  g_assert(drc_entry);
> -ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
> +ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, drc_entry->drc_index,
> +fdt);
>  }
>  
>  if (ret < 0) {
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index e85134f..924d488 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -851,8 +851,104 @@ static int spapr_phb_children_dt(Object *child, void 
> *opaque)
>  return 1;
>  }
>  
> +static void spapr_create_drc_phb_dt_entries(void *fdt, int bus_off, int 
> phb_index)
> +{
> +char char_buf[1024];
> +uint32_t int_buf[SPAPR_DRC_PHB_SLOT_MAX + 1];
> +uint32_t *entries;
> +int i, ret, offset;
> +
> +/* ibm,drc-indexes */
> +memset(int_buf, 0 , sizeof(int_buf));
> +int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> +
> +for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +int_buf[i] = SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + ((i - 1) << 
> 3);
> +}
> +
> +ret = fdt_setprop(fdt, bus_off, "ibm,drc-indexes", int_buf,
> +  sizeof(int_buf));
> +if (ret) {
> +fprintf(stderr, "error adding 'ibm,drc-indexes' field for PHB FDT");
> +}
> +
> +/* ibm,drc-power-domains */
> +memset(int_buf, 0, sizeof(int_buf));
> +int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> +
> +for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +int_buf[i] = 0x;
> +}
> +
> +ret = fdt_setprop(fdt, bus_off, "ibm,drc-power-domains", int_buf,
> +  sizeof(int_buf));
> +if (ret) {
> +fprintf(stderr,
> +"error adding 'ibm,drc-power-domains' field for PHB FDT");
> +}
> +
> +/* ibm,drc-names */
> +memset(char_buf, 0, sizeof(char_buf));
> +entries = (uint32_t *)&char_buf[0];
> +*entries = SPAPR_DRC_PHB_SLOT_MAX;
> +offset = sizeof(*entries);
> +
> +for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +offset += sprintf(char_buf + offset, "Slot %d",
> +  (phb_index * SPAPR_DRC_PHB_SLOT_MAX) + i - 1);
> +char_buf[offset++] = '\0';
> +}
> +
> +ret = fdt_setprop(fdt, bus_off, "ibm,drc-names", char_buf, offset);
> +if (ret) {
> +fprintf(stderr, "error adding 'ibm,drc-names' field for PHB FDT");
> +}
> +
> +/* ibm,drc-types */
> +memset(char_buf, 0, sizeof(char_buf));
> +entries = (uint32_t *)&char_buf[0];
> +*entries = SPAPR_DRC_PHB_SLOT_MAX;
> +offset = sizeof(*entries);
> +
> +for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +offset += sprintf(char_buf + offset, "28");


"28"? Is it for "PHB"?




-- 
Alexey

Re: [Qemu-devel] [PATCH 03/12] fdc: adding vmstate for save/restore

2014-08-26 Thread Paolo Bonzini

Il 26/08/2014 09:14, Pavel Dovgalyuk ha scritto:
> +static int fdc_pre_load(void *opaque)
> +{
> +FDCtrl *s = opaque;
> +s->reset_sensei = 0;
> +timer_del(s->result_timer);
> +return 0;
> +}

This should be in fdctrl_reset.

Paolo

Re: [Qemu-devel] [PATCH 04/12] parallel: adding vmstate for save/restore

2014-08-26 Thread Paolo Bonzini

Il 26/08/2014 09:14, Pavel Dovgalyuk ha scritto:
> VMState added by this patch preserves correct
> loading of the parallel port controller state.
> 
> Signed-off-by: Pavel Dovgalyuk 
> ---
>  hw/char/parallel.c |   20 
>  1 files changed, 20 insertions(+), 0 deletions(-)
> 
> diff --git a/hw/char/parallel.c b/hw/char/parallel.c
> index 7ac90a5..26c03d7 100644
> --- a/hw/char/parallel.c
> +++ b/hw/char/parallel.c
> @@ -477,6 +477,24 @@ static const MemoryRegionPortio 
> isa_parallel_portio_sw_list[] = {
>  PORTIO_END_OF_LIST(),
>  };
>  
> +
> +static const VMStateDescription vmstate_parallel_isa = {
> +.name = "parallel_isa",
> +.version_id = 1,
> +.minimum_version_id = 1,
> +.minimum_version_id_old = 1,
> +.fields  = (VMStateField[]) {
> +VMSTATE_UINT8(state.dataw, ISAParallelState),
> +VMSTATE_UINT8(state.datar, ISAParallelState),
> +VMSTATE_UINT8(state.status, ISAParallelState),
> +VMSTATE_UINT8(state.control, ISAParallelState),
> +VMSTATE_INT32(state.irq_pending, ISAParallelState),
> +VMSTATE_INT32(state.epp_timeout, ISAParallelState),
> +VMSTATE_END_OF_LIST()
> +}
> +};
> +
> +
>  static void parallel_isa_realizefn(DeviceState *dev, Error **errp)
>  {
>  static int index;
> @@ -518,6 +536,8 @@ static void parallel_isa_realizefn(DeviceState *dev, 
> Error **errp)
>? &isa_parallel_portio_hw_list[0]
>: &isa_parallel_portio_sw_list[0]),
>   s, "parallel");
> +
> +vmstate_register(NULL, -1, &vmstate_parallel_isa, isa);
>  }
>  
>  /* Memory mapped interface */

I think you should use dc->vmsd.

Paolo

Re: [Qemu-devel] [PATCH 02/12] pcspk: adding vmstate for save/restore

2014-08-26 Thread Paolo Bonzini

Il 26/08/2014 09:14, Pavel Dovgalyuk ha scritto:
> VMState added by this patch preserves correct
> loading of the PC speaker device state.
> 
> Signed-off-by: Pavel Dovgalyuk 
> ---
>  hw/audio/pcspk.c |   18 --
>  1 files changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/audio/pcspk.c b/hw/audio/pcspk.c
> index 1d81bbe..8b22dbf 100644
> --- a/hw/audio/pcspk.c
> +++ b/hw/audio/pcspk.c
> @@ -50,8 +50,8 @@ typedef struct {
>  unsigned int pit_count;
>  unsigned int samples;
>  unsigned int play_pos;
> -int data_on;
> -int dummy_refresh_clock;
> +uint8_t data_on;
> +uint8_t dummy_refresh_clock;
>  } PCSpkState;
>  
>  static const char *s_spk = "pcspk";
> @@ -163,6 +163,18 @@ static const MemoryRegionOps pcspk_io_ops = {
>  },
>  };
>  
> +static const VMStateDescription vmstate_spk = {
> +.name = "pcspk",
> +.version_id = 1,
> +.minimum_version_id = 1,
> +.minimum_version_id_old = 1,
> +.fields  = (VMStateField[]) {
> +VMSTATE_UINT8(data_on, PCSpkState),
> +VMSTATE_UINT8(dummy_refresh_clock, PCSpkState),
> +VMSTATE_END_OF_LIST()
> +}
> +};
> +
>  static void pcspk_initfn(Object *obj)
>  {
>  PCSpkState *s = PC_SPEAKER(obj);
> @@ -175,6 +187,8 @@ static void pcspk_realizefn(DeviceState *dev, Error 
> **errp)
>  ISADevice *isadev = ISA_DEVICE(dev);
>  PCSpkState *s = PC_SPEAKER(dev);
>  
> +vmstate_register(NULL, 0, &vmstate_spk, s);
> +
>  isa_register_ioport(isadev, &s->ioport, s->iobase);
>  
>  pcspk_state = s;
> 
> 
> 

I think you should use dc->vmsd.

Paolo

Re: [Qemu-devel] [PATCH 08/12] pckbd: adding new fields to vmstate

2014-08-26 Thread Paolo Bonzini

Il 26/08/2014 09:15, Pavel Dovgalyuk ha scritto:
> This patch adds outport to VMState to allow correct saving and restoring
> the state of PC keyboard controller.
> 
> Signed-off-by: Pavel Dovgalyuk 
> ---
>  hw/input/pckbd.c |   51 +++
>  1 files changed, 51 insertions(+), 0 deletions(-)
> 
> diff --git a/hw/input/pckbd.c b/hw/input/pckbd.c
> index ca1cffc..95d8767 100644
> --- a/hw/input/pckbd.c
> +++ b/hw/input/pckbd.c
> @@ -131,6 +131,7 @@ typedef struct KBDState {
>  uint8_t status;
>  uint8_t mode;
>  uint8_t outport;
> +bool outport_present;
>  /* Bitmask of devices with data available.  */
>  uint8_t pending;
>  void *kbd;
> @@ -367,18 +368,68 @@ static void kbd_reset(void *opaque)
>  s->mode = KBD_MODE_KBD_INT | KBD_MODE_MOUSE_INT;
>  s->status = KBD_STAT_CMD | KBD_STAT_UNLOCKED;
>  s->outport = KBD_OUT_RESET | KBD_OUT_A20;
> +s->outport_present = false;
> +}
> +
> +static uint8_t kbd_outport_default(KBDState *s)
> +{
> +return KBD_OUT_RESET | KBD_OUT_A20
> +   | (s->status & KBD_STAT_OBF ? KBD_OUT_OBF : 0)
> +   | (s->status & KBD_STAT_MOUSE_OBF ? KBD_OUT_MOUSE_OBF : 0);
> +}
> +
> +static int kbd_outport_post_load(void *opaque, int version_id)
> +{
> +KBDState *s = opaque;
> +s->outport_present = true;
> +return 0;
> +}
> +
> +static const VMStateDescription vmstate_kbd_outport = {
> +.name = "pckbd_outport",
> +.version_id = 1,
> +.minimum_version_id = 1,
> +.post_load = kbd_outport_post_load,
> +.fields = (VMStateField[]) {
> +VMSTATE_UINT8(outport, KBDState),
> +VMSTATE_END_OF_LIST()
> +}
> +};
> +
> +static bool kbd_outport_needed(void *opaque)
> +{
> +KBDState *s = opaque;
> +return s->outport != kbd_outport_default(s);
> +}
> +
> +static int kbd_post_load(void *opaque, int version_id)
> +{
> +KBDState *s = opaque;
> +if (!s->outport_present) {
> +s->outport = kbd_outport_default(s);
> +}
> +s->outport_present = false;
> +return 0;
>  }
>  
>  static const VMStateDescription vmstate_kbd = {
>  .name = "pckbd",
>  .version_id = 3,
>  .minimum_version_id = 3,
> +.post_load = kbd_post_load,
>  .fields = (VMStateField[]) {
>  VMSTATE_UINT8(write_cmd, KBDState),
>  VMSTATE_UINT8(status, KBDState),
>  VMSTATE_UINT8(mode, KBDState),
>  VMSTATE_UINT8(pending, KBDState),
>  VMSTATE_END_OF_LIST()
> +},
> +.subsections = (VMStateSubsection[]) {
> +{
> +.vmsd = &vmstate_kbd_outport,
> +.needed = kbd_outport_needed,
> +},
> +VMSTATE_END_OF_LIST()
>  }
>  };
>  
> 
> 
> 

Reviewed-by: Paolo Bonzini

Re: [Qemu-devel] [PATCH 07/12] spapr_pci: add ibm, configure-connector RTAS interface

2014-08-26 Thread Alexey Kardashevskiy

On 08/19/2014 10:21 AM, Michael Roth wrote:
> Signed-off-by: Michael Roth 

I have totally no idea what this patch actually does :) When is this rtas
call made? Once after the guest received the check exception interrupt? Is
it all it does is fetching the copy of the device tree made by
spapr_create_drc_phb_dt_entries()? If it is,
spapr_create_drc_phb_dt_entries() could compose the structure below at the
first place without any additional conversions, no?


> ---
>  hw/ppc/spapr_pci.c | 111 
> +
>  1 file changed, 111 insertions(+)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 8d1351d..96a57be 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -606,6 +606,115 @@ static void rtas_get_sensor_state(PowerPCCPU *cpu, 
> sPAPREnvironment *spapr,
>  rtas_st(rets, 1, decoded);
>  }
>  
> +/* configure-connector work area offsets, int32_t units */
> +#define CC_IDX_NODE_NAME_OFFSET 2
> +#define CC_IDX_PROP_NAME_OFFSET 2
> +#define CC_IDX_PROP_LEN 3
> +#define CC_IDX_PROP_DATA_OFFSET 4
> +
> +#define CC_VAL_DATA_OFFSET ((CC_IDX_PROP_DATA_OFFSET + 1) * 4)
> +#define CC_RET_NEXT_SIB 1
> +#define CC_RET_NEXT_CHILD 2
> +#define CC_RET_NEXT_PROPERTY 3
> +#define CC_RET_PREV_PARENT 4
> +#define CC_RET_ERROR RTAS_OUT_HW_ERROR
> +#define CC_RET_SUCCESS RTAS_OUT_SUCCESS


Why these two are here, not in the same bucket as RTAS_OUT_HW_ERROR and others?


> +
> +static void rtas_ibm_configure_connector(PowerPCCPU *cpu,
> + sPAPREnvironment *spapr,
> + uint32_t token, uint32_t nargs,
> + target_ulong args, uint32_t nret,
> + target_ulong rets)
> +{
> +uint64_t wa_addr = ((uint64_t)rtas_ld(args, 1) << 32) | rtas_ld(args, 0);
> +sPAPRDrcEntry *drc_entry = NULL;
> +sPAPRConfigureConnectorState *ccs;
> +void *wa_buf;
> +int32_t *wa_buf_int;
> +hwaddr map_len = 0x1024;
> +uint32_t drc_index;
> +int rc = 0, next_offset, tag, prop_len, node_name_len;
> +const struct fdt_property *prop;
> +const char *node_name, *prop_name;
> +
> +wa_buf = cpu_physical_memory_map(wa_addr, &map_len, 1);
> +if (!wa_buf) {
> +rc = CC_RET_ERROR;
> +goto error_exit;
> +}
> +wa_buf_int = wa_buf;
> +
> +drc_index = *(uint32_t *)wa_buf;
> +drc_entry = spapr_find_drc_entry(drc_index);
> +if (!drc_entry) {
> +rc = -1;
> +goto error_exit;
> +}
> +
> +ccs = &drc_entry->cc_state;
> +if (ccs->state == CC_STATE_PENDING) {
> +/* fdt should've been been attached to drc_entry during
> + * realize/hotplug
> + */
> +g_assert(ccs->fdt);
> +ccs->depth = 0;
> +ccs->offset = ccs->offset_start;
> +ccs->state = CC_STATE_ACTIVE;
> +}
> +
> +if (ccs->state == CC_STATE_IDLE) {
> +rc = -1;
> +goto error_exit;
> +}
> +
> +retry:
> +tag = fdt_next_tag(ccs->fdt, ccs->offset, &next_offset);
> +
> +switch (tag) {
> +case FDT_BEGIN_NODE:
> +ccs->depth++;
> +node_name = fdt_get_name(ccs->fdt, ccs->offset, &node_name_len);
> +wa_buf_int[CC_IDX_NODE_NAME_OFFSET] = CC_VAL_DATA_OFFSET;
> +strcpy(wa_buf + wa_buf_int[CC_IDX_NODE_NAME_OFFSET], node_name);
> +rc = CC_RET_NEXT_CHILD;
> +break;
> +case FDT_END_NODE:
> +ccs->depth--;
> +if (ccs->depth == 0) {
> +/* reached the end of top-level node, declare success */
> +ccs->state = CC_STATE_PENDING;
> +rc = CC_RET_SUCCESS;
> +} else {
> +rc = CC_RET_PREV_PARENT;
> +}
> +break;
> +case FDT_PROP:
> +prop = fdt_get_property_by_offset(ccs->fdt, ccs->offset, &prop_len);
> +prop_name = fdt_string(ccs->fdt, fdt32_to_cpu(prop->nameoff));
> +wa_buf_int[CC_IDX_PROP_NAME_OFFSET] = CC_VAL_DATA_OFFSET;
> +wa_buf_int[CC_IDX_PROP_LEN] = prop_len;
> +wa_buf_int[CC_IDX_PROP_DATA_OFFSET] =
> +CC_VAL_DATA_OFFSET + strlen(prop_name) + 1;
> +strcpy(wa_buf + wa_buf_int[CC_IDX_PROP_NAME_OFFSET], prop_name);
> +memcpy(wa_buf + wa_buf_int[CC_IDX_PROP_DATA_OFFSET],
> +   prop->data, prop_len);
> +rc = CC_RET_NEXT_PROPERTY;
> +break;
> +case FDT_END:
> +rc = CC_RET_ERROR;
> +break;
> +default:
> +ccs->offset = next_offset;
> +goto retry;

Could be a while(1) loop...


> +}
> +
> +ccs->offset = next_offset;
> +
> +error_exit:
> +cpu_physical_memory_unmap(wa_buf, 0x1024, 1, 0x1024);


0x1024 is weird constant, are you sure about that?


> +rtas_st(rets, 0, rc);
> +}
> +
>  static int pci_spapr_swizzle(int slot, int pin)
>  {
>  return (slot + pin) % PCI_NUM_PINS;
> @@ -1276,6 +1385,8 @@ void spapr_pci_rtas_init(void)
>

Re: [Qemu-devel] [PATCH 08/12] pci: allow 0 address for PCI IO regions

2014-08-26 Thread Alexey Kardashevskiy

On 08/19/2014 10:21 AM, Michael Roth wrote:
> Some kernels program a 0 address for io regions. PCI 3.0 spec
> section 6.2.5.1 doesn't seem to disallow this.


I remember there was discussion about it but I forgot :) Why does it have
to be a part of this patchset? Worth mentioning in the commit log I believe.


> 
> Signed-off-by: Michael Roth 
> ---
>  hw/pci/pci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 351d320..9578749 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -1035,7 +1035,7 @@ static pcibus_t pci_bar_address(PCIDevice *d,
>  /* Check if 32 bit BAR wraps around explicitly.
>   * TODO: make priorities correct and remove this work around.
>   */
> -if (last_addr <= new_addr || new_addr == 0 || last_addr >= 
> UINT32_MAX) {
> +if (last_addr <= new_addr || last_addr >= UINT32_MAX) {
>  return PCI_BAR_UNMAPPED;
>  }
>  return new_addr;
> 


-- 
Alexey

Re: [Qemu-devel] [RFC PATCH v2 6/8] dma: Implement .cancel_async

2014-08-26 Thread Fam Zheng

On Tue, 08/26 10:46, Paolo Bonzini wrote:
> Il 26/08/2014 08:08, Fam Zheng ha scritto:
> > +if (dbs->cancelled) {
> > +ret = -ECANCELED;
> > +}
> 
> Why is dbs->cancelled necessary?

Request may complete after bdrv_aio_cancel_async with other status, this flag
is checked to fix the status to -ECANCELED.

> 
> >  dma_bdrv_unmap(dbs);
> >  if (dbs->common.cb) {
> >  dbs->common.cb(dbs->common.opaque, ret);
> > @@ -141,6 +148,9 @@ static void dma_bdrv_cb(void *opaque, int ret)
> >  
> >  trace_dma_bdrv_cb(dbs, ret);
> >  
> > +if (dbs->cancelled) {
> > +ret = -ECANCELED;
> > +}
> >  dbs->acb = NULL;
> >  dbs->sector_num += dbs->iov.size / 512;
> >  
> > @@ -185,6 +195,7 @@ static void dma_aio_cancel(BlockDriverAIOCB *acb)
> >  
> >  trace_dma_aio_cancel(dbs);
> >  
> > +dbs->cancelled = true;
> >  if (dbs->acb) {
> >  BlockDriverAIOCB *acb = dbs->acb;
> >  dbs->acb = NULL;
> > @@ -196,9 +207,25 @@ static void dma_aio_cancel(BlockDriverAIOCB *acb)
> >  dma_complete(dbs, 0);
> >  }
> >  
> > +static void dma_aio_cancel_async(BlockDriverAIOCB *acb)
> > +{
> > +DMAAIOCB *dbs = container_of(acb, DMAAIOCB, common);
> > +
> > +trace_dma_aio_cancel(dbs);
> > +
> > +dbs->cancelled = true;
> > +if (dbs->acb) {
> > +acb = dbs->acb;
> > +dbs->acb = NULL;
> 
> Why do you need to set dbs->acb to NULL, since the callback is going to
> be called?

Just copied from dma_aio_cancel, but seems not particularly useful. It reminds
me that the one in dma_aio_cancel also looks suspicious.

Fam

Re: [Qemu-devel] [PATCH 10/12] piix: do not raise irq while loading vmstate

2014-08-26 Thread Paolo Bonzini

Il 26/08/2014 09:15, Pavel Dovgalyuk ha scritto:
> This patch disables raising an irq while loading the state of PCI bridge.
> 
> Signed-off-by: Pavel Dovgalyuk 
> ---
>  hw/pci-host/piix.c |   22 --
>  1 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
> index e0e0946..86d6d20 100644
> --- a/hw/pci-host/piix.c
> +++ b/hw/pci-host/piix.c
> @@ -409,7 +409,7 @@ static void piix3_set_irq_pic(PIIX3State *piix3, int 
> pic_irq)
>   (pic_irq * PIIX_NUM_PIRQS;
>  }
>  
> -static void piix3_set_irq_level(PIIX3State *piix3, int pirq, int level)
> +static void piix3_set_irq_level_internal(PIIX3State *piix3, int pirq, int 
> level)
>  {
>  int pic_irq;
>  uint64_t mask;
> @@ -422,6 +422,18 @@ static void piix3_set_irq_level(PIIX3State *piix3, int 
> pirq, int level)
>  mask = 1ULL << ((pic_irq * PIIX_NUM_PIRQS) + pirq);
>  piix3->pic_levels &= ~mask;
>  piix3->pic_levels |= mask * !!level;
> +}
> +
> +static void piix3_set_irq_level(PIIX3State *piix3, int pirq, int level)
> +{
> +int pic_irq;
> +
> +pic_irq = piix3->dev.config[PIIX_PIRQC + pirq];
> +if (pic_irq >= PIIX_NUM_PIC_IRQS) {
> +return;
> +}
> +
> +piix3_set_irq_level_internal(piix3, pirq, level);
>  
>  piix3_set_irq_pic(piix3, pic_irq);
>  }
> @@ -527,7 +539,13 @@ static void piix3_reset(void *opaque)
>  static int piix3_post_load(void *opaque, int version_id)
>  {
>  PIIX3State *piix3 = opaque;
> -piix3_update_irq_levels(piix3);
> +int pirq;
> +
> +piix3->pic_levels = 0;
> +for (pirq = 0; pirq < PIIX_NUM_PIRQS; pirq++) {
> +piix3_set_irq_level_internal(piix3, pirq,
> +pci_bus_get_irq_level(piix3->dev.bus, pirq));
> +}
>  return 0;
>  }
>  
> 
> 
> 

The commit message or (probably better) a comment in the code should
explain why the PIC state must not be updated, that is which side effect
is undesirable.

This would clarify whether the change is useful for migration too, or
just cosmetic outside record/replay.  Unlike other patches, however, I
think this could be acceptable even without record/replay
infrastructure, because it is not going against "accepted" patterns for
migration.

Paolo

Re: [Qemu-devel] [PATCH v3 00/12] spapr: add support for pci hotplug

2014-08-26 Thread Alexey Kardashevskiy

On 08/19/2014 10:21 AM, Michael Roth wrote:
> These patches are based on ppc-next, and can also be obtained from:
> 
> https://github.com/mdroth/qemu/commits/spapr-pci-hotplug-v3-ppc-next
> 
> v3:
>  * dropped emulation of firmware-managed BAR allocation. this will be
>introduced via a follow-up series via a -machine flag and tied to
>a separate hotplug event to avoid a race condition with guest vs.
>"firmware"-managed BAR allocation, in conjunction with required
>fixes to rpaphp hotplug kernel module to utilize this mode.
>  * moved drc_table into sPAPREnvironment (Alexey)
>  * moved INDICATOR_* constants and friends into spapr_pci.c (Alexey)
>  * use prefixes for global types (DrcEntry/ConfigureConnectorState) (Alexey)
>  * updated for new hotplug interface (Alexey)
>  * fixed get-power-level to report current power-level rather than
>desired (Alexey)
>  * rebased to latest ppc-next
> 
> v2:
>   * re-ordered patches to fix build bisectability (Alexey)
>   * replaced g_warning with DPRINTF in RTAS calls for guest errors (Alexey)
>   * replaced g_warning with fprintf for qemu errors (Alexey)
>   * updated RTAS calls to use pre-existing error/success macros (Alexey)
>   * replaced DR_*/SENSOR_* macros with INDICATOR_* for set-indicator/
> get-sensor-state (Alexey)
> 
> OVERVIEW
> 
> These patches add support for PCI hotplug for SPAPR guests. We advertise
> each PHB as DR-capable (as defined by PAPR 13.5/13.6) with 32 hotpluggable
> PCI slots per PHB, which models a standard PCI expansion device for Power
> machines where the DRC name/loc-code/index for each slot are generated
> based on bus/slot number.
> 
> This is compatible with existing guest kernel's via the rpaphp hotplug
> module, and existing userspace tools such as drmgr/librtas/rtas_errd for
> managing devices, in theory...


It would help if you described roughly what happens step-by-step when
hotplugging - when HotplugHandlerClass is called, when RTAS calls are made,
what format is used between (it is something like device tree but not
exactly), when check exception interrupt is triggered...




> 
> NOTES / ADDITIONAL DEPENDENCIES
> 
> This series relies on v1.2.19 or later of powerppc-utils (drmgr, rtas_errd,
> ppc64-diag, and librtas components, specificially), which will automate
> guest-side hotplug setup in response to an EPOW event emitted by QEMU. For
> guests with older versions of powerpc-utils, a manual workaround must be
> used (documented below).
> 
> PATCH LAYOUT
> 
> Patches
> 1-3   advertise PHBs and associated slots as hotpluggable to guests
> 4-7   add RTAS interfaces required for device configuration
> 8 fix for ppc (and other) guests that allocate IO bars starting
>   at 0x0
> 9 enables device_add/device_del for spapr machines and
>   guest-driven hotplug
> 10-12 define hotplug event structure and emit them in response to
>   device_add/device_del
> 
> USAGE
> 
> For guests with powerpc-utils 1.2.19+:
>   hotplug:
> qemu:
>   device_add e1000,id=slot0
>   unplug:
> qemu:
>   device_del slot0
> 
> For guests with powerpc-utils prior to 1.2.19:
>   hotplug:
> qemu:
>   device_add e1000,id=slot0
> guest:
>   drmgr -c pci -s "Slot 0" -n -a
>   echo 1 >/sys/bus/pci/rescan
>   unplug:
> guest:
>   drmgr -c pci -s "Slot 0" -n -r
>   echo 1 >/sys/bus/pci/devices/:00:00.0/remove
> qemu:
>   device_del slot0
> 
>  hw/pci/pci.c|   2 +-
>  hw/ppc/spapr.c  | 172 +-
>  hw/ppc/spapr_events.c   | 224 
> 
>  hw/ppc/spapr_pci.c  | 689 
> +--
>  include/hw/pci-host/spapr.h |   1 +
>  include/hw/ppc/spapr.h  |  46 -
>  6 files changed, 1083 insertions(+), 51 deletions(-)
> 


-- 
Alexey

Re: [Qemu-devel] [RFC PATCH v2 5/8] thread-pool: Implement .cancel_async

2014-08-26 Thread Fam Zheng

On Tue, 08/26 10:42, Paolo Bonzini wrote:
> Il 26/08/2014 08:08, Fam Zheng ha scritto:
> > +qemu_mutex_lock(&pool->lock);
> > +if (thread_pool_cancel_from_queue(elem)) {
> > +elem->state = THREAD_CANCELED_ASYNC;
> > +}
> 
> Can you simply set it to THREAD_DONE (and set elem->ret to -ECANCELED)?
> 

Yes, that should work.

Fam

Re: [Qemu-devel] [PATCH 10/12] spapr_events: re-use EPOW event infrastructure for hotplug events

2014-08-26 Thread Alexey Kardashevskiy

On 08/19/2014 10:21 AM, Michael Roth wrote:
> From: Nathan Fontenot 
> 
> This extends the data structures currently used to report EPOW events to
> gets via the check-exception RTAS interfaces to also include event types
> for hotplug/unplug events.
> 
> This is currently undocumented and being finalized for inclusion in PAPR
> specification, but we implement this here as an extension for guest
> userspace tools to implement (existing guest kernels simply log these
> events via a sysfs interface that's read by rtas_errd).


This patch moves things around (should be a mechanical change, right?) AND
 add new (undocumented) stuff. It would be easier to review if it was 2
patches.


> 
> Signed-off-by: Nathan Fontenot 
> Signed-off-by: Michael Roth 
> ---
>  hw/ppc/spapr.c |   2 +-
>  hw/ppc/spapr_events.c  | 215 
> -
>  include/hw/ppc/spapr.h |   4 +-
>  3 files changed, 180 insertions(+), 41 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 39cb0bb..825fd31 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1706,7 +1706,7 @@ static void ppc_spapr_init(MachineState *machine)
>  spapr->fdt_skel = spapr_create_fdt_skel(initrd_base, initrd_size,
>  kernel_size, kernel_le,
>  boot_device, kernel_cmdline,
> -spapr->epow_irq);
> +spapr->check_exception_irq);
>  assert(spapr->fdt_skel != NULL);
>  }
>  
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index 1b6157d..c0be0e5 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -32,6 +32,8 @@
>  
>  #include "hw/ppc/spapr.h"
>  #include "hw/ppc/spapr_vio.h"
> +#include "hw/pci/pci.h"
> +#include "hw/pci-host/spapr.h"
>  
>  #include 
>  
> @@ -77,6 +79,7 @@ struct rtas_error_log {
>  #define   RTAS_LOG_TYPE_ECC_UNCORR  0x0009
>  #define   RTAS_LOG_TYPE_ECC_CORR0x000a
>  #define   RTAS_LOG_TYPE_EPOW0x0040
> +#define   RTAS_LOG_TYPE_HOTPLUG 0x00e5
>  uint32_t extended_length;
>  } QEMU_PACKED;
>  
> @@ -166,6 +169,38 @@ struct epow_log_full {
>  struct rtas_event_log_v6_epow epow;
>  } QEMU_PACKED;
>  
> +struct rtas_event_log_v6_hp {
> +#define RTAS_LOG_V6_SECTION_ID_HOTPLUG  0x4850 /* HP */
> +struct rtas_event_log_v6_section_header hdr;
> +uint8_t hotplug_type;
> +#define RTAS_LOG_V6_HP_TYPE_CPU  1
> +#define RTAS_LOG_V6_HP_TYPE_MEMORY   2
> +#define RTAS_LOG_V6_HP_TYPE_SLOT 3
> +#define RTAS_LOG_V6_HP_TYPE_PHB  4
> +#define RTAS_LOG_V6_HP_TYPE_PCI  5
> +uint8_t hotplug_action;
> +#define RTAS_LOG_V6_HP_ACTION_ADD1
> +#define RTAS_LOG_V6_HP_ACTION_REMOVE 2
> +uint8_t hotplug_identifier;
> +#define RTAS_LOG_V6_HP_ID_DRC_NAME   1
> +#define RTAS_LOG_V6_HP_ID_DRC_INDEX  2
> +#define RTAS_LOG_V6_HP_ID_DRC_COUNT  3
> +uint8_t reserved;
> +union {
> +uint32_t index;
> +uint32_t count;
> +char name[1];
> +} drc;
> +} QEMU_PACKED;
> +
> +struct hp_log_full {
> +struct rtas_error_log hdr;
> +struct rtas_event_log_v6 v6hdr;
> +struct rtas_event_log_v6_maina maina;
> +struct rtas_event_log_v6_mainb mainb;
> +struct rtas_event_log_v6_hp hp;
> +} QEMU_PACKED;
> +
>  #define EVENT_MASK_INTERNAL_ERRORS   0x8000
>  #define EVENT_MASK_EPOW  0x4000
>  #define EVENT_MASK_HOTPLUG   0x1000
> @@ -181,29 +216,61 @@ struct epow_log_full {
>  }  \
>  } while (0)
>  
> -void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq)
> +void spapr_events_fdt_skel(void *fdt, uint32_t check_exception_irq)
>  {
> -uint32_t epow_irq_ranges[] = {cpu_to_be32(epow_irq), cpu_to_be32(1)};
> -uint32_t epow_interrupts[] = {cpu_to_be32(epow_irq), 0};
> +uint32_t irq_ranges[] = {cpu_to_be32(check_exception_irq), 
> cpu_to_be32(1)};
> +uint32_t interrupts[] = {cpu_to_be32(check_exception_irq), 0};
>  
>  _FDT((fdt_begin_node(fdt, "event-sources")));
>  
>  _FDT((fdt_property(fdt, "interrupt-controller", NULL, 0)));
>  _FDT((fdt_property_cell(fdt, "#interrupt-cells", 2)));
>  _FDT((fdt_property(fdt, "interrupt-ranges",
> -   epow_irq_ranges, sizeof(epow_irq_ranges;
> +   irq_ranges, sizeof(irq_ranges;
>  
>  _FDT((fdt_begin_node(fdt, "epow-events")));
> -_FDT((fdt_property(fdt, "interrupts",
> -   epow_interrupts, sizeof(epow_interrupts;
> +_FDT((fdt_property(fdt, "interrupts", interrupts, sizeof(i

Re: [Qemu-devel] [PATCH 11/12] spapr_events: event-scan RTAS interface

2014-08-26 Thread Alexey Kardashevskiy

On 08/19/2014 10:21 AM, Michael Roth wrote:
> From: Tyrel Datwyler 
> 
> We don't actually rely on this interface to surface hotplug events, and
> instead rely on the similar-but-interrupt-driven check-exception RTAS
> interface used for EPOW events. However, the existence of this interface
> is needed to ensure guest kernels initialize the event-reporting
> interfaces which will in turn be used by userspace tools to handle these
> events, so we implement this interface as a stub.
> 
> Signed-off-by: Tyrel Datwyler 
> Signed-off-by: Michael Roth 
> ---
>  hw/ppc/spapr.c | 1 +
>  hw/ppc/spapr_events.c  | 9 +
>  include/hw/ppc/spapr.h | 2 ++
>  3 files changed, 12 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 825fd31..c65b13a 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -702,6 +702,7 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
>  refpoints, sizeof(refpoints;
>  
>  _FDT((fdt_property_cell(fdt, "rtas-error-log-max", RTAS_ERROR_LOG_MAX)));
> +_FDT((fdt_property_cell(fdt, "rtas-event-scan-rate", 
> RTAS_EVENT_SCAN_RATE)));
>  
>  /*
>   * According to PAPR, rtas ibm,os-term, does not gaurantee a return
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index c0be0e5..bb80080 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -449,6 +449,14 @@ static void check_exception(PowerPCCPU *cpu, 
> sPAPREnvironment *spapr,
>  }
>  }
>  
> +static void event_scan(PowerPCCPU *cpu, sPAPREnvironment *spapr,
> +uint32_t token, uint32_t nargs,
> +target_ulong args,
> +uint32_t nret, target_ulong rets)
> +{
> +rtas_st(rets, 0, 1); /* no error events found */


rtas_st(rets, 0, RTAS_OUT_SUCCESS);



> +}
> +
>  void spapr_events_init(sPAPREnvironment *spapr)
>  {
>  spapr->check_exception_irq = xics_alloc(spapr->icp, 0, 0, false);
> @@ -456,4 +464,5 @@ void spapr_events_init(sPAPREnvironment *spapr)
>  qemu_register_powerdown_notifier(&spapr->epow_notifier);
>  spapr_rtas_register(RTAS_CHECK_EXCEPTION, "check-exception",
>  check_exception);
> +spapr_rtas_register(RTAS_EVENT_SCAN, "event-scan", event_scan);
>  }
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 5382bf1..aab627f 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -484,6 +484,8 @@ int spapr_rtas_device_tree_setup(void *fdt, hwaddr 
> rtas_addr,
>  
>  #define RTAS_ERROR_LOG_MAX  2048
>  
> +#define RTAS_EVENT_SCAN_RATE1

1 second? 1ms? 1 minute? :) Worth mentioning in the commit log.


> +
>  typedef struct sPAPRTCETable sPAPRTCETable;
>  
>  #define TYPE_SPAPR_TCE_TABLE "spapr-tce-table"
> 


-- 
Alexey

Re: [Qemu-devel] [question] e1000 interrupt storm happened becauseof its correspondingioapic->irr bit always set

2014-08-26 Thread Zhang Haoyu

 Hi, all

 I use a qemu-1.4.1/qemu-2.0.0 to run win7 guest, and encounter e1000 NIC 
 interrupt storm, 
 because "if (!ent->fields.mask && (ioapic->irr & (1 << i)))" is always 
 true in __kvm_ioapic_update_eoi().

 Any ideas?
>>> We meet this several times: search the autoneg patches for an example of
>>> workaround for this in qemu, and patch kvm: ioapic: conditionally delay
>>> irq delivery during eoi broadcast for an workaround in kvm (rejected).
>>>
>> Thanks, Jason,
>> I searched "e1000 autoneg" in gmane.comp.emulators.qemu, and found below 
>> patches, 
>> http://thread.gmane.org/gmane.comp.emulators.qemu/143001/focus=143007
>
>This series is the first try to fix the guest hang during guest
>hibernation or driver enable/disable.
>> http://thread.gmane.org/gmane.comp.emulators.qemu/284105/focus=284765
>> http://thread.gmane.org/gmane.comp.emulators.qemu/186159/focus=187351
>
>Those are follow-up that tries to fix the bugs introduced by the autoneg
>hack.
>> which one tries to fix this problem, or all of them?
>
>As you can see, those kinds of hacking may not as good as we expect
>since we don't know exactly how e1000 works. Only the register function
>description from Intel's manual may not be sufficient. And you can
>search e1000 in the archives and you can find some behaviour of e1000
>registers were not fictionalized like what spec said. It was really
>suggested to use virtio-net instead of e1000 in guest. 
>>
Will the "[PATCH] kvm: ioapic: conditionally delay irq delivery during eoi 
broadcast" add delay to virtual interrupt injection sometimes,
then some time delay sensitive applications will be impacted?

Thanks,
Zhang Haoyu

Re: [Qemu-devel] [PATCH v2 1/1] pc-dimm: Change PCDIMMDevice->node from UINT32 to INT32, and initialize it as -1.

2014-08-26 Thread tangchen


Hi ,

Would anybody help to review this patch ?

Thanks. :)

On 08/19/2014 09:55 AM, Tang Chen wrote:

If user doesn't specify numa options, nb_numa_nodes will be 0. But 
PCDIMMDevice->node
is also initialized to 0. As a result, the following check will fail:

pc_dimm_realize()
{
..
if (dimm->node >= nb_numa_nodes) {
 error_setg(errp, "'DIMM property " PC_DIMM_NODE_PROP " has value %"
PRIu32 "' which exceeds the number of numa nodes: %d",
dimm->node, nb_numa_nodes);
 return;
}
..
}

But this is not an error.

PCDIMMDevice->node should be initialized to -1. This is for users who do not use
NUMA.

Signed-off-by: Tang Chen 
---

Change log v1 -> v2:
1. Simplify the comment.
2. Move the definition of NO_NODE_ID near where it is used.

  hw/mem/pc-dimm.c | 7 ++-
  include/hw/mem/pc-dimm.h | 2 +-
  2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
index ec8b1a3..34109a2 100644
--- a/hw/mem/pc-dimm.c
+++ b/hw/mem/pc-dimm.c
@@ -195,9 +195,14 @@ out:
  return ret;
  }
  
+/* Default value for PCDIMMDevice->node (unless specified by user).

+ * In this case, SRAT won't be created.
+ */
+#define NO_NODE_ID -1
+
  static Property pc_dimm_properties[] = {
  DEFINE_PROP_UINT64(PC_DIMM_ADDR_PROP, PCDIMMDevice, addr, 0),
-DEFINE_PROP_UINT32(PC_DIMM_NODE_PROP, PCDIMMDevice, node, 0),
+DEFINE_PROP_INT32(PC_DIMM_NODE_PROP, PCDIMMDevice, node, NO_NODE_ID),
  DEFINE_PROP_INT32(PC_DIMM_SLOT_PROP, PCDIMMDevice, slot,
PC_DIMM_UNASSIGNED_SLOT),
  DEFINE_PROP_END_OF_LIST(),
diff --git a/include/hw/mem/pc-dimm.h b/include/hw/mem/pc-dimm.h
index 761eeef..82abb2f 100644
--- a/include/hw/mem/pc-dimm.h
+++ b/include/hw/mem/pc-dimm.h
@@ -53,7 +53,7 @@ typedef struct PCDIMMDevice {
  
  /* public */

  uint64_t addr;
-uint32_t node;
+int32_t node;
  int32_t slot;
  HostMemoryBackend *hostmem;
  } PCDIMMDevice;

Re: [Qemu-devel] [libvirt] Mentors wanted for Outreach Program for Women October 2014

2014-08-26 Thread Stefan Hajnoczi

On Mon, Aug 25, 2014 at 5:40 PM, Marina Zhurakhinskaya
 wrote:
> - Original Message -
>> From: "Stefan Hajnoczi" 
>> To: "Martin Kletzander" 
>> Cc: "qemu-devel" , libvir-l...@redhat.com, "kvm" 
>> , "Marina
>> Zhurakhinskaya" 
>> Sent: Monday, August 25, 2014 12:29:27 PM
>> Subject: Re: [libvirt] Mentors wanted for Outreach Program for Women October 
>> 2014
>>
>> On Mon, Aug 25, 2014 at 11:52 AM, Martin Kletzander 
>> wrote:
>> > On Thu, Aug 21, 2014 at 09:06:39PM +0100, Stefan Hajnoczi wrote:
>> >> Regular code contributors to QEMU, KVM, and libvirt are eligible to
>> >> participate as mentors.
>> >>
>> >> We also need project ideas that are achievable in 12 weeks by someone
>> >> skilled in programming but not necessarily familiar with open source
>> >> or our codebase.  Ideas welcome!
>> >>
>> >
>> > It's just a matter of ideas.  Maybe we could revisit some of those we
>> > had for GSoC.  If I'm reading the deadline for project ideas is
>> > October 22nd, so I think we'll definitely come up with something.
>
> Thank you for your interest in helping revisit GSoC ideas and come up with 
> new ones! October 22 is an application deadline for prospective interns. QEMU 
> would need to have some project ideas listed by September 8, though you can 
> add more ideas through September. The timeline for the program is at 
> https://wiki.gnome.org/OutreachProgramForWomen/2014/DecemberMarch You don't 
> need very many ideas, as you are likely to only have at most 2-3 participants.
>
> We don't yet have any funding confirmed for QEMU, but Stefan and I will be 
> working on this. If your organization might be able to sponsor QEMU 
> internships in OPW, please contact me and Stefan off-list. You can learn more 
> at https://wiki.gnome.org/OutreachProgramForWomen/Admin/InfoForOrgs

Thanks for posting the information Marina.

There is now a wiki page for project ideas and OPW information:
http://qemu-project.org/Outreach_Program_for_Women_2014_DecemberMarch

Mentors: Please post your project ideas on the wiki page.

Thanks,
Stefan

Re: [Qemu-devel] [PATCH 12/12] spapr_pci: emit hotplug add/remove events during hotplug

2014-08-26 Thread Alexey Kardashevskiy

On 08/19/2014 10:21 AM, Michael Roth wrote:
> From: Tyrel Datwyler 
> 
> This uses extension of existing EPOW interrupt/event mechanism
> to notify userspace tools like librtas/drmgr to handle
> in-guest configuration/cleanup operations in response to
> device_add/device_del.
> 
> Userspace tools that don't implement this extension will need
> to be run manually in response/advance of device_add/device_del,
> respectively.
> 
> Signed-off-by: Tyrel Datwyler 
> Signed-off-by: Michael Roth 
> ---
>  hw/ppc/spapr_pci.c | 20 
>  1 file changed, 20 insertions(+)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 23864ab..17d37c0 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -944,6 +944,18 @@ static int spapr_device_hotplug_add(DeviceState *qdev, 
> PCIDevice *dev)
>  drc_entry_slot->state &= ~(uint32_t)INDICATOR_ENTITY_SENSE_MASK;
>  drc_entry_slot->state |= encoded; /* and the slot */
>  
> +/* reliable unplug requires we wait for a transition from
> + * UNISOLATED->ISOLATED prior to device removal/deletion.
> + * However, slots populated by devices at boot-time will not
> + * have ever been set by guest tools to an UNISOLATED/populated
> + * state, so set this manually in the case of coldplug devices
> + */
> +if (!DEVICE(dev)->hotplugged) {
> +drc_entry_slot->state |= ENCODE_DRC_STATE(1,
> +  INDICATOR_ISOLATION_MASK,
> +  INDICATOR_ISOLATION_SHIFT);
> +}
> +
>  /* add OF node for pci device and required OF DT properties */
>  fdt_orig = g_malloc0(FDT_MAX_SIZE);
>  offset = fdt_create(fdt_orig, FDT_MAX_SIZE);
> @@ -1026,13 +1038,21 @@ static void spapr_device_hotplug_remove(DeviceState 
> *qdev, PCIDevice *dev)
>  static void spapr_phb_hot_plug(HotplugHandler *plug_handler,
> DeviceState *plugged_dev, Error **errp)
>  {
> +int slot = PCI_SLOT(PCI_DEVICE(plugged_dev)->devfn);
> +
>  spapr_device_hotplug_add(DEVICE(plug_handler), PCI_DEVICE(plugged_dev));
> +if (plugged_dev->hotplugged) {
> +spapr_pci_hotplug_add_event(DEVICE(plug_handler), slot);
> +}


Minor comment here :)
@slot is a temporary variable and it used once. Ok, may be it increases
readability, then it makes sense to have local variables for
PCI_DEVICE(plugged_dev) and DEVICE(plug_handler) too as they both are
calculated twice. But I do not insist :)


>  }
>  
>  static void spapr_phb_hot_unplug(HotplugHandler *plug_handler,
>   DeviceState *plugged_dev, Error **errp)
>  {
> +int slot = PCI_SLOT(PCI_DEVICE(plugged_dev)->devfn);
> +
>  spapr_device_hotplug_remove(DEVICE(plug_handler), 
> PCI_DEVICE(plugged_dev));
> +spapr_pci_hotplug_remove_event(DEVICE(plug_handler), slot);
>  }

Same here.

>  
>  static void spapr_phb_realize(DeviceState *dev, Error **errp)
> 


-- 
Alexey

Re: [Qemu-devel] [libvirt] Mentors wanted for Outreach Program for Women October 2014

2014-08-26 Thread Martin Kletzander

On Tue, Aug 26, 2014 at 10:33:27AM +0100, Stefan Hajnoczi wrote:

On Mon, Aug 25, 2014 at 5:40 PM, Marina Zhurakhinskaya
 wrote:

- Original Message -

From: "Stefan Hajnoczi" 
To: "Martin Kletzander" 
Cc: "qemu-devel" , libvir-l...@redhat.com, "kvm" 
, "Marina
Zhurakhinskaya" 
Sent: Monday, August 25, 2014 12:29:27 PM
Subject: Re: [libvirt] Mentors wanted for Outreach Program for Women October 
2014

On Mon, Aug 25, 2014 at 11:52 AM, Martin Kletzander 
wrote:
> On Thu, Aug 21, 2014 at 09:06:39PM +0100, Stefan Hajnoczi wrote:
>> Regular code contributors to QEMU, KVM, and libvirt are eligible to
>> participate as mentors.
>>
>> We also need project ideas that are achievable in 12 weeks by someone
>> skilled in programming but not necessarily familiar with open source
>> or our codebase.  Ideas welcome!
>>
>
> It's just a matter of ideas.  Maybe we could revisit some of those we
> had for GSoC.  If I'm reading the deadline for project ideas is
> October 22nd, so I think we'll definitely come up with something.

Thank you for your interest in helping revisit GSoC ideas and come up with new 
ones! October 22 is an application deadline for prospective interns. QEMU would 
need to have some project ideas listed by September 8, though you can add more 
ideas through September. The timeline for the program is at 
https://wiki.gnome.org/OutreachProgramForWomen/2014/DecemberMarch You don't 
need very many ideas, as you are likely to only have at most 2-3 participants.

We don't yet have any funding confirmed for QEMU, but Stefan and I will be 
working on this. If your organization might be able to sponsor QEMU internships 
in OPW, please contact me and Stefan off-list. You can learn more at 
https://wiki.gnome.org/OutreachProgramForWomen/Admin/InfoForOrgs

Thanks for posting the information Marina.

There is now a wiki page for project ideas and OPW information:
http://qemu-project.org/Outreach_Program_for_Women_2014_DecemberMarch

I guess you meant the following link instead:
http://wiki.qemu.org/Outreach_Program_for_Women_2014_DecemberMarch

Mentors: Please post your project ideas on the wiki page.

Will do as soon as I find some free time and energy ;)

Martin

signature.asc
Description: Digital signature

Re: [Qemu-devel] [PATCH 09/12] spapr_pci: enable basic hotplug operations

2014-08-26 Thread Alexey Kardashevskiy

On 08/19/2014 10:21 AM, Michael Roth wrote:
> This enables hotplug for PHB bridges. Upon hotplug we generate the
> OF-nodes required by PAPR specification and IEEE 1275-1994
> "PCI Bus Binding to Open Firmware" for the device.
> 
> We associate the corresponding FDT for these nodes with the DrcEntry
> corresponding to the slot, which will be fetched via
> ibm,configure-connector RTAS calls by the guest as described by PAPR
> specification. The FDT is cleaned up in the case of unplug.
> 
> Signed-off-by: Michael Roth 
> ---
>  hw/ppc/spapr_pci.c | 235 
> +++--
>  include/hw/ppc/spapr.h |   1 +
>  2 files changed, 228 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 96a57be..23864ab 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -87,6 +87,17 @@
>  #define ENCODE_DRC_STATE(val, m, s) \
>  (((uint32_t)(val) << (s)) & (uint32_t)(m))
>  
> +#define FDT_MAX_SIZE0x1
> +#define _FDT(exp) \
> +do { \
> +int ret = (exp);   \
> +if (ret < 0) { \
> +return ret;\
> +}  \
> +} while (0)
> +
> +static void spapr_drc_state_reset(sPAPRDrcEntry *drc_entry);
> +
>  static sPAPRPHBState *find_phb(sPAPREnvironment *spapr, uint64_t buid)
>  {
>  sPAPRPHBState *sphb;
> @@ -476,6 +487,22 @@ static void rtas_set_indicator(PowerPCCPU *cpu, 
> sPAPREnvironment *spapr,
>  /* encode the new value into the correct bit field */
>  shift = INDICATOR_ISOLATION_SHIFT;
>  mask = INDICATOR_ISOLATION_MASK;
> +if (drc_entry) {
> +/* transition from unisolated to isolated for a hotplug slot
> + * entails completion of guest-side device unplug/cleanup, so
> + * we can now safely remove the device if qemu is waiting for
> + * it to be released
> + */
> +if (DECODE_DRC_STATE(*pind, mask, shift) != indicator_state) {
> +if (indicator_state == 0 && drc_entry->awaiting_release) {
> +/* device_del has been called and host is waiting
> + * for guest to release/isolate device, go ahead
> + * and remove it now
> + */
> +spapr_drc_state_reset(drc_entry);
> +}
> +}
> +}
>  break;
>  case 9002: /* DR */
>  shift = INDICATOR_DR_SHIFT;
> @@ -816,6 +843,198 @@ static AddressSpace *spapr_pci_dma_iommu(PCIBus *bus, 
> void *opaque, int devfn)
>  return &phb->iommu_as;
>  }
>  
> +static int spapr_populate_pci_child_dt(PCIDevice *dev, void *fdt, int offset,
> +   int phb_index)
> +{
> +int slot = PCI_SLOT(dev->devfn);
> +char slotname[16];
> +bool is_bridge = 1;
> +sPAPRDrcEntry *drc_entry, *drc_entry_slot;
> +uint32_t reg[RESOURCE_CELLS_TOTAL * 8] = { 0 };
> +uint32_t assigned[RESOURCE_CELLS_TOTAL * 8] = { 0 };
> +int reg_size, assigned_size;
> +
> +drc_entry = spapr_phb_to_drc_entry(phb_index + SPAPR_PCI_BASE_BUID);
> +g_assert(drc_entry);
> +drc_entry_slot = &drc_entry->child_entries[slot];
> +
> +if (pci_default_read_config(dev, PCI_HEADER_TYPE, 1) ==
> +PCI_HEADER_TYPE_NORMAL) {
> +is_bridge = 0;
> +}
> +
> +_FDT(fdt_setprop_cell(fdt, offset, "vendor-id",
> +  pci_default_read_config(dev, PCI_VENDOR_ID, 2)));
> +_FDT(fdt_setprop_cell(fdt, offset, "device-id",
> +  pci_default_read_config(dev, PCI_DEVICE_ID, 2)));
> +_FDT(fdt_setprop_cell(fdt, offset, "revision-id",
> +  pci_default_read_config(dev, PCI_REVISION_ID, 1)));
> +_FDT(fdt_setprop_cell(fdt, offset, "class-code",
> +  pci_default_read_config(dev, PCI_CLASS_DEVICE, 2) 
> << 8));
> +
> +_FDT(fdt_setprop_cell(fdt, offset, "interrupts",
> +  pci_default_read_config(dev, PCI_INTERRUPT_PIN, 
> 1)));
> +
> +/* if this device is NOT a bridge */
> +if (!is_bridge) {
> +_FDT(fdt_setprop_cell(fdt, offset, "min-grant",
> +pci_default_read_config(dev, PCI_MIN_GNT, 1)));
> +_FDT(fdt_setprop_cell(fdt, offset, "max-latency",
> +pci_default_read_config(dev, PCI_MAX_LAT, 1)));
> +_FDT(fdt_setprop_cell(fdt, offset, "subsystem-id",
> +pci_default_read_config(dev, PCI_SUBSYSTEM_ID, 2)));
> +_FDT(fdt_setprop_cell(fdt, offset, "subsystem-vendor-id",
> +pci_default_read_config(dev, PCI_SUBSYSTEM_VENDOR_ID, 2)));
> +}
> +
> +_FDT(fdt_setprop_cell(fdt, offset, "cache-line-size",
> +pci_default_read_config(dev, PCI_CACHE_LINE_SIZE, 1)));
> +
> +/* the

Re: [Qemu-devel] [RFC PATCH v2 6/8] dma: Implement .cancel_async

2014-08-26 Thread Paolo Bonzini

Il 26/08/2014 11:21, Fam Zheng ha scritto:
> On Tue, 08/26 10:46, Paolo Bonzini wrote:
>> Il 26/08/2014 08:08, Fam Zheng ha scritto:
>>> +if (dbs->cancelled) {
>>> +ret = -ECANCELED;
>>> +}
>>
>> Why is dbs->cancelled necessary?
> 
> Request may complete after bdrv_aio_cancel_async with other status, this flag
> is checked to fix the status to -ECANCELED.

Ah, that's because the operation could be partly incomplete?

I think it would be better to call dma_complete with -ECANCELED, instead
of doing the same dbs->cancelled check twice.  For example, in dma_bdrv_cb:

if (dbs->sg_cur_index == dbs->sg->nsg || ret < 0) {
dma_complete(dbs, ret);
return;
}
if (dbs->cancelled) {
dma_complete(dbs, -ECANCELED);
return;
}

>>> +static void dma_aio_cancel_async(BlockDriverAIOCB *acb)
>>> +{
>>> +DMAAIOCB *dbs = container_of(acb, DMAAIOCB, common);
>>> +
>>> +trace_dma_aio_cancel(dbs);
>>> +
>>> +dbs->cancelled = true;
>>> +if (dbs->acb) {
>>> +acb = dbs->acb;
>>> +dbs->acb = NULL;
>>
>> Why do you need to set dbs->acb to NULL, since the callback is going to
>> be called?
> 
> Just copied from dma_aio_cancel, but seems not particularly useful. It reminds
> me that the one in dma_aio_cancel also looks suspicious.

Yeah, that one looks useless too...

Paolo

Re: [Qemu-devel] [PATCH 06/12] kvmvapic: fixing loading vmstate

2014-08-26 Thread Paolo Bonzini

Il 26/08/2014 09:15, Pavel Dovgalyuk ha scritto:
> vapic state should not be synchronized with APIC while loading,
> because APIC state could be not loaded yet at that moment.
> We just save vapic_paddr in APIC VMState instead of synchronization.

Can you use a vm_change_state_handler, or a QEMU_CLOCK_VIRTUAL timer
with expiration time in the past (e.g. at time zero) to run the sync
code as soon as possible?  Then you can preserve the current migration
format and avoid using the invalid APIC state.

Paolo

> Signed-off-by: Pavel Dovgalyuk 
> ---
>  hw/i386/kvmvapic.c  |   22 +++-
>  hw/intc/apic_common.c   |   44 
> +++
>  include/hw/i386/apic_internal.h |2 +-
>  3 files changed, 66 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
> index ee95963..3c403c5 100644
> --- a/hw/i386/kvmvapic.c
> +++ b/hw/i386/kvmvapic.c
> @@ -351,6 +351,24 @@ static int get_kpcr_number(X86CPU *cpu)
>  return kpcr.number;
>  }
>  
> +static int vapic_enable_post_load(VAPICROMState *s, X86CPU *cpu)
> +{
> +int cpu_number = get_kpcr_number(cpu);
> +hwaddr vapic_paddr;
> +static const uint8_t enabled = 1;
> +
> +if (cpu_number < 0) {
> +return -1;
> +}
> +vapic_paddr = s->vapic_paddr +
> +(((hwaddr)cpu_number) << VAPIC_CPU_SHIFT);
> +cpu_physical_memory_rw(vapic_paddr + offsetof(VAPICState, enabled),
> +   (void *)&enabled, sizeof(enabled), 1);
> +s->state = VAPIC_ACTIVE;
> +
> +return 0;
> +}
> +
>  static int vapic_enable(VAPICROMState *s, X86CPU *cpu)
>  {
>  int cpu_number = get_kpcr_number(cpu);
> @@ -731,7 +749,9 @@ static void do_vapic_enable(void *data)
>  VAPICROMState *s = data;
>  X86CPU *cpu = X86_CPU(first_cpu);
>  
> -vapic_enable(s, cpu);
> +/* Do not synchronize with APIC, because it was not loaded yet.
> +   Just call the enable function which does not have synchronization. */
> +vapic_enable_post_load(s, cpu);
>  }
>  
>  static int vapic_post_load(void *opaque, int version_id)
> diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
> index ce3d903..8d672bd 100644
> --- a/hw/intc/apic_common.c
> +++ b/hw/intc/apic_common.c
> @@ -345,6 +345,39 @@ static int apic_dispatch_post_load(void *opaque, int 
> version_id)
>  return 0;
>  }
>  
> +static bool apic_common_sipi_needed(void *opaque)
> +{
> +APICCommonState *s = APIC_COMMON(opaque);
> +return s->wait_for_sipi != 0;
> +}
> +
> +static const VMStateDescription vmstate_apic_common_sipi = {
> +.name = "apic_sipi",
> +.version_id = 1,
> +.minimum_version_id = 1,
> +.fields = (VMStateField[]) {
> +VMSTATE_INT32(sipi_vector, APICCommonState),
> +VMSTATE_INT32(wait_for_sipi, APICCommonState),
> +VMSTATE_END_OF_LIST()
> +}
> +};
> +
> +static bool apic_common_vapic_paddr_needed(void *opaque)
> +{
> +APICCommonState *s = APIC_COMMON(opaque);
> +return s->vapic_paddr != 0;
> +}
> +
> +static const VMStateDescription vmstate_apic_common_vapic_paddr = {
> +.name = "apic_vapic_paddr",
> +.version_id = 1,
> +.minimum_version_id = 1,
> +.fields = (VMStateField[]) {
> +VMSTATE_UINT64(vapic_paddr, APICCommonState),
> +VMSTATE_END_OF_LIST()
> +}
> +};
> +
>  static const VMStateDescription vmstate_apic_common = {
>  .name = "apic",
>  .version_id = 3,
> @@ -375,6 +408,17 @@ static const VMStateDescription vmstate_apic_common = {
>  VMSTATE_INT64(timer_expiry,
>APICCommonState), /* open-coded timer state */
>  VMSTATE_END_OF_LIST()
> +},
> +.subsections = (VMStateSubsection[]) {
> +{
> +.vmsd = &vmstate_apic_common_sipi,
> +.needed = apic_common_sipi_needed,
> +},
> +{
> +.vmsd = &vmstate_apic_common_vapic_paddr,
> +.needed = apic_common_vapic_paddr_needed,
> +},
> +VMSTATE_END_OF_LIST()
>  }
>  };
>  
> diff --git a/include/hw/i386/apic_internal.h b/include/hw/i386/apic_internal.h
> index 83e2a42..df4381c 100644
> --- a/include/hw/i386/apic_internal.h
> +++ b/include/hw/i386/apic_internal.h
> @@ -124,7 +124,7 @@ struct APICCommonState {
>  
>  uint32_t vapic_control;
>  DeviceState *vapic;
> -hwaddr vapic_paddr; /* note: persistence via kvmvapic */
> +hwaddr vapic_paddr;
>  };
>  
>  typedef struct VAPICState {
> 
> 
>

Re: [Qemu-devel] [PATCH v5 0/8] modify boot order of guest, and take effect after rebooting

2014-08-26 Thread Gerd Hoffmann

On Di, 2014-08-26 at 09:07 +, Gonglei (Arei) wrote:
> Hi, Gerd
> 
>  Nice to meet you again in maillist. :)
> 
> > -Original Message-
> > From: Gerd Hoffmann [mailto:kra...@redhat.com]
> > Sent: Tuesday, August 26, 2014 2:36 PM
> > Subject: Re: [PATCH v5 0/8] modify boot order of guest, and take effect 
> > after
> > rebooting
> > 
> > > The patchsets add one qmp interface, and add an fw_cfg_machine_reset()
> > > to achieve it.
> > 
> > > (qemu) set-bootindex ide0-0-1 1
> > > The bootindex 1 has already been used
> > 
> > What happened to the idea to use qom-set instead?  I liked that
> > suggestion.  Solves the suffix issue in a nice way.
> > 
> I have discussed with Makus about qom-set in pervious confabulation.
> The main problem is that qom-set's function is simple, which just change
> a device's property value, but not can do any other logic. In my case,
> I should change global fw_boot_orde for devices's bootindex taking effect.

Two options (also mentioned in the thread):

  (1) Set/update bootindex on reset instead of realize/init.
  (2) Switch the property from qdev to qom, then use the set
  callback to also update the fw_cfg file.

cheers,
  Gerd

Re: [Qemu-devel] [PATCH v2 2/2] pci: add check for pcie root ports and downstream ports

2014-08-26 Thread Gonglei (Arei)

> From: Michael S. Tsirkin [mailto:m...@redhat.com]
> Subject: Re: [PATCH v2 2/2] pci: add check for pcie root ports and downstream
> ports
> 
> On Thu, Aug 21, 2014 at 05:47:46PM +0800, arei.gong...@huawei.com wrote:
> > From: Gonglei 
> >
> > If ARI Forwarding is disabled, according to PCIe spec
> > section 7.3.1, only slot 0 with the device attached to
> > logic bus representing the link from downstream
> > ports and root ports.
> >
> > So, adding check for PCIe downstream ports and root ports,
> > which avoid useless operation, both hotplug and coldplug.
> >
> > Signed-off-by: Gonglei 
> > ---
> >  hw/pci/pci.c | 51
> +++
> >  1 file changed, 51 insertions(+)
> >
> > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > index daeaeac..aa0af0c 100644
> > --- a/hw/pci/pci.c
> > +++ b/hw/pci/pci.c
> > @@ -773,6 +773,52 @@ static int pci_init_multifunction(PCIBus *bus,
> PCIDevice *dev)
> >  return 0;
> >  }
> >
> > +static int pci_check_pcie_port(PCIBus *bus, PCIDevice *dev)
> > +{
> > +Object *obj = OBJECT(bus);
> > +
> > +if (pci_bus_is_root(bus)) {
> > +return 0;
> > +}
> > +
> > +if (object_dynamic_cast(obj, TYPE_PCIE_BUS)) {
> > +DeviceState *parent = qbus_get_parent(BUS(obj));
> > +PCIDevice *pci_dev = PCI_DEVICE(parent);
> > +uint8_t port_type;
> > +/*
> > + * Root ports and downstream ports of switches are the hot
> > + * pluggable ports in a PCI Express hierarchy.
> > + * PCI Express supports chip-to-chip interconnect, a PCIe link can
> > + * only connect one pci device/Switch/EndPoint or PCI-bridge.
> > + *
> > + * 7.3. Configuration Transaction Rules (PCI Express specification
> 3.0)
> > + * 7.3.1. Device Number
> > + *
> > + * Downstream Ports that do not have ARI Forwarding enabled
> must
> > + * associate only Device 0 with the device attached to the Logical
> Bus
> > + * representing the Link from the Port.
> > + *
> > + * If ARI Forwarding is not enabled on root ports and downstream
> > + * ports, only support the devices with slot non-0, regardless of
> > + * hotplug or coldplug.
> > + */
> > +port_type = pcie_cap_get_type(pci_dev);
> > +if (port_type == PCI_EXP_TYPE_DOWNSTREAM ||
> > +port_type == PCI_EXP_TYPE_ROOT_PORT) {
> > +if (!pcie_cap_is_ari_enabled(pci_dev)) {
> 
> Won't this mean cold-plugging devices is broken?

Yes. If root ports and downstream ports' ARP forwarding is disabled.
The device with 'slot > 0' will not recognized by guest os.
 
> I guess you could check for ARI capability instead.
> 
You mean that we check the PCIe devices with ARI capability,
but not root ports and downstream ports? If switch ports don't enable 
ARI Forwarding, we should not permit a device with 'slot > 0'. 
Of course, you said the scenario is also should be checked too.

Maybe I can add a if condition like:

If (!pcie_cap_is_arifwd_enabled(pci_dev) || !pcie_device_ari_cap_set(dev)) {
   error_report();
   return;
}

Any comments? Thanks!

Best regards,
-Gonglei

> > +if (PCI_SLOT(dev->devfn) != 0) {
> > +error_report("PCIe: Port's ARI Forwarding is
> disabled, "
> > + "device can't be populated in
> slot %d",
> > + PCI_SLOT(dev->devfn));
> > +return -1;
> > +}
> > +}
> > +}
> > +}
> > +
> > +return 0;
> > +}
> > +
> >  static void pci_config_alloc(PCIDevice *pci_dev)
> >  {
> >  int config_size = pci_config_size(pci_dev);
> > @@ -827,6 +873,11 @@ static PCIDevice
> *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
> >
> >  pci_dev->bus = bus;
> >  pci_dev->devfn = devfn;
> > +
> > +if (pci_check_pcie_port(bus, pci_dev)) {
> > +return NULL;
> > +}
> > +
> >  dma_as = pci_device_iommu_address_space(pci_dev);
> >
> >  memory_region_init_alias(&pci_dev->bus_master_enable_region,
> > --
> > 1.7.12.4
> >

Re: [Qemu-devel] [PATCH 05/12] serial: fixing vmstate for save/restore

2014-08-26 Thread Paolo Bonzini

Il 26/08/2014 09:14, Pavel Dovgalyuk ha scritto:
> +static int serial_pre_load(void *opaque)
> +{
> +SerialState *s = (SerialState *)opaque;
> +s->thr_ipending = -1;
> +timer_del(s->fifo_timeout_timer);
> +s->timeout_ipending = 0;
> +s->poll_msl = -1;
> +timer_del(s->modem_status_poll);
> +return 0;
> +}

The two timer_dels and s->timeout_ipending = 0 should be in the reset
function rather than here.  The two assignments of -1 are correct here.

Thanks,

Paolo

Re: [Qemu-devel] [RFC PATCH 2/2] VFIO: Clear stale MSIx table during EEH reset

2014-08-26 Thread Alexey Kardashevskiy

On 08/20/2014 07:52 PM, Gavin Shan wrote:
> The PCI device MSIx table is cleaned out in hardware after EEH PE
> reset. However, we still hold the stale MSIx entries in QEMU, which
> should be cleared accordingly. Otherwise, we will run into another
> (recursive) EEH error and the PCI devices contained in the PE have
> to be offlined exceptionally.
> 
> The patch clears stale MSIx table before EEH PE reset so that MSIx
> table could be restored properly after EEH PE reset.
> 
> Signed-off-by: Gavin Shan 
> ---
>  hw/misc/vfio.c | 24 
>  1 file changed, 24 insertions(+)
> 
> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
> index 1a3e7eb..3cf7f02 100644
> --- a/hw/misc/vfio.c
> +++ b/hw/misc/vfio.c
> @@ -4424,6 +4424,8 @@ int vfio_container_ioctl(AddressSpace *as, int32_t 
> groupid,
>  {
>  VFIOGroup *group;
>  VFIOContainer *container;
> +VFIODevice *vdev;
> +struct vfio_eeh_pe_op *arg;
>  int ret = -1;
>  
>  group = vfio_get_group(groupid, as);
> @@ -4442,7 +,29 @@ int vfio_container_ioctl(AddressSpace *as, int32_t 
> groupid,
>  switch (req) {
>  case VFIO_CHECK_EXTENSION:
>  case VFIO_IOMMU_SPAPR_TCE_GET_INFO:
> +break;
>  case VFIO_EEH_PE_OP:
> +arg = (struct vfio_eeh_pe_op *)param;
> +switch (arg->op) {
> +case VFIO_EEH_PE_RESET_HOT:
> +case VFIO_EEH_PE_RESET_FUNDAMENTAL:
> +/*
> + * The MSIx table will be cleaned out by reset. We need
> + * disable it so that it can be reenabled properly. Also,
> + * the cached MSIx table should be cleared as it's not
> + * reflecting the contents in hardware.
> + */
> +QLIST_FOREACH(vdev, &group->device_list, next) {
> +if (msix_enabled(&vdev->pdev)) {
> +vfio_disable_msix(vdev);
> +}
> +
> +msix_reset(&vdev->pdev);
> +}
> +
> +break;
> +}


This has to be done in spapr_pci_vfio (but we do not have there access to
vfio_disable_msix) or in the way that x86 would benefit from too but in
this case yes, we need an opinion from Alex (Williamson)...


> +
>  break;
>  default:
>  /* Return an error on unknown requests */
> 


-- 
Alexey

Re: [Qemu-devel] [libvirt] Mentors wanted for Outreach Program for Women October 2014

2014-08-26 Thread Stefan Hajnoczi

On Tue, Aug 26, 2014 at 10:39 AM, Martin Kletzander  wrote:
> On Tue, Aug 26, 2014 at 10:33:27AM +0100, Stefan Hajnoczi wrote:
>>
>> On Mon, Aug 25, 2014 at 5:40 PM, Marina Zhurakhinskaya
>>  wrote:
>>>
>>> - Original Message -

 From: "Stefan Hajnoczi" 
 To: "Martin Kletzander" 
 Cc: "qemu-devel" , libvir-l...@redhat.com, "kvm"
 , "Marina
 Zhurakhinskaya" 
 Sent: Monday, August 25, 2014 12:29:27 PM
 Subject: Re: [libvirt] Mentors wanted for Outreach Program for Women
 October 2014

 On Mon, Aug 25, 2014 at 11:52 AM, Martin Kletzander
 
 wrote:
 > On Thu, Aug 21, 2014 at 09:06:39PM +0100, Stefan Hajnoczi wrote:
 >> Regular code contributors to QEMU, KVM, and libvirt are eligible to
 >> participate as mentors.
 >>
 >> We also need project ideas that are achievable in 12 weeks by someone
 >> skilled in programming but not necessarily familiar with open source
 >> or our codebase.  Ideas welcome!
 >>
 >
 > It's just a matter of ideas.  Maybe we could revisit some of those we
 > had for GSoC.  If I'm reading the deadline for project ideas is
 > October 22nd, so I think we'll definitely come up with something.
>>>
>>>
>>> Thank you for your interest in helping revisit GSoC ideas and come up
>>> with new ones! October 22 is an application deadline for prospective
>>> interns. QEMU would need to have some project ideas listed by September 8,
>>> though you can add more ideas through September. The timeline for the
>>> program is at
>>> https://wiki.gnome.org/OutreachProgramForWomen/2014/DecemberMarch You don't
>>> need very many ideas, as you are likely to only have at most 2-3
>>> participants.
>>>
>>> We don't yet have any funding confirmed for QEMU, but Stefan and I will
>>> be working on this. If your organization might be able to sponsor QEMU
>>> internships in OPW, please contact me and Stefan off-list. You can learn
>>> more at https://wiki.gnome.org/OutreachProgramForWomen/Admin/InfoForOrgs
>>
>>
>> Thanks for posting the information Marina.
>>
>> There is now a wiki page for project ideas and OPW information:
>> http://qemu-project.org/Outreach_Program_for_Women_2014_DecemberMarch
>>
>
> I guess you meant the following link instead:
> http://wiki.qemu.org/Outreach_Program_for_Women_2014_DecemberMarch

No.  qemu-project.org is the preferred domain.  wiki.qemu.org works
too and points to the same page.

Stefan

Re: [Qemu-devel] [RFC PATCH 1/2] VFIO: Drop vfio_container_do_ioctl()

2014-08-26 Thread Alexey Kardashevskiy

On 08/20/2014 07:52 PM, Gavin Shan wrote:
> The patch drops vfio_container_do_ioctl() and merges its logic to
> parent function call vfio_container_ioctl() so that the subsequent
> patches can reused the found VFIO group in vfio_container_ioctl().
> 
> Signed-off-by: Gavin Shan 
> ---
>  hw/misc/vfio.c | 33 +++--
>  1 file changed, 15 insertions(+), 18 deletions(-)
> 
> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
> index 7d5f1bb..1a3e7eb 100644
> --- a/hw/misc/vfio.c
> +++ b/hw/misc/vfio.c
> @@ -4419,8 +4419,8 @@ static void register_vfio_pci_dev_type(void)
>  
>  type_init(register_vfio_pci_dev_type)
>  
> -static int vfio_container_do_ioctl(AddressSpace *as, int32_t groupid,
> -   int req, void *param)
> +int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
> + int req, void *param)
>  {
>  VFIOGroup *group;
>  VFIOContainer *container;
> @@ -4433,22 +4433,11 @@ static int vfio_container_do_ioctl(AddressSpace *as, 
> int32_t groupid,
>  }
>  
>  container = group->container;
> -if (group->container) {
> -ret = ioctl(container->fd, req, param);
> -if (ret < 0) {
> -error_report("vfio: failed to ioctl container: ret=%d, %s",
> - ret, strerror(errno));
> -}
> +if (!container) {
> +error_report("vfio: no container for group %d\n", groupid);
> +goto out;
>  }
>  
> -vfio_put_group(group);
> -
> -return ret;
> -}
> -
> -int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
> - int req, void *param)
> -{
>  /* We allow only certain ioctls to the container */
>  switch (req) {
>  case VFIO_CHECK_EXTENSION:
> @@ -4458,8 +4447,16 @@ int vfio_container_ioctl(AddressSpace *as, int32_t 
> groupid,
>  default:
>  /* Return an error on unknown requests */
>  error_report("vfio: unsupported ioctl %X", req);
> -return -1;
> +goto out;
>  }
>  
> -return vfio_container_do_ioctl(as, groupid, req, param);
> +ret = ioctl(container->fd, req, param);
> +if (ret < 0) {
> +error_report("vfio: failed to ioctl container: ret=%d, %s",
> + ret, strerror(errno));
> +}
> +
> +out:
> +vfio_put_group(group);
> +return ret;


Nack. We specifically separated an ioctl-filtering function and actual code
which does ioctl. I understand that you want to save on vfio_get_group()
calls for the next EEH patch but this is not a hot path anyway.




>  }
> 


-- 
Alexey

Re: [Qemu-devel] [libvirt] Mentors wanted for Outreach Program for Women October 2014

2014-08-26 Thread Martin Kletzander

On Tue, Aug 26, 2014 at 11:27:41AM +0100, Stefan Hajnoczi wrote:

On Tue, Aug 26, 2014 at 10:39 AM, Martin Kletzander  wrote:

On Tue, Aug 26, 2014 at 10:33:27AM +0100, Stefan Hajnoczi wrote:

On Mon, Aug 25, 2014 at 5:40 PM, Marina Zhurakhinskaya
 wrote:

- Original Message -

From: "Stefan Hajnoczi" 
To: "Martin Kletzander" 
Cc: "qemu-devel" , libvir-l...@redhat.com, "kvm"
, "Marina
Zhurakhinskaya" 
Sent: Monday, August 25, 2014 12:29:27 PM
Subject: Re: [libvirt] Mentors wanted for Outreach Program for Women
October 2014

On Mon, Aug 25, 2014 at 11:52 AM, Martin Kletzander

wrote:
> On Thu, Aug 21, 2014 at 09:06:39PM +0100, Stefan Hajnoczi wrote:
>> Regular code contributors to QEMU, KVM, and libvirt are eligible to
>> participate as mentors.
>>
>> We also need project ideas that are achievable in 12 weeks by someone
>> skilled in programming but not necessarily familiar with open source
>> or our codebase.  Ideas welcome!
>>
>
> It's just a matter of ideas.  Maybe we could revisit some of those we
> had for GSoC.  If I'm reading the deadline for project ideas is
> October 22nd, so I think we'll definitely come up with something.

Thank you for your interest in helping revisit GSoC ideas and come up
with new ones! October 22 is an application deadline for prospective
interns. QEMU would need to have some project ideas listed by September 8,
though you can add more ideas through September. The timeline for the
program is at
https://wiki.gnome.org/OutreachProgramForWomen/2014/DecemberMarch You don't
need very many ideas, as you are likely to only have at most 2-3
participants.

We don't yet have any funding confirmed for QEMU, but Stefan and I will
be working on this. If your organization might be able to sponsor QEMU
internships in OPW, please contact me and Stefan off-list. You can learn
more at https://wiki.gnome.org/OutreachProgramForWomen/Admin/InfoForOrgs

Thanks for posting the information Marina.

There is now a wiki page for project ideas and OPW information:
http://qemu-project.org/Outreach_Program_for_Women_2014_DecemberMarch

I guess you meant the following link instead:
http://wiki.qemu.org/Outreach_Program_for_Women_2014_DecemberMarch

No.  qemu-project.org is the preferred domain.  wiki.qemu.org works
too and points to the same page.

Oh, sorry then, but I got non-existent page with qemu-project, so I
thought it's different.  I probably loaded wiki.qemu right after you
saved it and that's why it looked like they are two different
domains.

Martin

signature.asc
Description: Digital signature

Re: [Qemu-devel] [PATCH v12 0/6] qcow2, raw: add preallocation=full and preallocation=falloc

2014-08-26 Thread Stefan Hajnoczi

On Mon, Aug 25, 2014 at 09:35:15AM +0800, Hu Tao wrote:
> On Fri, Aug 22, 2014 at 12:54:29PM +0200, Kevin Wolf wrote:
> > Am 28.07.2014 um 10:48 hat Hu Tao geschrieben:
> > > ping...
> > > 
> > > All the 6 patches have reviewed-by now.
> > 
> > Looks mostly good to me, I have only a few minor comments that wouldn't
> > block inclusion but could be addressed in follow-up patches.
> > 
> > However, you have a dependency on a patch series from Max (you use
> > minimal_blob_size()), which hasn't been reviewed and merged yet, so your
> > series is blocked on that.
> > 
> > If you want to get your series merged quicker, you could replace this
> > with a rough estimate that excludes the clusters used by refcount table
> > and blocks. If full preallocation isn't really full, but only
> > preallocation of 99.9%, that's probably good enough in practice.
> 
> How about my calculation in v10?
> https://lists.nongnu.org/archive/html/qemu-devel/2014-06/msg02844.html
> 
> It doesn't depend on minimal_blob_size(). In most cases the calculated
> size is the same as the size before this patch. I have no test results
> on hand, but if you need I can do it.

Kevin is on vacation this week.

I think the previous calculation could work.

Stefan


pgpQCSH8oC9Im.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH 2/2] ide: Fix bootindex for bus_id > 9

2014-08-26 Thread Stefan Hajnoczi

On Fri, Aug 15, 2014 at 01:32:37PM +0200, Markus Armbruster wrote:
> We identify devices by their Open Firmware device paths.  The encoding
> of bus numbers is incorrect: idebus_get_fw_dev_path() formats them in
> decimal, while SeaBIOS uses hexadecimal.  With bus number > 9, SeaBIOS
> will miss the bootindex (lucky case), or apply it to another device
> (unlucky case).
> 
> Bug can't bite right now: ich9-ahci has six ports, and the sysbus-ahci
> created by Calxeda Highbank has just one.
> 
> Fix it anyway, by changing %d to %x.
> 
> I couldn't find an Open Firmware spec covering this.  For what it's
> worth, OVMF agrees with SeaBIOS.
> 
> Signed-off-by: Markus Armbruster 
> ---
>  hw/ide/qdev.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan


pgpZx6zcCfRSo.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH] block: Always compile virtio-blk dataplane

2014-08-26 Thread Paolo Bonzini

Il 26/08/2014 08:57, Fam Zheng ha scritto:
> Dataplane doesn't depend on linux-aio any more, so we don't need the
> compiling condition now.
> 
> Configure options are kept but just print a message.
> 
> Signed-off-by: Fam Zheng 
> ---
>  configure  | 21 ++---
>  hw/block/Makefile.objs |  2 +-
>  hw/block/virtio-blk.c  | 22 ++
>  hw/virtio/Makefile.objs|  2 +-
>  include/hw/virtio/virtio-blk.h |  2 --
>  5 files changed, 6 insertions(+), 43 deletions(-)
> 
> diff --git a/configure b/configure
> index 2063cf6..68cb4d2 100755
> --- a/configure
> +++ b/configure
> @@ -327,7 +327,6 @@ glusterfs=""
>  glusterfs_discard="no"
>  glusterfs_zerofill="no"
>  archipelago=""
> -virtio_blk_data_plane=""
>  gtk=""
>  gtkabi=""
>  vte=""
> @@ -1092,9 +1091,8 @@ for opt do
>;;
>--enable-archipelago) archipelago="yes"
>;;
> -  --disable-virtio-blk-data-plane) virtio_blk_data_plane="no"
> -  ;;
> -  --enable-virtio-blk-data-plane) virtio_blk_data_plane="yes"
> +  --disable-virtio-blk-data-plane|--enable-virtio-blk-data-plane)
> +  echo "$0: $opt is obsolete, virtio-blk data-plane is always on" >&2
>;;
>--disable-gtk) gtk="no"
>;;
> @@ -2936,16 +2934,6 @@ else
>  fi
>  
>  ##
> -# adjust virtio-blk-data-plane based on linux-aio
> -
> -if test "$virtio_blk_data_plane" = "yes" -a \
> - "$linux_aio" != "yes" ; then
> -  error_exit "virtio-blk-data-plane requires Linux AIO, please try 
> --enable-linux-aio"
> -elif test -z "$virtio_blk_data_plane" ; then
> -  virtio_blk_data_plane=$linux_aio
> -fi
> -
> -##
>  # attr probe
>  
>  if test "$attr" != "no" ; then
> @@ -4319,7 +4307,6 @@ echo "coroutine backend $coroutine"
>  echo "coroutine pool$coroutine_pool"
>  echo "GlusterFS support $glusterfs"
>  echo "Archipelago support $archipelago"
> -echo "virtio-blk-data-plane $virtio_blk_data_plane"
>  echo "gcov  $gcov_tool"
>  echo "gcov enabled  $gcov"
>  echo "TPM support   $tpm"
> @@ -4778,10 +4765,6 @@ if test "$quorum" = "yes" ; then
>echo "CONFIG_QUORUM=y" >> $config_host_mak
>  fi
>  
> -if test "$virtio_blk_data_plane" = "yes" ; then
> -  echo 'CONFIG_VIRTIO_BLK_DATA_PLANE=$(CONFIG_VIRTIO)' >> $config_host_mak
> -fi
> -
>  if test "$vhdx" = "yes" ; then
>echo "CONFIG_VHDX=y" >> $config_host_mak
>  fi
> diff --git a/hw/block/Makefile.objs b/hw/block/Makefile.objs
> index bf46f03..d4c3ab7 100644
> --- a/hw/block/Makefile.objs
> +++ b/hw/block/Makefile.objs
> @@ -12,4 +12,4 @@ common-obj-$(CONFIG_NVME_PCI) += nvme.o
>  obj-$(CONFIG_SH4) += tc58128.o
>  
>  obj-$(CONFIG_VIRTIO) += virtio-blk.o
> -obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/
> +obj-$(CONFIG_VIRTIO) += dataplane/
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index d9167ce..643eb2f 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -18,10 +18,8 @@
>  #include "hw/block/block.h"
>  #include "sysemu/blockdev.h"
>  #include "hw/virtio/virtio-blk.h"
> -#ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
> -# include "dataplane/virtio-blk.h"
> -# include "migration/migration.h"
> -#endif
> +#include "dataplane/virtio-blk.h"
> +#include "migration/migration.h"
>  #include "block/scsi.h"
>  #ifdef __linux__
>  # include 
> @@ -432,7 +430,6 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, 
> VirtQueue *vq)
>  .num_writes = 0,
>  };
>  
> -#ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
>  /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
>   * dataplane here instead of waiting for .set_status().
>   */
> @@ -440,7 +437,6 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, 
> VirtQueue *vq)
>  virtio_blk_data_plane_start(s->dataplane);
>  return;
>  }
> -#endif
>  
>  while ((req = virtio_blk_get_request(s))) {
>  virtio_blk_handle_request(req, &mrb);
> @@ -497,11 +493,9 @@ static void virtio_blk_reset(VirtIODevice *vdev)
>  {
>  VirtIOBlock *s = VIRTIO_BLK(vdev);
>  
> -#ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
>  if (s->dataplane) {
>  virtio_blk_data_plane_stop(s->dataplane);
>  }
> -#endif
>  
>  /*
>   * This should cancel pending requests, but can't do nicely until there
> @@ -591,12 +585,10 @@ static void virtio_blk_set_status(VirtIODevice *vdev, 
> uint8_t status)
>  VirtIOBlock *s = VIRTIO_BLK(vdev);
>  uint32_t features;
>  
> -#ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
>  if (s->dataplane && !(status & (VIRTIO_CONFIG_S_DRIVER |
>  VIRTIO_CONFIG_S_DRIVER_OK))) {
>  virtio_blk_data_plane_stop(s->dataplane);
>  }
> -#endif
>  
>  if (!(status & VIRTIO_CONFIG_S_DRIVER_OK)) {
>  return;
> @@ -691,7 +683,6 @@ static const BlockDevOps virtio_block_ops = {
>  .resize_cb = virtio_blk_resize,
>  };
>  
> -#ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
>  /* Disable da

Re: [Qemu-devel] [PATCH v3 2/2] docs: update ivshmem device spec

2014-08-26 Thread Paolo Bonzini

Il 26/08/2014 08:47, David Marchand ha scritto:
> 
> Using a version message supposes we want to keep ivshmem-server and QEMU
> separated (for example, in two distribution packages) while we can avoid
> this, so why would we do so ?
> 
> If we want the ivshmem-server to come with QEMU, then both are supposed
> to be aligned on your system.

What about upgrading QEMU and ivshmem-server while you have existing
guests?  You cannot restart ivshmem-server, and the new QEMU would have
to talk to the old ivshmem-server.

Paolo

> If you want to test local modifications, then it means you know what you
> are doing and you will call the right ivshmem-server binary with the
> right QEMU binary.

Re: [Qemu-devel] [PATCH 01/12] spapr: populate DRC entries for root dt node

2014-08-26 Thread Alexander Graf



On 19.08.14 02:21, Michael Roth wrote:
> From: Nathan Fontenot 
> 
> This add entries to the root OF node to advertise our PHBs as being
> DR-capable in according with PAPR specification.
> 
> Each PHB is given a name of PHB, advertised as a PHB type,
> and associated with a power domain of -1 (indicating to guests that
> power management is handled automatically by hardware).
> 
> We currently allocate entries for up to 32 DR-capable PHBs, though
> this limit can be increased later.
> 
> DrcEntry objects to track the state of the DR-connector associated
> with each PHB are stored in a 32-entry array, and each DrcEntry has
> in turn have a dynamically-sized number of child DR-connectors,
> which we will use later to track the state of DR-connectors
> associated with a PHB's physical slots.
> 
> Signed-off-by: Nathan Fontenot 
> Signed-off-by: Michael Roth 
> ---
>  hw/ppc/spapr.c | 143 
> +
>  hw/ppc/spapr_pci.c |   1 +
>  include/hw/ppc/spapr.h |  35 
>  3 files changed, 179 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 5c92707..d5e46c3 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -296,6 +296,143 @@ static hwaddr spapr_node0_size(void)
>  return ram_size;
>  }
>  
> +sPAPRDrcEntry *spapr_phb_to_drc_entry(uint64_t buid)
> +{
> +int i;
> +
> +for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +if (spapr->drc_table[i].phb_buid == buid) {
> +return &spapr->drc_table[i];
> +}
> + }
> +
> + return NULL;
> +}
> +
> +static void spapr_init_drc_table(void)
> +{
> +int i;
> +
> +memset(spapr->drc_table, 0, sizeof(spapr->drc_table));
> +
> +/* For now we only care about PHB entries */
> +for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +spapr->drc_table[i].drc_index = 0x201 + i;

magic number?

> +}
> +}
> +
> +sPAPRDrcEntry *spapr_add_phb_to_drc_table(uint64_t buid, uint32_t state)
> +{
> +sPAPRDrcEntry *empty_drc = NULL;
> +sPAPRDrcEntry *found_drc = NULL;
> +int i, phb_index;
> +
> +for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +if (spapr->drc_table[i].phb_buid == 0) {
> +empty_drc = &spapr->drc_table[i];
> +}
> +
> +if (spapr->drc_table[i].phb_buid == buid) {
> +found_drc = &spapr->drc_table[i];
> +break;
> +}
> +}
> +
> +if (found_drc) {
> +return found_drc;
> +}
> +
> +if (empty_drc) {
> +empty_drc->phb_buid = buid;
> +empty_drc->state = state;
> +empty_drc->cc_state.fdt = NULL;
> +empty_drc->cc_state.offset = 0;
> +empty_drc->cc_state.depth = 0;
> +empty_drc->cc_state.state = CC_STATE_IDLE;
> +empty_drc->child_entries =
> +g_malloc0(sizeof(sPAPRDrcEntry) * SPAPR_DRC_PHB_SLOT_MAX);
> +phb_index = buid - SPAPR_PCI_BASE_BUID;
> +for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +empty_drc->child_entries[i].drc_index =
> +SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + (i << 3);
> +}
> +return empty_drc;
> +}
> +
> +return NULL;
> +}
> +
> +static void spapr_create_drc_dt_entries(void *fdt)
> +{
> +char char_buf[1024];
> +uint32_t int_buf[SPAPR_DRC_TABLE_SIZE + 1];
> +uint32_t *entries;
> +int offset, fdt_offset;
> +int i, ret;
> +
> +fdt_offset = fdt_path_offset(fdt, "/");
> +
> +/* ibm,drc-indexes */
> +memset(int_buf, 0, sizeof(int_buf));
> +int_buf[0] = SPAPR_DRC_TABLE_SIZE;
> +
> +for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
> +int_buf[i] = spapr->drc_table[i-1].drc_index;

Not endian safe.

> +}
> +
> +ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-indexes", int_buf,
> +  sizeof(int_buf));
> +if (ret) {
> +fprintf(stderr, "Couldn't finalize ibm,drc-indexes property\n");
> +}
> +
> +/* ibm,drc-power-domains */
> +memset(int_buf, 0, sizeof(int_buf));
> +int_buf[0] = SPAPR_DRC_TABLE_SIZE;

Not endian safe.

> +
> +for (i = 1; i <= SPAPR_DRC_TABLE_SIZE; i++) {
> +int_buf[i] = 0x;
> +}

memset(-1) instead above?

> +
> +ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-power-domains", int_buf,
> +  sizeof(int_buf));
> +if (ret) {
> +fprintf(stderr, "Couldn't finalize ibm,drc-power-domains 
> property\n");
> +}
> +
> +/* ibm,drc-names */
> +memset(char_buf, 0, sizeof(char_buf));
> +entries = (uint32_t *)&char_buf[0];
> +*entries = SPAPR_DRC_TABLE_SIZE;

Not endian safe. I guess you get the idea. I'll stop looking for endian
problems here :).

> +offset = sizeof(*entries);
> +
> +for (i = 0; i < SPAPR_DRC_TABLE_SIZE; i++) {
> +offset += sprintf(char_buf + offset, "PHB %d", i + 1);
> +char_buf[offset++] = '\0';
> +}
> +
> +ret = fdt_setprop(fdt, fdt_offset, "ibm,drc-names", char_buf, offset);

Re: [Qemu-devel] [PATCH] checkpatch.pl: adjust typedef definition to QEMU coding style

2014-08-26 Thread Markus Armbruster

Paolo Bonzini  writes:

> Most QEMU typedefs are camelcase, starting with one uppercase letter
> and containing at least one lowercase letter.  There are a few
> all-uppercase types, add the most common too.
>
> This fixes recognition of types in lines such as
>
> static __attribute__((unused)) inline void tcg_out8(TCGContext *s, 
> uint8_t v)
>
> (Example provided by Peter Maydell).
>
> Reported-by: Alexey Kardashevskiy 
> Cc: Peter Maydell 
> Cc: Stefan Weil 
> Cc: Markus Armbruster 
> Signed-off-by: Paolo Bonzini 
> ---
>  scripts/checkpatch.pl | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 9d46e5a..053e432 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -206,9 +206,13 @@ our $UTF8= qr {
>   |  \xF4[\x80-\x8F][\x80-\xBF]{2} # plane 16
>  }x;
>  
> +# There are still some false positives, but this catches most
> +# common cases.
>  our $typeTypedefs = qr{(?x:
> - (?:__)?(?:u|s|be|le)(?:8|16|32|64)|
> - atomic_t
> +[A-Z][A-Z\d_]*[a-z][A-Za-z\d_]* # camelcase
> +| [A-Z][A-Z\d_]*AIOCB   # all uppercase
> +| [A-Z][A-Z\d_]*CPU # all uppercase
> +| QEMUBH# all uppercase
>  )};
>  
>  our $logFunctions = qr{(?x:

I had to look up \d, and then I got scared until I remembered "Perl will
not use locales unless specifically requested to".  Applies both to \d
and A-Z.  Consistent with existing usage, except for the position of
'|'.

Reviewed-by: Markus Armbruster

Re: [Qemu-devel] [PATCH] virtio-scsi: Report error if num_queues is 0 or too large

2014-08-26 Thread Paolo Bonzini

Il 26/08/2014 08:30, Fam Zheng ha scritto:
> No cmd vq surprises guest (Linux panics in virtscsi_probe), too many
> queues abort qemu (in the following virtio_add_queue).
> 
> Signed-off-by: Fam Zheng 
> ---
>  hw/scsi/virtio-scsi.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
> index 2dd9255..86aba88 100644
> --- a/hw/scsi/virtio-scsi.c
> +++ b/hw/scsi/virtio-scsi.c
> @@ -699,6 +699,12 @@ void virtio_scsi_common_realize(DeviceState *dev, Error 
> **errp,
>  virtio_init(vdev, "virtio-scsi", VIRTIO_ID_SCSI,
>  sizeof(VirtIOSCSIConfig));
>  
> +if (s->conf.num_queues <= 0 || s->conf.num_queues > 
> VIRTIO_PCI_QUEUE_MAX) {
> +error_setg(errp, "Invalid number of queues (= %" PRId32 "), "
> + "must be a positive integer less than %d.",
> +   s->conf.num_queues, VIRTIO_PCI_QUEUE_MAX);
> +return;
> +}
>  s->cmd_vqs = g_malloc0(s->conf.num_queues * sizeof(VirtQueue *));
>  s->sense_size = VIRTIO_SCSI_SENSE_SIZE;
>  s->cdb_size = VIRTIO_SCSI_CDB_SIZE;
> 

Thanks, applying to scsi-next.

Paolo

Re: [Qemu-devel] [PATCH v5 0/8] modify boot order of guest, and take effect after rebooting

2014-08-26 Thread Markus Armbruster

Gerd Hoffmann  writes:

> On Di, 2014-08-26 at 09:07 +, Gonglei (Arei) wrote:
>> Hi, Gerd
>> 
>>  Nice to meet you again in maillist. :)
>> 
>> > -Original Message-
>> > From: Gerd Hoffmann [mailto:kra...@redhat.com]
>> > Sent: Tuesday, August 26, 2014 2:36 PM
>> > Subject: Re: [PATCH v5 0/8] modify boot order of guest, and take effect 
>> > after
>> > rebooting
>> > 
>> > > The patchsets add one qmp interface, and add an fw_cfg_machine_reset()
>> > > to achieve it.
>> > 
>> > > (qemu) set-bootindex ide0-0-1 1
>> > > The bootindex 1 has already been used
>> > 
>> > What happened to the idea to use qom-set instead?  I liked that
>> > suggestion.  Solves the suffix issue in a nice way.
>> > 
>> I have discussed with Makus about qom-set in pervious confabulation.
>> The main problem is that qom-set's function is simple, which just change
>> a device's property value, but not can do any other logic. In my case,
>> I should change global fw_boot_orde for devices's bootindex taking effect.
>
> Two options (also mentioned in the thread):
>
>   (1) Set/update bootindex on reset instead of realize/init.
>   (2) Switch the property from qdev to qom, then use the set
>   callback to also update the fw_cfg file.

Yes, please.  Even if it should make the implementation a bit more
complex.  Avoiding new ways to name things in external interfaces, such
as the suffix here, is worth some complication.

Re: [Qemu-devel] [PATCH] tests: set QEMU_AUDIO_DRV=none for pci sound cards

2014-08-26 Thread Markus Armbruster

Gerd Hoffmann  writes:

> On Di, 2014-08-05 at 16:05 +0200, Markus Armbruster wrote:
>> Ping?
>
> Back online.  What is the state here?  I've seen Stefan (Cc'ed) posted a
> different patch for the same issue?  Anything merged meanwhile?

Neither patch has been merged.  Either would do for me :)

Re: [Qemu-devel] [PATCH 02/12] spapr_pci: populate DRC dt entries for PHBs

2014-08-26 Thread Alexander Graf



On 19.08.14 02:21, Michael Roth wrote:
> Reserve 32 entries of type PCI in each PHB's initial FDT. This
> advertises to guests that each PHB is DR-capable device with
> physical hotpluggable slots. This is necessary for allowing
> hotplugging of devices to it later via bus rescan or guest rpaphp
> hotplug module.
> 
> Each entry is assigned a name of "Slot <*32 +1>",
> advertised as a hotpluggable PCI slot, and assigned to power domain
> -1 to indicate to the guest that power management is handled by the
> hardware.
> 
> This models a DR-capable PCI expansion device attached to a host/lpar
> via a single PHB with 32 physical hotpluggable slots (as opposed to a
> virtual bridge device with external management console). Hotplug will
> be handled by the guest via bus rescan or the rpaphp hotplug module.
> 
> Signed-off-by: Michael Roth 
> ---
>  hw/ppc/spapr.c  |   3 +-
>  hw/ppc/spapr_pci.c  | 102 
> 
>  include/hw/pci-host/spapr.h |   1 +
>  3 files changed, 105 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index d5e46c3..90b25b3 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -890,7 +890,8 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>  QLIST_FOREACH(phb, &spapr->phbs, list) {
>  drc_entry = spapr_phb_to_drc_entry(phb->buid);
>  g_assert(drc_entry);
> -ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, fdt);
> +ret = spapr_populate_pci_dt(phb, PHANDLE_XICP, drc_entry->drc_index,
> +fdt);
>  }
>  
>  if (ret < 0) {
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index e85134f..924d488 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -851,8 +851,104 @@ static int spapr_phb_children_dt(Object *child, void 
> *opaque)
>  return 1;
>  }
>  
> +static void spapr_create_drc_phb_dt_entries(void *fdt, int bus_off, int 
> phb_index)
> +{
> +char char_buf[1024];
> +uint32_t int_buf[SPAPR_DRC_PHB_SLOT_MAX + 1];
> +uint32_t *entries;
> +int i, ret, offset;
> +
> +/* ibm,drc-indexes */
> +memset(int_buf, 0 , sizeof(int_buf));
> +int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> +
> +for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +int_buf[i] = SPAPR_DRC_DEV_ID_BASE + (phb_index << 8) + ((i - 1) << 
> 3);

Same endianness breakage.

Please verify that your patch set works with

  1) ppc64le host and KVM
  2) x86_64 host and TCG


Alex

> +}
> +
> +ret = fdt_setprop(fdt, bus_off, "ibm,drc-indexes", int_buf,
> +  sizeof(int_buf));
> +if (ret) {
> +fprintf(stderr, "error adding 'ibm,drc-indexes' field for PHB FDT");
> +}
> +
> +/* ibm,drc-power-domains */
> +memset(int_buf, 0, sizeof(int_buf));
> +int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> +
> +for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +int_buf[i] = 0x;
> +}
> +
> +ret = fdt_setprop(fdt, bus_off, "ibm,drc-power-domains", int_buf,
> +  sizeof(int_buf));
> +if (ret) {
> +fprintf(stderr,
> +"error adding 'ibm,drc-power-domains' field for PHB FDT");
> +}
> +
> +/* ibm,drc-names */
> +memset(char_buf, 0, sizeof(char_buf));
> +entries = (uint32_t *)&char_buf[0];
> +*entries = SPAPR_DRC_PHB_SLOT_MAX;
> +offset = sizeof(*entries);
> +
> +for (i = 1; i <= SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +offset += sprintf(char_buf + offset, "Slot %d",
> +  (phb_index * SPAPR_DRC_PHB_SLOT_MAX) + i - 1);
> +char_buf[offset++] = '\0';
> +}
> +
> +ret = fdt_setprop(fdt, bus_off, "ibm,drc-names", char_buf, offset);
> +if (ret) {
> +fprintf(stderr, "error adding 'ibm,drc-names' field for PHB FDT");
> +}
> +
> +/* ibm,drc-types */
> +memset(char_buf, 0, sizeof(char_buf));
> +entries = (uint32_t *)&char_buf[0];
> +*entries = SPAPR_DRC_PHB_SLOT_MAX;
> +offset = sizeof(*entries);
> +
> +for (i = 0; i < SPAPR_DRC_PHB_SLOT_MAX; i++) {
> +offset += sprintf(char_buf + offset, "28");
> +char_buf[offset++] = '\0';
> +}
> +
> +ret = fdt_setprop(fdt, bus_off, "ibm,drc-types", char_buf, offset);
> +if (ret) {
> +fprintf(stderr, "error adding 'ibm,drc-types' field for PHB FDT");
> +}
> +
> +/* we want the initial indicator state to be 0 - "empty", when we
> + * hot-plug an adaptor in the slot, we need to set the indicator
> + * to 1 - "present."
> + */
> +
> +/* ibm,indicator-9003 */
> +memset(int_buf, 0, sizeof(int_buf));
> +int_buf[0] = SPAPR_DRC_PHB_SLOT_MAX;
> +
> +ret = fdt_setprop(fdt, bus_off, "ibm,indicator-9003", int_buf,
> +  sizeof(int_buf));
> +if (ret) {
> +fprintf(stderr, "error adding 'ibm,indicator-9003' field for PHB 
> FDT");
> +}
> +
> +/* ibm,sensor-9003 */
> +memset(int_buf, 0, s

Re: [Qemu-devel] [PULL 0/1] VFIO pull request

2014-08-26 Thread Peter Maydell

On 25 August 2014 20:39, Alex Williamson  wrote:
> The following changes since commit 3dd359c2d34c6abf385d58da863f337b39702585:
>
>   Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-2014-08-24' 
> into staging (2014-08-25 17:34:30 +0100)
>
> are available in the git repository at:
>
>
>   git://github.com/awilliam/qemu-vfio.git tags/vfio-pci-for-qemu-20140825.0
>
> for you to fetch changes up to fe08275db9b88ecf3a30c7540b894c25aec150c2:
>
>   vfio: Enable NVIDIA 88000 region quirk regardless of VGA (2014-08-25 
> 12:10:15 -0600)
>
> 
> VFIO: Enable primary NVIDIA quirk regardless of VGA support
>
> 
> Alex Williamson (1):
>   vfio: Enable NVIDIA 88000 region quirk regardless of VGA
>
>  hw/misc/vfio.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)

Applied, thanks.

-- PMM

Re: [Qemu-devel] [PATCH] Revert "virtio_rng: replace custom backend API with UserCreatable.complete() callback"

2014-08-26 Thread Markus Armbruster

Stefan Hajnoczi  writes:

> On Sat, Aug 16, 2014 at 12:23:52AM +0800, Amos Kong wrote:
>> This reverts commit 57d3e1b3f52d07d215ed96df946ee01f8d9f9526.
>> 
>> The commit introduced a regression bug, the initialization order of 
>> virtio-rng
>> backend was changed.
>> 
>>  # x86_64-softmmu/qemu-system-x86_64 -monitor stdio -vnc :0 \
>>  -chardev socket,host=localhost,port=1024,id=chr0 \
>>  -object rng-egd,chardev=chr0,id=rng0
>> 
>>  qemu-system-x86_64: -object rng-egd,chardev=chr0,id=rng0: Device 'chr0' not 
>> found
>> 
>> Chardev 'chr0' isn't initialized when we try to open rng backend,
>
> More detail:
> The problem is that vl.c:main() calls object_create() on -object before
> -chardev options are processed.  Moving the object_create() call after
> chardev is arbitrary and does not work if a chardev depends on an
> -object.
>
> It would have been nice to process command-line options in left-to-right
> order instead of grouping them by option type.  I doubt we can change
> this now since it would break the command-line.

In my private opinion, our command line could really use a thorough
breaking.

[...]

Re: [Qemu-devel] [PATCH 04/12] spapr_pci: add set-indicator RTAS interface

2014-08-26 Thread Alexander Graf



On 19.08.14 02:21, Michael Roth wrote:
> From: Mike Day 
> 
> Signed-off-by: Mike Day 
> Signed-off-by: Michael Roth 
> ---
>  hw/ppc/spapr_pci.c | 119 
> +
>  include/hw/ppc/spapr.h |   3 ++
>  2 files changed, 122 insertions(+)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 924d488..23a3477 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -36,6 +36,16 @@
>  
>  #include "hw/pci/pci_bus.h"
>  
> +/* #define DEBUG_SPAPR */
> +
> +#ifdef DEBUG_SPAPR
> +#define DPRINTF(fmt, ...) \
> +do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0)
> +#else
> +#define DPRINTF(fmt, ...) \
> +do { } while (0)
> +#endif
> +
>  /* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */
>  #define RTAS_QUERY_FN   0
>  #define RTAS_CHANGE_FN  1
> @@ -47,6 +57,31 @@
>  #define RTAS_TYPE_MSI   1
>  #define RTAS_TYPE_MSIX  2
>  
> +/* For set-indicator RTAS interface */
> +#define INDICATOR_ISOLATION_MASK0x0001   /* 9001 one bit */
> +#define INDICATOR_GLOBAL_INTERRUPT_MASK 0x0002   /* 9005 one bit */
> +#define INDICATOR_ERROR_LOG_MASK0x0004   /* 9006 one bit */
> +#define INDICATOR_IDENTIFY_MASK 0x0008   /* 9007 one bit */
> +#define INDICATOR_RESET_MASK0x0010   /* 9009 one bit */
> +#define INDICATOR_DR_MASK   0x00e0   /* 9002 three bits */
> +#define INDICATOR_ALLOCATION_MASK   0x0300   /* 9003 two bits */
> +#define INDICATOR_EPOW_MASK 0x1c00   /* 9 three bits */
> +
> +#define INDICATOR_ISOLATION_SHIFT   0x00 /* bit 0 */
> +#define INDICATOR_GLOBAL_INTERRUPT_SHIFT0x01 /* bit 1 */
> +#define INDICATOR_ERROR_LOG_SHIFT   0x02 /* bit 2 */
> +#define INDICATOR_IDENTIFY_SHIFT0x03 /* bit 3 */
> +#define INDICATOR_RESET_SHIFT   0x04 /* bit 4 */
> +#define INDICATOR_DR_SHIFT  0x05 /* bits 5-7 */
> +#define INDICATOR_ALLOCATION_SHIFT  0x08 /* bits 8-9 */
> +#define INDICATOR_EPOW_SHIFT0x0a /* bits 10-12 */
> +
> +#define DECODE_DRC_STATE(state, m, s)  \
> +uint32_t)(state) & (uint32_t)(m))) >> (s))
> +
> +#define ENCODE_DRC_STATE(val, m, s) \
> +(((uint32_t)(val) << (s)) & (uint32_t)(m))
> +
>  static sPAPRPHBState *find_phb(sPAPREnvironment *spapr, uint64_t buid)
>  {
>  sPAPRPHBState *sphb;
> @@ -402,6 +437,80 @@ static void 
> rtas_ibm_query_interrupt_source_number(PowerPCCPU *cpu,
>  rtas_st(rets, 2, 1);/* 0 == level; 1 == edge */
>  }
>  
> +static void rtas_set_indicator(PowerPCCPU *cpu, sPAPREnvironment *spapr,
> +   uint32_t token, uint32_t nargs,
> +   target_ulong args, uint32_t nret,
> +   target_ulong rets)
> +{
> +uint32_t indicator = rtas_ld(args, 0);
> +uint32_t drc_index = rtas_ld(args, 1);
> +uint32_t indicator_state = rtas_ld(args, 2);
> +uint32_t encoded = 0, shift = 0, mask = 0;
> +uint32_t *pind;
> +sPAPRDrcEntry *drc_entry = NULL;

This rtas call does not have any idea what a PHB is. Why does it live in
spapr_pci.c?

> +
> +if (drc_index == 0) { /* platform indicator */
> +pind = &spapr->state;
> +} else {
> +drc_entry = spapr_find_drc_entry(drc_index);
> +if (!drc_entry) {
> +DPRINTF("rtas_set_indicator: unable to find drc_entry for %x",
> +drc_index);
> +rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
> +return;
> +}
> +pind = &drc_entry->state;
> +}
> +
> +switch (indicator) {
> +case 9:  /* EPOW */
> +shift = INDICATOR_EPOW_SHIFT;
> +mask = INDICATOR_EPOW_MASK;
> +break;
> +case 9001: /* Isolation state */
> +/* encode the new value into the correct bit field */
> +shift = INDICATOR_ISOLATION_SHIFT;
> +mask = INDICATOR_ISOLATION_MASK;
> +break;
> +case 9002: /* DR */
> +shift = INDICATOR_DR_SHIFT;
> +mask = INDICATOR_DR_MASK;
> +break;
> +case 9003: /* Allocation State */
> +shift = INDICATOR_ALLOCATION_SHIFT;
> +mask = INDICATOR_ALLOCATION_MASK;
> +break;
> +case 9005: /* global interrupt */
> +shift = INDICATOR_GLOBAL_INTERRUPT_SHIFT;
> +mask = INDICATOR_GLOBAL_INTERRUPT_MASK;
> +break;
> +case 9006: /* error log */
> +shift = INDICATOR_ERROR_LOG_SHIFT;
> +mask = INDICATOR_ERROR_LOG_MASK;
> +break;
> +case 9007: /* identify */
> +shift = INDICATOR_IDENTIFY_SHIFT;
> +mask = INDICATOR_IDENTIFY_MASK;
> +break;
> +case 9009: /* reset */
> +shift = INDICATOR_RESET_SHIFT;
> +mask = INDICATOR_RESET_MASK;
> +break;
> +default:
> +DPRINTF("rtas_set_indicator: indicator not implemented: %d

Re: [Qemu-devel] [PATCH 07/12] spapr_pci: add ibm, configure-connector RTAS interface

2014-08-26 Thread Alexander Graf



On 19.08.14 02:21, Michael Roth wrote:
> Signed-off-by: Michael Roth 
> ---
>  hw/ppc/spapr_pci.c | 111 
> +
>  1 file changed, 111 insertions(+)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 8d1351d..96a57be 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -606,6 +606,115 @@ static void rtas_get_sensor_state(PowerPCCPU *cpu, 
> sPAPREnvironment *spapr,
>  rtas_st(rets, 1, decoded);
>  }
>  
> +/* configure-connector work area offsets, int32_t units */
> +#define CC_IDX_NODE_NAME_OFFSET 2
> +#define CC_IDX_PROP_NAME_OFFSET 2
> +#define CC_IDX_PROP_LEN 3
> +#define CC_IDX_PROP_DATA_OFFSET 4
> +
> +#define CC_VAL_DATA_OFFSET ((CC_IDX_PROP_DATA_OFFSET + 1) * 4)
> +#define CC_RET_NEXT_SIB 1
> +#define CC_RET_NEXT_CHILD 2
> +#define CC_RET_NEXT_PROPERTY 3
> +#define CC_RET_PREV_PARENT 4
> +#define CC_RET_ERROR RTAS_OUT_HW_ERROR
> +#define CC_RET_SUCCESS RTAS_OUT_SUCCESS
> +
> +static void rtas_ibm_configure_connector(PowerPCCPU *cpu,
> + sPAPREnvironment *spapr,
> + uint32_t token, uint32_t nargs,
> + target_ulong args, uint32_t nret,
> + target_ulong rets)
> +{
> +uint64_t wa_addr = ((uint64_t)rtas_ld(args, 1) << 32) | rtas_ld(args, 0);
> +sPAPRDrcEntry *drc_entry = NULL;
> +sPAPRConfigureConnectorState *ccs;
> +void *wa_buf;
> +int32_t *wa_buf_int;
> +hwaddr map_len = 0x1024;
> +uint32_t drc_index;
> +int rc = 0, next_offset, tag, prop_len, node_name_len;
> +const struct fdt_property *prop;
> +const char *node_name, *prop_name;
> +
> +wa_buf = cpu_physical_memory_map(wa_addr, &map_len, 1);

Please use the rtas helpers to access memory from an rtas function. It
takes care of endian swapping and address masking.


Alex

> +if (!wa_buf) {
> +rc = CC_RET_ERROR;
> +goto error_exit;
> +}
> +wa_buf_int = wa_buf;
> +
> +drc_index = *(uint32_t *)wa_buf;
> +drc_entry = spapr_find_drc_entry(drc_index);
> +if (!drc_entry) {
> +rc = -1;
> +goto error_exit;
> +}
> +
> +ccs = &drc_entry->cc_state;
> +if (ccs->state == CC_STATE_PENDING) {
> +/* fdt should've been been attached to drc_entry during
> + * realize/hotplug
> + */
> +g_assert(ccs->fdt);
> +ccs->depth = 0;
> +ccs->offset = ccs->offset_start;
> +ccs->state = CC_STATE_ACTIVE;
> +}
> +
> +if (ccs->state == CC_STATE_IDLE) {
> +rc = -1;
> +goto error_exit;
> +}
> +
> +retry:
> +tag = fdt_next_tag(ccs->fdt, ccs->offset, &next_offset);
> +
> +switch (tag) {
> +case FDT_BEGIN_NODE:
> +ccs->depth++;
> +node_name = fdt_get_name(ccs->fdt, ccs->offset, &node_name_len);
> +wa_buf_int[CC_IDX_NODE_NAME_OFFSET] = CC_VAL_DATA_OFFSET;
> +strcpy(wa_buf + wa_buf_int[CC_IDX_NODE_NAME_OFFSET], node_name);
> +rc = CC_RET_NEXT_CHILD;
> +break;
> +case FDT_END_NODE:
> +ccs->depth--;
> +if (ccs->depth == 0) {
> +/* reached the end of top-level node, declare success */
> +ccs->state = CC_STATE_PENDING;
> +rc = CC_RET_SUCCESS;
> +} else {
> +rc = CC_RET_PREV_PARENT;
> +}
> +break;
> +case FDT_PROP:
> +prop = fdt_get_property_by_offset(ccs->fdt, ccs->offset, &prop_len);
> +prop_name = fdt_string(ccs->fdt, fdt32_to_cpu(prop->nameoff));
> +wa_buf_int[CC_IDX_PROP_NAME_OFFSET] = CC_VAL_DATA_OFFSET;
> +wa_buf_int[CC_IDX_PROP_LEN] = prop_len;
> +wa_buf_int[CC_IDX_PROP_DATA_OFFSET] =
> +CC_VAL_DATA_OFFSET + strlen(prop_name) + 1;
> +strcpy(wa_buf + wa_buf_int[CC_IDX_PROP_NAME_OFFSET], prop_name);
> +memcpy(wa_buf + wa_buf_int[CC_IDX_PROP_DATA_OFFSET],
> +   prop->data, prop_len);
> +rc = CC_RET_NEXT_PROPERTY;
> +break;
> +case FDT_END:
> +rc = CC_RET_ERROR;
> +break;
> +default:
> +ccs->offset = next_offset;
> +goto retry;
> +}
> +
> +ccs->offset = next_offset;
> +
> +error_exit:
> +cpu_physical_memory_unmap(wa_buf, 0x1024, 1, 0x1024);
> +rtas_st(rets, 0, rc);
> +}
> +
>  static int pci_spapr_swizzle(int slot, int pin)
>  {
>  return (slot + pin) % PCI_NUM_PINS;
> @@ -1276,6 +1385,8 @@ void spapr_pci_rtas_init(void)
>  rtas_get_power_level);
>  spapr_rtas_register(RTAS_GET_SENSOR_STATE, "get-sensor-state",
>  rtas_get_sensor_state);
> +spapr_rtas_register(RTAS_IBM_CONFIGURE_CONNECTOR, 
> "ibm,configure-connector",
> +rtas_ibm_configure_connector);
>  }
>  
>  static void spapr_pci_register_types(void)
>

Re: [Qemu-devel] [PATCH 08/12] pci: allow 0 address for PCI IO regions

2014-08-26 Thread Alexander Graf



On 19.08.14 02:21, Michael Roth wrote:
> Some kernels program a 0 address for io regions. PCI 3.0 spec
> section 6.2.5.1 doesn't seem to disallow this.
> 
> Signed-off-by: Michael Roth 

This patch does not need to be inside of this patch set. It also should
go via Michael's tree.


Alex

> ---
>  hw/pci/pci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 351d320..9578749 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -1035,7 +1035,7 @@ static pcibus_t pci_bar_address(PCIDevice *d,
>  /* Check if 32 bit BAR wraps around explicitly.
>   * TODO: make priorities correct and remove this work around.
>   */
> -if (last_addr <= new_addr || new_addr == 0 || last_addr >= 
> UINT32_MAX) {
> +if (last_addr <= new_addr || last_addr >= UINT32_MAX) {
>  return PCI_BAR_UNMAPPED;
>  }
>  return new_addr;
>

Re: [Qemu-devel] [PATCH] device_tree.c: redirect load_device_tree err message to stderr

2014-08-26 Thread Markus Armbruster

"john.liuli"  writes:

> From: Li Liu 
>
> Signed-off-by: Li Liu 
> ---
>  device_tree.c |8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/device_tree.c b/device_tree.c
> index ca83504..ccdb039 100644
> --- a/device_tree.c
> +++ b/device_tree.c
> @@ -79,7 +79,7 @@ void *load_device_tree(const char *filename_path, int 
> *sizep)
>  *sizep = 0;
>  dt_size = get_image_size(filename_path);
>  if (dt_size < 0) {
> -printf("Unable to get size of device tree file '%s'\n",
> +fprintf(stderr, "Unable to get size of device tree file '%s'\n",
>  filename_path);
>  goto fail;
>  }
> @@ -92,20 +92,20 @@ void *load_device_tree(const char *filename_path, int 
> *sizep)
>  
>  dt_file_load_size = load_image(filename_path, fdt);
>  if (dt_file_load_size < 0) {
> -printf("Unable to open device tree file '%s'\n",
> +fprintf(stderr, "Unable to open device tree file '%s'\n",
> filename_path);
>  goto fail;
>  }
>  
>  ret = fdt_open_into(fdt, fdt, dt_size);
>  if (ret) {
> -printf("Unable to copy device tree in memory\n");
> +fprintf(stderr, "Unable to copy device tree in memory\n");
>  goto fail;
>  }
>  
>  /* Check sanity of device tree */
>  if (fdt_check_header(fdt)) {
> -printf ("Device tree file loaded into memory is invalid: %s\n",
> +fprintf(stderr, "Device tree file loaded into memory is invalid: 
> %s\n",
>  filename_path);
>  goto fail;
>  }

A welcome improvement.  Further improvement would be using
error_report().

Re: [Qemu-devel] [PATCH 06/15] hw/intc/arm_gic: Make ICDDCR/GICD_CTLR banked

2014-08-26 Thread Sergey Fedorov

On 22.08.2014 14:29, Fabian Aggeler wrote:
> ICDDCR/GICD_CTLR is banked in GICv1 implementations with Security
> Extensions or in GICv2 in independent from Security Extensions.
> This makes it possible to enable forwarding of interrupts from
> Distributor to the CPU interfaces for Group0 and Group1.
>
> EnableGroup0 (Bit [1]) in GICv1 is IMPDEF. Since this bit (Enable
> Non-secure) is present in the integrated IC of the Cortex-A9 MPCore,
> which implements the GICv1 profile, we support this bit in GICv1 too.
>
> Signed-off-by: Fabian Aggeler 
> ---
>  hw/intc/arm_gic.c| 34 ++
>  hw/intc/arm_gic_common.c |  2 +-
>  include/hw/intc/arm_gic_common.h |  7 ++-
>  3 files changed, 37 insertions(+), 6 deletions(-)
>
> diff --git a/hw/intc/arm_gic.c b/hw/intc/arm_gic.c
> index a972942..c78b301 100644
> --- a/hw/intc/arm_gic.c
> +++ b/hw/intc/arm_gic.c
> @@ -301,8 +301,18 @@ static uint32_t gic_dist_readb(void *opaque, hwaddr 
> offset)
>  cpu = gic_get_current_cpu(s);
>  cm = 1 << cpu;
>  if (offset < 0x100) {
> -if (offset == 0)
> -return s->enabled;
> +if (offset == 0) {
> +res = 0;
> +if ((s->revision == 2 && !s->security_extn)
> +|| (s->security_extn && !ns_access())) {
> +res = (s->enabled_grp[1] << 1) | s->enabled_grp[0];

Seems that (s->enabled_grp[1] << 1) can give wrong value, namely 4
instead of 2. See below.

> +} else if (s->security_extn && ns_access()) {
> +res = s->enabled_grp[1];
> +} else {
> +/* Neither GICv2 nor Security Extensions present */
> +res = s->enabled;
> +}
> +}
>  if (offset == 4)
>  /* Interrupt Controller Type Register */
>  return ((s->num_irq / 32) - 1)
> @@ -469,8 +479,24 @@ static void gic_dist_writeb(void *opaque, hwaddr offset,
>  cpu = gic_get_current_cpu(s);
>  if (offset < 0x100) {
>  if (offset == 0) {
> -s->enabled = (value & 1);
> -DPRINTF("Distribution %sabled\n", s->enabled ? "En" : "Dis");
> +if ((s->revision == 2 && !s->security_extn)
> +|| (s->security_extn && !ns_access())) {
> +s->enabled_grp[0] = value & (1U << 0); /* EnableGrp0 */
> +/* For a GICv1 with Security Extn "EnableGrp1" is IMPDEF. */
> +s->enabled_grp[1] = value & (1U << 1); /* EnableGrp1 */

Here s->enabled_grp[1] can get value of 2. That can give wrong result as
noted above.

Best,
Sergey

> +DPRINTF("Group0 distribution %sabled\n"
> +"Group1 distribution %sabled\n",
> +s->enabled_grp[0] ? "En" : "Dis",
> +s->enabled_grp[1] ? "En" : "Dis");
> +} else if (s->security_extn && ns_access()) {
> +s->enabled_grp[1] = (value & 1U);
> +DPRINTF("Group1 distribution %sabled\n",
> +s->enabled_grp[1] ? "En" : "Dis");
> +} else {
> +/* Neither GICv2 nor Security Extensions present */
> +s->enabled = (value & 1U);
> +DPRINTF("Distribution %sabled\n", s->enabled ? "En" : "Dis");
> +}
>  } else if (offset < 4) {
>  /* ignored.  */
>  } else if (offset >= 0x80) {
> diff --git a/hw/intc/arm_gic_common.c b/hw/intc/arm_gic_common.c
> index f74175d..7652754 100644
> --- a/hw/intc/arm_gic_common.c
> +++ b/hw/intc/arm_gic_common.c
> @@ -64,7 +64,7 @@ static const VMStateDescription vmstate_gic = {
>  .pre_save = gic_pre_save,
>  .post_load = gic_post_load,
>  .fields = (VMStateField[]) {
> -VMSTATE_BOOL(enabled, GICState),
> +VMSTATE_UINT8_ARRAY(enabled_grp, GICState, GIC_NR_GROUP),
>  VMSTATE_BOOL_ARRAY(cpu_enabled, GICState, GIC_NCPU),
>  VMSTATE_STRUCT_ARRAY(irq_state, GICState, GIC_MAXIRQ, 1,
>   vmstate_gic_irq_state, gic_irq_state),
> diff --git a/include/hw/intc/arm_gic_common.h 
> b/include/hw/intc/arm_gic_common.h
> index a61e52e..a39b066 100644
> --- a/include/hw/intc/arm_gic_common.h
> +++ b/include/hw/intc/arm_gic_common.h
> @@ -30,6 +30,8 @@
>  #define GIC_NR_SGIS 16
>  /* Maximum number of possible CPU interfaces, determined by GIC architecture 
> */
>  #define GIC_NCPU 8
> +/* Number of Groups (Group0 [Secure], Group1 [Non-secure]) */
> +#define GIC_NR_GROUP 2
>  
>  #define MAX_NR_GROUP_PRIO 128
>  #define GIC_NR_APRS (MAX_NR_GROUP_PRIO / 32)
> @@ -52,7 +54,10 @@ typedef struct GICState {
>  
>  qemu_irq parent_irq[GIC_NCPU];
>  qemu_irq parent_fiq[GIC_NCPU];
> -bool enabled;
> +union {
> +uint8_t enabled;
> +uint8_t enabled_grp[GIC_NR_GROUP]; /* EnableGrp0 and EnableGrp1 */
> +};
>  bool cpu_enabled[GIC_NCPU];
>  
>

Re: [Qemu-devel] [PATCH 08/12] pci: allow 0 address for PCI IO regions

2014-08-26 Thread Peter Maydell

On 26 August 2014 10:14, Alexey Kardashevskiy  wrote:
> On 08/19/2014 10:21 AM, Michael Roth wrote:
>> Some kernels program a 0 address for io regions. PCI 3.0 spec
>> section 6.2.5.1 doesn't seem to disallow this.
>
>
> I remember there was discussion about it but I forgot :)

I think the conclusion we came to was that there may have been
a note in the PCI 2.1 spec that implied that 0 addresses meant
"disabled" but this seems to have gone from later versions,
suggesting it was erroneous.

Personally I'm happy for us to remove the "0 means disabled"
check, but I'd prefer it if we do it consistently for both IO and
MMIO regions -- this patch only changes the IO BAR code.

thanks
-- PMM

Re: [Qemu-devel] [PATCH] device_tree.c: redirect load_device_tree err message to stderr

2014-08-26 Thread Markus Armbruster

Markus Armbruster  writes:

[...]
> A welcome improvement.  Further improvement would be using
> error_report().

Ah, you did that already in v2.  Sorry for the noise!

Re: [Qemu-devel] [PATCH 09/12] spapr_pci: enable basic hotplug operations

2014-08-26 Thread Alexander Graf



On 19.08.14 02:21, Michael Roth wrote:
> This enables hotplug for PHB bridges. Upon hotplug we generate the
> OF-nodes required by PAPR specification and IEEE 1275-1994
> "PCI Bus Binding to Open Firmware" for the device.
> 
> We associate the corresponding FDT for these nodes with the DrcEntry
> corresponding to the slot, which will be fetched via
> ibm,configure-connector RTAS calls by the guest as described by PAPR
> specification. The FDT is cleaned up in the case of unplug.
> 
> Signed-off-by: Michael Roth 
> ---
>  hw/ppc/spapr_pci.c | 235 
> +++--
>  include/hw/ppc/spapr.h |   1 +
>  2 files changed, 228 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 96a57be..23864ab 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -87,6 +87,17 @@
>  #define ENCODE_DRC_STATE(val, m, s) \
>  (((uint32_t)(val) << (s)) & (uint32_t)(m))
>  
> +#define FDT_MAX_SIZE0x1
> +#define _FDT(exp) \
> +do { \
> +int ret = (exp);   \
> +if (ret < 0) { \
> +return ret;\
> +}  \
> +} while (0)
> +
> +static void spapr_drc_state_reset(sPAPRDrcEntry *drc_entry);
> +
>  static sPAPRPHBState *find_phb(sPAPREnvironment *spapr, uint64_t buid)
>  {
>  sPAPRPHBState *sphb;
> @@ -476,6 +487,22 @@ static void rtas_set_indicator(PowerPCCPU *cpu, 
> sPAPREnvironment *spapr,
>  /* encode the new value into the correct bit field */
>  shift = INDICATOR_ISOLATION_SHIFT;
>  mask = INDICATOR_ISOLATION_MASK;
> +if (drc_entry) {
> +/* transition from unisolated to isolated for a hotplug slot
> + * entails completion of guest-side device unplug/cleanup, so
> + * we can now safely remove the device if qemu is waiting for
> + * it to be released
> + */
> +if (DECODE_DRC_STATE(*pind, mask, shift) != indicator_state) {
> +if (indicator_state == 0 && drc_entry->awaiting_release) {
> +/* device_del has been called and host is waiting
> + * for guest to release/isolate device, go ahead
> + * and remove it now
> + */
> +spapr_drc_state_reset(drc_entry);
> +}
> +}
> +}
>  break;
>  case 9002: /* DR */
>  shift = INDICATOR_DR_SHIFT;
> @@ -816,6 +843,198 @@ static AddressSpace *spapr_pci_dma_iommu(PCIBus *bus, 
> void *opaque, int devfn)
>  return &phb->iommu_as;
>  }
>  
> +static int spapr_populate_pci_child_dt(PCIDevice *dev, void *fdt, int offset,
> +   int phb_index)
> +{
> +int slot = PCI_SLOT(dev->devfn);
> +char slotname[16];
> +bool is_bridge = 1;
> +sPAPRDrcEntry *drc_entry, *drc_entry_slot;
> +uint32_t reg[RESOURCE_CELLS_TOTAL * 8] = { 0 };
> +uint32_t assigned[RESOURCE_CELLS_TOTAL * 8] = { 0 };
> +int reg_size, assigned_size;
> +
> +drc_entry = spapr_phb_to_drc_entry(phb_index + SPAPR_PCI_BASE_BUID);
> +g_assert(drc_entry);
> +drc_entry_slot = &drc_entry->child_entries[slot];
> +
> +if (pci_default_read_config(dev, PCI_HEADER_TYPE, 1) ==
> +PCI_HEADER_TYPE_NORMAL) {
> +is_bridge = 0;
> +}
> +
> +_FDT(fdt_setprop_cell(fdt, offset, "vendor-id",
> +  pci_default_read_config(dev, PCI_VENDOR_ID, 2)));
> +_FDT(fdt_setprop_cell(fdt, offset, "device-id",
> +  pci_default_read_config(dev, PCI_DEVICE_ID, 2)));
> +_FDT(fdt_setprop_cell(fdt, offset, "revision-id",
> +  pci_default_read_config(dev, PCI_REVISION_ID, 1)));
> +_FDT(fdt_setprop_cell(fdt, offset, "class-code",
> +  pci_default_read_config(dev, PCI_CLASS_DEVICE, 2) 
> << 8));
> +
> +_FDT(fdt_setprop_cell(fdt, offset, "interrupts",
> +  pci_default_read_config(dev, PCI_INTERRUPT_PIN, 
> 1)));
> +
> +/* if this device is NOT a bridge */
> +if (!is_bridge) {
> +_FDT(fdt_setprop_cell(fdt, offset, "min-grant",
> +pci_default_read_config(dev, PCI_MIN_GNT, 1)));
> +_FDT(fdt_setprop_cell(fdt, offset, "max-latency",
> +pci_default_read_config(dev, PCI_MAX_LAT, 1)));
> +_FDT(fdt_setprop_cell(fdt, offset, "subsystem-id",
> +pci_default_read_config(dev, PCI_SUBSYSTEM_ID, 2)));
> +_FDT(fdt_setprop_cell(fdt, offset, "subsystem-vendor-id",
> +pci_default_read_config(dev, PCI_SUBSYSTEM_VENDOR_ID, 2)));
> +}
> +
> +_FDT(fdt_setprop_cell(fdt, offset, "cache-line-size",
> +pci_default_read_config(dev, PCI_CACHE_LINE_SIZE, 1)));
> +
> +/* the fol

Re: [Qemu-devel] [PATCH] pc: reserve more memory for ACPI for new machine types

2014-08-26 Thread Stefan Hajnoczi

On Wed, Aug 20, 2014 at 10:55:38PM +0200, Michael S. Tsirkin wrote:
> commit 868270f23d8db2cce83e4f082fe75e8625a5fbf9
> acpi-build: tweak acpi migration limits
> broke kernel loading with -kernel/-initrd: it doubled
> the size of ACPI tables but did not reserve
> enough memory.
> 
> As a result, issues on boot and halt are observed.
> 
> Fix this up by doubling reserved memory for new machine types.
> 
> Cc: qemu-sta...@nongnu.org
> Reported-by: Stefan Hajnoczi 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  include/hw/i386/pc.h |  2 ++
>  hw/i386/pc.c | 12 +---
>  hw/i386/pc_piix.c|  1 +
>  hw/i386/pc_q35.c |  1 +
>  4 files changed, 13 insertions(+), 3 deletions(-)

Already merged but for the record:
Tested-by: Stefan Hajnoczi 


pgp9zjI87mpah.pgp
Description: PGP signature

Re: [Qemu-devel] [12/16] linux-user: support {name_to, open_by}_handle_at syscalls

2014-08-26 Thread Riku Voipio

Hi Paul,

On Sun, Jun 15, 2014 at 05:18:29PM +0100, Paul Burton wrote:
> Implement support for the name_to_handle_at and open_by_handle_at
> syscalls, allowing their use by the target program. 

What was your testcase for these syscalls? I usually use LTP for testing
syscalls, but there is no testcase for name_to_handle_at and
open_by_handle_at.

Riku

> Signed-off-by: Paul Burton 
> ---
>  linux-user/strace.c| 30 ++
>  linux-user/strace.list |  6 ++
>  linux-user/syscall.c   | 50 
> ++
>  3 files changed, 86 insertions(+)
> 
> diff --git a/linux-user/strace.c b/linux-user/strace.c
> index ea6c1d2..c20ddf1 100644
> --- a/linux-user/strace.c
> +++ b/linux-user/strace.c
> @@ -1552,6 +1552,36 @@ print_kill(const struct syscallname *name,
>  }
>  #endif
>  
> +#ifdef TARGET_NR_name_to_handle_at
> +static void
> +print_name_to_handle_at(const struct syscallname *name,
> +abi_long arg0, abi_long arg1, abi_long arg2,
> +abi_long arg3, abi_long arg4, abi_long arg5)
> +{
> +print_syscall_prologue(name);
> +print_at_dirfd(arg0, 0);
> +print_string(arg1, 0);
> +print_pointer(arg2, 0);
> +print_pointer(arg3, 0);
> +print_raw_param("0x%x", arg4, 1);
> +print_syscall_epilogue(name);
> +}
> +#endif
> +
> +#ifdef TARGET_NR_open_by_handle_at
> +static void
> +print_open_by_handle_at(const struct syscallname *name,
> +abi_long arg0, abi_long arg1, abi_long arg2,
> +abi_long arg3, abi_long arg4, abi_long arg5)
> +{
> +print_syscall_prologue(name);
> +print_raw_param("%d", arg0, 0);
> +print_pointer(arg2, 0);
> +print_open_flags(arg3, 1);
> +print_syscall_epilogue(name);
> +}
> +#endif
> +
>  /*
>   * An array of all of the syscalls we know about
>   */
> diff --git a/linux-user/strace.list b/linux-user/strace.list
> index 8de972a..147f579 100644
> --- a/linux-user/strace.list
> +++ b/linux-user/strace.list
> @@ -582,6 +582,9 @@
>  #ifdef TARGET_NR_munmap
>  { TARGET_NR_munmap, "munmap" , NULL, print_munmap, NULL },
>  #endif
> +#ifdef TARGET_NR_name_to_handle_at
> +{ TARGET_NR_name_to_handle_at, "name_to_handle_at" , NULL, 
> print_name_to_handle_at, NULL },
> +#endif
>  #ifdef TARGET_NR_nanosleep
>  { TARGET_NR_nanosleep, "nanosleep" , NULL, NULL, NULL },
>  #endif
> @@ -624,6 +627,9 @@
>  #ifdef TARGET_NR_openat
>  { TARGET_NR_openat, "openat" , NULL, print_openat, NULL },
>  #endif
> +#ifdef TARGET_NR_open_by_handle_at
> +{ TARGET_NR_open_by_handle_at, "open_by_handle_at" , NULL, 
> print_open_by_handle_at, NULL },
> +#endif
>  #ifdef TARGET_NR_osf_adjtime
>  { TARGET_NR_osf_adjtime, "osf_adjtime" , NULL, NULL, NULL },
>  #endif
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index c7f176a..192ad3a 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -5349,6 +5349,56 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
> arg1,
>  unlock_user(p, arg2, 0);
>  break;
>  #endif
> +#ifdef TARGET_NR_name_to_handle_at
> +case TARGET_NR_name_to_handle_at:
> +{
> +struct file_handle *fh;
> +uint32_t sz;
> +int mount_id;
> +
> +if (!(p = lock_user_string(arg2)))
> +goto efault;
> +
> +if (get_user_u32(sz, arg3)) {
> +unlock_user(p, arg2, 0);
> +goto efault;
> +}
> +
> +if (!(fh = lock_user(VERIFY_WRITE, arg3, sizeof(*fh) + sz, 1))) {
> +unlock_user(p, arg2, 0);
> +goto efault;
> +}
> +
> +ret = get_errno(name_to_handle_at(arg1, path(p), fh,
> +  &mount_id, arg5));
> +
> +unlock_user(p, arg2, 0);
> +unlock_user(p, arg3, sizeof(*fh) + sz);
> +
> +if (put_user_s32(mount_id, arg4))
> +goto efault;
> +}
> +break;
> +#endif
> +#ifdef TARGET_NR_open_by_handle_at
> +case TARGET_NR_open_by_handle_at:
> +{
> +struct file_handle *fh;
> +uint32_t sz;
> +
> +if (get_user_u32(sz, arg2))
> +goto efault;
> +
> +if (!(fh = lock_user(VERIFY_WRITE, arg2, sizeof(*fh) + sz, 1)))
> +goto efault;
> +
> +ret = get_errno(open_by_handle_at(arg1, fh,
> +target_to_host_bitmask(arg3, fcntl_flags_tbl)));
> +
> +unlock_user(p, arg2, sizeof(*fh) + sz);
> +}
> +break;
> +#endif
>  case TARGET_NR_close:
>  ret = get_errno(close(arg1));
>  break;

Re: [Qemu-devel] [PATCH 12/12] spapr_pci: emit hotplug add/remove events during hotplug

2014-08-26 Thread Alexander Graf



On 19.08.14 02:21, Michael Roth wrote:
> From: Tyrel Datwyler 
> 
> This uses extension of existing EPOW interrupt/event mechanism
> to notify userspace tools like librtas/drmgr to handle
> in-guest configuration/cleanup operations in response to
> device_add/device_del.
> 
> Userspace tools that don't implement this extension will need
> to be run manually in response/advance of device_add/device_del,
> respectively.

Couldn't they use the pull event mechanism you implement in the previous
patch?

> 
> Signed-off-by: Tyrel Datwyler 
> Signed-off-by: Michael Roth 
> ---
>  hw/ppc/spapr_pci.c | 20 
>  1 file changed, 20 insertions(+)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 23864ab..17d37c0 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -944,6 +944,18 @@ static int spapr_device_hotplug_add(DeviceState *qdev, 
> PCIDevice *dev)
>  drc_entry_slot->state &= ~(uint32_t)INDICATOR_ENTITY_SENSE_MASK;
>  drc_entry_slot->state |= encoded; /* and the slot */
>  
> +/* reliable unplug requires we wait for a transition from
> + * UNISOLATED->ISOLATED prior to device removal/deletion.
> + * However, slots populated by devices at boot-time will not
> + * have ever been set by guest tools to an UNISOLATED/populated
> + * state, so set this manually in the case of coldplug devices
> + */
> +if (!DEVICE(dev)->hotplugged) {
> +drc_entry_slot->state |= ENCODE_DRC_STATE(1,
> +  INDICATOR_ISOLATION_MASK,
> +  INDICATOR_ISOLATION_SHIFT);

I think as a general scheme we would like to have the PHB call DRC
(which it knows from a qom link) which raises a qemu_irq to notify the
EPOW device that an event happened. If the EPOW interface is too
complex, I guess we can live with a link and function call too.


Alex

Re: [Qemu-devel] [PATCH] kvm: run cpu state synchronization on target vcpu thread

2014-08-26 Thread Christian Borntraeger

On 20/08/14 14:55, David Hildenbrand wrote:
> As already done for kvm_cpu_synchronize_state(), let's trigger
> kvm_arch_put_registers() via run_on_cpu() for kvm_cpu_synchronize_post_reset()
> and kvm_cpu_synchronize_post_init().
> 
> This way, we make sure that the register synchronizing ioctls are
> called from the proper vcpu thread; this avoids calls to
> synchronize_rcu() in the kernel.
> 
> Reviewed-by: Cornelia Huck 
> Signed-off-by: David Hildenbrand 

Acked-by: Christian Borntraeger 

Paolo, will you take that via the kvm tree? 
Otherwise I can queue that via the s390 tree together with the other patches 
that move our vcpu ioctls to the vcpu thread. I need your ACK in that case.


Christian
> ---
>  kvm-all.c | 18 --
>  1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/kvm-all.c b/kvm-all.c
> index 1402f4f..b240bf8 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -1669,18 +1669,32 @@ void kvm_cpu_synchronize_state(CPUState *cpu)
>  }
>  }
> 
> -void kvm_cpu_synchronize_post_reset(CPUState *cpu)
> +static void do_kvm_cpu_synchronize_post_reset(void *arg)
>  {
> +CPUState *cpu = arg;
> +
>  kvm_arch_put_registers(cpu, KVM_PUT_RESET_STATE);
>  cpu->kvm_vcpu_dirty = false;
>  }
> 
> -void kvm_cpu_synchronize_post_init(CPUState *cpu)
> +void kvm_cpu_synchronize_post_reset(CPUState *cpu)
> +{
> +run_on_cpu(cpu, do_kvm_cpu_synchronize_post_reset, cpu);
> +}
> +
> +static void do_kvm_cpu_synchronize_post_init(void *arg)
>  {
> +CPUState *cpu = arg;
> +
>  kvm_arch_put_registers(cpu, KVM_PUT_FULL_STATE);
>  cpu->kvm_vcpu_dirty = false;
>  }
> 
> +void kvm_cpu_synchronize_post_init(CPUState *cpu)
> +{
> +run_on_cpu(cpu, do_kvm_cpu_synchronize_post_init, cpu);
> +}
> +
>  int kvm_cpu_exec(CPUState *cpu)
>  {
>  struct kvm_run *run = cpu->kvm_run;
>

Re: [Qemu-devel] [PATCH] kvm: run cpu state synchronization on target vcpu thread

2014-08-26 Thread Paolo Bonzini

Il 26/08/2014 14:43, Christian Borntraeger ha scritto:
> Acked-by: Christian Borntraeger 
> 
> Paolo, will you take that via the kvm tree? 

Yes, I will.

Paolo

Re: [Qemu-devel] [PATCH memory v2 3/3] memory: Lazy init name from QOM name as needed

2014-08-26 Thread Paolo Bonzini

Il 26/08/2014 05:10, Peter Crosthwaite ha scritto:
> To support name retrieval of MemoryRegions that were created
> dynamically (that is, not via memory_region_init and friends). We
> cache the name in MemoryRegion's state as
> object_get_canonical_path_component mallocs the returned value
> so it's not suitable for direct return to callers. Memory already
> frees the name field, so this will be garbage collected along with
> the MR object.
> 
> Signed-off-by: Peter Crosthwaite 
> ---
> 
>  memory.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/memory.c b/memory.c
> index 42317a2..fc16e5f 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -914,7 +914,6 @@ void memory_region_init(MemoryRegion *mr,
>  if (size == UINT64_MAX) {
>  mr->size = int128_2_64();
>  }
> -mr->name = g_strdup(name);
>  
>  if (name) {
>  object_property_add_child_array(owner, name, OBJECT(mr));
> @@ -1309,6 +1308,10 @@ uint64_t memory_region_size(MemoryRegion *mr)
>  
>  const char *memory_region_name(const MemoryRegion *mr)
>  {
> +if (!mr->name) {
> +((MemoryRegion *)mr)->name =
> +object_get_canonical_path_component(OBJECT(mr));
> +}
>  return mr->name;
>  }
>  
> 

Oh if we only used C++... :)

Thanks, looks good!

Paolo

Re: [Qemu-devel] [PATCH v2 0/2] target-i386: tsc_adjust and mpx feature names

2014-08-26 Thread Paolo Bonzini

Il 25/08/2014 22:02, Eduardo Habkost ha scritto:
> Add feature names that are missing on the x86 CPU feature name tables. Both 
> had
> migration support implemented many months ago.
> 
> Changes v1 -> v2:
>  * Commit message changes only. Added reference to migration support commit 
> IDs.
> 
> Note that v1 was not sent as a series, but as separate individual patches.
> 
> Eduardo Habkost (2):
>   target-i386: Add "mpx" CPU feature name
>   target-i386: Add "tsc_adjust" CPU feature name
> 
>  target-i386/cpu.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 

Thanks, applied to uq/master.

Paolo

Re: [Qemu-devel] [PATCH v2 0/6] target-i386: Make most CPU models work with "enforce" out of the box

2014-08-26 Thread Paolo Bonzini

Il 25/08/2014 22:45, Eduardo Habkost ha scritto:
> 
> TCG users expect the default CPU model to contain most TCG-supported features
> (and it makes sense). See, for example, commit
> f1e00a9cf326acc1f2386a72525af8859852e1df.

It doesn't though (SMAP is the most egregious omission, and probably the
main reason why people use QEMU TCG these days), and it raises the
question of backwards-compatibility of qemu64---should we disable TCG
features in old machine types?  Probably yes, but we've never done that.

Paolo

Re: [Qemu-devel] [PATCH v3 05/10] qcow2: Fix refcount blocks beyond image end

2014-08-26 Thread Eric Blake

On 08/22/2014 10:31 AM, Max Reitz wrote:
> If the qcow2 check function detects a refcount block located beyond the
> image end, grow the image appropriately. This cannot break anything and
> is the logical fix for such a case.

Does the testsuite cover this one? I didn't see it mentioned in patch 10/10.

> 
> Signed-off-by: Max Reitz 
> ---
>  block/qcow2-refcount.c | 50 
> ++
>  1 file changed, 46 insertions(+), 4 deletions(-)
> 
-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

1 2 3 >

1 - 100 of 201 matches

Mail list logo