[Qemu-devel] [PATCH v3] ARM: ACPI: Fix use-after-free due to memory realloc

2018-05-30 Thread Shannon Zhao
acpi_data_push uses g_array_set_size to resize the array. If there is
not enough contiguous memory, the data moves to a new address, so a
previously obtained pointer can no longer be used. The pointer must be
updated and the new one used.

Also, the previous code wrongly used the le32-converted value of
iort->node_offset for subsequent computations, which yields an incorrect
value if the host is not little endian. So use the non-converted one instead.

Signed-off-by: Shannon Zhao 
---
V3: Fix typo and add some words in commit message to explain another bug
---
 hw/arm/virt-acpi-build.c | 20 +++-
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 92ceee9..74f5744 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -400,7 +400,7 @@ build_iort(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
 AcpiIortItsGroup *its;
 AcpiIortTable *iort;
 AcpiIortSmmu3 *smmu;
-size_t node_size, iort_length, smmu_offset = 0;
+size_t node_size, iort_node_offset, iort_length, smmu_offset = 0;
 AcpiIortRC *rc;
 
 iort = acpi_data_push(table_data, sizeof(*iort));
@@ -413,7 +413,12 @@ build_iort(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
 
 iort_length = sizeof(*iort);
 iort->node_count = cpu_to_le32(nb_nodes);
-iort->node_offset = cpu_to_le32(sizeof(*iort));
+/*
+ * Use a copy in case table_data->data moves during acpi_data_push
+ * operations.
+ */
+iort_node_offset = sizeof(*iort);
+iort->node_offset = cpu_to_le32(iort_node_offset);
 
 /* ITS group node */
 node_size =  sizeof(*its) + sizeof(uint32_t);
@@ -429,7 +434,7 @@ build_iort(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
 int irq =  vms->irqmap[VIRT_SMMU];
 
 /* SMMUv3 node */
-smmu_offset = iort->node_offset + node_size;
+smmu_offset = iort_node_offset + node_size;
 node_size = sizeof(*smmu) + sizeof(*idmap);
 iort_length += node_size;
 smmu = acpi_data_push(table_data, node_size);
@@ -450,7 +455,7 @@ build_iort(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
 idmap->id_count = cpu_to_le32(0x);
 idmap->output_base = 0;
 /* output IORT node is the ITS group node (the first node) */
-idmap->output_reference = cpu_to_le32(iort->node_offset);
+idmap->output_reference = cpu_to_le32(iort_node_offset);
 }
 
 /* Root Complex Node */
@@ -479,9 +484,14 @@ build_iort(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
 idmap->output_reference = cpu_to_le32(smmu_offset);
 } else {
 /* output IORT node is the ITS group node (the first node) */
-idmap->output_reference = cpu_to_le32(iort->node_offset);
+idmap->output_reference = cpu_to_le32(iort_node_offset);
 }
 
+/*
+ * Update the pointer address in case table_data->data moves during above
+ * acpi_data_push operations.
+ */
+iort = (AcpiIortTable *)(table_data->data + iort_start);
 iort->length = cpu_to_le32(iort_length);
 
 build_header(linker, table_data, (void *)(table_data->data + iort_start),
-- 
2.0.4
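
As an illustration of the two fixes described in the commit message, here is a
minimal sketch that does not use GLib or QEMU code. ByteBuf, buf_push(),
store_le32(), load_le32() and build_table() are hypothetical names made up for
this example; buf_push() only stands in for acpi_data_push(). The sketch keeps
an offset instead of a pointer across pushes, does its arithmetic on host-order
values, and converts to little endian only when storing.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the GArray behind table_data. */
typedef struct {
    uint8_t *data;
    size_t len;
} ByteBuf;

/*
 * Grow the buffer by `size` zeroed bytes and return a pointer to the new
 * space. The realloc() may move buf->data, so any pointer obtained from an
 * earlier call can become stale.
 */
void *buf_push(ByteBuf *buf, size_t size)
{
    uint8_t *p = realloc(buf->data, buf->len + size);

    assert(p != NULL);
    buf->data = p;
    memset(p + buf->len, 0, size);
    buf->len += size;
    return p + buf->len - size;
}

/* Store a 32-bit value in little-endian byte order on any host. */
void store_le32(uint8_t *dst, uint32_t v)
{
    dst[0] = v & 0xff;
    dst[1] = (v >> 8) & 0xff;
    dst[2] = (v >> 16) & 0xff;
    dst[3] = (v >> 24) & 0xff;
}

/* Load a 32-bit little-endian value on any host. */
uint32_t load_le32(const uint8_t *src)
{
    return (uint32_t)src[0] | ((uint32_t)src[1] << 8) |
           ((uint32_t)src[2] << 16) | ((uint32_t)src[3] << 24);
}

/*
 * Build a toy table: a 4-byte length header followed by `nodes` nodes of
 * `node_size` bytes each. Returns the offset of the header inside buf.
 */
size_t build_table(ByteBuf *buf, unsigned nodes, size_t node_size)
{
    size_t start = buf->len;    /* remember an offset, not a pointer */
    size_t length = 4;          /* running total, kept in host order */
    unsigned i;

    buf_push(buf, 4);
    for (i = 0; i < nodes; i++) {
        buf_push(buf, node_size);   /* may move buf->data */
        length += node_size;
    }

    /* Derive the header address only now, after the last push. */
    store_le32(buf->data + start, (uint32_t)length);
    return start;
}
```

The header is never touched through a pointer taken before the pushes, and the
running length is converted exactly once, at the point where it is written.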





[Qemu-devel] [PATCH qemu v3] qom: Document qom/device-list-properties implementation specific

2018-05-30 Thread Alexey Kardashevskiy
The recently introduced qom-list-properties QMP command raised the
question of which properties it (and its cousin, device-list-properties)
can print: only those defined by DeviceClass::props or dynamically
created in TypeInfo::instance_init(). Properties created elsewhere
won't show up, and this behaviour might confuse the user.

For example, PIIX4 does that from piix4_pm_realize() via
  piix4_pm_add_propeties():

object_property_add_uint8_ptr(OBJECT(s), ACPI_PM_PROP_ACPI_ENABLE_CMD,
  &acpi_enable_cmd, NULL);

This adds a note to the command descriptions about the limitation.

Reviewed-by: Markus Armbruster 
Acked-by: Paolo Bonzini 
Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v3:
* adjusted commit log per Markus's suggestion
* added a-by/r-by
---
 qapi/misc.json | 8 
 1 file changed, 8 insertions(+)

diff --git a/qapi/misc.json b/qapi/misc.json
index 99bcaac..bbf9ef8 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -1529,6 +1529,10 @@
 #
 # Returns: a list of ObjectPropertyInfo describing a devices properties
 #
+# Note: objects can create properties at runtime, for example to describe
+# links between different devices and/or objects. These properties
+# are not included in the output of this command.
+#
 # Since: 1.2
 ##
 { 'command': 'device-list-properties',
@@ -1542,6 +1546,10 @@
 #
 # @typename: the type name of an object
 #
+# Note: objects can create properties at runtime, for example to describe
+# links between different devices and/or objects. These properties
+# are not included in the output of this command.
+#
 # Returns: a list of ObjectPropertyInfo describing object properties
 #
 # Since: 2.12
-- 
2.11.0
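
The limitation described in the commit message can be pictured with a toy
model. This is only a sketch under the assumption that the listing command
inspects a freshly created instance that is never realized. Obj,
obj_add_prop(), obj_has_prop(), dev_instance_init(), dev_realize() and
type_lists_prop() are hypothetical names, as are the property names, and none
of this is QOM code.

```c
#include <assert.h>
#include <string.h>

#define MAX_PROPS 8

/* Hypothetical object model: a flat list of property names per instance. */
typedef struct {
    const char *props[MAX_PROPS];
    unsigned nprops;
} Obj;

void obj_add_prop(Obj *o, const char *name)
{
    assert(o->nprops < MAX_PROPS);
    o->props[o->nprops++] = name;
}

int obj_has_prop(const Obj *o, const char *name)
{
    unsigned i;

    for (i = 0; i < o->nprops; i++) {
        if (strcmp(o->props[i], name) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Runs when the instance is created: this property is always present. */
void dev_instance_init(Obj *o)
{
    o->nprops = 0;
    obj_add_prop(o, "static-prop");
}

/* Runs later, when the device is realized: adds a property at runtime. */
void dev_realize(Obj *o)
{
    obj_add_prop(o, "runtime-prop");
}

/*
 * Model of a "list properties" query: create a throwaway instance,
 * inspect it, and never realize it.
 */
int type_lists_prop(const char *name)
{
    Obj tmp;

    dev_instance_init(&tmp);
    return obj_has_prop(&tmp, name);
}
```

In this model a realized device does carry "runtime-prop", but the type-level
query cannot see it, which is the behaviour the added notes warn about.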




[Qemu-devel] [PATCH qemu v3 1/2] object: Handle objects with no parents

2018-05-30 Thread Alexey Kardashevskiy
At the moment object_get_canonical_path() crashes if the object or one
of its parents does not have a parent, for example, a KVM accelerator
object.

This adds a check for obj != NULL to the loop to prevent the crash.
In order not to return a wrong path, this checks whether a partial path
has been resolved and, if so, omits the leading slash to tell the reader
that the path is partial because the owner object is detached.

Signed-off-by: Alexey Kardashevskiy 
---

I have not tested the case with obj==NULL and path!=NULL as this is
for objects which have parents which are not attached to the root
and we do not have such objects in current QEMU afaict but I kept it
just in case.

---
Changes:
v3:
* do not check for obj->parent
* return NULL or incomplete path depending on the situation
---
 qom/object.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/qom/object.c b/qom/object.c
index 0fc9720..05138ba 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -1669,7 +1669,7 @@ gchar *object_get_canonical_path(Object *obj)
 Object *root = object_get_root();
 char *newpath, *path = NULL;
 
-while (obj != root) {
+while (obj && obj != root) {
 char *component = object_get_canonical_path_component(obj);
 
 if (path) {
@@ -1684,7 +1684,13 @@ gchar *object_get_canonical_path(Object *obj)
 obj = obj->parent;
 }
 
-newpath = g_strdup_printf("/%s", path ? path : "");
+if (obj && path) {
+newpath = g_strdup_printf("/%s", path);
+} else if (path) {
+newpath = g_strdup(path);
+} else {
+newpath = NULL;
+}
 g_free(path);
 
 return newpath;
-- 
2.11.0
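
The three outcomes of the patched function can be tried in isolation with the
sketch below. It mirrors only the control flow of the diff; Node and
node_path() are hypothetical, and each node stores its own name, which is a
simplification made for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical minimal object: a name and a parent link. */
typedef struct Node {
    const char *name;
    struct Node *parent;
} Node;

/*
 * Return a newly allocated path for obj relative to root:
 *   "/a/b"  when the walk reaches root,
 *   "a/b"   (no leading slash) when the chain ends before reaching root,
 *   NULL    when no component was collected at all.
 * The caller frees the result.
 */
char *node_path(const Node *obj, const Node *root)
{
    char *path = NULL;

    while (obj && obj != root) {
        /* name, optional "/" plus previous path, terminating NUL */
        size_t n = strlen(obj->name) + (path ? strlen(path) + 1 : 0) + 1;
        char *joined = malloc(n);

        assert(joined != NULL);
        if (path) {
            snprintf(joined, n, "%s/%s", obj->name, path);
        } else {
            snprintf(joined, n, "%s", obj->name);
        }
        free(path);
        path = joined;
        obj = obj->parent;
    }

    if (obj && path) {
        /* Reached root: prepend the leading slash. */
        size_t n = strlen(path) + 2;
        char *full = malloc(n);

        assert(full != NULL);
        snprintf(full, n, "/%s", path);
        free(path);
        return full;
    }

    /* Detached chain: partial path as is. Nothing collected: NULL. */
    return path;
}
```

One consequence of this control flow, at least in the sketch, is that asking
for the path of the root itself yields NULL, because no component is collected
before the loop ends.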




[Qemu-devel] [PATCH qemu v3 2/2] memory/hmp: Print owners/parents in "info mtree"

2018-05-30 Thread Alexey Kardashevskiy
This adds printing of owners/parents (which are the same, except that
occasionally owner==NULL) for memory regions; a new '-o' flag
enables the new output.

Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v3:
* removed QOM's "id" property as there are no objects left which would
have this property and own an MR

v2:
* cleanups
---
 include/exec/memory.h |  2 +-
 memory.c  | 68 +++
 monitor.c |  4 ++-
 hmp-commands-info.hx  |  7 +++---
 4 files changed, 66 insertions(+), 15 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 525619a..b98e918 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -1541,7 +1541,7 @@ void memory_global_dirty_log_start(void);
 void memory_global_dirty_log_stop(void);
 
 void mtree_info(fprintf_function mon_printf, void *f, bool flatview,
-bool dispatch_tree);
+bool dispatch_tree, bool owner);
 
 /**
  * memory_region_request_mmio_ptr: request a pointer to an mmio
diff --git a/memory.c b/memory.c
index fc7f9b7..d742bb2 100644
--- a/memory.c
+++ b/memory.c
@@ -2830,10 +2830,45 @@ typedef QTAILQ_HEAD(mrqueue, MemoryRegionList) 
MemoryRegionListHead;
int128_sub((size), int128_one())) : 0)
 #define MTREE_INDENT "  "
 
+static void mtree_expand_owner(fprintf_function mon_printf, void *f,
+   const char *label, Object *obj)
+{
+DeviceState *dev = (DeviceState *) object_dynamic_cast(obj, TYPE_DEVICE);
+
+mon_printf(f, " %s:{%s", label, dev ? "dev" : "obj");
+if (dev && dev->id) {
+mon_printf(f, " id=%s", dev->id);
+} else {
+gchar *canonical_path = object_get_canonical_path(obj);
+mon_printf(f, " path=%s", canonical_path);
+g_free(canonical_path);
+}
+mon_printf(f, "}");
+}
+
+static void mtree_print_mr_owner(fprintf_function mon_printf, void *f,
+ const MemoryRegion *mr)
+{
+Object *owner = mr->owner;
+Object *parent = memory_region_owner((MemoryRegion *)mr);
+
+if (!owner && !parent) {
+mon_printf(f, " orphan");
+return;
+}
+if (owner) {
+mtree_expand_owner(mon_printf, f, "owner", owner);
+}
+if (parent && parent != owner) {
+mtree_expand_owner(mon_printf, f, "parent", parent);
+}
+}
+
 static void mtree_print_mr(fprintf_function mon_printf, void *f,
const MemoryRegion *mr, unsigned int level,
hwaddr base,
-   MemoryRegionListHead *alias_print_queue)
+   MemoryRegionListHead *alias_print_queue,
+   bool owner)
 {
 MemoryRegionList *new_ml, *ml, *next_ml;
 MemoryRegionListHead submr_print_queue;
@@ -2879,7 +2914,7 @@ static void mtree_print_mr(fprintf_function mon_printf, 
void *f,
 }
 mon_printf(f, TARGET_FMT_plx "-" TARGET_FMT_plx
" (prio %d, %s): alias %s @%s " TARGET_FMT_plx
-   "-" TARGET_FMT_plx "%s\n",
+   "-" TARGET_FMT_plx "%s",
cur_start, cur_end,
mr->priority,
memory_region_type((MemoryRegion *)mr),
@@ -2888,15 +2923,22 @@ static void mtree_print_mr(fprintf_function mon_printf, 
void *f,
mr->alias_offset,
mr->alias_offset + MR_SIZE(mr->size),
mr->enabled ? "" : " [disabled]");
+if (owner) {
+mtree_print_mr_owner(mon_printf, f, mr);
+}
 } else {
 mon_printf(f,
-   TARGET_FMT_plx "-" TARGET_FMT_plx " (prio %d, %s): %s%s\n",
+   TARGET_FMT_plx "-" TARGET_FMT_plx " (prio %d, %s): %s%s",
cur_start, cur_end,
mr->priority,
memory_region_type((MemoryRegion *)mr),
memory_region_name(mr),
mr->enabled ? "" : " [disabled]");
+if (owner) {
+mtree_print_mr_owner(mon_printf, f, mr);
+}
 }
+mon_printf(f, "\n");
 
 QTAILQ_INIT(&submr_print_queue);
 
@@ -2919,7 +2961,7 @@ static void mtree_print_mr(fprintf_function mon_printf, 
void *f,
 
 QTAILQ_FOREACH(ml, &submr_print_queue, mrqueue) {
 mtree_print_mr(mon_printf, f, ml->mr, level + 1, cur_start,
-   alias_print_queue);
+   alias_print_queue, owner);
 }
 
 QTAILQ_FOREACH_SAFE(ml, &submr_print_queue, mrqueue, next_ml) {
@@ -2932,6 +2974,7 @@ struct FlatViewInfo {
 void *f;
 int counter;
 bool dispatch_tree;
+bool owner;
 };
 
 static void mtree_print_flatview(gpointer key, gpointer value,
@@ -2972,7 +3015,7 @@ static void mtree_print_flatview(gpointer key, gpointer 
value,
 mr = range->mr;
 if (range->offset_in_region) {
 p(f, MTREE_INDENT TARGET_FMT_plx "-"
-  TARGET_F

[Qemu-devel] [PATCH qemu v3 0/2] memory/hmp: Print owners/parents in "info mtree"

2018-05-30 Thread Alexey Kardashevskiy
This is a debug extension to "info mtree" to print a memory region owner/parent.

This is based on sha1
e609fa7 Peter Maydell "Merge remote-tracking branch 
'remotes/edgar/tags/edgar/xilinx-next-2018-05-29-v1.for-upstream' into staging".

Please comment. Thanks.



Alexey Kardashevskiy (2):
  object: Handle objects with no parents
  memory/hmp: Print owners/parents in "info mtree"

 include/exec/memory.h |  2 +-
 memory.c  | 68 +++
 monitor.c |  4 ++-
 qom/object.c  | 10 ++--
 hmp-commands-info.hx  |  7 +++---
 5 files changed, 74 insertions(+), 17 deletions(-)

-- 
2.11.0




Re: [Qemu-devel] [PATCH] qga: add mountpoint usage to GuestFilesystemInfo

2018-05-30 Thread Chen Hanxiao


At 2018-05-30 11:19:27, "Eric Blake"  wrote:
>On 05/29/2018 10:01 PM, Chen Hanxiao wrote:
>> From: Chen Hanxiao 
>> 
>> This patch adds support for getting the usage of mounted
>> filesystem.
>> It's very useful when we try to monitor guest's filesystem.
>> Use df of coreutils for reference.
>> 
>> Cc: Michael Roth 
>> Signed-off-by: Chen Hanxiao 
>> ---
>
>> @@ -1072,6 +1073,9 @@ static GuestFilesystemInfo *build_guest_fsinfo(struct 
>> FsMount *mount,
>>  Error **errp)
>>   {
>>   GuestFilesystemInfo *fs = g_malloc0(sizeof(*fs));
>> +struct statvfs buf;
>> +unsigned long u100, used, nonroot_total;
>> +int usage;
>>   char *devpath = g_strdup_printf("/sys/dev/block/%u:%u",
>>   mount->devmajor, mount->devminor);
>>   
>> @@ -1079,7 +1083,20 @@ static GuestFilesystemInfo *build_guest_fsinfo(struct 
>> FsMount *mount,
>>   fs->type = g_strdup(mount->devtype);
>>   build_guest_fsinfo_for_device(devpath, fs, errp);
>>   
>> +if (statvfs(fs->mountpoint, &buf)) {
>> +error_setg_errno(errp, errno, "Failed to get statvfs");
>> +return NULL;
>> +}
>> +
>> +used = buf.f_blocks - buf.f_bfree;
>> +u100 = 100 * used;
>> +nonroot_total = used + buf.f_bavail;
>> +usage = u100 / nonroot_total + (u100 % nonroot_total != 0);
>
>Why integral instead of floating point?

I followed the style of df from coreutils.
As the percentage is already multiplied by 100,
I think it has enough precision.
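
For reference, the integer computation from the quoted hunk can be tried
standalone. The sketch below is not the qga code: fs_usage_percent() is a
hypothetical helper that takes the three statvfs fields as plain arguments,
and the zero-denominator guard is an addition of this example.

```c
#include <assert.h>

/*
 * Percentage of space in use, rounded up to the next integer:
 * used / (used + blocks available to non-root users).
 * Returns -1 when the denominator is zero; that guard is an assumption of
 * this sketch and is not part of the quoted patch.
 */
int fs_usage_percent(unsigned long blocks, unsigned long bfree,
                     unsigned long bavail)
{
    unsigned long used = blocks - bfree;
    unsigned long u100 = 100 * used;
    unsigned long nonroot_total = used + bavail;

    if (nonroot_total == 0) {
        return -1;
    }
    /* Integer ceiling: add one when the division leaves a remainder. */
    return (int)(u100 / nonroot_total + (u100 % nonroot_total != 0));
}
```

With 1000 blocks, 400 free and 350 available, used is 600 and the denominator
is 950, so 60000 / 950 leaves a remainder and the result rounds up from 63 to
64.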

>
>> +++ b/qga/qapi-schema.json
>> @@ -846,13 +846,14 @@
>>   # @name: disk name
>>   # @mountpoint: mount point path
>>   # @type: file system type string
>> +# @usage: file system usage
>
>Needs more details.  As written, it is an integer between 0 and 100; but 
>if you use floating point, a better description would be a fraction 
>between 0 and 1.

Will be updated in the following patch.

Regards,
- Chen

>
>>   # @disk: an array of disk hardware information that the volume lies on,
>>   #which may be empty if the disk type is not supported
>>   #
>>   # Since: 2.2
>>   ##
>>   { 'struct': 'GuestFilesystemInfo',
>> -  'data': {'name': 'str', 'mountpoint': 'str', 'type': 'str',
>> +  'data': {'name': 'str', 'mountpoint': 'str', 'type': 'str', 'usage': 
>> 'int',
>>  'disk': ['GuestDiskAddress']} }
>>   
>>   ##
>> 
>
>-- 
>Eric Blake, Principal Software Engineer
>Red Hat, Inc.   +1-919-301-3266
>Virtualization:  qemu.org | libvirt.org


Re: [Qemu-devel] [PATCH] pc-bios/s390-ccw: define loadparm length

2018-05-30 Thread Cornelia Huck
On Tue, 29 May 2018 00:40:09 -0400
Collin Walling  wrote:

> Loadparm is defined by the s390 architecture to be 8 bytes
> in length. Let's define this size in the s390-ccw bios.
> 
> Suggested-by: Laszlo Ersek 
> Signed-off-by: Collin Walling 
> ---
>  pc-bios/s390-ccw/iplb.h | 4 +++-
>  pc-bios/s390-ccw/main.c | 8 
>  pc-bios/s390-ccw/sclp.c | 2 +-
>  pc-bios/s390-ccw/sclp.h | 2 +-
>  4 files changed, 9 insertions(+), 7 deletions(-)

Reviewed-by: Cornelia Huck 

Thomas, I assume this will go via your tree?



Re: [Qemu-devel] [PATCH] vfio/pci: Default display option to "off"

2018-05-30 Thread Erik Skultety
On Tue, May 29, 2018 at 09:24:08AM -0600, Alex Williamson wrote:
> [Cc +Erik,libvirt]
>
> Sorry, should have cc'd libvirt with this initially since display
> support is under development.  I think "off" is the better
> compatibility option, but perhaps the damage is done since it was the
> 2.12 default.  Thanks,

Thanks for CC'ing me. So, the approach I took with the initial design to be
sent as an RFC is to default to display=off in libvirt. My implementation
lacks support for display=auto, solely because the way libvirt would need to
get all the information about devices was cumbersome: it would have to know
whether a device uses dmabuf or vfio regions, rather than guess whether a
missing gl=on signals dmabuf or was just a mistake. Therefore, we're going to
default to egl-headless unless explicitly asked to turn on OpenGL. Back to
the 'auto' value: it would also mean libvirt making policy decisions, which
is not what libvirt is supposed to do. Plus, in this case it's difficult for
any layer to guess what the user's intention with the device was, whether it
was supposed to be used for graphics or not. We can add support for 'auto'
once there are compelling reasons for us to do so and the overall consensus
is in favour of implementing it. However, we wouldn't change the display=off
default anyway, since it's much safer for us.

Erik

>
> Alex
>
> On Tue, 29 May 2018 09:18:10 -0600
> Alex Williamson  wrote:
>
> > Commit a9994687cb9b ("vfio/display: core & wireup") added display
> > support to vfio-pci with the default being "auto", which breaks
> > existing VMs when the vGPU requires GL support but had no previous
> > requirement for a GL compatible configuration.  "Off" is the safer
> > default as we impose no new requirements to VM configurations.
> >
> > Fixes: a9994687cb9b ("vfio/display: core & wireup")
> > Cc: qemu-sta...@nongnu.org
> > Cc: Gerd Hoffmann 
> > Signed-off-by: Alex Williamson 
> > ---
> >  hw/vfio/pci.c |2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> > index 84e27c7bb2d1..18c493b49ec1 100644
> > --- a/hw/vfio/pci.c
> > +++ b/hw/vfio/pci.c
> > @@ -3160,7 +3160,7 @@ static Property vfio_pci_dev_properties[] = {
> >  DEFINE_PROP_PCI_HOST_DEVADDR("host", VFIOPCIDevice, host),
> >  DEFINE_PROP_STRING("sysfsdev", VFIOPCIDevice, vbasedev.sysfsdev),
> >  DEFINE_PROP_ON_OFF_AUTO("display", VFIOPCIDevice,
> > -display, ON_OFF_AUTO_AUTO),
> > +display, ON_OFF_AUTO_OFF),
> >  DEFINE_PROP_UINT32("x-intx-mmap-timeout-ms", VFIOPCIDevice,
> > intx.mmap_timeout, 1100),
> >  DEFINE_PROP_BIT("x-vga", VFIOPCIDevice, features,
> >
>



[Qemu-devel] [PATCH 0/4] aspeed: add MMIO exec support to the FMC controller

2018-05-30 Thread Cédric Le Goater
Hello,

When MMIO execution support is active, these changes let the Aspeed
SoC machine boot directly from CE0. As there are still some
issues, the feature is disabled by default and should be activated
with:

-global driver=aspeed.smc,property=mmio-exec,value=true

Thanks,

C.

Cédric Le Goater (4):
  aspeed/smc: fix HW strapping
  aspeed/smc: rename aspeed_smc_flash_send_addr() to
aspeed_smc_flash_setup()
  aspeed/smc: add a new memory region dedicated to MMIO execution
  hw/arm/aspeed: boot from the FMC CE0 flash module

 include/hw/ssi/aspeed_smc.h |   7 ++
 hw/arm/aspeed.c |  41 +++
 hw/ssi/aspeed_smc.c | 161 +---
 3 files changed, 172 insertions(+), 37 deletions(-)

-- 
2.13.6




[Qemu-devel] [PATCH 2/4] aspeed/smc: rename aspeed_smc_flash_send_addr() to aspeed_smc_flash_setup()

2018-05-30 Thread Cédric Le Goater
Also handle the fake transfers for dummy bytes in this setup
routine. It will be useful when we activate MMIO execution.

Signed-off-by: Cédric Le Goater 
---
 hw/ssi/aspeed_smc.c | 31 ---
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/hw/ssi/aspeed_smc.c b/hw/ssi/aspeed_smc.c
index f33ec87fcb74..5808181568c4 100644
--- a/hw/ssi/aspeed_smc.c
+++ b/hw/ssi/aspeed_smc.c
@@ -496,10 +496,11 @@ static int aspeed_smc_flash_dummies(const AspeedSMCFlash 
*fl)
 return ((dummy_high << 2) | dummy_low) * 8;
 }
 
-static void aspeed_smc_flash_send_addr(AspeedSMCFlash *fl, uint32_t addr)
+static void aspeed_smc_flash_setup(AspeedSMCFlash *fl, uint32_t addr)
 {
 const AspeedSMCState *s = fl->controller;
 uint8_t cmd = aspeed_smc_flash_cmd(fl);
+int i;
 
 /* Flash access can not exceed CS segment */
 addr = aspeed_smc_check_segment_addr(fl, addr);
@@ -512,6 +513,18 @@ static void aspeed_smc_flash_send_addr(AspeedSMCFlash *fl, 
uint32_t addr)
 ssi_transfer(s->spi, (addr >> 16) & 0xff);
 ssi_transfer(s->spi, (addr >> 8) & 0xff);
 ssi_transfer(s->spi, (addr & 0xff));
+
+/*
+ * Use fake transfers to model dummy bytes. The value should
+ * be configured to some non-zero value in fast read mode and
+ * zero in read mode. But, as the HW allows inconsistent
+ * settings, let's check for fast read mode.
+ */
+if (aspeed_smc_flash_mode(fl) == CTRL_FREADMODE) {
+for (i = 0; i < aspeed_smc_flash_dummies(fl); i++) {
+ssi_transfer(fl->controller->spi, 0xFF);
+}
+}
 }
 
 static uint64_t aspeed_smc_flash_read(void *opaque, hwaddr addr, unsigned size)
@@ -530,19 +543,7 @@ static uint64_t aspeed_smc_flash_read(void *opaque, hwaddr 
addr, unsigned size)
 case CTRL_READMODE:
 case CTRL_FREADMODE:
 aspeed_smc_flash_select(fl);
-aspeed_smc_flash_send_addr(fl, addr);
-
-/*
- * Use fake transfers to model dummy bytes. The value should
- * be configured to some non-zero value in fast read mode and
- * zero in read mode. But, as the HW allows inconsistent
- * settings, let's check for fast read mode.
- */
-if (aspeed_smc_flash_mode(fl) == CTRL_FREADMODE) {
-for (i = 0; i < aspeed_smc_flash_dummies(fl); i++) {
-ssi_transfer(fl->controller->spi, 0xFF);
-}
-}
+aspeed_smc_flash_setup(fl, addr);
 
 for (i = 0; i < size; i++) {
 ret |= ssi_transfer(s->spi, 0x0) << (8 * i);
@@ -579,7 +580,7 @@ static void aspeed_smc_flash_write(void *opaque, hwaddr 
addr, uint64_t data,
 break;
 case CTRL_WRITEMODE:
 aspeed_smc_flash_select(fl);
-aspeed_smc_flash_send_addr(fl, addr);
+aspeed_smc_flash_setup(fl, addr);
 
 for (i = 0; i < size; i++) {
 ssi_transfer(s->spi, (data >> (8 * i)) & 0xff);
-- 
2.13.6
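
To make the ordering in the renamed routine easier to follow, here is a small
sketch with a mock bus. MockSpi, spi_send(), flash_setup() and the FlashMode
values are hypothetical; the sketch assumes the command byte goes out first
and that 3-byte addressing is in use, and it is not the aspeed_smc code.

```c
#include <assert.h>
#include <stdint.h>

enum FlashMode { MODE_READ, MODE_FAST_READ, MODE_WRITE };

/* Hypothetical SPI bus that only records what was sent. */
typedef struct {
    uint8_t sent[64];
    unsigned count;
} MockSpi;

void spi_send(MockSpi *spi, uint8_t byte)
{
    assert(spi->count < sizeof(spi->sent));
    spi->sent[spi->count++] = byte;
}

/*
 * Send the command, a 3-byte address and, in fast read mode only, the
 * configured number of dummy bytes. Callers transfer the data afterwards.
 */
void flash_setup(MockSpi *spi, enum FlashMode mode, uint8_t cmd,
                 uint32_t addr, unsigned dummies)
{
    unsigned i;

    spi_send(spi, cmd);
    spi_send(spi, (addr >> 16) & 0xff);
    spi_send(spi, (addr >> 8) & 0xff);
    spi_send(spi, addr & 0xff);

    /* The dummy count is ignored unless the mode is fast read. */
    if (mode == MODE_FAST_READ) {
        for (i = 0; i < dummies; i++) {
            spi_send(spi, 0xFF);
        }
    }
}
```

Folding the dummy bytes into the setup step means every caller that needs a
fast read gets the same sequence without repeating the loop.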




[Qemu-devel] [PATCH 3/4] aspeed/smc: add a new memory region dedicated to MMIO execution

2018-05-30 Thread Cédric Le Goater
Aspeed SoCs are generally booted from one of the flash modules
behind the FMC controller. The FMC CS0 flash module is mapped at a
specific address depending on the SoC revision and also at 0x0, the
default boot-up address.

To support this second mapping, we add a new 'ROM' like memory region
under the FMC flash module model and activate support for MMIO
execution with a 'request_ptr' handler. The latter fills up a cache of
flash content to be executed or read by the boot up process.

Also add a 'mmio_exec' bool to activate the feature, which still has
some issues.

Signed-off-by: Cédric Le Goater 
---
 include/hw/ssi/aspeed_smc.h |   7 +++
 hw/ssi/aspeed_smc.c | 122 +++-
 2 files changed, 128 insertions(+), 1 deletion(-)
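
The caching scheme from the commit message, reduced to a standalone sketch.
FlashCache and the cache_*() functions are hypothetical, the flash is modelled
as a flat byte array instead of SPI transfers, and the explicit "empty cache"
check is an addition of this example. As in the patch, a read that misses the
cache goes straight to the flash without refilling the cache.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SIZE 1024u            /* must be a power of two */
#define CACHE_INVALID UINT64_MAX    /* marks an empty cache */

/* Hypothetical flash model: a flat byte array plus a one-window cache. */
typedef struct {
    const uint8_t *flash;           /* backing flash contents */
    uint8_t cache[CACHE_SIZE];
    uint64_t cache_addr;            /* aligned base, or CACHE_INVALID */
    unsigned loads;                 /* number of cache fills */
} FlashCache;

void cache_init(FlashCache *fc, const uint8_t *flash)
{
    fc->flash = flash;
    fc->cache_addr = CACHE_INVALID;
    fc->loads = 0;
}

/* True when a 4-byte access at addr lies entirely inside the cache. */
int cache_hit(const FlashCache *fc, uint64_t addr)
{
    return fc->cache_addr != CACHE_INVALID &&
           addr >= fc->cache_addr &&
           addr <= fc->cache_addr + CACHE_SIZE - 4;
}

/* Fill the cache with the aligned CACHE_SIZE window that contains addr. */
void cache_load(FlashCache *fc, uint64_t addr)
{
    uint64_t base = addr & ~(uint64_t)(CACHE_SIZE - 1);

    memcpy(fc->cache, fc->flash + base, CACHE_SIZE);
    fc->cache_addr = base;
    fc->loads++;
}

/* Model of the request handler: make sure addr is cached, return the cache. */
const uint8_t *cache_request(FlashCache *fc, uint64_t addr)
{
    if (!cache_hit(fc, addr)) {
        cache_load(fc, addr);
    }
    return fc->cache;
}

/* Read `size` bytes (1..4) at addr as a little-endian value. */
uint32_t cache_read(const FlashCache *fc, uint64_t addr, unsigned size)
{
    const uint8_t *src;
    uint32_t ret = 0;
    unsigned i;

    assert(size >= 1 && size <= 4);
    /* Use the cache when possible, otherwise read the flash directly. */
    src = cache_hit(fc, addr) ? fc->cache + (addr - fc->cache_addr)
                              : fc->flash + addr;
    for (i = 0; i < size; i++) {
        ret |= (uint32_t)src[i] << (8 * i);
    }
    return ret;
}
```

Masking with ~(CACHE_SIZE - 1) rounds the address down to the start of its
1K window, so a request for 0x4a0 fills the cache from 0x400.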

diff --git a/include/hw/ssi/aspeed_smc.h b/include/hw/ssi/aspeed_smc.h
index 1f557313fa93..5e853afe725d 100644
--- a/include/hw/ssi/aspeed_smc.h
+++ b/include/hw/ssi/aspeed_smc.h
@@ -48,6 +48,8 @@ typedef struct AspeedSMCController {
 uint32_t nregs;
 } AspeedSMCController;
 
+#define ASPEED_SMC_CACHE_SIZE1024  /* 1K is the minimum */
+
 typedef struct AspeedSMCFlash {
 struct AspeedSMCState *controller;
 
@@ -56,6 +58,10 @@ typedef struct AspeedSMCFlash {
 
 MemoryRegion mmio;
 DeviceState *flash;
+
+MemoryRegion mmio_rom;
+uint8_t cache[ASPEED_SMC_CACHE_SIZE];
+hwaddr cache_addr;
 } AspeedSMCFlash;
 
 #define TYPE_ASPEED_SMC "aspeed.smc"
@@ -79,6 +85,7 @@ typedef struct AspeedSMCState {
 
 MemoryRegion mmio;
 MemoryRegion mmio_flash;
+bool mmio_exec;
 
 qemu_irq irq;
 int irqline;
diff --git a/hw/ssi/aspeed_smc.c b/hw/ssi/aspeed_smc.c
index 5808181568c4..d599eebc7d21 100644
--- a/hw/ssi/aspeed_smc.c
+++ b/hw/ssi/aspeed_smc.c
@@ -604,6 +604,96 @@ static const MemoryRegionOps aspeed_smc_flash_ops = {
 },
 };
 
+static bool aspeed_smc_flash_rom_is_cached(AspeedSMCFlash *fl, hwaddr addr)
+{
+return (addr >= fl->cache_addr &&
+addr <= fl->cache_addr + ASPEED_SMC_CACHE_SIZE - 4);
+}
+
+static void aspeed_smc_flash_rom_load_cache(AspeedSMCFlash *fl, hwaddr addr)
+{
+AspeedSMCState *s = fl->controller;
+hwaddr cache_addr = addr & ~(ASPEED_SMC_CACHE_SIZE - 1);
+int i;
+
+if (fl->cache_addr != ~0ULL) {
+memory_region_invalidate_mmio_ptr(&fl->mmio_rom, fl->cache_addr,
+  ASPEED_SMC_CACHE_SIZE);
+}
+
+aspeed_smc_flash_select(fl);
+aspeed_smc_flash_setup(fl, cache_addr);
+
+for (i = 0; i < ASPEED_SMC_CACHE_SIZE; i++) {
+fl->cache[i] = ssi_transfer(s->spi, 0x0);
+}
+
+aspeed_smc_flash_unselect(fl);
+
+fl->cache_addr = cache_addr;
+}
+
+static void *aspeed_smc_flash_rom_request_ptr(void *opaque, hwaddr addr,
+  unsigned *size, unsigned *offset)
+{
+AspeedSMCFlash *fl = opaque;
+
+if (!aspeed_smc_flash_rom_is_cached(fl, addr)) {
+aspeed_smc_flash_rom_load_cache(fl, addr);
+}
+
+*size = ASPEED_SMC_CACHE_SIZE;
+*offset = fl->cache_addr;
+return fl->cache;
+}
+
+static uint64_t aspeed_smc_flash_rom_read(void *opaque, hwaddr addr,
+  unsigned size)
+{
+AspeedSMCFlash *fl = opaque;
+AspeedSMCState *s = fl->controller;
+uint64_t ret = 0;
+int i;
+
+/*
+ * Transfer or use the cache if possible. Reloading the cache
+ * while loading from the flash can break the TCG execution flow.
+ */
+if (!aspeed_smc_flash_rom_is_cached(fl, addr)) {
+aspeed_smc_flash_select(fl);
+aspeed_smc_flash_setup(fl, addr);
+
+for (i = 0; i < size; i++) {
+ret |= (uint64_t) ssi_transfer(s->spi, 0x0) << (8 * i);
+}
+
+aspeed_smc_flash_unselect(fl);
+} else {
+for (i = 0; i < size; i++) {
+ret |= (uint64_t) fl->cache[addr - fl->cache_addr + i] << (8 * i);
+}
+}
+return ret;
+}
+
+static void aspeed_smc_flash_rom_write(void *opaque, hwaddr addr, uint64_t 
data,
+   unsigned size)
+{
+qemu_log_mask(LOG_GUEST_ERROR, "%s: flash is not writable at 0x%"
+  HWADDR_PRIx "\n", __func__, addr);
+}
+
+static const MemoryRegionOps aspeed_smc_flash_rom_ops = {
+.read = aspeed_smc_flash_rom_read,
+.write = aspeed_smc_flash_rom_write,
+.request_ptr = aspeed_smc_flash_rom_request_ptr,
+.endianness = DEVICE_LITTLE_ENDIAN,
+.valid = {
+.min_access_size = 1,
+.max_access_size = 4,
+},
+};
+
 static void aspeed_smc_flash_update_cs(AspeedSMCFlash *fl)
 {
 const AspeedSMCState *s = fl->controller;
@@ -778,21 +868,51 @@ static void aspeed_smc_realize(DeviceState *dev, Error 
**errp)
   fl, name, fl->size);
 memory_region_add_subregion(&s->mmio_flash, offset, &fl->mmio);
 offset += fl->size;
+
+/*
+ * The system is generally booted from one of th

[Qemu-devel] [PATCH 1/4] aspeed/smc: fix HW strapping

2018-05-30 Thread Cédric Le Goater
Only the flash type is strapped by HW. The 4BYTE mode is set by
firmware when the flash device is detected.

Signed-off-by: Cédric Le Goater 
---
 hw/ssi/aspeed_smc.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/hw/ssi/aspeed_smc.c b/hw/ssi/aspeed_smc.c
index 5059396bc623..f33ec87fcb74 100644
--- a/hw/ssi/aspeed_smc.c
+++ b/hw/ssi/aspeed_smc.c
@@ -632,23 +632,17 @@ static void aspeed_smc_reset(DeviceState *d)
 aspeed_smc_segment_to_reg(&s->ctrl->segments[i]);
 }
 
-/* HW strapping for AST2500 FMC controllers  */
+/* HW strapping flash type for FMC controllers  */
 if (s->ctrl->segments == aspeed_segments_ast2500_fmc) {
 /* flash type is fixed to SPI for CE0 and CE1 */
 s->regs[s->r_conf] |= (CONF_FLASH_TYPE_SPI << CONF_FLASH_TYPE0);
 s->regs[s->r_conf] |= (CONF_FLASH_TYPE_SPI << CONF_FLASH_TYPE1);
-
-/* 4BYTE mode is autodetected for CE0. Let's force it to 1 for
- * now */
-s->regs[s->r_ce_ctrl] |= (1 << (CTRL_EXTENDED0));
 }
 
 /* HW strapping for AST2400 FMC controllers (SCU70). Let's use the
  * configuration of the palmetto-bmc machine */
 if (s->ctrl->segments == aspeed_segments_fmc) {
 s->regs[s->r_conf] |= (CONF_FLASH_TYPE_SPI << CONF_FLASH_TYPE0);
-
-s->regs[s->r_ce_ctrl] |= (1 << (CTRL_EXTENDED0));
 }
 }
 
-- 
2.13.6




[Qemu-devel] [PATCH 4/4] hw/arm/aspeed: boot from the FMC CE0 flash module

2018-05-30 Thread Cédric Le Goater
When MMIO execution is activated on the FMC controller, remove the
copy of the flash module contents in a ROM memory region and boot
directly from CE0.

Booting from an alternate module (FMC CE1) should be possible. Some
work is still needed to support the module swap when the watchdog resets
the system.

Signed-off-by: Cédric Le Goater 
---
 hw/arm/aspeed.c | 41 +++--
 1 file changed, 27 insertions(+), 14 deletions(-)

diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
index e28170b7e1d8..e87a5899845f 100644
--- a/hw/arm/aspeed.c
+++ b/hw/arm/aspeed.c
@@ -160,6 +160,32 @@ static void write_boot_rom(DriveInfo *dinfo, hwaddr addr, 
size_t rom_size,
 g_free(storage);
 }
 
+static void install_boot_rom(AspeedBoardState *bmc, DriveInfo *drive,
+ hwaddr addr)
+{
+AspeedSMCState *fmc = &bmc->soc.fmc;
+AspeedSMCFlash *fl = &fmc->flashes[0];
+bool mmio_exec = object_property_get_bool(OBJECT(fmc), "mmio-exec",
+  &error_abort);
+if (mmio_exec) {
+memory_region_add_subregion(get_system_memory(), addr,
+&fl->mmio_rom);
+} else {
+MemoryRegion *boot_rom = g_new(MemoryRegion, 1);
+
+/*
+ * create a ROM region using the default mapping window size of
+ * the flash module. The window size is 64MB for the AST2400
+ * SoC and 128MB for the AST2500 SoC, which is twice as big as
+ * needed by the flash modules of the Aspeed machines.
+ */
+memory_region_init_rom(boot_rom, OBJECT(bmc), "aspeed.boot_rom",
+   fl->size, &error_abort);
+memory_region_add_subregion(get_system_memory(), addr, boot_rom);
+write_boot_rom(drive, addr, fl->size, &error_abort);
+}
+}
+
 static void aspeed_board_init_flashes(AspeedSMCState *s, const char *flashtype,
   Error **errp)
 {
@@ -232,20 +258,7 @@ static void aspeed_board_init(MachineState *machine,
 
 /* Install first FMC flash content as a boot rom. */
 if (drive0) {
-AspeedSMCFlash *fl = &bmc->soc.fmc.flashes[0];
-MemoryRegion *boot_rom = g_new(MemoryRegion, 1);
-
-/*
- * create a ROM region using the default mapping window size of
- * the flash module. The window size is 64MB for the AST2400
- * SoC and 128MB for the AST2500 SoC, which is twice as big as
- * needed by the flash modules of the Aspeed machines.
- */
-memory_region_init_rom(boot_rom, OBJECT(bmc), "aspeed.boot_rom",
-   fl->size, &error_abort);
-memory_region_add_subregion(get_system_memory(), FIRMWARE_ADDR,
-boot_rom);
-write_boot_rom(drive0, FIRMWARE_ADDR, fl->size, &error_abort);
+install_boot_rom(bmc, drive0, FIRMWARE_ADDR);
 }
 
 aspeed_board_binfo.kernel_filename = machine->kernel_filename;
-- 
2.13.6




Re: [Qemu-devel] [PATCH] pc-bios/s390-ccw: define loadparm length

2018-05-30 Thread Thomas Huth
On 30.05.2018 09:47, Cornelia Huck wrote:
> On Tue, 29 May 2018 00:40:09 -0400
> Collin Walling  wrote:
> 
>> Loadparm is defined by the s390 architecture to be 8 bytes
>> in length. Let's define this size in the s390-ccw bios.
>>
>> Suggested-by: Laszlo Ersek 
>> Signed-off-by: Collin Walling 
>> ---
>>  pc-bios/s390-ccw/iplb.h | 4 +++-
>>  pc-bios/s390-ccw/main.c | 8 
>>  pc-bios/s390-ccw/sclp.c | 2 +-
>>  pc-bios/s390-ccw/sclp.h | 2 +-
>>  4 files changed, 9 insertions(+), 7 deletions(-)
> 
> Reviewed-by: Cornelia Huck 
> 
> Thomas, I assume this will go via your tree?

Yes, I'll pick it up (I'll include it with the new version of the
pxelinux.cfg patches once they are ready).

 Thomas



Re: [Qemu-devel] [RFC] monitor: turn on Out-Of-Band by default again

2018-05-30 Thread Peter Xu
On Tue, May 22, 2018 at 02:40:26PM -0400, John Snow wrote:
> 
> 
> On 05/21/2018 10:13 AM, Eric Blake wrote:
> > On 05/21/2018 03:42 AM, Peter Xu wrote:
> >> We turned Out-Of-Band feature of monitors off for 2.12 release.  Now we
> >> try to turn that on again.
> > 
> > "try to turn" sounds weak, like you aren't sure of this patch.  If you
> > aren't sure, then why should we feel safe in applying it?  This text is
> > going in the permanent git history, so sound bold, rather than hesitant!
> > 
> > "We have resolved the issues from last time (commit 3fd2457d reverted by
> > commit a4f90923):
> > - issue 1 ...
> > - issue 2 ...
> > So now we are ready to enable advertisement of the feature by default"
> > 
> > with better descriptions of the issues that you fixed (I can think of at
> > least the fixes adding thread-safety to the current monitor, and fixing
> > early use of the monitor before qmp_capabilities completes; there may
> > also be other issues that you want to call out).
> > 
> >>
> >> Signed-off-by: Peter Xu 
> >> -- 
> >> Now OOB should be okay with all known tests (except iotest qcow2, since
> >> it is still broken on master),
> > 
> > Which tests are still failing for you?  Ideally, you can still
> > demonstrate that the tests not failing without this patch continue to
> > pass with this patch, even if you call out the tests that have known
> > issues to still be resolved.
> > 
> 
> Probably 91 and 169. If any others fail that's news to me.

I just gave it a shot on my workstation too (./check -qcow2):

Not run: 045 059 064 070 075 076 077 078 081 083 084 088 092 093 094 101 106 
109 113 116 119 123 128 131 135 136 146 148 149 160 162 171 173 175 199 207 210 
3
Failures: 087 188 189 198 206
Failed 5 of 167 tests

I'm testing against master, e609fa7.

Thanks,

-- 
Peter Xu



[Qemu-devel] [Bug 1774149] [NEW] qemu-user x86_64 x86 gdb call function from gdb doesn't work

2018-05-30 Thread mou
Public bug reported:

When running qemu-user (x86_64 or x86) with the gdb server enabled, calling
functions from gdb does not work.

Here is how to reproduce it:

run in a terminal:
$ qemu-x86_64 -g 12345 -L / /bin/ls

In another terminal run gdb:
(gdb) file /bin/ls
(gdb) target remote :12345
(gdb) b _init
(gdb) c
(gdb) call malloc(1)
Could not fetch register "fs_base"; remote failure reply 'E14'

In other cases we also got the error:
Could not fetch register "orig_rax"; remote failure reply 'E14'

Here is how I patched it (it is only a workaround):

diff --git a/gdbstub.c b/gdbstub.c
index 2a94030..5749efe 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -668,6 +668,11 @@ static int gdb_read_register(CPUState *cpu, uint8_t 
*mem_buf, int reg)
 return r->get_reg(env, mem_buf, reg - r->base_reg);
 }
 }
+#ifdef TARGET_X86_64
+return 8;
+#elif TARGET_I386
+return 4;
+#endif
 return 0;
 }

(Our guess is that gdb requests these 'fake' registers in order to learn
the register size.)

Once we patched that, we got another problem while calling functions
from gdb: We could call functions, but only once.

Here is how to reproduce it:
run in a terminal:
$ qemu-x86_64 -g 12345 -L / /bin/ls

In another terminal run gdb:
(gdb) file /bin/ls
(gdb) target remote :12345
(gdb) b _init
(gdb) c
(gdb) call malloc(1)
$1 = (void *) 0x620010
(gdb) call malloc(1)
Cannot access memory at address 0x40007ffb8f

Here is how we patched it to make it work:

diff --git a/exec.c b/exec.c
index 03238a3..d303922 100644
--- a/exec.c
+++ b/exec.c
@@ -2833,7 +2833,7 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong addr,
 if (!(flags & PAGE_VALID))
 return -1;
 if (is_write) {
-if (!(flags & PAGE_WRITE))
+if (!(flags & (PAGE_WRITE | PAGE_WRITE_ORG)))
 return -1;
 /* XXX: this code should not depend on lock_user */
 if (!(p = lock_user(VERIFY_WRITE, addr, l, 0)))

From what we saw, a page is switched to read-only after the first call,
and gdb needs to write to that page to place a breakpoint (on the stack,
to catch the function return).

We suspect this is linked to this:
https://qemu.weilnetz.de/w64/2012/2012-06-28/qemu-tech.html#Self_002dmodifying-code-and-translated-code-invalidation
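
The write check changed by the exec.c workaround can be sketched in isolation. This is only an illustration under stated assumptions: the flag values are the ones I recall from QEMU's headers and should be verified against the tree, and debug_write_allowed() is a hypothetical helper, not a function in QEMU. The idea is that PAGE_WRITE may be cleared on a page holding translated code, while PAGE_WRITE_ORG still records that the guest originally mapped it writable.

```c
#include <stdbool.h>

/* Assumed flag values; check them against the QEMU headers before relying on them. */
#define PAGE_WRITE      0x0002
#define PAGE_VALID      0x0008
#define PAGE_WRITE_ORG  0x0010

/*
 * Hypothetical helper: decide whether a debugger write to a page with
 * these flags should be permitted. A page that is currently read-only
 * but was originally writable (PAGE_WRITE_ORG) is accepted, which is
 * what the workaround above does.
 */
static bool debug_write_allowed(int flags)
{
    if (!(flags & PAGE_VALID)) {
        return false;
    }
    return (flags & (PAGE_WRITE | PAGE_WRITE_ORG)) != 0;
}
```

With the unpatched check, a valid page carrying only PAGE_WRITE_ORG would be rejected, which matches the "Cannot access memory" failure on the second call.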

** Affects: qemu
 Importance: Undecided
 Status: New

** Summary changed:

- qemu-user x86_64 x86 gdb call function not working properly
+ qemu-user x86_64 x86 gdb call function from gdb doesn't work


[Qemu-devel] [PATCH v2] qga: add mountpoint usage to GuestFilesystemInfo

2018-05-30 Thread Chen Hanxiao
From: Chen Hanxiao 

This patch adds support for reporting the usage of each mounted
filesystem.
This is useful for monitoring the guest's filesystems.

Cc: Michael Roth 
Signed-off-by: Chen Hanxiao 
---
v2:
   add description in qapi-schema and version numbers

 qga/commands-posix.c | 17 +
 qga/qapi-schema.json |  3 ++-
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/qga/commands-posix.c b/qga/commands-posix.c
index 0dc219dbcf..0d93c47a5d 100644
--- a/qga/commands-posix.c
+++ b/qga/commands-posix.c
@@ -46,6 +46,7 @@ extern char **environ;
 #include 
 #include 
 #include 
+#include 
 
 #ifdef FIFREEZE
 #define CONFIG_FSFREEZE
@@ -1072,6 +1073,9 @@ static GuestFilesystemInfo *build_guest_fsinfo(struct 
FsMount *mount,
Error **errp)
 {
 GuestFilesystemInfo *fs = g_malloc0(sizeof(*fs));
+struct statvfs buf;
+unsigned long u100, used, nonroot_total;
+int usage;
 char *devpath = g_strdup_printf("/sys/dev/block/%u:%u",
 mount->devmajor, mount->devminor);
 
@@ -1079,7 +1083,20 @@ static GuestFilesystemInfo *build_guest_fsinfo(struct 
FsMount *mount,
 fs->type = g_strdup(mount->devtype);
 build_guest_fsinfo_for_device(devpath, fs, errp);
 
+if (statvfs(fs->mountpoint, &buf)) {
+error_setg_errno(errp, errno, "Failed to get statvfs");
+return NULL;
+}
+
+used = buf.f_blocks - buf.f_bfree;
+u100 = 100 * used;
+nonroot_total = used + buf.f_bavail;
+usage = u100 / nonroot_total + (u100 % nonroot_total != 0);
+
+fs->usage = usage;
+
 g_free(devpath);
+
 return fs;
 }
 
diff --git a/qga/qapi-schema.json b/qga/qapi-schema.json
index 17884c7c70..98611b49af 100644
--- a/qga/qapi-schema.json
+++ b/qga/qapi-schema.json
@@ -846,13 +846,14 @@
 # @name: disk name
 # @mountpoint: mount point path
 # @type: file system type string
+# @usage: file system usage, integer between 0 and 100 (since 3.0)
 # @disk: an array of disk hardware information that the volume lies on,
 #which may be empty if the disk type is not supported
 #
 # Since: 2.2
 ##
 { 'struct': 'GuestFilesystemInfo',
-  'data': {'name': 'str', 'mountpoint': 'str', 'type': 'str',
+  'data': {'name': 'str', 'mountpoint': 'str', 'type': 'str', 'usage': 'int',
'disk': ['GuestDiskAddress']} }
 
 ##
-- 
2.17.0
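
For reference, the rounding used in the patch above can be tried outside of qemu-ga. The sketch below takes the three statvfs counters as plain integers so that it stays self-contained; fs_usage_percent() is a hypothetical name, and the -1 return for an empty filesystem is my assumption (the patch as posted divides without that check).

```c
#include <stdint.h>

/*
 * Used space as a percentage of the space available to non-root users,
 * rounded up. Arguments mirror the f_blocks, f_bfree and f_bavail fields
 * of struct statvfs. Returns -1 when there is nothing to measure
 * (assumption: callers treat that as "unknown").
 */
static int fs_usage_percent(uint64_t f_blocks, uint64_t f_bfree,
                            uint64_t f_bavail)
{
    uint64_t used = f_blocks - f_bfree;
    uint64_t nonroot_total = used + f_bavail;
    uint64_t u100;

    if (nonroot_total == 0) {
        return -1;
    }
    u100 = 100 * used;
    /* Integer division plus one if there is a remainder: ceiling. */
    return (int)(u100 / nonroot_total + (u100 % nonroot_total != 0));
}
```

For example, 1000 blocks with 500 free and 400 available to non-root gives 500 used out of 900, which rounds up to 56.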




Re: [Qemu-devel] qemu:handle_cpu_signal received signal outside vCPU context

2018-05-30 Thread Andreas Schwab
On Mar 19 2018, Michael Clark  wrote:

> On Mon, Mar 19, 2018 at 9:17 AM, Andreas Schwab  wrote:
>
>> I'm seeing this error while building gedit for riscv64 with linux-user
>> emulation:
>>
>> $ LD_LIBRARY_PATH=gedit/.libs qemu-riscv64 gedit/.libs/gedit
>> --introspect-dump=/tmp/tmp-introspectnj0xla07/functions.txt,
>> /tmp/tmp-introspectnj0xla07/dump.xml
>> qemu:handle_cpu_signal received signal outside vCPU context @ pc=0x6003d7d5
>> qemu:handle_cpu_signal received signal outside vCPU context @ pc=0x60106a16
>>
>
> Thanks. I can see this code in accel/tcg/user-exec.c
>
> It would be nice if that log message included the signal number. I wonder
> if we are getting a SIGSEGV. I also wonder what thread is actually
> running...

A native build doesn't see any issues, so this looks like a genuine qemu
bug.

> I wonder what is the best way for me to reproduce on my side... a tarball
> with binaries that I can use to trigger the fault?

There are some images under

that can be used as a base.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."



Re: [Qemu-devel] [PATCH v3] ARM: ACPI: Fix use-after-free due to memory realloc

2018-05-30 Thread Auger Eric
Hi Shannon,

On 05/30/2018 09:05 AM, Shannon Zhao wrote:
> acpi_data_push uses g_array_set_size to resize the memory. If there
> is not enough contiguous memory, the address will change, so the previous
> pointer can no longer be used. The code must update the pointer and use
> the new one.
> 
> Also, the previous code wrongly uses the le32-converted iort->node_offset
> for subsequent computations, which yields an incorrect value if the host is
> not little endian. So use the non-converted one instead.
> 
> Signed-off-by: Shannon Zhao 
Reviewed-by: Eric Auger 

Thanks

Eric
> ---
> V3: Fix typo and add some words in commit message to explain another bug
> ---
>  hw/arm/virt-acpi-build.c | 20 +++-
>  1 file changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 92ceee9..74f5744 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -400,7 +400,7 @@ build_iort(GArray *table_data, BIOSLinker *linker, 
> VirtMachineState *vms)
>  AcpiIortItsGroup *its;
>  AcpiIortTable *iort;
>  AcpiIortSmmu3 *smmu;
> -size_t node_size, iort_length, smmu_offset = 0;
> +size_t node_size, iort_node_offset, iort_length, smmu_offset = 0;
>  AcpiIortRC *rc;
>  
>  iort = acpi_data_push(table_data, sizeof(*iort));
> @@ -413,7 +413,12 @@ build_iort(GArray *table_data, BIOSLinker *linker, 
> VirtMachineState *vms)
>  
>  iort_length = sizeof(*iort);
>  iort->node_count = cpu_to_le32(nb_nodes);
> -iort->node_offset = cpu_to_le32(sizeof(*iort));
> +/*
> + * Use a copy in case table_data->data moves during acpi_data_push
> + * operations.
> + */
> +iort_node_offset = sizeof(*iort);
> +iort->node_offset = cpu_to_le32(iort_node_offset);
>  
>  /* ITS group node */
>  node_size =  sizeof(*its) + sizeof(uint32_t);
> @@ -429,7 +434,7 @@ build_iort(GArray *table_data, BIOSLinker *linker, 
> VirtMachineState *vms)
>  int irq =  vms->irqmap[VIRT_SMMU];
>  
>  /* SMMUv3 node */
> -smmu_offset = iort->node_offset + node_size;
> +smmu_offset = iort_node_offset + node_size;
>  node_size = sizeof(*smmu) + sizeof(*idmap);
>  iort_length += node_size;
>  smmu = acpi_data_push(table_data, node_size);
> @@ -450,7 +455,7 @@ build_iort(GArray *table_data, BIOSLinker *linker, 
> VirtMachineState *vms)
>  idmap->id_count = cpu_to_le32(0x);
>  idmap->output_base = 0;
>  /* output IORT node is the ITS group node (the first node) */
> -idmap->output_reference = cpu_to_le32(iort->node_offset);
> +idmap->output_reference = cpu_to_le32(iort_node_offset);
>  }
>  
>  /* Root Complex Node */
> @@ -479,9 +484,14 @@ build_iort(GArray *table_data, BIOSLinker *linker, 
> VirtMachineState *vms)
>  idmap->output_reference = cpu_to_le32(smmu_offset);
>  } else {
>  /* output IORT node is the ITS group node (the first node) */
> -idmap->output_reference = cpu_to_le32(iort->node_offset);
> +idmap->output_reference = cpu_to_le32(iort_node_offset);
>  }
>  
> +/*
> + * Update the pointer address in case table_data->data moves during above
> + * acpi_data_push operations.
> + */
> +iort = (AcpiIortTable *)(table_data->data + iort_start);
>  iort->length = cpu_to_le32(iort_length);
>  
>  build_header(linker, table_data, (void *)(table_data->data + iort_start),
> 
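
Both points of the patch can be shown with a small stand-alone sketch. It uses plain realloc() instead of GArray, and struct table_buf, table_push(), store_le32() and build_table() are hypothetical names for illustration only. The pattern is: remember offsets rather than pointers across any call that may grow the buffer, keep arithmetic in host byte order, and convert to little endian only when storing.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HDR_SIZE 8              /* hypothetical header: le32 length, le32 node_offset */
#define PUSH_FAILED ((size_t)-1)

/* Hypothetical stand-in for the GArray-backed table_data buffer. */
struct table_buf {
    uint8_t *data;
    size_t len;
};

/* Grow the buffer by size zeroed bytes and return the OFFSET of the new
 * region. The base address may change, so no pointer is handed out. */
static size_t table_push(struct table_buf *t, size_t size)
{
    uint8_t *p = realloc(t->data, t->len + size);
    size_t off = t->len;

    if (!p) {
        return PUSH_FAILED;
    }
    memset(p + off, 0, size);
    t->data = p;
    t->len += size;
    return off;
}

/* Store a host-order value as little endian, byte by byte. */
static void store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

static int build_table(struct table_buf *t, size_t node_size, unsigned nodes)
{
    size_t start = table_push(t, HDR_SIZE);
    uint32_t node_offset = HDR_SIZE;    /* host-order copy used for arithmetic */
    uint32_t length = HDR_SIZE;
    unsigned i;

    if (start == PUSH_FAILED) {
        return -1;
    }
    for (i = 0; i < nodes; i++) {
        if (table_push(t, node_size) == PUSH_FAILED) {
            return -1;
        }
        length += (uint32_t)node_size;
    }
    /* Re-derive the header address only now, after all pushes are done. */
    store_le32(t->data + start, length);
    store_le32(t->data + start + 4, node_offset);
    return 0;
}
```

Holding on to a pointer obtained before the loop would be the use-after-free described in the commit message, and doing the offset arithmetic on an already converted value would be the endianness bug.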



Re: [Qemu-devel] [PATCH] socket: don't free msgfds if error equals EAGAIN

2018-05-30 Thread Gonglei (Arei)


> -Original Message-
> From: Eric Blake [mailto:ebl...@redhat.com]
> Sent: Wednesday, May 30, 2018 3:33 AM
> To: linzhecheng ; Marc-André Lureau
> 
> Cc: QEMU ; Paolo Bonzini ;
> wangxin (U) ; Gonglei (Arei)
> ; pet...@redhat.com; berra...@redhat.com
> Subject: Re: [Qemu-devel] [PATCH] socket: don't free msgfds if error equals
> EAGAIN
> 
> On 05/29/2018 04:33 AM, linzhecheng wrote:
> > I think this patch doesn't fix my issue. For more details, please see 
> > Gonglei's
> reply.
> > https://lists.gnu.org/archive/html/qemu-devel/2018-05/msg06296.html
> 
> Your mailer is not honoring threading (it failed to include
> 'In-Reply-To:' and 'References:' headers that refer to the message you
> are replying to), and you are top-posting, both of which make it
> difficult to follow your comments on a technical list.
> 
> 
Agree.

@Zhecheng, pls resend a patch with commit message. Ccing these guys.

Regards,
-Gonglei


Re: [Qemu-devel] [PATCH v7 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT

2018-05-30 Thread Wei Wang

On 05/29/2018 11:24 PM, Michael S. Tsirkin wrote:
> On Tue, Apr 24, 2018 at 02:13:47PM +0800, Wei Wang wrote:
> > +/*
> > + * Balloon will report pages which were free at the time of this call. As the
> > + * reporting happens asynchronously, dirty bit logging must be enabled before
> > + * this call is made.
> > + */
> > +void balloon_free_page_start(void)
> > +{
> > +balloon_free_page_start_fn(balloon_opaque);
> > +}
>
> Please create notifier support, not a single global.

OK. The start is called at the end of bitmap_sync, and the stop is
called at the beginning of bitmap_sync. In this case, we will need to
add two migration states, MIGRATION_STATUS_BEFORE_BITMAP_SYNC and
MIGRATION_STATUS_AFTER_BITMAP_SYNC, right?

> > +static void virtio_balloon_poll_free_page_hints(void *opaque)
> > +{
> > +VirtQueueElement *elem;
> > +VirtIOBalloon *dev = opaque;
> > +VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> > +VirtQueue *vq = dev->free_page_vq;
> > +uint32_t id;
> > +size_t size;
> > +
> > +while (1) {
> > +qemu_mutex_lock(&dev->free_page_lock);
> > +while (dev->block_iothread) {
> > +qemu_cond_wait(&dev->free_page_cond, &dev->free_page_lock);
> > +}
> > +
> > +/*
> > + * If the migration thread actively stops the reporting, exit
> > + * immediately.
> > + */
> > +if (dev->free_page_report_status == FREE_PAGE_REPORT_S_STOP) {
>
> Please refactor this : move loop body into a function so
> you can do lock/unlock in a single place.

Sounds good.

> > +
> > +static bool virtio_balloon_free_page_support(void *opaque)
> > +{
> > +VirtIOBalloon *s = opaque;
> > +VirtIODevice *vdev = VIRTIO_DEVICE(s);
> > +
> > +return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT);
>
> or if poison is negotiated.

Will make it
return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT) &&
!virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_PAGE_POISON)
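
As a stand-alone illustration of that combined condition, here is a sketch on a plain 64-bit feature mask. The bit numbers and both helper names are assumptions for this example; in the device code the test would go through virtio_vdev_has_feature() as quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit numbers assumed for illustration only. */
#define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3
#define VIRTIO_BALLOON_F_PAGE_POISON    4

/* Hypothetical helper: test one bit of a negotiated feature mask. */
static bool has_feature(uint64_t features, unsigned bit)
{
    return (features >> bit) & 1;
}

/* Free page hinting is reported as supported only when the hint feature
 * was negotiated and page poisoning was not. */
static bool free_page_hint_supported(uint64_t features)
{
    return has_feature(features, VIRTIO_BALLOON_F_FREE_PAGE_HINT) &&
           !has_feature(features, VIRTIO_BALLOON_F_PAGE_POISON);
}
```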



Best,
Wei



[Qemu-devel] [PATCH 2/3] pc-bios/s390-ccw/net: Add support for pxelinux-style config files

2018-05-30 Thread Thomas Huth
Since it is quite cumbersome to manually create a combined kernel with
initrd image for network booting, we now support loading via pxelinux
configuration files, too. In these files, the kernel, initrd and command
line parameters can be specified separately, and the firmware then takes
care of glueing everything together in memory after the files have been
downloaded. See this URL for details about the config file layout:
https://www.syslinux.org/wiki/index.php?title=PXELINUX

The user can either specify a config file directly as bootfile via DHCP
(but in this case, the file has to start either with "default" or a "#"
comment so we can distinguish it from binary kernels), or a folder (i.e.
the bootfile name must end with "/") where the firmware should look for
the typical pxelinux.cfg file names, e.g. based on MAC or IP address.
We also support the pxelinux.cfg DHCP options 209 and 210 from RFC 5071.
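
The "default" or "#" distinction described above could look roughly like the following. This is a sketch, not the code from the patch: looks_like_pxelinux_cfg() is a hypothetical name, and matching "default" case-insensitively is my assumption.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Case-insensitive prefix test that never reads past len bytes of buf. */
static bool starts_with_nocase(const char *buf, size_t len, const char *prefix)
{
    size_t i;

    for (i = 0; prefix[i] != '\0'; i++) {
        if (i >= len ||
            tolower((unsigned char)buf[i]) != tolower((unsigned char)prefix[i])) {
            return false;
        }
    }
    return true;
}

/* Hypothetical check: treat a downloaded file as a pxelinux config if it
 * begins with a '#' comment or with the keyword "default"; anything else
 * is assumed to be a binary kernel image. */
static bool looks_like_pxelinux_cfg(const char *buf, size_t len)
{
    if (len > 0 && buf[0] == '#') {
        return true;
    }
    return starts_with_nocase(buf, len, "default");
}
```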

Signed-off-by: Thomas Huth 
---
 pc-bios/s390-ccw/netboot.mak |  7 ++--
 pc-bios/s390-ccw/netmain.c   | 79 +++-
 2 files changed, 82 insertions(+), 4 deletions(-)

diff --git a/pc-bios/s390-ccw/netboot.mak b/pc-bios/s390-ccw/netboot.mak
index a73be36..8af0cfd 100644
--- a/pc-bios/s390-ccw/netboot.mak
+++ b/pc-bios/s390-ccw/netboot.mak
@@ -25,8 +25,9 @@ CTYPE_OBJS = isdigit.o isxdigit.o toupper.o
 %.o : $(SLOF_DIR)/lib/libc/ctype/%.c
$(call quiet-command,$(CC) $(LIBC_CFLAGS) -c -o $@ 
$<,"CC","$(TARGET_DIR)$@")
 
-STRING_OBJS = strcat.o strchr.o strcmp.o strcpy.o strlen.o strncmp.o strncpy.o 
\
- strstr.o memset.o memcpy.o memmove.o memcmp.o
+STRING_OBJS = strcat.o strchr.o strrchr.o strcpy.o strlen.o strncpy.o \
+ strcmp.o strncmp.o strcasecmp.o strncasecmp.o strstr.o \
+ memset.o memcpy.o memmove.o memcmp.o
 %.o : $(SLOF_DIR)/lib/libc/string/%.c
$(call quiet-command,$(CC) $(LIBC_CFLAGS) -c -o $@ 
$<,"CC","$(TARGET_DIR)$@")
 
@@ -50,7 +51,7 @@ libc.a: $(LIBCOBJS)
 # libnet files:
 
 LIBNETOBJS := args.o dhcp.o dns.o icmpv6.o ipv6.o tcp.o udp.o bootp.o \
- dhcpv6.o ethernet.o ipv4.o ndp.o tftp.o
+ dhcpv6.o ethernet.o ipv4.o ndp.o tftp.o pxelinux.o
 LIBNETCFLAGS := $(QEMU_CFLAGS) -DDHCPARCH=0x1F $(LIBC_INC) $(LIBNET_INC)
 
 %.o : $(SLOF_DIR)/lib/libnet/%.c
diff --git a/pc-bios/s390-ccw/netmain.c b/pc-bios/s390-ccw/netmain.c
index 7533cf7..e84bb2b 100644
--- a/pc-bios/s390-ccw/netmain.c
+++ b/pc-bios/s390-ccw/netmain.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "s390-ccw.h"
 #include "virtio.h"
@@ -41,12 +42,14 @@ extern char _start[];
 
 #define KERNEL_ADDR ((void *)0L)
 #define KERNEL_MAX_SIZE ((long)_start)
+#define ARCH_COMMAND_LINE_SIZE  896  /* Taken from Linux kernel */
 
 char stack[PAGE_SIZE * 8] __attribute__((aligned(PAGE_SIZE)));
 IplParameterBlock iplb __attribute__((aligned(PAGE_SIZE)));
 static char cfgbuf[2048];
 
 static SubChannelId net_schid = { .one = 1 };
+static uint8_t mac[6];
 static uint64_t dest_timer;
 
 static uint64_t get_timer_ms(void)
@@ -158,7 +161,6 @@ static int tftp_load(filename_ip_t *fnip, void *buffer, int 
len)
 
 static int net_init(filename_ip_t *fn_ip)
 {
-uint8_t mac[6];
 int rc;
 
 memset(fn_ip, 0, sizeof(filename_ip_t));
@@ -233,6 +235,66 @@ static void net_release(filename_ip_t *fn_ip)
 }
 
 /**
+ * Load a kernel with initrd (i.e. with the information that we've got from
+ * a pxelinux.cfg config file)
+ */
+static int load_kernel_with_initrd(filename_ip_t *fn_ip,
+   struct pl_cfg_entry *entry)
+{
+int rc;
+
+printf("Loading pxelinux.cfg entry '%s'\n", entry->label);
+
+if (!entry->kernel) {
+printf("Kernel entry is missing!\n");
+return -1;
+}
+
+strncpy(fn_ip->filename, entry->kernel, sizeof(fn_ip->filename));
+rc = tftp_load(fn_ip, KERNEL_ADDR, KERNEL_MAX_SIZE);
+if (rc < 0) {
+return rc;
+}
+
+if (entry->initrd) {
+uint64_t iaddr = (rc + 0xfff) & ~0xfffUL;
+
+strncpy(fn_ip->filename, entry->initrd, sizeof(fn_ip->filename));
+rc = tftp_load(fn_ip, (void *)iaddr, KERNEL_MAX_SIZE - iaddr);
+if (rc < 0) {
+return rc;
+}
+/* Patch location and size: */
+*(uint64_t *)0x10408 = iaddr;
+*(uint64_t *)0x10410 = rc;
+rc += iaddr;
+}
+
+if (entry->append) {
+strncpy((char *)0x10480, entry->append, ARCH_COMMAND_LINE_SIZE);
+}
+
+return rc;
+}
+
+#define MAX_PXELINUX_ENTRIES 16
+
+static int net_try_pxelinux_cfg(filename_ip_t *fn_ip)
+{
+struct pl_cfg_entry entries[MAX_PXELINUX_ENTRIES];
+int num_ent, def_ent = 0;
+
+num_ent = pxelinux_load_parse_cfg(fn_ip, mac, NULL, DEFAULT_TFTP_RETRIES,
+  cfgbuf, sizeof(cfgbuf),
+  entries, MAX_PXELINUX_ENTRIES, &def_ent);
+if (num_ent > 0) {
+return load_kerne

[Qemu-devel] [PATCH 1/3] pc-bios/s390-ccw/net: Update code for the latest changes in SLOF

2018-05-30 Thread Thomas Huth
The ip_version information now has to be stored in the filename_ip_t
structure, and there is now a common function called tftp_get_error_info()
which can be used to get the error string for a TFTP error code.
We can also get rid of some superfluous "(char *)" casts now.
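
To illustrate why a common lookup function is preferable to the long if/else chain that this patch removes, here is a table-driven sketch. It is not the SLOF implementation of tftp_get_error_info(); tftp_error_string() is a hypothetical name, and the table only reuses some codes and messages from the removed lines.

```c
#include <stddef.h>

struct err_entry {
    int code;
    const char *msg;
};

/* Codes and strings taken from the removed if/else chain. */
static const struct err_entry tftp_errors[] = {
    { -1, "unknown TFTP error" },
    { -3, "file not found" },
    { -4, "TFTP access violation" },
    { -5, "illegal TFTP operation" },
    { -6, "unknown TFTP transfer ID" },
    { -7, "no such TFTP user" },
    { -8, "TFTP blocksize negotiation failed" },
    { -9, "file exceeds maximum TFTP transfer size" },
};

/* Hypothetical lookup: map a negative TFTP return code to a message,
 * falling back to a generic string for codes not in the table. */
static const char *tftp_error_string(int rc)
{
    size_t i;

    for (i = 0; i < sizeof(tftp_errors) / sizeof(tftp_errors[0]); i++) {
        if (tftp_errors[i].code == rc) {
            return tftp_errors[i].msg;
        }
    }
    return "unknown TFTP error";
}
```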

Signed-off-by: Thomas Huth 
---
 pc-bios/s390-ccw/netboot.mak |  2 +-
 pc-bios/s390-ccw/netmain.c   | 85 +---
 2 files changed, 17 insertions(+), 70 deletions(-)

diff --git a/pc-bios/s390-ccw/netboot.mak b/pc-bios/s390-ccw/netboot.mak
index 4f64128..a73be36 100644
--- a/pc-bios/s390-ccw/netboot.mak
+++ b/pc-bios/s390-ccw/netboot.mak
@@ -34,7 +34,7 @@ STDLIB_OBJS = atoi.o atol.o strtoul.o strtol.o rand.o 
malloc.o free.o
 %.o : $(SLOF_DIR)/lib/libc/stdlib/%.c
$(call quiet-command,$(CC) $(LIBC_CFLAGS) -c -o $@ 
$<,"CC","$(TARGET_DIR)$@")
 
-STDIO_OBJS = sprintf.o vfprintf.o vsnprintf.o vsprintf.o fprintf.o \
+STDIO_OBJS = sprintf.o snprintf.o vfprintf.o vsnprintf.o vsprintf.o fprintf.o \
 printf.o putc.o puts.o putchar.o stdchnls.o fileno.o
 %.o : $(SLOF_DIR)/lib/libc/stdio/%.c
$(call quiet-command,$(CC) $(LIBC_CFLAGS) -c -o $@ 
$<,"CC","$(TARGET_DIR)$@")
diff --git a/pc-bios/s390-ccw/netmain.c b/pc-bios/s390-ccw/netmain.c
index 6000241..7533cf7 100644
--- a/pc-bios/s390-ccw/netmain.c
+++ b/pc-bios/s390-ccw/netmain.c
@@ -47,7 +47,6 @@ IplParameterBlock iplb __attribute__((aligned(PAGE_SIZE)));
 static char cfgbuf[2048];
 
 static SubChannelId net_schid = { .one = 1 };
-static int ip_version = 4;
 static uint64_t dest_timer;
 
 static uint64_t get_timer_ms(void)
@@ -100,10 +99,10 @@ static int dhcp(struct filename_ip *fn_ip, int retries)
 printf("\nGiving up after %d DHCP requests\n", retries);
 return -1;
 }
-ip_version = 4;
+fn_ip->ip_version = 4;
 rc = dhcpv4(NULL, fn_ip);
 if (rc == -1) {
-ip_version = 6;
+fn_ip->ip_version = 6;
 set_ipv6_address(fn_ip->fd, 0);
 rc = dhcpv6(NULL, fn_ip);
 if (rc == 0) {
@@ -137,8 +136,7 @@ static int tftp_load(filename_ip_t *fnip, void *buffer, int 
len)
 tftp_err_t tftp_err;
 int rc;
 
-rc = tftp(fnip, buffer, len, DEFAULT_TFTP_RETRIES, &tftp_err, 1, 1428,
-  ip_version);
+rc = tftp(fnip, buffer, len, DEFAULT_TFTP_RETRIES, &tftp_err);
 
 if (rc < 0) {
 /* Make sure that error messages are put into a new line */
@@ -149,61 +147,10 @@ static int tftp_load(filename_ip_t *fnip, void *buffer, 
int len)
 printf("  TFTP: Received %s (%d KBytes)\n", fnip->filename, rc / 1024);
 } else if (rc > 0) {
 printf("  TFTP: Received %s (%d Bytes)\n", fnip->filename, rc);
-} else if (rc == -1) {
-puts("unknown TFTP error");
-} else if (rc == -2) {
-printf("TFTP buffer of %d bytes is too small for %s\n",
-len, fnip->filename);
-} else if (rc == -3) {
-printf("file not found: %s\n", fnip->filename);
-} else if (rc == -4) {
-puts("TFTP access violation");
-} else if (rc == -5) {
-puts("illegal TFTP operation");
-} else if (rc == -6) {
-puts("unknown TFTP transfer ID");
-} else if (rc == -7) {
-puts("no such TFTP user");
-} else if (rc == -8) {
-puts("TFTP blocksize negotiation failed");
-} else if (rc == -9) {
-puts("file exceeds maximum TFTP transfer size");
-} else if (rc <= -10 && rc >= -15) {
-const char *icmp_err_str;
-switch (rc) {
-case -ICMP_NET_UNREACHABLE - 10:
-icmp_err_str = "net unreachable";
-break;
-case -ICMP_HOST_UNREACHABLE - 10:
-icmp_err_str = "host unreachable";
-break;
-case -ICMP_PROTOCOL_UNREACHABLE - 10:
-icmp_err_str = "protocol unreachable";
-break;
-case -ICMP_PORT_UNREACHABLE - 10:
-icmp_err_str = "port unreachable";
-break;
-case -ICMP_FRAGMENTATION_NEEDED - 10:
-icmp_err_str = "fragmentation needed and DF set";
-break;
-case -ICMP_SOURCE_ROUTE_FAILED - 10:
-icmp_err_str = "source route failed";
-break;
-default:
-icmp_err_str = " UNKNOWN";
-break;
-}
-printf("ICMP ERROR \"%s\"\n", icmp_err_str);
-} else if (rc == -40) {
-printf("TFTP error occurred after %d bad packets received",
-tftp_err.bad_tftp_packets);
-} else if (rc == -41) {
-printf("TFTP error occurred after missing %d responses",
-tftp_err.no_packets);
-} else if (rc == -42) {
-printf("TFTP error missing block %d, expected block was %d",
-tftp_err.blocks_missed,
-tftp_err.blocks_received);
+} else {
+const char *errstr = NULL;
+tftp_get_error_info(fnip, &tftp_err, rc, &errstr, NULL);
+printf("TFTP error:

[Qemu-devel] [PATCH 0/3] pc-bios/s390-ccw: Allow network booting via pxelinux.cfg

2018-05-30 Thread Thomas Huth
This patch series adds pxelinux.cfg-style network booting to the s390-ccw
firmware. The core pxelinux.cfg loading and parsing logic has recently
been merged to SLOF, so these patches now just have to make sure to call
the right functions to get the config file loaded and parsed. Once this is
done, the kernel and initrd are loaded separately, and are then glued
together in RAM.
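
The "gluing" step relies on placing the initrd at the next 4 KiB boundary after the kernel. A minimal sketch of that address computation follows; the function names are hypothetical, and the 4 KiB granularity is taken from the 0xfff mask used in patch 2/3.

```c
#include <stdint.h>

/* Round x up to the next multiple of 4 KiB. */
static uint64_t align_up_4k(uint64_t x)
{
    return (x + 0xfff) & ~(uint64_t)0xfff;
}

/* Hypothetical helper: number of bytes left for the initrd when the
 * kernel occupies [0, kernel_size) and at most max_size bytes may be
 * used in total. Returns 0 if the aligned address is out of range. */
static uint64_t initrd_space_left(uint64_t kernel_size, uint64_t max_size)
{
    uint64_t iaddr = align_up_4k(kernel_size);

    return iaddr < max_size ? max_size - iaddr : 0;
}
```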

Note that you have to update the roms/SLOF submodule to the latest version
of SLOF first (64c526a6020c3042e3b2a505d5f5f11478d5f2cb). Unfortunately the
SLOF.git mirror on qemu.org currently is not updated anymore (i.e. this also
must be fixed), so you need to use the upstream https://github.com/aik/SLOF
if you want to test the patches right now.

Thomas Huth (3):
  pc-bios/s390-ccw/net: Update code for the latest changes in SLOF
  pc-bios/s390-ccw/net: Add support for pxelinux-style config files
  pc-bios/s390-ccw/net: Try to load pxelinux.cfg file according to the
UUID

 pc-bios/s390-ccw/netboot.mak |   9 +-
 pc-bios/s390-ccw/netmain.c   | 208 ---
 2 files changed, 143 insertions(+), 74 deletions(-)



[Qemu-devel] [PATCH 3/3] pc-bios/s390-ccw/net: Try to load pxelinux.cfg file according to the UUID

2018-05-30 Thread Thomas Huth
With the STSI instruction, we can get the UUID of the current VM instance,
so we can support loading pxelinux config files via UUID in the file name,
too.

Signed-off-by: Thomas Huth 
---
 pc-bios/s390-ccw/netmain.c | 46 +-
 1 file changed, 45 insertions(+), 1 deletion(-)

diff --git a/pc-bios/s390-ccw/netmain.c b/pc-bios/s390-ccw/netmain.c
index e84bb2b..7ece302 100644
--- a/pc-bios/s390-ccw/netmain.c
+++ b/pc-bios/s390-ccw/netmain.c
@@ -235,6 +235,49 @@ static void net_release(filename_ip_t *fn_ip)
 }
 
 /**
+ * Retrieve the Universally Unique Identifier of the VM.
+ * @return UUID string, or NULL in case of errors
+ */
+static const char *get_uuid(void)
+{
+register int r0 asm("0");
+register int r1 asm("1");
+uint8_t *mem, *buf, uuid[16];
+int i, chk = 0;
+static char uuid_str[37];
+
+mem = malloc(2 * PAGE_SIZE);
+if (!mem) {
+puts("Out of memory ... can not get UUID.");
+return NULL;
+}
+buf = (uint8_t *)(((uint64_t)mem + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1));
+memset(buf, 0, PAGE_SIZE);
+
+/* Get SYSIB 3.2.2 */
+r0 = (3 << 28) | 2;
+r1 = 2;
+asm volatile(" stsi 0(%2)\n" : : "d" (r0), "d" (r1), "a" (buf)
+ : "cc", "memory");
+
+for (i = 0; i < 16; i++) {
+uuid[i] = buf[8 * 4 + 12 * 4 + i];
+chk |= uuid[i];
+}
+free(mem);
+if (!chk) {
+return NULL;
+}
+
+sprintf(uuid_str, "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
+"%02x%02x%02x%02x%02x%02x", uuid[0], uuid[1], uuid[2], uuid[3],
+uuid[4], uuid[5], uuid[6], uuid[7], uuid[8], uuid[9], uuid[10],
+uuid[11], uuid[12], uuid[13], uuid[14], uuid[15]);
+
+return uuid_str;
+}
+
+/**
  * Load a kernel with initrd (i.e. with the information that we've got from
  * a pxelinux.cfg config file)
  */
@@ -284,7 +327,8 @@ static int net_try_pxelinux_cfg(filename_ip_t *fn_ip)
 struct pl_cfg_entry entries[MAX_PXELINUX_ENTRIES];
 int num_ent, def_ent = 0;
 
-num_ent = pxelinux_load_parse_cfg(fn_ip, mac, NULL, DEFAULT_TFTP_RETRIES,
+num_ent = pxelinux_load_parse_cfg(fn_ip, mac, get_uuid(),
+  DEFAULT_TFTP_RETRIES,
   cfgbuf, sizeof(cfgbuf),
   entries, MAX_PXELINUX_ENTRIES, &def_ent);
 if (num_ent > 0) {
-- 
1.8.3.1
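
The UUID formatting done at the end of get_uuid() can be exercised on its own. The sketch below is a variant under assumptions: format_uuid() is a hypothetical name, it writes into a caller-supplied buffer with bounded snprintf() calls rather than the static buffer and sprintf() of the patch, and it keeps the same "all zero means no UUID" rule.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format 16 raw bytes as the usual 8-4-4-4-12 lower-case hex string.
 * Returns 0 on success, -1 if the buffer is smaller than 37 bytes or
 * the UUID consists only of zero bytes.
 */
static int format_uuid(const uint8_t uuid[16], char *out, size_t out_len)
{
    unsigned any = 0;
    size_t pos = 0;
    int i;

    if (out_len < 37) {
        return -1;
    }
    for (i = 0; i < 16; i++) {
        any |= uuid[i];
    }
    if (!any) {
        return -1;
    }
    for (i = 0; i < 16; i++) {
        pos += (size_t)snprintf(out + pos, out_len - pos, "%02x", uuid[i]);
        /* Dashes follow bytes 3, 5, 7 and 9. */
        if (i == 3 || i == 5 || i == 7 || i == 9) {
            out[pos++] = '-';
        }
    }
    out[pos] = '\0';
    return 0;
}
```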




Re: [Qemu-devel] [PATCH v4 09/21] target: Do not include "exec/exec-all.h" if it is not necessary

2018-05-30 Thread Paolo Bonzini
On 30/05/2018 07:50, Philippe Mathieu-Daudé wrote:
>>> No, not all :/
>>> I started with "(cpu_loop_|tlb_|tb_)" then kept brutebuilding until no
>>> more errors appear. In 2 more steps I added "cpu_address_space_init|"
>>> then "|GETPC|singlestep|TranslationBlock". Quick and dirty enough for my
>>> goal than trying to build a regex to explode function/struct names from
>>> headers. This is a clever way to do it for long term command reuse taken
>>> from commit messages...
>> Brutebuilding isn't a good way to find unused includes, some other header
>> might pull in an include you are trying to remove for its own purposes.
>> If you want to try brutebuilding you must also verify that's
>> not the case - e.g. look at the dependency file generated.
> Hmm you mean the .d files in the build dir?

You can also check which include files include this one.  In this case
you can see that brute-building was more or less enough:

$ git grep -l exec-all -- '*.h'
accel/tcg/translate-all.h
include/exec/tb-lookup.h
include/exec/translator.h
linux-user/qemu.h
target/ppc/helper_regs.h

Leaving aside linux-user/qemu.h, let's check which files include one of
those headers, but they do not include exec/exec-all.h:

$ git grep -L exec-all \
   $(git grep -lFf <(git grep -l exec-all -- '*.h' | \
 grep -v qemu.h | sed 's,.*/,,'))
linux-user/mmap.c
target/arm/translate.h
target/ppc/int_helper.c
trace/control-target.c

I'll send some patches shortly to fix up what's left.

Paolo



[Qemu-devel] [PATCH v4 00/12] Enable postcopy RDMA live migration

2018-05-30 Thread Lidong Chen
The RDMA QIOChannel does not support bi-directional communication, so when
RDMA live migration is used with postcopy enabled, the return path on the
source qemu gets a qemu file error.

These patches implement bi-directional communication for the RDMA QIOChannel
and disable RDMA WRITE during the postcopy phase.

This series only makes postcopy work; performance will be improved later.

[v4]
 - not wait RDMA_CM_EVENT_DISCONNECTED event after rdma_disconnect
 - implement io_set_aio_fd_handler function for RDMA QIOChannel (Juan Quintela)
 - invoke qio_channel_yield only when qemu_in_coroutine() (Juan Quintela)
 - create a dedicated thread to release rdma resource
 - poll the cm event while wait RDMA work request completion
 - implement the shutdown function for RDMA QIOChannel

[v3]
 - add a mutex in QEMUFile struct to avoid concurrent channel close (Daniel)
 - destroy the mutex before free QEMUFile (David)
 - use rdmain and rdmaout instead of rdma->return_path (Daniel)

[v2]
 - does not update bytes_xfer when disable RDMA WRITE (David)
 - implement bi-directional communication for RDMA QIOChannel (Daniel)

Lidong Chen (12):
  migration: disable RDMA WRITE after postcopy started
  migration: create a dedicated connection for rdma return path
  migration: remove unnecessary variables len in QIOChannelRDMA
  migration: avoid concurrent invoke channel_close by different threads
  migration: implement bi-directional RDMA QIOChannel
  migration: Stop rdma yielding during incoming postcopy
  migration: not wait RDMA_CM_EVENT_DISCONNECTED event after
rdma_disconnect
  migration: implement io_set_aio_fd_handler function for RDMA
QIOChannel
  migration: invoke qio_channel_yield only when qemu_in_coroutine()
  migration: create a dedicated thread to release rdma resource
  migration: poll the cm event while wait RDMA work request completion
  migration: implement the shutdown for RDMA QIOChannel

 migration/colo.c  |   2 +
 migration/migration.c |   2 +
 migration/postcopy-ram.c  |   2 +
 migration/qemu-file-channel.c |  12 +-
 migration/qemu-file.c |  13 +-
 migration/ram.c   |   4 +
 migration/rdma.c  | 435 --
 migration/savevm.c|   3 +
 migration/trace-events|   1 -
 9 files changed, 411 insertions(+), 63 deletions(-)

-- 
1.8.3.1




[Qemu-devel] [PATCH v4 02/12] migration: create a dedicated connection for rdma return path

2018-05-30 Thread Lidong Chen
From: Lidong Chen 

When an RDMA migration is started with postcopy enabled, the source qemu
establishes a dedicated connection for the return path.
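
On the destination side the two contexts share the listening resources, so only one of them may destroy them. A reduced sketch of that ownership rule, with hypothetical types (struct listener stands in for the shared listen_id and event channel, and the counter replaces the real destroy calls):

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a shared resource; counts how often it was destroyed. */
struct listener {
    int destroyed;
};

struct rdma_ctx {
    struct listener *listen_id;
    struct rdma_ctx *return_path;
    bool is_return_path;
};

/* Link a return-path context to the primary one and share its listener. */
static void link_return_path(struct rdma_ctx *primary, struct rdma_ctx *rp)
{
    rp->listen_id = primary->listen_id;
    rp->is_return_path = true;
    rp->return_path = primary;
    primary->return_path = rp;
}

/* Only the primary context destroys the shared listener; the return
 * path merely drops its reference. */
static void ctx_cleanup(struct rdma_ctx *ctx)
{
    if (ctx->listen_id) {
        if (!ctx->is_return_path) {
            ctx->listen_id->destroyed++;
        }
        ctx->listen_id = NULL;
    }
}
```

Cleaning up both contexts, in either order, destroys the shared listener exactly once.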

Signed-off-by: Lidong Chen 
Reviewed-by: Dr. David Alan Gilbert 
---
 migration/rdma.c | 94 ++--
 1 file changed, 91 insertions(+), 3 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index a0748f4..ec4bbff 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -387,6 +387,10 @@ typedef struct RDMAContext {
 uint64_t unregistrations[RDMA_SIGNALED_SEND_MAX];
 
 GHashTable *blockmap;
+
+/* the RDMAContext for return path */
+struct RDMAContext *return_path;
+bool is_return_path;
 } RDMAContext;
 
 #define TYPE_QIO_CHANNEL_RDMA "qio-channel-rdma"
@@ -2332,10 +2336,22 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
 rdma_destroy_id(rdma->cm_id);
 rdma->cm_id = NULL;
 }
+
+/* the destination side, listen_id and channel is shared */
 if (rdma->listen_id) {
-rdma_destroy_id(rdma->listen_id);
+if (!rdma->is_return_path) {
+rdma_destroy_id(rdma->listen_id);
+}
 rdma->listen_id = NULL;
+
+if (rdma->channel) {
+if (!rdma->is_return_path) {
+rdma_destroy_event_channel(rdma->channel);
+}
+rdma->channel = NULL;
+}
 }
+
 if (rdma->channel) {
 rdma_destroy_event_channel(rdma->channel);
 rdma->channel = NULL;
@@ -2564,6 +2580,25 @@ err_dest_init_create_listen_id:
 
 }
 
+static void qemu_rdma_return_path_dest_init(RDMAContext *rdma_return_path,
+RDMAContext *rdma)
+{
+int idx;
+
+for (idx = 0; idx < RDMA_WRID_MAX; idx++) {
+rdma_return_path->wr_data[idx].control_len = 0;
+rdma_return_path->wr_data[idx].control_curr = NULL;
+}
+
+/*the CM channel and CM id is shared*/
+rdma_return_path->channel = rdma->channel;
+rdma_return_path->listen_id = rdma->listen_id;
+
+rdma->return_path = rdma_return_path;
+rdma_return_path->return_path = rdma;
+rdma_return_path->is_return_path = true;
+}
+
 static void *qemu_rdma_data_init(const char *host_port, Error **errp)
 {
 RDMAContext *rdma = NULL;
@@ -3021,6 +3056,8 @@ err:
 return ret;
 }
 
+static void rdma_accept_incoming_migration(void *opaque);
+
 static int qemu_rdma_accept(RDMAContext *rdma)
 {
 RDMACapabilities cap;
@@ -3115,7 +3152,14 @@ static int qemu_rdma_accept(RDMAContext *rdma)
 }
 }
 
-qemu_set_fd_handler(rdma->channel->fd, NULL, NULL, NULL);
+/* Accept the second connection request for return path */
+if (migrate_postcopy() && !rdma->is_return_path) {
+qemu_set_fd_handler(rdma->channel->fd, rdma_accept_incoming_migration,
+NULL,
+(void *)(intptr_t)rdma->return_path);
+} else {
+qemu_set_fd_handler(rdma->channel->fd, NULL, NULL, NULL);
+}
 
 ret = rdma_accept(rdma->cm_id, &conn_param);
 if (ret) {
@@ -3700,6 +3744,10 @@ static void rdma_accept_incoming_migration(void *opaque)
 
 trace_qemu_rdma_accept_incoming_migration_accepted();
 
+if (rdma->is_return_path) {
+return;
+}
+
 f = qemu_fopen_rdma(rdma, "rb");
 if (f == NULL) {
 ERROR(errp, "could not qemu_fopen_rdma!");
@@ -3714,7 +3762,7 @@ static void rdma_accept_incoming_migration(void *opaque)
 void rdma_start_incoming_migration(const char *host_port, Error **errp)
 {
 int ret;
-RDMAContext *rdma;
+RDMAContext *rdma, *rdma_return_path = NULL;
 Error *local_err = NULL;
 
 trace_rdma_start_incoming_migration();
@@ -3741,12 +3789,24 @@ void rdma_start_incoming_migration(const char 
*host_port, Error **errp)
 
 trace_rdma_start_incoming_migration_after_rdma_listen();
 
+/* initialize the RDMAContext for return path */
+if (migrate_postcopy()) {
+rdma_return_path = qemu_rdma_data_init(host_port, &local_err);
+
+if (rdma_return_path == NULL) {
+goto err;
+}
+
+qemu_rdma_return_path_dest_init(rdma_return_path, rdma);
+}
+
 qemu_set_fd_handler(rdma->channel->fd, rdma_accept_incoming_migration,
 NULL, (void *)(intptr_t)rdma);
 return;
 err:
 error_propagate(errp, local_err);
 g_free(rdma);
+g_free(rdma_return_path);
 }
 
 void rdma_start_outgoing_migration(void *opaque,
@@ -3754,6 +3814,7 @@ void rdma_start_outgoing_migration(void *opaque,
 {
 MigrationState *s = opaque;
 RDMAContext *rdma = qemu_rdma_data_init(host_port, errp);
+RDMAContext *rdma_return_path = NULL;
 int ret = 0;
 
 if (rdma == NULL) {
@@ -3774,6 +3835,32 @@ void rdma_start_outgoing_migration(void *opaque,
 goto err;
 }
 
+/* RDMA postcopy needs a separate queue pair for return path */
+if (migrate_postcopy()) {
+rdma_return_path = qemu_rdma_data_init(host_port, errp)
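
The ownership rule in the (truncated) patch above can be sketched in isolation: the return-path context borrows the listen handle from the primary context, and only the primary destroys it during cleanup. This is a minimal sketch under stated assumptions, not the QEMU code. All `rp_sketch_*` names are hypothetical, and a plain `int *` stands in for the real rdma_cm handles.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for RDMAContext. */
struct rp_sketch_ctx {
    int *listen_id;                    /* shared handle, owned by the primary */
    struct rp_sketch_ctx *return_path; /* peer context */
    bool is_return_path;
};

/* Counts how often the shared handle was really destroyed. */
static int rp_sketch_destroyed;

static void rp_sketch_destroy(int *id)
{
    (void)id;
    rp_sketch_destroyed++;
}

/* Link the two contexts and share the listen handle. */
static void rp_sketch_pair(struct rp_sketch_ctx *primary,
                           struct rp_sketch_ctx *rp)
{
    rp->listen_id = primary->listen_id;
    primary->return_path = rp;
    rp->return_path = primary;
    rp->is_return_path = true;
}

/* Only the non-return-path context destroys the shared handle;
 * both contexts forget their pointer to it. */
static void rp_sketch_cleanup(struct rp_sketch_ctx *c)
{
    if (c->listen_id) {
        if (!c->is_return_path) {
            rp_sketch_destroy(c->listen_id);
        }
        c->listen_id = NULL;
    }
}
```

Cleaning up both contexts in either order destroys the handle exactly once, which is the property the `is_return_path` checks in `qemu_rdma_cleanup` appear to be after.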

[Qemu-devel] [PATCH v4 01/12] migration: disable RDMA WRITE after postcopy started

2018-05-30 Thread Lidong Chen
From: Lidong Chen 

RDMA WRITE operations are performed with no notification to the destination
QEMU, so the destination QEMU cannot wake up. This patch disables RDMA WRITE
after postcopy has started.

Signed-off-by: Lidong Chen 
Reviewed-by: Dr. David Alan Gilbert 
---
 migration/qemu-file.c |  8 ++--
 migration/rdma.c  | 12 
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 0463f4c..977b9ae 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -253,8 +253,12 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t 
block_offset,
 if (f->hooks && f->hooks->save_page) {
 int ret = f->hooks->save_page(f, f->opaque, block_offset,
   offset, size, bytes_sent);
-f->bytes_xfer += size;
-if (ret != RAM_SAVE_CONTROL_DELAYED) {
+if (ret != RAM_SAVE_CONTROL_NOT_SUPP) {
+f->bytes_xfer += size;
+}
+
+if (ret != RAM_SAVE_CONTROL_DELAYED &&
+ret != RAM_SAVE_CONTROL_NOT_SUPP) {
 if (bytes_sent && *bytes_sent > 0) {
 qemu_update_position(f, *bytes_sent);
 } else if (ret < 0) {
diff --git a/migration/rdma.c b/migration/rdma.c
index 7d233b0..a0748f4 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2930,6 +2930,10 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void 
*opaque,
 
 CHECK_ERROR_STATE();
 
+if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+return RAM_SAVE_CONTROL_NOT_SUPP;
+}
+
 qemu_fflush(f);
 
 if (size > 0) {
@@ -3489,6 +3493,10 @@ static int qemu_rdma_registration_start(QEMUFile *f, 
void *opaque,
 
 CHECK_ERROR_STATE();
 
+if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+return 0;
+}
+
 trace_qemu_rdma_registration_start(flags);
 qemu_put_be64(f, RAM_SAVE_FLAG_HOOK);
 qemu_fflush(f);
@@ -3511,6 +3519,10 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void 
*opaque,
 
 CHECK_ERROR_STATE();
 
+if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+return 0;
+}
+
 qemu_fflush(f);
 ret = qemu_rdma_drain_cq(f, rdma);
 
-- 
1.8.3.1
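
The accounting change in the qemu-file.c hunk above can be modelled on its own. The sketch below is illustrative only: the `acct_sketch_*` names and the two constant values are stand-ins, not QEMU definitions, and recording the error in `last_error` is an assumption about what the elided `ret < 0` branch does.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in values for illustration; the real constants are defined by QEMU. */
enum {
    ACCT_SKETCH_NOT_SUPP = -1000,
    ACCT_SKETCH_DELAYED  = -2000
};

/* Hypothetical, reduced view of the fields the hunk touches. */
struct acct_sketch_file {
    uint64_t bytes_xfer; /* bytes counted against the rate limit */
    uint64_t pos;        /* stream position */
    int last_error;
};

/* Apply the save_page hook result `ret` the way the patched code does:
 * a "not supported" result neither counts bytes nor moves the position. */
static void acct_sketch_apply(struct acct_sketch_file *f, int ret,
                              size_t size, uint64_t bytes_sent)
{
    if (ret != ACCT_SKETCH_NOT_SUPP) {
        f->bytes_xfer += size;
    }

    if (ret != ACCT_SKETCH_DELAYED && ret != ACCT_SKETCH_NOT_SUPP) {
        if (bytes_sent > 0) {
            f->pos += bytes_sent;
        } else if (ret < 0) {
            f->last_error = ret; /* assumed error path */
        }
    }
}
```

With this shape, a page refused by the RDMA hook during postcopy is left for the ordinary stream path to send and account for.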




[Qemu-devel] [PATCH v4 09/12] migration: invoke qio_channel_yield only when qemu_in_coroutine()

2018-05-30 Thread Lidong Chen
From: Lidong Chen 

When qio_channel_read returns QIO_CHANNEL_ERR_BLOCK, the source QEMU crashes.

The backtrace is:
(gdb) bt
#0  0x7fb20aba91d7 in raise () from /lib64/libc.so.6
#1  0x7fb20abaa8c8 in abort () from /lib64/libc.so.6
#2  0x7fb20aba2146 in __assert_fail_base () from /lib64/libc.so.6
#3  0x7fb20aba21f2 in __assert_fail () from /lib64/libc.so.6
#4  0x008dba2d in qio_channel_yield (ioc=0x22f9e20, 
condition=G_IO_IN) at io/channel.c:460
#5  0x007a870b in channel_get_buffer (opaque=0x22f9e20, 
buf=0x3d54038 "", pos=0, size=32768)
at migration/qemu-file-channel.c:83
#6  0x007a70f6 in qemu_fill_buffer (f=0x3d54000) at 
migration/qemu-file.c:299
#7  0x007a79d0 in qemu_peek_byte (f=0x3d54000, offset=0) at 
migration/qemu-file.c:562
#8  0x007a7a22 in qemu_get_byte (f=0x3d54000) at 
migration/qemu-file.c:575
#9  0x007a7c46 in qemu_get_be16 (f=0x3d54000) at 
migration/qemu-file.c:647
#10 0x00796db7 in source_return_path_thread (opaque=0x2242280) at 
migration/migration.c:1794
#11 0x009428fa in qemu_thread_start (args=0x3e58420) at 
util/qemu-thread-posix.c:504
#12 0x7fb20af3ddc5 in start_thread () from /lib64/libpthread.so.0
#13 0x7fb20ac6b74d in clone () from /lib64/libc.so.6

This patch fixes the crash by invoking qio_channel_yield only when
qemu_in_coroutine() is true, and calling qio_channel_wait otherwise.

Signed-off-by: Lidong Chen 
Reviewed-by: Juan Quintela 
---
 migration/qemu-file-channel.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/migration/qemu-file-channel.c b/migration/qemu-file-channel.c
index e202d73..8e639eb 100644
--- a/migration/qemu-file-channel.c
+++ b/migration/qemu-file-channel.c
@@ -49,7 +49,11 @@ static ssize_t channel_writev_buffer(void *opaque,
 ssize_t len;
 len = qio_channel_writev(ioc, local_iov, nlocal_iov, NULL);
 if (len == QIO_CHANNEL_ERR_BLOCK) {
-qio_channel_wait(ioc, G_IO_OUT);
+if (qemu_in_coroutine()) {
+qio_channel_yield(ioc, G_IO_OUT);
+} else {
+qio_channel_wait(ioc, G_IO_OUT);
+}
 continue;
 }
 if (len < 0) {
@@ -80,7 +84,11 @@ static ssize_t channel_get_buffer(void *opaque,
 ret = qio_channel_read(ioc, (char *)buf, size, NULL);
 if (ret < 0) {
 if (ret == QIO_CHANNEL_ERR_BLOCK) {
-qio_channel_yield(ioc, G_IO_IN);
+if (qemu_in_coroutine()) {
+qio_channel_yield(ioc, G_IO_IN);
+} else {
+qio_channel_wait(ioc, G_IO_IN);
+}
 } else {
 /* XXX handle Error * object */
 return -EIO;
-- 
1.8.3.1
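
The retry loop from the patch above can be illustrated without QEMU. In this sketch the `blk_sketch_*` names and the error value are hypothetical stand-ins. Counters replace the real `qio_channel_yield` and `qio_channel_wait` calls, and a boolean parameter stands in for `qemu_in_coroutine()`.

```c
#include <stdbool.h>

#define BLK_SKETCH_ERR_BLOCK (-2) /* stand-in for QIO_CHANNEL_ERR_BLOCK */

/* Hypothetical channel that reports "would block" a few times first. */
struct blk_sketch_chan {
    int blocks_left; /* reads that will still report would-block */
    int yields;      /* times the coroutine path was taken */
    int waits;       /* times the blocking-wait path was taken */
};

static int blk_sketch_read(struct blk_sketch_chan *c)
{
    if (c->blocks_left > 0) {
        c->blocks_left--;
        return BLK_SKETCH_ERR_BLOCK;
    }
    return 1; /* pretend one byte was read */
}

/* Retry until data arrives; how we block depends on the calling context. */
static int blk_sketch_get_buffer(struct blk_sketch_chan *c, bool in_coroutine)
{
    int ret;

    do {
        ret = blk_sketch_read(c);
        if (ret == BLK_SKETCH_ERR_BLOCK) {
            if (in_coroutine) {
                c->yields++; /* would be qio_channel_yield() */
            } else {
                c->waits++;  /* would be qio_channel_wait() */
            }
        }
    } while (ret == BLK_SKETCH_ERR_BLOCK);

    return ret;
}
```

A plain thread such as the return path thread therefore never reaches the yield branch, which is where the assertion in the backtrace fired.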




[Qemu-devel] [PATCH v4 03/12] migration: remove unnecessary variables len in QIOChannelRDMA

2018-05-30 Thread Lidong Chen
From: Lidong Chen 

Because qio_channel_rdma_writev and qio_channel_rdma_readv may be invoked
by different threads concurrently, this patch removes the unnecessary len
field from QIOChannelRDMA and uses a local variable instead.

Signed-off-by: Lidong Chen 
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Daniel P. Berrangé 
---
 migration/rdma.c | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index ec4bbff..9b6da4d 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -404,7 +404,6 @@ struct QIOChannelRDMA {
 QIOChannel parent;
 RDMAContext *rdma;
 QEMUFile *file;
-size_t len;
 bool blocking; /* XXX we don't actually honour this yet */
 };
 
@@ -2643,6 +2642,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
 int ret;
 ssize_t done = 0;
 size_t i;
+size_t len = 0;
 
 CHECK_ERROR_STATE();
 
@@ -2662,10 +2662,10 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
 while (remaining) {
 RDMAControlHeader head;
 
-rioc->len = MIN(remaining, RDMA_SEND_INCREMENT);
-remaining -= rioc->len;
+len = MIN(remaining, RDMA_SEND_INCREMENT);
+remaining -= len;
 
-head.len = rioc->len;
+head.len = len;
 head.type = RDMA_CONTROL_QEMU_FILE;
 
 ret = qemu_rdma_exchange_send(rdma, &head, data, NULL, NULL, NULL);
@@ -2675,8 +2675,8 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
 return ret;
 }
 
-data += rioc->len;
-done += rioc->len;
+data += len;
+done += len;
 }
 }
 
@@ -2771,8 +2771,7 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
 }
 }
 }
-rioc->len = done;
-return rioc->len;
+return done;
 }
 
 /*
-- 
1.8.3.1
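
As a rough illustration of the loop the patch above rewrites, the chunk length can live entirely in a local variable. The function and macro names below are hypothetical, the chunk size is an arbitrary illustrative value, and the actual send is replaced by a comment.

```c
#include <stddef.h>

#define CHUNK_SKETCH_INCREMENT 32768 /* illustrative chunk size */

/* Split `remaining` bytes into chunks of at most CHUNK_SKETCH_INCREMENT.
 * Returns the number of bytes handled and stores the chunk count. */
static size_t chunk_sketch_send(size_t remaining, size_t *nchunks)
{
    size_t done = 0;
    size_t len; /* local, so concurrent callers cannot clobber each other */

    *nchunks = 0;
    while (remaining) {
        len = remaining < CHUNK_SKETCH_INCREMENT ? remaining
                                                 : CHUNK_SKETCH_INCREMENT;
        remaining -= len;
        /* a real implementation would transmit `len` bytes here */
        done += len;
        (*nchunks)++;
    }
    return done;
}
```

Since no per-chunk state is stored in a shared structure, a reader and a writer running in different threads no longer interfere through that field.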




[Qemu-devel] [PATCH v4 08/12] migration: implement io_set_aio_fd_handler function for RDMA QIOChannel

2018-05-30 Thread Lidong Chen
From: Lidong Chen 

If qio_channel_rdma_readv returns QIO_CHANNEL_ERR_BLOCK, the destination QEMU
crashes.

The backtrace is:
(gdb) bt
#0  0x in ?? ()
#1  0x008db50e in qio_channel_set_aio_fd_handler (ioc=0x38111e0, 
ctx=0x3726080,
io_read=0x8db841 , io_write=0x0, 
opaque=0x38111e0) at io/channel.c:
#2  0x008db952 in qio_channel_set_aio_fd_handlers (ioc=0x38111e0) 
at io/channel.c:438
#3  0x008dbab4 in qio_channel_yield (ioc=0x38111e0, 
condition=G_IO_IN) at io/channel.c:47
#4  0x007a870b in channel_get_buffer (opaque=0x38111e0, 
buf=0x440c038 "", pos=0, size=327
at migration/qemu-file-channel.c:83
#5  0x007a70f6 in qemu_fill_buffer (f=0x440c000) at 
migration/qemu-file.c:299
#6  0x007a79d0 in qemu_peek_byte (f=0x440c000, offset=0) at 
migration/qemu-file.c:562
#7  0x007a7a22 in qemu_get_byte (f=0x440c000) at 
migration/qemu-file.c:575
#8  0x007a7c78 in qemu_get_be32 (f=0x440c000) at 
migration/qemu-file.c:655
#9  0x007a0508 in qemu_loadvm_state (f=0x440c000) at 
migration/savevm.c:2126
#10 0x00794141 in process_incoming_migration_co (opaque=0x0) at 
migration/migration.c:366
#11 0x0095c598 in coroutine_trampoline (i0=84033984, i1=0) at 
util/coroutine-ucontext.c:1
#12 0x7f9c0db56d40 in ?? () from /lib64/libc.so.6
#13 0x7f96fe858760 in ?? ()
#14 0x in ?? ()

The RDMA QIOChannel does not implement io_set_aio_fd_handler, so
qio_channel_set_aio_fd_handler calls through a NULL pointer.

Signed-off-by: Lidong Chen 
Reviewed-by: Juan Quintela 
---
 migration/rdma.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/migration/rdma.c b/migration/rdma.c
index 92e4d30..dfa4f77 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2963,6 +2963,21 @@ static GSource *qio_channel_rdma_create_watch(QIOChannel 
*ioc,
 return source;
 }
 
+static void qio_channel_rdma_set_aio_fd_handler(QIOChannel *ioc,
+  AioContext *ctx,
+  IOHandler *io_read,
+  IOHandler *io_write,
+  void *opaque)
+{
+QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
+if (io_read) {
+aio_set_fd_handler(ctx, rioc->rdmain->comp_channel->fd,
+   false, io_read, io_write, NULL, opaque);
+} else {
+aio_set_fd_handler(ctx, rioc->rdmaout->comp_channel->fd,
+   false, io_read, io_write, NULL, opaque);
+}
+}
 
 static int qio_channel_rdma_close(QIOChannel *ioc,
   Error **errp)
@@ -3822,6 +3837,7 @@ static void qio_channel_rdma_class_init(ObjectClass 
*klass,
 ioc_klass->io_set_blocking = qio_channel_rdma_set_blocking;
 ioc_klass->io_close = qio_channel_rdma_close;
 ioc_klass->io_create_watch = qio_channel_rdma_create_watch;
+ioc_klass->io_set_aio_fd_handler = qio_channel_rdma_set_aio_fd_handler;
 }
 
 static const TypeInfo qio_channel_rdma_info = {
-- 
1.8.3.1
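
The handler added above chooses a completion-channel fd by direction: the inbound context when a read handler is supplied, the outbound context otherwise. Below is a minimal sketch of just that selection. The `fdsel_sketch_*` types and functions are hypothetical, and plain integers stand in for the real completion channels.

```c
#include <stddef.h>

typedef void fdsel_sketch_handler(void *opaque);

/* Hypothetical reduced contexts: one fd per direction. */
struct fdsel_sketch_ctx {
    int comp_fd;
};

struct fdsel_sketch_chan {
    struct fdsel_sketch_ctx *in;
    struct fdsel_sketch_ctx *out;
};

/* A do-nothing read handler, only used to exercise the selection. */
static void fdsel_sketch_noop(void *opaque)
{
    (void)opaque;
}

/* Pick the inbound fd when a read handler is supplied, else the outbound fd. */
static int fdsel_sketch_pick_fd(const struct fdsel_sketch_chan *c,
                                fdsel_sketch_handler *io_read)
{
    return io_read ? c->in->comp_fd : c->out->comp_fd;
}
```

Note that with this rule, clearing both handlers (both arguments NULL) selects the outbound fd. Whether that is the intended behaviour is not stated in the patch.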




[Qemu-devel] [PATCH v4 04/12] migration: avoid concurrent invoke channel_close by different threads

2018-05-30 Thread Lidong Chen
From: Lidong Chen 

channel_close may be invoked by different threads. For example, the source
QEMU invokes qemu_fclose in the main thread, the migration thread and the
return path thread. The destination QEMU invokes qemu_fclose in the main
thread, the listen thread and the COLO incoming thread.

Add a mutex to the QEMUFile struct so that channel_close is not invoked
concurrently.

Signed-off-by: Lidong Chen 
---
 migration/qemu-file.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 977b9ae..87d0f05 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -52,6 +52,7 @@ struct QEMUFile {
 unsigned int iovcnt;
 
 int last_error;
+QemuMutex lock;
 };
 
 /*
@@ -96,6 +97,7 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
 
 f = g_new0(QEMUFile, 1);
 
+qemu_mutex_init(&f->lock);
 f->opaque = opaque;
 f->ops = ops;
 return f;
@@ -328,7 +330,9 @@ int qemu_fclose(QEMUFile *f)
 ret = qemu_file_get_error(f);
 
 if (f->ops->close) {
+qemu_mutex_lock(&f->lock);
 int ret2 = f->ops->close(f->opaque);
+qemu_mutex_unlock(&f->lock);
 if (ret >= 0) {
 ret = ret2;
 }
@@ -339,6 +343,7 @@ int qemu_fclose(QEMUFile *f)
 if (f->last_error) {
 ret = f->last_error;
 }
+qemu_mutex_destroy(&f->lock);
 g_free(f);
 trace_qemu_file_fclose();
 return ret;
-- 
1.8.3.1




[Qemu-devel] [PATCH v4 06/12] migration: Stop rdma yielding during incoming postcopy

2018-05-30 Thread Lidong Chen
From: Lidong Chen 

During incoming postcopy, the destination QEMU invokes
qemu_rdma_wait_comp_channel in a separate thread. So do not use the RDMA
yield there, and poll the completion channel fd instead.

Signed-off-by: Lidong Chen 
Reviewed-by: Dr. David Alan Gilbert 
---
 migration/rdma.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 45f01e6..0dd4033 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -1493,11 +1493,13 @@ static int qemu_rdma_wait_comp_channel(RDMAContext 
*rdma)
  * Coroutine doesn't start until migration_fd_process_incoming()
  * so don't yield unless we know we're running inside of a coroutine.
  */
-if (rdma->migration_started_on_destination) {
+if (rdma->migration_started_on_destination &&
+migration_incoming_get_current()->state == MIGRATION_STATUS_ACTIVE) {
 yield_until_fd_readable(rdma->comp_channel->fd);
 } else {
 /* This is the source side, we're in a separate thread
  * or destination prior to migration_fd_process_incoming()
+ * after postcopy, the destination is also in a separate thread.
  * we can't yield; so we have to poll the fd.
  * But we need to be able to handle 'cancel' or an error
  * without hanging forever.
-- 
1.8.3.1




[Qemu-devel] [PATCH v4 05/12] migration: implement bi-directional RDMA QIOChannel

2018-05-30 Thread Lidong Chen
From: Lidong Chen 

This patch implements a bi-directional RDMA QIOChannel. Because different
threads may access QIOChannelRDMA concurrently, this patch uses RCU to
protect it.

Signed-off-by: Lidong Chen 
---
 migration/colo.c |   2 +
 migration/migration.c|   2 +
 migration/postcopy-ram.c |   2 +
 migration/ram.c  |   4 +
 migration/rdma.c | 196 ---
 migration/savevm.c   |   3 +
 6 files changed, 183 insertions(+), 26 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 4381067..88936f5 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -534,6 +534,7 @@ void *colo_process_incoming_thread(void *opaque)
 uint64_t value;
 Error *local_err = NULL;
 
+rcu_register_thread();
 qemu_sem_init(&mis->colo_incoming_sem, 0);
 
 migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
@@ -666,5 +667,6 @@ out:
 }
 migration_incoming_exit_colo();
 
+rcu_unregister_thread();
 return NULL;
 }
diff --git a/migration/migration.c b/migration/migration.c
index 05aec2c..6217ef1 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2008,6 +2008,7 @@ static void *source_return_path_thread(void *opaque)
 int res;
 
 trace_source_return_path_thread_entry();
+rcu_register_thread();
 
 retry:
 while (!ms->rp_state.error && !qemu_file_get_error(rp) &&
@@ -2147,6 +2148,7 @@ out:
 trace_source_return_path_thread_end();
 ms->rp_state.from_dst_file = NULL;
 qemu_fclose(rp);
+rcu_unregister_thread();
 return NULL;
 }
 
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 658b750..a5de61d 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -853,6 +853,7 @@ static void *postcopy_ram_fault_thread(void *opaque)
 RAMBlock *rb = NULL;
 
 trace_postcopy_ram_fault_thread_entry();
+rcu_register_thread();
 mis->last_rb = NULL; /* last RAMBlock we sent part of */
 qemu_sem_post(&mis->fault_thread_sem);
 
@@ -1059,6 +1060,7 @@ retry:
 }
 }
 }
+rcu_unregister_thread();
 trace_postcopy_ram_fault_thread_exit();
 g_free(pfd);
 return NULL;
diff --git a/migration/ram.c b/migration/ram.c
index c53e836..85c8c39 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -678,6 +678,7 @@ static void *multifd_send_thread(void *opaque)
 MultiFDSendParams *p = opaque;
 Error *local_err = NULL;
 
+rcu_register_thread();
 if (multifd_send_initial_packet(p, &local_err) < 0) {
 goto out;
 }
@@ -701,6 +702,7 @@ out:
 p->running = false;
 qemu_mutex_unlock(&p->mutex);
 
+rcu_unregister_thread();
 return NULL;
 }
 
@@ -814,6 +816,7 @@ static void *multifd_recv_thread(void *opaque)
 {
 MultiFDRecvParams *p = opaque;
 
+rcu_register_thread();
 while (true) {
 qemu_mutex_lock(&p->mutex);
 if (p->quit) {
@@ -828,6 +831,7 @@ static void *multifd_recv_thread(void *opaque)
 p->running = false;
 qemu_mutex_unlock(&p->mutex);
 
+rcu_unregister_thread();
 return NULL;
 }
 
diff --git a/migration/rdma.c b/migration/rdma.c
index 9b6da4d..45f01e6 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -86,6 +86,7 @@ static uint32_t known_capabilities = RDMA_CAPABILITY_PIN_ALL;
 " to abort!"); \
 rdma->error_reported = 1; \
 } \
+rcu_read_unlock(); \
 return rdma->error_state; \
 } \
 } while (0)
@@ -402,7 +403,8 @@ typedef struct QIOChannelRDMA QIOChannelRDMA;
 
 struct QIOChannelRDMA {
 QIOChannel parent;
-RDMAContext *rdma;
+RDMAContext *rdmain;
+RDMAContext *rdmaout;
 QEMUFile *file;
 bool blocking; /* XXX we don't actually honour this yet */
 };
@@ -2638,12 +2640,20 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
 {
 QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
 QEMUFile *f = rioc->file;
-RDMAContext *rdma = rioc->rdma;
+RDMAContext *rdma;
 int ret;
 ssize_t done = 0;
 size_t i;
 size_t len = 0;
 
+rcu_read_lock();
+rdma = atomic_rcu_read(&rioc->rdmaout);
+
+if (!rdma) {
+rcu_read_unlock();
+return -EIO;
+}
+
 CHECK_ERROR_STATE();
 
 /*
@@ -2653,6 +2663,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
 ret = qemu_rdma_write_flush(f, rdma);
 if (ret < 0) {
 rdma->error_state = ret;
+rcu_read_unlock();
 return ret;
 }
 
@@ -2672,6 +2683,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
 
 if (ret < 0) {
 rdma->error_state = ret;
+rcu_read_unlock();
 return ret;
 }
 
@@ -2680,6 +2692,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
 }
 }
 
+rcu_read_unlock();
 return done;
 }
 
@@ -2713,12 +2726,20 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,

[Qemu-devel] [PATCH v4 12/12] migration: implement the shutdown for RDMA QIOChannel

2018-05-30 Thread Lidong Chen
Because the RDMA QIOChannel does not implement the shutdown function,
the return path thread waits forever if an error is set on to_dst_file,
and the migration thread in turn waits for the return path thread to
exit.

The backtrace of the return path thread is:

(gdb) bt
#0  0x7f372a76bb0f in ppoll () from /lib64/libc.so.6
#1  0x0071dc24 in qemu_poll_ns (fds=0x7ef7091d0580, nfds=2, 
timeout=1)
at qemu-timer.c:325
#2  0x006b2fba in qemu_rdma_wait_comp_channel (rdma=0xd424000)
at migration/rdma.c:1501
#3  0x006b3191 in qemu_rdma_block_for_wrid (rdma=0xd424000, 
wrid_requested=4000,
byte_len=0x7ef7091d0640) at migration/rdma.c:1580
#4  0x006b3638 in qemu_rdma_exchange_get_response (rdma=0xd424000,
head=0x7ef7091d0720, expecting=3, idx=0) at migration/rdma.c:1726
#5  0x006b3ad6 in qemu_rdma_exchange_recv (rdma=0xd424000, 
head=0x7ef7091d0720,
expecting=3) at migration/rdma.c:1903
#6  0x006b5d03 in qemu_rdma_get_buffer (opaque=0x6a57dc0, 
buf=0x5c80030 "", pos=8,
size=32768) at migration/rdma.c:2714
#7  0x006a9635 in qemu_fill_buffer (f=0x5c8) at 
migration/qemu-file.c:232
#8  0x006a9ecd in qemu_peek_byte (f=0x5c8, offset=0)
at migration/qemu-file.c:502
#9  0x006a9f1f in qemu_get_byte (f=0x5c8) at 
migration/qemu-file.c:515
#10 0x006aa162 in qemu_get_be16 (f=0x5c8) at 
migration/qemu-file.c:591
#11 0x006a46d3 in source_return_path_thread (
opaque=0xd826a0 ) at migration/migration.c:1331
#12 0x7f372aa49e25 in start_thread () from /lib64/libpthread.so.0
#13 0x7f372a77635d in clone () from /lib64/libc.so.6

The backtrace of the migration thread is:

(gdb) bt
#0  0x7f372aa4af57 in pthread_join () from /lib64/libpthread.so.0
#1  0x007d5711 in qemu_thread_join (thread=0xd826f8 
)
at util/qemu-thread-posix.c:504
#2  0x006a4bc5 in await_return_path_close_on_source (
ms=0xd826a0 ) at migration/migration.c:1460
#3  0x006a53e4 in migration_completion (s=0xd826a0 
,
current_active_state=4, old_vm_running=0x7ef7089cf976, 
start_time=0x7ef7089cf980)
at migration/migration.c:1695
#4  0x006a5c54 in migration_thread (opaque=0xd826a0 
)
at migration/migration.c:1837
#5  0x7f372aa49e25 in start_thread () from /lib64/libpthread.so.0
#6  0x7f372a77635d in clone () from /lib64/libc.so.6

Signed-off-by: Lidong Chen 
---
 migration/rdma.c | 40 
 1 file changed, 40 insertions(+)

diff --git a/migration/rdma.c b/migration/rdma.c
index d611a06..0912b6a 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3038,6 +3038,45 @@ static int qio_channel_rdma_close(QIOChannel *ioc,
 return 0;
 }
 
+static int
+qio_channel_rdma_shutdown(QIOChannel *ioc,
+QIOChannelShutdown how,
+Error **errp)
+{
+QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
+RDMAContext *rdmain, *rdmaout;
+
+rcu_read_lock();
+
+rdmain = atomic_rcu_read(&rioc->rdmain);
+rdmaout = atomic_rcu_read(&rioc->rdmaout);
+
+switch (how) {
+case QIO_CHANNEL_SHUTDOWN_READ:
+if (rdmain) {
+rdmain->error_state = -1;
+}
+break;
+case QIO_CHANNEL_SHUTDOWN_WRITE:
+if (rdmaout) {
+rdmaout->error_state = -1;
+}
+break;
+case QIO_CHANNEL_SHUTDOWN_BOTH:
+default:
+if (rdmain) {
+rdmain->error_state = -1;
+}
+if (rdmaout) {
+rdmaout->error_state = -1;
+}
+break;
+}
+
+rcu_read_unlock();
+return 0;
+}
+
 /*
  * Parameters:
  *@offset == 0 :
@@ -3864,6 +3903,7 @@ static void qio_channel_rdma_class_init(ObjectClass 
*klass,
 ioc_klass->io_close = qio_channel_rdma_close;
 ioc_klass->io_create_watch = qio_channel_rdma_create_watch;
 ioc_klass->io_set_aio_fd_handler = qio_channel_rdma_set_aio_fd_handler;
+ioc_klass->io_shutdown = qio_channel_rdma_shutdown;
 }
 
 static const TypeInfo qio_channel_rdma_info = {
-- 
1.8.3.1
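
The shutdown logic in the patch above comes down to flagging an error on the inbound context, the outbound context, or both, so that waiters notice and exit. The sketch below shows that switch on its own. It deliberately omits the RCU read-side locking the patch uses, and all `shut_sketch_*` names are hypothetical.

```c
#include <stddef.h>

enum shut_sketch_how {
    SHUT_SKETCH_READ,
    SHUT_SKETCH_WRITE,
    SHUT_SKETCH_BOTH
};

/* Hypothetical reduced contexts: only the error flag matters here. */
struct shut_sketch_rdma {
    int error_state;
};

struct shut_sketch_chan {
    struct shut_sketch_rdma *in;  /* may be NULL once torn down */
    struct shut_sketch_rdma *out; /* may be NULL once torn down */
};

/* Mark the selected direction(s) as failed; loops that test error_state
 * (such as a wait loop polling a completion channel) then bail out. */
static int shut_sketch_shutdown(struct shut_sketch_chan *c,
                                enum shut_sketch_how how)
{
    switch (how) {
    case SHUT_SKETCH_READ:
        if (c->in) {
            c->in->error_state = -1;
        }
        break;
    case SHUT_SKETCH_WRITE:
        if (c->out) {
            c->out->error_state = -1;
        }
        break;
    case SHUT_SKETCH_BOTH:
    default:
        if (c->in) {
            c->in->error_state = -1;
        }
        if (c->out) {
            c->out->error_state = -1;
        }
        break;
    }
    return 0;
}
```

For the write case to have any effect, the outbound pointer must really be read from the outbound field, so the two reads at the top of the function must not both load the inbound one.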




[Qemu-devel] [PATCH v4 07/12] migration: not wait RDMA_CM_EVENT_DISCONNECTED event after rdma_disconnect

2018-05-30 Thread Lidong Chen
From: Lidong Chen 

When migration is cancelled during RDMA precopy, the source QEMU main thread
sometimes hangs.

The backtrace is:
(gdb) bt
#0  0x7f249eabd43d in write () from /lib64/libpthread.so.0
#1  0x7f24a1ce98e4 in rdma_get_cm_event (channel=0x4675d10, 
event=0x7ffe2f643dd0) at src/cma.c:2189
#2  0x007b6166 in qemu_rdma_cleanup (rdma=0x6784000) at 
migration/rdma.c:2296
#3  0x007b7cae in qio_channel_rdma_close (ioc=0x3bfcc30, errp=0x0) 
at migration/rdma.c:2999
#4  0x008db60e in qio_channel_close (ioc=0x3bfcc30, errp=0x0) at 
io/channel.c:273
#5  0x007a8765 in channel_close (opaque=0x3bfcc30) at 
migration/qemu-file-channel.c:98
#6  0x007a71f9 in qemu_fclose (f=0x527c000) at 
migration/qemu-file.c:334
#7  0x00795b96 in migrate_fd_cleanup (opaque=0x3b46280) at 
migration/migration.c:1162
#8  0x0093a71b in aio_bh_call (bh=0x3db7a20) at util/async.c:90
#9  0x0093a7b2 in aio_bh_poll (ctx=0x3b121c0) at util/async.c:118
#10 0x0093f2ad in aio_dispatch (ctx=0x3b121c0) at 
util/aio-posix.c:436
#11 0x0093ab41 in aio_ctx_dispatch (source=0x3b121c0, callback=0x0, 
user_data=0x0)
at util/async.c:261
#12 0x7f249f73c7aa in g_main_context_dispatch () from 
/lib64/libglib-2.0.so.0
#13 0x0093dc5e in glib_pollfds_poll () at util/main-loop.c:215
#14 0x0093dd4e in os_host_main_loop_wait (timeout=2800) at 
util/main-loop.c:263
#15 0x0093de05 in main_loop_wait (nonblocking=0) at 
util/main-loop.c:522
#16 0x005bc6a5 in main_loop () at vl.c:1944
#17 0x005c39b5 in main (argc=56, argv=0x7ffe2f6443f8, 
envp=0x3ad0030) at vl.c:4752

Sometimes the RDMA_CM_EVENT_DISCONNECTED event is not delivered after
rdma_disconnect.

According to the IB spec, once the active side sends a DREQ message, it
should wait for the DREP message, and only once that has arrived should it
trigger a DISCONNECT event. The DREP message can be dropped due to network
issues. For that case the spec defines a DREP_timeout state in the CM state
machine: if the DREP is dropped, we should get a timeout and a TIMEWAIT_EXIT
event will be triggered. Unfortunately the current kernel CM implementation
doesn't include the DREP_timeout state, and in the above scenario we will not
get DISCONNECT or TIMEWAIT_EXIT events.

So qemu_rdma_cleanup should not invoke rdma_get_cm_event, which may hang
forever. The event channel is destroyed in qemu_rdma_cleanup anyway.

Signed-off-by: Lidong Chen 
---
 migration/rdma.c   | 12 ++--
 migration/trace-events |  1 -
 2 files changed, 2 insertions(+), 11 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 0dd4033..92e4d30 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2275,8 +2275,7 @@ static int qemu_rdma_write(QEMUFile *f, RDMAContext *rdma,
 
 static void qemu_rdma_cleanup(RDMAContext *rdma)
 {
-struct rdma_cm_event *cm_event;
-int ret, idx;
+int idx;
 
 if (rdma->cm_id && rdma->connected) {
 if ((rdma->error_state ||
@@ -2290,14 +2289,7 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
 qemu_rdma_post_send_control(rdma, NULL, &head);
 }
 
-ret = rdma_disconnect(rdma->cm_id);
-if (!ret) {
-trace_qemu_rdma_cleanup_waiting_for_disconnect();
-ret = rdma_get_cm_event(rdma->channel, &cm_event);
-if (!ret) {
-rdma_ack_cm_event(cm_event);
-}
-}
+rdma_disconnect(rdma->cm_id);
 trace_qemu_rdma_cleanup_disconnect();
 rdma->connected = false;
 }
diff --git a/migration/trace-events b/migration/trace-events
index 3c798dd..4a768ea 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -146,7 +146,6 @@ qemu_rdma_accept_pin_state(bool pin) "%d"
 qemu_rdma_accept_pin_verbsc(void *verbs) "Verbs context after listen: %p"
 qemu_rdma_block_for_wrid_miss(const char *wcompstr, int wcomp, const char 
*gcompstr, uint64_t req) "A Wanted wrid %s (%d) but got %s (%" PRIu64 ")"
 qemu_rdma_cleanup_disconnect(void) ""
-qemu_rdma_cleanup_waiting_for_disconnect(void) ""
 qemu_rdma_close(void) ""
 qemu_rdma_connect_pin_all_requested(void) ""
 qemu_rdma_connect_pin_all_outcome(bool pin) "%d"
-- 
1.8.3.1




[Qemu-devel] [PATCH v4 10/12] migration: create a dedicated thread to release rdma resource

2018-05-30 Thread Lidong Chen
ibv_dereg_mr takes a long time for virtual servers with a large memory size.

The test result is:
  10GB  326ms
  20GB  699ms
  30GB  1021ms
  40GB  1387ms
  50GB  1712ms
  60GB  2034ms
  70GB  2457ms
  80GB  2807ms
  90GB  3107ms
  100GB 3474ms
  110GB 3735ms
  120GB 4064ms
  130GB 4567ms
  140GB 4886ms

This causes the guest OS to hang for a while when migration finishes.
So create a dedicated thread to release the RDMA resources.

Signed-off-by: Lidong Chen 
---
 migration/rdma.c | 21 +
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index dfa4f77..1b9e261 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2979,12 +2979,12 @@ static void 
qio_channel_rdma_set_aio_fd_handler(QIOChannel *ioc,
 }
 }
 
-static int qio_channel_rdma_close(QIOChannel *ioc,
-  Error **errp)
+static void *qio_channel_rdma_close_thread(void *arg)
 {
-QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
+QIOChannelRDMA *rioc = arg;
 RDMAContext *rdmain, *rdmaout;
-trace_qemu_rdma_close();
+
+rcu_register_thread();
 
 rdmain = rioc->rdmain;
 if (rdmain) {
@@ -3009,6 +3009,19 @@ static int qio_channel_rdma_close(QIOChannel *ioc,
 g_free(rdmain);
 g_free(rdmaout);
 
+rcu_unregister_thread();
+return NULL;
+}
+
+static int qio_channel_rdma_close(QIOChannel *ioc,
+  Error **errp)
+{
+QemuThread t;
+QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
+trace_qemu_rdma_close();
+
+qemu_thread_create(&t, "rdma cleanup", qio_channel_rdma_close_thread,
+   rioc, QEMU_THREAD_DETACHED);
 return 0;
 }
 
-- 
1.8.3.1
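
The timings quoted in the commit message above grow roughly linearly with guest memory size. As a quick check, the sketch below copies the table into an array and computes the aggregate cost per GB. The identifiers are hypothetical, and the data is exactly the fourteen rows from the message.

```c
#include <stddef.h>

/* The (guest memory in GB, ibv_dereg_mr time in ms) pairs quoted above. */
static const struct {
    int gb;
    int ms;
} dereg_sketch_samples[] = {
    { 10, 326 },   { 20, 699 },   { 30, 1021 },  { 40, 1387 },
    { 50, 1712 },  { 60, 2034 },  { 70, 2457 },  { 80, 2807 },
    { 90, 3107 },  { 100, 3474 }, { 110, 3735 }, { 120, 4064 },
    { 130, 4567 }, { 140, 4886 }
};

/* Aggregate milliseconds per GB across all samples. */
static double dereg_sketch_ms_per_gb(void)
{
    double gb = 0.0;
    double ms = 0.0;
    size_t i;
    size_t n = sizeof(dereg_sketch_samples) / sizeof(dereg_sketch_samples[0]);

    for (i = 0; i < n; i++) {
        gb += dereg_sketch_samples[i].gb;
        ms += dereg_sketch_samples[i].ms;
    }
    return ms / gb;
}
```

The result is on the order of 35 ms per GB, which is why deregistration on the main thread becomes a multi-second stall for guests above 100 GB.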




[Qemu-devel] [PATCH v4 11/12] migration: poll the cm event while wait RDMA work request completion

2018-05-30 Thread Lidong Chen
If the peer QEMU has crashed, the qemu_rdma_wait_comp_channel function
may loop forever. So we should also poll the cm event fd, and when any
cm event is received, treat it as an error.

Signed-off-by: Lidong Chen 
---
 migration/rdma.c | 35 ---
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 1b9e261..d611a06 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -1489,6 +1489,9 @@ static uint64_t qemu_rdma_poll(RDMAContext *rdma, 
uint64_t *wr_id_out,
  */
 static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
 {
+struct rdma_cm_event *cm_event;
+int ret = -1;
+
 /*
  * Coroutine doesn't start until migration_fd_process_incoming()
  * so don't yield unless we know we're running inside of a coroutine.
@@ -1504,25 +1507,35 @@ static int qemu_rdma_wait_comp_channel(RDMAContext 
*rdma)
  * But we need to be able to handle 'cancel' or an error
  * without hanging forever.
  */
-while (!rdma->error_state  && !rdma->received_error) {
-GPollFD pfds[1];
+while (!rdma->error_state && !rdma->received_error) {
+GPollFD pfds[2];
 pfds[0].fd = rdma->comp_channel->fd;
 pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
+pfds[0].revents = 0;
+
+pfds[1].fd = rdma->channel->fd;
+pfds[1].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
+pfds[1].revents = 0;
+
 /* 0.1s timeout, should be fine for a 'cancel' */
-switch (qemu_poll_ns(pfds, 1, 100 * 1000 * 1000)) {
-case 1: /* fd active */
-return 0;
+qemu_poll_ns(pfds, 2, 100 * 1000 * 1000);
 
-case 0: /* Timeout, go around again */
-break;
+if (pfds[1].revents) {
+ret = rdma_get_cm_event(rdma->channel, &cm_event);
+if (!ret) {
+error_report("receive cm event while wait comp channel,"
+ "cm event is %d", cm_event->event);
+rdma_ack_cm_event(cm_event);
+}
 
-default: /* Error of some type -
-  * I don't trust errno from qemu_poll_ns
- */
-error_report("%s: poll failed", __func__);
+/* consider any rdma communication event as an error */
 return -EPIPE;
 }
 
+if (pfds[0].revents) {
+return 0;
+}
+
 if (migrate_get_current()->state == MIGRATION_STATUS_CANCELLING) {
 /* Bail out and let the cancellation happen */
 return -EPIPE;
-- 
1.8.3.1
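
The idea of the patch above is to wait on two descriptors at once and treat any activity on the connection-manager fd as fatal. The following sketch shows it with plain POSIX `poll()` instead of `qemu_poll_ns()`. The function name and its return convention (0, `-EPIPE`, or 1 on timeout) are assumptions made for this example, not QEMU's interface.

```c
#include <errno.h>
#include <poll.h>

/* Wait up to timeout_ms for either descriptor.
 * Returns 0 when the completion fd is ready, -EPIPE when the
 * connection-manager fd reports anything, 1 on timeout, and a
 * negative errno value if poll() itself fails. */
static int cmwait_sketch_wait(int comp_fd, int cm_fd, int timeout_ms)
{
    struct pollfd pfds[2];

    pfds[0].fd = comp_fd;
    pfds[0].events = POLLIN;
    pfds[0].revents = 0;

    pfds[1].fd = cm_fd;
    pfds[1].events = POLLIN;
    pfds[1].revents = 0;

    if (poll(pfds, 2, timeout_ms) < 0) {
        return -errno;
    }

    /* Check the connection-manager fd first: any event there is
     * considered an error, even if a completion is also pending. */
    if (pfds[1].revents) {
        return -EPIPE;
    }
    if (pfds[0].revents) {
        return 0;
    }
    return 1;
}
```

A caller would typically loop while the function returns 1, re-checking its own cancellation and error flags between iterations, much as the surrounding `while` loop in the patch does.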




Re: [Qemu-devel] [PATCH v2 03/16] job: Add error message for failing jobs

2018-05-30 Thread Max Reitz
On 2018-05-29 22:38, Kevin Wolf wrote:
> So far we relied on job->ret and strerror() to produce an error message
> for failed jobs. Not surprisingly, this tends to result in completely
> useless messages.
> 
> This adds a Job.error field that can contain an error string for a
> failing job, and a parameter to job_completed() that sets the field. As
> a default, if NULL is passed, we continue to use strerror(job->ret).
> 
> All existing callers are changed to pass NULL. They can be improved in
> separate patches.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  include/qemu/job.h|  7 ++-
>  block/backup.c|  2 +-
>  block/commit.c|  2 +-
>  block/mirror.c|  2 +-
>  block/stream.c|  2 +-
>  job-qmp.c |  9 ++---
>  job.c | 16 ++--
>  tests/test-bdrv-drain.c   |  2 +-
>  tests/test-blockjob-txn.c |  2 +-
>  tests/test-blockjob.c |  2 +-
>  10 files changed, 29 insertions(+), 17 deletions(-)

Reviewed-by: Max Reitz 





Re: [Qemu-devel] [PATCH v2 06/16] qemu-iotests: Add VM.qmp_log()

2018-05-30 Thread Max Reitz
On 2018-05-29 22:39, Kevin Wolf wrote:
> This adds a helper function that logs both the QMP request and the
> received response before returning it.
> 
> Signed-off-by: Kevin Wolf 
> Reviewed-by: Jeff Cody 
> ---
>  tests/qemu-iotests/iotests.py | 11 +++
>  1 file changed, 11 insertions(+)

Reviewed-by: Max Reitz 





Re: [Qemu-devel] [PATCH v2 07/16] qemu-iotests: Add iotests.img_info_log()

2018-05-30 Thread Max Reitz
On 2018-05-29 22:39, Kevin Wolf wrote:
> This adds a filter function to postprocess 'qemu-img info' input
> (similar to what _img_info does), and an img_info_log() function that
> calls 'qemu-img info' and logs the filtered output.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  tests/qemu-iotests/iotests.py | 18 ++
>  1 file changed, 18 insertions(+)

Reviewed-by: Max Reitz 





Re: [Qemu-devel] [PATCH v2 08/16] qemu-iotests: Add VM.run_job()

2018-05-30 Thread Max Reitz
On 2018-05-29 22:39, Kevin Wolf wrote:
> Add an iotests.py function that runs a job and only returns when it is
> destroyed. An error is logged when the job failed and job-finalize and
> job-dismiss commands are issued if necessary.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  tests/qemu-iotests/iotests.py | 19 +++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> index edcd2bb701..8b612cb891 100644
> --- a/tests/qemu-iotests/iotests.py
> +++ b/tests/qemu-iotests/iotests.py
> @@ -418,6 +418,25 @@ class VM(qtest.QEMUQtestMachine):
>  log(str(result), filters)
>  return result
>  
> +def run_job(self, job, auto_finalize=True, auto_dismiss=False):
> +while True:
> +for ev in self.get_qmp_events_filtered(wait=True):
> +if ev['event'] == 'JOB_STATUS_CHANGE':
> +status = ev['data']['status']
> +if status == 'aborting':
> +result = self.qmp('query-jobs')
> +for j in result['return']:
> +if j['id'] == job:
> +log('Job failed: %s' % (j['error']))
> +elif status == 'pending' and not auto_finalize:
> +self.qmp_log('job-finalize', id=job)
> +elif status == 'concluded' and not auto_dismiss:
> +self.qmp_log('job-dismiss', id=job)
> +elif status == 'null':
> +return
> +else:
> +iotests.log(ev)
> +
>  
>  index_re = re.compile(r'([^\[]+)\[([^\]]+)\]')

No, I won't mention that I just realized you could get the job ID from
the event and don't actually need it as a parameter.

So without any comments:

Reviewed-by: Max Reitz 
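
For illustration only, the remark above about taking the job ID from the
event could look roughly like the sketch below. It assumes that the
JOB_STATUS_CHANGE event carries an 'id' field next to 'status' in its data;
follow_jobs() and the event list are hypothetical stand-ins for the real
VM.run_job() loop and do not talk to QEMU.

```python
def follow_jobs(events, auto_finalize=True, auto_dismiss=False):
    """Return the QMP commands a job runner would issue for a stream of events.

    Hypothetical helper: the job ID is read from each event instead of
    being passed in as a parameter.
    """
    commands = []
    for ev in events:
        if ev.get('event') != 'JOB_STATUS_CHANGE':
            continue
        job_id = ev['data']['id']          # assumed to be present in the event
        status = ev['data']['status']
        if status == 'pending' and not auto_finalize:
            commands.append(('job-finalize', job_id))
        elif status == 'concluded' and not auto_dismiss:
            commands.append(('job-dismiss', job_id))
        elif status == 'null':
            break                          # the job is gone, stop following it
    return commands


EXAMPLE_EVENTS = [
    {'event': 'JOB_STATUS_CHANGE', 'data': {'id': 'job0', 'status': 'running'}},
    {'event': 'JOB_STATUS_CHANGE', 'data': {'id': 'job0', 'status': 'pending'}},
    {'event': 'JOB_STATUS_CHANGE', 'data': {'id': 'job0', 'status': 'concluded'}},
    {'event': 'JOB_STATUS_CHANGE', 'data': {'id': 'job0', 'status': 'null'}},
]
```

With the defaults only the dismiss command is issued, because the job is
assumed to finalize on its own.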





Re: [Qemu-devel] [PATCH v2 09/16] qemu-iotests: iotests.py helper for non-file protocols

2018-05-30 Thread Max Reitz
On 2018-05-29 22:39, Kevin Wolf wrote:
> This adds two helper functions that are useful for test cases that make
> use of a non-file protocol (specifically ssh).
> 
> Signed-off-by: Kevin Wolf 
> ---
>  tests/qemu-iotests/iotests.py | 17 +
>  1 file changed, 17 insertions(+)

Reviewed-by: Max Reitz 





[Qemu-devel] [PATCH v2] pnv: add a physical mapping array describing MMIO ranges in each chip

2018-05-30 Thread Cédric Le Goater
Based on previous work done in skiboot, the physical mapping array
helps in calculating the MMIO ranges of each controller depending on
the chip id and the chip type. This will be particularly useful for
the P9 models, which use the XSCOM bus less and rely more on MMIOs.

A link on the chip is now necessary to calculate MMIO BARs and
sizes. This is why such a link is introduced in the PSIHB model.
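
As a rough illustration of the idea (not the code in this patch), the lookup
can be sketched as a per-type table of (base, size) entries, with the
per-chip base derived from the chip index in the same way the removed
PNV_ICP_BASE() macro did. All names and address values below are made up
for the example.

```python
from collections import namedtuple
from enum import IntEnum


class MapType(IntEnum):
    """Hypothetical counterpart of PnvPhysMapType."""
    XSCOM = 0
    ICP = 1


PhysMapEntry = namedtuple("PhysMapEntry", ["base", "size"])

# Made-up addresses, chosen only to keep the arithmetic easy to follow.
EXAMPLE_MAP = {
    MapType.XSCOM: PhysMapEntry(base=0x10000000, size=0x01000000),
    MapType.ICP: PhysMapEntry(base=0x80000000, size=0x00100000),
}


def map_size(phys_map, kind):
    """Size of one controller's MMIO window."""
    return phys_map[kind].size


def map_base(phys_map, kind, chip_index):
    """Base address for a given chip: table base plus one window per chip."""
    entry = phys_map[kind]
    return entry.base + chip_index * entry.size
```

A chip type with a different layout would only need its own table and, if
the stride rule differs, its own map_base() handler.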

Signed-off-by: Cédric Le Goater 
Reviewed-by: Philippe Mathieu-Daudé 
---

 Changes since v1:

 - removed PNV_MAP_MAX, which was unused
 - introduced a chip class handler to calculate the base address of a
   controller as suggested by Greg.
 - fix error reporting in pnv_psi_realize()

 include/hw/ppc/pnv.h | 51 ++
 hw/ppc/pnv.c | 53 
 hw/ppc/pnv_psi.c | 15 ---
 hw/ppc/pnv_xscom.c   |  8 
 4 files changed, 96 insertions(+), 31 deletions(-)

diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
index 90759240a7b1..ffa4a0899705 100644
--- a/include/hw/ppc/pnv.h
+++ b/include/hw/ppc/pnv.h
@@ -53,7 +53,6 @@ typedef struct PnvChip {
 uint64_t cores_mask;
 void *cores;
 
-hwaddr   xscom_base;
 MemoryRegion xscom_mmio;
 MemoryRegion xscom;
 AddressSpace xscom_as;
@@ -64,6 +63,18 @@ typedef struct PnvChip {
 PnvOCC   occ;
 } PnvChip;
 
+typedef enum PnvPhysMapType {
+PNV_MAP_XSCOM,
+PNV_MAP_ICP,
+PNV_MAP_PSIHB,
+PNV_MAP_PSIHB_FSP,
+} PnvPhysMapType;
+
+typedef struct PnvPhysMapEntry {
+uint64_t base;
+uint64_t size;
+} PnvPhysMapEntry;
+
 typedef struct PnvChipClass {
 /*< private >*/
 SysBusDeviceClass parent_class;
@@ -73,9 +84,10 @@ typedef struct PnvChipClass {
 uint64_t chip_cfam_id;
 uint64_t cores_mask;
 
-hwaddr   xscom_base;
+const PnvPhysMapEntry *phys_map;
 
 uint32_t (*core_pir)(PnvChip *chip, uint32_t core_id);
+uint64_t (*map_base)(const PnvChip *chip, PnvPhysMapType type);
 } PnvChipClass;
 
 #define PNV_CHIP_TYPE_SUFFIX "-" TYPE_PNV_CHIP
@@ -159,9 +171,21 @@ void pnv_bmc_powerdown(IPMIBmc *bmc);
 /*
  * POWER8 MMIO base addresses
  */
-#define PNV_XSCOM_SIZE0x8ull
-#define PNV_XSCOM_BASE(chip)\
-(chip->xscom_base + ((uint64_t)(chip)->chip_id) * PNV_XSCOM_SIZE)
+static inline uint64_t pnv_map_size(const PnvChip *chip, PnvPhysMapType type)
+{
+PnvChipClass *pcc = PNV_CHIP_GET_CLASS(chip);
+const PnvPhysMapEntry *map = &pcc->phys_map[type];
+
+return map->size;
+}
+
+static inline uint64_t pnv_map_base(const PnvChip *chip, PnvPhysMapType type)
+{
+return PNV_CHIP_GET_CLASS(chip)->map_base(chip, type);
+}
+
+#define PNV_XSCOM_SIZE(chip) pnv_map_size(chip, PNV_MAP_XSCOM)
+#define PNV_XSCOM_BASE(chip) pnv_map_base(chip, PNV_MAP_XSCOM)
 
 /*
  * XSCOM 0x20109CA defines the ICP BAR:
@@ -177,18 +201,13 @@ void pnv_bmc_powerdown(IPMIBmc *bmc);
  *  0xe022 -> 0x00038080
  *  0xe026 -> 0x00038090
  */
-#define PNV_ICP_SIZE 0x0010ull
-#define PNV_ICP_BASE(chip)  \
-(0x00038000ull + (uint64_t) PNV_CHIP_INDEX(chip) * PNV_ICP_SIZE)
-
+#define PNV_ICP_SIZE(chip)   pnv_map_size(chip, PNV_MAP_ICP)
+#define PNV_ICP_BASE(chip)   pnv_map_base(chip, PNV_MAP_ICP)
 
-#define PNV_PSIHB_SIZE   0x0010ull
-#define PNV_PSIHB_BASE(chip) \
-(0x0003fffe8000ull + (uint64_t)PNV_CHIP_INDEX(chip) * PNV_PSIHB_SIZE)
+#define PNV_PSIHB_SIZE(chip) pnv_map_size(chip, PNV_MAP_PSIHB)
+#define PNV_PSIHB_BASE(chip) pnv_map_base(chip, PNV_MAP_PSIHB)
 
-#define PNV_PSIHB_FSP_SIZE   0x0001ull
-#define PNV_PSIHB_FSP_BASE(chip) \
-(0x0003ffe0ull + (uint64_t)PNV_CHIP_INDEX(chip) * \
- PNV_PSIHB_FSP_SIZE)
+#define PNV_PSIHB_FSP_SIZE(chip) pnv_map_size(chip, PNV_MAP_PSIHB_FSP)
+#define PNV_PSIHB_FSP_BASE(chip) pnv_map_base(chip, PNV_MAP_PSIHB_FSP)
 
 #endif /* _PPC_PNV_H */
diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 031488131629..77caaea64b2f 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -712,6 +712,24 @@ static uint32_t pnv_chip_core_pir_p9(PnvChip *chip, uint32_t core_id)
  */
 #define POWER9_CORE_MASK   (0xffull)
 
+/*
+ * POWER8 MMIOs
+ */
+static const PnvPhysMapEntry pnv_chip_power8_phys_map[] = {
+[PNV_MAP_XSCOM] = { 0x0003fc00ull, 0x0008ull },
+[PNV_MAP_ICP]   = { 0x00038000ull, 0x0010ull },
+[PNV_MAP_PSIHB] = { 0x0003fffe8000ull, 0x0010ull },
+[PNV_MAP_PSIHB_FSP] = { 0x0003ffe0ull, 0x0001ull },
+};
+
+static uint64_t pnv_chip_map_base_p8(const PnvChip *chip, PnvPhysMapType type)
+{
+PnvChipClass *pcc = PNV_CHIP_GET_CLASS(chip);
+const PnvPhysMapEntry *m

Re: [Qemu-devel] [PATCH v2 10/16] qemu-iotests: Rewrite 206 for blockdev-create job

2018-05-30 Thread Max Reitz
On 2018-05-29 22:39, Kevin Wolf wrote:
> This rewrites the test case 206 to work with the new x-blockdev-create
> job rather than the old synchronous version of the command.
> 
> All of the test cases stay the same as before, but in order to be able
> to implement proper job handling, the test case is rewritten in Python.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  tests/qemu-iotests/206 | 680 ++---
>  tests/qemu-iotests/206.out | 253 ++---
>  tests/qemu-iotests/group   |   2 +-
>  3 files changed, 414 insertions(+), 521 deletions(-)

Reviewed-by: Max Reitz 





Re: [Qemu-devel] [PATCH 2/5] hw/i386: Rename 2.13 machine types to 3.0

2018-05-30 Thread Igor Mammedov
On Tue, 22 May 2018 11:39:57 +0100
Peter Maydell  wrote:

> Rename the 2.13 machine types to match what we're going to
> use as our next release number.
> 
> Signed-off-by: Peter Maydell 
> ---
[...]
The q35 hunk of this patch, for no apparent reason, changes the address
of the NVDIMM DSM page allocated by SeaBIOS.

@ -5,13 +5,13 @@
  * 
  * Disassembling to symbolic ASL+ operators
  *
- * Disassembly of tests/acpi-test-data/q35/SSDT.dimmpxm, Wed May 30 11:20:51 2018
+ * Disassembly of /tmp/aml-3XMAJZ, Wed May 30 11:20:51 2018
  *
  * Original Table Header:
  * Signature"SSDT"
  * Length   0x02AD (685)
  * Revision 0x01
- * Checksum 0x50
+ * Checksum 0x40
  * OEM ID   "BOCHS "
  * OEM Table ID "NVDIMM"
  * OEM Revision 0x0001 (1)
@@ -183,6 +183,6 @@ DefinitionBlock ("", "SSDT", 1, "BOCHS ", "NVDIMM", 0x0001)
 }
 }
 
-Name (MEMA, 0x07FFE000)
+Name (MEMA, 0x07FFF000)
 }

As far as I can see, it should be safe wrt NVDIMMs,
but the question is: what in this commit forced SeaBIOS
to change the allocated address?

Offending commit aa78a16d86:
Testcase to reproduce:
 QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 tests/bios-tables-test

CLI to reproduce manually:
x86_64-softmmu/qemu-system-x86_64 -M q35 -machine nvdimm=on -smp 4,sockets=4 \
  -m 128M,slots=3,maxmem=1G \
  -numa node,mem=32M,nodeid=0 -numa node,mem=32M,nodeid=1 \
  -numa node,mem=32M,nodeid=2 -numa node,mem=32M,nodeid=3 \
  -numa cpu,node-id=0,socket-id=0 -numa cpu,node-id=1,socket-id=1 \
  -numa cpu,node-id=2,socket-id=2 -numa cpu,node-id=3,socket-id=3 \
  -object memory-backend-ram,id=ram0,size=128M \
  -object memory-backend-ram,id=nvm0,size=128M \
  -device pc-dimm,id=dimm0,memdev=ram0,node=1 \
  -device nvdimm,id=dimm1,memdev=nvm0,node=2

> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index 2372457c6a..83d6d75efa 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -308,18 +308,18 @@ static void pc_q35_machine_options(MachineClass *m)
>  m->max_cpus = 288;
>  }
>  
> -static void pc_q35_2_13_machine_options(MachineClass *m)
> +static void pc_q35_3_0_machine_options(MachineClass *m)
>  {
>  pc_q35_machine_options(m);
>  m->alias = "q35";
>  }
>  
> -DEFINE_Q35_MACHINE(v2_13, "pc-q35-2.13", NULL,
> -pc_q35_2_13_machine_options);
> +DEFINE_Q35_MACHINE(v3_0, "pc-q35-3.0", NULL,
> +pc_q35_3_0_machine_options);
>  
>  static void pc_q35_2_12_machine_options(MachineClass *m)
>  {
> -pc_q35_2_13_machine_options(m);
> +pc_q35_3_0_machine_options(m);
>  m->alias = NULL;
>  SET_MACHINE_COMPAT(m, PC_COMPAT_2_12);
>  }




Re: [Qemu-devel] [PATCH v4 09/21] target: Do not include "exec/exec-all.h" if it is not necessary

2018-05-30 Thread Cornelia Huck
On Mon, 28 May 2018 20:27:07 -0300
Philippe Mathieu-Daudé  wrote:

> Code change produced with:
> $ git grep '#include "exec/exec-all.h"' | \
>   cut -d: -f-1 | \
>   xargs egrep -L "(cpu_address_space_init|cpu_loop_|tlb_|tb_|GETPC|singlestep|TranslationBlock)" | \
>   xargs sed -i.bak '/#include "exec\/exec-all.h"/d'
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  bsd-user/qemu.h| 1 -
>  target/arm/arm_ldst.h  | 1 -
>  hw/i386/kvmvapic.c | 1 -
>  target/arm/arm-powerctl.c  | 1 -
>  target/arm/crypto_helper.c | 1 -
>  target/arm/iwmmxt_helper.c | 1 -
>  target/arm/neon_helper.c   | 1 -
>  target/arm/psci.c  | 1 -
>  target/arm/vec_helper.c| 1 -
>  target/cris/cpu.c  | 1 -
>  target/hppa/helper.c   | 1 -
>  target/hppa/int_helper.c   | 1 -
>  target/i386/hax-all.c  | 1 -
>  target/i386/hax-mem.c  | 1 -
>  target/i386/hax-windows.c  | 1 -
>  target/i386/hvf/hvf.c  | 1 -
>  target/i386/hvf/x86_task.c | 1 -
>  target/i386/whpx-all.c | 1 -
>  target/lm32/cpu.c  | 1 -
>  target/m68k/cpu.c  | 1 -
>  target/moxie/cpu.c | 1 -
>  target/moxie/mmu.c | 1 -
>  target/openrisc/cpu.c  | 1 -
>  target/ppc/int_helper.c| 1 -
>  target/s390x/cpu.c | 1 -
>  target/s390x/diag.c| 1 -
>  target/s390x/helper.c  | 1 -
>  target/tilegx/cpu.c| 1 -
>  target/xtensa/core-dc232b.c| 1 -
>  target/xtensa/core-dc233c.c| 1 -
>  target/xtensa/core-de212.c | 1 -
>  target/xtensa/core-fsf.c   | 1 -
>  target/xtensa/core-sample_controller.c | 1 -
>  target/xtensa/cpu.c| 1 -
>  tcg/tcg-op-vec.c   | 1 -
>  target/xtensa/import_core.sh   | 1 -
>  36 files changed, 36 deletions(-)
> 

However you arrived at it, this looks sane, so

Acked-by: Cornelia Huck 



Re: [Qemu-devel] [PATCH v2] monitor: report entirety of hmp command on error

2018-05-30 Thread Dr. David Alan Gilbert
* Markus Armbruster (arm...@redhat.com) wrote:
> David, looks like your turf.

Yep, I've got it on my list to take.

Dave

> Collin Walling  writes:
> 
> > When a user incorrectly provides an hmp command, an error response will be
> > printed that prompts the user to try "help ". However, when
> > the command contains multiple parts e.g. "info uuid xyz", only the last
> > whitespace delimited string will be reported (in this example "info" will
> > be dropped and the message will read "Try "help uuid" for more information",
> > which is incorrect).
> >
> > Let's correct this by capturing the entirety of the command from the command
> > line -- excluding any extraneous characters.
> >
> > Reported-by: Mikhail Fokin 
> > Signed-off-by: Collin Walling 
> > ---
> >  monitor.c | 8 ++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/monitor.c b/monitor.c
> > index 39f8ee1..38736b3 100644
> > --- a/monitor.c
> > +++ b/monitor.c
> > @@ -3371,6 +3371,7 @@ static void handle_hmp_command(Monitor *mon, const char *cmdline)
> >  {
> >  QDict *qdict;
> >  const mon_cmd_t *cmd;
> > +const char *cmd_start = cmdline;
> >  
> >  trace_handle_hmp_command(mon, cmdline);
> >  
> > @@ -3381,8 +3382,11 @@ static void handle_hmp_command(Monitor *mon, const char *cmdline)
> >  
> >  qdict = monitor_parse_arguments(mon, &cmdline, cmd);
> >  if (!qdict) {
> > -monitor_printf(mon, "Try \"help %s\" for more information\n",
> > -   cmd->name);
> > +while (cmdline > cmd_start && qemu_isspace(cmdline[-1])) {
> > +cmdline--;
> > +}
> > +monitor_printf(mon, "Try \"help %.*s\" for more information\n",
> > +   (int)(cmdline - cmd_start), cmd_start);
> >  return;
> >  }
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v2 11/16] qemu-iotests: Rewrite 207 for blockdev-create job

2018-05-30 Thread Max Reitz
On 2018-05-29 22:39, Kevin Wolf wrote:
> This rewrites the test case 207 to work with the new x-blockdev-create
> job rather than the old synchronous version of the command.
> 
> Most of the test cases stay the same as before (the exception being some
> improved 'size' options that allow distinguishing which command created
> the image), but in order to be able to implement proper job handling,
> the test case is rewritten in Python.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  tests/qemu-iotests/207 | 440 -
>  tests/qemu-iotests/207.out | 107 +--
>  tests/qemu-iotests/group   |   6 +-
>  3 files changed, 257 insertions(+), 296 deletions(-)

Reviewed-by: Max Reitz 





Re: [Qemu-devel] [PATCH 2/5] hw/i386: Rename 2.13 machine types to 3.0

2018-05-30 Thread Paolo Bonzini
On 30/05/2018 12:11, Igor Mammedov wrote:
> -Name (MEMA, 0x07FFE000)
> +Name (MEMA, 0x07FFF000)
>  }
> 
> As far as I can see, it should be safe wrt NVDIMMs,
> but the question is: what in this commit forced SeaBIOS
> to change the allocated address?

Probably 2.13 is longer than 3.0 or something like that (and the planets
aligned in the right way).

Paolo



Re: [Qemu-devel] [PATCH v2] pnv: add a physical mapping array describing MMIO ranges in each chip

2018-05-30 Thread Greg Kurz
On Wed, 30 May 2018 12:07:54 +0200
Cédric Le Goater  wrote:

> Based on previous work done in skiboot, the physical mapping array
> helps in calculating the MMIO ranges of each controller depending on
> the chip id and the chip type. This will be particularly useful for
> the P9 models, which use the XSCOM bus less and rely more on MMIOs.
> 
> A link on the chip is now necessary to calculate MMIO BARs and
> sizes. This is why such a link is introduced in the PSIHB model.
> 
> Signed-off-by: Cédric Le Goater 
> Reviewed-by: Philippe Mathieu-Daudé 
> ---
> 
>  Changes since v1:
> 
>  - removed PNV_MAP_MAX, which was unused
>  - introduced a chip class handler to calculate the base address of a
>controller as suggested by Greg.
>  - fix error reporting in pnv_psi_realize()
> 
>  include/hw/ppc/pnv.h | 51 ++
>  hw/ppc/pnv.c | 53 
>  hw/ppc/pnv_psi.c | 15 ---
>  hw/ppc/pnv_xscom.c   |  8 
>  4 files changed, 96 insertions(+), 31 deletions(-)
> 
> diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
> index 90759240a7b1..ffa4a0899705 100644
> --- a/include/hw/ppc/pnv.h
> +++ b/include/hw/ppc/pnv.h
> @@ -53,7 +53,6 @@ typedef struct PnvChip {
>  uint64_t cores_mask;
>  void *cores;
>  
> -hwaddr   xscom_base;
>  MemoryRegion xscom_mmio;
>  MemoryRegion xscom;
>  AddressSpace xscom_as;
> @@ -64,6 +63,18 @@ typedef struct PnvChip {
>  PnvOCC   occ;
>  } PnvChip;
>  
> +typedef enum PnvPhysMapType {
> +PNV_MAP_XSCOM,
> +PNV_MAP_ICP,
> +PNV_MAP_PSIHB,
> +PNV_MAP_PSIHB_FSP,
> +} PnvPhysMapType;
> +
> +typedef struct PnvPhysMapEntry {
> +uint64_t base;
> +uint64_t size;
> +} PnvPhysMapEntry;
> +
>  typedef struct PnvChipClass {
>  /*< private >*/
>  SysBusDeviceClass parent_class;
> @@ -73,9 +84,10 @@ typedef struct PnvChipClass {
>  uint64_t chip_cfam_id;
>  uint64_t cores_mask;
>  
> -hwaddr   xscom_base;
> +const PnvPhysMapEntry *phys_map;
>  
>  uint32_t (*core_pir)(PnvChip *chip, uint32_t core_id);
> +uint64_t (*map_base)(const PnvChip *chip, PnvPhysMapType type);
>  } PnvChipClass;
>  
>  #define PNV_CHIP_TYPE_SUFFIX "-" TYPE_PNV_CHIP
> @@ -159,9 +171,21 @@ void pnv_bmc_powerdown(IPMIBmc *bmc);
>  /*
>   * POWER8 MMIO base addresses
>   */
> -#define PNV_XSCOM_SIZE0x8ull
> -#define PNV_XSCOM_BASE(chip)\
> -(chip->xscom_base + ((uint64_t)(chip)->chip_id) * PNV_XSCOM_SIZE)
> +static inline uint64_t pnv_map_size(const PnvChip *chip, PnvPhysMapType type)
> +{
> +PnvChipClass *pcc = PNV_CHIP_GET_CLASS(chip);
> +const PnvPhysMapEntry *map = &pcc->phys_map[type];
> +
> +return map->size;
> +}
> +
> +static inline uint64_t pnv_map_base(const PnvChip *chip, PnvPhysMapType type)
> +{
> +return PNV_CHIP_GET_CLASS(chip)->map_base(chip, type);
> +}
> +
> +#define PNV_XSCOM_SIZE(chip) pnv_map_size(chip, PNV_MAP_XSCOM)
> +#define PNV_XSCOM_BASE(chip) pnv_map_base(chip, PNV_MAP_XSCOM)
>  
>  /*
>   * XSCOM 0x20109CA defines the ICP BAR:
> @@ -177,18 +201,13 @@ void pnv_bmc_powerdown(IPMIBmc *bmc);
>   *  0xe022 -> 0x00038080
>   *  0xe026 -> 0x00038090
>   */
> -#define PNV_ICP_SIZE 0x0010ull
> -#define PNV_ICP_BASE(chip)  \
> -(0x00038000ull + (uint64_t) PNV_CHIP_INDEX(chip) * PNV_ICP_SIZE)
> -
> +#define PNV_ICP_SIZE(chip)   pnv_map_size(chip, PNV_MAP_ICP)
> +#define PNV_ICP_BASE(chip)   pnv_map_base(chip, PNV_MAP_ICP)
>  
> -#define PNV_PSIHB_SIZE   0x0010ull
> -#define PNV_PSIHB_BASE(chip) \
> -(0x0003fffe8000ull + (uint64_t)PNV_CHIP_INDEX(chip) * PNV_PSIHB_SIZE)
> +#define PNV_PSIHB_SIZE(chip) pnv_map_size(chip, PNV_MAP_PSIHB)
> +#define PNV_PSIHB_BASE(chip) pnv_map_base(chip, PNV_MAP_PSIHB)
>  
> -#define PNV_PSIHB_FSP_SIZE   0x0001ull
> -#define PNV_PSIHB_FSP_BASE(chip) \
> -(0x0003ffe0ull + (uint64_t)PNV_CHIP_INDEX(chip) * \
> - PNV_PSIHB_FSP_SIZE)
> +#define PNV_PSIHB_FSP_SIZE(chip) pnv_map_size(chip, PNV_MAP_PSIHB_FSP)
> +#define PNV_PSIHB_FSP_BASE(chip) pnv_map_base(chip, PNV_MAP_PSIHB_FSP)
>  
>  #endif /* _PPC_PNV_H */
> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> index 031488131629..77caaea64b2f 100644
> --- a/hw/ppc/pnv.c
> +++ b/hw/ppc/pnv.c
> @@ -712,6 +712,24 @@ static uint32_t pnv_chip_core_pir_p9(PnvChip *chip, uint32_t core_id)
>   */
>  #define POWER9_CORE_MASK   (0xffull)
>  
> +/*
> + * POWER8 MMIOs
> + */
> +static const PnvPhysMapEntry pnv_chip_power8_phys_map[] = {
> +[PNV_MAP_XSCOM] = { 0x0003fc00ull, 0x0008ull },
> +[PNV_MAP_ICP]   = { 0x00038000ull, 0x0010

Re: [Qemu-devel] [PATCH qemu v2 1/2] memory/hmp: Print owners/parents in "info mtree"

2018-05-30 Thread Paolo Bonzini
On 30/05/2018 06:57, Alexey Kardashevskiy wrote:
> hw/intc/apic_common.c|489| object_property_add(obj, "id", "uint32",
> hw/ppc/spapr_drc.c|557| object_property_add_uint32_ptr(obj, "id", &drc->id,
> NULL);
> 
> This does not look like "remove the "id" property altogether" :) Does this
> mean we still rather want to print QOM's "id"? spapr_drc does not own MRs
> and APIC seems not to either.

No, those properties are specific to some devices (and in fact they are
integers rather than strings).  The id property that mirrors the path
component is gone.

Paolo



[Qemu-devel] "socket" host network backend: suggested improvements and fixes

2018-05-30 Thread Artem Pisarenko
Hi all,

I'm working on integrating QEMU networking into a simulation environment and
found the socket backend very convenient: it's simple, easy to use (i.e. no
intermediate pieces are required, such as a tap/tun adapter, a VDE switch, etc.)
and transparent to the host environment (i.e. it doesn't pollute the system
with a bunch of virtual devices, etc.).

However, it has some problems, which are closely related to each other. I've
investigated the source code and played with it a little, but I'm not ready
to submit a complete patch. So, here are my thoughts...

1. Internal protocol (only QEMU instances can connect their network devices
to each other). I suggest making it possible to plug in external software,
i.e. freezing the communication protocol in its current state and documenting
it in the docs/interop/ directory.

2. Transport options are wrongly documented. Section "2.3.6 Network options"
lists the "-netdev socket,..." entries. The basic understanding of how it
works that it gives is very different from the actual behaviour.
 2.1. It has two entries: listen/connect (TCP connection) and mcast
(multicast UDP), but 'qemu --help' outputs an additional one, udp (UDP
tunnel), which is undocumented but appears to work.
 2.2. Each entry has an fd=h parameter, which looks like an optional
parameter, but code analysis (net/socket.c) shows that it is in fact a
separate transport option, mutually exclusive with the listed ones. It is
used as follows: the user creates/opens (almost) whatever custom socket/file
they want, connects it to the other endpoint and passes the file descriptor
(handle) to QEMU, which just does recv/send over it and nothing more.
 2.3. As a consequence, if you try to invoke any transport/variant option
together with "fd=", you'll get an error: "exactly one of listen=, connect=,
mcast= or udp= is required". And again, the error message is incomplete: it
omits the "fd=" option.

3. The "fd=" transport doesn't work with SOCK_DGRAM type sockets. This is due
to an implementation error in net/socket.c. Function
net_socket_receive_dgram() uses a sendto() call with the s->dgram_dst value,
which is undefined in this case (and, of course, cannot be defined).
The net_socket_fd_init() execution branch is smart enough to detect the
type of the socket passed with "fd=", but its "connected" state is forgotten
afterwards. Suggested fix: replace sendto() with send(), which correctly
assumes an already connected socket, and add the corresponding connect()
calls to the "mcast=" and "udp=" init sequences.

(For those who are interested: I currently have working network
communication with unmodified qemu 2.12.0 on Linux using UNIX domain
sockets created by socketpair(AF_LOCAL, SOCK_STREAM, ...). One of them is
passed to the spawned child qemu process via -netdev socket,fd=... and the
other one is used in the parent application process to send/receive packets.
The protocol used by qemu is simple and implements only the data plane: it
just transfers raw Ethernet frames in binary form. For datagram-type sockets
this is straightforward; for stream-type sockets each frame is prepended with
its uint32 length in network byte order, without any delimiters or escaping.)
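
The stream-socket framing described above can be sketched as follows. This
is a minimal illustration based only on that description (a uint32 length in
network byte order followed by the raw frame); the helper names are
hypothetical, and the demo talks to itself over a socketpair rather than to
a QEMU process.

```python
import socket
import struct


def send_frame(sock, frame):
    """Send one raw Ethernet frame, prefixed with its length as a big-endian uint32."""
    sock.sendall(struct.pack("!I", len(frame)) + frame)


def recv_exact(sock, count):
    """Read exactly 'count' bytes from a stream socket."""
    buf = bytearray()
    while len(buf) < count:
        chunk = sock.recv(count - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the socket mid-frame")
        buf.extend(chunk)
    return bytes(buf)


def recv_frame(sock):
    """Receive one length-prefixed frame and return its payload."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def demo():
    """Round-trip two frames over a local socketpair; returns True on success."""
    ours, theirs = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        frames = [
            # Broadcast destination, made-up source MAC, ARP ethertype, padding.
            b"\xff" * 6 + b"\x52\x54\x00\x12\x34\x56" + b"\x08\x06" + b"\x00" * 46,
            b"\x01\x02\x03",
        ]
        for frame in frames:
            send_frame(ours, frame)
        return [recv_frame(theirs) for _ in frames] == frames
    finally:
        ours.close()
        theirs.close()
```

In the setup described above, one end of the socketpair would be inherited
by the child qemu process and named in -netdev socket,fd=..., while the
parent keeps the other end and uses helpers like these on it.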
-- 

Best regards,
  Artem Pisarenko


Re: [Qemu-devel] [PATCH v7 00/11] enable numa configuration before machine_init() from QMP

2018-05-30 Thread Igor Mammedov
Eduardo,

I've rebased the series on top of current master;
the only change in several patches was s/2.13/3.0/,
otherwise there weren't any conflicts.
You can find the rebased version at

https://github.com/imammedo/qemu.git qmp_preconfig_v9



Re: [Qemu-devel] [PATCH 2/5] hw/i386: Rename 2.13 machine types to 3.0

2018-05-30 Thread Igor Mammedov
On Wed, 30 May 2018 12:19:59 +0200
Paolo Bonzini  wrote:

> On 30/05/2018 12:11, Igor Mammedov wrote:
> > -Name (MEMA, 0x07FFE000)
> > +Name (MEMA, 0x07FFF000)
> >  }
Michael, could you update ACPI test blobs in your next pull request please?


> > As far as I can see, it should be safe wrt NVDIMMs,
> > but the question is: what in this commit forced SeaBIOS
> > to change the allocated address?
> 
> Probably 2.13 is longer than 3.0 or something like that
Looks like it's the other way around (2.13 is shorter than 3.0),
since the address went up.

> (and the planets aligned in the right way).
probably not the case considering that warning reproduces
the same regardless of day and time changes. :)

> 
> Paolo




Re: [Qemu-devel] [PATCH v3] sandbox: disable -sandbox if CONFIG_SECCOMP undefined

2018-05-30 Thread Eduardo Otubo
On 29/05/2018 - 18:05:25, Yi Min Zhao wrote:
> 
> 
> On 2018/5/29 5:37 PM, Paolo Bonzini wrote:
> > On 29/05/2018 09:31, Yi Min Zhao wrote:
> > > If CONFIG_SECCOMP is undefined, the option 'elevateprivileges' remains
> > > compiled. This would make libvirt set the corresponding capability and
> > > then trigger failure during guest startup. This patch moves the code
> > > regarding seccomp command line options to qemu-seccomp.c file and
> > > wraps qemu_opts_foreach finding sandbox option with CONFIG_SECCOMP.
> > > Because parse_sandbox() is moved into qemu-seccomp.c file, change
> > > seccomp_start() to static function.
> > > 
> > > Signed-off-by: Yi Min Zhao 
> > I had to squash this in:
> > 
> > diff --git a/vl.c b/vl.c
> > index 1140feb227..66c17ff8f8 100644
> > --- a/vl.c
> > +++ b/vl.c
> > @@ -3842,11 +3842,16 @@ int main(int argc, char **argv, char **envp)
> >   qtest_log = optarg;
> >   break;
> >   case QEMU_OPTION_sandbox:
> > +#ifndef CONFIG_SECCOMP
> One question, I guess you want to use #ifdef ?

Yep, I guess he meant #ifdef.

Can you send a v4 with a cleaned up version? Also fixing a typo on the text
(elevateDprivileges). 

Thanks for the contribution.

> >   opts = qemu_opts_parse_noisily(qemu_find_opts("sandbox"),
> >  optarg, true);
> >   if (!opts) {
> >   exit(1);
> >   }
> > +#else
> > +error_report("-sandbox support is not enabled in this QEMU binary");
> > +exit(1);
> > +#endif
> >   break;
> >   case QEMU_OPTION_add_fd:
> >   #ifndef _WIN32
> > 
> > 
> > Otherwise "-sandbox" will crash with a NULL pointer dereference in a binary
> > without seccomp support.  Otherwise looks great, thanks!
> > 
> > Paolo
> > 
> > > ---
> > > 1. Problem Description
> > > ==
> > > If QEMU is built without seccomp support, 'elevateprivileges' remains compiled.
> > > This option of sandbox is treated as an indication for seccomp blacklist support
> > > in libvirt. This behavior is introduced by the libvirt commits 31ca6a5 and
> > > 3527f9d. It would make libvirt build wrong QEMU cmdline, and then the guest
> > > startup would fail.
> > > 
> > > 2. Libvirt Log
> > > ==
> > > qemu-system-s390x: -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,\
> > > resourcecontrol=deny: seccomp support is disabled
> > > 
> > > 3. Fixup
> > > 
> > > Move the code related to sandbox to qemu-seccomp.c file and wrap them with
> > > CONFIG_SECCOMP. So compile the code related to sandbox only when
> > > CONFIG_SECCOMP is defined.
> > > ---
> > >   include/sysemu/seccomp.h |   3 +-
> > >   qemu-seccomp.c   | 121 ++-
> > >   vl.c | 118 +
> > >   3 files changed, 124 insertions(+), 118 deletions(-)
> > > 
> > > diff --git a/include/sysemu/seccomp.h b/include/sysemu/seccomp.h
> > > index 9b092aa23f..fe859894f6 100644
> > > --- a/include/sysemu/seccomp.h
> > > +++ b/include/sysemu/seccomp.h
> > > @@ -21,5 +21,6 @@
> > >   #define QEMU_SECCOMP_SET_SPAWN   (1 << 3)
> > >   #define QEMU_SECCOMP_SET_RESOURCECTL (1 << 4)
> > > -int seccomp_start(uint32_t seccomp_opts);
> > > +int parse_sandbox(void *opaque, QemuOpts *opts, Error **errp);
> > > +
> > >   #endif
> > > diff --git a/qemu-seccomp.c b/qemu-seccomp.c
> > > index b770a77d33..148e4c6f24 100644
> > > --- a/qemu-seccomp.c
> > > +++ b/qemu-seccomp.c
> > > @@ -13,6 +13,11 @@
> > >* GNU GPL, version 2 or (at your option) any later version.
> > >*/
> > >   #include "qemu/osdep.h"
> > > +#include "qemu/config-file.h"
> > > +#include "qemu/option.h"
> > > +#include "qemu/module.h"
> > > +#include "qemu/error-report.h"
> > > +#include 
> > >   #include 
> > >   #include "sysemu/seccomp.h"
> > > @@ -96,7 +101,7 @@ static const struct QemuSeccompSyscall blacklist[] = {
> > >   };
> > > -int seccomp_start(uint32_t seccomp_opts)
> > > +static int seccomp_start(uint32_t seccomp_opts)
> > >   {
> > >   int rc = 0;
> > >   unsigned int i = 0;
> > > @@ -125,3 +130,117 @@ int seccomp_start(uint32_t seccomp_opts)
> > >   seccomp_release(ctx);
> > >   return rc;
> > >   }
> > > +
> > > +#ifdef CONFIG_SECCOMP
> > > +int parse_sandbox(void *opaque, QemuOpts *opts, Error **errp)
> > > +{
> > > +if (qemu_opt_get_bool(opts, "enable", false)) {
> > > +uint32_t seccomp_opts = QEMU_SECCOMP_SET_DEFAULT
> > > +| QEMU_SECCOMP_SET_OBSOLETE;
> > > +const char *value = NULL;
> > > +
> > > +value = qemu_opt_get(opts, "obsolete");
> > > +if (value) {
> > > +if (g_str_equal(value, "allow")) {
> > > +seccomp_opts &= ~QEMU_SECCOMP_SET_OBSOLETE;
> > > +} else if (g_str_equal(

[Qemu-devel] [PATCH v1 4/8] docker: update Travis docker image

2018-05-30 Thread Alex Bennée
This is still poorly documented by Travis but according to:

  https://docs.travis-ci.com/user/common-build-problems/#Running-a-Container-Based-Docker-Image-Locally

their reference images are now hosted on Docker Hub. So we update the
FROM line to refer to the new default image. We also need a few
additional tweaks:

  - re-enable deb-src lines for our build-dep install
  - add explicit PATH definition for tools
  - force the build USER to be Travis
  - add clang to FEATURES for our test-clang machinery

Signed-off-by: Alex Bennée 
---
 tests/docker/dockerfiles/travis.docker | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/tests/docker/dockerfiles/travis.docker b/tests/docker/dockerfiles/travis.docker
index 605b6e429b..6e90f033d5 100644
--- a/tests/docker/dockerfiles/travis.docker
+++ b/tests/docker/dockerfiles/travis.docker
@@ -1,8 +1,11 @@
-FROM quay.io/travisci/travis-ruby
+FROM travisci/ci-garnet:packer-1512502276-986baf0
 ENV DEBIAN_FRONTEND noninteractive
 ENV LANG en_US.UTF-8
 ENV LC_ALL en_US.UTF-8
+RUN cat /etc/apt/sources.list | sed "s/# deb-src/deb-src/" >> /etc/apt/sources.list
 RUN apt-get update
 RUN apt-get -y build-dep qemu
 RUN apt-get -y install device-tree-compiler python2.7 python-yaml dh-autoreconf gdb strace lsof net-tools
-ENV FEATURES pyyaml
+ENV PATH /usr/local/phantomjs/bin:/usr/local/phantomjs:/usr/local/neo4j-3.2.7/bin:/usr/local/maven-3.5.2/bin:/usr/local/cmake-3.9.2/bin:/usr/local/clang-5.0.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
+ENV FEATURES clang pyyaml
+USER travis
-- 
2.17.0




[Qemu-devel] [PATCH v1 7/8] .travis.yml: update GCC sanitizer build to GCC 7

2018-05-30 Thread Alex Bennée
GCC has moved on and so should we. We also enable apt update to ensure
we get the latest build from the toolchain PPA.

Signed-off-by: Alex Bennée 
---
 .travis.yml | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index e99af6f357..ecc4367036 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -111,13 +111,14 @@ matrix:
 # Using newer GCC with sanitizers
 - addons:
 apt:
+  update: true
   sources:
 # PPAs for newer toolchains
 - ubuntu-toolchain-r-test
   packages:
 # Extra toolchains
-- gcc-5
-- g++-5
+- gcc-7
+- g++-7
 # Build dependencies
 - libaio-dev
 - libattr1-dev
@@ -146,8 +147,8 @@ matrix:
   language: generic
   compiler: none
   env:
-- COMPILER_NAME=gcc CXX=g++-5 CC=gcc-5
-- CONFIG="--cc=gcc-5 --cxx=g++-5 --disable-pie --disable-linux-user"
+- COMPILER_NAME=gcc CXX=g++-7 CC=gcc-7
+- CONFIG="--cc=gcc-7 --cxx=g++-7 --disable-pie --disable-linux-user"
 - TEST_CMD=""
   before_script:
 - ./configure ${CONFIG} --extra-cflags="-g3 -O0 -fsanitize=thread 
-fuse-ld=gold" || cat config.log
-- 
2.17.0




[Qemu-devel] [PATCH v1 1/8] .travis.yml: disable linux-user build for gcov

2018-05-30 Thread Alex Bennée
Currently the default testing doesn't exercise the linux-user builds
so there is no point spending time building them. We may want to
enable a separate gcov build once linux-user testing is re-enabled
although it's likely to report very low coverage.

Signed-off-by: Alex Bennée 
---
 .travis.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.travis.yml b/.travis.yml
index c1e99237b2..aa83e9aed7 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -74,7 +74,7 @@ matrix:
 - env: CONFIG=""
   compiler: clang
 # gprof/gcov are GCC features
-- env: CONFIG="--enable-gprof --enable-gcov --disable-pie"
+- env: CONFIG="--enable-gprof --enable-gcov --disable-pie --disable-linux-user"
   compiler: gcc
 # We manually include builds which we disable "make check" for
 - env: CONFIG="--enable-debug --enable-tcg-interpreter"
-- 
2.17.0




[Qemu-devel] [PATCH v1 0/8] Travis stability and a few docker patches

2018-05-30 Thread Alex Bennée
Hi,

Again the final patch won't make it into a pull-request but I'm just
keeping it around to keep track of failures. So far my numerous
re-builds have been mainly plain timeouts.

The alternate co-routine builds are sailing the closest to timeout
purgatory which makes me think we should limit the scope of the
testing - I'm not sure how much of the unit tests or qapi tests are
likely to exercise the co-routine code. Is it likely only the
(currently not run) block tests really exercise this area of code?

I've also updated the Travis image to be more recent. I noticed while
doing so that the Travis clang was newer than our hand crafted tools
so I took the opportunity to drop them and just concentrate on
exercising the python stuff. It might be worth just limiting the
target list for those tests to x86_64 only though?

There are also a couple of Philippe's patches which I should have
queued ages ago. If we get some quick reviews I can turn around a pull
request by the end of the week.

Alex Bennée (6):
  .travis.yml: disable linux-user build for gcov
  docker: update Travis docker image
  .travis.yml: rationalise clang testing
  .travis.yml: make current setup explicit
  .travis.yml: update GCC sanitizer build to GCC 7
  tests/Makefile: mark flakey tests (!UPSTREAM)

Philippe Mathieu-Daudé (2):
  docker: sort images list displayed by 'make docker'
  docker: do not display deprecated images in 'make docker' help

 .travis.yml| 82 +-
 tests/Makefile.include |  4 ++
 tests/docker/Makefile.include  |  5 +-
 tests/docker/dockerfiles/travis.docker |  7 ++-
 4 files changed, 28 insertions(+), 70 deletions(-)

-- 
2.17.0




[Qemu-devel] [PATCH v1 3/8] docker: do not display deprecated images in 'make docker' help

2018-05-30 Thread Alex Bennée
From: Philippe Mathieu-Daudé 

The 'debian' base image has been deprecated since commit 3e11974988d8.

Signed-off-by: Philippe Mathieu-Daudé 
Signed-off-by: Alex Bennée 
---
 tests/docker/Makefile.include | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include
index 50cd51a54e..31f21a43f5 100644
--- a/tests/docker/Makefile.include
+++ b/tests/docker/Makefile.include
@@ -4,7 +4,8 @@
 
 DOCKER_SUFFIX := .docker
 DOCKER_FILES_DIR := $(SRC_PATH)/tests/docker/dockerfiles
-DOCKER_IMAGES := $(sort $(notdir $(basename $(wildcard $(DOCKER_FILES_DIR)/*.docker))))
+DOCKER_DEPRECATED_IMAGES := debian
+DOCKER_IMAGES := $(filter-out $(DOCKER_DEPRECATED_IMAGES),$(sort $(notdir $(basename $(wildcard $(DOCKER_FILES_DIR)/*.docker)))))
 DOCKER_TARGETS := $(patsubst %,docker-image-%,$(DOCKER_IMAGES))
 # Use a global constant ccache directory to speed up repetitive builds
 DOCKER_CCACHE_DIR := $$HOME/.cache/qemu-docker-ccache
@@ -63,7 +64,7 @@ docker-image-debian-win64-cross: docker-image-debian8-mxe
 docker-image-travis: NOUSER=1
 
 # Expand all the pre-requistes for each docker image and test combination
-$(foreach i,$(DOCKER_IMAGES), \
+$(foreach i,$(DOCKER_IMAGES) $(DOCKER_DEPRECATED_IMAGES), \
$(foreach t,$(DOCKER_TESTS) $(DOCKER_TOOLS), \
$(eval .PHONY: docker-$t@$i) \
$(eval docker-$t@$i: docker-image-$i docker-run-$t@$i) \
-- 
2.17.0




[Qemu-devel] [PATCH v1 5/8] .travis.yml: rationalise clang testing

2018-05-30 Thread Alex Bennée
As Travis includes Clang 5.0 in its own build environment there is no
point manually building with older Clangs. We still need to test with
the two pythons though so we leave them as system only builds. We also
split the clang build into two as it often exceeds the 40 minute build
time limit.

Signed-off-by: Alex Bennée 
---
 .travis.yml | 67 +
 1 file changed, 6 insertions(+), 61 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index aa83e9aed7..85ee2a1edb 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -70,8 +70,10 @@ script:
   - make ${MAKEFLAGS} && ${TEST_CMD}
 matrix:
   include:
-# Test with CLang for compile portability
-- env: CONFIG=""
+# Test with Clang for compile portability (Travis uses clang-5.0)
+- env: CONFIG="--disable-system"
+  compiler: clang
+- env: CONFIG="--disable-user"
   compiler: clang
 # gprof/gcov are GCC features
 - env: CONFIG="--enable-gprof --enable-gcov --disable-pie 
--disable-linux-user"
@@ -95,70 +97,13 @@ matrix:
 - env: CONFIG=""
   os: osx
   compiler: clang
-# Plain Trusty System Build
+# Python builds
 - env: CONFIG="--disable-linux-user"
-  sudo: required
-  addons:
-  dist: trusty
-  compiler: gcc
-  before_install:
-- sudo apt-get update -qq
-- sudo apt-get build-dep -qq qemu
-- wget -O - 
http://people.linaro.org/~alex.bennee/qemu-submodule-git-seed.tar.xz | tar -xvJ
-- git submodule update --init --recursive
-# Plain Trusty Linux User Build
-- env: CONFIG="--disable-system"
-  sudo: required
-  addons:
-  dist: trusty
-  compiler: gcc
-  before_install:
-- sudo apt-get update -qq
-- sudo apt-get build-dep -qq qemu
-- wget -O - 
http://people.linaro.org/~alex.bennee/qemu-submodule-git-seed.tar.xz | tar -xvJ
-- git submodule update --init --recursive
-# Trusty System build with latest stable clang & python 3.0
-- sudo: required
-  addons:
-  dist: trusty
-  language: generic
-  compiler: none
   python:
 - "3.0"
-  env:
-- COMPILER_NAME=clang CXX=clang++-3.9 CC=clang-3.9
-- CONFIG="--disable-linux-user --cc=clang-3.9 --cxx=clang++-3.9 
--python=/usr/bin/python3"
-  before_install:
-- wget -nv -O - http://llvm.org/apt/llvm-snapshot.gpg.key | sudo 
apt-key add -
-- sudo apt-add-repository -y 'deb http://llvm.org/apt/trusty 
llvm-toolchain-trusty-3.9 main'
-- sudo apt-get update -qq
-- sudo apt-get install -qq -y clang-3.9
-- sudo apt-get build-dep -qq qemu
-- wget -O - 
http://people.linaro.org/~alex.bennee/qemu-submodule-git-seed.tar.xz | tar -xvJ
-- git submodule update --init --recursive
-  before_script:
-- ./configure ${CONFIG} || cat config.log
-# Trusty Linux User build with latest stable clang & python 3.6
-- sudo: required
-  addons:
-  dist: trusty
-  language: generic
-  compiler: none
+- env: CONFIG="--disable-linux-user"
   python:
 - "3.6"
-  env:
-- COMPILER_NAME=clang CXX=clang++-3.9 CC=clang-3.9
-- CONFIG="--disable-system --cc=clang-3.9 --cxx=clang++-3.9 
--python=/usr/bin/python3"
-  before_install:
-- wget -nv -O - http://llvm.org/apt/llvm-snapshot.gpg.key | sudo 
apt-key add -
-- sudo apt-add-repository -y 'deb http://llvm.org/apt/trusty 
llvm-toolchain-trusty-3.9 main'
-- sudo apt-get update -qq
-- sudo apt-get install -qq -y clang-3.9
-- sudo apt-get build-dep -qq qemu
-- wget -O - 
http://people.linaro.org/~alex.bennee/qemu-submodule-git-seed.tar.xz | tar -xvJ
-- git submodule update --init --recursive
-  before_script:
-- ./configure ${CONFIG} || cat config.log
 # Using newer GCC with sanitizers
 - addons:
 apt:
-- 
2.17.0




[Qemu-devel] [PATCH v1 8/8] tests/Makefile: mark flakey tests (!UPSTREAM)

2018-05-30 Thread Alex Bennée
This is a bookmarking commit to keep track of the failures I'm
currently seeing in Travis. They are currently:

1. test-aio

GTESTER tests/test-thread-pool
**
ERROR:tests/test-aio.c:501:test_timer_schedule: assertion failed: 
(aio_poll(ctx, true))
GTester: last random seed: R02S66126aca97f9606b33e5d7be7fc9b625
make: *** [check-tests/test-aio] Error 1
make: *** Waiting for unfinished jobs

Last discussion @ 20180525091724.GC14757@stefanha-x1.localdomain

2. rcutorture

[AWAITING EXAMPLE]

I suspect it is load that causes the problems but they really need to
be fixed properly.

Signed-off-by: Alex Bennée 
---
 tests/Makefile.include | 4 
 1 file changed, 4 insertions(+)

diff --git a/tests/Makefile.include b/tests/Makefile.include
index b499ba1813..8ff4ab9e27 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -76,7 +76,9 @@ gcov-files-test-coroutine-y = 
coroutine-$(CONFIG_COROUTINE_BACKEND).c
 check-unit-y += tests/test-visitor-serialization$(EXESUF)
 check-unit-y += tests/test-iov$(EXESUF)
 gcov-files-test-iov-y = util/iov.c
+# start: flakey
 check-unit-y += tests/test-aio$(EXESUF)
+# end: flakey
 gcov-files-test-aio-y = util/async.c util/qemu-timer.o
 gcov-files-test-aio-$(CONFIG_WIN32) += util/aio-win32.c
 gcov-files-test-aio-$(CONFIG_POSIX) += util/aio-posix.c
@@ -110,7 +112,9 @@ gcov-files-test-mul64-y = util/host-utils.c
 check-unit-y += tests/test-int128$(EXESUF)
 # all code tested by test-int128 is inside int128.h
 gcov-files-test-int128-y =
+# start: flakey
 check-unit-y += tests/rcutorture$(EXESUF)
+# end: flakey
 gcov-files-rcutorture-y = util/rcu.c
 check-unit-y += tests/test-rcu-list$(EXESUF)
 gcov-files-test-rcu-list-y = util/rcu.c
-- 
2.17.0




[Qemu-devel] [PATCH v1 2/8] docker: sort images list displayed by 'make docker'

2018-05-30 Thread Alex Bennée
From: Philippe Mathieu-Daudé 

We can now directly see different versions sorted consecutively.

Signed-off-by: Philippe Mathieu-Daudé 
Signed-off-by: Alex Bennée 
---
 tests/docker/Makefile.include | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include
index ef1a3e62eb..50cd51a54e 100644
--- a/tests/docker/Makefile.include
+++ b/tests/docker/Makefile.include
@@ -4,7 +4,7 @@
 
 DOCKER_SUFFIX := .docker
 DOCKER_FILES_DIR := $(SRC_PATH)/tests/docker/dockerfiles
-DOCKER_IMAGES := $(notdir $(basename $(wildcard $(DOCKER_FILES_DIR)/*.docker)))
+DOCKER_IMAGES := $(sort $(notdir $(basename $(wildcard $(DOCKER_FILES_DIR)/*.docker))))
 DOCKER_TARGETS := $(patsubst %,docker-image-%,$(DOCKER_IMAGES))
 # Use a global constant ccache directory to speed up repetitive builds
 DOCKER_CCACHE_DIR := $$HOME/.cache/qemu-docker-ccache
-- 
2.17.0




Re: [Qemu-devel] [PATCH 2/3] pc-bios/s390-ccw/net: Add support for pxelinux-style config files

2018-05-30 Thread Viktor VM Mihajlovski
On 30.05.2018 11:16, Thomas Huth wrote:
> Since it is quite cumbersome to manually create a combined kernel with
> initrd image for network booting, we now support loading via pxelinux
> configuration files, too. In these files, the kernel, initrd and command
> line parameters can be specified seperately, and the firmware then takes
> care of glueing everything together in memory after the files have been
> downloaded. See this URL for details about the config file layout:
> https://www.syslinux.org/wiki/index.php?title=PXELINUX
> 
> The user can either specify a config file directly as bootfile via DHCP
> (but in this case, the file has to start either with "default" or a "#"
> comment so we can distinguish it from binary kernels), or a folder (i.e.
> the bootfile name must end with "/") where the firmware should look for
> the typical pxelinux.cfg file names, e.g. based on MAC or IP address.
> We also support the pxelinux.cfg DHCP options 209 and 210 from RFC 5071.
> 
> Signed-off-by: Thomas Huth 
> ---
>  pc-bios/s390-ccw/netboot.mak |  7 ++--
>  pc-bios/s390-ccw/netmain.c   | 79 
> +++-
>  2 files changed, 82 insertions(+), 4 deletions(-)
[...]
> diff --git a/pc-bios/s390-ccw/netmain.c b/pc-bios/s390-ccw/netmain.c
> index 7533cf7..e84bb2b 100644
> --- a/pc-bios/s390-ccw/netmain.c
> +++ b/pc-bios/s390-ccw/netmain.c
[...]
> @@ -301,6 +363,18 @@ static int net_try_direct_tftp_load(filename_ip_t *fn_ip)
>  if (!strncmp("* ", cfgbuf, 2)) {
>  return handle_ins_cfg(fn_ip, cfgbuf, rc);
>  }
> +if (!strncasecmp("default", cfgbuf, 7) || !strncmp("# ", cfgbuf, 2)) 
> {
Minor, but I'm wondering whether this is too cautious and could rule
out valid config files. You might just unconditionally call
pxelinux_parse_cfg and let it find out whether this is a pxelinux config
file or not.
> +/* Looks like it is a pxelinux.cfg */
> +struct pl_cfg_entry entries[MAX_PXELINUX_ENTRIES];
> +int num_ent, def_ent = 0;
> +
> +num_ent = pxelinux_parse_cfg(cfgbuf, sizeof(cfgbuf), entries,
> + MAX_PXELINUX_ENTRIES, &def_ent);
> +if (num_ent <= 0) {
> +return -1;
> +}
> +return load_kernel_with_initrd(fn_ip, &entries[def_ent]);
> +}
>  }
> 
>  /* Move kernel to right location */
> @@ -406,6 +480,9 @@ void main(void)
>  if (fnlen > 0 && fn_ip.filename[fnlen - 1] != '/') {
>  rc = net_try_direct_tftp_load(&fn_ip);
>  }
> +if (rc <= 0) {
> +rc = net_try_pxelinux_cfg(&fn_ip);
> +}
> 
>  net_release(&fn_ip);
> 


-- 
Regards,
  Viktor Mihajlovski




[Qemu-devel] [PATCH v1 6/8] .travis.yml: make current setup explicit

2018-05-30 Thread Alex Bennée
Add some commentary and make the selection of Container based Trusty
build explicit. We will need to add VM builds later when using docker.

Signed-off-by: Alex Bennée 
---
 .travis.yml | 4 
 1 file changed, 4 insertions(+)

diff --git a/.travis.yml b/.travis.yml
index 85ee2a1edb..e99af6f357 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -1,4 +1,8 @@
+# The current Travis default is a container based 14.04 Trust on EC2
+# Additional builds with specific requirements for a full VM need to
+# be added as additional matrix: entries later on
 sudo: false
+dist: trusty
 language: c
 python:
   - "2.6"
-- 
2.17.0




[Qemu-devel] [PATCH] migration/block-dirty-bitmap: fix dirty_bitmap_load

2018-05-30 Thread Vladimir Sementsov-Ogievskiy
The return code of dirty_bitmap_load_header() is obtained but not
handled. Fix this.

The bug was introduced in commit b35ebdf076d697bc
("migration: add postcopy migration of dirty bitmaps") together with
the whole function.
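
The pattern this fix applies can be sketched in isolation. The following is
only an illustration under stated assumptions, not the QEMU code: struct
stream, load_header(), load_all() and the FLAG_* values are hypothetical
stand-ins for the QEMUFile reader, dirty_bitmap_load_header(),
dirty_bitmap_load() and the DIRTY_BITMAP_MIG_FLAG_* flags.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags standing in for the DIRTY_BITMAP_MIG_FLAG_* values. */
enum { FLAG_START = 1, FLAG_EOS = 2 };

/* Hypothetical input: a list of header results to be "read" in order. */
struct stream {
    const int *headers;   /* negative value = read error, otherwise flags */
    size_t pos, len;
};

/* Reads the next header; returns 0 and sets *flags, or a negative error. */
static int load_header(struct stream *s, int *flags)
{
    if (s->pos >= s->len) {
        return -1;               /* ran out of data */
    }
    int h = s->headers[s->pos++];
    if (h < 0) {
        return h;                /* propagate the simulated error */
    }
    *flags = h;
    return 0;
}

/* Loads chunks until end-of-stream; returns 0 or the first error seen. */
static int load_all(struct stream *s, int *starts_seen)
{
    int flags = 0;
    int ret;

    assert(s != NULL && starts_seen != NULL);
    *starts_seen = 0;
    do {
        ret = load_header(s, &flags);
        if (ret < 0) {
            return ret;          /* the kind of check the patch adds */
        }
        if (flags & FLAG_START) {
            (*starts_seen)++;
        }
    } while (!(flags & FLAG_EOS));

    return 0;
}
```

Without the early return, the loop would go on to test flags that were
never filled in by a failed header read.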

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 migration/block-dirty-bitmap.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 8819aabe3a..2c541c985d 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -672,6 +672,9 @@ static int dirty_bitmap_load(QEMUFile *f, void *opaque, int version_id)
 
 do {
 ret = dirty_bitmap_load_header(f, &s);
+if (ret < 0) {
+return ret;
+}
 
 if (s.flags & DIRTY_BITMAP_MIG_FLAG_START) {
 ret = dirty_bitmap_load_start(f, &s);
-- 
2.11.1




[Qemu-devel] [RFC v3 0/8] KVM/ARM: Relax the max 123 vcpus limitation along with KVM GICv3

2018-05-30 Thread Eric Auger
Currently the max number of VCPUs usable along with the KVM GICv3
device is limited to 123. The rationale is that only a single
redistributor region was supported, and it was set to
[0x080A0000, 0x09000000] within the guest physical address space,
surrounded by the DIST and UART MMIO regions.

[1] now allows several redistributor regions to be registered.
So this series overcomes the max 123 vcpu limitation by registering
a new redistributor region located just after the VIRT_MEM RAM region.
This second redistributor region has a capacity of 512 redistributors.

The max number of supported VCPUs in non-accelerated mode is not modified.

Best Regards

Eric

Host Kernel dependencies:
[1] [PATCH v6 00/12] KVM: arm/arm64: Allow multiple GICv3 redistributor
regions
https://github.com/eauger/linux/tree/v4.17-rc2-rdist-regions-v6

This QEMU series can be found at:
https://github.com/eauger/qemu/tree/v2.12.0-rdist_regions-rfc-v3
Previous version:
https://github.com/eauger/qemu/tree/v2.12.0-rdist_regions-rfc-v2

History:
v2 -> v3:
- Added the last patch defining 3.0 machine type and setting
  max_cpus to 512
- redistributor region total count exactly matching the number
  of requested vcpus
- added missing return in arm_gic_realize
- checked redist region capacity versus #vcpus earlier in
  gicv3_init_irqs_and_mmio
- use GICV3_REDIST_SIZE
- Changed the 2nd region size to 64MB

v1 -> v2:
- Do not use KVM_MAX_VCPUS anymore
- In case the multiple redistributor region capability is not
  supported by the host kernel, the GICv3 device realize() fails
  with a hint for the end-user.
- use properties to set the redistributor region count
- sysbus_mmio_map is kept in virt and machine init done notifier
  mechanism is used with an address fixup addition.
- I have not yet extended the second redist region as Peter suggested.
  We can easily add a 3rd region later on if requested. But
  if mandated, I will fix that in the next release.


Eric Auger (8):
  linux-headers: Partial update for KVM/ARM multiple redistributor
region registration
  target/arm: Allow KVM device address overwriting
  hw/intc/arm_gicv3: Introduce redist-region-count array property
  hw/intc/arm_gicv3_kvm: Get prepared to handle multiple redist regions
  hw/arm/virt: GICv3 DT node with one or two redistributor regions
  hw/arm/virt-acpi-build: Advertise one or two GICR structures
  hw/arm/virt: Register two redistributor regions when necessary
  hw/arm/virt: Add virt-3.0 machine type supporting up to 512 vcpus

 hw/arm/virt-acpi-build.c   |  9 
 hw/arm/virt.c  | 88 --
 hw/intc/arm_gic_kvm.c  |  4 +-
 hw/intc/arm_gicv3.c| 12 +-
 hw/intc/arm_gicv3_common.c | 38 +---
 hw/intc/arm_gicv3_its_kvm.c|  2 +-
 hw/intc/arm_gicv3_kvm.c| 40 +++--
 include/hw/arm/virt.h  | 14 ++
 include/hw/intc/arm_gicv3_common.h |  8 +++-
 linux-headers/asm-arm/kvm.h|  1 +
 linux-headers/asm-arm64/kvm.h  |  1 +
 target/arm/kvm.c   | 10 -
 target/arm/kvm_arm.h   |  3 +-
 13 files changed, 200 insertions(+), 30 deletions(-)

-- 
2.5.5




[Qemu-devel] [RFC v3 6/8] hw/arm/virt-acpi-build: Advertise one or two GICR structures

2018-05-30 Thread Eric Auger
Depending on the number of smp_cpus we now register one or two
GICR structures.

Signed-off-by: Eric Auger 
---
 hw/arm/virt-acpi-build.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 92ceee9..6a4340a 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -660,6 +660,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
 
 if (vms->gic_version == 3) {
 AcpiMadtGenericTranslator *gic_its;
+int nb_redist_regions = virt_gicv3_redist_region_count(vms);
 AcpiMadtGenericRedistributor *gicr = acpi_data_push(table_data,
  sizeof *gicr);
 
@@ -668,6 +669,14 @@ build_madt(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
 gicr->base_address = cpu_to_le64(memmap[VIRT_GIC_REDIST].base);
 gicr->range_length = cpu_to_le32(memmap[VIRT_GIC_REDIST].size);
 
+if (nb_redist_regions == 2) {
+gicr = acpi_data_push(table_data, sizeof(*gicr));
+gicr->type = ACPI_APIC_GENERIC_REDISTRIBUTOR;
+gicr->length = sizeof(*gicr);
+gicr->base_address = cpu_to_le64(memmap[VIRT_GIC_REDIST2].base);
+gicr->range_length = cpu_to_le32(memmap[VIRT_GIC_REDIST2].size);
+}
+
 if (its_class_name() && !vmc->no_its) {
 gic_its = acpi_data_push(table_data, sizeof *gic_its);
 gic_its->type = ACPI_APIC_GENERIC_TRANSLATOR;
-- 
2.5.5




[Qemu-devel] [RFC v3 1/8] linux-headers: Partial update for KVM/ARM multiple redistributor region registration

2018-05-30 Thread Eric Auger
This updates KVM/ARM headers against
https://github.com/eauger/linux/tree/v4.17-rc2-rdist-regions-v6

Signed-off-by: Eric Auger 
---
 linux-headers/asm-arm/kvm.h   | 1 +
 linux-headers/asm-arm64/kvm.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/linux-headers/asm-arm/kvm.h b/linux-headers/asm-arm/kvm.h
index 4392955..81ae4e5 100644
--- a/linux-headers/asm-arm/kvm.h
+++ b/linux-headers/asm-arm/kvm.h
@@ -91,6 +91,7 @@ struct kvm_regs {
 #define KVM_VGIC_V3_ADDR_TYPE_DIST 2
 #define KVM_VGIC_V3_ADDR_TYPE_REDIST   3
 #define KVM_VGIC_ITS_ADDR_TYPE 4
+#define KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION5
 
 #define KVM_VGIC_V3_DIST_SIZE  SZ_64K
 #define KVM_VGIC_V3_REDIST_SIZE(2 * SZ_64K)
diff --git a/linux-headers/asm-arm64/kvm.h b/linux-headers/asm-arm64/kvm.h
index 4e80651..d41f39a 100644
--- a/linux-headers/asm-arm64/kvm.h
+++ b/linux-headers/asm-arm64/kvm.h
@@ -91,6 +91,7 @@ struct kvm_regs {
 #define KVM_VGIC_V3_ADDR_TYPE_DIST 2
 #define KVM_VGIC_V3_ADDR_TYPE_REDIST   3
 #define KVM_VGIC_ITS_ADDR_TYPE 4
+#define KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION5
 
 #define KVM_VGIC_V3_DIST_SIZE  SZ_64K
 #define KVM_VGIC_V3_REDIST_SIZE(2 * SZ_64K)
-- 
2.5.5




[Qemu-devel] [RFC v3 4/8] hw/intc/arm_gicv3_kvm: Get prepared to handle multiple redist regions

2018-05-30 Thread Eric Auger
Let's check whether KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION is supported.
If not, we check that the number of redist regions is equal to 1 and use
the legacy KVM_VGIC_V3_ADDR_TYPE_REDIST attribute. Otherwise we use
the new attribute and allow multiple regions to be registered with the
KVM device.

Signed-off-by: Eric Auger 
Reviewed-by: Peter Maydell 

---

v2 -> v3:
- In kvm_arm_gicv3_realize rename val into addr_ormask local variable and
  add a comment
- start the redist region registration from s->nb_redist_regions - 1
  downwards
---
 hw/intc/arm_gicv3_kvm.c | 33 ++---
 1 file changed, 30 insertions(+), 3 deletions(-)

diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
index 7e76b87..3826ff4 100644
--- a/hw/intc/arm_gicv3_kvm.c
+++ b/hw/intc/arm_gicv3_kvm.c
@@ -714,6 +714,7 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error 
**errp)
 {
 GICv3State *s = KVM_ARM_GICV3(dev);
 KVMARMGICv3Class *kgc = KVM_ARM_GICV3_GET_CLASS(s);
+bool multiple_redist_region_allowed;
 Error *local_err = NULL;
 int i;
 
@@ -750,6 +751,18 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error 
**errp)
 return;
 }
 
+multiple_redist_region_allowed =
+kvm_device_check_attr(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_ADDR,
+  KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION);
+
+if (!multiple_redist_region_allowed && s->nb_redist_regions > 1) {
+error_setg(errp, "Multiple VGICv3 redistributor regions are not "
+   "supported by this host kernel");
+error_append_hint(errp, "A maximum of %d VCPUs can be used",
+  s->redist_region_count[0]);
+return;
+}
+
 kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_NR_IRQS,
   0, &s->num_irq, true, &error_abort);
 
@@ -759,9 +772,23 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error 
**errp)
 
 kvm_arm_register_device(&s->iomem_dist, -1, KVM_DEV_ARM_VGIC_GRP_ADDR,
 KVM_VGIC_V3_ADDR_TYPE_DIST, s->dev_fd, 0);
-kvm_arm_register_device(&s->iomem_redist[0], -1,
-KVM_DEV_ARM_VGIC_GRP_ADDR,
-KVM_VGIC_V3_ADDR_TYPE_REDIST, s->dev_fd, 0);
+
+if (!multiple_redist_region_allowed) {
+kvm_arm_register_device(&s->iomem_redist[0], -1,
+KVM_DEV_ARM_VGIC_GRP_ADDR,
+KVM_VGIC_V3_ADDR_TYPE_REDIST, s->dev_fd, 0);
+} else {
+for (i = s->nb_redist_regions - 1; i >= 0; i--) {
+/* Address mask made of the rdist region index and count */
+uint64_t addr_ormask =
+i | ((uint64_t)s->redist_region_count[i] << 52);
+
+kvm_arm_register_device(&s->iomem_redist[i], -1,
+KVM_DEV_ARM_VGIC_GRP_ADDR,
+KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION,
+s->dev_fd, addr_ormask);
+}
+}
 
 if (kvm_has_gsi_routing()) {
 /* set up irq routing */
-- 
2.5.5




[Qemu-devel] [Bug 1396052] Re: migration failed when running BurnInTest in guest

2018-05-30 Thread z08687
Thanks for your reply.

I didn't start the destination VM. The VM was just in the paused state, and it crashed.
static void process_incoming_migration_co(void *opaque)
{
--
ret = qemu_loadvm_state(mis->from_src_file);   -- returns when "QEMU_VM_EOF" is received
--
mis->bh = qemu_bh_new(process_incoming_migration_bh, mis);
qemu_bh_schedule(mis->bh);
}

static void process_incoming_migration_bh(void *opaque)
{
--
bdrv_invalidate_cache_all(&local_err);  --- may yield here; an nbd write can then access the invalidated bdrv and crash
--

if (!global_state_received() ||
global_state_get_runstate() == RUN_STATE_RUNNING) {
if (autostart) {
vm_start();  -- resume vm here
} else {
runstate_set(RUN_STATE_PAUSED);
}
} else {
runstate_set(global_state_get_runstate());
}
/*
 * This must happen after any state changes since as soon as an external
 * observer sees this event they might start to prod at the VM assuming
 * it's ready to use.
 */
migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
  MIGRATION_STATUS_COMPLETED);
qemu_bh_delete(mis->bh);
migration_incoming_state_destroy();
}

You are right, the correct solution should be something like a drain around
bdrv_invalidate_cache().
My solution:
static void process_incoming_migration_co(void *opaque)
{
--
ret = qemu_loadvm_state(mis->from_src_file); 
--
wait for all nbd clients to be closed.
mis->bh = qemu_bh_new(process_incoming_migration_bh, mis);
qemu_bh_schedule(mis->bh);
}

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1396052

Title:
  migration failed when running BurnInTest in guest

Status in QEMU:
  Triaged

Bug description:
  Hi,
  I found a live migration problem and have found the reason, but I can't
  fix it myself. I really need help.
  When live-migrating a VM and its block device at the same time, the
  problem occurs intermittently.

  Steps:
  1.  Start a Windows VM and run BurnInTest (it writes dirty data to the
  block device during migration).
  2.  Migrate the VM with its block device.
  3.  A few minutes later, the destination VM is killed and the migration
  fails (intermittently).

  Reason:
  When migration starts, libvirt on the source host sends a command to qemu,
  and qemu calls the mirror_run coroutine to copy the block device data to
  the destination VM's block device. mirror_run runs in the qemu main thread.
  When this has finished (actually it is still running, because the VM may
  generate dirty data in the following steps), qemu starts migration_thread
  to migrate RAM and other devices.
  In the destination VM, qemu calls "bdrv_invalidate_cache -->
  qcow2_invalidate_cache" after the VM reads the "QEMU_VM_EOF" byte.
  qcow2_invalidate_cache calls qcow2_close; qcow2_close sets
  "s->l1_table = NULL" and then calls qcow2_cache_flush. qcow2_cache_flush
  calls
  "bdrv_flush-->bdrv_flush_co_entry-->bdrv_co_flush-->qemu_coroutine_yield",
  which returns control to the main loop. If the source VM sends block
  device dirty data to the destination VM at this time, the destination VM
  hits the following segmentation fault.
  The primary reason is that mirror_run and migration run in two threads.
  Although qemu stops the VM before writing the "QEMU_VM_EOF" byte, it still
  can't ensure that the mirror_run coroutine does not write dirty data after
  the migration thread has sent the "QEMU_VM_EOF" byte.

  
  Program received signal SIGSEGV, Segmentation fault.
  0x7f90d250db24 in get_cluster_table (bs=0x7f90d493f500, 
offset=1832189952, new_l2_table=0x7f8fbd6faa88, 
  new_l2_index=0x7f8fbd6faaa0) at block/qcow2-cluster.c:573
  573 l2_offset = s->l1_table[l1_index] & L1E_OFFSET_MASK;
  (gdb) bt
  #0  0x7f90d250db24 in get_cluster_table (bs=0x7f90d493f500, 
offset=1832189952, new_l2_table=0x7f8fbd6faa88, 
  new_l2_index=0x7f8fbd6faaa0) at block/qcow2-cluster.c:573
  #1  0x7f90d250e577 in handle_copied (bs=0x7f90d493f500, 
guest_offset=1832189952, host_offset=0x7f8fbd6fab18, 
  bytes=0x7f8fbd6fab20, m=0x7f8fbd6fabc8) at block/qcow2-cluster.c:927
  #2  0x7f90d250ef45 in qcow2_alloc_cluster_offset (bs=0x7f90d493f500, 
offset=1832189952, num=0x7f8fbd6fabfc, 
  host_offset=0x7f8fbd6fabc0, m=0x7f8fbd6fabc8) at 
block/qcow2-cluster.c:1269
  #3  0x7f90d250445f in qcow2_co_writev (bs=0x7f90d493f500, 
sector_num=3578496, remaining_sectors=2040, 
  qiov=0x7f8fbd6fae90) at block/qcow2.c:1171
  #4  0x7f90d24d4764 in bdrv_aligned_pwritev (bs=0x7f90d493f500, 
req=0x7f8fbd6facd0, offset=1832189952, bytes=1044480, 
  qiov=0x7f8fbd6fae90, flags=0) at block.c:3321
  #5  0x7f90d24d4d21 in bdrv_co_do_pwritev (bs=0x7f90d493f500, 
offset=1832189952, bytes=1044480, qiov=0x7f8fbd6fae90, 
  flags=0) at block.c

[Qemu-devel] [RFC v3 7/8] hw/arm/virt: Register two redistributor regions when necessary

2018-05-30 Thread Eric Auger
With a VGICv3 KVM device, if the number of vcpus exceeds the
capacity of the legacy redistributor region (123 redistributors),
we now attempt to register a second redistributor region. Up to
512 redistributors can fit in the latter, on top of the 123 allowed
by the legacy redistributor region.

Registering this second redistributor region is possible if the
host kernel supports the following VGICv3 KVM device group/attribute:
KVM_DEV_ARM_VGIC_GRP_ADDR/KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION.

In case the host kernel does not support the registration of several
redistributor regions and the requested number of vcpus exceeds the
capacity of the legacy redistributor region, the GICv3 device
initialization fails with a proper error message and qemu exits.

At the moment the max number of vcpus still is capped by the
virt machine class max_cpus.
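
As a worked example of the split described above, the sketch below
distributes a VCPU count over the two regions. It is only an illustration:
struct redist_layout and split_redists() are hypothetical names, the
capacities are the 123 and 512 quoted in this message rather than values
derived from the memory map, and it assumes the second region is used only
once the first one is full.

```c
#include <assert.h>
#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Capacities taken from the commit message, not from the memory map. */
#define REDIST0_CAPACITY 123u
#define REDIST1_CAPACITY 512u

/* Hypothetical result type: region count and redistributors per region. */
struct redist_layout {
    uint32_t nb_regions;
    uint32_t count[2];
};

/* Fill the legacy region first, then spill the remainder into region 1. */
static struct redist_layout split_redists(uint32_t smp_cpus)
{
    struct redist_layout l = { 1, { 0, 0 } };

    assert(smp_cpus >= 1);
    l.count[0] = MIN(smp_cpus, REDIST0_CAPACITY);
    if (smp_cpus > REDIST0_CAPACITY) {
        l.nb_regions = 2;
        l.count[1] = MIN(smp_cpus - l.count[0], REDIST1_CAPACITY);
    }
    return l;
}
```

With these numbers, 124 VCPUs yield 123 redistributors in the first region
and 1 in the second, and the combined capacity is 123 + 512 = 635.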

Signed-off-by: Eric Auger 

---

v2 -> v3:
- remove spare space
---
 hw/arm/virt.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 3018ec2..c00f47d 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -529,6 +529,7 @@ static void create_gic(VirtMachineState *vms, qemu_irq *pic)
 SysBusDevice *gicbusdev;
 const char *gictype;
 int type = vms->gic_version, i;
+uint32_t nb_redist_regions = 0;
 
 gictype = (type == 3) ? gicv3_class_name() : gic_class_name();
 
@@ -548,14 +549,28 @@ static void create_gic(VirtMachineState *vms, qemu_irq 
*pic)
 vms->memmap[VIRT_GIC_REDIST].size / GICV3_REDIST_SIZE;
 uint32_t redist0_count = MIN(smp_cpus, redist0_capacity);
 
-qdev_prop_set_uint32(gicdev, "len-redist-region-count", 1);
+nb_redist_regions = virt_gicv3_redist_region_count(vms);
+
+qdev_prop_set_uint32(gicdev, "len-redist-region-count",
+ nb_redist_regions);
 qdev_prop_set_uint32(gicdev, "redist-region-count[0]", redist0_count);
+
+if (nb_redist_regions == 2) {
+uint32_t redist1_capacity =
+vms->memmap[VIRT_GIC_REDIST2].size / GICV3_REDIST_SIZE;
+
+qdev_prop_set_uint32(gicdev, "redist-region-count[1]",
+MIN(smp_cpus - redist0_count, redist1_capacity));
+}
 }
 qdev_init_nofail(gicdev);
 gicbusdev = SYS_BUS_DEVICE(gicdev);
 sysbus_mmio_map(gicbusdev, 0, vms->memmap[VIRT_GIC_DIST].base);
 if (type == 3) {
 sysbus_mmio_map(gicbusdev, 1, vms->memmap[VIRT_GIC_REDIST].base);
+if (nb_redist_regions == 2) {
+sysbus_mmio_map(gicbusdev, 2, vms->memmap[VIRT_GIC_REDIST2].base);
+}
 } else {
 sysbus_mmio_map(gicbusdev, 1, vms->memmap[VIRT_GIC_CPU].base);
 }
@@ -1351,6 +1366,7 @@ static void machvirt_init(MachineState *machine)
  */
 if (vms->gic_version == 3) {
 virt_max_cpus = vms->memmap[VIRT_GIC_REDIST].size / GICV3_REDIST_SIZE;
+virt_max_cpus += vms->memmap[VIRT_GIC_REDIST2].size / GICV3_REDIST_SIZE;
 } else {
 virt_max_cpus = GIC_NCPU;
 }
-- 
2.5.5
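
The way the patch above distributes vcpus over the two redistributor regions can be sketched in isolation. This is only an illustration: `struct redist_split` and `split_redistributors` are hypothetical names that do not exist in QEMU, and the capacities are passed in instead of being derived from the memory map.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type, not part of the patch. */
struct redist_split {
    uint32_t count0;   /* redistributors placed in the legacy region */
    uint32_t count1;   /* redistributors placed in the second region */
};

static uint32_t min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/*
 * Fill the legacy region first and put whatever is left, up to its
 * capacity, into the second region. This mirrors the MIN() expressions
 * used for "redist-region-count[0]" and "redist-region-count[1]".
 */
static struct redist_split split_redistributors(uint32_t cpus,
                                                uint32_t capacity0,
                                                uint32_t capacity1)
{
    struct redist_split s;

    s.count0 = min_u32(cpus, capacity0);
    s.count1 = min_u32(cpus - s.count0, capacity1);

    /* The split never hands out more redistributors than there are vcpus. */
    assert(s.count0 + s.count1 <= cpus);
    return s;
}
```

If the vcpu count exceeds both capacities together, the sketch simply saturates both regions; in the series it is the GICv3 capacity check that rejects such a configuration.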




[Qemu-devel] [RFC v3 2/8] target/arm: Allow KVM device address overwriting

2018-05-30 Thread Eric Auger
For the KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION attribute, the attribute
data pointed to by kvm_device_attr.addr is an OR of the
redistributor region address and other fields such as the index
of the redistributor region and the number of redistributors the
region can contain.

The existing machine init done notifier framework sets the address
field to the actual address of the device and does not allow ORing
this value with other fields.

This patch extends the KVMDevice struct with a new kda_addr_ormask
member. Its value is passed at registration time and OR'ed with the
resolved address on kvm_arm_set_device_addr().
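
As a rough illustration of the idea, the constant fields can be packed into a mask at registration time and OR'ed into the address once it is resolved. The bit positions below are assumptions made for this sketch (count in the top 12 bits, index in the low 12 bits) and should be checked against the kernel documentation for the attribute; the helper names are hypothetical and are not QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field positions, for illustration only. */
#define REDIST_REGION_COUNT_SHIFT 52          /* assumption: count in bits [63:52] */
#define REDIST_REGION_INDEX_MASK  0xfffULL    /* assumption: index in bits [11:0] */

/* Build the constant part of the attribute value at registration time. */
static uint64_t redist_region_ormask(uint32_t index, uint32_t count)
{
    assert(index <= REDIST_REGION_INDEX_MASK);
    assert(count < (1u << 12));
    return ((uint64_t)count << REDIST_REGION_COUNT_SHIFT) | index;
}

/* Combine the mask with the address once the notifier has resolved it. */
static uint64_t apply_ormask(uint64_t resolved_addr, uint64_t ormask)
{
    return resolved_addr | ormask;
}
```

Callers that have nothing to OR in pass a mask of 0, which leaves the resolved address unchanged; that is what the existing GICv2, GICv3 and ITS call sites do in this patch.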

Signed-off-by: Eric Auger 
Reviewed-by: Peter Maydell 

---

v2 -> v3:
- s/addr_fixup/addr_ormask
- Added Peter's R-b
---
 hw/intc/arm_gic_kvm.c   |  4 ++--
 hw/intc/arm_gicv3_its_kvm.c |  2 +-
 hw/intc/arm_gicv3_kvm.c |  4 ++--
 target/arm/kvm.c| 10 +-
 target/arm/kvm_arm.h|  3 ++-
 5 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/hw/intc/arm_gic_kvm.c b/hw/intc/arm_gic_kvm.c
index 6f467e6..eb9664e 100644
--- a/hw/intc/arm_gic_kvm.c
+++ b/hw/intc/arm_gic_kvm.c
@@ -558,7 +558,7 @@ static void kvm_arm_gic_realize(DeviceState *dev, Error **errp)
 | KVM_VGIC_V2_ADDR_TYPE_DIST,
 KVM_DEV_ARM_VGIC_GRP_ADDR,
 KVM_VGIC_V2_ADDR_TYPE_DIST,
-s->dev_fd);
+s->dev_fd, 0);
 /* CPU interface for current core. Unlike arm_gic, we don't
  * provide the "interface for core #N" memory regions, because
  * cores with a VGIC don't have those.
@@ -568,7 +568,7 @@ static void kvm_arm_gic_realize(DeviceState *dev, Error **errp)
 | KVM_VGIC_V2_ADDR_TYPE_CPU,
 KVM_DEV_ARM_VGIC_GRP_ADDR,
 KVM_VGIC_V2_ADDR_TYPE_CPU,
-s->dev_fd);
+s->dev_fd, 0);
 
 if (kvm_has_gsi_routing()) {
 /* set up irq routing */
diff --git a/hw/intc/arm_gicv3_its_kvm.c b/hw/intc/arm_gicv3_its_kvm.c
index eea6a73..271ebe4 100644
--- a/hw/intc/arm_gicv3_its_kvm.c
+++ b/hw/intc/arm_gicv3_its_kvm.c
@@ -103,7 +103,7 @@ static void kvm_arm_its_realize(DeviceState *dev, Error **errp)
 
 /* register the base address */
 kvm_arm_register_device(&s->iomem_its_cntrl, -1, KVM_DEV_ARM_VGIC_GRP_ADDR,
-KVM_VGIC_ITS_ADDR_TYPE, s->dev_fd);
+KVM_VGIC_ITS_ADDR_TYPE, s->dev_fd, 0);
 
 gicv3_its_init_mmio(s, NULL);
 
diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
index ec37177..93ac293 100644
--- a/hw/intc/arm_gicv3_kvm.c
+++ b/hw/intc/arm_gicv3_kvm.c
@@ -754,9 +754,9 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
   KVM_DEV_ARM_VGIC_CTRL_INIT, NULL, true, &error_abort);
 
 kvm_arm_register_device(&s->iomem_dist, -1, KVM_DEV_ARM_VGIC_GRP_ADDR,
-KVM_VGIC_V3_ADDR_TYPE_DIST, s->dev_fd);
+KVM_VGIC_V3_ADDR_TYPE_DIST, s->dev_fd, 0);
 kvm_arm_register_device(&s->iomem_redist, -1, KVM_DEV_ARM_VGIC_GRP_ADDR,
-KVM_VGIC_V3_ADDR_TYPE_REDIST, s->dev_fd);
+KVM_VGIC_V3_ADDR_TYPE_REDIST, s->dev_fd, 0);
 
 if (kvm_has_gsi_routing()) {
 /* set up irq routing */
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 5141d0a..6a4324a 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -184,10 +184,15 @@ unsigned long kvm_arch_vcpu_id(CPUState *cpu)
  * We use a MemoryListener to track mapping and unmapping of
  * the regions during board creation, so the board models don't
  * need to do anything special for the KVM case.
+ *
+ * Sometimes the address must be OR'ed with some other fields
+ * (for example for KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION).
+ * @kda_addr_ormask aims at storing the value of those fields.
  */
 typedef struct KVMDevice {
 struct kvm_arm_device_addr kda;
 struct kvm_device_attr kdattr;
+uint64_t kda_addr_ormask;
 MemoryRegion *mr;
 QSLIST_ENTRY(KVMDevice) entries;
 int dev_fd;
@@ -234,6 +239,8 @@ static void kvm_arm_set_device_addr(KVMDevice *kd)
  */
 if (kd->dev_fd >= 0) {
 uint64_t addr = kd->kda.addr;
+
+addr |= kd->kda_addr_ormask;
 attr->addr = (uintptr_t)&addr;
 ret = kvm_device_ioctl(kd->dev_fd, KVM_SET_DEVICE_ATTR, attr);
 } else {
@@ -266,7 +273,7 @@ static Notifier notify = {
 };
 
 void kvm_arm_register_device(MemoryRegion *mr, uint64_t devid, uint64_t group,
- uint64_t attr, int dev_fd)
+ uint64_t attr, int dev_fd, uint64_t addr_ormask)
 {
 KVMDevice *kd;
 
@@ -286,6 +293,7 @@ void kvm_arm_register_device(MemoryRegion *mr, uint64_t devid, uint64_t group,
 kd->kdattr.group = group;
 kd->kdattr.attr = attr;
 kd->dev_fd = 

[Qemu-devel] [RFC v3 3/8] hw/intc/arm_gicv3: Introduce redist-region-count array property

2018-05-30 Thread Eric Auger
To prepare for multiple redistributor regions, we introduce
an array of uint32_t properties that stores the redistributor
count of each redistributor region.

Non-accelerated VGICv3 only supports a single redistributor region.
The capacity of all redist regions is checked against the number of
vcpus.

Machvirt is updated to set those properties, i.e. a single
redistributor region with count set to the number of vcpus
capped by 123.
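
The capacity check described above amounts to summing the per-region counts and comparing the total with the number of vcpus. A minimal standalone sketch, with hypothetical function names and plain arrays standing in for the device properties:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sum the redistributor counts of all regions. */
static uint32_t total_redist_capacity(const uint32_t *region_count,
                                      size_t nb_regions)
{
    uint32_t total = 0;

    for (size_t i = 0; i < nb_regions; i++) {
        total += region_count[i];
    }
    return total;
}

/* True when every vcpu can be given a redistributor. */
static bool redist_regions_fit(const uint32_t *region_count,
                               size_t nb_regions, uint32_t num_cpu)
{
    assert(region_count != NULL || nb_regions == 0);
    return total_redist_capacity(region_count, nb_regions) >= num_cpu;
}
```

In the patch itself a failed check is reported through error_setg() rather than returned as a boolean.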

Signed-off-by: Eric Auger 

---
v2 -> v3:
- add missing return in arm_gic_realize
- in gicv3_init_irqs_and_mmio, compute/check rdist_capacity first
- rdist region 0 size set to MIN(smp_cpus, redist0_capacity)
- add GICV3_REDIST_SIZE
---
 hw/arm/virt.c  | 11 ++-
 hw/intc/arm_gicv3.c| 12 +++-
 hw/intc/arm_gicv3_common.c | 38 +-
 hw/intc/arm_gicv3_kvm.c|  9 +++--
 include/hw/intc/arm_gicv3_common.h |  8 ++--
 5 files changed, 67 insertions(+), 11 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index a3a28e2..ed79460 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -523,6 +523,15 @@ static void create_gic(VirtMachineState *vms, qemu_irq *pic)
 if (!kvm_irqchip_in_kernel()) {
 qdev_prop_set_bit(gicdev, "has-security-extensions", vms->secure);
 }
+
+if (type == 3) {
+uint32_t redist0_capacity =
+vms->memmap[VIRT_GIC_REDIST].size / GICV3_REDIST_SIZE;
+uint32_t redist0_count = MIN(smp_cpus, redist0_capacity);
+
+qdev_prop_set_uint32(gicdev, "len-redist-region-count", 1);
+qdev_prop_set_uint32(gicdev, "redist-region-count[0]", redist0_count);
+}
 qdev_init_nofail(gicdev);
 gicbusdev = SYS_BUS_DEVICE(gicdev);
 sysbus_mmio_map(gicbusdev, 0, vms->memmap[VIRT_GIC_DIST].base);
@@ -1322,7 +1331,7 @@ static void machvirt_init(MachineState *machine)
  * many redistributors we can fit into the memory map.
  */
 if (vms->gic_version == 3) {
-virt_max_cpus = vms->memmap[VIRT_GIC_REDIST].size / 0x20000;
+virt_max_cpus = vms->memmap[VIRT_GIC_REDIST].size / GICV3_REDIST_SIZE;
 } else {
 virt_max_cpus = GIC_NCPU;
 }
diff --git a/hw/intc/arm_gicv3.c b/hw/intc/arm_gicv3.c
index 479c667..7044133 100644
--- a/hw/intc/arm_gicv3.c
+++ b/hw/intc/arm_gicv3.c
@@ -373,7 +373,17 @@ static void arm_gic_realize(DeviceState *dev, Error **errp)
 return;
 }
 
-gicv3_init_irqs_and_mmio(s, gicv3_set_irq, gic_ops);
+if (s->nb_redist_regions != 1) {
+error_setg(errp, "VGICv3 redist region number(%d) not equal to 1",
+   s->nb_redist_regions);
+return;
+}
+
+gicv3_init_irqs_and_mmio(s, gicv3_set_irq, gic_ops, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
 
 gicv3_init_cpuif(s);
 }
diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c
index 7b54d52..4c89e7d 100644
--- a/hw/intc/arm_gicv3_common.c
+++ b/hw/intc/arm_gicv3_common.c
@@ -169,11 +169,22 @@ static const VMStateDescription vmstate_gicv3 = {
 };
 
 void gicv3_init_irqs_and_mmio(GICv3State *s, qemu_irq_handler handler,
-  const MemoryRegionOps *ops)
+  const MemoryRegionOps *ops, Error **errp)
 {
 SysBusDevice *sbd = SYS_BUS_DEVICE(s);
+int rdist_capacity = 0;
 int i;
 
+for (i = 0; i < s->nb_redist_regions; i++) {
+rdist_capacity += s->redist_region_count[i];
+}
+if (rdist_capacity < s->num_cpu) {
+error_setg(errp, "Capacity of the redist regions(%d) "
+   "is less than number of vcpus(%d)",
+   rdist_capacity, s->num_cpu);
+return;
+}
+
 /* For the GIC, also expose incoming GPIO lines for PPIs for each CPU.
  * GPIO array layout is thus:
  *  [0..N-1] spi
@@ -199,11 +210,18 @@ void gicv3_init_irqs_and_mmio(GICv3State *s, qemu_irq_handler handler,
 
 memory_region_init_io(&s->iomem_dist, OBJECT(s), ops, s,
   "gicv3_dist", 0x1);
-memory_region_init_io(&s->iomem_redist, OBJECT(s), ops ? &ops[1] : NULL, s,
-  "gicv3_redist", 0x2 * s->num_cpu);
-
 sysbus_init_mmio(sbd, &s->iomem_dist);
-sysbus_init_mmio(sbd, &s->iomem_redist);
+
+s->iomem_redist = g_new0(MemoryRegion, s->nb_redist_regions);
+for (i = 0; i < s->nb_redist_regions; i++) {
+char *name = g_strdup_printf("gicv3_redist_region[%d]", i);
+
+memory_region_init_io(&s->iomem_redist[i], OBJECT(s),
+  ops ? &ops[1] : NULL, s, name,
+  s->redist_region_count[i] * GICV3_REDIST_SIZE);
+sysbus_init_mmio(sbd, &s->iomem_redist[i]);
+g_free(name);
+}
 }
 
 static void arm_gicv3_common_realize(DeviceState *dev, Error **errp)
@@ -285,6 +303,13 @@ static void arm_gicv3_common_realize(DeviceState *dev, Error **errp)

[Qemu-devel] [RFC v3 5/8] hw/arm/virt: GICv3 DT node with one or two redistributor regions

2018-05-30 Thread Eric Auger
This patch allows the creation of a GICv3 node with 1 or 2
redistributor regions depending on the number of smp_cpus.
The second redistributor region is located just after the
existing RAM region, at 256GB, and can hold redistributors for up to
512 vcpus.

Please refer to kernel documentation for further node details:
Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.txt

Signed-off-by: Eric Auger 

---

v2 -> v3:
- VIRT_GIC_REDIST2 is now 64MB large, i.e. 512 redistributor capacity
- virt_gicv3_redist_region_count does not test kvm_irqchip_in_kernel
  anymore
---
 hw/arm/virt.c | 29 -
 include/hw/arm/virt.h | 14 ++
 2 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index ed79460..3018ec2 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -149,6 +149,8 @@ static const MemMapEntry a15memmap[] = {
 [VIRT_PCIE_PIO] =   { 0x3eff0000, 0x00010000 },
 [VIRT_PCIE_ECAM] =  { 0x3f000000, 0x01000000 },
 [VIRT_MEM] ={ 0x40000000, RAMLIMIT_BYTES },
+/* Additional 64 MB redist region (can contain up to 512 redistributors) */
+[VIRT_GIC_REDIST2] ={ 0x4000000000ULL, 0x4000000ULL },
 /* Second PCIe window, 512GB wide at the 512GB boundary */
 [VIRT_PCIE_MMIO_HIGH] =   { 0x8000000000ULL, 0x8000000000ULL },
 };
@@ -402,13 +404,30 @@ static void fdt_add_gic_node(VirtMachineState *vms)
 qemu_fdt_setprop_cell(vms->fdt, "/intc", "#size-cells", 0x2);
 qemu_fdt_setprop(vms->fdt, "/intc", "ranges", NULL, 0);
 if (vms->gic_version == 3) {
+int nb_redist_regions = virt_gicv3_redist_region_count(vms);
+
 qemu_fdt_setprop_string(vms->fdt, "/intc", "compatible",
 "arm,gic-v3");
-qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg",
- 2, vms->memmap[VIRT_GIC_DIST].base,
- 2, vms->memmap[VIRT_GIC_DIST].size,
- 2, vms->memmap[VIRT_GIC_REDIST].base,
- 2, vms->memmap[VIRT_GIC_REDIST].size);
+
+qemu_fdt_setprop_cell(vms->fdt, "/intc",
+  "#redistributor-regions", nb_redist_regions);
+
+if (nb_redist_regions == 1) {
+qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg",
+ 2, vms->memmap[VIRT_GIC_DIST].base,
+ 2, vms->memmap[VIRT_GIC_DIST].size,
+ 2, vms->memmap[VIRT_GIC_REDIST].base,
+ 2, vms->memmap[VIRT_GIC_REDIST].size);
+} else {
+qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg",
+ 2, vms->memmap[VIRT_GIC_DIST].base,
+ 2, vms->memmap[VIRT_GIC_DIST].size,
+ 2, vms->memmap[VIRT_GIC_REDIST].base,
+ 2, vms->memmap[VIRT_GIC_REDIST].size,
+ 2, vms->memmap[VIRT_GIC_REDIST2].base,
+ 2, vms->memmap[VIRT_GIC_REDIST2].size);
+}
+
 if (vms->virt) {
 qemu_fdt_setprop_cells(vms->fdt, "/intc", "interrupts",
GIC_FDT_IRQ_TYPE_PPI, ARCH_GICV3_MAINT_IRQ,
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 4ac7ef6..308156f 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -35,6 +35,8 @@
 #include "qemu/notify.h"
 #include "hw/boards.h"
 #include "hw/arm/arm.h"
+#include "sysemu/kvm.h"
+#include "hw/intc/arm_gicv3_common.h"
 
 #define NUM_GICV2M_SPIS   64
 #define NUM_VIRTIO_TRANSPORTS 32
@@ -60,6 +62,7 @@ enum {
 VIRT_GIC_V2M,
 VIRT_GIC_ITS,
 VIRT_GIC_REDIST,
+VIRT_GIC_REDIST2,
 VIRT_SMMU,
 VIRT_UART,
 VIRT_MMIO,
@@ -130,4 +133,15 @@ typedef struct {
 
 void virt_acpi_setup(VirtMachineState *vms);
 
+/* Return the number of used redistributor regions  */
+static inline int virt_gicv3_redist_region_count(VirtMachineState *vms)
+{
+uint32_t redist0_capacity =
+vms->memmap[VIRT_GIC_REDIST].size / GICV3_REDIST_SIZE;
+
+assert(vms->gic_version == 3);
+
+return vms->smp_cpus > redist0_capacity ? 2 : 1;
+}
+
 #endif /* QEMU_ARM_VIRT_H */
-- 
2.5.5
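
The arithmetic behind the region sizes and behind the choice between one and two regions can be checked with a small standalone sketch. The 128 KiB frame size is an assumption derived from the changelog (64 MB for 512 redistributors), and the first-region size used in the usage note below is likewise an assumption, picked so that the capacity comes out at 123; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one redistributor occupies 128 KiB (64 MB / 512). */
#define REDIST_FRAME_SIZE 0x20000ULL

/* Number of redistributors that fit in a region of the given size. */
static uint32_t redist_capacity(uint64_t region_size)
{
    return (uint32_t)(region_size / REDIST_FRAME_SIZE);
}

/* One region while the vcpus fit in the first one, otherwise two. */
static int redist_region_count(uint32_t cpus, uint64_t region0_size)
{
    assert(region0_size >= REDIST_FRAME_SIZE);
    return cpus > redist_capacity(region0_size) ? 2 : 1;
}
```

With a 0x4000000-byte region, redist_capacity() yields 512; with a first region of 0xF60000 bytes it yields 123, so the 124th vcpu is the first one that requires the second region.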




[Qemu-devel] [RFC v3 8/8] hw/arm/virt: Add virt-3.0 machine type supporting up to 512 vcpus

2018-05-30 Thread Eric Auger
Add virt-3.0 machine type.

This machine type allows up to 512 vcpus whereas for
earlier machine types, max_cpus was set to 255 and
any attempt to start the machine with vcpus > 255
was rejected at vl.c/main level.

Signed-off-by: Eric Auger 
---
 hw/arm/virt.c | 32 +---
 1 file changed, 25 insertions(+), 7 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index c00f47d..a9fc24b 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1697,11 +1697,13 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
 HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
 
 mc->init = machvirt_init;
-/* Start max_cpus at the maximum QEMU supports. We'll further restrict
- * it later in machvirt_init, where we have more information about the
- * configuration of the particular instance.
+/* Start with max_cpus set to 512. This value is chosen since achievable
+ * in accelerated mode with GICv3 and recent host supporting up to 512 vcpus
+ * and multiple redistributor region registration.
+ * This value will be refined later on once we collect more information
+ * about the configuration of the particular instance.
  */
-mc->max_cpus = 255;
+mc->max_cpus = 512;
 machine_class_allow_dynamic_sysbus_dev(mc, TYPE_VFIO_CALXEDA_XGMAC);
 machine_class_allow_dynamic_sysbus_dev(mc, TYPE_VFIO_AMD_XGBE);
 mc->block_default_type = IF_VIRTIO;
@@ -1737,7 +1739,7 @@ static void machvirt_machine_init(void)
 }
 type_init(machvirt_machine_init);
 
-static void virt_2_12_instance_init(Object *obj)
+static void virt_3_0_instance_init(Object *obj)
 {
 VirtMachineState *vms = VIRT_MACHINE(obj);
 VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
@@ -1805,10 +1807,26 @@ static void virt_2_12_instance_init(Object *obj)
 vms->irqmap = a15irqmap;
 }
 
-static void virt_machine_2_12_options(MachineClass *mc)
+static void virt_machine_3_0_options(MachineClass *mc)
 {
 }
-DEFINE_VIRT_MACHINE_AS_LATEST(2, 12)
+DEFINE_VIRT_MACHINE_AS_LATEST(3, 0)
+
+#define VIRT_COMPAT_2_12 \
+HW_COMPAT_2_12
+
+static void virt_2_12_instance_init(Object *obj)
+{
+virt_3_0_instance_init(obj);
+}
+
+static void virt_machine_2_12_options(MachineClass *mc)
+{
+virt_machine_3_0_options(mc);
+SET_MACHINE_COMPAT(mc, VIRT_COMPAT_2_12);
+mc->max_cpus = 255;
+}
+DEFINE_VIRT_MACHINE(2, 12)
 
 #define VIRT_COMPAT_2_11 \
 HW_COMPAT_2_11
-- 
2.5.5




Re: [Qemu-devel] [PATCH 01/17] block: iterate_format with account of whitelisting

2018-05-30 Thread Max Reitz
On 2018-04-26 18:19, Roman Kagan wrote:
> bdrv_iterate_format (which is currently only used for printing out the
> formats supported by the block layer) doesn't take format whitelisting
> into account.
> 
> As a result, QEMU lies when asked for the list of block drivers it
> supports with "-drive format=?": some of the formats there may be
> recognized by qemu-* tools but unusable in qemu proper.
> 
> To avoid that, exclude formats that are not whitelisted from
> enumeration, if whitelisting is in use.  Since we have separate
> whitelists for r/w and r/o, take this as a parameter to
> bdrv_iterate_format, and print two lists of supported formats (r/w and
> r/o) in main qemu.
> 
> Signed-off-by: Roman Kagan 
> ---
>  include/block/block.h |  2 +-
>  block.c   | 23 +++
>  blockdev.c|  4 +++-
>  qemu-img.c|  2 +-
>  4 files changed, 24 insertions(+), 7 deletions(-)

Reviewed-by: Max Reitz 
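
The filtering described in the quoted commit message can be pictured with a small standalone sketch. It does not use the block layer API: the whitelist is modelled as a NULL-terminated string array, an empty list is taken to mean that whitelisting is not in use, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* An empty or missing whitelist means every format is allowed. */
static bool format_allowed(const char *name, const char *const *whitelist)
{
    assert(name != NULL);
    if (whitelist == NULL || whitelist[0] == NULL) {
        return true;
    }
    for (size_t i = 0; whitelist[i] != NULL; i++) {
        if (strcmp(name, whitelist[i]) == 0) {
            return true;
        }
    }
    return false;
}

/* Copy the allowed formats into 'out' and return how many were kept. */
static size_t filter_formats(const char *const *all, size_t n,
                             const char *const *whitelist,
                             const char **out)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        if (format_allowed(all[i], whitelist)) {
            out[kept++] = all[i];
        }
    }
    return kept;
}
```

Running the filter once with the read-write list and once with the read-only list would give the two lists of supported formats that the commit message mentions.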



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 02/17] iotests: iotests.py: prevent deadlock in subprocess

2018-05-30 Thread Max Reitz
On 2018-04-26 18:19, Roman Kagan wrote:
> A subprocess whose std{out,err} is subprocess.PIPE may block writing its
> output, so .wait() should not be called on it until the pipes are read
> completely on the caller's side.
> 
> Subprocess.communicate takes care of this.
> 
> Signed-off-by: Roman Kagan 
> ---
>  tests/qemu-iotests/iotests.py | 16 
>  1 file changed, 8 insertions(+), 8 deletions(-)

Reviewed-by: Max Reitz 





Re: [Qemu-devel] [RFC] monitor: turn on Out-Of-Band by default again

2018-05-30 Thread Peter Xu
On Wed, May 30, 2018 at 04:04:58PM +0800, Peter Xu wrote:
> On Tue, May 22, 2018 at 02:40:26PM -0400, John Snow wrote:
> > 
> > 
> > On 05/21/2018 10:13 AM, Eric Blake wrote:
> > > On 05/21/2018 03:42 AM, Peter Xu wrote:
> > >> We turned Out-Of-Band feature of monitors off for 2.12 release.  Now we
> > >> try to turn that on again.
> > > 
> > > "try to turn" sounds weak, like you aren't sure of this patch.  If you
> > > aren't sure, then why should we feel safe in applying it?  This text is
> > > going in the permanent git history, so sound bold, rather than hesitant!
> > > 
> > > "We have resolved the issues from last time (commit 3fd2457d reverted by
> > > commit a4f90923):
> > > - issue 1 ...
> > > - issue 2 ...
> > > So now we are ready to enable advertisement of the feature by default"
> > > 
> > > with better descriptions of the issues that you fixed (I can think of at
> > > least the fixes adding thread-safety to the current monitor, and fixing
> > > early use of the monitor before qmp_capabilities completes; there may
> > > also be other issues that you want to call out).
> > > 
> > >>
> > >> Signed-off-by: Peter Xu 
> > >> -- 
> > >> Now OOB should be okay with all known tests (except iotest qcow2, since
> > >> it is still broken on master),
> > > 
> > > Which tests are still failing for you?  Ideally, you can still
> > > demonstrate that the tests not failing without this patch continue to
> > > pass with this patch, even if you call out the tests that have known
> > > issues to still be resolved.
> > > 
> > 
> > Probably 91 and 169. If any others fail that's news to me.
> 
> I just gave it a shot on my workstation too (./check -qcow2):
> 
> Not run: 045 059 064 070 075 076 077 078 081 083 084 088 092 093 094 101 106 
> 109 113 116 119 123 128 131 135 136 146 148 149 160 162 171 173 175 199 207 
> 210 3
> Failures: 087 188 189 198 206
> Failed 5 of 167 tests
> 
> I'm testing against master, e609fa7.

Hmm... I reran the same test on the same master commit, but this
time all 167 tests passed on my laptop.  So I assume the previous 5
failures are not reproducible every time (or there might be
something wrong with my testbed).  I'll rerun a few more times, and
when I post the OOB patch I'll cover all the working qcow2 tests.

Regards,

-- 
Peter Xu



Re: [Qemu-devel] [PATCH 10/13] 9p: darwin: *xattr_nofollow implementations

2018-05-30 Thread Greg Kurz
On Sat, 26 May 2018 01:23:12 -0400
k...@juliacomputing.com wrote:

> From: Keno Fischer 
> 
> Signed-off-by: Keno Fischer 
> ---

As mentioned in patch 3, this should go to 9p-util-darwin.c

>  hw/9pfs/9p-util.c | 49 +
>  1 file changed, 45 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/9pfs/9p-util.c b/hw/9pfs/9p-util.c
> index 8cf5554..98004ac 100644
> --- a/hw/9pfs/9p-util.c
> +++ b/hw/9pfs/9p-util.c
> @@ -17,49 +17,90 @@
>  ssize_t fgetxattrat_nofollow(int dirfd, const char *filename, const char *name,
>   void *value, size_t size)
>  {
> -char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
>  int ret;
> +#ifdef CONFIG_DARWIN
> +int fd = openat_file(dirfd, filename, O_RDONLY | O_PATH_9P_UTIL | O_NOFOLLOW, 0);
> +if (fd == -1)
> +return -1;
> +
> +ret = fgetxattr(fd, name, value, size, 0, XATTR_NOFOLLOW);
> +close_preserve_errno(fd);
> +#else
> +char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
>  
>  ret = lgetxattr(proc_path, name, value, size);
>  g_free(proc_path);
> +#endif
>  return ret;
>  }
>  
>  ssize_t fgetxattr_follow(int fd, const char *name,
>   void *value, size_t size)
>  {
> +#ifdef CONFIG_DARWIN
> +return fgetxattr(fd, name, value, size, 0, 0);
> +#else
>  return fgetxattr(fd, name, value, size);
> +#endif
>  }
>  
>  ssize_t flistxattrat_nofollow(int dirfd, const char *filename,
>char *list, size_t size)
>  {
> -char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
>  int ret;
> +#ifdef CONFIG_DARWIN
> +int fd = openat_file(dirfd, filename, O_RDONLY | O_PATH_9P_UTIL | O_NOFOLLOW, 0);
> +if (fd == -1)
> +return -1;
> +
> +ret = flistxattr(fd, list, size, XATTR_NOFOLLOW);
> +close_preserve_errno(fd);
> +#else
> +char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
>  
>  ret = llistxattr(proc_path, list, size);
>  g_free(proc_path);
> +#endif
>  return ret;
>  }
>  
>  ssize_t fremovexattrat_nofollow(int dirfd, const char *filename,
>  const char *name)
>  {
> -char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
>  int ret;
> +#ifdef CONFIG_DARWIN
> +int fd = openat_file(dirfd, filename, O_PATH_9P_UTIL | O_NOFOLLOW, 0);
> +if (fd == -1)
> +return -1;
> +
> +ret = fremovexattr(fd, name, XATTR_NOFOLLOW);
> +close_preserve_errno(fd);
> +return ret;
> +#else
> +char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
>  
>  ret = lremovexattr(proc_path, name);
>  g_free(proc_path);
>  return ret;
> +#endif
>  }
>  
>  int fsetxattrat_nofollow(int dirfd, const char *filename, const char *name,
>   void *value, size_t size, int flags)
>  {
> -char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
>  int ret;
> +#ifdef CONFIG_DARWIN
> +int fd = openat_file(dirfd, filename, O_PATH_9P_UTIL | O_NOFOLLOW, 0);
> +if (fd == -1)
> +return -1;
> +
> +ret = fsetxattr(fd, name, value, size, 0, XATTR_NOFOLLOW);
> +close_preserve_errno(fd);
> +#else
> +char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
>  
>  ret = lsetxattr(proc_path, name, value, size, flags);
>  g_free(proc_path);
> +#endif
>  return ret;
>  }




Re: [Qemu-devel] [RFC v3 5/8] hw/arm/virt: GICv3 DT node with one or two redistributor regions

2018-05-30 Thread Igor Mammedov
On Wed, 30 May 2018 13:45:38 +0200
Eric Auger  wrote:

> This patch allows the creation of a GICv3 node with 1 or 2
> redistributor regions depending on the number of smp_cpus.
> The second redistributor region is located just after the
> existing RAM region, at 256GB, and can hold redistributors for up to
> 512 vcpus.
> 
> Please refer to kernel documentation for further node details:
> Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.txt
> 
> Signed-off-by: Eric Auger 
> 
> ---
> 
> v2 -> v3:
> - VIRT_GIC_REDIST2 is now 64MB large, i.e. 512 redistributor capacity
> - virt_gicv3_redist_region_count does not test kvm_irqchip_in_kernel
>   anymore
> ---
>  hw/arm/virt.c | 29 -
>  include/hw/arm/virt.h | 14 ++
>  2 files changed, 38 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index ed79460..3018ec2 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -149,6 +149,8 @@ static const MemMapEntry a15memmap[] = {
>  [VIRT_PCIE_PIO] =   { 0x3eff0000, 0x00010000 },
>  [VIRT_PCIE_ECAM] =  { 0x3f000000, 0x01000000 },
>  [VIRT_MEM] ={ 0x40000000, RAMLIMIT_BYTES },
> +/* Additional 64 MB redist region (can contain up to 512 redistributors) */
> +[VIRT_GIC_REDIST2] ={ 0x4000000000ULL, 0x4000000ULL },
Could it be placed after VIRT_PCIE_MMIO_HIGH, so that we would
have some space here to grow RAM up to the 512GB boundary
without creating a second RAM region?

>  /* Second PCIe window, 512GB wide at the 512GB boundary */
>  [VIRT_PCIE_MMIO_HIGH] =   { 0x8000000000ULL, 0x8000000000ULL },
>  };
> @@ -402,13 +404,30 @@ static void fdt_add_gic_node(VirtMachineState *vms)
>  qemu_fdt_setprop_cell(vms->fdt, "/intc", "#size-cells", 0x2);
>  qemu_fdt_setprop(vms->fdt, "/intc", "ranges", NULL, 0);
>  if (vms->gic_version == 3) {
> +int nb_redist_regions = virt_gicv3_redist_region_count(vms);
> +
>  qemu_fdt_setprop_string(vms->fdt, "/intc", "compatible",
>  "arm,gic-v3");
> -qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg",
> - 2, vms->memmap[VIRT_GIC_DIST].base,
> - 2, vms->memmap[VIRT_GIC_DIST].size,
> - 2, vms->memmap[VIRT_GIC_REDIST].base,
> - 2, vms->memmap[VIRT_GIC_REDIST].size);
> +
> +qemu_fdt_setprop_cell(vms->fdt, "/intc",
> +  "#redistributor-regions", nb_redist_regions);
> +
> +if (nb_redist_regions == 1) {
> +qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg",
> + 2, vms->memmap[VIRT_GIC_DIST].base,
> + 2, vms->memmap[VIRT_GIC_DIST].size,
> + 2, vms->memmap[VIRT_GIC_REDIST].base,
> + 2, vms->memmap[VIRT_GIC_REDIST].size);
> +} else {
> +qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg",
> + 2, vms->memmap[VIRT_GIC_DIST].base,
> + 2, vms->memmap[VIRT_GIC_DIST].size,
> + 2, vms->memmap[VIRT_GIC_REDIST].base,
> + 2, vms->memmap[VIRT_GIC_REDIST].size,
> + 2, vms->memmap[VIRT_GIC_REDIST2].base,
> + 2, vms->memmap[VIRT_GIC_REDIST2].size);
> +}
> +
>  if (vms->virt) {
>  qemu_fdt_setprop_cells(vms->fdt, "/intc", "interrupts",
> GIC_FDT_IRQ_TYPE_PPI, ARCH_GICV3_MAINT_IRQ,
> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
> index 4ac7ef6..308156f 100644
> --- a/include/hw/arm/virt.h
> +++ b/include/hw/arm/virt.h
> @@ -35,6 +35,8 @@
>  #include "qemu/notify.h"
>  #include "hw/boards.h"
>  #include "hw/arm/arm.h"
> +#include "sysemu/kvm.h"
> +#include "hw/intc/arm_gicv3_common.h"
>  
>  #define NUM_GICV2M_SPIS   64
>  #define NUM_VIRTIO_TRANSPORTS 32
> @@ -60,6 +62,7 @@ enum {
>  VIRT_GIC_V2M,
>  VIRT_GIC_ITS,
>  VIRT_GIC_REDIST,
> +VIRT_GIC_REDIST2,
>  VIRT_SMMU,
>  VIRT_UART,
>  VIRT_MMIO,
> @@ -130,4 +133,15 @@ typedef struct {
>  
>  void virt_acpi_setup(VirtMachineState *vms);
>  
> +/* Return the number of used redistributor regions  */
> +static inline int virt_gicv3_redist_region_count(VirtMachineState *vms)
> +{
> +uint32_t redist0_capacity =
> +vms->memmap[VIRT_GIC_REDIST].size / GICV3_REDIST_SIZE;
> +
> +assert(vms->gic_version == 3);
> +
> +return vms->smp_cpus > redist0_capacity ? 2 : 1;
> +}
> +
>  #endif /* QEMU_ARM_VIRT_H */




Re: [Qemu-devel] [PATCH 12/13] 9p: darwin: Provide a fallback implementation for utimensat

2018-05-30 Thread Greg Kurz
On Sat, 26 May 2018 01:23:14 -0400
k...@juliacomputing.com wrote:

> From: Keno Fischer 
> 
> This function is new in Mac OS 10.13. Provide a fallback implementation
> when building against older SDKs.
> 
> Signed-off-by: Keno Fischer 
> ---

As with patch 10, this should go to 9p-util-darwin.c

>  hw/9pfs/9p-local.c |  2 +-
>  hw/9pfs/9p-util.c  | 38 ++
>  hw/9pfs/9p-util.h  |  7 +++
>  hw/9pfs/9p.c   |  1 +
>  4 files changed, 47 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c
> index 3e358b7..70ab541 100644
> --- a/hw/9pfs/9p-local.c
> +++ b/hw/9pfs/9p-local.c
> @@ -1082,7 +1082,7 @@ static int local_utimensat(FsContext *s, V9fsPath *fs_path,
>  goto out;
>  }
>  
> -ret = utimensat(dirfd, name, buf, AT_SYMLINK_NOFOLLOW);
> +ret = utimensat_nofollow(dirfd, name, buf);
>  close_preserve_errno(dirfd);
>  out:
>  g_free(dirpath);
> diff --git a/hw/9pfs/9p-util.c b/hw/9pfs/9p-util.c
> index 98004ac..8403f5f 100644
> --- a/hw/9pfs/9p-util.c
> +++ b/hw/9pfs/9p-util.c
> @@ -104,3 +104,41 @@ int fsetxattrat_nofollow(int dirfd, const char *filename, const char *name,
>  #endif
>  return ret;
>  }
> +
> +#ifndef __has_builtin
> +#define __has_builtin(x) 0
> +#endif
> +
> +int utimensat_nofollow(int dirfd, const char *filename, const struct timespec times[2])
> +{
> +#ifdef CONFIG_DARWIN
> +#if defined(__MAC_10_13) /* Check whether we have an SDK version that defines utimensat */
> +#if __MAC_OS_X_VERSION_MIN_REQUIRED >= __MAC_10_13
> +#define UTIMENSAT_AVAILABLE 1
> +#elif __has_builtin(__builtin_available)
> +#define UTIMENSAT_AVAILABLE __builtin_available(macos 10.13, *)
> +#else
> +#define UTIMENSAT_AVAILABLE 0
> +#endif
> +if (UTIMENSAT_AVAILABLE)
> +{
> +return utimensat(dirfd, filename, times, AT_SYMLINK_NOFOLLOW);
> +}
> +#endif
> +// utimensat not available. Use futimes.
> +int fd = openat_file(dirfd, filename, O_PATH_9P_UTIL | O_NOFOLLOW, 0);
> +if (fd == -1)
> +return -1;
> +
> +struct timeval futimes_buf[2];
> +futimes_buf[0].tv_sec = times[0].tv_sec;
> +futimes_buf[0].tv_usec = times[0].tv_nsec * 1000;
> +futimes_buf[1].tv_sec = times[1].tv_sec;
> +futimes_buf[1].tv_usec = times[1].tv_nsec * 1000;
> +int ret = futimes(fd, futimes_buf);
> +close_preserve_errno(fd);
> +return ret;
> +#else
> +return utimensat(dirfd, filename, times, AT_SYMLINK_NOFOLLOW);
> +#endif
> +}
> diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h
> index cb26343..2329c82 100644
> --- a/hw/9pfs/9p-util.h
> +++ b/hw/9pfs/9p-util.h
> @@ -19,6 +19,12 @@
>  #define O_PATH_9P_UTIL 0
>  #endif
>  
> +/* Compatibility with OLD SDK Versions for Darwin */
> +#if defined(CONFIG_DARWIN) && !defined(UTIME_NOW)
> +#define UTIME_NOW -1
> +#define UTIME_OMIT -2
> +#endif
> +
>  static inline void close_preserve_errno(int fd)
>  {
>  int serrno = errno;
> @@ -66,5 +72,6 @@ ssize_t flistxattrat_nofollow(int dirfd, const char *filename,
>char *list, size_t size);
>  ssize_t fremovexattrat_nofollow(int dirfd, const char *filename,
>  const char *name);
> +int utimensat_nofollow(int dirfd, const char *filename, const struct timespec times[2]);
>  
>  #endif
> diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> index 4ae4da6..8e0594a 100644
> --- a/hw/9pfs/9p.c
> +++ b/hw/9pfs/9p.c
> @@ -21,6 +21,7 @@
>  #include "virtio-9p.h"
>  #include "fsdev/qemu-fsdev.h"
>  #include "9p-xattr.h"
> +#include "9p-util.h"
>  #include "coth.h"
>  #include "trace.h"
>  #include "migration/blocker.h"
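
One detail of the quoted fallback is worth spelling out: futimes() takes struct timeval, whose sub-second field is in microseconds, while struct timespec carries nanoseconds. A minimal sketch of the conversion follows, with a hypothetical helper name. It divides tv_nsec by 1000, whereas the quoted hunk multiplies by 1000, so that line probably deserves a second look.

```c
#include <assert.h>
#include <sys/time.h>
#include <time.h>

/*
 * Convert a timespec (seconds + nanoseconds) into a timeval
 * (seconds + microseconds). Special values such as UTIME_NOW and
 * UTIME_OMIT are not handled here and would have to be resolved by
 * the caller before the conversion.
 */
static struct timeval timespec_to_timeval(struct timespec ts)
{
    struct timeval tv;

    assert(ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);
    tv.tv_sec = ts.tv_sec;
    tv.tv_usec = ts.tv_nsec / 1000;   /* 1 microsecond = 1000 nanoseconds */
    return tv;
}
```

The division truncates, so any sub-microsecond part of the timestamp is dropped.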




Re: [Qemu-devel] [PATCH 03/17] iotests: ask qemu for supported formats

2018-05-30 Thread Max Reitz
On 2018-04-26 18:19, Roman Kagan wrote:
> Add helper functions to query the block drivers actually supported by
> QEMU using "-drive format=?".  This allows to skip certain tests that
> require drivers not built in or whitelisted in QEMU.
> 
> Signed-off-by: Roman Kagan 
> ---
>  tests/qemu-iotests/common.rc  | 19 +++
>  tests/qemu-iotests/iotests.py | 30 +++---
>  2 files changed, 46 insertions(+), 3 deletions(-)

[...]

> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> index e2abf0cb53..698ef2b2c0 100644
> --- a/tests/qemu-iotests/iotests.py
> +++ b/tests/qemu-iotests/iotests.py

[...]

> @@ -550,13 +561,26 @@ def verify_cache_mode(supported_cache_modes=[]):
>  if supported_cache_modes and (cachemode not in supported_cache_modes):
>  notrun('not suitable for this cache mode: %s' % cachemode)
>  
> +rw_formats = None
> +
> +def supports_format(format_name):
> +format_message = qemu_pipe('-drive', 'format=?')
> +global rw_formats
> +if rw_formats is None:
> +rw_formats = format_message.splitlines()[0].split(':')[1].split()

Isn't it sufficient to call qemu_pipe() only if rw_formats is None?

The rest looks good.

Max

> +return format_name in rw_formats
> +
> +def require_formats(*formats):
> +for fmt in formats:
> +if not supports_format(fmt):
> +notrun('%s does not support format %s' % (qemu_prog, fmt))
> +
>  def supports_quorum():
> -return 'quorum' in qemu_img_pipe('--help')
> +return supports_format('quorum')
>  
>  def verify_quorum():
>  '''Skip test suite if quorum support is not available'''
> -if not supports_quorum():
> -notrun('quorum support missing')
> +require_formats('quorum')
>  
>  def main(supported_fmts=[], supported_oses=['linux'], supported_cache_modes=[],
>   unsupported_fmts=[]):
> 






[Qemu-devel] [qemu-web PATCH] Add a blog post about the new -nic parameter

2018-05-30 Thread Thomas Huth
QEMU v2.12 features a new, useful parameter called "-nic". Let's
throw some light on this new parameter with a new blog post.

Signed-off-by: Thomas Huth 
---
 _posts/2018-05-30-nic-parameter.md | 126 +
 screenshots/2018-05-30-qemu-cli-net.png| Bin 0 -> 24020 bytes
 screenshots/2018-05-30-qemu-cli-netdev.png | Bin 0 -> 13553 bytes
 3 files changed, 126 insertions(+)
 create mode 100644 _posts/2018-05-30-nic-parameter.md
 create mode 100644 screenshots/2018-05-30-qemu-cli-net.png
 create mode 100644 screenshots/2018-05-30-qemu-cli-netdev.png

diff --git a/_posts/2018-05-30-nic-parameter.md 
b/_posts/2018-05-30-nic-parameter.md
new file mode 100644
index 000..787575f
--- /dev/null
+++ b/_posts/2018-05-30-nic-parameter.md
@@ -0,0 +1,126 @@
+---
+layout: post
+title:  "QEMU's new -nic command line parameter"
+date:   2018-05-30 14:00:00 +0200
+author: Thomas Huth
+categories: [features, parameters, 'qemu 2.12']
+---
+QEMU v2.12 has a new command line parameter, the `-nic` parameter, which can
+be used to configure a network connection for the guest quite easily, since
+it sets up both the guest NIC and the host network backend in one go.
+If you've read the
+[ChangeLog of QEMU v2.12](https://wiki.qemu.org/ChangeLog/2.12) or the current
+[documentation of QEMU](https://qemu.weilnetz.de/doc/qemu-doc.html)
+you might already have spotted this new `-nic` parameter, and maybe you've
+thought "Why yet another parameter for configuring the network?". To answer
+that question, we've got to have a look at the basics and other options first.
+To configure a network interface for a guest, you've got to consider two
+sides:
+
+1. The emulated hardware that the guest sees, i.e. the so-called NIC (network
+interface controller). On systems that support PCI cards, these typically
+could be an e1000 network card, a rtl8139 network card or a virtio-net device.
+
+2. The network backend on the host side, i.e. the interface that QEMU uses
+to exchange network packets with the outside (like other QEMU instances
+or other real hosts in your intranet or in the internet). The common host
+backends are the "user" (a.k.a. SLIRP) backend which provides access to
+the host's network via NAT, the "tap" backend which allows the guest to
+directly access the host's network, or the "socket" backend which can be
+used to connect multiple QEMU instances to simulate a shared network for
+their guests.
+
+The legacy -net option
+--
+
+QEMU's initial way of configuring the network for the guest was the `-net`
+option. The emulated NIC hardware can be chosen with the
+`-net nic,model=xyz,...` parameter, and the host backend with the
+`-net ,...` parameter (e.g. `-net user` for the SLIRP backend).
+The emulated NIC and the host backend are not directly connected here, but
+via an emulated hub (called "vlan" in QEMU), so if you start QEMU with
+`-net nic,model=e1000 -net user -net nic,model=virtio -net tap` for example,
+you get a setup where all the NICs and host backends are connected together
+via a hub:
+
+![Networking with -net](/screenshots/2018-05-30-qemu-cli-net.png)
+
+That means the e1000 NIC also gets the network traffic from the virtio-net
+NIC and both host backends... this is probably not what the user expected,
+who likely wanted to have two separate network connections instead. To achieve
+this with the `-net` parameter, you've got to use the "vlan" option instead,
+for example `-net nic,model=e1000,vlan=0 -net user,vlan=0
+-net nic,model=virtio,vlan=1 -net tap,vlan=1` moves the virtio-net NIC
+and the "tap" backend to another hub (with ID #1). Please note that the
+"vlan" option will be dropped in QEMU v3.0 since the term was rather
+[confusing](https://bugs.launchpad.net/qemu/+bug/658904) (it's not related
+to IEEE 802.1Q for example) and caused a lot of misconfigurations in the past.
+
+The modern -netdev option
+-
+
+Besides the confusing "vlan" option of the `-net` parameter, there is one
+more major drawback with that legacy option: Since you always get the
+emulated hub here in between, to which multiple NICs and host backends could be
+attached, you cannot use this concept for situations where the NIC frontend
+has to work very closely together with the host backend, e.g. when you want
+to use vhost acceleration for virtio-net.
+
+To configure a network connection where the emulated NIC is directly connected
+to a host network backend without a hub in between, you've got to use the
+`-netdev` option for the backend, together with `-device` for the guest NIC
+hardware. Assuming that you want to configure the same devices as in the
+`-net` example above, you could use `-netdev user,id=n1 -device e1000,netdev=n1
+-netdev tap,id=n2 -device virtio-net,netdev=n2` for example. This will give you
+straight 1:1 connections between the NICs and the host backends:
+
+![Networking with -netdev](/screenshots/2018-05-30-qemu-cli-netdev.png)
+
+Note th

Re: [Qemu-devel] [PATCH 04/17] iotest 030: skip quorum test setup/teardown too

2018-05-30 Thread Max Reitz
On 2018-04-26 18:19, Roman Kagan wrote:
> If quorum driver is not enabled, test 030 skips the corresponding
> testcase.  This, however, is insufficient: quorum is first used in the
> testsuite's setUp.
> 
> To avoid erroring out here, skip setUp/tearDown, too.
> 
> Signed-off-by: Roman Kagan 
> ---
>  tests/qemu-iotests/030 | 6 ++
>  1 file changed, 6 insertions(+)

Not sure if there is any nicer way of doing this, but:

Reviewed-by: Max Reitz 





Re: [Qemu-devel] [PATCH 05/17] iotest 030: require blkdebug

2018-05-30 Thread Max Reitz
On 2018-04-26 18:19, Roman Kagan wrote:
> This test uses blkdebug extensively so notrun it if blkdebug is
> disabled in QEMU.
> 
> Signed-off-by: Roman Kagan 
> ---
>  tests/qemu-iotests/030 | 1 +
>  1 file changed, 1 insertion(+)

Reviewed-by: Max Reitz 





Re: [Qemu-devel] [PATCH 11/13] 9p: darwin: Mark mknod as unsupported

2018-05-30 Thread Greg Kurz
On Sat, 26 May 2018 01:23:13 -0400
k...@juliacomputing.com wrote:

> From: Keno Fischer 
> 
> Signed-off-by: Keno Fischer 
> ---
>  hw/9pfs/9p-local.c | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c
> index c55ea25..3e358b7 100644
> --- a/hw/9pfs/9p-local.c
> +++ b/hw/9pfs/9p-local.c
> @@ -669,6 +669,13 @@ static int local_mknod(FsContext *fs_ctx, V9fsPath 
> *dir_path,
>  return -1;
>  }
>  
> +#ifdef CONFIG_DARWIN
> +/* Darwin doesn't have mknodat and it's unlikely to work anyway,

What's unlikely to work?

> +   so let's just mark it as unsupported */
> +err = -1;
> +errno = EOPNOTSUPP;
> +goto out;
> +#else

Please introduce qemu_mknodat() with distinct implementations for Linux
and Darwin.

>  if (fs_ctx->export_flags & V9FS_SM_MAPPED ||
>  fs_ctx->export_flags & V9FS_SM_MAPPED_FILE) {
>  err = mknodat(dirfd, name, fs_ctx->fmode | S_IFREG, 0);
> @@ -699,6 +706,8 @@ static int local_mknod(FsContext *fs_ctx, V9fsPath 
> *dir_path,
>  
>  err_end:
>  unlinkat_preserve_errno(dirfd, name, 0);
> +#endif
> +
>  out:
>  close_preserve_errno(dirfd);
>  return err;




Re: [Qemu-devel] [PATCH 06/17] iotest 055: skip unsupported backup target formats

2018-05-30 Thread Max Reitz
On 2018-04-26 18:19, Roman Kagan wrote:
> Signed-off-by: Roman Kagan 
> ---
>  tests/qemu-iotests/055 | 12 
>  1 file changed, 12 insertions(+)

Reviewed-by: Max Reitz 





Re: [Qemu-devel] [PATCH 07/17] iotest 055: require blkdebug

2018-05-30 Thread Max Reitz
On 2018-04-26 18:19, Roman Kagan wrote:
> This test uses blkdebug extensively so notrun it if blkdebug is
> disabled in QEMU.
> 
> Signed-off-by: Roman Kagan 
> ---
>  tests/qemu-iotests/055 | 1 +
>  1 file changed, 1 insertion(+)

Reviewed-by: Max Reitz 





Re: [Qemu-devel] [RFC v3 5/8] hw/arm/virt: GICv3 DT node with one or two redistributor regions

2018-05-30 Thread Auger Eric
Hi Igor,

On 05/30/2018 02:13 PM, Igor Mammedov wrote:
> On Wed, 30 May 2018 13:45:38 +0200
> Eric Auger  wrote:
> 
>> This patch allows the creation of a GICv3 node with 1 or 2
>> redistributor regions depending on the number of smp_cpus.
>> The second redistributor region is located just after the
>> existing RAM region, at 256GB and contains up to 512 vcpus.
>>
>> Please refer to kernel documentation for further node details:
>> Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.txt
>>
>> Signed-off-by: Eric Auger 
>>
>> ---
>>
>> v2 -> v3:
>> - VIRT_GIC_REDIST2 is now 64MB large, ie. 512 redistributor capacity
>> - virt_gicv3_redist_region_count does not test kvm_irqchip_in_kernel
>>   anymore
>> ---
>>  hw/arm/virt.c | 29 -
>>  include/hw/arm/virt.h | 14 ++
>>  2 files changed, 38 insertions(+), 5 deletions(-)
>>
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index ed79460..3018ec2 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -149,6 +149,8 @@ static const MemMapEntry a15memmap[] = {
>>  [VIRT_PCIE_PIO] =   { 0x3eff, 0x0001 },
>>  [VIRT_PCIE_ECAM] =  { 0x3f00, 0x0100 },
>>  [VIRT_MEM] ={ 0x4000, RAMLIMIT_BYTES },
>> +/* Additional 64 MB redist region (can contain up to 512 
>> redistributors) */
>> +[VIRT_GIC_REDIST2] ={ 0x40ULL, 0x400ULL },
> could it be placed after VIRT_PCIE_MMIO_HIGH,
> so we would have some space here to increase RAM without
> creating the second RAM region up to the 512GB boundary?

Personally I don't have any objection but see my reply to Shannon's
similar query in
http://lists.gnu.org/archive/html/qemu-arm/2018-03/msg00465.html

* If we need to provide more RAM to VMs in the future then we need to:
 *  * allocate a second bank of RAM starting at 2TB and working up
 *  * fix the DT and ACPI table generation code in QEMU to correctly
 *report two split lumps of RAM to the guest
 *  * fix KVM in the host kernel to allow guests with >40 bit address spaces
 * (We don't want to fill all the way up to 512GB with RAM because
 * we might want it for non-RAM purposes later.

Thanks

Eric


> 
>>  /* Second PCIe window, 512GB wide at the 512GB boundary */
>>  [VIRT_PCIE_MMIO_HIGH] =   { 0x80ULL, 0x80ULL },
>>  };
>> @@ -402,13 +404,30 @@ static void fdt_add_gic_node(VirtMachineState *vms)
>>  qemu_fdt_setprop_cell(vms->fdt, "/intc", "#size-cells", 0x2);
>>  qemu_fdt_setprop(vms->fdt, "/intc", "ranges", NULL, 0);
>>  if (vms->gic_version == 3) {
>> +int nb_redist_regions = virt_gicv3_redist_region_count(vms);
>> +
>>  qemu_fdt_setprop_string(vms->fdt, "/intc", "compatible",
>>  "arm,gic-v3");
>> -qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg",
>> - 2, vms->memmap[VIRT_GIC_DIST].base,
>> - 2, vms->memmap[VIRT_GIC_DIST].size,
>> - 2, vms->memmap[VIRT_GIC_REDIST].base,
>> - 2, vms->memmap[VIRT_GIC_REDIST].size);
>> +
>> +qemu_fdt_setprop_cell(vms->fdt, "/intc",
>> +  "#redistributor-regions", nb_redist_regions);
>> +
>> +if (nb_redist_regions == 1) {
>> +qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg",
>> + 2, vms->memmap[VIRT_GIC_DIST].base,
>> + 2, vms->memmap[VIRT_GIC_DIST].size,
>> + 2, 
>> vms->memmap[VIRT_GIC_REDIST].base,
>> + 2, 
>> vms->memmap[VIRT_GIC_REDIST].size);
>> +} else {
>> +qemu_fdt_setprop_sized_cells(vms->fdt, "/intc", "reg",
>> + 2, vms->memmap[VIRT_GIC_DIST].base,
>> + 2, vms->memmap[VIRT_GIC_DIST].size,
>> + 2, 
>> vms->memmap[VIRT_GIC_REDIST].base,
>> + 2, 
>> vms->memmap[VIRT_GIC_REDIST].size,
>> + 2, 
>> vms->memmap[VIRT_GIC_REDIST2].base,
>> + 2, 
>> vms->memmap[VIRT_GIC_REDIST2].size);
>> +}
>> +
>>  if (vms->virt) {
>>  qemu_fdt_setprop_cells(vms->fdt, "/intc", "interrupts",
>> GIC_FDT_IRQ_TYPE_PPI, 
>> ARCH_GICV3_MAINT_IRQ,
>> diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
>> index 4ac7ef6..308156f 100644
>> --- a/include/hw/arm/virt.h
>> +++ b/include/hw/arm/virt.h
>> @@ -35,6 +35,8 @@
>>  #include "qemu/notify.h"
>>  #include "hw/boards.h"
>>  #include "hw/arm/arm.h"
>> +#include "sysemu/kvm.h"
>> +#include "hw/intc/arm_gicv3_common.h"
>>  
>>  #define NUM_GICV2M_SPIS   64
>>  #define NUM_VIRTIO_TR

Re: [Qemu-devel] [PATCH v4 07/12] migration: not wait RDMA_CM_EVENT_DISCONNECTED event after rdma_disconnect

2018-05-30 Thread Dr. David Alan Gilbert
* Lidong Chen (jemmy858...@gmail.com) wrote:
> From: Lidong Chen 
> 
> When migration is cancelled during RDMA precopy, the source qemu main thread
> sometimes hangs.
> 
> The backtrace is:
> (gdb) bt
> #0  0x7f249eabd43d in write () from /lib64/libpthread.so.0
> #1  0x7f24a1ce98e4 in rdma_get_cm_event (channel=0x4675d10, event=0x7ffe2f643dd0) at src/cma.c:2189
> #2  0x007b6166 in qemu_rdma_cleanup (rdma=0x6784000) at migration/rdma.c:2296
> #3  0x007b7cae in qio_channel_rdma_close (ioc=0x3bfcc30, errp=0x0) at migration/rdma.c:2999
> #4  0x008db60e in qio_channel_close (ioc=0x3bfcc30, errp=0x0) at io/channel.c:273
> #5  0x007a8765 in channel_close (opaque=0x3bfcc30) at migration/qemu-file-channel.c:98
> #6  0x007a71f9 in qemu_fclose (f=0x527c000) at migration/qemu-file.c:334
> #7  0x00795b96 in migrate_fd_cleanup (opaque=0x3b46280) at migration/migration.c:1162
> #8  0x0093a71b in aio_bh_call (bh=0x3db7a20) at util/async.c:90
> #9  0x0093a7b2 in aio_bh_poll (ctx=0x3b121c0) at util/async.c:118
> #10 0x0093f2ad in aio_dispatch (ctx=0x3b121c0) at util/aio-posix.c:436
> #11 0x0093ab41 in aio_ctx_dispatch (source=0x3b121c0, callback=0x0, user_data=0x0) at util/async.c:261
> #12 0x7f249f73c7aa in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
> #13 0x0093dc5e in glib_pollfds_poll () at util/main-loop.c:215
> #14 0x0093dd4e in os_host_main_loop_wait (timeout=2800) at util/main-loop.c:263
> #15 0x0093de05 in main_loop_wait (nonblocking=0) at util/main-loop.c:522
> #16 0x005bc6a5 in main_loop () at vl.c:1944
> #17 0x005c39b5 in main (argc=56, argv=0x7ffe2f6443f8, envp=0x3ad0030) at vl.c:4752
> 
> Sometimes it does not get the RDMA_CM_EVENT_DISCONNECTED event after
> rdma_disconnect.
> 
> According to the IB spec, once the active side sends a DREQ message, it should
> wait for the DREP message, and only once that has arrived should it trigger a
> DISCONNECT event. The DREP message can be dropped due to network issues.
> For that case the spec defines a DREP_timeout state in the CM state machine:
> if the DREP is dropped we should get a timeout and a TIMEWAIT_EXIT event will
> be triggered. Unfortunately the current kernel CM implementation doesn't
> include the DREP_timeout state, and in the above scenario we will not get
> DISCONNECT or TIMEWAIT_EXIT events.
> 
> So it should not invoke rdma_get_cm_event, which may hang forever, and the
> event channel is also destroyed in qemu_rdma_cleanup.
> 
> Signed-off-by: Lidong Chen 



Reviewed-by: Dr. David Alan Gilbert 

> ---
>  migration/rdma.c   | 12 ++--
>  migration/trace-events |  1 -
>  2 files changed, 2 insertions(+), 11 deletions(-)
> 
> diff --git a/migration/rdma.c b/migration/rdma.c
> index 0dd4033..92e4d30 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -2275,8 +2275,7 @@ static int qemu_rdma_write(QEMUFile *f, RDMAContext 
> *rdma,
>  
>  static void qemu_rdma_cleanup(RDMAContext *rdma)
>  {
> -struct rdma_cm_event *cm_event;
> -int ret, idx;
> +int idx;
>  
>  if (rdma->cm_id && rdma->connected) {
>  if ((rdma->error_state ||
> @@ -2290,14 +2289,7 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
>  qemu_rdma_post_send_control(rdma, NULL, &head);
>  }
>  
> -ret = rdma_disconnect(rdma->cm_id);
> -if (!ret) {
> -trace_qemu_rdma_cleanup_waiting_for_disconnect();
> -ret = rdma_get_cm_event(rdma->channel, &cm_event);
> -if (!ret) {
> -rdma_ack_cm_event(cm_event);
> -}
> -}
> +rdma_disconnect(rdma->cm_id);
>  trace_qemu_rdma_cleanup_disconnect();
>  rdma->connected = false;
>  }
> diff --git a/migration/trace-events b/migration/trace-events
> index 3c798dd..4a768ea 100644
> --- a/migration/trace-events
> +++ b/migration/trace-events
> @@ -146,7 +146,6 @@ qemu_rdma_accept_pin_state(bool pin) "%d"
>  qemu_rdma_accept_pin_verbsc(void *verbs) "Verbs context after listen: %p"
>  qemu_rdma_block_for_wrid_miss(const char *wcompstr, int wcomp, const char 
> *gcompstr, uint64_t req) "A Wanted wrid %s (%d) but got %s (%" PRIu64 ")"
>  qemu_rdma_cleanup_disconnect(void) ""
> -qemu_rdma_cleanup_waiting_for_disconnect(void) ""
>  qemu_rdma_close(void) ""
>  qemu_rdma_connect_pin_all_requested(void) ""
>  qemu_rdma_connect_pin_all_outcome(bool pin) "%d"
> -- 
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH 08/17] iotest 056: skip testcases using blkdebug if disabled

2018-05-30 Thread Max Reitz
On 2018-04-26 18:19, Roman Kagan wrote:
> Signed-off-by: Roman Kagan 
> ---
>  tests/qemu-iotests/056 | 3 +++
>  1 file changed, 3 insertions(+)

TestBeforeWriteNotifier uses blkdebug (and null-co) in its setUp
function.  Maybe you just want to skip the whole test if blkdebug is
disabled.

Then again, I'd argue there are block drivers without which running the
iotests is a bit pointless.  Why not just require blkdebug, null-co/aio,
raw, and file for all of them?

I think the reason why we added the check for quorum support was because
quorum relies on an external library (for hashing), so it isn't trivial
to enable.  But you can always easily whitelist those drivers above if
you want to do testing, so I'm not sure whether it's worth checking
their availability in every test that needs them.
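
To illustrate the alternative, a single up-front check could look roughly
like the sketch below. Everything in it is hypothetical: the BASE_DRIVERS
tuple, missing_base_drivers(), and the hard-coded 'available' set standing
in for whatever QEMU reports. It is not code from this series.

```python
# Hypothetical: drivers assumed necessary for any meaningful iotests run.
BASE_DRIVERS = ('blkdebug', 'null-co', 'null-aio', 'raw', 'file')

def missing_base_drivers(available, required=BASE_DRIVERS):
    """Return the required drivers absent from 'available', in order."""
    return [drv for drv in required if drv not in available]

# 'available' would come from QEMU; here it is a made-up example set.
available = {'raw', 'file', 'qcow2', 'null-co'}
missing = missing_base_drivers(available)
if missing:
    print('iotests need these drivers whitelisted: %s' % ', '.join(missing))
```

A check like this would run once for the whole suite, leaving per-test
checks for drivers with external dependencies such as quorum.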

Max





Re: [Qemu-devel] [PATCH 00/17] iotests: don't choke on disabled drivers

2018-05-30 Thread Max Reitz
On 2018-04-26 18:19, Roman Kagan wrote:
> Some iotests assume availability of certain block drivers, and fail if
> the driver is not supported by QEMU because it was disabled at configure
> time.
> 
> This series tries to address that, by making QEMU report the actual list
> of supported block drivers in response to "-drive format=?", and using
> this information to skip the parts of the io testsuite that can not be
> run in this configuration.
> 
> Roman Kagan (17):
>   block: iterate_format with account of whitelisting
>   iotests: iotests.py: prevent deadlock in subprocess
>   iotests: ask qemu for supported formats
>   iotest 030: skip quorum test setup/teardown too
>   iotest 030: require blkdebug
>   iotest 055: skip unsupported backup target formats
>   iotest 055: require blkdebug
>   iotest 056: skip testcases using blkdebug if disabled
>   iotest 071: notrun if blkdebug or blkverify is disabled
>   iotest 081: notrun if quorum is disabled
>   iotest 087: notrun if null-co is disabled
>   iotest 093: notrun if null-co or null-aio is disabled
>   iotest 099: notrun if blkdebug or blkverify is disabled
>   iotest 124: skip testcases using blkdebug if disabled
>   iotest 139: skip testcases using disabled drivers
>   iotest 147: notrun if nbd is disabled
>   iotest 184: notrun if null-co or throttle is disabled
> 
>  include/block/block.h |  2 +-
>  block.c   | 23 ++
>  blockdev.c|  4 +++-
>  qemu-img.c|  2 +-
>  tests/qemu-iotests/030|  7 +++
>  tests/qemu-iotests/055| 13 
>  tests/qemu-iotests/056|  3 +++
>  tests/qemu-iotests/071|  1 +
>  tests/qemu-iotests/081|  1 +
>  tests/qemu-iotests/087|  1 +
>  tests/qemu-iotests/093|  1 +
>  tests/qemu-iotests/099|  1 +
>  tests/qemu-iotests/124|  5 +
>  tests/qemu-iotests/139|  4 
>  tests/qemu-iotests/147|  1 +
>  tests/qemu-iotests/184|  1 +
>  tests/qemu-iotests/common.rc  | 19 ++
>  tests/qemu-iotests/iotests.py | 46 
> ---
>  18 files changed, 117 insertions(+), 18 deletions(-)

I'll stop reviewing this series for now, because there are more iotests
that use drivers outside of their format/protocol combination.

For instance:

$ grep -l null-co ??? | wc -l
15
$ grep -l blkdebug ??? | wc -l
30
$ (grep -l '"raw"' ???; grep -l "'raw'" ???) | wc -l
22

As I've written in my reply to patch 8, I'm not sure whether it's the
right solution to check for the availability of these block drivers in
every single test that needs them.  It makes sense for quorum, because
quorum needs an external library for hashing, so it may not be trivial
to enable.  But it does not seem too useful for other formats that do
not have such a dependency (e.g. null-co, blkdebug, raw).

The thing is that it's OK to whitelist everything for testing, and then
disable some drivers when building a release.  I don't think one needs
to run the iotests with the release version if the whole difference is
whether some drivers have been disabled or not.

Max





Re: [Qemu-devel] [PATCH 0/4] aspeed: add MMIO exec support to the FMC controller

2018-05-30 Thread Peter Maydell
On 30 May 2018 at 08:49, Cédric Le Goater  wrote:
> Hello,
>
> When MMIO execution support is active, these changes let the Aspeed
> SoC machine boot directly from CE0. As there are still some
> issues, the feature is disabled by default and should be activated
> with:
>
> -global driver=aspeed.smc,property=mmio-exec,value=true

I'd really rather not add another mmio-exec device until
we've sorted out making it actually work properly...

thanks
-- PMM



Re: [Qemu-devel] [[Qemu devel] RFC] hw/net: Add Smartfusion2 emac block

2018-05-30 Thread sundeep subbaraya
Hi Philippe,

On Sun, May 27, 2018 at 8:56 AM, Philippe Mathieu-Daudé  wrote:
> On 05/26/2018 06:53 AM, Subbaraya Sundeep wrote:
>> Modelled Ethernet MAC of Smartfusion2 SoC.
>> Micrel KSZ8051 PHY is present on Emcraft's SOM kit hence same
>> PHY is emulated.
>>
>> Signed-off-by: Subbaraya Sundeep 
>> ---
>>  hw/arm/msf2-soc.c |  21 +-
>>  hw/net/Makefile.objs  |   1 +
>>  hw/net/mss-emac.c | 544 
>> ++
>>  include/hw/arm/msf2-soc.h |   3 +
>>  include/hw/net/mss-emac.h |  23 ++
>>  5 files changed, 591 insertions(+), 1 deletion(-)
>>  create mode 100644 hw/net/mss-emac.c
>>  create mode 100644 include/hw/net/mss-emac.h
>>
>> diff --git a/hw/arm/msf2-soc.c b/hw/arm/msf2-soc.c
>> index 75c44ad..ed3d0f5 100644
>> --- a/hw/arm/msf2-soc.c
>> +++ b/hw/arm/msf2-soc.c
>> @@ -35,6 +35,7 @@
>>
>>  #define MSF2_TIMER_BASE   0x40004000
>>  #define MSF2_SYSREG_BASE  0x40038000
>> +#define MSF2_EMAC_BASE0x40041000
>>
>>  #define ENVM_BASE_ADDRESS 0x6000
>>
>> @@ -55,6 +56,7 @@ static const uint32_t uart_addr[MSF2_NUM_UARTS] = { 
>> 0x4000 , 0x4001 };
>>  static const int spi_irq[MSF2_NUM_SPIS] = { 2, 3 };
>>  static const int uart_irq[MSF2_NUM_UARTS] = { 10, 11 };
>>  static const int timer_irq[MSF2_NUM_TIMERS] = { 14, 15 };
>> +static const int emac_irq[MSF2_NUM_EMACS] = { 12 };
>>
>>  static void do_sys_reset(void *opaque, int n, int level)
>>  {
>> @@ -82,6 +84,13 @@ static void m2sxxx_soc_initfn(Object *obj)
>>TYPE_MSS_SPI);
>>  qdev_set_parent_bus(DEVICE(&s->spi[i]), sysbus_get_default());
>>  }
>> +
>> +object_initialize(&s->emac, sizeof(s->emac), TYPE_MSS_EMAC);
>> +qdev_set_parent_bus(DEVICE(&s->emac), sysbus_get_default());
>> +if (nd_table[0].used) {
>> +qemu_check_nic_model(&nd_table[0], TYPE_MSS_EMAC);
>> +qdev_set_nic_properties(DEVICE(&s->emac), &nd_table[0]);
>> +}
>>  }
>>
>>  static void m2sxxx_soc_realize(DeviceState *dev_soc, Error **errp)
>> @@ -192,6 +201,17 @@ static void m2sxxx_soc_realize(DeviceState *dev_soc, 
>> Error **errp)
>>  g_free(bus_name);
>>  }
>>
>> +dev = DEVICE(&s->emac);
>> +object_property_set_bool(OBJECT(&s->emac), true, "realized", &err);
>> +if (err != NULL) {
>> +error_propagate(errp, err);
>> +return;
>> +}
>> +busdev = SYS_BUS_DEVICE(dev);
>> +sysbus_mmio_map(busdev, 0, MSF2_EMAC_BASE);
>> +sysbus_connect_irq(busdev, 0,
>> +   qdev_get_gpio_in(armv7m, emac_irq[0]));
>> +
>>  /* Below devices are not modelled yet. */
>>  create_unimplemented_device("i2c_0", 0x40002000, 0x1000);
>>  create_unimplemented_device("dma", 0x40003000, 0x1000);
>> @@ -202,7 +222,6 @@ static void m2sxxx_soc_realize(DeviceState *dev_soc, 
>> Error **errp)
>>  create_unimplemented_device("can", 0x40015000, 0x1000);
>>  create_unimplemented_device("rtc", 0x40017000, 0x1000);
>>  create_unimplemented_device("apb_config", 0x4002, 0x1);
>> -create_unimplemented_device("emac", 0x40041000, 0x1000);
>>  create_unimplemented_device("usb", 0x40043000, 0x1000);
>>  }
>>
>> diff --git a/hw/net/Makefile.objs b/hw/net/Makefile.objs
>> index ab22968..d9b4cae 100644
>> --- a/hw/net/Makefile.objs
>> +++ b/hw/net/Makefile.objs
>> @@ -48,3 +48,4 @@ common-obj-$(CONFIG_ROCKER) += rocker/rocker.o 
>> rocker/rocker_fp.o \
>>  obj-$(call lnot,$(CONFIG_ROCKER)) += rocker/qmp-norocker.o
>>
>>  common-obj-$(CONFIG_CAN_BUS) += can/
>> +common-obj-$(CONFIG_MSF2) += mss-emac.o
>> diff --git a/hw/net/mss-emac.c b/hw/net/mss-emac.c
>> new file mode 100644
>> index 000..a9588c0
>> --- /dev/null
>> +++ b/hw/net/mss-emac.c
>> @@ -0,0 +1,544 @@
>> +/*
>> + * QEMU model of the Smartfusion2 Ethernet MAC.
>> + *
>> + * Copyright (c) 2018 Subbaraya Sundeep .
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a 
>> copy
>> + * of this software and associated documentation files (the "Software"), to 
>> deal
>> + * in the Software without restriction, including without limitation the 
>> rights
>> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
>> + * copies of the Software, and to permit persons to whom the Software is
>> + * furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice shall be included 
>> in
>> + * all copies or substantial portions of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
>> OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
>> OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
>> FROM,
>> + * OUT OF OR IN CONNECTION WITH THE SOF

Re: [Qemu-devel] [PATCH 0/4] aspeed: add MMIO exec support to the FMC controller

2018-05-30 Thread Cédric Le Goater
On 05/30/2018 02:40 PM, Peter Maydell wrote:
> On 30 May 2018 at 08:49, Cédric Le Goater  wrote:
>> Hello,
>>
>> When MMIO execution support is active, these changes let the Aspeed
>> SoC machine boot directly from CE0. As there are still some
>> issues, the feature is disabled by default and should be activated
>> with:
>>
>> -global driver=aspeed.smc,property=mmio-exec,value=true
> 
> I'd really rather not add another mmio-exec device until
> we've sorted out making it actually work properly...

OK. I will keep it for later. The FW of the witherspoon machine is 
a very good mmio-exec torture test. 

Thanks,

C. 




Re: [Qemu-devel] [PATCH v7 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT

2018-05-30 Thread Michael S. Tsirkin
On Wed, May 30, 2018 at 05:12:09PM +0800, Wei Wang wrote:
> On 05/29/2018 11:24 PM, Michael S. Tsirkin wrote:
> > On Tue, Apr 24, 2018 at 02:13:47PM +0800, Wei Wang wrote:
> > > +/*
> > > + * Balloon will report pages which were free at the time of this call. 
> > > As the
> > > + * reporting happens asynchronously, dirty bit logging must be enabled 
> > > before
> > > + * this call is made.
> > > + */
> > > +void balloon_free_page_start(void)
> > > +{
> > > +balloon_free_page_start_fn(balloon_opaque);
> > > +}
> > Please create notifier support, not a single global.
> 
> OK. The start is called at the end of bitmap_sync, and the stop is called at
> the beginning of bitmap_sync. In this case, we will need to add two
> migration states, MIGRATION_STATUS_BEFORE_BITMAP_SYNC and
> MIGRATION_STATUS_AFTER_BITMAP_SYNC, right?

If that's the way you do it, you need to ask migration guys, not me.

> 
> > 
> > +static void virtio_balloon_poll_free_page_hints(void *opaque)
> > +{
> > +VirtQueueElement *elem;
> > +VirtIOBalloon *dev = opaque;
> > +VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> > +VirtQueue *vq = dev->free_page_vq;
> > +uint32_t id;
> > +size_t size;
> > +
> > +while (1) {
> > +qemu_mutex_lock(&dev->free_page_lock);
> > +while (dev->block_iothread) {
> > +qemu_cond_wait(&dev->free_page_cond, &dev->free_page_lock);
> > +}
> > +
> > +/*
> > + * If the migration thread actively stops the reporting, exit
> > + * immediately.
> > + */
> > +if (dev->free_page_report_status == FREE_PAGE_REPORT_S_STOP) {
> > Please refactor this: move loop body into a function so
> > you can do lock/unlock in a single place.
> 
> Sounds good.
> 
> > 
> > +
> > +static bool virtio_balloon_free_page_support(void *opaque)
> > +{
> > +VirtIOBalloon *s = opaque;
> > +VirtIODevice *vdev = VIRTIO_DEVICE(s);
> > +
> > +return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT);
> > or if poison is negotiated.
> 
> Will make it
> return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT) &&
> !virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_PAGE_POISON)


I mean the reverse:
virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT) ||
virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_PAGE_POISON)


If poison has been negotiated you must migrate the
guest supplied value even if you don't use it for hints.


> 
> 
> Best,
> Wei



[Qemu-devel] [PATCH] fix Segmentation fault when emulating a bluetooth device 'dev'

2018-05-30 Thread Fei Li
bt_hid_send_data() does not check whether its first parameter, ch, is
NULL. This causes a segmentation fault when ch is NULL, because
ch->remote_mtu is dereferenced later in the function. This happens,
e.g., when it is called by bt_hid_datain() while hid->interrupt is
NULL, with "qemu-system-x86_64 ... \ -bt device:keyboard,vlan=3".

Add a NULL check to avoid the crash.

Signed-off-by: Fei Li 
---
 hw/bt/hid.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/bt/hid.c b/hw/bt/hid.c
index 056291f9b5..dfb0ea9c9b 100644
--- a/hw/bt/hid.c
+++ b/hw/bt/hid.c
@@ -174,6 +174,9 @@ static void bt_hid_send_data(struct bt_l2cap_conn_params_s 
*ch, int type,
 uint8_t *pkt, hdr = (BT_DATA << 4) | type;
 int plen;
 
+if (!ch)
+return;
+
 do {
 plen = MIN(len, ch->remote_mtu - 1);
 pkt = ch->sdu_out(ch, plen + 1);
-- 
2.13.6



