[Qemu-devel] [PATCH v2 1/2] slirp: Add "query-usernet" QMP command

2018-02-26 Thread Fam Zheng
HMP "info usernet" has been available but it isn't ideal for programmatic
use cases. This closes the gap in QMP by adding a counterpart
"query-usernet" command. It is basically translated from
the HMP slirp_connection_info() loop, which now calls the QMP
implementation and prints the data, just like other HMP info_* commands.

The TCPS_* macros are now defined as a QAPI enum.
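
For readers unfamiliar with the list-building idiom used by qmp_query_usernet() in this patch, here is a minimal standalone sketch of appending to a singly linked list through a tail pointer, so that entries keep their iteration order. The IntList type and the int_list_append() and int_list_build() helpers are hypothetical stand-ins for illustration; they are not QEMU or QAPI names.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a QAPI-generated list type such as UsernetInfoList. */
typedef struct IntList {
    int value;
    struct IntList *next;
} IntList;

/*
 * Append a node through a tail pointer. *tail always points at the
 * 'next' slot (or the list head) where the following node belongs.
 * Returns the new tail so that the caller can keep appending.
 */
static IntList **int_list_append(IntList **tail, int value)
{
    IntList *node = calloc(1, sizeof(*node));

    if (!node) {
        abort();
    }
    node->value = value;
    *tail = node;          /* link the node in at the end */
    return &node->next;    /* the next append goes after this node */
}

/* Build the list 0, 1, ..., n - 1 in insertion order. */
static IntList *int_list_build(int n)
{
    IntList *list = NULL;
    IntList **p = &list;
    int i;

    for (i = 0; i < n; i++) {
        p = int_list_append(p, i);
    }
    return list;
}
```

Appending this way avoids both a second pass to reverse the list and a walk to the end on every insertion.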

Signed-off-by: Fam Zheng 
---
 net/slirp.c  |  26 +++
 qapi/net.json| 201 +++
 slirp/libslirp.h |   1 +
 slirp/misc.c | 156 +-
 slirp/tcp.h  |  15 -
 5 files changed, 339 insertions(+), 60 deletions(-)

diff --git a/net/slirp.c b/net/slirp.c
index 8991816bbf..415f967f99 100644
--- a/net/slirp.c
+++ b/net/slirp.c
@@ -36,6 +36,7 @@
 #include "monitor/monitor.h"
 #include "qemu/error-report.h"
 #include "qemu/sockets.h"
+#include "slirp/slirp.h"
 #include "slirp/libslirp.h"
 #include "slirp/ip6.h"
 #include "chardev/char-fe.h"
@@ -43,6 +44,7 @@
 #include "qemu/cutils.h"
 #include "qapi/error.h"
 #include "qapi/qmp/qdict.h"
+#include "qmp-commands.h"
 
 static int get_str_sep(char *buf, int buf_size, const char **pp, int sep)
 {
@@ -864,6 +866,30 @@ static int slirp_guestfwd(SlirpState *s, const char *config_str,
     return -1;
 }
 
+UsernetInfoList *qmp_query_usernet(Error **errp)
+{
+    SlirpState *s;
+    UsernetInfoList *list = NULL;
+    UsernetInfoList **p = &list;
+
+    QTAILQ_FOREACH(s, &slirp_stacks, entry) {
+        int vlan;
+        UsernetInfoList *il = g_new0(UsernetInfoList, 1);
+        UsernetInfo *info = il->value = g_new0(UsernetInfo, 1);
+
+        info->id = g_strdup(s->nc.name);
+        if (!net_hub_id_for_client(&s->nc, &vlan)) {
+            info->vlan = vlan;
+        } else {
+            info->vlan = -1;
+        }
+        usernet_get_info(s->slirp, info);
+        *p = il;
+        p = &il->next;
+    }
+    return list;
+}
+
 void hmp_info_usernet(Monitor *mon, const QDict *qdict)
 {
     SlirpState *s;
diff --git a/qapi/net.json b/qapi/net.json
index 1238ba5de1..26b2674ffa 100644
--- a/qapi/net.json
+++ b/qapi/net.json
@@ -706,3 +706,204 @@
 ##
 { 'event': 'NIC_RX_FILTER_CHANGED',
   'data': { '*name': 'str', 'path': 'str' } }
+
+##
+# @TCPS:
+#
+# TCP States of a SLIRP connection.
+#
+# - States where connections are not established: none, closed, listen,
+#   syn_sent, syn_received
+#
+# - States where user has closed: fin_wait_1, closing, last_ack, fin_wait_2,
+#   time_wait
+#
+# - States awaiting ACK of FIN: fin_wait_1, closing, last_ack
+#
+# 'none' state is used only for host forwarding
+#
+# Since: 2.12
+#
+##
+{ 'enum': 'TCPS',
+  'data':
+   ['closed',
+'listen',
+'syn_sent',
+'syn_received',
+'established',
+'close_wait',
+'fin_wait_1',
+'closing',
+'last_ack',
+'fin_wait_2',
+'time_wait',
+'none'
+   ] }
+
+##
+# @UsernetTCPConnection:
+#
+# SLIRP TCP information.
+#
+# @state: tcp connection state
+#
+# @hostfwd: whether this connection has host port forwarding
+#
+# @fd: the file descriptor of the connection
+#
+# @src_addr: source address of host port forwarding
+#
+# @src_port: source port of host port forwarding
+#
+# @dest_addr: destination address of host port forwarding
+#
+# @dest_port: destination port of host port forwarding
+#
+# @recv_buffered: number of bytes queued in the receive buffer
+#
+# @send_buffered: number of bytes queued in the send buffer
+#
+# Since: 2.12
+##
+{ 'struct': 'UsernetTCPConnection',
+  'data': {
+'state': 'TCPS',
+'hostfwd': 'bool',
+'fd': 'int',
+'src_addr': 'str',
+'src_port': 'int',
+'dest_addr': 'str',
+'dest_port': 'int',
+'recv_buffered': 'int',
+'send_buffered': 'int'
+  } }
+
+##
+# @UsernetUDPConnection:
+#
+# SLIRP UDP information.
+#
+# @hostfwd: whether this connection has host port forwarding
+#
+# @expire_time_ms: time in milliseconds after which this connection will expire
+#
+# @fd: the file descriptor of the connection
+#
+# @src_addr: source address of host port forwarding
+#
+# @src_port: source port of host port forwarding
+#
+# @dest_addr: destination address of host port forwarding
+#
+# @dest_port: destination port of host port forwarding
+#
+# @recv_buffered: number of bytes queued in the receive buffer
+#
+# @send_buffered: number of bytes queued in the send buffer
+#
+# Since: 2.12
+##
+{ 'struct': 'UsernetUDPConnection',
+  'data': {
+'hostfwd': 'bool',
+'expire_time_ms': 'int',
+'fd': 'int',
+'src_addr': 'str',
+'src_port': 'int',
+'dest_addr': 'str',
+'dest_port': 'int',
+'recv_buffered': 'int',
+'send_buffered': 'int'
+} }
+
+##
+# @UsernetICMPConnection:
+#
+# SLIRP ICMP information.
+#
+# @expire_time_ms: time in milliseconds after which this connection will expire
+#
+# @fd: the file descriptor of the connection
+#
+# @src_addr: source address of host port forwarding
+#
+# @dest_addr: destin

[Qemu-devel] [PATCH qemu v2] qmp: Add qom-list-properties to list QOM object properties

2018-02-26 Thread Alexey Kardashevskiy
There is already 'device-list-properties' which does most of the job,
however it does not handle everything returned by qom-list-types such
as machines as they inherit directly from TYPE_OBJECT and not TYPE_DEVICE.
It does not handle abstract classes either.

This adds a new qom-list-properties command which prints properties
of a specific class and its instance. It is pretty much a simplified copy
of the device-list-properties handler.

Since it creates an object instance, device properties should appear
in the output as they are copied to QOM properties at the instance_init
hook.

This adds an object_class_property_iter_init() helper to allow class
property enumeration and uses it in the new QMP command to allow listing
properties of abstract classes.
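
As a rough illustration of why class-level enumeration works for abstract types, the sketch below walks a class and its parents without creating any instance. ClassSketch, its fields, the sample property names and count_class_props() are hypothetical names for this example only; the real helper is object_class_property_iter_init() and operates on QOM's ObjectClass.

```c
#include <stddef.h>

/* Hypothetical, simplified model of a class with a parent link. */
typedef struct ClassSketch {
    const char *name;
    const struct ClassSketch *parent;  /* NULL for the root class */
    const char *const *props;          /* NULL-terminated property names */
    int is_abstract;                   /* abstract classes cannot be instantiated */
} ClassSketch;

/*
 * Count properties registered on a class and all of its parents.
 * No instance is needed, so this also works when is_abstract is set.
 */
static size_t count_class_props(const ClassSketch *klass)
{
    size_t n = 0;

    for (; klass; klass = klass->parent) {
        const char *const *p;

        for (p = klass->props; p && *p; p++) {
            n++;
        }
    }
    return n;
}

/* Illustrative data: two abstract classes in a parent/child relation. */
static const char *const object_props[] = { "type", NULL };
static const char *const device_props[] = { "realized", "hotpluggable", NULL };

static const ClassSketch object_class = { "object", NULL, object_props, 1 };
static const ClassSketch device_class = { "device", &object_class, device_props, 1 };
```

Properties that are only added per instance in instance_init would not show up in such a class-only walk, which fits the commit message's reason for still creating an instance when the class is not abstract.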

Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v2:
* added abstract classes support, now things like "pci-device" or
"spapr-machine" show properties, previously these would produce
an "abstract class" error
---
 qapi-schema.json | 29 +
 include/qom/object.h | 16 
 qmp.c| 49 +
 qom/object.c |  7 +++
 4 files changed, 101 insertions(+)

diff --git a/qapi-schema.json b/qapi-schema.json
index 0262b9f..fa5f189 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1455,6 +1455,35 @@
   'returns': [ 'DevicePropertyInfo' ] }
 
 ##
+# @QOMPropertyInfo:
+#
+# Information about object properties.
+#
+# @name: the name of the property
+# @type: the typename of the property
+# @description: if specified, the description of the property.
+#
+# Since: 2.12
+##
+{ 'struct': 'QOMPropertyInfo',
+  'data': { 'name': 'str', 'type': 'str', '*description': 'str' } }
+
+##
+# @qom-list-properties:
+#
+# List properties associated with a QOM object.
+#
+# @typename: the type name of an object
+#
+# Returns: a list of QOMPropertyInfo describing object properties
+#
+# Since: 2.12
+##
+{ 'command': 'qom-list-properties',
+  'data': { 'typename': 'str'},
+  'returns': [ 'QOMPropertyInfo' ] }
+
+##
 # @xen-set-global-dirty-log:
 #
 # Enable or disable the global dirty log mode.
diff --git a/include/qom/object.h b/include/qom/object.h
index dc73d59..ef07d78 100644
--- a/include/qom/object.h
+++ b/include/qom/object.h
@@ -1017,6 +1017,22 @@ void object_property_iter_init(ObjectPropertyIterator *iter,
                                Object *obj);
 
 /**
+ * object_class_property_iter_init:
+ * @klass: the class
+ *
+ * Initializes an iterator for traversing all properties
+ * registered against an object class and all parent classes.
+ *
+ * It is forbidden to modify the property list while iterating,
+ * whether removing or adding properties.
+ *
+ * This can be used on abstract classes as it does not create a temporary
+ * instance.
+ */
+void object_class_property_iter_init(ObjectPropertyIterator *iter,
+                                     ObjectClass *klass);
+
+/**
  * object_property_iter_next:
  * @iter: the iterator instance
  *
diff --git a/qmp.c b/qmp.c
index 793f6f3..151d3d7 100644
--- a/qmp.c
+++ b/qmp.c
@@ -576,6 +576,55 @@ DevicePropertyInfoList *qmp_device_list_properties(const char *typename,
     return prop_list;
 }
 
+QOMPropertyInfoList *qmp_qom_list_properties(const char *typename,
+                                             Error **errp)
+{
+    ObjectClass *klass;
+    Object *obj = NULL;
+    ObjectProperty *prop;
+    ObjectPropertyIterator iter;
+    QOMPropertyInfoList *prop_list = NULL;
+
+    klass = object_class_by_name(typename);
+    if (klass == NULL) {
+        error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
+                  "Class '%s' not found", typename);
+        return NULL;
+    }
+
+    klass = object_class_dynamic_cast(klass, TYPE_OBJECT);
+    if (klass == NULL) {
+        error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "typename", TYPE_OBJECT);
+        return NULL;
+    }
+
+    if (object_class_is_abstract(klass)) {
+        object_class_property_iter_init(&iter, klass);
+    } else {
+        obj = object_new(typename);
+        object_property_iter_init(&iter, obj);
+    }
+    while ((prop = object_property_iter_next(&iter))) {
+        QOMPropertyInfo *info;
+        QOMPropertyInfoList *entry;
+
+        info = g_malloc0(sizeof(*info));
+        info->name = g_strdup(prop->name);
+        info->type = g_strdup(prop->type);
+        info->has_description = !!prop->description;
+        info->description = g_strdup(prop->description);
+
+        entry = g_malloc0(sizeof(*entry));
+        entry->value = info;
+        entry->next = prop_list;
+        prop_list = entry;
+    }
+
+    object_unref(obj);
+
+    return prop_list;
+}
+
 CpuDefinitionInfoList *qmp_query_cpu_definitions(Error **errp)
 {
     return arch_query_cpu_definitions(errp);
diff --git a/qom/object.c b/qom/object.c
index 5dcee46..e7978bd 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -1037,6 +1037,13 @@ ObjectProperty 
*object_property_it

Re: [Qemu-devel] [PATCH v4 4/7] qdev: add hotpluggable to DeviceState

2018-02-26 Thread Gerd Hoffmann
  Hi,

> > The connection between QemuConsole and User Interface (i.e. gtk, spice,
> > ...) is a bit more flexible.  But also not really designed for hotplug
> > as QemuConsole is not hotpluggable in the first place ...
> > 
> > We could drop the display property and use two devices instead.
> > 
> >   new vfio-pci would behave like display=off with this series.
> >   added vfio-pci-display has display=on behavior.
> >   display=auto is not possible.
> 
> I expect libvirt and above would balk at creating a separate QEMU
> device for this purpose, easy for QEMU, hard for anything that manages
> QEMU.

Now as you've mentioned libvirt I remember we had the same discussion
before, with usb host adapters.  The uhci and ehci controllers have a
similar issue:  If they are configured as companion setup (ehci for usb2
and uhci for usb1) they can't be hotplugged, whereas hotplugging works
fine for a standalone controller.

We ended up with splitting the controllers into two groups:  The ones
which can be used in a companion setup (basically all ich9-* devices)
which are not hotpluggable.   And the other ones which don't support
companion setups but can be hotplugged.  Commits:

   ec56214f6f usb: tag standalone ehci as hotpluggable
   638ca939d8 usb: tag standalone uhci as hotpluggable

The argument from the libvirt side was that it is actually easier for
them to handle if hotplugging is a fixed property of a device and
doesn't change magically depending on device configuration.  First
because they can query qemu then whether a given device can be
hotplugged or not, and second because it'll work for both plug-in and
plug-out.
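
A minimal sketch of the "fixed property" argument: if hotpluggability is stored per device type rather than derived from configuration, a management tool can answer the question with a simple lookup before creating any device. The type names, the table contents and the type_is_hotpluggable() helper are illustrative assumptions, not QEMU code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative per-type capability table; names and entries are made up. */
typedef struct DeviceTypeSketch {
    const char *name;
    bool hotpluggable;   /* fixed for the type, independent of options */
} DeviceTypeSketch;

static const DeviceTypeSketch device_types[] = {
    { "companion-capable-ehci", false },
    { "standalone-ehci",        true  },
};

/* Returns true only if the type is known and tagged hotpluggable. */
static bool type_is_hotpluggable(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(device_types) / sizeof(device_types[0]); i++) {
        if (strcmp(device_types[i].name, name) == 0) {
            return device_types[i].hotpluggable;
        }
    }
    return false;
}
```

The same lookup gives the same answer for plug-in and plug-out, which is the point made above.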

So this patch isn't going to fly, and unless someone can come up with a
better idea I'll go use the two-devices approach.

cheers,
  Gerd




Re: [Qemu-devel] [PATCH v4 0/7] vfio: add display support

2018-02-26 Thread Gerd Hoffmann
On Fri, Feb 23, 2018 at 10:05:17AM +0100, Gerd Hoffmann wrote:
> > Hi Gerd,
> > 
> > It's a little bit concerning that the only way we can test the
> > region-based display support is with proprietary drivers that nobody
> > but NVIDIA has at this point.  Have you considered adding region-based
> > display support to the mdev sample tty driver?  I know it sounds
> > ridiculous for a serial device to have a display, but the vfio display
> > region support isn't really tied to the functionality of the base mdev
> > device.  We could have it simply display a static test pattern, just so
> > we can test the end to end code path without a dependency on a closed
> > vendor driver.
> 
> Hmm, have to think about that.  Some way to change the display content
> would be nice as you can see whether display updates are working then.

https://www.kraxel.org/cgit/linux/log/?h=vfio-sample-display

Comes with host mdev driver and guest framebuffer driver.

enjoy,
  Gerd




Re: [Qemu-devel] [PATCH qemu v7 2/4] vfio/pci: Relax DMA map errors for MMIO regions

2018-02-26 Thread Alexey Kardashevskiy
On 19/02/18 13:46, Alexey Kardashevskiy wrote:
> On 16/02/18 16:28, David Gibson wrote:
>> On Wed, Feb 14, 2018 at 08:55:41AM -0700, Alex Williamson wrote:
>>> On Wed, 14 Feb 2018 19:09:16 +1100
>>> Alexey Kardashevskiy  wrote:
>>>
>>>> On 14/02/18 12:33, David Gibson wrote:
>>>>> On Tue, Feb 13, 2018 at 07:20:56PM +1100, Alexey Kardashevskiy wrote:
>>>>>> On 13/02/18 16:41, David Gibson wrote:
>>>>>>> On Tue, Feb 13, 2018 at 04:36:30PM +1100, David Gibson wrote:
>>>>>>>> On Tue, Feb 13, 2018 at 12:15:52PM +1100, Alexey Kardashevskiy wrote:
>>>>>>>>> On 13/02/18 03:06, Alex Williamson wrote:
>>>>>>>>>> On Mon, 12 Feb 2018 18:05:54 +1100
>>>>>>>>>> Alexey Kardashevskiy  wrote:
>>>>>>>>>>
>>>>>>>>>>> On 12/02/18 16:19, David Gibson wrote:
>>>>>>>>>>>> On Fri, Feb 09, 2018 at 06:55:01PM +1100, Alexey Kardashevskiy wrote:
>>>>>>>>>>>>> At the moment if vfio_memory_listener is registered in the system memory
>>>>>>>>>>>>> address space, it maps/unmaps every RAM memory region for DMA.
>>>>>>>>>>>>> It expects system page size aligned memory sections so vfio_dma_map
>>>>>>>>>>>>> would not fail and so far this has been the case. A mapping failure
>>>>>>>>>>>>> would be fatal. A side effect of such behavior is that some MMIO pages
>>>>>>>>>>>>> would not be mapped silently.
>>>>>>>>>>>>>
>>>>>>>>>>>>> However we are going to change MSIX BAR handling so we will end having
>>>>>>>>>>>>> non-aligned sections in vfio_memory_listener (more details is in
>>>>>>>>>>>>> the next patch) and vfio_dma_map will exit QEMU.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In order to avoid fatal failures on what previously was not a failure and
>>>>>>>>>>>>> was just silently ignored, this checks the section alignment to
>>>>>>>>>>>>> the smallest supported IOMMU page size and prints an error if not aligned;
>>>>>>>>>>>>> it also prints an error if vfio_dma_map failed despite the page size check.
>>>>>>>>>>>>> Both errors are not fatal; only MMIO RAM regions are checked
>>>>>>>>>>>>> (aka "RAM device" regions).
>>>>>>>>>>>>>
>>>>>>>>>>>>> If the amount of errors printed is overwhelming, the MSIX relocation
>>>>>>>>>>>>> could be used to avoid excessive error output.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is unlikely to cause any behavioral change.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Signed-off-by: Alexey Kardashevskiy
>>>>>>>>>>>>
>>>>>>>>>>>> There are some relatively superficial problems noted below.
>>>>>>>>>>>>
>>>>>>>>>>>> But more fundamentally, this feels like it's extending an existing
>>>>>>>>>>>> hack past the point of usefulness.
>>>>>>>>>>>>
>>>>>>>>>>>> The explicit check for is_ram_device() here has always bothered me -
>>>>>>>>>>>> it's not like a real bus bridge magically knows whether a target
>>>>>>>>>>>> address maps to RAM or not.
>>>>>>>>>>>>
>>>>>>>>>>>> What I think is really going on is that even for systems without an
>>>>>>>>>>>> IOMMU, it's not really true to say that the PCI address space maps
>>>>>>>>>>>> directly onto address_space_memory.  Instead, there's a large, but
>>>>>>>>>>>> much less than 2^64 sized, "upstream window" at address 0 on the PCI
>>>>>>>>>>>> bus, which is identity mapped to the system bus.  Details will vary
>>>>>>>>>>>> with the system, but in practice we expect nothing but RAM to be in
>>>>>>>>>>>> that window.  Addresses not within that window won't be mapped to the
>>>>>>>>>>>> system bus but will just be broadcast on the PCI bus and might be
>>>>>>>>>>>> picked up as a p2p transaction.
>>>>>>>>>>>
>>>>>>>>>>> Currently this p2p works only via the IOMMU, direct p2p is not possible as
>>>>>>>>>>> the guest needs to know physical MMIO addresses to make p2p work and it
>>>>>>>>>>> does not.
>>>>>>>>>>
>>>>>>>>>> /me points to the Direct Translated P2P section of the ACS spec, though
>>>>>>>>>> it's as prone to spoofing by the device as ATS.  In any case, p2p
>>>>>>>>>> reflected from the IOMMU is still p2p and offloads the CPU even if
>>>>>>>>>> bandwidth suffers vs bare metal depending on if the data doubles back
>>>>>>>>>> over any links.  Thanks,
>>>>>>>>>
>>>>>>>>> Sure, I was just saying that p2p via IOMMU won't be as simple as broadcast
>>>>>>>>> on the PCI bus, IOMMU needs to be programmed in advance to make this work,
>>>>>>>>> and current that broadcast won't work for the passed through devices.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Well, sure, p2p in a guest with passthrough devices clearly needs to
>>>>>>>> be translated through the IOMMU (and p2p from a passthrough to an
>>>>>>>> emulated device is essentially impossible).
>>>>>>>>
>>>>>>>> But.. what does that have to do with this code.  This is the memory
>>>>>>>> area watcher, looking for memory regions being mapped directly into
>>>>>>>> the PCI space.  NOT IOMMU regions, si
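
The alignment check described in the quoted commit message above can be sketched as follows. This assumes the IOMMU reports its supported page sizes as a bitmask with one bit per size; the function names are hypothetical and the code is not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Smallest page size in a bitmask of supported sizes (0 if the mask is empty). */
static uint64_t min_iommu_pgsize(uint64_t pgsizes)
{
    return pgsizes & (~pgsizes + 1);   /* isolate the lowest set bit */
}

/*
 * True if both the start and the size of a section are multiples of the
 * smallest supported page size, i.e. the section could be mapped for DMA.
 */
static bool section_is_aligned(uint64_t start, uint64_t size, uint64_t pgsizes)
{
    uint64_t min = min_iommu_pgsize(pgsizes);

    if (min == 0) {
        return false;
    }
    return ((start | size) & (min - 1)) == 0;
}
```

In the scheme described by the commit message, a section failing such a check would be reported with an error and skipped rather than treated as fatal.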

Re: [Qemu-devel] [PULL 0/4] Linux user for 2.12 patches

2018-02-26 Thread Laurent Vivier
On 25/02/2018 at 19:13, no-re...@patchew.org wrote:
> Hi,
> 
> This series failed build test on s390x host. Please find the details below.
> 
> Type: series
> Message-id: 20180225175928.13101-1-laur...@vivier.eu
> Subject: [Qemu-devel] [PULL 0/4] Linux user for 2.12 patches
> 
...
> === TEST BEGIN ===
> Using CC: /home/fam/bin/cc
> Install prefix    /var/tmp/patchew-tester-tmp-193tvn22/src/install
> BIOS directory    /var/tmp/patchew-tester-tmp-193tvn22/src/install/share/qemu
> firmware path     /var/tmp/patchew-tester-tmp-193tvn22/src/install/share/qemu-firmware
> binary directory  /var/tmp/patchew-tester-tmp-193tvn22/src/install/bin
> library directory /var/tmp/patchew-tester-tmp-193tvn22/src/install/lib
> module directory  /var/tmp/patchew-tester-tmp-193tvn22/src/install/lib/qemu
> libexec directory /var/tmp/patchew-tester-tmp-193tvn22/src/install/libexec
> include directory /var/tmp/patchew-tester-tmp-193tvn22/src/install/include
> config directory  /var/tmp/patchew-tester-tmp-193tvn22/src/install/etc
> local state directory   /var/tmp/patchew-tester-tmp-193tvn22/src/install/var
> Manual directory  /var/tmp/patchew-tester-tmp-193tvn22/src/install/share/man
> ELF interp prefix /usr/gnemul/qemu-%M
> Source path   /var/tmp/patchew-tester-tmp-193tvn22/src
> GIT binary        git
> GIT submodules    ui/keycodemapdb capstone
> C compiler        /home/fam/bin/cc
> Host C compiler   cc
> C++ compiler      c++
> Objective-C compiler /home/fam/bin/cc
> ARFLAGS           rv
> CFLAGS            -O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -g
> QEMU_CFLAGS   -I/usr/include/pixman-1   -Werror -DHAS_LIBSSH2_SFTP_FSYNC 
> -pthread -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include  
> -DNCURSES_WIDECHAR -D_GNU_SOURCE -D_DEFAULT_SOURCE  -m64 -D_GNU_SOURCE 
> -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes 
> -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes 
> -fno-strict-aliasing -fno-common -fwrapv  -Wexpansion-to-defined 
> -Wendif-labels -Wno-shift-negative-value -Wno-missing-include-dirs 
> -Wempty-body -Wnested-externs -Wformat-security -Wformat-y2k -Winit-self 
> -Wignored-qualifiers -Wold-style-declaration -Wold-style-definition 
> -Wtype-limits -fstack-protector-strong -I/usr/include/p11-kit-1 
> -I/usr/include/libpng16  -I/usr/include/libdrm   
> -I$(SRC_PATH)/capstone/include
> LDFLAGS   -Wl,--warn-common -m64 -g 
> make              make
> install           install
> python            python -B
> smbd              /usr/sbin/smbd
> module support    no
> host CPU          s390x
...
> collect2: error: ld returned 1 exit status
> make[1]: *** [Makefile:193: qemu-system-mips64] Error 1
> make: *** [Makefile:404: subdir-mips64-softmmu] Error 2
> make: *** Waiting for unfinished jobs
...
> make[1]: *** [/var/tmp/patchew-tester-tmp-193tvn22/src/rules.mak:66: 
> hw/arm/aspeed.o] Error 1
> make[1]: *** Waiting for unfinished jobs
> make: *** [Makefile:404: subdir-aarch64-softmmu] Error 2
> === OUTPUT END ===
> 
> Test command exited with code: 2
> 

It looks like a problem with the s390x host.

Thanks,
Laurent



Re: [Qemu-devel] QEMU GSoC 2018 Project Idea (Apply polling to QEMU NVMe)

2018-02-26 Thread Paolo Bonzini
On 25/02/2018 23:52, Huaicheng Li wrote:
> I remember there were some discussions back in 2015 about this, but I
> don't see it finally done. For this project, I think we can go in three
> steps: (1). add the shadow doorbell buffer support into QEMU NVMe
> emulation, this will reduce # of VM-exits. (2). replace current timers
> used by QEMU NVMe with a separate polling thread, thus we can completely
> eliminate VM-exits. (3). Even further, we can adapt the architecture to
> use one polling thread for each NVMe queue pair, thus it's possible to
> provide more performance. (step 3 can be left for next year if the
> workload is too much for 3 months).

Slightly rephrased:

(1) add shadow doorbell buffer and ioeventfd support into QEMU NVMe
emulation, which will reduce # of VM-exits and make them less expensive
(reduce VCPU latency).

(2) add iothread support to QEMU NVMe emulation.  This can also be used
to eliminate VM-exits because iothreads can do adaptive polling.

(1) and (2) seem okay for at most 1.5 months, especially if you already
have experience with QEMU.
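
To make step (1) a bit more concrete, here is a small sketch of the "do I still need to notify the device?" test that shadow doorbells rely on. It follows the wrap-safe event-index comparison known from virtio rings; whether the NVMe emulation would use exactly this form is an assumption, and the function name is made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * The guest has moved a doorbell from 'old_idx' to 'new_idx' and stored the
 * new value in the shadow buffer. The device published 'event_idx', the
 * index at which it wants to be notified. An MMIO write (and thus a
 * VM-exit) is needed only if event_idx lies in the half-open range
 * [old_idx, new_idx), computed modulo 2^16 so that wrap-around is handled.
 */
static bool doorbell_need_event(uint16_t event_idx, uint16_t new_idx,
                                uint16_t old_idx)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}
```

While the device side is actively polling it can keep event_idx out of reach, so the guest updates only the shadow buffer and skips the MMIO write.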

For (3), there is work in progress to add multiqueue support to QEMU's
block device layer.  We're hoping to get the infrastructure part in
(removing the AioContext lock) during the first half of 2018.  As you
say, we can see what the workload will be.

Including a RAM disk backend in QEMU would be nice too, and it may
interest you as it would reduce the delta between upstream QEMU and
FEMU.  So this could be another idea.

However, the main issue that I'd love to see tackled is interrupt
mitigation.  With higher rates of I/O ops and high queue depth (e.g.
32), it's common for the guest to become slower when you introduce
optimizations in QEMU.  The reason is that lower latency causes higher
interrupt rates and that in turn slows down the guest.  If you have any
ideas on how to work around this, I would love to hear about it.

In any case, I would very much like to mentor this project.  Let me know
if you have any more ideas on how to extend it!

Paolo



Re: [Qemu-devel] [qemu-s390x] [PATCH v1] numa: s390x has no NUMA

2018-02-26 Thread Claudio Imbrenda
On Fri, 23 Feb 2018 18:36:57 +0100
David Hildenbrand  wrote:

> Right now it is possible to crash QEMU for s390x by providing e.g.
> -numa node,nodeid=0,cpus=0-1
> 
> Problem is, that numa.c uses mc->cpu_index_to_instance_props as an
> indicator whether NUMA is supported by a machine type. We don't
> implement NUMA on s390x (and that concept also doesn't really exist).
> We need mc->cpu_index_to_instance_props for query-cpus.
> 
> So let's fix this case.
> 
> qemu-system-s390x: -numa node,nodeid=0,cpus=0-1: NUMA is not
> supported by this machine-type
> 
> Signed-off-by: David Hildenbrand 
> ---
>  numa.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/numa.c b/numa.c
> index 7e0e789b02..3b9be613d9 100644
> --- a/numa.c
> +++ b/numa.c
> @@ -80,10 +80,16 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
>          return;
>      }
> 
> +#ifdef TARGET_S390X
> +    /* s390x provides cpu_index_to_instance_props but has no NUMA */
> +    error_report("NUMA is not supported by this machine-type");
> +    exit(1);
> +#else
>      if (!mc->cpu_index_to_instance_props) {
>          error_report("NUMA is not supported by this machine-type");
>          exit(1);
>      }
> +#endif
>      for (cpus = node->cpus; cpus; cpus = cpus->next) {
>          CpuInstanceProperties props;
>          if (cpus->value >= max_cpus) {

seems straightforward

Reviewed-by: Claudio Imbrenda 




Re: [Qemu-devel] [PATCH] macio: fix NULL pointer dereference when issuing IDE trim

2018-02-26 Thread Anton Nefedov



On 23/2/2018 9:47 PM, Mark Cave-Ayland wrote:
> Commit ef0e64a983 "ide: pass IDEState to trim AIO callback" changed the
> IDE trim callback from using a BlockBackend to an IDEState but forgot to update
> the dma_blk_io() call in hw/ide/macio.c accordingly.

I somehow missed this whole macio part in that series :(

> Without this fix qemu-system-ppc segfaults when issuing an IDE trim command on
> any of the PPC Mac machines (easily triggered by running the Debian installer).
>
> Reported-by: Howard Spoelstra
> Signed-off-by: Mark Cave-Ayland

Reviewed-by: Anton Nefedov

..but there should also be a fix-up for
947858b "ide: abort TRIM operation for invalid range"
which apparently lacks a few steps on the invalid range error path for
macio. I'll look into that.

> ---
>   hw/ide/macio.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/ide/macio.c b/hw/ide/macio.c
> index 2e043ef1ea..d3a85cba3b 100644
> --- a/hw/ide/macio.c
> +++ b/hw/ide/macio.c
> @@ -187,7 +187,7 @@ static void pmac_ide_transfer_cb(void *opaque, int ret)
>           break;
>       case IDE_DMA_TRIM:
>           s->bus->dma->aiocb = dma_blk_io(blk_get_aio_context(s->blk), &s->sg,
> -                                        offset, 0x1, ide_issue_trim, s->blk,
> +                                        offset, 0x1, ide_issue_trim, s,
>                                           pmac_ide_transfer_cb, io,
>                                           DMA_DIRECTION_TO_DEVICE);
>           break;
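
The class of bug fixed here is easy to reproduce in miniature: a callback receives a void * opaque and casts it to the type it expects, so the compiler cannot catch a caller that still passes the old type. The types and functions below are simplified stand-ins for illustration, not the QEMU definitions.

```c
#include <stddef.h>

/* Simplified stand-ins for illustration only. */
typedef struct BackendSketch {
    int id;
} BackendSketch;

typedef struct IdeStateSketch {
    BackendSketch *blk;
    int trims_issued;
} IdeStateSketch;

typedef int IoFunc(void *opaque);

/* After the interface change the callback expects an IdeStateSketch. */
static int issue_trim(void *opaque)
{
    IdeStateSketch *s = opaque;

    s->trims_issued++;
    return s->blk->id;
}

/* Generic helper that forwards 'opaque' untouched, like a DMA helper would. */
static int run_io(IoFunc *fn, void *opaque)
{
    return fn(opaque);
}

/* Correct call site: pass the state itself, not its backend (s->blk). */
static int do_trim(IdeStateSketch *s)
{
    return run_io(issue_trim, s);
}
```

Passing s->blk to run_io() in do_trim() would still compile without a warning and would then treat the backend as the state structure, which is consistent with the segfault described in the commit message.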





Re: [Qemu-devel] [PATCH V4 3/3] tests: Add migration test for aarch64

2018-02-26 Thread Andrew Jones
On Fri, Feb 23, 2018 at 04:13:08PM -0600, Wei Huang wrote:
> 
> 
> On 02/22/2018 03:00 AM, Andrew Jones wrote:
> > On Wed, Feb 21, 2018 at 10:44:17PM -0600, Wei Huang wrote:
> >> This patch adds migration test support for aarch64. The test code, which
> >> implements the same functionality as x86, is booted as a kernel in qemu.
> >> Here are the design choices we make for aarch64:
> >>
> >>  * We choose this -kernel approach because aarch64 QEMU doesn't provide a
> >>built-in fw like x86 does. So instead of relying on a boot loader, we
> >>use -kernel approach for aarch64.
> >>  * The serial output is sent to PL011 directly.
> >>  * The physical memory base for mach-virt machine is 0x40000000. We change
> >>the start_address and end_address for aarch64.
> >>
> >> In addition to providing the binary, this patch also includes the source
> >> code and the build script in tests/migration/. So users can change the
> >> source and/or re-compile the binary as they wish.
> >>
> >> Signed-off-by: Wei Huang 
> >> ---
> >>  tests/Makefile.include   |  1 +
> >>  tests/migration-test.c   | 47 +---
> >>  tests/migration/Makefile | 12 +-
> >>  tests/migration/aarch64-a-b-kernel.S | 71 
> >> 
> >>  tests/migration/aarch64-a-b-kernel.h | 19 ++
> >>  tests/migration/migration-test.h |  5 +++
> >>  6 files changed, 147 insertions(+), 8 deletions(-)
> >>  create mode 100644 tests/migration/aarch64-a-b-kernel.S
> >>  create mode 100644 tests/migration/aarch64-a-b-kernel.h
> >>
> >> diff --git a/tests/Makefile.include b/tests/Makefile.include
> >> index a1bcbffe12..df9f64438f 100644
> >> --- a/tests/Makefile.include
> >> +++ b/tests/Makefile.include
> >> @@ -372,6 +372,7 @@ check-qtest-arm-y += tests/sdhci-test$(EXESUF)
> >>  check-qtest-aarch64-y = tests/numa-test$(EXESUF)
> >>  check-qtest-aarch64-y += tests/sdhci-test$(EXESUF)
> >>  check-qtest-aarch64-y += tests/boot-serial-test$(EXESUF)
> >> +check-qtest-aarch64-y += tests/migration-test$(EXESUF)
> >>  
> >>  check-qtest-microblazeel-y = $(check-qtest-microblaze-y)
> >>  
> >> diff --git a/tests/migration-test.c b/tests/migration-test.c
> >> index e2e06ed337..a4f6732a59 100644
> >> --- a/tests/migration-test.c
> >> +++ b/tests/migration-test.c
> >> @@ -11,6 +11,7 @@
> >>   */
> >>  
> >>  #include "qemu/osdep.h"
> >> +#include <sys/utsname.h>
> >>  
> >>  #include "libqtest.h"
> >>  #include "qapi/qmp/qdict.h"
> >> @@ -23,8 +24,8 @@
> >>  
> >>  #include "migration/migration-test.h"
> >>  
> >> -const unsigned start_address = TEST_MEM_START;
> >> -const unsigned end_address = TEST_MEM_END;
> >> +unsigned start_address = TEST_MEM_START;
> >> +unsigned end_address = TEST_MEM_END;
> >>  bool got_stop;
> >>  
> >>  #if defined(__linux__)
> >> @@ -81,12 +82,13 @@ static const char *tmpfs;
> >>   * outputting a 'B' every so often if it's still running.
> >>   */
> >>  #include "tests/migration/x86-a-b-bootblock.h"
> >> +#include "tests/migration/aarch64-a-b-kernel.h"
> >>  
> >> -static void init_bootfile_x86(const char *bootpath)
> >> +static void init_bootfile(const char *bootpath, void *content)
> >>  {
> >>  FILE *bootfile = fopen(bootpath, "wb");
> >>  
> >> -g_assert_cmpint(fwrite(x86_bootsect, 512, 1, bootfile), ==, 1);
> >> +g_assert_cmpint(fwrite(content, 512, 1, bootfile), ==, 1);
> >>  fclose(bootfile);
> >>  }
> >>  
> >> @@ -393,7 +395,7 @@ static void test_migrate_start(QTestState **from, QTestState **to,
> >>  got_stop = false;
> >>  
> >>  if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
> >> -init_bootfile_x86(bootpath);
> >> +init_bootfile(bootpath, x86_bootsect);
> >>  cmd_src = g_strdup_printf("-machine accel=%s -m 150M"
> >>" -name source,debug-threads=on"
> >>" -serial file:%s/src_serial"
> >> @@ -422,6 +424,39 @@ static void test_migrate_start(QTestState **from, QTestState **to,
> >>" -serial file:%s/dest_serial"
> >>" -incoming %s",
> >>accel, tmpfs, uri);
> >> +} else if (strcmp(arch, "aarch64") == 0) {
> >> +const char *cpu;
> >> +const char *gic_ver;
> >> +struct utsname utsname;
> >> +
> >> +/* kvm and tcg need different cpu and gic-version configs */
> >> +if (access("/dev/kvm", F_OK) == 0 && uname(&utsname) == 0 &&
> >> +strcmp(utsname.machine, "aarch64") == 0) {
> >> +accel = "kvm";
> >> +cpu = "host";
> >> +gic_ver = "host";
> >> +} else {
> >> +accel = "tcg";
> >> +cpu = "cortex-a57";
> >> +gic_ver = "2";
> >> +}
> >> +
> >> +init_bootfile(bootpath, aarch64_kernel);
> >> +cmd_src = g_strdup_printf("-machine virt,accel=%s,gic-version=%s 

Re: [Qemu-devel] [PATCH V5 3/4] tests/migration: Add migration-test header file

2018-02-26 Thread Andrew Jones
On Fri, Feb 23, 2018 at 03:58:57PM -0600, Wei Huang wrote:
> This patch moves the settings related to migration-test from the
> migration-test.c file to a separate header file. It also renames the
> x86-a-b-bootblock.s file extension from .s to .S, allowing gcc
> pre-processor to include the C-style header file correctly.
> 
> Signed-off-by: Wei Huang 
> ---
>  tests/migration-test.c | 28 +++---
>  tests/migration/Makefile   |  4 ++--
>  tests/migration/migration-test.h   | 18 ++
>  .../{x86-a-b-bootblock.s => x86-a-b-bootblock.S}   |  7 +++---
>  tests/migration/x86-a-b-bootblock.h|  2 +-
>  5 files changed, 39 insertions(+), 20 deletions(-)
>  create mode 100644 tests/migration/migration-test.h
>  rename tests/migration/{x86-a-b-bootblock.s => x86-a-b-bootblock.S} (94%)
>

I gave this my r-b last review. Do I have to review it again? 



Re: [Qemu-devel] [PATCH V5 2/4] tests/migration: Convert the boot block compilation script into Makefile

2018-02-26 Thread Andrew Jones
On Fri, Feb 23, 2018 at 03:58:56PM -0600, Wei Huang wrote:
> The x86 boot block header currently is generated with a shell script.
> To better support other CPUs (e.g. aarch64), we convert the script
> into Makefile. This allows us to 1) support cross-compilation easily,
> and 2) avoid creating a script file for every architecture.
> 
> Signed-off-by: Wei Huang 
> ---
>  tests/migration/Makefile | 36
>  tests/migration/rebuild-x86-bootblock.sh | 33 -
>  tests/migration/x86-a-b-bootblock.h  |  2 +-
>  tests/migration/x86-a-b-bootblock.s  |  5 ++---
>  4 files changed, 39 insertions(+), 37 deletions(-)
>  create mode 100644 tests/migration/Makefile
>  delete mode 100755 tests/migration/rebuild-x86-bootblock.sh
>

Reviewed-by: Andrew Jones 



Re: [Qemu-devel] [PATCH V5 1/4] rules: Move cross compilation auto detection functions to rules.mak

2018-02-26 Thread Andrew Jones
On Fri, Feb 23, 2018 at 03:58:55PM -0600, Wei Huang wrote:
> This patch moves the auto detection functions for cross compilation from
> roms/Makefile to rules.mak. So the functions can be shared among Makefiles
> in QEMU.
> 
> Signed-off-by: Wei Huang 
> ---
>  roms/Makefile | 24 +++-
>  rules.mak | 15 +++
>  2 files changed, 22 insertions(+), 17 deletions(-)
>

Reviewed-by: Andrew Jones 



Re: [Qemu-devel] [PATCH v1] numa: s390x has no NUMA

2018-02-26 Thread Christian Borntraeger


On 02/23/2018 06:36 PM, David Hildenbrand wrote:
> Right now it is possible to crash QEMU for s390x by providing e.g.
> -numa node,nodeid=0,cpus=0-1
> 
> Problem is, that numa.c uses mc->cpu_index_to_instance_props as an
> indicator whether NUMA is supported by a machine type. We don't
> implement NUMA on s390x (and that concept also doesn't really exist).
> We need mc->cpu_index_to_instance_props for query-cpus.

Looks like we assert because
machine->possible_cpus == 0.

Later during boot it is created in s390_possible_cpu_arch_ids (via
s390_init_cpus). What we could actually provide in the future is a
CPU topology.

So something like this also fixes the bug

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index fd5bfcdaa5..d981335ca9 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -13,6 +13,7 @@
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "cpu.h"
+#include "sysemu/numa.h"
 #include "hw/boards.h"
 #include "exec/address-spaces.h"
 #include "hw/s390x/s390-virtio-hcall.h"
@@ -393,11 +394,20 @@ static void 
s390_machine_device_unplug_request(HotplugHandler *hotplug_dev,
 static CpuInstanceProperties s390_cpu_index_to_props(MachineState *machine,
  unsigned cpu_index)
 {
+MachineClass *mc = MACHINE_GET_CLASS(machine);
+
+/* make sure possible_cpus are initialized */
+mc->possible_cpu_arch_ids(machine);
 g_assert(machine->possible_cpus && cpu_index < 
machine->possible_cpus->len);
 
 return machine->possible_cpus->cpus[cpu_index].props;
 }
 
+static int64_t s390_get_default_cpu_node_id(const MachineState *ms, int idx)
+{
+return idx / smp_cpus % nb_numa_nodes;
+}
+
 static const CPUArchIdList *s390_possible_cpu_arch_ids(MachineState *ms)
 {
 int i;
@@ -473,6 +483,7 @@ static void ccw_machine_class_init(ObjectClass *oc, void 
*data)
 mc->get_hotplug_handler = s390_get_hotplug_handler;
 mc->cpu_index_to_instance_props = s390_cpu_index_to_props;
 mc->possible_cpu_arch_ids = s390_possible_cpu_arch_ids;
+mc->get_default_cpu_node_id = s390_get_default_cpu_node_id;
 /* it is overridden with 'host' cpu *in kvm_arch_init* */
 mc->default_cpu_type = S390_CPU_TYPE_NAME("qemu");
 hc->plug = s390_machine_device_plug;


and it would allow us to extend things later on. On the other hand, my fix does
not implement anything, so your fix is "more correct".
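
For illustration, the idea behind the snippet above is that the lookup builds the possible-CPU list on demand, so it no longer matters that the list is normally created later during boot. A minimal, self-contained sketch of that pattern follows; every `toy_`/`Toy` name is made up for this example and is not QEMU API, and the types only loosely mirror the real ones:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for CPUArchIdList / MachineState; not QEMU types. */
typedef struct {
    int len;
    int *node_ids;
} ToyCpuList;

typedef struct {
    int max_cpus;
    ToyCpuList *possible_cpus;   /* NULL until first requested */
} ToyMachine;

/* Idempotent: allocate the list on first use, return the cached one later. */
static const ToyCpuList *toy_possible_cpu_arch_ids(ToyMachine *m)
{
    if (m->possible_cpus) {
        return m->possible_cpus;
    }

    ToyCpuList *list = malloc(sizeof(*list));
    assert(list);
    list->len = m->max_cpus;
    list->node_ids = calloc((size_t)m->max_cpus, sizeof(int));
    assert(list->node_ids);
    m->possible_cpus = list;
    return list;
}

/* Lookup that is safe to call before anyone else has created the list. */
static int toy_cpu_index_to_node(ToyMachine *m, int cpu_index)
{
    toy_possible_cpu_arch_ids(m);   /* make sure the list exists */
    assert(m->possible_cpus && cpu_index >= 0 &&
           cpu_index < m->possible_cpus->len);
    return m->possible_cpus->node_ids[cpu_index];
}
```

Because the initializer is idempotent, calling it from the lookup and again from the normal boot path yields the same list.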

> 
> So let's fix this case.
> 
> qemu-system-s390x: -numa node,nodeid=0,cpus=0-1: NUMA is not supported by
>this machine-type
> 
> Signed-off-by: David Hildenbrand 
> ---
>  numa.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/numa.c b/numa.c
> index 7e0e789b02..3b9be613d9 100644
> --- a/numa.c
> +++ b/numa.c
> @@ -80,10 +80,16 @@ static void parse_numa_node(MachineState *ms, 
> NumaNodeOptions *node,
>  return;
>  }
> 
> +#ifdef TARGET_S390X
> +/* s390x provides cpu_index_to_instance_props but has no NUMA */
> +error_report("NUMA is not supported by this machine-type");
> +exit(1);
> +#else
>  if (!mc->cpu_index_to_instance_props) {
>  error_report("NUMA is not supported by this machine-type");
>  exit(1);
>  }
> +#endif
>  for (cpus = node->cpus; cpus; cpus = cpus->next) {
>  CpuInstanceProperties props;
>  if (cpus->value >= max_cpus) {
> 




Re: [Qemu-devel] [PATCH v2 2/3] migration: use the free page reporting feature from balloon

2018-02-26 Thread Wang, Wei W
On Monday, February 26, 2018 1:07 PM, Wei Wang wrote:
> On 02/09/2018 07:50 PM, Dr. David Alan Gilbert wrote:
> > * Wei Wang (wei.w.w...@intel.com) wrote:
> >> Use the free page reporting feature from the balloon device to clear
> >> the bits corresponding to guest free pages from the dirty bitmap, so
> >> that the free memory is not sent.
> >>
> >> Signed-off-by: Wei Wang 
> >> CC: Michael S. Tsirkin 
> >> CC: Juan Quintela 
> >> ---
> >>   migration/ram.c | 24 
> >>   1 file changed, 20 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/migration/ram.c b/migration/ram.c index d6f462c..4fe16d2
> >> 100644
> >> --- a/migration/ram.c
> >> +++ b/migration/ram.c
> >> @@ -49,6 +49,7 @@
> >>   #include "qemu/rcu_queue.h"
> >>   #include "migration/colo.h"
> >>   #include "migration/block.h"
> >> +#include "sysemu/balloon.h"
> >>
> >>   /***/
> >>   /* ram save/restore */
> >> @@ -206,6 +207,10 @@ struct RAMState {
> >>   uint32_t last_version;
> >>   /* We are in the first round */
> >>   bool ram_bulk_stage;
> >> +/* The feature, skipping the transfer of free pages, is supported */
> >> +bool free_page_support;
> >> +/* Skip the transfer of free pages in the bulk stage */
> >> +bool free_page_done;
> >>   /* How many times we have dirty too many pages */
> >>   int dirty_rate_high_cnt;
> >>   /* these variables are used for bitmap sync */ @@ -773,7 +778,7
> >> @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, RAMBlock
> *rb,
> >>   unsigned long *bitmap = rb->bmap;
> >>   unsigned long next;
> >>
> >> -if (rs->ram_bulk_stage && start > 0) {
> >> +if (rs->ram_bulk_stage && start > 0 && !rs->free_page_support) {
> >>   next = start + 1;
> >>   } else {
> >>   next = find_next_bit(bitmap, size, start); @@ -1653,6
> >> +1658,8 @@ static void ram_state_reset(RAMState *rs)
> >>   rs->last_page = 0;
> >>   rs->last_version = ram_list.version;
> >>   rs->ram_bulk_stage = true;
> >> +rs->free_page_support = balloon_free_page_support();
> >> +rs->free_page_done = false;
> >>   }
> >>
> >>   #define MAX_WAIT 50 /* ms, half buffered_file limit */ @@ -2135,7
> >> +2142,7 @@ static int ram_state_init(RAMState **rsp)
> >>   return 0;
> >>   }
> >>
> >> -static void ram_list_init_bitmaps(void)
> >> +static void ram_list_init_bitmaps(RAMState *rs)
> >>   {
> >>   RAMBlock *block;
> >>   unsigned long pages;
> >> @@ -2145,7 +2152,11 @@ static void ram_list_init_bitmaps(void)
> >>   QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> >>   pages = block->max_length >> TARGET_PAGE_BITS;
> >>   block->bmap = bitmap_new(pages);
> >> -bitmap_set(block->bmap, 0, pages);
> >> +if (rs->free_page_support) {
> >> +bitmap_set(block->bmap, 1, pages);
> > I don't understand how it makes sense to do that here; ignoring
> > anything else it means that migration_dirty_pages is wrong which could
> > end up with migration finishing before all real pages are sent.
> >
> 
> The bulk stage treats all the pages as dirty pages, so we set all the bits to
> "1". This is needed by this optimization feature, because the free pages reported
> from the guest can then be directly cleared from the bitmap (we don't need
> any more bitmaps to record free pages).
> 

Sorry, I misunderstood the bitmap_set API (I thought it was used to set all
the bits to 1 or 0). So the above change isn't actually needed. Btw,
this doesn't affect the results I reported.
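
To make the misunderstanding concrete: as far as I can tell, bitmap_set takes a start index and a bit count, not a value to fill with. The sketch below re-implements that behaviour in a toy helper so it compiles on its own; `toy_bitmap_set` and `toy_test_bit` are illustrative names, not the QEMU functions.

```c
#include <limits.h>
#include <stddef.h>

#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Set nr consecutive bits, beginning at bit index start. */
static void toy_bitmap_set(unsigned long *map, size_t start, size_t nr)
{
    for (size_t i = start; i < start + nr; i++) {
        map[i / TOY_BITS_PER_LONG] |= 1UL << (i % TOY_BITS_PER_LONG);
    }
}

/* Return 1 if bit i is set, 0 otherwise. */
static int toy_test_bit(const unsigned long *map, size_t i)
{
    return (int)((map[i / TOY_BITS_PER_LONG] >> (i % TOY_BITS_PER_LONG)) & 1UL);
}
```

With these semantics, a call with start 0 and nr equal to the page count marks every page dirty, while a call with start 1 leaves bit 0 clear and sets one bit past the end of a map that holds exactly that many pages.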

Best,
Wei
 




Re: [Qemu-devel] [PATCH v1] numa: s390x has no NUMA

2018-02-26 Thread David Hildenbrand
On 26.02.2018 10:20, Christian Borntraeger wrote:
> 
> 
> On 02/23/2018 06:36 PM, David Hildenbrand wrote:
>> Right now it is possible to crash QEMU for s390x by providing e.g.
>> -numa node,nodeid=0,cpus=0-1
>>
>> Problem is, that numa.c uses mc->cpu_index_to_instance_props as an
>> indicator whether NUMA is supported by a machine type. We don't
>> implement NUMA on s390x (and that concept also doesn't really exist).
>> We need mc->cpu_index_to_instance_props for query-cpus.
> 
> Looks like we assert because of 
> machine->possible_cpus == 0.
> 
> Later during boot this is created in s390_possible_cpu_arch_ids. (via 
> s390_init_cpus). What we (in the future) actually could provide is a 
> cpu topology.
> 
> So something like this also fixes the bug

Yes, but I decided to not go this way because we don't support NUMA as
of now. -numa has to bail out (just as it did before I implemented
proper query-cpus support).

What you propose is something for future support - once we have cpu
topology information exposed.


-- 

Thanks,

David / dhildenb



Re: [Qemu-devel] [PATCH 01/19] target/hppa: Use DisasContextBase.is_jmp

2018-02-26 Thread Philippe Mathieu-Daudé
On 02/17/2018 05:31 PM, Richard Henderson wrote:
> Instead of returning DisasJumpType, immediately store it.

neat!

> Signed-off-by: Richard Henderson 

Reviewed-by: Philippe Mathieu-Daudé 
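
As a reading aid for the diff below, here is a stripped-down sketch of the refactoring pattern: helpers stop returning the jump status and record it in the context instead. All `Toy`/`toy_` names are invented for the example, and the nullification handling is reduced to a single flag.

```c
typedef enum { TOY_DISAS_NEXT, TOY_DISAS_NORETURN } ToyJumpType;

typedef struct {
    ToyJumpType is_jmp;    /* plays the role of ctx->base.is_jmp */
    int have_null_label;   /* nonzero if the insn may be nullified */
} ToyCtx;

/* Before: the helper returned the status and every caller passed it on. */
static ToyJumpType toy_gen_excp_old(ToyCtx *ctx)
{
    (void)ctx;
    return TOY_DISAS_NORETURN;
}

/* After: the helper records the status in the context and returns void. */
static void toy_gen_excp(ToyCtx *ctx)
{
    ctx->is_jmp = TOY_DISAS_NORETURN;
}

/* Reads the stored status back; a nullified insn can still fall through. */
static void toy_nullify_end(ToyCtx *ctx)
{
    if (ctx->have_null_label && ctx->is_jmp == TOY_DISAS_NORETURN) {
        ctx->is_jmp = TOY_DISAS_NEXT;
    }
}
```

Callers then become plain statement sequences (generate, then end nullification) instead of nested `return f(ctx, g(ctx))` expressions.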

> ---
>  target/hppa/translate.c | 971 
> 
>  1 file changed, 487 insertions(+), 484 deletions(-)
> 
> diff --git a/target/hppa/translate.c b/target/hppa/translate.c
> index 6499b392f9..f72bc84873 100644
> --- a/target/hppa/translate.c
> +++ b/target/hppa/translate.c
> @@ -290,10 +290,6 @@ typedef struct DisasContext {
>  bool psw_n_nonzero;
>  } DisasContext;
>  
> -/* Target-specific return values from translate_one, indicating the
> -   state of the TB.  Note that DISAS_NEXT indicates that we are not
> -   exiting the TB.  */
> -
>  /* We are not using a goto_tb (for whatever reason), but have updated
> the iaq (for whatever reason), so don't do it again on exit.  */
>  #define DISAS_IAQ_N_UPDATED  DISAS_TARGET_0
> @@ -308,8 +304,8 @@ typedef struct DisasContext {
>  
>  typedef struct DisasInsn {
>  uint32_t insn, mask;
> -DisasJumpType (*trans)(DisasContext *ctx, uint32_t insn,
> -   const struct DisasInsn *f);
> +void (*trans)(DisasContext *ctx, uint32_t insn,
> +  const struct DisasInsn *f);
>  union {
>  void (*ttt)(TCGv_reg, TCGv_reg, TCGv_reg);
>  void (*weww)(TCGv_i32, TCGv_env, TCGv_i32, TCGv_i32);
> @@ -678,9 +674,10 @@ static void nullify_set(DisasContext *ctx, bool x)
>  
>  /* Mark the end of an instruction that may have been nullified.
> This is the pair to nullify_over.  */
> -static DisasJumpType nullify_end(DisasContext *ctx, DisasJumpType status)
> +static void nullify_end(DisasContext *ctx)
>  {
>  TCGLabel *null_lab = ctx->null_lab;
> +DisasJumpType status = ctx->base.is_jmp;
>  
>  /* For NEXT, NORETURN, STALE, we can easily continue (or exit).
> For UPDATED, we cannot update on the nullified path.  */
> @@ -690,7 +687,7 @@ static DisasJumpType nullify_end(DisasContext *ctx, 
> DisasJumpType status)
>  /* The current insn wasn't conditional or handled the condition
> applied to it without a branch, so the (new) setting of
> NULL_COND can be applied directly to the next insn.  */
> -return status;
> +return;
>  }
>  ctx->null_lab = NULL;
>  
> @@ -708,9 +705,8 @@ static DisasJumpType nullify_end(DisasContext *ctx, 
> DisasJumpType status)
>  ctx->null_cond = cond_make_n();
>  }
>  if (status == DISAS_NORETURN) {
> -status = DISAS_NEXT;
> +ctx->base.is_jmp = DISAS_NEXT;
>  }
> -return status;
>  }
>  
>  static void copy_iaoq_entry(TCGv_reg dest, target_ureg ival, TCGv_reg vval)
> @@ -734,41 +730,45 @@ static void gen_excp_1(int exception)
>  tcg_temp_free_i32(t);
>  }
>  
> -static DisasJumpType gen_excp(DisasContext *ctx, int exception)
> +static void gen_excp(DisasContext *ctx, int exception)
>  {
>  copy_iaoq_entry(cpu_iaoq_f, ctx->iaoq_f, cpu_iaoq_f);
>  copy_iaoq_entry(cpu_iaoq_b, ctx->iaoq_b, cpu_iaoq_b);
>  nullify_save(ctx);
>  gen_excp_1(exception);
> -return DISAS_NORETURN;
> +ctx->base.is_jmp = DISAS_NORETURN;
>  }
>  
> -static DisasJumpType gen_excp_iir(DisasContext *ctx, int exc)
> +static void gen_excp_iir(DisasContext *ctx, int exc)
>  {
>  TCGv_reg tmp = tcg_const_reg(ctx->insn);
>  tcg_gen_st_reg(tmp, cpu_env, offsetof(CPUHPPAState, cr[CR_IIR]));
>  tcg_temp_free(tmp);
> -return gen_excp(ctx, exc);
> +gen_excp(ctx, exc);
>  }
>  
> -static DisasJumpType gen_illegal(DisasContext *ctx)
> +static void gen_illegal(DisasContext *ctx)
>  {
>  nullify_over(ctx);
> -return nullify_end(ctx, gen_excp_iir(ctx, EXCP_ILL));
> +gen_excp_iir(ctx, EXCP_ILL);
> +nullify_end(ctx);
>  }
>  
> -#define CHECK_MOST_PRIVILEGED(EXCP)   \
> -do {  \
> -if (ctx->privilege != 0) {\
> -nullify_over(ctx);\
> -return nullify_end(ctx, gen_excp_iir(ctx, EXCP)); \
> -} \
> +#define CHECK_MOST_PRIVILEGED(EXCP)  \
> +do { \
> +if (ctx->privilege != 0) {   \
> +nullify_over(ctx);   \
> +gen_excp_iir(ctx, EXCP); \
> +nullify_end(ctx);\
> +return;  \
> +}\
>  } while (0)
>  
>  static bool use_goto_tb(DisasContext *ctx, target_ureg dest)
>  {
>  /* Suppress goto_tb in the case of single-steping and IO.  */
> -if ((tb_cflags(ctx->base.tb) & CF_LAST_IO) || 
> ctx->base.singlestep_enabled) {
> +if ((tb_cflags(ctx->base.tb) & CF_LAST_IO)
> +|| ctx->base.sing

Re: [Qemu-devel] [PATCH V5 4/4] tests: Add migration test for aarch64

2018-02-26 Thread Andrew Jones
On Fri, Feb 23, 2018 at 03:58:58PM -0600, Wei Huang wrote:
> This patch adds migration test support for aarch64. The test code, which
> implements the same functionality as x86, is booted as a kernel in qemu.
> Here are the design choices we make for aarch64:
> 
>  * We choose this -kernel approach because aarch64 QEMU doesn't provide a
>built-in fw like x86 does. So instead of relying on a boot loader, we
>use -kernel approach for aarch64.
>  * The serial output is sent to PL011 directly.
>  * The physical memory base for mach-virt machine is 0x4000. We change
>the start_address and end_address for aarch64.
> 
> In addition to providing the binary, this patch also includes the source
> code and the build script in tests/migration/. So users can change the
> source and/or re-compile the binary as they wish.
> 
> Signed-off-by: Wei Huang 
> ---
>  tests/Makefile.include   |  1 +
>  tests/migration-test.c   | 50 ++---
>  tests/migration/Makefile | 12 +-
>  tests/migration/aarch64-a-b-kernel.S | 71 
> 
>  tests/migration/aarch64-a-b-kernel.h | 19 ++
>  tests/migration/migration-test.h |  5 +++
>  6 files changed, 150 insertions(+), 8 deletions(-)
>  create mode 100644 tests/migration/aarch64-a-b-kernel.S
>  create mode 100644 tests/migration/aarch64-a-b-kernel.h
> 
> diff --git a/tests/Makefile.include b/tests/Makefile.include
> index a1bcbffe12..df9f64438f 100644
> --- a/tests/Makefile.include
> +++ b/tests/Makefile.include
> @@ -372,6 +372,7 @@ check-qtest-arm-y += tests/sdhci-test$(EXESUF)
>  check-qtest-aarch64-y = tests/numa-test$(EXESUF)
>  check-qtest-aarch64-y += tests/sdhci-test$(EXESUF)
>  check-qtest-aarch64-y += tests/boot-serial-test$(EXESUF)
> +check-qtest-aarch64-y += tests/migration-test$(EXESUF)
>  
>  check-qtest-microblazeel-y = $(check-qtest-microblaze-y)
>  
> diff --git a/tests/migration-test.c b/tests/migration-test.c
> index ce2922df6a..d60e34c82d 100644
> --- a/tests/migration-test.c
> +++ b/tests/migration-test.c
> @@ -11,6 +11,7 @@
>   */
>  
>  #include "qemu/osdep.h"
> +#include <sys/utsname.h>
>  
>  #include "libqtest.h"
>  #include "qapi/qmp/qdict.h"
> @@ -23,8 +24,8 @@
>  
>  #include "migration/migration-test.h"
>  
> -const unsigned start_address = TEST_MEM_START;
> -const unsigned end_address = TEST_MEM_END;
> +unsigned start_address = TEST_MEM_START;
> +unsigned end_address = TEST_MEM_END;
>  bool got_stop;
>  
>  #if defined(__linux__)
> @@ -81,12 +82,13 @@ static const char *tmpfs;
>   * repeatedly. It outputs a 'B' at a fixed rate while it's still running.
>   */
>  #include "tests/migration/x86-a-b-bootblock.h"
> +#include "tests/migration/aarch64-a-b-kernel.h"
>  
> -static void init_bootfile_x86(const char *bootpath)
> +static void init_bootfile(const char *bootpath, void *content)
>  {
>  FILE *bootfile = fopen(bootpath, "wb");
>  
> -g_assert_cmpint(fwrite(x86_bootsect, 512, 1, bootfile), ==, 1);
> +g_assert_cmpint(fwrite(content, 512, 1, bootfile), ==, 1);
>  fclose(bootfile);
>  }
>  
> @@ -392,7 +394,7 @@ static void test_migrate_start(QTestState **from, 
> QTestState **to,
>  got_stop = false;
>  
>  if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
> -init_bootfile_x86(bootpath);
> +init_bootfile(bootpath, x86_bootsect);
>  cmd_src = g_strdup_printf("-machine accel=%s -m 150M"
>" -name source,debug-threads=on"
>" -serial file:%s/src_serial"
> @@ -421,6 +423,42 @@ static void test_migrate_start(QTestState **from, 
> QTestState **to,
>" -serial file:%s/dest_serial"
>" -incoming %s",
>accel, tmpfs, uri);
> +} else if (strcmp(arch, "aarch64") == 0) {
> +const char *cpu;
> +const char *gic_ver;
> +struct utsname utsname;
> +
> +/* kvm and tcg need different cpu and gic-version configs */
> +if (access("/dev/kvm", F_OK) == 0 && uname(&utsname) == 0 &&
> +strcmp(utsname.machine, "aarch64") == 0) {
> +accel = "kvm";
> +cpu = "host";
> +gic_ver = "host";
> +} else {
> +accel = "tcg";
> +cpu = "cortex-a57";
> +gic_ver = "2";
> +}
> +
> +init_bootfile(bootpath, aarch64_kernel);
> +cmd_src = g_strdup_printf("-machine virt,accel=%s,gic-version=%s "
> +  "-name vmsource,debug-threads=on -cpu %s "
> +  "-m 150M -serial file:%s/src_serial "
> +  "-kernel %s ",
> +  accel, gic_ver, cpu, tmpfs, bootpath);
> +cmd_dst = g_strdup_printf("-machine virt,accel=%s,gic-version=%s "
> +  "-name vmdest,debug-threads=on -c

Re: [Qemu-devel] [PATCH V5 3/4] tests/migration: Add migration-test header file

2018-02-26 Thread Andrew Jones
On Fri, Feb 23, 2018 at 03:58:57PM -0600, Wei Huang wrote:
> This patch moves the settings related to migration-test from the
> migration-test.c file to a separate header file. It also renames the
> x86-a-b-bootblock.s file extension from .s to .S, allowing gcc
> pre-processor to include the C-style header file correctly.
> 
> Signed-off-by: Wei Huang 
> ---
>  tests/migration-test.c | 28 
> +++---
>  tests/migration/Makefile   |  4 ++--
>  tests/migration/migration-test.h   | 18 ++
>  .../{x86-a-b-bootblock.s => x86-a-b-bootblock.S}   |  7 +++---
>  tests/migration/x86-a-b-bootblock.h|  2 +-
>  5 files changed, 39 insertions(+), 20 deletions(-)
>  create mode 100644 tests/migration/migration-test.h
>  rename tests/migration/{x86-a-b-bootblock.s => x86-a-b-bootblock.S} (94%)
> 
> diff --git a/tests/migration-test.c b/tests/migration-test.c
> index 74f9361bdd..ce2922df6a 100644
> --- a/tests/migration-test.c
> +++ b/tests/migration-test.c
> @@ -21,10 +21,10 @@
>  #include "sysemu/sysemu.h"
>  #include "hw/nvram/chrp_nvram.h"
>  
> -#define MIN_NVRAM_SIZE 8192 /* from spapr_nvram.c */
> +#include "migration/migration-test.h"
>  
> -const unsigned start_address = 1024 * 1024;
> -const unsigned end_address = 100 * 1024 * 1024;
> +const unsigned start_address = TEST_MEM_START;
> +const unsigned end_address = TEST_MEM_END;
>  bool got_stop;
>  
>  #if defined(__linux__)
> @@ -77,8 +77,8 @@ static bool ufd_version_check(void)
>  
>  static const char *tmpfs;
>  
> -/* A simple PC boot sector that modifies memory (1-100MB) quickly
> - * outputting a 'B' every so often if it's still running.
> +/* The boot file modifies memory area in [start_address, end_address)
> + * repeatedly. It outputs a 'B' at a fixed rate while it's still running.
>   */
>  #include "tests/migration/x86-a-b-bootblock.h"
>  
> @@ -104,9 +104,8 @@ static void init_bootfile_ppc(const char *bootpath)
>  memcpy(header->name, "common", 6);
>  chrp_nvram_finish_partition(header, MIN_NVRAM_SIZE);
>  
> -/* FW_MAX_SIZE is 4MB, but slof.bin is only 900KB,
> - * so let's modify memory between 1MB and 100MB
> - * to do like PC bootsector
> +/* FW_MAX_SIZE is 4MB, but slof.bin is only 900KB. So it is OK to modify
> + * memory between start_address and end_address like PC bootsector does.
>   */
>  
>  sprintf(buf + 16,
> @@ -263,11 +262,11 @@ static void wait_for_migration_pass(QTestState *who)
>  static void check_guests_ram(QTestState *who)
>  {
>  /* Our ASM test will have been incrementing one byte from each page from
> - * 1MB to <100MB in order.
> - * This gives us a constraint that any page's byte should be equal or 
> less
> - * than the previous pages byte (mod 256); and they should all be equal
> - * except for one transition at the point where we meet the incrementer.
> - * (We're running this with the guest stopped).
> + * start_address to < end_address in order. This gives us a constraint
> + * that any page's byte should be equal or less than the previous pages
> + * byte (mod 256); and they should all be equal except for one transition
> + * at the point where we meet the incrementer. (We're running this with
> + * the guest stopped).
>   */
>  unsigned address;
>  uint8_t first_byte;
> @@ -278,7 +277,8 @@ static void check_guests_ram(QTestState *who)
>  qtest_memread(who, start_address, &first_byte, 1);
>  last_byte = first_byte;
>  
> -for (address = start_address + 4096; address < end_address; address += 
> 4096)
> +for (address = start_address + TEST_MEM_PAGE_SIZE; address < end_address;
> + address += TEST_MEM_PAGE_SIZE)
>  {
>  uint8_t b;
>  qtest_memread(who, address, &b, 1);
> diff --git a/tests/migration/Makefile b/tests/migration/Makefile
> index 8fbedaa8b8..013b8d1f44 100644
> --- a/tests/migration/Makefile
> +++ b/tests/migration/Makefile
> @@ -25,8 +25,8 @@ include $(SRC_PATH)/rules.mak
>  
>  x86_64_cross_prefix := $(call find-cross-prefix,x86_64)
>  
> -x86-a-b-bootblock.h: x86-a-b-bootblock.s
> - $(x86_64_cross_prefix)as --32 -march=i486 $< -o x86.o
> +x86-a-b-bootblock.h: x86-a-b-bootblock.S
> + $(x86_64_cross_prefix)gcc -m32 -march=i486 -c $< -o x86.o
>   $(x86_64_cross_prefix)objcopy -O binary x86.o x86.boot
>   dd if=x86.boot of=x86.bootsect bs=256 count=2 skip=124
>   echo "$$__note" > $@
> diff --git a/tests/migration/migration-test.h 
> b/tests/migration/migration-test.h
> new file mode 100644
> index 00..48b59b3281
> --- /dev/null
> +++ b/tests/migration/migration-test.h
> @@ -0,0 +1,18 @@
> +/*
> + * Copyright (c) 2018 Red Hat, Inc. and/or its affiliates
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +#ifndef _TEST_MIGRATION_H_
> +#define _TEST_MIGRATION_H_
> +
> +/* Common */
> +#define TEST_MEM
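
The constraint described in the check_guests_ram comment above can be sketched over a plain array instead of guest memory. This is a minimal illustration under that assumption, not the test's actual code; `toy_check_page_bytes` is a made-up name and the array stands in for one sampled byte per page.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * pages[i] holds the sampled byte of page i, in address order.
 * Returns true if the samples are consistent with a writer that walks the
 * pages in order, incrementing one byte per page (mod 256), and that was
 * stopped somewhere in the middle of a pass.
 */
static bool toy_check_page_bytes(const uint8_t *pages, size_t n)
{
    bool hit_edge = false;
    uint8_t last;

    if (n == 0) {
        return true;
    }
    last = pages[0];

    for (size_t i = 1; i < n; i++) {
        uint8_t b = pages[i];

        if (b == last) {
            continue;
        }
        /* Exactly one drop by 1 (mod 256) is fine: the writer stopped here. */
        if (!hit_edge && (uint8_t)(b + 1) == last) {
            hit_edge = true;
            last = b;
        } else {
            return false;
        }
    }
    return true;
}
```

The cast to uint8_t handles the wrap-around case, where the pages before the edge already read 0 and the ones after it still read 255.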

Re: [Qemu-devel] [edk2] [PATCH 4/7] ovmf: link with Tcg2Pei module

2018-02-26 Thread Laszlo Ersek
On 02/23/18 14:23, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau 
> 
> This module will initialize TPM device, measure reported FVs and BIOS
> version.
> 
> CC: Laszlo Ersek 
> CC: Stefan Berger 
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Marc-André Lureau 
> ---
>  OvmfPkg/OvmfPkgX64.dsc | 7 +++
>  OvmfPkg/OvmfPkgX64.fdf | 1 +
>  2 files changed, 8 insertions(+)
> 
> diff --git a/OvmfPkg/OvmfPkgX64.dsc b/OvmfPkg/OvmfPkgX64.dsc
> index b5cbe8430f..34a7c2778e 100644
> --- a/OvmfPkg/OvmfPkgX64.dsc
> +++ b/OvmfPkg/OvmfPkgX64.dsc
> @@ -279,6 +279,8 @@
>PcdLib|MdePkg/Library/PeiPcdLib/PeiPcdLib.inf
>QemuFwCfgLib|OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgPeiLib.inf
>  !if $(TPM2_ENABLE)
> +  BaseCryptLib|CryptoPkg/Library/BaseCryptLib/PeiCryptLib.inf
> +  
> HashLib|SecurityPkg/Library/HashLibBaseCryptoRouter/HashLibBaseCryptoRouterPei.inf
>
> Tpm12DeviceLib|SecurityPkg/Library/Tpm12DeviceLibDTpm/Tpm12DeviceLibDTpm.inf
>Tpm2DeviceLib|SecurityPkg/Library/Tpm2DeviceLibDTpm/Tpm2DeviceLibDTpm.inf
>  !endif
> @@ -647,6 +649,11 @@
>  
>  !if $(TPM2_ENABLE) == TRUE
>SecurityPkg/Tcg/Tcg2Config/Tcg2ConfigPei.inf
> +  SecurityPkg/Tcg/Tcg2Pei/Tcg2Pei.inf {
> +  <LibraryClasses>
> +  NULL|SecurityPkg/Library/HashInstanceLibSha1/HashInstanceLibSha1.inf
> +  
> NULL|SecurityPkg/Library/HashInstanceLibSha256/HashInstanceLibSha256.inf
> +  }
>  !endif
>  
>  !if $(SECURE_BOOT_ENABLE) == TRUE
> diff --git a/OvmfPkg/OvmfPkgX64.fdf b/OvmfPkg/OvmfPkgX64.fdf
> index dc35d0a1f7..9558142a42 100644
> --- a/OvmfPkg/OvmfPkgX64.fdf
> +++ b/OvmfPkg/OvmfPkgX64.fdf
> @@ -170,6 +170,7 @@ INF  MdeModulePkg/Universal/Variable/Pei/VariablePei.inf
>  !endif
>  !if $(TPM2_ENABLE) == TRUE
>  INF  SecurityPkg/Tcg/Tcg2Config/Tcg2ConfigPei.inf
> +INF  SecurityPkg/Tcg/Tcg2Pei/Tcg2Pei.inf
>  !endif
>  
>  
> 
> 

Would it be possible to drop SHA1 (include SHA256 only) by setting
PcdTpm2HashMask to value 2? Or is SHA1 required for some other reason? (If
so please mention it in the commit message.)
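
For reference, the question above reads PcdTpm2HashMask as a bit mask in which value 2 selects SHA256 only. A minimal sketch of that reading, assuming bit 0 stands for SHA1 and bit 1 for SHA256; the `TOY_` names are illustrative and are not edk2 identifiers:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout of the hash mask: bit 0 = SHA1, bit 1 = SHA256. */
#define TOY_HASH_MASK_SHA1    (UINT32_C(1) << 0)
#define TOY_HASH_MASK_SHA256  (UINT32_C(1) << 1)

/* True if the given algorithm bit is enabled in the mask. */
static bool toy_hash_enabled(uint32_t mask, uint32_t alg_bit)
{
    return (mask & alg_bit) != 0;
}
```

Under this layout a mask of 3 enables both algorithms, and a mask of 2 enables SHA256 while leaving SHA1 off.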

Thanks
Laszlo



Re: [Qemu-devel] [PATCH V5 4/4] tests: Add migration test for aarch64

2018-02-26 Thread Andrew Jones
On Mon, Feb 26, 2018 at 10:30:31AM +0100, Andrew Jones wrote:
> On Fri, Feb 23, 2018 at 03:58:58PM -0600, Wei Huang wrote:
> > +/* aarch64 virt machine physical memory starts at 0x4000, which
> > + * is also the kernel loader base address. It should be fine to
> 
> It's not the kernel base address.
> 
> > + * allocate & modify the test memory 1MB away.
> 
> It's only 512K away - which is still probably fine, but you
> said he found data once when reading every 4K after 1M, so
> maybe not.
>

BTW, I'd just drop this comment altogether, rather than fix it.
Or fix it, but put it in the header. The point of the header
is to avoid these addresses spreading around too much. Putting
them in comments doesn't help.

drew



Re: [Qemu-devel] [edk2] [PATCH 5/7] ovmf: link with Tcg2Dxe module

2018-02-26 Thread Laszlo Ersek
On 02/23/18 14:23, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau 
> 
> This module measures and logs the boot environment. It also produces
> the Tcg2 protocol, which allows, for example, reading the log from the OS:
> 
> [0.00] efi: EFI v2.70 by EDK II
> [0.00] efi:  SMBIOS=0x3fa1f000  ACPI=0x3fbb6000  ACPI 2.0=0x3fbb6014  
> MEMATTR=0x3e7d4318  TPMEventLog=0x3db21018
> 
> $ python chipsec_util.py tpm parse_log binary_bios_measurements
> 
> [CHIPSEC] Version 1.3.5.dev2
> [CHIPSEC] API mode: using OS native API (not using CHIPSEC kernel module)
> [CHIPSEC] Executing command 'tpm' with args ['parse_log', 
> '/tmp/binary_bios_measurements']
> 
> PCR: 0type: EV_S_CRTM_VERSION size: 0x2   digest: 
> 1489f923c4dca729178b3e3233458550d8dddf29
>   + version:
> PCR: 0type: EV_EFI_PLATFORM_FIRMWARE_BLOB size: 0x10  digest: 
> fd39ced7c0d2a61f6830c78c7625f94826b05bcc
>   + base: 0x82length: 0xe
> PCR: 0type: EV_EFI_PLATFORM_FIRMWARE_BLOB size: 0x10  digest: 
> 39ebc6783b72bc1e73c7d5bcfeb5f54a3f105d4c
>   + base: 0x90length: 0xa0
> PCR: 7type: EV_EFI_VARIABLE_DRIVER_CONFIG size: 0x35  digest: 
> 57cd4dc19442475aa82743484f3b1caa88e142b8
> PCR: 7type: EV_EFI_VARIABLE_DRIVER_CONFIG size: 0x24  digest: 
> 9b1387306ebb7ff8e795e7be77563666bbf4516e
> PCR: 7type: EV_EFI_VARIABLE_DRIVER_CONFIG size: 0x26  digest: 
> 9afa86c507419b8570c62167cb9486d9fc809758
> PCR: 7type: EV_EFI_VARIABLE_DRIVER_CONFIG size: 0x24  digest: 
> 5bf8faa078d40ffbd03317c93398b01229a0e1e0
> PCR: 7type: EV_EFI_VARIABLE_DRIVER_CONFIG size: 0x26  digest: 
> 734424c9fe8fc71716c42096f4b74c88733b175e
> PCR: 7type: EV_SEPARATOR  size: 0x4   digest: 
> 9069ca78e7450a285173431b3e52c5c25299e473
> PCR: 1type: EV_EFI_VARIABLE_BOOT  size: 0x3e  digest: 
> 252f8ebb85340290b64f4b06a001742be8e5cab6
> PCR: 1type: EV_EFI_VARIABLE_BOOT  size: 0x6e  digest: 
> 22a4f6ee9af6dba01d3528deb64b74b582fc182b
> PCR: 1type: EV_EFI_VARIABLE_BOOT  size: 0x80  digest: 
> b7811d5bf30a7efd4e385c6179fe10d9290bb9e8
> PCR: 1type: EV_EFI_VARIABLE_BOOT  size: 0x84  digest: 
> 425e502c24fc924e231e0a62327b6b7d1f704573
> PCR: 1type: EV_EFI_VARIABLE_BOOT  size: 0x9a  digest: 
> 0b5d2c98ac5de6148a4a1490ff9d5df69039f04e
> PCR: 1type: EV_EFI_VARIABLE_BOOT  size: 0xbd  digest: 
> 20bd5f402271d57a88ea314fe35c1705956b1f74
> PCR: 1type: EV_EFI_VARIABLE_BOOT  size: 0x88  digest: 
> df5d6605cb8f4366d745a8464cfb26c1efdc305c
> PCR: 4type: EV_EFI_ACTION size: 0x28  digest: 
> cd0fdb4531a6ec41be2753ba042637d6e5f7f256
> PCR: 0type: EV_SEPARATOR  size: 0x4   digest: 
> 9069ca78e7450a285173431b3e52c5c25299e473
> PCR: 1type: EV_SEPARATOR  size: 0x4   digest: 
> 9069ca78e7450a285173431b3e52c5c25299e473
> PCR: 2type: EV_SEPARATOR  size: 0x4   digest: 
> 9069ca78e7450a285173431b3e52c5c25299e473
> PCR: 3type: EV_SEPARATOR  size: 0x4   digest: 
> 9069ca78e7450a285173431b3e52c5c25299e473
> PCR: 4type: EV_SEPARATOR  size: 0x4   digest: 
> 9069ca78e7450a285173431b3e52c5c25299e473
> PCR: 5type: EV_SEPARATOR  size: 0x4   digest: 
> 9069ca78e7450a285173431b3e52c5c25299e473
> 
> $ tpm2_pcrlist
> sha1 :
>   0  : 35bd1786b6909daad610d7598b1d620352d33b8a
>   1  : ec0511e860206e0af13c31da2f9e943fb6ca353d
>   2  : b2a83b0ebf2f8374299a5b2bdfc31ea955ad7236
>   3  : b2a83b0ebf2f8374299a5b2bdfc31ea955ad7236
>   4  : 45a323382bd933f08e7f0e256bc8249e4095b1ec
>   5  : d16d7e629fd8d08ca256f9ad3a3a1587c9e6cc1b
>   6  : b2a83b0ebf2f8374299a5b2bdfc31ea955ad7236
>   7  : 518bd167271fbb64589c61e43d8c0165861431d8
>   8  : 
>   9  : 
>   10 : 
>   11 : 
>   12 : 
>   13 : 
>   14 : 
>   15 : 
>   16 : 
>   17 : 
>   18 : 
>   19 : 
>   20 : 
>   21 : 
>   22 : 
>   23 : 
> sha256 :
>   0  : 9ae903dbae3357ac00d223660bac19ea5c021499a56201104332ab966631ce2c
>   1  : acc611d90245cf04e77b0ca94901f90e7f

Re: [Qemu-devel] [PATCH v3 0/7] block: Handle null backing link

2018-02-26 Thread no-reply
Hi,

This series failed build test on ppcbe host. Please find the details below.

Type: series
Message-id: 20180224154033.29559-1-mre...@redhat.com
Subject: [Qemu-devel] [PATCH v3 0/7] block: Handle null backing link

=== TEST SCRIPT BEGIN ===
#!/bin/bash
# Testing script will be invoked under the git checkout with
# HEAD pointing to a commit that has the patches applied on top of "base"
# branch
set -e
echo "=== ENV ==="
env
echo "=== PACKAGES ==="
rpm -qa
echo "=== TEST BEGIN ==="
INSTALL=$PWD/install
BUILD=$PWD/build
mkdir -p $BUILD $INSTALL
SRC=$PWD
cd $BUILD
$SRC/configure --prefix=$INSTALL
make -j100
# XXX: we need reliable clean up
# make check -j100 V=1
make install
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Submodule 'capstone' (git://git.qemu.org/capstone.git) registered for path 
'capstone'
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Submodule 'roms/QemuMacDrivers' (git://git.qemu.org/QemuMacDrivers.git) 
registered for path 'roms/QemuMacDrivers'
Submodule 'roms/SLOF' (git://git.qemu-project.org/SLOF.git) registered for path 
'roms/SLOF'
Submodule 'roms/ipxe' (git://git.qemu-project.org/ipxe.git) registered for path 
'roms/ipxe'
Submodule 'roms/openbios' (git://git.qemu-project.org/openbios.git) registered 
for path 'roms/openbios'
Submodule 'roms/openhackware' (git://git.qemu-project.org/openhackware.git) 
registered for path 'roms/openhackware'
Submodule 'roms/qemu-palcode' (git://github.com/rth7680/qemu-palcode.git) 
registered for path 'roms/qemu-palcode'
Submodule 'roms/seabios' (git://git.qemu-project.org/seabios.git/) registered 
for path 'roms/seabios'
Submodule 'roms/seabios-hppa' (git://github.com/hdeller/seabios-hppa.git) 
registered for path 'roms/seabios-hppa'
Submodule 'roms/sgabios' (git://git.qemu-project.org/sgabios.git) registered 
for path 'roms/sgabios'
Submodule 'roms/skiboot' (git://git.qemu.org/skiboot.git) registered for path 
'roms/skiboot'
Submodule 'roms/u-boot' (git://git.qemu-project.org/u-boot.git) registered for 
path 'roms/u-boot'
Submodule 'roms/vgabios' (git://git.qemu-project.org/vgabios.git/) registered 
for path 'roms/vgabios'
Submodule 'ui/keycodemapdb' (git://git.qemu.org/keycodemapdb.git) registered 
for path 'ui/keycodemapdb'
Cloning into 'capstone'...
Submodule path 'capstone': checked out 
'22ead3e0bfdb87516656453336160e0a37b066bf'
Cloning into 'dtc'...
Submodule path 'dtc': checked out 'e54388015af1fb4bf04d0bca99caba1074d9cc42'
Cloning into 'roms/QemuMacDrivers'...
Submodule path 'roms/QemuMacDrivers': checked out 
'd4e7d7ac663fcb55f1b93575445fcbca372f17a7'
Cloning into 'roms/SLOF'...
Submodule path 'roms/SLOF': checked out 
'fa981320a1e0968d6fc1b8de319723ff8212b337'
Cloning into 'roms/ipxe'...
Submodule path 'roms/ipxe': checked out 
'0600d3ae94f93efd10fc6b3c7420a9557a3a1670'
Cloning into 'roms/openbios'...
Submodule path 'roms/openbios': checked out 
'54d959d97fb331708767b2fd4a878efd2bbc41bb'
Cloning into 'roms/openhackware'...
Submodule path 'roms/openhackware': checked out 
'c559da7c8eec5e45ef1f67978827af6f0b9546f5'
Cloning into 'roms/qemu-palcode'...
Submodule path 'roms/qemu-palcode': checked out 
'f3c7e44c70254975df2a00af39701eafbac4d471'
Cloning into 'roms/seabios'...
Submodule path 'roms/seabios': checked out 
'63451fca13c75870e1703eb3e20584d91179aebc'
Cloning into 'roms/seabios-hppa'...
Submodule path 'roms/seabios-hppa': checked out 
'649e6202b8d65d46c69f542b1380f840fbe8ab13'
Cloning into 'roms/sgabios'...
Submodule path 'roms/sgabios': checked out 
'cbaee52287e5f32373181cff50a00b6c4ac9015a'
Cloning into 'roms/skiboot'...
Submodule path 'roms/skiboot': checked out 
'e0ee24c27a172bcf482f6f2bc905e6211c134bcc'
Cloning into 'roms/u-boot'...
Submodule path 'roms/u-boot': checked out 
'd85ca029f257b53a96da6c2fb421e78a003a9943'
Cloning into 'roms/vgabios'...
Submodule path 'roms/vgabios': checked out 
'19ea12c230ded95928ecaef0db47a82231c2e485'
Cloning into 'ui/keycodemapdb'...
Submodule path 'ui/keycodemapdb': checked out 
'6b3d716e2b6472eb7189d3220552280ef3d832ce'
Switched to a new branch 'test'
20aa306 block: Deprecate "backing": ""
c9e6a2d block: Handle null backing link
3a046aa qapi: Make more of qobject_to()
a51ebac qapi: Remove qobject_to_X() functions
7b78cc8 qapi: Replace qobject_to_X(o) by qobject_to(o, X)
0a96468 qapi: Add qobject_to()
0cd4496 compiler: Add QEMU_BUILD_BUG_MSG() macro

=== OUTPUT BEGIN ===
=== ENV ===
XDG_SESSION_ID=29961
SHELL=/bin/sh
USER=patchew
PATCHEW=./patchew-cli -s https://patchew.org
PATH=/usr/bin:/bin
PWD=/var/tmp/patchew-tester-tmp-8vb5f8ja/src
LANG=en_US.UTF-8
HOME=/home/patchew
SHLVL=2
LOGNAME=patchew
XDG_RUNTIME_DIR=/run/user/1000
_=/usr/bin/env
=== PACKAGES ===
telepathy-filesystem-0.0.2-6.el7.noarch
ipa-common-4.5.0-20.el7.centos.noarch
ipa-client-common-4.5.0-20.el7.centos.noarch
nhn-nanum-fonts-common-3.020-9.el7.noarch
perl-srpm-macros-1-8.el7.noarch
glibc-common-2.17-196.el7.ppc64
zlib-1.2.7-17.el7.ppc64
nss-util-3.28.4-3.el7.ppc64
libSM

Re: [Qemu-devel] [edk2] [PATCH 6/7] ovmf: link with Tcg2ConfigDxe module

2018-02-26 Thread Laszlo Ersek
On 02/23/18 14:23, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau 
> 
> The module allows tweaking and interacting with the TPM. Note that many
> actions are broken due to the implementation of the QEMU TPM (providing its
> own ACPI table), and the lack of a PPI implementation.
> 
> CC: Laszlo Ersek 
> CC: Stefan Berger 
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Marc-André Lureau 
> ---
>  OvmfPkg/OvmfPkgX64.dsc | 2 ++
>  OvmfPkg/OvmfPkgX64.fdf | 1 +
>  2 files changed, 3 insertions(+)
> 
> diff --git a/OvmfPkg/OvmfPkgX64.dsc b/OvmfPkg/OvmfPkgX64.dsc
> index 9bd0709f98..2281bd5ff8 100644
> --- a/OvmfPkg/OvmfPkgX64.dsc
> +++ b/OvmfPkg/OvmfPkgX64.dsc
> @@ -669,6 +669,8 @@
>NULL|SecurityPkg/Library/HashInstanceLibSha1/HashInstanceLibSha1.inf
>
> NULL|SecurityPkg/Library/HashInstanceLibSha256/HashInstanceLibSha256.inf
>}
> +
> +  SecurityPkg/Tcg/Tcg2Config/Tcg2ConfigDxe.inf
>  !endif
>  
>  !if $(SECURE_BOOT_ENABLE) == TRUE
> diff --git a/OvmfPkg/OvmfPkgX64.fdf b/OvmfPkg/OvmfPkgX64.fdf
> index b8dd7ecae4..985404850f 100644
> --- a/OvmfPkg/OvmfPkgX64.fdf
> +++ b/OvmfPkg/OvmfPkgX64.fdf
> @@ -399,6 +399,7 @@ INF  
> MdeModulePkg/Universal/Variable/RuntimeDxe/VariableRuntimeDxe.inf
>  
>  !if $(TPM2_ENABLE) == TRUE
>  INF  SecurityPkg/Tcg/Tcg2Dxe/Tcg2Dxe.inf
> +INF  SecurityPkg/Tcg/Tcg2Config/Tcg2ConfigDxe.inf
>  !endif
>  
>  
> 
> 

Please drop this patch.

In my earlier investigation I wrote, Tcg2ConfigDxe "[p]rovides a Setup
TUI interface to configure the TPM. IIUC, it can also save the
configured TPM type for subsequent boots (see Tcg2ConfigPei.inf above)".

The INF file itself says "This module is only for reference only, each
platform should have its own setup page."

And Jiewen wrote earlier, "Tcg2ConfigPei/Dxe are platform sample driver.
A platform may have its own version based upon platform requirement. For
example, if a platform supports fTPM, it may use another Tcg2Config driver."

Given that OVMF lacks PEI-phase variable access, and that I consequently
suggested cloning, and seriously trimming, Tcg2ConfigPei, it makes no
sense to include an HII dialog that sets a variable for PEI phase
consumption. Also, as you say, many of the exposed operations are broken
due to lack of PPI support. So let's just postpone the inclusion of this
driver, for now.

Thanks
Laszlo



Re: [Qemu-devel] intel-iommu and vhost: Do we need 'device-iotlb' and 'ats'?

2018-02-26 Thread Auger Eric
Hi Jintack,

On 21/02/18 05:03, Jintack Lim wrote:
> Hi,
> 
> I'm using vhost with the virtual intel-iommu, and this page[1] shows
> the QEMU command line example.
> 
> qemu-system-x86_64 -M q35,accel=kvm,kernel-irqchip=split -m 2G \
>-device intel-iommu,intremap=on,device-iotlb=on \
>-device ioh3420,id=pcie.1,chassis=1 \
>-device
> virtio-net-pci,bus=pcie.1,netdev=net0,disable-legacy=on,disable-modern=off,iommu_platform=on,ats=on
> \
>-netdev tap,id=net0,vhostforce \
>$IMAGE_PATH
> 
> I wonder what the impact of using the device-iotlb and ats options is, as
> they are described as necessary.
> 
> In my understanding, vhost in the kernel only looks at
> VIRTIO_F_IOMMU_PLATFORM, and when it is set, vhost uses a
> device-iotlb. In addition, vhost and QEMU communicate using vhost_msg
> basically to cache mappings correctly in the vhost, so I wonder what's
> the role of ats in this case.
> 
> A related question is that if we use SMMU emulation[2] on ARM without
> those options, does vhost cache mappings as if it has a device-iotlb?
> (I guess this is the case.)
The vsmmuv3 emulation code does not support ATS at the moment. vhost support
is something different. As Peter explained, it comes with the capability
of the virtio device to register unmap notifiers. Those notifiers get
called each time there are TLB invalidation commands. That way the
in-kernel vhost cache can be invalidated. vhost support was there until
vsmmuv3 v7. In the latest versions, I removed it to help reviewers
concentrate on the root functionality. However, I will send it to you
based on v9.
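
For illustration only, the unmap-notifier idea can be pictured roughly as
below. This is a minimal sketch and not QEMU or vhost code: every name in it
(EmulatedIommu, TranslationCache, iommu_register_unmap_notifier, and so on)
is hypothetical, and the cache is a toy array standing in for the in-kernel
vhost translation cache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical; this only models the idea. */
#define CACHE_SLOTS   4
#define MAX_NOTIFIERS 8

/* Toy stand-in for a device-side translation cache. */
typedef struct TranslationCache {
    uint64_t iova[CACHE_SLOTS];
    int valid[CACHE_SLOTS];
} TranslationCache;

typedef void (*UnmapNotifyFn)(void *opaque, uint64_t iova, uint64_t size);

typedef struct EmulatedIommu {
    struct {
        UnmapNotifyFn fn;
        void *opaque;
    } notifiers[MAX_NOTIFIERS];
    size_t count;
} EmulatedIommu;

/* A device registers its interest in invalidations. Returns 0 on success. */
static int iommu_register_unmap_notifier(EmulatedIommu *iommu,
                                         UnmapNotifyFn fn, void *opaque)
{
    assert(fn != NULL);
    if (iommu->count >= MAX_NOTIFIERS) {
        return -1;
    }
    iommu->notifiers[iommu->count].fn = fn;
    iommu->notifiers[iommu->count].opaque = opaque;
    iommu->count++;
    return 0;
}

/* Called when the guest issues a TLB invalidation for [iova, iova + size). */
static void iommu_invalidate(EmulatedIommu *iommu, uint64_t iova, uint64_t size)
{
    for (size_t i = 0; i < iommu->count; i++) {
        iommu->notifiers[i].fn(iommu->notifiers[i].opaque, iova, size);
    }
}

/* Notifier callback: drop every cached entry inside the unmapped range. */
static void cache_unmap(void *opaque, uint64_t iova, uint64_t size)
{
    TranslationCache *cache = opaque;

    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        if (cache->valid[i] && cache->iova[i] >= iova &&
            cache->iova[i] - iova < size) {
            cache->valid[i] = 0;
        }
    }
}
```

The point of the sketch is only that the cache owner never polls: it is told
about each invalidation through the callback it registered.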

Thanks

Eric
> 
> I'm pretty new to QEMU code, so I might be missing something. Can
> somebody shed some light on it?
> 
> [1] https://wiki.qemu.org/Features/VT-d
> [2] http://lists.nongnu.org/archive/html/qemu-devel/2018-02/msg04736.html
> 
> Thanks,
> Jintack
> 
> 



Re: [Qemu-devel] [PATCH v1] numa: s390x has no NUMA

2018-02-26 Thread Cornelia Huck
On Fri, 23 Feb 2018 18:36:57 +0100
David Hildenbrand  wrote:

> Right now it is possible to crash QEMU for s390x by providing e.g.
> -numa node,nodeid=0,cpus=0-1
> 
> Problem is, that numa.c uses mc->cpu_index_to_instance_props as an
> indicator whether NUMA is supported by a machine type. We don't
> implement NUMA on s390x (and that concept also doesn't really exist).
> We need mc->cpu_index_to_instance_props for query-cpus.

Is existence of cpu_index_to_instance_props the correct indicator for
numa, then?

OTOH, your patch is straightforward...

> 
> So let's fix this case.
> 
> qemu-system-s390x: -numa node,nodeid=0,cpus=0-1: NUMA is not supported by
>this machine-type
> 
> Signed-off-by: David Hildenbrand 
> ---
>  numa.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/numa.c b/numa.c
> index 7e0e789b02..3b9be613d9 100644
> --- a/numa.c
> +++ b/numa.c
> @@ -80,10 +80,16 @@ static void parse_numa_node(MachineState *ms, 
> NumaNodeOptions *node,
>  return;
>  }
>  
> +#ifdef TARGET_S390X
> +/* s390x provides cpu_index_to_instance_props but has no NUMA */
> +error_report("NUMA is not supported by this machine-type");
> +exit(1);
> +#else
>  if (!mc->cpu_index_to_instance_props) {
>  error_report("NUMA is not supported by this machine-type");
>  exit(1);
>  }
> +#endif
>  for (cpus = node->cpus; cpus; cpus = cpus->next) {
>  CpuInstanceProperties props;
>  if (cpus->value >= max_cpus) {




Re: [Qemu-devel] [PATCH v1] numa: s390x has no NUMA

2018-02-26 Thread David Hildenbrand
On 26.02.2018 11:19, Cornelia Huck wrote:
> On Fri, 23 Feb 2018 18:36:57 +0100
> David Hildenbrand  wrote:
> 
>> Right now it is possible to crash QEMU for s390x by providing e.g.
>> -numa node,nodeid=0,cpus=0-1
>>
>> Problem is, that numa.c uses mc->cpu_index_to_instance_props as an
>> indicator whether NUMA is supported by a machine type. We don't
>> implement NUMA on s390x (and that concept also doesn't really exist).
>> We need mc->cpu_index_to_instance_props for query-cpus.
> 
> Is existence of cpu_index_to_instance_props the correct indicator for
> numa, then?
> 
> OTOH, your patch is straightforward...

Maybe it is get_default_cpu_node_id as Christian discovered?


-- 

Thanks,

David / dhildenb



Re: [Qemu-devel] [PATCH 7/7] ovmf: add DxeTpm2MeasureBootLib

2018-02-26 Thread Laszlo Ersek
On 02/23/18 14:23, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau 
>
> The library registers a security management handler to measure images
> that are not measured in the PEI phase.
>
> This seems to work for example with the qemu PXE rom:
>
> Loading driver at 0x0003E6C2000 EntryPoint=0x0003E6C9076 8086100e.efi
>
> And the following binary_bios_measurements log entry seems to be
> added:
>
> PCR: 2  type: EV_EFI_BOOT_SERVICES_DRIVER  size: 0x4e  digest: 70a22475e9f18806d2ed9193b48d80d26779d9a4
>
> CC: Laszlo Ersek 
> CC: Stefan Berger 
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Marc-André Lureau 
> ---
>  OvmfPkg/OvmfPkgX64.dsc | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/OvmfPkg/OvmfPkgX64.dsc b/OvmfPkg/OvmfPkgX64.dsc
> index 2281bd5ff8..92ed9f3b0c 100644
> --- a/OvmfPkg/OvmfPkgX64.dsc
> +++ b/OvmfPkg/OvmfPkgX64.dsc
> @@ -677,7 +677,10 @@
>MdeModulePkg/Universal/SecurityStubDxe/SecurityStubDxe.inf {
>      <LibraryClasses>
>
> NULL|SecurityPkg/Library/DxeImageVerificationLib/DxeImageVerificationLib.inf
> - }
> +!if $(TPM2_ENABLE) == TRUE
> +  
> NULL|SecurityPkg/Library/DxeTpm2MeasureBootLib/DxeTpm2MeasureBootLib.inf
> +!endif
> +  }
>  !else
>MdeModulePkg/Universal/SecurityStubDxe/SecurityStubDxe.inf
>  !endif
>

This looks OK to me.

First, can you please clean up the SecurityStubDxe stanza as follows (as
a separate patch):

> diff --git a/OvmfPkg/OvmfPkgX64.dsc b/OvmfPkg/OvmfPkgX64.dsc
> index 96fc7b82e708..f4288b625cba 100644
> --- a/OvmfPkg/OvmfPkgX64.dsc
> +++ b/OvmfPkg/OvmfPkgX64.dsc
> @@ -634,14 +634,12 @@ [Components]
>
>MdeModulePkg/Core/RuntimeDxe/RuntimeDxe.inf
>
> -!if $(SECURE_BOOT_ENABLE) == TRUE
>MdeModulePkg/Universal/SecurityStubDxe/SecurityStubDxe.inf {
>      <LibraryClasses>
> +!if $(SECURE_BOOT_ENABLE) == TRUE
>
> NULL|SecurityPkg/Library/DxeImageVerificationLib/DxeImageVerificationLib.inf
> -   }
> -!else
> -  MdeModulePkg/Universal/SecurityStubDxe/SecurityStubDxe.inf
>  !endif
> +  }
>
>MdeModulePkg/Universal/EbcDxe/EbcDxe.inf
>PcAtChipsetPkg/8259InterruptControllerDxe/8259.inf

The idea is that "SecurityStubDxe.inf" should be included
unconditionally; only its plug-in libs should be conditional on various
build flags. While the current (pre-patch) code does that -- in effect
-- for SECURE_BOOT_ENABLE already, your patch (as-is) can only add
TPM2_ENABLE *within* SECURE_BOOT_ENABLE.

I don't think that's for the best -- first we should make
DxeImageVerificationLib the *only* bit that's conditional on
SECURE_BOOT_ENABLE, and then we can add DxeTpm2MeasureBootLib
independently. (If neither build option is specified, we'll have a
<LibraryClasses> list that's empty, but that's perfectly fine.)

Thanks!
Laszlo



Re: [Qemu-devel] [PATCH v1] numa: s390x has no NUMA

2018-02-26 Thread Cornelia Huck
On Mon, 26 Feb 2018 11:28:26 +0100
David Hildenbrand  wrote:

> On 26.02.2018 11:19, Cornelia Huck wrote:
> > On Fri, 23 Feb 2018 18:36:57 +0100
> > David Hildenbrand  wrote:
> >   
> >> Right now it is possible to crash QEMU for s390x by providing e.g.
> >> -numa node,nodeid=0,cpus=0-1
> >>
> >> Problem is, that numa.c uses mc->cpu_index_to_instance_props as an
> >> indicator whether NUMA is supported by a machine type. We don't
> >> implement NUMA on s390x (and that concept also doesn't really exist).
> >> We need mc->cpu_index_to_instance_props for query-cpus.  
> > 
> > Is existence of cpu_index_to_instance_props the correct indicator for
> > numa, then?
> > 
> > OTOH, your patch is straightforward...  
> 
> Maybe it is get_default_cpu_node_id as Christian discovered?

Yes, that seems like a better candidate for checking.
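
As a rough illustration of that idea (not the actual patch), the check could
key on the NUMA-specific callback instead of the one that query-cpus also
needs. The struct below is a cut-down, hypothetical stand-in for
MachineClass with simplified callback signatures; the real QEMU prototypes
differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for QEMU's MachineClass. */
typedef struct MachineClassSketch {
    /* Also needed for query-cpus, so it says nothing about NUMA. */
    int (*cpu_index_to_instance_props)(int cpu_index);
    /* Only meaningful on machines that really implement NUMA. */
    int (*get_default_cpu_node_id)(int cpu_index);
} MachineClassSketch;

/* Treat the NUMA-only callback as the capability indicator. */
static bool machine_supports_numa(const MachineClassSketch *mc)
{
    assert(mc != NULL);
    return mc->get_default_cpu_node_id != NULL;
}

/* Example callbacks used to build the two kinds of machine below. */
static int sketch_props(int cpu_index)
{
    return cpu_index;
}

static int sketch_default_node(int cpu_index)
{
    (void)cpu_index;
    return 0;
}
```

With this shape, a machine that sets only cpu_index_to_instance_props (the
s390x case described above) is rejected for -numa without any #ifdef.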



[Qemu-devel] [PULL-for-s390x 04/14] s390-ccw: update libc

2018-02-26 Thread Thomas Huth
From: "Collin L. Walling" 

Moved:
  memcmp from bootmap.h to libc.h (renamed from _memcmp)
  strlen from sclp.c to libc.h (renamed from _strlen)

Added C standard functions:
  isdigit

Added non-C-standard functions:
  uitoa
  atoui

Signed-off-by: Collin L. Walling 
Acked-by: Christian Borntraeger 
Reviewed-by: Janosch Frank 
Reviewed-by: Thomas Huth 
Signed-off-by: Thomas Huth 
---
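Reader's note, for illustration only: a standalone sketch of the string to
unsigned conversion this patch adds. It is not the code from the patch. The
names atoui_sketch and sketch_isdigit are made up here, and the sketch uses
a uint64_t accumulator so that the documented range up to UINT64_MAX is
representable. Like the loop in the patch, it stops at the first non-digit
and returns whatever was converted up to that point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local digit test so the sketch does not depend on any libc. */
static int sketch_isdigit(int c)
{
    return c >= '0' && c <= '9';
}

/* Convert a decimal string to uint64_t; leading spaces are skipped. */
static uint64_t atoui_sketch(const char *str)
{
    uint64_t val = 0;

    if (!str) {
        return 0;
    }

    while (*str == ' ') {
        str++;
    }

    /* Stop at the first non-digit (including the terminating NUL). */
    while (sketch_isdigit((unsigned char)*str)) {
        val = val * 10 + (uint64_t)(*str - '0');
        str++;
    }

    return val;
}

/* A few usage examples, written as checks. */
static void atoui_sketch_examples(void)
{
    assert(atoui_sketch("  42") == 42);
    assert(atoui_sketch("12ab") == 12);
    assert(atoui_sketch("") == 0);
}
```

Overflow past UINT64_MAX is not detected in this sketch; it wraps silently.
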
 pc-bios/s390-ccw/Makefile  |  2 +-
 pc-bios/s390-ccw/bootmap.c |  4 +--
 pc-bios/s390-ccw/bootmap.h | 16 +
 pc-bios/s390-ccw/libc.c| 88 ++
 pc-bios/s390-ccw/libc.h| 37 +--
 pc-bios/s390-ccw/main.c| 17 +
 pc-bios/s390-ccw/sclp.c| 10 +-
 7 files changed, 129 insertions(+), 45 deletions(-)
 create mode 100644 pc-bios/s390-ccw/libc.c

diff --git a/pc-bios/s390-ccw/Makefile b/pc-bios/s390-ccw/Makefile
index 6d0c2ee..9f7904f 100644
--- a/pc-bios/s390-ccw/Makefile
+++ b/pc-bios/s390-ccw/Makefile
@@ -9,7 +9,7 @@ $(call set-vpath, $(SRC_PATH)/pc-bios/s390-ccw)
 
 .PHONY : all clean build-all
 
-OBJECTS = start.o main.o bootmap.o sclp.o virtio.o virtio-scsi.o 
virtio-blkdev.o
+OBJECTS = start.o main.o bootmap.o sclp.o virtio.o virtio-scsi.o 
virtio-blkdev.o libc.o
 QEMU_CFLAGS := $(filter -W%, $(QEMU_CFLAGS))
 QEMU_CFLAGS += -ffreestanding -fno-delete-null-pointer-checks -msoft-float
 QEMU_CFLAGS += -march=z900 -fPIE -fno-strict-aliasing
diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c
index a94638d..092fb35 100644
--- a/pc-bios/s390-ccw/bootmap.c
+++ b/pc-bios/s390-ccw/bootmap.c
@@ -506,7 +506,7 @@ static bool is_iso_bc_entry_compatible(IsoBcSection *s)
 "Failed to read image sector 0");
 
 /* Checking bytes 8 - 32 for S390 Linux magic */
-return !_memcmp(magic_sec + 8, linux_s390_magic, 24);
+return !memcmp(magic_sec + 8, linux_s390_magic, 24);
 }
 
 /* Location of the current sector of the directory */
@@ -635,7 +635,7 @@ static uint32_t find_iso_bc(void)
 if (vd->type == VOL_DESC_TYPE_BOOT) {
 IsoVdElTorito *et = &vd->vd.boot;
 
-if (!_memcmp(&et->el_torito[0], el_torito_magic, 32)) {
+if (!memcmp(&et->el_torito[0], el_torito_magic, 32)) {
 return bswap32(et->bc_offset);
 }
 }
diff --git a/pc-bios/s390-ccw/bootmap.h b/pc-bios/s390-ccw/bootmap.h
index 4bd95cd..4cf7e1e 100644
--- a/pc-bios/s390-ccw/bootmap.h
+++ b/pc-bios/s390-ccw/bootmap.h
@@ -328,20 +328,6 @@ static inline bool magic_match(const void *data, const 
void *magic)
 return *((uint32_t *)data) == *((uint32_t *)magic);
 }
 
-static inline int _memcmp(const void *s1, const void *s2, size_t n)
-{
-int i;
-const uint8_t *p1 = s1, *p2 = s2;
-
-for (i = 0; i < n; i++) {
-if (p1[i] != p2[i]) {
-return p1[i] > p2[i] ? 1 : -1;
-}
-}
-
-return 0;
-}
-
 static inline uint32_t iso_733_to_u32(uint64_t x)
 {
 return (uint32_t)x;
@@ -434,7 +420,7 @@ const uint8_t vol_desc_magic[] = "CD001";
 
 static inline bool is_iso_vd_valid(IsoVolDesc *vd)
 {
-return !_memcmp(&vd->ident[0], vol_desc_magic, 5) &&
+return !memcmp(&vd->ident[0], vol_desc_magic, 5) &&
vd->version == 0x1 &&
vd->type <= VOL_DESC_TYPE_PARTITION;
 }
diff --git a/pc-bios/s390-ccw/libc.c b/pc-bios/s390-ccw/libc.c
new file mode 100644
index 000..38ea77d
--- /dev/null
+++ b/pc-bios/s390-ccw/libc.c
@@ -0,0 +1,88 @@
+/*
+ * libc-style definitions and functions
+ *
+ * Copyright 2018 IBM Corp.
+ * Author(s): Collin L. Walling 
+ *
+ * This code is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ */
+
+#include "libc.h"
+#include "s390-ccw.h"
+
+/**
+ * atoui:
+ * @str: the string to be converted.
+ *
+ * Given a string @str, convert it to an integer. Leading spaces are
+ * ignored. Any other non-numerical value will terminate the conversion
+ * and return the value converted so far (0 if no digits were seen). This
+ * function only handles numbers between 0 and UINT64_MAX inclusive.
+ *
+ * Returns: an integer converted from the string @str, or the number 0
+ * if no digits could be converted.
+ */
+uint64_t atoui(const char *str)
+{
+uint64_t val = 0;
+
+if (!str || !str[0]) {
+return 0;
+}
+
+while (*str == ' ') {
+str++;
+}
+
+while (*str) {
+if (!isdigit(*str)) {
+break;
+}
+val = val * 10 + *str - '0';
+str++;
+}
+
+return val;
+}
+
+/**
+ * uitoa:
+ * @num: an integer (base 10) to be converted.
+ * @str: a pointer to a string to store the conversion.
+ * @len: the length of the passed string.
+ *
+ * Given an integer @num, convert it to a string. The string @str must be
+ * allocated beforehand. The resulting string will be null terminated and
+ * returned. This function only handles numbers between 0

[Qemu-devel] [PULL-for-s390x 00/14] s390-ccw firmware update

2018-02-26 Thread Thomas Huth

 Hi Cornelia!

The following changes since commit 0a773d55ac76c5aa89ed9187a3bc5af8c5c2a6d0:

  maintainers: Add myself as a OpenBSD maintainer (2018-02-23 12:05:07 +)

are available in the git repository at:

  https://github.com/huth/qemu.git tags/s390-ccw-bios-2018-02-26

for you to fetch changes up to 9c050f3d15697c4c84c9d6aa7af779a273b71d87:

  pc-bios/s390: Rebuild the s390x firmware images with the boot menu changes 
(2018-02-26 11:10:30 +0100)


Boot menu patches by Collin L. Walling


Collin L. Walling (13):
  s390-ccw: refactor boot map table code
  s390-ccw: refactor eckd_block_num to use CHS
  s390-ccw: refactor IPL structs
  s390-ccw: update libc
  s390-ccw: move auxiliary IPL data to separate location
  s390-ccw: parse and set boot menu options
  s390-ccw: set up interactive boot menu parameters
  s390-ccw: read stage2 boot loader data to find menu
  s390-ccw: print zipl boot menu
  s390-ccw: read user input for boot index via the SCLP console
  s390-ccw: set cp_receive mask only when needed and consume pending 
service irqs
  s390-ccw: use zipl values when no boot menu options are present
  s390-ccw: interactive boot menu for scsi

Thomas Huth (1):
  pc-bios/s390: Rebuild the s390x firmware images with the boot menu changes

 hw/s390x/ipl.c  |  77 +-
 hw/s390x/ipl.h  |  31 +-
 pc-bios/s390-ccw.img| Bin 26416 -> 34568 bytes
 pc-bios/s390-ccw/Makefile   |   2 +-
 pc-bios/s390-ccw/bootmap.c  | 184 +++-
 pc-bios/s390-ccw/bootmap.h  |  91 ++--
 pc-bios/s390-ccw/iplb.h |  24 -
 pc-bios/s390-ccw/libc.c |  88 
 pc-bios/s390-ccw/libc.h |  37 ++-
 pc-bios/s390-ccw/main.c |  49 ++---
 pc-bios/s390-ccw/menu.c | 249 
 pc-bios/s390-ccw/s390-ccw.h |  10 ++
 pc-bios/s390-ccw/sclp.c |  39 ---
 pc-bios/s390-ccw/virtio.c   |   2 +-
 pc-bios/s390-netboot.img| Bin 83864 -> 83776 bytes
 15 files changed, 757 insertions(+), 126 deletions(-)
 create mode 100644 pc-bios/s390-ccw/libc.c
 create mode 100644 pc-bios/s390-ccw/menu.c
 mode change 100755 => 100644 pc-bios/s390-netboot.img



[Qemu-devel] [PULL-for-s390x 02/14] s390-ccw: refactor eckd_block_num to use CHS

2018-02-26 Thread Thomas Huth
From: "Collin L. Walling" 

Add new cylinder/head/sector struct. Use it to calculate
eckd block numbers instead of a BootMapPointer (which used
eckd chs anyway).

Signed-off-by: Collin L. Walling 
Reviewed-by: Thomas Huth 
Acked-by: Christian Borntraeger 
Signed-off-by: Thomas Huth 
---
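Reader's note, for illustration only: the arithmetic in eckd_block_num()
worked as a standalone sketch. The function name eckd_block_num_sketch is
made up, and the disk geometry is passed in as parameters here, whereas the
firmware obtains it from virtio_get_sectors() and virtio_get_heads(). The
upper 12 bits of the head field extend the cylinder number, the low 4 bits
are the head, and sectors are numbered from 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct EckdCHS {
    uint16_t cylinder;
    uint16_t head;
    uint8_t sector;
} __attribute__ ((packed)) EckdCHS;

/* Translate a cylinder/head/sector address into a zero-based block number. */
static uint64_t eckd_block_num_sketch(const EckdCHS *chs,
                                      uint64_t sectors, uint64_t heads)
{
    assert(chs != NULL);

    /* Bits 4..15 of 'head' carry the high part of the cylinder number. */
    const uint64_t cylinder = chs->cylinder
                            + ((uint64_t)(chs->head & 0xfff0) << 12);
    const uint64_t head = chs->head & 0x000f;

    return sectors * heads * cylinder
         + sectors * head
         + chs->sector
         - 1; /* block numbers start at zero, sectors at one */
}
```

With 12 sectors and 15 heads, CHS (1, 0, 1) maps to block 180. An all-zero
CHS yields (uint64_t)-1 in this sketch, which lines up with the callers in
the diff that compare the result against -1.
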
 pc-bios/s390-ccw/bootmap.c | 28 ++--
 pc-bios/s390-ccw/bootmap.h |  8 ++--
 2 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c
index a4eaf24..9534f56 100644
--- a/pc-bios/s390-ccw/bootmap.c
+++ b/pc-bios/s390-ccw/bootmap.c
@@ -95,32 +95,32 @@ static inline void verify_boot_info(BootInfo *bip)
"Bad block size in zIPL section of the 1st record.");
 }
 
-static block_number_t eckd_block_num(BootMapPointer *p)
+static block_number_t eckd_block_num(EckdCHS *chs)
 {
 const uint64_t sectors = virtio_get_sectors();
 const uint64_t heads = virtio_get_heads();
-const uint64_t cylinder = p->eckd.cylinder
-+ ((p->eckd.head & 0xfff0) << 12);
-const uint64_t head = p->eckd.head & 0x000f;
+const uint64_t cylinder = chs->cylinder
++ ((chs->head & 0xfff0) << 12);
+const uint64_t head = chs->head & 0x000f;
 const block_number_t block = sectors * heads * cylinder
+ sectors * head
-   + p->eckd.sector
+   + chs->sector
- 1; /* block nr starts with zero */
 return block;
 }
 
 static bool eckd_valid_address(BootMapPointer *p)
 {
-const uint64_t head = p->eckd.head & 0x000f;
+const uint64_t head = p->eckd.chs.head & 0x000f;
 
 if (head >= virtio_get_heads()
-||  p->eckd.sector > virtio_get_sectors()
-||  p->eckd.sector <= 0) {
+||  p->eckd.chs.sector > virtio_get_sectors()
+||  p->eckd.chs.sector <= 0) {
 return false;
 }
 
 if (!virtio_guessed_disk_nature() &&
-eckd_block_num(p) >= virtio_get_blocks()) {
+eckd_block_num(&p->eckd.chs) >= virtio_get_blocks()) {
 return false;
 }
 
@@ -140,7 +140,7 @@ static block_number_t load_eckd_segments(block_number_t 
blk, uint64_t *address)
 do {
 more_data = false;
 for (j = 0;; j++) {
-block_nr = eckd_block_num((void *)&(bprs[j].xeckd));
+block_nr = eckd_block_num(&bprs[j].xeckd.bptr.chs);
 if (is_null_block_number(block_nr)) { /* end of chunk */
 break;
 }
@@ -198,7 +198,7 @@ static void run_eckd_boot_script(block_number_t 
bmt_block_nr)
 memset(sec, FREE_SPACE_FILLER, sizeof(sec));
 read_block(bmt_block_nr, sec, "Cannot read Boot Map Table");
 
-block_nr = eckd_block_num(&bmt->entry[loadparm]);
+block_nr = eckd_block_num(&bmt->entry[loadparm].xeckd.bptr.chs);
 IPL_assert(block_nr != -1, "Cannot find Boot Map Table Entry");
 
 memset(sec, FREE_SPACE_FILLER, sizeof(sec));
@@ -206,7 +206,7 @@ static void run_eckd_boot_script(block_number_t 
bmt_block_nr)
 
 for (i = 0; bms->entry[i].type == BOOT_SCRIPT_LOAD; i++) {
 address = bms->entry[i].address.load_address;
-block_nr = eckd_block_num(&(bms->entry[i].blkptr));
+block_nr = eckd_block_num(&bms->entry[i].blkptr.xeckd.bptr.chs);
 
 do {
 block_nr = load_eckd_segments(block_nr, &address);
@@ -239,7 +239,7 @@ static void ipl_eckd_cdl(void)
"Non-ECKD device type in zIPL section of IPL2 record.");
 
 /* save pointer to Boot Map Table */
-bmt_block_nr = eckd_block_num(&mbr->blockptr);
+bmt_block_nr = eckd_block_num(&mbr->blockptr.xeckd.bptr.chs);
 
 memset(sec, FREE_SPACE_FILLER, sizeof(sec));
 read_block(2, vlbl, "Cannot read Volume Label at block 2");
@@ -300,7 +300,7 @@ static void ipl_eckd_ldl(ECKD_IPL_mode_t mode)
 verify_boot_info(bip);
 
 /* save pointer to Boot Map Table */
-bmt_block_nr = eckd_block_num((void *)&bip->bp.ipl.bm_ptr.eckd.bptr);
+bmt_block_nr = eckd_block_num(&bip->bp.ipl.bm_ptr.eckd.bptr.chs);
 
 run_eckd_boot_script(bmt_block_nr);
 /* no return */
diff --git a/pc-bios/s390-ccw/bootmap.h b/pc-bios/s390-ccw/bootmap.h
index 486c0f3..b361084 100644
--- a/pc-bios/s390-ccw/bootmap.h
+++ b/pc-bios/s390-ccw/bootmap.h
@@ -32,10 +32,14 @@ typedef struct FbaBlockPtr {
 uint16_t blockct;
 } __attribute__ ((packed)) FbaBlockPtr;
 
-typedef struct EckdBlockPtr {
-uint16_t cylinder; /* cylinder/head/sector is an address of the block */
+typedef struct EckdCHS {
+uint16_t cylinder;
 uint16_t head;
 uint8_t sector;
+} __attribute__ ((packed)) EckdCHS;
+
+typedef struct EckdBlockPtr {
+EckdCHS chs; /* cylinder/head/sector is an address of the block */
 uint16_t size;
 uint8_t count; /* (size_in_blocks-1);
 * it's 0 for TablePtr, ScriptPtr, and SectionPtr */
-- 
1.8.3.1




[Qemu-devel] [PULL-for-s390x 03/14] s390-ccw: refactor IPL structs

2018-02-26 Thread Thomas Huth
From: "Collin L. Walling" 

ECKD DASDs have different IPL structures for CDL and LDL
formats. The current Ipl1 and Ipl2 structs follow the CDL
format, so we prepend "EckdCdl" to them. Boot info for LDL
has been moved to a new struct: EckdLdlIpl1.

Signed-off-by: Collin L. Walling 
Acked-by: Janosch Frank 
Reviewed-by: Thomas Huth 
Acked-by: Christian Borntraeger 
Signed-off-by: Thomas Huth 
---
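Reader's note, for illustration only: a small layout check showing why the
new structs can replace the open-coded offsets. The *Sketch types are
placeholders (the real BootInfo and XEckdMbr live in bootmap.h and are
larger); only the position of the members matters here. 112 reserved bytes
put the boot info at 0x70, the offset the old code added to sec by hand,
and key[4] plus 88 reserved bytes put the MBR at byte 92 of the IPL2 record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholders for the real structs; only their position is of interest. */
typedef struct BootInfoSketch {
    uint8_t magic[4];
} __attribute__((packed)) BootInfoSketch;

typedef struct XEckdMbrSketch {
    uint8_t magic[4];
} __attribute__((packed)) XEckdMbrSketch;

typedef struct EckdCdlIpl2Sketch {
    uint8_t key[4]; /* == "IPL2" */
    uint8_t reserved0[88];
    XEckdMbrSketch mbr;
} __attribute__((packed)) EckdCdlIpl2Sketch;

typedef struct EckdLdlIpl1Sketch {
    uint8_t reserved[112];
    BootInfoSketch bip; /* BootInfo is MBR for LDL */
} __attribute__((packed)) EckdLdlIpl1Sketch;

/* Verify that the member offsets match the constants the old code used. */
static void check_ipl_layout(void)
{
    assert(offsetof(EckdLdlIpl1Sketch, bip) == 0x70);
    assert(offsetof(EckdCdlIpl2Sketch, mbr) == 92);
}
```
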
 pc-bios/s390-ccw/bootmap.c | 12 ++--
 pc-bios/s390-ccw/bootmap.h | 37 +
 2 files changed, 27 insertions(+), 22 deletions(-)

diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c
index 9534f56..a94638d 100644
--- a/pc-bios/s390-ccw/bootmap.c
+++ b/pc-bios/s390-ccw/bootmap.c
@@ -221,7 +221,7 @@ static void run_eckd_boot_script(block_number_t 
bmt_block_nr)
 static void ipl_eckd_cdl(void)
 {
 XEckdMbr *mbr;
-Ipl2 *ipl2 = (void *)sec;
+EckdCdlIpl2 *ipl2 = (void *)sec;
 IplVolumeLabel *vlbl = (void *)sec;
 block_number_t bmt_block_nr;
 
@@ -231,7 +231,7 @@ static void ipl_eckd_cdl(void)
 memset(sec, FREE_SPACE_FILLER, sizeof(sec));
 read_block(1, ipl2, "Cannot read IPL2 record at block 1");
 
-mbr = &ipl2->u.x.mbr;
+mbr = &ipl2->mbr;
 IPL_assert(magic_match(mbr, ZIPL_MAGIC), "No zIPL section in IPL2 
record.");
 IPL_assert(block_size_ok(mbr->blockptr.xeckd.bptr.size),
"Bad block size in zIPL section of IPL2 record.");
@@ -281,7 +281,7 @@ static void print_eckd_ldl_msg(ECKD_IPL_mode_t mode)
 static void ipl_eckd_ldl(ECKD_IPL_mode_t mode)
 {
 block_number_t bmt_block_nr;
-BootInfo *bip = (void *)(sec + 0x70); /* BootInfo is MBR for LDL */
+EckdLdlIpl1 *ipl1 = (void *)sec;
 
 if (mode != ECKD_LDL_UNLABELED) {
 print_eckd_ldl_msg(mode);
@@ -292,15 +292,15 @@ static void ipl_eckd_ldl(ECKD_IPL_mode_t mode)
 memset(sec, FREE_SPACE_FILLER, sizeof(sec));
 read_block(0, sec, "Cannot read block 0 to grab boot info.");
 if (mode == ECKD_LDL_UNLABELED) {
-if (!magic_match(bip->magic, ZIPL_MAGIC)) {
+if (!magic_match(ipl1->bip.magic, ZIPL_MAGIC)) {
 return; /* not applicable layout */
 }
 sclp_print("unlabeled LDL.\n");
 }
-verify_boot_info(bip);
+verify_boot_info(&ipl1->bip);
 
 /* save pointer to Boot Map Table */
-bmt_block_nr = eckd_block_num(&bip->bp.ipl.bm_ptr.eckd.bptr.chs);
+bmt_block_nr = eckd_block_num(&ipl1->bip.bp.ipl.bm_ptr.eckd.bptr.chs);
 
 run_eckd_boot_script(bmt_block_nr);
 /* no return */
diff --git a/pc-bios/s390-ccw/bootmap.h b/pc-bios/s390-ccw/bootmap.h
index b361084..4bd95cd 100644
--- a/pc-bios/s390-ccw/bootmap.h
+++ b/pc-bios/s390-ccw/bootmap.h
@@ -239,22 +239,27 @@ typedef struct BootInfo {  /* @ 0x70, record #0   
 */
 } bp;
 } __attribute__ ((packed)) BootInfo; /* see also XEckdMbr   */
 
-typedef struct Ipl1 {
-unsigned char key[4]; /* == "IPL1" */
-unsigned char data[24];
-} __attribute__((packed)) Ipl1;
-
-typedef struct Ipl2 {
-unsigned char key[4]; /* == "IPL2" */
-union {
-unsigned char data[144];
-struct {
-unsigned char reserved1[92-4];
-XEckdMbr mbr;
-unsigned char reserved2[144-(92-4)-sizeof(XEckdMbr)];
-} x;
-} u;
-} __attribute__((packed)) Ipl2;
+/*
+ * Structs for IPL
+ */
+#define STAGE2_BLK_CNT_MAX  24 /* Stage 1b can load up to 24 blocks */
+
+typedef struct EckdCdlIpl1 {
+uint8_t key[4]; /* == "IPL1" */
+uint8_t data[24];
+} __attribute__((packed)) EckdCdlIpl1;
+
+typedef struct EckdCdlIpl2 {
+uint8_t key[4]; /* == "IPL2" */
+uint8_t reserved0[88];
+XEckdMbr mbr;
+uint8_t reserved[24];
+} __attribute__((packed)) EckdCdlIpl2;
+
+typedef struct EckdLdlIpl1 {
+uint8_t reserved[112];
+BootInfo bip; /* BootInfo is MBR for LDL */
+} __attribute__((packed)) EckdLdlIpl1;
 
 typedef struct IplVolumeLabel {
 unsigned char key[4]; /* == "VOL1" */
-- 
1.8.3.1




[Qemu-devel] [PULL-for-s390x 01/14] s390-ccw: refactor boot map table code

2018-02-26 Thread Thomas Huth
From: "Collin L. Walling" 

Some ECKD bootmap code was using structs designed for SCSI.
Even though this works, it hurts readability. Add a new
BootMapTable struct to assist with readability in bootmap
entry code. Also:

- replace ScsiMbr in ECKD code with appropriate structs
- fix read_block messages to reflect BootMapTable
- fixup ipl_scsi to use BootMapTable (referred to as Program Table)
- defined value for maximum table entries

Signed-off-by: Collin L. Walling 
Reviewed-by: Thomas Huth 
Acked-by: Christian Borntraeger 
Signed-off-by: Thomas Huth 
---
 pc-bios/s390-ccw/bootmap.c | 60 +-
 pc-bios/s390-ccw/bootmap.h | 11 -
 2 files changed, 37 insertions(+), 34 deletions(-)

diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c
index 67a6123..a4eaf24 100644
--- a/pc-bios/s390-ccw/bootmap.c
+++ b/pc-bios/s390-ccw/bootmap.c
@@ -182,24 +182,24 @@ static block_number_t load_eckd_segments(block_number_t 
blk, uint64_t *address)
 return block_nr;
 }
 
-static void run_eckd_boot_script(block_number_t mbr_block_nr)
+static void run_eckd_boot_script(block_number_t bmt_block_nr)
 {
 int i;
 unsigned int loadparm = get_loadparm_index();
 block_number_t block_nr;
 uint64_t address;
-ScsiMbr *bte = (void *)sec; /* Eckd bootmap table entry */
+BootMapTable *bmt = (void *)sec;
 BootMapScript *bms = (void *)sec;
 
 debug_print_int("loadparm", loadparm);
-IPL_assert(loadparm < 31, "loadparm value greater than"
+IPL_assert(loadparm <= MAX_TABLE_ENTRIES, "loadparm value greater than"
" maximum number of boot entries allowed");
 
 memset(sec, FREE_SPACE_FILLER, sizeof(sec));
-read_block(mbr_block_nr, sec, "Cannot read MBR");
+read_block(bmt_block_nr, sec, "Cannot read Boot Map Table");
 
-block_nr = eckd_block_num((void *)&(bte->blockptr[loadparm]));
-IPL_assert(block_nr != -1, "No Boot Map");
+block_nr = eckd_block_num(&bmt->entry[loadparm]);
+IPL_assert(block_nr != -1, "Cannot find Boot Map Table Entry");
 
 memset(sec, FREE_SPACE_FILLER, sizeof(sec));
 read_block(block_nr, sec, "Cannot read Boot Map Script");
@@ -223,7 +223,7 @@ static void ipl_eckd_cdl(void)
 XEckdMbr *mbr;
 Ipl2 *ipl2 = (void *)sec;
 IplVolumeLabel *vlbl = (void *)sec;
-block_number_t block_nr;
+block_number_t bmt_block_nr;
 
 /* we have just read the block #0 and recognized it as "IPL1" */
 sclp_print("CDL\n");
@@ -238,8 +238,8 @@ static void ipl_eckd_cdl(void)
 IPL_assert(mbr->dev_type == DEV_TYPE_ECKD,
"Non-ECKD device type in zIPL section of IPL2 record.");
 
-/* save pointer to Boot Script */
-block_nr = eckd_block_num((void *)&(mbr->blockptr));
+/* save pointer to Boot Map Table */
+bmt_block_nr = eckd_block_num(&mbr->blockptr);
 
 memset(sec, FREE_SPACE_FILLER, sizeof(sec));
 read_block(2, vlbl, "Cannot read Volume Label at block 2");
@@ -249,7 +249,7 @@ static void ipl_eckd_cdl(void)
"Invalid magic of volser block");
 print_volser(vlbl->f.volser);
 
-run_eckd_boot_script(block_nr);
+run_eckd_boot_script(bmt_block_nr);
 /* no return */
 }
 
@@ -280,7 +280,7 @@ static void print_eckd_ldl_msg(ECKD_IPL_mode_t mode)
 
 static void ipl_eckd_ldl(ECKD_IPL_mode_t mode)
 {
-block_number_t block_nr;
+block_number_t bmt_block_nr;
 BootInfo *bip = (void *)(sec + 0x70); /* BootInfo is MBR for LDL */
 
 if (mode != ECKD_LDL_UNLABELED) {
@@ -299,8 +299,10 @@ static void ipl_eckd_ldl(ECKD_IPL_mode_t mode)
 }
 verify_boot_info(bip);
 
-block_nr = eckd_block_num((void *)&(bip->bp.ipl.bm_ptr.eckd.bptr));
-run_eckd_boot_script(block_nr);
+/* save pointer to Boot Map Table */
+bmt_block_nr = eckd_block_num((void *)&bip->bp.ipl.bm_ptr.eckd.bptr);
+
+run_eckd_boot_script(bmt_block_nr);
 /* no return */
 }
 
@@ -325,7 +327,7 @@ static void print_eckd_msg(void)
 
 static void ipl_eckd(void)
 {
-ScsiMbr *mbr = (void *)sec;
+XEckdMbr *mbr = (void *)sec;
 LDL_VTOC *vlbl = (void *)sec;
 
 print_eckd_msg();
@@ -449,10 +451,8 @@ static void zipl_run(ScsiBlockPtr *pte)
 static void ipl_scsi(void)
 {
 ScsiMbr *mbr = (void *)sec;
-uint8_t *ns, *ns_end;
 int program_table_entries = 0;
-const int pte_len = sizeof(ScsiBlockPtr);
-ScsiBlockPtr *prog_table_entry = NULL;
+BootMapTable *prog_table = (void *)sec;
 unsigned int loadparm = get_loadparm_index();
 
 /* Grab the MBR */
@@ -467,34 +467,28 @@ static void ipl_scsi(void)
 debug_print_int("MBR Version", mbr->version_id);
 IPL_check(mbr->version_id == 1,
   "Unknown MBR layout version, assuming version 1");
-debug_print_int("program table", mbr->blockptr[0].blockno);
-IPL_assert(mbr->blockptr[0].blockno, "No Program Table");
+debug_print_int("program table", mbr->pt.blockno);
+IPL_assert(mbr->pt.blockno, "No Program Table");
 
 /* 

[Qemu-devel] [PULL-for-s390x 08/14] s390-ccw: read stage2 boot loader data to find menu

2018-02-26 Thread Thomas Huth
From: "Collin L. Walling" 

Read the stage2 boot loader data block-by-block. We scan the
current block for the string "zIPL" to detect the start of the
boot menu banner. We then load the adjacent blocks (previous
block and next block) to account for the possibility of menu
data spanning multiple blocks.

Signed-off-by: Collin L. Walling 
Reviewed-by: Thomas Huth 
Signed-off-by: Thomas Huth 
---
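Reader's note, for illustration only: the banner scan described above,
reduced to a standalone sketch. The function name find_banner_sketch is made
up, and the byte values used for "zIPL" in EBCDIC (0xa9 0xc9 0xd7 0xd3) are
an assumption of this sketch rather than something quoted from the patch.
Unlike the loop in the patch, this version also tests the last possible
start position in the block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* "zIPL" encoded in EBCDIC (assumed values, see the note above). */
static const uint8_t zipl_magic_ebcdic[4] = { 0xa9, 0xc9, 0xd7, 0xd3 };

/*
 * Scan one block for the banner magic.
 * Returns the offset of the first match, or -1 if there is none.
 */
static long find_banner_sketch(const uint8_t *blk, size_t len)
{
    assert(blk != NULL);

    for (size_t i = 0; i + sizeof(zipl_magic_ebcdic) <= len; i++) {
        if (memcmp(blk + i, zipl_magic_ebcdic,
                   sizeof(zipl_magic_ebcdic)) == 0) {
            return (long)i;
        }
    }

    return -1;
}
```

Because the previous, current and next blocks sit back to back in one
buffer, menu text that starts near the end of the current block can be read
straight through into the next one once the offset is known.
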
 pc-bios/s390-ccw/bootmap.c  | 94 ++---
 pc-bios/s390-ccw/bootmap.h  | 23 ++-
 pc-bios/s390-ccw/menu.c | 10 +
 pc-bios/s390-ccw/s390-ccw.h |  2 +
 4 files changed, 122 insertions(+), 7 deletions(-)

diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c
index 092fb35..ae93b55 100644
--- a/pc-bios/s390-ccw/bootmap.c
+++ b/pc-bios/s390-ccw/bootmap.c
@@ -83,6 +83,10 @@ static void jump_to_IPL_code(uint64_t address)
 
 static unsigned char _bprs[8*1024]; /* guessed "max" ECKD sector size */
 static const int max_bprs_entries = sizeof(_bprs) / sizeof(ExtEckdBlockPtr);
+static uint8_t _s2[MAX_SECTOR_SIZE * 3] __attribute__((__aligned__(PAGE_SIZE)));
+static void *s2_prev_blk = _s2;
+static void *s2_cur_blk = _s2 + MAX_SECTOR_SIZE;
+static void *s2_next_blk = _s2 + MAX_SECTOR_SIZE * 2;
 
 static inline void verify_boot_info(BootInfo *bip)
 {
@@ -182,7 +186,77 @@ static block_number_t load_eckd_segments(block_number_t blk, uint64_t *address)
 return block_nr;
 }
 
-static void run_eckd_boot_script(block_number_t bmt_block_nr)
+static bool find_zipl_boot_menu_banner(int *offset)
+{
+int i;
+
+/* Menu banner starts with "zIPL" */
+for (i = 0; i < virtio_get_block_size() - 4; i++) {
+if (magic_match(s2_cur_blk + i, ZIPL_MAGIC_EBCDIC)) {
+*offset = i;
+return true;
+}
+}
+
+return false;
+}
+
+static int eckd_get_boot_menu_index(block_number_t s1b_block_nr)
+{
+block_number_t cur_block_nr;
+block_number_t prev_block_nr = 0;
+block_number_t next_block_nr = 0;
+EckdStage1b *s1b = (void *)sec;
+int banner_offset;
+int i;
+
+/* Get Stage1b data */
+memset(sec, FREE_SPACE_FILLER, sizeof(sec));
+read_block(s1b_block_nr, s1b, "Cannot read stage1b boot loader");
+
+memset(_s2, FREE_SPACE_FILLER, sizeof(_s2));
+
+/* Get Stage2 data */
+for (i = 0; i < STAGE2_BLK_CNT_MAX; i++) {
+cur_block_nr = eckd_block_num(&s1b->seek[i].chs);
+
+if (!cur_block_nr) {
+break;
+}
+
+read_block(cur_block_nr, s2_cur_blk, "Cannot read stage2 boot loader");
+
+if (find_zipl_boot_menu_banner(&banner_offset)) {
+/*
+ * Load the adjacent blocks to account for the
+ * possibility of menu data spanning multiple blocks.
+ */
+if (prev_block_nr) {
+read_block(prev_block_nr, s2_prev_blk,
+   "Cannot read stage2 boot loader");
+}
+
+if (i + 1 < STAGE2_BLK_CNT_MAX) {
+next_block_nr = eckd_block_num(&s1b->seek[i + 1].chs);
+}
+
+if (next_block_nr) {
+read_block(next_block_nr, s2_next_blk,
+   "Cannot read stage2 boot loader");
+}
+
+return menu_get_zipl_boot_index(s2_cur_blk + banner_offset);
+}
+
+prev_block_nr = cur_block_nr;
+}
+
+sclp_print("No zipl boot menu data found. Booting default entry.");
+return 0;
+}
+
+static void run_eckd_boot_script(block_number_t bmt_block_nr,
+ block_number_t s1b_block_nr)
 {
 int i;
 unsigned int loadparm = get_loadparm_index();
@@ -191,6 +265,10 @@ static void run_eckd_boot_script(block_number_t bmt_block_nr)
 BootMapTable *bmt = (void *)sec;
 BootMapScript *bms = (void *)sec;
 
+if (menu_is_enabled_zipl()) {
+loadparm = eckd_get_boot_menu_index(s1b_block_nr);
+}
+
 debug_print_int("loadparm", loadparm);
 IPL_assert(loadparm <= MAX_TABLE_ENTRIES, "loadparm value greater than"
" maximum number of boot entries allowed");
@@ -223,7 +301,7 @@ static void ipl_eckd_cdl(void)
 XEckdMbr *mbr;
 EckdCdlIpl2 *ipl2 = (void *)sec;
 IplVolumeLabel *vlbl = (void *)sec;
-block_number_t bmt_block_nr;
+block_number_t bmt_block_nr, s1b_block_nr;
 
 /* we have just read the block #0 and recognized it as "IPL1" */
 sclp_print("CDL\n");
@@ -241,6 +319,9 @@ static void ipl_eckd_cdl(void)
 /* save pointer to Boot Map Table */
 bmt_block_nr = eckd_block_num(&mbr->blockptr.xeckd.bptr.chs);
 
+/* save pointer to Stage1b Data */
+s1b_block_nr = eckd_block_num(&ipl2->stage1.seek[0].chs);
+
 memset(sec, FREE_SPACE_FILLER, sizeof(sec));
 read_block(2, vlbl, "Cannot read Volume Label at block 2");
 IPL_assert(magic_match(vlbl->key, VOL1_MAGIC),
@@ -249,7 +330,7 @@ static void ipl_eckd_cdl(void)
"Invalid magic of volser blo

[Qemu-devel] [PULL-for-s390x 09/14] s390-ccw: print zipl boot menu

2018-02-26 Thread Thomas Huth
From: "Collin L. Walling" 

When the boot menu options are present and the guest's
disk has been configured by the zipl tool, then the user
will be presented with an interactive boot menu with
labeled entries. An example of what the menu might look
like:

zIPL v1.37.1-build-20170714 interactive boot menu.

0. default (linux-4.13.0)

  1. linux-4.13.0
  2. performance
  3. kvm
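
As the loop in the diff below suggests, the menu data is walked as a
sequence of NUL-terminated strings that ends at an empty string. A
minimal sketch of counting those strings, with a hypothetical helper
name and without the EBCDIC conversion or printing:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: count the NUL-terminated strings stored back to
 * back in menu_data. The sequence ends at the first empty string.
 */
static int count_menu_strings(const char *menu_data)
{
    int count = 0;

    assert(menu_data != NULL);

    while (*menu_data) {
        /* Skip over the current string and its terminating NUL */
        menu_data += strlen(menu_data) + 1;
        count++;
    }

    return count;
}
```

For the example menu above, this would count five strings: the banner,
the default line and the three numbered entries.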

Signed-off-by: Collin L. Walling 
Reviewed-by: Thomas Huth 
Acked-by: Christian Borntraeger 
Signed-off-by: Thomas Huth 
---
 pc-bios/s390-ccw/menu.c | 33 -
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/pc-bios/s390-ccw/menu.c b/pc-bios/s390-ccw/menu.c
index c1d242f..730d44e 100644
--- a/pc-bios/s390-ccw/menu.c
+++ b/pc-bios/s390-ccw/menu.c
@@ -15,11 +15,42 @@
 static uint8_t flag;
 static uint64_t timeout;
 
-int menu_get_zipl_boot_index(const char *menu_data)
+static int get_boot_index(int entries)
 {
 return 0; /* implemented next patch */
 }
 
+static void zipl_println(const char *data, size_t len)
+{
+char buf[len + 2];
+
+ebcdic_to_ascii(data, buf, len);
+buf[len] = '\n';
+buf[len + 1] = '\0';
+
+sclp_print(buf);
+}
+
+int menu_get_zipl_boot_index(const char *menu_data)
+{
+size_t len;
+int entries;
+
+/* Print and count all menu items, including the banner */
+for (entries = 0; *menu_data; entries++) {
+len = strlen(menu_data);
+zipl_println(menu_data, len);
+menu_data += len + 1;
+
+if (entries < 2) {
+sclp_print("\n");
+}
+}
+
+sclp_print("\n");
+return get_boot_index(entries - 1); /* subtract 1 to exclude banner */
+}
+
 void menu_set_parms(uint8_t boot_menu_flag, uint32_t boot_menu_timeout)
 {
 flag = boot_menu_flag;
-- 
1.8.3.1




[Qemu-devel] [PULL-for-s390x 05/14] s390-ccw: move auxiliary IPL data to separate location

2018-02-26 Thread Thomas Huth
From: "Collin L. Walling" 

The s390-ccw firmware needs some information in support of the
boot process which is not available on the native machine.
Examples are the netboot firmware load address and now the
boot menu parameters.

While storing that data in unused fields of the IPL parameter block
works, that approach could create problems if the parameter block
definition should change in the future, because a guest could then
overwrite these fields using the set IPLB diagnose.

In fact the data in question is of more global nature and not really
tied to an IPL device, so separating it is rather logical.

This commit introduces a new structure to hold firmware relevant
IPL parameters set by QEMU. The data is stored at location 204 (dec)
and can contain up to 7 32-bit words. This area is available to
programming in the z/Architecture Principles of Operation and
can thus safely be used by the firmware until the IPL has completed.
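
The size and alignment constraints can be checked at compile time. The
following is a sketch only: it copies the struct from this patch, uses
the GCC/Clang packed attribute, and the helper qipl_field_address is a
hypothetical name introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QIPL_ADDRESS 0xcc   /* 204 decimal */

struct QemuIplParameters {
    uint8_t  reserved1[4];
    uint64_t netboot_start_addr;
    uint8_t  reserved2[16];
} __attribute__((packed));

/* Seven 32-bit words are available, i.e. 28 bytes */
_Static_assert(sizeof(struct QemuIplParameters) == 7 * sizeof(uint32_t),
               "QemuIplParameters must be exactly 28 bytes");

/* 204 is word aligned only, so a 64-bit field needs offset 4 + n * 8 */
_Static_assert((QIPL_ADDRESS +
                offsetof(struct QemuIplParameters, netboot_start_addr)) % 8
               == 0,
               "netboot_start_addr must be double-word aligned");

/* Hypothetical helper: absolute address of a field inside the area */
static size_t qipl_field_address(size_t field_offset)
{
    return QIPL_ADDRESS + field_offset;
}
```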

Signed-off-by: Viktor Mihajlovski 
Signed-off-by: Collin L. Walling 
Reviewed-by: Thomas Huth 
Acked-by: Christian Borntraeger 
[thuth: fixed "4 + 8 * n" comment]
Signed-off-by: Thomas Huth 
---
 hw/s390x/ipl.c  | 18 +-
 hw/s390x/ipl.h  | 25 +++--
 pc-bios/s390-ccw/iplb.h | 18 --
 pc-bios/s390-ccw/main.c |  6 +-
 4 files changed, 61 insertions(+), 6 deletions(-)

diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c
index 0d06fc1..79f5a58 100644
--- a/hw/s390x/ipl.c
+++ b/hw/s390x/ipl.c
@@ -399,6 +399,21 @@ void s390_reipl_request(void)
 qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
 }
 
+static void s390_ipl_prepare_qipl(S390CPU *cpu)
+{
+S390IPLState *ipl = get_ipl_device();
+uint8_t *addr;
+uint64_t len = 4096;
+
+addr = cpu_physical_memory_map(cpu->env.psa, &len, 1);
+if (!addr || len < QIPL_ADDRESS + sizeof(QemuIplParameters)) {
+error_report("Cannot set QEMU IPL parameters");
+return;
+}
+memcpy(addr + QIPL_ADDRESS, &ipl->qipl, sizeof(QemuIplParameters));
+cpu_physical_memory_unmap(addr, len, 1, len);
+}
+
 void s390_ipl_prepare_cpu(S390CPU *cpu)
 {
 S390IPLState *ipl = get_ipl_device();
@@ -418,8 +433,9 @@ void s390_ipl_prepare_cpu(S390CPU *cpu)
 error_report_err(err);
 vm_stop(RUN_STATE_INTERNAL_ERROR);
 }
-ipl->iplb.ccw.netboot_start_addr = cpu_to_be64(ipl->start_addr);
+ipl->qipl.netboot_start_addr = cpu_to_be64(ipl->start_addr);
 }
+s390_ipl_prepare_qipl(cpu);
 }
 
 static void s390_ipl_reset(DeviceState *dev)
diff --git a/hw/s390x/ipl.h b/hw/s390x/ipl.h
index 8a705e0..5cc3b77 100644
--- a/hw/s390x/ipl.h
+++ b/hw/s390x/ipl.h
@@ -16,8 +16,7 @@
 #include "cpu.h"
 
 struct IplBlockCcw {
-uint64_t netboot_start_addr;
-uint8_t  reserved0[77];
+uint8_t  reserved0[85];
 uint8_t  ssid;
 uint16_t devno;
 uint8_t  vm_flags;
@@ -90,6 +89,27 @@ void s390_ipl_prepare_cpu(S390CPU *cpu);
 IplParameterBlock *s390_ipl_get_iplb(void);
 void s390_reipl_request(void);
 
+#define QIPL_ADDRESS  0xcc
+
+/*
+ * The QEMU IPL Parameters will be stored at absolute address
+ * 204 (0xcc) which means it is 32-bit word aligned but not
+ * double-word aligned.
+ * Placement of data fields in this area must account for
+ * their alignment needs. E.g., netboot_start_address must
+ * have an offset of 4 + n * 8 bytes within the struct in order
+ * to keep it double-word aligned.
+ * The total size of the struct must never exceed 28 bytes.
+ * This definition must be kept in sync with the defininition
+ * in pc-bios/s390-ccw/iplb.h.
+ */
+struct QemuIplParameters {
+uint8_t  reserved1[4];
+uint64_t netboot_start_addr;
+uint8_t  reserved2[16];
+} QEMU_PACKED;
+typedef struct QemuIplParameters QemuIplParameters;
+
 #define TYPE_S390_IPL "s390-ipl"
 #define S390_IPL(obj) OBJECT_CHECK(S390IPLState, (obj), TYPE_S390_IPL)
 
@@ -105,6 +125,7 @@ struct S390IPLState {
 bool iplb_valid;
 bool reipl_requested;
 bool netboot;
+QemuIplParameters qipl;
 
 /*< public >*/
 char *kernel;
diff --git a/pc-bios/s390-ccw/iplb.h b/pc-bios/s390-ccw/iplb.h
index 890aed9..31d2934 100644
--- a/pc-bios/s390-ccw/iplb.h
+++ b/pc-bios/s390-ccw/iplb.h
@@ -13,8 +13,7 @@
 #define IPLB_H
 
 struct IplBlockCcw {
-uint64_t netboot_start_addr;
-uint8_t  reserved0[77];
+uint8_t  reserved0[85];
 uint8_t  ssid;
 uint16_t devno;
 uint8_t  vm_flags;
@@ -73,6 +72,21 @@ typedef struct IplParameterBlock IplParameterBlock;
 
 extern IplParameterBlock iplb __attribute__((__aligned__(PAGE_SIZE)));
 
+#define QIPL_ADDRESS  0xcc
+
+/*
+ * This definition must be kept in sync with the defininition
+ * in hw/s390x/ipl.h
+ */
+struct QemuIplParameters {
+uint8_t  reserved1[4];
+uint64_t netboot_start_addr;
+uint8_t  reserved2[16];
+} __attribute__ ((packed));
+typedef struct QemuIplParameters QemuIplParameters;
+
+extern QemuIplParameters qipl;
+
 #define S390_IPL_TYPE_FCP 0x00
 #

[Qemu-devel] [PULL-for-s390x 14/14] pc-bios/s390: Rebuild the s390x firmware images with the boot menu changes

2018-02-26 Thread Thomas Huth
Provide a new s390-ccw.img binary with the boot menu patches by Collin.
Though there should not be any visible changes for the network booting,
the s390-netboot.img binary has been rebuilt, too, since some of the
changes affected the shared source files.

Signed-off-by: Thomas Huth 
---
 pc-bios/s390-ccw.img | Bin 26416 -> 34568 bytes
 pc-bios/s390-netboot.img | Bin 83864 -> 83776 bytes
 2 files changed, 0 insertions(+), 0 deletions(-)
 mode change 100755 => 100644 pc-bios/s390-netboot.img

diff --git a/pc-bios/s390-ccw.img b/pc-bios/s390-ccw.img
index 97155d2638eebc1220a63287fcada21df65b3a77..fbd76bb55ed01367c9e2e5656e9a3e2eb21f823f 100644
GIT binary patch
literal 34568

[Qemu-devel] [PULL-for-s390x 06/14] s390-ccw: parse and set boot menu options

2018-02-26 Thread Thomas Huth
From: "Collin L. Walling" 

Set boot menu options for an s390 guest and store them in
the iplb. These options are set via the QEMU command line
option:

-boot menu=on|off[,splash-time=X]

or via the libvirt domain xml:

<os>
  <bootmenu enable='yes|no' timeout='X'/>
</os>

Where X represents some positive integer representing
milliseconds.

Any value set for loadparm will override all boot menu options.
If loadparm=PROMPT, then the menu will be enabled without a
timeout.
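
The handling of the splash-time value can be sketched in plain C. This
is an illustration under assumptions, not the code in this patch: it
uses the standard strtoull() instead of qemu_strtoul(), the helper name
parse_splash_time is hypothetical, and the byte-order conversion done
by the patch is left out.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: turn a splash-time string into a 32-bit timeout in
 * milliseconds. Invalid input yields 0; values that do not fit into
 * 32 bits are clamped to the maximum.
 */
static uint32_t parse_splash_time(const char *str)
{
    char *end;
    unsigned long long val;

    /* Missing or empty values and anything not starting with a digit */
    if (str == NULL || *str < '0' || *str > '9') {
        return 0;
    }

    errno = 0;
    val = strtoull(str, &end, 10);

    /* Trailing garbage makes the whole value invalid */
    if (*end != '\0') {
        return 0;
    }

    if (errno == ERANGE || val > UINT32_MAX) {
        return UINT32_MAX;
    }

    return (uint32_t)val;
}
```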

Signed-off-by: Collin L. Walling 
Reviewed-by: Janosch Frank 
Reviewed-by: Thomas Huth 
Signed-off-by: Thomas Huth 
---
 hw/s390x/ipl.c  | 52 +
 hw/s390x/ipl.h  |  9 +++--
 pc-bios/s390-ccw/iplb.h |  9 +++--
 3 files changed, 66 insertions(+), 4 deletions(-)

diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c
index 79f5a58..ee2039d 100644
--- a/hw/s390x/ipl.c
+++ b/hw/s390x/ipl.c
@@ -23,6 +23,9 @@
 #include "hw/s390x/ebcdic.h"
 #include "ipl.h"
 #include "qemu/error-report.h"
+#include "qemu/config-file.h"
+#include "qemu/cutils.h"
+#include "qemu/option.h"
 
 #define KERN_IMAGE_START                0x010000UL
 #define KERN_PARM_AREA  0x010480UL
@@ -219,6 +222,54 @@ static Property s390_ipl_properties[] = {
 DEFINE_PROP_END_OF_LIST(),
 };
 
+static void s390_ipl_set_boot_menu(S390IPLState *ipl)
+{
+QemuOptsList *plist = qemu_find_opts("boot-opts");
+QemuOpts *opts = QTAILQ_FIRST(&plist->head);
+uint8_t *flags = &ipl->qipl.qipl_flags;
+uint32_t *timeout = &ipl->qipl.boot_menu_timeout;
+const char *tmp;
+unsigned long splash_time = 0;
+
+if (!get_boot_device(0)) {
+if (boot_menu) {
+error_report("boot menu requires a bootindex to be specified for "
+ "the IPL device.");
+}
+return;
+}
+
+switch (ipl->iplb.pbt) {
+case S390_IPL_TYPE_CCW:
+break;
+default:
+error_report("boot menu is not supported for this device type.");
+return;
+}
+
+if (!boot_menu) {
+return;
+}
+
+*flags |= QIPL_FLAG_BM_OPTS_CMD;
+
+tmp = qemu_opt_get(opts, "splash-time");
+
+if (tmp && qemu_strtoul(tmp, NULL, 10, &splash_time)) {
+error_report("splash-time is invalid, forcing it to 0.");
+*timeout = 0;
+return;
+}
+
+if (splash_time > 0xffffffff) {
+error_report("splash-time is too large, forcing it to max value.");
+*timeout = 0xffffffff;
+return;
+}
+
+*timeout = cpu_to_be32(splash_time);
+}
+
 static bool s390_gen_initial_iplb(S390IPLState *ipl)
 {
 DeviceState *dev_st;
@@ -435,6 +486,7 @@ void s390_ipl_prepare_cpu(S390CPU *cpu)
 }
 ipl->qipl.netboot_start_addr = cpu_to_be64(ipl->start_addr);
 }
+s390_ipl_set_boot_menu(ipl);
 s390_ipl_prepare_qipl(cpu);
 }
 
diff --git a/hw/s390x/ipl.h b/hw/s390x/ipl.h
index 5cc3b77..d6c6f75 100644
--- a/hw/s390x/ipl.h
+++ b/hw/s390x/ipl.h
@@ -91,6 +91,9 @@ void s390_reipl_request(void);
 
 #define QIPL_ADDRESS  0xcc
 
+/* Boot Menu flags */
+#define QIPL_FLAG_BM_OPTS_CMD   0x80
+
 /*
  * The QEMU IPL Parameters will be stored at absolute address
  * 204 (0xcc) which means it is 32-bit word aligned but not
@@ -104,9 +107,11 @@ void s390_reipl_request(void);
  * in pc-bios/s390-ccw/iplb.h.
  */
 struct QemuIplParameters {
-uint8_t  reserved1[4];
+uint8_t  qipl_flags;
+uint8_t  reserved1[3];
 uint64_t netboot_start_addr;
-uint8_t  reserved2[16];
+uint32_t boot_menu_timeout;
+uint8_t  reserved2[12];
 } QEMU_PACKED;
 typedef struct QemuIplParameters QemuIplParameters;
 
diff --git a/pc-bios/s390-ccw/iplb.h b/pc-bios/s390-ccw/iplb.h
index 31d2934..832bb94 100644
--- a/pc-bios/s390-ccw/iplb.h
+++ b/pc-bios/s390-ccw/iplb.h
@@ -74,14 +74,19 @@ extern IplParameterBlock iplb __attribute__((__aligned__(PAGE_SIZE)));
 
 #define QIPL_ADDRESS  0xcc
 
+/* Boot Menu flags */
+#define QIPL_FLAG_BM_OPTS_CMD   0x80
+
 /*
  * This definition must be kept in sync with the defininition
  * in hw/s390x/ipl.h
  */
 struct QemuIplParameters {
-uint8_t  reserved1[4];
+uint8_t  qipl_flags;
+uint8_t  reserved1[3];
 uint64_t netboot_start_addr;
-uint8_t  reserved2[16];
+uint32_t boot_menu_timeout;
+uint8_t  reserved2[12];
 } __attribute__ ((packed));
 typedef struct QemuIplParameters QemuIplParameters;
 
-- 
1.8.3.1




[Qemu-devel] [PULL-for-s390x 07/14] s390-ccw: set up interactive boot menu parameters

2018-02-26 Thread Thomas Huth
From: "Collin L. Walling" 

Reads boot menu flag and timeout values from the iplb and
sets the respective fields for the menu.

Signed-off-by: Collin L. Walling 
Reviewed-by: Thomas Huth 
Signed-off-by: Thomas Huth 
---
 pc-bios/s390-ccw/Makefile   |  2 +-
 pc-bios/s390-ccw/main.c | 24 
 pc-bios/s390-ccw/menu.c | 22 ++
 pc-bios/s390-ccw/s390-ccw.h |  3 +++
 4 files changed, 50 insertions(+), 1 deletion(-)
 create mode 100644 pc-bios/s390-ccw/menu.c

diff --git a/pc-bios/s390-ccw/Makefile b/pc-bios/s390-ccw/Makefile
index 9f7904f..1712c2d 100644
--- a/pc-bios/s390-ccw/Makefile
+++ b/pc-bios/s390-ccw/Makefile
@@ -9,7 +9,7 @@ $(call set-vpath, $(SRC_PATH)/pc-bios/s390-ccw)
 
 .PHONY : all clean build-all
 
-OBJECTS = start.o main.o bootmap.o sclp.o virtio.o virtio-scsi.o virtio-blkdev.o libc.o
+OBJECTS = start.o main.o bootmap.o sclp.o virtio.o virtio-scsi.o virtio-blkdev.o libc.o menu.o
 QEMU_CFLAGS := $(filter -W%, $(QEMU_CFLAGS))
 QEMU_CFLAGS += -ffreestanding -fno-delete-null-pointer-checks -msoft-float
 QEMU_CFLAGS += -march=z900 -fPIE -fno-strict-aliasing
diff --git a/pc-bios/s390-ccw/main.c b/pc-bios/s390-ccw/main.c
index e41b264..32ed70e 100644
--- a/pc-bios/s390-ccw/main.c
+++ b/pc-bios/s390-ccw/main.c
@@ -18,6 +18,9 @@ IplParameterBlock iplb __attribute__((__aligned__(PAGE_SIZE)));
 static char loadparm[8] = { 0, 0, 0, 0, 0, 0, 0, 0 };
 QemuIplParameters qipl;
 
+#define LOADPARM_PROMPT "PROMPT  "
+#define LOADPARM_EMPTY  ""
+
 /*
  * Priniciples of Operations (SA22-7832-09) chapter 17 requires that
  * a subsystem-identification is at 184-187 and bytes 188-191 are zero
@@ -74,6 +77,26 @@ static bool find_dev(Schib *schib, int dev_no)
 return false;
 }
 
+static void menu_setup(void)
+{
+if (memcmp(loadparm, LOADPARM_PROMPT, 8) == 0) {
+menu_set_parms(QIPL_FLAG_BM_OPTS_CMD, 0);
+return;
+}
+
+/* If loadparm was set to any other value, then do not enable menu */
+if (memcmp(loadparm, LOADPARM_EMPTY, 8) != 0) {
+return;
+}
+
+switch (iplb.pbt) {
+case S390_IPL_TYPE_CCW:
+menu_set_parms(qipl.qipl_flags & QIPL_FLAG_BM_OPTS_CMD,
+   qipl.boot_menu_timeout);
+return;
+}
+}
+
 static void virtio_setup(void)
 {
 Schib schib;
@@ -117,6 +140,7 @@ static void virtio_setup(void)
 default:
 panic("List-directed IPL not supported yet!\n");
 }
+menu_setup();
 } else {
 for (ssid = 0; ssid < 0x3; ssid++) {
 blk_schid.ssid = ssid;
diff --git a/pc-bios/s390-ccw/menu.c b/pc-bios/s390-ccw/menu.c
new file mode 100644
index 0000000..1ce33dd
--- /dev/null
+++ b/pc-bios/s390-ccw/menu.c
@@ -0,0 +1,22 @@
+/*
+ * QEMU S390 Interactive Boot Menu
+ *
+ * Copyright 2018 IBM Corp.
+ * Author: Collin L. Walling 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or (at
+ * your option) any later version. See the COPYING file in the top-level
+ * directory.
+ */
+
+#include "libc.h"
+#include "s390-ccw.h"
+
+static uint8_t flag;
+static uint64_t timeout;
+
+void menu_set_parms(uint8_t boot_menu_flag, uint32_t boot_menu_timeout)
+{
+flag = boot_menu_flag;
+timeout = boot_menu_timeout;
+}
diff --git a/pc-bios/s390-ccw/s390-ccw.h b/pc-bios/s390-ccw/s390-ccw.h
index 25d4d21..6cfd4b2 100644
--- a/pc-bios/s390-ccw/s390-ccw.h
+++ b/pc-bios/s390-ccw/s390-ccw.h
@@ -84,6 +84,9 @@ ulong get_second(void);
 /* bootmap.c */
 void zipl_load(void);
 
+/* menu.c */
+void menu_set_parms(uint8_t boot_menu_flag, uint32_t boot_menu_timeout);
+
 static inline void fill_hex(char *out, unsigned char val)
 {
 const char hex[] = "0123456789abcdef";
-- 
1.8.3.1




[Qemu-devel] [PULL-for-s390x 13/14] s390-ccw: interactive boot menu for scsi

2018-02-26 Thread Thomas Huth
From: "Collin L. Walling" 

Interactive boot menu for scsi. This follows a similar procedure
as the interactive menu for eckd dasd. An example follows:

s390x Enumerated Boot Menu.

3 entries detected. Select from index 0 to 2.

Signed-off-by: Collin L. Walling 
Reviewed-by: Thomas Huth 
[thuth: Added additional "break;" statement to avoid analyzer warnings]
Signed-off-by: Thomas Huth 
---
 hw/s390x/ipl.c  |  2 ++
 pc-bios/s390-ccw/bootmap.c  |  4 
 pc-bios/s390-ccw/main.c |  1 +
 pc-bios/s390-ccw/menu.c | 20 
 pc-bios/s390-ccw/s390-ccw.h |  2 ++
 5 files changed, 29 insertions(+)

diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c
index c12e460..798e99a 100644
--- a/hw/s390x/ipl.c
+++ b/hw/s390x/ipl.c
@@ -247,6 +247,8 @@ static void s390_ipl_set_boot_menu(S390IPLState *ipl)
 return;
 }
 break;
+case S390_IPL_TYPE_QEMU_SCSI:
+break;
 default:
 error_report("boot menu is not supported for this device type.");
 return;
diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c
index ae93b55..29bfd8c 100644
--- a/pc-bios/s390-ccw/bootmap.c
+++ b/pc-bios/s390-ccw/bootmap.c
@@ -568,6 +568,10 @@ static void ipl_scsi(void)
 debug_print_int("program table entries", program_table_entries);
 IPL_assert(program_table_entries != 0, "Empty Program Table");
 
+if (menu_is_enabled_enum()) {
+loadparm = menu_get_enum_boot_index(program_table_entries);
+}
+
 debug_print_int("loadparm", loadparm);
 IPL_assert(loadparm <= MAX_TABLE_ENTRIES, "loadparm value greater than"
" maximum number of boot entries allowed");
diff --git a/pc-bios/s390-ccw/main.c b/pc-bios/s390-ccw/main.c
index a7473b0..9d9f8cf 100644
--- a/pc-bios/s390-ccw/main.c
+++ b/pc-bios/s390-ccw/main.c
@@ -92,6 +92,7 @@ static void menu_setup(void)
 
 switch (iplb.pbt) {
 case S390_IPL_TYPE_CCW:
+case S390_IPL_TYPE_QEMU_SCSI:
 menu_set_parms(qipl.qipl_flags & BOOT_MENU_FLAG_MASK,
qipl.boot_menu_timeout);
 return;
diff --git a/pc-bios/s390-ccw/menu.c b/pc-bios/s390-ccw/menu.c
index ee56939..96eec81 100644
--- a/pc-bios/s390-ccw/menu.c
+++ b/pc-bios/s390-ccw/menu.c
@@ -217,6 +217,21 @@ int menu_get_zipl_boot_index(const char *menu_data)
 return get_boot_index(entries - 1); /* subtract 1 to exclude banner */
 }
 
+
+int menu_get_enum_boot_index(int entries)
+{
+char tmp[4];
+
+sclp_print("s390x Enumerated Boot Menu.\n\n");
+
+sclp_print(uitoa(entries, tmp, sizeof(tmp)));
+sclp_print(" entries detected. Select from boot index 0 to ");
+sclp_print(uitoa(entries - 1, tmp, sizeof(tmp)));
+sclp_print(".\n\n");
+
+return get_boot_index(entries);
+}
+
 void menu_set_parms(uint8_t boot_menu_flag, uint32_t boot_menu_timeout)
 {
 flag = boot_menu_flag;
@@ -227,3 +242,8 @@ bool menu_is_enabled_zipl(void)
 {
 return flag & (QIPL_FLAG_BM_OPTS_CMD | QIPL_FLAG_BM_OPTS_ZIPL);
 }
+
+bool menu_is_enabled_enum(void)
+{
+return flag & QIPL_FLAG_BM_OPTS_CMD;
+}
diff --git a/pc-bios/s390-ccw/s390-ccw.h b/pc-bios/s390-ccw/s390-ccw.h
index c4ddf9f..fd18da2 100644
--- a/pc-bios/s390-ccw/s390-ccw.h
+++ b/pc-bios/s390-ccw/s390-ccw.h
@@ -91,6 +91,8 @@ void zipl_load(void);
 void menu_set_parms(uint8_t boot_menu_flag, uint32_t boot_menu_timeout);
 int menu_get_zipl_boot_index(const char *menu_data);
 bool menu_is_enabled_zipl(void);
+int menu_get_enum_boot_index(int entries);
+bool menu_is_enabled_enum(void);
 
 static inline void fill_hex(char *out, unsigned char val)
 {
-- 
1.8.3.1




[Qemu-devel] [PULL-for-s390x 10/14] s390-ccw: read user input for boot index via the SCLP console

2018-02-26 Thread Thomas Huth
From: "Collin L. Walling" 

Implements an sclp_read function to capture input from the
console and a wrapper function that handles parsing certain
characters and adding input to a buffer. The input is checked
for any erroneous values and is handled appropriately.

A prompt will persist until input is entered or the timeout
expires (if one was set). Example:

  Please choose (default will boot in 10 seconds):
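
The timeout is given in milliseconds, while the clock comparator works
in TOD clock units, where 4096 units correspond to one microsecond. A
small sketch of the conversion, using the constant from this patch and
a hypothetical helper name:

```c
#include <stdint.h>

/* 1000 microseconds * 4096 TOD units per microsecond */
#define TOD_CLOCK_MILLISECOND 0x3e8000ULL

/*
 * Hypothetical helper: TOD clock value at which a timeout of timeout_ms
 * milliseconds, started at TOD value now, expires.
 */
static uint64_t tod_deadline(uint64_t now, uint64_t timeout_ms)
{
    return now + timeout_ms * TOD_CLOCK_MILLISECOND;
}
```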

Correct input will boot the respective boot index. If the
user's input is empty, 0, or if the timeout expires, then
the default zipl entry will be chosen. If the input is
within the range of available boot entries, then the
selection will be booted. Any erroneous input will cancel
the timeout and re-prompt the user.
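
The input rules above can be summarised in a short sketch. It is not
the code in this patch: the helper name parse_boot_choice is
hypothetical, and it works on an already collected ASCII string instead
of reading from the SCLP console.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: map the user's input to a boot index.
 * Returns 0 for empty input (default entry), the entered number if it is
 * below entries, and -1 if the user has to be prompted again.
 */
static int parse_boot_choice(const char *input, int entries)
{
    long long value = 0;
    size_t i;

    assert(input != NULL);

    for (i = 0; input[i] != '\0'; i++) {
        if (!isdigit((unsigned char)input[i])) {
            return -1;              /* erroneous input */
        }
        value = value * 10 + (input[i] - '0');
        if (value >= entries) {
            return -1;              /* outside the available entries */
        }
    }

    return (int)value;
}
```

Checking the range after every digit keeps the intermediate value
small, so long digit strings cannot overflow it.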

Signed-off-by: Collin L. Walling 
Reviewed-by: Thomas Huth 
Acked-by: Christian Borntraeger 
Signed-off-by: Thomas Huth 
---
 pc-bios/s390-ccw/menu.c | 149 +++-
 pc-bios/s390-ccw/s390-ccw.h |   2 +
 pc-bios/s390-ccw/sclp.c |  19 ++
 pc-bios/s390-ccw/virtio.c   |   2 +-
 4 files changed, 170 insertions(+), 2 deletions(-)

diff --git a/pc-bios/s390-ccw/menu.c b/pc-bios/s390-ccw/menu.c
index 730d44e..b99ff03 100644
--- a/pc-bios/s390-ccw/menu.c
+++ b/pc-bios/s390-ccw/menu.c
@@ -12,12 +12,159 @@
 #include "libc.h"
 #include "s390-ccw.h"
 
+#define KEYCODE_NO_INP '\0'
+#define KEYCODE_ESCAPE '\033'
+#define KEYCODE_BACKSP '\177'
+#define KEYCODE_ENTER  '\r'
+
+#define TOD_CLOCK_MILLISECOND   0x3e8000
+
+#define LOW_CORE_EXTERNAL_INT_ADDR   0x86
+#define CLOCK_COMPARATOR_INT 0X1004
+
 static uint8_t flag;
 static uint64_t timeout;
 
+static inline void enable_clock_int(void)
+{
+uint64_t tmp = 0;
+
+asm volatile(
+"stctg  0,0,%0\n"
+"oi 6+%0, 0x8\n"
+"lctlg  0,0,%0"
+: : "Q" (tmp) : "memory"
+);
+}
+
+static inline void disable_clock_int(void)
+{
+uint64_t tmp = 0;
+
+asm volatile(
+"stctg  0,0,%0\n"
+"ni 6+%0, 0xf7\n"
+"lctlg  0,0,%0"
+: : "Q" (tmp) : "memory"
+);
+}
+
+static inline void set_clock_comparator(uint64_t time)
+{
+asm volatile("sckc %0" : : "Q" (time));
+}
+
+static inline bool check_clock_int(void)
+{
+uint16_t *code = (uint16_t *)LOW_CORE_EXTERNAL_INT_ADDR;
+
+consume_sclp_int();
+
+return *code == CLOCK_COMPARATOR_INT;
+}
+
+static int read_prompt(char *buf, size_t len)
+{
+char inp[2] = {};
+uint8_t idx = 0;
+uint64_t time;
+
+if (timeout) {
+time = get_clock() + timeout * TOD_CLOCK_MILLISECOND;
+set_clock_comparator(time);
+enable_clock_int();
+timeout = 0;
+}
+
+while (!check_clock_int()) {
+
+sclp_read(inp, 1); /* Process only one character at a time */
+
+switch (inp[0]) {
+case KEYCODE_NO_INP:
+case KEYCODE_ESCAPE:
+continue;
+case KEYCODE_BACKSP:
+if (idx > 0) {
+buf[--idx] = 0;
+sclp_print("\b \b");
+}
+continue;
+case KEYCODE_ENTER:
+disable_clock_int();
+return idx;
+default:
+/* Echo input and add to buffer */
+if (idx < len) {
+buf[idx++] = inp[0];
+sclp_print(inp);
+}
+}
+}
+
+disable_clock_int();
+*buf = 0;
+
+return 0;
+}
+
+static int get_index(void)
+{
+char buf[11];
+int len;
+int i;
+
+memset(buf, 0, sizeof(buf));
+
+len = read_prompt(buf, sizeof(buf) - 1);
+
+/* If no input, boot default */
+if (len == 0) {
+return 0;
+}
+
+/* Check for erroneous input */
+for (i = 0; i < len; i++) {
+if (!isdigit(buf[i])) {
+return -1;
+}
+}
+
+return atoui(buf);
+}
+
+static void boot_menu_prompt(bool retry)
+{
+char tmp[11];
+
+if (retry) {
+sclp_print("\nError: undefined configuration"
+   "\nPlease choose:\n");
+} else if (timeout > 0) {
+sclp_print("Please choose (default will boot in ");
+sclp_print(uitoa(timeout / 1000, tmp, sizeof(tmp)));
+sclp_print(" seconds):\n");
+} else {
+sclp_print("Please choose:\n");
+}
+}
+
 static int get_boot_index(int entries)
 {
-return 0; /* implemented next patch */
+int boot_index;
+bool retry = false;
+char tmp[5];
+
+do {
+boot_menu_prompt(retry);
+boot_index = get_index();
+retry = true;
+} while (boot_index < 0 || boot_index >= entries);
+
+sclp_print("\nBooting entry #");
+sclp_print(uitoa(boot_index, tmp, sizeof(tmp)));
+
+return boot_index;
 }
 
 static void zipl_println(const char *data, size_t len)
diff --git a/pc-bios/s390-ccw/s390-ccw.h b/pc-bios/s390-ccw/s390-ccw.h
index c0dd37f..aeba8b0 100644
--- a/pc-bios/s390-ccw/s390-ccw.h
+++ b/pc-bios/s390-ccw/s390-ccw.h
@@ -71,6 +71,7 @@ unsigned int get_loadparm_i

Re: [Qemu-devel] [PATCH v6 00/23] RISC-V QEMU Port Submission

2018-02-26 Thread Daniel P . Berrangé
On Sat, Feb 24, 2018 at 09:05:49AM +1300, Michael Clark wrote:
> Dear Daniel,
> 
> We've had this discussion on a recent pull request where some code was
> going to be copied directly from hw/arm/virt.c to hw/riscv/virt.c and we
> have subsequently relicensed the recipient file as GPLv2+. This code has
> not yet been incorporated into the port. Apart from naming conventions and
> use of some common APIs, the logic in hw/riscv/virt.c is original work.
> Try diffing them. I wrote the device tree code from scratch and we have a
> unique memory map, and the other functions are derived from other RISC-V
> machines which are MIT licensed.
> 
> - https://github.com/riscv/riscv-qemu/pull/109
> 
> In any case, SiFive are happy to license their contributions as GPLv2+.
> We'll need to get the main contributors to agree to re-license to GPLv2+ or
> fall back to having GPLv2+ prefix the MIT license, as MIT is compatible
> with GPLv2+. Stefan O'Rear has commented that he is happy for his code to
> be GPLv2+ and so is SiFive, but we'll need to get confirmation from Sagar,
> one of the main port contributors, and potentially the whole list of
> contributors to do complete due diligence on re-licensing. i.e. if we want
> to eradicate MIT license from the code-base.
> 
> SiFive have made substantial changes to all of the non-GPLv2+ files in the
> port, and SiFive can license their contributions as GPLv2+ which would
> allow us to prefix all files in hw/riscv with GPLv2+. The only issue is
> that we must get approval from contributors to completely remove the MIT
> license, as the original contributors licensed their code under that
> license, as is the case for all of Fabrice's original code and many other
> parts of the code base e.g. GPEX hw/pci-host/gpex.c.
> 
> SiFive have made substantial changes to all files in the RISC-V port, so we
> would be empowered to at least prefix the MIT license with GPLv2+.
> 
> Is that acceptable? the MIT terms are compatible with GPLv2+ as MIT is a
> "permissive-license".

I accept that MIT is compatible with GPLv2+, so that's not an immediate legal
problem. The issue is that as we add more & more different licenses to QEMU,
it becomes a maintenance burden to developers, especially when doing code
refactoring across files. You have to be careful you're not taking a piece
of GPLv2+ code and copying/moving it into a file that's MIT licensed, as
that would be non-compliant. We already suffer this problem with our mixture
of GPLv2-only and GPLv2+ and LGPLv2+ and BSD licensed code. So I'm personally
loath to see us add yet another license to the mix.

Ultimately though, Peter Maydell is the one who has the final say on whether
we'll pull the patch series. So I'll defer to him for a definitive answer on
whether it's OK for riscv files to add MIT license to the mix, either long
term or as a temporary state.


> 
> 'cc Sagar, Bastian, as they have been main contributors to the port in the
> past...
> 
> Regards,
> Michael.
> 
> On Fri, Feb 23, 2018 at 11:10 PM, Daniel P. Berrangé 
> wrote:
> 
> > On Fri, Feb 23, 2018 at 01:11:46PM +1300, Michael Clark wrote:
> > > QEMU RISC-V Emulation Support (RV64GC, RV32GC)
> > >
> > > This is hopefully the "fix remaining issues in-tree" release.
> >
> > This code seems to be a mixture of LGPLv2+ and MIT licensed code. The
> > preferred license for QEMU contributions is GPLv2+. Is there a reason
> > you need to diverge from this or can it be changed to be all GPLv2+ ?

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



[Qemu-devel] [PULL-for-s390x 11/14] s390-ccw: set cp_receive mask only when needed and consume pending service irqs

2018-02-26 Thread Thomas Huth
From: "Collin L. Walling" 

It is possible while waiting for multiple types of external
interrupts that we might have pending irqs remaining between
irq consumption and irq-type disabling. Those interrupts
could potentially propagate to the guest after IPL completes
and cause unwanted behavior.

As it is today, the SCLP will only recognize write events that
are enabled by the control program's send and receive masks. To
limit the window for, and prevent further irqs from, ASCII
console events (specifically keystrokes), we should only enable
the control program's receive mask when we need it.

While we're at it, remove assignment of the (non control program)
send and receive masks, as those are actually set by the SCLP.

Signed-off-by: Collin L. Walling 
Reviewed-by: Thomas Huth 
Acked-by: Christian Borntraeger 
Signed-off-by: Thomas Huth 
---
 pc-bios/s390-ccw/menu.c |  5 +
 pc-bios/s390-ccw/s390-ccw.h |  1 +
 pc-bios/s390-ccw/sclp.c | 10 --
 3 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/pc-bios/s390-ccw/menu.c b/pc-bios/s390-ccw/menu.c
index b99ff03..8d55869 100644
--- a/pc-bios/s390-ccw/menu.c
+++ b/pc-bios/s390-ccw/menu.c
@@ -11,6 +11,7 @@
 
 #include "libc.h"
 #include "s390-ccw.h"
+#include "sclp.h"
 
 #define KEYCODE_NO_INP '\0'
 #define KEYCODE_ESCAPE '\033'
@@ -116,8 +117,12 @@ static int get_index(void)
 
 memset(buf, 0, sizeof(buf));
 
+sclp_set_write_mask(SCLP_EVENT_MASK_MSG_ASCII, SCLP_EVENT_MASK_MSG_ASCII);
+
 len = read_prompt(buf, sizeof(buf) - 1);
 
+sclp_set_write_mask(0, SCLP_EVENT_MASK_MSG_ASCII);
+
 /* If no input, boot default */
 if (len == 0) {
 return 0;
diff --git a/pc-bios/s390-ccw/s390-ccw.h b/pc-bios/s390-ccw/s390-ccw.h
index aeba8b0..c4ddf9f 100644
--- a/pc-bios/s390-ccw/s390-ccw.h
+++ b/pc-bios/s390-ccw/s390-ccw.h
@@ -69,6 +69,7 @@ unsigned int get_loadparm_index(void);
 
 /* sclp.c */
 void sclp_print(const char *string);
+void sclp_set_write_mask(uint32_t receive_mask, uint32_t send_mask);
 void sclp_setup(void);
 void sclp_get_loadparm_ascii(char *loadparm);
 int sclp_read(char *str, size_t count);
diff --git a/pc-bios/s390-ccw/sclp.c b/pc-bios/s390-ccw/sclp.c
index a2f25eb..3836cb4 100644
--- a/pc-bios/s390-ccw/sclp.c
+++ b/pc-bios/s390-ccw/sclp.c
@@ -46,23 +46,21 @@ static int sclp_service_call(unsigned int command, void 
*sccb)
 return 0;
 }
 
-static void sclp_set_write_mask(void)
+void sclp_set_write_mask(uint32_t receive_mask, uint32_t send_mask)
 {
 WriteEventMask *sccb = (void *)_sccb;
 
 sccb->h.length = sizeof(WriteEventMask);
 sccb->mask_length = sizeof(unsigned int);
-sccb->receive_mask = SCLP_EVENT_MASK_MSG_ASCII;
-sccb->cp_receive_mask = SCLP_EVENT_MASK_MSG_ASCII;
-sccb->send_mask = SCLP_EVENT_MASK_MSG_ASCII;
-sccb->cp_send_mask = SCLP_EVENT_MASK_MSG_ASCII;
+sccb->cp_receive_mask = receive_mask;
+sccb->cp_send_mask = send_mask;
 
 sclp_service_call(SCLP_CMD_WRITE_EVENT_MASK, sccb);
 }
 
 void sclp_setup(void)
 {
-sclp_set_write_mask();
+sclp_set_write_mask(0, SCLP_EVENT_MASK_MSG_ASCII);
 }
 
 long write(int fd, const void *str, size_t len)
-- 
1.8.3.1




[Qemu-devel] [PULL-for-s390x 12/14] s390-ccw: use zipl values when no boot menu options are present

2018-02-26 Thread Thomas Huth
From: "Collin L. Walling" 

If no boot menu options are present, then flag the boot menu to
use the zipl options that were set in the zipl configuration file
(and stored on disk by zipl). These options are found at some
offset prior to the start of the zipl boot menu banner. The zipl
timeout value is limited to a 16-bit unsigned integer and stored
as seconds, so we take care to convert it to milliseconds in order
to conform to the rest of the boot menu functionality. This is
limited to CCW devices.
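
As a rough illustration of the conversion described above, the following
standalone sketch reads the two 16-bit zipl fields that sit in front of the
banner and widens the timeout to milliseconds. The two offsets are the ones
used in this patch; the helper names and the explicit big-endian byte reads
are assumptions made for the example (the patch itself simply dereferences
a uint16_t pointer on the big-endian s390x target).

```c
#include <assert.h>
#include <stdint.h>

/* Offsets from the zipl fields to the start of the banner (from this patch) */
#define ZIPL_TIMEOUT_OFFSET 138
#define ZIPL_FLAG_OFFSET    140

/* Hypothetical helper: read a big-endian 16-bit field that starts
 * 'offset' bytes before 'banner'. */
static uint16_t zipl_field_before(const unsigned char *banner, int offset)
{
    const unsigned char *p;

    assert(offset >= 2);            /* the field must end before the banner */
    p = banner - offset;
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Hypothetical helper: zipl stores the timeout as seconds in 16 bits, the
 * boot menu works in milliseconds. Widening before the multiplication makes
 * the range explicit: at most 65535 * 1000 = 65535000, which fits in 32 bits. */
static uint32_t zipl_timeout_ms(const unsigned char *banner)
{
    return (uint32_t)zipl_field_before(banner, ZIPL_TIMEOUT_OFFSET) * 1000u;
}

/* Hypothetical helper: a zero flag means "boot the default entry", which is
 * how the patch treats it; anything else lets the menu be shown. */
static int zipl_menu_enabled(const unsigned char *banner)
{
    return zipl_field_before(banner, ZIPL_FLAG_OFFSET) != 0;
}
```

A caller would pass the same pointer that menu_get_zipl_boot_index()
receives as menu_data, provided at least 140 bytes in front of it are
readable.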

For reference, the zipl configuration file uses the following
fields in the menu section:

  prompt=1  enable the boot menu
  timeout=X set the timeout to X seconds

To explicitly disregard any boot menu options, then menu=off or
 must be specified.

Signed-off-by: Collin L. Walling 
Reviewed-by: Thomas Huth 
Signed-off-by: Thomas Huth 
---
 hw/s390x/ipl.c  |  5 +
 hw/s390x/ipl.h  |  1 +
 pc-bios/s390-ccw/iplb.h |  1 +
 pc-bios/s390-ccw/main.c |  3 ++-
 pc-bios/s390-ccw/menu.c | 16 +++-
 5 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c
index ee2039d..c12e460 100644
--- a/hw/s390x/ipl.c
+++ b/hw/s390x/ipl.c
@@ -241,6 +241,11 @@ static void s390_ipl_set_boot_menu(S390IPLState *ipl)
 
 switch (ipl->iplb.pbt) {
 case S390_IPL_TYPE_CCW:
+/* In the absence of -boot menu, use zipl parameters */
+if (!qemu_opt_get(opts, "menu")) {
+*flags |= QIPL_FLAG_BM_OPTS_ZIPL;
+return;
+}
 break;
 default:
 error_report("boot menu is not supported for this device type.");
diff --git a/hw/s390x/ipl.h b/hw/s390x/ipl.h
index d6c6f75..0570d0a 100644
--- a/hw/s390x/ipl.h
+++ b/hw/s390x/ipl.h
@@ -93,6 +93,7 @@ void s390_reipl_request(void);
 
 /* Boot Menu flags */
 #define QIPL_FLAG_BM_OPTS_CMD   0x80
+#define QIPL_FLAG_BM_OPTS_ZIPL  0x40
 
 /*
  * The QEMU IPL Parameters will be stored at absolute address
diff --git a/pc-bios/s390-ccw/iplb.h b/pc-bios/s390-ccw/iplb.h
index 832bb94..7dfce4f 100644
--- a/pc-bios/s390-ccw/iplb.h
+++ b/pc-bios/s390-ccw/iplb.h
@@ -76,6 +76,7 @@ extern IplParameterBlock iplb 
__attribute__((__aligned__(PAGE_SIZE)));
 
 /* Boot Menu flags */
 #define QIPL_FLAG_BM_OPTS_CMD   0x80
+#define QIPL_FLAG_BM_OPTS_ZIPL  0x40
 
 /*
  * This definition must be kept in sync with the defininition
diff --git a/pc-bios/s390-ccw/main.c b/pc-bios/s390-ccw/main.c
index 32ed70e..a7473b0 100644
--- a/pc-bios/s390-ccw/main.c
+++ b/pc-bios/s390-ccw/main.c
@@ -20,6 +20,7 @@ QemuIplParameters qipl;
 
 #define LOADPARM_PROMPT "PROMPT  "
 #define LOADPARM_EMPTY  ""
+#define BOOT_MENU_FLAG_MASK (QIPL_FLAG_BM_OPTS_CMD | QIPL_FLAG_BM_OPTS_ZIPL)
 
 /*
  * Priniciples of Operations (SA22-7832-09) chapter 17 requires that
@@ -91,7 +92,7 @@ static void menu_setup(void)
 
 switch (iplb.pbt) {
 case S390_IPL_TYPE_CCW:
-menu_set_parms(qipl.qipl_flags & QIPL_FLAG_BM_OPTS_CMD,
+menu_set_parms(qipl.qipl_flags & BOOT_MENU_FLAG_MASK,
qipl.boot_menu_timeout);
 return;
 }
diff --git a/pc-bios/s390-ccw/menu.c b/pc-bios/s390-ccw/menu.c
index 8d55869..ee56939 100644
--- a/pc-bios/s390-ccw/menu.c
+++ b/pc-bios/s390-ccw/menu.c
@@ -18,6 +18,10 @@
 #define KEYCODE_BACKSP '\177'
 #define KEYCODE_ENTER  '\r'
 
+/* Offsets from zipl fields to zipl banner start */
+#define ZIPL_TIMEOUT_OFFSET 138
+#define ZIPL_FLAG_OFFSET140
+
 #define TOD_CLOCK_MILLISECOND   0x3e8000
 
 #define LOW_CORE_EXTERNAL_INT_ADDR   0x86
@@ -187,6 +191,16 @@ int menu_get_zipl_boot_index(const char *menu_data)
 {
 size_t len;
 int entries;
+uint16_t zipl_flag = *(uint16_t *)(menu_data - ZIPL_FLAG_OFFSET);
+uint16_t zipl_timeout = *(uint16_t *)(menu_data - ZIPL_TIMEOUT_OFFSET);
+
+if (flag == QIPL_FLAG_BM_OPTS_ZIPL) {
+if (!zipl_flag) {
+return 0; /* Boot default */
+}
+/* zipl stores timeout as seconds */
+timeout = zipl_timeout * 1000;
+}
 
 /* Print and count all menu items, including the banner */
 for (entries = 0; *menu_data; entries++) {
@@ -211,5 +225,5 @@ void menu_set_parms(uint8_t boot_menu_flag, uint32_t 
boot_menu_timeout)
 
 bool menu_is_enabled_zipl(void)
 {
-return flag & QIPL_FLAG_BM_OPTS_CMD;
+return flag & (QIPL_FLAG_BM_OPTS_CMD | QIPL_FLAG_BM_OPTS_ZIPL);
 }
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH v1] numa: s390x has no NUMA

2018-02-26 Thread Christian Borntraeger


On 02/26/2018 11:35 AM, Cornelia Huck wrote:
> On Mon, 26 Feb 2018 11:28:26 +0100
> David Hildenbrand  wrote:
> 
>> On 26.02.2018 11:19, Cornelia Huck wrote:
>>> On Fri, 23 Feb 2018 18:36:57 +0100
>>> David Hildenbrand  wrote:
>>>   
>>>> Right now it is possible to crash QEMU for s390x by providing e.g.
>>>> -numa node,nodeid=0,cpus=0-1
>>>>
>>>> Problem is, that numa.c uses mc->cpu_index_to_instance_props as an
>>>> indicator whether NUMA is supported by a machine type. We don't
>>>> implement NUMA on s390x (and that concept also doesn't really exist).
>>>> We need mc->cpu_index_to_instance_props for query-cpus.
>>>
>>> Is existence of cpu_index_to_instance_probs the correct indicator for
>>> numa, then?
>>>
>>> OTOH, your patch is straightforward...  
>>
>> Maybe it is get_default_cpu_node_id as Christian discovered?
> 
> Yes, that seems like a better candidate for checking.

Agreed. 
As everybody else calls possible_cpu_arch_ids  in cpu_index_to_props
I am asking myself if we should do that as well anyway?




Re: [Qemu-devel] [PATCH v1] numa: s390x has no NUMA

2018-02-26 Thread David Hildenbrand
On 26.02.2018 12:07, Christian Borntraeger wrote:
> 
> 
> On 02/26/2018 11:35 AM, Cornelia Huck wrote:
>> On Mon, 26 Feb 2018 11:28:26 +0100
>> David Hildenbrand  wrote:
>>
>>> On 26.02.2018 11:19, Cornelia Huck wrote:
>>>> On Fri, 23 Feb 2018 18:36:57 +0100
>>>> David Hildenbrand  wrote:
>>>>
>>>>> Right now it is possible to crash QEMU for s390x by providing e.g.
>>>>> -numa node,nodeid=0,cpus=0-1
>>>>>
>>>>> Problem is, that numa.c uses mc->cpu_index_to_instance_props as an
>>>>> indicator whether NUMA is supported by a machine type. We don't
>>>>> implement NUMA on s390x (and that concept also doesn't really exist).
>>>>> We need mc->cpu_index_to_instance_props for query-cpus.
>>>>
>>>> Is existence of cpu_index_to_instance_probs the correct indicator for
>>>> numa, then?
>>>>
>>>> OTOH, your patch is straightforward...
>>>
>>> Maybe it is get_default_cpu_node_id as Christian discovered?
>>
>> Yes, that seems like a better candidate for checking.
> 
> Agreed. 
> As everybody else calls possible_cpu_arch_ids  in cpu_index_to_props
> I am asking myself if we should do that as well anyway?
> 

Well, it found a BUG :)

-- 

Thanks,

David / dhildenb



Re: [Qemu-devel] [PATCH v1] numa: s390x has no NUMA

2018-02-26 Thread Cornelia Huck
On Mon, 26 Feb 2018 12:07:43 +0100
Christian Borntraeger  wrote:

> On 02/26/2018 11:35 AM, Cornelia Huck wrote:
> > On Mon, 26 Feb 2018 11:28:26 +0100
> > David Hildenbrand  wrote:
> >   
> >> On 26.02.2018 11:19, Cornelia Huck wrote:  
> >>> On Fri, 23 Feb 2018 18:36:57 +0100
> >>> David Hildenbrand  wrote:
> >>> 
> >>>> Right now it is possible to crash QEMU for s390x by providing e.g.
> >>>> -numa node,nodeid=0,cpus=0-1
> >>>>
> >>>> Problem is, that numa.c uses mc->cpu_index_to_instance_props as an
> >>>> indicator whether NUMA is supported by a machine type. We don't
> >>>> implement NUMA on s390x (and that concept also doesn't really exist).
> >>>> We need mc->cpu_index_to_instance_props for query-cpus.
> >>>
> >>> Is existence of cpu_index_to_instance_probs the correct indicator for
> >>> numa, then?
> >>>
> >>> OTOH, your patch is straightforward...
> >>
> >> Maybe it is get_default_cpu_node_id as Christian discovered?  
> > 
> > Yes, that seems like a better candidate for checking.  
> 
> Agreed. 
> As everybody else calls possible_cpu_arch_ids  in cpu_index_to_props
> I am asking myself if we should do that as well anyway?
> 

Making the behaviour consistent with other archs sounds like a good
idea.



Re: [Qemu-devel] [PATCH v6 00/23] RISC-V QEMU Port Submission

2018-02-26 Thread Richard W.M. Jones
If anyone wants a simple way to test this, grab the latest bbl &
stage4 disk image from here and boot it under qemu-system-riscv64
using the command line given in the readme.txt file:

  https://fedorapeople.org/groups/risc-v/disk-images/

I've added v6 to Fedora copr, and switched to using it for the
Fedora/RISC-V build system, so it'll get a lot of heavy testing
shortly.

  http://copr-fe.cloud.fedoraproject.org/coprs/rjones/riscv/build/721136/

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org



Re: [Qemu-devel] [PATCH v3 0/7] block: Handle null backing link

2018-02-26 Thread Max Reitz
On 2018-02-24 19:02, no-re...@patchew.org wrote:
> Hi,
> 
> This series failed build test on s390x host. Please find the details below.

[...]

> In file included from 
> /var/tmp/patchew-tester-tmp-wr4zoy33/src/include/qemu/osdep.h:36:0,
>  from /var/tmp/patchew-tester-tmp-wr4zoy33/src/block.c:25:
> /var/tmp/patchew-tester-tmp-wr4zoy33/src/block.c: In function 
> ‘bdrv_open_inherit’:
> /var/tmp/patchew-tester-tmp-wr4zoy33/src/include/qemu/compiler.h:63:34: 
> error: dereferencing pointer to incomplete type ‘QNull {aka struct QNull}’
>  const typeof(((type *) 0)->member) *__mptr = (ptr); \
>   ^
> /var/tmp/patchew-tester-tmp-wr4zoy33/src/include/qapi/qmp/qobject.h:65:5: 
> note: in expansion of macro ‘container_of’
>  container_of(qobject_check_type(obj, glue(QTYPE_CAST_TO_, type)) ?: \
>  ^~~~
> /var/tmp/patchew-tester-tmp-wr4zoy33/src/block.c:2603:9: note: in expansion 
> of macro ‘qobject_to’
>  if (qobject_to(qdict_get(options, "backing"), QNull) != NULL ||
>  ^~
> In file included from /usr/include/sched.h:29:0,
>  from /usr/include/pthread.h:23,
>  from /usr/include/glib-2.0/glib/deprecated/gthread.h:128,
>  from /usr/include/glib-2.0/glib.h:108,
>  from 
> /var/tmp/patchew-tester-tmp-wr4zoy33/src/include/glib-compat.h:19,
>  from 
> /var/tmp/patchew-tester-tmp-wr4zoy33/src/include/qemu/osdep.h:107,
>  from /var/tmp/patchew-tester-tmp-wr4zoy33/src/block.c:25:
> /var/tmp/patchew-tester-tmp-wr4zoy33/src/include/qemu/compiler.h:64:37: 
> error: invalid use of incomplete typedef ‘QNull {aka struct QNull}’
>  (type *) ((char *) __mptr - offsetof(type, member));})
>  ^
> /var/tmp/patchew-tester-tmp-wr4zoy33/src/include/qapi/qmp/qobject.h:65:5: 
> note: in expansion of macro ‘container_of’
>  container_of(qobject_check_type(obj, glue(QTYPE_CAST_TO_, type)) ?: \
>  ^~~~
> /var/tmp/patchew-tester-tmp-wr4zoy33/src/block.c:2603:9: note: in expansion 
> of macro ‘qobject_to’
>  if (qobject_to(qdict_get(options, "backing"), QNull) != NULL ||
>  ^~

I guess I missed 6b67395762a4c8b.  Oops.

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v6 00/23] RISC-V QEMU Port Submission

2018-02-26 Thread Andreas Schwab
This is being used to build openSUSE Factory for riscv64 with linux-user
emulation:

https://build.opensuse.org/project/show/openSUSE:Factory:RISCV

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."



Re: [Qemu-devel] [PATCH v6 00/23] RISC-V QEMU Port Submission

2018-02-26 Thread Peter Maydell
On 26 February 2018 at 10:47, Daniel P. Berrangé  wrote:
> I accept that MIT is compatible with GPLv2+, so that's not an immediate legal
> problem. The issue is that as we add more & more different licenses to QEMU,
> it becomes a maintenance burden to developers, especially when doing code
> refactoring across files. You have to be careful you're not taking a piece
> of GPLv2+ code and copying/moving it into a file that's MIT licensed, as
> that would be non-compliant. We already suffer this problem with our mixture
> of GPLv2-only and GPLv2+ and LGPLv2+ and BSD licensed code. So I'm personally
> loath to see us add yet another license to the mix.

Unless I'm confused, we already have a lot of MIT-licensed code in the tree,
including much of the block layer, accel/tcg, the audio subsystem. Looking
at vl.c, it was put under the MIT license by Fabrice in 2003, so we've
been living with it as part of our licensing mix for a very long time already.

thanks
-- PMM



Re: [Qemu-devel] [PATCH v3 2/7] qapi: Add qobject_to()

2018-02-26 Thread Max Reitz
On 2018-02-24 21:57, Eric Blake wrote:
> On 02/24/2018 09:40 AM, Max Reitz wrote:
>> This is a dynamic casting macro that, given a QObject type, returns an
>> object as that type or NULL if the object is of a different type (or
>> NULL itself).
>>
>> The macro uses lower-case letters because:
>> 1. There does not seem to be a hard rule on whether qemu macros have to
>>     be upper-cased,
>> 2. The current situation in qapi/qmp is inconsistent (compare e.g.
>>     QINCREF() vs. qdict_put()),
>> 3. qobject_to() will evaluate its @obj parameter only once, thus it is
>>     generally not important to the caller whether it is a macro or not,
>> 4. I prefer it aesthetically.
>>
>> Signed-off-by: Max Reitz 
>> ---
>>   include/qapi/qmp/qobject.h | 30 ++
>>   1 file changed, 30 insertions(+)
>>
> 
>> +++ b/include/qapi/qmp/qobject.h
>> @@ -50,6 +50,22 @@ struct QObject {
>>   #define QDECREF(obj)  \
>>   qobject_decref(obj ? QOBJECT(obj) : NULL)
>>   +/* Required for qobject_to() */
>> +#define QTYPE_CAST_TO_QNull QTYPE_QNULL
>> +#define QTYPE_CAST_TO_QNum  QTYPE_QNUM
>> +#define QTYPE_CAST_TO_QString   QTYPE_QSTRING
>> +#define QTYPE_CAST_TO_QDict QTYPE_QDICT
>> +#define QTYPE_CAST_TO_QList QTYPE_QLIST
>> +#define QTYPE_CAST_TO_QBool QTYPE_QBOOL
>> +
>> +QEMU_BUILD_BUG_MSG(QTYPE__MAX != 7,
>> +   "The QTYPE_CAST_TO_* list needs to be extended");
>> +
>> +#define qobject_to(obj, type) \
>> +    container_of(qobject_check_type(obj, glue(QTYPE_CAST_TO_, type))
>> ?: \
>> + QOBJECT((type *)NULL), \
> 
> I guess the third (second?) branch of the ternary is written this way,
> rather than the simpler 'NULL', to ensure that 'type' is still something
> that can have the QOBJECT() macro applied to it?  Should be okay.

It's written this way because of the container_of() around it.  We want
the whole expression to return NULL then, and without the QOBJECT()
around it, it would only return NULL if offsetof(type, base) == 0 (which
it is not necessarily).

OTOH, container_of(&((type *)NULL)->base, type, base) is by definition NULL.

(QOBJECT(x) is &(x)->base)
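
To make the point concrete, here is a small standalone sketch with
stand-in types, not QEMU's real QObject definitions. 'base' is
deliberately not the first member, check_type() plays the role of
qobject_check_type(), and BASE_OF() plays the role of QOBJECT(); all of
these names are made up for the example. The integer-based pointer
arithmetic assumes a conventional flat address space where the integer 0
converts to a null pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for QObject and a derived type. 'base' is deliberately NOT the
 * first member, so offsetof(Derived, base) != 0. */
typedef struct Base { int type; } Base;
typedef struct Derived { long payload; Base base; } Derived;

/* Simplified container_of() using integer arithmetic */
#define container_of(ptr, type, member) \
    ((type *)((uintptr_t)(ptr) - offsetof(type, member)))

/* Counterpart of QOBJECT(x), i.e. &(x)->base, written with offsetof so that
 * the NULL case needs no dereference. */
#define BASE_OF(x) ((Base *)((uintptr_t)(x) + offsetof(Derived, base)))

/* Returns obj if it has the wanted type, NULL otherwise */
static Base *check_type(Base *obj, int wanted)
{
    return (obj && obj->type == wanted) ? obj : NULL;
}

/* Naive cast: a failed check feeds NULL straight into container_of(), which
 * then yields NULL minus the offset instead of NULL. */
static Derived *cast_naive(Base *obj, int wanted)
{
    return container_of(check_type(obj, wanted), Derived, base);
}

/* Cast in the style of qobject_to(): on failure substitute BASE_OF(NULL),
 * so that container_of() subtracts the same offset again and gives NULL. */
static Derived *cast_safe(Base *obj, int wanted)
{
    Base *checked = check_type(obj, wanted);

    return container_of(checked ? checked : BASE_OF((Derived *)NULL),
                        Derived, base);
}
```

With a Derived d whose base.type is 1, cast_safe(&d.base, 1) gives back &d
and cast_safe(&d.base, 2) gives NULL, while cast_naive(&d.base, 2) gives a
bogus non-NULL pointer on typical platforms.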

Max

> 
>> + type, base)
>> +
> 
> Reviewed-by: Eric Blake 
> 






Re: [Qemu-devel] [PATCH v3 3/7] qapi: Replace qobject_to_X(o) by qobject_to(o, X)

2018-02-26 Thread Max Reitz
On 2018-02-24 22:04, Eric Blake wrote:
> On 02/24/2018 09:40 AM, Max Reitz wrote:
>> This patch was generated using the following Coccinelle script:
>>
> 
>> and a bit of manual fix-up for overly long lines and three places in
>> tests/check-qjson.c that Coccinelle did not find.
>>
>> Signed-off-by: Max Reitz 
>> Reviewed-by: Alberto Garcia 
>> ---
> 
>> diff --git a/block.c b/block.c
>> index 814e5a02da..cb69fd7ae4 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -1457,7 +1457,7 @@ static QDict *parse_json_filename(const char
>> *filename, Error **errp)
>>   return NULL;
>>   }
>>   -    options = qobject_to_qdict(options_obj);
>> +    options = qobject_to(options_obj, QDict);
> 
> Bikeshedding - would it read any easier as:
> 
> options = qobject_to(QDict, options_obj);
> 
> ?  If so, your Coccinelle script can be touched up, and patch 2/7 swaps
> argument order around, so it would be tolerable but still slightly
> busywork to regenerate the series.  But I'm not strongly attached to
> either order, and so I'm also willing to take this as-is (especially
> since that's less work), if no one else has a strong opinion that
> swapping order would aid legibility.

Well, same for me. :-)

In a template/generic language, we'd write the type first (e.g.
qobject_cast<QDict>(options_obj)).  But maybe we'd write the object
first, too (e.g. options_obj.cast<QDict>()).  And the current order of
the arguments follows the order in the name ("qobject" options_obj "to"
QDict).  But maybe it's more natural to read it as "qobject to" QDict
"applied to" options_obj.

I don't know either.

Max

> Reviewed-by: Eric Blake 
> 
> 
>> +++ b/block/rbd.c
>> @@ -256,14 +256,14 @@ static int qemu_rbd_set_keypairs(rados_t
>> cluster, const char *keypairs_json,
>>   if (!keypairs_json) {
>>   return ret;
>>   }
>> -    keypairs = qobject_to_qlist(qobject_from_json(keypairs_json,
>> -  &error_abort));
>> +    keypairs = qobject_to(qobject_from_json(keypairs_json,
>> &error_abort),
>> +  QList);
> 
> The question about legibility gets a bit more obvious when you span lines.
> 
>> @@ -893,8 +893,9 @@ static void simple_number(void)
>>   QNum *qnum;
>>   int64_t val;
>>   -    qnum =
>> qobject_to_qnum(qobject_from_json(test_cases[i].encoded,
>> - &error_abort));
>> +    qnum = qobject_to(qobject_from_json(test_cases[i].encoded,
>> +    &error_abort),
>> +  QNum);
> 






Re: [Qemu-devel] [PATCH v6 00/23] RISC-V QEMU Port Submission

2018-02-26 Thread Daniel P . Berrangé
On Mon, Feb 26, 2018 at 11:57:10AM +, Peter Maydell wrote:
> On 26 February 2018 at 10:47, Daniel P. Berrangé  wrote:
> > I accept that MIT is compatible with GPLv2+, so that's not an immediate 
> > legal
> > problem. The issue is that as we add more & more different licenses to QEMU,
> > it becomes a maintenance burden to developers, especially when doing code
> > refactoring across files. You have to be careful you're not taking a piece
> > of GPLv2+ code and copying/moving it into a file that's MIT licensed, as
> > that would be non-compliant. We already suffer this problem with our mixture
> > of GPLv2-only and GPLv2+ and LGPLv2+ and BSD licensed code. So I'm 
> > personally
> > loath to see us add yet another license to the mix.
> 
> Unless I'm confused, we already have a lot of MIT-licensed code in the tree,
> including much of the block layer, accel/tcg, the audio subsystem. Looking
> at vl.c, it was put under the MIT license by Fabrice in 2003, so we've
> been living with it as part of our licensing mix for a very long time already.

Eeek, I totally missed that as the top level LICENSE file only mentions
GPL and BSD licenses :-(  I guess that's a trigger for a patch to improve
the text in the LICENSE file to better reflect reality...

So I guess you can ignore my comments in this thread about MIT license
being different from normal practice in QEMU.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-devel] [PATCH v3 21/36] rbd: Pass BlockdevOptionsRbd to qemu_rbd_connect()

2018-02-26 Thread Max Reitz
On 2018-02-23 20:25, Kevin Wolf wrote:
> With the conversion to a QAPI options object, the function is now
> prepared to be used in a .bdrv_co_create implementation.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block/rbd.c | 109 
> +---
>  1 file changed, 53 insertions(+), 56 deletions(-)
> 
> diff --git a/block/rbd.c b/block/rbd.c
> index 2e79c2d1fd..9b247f020d 100644
> --- a/block/rbd.c
> +++ b/block/rbd.c

[...]

> @@ -482,29 +484,27 @@ static void qemu_rbd_complete_aio(RADOSCB *rcb)
>  qemu_aio_unref(acb);
>  }
>  
> -static char *qemu_rbd_mon_host(QDict *options, Error **errp)
> +static char *qemu_rbd_mon_host(BlockdevOptionsRbd *opts, Error **errp)
>  {
> -const char **vals = g_new(const char *, qdict_size(options) + 1);
> -char keybuf[32];
> +const char **vals;
>  const char *host, *port;
>  char *rados_str;
> -int i;
> -
> -for (i = 0;; i++) {
> -sprintf(keybuf, "server.%d.host", i);
> -host = qdict_get_try_str(options, keybuf);
> -qdict_del(options, keybuf);
> -sprintf(keybuf, "server.%d.port", i);
> -port = qdict_get_try_str(options, keybuf);
> -qdict_del(options, keybuf);
> -if (!host && !port) {
> -break;
> -}
> -if (!host) {
> -error_setg(errp, "Parameter server.%d.host is missing", i);
> -rados_str = NULL;
> -goto out;
> -}
> +InetSocketAddressBaseList *p;
> +int i, cnt;
> +
> +if (!opts->has_server) {
> +return NULL;
> +}
> +
> +for (cnt = 0, p = opts->server; p; p = p->next) {
> +cnt++;
> +}
> +
> +vals = g_new(const char *, cnt + 1);
> +
> +for (i = 0, p = opts->server; p; p = p->next, i++) {
> +host = p->value->host;
> +port = p->value->port;
>  
>  if (strchr(host, ':')) {
>  vals[i] = port ? g_strdup_printf("[%s]:%s", host, port)

host *and* port are mandatory, so this and the next ternary can be
simplified, too. ;-)
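
A minimal standalone sketch of what that simplification could look like,
assuming both fields are always set. It uses snprintf() into a caller
buffer instead of g_strdup_printf() so that it builds without GLib; the
function name and the buffer handling are made up for the example and are
not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format one monitor address. Both host and port are
 * assumed to be non-NULL, so no ternary on 'port' is needed. Hosts that
 * contain ':' (IPv6 literals) are wrapped in brackets.
 * Returns 0 on success, -1 if the result did not fit into buf. */
static int format_mon_host(char *buf, size_t len,
                           const char *host, const char *port)
{
    int n;

    assert(host && port);

    if (strchr(host, ':')) {
        n = snprintf(buf, len, "[%s]:%s", host, port);
    } else {
        n = snprintf(buf, len, "%s:%s", host, port);
    }
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

For example, the host "::1" with port "6789" is formatted as "[::1]:6789",
and a plain hostname is joined to its port with a single colon.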

But not necessary, so:

Reviewed-by: Max Reitz 





Re: [Qemu-devel] [PATCH v3 23/36] rbd: Assign s->snap/image_name in qemu_rbd_open()

2018-02-26 Thread Max Reitz
On 2018-02-23 20:25, Kevin Wolf wrote:
> Now that the options are already available in qemu_rbd_open() and not
> only parsed in qemu_rbd_connect(), we can assign s->snap and
> s->image_name there instead of passing the fields by reference to
> qemu_rbd_connect().
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block/rbd.c | 14 +-
>  1 file changed, 5 insertions(+), 9 deletions(-)

Reviewed-by: Max Reitz 





Re: [Qemu-devel] [PATCH v3 24/36] rbd: Use qemu_rbd_connect() in qemu_rbd_do_create()

2018-02-26 Thread Max Reitz
On 2018-02-23 20:25, Kevin Wolf wrote:
> This is almost exactly the same code. The differences are that
> qemu_rbd_connect() supports BlockdevOptionsRbd.server and that the cache
> mode is set explicitly.
> 
> Supporting 'server' is a welcome new feature for image creation.
> Caching is disabled by default, so leave it that way.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block/rbd.c | 54 ++
>  1 file changed, 10 insertions(+), 44 deletions(-)

Reviewed-by: Max Reitz 





[Qemu-devel] [PATCH v7 0/4] cryptodev: add vhost support

2018-02-26 Thread Jay Zhou
From: Gonglei 

I posted the RFC version a few months ago for the DPDK
vhost-crypto implementation, and now it's time to send
the formal version, because we need a user space scheme
for better performance.

The vhost user crypto server side patches have been
sent to the DPDK community, please see

[RFC PATCH 0/6] lib/librte_vhost: introduce new vhost_user crypto backend
support
http://dpdk.org/ml/archives/dev/2017-November/081048.html

You can also get the virtio-crypto poll mode driver from:

[PATCH v2 0/7] crypto: add virtio poll mode driver
http://dpdk.org/ml/archives/dev/2018-February/091410.html

v7:
  - make virtio crypto enabled on non-Linux
  - fix format-string issues
  - fix error reported by clang
  - fix a typo when setting length of cipher key
  - rebased on the master
v6:
  - Fix compile error about backends/cryptodev-vhost-user.o and rebase on
the master
v5:
  - squash [PATCH v4 5/5] into previous patches [Michael]
v4:
  - "[PATCH v4 5/5] cryptodev-vhost-user: depend on CONFIG_VHOST_CRYPTO
and CONFIG_VHOST_USER" newly added to fix compilation dependency [Michael]
v3:
  - Newly added vhost user messages should be sent only when the feature
has been successfully negotiated [Michael]
v2:
  - Fix compile error on mingw32

Gonglei (4):
  cryptodev: add vhost-user as a new cryptodev backend
  cryptodev: add vhost support
  cryptodev-vhost-user: add crypto session handler
  cryptodev-vhost-user: set the key length

 backends/Makefile.objs|   6 +
 backends/cryptodev-builtin.c  |   1 +
 backends/cryptodev-vhost-user.c   | 377 ++
 backends/cryptodev-vhost.c| 347 +++
 configure |  15 ++
 docs/interop/vhost-user.txt   |  26 +++
 hw/virtio/vhost-user.c| 104 ++
 hw/virtio/virtio-crypto.c |  70 +++
 include/hw/virtio/vhost-backend.h |   8 +
 include/hw/virtio/virtio-crypto.h |   1 +
 include/sysemu/cryptodev-vhost-user.h |  47 +
 include/sysemu/cryptodev-vhost.h  | 154 ++
 include/sysemu/cryptodev.h|   8 +
 qemu-options.hx   |  21 ++
 vl.c  |   6 +
 15 files changed, 1191 insertions(+)
 create mode 100644 backends/cryptodev-vhost-user.c
 create mode 100644 backends/cryptodev-vhost.c
 create mode 100644 include/sysemu/cryptodev-vhost-user.h
 create mode 100644 include/sysemu/cryptodev-vhost.h

--
1.8.3.1




[Qemu-devel] [PATCH v7 2/4] cryptodev: add vhost support

2018-02-26 Thread Jay Zhou
From: Gonglei 

Implement the vhost-crypto functions, such as startup,
stop and notification. Introduce an enum
QCryptoCryptoDevBackendOptionsType in order to
identify whether the cryptodev vhost backend is vhost-user
or vhost-kernel-module (if it exists).

At this point, cryptodev-vhost-user works.

Signed-off-by: Gonglei 
Signed-off-by: Longpeng(Mike) 
Signed-off-by: Jay Zhou 
---
 backends/cryptodev-builtin.c  |   1 +
 backends/cryptodev-vhost-user.c   |  16 +++
 backends/cryptodev-vhost.c| 258 ++
 hw/virtio/virtio-crypto.c |  70 +
 include/hw/virtio/virtio-crypto.h |   1 +
 include/sysemu/cryptodev-vhost-user.h |  44 ++
 include/sysemu/cryptodev.h|   8 ++
 7 files changed, 398 insertions(+)
 create mode 100644 include/sysemu/cryptodev-vhost-user.h

diff --git a/backends/cryptodev-builtin.c b/backends/cryptodev-builtin.c
index 657c0ba..9fb0bd5 100644
--- a/backends/cryptodev-builtin.c
+++ b/backends/cryptodev-builtin.c
@@ -78,6 +78,7 @@ static void cryptodev_builtin_init(
   "cryptodev-builtin", NULL);
 cc->info_str = g_strdup_printf("cryptodev-builtin0");
 cc->queue_index = 0;
+cc->type = CRYPTODEV_BACKEND_TYPE_BUILTIN;
 backend->conf.peers.ccs[0] = cc;
 
 backend->conf.crypto_services =
diff --git a/backends/cryptodev-vhost-user.c b/backends/cryptodev-vhost-user.c
index 93c3f10..151a0e6 100644
--- a/backends/cryptodev-vhost-user.c
+++ b/backends/cryptodev-vhost-user.c
@@ -29,6 +29,7 @@
 #include "standard-headers/linux/virtio_crypto.h"
 #include "sysemu/cryptodev-vhost.h"
 #include "chardev/char-fe.h"
+#include "sysemu/cryptodev-vhost-user.h"
 
 
 /**
@@ -58,6 +59,20 @@ cryptodev_vhost_user_running(
 return crypto ? 1 : 0;
 }
 
+CryptoDevBackendVhost *
+cryptodev_vhost_user_get_vhost(
+ CryptoDevBackendClient *cc,
+ CryptoDevBackend *b,
+ uint16_t queue)
+{
+CryptoDevBackendVhostUser *s =
+  CRYPTODEV_BACKEND_VHOST_USER(b);
+assert(cc->type == CRYPTODEV_BACKEND_TYPE_VHOST_USER);
+assert(queue < MAX_CRYPTO_QUEUE_NUM);
+
+return s->vhost_crypto[queue];
+}
+
 static void cryptodev_vhost_user_stop(int queues,
   CryptoDevBackendVhostUser *s)
 {
@@ -188,6 +203,7 @@ static void cryptodev_vhost_user_init(
 cc->info_str = g_strdup_printf("cryptodev-vhost-user%zu to %s ",
i, chr->label);
 cc->queue_index = i;
+cc->type = CRYPTODEV_BACKEND_TYPE_VHOST_USER;
 
 backend->conf.peers.ccs[i] = cc;
 
diff --git a/backends/cryptodev-vhost.c b/backends/cryptodev-vhost.c
index 27e1c4a..8337c9a 100644
--- a/backends/cryptodev-vhost.c
+++ b/backends/cryptodev-vhost.c
@@ -23,9 +23,16 @@
  */
 
 #include "qemu/osdep.h"
+#include "hw/virtio/virtio-bus.h"
 #include "sysemu/cryptodev-vhost.h"
 
 #ifdef CONFIG_VHOST_CRYPTO
+#include "qapi/error.h"
+#include "qapi/qmp/qerror.h"
+#include "qemu/error-report.h"
+#include "hw/virtio/virtio-crypto.h"
+#include "sysemu/cryptodev-vhost-user.h"
+
 uint64_t
 cryptodev_vhost_get_max_queues(
 CryptoDevBackendVhost *crypto)
@@ -70,6 +77,228 @@ fail:
 return NULL;
 }
 
+static int
+cryptodev_vhost_start_one(CryptoDevBackendVhost *crypto,
+  VirtIODevice *dev)
+{
+int r;
+
+crypto->dev.nvqs = 1;
+crypto->dev.vqs = crypto->vqs;
+
+r = vhost_dev_enable_notifiers(&crypto->dev, dev);
+if (r < 0) {
+goto fail_notifiers;
+}
+
+r = vhost_dev_start(&crypto->dev, dev);
+if (r < 0) {
+goto fail_start;
+}
+
+return 0;
+
+fail_start:
+vhost_dev_disable_notifiers(&crypto->dev, dev);
+fail_notifiers:
+return r;
+}
+
+static void
+cryptodev_vhost_stop_one(CryptoDevBackendVhost *crypto,
+ VirtIODevice *dev)
+{
+vhost_dev_stop(&crypto->dev, dev);
+vhost_dev_disable_notifiers(&crypto->dev, dev);
+}
+
+CryptoDevBackendVhost *
+cryptodev_get_vhost(CryptoDevBackendClient *cc,
+CryptoDevBackend *b,
+uint16_t queue)
+{
+CryptoDevBackendVhost *vhost_crypto = NULL;
+
+if (!cc) {
+return NULL;
+}
+
+switch (cc->type) {
+#if defined(CONFIG_VHOST_USER) && defined(CONFIG_LINUX)
+case CRYPTODEV_BACKEND_TYPE_VHOST_USER:
+vhost_crypto = cryptodev_vhost_user_get_vhost(cc, b, queue);
+break;
+#endif
+default:
+break;
+}
+
+return vhost_crypto;
+}
+
+static void
+cryptodev_vhost_set_vq_index(CryptoDevBackendVhost *crypto,
+ int vq_index)
+{
+crypto->dev.vq_index = vq_index;
+}
+
+static int
+vhost_set_vring_enable(CryptoDevBackendClient *cc,
+CryptoDevBackend *b,
+uint16_t queue, int enable)
+{
+CryptoDevBackendVhost *crypt

[Qemu-devel] [PATCH v7 3/4] cryptodev-vhost-user: add crypto session handler

2018-02-26 Thread Jay Zhou
From: Gonglei 

Introduce two vhost-user messages: VHOST_USER_CREATE_CRYPTO_SESSION
and VHOST_USER_CLOSE_CRYPTO_SESSION. At this point, the QEMU side
supports crypto operations in the cryptodev vhost-user backend.

Signed-off-by: Gonglei 
Signed-off-by: Longpeng(Mike) 
Signed-off-by: Jay Zhou 
---
 backends/cryptodev-vhost-user.c   |  48 ++
 docs/interop/vhost-user.txt   |  26 ++
 hw/virtio/vhost-user.c| 104 ++
 include/hw/virtio/vhost-backend.h |   8 +++
 4 files changed, 175 insertions(+), 11 deletions(-)

diff --git a/backends/cryptodev-vhost-user.c b/backends/cryptodev-vhost-user.c
index 151a0e6..9cd06c4 100644
--- a/backends/cryptodev-vhost-user.c
+++ b/backends/cryptodev-vhost-user.c
@@ -231,7 +231,25 @@ static int64_t cryptodev_vhost_user_sym_create_session(
CryptoDevBackendSymSessionInfo *sess_info,
uint32_t queue_index, Error **errp)
 {
-return 0;
+CryptoDevBackendClient *cc =
+   backend->conf.peers.ccs[queue_index];
+CryptoDevBackendVhost *vhost_crypto;
+uint64_t session_id = 0;
+int ret;
+
+vhost_crypto = cryptodev_vhost_user_get_vhost(cc, backend, queue_index);
+if (vhost_crypto) {
+struct vhost_dev *dev = &(vhost_crypto->dev);
+ret = dev->vhost_ops->vhost_crypto_create_session(dev,
+  sess_info,
+  &session_id);
+if (ret < 0) {
+return -1;
+} else {
+return session_id;
+}
+}
+return -1;
 }
 
 static int cryptodev_vhost_user_sym_close_session(
@@ -239,15 +257,23 @@ static int cryptodev_vhost_user_sym_close_session(
uint64_t session_id,
uint32_t queue_index, Error **errp)
 {
-return 0;
-}
-
-static int cryptodev_vhost_user_sym_operation(
- CryptoDevBackend *backend,
- CryptoDevBackendSymOpInfo *op_info,
- uint32_t queue_index, Error **errp)
-{
-return VIRTIO_CRYPTO_OK;
+CryptoDevBackendClient *cc =
+  backend->conf.peers.ccs[queue_index];
+CryptoDevBackendVhost *vhost_crypto;
+int ret;
+
+vhost_crypto = cryptodev_vhost_user_get_vhost(cc, backend, queue_index);
+if (vhost_crypto) {
+struct vhost_dev *dev = &(vhost_crypto->dev);
+ret = dev->vhost_ops->vhost_crypto_close_session(dev,
+ session_id);
+if (ret < 0) {
+return -1;
+} else {
+return 0;
+}
+}
+return -1;
 }
 
 static void cryptodev_vhost_user_cleanup(
@@ -326,7 +352,7 @@ cryptodev_vhost_user_class_init(ObjectClass *oc, void *data)
 bc->cleanup = cryptodev_vhost_user_cleanup;
 bc->create_session = cryptodev_vhost_user_sym_create_session;
 bc->close_session = cryptodev_vhost_user_sym_close_session;
-bc->do_sym_op = cryptodev_vhost_user_sym_operation;
+bc->do_sym_op = NULL;
 }
 
 static const TypeInfo cryptodev_vhost_user_info = {
diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
index 9fcf48d..cb3a759 100644
--- a/docs/interop/vhost-user.txt
+++ b/docs/interop/vhost-user.txt
@@ -368,6 +368,7 @@ Protocol features
 #define VHOST_USER_PROTOCOL_F_MTU4
 #define VHOST_USER_PROTOCOL_F_SLAVE_REQ  5
 #define VHOST_USER_PROTOCOL_F_CROSS_ENDIAN   6
+#define VHOST_USER_PROTOCOL_F_CRYPTO_SESSION 7
 
 Master message types
 
@@ -663,6 +664,31 @@ Master message types
   field, and slaves MUST NOT accept SET_CONFIG for read-only
   configuration space fields unless the live migration bit is set.
 
+* VHOST_USER_CREATE_CRYPTO_SESSION
+
+ Id: 26
+ Equivalent ioctl: N/A
+ Master payload: crypto session description
+ Slave payload: crypto session description
+
+ Create a session for crypto operation. The server side must return the
+ session id, 0 or positive for success, negative for failure.
+ This request should be sent only when VHOST_USER_PROTOCOL_F_CRYPTO_SESSION
+ feature has been successfully negotiated.
+ It's a required feature for crypto devices.
+
+* VHOST_USER_CLOSE_CRYPTO_SESSION
+
+ Id: 27
+ Equivalent ioctl: N/A
+ Master payload: u64
+
+ Close a session for crypto operation which was previously
+ created by VHOST_USER_CREATE_CRYPTO_SESSION.
+ This request should be sent only when VHOST_USER_PROTOCOL_F_CRYPTO_SESSION
+ feature has been successfully negotiated.
+ It's a required feature for crypto devices.
+
 Slave message types
 ---
 
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 6eb9798..41ff5cf 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -17,6 +17,7 @@
 #include "sysemu/kvm.h"
 #include "qemu/error-report.h"
 #include "qemu/sockets.h"
+#include "sysemu/cryptodev.h"
 
 #include 
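
The gating rule in the vhost-user.txt hunk above (both session requests may
only be sent once VHOST_USER_PROTOCOL_F_CRYPTO_SESSION has been negotiated,
and a created session id is 0 or positive on success) can be sketched as
below. This is a minimal illustration, not the code in hw/virtio/vhost-user.c:
the numeric values come from the hunk, while the helper names
crypto_session_request_allowed() and crypto_session_created() are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as listed in the docs/interop/vhost-user.txt hunk. */
#define VHOST_USER_PROTOCOL_F_CRYPTO_SESSION 7
#define VHOST_USER_CREATE_CRYPTO_SESSION     26
#define VHOST_USER_CLOSE_CRYPTO_SESSION      27

/*
 * Hypothetical helper: may the master send `request`, given the
 * negotiated protocol feature bits?
 */
bool crypto_session_request_allowed(uint64_t protocol_features, int request)
{
    bool negotiated =
        (protocol_features >> VHOST_USER_PROTOCOL_F_CRYPTO_SESSION) & 1;

    switch (request) {
    case VHOST_USER_CREATE_CRYPTO_SESSION:
    case VHOST_USER_CLOSE_CRYPTO_SESSION:
        return negotiated;
    default:
        return true; /* other requests are outside this sketch */
    }
}

/* Hypothetical helper: 0 or positive means success, negative means failure. */
bool crypto_session_created(int64_t session_id)
{
    return session_id >= 0;
}
```

A caller would check crypto_session_request_allowed() before building the
message and crypto_session_created() on the returned id.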
 

Re: [Qemu-devel] [PATCH v3 27/36] sheepdog: QAPIfy "redundancy" create option

2018-02-26 Thread Max Reitz
On 2018-02-23 20:25, Kevin Wolf wrote:
> The "redundancy" option for Sheepdog image creation is currently a
> string that can encode one or two integers depending on its format,
> which at the same time implicitly selects a mode.
> 
> This patch turns it into a QAPI union and converts the string into such
> a QAPI object before interpreting the values.
> 
> Signed-off-by: Kevin Wolf 
> Reviewed-by: Max Reitz 
> ---
>  qapi/block-core.json | 45 +
>  block/sheepdog.c | 94 
> +---
>  2 files changed, 112 insertions(+), 27 deletions(-)

[...]

> @@ -1907,35 +1950,32 @@ static int parse_redundancy(BDRVSheepdogState *s, 
> const char *opt)
>  return -EINVAL;
>  }
>  
> -copy = strtol(n1, NULL, 10);
> -/* FIXME fix error checking by switching to qemu_strtol() */
> -if (copy > SD_MAX_COPIES || copy < 1) {
> -return -EINVAL;
> -}
> -if (!n2) {
> -inode->copy_policy = 0;
> -inode->nr_copies = copy;
> -return 0;
> +ret = qemu_strtol(n1, NULL, 10, &copy);

(By the way: This was what I was thanking you for in v2 -- I just now
realized I was clever enough not to point to it in my reply...)

> +if (ret < 0) {
> +return ret;
>  }
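
For illustration, strict parsing of a string that holds one or two integers
might look like the sketch below. It uses plain strtol() with endptr and errno
checks rather than QEMU's qemu_strtol(), and it assumes the two-integer form is
written "X:Y"; the separator, the RedundancySketch type and both function
names are hypothetical and not taken from block/sheepdog.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type; the real QAPI union is not shown in this mail. */
typedef struct {
    int two_values;   /* 0: "N" form, 1: "X:Y" form (assumed syntax) */
    long first;
    long second;      /* only meaningful when two_values != 0 */
} RedundancySketch;

/* Parse one base-10 integer that must span exactly [s, end). */
static int parse_long_strict(const char *s, const char *end, long *out)
{
    char *stop;

    if (s == end) {
        return -EINVAL;          /* empty field */
    }
    errno = 0;
    *out = strtol(s, &stop, 10);
    if (errno != 0) {
        return -errno;           /* out of range */
    }
    if (stop != end) {
        return -EINVAL;          /* no digits, or trailing garbage */
    }
    return 0;
}

/* The string format implicitly selects the mode: one value or two. */
int parse_redundancy_sketch(const char *opt, RedundancySketch *r)
{
    const char *sep = strchr(opt, ':');
    const char *end = opt + strlen(opt);
    int ret;

    memset(r, 0, sizeof(*r));
    ret = parse_long_strict(opt, sep ? sep : end, &r->first);
    if (ret < 0) {
        return ret;
    }
    if (sep) {
        r->two_values = 1;
        ret = parse_long_strict(sep + 1, end, &r->second);
        if (ret < 0) {
            return ret;
        }
    }
    return 0;
}
```

Range checks (such as the SD_MAX_COPIES bound in the removed lines) would
come after parsing, once the values are in a structured form.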



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [PATCH v7 4/4] cryptodev-vhost-user: set the key length

2018-02-26 Thread Jay Zhou
From: Gonglei 

Signed-off-by: Gonglei 
---
 backends/cryptodev-vhost-user.c   | 4 
 include/sysemu/cryptodev-vhost-user.h | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/backends/cryptodev-vhost-user.c b/backends/cryptodev-vhost-user.c
index 9cd06c4..862d4f2 100644
--- a/backends/cryptodev-vhost-user.c
+++ b/backends/cryptodev-vhost-user.c
@@ -224,6 +224,10 @@ static void cryptodev_vhost_user_init(
  1u << VIRTIO_CRYPTO_SERVICE_MAC;
 backend->conf.cipher_algo_l = 1u << VIRTIO_CRYPTO_CIPHER_AES_CBC;
 backend->conf.hash_algo = 1u << VIRTIO_CRYPTO_HASH_SHA1;
+
+backend->conf.max_size = UINT64_MAX;
+backend->conf.max_cipher_key_len = VHOST_USER_MAX_CIPHER_KEY_LEN;
+backend->conf.max_auth_key_len = VHOST_USER_MAX_AUTH_KEY_LEN;
 }
 
 static int64_t cryptodev_vhost_user_sym_create_session(
diff --git a/include/sysemu/cryptodev-vhost-user.h 
b/include/sysemu/cryptodev-vhost-user.h
index 937217b..6debf53 100644
--- a/include/sysemu/cryptodev-vhost-user.h
+++ b/include/sysemu/cryptodev-vhost-user.h
@@ -23,6 +23,9 @@
 #ifndef CRYPTODEV_VHOST_USER_H
 #define CRYPTODEV_VHOST_USER_H
 
+#define VHOST_USER_MAX_AUTH_KEY_LEN512
+#define VHOST_USER_MAX_CIPHER_KEY_LEN  64
+
 
 /**
  * cryptodev_vhost_user_get_vhost:
-- 
1.8.3.1





Re: [Qemu-devel] [PATCH v6 00/23] RISC-V QEMU Port Submission

2018-02-26 Thread Peter Maydell
On 26 February 2018 at 12:03, Daniel P. Berrangé  wrote:
> Eeek, I totally missed that as the top level LICENSE file only mentions
> GPL and BSD licenses :-(  I guess that's a trigger for a patch to improve
> the text in the LICENSE file to better reflect reality...

Paragraph (2) says "Parts of QEMU have specific licenses which are
compatible with the GNU General Public License, version 2", which
is what this is. Paragraph (3) isn't saying "BSD license is special",
it's saying "the TCG codegen code is special" -- it's a theoretically
well-defined reusable subset of code that has its own tighter standards
for what license we accept (see also tcg/LICENSE).

thanks
-- PMM



Re: [Qemu-devel] [PATCH v3 28/36] sheepdog: Support .bdrv_co_create

2018-02-26 Thread Max Reitz
On 2018-02-23 20:25, Kevin Wolf wrote:
> This adds the .bdrv_co_create driver callback to sheepdog, which enables
> image creation over QMP.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  qapi/block-core.json |  24 -
>  block/sheepdog.c | 242 
> +++
>  2 files changed, 191 insertions(+), 75 deletions(-)

Reviewed-by: Max Reitz 





[Qemu-devel] [PATCH v7 1/4] cryptodev: add vhost-user as a new cryptodev backend

2018-02-26 Thread Jay Zhou
From: Gonglei 

Usage:
 -chardev socket,id=charcrypto0,path=/path/to/your/socket
 -object cryptodev-vhost-user,id=cryptodev0,chardev=charcrypto0
 -device virtio-crypto-pci,id=crypto0,cryptodev=cryptodev0

Signed-off-by: Gonglei 
Signed-off-by: Longpeng(Mike) 
Signed-off-by: Jay Zhou 
---
 backends/Makefile.objs   |   6 +
 backends/cryptodev-vhost-user.c  | 331 +++
 backends/cryptodev-vhost.c   |  89 +++
 configure|  15 ++
 include/sysemu/cryptodev-vhost.h | 154 ++
 qemu-options.hx  |  21 +++
 vl.c |   6 +
 7 files changed, 622 insertions(+)
 create mode 100644 backends/cryptodev-vhost-user.c
 create mode 100644 backends/cryptodev-vhost.c
 create mode 100644 include/sysemu/cryptodev-vhost.h

diff --git a/backends/Makefile.objs b/backends/Makefile.objs
index 67eeeba..9b7face 100644
--- a/backends/Makefile.objs
+++ b/backends/Makefile.objs
@@ -9,4 +9,10 @@ common-obj-$(CONFIG_LINUX) += hostmem-file.o
 common-obj-y += cryptodev.o
 common-obj-y += cryptodev-builtin.o
 
+ifeq ($(CONFIG_VIRTIO),y)
+common-obj-$(CONFIG_LINUX) += cryptodev-vhost.o
+common-obj-$(call land,$(CONFIG_VHOST_USER),$(CONFIG_LINUX)) += \
+cryptodev-vhost-user.o
+endif
+
 common-obj-$(CONFIG_LINUX) += hostmem-memfd.o
diff --git a/backends/cryptodev-vhost-user.c b/backends/cryptodev-vhost-user.c
new file mode 100644
index 000..93c3f10
--- /dev/null
+++ b/backends/cryptodev-vhost-user.c
@@ -0,0 +1,331 @@
+/*
+ * QEMU Cryptodev backend for QEMU cipher APIs
+ *
+ * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+ *
+ * Authors:
+ *Gonglei 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "hw/boards.h"
+#include "qapi/error.h"
+#include "qapi/qmp/qerror.h"
+#include "qemu/error-report.h"
+#include "standard-headers/linux/virtio_crypto.h"
+#include "sysemu/cryptodev-vhost.h"
+#include "chardev/char-fe.h"
+
+
+/**
+ * @TYPE_CRYPTODEV_BACKEND_VHOST_USER:
+ * name of backend that uses vhost user server
+ */
+#define TYPE_CRYPTODEV_BACKEND_VHOST_USER "cryptodev-vhost-user"
+
+#define CRYPTODEV_BACKEND_VHOST_USER(obj) \
+OBJECT_CHECK(CryptoDevBackendVhostUser, \
+ (obj), TYPE_CRYPTODEV_BACKEND_VHOST_USER)
+
+
+typedef struct CryptoDevBackendVhostUser {
+CryptoDevBackend parent_obj;
+
+CharBackend chr;
+char *chr_name;
+bool opened;
+CryptoDevBackendVhost *vhost_crypto[MAX_CRYPTO_QUEUE_NUM];
+} CryptoDevBackendVhostUser;
+
+static int
+cryptodev_vhost_user_running(
+ CryptoDevBackendVhost *crypto)
+{
+return crypto ? 1 : 0;
+}
+
+static void cryptodev_vhost_user_stop(int queues,
+  CryptoDevBackendVhostUser *s)
+{
+size_t i;
+
+for (i = 0; i < queues; i++) {
+if (!cryptodev_vhost_user_running(s->vhost_crypto[i])) {
+continue;
+}
+
+cryptodev_vhost_cleanup(s->vhost_crypto[i]);
+s->vhost_crypto[i] = NULL;
+}
+}
+
+static int
+cryptodev_vhost_user_start(int queues,
+ CryptoDevBackendVhostUser *s)
+{
+CryptoDevBackendVhostOptions options;
+CryptoDevBackend *b = CRYPTODEV_BACKEND(s);
+int max_queues;
+size_t i;
+
+for (i = 0; i < queues; i++) {
+if (cryptodev_vhost_user_running(s->vhost_crypto[i])) {
+continue;
+}
+
+options.opaque = &s->chr;
+options.backend_type = VHOST_BACKEND_TYPE_USER;
+options.cc = b->conf.peers.ccs[i];
+s->vhost_crypto[i] = cryptodev_vhost_init(&options);
+if (!s->vhost_crypto[i]) {
+error_report("failed to init vhost_crypto for queue %zu", i);
+goto err;
+}
+
+if (i == 0) {
+max_queues =
+  cryptodev_vhost_get_max_queues(s->vhost_crypto[i]);
+if (queues > max_queues) {
+error_report("you are asking more queues than supported: %d",
+ max_queues);
+goto err;
+}
+}
+}
+
+return 0;
+
+err:
+cryptodev_vhost_user_stop(i + 1, s);
+return -1;
+}
+
+static Chardev *
+cryptodev_vhost_claim_chardev(CryptoDevBackendVhostUser *s,
+Error **errp)
+{
+Chardev *chr;
+
+if (s->c

Re: [Qemu-devel] [PATCH v2 32/36] ssh: Support .bdrv_co_create

2018-02-26 Thread Max Reitz
On 2018-02-21 14:54, Kevin Wolf wrote:
> This adds the .bdrv_co_create driver callback to ssh, which enables
> image creation over QMP.
> 
> Signed-off-by: Kevin Wolf 
> Reviewed-by: Max Reitz 
> ---
>  qapi/block-core.json | 16 -
>  block/ssh.c  | 92 
> +---
>  2 files changed, 67 insertions(+), 41 deletions(-)

This needs a rebase on my ssh truncation patches (notably "block/ssh:
Pull ssh_grow_file() from ssh_create()") -- good thing for me you need
to rebase, because without those patches you cannot create qcow2 files
over ssh. O:-)

Max





Re: [Qemu-devel] [PATCH v2 32/36] ssh: Support .bdrv_co_create

2018-02-26 Thread Max Reitz
On 2018-02-26 13:40, Max Reitz wrote:
> On 2018-02-21 14:54, Kevin Wolf wrote:
>> This adds the .bdrv_co_create driver callback to ssh, which enables
>> image creation over QMP.
>>
>> Signed-off-by: Kevin Wolf 
>> Reviewed-by: Max Reitz 
>> ---
>>  qapi/block-core.json | 16 -
>>  block/ssh.c  | 92 
>> +---
>>  2 files changed, 67 insertions(+), 41 deletions(-)
> 
> This needs a rebase on my ssh truncation patches (notably "block/ssh:
> Pull ssh_grow_file() from ssh_create()") -- good thing for me you need
> to rebase, because without those patches you cannot create qcow2 files
> over ssh. O:-)

Oops, meant to reply to v3...

Max





Re: [Qemu-devel] [PATCH v3 32/36] ssh: Support .bdrv_co_create

2018-02-26 Thread Max Reitz
On 2018-02-23 20:25, Kevin Wolf wrote:
> This adds the .bdrv_co_create driver callback to ssh, which enables
> image creation over QMP.
> 
> Signed-off-by: Kevin Wolf 
> Reviewed-by: Max Reitz 
> ---
>  qapi/block-core.json | 16 -
>  block/ssh.c  | 92 
> +---
>  2 files changed, 67 insertions(+), 41 deletions(-)

As written just now in my reply to v2 (clever me), this conflicts with
my patches to add truncation support to ssh -- and since your series
requires protocols to have feature parity when comparing truncation on
creation and truncation on-the-fly, I insolently claim that you should
rebase. O:-)

Max





Re: [Qemu-devel] [PATCH v2 36/36] qemu-iotests: Test ssh image creation over QMP

2018-02-26 Thread Max Reitz
On 2018-02-21 14:54, Kevin Wolf wrote:
> Signed-off-by: Kevin Wolf 
> Reviewed-by: Max Reitz 
> ---
>  tests/qemu-iotests/207 | 261 
> +
>  tests/qemu-iotests/207.out |  75 +
>  tests/qemu-iotests/group   |   1 +
>  3 files changed, 337 insertions(+)
>  create mode 100755 tests/qemu-iotests/207
>  create mode 100644 tests/qemu-iotests/207.out

Minor note: If this test tried to create a qcow2 image over ssh, it
would have seen "Image format driver does not support resize" without
the ssh truncation patches.

But I'm not sure whether such a test case should be added here, because
technically this test then becomes a qcow2+ssh test, and nobody is ever
going to run the iotests with the qcow2+ssh combination.

(We could cheat and still mark this test as raw and then just create a
qcow2 image nonetheless...)

Max





Re: [Qemu-devel] [PATCH v3 36/36] qemu-iotests: Test ssh image creation over QMP

2018-02-26 Thread Max Reitz
On 2018-02-23 20:25, Kevin Wolf wrote:
> Signed-off-by: Kevin Wolf 
> Reviewed-by: Max Reitz 
> ---
>  tests/qemu-iotests/207 | 261 
> +
>  tests/qemu-iotests/207.out |  75 +
>  tests/qemu-iotests/group   |   1 +
>  3 files changed, 337 insertions(+)
>  create mode 100755 tests/qemu-iotests/207
>  create mode 100644 tests/qemu-iotests/207.out

Aw man, I've done it again and replied to the wrong version...

The gist of my reply: Maybe this should also test qcow2 creation over
ssh, but only if that doesn't force us to mark it as qcow2+ssh.

Max





Re: [Qemu-devel] [RFC] exec: eliminate ram naming issue as migration

2018-02-26 Thread Igor Mammedov
On Sat, 24 Feb 2018 03:11:30 +
"Tan, Jianfeng"  wrote:

> > -Original Message-
> > From: Tan, Jianfeng
> > Sent: Saturday, February 24, 2018 11:08 AM
> > To: 'Igor Mammedov'
> > Cc: Paolo Bonzini; Jason Wang; Maxime Coquelin; qemu-devel@nongnu.org;
> > Michael S . Tsirkin
> > Subject: RE: [Qemu-devel] [RFC] exec: eliminate ram naming issue as
> > migration
> > 
> > Hi Igor and all,
> >   
> > > -Original Message-
> > > From: Igor Mammedov [mailto:imamm...@redhat.com]
> > > Sent: Thursday, February 8, 2018 7:30 PM
> > > To: Tan, Jianfeng
> > > Cc: Paolo Bonzini; Jason Wang; Maxime Coquelin; qemu-  
> > de...@nongnu.org;  
> > > Michael S . Tsirkin
> > > Subject: Re: [Qemu-devel] [RFC] exec: eliminate ram naming issue as
> > > migration
> > >  
> > [...]  
> > > > > It could be solved by adding memdev option to machine,
> > > > > which would allow to specify backend object. And then on
> > > > > top make -mem-path alias new option to clean thing up.  
> > > >
> > > > Do you mean?
> > > >
> > > > src vm: -m xG
> > > > dst vm: -m xG,memdev=pc.ram -object memory-backend-  
> > file,id=pc.ram,size=xG,mem-path=xxx,share=on ...  
> > > Yep, I've meant something like it
> > >
> > > src vm: -m xG,memdev=SHARED_RAM -object memory-backend-  
> > file,id=SHARED_RAM,size=xG,mem-path=xxx,share=on  
> > > dst vm: -m xG,memdev=SHARED_RAM -object memory-backend-  
> > file,id=SHARED_RAM,size=xG,mem-path=xxx,share=on
> > 
> > After a second thought, I find adding a backend for nonnuma pc RAM is
> > roundabout way.
> > 
> > And we actually have an existing way to add a file-backed RAM: commit
> > c902760fb25f ("Add option to use file backed guest memory"). Basically, this
> > commit adds two options, --mem-path and --mem-prealloc, without specify
> > a backend explicitly.
> > 
> > So how about just adding a new option --mem-share to decide if that's a
> > private memory or shared memory? That seems much straightforward way
The above options are legacy (we can't remove them for compat reasons);
their replacement is the 'memory-backend-file' backend, which has all of
the above including the 'share' property.

So just add a 'memdev' property to the machine and reuse memory-backend-file
with it instead of duplicating functionality in the legacy code.

> > to me; after this change we can migrate like:
> > 
> > src vm: -m xG
> > dst vm: -m xG --mem-path xxx --mem-share
Even though it might work for now, that's still an invalid configuration
for migration: the src side must include the same
  "--mem-path xxx --mem-share"
options as dst.

It'd be better to fix the management application to start QEMU
properly on the SRC side.

 
> Attach the patch FYI. Look forward to your thoughts.
> 
> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> index 31612ca..5eaf367 100644
> --- a/include/sysemu/sysemu.h
> +++ b/include/sysemu/sysemu.h
> @@ -127,6 +127,7 @@ extern bool enable_mlock;
>  extern uint8_t qemu_extra_params_fw[2];
>  extern QEMUClockType rtc_clock;
>  extern const char *mem_path;
> +extern int mem_share;
>  extern int mem_prealloc;
>  
>  #define MAX_NODES 128
> diff --git a/numa.c b/numa.c
> index 7b9c33a..322289f 100644
> --- a/numa.c
> +++ b/numa.c
> @@ -456,7 +456,7 @@ static void allocate_system_memory_nonnuma(MemoryRegion 
> *mr, Object *owner,
>  if (mem_path) {
>  #ifdef __linux__
>  Error *err = NULL;
> -memory_region_init_ram_from_file(mr, owner, name, ram_size, false,
> +memory_region_init_ram_from_file(mr, owner, name, ram_size, 
> mem_share,
>   mem_path, &err);
>  if (err) {
>  error_report_err(err);
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 678181c..c968c53 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -389,6 +389,15 @@ STEXI
>  Allocate guest RAM from a temporarily created file in @var{path}.
>  ETEXI
>  
> +DEF("mem-share", 0, QEMU_OPTION_memshare,
> +"-mem-share   make guest memory shareable (use with -mem-path)\n",
> +QEMU_ARCH_ALL)
> +STEXI
> +@item -mem-share
> +@findex -mem-share
> +Make file-backed guest RAM shareable when using -mem-path.
> +ETEXI
> +
>  DEF("mem-prealloc", 0, QEMU_OPTION_mem_prealloc,
>  "-mem-prealloc   preallocate guest memory (use with -mem-path)\n",
>  QEMU_ARCH_ALL)
> diff --git a/vl.c b/vl.c
> index 444b750..0ff06c2 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -140,6 +140,7 @@ int display_opengl;
>  const char* keyboard_layout = NULL;
>  ram_addr_t ram_size;
>  const char *mem_path = NULL;
> +int mem_share = 0;
>  int mem_prealloc = 0; /* force preallocation of physical target memory */
>  bool enable_mlock = false;
>  int nb_nics;
> @@ -3395,6 +3396,9 @@ int main(int argc, char **argv, char **envp)
>  case QEMU_OPTION_mempath:
>  mem_path = optarg;
>  break;
> +case QEMU_OPTION_memshare:
> +mem_share = 1;
> +break;
>  case QEMU_OPTION_mem_prealloc:
>

Re: [Qemu-devel] [PATCH v6 00/23] RISC-V QEMU Port Submission

2018-02-26 Thread Laurent Desnogues
On Mon, Feb 26, 2018 at 1:32 PM, Peter Maydell  wrote:
> Paragraph (3) isn't saying "BSD license is special",
> it's saying "the TCG codegen code is special" -- it's a theoretically
> well-defined reusable subset of code that has its own tighter standards
> for what license we accept (see also tcg/LICENSE).

That tcg/LICENSE file has alas become obsolete since the AArch64
back-end and its GPLv2+ license was added.  That's quite unfortunate.


Laurent



[Qemu-devel] [PATCH 1/1] serial: Open non-block

2018-02-26 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

On a real serial device, the open can block if the handshake
lines are in a particular state.  If QEMU is passing the serial
device to the guest, QEMU startup is blocked opening the device
(with a symptom seen as a timeout from libvirt).

Open the serial port with O_NONBLOCK.

Signed-off-by: Dr. David Alan Gilbert 
---
 chardev/char-serial.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/chardev/char-serial.c b/chardev/char-serial.c
index feb52e559d..97be5d4a63 100644
--- a/chardev/char-serial.c
+++ b/chardev/char-serial.c
@@ -265,7 +265,8 @@ static void qmp_chardev_open_serial(Chardev *chr,
 ChardevHostdev *serial = backend->u.serial.data;
 int fd;
 
-fd = qmp_chardev_open_file_source(serial->device, O_RDWR, errp);
+fd = qmp_chardev_open_file_source(serial->device, O_RDWR | O_NONBLOCK,
+  errp);
 if (fd < 0) {
 return;
 }
-- 
2.14.3
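
A minimal standalone sketch of the same idea outside QEMU, assuming a POSIX
host: open the device with O_NONBLOCK so that open() itself does not wait on
the handshake lines. The helper names open_device_nonblocking() and
fd_is_nonblocking() are hypothetical; the patch itself goes through
qmp_chardev_open_file_source().

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a device node read/write without letting
 * open() block. Returns the file descriptor, or -1 with errno set.
 */
int open_device_nonblocking(const char *path)
{
    return open(path, O_RDWR | O_NONBLOCK);
}

/* Returns 1 if O_NONBLOCK is set on fd, 0 if not, -1 on error. */
int fd_is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0) {
        return -1;
    }
    return (flags & O_NONBLOCK) != 0;
}
```

Note that the flag stays set on the descriptor after open(), so later reads
and writes are non-blocking as well unless the caller clears it with
fcntl(F_SETFL).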




[Qemu-devel] [Bug 1751264] Re: qemu-img convert issue in a tmpfs partition

2018-02-26 Thread Teddy VALETTE
** Description changed:

  qemu-img convert command is slow when the file to convert is located in
  a tmpfs formatted partition.
  
  v2.1.0 on debian/jessie x64, ext4: 10m14s
  v2.1.0 on debian/jessie x64, tmpfs: 10m15s
  
  v2.1.0 on debian/stretch x64, ext4: 11m9s
  v2.1.0 on debian/stretch x64, tmpfs: 10m21.362s
  
  v2.8.0 on debian/jessie x64, ext4: 10m21s
- v2.8.0 on debian/jessie x64, tmpfs: Too long
+ v2.8.0 on debian/jessie x64, tmpfs: Too long (50min+)
  
  v2.8.0 on debian/stretch x64, ext4: 10m42s
- v2.8.0 on debian/stretch x64, tmpfs: Too long
+ v2.8.0 on debian/stretch x64, tmpfs: Too long (50min+)
  
  It seems that the issue is caused by this commit :
  https://github.com/qemu/qemu/commit/690c7301600162421b928c7f26fd488fd8fa464e
  
  In order to reproduce this bug :
  
  1/ mount a tmpfs partition : mount -t tmpfs tmpfs /tmp
  2/ get a vmdk file (we used a 15GB image) and put it on /tmp
  3/ run the 'qemu-img convert -O qcow2 /tmp/file.vmdk /path/to/destination' 
command
  
  When we trace the process, we can see that there's a lseek loop which is
  very slow (compared to outside a tmpfs partition).

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1751264

Title:
  qemu-img convert issue in a tmpfs partition

Status in QEMU:
  New

Bug description:
  qemu-img convert command is slow when the file to convert is located
  in a tmpfs formatted partition.

  v2.1.0 on debian/jessie x64, ext4: 10m14s
  v2.1.0 on debian/jessie x64, tmpfs: 10m15s

  v2.1.0 on debian/stretch x64, ext4: 11m9s
  v2.1.0 on debian/stretch x64, tmpfs: 10m21.362s

  v2.8.0 on debian/jessie x64, ext4: 10m21s
  v2.8.0 on debian/jessie x64, tmpfs: Too long (50min+)

  v2.8.0 on debian/stretch x64, ext4: 10m42s
  v2.8.0 on debian/stretch x64, tmpfs: Too long (50min+)

  It seems that the issue is caused by this commit :
  https://github.com/qemu/qemu/commit/690c7301600162421b928c7f26fd488fd8fa464e

  In order to reproduce this bug :

  1/ mount a tmpfs partition : mount -t tmpfs tmpfs /tmp
  2/ get a vmdk file (we used a 15GB image) and put it on /tmp
  3/ run the 'qemu-img convert -O qcow2 /tmp/file.vmdk /path/to/destination' 
command

  When we trace the process, we can see that there's a lseek loop which
  is very slow (compared to outside a tmpfs partition).

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1751264/+subscriptions
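
To make the "lseek loop" concrete, here is a minimal sketch of the kind of
loop that walks a file's data extents. It assumes the loop uses
SEEK_DATA/SEEK_HOLE on Linux, which the report does not state, and
count_data_extents() is a hypothetical helper, not code from qemu-img. The
cost grows with the number of extents, since each one takes two lseek()
calls.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Linux values, used only if the headers did not expose them. */
#ifndef SEEK_DATA
#define SEEK_DATA 3
#define SEEK_HOLE 4
#endif

/* Count the data extents of fd; returns -1 on error. */
long count_data_extents(int fd)
{
    off_t end = lseek(fd, 0, SEEK_END);
    off_t pos = 0;
    long extents = 0;

    if (end < 0) {
        return -1;
    }
    while (pos < end) {
        off_t data = lseek(fd, pos, SEEK_DATA);
        off_t hole;

        if (data < 0) {
            if (errno == ENXIO) {
                break;           /* only a hole remains after pos */
            }
            return -1;
        }
        hole = lseek(fd, data, SEEK_HOLE);
        if (hole < 0) {
            return -1;
        }
        extents++;
        pos = hole;              /* continue after this data extent */
    }
    return extents;
}
```

Timing this helper on the same image stored on tmpfs and on ext4 would be one
way to check whether the per-call lseek() cost is what differs.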



Re: [Qemu-devel] [RFC PATCH 1/2] qcow2: Allow checking and repairing corrupted internal snapshots

2018-02-26 Thread Max Reitz
On 2018-02-15 17:30, Alberto Garcia wrote:
> The L1 table parameters of internal snapshots are generally not
> checked by QEMU. This patch allows 'qemu-img check' to detect broken
> snapshots and to skip them when doing the refcount consistency check.
> 
> Since without an L1 table we don't have a reliable way to recover the
> data from the snapshot, when 'qemu-img check' runs in repair mode this
> patch simply removes the corrupted snapshots.
> 
> Signed-off-by: Alberto Garcia 
> ---
>  block/qcow2-snapshot.c | 53 
> ++
>  block/qcow2.c  |  7 ++-
>  block/qcow2.h  |  2 ++
>  3 files changed, 61 insertions(+), 1 deletion(-)

I think we shouldn't delete things in qemu-img check.  I think we do need a
new mode (-r lossy? -r destructive?), although I'd personally even
prefer asking the user before every destructive change.  The only
reason I'm not strongly in favor of this is that we don't have the
infrastructure for that (yet).

> diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
> index cee25f582b..7a36073e3e 100644
> --- a/block/qcow2-snapshot.c
> +++ b/block/qcow2-snapshot.c
> @@ -736,3 +736,56 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs,
>  
>  return 0;
>  }
> +
> +/* Check the snapshot table and optionally delete the corrupted entries */
> +int qcow2_snapshot_table_check(BlockDriverState *bs, BdrvCheckResult *result,
> +   BdrvCheckMode fix)
> +{
> +BDRVQcow2State *s = bs->opaque;
> +bool keep_checking;
> +int ret, i;
> +
> +do {
> +keep_checking = false;
> +
> +for (i = 0; i < s->nb_snapshots; i++) {
> +QCowSnapshot *sn = s->snapshots + i;
> +bool found_corruption = false;
> +
> +if (offset_into_cluster(s, sn->l1_table_offset)) {
> +fprintf(stderr, "%s snapshot %s (%s) l1_offset=%#" PRIx64 ": "
> +"L1 table is not cluster aligned; snapshot table entry "
> +"corrupted\n",
> +(fix & BDRV_FIX_ERRORS) ? "Deleting" : "ERROR",
> +sn->id_str, sn->name, sn->l1_table_offset);
> +found_corruption = true;
> +} else if (sn->l1_size > QCOW_MAX_L1_SIZE / sizeof(uint64_t)) {
> +fprintf(stderr, "%s snapshot %s (%s) l1_size=%#" PRIx32 ": "
> +"L1 table is too large; snapshot table entry corrupted\n",
> +(fix & BDRV_FIX_ERRORS) ? "Deleting" : "ERROR",
> +sn->id_str, sn->name, sn->l1_size);
> +found_corruption = true;
> +}

This code assumes the snapshot table itself is valid.  Why should
it be when it contains garbage entries?

> +
> +if (found_corruption) {
> +result->corruptions++;
> +sn->l1_size = 0; /* Prevent this L1 table from being used */
> +if (fix & BDRV_FIX_ERRORS) {
> +ret = qcow2_snapshot_delete(bs, sn->id_str, sn->name, NULL);

So calling this is actually very dangerous.  It modifies the snapshot
table which I wouldn't trust is actually just a snapshot table.  It
could intersect any other structure in the qcow2 image.  Yes, we do an
overlap check, but that only protects metadata, and I don't really want
to see an overlap check corruption when repairing the image; especially
since this means you cannot fix the corruption.

I don't quite know myself what to do instead, but I guess my main point
would be:  Before any (potentially) destructive changes are made, the
user should have the chance of still opening the image read-only and
copying all the data off somewhere else.  Which of course again means we
shouldn't prevent the user from opening an image because a snapshot is
broken.

(This would at least allow the user to convert the image to raw, then
invoke -r destructive, and then compare the result to see whether
anything has visibly changed.)

Max

> +if (ret < 0) {
> +return ret;
> +}
> +result->corruptions_fixed++;
> +/* If we modified the snapshot table we can't keep
> + * iterating. We have to start again from the
> + * beginning instead. */
> +keep_checking = true;
> +break;
> +}
> +}
> +}
> +
> +} while (keep_checking);
> +
> +return 0;
> +}
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 2c6c33b67c..20e16ea602 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -546,7 +546,12 @@ int qcow2_mark_consistent(BlockDriverState *bs)
>  static int qcow2_check(BlockDriverState *bs, BdrvCheckResult *result,
> BdrvCheckMode fix)
>  {
> -int ret = qcow2_check_refcounts(bs, result, fix);
> +int ret = qcow2_sn

Re: [Qemu-devel] [PATCH v2] iotests: Test abnormally large size in compressed cluster descriptor

2018-02-26 Thread Alberto Garcia
On Fri 23 Feb 2018 02:30:14 PM CET, Eric Blake wrote:
>> One possible task for the future is to make 'qemu-img check' verify
>> the sizes of the compressed clusters, by trying to decompress the data
>> and checking that the size stored in the L2 entry is correct.
>
> Indeed, but that means...
>
>> +
>> +# Reduce size of compressed data to 4 sectors: this corrupts the image.
>> +poke_file "$TEST_IMG" $((0x800000)) "\x40\x06"
>> +$QEMU_IO -c "read  -P 0x11 0 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
>> +
>> +# 'qemu-img check' however doesn't see anything wrong because it
>> +# doesn't try to decompress the data and the refcounts are consistent.
>> +_check_test_img
>
> ...this spot should have a TODO comment that mentions the test needs 
> updating if qemu-img check is taught to be pickier.

Hehe, I actually had a TODO there but decided to remove it in the last
moment.

> Hmm - I also wonder - does our refcount code properly account for a
> compressed cluster that would affect the refcount of THREE clusters?
> Remember, qemu will never emit a compressed cluster that touches more
> than two clusters, but when you enlarge the size, if offset part of
> the link was already in the tail of one cluster, then you can bleed
> over into not just one, but two additional host clusters.  Your test
> didn't cover that, because it uses a compressed cluster that maps to
> the start of the host cluster.

Yes, just fine. I could actually check that by corrupting the second
compressed cluster instead of the first one. Or both, in fact.

I'll send v3 with this change then.

Berto



Re: [Qemu-devel] [PATCH 1/1] serial: Open non-block

2018-02-26 Thread Paolo Bonzini
On 26/02/2018 14:04, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> On a real serial device, the open can block if the handshake
> lines are in a particular state.  If a QEMU is passing the serial
> device to the guest, the QEMU startup is blocked opening the device
> (with a symptom seen as a timeout from libvirt).
> 
> Open the serial port with O_NONBLOCK.
> 
> Signed-off-by: Dr. David Alan Gilbert 

Socket chardevs have "nowait" for that.  Should serial have something
similar?

Thanks,

Paolo

> ---
>  chardev/char-serial.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/chardev/char-serial.c b/chardev/char-serial.c
> index feb52e559d..97be5d4a63 100644
> --- a/chardev/char-serial.c
> +++ b/chardev/char-serial.c
> @@ -265,7 +265,8 @@ static void qmp_chardev_open_serial(Chardev *chr,
>  ChardevHostdev *serial = backend->u.serial.data;
>  int fd;
>  
> -fd = qmp_chardev_open_file_source(serial->device, O_RDWR, errp);
> +fd = qmp_chardev_open_file_source(serial->device, O_RDWR | O_NONBLOCK,
> +  errp);
>  if (fd < 0) {
>  return;
>  }
> 




Re: [Qemu-devel] [PATCH] hw/acpi-build: build SRAT memory affinity structures for NVDIMM

2018-02-26 Thread Igor Mammedov
On Thu, 22 Feb 2018 09:40:00 +0800
Haozhong Zhang  wrote:

> On 02/21/18 14:55 +0100, Igor Mammedov wrote:
> > On Tue, 20 Feb 2018 17:17:58 -0800
> > Dan Williams  wrote:
> >   
> > > On Tue, Feb 20, 2018 at 6:10 AM, Igor Mammedov  
> > > wrote:  
> > > > On Sat, 17 Feb 2018 14:31:35 +0800
> > > > Haozhong Zhang  wrote:
> > > >
> > > >> ACPI 6.2A Table 5-129 "SPA Range Structure" requires the proximity
> > > >> domain of a NVDIMM SPA range must match with corresponding entry in
> > > >> SRAT table.
> > > >>
> > > >> The address ranges of vNVDIMM in QEMU are allocated from the
> > > >> hot-pluggable address space, which is entirely covered by one SRAT
> > > >> memory affinity structure. However, users can set the vNVDIMM
> > > >> proximity domain in NFIT SPA range structure by the 'node' property of
> > > >> '-device nvdimm' to a value different than the one in the above SRAT
> > > >> memory affinity structure.
> > > >>
> > > >> In order to solve such proximity domain mismatch, this patch build one
> > > >> SRAT memory affinity structure for each NVDIMM device with the
> > > >> proximity domain used in NFIT. The remaining hot-pluggable address
> > > >> space is covered by one or multiple SRAT memory affinity structures
> > > >> with the proximity domain of the last node as before.
> > > >>
> > > >> Signed-off-by: Haozhong Zhang 
> > > > If we consider hotpluggable system, correctly implemented OS should
> > > > be able pull proximity from Device::_PXM and override any value from 
> > > > SRAT.
> > > > Do we really have a problem here (anything that breaks if we would use 
> > > > _PXM)?
> > > > Maybe we should add _PXM object to nvdimm device nodes instead of 
> > > > massaging SRAT?
> > > 
> > > Unfortunately _PXM is an awkward fit. Currently the proximity domain
> > > is attached to the SPA range structure. The SPA range may be
> > > associated with multiple DIMM devices and those individual NVDIMMs may
> > > have conflicting _PXM properties.  
> > There shouldn't be any conflict here as  NVDIMM device's _PXM method,
> > should override in runtime any proximity specified by parent scope.
> > (as parent scope I'd also count boot time NFIT/SRAT tables).
> > 
> > To make it more clear we could clear valid proximity domain flag in SPA
> > like this:
> > 
> > diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
> > index 59d6e42..131bca5 100644
> > --- a/hw/acpi/nvdimm.c
> > +++ b/hw/acpi/nvdimm.c
> > @@ -260,9 +260,7 @@ nvdimm_build_structure_spa(GArray *structures, 
> > DeviceState *dev)
> >   */
> >  nfit_spa->flags = cpu_to_le16(1 /* Control region is strictly for
> > management during hot add/online
> > -   operation */ |
> > -  2 /* Data in Proximity Domain field is
> > -   valid*/);
> > +   operation */);
> >  
> >  /* NUMA node. */
> >  nfit_spa->proximity_domain = cpu_to_le32(node);
> >   
> > > Even if that was unified across
> > > DIMMs it is ambiguous whether a DIMM-device _PXM would relate to the
> > > device's control interface, or the assembled persistent memory SPA
> > > range.  
> > I'm not sure what you mean under 'device's control interface',
> > could you clarify where the ambiguity comes from?
> > 
> > I read spec as: _PXM applies to address range covered by NVDIMM
> > device it belongs to.
> > 
> > As for assembled SPA, I'd assume that it applies to interleaved set
> > and all NVDIMMs with it should be on the same node. It's somewhat
> > irrelevant question though as QEMU so far implements only
> >   1:1:1/SPA:Region Mapping:NVDIMM Device/
> > mapping.
> > 
> > My main concern with using static configuration tables for proximity
> > mapping, we'd miss on hotplug side of equation. However if we start
> > from dynamic side first, we could later complement it with static
> > tables if there really were need for it.  
> 
> This patch affects only the static tables and static-plugged NVDIMM.
> For hot-plugged NVDIMMs, guest OSPM still needs to evaluate _FIT to
> get the information of the new NVDIMMs including their proximity
> domains.
> 
> One intention of this patch is to simulate the bare metal as much as
> possible. I have been using this patch to develop and test NVDIMM
> enabling work on Xen, and think it might be useful for developers of
> other OS and hypervisors.
It's simpler on bare metal, as systems are usually statically partitioned
according to the capacity the slots are able to handle.

The patch is technically correct and might be useful,
especially in the current case, where the flag
/* Data in Proximity Domain field is valid */ is set, to conform to the spec.
So just complement the patch with a test case as requested and it should be
fine to merge.

PS:
while adding ranges for present NVDIMMs in SRAT it would be
better to generalize a bit and include present pc-dimms
there as well to be consistent with 

Re: [Qemu-devel] [PATCH v8 09/21] null: Switch to .bdrv_co_block_status()

2018-02-26 Thread Kevin Wolf
Am 24.02.2018 um 00:38 hat Eric Blake geschrieben:
> On 02/23/2018 11:05 AM, Kevin Wolf wrote:
> > Am 23.02.2018 um 17:43 hat Eric Blake geschrieben:
> > > > OFFSET_VALID | DATA might be excusable because I can see that it's
> > > > convenient that a protocol driver refers to itself as *file instead of
> > > > returning NULL there and then the offset is valid (though it would be
> > > > pointless to actually follow the file pointer), but OFFSET_VALID without
> > > > DATA probably isn't.
> > > 
> > > So OFFSET_VALID | DATA for a protocol BDS is not just convenient, but
> > > necessary to avoid breaking qemu-img map output.  But you are also right
> > > that OFFSET_VALID without data makes little sense at a protocol layer. So
> > > with that in mind, I'm auditing all of the protocol layers to make sure
> > > OFFSET_VALID ends up as something sane.
> > 
> > That's one way to look at it.
> > 
> > The other way is that qemu-img map shouldn't ask the protocol layer for
> > its offset because it already knows the offset (it is what it passes as
> > a parameter to bdrv_co_block_status).
> > 
> > Anyway, it's probably not worth changing the interface, we should just
> > make sure that the return values of the individual drivers are
> > consistent.
> 
> Yet another inconsistency, and it's making me scratch my head today.
> 
> By the way, in my byte-based stuff that is now pending on your tree, I tried
> hard to NOT change semantics or the set of flags returned by a given driver,
> and we agreed that's why you'd accept the series as-is and make me do this
> followup exercise.  But it's looking like my followups may end up touching a
> lot of the same drivers again, now that I'm looking at what the semantics
> SHOULD be (and whatever I do end up tweaking, I will at least make sure that
> iotests is still happy with it).

Hm, that's unfortunate, but I don't think we should hold up your first
series just so we can touch the drivers only once.

> First, let's read what states the NBD spec is proposing:
> 
> > It defines the following flags for the flags field:
> > 
> > NBD_STATE_HOLE (bit 0): if set, the block represents a hole (and future 
> > writes to that area may cause fragmentation or encounter an ENOSPC error); 
> > if clear, the block is allocated or the server could not otherwise 
> > determine its status. Note that the use of NBD_CMD_TRIM is related to this 
> > status, but that the server MAY report a hole even where NBD_CMD_TRIM has 
> > not been requested, and also that a server MAY report that the block is 
> > allocated even where NBD_CMD_TRIM has been requested.
> > NBD_STATE_ZERO (bit 1): if set, the block contents read as all zeroes; 
> > if clear, the block contents are not known. Note that the use of 
> > NBD_CMD_WRITE_ZEROES is related to this status, but that the server MAY 
> > report zeroes even where NBD_CMD_WRITE_ZEROES has not been requested, and 
> > also that a server MAY report unknown content even where 
> > NBD_CMD_WRITE_ZEROES has been requested.
> > 
> > It is not an error for a server to report that a region of the export has 
> > both NBD_STATE_HOLE set and NBD_STATE_ZERO clear. The contents of such an 
> > area are undefined, and a client reading such an area should make no 
> > assumption as to its contents or stability.
> 
> So here's how Vladimir proposed implementing it in his series (written
> before my byte-based block status stuff went in to your tree):
> https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg04038.html
> 
> Server side (3/9):
> 
> +int ret = bdrv_block_status_above(bs, NULL, offset, tail_bytes,
> &num,
> +  NULL, NULL);
> +if (ret < 0) {
> +return ret;
> +}
> +
> +flags = (ret & BDRV_BLOCK_ALLOCATED ? 0 : NBD_STATE_HOLE) |
> +(ret & BDRV_BLOCK_ZERO  ? NBD_STATE_ZERO : 0);
> 
> Client side (6/9):
> 
> +*pnum = extent.length >> BDRV_SECTOR_BITS;
> +return (extent.flags & NBD_STATE_HOLE ? 0 : BDRV_BLOCK_DATA) |
> +   (extent.flags & NBD_STATE_ZERO ? BDRV_BLOCK_ZERO : 0);
> 
> Does anything there strike you as odd?
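
To make the quoted mapping easier to follow, here is a minimal C sketch that
replays the two expressions above on plain integers. The BLK_* constants are
stand-ins invented for this sketch (their numeric values are assumptions, not
QEMU's definitions); only the NBD_STATE_* bit positions come from the spec
text quoted earlier.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the block-status bits used in the quoted code. */
enum { BLK_DATA = 0x01, BLK_ZERO = 0x02, BLK_ALLOCATED = 0x10 };

/* Bit positions as given in the quoted NBD spec text. */
enum { NBD_STATE_HOLE = 1 << 0, NBD_STATE_ZERO = 1 << 1 };

/* Server side: same expression as the quoted 3/9 hunk. */
static uint32_t server_flags(int status)
{
    return ((status & BLK_ALLOCATED) ? 0 : NBD_STATE_HOLE) |
           ((status & BLK_ZERO) ? NBD_STATE_ZERO : 0);
}

/* Client side: same expression as the quoted 6/9 hunk. */
static int client_status(uint32_t flags)
{
    return ((flags & NBD_STATE_HOLE) ? 0 : BLK_DATA) |
           ((flags & NBD_STATE_ZERO) ? BLK_ZERO : 0);
}
```

Feeding a status of ALLOCATED | ZERO through both helpers yields DATA | ZERO
on the client, while a status with neither ALLOCATED nor ZERO set comes back
as 0.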

Two things I noticed while reading the above:

1. NBD doesn't consider backing files, so the definition of holes
   becomes ambiguous. Is a hole any block that isn't allocated in the
   top layer (may cause fragmentation or encounter an ENOSPC error) or
   is it any block that isn't allocated anywhere in the whole backing
   chain (may read as non-zero)?

   Considering that there is a separate NBD_STATE_ZERO and nothing
   forbids a state of NBD_STATE_HOLE without NBD_STATE_ZERO, maybe the
   former is more useful. The code you quote implements the latter.

   Maybe if we go with the former, we should add a note to the NBD spec
   that explicitly says that NBD_STATE_HOLE doesn't imply any specific
   content that is returned on reads.

2. Using BDRV_BLOCK_ALLOCATED to determine NBD_STATE_HOLE seems wrong. A
   (not prealloca

Re: [Qemu-devel] [PATCH] hw/acpi-build: build SRAT memory affinity structures for NVDIMM

2018-02-26 Thread Igor Mammedov
On Wed, 21 Feb 2018 06:51:11 -0800
Dan Williams  wrote:

> On Wed, Feb 21, 2018 at 5:55 AM, Igor Mammedov  wrote:
> > On Tue, 20 Feb 2018 17:17:58 -0800
> > Dan Williams  wrote:
> >  
> >> On Tue, Feb 20, 2018 at 6:10 AM, Igor Mammedov  
> >> wrote:  
> >> > On Sat, 17 Feb 2018 14:31:35 +0800
> >> > Haozhong Zhang  wrote:
> >> >  
> >> >> ACPI 6.2A Table 5-129 "SPA Range Structure" requires the proximity
> >> >> domain of a NVDIMM SPA range must match with corresponding entry in
> >> >> SRAT table.
> >> >>
> >> >> The address ranges of vNVDIMM in QEMU are allocated from the
> >> >> hot-pluggable address space, which is entirely covered by one SRAT
> >> >> memory affinity structure. However, users can set the vNVDIMM
> >> >> proximity domain in NFIT SPA range structure by the 'node' property of
> >> >> '-device nvdimm' to a value different than the one in the above SRAT
> >> >> memory affinity structure.
> >> >>
> >> >> In order to solve such proximity domain mismatch, this patch build one
> >> >> SRAT memory affinity structure for each NVDIMM device with the
> >> >> proximity domain used in NFIT. The remaining hot-pluggable address
> >> >> space is covered by one or multiple SRAT memory affinity structures
> >> >> with the proximity domain of the last node as before.
> >> >>
> >> >> Signed-off-by: Haozhong Zhang   
> >> > If we consider hotpluggable system, correctly implemented OS should
> >> > be able pull proximity from Device::_PXM and override any value from 
> >> > SRAT.
> >> > Do we really have a problem here (anything that breaks if we would use 
> >> > _PXM)?
> >> > Maybe we should add _PXM object to nvdimm device nodes instead of 
> >> > massaging SRAT?  
> >>
> >> Unfortunately _PXM is an awkward fit. Currently the proximity domain
> >> is attached to the SPA range structure. The SPA range may be
> >> associated with multiple DIMM devices and those individual NVDIMMs may
> >> have conflicting _PXM properties.  
> > There shouldn't be any conflict here as  NVDIMM device's _PXM method,
> > should override in runtime any proximity specified by parent scope.
> > (as parent scope I'd also count boot time NFIT/SRAT tables).
> >
> > To make it more clear we could clear valid proximity domain flag in SPA
> > like this:
> >
> > diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
> > index 59d6e42..131bca5 100644
> > --- a/hw/acpi/nvdimm.c
> > +++ b/hw/acpi/nvdimm.c
> > @@ -260,9 +260,7 @@ nvdimm_build_structure_spa(GArray *structures, 
> > DeviceState *dev)
> >   */
> >  nfit_spa->flags = cpu_to_le16(1 /* Control region is strictly for
> > management during hot add/online
> > -   operation */ |
> > -  2 /* Data in Proximity Domain field is
> > -   valid*/);
> > +   operation */);
> >
> >  /* NUMA node. */
> >  nfit_spa->proximity_domain = cpu_to_le32(node);
> >  
> >> Even if that was unified across
> >> DIMMs it is ambiguous whether a DIMM-device _PXM would relate to the
> >> device's control interface, or the assembled persistent memory SPA
> >> range.  
> > I'm not sure what you mean under 'device's control interface',
> > could you clarify where the ambiguity comes from?  
> 
> There are multiple SPA range types. In addition to the typical
> Persistent Memory SPA range there are also Control Region SPA ranges
> for MMIO registers on the DIMM for Block Apertures and other purposes.
> 
> >
> > I read spec as: _PXM applies to address range covered by NVDIMM
> > device it belongs to.  
> 
> No, an NVDIMM may contribute to multiple SPA ranges and those ranges
> may span sockets.
Isn't an NVDIMM device plugged into a single socket, which belongs to
a single NUMA node?
If so, shouldn't the SPAs referencing it also have the same
proximity domain?


> > As for assembled SPA, I'd assume that it applies to interleaved set
> > and all NVDIMMs with it should be on the same node. It's somewhat
> > irrelevant question though as QEMU so far implements only
> >   1:1:1/SPA:Region Mapping:NVDIMM Device/
> > mapping.
> >
> > My main concern with using static configuration tables for proximity
> > mapping, we'd miss on hotplug side of equation. However if we start
> > from dynamic side first, we could later complement it with static
> > tables if there really were need for it.  
> 
> Especially when you consider the new HMAT table that wants to have
> proximity domains for describing performance characteristics of an
> address range relative to an initiator, the _PXM method on an
> individual NVDIMM device is a poor fit for describing a wider set.
> 




[Qemu-devel] [PATCH v3] iotests: Test abnormally large size in compressed cluster descriptor

2018-02-26 Thread Alberto Garcia
L2 entries for compressed clusters have a field that indicates the
number of sectors used to store the data in the image.

That's however not the size of the compressed data itself, just the
number of sectors where that data is located. The actual data size is
usually not a multiple of the sector size, and therefore cannot be
represented with this field.

The way it works is that QEMU reads all the specified sectors and
starts decompressing the data until there's enough to recover the
original uncompressed cluster. If there are any bytes left that
haven't been decompressed they are simply ignored.

One consequence of this is that even if the size field is larger than
it needs to be QEMU can handle it just fine: it will read more data
from disk but it will ignore the extra bytes.

This test creates an image with two compressed clusters that use 5
sectors (2.5 KB) each, increases the size field to the maximum (8192
sectors, or 4 MB) and verifies that the data can be read without
problems.

This test is important because while the decompressed data takes
exactly one cluster, the maximum value allowed in the compressed size
field is twice the cluster size. So although QEMU won't produce images
with such large values we need to make sure that it can handle them.
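
As a rough illustration of the size field discussed above, the following C
sketch decodes the sector count from an L2 entry. It assumes the layout given
in the qcow2 specification, where the field is cluster_bits - 8 bits wide and
ends at bit 61, and where the stored value counts additional sectors (hence
the + 1). The helper names and the host offset used with them are made up for
this sketch; this is not QEMU's code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of 512-byte sectors claimed by a compressed cluster descriptor.
 * Assumed layout: the field occupies bits x..61 with
 * x = 62 - (cluster_bits - 8), and stores "sectors - 1".
 */
static uint64_t compressed_sectors(uint64_t l2_entry, unsigned cluster_bits)
{
    assert(cluster_bits >= 9 && cluster_bits <= 21);

    unsigned width = cluster_bits - 8;           /* field width in bits */
    unsigned shift = 62 - width;                 /* lowest bit of the field */
    uint64_t mask = (UINT64_C(1) << width) - 1;

    return ((l2_entry >> shift) & mask) + 1;
}

/* Hypothetical helper: build an entry from its first two bytes and an offset. */
static uint64_t entry_from_prefix(uint16_t prefix, uint64_t host_offset)
{
    return ((uint64_t)prefix << 48) | host_offset;
}
```

With 2 MB clusters (cluster_bits = 21), the two-byte prefixes used by the test
decode as follows: 0x4008 gives 5 sectors, 0x4006 gives 4 sectors, and 0x7ffe
gives 8192 sectors.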

Another effect of increasing the size field is that it can make
it include data from the following host cluster(s). In this case
'qemu-img check' will detect that the refcounts are not correct, and
we'll need to rebuild them.

Additionally, this patch also tests that decreasing the size corrupts
the image since the original data can no longer be recovered. In this
case QEMU returns an error when trying to read the compressed data,
but 'qemu-img check' doesn't see anything wrong if the refcounts are
consistent.

One possible task for the future is to make 'qemu-img check' verify
the sizes of the compressed clusters, by trying to decompress the data
and checking that the size stored in the L2 entry is correct.

Signed-off-by: Alberto Garcia 
---

v3: Add TODO comment, as suggested by Eric.

Corrupt the length of the second compressed cluster as well so the
uncompressed data would span three host clusters.

v2: We now have two scenarios where we make QEMU read data from the
next host cluster and from beyond the end of the image. This
version also runs qemu-img check on the corrupted image.

If the size field is too small, reading fails but qemu-img check
succeeds.

If the size field is too large, reading succeeds but qemu-img
check fails (this can be repaired, though).

---
 tests/qemu-iotests/122 | 45 +
 tests/qemu-iotests/122.out | 31 +++
 2 files changed, 76 insertions(+)

diff --git a/tests/qemu-iotests/122 b/tests/qemu-iotests/122
index 45b359c2ba..5b9593016c 100755
--- a/tests/qemu-iotests/122
+++ b/tests/qemu-iotests/122
@@ -130,6 +130,51 @@ $QEMU_IO -c "read -P 01024k 1022k" "$TEST_IMG" 2>&1 | 
_filter_qemu_io | _fil
 
 
 echo
+echo "=== Corrupted size field in compressed cluster descriptor ==="
+echo
+# Create an empty image, fill half of it with data and compress it.
+# The L2 entries of the two compressed clusters are located at
+# 0x800000 and 0x800008, their original values are 0x4008000000a00000
+# and 0x4008000000a00802 (5 sectors for compressed data each).
+TEST_IMG="$TEST_IMG".1 _make_test_img 8M
+$QEMU_IO -c "write -P 0x11 0 4M" "$TEST_IMG".1 2>&1 | _filter_qemu_io | _filter_testdir
+$QEMU_IMG convert -c -O qcow2 -o cluster_size=2M "$TEST_IMG".1 "$TEST_IMG"
+
+# Reduce size of compressed data to 4 sectors: this corrupts the image.
+poke_file "$TEST_IMG" $((0x800000)) "\x40\x06"
+$QEMU_IO -c "read  -P 0x11 0 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
+
+# 'qemu-img check' however doesn't see anything wrong because it
+# doesn't try to decompress the data and the refcounts are consistent.
+# TODO: update qemu-img so this can be detected
+_check_test_img
+
+# Increase size of compressed data to the maximum (8192 sectors).
+# This makes QEMU read more data (8192 sectors instead of 5, host
+# addresses [0xa00000, 0xdfffff]), but the decompression algorithm
+# stops once we have enough to restore the uncompressed cluster, so
+# the rest of the data is ignored.
+poke_file "$TEST_IMG" $((0x800000)) "\x7f\xfe"
+# Do it also for the second compressed cluster (L2 entry at 0x800008).
+# In this case the compressed data would span 3 host clusters
+# (host addresses: [0xa00802, 0xe00801])
+poke_file "$TEST_IMG" $((0x800008)) "\x7f\xfe"
+
+# Here the image is too small so we're asking QEMU to read beyond the
+# end of the image.
+$QEMU_IO -c "read  -P 0x11  0 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
+# But if we grow the image we won't be reading beyond its end anymore.
+$QEMU_IO -c "write -P 0x22 4M 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
+$QEMU_IO -c "read  -P 0x11  0 4M" "$TEST_IMG" 2>&1 |

Re: [Qemu-devel] [RFC] exec: eliminate ram naming issue as migration

2018-02-26 Thread Paolo Bonzini
On 26/02/2018 13:55, Igor Mammedov wrote:
>>> So how about just adding a new option --mem-share to decide if that's a
>>> private memory or shared memory? That seems much straightforward way
> Above options are legacy (which we can't remove for compat reasons),
> their replacement is 'memory-backend-file' backend which has all of
> the above including 'share' property.

More precisely, we have added "-object memory-backend-file" to avoid
proliferation of options related to memory.  Besides unifying the cases
of 1 and >1 NUMA node, using -object also has the advantage of
supporting memory hotplug.

You wrote "I find adding a backend for nonnuma pc RAM is roundabout way"
but basically the command line says "this VM has only one NUMA node,
backed by this memory object" which is a precise description of what the
VM memory looks like.

> So just add 'memdev' property to machine and reuse memory-backend-file
> with it instead of duplicating functionality in the legacy code.

That would however also have a different RAMBlock id, effectively
producing the same output as "-numa node,memdev=...".

I think this should be solved at the libvirt level.  Libvirt should
write in the migration XML cookie whether the VM is using -object or
-mem-path to declare its memory, and newly-started VMs should always use
-object.  This won't fix the problem for VMs that are already running,
but it will fix it the next time they are started.

Paolo



Re: [Qemu-devel] [PATCH] hw/acpi-build: build SRAT memory affinity structures for NVDIMM

2018-02-26 Thread Haozhong Zhang
On 02/26/18 14:59 +0100, Igor Mammedov wrote:
> On Thu, 22 Feb 2018 09:40:00 +0800
> Haozhong Zhang  wrote:
> 
> > On 02/21/18 14:55 +0100, Igor Mammedov wrote:
> > > On Tue, 20 Feb 2018 17:17:58 -0800
> > > Dan Williams  wrote:
> > >   
> > > > On Tue, Feb 20, 2018 at 6:10 AM, Igor Mammedov  
> > > > wrote:  
> > > > > On Sat, 17 Feb 2018 14:31:35 +0800
> > > > > Haozhong Zhang  wrote:
> > > > >
> > > > >> ACPI 6.2A Table 5-129 "SPA Range Structure" requires the proximity
> > > > >> domain of a NVDIMM SPA range must match with corresponding entry in
> > > > >> SRAT table.
> > > > >>
> > > > >> The address ranges of vNVDIMM in QEMU are allocated from the
> > > > >> hot-pluggable address space, which is entirely covered by one SRAT
> > > > >> memory affinity structure. However, users can set the vNVDIMM
> > > > >> proximity domain in NFIT SPA range structure by the 'node' property 
> > > > >> of
> > > > >> '-device nvdimm' to a value different than the one in the above SRAT
> > > > >> memory affinity structure.
> > > > >>
> > > > >> In order to solve such proximity domain mismatch, this patch build 
> > > > >> one
> > > > >> SRAT memory affinity structure for each NVDIMM device with the
> > > > >> proximity domain used in NFIT. The remaining hot-pluggable address
> > > > >> space is covered by one or multiple SRAT memory affinity structures
> > > > >> with the proximity domain of the last node as before.
> > > > >>
> > > > >> Signed-off-by: Haozhong Zhang 
> > > > > If we consider hotpluggable system, correctly implemented OS should
> > > > > be able pull proximity from Device::_PXM and override any value from 
> > > > > SRAT.
> > > > > Do we really have a problem here (anything that breaks if we would 
> > > > > use _PXM)?
> > > > > Maybe we should add _PXM object to nvdimm device nodes instead of 
> > > > > massaging SRAT?
> > > > 
> > > > Unfortunately _PXM is an awkward fit. Currently the proximity domain
> > > > is attached to the SPA range structure. The SPA range may be
> > > > associated with multiple DIMM devices and those individual NVDIMMs may
> > > > have conflicting _PXM properties.  
> > > There shouldn't be any conflict here as  NVDIMM device's _PXM method,
> > > should override in runtime any proximity specified by parent scope.
> > > (as parent scope I'd also count boot time NFIT/SRAT tables).
> > > 
> > > To make it more clear we could clear valid proximity domain flag in SPA
> > > like this:
> > > 
> > > diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
> > > index 59d6e42..131bca5 100644
> > > --- a/hw/acpi/nvdimm.c
> > > +++ b/hw/acpi/nvdimm.c
> > > @@ -260,9 +260,7 @@ nvdimm_build_structure_spa(GArray *structures, 
> > > DeviceState *dev)
> > >   */
> > >  nfit_spa->flags = cpu_to_le16(1 /* Control region is strictly for
> > > management during hot add/online
> > > -   operation */ |
> > > -  2 /* Data in Proximity Domain field is
> > > -   valid*/);
> > > +   operation */);
> > >  
> > >  /* NUMA node. */
> > >  nfit_spa->proximity_domain = cpu_to_le32(node);
> > >   
> > > > Even if that was unified across
> > > > DIMMs it is ambiguous whether a DIMM-device _PXM would relate to the
> > > > device's control interface, or the assembled persistent memory SPA
> > > > range.  
> > > I'm not sure what you mean under 'device's control interface',
> > > could you clarify where the ambiguity comes from?
> > > 
> > > I read spec as: _PXM applies to address range covered by NVDIMM
> > > device it belongs to.
> > > 
> > > As for assembled SPA, I'd assume that it applies to interleaved set
> > > and all NVDIMMs with it should be on the same node. It's somewhat
> > > irrelevant question though as QEMU so far implements only
> > >   1:1:1/SPA:Region Mapping:NVDIMM Device/
> > > mapping.
> > > 
> > > My main concern with using static configuration tables for proximity
> > > mapping, we'd miss on hotplug side of equation. However if we start
> > > from dynamic side first, we could later complement it with static
> > > tables if there really were need for it.  
> > 
> > This patch affects only the static tables and static-plugged NVDIMM.
> > For hot-plugged NVDIMMs, guest OSPM still needs to evaluate _FIT to
> > get the information of the new NVDIMMs including their proximity
> > domains.
> > 
> > One intention of this patch is to simulate the bare metal as much as
> > possible. I have been using this patch to develop and test NVDIMM
> > enabling work on Xen, and think it might be useful for developers of
> > other OS and hypervisors.
> It's simpler on bare metal as systems usually statically partitioned
> according to capacity slots are able to handle.
> 
> The patch is technically correct and might be useful,
> especially in current case case flag /* Data in Proximity Domain field is 
> valid*/
>

Re: [Qemu-devel] [PATCH v3] iotests: Test abnormally large size in compressed cluster descriptor

2018-02-26 Thread Eric Blake

On 02/26/2018 08:36 AM, Alberto Garcia wrote:

L2 entries for compressed clusters have a field that indicates the
number of sectors used to store the data in the image.





One consequence of this is that even if the size field is larger than
it needs to be QEMU can handle it just fine: it will read more data
from disk but it will ignore the extra bytes.


Modulo any ref count checks, of course ;)




Signed-off-by: Alberto Garcia 
---

v3: Add TODO comment, as suggested by Eric.

 Corrupt the length of the second compressed cluster as well so the
 uncompressed data would span three host clusters.


Rather, it is the 'claimed' size of the compressed data that spans three 
host clusters.  I don't know if our refcount repair code is geared for 
that (it IS prepared for a compressed data cluster that spans two host 
clusters, but spanning three is only possible for externally-produced 
images, as in this test).
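
For illustration only, the number of host clusters covered by a claimed byte
range can be computed as below. This is a sketch in C with a hypothetical
helper name; it takes the inclusive host address range quoted in the test
comments as input and is not taken from the refcount code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count the host clusters touched by the inclusive byte range
 * [first_byte, last_byte], for clusters of size 1 << cluster_bits.
 */
static uint64_t host_clusters_touched(uint64_t first_byte, uint64_t last_byte,
                                      unsigned cluster_bits)
{
    assert(last_byte >= first_byte);
    return (last_byte >> cluster_bits) - (first_byte >> cluster_bits) + 1;
}
```

With 2 MB clusters (cluster_bits = 21), the range [0xa00802, 0xe00801] from
the test touches three host clusters, whereas a 4 MB range that starts on a
cluster boundary touches only two.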




  echo
+echo "=== Corrupted size field in compressed cluster descriptor ==="
+echo
+# Create an empty image, fill half of it with data and compress it.
+# The L2 entries of the two compressed clusters are located at
+# 0x800000 and 0x800008, their original values are 0x4008000000a00000
+# and 0x4008000000a00802 (5 sectors for compressed data each).
+TEST_IMG="$TEST_IMG".1 _make_test_img 8M
+$QEMU_IO -c "write -P 0x11 0 4M" "$TEST_IMG".1 2>&1 | _filter_qemu_io | _filter_testdir
+$QEMU_IMG convert -c -O qcow2 -o cluster_size=2M "$TEST_IMG".1 "$TEST_IMG"


Is it worth a $QEMU_IO -c "read -v 0x800000 16" so that our .out file 
validates that we indeed see the values we expect (if some other qcow2 
change causes us to stick the L2 table at a different offset, the 
verbose read will make it a bit more obvious why this test starts 
failing).  But that's an extra layer of paranoia, I'm fairly confident 
this test would start failing even without that extra read, so it's not 
a reason for a respin.



+
+# Reduce size of compressed data to 4 sectors: this corrupts the image.
+poke_file "$TEST_IMG" $((0x80)) "\x40\x06"
+$QEMU_IO -c "read  -P 0x11 0 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | 
_filter_testdir
+
+# 'qemu-img check' however doesn't see anything wrong because it
+# doesn't try to decompress the data and the refcounts are consistent.
+# TODO: update qemu-img so this can be detected
+_check_test_img
+
+# Increase size of compressed data to the maximum (8192 sectors).
+# This makes QEMU read more data (8192 sectors instead of 5, host
+# addresses [0xa0, 0xdf]), but the decompression algorithm
+# stops once we have enough to restore the uncompressed cluster, so
+# the rest of the data is ignored.
+poke_file "$TEST_IMG" $((0x80)) "\x7f\xfe"
+# Do it also for the second compressed cluster (L2 entry at 0x88).
+# In this case the compressed data would span 3 host clusters
+# (host addresses: [0xa00802, 0xe00801])
+poke_file "$TEST_IMG" $((0x88)) "\x7f\xfe"
+
+# Here the image is too small so we're asking QEMU to read beyond the
+# end of the image.
+$QEMU_IO -c "read  -P 0x11  0 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
+# But if we grow the image we won't be reading beyond its end anymore.
+$QEMU_IO -c "write -P 0x22 4M 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
+$QEMU_IO -c "read  -P 0x11  0 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
+
+# The refcount data is however wrong because due to the increased size
+# of the compressed data it now reaches the following host clusters.
+# This can be repaired by qemu-img check.
+_check_test_img -r all
+$QEMU_IO -c "read  -P 0x11  0 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
+$QEMU_IO -c "read  -P 0x22 4M 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
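
For readers checking the poke_file values in the hunk above, here is a
minimal Python sketch (not part of the patch) that decodes the sector
count from the two most significant bytes of a compressed L2 entry.  It
assumes the layout from the qcow2 spec, where x = 62 - (cluster_bits - 8)
and bits x..61 hold the number of additional 512-byte sectors; the
function names are made up for this illustration.

```python
def decode_compressed_l2(entry, cluster_bits):
    """Split a compressed-cluster L2 entry into (host_offset, sectors)."""
    if not entry & (1 << 62):
        raise ValueError("bit 62 is not set: not a compressed cluster")
    # Bits 0..shift-1 hold the host offset, bits shift..61 the number of
    # additional 512-byte sectors (so the total is that value plus one).
    shift = 62 - (cluster_bits - 8)
    host_offset = entry & ((1 << shift) - 1)
    extra_sectors = (entry >> shift) & ((1 << (cluster_bits - 8)) - 1)
    return host_offset, extra_sectors + 1


def sectors_from_top_bytes(top, cluster_bits=21):
    """Sector count encoded by the two most significant bytes of an entry.

    cluster_bits=21 matches the cluster_size=2M used by the test.
    """
    return decode_compressed_l2(top << 48, cluster_bits)[1]
```

Under that assumption, 0x4008 decodes to 5 sectors, 0x4006 to 4 sectors
and 0x7ffe to 8192 sectors, which agrees with the comments in the test.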


Looks good.  Thanks for adding this in v3.

Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-devel] [PATCH 1/1] serial: Open non-block

2018-02-26 Thread Dr. David Alan Gilbert
* Paolo Bonzini (pbonz...@redhat.com) wrote:
> On 26/02/2018 14:04, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" 
> > 
> > On a real serial device, the open can block if the handshake
> > lines are in a particular state.  If a QEMU is passing the serial
> > device to the guest, the QEMU startup is blocked opening the device
> > (with a symptom seen as a timeout from libvirt).
> > 
> > Open the serial port with O_NONBLOCK.
> > 
> > Signed-off-by: Dr. David Alan Gilbert 
> 
> Socket chardevs have "nowait" for that.  Should serial have something
> similar?

Hmm, maybe, although I think for real serial the nonblocking open should
be the default.
I've not got any more complex tests though for it.
stty -F uses the same trick of opening non-blocking.
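
As an illustration only (the patch itself is the C one-liner quoted
below), the same idea can be sketched in Python: pass O_NONBLOCK at
open time so the open does not wait on the handshake lines, and
optionally clear the flag afterwards if blocking I/O is wanted.  The
helper names are hypothetical and any device path such as /dev/ttyS0
would be a placeholder.

```python
import fcntl
import os


def open_nonblocking(path, flags=os.O_RDWR):
    """Open path without letting the open() itself block."""
    # O_NOCTTY keeps a tty from becoming our controlling terminal.
    return os.open(path, flags | os.O_NONBLOCK | os.O_NOCTTY)


def clear_nonblocking(fd):
    """Switch an already opened descriptor back to blocking mode."""
    current = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, current & ~os.O_NONBLOCK)
```

Whether the flag should stay set after the open is a separate design
question from avoiding the blocking open.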

Dave

> Thanks,
> 
> Paolo
> 
> > ---
> >  chardev/char-serial.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/chardev/char-serial.c b/chardev/char-serial.c
> > index feb52e559d..97be5d4a63 100644
> > --- a/chardev/char-serial.c
> > +++ b/chardev/char-serial.c
> > @@ -265,7 +265,8 @@ static void qmp_chardev_open_serial(Chardev *chr,
> >  ChardevHostdev *serial = backend->u.serial.data;
> >  int fd;
> >  
> > -fd = qmp_chardev_open_file_source(serial->device, O_RDWR, errp);
> > +fd = qmp_chardev_open_file_source(serial->device, O_RDWR | O_NONBLOCK,
> > +  errp);
> >  if (fd < 0) {
> >  return;
> >  }
> > 
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v3] iotests: Test abnormally large size in compressed cluster descriptor

2018-02-26 Thread Alberto Garcia
On Mon 26 Feb 2018 04:12:22 PM CET, Eric Blake wrote:
>> +# Create an empty image, fill half of it with data and compress it.
>> +# The L2 entries of the two compressed clusters are located at
>> +# 0x80 and 0x88, their original values are 0x400800a0
>> +# and 0x400800a00802 (5 sectors for compressed data each).
>> +TEST_IMG="$TEST_IMG".1 _make_test_img 8M
>> +$QEMU_IO -c "write -P 0x11 0 4M" "$TEST_IMG".1 2>&1 | _filter_qemu_io | _filter_testdir
>> +$QEMU_IMG convert -c -O qcow2 -o cluster_size=2M "$TEST_IMG".1 "$TEST_IMG"
>
> Is it worth a $QEMU_IO -c "read -v 0x80 16" so that our .out file
> validates that we indeed see the values we expect (if some other qcow2
> change causes us to stick the L2 table at a different offset, the
> verbose read will make it a bit more obvious why this test starts
> failing).

I don't think it's necessary, the success of the rest of the tests after
this one requires that we modify these exact L2 entries. If we're
touching something else then one of them will fail.

Berto



Re: [Qemu-devel] [PATCH v6 03/23] RISC-V CPU Core Definition

2018-02-26 Thread Igor Mammedov
On Fri, 23 Feb 2018 13:11:49 +1300
Michael Clark  wrote:

> Add CPU state header, CPU definitions and initialization routines
> 
> Reviewed-by: Richard Henderson 
> Signed-off-by: Michael Clark 
> ---
>  target/riscv/cpu.c  | 391 +
>  target/riscv/cpu.h  | 256 +
>  target/riscv/cpu_bits.h | 416 
> 
>  3 files changed, 1063 insertions(+)
>  create mode 100644 target/riscv/cpu.c
>  create mode 100644 target/riscv/cpu.h
>  create mode 100644 target/riscv/cpu_bits.h
> 
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
[...]

> +
> +static const RISCVCPUInfo riscv_cpus[] = {
> +{ TYPE_RISCV_CPU_ANY,riscv_any_cpu_init },
> +{ TYPE_RISCV_CPU_IMAFDCSU_PRIV_1_09, riscv_imafdcsu_priv1_9_cpu_init },
> +{ TYPE_RISCV_CPU_IMAFDCSU_PRIV_1_10, riscv_imafdcsu_priv1_10_cpu_init },
> +{ TYPE_RISCV_CPU_IMACU_PRIV_1_10,riscv_imacu_priv1_10_cpu_init },
> +{ TYPE_RISCV_CPU_IMAC_PRIV_1_10, riscv_imac_priv1_10_cpu_init },
> +{ NULL, NULL }
> +};
[...]

> +static void cpu_register(const RISCVCPUInfo *info)
> +{
> +TypeInfo type_info = {
> +.name = info->name,
> +.parent = TYPE_RISCV_CPU,
> +.instance_size = sizeof(RISCVCPU),
> +.instance_init = info->initfn,
> +};
> +
> +type_register(&type_info);
> +}
[...]

> +
> +void riscv_cpu_list(FILE *f, fprintf_function cpu_fprintf)
> +{
> +const RISCVCPUInfo *info = riscv_cpus;
> +
> +while (info->name) {
> +(*cpu_fprintf)(f, "%s\n", info->name);
> +info++;
> +}
> +}
The majority of targets use object_class_get_list() to get
the list of CPU types; you can use cris_cpu_list() as an example.


> +static void riscv_cpu_register_types(void)
> +{
> +const RISCVCPUInfo *info = riscv_cpus;
> +
> +type_register_static(&riscv_cpu_type_info);
> +
> +while (info->name) {
> +cpu_register(info);
> +info++;
> +}
> +}
> +
> +type_init(riscv_cpu_register_types)
[...]

This still hasn't addressed a comment from
 "[PATCH v4 03/22] RISC-V CPU Core Definition"
and still uses the old approach with the RISCVCPUInfo helper structure.

Please use commit 974e58d2 as a model.



[Qemu-devel] [PATCH v3] slirp: Add domainname option to slirp's DHCP server

2018-02-26 Thread Benjamin Drung
This patch will allow the user to include the domainname option in
replies from the built-in DHCP server.

Signed-off-by: Benjamin Drung 
---
 net/slirp.c  |  7 ---
 qapi/net.json|  4 
 qemu-options.hx  |  7 +--
 slirp/bootp.c|  8 
 slirp/libslirp.h |  2 +-
 slirp/slirp.c| 10 +-
 slirp/slirp.h|  1 +
 7 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/net/slirp.c b/net/slirp.c
index 8991816bbf..9511ff3bb7 100644
--- a/net/slirp.c
+++ b/net/slirp.c
@@ -157,7 +157,8 @@ static int net_slirp_init(NetClientState *peer, const char *model,
   const char *bootfile, const char *vdhcp_start,
   const char *vnameserver, const char *vnameserver6,
   const char *smb_export, const char *vsmbserver,
-  const char **dnssearch, Error **errp)
+  const char **dnssearch, const char *vdomainname,
+  Error **errp)
 {
 /* default settings according to historic slirp */
 struct in_addr net  = { .s_addr = htonl(0x0a000200) }; /* 10.0.2.0 */
@@ -371,7 +372,7 @@ static int net_slirp_init(NetClientState *peer, const char *model,
 s->slirp = slirp_init(restricted, ipv4, net, mask, host,
   ipv6, ip6_prefix, vprefix6_len, ip6_host,
   vhostname, tftp_export, bootfile, dhcp,
-  dns, ip6_dns, dnssearch, s);
+  dns, ip6_dns, dnssearch, vdomainname, s);
 QTAILQ_INSERT_TAIL(&slirp_stacks, s, entry);
 
 for (config = slirp_configs; config; config = config->next) {
@@ -958,7 +959,7 @@ int net_init_slirp(const Netdev *netdev, const char *name,
  user->ipv6_host, user->hostname, user->tftp,
  user->bootfile, user->dhcpstart,
  user->dns, user->ipv6_dns, user->smb,
- user->smbserver, dnssearch, errp);
+ user->smbserver, dnssearch, user->domainname, errp);
 
 while (slirp_configs) {
 config = slirp_configs;
diff --git a/qapi/net.json b/qapi/net.json
index 1238ba5de1..9dfd34cafa 100644
--- a/qapi/net.json
+++ b/qapi/net.json
@@ -160,6 +160,9 @@
 # @dnssearch: list of DNS suffixes to search, passed as DHCP option
 # to the guest
 #
+# @domainname: guest-visible domain name of the virtual nameserver
+#  (since 2.12)
+#
 # @ipv6-prefix: IPv6 network prefix (default is fec0::) (since
 #   2.6). The network prefix is given in the usual
 #   hexadecimal IPv6 address notation.
@@ -197,6 +200,7 @@
 '*dhcpstart': 'str',
 '*dns':   'str',
 '*dnssearch': ['String'],
+'*domainname': 'str',
 '*ipv6-prefix':  'str',
 '*ipv6-prefixlen':   'int',
 '*ipv6-host':'str',
diff --git a/qemu-options.hx b/qemu-options.hx
index 8ccd5dcaa6..c000ef454e 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1906,8 +1906,8 @@ DEF("netdev", HAS_ARG, QEMU_OPTION_netdev,
 "-netdev user,id=str[,ipv4[=on|off]][,net=addr[/mask]][,host=addr]\n"
 " [,ipv6[=on|off]][,ipv6-net=addr[/int]][,ipv6-host=addr]\n"
 " [,restrict=on|off][,hostname=host][,dhcpstart=addr]\n"
-" [,dns=addr][,ipv6-dns=addr][,dnssearch=domain][,tftp=dir]\n"
-" [,bootfile=f][,hostfwd=rule][,guestfwd=rule]"
+" [,dns=addr][,ipv6-dns=addr][,dnssearch=domain][,domainname=domain]\n"
+" [,tftp=dir][,bootfile=f][,hostfwd=rule][,guestfwd=rule]"
 #ifndef _WIN32
  "[,smb=dir[,smbserver=addr]]\n"
 #endif
@@ -2116,6 +2116,9 @@ Example:
 qemu -net user,dnssearch=mgmt.example.org,dnssearch=example.org [...]
 @end example
 
+@item domainname=@var{domain}
+Specifies the client domain name reported by the built-in DHCP server.
+
 @item tftp=@var{dir}
 When using the user mode network stack, activate a built-in TFTP
 server. The files in @var{dir} will be exposed as the root of a TFTP server.
diff --git a/slirp/bootp.c b/slirp/bootp.c
index 5dd1a415b5..9e7b53ba94 100644
--- a/slirp/bootp.c
+++ b/slirp/bootp.c
@@ -298,6 +298,14 @@ static void bootp_reply(Slirp *slirp, const struct bootp_t *bp)
 q += val;
 }
 
+if (slirp->vdomainname) {
+val = strlen(slirp->vdomainname);
+*q++ = RFC1533_DOMAINNAME;
+*q++ = val;
+memcpy(q, slirp->vdomainname, val);
+q += val;
+}
+
 if (slirp->vdnssearch) {
 size_t spaceleft = sizeof(rbp->bp_vend) - (q - rbp->bp_vend);
 val = slirp->vdnssearch_len;
diff --git a/slirp/libslirp.h b/slirp/libslirp.h
index 540b3e5903..740408a96e 100644
--- a/slirp/libslirp.h
+++ b/slirp/libslirp.h
@@ -16,7 +16,7 @@ Slirp *slirp_init(int restricted, bool in_enabled, struct in_addr vnetwork,
   const char *tftp_path, const char *
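
The bootp.c hunk above emits the option as tag, length, value.  A
minimal Python sketch of that encoding follows; it is not the slirp
code.  The explicit bounds check is an assumption added here (the
dnssearch branch computes its remaining space, the new branch as quoted
does not), and the helper name and capacity parameter are hypothetical.

```python
RFC1533_DOMAINNAME = 15  # DHCP "Domain Name" option code


def append_domainname(options, domainname, capacity):
    """Append a Domain Name option to a DHCP options buffer.

    options is a bytearray holding the options written so far and
    capacity is the total size of the vendor area (assumed parameter).
    Returns False, leaving options untouched, if the option cannot fit.
    """
    name = domainname.encode("ascii")
    needed = 2 + len(name)  # one byte for the tag, one for the length
    if len(name) > 255 or len(options) + needed > capacity:
        return False
    options.append(RFC1533_DOMAINNAME)
    options.append(len(name))
    options += name
    return True
```

For "example.org" this appends the bytes 15, 11 and then the eleven
characters of the name.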

[Qemu-devel] [Bug 1751264] Re: qemu-img convert issue in a tmpfs partition

2018-02-26 Thread Max Reitz
Hi,

This is a combination of (in our opinion) a bug in tmpfs (...and I think
maybe btrfs as well?), the fact that the vmdk block driver is not very
well optimized, and qemu-img convert assuming that the filesystem works
as it thinks it does or that at least the block driver can work around
this.

So what happens is that qemu-img convert tries to find out which data it
needs to copy.  For this, it queries which parts of the image are
allocated.  This involves querying both the format level (vmdk in this
case) and the protocol level (tmpfs in this case).

Now the vmdk block driver is not very well optimized, so it only allows
querying on cluster boundaries (64 kB by default, as far as I can tell).
qcow2 OTOH allows greater areas (I just created a 512 MB image and it
can query the whole image at once).

So the requests go down to the protocol level.  We expect that to
respond very quickly to an allocation request (the lseek() you are
seeing) -- but tmpfs (and I think btrfs, too) don't do that.  They take
a rather long time.

For an example, the attached program seeks through a file (in 64 kB steps) with 
SEEK_DATA/SEEK_HOLE.  This is what happens:
$ cd /tmp
$ gcc test.c -std=c11 -Wall -Wextra -pedantic -O3
$ qemu-img create -f raw -o preallocation=falloc empty 512M
$ qemu-img create -f raw -o preallocation=falloc ~/empty 512M
$ time ./a.out empty
./a.out empty  0,01s user 23,10s system 99% cpu 23,166 total
$ time ./a.out ~/empty
./a.out ~/empty  0,01s user 0,03s system 96% cpu 0,041 total

So there's a huge difference and that is (in my opinion) a bug in tmpfs.
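
For reference, the access pattern being timed can be sketched in Python
as below.  This is not the attached test.c, only an assumption of what
such a probe looks like: one lseek(SEEK_DATA) query per 64 kB step,
which is roughly what a cluster-granular format driver ends up asking
the protocol layer.  The function name is made up.

```python
import errno
import os

STEP = 64 * 1024  # 64 kB, the step size mentioned above


def probe_allocation(path, step=STEP):
    """Return (offset, is_data) for every step-aligned offset in path."""
    results = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        for offset in range(0, size, step):
            try:
                data_start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno != errno.ENXIO:
                    raise
                data_start = size  # no data at or after this offset
            # The offset is allocated if the next data starts right here.
            results.append((offset, data_start == offset))
    finally:
        os.close(fd)
    return results
```

A 512 MB file yields 8192 such queries, so any per-call cost in the
filesystem is multiplied accordingly.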

(When converting from qcow2 you don't notice this, because qcow2 allows
performing a single allocation request for the whole image, so it
doesn't matter much whether that's slow.)


There are three ways around this:
(1) tmpfs (and probably btrfs? -- although I can't reproduce it myself right 
now) should be fixed.  If they can't tell allocated areas quickly, they should 
just report the whole file as allocated.

(2) Our vmdk driver could be optimized.  Sure, but that wouldn't solve
the real issue and someone would have to do it first (and we don't have
a strong interest in this, because all format drivers but qcow2 and raw
are there mainly just for reading other formats and converting them to
qcow2).

(3a) qemu-img convert could poll for allocation information less
insistently.  One way would be to add a switch to disable this behavior
completely and force it to just read everything.  We already have -S 0
which could do this; but just reading all data and then doing zero
detection over it kind of defeats the purpose.  If read() + memcmp() is
faster than lseek(SEEK_DATA), then the FS is just doing something wrong.

(3b) Eric Blake has recently added support for a less insistent way to
query allocation status that should only go to the format layer (e.g.
vmdk) and ignore the protocol layer (e.g. tmpfs).  Maybe qemu-img
convert should use that.


But in any case, I claim the main issue is in tmpfs.

Max

** Attachment added: "test.c"
   https://bugs.launchpad.net/qemu/+bug/1751264/+attachment/5063575/+files/test.c

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1751264

Title:
  qemu-img convert issue in a tmpfs partition

Status in QEMU:
  New

Bug description:
  qemu-img convert command is slow when the file to convert is located
  in a tmpfs formatted partition.

  v2.1.0 on debian/jessie x64, ext4: 10m14s
  v2.1.0 on debian/jessie x64, tmpfs: 10m15s

  v2.1.0 on debian/stretch x64, ext4: 11m9s
  v2.1.0 on debian/stretch x64, tmpfs: 10m21.362s

  v2.8.0 on debian/jessie x64, ext4: 10m21s
  v2.8.0 on debian/jessie x64, tmpfs: Too long (50min+)

  v2.8.0 on debian/stretch x64, ext4: 10m42s
  v2.8.0 on debian/stretch x64, tmpfs: Too long (50min+)

  It seems that the issue is caused by this commit :
  https://github.com/qemu/qemu/commit/690c7301600162421b928c7f26fd488fd8fa464e

  In order to reproduce this bug :

  1/ mount a tmpfs partition : mount -t tmpfs tmpfs /tmp
  2/ get a vmdk file (we used a 15GB image) and put it on /tmp
  3/ run the 'qemu-img convert -O qcow2 /tmp/file.vmdk /path/to/destination' command

  When we trace the process, we can see that there's an lseek loop which
  is very slow (compared to outside a tmpfs partition).




Re: [Qemu-devel] [PATCH v3 2/4] qcow2: Document some maximum size constraints

2018-02-26 Thread Alberto Garcia
On Thu 22 Feb 2018 04:59:20 PM CET, Eric Blake wrote:
> While at it, notice that since we cannot map any virtual cluster to
> any address higher than 64 PB (56 bits) (due to the L1/L2 field
> encoding), it makes little sense to require the refcount table to
> access host offsets beyond that point.

But refcount blocks are not addressed by L2 tables, so in principle it
should be possible to have refcount blocks after the first 64PB.

But I agree that it's a good idea to set that as a maximum possible
physical size of the qcow2 image.

> @@ -341,7 +355,7 @@ Refcount table entry:
>
>  Bit  0 -  8:Reserved (set to 0)
>
> - 9 - 63:Bits 9-63 of the offset into the image file at which the
> + 9 - 55:Bits 9-55 of the offset into the image file at which the
>  refcount block starts. Must be aligned to a cluster
>  boundary.
>
> @@ -349,6 +363,8 @@ Refcount table entry:
>  been allocated. All refcounts managed by this refcount block
>  are 0.
>
> +56 - 63:Reserved (set to 0)

Are we not updating REFT_OFFSET_MASK as well?
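
To make the question concrete, here is a small Python sketch of what a
mask covering bits 9-55 would look like next to one covering bits 9-63.
It is only arithmetic on the bit ranges quoted above, not a proposed
definition for the QEMU macro, and the helper name is made up.

```python
def bit_range_mask(lo, hi):
    """Mask with bits lo..hi (inclusive) set."""
    return ((1 << (hi + 1)) - 1) & ~((1 << lo) - 1)


MASK_BITS_9_63 = bit_range_mask(9, 63)  # 0xfffffffffffffe00
MASK_BITS_9_55 = bit_range_mask(9, 55)  # 0x00fffffffffffe00
```

With the narrower mask, any offset at or above 2^56 would be masked
away instead of being treated as part of the refcount block address.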

Berto



Re: [Qemu-devel] [PATCH v3 3/4] qcow2: Don't allow overflow during cluster allocation

2018-02-26 Thread Alberto Garcia
On Thu 22 Feb 2018 04:59:21 PM CET, Eric Blake wrote:
> Our code was already checking that we did not attempt to
> allocate more clusters than what would fit in an INT64 (the
> physical maximum if we can access a full off_t's worth of
> data).  But this does not catch smaller limits enforced by
> various spots in the qcow2 image description: L1 and normal
> clusters of L2 are documented as having bits 63-56 reserved
> for other purposes, capping our maximum offset at 64PB (bit
> 55 is the maximum bit set).  And for compressed images with
> 2M clusters, the cap drops the maximum offset to bit 48, or
> a maximum offset of 512TB.  If we overflow that offset, we
> would write compressed data into one place, but try to
> decompress from another, which won't work.
>
> I don't have 512TB handy to prove whether things break if we
> compress so much data that we overflow that limit, and don't
> think that iotests can (quickly) test it either.  Test 138
> comes close (it corrupts an image into thinking something lives
> at 32PB, which is half the maximum for L1 sizing - although
> it relies on 512-byte clusters).  But that test points out
> that we will generally hit other limits first (such as running
> out of memory for the refcount table, or exceeding file system
> limits like 16TB on ext4, etc), so this is more a theoretical
> safety valve than something likely to be hit.
>
> Signed-off-by: Eric Blake 
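
The two caps in the commit message can be checked with a few lines of
Python.  This is a sketch of the arithmetic only, assuming the spec's
layout (56 usable offset bits for normal entries, and
62 - (cluster_bits - 8) offset bits for compressed ones); the function
name is hypothetical.

```python
TiB = 1 << 40
PiB = 1 << 50


def max_host_offset(cluster_bits, compressed=False):
    """Exclusive upper bound on host offsets an L2 entry can encode."""
    if compressed:
        offset_bits = 62 - (cluster_bits - 8)
    else:
        offset_bits = 56  # bits 56-63 are reserved
    return 1 << offset_bits
```

For 2M clusters (cluster_bits = 21) this gives 64 PiB for normal
clusters and 512 TiB for compressed ones, matching the figures above.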

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-devel] [PATCH v3 4/4] qcow2: Avoid memory over-allocation on compressed images

2018-02-26 Thread Alberto Garcia
On Thu 22 Feb 2018 08:02:44 PM CET, Eric Blake wrote:
> On 02/22/2018 10:23 AM, Alberto Garcia wrote:
>> On Thu 22 Feb 2018 04:59:22 PM CET, Eric Blake wrote:
>>>   sector_offset = coffset & 511;
>>>   csize = nb_csectors * 512 - sector_offset;
>> [...]
>>> +assert(csize < 2 * s->cluster_size);
>> 
>> I think it should be <=
>> 
>> If sector_offset is 0 and nb_csector is the maximum allowed value then
>> csize is exactly 2 * s->cluster_size bytes.
>
> Sigh, yes you're right.  I was thinking that "qemu sets csize to a 
> maximum of s->cluster_size, but only when sector_offset is not 0" - but 
> as long as we're dealing with externally-produced images, sector_offset 
> can be 0 at the same time as providing all 1s to the field.  So I did 
> indeed have an off-by-one.
>
> Perhaps the maintainer can fix it up, instead of me spinning a v4?

That would work for me, but note that this part of the commit message
also needs to be updated:

   In fact, the qcow2 spec permits an all-ones sector count, plus 511
   bytes from the sector containing the initial offset, for a maximum
   read of nearly 2 full clusters;
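
The boundary case can be spelled out in a short Python sketch of the
same arithmetic.  It assumes the sector-count field is
(cluster_bits - 8) bits wide, as in the qcow2 spec; the function name
is made up.

```python
def max_csize(cluster_bits, sector_offset):
    """Largest csize for a given offset within the first sector."""
    # An all-ones field means 2^(cluster_bits - 8) sectors in total.
    nb_csectors = 1 << (cluster_bits - 8)
    return nb_csectors * 512 - sector_offset
```

With 64k clusters (cluster_bits = 16) and sector_offset 0 the result is
131072, exactly 2 * cluster_size, so only "<=" holds in the assertion;
with sector_offset 511 it is 130561.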

With those two things corrected,

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-devel] [PATCH v4 0/3] s390x/sclp: 64 bit event masks

2018-02-26 Thread Cornelia Huck
On Fri, 23 Feb 2018 18:42:55 +0100
Claudio Imbrenda  wrote:

> Until 67915de9f0383ccf4a ("s390x/event-facility: variable-length event masks")
> we only supported 32bit sclp event masks, even though the architecture
> allows the guests to set up sclp event masks up to 1021 bytes in length.
> With that patch the behaviour was almost compliant, but some issues were
> still remaining, in particular regarding the handling of selective reads
> and migration.
> 
> This patchset fixes migration and the handling of selective reads, and
> puts in place the support for 64-bit sclp event masks internally.
> 
> A new property of the sclp-event device switches between the 32bit masks
> and the compliant behaviour. The property is bound to the machine
> version, so older machines keep the old broken behaviour, allowing for
> migration, but the default is the compliant implementation.
> 
> Fixes: 67915de9f0383ccf4a ("s390x/event-facility: variable-length event masks")
> 
> v3 -> v4
> * removed all pre_load hooks
> * split the internal representation of the receive mask into an array of
>   uint32_t and added accessors; the union would not work on little
>   endian hosts!

Oops. Did you test under a le host? How can I test this (I guess using
the current s390/features branch as guest)?
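
A quick way to see the endianness problem without an LE test machine
is to model both layouts in Python.  This is only a sketch: the
accessor pair mirrors the shift-based approach described in the
changelog (the store side is an assumption), and union_view emulates
what a C union of one 64-bit value and two 32-bit pieces would read
back on a host of the given byte order.

```python
import struct


def make_receive_mask(pieces):
    """Combine [high, low] 32-bit pieces into one 64-bit mask."""
    return (pieces[0] << 32) | pieces[1]


def store_receive_mask(val):
    """Split a 64-bit mask into [high, low] 32-bit pieces (assumed)."""
    return [(val >> 32) & 0xFFFFFFFF, val & 0xFFFFFFFF]


def union_view(pieces, little_endian):
    """64-bit value a union would alias onto the two pieces."""
    order = "<" if little_endian else ">"
    raw = struct.pack(order + "II", *pieces)
    return struct.unpack(order + "Q", raw)[0]
```

For pieces [1, 2] the accessors give 0x0000000100000002 on any host,
while the emulated union gives the same value only in the big-endian
case and 0x0000000200000001 in the little-endian one.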

> * fixed the pre-existing documentation comment for copy_mask
> 
> v2 -> v3
> * fixed some typos in the first patch description
> * updated an existing comment in the third patch: newer Linux versions
>   will support masks larger than 4 bytes.
> 
> v1 -> v2 
> * improved comments and patch descriptions to better explain why we need
>   this, including better description of the old broken behaviour
> * rename SCLPEVMSK to SCLP_EVMASK to improve readability
> * removed some unneeded variable initializations
> * fixed a pre-existing typo
> 
> Claudio Imbrenda (3):
>   s390x/sclp: proper support of larger send and receive masks
>   s390x/sclp: clean up sclp masks
>   s390x/sclp: extend SCLP event masks to 64 bits
> 
>  hw/char/sclpconsole-lm.c  |   4 +-
>  hw/char/sclpconsole.c |   4 +-
>  hw/s390x/event-facility.c | 153 
> ++
>  hw/s390x/s390-virtio-ccw.c|   8 +-
>  hw/s390x/sclpcpu.c|   4 +-
>  hw/s390x/sclpquiesce.c|   4 +-
>  include/hw/s390x/event-facility.h |  22 +++---
>  7 files changed, 148 insertions(+), 51 deletions(-)
> 




Re: [Qemu-devel] [PATCH v3 2/4] qcow2: Document some maximum size constraints

2018-02-26 Thread Eric Blake

On 02/26/2018 10:25 AM, Alberto Garcia wrote:
> On Thu 22 Feb 2018 04:59:20 PM CET, Eric Blake wrote:
>> While at it, notice that since we cannot map any virtual cluster to
>> any address higher than 64 PB (56 bits) (due to the L1/L2 field
>> encoding), it makes little sense to require the refcount table to
>> access host offsets beyond that point.
>
> But refcount blocks are not addressed by L2 tables, so in principle it
> should be possible to have refcount blocks after the first 64PB.

But (if we don't make this change) that's about all you can usefully
have (and it would be a self-referencing refcount block).

> But I agree that it's a good idea to set that as a maximum possible
> physical size of the qcow2 image.
>
>> @@ -341,7 +355,7 @@ Refcount table entry:
>>
>>   Bit  0 -  8:Reserved (set to 0)
>>
>> - 9 - 63:Bits 9-63 of the offset into the image file at which the
>> + 9 - 55:Bits 9-55 of the offset into the image file at which the
>>   refcount block starts. Must be aligned to a cluster
>>   boundary.
>>
>> @@ -349,6 +363,8 @@ Refcount table entry:
>>   been allocated. All refcounts managed by this refcount block
>>   are 0.
>>
>> +56 - 63:Reserved (set to 0)
>
> Are we not updating REFT_OFFSET_MASK as well?

We could, but that should be a separate patch from the spec change.  We
could also add some validation that any offsets in the header point to
less than the 64PB limit.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-devel] [PATCH v3 2/4] qcow2: Document some maximum size constraints

2018-02-26 Thread Alberto Garcia
On Mon 26 Feb 2018 05:41:54 PM CET, Eric Blake wrote:
>> But refcount blocks are not addressed by L2 tables, so in principle
>> it should be possible to have refcount blocks after the first 64PB.
>
> But (if we don't make this change) that's about all you can usefully
> have (and it would be a self-referencing refcount block).

Yeah, true.

>>> +56 - 63:Reserved (set to 0)
>> 
>> Are we not updating REFT_OFFSET_MASK as well?
>
> We could, but that should be a separate patch from the spec change.
> We could also add some validation that any offsets in the header point
> to less than the 64PB limit.

Ok, let's leave that for another patch then.

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-devel] [PATCH v3 0/7] Call check and invalidate_cache from coroutine context

2018-02-26 Thread Kevin Wolf
Am 18.01.2018 um 13:43 hat Paolo Bonzini geschrieben:
> Check and invalidate_cache share some parts of the implementation
> with the regular I/O path.  This is sometimes complicated because the
> I/O path wants to use a CoMutex but that is not possible outside coroutine
> context.  By moving things to coroutine context, we can remove special
> cases.  In fact, invalidate_cache is already called from coroutine context
> because incoming migration is placed in a coroutine.
> 
> While at it, I'm including two patches from Stefan to rename the
> bdrv_create callback to bdrv_co_create, because it is already called
> from coroutine context.  The name is now bdrv_co_create_opts, with
> bdrv_co_create reserved for the QAPI-based version that Kevin is
> working on.
> 
> qcow2 still has cache flushing in non-coroutine context, coming from
> qcow2_reopen_prepare->qcow2_update_options_prepare and
> qcow2_close->qcow2_inactivate.

The patches looked good, but this deadlocks qemu-iotests 165 for me:

#0  0x562392ae1cb0 in qemu_coroutine_switch (from_=from_@entry=0x562393de2410, to_=to_@entry=0x7f81bf44ee48, action=action@entry=COROUTINE_YIELD) at util/coroutine-ucontext.c:219
#1  0x562392ae0ad1 in qemu_coroutine_yield () at util/qemu-coroutine.c:186
#2  0x562392ae0cb4 in qemu_co_mutex_lock_slowpath (mutex=0x562393de1870, ctx=0x562393dc1ad0) at util/qemu-coroutine-lock.c:269
#3  0x562392ae0cb4 in qemu_co_mutex_lock (mutex=mutex@entry=0x562393de1870) at util/qemu-coroutine-lock.c:307
#4  0x562392a149fc in qcow2_co_flush_to_os (bs=0x562393dd6750) at block/qcow2.c:3705
#5  0x562392a461c9 in bdrv_co_flush (bs=0x562393dd6750) at block/io.c:2439
#6  0x562392a46639 in bdrv_flush_co_entry (opaque=0x7f8195a7cd50) at block/io.c:2403
#7  0x562392a46639 in bdrv_flush (bs=bs@entry=0x562393dd6750) at block/io.c:2528
#8  0x562392a26de6 in update_header_sync (bs=bs@entry=0x562393dd6750) at block/qcow2-bitmap.c:113
#9  0x562392a26e5a in update_ext_header_and_dir_in_place (bs=bs@entry=0x562393dd6750, bm_list=bm_list@entry=0x562393de14f0) at block/qcow2-bitmap.c:826
#10 0x562392a27fa6 in qcow2_load_dirty_bitmaps (bs=bs@entry=0x562393dd6750, errp=errp@entry=0x7f8195a7ce68) at block/qcow2-bitmap.c:982
#11 0x562392a1947c in qcow2_do_open (bs=0x562393dd6750, options=, flags=8194, errp=0x7ffc3289a110) at block/qcow2.c:1501
#12 0x562392a198e2 in qcow2_open_entry (opaque=0x7ffc3289a0b0) at block/qcow2.c:1578
#13 0x562392ae1d1c in coroutine_trampoline (i0=, i1=) at util/coroutine-ucontext.c:116
#14 0x7f81a338d950 in __start_context () at /lib64/libc.so.6
#15 0x7ffc32899920 in  ()
#16 0x in  ()

Not saving the coroutine pointer anywhere was a bit nasty, too. gdb
only gave the coroutine pointer away with something as indirect as
'p ((BDRVQcow2State*) bs.opaque).lock.holder'.

Kevin



[Qemu-devel] [PATCH v2 1/2] nbd: BLOCK_STATUS constants

2018-02-26 Thread Eric Blake
From: Vladimir Sementsov-Ogievskiy 

Expose the new constants and structs that will be used by both
server and client implementations of NBD_CMD_BLOCK_STATUS (the
command is currently experimental at
https://github.com/NetworkBlockDevice/nbd/blob/extension-blockstatus/doc/proto.md
but will hopefully be stabilized soon).

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <1518702707-7077-4-git-send-email-vsement...@virtuozzo.com>
[eblake: split from larger patch on server implementation]
Signed-off-by: Eric Blake 
---
 include/block/nbd.h | 31 +++
 nbd/common.c| 10 ++
 2 files changed, 41 insertions(+)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index ef1698914ba..53250d0979c 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -41,6 +41,12 @@ struct NBDOptionReply {
 } QEMU_PACKED;
 typedef struct NBDOptionReply NBDOptionReply;

+typedef struct NBDOptionReplyMetaContext {
+NBDOptionReply h; /* h.type = NBD_REP_META_CONTEXT, h.length > 4 */
+uint32_t context_id;
+/* meta context name follows */
+} QEMU_PACKED NBDOptionReplyMetaContext;
+
 /* Transmission phase structs
  *
  * Note: these are _NOT_ the same as the network representation of an NBD
@@ -105,6 +111,19 @@ typedef struct NBDStructuredError {
 uint16_t message_length;
 } QEMU_PACKED NBDStructuredError;

+/* Header of NBD_REPLY_TYPE_BLOCK_STATUS */
+typedef struct NBDStructuredMeta {
+NBDStructuredReplyChunk h; /* h.length >= 12 (at least one extent) */
+uint32_t context_id;
+/* extents follows */
+} QEMU_PACKED NBDStructuredMeta;
+
+/* Extent chunk for NBD_REPLY_TYPE_BLOCK_STATUS */
+typedef struct NBDExtent {
+uint32_t length;
+uint32_t flags; /* NBD_STATE_* */
+} QEMU_PACKED NBDExtent;
+
 /* Transmission (export) flags: sent from server to client during handshake,
but describe what will happen during transmission */
 #define NBD_FLAG_HAS_FLAGS (1 << 0) /* Flags are there */
@@ -136,6 +155,8 @@ typedef struct NBDStructuredError {
 #define NBD_OPT_INFO  (6)
 #define NBD_OPT_GO(7)
 #define NBD_OPT_STRUCTURED_REPLY  (8)
+#define NBD_OPT_LIST_META_CONTEXT (9)
+#define NBD_OPT_SET_META_CONTEXT  (10)

 /* Option reply types. */
 #define NBD_REP_ERR(value) ((UINT32_C(1) << 31) | (value))
@@ -143,6 +164,7 @@ typedef struct NBDStructuredError {
 #define NBD_REP_ACK (1)/* Data sending finished. */
 #define NBD_REP_SERVER  (2)/* Export description. */
 #define NBD_REP_INFO(3)/* NBD_OPT_INFO/GO. */
+#define NBD_REP_META_CONTEXT(4)/* NBD_OPT_{LIST,SET}_META_CONTEXT */

 #define NBD_REP_ERR_UNSUP   NBD_REP_ERR(1)  /* Unknown option */
 #define NBD_REP_ERR_POLICY  NBD_REP_ERR(2)  /* Server denied */
@@ -163,6 +185,8 @@ typedef struct NBDStructuredError {
 #define NBD_CMD_FLAG_FUA(1 << 0) /* 'force unit access' during write */
 #define NBD_CMD_FLAG_NO_HOLE(1 << 1) /* don't punch hole on zero run */
 #define NBD_CMD_FLAG_DF (1 << 2) /* don't fragment structured read */
+#define NBD_CMD_FLAG_REQ_ONE(1 << 3) /* only one extent in BLOCK_STATUS
+  * reply chunk */

 /* Supported request types */
 enum {
@@ -173,6 +197,7 @@ enum {
 NBD_CMD_TRIM = 4,
 /* 5 reserved for failed experiment NBD_CMD_CACHE */
 NBD_CMD_WRITE_ZEROES = 6,
+NBD_CMD_BLOCK_STATUS = 7,
 };

 #define NBD_DEFAULT_PORT   10809
@@ -200,9 +225,15 @@ enum {
 #define NBD_REPLY_TYPE_NONE  0
 #define NBD_REPLY_TYPE_OFFSET_DATA   1
 #define NBD_REPLY_TYPE_OFFSET_HOLE   2
+#define NBD_REPLY_TYPE_BLOCK_STATUS  3
 #define NBD_REPLY_TYPE_ERROR NBD_REPLY_ERR(1)
 #define NBD_REPLY_TYPE_ERROR_OFFSET  NBD_REPLY_ERR(2)

+/* Flags for extents (NBDExtent.flags) of NBD_REPLY_TYPE_BLOCK_STATUS,
+ * for base:allocation meta context */
+#define NBD_STATE_HOLE (1 << 0)
+#define NBD_STATE_ZERO (1 << 1)
+
 static inline bool nbd_reply_type_is_error(int type)
 {
 return type & (1 << 15);
diff --git a/nbd/common.c b/nbd/common.c
index 6295526dd14..8c95c1d606e 100644
--- a/nbd/common.c
+++ b/nbd/common.c
@@ -75,6 +75,10 @@ const char *nbd_opt_lookup(uint32_t opt)
 return "go";
 case NBD_OPT_STRUCTURED_REPLY:
 return "structured reply";
+case NBD_OPT_LIST_META_CONTEXT:
+return "list meta context";
+case NBD_OPT_SET_META_CONTEXT:
+return "set meta context";
 default:
 return "";
 }
@@ -90,6 +94,8 @@ const char *nbd_rep_lookup(uint32_t rep)
 return "server";
 case NBD_REP_INFO:
 return "info";
+case NBD_REP_META_CONTEXT:
+return "meta context";
 case NBD_REP_ERR_UNSUP:
 return "unsupported";
 case NBD_REP_ERR_POLICY:
@@ -144,6 +150,8 @@ const char *nbd_cmd_lookup(uint16_t cmd)
 return "trim";
 case NBD_CMD_WRITE_ZEROES:
 return "write zeroes";
+case NBD_CMD_BLOCK_STATUS:
+return "b
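
As a reading aid for the structs added in include/block/nbd.h, here is a
minimal Python sketch that parses the payload of a
NBD_REPLY_TYPE_BLOCK_STATUS chunk as laid out above: a 32-bit context
id followed by one or more (length, flags) extents, all big-endian.
The function name is made up, and the flag values are the ones from
this patch, which the cover letter notes may still change.

```python
import struct

NBD_STATE_HOLE = 1 << 0
NBD_STATE_ZERO = 1 << 1


def parse_block_status_payload(payload):
    """Return (context_id, [(length, flags), ...]) from a chunk payload."""
    # 4 bytes of context id plus at least one 8-byte extent.
    if len(payload) < 12 or (len(payload) - 4) % 8 != 0:
        raise ValueError("payload must be 4 + 8 * n bytes with n >= 1")
    (context_id,) = struct.unpack_from(">I", payload, 0)
    extents = [
        struct.unpack_from(">II", payload, offset)
        for offset in range(4, len(payload), 8)
    ]
    return context_id, extents
```

An extent whose flags equal NBD_STATE_HOLE | NBD_STATE_ZERO describes
an unallocated range that reads as zeroes.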

[Qemu-devel] [PATCH v2 0/2] nbd block status initial patches

2018-02-26 Thread Eric Blake
Here's the bits of 3/9 and 5/9 that I liked, but where I made further
changes.  I'd like to take these two, plus the original 2/9 as-is,
as part of my next NBD PULL request, but want to make sure you are
okay with my changes.

Yes, I still need to follow up on the upstream NBD list about what we
are going to do regarding the disagreement on values for the constants
between the early Virtuozzo implementation and the current wording
in the NBD extension branch.
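
For reference, the base:allocation extent flags as defined in this
version of the series (NBD_STATE_HOLE as bit 0, NBD_STATE_ZERO as bit 1)
could be decoded roughly as in the sketch below. The values are the ones
from patch 1 and may still change depending on the outcome of that
discussion; nbd_extent_describe() is a hypothetical helper used only for
illustration, not something in QEMU.

```c
#include <stdint.h>

/* Values as in this version of the series; they may still change. */
#define NBD_STATE_HOLE (1 << 0)
#define NBD_STATE_ZERO (1 << 1)

/*
 * Hypothetical helper: map the flags of one BLOCK_STATUS extent to a
 * short label. Bits other than HOLE and ZERO are ignored here.
 */
static inline const char *nbd_extent_describe(uint32_t flags)
{
    switch (flags & (NBD_STATE_HOLE | NBD_STATE_ZERO)) {
    case 0:
        return "data";          /* not known to be a hole or zero */
    case NBD_STATE_HOLE:
        return "hole";          /* unallocated, contents not known */
    case NBD_STATE_ZERO:
        return "zero";          /* allocated, reads as zeroes */
    default:
        return "hole+zero";     /* unallocated, reads as zeroes */
    }
}
```

The two bits are independent, which is why the sketch treats all four
combinations separately rather than testing a single flag.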

Vladimir Sementsov-Ogievskiy (2):
  nbd: BLOCK_STATUS constants
  nbd/client: fix error messages in nbd_handle_reply_err

 include/block/nbd.h | 31 +++
 nbd/client.c        | 20 ++--
 nbd/common.c        | 10 ++
 nbd/server.c        |  4 ++--
 nbd/trace-events    |  8
 5 files changed, 57 insertions(+), 16 deletions(-)

-- 
2.14.3




Re: [Qemu-devel] [PATCH v4 3/3] s390x/sclp: extend SCLP event masks to 64 bits

2018-02-26 Thread Cornelia Huck
On Fri, 23 Feb 2018 18:42:58 +0100
Claudio Imbrenda  wrote:

> Extend the SCLP event masks to 64 bits.
> 
> Notice that using any of the new bits results in a state that cannot be
> migrated to an older version.
> 
> Signed-off-by: Claudio Imbrenda 
> ---
>  hw/s390x/event-facility.c         | 56 ++-
>  include/hw/s390x/event-facility.h |  2 +-
>  2 files changed, 45 insertions(+), 13 deletions(-)
> 
> diff --git a/hw/s390x/event-facility.c b/hw/s390x/event-facility.c
> index e04ed9f..c3e39ee 100644
> --- a/hw/s390x/event-facility.c
> +++ b/hw/s390x/event-facility.c
> @@ -30,7 +30,7 @@ struct SCLPEventFacility {
>  SysBusDevice parent_obj;
>  SCLPEventsBus sbus;
>  /* guest's receive mask */
> -sccb_mask_t receive_mask;
> +uint32_t receive_mask_pieces[2];
>  /*
>   * when false, we keep the same broken, backwards compatible behaviour as
>   * before, allowing only masks of size exactly 4; when true, we implement
> @@ -42,6 +42,18 @@ struct SCLPEventFacility {
>  uint16_t mask_length;
>  };
>  
> +static inline sccb_mask_t make_receive_mask(SCLPEventFacility *ef)
> +{
> +return ((sccb_mask_t)ef->receive_mask_pieces[0]) << 32 |
> + ef->receive_mask_pieces[1];
> +}
> +
> +static inline void store_receive_mask(SCLPEventFacility *ef, sccb_mask_t val)
> +{
> +ef->receive_mask_pieces[1] = val;
> +ef->receive_mask_pieces[0] = val >> 32;
> +}
> +

Hm... how are all those values actually defined in the architecture?
You pass around some values internally (which are supposedly in host
endian) and then and/or them with the receive mask here. Are they
compared byte-for-byte? 32-bit-for-32-bit?
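
To make the question concrete, here is a minimal standalone sketch of
the split/join done by the two quoted helpers. It assumes sccb_mask_t is
a 64-bit unsigned type after this patch; MaskHolder is a hypothetical
stand-in for the relevant part of SCLPEventFacility. Since the helpers
use shifts on host values, pieces[0] always carries the high 32 bits
and pieces[1] the low 32 bits, whatever the host byte order; whether
that ordering matches the architected mask layout is exactly what is
being asked above.

```c
#include <stdint.h>

/* Assumption: sccb_mask_t is 64 bits wide after this patch. */
typedef uint64_t sccb_mask_t;

/* Hypothetical stand-in for the two 32-bit halves in SCLPEventFacility. */
typedef struct {
    uint32_t receive_mask_pieces[2];
} MaskHolder;

/* Join the halves: pieces[0] is the high word, pieces[1] the low word. */
static inline sccb_mask_t make_receive_mask(const MaskHolder *h)
{
    return ((sccb_mask_t)h->receive_mask_pieces[0] << 32) |
           h->receive_mask_pieces[1];
}

/* Split a 64-bit mask into the two halves (value-based, not byte-based). */
static inline void store_receive_mask(MaskHolder *h, sccb_mask_t val)
{
    h->receive_mask_pieces[1] = (uint32_t)val;
    h->receive_mask_pieces[0] = (uint32_t)(val >> 32);
}
```

Storing 0x0123456789abcdef this way leaves 0x01234567 in pieces[0] and
0x89abcdef in pieces[1], and make_receive_mask() returns the original
value.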

I'm also not a fan of the _pieces suffix - reminds me of Dwarf pieces :)


