Re: [PATCH 3/3] scripts/qmp/qom-fuse: Fix getattr(), read() for files in /

2020-07-24 Thread Markus Armbruster
Philippe Mathieu-Daudé  writes:

> On 7/23/20 4:27 PM, Markus Armbruster wrote:
>> path, prop = "/type".rsplit('/', 1) sets path to "", which doesn't
>> work.  Correct to "/".
>> 
>> Signed-off-by: Markus Armbruster 
>> ---
>>  scripts/qmp/qom-fuse | 10 --
>>  1 file changed, 8 insertions(+), 2 deletions(-)
>> 
>> diff --git a/scripts/qmp/qom-fuse b/scripts/qmp/qom-fuse
>> index 405e6ebd67..7c7cff8edf 100755
>> --- a/scripts/qmp/qom-fuse
>> +++ b/scripts/qmp/qom-fuse
>> @@ -45,8 +45,10 @@ class QOMFS(Operations):
>>  return False
>>  
>>  def is_property(self, path):
>> +path, prop = path.rsplit('/', 1)
>> +if path == '':
>> +path = '/'
>>  try:
>> -path, prop = path.rsplit('/', 1)
>
> Maybe worth adding a tiny root_path_split() helper?

The script goes back to commit 5ade767485 "qom: quick and dirty QOM
filesystem based on FUSE" (2014).  It's as "quick and dirty" as ever.
It could use a thorough rework.
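
For what it's worth, a minimal sketch of the helper Philippe suggests could
look like this (the name and the call site are illustrative only, not taken
from the script):

    def root_path_split(path):
        """Split a QOM path into (parent, name), mapping top-level
        entries such as '/type' to parent '/' instead of ''."""
        parent, _, name = path.rpartition('/')
        return (parent or '/', name)

    # e.g. in is_property():
    #     path, prop = root_path_split(path)
    root_path_split('/type')                # -> ('/', 'type')
    root_path_split('/machine/unattached')  # -> ('/machine', 'unattached')

That would centralize the root handling instead of repeating the
"if path == '': path = '/'" dance at every call site.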

> Reviewed-by: Philippe Mathieu-Daudé 

Thanks!

[...]




Re: [PATCH] block/amend: Check whether the node exists

2020-07-24 Thread Max Reitz
On 23.07.20 19:56, Peter Maydell wrote:
> On Fri, 10 Jul 2020 at 10:51, Max Reitz  wrote:
>>
>> We should check whether the user-specified node-name actually refers to
>> a node.  The simplest way to do that is to use bdrv_lookup_bs() instead
>> of bdrv_find_node() (the former wraps the latter, and produces an error
>> message if necessary).
>>
>> Reported-by: Coverity (CID 1430268)
>> Fixes: ced914d0ab9fb2c900f873f6349a0b8eecd1fdbe
>> Signed-off-by: Max Reitz 
> 
> Hi; has this patch got lost? (I'm just running through the Coverity
> issues marked as fix-submitted to check the patches made it into
> master, and it looks like this one hasn't yet.)

Well, not strictly speaking lost, but I did forget to merge it, yes.
Thanks for the reminder!

Max





Re: [PATCH] block/amend: Check whether the node exists

2020-07-24 Thread Max Reitz
On 10.07.20 11:50, Max Reitz wrote:
> We should check whether the user-specified node-name actually refers to
> a node.  The simplest way to do that is to use bdrv_lookup_bs() instead
> of bdrv_find_node() (the former wraps the latter, and produces an error
> message if necessary).
> 
> Reported-by: Coverity (CID 1430268)
> Fixes: ced914d0ab9fb2c900f873f6349a0b8eecd1fdbe
> Signed-off-by: Max Reitz 
> ---
>  block/amend.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)

Applied to my block branch.





[RFC PATCH] buildsys: Only build capstone if softmmu/user mode is enabled

2020-07-24 Thread Philippe Mathieu-Daudé
At least one of softmmu or user mode has to be enabled to use
capstone. If not, don't clone/build it.

This saves CI time for the tools/documentation-only build jobs.

Signed-off-by: Philippe Mathieu-Daudé 
---
 configure | 4 
 1 file changed, 4 insertions(+)

diff --git a/configure b/configure
index 4bd80ed507..bc5757159a 100755
--- a/configure
+++ b/configure
@@ -5381,6 +5381,10 @@ fi
 ##
 # capstone
 
+if test -z "$capstone" && test $tcg = 'no' ; then # !tcg implies !softmmu
+  capstone="no"
+fi
+
 case "$capstone" in
   "" | yes)
 if $pkg_config capstone; then
-- 
2.21.3




Re: [PATCH] docs/nvdimm: add 'pmem=on' for the device dax backend file

2020-07-24 Thread Liu, Jingqi

Ping for comments.


On 7/15/2020 10:54 AM, Liu, Jingqi wrote:

At the end of live migration, QEMU uses msync() to flush the data to
the backend storage. When the backend file is a character device DAX,
the pages explicitly bypass the page cache, so msync() returns failure.
The following warning is output:

 "warning: qemu_ram_msync: failed to sync memory range“

So we add 'pmem=on' to avoid calling msync(). Use the QEMU command line:

 -object memory-backend-file,id=mem1,pmem=on,mem-path=/dev/dax0.0,size=4G

Signed-off-by: Jingqi Liu 
---
  docs/nvdimm.txt | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/docs/nvdimm.txt b/docs/nvdimm.txt
index c2c6e441b3..31048aff5e 100644
--- a/docs/nvdimm.txt
+++ b/docs/nvdimm.txt
@@ -243,6 +243,13 @@ use the QEMU command line:
  
  -object memory-backend-file,id=nv_mem,mem-path=/XXX/yyy,size=4G,pmem=on
  
+At the end of live migration, QEMU uses msync() to flush the data to the
+backend storage. When the backend file is a character device dax, the pages
+explicitly avoid the page cache. It will return failure from msync().
+So we add 'pmem=on' to avoid calling msync(), use the QEMU command line:
+
+-object memory-backend-file,id=mem1,pmem=on,mem-path=/dev/dax0.0,size=4G
+
  References
  --
  




Re: [PATCH v3 14/16] python/qemu: Cleanup changes to ConsoleSocket

2020-07-24 Thread Philippe Mathieu-Daudé
+John who requested the changes.

On 7/24/20 8:45 AM, Alex Bennée wrote:
> From: Robert Foley 
> 
> The changes to console_socket.py and machine.py are to
> cleanup for pylint and flake8.
> 
> Signed-off-by: Robert Foley 
> Reviewed-by: Alex Bennée 
> Signed-off-by: Alex Bennée 
> Message-Id: <20200717203041.9867-2-robert.fo...@linaro.org>
> ---
>  python/qemu/console_socket.py | 57 ++-
>  python/qemu/machine.py|  7 +++--
>  python/qemu/pylintrc  |  2 +-
>  3 files changed, 34 insertions(+), 32 deletions(-)
> 
> diff --git a/python/qemu/console_socket.py b/python/qemu/console_socket.py
> index 830cb7c6282..09986bc2152 100644
> --- a/python/qemu/console_socket.py
> +++ b/python/qemu/console_socket.py
> @@ -1,12 +1,9 @@
> -#!/usr/bin/env python3
> -#
> -# This python module implements a ConsoleSocket object which is
> -# designed always drain the socket itself, and place
> -# the bytes into a in memory buffer for later processing.
> -#
> -# Optionally a file path can be passed in and we will also
> -# dump the characters to this file for debug.
> -#
> +"""
> +QEMU Console Socket Module:
> +
> +This python module implements a ConsoleSocket object,
> +which can drain a socket and optionally dump the bytes to file.
> +"""
>  # Copyright 2020 Linaro
>  #
>  # Authors:
> @@ -15,20 +12,27 @@
>  # This code is licensed under the GPL version 2 or later.  See
>  # the COPYING file in the top-level directory.
>  #
> +
>  import asyncore
>  import socket
>  import threading
> -import io
> -import os
> -import sys
>  from collections import deque
>  import time
> -import traceback
> +
>  
>  class ConsoleSocket(asyncore.dispatcher):
> +"""
> +ConsoleSocket represents a socket attached to a char device.
>  
> +Drains the socket and places the bytes into an in memory buffer
> +for later processing.
> +
> +Optionally a file path can be passed in and we will also
> +dump the characters to this file for debugging purposes.
> +"""
>  def __init__(self, address, file=None):
>  self._recv_timeout_sec = 300
> +self._sleep_time = 0.5
>  self._buffer = deque()
>  self._asyncore_thread = None
>  self._sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
> @@ -70,31 +74,28 @@ class ConsoleSocket(asyncore.dispatcher):
>  
>  def handle_read(self):
>  """process arriving characters into in memory _buffer"""
> -try:
> -data = asyncore.dispatcher.recv(self, 1)
> -# latin1 is needed since there are some chars
> -# we are receiving that cannot be encoded to utf-8
> -# such as 0xe2, 0x80, 0xA6.
> -string = data.decode("latin1")
> -except:
> -print("Exception seen.")
> -traceback.print_exc()
> -return
> +data = asyncore.dispatcher.recv(self, 1)

A bit more than the commit description, but I don't mind.

Reviewed-by: Philippe Mathieu-Daudé 

> +# latin1 is needed since there are some chars
> +# we are receiving that cannot be encoded to utf-8
> +# such as 0xe2, 0x80, 0xA6.
> +string = data.decode("latin1")
>  if self._logfile:
>  self._logfile.write("{}".format(string))
>  self._logfile.flush()
>  for c in string:
>  self._buffer.extend(c)
>  
> -def recv(self, n=1, sleep_delay_s=0.1):
> -"""Return chars from in memory buffer"""
> +def recv(self, buffer_size=1):
> +"""Return chars from in memory buffer.
> +   Maintains the same API as socket.socket.recv.
> +"""
>  start_time = time.time()
> -while len(self._buffer) < n:
> -time.sleep(sleep_delay_s)
> +while len(self._buffer) < buffer_size:
> +time.sleep(self._sleep_time)
>  elapsed_sec = time.time() - start_time
>  if elapsed_sec > self._recv_timeout_sec:
>  raise socket.timeout
> -chars = ''.join([self._buffer.popleft() for i in range(n)])
> +chars = ''.join([self._buffer.popleft() for i in range(buffer_size)])
>  # We choose to use latin1 to remain consistent with
>  # handle_read() and give back the same data as the user would
>  # receive if they were reading directly from the
> diff --git a/python/qemu/machine.py b/python/qemu/machine.py
> index 80c4d4a8b6e..9956360a792 100644
> --- a/python/qemu/machine.py
> +++ b/python/qemu/machine.py
> @@ -27,7 +27,7 @@ import socket
>  import tempfile
>  from typing import Optional, Type
>  from types import TracebackType
> -from qemu.console_socket import ConsoleSocket
> +from . import console_socket
>  
>  from . import qmp
>  
> @@ -674,8 +674,9 @@ class QEMUMachine:
>  """
>  if self._console_socket is None:
>  if self._drain_console:
> -self._console_socket = ConsoleSocket(self._console_address,
> -   

Re: [PATCH v2 09/22] migration/block-dirty-bitmap: relax error handling in incoming part

2020-07-24 Thread Vladimir Sementsov-Ogievskiy

19.02.2020 18:34, Vladimir Sementsov-Ogievskiy wrote:

18.02.2020 21:54, Andrey Shinkevich wrote:



On 17/02/2020 18:02, Vladimir Sementsov-Ogievskiy wrote:

Bitmap data is not critical, and we should not fail the migration (or
use postcopy recovery) because of a dirty-bitmaps migration failure.
Instead we should just lose unfinished bitmaps.

Still, we have to report I/O stream violation errors, as they affect the
whole migration stream.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  migration/block-dirty-bitmap.c | 148 +
  1 file changed, 113 insertions(+), 35 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 1329db8d7d..aea5326804 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -145,6 +145,15 @@ typedef struct DBMLoadState {
  bool before_vm_start_handled; /* set in dirty_bitmap_mig_before_vm_start 
*/
+    /*
+ * cancelled
+ * Incoming migration is cancelled for some reason. That means that we
+ * still should read our chunks from migration stream, to not affect other
+ * migration objects (like RAM), but just ignore them and do not touch any
+ * bitmaps or nodes.
+ */
+    bool cancelled;
+
  GSList *bitmaps;
  QemuMutex lock; /* protect bitmaps */
  } DBMLoadState;
@@ -545,13 +554,47 @@ void dirty_bitmap_mig_before_vm_start(void)
  qemu_mutex_unlock(&s->lock);
  }
+static void cancel_incoming_locked(DBMLoadState *s)
+{
+    GSList *item;
+
+    if (s->cancelled) {
+    return;
+    }
+
+    s->cancelled = true;
+    s->bs = NULL;
+    s->bitmap = NULL;
+
+    /* Drop all unfinished bitmaps */
+    for (item = s->bitmaps; item; item = g_slist_next(item)) {
+    LoadBitmapState *b = item->data;
+
+    /*
+ * Bitmap must be unfinished, as finished bitmaps should already be
+ * removed from the list.
+ */
+    assert(!s->before_vm_start_handled || !b->migrated);
+    if (bdrv_dirty_bitmap_has_successor(b->bitmap)) {
+    bdrv_reclaim_dirty_bitmap(b->bitmap, &error_abort);
+    }
+    bdrv_release_dirty_bitmap(b->bitmap);
+    }
+
+    g_slist_free_full(s->bitmaps, g_free);
+    s->bitmaps = NULL;
+}
+
  static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
  {
  GSList *item;
  trace_dirty_bitmap_load_complete();
-    bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
-    qemu_mutex_lock(&s->lock);


Why is it safe to remove the critical section?


It's not removed, it becomes wider in this patch.




+    if (s->cancelled) {
+    return;
+    }
+
+    bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
  if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
  bdrv_reclaim_dirty_bitmap(s->bitmap, &error_abort);
@@ -569,8 +612,6 @@ static void dirty_bitmap_load_complete(QEMUFile *f, 
DBMLoadState *s)
  break;
  }
  }
-
-    qemu_mutex_unlock(&s->lock);
  }
  static int dirty_bitmap_load_bits(QEMUFile *f, DBMLoadState *s)
@@ -582,15 +623,32 @@ static int dirty_bitmap_load_bits(QEMUFile *f, 
DBMLoadState *s)
  if (s->flags & DIRTY_BITMAP_MIG_FLAG_ZEROES) {
  trace_dirty_bitmap_load_bits_zeroes();
-    bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte, nr_bytes,
- false);
+    if (!s->cancelled) {
+    bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte,
+ nr_bytes, false);
+    }
  } else {
  size_t ret;
  uint8_t *buf;
  uint64_t buf_size = qemu_get_be64(f);
-    uint64_t needed_size =
-    bdrv_dirty_bitmap_serialization_size(s->bitmap,
- first_byte, nr_bytes);
+    uint64_t needed_size;
+
+    buf = g_malloc(buf_size);
+    ret = qemu_get_buffer(f, buf, buf_size);
+    if (ret != buf_size) {
+    error_report("Failed to read bitmap bits");
+    g_free(buf);
+    return -EIO;
+    }
+
+    if (s->cancelled) {
+    g_free(buf);
+    return 0;
+    }
+
+    needed_size = bdrv_dirty_bitmap_serialization_size(s->bitmap,
+   first_byte,
+   nr_bytes);
  if (needed_size > buf_size ||
  buf_size > QEMU_ALIGN_UP(needed_size, 4 * sizeof(long))
@@ -599,15 +657,8 @@ static int dirty_bitmap_load_bits(QEMUFile *f, 
DBMLoadState *s)
  error_report("Migrated bitmap granularity doesn't "
   "match the destination bitmap '%s' granularity",
   bdrv_dirty_bitmap_name(s->bitmap));
-    return -EINVAL;
-    }
-
-    buf = g_malloc(buf_size);
-    ret = qemu_get_buffer(f, buf, buf_size);
-    if (ret != buf_size) {
-    error_report("Failed t

Status of scripts/qmp/ (was: [PATCH 0/3] scripts/qmp/qom-fuse: Scrape off the bitrot)

2020-07-24 Thread Markus Armbruster
Thomas Huth  writes:

> On 23/07/2020 16.27, Markus Armbruster wrote:
>> Markus Armbruster (3):
>>   scripts/qmp/qom-fuse: Unbreak import of QEMUMonitorProtocol
>>   scripts/qmp/qom-fuse: Port to current Python module fuse
>>   scripts/qmp/qom-fuse: Fix getattr(), read() for files in /
>
> Could it be added to a CI pipeline, so that it does not bitrot again?

Should it be?

Thread hijack!

What's the status of scripts/qmp/?

The directory is covered by MAINTAINERS section QMP, with status
"Supported".  Status is a *lie* for these scripts.  I inherited them
with the rest of QMP.  I have no use for them, except I occasionally use
qom-fuse for QOM spelunking, mostly because our monitor commands are so
unwieldy compared to a filesystem.  I barely looked at them in the 5+
years of my service as QMP maintainer.  Actual status is "Odd fixes".

Does this stuff have any business in /scripts/?

Nothing in scripts/qmp/ should be shipped.

scripts/qmp/qemu-ga-client doesn't even belong there.  Michael, is it of
any use today?

I know scripts/qmp/qmp-shell has a few friends among developers.  I
regard it as a failed attempt at making QMP easier for humans, and have
zero interest in it.

scripts/qmp/qmp looks like an attempt at making QMP easier for shell
scripts.  I'm not aware of actual use, and have zero interest in it.

scripts/qmp/qom-{get,list,set} look like an attempt at making QOM easier
for shell scripts.  I'm not aware of actual use, and have zero interest
in it.  Heck, I can't even figure out how to use qom-get (I just spent
at least 30s trying).

scripts/qmp/qom-tree feels redundant with scripts/qmp/qom-fuse.  I just
ran it for the first time just to come to this judgement.

I believe contrib/ would be a better home for all of them.

I feel like moving the directory there and leaving it *uncovered* in
MAINTAINERS.  If a volunteer steps forward, we can add a suitable
section.

Opinions?




[PATCH 2/2] tests: Exclude 'boot_linux.py' from fetch-acceptance rule

2020-07-24 Thread Philippe Mathieu-Daudé
The boot_linux.py file triggers an exception:

  $ make fetch-acceptance
  AVOCADO tests/acceptance
  Fetching assets from tests/acceptance/empty_cpu_model.py.
  Fetching assets from tests/acceptance/vnc.py.
  Fetching assets from tests/acceptance/boot_linux_console.py.
  Fetching assets from tests/acceptance/boot_linux.py.
  Traceback (most recent call last):
File 
"/var/tmp/qemu-builddir/tests/venv/lib64/python3.7/site-packages/avocado/__main__.py",
 line 11, in 
  sys.exit(main.run())
File 
"/var/tmp/qemu-builddir/tests/venv/lib64/python3.7/site-packages/avocado/core/app.py",
 line 91, in run
  return method(self.parser.config)
File 
"/var/tmp/qemu-builddir/tests/venv/lib64/python3.7/site-packages/avocado/plugins/assets.py",
 line 291, in run
  success, fail = fetch_assets(test_file)
File 
"/var/tmp/qemu-builddir/tests/venv/lib64/python3.7/site-packages/avocado/plugins/assets.py",
 line 200, in fetch_assets
  handler = FetchAssetHandler(test_file, klass, method)
File 
"/var/tmp/qemu-builddir/tests/venv/lib64/python3.7/site-packages/avocado/plugins/assets.py",
 line 65, in __init__
  self.visit(self.tree)
File "/usr/lib64/python3.7/ast.py", line 271, in visit
  return visitor(node)
File "/usr/lib64/python3.7/ast.py", line 279, in generic_visit
  self.visit(item)
File "/usr/lib64/python3.7/ast.py", line 271, in visit
  return visitor(node)
File 
"/var/tmp/qemu-builddir/tests/venv/lib64/python3.7/site-packages/avocado/plugins/assets.py",
 line 139, in visit_ClassDef
  self.generic_visit(node)
File "/usr/lib64/python3.7/ast.py", line 279, in generic_visit
  self.visit(item)
File "/usr/lib64/python3.7/ast.py", line 271, in visit
  return visitor(node)
File 
"/var/tmp/qemu-builddir/tests/venv/lib64/python3.7/site-packages/avocado/plugins/assets.py",
 line 171, in visit_Assign
  self.asgmts[cur_klass][cur_method][name] = node.value.s
  KeyError: 'launch_and_wait'
  make: *** [tests/Makefile.include:949: fetch-acceptance] Error 1

Exclude it for now. We will revert this commit once the script is
fixed.

Signed-off-by: Philippe Mathieu-Daudé 
---
 tests/Makefile.include | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 238974d8da..7c9cf7a818 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -950,7 +950,7 @@ fetch-acceptance: check-venv
 $(TESTS_VENV_DIR)/bin/python -m avocado \
 $(if $(V),--show=$(AVOCADO_SHOW)) \
 assets fetch \
-$(wildcard tests/acceptance/*.py), \
+$(filter-out tests/acceptance/boot_linux.py,$(wildcard 
tests/acceptance/*.py)), \
 "AVOCADO", "tests/acceptance")
 
 check-acceptance: check-venv $(TESTS_RESULTS_DIR) get-vm-images
-- 
2.21.3




[PATCH 0/2] tests: Add 'fetch-acceptance' rule

2020-07-24 Thread Philippe Mathieu-Daudé
Add a rule to fetch acceptance test assets.

This is particularly useful in a CI context, when a single job
can fetch and save the cache so other jobs reuse it directly.

It is also useful to measure the time spent downloading the
assets versus the time spent running the tests.

For now we have to exclude the 'boot_linux.py' which triggers
an exception.

Philippe Mathieu-Daudé (2):
  tests: Add 'fetch-acceptance' rule
  tests: Exclude 'boot_linux.py' from fetch-acceptance rule

 tests/Makefile.include | 9 +
 1 file changed, 9 insertions(+)

-- 
2.21.3




[PATCH 1/2] tests: Add 'fetch-acceptance' rule

2020-07-24 Thread Philippe Mathieu-Daudé
Add a rule to fetch acceptance test assets.

This is particularly useful in a CI context, when a single job
can fetch and save the cache so other jobs reuse it directly.

It is also useful to measure the time spent downloading the
assets versus the time spent running the tests.

Signed-off-by: Philippe Mathieu-Daudé 
---
 tests/Makefile.include | 9 +
 1 file changed, 9 insertions(+)

diff --git a/tests/Makefile.include b/tests/Makefile.include
index c7e4646ded..238974d8da 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -944,6 +944,15 @@ get-vm-image-fedora-31-%: check-venv
 # download all vm images, according to defined targets
 get-vm-images: check-venv $(patsubst %,get-vm-image-fedora-31-%, 
$(FEDORA_31_DOWNLOAD))
 
+# fetch acceptance test assets
+fetch-acceptance: check-venv
+   $(call quiet-command, \
+$(TESTS_VENV_DIR)/bin/python -m avocado \
+$(if $(V),--show=$(AVOCADO_SHOW)) \
+assets fetch \
+$(wildcard tests/acceptance/*.py), \
+"AVOCADO", "tests/acceptance")
+
 check-acceptance: check-venv $(TESTS_RESULTS_DIR) get-vm-images
$(call quiet-command, \
 $(TESTS_VENV_DIR)/bin/python -m avocado \
-- 
2.21.3




[PATCH v2] gitlab-ci: Fix Avocado cache usage

2020-07-24 Thread Philippe Mathieu-Daudé
In commit 6957fd98dc ("gitlab: add avocado asset caching") we
tried to save the Avocado cache (as in commit c1073e44b4 with
Travis-CI), however it doesn't work as expected. For some reason
Avocado uses /root/avocado_cache/, which we cannot select later.

Manually generate an Avocado config to force the use of the
current directory.

We add a new 'build-acceptance-cache' job that runs first
(during the 'build' stage) to create/update the cache.

The cache content is then pulled (but not updated) during the
'test' stage.

See:
- https://docs.gitlab.com/ee/ci/caching/
- 
https://avocado-framework.readthedocs.io/en/latest/guides/writer/chapters/writing.html#fetching-asset-files

Reported-by: Thomas Huth 
Fixes: 6957fd98dc ("gitlab: add avocado asset caching")
Signed-off-by: Philippe Mathieu-Daudé 
---
Since v1:
- add a specific 'build-acceptance-cache' job

Thomas prefers to use a different cache for each job.
Since I had this patch ready, I prefer to post it as
v2 and will work on a v3 using Thomas' suggestion.

Supersedes: <20200723200318.28214-1-f4...@amsat.org>
Based-on: <20200724073524.26589-1-f4...@amsat.org>
  "tests: Add 'fetch-acceptance' rule"
---
 .gitlab-ci.yml | 61 ++
 1 file changed, 52 insertions(+), 9 deletions(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 362e5ee755..a8d8a7e849 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -8,11 +8,9 @@ stages:
   - build
   - test
 
-# We assume GitLab has it's own caching set up for RPM/APT repositories so we
-# just take care of avocado assets here.
-cache:
-  paths:
-- $HOME/avocado/data/cache
+# We assume GitLab has it's own caching set up for RPM/APT repositories
+cache: &global_cache
+  policy: pull
 
 include:
   - local: '/.gitlab-ci.d/edk2.yml'
@@ -47,11 +45,52 @@ include:
 - find . -type f -exec touch {} +
 - make $MAKE_CHECK_ARGS
 
-.post_acceptance_template: &post_acceptance
+.acceptance_template: &acceptance_definition
+  cache:
+# inherit all global cache settings
+<<: *global_cache
+key: acceptance_cache
+paths:
+  - $CI_PROJECT_DIR/avocado_cache
+policy: pull
+  before_script:
+- JOBS=$(expr $(nproc) + 1)
+- mkdir -p ~/.config/avocado
+- echo "[datadir.paths]" > ~/.config/avocado/avocado.conf
+- echo "cache_dirs = ['${CI_PROJECT_DIR}/avocado_cache']" >> 
~/.config/avocado/avocado.conf
   after_script:
 - cd build
 - python3 -c 'import json; r = 
json.load(open("tests/results/latest/results.json")); [print(t["logfile"]) for 
t in r["tests"] if t["status"] not in ("PASS", "SKIP")]' | xargs cat
-- du -chs $HOME/avocado/data/cache
+- du -chs $CI_PROJECT_DIR/avocado_cache
+
+build-acceptance-cache:
+  stage: build
+  cache:
+# inherit all global cache settings
+<<: *global_cache
+key: acceptance_cache
+paths:
+  - $CI_PROJECT_DIR/avocado_cache
+policy: pull-push
+  variables:
+# any image should work
+IMAGE: ubuntu2004
+CONFIGURE_ARGS: --disable-user --disable-system
+  --disable-docs --disable-tools
+MAKE_CHECK_ARGS: fetch-acceptance
+  image: $CI_REGISTRY_IMAGE/qemu/$IMAGE:latest
+  before_script:
+- mkdir -p ~/.config/avocado
+- echo "[datadir.paths]" > ~/.config/avocado/avocado.conf
+- echo "cache_dirs = ['${CI_PROJECT_DIR}/avocado_cache']" >> 
~/.config/avocado/avocado.conf
+  script:
+- mkdir build
+- cd build
+- ../configure --disable-user --disable-system --disable-docs 
--disable-tools
+# ignore "asset fetched or already on cache" error
+- make fetch-acceptance || true
+  after_script:
+- du -chs $CI_PROJECT_DIR/avocado_cache
 
 build-system-ubuntu-main:
   <<: *native_build_job_definition
@@ -76,13 +115,15 @@ check-system-ubuntu-main:
 
 acceptance-system-ubuntu-main:
   <<: *native_test_job_definition
+  <<: *acceptance_definition
   needs:
 - job: build-system-ubuntu-main
   artifacts: true
+- job: build-acceptance-cache
+  artifacts: false
   variables:
 IMAGE: ubuntu2004
 MAKE_CHECK_ARGS: check-acceptance
-  <<: *post_acceptance
 
 build-system-fedora-alt:
   <<: *native_build_job_definition
@@ -107,13 +148,15 @@ check-system-fedora-alt:
 
 acceptance-system-fedora-alt:
   <<: *native_test_job_definition
+  <<: *acceptance_definition
   needs:
 - job: build-system-fedora-alt
   artifacts: true
+- job: build-acceptance-cache
+  artifacts: false
   variables:
 IMAGE: fedora
 MAKE_CHECK_ARGS: check-acceptance
-  <<: *post_acceptance
 
 build-disabled:
   <<: *native_build_job_definition
-- 
2.21.3




Re: [PATCH v2 22/22] qemu-iotests/199: add source-killed case to bitmaps postcopy

2020-07-24 Thread Vladimir Sementsov-Ogievskiy

19.02.2020 20:15, Andrey Shinkevich wrote:

On 17/02/2020 18:02, Vladimir Sementsov-Ogievskiy wrote:

Previous patches fix the behavior of bitmaps migration, so that errors
are handled by just removing unfinished bitmaps rather than failing or trying
to recover postcopy migration. Add a corresponding test.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/qemu-iotests/199 | 15 +++
  tests/qemu-iotests/199.out |  4 ++--
  2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 0d12e6b1ae..d38913fa44 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -235,6 +235,21 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
  self.vm_a.launch()
  check_bitmaps(self.vm_a, 0)
+    def test_early_kill_source(self):
+    self.start_postcopy()
+
+    self.vm_a_events = self.vm_a.get_qmp_events()
+    self.vm_a.kill()
+
+    self.vm_a.launch()
+
+    match = {'data': {'status': 'completed'}}
+    e_complete = self.vm_b.event_wait('MIGRATION', match=match)


A failed migration gets the status 'completed'. That misleads a user but is not 
in the scope of this series, I guess.


It hasn't failed. Only the bitmaps are not migrated, which is not a problem.
Probably we should invent some additional status or QAPI event for this, but 
yes, not in this series.




+    self.vm_b_events.append(e_complete)
+
+    check_bitmaps(self.vm_a, 0)
+    check_bitmaps(self.vm_b, 0)
+
  if __name__ == '__main__':
  iotests.main(supported_fmts=['qcow2'])
diff --git a/tests/qemu-iotests/199.out b/tests/qemu-iotests/199.out
index fbc63e62f8..8d7e996700 100644
--- a/tests/qemu-iotests/199.out
+++ b/tests/qemu-iotests/199.out
@@ -1,5 +1,5 @@
-..
+...
  --
-Ran 2 tests
+Ran 3 tests
  OK



The updated test passed.

Reviewed-by: Andrey Shinkevich 



--
Best regards,
Vladimir



Re: [RFC PATCH] buildsys: Only build capstone if softmmu/user mode is enabled

2020-07-24 Thread Thomas Huth
On 24/07/2020 09.16, Philippe Mathieu-Daudé wrote:
> At least one of softmmu or user mode has to be enabled to use
> capstone. If not, don't clone/build it.
> 
> This saves CI time for the tools/documentation-only build jobs.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  configure | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/configure b/configure
> index 4bd80ed507..bc5757159a 100755
> --- a/configure
> +++ b/configure
> @@ -5381,6 +5381,10 @@ fi
>  ##
>  # capstone
>  
> +if test -z "$capstone" && test $tcg = 'no' ; then # !tcg implies !softmmu
> +  capstone="no"
> +fi

I don't think this is right. You could have a KVM-only build where you
still want to use the disassembler for the human monitor.

But maybe it could be disabled if softmmu="no", linux_user="no" and
bsd_user="no" ?

 Thomas




Re: [PATCH v2] qapi: enable use of g_autoptr with QAPI types

2020-07-24 Thread Markus Armbruster
Daniel P. Berrangé  writes:

> Currently QAPI generates a type and function for free'ing it:
>
>   typedef struct QCryptoBlockCreateOptions QCryptoBlockCreateOptions;
>   void qapi_free_QCryptoBlockCreateOptions(QCryptoBlockCreateOptions *obj);
>
> This is used in the traditional manner:
>
>   QCryptoBlockCreateOptions *opts = NULL;
>
>   opts = g_new0(QCryptoBlockCreateOptions, 1);
>
>   do stuff with opts...
>
>   qapi_free_QCryptoBlockCreateOptions(opts);
>
> Since bumping the min glib to 2.48, QEMU has incrementally adopted the
> use of g_auto/g_autoptr. This allows the compiler to run a function to
> free a variable when it goes out of scope, the benefit being the
> compiler can guarantee it is freed in all possible code paths.
>
> This benefit is applicable to QAPI types too, and given the seriously
> long method names for some qapi_free_() functions, is much less
> typing. This change thus makes the code generator emit:
>
>  G_DEFINE_AUTOPTR_CLEANUP_FUNC(QCryptoBlockCreateOptions,
>   qapi_free_QCryptoBlockCreateOptions)
>
> The above code example now becomes
>
>   g_autoptr(QCryptoBlockCreateOptions) opts = NULL;
>
>   opts = g_new0(QCryptoBlockCreateOptions, 1);
>
>   do stuff with opts...
>
> Note, if the local pointer needs to live beyond the scope holding the
> variable, then g_steal_pointer can be used. This is useful to return the
> pointer to the caller in the success codepath, while letting it be freed
> in all error codepaths.
>
>   return g_steal_pointer(&opts);
>
> The crypto/block.h header needs updating to avoid symbol clash now that
> the g_autoptr support is a standard QAPI feature.
>
> Signed-off-by: Daniel P. Berrangé 

Reviewed-by: Markus Armbruster 

Queued for 5.2, thanks!




Re: [PATCH-for-5.1? v2] qapi/error: Check format string argument in error_*prepend()

2020-07-24 Thread Markus Armbruster
Philippe Mathieu-Daudé  writes:

> error_propagate_prepend() "behaves like error_prepend()", and
> error_prepend() uses "formatting @fmt, ... like printf()".
> error_prepend() checks its format string argument, but
> error_propagate_prepend() does not. Fix by addint the format

s/addint/adding/

> attribute to error_propagate_prepend() and error_vprepend().
>
> This would have caught the bug fixed in the previous commit:
>
> CC  hw/sd/milkymist-memcard.o
>   hw/sd/milkymist-memcard.c: In function ‘milkymist_memcard_realize’:
>   hw/sd/milkymist-memcard.c:284:70: error: format ‘%s’ expects a matching 
> ‘char *’ argument [-Werror=format=]
> 284 | error_propagate_prepend(errp, err, "failed to init SD card: 
> %s");
> | 
> ~^
> | 
>  |
> | 
>  char *

I see no need to repeat the details here.  If you agree, I'll drop them
in my tree.

> Missed in commit 4b5766488f "error: Fix use of error_prepend() with
> &error_fatal, &error_abort".
>
> Inspired-by: Stefan Weil 
> Suggested-by: Eric Blake 
> Reviewed-by: Markus Armbruster 
> Signed-off-by: Philippe Mathieu-Daudé 

Reviewed-by: Markus Armbruster 




Re: Status of scripts/qmp/ (was: [PATCH 0/3] scripts/qmp/qom-fuse: Scrape off the bitrot)

2020-07-24 Thread Thomas Huth
On 24/07/2020 09.33, Markus Armbruster wrote:
> Thomas Huth  writes:
> 
>> On 23/07/2020 16.27, Markus Armbruster wrote:
>>> Markus Armbruster (3):
>>>   scripts/qmp/qom-fuse: Unbreak import of QEMUMonitorProtocol
>>>   scripts/qmp/qom-fuse: Port to current Python module fuse
>>>   scripts/qmp/qom-fuse: Fix getattr(), read() for files in /
>>
>> Could it be added to a CI pipeline, so that it does not bitrot again?
> 
> Should it be?
> 
> Thread hijack!
> 
> What's the status of scripts/qmp/?
> 
> The directory is covered by MAINTAINERS section QMP, with status
> "Supported".  Status is a *lie* for these scripts.  I inherited them
> with the rest of QMP.  I have no use for them, except I occasionally use
> qom-fuse for QOM spelunking, mostly because our monitor commands are so
> unwieldy compared to a filesystem.  I barely looked at them in the 5+
> years of my service as QMP maintainer.  Actual status is "Odd fixes".
> 
> Does this stuff have any business in /scripts/?
> 
> Nothing in scripts/qmp/ should be shipped.
> 
> scripts/qmp/qemu-ga-client doesn't even belong there.  Michael, is it of
> any use today?
> 
> I know scripts/qmp/qmp-shell has a few friends among developers.  I
> regard it as a failed attempt at making QMP easier for humans, and have
> zero interest in it.
> 
> scripts/qmp/qmp looks like an attempt at making QMP easier for shell
> scripts.  I'm not aware of actual use, and have zero interest in it.
> 
> scripts/qmp/qom-{get,list,set} look like an attempt at making QOM easier
> for shell scripts.  I'm not aware of actual use, and have zero interest
> in it.  Heck, I can't even figure out how to use qom-get (I just spent
> at least 30s trying).

According to the original commit (235fe3bfd46b1104575b540d0bc), it seems
like these were used for manual testing. If in all those years nobody
ever tried to integrate them into our "make check" test suite, I guess
they are just dead code.

> scripts/qmp/qom-tree feels redundant with scripts/qmp/qom-fuse.  I just
> ran it for the first time just to come to this judgement.
> 
> I believe contrib/ would be a better home for all of them.
> 
> I feel like moving the directory there and leaving it *uncovered* in
> MAINTAINERS.  If a volunteer steps forward, we can add a suitable
> section.
> 
> Opinions?

I'd suggest to remove the "test tools" from commit 235fe3bfd46b1104575b5
since apparently nobody ever cared to integrate them into the test suite.

For the other scripts that are still used occasionally, I'd leave them
in their current location. If you don't want to maintain them, remove
them from your section in MAINTAINERS and add a new "QMP scripts"
section where you can mark the scripts/qmp folder as orphan.

 Thomas




Re: [PATCH v0 0/4] background snapshot

2020-07-24 Thread Denis Plotnikov




On 23.07.2020 20:39, Peter Xu wrote:

On Thu, Jul 23, 2020 at 11:03:55AM +0300, Denis Plotnikov wrote:


On 22.07.2020 19:30, Peter Xu wrote:

On Wed, Jul 22, 2020 at 06:47:44PM +0300, Denis Plotnikov wrote:

On 22.07.2020 18:42, Denis Plotnikov wrote:

On 22.07.2020 17:50, Peter Xu wrote:

Hi, Denis,

Hi, Peter

...

How to use:
1. enable background snapshot capability
      virsh qemu-monitor-command vm --hmp migrate_set_capability
background-snapshot on

2. stop the vm
      virsh qemu-monitor-command vm --hmp stop

3. Start the external migration to a file
      virsh qemu-monitor-command cent78-bs --hmp migrate exec:'cat

./vm_state'

4. Wait for the migration finish and check that the migration
has completed state.

Thanks for continuing to work on this project! I have two high-level
questions
before digging into the patches.

Firstly, is step 2 required?  Can we use a single QMP command to
take snapshots
(which can still be a "migrate" command)?

With this series it is required, but steps 2 and 3 should be merged into
a single one.

I'm not sure whether you're talking about the disk snapshot operations; anyway,
yeah, it'll definitely be good if we merge them into one in the next version.

After thinking for a while, I remembered why I split these two steps.
The VM snapshot consists of two parts: the disk snapshot(s) and the vmstate.
With the migrate command we save the vmstate only. So, the steps to save
the whole VM snapshot are the following:

2. stop the vm
     virsh qemu-monitor-command vm --hmp stop

2.1. Make a disk snapshot, something like
 virsh qemu-monitor-command vm --hmp snapshot_blkdev drive-scsi0-0-0-0 
./new_data
3. Start the external migration to a file
     virsh qemu-monitor-command vm --hmp migrate exec:'cat ./vm_state'

In this example, the VM snapshot consists of two files: vm_state and the disk file;
new_data will contain all new disk data written since step 2.1 was executed.
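
Put together, a minimal QMP sketch of that sequence might look like the
following (illustrative only: it assumes a QEMUMachine instance 'vm' from
python/qemu/machine.py, a drive named 'drive0', placeholder file names, and
the background-snapshot capability proposed by this series):

    # enable the background snapshot capability proposed by this series
    vm.qmp('migrate-set-capabilities',
           capabilities=[{'capability': 'background-snapshot', 'state': True}])

    # step 2: stop the guest
    vm.qmp('stop')

    # step 2.1: external disk snapshot; new disk writes go to new_data.qcow2
    vm.qmp('blockdev-snapshot-sync', device='drive0',
           snapshot_file='new_data.qcow2', format='qcow2')

    # step 3: write the vmstate to a file via the exec: migration transport
    vm.qmp('migrate', uri='exec:cat > vm_state')
    vm.event_wait('MIGRATION', match={'data': {'status': 'completed'}})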

But that's slightly different to the current interface of savevm and loadvm
which only requires a snapshot name, am I right?


Yes

Now we need both a snapshot
name (of the vmstate) and the name of the new snapshot image.


Yes


I'm not familiar with qemu image snapshots... my understanding is that current
snapshot (save_snapshot) used internal image snapshots, while in this proposal
you want the live snapshot to use external snapshots.
Correct, I want to add the ability to make an external live snapshot (live =
async RAM writing).

   Is there any criteria on
making this decision/change?
Internal snapshots are supported by qcow2 and sheepdog (I never heard of 
anyone using the latter).
Because of the qcow2 internal snapshot design, it's quite complex to 
implement a "background" snapshot there.
More details here: 
https://www.mail-archive.com/qemu-devel@nongnu.org/msg705116.html
So, I decided to start with external snapshots, to implement and prove 
the memory-access-intercepting part first.
Once it's done for external snapshot we can start to approach the 
internal snapshots.


Thanks,
Denis



Re: [PATCH for-5.1] iotests: Select a default machine for the rx and avr targets

2020-07-24 Thread Max Reitz
On 22.07.20 18:19, Thomas Huth wrote:
> If you are building only with either the new rx-softmmu or avr-softmmu
> target, "make check-block" fails a couple of tests since there is no
> default machine defined in these new targets. We have to select a machine
> in the "check" script for these, just like we already do for the arm- and
> tricore-softmmu targets.
> 
> Signed-off-by: Thomas Huth 
> ---
>  tests/qemu-iotests/check | 14 +-
>  1 file changed, 9 insertions(+), 5 deletions(-)

Thanks, applied to my block branch:

https://git.xanclic.moe/XanClic/qemu/commits/branch/block





[Bug 1888601] Re: QEMU v5.1.0-rc0/rc1 hang with nested virtualization

2020-07-24 Thread Dr. David Alan Gilbert
Simon: Please answer Jason's questions

But my reading is: I don't know what a VSI is, but I'm guessing it's the
host and he doesn't know the host qemu; so the 5.1 is the qemu that kata
is running?

It perhaps would make more sense if this has nothing to do with nesting,
and more to do with Kata's virtio setup;  I've just tried virtio-fs on
current head (outside kata) and it seems OK from a quick test.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1888601

Title:
  QEMU v5.1.0-rc0/rc1 hang with nested virtualization

Status in QEMU:
  New

Bug description:
  We're running Kata Containers using QEMU and with v5.1.0rc0 and rc1
  have noticed a problem at startup where QEMU appears to hang. We are
  not seeing this problem on our bare metal nodes and only on a VSI that
  supports nested virtualization.

  We unfortunately see nothing at all in the QEMU logs to help
  understand the problem and a hung process is just a guess at this
  point.

  Using git bisect we first see the problem with...

  ---

  f19bcdfedd53ee93412d535a842a89fa27cae7f2 is the first bad commit
  commit f19bcdfedd53ee93412d535a842a89fa27cae7f2
  Author: Jason Wang 
  Date:   Wed Jul 1 22:55:28 2020 +0800

  virtio-pci: implement queue_enabled method

  With version 1, we can detect whether a queue is enabled via
  queue_enabled.

  Signed-off-by: Jason Wang 
  Signed-off-by: Cindy Lu 
  Message-Id: <20200701145538.22333-5-l...@redhat.com>
  Reviewed-by: Michael S. Tsirkin 
  Signed-off-by: Michael S. Tsirkin 
  Acked-by: Jason Wang 

   hw/virtio/virtio-pci.c | 13 +
   1 file changed, 13 insertions(+)

  ---

  Reverting this commit (on top of 5.1.0-rc1) seems to work and prevent
  the hanging.

  ---

  Here's how kata ends up launching qemu in our environment --
  /opt/kata/bin/qemu-system-x86_64 -name 
sandbox-849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f -uuid 
6bec458e-1da7-4847-a5d7-5ab31d4d2465 -machine pc,accel=kvm,kernel_irqchip -cpu 
host,pmu=off -qmp 
unix:/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/qmp.sock,server,nowait
 -m 4096M,slots=10,maxmem=30978M -device 
pci-bridge,bus=pci.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2,romfile= 
-device virtio-serial-pci,disable-modern=true,id=serial0,romfile= -device 
virtconsole,chardev=charconsole0,id=console0 -chardev 
socket,id=charconsole0,path=/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/console.sock,server,nowait
 -device virtio-scsi-pci,id=scsi0,disable-modern=true,romfile= -object 
rng-random,id=rng0,filename=/dev/urandom -device 
virtio-rng-pci,rng=rng0,romfile= -device 
virtserialport,chardev=charch0,id=channel0,name=agent.channel.0 -chardev 
socket,id=charch0,path=/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/kata.sock,server,nowait
 -chardev 
socket,id=char-396c5c3e19e29353,path=/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/vhost-fs.sock
 -device 
vhost-user-fs-pci,chardev=char-396c5c3e19e29353,tag=kataShared,romfile= -netdev 
tap,id=network-0,vhost=on,vhostfds=3:4,fds=5:6 -device 
driver=virtio-net-pci,netdev=network-0,mac=52:ac:2d:02:1f:6f,disable-modern=true,mq=on,vectors=6,romfile=
 -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults 
-nographic -daemonize -object 
memory-backend-file,id=dimm1,size=4096M,mem-path=/dev/shm,share=on -numa 
node,memdev=dimm1 -kernel /opt/kata/share/kata-containers/vmlinuz-5.7.9-74 
-initrd 
/opt/kata/share/kata-containers/kata-containers-initrd_alpine_1.11.2-6_agent.initrd
 -append tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 
i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k console=hvc0 
console=hvc1 iommu=off cryptomgr.notests net.ifnames=0 pci=lastbus=0 debug 
panic=1 nr_cpus=4 agent.use_vsock=false scsi_mod.scan=none 
init=/usr/bin/kata-agent -pidfile 
/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/pid 
-D 
/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/qemu.log
 -smp 2,cores=1,threads=1,sockets=4,maxcpus=4

  ---

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1888601/+subscriptions



[PATCH] pseries: fix kvmppc_set_fwnmi()

2020-07-24 Thread Laurent Vivier
QEMU issues the ioctl(KVM_CAP_PPC_FWNMI) on the first vCPU.

If the first vCPU is currently running, the vCPU mutex is held,
so the ioctl() cannot be done and waits until the mutex is released.
This never happens, and the VM is stuck.

To avoid this deadlock, issue the ioctl on the same vCPU doing the
RTAS call.

The problem can be reproduced by booting a guest with several vCPUs
(the probability of hitting the problem is (n - 1) / n, n = # of CPUs),
and then by triggering a kernel crash with "echo c >/proc/sysrq-trigger".

On the reboot, the kernel hangs after:

...
[0.00] -
[0.00] ppc64_pft_size= 0x0
[0.00] phys_mem_size = 0x4800
[0.00] dcache_bsize  = 0x80
[0.00] icache_bsize  = 0x80
[0.00] cpu_features  = 0x0001c06f8f4f91a7
[0.00]   possible= 0x0003fbffcf5fb1a7
[0.00]   always  = 0x0003800081a1
[0.00] cpu_user_features = 0xdc0065c2 0xaee0
[0.00] mmu_features  = 0x3c006041
[0.00] firmware_features = 0x0085455a445f
[0.00] physical_start= 0x800
[0.00] -
[0.00] numa:   NODE_DATA [mem 0x47f33c80-0x47f3]

Fixes: ec010c00665b ("ppc/spapr: KVM FWNMI should not be enabled until guest 
requests it")
Cc: npig...@gmail.com
Signed-off-by: Laurent Vivier 
---
 hw/ppc/spapr_rtas.c  | 2 +-
 target/ppc/kvm.c | 3 +--
 target/ppc/kvm_ppc.h | 4 ++--
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index bcac0d00e7b6..513c7a84351b 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -438,7 +438,7 @@ static void rtas_ibm_nmi_register(PowerPCCPU *cpu,
 }
 
 if (kvm_enabled()) {
-if (kvmppc_set_fwnmi() < 0) {
+if (kvmppc_set_fwnmi(cpu) < 0) {
 rtas_st(rets, 0, RTAS_OUT_NOT_SUPPORTED);
 return;
 }
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 2692f76130aa..d85ba8ffe00b 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -2071,9 +2071,8 @@ bool kvmppc_get_fwnmi(void)
 return cap_fwnmi;
 }
 
-int kvmppc_set_fwnmi(void)
+int kvmppc_set_fwnmi(PowerPCCPU *cpu)
 {
-PowerPCCPU *cpu = POWERPC_CPU(first_cpu);
 CPUState *cs = CPU(cpu);
 
 return kvm_vcpu_enable_cap(cs, KVM_CAP_PPC_FWNMI, 0);
diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
index 701c0c262be2..72e05f1cd2fc 100644
--- a/target/ppc/kvm_ppc.h
+++ b/target/ppc/kvm_ppc.h
@@ -28,7 +28,7 @@ void kvmppc_set_papr(PowerPCCPU *cpu);
 int kvmppc_set_compat(PowerPCCPU *cpu, uint32_t compat_pvr);
 void kvmppc_set_mpic_proxy(PowerPCCPU *cpu, int mpic_proxy);
 bool kvmppc_get_fwnmi(void);
-int kvmppc_set_fwnmi(void);
+int kvmppc_set_fwnmi(PowerPCCPU *cpu);
 int kvmppc_smt_threads(void);
 void kvmppc_error_append_smt_possible_hint(Error *const *errp);
 int kvmppc_set_smt_threads(int smt);
@@ -169,7 +169,7 @@ static inline bool kvmppc_get_fwnmi(void)
 return false;
 }
 
-static inline int kvmppc_set_fwnmi(void)
+static inline int kvmppc_set_fwnmi(PowerPCCPU *cpu)
 {
 return -1;
 }
-- 
2.26.2




[PATCH-for-5.2] stubs/cmos: Use correct include

2020-07-24 Thread Philippe Mathieu-Daudé
cmos_get_fd_drive_type() is declared in "hw/block/fdc.h".
This currently works because "hw/i386/pc.h" happens to
include it. Simplify by including the correct header.

Fixes: 2055dbc1c9 ("acpi: move aml builder code for floppy device")
Signed-off-by: Philippe Mathieu-Daudé 
---
 stubs/cmos.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/stubs/cmos.c b/stubs/cmos.c
index 416cbe4055..3fdbae2c69 100644
--- a/stubs/cmos.c
+++ b/stubs/cmos.c
@@ -1,5 +1,5 @@
 #include "qemu/osdep.h"
-#include "hw/i386/pc.h"
+#include "hw/block/fdc.h"
 
 int cmos_get_fd_drive_type(FloppyDriveType fd0)
 {
-- 
2.21.3




[PATCH v3 for-5.1?? 00/21] Fix error handling during bitmap postcopy

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Hi all!

I'm resending this now, as Eric started to review v2 and it turned out
to be outdated.

Last time it was somehow postponed, and I thought that I'd
rebase it onto Max's
"[PATCH v2 0/3] migration: Add block-bitmap-mapping parameter"

Of course, if this goes into 5.1, Max's series will need a rebase, sorry
for this.

So we need to decide now, do we really want this in 5.1?

pros: it fixes real problems
cons: - it affects migration in an invasive way. Still, it may not be bad,
if we double-check that it doesn't affect migration when the dirty-bitmaps
migration capability is disabled (the default).
  - it seems there is at least one more crash, which Max found. I
  don't remember now whether I reproduced it on top of my series or
  not...

If we decide that it goes to 5.2, then again, let's decide which series
goes first. I'm OK either way, and can rebase this series, or rebase
Max's series myself if he will not have time (I feel responsible for all
this mess, sorry).

Max, please look at this series if you have some time; you are familiar
with the code now.



The original idea of bitmaps postcopy migration is that bitmaps are
non-critical data, and their loss is not a serious problem. So, using the
postcopy method, on any failure we should just drop unfinished bitmaps and
continue guest execution.

However, it doesn't work that way. It crashes, fails, or goes into the
postcopy-recovery feature. It does anything except the behavior we want.
This series fixes at least some problems with error handling during
bitmaps migration postcopy.



v3:
- 199-iotest improvements moved to the beginning of the series, v2:0018 already 
merged upstream.
- add r-b and t-b marks by Andrey and Eric

03: rename some variables [Andrey]
05,06: update commit msg
08: some changes due to rebase, still, keep Eric's and Andrey's r-bs
11: rebased, drop r-b
14: s/,/;/
15: handle cancelled at start of dirty_bitmap_load_start()
17: Modify error message a bit, keep r-bs
18: Rebased on 03 changes
20: add comment

v2:

Most of patches are new or changed a lot.
Only patches 06,07 mostly unchanged, just rebased on refactorings.

v1 was "[PATCH 0/7] Fix crashes on early shutdown during bitmaps postcopy"

Vladimir Sementsov-Ogievskiy (21):
  qemu-iotests/199: fix style
  qemu-iotests/199: drop extra constraints
  qemu-iotests/199: better catch postcopy time
  qemu-iotests/199: improve performance: set bitmap by discard
  qemu-iotests/199: change discard patterns
  qemu-iotests/199: increase postcopy period
  migration/block-dirty-bitmap: fix dirty_bitmap_mig_before_vm_start
  migration/block-dirty-bitmap: rename state structure types
  migration/block-dirty-bitmap: rename dirty_bitmap_mig_cleanup
  migration/block-dirty-bitmap: move mutex init to dirty_bitmap_mig_init
  migration/block-dirty-bitmap: refactor state global variables
  migration/block-dirty-bitmap: rename finish_lock to just lock
  migration/block-dirty-bitmap: simplify dirty_bitmap_load_complete
  migration/block-dirty-bitmap: keep bitmap state for all bitmaps
  migration/block-dirty-bitmap: relax error handling in incoming part
  migration/block-dirty-bitmap: cancel migration on shutdown
  migration/savevm: don't worry if bitmap migration postcopy failed
  qemu-iotests/199: prepare for new test-cases addition
  qemu-iotests/199: check persistent bitmaps
  qemu-iotests/199: add early shutdown case to bitmaps postcopy
  qemu-iotests/199: add source-killed case to bitmaps postcopy

 migration/migration.h  |   3 +-
 migration/block-dirty-bitmap.c | 458 +
 migration/migration.c  |  15 +-
 migration/savevm.c |  37 ++-
 tests/qemu-iotests/199 | 250 ++
 tests/qemu-iotests/199.out |   4 +-
 6 files changed, 535 insertions(+), 232 deletions(-)

-- 
2.21.0




[PATCH v3 04/21] qemu-iotests/199: improve performance: set bitmap by discard

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Discard dirties the dirty bitmap just as write does, but works faster. Let's
use it instead.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Tested-by: Eric Blake 
---
 tests/qemu-iotests/199 | 31 ---
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index dd6044768c..190e820b84 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -67,8 +67,10 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 os.mkfifo(fifo)
 qemu_img('create', '-f', iotests.imgfmt, disk_a, size)
 qemu_img('create', '-f', iotests.imgfmt, disk_b, size)
-self.vm_a = iotests.VM(path_suffix='a').add_drive(disk_a)
-self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b)
+self.vm_a = iotests.VM(path_suffix='a').add_drive(disk_a,
+  'discard=unmap')
+self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b,
+  'discard=unmap')
 self.vm_b.add_incoming("exec: cat '" + fifo + "'")
 self.vm_a.launch()
 self.vm_b.launch()
@@ -78,7 +80,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_b_events = []
 
 def test_postcopy(self):
-write_size = 0x4000
+discard_size = 0x4000
 granularity = 512
 chunk = 4096
 
@@ -86,25 +88,32 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
name='bitmap', granularity=granularity)
 self.assert_qmp(result, 'return', {})
 
+result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
+   node='drive0', name='bitmap')
+empty_sha256 = result['return']['sha256']
+
 s = 0
-while s < write_size:
-self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+while s < discard_size:
+self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
 s += 0x1
 s = 0x8000
-while s < write_size:
-self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+while s < discard_size:
+self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
 s += 0x1
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap')
 sha256 = result['return']['sha256']
 
+# Check, that updating the bitmap by discards works
+assert sha256 != empty_sha256
+
 result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
name='bitmap')
 self.assert_qmp(result, 'return', {})
 s = 0
-while s < write_size:
-self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+while s < discard_size:
+self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
 s += 0x1
 
 caps = [{'capability': 'dirty-bitmaps', 'state': True},
@@ -126,8 +135,8 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_b_events.append(event_resume)
 
 s = 0x8000
-while s < write_size:
-self.vm_b.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+while s < discard_size:
+self.vm_b.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
 s += 0x1
 
 match = {'data': {'status': 'completed'}}
-- 
2.21.0




[PATCH v3 05/21] qemu-iotests/199: change discard patterns

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
iotest 199 takes too long because of many discard operations. At the
same time, the postcopy period is very short, in spite of all these
efforts.

So, let's use fewer discards (with more interesting patterns) to
reduce test timing. In the next commit we'll increase the postcopy period.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Tested-by: Eric Blake 
---
 tests/qemu-iotests/199 | 44 +-
 1 file changed, 26 insertions(+), 18 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 190e820b84..da4dae01fb 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -30,6 +30,28 @@ size = '256G'
 fifo = os.path.join(iotests.test_dir, 'mig_fifo')
 
 
+GiB = 1024 * 1024 * 1024
+
+discards1 = (
+(0, GiB),
+(2 * GiB + 512 * 5, 512),
+(3 * GiB + 512 * 5, 512),
+(100 * GiB, GiB)
+)
+
+discards2 = (
+(3 * GiB + 512 * 8, 512),
+(4 * GiB + 512 * 8, 512),
+(50 * GiB, GiB),
+(100 * GiB + GiB // 2, GiB)
+)
+
+
+def apply_discards(vm, discards):
+for d in discards:
+vm.hmp_qemu_io('drive0', 'discard {} {}'.format(*d))
+
+
 def event_seconds(event):
 return event['timestamp']['seconds'] + \
 event['timestamp']['microseconds'] / 100.0
@@ -80,9 +102,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_b_events = []
 
 def test_postcopy(self):
-discard_size = 0x4000
 granularity = 512
-chunk = 4096
 
 result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
name='bitmap', granularity=granularity)
@@ -92,14 +112,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
node='drive0', name='bitmap')
 empty_sha256 = result['return']['sha256']
 
-s = 0
-while s < discard_size:
-self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-s += 0x1
-s = 0x8000
-while s < discard_size:
-self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-s += 0x1
+apply_discards(self.vm_a, discards1 + discards2)
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap')
@@ -111,10 +124,8 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
name='bitmap')
 self.assert_qmp(result, 'return', {})
-s = 0
-while s < discard_size:
-self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-s += 0x1
+
+apply_discards(self.vm_a, discards1)
 
 caps = [{'capability': 'dirty-bitmaps', 'state': True},
 {'capability': 'events', 'state': True}]
@@ -134,10 +145,7 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 event_resume = self.vm_b.event_wait('RESUME')
 self.vm_b_events.append(event_resume)
 
-s = 0x8000
-while s < discard_size:
-self.vm_b.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-s += 0x1
+apply_discards(self.vm_b, discards2)
 
 match = {'data': {'status': 'completed'}}
 event_complete = self.vm_b.event_wait('MIGRATION', match=match)
-- 
2.21.0




[PATCH v3 02/21] qemu-iotests/199: drop extra constraints

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
We don't need any specific format constraints here. Still keep qcow2
for two reasons:
1. No extra runs of a format-unrelated test
2. We will add some checks around persistent bitmaps in the future (these
   require qcow2)

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Tested-by: Eric Blake 
---
 tests/qemu-iotests/199 | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index de9ba8d94c..dda918450a 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -116,5 +116,4 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 
 if __name__ == '__main__':
-iotests.main(supported_fmts=['qcow2'], supported_cache_modes=['none'],
- supported_protocols=['file'])
+iotests.main(supported_fmts=['qcow2'])
-- 
2.21.0




[PATCH v3 01/21] qemu-iotests/199: fix style

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Mostly, satisfy pep8 complaints.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Tested-by: Eric Blake 
---
 tests/qemu-iotests/199 | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 40774eed74..de9ba8d94c 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -28,8 +28,8 @@ disk_b = os.path.join(iotests.test_dir, 'disk_b')
 size = '256G'
 fifo = os.path.join(iotests.test_dir, 'mig_fifo')
 
-class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
+class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 def tearDown(self):
 self.vm_a.shutdown()
 self.vm_b.shutdown()
@@ -54,7 +54,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
name='bitmap', granularity=granularity)
-self.assert_qmp(result, 'return', {});
+self.assert_qmp(result, 'return', {})
 
 s = 0
 while s < write_size:
@@ -71,7 +71,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
name='bitmap')
-self.assert_qmp(result, 'return', {});
+self.assert_qmp(result, 'return', {})
 s = 0
 while s < write_size:
 self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
@@ -104,15 +104,16 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_b.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
 s += 0x1
 
-result = self.vm_b.qmp('query-block');
+result = self.vm_b.qmp('query-block')
 while len(result['return'][0]['dirty-bitmaps']) > 1:
 time.sleep(2)
-result = self.vm_b.qmp('query-block');
+result = self.vm_b.qmp('query-block')
 
 result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap')
 
-self.assert_qmp(result, 'return/sha256', sha256);
+self.assert_qmp(result, 'return/sha256', sha256)
+
 
 if __name__ == '__main__':
 iotests.main(supported_fmts=['qcow2'], supported_cache_modes=['none'],
-- 
2.21.0




[PATCH v3 07/21] migration/block-dirty-bitmap: fix dirty_bitmap_mig_before_vm_start

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
There is no reason to use the _locked version of bdrv_enable_dirty_bitmap,
as we do not take this mutex beforehand. Moreover, the adjacent
bdrv_dirty_bitmap_enable_successor does lock the mutex.

Fixes: 58f72b965e9e1q
Cc: qemu-sta...@nongnu.org # v3.0
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
 migration/block-dirty-bitmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index b0dbf9eeed..0739f1259e 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -566,7 +566,7 @@ void dirty_bitmap_mig_before_vm_start(void)
 DirtyBitmapLoadBitmapState *b = item->data;
 
 if (b->migrated) {
-bdrv_enable_dirty_bitmap_locked(b->bitmap);
+bdrv_enable_dirty_bitmap(b->bitmap);
 } else {
 bdrv_dirty_bitmap_enable_successor(b->bitmap);
 }
-- 
2.21.0




[PATCH v3 09/21] migration/block-dirty-bitmap: rename dirty_bitmap_mig_cleanup

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Rename dirty_bitmap_mig_cleanup to dirty_bitmap_do_save_cleanup, to
stress that it belongs to the save part.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Reviewed-by: Eric Blake 
---
 migration/block-dirty-bitmap.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 1d57bff4f6..01a536d7d3 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -259,7 +259,7 @@ static void send_bitmap_bits(QEMUFile *f, SaveBitmapState *dbms,
 }
 
 /* Called with iothread lock taken.  */
-static void dirty_bitmap_mig_cleanup(void)
+static void dirty_bitmap_do_save_cleanup(void)
 {
 SaveBitmapState *dbms;
 
@@ -406,7 +406,7 @@ static int init_dirty_bitmap_migration(void)
 
 fail:
 g_hash_table_destroy(handled_by_blk);
-dirty_bitmap_mig_cleanup();
+dirty_bitmap_do_save_cleanup();
 
 return -1;
 }
@@ -445,7 +445,7 @@ static void bulk_phase(QEMUFile *f, bool limit)
 /* for SaveVMHandlers */
 static void dirty_bitmap_save_cleanup(void *opaque)
 {
-dirty_bitmap_mig_cleanup();
+dirty_bitmap_do_save_cleanup();
 }
 
 static int dirty_bitmap_save_iterate(QEMUFile *f, void *opaque)
@@ -480,7 +480,7 @@ static int dirty_bitmap_save_complete(QEMUFile *f, void *opaque)
 
 trace_dirty_bitmap_save_complete_finish();
 
-dirty_bitmap_mig_cleanup();
+dirty_bitmap_do_save_cleanup();
 return 0;
 }
 
-- 
2.21.0




[PATCH v3 03/21] qemu-iotests/199: better catch postcopy time

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
The test aims to test _postcopy_ migration, and wants to do some write
operations during postcopy time.

The test considers the migrate status=complete event on the source as the
start of postcopy. This is completely wrong: that event marks the completion
of the whole migration process. Let's instead consider the destination start
as the start of postcopy, and use the RESUME event for it.

Next, as the migration-finish criterion, let's use the migration
status=complete event on the target, as this method is closer to what libvirt
or another user will do than tracking the number of dirty bitmaps.

Finally, add a possibility to dump events for debugging. If we set debug to
True, we see that the actual postcopy period is very small relative to the
whole test duration (~0.2 seconds versus >40 seconds for me). This means the
test is very inefficient at what it is supposed to do. Let's improve it in
the following commits.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Tested-by: Eric Blake 
---
 tests/qemu-iotests/199 | 72 +-
 1 file changed, 57 insertions(+), 15 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index dda918450a..dd6044768c 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -20,17 +20,43 @@
 
 import os
 import iotests
-import time
 from iotests import qemu_img
 
+debug = False
+
 disk_a = os.path.join(iotests.test_dir, 'disk_a')
 disk_b = os.path.join(iotests.test_dir, 'disk_b')
 size = '256G'
 fifo = os.path.join(iotests.test_dir, 'mig_fifo')
 
 
+def event_seconds(event):
+return event['timestamp']['seconds'] + \
+event['timestamp']['microseconds'] / 100.0
+
+
+def event_dist(e1, e2):
+return event_seconds(e2) - event_seconds(e1)
+
+
 class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 def tearDown(self):
+if debug:
+self.vm_a_events += self.vm_a.get_qmp_events()
+self.vm_b_events += self.vm_b.get_qmp_events()
+for e in self.vm_a_events:
+e['vm'] = 'SRC'
+for e in self.vm_b_events:
+e['vm'] = 'DST'
+events = (self.vm_a_events + self.vm_b_events)
+events = [(e['timestamp']['seconds'],
+   e['timestamp']['microseconds'],
+   e['vm'],
+   e['event'],
+   e.get('data', '')) for e in events]
+for e in sorted(events):
+print('{}.{:06} {} {} {}'.format(*e))
+
 self.vm_a.shutdown()
 self.vm_b.shutdown()
 os.remove(disk_a)
@@ -47,6 +73,10 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_a.launch()
 self.vm_b.launch()
 
+# collect received events for debug
+self.vm_a_events = []
+self.vm_b_events = []
+
 def test_postcopy(self):
 write_size = 0x4000
 granularity = 512
@@ -77,15 +107,13 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
 s += 0x1
 
-bitmaps_cap = {'capability': 'dirty-bitmaps', 'state': True}
-events_cap = {'capability': 'events', 'state': True}
+caps = [{'capability': 'dirty-bitmaps', 'state': True},
+{'capability': 'events', 'state': True}]
 
-result = self.vm_a.qmp('migrate-set-capabilities',
-   capabilities=[bitmaps_cap, events_cap])
+result = self.vm_a.qmp('migrate-set-capabilities', capabilities=caps)
 self.assert_qmp(result, 'return', {})
 
-result = self.vm_b.qmp('migrate-set-capabilities',
-   capabilities=[bitmaps_cap])
+result = self.vm_b.qmp('migrate-set-capabilities', capabilities=caps)
 self.assert_qmp(result, 'return', {})
 
 result = self.vm_a.qmp('migrate', uri='exec:cat>' + fifo)
@@ -94,24 +122,38 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 result = self.vm_a.qmp('migrate-start-postcopy')
 self.assert_qmp(result, 'return', {})
 
-while True:
-event = self.vm_a.event_wait('MIGRATION')
-if event['data']['status'] == 'completed':
-break
+event_resume = self.vm_b.event_wait('RESUME')
+self.vm_b_events.append(event_resume)
 
 s = 0x8000
 while s < write_size:
 self.vm_b.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
 s += 0x1
 
+match = {'data': {'status': 'completed'}}
+event_complete = self.vm_b.event_wait('MIGRATION', match=match)
+self.vm_b_events.append(event_complete)
+
+# take queued event, should already been happened
+event_stop = self.vm_a.event_wait('STOP')
+self.vm_a_events.append(event_stop)
+
+downtime = event_dist(event_stop, event_resume)
+postcopy_time = event_dist(event_resume, event_complete)

[PATCH v3 14/21] migration/block-dirty-bitmap: keep bitmap state for all bitmaps

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Keep bitmap state for disabled bitmaps too. Keep the state until the
end of the process. It's needed for the following commit to implement
bitmap postcopy canceling.

To clean up the new list the following logic is used:
We need two events to consider a bitmap's migration finished:
1. a chunk with the DIRTY_BITMAP_MIG_FLAG_COMPLETE flag has been received
2. dirty_bitmap_mig_before_vm_start has been called
These two events may come in any order, so we track which one is last, and on
the last of them we remove the bitmap migration state from the list.
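
In pseudo-C, a minimal sketch of this rule (illustrative only, with made-up
helper and parameter names; the real code is split between
dirty_bitmap_load_complete() and before_vm_start_handle_item() below):

static void on_event(DBMLoadState *s, LoadBitmapState *b, bool complete_received)
{
    if (complete_received) {
        b->migrated = true;                    /* event 1, for this bitmap */
    } else {
        s->before_vm_start_handled = true;     /* event 2, global */
    }
    if (b->migrated && s->before_vm_start_handled) {
        /* whichever event came last drops the state from the list */
        s->bitmaps = g_slist_remove(s->bitmaps, b);
        g_free(b);
    }
}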

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
 migration/block-dirty-bitmap.c | 64 +++---
 1 file changed, 43 insertions(+), 21 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 405a259296..eb4ffeac4d 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -132,6 +132,7 @@ typedef struct LoadBitmapState {
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
 bool migrated;
+bool enabled;
 } LoadBitmapState;
 
 /* State of the dirty bitmap migration (DBM) during load process */
@@ -142,8 +143,10 @@ typedef struct DBMLoadState {
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
 
-GSList *enabled_bitmaps;
-QemuMutex lock; /* protect enabled_bitmaps */
+bool before_vm_start_handled; /* set in dirty_bitmap_mig_before_vm_start */
+
+GSList *bitmaps;
+QemuMutex lock; /* protect bitmaps */
 } DBMLoadState;
 
 typedef struct DBMState {
@@ -526,6 +529,7 @@ static int dirty_bitmap_load_start(QEMUFile *f, DBMLoadState *s)
 Error *local_err = NULL;
 uint32_t granularity = qemu_get_be32(f);
 uint8_t flags = qemu_get_byte(f);
+LoadBitmapState *b;
 
 if (s->bitmap) {
 error_report("Bitmap with the same name ('%s') already exists on "
@@ -552,45 +556,59 @@ static int dirty_bitmap_load_start(QEMUFile *f, DBMLoadState *s)
 
 bdrv_disable_dirty_bitmap(s->bitmap);
 if (flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED) {
-LoadBitmapState *b;
-
 bdrv_dirty_bitmap_create_successor(s->bitmap, &local_err);
 if (local_err) {
 error_report_err(local_err);
 return -EINVAL;
 }
-
-b = g_new(LoadBitmapState, 1);
-b->bs = s->bs;
-b->bitmap = s->bitmap;
-b->migrated = false;
-s->enabled_bitmaps = g_slist_prepend(s->enabled_bitmaps, b);
 }
 
+b = g_new(LoadBitmapState, 1);
+b->bs = s->bs;
+b->bitmap = s->bitmap;
+b->migrated = false;
+b->enabled = flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED;
+
+s->bitmaps = g_slist_prepend(s->bitmaps, b);
+
 return 0;
 }
 
-void dirty_bitmap_mig_before_vm_start(void)
+/*
+ * before_vm_start_handle_item
+ *
+ * g_slist_foreach helper
+ *
+ * item is LoadBitmapState*
+ * opaque is DBMLoadState*
+ */
+static void before_vm_start_handle_item(void *item, void *opaque)
 {
-DBMLoadState *s = &dbm_state.load;
-GSList *item;
-
-qemu_mutex_lock(&s->lock);
-
-for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
-LoadBitmapState *b = item->data;
+DBMLoadState *s = opaque;
+LoadBitmapState *b = item;
 
+if (b->enabled) {
 if (b->migrated) {
 bdrv_enable_dirty_bitmap(b->bitmap);
 } else {
 bdrv_dirty_bitmap_enable_successor(b->bitmap);
 }
+}
 
+if (b->migrated) {
+s->bitmaps = g_slist_remove(s->bitmaps, b);
 g_free(b);
 }
+}
 
-g_slist_free(s->enabled_bitmaps);
-s->enabled_bitmaps = NULL;
+void dirty_bitmap_mig_before_vm_start(void)
+{
+DBMLoadState *s = &dbm_state.load;
+qemu_mutex_lock(&s->lock);
+
+assert(!s->before_vm_start_handled);
+g_slist_foreach(s->bitmaps, before_vm_start_handle_item, s);
+s->before_vm_start_handled = true;
 
 qemu_mutex_unlock(&s->lock);
 }
@@ -607,11 +625,15 @@ static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
 bdrv_reclaim_dirty_bitmap(s->bitmap, &error_abort);
 }
 
-for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
+for (item = s->bitmaps; item; item = g_slist_next(item)) {
 LoadBitmapState *b = item->data;
 
 if (b->bitmap == s->bitmap) {
 b->migrated = true;
+if (s->before_vm_start_handled) {
+s->bitmaps = g_slist_remove(s->bitmaps, b);
+g_free(b);
+}
 break;
 }
 }
-- 
2.21.0




[PATCH v3 08/21] migration/block-dirty-bitmap: rename state structure types

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Rename the types to be symmetrical between the load and save parts, and shorter.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Reviewed-by: Eric Blake 
---
 migration/block-dirty-bitmap.c | 70 ++
 1 file changed, 37 insertions(+), 33 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 0739f1259e..1d57bff4f6 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -100,23 +100,25 @@
 /* 0x04 was "AUTOLOAD" flags on elder versions, no it is ignored */
 #define DIRTY_BITMAP_MIG_START_FLAG_RESERVED_MASK0xf8
 
-typedef struct DirtyBitmapMigBitmapState {
+/* State of one bitmap during save process */
+typedef struct SaveBitmapState {
 /* Written during setup phase. */
 BlockDriverState *bs;
 const char *node_name;
 BdrvDirtyBitmap *bitmap;
 uint64_t total_sectors;
 uint64_t sectors_per_chunk;
-QSIMPLEQ_ENTRY(DirtyBitmapMigBitmapState) entry;
+QSIMPLEQ_ENTRY(SaveBitmapState) entry;
 uint8_t flags;
 
 /* For bulk phase. */
 bool bulk_completed;
 uint64_t cur_sector;
-} DirtyBitmapMigBitmapState;
+} SaveBitmapState;
 
-typedef struct DirtyBitmapMigState {
-QSIMPLEQ_HEAD(, DirtyBitmapMigBitmapState) dbms_list;
+/* State of the dirty bitmap migration (DBM) during save process */
+typedef struct DBMSaveState {
+QSIMPLEQ_HEAD(, SaveBitmapState) dbms_list;
 
 bool bulk_completed;
 bool no_bitmaps;
@@ -124,23 +126,25 @@ typedef struct DirtyBitmapMigState {
 /* for send_bitmap_bits() */
 BlockDriverState *prev_bs;
 BdrvDirtyBitmap *prev_bitmap;
-} DirtyBitmapMigState;
+} DBMSaveState;
 
-typedef struct DirtyBitmapLoadState {
+/* State of the dirty bitmap migration (DBM) during load process */
+typedef struct DBMLoadState {
 uint32_t flags;
 char node_name[256];
 char bitmap_name[256];
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
-} DirtyBitmapLoadState;
+} DBMLoadState;
 
-static DirtyBitmapMigState dirty_bitmap_mig_state;
+static DBMSaveState dirty_bitmap_mig_state;
 
-typedef struct DirtyBitmapLoadBitmapState {
+/* State of one bitmap during load process */
+typedef struct LoadBitmapState {
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
 bool migrated;
-} DirtyBitmapLoadBitmapState;
+} LoadBitmapState;
 static GSList *enabled_bitmaps;
 QemuMutex finish_lock;
 
@@ -170,7 +174,7 @@ static void qemu_put_bitmap_flags(QEMUFile *f, uint32_t flags)
 qemu_put_byte(f, flags);
 }
 
-static void send_bitmap_header(QEMUFile *f, DirtyBitmapMigBitmapState *dbms,
+static void send_bitmap_header(QEMUFile *f, SaveBitmapState *dbms,
uint32_t additional_flags)
 {
 BlockDriverState *bs = dbms->bs;
@@ -199,19 +203,19 @@ static void send_bitmap_header(QEMUFile *f, DirtyBitmapMigBitmapState *dbms,
 }
 }
 
-static void send_bitmap_start(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
+static void send_bitmap_start(QEMUFile *f, SaveBitmapState *dbms)
 {
 send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_START);
 qemu_put_be32(f, bdrv_dirty_bitmap_granularity(dbms->bitmap));
 qemu_put_byte(f, dbms->flags);
 }
 
-static void send_bitmap_complete(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
+static void send_bitmap_complete(QEMUFile *f, SaveBitmapState *dbms)
 {
 send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_COMPLETE);
 }
 
-static void send_bitmap_bits(QEMUFile *f, DirtyBitmapMigBitmapState *dbms,
+static void send_bitmap_bits(QEMUFile *f, SaveBitmapState *dbms,
  uint64_t start_sector, uint32_t nr_sectors)
 {
 /* align for buffer_is_zero() */
@@ -257,7 +261,7 @@ static void send_bitmap_bits(QEMUFile *f, DirtyBitmapMigBitmapState *dbms,
 /* Called with iothread lock taken.  */
 static void dirty_bitmap_mig_cleanup(void)
 {
-DirtyBitmapMigBitmapState *dbms;
+SaveBitmapState *dbms;
 
 while ((dbms = QSIMPLEQ_FIRST(&dirty_bitmap_mig_state.dbms_list)) != NULL) {
 QSIMPLEQ_REMOVE_HEAD(&dirty_bitmap_mig_state.dbms_list, entry);
@@ -271,7 +275,7 @@ static void dirty_bitmap_mig_cleanup(void)
 static int add_bitmaps_to_list(BlockDriverState *bs, const char *bs_name)
 {
 BdrvDirtyBitmap *bitmap;
-DirtyBitmapMigBitmapState *dbms;
+SaveBitmapState *dbms;
 Error *local_err = NULL;
 
 FOR_EACH_DIRTY_BITMAP(bs, bitmap) {
@@ -309,7 +313,7 @@ static int add_bitmaps_to_list(BlockDriverState *bs, const char *bs_name)
 bdrv_ref(bs);
 bdrv_dirty_bitmap_set_busy(bitmap, true);
 
-dbms = g_new0(DirtyBitmapMigBitmapState, 1);
+dbms = g_new0(SaveBitmapState, 1);
 dbms->bs = bs;
 dbms->node_name = bs_name;
 dbms->bitmap = bitmap;
@@ -334,7 +338,7 @@ static int add_bitmaps_to_list(BlockDriverState *bs, const char *bs_name)
 static int init_dirty_bitmap_migration(void)
 {
 BlockDriverState *bs;
-DirtyBitmapMigBitmapState *dbms;
+SaveBitm

[PATCH v3 06/21] qemu-iotests/199: increase postcopy period

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
The test wants to force a bitmap postcopy. Still, the resulting
postcopy period is very small. Let's increase it by adding more
bitmaps to migrate. Also, test migration of disabled bitmaps.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Tested-by: Eric Blake 
---
 tests/qemu-iotests/199 | 58 --
 1 file changed, 39 insertions(+), 19 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index da4dae01fb..d8532e49da 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -103,29 +103,45 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 def test_postcopy(self):
 granularity = 512
+nb_bitmaps = 15
 
-result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
-   name='bitmap', granularity=granularity)
-self.assert_qmp(result, 'return', {})
+for i in range(nb_bitmaps):
+result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
+   name='bitmap{}'.format(i),
+   granularity=granularity)
+self.assert_qmp(result, 'return', {})
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
-   node='drive0', name='bitmap')
+   node='drive0', name='bitmap0')
 empty_sha256 = result['return']['sha256']
 
-apply_discards(self.vm_a, discards1 + discards2)
+apply_discards(self.vm_a, discards1)
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
-   node='drive0', name='bitmap')
-sha256 = result['return']['sha256']
+   node='drive0', name='bitmap0')
+discards1_sha256 = result['return']['sha256']
 
 # Check, that updating the bitmap by discards works
-assert sha256 != empty_sha256
+assert discards1_sha256 != empty_sha256
 
-result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
-   name='bitmap')
-self.assert_qmp(result, 'return', {})
+# We want to calculate resulting sha256. Do it in bitmap0, so, disable
+# other bitmaps
+for i in range(1, nb_bitmaps):
+result = self.vm_a.qmp('block-dirty-bitmap-disable', node='drive0',
+   name='bitmap{}'.format(i))
+self.assert_qmp(result, 'return', {})
 
-apply_discards(self.vm_a, discards1)
+apply_discards(self.vm_a, discards2)
+
+result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
+   node='drive0', name='bitmap0')
+all_discards_sha256 = result['return']['sha256']
+
+# Now, enable some bitmaps, to be updated during migration
+for i in range(2, nb_bitmaps, 2):
+result = self.vm_a.qmp('block-dirty-bitmap-enable', node='drive0',
+   name='bitmap{}'.format(i))
+self.assert_qmp(result, 'return', {})
 
 caps = [{'capability': 'dirty-bitmaps', 'state': True},
 {'capability': 'events', 'state': True}]
@@ -145,6 +161,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 event_resume = self.vm_b.event_wait('RESUME')
 self.vm_b_events.append(event_resume)
 
+# enabled bitmaps should be updated
 apply_discards(self.vm_b, discards2)
 
 match = {'data': {'status': 'completed'}}
@@ -158,7 +175,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 downtime = event_dist(event_stop, event_resume)
 postcopy_time = event_dist(event_resume, event_complete)
 
-# TODO: assert downtime * 10 < postcopy_time
+assert downtime * 10 < postcopy_time
 if debug:
 print('downtime:', downtime)
 print('postcopy_time:', postcopy_time)
@@ -166,12 +183,15 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 # Assert that bitmap migration is finished (check that successor bitmap
 # is removed)
 result = self.vm_b.qmp('query-block')
-assert len(result['return'][0]['dirty-bitmaps']) == 1
-
-# Check content of migrated (and updated by new writes) bitmap
-result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
-   node='drive0', name='bitmap')
-self.assert_qmp(result, 'return/sha256', sha256)
+assert len(result['return'][0]['dirty-bitmaps']) == nb_bitmaps
+
+# Check content of migrated bitmaps. Still, don't waste time checking
+# every bitmap
+for i in range(0, nb_bitmaps, 5):
+result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
+   node='drive0', name='bitmap{}'.format(i))
+sha256 = discards1_sha256 if i % 2 else all_discards_sh

[PATCH v3 10/21] migration/block-dirty-bitmap: move mutex init to dirty_bitmap_mig_init

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
There is no reason to keep two public init functions.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
 migration/migration.h  | 1 -
 migration/block-dirty-bitmap.c | 6 +-
 migration/migration.c  | 2 --
 3 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/migration/migration.h b/migration/migration.h
index f617960522..ab20c756f5 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -335,7 +335,6 @@ void migrate_send_rp_recv_bitmap(MigrationIncomingState *mis,
 void migrate_send_rp_resume_ack(MigrationIncomingState *mis, uint32_t value);
 
 void dirty_bitmap_mig_before_vm_start(void);
-void init_dirty_bitmap_incoming_migration(void);
 void migrate_add_address(SocketAddress *address);
 
 int foreach_not_ignored_block(RAMBlockIterFunc func, void *opaque);
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 01a536d7d3..4b67e4f4fb 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -148,11 +148,6 @@ typedef struct LoadBitmapState {
 static GSList *enabled_bitmaps;
 QemuMutex finish_lock;
 
-void init_dirty_bitmap_incoming_migration(void)
-{
-qemu_mutex_init(&finish_lock);
-}
-
 static uint32_t qemu_get_bitmap_flags(QEMUFile *f)
 {
 uint8_t flags = qemu_get_byte(f);
@@ -801,6 +796,7 @@ static SaveVMHandlers savevm_dirty_bitmap_handlers = {
 void dirty_bitmap_mig_init(void)
 {
 QSIMPLEQ_INIT(&dirty_bitmap_mig_state.dbms_list);
+qemu_mutex_init(&finish_lock);
 
 register_savevm_live("dirty-bitmap", 0, 1,
  &savevm_dirty_bitmap_handlers,
diff --git a/migration/migration.c b/migration/migration.c
index 2ed9923227..1c61428988 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -165,8 +165,6 @@ void migration_object_init(void)
 qemu_sem_init(¤t_incoming->postcopy_pause_sem_dst, 0);
 qemu_sem_init(¤t_incoming->postcopy_pause_sem_fault, 0);
 
-init_dirty_bitmap_incoming_migration();
-
 if (!migration_object_check(current_migration, &err)) {
 error_report_err(err);
 exit(1);
-- 
2.21.0




[PATCH v3 19/21] qemu-iotests/199: check persistent bitmaps

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Check that persistent bitmaps are not stored on the source and that bitmaps
are persistent on the destination.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
 tests/qemu-iotests/199 | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 355c0b2885..5fd34f0fcd 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -117,7 +117,8 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 for i in range(nb_bitmaps):
 result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
name='bitmap{}'.format(i),
-   granularity=granularity)
+   granularity=granularity,
+   persistent=True)
 self.assert_qmp(result, 'return', {})
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
@@ -193,6 +194,19 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 print('downtime:', downtime)
 print('postcopy_time:', postcopy_time)
 
+# check that there are no bitmaps stored on source
+self.vm_a_events += self.vm_a.get_qmp_events()
+self.vm_a.shutdown()
+self.vm_a.launch()
+check_bitmaps(self.vm_a, 0)
+
+# check that bitmaps are migrated and persistence works
+check_bitmaps(self.vm_b, nb_bitmaps)
+self.vm_b.shutdown()
+# recreate vm_b, so there is no incoming option, which prevents
+# loading bitmaps from disk
+self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b)
+self.vm_b.launch()
 check_bitmaps(self.vm_b, nb_bitmaps)
 
 # Check content of migrated bitmaps. Still, don't waste time checking
-- 
2.21.0




[PATCH v3 16/21] migration/block-dirty-bitmap: cancel migration on shutdown

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
If the target is turned off before postcopy has finished, the target crashes
because busy bitmaps are found at shutdown. Canceling the incoming migration
helps, as it removes all unfinished (and therefore busy) bitmaps.

Similarly, on the source we crash in bdrv_close_all(), which asserts that all
bdrv states are removed, because the bdrv states involved in dirty bitmap
migration are still referenced by it. So, we need to cancel the outgoing
migration as well.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
 migration/migration.h  |  2 ++
 migration/block-dirty-bitmap.c | 16 
 migration/migration.c  | 13 +
 3 files changed, 31 insertions(+)

diff --git a/migration/migration.h b/migration/migration.h
index ab20c756f5..6c6a931d0d 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -335,6 +335,8 @@ void migrate_send_rp_recv_bitmap(MigrationIncomingState *mis,
 void migrate_send_rp_resume_ack(MigrationIncomingState *mis, uint32_t value);
 
 void dirty_bitmap_mig_before_vm_start(void);
+void dirty_bitmap_mig_cancel_outgoing(void);
+void dirty_bitmap_mig_cancel_incoming(void);
 void migrate_add_address(SocketAddress *address);
 
 int foreach_not_ignored_block(RAMBlockIterFunc func, void *opaque);
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index c24d4614bf..a198ec7278 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -657,6 +657,22 @@ static void cancel_incoming_locked(DBMLoadState *s)
 s->bitmaps = NULL;
 }
 
+void dirty_bitmap_mig_cancel_outgoing(void)
+{
+dirty_bitmap_do_save_cleanup(&dbm_state.save);
+}
+
+void dirty_bitmap_mig_cancel_incoming(void)
+{
+DBMLoadState *s = &dbm_state.load;
+
+qemu_mutex_lock(&s->lock);
+
+cancel_incoming_locked(s);
+
+qemu_mutex_unlock(&s->lock);
+}
+
 static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
 {
 GSList *item;
diff --git a/migration/migration.c b/migration/migration.c
index 1c61428988..8fe36339db 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -188,6 +188,19 @@ void migration_shutdown(void)
  */
 migrate_fd_cancel(current_migration);
 object_unref(OBJECT(current_migration));
+
+/*
+ * Cancel outgoing migration of dirty bitmaps. It should
+ * at least unref used block nodes.
+ */
+dirty_bitmap_mig_cancel_outgoing();
+
+/*
+ * Cancel incoming migration of dirty bitmaps. Dirty bitmaps
+ * are non-critical data, and their loss never considered as
+ * something serious.
+ */
+dirty_bitmap_mig_cancel_incoming();
 }
 
 /* For outgoing */
-- 
2.21.0




[PATCH v3 18/21] qemu-iotests/199: prepare for new test-cases addition

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Move the future common part to the start_postcopy() method. Move checking
the number of bitmaps to check_bitmaps().

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
 tests/qemu-iotests/199 | 36 +++-
 1 file changed, 23 insertions(+), 13 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index d8532e49da..355c0b2885 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -29,6 +29,8 @@ disk_b = os.path.join(iotests.test_dir, 'disk_b')
 size = '256G'
 fifo = os.path.join(iotests.test_dir, 'mig_fifo')
 
+granularity = 512
+nb_bitmaps = 15
 
 GiB = 1024 * 1024 * 1024
 
@@ -61,6 +63,15 @@ def event_dist(e1, e2):
 return event_seconds(e2) - event_seconds(e1)
 
 
+def check_bitmaps(vm, count):
+result = vm.qmp('query-block')
+
+if count == 0:
+assert 'dirty-bitmaps' not in result['return'][0]
+else:
+assert len(result['return'][0]['dirty-bitmaps']) == count
+
+
 class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 def tearDown(self):
 if debug:
@@ -101,10 +112,8 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_a_events = []
 self.vm_b_events = []
 
-def test_postcopy(self):
-granularity = 512
-nb_bitmaps = 15
-
+def start_postcopy(self):
+""" Run migration until RESUME event on target. Return this event. """
 for i in range(nb_bitmaps):
 result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
name='bitmap{}'.format(i),
@@ -119,10 +128,10 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap0')
-discards1_sha256 = result['return']['sha256']
+self.discards1_sha256 = result['return']['sha256']
 
 # Check, that updating the bitmap by discards works
-assert discards1_sha256 != empty_sha256
+assert self.discards1_sha256 != empty_sha256
 
 # We want to calculate resulting sha256. Do it in bitmap0, so, disable
 # other bitmaps
@@ -135,7 +144,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap0')
-all_discards_sha256 = result['return']['sha256']
+self.all_discards_sha256 = result['return']['sha256']
 
 # Now, enable some bitmaps, to be updated during migration
 for i in range(2, nb_bitmaps, 2):
@@ -160,6 +169,10 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 event_resume = self.vm_b.event_wait('RESUME')
 self.vm_b_events.append(event_resume)
+return event_resume
+
+def test_postcopy_success(self):
+event_resume = self.start_postcopy()
 
 # enabled bitmaps should be updated
 apply_discards(self.vm_b, discards2)
@@ -180,18 +193,15 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 print('downtime:', downtime)
 print('postcopy_time:', postcopy_time)
 
-# Assert that bitmap migration is finished (check that successor bitmap
-# is removed)
-result = self.vm_b.qmp('query-block')
-assert len(result['return'][0]['dirty-bitmaps']) == nb_bitmaps
+check_bitmaps(self.vm_b, nb_bitmaps)
 
 # Check content of migrated bitmaps. Still, don't waste time checking
 # every bitmap
 for i in range(0, nb_bitmaps, 5):
 result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap{}'.format(i))
-sha256 = discards1_sha256 if i % 2 else all_discards_sha256
-self.assert_qmp(result, 'return/sha256', sha256)
+sha = self.discards1_sha256 if i % 2 else self.all_discards_sha256
+self.assert_qmp(result, 'return/sha256', sha)
 
 
 if __name__ == '__main__':
-- 
2.21.0




[PATCH v3 17/21] migration/savevm: don't worry if bitmap migration postcopy failed

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
First, if only bitmaps postcopy is enabled (not RAM postcopy),
postcopy_pause_incoming crashes on the assertion assert(mis->to_src_file).

And anyway, bitmaps postcopy is not prepared to be somehow recovered. The
original idea instead is that if bitmaps postcopy fails, we just lose some
bitmaps, which is not critical. So, on failure we just need to remove the
unfinished bitmaps, and the guest should continue execution on the
destination.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Andrey Shinkevich 
---
 migration/savevm.c | 37 -
 1 file changed, 32 insertions(+), 5 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 45c9dd9d8a..a843d202b5 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1813,6 +1813,9 @@ static void *postcopy_ram_listen_thread(void *opaque)
 MigrationIncomingState *mis = migration_incoming_get_current();
 QEMUFile *f = mis->from_src_file;
 int load_res;
+MigrationState *migr = migrate_get_current();
+
+object_ref(OBJECT(migr));
 
 migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_POSTCOPY_ACTIVE);
@@ -1839,11 +1842,24 @@ static void *postcopy_ram_listen_thread(void *opaque)
 
 trace_postcopy_ram_listen_thread_exit();
 if (load_res < 0) {
-error_report("%s: loadvm failed: %d", __func__, load_res);
 qemu_file_set_error(f, load_res);
-migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
-   MIGRATION_STATUS_FAILED);
-} else {
+dirty_bitmap_mig_cancel_incoming();
+if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING &&
+!migrate_postcopy_ram() && migrate_dirty_bitmaps())
+{
+error_report("%s: loadvm failed during postcopy: %d. All states "
+ "are migrated except dirty bitmaps. Some dirty "
+ "bitmaps may be lost, and present migrated dirty "
+ "bitmaps are correctly migrated and valid.",
+ __func__, load_res);
+load_res = 0; /* prevent further exit() */
+} else {
+error_report("%s: loadvm failed: %d", __func__, load_res);
+migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
+   MIGRATION_STATUS_FAILED);
+}
+}
+if (load_res >= 0) {
 /*
  * This looks good, but it's possible that the device loading in the
  * main thread hasn't finished yet, and so we might not be in 'RUN'
@@ -1879,6 +1895,8 @@ static void *postcopy_ram_listen_thread(void *opaque)
 mis->have_listen_thread = false;
 postcopy_state_set(POSTCOPY_INCOMING_END);
 
+object_unref(OBJECT(migr));
+
 return NULL;
 }
 
@@ -2437,6 +2455,8 @@ static bool postcopy_pause_incoming(MigrationIncomingState *mis)
 {
 trace_postcopy_pause_incoming();
 
+assert(migrate_postcopy_ram());
+
 /* Clear the triggered bit to allow one recovery */
 mis->postcopy_recover_triggered = false;
 
@@ -2521,15 +2541,22 @@ out:
 if (ret < 0) {
 qemu_file_set_error(f, ret);
 
+/* Cancel bitmaps incoming regardless of recovery */
+dirty_bitmap_mig_cancel_incoming();
+
 /*
  * If we are during an active postcopy, then we pause instead
  * of bail out to at least keep the VM's dirty data.  Note
  * that POSTCOPY_INCOMING_LISTENING stage is still not enough,
  * during which we're still receiving device states and we
  * still haven't yet started the VM on destination.
+ *
+ * Only RAM postcopy supports recovery. Still, if RAM postcopy is
+ * enabled, canceled bitmaps postcopy will not affect RAM postcopy
+ * recovering.
  */
 if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING &&
-postcopy_pause_incoming(mis)) {
+migrate_postcopy_ram() && postcopy_pause_incoming(mis)) {
 /* Reset f to point to the newly created channel */
 f = mis->from_src_file;
 goto retry;
-- 
2.21.0




[PATCH v3 12/21] migration/block-dirty-bitmap: rename finish_lock to just lock

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
finish_lock is a bad name, as the lock is used not only at the end of the process.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
 migration/block-dirty-bitmap.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 9b39e7aa2b..9194807b54 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -143,7 +143,7 @@ typedef struct DBMLoadState {
 BdrvDirtyBitmap *bitmap;
 
 GSList *enabled_bitmaps;
-QemuMutex finish_lock;
+QemuMutex lock; /* protect enabled_bitmaps */
 } DBMLoadState;
 
 typedef struct DBMState {
@@ -575,7 +575,7 @@ void dirty_bitmap_mig_before_vm_start(void)
 DBMLoadState *s = &dbm_state.load;
 GSList *item;
 
-qemu_mutex_lock(&s->finish_lock);
+qemu_mutex_lock(&s->lock);
 
 for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
 LoadBitmapState *b = item->data;
@@ -592,7 +592,7 @@ void dirty_bitmap_mig_before_vm_start(void)
 g_slist_free(s->enabled_bitmaps);
 s->enabled_bitmaps = NULL;
 
-qemu_mutex_unlock(&s->finish_lock);
+qemu_mutex_unlock(&s->lock);
 }
 
 static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
@@ -601,7 +601,7 @@ static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
 trace_dirty_bitmap_load_complete();
 bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
 
-qemu_mutex_lock(&s->finish_lock);
+qemu_mutex_lock(&s->lock);
 
 for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
 LoadBitmapState *b = item->data;
@@ -633,7 +633,7 @@ static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
 bdrv_dirty_bitmap_unlock(s->bitmap);
 }
 
-qemu_mutex_unlock(&s->finish_lock);
+qemu_mutex_unlock(&s->lock);
 }
 
 static int dirty_bitmap_load_bits(QEMUFile *f, DBMLoadState *s)
@@ -815,7 +815,7 @@ static SaveVMHandlers savevm_dirty_bitmap_handlers = {
 void dirty_bitmap_mig_init(void)
 {
 QSIMPLEQ_INIT(&dbm_state.save.dbms_list);
-qemu_mutex_init(&dbm_state.load.finish_lock);
+qemu_mutex_init(&dbm_state.load.lock);
 
 register_savevm_live("dirty-bitmap", 0, 1,
  &savevm_dirty_bitmap_handlers,
-- 
2.21.0




[PATCH v3 11/21] migration/block-dirty-bitmap: refactor state global variables

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Move all state variables into one global struct. Reduce global
variable usage by passing an opaque pointer where possible.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
 migration/block-dirty-bitmap.c | 179 ++---
 1 file changed, 99 insertions(+), 80 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 4b67e4f4fb..9b39e7aa2b 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -128,6 +128,12 @@ typedef struct DBMSaveState {
 BdrvDirtyBitmap *prev_bitmap;
 } DBMSaveState;
 
+typedef struct LoadBitmapState {
+BlockDriverState *bs;
+BdrvDirtyBitmap *bitmap;
+bool migrated;
+} LoadBitmapState;
+
 /* State of the dirty bitmap migration (DBM) during load process */
 typedef struct DBMLoadState {
 uint32_t flags;
@@ -135,18 +141,17 @@ typedef struct DBMLoadState {
 char bitmap_name[256];
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
+
+GSList *enabled_bitmaps;
+QemuMutex finish_lock;
 } DBMLoadState;
 
-static DBMSaveState dirty_bitmap_mig_state;
+typedef struct DBMState {
+DBMSaveState save;
+DBMLoadState load;
+} DBMState;
 
-/* State of one bitmap during load process */
-typedef struct LoadBitmapState {
-BlockDriverState *bs;
-BdrvDirtyBitmap *bitmap;
-bool migrated;
-} LoadBitmapState;
-static GSList *enabled_bitmaps;
-QemuMutex finish_lock;
+static DBMState dbm_state;
 
 static uint32_t qemu_get_bitmap_flags(QEMUFile *f)
 {
@@ -169,21 +174,21 @@ static void qemu_put_bitmap_flags(QEMUFile *f, uint32_t flags)
 qemu_put_byte(f, flags);
 }
 
-static void send_bitmap_header(QEMUFile *f, SaveBitmapState *dbms,
-   uint32_t additional_flags)
+static void send_bitmap_header(QEMUFile *f, DBMSaveState *s,
+   SaveBitmapState *dbms, uint32_t additional_flags)
 {
 BlockDriverState *bs = dbms->bs;
 BdrvDirtyBitmap *bitmap = dbms->bitmap;
 uint32_t flags = additional_flags;
 trace_send_bitmap_header_enter();
 
-if (bs != dirty_bitmap_mig_state.prev_bs) {
-dirty_bitmap_mig_state.prev_bs = bs;
+if (bs != s->prev_bs) {
+s->prev_bs = bs;
 flags |= DIRTY_BITMAP_MIG_FLAG_DEVICE_NAME;
 }
 
-if (bitmap != dirty_bitmap_mig_state.prev_bitmap) {
-dirty_bitmap_mig_state.prev_bitmap = bitmap;
+if (bitmap != s->prev_bitmap) {
+s->prev_bitmap = bitmap;
 flags |= DIRTY_BITMAP_MIG_FLAG_BITMAP_NAME;
 }
 
@@ -198,19 +203,22 @@ static void send_bitmap_header(QEMUFile *f, SaveBitmapState *dbms,
 }
 }
 
-static void send_bitmap_start(QEMUFile *f, SaveBitmapState *dbms)
+static void send_bitmap_start(QEMUFile *f, DBMSaveState *s,
+  SaveBitmapState *dbms)
 {
-send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_START);
+send_bitmap_header(f, s, dbms, DIRTY_BITMAP_MIG_FLAG_START);
 qemu_put_be32(f, bdrv_dirty_bitmap_granularity(dbms->bitmap));
 qemu_put_byte(f, dbms->flags);
 }
 
-static void send_bitmap_complete(QEMUFile *f, SaveBitmapState *dbms)
+static void send_bitmap_complete(QEMUFile *f, DBMSaveState *s,
+ SaveBitmapState *dbms)
 {
-send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_COMPLETE);
+send_bitmap_header(f, s, dbms, DIRTY_BITMAP_MIG_FLAG_COMPLETE);
 }
 
-static void send_bitmap_bits(QEMUFile *f, SaveBitmapState *dbms,
+static void send_bitmap_bits(QEMUFile *f, DBMSaveState *s,
+ SaveBitmapState *dbms,
  uint64_t start_sector, uint32_t nr_sectors)
 {
 /* align for buffer_is_zero() */
@@ -235,7 +243,7 @@ static void send_bitmap_bits(QEMUFile *f, SaveBitmapState *dbms,
 
 trace_send_bitmap_bits(flags, start_sector, nr_sectors, buf_size);
 
-send_bitmap_header(f, dbms, flags);
+send_bitmap_header(f, s, dbms, flags);
 
 qemu_put_be64(f, start_sector);
 qemu_put_be32(f, nr_sectors);
@@ -254,12 +262,12 @@ static void send_bitmap_bits(QEMUFile *f, SaveBitmapState *dbms,
 }
 
 /* Called with iothread lock taken.  */
-static void dirty_bitmap_do_save_cleanup(void)
+static void dirty_bitmap_do_save_cleanup(DBMSaveState *s)
 {
 SaveBitmapState *dbms;
 
-while ((dbms = QSIMPLEQ_FIRST(&dirty_bitmap_mig_state.dbms_list)) != NULL) {
-QSIMPLEQ_REMOVE_HEAD(&dirty_bitmap_mig_state.dbms_list, entry);
+while ((dbms = QSIMPLEQ_FIRST(&s->dbms_list)) != NULL) {
+QSIMPLEQ_REMOVE_HEAD(&s->dbms_list, entry);
 bdrv_dirty_bitmap_set_busy(dbms->bitmap, false);
 bdrv_unref(dbms->bs);
 g_free(dbms);
@@ -267,7 +275,8 @@ static void dirty_bitmap_do_save_cleanup(void)
 }
 
 /* Called with iothread lock taken. */
-static int add_bitmaps_to_list(BlockDriverState *bs, const char *bs_name)
+static int add_bitmaps_to_list(DBMSaveState *s, BlockDriverState *bs,
+   const char *bs_name

[PATCH v3 20/21] qemu-iotests/199: add early shutdown case to bitmaps postcopy

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Previous patches fixed two crashes which may occur on shutdown before
bitmaps postcopy has finished. Check that it works now.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
 tests/qemu-iotests/199 | 24 
 tests/qemu-iotests/199.out |  4 ++--
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 5fd34f0fcd..140930b2b1 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -217,6 +217,30 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 sha = self.discards1_sha256 if i % 2 else self.all_discards_sha256
 self.assert_qmp(result, 'return/sha256', sha)
 
+def test_early_shutdown_destination(self):
+self.start_postcopy()
+
+self.vm_b_events += self.vm_b.get_qmp_events()
+self.vm_b.shutdown()
+# recreate vm_b, so there is no incoming option, which prevents
+# loading bitmaps from disk
+self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b)
+self.vm_b.launch()
+check_bitmaps(self.vm_b, 0)
+
+# Bitmaps will be lost if we just shutdown the vm, as they are marked
+# to skip storing to disk when prepared for migration. And that's
+# correct, as actual data may be modified in target vm, so we play
+# safe.
+# Still, this mark would be taken away if we do 'cont', and bitmaps
+# become persistent again. (see iotest 169 for such behavior case)
+result = self.vm_a.qmp('query-status')
+assert not result['return']['running']
+self.vm_a_events += self.vm_a.get_qmp_events()
+self.vm_a.shutdown()
+self.vm_a.launch()
+check_bitmaps(self.vm_a, 0)
+
 
 if __name__ == '__main__':
 iotests.main(supported_fmts=['qcow2'])
diff --git a/tests/qemu-iotests/199.out b/tests/qemu-iotests/199.out
index ae1213e6f8..fbc63e62f8 100644
--- a/tests/qemu-iotests/199.out
+++ b/tests/qemu-iotests/199.out
@@ -1,5 +1,5 @@
-.
+..
 --
-Ran 1 tests
+Ran 2 tests
 
 OK
-- 
2.21.0




[PATCH v3 13/21] migration/block-dirty-bitmap: simplify dirty_bitmap_load_complete

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
The bdrv_enable_dirty_bitmap_locked() call does nothing: if we are in
postcopy, the bitmap successor must be enabled, and the reclaim operation
will enable the bitmap.

So, actually we just need to call _reclaim_ in both if branches, and keeping
the branches different only to add an assertion does not seem worthwhile. The
logic becomes simple: on load complete we do reclaim and that's all.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
 migration/block-dirty-bitmap.c | 25 -
 1 file changed, 4 insertions(+), 21 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 9194807b54..405a259296 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -603,6 +603,10 @@ static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
 
 qemu_mutex_lock(&s->lock);
 
+if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
+bdrv_reclaim_dirty_bitmap(s->bitmap, &error_abort);
+}
+
 for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
 LoadBitmapState *b = item->data;
 
@@ -612,27 +616,6 @@ static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
 }
 }
 
-if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
-bdrv_dirty_bitmap_lock(s->bitmap);
-if (s->enabled_bitmaps == NULL) {
-/* in postcopy */
-bdrv_reclaim_dirty_bitmap_locked(s->bitmap, &error_abort);
-bdrv_enable_dirty_bitmap_locked(s->bitmap);
-} else {
-/* target not started, successor must be empty */
-int64_t count = bdrv_get_dirty_count(s->bitmap);
-BdrvDirtyBitmap *ret = bdrv_reclaim_dirty_bitmap_locked(s->bitmap,
-NULL);
-/* bdrv_reclaim_dirty_bitmap can fail only on no successor (it
- * must be) or on merge fail, but merge can't fail when second
- * bitmap is empty
- */
-assert(ret == s->bitmap &&
-   count == bdrv_get_dirty_count(s->bitmap));
-}
-bdrv_dirty_bitmap_unlock(s->bitmap);
-}
-
 qemu_mutex_unlock(&s->lock);
 }
 
-- 
2.21.0




Re: [PATCH] MAINTAINERS: Add an entry to review Avocado based acceptance tests

2020-07-24 Thread Alex Bennée


Philippe Mathieu-Daudé  writes:

> Add an entry to allow reviewers to be notified when acceptance /
> integration tests are added or modified.
> The designated reviewers are not maintainers, subsystem maintainers
> are expected to merge their tests.
>
> Signed-off-by: Philippe Mathieu-Daudé 

Reviewed-by: Alex Bennée 

> ---
>  MAINTAINERS | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 3e7d9cb0a5..c2ae2bf390 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2923,6 +2923,14 @@ S: Maintained
>  F: tests/tcg/Makefile
>  F: tests/tcg/Makefile.include
>  
> +Acceptance (Integration) Testing with the Avocado framework
> +W: https://trello.com/b/6Qi1pxVn/avocado-qemu
> +R: Cleber Rosa 
> +R: Philippe Mathieu-Daudé 
> +R: Wainer dos Santos Moschetta 
> +S: Odd Fixes
> +F: tests/acceptance/
> +
>  Documentation
>  -
>  Build system architecture


-- 
Alex Bennée



[PATCH v3 21/21] qemu-iotests/199: add source-killed case to bitmaps postcopy

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Previous patches fix the behavior of bitmaps migration, so that errors
are handled by just removing unfinished bitmaps, rather than failing or
trying to recover postcopy migration. Add a corresponding test.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
---
 tests/qemu-iotests/199 | 15 +++
 tests/qemu-iotests/199.out |  4 ++--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 140930b2b1..58fad872a1 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -241,6 +241,21 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_a.launch()
 check_bitmaps(self.vm_a, 0)
 
+def test_early_kill_source(self):
+self.start_postcopy()
+
+self.vm_a_events = self.vm_a.get_qmp_events()
+self.vm_a.kill()
+
+self.vm_a.launch()
+
+match = {'data': {'status': 'completed'}}
+e_complete = self.vm_b.event_wait('MIGRATION', match=match)
+self.vm_b_events.append(e_complete)
+
+check_bitmaps(self.vm_a, 0)
+check_bitmaps(self.vm_b, 0)
+
 
 if __name__ == '__main__':
 iotests.main(supported_fmts=['qcow2'])
diff --git a/tests/qemu-iotests/199.out b/tests/qemu-iotests/199.out
index fbc63e62f8..8d7e996700 100644
--- a/tests/qemu-iotests/199.out
+++ b/tests/qemu-iotests/199.out
@@ -1,5 +1,5 @@
-..
+...
 --
-Ran 2 tests
+Ran 3 tests
 
 OK
-- 
2.21.0




[PATCH v3 15/21] migration/block-dirty-bitmap: relax error handling in incoming part

2020-07-24 Thread Vladimir Sementsov-Ogievskiy
Bitmap data is not critical, and we should not fail the migration (or use
postcopy recovery) because of a dirty-bitmaps migration failure. Instead we
should just lose the unfinished bitmaps.

Still, we have to report I/O stream violation errors, as they affect the
whole migration stream.
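
As a rough sketch of the intended behaviour (illustrative only, with a made-up
helper name, not the patch code itself): chunk payloads are always consumed
from the stream, and only afterwards dropped if the incoming state has been
cancelled.

static int load_bits_chunk(QEMUFile *f, DBMLoadState *s, uint64_t buf_size)
{
    g_autofree uint8_t *buf = g_malloc(buf_size);

    if (qemu_get_buffer(f, buf, buf_size) != buf_size) {
        return -EIO;        /* stream violation: still a hard error */
    }
    if (s->cancelled) {
        return 0;           /* payload consumed, but ignored */
    }
    /* ... deserialize buf into s->bitmap ... */
    return 0;
}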

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 migration/block-dirty-bitmap.c | 152 +
 1 file changed, 117 insertions(+), 35 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index eb4ffeac4d..c24d4614bf 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -145,6 +145,15 @@ typedef struct DBMLoadState {
 
 bool before_vm_start_handled; /* set in dirty_bitmap_mig_before_vm_start */
 
+/*
+ * cancelled
+ * Incoming migration is cancelled for some reason. That means that we
+ * still should read our chunks from migration stream, to not affect other
+ * migration objects (like RAM), but just ignore them and do not touch any
+ * bitmaps or nodes.
+ */
+bool cancelled;
+
 GSList *bitmaps;
 QemuMutex lock; /* protect bitmaps */
 } DBMLoadState;
@@ -531,6 +540,10 @@ static int dirty_bitmap_load_start(QEMUFile *f, DBMLoadState *s)
 uint8_t flags = qemu_get_byte(f);
 LoadBitmapState *b;
 
+if (s->cancelled) {
+return 0;
+}
+
 if (s->bitmap) {
 error_report("Bitmap with the same name ('%s') already exists on "
  "destination", bdrv_dirty_bitmap_name(s->bitmap));
@@ -613,13 +626,47 @@ void dirty_bitmap_mig_before_vm_start(void)
 qemu_mutex_unlock(&s->lock);
 }
 
+static void cancel_incoming_locked(DBMLoadState *s)
+{
+GSList *item;
+
+if (s->cancelled) {
+return;
+}
+
+s->cancelled = true;
+s->bs = NULL;
+s->bitmap = NULL;
+
+/* Drop all unfinished bitmaps */
+for (item = s->bitmaps; item; item = g_slist_next(item)) {
+LoadBitmapState *b = item->data;
+
+/*
+ * Bitmap must be unfinished, as finished bitmaps should already be
+ * removed from the list.
+ */
+assert(!s->before_vm_start_handled || !b->migrated);
+if (bdrv_dirty_bitmap_has_successor(b->bitmap)) {
+bdrv_reclaim_dirty_bitmap(b->bitmap, &error_abort);
+}
+bdrv_release_dirty_bitmap(b->bitmap);
+}
+
+g_slist_free_full(s->bitmaps, g_free);
+s->bitmaps = NULL;
+}
+
 static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
 {
 GSList *item;
 trace_dirty_bitmap_load_complete();
-bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
 
-qemu_mutex_lock(&s->lock);
+if (s->cancelled) {
+return;
+}
+
+bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
 
 if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
 bdrv_reclaim_dirty_bitmap(s->bitmap, &error_abort);
@@ -637,8 +684,6 @@ static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
 break;
 }
 }
-
-qemu_mutex_unlock(&s->lock);
 }
 
 static int dirty_bitmap_load_bits(QEMUFile *f, DBMLoadState *s)
@@ -650,15 +695,32 @@ static int dirty_bitmap_load_bits(QEMUFile *f, DBMLoadState *s)
 
 if (s->flags & DIRTY_BITMAP_MIG_FLAG_ZEROES) {
 trace_dirty_bitmap_load_bits_zeroes();
-bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte, nr_bytes,
- false);
+if (!s->cancelled) {
+bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte,
+ nr_bytes, false);
+}
 } else {
 size_t ret;
 uint8_t *buf;
 uint64_t buf_size = qemu_get_be64(f);
-uint64_t needed_size =
-bdrv_dirty_bitmap_serialization_size(s->bitmap,
- first_byte, nr_bytes);
+uint64_t needed_size;
+
+buf = g_malloc(buf_size);
+ret = qemu_get_buffer(f, buf, buf_size);
+if (ret != buf_size) {
+error_report("Failed to read bitmap bits");
+g_free(buf);
+return -EIO;
+}
+
+if (s->cancelled) {
+g_free(buf);
+return 0;
+}
+
+needed_size = bdrv_dirty_bitmap_serialization_size(s->bitmap,
+   first_byte,
+   nr_bytes);
 
 if (needed_size > buf_size ||
 buf_size > QEMU_ALIGN_UP(needed_size, 4 * sizeof(long))
@@ -667,15 +729,8 @@ static int dirty_bitmap_load_bits(QEMUFile *f, DBMLoadState *s)
 error_report("Migrated bitmap granularity doesn't "
  "match the destination bitmap '%s' granularity",
  bdrv_dirty_bitmap_name(s->bitmap));
-return -EINVAL;
-}
-
-buf = g_malloc(buf_size);
-ret = qemu_

Re: [PATCH v7 28/47] block/null: Implement bdrv_get_allocated_file_size

2020-07-24 Thread Max Reitz
On 20.07.20 17:10, Andrey Shinkevich wrote:
> On 25.06.2020 18:21, Max Reitz wrote:
>> It is trivial, so we might as well do it.
>>
>> Signed-off-by: Max Reitz 
>> ---
>>   block/null.c   | 7 +++
>>   tests/qemu-iotests/153.out | 2 +-
>>   tests/qemu-iotests/184.out | 6 --
>>   3 files changed, 12 insertions(+), 3 deletions(-)
>>
>> diff --git a/block/null.c b/block/null.c
>> index 15e1d56746..cc9b1d4ea7 100644
>> --- a/block/null.c
>> +++ b/block/null.c
>> @@ -262,6 +262,11 @@ static void
>> null_refresh_filename(BlockDriverState *bs)
>>    bs->drv->format_name);
>>   }
>>   +static int64_t null_allocated_file_size(BlockDriverState *bs)
>> +{
>> +    return 0;
>> +}
>> +
>>   static const char *const null_strong_runtime_opts[] = {
>>   BLOCK_OPT_SIZE,
>>   NULL_OPT_ZEROES,
>> @@ -277,6 +282,7 @@ static BlockDriver bdrv_null_co = {
>>   .bdrv_file_open = null_file_open,
>>   .bdrv_parse_filename    = null_co_parse_filename,
>>   .bdrv_getlength = null_getlength,
>> +    .bdrv_get_allocated_file_size = null_allocated_file_size,
>>     .bdrv_co_preadv = null_co_preadv,
>>   .bdrv_co_pwritev    = null_co_pwritev,
>> @@ -297,6 +303,7 @@ static BlockDriver bdrv_null_aio = {
>>   .bdrv_file_open = null_file_open,
>>   .bdrv_parse_filename    = null_aio_parse_filename,
>>   .bdrv_getlength = null_getlength,
>> +    .bdrv_get_allocated_file_size = null_allocated_file_size,
>>     .bdrv_aio_preadv    = null_aio_preadv,
>>   .bdrv_aio_pwritev   = null_aio_pwritev,
>> diff --git a/tests/qemu-iotests/153.out b/tests/qemu-iotests/153.out
>> index b2a90caa6b..8659e6463b 100644
>> --- a/tests/qemu-iotests/153.out
>> +++ b/tests/qemu-iotests/153.out
>> @@ -461,7 +461,7 @@ No conflict:
>>   image: null-co://
>>   file format: null-co
>>   virtual size: 1 GiB (1073741824 bytes)
>> -disk size: unavailable
>> +disk size: 0 B
>>     Conflict:
>>   qemu-img: --force-share/-U conflicts with image options
>> diff --git a/tests/qemu-iotests/184.out b/tests/qemu-iotests/184.out
>> index 3deb3cfb94..28b104da89 100644
>> --- a/tests/qemu-iotests/184.out
>> +++ b/tests/qemu-iotests/184.out
>> @@ -29,7 +29,8 @@ Testing:
>>   "image": {
>>   "virtual-size": 1073741824,
>>   "filename": "json:{\"throttle-group\": \"group0\",
>> \"driver\": \"throttle\", \"file\": {\"driver\": \"null-co\"}}",
>> -    "format": "throttle"
>> +    "format": "throttle",
>> +    "actual-size": SIZE
> 
> 
> If we remove the _filter_generated_node_ids() in the current
> implementation of the test #184, we will get '"actual-size": 0'. It
> might be more informative when analyzing the output in 184.out.

You mean _filter_actual_image_size?  Yeah, why not, it doesn’t seem
necessary here.

Max





Re: [PATCH 1/4] qom: make object_ref/unref use a void * instead of Object *.

2020-07-24 Thread Daniel P . Berrangé
On Thu, Jul 23, 2020 at 02:04:45PM -0500, Eric Blake wrote:
> On 7/23/20 1:14 PM, Daniel P. Berrangé wrote:
> > The object_ref/unref methods are intended for use with any subclass of
> > the base Object. Using "Object *" in the signature is not adding any
> > meaningful level of type safety, since callers simply use "OBJECT(ptr)"
> > and this expands to an unchecked cast "(Object *)".
> > 
> > By using "void *" we enable the object_unref() method to be used to
> > provide support for g_autoptr() with any subclass.
> > 
> > Signed-off-by: Daniel P. Berrangé 
> > ---
> >   include/qom/object.h | 4 ++--
> >   qom/object.c | 6 --
> >   2 files changed, 6 insertions(+), 4 deletions(-)
> 
> Is it worth a followup patch (probably with Coccinelle) that changes:
> 
> object_ref(OBJECT(dev));
> 
> to the now-simpler
> 
> object_ref(dev);

Yes, its worth a cleanup.

> But I don't think it belongs in this patch, so
> 
> Reviewed-by: Eric Blake 

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




[PATCH v4 0/4] Fix some PMP implementations

2020-07-24 Thread Zong Li
This patch set contains fixes for the wrong index of the pmpcfg CSR on RV64
and for the PMP range in the CSR function table. Since the 3rd version of
this patch series, we also fix PMP issues such as wrong physical address
translation and ignored PMP checking.

Changed in v4:
 - Refine the implementation. Suggested by Bin Meng.
 - Add a fix for PMP checking being ignored.

Changed in v3:
 - Refine the implementation. Suggested by Bin Meng.
 - Add a fix for wrong physical address translation.

Changed in v2:
 - Move out the shifting operation from loop. Suggested by Bin Meng.

Zong Li (4):
  target/riscv: Fix the range of pmpcfg of CSR function table
  target/riscv/pmp.c: Fix the index offset on RV64
  target/riscv: Fix the translation of physical address
  target/riscv: Change the TLB page size depends on PMP entries.

 target/riscv/cpu_helper.c | 13 +++--
 target/riscv/csr.c|  2 +-
 target/riscv/pmp.c| 60 +++
 target/riscv/pmp.h|  2 ++
 4 files changed, 73 insertions(+), 4 deletions(-)

-- 
2.27.0




[PATCH v4 2/4] target/riscv/pmp.c: Fix the index offset on RV64

2020-07-24 Thread Zong Li
On RV64, the reg_index is 2 (pmpcfg2 CSR) after the seventh pmp
entry; it is not 1 (pmpcfg1 CSR) as on RV32. In the original
implementation, the second parameter of pmp_write_cfg is
"reg_index * sizeof(target_ulong)", so for reg_index 2 we get a
result starting from 16, but we expect it to start from 8.
Separate the implementations for RV32 and RV64 respectively.
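
As a concrete illustration (assuming sizeof(target_ulong) == 8 on RV64):
pmpcfg0 holds the configuration bytes for PMP entries 0..7 and pmpcfg2
holds those for entries 8..15, so with reg_index == 2 the first entry
written should be entry 8, i.e. (reg_index >> 1) * 8 = 8, and not
reg_index * 8 = 16.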

Signed-off-by: Zong Li 
---
 target/riscv/pmp.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/target/riscv/pmp.c b/target/riscv/pmp.c
index 2a2b9f5363..e0161d6aab 100644
--- a/target/riscv/pmp.c
+++ b/target/riscv/pmp.c
@@ -310,6 +310,10 @@ void pmpcfg_csr_write(CPURISCVState *env, uint32_t 
reg_index,
 int i;
 uint8_t cfg_val;
 
+#if defined(TARGET_RISCV64)
+reg_index >>= 1;
+#endif
+
 trace_pmpcfg_csr_write(env->mhartid, reg_index, val);
 
 if ((reg_index & 1) && (sizeof(target_ulong) == 8)) {
@@ -335,6 +339,10 @@ target_ulong pmpcfg_csr_read(CPURISCVState *env, uint32_t 
reg_index)
 target_ulong cfg_val = 0;
 target_ulong val = 0;
 
+#if defined(TARGET_RISCV64)
+reg_index >>= 1;
+#endif
+
 for (i = 0; i < sizeof(target_ulong); i++) {
 val = pmp_read_cfg(env, (reg_index * sizeof(target_ulong)) + i);
 cfg_val |= (val << (i * 8));
-- 
2.27.0




[PATCH v4 4/4] target/riscv: Change the TLB page size depends on PMP entries.

2020-07-24 Thread Zong Li
The minimum granularity of PMP is 4 bytes, which is smaller than the
4KB page size; therefore, the PMP check would be ignored if the PMP
range doesn't start at a page-aligned address. This patch detects the
PMP entries and sets a smaller TLB page size if there is a PMP entry
that covers only part of the page.
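
For example (purely illustrative addresses): if a PMP entry covers only
0x80000010..0x8000001f, installing a full 4KB TLB entry for that page
would let later accesses to the rest of the page hit the TLB and skip
the PMP check; shrinking the TLB entry forces those accesses back
through the slow path, where they are checked.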

Signed-off-by: Zong Li 
---
 target/riscv/cpu_helper.c | 10 ++--
 target/riscv/pmp.c| 52 +++
 target/riscv/pmp.h|  2 ++
 3 files changed, 62 insertions(+), 2 deletions(-)

diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 08b069f0c9..b3013bc91e 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -693,6 +693,7 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, int 
size,
 bool first_stage_error = true;
 int ret = TRANSLATE_FAIL;
 int mode = mmu_idx;
+target_ulong tlb_size = 0;
 
 env->guest_phys_fault_addr = 0;
 
@@ -784,8 +785,13 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, int 
size,
 }
 
 if (ret == TRANSLATE_SUCCESS) {
-tlb_set_page(cs, address & TARGET_PAGE_MASK, pa & TARGET_PAGE_MASK,
- prot, mmu_idx, TARGET_PAGE_SIZE);
+if (pmp_is_range_in_tlb(env, pa & TARGET_PAGE_MASK, &tlb_size)) {
+tlb_set_page(cs, address & ~(tlb_size - 1), pa & ~(tlb_size - 1),
+ prot, mmu_idx, tlb_size);
+} else {
+tlb_set_page(cs, address & TARGET_PAGE_MASK, pa & TARGET_PAGE_MASK,
+ prot, mmu_idx, TARGET_PAGE_SIZE);
+}
 return true;
 } else if (probe) {
 return false;
diff --git a/target/riscv/pmp.c b/target/riscv/pmp.c
index e0161d6aab..a040cdd285 100644
--- a/target/riscv/pmp.c
+++ b/target/riscv/pmp.c
@@ -392,3 +392,55 @@ target_ulong pmpaddr_csr_read(CPURISCVState *env, uint32_t 
addr_index)
 
 return val;
 }
+
+/*
+ * Calculate the TLB size if the start address or the end address of
+ * the PMP entry is present in this TLB page.
+ */
+static target_ulong pmp_get_tlb_size(CPURISCVState *env, int pmp_index,
+target_ulong tlb_sa, target_ulong tlb_ea)
+{
+target_ulong pmp_sa = env->pmp_state.addr[pmp_index].sa;
+target_ulong pmp_ea = env->pmp_state.addr[pmp_index].ea;
+
+if (pmp_sa >= tlb_sa && pmp_ea <= tlb_ea) {
+return pmp_ea - pmp_sa + 1;
+}
+
+if (pmp_sa >= tlb_sa && pmp_sa <= tlb_ea && pmp_ea >= tlb_ea) {
+return tlb_ea - pmp_sa + 1;
+}
+
+if (pmp_ea <= tlb_ea && pmp_ea >= tlb_sa && pmp_sa <= tlb_sa) {
+return pmp_ea - tlb_sa + 1;
+}
+
+return 0;
+}
+
+/*
+ * Check if there is a PMP entry whose range covers this page. If so,
+ * try to find the minimum granularity for the TLB size.
+ */
+bool pmp_is_range_in_tlb(CPURISCVState *env, hwaddr tlb_sa,
+target_ulong *tlb_size)
+{
+int i;
+target_ulong val;
+target_ulong tlb_ea = (tlb_sa + TARGET_PAGE_SIZE - 1);
+
+for (i = 0; i < MAX_RISCV_PMPS; i++) {
+val = pmp_get_tlb_size(env, i, tlb_sa, tlb_ea);
+if (val) {
+if (*tlb_size == 0 || *tlb_size > val) {
+*tlb_size = val;
+}
+}
+}
+
+if (*tlb_size != 0) {
+return true;
+}
+
+return false;
+}
diff --git a/target/riscv/pmp.h b/target/riscv/pmp.h
index 8e19793132..c70f2ea4c4 100644
--- a/target/riscv/pmp.h
+++ b/target/riscv/pmp.h
@@ -60,5 +60,7 @@ void pmpaddr_csr_write(CPURISCVState *env, uint32_t 
addr_index,
 target_ulong pmpaddr_csr_read(CPURISCVState *env, uint32_t addr_index);
 bool pmp_hart_has_privs(CPURISCVState *env, target_ulong addr,
 target_ulong size, pmp_priv_t priv, target_ulong mode);
+bool pmp_is_range_in_tlb(CPURISCVState *env, hwaddr tlb_sa,
+target_ulong *tlb_size);
 
 #endif
-- 
2.27.0




[PATCH v4 3/4] target/riscv: Fix the translation of physical address

2020-07-24 Thread Zong Li
The real physical address should include the 12-bit page offset. The
missing offset also leads to wrong PMP checking: the minimum
granularity of PMP is 4 bytes, but we always get a physical address
that is 4KB aligned, which means we always use the start address of
the page to check PMP for all addresses within the same page.

Signed-off-by: Zong Li 
---
 target/riscv/cpu_helper.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 75d2ae3434..08b069f0c9 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -543,7 +543,8 @@ restart:
 /* for superpage mappings, make a fake leaf PTE for the TLB's
benefit. */
 target_ulong vpn = addr >> PGSHIFT;
-*physical = (ppn | (vpn & ((1L << ptshift) - 1))) << PGSHIFT;
+*physical = ((ppn | (vpn & ((1L << ptshift) - 1))) << PGSHIFT) |
+(addr & ~TARGET_PAGE_MASK);
 
 /* set permissions on the TLB entry */
 if ((pte & PTE_R) || ((pte & PTE_X) && mxr)) {
-- 
2.27.0




[PATCH v4 1/4] target/riscv: Fix the range of pmpcfg of CSR function table

2020-07-24 Thread Zong Li
The range of Physical Memory Protection should be from CSR_PMPCFG0
to CSR_PMPCFG3, not to CSR_PMPADDR9.

Signed-off-by: Zong Li 
Reviewed-by: Alistair Francis 
Reviewed-by: Bin Meng 
---
 target/riscv/csr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index ac01c835e1..6a96a01b1c 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -1353,7 +1353,7 @@ static riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
 [CSR_MTINST] =  { hmode,   read_mtinst,  write_mtinst 
},
 
 /* Physical Memory Protection */
-[CSR_PMPCFG0  ... CSR_PMPADDR9] =  { pmp,   read_pmpcfg,  write_pmpcfg   },
+[CSR_PMPCFG0  ... CSR_PMPCFG3]   = { pmp,   read_pmpcfg,  write_pmpcfg   },
 [CSR_PMPADDR0 ... CSR_PMPADDR15] = { pmp,   read_pmpaddr, write_pmpaddr  },
 
 /* Performance Counters */
-- 
2.27.0




Re: [PATCH 2/4] qom: provide convenient macros for declaring and defining types

2020-07-24 Thread Daniel P . Berrangé
On Thu, Jul 23, 2020 at 02:23:54PM -0500, Eric Blake wrote:
> On 7/23/20 1:14 PM, Daniel P. Berrangé wrote:
> > When creating new QOM types, there is a lot of boilerplate code that
> > must be repeated using a standard pattern. This is tedious to write
> > and liable to suffer from subtle inconsistencies. Thus it would
> > benefit from some simple automation.
> > 
> > QOM was loosely inspired by GLib's GObject, and indeed GObject suffers
> > from the same burden of boilerplate code, but has long provided a set of
> > macros to eliminate this burden in the source implementation. More
> > recently it has also provided a set of macros to eliminate this burden
> > in the header declaration.
> > 
> > In GLib there are the G_DECLARE_* and G_DEFINE_* family of macros
> > for the header declaration and source implementation respectively:
> > 
> >https://developer.gnome.org/gobject/stable/chapter-gobject.html
> >https://developer.gnome.org/gobject/stable/howto-gobject.html
> > 
> > This patch takes inspiration from GObject to provide the equivalent
> > functionality for QOM.
> > 
> 
> > 
> > IOW, in both cases the maintainer now only has to think about the
> > interesting part of the code which implements useful functionality
> > and avoids much of the boilerplate.
> > 
> > Signed-off-by: Daniel P. Berrangé 
> > ---
> >   include/qom/object.h | 277 +++
> >   1 file changed, 277 insertions(+)
> > 
> > diff --git a/include/qom/object.h b/include/qom/object.h
> > index 1f8aa2d48e..be64421089 100644
> > --- a/include/qom/object.h
> > +++ b/include/qom/object.h
> > @@ -304,6 +304,119 @@ typedef struct InterfaceInfo InterfaceInfo;
> >*
> >* The first example of such a QOM method was #CPUClass.reset,
> >* another example is #DeviceClass.realize.
> > + *
> > + * # Standard type declaration and definition macros #
> > + *
> > + * A lot of the code outlined above follows a standard pattern and naming
> > + * convention. To reduce the amount of boilerplate code that needs to be
> > + * written for a new type there are two sets of macros to generate the
> > + * common parts in a standard format.
> > + *
> > + * A type is declared using the OBJECT_DECLARE macro family. In types
> > + * which do not require any virtual functions in the class, the
> > + * OBJECT_DECLARE_SIMPLE_TYPE macro is suitable, and is commonly placed
> > + * in the header file:
> > + *
> > + * 
> > + *   Declaring a simple type
> > + *   
> > + * OBJECT_DECLARE_SIMPLE_TYPE(MyDevice, my_device, MY_DEVICE, DEVICE)
> 
> How sensitive is this macro to trailing semicolon?  Must the user omit it
> (as shown here), supply it (by tweaking the macro to be a syntax error if
> one is not supplied), or is it optional?  I guess whatever glib does is fine
> to copy, though.

Testing it appears it tolerates a ";" but GLib doesn't use it typically
in this case.

> 
> Hmm. I think you meant to use s/ DEVICE/ Device/ here...

Yes.

> 
> > + *   
> > + * 
> > + *
> > + * This is equivalent to the following:
> > + *
> > + * 
> > + *   Expansion from declaring a simple type
> > + *   
> > + * typedef struct MyDevice MyDevice;
> > + * typedef struct MyDeviceClass MyDeviceClass;
> > + *
> > + * G_DEFINE_AUTOPTR_CLEANUP_FUNC(MyDeviceClass, object_unref)
> > + *
> > + * #define MY_DEVICE_GET_CLASS(void *obj) \
> > + * OBJECT_GET_CLASS(MyDeviceClass, obj, TYPE_MY_DEVICE)
> 
> How'd you manage to invoke #define inside the OBJECT_DECLARE_SIMPLE_TYPE
> macro expansion?
> 
> /me reads ahead
> 
> Oh, you didn't; you used a static inline function instead.  But the effect
> is the same, so claiming the equivalence here, while slightly misleading, is
> not horrible.

Yes, I was in two minds here, whether to illustrate the real inline
function, or the macro. I choose the macro to make it clear what kind
of code is being replaced, rather than what kind of code it strictly
produces.

> > + * #define MY_DEVICE_CLASS(void *klass) \
> > + * OBJECT_CLASS_CHECK(MyDeviceClass, klass, TYPE_MY_DEVICE)
> > + * #define MY_DEVICE(void *obj)
> > + * OBJECT_CHECK(MyDevice, obj, TYPE_MY_DEVICE)
> > + *
> > + * struct MyDeviceClass {
> > + * DeviceClass parent_class;
> 
> ...given that this line is constructed as arg4##Class, and the fact that we
> have DeviceClass, not DEVICEClass.
> 
> > + * };
> > + *   
> > + * 
> > + *
> > + * The 'struct MyDevice' needs to be declared separately.
> > + * If the type requires virtual functions to be declared in the class
> > + * struct, then the alternative OBJECT_DECLARE_TYPE() macro can be
> > + * used. This does the same as OBJECT_DECLARE_SIMPLE_TYPE(), but without
> > + * the 'struct MyDeviceClass' definition.
> > + *
> > + * To implement the type, the OBJECT_DEFINE macro family is available.
> > + * In the simple case the OBJECT_DEFINE_TYPE macro is suitable:
> > + *
> > + * 
> > + *   Defining a simple type
> > + *   
> > + * 

Re: [PATCH 3/4] crypto: use QOM macros for declaration/definition of secret types

2020-07-24 Thread Daniel P . Berrangé
On Thu, Jul 23, 2020 at 02:50:06PM -0400, Eduardo Habkost wrote:
> On Thu, Jul 23, 2020 at 07:14:09PM +0100, Daniel P. Berrangé wrote:
> > This introduces the use of the OBJECT_DEFINE and OBJECT_DECLARE macro
> > families in the secret types, in order to eliminate boilerplate code.
> > 
> > Signed-off-by: Daniel P. Berrangé 
> > ---
> >  crypto/secret.c | 24 
> >  crypto/secret_common.c  | 32 +---
> >  crypto/secret_keyring.c | 28 +---
> >  include/crypto/secret.h | 11 ++-
> >  include/crypto/secret_common.h  | 13 ++---
> >  include/crypto/secret_keyring.h | 18 ++
> >  6 files changed, 28 insertions(+), 98 deletions(-)
> > 
> 
> Beautiful.
> 
> I wonder how hard it would be to automate this.  I'm assuming
> Coccinelle won't be able to deal with the macro definitions, but
> a handwritten conversion script would be really useful for
> dealing with our 1226 static TypeInfo structs.

Probably possible to do a reasonably good job with a perl script or
similar. The code patterns to be replaced are reasonably easy to
identify with a few regexes.
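
For instance (purely illustrative, not a worked-out conversion), a rough
count of the definition-side pattern can already be had with:

  git grep 'static const TypeInfo' | wc -l

and the FOO()/FOO_CLASS()/FOO_GET_CLASS() cast-macro trio in the headers
is similarly regular, so a script could match each with one regex before
rewriting them to the new OBJECT_DECLARE_*/OBJECT_DEFINE_* forms.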

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH] pseries: fix kvmppc_set_fwnmi()

2020-07-24 Thread Philippe Mathieu-Daudé
On 7/24/20 10:35 AM, Laurent Vivier wrote:
> QEMU issues the ioctl(KVM_CAP_PPC_FWNMI) on the first vCPU.
> 
> If the first vCPU is currently running, the vCPU mutex is held
> and the ioctl() cannot be done and waits until the mutex is released.
> This never happens and the VM is stuck.
> 
> To avoid this deadlock, issue the ioctl on the same vCPU doing the
> RTAS call.
> 
> The problem can be reproduced by booting a guest with several vCPUs
> (the probability to have the problem is (n - 1) / n,  n = # of CPUs),
> and then by triggering a kernel crash with "echo c >/proc/sysrq-trigger".
> 
> On the reboot, the kernel hangs after:
> 
> ...
> [0.00] -
> [0.00] ppc64_pft_size= 0x0
> [0.00] phys_mem_size = 0x4800
> [0.00] dcache_bsize  = 0x80
> [0.00] icache_bsize  = 0x80
> [0.00] cpu_features  = 0x0001c06f8f4f91a7
> [0.00]   possible= 0x0003fbffcf5fb1a7
> [0.00]   always  = 0x0003800081a1
> [0.00] cpu_user_features = 0xdc0065c2 0xaee0
> [0.00] mmu_features  = 0x3c006041
> [0.00] firmware_features = 0x0085455a445f
> [0.00] physical_start= 0x800
> [0.00] -
> [0.00] numa:   NODE_DATA [mem 0x47f33c80-0x47f3]
> 
> Fixes: ec010c00665b ("ppc/spapr: KVM FWNMI should not be enabled until guest 
> requests it")
> Cc: npig...@gmail.com
> Signed-off-by: Laurent Vivier 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  hw/ppc/spapr_rtas.c  | 2 +-
>  target/ppc/kvm.c | 3 +--
>  target/ppc/kvm_ppc.h | 4 ++--
>  3 files changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index bcac0d00e7b6..513c7a84351b 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -438,7 +438,7 @@ static void rtas_ibm_nmi_register(PowerPCCPU *cpu,
>  }
>  
>  if (kvm_enabled()) {
> -if (kvmppc_set_fwnmi() < 0) {
> +if (kvmppc_set_fwnmi(cpu) < 0) {
>  rtas_st(rets, 0, RTAS_OUT_NOT_SUPPORTED);
>  return;
>  }
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index 2692f76130aa..d85ba8ffe00b 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -2071,9 +2071,8 @@ bool kvmppc_get_fwnmi(void)
>  return cap_fwnmi;
>  }
>  
> -int kvmppc_set_fwnmi(void)
> +int kvmppc_set_fwnmi(PowerPCCPU *cpu)
>  {
> -PowerPCCPU *cpu = POWERPC_CPU(first_cpu);
>  CPUState *cs = CPU(cpu);
>  
>  return kvm_vcpu_enable_cap(cs, KVM_CAP_PPC_FWNMI, 0);
> diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
> index 701c0c262be2..72e05f1cd2fc 100644
> --- a/target/ppc/kvm_ppc.h
> +++ b/target/ppc/kvm_ppc.h
> @@ -28,7 +28,7 @@ void kvmppc_set_papr(PowerPCCPU *cpu);
>  int kvmppc_set_compat(PowerPCCPU *cpu, uint32_t compat_pvr);
>  void kvmppc_set_mpic_proxy(PowerPCCPU *cpu, int mpic_proxy);
>  bool kvmppc_get_fwnmi(void);
> -int kvmppc_set_fwnmi(void);
> +int kvmppc_set_fwnmi(PowerPCCPU *cpu);
>  int kvmppc_smt_threads(void);
>  void kvmppc_error_append_smt_possible_hint(Error *const *errp);
>  int kvmppc_set_smt_threads(int smt);
> @@ -169,7 +169,7 @@ static inline bool kvmppc_get_fwnmi(void)
>  return false;
>  }
>  
> -static inline int kvmppc_set_fwnmi(void)
> +static inline int kvmppc_set_fwnmi(PowerPCCPU *cpu)
>  {
>  return -1;
>  }
> 




Re: [PATCH v4 2/4] target/riscv/pmp.c: Fix the index offset on RV64

2020-07-24 Thread Bin Meng
Hi Zong,

On Fri, Jul 24, 2020 at 5:08 PM Zong Li  wrote:
>
> On RV64, the reg_index is 2 (pmpcfg2 CSR) after the seventh pmp
> entry, it is not 1 (pmpcfg1 CSR) like RV32. In the original
> implementation, the second parameter of pmp_write_cfg is
> "reg_index * sizeof(target_ulong)", and we get the the result
> which is started from 16 if reg_index is 2, but we expect that
> it should be started from 8. Separate the implementation for
> RV32 and RV64 respectively.
>
> Signed-off-by: Zong Li 
> ---
>  target/riscv/pmp.c | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/target/riscv/pmp.c b/target/riscv/pmp.c
> index 2a2b9f5363..e0161d6aab 100644
> --- a/target/riscv/pmp.c
> +++ b/target/riscv/pmp.c
> @@ -310,6 +310,10 @@ void pmpcfg_csr_write(CPURISCVState *env, uint32_t 
> reg_index,
>  int i;
>  uint8_t cfg_val;
>
> +#if defined(TARGET_RISCV64)
> +reg_index >>= 1;
> +#endif
> +
>  trace_pmpcfg_csr_write(env->mhartid, reg_index, val);
>
>  if ((reg_index & 1) && (sizeof(target_ulong) == 8)) {
> @@ -335,6 +339,10 @@ target_ulong pmpcfg_csr_read(CPURISCVState *env, 
> uint32_t reg_index)
>  target_ulong cfg_val = 0;
>  target_ulong val = 0;
>
> +#if defined(TARGET_RISCV64)
> +reg_index >>= 1;
> +#endif
> +
>  for (i = 0; i < sizeof(target_ulong); i++) {
>  val = pmp_read_cfg(env, (reg_index * sizeof(target_ulong)) + i);
>  cfg_val |= (val << (i * 8));
> --

It seems you missed addressing my review comments on v3? reg_index
should be shifted after we call the trace function.
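
I.e. keep the trace call on the raw CSR index and only halve it
afterwards, roughly like this (sketch of the suggestion only, untested):

    trace_pmpcfg_csr_write(env->mhartid, reg_index, val);

#if defined(TARGET_RISCV64)
    reg_index >>= 1;
#endif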

Regards,
Bin



Re: [PATCH v7 29/47] blockdev: Use CAF in external_snapshot_prepare()

2020-07-24 Thread Max Reitz
On 20.07.20 18:08, Andrey Shinkevich wrote:
> On 25.06.2020 18:21, Max Reitz wrote:
>> This allows us to differentiate between filters and nodes with COW
>> backing files: Filters cannot be used as overlays at all (for this
>> function).
>>
>> Signed-off-by: Max Reitz 
>> ---
>>   blockdev.c | 7 ++-
>>   1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/blockdev.c b/blockdev.c
>> index 1eb0fcdea2..aabe51036d 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -1549,7 +1549,12 @@ static void
>> external_snapshot_prepare(BlkActionState *common,
>>   goto out;
>>   }
>>   -    if (state->new_bs->backing != NULL) {
>> +    if (state->new_bs->drv->is_filter) {
> 
> 
> Is there a chance to get a filter here? If so, is that when a user
> specifies the file name of such a kind “filter[filter-name]:foo.qcow2”
> or somehow else?

It would be with blockdev-snapshot and by specifying a filter for
@overlay.  Technically that’s already caught by the check whether the
overlay supports backing images (whether drv->supports_backing is true),
but we might as well give a nicer error message.

Example:

{"execute":"qmp_capabilities"}

{"execute":"blockdev-add","arguments":
 {"node-name":"overlay","driver":"copy-on-read",
  "file":{"driver":"null-co"}}}

{"execute":"blockdev-add","arguments":
 {"node-name":"base","driver":"null-co"}}

{"execute":"blockdev-snapshot","arguments":
 {"node":"base","overlay":"overlay"}}

Max





[Bug 1888818] [NEW] Multi-queue vhost-user fails to reconnect with qemu version >=4.2

2020-07-24 Thread xuan
Public bug reported:

Test Environment:
DPDK version: DPDK v20.08
Other software versions: qemu4.2.0, qemu5.0.0.
OS: Linux 4.15.0-20-generic
Compiler: gcc (Ubuntu 7.3.0-16ubuntu3) 8.4.0
Hardware platform: Purley.
Test Setup
Steps to reproduce
List the steps to reproduce the issue.

Test flow
=
1. Launch vhost-user testpmd as port0 with 2 queues:

./x86_64-native-linuxapp-gcc/app/testpmd -l 2-4 -n 4 \
--file-prefix=vhost --vdev 'net_vhost0,iface=vhost-net,queues=2,client=1' 
-- -i --txd=1024 --rxd=1024 --txq=2 --rxq=2
testpmd>start

3. Launch qemu with virtio-net:

 taskset -c 13 \
qemu-system-x86_64 -name us-vhost-vm1 \
   -cpu host -enable-kvm -m 2048 -object 
memory-backend-file,id=mem,size=2048M,mem-path=/mnt/huge,share=on \
   -numa node,memdev=mem \
   -mem-prealloc -monitor unix:/tmp/vm2_monitor.sock,server,nowait -netdev 
user,id=yinan,hostfwd=tcp:127.0.0.1:6005-:22 -device e1000,netdev=yinan \
   -smp cores=1,sockets=1 -drive file=/home/osimg/ubuntu16.img  \
   -chardev socket,id=char0,path=./vhost-net,server \
   -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce,queues=2 \
   -device 
virtio-net-pci,mac=52:54:00:00:00:01,netdev=mynet1,mrg_rxbuf=on,csum=on,gso=on,host_tso4=on,guest_tso4=on,mq=on,vectors=15
 \
   -vnc :10 -daemonize

6. Quit testpmd and restart vhost-user :

testpmd>quit
./x86_64-native-linuxapp-gcc/app/testpmd -l 2-4 -n 4 \
--file-prefix=vhost --vdev 'net_vhost0,iface=vhost-net,queues=2,client=1' 
-- -i --txd=1024 --rxd=1024 --txq=2 --rxq=2

Expected Result:
After the vhost-user is killed then re-launched, the virtio-net can connect 
back to vhost-user again.

Actual Result:
Vhost-user relaunch failed with a continuous log printing: "VHOST_CONFIG:
Processing VHOST_USER_SET_FEATURES failed."

Analysis:
This is a regression bug; bad commit: c6beefd674f
When vhost-user quits, QEMU does not save the acked features for each
virtio-net device. When vhost-user reconnects to QEMU, QEMU sends two
different feature sets (one is the true acked features while the other
is 0x4000) to vhost-user successively, which causes vhost-user to
exit abnormally.

** Affects: qemu
 Importance: Undecided
 Status: New

** Summary changed:

- vhost-user/vitrio-net test fail to reconnect from vhost-user with qemu 
version >=4.2
+ vhost-user fail to reconnect from virtio-net with qemu version >=4.2

** Description changed:

  Test Environment:
  DPDK version: DPDK v20.08
- Other software versions: qemu4.2, qemu5.0.
+ Other software versions: qemu4.2.0, qemu5.0.0.
  OS: Linux 4.15.0-20-generic
  Compiler: gcc (Ubuntu 7.3.0-16ubuntu3) 8.4.0
  Hardware platform: Purley.
  Test Setup
  Steps to reproduce
  List the steps to reproduce the issue.
  
  Test flow
  =
  1. Launch vhost-user testpmd as port0 with 2 queues:
  
  ./x86_64-native-linuxapp-gcc/app/testpmd -l 2-4 -n 4 \
- --file-prefix=vhost --vdev 'net_vhost0,iface=vhost-net,queues=2,client=1' 
-- -i --txd=1024 --rxd=1024 --txq=2 --rxq=2
+ --file-prefix=vhost --vdev 'net_vhost0,iface=vhost-net,queues=2,client=1' 
-- -i --txd=1024 --rxd=1024 --txq=2 --rxq=2
  testpmd>start
  
  3. Launch qemu with virtio-net:
  
-  taskset -c 13 \
- qemu-system-x86_64 -name us-vhost-vm1 \
--cpu host -enable-kvm -m 2048 -object 
memory-backend-file,id=mem,size=2048M,mem-path=/mnt/huge,share=on \
--numa node,memdev=mem \
--mem-prealloc -monitor unix:/tmp/vm2_monitor.sock,server,nowait 
-netdev user,id=yinan,hostfwd=tcp:127.0.0.1:6005-:22 -device e1000,netdev=yinan 
\
--smp cores=1,sockets=1 -drive file=/home/osimg/ubuntu16.img  \
--chardev socket,id=char0,path=./vhost-net,server \
--netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce,queues=2 \
--device 
virtio-net-pci,mac=52:54:00:00:00:01,netdev=mynet1,mrg_rxbuf=on,csum=on,gso=on,host_tso4=on,guest_tso4=on,mq=on,vectors=15
 \
--vnc :10 -daemonize
+  taskset -c 13 \
+ qemu-system-x86_64 -name us-vhost-vm1 \
+    -cpu host -enable-kvm -m 2048 -object 
memory-backend-file,id=mem,size=2048M,mem-path=/mnt/huge,share=on \
+    -numa node,memdev=mem \
+    -mem-prealloc -monitor unix:/tmp/vm2_monitor.sock,server,nowait 
-netdev user,id=yinan,hostfwd=tcp:127.0.0.1:6005-:22 -device e1000,netdev=yinan 
\
+    -smp cores=1,sockets=1 -drive file=/home/osimg/ubuntu16.img  \
+    -chardev socket,id=char0,path=./vhost-net,server \
+    -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce,queues=2 \
+    -device 
virtio-net-pci,mac=52:54:00:00:00:01,netdev=mynet1,mrg_rxbuf=on,csum=on,gso=on,host_tso4=on,guest_tso4=on,mq=on,vectors=15
 \
+    -vnc :10 -daemonize
  
  6. Quit testpmd and restart vhost-user :
  
  testpmd>quit
  ./x86_64-native-linuxapp-gcc/app/testpmd -l 2-4 -n 4 \
- --file-prefix=vhost --vdev 'net_vhost0,iface=vhost-net,queues=2,client=1' 
-- -i --txd=1024 --rxd=1024 --txq=2 --rxq=2
- 
+ --file-prefix=vhost --

Re: [PATCH for-5.1] iotests: Select a default machine for the rx and avr targets

2020-07-24 Thread Philippe Mathieu-Daudé
Hi Max/Kevin,

On 7/24/20 10:24 AM, Max Reitz wrote:
> On 22.07.20 18:19, Thomas Huth wrote:
>> If you are building only with either the new rx-softmmu or avr-softmmu
>> target, "make check-block" fails a couple of tests since there is no
>> default machine defined in these new targets. We have to select a machine
>> in the "check" script for these, just like we already do for the arm- and
>> tricore-softmmu targets.

I guess I already asked this on IRC but can't find the log,
so better to ask again on the list.

Why can't we use the 'none' machine for the block tests? What
part of the machines is required? I was thinking maybe busses,
but apparently not (example with these 2 machines).

Thanks,

Phil.

>>
>> Signed-off-by: Thomas Huth 
>> ---
>>  tests/qemu-iotests/check | 14 +-
>>  1 file changed, 9 insertions(+), 5 deletions(-)
> 
> Thanks, applied to my block branch:
> 
> https://git.xanclic.moe/XanClic/qemu/commits/branch/block
> 




Re: [PATCH-for-5.1? v2] qapi/error: Check format string argument in error_*prepend()

2020-07-24 Thread Philippe Mathieu-Daudé
On 7/24/20 10:03 AM, Markus Armbruster wrote:
> Philippe Mathieu-Daudé  writes:
> 
>> error_propagate_prepend() "behaves like error_prepend()", and
>> error_prepend() uses "formatting @fmt, ... like printf()".
>> error_prepend() checks its format string argument, but
>> error_propagate_prepend() does not. Fix by addint the format
> 
> s/addint/adding/
> 
>> attribute to error_propagate_prepend() and error_vprepend().
>>
>> This would have caught the bug fixed in the previous commit:
>>
>> CC  hw/sd/milkymist-memcard.o
>>   hw/sd/milkymist-memcard.c: In function ‘milkymist_memcard_realize’:
>>   hw/sd/milkymist-memcard.c:284:70: error: format ‘%s’ expects a matching 
>> ‘char *’ argument [-Werror=format=]
>> 284 | error_propagate_prepend(errp, err, "failed to init SD 
>> card: %s");
>> |
>>  ~^
>> |
>>   |
>> |
>>   char *
> 
> I see no need to repeat the details here.  If you agree, I'll drop them
> in my tree.

OK, thanks.

> 
>> Missed in commit 4b5766488f "error: Fix use of error_prepend() with
>> &error_fatal, &error_abort".
>>
>> Inspired-by: Stefan Weil 
>> Suggested-by: Eric Blake 
>> Reviewed-by: Markus Armbruster 
>> Signed-off-by: Philippe Mathieu-Daudé 
> 
> Reviewed-by: Markus Armbruster 
> 




Re: [RFC PATCH] buildsys: Only build capstone if softmmu/user mode is enabled

2020-07-24 Thread Philippe Mathieu-Daudé
On 7/24/20 9:56 AM, Thomas Huth wrote:
> On 24/07/2020 09.16, Philippe Mathieu-Daudé wrote:
>> At least one of softmmu or user mode has to be enabled to use
>> capstone. If not, don't clone/built it.
>>
>> This save CI time for the tools/documentation-only build jobs.
>>
>> Signed-off-by: Philippe Mathieu-Daudé 
>> ---
>>  configure | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/configure b/configure
>> index 4bd80ed507..bc5757159a 100755
>> --- a/configure
>> +++ b/configure
>> @@ -5381,6 +5381,10 @@ fi
>>  ##
>>  # capstone
>>  
>> +if test -z "$capstone" && test $tcg = 'no' ; then # !tcg implies !softmmu
>> +  capstone="no"
>> +fi
> 
> I don't think this is right. You could have a KVM-only build where you
> still want to use the disassembler for the human monitor.

I had the same question about KVM; I agree this is unclear, which is
why I added the RFC tag.

Don't we have !softmmu implies !kvm?

> 
> But maybe it could be disabled if softmmu="no", linux_user="no" and
> bsd_user="no" ?
> 
>  Thomas
> 
> 



Re: Possible regression with VGA and resolutions in Windows 10?

2020-07-24 Thread Gerd Hoffmann
On Thu, Jul 23, 2020 at 04:24:06PM +0200, Aaron Lauterer wrote:
> Hi all,
> 
> I think we have a regression introduced in commit 0221d73ce6.
> 
> Once I start a Windows 10 VM (build 18363) with `-device VGA` I have only the 
> following resolutions to choose from instead of the much longer list:
> 
> 1920x1080
> 1024x768
> 800x600

That is probably vgabios gaining edid support.

The list should be longer though, the qemu edid block has more
resolutions included.  The qemu-edid tool is a command line
interface to the edid generator, for testing purposes.

Try "qemu-edid | edid-decode" to see the decoded edid data.

Linux guests have the raw edid block in sysfs, see
/sys/class/drm/card0/card0-Virtual-1/edid.

>   -device 'VGA,id=vga,vgamem_mb=32,bus=pci.0,addr=0x2' \

Try adding "xres=,yres=" of you want a specific
display resolution.

Try adding "edid=off" to return to previous behavior.

HTH & take care,
  Gerd




Re: [PATCH] MAINTAINERS: Add an entry to review Avocado based acceptance tests

2020-07-24 Thread Philippe Mathieu-Daudé
On 7/24/20 10:45 AM, Alex Bennée wrote:
> 
> Philippe Mathieu-Daudé  writes:
> 
>> Add an entry for to allow reviewers to be notified when acceptance /
>> integration tests are added or modified.
>> The designated reviewers are not maintainers, subsystem maintainers
>> are expected to merge their tests.
>>
>> Signed-off-by: Philippe Mathieu-Daudé 
> 
> Reviewed-by: Alex Bennée 

Thanks, but the v2 has already been merged (commit 6634f1c43d).
https://www.mail-archive.com/qemu-devel@nongnu.org/msg675105.html

> 
>> ---
>>  MAINTAINERS | 8 
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 3e7d9cb0a5..c2ae2bf390 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -2923,6 +2923,14 @@ S: Maintained
>>  F: tests/tcg/Makefile
>>  F: tests/tcg/Makefile.include
>>  
>> +Acceptance (Integration) Testing with the Avocado framework
>> +W: https://trello.com/b/6Qi1pxVn/avocado-qemu
>> +R: Cleber Rosa 
>> +R: Philippe Mathieu-Daudé 
>> +R: Wainer dos Santos Moschetta 
>> +S: Odd Fixes
>> +F: tests/acceptance/
>> +
>>  Documentation
>>  -
>>  Build system architecture
> 
> 




Re: [RFC PATCH] buildsys: Only build capstone if softmmu/user mode is enabled

2020-07-24 Thread Thomas Huth
On 24/07/2020 11.38, Philippe Mathieu-Daudé wrote:
> On 7/24/20 9:56 AM, Thomas Huth wrote:
>> On 24/07/2020 09.16, Philippe Mathieu-Daudé wrote:
>>> At least one of softmmu or user mode has to be enabled to use
>>> capstone. If not, don't clone/built it.
>>>
>>> This save CI time for the tools/documentation-only build jobs.
>>>
>>> Signed-off-by: Philippe Mathieu-Daudé 
>>> ---
>>>  configure | 4 
>>>  1 file changed, 4 insertions(+)
>>>
>>> diff --git a/configure b/configure
>>> index 4bd80ed507..bc5757159a 100755
>>> --- a/configure
>>> +++ b/configure
>>> @@ -5381,6 +5381,10 @@ fi
>>>  ##
>>>  # capstone
>>>  
>>> +if test -z "$capstone" && test $tcg = 'no' ; then # !tcg implies !softmmu
>>> +  capstone="no"
>>> +fi
>>
>> I don't think this is right. You could have a KVM-only build where you
>> still want to use the disassembler for the human monitor.
> 
> I had the same question with KVM, I agree this is unclear, this is why
> I added RFC.
> 
> Don't we have !softmmu implies !kvm?

Yes. You can not use KVM with linux-user builds.

But I think it's wrong to test for accelerators here. You could still
build a QEMU binary with HVF or HAX and still want to have disassembler
support in there.

 Thomas




Re: [RFC PATCH] buildsys: Only build capstone if softmmu/user mode is enabled

2020-07-24 Thread Philippe Mathieu-Daudé
On 7/24/20 11:38 AM, Philippe Mathieu-Daudé wrote:
> On 7/24/20 9:56 AM, Thomas Huth wrote:
>> On 24/07/2020 09.16, Philippe Mathieu-Daudé wrote:
>>> At least one of softmmu or user mode has to be enabled to use
>>> capstone. If not, don't clone/built it.
>>>
>>> This save CI time for the tools/documentation-only build jobs.
>>>
>>> Signed-off-by: Philippe Mathieu-Daudé 
>>> ---
>>>  configure | 4 
>>>  1 file changed, 4 insertions(+)
>>>
>>> diff --git a/configure b/configure
>>> index 4bd80ed507..bc5757159a 100755
>>> --- a/configure
>>> +++ b/configure
>>> @@ -5381,6 +5381,10 @@ fi
>>>  ##
>>>  # capstone
>>>  
>>> +if test -z "$capstone" && test $tcg = 'no' ; then # !tcg implies !softmmu
>>> +  capstone="no"
>>> +fi
>>
>> I don't think this is right. You could have a KVM-only build where you
>> still want to use the disassembler for the human monitor.
> 
> I had the same question with KVM, I agree this is unclear, this is why
> I added RFC.
> 
> Don't we have !softmmu implies !kvm?

It works because it falls back to the old disas.c (if capstone is
there, use it, else fall back).

Does this mean we can directly remove the capstone experiment &
submodule without waiting for the libllvm integration?

> 
>>
>> But maybe it could be disabled if softmmu="no", linux_user="no" and
>> bsd_user="no" ?
>>
>>  Thomas
>>
>>
> 



Re: [PATCH v7 28/47] block/null: Implement bdrv_get_allocated_file_size

2020-07-24 Thread Andrey Shinkevich

On 24.07.2020 11:58, Max Reitz wrote:

On 20.07.20 17:10, Andrey Shinkevich wrote:

On 25.06.2020 18:21, Max Reitz wrote:

It is trivial, so we might as well do it.

Signed-off-by: Max Reitz 
---
   block/null.c   | 7 +++
   tests/qemu-iotests/153.out | 2 +-
   tests/qemu-iotests/184.out | 6 --
   3 files changed, 12 insertions(+), 3 deletions(-)

...

diff --git a/tests/qemu-iotests/184.out b/tests/qemu-iotests/184.out
index 3deb3cfb94..28b104da89 100644
--- a/tests/qemu-iotests/184.out
+++ b/tests/qemu-iotests/184.out
@@ -29,7 +29,8 @@ Testing:
   "image": {
   "virtual-size": 1073741824,
   "filename": "json:{\"throttle-group\": \"group0\",
\"driver\": \"throttle\", \"file\": {\"driver\": \"null-co\"}}",
-    "format": "throttle"
+    "format": "throttle",
+    "actual-size": SIZE


If we remove the _filter_generated_node_ids() in the current
implementation of the test #184, we will get '"actual-size": 0'. It
might be more informative when analyzing the output in 184.out.

You mean _filter_actual_image_size?  Yeah, why not, it doesn’t seem
necessary here.

Max



Yes Max, you are right, I meant the _filter_actual_image_size. It was my
copy/paste mistake.


Andrey




Re: [PATCH 1/3] block/nbd: allow drain during reconnect attempt

2020-07-24 Thread Vladimir Sementsov-Ogievskiy

20.07.2020 12:00, Vladimir Sementsov-Ogievskiy wrote:

It should be safe to reenter qio_channel_yield() on io/channel read/write
path, so it's safe to reduce in_flight and allow attaching new aio
context. And no problem to allow drain itself: connection attempt is
not a guest request. Moreover, if remote server is down, we can hang
in negotiation, blocking drain section and provoking a dead lock.

How to reproduce the dead lock:

1. Create nbd-fault-injector.conf with the following contents:

[inject-error "mega1"]
event=data
io=readwrite
when=before

2. In one terminal run nbd-fault-injector in a loop, like this:

n=1; while true; do
 echo $n; ((n++));
 ./nbd-fault-injector.py 127.0.0.1:1 nbd-fault-injector.conf;
done

3. In another terminal run qemu-io in a loop, like this:

n=1; while true; do
 echo $n; ((n++));
 ./qemu-io -c 'read 0 512' nbd+tcp://127.0.0.1:1;
done

After some time, qemu-io will hang trying to drain, for example, like
this:

  #3 aio_poll (ctx=0x55f006bdd890, blocking=true) at
 util/aio-posix.c:600
  #4 bdrv_do_drained_begin (bs=0x55f006bea710, recursive=false,
 parent=0x0, ignore_bds_parents=false, poll=true) at block/io.c:427
  #5 bdrv_drained_begin (bs=0x55f006bea710) at block/io.c:433
  #6 blk_drain (blk=0x55f006befc80) at block/block-backend.c:1710
  #7 blk_unref (blk=0x55f006befc80) at block/block-backend.c:498
  #8 bdrv_open_inherit (filename=0x7fffba1563bc
 "nbd+tcp://127.0.0.1:1", reference=0x0, options=0x55f006be86d0,
 flags=24578, parent=0x0, child_class=0x0, child_role=0,
 errp=0x7fffba154620) at block.c:3491
  #9 bdrv_open (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:1",
 reference=0x0, options=0x0, flags=16386, errp=0x7fffba154620) at
 block.c:3513
  #10 blk_new_open (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:1",
 reference=0x0, options=0x0, flags=16386, errp=0x7fffba154620) at
 block/block-backend.c:421

And connection_co stack like this:

  #0 qemu_coroutine_switch (from_=0x55f006bf2650, to_=0x7fe96e07d918,
 action=COROUTINE_YIELD) at util/coroutine-ucontext.c:302
  #1 qemu_coroutine_yield () at util/qemu-coroutine.c:193
  #2 qio_channel_yield (ioc=0x55f006bb3c20, condition=G_IO_IN) at
 io/channel.c:472
  #3 qio_channel_readv_all_eof (ioc=0x55f006bb3c20, iov=0x7fe96d729bf0,
 niov=1, errp=0x7fe96d729eb0) at io/channel.c:110
  #4 qio_channel_readv_all (ioc=0x55f006bb3c20, iov=0x7fe96d729bf0,
 niov=1, errp=0x7fe96d729eb0) at io/channel.c:143
  #5 qio_channel_read_all (ioc=0x55f006bb3c20, buf=0x7fe96d729d28
 "\300.\366\004\360U", buflen=8, errp=0x7fe96d729eb0) at
 io/channel.c:247
  #6 nbd_read (ioc=0x55f006bb3c20, buffer=0x7fe96d729d28, size=8,
 desc=0x55f004f69644 "initial magic", errp=0x7fe96d729eb0) at
 /work/src/qemu/master/include/block/nbd.h:365
  #7 nbd_read64 (ioc=0x55f006bb3c20, val=0x7fe96d729d28,
 desc=0x55f004f69644 "initial magic", errp=0x7fe96d729eb0) at
 /work/src/qemu/master/include/block/nbd.h:391
  #8 nbd_start_negotiate (aio_context=0x55f006bdd890,
 ioc=0x55f006bb3c20, tlscreds=0x0, hostname=0x0,
 outioc=0x55f006bf19f8, structured_reply=true,
 zeroes=0x7fe96d729dca, errp=0x7fe96d729eb0) at nbd/client.c:904
  #9 nbd_receive_negotiate (aio_context=0x55f006bdd890,
 ioc=0x55f006bb3c20, tlscreds=0x0, hostname=0x0,
 outioc=0x55f006bf19f8, info=0x55f006bf1a00, errp=0x7fe96d729eb0) at
 nbd/client.c:1032
  #10 nbd_client_connect (bs=0x55f006bea710, errp=0x7fe96d729eb0) at
 block/nbd.c:1460
  #11 nbd_reconnect_attempt (s=0x55f006bf19f0) at block/nbd.c:287
  #12 nbd_co_reconnect_loop (s=0x55f006bf19f0) at block/nbd.c:309
  #13 nbd_connection_entry (opaque=0x55f006bf19f0) at block/nbd.c:360
  #14 coroutine_trampoline (i0=113190480, i1=22000) at
 util/coroutine-ucontext.c:173

Note, that the hang may be
triggered by another bug, so the whole case is fixed only together with
commit "block/nbd: on shutdown terminate connection attempt".

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd.c | 11 +++
  1 file changed, 11 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c
index 65a4f56924..49254f1c3c 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -280,7 +280,18 @@ static coroutine_fn void 
nbd_reconnect_attempt(BDRVNBDState *s)
  s->ioc = NULL;
  }
  
+bdrv_dec_in_flight(s->bs);

  s->connect_status = nbd_client_connect(s->bs, &local_err);


Hmm. This inserts a yield between setting connect_status and connect_err.
I don't know whether it can be a problem, but at least it looks like bad
design. Better to use a local ret here and set connect_status together
with connect_err after the yield.
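
Something along these lines (only a sketch of the suggestion, untested):

    ret = nbd_client_connect(s->bs, &local_err);

    /* ... drop in_flight and wait for a possible drained section ... */

    s->connect_status = ret;
    error_free(s->connect_err);
    s->connect_err = local_err;
    local_err = NULL;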


+s->wait_drained_end = true;
+while (s->drained) {
+/*
+ * We may be entered once from nbd_client_attach_aio_context_bh
+ * and then from nbd_client_co_drain_end. So here is a loop.
+ */
+qemu_coroutine_yield();
+}
+bdrv_inc_in_flight(s->bs);
+
  error_free(s-

Re: [PATCH v7 33/47] mirror: Deal with filters

2020-07-24 Thread Max Reitz
On 22.07.20 20:31, Andrey Shinkevich wrote:
> On 25.06.2020 18:22, Max Reitz wrote:
>> This includes some permission limiting (for example, we only need to
>> take the RESIZE permission for active commits where the base is smaller
>> than the top).
>>
>> Use this opportunity to rename qmp_drive_mirror()'s "source" BDS to
>> "target_backing_bs", because that is what it really refers to.
>>
>> Signed-off-by: Max Reitz 
>> ---
>>   qapi/block-core.json |   6 ++-
>>   block/mirror.c   | 118 +--
>>   blockdev.c   |  36 +
>>   3 files changed, 121 insertions(+), 39 deletions(-)
>>
> ...
>> diff --git a/block/mirror.c b/block/mirror.c
>> index 469acf4600..770de3b34e 100644
>> --- a/block/mirror.c
>> +++ b/block/mirror.c
>> @@ -42,6 +42,7 @@ typedef struct MirrorBlockJob {
>>   BlockBackend *target;
>>   BlockDriverState *mirror_top_bs;
>>   BlockDriverState *base;
>> +    BlockDriverState *base_overlay;
>>     /* The name of the graph node to replace */
>>   char *replaces;
>> @@ -677,8 +678,10 @@ static int mirror_exit_common(Job *job)
>>    &error_abort);
>>   if (!abort && s->backing_mode == MIRROR_SOURCE_BACKING_CHAIN) {
>>   BlockDriverState *backing = s->is_none_mode ? src : s->base;
>> -    if (backing_bs(target_bs) != backing) {
>> -    bdrv_set_backing_hd(target_bs, backing, &local_err);
>> +    BlockDriverState *unfiltered_target =
>> bdrv_skip_filters(target_bs);
>> +
>> +    if (bdrv_cow_bs(unfiltered_target) != backing) {
> 
> 
> I just worry about a filter node of the concurrent job right below the
> unfiltered_target.

Having a concurrent job on the target sounds extremely problematic in
itself (because at least for most of the mirror job, the target isn’t in
a consistent state).  Is that a real use case?

> The filter has unfiltered_target in its parent list.
> Will that filter node be replaced correctly then?

I’m also not quite sure what you mean.  We need to attach the source’s
backing chain to the target here, so we go down to the first node that
might accept COW backing files (by invoking bdrv_skip_filters()).  That
should be correct no matter what kind of filters are on it.
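
(Illustration only: if the target chain is a copy-on-read filter on top
of a qcow2 node, bdrv_skip_filters(target) returns the qcow2 node, and
that is the node whose backing link gets set; any filters above it are
left in place.)
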
>> +    /*
>> + * The topmost node with
>> + * bdrv_skip_filters(filtered_target) ==
>> bdrv_skip_filters(target)
>> + */
>> +    filtered_target = bdrv_cow_bs(bdrv_find_overlay(bs, target));
>> +
>> +    assert(bdrv_skip_filters(filtered_target) ==
>> +   bdrv_skip_filters(target));
>> +
>> +    /*
>> + * XXX BLK_PERM_WRITE needs to be allowed so we don't block
>> + * ourselves at s->base (if writes are blocked for a node,
>> they are
>> + * also blocked for its backing file). The other options
>> would be a
>> + * second filter driver above s->base (== target).
>> + */
>> +    iter_shared_perms = BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE;
>> +
>> +    for (iter = bdrv_filter_or_cow_bs(bs); iter != target;
>> + iter = bdrv_filter_or_cow_bs(iter))
>> +    {
>> +    if (iter == filtered_target) {
> 
> 
> For one filter node only?

No, iter_shared_perms is never reset, so it retains the
BLK_PERM_CONSISTENT_READ flag until the end of the loop.

>> +    /*
>> + * From here on, all nodes are filters on the base.
>> + * This allows us to share BLK_PERM_CONSISTENT_READ.
>> + */
>> +    iter_shared_perms |= BLK_PERM_CONSISTENT_READ;
>> +    }
>> +
>>   ret = block_job_add_bdrv(&s->common, "intermediate
>> node", iter, 0,
>> - BLK_PERM_WRITE_UNCHANGED |
>> BLK_PERM_WRITE,
>> - errp);
>> + iter_shared_perms, errp);
>>   if (ret < 0) {
>>   goto fail;
>>   }
> ...
>> @@ -3042,6 +3053,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error
>> **errp)
>>    " named node of the graph");
>>   goto out;
>>   }
>> +    replaces_node_name = arg->replaces;
> 
> 
> What is the idea behind the variables substitution?

Looks like a remnant from v6, where there was an

if (arg->has_replaces) {
...
replaces_node_name = arg->replaces;
} else if (unfiltered_bs != bs) {
replaces_node_name = unfiltered_bs->node_name;
}

But I moved that logic to blockdev_mirror_common() in this version.

So it’s just useless now and replaces_node_name shouldn’t exist.

Max





Re: [PULL 0/5] riscv-to-apply queue

2020-07-24 Thread Peter Maydell
On Wed, 22 Jul 2020 at 18:00, Alistair Francis  wrote:
>
> The following changes since commit 3cbc8970f55c87cb58699b6dc8fe42998bc79dc0:
>
>   Merge remote-tracking branch 'remotes/armbru/tags/pull-monitor-2020-07-21' 
> into staging (2020-07-22 09:13:46 +0100)
>
> are available in the Git repository at:
>
>   g...@github.com:alistair23/qemu.git tags/pull-riscv-to-apply-20200722-1
>
> for you to fetch changes up to 8ba26b0b2b00dd5849a6c0981e358dc7a7cc315d:
>
>   target/riscv: Fix the range of pmpcfg of CSR funcion table (2020-07-22 
> 09:41:36 -0700)
>
> 
> This PR contains a few RISC-V fixes.
>
> The main fix is the correction of the goldfish RTC time. On top of that
> some small fixes to the recently added vector extensions have been added
> (including an assert that fixed a coverity report). There is a change in
> the SiFive E debug memory size to match hardware. Finally there is a fix
> for PMP accesses.


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/5.1
for any user-visible changes.

-- PMM



Re: [PATCH v7 34/47] backup: Deal with filters

2020-07-24 Thread Max Reitz
On 23.07.20 17:51, Andrey Shinkevich wrote:
> On 25.06.2020 18:22, Max Reitz wrote:
>> Signed-off-by: Max Reitz 
>> ---
>>   block/backup-top.c |  2 +-
>>   block/backup.c |  9 +
>>   blockdev.c | 19 +++
>>   3 files changed, 21 insertions(+), 9 deletions(-)
>>
>>
>> diff --git a/block/backup.c b/block/backup.c
>> index 4f13bb20a5..9afa0bf3b4 100644
>> --- a/block/backup.c
>> +++ b/block/backup.c
>> @@ -297,6 +297,7 @@ static int64_t
>> backup_calculate_cluster_size(BlockDriverState *target,
>>   {
>>   int ret;
>>   BlockDriverInfo bdi;
>> +    bool target_does_cow = bdrv_backing_chain_next(target);
>>   
> 
> 
> Wouldn't it better to make the explicit type conversion or use "!!"
> approach?

I don’t know. O:)

I personally don’t like too may exclamation marks because I feel like
the code is screaming at me.  So I tend to use them only where necessary.

As for doing an explicit cast...  I think I’ll keep that in mind to
reduce my future use of !!.  But in this case, the type name is in the
same line, so I feel like it’s sufficiently clear.

Max





Re: [PATCH v2] configure: actually disable 'git_update' mode with --disable-git-update

2020-07-24 Thread Michael Tokarev
Hi!

Dan, can you please resend this with To: qemu-devel@, maybe Cc: qemu-trivial@?
It isn't exactly trivial and it needs some review from the developers.

Thanks,

/mjt

16.07.2020 13:22, Dan Streetman wrote:
> The --disable-git-update configure param sets git_update=no, but
> some later checks only look for the .git dir. This changes the
> --enable-git-update to set git_update=yes but also fail if it
> does not find a .git dir. Then all the later checks for the .git
> dir can just be changed to a check for $git_update = "yes".
> 
> Also update the Makefile to skip the 'git_update' checks if it has
> been disabled.
> 
> This is needed because downstream packagers, e.g. Debian, Ubuntu, etc,
> also keep the source code in git, but do not want to enable the
> 'git_update' mode; with the current code, that's not possible even
> if the downstream package specifies --disable-git-update.
> 
> Signed-off-by: Dan Streetman 
> ---
> Changes in v2:
>  - change GIT_UPDATE default to "" if no .git found
>  - change Makefile to skip git update checks if GIT_UPDATE=no
> 
>  Makefile  | 15 +--
>  configure | 21 +
>  2 files changed, 22 insertions(+), 14 deletions(-)
> 
> diff --git a/Makefile b/Makefile
> index 32345c610e..9bcd3f9af4 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -25,6 +25,8 @@ git-submodule-update:
>  
>  .PHONY: git-submodule-update
>  
> +# If --disable-git-update specified, skip these git checks
> +ifneq (no,$(GIT_UPDATE))
>  git_module_status := $(shell \
>cd '$(SRC_PATH)' && \
>GIT="$(GIT)" ./scripts/git-submodule.sh status $(GIT_SUBMODULES); \
> @@ -32,7 +34,12 @@ git_module_status := $(shell \
>  )
>  
>  ifeq (1,$(git_module_status))
> -ifeq (no,$(GIT_UPDATE))
> +ifeq (yes,$(GIT_UPDATE))
> +git-submodule-update:
> + $(call quiet-command, \
> +  (cd $(SRC_PATH) && GIT="$(GIT)" ./scripts/git-submodule.sh update 
> $(GIT_SUBMODULES)), \
> +  "GIT","$(GIT_SUBMODULES)")
> +else
>  git-submodule-update:
>   $(call quiet-command, \
>  echo && \
> @@ -41,11 +48,7 @@ git-submodule-update:
>  echo "from the source directory checkout $(SRC_PATH)" && \
>  echo && \
>  exit 1)
> -else
> -git-submodule-update:
> - $(call quiet-command, \
> -  (cd $(SRC_PATH) && GIT="$(GIT)" ./scripts/git-submodule.sh update 
> $(GIT_SUBMODULES)), \
> -  "GIT","$(GIT_SUBMODULES)")
> +endif
>  endif
>  endif
>  
> diff --git a/configure b/configure
> index b751c853f5..eefda62210 100755
> --- a/configure
> +++ b/configure
> @@ -318,7 +318,7 @@ then
>  git_submodules="$git_submodules tests/fp/berkeley-testfloat-3"
>  git_submodules="$git_submodules tests/fp/berkeley-softfloat-3"
>  else
> -git_update=no
> +git_update=""
>  git_submodules=""
>  
>  if ! test -f "$source_path/ui/keycodemapdb/README"
> @@ -1603,7 +1603,12 @@ for opt do
>;;
>--with-git=*) git="$optarg"
>;;
> -  --enable-git-update) git_update=yes
> +  --enable-git-update)
> +  git_update=yes
> +  if test ! -e "$source_path/.git"; then
> +  echo "ERROR: cannot --enable-git-update without .git"
> +  exit 1
> +  fi
>;;
>--disable-git-update) git_update=no
>;;
> @@ -2017,7 +2022,7 @@ fi
>  # Consult white-list to determine whether to enable werror
>  # by default.  Only enable by default for git builds
>  if test -z "$werror" ; then
> -if test -e "$source_path/.git" && \
> +if test "$git_update" = "yes" && \
>  { test "$linux" = "yes" || test "$mingw32" = "yes"; }; then
>  werror="yes"
>  else
> @@ -4418,10 +4423,10 @@ EOF
>  fdt=system
>else
># have GIT checkout, so activate dtc submodule
> -  if test -e "${source_path}/.git" ; then
> +  if test "$git_update" = "yes" ; then
>git_submodules="${git_submodules} dtc"
>fi
> -  if test -d "${source_path}/dtc/libfdt" || test -e 
> "${source_path}/.git" ; then
> +  if test -d "${source_path}/dtc/libfdt" || test "$git_update" = "yes" ; 
> then
>fdt=git
>mkdir -p dtc
>if [ "$pwd_is_source_path" != "y" ] ; then
> @@ -5391,7 +5396,7 @@ case "$capstone" in
>"" | yes)
>  if $pkg_config capstone; then
>capstone=system
> -elif test -e "${source_path}/.git" && test $git_update = 'yes' ; then
> +elif test "$git_update" = "yes" ; then
>capstone=git
>  elif test -e "${source_path}/capstone/Makefile" ; then
>capstone=internal
> @@ -6447,7 +6452,7 @@ case "$slirp" in
>"" | yes)
>  if $pkg_config slirp; then
>slirp=system
> -elif test -e "${source_path}/.git" && test $git_update = 'yes' ; then
> +elif test "$git_update" = "yes" ; then
>slirp=git
>  elif test -e "${source_path}/slirp/Makefile" ; then
>slirp=internal
> @@ -6809,7 +6814,7 @@ if test "$cpu" = "s390x" ; then
>  roms="$roms s390-ccw"
>  # SLOF is required for building th

Re: [RFC PATCH] buildsys: Only build capstone if softmmu/user mode is enabled

2020-07-24 Thread Peter Maydell
On Fri, 24 Jul 2020 at 10:47, Philippe Mathieu-Daudé  wrote:
>
> On 7/24/20 11:38 AM, Philippe Mathieu-Daudé wrote:
> > On 7/24/20 9:56 AM, Thomas Huth wrote:
> >> On 24/07/2020 09.16, Philippe Mathieu-Daudé wrote:
> >>> At least one of softmmu or user mode has to be enabled to use
> >>> capstone. If not, don't clone/built it.
> >>>
> >>> This save CI time for the tools/documentation-only build jobs.

> >>> +if test -z "$capstone" && test $tcg = 'no' ; then # !tcg implies !softmmu
> >>> +  capstone="no"
> >>> +fi
> >>
> >> I don't think this is right. You could have a KVM-only build where you
> >> still want to use the disassembler for the human monitor.
> >
> > I had the same question with KVM, I agree this is unclear, this is why
> > I added RFC.
> >
> > Don't we have !softmmu implies !kvm?
>
> It works because it falls back to the old disas.c (if capstone is
> here, use it, else fall-back).
>
> Does this means we can directly remove the capstone experiment &
> submodule without waiting for the libllvm integration?

The theory (at least at the time) was that capstone was better
than the internal disassembler for at least some targets.
If we want to go from libllvm to capstone as our long term
plan that's cool, but until we actually do that I don't think
we should drop capstone.

As far as this patch goes: if you want to disable capstone for
the tools-and-docs-only setup then I think the right condition is
if [ "$bsd_user" = "no" -a "$linux_user" = "no" -a "$softmmu" = "no" ] ; then
  capstone=no
fi

thanks
-- PMM



Re: [PATCH 1/3] block/nbd: allow drain during reconnect attempt

2020-07-24 Thread Vladimir Sementsov-Ogievskiy

23.07.2020 21:47, Eric Blake wrote:

On 7/20/20 4:00 AM, Vladimir Sementsov-Ogievskiy wrote:

It should be safe to reenter qio_channel_yield() on io/channel read/write
path, so it's safe to reduce in_flight and allow attaching new aio
context. And no problem to allow drain itself: connection attempt is
not a guest request. Moreover, if remote server is down, we can hang
in negotiation, blocking drain section and provoking a dead lock.

How to reproduce the dead lock:



I tried to reproduce this; but in the several minutes it has taken me to write 
this email, it still has not hung.  Still, your stack trace is fairly good 
evidence of the problem, where adding a temporary sleep or running it under gdb 
with a breakpoint can probably make reproduction easier.


1. Create nbd-fault-injector.conf with the following contents:

[inject-error "mega1"]
event=data
io=readwrite
when=before

2. In one terminal run nbd-fault-injector in a loop, like this:

n=1; while true; do
 echo $n; ((n++));


Bashism, but not a problem for the commit message.


 ./nbd-fault-injector.py 127.0.0.1:1 nbd-fault-injector.conf;
done

3. In another terminal run qemu-io in a loop, like this:

n=1; while true; do
 echo $n; ((n++));
 ./qemu-io -c 'read 0 512' nbd+tcp://127.0.0.1:1;


I prefer the spelling nbd:// for TCP connections, but also inconsequential.


Note, that the hang may be
triggered by another bug, so the whole case is fixed only together with
commit "block/nbd: on shutdown terminate connection attempt".

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd.c | 11 +++
  1 file changed, 11 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c
index 65a4f56924..49254f1c3c 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -280,7 +280,18 @@ static coroutine_fn void 
nbd_reconnect_attempt(BDRVNBDState *s)
  s->ioc = NULL;
  }
+    bdrv_dec_in_flight(s->bs);
  s->connect_status = nbd_client_connect(s->bs, &local_err);
+    s->wait_drained_end = true;
+    while (s->drained) {
+    /*
+ * We may be entered once from nbd_client_attach_aio_context_bh
+ * and then from nbd_client_co_drain_end. So here is a loop.
+ */
+    qemu_coroutine_yield();
+    }
+    bdrv_inc_in_flight(s->bs);
+


This is very similar to the code in nbd_co_reconnect_loop.  Does that function 
still need to wait on drained, since it calls nbd_reconnect_attempt which is 
now doing the same loop?  But off-hand, I'm not seeing a problem with keeping 
both places.


I want to reduce in_flight around one operation only. And I'm afraid of
continuing while drained. So, here is the pattern:

 - allow drain (by decreasing in_flight)
 - do some operation that is safe for a drained section
 - we are afraid that some further operations are unsafe for drained
   sections, so
   - disallow new drain (by increasing in_flight)
   - wait for the current drain to finish, if any

And I'm not sure that nbd_read_eof is not buggy: it just does dec/inc of
in_flight around qio_channel_yield(), so nothing prevents us from
continuing some other operations while in a drained section. The code in
nbd_read_eof was introduced by d3bd5b90890f6715bce..
Is it safe?



Reviewed-by: Eric Blake 

As a bug fix, I'll be including this in my NBD pull request for the next -rc 
build.




--
Best regards,
Vladimir
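
For readers skimming the thread, the dec/inc in_flight shape being discussed
boils down to the sketch below. It mirrors the hunk in the patch; the helper
name do_operation_safe_while_drained() is invented here purely for
illustration, and this is not actual QEMU code.

static coroutine_fn void do_op_allowing_drain(BDRVNBDState *s)
{
    bdrv_dec_in_flight(s->bs);           /* allow a drained section to begin */

    do_operation_safe_while_drained(s);  /* e.g. the connection attempt */

    s->wait_drained_end = true;
    while (s->drained) {
        /* may be re-entered from both the attach BH and drain_end */
        qemu_coroutine_yield();
    }
    bdrv_inc_in_flight(s->bs);           /* what follows is not drain-safe */
}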



Re: [PATCH 1/3] block/nbd: allow drain during reconnect attempt

2020-07-24 Thread Vladimir Sementsov-Ogievskiy

20.07.2020 12:00, Vladimir Sementsov-Ogievskiy wrote:

It should be safe to reenter qio_channel_yield() on the io/channel
read/write path, so it's safe to reduce in_flight and allow attaching a
new aio context. And there is no problem with allowing drain itself: the
connection attempt is not a guest request. Moreover, if the remote server
is down, we can hang in negotiation, blocking the drained section and
provoking a deadlock.

How to reproduce the deadlock:

1. Create nbd-fault-injector.conf with the following contents:

[inject-error "mega1"]
event=data
io=readwrite
when=before

2. In one terminal run nbd-fault-injector in a loop, like this:

n=1; while true; do
 echo $n; ((n++));
 ./nbd-fault-injector.py 127.0.0.1:1 nbd-fault-injector.conf;
done

3. In another terminal run qemu-io in a loop, like this:

n=1; while true; do
 echo $n; ((n++));
 ./qemu-io -c 'read 0 512' nbd+tcp://127.0.0.1:1;
done

After some time, qemu-io will hang trying to drain, for example, like
this:

  #3 aio_poll (ctx=0x55f006bdd890, blocking=true) at
 util/aio-posix.c:600
  #4 bdrv_do_drained_begin (bs=0x55f006bea710, recursive=false,
 parent=0x0, ignore_bds_parents=false, poll=true) at block/io.c:427
  #5 bdrv_drained_begin (bs=0x55f006bea710) at block/io.c:433
  #6 blk_drain (blk=0x55f006befc80) at block/block-backend.c:1710
  #7 blk_unref (blk=0x55f006befc80) at block/block-backend.c:498
  #8 bdrv_open_inherit (filename=0x7fffba1563bc
 "nbd+tcp://127.0.0.1:1", reference=0x0, options=0x55f006be86d0,
 flags=24578, parent=0x0, child_class=0x0, child_role=0,
 errp=0x7fffba154620) at block.c:3491
  #9 bdrv_open (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:1",
 reference=0x0, options=0x0, flags=16386, errp=0x7fffba154620) at
 block.c:3513
  #10 blk_new_open (filename=0x7fffba1563bc "nbd+tcp://127.0.0.1:1",
 reference=0x0, options=0x0, flags=16386, errp=0x7fffba154620) at
 block/block-backend.c:421

And connection_co stack like this:

  #0 qemu_coroutine_switch (from_=0x55f006bf2650, to_=0x7fe96e07d918,
 action=COROUTINE_YIELD) at util/coroutine-ucontext.c:302
  #1 qemu_coroutine_yield () at util/qemu-coroutine.c:193
  #2 qio_channel_yield (ioc=0x55f006bb3c20, condition=G_IO_IN) at
 io/channel.c:472
  #3 qio_channel_readv_all_eof (ioc=0x55f006bb3c20, iov=0x7fe96d729bf0,
 niov=1, errp=0x7fe96d729eb0) at io/channel.c:110
  #4 qio_channel_readv_all (ioc=0x55f006bb3c20, iov=0x7fe96d729bf0,
 niov=1, errp=0x7fe96d729eb0) at io/channel.c:143
  #5 qio_channel_read_all (ioc=0x55f006bb3c20, buf=0x7fe96d729d28
 "\300.\366\004\360U", buflen=8, errp=0x7fe96d729eb0) at
 io/channel.c:247
  #6 nbd_read (ioc=0x55f006bb3c20, buffer=0x7fe96d729d28, size=8,
 desc=0x55f004f69644 "initial magic", errp=0x7fe96d729eb0) at
 /work/src/qemu/master/include/block/nbd.h:365
  #7 nbd_read64 (ioc=0x55f006bb3c20, val=0x7fe96d729d28,
 desc=0x55f004f69644 "initial magic", errp=0x7fe96d729eb0) at
 /work/src/qemu/master/include/block/nbd.h:391
  #8 nbd_start_negotiate (aio_context=0x55f006bdd890,
 ioc=0x55f006bb3c20, tlscreds=0x0, hostname=0x0,
 outioc=0x55f006bf19f8, structured_reply=true,
 zeroes=0x7fe96d729dca, errp=0x7fe96d729eb0) at nbd/client.c:904
  #9 nbd_receive_negotiate (aio_context=0x55f006bdd890,
 ioc=0x55f006bb3c20, tlscreds=0x0, hostname=0x0,
 outioc=0x55f006bf19f8, info=0x55f006bf1a00, errp=0x7fe96d729eb0) at
 nbd/client.c:1032
  #10 nbd_client_connect (bs=0x55f006bea710, errp=0x7fe96d729eb0) at
 block/nbd.c:1460
  #11 nbd_reconnect_attempt (s=0x55f006bf19f0) at block/nbd.c:287
  #12 nbd_co_reconnect_loop (s=0x55f006bf19f0) at block/nbd.c:309
  #13 nbd_connection_entry (opaque=0x55f006bf19f0) at block/nbd.c:360
  #14 coroutine_trampoline (i0=113190480, i1=22000) at
 util/coroutine-ucontext.c:173

Note that the hang may be
triggered by another bug, so the whole case is fixed only together with
commit "block/nbd: on shutdown terminate connection attempt".

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd.c | 11 +++
  1 file changed, 11 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c
index 65a4f56924..49254f1c3c 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -280,7 +280,18 @@ static coroutine_fn void 
nbd_reconnect_attempt(BDRVNBDState *s)
  s->ioc = NULL;
  }
  
+bdrv_dec_in_flight(s->bs);

  s->connect_status = nbd_client_connect(s->bs, &local_err);
+s->wait_drained_end = true;
+while (s->drained) {
+/*
+ * We may be entered once from nbd_client_attach_aio_context_bh
+ * and then from nbd_client_co_drain_end. So here is a loop.
+ */
+qemu_coroutine_yield();
+}
+bdrv_inc_in_flight(s->bs);


My next version of non-blocking connect will need nbd_establish_connection() to 
be above bdrv_dec_in_flight(). So, I want to resend this anyway.


+
  error_free(s->connect_err);
  s->connect_err = NULL;
  error_propagate(&s->connect_err, local_err

Re: [PATCH v7 33/47] mirror: Deal with filters

2020-07-24 Thread Andrey Shinkevich

On 24.07.2020 12:49, Max Reitz wrote:

On 22.07.20 20:31, Andrey Shinkevich wrote:

On 25.06.2020 18:22, Max Reitz wrote:

This includes some permission limiting (for example, we only need to
take the RESIZE permission for active commits where the base is smaller
than the top).

Use this opportunity to rename qmp_drive_mirror()'s "source" BDS to
"target_backing_bs", because that is what it really refers to.

Signed-off-by: Max Reitz 
---
   qapi/block-core.json |   6 ++-
   block/mirror.c   | 118 +--
   blockdev.c   |  36 +
   3 files changed, 121 insertions(+), 39 deletions(-)


...

diff --git a/block/mirror.c b/block/mirror.c
index 469acf4600..770de3b34e 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -42,6 +42,7 @@ typedef struct MirrorBlockJob {
   BlockBackend *target;
   BlockDriverState *mirror_top_bs;
   BlockDriverState *base;
+    BlockDriverState *base_overlay;
     /* The name of the graph node to replace */
   char *replaces;
@@ -677,8 +678,10 @@ static int mirror_exit_common(Job *job)
    &error_abort);
   if (!abort && s->backing_mode == MIRROR_SOURCE_BACKING_CHAIN) {
   BlockDriverState *backing = s->is_none_mode ? src : s->base;
-    if (backing_bs(target_bs) != backing) {
-    bdrv_set_backing_hd(target_bs, backing, &local_err);
+    BlockDriverState *unfiltered_target =
bdrv_skip_filters(target_bs);
+
+    if (bdrv_cow_bs(unfiltered_target) != backing) {


I just worry about a filter node of the concurrent job right below the
unfiltered_target.

Having a concurrent job on the target sounds extremely problematic in
itself (because at least for most of the mirror job, the target isn’t in
a consistent state).  Is that a real use case?



It might be in TestParallelOps of iotests #30, but I am not sure now.
I am going to apply my series with the copy-on-read filter for the stream
job on top of this one and will see then.


Andrey





The filter has unfiltered_target in its parent list.
Will that filter node be replaced correctly then?

I’m also not quite sure what you mean.  We need to attach the source’s
backing chain to the target here, so we go down to the first node that
might accept COW backing files (by invoking bdrv_skip_filters()).  That
should be correct no matter what kind of filters are on it.



I meant when a filter is removed with bdrv_replace_node() afterwards.
As I mentioned above, I am going to test the case later.


Andrey



+    /*
+ * The topmost node with
+ * bdrv_skip_filters(filtered_target) ==
bdrv_skip_filters(target)
+ */
+    filtered_target = bdrv_cow_bs(bdrv_find_overlay(bs, target));
+
+    assert(bdrv_skip_filters(filtered_target) ==
+   bdrv_skip_filters(target));
+
+    /*
+ * XXX BLK_PERM_WRITE needs to be allowed so we don't block
+ * ourselves at s->base (if writes are blocked for a node,
they are
+ * also blocked for its backing file). The other options
would be a
+ * second filter driver above s->base (== target).
+ */
+    iter_shared_perms = BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE;
+
+    for (iter = bdrv_filter_or_cow_bs(bs); iter != target;
+ iter = bdrv_filter_or_cow_bs(iter))
+    {
+    if (iter == filtered_target) {


For one filter node only?

No, iter_shared_perms is never reset, so it retains the
BLK_PERM_CONSISTENT_READ flag until the end of the loop.



Yes, that's right. Clear.

Andrey





+    /*
+ * From here on, all nodes are filters on the base.
+ * This allows us to share BLK_PERM_CONSISTENT_READ.
+ */
+    iter_shared_perms |= BLK_PERM_CONSISTENT_READ;
+    }
+
   ret = block_job_add_bdrv(&s->common, "intermediate
node", iter, 0,
- BLK_PERM_WRITE_UNCHANGED |
BLK_PERM_WRITE,
- errp);
+ iter_shared_perms, errp);
   if (ret < 0) {
   goto fail;
   }

...

@@ -3042,6 +3053,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error
**errp)
    " named node of the graph");
   goto out;
   }
+    replaces_node_name = arg->replaces;


What is the idea behind the variables substitution?

Looks like a remnant from v6, where there was an

if (arg->has_replaces) {
 ...
 replaces_node_name = arg->replaces;
} else if (unfiltered_bs != bs) {
 replaces_node_name = unfiltered_bs->node_name;
}

But I moved that logic to blockdev_mirror_common() in this version.

So it’s just useless now and replaces_node_name shouldn’t exist.

Max





Re: [PATCH v7 35/47] commit: Deal with filters

2020-07-24 Thread Andrey Shinkevich

On 23.07.2020 20:15, Andrey Shinkevich wrote:

On 25.06.2020 18:22, Max Reitz wrote:

This includes some permission limiting (for example, we only need to
take the RESIZE permission if the base is smaller than the top).

Signed-off-by: Max Reitz 
---
  block/block-backend.c  |  9 +++-
  block/commit.c | 96 +-
  block/monitor/block-hmp-cmds.c |  2 +-
  blockdev.c |  4 +-
  4 files changed, 81 insertions(+), 30 deletions(-)


...

+    /*
+ * Block all nodes between top and base, because they will
+ * disappear from the chain after this operation.
+ * Note that this assumes that the user is fine with removing all
+ * nodes (including R/W filters) between top and base. Assuring
+ * this is the responsibility of the interface (i.e. whoever calls
+ * commit_start()).
+ */
+    s->base_overlay = bdrv_find_overlay(top, base);
+    assert(s->base_overlay);
+
+    /*
+ * The topmost node with
+ * bdrv_skip_filters(filtered_base) == bdrv_skip_filters(base)
+ */
+    filtered_base = bdrv_cow_bs(s->base_overlay);
+    assert(bdrv_skip_filters(filtered_base) == 
bdrv_skip_filters(base));

+
+    /*
+ * XXX BLK_PERM_WRITE needs to be allowed so we don't block 
ourselves
+ * at s->base (if writes are blocked for a node, they are also 
blocked
+ * for its backing file). The other options would be a second 
filter

+ * driver above s->base.
+ */
+    iter_shared_perms = BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE;
+
+    for (iter = top; iter != base; iter = 
bdrv_filter_or_cow_bs(iter)) {

+    if (iter == filtered_base) {



The same question as for mirroring:

in case of multiple filters, one above another, the permission is 
extended for the filtered_base only.


Andrey



The question has been answered already.

Andrey





+    /*
+ * From here on, all nodes are filters on the base.  This
+ * allows us to share BLK_PERM_CONSISTENT_READ.
+ */
+    iter_shared_perms |= BLK_PERM_CONSISTENT_READ;
+    }
+
  ret = block_job_add_bdrv(&s->common, "intermediate node", 
iter, 0,
- BLK_PERM_WRITE_UNCHANGED | 
BLK_PERM_WRITE,

- errp);
+ iter_shared_perms, errp);
  if (ret < 0) {
  goto fail;
  }


...

Reviewed-by: Andrey Shinkevich 







Re: [PATCH v7 29/47] blockdev: Use CAF in external_snapshot_prepare()

2020-07-24 Thread Andrey Shinkevich

On 24.07.2020 12:23, Max Reitz wrote:

On 20.07.20 18:08, Andrey Shinkevich wrote:

On 25.06.2020 18:21, Max Reitz wrote:

This allows us to differentiate between filters and nodes with COW
backing files: Filters cannot be used as overlays at all (for this
function).

Signed-off-by: Max Reitz 
---
   blockdev.c | 7 ++-
   1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 1eb0fcdea2..aabe51036d 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1549,7 +1549,12 @@ static void
external_snapshot_prepare(BlkActionState *common,
   goto out;
   }
   -    if (state->new_bs->backing != NULL) {
+    if (state->new_bs->drv->is_filter) {


Is there a chance to get a filter here? If so, is that when a user
specifies a file name of the kind “filter[filter-name]:foo.qcow2”,
or in some other way?

It would be with blockdev-snapshot and by specifying a filter for
@overlay.  Technically that’s already caught by the check whether the
overlay supports backing images (whether drv->supports_backing is true),
but we might as well give a nicer error message.

Example:

{"execute":"qmp_capabilities"}

{"execute":"blockdev-add","arguments":
  {"node-name":"overlay","driver":"copy-on-read",
   "file":{"driver":"null-co"}}}

{"execute":"blockdev-add","arguments":
  {"node-name":"base","driver":"null-co"}}

{"execute":"blockdev-snapshot","arguments":
  {"node":"base","overlay":"overlay"}}

Max



Thank you for the example.

Andrey




Re: [PATCH v11 00/11] iotests: Dump QCOW2 dirty bitmaps metadata

2020-07-24 Thread Vladimir Sementsov-Ogievskiy

23.07.2020 23:18, Eric Blake wrote:

On 7/23/20 2:42 PM, Eric Blake wrote:

On 7/17/20 3:14 AM, Andrey Shinkevich wrote:

Add dirty bitmap information to QCOW2 metadata dump in the qcow2_format.py.




  block/qcow2.c  |   2 +-
  docs/interop/qcow2.txt |   2 +-
  tests/qemu-iotests/qcow2.py    |  18 ++-
  tests/qemu-iotests/qcow2_format.py | 221 ++---
  4 files changed, 220 insertions(+), 23 deletions(-)


I still don't see any obvious coverage of the new output, which makes it harder 
to test (I have to manually run qcow2.py on a file rather than seeing what 
changes in a ???.out file).  I know we said back in v9 that test 291 is not the 
right test, but that does not stop you from adding a new test just for that 
purpose.


The bulk of this series is touching a non-installed utility. At this point, I 
feel safer deferring it to 5.2 (it is a feature addition for testsuite use 
only, and we missed soft freeze), even though it has no negative impact on 
installed binaries.



Yes, it's absolutely OK to defer to 5.2.

Thanks a lot for taking a look at our series!

--
Best regards,
Vladimir



Re: Testing the virtio-vhost-user QEMU patch

2020-07-24 Thread Alyssa Ross
Alyssa Ross  writes:

> Stefan Hajnoczi  writes:
>
>> On Tue, Jul 21, 2020 at 07:14:38AM +, Alyssa Ross wrote:
>>> Hi -- I hope it's okay me reaching out like this.
>>> 
>>> I've been trying to test out the virtio-vhost-user implementation that's
>>> been posted to this list a couple of times, but have been unable to get
>>> it to boot a kernel following the steps listed either on
>>>  or
>>> .
>>> 
>>> Specifically, the kernel appears to be unable to write to the
>>> virtio-vhost-user device's PCI registers.  I've included the full panic
>>> output from the kernel at the end of this message.  The panic is
>>> reproducible with two different kernels I tried (with different configs
>>> and versions).  I tried both versions of the virtio-vhost-user I was
>>> able to find[1][2], and both exhibited the same behaviour.
>>> 
>>> Is this a known issue?  Am I doing something wrong?
>>
>> Hi,
>> Unfortunately I'm not sure what the issue is. This is an early
>> virtio-pci register access before a driver for any specific device type
>> (net, blk, vhost-user, etc) comes into play.
>
> Small update here: I tried on another computer, and it worked.  Made
> sure that it was exactly the same QEMU binary, command line, and VM
> disk/initrd/kernel, so I think I can fairly confidently say the panic
> depends on what hardware QEMU is running on.  I set -cpu value to the
> same on both as well (SandyBridge).
>
> I also discovered that it works on my primary computer (the one it
> panicked on before) with KVM disabled.
>
> Note that I've only got so far as finding that it boots on the other
> machine -- I haven't verified yet that it actually works.
>
> Bad host CPU:  Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
> Good host CPU: AMD EPYC 7401P 24-Core Processor
>
> May I ask what host CPUs other people have tested this on?  Having more
> data would probably be useful.  Could it be an AMD vs. Intel thing?

I think I've figured it out!

Sandy Bridge and Ivy Bridge hosts encounter this panic because the
"additional resources" bar size is too big, at 1 << 36.  If I change
this to 1 << 35, no more kernel panic.

Skylake and later are fine with 1 << 36.  In between Ivy Bridge and
Skylake were Haswell and Broadwell, but I couldn't find anybody who was
able to help me test on either of those, so I don't know what they do.

Perhaps related, the hosts that produce panics all seem to have a
physical address size of 36 bits, while the hosts that work have larger
physical address sizes, as reported by lscpu.
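
One plausible reading of the symptom, sketched below (not QEMU code): a 64-bit
MMIO BAR has to fit inside what the CPU can physically address, and with only
36 physical address bits there is simply no room for a BAR as large as the
whole 2^36-byte address space. The phys_bits value would come from the host
CPU, e.g. CPUID leaf 0x80000008 on x86; the helper is purely illustrative.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Sketch: can an MMIO BAR of bar_size bytes be placed below the physical
 * address limit at all?  (Real placement also has to leave room for RAM
 * and the other BARs, so this is the loosest possible check.) */
static bool bar_can_fit(uint64_t bar_size, unsigned phys_bits)
{
    return bar_size < (1ULL << phys_bits);
}

int main(void)
{
    printf("1<<36 BAR on 36-bit host: %d\n", bar_can_fit(1ULL << 36, 36)); /* 0 */
    printf("1<<35 BAR on 36-bit host: %d\n", bar_can_fit(1ULL << 35, 36)); /* 1 */
    printf("1<<36 BAR on 46-bit host: %d\n", bar_can_fit(1ULL << 36, 46)); /* 1 */
    return 0;
}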



Re: [PATCH-for-5.1? v2] qapi/error: Check format string argument in error_*prepend()

2020-07-24 Thread Markus Armbruster
Philippe Mathieu-Daudé  writes:

> error_propagate_prepend() "behaves like error_prepend()", and
> error_prepend() uses "formatting @fmt, ... like printf()".
> error_prepend() checks its format string argument, but
> error_propagate_prepend() does not. Fix by adding the format
> attribute to error_propagate_prepend() and error_vprepend().
>
> This would have caught the bug fixed in the previous commit:
>
> CC  hw/sd/milkymist-memcard.o
>   hw/sd/milkymist-memcard.c: In function ‘milkymist_memcard_realize’:
>   hw/sd/milkymist-memcard.c:284:70: error: format ‘%s’ expects a matching ‘char *’ argument [-Werror=format=]
> 284 | error_propagate_prepend(errp, err, "failed to init SD card: %s");
>     |                                                              ~^
>     |                                                               |
>     |                                                               char *
>
> Missed in commit 4b5766488f "error: Fix use of error_prepend() with
> &error_fatal, &error_abort".
>
> Inspired-by: Stefan Weil 
> Suggested-by: Eric Blake 
> Reviewed-by: Markus Armbruster 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> Since v1:
> - Reword (Markus)
> - Add error_vprepend (Stefan)
> - Use local style to add the attribute *after* the declaration,
>   which is invalid on definition where we get (depending on the
>   toolchain):
>
>   . error: attributes should be specified before the declarator in a function 
> definition
>
>   . error: attributes are not allowed on a function-definition
>
> Supersedes: <20200723091309.18690-1-phi...@redhat.com>

Queued with Stefan's R-by from v1.  Hope that's okay.  Thanks!
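
For readers unfamiliar with the mechanism being queued here, such a
format-checking attribute on a declaration looks roughly like the sketch
below. The prototypes are abridged and purely illustrative; the real
declarations live in include/qapi/error.h, and GCC_FMT_ATTR() is QEMU's
wrapper around __attribute__((format(gnu_printf, ...))).

/* Illustrative declarations only (assumes the usual qapi/error.h and
 * stdarg.h context).  GCC_FMT_ATTR(m, n) tells the compiler which argument
 * is the format string (m) and where the variadic arguments start (n), so
 * every call site gets printf-style type checking. */
void error_propagate_prepend(Error **dst_errp, Error *local_err,
                             const char *fmt, ...)
    GCC_FMT_ATTR(3, 4);

void error_vprepend(Error *const *errp, const char *fmt, va_list ap)
    GCC_FMT_ATTR(2, 0);   /* 0: the arguments arrive in a va_list */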




Re: [PATCH 1/3] block/nbd: allow drain during reconnect attempt

2020-07-24 Thread Vladimir Sementsov-Ogievskiy

23.07.2020 21:47, Eric Blake wrote:

On 7/20/20 4:00 AM, Vladimir Sementsov-Ogievskiy wrote:

It should be to reenter qio_channel_yield() on io/channel read/write
path, so it's safe to reduce in_flight and allow attaching new aio
context. And no problem to allow drain itself: connection attempt is
not a guest request. Moreover, if remote server is down, we can hang
in negotiation, blocking drain section and provoking a dead lock.

How to reproduce the dead lock:



I tried to reproduce this; but in the several minutes it has taken me to write 
this email, it still has not hung.  Still, your stack trace is fairly good 
evidence of the problem, where adding a temporary sleep or running it under gdb 
with a breakpoint can probably make reproduction easier.


I've tried to make a reproducer, adding a temporary BDRV_POLL_WHILE, but I failed.

One time, it reproduced for me after 4000 iterations, but other times a lot 
earlier.

It may help to start several qemu-io loops in parallel.

Also, iotest 83 for -nbd hangs sometimes for me as well.

--
Best regards,
Vladimir



[Bug 1888431] Re: v5.1.0-rc1 build fails on Mac OS X 10.11.6

2020-07-24 Thread Robert Ball
Fantastic.   Thank you Thomas, greatly appreciated!

Best regards,

Robert Ball

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1888431

Title:
  v5.1.0-rc1 build fails on Mac OS X 10.11.6

Status in QEMU:
  Won't Fix

Bug description:
  Hi all,

  build of tag v5.1.0-rc1 fails on Mac OS X 10.11.6 (El Capitan) with
  the following error:

  git clone https://git.qemu.org/git/qemu.git
  
  cd qemu
  git submodule init
  
  git submodule update --recursive
  
  ./configure
  
  make
  

CC  trace/control.o
  In file included from trace/control.c:29:
  In file included from /Users/rtb/src/qemu/include/monitor/monitor.h:4:
  In file included from /Users/rtb/src/qemu/include/block/block.h:4:
  In file included from /Users/rtb/src/qemu/include/block/aio.h:23:
  /Users/rtb/src/qemu/include/qemu/timer.h:843:9: warning: implicit declaration 
of function 'clock_gettime' is invalid in C99
[-Wimplicit-function-declaration]
  clock_gettime(CLOCK_MONOTONIC, &ts);
  ^
  /Users/rtb/src/qemu/include/qemu/timer.h:843:23: error: use of undeclared 
identifier 'CLOCK_MONOTONIC'
  clock_gettime(CLOCK_MONOTONIC, &ts);
^
  1 warning and 1 error generated.
  make: *** [trace/control.o] Error 1

  
  rtb:qemu rtb$ git log -n1
  commit c8004fe6bbfc0d9c2e7b942c418a85efb3ac4b00 (HEAD -> master, tag: 
v5.1.0-rc1, origin/master, origin/HEAD)
  Author: Peter Maydell 
  Date:   Tue Jul 21 20:28:59 2020 +0100

  Update version for v5.1.0-rc1 release
  
  Signed-off-by: Peter Maydell 
  rtb:qemu rtb$ 

  
  Please find the full output of all the commands (from git clone of the repo, 
to the make) in the attached file "buildfail.txt".

  Thank you!

  Best regards,

  Robert Ball

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1888431/+subscriptions



Re: [PATCH for-5.1 2/3] virtiofsd: add container-friendly -o chroot sandboxing option

2020-07-24 Thread Stefan Hajnoczi
On Thu, Jul 23, 2020 at 06:55:38PM +0100, Dr. David Alan Gilbert wrote:
> * Stefan Hajnoczi (stefa...@redhat.com) wrote:
> > On Wed, Jul 22, 2020 at 08:03:18PM +0100, Dr. David Alan Gilbert wrote:
> > > * Stefan Hajnoczi (stefa...@redhat.com) wrote:
> > > > +/*
> > > > + * Make the shared directory the file system root so that FUSE_OPEN
> > > > + * (lo_open()) cannot escape the shared directory by opening a 
> > > > symlink.
> > > > + *
> > > > + * It's still possible to escape the chroot via lo->proc_self_fd 
> > > > but that
> > > > + * requires gaining control of the process first.
> > > > + */
> > > > +if (chroot(lo->source) != 0) {
> > > > +fuse_log(FUSE_LOG_ERR, "chroot(\"%s\"): %m\n", lo->source);
> > > > +exit(1);
> > > > +}
> > > 
> > > I'm seeing suggestions that you should drop CAP_SYS_CHROOT after
> > > chroot'ing to stop an old escape (where you create another jail inside
> > > the jail and the kernel then lets you walk outside of the old one).
> > 
> > That's already the case:
> > 
> > 1. setup_seccomp() blocks further chroot(2) calls.
> > 2. setup_capabilities() drops CAP_SYS_CHROOT.
> 
> Ah yes; could you add a comment if you respin; it's not obvious that
> the capability to chroot allows you to break out of an existing chroot
> you're in.

Sure.

Stefan


signature.asc
Description: PGP signature
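
For context, a minimal sketch of the ordering discussed in this thread
(illustrative only, not virtiofsd code; virtiofsd reaches the same end state
via setup_seccomp() and setup_capabilities()):

#include <err.h>
#include <linux/capability.h>
#include <sys/prctl.h>
#include <unistd.h>

/* Enter the chroot, then make sure the process can never chroot again,
 * which closes the classic "chroot within a chroot" escape. */
static void enter_sandbox(const char *shared_dir)
{
    if (chroot(shared_dir) != 0 || chdir("/") != 0) {
        err(1, "chroot(\"%s\")", shared_dir);
    }
    /* Drop CAP_SYS_CHROOT from the bounding set; a seccomp filter that
     * rejects further chroot(2) calls achieves the same effect. */
    if (prctl(PR_CAPBSET_DROP, CAP_SYS_CHROOT, 0, 0, 0) != 0) {
        err(1, "prctl(PR_CAPBSET_DROP, CAP_SYS_CHROOT)");
    }
}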


Re: 答复: [PATCH v1] hw/pci-host: save/restore pci host config register

2020-07-24 Thread Michael S. Tsirkin
On Thu, Jul 23, 2020 at 02:29:36PM +, Wangjing (Hogan, Cloud Infrastructure 
Service Product Dept.) wrote:
> Wang King wrote:
> > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > On Thu, Jul 23, 2020 at 08:53:03PM +0800, Wang King wrote:
> > > > From: Hogan Wang 
> > > > 
> > > > The pci host config register is used to save the PCI address for 
> > > > read/write config data. If the guest writes a value to the config 
> > > > register and the vcpu is then paused to migrate, after the migration 
> > > > the guest continues to write pci config data, and the written data 
> > > > will be ignored because the new qemu process lost the config register state.
> > > > 
> > > > Reproduction steps are:
> > > > 1. guest booting in seabios.
> > > > 2. guest enables the SMRAM in seabios:piix4_apmc_smm_setup, and then
> > > >    expects to disable the SMRAM by pci_config_writeb.
> > > > 3. after the guest writes the pci host config register, the vcpu is paused
> > > >    to finish migration.
> > > > 4. the guest's write of config data (0x0A) fails to disable the SMRAM because
> > > >    the config register state was lost.
> > > > 5. guest continues to boot and crashes in the ipxe option ROM due to SMRAM
> > > >    being in the enabled state.
> > > > 
> > > > Signed-off-by: Hogan Wang 
> > > 
> > > I guess this is like v3 right?
> > > 
> > > thanks a lot for the patch!
> > > 
> > > My question stands : does anyone see a way to pass this info around 
> > > without breaking migration for all existing machine types?
> > 
> > You need a .needed clause in the vmstate_i440fx_pcihost and 
> > vmstate_q35_pcihost which is a pointer to a function which enables it on 
> > new machine types and ignores it on old ones.
> > 
> > Or, if it always crashes if the SMRAM is enabled, then the migration is 
> > dead anyway; so you could make the .needed only save the config if the 
> > SMRAM is opened, so you'd get an unknown section error, which is nasty but 
> > it would only happen in the case it would crash anyway.
> > 
> > Dave
> > 
> 
> It always crashes if the SMRAM is enabled, but that's just one case; losing
> the config register state may cause other unpredictable errors, so it's
> needed on new machine types.

Well, let's limit this to new machine types for now?
Declaring all machine types broken is unattractive ...
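
(For reference, the ".needed clause" approach mentioned above usually takes a
shape like the sketch below: a subsection whose needed callback is gated on a
flag that compat properties clear for old machine types. The flag name is
invented here and this is not the actual patch, only the general pattern.)

/* Illustrative shape only.  A device flag such as migrate_config_reg
 * (invented name) would default to true and be turned off via compat
 * properties on old machine types, so they never emit the subsection. */
static bool i440fx_pcihost_config_reg_needed(void *opaque)
{
    I440FXState *s = opaque;

    return s->migrate_config_reg;
}

static const VMStateDescription vmstate_i440fx_pcihost_config_reg = {
    .name = "I440FX_PCIHost/config_reg",
    .version_id = 1,
    .minimum_version_id = 1,
    .needed = i440fx_pcihost_config_reg_needed,
    .fields = (VMStateField[]) {
        VMSTATE_UINT32(parent_obj.config_reg, I440FXState),
        VMSTATE_END_OF_LIST()
    }
};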

> > > 
> > > > ---
> > > >  hw/pci-host/i440fx.c   | 11 +++
> > > >  hw/pci-host/q35.c  | 11 +++
> > > >  hw/pci/pci_host.c  | 11 +++
> > > >  hw/pci/pcie_host.c | 11 +++
> > > >  include/hw/pci/pci_host.h  | 10 ++  
> > > > include/hw/pci/pcie_host.h | 10 ++
> > > >  6 files changed, 64 insertions(+)
> > > > 
> > > > diff --git a/hw/pci-host/i440fx.c b/hw/pci-host/i440fx.c index 
> > > > 8ed2417f0c..17705bb025 100644
> > > > --- a/hw/pci-host/i440fx.c
> > > > +++ b/hw/pci-host/i440fx.c
> > > > @@ -118,6 +118,16 @@ static const VMStateDescription vmstate_i440fx = {
> > > >  }
> > > >  };
> > > >  
> > > > +static const VMStateDescription vmstate_i440fx_pcihost = {
> > > > +.name = "I440FX_PCIHost",
> > > > +.version_id = 1,
> > > > +.minimum_version_id = 1,
> > > > +.fields = (VMStateField[]) {
> > > > +VMSTATE_PCI_HOST(parent_obj, I440FXState),
> > > > +VMSTATE_END_OF_LIST()
> > > > +}
> > > > +};
> > > > +
> > > >  static void i440fx_pcihost_get_pci_hole_start(Object *obj, Visitor *v,
> > > >const char *name, void 
> > > > *opaque,
> > > >Error **errp) @@ 
> > > > -398,6 +408,7 @@ static void i440fx_pcihost_class_init(ObjectClass 
> > > > *klass, void *data)
> > > >  hc->root_bus_path = i440fx_pcihost_root_bus_path;
> > > >  dc->realize = i440fx_pcihost_realize;
> > > >  dc->fw_name = "pci";
> > > > +dc->vmsd = &vmstate_i440fx_pcihost;
> > > >  device_class_set_props(dc, i440fx_props);
> > > >  /* Reason: needs to be wired up by pc_init1 */
> > > >  dc->user_creatable = false;
> > > > diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c index 
> > > > b67cb9c29f..5e323be2e3 100644
> > > > --- a/hw/pci-host/q35.c
> > > > +++ b/hw/pci-host/q35.c
> > > > @@ -165,6 +165,16 @@ static void q35_host_get_pci_hole64_end(Object 
> > > > *obj, Visitor *v,
> > > >  visit_type_uint64(v, name, &value, errp);  }
> > > >  
> > > > +static const VMStateDescription vmstate_q35_pcihost = {
> > > > +.name = "Q35_PCIHost",
> > > > +.version_id = 1,
> > > > +.minimum_version_id = 1,
> > > > +.fields = (VMStateField[]) {
> > > > +VMSTATE_PCIE_HOST(parent_obj, Q35PCIHost),
> > > > +VMSTATE_END_OF_LIST()
> > > > +}
> > > > +};
> > > > +
> > > >  /*
> > > >   * NOTE: setting defaults for the mch.* fields in this table
> > > >   * doesn't work, because mch is a separate QOM object that is @@ 
> > > > -194,6 +204,7 @@ static void q35_host_class_init(ObjectClass *klass, 
> > > > void *data)
> > > >  
> > > >  hc->root_bus_path

Re: Testing the virtio-vhost-user QEMU patch

2020-07-24 Thread Stefan Hajnoczi
On Fri, Jul 24, 2020 at 10:58:45AM +, Alyssa Ross wrote:
> Alyssa Ross  writes:
> 
> > Stefan Hajnoczi  writes:
> >
> >> On Tue, Jul 21, 2020 at 07:14:38AM +, Alyssa Ross wrote:
> >>> Hi -- I hope it's okay me reaching out like this.
> >>> 
> >>> I've been trying to test out the virtio-vhost-user implementation that's
> >>> been posted to this list a couple of times, but have been unable to get
> >>> it to boot a kernel following the steps listed either on
> >>>  or
> >>> .
> >>> 
> >>> Specifically, the kernel appears to be unable to write to the
> >>> virtio-vhost-user device's PCI registers.  I've included the full panic
> >>> output from the kernel at the end of this message.  The panic is
> >>> reproducible with two different kernels I tried (with different configs
> >>> and versions).  I tried both versions of the virtio-vhost-user I was
> >>> able to find[1][2], and both exhibited the same behaviour.
> >>> 
> >>> Is this a known issue?  Am I doing something wrong?
> >>
> >> Hi,
> >> Unfortunately I'm not sure what the issue is. This is an early
> >> virtio-pci register access before a driver for any specific device type
> >> (net, blk, vhost-user, etc) comes into play.
> >
> > Small update here: I tried on another computer, and it worked.  Made
> > sure that it was exactly the same QEMU binary, command line, and VM
> > disk/initrd/kernel, so I think I can fairly confidently say the panic
> > depends on what hardware QEMU is running on.  I set -cpu value to the
> > same on both as well (SandyBridge).
> >
> > I also discovered that it works on my primary computer (the one it
> > panicked on before) with KVM disabled.
> >
> > Note that I've only got so far as finding that it boots on the other
> > machine -- I haven't verified yet that it actually works.
> >
> > Bad host CPU:  Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
> > Good host CPU: AMD EPYC 7401P 24-Core Processor
> >
> > May I ask what host CPUs other people have tested this on?  Having more
> > data would probably be useful.  Could it be an AMD vs. Intel thing?
> 
> I think I've figured it out!
> 
> Sandy Bridge and Ivy Bridge hosts encounter this panic because the
> "additional resources" bar size is too big, at 1 << 36.  If I change
> this to 1 << 35, no more kernel panic.
> 
> Skylake and later are fine with 1 << 36.  In between Ivy Bridge and
> Skylake were Haswell and Broadwell, but I couldn't find anybody who was
> able to help me test on either of those, so I don't know what they do.
> 
> Perhaps related, the hosts that produce panics all seem to have a
> physical address size of 36 bits, while the hosts that work have larger
> physical address sizes, as reported by lscpu.

I have run it successfully on Broadwell but never tried 64GB or larger
shared memory resources.

Stefan


signature.asc
Description: PGP signature


Re: [PATCH] hw/input/virtio-input-hid.c: Don't undef CONFIG_CURSES

2020-07-24 Thread Michael S. Tsirkin
On Thu, Jul 23, 2020 at 08:24:57PM +0100, Peter Maydell wrote:
> virtio-input-hid.c undefines CONFIG_CURSES before including
> ui/console.h. However since commits e2f82e924d057935 and b0766612d16da18
> that header does not have behaviour dependent on CONFIG_CURSES.
> Remove the now-unneeded undef.
> 
> Signed-off-by: Peter Maydell 

Acked-by: Michael S. Tsirkin 

Gerd pls feel free to pick this up.

> ---
> NB: tested with 'make check' only.
> 
>  hw/input/virtio-input-hid.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/hw/input/virtio-input-hid.c b/hw/input/virtio-input-hid.c
> index 09cf2609854..a7a244a95db 100644
> --- a/hw/input/virtio-input-hid.c
> +++ b/hw/input/virtio-input-hid.c
> @@ -12,7 +12,6 @@
>  #include "hw/qdev-properties.h"
>  #include "hw/virtio/virtio-input.h"
>  
> -#undef CONFIG_CURSES
>  #include "ui/console.h"
>  
>  #include "standard-headers/linux/input.h"
> -- 
> 2.20.1




Re: [PATCH] ACPI: Assert that we don't run out of the preallocated memory

2020-07-24 Thread Michael S. Tsirkin
On Thu, Jul 23, 2020 at 06:50:06PM +0100, Peter Maydell wrote:
> On Mon, 22 Jun 2020 at 12:30, Dongjiu Geng  wrote:
> >
> > data_length is a constant value, so we use assert instead of
> > condition check.
> >
> > Signed-off-by: Dongjiu Geng 
> > ---
> > 1. Address Peter and Michael's comments to use assert instead of if().
> > https://lore.kernel.org/qemu-devel/ca79ea28-9ea9-18a5-99ad-25c3eb744...@huawei.com/
> 
> Oops, looks like we forgot about this patch -- Michael, are you
> taking it through your tree or should I take it via target-arm ?
> 
> thanks
> -- PMM

Feel free to merge pls.
Reviewed-by: Michael S. Tsirkin 




Re: [PATCH v1] hw/pci-host: save/restore pci host config register

2020-07-24 Thread Michael S. Tsirkin
On Fri, Jul 24, 2020 at 03:21:58AM +, Wangjing (Hogan, Cloud Infrastructure 
Service Product Dept.) wrote:
> On Sat, Jul 25, 2020 at 10:53:03AM Hogan Wang wrote:
> > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > On Thu, Jul 23, 2020 at 02:12:54PM +0100, Dr. David Alan Gilbert wrote:
> > > > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > > > On Thu, Jul 23, 2020 at 08:53:03PM +0800, Hogan Wang wrote:
> > > > > > From: Hogan Wang 
> > > > > > 
> > > > > > > The pci host config register is used to save the PCI address for 
> > > > > > > read/write config data. If the guest writes a value to the config 
> > > > > > > register and the vcpu is then paused to migrate, after the 
> > > > > > > migration the guest continues to write pci config data, and the 
> > > > > > > written data will be ignored because the new qemu process lost the 
> > > > > > > config register state.
> > > > > > > 
> > > > > > > Reproduction steps are:
> > > > > > > 1. guest booting in seabios.
> > > > > > > 2. guest enables the SMRAM in seabios:piix4_apmc_smm_setup, and then
> > > > > > >    expects to disable the SMRAM by pci_config_writeb.
> > > > > > > 3. after the guest writes the pci host config register, the vcpu is
> > > > > > >    paused to finish migration.
> > > > > > > 4. the guest's write of config data (0x0A) fails to disable the SMRAM
> > > > > > >    because the config register state was lost.
> > > > > > > 5. guest continues to boot and crashes in the ipxe option ROM due to
> > > > > > >    SMRAM being in the enabled state.
> > > > > > 
> > > > > > Signed-off-by: Hogan Wang 
> > > > > 
> > > > > I guess this is like v3 right?
> > > > > 
> > > > > thanks a lot for the patch!
> > > > > 
> > > > > My question stands : does anyone see a way to pass this info 
> > > > > around without breaking migration for all existing machine types?
> > > > 
> > > > You need a .needed clause in the vmstate_i440fx_pcihost and 
> > > > vmstate_q35_pcihost which is a pointer to a function which enables 
> > > > it on new machine types and ignores it on old ones.
> > > > 
> > > > Or, if it always crashes if the SMRAM is enabled, then the migration 
> > > > is dead anyway; so you could make the .needed only save the config 
> > > > if the SMRAM is opened, so you'd get an unknown section error, which 
> > > > is nasty but it would only happen in the case it would crash anyway.
> > > > 
> > > > Dave
> > > 
> > > Problem is we never know whether it's needed.
> > > 
> > > For example: guest programs cf8, then cfc.
> > > Guest on destination can crash if migrated after writing cf8 before 
> > > writing cfc.
> > > But in theory it can also crash if guest assumes
> > > cf8 is unchanged and just writes cfc.
> > > 
> > > So what I'd prefer to do is put it in some data that old qemu ignores. 
> > > Then once qemu on destination is updated, it will start interpreting 
> > > it.
> > 
> > We don't have a way to do that; the choice is:
> >   a) Not sending it for old versions, so you only get the
> > fix for new machine types
> > 
> >   b) Trying to second guess when it will crash
> > 
> > I recommend (a) generally - but the format has no way to ignore unknown 
> > data.
> > 
> > Dave
> > 
> 
> The i440fx and q35 machines integrate the i440FX or ICH9-LPC PCI device by
> default. Referring to the i440FX and ICH9-LPC specifications, there are some
> reserved configuration registers that can be used to save/restore
> PCIHostState.config_reg, like i440FX.config[0x57], which older coreboot used
> to get the RAM size from QEMU.
> 
> which is nasty but friendly to old ones.

Right. So what I propose is a series of two patches:
1. a patch to add it in a clean way to new machine types only.

2. a patch on top for old machine types to stick it in some
   reserved register.

Then people can review both approaches and we can decide.


> > > 
> > > > > 
> > > > > > ---
> > > > > >  hw/pci-host/i440fx.c   | 11 +++
> > > > > >  hw/pci-host/q35.c  | 11 +++
> > > > > >  hw/pci/pci_host.c  | 11 +++
> > > > > >  hw/pci/pcie_host.c | 11 +++
> > > > > >  include/hw/pci/pci_host.h  | 10 ++  
> > > > > > include/hw/pci/pcie_host.h | 10 ++
> > > > > >  6 files changed, 64 insertions(+)
> > > > > > 
> > > > > 
> > > > --
> > > > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> > > 
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
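
For readers not steeped in PCI, a short guest-side illustration of
configuration mechanism #1, which is what this thread is about (not QEMU
code): the address latched through 0xCF8 is exactly the state held in
PCIHostState.config_reg, and losing it between the two steps makes the
following data access go astray. The SMRAM register offset below is only an
example.

#include <stdint.h>
#include <stdio.h>

/* Stand-ins for the platform's port-I/O primitives (outl/outb); here they
 * just print what a real guest would send to the chipset. */
static void io_write32(uint16_t port, uint32_t value)
{
    printf("out32 port=0x%x value=0x%08x\n", (unsigned)port, (unsigned)value);
}

static void io_write8(uint16_t port, uint8_t value)
{
    printf("out8  port=0x%x value=0x%02x\n", (unsigned)port, (unsigned)value);
}

/* PCI configuration mechanism #1: 0xCF8 latches the address (the
 * config_reg this thread wants to migrate), 0xCFC carries the data.
 * Lose the latched address between the two steps, e.g. across a
 * migration, and the data access is misdirected or ignored. */
static void pci_cfg_writeb(uint8_t bus, uint8_t dev, uint8_t fn,
                           uint8_t offset, uint8_t value)
{
    uint32_t addr = (1u << 31) | ((uint32_t)bus << 16) |
                    ((uint32_t)dev << 11) | ((uint32_t)fn << 8) |
                    (offset & 0xFCu);

    io_write32(0xCF8, addr);                 /* step 1: select the register */
    io_write8(0xCFC + (offset & 3), value);  /* step 2: move the data */
}

int main(void)
{
    /* e.g. the SeaBIOS write of 0x0A mentioned above; 0x72 as the i440FX
     * SMRAM register offset is used here for illustration. */
    pci_cfg_writeb(0, 0, 0, 0x72, 0x0A);
    return 0;
}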




Re: [PATCH v1] hw/pci-host: save/restore pci host config register

2020-07-24 Thread Michael S. Tsirkin
On Thu, Jul 23, 2020 at 05:04:53PM +0100, Dr. David Alan Gilbert wrote:
> * Michael S. Tsirkin (m...@redhat.com) wrote:
> > On Thu, Jul 23, 2020 at 02:12:54PM +0100, Dr. David Alan Gilbert wrote:
> > > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > > On Thu, Jul 23, 2020 at 08:53:03PM +0800, Wang King wrote:
> > > > > From: Hogan Wang 
> > > > > 
> > > > > The pci host config register is used to save the PCI address for
> > > > > read/write config data. If the guest writes a value to the config
> > > > > register and the vcpu is then paused to migrate, after the migration
> > > > > the guest continues to write pci config data, and the written data will
> > > > > be ignored because the new qemu process lost the config register state.
> > > > > 
> > > > > Reproduction steps are:
> > > > > 1. guest booting in seabios.
> > > > > 2. guest enables the SMRAM in seabios:piix4_apmc_smm_setup, and then
> > > > >    expects to disable the SMRAM by pci_config_writeb.
> > > > > 3. after the guest writes the pci host config register, the vcpu is
> > > > >    paused to finish migration.
> > > > > 4. the guest's write of config data (0x0A) fails to disable the SMRAM
> > > > >    because the config register state was lost.
> > > > > 5. guest continues to boot and crashes in the ipxe option ROM due to
> > > > >    SMRAM being in the enabled state.
> > > > > 
> > > > > Signed-off-by: Hogan Wang 
> > > > 
> > > > I guess this is like v3 right?
> > > > 
> > > > thanks a lot for the patch!
> > > > 
> > > > My question stands : does anyone see a way to pass this
> > > > info around without breaking migration for all existing
> > > > machine types?
> > > 
> > > You need a .needed clause in the vmstate_i440fx_pcihost and
> > > vmstate_q35_pcihost which is a pointer to a function which enables it on
> > > new machine types and ignores it on old ones.
> > > 
> > > Or, if it always crashes if the SMRAM is enabled, then the migration is
> > > dead anyway; so you could make the .needed only save the config if
> > > the SMRAM is opened, so you'd get an unknown section error, which is
> > > nasty but it would only happen in the case it would crash anyway.
> > > 
> > > Dave
> > 
> > Problem is we never know whether it's needed.
> > 
> > For example: guest programs cf8, then cfc.
> > Guest on destination can crash if migrated after
> > writing cf8 before writing cfc.
> > But in theory it can also crash if guest assumes
> > cf8 is unchanged and just writes cfc.
> > 
> > So what I'd prefer to do is put it in some data that
> > old qemu ignores. Then once qemu on destination
> > is updated, it will start interpreting it.
> 
> We don't have a way to do that; the choice is:
>   a) Not sending it for old versions, so you only get the
> fix for new machine types
> 
>   b) Trying to second guess when it will crash
> 
> I recommend (a) generally - but the format has no
> way to ignore unknown data.
> 
> Dave

How about adding that finally, so we have a way to fix bugs like this one?

> > 
> > > > 
> > > > > ---
> > > > >  hw/pci-host/i440fx.c   | 11 +++
> > > > >  hw/pci-host/q35.c  | 11 +++
> > > > >  hw/pci/pci_host.c  | 11 +++
> > > > >  hw/pci/pcie_host.c | 11 +++
> > > > >  include/hw/pci/pci_host.h  | 10 ++
> > > > >  include/hw/pci/pcie_host.h | 10 ++
> > > > >  6 files changed, 64 insertions(+)
> > > > > 
> > > > > diff --git a/hw/pci-host/i440fx.c b/hw/pci-host/i440fx.c
> > > > > index 8ed2417f0c..17705bb025 100644
> > > > > --- a/hw/pci-host/i440fx.c
> > > > > +++ b/hw/pci-host/i440fx.c
> > > > > @@ -118,6 +118,16 @@ static const VMStateDescription vmstate_i440fx = 
> > > > > {
> > > > >  }
> > > > >  };
> > > > >  
> > > > > +static const VMStateDescription vmstate_i440fx_pcihost = {
> > > > > +.name = "I440FX_PCIHost",
> > > > > +.version_id = 1,
> > > > > +.minimum_version_id = 1,
> > > > > +.fields = (VMStateField[]) {
> > > > > +VMSTATE_PCI_HOST(parent_obj, I440FXState),
> > > > > +VMSTATE_END_OF_LIST()
> > > > > +}
> > > > > +};
> > > > > +
> > > > >  static void i440fx_pcihost_get_pci_hole_start(Object *obj, Visitor 
> > > > > *v,
> > > > >const char *name, void 
> > > > > *opaque,
> > > > >Error **errp)
> > > > > @@ -398,6 +408,7 @@ static void i440fx_pcihost_class_init(ObjectClass 
> > > > > *klass, void *data)
> > > > >  hc->root_bus_path = i440fx_pcihost_root_bus_path;
> > > > >  dc->realize = i440fx_pcihost_realize;
> > > > >  dc->fw_name = "pci";
> > > > > +dc->vmsd = &vmstate_i440fx_pcihost;
> > > > >  device_class_set_props(dc, i440fx_props);
> > > > >  /* Reason: needs to be wired up by pc_init1 */
> > > > >  dc->user_creatable = false;
> > > > > diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
> > > > > index b67cb9c29f..5e323be2e3 100644
> > > > > --- a/hw/pci-host/q35.c
> > > > > +++ b/hw/pci-host/q35.c
> > > > > @@ -16

Re: [PATCH] ACPI: Assert that we don't run out of the preallocated memory

2020-07-24 Thread Peter Maydell
On Fri, 24 Jul 2020 at 13:55, Michael S. Tsirkin  wrote:
>
> On Thu, Jul 23, 2020 at 06:50:06PM +0100, Peter Maydell wrote:
> > On Mon, 22 Jun 2020 at 12:30, Dongjiu Geng  wrote:
> > >
> > > data_length is a constant value, so we use assert instead of
> > > condition check.
> > >
> > > Signed-off-by: Dongjiu Geng 
> > > ---
> > > 1. Address Peter and Michael's comments to use assert instead of if().
> > > https://lore.kernel.org/qemu-devel/ca79ea28-9ea9-18a5-99ad-25c3eb744...@huawei.com/
> >
> > Oops, looks like we forgot about this patch -- Michael, are you
> > taking it through your tree or should I take it via target-arm ?
> >
> > thanks
> > -- PMM
>
> Feel free to merge pls.
> Reviewed-by: Michael S. Tsirkin 



Applied to target-arm.next, thanks.

-- PMM



Re: Possible regression with VGA and resolutions in Windows 10?

2020-07-24 Thread Aaron Lauterer




On 7/24/20 11:41 AM, Gerd Hoffmann wrote:

On Thu, Jul 23, 2020 at 04:24:06PM +0200, Aaron Lauterer wrote:

Hi all,

I think we have a regression introduced in commit 0221d73ce6.

Once I start a Windows 10 VM (build 18363) with `-device VGA` I have only the 
following resolutions to choose from instead of the much longer list:

1920x1080
1024x768
800x600


That is probably vgabios gaining edid support.

The list should be longer though; the qemu edid block has more
resolutions included.  The qemu-edid tool is a command line
interface to the edid generator, for testing purposes.

Try "qemu-edid | edid-decode" to see the decoded edid data.

Linux guests have the raw edid block in sysfs, see
/sys/class/drm/card0/card0-Virtual-1/edid.


   -device 'VGA,id=vga,vgamem_mb=32,bus=pci.0,addr=0x2' \


Try adding "xres=,yres=" of you want a specific
display resolution.

Try adding "edid=off" to return to previous behavior.


I did try that initially but it seemingly was ignored by Windows.

But I took another look at it and I was able to observe the following behavior 
with Win10:

Boot it first with edid enabled (by default) and I see only the short list of 
possible resolutions (I forgot 640x480, which is only shown in the Display Adapter 
Properties -> List all modes).

Boot it with edid=off and nothing changes, still the short list.
Uninstall the `Microsoft Basic Display Adapter` in Device Manager and reboot 
the VM. Now it recognizes all resolutions again.

The behavior is similar when setting a custom resolution with the xres and yres 
parameters. Setting it the first time works fine and it is shown along with the 
short list. Setting it to something different on the next boot will not be 
recognized unless the display adapter is uninstalled and the VM rebooted.

The same applies when changing from edid=off to edid=on. Only after 
uninstalling the display driver and a reboot will it change from showing the 
long list of resolution to the mentioned short one.


Knowing that uninstalling the display driver after setting edid=off will help 
is good enough for now.

Thanks for pointing that out again and nudging me in the right direction.

All the best,
Aaron




HTH & take care,
   Gerd
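
For reference, the knobs mentioned above go on the -device line; an
illustrative invocation with the other arguments elided:

  # pin a preferred mode in the EDID block (resolution values are examples)
  qemu-system-x86_64 ... -device VGA,id=vga,vgamem_mb=32,xres=1280,yres=800

  # or turn EDID off again to get the previous behaviour
  qemu-system-x86_64 ... -device VGA,id=vga,vgamem_mb=32,edid=off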








Re: [PATCH-for-5.1 v3] hw/misc/aspeed_sdmc: Fix incorrect memory size

2020-07-24 Thread Peter Maydell
On Tue, 21 Jul 2020 at 13:42, Philippe Mathieu-Daudé  wrote:
>
> The SDRAM Memory Controller has a 32-bit address bus, thus
> supports up to 4 GiB of DRAM. There is a signed to unsigned
> conversion error with the AST2600 maximum memory size:
>
>   (uint64_t)(2048 << 20) = (uint64_t)(-2147483648)
>                          = 0xffffffff80000000
>                          = 16 EiB - 2 GiB
>
> Fix by using the IEC suffixes which are usually safer, and add
> an assertion check to verify the memory is valid. This would have
> caught this bug:
>
>   $ qemu-system-arm -M ast2600-evb
>   qemu-system-arm: hw/misc/aspeed_sdmc.c:258: aspeed_sdmc_realize: Assertion 
> `asc->max_ram_size < 4 * GiB' failed.
>   Aborted (core dumped)
>
> Fixes: 1550d72679 ("aspeed/sdmc: Add AST2600 support")
> Reviewed-by: Cédric Le Goater 
> Signed-off-by: Philippe Mathieu-Daudé 
> --



Applied to target-arm.next, thanks.

-- PMM
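
As an aside, the conversion error above is easy to reproduce in isolation;
a small self-contained illustration (not QEMU code):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define GiB (1ULL << 30)

int main(void)
{
    /* 2048 << 20 is evaluated in 32-bit signed int arithmetic and overflows
     * (in practice yielding INT32_MIN); the cast to uint64_t then
     * sign-extends it to 0xffffffff80000000. */
    uint64_t wrong = (uint64_t)(2048 << 20);

    /* IEC-suffix arithmetic stays unsigned 64-bit throughout. */
    uint64_t right = 2 * GiB;

    printf("wrong = 0x%" PRIx64 "\n", wrong);   /* 0xffffffff80000000 */
    printf("right = 0x%" PRIx64 "\n", right);   /* 0x80000000 */
    return 0;
}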



Re: [PATCH for-5.1] target/arm: Always pass cacheattr in S1_ptw_translate

2020-07-24 Thread Peter Maydell
On Tue, 21 Jul 2020 at 17:35, Richard Henderson
 wrote:
>
> When we changed the interface of get_phys_addr_lpae to require
> the cacheattr parameter, this spot was missed.  The compiler is
> unable to detect the use of NULL vs the nonnull attribute here.
>
> Fixes: 7e98e21c098
> Reported-by: Jan Kiszka 
> Signed-off-by: Richard Henderson 
> ---
>  target/arm/helper.c | 19 ++-
>  1 file changed, 6 insertions(+), 13 deletions(-)



Applied to target-arm.next, thanks.

-- PMM



[PATCH v3 0/4] block: improve error reporting for unsupported O_DIRECT

2020-07-24 Thread Daniel P . Berrangé
v1: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg00269.html
v2: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg00589.html

See patch commit messages for rationale

Ideally we would convert other callers of qemu_open to use
qemu_open_err, and eventually remove qemu_open, renaming
qemu_open_err back to qemu_open.  Given soft freeze is just
days away though, I'm hoping this series is simple enough
to get into this release, leaving bigger cleanup for later.

Improved in v3:

 - Re-arrange the patch series, so that the conversion to Error
   takes place first, then the improve O_DIRECT reporting
 - Rename existing method to qemu_open_old
 - Use a pair of new methods qemu_open + qemu_create to improve
   arg checking

Improved in v2:

 - Mention that qemu_open_err is preferred over qemu_open
 - Get rid of obsolete error_report call
 - Simplify O_DIRECT handling
 - Fixup iotests for changed error message text

Daniel P. Berrangé (4):
  util: rename qemu_open() to qemu_open_old()
  util: introduce qemu_open and qemu_create with error reporting
  util: give a specific error message when O_DIRECT doesn't work
  block: switch to use qemu_open/qemu_create for improved errors

 accel/kvm/kvm-all.c|  2 +-
 backends/rng-random.c  |  2 +-
 backends/tpm/tpm_passthrough.c |  8 +--
 block/file-posix.c | 16 +++---
 block/file-win32.c |  5 +-
 block/vvfat.c  |  5 +-
 chardev/char-fd.c  |  2 +-
 chardev/char-pipe.c|  6 +--
 chardev/char.c |  2 +-
 dump/dump.c|  2 +-
 hw/s390x/s390-skeys.c  |  2 +-
 hw/usb/host-libusb.c   |  2 +-
 hw/vfio/common.c   |  4 +-
 include/qemu/osdep.h   |  8 ++-
 io/channel-file.c  |  2 +-
 net/vhost-vdpa.c   |  2 +-
 os-posix.c |  2 +-
 qga/channel-posix.c|  4 +-
 qga/commands-posix.c   |  6 +--
 target/arm/kvm.c   |  2 +-
 tests/qemu-iotests/051.out |  4 +-
 tests/qemu-iotests/051.pc.out  |  4 +-
 tests/qemu-iotests/061.out |  2 +-
 tests/qemu-iotests/069.out |  2 +-
 tests/qemu-iotests/082.out |  4 +-
 tests/qemu-iotests/111.out |  2 +-
 tests/qemu-iotests/226.out |  6 +--
 tests/qemu-iotests/232.out | 12 ++---
 tests/qemu-iotests/244.out |  6 +--
 ui/console.c   |  2 +-
 util/osdep.c   | 91 +-
 util/oslib-posix.c |  2 +-
 32 files changed, 144 insertions(+), 77 deletions(-)

-- 
2.26.2





[PATCH v3 3/4] util: give a specific error message when O_DIRECT doesn't work

2020-07-24 Thread Daniel P . Berrangé
A common error scenario is to tell QEMU to use O_DIRECT in combination
with a filesystem that doesn't support it. To aid users in diagnosing
their mistake we want to provide a clear error message when this happens.

Signed-off-by: Daniel P. Berrangé 
---
 util/osdep.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/util/osdep.c b/util/osdep.c
index 5c0f4684b1..ac3e7f48f1 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -348,6 +348,19 @@ qemu_open_internal(const char *name, int flags, mode_t 
mode, Error **errp)
 if (flags & O_CREAT) {
 action = "create";
 }
+#ifdef O_DIRECT
+if (errno == EINVAL && (flags & O_DIRECT)) {
+ret = open(name, flags & ~O_DIRECT, mode);
+if (ret != -1) {
+close(ret);
+error_setg(errp, "Could not %s '%s' flags 0x%x: "
+   "filesystem does not support O_DIRECT",
+   action, name, flags);
+errno = EINVAL; /* close() clobbered earlier errno */
+return -1;
+}
+}
+#endif /* O_DIRECT */
 error_setg_errno(errp, errno, "Could not %s '%s' flags 0x%x",
  action, name, flags);
 }
-- 
2.26.2




[PATCH v3 4/4] block: switch to use qemu_open/qemu_create for improved errors

2020-07-24 Thread Daniel P . Berrangé
Currently, if using cache=none on a filesystem lacking
O_DIRECT such as tmpfs, at startup QEMU prints

qemu-system-x86_64: -drive file=/tmp/foo.img,cache=none: file system may not 
support O_DIRECT
qemu-system-x86_64: -drive file=/tmp/foo.img,cache=none: Could not open 
'/tmp/foo.img': Invalid argument

while at QMP level the hint is missing, so QEMU reports just

  "error": {
  "class": "GenericError",
  "desc": "Could not open '/tmp/foo.img': Invalid argument"
  }

which is close to useless for the end user trying to figure out what
they did wrong.

With this change at startup QEMU prints

qemu-system-x86_64: -drive file=/tmp/foo.img,cache=none: Unable to open 
'/tmp/foo.img' flags 0x4000: filesystem does not support O_DIRECT

while at the QMP level QEMU reports a massively more informative

  "error": {
 "class": "GenericError",
 "desc": "Unable to open '/tmp/foo.img' flags 0x4002: filesystem does not 
support O_DIRECT"
  }

Signed-off-by: Daniel P. Berrangé 
---
 block/file-posix.c| 18 +++---
 block/file-win32.c|  6 ++
 tests/qemu-iotests/051.out|  4 ++--
 tests/qemu-iotests/051.pc.out |  4 ++--
 tests/qemu-iotests/061.out|  2 +-
 tests/qemu-iotests/069.out|  2 +-
 tests/qemu-iotests/082.out|  4 ++--
 tests/qemu-iotests/111.out|  2 +-
 tests/qemu-iotests/226.out|  6 +++---
 tests/qemu-iotests/232.out| 12 ++--
 tests/qemu-iotests/244.out|  6 +++---
 11 files changed, 30 insertions(+), 36 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index bac2566f10..c63926d592 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -630,11 +630,10 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
 raw_parse_flags(bdrv_flags, &s->open_flags, false);
 
 s->fd = -1;
-fd = qemu_open_old(filename, s->open_flags, 0644);
+fd = qemu_open(filename, s->open_flags, errp);
 ret = fd < 0 ? -errno : 0;
 
 if (ret < 0) {
-error_setg_file_open(errp, -ret, filename);
 if (ret == -EROFS) {
 ret = -EACCES;
 }
@@ -1032,15 +1031,13 @@ static int raw_reconfigure_getfd(BlockDriverState *bs, 
int flags,
 }
 }
 
-/* If we cannot use fcntl, or fcntl failed, fall back to qemu_open_old() */
+/* If we cannot use fcntl, or fcntl failed, fall back to qemu_open() */
 if (fd == -1) {
 const char *normalized_filename = bs->filename;
 ret = raw_normalize_devicepath(&normalized_filename, errp);
 if (ret >= 0) {
-assert(!(*open_flags & O_CREAT));
-fd = qemu_open_old(normalized_filename, *open_flags);
+fd = qemu_open(normalized_filename, *open_flags, errp);
 if (fd == -1) {
-error_setg_errno(errp, errno, "Could not reopen file");
 return -1;
 }
 }
@@ -2411,10 +2408,9 @@ raw_co_create(BlockdevCreateOptions *options, Error 
**errp)
 }
 
 /* Create file */
-fd = qemu_open_old(file_opts->filename, O_RDWR | O_CREAT | O_BINARY, 0644);
+fd = qemu_create(file_opts->filename, O_RDWR | O_BINARY, 0644, errp);
 if (fd < 0) {
 result = -errno;
-error_setg_errno(errp, -result, "Could not create file");
 goto out;
 }
 
@@ -3335,7 +3331,7 @@ static bool setup_cdrom(char *bsd_path, Error **errp)
 for (index = 0; index < num_of_test_partitions; index++) {
 snprintf(test_partition, sizeof(test_partition), "%ss%d", bsd_path,
  index);
-fd = qemu_open_old(test_partition, O_RDONLY | O_BINARY | O_LARGEFILE);
+fd = qemu_open(test_partition, O_RDONLY | O_BINARY | O_LARGEFILE, 
NULL);
 if (fd >= 0) {
 partition_found = true;
 qemu_close(fd);
@@ -3653,7 +3649,7 @@ static int cdrom_probe_device(const char *filename)
 int prio = 0;
 struct stat st;
 
-fd = qemu_open_old(filename, O_RDONLY | O_NONBLOCK);
+fd = qemu_open(filename, O_RDONLY | O_NONBLOCK, NULL);
 if (fd < 0) {
 goto out;
 }
@@ -3787,7 +3783,7 @@ static int cdrom_reopen(BlockDriverState *bs)
  */
 if (s->fd >= 0)
 qemu_close(s->fd);
-fd = qemu_open_old(bs->filename, s->open_flags, 0644);
+fd = qemu_open(bs->filename, s->open_flags, NULL);
 if (fd < 0) {
 s->fd = -1;
 return -EIO;
diff --git a/block/file-win32.c b/block/file-win32.c
index 8c1845830e..1a31f8a5ba 100644
--- a/block/file-win32.c
+++ b/block/file-win32.c
@@ -576,11 +576,9 @@ static int raw_co_create(BlockdevCreateOptions *options, 
Error **errp)
 return -EINVAL;
 }
 
-fd = qemu_open_old(file_opts->filename,
-   O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
-   0644);
+fd = qemu_create(file_opts->filename, O_WRONLY | O_TRUNC | O_BINARY,
+ 0644, errp);
 if (fd < 0) {
-error_setg_errno(errp, errno, "Could not create file");
   

[PATCH v3 1/4] util: rename qemu_open() to qemu_open_old()

2020-07-24 Thread Daniel P . Berrangé
We want to introduce a new version of qemu_open() that uses an Error
object for reporting problems, and make it the preferred interface.
Rename the existing method to release the namespace for the new impl.

Signed-off-by: Daniel P. Berrangé 
---
 accel/kvm/kvm-all.c|  2 +-
 backends/rng-random.c  |  2 +-
 backends/tpm/tpm_passthrough.c |  8 
 block/file-posix.c | 14 +++---
 block/file-win32.c |  5 +++--
 block/vvfat.c  |  5 +++--
 chardev/char-fd.c  |  2 +-
 chardev/char-pipe.c|  6 +++---
 chardev/char.c |  2 +-
 dump/dump.c|  2 +-
 hw/s390x/s390-skeys.c  |  2 +-
 hw/usb/host-libusb.c   |  2 +-
 hw/vfio/common.c   |  4 ++--
 include/qemu/osdep.h   |  2 +-
 io/channel-file.c  |  2 +-
 net/vhost-vdpa.c   |  2 +-
 os-posix.c |  2 +-
 qga/channel-posix.c|  4 ++--
 qga/commands-posix.c   |  6 +++---
 target/arm/kvm.c   |  2 +-
 ui/console.c   |  2 +-
 util/osdep.c   |  2 +-
 util/oslib-posix.c |  2 +-
 23 files changed, 42 insertions(+), 40 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 63ef6af9a1..ad8b315b35 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2013,7 +2013,7 @@ static int kvm_init(MachineState *ms)
 #endif
 QLIST_INIT(&s->kvm_parked_vcpus);
 s->vmfd = -1;
-s->fd = qemu_open("/dev/kvm", O_RDWR);
+s->fd = qemu_open_old("/dev/kvm", O_RDWR);
 if (s->fd == -1) {
 fprintf(stderr, "Could not access KVM kernel module: %m\n");
 ret = -errno;
diff --git a/backends/rng-random.c b/backends/rng-random.c
index 32998d8ee7..245b12ab24 100644
--- a/backends/rng-random.c
+++ b/backends/rng-random.c
@@ -75,7 +75,7 @@ static void rng_random_opened(RngBackend *b, Error **errp)
 error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
"filename", "a valid filename");
 } else {
-s->fd = qemu_open(s->filename, O_RDONLY | O_NONBLOCK);
+s->fd = qemu_open_old(s->filename, O_RDONLY | O_NONBLOCK);
 if (s->fd == -1) {
 error_setg_file_open(errp, errno, s->filename);
 }
diff --git a/backends/tpm/tpm_passthrough.c b/backends/tpm/tpm_passthrough.c
index 7403807ec4..81e2d8f531 100644
--- a/backends/tpm/tpm_passthrough.c
+++ b/backends/tpm/tpm_passthrough.c
@@ -217,7 +217,7 @@ static int tpm_passthrough_open_sysfs_cancel(TPMPassthruState *tpm_pt)
 char path[PATH_MAX];
 
 if (tpm_pt->options->cancel_path) {
-fd = qemu_open(tpm_pt->options->cancel_path, O_WRONLY);
+fd = qemu_open_old(tpm_pt->options->cancel_path, O_WRONLY);
 if (fd < 0) {
 error_report("tpm_passthrough: Could not open TPM cancel path: %s",
  strerror(errno));
@@ -235,11 +235,11 @@ static int tpm_passthrough_open_sysfs_cancel(TPMPassthruState *tpm_pt)
 dev++;
 if (snprintf(path, sizeof(path), "/sys/class/tpm/%s/device/cancel",
  dev) < sizeof(path)) {
-fd = qemu_open(path, O_WRONLY);
+fd = qemu_open_old(path, O_WRONLY);
 if (fd < 0) {
 if (snprintf(path, sizeof(path), "/sys/class/misc/%s/device/cancel",
  dev) < sizeof(path)) {
-fd = qemu_open(path, O_WRONLY);
+fd = qemu_open_old(path, O_WRONLY);
 }
 }
 }
@@ -271,7 +271,7 @@ tpm_passthrough_handle_device_opts(TPMPassthruState *tpm_pt, QemuOpts *opts)
 }
 
 tpm_pt->tpm_dev = value ? value : TPM_PASSTHROUGH_DEFAULT_DEVICE;
-tpm_pt->tpm_fd = qemu_open(tpm_pt->tpm_dev, O_RDWR);
+tpm_pt->tpm_fd = qemu_open_old(tpm_pt->tpm_dev, O_RDWR);
 if (tpm_pt->tpm_fd < 0) {
 error_report("Cannot access TPM device using '%s': %s",
  tpm_pt->tpm_dev, strerror(errno));
diff --git a/block/file-posix.c b/block/file-posix.c
index 9a00d4190a..bac2566f10 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -630,7 +630,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
 raw_parse_flags(bdrv_flags, &s->open_flags, false);
 
 s->fd = -1;
-fd = qemu_open(filename, s->open_flags, 0644);
+fd = qemu_open_old(filename, s->open_flags, 0644);
 ret = fd < 0 ? -errno : 0;
 
 if (ret < 0) {
@@ -1032,13 +1032,13 @@ static int raw_reconfigure_getfd(BlockDriverState *bs, int flags,
 }
 }
 
-/* If we cannot use fcntl, or fcntl failed, fall back to qemu_open() */
+/* If we cannot use fcntl, or fcntl failed, fall back to qemu_open_old() */
 if (fd == -1) {
 const char *normalized_filename = bs->filename;
 ret = raw_normalize_devicepath(&normalized_filename, errp);
 if (ret >= 0) {
 assert(!(*open_flags & O_CREAT));
-fd = qemu_open(normalized_filename, 

[PATCH v3 2/4] util: introduce qemu_open and qemu_create with error reporting

2020-07-24 Thread Daniel P . Berrangé
This introduces two new helper methods

  int qemu_open(const char *name, int flags, Error **errp);
  int qemu_create(const char *name, int flags, mode_t mode, Error **errp);

Note that with this design we no longer require or even accept the
O_CREAT flag. Avoiding overloading the two distinct operations
means we can avoid variable arguments which would prevent 'errp' from
being the last argument. It also gives us a guarantee that the 'mode' is
given when creating files, avoiding a latent security bug.

Signed-off-by: Daniel P. Berrangé 
---
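To illustrate the intended call sites (a sketch only, not part of this patch:
the file names and surrounding function are made up, and an "Error **errp"
parameter is assumed to be in scope):

    /* plain open: O_CREAT is rejected, failure details land in *errp */
    int fd = qemu_open("/path/to/existing.img", O_RDONLY | O_BINARY, errp);
    if (fd < 0) {
        return -1;   /* *errp already describes the error */
    }

    /* creation: mode is mandatory, O_CREAT is added internally */
    int newfd = qemu_create("/path/to/new.img",
                            O_WRONLY | O_TRUNC | O_BINARY, 0644, errp);
    if (newfd < 0) {
        qemu_close(fd);
        return -1;
    }
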
 include/qemu/osdep.h |  6 
 util/osdep.c | 78 
 2 files changed, 71 insertions(+), 13 deletions(-)

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index 3a16e58932..ca24ebe211 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -494,7 +494,13 @@ int qemu_madvise(void *addr, size_t len, int advice);
 int qemu_mprotect_rwx(void *addr, size_t size);
 int qemu_mprotect_none(void *addr, size_t size);
 
+/*
+ * Don't introduce new usage of this function, prefer the following
+ * qemu_open/qemu_create that take a "Error **errp"
+ */
 int qemu_open_old(const char *name, int flags, ...);
+int qemu_open(const char *name, int flags, Error **errp);
+int qemu_create(const char *name, int flags, mode_t mode, Error **errp);
 int qemu_close(int fd);
 int qemu_unlink(const char *name);
 #ifndef _WIN32
diff --git a/util/osdep.c b/util/osdep.c
index 9df1b6adec..5c0f4684b1 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -22,6 +22,7 @@
  * THE SOFTWARE.
  */
 #include "qemu/osdep.h"
+#include "qapi/error.h"
 
 /* Needed early for CONFIG_BSD etc. */
 
@@ -282,10 +283,10 @@ int qemu_lock_fd_test(int fd, int64_t start, int64_t len, bool exclusive)
 /*
  * Opens a file with FD_CLOEXEC set
  */
-int qemu_open_old(const char *name, int flags, ...)
+static int
+qemu_open_internal(const char *name, int flags, mode_t mode, Error **errp)
 {
 int ret;
-int mode = 0;
 
 #ifndef _WIN32
 const char *fdset_id_str;
@@ -297,24 +298,31 @@ int qemu_open_old(const char *name, int flags, ...)
 
 fdset_id = qemu_parse_fdset(fdset_id_str);
 if (fdset_id == -1) {
+error_setg(errp, "Could not parse fdset %s", name);
 errno = EINVAL;
 return -1;
 }
 
 fd = monitor_fdset_get_fd(fdset_id, flags);
 if (fd < 0) {
+error_setg_errno(errp, -fd, "Could not acquire FD for %s flags %x",
+ name, flags);
 errno = -fd;
 return -1;
 }
 
 dupfd = qemu_dup_flags(fd, flags);
 if (dupfd == -1) {
+error_setg_errno(errp, errno, "Could not dup FD for %s flags %x",
+ name, flags);
 return -1;
 }
 
 ret = monitor_fdset_dup_fd_add(fdset_id, dupfd);
 if (ret == -1) {
 close(dupfd);
+error_setg(errp, "Could not save FD for %s flags %x",
+   name, flags);
 errno = EINVAL;
 return -1;
 }
@@ -323,22 +331,66 @@ int qemu_open_old(const char *name, int flags, ...)
 }
 #endif
 
-if (flags & O_CREAT) {
-va_list ap;
-
-va_start(ap, flags);
-mode = va_arg(ap, int);
-va_end(ap);
-}
-
 #ifdef O_CLOEXEC
-ret = open(name, flags | O_CLOEXEC, mode);
-#else
+flags |= O_CLOEXEC;
+#endif /* O_CLOEXEC */
+
 ret = open(name, flags, mode);
+
+#ifndef O_CLOEXEC
 if (ret >= 0) {
 qemu_set_cloexec(ret);
 }
-#endif
+#endif /* ! O_CLOEXEC */
+
+if (ret == -1) {
+const char *action = "open";
+if (flags & O_CREAT) {
+action = "create";
+}
+error_setg_errno(errp, errno, "Could not %s '%s' flags 0x%x",
+ action, name, flags);
+}
+
+
+return ret;
+}
+
+
+int qemu_open(const char *name, int flags, Error **errp)
+{
+if (flags & O_CREAT) {
+error_setg(errp,
+   "Invalid O_CREAT flag passed to qemu_open, use qemu_create");
+return -1;
+}
+return qemu_open_internal(name, flags, 0, errp);
+}
+
+
+int qemu_create(const char *name, int flags, mode_t mode, Error **errp)
+{
+if (flags & O_CREAT) {
+error_setg(errp, "Redundant O_CREAT flag passed to qemu_create");
+return -1;
+}
+return qemu_open_internal(name, flags | O_CREAT, mode, errp);
+}
+
+
+int qemu_open_old(const char *name, int flags, ...)
+{
+va_list ap;
+mode_t mode = 0;
+int ret;
+
+va_start(ap, flags);
+if (flags & O_CREAT) {
+mode = va_arg(ap, int);
+}
+va_end(ap);
+
+ret = qemu_open_internal(name, flags, mode, NULL);
 
 #ifdef O_DIRECT
 if (ret == -1 && errno == EINVAL && (flags & O_DIRECT)) {
-- 
2.26.2




[BUG] vhost-vdpa: qemu-system-s390x crashes with second virtio-net-ccw device

2020-07-24 Thread Cornelia Huck
When I start qemu with a second virtio-net-ccw device (i.e. adding
-device virtio-net-ccw in addition to the autogenerated device), I get
a segfault. gdb points to

#0  0x55d6ab52681d in virtio_net_get_config (vdev=, config=0x55d6ad9e3f80 "RT") at /home/cohuck/git/qemu/hw/net/virtio-net.c:146
146 if (nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {

(backtrace doesn't go further)

Starting qemu with no additional "-device virtio-net-ccw" (i.e., only
the autogenerated virtio-net-ccw device is present) works. Specifying
several "-device virtio-net-pci" works as well.

Things break with 1e0a84ea49b6 ("vhost-vdpa: introduce vhost-vdpa net
client"), 38140cc4d971 ("vhost_net: introduce set_config & get_config")
works (in-between state does not compile).

This is reproducible with tcg as well. Same problem both with
--enable-vhost-vdpa and --disable-vhost-vdpa.

Have not yet tried to figure out what might be special with
virtio-ccw... anyone have an idea?

[This should probably be considered a blocker?]
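For what it's worth, if this is just nc->peer being NULL here for the extra
device, a defensive check in virtio_net_get_config() along the lines of the
sketch below would at least avoid the segfault -- untested, and it would not
explain why the ccw and pci cases behave differently:

    /* hw/net/virtio-net.c, virtio_net_get_config() -- untested sketch */
    if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
        /* existing vhost-vdpa config handling stays unchanged */
    }
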




[PATCH] tpm_emulator: Report an error if chardev is missing

2020-07-24 Thread Stefan Berger
This patch fixes the odd error reporting when trying to send a file
descriptor to the TPM emulator if one has not passed a valid chardev.

$ x86_64-softmmu/qemu-system-x86_64 -tpmdev emulator,id=tpm0
qemu-system-x86_64: -tpmdev emulator,id=tpm0: tpm-emulator: Failed to send CMD_SET_DATAFD: Success
qemu-system-x86_64: -tpmdev emulator,id=tpm0: tpm-emulator: Could not cleanly shutdown the TPM: Success

This is the new error report:

$ x86_64-softmmu/qemu-system-x86_64 -tpmdev emulator,id=tpm0
qemu-system-x86_64: -tpmdev emulator,id=tpm0: tpm-emulator: missing chardev

This change does not hide the display of supported TPM types if a non-existent 
type is passed:

$ x86_64-softmmu/qemu-system-x86_64 -tpmdev nonexistent,id=tpm0
qemu-system-x86_64: -tpmdev nonexistent,id=tpm0: Parameter 'type' expects a TPM backend type
Supported TPM types (choose only one):
 passthrough   Passthrough TPM backend driver
    emulator   TPM emulator backend driver

Signed-off-by: Stefan Berger 
---
 backends/tpm/tpm_emulator.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/backends/tpm/tpm_emulator.c b/backends/tpm/tpm_emulator.c
index 9605339f93..55cbc9c60e 100644
--- a/backends/tpm/tpm_emulator.c
+++ b/backends/tpm/tpm_emulator.c
@@ -568,6 +568,9 @@ static int tpm_emulator_handle_device_opts(TPMEmulator *tpm_emu, QemuOpts *opts)
 }
 
 tpm_emu->options->chardev = g_strdup(value);
+} else {
+error_report("tpm-emulator: missing chardev");
+goto err;
 }
 
 if (tpm_emulator_prepare_data_fd(tpm_emu) < 0) {
@@ -925,6 +928,11 @@ static void tpm_emulator_shutdown(TPMEmulator *tpm_emu)
 {
 ptm_res res;
 
+if (!tpm_emu->options->chardev) {
+/* was never properly initialized */
+return;
+}
+
 if (tpm_emulator_ctrlcmd(tpm_emu, CMD_SHUTDOWN, &res, 0, sizeof(res)) < 0) {
 error_report("tpm-emulator: Could not cleanly shutdown the TPM: %s",
  strerror(errno));
-- 
2.24.1



