date:20200116

On Thu, 16 Jan 2020 17:35:32 +0100
Thomas Huth  wrote:

> On 15/01/2020 16.07, Igor Mammedov wrote:
> > Use GString to pass argument to make_cli() so that it would be easy
> > to dynamically change test case arguments from main(). The follow up
> > patch will use it to change RAM size options depending on target.
> > 
> > While at it cleanup 'cli' freeing, using g_autofree annotation.  
> 
> Hmm, I'd use g_autofree for new code or do it in a separate cleanup
> patch, but doing this here distracts quite a bit from the real changes
> that you are doing...
I'll split it into separate patch

> 
> > Signed-off-by: Igor Mammedov 
> > ---
> > PS:
> >   made as a separate patch so it won't clutter followup testcase changes.
> > 
> > CC: th...@redhat.com
> > CC: lviv...@redhat.com
> > ---
> >  tests/qtest/numa-test.c | 38 ++
> >  1 file changed, 14 insertions(+), 24 deletions(-)
> > 
> > diff --git a/tests/qtest/numa-test.c b/tests/qtest/numa-test.c
> > index 17dd807..a696dfd 100644
> > --- a/tests/qtest/numa-test.c
> > +++ b/tests/qtest/numa-test.c
> > @@ -14,16 +14,16 @@
> >  #include "qapi/qmp/qdict.h"
> >  #include "qapi/qmp/qlist.h"
> >  
> > -static char *make_cli(const char *generic_cli, const char *test_cli)
> > +static char *make_cli(const GString *generic_cli, const char *test_cli)
> >  {
> > -return g_strdup_printf("%s %s", generic_cli ? generic_cli : "", 
> > test_cli);
> > +return g_strdup_printf("%s %s", generic_cli->str, test_cli);
> >  }  
> [...]
> > @@ -539,11 +529,11 @@ static void pc_hmat_erange_cfg(const void *data)
> >  
> >  int main(int argc, char **argv)
> >  {
> > -const char *args = NULL;
> > +g_autoptr(GString) args = g_string_new("");  
> 
> I think g_string_new(NULL) would be better?
> 
> >  const char *arch = qtest_get_arch();
> >  
> >  if (strcmp(arch, "aarch64") == 0) {
> > -args = "-machine virt";
> > +g_string_append(args, " -machine virt")>  }  
> 
> Is this really required? Looking at your next patch, you could also
> simply do
> 
>   args = " -object memory-backend-ram,id=ram,size=xxxM"
xxx is variable so options are
 1 build this part of CLI dynamically
 2 mostly duplicate testcase function and include per target size there
 3 make up a test data structure and pass that to test cases

Given simplicity of current testcases, I'd prefer continue with
passing CLI as testcase data (option #1).


> 
> there? So using a GString seems overkill to me here.
> 
>  Thomas

Re: [PATCH v2 85/86] numa: make exit() usage consistent

On Thu, 16 Jan 2020 17:43:30 +0100
Thomas Huth  wrote:

> On 15/01/2020 16.07, Igor Mammedov wrote:
> > Signed-off-by: Igor Mammedov 
> > ---
> > CC: ehabk...@redhat.com
> > ---
> >  hw/core/numa.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/core/numa.c b/hw/core/numa.c
> > index 3177066..47d5ea1 100644
> > --- a/hw/core/numa.c
> > +++ b/hw/core/numa.c
> > @@ -718,7 +718,7 @@ void numa_complete_configuration(MachineState *ms)
> >  /* Report large node IDs first, to make mistakes easier to spot */
> >  if (!numa_info[i].present) {
> >  error_report("numa: Node ID missing: %d", i);
> > -exit(1);
> > +exit(EXIT_FAILURE);
> >  }
> >  }
> >  
> > @@ -759,7 +759,7 @@ void numa_complete_configuration(MachineState *ms)
> >  error_report("total memory for NUMA nodes (0x%" PRIx64 ")"
> >   " should equal RAM size (0x" RAM_ADDR_FMT ")",
> >   numa_total, ram_size);
> > -exit(1);
> > +exit(EXIT_FAILURE);
> >  }
> >  
> >  if (!numa_uses_legacy_mem()) {  
> 
> Please don't. We've had exit(1) vs. exit(EXIT_FAILURE) discussions in
> the past already, and IIRC there was no clear conclusion which one we
> want to use. There are examples of changes to the numeric value in our
> git history (see d54e4d7659ebecd0e1fa7ffc3e954197e09f8a1f for example),
> and example of the other way round (see 4d1275c24d5d64d22ec4a30ce1b6a0
> for example).
> 
> Your patch series here is already big enough, so I suggest to drop this
> patch from the series. If you want to change this, please suggest an
> update to CODING_STYLE.rst first so that we agree upon one style for
> exit() ... otherwise somebody else might change this back into numeric
> values in a couple of months just because they have a different taste.

Ok, will do.

There are other patches that introduce new exit(EXIT_FAILURE),
is it fine to use that or should I stick to the style used in nearby code?

> 
>  Thomas

[Bug 1859916] Re: coreaudio not working on MacOS

2020-01-16 Thread JS

I am not using any audio enhancing software.

I start trying qemu + Ubuntu Desktop a few weeks ago and put all
configurations together last week, so I never try coreaudio on 4.1.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1859916

Title:
  coreaudio not working on MacOS

Status in QEMU:
  New

Bug description:
  OS: MacOS Catalina 10.15.2

  qemu-system-x86_64 -version
  QEMU emulator version 4.2.50 (v4.2.0-13-g084a398bf8-dirty)
  Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers

  Qemu install via brew: brew install qemu

  ---

  I use following to install Ubuntu 18.04 desktop successfully:-

  IMG_CD=$HOME/Downloads/iso/ubuntu-18.04.3-desktop-amd64.iso
  IMG_FILE=$HOME/code/vm/qemu/u64d01.qcow2
  MAC_ADDR=xx:xx:xx:xx:xx:xx

  qemu-system-x86_64 \
  -no-user-config -nodefaults \
  -show-cursor \
  -name u64d01 \
  -M q35,accel=hvf,usb=off,vmport=off \
  -cpu host -smp 4 -m 2048 \
  -overcommit mem-lock=off \
  -overcommit cpu-pm=off \
  -rtc base=utc,clock=host \
  \
  -device virtio-tablet-pci \
  -device virtio-vga \
  \
  -device virtio-blk-pci,drive=ssd1 \
  -drive id=ssd1,file=$IMG_FILE,if=none,format=qcow2 \
  \
  -device virtio-net-pci,netdev=nic1,mac=$MAC_ADDR \
  -netdev user,id=nic1,ipv4=on,ipv6=on,hostname=u64d01,hostfwd=tcp::-:22 \
  \
  -device ich9-intel-hda,id=snd,msi=on \
  -device hda-output,id=snd-codec0,bus=snd.0,cad=0,audiodev=snd0 \
  -audiodev coreaudio,id=snd0,out.buffer-count=1 \
  \
  -cdrom $IMG_CD

  Removing the last -cdrom line Ubuntu desktop boot up and everything
  work perfectly except the audio.

  I test with wav audio driver by replacing the -audiodev line as
  follow, which save the client audio to a wave file:

  -audiodev wav,id=snd0,path=$HOME/qemu.wav

  I start the vm, open firefox, play a few video, then shutdown the vm.
  Then I can play the qemu.wav file and all the audio was recorded
  there.

  However, I can't get audio directly with coreaudio.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1859916/+subscriptions

Re: [PATCH v6 0/8] blockdev: Fix AioContext handling for various blockdev actions

2020-01-16 Thread Paolo Bonzini

On 08/01/20 15:31, Sergio Lopez wrote:
> This patch series includes fixes for various issues related to
> AioContext mismanagement for various blockdev-initiated actions.
> 
> It also adds tests for those actions when executed on drives running
> on an IOThread AioContext.

Testing the latest addition to Patchew:

Supersedes: <20191128104129.250206-1-...@redhat.com>

:)

Paolo

> ---
> Changelog:
> 
> v6:
>  - Rename the patch series.
>  - Add "block/backup-top: Don't acquire context while dropping top"
>  - Add "blockdev: Acquire AioContext on dirty bitmap functions"
>  - Add "blockdev: Return bs to the proper context on snapshot abort"
>  - Add "iotests: Test handling of AioContexts with some blockdev
>actions" (thanks Kevin Wolf)
>  - Fix context release on error. (thanks Kevin Wolf)
> 
> v5:
>  - Include fix for iotest 141 in patch 2. (thanks Max Reitz)
>  - Also fix iotest 185 and 219 in patch 2. (thanks Max Reitz)
>  - Move error block after context acquisition/release, to we can use
>goto to bail out and release resources. (thanks Max Reitz)
>  - Properly release old_context in qmp_blockdev_mirror. (thanks Max
>Reitz)
> 
> v4:
>  - Unify patches 1-4 and 5-7 to avoid producing broken interim
>states. (thanks Max Reitz)
>  - Include a fix for iotest 141. (thanks Kevin Wolf)
> 
> v3:
>  - Rework the whole patch series to fix the issue by consolidating all
>operations in the transaction model. (thanks Kevin Wolf)
> 
> v2:
>  - Honor bdrv_try_set_aio_context() context acquisition requirements
>(thanks Max Reitz).
>  - Release the context at drive_backup_prepare() instead of avoiding
>re-acquiring it at do_drive_baclup(). (thanks Max Reitz)
>  - Convert a single patch into a two-patch series.
> ---
> 
> Sergio Lopez (8):
>   blockdev: fix coding style issues in drive_backup_prepare
>   blockdev: unify qmp_drive_backup and drive-backup transaction paths
>   blockdev: unify qmp_blockdev_backup and blockdev-backup transaction
> paths
>   blockdev: honor bdrv_try_set_aio_context() context requirements
>   block/backup-top: Don't acquire context while dropping top
>   blockdev: Acquire AioContext on dirty bitmap functions
>   blockdev: Return bs to the proper context on snapshot abort
>   iotests: Test handling of AioContexts with some blockdev actions
> 
>  block/backup-top.c |   5 -
>  block/backup.c |   3 +
>  blockdev.c | 391 -
>  tests/qemu-iotests/141.out |   2 +
>  tests/qemu-iotests/185.out |   2 +
>  tests/qemu-iotests/219 |   7 +-
>  tests/qemu-iotests/219.out |   8 +
>  tests/qemu-iotests/281 | 247 +++
>  tests/qemu-iotests/281.out |   5 +
>  tests/qemu-iotests/group   |   1 +
>  10 files changed, 484 insertions(+), 187 deletions(-)
>  create mode 100755 tests/qemu-iotests/281
>  create mode 100644 tests/qemu-iotests/281.out
>

Re: [PATCH 093/104] virtiofsd: introduce inode refcount to prevent use-after-free

2020-01-16 Thread Stefan Hajnoczi

On Thu, Jan 16, 2020 at 09:25:42PM +0900, Misono Tomohiro wrote:
> > From: Stefan Hajnoczi 
> > 
> > If thread A is using an inode it must not be deleted by thread B when
> > processing a FUSE_FORGET request.
> > 
> > The FUSE protocol itself already has a counter called nlookup that is
> > used in FUSE_FORGET messages.  We cannot trust this counter since the
> > untrusted client can manipulate it via FUSE_FORGET messages.
> > 
> > Introduce a new refcount to keep inodes alive for the required lifespan.
> > lo_inode_put() must be called to release a reference.  FUSE's nlookup
> > counter holds exactly one reference so that the inode stays alive as
> > long as the client still wants to remember it.
> > 
> > Note that the lo_inode->is_symlink field is moved to avoid creating a
> > hole in the struct due to struct field alignment.
> > 
> > Signed-off-by: Stefan Hajnoczi 
> > ---
> >  tools/virtiofsd/passthrough_ll.c | 168 ++-
> >  1 file changed, 145 insertions(+), 23 deletions(-)
> > 
> > diff --git a/tools/virtiofsd/passthrough_ll.c 
> > b/tools/virtiofsd/passthrough_ll.c
> > index b19c9ee328..8f4ab8351c 100644
> > --- a/tools/virtiofsd/passthrough_ll.c
> > +++ b/tools/virtiofsd/passthrough_ll.c
> > @@ -99,7 +99,13 @@ struct lo_key {
> >  
> >  struct lo_inode {
> >  int fd;
> > -bool is_symlink;
> > +
> > +/*
> > + * Atomic reference count for this object.  The nlookup field holds a
> > + * reference and release it when nlookup reaches 0.
> > + */
> > +gint refcount;
> > +
> >  struct lo_key key;
> >  
> >  /*
> > @@ -118,6 +124,8 @@ struct lo_inode {
> >  fuse_ino_t fuse_ino;
> >  pthread_mutex_t plock_mutex;
> >  GHashTable *posix_locks; /* protected by lo_inode->plock_mutex */
> > +
> > +bool is_symlink;
> >  };
> >  
> >  struct lo_cred {
> > @@ -473,6 +481,23 @@ static ssize_t lo_add_inode_mapping(fuse_req_t req, 
> > struct lo_inode *inode)
> >  return elem - lo_data(req)->ino_map.elems;
> >  }
> >  
> > +static void lo_inode_put(struct lo_data *lo, struct lo_inode **inodep)
> > +{
> > +struct lo_inode *inode = *inodep;
> > +
> > +if (!inode) {
> > +return;
> > +}
> > +
> > +*inodep = NULL;
> > +
> > +if (g_atomic_int_dec_and_test(&inode->refcount)) {
> > +close(inode->fd);
> > +free(inode);
> > +}
> > +}
> > +
> > +/* Caller must release refcount using lo_inode_put() */
> >  static struct lo_inode *lo_inode(fuse_req_t req, fuse_ino_t ino)
> >  {
> >  struct lo_data *lo = lo_data(req);
> > @@ -480,6 +505,9 @@ static struct lo_inode *lo_inode(fuse_req_t req, 
> > fuse_ino_t ino)
> >  
> >  pthread_mutex_lock(&lo->mutex);
> >  elem = lo_map_get(&lo->ino_map, ino);
> > +if (elem) {
> > +g_atomic_int_inc(&elem->inode->refcount);
> > +}
> >  pthread_mutex_unlock(&lo->mutex);
> >  
> >  if (!elem) {
> > @@ -489,10 +517,23 @@ static struct lo_inode *lo_inode(fuse_req_t req, 
> > fuse_ino_t ino)
> >  return elem->inode;
> >  }
> >  
> > +/*
> > + * TODO Remove this helper and force callers to hold an inode refcount 
> > until
> > + * they are done with the fd.  This will be done in a later patch to make
> > + * review easier.
> > + */
> >  static int lo_fd(fuse_req_t req, fuse_ino_t ino)
> >  {
> >  struct lo_inode *inode = lo_inode(req, ino);
> > -return inode ? inode->fd : -1;
> > +int fd;
> > +
> > +if (!inode) {
> > +return -1;
> > +}
> > +
> > +fd = inode->fd;
> > +lo_inode_put(lo_data(req), &inode);
> > +return fd;
> >  }
> >  
> >  static void lo_init(void *userdata, struct fuse_conn_info *conn)
> > @@ -547,6 +588,10 @@ static void lo_getattr(fuse_req_t req, fuse_ino_t ino,
> >  fuse_reply_attr(req, &buf, lo->timeout);
> >  }
> >  
> > +/*
> > + * Increments parent->nlookup and caller must release refcount using
> > + * lo_inode_put(&parent).
> > + */
> >  static int lo_parent_and_name(struct lo_data *lo, struct lo_inode *inode,
> >char path[PATH_MAX], struct lo_inode 
> > **parent)
> >  {
> > @@ -584,6 +629,7 @@ retry:
> >  p = &lo->root;
> >  pthread_mutex_lock(&lo->mutex);
> >  p->nlookup++;
> > +g_atomic_int_inc(&p->refcount);
> >  pthread_mutex_unlock(&lo->mutex);
> >  } else {
> >  *last = '\0';
> 
> We need lo_ionde_put() in error path, right?:
> https://gitlab.com/virtio-fs/qemu/blob/virtio-fs-as-posted-2019-12-12/tools/virtiofsd/passthrough_ll.c#L680

Yes, thanks for spotting this bug!  The lo_parent_and_name() code should
look like this:

  fail_unref:
  unref_inode_lolocked(lo, p, 1);
  lo_inode_put(lo, &p);
  ...

> nit: if yes, unref_inode_lolocked() is always paired with lo_inode_put().
> So how about combine them in one function? As p->nloockup and p->refcount
> are both incremented in one place (lo_find/lo_parent_and_name) in these case,
> it seems natural for me to decrement them in

[PATCH v3 18/86] arm:kzm: drop RAM size fixup

If the user provided too large a RAM size, the code used to
complain and trim it to the max size.  Now tht RAM is allocated by
generic code, that's no longer possible, so generate an error and
exit instead.

Signed-off-by: Igor Mammedov 
---
v3:
 * rephrase commit message in nicer way
   ("Chubb, Peter (Data61, Kensington NSW)" )
 * reword error message and use size_to_str() to pretty print suggested size
   ("Chubb, Peter (Data61, Kensington NSW)" )

CC: peter.ch...@nicta.com.au
CC: peter.mayd...@linaro.org
CC: qemu-...@nongnu.org
---
 hw/arm/kzm.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/hw/arm/kzm.c b/hw/arm/kzm.c
index 1d5ef28..94cbac1 100644
--- a/hw/arm/kzm.c
+++ b/hw/arm/kzm.c
@@ -25,6 +25,7 @@
 #include "hw/char/serial.h"
 #include "sysemu/qtest.h"
 #include "sysemu/sysemu.h"
+#include "qemu/cutils.h"
 
 /* Memory map for Kzm Emulation Baseboard:
  * 0x-0x7fff See i.MX31 SOC for support
@@ -78,10 +79,10 @@ static void kzm_init(MachineState *machine)
 
 /* Check the amount of memory is compatible with the SOC */
 if (machine->ram_size > (FSL_IMX31_SDRAM0_SIZE + FSL_IMX31_SDRAM1_SIZE)) {
-warn_report("RAM size " RAM_ADDR_FMT " above max supported, "
-"reduced to %x", machine->ram_size,
-FSL_IMX31_SDRAM0_SIZE + FSL_IMX31_SDRAM1_SIZE);
-machine->ram_size = FSL_IMX31_SDRAM0_SIZE + FSL_IMX31_SDRAM1_SIZE;
+char *sz = size_to_str(FSL_IMX31_SDRAM0_SIZE + FSL_IMX31_SDRAM1_SIZE);
+error_report("RAM size more than %s is not supported", sz);
+g_free(sz);
+exit(EXIT_FAILURE);
 }
 
 memory_region_allocate_system_memory(&s->ram, NULL, "kzm.ram",
-- 
2.7.4

Re: [PATCH] target/arm/arm-semi: fix SYS_OPEN to return nonzero filehandle

On Thu, 9 Jan 2020 at 04:13, Masahiro Yamada  wrote:
>
> According to the specification "Semihosting for AArch32 and Aarch64",
> the SYS_OPEN operation should return:
>
>  - A nonzero handle if the call is successful
>  - -1 if the call is not successful
>
> So, it should never return 0.
>
> Prior to commit 35e9a0a8ce4b ("target/arm/arm-semi: Make semihosting
> code hand out its own file descriptors"), the guest fd matched to the
> host fd. It returned a nonzero handle on success since the fd 0 is
> already used for stdin.

I think this bug existed even prior to that commit, because
in the old implementation we would handle the ":tt" magic
file by returning STDIN_FILENO or STDOUT_FILENO, and
STDIN_FILENO is zero.

So although I agree we should fix this bug, it would probably
be wise if your code using the API treated 0 as a success,
because QEMU's probably not the only implementation that
decided to use "just pass through the host fd"...

> Basically, there are two ways to fix this:
>
>   - Use (guestfd - 1) as the index of guestfs_arrary. We need to insert
> increment/decrement to convert the guestfd and the array index back
> and forth.
>
>   - Keep using guestfd as the index of guestfs_array. The first entry
> of guestfs_array is left unused.
>
> I thought the latter is simpler. We end up with wasting a small piece
> of memory for the unused first entry of guestfd_array, but this is
> probably not a big deal.

Yeah, I guess so.

Applied to target-arm.next.

(This also reminds me that I never got round to fixing a bug where
if the guest does a SYS_OPEN on :tt and then a SYS_CLOSE then
we close the host stdin/stdout, which we should not...)

thanks
-- PMM

Re: [PATCH v2 08/86] arm:aspeed: actually check RAM size

On Thu, 16 Jan 2020 09:41:03 +0100
Cédric Le Goater  wrote:

> On 1/15/20 4:06 PM, Igor Mammedov wrote:
> > It's supposed that SOC will check if "-m" provided
> > RAM size is valid by setting "ram-size" property and
> > then board would read back valid (possibly corrected
> > value) to map RAM MemoryReging with valid size.
> > Well it isn't doing so, since check is called only
> > indirectly from
> >   aspeed_sdmc_reset()->asc->compute_conf()
> > or much later when guest writes to configuration
> > register.
> > 
> > So depending on "-m" value QEMU end-ups with a warning
> > and an invalid MemoryRegion size allocated and mapped.
> > (examples:
> >  -M ast2500-evb -m 1M
> > 8000-00017ffe (prio 0, i/o): aspeed-ram-container
> >   8000-800f (prio 0, ram): ram
> >   8010-bfff (prio 0, i/o): max_ram
> >  -M ast2500-evb -m 3G
> > 8000-00017ffe (prio 0, i/o): aspeed-ram-container
> >   8000-00013fff (prio 0, ram): ram
> >   [DETECTED OVERFLOW!] 00014000-bfff (prio 0, i/o): 
> > max_ram
> > )
> > On top of that sdmc falls back and reports to guest
> > "default" size, it thinks machine should have.>
> > I don't know how hardware is supposed to work so
> > I've kept it as is.  
> 
> The HW is hardwired and the modeling is trying to accommodate with
> the '-m' option, the machine definition and the SDRAM controller
> limits and register definitions for a given SoC. The result is not 
> that good it seems :/ 
> 
> > But as for CLI side machine should honor whatever
> > user configured or error out to make user fix CLI.
> > 
> > This patch makes ram-size check actually work and
> > changes behavior from a warning later on during
> > machine reset to error_fatal at the moment SOC is
> > realized so user will have to fix RAM size on CLI
> > to start machine.  
> 
> commit 8e00d1a97d1d ("aspeed/sdmc: Introduce an object class per SoC") 
> moved some calls from the realize handler to reset handler and it
> broke the checks on the RAM size.
> 
> I think we should introduce get/set handlers on the "ram-size" property
> that would look for a matching size in an AspeedSDMCClass array of valid
> RAM sizes. The default size of the machine would be a valid default and
> bogus user defined sizes would be fatal to QEMU.  

That's what I'm after, i.e. board either accepts user asked size or exits
with error. So as far as there aren't any fix-ups it should work for
the purpose of this series

> 
> We could get rid of the code :
> 
> /* use a common default */
> warn_report("Invalid RAM size 0x%" PRIx64 ". Using default 512M",
> s->ram_size);
> s->ram_size = 512 << 20;
> return ASPEED_SDMC_AST2500_512MB;
> 
> 
> 'max_ram_size' would be the last entry of the AspeedSDMCClass array
> and, anyhow, we need to check bmc->max_ram is really needed. I am not 
> sure anymore. 

I'll rework aspeed parts taking in account feedback.

> 
> Thanks,
> 
> C. 
> 
> > It also gets out of the way mutable ram-size logic,
> > so we could consolidate RAM allocation logic around
> > pre-allocated hostmem backend (supplied by user or
> > auto created by generic machine code depending on
> > supplied -m/mem-path/mem-prealloc options.
> > 
> > Signed-off-by: Igor Mammedov 
> > ---
> > CC: c...@kaod.org
> > CC: peter.mayd...@linaro.org
> > CC: and...@aj.id.au
> > CC: j...@jms.id.au
> > CC: qemu-...@nongnu.org
> > ---
> >  hw/arm/aspeed.c   | 9 +
> >  hw/misc/aspeed_sdmc.c | 5 +
> >  2 files changed, 6 insertions(+), 8 deletions(-)
> > 
> > diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
> > index cc06af4..525c547 100644
> > --- a/hw/arm/aspeed.c
> > +++ b/hw/arm/aspeed.c
> > @@ -213,14 +213,7 @@ static void aspeed_machine_init(MachineState *machine)
> >  "hw-prot-key", &error_abort);
> >  }
> >  object_property_set_bool(OBJECT(&bmc->soc), true, "realized",
> > - &error_abort);
> > -
> > -/*
> > - * Allocate RAM after the memory controller has checked the size
> > - * was valid. If not, a default value is used.
> > - */
> > -ram_size = object_property_get_uint(OBJECT(&bmc->soc), "ram-size",
> > -&error_abort);
> > + &error_fatal);
> >  
> >  memory_region_allocate_system_memory(&bmc->ram, NULL, "ram", ram_size);
> >  memory_region_add_subregion(&bmc->ram_container, 0, &bmc->ram);
> > diff --git a/hw/misc/aspeed_sdmc.c b/hw/misc/aspeed_sdmc.c
> > index 3fc80f0..b398e36 100644
> > --- a/hw/misc/aspeed_sdmc.c
> > +++ b/hw/misc/aspeed_sdmc.c
> > @@ -165,6 +165,11 @@ static void aspeed_sdmc_realize(DeviceState *dev, 
> > Error **errp)
> >  AspeedSDMCState *s = ASPEED_SDMC(dev);
> >  AspeedSDMCClass *asc = ASPEED_SDMC_GET_CLASS(s);
> >  
> > +if (!g_hash_table_contains(asc->ram2feat,
> > +

[PATCH] docs: qemu-cpu-models: Document '-noTSX' variants and 'mds-no'

2020-01-16 Thread Kashyap Chamarthy

- Add the -noTSX variants for CascadeLake and SkyLake.

- Add a note aboute the 'mds-no' MSR.  Two confusing things about this:

  (1) The 'mds-no' will _not_ show up in the guest's /proc/cpuinfo.
  Rather it is used to fill in the guest's sysfs:

sys/devices/system/cpu/vulnerabilities/mds:Not affected

  Paolo confirmed on IRC as such.

  (2) There are _three_ variants[+] of CascadeLake CPUs, with different
  stepping levels: 5, 6, and 7.  To quite wikichip.org[*]:

"note that while steppings 6 & 7 are fully mitigated, earlier
stepping 5 is not protected against MSBDS, MLPDS, nor MDSUM"

  The above is also indicated in the Intel's manual itself[+], as
  indicated by "No" under the three columns of MFBDS, MSBDS, and
  MLPDS.

  [+] 
https://software.intel.com/security-software-guidance/insights/processors-affected-microarchitectural-data-sampling
  [*] 
https://en.wikichip.org/wiki/intel/microarchitectures/cascade_lake#Key_changes_from_Skylake

Signed-off-by: Kashyap Chamarthy 
---
TODO:
 - I think I also need to add a note about 'tsx-ctrl' bit.  Here too,
   same question as above -- does it show up in /proc/cpuinfo/?
---
 docs/qemu-cpu-models.texi | 37 -
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/docs/qemu-cpu-models.texi b/docs/qemu-cpu-models.texi
index 
f88a1def0d042cc25213259172a648f0a9c514dc..e6c4058e764a81988d6bc97457c668cb0ad2ea37
 100644
--- a/docs/qemu-cpu-models.texi
+++ b/docs/qemu-cpu-models.texi
@@ -72,14 +72,30 @@ between machines, if live migration compatibility is 
required, use the newest
 CPU model that is compatible across all desired hosts.
 
 @table @option
+
+@item @code{Cascadelake-Server-noTSX}
+
+Intell Xeon Processor (Cascade Lake, 2019-2020), with "stepping" levels
+6 or 7 only.  (The Cascade Lake Xeon processor with @b{stepping 5 is
+vulnerable to MDS variants}; refer below.)
+
+@code{/proc/cpuinfo}.
+
+The @code{mds-no} bit does not show up under @code{/proc/cpuinfo}.
+Rather it shows up under the @code{sysfs}, as
+@code{/sys/devices/system/cpu/vulnerabilities/mds:Not affected}
+
+
 @item @code{Skylake-Server}
 @item @code{Skylake-Server-IBRS}
+@item @code{Skylake-Server-noTSX-IBRS}
 
 Intel Xeon Processor (Skylake, 2016)
 
 
 @item @code{Skylake-Client}
 @item @code{Skylake-Client-IBRS}
+@item @code{Skylake-Client-noTSX-IBRS}
 
 Intel Core Processor (Skylake, 2015)
 
@@ -214,9 +230,28 @@ Must be explicitly turned on for all Intel CPU models.
 
 Requires the host CPU microcode to support this feature before it
 can be used for guest CPUs.
+
+@item @code{mds-no}
+
+This is an MSR (Model-Specific Register) used by QEMU to indicate that
+the host is @i{not} vulnerable to any of the MDS variants ([MFBDS]
+CVE-2018-12130, [MLPDS] CVE-2018-12127, [MSBDS] CVE-2018-12126).
+
+Note that there are @i{three} versions of Intel's Cascade Lake
+processor, as distinguished by their "stepping" levels 5, 6, and 7.  The
+CPU with stepping "5" is @b{vulnerable to MDS variants}; and the CPUs
+with steppings "6" and "7" are @b{not vulnerable} to the above mentioned
+MDS variants.  The processor "stepping" is reported in
+@code{/proc/cpuinfo}.
+
+Confusingly, the @code{mds-no} bit does not show up under
+@code{/proc/cpuinfo} inside the guest.  Rather the kernel uses it to
+fill in the @code{sysfs}, as
+@code{/sys/devices/system/cpu/vulnerabilities/mds: Not affected},
+assuming the processor is running with stepping 6 or 7.
+
 @end table
 
-
 @node preferred_cpu_models_amd_x86
 @subsubsection Preferred CPU models for AMD x86 hosts
 
-- 
2.21.0

[PATCH] MAINTAINERS: update Leif Lindholm's address

2020-01-16 Thread Leif Lindholm

Update address to reflect new employer.

Signed-off-by: Leif Lindholm 
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 483edfbc0b..3c8653f26f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -735,7 +735,7 @@ F: include/hw/ssi/imx_spi.h
 SBSA-REF
 M: Radoslaw Biernacki 
 M: Peter Maydell 
-R: Leif Lindholm 
+R: Leif Lindholm 
 L: qemu-...@nongnu.org
 S: Maintained
 F: hw/arm/sbsa-ref.c
-- 
2.20.1

Re: [PATCH 093/104] virtiofsd: introduce inode refcount to prevent use-after-free

2020-01-16 Thread Dr. David Alan Gilbert

* Stefan Hajnoczi (stefa...@redhat.com) wrote:
> On Thu, Jan 16, 2020 at 09:25:42PM +0900, Misono Tomohiro wrote:
> > > From: Stefan Hajnoczi 
> > > 
> > > If thread A is using an inode it must not be deleted by thread B when
> > > processing a FUSE_FORGET request.
> > > 
> > > The FUSE protocol itself already has a counter called nlookup that is
> > > used in FUSE_FORGET messages.  We cannot trust this counter since the
> > > untrusted client can manipulate it via FUSE_FORGET messages.
> > > 
> > > Introduce a new refcount to keep inodes alive for the required lifespan.
> > > lo_inode_put() must be called to release a reference.  FUSE's nlookup
> > > counter holds exactly one reference so that the inode stays alive as
> > > long as the client still wants to remember it.
> > > 
> > > Note that the lo_inode->is_symlink field is moved to avoid creating a
> > > hole in the struct due to struct field alignment.
> > > 
> > > Signed-off-by: Stefan Hajnoczi 
> > > ---
> > >  tools/virtiofsd/passthrough_ll.c | 168 ++-
> > >  1 file changed, 145 insertions(+), 23 deletions(-)
> > > 
> > > diff --git a/tools/virtiofsd/passthrough_ll.c 
> > > b/tools/virtiofsd/passthrough_ll.c
> > > index b19c9ee328..8f4ab8351c 100644
> > > --- a/tools/virtiofsd/passthrough_ll.c
> > > +++ b/tools/virtiofsd/passthrough_ll.c
> > > @@ -99,7 +99,13 @@ struct lo_key {
> > >  
> > >  struct lo_inode {
> > >  int fd;
> > > -bool is_symlink;
> > > +
> > > +/*
> > > + * Atomic reference count for this object.  The nlookup field holds a
> > > + * reference and release it when nlookup reaches 0.
> > > + */
> > > +gint refcount;
> > > +
> > >  struct lo_key key;
> > >  
> > >  /*
> > > @@ -118,6 +124,8 @@ struct lo_inode {
> > >  fuse_ino_t fuse_ino;
> > >  pthread_mutex_t plock_mutex;
> > >  GHashTable *posix_locks; /* protected by lo_inode->plock_mutex */
> > > +
> > > +bool is_symlink;
> > >  };
> > >  
> > >  struct lo_cred {
> > > @@ -473,6 +481,23 @@ static ssize_t lo_add_inode_mapping(fuse_req_t req, 
> > > struct lo_inode *inode)
> > >  return elem - lo_data(req)->ino_map.elems;
> > >  }
> > >  
> > > +static void lo_inode_put(struct lo_data *lo, struct lo_inode **inodep)
> > > +{
> > > +struct lo_inode *inode = *inodep;
> > > +
> > > +if (!inode) {
> > > +return;
> > > +}
> > > +
> > > +*inodep = NULL;
> > > +
> > > +if (g_atomic_int_dec_and_test(&inode->refcount)) {
> > > +close(inode->fd);
> > > +free(inode);
> > > +}
> > > +}
> > > +
> > > +/* Caller must release refcount using lo_inode_put() */
> > >  static struct lo_inode *lo_inode(fuse_req_t req, fuse_ino_t ino)
> > >  {
> > >  struct lo_data *lo = lo_data(req);
> > > @@ -480,6 +505,9 @@ static struct lo_inode *lo_inode(fuse_req_t req, 
> > > fuse_ino_t ino)
> > >  
> > >  pthread_mutex_lock(&lo->mutex);
> > >  elem = lo_map_get(&lo->ino_map, ino);
> > > +if (elem) {
> > > +g_atomic_int_inc(&elem->inode->refcount);
> > > +}
> > >  pthread_mutex_unlock(&lo->mutex);
> > >  
> > >  if (!elem) {
> > > @@ -489,10 +517,23 @@ static struct lo_inode *lo_inode(fuse_req_t req, 
> > > fuse_ino_t ino)
> > >  return elem->inode;
> > >  }
> > >  
> > > +/*
> > > + * TODO Remove this helper and force callers to hold an inode refcount 
> > > until
> > > + * they are done with the fd.  This will be done in a later patch to make
> > > + * review easier.
> > > + */
> > >  static int lo_fd(fuse_req_t req, fuse_ino_t ino)
> > >  {
> > >  struct lo_inode *inode = lo_inode(req, ino);
> > > -return inode ? inode->fd : -1;
> > > +int fd;
> > > +
> > > +if (!inode) {
> > > +return -1;
> > > +}
> > > +
> > > +fd = inode->fd;
> > > +lo_inode_put(lo_data(req), &inode);
> > > +return fd;
> > >  }
> > >  
> > >  static void lo_init(void *userdata, struct fuse_conn_info *conn)
> > > @@ -547,6 +588,10 @@ static void lo_getattr(fuse_req_t req, fuse_ino_t 
> > > ino,
> > >  fuse_reply_attr(req, &buf, lo->timeout);
> > >  }
> > >  
> > > +/*
> > > + * Increments parent->nlookup and caller must release refcount using
> > > + * lo_inode_put(&parent).
> > > + */
> > >  static int lo_parent_and_name(struct lo_data *lo, struct lo_inode *inode,
> > >char path[PATH_MAX], struct lo_inode 
> > > **parent)
> > >  {
> > > @@ -584,6 +629,7 @@ retry:
> > >  p = &lo->root;
> > >  pthread_mutex_lock(&lo->mutex);
> > >  p->nlookup++;
> > > +g_atomic_int_inc(&p->refcount);
> > >  pthread_mutex_unlock(&lo->mutex);
> > >  } else {
> > >  *last = '\0';
> > 
> > We need lo_ionde_put() in error path, right?:
> > https://gitlab.com/virtio-fs/qemu/blob/virtio-fs-as-posted-2019-12-12/tools/virtiofsd/passthrough_ll.c#L680
> 
> Yes, thanks for spotting this bug!  The lo_parent_and_name() code should
> look like this:
> 
>   fail_unref:
>   unr

Re: [PATCH] MAINTAINERS: update Leif Lindholm's address

2020-01-16 Thread Leif Lindholm

On Thu, Jan 16, 2020 at 17:42:26 +, Leif Lindholm wrote:
> Update address to reflect new employer.
> 
> Signed-off-by: Leif Lindholm 

Reviewed-by: Leif Lindholm 

> ---
>  MAINTAINERS | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 483edfbc0b..3c8653f26f 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -735,7 +735,7 @@ F: include/hw/ssi/imx_spi.h
>  SBSA-REF
>  M: Radoslaw Biernacki 
>  M: Peter Maydell 
> -R: Leif Lindholm 
> +R: Leif Lindholm 
>  L: qemu-...@nongnu.org
>  S: Maintained
>  F: hw/arm/sbsa-ref.c
> -- 
> 2.20.1
>

Re: Extraneous nesting in QAPI schema

2020-01-16 Thread Christophe de Dinechin

> On 16 Dec 2019, at 17:59, Markus Armbruster  wrote:
> 
> Kevin suggested to investigate a more generic flattening solutions.
> 
> Of course, flattening is only possible as long as there are no name
> clashes.

Or you could reverse the name sequence and allow
the “upper” layers to disambiguate.

backend.data.addr.type -> type.addr.data.backend
other.data.addr.data.type -> type.data.addr.data.other

then “type” would match both, could be disambiguated with
type.other or type.backend, or even type.o or type.b.

That way, you could have some generic way to disambiguate
for the rare cases you need it.

Re: [PATCH v2] i.MX: add an emulation for RNGC

On Wed, 8 Jan 2020 at 08:46, Martin Kaiser  wrote:
>
> Add an emulation for the RNGC random number generator and the compatible
> RNGB variant. These peripherals are included (at least) in imx25 and
> imx35 chipsets.
>
> The emulation supports the initial self test, reseeding the prng and
> reading random numbers.
>
> Signed-off-by: Martin Kaiser 
> ---
> changes in v2
>  add a VMStateDescription
>  rename __imx_rngc_reset to imx_rngc_do_reset
>  fix a typo



Applied to target-arm.next, thanks.

-- PMM

Re: [PATCH v2] target/arm: adjust program counter for wfi exception in AArch32

On Mon, 13 Jan 2020 at 15:59, Jeff Kubascik
 wrote:
>
> The wfi instruction can be configured to be trapped by a higher exception
> level, such as the EL2 hypervisor. When the instruction is trapped, the
> program counter should contain the address of the wfi instruction that
> caused the exception. The program counter is adjusted for this in the wfi op
> helper function.
>
> However, this correction is done to env->pc, which only applies to AArch64
> mode. For AArch32, the program counter is stored in env->regs[15]. This
> adds an if-else statement to modify the correct program counter location
> based on the the current CPU mode.
>
> Signed-off-by: Jeff Kubascik 
> ---
> Hello,
>
> I am using the ARMv8 version of QEMU to run the Xen hypervisor with a guest
> virtual machine compiled for AArch32/Thumb code. I have noticed that when
> the AArch32 guest VM executes the wfi instruction, the hypervisor trap of
> the wfi instruction sees the program counter contain the address of the
> instruction following the wfi. This does not occur for an AARch64 guest VM;
> in this case, the program counter contains the address of the wfi
> instruction. I am confident the correct behavior in both cases is for the
> program counter to contain the address of the wfi instruction, as this works
> on actual hardware (Xilinx Zynq UltraScale+ MPSoC).
>
> I have tested the above patch and it works for Xen with both an AArch64
> guest (Linux) and an AArch32 guest (RTEMS). I'm still getting accustomed to
> the QEMU code base, so it may not be correct. Any feedback would be greatly
> appreciated.
>
> Sincerely,
> Jeff Kubascik
>
> Changes in v2:
> - Added braces {} to if-else statement, per patchew feedback
> ---



Applied to target-arm.next, thanks.

-- PMM

Re: [PATCH] arm/gicv3: update virtual irq state after IAR register read

On Mon, 13 Jan 2020 at 15:46, Jeff Kubascik
 wrote:
>
> The IAR0/IAR1 register is used to acknowledge an interrupt - a read of the
> register activates the highest priority pending interrupt and provides its
> interrupt ID. Activating an interrupt can change the CPU's virtual interrupt
> state - this change makes sure the virtual irq state is updated.
>
> Signed-off-by: Jeff Kubascik 
> ---



Applied to target-arm.next, thanks.

-- PMM

[PATCH v2] hw/i386: disable smbus migration for xenfv

2020-01-16 Thread Olaf Hering

With commit 7fccf2a06890e3bc3b30e29827ad3fb93fe88fea a new member
smbus_no_migration_support was added, and enabled in two places.
With commit 4ab2f2a8aabfea95cc53c64e13b3f67960b27fdf the vmstate_acpi
got new elements, which are conditionally filled. As a result, an
incoming migration expected smbus related data unless smbus migration
was disabled for a given MachineClass.

Since commit 7fccf2a06890e3bc3b30e29827ad3fb93fe88fea forgot to handle
xenfv, live migration to receiving hosts using qemu-4.0 and later is broken.

Therefore this patch must be applied to stable-4.x as well.

Signed-off-by: Olaf Hering 
---

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index fa12203079..e19716d0d3 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -952,6 +952,7 @@ static void xenfv_machine_options(MachineClass *m)
 m->desc = "Xen Fully-virtualized PC";
 m->max_cpus = HVM_MAX_VCPUS;
 m->default_machine_opts = "accel=xen";
+m->smbus_no_migration_support = true;
 }
 
 DEFINE_PC_MACHINE(xenfv, "xenfv", pc_xen_hvm_init,

Re: [PATCH v3 02/11] 9pfs: require msize >= 4096

2020-01-16 Thread Greg Kurz

On Thu, 16 Jan 2020 17:16:07 +0100
Christian Schoenebeck  wrote:

> On Donnerstag, 16. Januar 2020 14:15:03 CET Greg Kurz wrote:
> > On Mon, 13 Jan 2020 23:21:04 +0100
> > 
> > Christian Schoenebeck  wrote:
> > > A client establishes a session by sending a Tversion request along with
> > > a 'msize' parameter which client uses to suggest server a maximum
> > > message size ever to be used for communication (for both requests and
> > > replies) between client and server during that session. If client
> > > suggests a 'msize' smaller than 4096 then deny session by server
> > > immediately with an error response (Rlerror for "9P2000.L" clients or
> > > Rerror for "9P2000.u" clients) instead of replying with Rversion.
> > > 
> > > Introduction of a minimum msize is required to prevent issues in
> > > responses to numerous individual request types. For instance with a
> > > msize of < P9_IOHDRSZ no useful operations would be possible at all.
> > 
> > P9_IOHDRSZ is really specific to write/read operations.
> > 
> > /*
> >  * ample room for Twrite/Rread header
> >  * size[4] Tread/Twrite tag[2] fid[4] offset[8] count[4]
> >  */
> > #define P9_IOHDRSZ 24
> > 
> > As you see P9_IOHDRSZ is approximately the size of a Twrite header.
> > Its primary use is to inform the client about the 'count' to use for
> > Twrite/Tread messages (see get_iounit()).
> > 
> > Not sure it helps to mention P9_IOHDRSZ since we're going to choose
> > something much greater. I'd personally drop this sentence.
> 
> The point here was that there must be some kind of absolute minimum msize, 

Then the absolute minimum size is 7, the size of the header part that is
common to all messages:

size[4] Message tag[2]

> even though the protocol specs officially don't mandate one. And if a client 
> choses a msize < P9_IOHDRSZ, what useful actions shall it be able to do? 

I haven't checked but I guess some messages can fit in 24 bytes.

> That's why I mentioned P9_IOHDRSZ just in case somebody might come up later 
> asking to lower that minimum msize value for whatever reason.
> 

That's precisely what I want to avoid. The semantic of P9_IOHDRSZ is
that it is the header size of a Twrite message and as such it is used
to compute the 'iounit' which the guest later uses to compute a
suitable 'count' for Tread/Twrite messages only. It shouldn't be
involved in msize considerations IMHO.

> But np, IMO this sentence makes the fundamental requirement for this patch 
> more clear, but I have no problem dropping this sentence.
> 

Then you can mention 7 has the true minimal size.

> > > Furthermore there are some responses which are not allowed by the 9p
> > > protocol to be truncated like e.g. Rreadlink which may yield up to
> > 
> > No message may be truncated in any way actually. The spec just allows
> > an exception with the string part of Rerror.
> 
> Rreaddir, Rread, Twrite, Rerror can be truncated. So I suggest I'll just 
> change that to s/some/most/ semantically.
> 

I mean truncate with loss of data. Rreaddir can return less than 'count'
but it won't truncate an individual entry. It rewinds to the previous
one and returns its offset for the next Treaddir. Same goes with Rread
and Twrite, we always return a 'count' that can be used by the client
to continue reading or writing.

Rerror is really the only message where we can deliberately drop
data (but we actually don't do that).

> > Maybe just mention that and say we choose 4096 to be able to send
> > big Rreadlink messages.
> 
> That's not the point for this patch. The main purpose is to avoid having to 
> maintain individual error prone minimum msize checks all over the code base.
> If we would allow very small msizes (much smaller than 4096), then we would 
> need to add numerous error handlers all over the code base, each one checking 
> for different values (depending on the specific message type).
> 

I'm not sure what you mean by 'minimum msize checks all over the code base'...
Please provide an example.

> > > a size of PATH_MAX which is usually 4096. Hence this size was chosen
> > > as min. msize for server, which is already the minimum msize of the
> > > Linux kernel's 9pfs client. By forcing a min. msize already at
> > > session start (when handling Tversion) we don't have to check for a
> > > minimum msize on a per request type basis later on during session,
> > > which would be much harder and more error prone to maintain.
> > > 
> > > This is a user visible change which should be documented as such
> > > (i.e. in public QEMU release changelog).
> > 
> > This last sentence isn't informative in the commit message. This
> > kind of indication should be added after the --- below.
> > 
> > > Signed-off-by: Christian Schoenebeck 
> > > ---
> 
> np
> 
> > 
> > LGTM
> > 
> > With an updated changelog,
> > 
> > Reviewed-by: Greg Kurz 
> > 
> > >  hw/9pfs/9p.c | 12 
> > >  hw/9pfs/9p.h | 11 +++
> > >  2 files changed, 23 insertions(+)
> > > 
> > > diff --git a/hw/9pfs/9p.c

Re: [PATCH 1/2] qdev: Introduce qdev_get_bus_device


Hi Julia,

Cc'ing Markus for the qdev/qbus analysis.

On 1/15/20 11:40 PM, Julia Suvorova wrote:

For bus devices, it is useful to be able to handle the parent device.

Signed-off-by: Julia Suvorova 
---
  hw/core/qdev.c  |  5 +
  hw/pci-bridge/pci_expander_bridge.c |  4 +++-
  hw/scsi/scsi-bus.c  |  4 +++-
  hw/usb/bus.c|  4 +++-
  hw/usb/dev-smartcard-reader.c   | 32 +
  hw/virtio/virtio-pci.c  | 16 +--
  include/hw/qdev-core.h  |  2 ++


Please consider using the scripts/git.orderfile config.


  7 files changed, 54 insertions(+), 13 deletions(-)

diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 9f1753f5cf..ad8226e240 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -114,6 +114,11 @@ void qdev_set_parent_bus(DeviceState *dev, BusState *bus)
  }
  }
  
+DeviceState *qdev_get_bus_device(const DeviceState *dev)


We have qdev_get_bus_hotplug_handler(), this follow the naming, OK.


+{
+return dev->parent_bus ? dev->parent_bus->parent : NULL;
+}
+
  /* Create a new device.  This only initializes the device state
 structure and allows properties to be set.  The device still needs
 to be realized.  See qdev-core.h.  */
diff --git a/hw/pci-bridge/pci_expander_bridge.c 
b/hw/pci-bridge/pci_expander_bridge.c
index 0592818447..63a6c07406 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -125,9 +125,11 @@ static char *pxb_host_ofw_unit_address(const SysBusDevice 
*dev)
  assert(position >= 0);
  
  pxb_dev_base = DEVICE(pxb_dev);

-main_host = PCI_HOST_BRIDGE(pxb_dev_base->parent_bus->parent);
+main_host = PCI_HOST_BRIDGE(qdev_get_bus_device(pxb_dev_base));
  main_host_sbd = SYS_BUS_DEVICE(main_host);
  
+g_assert(main_host);


I found myself stuck reviewing this patch for 25min, I'm not sure what's 
bugging me yet, so I'll take notes a-la-Markus-style.


We have the qdev API, with DeviceState.


We have the qbus API, with BusState.

A BusState is not a DeviceState but a raw Object.
It keeps a pointer to the a DeviceState parent, a HotplugHandler, and a 
list of BusChild.



BusChild are neither DeviceState nor Object, but keep a pointer the a 
DeviceState.



TYPE_HOTPLUG_HANDLER is an interface. It can be implemented by any 
object, but its API seems expects a DeviceState as argument.


Looking at examples implementing TYPE_HOTPLUG_HANDLER:

- TYPE_USB_BUS. It inherits TYPE_BUS. Handlers will be called with 
USBDevice as argument (TYPE_USB_DEVICE -> TYPE_DEVICE).


- TYPE_PCI_BRIDGE_DEV. Inherits TYPE_PCI_BRIDGE -> TYPE_PCI_DEVICE -> 
TYPE_DEVICE. Handlers expects PCIDevice (TYPE_PCI_DEVICE).


- TYPE_PC_MACHINE. It inherits TYPE_X86_MACHINE -> TYPE_MACHINE -> 
TYPE_OBJECT. Not a TYPE_BUS. Handlers for TYPE_PC_DIMM, TYPE_CPU and 
TYPE_VIRTIO_PMEM_PCI. Complex... TYPE_PC_DIMM/TYPE_CPU are TYPE_DEVICE.
For TYPE_VIRTIO_PMEM_PCI we have VirtIOPMEMPCI -> VirtIOPCIProxy -> 
PCIDevice.


- USB_CCID_DEV. Inherits TYPE_USB_DEVICE -> TYPE_DEVICE. Only one 
'unplug' handler which likely expects USBCCIDState.


- TYPE_SCSI_BUS. Inherits TYPE_BUS. Also a single 'unplug' handler 
expecting SCSIDevice.


- TYPE_VIRTIO_SCSI. Inherits TYPE_VIRTIO_SCSI_COMMON -> 
TYPE_VIRTIO_DEVICE -> TYPE_DEVICE. Handlers expect VirtIOSCSI.



No simple pattern so far.


Looking back at qbus. qbus_initfn() enforces a TYPE_HOTPLUG_HANDLER 
property on BusState (which is not a DeviceState). So IIUC TYPE_BUS also 
implements TYPE_HOTPLUG_HANDLER.


---

Back to your patch, you add asserts() calls because you expect 
SysBusDeviceClass::explicit_ofw_unit_address() to be called before the 
device is plugged on a bus.


This handler is only used by sysbus_get_fw_dev_path(), so 
BusClass::get_dev_path(), similar to the scsi/usb following cases.


BusClass::get_dev_path() is only called in qdev_get_dev_path() were we 
know that dev->parent_bus is not NULL, because checked there.


So the assert is pointless.


+
  if (main_host_sbd->num_mmio > 0) {
  return g_strdup_printf(TARGET_FMT_plx ",%x",
 main_host_sbd->mmio[0].addr, position + 1);
diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index ad0e7f6d88..3d9497882b 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -1558,10 +1558,12 @@ void scsi_device_purge_requests(SCSIDevice *sdev, 
SCSISense sense)
  static char *scsibus_get_dev_path(DeviceState *dev)
  {
  SCSIDevice *d = SCSI_DEVICE(dev);
-DeviceState *hba = dev->parent_bus->parent;
+DeviceState *hba = qdev_get_bus_device(dev);
  char *id;
  char *path;
  
+g_assert(hba);


Similarly, we checked in qdev_get_dev_path().


+
  id = qdev_get_dev_path(hba);
  if (id) {
  path = g_strdup_printf("%s/%d:%d:%d", id, d->channel, d->id, d->lun);
diff --git a/hw/usb/bus.c b/hw/usb/bus.c
index a6522f5429..26bf794315 100644
--- a/hw/usb/bus.c
+++ b

Re: [PATCH v2 10/86] arm:aspeed: use memdev for RAM


On 1/16/20 10:24 AM, Cédric Le Goater wrote:

On 1/15/20 4:06 PM, Igor Mammedov wrote:

memory_region_allocate_system_memory() API is going away, so
replace it with memdev allocated MemoryRegion. The later is
initialized by generic code, so board only needs to opt in
to memdev scheme by providing
   MachineClass::default_ram_id
and using MachineState::ram instead of manually initializing
RAM memory region.

Signed-off-by: Igor Mammedov 


Reviewed-by: Cédric Le Goater 


We need to check 'max_ram' is still needed. I remember that old firmwares
were testing the RAM size by doing write/read checks at the top of the RAM.
This was breaking QEMU sometime ago.


Do you remember any version we can use to test this? Also add to our 
acceptance tests.



C.


---
CC: c...@kaod.org
CC: peter.mayd...@linaro.org
CC: and...@aj.id.au
CC: j...@jms.id.au
CC: qemu-...@nongnu.org
---
  hw/arm/aspeed.c | 5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
index 525c547..330254b 100644
--- a/hw/arm/aspeed.c
+++ b/hw/arm/aspeed.c
@@ -35,7 +35,6 @@ static struct arm_boot_info aspeed_board_binfo = {
  struct AspeedBoardState {
  AspeedSoCState soc;
  MemoryRegion ram_container;
-MemoryRegion ram;
  MemoryRegion max_ram;
  };
  
@@ -184,6 +183,7 @@ static void aspeed_machine_init(MachineState *machine)
  
  memory_region_init(&bmc->ram_container, NULL, "aspeed-ram-container",

 UINT32_MAX);
+memory_region_add_subregion(&bmc->ram_container, 0, machine->ram);
  
  object_initialize_child(OBJECT(machine), "soc", &bmc->soc,

  (sizeof(bmc->soc)), amc->soc_name, &error_abort,
@@ -215,8 +215,6 @@ static void aspeed_machine_init(MachineState *machine)
  object_property_set_bool(OBJECT(&bmc->soc), true, "realized",
   &error_fatal);
  
-memory_region_allocate_system_memory(&bmc->ram, NULL, "ram", ram_size);

-memory_region_add_subregion(&bmc->ram_container, 0, &bmc->ram);
  memory_region_add_subregion(get_system_memory(),
  sc->memmap[ASPEED_SDRAM],
  &bmc->ram_container);
@@ -393,6 +391,7 @@ static void aspeed_machine_class_init(ObjectClass *oc, void 
*data)
  mc->no_floppy = 1;
  mc->no_cdrom = 1;
  mc->no_parallel = 1;
+mc->default_ram_id = "ram";
  }
  
  static void aspeed_machine_palmetto_class_init(ObjectClass *oc, void *data)

Re: [PATCH 3/3] linux-user/i386: Emulate x86_64 vsyscalls

On 1/16/20 6:26 AM, Alex Bennée wrote:
>> +/*
>> + * Perform the syscall.  None of the vsyscalls should need restarting,
>> + * and all faults should have been caught above.
>> + */
>> +ret = do_syscall(env, syscall, env->regs[R_EDI], env->regs[R_ESI],
>> + env->regs[R_EDX], env->regs[10], env->regs[8],
>> + env->regs[9], 0, 0);
> 
> How come the register ABI to the syscall is different to the others. I
> can see why syscall doesn't come from EAX but the others are a different
> set to normal syscalls which might be why:

Cut and paste error, I assume.

That said, the three syscalls have a maximum of 2 arguments,
so I could really just pass EDI and ESI and 0 for the rest...

> I'm seeing a EFAULT on the gettimeofday failure:

What getttimeofday failure?  Is this related to the mention of /sbin/ldconfig
in your previous message?

>#0  do_syscall (cpu_env=cpu_env@entry=0x577d2b10, num=num@entry=96, 
> arg1=0, arg2=0, arg3=4211016, arg4=8, arg5=274888677184, arg6=274886295415, 
> arg7=0, arg8=0) at /home/alex/lsrc/qemu.git/linux-user/syscall.c:12076
>
>#1  0x55609b6e in emulate_vsyscall (env=0x577d2b10) at 
> /home/alex/lsrc/qemu.git/linux-user/x86_64/../i386/cpu_loop.c:180
>#2  cpu_loop (env=0x577d2b10) at 
> /home/alex/lsrc/qemu.git/linux-user/x86_64/../i386/cpu_loop.c:246 
>  
>#3  0x5559640e in main (argc=, argv=#out>, envp=) at
>#/home/alex/lsrc/qemu.git/linux-user/main.c:865
> 
> arg1/arg2 don't seem right here.

Why?  NULL value for arg1 is legal, though semi-useless.

Ah, I see that our implementation of gettimeofday doesn't honor NULL.


r~

Re: [PATCH v3 1/5] multifd: Make sure that we don't do any IO after an error

2020-01-16 Thread Dr. David Alan Gilbert

* Juan Quintela (quint...@redhat.com) wrote:
> Signed-off-by: Juan Quintela 

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  migration/ram.c | 22 +-
>  1 file changed, 13 insertions(+), 9 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index ba6e0eea15..8f9f3bba5b 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -3442,7 +3442,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>  {
>  RAMState **temp = opaque;
>  RAMState *rs = *temp;
> -int ret;
> +int ret = 0;
>  int i;
>  int64_t t0;
>  int done = 0;
> @@ -3521,12 +3521,14 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>  ram_control_after_iterate(f, RAM_CONTROL_ROUND);
>  
>  out:
> -multifd_send_sync_main(rs);
> -qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> -qemu_fflush(f);
> -ram_counters.transferred += 8;
> +if (ret >= 0) {
> +multifd_send_sync_main(rs);
> +qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> +qemu_fflush(f);
> +ram_counters.transferred += 8;
>  
> -ret = qemu_file_get_error(f);
> +ret = qemu_file_get_error(f);
> +}
>  if (ret < 0) {
>  return ret;
>  }
> @@ -3578,9 +3580,11 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>  ram_control_after_iterate(f, RAM_CONTROL_FINISH);
>  }
>  
> -multifd_send_sync_main(rs);
> -qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> -qemu_fflush(f);
> +if (ret >= 0) {
> +multifd_send_sync_main(rs);
> +qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> +qemu_fflush(f);
> +}
>  
>  return ret;
>  }
> -- 
> 2.24.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [PATCH 3/3] linux-user/i386: Emulate x86_64 vsyscalls

On 1/16/20 8:19 AM, Richard Henderson wrote:
> On 1/16/20 6:26 AM, Alex Bennée wrote:
>>> +/*
>>> + * Perform the syscall.  None of the vsyscalls should need restarting,
>>> + * and all faults should have been caught above.
>>> + */
>>> +ret = do_syscall(env, syscall, env->regs[R_EDI], env->regs[R_ESI],
>>> + env->regs[R_EDX], env->regs[10], env->regs[8],
>>> + env->regs[9], 0, 0);
>>
>> How come the register ABI to the syscall is different to the others. I
>> can see why syscall doesn't come from EAX but the others are a different
>> set to normal syscalls which might be why:
> 
> Cut and paste error, I assume.

What register difference?


case EXCP_SYSCALL:
/* linux syscall from syscall instruction */
ret = do_syscall(env,
 env->regs[R_EAX],
 env->regs[R_EDI],
 env->regs[R_ESI],
 env->regs[R_EDX],
 env->regs[10],
 env->regs[8],
 env->regs[9],
 0, 0);

Looks the same to me...


r~

Re: [PATCH v3 18/86] arm:kzm: drop RAM size fixup


On 1/16/20 6:26 PM, Igor Mammedov wrote:

If the user provided too large a RAM size, the code used to
complain and trim it to the max size.  Now tht RAM is allocated by
generic code, that's no longer possible, so generate an error and
exit instead.

Signed-off-by: Igor Mammedov 
---
v3:
  * rephrase commit message in nicer way
("Chubb, Peter (Data61, Kensington NSW)" )
  * reword error message and use size_to_str() to pretty print suggested size
("Chubb, Peter (Data61, Kensington NSW)" )

CC: peter.ch...@nicta.com.au
CC: peter.mayd...@linaro.org
CC: qemu-...@nongnu.org
---
  hw/arm/kzm.c | 9 +
  1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/hw/arm/kzm.c b/hw/arm/kzm.c
index 1d5ef28..94cbac1 100644
--- a/hw/arm/kzm.c
+++ b/hw/arm/kzm.c
@@ -25,6 +25,7 @@
  #include "hw/char/serial.h"
  #include "sysemu/qtest.h"
  #include "sysemu/sysemu.h"
+#include "qemu/cutils.h"
  
  /* Memory map for Kzm Emulation Baseboard:

   * 0x-0x7fff See i.MX31 SOC for support
@@ -78,10 +79,10 @@ static void kzm_init(MachineState *machine)
  
  /* Check the amount of memory is compatible with the SOC */

  if (machine->ram_size > (FSL_IMX31_SDRAM0_SIZE + FSL_IMX31_SDRAM1_SIZE)) {
-warn_report("RAM size " RAM_ADDR_FMT " above max supported, "
-"reduced to %x", machine->ram_size,
-FSL_IMX31_SDRAM0_SIZE + FSL_IMX31_SDRAM1_SIZE);
-machine->ram_size = FSL_IMX31_SDRAM0_SIZE + FSL_IMX31_SDRAM1_SIZE;
+char *sz = size_to_str(FSL_IMX31_SDRAM0_SIZE + FSL_IMX31_SDRAM1_SIZE);
+error_report("RAM size more than %s is not supported", sz);


Yay! Can you use this pattern the other patches too?

You might want to add:

#define FSL_IMX31_SDRAM_SIZE_MAX \
  (FSL_IMX31_SDRAM0_SIZE + FSL_IMX31_SDRAM1_SIZE)

Reviewed-by: Philippe Mathieu-Daudé 


+g_free(sz);
+exit(EXIT_FAILURE);
  }
  
  memory_region_allocate_system_memory(&s->ram, NULL, "kzm.ram",

Re: [PATCH] MAINTAINERS: update Leif Lindholm's address


On 1/16/20 6:42 PM, Leif Lindholm wrote:

Update address to reflect new employer.

Signed-off-by: Leif Lindholm 
---
  MAINTAINERS | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 483edfbc0b..3c8653f26f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -735,7 +735,7 @@ F: include/hw/ssi/imx_spi.h
  SBSA-REF
  M: Radoslaw Biernacki 
  M: Peter Maydell 
-R: Leif Lindholm 
+R: Leif Lindholm 
  L: qemu-...@nongnu.org
  S: Maintained
  F: hw/arm/sbsa-ref.c



You might want to add yourself a .mailmap entry too:

-- >8 --
diff --git a/.mailmap b/.mailmap
index 3816e4effe..a521c17b44 100644
--- a/.mailmap
+++ b/.mailmap
@@ -44,6 +44,7 @@ Aleksandar Markovic  


 Aleksandar Rikalo  
 Anthony Liguori  Anthony Liguori 


 James Hogan  
+Leif Lindholm  
 Paul Burton  
 Paul Burton  
 Paul Burton  
---

If you ack this snippet maybe a maintainer is willing to amend it to 
your patch :)


Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé

Re: [PATCH v2] hw/i386: disable smbus migration for xenfv

2020-01-16 Thread Paolo Bonzini

On 16/01/20 19:03, Olaf Hering wrote:
> With commit 7fccf2a06890e3bc3b30e29827ad3fb93fe88fea a new member
> smbus_no_migration_support was added, and enabled in two places.
> With commit 4ab2f2a8aabfea95cc53c64e13b3f67960b27fdf the vmstate_acpi
> got new elements, which are conditionally filled. As a result, an
> incoming migration expected smbus related data unless smbus migration
> was disabled for a given MachineClass.
> 
> Since commit 7fccf2a06890e3bc3b30e29827ad3fb93fe88fea forgot to handle
> xenfv, live migration to receiving hosts using qemu-4.0 and later is broken.
> 
> Therefore this patch must be applied to stable-4.x as well.
> 
> Signed-off-by: Olaf Hering 
> ---
> 
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index fa12203079..e19716d0d3 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -952,6 +952,7 @@ static void xenfv_machine_options(MachineClass *m)
>  m->desc = "Xen Fully-virtualized PC";
>  m->max_cpus = HVM_MAX_VCPUS;
>  m->default_machine_opts = "accel=xen";
> +m->smbus_no_migration_support = true;
>  }
>  
>  DEFINE_PC_MACHINE(xenfv, "xenfv", pc_xen_hvm_init,
> 

This patch is wrong; xenfv does not support cross-version migration
compatibility.  Even if the migration stream does not change, the
hardware exposed to the guest will.

My understanding is that Xen is able to use "-M
pc-i440fx-VERSION,accel=xen".  The presence of the version in the
machine type guarantees that the migration stream is compatible and that
the hardware exposed to the guest is the same on the source and destination.

Is this incorrect?

Paolo

Re: [PATCH] spapr: Fail CAS if option vector table cannot be parsed


On 1/16/20 5:13 PM, Greg Kurz wrote:

On Thu, 16 Jan 2020 16:34:06 +0100
Philippe Mathieu-Daudé  wrote:


Hi Greg,



Hi Phil,


On 1/16/20 4:05 PM, Greg Kurz wrote:

Most of the option vector helpers have assertions to check their
arguments aren't null. The guest can provide an arbitrary address
for the CAS structure that would result in such null arguments.
Fail CAS with H_PARAMETER instead of aborting QEMU.

Signed-off-by: Greg Kurz 
---
   hw/ppc/spapr_hcall.c |9 +
   1 file changed, 9 insertions(+)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 84e1612595bb..051869ae20ec 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1701,9 +1701,18 @@ static target_ulong 
h_client_architecture_support(PowerPCCPU *cpu,
   
   /* For the future use: here @ov_table points to the first option vector */

   ov_table = addr;
+if (!ov_table) {
+return H_PARAMETER;
+}


This doesn't look right to check ov_table, I'd check addr directly instead:



I decided to check ov_table because this is what we pass to
spapr_ovec_parse_vector() and that shouldn't be NULL.


OK, it makes sense.


-- >8 --
@@ -1679,12 +1679,16 @@ static target_ulong
h_client_architecture_support(PowerPCCPU *cpu,

   cas_pvr = cas_check_pvr(spapr, cpu, &addr, &raw_mode_supported,
&local_err);
   if (local_err) {
   error_report_err(local_err);
   return H_HARDWARE;
   }
+if (!addr) {
+// error_report*()
+return H_PARAMETER;
+}



I don't really care one way or another, but adding an error_report() is a
good idea since linux just print out the following in case of CAS failure:

WARNING: ibm,client-architecture-support call FAILED!


With some error_report:
Reviewed-by: Philippe Mathieu-Daudé 


   /* Update CPUs */
   if (cpu->compat_pvr != cas_pvr) {
---

Still I'm not sure it makes sense, because the guest can also set other
invalid addresses such addr=0x69.



The point of this patch is just to avoid hitting the assertions. 0x69
is probably bullshit but it passes the g_assert() at least.

   
   ov1_guest = spapr_ovec_parse_vector(ov_table, 1);

+if (!ov1_guest) {
+return H_PARAMETER;
+}


This one is OK (unlikely case where vector 1 isn't present).


   ov5_guest = spapr_ovec_parse_vector(ov_table, 5);
+if (!ov5_guest) {
+return H_PARAMETER;
+}


This one is OK too (unlikely case where vector 5 isn't present).


   if (spapr_ovec_test(ov5_guest, OV5_MMU_BOTH)) {
   error_report("guest requested hash and radix MMU, which is 
invalid.");
   exit(EXIT_FAILURE);

Re: [PATCH] spapr: Migrate CAS reboot flag

2020-01-16 Thread Greg Kurz

On Thu, 16 Jan 2020 13:14:35 +0100
Greg Kurz  wrote:

> On Thu, 16 Jan 2020 11:37:24 +0100
> Laurent Vivier  wrote:
> 
> > On 16/01/2020 09:48, Greg Kurz wrote:
> > > On Wed, 15 Jan 2020 19:10:37 +0100
> > > Laurent Vivier  wrote:
> > > 
> > >> Hi,
> > >>
> > >> On 15/01/2020 18:48, Greg Kurz wrote:
> > >>> Migration can potentially race with CAS reboot. If the migration thread
> > >>> completes migration after CAS has set spapr->cas_reboot but before the
> > >>> mainloop could pick up the reset request and reset the machine, the
> > >>> guest is migrated unrebooted and the destination doesn't reboot it
> > >>> either because it isn't aware a CAS reboot was needed (eg, because a
> > >>> device was added before CAS). This likely result in a broken or hung
> > >>> guest.
> > >>>
> > >>> Even if it is small, the window between CAS and CAS reboot is enough to
> > >>> re-qualify spapr->cas_reboot as state that we should migrate. Add a new
> > >>> subsection for that and always send it when a CAS reboot is pending.
> > >>> This may cause migration to older QEMUs to fail but it is still better
> > >>> than end up with a broken guest.
> > >>>
> > >>> The destination cannot honour the CAS reboot request from a post load
> > >>> handler because this must be done after the guest is fully restored.
> > >>> It is thus done from a VM change state handler.
> > >>>
> > >>> Reported-by: Lukáš Doktor 
> > >>> Signed-off-by: Greg Kurz 
> > >>> ---
> > >>>
> > >>
> > >> I'm wondering if the problem can be related with the fact that
> > >> main_loop_should_exit() could release qemu_global_mutex in
> > >> pause_all_vcpus() in the reset case?
> > >>
> > >> 1602 static bool main_loop_should_exit(void)
> > >> 1603 {
> > >> ...
> > >> 1633 request = qemu_reset_requested();
> > >> 1634 if (request) {
> > >> 1635 pause_all_vcpus();
> > >> 1636 qemu_system_reset(request);
> > >> 1637 resume_all_vcpus();
> > >> 1638 if (!runstate_check(RUN_STATE_RUNNING) &&
> > >> 1639 !runstate_check(RUN_STATE_INMIGRATE)) {
> > >> 1640 runstate_set(RUN_STATE_PRELAUNCH);
> > >> 1641 }
> > >> 1642 }
> > >> ...
> > >>
> > >> I already sent a patch for this kind of problem (in current Juan pull
> > >> request):
> > >>
> > >> "runstate: ignore finishmigrate -> prelaunch transition"
> > >>
> > > 
> > > IIUC your patch avoids an invalid 'prelaunch' -> 'postmigrate' runstate
> > > transition that can happen if the migration thread sets the runstate to
> > > 'finishmigrate' when pause_all_vcpus() releases the main loop mutex.
> > > 
> > > ie. symptom of the problem is QEMU aborting, correct ? The issue I'm
> > > trying to fix is a guest breakage caused by a discrepancy between
> > > QEMU and the guest after migration has succeeded.
> > > 
> > >> but I don't know if it could fix this one.
> > >>
> > > 
> > > I don't think so and your patch kinda illustrates it. If the runstate
> > > is 'finishmigrate' when returning from pause_all_vcpus(), this means
> > > that state was sent to the destination before we could actually reset
> > > the machine.
> > 
> > Yes, you're right.
> > 
> > But the question behind my comment was: is it expected to have a pending
> > reset while we are migrating?
> > 
> 
> Nothing prevents qemu_system_reset_request() to be called when migration
> is active. 
> 
> > Perhaps H_CAS can return H_BUSY and wait the end of the migration and
> > then be fully executed on destination?
> > 
> 
> And so we would need to teach SLOF to try H_CAS again until it stops
> returning H_BUSY ? It seems safer to migrate the CAS reboot flag IMHO.
> 

Ok I've tried that with a patched SLOF that sleeps 500ms and tries CAS
again if H_BUSY was returned. It fixes the issue but it looks a bit
ugly because of the polling with an arbitrary timeout in SLOF... I'm
not very comfortable either with calling migration_is_active() from
the CAS code in QEMU.

David,

Any suggestion ?

> > Thanks,
> > Laurent
> > 
>

Fixed unknown audio format with SDL2

2020-01-16 Thread KJ Liew


SDL2 (version >=2.0) prefers float32 audio format over integer audio format. 
QEMU sdlaudio.c does not handle any kind of AUDIO_F32 formats, but 
SDL_OpenAudio(req, obt) will return float32 audio format in obt and QEMU prints 
error about unknown format 33056 (0x8120).

The following simple patch fix the error by forcing SDL2 internal audio format 
conversion.

diff -ru ../orig/qemu-4.2.0/audio/sdlaudio.c ../qemu-4.2.0/audio/sdlaudio.c
--- ../orig/qemu-4.2.0/audio/sdlaudio.c 2019-12-12 10:20:47.0 -0800
+++ ../qemu-4.2.0/audio/sdlaudio.c  2020-01-15 15:56:25.059841600 -0800
@@ -147,10 +147,11 @@
 }
 #endif

-status = SDL_OpenAudio (req, obt);
+status = SDL_OpenAudio (req, NULL);
 if (status) {
 sdl_logerr ("SDL_OpenAudio failed\n");
 }
+memcpy(obt, req, sizeof(SDL_AudioSpec));

 #ifndef _WIN32
 err = pthread_sigmask (SIG_SETMASK, &old, NULL);

Re: [PATCH 0/3] linux-user: Implement x86_64 vsyscalls

On 1/16/20 4:30 AM, Alex Bennée wrote:
> /x86_64-linux-user/qemu-x86_64 
> ~/lsrc/linux.git/tools/testing/selftests/x86/test_vsyscall_64
> [WARN]  failed to find vDSO
> no vsyscall map in /proc/self/maps
> [RUN]   test gettimeofday()
> [RUN]   test time()
> [RUN]   getcpu() on CPU 0
> [RUN]   getcpu() on CPU 1
> [RUN]   Checking read access to the vsyscall page
> [OK]We do not have read access: #PF(0x4)
> [RUN]   Make sure that vsyscalls really page fault
> **
> ERROR:/home/alex/lsrc/qemu.git/linux-user/x86_64/../i386/cpu_loop.c:185:emulate_vsyscall:
>  assertion failed: (ret != -TARGET_EFAULT)
> qemu:handle_cpu_signal received signal outside vCPU context @ 
> pc=0x7f6eed31b613

Ok, thanks.


r~

Re: [PATCH v2] hw/i386: disable smbus migration for xenfv

2020-01-16 Thread Olaf Hering

Am Thu, 16 Jan 2020 19:26:39 +0100
schrieb Paolo Bonzini :

> xenfv does not support cross-version migration compatibility.

Wait, what does that mean?
So far live migration of a running domU must be possible from Xen N to Xen N+1.
It would be more than unexpected if the target host running "Xen N+1" must have 
the very same qemu version as "Xen N".

Olaf


pgpYl_ff4rXPj.pgp
Description: Digitale Signatur von OpenPGP

[PATCH v2] spapr: Fail CAS if option vector table cannot be parsed

2020-01-16 Thread Greg Kurz

Most of the option vector helpers have assertions to check their
arguments aren't null. The guest can provide an arbitrary address
for the CAS structure that would result in such null arguments.
Fail CAS with H_PARAMETER and print a warning instead of aborting
QEMU.

Signed-off-by: Greg Kurz 
Reviewed-by: Philippe Mathieu-Daudé 
---
v2: - print warnings
---
 hw/ppc/spapr_hcall.c |   12 
 1 file changed, 12 insertions(+)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 84e1612595bb..90d74076b09c 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1701,9 +1701,21 @@ static target_ulong 
h_client_architecture_support(PowerPCCPU *cpu,
 
 /* For the future use: here @ov_table points to the first option vector */
 ov_table = addr;
+if (!ov_table) {
+warn_report("guest passed an invalid option vector table address");
+return H_PARAMETER;
+}
 
 ov1_guest = spapr_ovec_parse_vector(ov_table, 1);
+if (!ov1_guest) {
+warn_report("guest didn't provide option vector 1");
+return H_PARAMETER;
+}
 ov5_guest = spapr_ovec_parse_vector(ov_table, 5);
+if (!ov5_guest) {
+warn_report("guest didn't provide option vector 5");
+return H_PARAMETER;
+}
 if (spapr_ovec_test(ov5_guest, OV5_MMU_BOTH)) {
 error_report("guest requested hash and radix MMU, which is invalid.");
 exit(EXIT_FAILURE);

Re: [PATCH v1 1/2] target/arm: detect 64 bit overflow caused by high cval + voff

On Fri, 10 Jan 2020 at 16:16, Alex Bennée  wrote:
>
> If we don't detect this we will be stuck in a busy loop as we schedule
> a timer for before now which will continually trigger gt_recalc_timer
> even though we haven't reached the state required to trigger the IRQ.
>
> Bug: https://bugs.launchpad.net/bugs/1859021
> Cc: 1859...@bugs.launchpad.net
> Signed-off-by: Alex Bennée 
> ---
>  target/arm/helper.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index 19a57a17da5..eb17106f7bd 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -2481,6 +2481,9 @@ static void gt_recalc_timer(ARMCPU *cpu, int timeridx)
>  } else {
>  /* Next transition is when we hit cval */
>  nexttick = gt->cval + offset;
> +if (nexttick < gt->cval) {
> +nexttick = UINT64_MAX;
> +}
>  }

There's something odd going on with this code. Adding a bit of context:

uint64_t offset = timeridx == GTIMER_VIRT ?
  cpu->env.cp15.cntvoff_el2 : 0;
uint64_t count = gt_get_countervalue(&cpu->env);
/* Note that this must be unsigned 64 bit arithmetic: */
int istatus = count - offset >= gt->cval;
[...]
if (istatus) {
/* Next transition is when count rolls back over to zero */
nexttick = UINT64_MAX;
} else {
/* Next transition is when we hit cval */
nexttick = gt->cval + offset;
}

I think this patch is correct, in that the 'nexttick' values
are all absolute and this cval/offset combination implies
that the next timer interrupt is going to be in a future
so distant we can't even fit the duration in a uint64_t.

But the other half of the 'if' also looks wrong: that's
for the case of "timer has fired, how long until the
wraparound causes the interrupt line to go low again?".
UINT64_MAX is right for the EL1 case where offset is 0,
but the offset might actually be set such that the wrap
around happens fairly soon. We want to calculate the
tick when (count - offset) hits 0, saturated to
UINT64_MAX. It's getting late here and I couldn't figure
out what that expression should be with 15 minutes of
fiddling around with pen and paper diagrams. I'll have another
go tomorrow if nobody else gets there first...

thanks
-- PMM

Re: [PATCH v2] hw/i386: disable smbus migration for xenfv

2020-01-16 Thread Paolo Bonzini

Il gio 16 gen 2020, 19:33 Olaf Hering  ha scritto:

> Am Thu, 16 Jan 2020 19:26:39 +0100
> schrieb Paolo Bonzini :
>
> > xenfv does not support cross-version migration compatibility.
>
> Wait, what does that mean?
> So far live migration of a running domU must be possible from Xen N to Xen
> N+1.
>

The  Xen N+1 must know that the source is running Xen N, and use a machine
type corresponding to the QEMU version that was shipped with Xen N. For new
virtual machines Xen must also use pc-i440fx-M.N and not just PC.

This is not new, versioned machine types were introduced in QEMU 0.10 or
something like that for exactly this purpose.

Paolo

It would be more than unexpected if the target host running "Xen N+1" must
> have the very same qemu version as "Xen N".
>
> Olaf
>

[Bug 1859989] Re: qemu-img has broken output with large snapshot names

2020-01-16 Thread pevogam

** Description changed:

- On Qemu 4.1.1 the output of snalshots breaks if the chosen state name is
+ On Qemu 4.1.1 the output of snapshots breaks if the chosen state name is
  too long:
  
  ```
  # qemu-img snapshot -l /mnt/local/some_image.qcow2
  Snapshot list:
  IDTAG VM SIZEDATE   VM CLOCK
  1 online_provider_with_dhcp747 MiB 2020-01-15 12:05:01   00:00:45.873
  ```
  
  Prior to 4.1.1 this used to work with extra tabs for the VM SIZE values.
  The collision is also disabling us from using a regex on top of this
  input to detect the snapshot.

** Description changed:

  On Qemu 4.1.1 the output of snapshots breaks if the chosen state name is
  too long:
  
  ```
  # qemu-img snapshot -l /mnt/local/some_image.qcow2
  Snapshot list:
  IDTAG VM SIZEDATE   VM CLOCK
  1 online_provider_with_dhcp747 MiB 2020-01-15 12:05:01   00:00:45.873
  ```
  
  Prior to 4.1.1 this used to work with extra tabs for the VM SIZE values.
  The collision is also disabling us from using a regex on top of this
- input to detect the snapshot.
+ output to detect the snapshot.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1859989

Title:
  qemu-img has broken output with large snapshot names

Status in QEMU:
  New

Bug description:
  On Qemu 4.1.1 the output of snapshots breaks if the chosen state name
  is too long:

  ```
  # qemu-img snapshot -l /mnt/local/some_image.qcow2
  Snapshot list:
  IDTAG VM SIZEDATE   VM CLOCK
  1 online_provider_with_dhcp747 MiB 2020-01-15 12:05:01   00:00:45.873
  ```

  Prior to 4.1.1 this used to work with extra tabs for the VM SIZE
  values. The collision is also disabling us from using a regex on top
  of this output to detect the snapshot.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1859989/+subscriptions

Re: [PATCH 0/3] linux-user: Implement x86_64 vsyscalls

On 1/16/20 4:05 AM, Alex Bennée wrote:
> It turns out the /sbin/ldconfig crash is a regression w.r.t from
> stretch->buster so unrelated to these patches. However I've been giving
> them a spin with the linux vdso selftests and stuff is breaking which I
> guess means it's incomplete?
...
> Also from selftests/vDSO:
> 
> alex@23eb55f27ff8:~/lsrc/linux.git/tools/testing/selftests/vDSO$ ./vdso_test 
> AT_SYSINFO_EHDR is not present!

This patch set is for vsyscall only, not vdso.


r~

[PATCH v2 0/5] linux-user: Implement x86_64 vsyscalls

Changes since v2:

  * Add /proc/self/maps line

I'm not sure this is really necessary.  The linux kernel
self-test checks for it, and modifies the set of tests that
it runs based on it.  But otherwise I think it's unused.

  * Fix errors in base gettimeofday syscall

This is also checked by test_vsyscall, as noticed by AJB.


r~


Original blurb:

The x86_64 abi has a legacy vsyscall page.  The kernel folk
have been trying to deprecate this since at least v3.1, but

(1) We don't implement the vdso that replaces vsyscalls,
(2) As of v5.5, the vsyscall page is still enabled by default.

This lack is affecting Peter's linux-user testing.

The dependency is not obvious because Peter is running the tests
on x86_64, so the host is providing a vsyscall page to qemu.

Because of how user-only memory operations are handled, with no
validation of guest vs host pages, so long as qemu chooses to
run with guest_base == 0, the guest may Just So Happen to read
the host's vsyscall page.

Complicating this, new OS releases may use a kernel configured
with CONFIG_LEGACY_VSYSCALL_XONLY=y, which means the the vsyscall
page cannot be read, only executed.  Which means that the guest
then cannot read the host vsyscall page during translation and
will SIGSEGV.

Exactly which of these many variables is affecting Peter's testing
with Ubuntu 18.04 of my TCG merge, I'm not exactly sure.  I suspect
that it is the change to drop the textseg_addr adjustment to user-only
static binaries.  IIRC bionic does not support -static-pie, which is
the preferred replacement.  This could mean that the host and guest
binaries overlap, which leads to guest_base != 0.

I vaguely remember someone (Paolo?) implementing something like
this many years ago, but clearly it never got merged.

In any case, this emulation has been missing for too long.


Richard Henderson (5):
  target/i386: Renumber EXCP_SYSCALL
  linux-user/i386: Split out gen_signal
  linux-user/i386: Emulate x86_64 vsyscalls
  linux-user: Add x86_64 vsyscall page to /proc/self/maps
  linux-user: Flush out implementation of gettimeofday

 target/i386/cpu.h  |   6 +-
 linux-user/i386/cpu_loop.c | 198 ++---
 linux-user/syscall.c   |  36 ++-
 target/i386/translate.c|  16 ++-
 4 files changed, 190 insertions(+), 66 deletions(-)

-- 
2.20.1

[PATCH v2 2/5] linux-user/i386: Split out gen_signal

This is a bit tidier than open-coding the 5 lines necessary
to initialize the target_siginfo_t.  In addition, this zeros
the remaining bytes of the target_siginfo_t, rather than
passing in garbage.

Reviewed-by: Paolo Bonzini 
Reviewed-by: Alex Bennée 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 linux-user/i386/cpu_loop.c | 93 ++
 1 file changed, 33 insertions(+), 60 deletions(-)

diff --git a/linux-user/i386/cpu_loop.c b/linux-user/i386/cpu_loop.c
index 024b6f4d58..e217cca5ee 100644
--- a/linux-user/i386/cpu_loop.c
+++ b/linux-user/i386/cpu_loop.c
@@ -81,13 +81,23 @@ static void set_idt(int n, unsigned int dpl)
 }
 #endif
 
+static void gen_signal(CPUX86State *env, int sig, int code, abi_ptr addr)
+{
+target_siginfo_t info = {
+.si_signo = sig,
+.si_code = code,
+._sifields._sigfault._addr = addr
+};
+
+queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+}
+
 void cpu_loop(CPUX86State *env)
 {
 CPUState *cs = env_cpu(env);
 int trapnr;
 abi_ulong pc;
 abi_ulong ret;
-target_siginfo_t info;
 
 for(;;) {
 cpu_exec_start(cs);
@@ -134,70 +144,45 @@ void cpu_loop(CPUX86State *env)
 #endif
 case EXCP0B_NOSEG:
 case EXCP0C_STACK:
-info.si_signo = TARGET_SIGBUS;
-info.si_errno = 0;
-info.si_code = TARGET_SI_KERNEL;
-info._sifields._sigfault._addr = 0;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+gen_signal(env, TARGET_SIGBUS, TARGET_SI_KERNEL, 0);
 break;
 case EXCP0D_GPF:
 /* XXX: potential problem if ABI32 */
 #ifndef TARGET_X86_64
 if (env->eflags & VM_MASK) {
 handle_vm86_fault(env);
-} else
-#endif
-{
-info.si_signo = TARGET_SIGSEGV;
-info.si_errno = 0;
-info.si_code = TARGET_SI_KERNEL;
-info._sifields._sigfault._addr = 0;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+break;
 }
+#endif
+gen_signal(env, TARGET_SIGSEGV, TARGET_SI_KERNEL, 0);
 break;
 case EXCP0E_PAGE:
-info.si_signo = TARGET_SIGSEGV;
-info.si_errno = 0;
-if (!(env->error_code & 1))
-info.si_code = TARGET_SEGV_MAPERR;
-else
-info.si_code = TARGET_SEGV_ACCERR;
-info._sifields._sigfault._addr = env->cr[2];
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+gen_signal(env, TARGET_SIGSEGV,
+   (env->error_code & 1 ?
+TARGET_SEGV_ACCERR : TARGET_SEGV_MAPERR),
+   env->cr[2]);
 break;
 case EXCP00_DIVZ:
 #ifndef TARGET_X86_64
 if (env->eflags & VM_MASK) {
 handle_vm86_trap(env, trapnr);
-} else
-#endif
-{
-/* division by zero */
-info.si_signo = TARGET_SIGFPE;
-info.si_errno = 0;
-info.si_code = TARGET_FPE_INTDIV;
-info._sifields._sigfault._addr = env->eip;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+break;
 }
+#endif
+gen_signal(env, TARGET_SIGFPE, TARGET_FPE_INTDIV, env->eip);
 break;
 case EXCP01_DB:
 case EXCP03_INT3:
 #ifndef TARGET_X86_64
 if (env->eflags & VM_MASK) {
 handle_vm86_trap(env, trapnr);
-} else
+break;
+}
 #endif
-{
-info.si_signo = TARGET_SIGTRAP;
-info.si_errno = 0;
-if (trapnr == EXCP01_DB) {
-info.si_code = TARGET_TRAP_BRKPT;
-info._sifields._sigfault._addr = env->eip;
-} else {
-info.si_code = TARGET_SI_KERNEL;
-info._sifields._sigfault._addr = 0;
-}
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+if (trapnr == EXCP01_DB) {
+gen_signal(env, TARGET_SIGTRAP, TARGET_TRAP_BRKPT, env->eip);
+} else {
+gen_signal(env, TARGET_SIGTRAP, TARGET_SI_KERNEL, 0);
 }
 break;
 case EXCP04_INTO:
@@ -205,31 +190,19 @@ void cpu_loop(CPUX86State *env)
 #ifndef TARGET_X86_64
 if (env->eflags & VM_MASK) {
 handle_vm86_trap(env, trapnr);
-} else
-#endif
-{
-info.si_signo = TARGET_SIGSEGV;
-info.si_errno = 0;
-info.si_code = TARGET_SI_KERNEL;
-info._sifields._sigfault._addr = 0;
-queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
+break;
 }
+#e

[PATCH v2 5/5] linux-user: Flush out implementation of gettimeofday

The first argument, timeval, is allowed to be NULL.

The second argument, timezone, was missing.  While its use is
deprecated, it is still present in the syscall.

Signed-off-by: Richard Henderson 
---
 linux-user/syscall.c | 27 +--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index eb867a5296..628b4de9a1 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -1219,6 +1219,23 @@ static inline abi_long 
host_to_target_timespec64(abi_ulong target_addr,
 return 0;
 }
 
+static inline abi_long copy_to_user_timezone(abi_ulong target_tz_addr,
+ struct timezone *tz)
+{
+struct target_timezone *target_tz;
+
+if (!lock_user_struct(VERIFY_WRITE, target_tz, target_tz_addr, 1)) {
+return -TARGET_EFAULT;
+}
+
+__put_user(tz->tz_minuteswest, &target_tz->tz_minuteswest);
+__put_user(tz->tz_dsttime, &target_tz->tz_dsttime);
+
+unlock_user_struct(target_tz, target_tz_addr, 1);
+
+return 0;
+}
+
 static inline abi_long copy_from_user_timezone(struct timezone *tz,
abi_ulong target_tz_addr)
 {
@@ -8567,10 +8584,16 @@ static abi_long do_syscall1(void *cpu_env, int num, 
abi_long arg1,
 case TARGET_NR_gettimeofday:
 {
 struct timeval tv;
-ret = get_errno(gettimeofday(&tv, NULL));
+struct timezone tz;
+
+ret = get_errno(gettimeofday(&tv, &tz));
 if (!is_error(ret)) {
-if (copy_to_user_timeval(arg1, &tv))
+if (arg1 && copy_to_user_timeval(arg1, &tv)) {
 return -TARGET_EFAULT;
+}
+if (arg2 && copy_to_user_timezone(arg2, &tz)) {
+return -TARGET_EFAULT;
+}
 }
 }
 return ret;
-- 
2.20.1

[PATCH v2 1/5] target/i386: Renumber EXCP_SYSCALL

We are not short of numbers for EXCP_*.  There is no need to confuse things
by having EXCP_VMEXIT and EXCP_SYSCALL overlap, even though the former is
only used for system mode and the latter is only used for user mode.

Reviewed-by: Paolo Bonzini 
Reviewed-by: Alex Bennée 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/i386/cpu.h | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 594326a794..164d038d1f 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -998,9 +998,8 @@ typedef uint64_t FeatureWordArray[FEATURE_WORDS];
 #define EXCP11_ALGN17
 #define EXCP12_MCHK18
 
-#define EXCP_SYSCALL0x100 /* only happens in user only emulation
- for syscall instruction */
-#define EXCP_VMEXIT 0x100
+#define EXCP_VMEXIT 0x100 /* only for system emulation */
+#define EXCP_SYSCALL0x101 /* only for user emulation */
 
 /* i386-specific interrupt pending bits.  */
 #define CPU_INTERRUPT_POLL  CPU_INTERRUPT_TGT_EXT_1
-- 
2.20.1

[PATCH v2 4/5] linux-user: Add x86_64 vsyscall page to /proc/self/maps

The page isn't (necessarily) present in the host /proc/self/maps,
and even if it might be it isn't present in page_flags, and even
if it was it might not have the same set of page permissions.

The easiest thing to do, particularly when it comes to the
"[vsyscall]" note at the end of line, is to special case it.

Signed-off-by: Richard Henderson 
---
 linux-user/syscall.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 171c0caef3..eb867a5296 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -7005,6 +7005,15 @@ static int open_self_maps(void *cpu_env, int fd)
 }
 }
 
+#ifdef TARGET_X86_64
+/*
+ * We only support execution from the vsyscall page.
+ * This is as if CONFIG_LEGACY_VSYSCALL_XONLY=y from v5.3.
+ */
+dprintf(fd, "ff60-ff601000 --xp  00:00 0"
+"  [vsyscall]\n");
+#endif
+
 free(line);
 fclose(fp);
 
-- 
2.20.1

Re: [PATCH v2 0/5] linux-user: Implement x86_64 vsyscalls

On 1/16/20 9:43 AM, Richard Henderson wrote:
> Changes since v2:
> 
>   * Add /proc/self/maps line
> 
>   I'm not sure this is really necessary.  The linux kernel
>   self-test checks for it, and modifies the set of tests that
>   it runs based on it.  But otherwise I think it's unused.
> 
>   * Fix errors in base gettimeofday syscall
> 
>   This is also checked by test_vsyscall, as noticed by AJB.

Oh, and

* Set error_code in write_ok_or_segv (patch 2)


r~

[PATCH v2 3/5] linux-user/i386: Emulate x86_64 vsyscalls

Notice the magic page during translate, much like we already
do for the arm32 commpage.  At runtime, raise an exception to
return cpu_loop for emulation.

Reviewed-by: Paolo Bonzini 
Signed-off-by: Richard Henderson 
---
 target/i386/cpu.h  |   1 +
 linux-user/i386/cpu_loop.c | 105 +
 target/i386/translate.c|  16 +-
 3 files changed, 121 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 164d038d1f..3fb2d2a986 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1000,6 +1000,7 @@ typedef uint64_t FeatureWordArray[FEATURE_WORDS];
 
 #define EXCP_VMEXIT 0x100 /* only for system emulation */
 #define EXCP_SYSCALL0x101 /* only for user emulation */
+#define EXCP_VSYSCALL   0x102 /* only for user emulation */
 
 /* i386-specific interrupt pending bits.  */
 #define CPU_INTERRUPT_POLL  CPU_INTERRUPT_TGT_EXT_1
diff --git a/linux-user/i386/cpu_loop.c b/linux-user/i386/cpu_loop.c
index e217cca5ee..f9bf6cec27 100644
--- a/linux-user/i386/cpu_loop.c
+++ b/linux-user/i386/cpu_loop.c
@@ -92,6 +92,106 @@ static void gen_signal(CPUX86State *env, int sig, int code, 
abi_ptr addr)
 queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
 }
 
+#ifdef TARGET_X86_64
+static bool write_ok_or_segv(CPUX86State *env, abi_ptr addr, size_t len)
+{
+/*
+ * For all the vsyscalls, NULL means "don't write anything" not
+ * "write it at address 0".
+ */
+if (addr == 0 || access_ok(VERIFY_WRITE, addr, len)) {
+return true;
+}
+
+env->error_code = PG_ERROR_W_MASK | PG_ERROR_U_MASK;
+gen_signal(env, TARGET_SIGSEGV, TARGET_SEGV_MAPERR, addr);
+return false;
+}
+
+/*
+ * Since v3.1, the kernel traps and emulates the vsyscall page.
+ * Entry points other than the official generate SIGSEGV.
+ */
+static void emulate_vsyscall(CPUX86State *env)
+{
+int syscall;
+abi_ulong ret;
+uint64_t caller;
+
+/*
+ * Validate the entry point.  We have already validated the page
+ * during translation, now verify the offset.
+ */
+switch (env->eip & ~TARGET_PAGE_MASK) {
+case 0x000:
+syscall = TARGET_NR_gettimeofday;
+break;
+case 0x400:
+syscall = TARGET_NR_time;
+break;
+case 0x800:
+syscall = TARGET_NR_getcpu;
+break;
+default:
+sigsegv:
+/* Like force_sig(SIGSEGV).  */
+gen_signal(env, TARGET_SIGSEGV, TARGET_SI_KERNEL, 0);
+return;
+}
+
+/*
+ * Validate the return address.
+ * Note that the kernel treats this the same as an invalid entry point.
+ */
+if (get_user_u64(caller, env->regs[R_ESP])) {
+goto sigsegv;
+}
+
+/*
+ * Validate the the pointer arguments.
+ */
+switch (syscall) {
+case TARGET_NR_gettimeofday:
+if (!write_ok_or_segv(env, env->regs[R_EDI],
+  sizeof(struct target_timeval)) ||
+!write_ok_or_segv(env, env->regs[R_ESI],
+  sizeof(struct target_timezone))) {
+return;
+}
+break;
+case TARGET_NR_time:
+if (!write_ok_or_segv(env, env->regs[R_EDI], sizeof(abi_long))) {
+return;
+}
+break;
+case TARGET_NR_getcpu:
+if (!write_ok_or_segv(env, env->regs[R_EDI], sizeof(uint32_t)) ||
+!write_ok_or_segv(env, env->regs[R_ESI], sizeof(uint32_t))) {
+return;
+}
+break;
+default:
+g_assert_not_reached();
+}
+
+/*
+ * Perform the syscall.  None of the vsyscalls should need restarting,
+ * and all faults should have been caught above.
+ */
+ret = do_syscall(env, syscall, env->regs[R_EDI], env->regs[R_ESI],
+ env->regs[R_EDX], env->regs[10], env->regs[8],
+ env->regs[9], 0, 0);
+g_assert(ret != -TARGET_ERESTARTSYS);
+g_assert(ret != -TARGET_QEMU_ESIGRETURN);
+g_assert(ret != -TARGET_EFAULT);
+env->regs[R_EAX] = ret;
+
+/* Emulate a ret instruction to leave the vsyscall page.  */
+env->eip = caller;
+env->regs[R_ESP] += 8;
+}
+#endif
+
 void cpu_loop(CPUX86State *env)
 {
 CPUState *cs = env_cpu(env);
@@ -141,6 +241,11 @@ void cpu_loop(CPUX86State *env)
 env->regs[R_EAX] = ret;
 }
 break;
+#endif
+#ifdef TARGET_X86_64
+case EXCP_VSYSCALL:
+emulate_vsyscall(env);
+break;
 #endif
 case EXCP0B_NOSEG:
 case EXCP0C_STACK:
diff --git a/target/i386/translate.c b/target/i386/translate.c
index 7c99ef1385..391b4ef149 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -8555,7 +8555,21 @@ static bool i386_tr_breakpoint_check(DisasContextBase 
*dcbase, CPUState *cpu,
 static void i386_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
 {
 DisasContext *dc = container_of(dcbase, DisasContext, base);
-target_ulong pc_next = disas

Re: [PATCH 3/3] linux-user/i386: Emulate x86_64 vsyscalls

2020-01-16 Thread Alex Bennée



Richard Henderson  writes:

> On 1/16/20 6:26 AM, Alex Bennée wrote:
>>> +/*
>>> + * Perform the syscall.  None of the vsyscalls should need restarting,
>>> + * and all faults should have been caught above.
>>> + */
>>> +ret = do_syscall(env, syscall, env->regs[R_EDI], env->regs[R_ESI],
>>> + env->regs[R_EDX], env->regs[10], env->regs[8],
>>> + env->regs[9], 0, 0);
>> 
>> How come the register ABI to the syscall is different to the others. I
>> can see why syscall doesn't come from EAX but the others are a different
>> set to normal syscalls which might be why:
>
> Cut and paste error, I assume.
>
> That said, the three syscalls have a maximum of 2 arguments,
> so I could really just pass EDI and ESI and 0 for the rest...
>
>> I'm seeing a EFAULT on the gettimeofday failure:
>
> What getttimeofday failure?  Is this related to the mention of /sbin/ldconfig
> in your previous message?

No - the buster x86064 ldconfig seg is unrelated to this series. It has
however spawned an additional bug in gdbstub while it was at it ;-)

>
>>#0  do_syscall (cpu_env=cpu_env@entry=0x577d2b10, num=num@entry=96, 
>> arg1=0, arg2=0, arg3=4211016, arg4=8, arg5=274888677184, arg6=274886295415, 
>> arg7=0, arg8=0) at /home/alex/lsrc/qemu.git/linux-user/syscall.c:12076   
>> 
>>#1  0x55609b6e in emulate_vsyscall (env=0x577d2b10) at 
>> /home/alex/lsrc/qemu.git/linux-user/x86_64/../i386/cpu_loop.c:180
>>#2  cpu_loop (env=0x577d2b10) at 
>> /home/alex/lsrc/qemu.git/linux-user/x86_64/../i386/cpu_loop.c:246
>>   
>>#3  0x5559640e in main (argc=, argv=>#out>, envp=) at
>>#/home/alex/lsrc/qemu.git/linux-user/main.c:865
>> 
>> arg1/arg2 don't seem right here.
>
> Why?  NULL value for arg1 is legal, though semi-useless.
>
> Ah, I see that our implementation of gettimeofday doesn't honor NULL.
>
>
> r~


-- 
Alex Bennée

Re: [PATCH] target/s390x/kvm: Enable adapter interruption suppression again

2020-01-16 Thread Matthew Rosato


On 1/16/20 7:20 AM, Thomas Huth wrote:

The AIS feature has been disabled late in the v2.10 development
cycle since there were some issues with migration (see commit
3f2d07b3b01ea61126b - "s390x/ais: for 2.10 stable: disable ais
facility"). We originally wanted to enable it again for newer
machine types, but apparently we forgot to do this so far. Let's
do it for the new s390-ccw-virtio-5.0 machine now.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1756946
Signed-off-by: Thomas Huth 
---
  hw/s390x/s390-virtio-ccw.c |  4 
  include/hw/s390x/s390-virtio-ccw.h |  4 
  target/s390x/kvm.c | 11 ---
  3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index e7eadd14e8..6f43136396 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -456,6 +456,7 @@ static void ccw_machine_class_init(ObjectClass *oc, void 
*data)
  s390mc->cpu_model_allowed = true;
  s390mc->css_migration_enabled = true;
  s390mc->hpage_1m_allowed = true;
+s390mc->kvm_ais_allowed = true;
  mc->init = ccw_init;
  mc->reset = s390_machine_reset;
  mc->hot_add_cpu = s390_hot_add_cpu;
@@ -662,6 +663,9 @@ static void ccw_machine_4_2_instance_options(MachineState 
*machine)
  
  static void ccw_machine_4_2_class_options(MachineClass *mc)

  {
+S390CcwMachineClass *s390mc = S390_MACHINE_CLASS(mc);
+
+s390mc->kvm_ais_allowed = false;
  ccw_machine_5_0_class_options(mc);
  compat_props_add(mc->compat_props, hw_compat_4_2, hw_compat_4_2_len);
  }
diff --git a/include/hw/s390x/s390-virtio-ccw.h 
b/include/hw/s390x/s390-virtio-ccw.h
index 8aa27199c9..f142d379c6 100644
--- a/include/hw/s390x/s390-virtio-ccw.h
+++ b/include/hw/s390x/s390-virtio-ccw.h
@@ -21,6 +21,9 @@
  #define S390_MACHINE_CLASS(klass) \
  OBJECT_CLASS_CHECK(S390CcwMachineClass, (klass), TYPE_S390_CCW_MACHINE)
  
+#define S390_CCW_MACHINE_OBJ_GET_CLASS(obj) \

+OBJECT_GET_CLASS(S390CcwMachineClass, obj, TYPE_S390_CCW_MACHINE)
+
  typedef struct S390CcwMachineState {
  /*< private >*/
  MachineState parent_obj;
@@ -40,6 +43,7 @@ typedef struct S390CcwMachineClass {
  bool cpu_model_allowed;
  bool css_migration_enabled;
  bool hpage_1m_allowed;
+bool kvm_ais_allowed;
  } S390CcwMachineClass;
  
  /* runtime-instrumentation allowed by the machine */

diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
index 15260aeb9a..4c1c8c0208 100644
--- a/target/s390x/kvm.c
+++ b/target/s390x/kvm.c
@@ -329,6 +329,8 @@ static void ccw_machine_class_foreach(ObjectClass *oc, void 
*opaque)
  
  int kvm_arch_init(MachineState *ms, KVMState *s)

  {
+S390CcwMachineClass *smc = S390_CCW_MACHINE_OBJ_GET_CLASS(ms);
+


I still can't run a proper test due to unavailable hw but in the 
meantime I tried to virsh define a libvirt guest pointed at qemu (master 
+ this patch).  Regardless of machine type (s390-ccw-virtio-5.0 or 
s390-ccw-virtio-4.2) I get:


virsh define guest.xml
error: Failed to define domain from /path/to/guest.xml
error: invalid argument: could not find capabilities for arch=s390x 
domaintype=kvm


Similarly:

virsh domcapabilities
error: failed to get emulator capabilities
error: invalid argument: unable to find any emulator to serve 's390x' 
architecture


Rolling back to qemu master, the define and domcapabilities work (with 
no ais of course).


So: there is some incompatibility between the way libvirt invokes qemu 
to detect capabilities and this code.  The above line seems to be the 
root problem - if I take your patch and remove 'smc' then libvirt works 
as expected and I can see ais in the domcapabilities.


Looking at those wrappers David mentioned...  I suspect you need this 
for the 'none' machine case.  I tried a quick hack with the following:


bool ais_allowed(void)
{
/* for "none" machine this results in true */
return get_machine_class()->kvm_ais_allowed;
}

and

if (ais_allowed() &&
kvm_check_extension(s, KVM_CAP_S390_AIS_MIGRATION)) {
kvm_vm_enable_cap(s, KVM_CAP_S390_AIS, 0);
}

This works and doesn't break libvirt compatibility detection.




  object_class_foreach(ccw_machine_class_foreach, TYPE_S390_CCW_MACHINE,
   false, NULL);
  
@@ -365,10 +367,13 @@ int kvm_arch_init(MachineState *ms, KVMState *s)

  /*
   * The migration interface for ais was introduced with kernel 4.13
   * but the capability itself had been active since 4.12. As migration
- * support is considered necessary let's disable ais in the 2.10
- * machine.
+ * support is considered necessary we only enable this for newer
+ * machine types and if KVM_CAP_S390_AIS_MIGRATION is available.
   */
-/* kvm_vm_enable_cap(s, KVM_CAP_S390_AIS, 0); */
+if (smc->kvm_ais_allowed &&
+kvm_check_extension(s, KVM_CAP_S390_AIS_MIGRATION)) {
+kvm_vm_enable_cap(s, KVM_CAP_S390_AIS, 0);
+}
  
  kvm_set_max_me

[PATCH v3 0/2] Fix hyperv synic on vhost

2020-01-16 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

Hyperv's synic (that we emulate) is a feature that allows the guest
to place some magic (4k) pages of RAM anywhere it likes in GPA.
This confuses vhost's RAM section merging when these pages
land over the top of hugepages.

This v3 takes a different approach to v2 and v1.
It avoids doing the hugepage alignment except for vhost-user:

  a) Vhost kernel : doesn't need alignment, it's turned off
 synic won't cause a problem.

  b) vhost user : Already filters out anything without an fd
 synic won't cause a problem.

(Not tried vhost-user yet, it currently seems broken even without this).

This might also cause some other reported problems with vga
pages causing similar issues.

bz: https://bugzilla.redhat.com/show_bug.cgi?id=1779041

Dr. David Alan Gilbert (2):
  vhost: Add names to section rounded warning
  vhost: Only align sections for vhost-user

 hw/virtio/vhost.c | 39 +--
 1 file changed, 21 insertions(+), 18 deletions(-)

-- 
2.24.1

[PATCH v3 1/2] vhost: Add names to section rounded warning

2020-01-16 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

Add the memory region names to section rounding/alignment
warnings.

Signed-off-by: Dr. David Alan Gilbert 
---
 hw/virtio/vhost.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 4da0d5a6c5..774d87d98e 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -590,9 +590,10 @@ static void vhost_region_add_section(struct vhost_dev *dev,
  * match up in the same RAMBlock if they do.
  */
 if (mrs_gpa < prev_gpa_start) {
-error_report("%s:Section rounded to %"PRIx64
- " prior to previous %"PRIx64,
- __func__, mrs_gpa, prev_gpa_start);
+error_report("%s:Section '%s' rounded to %"PRIx64
+ " prior to previous '%s' %"PRIx64,
+ __func__, section->mr->name, mrs_gpa,
+ prev_sec->mr->name, prev_gpa_start);
 /* A way to cleanly fail here would be better */
 return;
 }
-- 
2.24.1

[PATCH v3 2/2] vhost: Only align sections for vhost-user

2020-01-16 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

I added hugepage alignment code in c1ece84e7c9 to deal with
vhost-user + postcopy which needs aligned pages when using userfault.
However, on x86 the lower 2MB of address space tends to be shotgun'd
with small fragments around the 512-640k range - e.g. video RAM, and
with HyperV synic pages tend to sit around there - again splitting
it up.  The alignment code complains with a 'Section rounded to ...'
error and gives up.

Since vhost-user already filters out devices without an fd
(see vhost-user.c vhost_user_mem_section_filter) it shouldn't be
affected by those overlaps.

Turn the alignment off on vhost-kernel so that it doesn't try
and align, and thus won't hit the rounding issues.

Signed-off-by: Dr. David Alan Gilbert 
---
 hw/virtio/vhost.c | 34 ++
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 774d87d98e..25fd469179 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -547,26 +547,28 @@ static void vhost_region_add_section(struct vhost_dev 
*dev,
 uintptr_t mrs_host = (uintptr_t)memory_region_get_ram_ptr(section->mr) +
  section->offset_within_region;
 RAMBlock *mrs_rb = section->mr->ram_block;
-size_t mrs_page = qemu_ram_pagesize(mrs_rb);
 
 trace_vhost_region_add_section(section->mr->name, mrs_gpa, mrs_size,
mrs_host);
 
-/* Round the section to it's page size */
-/* First align the start down to a page boundary */
-uint64_t alignage = mrs_host & (mrs_page - 1);
-if (alignage) {
-mrs_host -= alignage;
-mrs_size += alignage;
-mrs_gpa  -= alignage;
-}
-/* Now align the size up to a page boundary */
-alignage = mrs_size & (mrs_page - 1);
-if (alignage) {
-mrs_size += mrs_page - alignage;
-}
-trace_vhost_region_add_section_aligned(section->mr->name, mrs_gpa, 
mrs_size,
-   mrs_host);
+if (dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER) {   
+/* Round the section to it's page size */
+/* First align the start down to a page boundary */
+size_t mrs_page = qemu_ram_pagesize(mrs_rb);
+uint64_t alignage = mrs_host & (mrs_page - 1);
+if (alignage) {
+mrs_host -= alignage;
+mrs_size += alignage;
+mrs_gpa  -= alignage;
+}
+/* Now align the size up to a page boundary */
+alignage = mrs_size & (mrs_page - 1);
+if (alignage) {
+mrs_size += mrs_page - alignage;
+}
+trace_vhost_region_add_section_aligned(section->mr->name, mrs_gpa, 
mrs_size,
+   mrs_host);
+}
 
 if (dev->n_tmp_sections) {
 /* Since we already have at least one section, lets see if
-- 
2.24.1

[PATCH] qapi: Fix code generation with Python 3.5

2020-01-16 Thread Markus Armbruster

Recent commit 3e7fb5811b "qapi: Fix code generation for empty modules"
modules" switched QAPISchema.visit() from

for entity in self._entity_list:

effectively to

for mod in self._module_dict.values():
for entity in mod._entity_list:

Visits in the same order as long as .values() is in insertion order.
That's the case only for Python 3.6 and later.  Before, it's in some
arbitrary order, which results in broken generated code.

Fix by making self._module_dict an OrderedDict rather than a dict.

Fixes: 3e7fb5811baab213dcc7149c3aa69442d683c26c
Signed-off-by: Markus Armbruster 
---
 scripts/qapi/schema.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/qapi/schema.py b/scripts/qapi/schema.py
index 0bfc5256fb..5100110fa2 100644
--- a/scripts/qapi/schema.py
+++ b/scripts/qapi/schema.py
@@ -795,7 +795,7 @@ class QAPISchema(object):
 self.docs = parser.docs
 self._entity_list = []
 self._entity_dict = {}
-self._module_dict = {}
+self._module_dict = OrderedDict()
 self._schema_dir = os.path.dirname(fname)
 self._make_module(None) # built-ins
 self._make_module(fname)
-- 
2.21.1

Re: qemu-4.0.1: vhost_region_add_section:Section rounded to 0 prior to previous a0000

2020-01-16 Thread Dr. David Alan Gilbert

* Peter Lieven (p...@kamp.de) wrote:
> Am 16.01.20 um 13:47 schrieb Peter Lieven:
> > Am 13.01.20 um 17:25 schrieb Peter Lieven:
> > > Am 09.01.20 um 19:44 schrieb Dr. David Alan Gilbert:
> > > > * Peter Lieven (p...@kamp.de) wrote:
> > > > > Am 08.01.20 um 16:04 schrieb Dr. David Alan Gilbert:
> > > > > > * Peter Lieven (p...@kamp.de) wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > 
> > > > > > > I have a Qemu 4.0.1 machine with vhost-net network adapter, thats 
> > > > > > > polluting the log with the above message.
> > > > > > > 
> > > > > > > Is this something known? Googling revealed the following patch in 
> > > > > > > Nemu (with seems to be a Qemu fork from Intel):
> > > > > > > 
> > > > > > > https://github.com/intel/nemu/commit/03940ded7f5370ce7492c619dccced114ef7f56e
> > > > > > > 
> > > > > > > 
> > > > > > > The network stopped functioning. After a live-migration the 
> > > > > > > vServer is reachable again.
> > > > > > > 
> > > > > > > 
> > > > > > > Any ideas?
> > > > > > What guest are you running and what does your qemu commandline look
> > > > > > like?
> > > > > 
> > > > > Its running debian9. We have hundreds of other VMs with identical 
> > > > > setup. Do not know why this one makes trouble.
> > > > Could you extract an 'info mtree' from it - particularly the
> > > > 'address-space: memory' near the top.
> > > 
> > > 
> > > Here we go:
> > > 
> > > 
> > > address-space: memory
> > >   - (prio 0, i/o): system
> > >     -3fff (prio 0, i/o): alias ram-below-4g 
> > > @pc.ram -3fff
> > >     - (prio -1, i/o): pci
> > >   000a-000a (prio 2, i/o): alias vga.chain4 
> > > @vga.vram -
> > >   000a-000b (prio 1, i/o): vga-lowmem
> > 
> > 
> > What seems special is that the RAM area is prio2. Any idea if this makes 
> > trouble?
> 
> 
> Update from my side. This happens when I have Debian 10 with XFCE when the 
> Graphical User Interface is initialized.
> 
> I see the log message when I specify -M pc-i440fx-2.9. If I obmit the machine 
> type the error does not appear.

I can't persuade this to reproduce here on the images I currently have;
but if you can rebuild, can you try the v3 of 'Fix hyperv synic on
vhost' I've just posted?  It turns off the alignment code that's
spitting that error in vhost-kernel cases, so should go away.

Dave

> 
> Peter
> 
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [PATCH] target/s390x/kvm: Enable adapter interruption suppression again

2020-01-16 Thread Cornelia Huck

On Thu, 16 Jan 2020 15:19:13 -0500
Matthew Rosato  wrote:

> On 1/16/20 7:20 AM, Thomas Huth wrote:
> > The AIS feature has been disabled late in the v2.10 development
> > cycle since there were some issues with migration (see commit
> > 3f2d07b3b01ea61126b - "s390x/ais: for 2.10 stable: disable ais
> > facility"). We originally wanted to enable it again for newer
> > machine types, but apparently we forgot to do this so far. Let's
> > do it for the new s390-ccw-virtio-5.0 machine now.
> > 
> > Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1756946
> > Signed-off-by: Thomas Huth 
> > ---
> >   hw/s390x/s390-virtio-ccw.c |  4 
> >   include/hw/s390x/s390-virtio-ccw.h |  4 
> >   target/s390x/kvm.c | 11 ---
> >   3 files changed, 16 insertions(+), 3 deletions(-)

(...)

> > diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
> > index 15260aeb9a..4c1c8c0208 100644
> > --- a/target/s390x/kvm.c
> > +++ b/target/s390x/kvm.c
> > @@ -329,6 +329,8 @@ static void ccw_machine_class_foreach(ObjectClass *oc, 
> > void *opaque)
> >   
> >   int kvm_arch_init(MachineState *ms, KVMState *s)
> >   {
> > +S390CcwMachineClass *smc = S390_CCW_MACHINE_OBJ_GET_CLASS(ms);
> > +  
> 
> I still can't run a proper test due to unavailable hw but in the 
> meantime I tried to virsh define a libvirt guest pointed at qemu (master 
> + this patch).  Regardless of machine type (s390-ccw-virtio-5.0 or 
> s390-ccw-virtio-4.2) I get:
> 
> virsh define guest.xml
> error: Failed to define domain from /path/to/guest.xml
> error: invalid argument: could not find capabilities for arch=s390x 
> domaintype=kvm
> 
> Similarly:
> 
> virsh domcapabilities
> error: failed to get emulator capabilities
> error: invalid argument: unable to find any emulator to serve 's390x' 
> architecture
> 
> Rolling back to qemu master, the define and domcapabilities work (with 
> no ais of course).
> 
> So: there is some incompatibility between the way libvirt invokes qemu 
> to detect capabilities and this code.  The above line seems to be the 
> root problem - if I take your patch and remove 'smc' then libvirt works 
> as expected and I can see ais in the domcapabilities.
> 
> Looking at those wrappers David mentioned...  I suspect you need this 
> for the 'none' machine case.  I tried a quick hack with the following:
> 
> bool ais_allowed(void)
> {
>  /* for "none" machine this results in true */
>  return get_machine_class()->kvm_ais_allowed;
> }
> 
> and
> 
> if (ais_allowed() &&
>  kvm_check_extension(s, KVM_CAP_S390_AIS_MIGRATION)) {
>  kvm_vm_enable_cap(s, KVM_CAP_S390_AIS, 0);
> }
> 
> This works and doesn't break libvirt compatibility detection.

Oh, "none" machine fun again... I think you're on the right track, and
we really need a wrapper.

[Bug 1859713] Re: ARM v8.3a pauth not working

2020-01-16 Thread Vincent Dehors

Hi,

Here is a patch for this bug. The sbox function was using "b+=16"
instead of "b+=4".

Also, you check test vector using :

```c
uint64_t P = 0xfb623599da6e8127ull;
uint64_t T = 0x477d469dec0b8762ull;
uint64_t w0 = 0x84be85ce9804e94bull;
uint64_t k0 = 0xec2802d4e0a488e9ull;
ARMPACKey key = { .hi = w0, .lo = k0 };
uint64_t C5 = pauth_computepac(P, T, key);
/* C5 should be 0xc003b93999b33765 */
```

** Patch added: "0001-target-arm-Fix-PAuth-sbox-functions.patch"
   
https://bugs.launchpad.net/qemu/+bug/1859713/+attachment/5320937/+files/0001-target-arm-Fix-PAuth-sbox-functions.patch

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1859713

Title:
  ARM v8.3a pauth not working

Status in QEMU:
  Confirmed

Bug description:
  Host: Ubuntu 19.10 - x86_64 machine
  QEMU version: 3a63b24a1bbf166e6f455fe43a6bbd8dea413d92 (master)

  ARMV8.3 pauth is not working well.

  With a test code containing two pauth instructions:
  - paciasp that sign LR with A key and sp as context;
  - autiasp that verify the signature.

  Test:
  - Run the program and corrupt LR just before autiasp (ex 0x3e0400660 
instead of 0x3e00400664)

  Expected:
  - autiasp places an invalid pointer in LR

  Result:
  - autiasp successfully auth the pointer and places 0x0400660 in LR.

  Further explanations:
  Adding traces in qemu code shows that "pauth_computepac" is not robust 
enough against truncating.
  With 0x3100400664 as input of pauth_auth, we obtain 
"0x55b1d65b2c138e14" for PAC, "0x30" for bot_bit and "0x38" for top_bit.
  With 0x310040008743ec as input of pauth (with same key), we obtain 
"0x55b1d65b2c138ef4" for PAC, "0x30" for bot_bit and "0x38" for top_bit.
  Values of top_bit and bottom_bit are strictly the same and it should not.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1859713/+subscriptions

[PATCH v2 0/4] vl: Fixes for cleanups to -accel

Running qemu-system-foo with no options should not generate
a warning for "invalid accelerator bar".

Changes in v2:
  * Rebase on master, getting the free accel_list fix from upstream.
Re-word the resulting patch 2 to merely reduce the scope of the
local variables.
  * Use g_str_has_suffix (ajb)


r~


Richard Henderson (4):
  vl: Remove unused variable in configure_accelerators
  vl: Reduce scope of variables in configure_accelerators
  vl: Remove useless test in configure_accelerators
  vl: Only choose enabled accelerators in configure_accelerators

 vl.c | 27 ---
 1 file changed, 16 insertions(+), 11 deletions(-)

-- 
2.20.1

[PATCH v2 1/4] vl: Remove unused variable in configure_accelerators

The accel_initialised variable no longer has any setters.

Fixes: 6f6e1698a68c
Acked-by: Paolo Bonzini 
Reviewed-by: Alex Bennée 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed by: Aleksandar Markovic 
Signed-off-by: Richard Henderson 
---
 vl.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/vl.c b/vl.c
index 751401214c..6a5abf2f54 100644
--- a/vl.c
+++ b/vl.c
@@ -2754,7 +2754,6 @@ static void configure_accelerators(const char *progname)
 {
 const char *accel;
 char **accel_list, **tmp;
-bool accel_initialised = false;
 bool init_failed = false;
 
 qemu_opts_foreach(qemu_find_opts("icount"),
@@ -2781,7 +2780,7 @@ static void configure_accelerators(const char *progname)
 
 accel_list = g_strsplit(accel, ":", 0);
 
-for (tmp = accel_list; !accel_initialised && tmp && *tmp; tmp++) {
+for (tmp = accel_list; tmp && *tmp; tmp++) {
 /*
  * Filter invalid accelerators here, to prevent obscenities
  * such as "-machine accel=tcg,,thread=single".
-- 
2.20.1

[PATCH v2 4/4] vl: Only choose enabled accelerators in configure_accelerators

By choosing "tcg:kvm" when kvm is not enabled, we generate
an incorrect warning: "invalid accelerator kvm".

At the same time, use g_str_has_suffix rather than open-coding
the same operation.

Presumably the inverse is also true with --disable-tcg.

Fixes: 28a0961757fc
Acked-by: Paolo Bonzini 
Reviewed-by: Alex Bennée 
Reviewed by: Aleksandar Markovic 
Signed-off-by: Richard Henderson 
---
v2: Use g_str_has_suffix (ajb)
---
 vl.c | 21 +
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/vl.c b/vl.c
index 8ae8a5d241..40ac9c5544 100644
--- a/vl.c
+++ b/vl.c
@@ -2764,21 +2764,26 @@ static void configure_accelerators(const char *progname)
 
 if (accel == NULL) {
 /* Select the default accelerator */
-if (!accel_find("tcg") && !accel_find("kvm")) {
-error_report("No accelerator selected and"
- " no default accelerator available");
-exit(1);
-} else {
-int pnlen = strlen(progname);
-if (pnlen >= 3 && g_str_equal(&progname[pnlen - 3], "kvm")) {
+bool have_tcg = accel_find("tcg");
+bool have_kvm = accel_find("kvm");
+
+if (have_tcg && have_kvm) {
+if (g_str_has_suffix(progname, "kvm")) {
 /* If the program name ends with "kvm", we prefer KVM */
 accel = "kvm:tcg";
 } else {
 accel = "tcg:kvm";
 }
+} else if (have_kvm) {
+accel = "kvm";
+} else if (have_tcg) {
+accel = "tcg";
+} else {
+error_report("No accelerator selected and"
+ " no default accelerator available");
+exit(1);
 }
 }
-
 accel_list = g_strsplit(accel, ":", 0);
 
 for (tmp = accel_list; *tmp; tmp++) {
-- 
2.20.1

[PATCH v2 3/4] vl: Remove useless test in configure_accelerators

The result of g_strsplit is never NULL.

Acked-by: Paolo Bonzini 
Reviewed-by: Alex Bennée 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed by: Aleksandar Markovic 
Signed-off-by: Richard Henderson 
---
 vl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/vl.c b/vl.c
index 3e2b77a4e8..8ae8a5d241 100644
--- a/vl.c
+++ b/vl.c
@@ -2781,7 +2781,7 @@ static void configure_accelerators(const char *progname)
 
 accel_list = g_strsplit(accel, ":", 0);
 
-for (tmp = accel_list; tmp && *tmp; tmp++) {
+for (tmp = accel_list; *tmp; tmp++) {
 /*
  * Filter invalid accelerators here, to prevent obscenities
  * such as "-machine accel=tcg,,thread=single".
-- 
2.20.1

[PATCH v2 2/4] vl: Reduce scope of variables in configure_accelerators

The accel_list and tmp variables are only used when manufacturing
-machine accel, options based on -accel.

Acked-by: Paolo Bonzini 
Reviewed-by: Alex Bennée 
Reviewed by: Aleksandar Markovic 
Signed-off-by: Richard Henderson 
---
v2: The freeing of accel_list was fixed in adb464ff671d.
---
 vl.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/vl.c b/vl.c
index 6a5abf2f54..3e2b77a4e8 100644
--- a/vl.c
+++ b/vl.c
@@ -2753,7 +2753,6 @@ static int do_configure_accelerator(void *opaque, 
QemuOpts *opts, Error **errp)
 static void configure_accelerators(const char *progname)
 {
 const char *accel;
-char **accel_list, **tmp;
 bool init_failed = false;
 
 qemu_opts_foreach(qemu_find_opts("icount"),
@@ -2761,6 +2760,8 @@ static void configure_accelerators(const char *progname)
 
 accel = qemu_opt_get(qemu_get_machine_opts(), "accel");
 if (QTAILQ_EMPTY(&qemu_accel_opts.head)) {
+char **accel_list, **tmp;
+
 if (accel == NULL) {
 /* Select the default accelerator */
 if (!accel_find("tcg") && !accel_find("kvm")) {
-- 
2.20.1

Re: qemu-4.0.1: vhost_region_add_section:Section rounded to 0 prior to previous a0000

2020-01-16 Thread Peter Lieven

> Am 16.01.2020 um 21:26 schrieb Dr. David Alan Gilbert :
> 
> * Peter Lieven (p...@kamp.de) wrote:
>> Am 16.01.20 um 13:47 schrieb Peter Lieven:
>>> Am 13.01.20 um 17:25 schrieb Peter Lieven:
 Am 09.01.20 um 19:44 schrieb Dr. David Alan Gilbert:
> * Peter Lieven (p...@kamp.de) wrote:
>> Am 08.01.20 um 16:04 schrieb Dr. David Alan Gilbert:
>>> * Peter Lieven (p...@kamp.de) wrote:
 Hi,

 I have a Qemu 4.0.1 machine with vhost-net network adapter, thats 
 polluting the log with the above message.

 Is this something known? Googling revealed the following patch in Nemu 
 (with seems to be a Qemu fork from Intel):

 https://github.com/intel/nemu/commit/03940ded7f5370ce7492c619dccced114ef7f56e

 The network stopped functioning. After a live-migration the vServer is 
 reachable again.

 Any ideas?
>>> What guest are you running and what does your qemu commandline look
>>> like?
>> 
>> Its running debian9. We have hundreds of other VMs with identical setup. 
>> Do not know why this one makes trouble.
> Could you extract an 'info mtree' from it - particularly the
> 'address-space: memory' near the top.

 Here we go:

 address-space: memory
   - (prio 0, i/o): system
 -3fff (prio 0, i/o): alias ram-below-4g 
 @pc.ram -3fff
 - (prio -1, i/o): pci
   000a-000a (prio 2, i/o): alias vga.chain4 
 @vga.vram -
   000a-000b (prio 1, i/o): vga-lowmem
>>> 
>>> 
>>> What seems special is that the RAM area is prio2. Any idea if this makes 
>>> trouble?
>> 
>> 
>> Update from my side. This happens when I have Debian 10 with XFCE when the 
>> Graphical User Interface is initialized.
>> 
>> I see the log message when I specify -M pc-i440fx-2.9. If I obmit the 
>> machine type the error does not appear.
> 
> I can't persuade this to reproduce here on the images I currently have;
> but if you can rebuild, can you try the v3 of 'Fix hyperv synic on
> vhost' I've just posted?  It turns off the alignment code that's
> spitting that error in vhost-kernel cases, so should go away.

I will definitely give it a try tomorrow. The fix to change the machine type 
was not working. I was too fast with my assumption.

What I see is the following:

Machine boots up (cold start), has network connectivity. As soon as the 
Graphics are initialized, the vhost_region_add_section error pops up and
Networking is interrupted. When I then migrate the vServer to another host 
Networking works again. It even seems to keep working when I do a
soft reset (shutdown -r). The only thing that breaks networking again when the 
graphic is initialized is a hard reset (system reset via hmp or amp).

Thank you,
Peter

Re: [PATCH] qapi: Fix code generation with Python 3.5

2020-01-16 Thread John Snow




On 1/16/20 3:25 PM, Markus Armbruster wrote:
> Recent commit 3e7fb5811b "qapi: Fix code generation for empty modules"
> modules" switched QAPISchema.visit() from
> 
> for entity in self._entity_list:
> 
> effectively to
> 
> for mod in self._module_dict.values():
> for entity in mod._entity_list:
> 
> Visits in the same order as long as .values() is in insertion order.
> That's the case only for Python 3.6 and later.  Before, it's in some
> arbitrary order, which results in broken generated code.
> 
> Fix by making self._module_dict an OrderedDict rather than a dict.
> 
> Fixes: 3e7fb5811baab213dcc7149c3aa69442d683c26c
> Signed-off-by: Markus Armbruster 
> ---
>  scripts/qapi/schema.py | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/scripts/qapi/schema.py b/scripts/qapi/schema.py
> index 0bfc5256fb..5100110fa2 100644
> --- a/scripts/qapi/schema.py
> +++ b/scripts/qapi/schema.py
> @@ -795,7 +795,7 @@ class QAPISchema(object):
>  self.docs = parser.docs
>  self._entity_list = []
>  self._entity_dict = {}
> -self._module_dict = {}
> +self._module_dict = OrderedDict()
>  self._schema_dir = os.path.dirname(fname)
>  self._make_module(None) # built-ins
>  self._make_module(fname)
> 

This problem has bitten me *many* times. I'm wondering if there's a
prescription that isn't just "Wait until we can stipulate 3.6+".

Re: [PATCH v3 02/11] 9pfs: require msize >= 4096

2020-01-16 Thread Christian Schoenebeck

On Donnerstag, 16. Januar 2020 19:07:48 CET Greg Kurz wrote:
> > The point here was that there must be some kind of absolute minimum msize,
> 
> Then the absolute minimum size is 7, the size of the header part that is
> common to all messages:
> 
> size[4] Message tag[2]
> 
> > even though the protocol specs officially don't mandate one. And if a
> > client choses a msize < P9_IOHDRSZ, what useful actions shall it be able
> > to do?
> I haven't checked but I guess some messages can fit in 24 bytes.
> 
> > That's why I mentioned P9_IOHDRSZ just in case somebody might come up
> > later
> > asking to lower that minimum msize value for whatever reason.
> 
> That's precisely what I want to avoid. The semantic of P9_IOHDRSZ is
> that it is the header size of a Twrite message and as such it is used
> to compute the 'iounit' which the guest later uses to compute a
> suitable 'count' for Tread/Twrite messages only. It shouldn't be
> involved in msize considerations IMHO.
> 
> > But np, IMO this sentence makes the fundamental requirement for this patch
> > more clear, but I have no problem dropping this sentence.
> 
> Then you can mention 7 has the true minimal size.

Ok, I'll drop P9_IOHDRSZ completely and mention literal 7 as absolute 
theoretical minimum instead.

> > > > Furthermore there are some responses which are not allowed by the 9p
> > > > protocol to be truncated like e.g. Rreadlink which may yield up to
> > > 
> > > No message may be truncated in any way actually. The spec just allows
> > > an exception with the string part of Rerror.
> > 
> > Rreaddir, Rread, Twrite, Rerror can be truncated. So I suggest I'll just
> > change that to s/some/most/ semantically.
> 
> I mean truncate with loss of data. Rreaddir can return less than 'count'
> but it won't truncate an individual entry. It rewinds to the previous
> one and returns its offset for the next Treaddir. Same goes with Rread
> and Twrite, we always return a 'count' that can be used by the client
> to continue reading or writing.

Individual entries are not truncated, correct, but data loss: Example, you 
have this directory and say server's fs would deliver entries by readdir() in 
this order (from top down):

.
..
a
1234567890
b
c
d

and say 37 >= msize < 45, then client would receive 3 entries ".", "..", "a" 
and that's it.

> Rerror is really the only message where we can deliberately drop
> data (but we actually don't do that).
> 
> > > Maybe just mention that and say we choose 4096 to be able to send
> > > big Rreadlink messages.
> > 
> > That's not the point for this patch. The main purpose is to avoid having
> > to
> > maintain individual error prone minimum msize checks all over the code
> > base. If we would allow very small msizes (much smaller than 4096), then
> > we would need to add numerous error handlers all over the code base, each
> > one checking for different values (depending on the specific message
> > type).
> 
> I'm not sure what you mean by 'minimum msize checks all over the code
> base'... Please provide an example.

1. Like done in this patch series: Treaddir has to limit client's 'count'
   parameter according to msize (by subtracting with msize).

2. get_iounit() does this:

iounit = stbuf.f_bsize;
iounit *= (s->msize - P9_IOHDRSZ) / stbuf.f_bsize;

   without checking anywhere for a potential negative outcome
   (i.e. when msize < P9_IOHDRSZ)

3. Example of directory entry name length with Rreaddir above.

Individual issues that can easily be overlooked but also easily avoided by not 
allowing small msizes in the first place.

Best regards,
Christian Schoenebeck

[Bug 1859713] Re: ARM v8.3a pauth not working

Ooof. Good catch on the sbox error.

That said, how did you test pauth_computepac?
I still do not get the C5 result above, but 0x99d88f4472f3be39.

The following test case sets up the parameters.

** Patch added: "test case"

https://bugs.launchpad.net/qemu/+bug/1859713/+attachment/5320967/+files/0001-tests-tcg-aarch64-Add-pauth-3.patch

--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1859713

Title:
ARM v8.3a pauth not working

Status in QEMU:
Confirmed

Bug description:
Host: Ubuntu 19.10 - x86_64 machine
QEMU version: 3a63b24a1bbf166e6f455fe43a6bbd8dea413d92 (master)

ARMV8.3 pauth is not working well.

With a test code containing two pauth instructions:
- paciasp that sign LR with A key and sp as context;
- autiasp that verify the signature.

Test:
- Run the program and corrupt LR just before autiasp (ex 0x3e0400660
instead of 0x3e00400664)

Expected:
- autiasp places an invalid pointer in LR

Result:
- autiasp successfully auth the pointer and places 0x0400660 in LR.

Further explanations:
Adding traces in qemu code shows that "pauth_computepac" is not robust
enough against truncating.
With 0x3100400664 as input of pauth_auth, we obtain
"0x55b1d65b2c138e14" for PAC, "0x30" for bot_bit and "0x38" for top_bit.
With 0x310040008743ec as input of pauth (with same key), we obtain
"0x55b1d65b2c138ef4" for PAC, "0x30" for bot_bit and "0x38" for top_bit.
Values of top_bit and bottom_bit are strictly the same and it should not.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1859713/+subscriptions

[Bug 1859713] Re: ARM v8.3a pauth not working

Oops again.  The test case has the parts of the key the wrong way around.
I'll submit the pair of patches to the mailing list.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1859713

Title:
  ARM v8.3a pauth not working

Status in QEMU:
  Confirmed

Bug description:
  Host: Ubuntu 19.10 - x86_64 machine
  QEMU version: 3a63b24a1bbf166e6f455fe43a6bbd8dea413d92 (master)

  ARMV8.3 pauth is not working well.

  With a test code containing two pauth instructions:
  - paciasp that sign LR with A key and sp as context;
  - autiasp that verify the signature.

  Test:
  - Run the program and corrupt LR just before autiasp (ex 0x3e0400660 
instead of 0x3e00400664)

  Expected:
  - autiasp places an invalid pointer in LR

  Result:
  - autiasp successfully auth the pointer and places 0x0400660 in LR.

  Further explanations:
  Adding traces in qemu code shows that "pauth_computepac" is not robust 
enough against truncating.
  With 0x3100400664 as input of pauth_auth, we obtain 
"0x55b1d65b2c138e14" for PAC, "0x30" for bot_bit and "0x38" for top_bit.
  With 0x310040008743ec as input of pauth (with same key), we obtain 
"0x55b1d65b2c138ef4" for PAC, "0x30" for bot_bit and "0x38" for top_bit.
  Values of top_bit and bottom_bit are strictly the same and it should not.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1859713/+subscriptions

[PATCH 12/12] linux-user: Add support for KCOV_INIT_TRACE ioctl

From: Aleksandar Markovic 

KCOV_INIT_TRACE ioctl plays the role in kernel coverage tracing.
This ioctl's third argument is of type 'unsigned long', and the
implementation in QEMU is straightforward.

Reviewed-by: Laurent Vivier 
Signed-off-by: Aleksandar Markovic 
---
 linux-user/ioctls.h   | 1 +
 linux-user/syscall_defs.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
index 39b3825..1da71dd 100644
--- a/linux-user/ioctls.h
+++ b/linux-user/ioctls.h
@@ -556,4 +556,5 @@
 #ifdef CONFIG_KCOV
   IOCTL(KCOV_ENABLE, 0, TYPE_NULL)
   IOCTL(KCOV_DISABLE, 0, TYPE_NULL)
+  IOCTL(KCOV_INIT_TRACE, IOC_R, TYPE_ULONG)
 #endif
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index c8999ef..bf71b3a 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -2464,6 +2464,7 @@ struct target_mtpos {
 /* kcov ioctls */
 #define TARGET_KCOV_ENABLE TARGET_IO('c', 100)
 #define TARGET_KCOV_DISABLETARGET_IO('c', 101)
+#define TARGET_KCOV_INIT_TRACE TARGET_IOR('c', 1, abi_ulong)
 
 struct target_sysinfo {
 abi_long uptime;/* Seconds since boot */
-- 
2.7.4

[PATCH 02/12] linux-user: Add support for FS_IOC32_FLAGS ioctls

From: Aleksandar Markovic 

These FS_IOC32_FLAGS ioctls are identical to
FS_IOC_FLAGS ioctls, but without the anomaly of their
number defined as if their third argument is of type long, while
it is treated internally in kernel as is of type int.

Reviewed-by: Laurent Vivier 
Signed-off-by: Aleksandar Markovic 
---
 linux-user/ioctls.h   | 2 ++
 linux-user/syscall_defs.h | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
index c44f42e..4fd6939 100644
--- a/linux-user/ioctls.h
+++ b/linux-user/ioctls.h
@@ -140,6 +140,8 @@
  IOCTL(FS_IOC_SETFLAGS, IOC_W, MK_PTR(TYPE_INT))
  IOCTL(FS_IOC_GETVERSION, IOC_R, MK_PTR(TYPE_INT))
  IOCTL(FS_IOC_SETVERSION, IOC_W, MK_PTR(TYPE_INT))
+ IOCTL(FS_IOC32_GETFLAGS, IOC_R, MK_PTR(TYPE_INT))
+ IOCTL(FS_IOC32_SETFLAGS, IOC_W, MK_PTR(TYPE_INT))
 
 #ifdef CONFIG_USBFS
   /* USB ioctls */
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index f68a8b6..964b2b4 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -920,6 +920,8 @@ struct target_pollfd {
 #define TARGET_FS_IOC_GETVERSION TARGET_IOR('v', 1, abi_long)
 #define TARGET_FS_IOC_SETVERSION TARGET_IOW('v', 2, abi_long)
 #define TARGET_FS_IOC_FIEMAP TARGET_IOWR('f',11,struct fiemap)
+#define TARGET_FS_IOC32_GETFLAGS TARGET_IOR('f', 1, int)
+#define TARGET_FS_IOC32_SETFLAGS TARGET_IOW('f', 2, int)
 
 /* usb ioctls */
 #define TARGET_USBDEVFS_CONTROL TARGET_IOWRU('U', 0)
-- 
2.7.4

[PATCH 03/12] linux-user: Add support for FS_IOC32_VERSION ioctls

From: Aleksandar Markovic 

These FS_IOC32_VERSION ioctls are identical to
FS_IOC_VERSION ioctls, but without the anomaly of their
number defined as if their third argument is of type long, while
it is treated internally in kernel as is of type int.

Reviewed-by: Laurent Vivier 
Signed-off-by: Aleksandar Markovic 
---
 linux-user/ioctls.h   | 2 ++
 linux-user/syscall_defs.h | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
index 4fd6939..3affd88 100644
--- a/linux-user/ioctls.h
+++ b/linux-user/ioctls.h
@@ -142,6 +142,8 @@
  IOCTL(FS_IOC_SETVERSION, IOC_W, MK_PTR(TYPE_INT))
  IOCTL(FS_IOC32_GETFLAGS, IOC_R, MK_PTR(TYPE_INT))
  IOCTL(FS_IOC32_SETFLAGS, IOC_W, MK_PTR(TYPE_INT))
+ IOCTL(FS_IOC32_GETVERSION, IOC_R, MK_PTR(TYPE_INT))
+ IOCTL(FS_IOC32_SETVERSION, IOC_W, MK_PTR(TYPE_INT))
 
 #ifdef CONFIG_USBFS
   /* USB ioctls */
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index 964b2b4..a73cc3d 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -922,6 +922,8 @@ struct target_pollfd {
 #define TARGET_FS_IOC_FIEMAP TARGET_IOWR('f',11,struct fiemap)
 #define TARGET_FS_IOC32_GETFLAGS TARGET_IOR('f', 1, int)
 #define TARGET_FS_IOC32_SETFLAGS TARGET_IOW('f', 2, int)
+#define TARGET_FS_IOC32_GETVERSION TARGET_IOR('v', 1, int)
+#define TARGET_FS_IOC32_SETVERSION TARGET_IOW('v', 2, int)
 
 /* usb ioctls */
 #define TARGET_USBDEVFS_CONTROL TARGET_IOWRU('U', 0)
-- 
2.7.4

[PATCH 00/12] linux-user: Add support for fs, fd,and kcov ioctls

From: Aleksandar Markovic 

This series is a spin-off of v5 of earlier series "linux-user: Misc
patches for 5.0", that became too large to manage. I will submit the
rest of that large series separately.

The only difference between patches in this series and in the former
series is that all Laurent's comments and concerns were addressed.

The summary of the patches is as follows:

Patches 1-6: Adding support for some filesystem-related ioctls
Patches 7-9: Adding support for some floppy-drive-related ioctls
Patches 10-12: Adding support for kcov-related ioctls

Aleksandar Markovic (12):
  linux-user: Add support for FS_IOC_VERSION ioctls
  linux-user: Add support for FS_IOC32_FLAGS ioctls
  linux-user: Add support for FS_IOC32_VERSION ioctls
  linux-user: Add support for FS_IOC_FSXATTR ioctls
  linux-user: Add support for FITRIM ioctl
  linux-user: Add support for FIFREEZE and FITHAW ioctls
  linux-user: Add support for FD
ioctls
  linux-user: Add support for FDFMT ioctls
  linux-user: Add support for FDGETFDCSTAT ioctl
  configure: Detect kcov support and introduce CONFIG_KCOV
  linux-user: Add support for KCOV_ ioctls
  linux-user: Add support for KCOV_INIT_TRACE ioctl

 configure  |  9 +
 linux-user/ioctls.h| 36 +
 linux-user/syscall.c   |  3 +++
 linux-user/syscall_defs.h  | 50 +++---
 linux-user/syscall_types.h | 29 +++
 5 files changed, 124 insertions(+), 3 deletions(-)

-- 
2.7.4

[PATCH 06/12] linux-user: Add support for FIFREEZE and FITHAW ioctls

From: Aleksandar Markovic 

Both FIFREEZE and FITHAW ioctls accept an integer as their third
argument.

All ioctls in this group (FI* ioctl) are guarded with "#ifdef", so the
guards are used in this implementation too for consistency (however,
many of ioctls in FI* group became old enough that their #ifdef guards
could be removed, bit this is out of the scope of this patch).

Reviewed-by: Laurent Vivier 
Signed-off-by: Aleksandar Markovic 
---
 linux-user/ioctls.h   | 6 ++
 linux-user/syscall_defs.h | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
index e4f0a04..66f8c4e 100644
--- a/linux-user/ioctls.h
+++ b/linux-user/ioctls.h
@@ -123,6 +123,12 @@
 #ifdef FIBMAP
  IOCTL(FIBMAP, IOC_W | IOC_R, MK_PTR(TYPE_LONG))
 #endif
+#ifdef FIFREEZE
+ IOCTL(FIFREEZE, IOC_W | IOC_R, TYPE_INT)
+#endif
+#ifdef FITHAW
+ IOCTL(FITHAW, IOC_W | IOC_R, TYPE_INT)
+#endif
 #ifdef FITRIM
  IOCTL(FITRIM, IOC_W | IOC_R, MK_PTR(MK_STRUCT(STRUCT_fstrim_range)))
 #endif
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index 97ab3e1..d4d39de 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -908,6 +908,8 @@ struct target_pollfd {
 #define TARGET_FIBMAP TARGET_IO(0x00,1)  /* bmap access */
 #define TARGET_FIGETBSZ   TARGET_IO(0x00,2)  /* get the block size used for 
bmap */
 
+#define TARGET_FIFREEZE   TARGET_IOWR('X', 119, int)/* Freeze */
+#define TARGET_FITHAW TARGET_IOWR('X', 120, int)/* Thaw */
 #define TARGET_FITRIM TARGET_IOWR('X', 121, struct fstrim_range)
 #define TARGET_FICLONETARGET_IOW(0x94, 9, int)
 #define TARGET_FICLONERANGE TARGET_IOW(0x94, 13, struct file_clone_range)
-- 
2.7.4

[PATCH 05/12] linux-user: Add support for FITRIM ioctl

From: Aleksandar Markovic 

FITRIM ioctl accepts a pointer to the structure

struct fstrim_range {
__u64 start;
__u64 len;
__u64 minlen;
};

as its third argument.

All ioctls in this group (FI* ioctl) are guarded with "#ifdef", so the
guards are used in this implementation too for consistency (however,
many of ioctls in FI* group became old enough that their #ifdef guards
could be removed, bit this is out of the scope of this patch).

Signed-off-by: Aleksandar Markovic 
---
 linux-user/ioctls.h| 3 +++
 linux-user/syscall_defs.h  | 1 +
 linux-user/syscall_types.h | 5 +
 3 files changed, 9 insertions(+)

diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
index e1b89a7..e4f0a04 100644
--- a/linux-user/ioctls.h
+++ b/linux-user/ioctls.h
@@ -123,6 +123,9 @@
 #ifdef FIBMAP
  IOCTL(FIBMAP, IOC_W | IOC_R, MK_PTR(TYPE_LONG))
 #endif
+#ifdef FITRIM
+ IOCTL(FITRIM, IOC_W | IOC_R, MK_PTR(MK_STRUCT(STRUCT_fstrim_range)))
+#endif
 #ifdef FICLONE
  IOCTL(FICLONE, IOC_W, TYPE_INT)
  IOCTL(FICLONERANGE, IOC_W, MK_PTR(MK_STRUCT(STRUCT_file_clone_range)))
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index e1663b6..97ab3e1 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -908,6 +908,7 @@ struct target_pollfd {
 #define TARGET_FIBMAP TARGET_IO(0x00,1)  /* bmap access */
 #define TARGET_FIGETBSZ   TARGET_IO(0x00,2)  /* get the block size used for 
bmap */
 
+#define TARGET_FITRIM TARGET_IOWR('X', 121, struct fstrim_range)
 #define TARGET_FICLONETARGET_IOW(0x94, 9, int)
 #define TARGET_FICLONERANGE TARGET_IOW(0x94, 13, struct file_clone_range)
 
diff --git a/linux-user/syscall_types.h b/linux-user/syscall_types.h
index 4e36983..ca9429e 100644
--- a/linux-user/syscall_types.h
+++ b/linux-user/syscall_types.h
@@ -226,6 +226,11 @@ STRUCT(dm_target_versions,
 STRUCT(dm_target_msg,
TYPE_ULONGLONG) /* sector */
 
+STRUCT(fstrim_range,
+   TYPE_LONGLONG, /* start */
+   TYPE_LONGLONG, /* len */
+   TYPE_LONGLONG) /* minlen */
+
 STRUCT(file_clone_range,
TYPE_LONGLONG, /* src_fd */
TYPE_ULONGLONG, /* src_offset */
-- 
2.7.4

[PATCH 01/12] linux-user: Add support for FS_IOC_VERSION ioctls

From: Aleksandar Markovic 

A very specific thing for these two ioctls is that their code
implies that their third argument is of type 'long', but the
kernel uses that argument as if it is of type 'int'. This anomaly
is recognized also in commit 6080723 (linux-user: Implement
FS_IOC_GETFLAGS and FS_IOC_SETFLAGS ioctls).

Reviewed-by: Laurent Vivier 
Signed-off-by: Aleksandar Markovic 
---
 linux-user/ioctls.h   | 2 ++
 linux-user/syscall_defs.h | 8 +---
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
index c6b9d6a..c44f42e 100644
--- a/linux-user/ioctls.h
+++ b/linux-user/ioctls.h
@@ -138,6 +138,8 @@
 
  IOCTL(FS_IOC_GETFLAGS, IOC_R, MK_PTR(TYPE_INT))
  IOCTL(FS_IOC_SETFLAGS, IOC_W, MK_PTR(TYPE_INT))
+ IOCTL(FS_IOC_GETVERSION, IOC_R, MK_PTR(TYPE_INT))
+ IOCTL(FS_IOC_SETVERSION, IOC_W, MK_PTR(TYPE_INT))
 
 #ifdef CONFIG_USBFS
   /* USB ioctls */
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index 98c2119..f68a8b6 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -911,12 +911,14 @@ struct target_pollfd {
 #define TARGET_FICLONETARGET_IOW(0x94, 9, int)
 #define TARGET_FICLONERANGE TARGET_IOW(0x94, 13, struct file_clone_range)
 
-/* Note that the ioctl numbers claim type "long" but the actual type
- * used by the kernel is "int".
+/*
+ * Note that the ioctl numbers for FS_IOC_
+ * claim type "long" but the actual type used by the kernel is "int".
  */
 #define TARGET_FS_IOC_GETFLAGS TARGET_IOR('f', 1, abi_long)
 #define TARGET_FS_IOC_SETFLAGS TARGET_IOW('f', 2, abi_long)
-
+#define TARGET_FS_IOC_GETVERSION TARGET_IOR('v', 1, abi_long)
+#define TARGET_FS_IOC_SETVERSION TARGET_IOW('v', 2, abi_long)
 #define TARGET_FS_IOC_FIEMAP TARGET_IOWR('f',11,struct fiemap)
 
 /* usb ioctls */
-- 
2.7.4

[PATCH 07/12] linux-user: Add support for FD ioctls

From: Aleksandar Markovic 

FDSETEMSGTRESH, FDSETMAXERRS, and FDGETMAXERRS ioctls are commands
for controlling error reporting of a floppy drive.

FDSETEMSGTRESH's third agrument is a pointer to the structure:

struct floppy_max_errors {
unsigned int
  abort,  /* number of errors to be reached before aborting */
  read_track, /* maximal number of errors permitted to read an
   * entire track at once */
  reset,  /* maximal number of errors before a reset is tried */
  recal,  /* maximal number of errors before a recalibrate is
   * tried */
  /*
   * Threshold for reporting FDC errors to the console.
   * Setting this to zero may flood your screen when using
   * ultra cheap floppies ;-)
   */
  reporting;
};

defined in Linux kernel header .

Since all fields of the structure are of type 'unsigned int', there is
no need to define "target_floppy_max_errors".

FDSETMAXERRS and FDGETMAXERRS ioctls do not use the third argument.

Reviewed-by: Laurent Vivier 
Signed-off-by: Aleksandar Markovic 
---
 linux-user/ioctls.h| 3 +++
 linux-user/syscall_defs.h  | 3 +++
 linux-user/syscall_types.h | 7 +++
 3 files changed, 13 insertions(+)

diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
index 66f8c4e..9e3ca90 100644
--- a/linux-user/ioctls.h
+++ b/linux-user/ioctls.h
@@ -114,7 +114,10 @@
 
  IOCTL(FDMSGON, 0, TYPE_NULL)
  IOCTL(FDMSGOFF, 0, TYPE_NULL)
+ IOCTL(FDSETEMSGTRESH, 0, TYPE_NULL)
  IOCTL(FDFLUSH, 0, TYPE_NULL)
+ IOCTL(FDSETMAXERRS, IOC_W, MK_PTR(MK_STRUCT(STRUCT_floppy_max_errors)))
+ IOCTL(FDGETMAXERRS, IOC_R, MK_PTR(MK_STRUCT(STRUCT_floppy_max_errors)))
  IOCTL(FDRESET, 0, TYPE_NULL)
  IOCTL(FDRAWCMD, 0, TYPE_NULL)
  IOCTL(FDTWADDLE, 0, TYPE_NULL)
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index d4d39de..e317115 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -899,7 +899,10 @@ struct target_pollfd {
 
 #define TARGET_FDMSGONTARGET_IO(2, 0x45)
 #define TARGET_FDMSGOFF   TARGET_IO(2, 0x46)
+#define TARGET_FDSETEMSGTRESH TARGET_IO(2, 0x4a)
 #define TARGET_FDFLUSHTARGET_IO(2, 0x4b)
+#define TARGET_FDSETMAXERRS  TARGET_IOW(2, 0x4c, struct floppy_max_errors)
+#define TARGET_FDGETMAXERRS  TARGET_IOR(2, 0x0e, struct floppy_max_errors)
 #define TARGET_FDRESETTARGET_IO(2, 0x54)
 #define TARGET_FDRAWCMD   TARGET_IO(2, 0x58)
 #define TARGET_FDTWADDLE  TARGET_IO(2, 0x59)
diff --git a/linux-user/syscall_types.h b/linux-user/syscall_types.h
index ca9429e..434ce1c 100644
--- a/linux-user/syscall_types.h
+++ b/linux-user/syscall_types.h
@@ -266,6 +266,13 @@ STRUCT(blkpg_ioctl_arg,
TYPE_INT, /* datalen */
TYPE_PTRVOID) /* data */
 
+STRUCT(floppy_max_errors,
+   TYPE_INT, /* abort */
+   TYPE_INT, /* read_track */
+   TYPE_INT, /* reset */
+   TYPE_INT, /* recal */
+   TYPE_INT) /* reporting */
+
 #if defined(CONFIG_USBFS)
 /* usb device ioctls */
 STRUCT(usbdevfs_ctrltransfer,
-- 
2.7.4

[PATCH 10/12] configure: Detect kcov support and introduce CONFIG_KCOV

From: Aleksandar Markovic 

kcov is kernel code coverage tracing tool. It requires kernel 4.4+
compiled with certain kernel options.

This patch checks if kcov header "sys/kcov.h" is present on build
machine, and stores the result in variable CONFIG_KCOV, meant to
be used in linux-user code related to the support for three ioctls
that were introduced at the same time as the mentioned header
(their definition was a part of the first version of that header).

Signed-off-by: Aleksandar Markovic 
---
 configure | 9 +
 1 file changed, 9 insertions(+)

diff --git a/configure b/configure
index 940bf9e..57e6eba 100755
--- a/configure
+++ b/configure
@@ -4752,6 +4752,12 @@ if compile_prog "" "" ; then
   syncfs=yes
 fi
 
+# check for kcov support (kernel must be 4.4+, compiled with certain options)
+kcov=no
+if check_include sys/kcov.h ; then
+kcov=yes
+fi
+
 # Check we have a new enough version of sphinx-build
 has_sphinx_build() {
 # This is a bit awkward but works: create a trivial document and
@@ -6874,6 +6880,9 @@ fi
 if test "$syncfs" = "yes" ; then
   echo "CONFIG_SYNCFS=y" >> $config_host_mak
 fi
+if test "$kcov" = "yes" ; then
+  echo "CONFIG_KCOV=y" >> $config_host_mak
+fi
 if test "$inotify" = "yes" ; then
   echo "CONFIG_INOTIFY=y" >> $config_host_mak
 fi
-- 
2.7.4

[PATCH 11/12] linux-user: Add support for KCOV_ ioctls

From: Aleksandar Markovic 

KCOV_ENABLE and KCOV_DISABLE play the role in kernel coverage
tracing. These ioctls do not use the third argument of ioctl()
system call and are straightforward to implement in QEMU.

Reviewed-by: Laurent Vivier 
Signed-off-by: Aleksandar Markovic 
---
 linux-user/ioctls.h   | 5 +
 linux-user/syscall.c  | 3 +++
 linux-user/syscall_defs.h | 4 
 3 files changed, 12 insertions(+)

diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
index d72cd76..39b3825 100644
--- a/linux-user/ioctls.h
+++ b/linux-user/ioctls.h
@@ -552,3 +552,8 @@
   IOCTL_IGNORE(TIOCSTART)
   IOCTL_IGNORE(TIOCSTOP)
 #endif
+
+#ifdef CONFIG_KCOV
+  IOCTL(KCOV_ENABLE, 0, TYPE_NULL)
+  IOCTL(KCOV_DISABLE, 0, TYPE_NULL)
+#endif
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 171c0ca..6edcb0d 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -73,6 +73,9 @@
 #ifdef CONFIG_SENDFILE
 #include 
 #endif
+#ifdef CONFIG_KCOV
+#include 
+#endif
 
 #define termios host_termios
 #define winsize host_winsize
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index 46dc565..c8999ef 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -2461,6 +2461,10 @@ struct target_mtpos {
 #define TARGET_MTIOCGETTARGET_IOR('m', 2, struct target_mtget)
 #define TARGET_MTIOCPOSTARGET_IOR('m', 3, struct target_mtpos)
 
+/* kcov ioctls */
+#define TARGET_KCOV_ENABLE TARGET_IO('c', 100)
+#define TARGET_KCOV_DISABLETARGET_IO('c', 101)
+
 struct target_sysinfo {
 abi_long uptime;/* Seconds since boot */
 abi_ulong loads[3]; /* 1, 5, and 15 minute load averages */
-- 
2.7.4

[PATCH 09/12] linux-user: Add support for FDGETFDCSTAT ioctl

From: Aleksandar Markovic 

FDGETFDCSTAT's third agrument is a pointer to the structure:

struct floppy_fdc_state {
int spec1;
int spec2;
int dtr;
unsigned char version;
unsigned char dor;
unsigned long address;
unsigned int rawcmd:2;
unsigned int reset:1;
unsigned int need_configure:1;
unsigned int perp_mode:2;
unsigned int has_fifo:1;
unsigned int driver_version;
unsigned char track[4];
};

defined in Linux kernel header .

Since there is a fields of the structure of type 'unsigned long', there is
a need to define "target_format_descr". Also, five fields rawcmd, reset,
need_configure, perp_mode, and has_fifo are all just bitfields and are
part od a single 'unsigned int' field.

Signed-off-by: Aleksandar Markovic 
---
 linux-user/ioctls.h|  2 ++
 linux-user/syscall_defs.h  | 18 ++
 linux-user/syscall_types.h | 12 
 3 files changed, 32 insertions(+)

diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
index e754a6b..d72cd76 100644
--- a/linux-user/ioctls.h
+++ b/linux-user/ioctls.h
@@ -122,6 +122,8 @@
  IOCTL(FDSETMAXERRS, IOC_W, MK_PTR(MK_STRUCT(STRUCT_floppy_max_errors)))
  IOCTL(FDGETMAXERRS, IOC_R, MK_PTR(MK_STRUCT(STRUCT_floppy_max_errors)))
  IOCTL(FDRESET, 0, TYPE_NULL)
+ IOCTL(FDGETFDCSTAT, IOC_R,
+   MK_PTR(MK_STRUCT(STRUCT_target_floppy_fdc_state)))
  IOCTL(FDRAWCMD, 0, TYPE_NULL)
  IOCTL(FDTWADDLE, 0, TYPE_NULL)
  IOCTL(FDEJECT, 0, TYPE_NULL)
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index 9fbf04a..46dc565 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -897,6 +897,23 @@ struct target_pollfd {
 
 /* From  */
 
+struct target_floppy_fdc_state {
+int spec1;  /* spec1 value last used */
+int spec2;  /* spec2 value last used */
+int dtr;
+unsigned char version;  /* FDC version code */
+unsigned char dor;
+abi_ulong address;  /* io address */
+unsigned int rawcmd:2;
+unsigned int reset:1;
+unsigned int need_configure:1;
+unsigned int perp_mode:2;
+unsigned int has_fifo:1;
+unsigned int driver_version;/* version code for floppy driver */
+unsigned char track[4];
+};
+
+
 #define TARGET_FDMSGONTARGET_IO(2, 0x45)
 #define TARGET_FDMSGOFF   TARGET_IO(2, 0x46)
 #define TARGET_FDFMTBEG   TARGET_IO(2, 0x47)
@@ -907,6 +924,7 @@ struct target_pollfd {
 #define TARGET_FDSETMAXERRS  TARGET_IOW(2, 0x4c, struct floppy_max_errors)
 #define TARGET_FDGETMAXERRS  TARGET_IOR(2, 0x0e, struct floppy_max_errors)
 #define TARGET_FDRESETTARGET_IO(2, 0x54)
+#define TARGET_FDGETFDCSTAT  TARGET_IOR(2, 0x15, struct 
target_floppy_fdc_state)
 #define TARGET_FDRAWCMD   TARGET_IO(2, 0x58)
 #define TARGET_FDTWADDLE  TARGET_IO(2, 0x59)
 #define TARGET_FDEJECTTARGET_IO(2, 0x5a)
diff --git a/linux-user/syscall_types.h b/linux-user/syscall_types.h
index ff60240..2b480d0 100644
--- a/linux-user/syscall_types.h
+++ b/linux-user/syscall_types.h
@@ -278,6 +278,18 @@ STRUCT(floppy_max_errors,
TYPE_INT, /* recal */
TYPE_INT) /* reporting */
 
+STRUCT(target_floppy_fdc_state,
+   TYPE_INT, /* spec1 */
+   TYPE_INT, /* spec2 */
+   TYPE_INT, /* dtr */
+   TYPE_CHAR, /* version */
+   TYPE_CHAR, /* dor */
+   TYPE_ULONG, /* address */
+   TYPE_INT, /* bit field for rawcmd:2, reset:1, need_configure:1, */
+ /* perp_mode:2, and has_fifo:1 */
+   TYPE_INT, /* driver_version */
+   MK_ARRAY(TYPE_CHAR, 4)) /* track */
+
 #if defined(CONFIG_USBFS)
 /* usb device ioctls */
 STRUCT(usbdevfs_ctrltransfer,
-- 
2.7.4

[PATCH 08/12] linux-user: Add support for FDFMT ioctls

From: Aleksandar Markovic 

FDFMTBEG, FDFMTTRK, and FDFMTEND ioctls provide means for controlling
formatting of a floppy drive.

FDFMTTRK's third agrument is a pointer to the structure:

struct format_descr {
unsigned int device,head,track;
};

defined in Linux kernel header .

Since all fields of the structure are of type 'unsigned int', there is
no need to define "target_format_descr".

FDFMTBEG and FDFMTEND ioctls do not use the third argument.

Reviewed-by: Laurent Vivier 
Signed-off-by: Aleksandar Markovic 
---
 linux-user/ioctls.h| 3 +++
 linux-user/syscall_defs.h  | 3 +++
 linux-user/syscall_types.h | 5 +
 3 files changed, 11 insertions(+)

diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
index 9e3ca90..e754a6b 100644
--- a/linux-user/ioctls.h
+++ b/linux-user/ioctls.h
@@ -115,6 +115,9 @@
  IOCTL(FDMSGON, 0, TYPE_NULL)
  IOCTL(FDMSGOFF, 0, TYPE_NULL)
  IOCTL(FDSETEMSGTRESH, 0, TYPE_NULL)
+ IOCTL(FDFMTBEG, 0, TYPE_NULL)
+ IOCTL(FDFMTTRK, IOC_W, MK_PTR(MK_STRUCT(STRUCT_format_descr)))
+ IOCTL(FDFMTEND, 0, TYPE_NULL)
  IOCTL(FDFLUSH, 0, TYPE_NULL)
  IOCTL(FDSETMAXERRS, IOC_W, MK_PTR(MK_STRUCT(STRUCT_floppy_max_errors)))
  IOCTL(FDGETMAXERRS, IOC_R, MK_PTR(MK_STRUCT(STRUCT_floppy_max_errors)))
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index e317115..9fbf04a 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -899,6 +899,9 @@ struct target_pollfd {
 
 #define TARGET_FDMSGONTARGET_IO(2, 0x45)
 #define TARGET_FDMSGOFF   TARGET_IO(2, 0x46)
+#define TARGET_FDFMTBEG   TARGET_IO(2, 0x47)
+#define TARGET_FDFMTTRK  TARGET_IOW(2, 0x48, struct format_descr)
+#define TARGET_FDFMTEND   TARGET_IO(2, 0x49)
 #define TARGET_FDSETEMSGTRESH TARGET_IO(2, 0x4a)
 #define TARGET_FDFLUSHTARGET_IO(2, 0x4b)
 #define TARGET_FDSETMAXERRS  TARGET_IOW(2, 0x4c, struct floppy_max_errors)
diff --git a/linux-user/syscall_types.h b/linux-user/syscall_types.h
index 434ce1c..ff60240 100644
--- a/linux-user/syscall_types.h
+++ b/linux-user/syscall_types.h
@@ -266,6 +266,11 @@ STRUCT(blkpg_ioctl_arg,
TYPE_INT, /* datalen */
TYPE_PTRVOID) /* data */
 
+STRUCT(format_descr,
+   TYPE_INT, /* device */
+   TYPE_INT, /* head */
+   TYPE_INT) /* track */
+
 STRUCT(floppy_max_errors,
TYPE_INT, /* abort */
TYPE_INT, /* read_track */
-- 
2.7.4

[PATCH 04/12] linux-user: Add support for FS_IOC_FSXATTR ioctls

From: Aleksandar Markovic 

Both FS_IOC_FSGETXATTR and FS_IOC_FSSETXATTR accept a pointer to
the structure

struct fsxattr {
__u32 fsx_xflags; /* xflags field value (get/set) */
__u32 fsx_extsize;/* extsize field value (get/set)*/
__u32 fsx_nextents;   /* nextents field value (get) */
__u32 fsx_projid; /* project identifier (get/set) */
__u32 fsx_cowextsize; /* CoW extsize field value (get/set)*/
unsigned char fsx_pad[8];
};

as their third argument.

These ioctls were relatively recently introduced, so the "#ifdef"
guards are used in this implementation.

Signed-off-by: Aleksandar Markovic 
---
 linux-user/ioctls.h   | 7 +++
 linux-user/syscall_defs.h | 6 ++
 2 files changed, 13 insertions(+)

diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
index 3affd88..e1b89a7 100644
--- a/linux-user/ioctls.h
+++ b/linux-user/ioctls.h
@@ -144,6 +144,13 @@
  IOCTL(FS_IOC32_SETFLAGS, IOC_W, MK_PTR(TYPE_INT))
  IOCTL(FS_IOC32_GETVERSION, IOC_R, MK_PTR(TYPE_INT))
  IOCTL(FS_IOC32_SETVERSION, IOC_W, MK_PTR(TYPE_INT))
+#ifdef FS_IOC_FSGETXATTR
+ IOCTL(FS_IOC_FSGETXATTR, IOC_W, MK_PTR(MK_STRUCT(STRUCT_fsxattr)))
+#endif
+#ifdef FS_IOC_FSSETXATTR
+ IOCTL(FS_IOC_FSSETXATTR, IOC_W, MK_PTR(MK_STRUCT(STRUCT_fsxattr)))
+#endif
+
 
 #ifdef CONFIG_USBFS
   /* USB ioctls */
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index a73cc3d..e1663b6 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -924,6 +924,12 @@ struct target_pollfd {
 #define TARGET_FS_IOC32_SETFLAGS TARGET_IOW('f', 2, int)
 #define TARGET_FS_IOC32_GETVERSION TARGET_IOR('v', 1, int)
 #define TARGET_FS_IOC32_SETVERSION TARGET_IOW('v', 2, int)
+#ifdef FS_IOC_FSGETXATTR
+#define TARGET_FS_IOC_FSGETXATTR TARGET_IOR('X', 31, struct fsxattr)
+#endif
+#ifdef FS_IOC_FSSETXATTR
+#define TARGET_FS_IOC_FSSETXATTR TARGET_IOR('X', 32, struct fsxattr)
+#endif
 
 /* usb ioctls */
 #define TARGET_USBDEVFS_CONTROL TARGET_IOWRU('U', 0)
-- 
2.7.4

[PATCH 0/4] target/arm: Fix ComputePAC (LP 1859713)

Thanks to Vincent and Adrien for both reporting the bug and
providing the solution.  I've converted their manual testing
into executable tests.


r~


Richard Henderson (3):
  tests/tcg/aarch64: Fix compilation parameters for pauth-%
  tests/tcg/aarch64: Add pauth-3
  tests/tcg/aarch64: Add pauth-4

Vincent Dehors (1):
  target/arm: Fix PAuth sbox functions

 target/arm/pauth_helper.c |  4 +--
 tests/tcg/aarch64/pauth-1.c   |  2 --
 tests/tcg/aarch64/pauth-2.c   |  2 --
 tests/tcg/aarch64/pauth-4.c   | 25 ++
 tests/tcg/aarch64/system/pauth-3.c| 40 +++
 tests/tcg/aarch64/Makefile.softmmu-target |  5 ++-
 tests/tcg/aarch64/Makefile.target |  3 +-
 7 files changed, 73 insertions(+), 8 deletions(-)
 create mode 100644 tests/tcg/aarch64/pauth-4.c
 create mode 100644 tests/tcg/aarch64/system/pauth-3.c

-- 
2.20.1

[PATCH 3/4] tests/tcg/aarch64: Add pauth-3

This is the test vector from the QARMA paper, run through PACGA.

Suggested-by: Vincent Dehors 
Signed-off-by: Richard Henderson 
---
 tests/tcg/aarch64/system/pauth-3.c| 40 +++
 tests/tcg/aarch64/Makefile.softmmu-target |  5 ++-
 2 files changed, 44 insertions(+), 1 deletion(-)
 create mode 100644 tests/tcg/aarch64/system/pauth-3.c

diff --git a/tests/tcg/aarch64/system/pauth-3.c 
b/tests/tcg/aarch64/system/pauth-3.c
new file mode 100644
index 00..42eff4d5ea
--- /dev/null
+++ b/tests/tcg/aarch64/system/pauth-3.c
@@ -0,0 +1,40 @@
+#include 
+#include 
+
+int main()
+{
+/*
+ * Test vector from QARMA paper (https://eprint.iacr.org/2016/444.pdf)
+ * to verify one computation of the pauth_computepac() function,
+ * which uses sbox2.
+ *
+ * Use PACGA, because it returns the most bits from ComputePAC.
+ * We still only get the most significant 32-bits of the result.
+ */
+
+static const uint64_t d[5] = {
+0xfb623599da6e8127ull,
+0x477d469dec0b8762ull,
+0x84be85ce9804e94bull,
+0xec2802d4e0a488e9ull,
+0xc003b93999b33765ull & 0xull
+};
+uint64_t r;
+
+asm("msr apgakeyhi_el1, %[w0]\n\t"
+"msr apgakeylo_el1, %[k0]\n\t"
+"pacga %[r], %[P], %[T]"
+: [r] "=r"(r)
+: [P] "r" (d[0]),
+  [T] "r" (d[1]),
+  [w0] "r" (d[2]),
+  [k0] "r" (d[3]));
+
+if (r == d[4]) {
+ml_printf("OK\n");
+return 0;
+} else {
+ml_printf("FAIL: %lx != %lx\n", r, d[4]);
+return 1;
+}
+}
diff --git a/tests/tcg/aarch64/Makefile.softmmu-target 
b/tests/tcg/aarch64/Makefile.softmmu-target
index 7b4eede3f0..f6b5121f5c 100644
--- a/tests/tcg/aarch64/Makefile.softmmu-target
+++ b/tests/tcg/aarch64/Makefile.softmmu-target
@@ -61,4 +61,7 @@ run-memory-replay: memory-replay run-memory-record
  $(QEMU_OPTS) memory, \
  "$< on $(TARGET_NAME)")
 
-EXTRA_TESTS+=memory-record memory-replay
+run-pauth-3: pauth-3
+pauth-3: CFLAGS += -march=armv8.3-a
+
+EXTRA_TESTS+=memory-record memory-replay pauth-3
-- 
2.20.1

[PATCH 2/4] tests/tcg/aarch64: Fix compilation parameters for pauth-%

Hiding the required architecture within asm() is not correct.
Add it to the cflags of the (cross-) compiler.

Signed-off-by: Richard Henderson 
---
 tests/tcg/aarch64/pauth-1.c   | 2 --
 tests/tcg/aarch64/pauth-2.c   | 2 --
 tests/tcg/aarch64/Makefile.target | 1 +
 3 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/tests/tcg/aarch64/pauth-1.c b/tests/tcg/aarch64/pauth-1.c
index a3c1443cd0..ea0984ea82 100644
--- a/tests/tcg/aarch64/pauth-1.c
+++ b/tests/tcg/aarch64/pauth-1.c
@@ -2,8 +2,6 @@
 #include 
 #include 
 
-asm(".arch armv8.4-a");
-
 #ifndef PR_PAC_RESET_KEYS
 #define PR_PAC_RESET_KEYS  54
 #define PR_PAC_APDAKEY (1 << 2)
diff --git a/tests/tcg/aarch64/pauth-2.c b/tests/tcg/aarch64/pauth-2.c
index 2fe030ba3d..9bba0beb63 100644
--- a/tests/tcg/aarch64/pauth-2.c
+++ b/tests/tcg/aarch64/pauth-2.c
@@ -1,8 +1,6 @@
 #include 
 #include 
 
-asm(".arch armv8.4-a");
-
 void do_test(uint64_t value)
 {
 uint64_t salt1, salt2;
diff --git a/tests/tcg/aarch64/Makefile.target 
b/tests/tcg/aarch64/Makefile.target
index df3fe8032c..374c8d6830 100644
--- a/tests/tcg/aarch64/Makefile.target
+++ b/tests/tcg/aarch64/Makefile.target
@@ -20,6 +20,7 @@ run-fcvt: fcvt
 # Pauth Tests
 AARCH64_TESTS += pauth-1 pauth-2
 run-pauth-%: QEMU_OPTS += -cpu max
+pauth-%: CFLAGS += -march=armv8.3-a
 
 # Semihosting smoke test for linux-user
 AARCH64_TESTS += semihosting
-- 
2.20.1

[PATCH 4/4] tests/tcg/aarch64: Add pauth-4

Perform the set of operations and test described in LP 1859713.

Suggested-by: Adrien GRASSEIN 
Signed-off-by: Richard Henderson 
---
 tests/tcg/aarch64/pauth-4.c   | 25 +
 tests/tcg/aarch64/Makefile.target |  2 +-
 2 files changed, 26 insertions(+), 1 deletion(-)
 create mode 100644 tests/tcg/aarch64/pauth-4.c

diff --git a/tests/tcg/aarch64/pauth-4.c b/tests/tcg/aarch64/pauth-4.c
new file mode 100644
index 00..4b22e6e282
--- /dev/null
+++ b/tests/tcg/aarch64/pauth-4.c
@@ -0,0 +1,25 @@
+#include 
+#include 
+
+int main()
+{
+  uintptr_t x, y;
+
+  asm("mov %0, lr\n\t"
+  "pacia %0, sp\n\t"   /* sigill if pauth not supported */
+  "eor %0, %0, #4\n\t" /* corrupt single bit */
+  "mov %1, %0\n\t"
+  "autia %1, sp\n\t"   /* validate corrupted pointer */
+  "xpaci %0\n\t"   /* strip pac from corrupted pointer */
+  : "=r"(x), "=r"(y));
+
+  /*
+   * Once stripped, the corrupted pointer is of the form 0x...wxyz.
+   * We expect the autia to indicate failure, producing a pointer of the
+   * form 0x000ewxyz.  Use xpaci and != for the test, rather than
+   * extracting explicit bits from the top, because the location of the
+   * error code "e" depends on the configuration of virtual memory.
+   */
+  assert(x != y);
+  return 0;
+}
diff --git a/tests/tcg/aarch64/Makefile.target 
b/tests/tcg/aarch64/Makefile.target
index 374c8d6830..efa67cf1e9 100644
--- a/tests/tcg/aarch64/Makefile.target
+++ b/tests/tcg/aarch64/Makefile.target
@@ -18,7 +18,7 @@ run-fcvt: fcvt
$(call diff-out,$<,$(AARCH64_SRC)/fcvt.ref)
 
 # Pauth Tests
-AARCH64_TESTS += pauth-1 pauth-2
+AARCH64_TESTS += pauth-1 pauth-2 pauth-4
 run-pauth-%: QEMU_OPTS += -cpu max
 pauth-%: CFLAGS += -march=armv8.3-a
 
-- 
2.20.1

Re: [PATCH 0/4] migration: Replace gemu_log with qemu_log

2020-01-16 Thread Josh Kunz

On Tue, Jan 14, 2020 at 3:02 AM Alex Bennée  wrote:

>
> Josh Kunz  writes:
>
> 
> >
> > Not tested:
> >   * Build/logging with bsd-user. I do not have easy access to a BSD
> > system.
>
> If you have the time we have vm-build-netbsd:
>
>   make vm-build-netbsd EXTRA_CONFIGURE_OPTS="--disable-system"
>
> Which will create a NetBSD image for you and run the build through it.
> Other images are available ;-)
>

This works, but it looks like it only runs the tests on BSD, even with
`configure --enable-bsd-user` first. I don't see the bsd-user binaries
being produced in the output of this command.

Re: [PATCH 0/4] migration: Replace gemu_log with qemu_log

2020-01-16 Thread Josh Kunz

On Mon, Jan 13, 2020 at 7:06 PM Warner Losh  wrote:

> The bsd-user that is in tree doesn't work. I've been trying to catch up to
> qemu head of tree, but I'm only up to 3.2... chances are the bsd-user
> changes will not change the state of things...
>

Ah good to know. Would you all prefer I don't modify it at all, to try and
keep the code "pristine"? Or is it OK to keep the patch as is?

[PATCH 1/4] target/arm: Fix PAuth sbox functions

From: Vincent Dehors 

In the PAC computation, sbox was applied over wrong bits.
As this is a 4-bit sbox, bit index should be incremented by 4 instead of 16.

Test vector from QARMA paper (https://eprint.iacr.org/2016/444.pdf) was
used to verify one computation of the pauth_computepac() function which
uses sbox2.

Launchpad: https://bugs.launchpad.net/bugs/1859713
Reviewed-by: Richard Henderson 
Signed-off-by: Vincent DEHORS 
Signed-off-by: Adrien GRASSEIN 
---
 target/arm/pauth_helper.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/arm/pauth_helper.c b/target/arm/pauth_helper.c
index d3194f2043..0a5f41e10c 100644
--- a/target/arm/pauth_helper.c
+++ b/target/arm/pauth_helper.c
@@ -89,7 +89,7 @@ static uint64_t pac_sub(uint64_t i)
 uint64_t o = 0;
 int b;
 
-for (b = 0; b < 64; b += 16) {
+for (b = 0; b < 64; b += 4) {
 o |= (uint64_t)sub[(i >> b) & 0xf] << b;
 }
 return o;
@@ -104,7 +104,7 @@ static uint64_t pac_inv_sub(uint64_t i)
 uint64_t o = 0;
 int b;
 
-for (b = 0; b < 64; b += 16) {
+for (b = 0; b < 64; b += 4) {
 o |= (uint64_t)inv_sub[(i >> b) & 0xf] << b;
 }
 return o;
-- 
2.20.1

Re: [PATCH v2 4/4] qcow2: Use BDRV_SECTOR_SIZE instead of the hardcoded value

2020-01-16 Thread Alberto Garcia

On Tue 14 Jan 2020 03:15:48 PM CET, Max Reitz wrote:
>> @@ -219,7 +219,7 @@ static int l2_load(BlockDriverState *bs, uint64_t offset,
>>   * Writes one sector of the L1 table to the disk (can't update single 
>> entries
>>   * and we really don't want bdrv_pread to perform a read-modify-write)
>>   */
>> -#define L1_ENTRIES_PER_SECTOR (512 / 8)
>> +#define L1_ENTRIES_PER_SECTOR (BDRV_SECTOR_SIZE / 8)
>>  int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index)
>
> Here it’s because the comment is wrong: “Can’t update single entries” –
> yes, we can.  We’d just have to do a bdrv_pwrite() to a single entry.

What's the point of qcow2_write_l1_entry() then?

>> @@ -3836,7 +3837,7 @@ qcow2_co_copy_range_from(BlockDriverState *bs,
>>  case QCOW2_CLUSTER_NORMAL:
>>  child = s->data_file;
>>  copy_offset += offset_into_cluster(s, src_offset);
>> -if ((copy_offset & 511) != 0) {
>> +if (!QEMU_IS_ALIGNED(copy_offset, BDRV_SECTOR_SIZE)) {
>
> Hm.  I don’t get this one.

Checking the code (e.g. block_copy_do_copy()) it seems that the whole
chunk must be cluster aligned so I don't get this one either.

Berto

[Bug 1860053] [NEW] Possible lack of precision when calling clock_gettime via vDSO on user mode ppc64le

2020-01-16 Thread Patrick Meiring

Public bug reported:

Occurs on QEMU v4.2.0 run on docker (via the qemu-user-static:v4.2.0-2
image) on an AMD64 Ubuntu 18.04.3 LTS machine provided by travis-ci.org.

>From golang's https://github.com/golang/go/issues/36592:

It was discovered that golang's time.NewTicker() and time.Sleep()
malfunction when a compiled application was run via QEMU's ppc64le
emulator in user mode.

The methods did not malfunction on actual PowerPC hardware or when the
same golang application was compiled for golang's arm, arm64 or 386
targets and was run via user mode QEMU on the same system.

Curiously, the methods also worked when the program was compiled under
go 1.11, but do malfunction in go 1.12 and 1.13.

It was identified the change in behaviour was most likely attributable to 
golang switching to using vSDO for calling clock_gettime() on PowerPC 64 
architectures in 1.12. I.E:
https://github.com/golang/go/commit/dbd8af74723d2c98cbdcc70f7e2801f69b57ac5b

We therefore suspect there may be a bug in QEMU's user-mode emulation of
ppc64le as relates to vDSO calls to clock_gettime().

The nature of the malfunction of time.NewTicker() and time.Sleep() is
such that sleeps or ticks with a granularity of less than one second do
not appear to be possible (they all revert to 1 second sleeps/ticks).
Could it be that the nanoseconds field of clock_gettime() is getting
lost in the vDSO version but not in the syscall? Or some other issue
calling these methods via vDSO?

Thanks in advance.

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1860053

Title:
  Possible lack of precision when calling clock_gettime via vDSO on user
  mode ppc64le

Status in QEMU:
  New

Bug description:
  Occurs on QEMU v4.2.0 run on docker (via the qemu-user-static:v4.2.0-2
  image) on an AMD64 Ubuntu 18.04.3 LTS machine provided by travis-
  ci.org.

  From golang's https://github.com/golang/go/issues/36592:

  It was discovered that golang's time.NewTicker() and time.Sleep()
  malfunction when a compiled application was run via QEMU's ppc64le
  emulator in user mode.

  The methods did not malfunction on actual PowerPC hardware or when the
  same golang application was compiled for golang's arm, arm64 or 386
  targets and was run via user mode QEMU on the same system.

  Curiously, the methods also worked when the program was compiled under
  go 1.11, but do malfunction in go 1.12 and 1.13.

  It was identified the change in behaviour was most likely attributable to 
golang switching to using vSDO for calling clock_gettime() on PowerPC 64 
architectures in 1.12. I.E:
  https://github.com/golang/go/commit/dbd8af74723d2c98cbdcc70f7e2801f69b57ac5b

  We therefore suspect there may be a bug in QEMU's user-mode emulation
  of ppc64le as relates to vDSO calls to clock_gettime().

  The nature of the malfunction of time.NewTicker() and time.Sleep() is
  such that sleeps or ticks with a granularity of less than one second
  do not appear to be possible (they all revert to 1 second
  sleeps/ticks). Could it be that the nanoseconds field of
  clock_gettime() is getting lost in the vDSO version but not in the
  syscall? Or some other issue calling these methods via vDSO?

  Thanks in advance.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1860053/+subscriptions

Re: [PATCH] util/cacheinfo: fix crash when compiling with uClibc

On 12/16/19 1:18 AM, Carlos Santos wrote:
> On Thu, Oct 17, 2019 at 8:06 PM Carlos Santos  wrote:
>>
>> On Thu, Oct 17, 2019 at 9:47 AM Peter Maydell  
>> wrote:
>>>
>>> On Thu, 17 Oct 2019 at 13:39,  wrote:

 From: Carlos Santos 

 uClibc defines _SC_LEVEL1_ICACHE_LINESIZE and _SC_LEVEL1_DCACHE_LINESIZE
 but the corresponding sysconf calls returns -1, which is a valid result,
 meaning that the limit is indeterminate.

 Handle this situation using the fallback values instead of crashing due
 to an assertion failure.

 Signed-off-by: Carlos Santos 
 ---
  util/cacheinfo.c | 10 --
  1 file changed, 8 insertions(+), 2 deletions(-)

 diff --git a/util/cacheinfo.c b/util/cacheinfo.c
 index ea6f3e99bf..d94dc6adc8 100644
 --- a/util/cacheinfo.c
 +++ b/util/cacheinfo.c
 @@ -93,10 +93,16 @@ static void sys_cache_info(int *isize, int *dsize)
  static void sys_cache_info(int *isize, int *dsize)
  {
  # ifdef _SC_LEVEL1_ICACHE_LINESIZE
 -*isize = sysconf(_SC_LEVEL1_ICACHE_LINESIZE);
 +int tmp_isize = (int) sysconf(_SC_LEVEL1_ICACHE_LINESIZE);
>>>
>>> Do we need the cast here ?
>>
>> It's there to remind the reader that a type coercion may occur, since
>> sysconf() returns a long and isize is an int.
>>
 +if (tmp_isize > 0) {
 +*isize = tmp_isize;
 +}
  # endif
  # ifdef _SC_LEVEL1_DCACHE_LINESIZE
 -*dsize = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
 +int tmp_dsize = (int) sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
 +if (tmp_dsize > 0) {
 +*dsize = tmp_dsize;
 +}
  # endif
  }
  #endif /* sys_cache_info */
 --
>>>
>>> thanks
>>> -- PMM
>>
>> --
>> Carlos Santos
>> Senior Software Maintenance Engineer
>> Red Hat
>> casan...@redhat.comT: +55-11-3534-6186
> 
> Hi,
> 
> Any chance to have this merged for Christmas? :-)

No, but it's queued now.  ;-)


r~

Re: [PATCH v3 0/2] Fix hyperv synic on vhost

2020-01-16 Thread no-reply

Patchew URL: 
https://patchew.org/QEMU/20200116202414.157959-1-dgilb...@redhat.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: 20200116202414.157959-1-dgilb...@redhat.com
Type: series
Subject: [PATCH v3 0/2] Fix hyperv synic on vhost

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

From https://github.com/patchew-project/qemu
 * [new tag] patchew/20200116202414.157959-1-dgilb...@redhat.com -> 
patchew/20200116202414.157959-1-dgilb...@redhat.com
Switched to a new branch 'test'
f7aeff2 vhost: Only align sections for vhost-user
5bb467f vhost: Add names to section rounded warning

=== OUTPUT BEGIN ===
1/2 Checking commit 5bb467f4ac3b (vhost: Add names to section rounded warning)
2/2 Checking commit f7aeff24a99a (vhost: Only align sections for vhost-user)
ERROR: trailing whitespace
#45: FILE: hw/virtio/vhost.c:554:
+if (dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER) {   $

WARNING: line over 80 characters
#60: FILE: hw/virtio/vhost.c:569:
+trace_vhost_region_add_section_aligned(section->mr->name, mrs_gpa, 
mrs_size,

total: 1 errors, 1 warnings, 43 lines checked

Patch 2/2 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20200116202414.157959-1-dgilb...@redhat.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [PATCH V2] vhost-user-test: fix a memory leak

2020-01-16 Thread Pan Nengyuan




On 1/16/2020 9:29 PM, Thomas Huth wrote:
> On 15/01/2020 10.13, Thomas Huth wrote:
>> On 15/01/2020 04.10, Pan Nengyuan wrote:
>>>
>>> On 1/13/2020 10:32 AM, Pan Nengyuan wrote:

 On 1/12/2020 6:39 PM, Thomas Huth wrote:
>> [...]
> ... and now I had to unqueue the patch again. It is reproducibly causing
> one of the gitlab CI pipelines to fail with a timeout, e.g.:
>
>  https://gitlab.com/huth/qemu/-/jobs/400101552
>
> Not sure what is going on here, though, there is no obvious error
> message in the output... this needs some more investigation... do you
> have a gitlab account and could have a look?
>

 OK, I will register a account and have a look.

>>>
>>> I'm sorry, I build and test with the same params, but I can't reproduce it.
>>> Could you add "V=1 or V=2" params to get more information ?
>>
>> It seems to hang forever in qos-test
>> /arm/virt/virtio-mmio/virtio-bus/virtio-net-device/virtio-net/virtio-net-tests/announce-self
>> :
>>
>>  https://gitlab.com/huth/qemu/-/jobs/403472594
>>
>> It's completely weird, I also added some fprintf statements:
>>
>>  https://gitlab.com/huth/qemu/commit/8ae76c0cf37cf46d26620dd
>>
>> ... but none of them show up in the output of the test run... so I'm
>> currently completely puzzled what might be going wrong here... Any other
>> ideas what we could try here?
> 
> I tried to add some more fprintfs here and there to see where it hangs,
> but I did not succeed to get any further.
> 
> However, the CI build succeeds with this fix instead:
> 
> diff a/tests/qtest/vhost-user-test.c b/tests/qtest/vhost-user-test.c
> --- a/tests/qtest/vhost-user-test.c
> +++ b/tests/qtest/vhost-user-test.c
> @@ -707,9 +707,9 @@ static void test_read_guest_mem(void *obj, void
> *arg, QGuestAllocator *alloc)
>  static void test_migrate(void *obj, void *arg, QGuestAllocator *alloc)
>  {
>  TestServer *s = arg;
> -TestServer *dest = test_server_new("dest");
> -GString *dest_cmdline = g_string_new(qos_get_current_command_line());
> -char *uri = g_strdup_printf("%s%s", "unix:", dest->mig_path);
> +TestServer *dest;
> +GString *dest_cmdline;
> +char *uri;
>  QTestState *to;
>  GSource *source;
>  QDict *rsp;
> @@ -720,6 +720,10 @@ static void test_migrate(void *obj, void *arg,
> QGuestAllocator *alloc)
>  return;
>  }
> 
> +dest = test_server_new("dest");
> +dest_cmdline = g_string_new(qos_get_current_command_line());
> +uri = g_strdup_printf("%s%s", "unix:", dest->mig_path);
> +
>  size = get_log_size(s);
>  g_assert_cmpint(size, ==, (256 * 1024 * 1024) / (VHOST_LOG_PAGE * 8));
> 
> @@ -778,6 +782,7 @@ static void test_migrate(void *obj, void *arg,
> QGuestAllocator *alloc)
>  qtest_quit(to);
>  test_server_free(dest);
>  g_free(uri);
> +g_string_free(dest_cmdline, true);
>  }
> 
> 
> Here's a build with that patch that succeeded:
> 
>  https://gitlab.com/huth/qemu/-/jobs/405357307
> 
> So if you agree with that patch, I think we should simply use that
> version instead, ok?

Ok, I agree with it.

Thanks.

> 
>  Thomas
> 
> .
>

[PATCH v2 0/2] target/arm: Fix ISSIs16Bit

Changes in v2:
  - Include the merge_syn_data_abort fix, as a self-contained patch.


r~


Jeff Kubascik (1):
  target/arm: Return correct IL bit in merge_syn_data_abort

Richard Henderson (1):
  target/arm: Set ISSIs16Bit in make_issinfo

 target/arm/tlb_helper.c | 2 +-
 target/arm/translate.c  | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

-- 
2.20.1

[PATCH v2 2/2] target/arm: Set ISSIs16Bit in make_issinfo

During the conversion to decodetree, the setting of
ISSIs16Bit got lost.  This causes the guest os to
incorrectly adjust trapping memory operations.

Cc: qemu-sta...@nongnu.org
Fixes: 46beb58efbb8a2a32 ("target/arm: Convert T16, load (literal)")
Reported-by: Jeff Kubascik 
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 5185e08641..c25921ef95 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -8556,6 +8556,9 @@ static ISSInfo make_issinfo(DisasContext *s, int rd, bool 
p, bool w)
 /* ISS not valid if writeback */
 if (p && !w) {
 ret = rd;
+if (s->base.pc_next - s->pc_curr == 2) {
+ret |= ISSIs16Bit;
+}
 } else {
 ret = ISSInvalid;
 }
-- 
2.20.1

[PATCH v2 1/2] target/arm: Return correct IL bit in merge_syn_data_abort

From: Jeff Kubascik 

The IL bit is set for 32-bit instructions, thus passing false
with the is_16bit parameter to syn_data_abort_with_iss() makes
a syn mask that always has the IL bit set.

Pass is_16bit as true to make the initial syn mask have IL=0,
so that the final IL value comes from or'ing template_syn.

Cc: qemu-sta...@nongnu.org
Fixes: aaa1f954d4ca ("target-arm: A64: Create Instruction Syndromes for Data 
Aborts")
Signed-off-by: Jeff Kubascik 
[rth: Extracted this as a self-contained bug fix from a larger patch]
Signed-off-by: Richard Henderson 
---
 target/arm/tlb_helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/tlb_helper.c b/target/arm/tlb_helper.c
index 5feb312941..e63f8bda29 100644
--- a/target/arm/tlb_helper.c
+++ b/target/arm/tlb_helper.c
@@ -44,7 +44,7 @@ static inline uint32_t merge_syn_data_abort(uint32_t 
template_syn,
 syn = syn_data_abort_with_iss(same_el,
   0, 0, 0, 0, 0,
   ea, 0, s1ptw, is_write, fsc,
-  false);
+  true);
 /* Merge the runtime syndrome with the template syndrome.  */
 syn |= template_syn;
 }
-- 
2.20.1

RE: [PATCH 093/104] virtiofsd: introduce inode refcount to prevent use-after-free

2020-01-16 Thread misono.tomoh...@fujitsu.com

> > On Thu, Jan 16, 2020 at 09:25:42PM +0900, Misono Tomohiro wrote:
> > > > From: Stefan Hajnoczi 
> > > >
> > > > If thread A is using an inode it must not be deleted by thread B
> > > > when processing a FUSE_FORGET request.
> > > >
> > > > The FUSE protocol itself already has a counter called nlookup that
> > > > is used in FUSE_FORGET messages.  We cannot trust this counter
> > > > since the untrusted client can manipulate it via FUSE_FORGET messages.
> > > >
> > > > Introduce a new refcount to keep inodes alive for the required lifespan.
> > > > lo_inode_put() must be called to release a reference.  FUSE's
> > > > nlookup counter holds exactly one reference so that the inode
> > > > stays alive as long as the client still wants to remember it.
> > > >
> > > > Note that the lo_inode->is_symlink field is moved to avoid
> > > > creating a hole in the struct due to struct field alignment.
> > > >
> > > > Signed-off-by: Stefan Hajnoczi 
> > > > ---
> > > >  tools/virtiofsd/passthrough_ll.c | 168
> > > > ++-
> > > >  1 file changed, 145 insertions(+), 23 deletions(-)
> > > >
> > > > diff --git a/tools/virtiofsd/passthrough_ll.c
> > > > b/tools/virtiofsd/passthrough_ll.c
> > > > index b19c9ee328..8f4ab8351c 100644
> > > > --- a/tools/virtiofsd/passthrough_ll.c
> > > > +++ b/tools/virtiofsd/passthrough_ll.c
> > > > @@ -99,7 +99,13 @@ struct lo_key {
> > > >
> > > >  struct lo_inode {
> > > >  int fd;
> > > > -bool is_symlink;
> > > > +
> > > > +/*
> > > > + * Atomic reference count for this object.  The nlookup field 
> > > > holds a
> > > > + * reference and release it when nlookup reaches 0.
> > > > + */
> > > > +gint refcount;
> > > > +
> > > >  struct lo_key key;
> > > >
> > > >  /*
> > > > @@ -118,6 +124,8 @@ struct lo_inode {
> > > >  fuse_ino_t fuse_ino;
> > > >  pthread_mutex_t plock_mutex;
> > > >  GHashTable *posix_locks; /* protected by
> > > > lo_inode->plock_mutex */
> > > > +
> > > > +bool is_symlink;
> > > >  };
> > > >
> > > >  struct lo_cred {
> > > > @@ -473,6 +481,23 @@ static ssize_t lo_add_inode_mapping(fuse_req_t 
> > > > req, struct lo_inode *inode)
> > > >  return elem - lo_data(req)->ino_map.elems;  }
> > > >
> > > > +static void lo_inode_put(struct lo_data *lo, struct lo_inode
> > > > +**inodep) {
> > > > +struct lo_inode *inode = *inodep;
> > > > +
> > > > +if (!inode) {
> > > > +return;
> > > > +}
> > > > +
> > > > +*inodep = NULL;
> > > > +
> > > > +if (g_atomic_int_dec_and_test(&inode->refcount)) {
> > > > +close(inode->fd);
> > > > +free(inode);
> > > > +}
> > > > +}
> > > > +
> > > > +/* Caller must release refcount using lo_inode_put() */
> > > >  static struct lo_inode *lo_inode(fuse_req_t req, fuse_ino_t ino)
> > > > {
> > > >  struct lo_data *lo = lo_data(req); @@ -480,6 +505,9 @@ static
> > > > struct lo_inode *lo_inode(fuse_req_t req, fuse_ino_t ino)
> > > >
> > > >  pthread_mutex_lock(&lo->mutex);
> > > >  elem = lo_map_get(&lo->ino_map, ino);
> > > > +if (elem) {
> > > > +g_atomic_int_inc(&elem->inode->refcount);
> > > > +}
> > > >  pthread_mutex_unlock(&lo->mutex);
> > > >
> > > >  if (!elem) {
> > > > @@ -489,10 +517,23 @@ static struct lo_inode *lo_inode(fuse_req_t req, 
> > > > fuse_ino_t ino)
> > > >  return elem->inode;
> > > >  }
> > > >
> > > > +/*
> > > > + * TODO Remove this helper and force callers to hold an inode
> > > > +refcount until
> > > > + * they are done with the fd.  This will be done in a later patch
> > > > +to make
> > > > + * review easier.
> > > > + */
> > > >  static int lo_fd(fuse_req_t req, fuse_ino_t ino)  {
> > > >  struct lo_inode *inode = lo_inode(req, ino);
> > > > -return inode ? inode->fd : -1;
> > > > +int fd;
> > > > +
> > > > +if (!inode) {
> > > > +return -1;
> > > > +}
> > > > +
> > > > +fd = inode->fd;
> > > > +lo_inode_put(lo_data(req), &inode);
> > > > +return fd;
> > > >  }
> > > >
> > > >  static void lo_init(void *userdata, struct fuse_conn_info *conn)
> > > > @@ -547,6 +588,10 @@ static void lo_getattr(fuse_req_t req, fuse_ino_t 
> > > > ino,
> > > >  fuse_reply_attr(req, &buf, lo->timeout);  }
> > > >
> > > > +/*
> > > > + * Increments parent->nlookup and caller must release refcount
> > > > +using
> > > > + * lo_inode_put(&parent).
> > > > + */
> > > >  static int lo_parent_and_name(struct lo_data *lo, struct lo_inode 
> > > > *inode,
> > > >char path[PATH_MAX], struct
> > > > lo_inode **parent)  { @@ -584,6 +629,7 @@ retry:
> > > >  p = &lo->root;
> > > >  pthread_mutex_lock(&lo->mutex);
> > > >  p->nlookup++;
> > > > +g_atomic_int_inc(&p->refcount);
> > > >  pthread_mutex_unlock(&lo->mutex);
> > > >  } else {
> > > >  *last = '\0';
> > >
> > > We need lo_ionde_put() in error path, right?:
> > > https://gitlab.com/virtio-fs

Re: [PATCH] target/openrisc: Fix FPCSR mask to allow setting DZF

On 1/10/20 11:28 AM, Stafford Horne wrote:
> The mask used when setting FPCSR allows setting bits 10 to 1.  However,
> OpenRISC has flags and config bits in 11 to 1, 11 being Divide by Zero
> Flag (DZF).  This seems like an off-by-one bug.
> 
> This was found when testing the GLIBC test suite which has test cases to
> set and clear all bits.
> 
> Signed-off-by: Stafford Horne 
> ---
>  target/openrisc/fpu_helper.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Thanks, queued.


r~

[PULL 1/1] target/openrisc: Fix FPCSR mask to allow setting DZF

From: Stafford Horne 

The mask used when setting FPCSR allows setting bits 10 to 1.  However,
OpenRISC has flags and config bits in 11 to 1, 11 being Divide by Zero
Flag (DZF).  This seems like an off-by-one bug.

This was found when testing the GLIBC test suite which has test cases to
set and clear all bits.

Signed-off-by: Stafford Horne 
Message-Id: <20200110212843.27335-1-sho...@gmail.com>
Signed-off-by: Richard Henderson 
---
 target/openrisc/fpu_helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/openrisc/fpu_helper.c b/target/openrisc/fpu_helper.c
index 59e1413279..6f75ea0505 100644
--- a/target/openrisc/fpu_helper.c
+++ b/target/openrisc/fpu_helper.c
@@ -70,7 +70,7 @@ void cpu_set_fpcsr(CPUOpenRISCState *env, uint32_t val)
 float_round_down
 };
 
-env->fpcsr = val & 0x7ff;
+env->fpcsr = val & 0xfff;
 set_float_rounding_mode(rm_to_sf[extract32(val, 1, 2)], &env->fp_status);
 }
 
-- 
2.20.1

[PULL 0/1] target/openrisc patch queue

The following changes since commit 28b58f19d269633b3d14b6aebf1e92b3cd3ab56e:

  ui/gtk: Get display refresh rate with GDK version 3.22 or later (2020-01-16 
14:03:45 +)

are available in the Git repository at:

  https://github.com/rth7680/qemu.git tags/pull-or1k-20200116

for you to fetch changes up to 97a254b3f03a184136e381c6d9fd80475e1795ac:

  target/openrisc: Fix FPCSR mask to allow setting DZF (2020-01-16 14:50:43 
-1000)


Fix FPSCR masking


Stafford Horne (1):
  target/openrisc: Fix FPCSR mask to allow setting DZF

 target/openrisc/fpu_helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

[Bug 1860056] [NEW] mips binaries segfault

2020-01-16 Thread mcandre

Public bug reported:

Hello World appears to segfault with qemu mips, on a Debian 10.0.0
Buster amd64 host.

Example:


$ cat mips/test/hello.cpp 
#include 
using std::cout;

int main() {
cout << "Hello World!\n";
return 0;
}

$ mips-linux-gnu-g++ -o hello hello.cpp && ./hello
qemu: uncaught target signal 11 (Segmentation fault) - core dumped

Note that 64-bit MIPS and little endian 32-bit MIPS qemu work fine. The
problem is limited to big endian 32-bit MIPS.

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1860056

Title:
  mips binaries segfault

Status in QEMU:
  New

Bug description:
  Hello World appears to segfault with qemu mips, on a Debian 10.0.0
  Buster amd64 host.

  Example:

  
  $ cat mips/test/hello.cpp 
  #include 
  using std::cout;

  int main() {
  cout << "Hello World!\n";
  return 0;
  }

  $ mips-linux-gnu-g++ -o hello hello.cpp && ./hello
  qemu: uncaught target signal 11 (Segmentation fault) - core dumped

  Note that 64-bit MIPS and little endian 32-bit MIPS qemu work fine.
  The problem is limited to big endian 32-bit MIPS.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1860056/+subscriptions

Re: [PATCH v2 0/3] hw/hppa/machine: Restrict the total memory size to 3GB