Re: [Qemu-devel] [PATCH RFC v7 5/9] migration: fix the multifd code when sending less channels

2018-11-02 Thread Fei Li




On 11/02/2018 11:32 AM, Peter Xu wrote:

On Fri, Nov 02, 2018 at 11:00:24AM +0800, Fei Li wrote:


On 11/02/2018 10:37 AM, Peter Xu wrote:

On Thu, Nov 01, 2018 at 06:17:11PM +0800, Fei Li wrote:

Set the migration state to "failed" instead of "setup" when failing
to send packet via some channel.

Could you please provide more information in the commit message?
E.g., what will happen if without this patch?  Will it crash the
source or stall the source migration or others?  Otherwise it's a bit
hard for me to understand what's this patch for.

Sorry for the inadequate description; I intended to say that when failing
to do the live migration using multifd, e.g. sending fewer channels, the src
status displays "setup" when running `info migrate`. I assume we should tell
users that the "Migration status" is "failed" now (along with the
failure reason).

The current src status when it fails in multifd_new_send_channel_async():


(qemu) migrate_set_capability x-multifd on
(qemu) migrate_set_parameter x-multifd-channels 4
(qemu) migrate -d tcp:192.168.190.98:
(qemu) qemu-system-x86_64: failed in multifd_new_send_channel_async due to
...
(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks:
off compress: off events: off postcopy-ram: off x-colo: off release-ram: off
block: off return-path: off pause-before-switchover: off x-multifd: on
dirty-bitmaps: off postcopy-blocktime: off late-block-activate: off
Migration status: setup
total time: 0 milliseconds

Thanks for the information.

I had a quick look.  For now we do this:

 multifd_save_setup (without waiting for channels to be ready)
 create thread migration_thread
 (in thread)
 ram_save_setup
 multifd_send_sync_main (wait for the channels)

The thing is that we don't get a notification when one of the
multifd channels fails.  IMHO instead of setting the global
migration state in a per-channel function, we should just report the
error upwards, then the main thread should decide how to change the
state machine of the migration.

Thanks for the detailed explanation, I do agree with reporting and letting
the main thread handle this. :)
But one thing to note is that during my previous debugging, I remember
sometimes the main thread, migration_thread(), starts earlier than
the first channel is ready in multifd_new_send_channel_async(). Thus
we should be careful about where/when to check the state of the channel.

And we have set it in migrate_set_error() after all so the main thread
should be able to know somehow

But in our current code, the main thread does not use s->error
to determine the migration state, right? As I checked the code,
s->error is only used
- in the qmp query: copying s->error to info->error_desc when detecting that
the migration status is failed;

- in migrate_fd_cleanup() when migrate_fd_connect() fails: printing the error
Or is s->error only used in these ways?

  (though IMHO I'll even prefer to have a
per-channel variable to keep the state of the channel, then the
per-channel functions won't touch any globals which offers better
isolation).

I'm not sure how Juan thinks about it, but I'd prefer some work to
provide such isolation and also some mechanism to allow the main
thread to detect the per-channel errors not only during setup phase
but also during the migration (e.g., when network is suddenly down).
Then we don't touch any globals (e.g., we shouldn't call
migrate_get_current in any per-channel function like
multifd_new_send_channel_async).

Ok, wait for Juan's comment. :)

Have a nice day, and thanks again for the detailed explanation.
Fei

Normally I would prefer not to touch global state in feature-specific
code paths, but I'd like to understand the problem better first...

Thanks,


Cc: Peter Xu 
Signed-off-by: Fei Li 
---
   migration/ram.c | 2 ++
   1 file changed, 2 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 4db3b3e8f4..c84d164fc8 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1072,6 +1072,7 @@ out:
   static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
   {
   MultiFDSendParams *p = opaque;
+MigrationState *s = migrate_get_current();
   QIOChannel *sioc = QIO_CHANNEL(qio_task_get_source(task));
   Error *local_err = NULL;
@@ -1083,6 +1084,7 @@ static void multifd_new_send_channel_async(QIOTask *task, 
gpointer opaque)
   if (multifd_save_cleanup(&local_err) != 0) {
   migrate_set_error(migrate_get_current(), local_err);
   }
+migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
   } else {
   p->c = QIO_CHANNEL(sioc);
   qio_channel_set_delay(p->c, false);
--
2.13.7


Regards,


Regards,






Re: [Qemu-devel] [PATCH] docker: Use a stable snapshot for Debian Sid

2018-11-02 Thread Philippe Mathieu-Daudé
Hi Fam,

Thanks for picking this.

On Fri, Nov 2, 2018 at 7:48 AM Fam Zheng  wrote:
> On Thu, 11/01 19:37, Philippe Mathieu-Daudé wrote:
> > The Debian Sid repository is not garanteed to be stable, as his
> > 'unstable' name suggest :)

There is an error in "my be" -> "might be"...
Do you mind updating the comment:

> > To allow quick testing, packages are pushed various time a day,
> > which my be annoying when trying to use it for stable development
> > (which is not recommended, but Sid provides edge packages we use
> > for testing).

By:

To allow quick testing, Debian maintainers might push packages
various times a day. Sometimes package dependencies might break,
which is annoying when using this repository for stable development
(which is not recommended, but Sid provides edge packages we use
for testing).

I can resend as v2 if you prefer.

Thanks!

Phil.

> > (which is not recommended, but Sid provides edge packages we use
> > for testing).
> >
> > Debian provides repositories snapshots which are suitable for our
> > use. Pick a recent date that works. When required, update to newer
> > releases will be easy.
> >
> > This fixes current issues with this image:
> >
> >   $ make docker-image-debian-sid
> >   [...]
> >   The following packages have unmet dependencies:
> >build-essential : Depends: dpkg-dev (>= 1.17.11) but it is not going to 
> > be installed
> >git : Depends: perl but it is not going to be installed
> >  Depends: liberror-perl but it is not going to be installed
> >pkg-config : Depends: libdpkg-perl but it is not going to be installed
> >texinfo : Depends: perl (>= 5.26.2-6) but it is not going to be installed
> >  Depends: libtext-unidecode-perl but it is not going to be 
> > installed
> >  Depends: libxml-libxml-perl but it is not going to be installed
> >   E: Unable to correct problems, you have held broken packages.
> >
> > Signed-off-by: Philippe Mathieu-Daudé 
> > ---
> >  tests/docker/dockerfiles/debian-sid.docker | 4 
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/tests/docker/dockerfiles/debian-sid.docker 
> > b/tests/docker/dockerfiles/debian-sid.docker
> > index 9a3d168705..4e4cda0ba5 100644
> > --- a/tests/docker/dockerfiles/debian-sid.docker
> > +++ b/tests/docker/dockerfiles/debian-sid.docker
> > @@ -13,6 +13,10 @@
> >
> >  FROM debian:sid-slim
> >
> > +# Use a snapshot known to work (see http://snapshot.debian.org/#Usage)
> > +ENV DEBIAN_SNAPSHOT_DATE "20181030"
> > +RUN sed -i "s%^deb \(https\?://\)deb.debian.org/debian/\? \(.*\)%deb 
> > [check-valid-until=no] 
> > \1snapshot.debian.org/archive/debian/${DEBIAN_SNAPSHOT_DATE} \2%" 
> > /etc/apt/sources.list
> > +
> >  # Duplicate deb line as deb-src
> >  RUN cat /etc/apt/sources.list | sed "s/^deb\ /deb-src /" >> 
> > /etc/apt/sources.list
> >
> > --
> > 2.17.2
> >
>
> Queued, thanks!
>
> Fam



Re: [Qemu-devel] [PATCH] docker: Use a stable snapshot for Debian Sid

2018-11-02 Thread Fam Zheng
On Fri, Nov 2, 2018 at 3:20 PM Philippe Mathieu-Daudé  wrote:
>
> Hi Fam,
>
> Thanks for picking this.
>
> On Fri, Nov 2, 2018 at 7:48 AM Fam Zheng  wrote:
> > On Thu, 11/01 19:37, Philippe Mathieu-Daudé wrote:
> > > The Debian Sid repository is not garanteed to be stable, as his
> > > 'unstable' name suggest :)
>
> There is an error in "my be" -> "might be"...
> Do you mind updating the comment:
>
> > > To allow quick testing, packages are pushed various time a day,
> > > which my be annoying when trying to use it for stable development
> > > (which is not recommended, but Sid provides edge packages we use
> > > for testing).
>
> By:
>
> To allow quick testing, Debian maintainers might push packages
> > various times a day. Sometimes package dependencies might break,
> which is annoying when using this repository for stable development
> (which is not recommended, but Sid provides edge packages we use
> for testing).

Sure, updated in my queue.

Fam



Re: [Qemu-devel] [PATCH RFC v7 5/9] migration: fix the multifd code when sending less channels

2018-11-02 Thread Peter Xu
On Fri, Nov 02, 2018 at 03:13:05PM +0800, Fei Li wrote:
> 
> 
> On 11/02/2018 11:32 AM, Peter Xu wrote:
> > On Fri, Nov 02, 2018 at 11:00:24AM +0800, Fei Li wrote:
> > > 
> > > On 11/02/2018 10:37 AM, Peter Xu wrote:
> > > > On Thu, Nov 01, 2018 at 06:17:11PM +0800, Fei Li wrote:
> > > > > Set the migration state to "failed" instead of "setup" when failing
> > > > > to send packet via some channel.
> > > > Could you please provide more information in the commit message?
> > > > E.g., what will happen if without this patch?  Will it crash the
> > > > source or stall the source migration or others?  Otherwise it's a bit
> > > > hard for me to understand what's this patch for.
> > > Sorry for the inadequate description; I intended to say that when failing
> > > to do the live migration using multifd, e.g. sending fewer channels, the
> > > src status displays "setup" when running `info migrate`. I assume we
> > > should tell users that the "Migration status" is "failed" now (along with
> > > the failure reason).
> > > 
> > > The current src status when it fails in multifd_new_send_channel_async():
> > > 
> > > 
> > > (qemu) migrate_set_capability x-multifd on
> > > (qemu) migrate_set_parameter x-multifd-channels 4
> > > (qemu) migrate -d tcp:192.168.190.98:
> > > (qemu) qemu-system-x86_64: failed in multifd_new_send_channel_async due to
> > > ...
> > > (qemu) info migrate
> > > globals:
> > > store-global-state: on
> > > only-migratable: off
> > > send-configuration: on
> > > send-section-footer: on
> > > decompress-error-check: on
> > > capabilities: xbzrle: off rdma-pin-all: off auto-converge: off 
> > > zero-blocks:
> > > off compress: off events: off postcopy-ram: off x-colo: off release-ram: 
> > > off
> > > block: off return-path: off pause-before-switchover: off x-multifd: on
> > > dirty-bitmaps: off postcopy-blocktime: off late-block-activate: off
> > > Migration status: setup
> > > total time: 0 milliseconds
> > Thanks for the information.
> > 
> > I had a quick look.  For now we do this:
> > 
> >  multifd_save_setup (without waiting for channels to be ready)
> >  create thread migration_thread
> >  (in thread)
> >  ram_save_setup
> >  multifd_send_sync_main (wait for the channels)

[1]

> > 
> > The thing is that we don't get a notification when one of the
> > multifd channels fails.  IMHO instead of setting the global
> > migration state in a per-channel function, we should just report the
> > error upwards, then the main thread should decide how to change the
> > state machine of the migration.
> Thanks for the detailed explanation, I do agree with reporting and letting
> the main thread handle this. :)
> But one thing to note is that during my previous debugging, I remember
> sometimes the main thread, migration_thread(), starts earlier than
> the first channel is ready in multifd_new_send_channel_async(). Thus
> we should be careful about where/when to check the state of the channel.

Yeah, I guess that's exactly the stack I described at [1] above.

So my preference here would be that in multifd_save_setup() we don't
continue until we know that the sockets are ready.  After all, AFAIU
we currently depend on all the channels when migrating, so we can't
really do anything without all the channels ready.  That'll
simplify the error handling of the case you've encountered during
SETUP.
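
Something along these lines is what I have in mind -- a rough sketch only,
not a real patch; the "channels_ready" semaphore and "channel_error" field
are hypothetical names, used just to illustrate blocking in
multifd_save_setup() until every channel has reported in:

    int multifd_save_setup(void)
    {
        int i;

        /* ... existing allocation and per-channel socket setup ... */

        /* Block until every channel has signalled ready (or failed). */
        for (i = 0; i < migrate_multifd_channels(); i++) {
            qemu_sem_wait(&multifd_send_state->channels_ready);
        }
        if (multifd_send_state->channel_error) {
            /* Report upwards; the caller decides to fail the migration. */
            return -1;
        }
        return 0;
    }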

> > And we have set it in migrate_set_error() after all so the main thread
> > should be able to know somehow
> But in our current code, the main thread does not use s->error
> to determine the migration state, right? As I checked the code,
> s->error is only used
> - in the qmp query: copying s->error to info->error_desc when detecting that
> the migration status is failed;
> - in migrate_fd_cleanup() when migrate_fd_connect() fails: printing the error
> Or is s->error only used in these ways?

Hmm, _maybe_ we can introduce MultiFDSendParams.err; then we can put the
per-thread error there.
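
Roughly something like this (sketch only; the field and helper names are
illustrative, not existing code), so that per-channel code never has to
touch the global MigrationState:

    typedef struct {
        /* ... existing MultiFDSendParams fields ... */
        Error *err;   /* set by the channel thread, read by the main thread */
    } MultiFDSendParams;

    static void multifd_channel_set_error(MultiFDSendParams *p, Error *local_err)
    {
        /* The migration thread later checks p->err and fails the migration. */
        p->err = local_err;
    }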

> >   (though IMHO I'll even prefer to have a
> > per-channel variable to keep the state of the channel, then the
> > per-channel functions won't touch any globals which offers better
> > isolation).
> > 
> > I'm not sure how Juan thinks about it, but I'd prefer some work to
> > provide such isolation and also some mechanism to allow the main
> > thread to detect the per-channel errors not only during setup phase
> > but also during the migration (e.g., when network is suddenly down).
> > Then we don't touch any globals (e.g., we shouldn't call
> > migrate_get_current in any per-channel function like
> > multifd_new_send_channel_async).
> Ok, wait for Juan's comment. :)

Yes.

Regards,

-- 
Peter Xu



Re: [Qemu-devel] [PATCH for 3.2 v2 0/7] hw/arm/bcm2835: Add basic support for cprman (clock subsystem)

2018-11-02 Thread Philippe Mathieu-Daudé
Hi Guenter,

On Fri, Nov 2, 2018 at 3:52 AM Guenter Roeck  wrote:
>
> On 11/1/18 5:12 PM, Philippe Mathieu-Daudé wrote:
> > Hi,
> >
> > This series is a mix of a previous work I had for the raspi, and a patch 
> > from
> > Guenter: https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg03153.html
> >
> > The final patch keeps Guenter's ideas and comments, but is mostly a rewrite.
> > I dropped the A2W code from this work; it didn't seem useful to me.
> >
> > Guenter can you test this series?
> >
>
> arm/raspi2 works, but aarch64/raspi3 stalls.

Thanks for testing it!

So I suppose the A2W is required. And I'm probably using too old a kernel;
I'm using the Buster preview following Peter's post [1]:

[0.00] Linux version 4.14.0-3-arm64
(debian-ker...@lists.debian.org) (gcc version 7.2.0 (Debian 7.2.0-18))
#1 SMP Debian 4.14.12-2 (2018-01-06)
[0.00] Machine model: Raspberry Pi 3 Model B
[...]
[8.044215] systemd[1]: Detected architecture arm64.
Welcome to Debian GNU/Linux buster/sid!

Debian GNU/Linux buster/sid rpi3 ttyAMA0

rpi3 login: root
Password:
Linux rpi3 4.14.0-3-arm64 #1 SMP Debian 4.14.12-2 (2018-01-06) aarch64
root@rpi3:~#

I'll look for a newer kernel.

BTW I use these QEMU command line options while testing:

qemu-system-aarch64 \
  -d unimp,guest_errors \
  -trace bcm2835_cprman_rd_\* -trace bcm2835_cprman_wr_\* \
  ...

And the cmdline suggested by Peter:

  -append "rw earlycon=pl011,0x3f201000 console=ttyAMA0 loglevel=8
root=/dev/mmcblk0p2 fsck.repair=yes net.ifnames=0 rootwait memtest=1"

[1] 
https://translatedcode.wordpress.com/2018/04/25/debian-on-qemus-raspberry-pi-3-model/

>
> [   45.683302] Run /sbin/init as init process
> [   50.745961] random: dd: uninitialized urandom read (512 bytes read)
> [   77.478266] Writes:  Total: 2074828  Max/Min: 0/0   Fail: 0
>
> ... then nothing else until I abort the session.
>
> This is with the series applied on top of master.
>
> Guenter
>
> > I kept Guenter S-o-b and (C), is that OK? (Guenter?, Peter?)
> >
> > - patches 1, 7: obvious updates in MAINTAINERS
> > - patches 3, 4: simple cleanups
> > - patches 2, 5: add UNIMP code
> > - patch 6: add the cprman (KISS init values from Guenter)
> >
> > Peter: can you take patch #1 for 3.1?
> >
> > Regards,
> >
> > Phil.
> >
> > Philippe Mathieu-Daudé (7):
> >MAINTAINERS: Add an entry for the Raspberry Pi machines
> >hw/misc/bcm2835_property: Handle the 'domain state' property
> >hw/arm/bcm2835: Use 0x prefix for hex numbers
> >hw/arm/bcm2835: Rename some definitions
> >hw/arm/bcm2835: Add various unimplemented peripherals
> >hw/arm/bcm2835: Add basic support for cprman (clock subsystem)
> >MAINTAINERS: Volunteer to review Raspi patches
> >
> >   MAINTAINERS  |   7 +
> >   hw/arm/bcm2835_peripherals.c |  42 +++-
> >   hw/char/bcm2835_aux.c|   2 +-
> >   hw/intc/bcm2836_control.c|   4 +-
> >   hw/misc/Makefile.objs|   1 +
> >   hw/misc/bcm2835_cprman.c | 277 +++
> >   hw/misc/bcm2835_property.c   |   8 +-
> >   hw/misc/trace-events |   8 +
> >   include/hw/arm/bcm2835_peripherals.h |  11 ++
> >   include/hw/arm/raspi_platform.h  |   6 +-
> >   include/hw/misc/bcm2835_cprman.h |  28 +++
> >   11 files changed, 387 insertions(+), 7 deletions(-)
> >   create mode 100644 hw/misc/bcm2835_cprman.c
> >   create mode 100644 include/hw/misc/bcm2835_cprman.h
> >
>



Re: [Qemu-devel] [PATCH for 3.2 v2 0/7] hw/arm/bcm2835: Add basic support for cprman (clock subsystem)

2018-11-02 Thread Philippe Mathieu-Daudé
On Fri, Nov 2, 2018 at 8:32 AM Philippe Mathieu-Daudé  wrote:
>
> Hi Guenter,
>
> On Fri, Nov 2, 2018 at 3:52 AM Guenter Roeck  wrote:
> >
> > On 11/1/18 5:12 PM, Philippe Mathieu-Daudé wrote:
> > > Hi,
> > >
> > > This series is a mix of a previous work I had for the raspi, and a patch 
> > > from
> > > Guenter: 
> > > https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg03153.html
> > >
> > > The final patch keeps Guenter's ideas and comments, but is mostly a rewrite.
> > > I dropped the A2W code from this work; it didn't seem useful to me.
> > >
> > > Guenter can you test this series?
> > >
> >
> > arm/raspi2 works, but aarch64/raspi3 stalls.
>
> Thanks for testing it!
>
> So I suppose the A2W is required. And I'm probably using too old a kernel;
> I'm using the Buster preview following Peter's post [1]:
>
> [0.00] Linux version 4.14.0-3-arm64
> (debian-ker...@lists.debian.org) (gcc version 7.2.0 (Debian 7.2.0-18))
> #1 SMP Debian 4.14.12-2 (2018-01-06)
> [0.00] Machine model: Raspberry Pi 3 Model B
> [...]
> [8.044215] systemd[1]: Detected architecture arm64.
> Welcome to Debian GNU/Linux buster/sid!
>
> Debian GNU/Linux buster/sid rpi3 ttyAMA0
>
> rpi3 login: root
> Password:
> Linux rpi3 4.14.0-3-arm64 #1 SMP Debian 4.14.12-2 (2018-01-06) aarch64
> root@rpi3:~#
>
> I'll look for a newer kernel.

I'm a bit confused since I can boot a 4.19 kernel:

[0.00] Booting Linux on physical CPU 0x00 [0x410fd034]
[0.00] Linux version 4.19.0 (gokrazy@docker) (gcc version
6.3.0 20170516 (Debian 6.3.0-18)) #1 SMP PREEMPT Wed Mar 1 20:57:29
UTC 2017
[0.00] Machine model: Raspberry Pi 3 Model B
[0.00] earlycon: pl11 at MMIO 0x3f201000 (options '')
[0.00] bootconsole [pl11] enabled
...
[2.722577] Freeing unused kernel memory: 5696K
[2.723256] Run /init as init process
Loading, please wait...
starting version 236
...
root@rpi3:~# uname -a
Linux rpi3 4.19.0 #1 SMP PREEMPT Wed Mar 1 20:57:29 UTC 2017 aarch64 GNU/Linux

>
> BTW I use these QEMU command line options while testing:
>
> qemu-system-aarch64 \
>   -d unimp,guest_errors \
>   -trace bcm2835_cprman_rd_\* -trace bcm2835_cprman_wr_\* \
>   ...
>
> And the cmdline suggested by Peter:
>
>   -append "rw earlycon=pl011,0x3f201000 console=ttyAMA0 loglevel=8
> root=/dev/mmcblk0p2 fsck.repair=yes net.ifnames=0 rootwait memtest=1"
>
> [1] 
> https://translatedcode.wordpress.com/2018/04/25/debian-on-qemus-raspberry-pi-3-model/
>
> >
> > [   45.683302] Run /sbin/init as init process

init is run way after the A2W register accesses, so I doubt they are the
problem here.

Can you provide me your testing setup?

Thanks,

Phil.

> > [   50.745961] random: dd: uninitialized urandom read (512 bytes read)
> > [   77.478266] Writes:  Total: 2074828  Max/Min: 0/0   Fail: 0
> >
> > ... then nothing else until I abort the session.
> >
> > This is with the series applied on top of master.
> >
> > Guenter
> >
> > > I kept Guenter S-o-b and (C), is that OK? (Guenter?, Peter?)
> > >
> > > - patches 1, 7: obvious updates in MAINTAINERS
> > > - patches 3, 4: simple cleanups
> > > - patches 2, 5: add UNIMP code
> > > - patch 6: add the cprman (KISS init values from Guenter)
> > >
> > > Peter: can you take patch #1 for 3.1?
> > >
> > > Regards,
> > >
> > > Phil.
> > >
> > > Philippe Mathieu-Daudé (7):
> > >MAINTAINERS: Add an entry for the Raspberry Pi machines
> > >hw/misc/bcm2835_property: Handle the 'domain state' property
> > >hw/arm/bcm2835: Use 0x prefix for hex numbers
> > >hw/arm/bcm2835: Rename some definitions
> > >hw/arm/bcm2835: Add various unimplemented peripherals
> > >hw/arm/bcm2835: Add basic support for cprman (clock subsystem)
> > >MAINTAINERS: Volunteer to review Raspi patches
> > >
> > >   MAINTAINERS  |   7 +
> > >   hw/arm/bcm2835_peripherals.c |  42 +++-
> > >   hw/char/bcm2835_aux.c|   2 +-
> > >   hw/intc/bcm2836_control.c|   4 +-
> > >   hw/misc/Makefile.objs|   1 +
> > >   hw/misc/bcm2835_cprman.c | 277 +++
> > >   hw/misc/bcm2835_property.c   |   8 +-
> > >   hw/misc/trace-events |   8 +
> > >   include/hw/arm/bcm2835_peripherals.h |  11 ++
> > >   include/hw/arm/raspi_platform.h  |   6 +-
> > >   include/hw/misc/bcm2835_cprman.h |  28 +++
> > >   11 files changed, 387 insertions(+), 7 deletions(-)
> > >   create mode 100644 hw/misc/bcm2835_cprman.c
> > >   create mode 100644 include/hw/misc/bcm2835_cprman.h
> > >
> >



Re: [Qemu-devel] strange situation, guest cpu thread spinning at ~100%, but display not yet initialized

2018-11-02 Thread Alex Bennée


Chris Friesen  writes:

> Hi all,
>
> I have an odd situation which occurs very infrequently and I'm hoping
> to get some advice on how to debug.  Apologies for the length of this
> message, I tried to include as much potentially useful information as
> possible.
>
> In the context of an OpenStack compute node I have a qemu guest (with
> kvm acceleration) that has started up.  The virtual console shows
> "Guest has not initialized the display (yet)."   I'm trying to figure
> out what's going on and how we got into this state.  I assume it's
> some sort of deadlock/livelock, but I can't figure out what's causing
> it.
>
> I'm using qemu 2.10.0 (qemu-kvm-ev-2.10.0-0), with CentOS 7.4.1708 as
> the underlying OS.  Kernel is 3.10.0-693.21.1.el7.36.
>
> On the host, the "CPU 0/KVM" thread for this guest is at 99.9% cpu
> utilization on host cpu 43.  There are two other threads of a separate
> process which are chewing up host cpus 2 and 3.  Host cpus 0 and 1
> (and their HT siblings 36 and 37) are ~90% idle and are used for
> general host overhead.
>
> The qemu process looks like this:
>
> controller-0:~# ps -ef|grep qemu
> root  48250  1 99 18:16 ?01:17:35
> /usr/libexec/qemu-kvm -c 0x0001 -n 4
> --proc-type=secondary --file-prefix=vs -- -enable-dpdk -name
> guest=instance-0001,debug-threads=on -S -object
> secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-288-instance-0001/master-key.aes
> -machine pc-i440fx-rhel7.4.0,accel=kvm,usb=off,dump-guest-core=off -m
> 1024 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -object
> memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/mnt/huge-2048kB/libvirt/qemu/288-instance-0001,share=yes,size=1073741824,host-nodes=0,policy=bind
> -numa node,nodeid=0,cpus=0-3,memdev=ram-node0 -uuid
> 146389d6-0190-4b41-9fbc-6fc7c957b81a -smbios
> type=1,manufacturer=Fedora Project,product=OpenStack
> Nova,version=16.0.2-1.tis.156,serial=d6e1c3bf-126e-4518-a46d-aa33f27ec0ab,uuid=146389d6-0190-4b41-9fbc-6fc7c957b81a,family=Virtual
> Machine -no-user-config -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-288-instance-0001/monitor.sock,server,nowait
> -mon chardev=charmonitor,id=monitor,mode=control -rtc
> base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet
> -no-shutdown -boot reboot-timeout=5000,strict=on -global
> i440FX-pcihost.pci-hole64-size=67108864K -device
> piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive
> file=/etc/nova/instances/146389d6-0190-4b41-9fbc-6fc7c957b81a/disk,format=qcow2,if=none,id=drive-virtio-disk0,cache=none
> -device
> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
> -chardev
> socket,id=charnet0,path=/var/run/vswitch/usvhost-a19a0e3b-6c85-4726-918d-572c223bd23c
> -netdev vhost-user,chardev=charnet0,id=hostnet0 -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:68:4a:a2,bus=pci.0,addr=0x3
> -add-fd set=0,fd=78 -chardev
> pty,id=charserial0,logfile=/dev/fdset/0,logappend=on -device
> isa-serial,chardev=charserial0,id=serial0 -device
> usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -k en-us -device
> cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device
> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on
>
>
> The CPU affinity looks a little odd, since we have a number of host
> CPUs reserved for other things, and the qemu process is affined to the
> CPUs of a single host NUMA node.
>
> controller-0:~# taskset -apc 48250
> pid 48250's current affinity list: 5-17,41-53
> pid 48271's current affinity list: 5-17,41-53
> pid 48272's current affinity list: 5-17,41-53
> pid 48316's current affinity list: 5-17,41-53
> pid 48317's current affinity list: 5-17,41-53
> pid 48318's current affinity list: 5-17,41-53
> pid 48319's current affinity list: 5-17,41-53
> pid 48335's current affinity list: 5-17,41-53
>
>
> CPU scheduler policy:
> controller-0:~# chrt -ap 48250
> pid 48250's current scheduling policy: SCHED_OTHER
> pid 48250's current scheduling priority: 0
> pid 48271's current scheduling policy: SCHED_OTHER
> pid 48271's current scheduling priority: 0
> pid 48272's current scheduling policy: SCHED_OTHER
> pid 48272's current scheduling priority: 0
> pid 48316's current scheduling policy: SCHED_OTHER
> pid 48316's current scheduling priority: 0
> pid 48317's current scheduling policy: SCHED_OTHER
> pid 48317's current scheduling priority: 0
> pid 48318's current scheduling policy: SCHED_OTHER
> pid 48318's current scheduling priority: 0
> pid 48319's current scheduling policy: SCHED_OTHER
> pid 48319's current scheduling priority: 0
> pid 48335's current scheduling policy: SCHED_OTHER
> pid 48335's current scheduling priority: 0
>
>
> Kernel stack for the CPU 0/KVM task.  This is kind of strange, because
> I'd expect it to be in the "ioctl" call in the kernel or somewhere
> further down the stack.
> controller-0:~# cat /proc/48316/stack
> [] 0xff

Re: [Qemu-devel] [PATCH] nvme: fix oob access issue(CVE-2018-16847)

2018-11-02 Thread Philippe Mathieu-Daudé

On 2/11/18 2:22, Li Qiang wrote:

Currently, the nvme_cmb_ops mr doesn't check the addr and size.
This can lead to an OOB access issue, which is triggerable from the guest.
Add a check to avoid this issue.

Fixes CVE-2018-16847.

Reported-by: Li Qiang 
Reviewed-by: Paolo Bonzini 
Signed-off-by: Li Qiang 
---
  hw/block/nvme.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index fc7dacb..d097add 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1175,6 +1175,10 @@ static void nvme_cmb_write(void *opaque, hwaddr addr, 
uint64_t data,
  unsigned size)
  {
  NvmeCtrl *n = (NvmeCtrl *)opaque;
+
+if (addr + size > NVME_CMBSZ_GETSIZE(n->bar.cmbsz)) {


Should this be reported via qemu_log_mask(LOG_GUEST_ERROR, ...)?


+return;
+}
  memcpy(&n->cmbuf[addr], &data, size);
  }
  
@@ -1183,6 +1187,9 @@ static uint64_t nvme_cmb_read(void *opaque, hwaddr addr, unsigned size)

  uint64_t val;
  NvmeCtrl *n = (NvmeCtrl *)opaque;
  
+if (addr + size > NVME_CMBSZ_GETSIZE(n->bar.cmbsz)) {


Ditto.


+return 0;
+}
  memcpy(&val, &n->cmbuf[addr], size);
  return val;
  }
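
For reference, a minimal sketch of the qemu_log_mask() suggestion above for
nvme_cmb_write() (it needs "qemu/log.h"; the message text is only
illustrative, not from the patch):

    if (addr + size > NVME_CMBSZ_GETSIZE(n->bar.cmbsz)) {
        qemu_log_mask(LOG_GUEST_ERROR,
                      "nvme: out-of-bounds CMB write, addr=0x%" HWADDR_PRIx
                      " size=%u\n", addr, size);
        return;
    }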





Re: [Qemu-devel] [PATCH] target/arm: Conditionalize arm_div assert on aarch32 support

2018-11-02 Thread Alex Bennée


Richard Henderson  writes:

> When populating id registers from kvm, on a host that doesn't support
> aarch32 mode at all, aa32_arm_div will not be supported either.
>
> Signed-off-by: Richard Henderson 
> ---
>
> "Tested" on an APM Mustang, which does support AArch32.  I'm not
> sure, off hand, which cpu(s) don't have it, and Alex didn't say
> in his bug report.  Tsk tsk.  ;-)

It's qemu-test - which I think is a ThunderX. Unfortunately I think we
need the same treatment for the Jazelle test:

  ./aarch64-softmmu/qemu-system-aarch64 -machine virt,gic-version=3 -accel kvm 
-cpu host -serial mon:stdio -nic 
user,model=virtio-net-pci,hostfwd=tcp::-:22 -device virtio-scsi-pci -kernel 
../linux.git/arch/arm64/boot/Image -append "console=ttyAMA0 panic=-1" -display 
none -m 4096 --no-reboot
  qemu-system-aarch64: /home/alex/lsrc/qemu.git/target/arm/cpu.c:866: 
arm_cpu_realizefn: Assertion `cpu_isar_feature(jazelle, cpu)' failed.
  fish: “./aarch64-softmmu/qemu-system-a…” terminated by signal SIGABRT (Abort)
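
Presumably the Jazelle assert wants the same kind of guard, something like
this untested sketch (mirroring the aa64_a32 check from the patch below; the
exact condition for the Jazelle branch may need adjusting):

    if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)
        && cpu_isar_feature(aa64_a32, cpu)) {
        assert(cpu_isar_feature(jazelle, cpu));
    }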


>
>
> r~
>
> ---
>  target/arm/cpu.h |  5 +
>  target/arm/cpu.c | 10 +-
>  2 files changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index 895f9909d8..4521ad5ae8 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -3300,6 +3300,11 @@ static inline bool isar_feature_aa64_fp16(const 
> ARMISARegisters *id)
>  return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, FP) == 1;
>  }
>
> +static inline bool isar_feature_aa64_a32(const ARMISARegisters *id)
> +{
> +return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, EL0) == 2;
> +}
> +
>  static inline bool isar_feature_aa64_sve(const ARMISARegisters *id)
>  {
>  return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, SVE) != 0;
> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> index e08a2d2d79..988d97d1f1 100644
> --- a/target/arm/cpu.c
> +++ b/target/arm/cpu.c
> @@ -828,8 +828,16 @@ static void arm_cpu_realizefn(DeviceState *dev, Error 
> **errp)
>   * include the various other features that V7VE implies.
>   * Presence of EL2 itself is ARM_FEATURE_EL2, and of the
>   * Security Extensions is ARM_FEATURE_EL3.
> + *
> + * V7VE requires ARM division.  However, there exist AArch64 cpus
> + * without AArch32 support.  When KVM queries ID_ISAR0_EL1 on such
> + * a host, the value is UNKNOWN.  Similarly, we cannot check
> + * ID_AA64PFR0 without AArch64 support.  Check everything in order.
>   */
> -assert(cpu_isar_feature(arm_div, cpu));
> +if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)
> +&& cpu_isar_feature(aa64_a32, cpu)) {
> +assert(cpu_isar_feature(arm_div, cpu));
> +}
>  set_feature(env, ARM_FEATURE_LPAE);
>  set_feature(env, ARM_FEATURE_V7);
>  }


--
Alex Bennée



[Qemu-devel] How to emulate block I/O timeout on qemu side?

2018-11-02 Thread Dongli Zhang
Hi,

Is there any way to emulate I/O timeout on the QEMU side (not fault injection in
the VM kernel) without modifying the QEMU source code?

For instance, I would like to observe/study/debug the I/O timeout handling of
nvme, scsi, virtio-blk (not supported) in the VM kernel.

Is there a way to trigger this on purpose on the QEMU side?

Thank you very much!

Dongli Zhang



[Qemu-devel] [PATCH] qemu/units: Move out QCow2 specific definitions

2018-11-02 Thread Philippe Mathieu-Daudé
These definitions are QCow2 specific; there is no need to expose them
in the global namespace.

This partially reverts commit 540b8492618eb.

Signed-off-by: Philippe Mathieu-Daudé 
---
 block/qcow2.h| 56 +++-
 include/qemu/units.h | 55 ---
 2 files changed, 55 insertions(+), 56 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index 29c98d87a0..74d200c8cb 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -27,12 +27,66 @@
 
 #include "crypto/block.h"
 #include "qemu/coroutine.h"
-#include "qemu/units.h"
 
 //#define DEBUG_ALLOC
 //#define DEBUG_ALLOC2
 //#define DEBUG_EXT
 
+#define S_1KiB  1024
+#define S_2KiB  2048
+#define S_4KiB  4096
+#define S_8KiB  8192
+#define S_16KiB 16384
+#define S_32KiB 32768
+#define S_64KiB 65536
+#define S_128KiB  131072
+#define S_256KiB  262144
+#define S_512KiB  524288
+#define S_1MiB   1048576
+#define S_2MiB   2097152
+#define S_4MiB   4194304
+#define S_8MiB   8388608
+#define S_16MiB 16777216
+#define S_32MiB 33554432
+#define S_64MiB 67108864
+#define S_128MiB   134217728
+#define S_256MiB   268435456
+#define S_512MiB   536870912
+#define S_1GiB  1073741824
+#define S_2GiB  2147483648
+#define S_4GiB  4294967296
+#define S_8GiB  8589934592
+#define S_16GiB  17179869184
+#define S_32GiB  34359738368
+#define S_64GiB  68719476736
+#define S_128GiB  137438953472
+#define S_256GiB  274877906944
+#define S_512GiB  549755813888
+#define S_1TiB 1099511627776
+#define S_2TiB 2199023255552
+#define S_4TiB 4398046511104
+#define S_8TiB 8796093022208
+#define S_16TiB   17592186044416
+#define S_32TiB   35184372088832
+#define S_64TiB   70368744177664
+#define S_128TiB 140737488355328
+#define S_256TiB 281474976710656
+#define S_512TiB 562949953421312
+#define S_1PiB  1125899906842624
+#define S_2PiB  2251799813685248
+#define S_4PiB  4503599627370496
+#define S_8PiB  9007199254740992
+#define S_16PiB 18014398509481984
+#define S_32PiB 36028797018963968
+#define S_64PiB 72057594037927936
+#define S_128PiB  144115188075855872
+#define S_256PiB  288230376151711744
+#define S_512PiB  576460752303423488
+#define S_1EiB   1152921504606846976
+#define S_2EiB   2305843009213693952
+#define S_4EiB   4611686018427387904
+#define S_8EiB   9223372036854775808
+
 #define QCOW_MAGIC (('Q' << 24) | ('F' << 16) | ('I' << 8) | 0xfb)
 
 #define QCOW_CRYPT_NONE 0
diff --git a/include/qemu/units.h b/include/qemu/units.h
index 68a7758650..692db3fbb2 100644
--- a/include/qemu/units.h
+++ b/include/qemu/units.h
@@ -17,59 +17,4 @@
 #define PiB (INT64_C(1) << 50)
 #define EiB (INT64_C(1) << 60)
 
-#define S_1KiB  1024
-#define S_2KiB  2048
-#define S_4KiB  4096
-#define S_8KiB  8192
-#define S_16KiB 16384
-#define S_32KiB 32768
-#define S_64KiB 65536
-#define S_128KiB  131072
-#define S_256KiB  262144
-#define S_512KiB  524288
-#define S_1MiB   1048576
-#define S_2MiB   2097152
-#define S_4MiB   4194304
-#define S_8MiB   8388608
-#define S_16MiB 16777216
-#define S_32MiB 33554432
-#define S_64MiB 67108864
-#define S_128MiB   134217728
-#define S_256MiB   268435456
-#define S_512MiB   536870912
-#define S_1GiB  1073741824
-#define S_2GiB  2147483648
-#define S_4GiB  4294967296
-#define S_8GiB  8589934592
-#define S_16GiB  17179869184
-#define S_32GiB  34359738368
-#define S_64GiB  68719476736
-#define S_128GiB  137438953472
-#define S_256GiB  274877906944
-#define S_512GiB  549755813888
-#define S_1TiB 1099511627776
-#define S_2TiB 2199023255552
-#define S_4TiB 4398046511104
-#define S_8TiB 8796093022208
-#define S_16TiB   17592186044416
-#define S_32TiB   35184372088832
-#define S_64TiB   70368744177664
-#define S_128TiB 140737488355328
-#define S_256TiB 281474976710656
-#define S_512TiB 562949953421312
-#define S_1PiB  1125899906842624
-#define S_2PiB  2251799813685248
-#define S_4PiB  4503599627370496
-#define S_8PiB  9007199254740992
-#define S_16PiB 18014398509481984
-#define S_32PiB 36028797018963968
-#define S_64PiB 72057594037927936
-#define S_128PiB  144115188075855872
-#define S_256PiB  288230376151711744
-#define S_512PiB  576460752303423488
-#define S_1EiB   1152921504606846976

Re: [Qemu-devel] [PATCH v3 00/35] target/riscv: Convert to decodetree

2018-11-02 Thread no-reply
Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20181031132029.4887-1-kbast...@mail.uni-paderborn.de
Subject: [Qemu-devel] [PATCH v3 00/35] target/riscv: Convert to decodetree

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
0dc9aaa65c target/riscv: Remaining rvc insn reuse 32 bit translators
8d508e3f89 target/riscv: Splice remaining compressed insn pairs for riscv32 vs 
riscv64
44e80c8498 target/riscv: Splice fsw_sd and flw_ld for riscv32 vs riscv64
07f644d3a2 target/riscv: Convert @cl_d, @cl_w, @cs_d, @cs_w insns
40b6158d54 target/riscv: Convert @cs_2 insns to share translation functions
ca95ca92a5 target/riscv: Remove decode_RV32_64G()
666c42149b target/riscv: Remove gen_system()
e1892386ec target/riscv: Rename trans_arith to gen_arith
1b33908223 target/riscv: Remove manual decoding of RV32/64M insn
1a5fef8c4e target/riscv: Remove shift and slt insn manual decoding
2731f425de target/riscv: make ADD/SUB/OR/XOR/AND insn use arg lists
9c34cbd280 target/riscv: Move gen_arith_imm() decoding into trans_* functions
2eaf68ef78 target/riscv: Remove manual decoding from gen_store()
31f0d4c241 target/riscv: Remove manual decoding from gen_load()
9d725fe480 target/riscv: Remove manual decoding from gen_branch()
9d8e1f844c target/riscv: Remove gen_jalr()
dc3753ad33 target/riscv: Convert quadrant 2 of RVXC insns to decodetree
1a083b100e target/riscv: Convert quadrant 1 of RVXC insns to decodetree
b831b8c4ec target/riscv: Convert quadrant 0 of RVXC insns to decodetree
c931914bd2 target/riscv: Convert RV priv insns to decodetree
767d172f2d target/riscv: Convert RV64D insns to decodetree
18d3406ee6 target/riscv: Convert RV32D insns to decodetree
515c07059a target/riscv: Convert RV64F insns to decodetree
3506dbc10d target/riscv: Convert RV32F insns to decodetree
2197c6b6b1 target/riscv: Convert RV64A insns to decodetree
c07ae6248a target/riscv: Convert RV32A insns to decodetree
508bf988f6 target/riscv: Convert RVXM insns to decodetree
2773626354 target/riscv: Convert RVXI csr insns to decodetree
55033facbb target/riscv: Convert RVXI fence insns to decodetree
e4d5253c9f target/riscv: Convert RVXI arithmetic insns to decodetree
7616bcfe80 target/riscv: Convert RV64I load/store insns to decodetree
7a348ac675 target/riscv: Convert RV32I load/store insns to decodetree
68f5d736d1 target/riscv: Convert RVXI branch insns to decodetree
2b6fed94f8 target/riscv: Activate decodetree and implemnt LUI & AUIPC
5fa58ba15a target/riscv: Move CPURISCVState pointer to DisasContext

=== OUTPUT BEGIN ===
Checking PATCH 1/35: target/riscv: Move CPURISCVState pointer to DisasContext...
Checking PATCH 2/35: target/riscv: Activate decodetree and implemnt LUI & 
AUIPC...
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#32: 
new file mode 100644

ERROR: externs should be avoided in .c files
#123: FILE: target/riscv/translate.c:1677:
+bool decode_insn32(DisasContext *ctx, uint32_t insn);

total: 1 errors, 1 warnings, 125 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 3/35: target/riscv: Convert RVXI branch insns to decodetree...
Checking PATCH 4/35: target/riscv: Convert RV32I load/store insns to 
decodetree...
Checking PATCH 5/35: target/riscv: Convert RV64I load/store insns to 
decodetree...
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#37: 
new file mode 100644

total: 0 errors, 1 warnings, 76 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 6/35: target/riscv: Convert RVXI arithmetic insns to 
decodetree...
Checking PATCH 7/35: target/riscv: Convert RVXI fence insns to decodetree...
Checking PATCH 8/35: target/riscv: Convert RVXI csr insns to decodetree...
Checking PATCH 9/35: target/riscv: Convert RVXM insns to decodetree...
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#46: 
new file mode 100644

total: 0 errors, 1 warnings, 145 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 10/35: target/riscv: Convert RV32A insns to de

Re: [Qemu-devel] [PATCH v1] bt: use size_t type for length parameters instead of int

2018-11-02 Thread P J P
+-- On Sat, 27 Oct 2018, P J P wrote --+
|+-- On Sun, 21 Oct 2018, P J P wrote --+
|| The length parameter values are not negative, thus use an unsigned
|| type 'size_t' for them. Many routines pass 'len' values to memcpy(3)
|| calls. If it was negative, it could lead to memory corruption issues.
|| Add check to avoid it.
|| 
|| Reported-by: Arash TC 
|| Signed-off-by: Prasad J Pandit 
|| ---
||  bt-host.c  |  8 +++---
||  bt-vhci.c  |  7 +++---
||  hw/bt/core.c   |  2 +-
||  hw/bt/hci-csr.c| 20 +++
||  hw/bt/hci.c| 38 ++--
||  hw/bt/hid.c| 10 
||  hw/bt/l2cap.c  | 56 ++
||  hw/bt/sdp.c|  6 ++---
||  hw/usb/dev-bluetooth.c | 12 -
||  include/hw/bt.h|  8 +++---
||  include/sysemu/bt.h| 10 
||  11 files changed, 90 insertions(+), 87 deletions(-)
|| 
|| Update v1: add assert check in vhci_host_send. Also check other places 
wherein
|| length is used with fixed size buffers.
||   -> https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg03831.html
| 
| Ping...!


Ping...!
--
Prasad J Pandit / Red Hat Product Security Team
47AF CE69 3A90 54AA 9045 1053 DD13 3D32 FE5B 041F



Re: [Qemu-devel] [PULL 1/2] qxl: store channel id in qxl->id

2018-11-02 Thread Frediano Ziglio
> 
> See qemu_spice_add_display_interface(), the console index is also used
> as channel id.  So put that into the qxl->id field too.
> 
> In typical use cases (one primary qxl-vga device, optionally one or more
> secondary qxl devices, no non-qxl display devices) this doesn't change
> anything.
> 
> With this in place the qxl->id can not be used any more to figure
> whenever a given device is primary (with vga compat mode) or secondary.
> So add a bool to track this.
> 
> Cc: spice-de...@lists.freedesktop.org
> Signed-off-by: Gerd Hoffmann 
> Message-id: 20181012114540.27829-1-kra...@redhat.com
> ---
>  hw/display/qxl.h |  1 +
>  hw/display/qxl.c | 19 ---
>  2 files changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/display/qxl.h b/hw/display/qxl.h
> index dd9c0522b7..6f9d1f21fa 100644
> --- a/hw/display/qxl.h
> +++ b/hw/display/qxl.h
> @@ -34,6 +34,7 @@ typedef struct PCIQXLDevice {
>  PortioList vga_port_list;
>  SimpleSpiceDisplay ssd;
>  intid;
> +bool   have_vga;
>  uint32_t   debug;
>  uint32_t   guestdebug;
>  uint32_t   cmdlog;
> diff --git a/hw/display/qxl.c b/hw/display/qxl.c
> index f608abc769..9087db5dee 100644
> --- a/hw/display/qxl.c
> +++ b/hw/display/qxl.c
> @@ -848,7 +848,7 @@ static int interface_get_cursor_command(QXLInstance *sin,
> struct QXLCommandExt *
>  qxl->guest_primary.commands++;
>  qxl_track_command(qxl, ext);
>  qxl_log_command(qxl, "csr", ext);
> -if (qxl->id == 0) {
> +if (qxl->have_vga) {
>  qxl_render_cursor(qxl, ext);
>  }
>  trace_qxl_ring_cursor_get(qxl->id, qxl_mode_to_string(qxl->mode));
> @@ -1255,7 +1255,7 @@ static void qxl_soft_reset(PCIQXLDevice *d)
>  d->current_async = QXL_UNDEFINED_IO;
>  qemu_mutex_unlock(&d->async_lock);
>  
> -if (d->id == 0) {
> +if (d->have_vga) {
>  qxl_enter_vga_mode(d);
>  } else {
>  d->mode = QXL_MODE_UNDEFINED;
> @@ -2139,7 +2139,7 @@ static void qxl_realize_common(PCIQXLDevice *qxl, Error
> **errp)
>  
>  memory_region_init_io(&qxl->io_bar, OBJECT(qxl), &qxl_io_ops, qxl,
>"qxl-ioports", io_size);
> -if (qxl->id == 0) {
> +if (qxl->have_vga) {
>  vga_dirty_log_start(&qxl->vga);
>  }
>  memory_region_set_flush_coalesced(&qxl->io_bar);
> @@ -2171,7 +2171,7 @@ static void qxl_realize_common(PCIQXLDevice *qxl, Error
> **errp)
>  
>  /* print pci bar details */
>  dprint(qxl, 1, "ram/%s: %" PRId64 " MB [region 0]\n",
> -   qxl->id == 0 ? "pri" : "sec", qxl->vga.vram_size / MiB);
> +   qxl->have_vga ? "pri" : "sec", qxl->vga.vram_size / MiB);
>  dprint(qxl, 1, "vram/32: %" PRIx64 " MB [region 1]\n",
> qxl->vram32_size / MiB);
>  dprint(qxl, 1, "vram/64: %" PRIx64 " MB %s\n",
> @@ -2199,7 +2199,6 @@ static void qxl_realize_primary(PCIDevice *dev, Error
> **errp)
>  VGACommonState *vga = &qxl->vga;
>  Error *local_err = NULL;
>  
> -qxl->id = 0;
>  qxl_init_ramsize(qxl);
>  vga->vbe_size = qxl->vgamem_size;
>  vga->vram_size_mb = qxl->vga.vram_size / MiB;
> @@ -2210,8 +2209,15 @@ static void qxl_realize_primary(PCIDevice *dev, Error
> **errp)
>   vga, "vga");
>  portio_list_set_flush_coalesced(&qxl->vga_port_list);
>  portio_list_add(&qxl->vga_port_list, pci_address_space_io(dev), 0x3b0);
> +qxl->have_vga = true;
>  
>  vga->con = graphic_console_init(DEVICE(dev), 0, &qxl_ops, qxl);
> +qxl->id = qemu_console_get_index(vga->con); /* == channel_id */
> +if (qxl->id != 0) {
> +error_setg(errp, "primary qxl-vga device must be console 0 "
> +   "(first display device on the command line)");
> +return;
> +}

Per the comment this no longer seems required, so why test it?

>  
>  qxl_realize_common(qxl, &local_err);
>  if (local_err) {
> @@ -2226,15 +2232,14 @@ static void qxl_realize_primary(PCIDevice *dev, Error
> **errp)
>  
>  static void qxl_realize_secondary(PCIDevice *dev, Error **errp)
>  {
> -static int device_id = 1;
>  PCIQXLDevice *qxl = PCI_QXL(dev);
>  
> -qxl->id = device_id++;
>  qxl_init_ramsize(qxl);
>  memory_region_init_ram(&qxl->vga.vram, OBJECT(dev), "qxl.vgavram",
> qxl->vga.vram_size, &error_fatal);
>  qxl->vga.vram_ptr = memory_region_get_ram_ptr(&qxl->vga.vram);
>  qxl->vga.con = graphic_console_init(DEVICE(dev), 0, &qxl_ops, qxl);
> +qxl->id = qemu_console_get_index(qxl->vga.con); /* == channel_id */
>  

As these IDs must be contiguous, this implies a requirement that if there is
a qxl interface, only qxl interfaces are used and no other consoles, which
seems wrong to me.

>  qxl_realize_common(qxl, errp);
>  }

Frediano



Re: [Qemu-devel] [PATCH v4 03/23] hw: acpi: Export the RSDP build API

2018-11-02 Thread Shannon Zhao

Hi,

On 2018/11/1 18:22, Samuel Ortiz wrote:

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index f28a2faa53..0ed132b79b 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -367,7 +367,7 @@ static void acpi_dsdt_add_power_button(Aml *scope)
  }
  
  /* RSDP */

-static GArray *
+static void
  build_rsdp(GArray *rsdp_table, BIOSLinker *linker, unsigned xsdt_tbl_offset)
  {
  AcpiRsdpDescriptor *rsdp = acpi_data_push(rsdp_table, sizeof *rsdp);


Why change this? It's not related to your patch purpose.

Thanks,
Shannon



[Qemu-devel] [PATCH 0/3] Performance improvements for xen_disk

2018-11-02 Thread Tim Smith
A series of performance improvements for disks using the Xen PV ring.

These have had fairly extensive testing.

The batching and latency improvements together boost the throughput
of small reads and writes by two to six percent (measured using fio
in the guest).

Avoiding repeated calls to posix_memalign() reduced the dirty heap
from 25MB to 5MB in the case of a single datapath process while also
improving performance.

---

Tim Smith (3):
  Improve xen_disk batching behaviour
  Improve xen_disk response latency
  Avoid repeated memory allocation in xen_disk


 hw/block/xen_disk.c |   82 +--
 1 file changed, 46 insertions(+), 36 deletions(-)

--
Tim Smith 



[Qemu-devel] [PATCH 1/3] Improve xen_disk batching behaviour

2018-11-02 Thread Tim Smith
When I/O consists of many small requests, performance is improved by
batching them together in a single io_submit() call. When there are
relatively few requests, the extra overhead is not worth it. This
introduces a check to start batching I/O requests via blk_io_plug()/
blk_io_unplug() in an amount proportional to the number which were
already in flight at the time we started reading the ring.

Signed-off-by: Tim Smith 
---
 hw/block/xen_disk.c |   29 +
 1 file changed, 29 insertions(+)

diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 36eff94f84..6cb40d66fa 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -101,6 +101,9 @@ struct XenBlkDev {
 AioContext  *ctx;
 };
 
+/* Threshold of in-flight requests above which we will start using
+ * blk_io_plug()/blk_io_unplug() to batch requests */
+#define IO_PLUG_THRESHOLD 1
 /* - */
 
 static void ioreq_reset(struct ioreq *ioreq)
@@ -542,6 +545,8 @@ static void blk_handle_requests(struct XenBlkDev *blkdev)
 {
 RING_IDX rc, rp;
 struct ioreq *ioreq;
+int inflight_atstart = blkdev->requests_inflight;
+int batched = 0;
 
 blkdev->more_work = 0;
 
@@ -550,6 +555,16 @@ static void blk_handle_requests(struct XenBlkDev *blkdev)
 xen_rmb(); /* Ensure we see queued requests up to 'rp'. */
 
 blk_send_response_all(blkdev);
+/* If there were more than IO_PLUG_THRESHOLD ioreqs in flight
+ * when we got here, this is an indication that the bottleneck
+ * is below us, so it's worth beginning to batch up I/O requests
+ * rather than submitting them immediately. The maximum number
+ * of requests we're willing to batch is the number already in
+ * flight, so it can grow up to max_requests when the bottleneck
+ * is below us */
+if (inflight_atstart > IO_PLUG_THRESHOLD) {
+blk_io_plug(blkdev->blk);
+}
 while (rc != rp) {
 /* pull request from ring */
 if (RING_REQUEST_CONS_OVERFLOW(&blkdev->rings.common, rc)) {
@@ -589,7 +604,21 @@ static void blk_handle_requests(struct XenBlkDev *blkdev)
 continue;
 }
 
+if (inflight_atstart > IO_PLUG_THRESHOLD && batched >= 
inflight_atstart) {
+blk_io_unplug(blkdev->blk);
+}
 ioreq_runio_qemu_aio(ioreq);
+if (inflight_atstart > IO_PLUG_THRESHOLD) {
+if (batched >= inflight_atstart) {
+blk_io_plug(blkdev->blk);
+batched=0;
+} else {
+batched++;
+}
+}
+}
+if (inflight_atstart > IO_PLUG_THRESHOLD) {
+blk_io_unplug(blkdev->blk);
 }
 
 if (blkdev->more_work && blkdev->requests_inflight < blkdev->max_requests) 
{




[Qemu-devel] [PATCH 2/3] Improve xen_disk response latency

2018-11-02 Thread Tim Smith
If the I/O ring is full, the guest cannot send any more requests
until some responses are sent. Only sending all available responses
just before checking for new work does not leave much time for the
guest to supply new work, so this will cause stalls if the ring gets
full. Also, not completing reads as soon as possible adds latency
to the guest.

To alleviate that, complete IO requests as soon as they come back.
blk_send_response() already returns a value indicating whether
a notify should be sent, which is all the batching we need.

Signed-off-by: Tim Smith 
---
 hw/block/xen_disk.c |   43 ---
 1 file changed, 12 insertions(+), 31 deletions(-)

diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 6cb40d66fa..c11cd21d37 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -83,11 +83,9 @@ struct XenBlkDev {
 
 /* request lists */
 QLIST_HEAD(inflight_head, ioreq) inflight;
-QLIST_HEAD(finished_head, ioreq) finished;
 QLIST_HEAD(freelist_head, ioreq) freelist;
 int requests_total;
 int requests_inflight;
-int requests_finished;
 unsigned intmax_requests;
 
 gbooleanfeature_discard;
@@ -104,6 +102,9 @@ struct XenBlkDev {
 /* Threshold of in-flight requests above which we will start using
  * blk_io_plug()/blk_io_unplug() to batch requests */
 #define IO_PLUG_THRESHOLD 1
+
+static int blk_send_response(struct ioreq *ioreq);
+
 /* - */
 
 static void ioreq_reset(struct ioreq *ioreq)
@@ -155,12 +156,10 @@ static void ioreq_finish(struct ioreq *ioreq)
 struct XenBlkDev *blkdev = ioreq->blkdev;
 
 QLIST_REMOVE(ioreq, list);
-QLIST_INSERT_HEAD(&blkdev->finished, ioreq, list);
 blkdev->requests_inflight--;
-blkdev->requests_finished++;
 }
 
-static void ioreq_release(struct ioreq *ioreq, bool finish)
+static void ioreq_release(struct ioreq *ioreq)
 {
 struct XenBlkDev *blkdev = ioreq->blkdev;
 
@@ -168,11 +167,7 @@ static void ioreq_release(struct ioreq *ioreq, bool finish)
 ioreq_reset(ioreq);
 ioreq->blkdev = blkdev;
 QLIST_INSERT_HEAD(&blkdev->freelist, ioreq, list);
-if (finish) {
-blkdev->requests_finished--;
-} else {
-blkdev->requests_inflight--;
-}
+blkdev->requests_inflight--;
 }
 
 /*
@@ -351,6 +346,10 @@ static void qemu_aio_complete(void *opaque, int ret)
 default:
 break;
 }
+if (blk_send_response(ioreq)) {
+xen_pv_send_notify(&blkdev->xendev);
+}
+ioreq_release(ioreq);
 qemu_bh_schedule(blkdev->bh);
 
 done:
@@ -455,7 +454,7 @@ err:
 return -1;
 }
 
-static int blk_send_response_one(struct ioreq *ioreq)
+static int blk_send_response(struct ioreq *ioreq)
 {
 struct XenBlkDev  *blkdev = ioreq->blkdev;
 int   send_notify   = 0;
@@ -504,22 +503,6 @@ static int blk_send_response_one(struct ioreq *ioreq)
 return send_notify;
 }
 
-/* walk finished list, send outstanding responses, free requests */
-static void blk_send_response_all(struct XenBlkDev *blkdev)
-{
-struct ioreq *ioreq;
-int send_notify = 0;
-
-while (!QLIST_EMPTY(&blkdev->finished)) {
-ioreq = QLIST_FIRST(&blkdev->finished);
-send_notify += blk_send_response_one(ioreq);
-ioreq_release(ioreq, true);
-}
-if (send_notify) {
-xen_pv_send_notify(&blkdev->xendev);
-}
-}
-
 static int blk_get_request(struct XenBlkDev *blkdev, struct ioreq *ioreq, 
RING_IDX rc)
 {
 switch (blkdev->protocol) {
@@ -554,7 +537,6 @@ static void blk_handle_requests(struct XenBlkDev *blkdev)
 rp = blkdev->rings.common.sring->req_prod;
 xen_rmb(); /* Ensure we see queued requests up to 'rp'. */
 
-blk_send_response_all(blkdev);
 /* If there were more than IO_PLUG_THRESHOLD ioreqs in flight
  * when we got here, this is an indication that the bottleneck
  * is below us, so it's worth beginning to batch up I/O requests
@@ -597,10 +579,10 @@ static void blk_handle_requests(struct XenBlkDev *blkdev)
 break;
 };
 
-if (blk_send_response_one(ioreq)) {
+if (blk_send_response(ioreq)) {
 xen_pv_send_notify(&blkdev->xendev);
 }
-ioreq_release(ioreq, false);
+ioreq_release(ioreq);
 continue;
 }
 
@@ -645,7 +627,6 @@ static void blk_alloc(struct XenDevice *xendev)
 trace_xen_disk_alloc(xendev->name);
 
 QLIST_INIT(&blkdev->inflight);
-QLIST_INIT(&blkdev->finished);
 QLIST_INIT(&blkdev->freelist);
 
 blkdev->iothread = iothread_create(xendev->name, &err);




[Qemu-devel] [PATCH 3/3] Avoid repeated memory allocation in xen_disk

2018-11-02 Thread Tim Smith
xen_disk currently allocates memory to hold the data for each ioreq
as that ioreq is used, and frees it afterwards. Because it requires
page-aligned blocks, this interacts poorly with non-page-aligned
allocations and balloons the heap.

Instead, allocate the maximum possible requirement, which is
BLKIF_MAX_SEGMENTS_PER_REQUEST pages (currently 11 pages) when
the ioreq is created, and keep that allocation until it is destroyed.
Since the ioreqs themselves are re-used via a free list, this
should actually improve memory usage.

Signed-off-by: Tim Smith 
---
 hw/block/xen_disk.c |   10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index c11cd21d37..67f894bba5 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -112,7 +112,6 @@ static void ioreq_reset(struct ioreq *ioreq)
 memset(&ioreq->req, 0, sizeof(ioreq->req));
 ioreq->status = 0;
 ioreq->start = 0;
-ioreq->buf = NULL;
 ioreq->size = 0;
 ioreq->presync = 0;
 
@@ -137,6 +136,10 @@ static struct ioreq *ioreq_start(struct XenBlkDev *blkdev)
 /* allocate new struct */
 ioreq = g_malloc0(sizeof(*ioreq));
 ioreq->blkdev = blkdev;
+/* We cannot need more pages per ioreq than this, and we do re-use 
ioreqs,
+ * so allocate the memory once here, to be freed in blk_free() when the
+ * ioreq is freed. */
+ioreq->buf = qemu_memalign(XC_PAGE_SIZE, 
BLKIF_MAX_SEGMENTS_PER_REQUEST * XC_PAGE_SIZE);
 blkdev->requests_total++;
 qemu_iovec_init(&ioreq->v, 1);
 } else {
@@ -313,14 +316,12 @@ static void qemu_aio_complete(void *opaque, int ret)
 if (ret == 0) {
 ioreq_grant_copy(ioreq);
 }
-qemu_vfree(ioreq->buf);
 break;
 case BLKIF_OP_WRITE:
 case BLKIF_OP_FLUSH_DISKCACHE:
 if (!ioreq->req.nr_segments) {
 break;
 }
-qemu_vfree(ioreq->buf);
 break;
 default:
 break;
@@ -392,12 +393,10 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
 {
 struct XenBlkDev *blkdev = ioreq->blkdev;
 
-ioreq->buf = qemu_memalign(XC_PAGE_SIZE, ioreq->size);
 if (ioreq->req.nr_segments &&
 (ioreq->req.operation == BLKIF_OP_WRITE ||
  ioreq->req.operation == BLKIF_OP_FLUSH_DISKCACHE) &&
 ioreq_grant_copy(ioreq)) {
-qemu_vfree(ioreq->buf);
 goto err;
 }
 
@@ -989,6 +988,7 @@ static int blk_free(struct XenDevice *xendev)
 ioreq = QLIST_FIRST(&blkdev->freelist);
 QLIST_REMOVE(ioreq, list);
 qemu_iovec_destroy(&ioreq->v);
+qemu_vfree(ioreq->buf);
 g_free(ioreq);
 }
 




Re: [Qemu-devel] [PATCH v4 05/23] hw: arm: Switch to the AML build RSDP building routine

2018-11-02 Thread Shannon Zhao




On 2018/11/1 18:22, Samuel Ortiz wrote:

We make the ARM virt ACPI code use the now shared build_rsdp() API from
aml-build.c. By doing so we fix a bug where the ARM implementation was
missing adding both the legacy and extended checksums, which was
building an invalid RSDP table.

Signed-off-by: Samuel Ortiz 
---
  hw/arm/virt-acpi-build.c | 31 +--
  1 file changed, 1 insertion(+), 30 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 0ed132b79b..0a6a88380a 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -35,6 +35,7 @@
  #include "target/arm/cpu.h"
  #include "hw/acpi/acpi-defs.h"
  #include "hw/acpi/acpi.h"
+#include "hw/acpi/aml-build.h"
  #include "hw/nvram/fw_cfg.h"
  #include "hw/acpi/bios-linker-loader.h"
  #include "hw/loader.h"
@@ -366,36 +367,6 @@ static void acpi_dsdt_add_power_button(Aml *scope)
  aml_append(scope, dev);
  }
  
-/* RSDP */

-static void
-build_rsdp(GArray *rsdp_table, BIOSLinker *linker, unsigned xsdt_tbl_offset)
Note: here we use xsdt table not rsdt for ARM. After your change which 
assigns xsdt table address to rsdt_physical_address, it doesn't work.


IIRC, we discussed this before that ARM virt uses xsdt while pc/q35 uses 
rsdt. So this patch is not necessary I think.


Thanks,
Shannon

-{
-AcpiRsdpDescriptor *rsdp = acpi_data_push(rsdp_table, sizeof *rsdp);
-unsigned xsdt_pa_size = sizeof(rsdp->xsdt_physical_address);
-unsigned xsdt_pa_offset =
-(char *)&rsdp->xsdt_physical_address - rsdp_table->data;
-
-bios_linker_loader_alloc(linker, ACPI_BUILD_RSDP_FILE, rsdp_table, 16,
- true /* fseg memory */);
-
-memcpy(&rsdp->signature, "RSD PTR ", sizeof(rsdp->signature));
-memcpy(rsdp->oem_id, ACPI_BUILD_APPNAME6, sizeof(rsdp->oem_id));
-rsdp->length = cpu_to_le32(sizeof(*rsdp));
-rsdp->revision = 0x02;
-
-/* Address to be filled by Guest linker */
-bios_linker_loader_add_pointer(linker,
-ACPI_BUILD_RSDP_FILE, xsdt_pa_offset, xsdt_pa_size,
-ACPI_BUILD_TABLE_FILE, xsdt_tbl_offset);
-
-/* Checksum to be filled by Guest linker */
-bios_linker_loader_add_checksum(linker, ACPI_BUILD_RSDP_FILE,
-(char *)rsdp - rsdp_table->data, sizeof *rsdp,
-(char *)&rsdp->checksum - rsdp_table->data);
-
-return rsdp_table;
-}
-
  static void
  build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
  {





Re: [Qemu-devel] [QEMU PATCH v2 0/2]: KVM: i386: Add support for save and restore nested state

2018-11-02 Thread Paolo Bonzini
On 02/11/2018 04:46, Liran Alon wrote:
>> On Thu, Nov 1, 2018 at 09:45 AM, Jim Mattson  wrote:
> 
>>> On Thu, Nov 1, 2018 at 8:56 AM, Dr. David Alan Gilbert 
>>>  wrote:
> 
>>> So if I have matching host kernels it should always work?
>>> What happens if I upgrade the source kernel to increase it's maximum
>>> nested size, can I force it to keep things small for some VMs?
> 
>> Any change to the format of the nested state should be gated by a
>> KVM_CAP set by userspace. (Unlike, say, how the
>> KVM_VCPUEVENT_VALID_SMM flag was added to the saved VCPU events state
>> in commit f077825a8758d.) KVM has traditionally been quite bad about
>> maintaining backwards compatibility, but I hope the community is more
>> cognizant of the issues now.
> 
>> As a cloud provider, one would only enable the new capability from
>> userspace once all hosts in the pool have a kernel that supports it.
>> During the transition, the capability would not be enabled on the
>> hosts with a new kernel, and these hosts would continue to provide
>> nested state that could be consumed by hosts running the older kernel.
> 
> Hmm this makes sense.
> 
> This means though that the patch I have submitted here isn't good enough.
> My patch currently assumes that when it attempts to get nested state from KVM,
> QEMU should always set nested_state->size to max size supported by KVM as 
> received
> from kvm_check_extension(s, KVM_CAP_NESTED_STATE);
> (See kvm_get_nested_state() introduced on my patch).
> This indeed won't allow migration from host with new KVM to host with old KVM 
> if
> nested_state size was enlarged between these KVM versions.
> Which is obviously an issue.

Actually I think this is okay, because unless the "new" capability was
enabled, KVM would always reduce nested_state->size to a value that is
compatible with current kernels.
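
Roughly, the source-side sequence being discussed is the following (a sketch
along the lines of the kvm_get_nested_state() mentioned above, not the final
code; the helper name here is illustrative):

static struct kvm_nested_state *x86_get_nested_state(CPUState *cpu)
{
    int max_size = kvm_check_extension(kvm_state, KVM_CAP_NESTED_STATE);
    struct kvm_nested_state *state = g_malloc0(max_size);

    /* Ask for the maximum the local kernel supports; the kernel shrinks
     * ->size to what it actually filled in. */
    state->size = max_size;
    if (kvm_vcpu_ioctl(cpu, KVM_GET_NESTED_STATE, state) < 0) {
        g_free(state);
        return NULL;
    }
    return state;
}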

> But on second thought, I'm not sure that this is the right approach as-well.
> We don't really want the used version of nested_state to be determined on 
> kvm_init().
> * On source QEMU, we actually want to determine it when preparing for 
> migration based
> on to the support given by our destination host. If it's an old host, we 
> would like to
> save an old version nested_state and if it's a new host, we will like to save 
> our newest
> supported nested_state.

No, that's wrong because it would lead to losing state.  If the source
QEMU supports more state than the destination QEMU, and the current VM
state needs to transmit it for migration to be _correct_, then migration
to that destination QEMU must fail.

In particular, enabling the new KVM capability needs to be gated by a
new machine type and/or -cpu flag, if migration compatibility is needed.
 (In particular, this is one reason why I haven't considered this series
for 3.1.  Right now, migration of nested hypervisors is completely
busted but if we make it "almost" work, pre-3.1 machine types would not
ever be able to add support for KVM_CAP_EXCEPTION_PAYLOAD.  Therefore,
it's better for users if we wait for one release more, and add support
for KVM_CAP_NESTED_STATE and KVM_CAP_EXCEPTION_PAYLOAD at the same time).

Personally, I would like to say that, starting from QEMU 3.2, enabling
nested VMX requires a 4.20 kernel.  It's a bit bold, but I think it's a
good way to keep some sanity.  Any opinions on that?

Paolo

> Therefore, I don't think that we want this versioning to be based on KVM_CAP 
> at all.
> It seems that we would want the process to behave as follows:
> 1) Mgmt-layer at dest queries dest host max supported nested_state size.
>(Which should be returned from kvm_check_extension(KVM_CAP_NESTED_STATE))
> 2) Mgmt-layer at source initiate migration to dest with requesting QEMU to 
> send nested_state 
>matching dest max supported nested_state size.
>When saving nested state using KVM_GET_NESTED_STATE IOCTL, QEMU will 
> specify in nested_state->size
>the *requested* size to be saved and KVM should be able to save only the 
> information which matches
>the version that worked with that size.
> 3) After some sanity checks on received migration stream, dest host use 
> KVM_SET_NESTED_STATE IOCTL.
>This IOCTL should deduce which information it should deploy based on given 
> nested_state->size.
> 
> This also makes me wonder if it's not just nicer to use nested_state->flags 
> to specify which
> information is actually present on nested_state instead of managing 
> versioning with nested_state->size.
> 
> What are your opinions on this?
> 
> -Liran
> 




Re: [Qemu-devel] [Qemu-arm] [PATCH v4 06/23] hw: acpi: Generalize AML build routines

2018-11-02 Thread Shannon Zhao




On 2018/11/1 18:22, Samuel Ortiz wrote:

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 0a6a88380a..6822ee4eaa 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -546,7 +546,7 @@ build_srat(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
  }
  
  static void

-build_mcfg(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
+virt_build_mcfg(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
  {
  AcpiTableMcfg *mcfg;
  const MemMapEntry *memmap = vms->memmap;
@@ -791,7 +791,7 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables 
*tables)
  build_gtdt(tables_blob, tables->linker, vms);
  
  acpi_add_table(table_offsets, tables_blob);

-build_mcfg(tables_blob, tables->linker, vms);
+virt_build_mcfg(tables_blob, tables->linker, vms);
Looks like it doesn't share build_mcfg with x86. Why do you still export
the x86 build_mcfg and introduce this unnecessary change?


Thanks,
Shannon



Re: [Qemu-devel] [PATCH] target/arm: Conditionalize arm_div assert on aarch32 support

2018-11-02 Thread Peter Maydell
On 1 November 2018 at 21:57, Richard Henderson
 wrote:
> When populating id registers from kvm, on a host that doesn't support
> aarch32 mode at all, aa32_arm_div will not be supported either.
>
> Signed-off-by: Richard Henderson 
> ---
>
> "Tested" on an APM Mustang, which does support AArch32.  I'm not
> sure, off hand, which cpu(s) don't have it, and Alex didn't say
> in his bug report.  Tsk tsk.  ;-)
>
>
> r~
>
> ---
>  target/arm/cpu.h |  5 +
>  target/arm/cpu.c | 10 +-
>  2 files changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index 895f9909d8..4521ad5ae8 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -3300,6 +3300,11 @@ static inline bool isar_feature_aa64_fp16(const 
> ARMISARegisters *id)
>  return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, FP) == 1;
>  }
>
> +static inline bool isar_feature_aa64_a32(const ARMISARegisters *id)
> +{
> +return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, EL0) == 2;
> +}
> +

Doesn't the stuff in the Arm ARM's "Principles of the ID
scheme for fields in ID registers" about signed and unsigned
values for ID register fields strictly mean you want to be
testing (unsigned) >= 2 here rather than strict equality?

thanks
-- PMM



Re: [Qemu-devel] [PULL 0/3] decodetree improvements

2018-11-02 Thread Peter Maydell
On 31 October 2018 at 16:53, Richard Henderson
 wrote:
> The following changes since commit a2e002ff7913ce93aa0f7dbedd2123dce5f1a9cd:
>
>   Merge remote-tracking branch 
> 'remotes/vivier2/tags/qemu-trivial-for-3.1-pull-request' into staging 
> (2018-10-30 15:49:55 +)
>
> are available in the Git repository at:
>
>   https://github.com/rth7680/qemu.git tags/pull-dt-20181031
>
> for you to fetch changes up to 6699ae6a8e74381583622502db8bd47fac381c9e:
>
>   decodetree: Allow multiple input files (2018-10-31 16:48:58 +)
>
> 
> Updates to decodetree.py for risc-v.
>
> 
> Richard Henderson (3):
>   decodetree: Add !extern flag to argument sets
>   decodetree: Remove "insn" argument from trans_* expanders
>   decodetree: Allow multiple input files

Applied, thanks.

-- PMM



Re: [Qemu-devel] [PATCH] target/arm: Conditionalize arm_div assert on aarch32 support

2018-11-02 Thread Richard Henderson
On 11/2/18 9:48 AM, Peter Maydell wrote:
>> +static inline bool isar_feature_aa64_a32(const ARMISARegisters *id)
>> +{
>> +return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, EL0) == 2;
>> +}
>> +
> 
> Doesn't the stuff in the Arm ARM's "Principles of the ID
> scheme for fields in ID registers" about signed and unsigned
> values for ID register fields strictly mean you want to be
> testing (unsigned) >= 2 here rather than strict equality?

Yes.  Will fix.


r~



Re: [Qemu-devel] [PATCH] docs/block-replication.txt: Add more detail about replication_do_checkpoint_all

2018-11-02 Thread Zhang Chen
Hi All,

Maybe we have forgotten this patch? Ping again.

Related discussion:
https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg03352.html

Thanks
Zhang Chen

On Fri, Sep 14, 2018 at 2:32 PM Zhang Chen  wrote:

> Hi All,
>
> No news update?
> Ping...
>
> Thanks
> Zhang Chen
>
> On Thu, Sep 6, 2018 at 12:12 AM Zhang Chen  wrote:
>
>> Add a more detailed description for the COLO checkpoint use case.
>> Suggested by Dr. David Alan Gilbert 
>>
>> Signed-off-by: Zhang Chen 
>> ---
>>  docs/block-replication.txt | 7 ---
>>  1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/docs/block-replication.txt b/docs/block-replication.txt
>> index 6bde6737fb..b6c5e94764 100644
>> --- a/docs/block-replication.txt
>> +++ b/docs/block-replication.txt
>> @@ -131,9 +131,10 @@ a. replication_start_all()
>> thread.
>>  b. replication_do_checkpoint_all()
>> This interface is called after all VM state is transferred to
>> -   Secondary QEMU. The Disk buffer will be dropped in this interface.
>> -   The caller must hold the I/O mutex lock if it is in
>> migration/checkpoint
>> -   thread.
>> +   secondary node, and on the primary node this interface does not need
>> +   to be called after synchronizing all the states. The Disk buffer will
>> +   be dropped in this interface. The caller must hold the I/O mutex lock
>> +   if it is in the migration/checkpoint thread.
>>  c. replication_get_error_all()
>> This interface is called to check if error happened in replication.
>> The caller must hold the I/O mutex lock if it is in
>> migration/checkpoint
>> --
>> 2.17.GIT
>>
>>


Re: [Qemu-devel] [PATCH v4 03/23] hw: acpi: Export the RSDP build API

2018-11-02 Thread Philippe Mathieu-Daudé

Hi,

On 2/11/18 10:20, Shannon Zhao wrote:

Hi,

On 2018/11/1 18:22, Samuel Ortiz wrote:

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index f28a2faa53..0ed132b79b 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -367,7 +367,7 @@ static void acpi_dsdt_add_power_button(Aml *scope)
  }
  /* RSDP */
-static GArray *
+static void
  build_rsdp(GArray *rsdp_table, BIOSLinker *linker, unsigned 
xsdt_tbl_offset)

  {
  AcpiRsdpDescriptor *rsdp = acpi_data_push(rsdp_table, sizeof 
*rsdp);


Why change this? It's not related to your patch purpose.


This patch updates include/hw/acpi/aml-build.h to export the 
build_rsdp() function.

Since this file includes this header, the prototype needs to match.

Regards,

Phil.



Thanks,
Shannon





[Qemu-devel] [PATCH 0/3] Performance improvements for xen_disk v2

2018-11-02 Thread Tim Smith
A series of performance improvements for disks using the Xen PV ring.

These have had fairly extensive testing.

The batching and latency improvements together boost the throughput
of small reads and writes by two to six percent (measured using fio
in the guest)

Avoiding repeated calls to posix_memalign() reduced the dirty heap
from 25MB to 5MB in the case of a single datapath process while also
improving performance.

v2 removes some checkpatch complaints and fixes the CCs

---

Tim Smith (3):
  Improve xen_disk batching behaviour
  Improve xen_disk response latency
  Avoid repeated memory allocation in xen_disk


 hw/block/xen_disk.c |   82 +--
 1 file changed, 46 insertions(+), 36 deletions(-)

--
Tim Smith 



[Qemu-devel] [PATCH 2/3] Improve xen_disk response latency

2018-11-02 Thread Tim Smith
If the I/O ring is full, the guest cannot send any more requests
until some responses are sent. Only sending all available responses
just before checking for new work does not leave much time for the
guest to supply new work, so this will cause stalls if the ring gets
full. Also, not completing reads as soon as possible adds latency
to the guest.

To alleviate that, complete IO requests as soon as they come back.
blk_send_response() already returns a value indicating whether
a notify should be sent, which is all the batching we need.

Signed-off-by: Tim Smith 
---
 hw/block/xen_disk.c |   43 ---
 1 file changed, 12 insertions(+), 31 deletions(-)
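
Condensed, the completion path after this patch amounts to the following
(complete_ioreq() is an illustrative name; the hunks below do this directly in
qemu_aio_complete() and in the ring-processing loop):

static void complete_ioreq(struct ioreq *ioreq)
{
    struct XenBlkDev *blkdev = ioreq->blkdev;

    /* Send the response for this request right away and notify the
     * front end only when blk_send_response() says a notify is due. */
    if (blk_send_response(ioreq)) {
        xen_pv_send_notify(&blkdev->xendev);
    }
    ioreq_release(ioreq);
}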

diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index cb2881b7e6..b506e23868 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -83,11 +83,9 @@ struct XenBlkDev {
 
 /* request lists */
 QLIST_HEAD(inflight_head, ioreq) inflight;
-QLIST_HEAD(finished_head, ioreq) finished;
 QLIST_HEAD(freelist_head, ioreq) freelist;
 int requests_total;
 int requests_inflight;
-int requests_finished;
 unsigned intmax_requests;
 
 gbooleanfeature_discard;
@@ -104,6 +102,9 @@ struct XenBlkDev {
 /* Threshold of in-flight requests above which we will start using
  * blk_io_plug()/blk_io_unplug() to batch requests */
 #define IO_PLUG_THRESHOLD 1
+
+static int blk_send_response(struct ioreq *ioreq);
+
 /* - */
 
 static void ioreq_reset(struct ioreq *ioreq)
@@ -155,12 +156,10 @@ static void ioreq_finish(struct ioreq *ioreq)
 struct XenBlkDev *blkdev = ioreq->blkdev;
 
 QLIST_REMOVE(ioreq, list);
-QLIST_INSERT_HEAD(&blkdev->finished, ioreq, list);
 blkdev->requests_inflight--;
-blkdev->requests_finished++;
 }
 
-static void ioreq_release(struct ioreq *ioreq, bool finish)
+static void ioreq_release(struct ioreq *ioreq)
 {
 struct XenBlkDev *blkdev = ioreq->blkdev;
 
@@ -168,11 +167,7 @@ static void ioreq_release(struct ioreq *ioreq, bool finish)
 ioreq_reset(ioreq);
 ioreq->blkdev = blkdev;
 QLIST_INSERT_HEAD(&blkdev->freelist, ioreq, list);
-if (finish) {
-blkdev->requests_finished--;
-} else {
-blkdev->requests_inflight--;
-}
+blkdev->requests_inflight--;
 }
 
 /*
@@ -351,6 +346,10 @@ static void qemu_aio_complete(void *opaque, int ret)
 default:
 break;
 }
+if (blk_send_response(ioreq)) {
+xen_pv_send_notify(&blkdev->xendev);
+}
+ioreq_release(ioreq);
 qemu_bh_schedule(blkdev->bh);
 
 done:
@@ -455,7 +454,7 @@ err:
 return -1;
 }
 
-static int blk_send_response_one(struct ioreq *ioreq)
+static int blk_send_response(struct ioreq *ioreq)
 {
 struct XenBlkDev  *blkdev = ioreq->blkdev;
 int   send_notify   = 0;
@@ -504,22 +503,6 @@ static int blk_send_response_one(struct ioreq *ioreq)
 return send_notify;
 }
 
-/* walk finished list, send outstanding responses, free requests */
-static void blk_send_response_all(struct XenBlkDev *blkdev)
-{
-struct ioreq *ioreq;
-int send_notify = 0;
-
-while (!QLIST_EMPTY(&blkdev->finished)) {
-ioreq = QLIST_FIRST(&blkdev->finished);
-send_notify += blk_send_response_one(ioreq);
-ioreq_release(ioreq, true);
-}
-if (send_notify) {
-xen_pv_send_notify(&blkdev->xendev);
-}
-}
-
 static int blk_get_request(struct XenBlkDev *blkdev, struct ioreq *ioreq, 
RING_IDX rc)
 {
 switch (blkdev->protocol) {
@@ -554,7 +537,6 @@ static void blk_handle_requests(struct XenBlkDev *blkdev)
 rp = blkdev->rings.common.sring->req_prod;
 xen_rmb(); /* Ensure we see queued requests up to 'rp'. */
 
-blk_send_response_all(blkdev);
 /* If there was more than IO_PLUG_THRESHOLD ioreqs in flight
  * when we got here, this is an indication that the bottleneck
  * is below us, so it's worth beginning to batch up I/O requests
@@ -597,10 +579,10 @@ static void blk_handle_requests(struct XenBlkDev *blkdev)
 break;
 };
 
-if (blk_send_response_one(ioreq)) {
+if (blk_send_response(ioreq)) {
 xen_pv_send_notify(&blkdev->xendev);
 }
-ioreq_release(ioreq, false);
+ioreq_release(ioreq);
 continue;
 }
 
@@ -646,7 +628,6 @@ static void blk_alloc(struct XenDevice *xendev)
 trace_xen_disk_alloc(xendev->name);
 
 QLIST_INIT(&blkdev->inflight);
-QLIST_INIT(&blkdev->finished);
 QLIST_INIT(&blkdev->freelist);
 
 blkdev->iothread = iothread_create(xendev->name, &err);




[Qemu-devel] [PATCH 3/3] Avoid repeated memory allocation in xen_disk

2018-11-02 Thread Tim Smith
xen_disk currently allocates memory to hold the data for each ioreq
as that ioreq is used, and frees it afterwards. Because it requires
page-aligned blocks, this interacts poorly with non-page-aligned
allocations and balloons the heap.

Instead, allocate the maximum possible requirement, which is
BLKIF_MAX_SEGMENTS_PER_REQUEST pages (currently 11 pages) when
the ioreq is created, and keep that allocation until it is destroyed.
Since the ioreqs themselves are re-used via a free list, this
should actually improve memory usage.

Signed-off-by: Tim Smith 
---
 hw/block/xen_disk.c |   11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index b506e23868..faaeefba29 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -112,7 +112,6 @@ static void ioreq_reset(struct ioreq *ioreq)
 memset(&ioreq->req, 0, sizeof(ioreq->req));
 ioreq->status = 0;
 ioreq->start = 0;
-ioreq->buf = NULL;
 ioreq->size = 0;
 ioreq->presync = 0;
 
@@ -137,6 +136,11 @@ static struct ioreq *ioreq_start(struct XenBlkDev *blkdev)
 /* allocate new struct */
 ioreq = g_malloc0(sizeof(*ioreq));
 ioreq->blkdev = blkdev;
+/* We cannot need more pages per ioreq than this, and we do re-use
+ * ioreqs, so allocate the memory once here, to be freed in
+ * blk_free() when the ioreq is freed. */
+ioreq->buf = qemu_memalign(XC_PAGE_SIZE, BLKIF_MAX_SEGMENTS_PER_REQUEST
+   * XC_PAGE_SIZE);
 blkdev->requests_total++;
 qemu_iovec_init(&ioreq->v, 1);
 } else {
@@ -313,14 +317,12 @@ static void qemu_aio_complete(void *opaque, int ret)
 if (ret == 0) {
 ioreq_grant_copy(ioreq);
 }
-qemu_vfree(ioreq->buf);
 break;
 case BLKIF_OP_WRITE:
 case BLKIF_OP_FLUSH_DISKCACHE:
 if (!ioreq->req.nr_segments) {
 break;
 }
-qemu_vfree(ioreq->buf);
 break;
 default:
 break;
@@ -392,12 +394,10 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
 {
 struct XenBlkDev *blkdev = ioreq->blkdev;
 
-ioreq->buf = qemu_memalign(XC_PAGE_SIZE, ioreq->size);
 if (ioreq->req.nr_segments &&
 (ioreq->req.operation == BLKIF_OP_WRITE ||
  ioreq->req.operation == BLKIF_OP_FLUSH_DISKCACHE) &&
 ioreq_grant_copy(ioreq)) {
-qemu_vfree(ioreq->buf);
 goto err;
 }
 
@@ -990,6 +990,7 @@ static int blk_free(struct XenDevice *xendev)
 ioreq = QLIST_FIRST(&blkdev->freelist);
 QLIST_REMOVE(ioreq, list);
 qemu_iovec_destroy(&ioreq->v);
+qemu_vfree(ioreq->buf);
 g_free(ioreq);
 }
 




[Qemu-devel] [PATCH 1/3] Improve xen_disk batching behaviour

2018-11-02 Thread Tim Smith
When I/O consists of many small requests, performance is improved by
batching them together in a single io_submit() call. When there are
relatively few requests, the extra overhead is not worth it. This
introduces a check to start batching I/O requests via blk_io_plug()/
blk_io_unplug() in an amount proportional to the number which were
already in flight at the time we started reading the ring.

Signed-off-by: Tim Smith 
---
 hw/block/xen_disk.c |   30 ++
 1 file changed, 30 insertions(+)
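
Stripped of the ring-walk details, the batching logic added below works roughly
like this (more_requests()/next_ioreq() are stand-ins for the real ring
processing):

int inflight_atstart = blkdev->requests_inflight;
int batched = 0;

/* Only batch when the bottleneck already appears to be below us. */
if (inflight_atstart > IO_PLUG_THRESHOLD) {
    blk_io_plug(blkdev->blk);
}
while (more_requests()) {
    struct ioreq *ioreq = next_ioreq();

    if (inflight_atstart > IO_PLUG_THRESHOLD &&
        batched >= inflight_atstart) {
        blk_io_unplug(blkdev->blk);        /* flush the current batch */
    }
    ioreq_runio_qemu_aio(ioreq);
    if (inflight_atstart > IO_PLUG_THRESHOLD) {
        if (batched >= inflight_atstart) {
            blk_io_plug(blkdev->blk);      /* start a new batch */
            batched = 0;
        } else {
            batched++;
        }
    }
}
if (inflight_atstart > IO_PLUG_THRESHOLD) {
    blk_io_unplug(blkdev->blk);            /* submit whatever is left */
}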

diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 36eff94f84..cb2881b7e6 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -101,6 +101,9 @@ struct XenBlkDev {
 AioContext  *ctx;
 };
 
+/* Threshold of in-flight requests above which we will start using
+ * blk_io_plug()/blk_io_unplug() to batch requests */
+#define IO_PLUG_THRESHOLD 1
 /* - */
 
 static void ioreq_reset(struct ioreq *ioreq)
@@ -542,6 +545,8 @@ static void blk_handle_requests(struct XenBlkDev *blkdev)
 {
 RING_IDX rc, rp;
 struct ioreq *ioreq;
+int inflight_atstart = blkdev->requests_inflight;
+int batched = 0;
 
 blkdev->more_work = 0;
 
@@ -550,6 +555,16 @@ static void blk_handle_requests(struct XenBlkDev *blkdev)
 xen_rmb(); /* Ensure we see queued requests up to 'rp'. */
 
 blk_send_response_all(blkdev);
+/* If there was more than IO_PLUG_THRESHOLD ioreqs in flight
+ * when we got here, this is an indication that the bottleneck
+ * is below us, so it's worth beginning to batch up I/O requests
+ * rather than submitting them immediately. The maximum number
+ * of requests we're willing to batch is the number already in
+ * flight, so it can grow up to max_requests when the bottleneck
+ * is below us */
+if (inflight_atstart > IO_PLUG_THRESHOLD) {
+blk_io_plug(blkdev->blk);
+}
 while (rc != rp) {
 /* pull request from ring */
 if (RING_REQUEST_CONS_OVERFLOW(&blkdev->rings.common, rc)) {
@@ -589,7 +604,22 @@ static void blk_handle_requests(struct XenBlkDev *blkdev)
 continue;
 }
 
+if (inflight_atstart > IO_PLUG_THRESHOLD &&
+batched >= inflight_atstart) {
+blk_io_unplug(blkdev->blk);
+}
 ioreq_runio_qemu_aio(ioreq);
+if (inflight_atstart > IO_PLUG_THRESHOLD) {
+if (batched >= inflight_atstart) {
+blk_io_plug(blkdev->blk);
+batched = 0;
+} else {
+batched++;
+}
+}
+}
+if (inflight_atstart > IO_PLUG_THRESHOLD) {
+blk_io_unplug(blkdev->blk);
 }
 
 if (blkdev->more_work && blkdev->requests_inflight < blkdev->max_requests) 
{




Re: [Qemu-devel] [PATCH v4 05/23] hw: arm: Switch to the AML build RSDP building routine

2018-11-02 Thread Shannon Zhao




On 2018/11/2 17:35, Shannon Zhao wrote:



On 2018/11/1 18:22, Samuel Ortiz wrote:

We make the ARM virt ACPI code use the now shared build_rsdp() API from
aml-build.c. By doing so we fix a bug where the ARM implementation was
missing adding both the legacy and extended checksums, which was
building an invalid RSDP table.

Signed-off-by: Samuel Ortiz 
---
  hw/arm/virt-acpi-build.c | 31 +--
  1 file changed, 1 insertion(+), 30 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 0ed132b79b..0a6a88380a 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -35,6 +35,7 @@
  #include "target/arm/cpu.h"
  #include "hw/acpi/acpi-defs.h"
  #include "hw/acpi/acpi.h"
+#include "hw/acpi/aml-build.h"
  #include "hw/nvram/fw_cfg.h"
  #include "hw/acpi/bios-linker-loader.h"
  #include "hw/loader.h"
@@ -366,36 +367,6 @@ static void acpi_dsdt_add_power_button(Aml *scope)
  aml_append(scope, dev);
  }
-/* RSDP */
-static void
-build_rsdp(GArray *rsdp_table, BIOSLinker *linker, unsigned 
xsdt_tbl_offset)
Note: here we use xsdt table not rsdt for ARM. After your change which 
assigns xsdt table address to rsdt_physical_address, it doesn't work.


Oops. I didn't notice your patch "[PATCH v4 04/23] hw: acpi: Implement 
XSDT support for RSDP".


Thanks,
Shannon



Re: [Qemu-devel] [PATCH v4 05/23] hw: arm: Switch to the AML build RSDP building routine

2018-11-02 Thread Igor Mammedov
On Fri, 2 Nov 2018 17:35:06 +0800
Shannon Zhao  wrote:

> On 2018/11/1 18:22, Samuel Ortiz wrote:
> > We make the ARM virt ACPI code use the now shared build_rsdp() API from
> > aml-build.c. By doing so we fix a bug where the ARM implementation was
> > missing adding both the legacy and extended checksums, which was
> > building an invalid RSDP table.
> > 
> > Signed-off-by: Samuel Ortiz 
> > ---
> >   hw/arm/virt-acpi-build.c | 31 +--
> >   1 file changed, 1 insertion(+), 30 deletions(-)
> > 
> > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> > index 0ed132b79b..0a6a88380a 100644
> > --- a/hw/arm/virt-acpi-build.c
> > +++ b/hw/arm/virt-acpi-build.c
> > @@ -35,6 +35,7 @@
> >   #include "target/arm/cpu.h"
> >   #include "hw/acpi/acpi-defs.h"
> >   #include "hw/acpi/acpi.h"
> > +#include "hw/acpi/aml-build.h"
> >   #include "hw/nvram/fw_cfg.h"
> >   #include "hw/acpi/bios-linker-loader.h"
> >   #include "hw/loader.h"
> > @@ -366,36 +367,6 @@ static void acpi_dsdt_add_power_button(Aml *scope)
> >   aml_append(scope, dev);
> >   }
> >   
> > -/* RSDP */
> > -static void
> > -build_rsdp(GArray *rsdp_table, BIOSLinker *linker, unsigned 
> > xsdt_tbl_offset)  
> Note: here we use xsdt table not rsdt for ARM. After your change which 
> assigns xsdt table address to rsdt_physical_address, it doesn't work.
Hi Shannon,

I do not really like how this series is organized/split, doing the refactoring
backwards, but that's another story and I'll make suggestions on how to make it
better in per-patch review later.

But as far as I can see this change should only add the legacy checksum over
the current rsdp, which is a bug fix and makes the table spec compliant.
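
For reference, carrying both checksums means ending up with two linker checksum
entries over the RSDP, roughly as below (a sketch against the struct used in the
quoted code; the extended_checksum field name and the 20-byte legacy region are
assumed from the ACPI 2.0 RSDP layout):

/* Legacy checksum: covers only the first 20 bytes (the ACPI 1.0 part). */
bios_linker_loader_add_checksum(linker, ACPI_BUILD_RSDP_FILE,
    (char *)rsdp - rsdp_table->data, 20,
    (char *)&rsdp->checksum - rsdp_table->data);
/* Extended checksum: covers the whole revision-2 structure. */
bios_linker_loader_add_checksum(linker, ACPI_BUILD_RSDP_FILE,
    (char *)rsdp - rsdp_table->data, sizeof *rsdp,
    (char *)&rsdp->extended_checksum - rsdp_table->data);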

Could you point out why it doesn't work and what exactly breaks?


> IIRC, we discussed this before that ARM virt uses xsdt while pc/q35 uses 
> rsdt. So this patch is not necessary I think.
> 
> Thanks,
> Shannon
> > -{
> > -AcpiRsdpDescriptor *rsdp = acpi_data_push(rsdp_table, sizeof *rsdp);
> > -unsigned xsdt_pa_size = sizeof(rsdp->xsdt_physical_address);
> > -unsigned xsdt_pa_offset =
> > -(char *)&rsdp->xsdt_physical_address - rsdp_table->data;
> > -
> > -bios_linker_loader_alloc(linker, ACPI_BUILD_RSDP_FILE, rsdp_table, 16,
> > - true /* fseg memory */);
> > -
> > -memcpy(&rsdp->signature, "RSD PTR ", sizeof(rsdp->signature));
> > -memcpy(rsdp->oem_id, ACPI_BUILD_APPNAME6, sizeof(rsdp->oem_id));
> > -rsdp->length = cpu_to_le32(sizeof(*rsdp));
> > -rsdp->revision = 0x02;
> > -
> > -/* Address to be filled by Guest linker */
> > -bios_linker_loader_add_pointer(linker,
> > -ACPI_BUILD_RSDP_FILE, xsdt_pa_offset, xsdt_pa_size,
> > -ACPI_BUILD_TABLE_FILE, xsdt_tbl_offset);
> > -
> > -/* Checksum to be filled by Guest linker */
> > -bios_linker_loader_add_checksum(linker, ACPI_BUILD_RSDP_FILE,
> > -(char *)rsdp - rsdp_table->data, sizeof *rsdp,
> > -(char *)&rsdp->checksum - rsdp_table->data);
> > -
> > -return rsdp_table;
> > -}
> > -
> >   static void
> >   build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
> >   {
> >   




[Qemu-devel] [PATCH v2] target/arm: Conditionalize some asserts on aarch32 support

2018-11-02 Thread Richard Henderson
When populating id registers from kvm, on a host that doesn't support
aarch32 mode at all, neither arm_div nor jazelle will be supported.

Signed-off-by: Richard Henderson 
---

v2: Test aa64pfr.el0 >= 2; rename to isar_feature_aa64_aa32.
Pull out realizefn test to no_aa32 bool; use it for jazelle as well.

Alex, can you give this a test please?


r~

---
 target/arm/cpu.h |  5 +
 target/arm/cpu.c | 15 +--
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 895f9909d8..5c2c77c31d 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3300,6 +3300,11 @@ static inline bool isar_feature_aa64_fp16(const 
ARMISARegisters *id)
 return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, FP) == 1;
 }
 
+static inline bool isar_feature_aa64_aa32(const ARMISARegisters *id)
+{
+return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, EL0) >= 2;
+}
+
 static inline bool isar_feature_aa64_sve(const ARMISARegisters *id)
 {
 return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, SVE) != 0;
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index e08a2d2d79..d4dc0bc225 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -774,6 +774,7 @@ static void arm_cpu_realizefn(DeviceState *dev, Error 
**errp)
 CPUARMState *env = &cpu->env;
 int pagebits;
 Error *local_err = NULL;
+bool no_aa32 = false;
 
 /* If we needed to query the host kernel for the CPU features
  * then it's possible that might have failed in the initfn, but
@@ -820,6 +821,16 @@ static void arm_cpu_realizefn(DeviceState *dev, Error 
**errp)
 set_feature(env, ARM_FEATURE_V7VE);
 }
 }
+
+/*
+ * There exist AArch64 cpus without AArch32 support.  When KVM
+ * queries ID_ISAR0_EL1 on such a host, the value is UNKNOWN.
+ * Similarly, we cannot check ID_AA64PFR0 without AArch64 support.
+ */
+if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
+no_aa32 = !cpu_isar_feature(aa64_aa32, cpu);
+}
+
 if (arm_feature(env, ARM_FEATURE_V7VE)) {
 /* v7 Virtualization Extensions. In real hardware this implies
  * EL2 and also the presence of the Security Extensions.
@@ -829,7 +840,7 @@ static void arm_cpu_realizefn(DeviceState *dev, Error 
**errp)
  * Presence of EL2 itself is ARM_FEATURE_EL2, and of the
  * Security Extensions is ARM_FEATURE_EL3.
  */
-assert(cpu_isar_feature(arm_div, cpu));
+assert(no_aa32 || cpu_isar_feature(arm_div, cpu));
 set_feature(env, ARM_FEATURE_LPAE);
 set_feature(env, ARM_FEATURE_V7);
 }
@@ -855,7 +866,7 @@ static void arm_cpu_realizefn(DeviceState *dev, Error 
**errp)
 if (arm_feature(env, ARM_FEATURE_V6)) {
 set_feature(env, ARM_FEATURE_V5);
 if (!arm_feature(env, ARM_FEATURE_M)) {
-assert(cpu_isar_feature(jazelle, cpu));
+assert(no_aa32 || cpu_isar_feature(jazelle, cpu));
 set_feature(env, ARM_FEATURE_AUXCR);
 }
 }
-- 
2.17.2




[Qemu-devel] [PATCH v1 0/5] s390x/vfio: VFIO-AP interrupt control interception

2018-11-02 Thread Pierre Morel
The S390 PQAP/AQIC instruction can be intercepted by the host
to configure the AP queues' interruption handling, and to handle
the ISC used by the host and the guest and the indicator address.

This patch series defines the AQIC feature in the cpumodel,
extends the APDevice type for per-queue interrupt handling,
intercepts the PQAP/AQIC instruction, uses the S390 adapter interface
to set up the adapter and uses a VFIO ioctl to let the VFIO-AP
driver handle the host instruction associated with the intercepted
guest instruction.

This patch series can be tested with the Linux/KVM patch series
for the VFIO-AP driver: "s390: vfio: ap: Using GISA for AP Interrupt"

Pierre Morel (5):
  s390x/vfio: ap: Linux uapi VFIO place holder
  s390x/cpumodel: Set up CPU model for AQIC interception
  s390x/vfio: ap: Definition for AP Adapter type
  s390x/vfio: ap: Intercepting AP Queue Interrupt Control
  s390x/vfio: ap: Implementing AP Queue Interrupt Control

 hw/vfio/ap.c| 100 
 include/hw/s390x/ap-device.h|  55 ++
 include/hw/s390x/css.h  |   1 +
 linux-headers/linux/vfio.h  |  22 +++
 target/s390x/cpu_features.c |   1 +
 target/s390x/cpu_features_def.h |   1 +
 target/s390x/cpu_models.c   |   1 +
 target/s390x/kvm.c  |  20 +++
 8 files changed, 201 insertions(+)

-- 
2.17.0




[Qemu-devel] [PATCH v1 1/5] s390x/vfio: ap: Linux uapi VFIO place holder

2018-11-02 Thread Pierre Morel
This file will be copied from Linux;
I put it here for the review.

Signed-off-by: Pierre Morel 
---
 linux-headers/linux/vfio.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index ceb6453394..32b1fec362 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -816,6 +816,28 @@ struct vfio_iommu_spapr_tce_remove {
 };
 #define VFIO_IOMMU_SPAPR_TCE_REMOVE_IO(VFIO_TYPE, VFIO_BASE + 20)
 
+/**
+ * VFIO_AP_SET_IRQ - _IOWR(VFIO_TYPE, VFIO_BASE + 21, struct vfio_ap_aqic)
+ *
+ * Setup IRQ for an AP Queue
+ * @cmd contains the AP queue number (apqn)
+ * @status receives the resulting status of the command
+ * @nib is the Notification Indicator byte address
+ * @adapter_id allows to retrieve the associated adapter
+ */
+struct vfio_ap_aqic {
+   __u32   argsz;
+   __u32   flags;
+   /* in */
+   __u64 cmd;
+   __u64 status;
+   __u64 nib;
+   __u32 adapter_id;
+};
+#define VFIO_AP_SET_IRQ_IO(VFIO_TYPE, VFIO_BASE + 21)
+#define VFIO_AP_CLEAR_IRQ  _IO(VFIO_TYPE, VFIO_BASE + 22)
+
 /* * */
 
+
 #endif /* VFIO_H */
-- 
2.17.0




[Qemu-devel] [PATCH v1 2/5] s390x/cpumodel: Set up CPU model for AQIC interception

2018-11-02 Thread Pierre Morel
A new CPU model facility is introduced to support AP device
interruption interception for a KVM guest.

CPU model facility:

The S390_FEAT_AP_QUEUE_INTERRUPT_CONTROL CPU facility indicates
whether AP interruption interception is available to the guest.
This feature will be enabled only if the AP instructions are
available on the Linux host and the AQIC facility is installed on
the host.

This feature must be turned on from userspace to intercept AP
instructions in the KVM guest. The QEMU command line to turn
this feature on looks something like this:

qemu-system-s390x ... -cpu xxx,aqic=on ...

Signed-off-by: Pierre Morel 
---
 target/s390x/cpu_features.c | 1 +
 target/s390x/cpu_features_def.h | 1 +
 target/s390x/cpu_models.c   | 1 +
 3 files changed, 3 insertions(+)

diff --git a/target/s390x/cpu_features.c b/target/s390x/cpu_features.c
index 60cfeba48f..c464abf30a 100644
--- a/target/s390x/cpu_features.c
+++ b/target/s390x/cpu_features.c
@@ -84,6 +84,7 @@ static const S390FeatDef s390_features[] = {
 FEAT_INIT("sema", S390_FEAT_TYPE_STFL, 59, "Semaphore-assist facility"),
 FEAT_INIT("tsi", S390_FEAT_TYPE_STFL, 60, "Time-slice Instrumentation 
facility"),
 FEAT_INIT("ri", S390_FEAT_TYPE_STFL, 64, "CPU runtime-instrumentation 
facility"),
+FEAT_INIT("aqic", S390_FEAT_TYPE_STFL, 65, "AP-Queue interruption Control 
facility"),
 FEAT_INIT("zpci", S390_FEAT_TYPE_STFL, 69, "z/PCI facility"),
 FEAT_INIT("aen", S390_FEAT_TYPE_STFL, 71, 
"General-purpose-adapter-event-notification facility"),
 FEAT_INIT("ais", S390_FEAT_TYPE_STFL, 72, 
"General-purpose-adapter-interruption-suppression facility"),
diff --git a/target/s390x/cpu_features_def.h b/target/s390x/cpu_features_def.h
index 5fc7e7bf01..3f22780104 100644
--- a/target/s390x/cpu_features_def.h
+++ b/target/s390x/cpu_features_def.h
@@ -72,6 +72,7 @@ typedef enum {
 S390_FEAT_SEMAPHORE_ASSIST,
 S390_FEAT_TIME_SLICE_INSTRUMENTATION,
 S390_FEAT_RUNTIME_INSTRUMENTATION,
+S390_FEAT_AP_QUEUE_INTERRUPT_CONTROL,
 S390_FEAT_ZPCI,
 S390_FEAT_ADAPTER_EVENT_NOTIFICATION,
 S390_FEAT_ADAPTER_INT_SUPPRESSION,
diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c
index 7c253ff308..6b5e94b9f6 100644
--- a/target/s390x/cpu_models.c
+++ b/target/s390x/cpu_models.c
@@ -788,6 +788,7 @@ static void check_consistency(const S390CPUModel *model)
 { S390_FEAT_SIE_KSS, S390_FEAT_SIE_F2 },
 { S390_FEAT_AP_QUERY_CONFIG_INFO, S390_FEAT_AP },
 { S390_FEAT_AP_FACILITIES_TEST, S390_FEAT_AP },
+{ S390_FEAT_AP_QUEUE_INTERRUPT_CONTROL, S390_FEAT_AP },
 };
 int i;
 
-- 
2.17.0




[Qemu-devel] [PATCH v1 4/5] s390x/vfio: ap: Intercepting AP Queue Interrupt Control

2018-11-02 Thread Pierre Morel
From: Pierre Morel 

We intercept the PQAP(AQIC) instruction.

Until we implement AQIC we return a PGM_OPERATION.

Signed-off-by: Pierre Morel 
---
 hw/vfio/ap.c | 10 ++
 include/hw/s390x/ap-device.h |  9 +
 target/s390x/kvm.c   | 20 
 3 files changed, 39 insertions(+)

diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
index 3962bb74e5..d8d9cadc46 100644
--- a/hw/vfio/ap.c
+++ b/hw/vfio/ap.c
@@ -38,6 +38,16 @@ typedef struct VFIOAPDevice {
 #define VFIO_AP_DEVICE(obj) \
 OBJECT_CHECK(VFIOAPDevice, (obj), VFIO_AP_DEVICE_TYPE)
 
+/*
+ * ap_pqap
+ * @env: environment pointing to registers
+ * return value: Code Condition
+ */
+int ap_pqap(CPUS390XState *env)
+{
+return -PGM_OPERATION;
+}
+
 static void vfio_ap_compute_needs_reset(VFIODevice *vdev)
 {
 vdev->needs_reset = false;
diff --git a/include/hw/s390x/ap-device.h b/include/hw/s390x/ap-device.h
index 765e9082a3..a83ea096c7 100644
--- a/include/hw/s390x/ap-device.h
+++ b/include/hw/s390x/ap-device.h
@@ -19,4 +19,13 @@ typedef struct APDevice {
 #define AP_DEVICE(obj) \
 OBJECT_CHECK(APDevice, (obj), AP_DEVICE_TYPE)
 
+#define AP_DEVICE_GET_CLASS(obj) \
+OBJECT_GET_CLASS(APDeviceClass, (obj), AP_DEVICE_TYPE)
+
+#define AP_DEVICE_CLASS(klass) \
+OBJECT_CLASS_CHECK(APDeviceClass, (klass), AP_DEVICE_TYPE)
+
+#include "cpu.h"
+int ap_pqap(CPUS390XState *env);
+
 #endif /* HW_S390X_AP_DEVICE_H */
diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
index 2ebf26adfe..3eac59549d 100644
--- a/target/s390x/kvm.c
+++ b/target/s390x/kvm.c
@@ -45,6 +45,7 @@
 #include "trace.h"
 #include "hw/s390x/s390-pci-inst.h"
 #include "hw/s390x/s390-pci-bus.h"
+#include "hw/s390x/ap-device.h"
 #include "hw/s390x/ipl.h"
 #include "hw/s390x/ebcdic.h"
 #include "exec/memattrs.h"
@@ -88,6 +89,7 @@
 #define PRIV_B2_CHSC0x5f
 #define PRIV_B2_SIGA0x74
 #define PRIV_B2_XSCH0x76
+#define PRIV_B2_PQAP0xaf
 
 #define PRIV_EB_SQBS0x8a
 #define PRIV_EB_PCISTB  0xd0
@@ -1154,6 +1156,21 @@ static int kvm_sclp_service_call(S390CPU *cpu, struct 
kvm_run *run,
 return 0;
 }
 
+static int kvm_ap_pqap(S390CPU *cpu, uint16_t ipbh0)
+{
+int r;
+
+r = ap_pqap(&cpu->env);
+
+if (r < 0) {
+kvm_s390_program_interrupt(cpu, -r);
+} else {
+setcc(cpu, r);
+}
+
+return 0;
+}
+
 static int handle_b2(S390CPU *cpu, struct kvm_run *run, uint8_t ipa1)
 {
 CPUS390XState *env = &cpu->env;
@@ -1216,6 +1233,9 @@ static int handle_b2(S390CPU *cpu, struct kvm_run *run, 
uint8_t ipa1)
 case PRIV_B2_SCLP_CALL:
 rc = kvm_sclp_service_call(cpu, run, ipbh0);
 break;
+case PRIV_B2_PQAP:
+rc = kvm_ap_pqap(cpu, ipbh0);
+break;
 default:
 rc = -1;
 DPRINTF("KVM: unhandled PRIV: 0xb2%x\n", ipa1);
-- 
2.17.0




[Qemu-devel] [PATCH v1 5/5] s390x/vfio: ap: Implementing AP Queue Interrupt Control

2018-11-02 Thread Pierre Morel
We intercept the PQAP(AQIC) instruction and transform
the guest's AQIC command parameters into the host AQIC
parameters.

In doing this we use the standard adapter interface to provide
the adapter NIB, indicator and ISC.

We define a new structure, APQueue, to keep track of
the route and indicator address, and we add an array of
AP queues to the VFIOAPDevice.

We call the VFIO ioctl to set or clear the interruption
according to the "i" bit of the parameter.

Signed-off-by: Pierre Morel 
---
 hw/vfio/ap.c | 92 +++-
 include/hw/s390x/ap-device.h | 46 ++
 2 files changed, 137 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
index d8d9cadc46..67a46e163e 100644
--- a/hw/vfio/ap.c
+++ b/hw/vfio/ap.c
@@ -27,17 +27,90 @@
 #include "sysemu/sysemu.h"
 #include "hw/s390x/ap-bridge.h"
 #include "exec/address-spaces.h"
+#include "hw/s390x/s390_flic.h"
+#include "hw/s390x/css.h"
 
 #define VFIO_AP_DEVICE_TYPE  "vfio-ap"
 
 typedef struct VFIOAPDevice {
 APDevice apdev;
 VFIODevice vdev;
+QTAILQ_ENTRY(VFIOAPDevice) sibling;
+APQueue apq[MAX_AP][MAX_DOMAIN];
 } VFIOAPDevice;
 
 #define VFIO_AP_DEVICE(obj) \
 OBJECT_CHECK(VFIOAPDevice, (obj), VFIO_AP_DEVICE_TYPE)
 
+VFIOAPDevice *vfio_apdev;
+static APDevice *matrix;
+
+static int ap_aqic(CPUS390XState *env)
+{
+struct pqap_cmd cmd = reg2cmd(env->regs[0]);
+struct ap_status status = reg2status(env->regs[1]);
+uint64_t guest_nib = env->regs[2];
+struct vfio_ap_aqic param = {};
+int retval;
+VFIODevice *vdev;
+VFIOAPDevice *ap_vdev;
+APQueue *apq;
+
+ap_vdev = DO_UPCAST(VFIOAPDevice, apdev, matrix);
+apq = &ap_vdev->apq[cmd.apid][cmd.apqi];
+vdev = &ap_vdev->vdev;
+
+if (status.irq) {
+if (apq->nib) {
+status.rc = AP_RC_BAD_STATE;
+goto error;
+}
+} else {
+if (!apq->nib) {
+status.rc = AP_RC_BAD_STATE;
+goto error;
+}
+}
+if (!guest_nib) {
+status.rc = AP_RC_INVALID_ADDR;
+goto error;
+}
+
+apq->routes.adapter.adapter_id = css_get_adapter_id(
+   CSS_IO_ADAPTER_AP, status.isc);
+
+apq->nib = get_indicator(ldq_p(&guest_nib), 8);
+
+retval = map_indicator(&apq->routes.adapter, apq->nib);
+if (retval) {
+status.rc = AP_RC_INVALID_ADDR;
+env->regs[1] = status2reg(status);
+goto error;
+}
+
+param.cmd = env->regs[0];
+param.status = env->regs[1];
+param.nib = env->regs[2];
+param.adapter_id = apq->routes.adapter.adapter_id;
+param.argsz = sizeof(param);
+
+retval = ioctl(vdev->fd, VFIO_AP_SET_IRQ, ¶m);
+status = reg2status(param.status);
+if (retval) {
+goto err_ioctl;
+}
+
+env->regs[1] = param.status;
+
+return 0;
+err_ioctl:
+release_indicator(&apq->routes.adapter, apq->nib);
+apq->nib = NULL;
+error:
+env->regs[1] = status2reg(status);
+return 0;
+}
+
 /*
  * ap_pqap
  * @env: environment pointing to registers
@@ -45,7 +118,20 @@ typedef struct VFIOAPDevice {
  */
 int ap_pqap(CPUS390XState *env)
 {
-return -PGM_OPERATION;
+struct pqap_cmd cmd = reg2cmd(env->regs[0]);
+int cc = 0;
+
+switch (cmd.fc) {
+case AQIC:
+if (!s390_has_feat(S390_FEAT_AP_QUEUE_INTERRUPT_CONTROL)) {
+return -PGM_OPERATION;
+}
+cc = ap_aqic(env);
+break;
+default:
+return -PGM_OPERATION;
+}
+return cc;
 }
 
 static void vfio_ap_compute_needs_reset(VFIODevice *vdev)
@@ -119,6 +205,9 @@ static void vfio_ap_realize(DeviceState *dev, Error **errp)
 goto out_get_dev_err;
 }
 
+matrix = apdev;
+css_register_io_adapters(CSS_IO_ADAPTER_AP, true, false,
+ 0, &error_abort);
 return;
 
 out_get_dev_err:
@@ -135,6 +224,7 @@ static void vfio_ap_unrealize(DeviceState *dev, Error 
**errp)
 VFIOGroup *group = vapdev->vdev.group;
 
 vfio_ap_put_device(vapdev);
+matrix = NULL;
 vfio_put_group(group);
 }
 
diff --git a/include/hw/s390x/ap-device.h b/include/hw/s390x/ap-device.h
index a83ea096c7..bc2b7bcd8e 100644
--- a/include/hw/s390x/ap-device.h
+++ b/include/hw/s390x/ap-device.h
@@ -28,4 +28,50 @@ typedef struct APDevice {
 #include "cpu.h"
 int ap_pqap(CPUS390XState *env);
 
+#define MAX_AP 256
+#define MAX_DOMAIN 256
+
+#include "hw/s390x/s390_flic.h"
+#include "hw/s390x/css.h"
+typedef struct APQueue {
+uint32_t apid;
+uint32_t apqi;
+AdapterRoutes routes;
+IndAddr *nib;
+} APQueue;
+
+/* AP PQAP commands definitions */
+#define AQIC 0x03
+
+struct pqap_cmd {
+uint32_t unused;
+uint8_t fc;
+unsigned t:1;
+unsigned reserved:7;
+uint8_t apid;
+uint8_t apqi;
+};
+/* AP status returned by the AP PQAP commands */
+#define AP_RC_APQN_INVALID 0x01
+#define AP_RC_INVALID_ADDR 0x06
+#define AP_RC_BAD_STATE0x07
+
+struct ap_

[Qemu-devel] [PATCH v1 3/5] s390x/vfio: ap: Definition for AP Adapter type

2018-11-02 Thread Pierre Morel
From: Pierre Morel 

Let's define the AP adapter type to use it with standard
adapter interface.

Signed-off-by: Pierre Morel 
---
 include/hw/s390x/css.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/hw/s390x/css.h b/include/hw/s390x/css.h
index aae19c4272..9946492214 100644
--- a/include/hw/s390x/css.h
+++ b/include/hw/s390x/css.h
@@ -217,6 +217,7 @@ IOInstEnding do_subchannel_work_passthrough(SubchDev *sub);
 typedef enum {
 CSS_IO_ADAPTER_VIRTIO = 0,
 CSS_IO_ADAPTER_PCI = 1,
+CSS_IO_ADAPTER_AP = 2,
 CSS_IO_ADAPTER_TYPE_NUMS,
 } CssIoAdapterType;
 
-- 
2.17.0




Re: [Qemu-devel] [PULL 0/3] migration queue

2018-11-02 Thread Peter Maydell
On 31 October 2018 at 16:57, Dr. David Alan Gilbert (git)
 wrote:
> From: "Dr. David Alan Gilbert" 
>
> The following changes since commit a2e002ff7913ce93aa0f7dbedd2123dce5f1a9cd:
>
>   Merge remote-tracking branch 
> 'remotes/vivier2/tags/qemu-trivial-for-3.1-pull-request' into staging 
> (2018-10-30 15:49:55 +)
>
> are available in the Git repository at:
>
>   git://github.com/dagrh/qemu.git tags/pull-migration-20181031a
>
> for you to fetch changes up to 3d63da16fbcd05405efd5946000cdb45474a9bad:
>
>   migration: avoid segmentfault when take a snapshot of a VM which being 
> migrated (2018-10-31 09:38:59 +)
>
> 
> Minor migration fixes 2018-10-31
>
> 

Applied, thanks.

-- PMM



Re: [Qemu-devel] [PATCH] nvme: fix oob access issue(CVE-2018-16847)

2018-11-02 Thread Kevin Wolf
Am 02.11.2018 um 02:22 hat Li Qiang geschrieben:
> Currently, the nvme_cmb_ops mr doesn't check the addr and size.
> This can lead to an OOB access issue. This is triggerable from the guest.
> Add a check to avoid this issue.
> 
> Fixes CVE-2018-16847.
> 
> Reported-by: Li Qiang 
> Reviewed-by: Paolo Bonzini 
> Signed-off-by: Li Qiang 
> ---
>  hw/block/nvme.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index fc7dacb..d097add 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -1175,6 +1175,10 @@ static void nvme_cmb_write(void *opaque, hwaddr addr, 
> uint64_t data,
>  unsigned size)
>  {
>  NvmeCtrl *n = (NvmeCtrl *)opaque;
> +
> +if (addr + size > NVME_CMBSZ_GETSIZE(n->bar.cmbsz)) {

What prevents a guest from moving the device to the end of the address
space and causing an integer overflow in addr + size?

If this happens, we still have .max_access_size = 8. The next question is
then, is NVME_CMBSZ_GETSIZE guaranteed to be at least 8? I suppose yes,
but do we want to rely on this for security?
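
One way to make the check robust no matter where the guest maps the CMB is to
compare against the remaining space instead of computing addr + size; a minimal
sketch reusing the macro from the patch (untested, illustration only):

const uint64_t cmb_size = NVME_CMBSZ_GETSIZE(n->bar.cmbsz);

/* Reject the access unless it fits entirely inside the CMB, without
 * relying on addr + size not wrapping around. */
if (addr >= cmb_size || size > cmb_size - addr) {
    return;
}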

Kevin



[Qemu-devel] [PATCH] qapi: misc: change the 'pc' to unsinged 64 in CpuInfo

2018-11-02 Thread Li Qiang
When triggering a 'query-cpus' QMP command, the pc is reported as a signed
value like the following:
{"arch": "x86", ...  "pc": -1732653994, "halted": true,...}
This is strange. Change it to uint64_t.

Signed-off-by: Li Qiang 
---
 qapi/misc.json | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/qapi/misc.json b/qapi/misc.json
index 6c1c5c0a37..621ec6ce13 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -407,7 +407,7 @@
 #
 # Since: 2.6
 ##
-{ 'struct': 'CpuInfoX86', 'data': { 'pc': 'int' } }
+{ 'struct': 'CpuInfoX86', 'data': { 'pc': 'uint64' } }
 
 ##
 # @CpuInfoSPARC:
@@ -420,7 +420,7 @@
 #
 # Since: 2.6
 ##
-{ 'struct': 'CpuInfoSPARC', 'data': { 'pc': 'int', 'npc': 'int' } }
+{ 'struct': 'CpuInfoSPARC', 'data': { 'pc': 'uint64', 'npc': 'uint64' } }
 
 ##
 # @CpuInfoPPC:
@@ -431,7 +431,7 @@
 #
 # Since: 2.6
 ##
-{ 'struct': 'CpuInfoPPC', 'data': { 'nip': 'int' } }
+{ 'struct': 'CpuInfoPPC', 'data': { 'nip': 'uint64' } }
 
 ##
 # @CpuInfoMIPS:
@@ -442,7 +442,7 @@
 #
 # Since: 2.6
 ##
-{ 'struct': 'CpuInfoMIPS', 'data': { 'PC': 'int' } }
+{ 'struct': 'CpuInfoMIPS', 'data': { 'PC': 'uint64' } }
 
 ##
 # @CpuInfoTricore:
@@ -453,7 +453,7 @@
 #
 # Since: 2.6
 ##
-{ 'struct': 'CpuInfoTricore', 'data': { 'PC': 'int' } }
+{ 'struct': 'CpuInfoTricore', 'data': { 'PC': 'uint64' } }
 
 ##
 # @CpuInfoRISCV:
@@ -464,7 +464,7 @@
 #
 # Since 2.12
 ##
-{ 'struct': 'CpuInfoRISCV', 'data': { 'pc': 'int' } }
+{ 'struct': 'CpuInfoRISCV', 'data': { 'pc': 'uint64' } }
 
 ##
 # @CpuS390State:
-- 
2.11.0




[Qemu-devel] xen_disk qdevification (was: [PATCH 0/3] Performance improvements for xen_disk v2)

2018-11-02 Thread Kevin Wolf
Am 02.11.2018 um 11:00 hat Tim Smith geschrieben:
> A series of performance improvements for disks using the Xen PV ring.
> 
> These have had fairly extensive testing.
> 
> The batching and latency improvements together boost the throughput
> of small reads and writes by two to six percent (measured using fio
> in the guest)
> 
> Avoiding repeated calls to posix_memalign() reduced the dirty heap
> from 25MB to 5MB in the case of a single datapath process while also
> improving performance.
> 
> v2 removes some checkpatch complaints and fixes the CCs

Completely unrelated, but since you're the first person touching
xen_disk in a while, you're my victim:

At KVM Forum we discussed sending a patch to deprecate xen_disk because
after all those years, it still hasn't been converted to qdev. Markus is
currently fixing some other not yet qdevified block device, but after
that xen_disk will be the only one left.

A while ago, a downstream patch review found out that there are some QMP
commands that would immediately crash if a xen_disk device were present
because of the lacking qdevification. This is not the code quality
standard I envision for QEMU. It's time for non-qdev devices to go.

So if you guys are still interested in the device, could someone please
finally look into converting it?

Kevin



Re: [Qemu-devel] [PATCH] qemu/units: Move out QCow2 specific definitions

2018-11-02 Thread Kevin Wolf
Am 02.11.2018 um 09:58 hat Philippe Mathieu-Daudé geschrieben:
> These definitions are QCow2 specific; there is no need to expose them
> in the global namespace.
> 
> This partially reverts commit 540b8492618eb.
> 
> Signed-off-by: Philippe Mathieu-Daudé 

If we don't want this globally, I think we also don't want it in qcow2.
Or at least reduce it to only those constants that qcow2 actually uses.

Kevin



Re: [Qemu-devel] xen_disk qdevification (was: [PATCH 0/3] Performance improvements for xen_disk v2)

2018-11-02 Thread Paul Durrant
> -Original Message-
> From: Kevin Wolf [mailto:kw...@redhat.com]
> Sent: 02 November 2018 11:04
> To: Tim Smith 
> Cc: xen-de...@lists.xenproject.org; qemu-devel@nongnu.org; qemu-
> bl...@nongnu.org; Anthony Perard ; Paul Durrant
> ; Stefano Stabellini ;
> Max Reitz ; arm...@redhat.com
> Subject: xen_disk qdevification (was: [PATCH 0/3] Performance improvements
> for xen_disk v2)
> 
> Am 02.11.2018 um 11:00 hat Tim Smith geschrieben:
> > A series of performance improvements for disks using the Xen PV ring.
> >
> > These have had fairly extensive testing.
> >
> > The batching and latency improvements together boost the throughput
> > of small reads and writes by two to six percent (measured using fio
> > in the guest)
> >
> > Avoiding repeated calls to posix_memalign() reduced the dirty heap
> > from 25MB to 5MB in the case of a single datapath process while also
> > improving performance.
> >
> > v2 removes some checkpatch complaints and fixes the CCs
> 
> Completely unrelated, but since you're the first person touching
> xen_disk in a while, you're my victim:
> 
> At KVM Forum we discussed sending a patch to deprecate xen_disk because
> after all those years, it still hasn't been converted to qdev. Markus is
> currently fixing some other not yet qdevified block device, but after
> that xen_disk will be the only one left.
> 
> A while ago, a downstream patch review found out that there are some QMP
> commands that would immediately crash if a xen_disk device were present
> because of the lacking qdevification. This is not the code quality
> standard I envision for QEMU. It's time for non-qdev devices to go.
> 
> So if you guys are still interested in the device, could someone please
> finally look into converting it?
> 

I have a patch series to do exactly this. It's somewhat involved as I need to 
convert the whole PV backend infrastructure. I will try to rebase and clean up 
my series a.s.a.p.

  Paul

> Kevin



Re: [Qemu-devel] [PATCH for-3.1] hw/ppc/mac_newworld: Free openpic_irqs array after use

2018-11-02 Thread Mark Cave-Ayland
On 01/11/2018 16:17, Peter Maydell wrote:

> In ppc_core99_init(), we allocate an openpic_irqs array, which
> we then use to collect up the various qemu_irqs which we're
> going to connect to the interrupt controller. Once we've
> called sysbus_connect_irq() to connect them all up, the
> array is no longer required, but we forgot to free it.
> 
> Since board init is only run once at startup, the memory
> leak is not a significant one.
> 
> Spotted by Coverity: CID 1192916.
> 
> Signed-off-by: Peter Maydell 
> ---
>  hw/ppc/mac_newworld.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
> index a630cb81cd8..14273a123e5 100644
> --- a/hw/ppc/mac_newworld.c
> +++ b/hw/ppc/mac_newworld.c
> @@ -303,6 +303,7 @@ static void ppc_core99_init(MachineState *machine)
>  sysbus_connect_irq(s, k++, openpic_irqs[i][j]);
>  }
>  }
> +g_free(openpic_irqs);
>  
>  if (PPC_INPUT(env) == PPC_FLAGS_INPUT_970) {
>  /* 970 gets a U3 bus */
> 

Reviewed-by: Mark Cave-Ayland 

I did notice the generation of this 2D array for the OpenPIC controller whilst
converting the Mac machines over to qdev, but wasn't exactly sure what to do 
here so
I left it.


ATB,

Mark.



Re: [Qemu-devel] [PATCH 1/3] Improve xen_disk batching behaviour

2018-11-02 Thread Paul Durrant
> -Original Message-
> From: Tim Smith [mailto:tim.sm...@citrix.com]
> Sent: 02 November 2018 10:01
> To: xen-de...@lists.xenproject.org; qemu-devel@nongnu.org; qemu-
> bl...@nongnu.org
> Cc: Anthony Perard ; Kevin Wolf
> ; Paul Durrant ; Stefano
> Stabellini ; Max Reitz 
> Subject: [PATCH 1/3] Improve xen_disk batching behaviour
> 
> When I/O consists of many small requests, performance is improved by
> batching them together in a single io_submit() call. When there are
> relatively few requests, the extra overhead is not worth it. This
> introduces a check to start batching I/O requests via blk_io_plug()/
> blk_io_unplug() in an amount proportional to the number which were
> already in flight at the time we started reading the ring.
> 
> Signed-off-by: Tim Smith 

Reviewed-by: Paul Durrant 

> ---
>  hw/block/xen_disk.c |   30 ++
>  1 file changed, 30 insertions(+)
> 
> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
> index 36eff94f84..cb2881b7e6 100644
> --- a/hw/block/xen_disk.c
> +++ b/hw/block/xen_disk.c
> @@ -101,6 +101,9 @@ struct XenBlkDev {
>  AioContext  *ctx;
>  };
> 
> +/* Threshold of in-flight requests above which we will start using
> + * blk_io_plug()/blk_io_unplug() to batch requests */
> +#define IO_PLUG_THRESHOLD 1
>  /* - */
> 
>  static void ioreq_reset(struct ioreq *ioreq)
> @@ -542,6 +545,8 @@ static void blk_handle_requests(struct XenBlkDev
> *blkdev)
>  {
>  RING_IDX rc, rp;
>  struct ioreq *ioreq;
> +int inflight_atstart = blkdev->requests_inflight;
> +int batched = 0;
> 
>  blkdev->more_work = 0;
> 
> @@ -550,6 +555,16 @@ static void blk_handle_requests(struct XenBlkDev
> *blkdev)
>  xen_rmb(); /* Ensure we see queued requests up to 'rp'. */
> 
>  blk_send_response_all(blkdev);
> +/* If there was more than IO_PLUG_THRESHOLD ioreqs in flight
> + * when we got here, this is an indication that there the bottleneck
> + * is below us, so it's worth beginning to batch up I/O requests
> + * rather than submitting them immediately. The maximum number
> + * of requests we're willing to batch is the number already in
> + * flight, so it can grow up to max_requests when the bottleneck
> + * is below us */
> +if (inflight_atstart > IO_PLUG_THRESHOLD) {
> +blk_io_plug(blkdev->blk);
> +}
>  while (rc != rp) {
>  /* pull request from ring */
>  if (RING_REQUEST_CONS_OVERFLOW(&blkdev->rings.common, rc)) {
> @@ -589,7 +604,22 @@ static void blk_handle_requests(struct XenBlkDev
> *blkdev)
>  continue;
>  }
> 
> +if (inflight_atstart > IO_PLUG_THRESHOLD &&
> +batched >= inflight_atstart) {
> +blk_io_unplug(blkdev->blk);
> +}
>  ioreq_runio_qemu_aio(ioreq);
> +if (inflight_atstart > IO_PLUG_THRESHOLD) {
> +if (batched >= inflight_atstart) {
> +blk_io_plug(blkdev->blk);
> +batched = 0;
> +} else {
> +batched++;
> +}
> +}
> +}
> +if (inflight_atstart > IO_PLUG_THRESHOLD) {
> +blk_io_unplug(blkdev->blk);
>  }
> 
>  if (blkdev->more_work && blkdev->requests_inflight < blkdev-
> >max_requests) {



Re: [Qemu-devel] [PATCH 3/3] Avoid repeated memory allocation in xen_disk

2018-11-02 Thread Paul Durrant
> -Original Message-
> From: Tim Smith [mailto:tim.sm...@citrix.com]
> Sent: 02 November 2018 10:01
> To: xen-de...@lists.xenproject.org; qemu-devel@nongnu.org; qemu-
> bl...@nongnu.org
> Cc: Anthony Perard ; Kevin Wolf
> ; Paul Durrant ; Stefano
> Stabellini ; Max Reitz 
> Subject: [PATCH 3/3] Avoid repeated memory allocation in xen_disk
> 
> xen_disk currently allocates memory to hold the data for each ioreq
> as that ioreq is used, and frees it afterwards. Because it requires
> page-aligned blocks, this interacts poorly with non-page-aligned
> allocations and balloons the heap.
> 
> Instead, allocate the maximum possible requirement, which is
> BLKIF_MAX_SEGMENTS_PER_REQUEST pages (currently 11 pages) when
> the ioreq is created, and keep that allocation until it is destroyed.
> Since the ioreqs themselves are re-used via a free list, this
> should actually improve memory usage.
> 
> Signed-off-by: Tim Smith 

Reviewed-by: Paul Durrant 

> ---
>  hw/block/xen_disk.c |   11 ++-
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
> index b506e23868..faaeefba29 100644
> --- a/hw/block/xen_disk.c
> +++ b/hw/block/xen_disk.c
> @@ -112,7 +112,6 @@ static void ioreq_reset(struct ioreq *ioreq)
>  memset(&ioreq->req, 0, sizeof(ioreq->req));
>  ioreq->status = 0;
>  ioreq->start = 0;
> -ioreq->buf = NULL;
>  ioreq->size = 0;
>  ioreq->presync = 0;
> 
> @@ -137,6 +136,11 @@ static struct ioreq *ioreq_start(struct XenBlkDev
> *blkdev)
>  /* allocate new struct */
>  ioreq = g_malloc0(sizeof(*ioreq));
>  ioreq->blkdev = blkdev;
> +/* We cannot need more pages per ioreq than this, and we do re-use
> + * ioreqs, so allocate the memory once here, to be freed in
> + * blk_free() when the ioreq is freed. */
> +ioreq->buf = qemu_memalign(XC_PAGE_SIZE, BLKIF_MAX_SEGMENTS_PER_REQUEST
> +   * XC_PAGE_SIZE);
>  blkdev->requests_total++;
>  qemu_iovec_init(&ioreq->v, 1);
>  } else {
> @@ -313,14 +317,12 @@ static void qemu_aio_complete(void *opaque, int ret)
>  if (ret == 0) {
>  ioreq_grant_copy(ioreq);
>  }
> -qemu_vfree(ioreq->buf);
>  break;
>  case BLKIF_OP_WRITE:
>  case BLKIF_OP_FLUSH_DISKCACHE:
>  if (!ioreq->req.nr_segments) {
>  break;
>  }
> -qemu_vfree(ioreq->buf);
>  break;
>  default:
>  break;
> @@ -392,12 +394,10 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
>  {
>  struct XenBlkDev *blkdev = ioreq->blkdev;
> 
> -ioreq->buf = qemu_memalign(XC_PAGE_SIZE, ioreq->size);
>  if (ioreq->req.nr_segments &&
>  (ioreq->req.operation == BLKIF_OP_WRITE ||
>   ioreq->req.operation == BLKIF_OP_FLUSH_DISKCACHE) &&
>  ioreq_grant_copy(ioreq)) {
> -qemu_vfree(ioreq->buf);
>  goto err;
>  }
> 
> @@ -990,6 +990,7 @@ static int blk_free(struct XenDevice *xendev)
>  ioreq = QLIST_FIRST(&blkdev->freelist);
>  QLIST_REMOVE(ioreq, list);
>  qemu_iovec_destroy(&ioreq->v);
> +qemu_vfree(ioreq->buf);
>  g_free(ioreq);
>  }
> 



Re: [Qemu-devel] [PATCH 2/3] Improve xen_disk response latency

2018-11-02 Thread Paul Durrant


> -Original Message-
> From: Tim Smith [mailto:tim.sm...@citrix.com]
> Sent: 02 November 2018 10:01
> To: xen-de...@lists.xenproject.org; qemu-devel@nongnu.org; qemu-
> bl...@nongnu.org
> Cc: Anthony Perard ; Kevin Wolf
> ; Paul Durrant ; Stefano
> Stabellini ; Max Reitz 
> Subject: [PATCH 2/3] Improve xen_disk response latency
> 
> If the I/O ring is full, the guest cannot send any more requests
> until some responses are sent. Only sending all available responses
> just before checking for new work does not leave much time for the
> guest to supply new work, so this will cause stalls if the ring gets
> full. Also, not completing reads as soon as possible adds latency
> to the guest.
> 
> To alleviate that, complete IO requests as soon as they come back.
> blk_send_response() already returns a value indicating whether
> a notify should be sent, which is all the batching we need.
> 
> Signed-off-by: Tim Smith 

Reviewed-by: Paul Durrant 

> ---
>  hw/block/xen_disk.c |   43 ---
>  1 file changed, 12 insertions(+), 31 deletions(-)
> 
> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
> index cb2881b7e6..b506e23868 100644
> --- a/hw/block/xen_disk.c
> +++ b/hw/block/xen_disk.c
> @@ -83,11 +83,9 @@ struct XenBlkDev {
> 
>  /* request lists */
>  QLIST_HEAD(inflight_head, ioreq) inflight;
> -QLIST_HEAD(finished_head, ioreq) finished;
>  QLIST_HEAD(freelist_head, ioreq) freelist;
>  int requests_total;
>  int requests_inflight;
> -int requests_finished;
>  unsigned intmax_requests;
> 
>  gbooleanfeature_discard;
> @@ -104,6 +102,9 @@ struct XenBlkDev {
>  /* Threshold of in-flight requests above which we will start using
>   * blk_io_plug()/blk_io_unplug() to batch requests */
>  #define IO_PLUG_THRESHOLD 1
> +
> +static int blk_send_response(struct ioreq *ioreq);
> +
>  /* - */
> 
>  static void ioreq_reset(struct ioreq *ioreq)
> @@ -155,12 +156,10 @@ static void ioreq_finish(struct ioreq *ioreq)
>  struct XenBlkDev *blkdev = ioreq->blkdev;
> 
>  QLIST_REMOVE(ioreq, list);
> -QLIST_INSERT_HEAD(&blkdev->finished, ioreq, list);
>  blkdev->requests_inflight--;
> -blkdev->requests_finished++;
>  }
> 
> -static void ioreq_release(struct ioreq *ioreq, bool finish)
> +static void ioreq_release(struct ioreq *ioreq)
>  {
>  struct XenBlkDev *blkdev = ioreq->blkdev;
> 
> @@ -168,11 +167,7 @@ static void ioreq_release(struct ioreq *ioreq, bool
> finish)
>  ioreq_reset(ioreq);
>  ioreq->blkdev = blkdev;
>  QLIST_INSERT_HEAD(&blkdev->freelist, ioreq, list);
> -if (finish) {
> -blkdev->requests_finished--;
> -} else {
> -blkdev->requests_inflight--;
> -}
> +blkdev->requests_inflight--;
>  }
> 
>  /*
> @@ -351,6 +346,10 @@ static void qemu_aio_complete(void *opaque, int ret)
>  default:
>  break;
>  }
> +if (blk_send_response(ioreq)) {
> +xen_pv_send_notify(&blkdev->xendev);
> +}
> +ioreq_release(ioreq);
>  qemu_bh_schedule(blkdev->bh);
> 
>  done:
> @@ -455,7 +454,7 @@ err:
>  return -1;
>  }
> 
> -static int blk_send_response_one(struct ioreq *ioreq)
> +static int blk_send_response(struct ioreq *ioreq)
>  {
>  struct XenBlkDev  *blkdev = ioreq->blkdev;
>  int   send_notify   = 0;
> @@ -504,22 +503,6 @@ static int blk_send_response_one(struct ioreq *ioreq)
>  return send_notify;
>  }
> 
> -/* walk finished list, send outstanding responses, free requests */
> -static void blk_send_response_all(struct XenBlkDev *blkdev)
> -{
> -struct ioreq *ioreq;
> -int send_notify = 0;
> -
> -while (!QLIST_EMPTY(&blkdev->finished)) {
> -ioreq = QLIST_FIRST(&blkdev->finished);
> -send_notify += blk_send_response_one(ioreq);
> -ioreq_release(ioreq, true);
> -}
> -if (send_notify) {
> -xen_pv_send_notify(&blkdev->xendev);
> -}
> -}
> -
>  static int blk_get_request(struct XenBlkDev *blkdev, struct ioreq *ioreq,
> RING_IDX rc)
>  {
>  switch (blkdev->protocol) {
> @@ -554,7 +537,6 @@ static void blk_handle_requests(struct XenBlkDev
> *blkdev)
>  rp = blkdev->rings.common.sring->req_prod;
>  xen_rmb(); /* Ensure we see queued requests up to 'rp'. */
> 
> -blk_send_response_all(blkdev);
>  /* If there were more than IO_PLUG_THRESHOLD ioreqs in flight
>   * when we got here, this is an indication that the bottleneck
>   * is below us, so it's worth beginning to batch up I/O requests
> @@ -597,10 +579,10 @@ static void blk_handle_requests(struct XenBlkDev
> *blkdev)
>  break;
>  };
> 
> -if (blk_send_response_one(ioreq)) {
> +if (blk_send_response(ioreq)) {
>  xen_pv_send_notify(&blkdev->xendev);
>  }
> -ioreq_release(io

Re: [Qemu-devel] [PATCH for-3.1] hw/ppc/mac_newworld: Free openpic_irqs array after use

2018-11-02 Thread Peter Maydell
On 2 November 2018 at 11:14, Mark Cave-Ayland
 wrote:
> On 01/11/2018 16:17, Peter Maydell wrote:
>
>> In ppc_core99_init(), we allocate an openpic_irqs array, which
>> we then use to collect up the various qemu_irqs which we're
>> going to connect to the interrupt controller. Once we've
>> called sysbus_connect_irq() to connect them all up, the
>> array is no longer required, but we forgot to free it.
>>
>> Since board init is only run once at startup, the memory
>> leak is not a significant one.
>>
>> Spotted by Coverity: CID 1192916.
>>
>> Signed-off-by: Peter Maydell 
>> ---
>>  hw/ppc/mac_newworld.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
>> index a630cb81cd8..14273a123e5 100644
>> --- a/hw/ppc/mac_newworld.c
>> +++ b/hw/ppc/mac_newworld.c
>> @@ -303,6 +303,7 @@ static void ppc_core99_init(MachineState *machine)
>>  sysbus_connect_irq(s, k++, openpic_irqs[i][j]);
>>  }
>>  }
>> +g_free(openpic_irqs);
>>
>>  if (PPC_INPUT(env) == PPC_FLAGS_INPUT_970) {
>>  /* 970 gets a U3 bus */
>>
>
> Reviewed-by: Mark Cave-Ayland 
>
> I did notice the generation of this 2D array for the OpenPIC controller whilst
> converting the Mac machines over to qdev, but wasn't exactly sure what to do 
> here so I left it.

In some sense the array isn't really necessary at all -- instead
of "fill in array with things; create PIC; sysbus_connect_irq from array"
you could just do "create PIC; sysbus_connect_irq to things".
But for this patch I opted to just free the memory rather than
attempt more complicated refactoring.
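
Roughly, that would turn the connection loop into something like the
following (sketch only; get_openpic_input() is a made-up placeholder for
however the qemu_irq currently stashed in openpic_irqs[i][j] would be
looked up at connect time, and the loop bounds mirror the existing code):

    for (i = 0; i < smp_cpus; i++) {
        for (j = 0; j < OPENPIC_OUTPUT_NB; j++) {
            sysbus_connect_irq(s, k++, get_openpic_input(machine, i, j));
        }
    }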

thanks
-- PMM



Re: [Qemu-devel] [PATCH v5 00/11] hw/m68k: add Apple Macintosh Quadra 800 machine

2018-11-02 Thread Laurent Vivier
On 02/11/2018 01:32, Thomas Huth wrote:
> On 2018-10-30 13:39, Laurent Vivier wrote:
>> On 30/10/2018 at 14:12, Mark Cave-Ayland wrote:
>>> On 30/10/2018 12:49, Laurent Vivier wrote:
>>>
 On 30/10/2018 at 12:48, Mark Cave-Ayland wrote:
> On 30/10/2018 08:15, Richard Henderson wrote:
>
>> On 10/29/18 1:39 PM, Mark Cave-Ayland wrote:
>>> You can install your own disk using debian-installer, with:
>>>
>>> ...
>>> -M q800 \
>>> -serial none -serial mon:stdio \
>>> -m 1000M -drive file=m68k.qcow2,format=qcow2 \
>>> -net nic,model=dp83932,addr=09:00:07:12:34:57 \
>>> -append "console=ttyS0 vga=off" \
>>> -kernel vmlinux-4.15.0-2-m68k \
>>> -initrd initrd.gz \
>>> -drive file=debian-9.0-m68k-NETINST-1.iso \
>>> -drive file=m68k.qcow2,format=qcow2 \
>>> -nographic
>>
>> I tried this and got
>>
>> Trace 0: 0x7f2e886c7140 [/d404/0xe000]
>> INT  1: Unassigned(0xf4) pc=d404 sp=00393e60 sr=2700
>> INT  2: Access Fault(0x8) pc= sp=00393e58 sr=2700
>> ssw:  0506 ea:    sfc:  5dfc: 5
>>
>> which lead straight to buserr and panic.  This happens way early in boot 
>> --
>> only 1926 TranslationBlocks generated.
>>
>> Is there some device missing from the command-line that the kernel is 
>> expecting?
>
> Heh that's annoying. The original branch I forked that Laurent was 
> working on had
> some extra patches at the start of the series: some were required for 
> q800 whilst
> others were for new development. I thought that all of the patches 
> required for q800
> had been applied over the past few months, but sadly that isn't the case 
> :(
>
> I've pushed an updated branch to 
> https://github.com/mcayland/qemu/tree/q800-test
> which contains the patchset plus two extra patches that are still needed 
> to boot to
> the debian installer here:
>
> 9281a5371f "tmp"
> 629754d847 "target/m68k: manage FPU exceptions"
>
> Laurent, are these patches ready for upstream or do they need work in 
> which case we
> should leave q800 until the 3.2 cycle?

 The only needed part is from 9281a5371f.
>>>
>>> Yeah I think you're right, sorry about that. I'm sure I tried without 
>>> 629754d847 and
>>> I got a premature exit from QEMU but only in graphic mode, but I've just 
>>> tried again
>>> and can't seem to recreate it now.
> [...]
 Because the kernel only manages the illegal instruction exception, not the unsupported one.

 Without the patch, we have:

 IN:
 0xd454:  071400

 INT  1: Unassigned(0xf4) pc=d454 sp=00331e60 sr=2700

 with the patch:

 IN:
 0xd454:  071400

 INT  1: Illegal Instruction(0x10) pc=d454 sp=00331e60 sr=2700

 We have in linux/arch/m68k/kernel/vectors.c:

 /*
  * this must be called very early as the kernel might
  * use some instruction that are emulated on the 060
  * and so we're prepared for early probe attempts (e.g. nf_init).
  */
 void __init base_trap_init(void)
 {
 ...

 vectors[VEC_BUSERR] = buserr;
 vectors[VEC_ILLEGAL] = trap;
 vectors[VEC_SYS] = system_call;
 }

 So I think the unsupported vector jumps to an invalid address.

 This seems to be triggered by the aranym native feature:

 d454:   7300mvsb %d0,%d1

 from linux/arch/m68k/emu/natfeat.c
>>>
>>> Interesting. So is this an actual bug in QEMU in terms of implementing the 
>>> processor
>>> specification, or is it relying on undefined behaviour on real hardware?
>>
>> It's a bug in QEMU.
>>
>> EXCP_UNSUPPORTED is defined to a QEMU specific value (61) that is in the
>> Unassigned/Reserved range of the vector table.
>>
>> It is used by QEMU user-mode to trigger an illegal instruction, whereas
>> illegal is also used to do simcalls (something like a syscall with an
>> illegal instruction trap). I think this should be deprecated as no one
>> is maintaining that or knows how to use it.
>>
>> Perhaps Thomas has an idea, as it comes with the coldfire implementation?
>> (e6e5906b6e ColdFire target)
> 
> No clue, I've never used those simcalls before.
> 
> Maybe we could "fix" it simply by changing the #define in cpu.h like this:
> 
> #if defined(CONFIG_USER_ONLY)
> #define EXCP_UNSUPPORTED61
> #else
> #define EXCP_UNSUPPORTEDEXCP_ILLEGAL
> #endif
> 

I've found that EXCP_UNSUPPORTED is a valid value for softmmu too, only
supported by some coldfire versions.

In fact, we don't need EXCP_UNSUPPORTED; EXCP_ILLEGAL is used to
call the simcall interface. Before the introduction of the m680x0
emulation, EXCP_ILLEGAL was only used with the "illegal" instruction;
other unsupported instructions triggered EXCP_UNSUPPORTED. So only the
"

Re: [Qemu-devel] [PULL v2 0/7] Chardev patches

2018-11-02 Thread Peter Maydell
On 1 November 2018 at 08:24, Marc-André Lureau
 wrote:
> The following changes since commit a2e002ff7913ce93aa0f7dbedd2123dce5f1a9cd:
>
>   Merge remote-tracking branch 
> 'remotes/vivier2/tags/qemu-trivial-for-3.1-pull-request' into staging 
> (2018-10-30 15:49:55 +)
>
> are available in the Git repository at:
>
>   https://github.com/elmarco/qemu.git tags/chrdev-pull-request
>
> for you to fetch changes up to 1ad723e98b8499d0690f7f7eafc945908a1db634:
>
>   editorconfig: set emacs mode (2018-11-01 12:13:12 +0400)
>
> 
> - add websocket support
> - socket: make 'fd' incompatible with 'reconnect'
> - fix a websocket leak
> - unrelated editorconfig patch that missed -trivial (included for
>   convenience)
> - v2: fix commit author field
>
> 
Applied, thanks.

-- PMM



Re: [Qemu-devel] [PATCH v4 4/4] hw/arm: versal: Add a virtual Xilinx Versal board

2018-11-02 Thread Edgar E. Iglesias
On Tue, Oct 30, 2018 at 01:31:44PM +, Peter Maydell wrote:
> On 22 October 2018 at 18:35, Edgar E. Iglesias  
> wrote:
> > From: "Edgar E. Iglesias" 
> >
> > Add a virtual Xilinx Versal board.
> >
> > This board is based on the Xilinx Versal SoC. The exact
> > details of what peripherals are attached to this board
> > will remain in control of QEMU. QEMU will generate an
> > FDT on the fly for Linux and other software to auto-discover
> > peripherals.
> >
> > Signed-off-by: Edgar E. Iglesias 
> 
> > +static void fdt_add_fixed_link_nodes(VersalVirt *s, char *gemname,
> > + uint32_t phandle)
> > +{
> > +char *name = g_strdup_printf("%s/fixed-link", gemname);
> > +
> > +qemu_fdt_add_subnode(s->fdt, name);
> > +qemu_fdt_setprop_cell(s->fdt, name, "phandle", phandle);
> > +qemu_fdt_setprop_cells(s->fdt, name, "full-duplex");
> 
> Hi. This fails to compile in a non-debug build:
> 
> In file included from /home/peter.maydell/qemu/hw/arm/xlnx-versal-virt.c:16:0:
> /home/peter.maydell/qemu/hw/arm/xlnx-versal-virt.c: In function
> 'fdt_add_fixed_link_nodes':
> /home/peter.maydell/qemu/include/sysemu/device_tree.h:110:23: error:
> comparison of unsigned expression < 0 is always false
> [-Werror=type-limits]
>  for (i = 0; i < ARRAY_SIZE(qdt_tmp); i++) {  
>  \
>^
> /home/peter.maydell/qemu/hw/arm/xlnx-versal-virt.c:191:5: note: in
> expansion of macro 'qemu_fdt_setprop_cells'
>  qemu_fdt_setprop_cells(s->fdt, name, "full-duplex");
>  ^
> 
> because qemu_fdt_setprop_cells() requires you to provide
> at least one cell value for the property being set.
> What was the intention here?

Hi Peter,

The intent was to set a boolean property without a value.
I'll fix this with the following and send a new version:
qemu_fdt_setprop(s->fdt, name, "full-duplex", NULL, 0);
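
For reference, with that change applied the helper from the hunk above
would read roughly as follows (a sketch assembled from the quoted patch,
not the final v5):

    static void fdt_add_fixed_link_nodes(VersalVirt *s, char *gemname,
                                         uint32_t phandle)
    {
        char *name = g_strdup_printf("%s/fixed-link", gemname);

        qemu_fdt_add_subnode(s->fdt, name);
        qemu_fdt_setprop_cell(s->fdt, name, "phandle", phandle);
        /* empty (boolean) property: no cells, so plain qemu_fdt_setprop() */
        qemu_fdt_setprop(s->fdt, name, "full-duplex", NULL, 0);
        qemu_fdt_setprop_cell(s->fdt, name, "speed", 1000);
        g_free(name);
    }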

Cheers,
Edgar


> 
> 
> > +qemu_fdt_setprop_cell(s->fdt, name, "speed", 1000);
> > +g_free(name);
> > +}
> 
> In the meantime, I'm dropping the versal patches from
> target-arm.next.
> 
> thanks
> -- PMM



Re: [Qemu-devel] [PATCH v1 2/7] pcihp: overwrite hotplug handler recursively from the start

2018-11-02 Thread David Hildenbrand
On 01.11.18 15:10, Igor Mammedov wrote:
> On Wed, 24 Oct 2018 12:19:25 +0200
> David Hildenbrand  wrote:
> 
>> For now, the hotplug handler is not called for devices that are
>> being cold plugged. The hotplug handler is setup when the machine
>> initialization is fully done. Only bridges that were cold plugged are
>> considered.
>>
>> Set the hotplug handler for the root piix bus directly when realizing.
>> Overwrite the hotplug handler of bridges when hotplugging/coldplugging
>> them.
>>
>> This will now make sure that the ACPI PCI hotplug handler is also called
>> for cold-plugged devices (also on bridges) and for bridges that were
>> hotplugged.
>>
>> When trying to hotplug a device to a hotplugged bridge, we now correctly
>> get the error message
>>  "Unsupported bus. Bus doesn't have property 'acpi-pcihp-bsel' set"
>> instead of going via the standard PCI hotplug handler.
> Erroring out is probably not ok, since it can break existing setups
> where SHPC hotplugging to a hotplugged bridge was working just fine before.

The question is whether it actually was supposed to (and eventually did) work.

If this was the expected behavior (mixing hotplug types), then the
necessary change to this patch would boil down to checking whether the
bridge is hot- or coldplugged, along the lines of the sketch below.
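
Something like this rough sketch, just to illustrate the idea (the
'hotplugged' flag on DeviceState is set for devices added after machine
creation; 'br' stands for the bridge device being handled):

    if (!DEVICE(br)->hotplugged) {
        /* coldplugged bridge: overwrite its handler with ACPI PCI hotplug */
    } else {
        /* hotplugged bridge: leave the standard (SHPC) handler in place */
    }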

> 
> Marcel/Michael what's your take on this change in behaviour?
> CCing libvirt in case they are doing this stuff
> 

Indeed, it would be nice to know if this was actually supposed to work
like this (coldplugged bridges using ACPI hotplug and hotplugged bridges
using SHPC hotplug).


-- 

Thanks,

David / dhildenb



[Qemu-devel] [PATCH v2 for-3.1 1/4] tests: Move tests/acpi-test-data/ to tests/data/acpi/

2018-11-02 Thread Peter Maydell
Currently tests/acpi-test-data contains data files used by the
bios-tables-test, and configure individually symlinks those
data files into the build directory using a wildcard.

Using a wildcard like this is a bad idea, because if a new
data file is added, nothing causes configure to be rerun,
and so no symlink is added for the new file. This can cause
tests to spuriously fail when they can't find their data.
Instead, it's better to symlink an entire directory of
data files. We already have such a directory: tests/data.

Move the data files from tests/acpi-test-data/ to
tests/data/acpi/, and remove the unnecessary symlinking.

We can remove entirely the note in rebuild-expected-aml.sh
about copying any new data files, because now they will
be in the source directory, not the build directory, and
no copying is required.

(We can't just change the existing tests/acpi-test-data/
to being a symlinked directory, because if we did that and
a developer switched git branches from one after that change
to one before it then configure would end up trashing all
the test files by making them symlinks to themselves.
Changing their path avoids this annoyance.)

Signed-off-by: Peter Maydell 
---
 configure   |   4 
 tests/bios-tables-test.c|   2 +-
 tests/{acpi-test-data => data/acpi}/pc/APIC | Bin
 tests/{acpi-test-data => data/acpi}/pc/APIC.cphp| Bin
 tests/{acpi-test-data => data/acpi}/pc/APIC.dimmpxm | Bin
 tests/{acpi-test-data => data/acpi}/pc/DSDT | Bin
 tests/{acpi-test-data => data/acpi}/pc/DSDT.bridge  | Bin
 tests/{acpi-test-data => data/acpi}/pc/DSDT.cphp| Bin
 tests/{acpi-test-data => data/acpi}/pc/DSDT.dimmpxm | Bin
 tests/{acpi-test-data => data/acpi}/pc/DSDT.ipmikcs | Bin
 tests/{acpi-test-data => data/acpi}/pc/DSDT.memhp   | Bin
 tests/{acpi-test-data => data/acpi}/pc/DSDT.numamem | Bin
 tests/{acpi-test-data => data/acpi}/pc/FACP | Bin
 tests/{acpi-test-data => data/acpi}/pc/FACS | Bin
 tests/{acpi-test-data => data/acpi}/pc/HPET | Bin
 tests/{acpi-test-data => data/acpi}/pc/NFIT.dimmpxm | Bin
 tests/{acpi-test-data => data/acpi}/pc/SLIT.cphp| Bin
 tests/{acpi-test-data => data/acpi}/pc/SLIT.memhp   | Bin
 tests/{acpi-test-data => data/acpi}/pc/SRAT.cphp| Bin
 tests/{acpi-test-data => data/acpi}/pc/SRAT.dimmpxm | Bin
 tests/{acpi-test-data => data/acpi}/pc/SRAT.memhp   | Bin
 tests/{acpi-test-data => data/acpi}/pc/SRAT.numamem | Bin
 tests/{acpi-test-data => data/acpi}/pc/SSDT.dimmpxm | Bin
 tests/{acpi-test-data => data/acpi}/q35/APIC| Bin
 tests/{acpi-test-data => data/acpi}/q35/APIC.cphp   | Bin
 .../{acpi-test-data => data/acpi}/q35/APIC.dimmpxm  | Bin
 tests/{acpi-test-data => data/acpi}/q35/DSDT| Bin
 tests/{acpi-test-data => data/acpi}/q35/DSDT.bridge | Bin
 tests/{acpi-test-data => data/acpi}/q35/DSDT.cphp   | Bin
 .../{acpi-test-data => data/acpi}/q35/DSDT.dimmpxm  | Bin
 tests/{acpi-test-data => data/acpi}/q35/DSDT.ipmibt | Bin
 tests/{acpi-test-data => data/acpi}/q35/DSDT.memhp  | Bin
 .../{acpi-test-data => data/acpi}/q35/DSDT.numamem  | Bin
 tests/{acpi-test-data => data/acpi}/q35/FACP| Bin
 tests/{acpi-test-data => data/acpi}/q35/FACS| Bin
 tests/{acpi-test-data => data/acpi}/q35/HPET| Bin
 tests/{acpi-test-data => data/acpi}/q35/MCFG| Bin
 .../{acpi-test-data => data/acpi}/q35/NFIT.dimmpxm  | Bin
 tests/{acpi-test-data => data/acpi}/q35/SLIT.cphp   | Bin
 tests/{acpi-test-data => data/acpi}/q35/SLIT.memhp  | Bin
 tests/{acpi-test-data => data/acpi}/q35/SRAT.cphp   | Bin
 .../{acpi-test-data => data/acpi}/q35/SRAT.dimmpxm  | Bin
 tests/{acpi-test-data => data/acpi}/q35/SRAT.memhp  | Bin
 .../{acpi-test-data => data/acpi}/q35/SRAT.numamem  | Bin
 .../{acpi-test-data => data/acpi}/q35/SSDT.dimmpxm  | Bin
 .../acpi}/rebuild-expected-aml.sh   |   2 --
 46 files changed, 1 insertion(+), 7 deletions(-)
 rename tests/{acpi-test-data => data/acpi}/pc/APIC (100%)
 rename tests/{acpi-test-data => data/acpi}/pc/APIC.cphp (100%)
 rename tests/{acpi-test-data => data/acpi}/pc/APIC.dimmpxm (100%)
 rename tests/{acpi-test-data => data/acpi}/pc/DSDT (100%)
 rename tests/{acpi-test-data => data/acpi}/pc/DSDT.bridge (100%)
 rename tests/{acpi-test-data => data/acpi}/pc/DSDT.cphp (100%)
 rename tests/{acpi-test-data => data/acpi}/pc/DSDT.dimmpxm (100%)
 rename tests/{acpi-test-data => data/acpi}/pc/DSDT.ipmikcs (100%)
 rename tests/{acpi-test-data => data/acpi}/pc/DSDT.memhp (100%)
 rename tests/{acpi-test-data => data/acpi}/pc/DSDT.numamem (100%)
 rename tests/{acpi-test-data => data/acpi}/pc/FACP (100%)
 rename tests/{acpi-test-data => data/acpi}/pc/FACS (100%)
 rename tests/{acpi-test-data => data/acpi}/pc/HPET (100%)
 rename tests/{acpi-test-data => data/acpi}/pc/NFIT.dimmpxm (100%)
 rename tests/{acpi-test-data => data/acpi}/pc/SLIT.cphp (100%)
 rename tests/{acpi-test-data => data/acpi}/pc/SLIT.memhp (100%)
 renam

[Qemu-devel] [PATCH v2 for-3.1 4/4] configure: Use LINKS loop for all build tree symlinks

2018-11-02 Thread Peter Maydell
A few places in configure were doing ad-hoc calls to
the symlink function to set up symlinks from the build tree
back to the source tree. We have a loop that does this
already for all files and directories listed in the LINKS
environment variable; use that instead.

Signed-off-by: Peter Maydell 
---
 configure | 14 ++
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/configure b/configure
index 52408ed3076..74e313a8101 100755
--- a/configure
+++ b/configure
@@ -7417,6 +7417,8 @@ LINKS="$LINKS pc-bios/s390-ccw/Makefile"
 LINKS="$LINKS roms/seabios/Makefile roms/vgabios/Makefile"
 LINKS="$LINKS pc-bios/qemu-icon.bmp"
 LINKS="$LINKS .gdbinit scripts" # scripts needed by relative path in .gdbinit
+LINKS="$LINKS tests/acceptance tests/data"
+LINKS="$LINKS tests/qemu-iotests/check"
 for bios_file in \
 $source_path/pc-bios/*.bin \
 $source_path/pc-bios/*.lid \
@@ -7453,25 +7455,13 @@ for rom in seabios vgabios ; do
 echo "RANLIB=$ranlib" >> $config_mak
 done
 
-# set up tests data directory
-for tests_subdir in acceptance data; do
-if [ ! -e tests/$tests_subdir ]; then
-symlink "$source_path/tests/$tests_subdir" tests/$tests_subdir
-fi
-done
-
 # set up qemu-iotests in this build directory
 iotests_common_env="tests/qemu-iotests/common.env"
-iotests_check="tests/qemu-iotests/check"
 
 echo "# Automatically generated by configure - do not modify" > 
"$iotests_common_env"
 echo >> "$iotests_common_env"
 echo "export PYTHON='$python'" >> "$iotests_common_env"
 
-if [ ! -e "$iotests_check" ]; then
-symlink "$source_path/$iotests_check" "$iotests_check"
-fi
-
 # Save the configure command line for later reuse.
 cat 

[Qemu-devel] [PATCH v2 for-3.1 3/4] configure: Rename FILES variable to LINKS

2018-11-02 Thread Peter Maydell
The FILES variable is used to accumulate a list of things to symlink
from the source tree into the build tree.  These don't have to be
individual files; symlinking an entire directory of data files is
also fine.  Rename it to something less confusing before we add a few
directories to it.

Improve the comment to clarify what DIRS and LINKS do and why
it's not a good idea to add things to LINKS with wildcarding.

Signed-off-by: Peter Maydell 
---
 configure | 35 ++-
 1 file changed, 22 insertions(+), 13 deletions(-)

diff --git a/configure b/configure
index bfdca8b814e..52408ed3076 100755
--- a/configure
+++ b/configure
@@ -7392,22 +7392,31 @@ if test "$ccache_cpp2" = "yes"; then
   echo "export CCACHE_CPP2=y" >> $config_host_mak
 fi
 
-# build tree in object directory in case the source is not in the current 
directory
+# If we're using a separate build tree, set it up now.
+# DIRS are directories which we simply mkdir in the build tree;
+# LINKS are things to symlink back into the source tree
+# (these can be both files and directories).
+# Caution: do not add files or directories here using wildcards. This
+# will result in problems later if a new file matching the wildcard is
+# added to the source tree -- nothing will cause configure to be rerun
+# so the build tree will be missing the link back to the new file, and
+# tests might fail. Prefer to keep the relevant files in their own
+# directory and symlink the directory instead.
 DIRS="tests tests/tcg tests/tcg/cris tests/tcg/lm32 tests/libqos 
tests/qapi-schema tests/tcg/xtensa tests/qemu-iotests tests/vm"
 DIRS="$DIRS tests/fp"
 DIRS="$DIRS docs docs/interop fsdev scsi"
 DIRS="$DIRS pc-bios/optionrom pc-bios/spapr-rtas pc-bios/s390-ccw"
 DIRS="$DIRS roms/seabios roms/vgabios"
-FILES="Makefile tests/tcg/Makefile qdict-test-data.txt"
-FILES="$FILES tests/tcg/cris/Makefile tests/tcg/cris/.gdbinit"
-FILES="$FILES tests/tcg/lm32/Makefile tests/tcg/xtensa/Makefile po/Makefile"
-FILES="$FILES tests/fp/Makefile"
-FILES="$FILES pc-bios/optionrom/Makefile pc-bios/keymaps"
-FILES="$FILES pc-bios/spapr-rtas/Makefile"
-FILES="$FILES pc-bios/s390-ccw/Makefile"
-FILES="$FILES roms/seabios/Makefile roms/vgabios/Makefile"
-FILES="$FILES pc-bios/qemu-icon.bmp"
-FILES="$FILES .gdbinit scripts" # scripts needed by relative path in .gdbinit
+LINKS="Makefile tests/tcg/Makefile qdict-test-data.txt"
+LINKS="$LINKS tests/tcg/cris/Makefile tests/tcg/cris/.gdbinit"
+LINKS="$LINKS tests/tcg/lm32/Makefile tests/tcg/xtensa/Makefile po/Makefile"
+LINKS="$LINKS tests/fp/Makefile"
+LINKS="$LINKS pc-bios/optionrom/Makefile pc-bios/keymaps"
+LINKS="$LINKS pc-bios/spapr-rtas/Makefile"
+LINKS="$LINKS pc-bios/s390-ccw/Makefile"
+LINKS="$LINKS roms/seabios/Makefile roms/vgabios/Makefile"
+LINKS="$LINKS pc-bios/qemu-icon.bmp"
+LINKS="$LINKS .gdbinit scripts" # scripts needed by relative path in .gdbinit
 for bios_file in \
 $source_path/pc-bios/*.bin \
 $source_path/pc-bios/*.lid \
@@ -7419,10 +7428,10 @@ for bios_file in \
 $source_path/pc-bios/u-boot.* \
 $source_path/pc-bios/palcode-*
 do
-FILES="$FILES pc-bios/$(basename $bios_file)"
+LINKS="$LINKS pc-bios/$(basename $bios_file)"
 done
 mkdir -p $DIRS
-for f in $FILES ; do
+for f in $LINKS ; do
 if [ -e "$source_path/$f" ] && [ "$pwd_is_source_path" != "y" ]; then
 symlink "$source_path/$f" "$f"
 fi
-- 
2.19.1




[Qemu-devel] [PATCH v2 for-3.1 2/4] tests: Move tests/hex-loader-check-data/ to tests/data/hex-loader/

2018-11-02 Thread Peter Maydell
Currently tests/hex-loader-check-data contains data files used
by the hexloader-test, and configure individually symlinks those
data files into the build directory using a wildcard.

Using a wildcard like this is a bad idea, because if a new
data file is added, nothing causes configure to be rerun,
and so no symlink is added for the new file. This can cause
tests to spuriously fail when they can't find their data.
Instead, it's better to symlink an entire directory of
data files. We already have such a directory: tests/data.

Move the data files from tests/hex-loader-check-data/ to
tests/data/hex-loader/, and remove the unnecessary symlinking.

Signed-off-by: Peter Maydell 
---
 configure | 4 
 tests/hexloader-test.c| 2 +-
 MAINTAINERS   | 2 +-
 tests/{hex-loader-check-data => data/hex-loader}/test.hex | 0
 4 files changed, 2 insertions(+), 6 deletions(-)
 rename tests/{hex-loader-check-data => data/hex-loader}/test.hex (100%)

diff --git a/configure b/configure
index 895b7483b8a..bfdca8b814e 100755
--- a/configure
+++ b/configure
@@ -7421,10 +7421,6 @@ for bios_file in \
 do
 FILES="$FILES pc-bios/$(basename $bios_file)"
 done
-for test_file in $(find $source_path/tests/hex-loader-check-data -type f)
-do
-FILES="$FILES tests/hex-loader-check-data$(echo $test_file | sed -e 
's/.*hex-loader-check-data//')"
-done
 mkdir -p $DIRS
 for f in $FILES ; do
 if [ -e "$source_path/$f" ] && [ "$pwd_is_source_path" != "y" ]; then
diff --git a/tests/hexloader-test.c b/tests/hexloader-test.c
index b653d44ba10..834ed52c22b 100644
--- a/tests/hexloader-test.c
+++ b/tests/hexloader-test.c
@@ -23,7 +23,7 @@ static void hex_loader_test(void)
 const unsigned int base_addr = 0x0001;
 
 QTestState *s = qtest_initf(
-"-M vexpress-a9 -nographic -device 
loader,file=tests/hex-loader-check-data/test.hex");
+"-M vexpress-a9 -nographic -device 
loader,file=tests/data/hex-loader/test.hex");
 
 for (i = 0; i < 256; ++i) {
 uint8_t val = qtest_readb(s, base_addr + i);
diff --git a/MAINTAINERS b/MAINTAINERS
index f2360efe3ed..5c342a670f5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1369,7 +1369,7 @@ Intel Hexadecimal Object File Loader
 M: Su Hang 
 S: Maintained
 F: tests/hexloader-test.c
-F: tests/hex-loader-check-data/test.hex
+F: tests/data/hex-loader/test.hex
 
 CHRP NVRAM
 M: Thomas Huth 
diff --git a/tests/hex-loader-check-data/test.hex 
b/tests/data/hex-loader/test.hex
similarity index 100%
rename from tests/hex-loader-check-data/test.hex
rename to tests/data/hex-loader/test.hex
-- 
2.19.1




[Qemu-devel] [PATCH v2 for-3.1 0/4] configure: symlink directories, not wildcarded files

2018-11-02 Thread Peter Maydell
This patchset fixes a problem with our build infrastructure
that meant that MST's recent 'pci, pc, virtio' pullreq failed
tests.

Currently our configure script has a wildcard loop that creates
symlinks for every data file in tests/acpi-test-data from the
source tree to the build tree. However, if a new data file is
added in git, there is nothing that causes configure to be rerun,
and so it is not available in the build tree, which can cause
test failures.

In v1 of this patchset I addressed this by changing configure
to make tests/acpi-test-data itself a symlink. Unfortunately
this has an awkward consequence that if we did that and
a developer switched git branches from one after that change
to one before it then configure would end up trashing all
the test files by making them symlinks to themselves.
So instead in v2, we move all the data files to the tests/data/
directory. tests/data/ is already symlinked as a directory,
so there is no problem for bisection.

Patch 1 does that for tests/acpi-test-data.
Patch 2 does that for tests/hex-loader-check-data.
Patch 3 is a cleanup, renaming a variable and adding
documentation so that it's clearer that symlinking can
be used for directories and that wildcarding files is bad.
Patch 4 rolls some ad-hoc symlinking into the common loop.

We do still use wildcarding to construct a list of files in
pc-bios to be symlinked; we get away with this because we don't
in practice add new BIOS images often and if we do there's also
usually a change that means configure is rerun anyway. We can't
just symlink all of pc-bios into the build tree because it
contains other things than just generated binaries. There
might be scope for fixing this, but I wanted to get this fix out.

thanks
-- PMM

Peter Maydell (4):
  tests: Move tests/acpi-test-data/ to tests/data/acpi/
  tests: Move tests/hex-loader-check-data/ to tests/data/hex-loader/
  configure: Rename FILES variable to LINKS
  configure: Use LINKS loop for all build tree symlinks

 configure |  57 --
 tests/bios-tables-test.c  |   2 +-
 tests/hexloader-test.c|   2 +-
 MAINTAINERS   |   2 +-
 tests/{acpi-test-data => data/acpi}/pc/APIC   | Bin
 .../acpi}/pc/APIC.cphp| Bin
 .../acpi}/pc/APIC.dimmpxm | Bin
 tests/{acpi-test-data => data/acpi}/pc/DSDT   | Bin
 .../acpi}/pc/DSDT.bridge  | Bin
 .../acpi}/pc/DSDT.cphp| Bin
 .../acpi}/pc/DSDT.dimmpxm | Bin
 .../acpi}/pc/DSDT.ipmikcs | Bin
 .../acpi}/pc/DSDT.memhp   | Bin
 .../acpi}/pc/DSDT.numamem | Bin
 tests/{acpi-test-data => data/acpi}/pc/FACP   | Bin
 tests/{acpi-test-data => data/acpi}/pc/FACS   | Bin
 tests/{acpi-test-data => data/acpi}/pc/HPET   | Bin
 .../acpi}/pc/NFIT.dimmpxm | Bin
 .../acpi}/pc/SLIT.cphp| Bin
 .../acpi}/pc/SLIT.memhp   | Bin
 .../acpi}/pc/SRAT.cphp| Bin
 .../acpi}/pc/SRAT.dimmpxm | Bin
 .../acpi}/pc/SRAT.memhp   | Bin
 .../acpi}/pc/SRAT.numamem | Bin
 .../acpi}/pc/SSDT.dimmpxm | Bin
 tests/{acpi-test-data => data/acpi}/q35/APIC  | Bin
 .../acpi}/q35/APIC.cphp   | Bin
 .../acpi}/q35/APIC.dimmpxm| Bin
 tests/{acpi-test-data => data/acpi}/q35/DSDT  | Bin
 .../acpi}/q35/DSDT.bridge | Bin
 .../acpi}/q35/DSDT.cphp   | Bin
 .../acpi}/q35/DSDT.dimmpxm| Bin
 .../acpi}/q35/DSDT.ipmibt | Bin
 .../acpi}/q35/DSDT.memhp  | Bin
 .../acpi}/q35/DSDT.numamem| Bin
 tests/{acpi-test-data => data/acpi}/q35/FACP  | Bin
 tests/{acpi-test-data => data/acpi}/q35/FACS  | Bin
 tests/{acpi-test-data => data/acpi}/q35/HPET  | Bin
 tests/{acpi-test-data => data/acpi}/q35/MCFG  | Bin
 .../acpi}/q35/NFIT.dimmpxm| Bin
 .../acpi}/q35/SLIT.cphp   | Bin
 .../acpi}/q35/SLIT.memhp  | Bin
 .../acpi}/q35/SRAT.cphp   | Bin
 .../acpi}/q35/SRAT.dimmpxm| Bin
 .../acpi}/q35/SRAT.memhp  | Bin
 .../acpi}/q35/SRAT.numamem| Bin
 .../acpi}/q35/SSDT.dimmpxm| Bin
 .../acpi}/rebuild-expected-aml.sh |   2 -
 .../hex-loader}/test.hex  |   0
 49 files changed, 27 insertions(+), 38 deletions(-)
 rename tests/{acpi-test-data => data/acpi}/pc/APIC (100%)
 rename tests/{acpi-test-data => data/acpi}/pc/APIC.cphp (100%)
 rename tests/{acpi-test-data => data/acpi}/pc/APIC.dimmpxm (100%)
 rename tests/{acpi-test-data => data/acpi}/pc/DSDT (100%)
 rename tests/{acpi-test-data => data/acpi}/pc/DSDT.bridge (100%)
 rename t

Re: [Qemu-devel] xen_disk qdevification (was: [PATCH 0/3] Performance improvements for xen_disk v2)

2018-11-02 Thread Kevin Wolf
Am 02.11.2018 um 12:13 hat Paul Durrant geschrieben:
> > -Original Message-
> > From: Kevin Wolf [mailto:kw...@redhat.com]
> > Sent: 02 November 2018 11:04
> > To: Tim Smith 
> > Cc: xen-de...@lists.xenproject.org; qemu-devel@nongnu.org; qemu-
> > bl...@nongnu.org; Anthony Perard ; Paul Durrant
> > ; Stefano Stabellini ;
> > Max Reitz ; arm...@redhat.com
> > Subject: xen_disk qdevification (was: [PATCH 0/3] Performance improvements
> > for xen_disk v2)
> > 
> > Am 02.11.2018 um 11:00 hat Tim Smith geschrieben:
> > > A series of performance improvements for disks using the Xen PV ring.
> > >
> > > These have had fairly extensive testing.
> > >
> > > The batching and latency improvements together boost the throughput
> > > of small reads and writes by two to six percent (measured using fio
> > > in the guest)
> > >
> > > Avoiding repeated calls to posix_memalign() reduced the dirty heap
> > > from 25MB to 5MB in the case of a single datapath process while also
> > > improving performance.
> > >
> > > v2 removes some checkpatch complaints and fixes the CCs
> > 
> > Completely unrelated, but since you're the first person touching
> > xen_disk in a while, you're my victim:
> > 
> > At KVM Forum we discussed sending a patch to deprecate xen_disk because
> > after all those years, it still hasn't been converted to qdev. Markus is
> > currently fixing some other not yet qdevified block device, but after
> > that xen_disk will be the only one left.
> > 
> > A while ago, a downstream patch review found out that there are some QMP
> > commands that would immediately crash if a xen_disk device were present
> > because of the lacking qdevification. This is not the code quality
> > standard I envision for QEMU. It's time for non-qdev devices to go.
> > 
> > So if you guys are still interested in the device, could someone please
> > finally look into converting it?
> 
> I have a patch series to do exactly this. It's somewhat involved as I
> need to convert the whole PV backend infrastructure. I will try to
> rebase and clean up my series a.s.a.p.

Thanks a lot, Paul! This is good news.

Kevin



Re: [Qemu-devel] [PATCH v4 00/23] ACPI reorganization for hardware-reduced support

2018-11-02 Thread Igor Mammedov
On Thu,  1 Nov 2018 11:22:40 +0100
Samuel Ortiz  wrote:

Thanks for looking at the ACPI mess we have in QEMU and trying to make it better.
This series looks a bit hackish, probably because it was written to suit the new
virt board, so it needs some more cleanup.

> This patch set provides an ACPI code reorganization in preparation for
> adding hardware-reduced support to QEMU.
QEMU already has a hardware-reduced implementation, specifically in the arm/virt board

> The changes are coming from the NEMU [1] project where we're defining
> a new x86 machine type: i386/virt. This is an EFI only, ACPI
> hardware-reduced platform and as such we had to implement support
> for the latter.
> 
> As a preliminary for adding hardware-reduced support to QEMU, we did
s:support to QEMU:support for new i386/virt machine:

> some ACPI code reorganization with the following goals:
> 
> * Share as much as possible of the current ACPI build APIs between
>   legacy and hardware-reduced ACPI.
> * Share the ACPI build code across machine types and architectures and
>   remove the typical PC machine type dependency.
>   Eventually we hope to see arm/virt also re-use much of that code.
it probably should be the other way around: generalize and reuse as much of the
arm/virt ACPI code as possible, instead of adding new duplicated code without
an actual user and then swapping/dropping the old arm version in favor of the
new one. It's hard to review when it's done in this order, and easy to miss
issues that would be easier to spot if you reused the arm versions (where
applicable) as a starting point for generalization.

Here are some generic suggestions/nits that apply to the whole series:
  * s/Factorize/Factor out/
  * try to restructure the series in the following way:
 1. put bug fixes at the beginning of the series.
'make V=1 check' should produce table diffs to account for
the changes made to the tables.
After the error fixes, add an extra patch that updates the reference
ACPI tables, to simplify testing for reviewers and to serve as a
self-check when you do the refactoring, making sure there are no
changes during generalization/refactoring later.

 2. instead of adding 'new' implementations, try to generalize
existing ones so that a user for the new code always exists.
It also makes patches easier to review/test.

 3. Since you are touching/moving around existing fixed tables
that use the legacy 'struct'-based approach to construct
them, it's a good opportunity to switch to the newer approach
and use the build_append_int_noprefix() API to construct tables.
Use build_amd_iommu() as an example; a short usage sketch follows
after this list. And it's doubly true if/when you are adding new
fixed tables (i.e. only build_append_int_noprefix()-based ones are
acceptable).

 4. for patches in stages #2 and #3, 'make V=1 check'
should pass without any warnings; that will speed up the review
process.

 5. add the i386/virt board and the related hardware-reduced ACPI code
that's specific to it.
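
To illustrate what that style looks like (a minimal sketch; the field
names, sizes and values below are made up, only the API usage matters:
build_append_int_noprefix() appends an integer of the given byte size to
the table's GArray):

    build_append_int_noprefix(table_data, 0, 4);          /* Reserved */
    build_append_int_noprefix(table_data, base_addr, 8);  /* Base Address */
    build_append_int_noprefix(table_data, flags, 2);      /* Flags */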

I'll try to review the series during next week and will make per-patch
suggestions on how to structure it or do it another way.

Hopefully after the refactoring we will end up with
simpler, smaller and cleaner ACPI code, to the benefit of everyone.

PS:
if you need quick advice wrt the ACPI parts, feel free to ping me on IRC
(Paris timezone).

> The patches are also available in their own git branch [2].
> 
> [1] https://github.com/intel/nemu
> [2] https://github.com/intel/nemu/tree/topic/upstream/acpi
> 
> v1 -> v2:
>* Drop the hardware-reduced implementation for now. Our next patch set
>  will add hardware-reduced and convert arm/virt to it.
>* Implement the ACPI build methods as a QOM Interface Class and convert
>  the PC machine type to it.
>* acpi_conf_pc_init() uses a PCMachineState pointer and not a
>  MachineState one as its argument.
> 
> v2 -> v3:
>* Cc all relevant maintainers, no functional changes.
> 
> v3 -> v4:
>* Renamed all AcpiConfiguration pointers from conf to acpi_conf.
>* Removed the ACPI_BUILD_ALIGN_SIZE export.
>* Temporarily updated the arm virt build_rsdp() prototype for
>  bisectability purposes.
>* Removed unneeded pci headers from acpi-build.c.
>* Refactor the acpi PCI host getter so that it truly is architecture
>  agnostic, by carrying the PCI host pointer through the
>  AcpiConfiguration structure.
>* Splitted the PCI host AML builder API export patch from the PCI host
>  and holes getter one.
>* Reduced the build_srat() export scope to hw/i386 instead of the broader
>  hw/acpi. SRAT builders are truly architecture specific and can hardly be
>  generalized.
>* Completed the ACPI builder documentation.
> 
> Samuel Ortiz (15):
>   hw: i386: Decouple the ACPI build from the PC machine type
>   hw: acpi: Export ACPI build alignment API
>   hw: acpi: Export the RSDP build API
>   hw: acpi: Implement XSDT support f

Re: [Qemu-devel] [PATCH v2] target/arm: Conditionalize some asserts on aarch32 support

2018-11-02 Thread Alex Bennée


Richard Henderson  writes:

> When populating id registers from kvm, on a host that doesn't support
> aarch32 mode at all, neither arm_div nor jazelle will be supported either.
>
> Signed-off-by: Richard Henderson 
> ---
>
> v2: Test aa64pfr.el0 >= 2; rename to isar_feature_aa64_aa32.
> Pull out realizefn test to no_aa32 bool; use it for jazelle as well.
>
> Alex, can you give this a test please?

Reviewed-by: Alex Bennée 
Tested-by: Alex Bennée 


>
>
> r~
>
> ---
>  target/arm/cpu.h |  5 +
>  target/arm/cpu.c | 15 +--
>  2 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index 895f9909d8..5c2c77c31d 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -3300,6 +3300,11 @@ static inline bool isar_feature_aa64_fp16(const 
> ARMISARegisters *id)
>  return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, FP) == 1;
>  }
>
> +static inline bool isar_feature_aa64_aa32(const ARMISARegisters *id)
> +{
> +return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, EL0) >= 2;
> +}
> +
>  static inline bool isar_feature_aa64_sve(const ARMISARegisters *id)
>  {
>  return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, SVE) != 0;
> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> index e08a2d2d79..d4dc0bc225 100644
> --- a/target/arm/cpu.c
> +++ b/target/arm/cpu.c
> @@ -774,6 +774,7 @@ static void arm_cpu_realizefn(DeviceState *dev, Error 
> **errp)
>  CPUARMState *env = &cpu->env;
>  int pagebits;
>  Error *local_err = NULL;
> +bool no_aa32 = false;
>
>  /* If we needed to query the host kernel for the CPU features
>   * then it's possible that might have failed in the initfn, but
> @@ -820,6 +821,16 @@ static void arm_cpu_realizefn(DeviceState *dev, Error 
> **errp)
>  set_feature(env, ARM_FEATURE_V7VE);
>  }
>  }
> +
> +/*
> + * There exist AArch64 cpus without AArch32 support.  When KVM
> + * queries ID_ISAR0_EL1 on such a host, the value is UNKNOWN.
> + * Similarly, we cannot check ID_AA64PFR0 without AArch64 support.
> + */
> +if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
> +no_aa32 = !cpu_isar_feature(aa64_aa32, cpu);
> +}
> +
>  if (arm_feature(env, ARM_FEATURE_V7VE)) {
>  /* v7 Virtualization Extensions. In real hardware this implies
>   * EL2 and also the presence of the Security Extensions.
> @@ -829,7 +840,7 @@ static void arm_cpu_realizefn(DeviceState *dev, Error 
> **errp)
>   * Presence of EL2 itself is ARM_FEATURE_EL2, and of the
>   * Security Extensions is ARM_FEATURE_EL3.
>   */
> -assert(cpu_isar_feature(arm_div, cpu));
> +assert(no_aa32 || cpu_isar_feature(arm_div, cpu));
>  set_feature(env, ARM_FEATURE_LPAE);
>  set_feature(env, ARM_FEATURE_V7);
>  }
> @@ -855,7 +866,7 @@ static void arm_cpu_realizefn(DeviceState *dev, Error 
> **errp)
>  if (arm_feature(env, ARM_FEATURE_V6)) {
>  set_feature(env, ARM_FEATURE_V5);
>  if (!arm_feature(env, ARM_FEATURE_M)) {
> -assert(cpu_isar_feature(jazelle, cpu));
> +assert(no_aa32 || cpu_isar_feature(jazelle, cpu));
>  set_feature(env, ARM_FEATURE_AUXCR);
>  }
>  }


--
Alex Bennée



Re: [Qemu-devel] [PATCH 00/12] file-posix: Simplify delegation to worker thread

2018-11-02 Thread no-reply
Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20181031215622.27690-1-kw...@redhat.com
Subject: [Qemu-devel] [PATCH 00/12] file-posix: Simplify delegation to worker 
thread

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
f3d24f259d file-posix: Avoid aio_worker() for QEMU_AIO_IOCTL
a20af5e7d5 file-posix: Switch to .bdrv_co_ioctl
86c8cd0ca9 file-posix: Remove paio_submit_co()
a6798de879 file-posix: Avoid aio_worker() for QEMU_AIO_READ/WRITE
f667c83e70 file-posix: Move read/write operation logic out of aio_worker()
028fa45b36 file-posix: Avoid aio_worker() for QEMU_AIO_FLUSH
992060b92c file-posix: Avoid aio_worker() for QEMU_AIO_DISCARD
b27306c38f file-posix: Avoid aio_worker() for QEMU_AIO_WRITE_ZEROES
934194f753 file-posix: Avoid aio_worker() for QEMU_AIO_COPY_RANGE
ab3479aa9e file-posix: Avoid aio_worker() for QEMU_AIO_TRUNCATE
055cea24a4 file-posix: Factor out raw_thread_pool_submit()
91834b3fb8 file-posix: Reorganise RawPosixAIOData

=== OUTPUT BEGIN ===
Checking PATCH 1/12: file-posix: Reorganise RawPosixAIOData...
ERROR: suspect code indent for conditional statements (8, 13)
#96: FILE: block/file-posix.c:1278:
+if (aiocb->io.niov == 1) {
+ return handle_aiocb_rw_linear(aiocb, aiocb->io.iov->iov_base);

total: 1 errors, 0 warnings, 203 lines checked
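
For clarity, the complaint above is about the continuation line's
indentation relative to the "if": checkpatch expects one four-space step,
i.e. roughly

    if (aiocb->io.niov == 1) {
        return handle_aiocb_rw_linear(aiocb, aiocb->io.iov->iov_base);
    }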

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 2/12: file-posix: Factor out raw_thread_pool_submit()...
Checking PATCH 3/12: file-posix: Avoid aio_worker() for QEMU_AIO_TRUNCATE...
Checking PATCH 4/12: file-posix: Avoid aio_worker() for QEMU_AIO_COPY_RANGE...
Checking PATCH 5/12: file-posix: Avoid aio_worker() for QEMU_AIO_WRITE_ZEROES...
Checking PATCH 6/12: file-posix: Avoid aio_worker() for QEMU_AIO_DISCARD...
Checking PATCH 7/12: file-posix: Avoid aio_worker() for QEMU_AIO_FLUSH...
Checking PATCH 8/12: file-posix: Move read/write operation logic out of 
aio_worker()...
ERROR: suspect code indent for conditional statements (8, 13)
#25: FILE: block/file-posix.c:1279:
 if (aiocb->io.niov == 1) {
+ nbytes = handle_aiocb_rw_linear(aiocb, aiocb->io.iov->iov_base);

total: 1 errors, 0 warnings, 73 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 9/12: file-posix: Avoid aio_worker() for QEMU_AIO_READ/WRITE...
Checking PATCH 10/12: file-posix: Remove paio_submit_co()...
Checking PATCH 11/12: file-posix: Switch to .bdrv_co_ioctl...
Checking PATCH 12/12: file-posix: Avoid aio_worker() for QEMU_AIO_IOCTL...
=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [Qemu-devel] [QEMU PATCH v2 0/2]: KVM: i386: Add support for save and restore nested state

2018-11-02 Thread Dr. David Alan Gilbert
* Paolo Bonzini (pbonz...@redhat.com) wrote:
> On 02/11/2018 04:46, Liran Alon wrote:
> >> On Thu, Nov1, 2018 at 09:45 AM, Jim Mattson  wrote:
> > 
> >>> On Thu, Nov 1, 2018 at 8:56 AM, Dr. David Alan Gilbert 
> >>>  wrote:
> > 
> >>> So if I have matching host kernels it should always work?
> >>> What happens if I upgrade the source kernel to increase it's maximum
> >>> nested size, can I force it to keep things small for some VMs?
> > 
> >> Any change to the format of the nested state should be gated by a
> >> KVM_CAP set by userspace. (Unlike, say, how the
> >> KVM_VCPUEVENT_VALID_SMM flag was added to the saved VCPU events state
> >> in commit f077825a8758d.) KVM has traditionally been quite bad about
> >> maintaining backwards compatibility, but I hope the community is more
> >> cognizant of the issues now.
> > 
> >> As a cloud provider, one would only enable the new capability from
> >> userspace once all hosts in the pool have a kernel that supports it.
> >> During the transition, the capability would not be enabled on the
> >> hosts with a new kernel, and these hosts would continue to provide
> >> nested state that could be consumed by hosts running the older kernel.
> > 
> > Hmm this makes sense.
> > 
> > This means though that the patch I have submitted here isn't good enough.
> > My patch currently assumes that when it attempts to get nested state from 
> > KVM,
> > QEMU should always set nested_state->size to max size supported by KVM as 
> > received
> > from kvm_check_extension(s, KVM_CAP_NESTED_STATE);
> > (See kvm_get_nested_state() introduced on my patch).
> > This indeed won't allow migration from host with new KVM to host with old 
> > KVM if
> > nested_state size was enlarged between these KVM versions.
> > Which is obviously an issue.
> 
> Actually I think this is okay, because unlike the "new" capability was
> enabled, KVM would always reduce nested_state->size to a value that is
> compatible with current kernels.
> 
> > But on second thought, I'm not sure that this is the right approach as well.
> > We don't really want the used version of nested_state to be determined on 
> > kvm_init().
> > * On source QEMU, we actually want to determine it when preparing for 
> > migration based
> > on to the support given by our destination host. If it's an old host, we 
> > would like to
> > save an old version nested_state and if it's a new host, we will like to 
> > save our newest
> > supported nested_state.
> 
> No, that's wrong because it would lead to losing state.  If the source
> QEMU supports more state than the destination QEMU, and the current VM
> state needs to transmit it for migration to be _correct_, then migration
> to that destination QEMU must fail.
> 
> In particular, enabling the new KVM capability needs to be gated by a
> new machine type and/or -cpu flag, if migration compatibility is needed.
>  (In particular, this is one reason why I haven't considered this series
> for 3.1.  Right now, migration of nested hypervisors is completely
> busted but if we make it "almost" work, pre-3.1 machine types would not
> ever be able to add support for KVM_CAP_EXCEPTION_PAYLOAD.  Therefore,
> it's better for users if we wait for one release more, and add support
> for KVM_CAP_NESTED_STATE and KVM_CAP_EXCEPTION_PAYLOAD at the same time).
> 
> Personally, I would like to say that, starting from QEMU 3.2, enabling
> nested VMX requires a 4.20 kernel.  It's a bit bold, but I think it's a
> good way to keep some sanity.  Any opinions on that?

That seems a bit mean; there's a lot of people already using nested.

Dave

> Paolo
> 
> > Therefore, I don't think that we want this versioning to be based on 
> > KVM_CAP at all.
> > It seems that we would want the process to behave as follows:
> > 1) Mgmt-layer at dest queries dest host max supported nested_state size.
> >(Which should be returned from kvm_check_extension(KVM_CAP_NESTED_STATE))
> > 2) Mgmt-layer at source initiate migration to dest with requesting QEMU to 
> > send nested_state 
> >matching dest max supported nested_state size.
> >When saving nested state using KVM_GET_NESTED_STATE IOCTL, QEMU will 
> > specify in nested_state->size
> >the *requested* size to be saved and KVM should be able to save only the 
> > information which matches
> >the version that worked with that size.
> > 3) After some sanity checks on received migration stream, dest host use 
> > KVM_SET_NESTED_STATE IOCTL.
> >This IOCTL should deduce which information it should deploy based on 
> > given nested_state->size.
> > 
> > This also makes me wonder if it's not just nicer to use nested_state->flags 
> > to specify which
> > information is actually present on nested_state instead of managing 
> > versioning with nested_state->size.
> > 
> > What are your opinions on this?
> > 
> > -Liran
> > 
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [QEMU PATCH v2 0/2]: KVM: i386: Add support for save and restore nested state

2018-11-02 Thread Daniel P . Berrangé
On Fri, Nov 02, 2018 at 12:35:05PM +, Dr. David Alan Gilbert wrote:
> * Paolo Bonzini (pbonz...@redhat.com) wrote:
> > On 02/11/2018 04:46, Liran Alon wrote:
> > >> On Thu, Nov 1, 2018 at 09:45 AM, Jim Mattson  wrote:
> > > 
> > >>> On Thu, Nov 1, 2018 at 8:56 AM, Dr. David Alan Gilbert 
> > >>>  wrote:
> > > 
> > >>> So if I have matching host kernels it should always work?
> > >>> What happens if I upgrade the source kernel to increase it's maximum
> > >>> nested size, can I force it to keep things small for some VMs?
> > > 
> > >> Any change to the format of the nested state should be gated by a
> > >> KVM_CAP set by userspace. (Unlike, say, how the
> > >> KVM_VCPUEVENT_VALID_SMM flag was added to the saved VCPU events state
> > >> in commit f077825a8758d.) KVM has traditionally been quite bad about
> > >> maintaining backwards compatibility, but I hope the community is more
> > >> cognizant of the issues now.
> > > 
> > >> As a cloud provider, one would only enable the new capability from
> > >> userspace once all hosts in the pool have a kernel that supports it.
> > >> During the transition, the capability would not be enabled on the
> > >> hosts with a new kernel, and these hosts would continue to provide
> > >> nested state that could be consumed by hosts running the older kernel.
> > > 
> > > Hmm this makes sense.
> > > 
> > > This means though that the patch I have submitted here isn't good enough.
> > > My patch currently assumes that when it attempts to get nested state from 
> > > KVM,
> > > QEMU should always set nested_state->size to max size supported by KVM as 
> > > received
> > > from kvm_check_extension(s, KVM_CAP_NESTED_STATE);
> > > (See kvm_get_nested_state() introduced on my patch).
> > > This indeed won't allow migration from host with new KVM to host with old 
> > > KVM if
> > > nested_state size was enlarged between these KVM versions.
> > > Which is obviously an issue.
> > 
> > Actually I think this is okay, because unlike the "new" capability was
> > enabled, KVM would always reduce nested_state->size to a value that is
> > compatible with current kernels.
> > 
> > > But on second thought, I'm not sure that this is the right approach 
> > > as well.
> > > We don't really want the used version of nested_state to be determined on 
> > > kvm_init().
> > > * On source QEMU, we actually want to determine it when preparing for 
> > > migration based
> > > on to the support given by our destination host. If it's an old host, we 
> > > would like to
> > > save an old version nested_state and if it's a new host, we will like to 
> > > save our newest
> > > supported nested_state.
> > 
> > No, that's wrong because it would lead to losing state.  If the source
> > QEMU supports more state than the destination QEMU, and the current VM
> > state needs to transmit it for migration to be _correct_, then migration
> > to that destination QEMU must fail.
> > 
> > In particular, enabling the new KVM capability needs to be gated by a
> > new machine type and/or -cpu flag, if migration compatibility is needed.
> >  (In particular, this is one reason why I haven't considered this series
> > for 3.1.  Right now, migration of nested hypervisors is completely
> > busted but if we make it "almost" work, pre-3.1 machine types would not
> > ever be able to add support for KVM_CAP_EXCEPTION_PAYLOAD.  Therefore,
> > it's better for users if we wait for one release more, and add support
> > for KVM_CAP_NESTED_STATE and KVM_CAP_EXCEPTION_PAYLOAD at the same time).
> > 
> > Personally, I would like to say that, starting from QEMU 3.2, enabling
> > nested VMX requires a 4.20 kernel.  It's a bit bold, but I think it's a
> > good way to keep some sanity.  Any opinions on that?
> 
> That seems a bit mean; there's a lot of people already using nested.

Agreed, it would be a significant regression for people. They may not
even care about migration, so we should not block its use with old
kernels just for the sake of working migration that they won't use.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



[Qemu-devel] [PATCH v2 5/5] git: use HTTPS git URLs for repo.or.cz

2018-11-02 Thread Stefan Hajnoczi
repo.or.cz supports git smart HTTPS.  Use it in preference to git:// or
http:// since it's more secure.

Suggested-by: Eric Blake 
Signed-off-by: Stefan Hajnoczi 
---
 MAINTAINERS| 14 +++---
 pc-bios/README |  2 +-
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5cca097c49..a0576aa68a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1455,7 +1455,7 @@ F: tests/qemu-iotests/
 F: util/qemu-progress.c
 F: qobject/block-qdict.c
 F: tests/check-block-qdict.c
-T: git git://repo.or.cz/qemu/kevin.git block
+T: git https://repo.or.cz/qemu/kevin.git block
 
 Block I/O path
 M: Stefan Hajnoczi 
@@ -1502,7 +1502,7 @@ F: blockdev.c
 F: block/qapi.c
 F: qapi/block*.json
 F: qapi/transaction.json
-T: git git://repo.or.cz/qemu/armbru.git block-next
+T: git https://repo.or.cz/qemu/armbru.git block-next
 
 Dirty Bitmaps
 M: Fam Zheng 
@@ -1697,14 +1697,14 @@ F: tests/test-visitor-serialization.c
 F: scripts/qapi-gen.py
 F: scripts/qapi/*
 F: docs/devel/qapi*
-T: git git://repo.or.cz/qemu/armbru.git qapi-next
+T: git https://repo.or.cz/qemu/armbru.git qapi-next
 
 QAPI Schema
 M: Eric Blake 
 M: Markus Armbruster 
 S: Supported
 F: qapi/*.json
-T: git git://repo.or.cz/qemu/armbru.git qapi-next
+T: git https://repo.or.cz/qemu/armbru.git qapi-next
 
 QObject
 M: Markus Armbruster 
@@ -1718,7 +1718,7 @@ F: tests/check-qnum.c
 F: tests/check-qjson.c
 F: tests/check-qlist.c
 F: tests/check-qstring.c
-T: git git://repo.or.cz/qemu/armbru.git qapi-next
+T: git https://repo.or.cz/qemu/armbru.git qapi-next
 
 QEMU Guest Agent
 M: Michael Roth 
@@ -1750,7 +1750,7 @@ F: docs/devel/*qmp-*
 F: scripts/qmp/
 F: tests/qmp-test.c
 F: tests/qmp-cmd-test.c
-T: git git://repo.or.cz/qemu/armbru.git qapi-next
+T: git https://repo.or.cz/qemu/armbru.git qapi-next
 
 qtest
 M: Paolo Bonzini 
@@ -2065,7 +2065,7 @@ F: include/block/nbd*
 F: qemu-nbd.*
 F: blockdev-nbd.c
 F: docs/interop/nbd.txt
-T: git git://repo.or.cz/qemu/ericb.git nbd
+T: git https://repo.or.cz/qemu/ericb.git nbd
 
 NFS
 M: Jeff Cody 
diff --git a/pc-bios/README b/pc-bios/README
index b572e9eb00..3f91912a18 100644
--- a/pc-bios/README
+++ b/pc-bios/README
@@ -5,7 +5,7 @@
   project (http://www.nongnu.org/vgabios/).
 
 - The PowerPC Open Hack'Ware Open Firmware Compatible BIOS is
-  available at http://repo.or.cz/w/openhackware.git.
+  available at https://repo.or.cz/w/openhackware.git.
 
 - OpenBIOS (http://www.openbios.org/) is a free (GPL v2) portable
   firmware implementation. The goal is to implement a 100% IEEE
-- 
2.17.2




[Qemu-devel] [PATCH v2 0/5] Use 'https://' instead of 'git://'

2018-11-02 Thread Stefan Hajnoczi
v2:
 * Use HTTPS for repo.or.cz [Eric]

Jeff Cody has enabled git smart HTTP support on qemu.org.  From now on HTTPS is
the preferred protocol because it adds some protection against
man-in-the-middle attacks when cloning a repo.

This patch series updates git:// URLs and changes them to https://.  The 
https:// URL format is:

  https://git.qemu.org/git/<project>.git

The old git:// URL format was:

  git://git.qemu.org/<project>.git

I have also updated git://github.com/ and repo.or.cz URLs because they offer 
HTTPS.

I have tested that submodules continue to work after the change to .gitmodules.

Stefan Hajnoczi (5):
  README: use 'https://' instead of 'git://'
  get_maintainer: use 'https://' instead of 'git://'
  MAINTAINERS: use 'https://' instead of 'git://' for GitHub
  gitmodules: use 'https://' instead of 'git://'
  git: use HTTPS git URLs for repo.or.cz

 MAINTAINERS   | 88 +++
 .gitmodules   | 34 +++
 README|  4 +-
 pc-bios/README|  6 +--
 scripts/get_maintainer.pl |  2 +-
 5 files changed, 67 insertions(+), 67 deletions(-)

-- 
2.17.2




[Qemu-devel] [PATCH v2 2/5] get_maintainer: use 'https://' instead of 'git://'

2018-11-02 Thread Stefan Hajnoczi
When you clone the repository without previous commit history, 'git://'
doesn't protect from man-in-the-middle attacks.  HTTPS is more secure
since the client verifies the server certificate.

Reported-by: Jann Horn 
Reviewed-by: Daniel P. Berrangé 
Signed-off-by: Stefan Hajnoczi 
---
 scripts/get_maintainer.pl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl
index 43fb5f512f..fc7275b9e2 100755
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -1376,7 +1376,7 @@ sub vcs_exists {
warn("$P: No supported VCS found.  Add --nogit to options?\n");
warn("Using a git repository produces better results.\n");
warn("Try latest git repository using:\n");
-   warn("git clone git://git.qemu.org/qemu.git\n");
+   warn("git clone https//git.qemu.org/git/qemu.git\n");
$printed_novcs = 1;
 }
 return 0;
-- 
2.17.2




[Qemu-devel] [PATCH v2 4/5] gitmodules: use 'https://' instead of 'git://'

2018-11-02 Thread Stefan Hajnoczi
When you clone the repository without previous commit history, 'git://'
doesn't protect from man-in-the-middle attacks.  HTTPS is more secure
since the client verifies the server certificate.

Also change git.qemu-project.org to git.qemu.org (we control both domain
names but qemu.org is used more widely).

Reported-by: Jann Horn 
Reviewed-by: Daniel P. Berrangé 
Signed-off-by: Stefan Hajnoczi 
---
 .gitmodules | 34 +-
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/.gitmodules b/.gitmodules
index a48d2a764c..6b91176098 100644
--- a/.gitmodules
+++ b/.gitmodules
@@ -1,51 +1,51 @@
 [submodule "roms/seabios"]
path = roms/seabios
-   url = git://git.qemu-project.org/seabios.git/
+   url = https://git.qemu.org/git/seabios.git/
 [submodule "roms/SLOF"]
path = roms/SLOF
-   url = git://git.qemu-project.org/SLOF.git
+   url = https://git.qemu.org/git/SLOF.git
 [submodule "roms/ipxe"]
path = roms/ipxe
-   url = git://git.qemu-project.org/ipxe.git
+   url = https://git.qemu.org/git/ipxe.git
 [submodule "roms/openbios"]
path = roms/openbios
-   url = git://git.qemu-project.org/openbios.git
+   url = https://git.qemu.org/git/openbios.git
 [submodule "roms/openhackware"]
path = roms/openhackware
-   url = git://git.qemu-project.org/openhackware.git
+   url = https://git.qemu.org/git/openhackware.git
 [submodule "roms/qemu-palcode"]
path = roms/qemu-palcode
-   url = git://git.qemu.org/qemu-palcode.git
+   url = https://git.qemu.org/git/qemu-palcode.git
 [submodule "roms/sgabios"]
path = roms/sgabios
-   url = git://git.qemu-project.org/sgabios.git
+   url = https://git.qemu.org/git/sgabios.git
 [submodule "dtc"]
path = dtc
-   url = git://git.qemu-project.org/dtc.git
+   url = https://git.qemu.org/git/dtc.git
 [submodule "roms/u-boot"]
path = roms/u-boot
-   url = git://git.qemu-project.org/u-boot.git
+   url = https://git.qemu.org/git/u-boot.git
 [submodule "roms/skiboot"]
path = roms/skiboot
-   url = git://git.qemu.org/skiboot.git
+   url = https://git.qemu.org/git/skiboot.git
 [submodule "roms/QemuMacDrivers"]
path = roms/QemuMacDrivers
-   url = git://git.qemu.org/QemuMacDrivers.git
+   url = https://git.qemu.org/git/QemuMacDrivers.git
 [submodule "ui/keycodemapdb"]
path = ui/keycodemapdb
-   url = git://git.qemu.org/keycodemapdb.git
+   url = https://git.qemu.org/git/keycodemapdb.git
 [submodule "capstone"]
path = capstone
-   url = git://git.qemu.org/capstone.git
+   url = https://git.qemu.org/git/capstone.git
 [submodule "roms/seabios-hppa"]
path = roms/seabios-hppa
-   url = git://github.com/hdeller/seabios-hppa.git
+   url = https://github.com/hdeller/seabios-hppa.git
 [submodule "roms/u-boot-sam460ex"]
path = roms/u-boot-sam460ex
-   url = git://git.qemu.org/u-boot-sam460ex.git
+   url = https://git.qemu.org/git/u-boot-sam460ex.git
 [submodule "tests/fp/berkeley-testfloat-3"]
path = tests/fp/berkeley-testfloat-3
-   url = git://github.com/cota/berkeley-testfloat-3
+   url = https://github.com/cota/berkeley-testfloat-3
 [submodule "tests/fp/berkeley-softfloat-3"]
path = tests/fp/berkeley-softfloat-3
-   url = git://github.com/cota/berkeley-softfloat-3
+   url = https://github.com/cota/berkeley-softfloat-3
-- 
2.17.2




Re: [Qemu-devel] [PATCH v4 09/10] block/nbd-client: nbd reconnect

2018-11-02 Thread Vladimir Sementsov-Ogievskiy
31.07.2018 20:30, Vladimir Sementsov-Ogievskiy wrote:
> Implement reconnect. To achieve this:
>
> 1. add new modes:
> connecting-wait: means that reconnecting is in progress, and there
>   was a small number of reconnect attempts, so all requests are
>   waiting for the connection.
> connecting-nowait: reconnecting is in progress, there were a lot of
>   reconnect attempts, all requests will return errors.
>
> two old modes are used too:
> connected: normal state
> quit: exiting after fatal error or on close
>
> Possible transitions are:
>
> * -> quit
> connecting-* -> connected
> connecting-wait -> connecting-nowait (transition is done after
>reconnect-delay seconds in connecting-wait mode)
> connected -> connecting-wait
>
> 2. Implement reconnect in connection_co. So, in connecting-* mode,
>  connection_co tries to reconnect an unlimited number of times.
>
> 3. Retry nbd queries on channel error, if we are in connecting-wait
>  state.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>   block/nbd-client.h |   4 +
>   block/nbd-client.c | 304 
> +++--
>   2 files changed, 255 insertions(+), 53 deletions(-)
>
> diff --git a/block/nbd-client.h b/block/nbd-client.h
> index ef8a6a9239..52e4ec66be 100644
> --- a/block/nbd-client.h
> +++ b/block/nbd-client.h
> @@ -40,6 +40,10 @@ typedef struct NBDClientSession {
>   Coroutine *connection_co;
>   int in_flight;
>   NBDClientState state;
> +bool receiving;
> +int connect_status;
> +Error *connect_err;
> +bool wait_in_flight;
>   
>   NBDClientRequest requests[MAX_NBD_REQUESTS];
>   NBDReply reply;
> diff --git a/block/nbd-client.c b/block/nbd-client.c
> index 41e6e6e702..b09907096d 100644
> --- a/block/nbd-client.c
> +++ b/block/nbd-client.c
> @@ -34,10 +34,26 @@
>   #define HANDLE_TO_INDEX(bs, handle) ((handle) ^ (uint64_t)(intptr_t)(bs))
>   #define INDEX_TO_HANDLE(bs, index)  ((index)  ^ (uint64_t)(intptr_t)(bs))

[...]

> +static coroutine_fn void nbd_reconnect_attempt(NBDConnection *con)
> +{
> +NBDClientSession *s = nbd_get_client_session(con->bs);
> +Error *local_err = NULL;
> +
> +assert(nbd_client_connecting(s));
> +
> +/* Wait completion of all in-flight requests */
> +
> +qemu_co_mutex_lock(&s->send_mutex);
> +
> +while (s->in_flight > 0) {
> +qemu_co_mutex_unlock(&s->send_mutex);
> +nbd_recv_coroutines_wake_all(s);
> +s->wait_in_flight = true;
> +qemu_coroutine_yield();
> +s->wait_in_flight = false;
> +qemu_co_mutex_lock(&s->send_mutex);
> +}
> +
> +qemu_co_mutex_unlock(&s->send_mutex);
> +
> +/* Now we are sure, that nobody accessing the channel now and nobody
> + * will try to access the channel, until we set state to CONNECTED
> + */
> +
> +/* Finalize previous connection if any */
> +if (s->ioc) {
> +nbd_client_detach_aio_context(con->bs);
> +object_unref(OBJECT(s->sioc));
> +s->sioc = NULL;
> +object_unref(OBJECT(s->ioc));
> +s->ioc = NULL;
> +}
> +
> +s->connect_status = nbd_client_connect(con->bs, con->saddr,
> +   con->export, con->tlscreds,
> +   con->hostname, 
> con->x_dirty_bitmap,
> +   &local_err);
> +error_free(s->connect_err);
> +s->connect_err = NULL;
> +error_propagate(&s->connect_err, local_err);
> +local_err = NULL;
>   
> -nbd_client_detach_aio_context(bs);
> -object_unref(OBJECT(client->sioc));
> -client->sioc = NULL;
> -object_unref(OBJECT(client->ioc));
> -client->ioc = NULL;
> +if (s->connect_status == -EINVAL) {
> +/* Protocol error or something like this, go to NBD_CLIENT_QUIT */
> +nbd_channel_error(s, s->connect_status);
> +return;

Unfortunately, nbd_client_connect returns -EINVAL for I/O errors instead
of -EIO, and it is not trivial to fix. So this if{} should be removed.

> +}
> +
> +if (s->connect_status < 0) {
> +/* failed attempt */
> +return;
> +}
> +
> +/* successfully connected */
> +s->state = NBD_CLIENT_CONNECTED;
> +qemu_co_queue_restart_all(&s->free_sema);
> +}
> +
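
For reference, a tiny standalone sketch of the state machine described in the
commit message above; the names are simplified stand-ins, not the actual
block/nbd-client.c code:

#include <stdbool.h>
#include <stdio.h>

typedef enum {
    NBD_CLIENT_CONNECTING_WAIT,    /* few attempts so far: requests wait */
    NBD_CLIENT_CONNECTING_NOWAIT,  /* many attempts: requests fail fast */
    NBD_CLIENT_CONNECTED,
    NBD_CLIENT_QUIT,
} NBDClientState;

static bool nbd_client_connecting(NBDClientState st)
{
    return st == NBD_CLIENT_CONNECTING_WAIT ||
           st == NBD_CLIENT_CONNECTING_NOWAIT;
}

/* Models the channel-error transitions from the list above: any state may
 * go to QUIT, and CONNECTED falls back to connecting-wait on a recoverable
 * error (the connecting-wait to connecting-nowait decay is time based and
 * not shown here). */
static NBDClientState on_channel_error(NBDClientState st, bool fatal)
{
    if (fatal || st == NBD_CLIENT_QUIT) {
        return NBD_CLIENT_QUIT;
    }
    return st == NBD_CLIENT_CONNECTED ? NBD_CLIENT_CONNECTING_WAIT : st;
}

int main(void)
{
    NBDClientState st = on_channel_error(NBD_CLIENT_CONNECTED, false);
    printf("still connecting: %d\n", nbd_client_connecting(st)); /* prints 1 */
    return 0;
}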



-- 
Best regards,
Vladimir



[Qemu-devel] [PATCH v2 1/5] README: use 'https://' instead of 'git://'

2018-11-02 Thread Stefan Hajnoczi
When you clone the repository without previous commit history, 'git://'
doesn't protect from man-in-the-middle attacks.  HTTPS is more secure
since the client verifies the server certificate.

Reported-by: Jann Horn 
Reviewed-by: Daniel P. Berrangé 
Signed-off-by: Stefan Hajnoczi 
---
 README | 4 ++--
 pc-bios/README | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/README b/README
index 49a9fd09cd..441c33eb2f 100644
--- a/README
+++ b/README
@@ -54,7 +54,7 @@ Submitting patches
 
 The QEMU source code is maintained under the GIT version control system.
 
-   git clone git://git.qemu.org/qemu.git
+   git clone https://git.qemu.org/git/qemu.git
 
 When submitting patches, one common approach is to use 'git
 format-patch' and/or 'git send-email' to format & send the mail to the
@@ -70,7 +70,7 @@ the QEMU website
 
 The QEMU website is also maintained under source control.
 
-  git clone git://git.qemu.org/qemu-web.git
+  git clone https://git.qemu.org/git/qemu-web.git
   https://www.qemu.org/2017/02/04/the-new-qemu-website-is-up/
 
 A 'git-publish' utility was created to make above process less
diff --git a/pc-bios/README b/pc-bios/README
index 90f0fa7aa7..b572e9eb00 100644
--- a/pc-bios/README
+++ b/pc-bios/README
@@ -23,7 +23,7 @@
   legacy x86 software to communicate with an attached serial console as
   if a video card were attached.  The master sources reside in a subversion
   repository at http://sgabios.googlecode.com/svn/trunk.  A git mirror is
-  available at git://git.qemu.org/sgabios.git.
+  available at https://git.qemu.org/git/sgabios.git.
 
 - The PXE roms come from the iPXE project. Built with BANNER_TIME 0.
   Sources available at http://ipxe.org.  Vendor:Device ID -> ROM mapping:
@@ -40,7 +40,7 @@
 
 - The u-boot binary for e500 comes from the upstream denx u-boot project where
   it was compiled using the qemu-ppce500 target.
-  A git mirror is available at: git://git.qemu.org/u-boot.git
+  A git mirror is available at: https://git.qemu.org/git/u-boot.git
   The hash used to compile the current version is: 2072e72
 
 - Skiboot (https://github.com/open-power/skiboot/) is an OPAL
-- 
2.17.2




[Qemu-devel] [PATCH v2 3/5] MAINTAINERS: use 'https://' instead of 'git://' for GitHub

2018-11-02 Thread Stefan Hajnoczi
When you clone the repository without previous commit history, 'git://'
doesn't protect from man-in-the-middle attacks.  HTTPS is more secure
since the client verifies the server certificate.

Reported-by: Jann Horn 
Reviewed-by: Daniel P. Berrangé 
Acked-by: Cornelia Huck 
Signed-off-by: Stefan Hajnoczi 
---
 MAINTAINERS | 74 ++---
 1 file changed, 37 insertions(+), 37 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index f2360efe3e..5cca097c49 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -74,7 +74,7 @@ S: Maintained
 L: qemu-triv...@nongnu.org
 K: ^Subject:.*(?i)trivial
 T: git git://git.corpit.ru/qemu.git trivial-patches
-T: git git://github.com/vivier/qemu.git trivial-patches
+T: git https://github.com/vivier/qemu.git trivial-patches
 
 Architecture support
 
@@ -98,7 +98,7 @@ F: pc-bios/s390-ccw.img
 F: target/s390x/
 F: docs/vfio-ap.txt
 K: ^Subject:.*(?i)s390x?
-T: git git://github.com/cohuck/qemu.git s390-next
+T: git https://github.com/cohuck/qemu.git s390-next
 L: qemu-s3...@nongnu.org
 
 Guest CPU cores (TCG):
@@ -295,7 +295,7 @@ F: tests/tcg/x86_64/
 F: hw/i386/
 F: disas/i386.c
 F: docs/qemu-cpu-models.texi
-T: git git://github.com/ehabkost/qemu.git x86-next
+T: git https://github.com/ehabkost/qemu.git x86-next
 
 Xtensa
 M: Max Filippov 
@@ -359,8 +359,8 @@ F: hw/intc/s390_flic.c
 F: hw/intc/s390_flic_kvm.c
 F: include/hw/s390x/s390_flic.h
 F: gdb-xml/s390*.xml
-T: git git://github.com/cohuck/qemu.git s390-next
-T: git git://github.com/borntraeger/qemu.git s390-next
+T: git https://github.com/cohuck/qemu.git s390-next
+T: git https://github.com/borntraeger/qemu.git s390-next
 L: qemu-s3...@nongnu.org
 
 X86
@@ -942,8 +942,8 @@ F: include/hw/s390x/
 F: hw/watchdog/wdt_diag288.c
 F: include/hw/watchdog/wdt_diag288.h
 F: default-configs/s390x-softmmu.mak
-T: git git://github.com/cohuck/qemu.git s390-next
-T: git git://github.com/borntraeger/qemu.git s390-next
+T: git https://github.com/cohuck/qemu.git s390-next
+T: git https://github.com/borntraeger/qemu.git s390-next
 L: qemu-s3...@nongnu.org
 
 S390-ccw Bios
@@ -952,7 +952,7 @@ M: Thomas Huth 
 S: Supported
 F: pc-bios/s390-ccw/
 F: pc-bios/s390-ccw.img
-T: git git://github.com/borntraeger/qemu.git s390-next
+T: git https://github.com/borntraeger/qemu.git s390-next
 L: qemu-s3...@nongnu.org
 
 UniCore32 Machines
@@ -1022,7 +1022,7 @@ S: Supported
 F: hw/core/machine.c
 F: hw/core/null-machine.c
 F: include/hw/boards.h
-T: git git://github.com/ehabkost/qemu.git machine-next
+T: git https://github.com/ehabkost/qemu.git machine-next
 
 Xtensa Machines
 ---
@@ -1058,7 +1058,7 @@ F: tests/ide-test.c
 F: tests/ahci-test.c
 F: tests/cdrom-test.c
 F: tests/libqos/ahci*
-T: git git://github.com/jnsnow/qemu.git ide
+T: git https://github.com/jnsnow/qemu.git ide
 
 IPMI
 M: Corey Minyard 
@@ -1067,7 +1067,7 @@ F: include/hw/ipmi/*
 F: hw/ipmi/*
 F: hw/smbios/smbios_type_38.c
 F: tests/ipmi*
-T: git git://github.com/cminyard/qemu.git master-ipmi-rebase
+T: git https://github.com/cminyard/qemu.git master-ipmi-rebase
 
 Floppy
 M: John Snow 
@@ -1076,7 +1076,7 @@ S: Supported
 F: hw/block/fdc.c
 F: include/hw/block/fdc.h
 F: tests/fdc-test.c
-T: git git://github.com/jnsnow/qemu.git ide
+T: git https://github.com/jnsnow/qemu.git ide
 
 OMAP
 M: Peter Maydell 
@@ -1144,7 +1144,7 @@ S: Odd Fixes
 F: hw/net/
 F: include/hw/net/
 F: tests/virtio-net-test.c
-T: git git://github.com/jasowang/qemu.git net
+T: git https://github.com/jasowang/qemu.git net
 
 SCSI
 M: Paolo Bonzini 
@@ -1153,7 +1153,7 @@ S: Supported
 F: include/hw/scsi/*
 F: hw/scsi/*
 F: tests/virtio-scsi-test.c
-T: git git://github.com/bonzini/qemu.git scsi-next
+T: git https://github.com/bonzini/qemu.git scsi-next
 
 SSI
 M: Peter Crosthwaite 
@@ -1208,7 +1208,7 @@ S: Supported
 F: hw/vfio/ccw.c
 F: hw/s390x/s390-ccw.c
 F: include/hw/s390x/s390-ccw.h
-T: git git://github.com/cohuck/qemu.git s390-next
+T: git https://github.com/cohuck/qemu.git s390-next
 L: qemu-s3...@nongnu.org
 
 vfio-ap
@@ -1247,7 +1247,7 @@ S: Supported
 F: hw/9pfs/
 F: fsdev/
 F: tests/virtio-9p-test.c
-T: git git://github.com/gkurz/qemu.git 9p-next
+T: git https://github.com/gkurz/qemu.git 9p-next
 
 virtio-blk
 M: Stefan Hajnoczi 
@@ -1256,7 +1256,7 @@ S: Supported
 F: hw/block/virtio-blk.c
 F: hw/block/dataplane/*
 F: tests/virtio-blk-test.c
-T: git git://github.com/stefanha/qemu.git block
+T: git https://github.com/stefanha/qemu.git block
 
 virtio-ccw
 M: Cornelia Huck 
@@ -1264,8 +1264,8 @@ M: Christian Borntraeger 
 S: Supported
 F: hw/s390x/virtio-ccw*.[hc]
 F: hw/s390x/vhost-vsock-ccw.c
-T: git git://github.com/cohuck/qemu.git s390-next
-T: git git://github.com/borntraeger/qemu.git s390-next
+T: git https://github.com/cohuck/qemu.git s390-next
+T: git https://github.com/borntraeger/qemu.git s390-next
 L: qemu-s3...@nongnu.org
 
 virtio-input
@@ -1469,7 +1469,7 @@ F: migration/block*
 F: include/block/aio.h
 F: include/block/aio-wait.

Re: [Qemu-devel] [PATCH v1 0/5] s390x/vfio: VFIO-AP interrupt control interception

2018-11-02 Thread David Hildenbrand
On 02.11.18 11:30, Pierre Morel wrote:
> The S390 APQP/AQIC instruction can be intercepted by the host
> to configure the AP queues' interrupt handling and to handle
> the ISC used by the host and the guest and the indicator address.
> 
> This patch series define the AQIC feature in the cpumodel,
> extend the APDevice type for per queue interrupt handling,
> intercept the APQP/AQIC instruction, uses the S390 adapter interface
> to setup the adapter and use a VFIO ioctl to let the VFIO-AP
> driver handle the host instruction associated with the intercepted
> guest instruction.
> 
> This patch series can be tested with the Linux/KVM patch series
> for the VFIO-AP driver: "s390: vfio: ap: Using GISA for AP Interrupt"
> 
> Pierre Morel (5):
>   s390x/vfio: ap: Linux uapi VFIO place holder
>   s390x/cpumodel: Set up CPU model for AQIC interception
>   s390x/vfio: ap: Definition for AP Adapter type
>   s390x/vfio: ap: Intercepting AP Queue Interrupt Control
>   s390x/vfio: ap: Implementing AP Queue Interrupt Control
> 
>  hw/vfio/ap.c| 100 
>  include/hw/s390x/ap-device.h|  55 ++
>  include/hw/s390x/css.h  |   1 +
>  linux-headers/linux/vfio.h  |  22 +++
>  target/s390x/cpu_features.c |   1 +
>  target/s390x/cpu_features_def.h |   1 +
>  target/s390x/cpu_models.c   |   1 +
>  target/s390x/kvm.c  |  20 +++
>  8 files changed, 201 insertions(+)
> 

I had a very quick high level look at it and it seems to be in general fine.

-- 

Thanks,

David / dhildenb



Re: [Qemu-devel] [QEMU PATCH v2 0/2]: KVM: i386: Add support for save and restore nested state

2018-11-02 Thread Liran Alon



> On 2 Nov 2018, at 11:40, Paolo Bonzini  wrote:
> 
> On 02/11/2018 04:46, Liran Alon wrote:
>>> On Thu, Nov 1, 2018 at 09:45 AM, Jim Mattson  wrote:
>> 
 On Thu, Nov 1, 2018 at 8:56 AM, Dr. David Alan Gilbert 
  wrote:
>> 
 So if I have matching host kernels it should always work?
 What happens if I upgrade the source kernel to increase it's maximum
 nested size, can I force it to keep things small for some VMs?
>> 
>>> Any change to the format of the nested state should be gated by a
>>> KVM_CAP set by userspace. (Unlike, say, how the
>>> KVM_VCPUEVENT_VALID_SMM flag was added to the saved VCPU events state
>>> in commit f077825a8758d.) KVM has traditionally been quite bad about
>>> maintaining backwards compatibility, but I hope the community is more
>>> cognizant of the issues now.
>> 
>>> As a cloud provider, one would only enable the new capability from
>>> userspace once all hosts in the pool have a kernel that supports it.
>>> During the transition, the capability would not be enabled on the
>>> hosts with a new kernel, and these hosts would continue to provide
>>> nested state that could be consumed by hosts running the older kernel.
>> 
>> Hmm this makes sense.
>> 
>> This means though that the patch I have submitted here isn't good enough.
>> My patch currently assumes that when it attempts to get nested state from 
>> KVM,
>> QEMU should always set nested_state->size to max size supported by KVM as 
>> received
>> from kvm_check_extension(s, KVM_CAP_NESTED_STATE);
>> (See kvm_get_nested_state() introduced on my patch).
>> This indeed won't allow migration from host with new KVM to host with old 
>> KVM if
>> nested_state size was enlarged between these KVM versions.
>> Which is obviously an issue.
> 
> Actually I think this is okay, because unless the "new" capability was
> enabled, KVM would always reduce nested_state->size to a value that is
> compatible with current kernels.
> 
>> But on second thought, I'm not sure that this is the right approach as well.
>> We don't really want the used version of nested_state to be determined on
>> kvm_init().
>> * On source QEMU, we actually want to determine it when preparing for
>> migration based
>> on the support given by our destination host. If it's an old host, we
>> would like to
>> save an old version nested_state and if it's a new host, we would like to
>> save our newest
>> supported nested_state.
> 
> No, that's wrong because it would lead to losing state.  If the source
> QEMU supports more state than the destination QEMU, and the current VM
> state needs to transmit it for migration to be _correct_, then migration
> to that destination QEMU must fail.
> 
> In particular, enabling the new KVM capability needs to be gated by a
> new machine type and/or -cpu flag, if migration compatibility is needed.
> (In particular, this is one reason why I haven't considered this series
> for 3.1.  Right now, migration of nested hypervisors is completely
> busted but if we make it "almost" work, pre-3.1 machine types would not
> ever be able to add support for KVM_CAP_EXCEPTION_PAYLOAD.  Therefore,
> it's better for users if we wait for one release more, and add support
> for KVM_CAP_NESTED_STATE and KVM_CAP_EXCEPTION_PAYLOAD at the same time).
> 
> Personally, I would like to say that, starting from QEMU 3.2, enabling
> nested VMX requires a 4.20 kernel.  It's a bit bold, but I think it's a
> good way to keep some sanity.  Any opinions on that?
> 
> Paolo

If I understand you correctly, you wish that the nested_state version used be
tied to the machine-type used to launch the guest.
The reason I am not fond of this approach is that it means that once a VM is
launched with some machine-type, its nested_state
will forever be saved with a specific version, even if this VM has already
migrated to a host with a newer kernel which knows how to
save more accurate state for the next migration.

The scheme I have described below should avoid this kind of issue while still
preserving the ability to migrate to older hosts.
Note that I believe it is the responsibility of the mgmt-layer to weigh the
risk of migrating from a new host to an old host
in case the old host cannot receive the full nested_state that the new host is
capable of generating. I don’t think migration
should fail in this case, as it currently does with the patch I have submitted.

So in general I still agree with Jim's approach, but I’m not convinced we 
should use KVM_CAP for this.
(See details on proposed scheme below).

What are the disadvantages you see in using the proposed scheme below?
Why is using KVM_CAP better?
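
For reference, a hedged userspace sketch of the ioctl flow both schemes build
on; the field names follow the uapi proposed in this series and may still
change, and error handling is omitted:

#include <linux/kvm.h>      /* assumes a kernel exposing the nested-state uapi */
#include <stdlib.h>
#include <sys/ioctl.h>

/* Source side: ask KVM for the maximum nested-state size it supports,
 * allocate a buffer of that size and let KVM fill it in.  The size field
 * is the "requested" size discussed in this thread; KVM is expected to
 * rewrite it to the size it actually used. */
struct kvm_nested_state *get_nested_state(int kvm_fd, int vcpu_fd)
{
    int max = ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_NESTED_STATE);
    struct kvm_nested_state *state;

    if (max <= 0) {
        return NULL;               /* no nested-state support in this kernel */
    }
    state = calloc(1, max);
    state->size = max;
    if (ioctl(vcpu_fd, KVM_GET_NESTED_STATE, state) < 0) {
        free(state);
        return NULL;
    }
    return state;
}

/* Destination side, after sanity checks on the migration stream: KVM would
 * deduce what to restore from state->size (or, in the alternative proposal,
 * from state->flags). */
int set_nested_state(int vcpu_fd, struct kvm_nested_state *state)
{
    return ioctl(vcpu_fd, KVM_SET_NESTED_STATE, state);
}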

BTW, I agree with the rest of the group here that it’s too aggressive to make
QEMU 3.2 require kernel 4.20 for using nVMX.
This will hurt common nVMX workloads that don’t care about the ability to
migrate.

-Liran

> 
>> Therefore, I don't think that we want this versioning to be based on KVM_CAP 
>> at all.
>> It seems that we would w

Re: [Qemu-devel] [PATCH 3/4] MAINTAINERS: use 'https://' instead of 'git://' for GitHub

2018-11-02 Thread Stefan Hajnoczi
On Wed, Oct 31, 2018 at 08:31:09AM -0500, Eric Blake wrote:
> On 10/31/18 3:43 AM, Stefan Hajnoczi wrote:
> > When you clone the repository without previous commit history, 'git://'
> > doesn't protect from man-in-the-middle attacks.  HTTPS is more secure
> > since the client verifies the server certificate.
> > 
> > Reported-by: Jann Horn 
> > Signed-off-by: Stefan Hajnoczi 
> > ---
> >   MAINTAINERS | 74 ++---
> >   1 file changed, 37 insertions(+), 37 deletions(-)
> 
> We should also do the same for maintainers using git://repo.or.cz:

Your wish is my command.  Fixed in v2.

Stefan




Re: [Qemu-devel] [PATCH v2 5/5] git: use HTTPS git URLs for repo.or.cz

2018-11-02 Thread Eric Blake

On 11/2/18 7:42 AM, Stefan Hajnoczi wrote:

repo.or.cz supports git smart HTTPS.  Use it in preference to git:// or
http:// since it's more secure.


Not sure why you didn't copy-paste the commit messages from the other 
patches, but it doesn't really matter.




Suggested-by: Eric Blake 
Signed-off-by: Stefan Hajnoczi 
---
  MAINTAINERS| 14 +++---
  pc-bios/README |  2 +-
  2 files changed, 8 insertions(+), 8 deletions(-)



+++ b/pc-bios/README
@@ -5,7 +5,7 @@
project (http://www.nongnu.org/vgabios/).
  
  - The PowerPC Open Hack'Ware Open Firmware Compatible BIOS is

-  available at http://repo.or.cz/w/openhackware.git.
+  available at https://repo.or.cz/w/openhackware.git.


This one fails. Remove 'w/' and you can have:
Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-devel] [PATCH v2 0/5] Use 'https://' instead of 'git://'

2018-11-02 Thread Eric Blake

On 11/2/18 7:42 AM, Stefan Hajnoczi wrote:

v2:
  * Use HTTPS for repo.or.cz [Eric]

Jeff Cody has enabled git smart HTTP support on qemu.org.  From now on HTTPS is
the preferred protocol because it adds some protection against
man-in-the-middle attacks when cloning a repo.

This patch series updates git:// URLs and changes them to https://.  The 
https:// URL format is:

   https://git.qemu.org/git/<project>.git

The old git:// URL format was:

   git://git.qemu.org/<project>.git

I have also updated git://github.com/ and repo.or.cz URLs because they offer 
HTTPS.

I have tested that submodules continue to work after the change to .gitmodules.

Stefan Hajnoczi (5):
   README: use 'https://' instead of 'git://'
   get_maintainer: use 'https://' instead of 'git://'
   MAINTAINERS: use 'https://' instead of 'git://' for GitHub
   gitmodules: use 'https://' instead of 'git://'
   git: use HTTPS git URLs for repo.or.cz

  MAINTAINERS   | 88 +++
  .gitmodules   | 34 +++
  README|  4 +-
  pc-bios/README|  6 +--
  scripts/get_maintainer.pl |  2 +-
  5 files changed, 67 insertions(+), 67 deletions(-)


Other files still mentioning git://:

hw/misc/pc-testdev.c (git.kernel.org)
tests/docker/dockerfiles/debian-amd64.docker (anongit.freedesktop.org)

As I haven't regularly used either of those hosting sites, and was too 
lazy to check if I could clone with smart https:, I'll leave it to you 
to do a followup patch if we care.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-devel] [PATCH] qemu/units: Move out QCow2 specific definitions

2018-11-02 Thread Philippe Mathieu-Daudé

Hi Kevin,

On 2/11/18 12:07, Kevin Wolf wrote:

Am 02.11.2018 um 09:58 hat Philippe Mathieu-Daudé geschrieben:

These definitions are QCow2 specific; there is no need to expose them
in the global namespace.

This partially reverts commit 540b8492618eb.

Signed-off-by: Philippe Mathieu-Daudé 


If we don't want this globally, I think we also don't want it in qcow2.


I only see these definitions used by block/qcow2.h (b6a95c6d1007).

Per the 540b8492618eb description, "This is needed when a size has to be
stringified", but I can't find other code requiring these definitions in
the codebase.
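
For reference, a hedged illustration of the stringification point; the
constants below are made up for the example and are not the actual units.h
contents:

#include <stdio.h>

#define MiB      (1024 * 1024)
#define S_2MiB   2097152          /* plain literal, stringifies cleanly */

#define xstr(x)  str(x)
#define str(x)   #x

int main(void)
{
    puts(xstr(S_2MiB));     /* prints: 2097152 */
    puts(xstr(2 * MiB));    /* prints: 2 * (1024 * 1024) */
    return 0;
}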



Or at least reduce it to only those constants that qcow2 actually uses.


Fine by me, I'll let Leonid opine first.

Regards,

Phil.



Re: [Qemu-devel] [PATCH v1 2/7] pcihp: overwrite hotplug handler recursively from the start

2018-11-02 Thread Igor Mammedov
On Fri, 2 Nov 2018 12:43:10 +0100
David Hildenbrand  wrote:

> On 01.11.18 15:10, Igor Mammedov wrote:
> > On Wed, 24 Oct 2018 12:19:25 +0200
> > David Hildenbrand  wrote:
> >   
> >> For now, the hotplug handler is not called for devices that are
> >> being cold plugged. The hotplug handler is setup when the machine
> >> initialization is fully done. Only bridges that were cold plugged are
> >> considered.
> >>
> >> Set the hotplug handler for the root piix bus directly when realizing.
> >> Overwrite the hotplug handler of bridges when hotplugging/coldplugging
> >> them.
> >>
> >> This will now make sure that the ACPI PCI hotplug handler is also called
> >> for cold-plugged devices (also on bridges) and for bridges that were
> >> hotplugged.
> >>
> >> When trying to hotplug a device to a hotplugged bridge, we now correctly
> >> get the error message
> >>  "Unsupported bus. Bus doesn't have property 'acpi-pcihp-bsel' set"
> >> Instead of going via the standard PCI hotplug handler.
> > Erroring out is probably not ok, since it can break existing setups
> > where SHPC hotplugging to hotplugged bridge was working just fine before.  
> 
> The question is if it actually was supposed (and eventually did) work.
I think it works now; it's a limitation of QEMU's 'ACPI hotplug hack'
(which exists for the sake of Windows). We weren't able to dynamically add
an ACPI description for a hotplugged bridge, so it was using native hotplug.
Now we could theoretically load tables dynamically, but that would add a
maintenance nightmare (versioned tables) and would be harder to debug.
I'd rather not go that direction and keep the current limited version,
suggesting users use native hotplug if the guest is capable.

> If this was the expected behavior (mixing hotplug types), then the
> necessary change to this patch would boil down to checking if the bridge
> is hot- or coldplugged.
> 
> > 
> > Marcel/Michael what's your take on this change in behaviour?
> > CCing libvirt in case they are doing this stuff
> >   
> 
> Indeed, it would be nice to know if this was actually supposed to work
> like this (coldplugged bridges using ACPI hotplug and hotplugged bridges
> using SHPC hotplug).
> 
> 




Re: [Qemu-devel] [PULL 0/1] M68k for 3.1 patches

2018-11-02 Thread Peter Maydell
On 1 November 2018 at 11:37, Laurent Vivier  wrote:
> The following changes since commit 7d51a855cd568ec3399a1834ada4023cfa12f231:
>
>   Merge remote-tracking branch 'remotes/xtensa/tags/20181030-xtensa' into 
> staging (2018-10-31 16:11:43 +)
>
> are available in the Git repository at:
>
>   git://github.com/vivier/qemu-m68k.git tags/m68k-for-3.1-pull-request
>
> for you to fetch changes up to b9f8e55bf7e994e192ab7360830731580384b813:
>
>   target/m68k: use EXCP_ILLEGAL instead of EXCP_UNSUPPORTED (2018-11-01 
> 12:12:24 +0100)
>
> 
> Fix illegal instruction exception number
>
> 
>
> Laurent Vivier (1):
>   target/m68k: use EXCP_ILLEGAL instead of EXCP_UNSUPPORTED
>
>  linux-user/m68k/cpu_loop.c | 1 -
>  target/m68k/cpu.h  | 1 -
>  target/m68k/translate.c| 6 +++---
>  3 files changed, 3 insertions(+), 5 deletions(-)
>

Applied, thanks.

-- PMM



[Qemu-devel] [PATCH v5 0/2] arm: Add first models of Xilinx Versal SoC

2018-11-02 Thread Edgar E. Iglesias
This patch series adds initial support for Xilinx's Versal SoC.
Xilinx is introducing Versal, an adaptive compute acceleration platform
(ACAP), built on 7nm FinFET process technology. Versal ACAPs combine Scalar
Processing Engines, Adaptable Hardware Engines, and Intelligent Engines with
leading-edge memory and interfacing technologies to deliver powerful
heterogeneous acceleration for any application. The Versal AI Core series has
five devices, offering 128 to 400 AI Engines. The series includes dual-core Arm
Cortex-A72 application processors, dual-core Arm Cortex-R5 real-time
processors, 256KB of on-chip memory with ECC, more than 1,900 DSP engines
optimized for high-precision floating point with low latency.

More info can be found here:
https://www.xilinx.com/news/press/2018/xilinx-unveils-versal-the-first-in-a-new-category-of-platforms-delivering-rapid-innovation-with-software-programmability-and-scalable-ai-inference.html


In QEMU we'd like to have a virtual developer board with the Versal SoC
and a selected set of peripherals under the control of QEMU.
We'd like to gradually extend this board as QEMU gains more support
for Versal hardware components. QEMU will generate a device-tree
describing only the components it supports and includes in the virtual
dev board.

Before adding Versal support, this series starts with a few fixes to the
GEM that I ran into when running recent kernels on the Versal and ZynqMP
models.

I also noticed a problem with HVC insns not being enabled when using
QEMU's PSCI implementation on CPUs with EL2 and EL3 enabled. This causes
problems for Linux/KVM guests, also fixed in this series.

Best regards,
Edgar

ChangeLog:
v4 -> v5:
* Correct setting of boolean full-duplex fdt property

v3 -> v4:
* Improve error handling for CPU and GIC creation
* Remove KVM checks in GIC creation

v2 -> v3:
* Fix DESCONF6 queue mask generation for GEMs with only one queue.

v1 -> v2:
* Spell out OCM as On Chip Memory
* apperture -> aperture
* Remove copy+pasted virt board comment
* Remove VMSD for Versal SoC (with a comment on why it's not needed)
* Embed AddressSpace dma object in GEM
* Remove debug left-overs in arm-powerctl
* Enable PMU in Cortex-A72
* Rename cortex_a57_a53_cp_reginfo -> cortex_a72_a57_a53_cp_reginfo


Edgar E. Iglesias (2):
  hw/arm: versal: Add a model of Xilinx Versal SoC
  hw/arm: versal: Add a virtual Xilinx Versal board

 default-configs/aarch64-softmmu.mak |   1 +
 hw/arm/Makefile.objs|   1 +
 hw/arm/xlnx-versal-virt.c   | 494 
 hw/arm/xlnx-versal.c| 323 ++
 include/hw/arm/xlnx-versal.h| 122 +++
 5 files changed, 941 insertions(+)
 create mode 100644 hw/arm/xlnx-versal-virt.c
 create mode 100644 hw/arm/xlnx-versal.c
 create mode 100644 include/hw/arm/xlnx-versal.h

-- 
2.17.1




[Qemu-devel] [PATCH v5 2/2] hw/arm: versal: Add a virtual Xilinx Versal board

2018-11-02 Thread Edgar E. Iglesias
Add a virtual Xilinx Versal board.

This board is based on the Xilinx Versal SoC. The exact
details of what peripherals are attached to this board
will remain under the control of QEMU. QEMU will generate an
FDT on the fly for Linux and other software to auto-discover
peripherals.

Signed-off-by: Edgar E. Iglesias 
---
 hw/arm/Makefile.objs  |   2 +-
 hw/arm/xlnx-versal-virt.c | 494 ++
 2 files changed, 495 insertions(+), 1 deletion(-)
 create mode 100644 hw/arm/xlnx-versal-virt.c

diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
index ec21d9bc1f..50c7b4a927 100644
--- a/hw/arm/Makefile.objs
+++ b/hw/arm/Makefile.objs
@@ -26,7 +26,7 @@ obj-$(CONFIG_ALLWINNER_A10) += allwinner-a10.o cubieboard.o
 obj-$(CONFIG_RASPI) += bcm2835_peripherals.o bcm2836.o raspi.o
 obj-$(CONFIG_STM32F205_SOC) += stm32f205_soc.o
 obj-$(CONFIG_XLNX_ZYNQMP_ARM) += xlnx-zynqmp.o xlnx-zcu102.o
-obj-$(CONFIG_XLNX_VERSAL) += xlnx-versal.o
+obj-$(CONFIG_XLNX_VERSAL) += xlnx-versal.o xlnx-versal-virt.o
 obj-$(CONFIG_FSL_IMX25) += fsl-imx25.o imx25_pdk.o
 obj-$(CONFIG_FSL_IMX31) += fsl-imx31.o kzm.o
 obj-$(CONFIG_FSL_IMX6) += fsl-imx6.o sabrelite.o
diff --git a/hw/arm/xlnx-versal-virt.c b/hw/arm/xlnx-versal-virt.c
new file mode 100644
index 00..1e31a3f442
--- /dev/null
+++ b/hw/arm/xlnx-versal-virt.c
@@ -0,0 +1,494 @@
+/*
+ * Xilinx Versal Virtual board.
+ *
+ * Copyright (c) 2018 Xilinx Inc.
+ * Written by Edgar E. Iglesias
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 or
+ * (at your option) any later version.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+#include "sysemu/device_tree.h"
+#include "exec/address-spaces.h"
+#include "hw/boards.h"
+#include "hw/sysbus.h"
+#include "hw/arm/sysbus-fdt.h"
+#include "hw/arm/fdt.h"
+#include "cpu.h"
+#include "hw/arm/xlnx-versal.h"
+
+#define TYPE_XLNX_VERSAL_VIRT_MACHINE MACHINE_TYPE_NAME("xlnx-versal-virt")
+#define XLNX_VERSAL_VIRT_MACHINE(obj) \
+OBJECT_CHECK(VersalVirt, (obj), TYPE_XLNX_VERSAL_VIRT_MACHINE)
+
+typedef struct VersalVirt {
+MachineState parent_obj;
+
+Versal soc;
+MemoryRegion mr_ddr;
+
+void *fdt;
+int fdt_size;
+struct {
+uint32_t gic;
+uint32_t ethernet_phy[2];
+uint32_t clk_125Mhz;
+uint32_t clk_25Mhz;
+} phandle;
+struct arm_boot_info binfo;
+
+struct {
+bool secure;
+} cfg;
+} VersalVirt;
+
+static void fdt_create(VersalVirt *s)
+{
+MachineClass *mc = MACHINE_GET_CLASS(s);
+int i;
+
+s->fdt = create_device_tree(&s->fdt_size);
+if (!s->fdt) {
+error_report("create_device_tree() failed");
+exit(1);
+}
+
+/* Allocate all phandles.  */
+s->phandle.gic = qemu_fdt_alloc_phandle(s->fdt);
+for (i = 0; i < ARRAY_SIZE(s->phandle.ethernet_phy); i++) {
+s->phandle.ethernet_phy[i] = qemu_fdt_alloc_phandle(s->fdt);
+}
+s->phandle.clk_25Mhz = qemu_fdt_alloc_phandle(s->fdt);
+s->phandle.clk_125Mhz = qemu_fdt_alloc_phandle(s->fdt);
+
+/* Create /chosen node for load_dtb.  */
+qemu_fdt_add_subnode(s->fdt, "/chosen");
+
+/* Header */
+qemu_fdt_setprop_cell(s->fdt, "/", "interrupt-parent", s->phandle.gic);
+qemu_fdt_setprop_cell(s->fdt, "/", "#size-cells", 0x2);
+qemu_fdt_setprop_cell(s->fdt, "/", "#address-cells", 0x2);
+qemu_fdt_setprop_string(s->fdt, "/", "model", mc->desc);
+qemu_fdt_setprop_string(s->fdt, "/", "compatible", "xlnx-versal-virt");
+}
+
+static void fdt_add_clk_node(VersalVirt *s, const char *name,
+ unsigned int freq_hz, uint32_t phandle)
+{
+qemu_fdt_add_subnode(s->fdt, name);
+qemu_fdt_setprop_cell(s->fdt, name, "phandle", phandle);
+qemu_fdt_setprop_cell(s->fdt, name, "clock-frequency", freq_hz);
+qemu_fdt_setprop_cell(s->fdt, name, "#clock-cells", 0x0);
+qemu_fdt_setprop_string(s->fdt, name, "compatible", "fixed-clock");
+qemu_fdt_setprop(s->fdt, name, "u-boot,dm-pre-reloc", NULL, 0);
+}
+
+static void fdt_add_cpu_nodes(VersalVirt *s, uint32_t psci_conduit)
+{
+int i;
+
+qemu_fdt_add_subnode(s->fdt, "/cpus");
+qemu_fdt_setprop_cell(s->fdt, "/cpus", "#size-cells", 0x0);
+qemu_fdt_setprop_cell(s->fdt, "/cpus", "#address-cells", 1);
+
+for (i = XLNX_VERSAL_NR_ACPUS - 1; i >= 0; i--) {
+char *name = g_strdup_printf("/cpus/cpu@%d", i);
+ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(i));
+
+qemu_fdt_add_subnode(s->fdt, name);
+qemu_fdt_setprop_cell(s->fdt, name, "reg", armcpu->mp_affinity);
+if (psci_conduit != QEMU_PSCI_CONDUIT_DISABLED) {
+qemu_fdt_setprop_string(s->fdt, name, "enable-method", "psci");
+}
+qemu_fdt_setprop_string(s->fdt, name, "device_type", "cpu");
+qemu_fdt_setprop_string(s->fdt, name, "compatible",
+ 

[Qemu-devel] [PATCH v5 1/2] hw/arm: versal: Add a model of Xilinx Versal SoC

2018-11-02 Thread Edgar E. Iglesias
Add a model of Xilinx Versal SoC.

Signed-off-by: Edgar E. Iglesias 
---
 default-configs/aarch64-softmmu.mak |   1 +
 hw/arm/Makefile.objs|   1 +
 hw/arm/xlnx-versal.c| 323 
 include/hw/arm/xlnx-versal.h| 122 +++
 4 files changed, 447 insertions(+)
 create mode 100644 hw/arm/xlnx-versal.c
 create mode 100644 include/hw/arm/xlnx-versal.h

diff --git a/default-configs/aarch64-softmmu.mak 
b/default-configs/aarch64-softmmu.mak
index 6f790f061a..4ea9add003 100644
--- a/default-configs/aarch64-softmmu.mak
+++ b/default-configs/aarch64-softmmu.mak
@@ -8,4 +8,5 @@ CONFIG_DDC=y
 CONFIG_DPCD=y
 CONFIG_XLNX_ZYNQMP=y
 CONFIG_XLNX_ZYNQMP_ARM=y
+CONFIG_XLNX_VERSAL=y
 CONFIG_ARM_SMMUV3=y
diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
index 5f88062c66..ec21d9bc1f 100644
--- a/hw/arm/Makefile.objs
+++ b/hw/arm/Makefile.objs
@@ -26,6 +26,7 @@ obj-$(CONFIG_ALLWINNER_A10) += allwinner-a10.o cubieboard.o
 obj-$(CONFIG_RASPI) += bcm2835_peripherals.o bcm2836.o raspi.o
 obj-$(CONFIG_STM32F205_SOC) += stm32f205_soc.o
 obj-$(CONFIG_XLNX_ZYNQMP_ARM) += xlnx-zynqmp.o xlnx-zcu102.o
+obj-$(CONFIG_XLNX_VERSAL) += xlnx-versal.o
 obj-$(CONFIG_FSL_IMX25) += fsl-imx25.o imx25_pdk.o
 obj-$(CONFIG_FSL_IMX31) += fsl-imx31.o kzm.o
 obj-$(CONFIG_FSL_IMX6) += fsl-imx6.o sabrelite.o
diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
new file mode 100644
index 00..5ee58c09be
--- /dev/null
+++ b/hw/arm/xlnx-versal.c
@@ -0,0 +1,323 @@
+/*
+ * Xilinx Versal SoC model.
+ *
+ * Copyright (c) 2018 Xilinx Inc.
+ * Written by Edgar E. Iglesias
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 or
+ * (at your option) any later version.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu-common.h"
+#include "qemu/log.h"
+#include "hw/sysbus.h"
+#include "net/net.h"
+#include "sysemu/sysemu.h"
+#include "sysemu/kvm.h"
+#include "hw/arm/arm.h"
+#include "kvm_arm.h"
+#include "hw/misc/unimp.h"
+#include "hw/intc/arm_gicv3_common.h"
+#include "hw/arm/xlnx-versal.h"
+
+#define XLNX_VERSAL_ACPU_TYPE ARM_CPU_TYPE_NAME("cortex-a72")
+#define GEM_REVISION0x40070106
+
+static void versal_create_apu_cpus(Versal *s)
+{
+int i;
+
+for (i = 0; i < ARRAY_SIZE(s->fpd.apu.cpu); i++) {
+Object *obj;
+char *name;
+
+obj = object_new(XLNX_VERSAL_ACPU_TYPE);
+if (!obj) {
+/* Secondary CPUs start in PSCI powered-down state */
+error_report("Unable to create apu.cpu[%d] of type %s",
+ i, XLNX_VERSAL_ACPU_TYPE);
+exit(EXIT_FAILURE);
+}
+
+name = g_strdup_printf("apu-cpu[%d]", i);
+object_property_add_child(OBJECT(s), name, obj, &error_fatal);
+g_free(name);
+
+object_property_set_int(obj, s->cfg.psci_conduit,
+"psci-conduit", &error_abort);
+if (i) {
+object_property_set_bool(obj, true,
+ "start-powered-off", &error_abort);
+}
+
+object_property_set_int(obj, ARRAY_SIZE(s->fpd.apu.cpu),
+"core-count", &error_abort);
+object_property_set_link(obj, OBJECT(&s->fpd.apu.mr), "memory",
+ &error_abort);
+object_property_set_bool(obj, true, "realized", &error_fatal);
+s->fpd.apu.cpu[i] = ARM_CPU(obj);
+}
+}
+
+static void versal_create_apu_gic(Versal *s, qemu_irq *pic)
+{
+static const uint64_t addrs[] = {
+MM_GIC_APU_DIST_MAIN,
+MM_GIC_APU_REDIST_0
+};
+SysBusDevice *gicbusdev;
+DeviceState *gicdev;
+int nr_apu_cpus = ARRAY_SIZE(s->fpd.apu.cpu);
+int i;
+
+sysbus_init_child_obj(OBJECT(s), "apu-gic",
+  &s->fpd.apu.gic, sizeof(s->fpd.apu.gic),
+  gicv3_class_name());
+gicbusdev = SYS_BUS_DEVICE(&s->fpd.apu.gic);
+gicdev = DEVICE(&s->fpd.apu.gic);
+qdev_prop_set_uint32(gicdev, "revision", 3);
+qdev_prop_set_uint32(gicdev, "num-cpu", 2);
+qdev_prop_set_uint32(gicdev, "num-irq", XLNX_VERSAL_NR_IRQS + 32);
+qdev_prop_set_uint32(gicdev, "len-redist-region-count", 1);
+qdev_prop_set_uint32(gicdev, "redist-region-count[0]", 2);
+qdev_prop_set_bit(gicdev, "has-security-extensions", true);
+
+object_property_set_bool(OBJECT(&s->fpd.apu.gic), true, "realized",
+&error_fatal);
+
+for (i = 0; i < ARRAY_SIZE(addrs); i++) {
+MemoryRegion *mr;
+
+mr = sysbus_mmio_get_region(gicbusdev, i);
+memory_region_add_subregion(&s->fpd.apu.mr, addrs[i], mr);
+}
+
+for (i = 0; i < nr_apu_cpus; i++) {
+DeviceState *cpudev = DEVICE(s->fpd.apu.cpu[i]);
+int ppibase = XLNX_VERSAL_NR_IRQS + i * GIC_INTERNAL + GIC_NR_SGIS;
+qemu_irq maint_irq;
+   

Re: [Qemu-devel] [PATCH for 3.2 v2 0/7] hw/arm/bcm2835: Add basic support for cprman (clock subsystem)

2018-11-02 Thread Guenter Roeck

On 11/2/18 12:48 AM, Philippe Mathieu-Daudé wrote:

On Fri, Nov 2, 2018 at 8:32 AM Philippe Mathieu-Daudé  wrote:


Hi Guenter,

On Fri, Nov 2, 2018 at 3:52 AM Guenter Roeck  wrote:


On 11/1/18 5:12 PM, Philippe Mathieu-Daudé wrote:

Hi,

This series is a mix of a previous work I had for the raspi, and a patch from
Guenter: https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg03153.html

The final patch keeps Guenter's ideas and comments, but is mostly a rewrite.
I dropped the A2W code from this work; it doesn't seem useful to me.

Guenter, can you test this series?



arm/raspi2 works, but aarch64/raspi3 stalls.


Thanks for testing it!

So I suppose the A2W is required. And I'm probably using too old a kernel;
I'm using the Buster preview following Peter's post [1]:

[0.00] Linux version 4.14.0-3-arm64
(debian-ker...@lists.debian.org) (gcc version 7.2.0 (Debian 7.2.0-18))
#1 SMP Debian 4.14.12-2 (2018-01-06)
[0.00] Machine model: Raspberry Pi 3 Model B
[...]
[8.044215] systemd[1]: Detected architecture arm64.
Welcome to Debian GNU/Linux buster/sid!

Debian GNU/Linux buster/sid rpi3 ttyAMA0

rpi3 login: root
Password:
Linux rpi3 4.14.0-3-arm64 #1 SMP Debian 4.14.12-2 (2018-01-06) aarch64
root@rpi3:~#

I'll look for a newer kernel.


I'm a bit confused since I can boot a 4.19 kernel:

[0.00] Booting Linux on physical CPU 0x00 [0x410fd034]
[0.00] Linux version 4.19.0 (gokrazy@docker) (gcc version
6.3.0 20170516 (Debian 6.3.0-18)) #1 SMP PREEMPT Wed Mar 1 20:57:29
UTC 2017
[0.00] Machine model: Raspberry Pi 3 Model B
[0.00] earlycon: pl11 at MMIO 0x3f201000 (options '')
[0.00] bootconsole [pl11] enabled
...
[2.722577] Freeing unused kernel memory: 5696K
[2.723256] Run /init as init process
Loading, please wait...
starting version 236
...
root@rpi3:~# uname -a
Linux rpi3 4.19.0 #1 SMP PREEMPT Wed Mar 1 20:57:29 UTC 2017 aarch64 GNU/Linux



BTW I use these QEMU command line options while testing:

qemu-system-aarch64 \
   -d unimp,guest_errors \
   -trace bcm2835_cprman_rd_\* -trace bcm2835_cprman_wr_\* \
   ...

And the cmdline suggested by Peter:

   -append "rw earlycon=pl011,0x3f201000 console=ttyAMA0 loglevel=8
root=/dev/mmcblk0p2 fsck.repair=yes net.ifnames=0 rootwait memtest=1"

[1] 
https://translatedcode.wordpress.com/2018/04/25/debian-on-qemus-raspberry-pi-3-model/



[   45.683302] Run /sbin/init as init process


init is run way after A2W register accesses, so I doubt they are the
problem here.

Can you provide me your testing setup?



-append 'earlycon=uart8250,mmio32,0x3f215040 rdinit=/sbin/init panic=-1 
console=ttyS1,115200'

On raspi3, ttyAMA0 can not be used as console because it is connected to 
something else.
I can boot if I use ttyAMA0 as console. Comparing the log output with the 
output when
using my original patch, it looks like ttyS1 doesn't come up.

Hope this helps,
Guenter



Re: [Qemu-devel] [PATCH] tests: Disable test-bdrv-drain

2018-11-02 Thread Peter Maydell
On 9 October 2018 at 12:16, Paolo Bonzini  wrote:
> On 08/10/2018 18:40, Kevin Wolf wrote:
>>>
>>> I'm pretty confident this analysis of the problem is correct:
>>> unfortunately I have no idea what the right way to fix it is...
>> Yes, I agree with your analysis. If __thread variables can be destructed
>> before pthread_key_create() destructors are called (and in particular if
>> the former are implemented in terms of the latter), this implies at
>> least two rules:
>>
>> 1. The Notifier itself can't be a TLS variable
>>
>> 2. The notifier callback can't access any TLS variables
>>
>> Of course, with these restrictions, qemu_thread_atexit_*() with its
>> existing API is as useless as it could be.
>
> Yup, we have to stop using pthread_key_create.  Luckily, these days
> there is always qemu_thread_start that wraps the thread, so we can call
> qemu_thread_atexit_run from there, and change exit_key to a thread-local
> NotifierList.

We would also need to catch exits via qemu_thread_exit(), right?
We probably also need to handle the main thread specially, via
atexit(). This seems to be pretty much what we already do in
util/qemu-thread-win32.c...
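
For reference, a minimal standalone sketch of the ordering hazard being
discussed; hypothetical code, not QEMU's, and whether the access in the
destructor is safe depends on how the toolchain implements __thread:

#include <pthread.h>
#include <stdio.h>

static pthread_key_t exit_key;

/* Rule 1 violation: the "notifier" object lives in TLS.
 * Rule 2 violation: the destructor below reads another TLS variable. */
static __thread int tls_counter;
static __thread int tls_notifier;

static void key_destructor(void *opaque)
{
    (void)opaque;
    /* If __thread storage is torn down before (or via) pthread key
     * destructors, this read happens after the variable's lifetime. */
    printf("counter seen at thread exit: %d\n", tls_counter);
}

static void *worker(void *arg)
{
    (void)arg;
    pthread_setspecific(exit_key, &tls_notifier); /* non-NULL => destructor runs */
    tls_counter = 42;
    return NULL;
}

int main(void)
{
    pthread_t t;

    pthread_key_create(&exit_key, key_destructor);
    pthread_create(&t, NULL, worker, NULL);
    pthread_join(t, NULL);
    return 0;
}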

thanks
-- PMM



Re: [Qemu-devel] [PATCH v1 2/7] pcihp: overwrite hotplug handler recursively from the start

2018-11-02 Thread David Hildenbrand
On 02.11.18 14:00, Igor Mammedov wrote:
> On Fri, 2 Nov 2018 12:43:10 +0100
> David Hildenbrand  wrote:
> 
>> On 01.11.18 15:10, Igor Mammedov wrote:
>>> On Wed, 24 Oct 2018 12:19:25 +0200
>>> David Hildenbrand  wrote:
>>>   
 For now, the hotplug handler is not called for devices that are
 being cold plugged. The hotplug handler is setup when the machine
 initialization is fully done. Only bridges that were cold plugged are
 considered.

 Set the hotplug handler for the root piix bus directly when realizing.
 Overwrite the hotplug handler of bridges when hotplugging/coldplugging
 them.

 This will now make sure that the ACPI PCI hotplug handler is also called
 for cold-plugged devices (also on bridges) and for bridges that were
 hotplugged.

 When trying to hotplug a device to a hotplugged bridge, we now correctly
 get the error message
  "Unsupported bus. Bus doesn't have property 'acpi-pcihp-bsel' set"
 Instead of going via the standard PCI hotplug handler.
>>> Erroring out is probably not ok, since it can break existing setups
>>> where SHPC hotplugging to hotplugged bridge was working just fine before.  
>>
>> The question is if it actually was supposed (and eventually did) work.
> I think it works now; it's a limitation of QEMU's 'ACPI hotplug hack'
> (which exists for the sake of Windows). We weren't able to dynamically add
> an ACPI description for a hotplugged bridge, so it was using native hotplug.
> Now we could theoretically load tables dynamically, but that would add a
> maintenance nightmare (versioned tables) and would be harder to debug.
> I'd rather not go that direction and keep the current limited version,
> suggesting users use native hotplug if the guest is capable.

Alright, I'll keep the current behavior (checking whether the bridge is
hotplugged or coldplugged). Thanks!
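
For illustration, a tiny self-contained model of that decision; the type and
field names are hypothetical, not the actual QEMU code (the real check would
look at whether the bridge device itself was hotplugged and whether an ACPI
slot description, i.e. acpi-pcihp-bsel, is available for it):

#include <stdbool.h>
#include <stdio.h>

typedef struct Bridge {
    bool hotplugged;   /* the bridge itself was added at runtime */
    bool has_bsel;     /* ACPI hotplug slot description ("bsel") available */
} Bridge;

static const char *hotplug_path_for(const Bridge *br)
{
    if (!br->hotplugged && br->has_bsel) {
        return "ACPI (acpi-pcihp)";
    }
    /* hotplugged bridges keep using the bridge's native SHPC hotplug */
    return "SHPC (native)";
}

int main(void)
{
    Bridge cold = { .hotplugged = false, .has_bsel = true };
    Bridge hot  = { .hotplugged = true,  .has_bsel = false };

    printf("cold-plugged bridge -> %s\n", hotplug_path_for(&cold));
    printf("hot-plugged bridge  -> %s\n", hotplug_path_for(&hot));
    return 0;
}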


-- 

Thanks,

David / dhildenb



[Qemu-devel] [Bug 1740364] Re: qemu-img: fails to get shared 'write' lock

2018-11-02 Thread Richard Jones
Sorry, I noticed this bug is filed against qemu.  The fix was done in
libguestfs; it's not a bug in qemu as far as I know.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1740364

Title:
  qemu-img: fails to get shared 'write' lock

Status in QEMU:
  Fix Committed

Bug description:
  Description of problem:
  Somewhere in F27 (did not see it happening before), I'm getting while running 
libguestfs (via libvirt or direct), a qemu-img failure. Note: multiple qcow2 
snapshots are on the same backing file, and a parallel libguestfs command is 
running on all. However, it seems to be failing to get a lock on the leaf, 
which is unique, non-shared.

  The VM is up and running. I'm not sure why qemu-img is even trying to get a 
write lock on it. Even 'info' fails:
  ykaul@ykaul ovirt-system-tests]$ qemu-img info 
/home/ykaul/ovirt-system-tests/deployment-basic-suite-master/default/images/lago-basic-suite-master-host-1_root.qcow2
  qemu-img: Could not open 
'/home/ykaul/ovirt-system-tests/deployment-basic-suite-master/default/images/lago-basic-suite-master-host-1_root.qcow2':
 Failed to get shared "write" lock
  Is another process using the image?
  [ykaul@ykaul ovirt-system-tests]$ lsof |grep qcow2
  [ykaul@ykaul ovirt-system-tests]$ file 
/home/ykaul/ovirt-system-tests/deployment-basic-suite-master/default/images/lago-basic-suite-master-host-1_root.qcow2
  
/home/ykaul/ovirt-system-tests/deployment-basic-suite-master/default/images/lago-basic-suite-master-host-1_root.qcow2:
 QEMU QCOW Image (v3), has backing file (path 
/var/lib/lago/store/phx_repo:el7.4-base:v1), 6442450944 bytes

  
  And it's OK if I kill the VM of course.


  
  Version-Release number of selected component (if applicable):
  [ykaul@ykaul ovirt-system-tests]$ rpm -qa |grep qemu
  qemu-block-nfs-2.10.1-2.fc27.x86_64
  qemu-block-dmg-2.10.1-2.fc27.x86_64
  qemu-guest-agent-2.10.1-2.fc27.x86_64
  qemu-system-x86-core-2.10.1-2.fc27.x86_64
  qemu-block-curl-2.10.1-2.fc27.x86_64
  qemu-img-2.10.1-2.fc27.x86_64
  qemu-common-2.10.1-2.fc27.x86_64
  qemu-kvm-2.10.1-2.fc27.x86_64
  qemu-block-ssh-2.10.1-2.fc27.x86_64
  qemu-block-iscsi-2.10.1-2.fc27.x86_64
  libvirt-daemon-driver-qemu-3.7.0-3.fc27.x86_64
  qemu-block-gluster-2.10.1-2.fc27.x86_64
  ipxe-roms-qemu-20161108-2.gitb991c67.fc26.noarch
  qemu-system-x86-2.10.1-2.fc27.x86_64
  qemu-block-rbd-2.10.1-2.fc27.x86_64

  
  How reproducible:
  Sometimes.

  Steps to Reproduce:
  1. Running Lago (ovirt-system-tests) on my laptop, it happens quite a lot.

  Additional info:
  libguestfs: trace: set_verbose true
  libguestfs: trace: set_verbose = 0
  libguestfs: trace: set_backend "direct"
  libguestfs: trace: set_backend = 0
  libguestfs: create: flags = 0, handle = 0x7f1314006430, program = python2
  libguestfs: trace: set_program "lago"
  libguestfs: trace: set_program = 0
  libguestfs: trace: add_drive_ro 
"/home/ykaul/ovirt-system-tests/deployment-basic-suite-master/default/images/lago-basic-suite-master-host-1_root.qcow2"
  libguestfs: trace: add_drive 
"/home/ykaul/ovirt-system-tests/deployment-basic-suite-master/default/images/lago-basic-suite-master-host-1_root.qcow2"
 "readonly:true"
  libguestfs: creating COW overlay to protect original drive content
  libguestfs: trace: get_tmpdir
  libguestfs: trace: get_tmpdir = "/tmp"
  libguestfs: trace: disk_create "/tmp/libguestfsWrA7Dh/overlay1.qcow2" "qcow2" 
-1 
"backingfile:/home/ykaul/ovirt-system-tests/deployment-basic-suite-master/default/images/lago-basic-suite-master-host-1_root.qcow2"
  libguestfs: command: run: qemu-img
  libguestfs: command: run: \ create
  libguestfs: command: run: \ -f qcow2
  libguestfs: command: run: \ -o 
backing_file=/home/ykaul/ovirt-system-tests/deployment-basic-suite-master/default/images/lago-basic-suite-master-host-1_root.qcow2
  libguestfs: command: run: \ /tmp/libguestfsWrA7Dh/overlay1.qcow2
  qemu-img: /tmp/libguestfsWrA7Dh/overlay1.qcow2: Failed to get shared "write" 
lock
  Is another process using the image?
  Could not open backing image to determine size.
  libguestfs: trace: disk_create = -1 (error)
  libguestfs: trace: add_drive = -1 (error)
  libguestfs: trace: add_drive_ro = -1 (error)

  
  And:
  [ykaul@ykaul ovirt-system-tests]$ strace qemu-img info 
/home/ykaul/ovirt-system-tests/deployment-basic-suite-master/default/images/lago-basic-suite-master-host-1_root.qcow2
  execve("/usr/bin/qemu-img", ["qemu-img", "info", 
"/home/ykaul/ovirt-system-tests/d"...], 0x7fffb36ccfc0 /* 59 vars */) = 0
  brk(NULL)   = 0x562790488000
  mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x7f20cea08000
  access("/etc/ld.so.preload", R_OK)  = -1 ENOENT (No such file or 
directory)
  openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
  fstat(3, {st_mode=S_IFREG|0644, st_size=93275, ...}) = 0
  mmap(NULL, 93275, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f20ce9f10

[Qemu-devel] [Bug 1740364] Re: qemu-img: fails to get shared 'write' lock

2018-11-02 Thread Richard Jones
Fixed upstream in
https://github.com/libguestfs/libguestfs/commit/f00f920ad3b15ab8e9e8f201c16e7628b6b7b109

The fix should appear in libguestfs 1.40.

** Changed in: qemu
   Status: New => Fix Committed


[Qemu-devel] [PATCH for-4.0 3/4] target/arm: Implement the ARMv8.1-HPD extension

2018-11-02 Thread Richard Henderson
Since the TCR_*.HPD bits were RES0 in ARMv8.0, we can simply
interpret the bits as if ARMv8.1-HPD is present without checking.
We will need a slightly different check for hpd for aarch32.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu64.c  |  1 +
 target/arm/helper.c | 29 +
 2 files changed, 22 insertions(+), 8 deletions(-)
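
For illustration, here is a minimal stand-alone sketch of the behaviour the
patch implements: when the HPD bit for the selected translation regime is set,
the hierarchical XN/PXN/APTable attributes collected from table descriptors
are simply not merged. The helper and function names are stand-ins, the bit
positions come from the hunks below, and the AArch32 handling added in patch
4/4 is ignored; this is not the actual get_phys_addr_lpae() code.

#include <stdbool.h>
#include <stdint.h>

/* Stand-in for QEMU's extract64(): return @len bits of @value from @start. */
static uint64_t extract_bits(uint64_t value, int start, int len)
{
    return (value >> start) & ((1ull << len) - 1);
}

/*
 * Simplified model of the merge step this patch changes: the HPD bit for the
 * selected translation regime (TCR_ELx.HPD, or TCR_EL1.HPD0/HPD1 depending on
 * which TTBR is in use) suppresses the hierarchical XN/PXN/APTable attributes
 * gathered from table descriptors.
 */
uint32_t merge_table_attrs(uint32_t attrs, uint32_t tableattrs,
                           uint64_t tcr, bool ttbr1, int el)
{
    bool hpd;

    if (el > 1) {
        hpd = extract_bits(tcr, 24, 1);              /* TCR_EL2/EL3.HPD */
    } else {
        hpd = extract_bits(tcr, ttbr1 ? 42 : 41, 1); /* TCR_EL1.HPD1/HPD0 */
    }

    if (!hpd) {
        attrs |= extract_bits(tableattrs, 0, 2) << 11;   /* XN, PXN */
        attrs &= ~(extract_bits(tableattrs, 2, 1) << 4); /* !APTable[0] => AP[1] */
        attrs |= extract_bits(tableattrs, 3, 1) << 5;    /* APTable[1] => AP[2] */
    }
    return attrs;
}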

diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index aac6283018..1d57be0c91 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -325,6 +325,7 @@ static void aarch64_max_initfn(Object *obj)
 cpu->isar.id_aa64pfr0 = t;
 
 t = cpu->isar.id_aa64mmfr1;
+t = FIELD_DP64(t, ID_AA64MMFR1, HPDS, 1); /* HPD */
 t = FIELD_DP64(t, ID_AA64MMFR1, LO, 1);
 cpu->isar.id_aa64mmfr1 = t;
 
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 758ddac5e9..312d3e6f02 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -9682,6 +9682,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, 
target_ulong address,
 bool ttbr1_valid = true;
 uint64_t descaddrmask;
 bool aarch64 = arm_el_is_aa64(env, el);
+bool hpd = false;
 
 /* TODO:
  * This code does not handle the different format TCR for VTCR_EL2.
@@ -9796,6 +9797,13 @@ static bool get_phys_addr_lpae(CPUARMState *env, 
target_ulong address,
 if (tg == 2) { /* 16KB pages */
 stride = 11;
 }
+if (aarch64) {
+if (el > 1) {
+hpd = extract64(tcr->raw_tcr, 24, 1);
+} else {
+hpd = extract64(tcr->raw_tcr, 41, 1);
+}
+}
 } else {
 /* We should only be here if TTBR1 is valid */
 assert(ttbr1_valid);
@@ -9811,6 +9819,9 @@ static bool get_phys_addr_lpae(CPUARMState *env, 
target_ulong address,
 if (tg == 1) { /* 16KB pages */
 stride = 11;
 }
+if (aarch64) {
+hpd = extract64(tcr->raw_tcr, 42, 1);
+}
 }
 
 /* Here we should have set up all the parameters for the translation:
@@ -9904,7 +9915,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, 
target_ulong address,
 descaddr = descriptor & descaddrmask;
 
 if ((descriptor & 2) && (level < 3)) {
-/* Table entry. The top five bits are attributes which  may
+/* Table entry. The top five bits are attributes which may
  * propagate down through lower levels of the table (and
  * which are all arranged so that 0 means "no effect", so
  * we can gather them up by ORing in the bits at each level).
@@ -9928,14 +9939,16 @@ static bool get_phys_addr_lpae(CPUARMState *env, 
target_ulong address,
 /* Stage 2 table descriptors do not include any attribute fields */
 break;
 }
-/* Merge in attributes from table descriptors */
-attrs |= extract32(tableattrs, 0, 2) << 11; /* XN, PXN */
-attrs |= extract32(tableattrs, 3, 1) << 5; /* APTable[1] => AP[2] */
-/* The sense of AP[1] vs APTable[0] is reversed, as APTable[0] == 1
- * means "force PL1 access only", which means forcing AP[1] to 0.
+/*
+ * Merge in attributes from table descriptors, if the
+ * Hierarchical Permission Disable bit is not set.
  */
-if (extract32(tableattrs, 2, 1)) {
-attrs &= ~(1 << 4);
+if (!hpd) {
+attrs |= extract32(tableattrs, 0, 2) << 11; /* XN, PXN */
+/* !APTable[0] => AP[1].  */
+attrs &= ~(extract32(tableattrs, 2, 1) << 4);
+/* APTable[1] => AP[2] */
+attrs |= extract32(tableattrs, 3, 1) << 5;
 }
 attrs |= nstable << 3; /* NS */
 break;
-- 
2.17.2




[Qemu-devel] [PATCH for-4.0 4/4] target/arm: Implement the ARMv8.2-AA32HPD extension

2018-11-02 Thread Richard Henderson
The bulk of the work here, beyond base HPD, is defining the TTBCR2 register.
In addition we must check TTBCR.T2E, which is not present (RES0) for AArch64.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h|  8 
 target/arm/cpu.c|  4 
 target/arm/helper.c | 37 +
 3 files changed, 41 insertions(+), 8 deletions(-)
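
As a rough sketch of the TTBCR/TTBCR2 split introduced here (stand-in helpers,
not the real vmsa_ttbcr_write()): a TTBCR write only updates the low 32 bits
of the backing 64-bit TCR_EL1 value, TTBCR2 supplies the high 32 bits, and the
AArch32 HPD0 bit (bit 41 of the combined value, per the hunks below) is only
honoured when TTBCR.T2E (bit 6) is also set.

#include <stdbool.h>
#include <stdint.h>

/* Stand-in for QEMU's deposit64(): overwrite @len bits at @start with @field. */
static uint64_t deposit_bits(uint64_t value, int start, int len, uint64_t field)
{
    uint64_t mask = ((1ull << len) - 1) << start;
    return (value & ~mask) | ((field << start) & mask);
}

/* TTBCR writes update only the low 32 bits of the combined TCR value. */
uint64_t write_ttbcr(uint64_t tcr, uint32_t ttbcr_val)
{
    return deposit_bits(tcr, 0, 32, ttbcr_val);
}

/* TTBCR2 writes update only the high 32 bits. */
uint64_t write_ttbcr2(uint64_t tcr, uint32_t ttbcr2_val)
{
    return deposit_bits(tcr, 32, 32, ttbcr2_val);
}

/* AArch32 HPD0 (bit 41 of the combined value) requires TTBCR.T2E (bit 6). */
bool aa32_hpd0_enabled(uint64_t tcr)
{
    return ((tcr >> 41) & 1) && ((tcr >> 6) & 1);
}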

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index f12a6afddc..a253cdebde 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1517,6 +1517,14 @@ FIELD(ID_ISAR6, FHM, 8, 4)
 FIELD(ID_ISAR6, SB, 12, 4)
 FIELD(ID_ISAR6, SPECRES, 16, 4)
 
+FIELD(ID_MMFR4, SPECSEI, 0, 4)
+FIELD(ID_MMFR4, AC2, 4, 4)
+FIELD(ID_MMFR4, XNX, 8, 4)
+FIELD(ID_MMFR4, CNP, 12, 4)
+FIELD(ID_MMFR4, HPDS, 16, 4)
+FIELD(ID_MMFR4, LSM, 20, 4)
+FIELD(ID_MMFR4, CCIDX, 24, 4)
+
 FIELD(ID_AA64ISAR0, AES, 4, 4)
 FIELD(ID_AA64ISAR0, SHA1, 8, 4)
 FIELD(ID_AA64ISAR0, SHA2, 12, 4)
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 8f16e96b6c..3fd85f21c5 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1856,6 +1856,10 @@ static void arm_max_initfn(Object *obj)
 t = cpu->isar.id_isar6;
 t = FIELD_DP32(t, ID_ISAR6, DP, 1);
 cpu->isar.id_isar6 = t;
+
+t = cpu->id_mmfr4;
+t = FIELD_DP32(t, ID_MMFR4, HPDS, 1); /* AA32HPD */
+cpu->id_mmfr4 = t;
 }
 #endif
 }
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 312d3e6f02..85d3f4ad89 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -2722,6 +2722,7 @@ static void vmsa_ttbcr_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
  uint64_t value)
 {
 ARMCPU *cpu = arm_env_get_cpu(env);
+TCR *tcr = raw_ptr(env, ri);
 
 if (arm_feature(env, ARM_FEATURE_LPAE)) {
 /* With LPAE the TTBCR could result in a change of ASID
@@ -2729,6 +2730,8 @@ static void vmsa_ttbcr_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
  */
 tlb_flush(CPU(cpu));
 }
+/* Preserve the high half of TCR_EL1, set via TTBCR2.  */
+value = deposit64(tcr->raw_tcr, 0, 32, value);
 vmsa_ttbcr_raw_write(env, ri, value);
 }
 
@@ -2831,6 +2834,16 @@ static const ARMCPRegInfo vmsa_cp_reginfo[] = {
 REGINFO_SENTINEL
 };
 
+/* Note that unlike TTBCR, writing to TTBCR2 does not require flushing
+ * qemu tlbs nor adjusting cached masks.
+ */
+static const ARMCPRegInfo ttbcr2_reginfo = {
+.name = "TTBCR2", .cp = 15, .crn = 2, .crm = 0, .opc1 = 0, .opc2 = 3,
+.access = PL1_RW, .type = ARM_CP_ALIAS,
+.bank_fieldoffsets = { offsetofhigh32(CPUARMState, cp15.tcr_el[3]),
+   offsetofhigh32(CPUARMState, cp15.tcr_el[1]) },
+};
+
 static void omap_ticonfig_write(CPUARMState *env, const ARMCPRegInfo *ri,
 uint64_t value)
 {
@@ -5454,6 +5467,10 @@ void register_cp_regs_for_features(ARMCPU *cpu)
 } else {
 define_arm_cp_regs(cpu, vmsa_pmsa_cp_reginfo);
 define_arm_cp_regs(cpu, vmsa_cp_reginfo);
+/* TTBCR2 is introduced with ARMv8.2-AA32HPD.  */
+if (FIELD_EX32(cpu->id_mmfr4, ID_MMFR4, HPDS) != 0) {
+define_one_arm_cp_reg(cpu, &ttbcr2_reginfo);
+}
 }
 if (arm_feature(env, ARM_FEATURE_THUMB2EE)) {
 define_arm_cp_regs(cpu, t2ee_cp_reginfo);
@@ -9797,12 +9814,14 @@ static bool get_phys_addr_lpae(CPUARMState *env, 
target_ulong address,
 if (tg == 2) { /* 16KB pages */
 stride = 11;
 }
-if (aarch64) {
-if (el > 1) {
-hpd = extract64(tcr->raw_tcr, 24, 1);
-} else {
-hpd = extract64(tcr->raw_tcr, 41, 1);
-}
+if (aarch64 && el > 1) {
+hpd = extract64(tcr->raw_tcr, 24, 1);
+} else {
+hpd = extract64(tcr->raw_tcr, 41, 1);
+}
+if (!aarch64) {
+/* For aarch32, hpd0 is not enabled without t2e as well.  */
+hpd &= extract64(tcr->raw_tcr, 6, 1);
 }
 } else {
 /* We should only be here if TTBR1 is valid */
@@ -9819,8 +9838,10 @@ static bool get_phys_addr_lpae(CPUARMState *env, 
target_ulong address,
 if (tg == 1) { /* 16KB pages */
 stride = 11;
 }
-if (aarch64) {
-hpd = extract64(tcr->raw_tcr, 42, 1);
+hpd = extract64(tcr->raw_tcr, 42, 1);
+if (!aarch64) {
+/* For aarch32, hpd1 is not enabled without t2e as well.  */
+hpd &= extract64(tcr->raw_tcr, 6, 1);
 }
 }
 
-- 
2.17.2




[Qemu-devel] [PATCH for-4.0 2/4] target/arm: Implement the ARMv8.1-LOR extension

2018-11-02 Thread Richard Henderson
Provide a trivial implementation with zero limited ordering regions,
which causes the LDLAR and STLLR instructions to devolve into the
LDAR and STLR instructions from the base ARMv8.0 instruction set.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h   |  5 +
 target/arm/cpu64.c |  4 
 target/arm/helper.c| 26 ++
 target/arm/translate-a64.c | 12 
 4 files changed, 47 insertions(+)
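
The decode side of this patch reduces to: if ARMv8.1-LOR is not implemented,
the new encodings remain unallocated; if it is implemented, they are handled
exactly like the existing acquire/release forms, because with zero LORegions
LoadLOAcquire and StoreLORelease have the same semantics as Load-Acquire and
Store-Release. A simplified stand-alone sketch (not the real
disas_ldst_excl(); the opcode values match the hunk below):

#include <stdbool.h>

typedef enum { OP_UNALLOCATED, OP_LOAD_ACQUIRE, OP_STORE_RELEASE } LdStKind;

/*
 * Map the opcode selector from the exclusive/ordered group to a behaviour,
 * given whether ARMv8.1-LOR is implemented. With zero LORegions, LDLAR (0xc)
 * and STLLR (0x8) devolve to LDAR (0xd) and STLR (0x9).
 */
LdStKind classify_ordered(unsigned opc, bool have_lor)
{
    switch (opc) {
    case 0x8: /* STLLR */
        return have_lor ? OP_STORE_RELEASE : OP_UNALLOCATED;
    case 0x9: /* STLR */
        return OP_STORE_RELEASE;
    case 0xc: /* LDLAR */
        return have_lor ? OP_LOAD_ACQUIRE : OP_UNALLOCATED;
    case 0xd: /* LDAR */
        return OP_LOAD_ACQUIRE;
    default:
        return OP_UNALLOCATED;
    }
}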

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 2ce5e80dfc..f12a6afddc 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3278,6 +3278,11 @@ static inline bool isar_feature_aa64_atomics(const 
ARMISARegisters *id)
 return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, ATOMIC) != 0;
 }
 
+static inline bool isar_feature_aa64_lor(const ARMISARegisters *id)
+{
+return FIELD_EX64(id->id_aa64mmfr1, ID_AA64MMFR1, LO) != 0;
+}
+
 static inline bool isar_feature_aa64_rdm(const ARMISARegisters *id)
 {
 return FIELD_EX64(id->id_aa64isar0, ID_AA64ISAR0, RDM) != 0;
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 0babe483ac..aac6283018 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -324,6 +324,10 @@ static void aarch64_max_initfn(Object *obj)
 t = FIELD_DP64(t, ID_AA64PFR0, ADVSIMD, 1);
 cpu->isar.id_aa64pfr0 = t;
 
+t = cpu->isar.id_aa64mmfr1;
+t = FIELD_DP64(t, ID_AA64MMFR1, LO, 1);
+cpu->isar.id_aa64mmfr1 = t;
+
 /* Replicate the same data to the 32-bit id registers.  */
 u = cpu->isar.id_isar5;
 u = FIELD_DP32(u, ID_ISAR5, AES, 2); /* AES + PMULL */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 70376764cb..758ddac5e9 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -5714,6 +5714,32 @@ void register_cp_regs_for_features(ARMCPU *cpu)
 define_one_arm_cp_reg(cpu, &sctlr);
 }
 
+if (cpu_isar_feature(aa64_lor, cpu)) {
+/*
+ * A trivial implementation of ARMv8.1-LOR leaves all of these
+ * registers fixed at 0, which indicates that there are zero
+ * supported Limited Ordering regions.
+ */
+static const ARMCPRegInfo lor_reginfo[] = {
+{ .name = "LORSA_EL1", .state = ARM_CP_STATE_AA64,
+  .opc0 = 3, .opc1 = 0, .crn = 10, .crm = 4, .opc2 = 0,
+  .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+{ .name = "LOREA_EL1", .state = ARM_CP_STATE_AA64,
+  .opc0 = 3, .opc1 = 0, .crn = 10, .crm = 4, .opc2 = 1,
+  .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+{ .name = "LORN_EL1", .state = ARM_CP_STATE_AA64,
+  .opc0 = 3, .opc1 = 0, .crn = 10, .crm = 4, .opc2 = 2,
+  .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+{ .name = "LORC_EL1", .state = ARM_CP_STATE_AA64,
+  .opc0 = 3, .opc1 = 0, .crn = 10, .crm = 4, .opc2 = 3,
+  .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+{ .name = "LORID_EL1", .state = ARM_CP_STATE_AA64,
+  .opc0 = 3, .opc1 = 0, .crn = 10, .crm = 4, .opc2 = 7,
+  .access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
+};
+define_arm_cp_regs(cpu, lor_reginfo);
+}
+
 if (cpu_isar_feature(aa64_sve, cpu)) {
 define_one_arm_cp_reg(cpu, &zcr_el1_reginfo);
 if (arm_feature(env, ARM_FEATURE_EL2)) {
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 88195ab949..2307a18d5a 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -2290,6 +2290,12 @@ static void disas_ldst_excl(DisasContext *s, uint32_t 
insn)
 }
 return;
 
+case 0x8: /* STLLR */
+if (!dc_isar_feature(aa64_lor, s)) {
+break;
+}
+/* StoreLORelease is the same as Store-Release for QEMU.  */
+/* fallthru */
 case 0x9: /* STLR */
 /* Generate ISS for non-exclusive accesses including LASR.  */
 if (rn == 31) {
@@ -2301,6 +2307,12 @@ static void disas_ldst_excl(DisasContext *s, uint32_t 
insn)
   disas_ldst_compute_iss_sf(size, false, 0), is_lasr);
 return;
 
+case 0xc: /* LDLAR */
+if (!dc_isar_feature(aa64_lor, s)) {
+break;
+}
+/* LoadLOAcquire is the same as Load-Acquire for QEMU.  */
+/* fallthru */
 case 0xd: /* LDAR */
 /* Generate ISS for non-exclusive accesses including LASR.  */
 if (rn == 31) {
-- 
2.17.2




[Qemu-devel] [PATCH for-4.0 1/4] target/arm: Move id_aa64mmfr* to ARMISARegisters

2018-11-02 Thread Richard Henderson
At the same time, define the fields for these registers,
and use those defines in arm_pamax().

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h   | 22 --
 target/arm/internals.h |  3 ++-
 target/arm/cpu64.c |  6 +++---
 target/arm/helper.c|  4 ++--
 4 files changed, 27 insertions(+), 8 deletions(-)
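
For readers unfamiliar with the FIELD()/FIELD_EX64()/FIELD_DP64() helpers used
below: they simply name a bit range of an ID register and extract from or
deposit into it. A simplified stand-alone equivalent of what
FIELD(ID_AA64MMFR1, LO, 16, 4) together with FIELD_EX64()/FIELD_DP64()
provides (this mimics hw/registerfields.h, it does not reproduce it):

#include <stdint.h>

/* A named 4-bit field at bits [19:16], like FIELD(ID_AA64MMFR1, LO, 16, 4). */
#define ID_AA64MMFR1_LO_SHIFT  16
#define ID_AA64MMFR1_LO_LENGTH 4

/* Extract the field, like FIELD_EX64(reg, ID_AA64MMFR1, LO). */
static inline uint64_t id_aa64mmfr1_lo(uint64_t reg)
{
    return (reg >> ID_AA64MMFR1_LO_SHIFT) &
           ((1ull << ID_AA64MMFR1_LO_LENGTH) - 1);
}

/* Deposit a new value into the field, like FIELD_DP64(reg, ID_AA64MMFR1, LO, v). */
static inline uint64_t id_aa64mmfr1_set_lo(uint64_t reg, uint64_t val)
{
    uint64_t mask = ((1ull << ID_AA64MMFR1_LO_LENGTH) - 1)
                    << ID_AA64MMFR1_LO_SHIFT;
    return (reg & ~mask) | ((val << ID_AA64MMFR1_LO_SHIFT) & mask);
}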

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 8e6779936e..2ce5e80dfc 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -815,6 +815,8 @@ struct ARMCPU {
 uint64_t id_aa64isar1;
 uint64_t id_aa64pfr0;
 uint64_t id_aa64pfr1;
+uint64_t id_aa64mmfr0;
+uint64_t id_aa64mmfr1;
 } isar;
 uint32_t midr;
 uint32_t revidr;
@@ -836,8 +838,6 @@ struct ARMCPU {
 uint64_t id_aa64dfr1;
 uint64_t id_aa64afr0;
 uint64_t id_aa64afr1;
-uint64_t id_aa64mmfr0;
-uint64_t id_aa64mmfr1;
 uint32_t dbgdidr;
 uint32_t clidr;
 uint64_t mp_affinity; /* MP ID without feature bits */
@@ -1554,6 +1554,24 @@ FIELD(ID_AA64PFR0, GIC, 24, 4)
 FIELD(ID_AA64PFR0, RAS, 28, 4)
 FIELD(ID_AA64PFR0, SVE, 32, 4)
 
+FIELD(ID_AA64MMFR0, PARange, 0, 4)
+FIELD(ID_AA64MMFR0, ASIDBits, 4, 4)
+FIELD(ID_AA64MMFR0, BigEnd, 8, 4)
+FIELD(ID_AA64MMFR0, SNSMem, 12, 4)
+FIELD(ID_AA64MMFR0, BigEndEL0, 16, 4)
+FIELD(ID_AA64MMFR0, TGran16, 20, 4)
+FIELD(ID_AA64MMFR0, TGran64, 24, 4)
+FIELD(ID_AA64MMFR0, TGran4, 28, 4)
+
+FIELD(ID_AA64MMFR1, HAFDBS, 0, 4)
+FIELD(ID_AA64MMFR1, VMIDBits, 4, 4)
+FIELD(ID_AA64MMFR1, VH, 8, 4)
+FIELD(ID_AA64MMFR1, HPDS, 12, 4)
+FIELD(ID_AA64MMFR1, LO, 16, 4)
+FIELD(ID_AA64MMFR1, PAN, 20, 4)
+FIELD(ID_AA64MMFR1, SpecSEI, 24, 4)
+FIELD(ID_AA64MMFR1, XNX, 28, 4)
+
 QEMU_BUILD_BUG_ON(ARRAY_SIZE(((ARMCPU *)0)->ccsidr) <= 
R_V7M_CSSELR_INDEX_MASK);
 
 /* If adding a feature bit which corresponds to a Linux ELF
diff --git a/target/arm/internals.h b/target/arm/internals.h
index 6c2bb2deeb..bf844abc47 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -213,7 +213,8 @@ static inline unsigned int arm_pamax(ARMCPU *cpu)
 [4] = 44,
 [5] = 48,
 };
-unsigned int parange = extract32(cpu->id_aa64mmfr0, 0, 4);
+unsigned int parange =
+FIELD_EX64(cpu->isar.id_aa64mmfr0, ID_AA64MMFR0, PARange);
 
 /* id_aa64mmfr0 is a read-only register so values outside of the
  * supported mappings can be considered an implementation error.  */
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 873f059bf2..0babe483ac 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -141,7 +141,7 @@ static void aarch64_a57_initfn(Object *obj)
 cpu->pmceid0 = 0x;
 cpu->pmceid1 = 0x;
 cpu->isar.id_aa64isar0 = 0x00011120;
-cpu->id_aa64mmfr0 = 0x1124;
+cpu->isar.id_aa64mmfr0 = 0x1124;
 cpu->dbgdidr = 0x3516d000;
 cpu->clidr = 0x0a200023;
 cpu->ccsidr[0] = 0x701fe00a; /* 32KB L1 dcache */
@@ -195,7 +195,7 @@ static void aarch64_a53_initfn(Object *obj)
 cpu->isar.id_aa64pfr0 = 0x;
 cpu->id_aa64dfr0 = 0x10305106;
 cpu->isar.id_aa64isar0 = 0x00011120;
-cpu->id_aa64mmfr0 = 0x1122; /* 40 bit physical addr */
+cpu->isar.id_aa64mmfr0 = 0x1122; /* 40 bit physical addr */
 cpu->dbgdidr = 0x3516d000;
 cpu->clidr = 0x0a200023;
 cpu->ccsidr[0] = 0x700fe01a; /* 32KB L1 dcache */
@@ -249,7 +249,7 @@ static void aarch64_a72_initfn(Object *obj)
 cpu->pmceid0 = 0x;
 cpu->pmceid1 = 0x;
 cpu->isar.id_aa64isar0 = 0x00011120;
-cpu->id_aa64mmfr0 = 0x1124;
+cpu->isar.id_aa64mmfr0 = 0x1124;
 cpu->dbgdidr = 0x3516d000;
 cpu->clidr = 0x0a200023;
 cpu->ccsidr[0] = 0x701fe00a; /* 32KB L1 dcache */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 0ea95b0815..70376764cb 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -5228,11 +5228,11 @@ void register_cp_regs_for_features(ARMCPU *cpu)
 { .name = "ID_AA64MMFR0_EL1", .state = ARM_CP_STATE_AA64,
   .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 7, .opc2 = 0,
   .access = PL1_R, .type = ARM_CP_CONST,
-  .resetvalue = cpu->id_aa64mmfr0 },
+  .resetvalue = cpu->isar.id_aa64mmfr0 },
 { .name = "ID_AA64MMFR1_EL1", .state = ARM_CP_STATE_AA64,
   .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 7, .opc2 = 1,
   .access = PL1_R, .type = ARM_CP_CONST,
-  .resetvalue = cpu->id_aa64mmfr1 },
+  .resetvalue = cpu->isar.id_aa64mmfr1 },
 { .name = "ID_AA64MMFR2_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
   .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 7, .opc2 = 2,
   .access = PL1_R, .type = ARM_CP_CONST,
-- 
2.17.2




Re: [Qemu-devel] Correction needed for R5900 instruction decoding

2018-11-02 Thread Aleksandar Markovic
> ... in the patch series I posted 25 October "[PATCH 00/11]
> target/mips: Amend R5900 support". I will post updated patches shortly!

Fridrik,

It is now code freeze before 3.1: the code base is being stabilized, and only 
important fixes are allowed to be integrated. In that light, what is needed is 
a separate patch, or a small series, that addresses only the concerns from the 
original mail of this thread. Such a series should not contain any additional 
features (like your v2 of the "Amend..." series does), and its patch titles 
should look like "Fix decoding mechanism of ..." or similar.

Could you please provide those appropriate changes in that format?

Thanks,
Aleksandar


From: Fredrik Noring 
Sent: Thursday, November 1, 2018 6:23:53 PM
To: Philippe Mathieu-Daudé; Emilio G. Cota; Aleksandar Markovic
Cc: Stefan Markovic; Petar Jovanovic; Aleksandar Rikalo; Maciej W. Rozycki; 
Jürgen Urban; qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] Correction needed for R5900 instruction decoding

[ Philippe and Emilio -- thank you for cc-ing me. Good catch, since I'm
not subscribed to the QEMU mailing list. Changes to the R5900 emulation
are certainly of interest. ]

Hi Aleksandar, Philippe,

On Thu, Nov 01, 2018 at 03:31:54PM +0100, Philippe Mathieu-Daudé wrote:
> Cc'ing Fredrik.
>
> On 1/11/18 12:06, Aleksandar Markovic wrote:
> > Hi, Fridrik,
> >
> > I did some closer code inspection of R5900 in the last few days, and I
> > noticed some sub-optimal implementation in the area where R5900-specific
> > opcodes overlap with the rest-of-MIPS-CPUs opcodes.
> >
> > The right implementation should be based on the principle that all such
> > cases are covered with if statements involving INSN_R5900 flag, like
> > this:
> >
> >  if (ctx->insn_flags & INSN_R5900) {
> >      /* R5900-specific handling */
> >  } else {
> >      /* handling shared by the other MIPS CPUs */
> >  }
> >
> > You followed that principle for OPC_SPECIAL2 and OPC_SPECIAL3, but for
> > some other opcodes not. For example, there are lines:
> >
> >  if (reg == 0 && (opc == OPC_MFHI || opc == TX79_MMI_MFHI1 ||
> >   opc == OPC_MFLO || opc == TX79_MMI_MFLO1)) {
> >
> > or
> >
> >   switch (opc) {
> >   case OPC_MFHI:
> >   case TX79_MMI_MFHI1:
> >
> > Such an implementation makes it difficult to discern R5900 and non-R5900
> > cases, and potentially allows bugs to sneak in and affect non-R5900 support.

MFLO1, MFHI1, MTLO1 and MTHI1 for the TX79 and the R5900 are already
decoded in the ISA-specific decode_tx79_mmi function, and thereby follow
your first suggested pattern. They do however reuse the gen_HILO function,
but it is a simple matter to post a patch to make a new gen_tx79_HILO1
variant that is almost identical to the original gen_HILO.

The only other case is gen_muldiv that is used for DIV1 and DIVU1. The
same argument applies and a TX79 specific variant would be similar to the
original, but I can certainly post a variant for that one too.

> > The correction is not that difficult, I gather. If worst comes to worst,
> > you can remove R5900 MFLO1 and MFHI1 altogether; they are not that
> > essential at this moment, but do try correcting the decoding stuff as I
> > described. Can you please make these changes in the next few days or so
> > (given that 3.1 release is getting closer and closer), and send them to
> > the list?

MFLO1 and MFHI1 are essential for MULT1, MULTU1, DIV1 and DIVU1 as well as
MADD1 and MADDU1 in the patch series I posted 25 October "[PATCH 00/11]
target/mips: Amend R5900 support". I will post updated patches shortly!

> > It is my bad that I didn't spot this during review, but in any case, I
> > think this should be fixed in 3.1 to make sure that non-R5900
> > functionalities are intact.

It is a common pattern in target/mips/translate.c to cover several ISAs
in the same gen_* and decode_* functions, especially when there are only
minor differences between them.

Fredrik



[Qemu-devel] [PATCH for-4.0 0/4] target/arm: LOR, HPD, AA32HPD extensions

2018-11-02 Thread Richard Henderson
Three relatively simple post-8.0 extensions.


r~


Richard Henderson (4):
  target/arm: Move id_aa64mmfr* to ARMISARegisters
  target/arm: Implement the ARMv8.1-LOR extension
  target/arm: Implement the ARMv8.1-HPD extension
  target/arm: Implement the ARMv8.2-AA32HPD extension

 target/arm/cpu.h   | 35 -
 target/arm/internals.h |  3 +-
 target/arm/cpu.c   |  4 ++
 target/arm/cpu64.c | 11 --
 target/arm/helper.c| 80 +-
 target/arm/translate-a64.c | 12 ++
 6 files changed, 129 insertions(+), 16 deletions(-)

-- 
2.17.2




Re: [Qemu-devel] [PATCH 1/3] Improve xen_disk batching behaviour

2018-11-02 Thread Anthony PERARD
On Fri, Nov 02, 2018 at 10:00:59AM +, Tim Smith wrote:
> When I/O consists of many small requests, performance is improved by
> batching them together in a single io_submit() call. When there are
> relatively few requests, the extra overhead is not worth it. This
> introduces a check to start batching I/O requests via blk_io_plug()/
> blk_io_unplug() in an amount proportional to the number which were
> already in flight at the time we started reading the ring.
> 
> Signed-off-by: Tim Smith 

Acked-by: Anthony PERARD 

-- 
Anthony PERARD
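
The batching pattern described above, in sketch form. blk_io_plug() and
blk_io_unplug() are the block-layer calls named in the commit message (the
prototypes below are declared locally and assumed to match the QEMU 3.x
block-backend API); the request type, submit_one() helper and the batching
threshold are stand-ins rather than the actual xen_disk code, and the real
patch scales the decision with the number of requests already in flight.

#include <stdbool.h>

/* Stand-in declarations so the sketch is self-contained. */
typedef struct BlockBackend BlockBackend;
void blk_io_plug(BlockBackend *blk);
void blk_io_unplug(BlockBackend *blk);

struct ioreq;                           /* one pending guest block request */
void submit_one(struct ioreq *req);     /* queues one asynchronous request */

void submit_batch(BlockBackend *blk, struct ioreq **reqs, int nreqs,
                  int already_inflight)
{
    /* Only pay the plug/unplug overhead when enough I/O is already queued
     * that coalescing into fewer io_submit() calls is likely to win. */
    bool batch = already_inflight > 1;

    if (batch) {
        blk_io_plug(blk);       /* start queuing submissions */
    }
    for (int i = 0; i < nreqs; i++) {
        submit_one(reqs[i]);
    }
    if (batch) {
        blk_io_unplug(blk);     /* flush the queued batch */
    }
}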



Re: [Qemu-devel] [PATCH 2/3] Improve xen_disk response latency

2018-11-02 Thread Anthony PERARD
On Fri, Nov 02, 2018 at 10:01:04AM +, Tim Smith wrote:
> If the I/O ring is full, the guest cannot send any more requests
> until some responses are sent. Only sending all available responses
> just before checking for new work does not leave much time for the
> guest to supply new work, so this will cause stalls if the ring gets
> full. Also, not completing reads as soon as possible adds latency
> to the guest.
> 
> To alleviate that, complete IO requests as soon as they come back.
> blk_send_response() already returns a value indicating whether
> a notify should be sent, which is all the batching we need.
> 
> Signed-off-by: Tim Smith 

Acked-by: Anthony PERARD 

-- 
Anthony PERARD
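
Schematically, the change moves the response into the AIO completion path:
each request is answered as soon as it finishes, and the event channel is only
kicked when blk_send_response() reports that a notification is due. The names
and signatures below are placeholders, not the actual xen_disk functions; only
the blk_send_response() return-value behaviour is taken from the commit
message.

#include <stdbool.h>

struct ioreq;                              /* one completed guest request  */
bool blk_send_response(struct ioreq *req); /* true => guest needs a notify */
void notify_guest(struct ioreq *req);      /* kick the event channel       */

/* AIO completion callback: respond immediately instead of deferring the
 * response until the next pass over the ring. */
void complete_request(void *opaque, int ret)
{
    struct ioreq *req = opaque;

    (void)ret; /* error handling elided in this sketch */

    if (blk_send_response(req)) {
        notify_guest(req);
    }
}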



Re: [Qemu-devel] [PATCH 3/3] Avoid repeated memory allocation in xen_disk

2018-11-02 Thread Anthony PERARD
On Fri, Nov 02, 2018 at 10:01:09AM +, Tim Smith wrote:
> xen_disk currently allocates memory to hold the data for each ioreq
> as that ioreq is used, and frees it afterwards. Because it requires
> page-aligned blocks, this interacts poorly with non-page-aligned
> allocations and balloons the heap.
> 
> Instead, allocate the maximum possible requirement, which is
> BLKIF_MAX_SEGMENTS_PER_REQUEST pages (currently 11 pages) when
> the ioreq is created, and keep that allocation until it is destroyed.
> Since the ioreqs themselves are re-used via a free list, this
> should actually improve memory usage.
> 
> Signed-off-by: Tim Smith 

Acked-by: Anthony PERARD 

-- 
Anthony PERARD
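
A condensed sketch of the allocation strategy: the worst-case buffer
(BLKIF_MAX_SEGMENTS_PER_REQUEST pages, i.e. 11 pages, as stated in the commit
message) is allocated page-aligned when the ioreq is created and kept for the
ioreq's whole lifetime, with ioreqs themselves recycled through a free list.
The structure layout and helper names are illustrative, not the actual
xen_disk code.

#include <stdlib.h>

#define XEN_PAGE_SIZE                   4096
#define BLKIF_MAX_SEGMENTS_PER_REQUEST  11   /* max pages per ioreq */

struct ioreq {
    void *buf;          /* page-aligned buffer, sized for the worst case */
    struct ioreq *next; /* free-list link */
};

/* Allocate the worst-case buffer once, when the ioreq is created... */
struct ioreq *ioreq_new(void)
{
    struct ioreq *req = calloc(1, sizeof(*req));

    if (!req) {
        return NULL;
    }
    req->buf = aligned_alloc(XEN_PAGE_SIZE,
                             BLKIF_MAX_SEGMENTS_PER_REQUEST * XEN_PAGE_SIZE);
    if (!req->buf) {
        free(req);
        return NULL;
    }
    return req;
}

/* ...and keep it; released ioreqs go back on a free list, so the buffer is
 * reused rather than reallocated for every I/O. */
void ioreq_release(struct ioreq *req, struct ioreq **free_list)
{
    req->next = *free_list;
    *free_list = req;
}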



Re: [Qemu-devel] [PULL] RISC-V Patches for the 3.1 Soft Freeze, Part 2

2018-11-02 Thread Peter Maydell
On 1 November 2018 at 23:55, Palmer Dabbelt  wrote:
> The following changes since commit a2e002ff7913ce93aa0f7dbedd2123dce5f1a9cd:
>
>   Merge remote-tracking branch 
> 'remotes/vivier2/tags/qemu-trivial-for-3.1-pull-request' into staging 
> (2018-10-30 15:49:55 +)
>
> are available in the Git repository at:
>
>   git://github.com/riscv/riscv-qemu.git tags/riscv-for-master-3.1-sf1
>
> for you to fetch changes up to a094b3544f2855c0489f5df3c938b14b9a5899e5:
>
>   Add qemu-ri...@nongnu.org as the RISC-V list (2018-10-30 11:04:29 -0700)
>
> 
> RISC-V Patches for the 3.1 Soft Freeze, Part 2
>
> This tag contains a few simple patches that I'd like to target for the
> QEMU soft freeze.  There's only one code change: a fix to our PMP
> implementation that avoids an internal truncation while computing a
> partial PMP read.
>
> I also have two updates to the MAINTAINERS file: one to add Alistair as
> a RISC-V maintainer, and one to add our newly created mailing list.
>

Applied, thanks.

-- PMM



Re: [Qemu-devel] [PATCH] qemu/units: Move out QCow2 specific definitions

2018-11-02 Thread Kevin Wolf
On 02.11.2018 at 13:37, Philippe Mathieu-Daudé wrote:
> Hi Kevin,
> 
> On 2/11/18 12:07, Kevin Wolf wrote:
> > On 02.11.2018 at 09:58, Philippe Mathieu-Daudé wrote:
> > > These definitions are QCow2-specific; there is no need to expose them
> > > in the global namespace.
> > > 
> > > This partially reverts commit 540b8492618eb.
> > > 
> > > Signed-off-by: Philippe Mathieu-Daudé 
> > 
> > If we don't want this globally, I think we also don't want it in qcow2.
> 
> I only see these definitions used by block/qcow2.h (b6a95c6d1007).
> 
> The description of 540b8492618eb says "This is needed when a size has to be
> stringified", but I can't find any other code requiring these definitions in
> the codebase.

I guess the real question is: Is qcow2 the only place that needs
stringification of sizes?

The only value where this actually seems to be used in qcow2 is for
DEFAULT_CLUSTER_SIZE, as the default value for QemuOpts. Other drivers
still use plain numbers, but this is less readable.
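
The stringification issue in question looks roughly like this: the QemuOpts
default has to end up as a literal string, and preprocessor stringification of
an arithmetic expression such as (64 * KiB) yields the unparsed expression
text, whereas a plain-number constant of the kind 540b8492618eb added
stringifies to something an options parser can digest. A stand-alone
illustration (deliberately not using the real QEMU headers):

#include <stdio.h>

#define stringify_(x) #x
#define stringify(x)  stringify_(x)

#define KiB      1024
#define S_64KiB  65536                 /* the style of constant 540b8492618eb added */

#define CLUSTER_SIZE_EXPR (64 * KiB)   /* readable, but stringifies badly   */
#define CLUSTER_SIZE_NUM  S_64KiB      /* stringifies to a parseable number */

int main(void)
{
    /* Prints "(64 * 1024)" -- not something an options parser can digest. */
    printf("%s\n", stringify(CLUSTER_SIZE_EXPR));
    /* Prints "65536" -- usable as a QemuOpts default value string. */
    printf("%s\n", stringify(CLUSTER_SIZE_NUM));
    return 0;
}

With the expression form, the default value string would literally be
"(64 * 1024)", which is why a plain numeric define is used where the value
must be stringified.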

Then there is VDI which uses (1 * MiB), but that is compiled out and if
you enable it, it breaks. So it needs the same fix.

Are block drivers the only places where we stringify a size? I imagine
some device models might use something like it, too?

I don't mind too much which solution we end up using, but I'd prefer it
to be universal.

Kevin


