date:20201208

Re: [PATCH 7/9] vnc: force initial resize message

2020-12-08 Thread Marc-André Lureau

Hi

On Tue, Dec 8, 2020 at 10:57 AM Gerd Hoffmann  wrote:

> On Fri, Dec 04, 2020 at 03:57:23PM +0400, Marc-André Lureau wrote:
> > Hi
> >
> > On Thu, Dec 3, 2020 at 3:12 PM Gerd Hoffmann  wrote:
> >
> > > The vnc server should send desktop resize notifications unconditionally
> > > on a new client connect, for feature negotiation reasons.  Add a bool
> > > flag to vnc_desktop_resize() to force sending the message even in case
> > > there is no size change.
> > >
> > > Signed-off-by: Gerd Hoffmann 
> > >
> >
> > In principle, this looks harmless. But the spec says:
> >
> > "The server should only send a *DesktopSize* pseudo-rectangle when an
> > actual change of the framebuffer dimensions has occurred. Some clients
> > respond to a *DesktopSize* pseudo-rectangle in a way that could send the
> > system into an infinite loop if the server sent out the pseudo-rectangle
> > for anything other than an actual change."
> > (
> >
> https://github.com/rfbproto/rfbproto/blob/master/rfbproto.rst#server-semantics
> > )
> >
> > I can't say if sending desktop resize during the initial SetEncoding
> phase
> > is really compliant with the specification. Also, it's unclear to me if
> the
> > client is allowed to SetEncoding multiple times (in which there would be
> no
> > dimension change occurring).
> >
> > What did you fix with this? Is it worth a clarification in the
> > specification?
>
> Well, for ExtendedDesktopResize the spec explicitly asked for this.
> But, yes, for DesktopResize this is not needed.  But it also shouldn't
> cause much trouble.  It is sent before any actual display updates, so
> concerns whenever the client should consider the screen content invalid
> or not are moot.
>
> I could squash this into patch #8 and do it for ExtendedDesktopResize
> only ...
>

Ok, it's probably fine (dunno), you could also capture that in the commit
message, or as code comment.


-- 
Marc-André Lureau

Re: [PATCH for-6.0] accel: Wire accel to /machine

2020-12-08 Thread Roman Bolshakov

On Mon, Dec 07, 2020 at 12:38:49PM -0500, Eduardo Habkost wrote:
> On Mon, Dec 07, 2020 at 11:46:22AM +0300, Roman Bolshakov wrote:
> > There's no generic way to query current accel and its properties via QOM
> > because there's no link between an accel and current machine.
> > 
> > The change adds the link, i.e. if HVF is enabled the following will be
> > available in QOM:
> > 
> >   (qemu) qom-get /machine/accel type
> >   "hvf-accel"
> > 
> > Suggested-by: Markus Armbruster 
> > Suggested-by: Paolo Bonzini 
> > Signed-off-by: Roman Bolshakov 
> > ---
> > 
> > Hi,
> > 
> > this is a follow up patch that deprecates earlier series [1].
> > 
> 
> Is there a reference to the reasoning for dropping the earlier
> approach?  Your previous approach seems preferable.
> 

Okay I wasn't sure about that :) It's clear now from your and Daniel's
response that a documented QMP method is preferable for public API.

> but we need a commit message that doesn't make people think the
> `qom-get` command above will always work.
> 

How about the one below:

accel: Wire accel to /machine

An accel is a property of a machine but it's not reflected in QOM. An
ownership between machine and accel is also not expressed.

The change adds the link, i.e. if HVF is enabled the following will be
available in QOM:

  (qemu) qom-get /machine/accel type
  "hvf-accel"

Note that management applications shouldn't rely on the type as it may
change in future at QEMU's discretion.

---
Regards,
Roman

Re: [PATCH v2] hw/block/nvme: add compare command

2020-12-08 Thread Klaus Jensen

On Nov 27 07:21, Minwoo Im wrote:
> Hello,
> 
> On Fri, Nov 27, 2020 at 3:56 AM Klaus Jensen  wrote:
> >
> > From: Gollu Appalanaidu 
> >
> > Add the Compare command.
> >
> > This implementation uses a bounce buffer to read in the data from
> > storage and then compare with the host supplied buffer.
> >
> > Signed-off-by: Gollu Appalanaidu 
> > [k.jensen: rebased]
> > Signed-off-by: Klaus Jensen 
> 
> 
> Reviewed-by: Minwoo Im 
> 

Thanks, applied to nvme-next.


signature.asc
Description: PGP signature

Re: [PATCH for-6.0] accel: Wire accel to /machine

2020-12-08 Thread Roman Bolshakov

On Mon, Dec 07, 2020 at 06:50:07PM +0100, Peter Krempa wrote:
> On Mon, Dec 07, 2020 at 12:38:49 -0500, Eduardo Habkost wrote:
> > On Mon, Dec 07, 2020 at 11:46:22AM +0300, Roman Bolshakov wrote:
> > > There's no generic way to query current accel and its properties via QOM
> > > because there's no link between an accel and current machine.
> > > 
> > > The change adds the link, i.e. if HVF is enabled the following will be
> > > available in QOM:
> > > 
> > >   (qemu) qom-get /machine/accel type
> > >   "hvf-accel"
> > > 
> > > Suggested-by: Markus Armbruster 
> > > Suggested-by: Paolo Bonzini 
> > > Signed-off-by: Roman Bolshakov 
> > > ---
> > > 
> > > Hi,
> > > 
> > > this is a follow up patch that deprecates earlier series [1].
> > > 
> > 
> > Is there a reference to the reasoning for dropping the earlier
> > approach?  Your previous approach seems preferable.
> 
> The gist of the discussion before was that deprecating old commands in
> the same release cycle as introducing the replacement might be
> problematic if libvirt wants to adapt ASAP and that the new command
> should be elevated to a intermediate tier of stability, where ACK from
> libvirt is needed to change it during the same release cycle
> incompatibly.
> 
> That was meant generally for any command, and was started because we had
> a similar issue recently.
> 
> My intention definitely was not to change the patch itself, but more a
> process change so that we can keep cooperating on new stuff rapidly, but
> without actually breaking what we do.
> 

Thanks Peter,

I'll drop deprecation patch in v2 of query-accel QMP command.

Ragards,
Roman

Re: [PATCH 10/15] vl: make qemu_get_machine_opts static

2020-12-08 Thread Paolo Bonzini


On 08/12/20 03:16, Daniel Henrique Barboza wrote:

2) find a way to make object_property_get_str() to return kvm_type =
 NULL if the 'kvm_type' option is absent  and keep spapr code
untouched. I don't have the knowledge to assess how hard this would
be.

3) I can change the pseries logic to add an explicit default value
for kvm_type=NULL or kvm_type='' (i.e. no user input is given). We're
already doing that in a sense, but it's not exposed to the user. I
would call it 'auto' and expose it to the user as default value if no
kvm_type is explictly set. This would enhance user experience a bit
and avoid having to deal with a scenario where kvm_type=(blank) is a
valid input.


David, if you think (3) is tolerable let me know and I can send a
patch. IMO this change adds a bit of value regardless of Paolo's
change.


Yes, I agree that (3) is a good idea.  If you send a patch I can 
integrate it in the series.


Paolo

Re: [RFC PATCH 13/27] vhost: Send buffers to device

2020-12-08 Thread Stefan Hajnoczi

On Fri, Nov 20, 2020 at 07:50:51PM +0100, Eugenio Pérez wrote:
> -static inline bool vhost_vring_should_kick(VhostShadowVirtqueue *vq)
> +static bool vhost_vring_should_kick_rcu(VhostShadowVirtqueue *vq)

"vhost_vring_" is a confusing name. This is not related to
vhost_virtqueue or the vhost_vring_* structs.

vhost_shadow_vq_should_kick_rcu()?

>  {
> -return virtio_queue_get_used_notify_split(vq->vq);
> +VirtIODevice *vdev = vq->vdev;
> +vq->num_added = 0;

I'm surprised that a bool function modifies state. Is this assignment
intentional?

> +/* virtqueue_add:
> + * @vq: The #VirtQueue
> + * @elem: The #VirtQueueElement

The copy-pasted doc comment doesn't match this function.

> +int vhost_vring_add(VhostShadowVirtqueue *vq, VirtQueueElement *elem)
> +{
> +int host_head = vhost_vring_add_split(vq, elem);
> +if (vq->ring_id_maps[host_head]) {
> +g_free(vq->ring_id_maps[host_head]);
> +}

VirtQueueElement is freed lazily? Is there a reason for this approach? I
would have expected it to be freed when the used element is process by
the kick fd handler.

> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index 9352c56bfa..304e0baa61 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -956,8 +956,34 @@ static void handle_sw_lm_vq(VirtIODevice *vdev, 
> VirtQueue *vq)
>  uint16_t idx = virtio_get_queue_index(vq);
>  
>  VhostShadowVirtqueue *svq = hdev->sw_lm_shadow_vq[idx];
> +VirtQueueElement *elem;
>  
> -vhost_vring_kick(svq);
> +/*
> + * Make available all buffers as possible.
> + */
> +do {
> +if (virtio_queue_get_notification(vq)) {
> +virtio_queue_set_notification(vq, false);
> +}
> +
> +while (true) {
> +int r;
> +if (virtio_queue_full(vq)) {
> +break;
> +}

Why is this check necessary? The guest cannot provide more descriptors
than there is ring space. If that happens somehow then it's a driver
error that is already reported in virtqueue_pop() below.

I wonder if you added this because the vring implementation above
doesn't support indirect descriptors? It's easy to exhaust the vhost
hdev vring while there is still room in the VirtIODevice's VirtQueue
vring.

signature.asc
Description: PGP signature

Re: [for-6.0 v5 12/13] securable guest memory: Alter virtio default properties for protected guests

2020-12-08 Thread Christian Borntraeger




On 08.12.20 02:54, David Gibson wrote:
> On Fri, Dec 04, 2020 at 03:43:10PM +0100, Halil Pasic wrote:
>> On Fri, 4 Dec 2020 09:29:59 +0100
>> Christian Borntraeger  wrote:
>>
>>> On 04.12.20 09:17, Cornelia Huck wrote:
 On Fri, 4 Dec 2020 09:10:36 +0100
 Christian Borntraeger  wrote:

> On 04.12.20 06:44, David Gibson wrote:
>> The default behaviour for virtio devices is not to use the platforms 
>> normal
>> DMA paths, but instead to use the fact that it's running in a hypervisor
>> to directly access guest memory.  That doesn't work if the guest's memory
>> is protected from hypervisor access, such as with AMD's SEV or POWER's 
>> PEF.
>>
>> So, if a securable guest memory mechanism is enabled, then apply the
>> iommu_platform=on option so it will go through normal DMA mechanisms.
>> Those will presumably have some way of marking memory as shared with
>> the hypervisor or hardware so that DMA will work.
>>
>> Signed-off-by: David Gibson 
>> Reviewed-by: Dr. David Alan Gilbert 
>> ---
>>  hw/core/machine.c | 13 +
>>  1 file changed, 13 insertions(+)
>>
>> diff --git a/hw/core/machine.c b/hw/core/machine.c
>> index a67a27d03c..d16273d75d 100644
>> --- a/hw/core/machine.c
>> +++ b/hw/core/machine.c
>> @@ -28,6 +28,8 @@
>>  #include "hw/mem/nvdimm.h"
>>  #include "migration/vmstate.h"
>>  #include "exec/securable-guest-memory.h"
>> +#include "hw/virtio/virtio.h"
>> +#include "hw/virtio/virtio-pci.h"
>>  
>>  GlobalProperty hw_compat_5_1[] = {
>>  { "vhost-scsi", "num_queues", "1"},
>> @@ -1169,6 +1171,17 @@ void machine_run_board_init(MachineState *machine)
>>   * areas.
>>   */
>>  machine_set_mem_merge(OBJECT(machine), false, &error_abort);
>> +
>> +/*
>> + * Virtio devices can't count on directly accessing guest
>> + * memory, so they need iommu_platform=on to use normal DMA
>> + * mechanisms.  That requires also disabling legacy virtio
>> + * support for those virtio pci devices which allow it.
>> + */
>> +object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy",
>> +   "on", true);
>> +object_register_sugar_prop(TYPE_VIRTIO_DEVICE, "iommu_platform",
>> +   "on", false);  
>
> I have not followed all the history (sorry). Should we also set 
> iommu_platform
> for virtio-ccw? Halil?
>

 That line should add iommu_platform for all virtio devices, shouldn't
 it?
>>>
>>> Yes, sorry. Was misreading that with the line above. 
>>>
>>
>> I believe this is the best we can get. In a sense it is still a
>> pessimization,
> 
> I'm not really clear on what you're getting at here.

I think Halils point is that somebody might come up with a solution where 
things would
work even without iommu_platform. But as he said, still the best setting we can 
get
to cover all cases.

Re: [RFC PATCH 10/27] vhost: Allocate shadow vring

2020-12-08 Thread Stefan Hajnoczi

On Fri, Nov 20, 2020 at 07:50:48PM +0100, Eugenio Pérez wrote:
> Signed-off-by: Eugenio Pérez 
> ---
>  hw/virtio/vhost-sw-lm-ring.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/virtio/vhost-sw-lm-ring.c b/hw/virtio/vhost-sw-lm-ring.c
> index cbf53965cd..cd7b5ba772 100644
> --- a/hw/virtio/vhost-sw-lm-ring.c
> +++ b/hw/virtio/vhost-sw-lm-ring.c
> @@ -16,8 +16,11 @@
>  #include "qemu/event_notifier.h"
>  
>  typedef struct VhostShadowVirtqueue {
> +struct vring vring;
>  EventNotifier hdev_notifier;
>  VirtQueue *vq;
> +
> +vring_desc_t descs[];
>  } VhostShadowVirtqueue;

Looking at later patches I see that this is the vhost hdev vring state,
not the VirtIODevice vring state. That makes more sense.


signature.asc
Description: PGP signature

Re: [PATCH for-6.0] accel: Wire accel to /machine

2020-12-08 Thread Peter Krempa

On Tue, Dec 08, 2020 at 11:13:07 +0300, Roman Bolshakov wrote:
> On Mon, Dec 07, 2020 at 06:50:07PM +0100, Peter Krempa wrote:
> > On Mon, Dec 07, 2020 at 12:38:49 -0500, Eduardo Habkost wrote:
> > > On Mon, Dec 07, 2020 at 11:46:22AM +0300, Roman Bolshakov wrote:
> > > > There's no generic way to query current accel and its properties via QOM
> > > > because there's no link between an accel and current machine.
> > > > 
> > > > The change adds the link, i.e. if HVF is enabled the following will be
> > > > available in QOM:
> > > > 
> > > >   (qemu) qom-get /machine/accel type
> > > >   "hvf-accel"
> > > > 
> > > > Suggested-by: Markus Armbruster 
> > > > Suggested-by: Paolo Bonzini 
> > > > Signed-off-by: Roman Bolshakov 
> > > > ---
> > > > 
> > > > Hi,
> > > > 
> > > > this is a follow up patch that deprecates earlier series [1].
> > > > 
> > > 
> > > Is there a reference to the reasoning for dropping the earlier
> > > approach?  Your previous approach seems preferable.
> > 
> > The gist of the discussion before was that deprecating old commands in
> > the same release cycle as introducing the replacement might be
> > problematic if libvirt wants to adapt ASAP and that the new command
> > should be elevated to a intermediate tier of stability, where ACK from
> > libvirt is needed to change it during the same release cycle
> > incompatibly.
> > 
> > That was meant generally for any command, and was started because we had
> > a similar issue recently.
> > 
> > My intention definitely was not to change the patch itself, but more a
> > process change so that we can keep cooperating on new stuff rapidly, but
> > without actually breaking what we do.
> > 
> 
> Thanks Peter,
> 
> I'll drop deprecation patch in v2 of query-accel QMP command.

Actually In the discussion my stance is that you can deprecate the
old command itself, but starting from that point any semantic changes to
query-accel must be consulted with libvirt as if query-accel was already
released.

We do want to develop the use of the new command as soon as possible,
because that's beneficial to both sides and can actually show a design
problem in the replacement command, so having us replace it before qemu
releases it is good.

On the other hand once libvirt takes the replacement, we must be
involved in any changes to the command due to de-synced release
shchedules so that there isn't a libvirt which e.g. didnt work.

I specifically don't want to derail any of this collaboration, I just
want to enhance it by all parties knowingly agreeing to the "gentleman's
agreement" since complicating the process would actually be detremental.

The thing is that the command may be changed if we know about it and
e.g. didn't yet commit the replacement. We just need to communicate.

Re: [PATCH] linux-user: add option to chroot before emulation

2020-12-08 Thread Laurent Vivier

Le 08/12/2020 à 01:17, Matteo Croce a écrit :
> From: Matteo Croce 
> 
> Add a '-c' option which does a chroot() just before starting the
> emulation. This is useful when the static QEMU user binary can't
> be copied into the target root filesystem, e.g. if it's readonly.

Did you try to use the binfmt_misc 'F' flag (fix binary)?

https://www.kernel.org/doc/Documentation/admin-guide/binfmt-misc.rst

``F`` - fix binary

The usual behaviour of binfmt_misc is to spawn the
binary lazily when the misc format file is invoked.  However,
this doesn``t work very well in the face of mount namespaces and
changeroots, so the ``F`` mode opens the binary as soon as the
emulation is installed and uses the opened image to spawn the
emulator, meaning it is always available once installed,
regardless of how the environment changes.

This can be configured with scripts/qemu-binfmt-conf.sh and
"--persistent yes"" option

Thanks,
Laurent
> 
> Move some code which accesses /proc/sys/vm/mmap_min_addr before
> the chroot, otherwise it would fail.
> 
> Signed-off-by: Matteo Croce 
> ---
>  linux-user/main.c | 128 +++---
>  1 file changed, 75 insertions(+), 53 deletions(-)
> 
> diff --git a/linux-user/main.c b/linux-user/main.c
> index 24d1eb73ad..4788e4b5bc 100644
> --- a/linux-user/main.c
> +++ b/linux-user/main.c
> @@ -60,6 +60,7 @@ static const char *seed_optarg;
>  unsigned long mmap_min_addr;
>  unsigned long guest_base;
>  bool have_guest_base;
> +static const char *qemu_chroot;
>  
>  /*
>   * Used to implement backwards-compatibility for the `-strace`, and
> @@ -304,6 +305,11 @@ static void handle_arg_pagesize(const char *arg)
>  }
>  }
>  
> +static void handle_arg_chroot(const char *arg)
> +{
> +qemu_chroot = arg;
> +}
> +
>  static void handle_arg_seed(const char *arg)
>  {
>  seed_optarg = arg;
> @@ -450,6 +456,8 @@ static const struct qemu_argument arg_table[] = {
>   "logfile", "write logs to 'logfile' (default stderr)"},
>  {"p",  "QEMU_PAGESIZE",true,  handle_arg_pagesize,
>   "pagesize",   "set the host page size to 'pagesize'"},
> +{"c",  "QEMU_CHROOT",  true,  handle_arg_chroot,
> + "chroot", "chroot to 'chroot' before starting emulation"},
>  {"singlestep", "QEMU_SINGLESTEP",  false, handle_arg_singlestep,
>   "",   "run in singlestep mode"},
>  {"strace", "QEMU_STRACE",  false, handle_arg_strace,
> @@ -688,6 +696,73 @@ int main(int argc, char **argv, char **envp)
>  
>  init_qemu_uname_release();
>  
> +/*
> + * Read in mmap_min_addr kernel parameter.  This value is used
> + * When loading the ELF image to determine whether guest_base
> + * is needed.  It is also used in mmap_find_vma.
> + */
> +{
> +FILE *fp;
> +
> +if ((fp = fopen("/proc/sys/vm/mmap_min_addr", "r")) != NULL) {
> +unsigned long tmp;
> +if (fscanf(fp, "%lu", &tmp) == 1 && tmp != 0) {
> +mmap_min_addr = tmp;
> +qemu_log_mask(CPU_LOG_PAGE, "host mmap_min_addr=0x%lx\n",
> +  mmap_min_addr);
> +}
> +fclose(fp);
> +}
> +}
> +
> +/*
> + * We prefer to not make NULL pointers accessible to QEMU.
> + * If we're in a chroot with no /proc, fall back to 1 page.
> + */
> +if (mmap_min_addr == 0) {
> +mmap_min_addr = qemu_host_page_size;
> +qemu_log_mask(CPU_LOG_PAGE,
> +  "host mmap_min_addr=0x%lx (fallback)\n",
> +  mmap_min_addr);
> +}
> +
> +/*
> + * Prepare copy of argv vector for target.
> + */
> +target_argc = argc - optind;
> +target_argv = calloc(target_argc + 1, sizeof (char *));
> +if (target_argv == NULL) {
> +(void) fprintf(stderr, "Unable to allocate memory for 
> target_argv\n");
> +exit(EXIT_FAILURE);
> +}
> +
> +/*
> + * If argv0 is specified (using '-0' switch) we replace
> + * argv[0] pointer with the given one.
> + */
> +i = 0;
> +if (argv0 != NULL) {
> +target_argv[i++] = strdup(argv0);
> +}
> +for (; i < target_argc; i++) {
> +target_argv[i] = strdup(argv[optind + i]);
> +}
> +target_argv[target_argc] = NULL;
> +
> +/*
> + * Change root if requested wuth '-c'
> + */
> +if (qemu_chroot) {
> +if (chroot(qemu_chroot) < 0) {
> +error_report("chroot failed");
> +exit(1);
> +}
> +if (chdir("/")) {
> +error_report("not able to chdir to /: %s", strerror(errno));
> +exit(1);
> +}
> +}
> +
>  execfd = qemu_getauxval(AT_EXECFD);
>  if (execfd == 0) {
>  execfd = open(exec_path, O_RDONLY);
> @@ -746,59 +821,6 @@ int main(int argc, char **argv, char **envp)
>  target_environ = envlist_to_environ(envlist, NULL);
>  envlist_free(envlist);
>  
> -/*
> - *

Re: [PATCH v2] gitlab-ci.yml: Add openSUSE Leap 15.2 for gitlab CI/CD

2020-12-08 Thread AL Yu-Chen Cho

On Tue, 2020-12-08 at 07:55 +0100, Thomas Huth wrote:
> On 24/11/2020 10.45, Cho, Yu-Chen wrote:
> > v2:
> > Drop some package from dockerfile to make docker image more light.
> > 
> > v1:
> > Add build-system-opensuse jobs and opensuse-leap.docker dockerfile.
> > Use openSUSE Leap 15.2 container image in the gitlab-CI.
> > 
> > Signed-off-by: Cho, Yu-Chen 
> > ---
> >  .gitlab-ci.d/containers.yml   |  5 ++
> >  .gitlab-ci.yml    | 30 +++
> >  tests/docker/dockerfiles/opensuse-leap.docker | 54
> > +++
> >  3 files changed, 89 insertions(+)
> >  create mode 100644 tests/docker/dockerfiles/opensuse-leap.docker
> 
>  Hi!
> 
> While trying to pick up your patch, I noticed that it is failing now
> in the
> gitlab-CI:
> 
>  https://gitlab.com/huth/qemu/-/jobs/896384459
> 
> Could you please have a look and send a fixed v3?
> 

No problem, will submit v3 soon.

Cheers,
  AL

>  Thanks,
>   Thomas
>

Re: [RFC PATCH 16/27] virtio: Expose virtqueue_alloc_element

2020-12-08 Thread Stefan Hajnoczi

On Fri, Nov 20, 2020 at 07:50:54PM +0100, Eugenio Pérez wrote:
> Specify VirtQueueElement * as return type makes no harm at this moment.

The reason for the void * return type is that C implicitly converts void
pointers to pointers of any type. The function takes a size_t sz
argument so it can allocate a object of user-defined size. The idea is
that the user's struct embeds a VirtQueueElement field. Changing the
return type to VirtQueueElement * means that callers may need to
explicitly cast to the user's struct type.

It's a question of coding style but I think the void * return type
communicates what is going on better than VirtQueueElement *.


signature.asc
Description: PGP signature

[PATCH v3 0/2] hw/block/nvme: add simple copy command

2020-12-08 Thread Klaus Jensen

From: Klaus Jensen 

Add support for TP 4065 ("Simple Copy Command").

Changes for v3

  * rebased on nvme-next
  * changed the default msrc value to a more reasonable 127 from 255 to
better align with the default mcl value of 128.

Changes for v2

  * prefer style that aligns with existing NvmeIdCtrl field enums
(Minwoo)
  * swapped elbat/elbatm fields in copy source range. I've kept the R-b
and A-b from Minwoo and Stefan since this is a non-functional change
(the device does not use these fields at all).

Klaus Jensen (2):
  nvme: updated shared header for copy command
  hw/block/nvme: add simple copy command

 hw/block/nvme-ns.h|   4 +
 hw/block/nvme.h   |   1 +
 include/block/nvme.h  |  45 -
 hw/block/nvme-ns.c|   8 ++
 hw/block/nvme.c   | 224 +-
 hw/block/trace-events |   6 ++
 6 files changed, 285 insertions(+), 3 deletions(-)

-- 
2.29.2

[PATCH v3 1/2] nvme: updated shared header for copy command

2020-12-08 Thread Klaus Jensen

From: Klaus Jensen 

Add new data structures and types for the Simple Copy command.

Signed-off-by: Klaus Jensen 
Cc: Stefan Hajnoczi 
Cc: Fam Zheng 
Reviewed-by: Minwoo Im 
Acked-by: Stefan Hajnoczi 
---
 include/block/nvme.h | 45 ++--
 1 file changed, 43 insertions(+), 2 deletions(-)

diff --git a/include/block/nvme.h b/include/block/nvme.h
index e95ff6ca9b37..be3aca913a1d 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -472,6 +472,7 @@ enum NvmeIoCommands {
 NVME_CMD_COMPARE= 0x05,
 NVME_CMD_WRITE_ZEROES   = 0x08,
 NVME_CMD_DSM= 0x09,
+NVME_CMD_COPY   = 0x19,
 };
 
 typedef struct QEMU_PACKED NvmeDeleteQ {
@@ -603,6 +604,35 @@ typedef struct QEMU_PACKED NvmeDsmRange {
 uint64_tslba;
 } NvmeDsmRange;
 
+enum {
+NVME_COPY_FORMAT_0 = 0x0,
+};
+
+typedef struct NvmeCopyCmd {
+uint8_t opcode;
+uint8_t flags;
+uint16_tcid;
+uint32_tnsid;
+uint32_trsvd2[4];
+NvmeCmdDptr dptr;
+uint64_tsdlba;
+uint32_tcdw12;
+uint32_tcdw13;
+uint32_tilbrt;
+uint16_tlbat;
+uint16_tlbatm;
+} NvmeCopyCmd;
+
+typedef struct NvmeCopySourceRange {
+uint8_t  rsvd0[8];
+uint64_t slba;
+uint16_t nlb;
+uint8_t  rsvd18[6];
+uint32_t eilbrt;
+uint16_t elbat;
+uint16_t elbatm;
+} NvmeCopySourceRange;
+
 enum NvmeAsyncEventRequest {
 NVME_AER_TYPE_ERROR = 0,
 NVME_AER_TYPE_SMART = 1,
@@ -680,6 +710,7 @@ enum NvmeStatusCodes {
 NVME_CONFLICTING_ATTRS  = 0x0180,
 NVME_INVALID_PROT_INFO  = 0x0181,
 NVME_WRITE_TO_RO= 0x0182,
+NVME_CMD_SIZE_LIMIT = 0x0183,
 NVME_WRITE_FAULT= 0x0280,
 NVME_UNRECOVERED_READ   = 0x0281,
 NVME_E2E_GUARD_ERROR= 0x0282,
@@ -831,7 +862,7 @@ typedef struct QEMU_PACKED NvmeIdCtrl {
 uint8_t nvscc;
 uint8_t rsvd531;
 uint16_tacwu;
-uint8_t rsvd534[2];
+uint16_tocfs;
 uint32_tsgls;
 uint8_t rsvd540[228];
 uint8_t subnqn[256];
@@ -854,6 +885,11 @@ enum NvmeIdCtrlOncs {
 NVME_ONCS_FEATURES  = 1 << 4,
 NVME_ONCS_RESRVATIONS   = 1 << 5,
 NVME_ONCS_TIMESTAMP = 1 << 6,
+NVME_ONCS_COPY  = 1 << 8,
+};
+
+enum NvmeIdCtrlOcfs {
+NVME_OCFS_COPY_FORMAT_0 = 1 << 0,
 };
 
 enum NvmeIdCtrlFrmw {
@@ -995,7 +1031,10 @@ typedef struct QEMU_PACKED NvmeIdNs {
 uint16_tnpdg;
 uint16_tnpda;
 uint16_tnows;
-uint8_t rsvd74[30];
+uint16_tmssrl;
+uint32_tmcl;
+uint8_t msrc;
+uint8_t rsvd81[23];
 uint8_t nguid[16];
 uint64_teui64;
 NvmeLBAFlbaf[16];
@@ -1059,6 +1098,7 @@ static inline void _nvme_check_size(void)
 QEMU_BUILD_BUG_ON(sizeof(NvmeAerResult) != 4);
 QEMU_BUILD_BUG_ON(sizeof(NvmeCqe) != 16);
 QEMU_BUILD_BUG_ON(sizeof(NvmeDsmRange) != 16);
+QEMU_BUILD_BUG_ON(sizeof(NvmeCopySourceRange) != 32);
 QEMU_BUILD_BUG_ON(sizeof(NvmeCmd) != 64);
 QEMU_BUILD_BUG_ON(sizeof(NvmeDeleteQ) != 64);
 QEMU_BUILD_BUG_ON(sizeof(NvmeCreateCq) != 64);
@@ -1066,6 +1106,7 @@ static inline void _nvme_check_size(void)
 QEMU_BUILD_BUG_ON(sizeof(NvmeIdentify) != 64);
 QEMU_BUILD_BUG_ON(sizeof(NvmeRwCmd) != 64);
 QEMU_BUILD_BUG_ON(sizeof(NvmeDsmCmd) != 64);
+QEMU_BUILD_BUG_ON(sizeof(NvmeCopyCmd) != 64);
 QEMU_BUILD_BUG_ON(sizeof(NvmeRangeType) != 64);
 QEMU_BUILD_BUG_ON(sizeof(NvmeErrorLog) != 64);
 QEMU_BUILD_BUG_ON(sizeof(NvmeFwSlotInfoLog) != 512);
-- 
2.29.2

[PATCH v3 2/2] hw/block/nvme: add simple copy command

2020-12-08 Thread Klaus Jensen

From: Klaus Jensen 

Add support for TP 4065a ("Simple Copy Command"), v2020.05.04
("Ratified").

The implementation uses a bounce buffer to first read in the source
logical blocks, then issue a write of that bounce buffer. The default
maximum number of source logical blocks is 128, translating to 512 KiB
for 4k logical blocks which aligns with the default value of MDTS.

Signed-off-by: Klaus Jensen 
---
 hw/block/nvme-ns.h|   4 +
 hw/block/nvme.h   |   1 +
 hw/block/nvme-ns.c|   8 ++
 hw/block/nvme.c   | 224 +-
 hw/block/trace-events |   6 ++
 5 files changed, 242 insertions(+), 1 deletion(-)

diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index 44bf6271b744..745d288b09cf 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -21,6 +21,10 @@
 
 typedef struct NvmeNamespaceParams {
 uint32_t nsid;
+
+uint16_t mssrl;
+uint32_t mcl;
+uint8_t  msrc;
 } NvmeNamespaceParams;
 
 typedef struct NvmeNamespace {
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index 574333caa3f9..f549abeeb930 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -62,6 +62,7 @@ static inline const char *nvme_io_opc_str(uint8_t opc)
 case NVME_CMD_READ: return "NVME_NVM_CMD_READ";
 case NVME_CMD_WRITE_ZEROES: return "NVME_NVM_CMD_WRITE_ZEROES";
 case NVME_CMD_DSM:  return "NVME_NVM_CMD_DSM";
+case NVME_CMD_COPY: return "NVME_NVM_CMD_COPY";
 default:return "NVME_NVM_CMD_UNKNOWN";
 }
 }
diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index 2d69b5177b51..f53f8fc56fd8 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -59,6 +59,11 @@ static int nvme_ns_init(NvmeNamespace *ns, Error **errp)
 
 id_ns->npda = id_ns->npdg = npdg - 1;
 
+/* simple copy */
+id_ns->mssrl = cpu_to_le16(ns->params.mssrl);
+id_ns->mcl = cpu_to_le32(ns->params.mcl);
+id_ns->msrc = ns->params.msrc;
+
 return 0;
 }
 
@@ -150,6 +155,9 @@ static void nvme_ns_realize(DeviceState *dev, Error **errp)
 static Property nvme_ns_props[] = {
 DEFINE_BLOCK_PROPERTIES(NvmeNamespace, blkconf),
 DEFINE_PROP_UINT32("nsid", NvmeNamespace, params.nsid, 0),
+DEFINE_PROP_UINT16("mssrl", NvmeNamespace, params.mssrl, 128),
+DEFINE_PROP_UINT32("mcl", NvmeNamespace, params.mcl, 128),
+DEFINE_PROP_UINT8("msrc", NvmeNamespace, params.msrc, 127),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 8814201364c1..d06ffab7e684 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -999,6 +999,109 @@ static void nvme_aio_discard_cb(void *opaque, int ret)
 nvme_enqueue_req_completion(nvme_cq(req), req);
 }
 
+struct nvme_copy_ctx {
+int copies;
+uint8_t *bounce;
+uint32_t nlb;
+};
+
+struct nvme_copy_in_ctx {
+NvmeRequest *req;
+QEMUIOVector iov;
+};
+
+static void nvme_copy_cb(void *opaque, int ret)
+{
+NvmeRequest *req = opaque;
+NvmeNamespace *ns = req->ns;
+struct nvme_copy_ctx *ctx = req->opaque;
+
+trace_pci_nvme_copy_cb(nvme_cid(req));
+
+if (!ret) {
+block_acct_done(blk_get_stats(ns->blkconf.blk), &req->acct);
+} else {
+block_acct_failed(blk_get_stats(ns->blkconf.blk), &req->acct);
+nvme_aio_err(req, ret);
+}
+
+g_free(ctx->bounce);
+g_free(ctx);
+
+nvme_enqueue_req_completion(nvme_cq(req), req);
+}
+
+static void nvme_copy_in_complete(NvmeRequest *req)
+{
+NvmeNamespace *ns = req->ns;
+NvmeCopyCmd *copy = (NvmeCopyCmd *)&req->cmd;
+struct nvme_copy_ctx *ctx = req->opaque;
+uint64_t sdlba = le64_to_cpu(copy->sdlba);
+uint16_t status;
+
+trace_pci_nvme_copy_in_complete(nvme_cid(req));
+
+block_acct_done(blk_get_stats(ns->blkconf.blk), &req->acct);
+
+status = nvme_check_bounds(ns, sdlba, ctx->nlb);
+if (status) {
+trace_pci_nvme_err_invalid_lba_range(sdlba, ctx->nlb, ns->id_ns.nsze);
+req->status = status;
+
+g_free(ctx->bounce);
+g_free(ctx);
+
+nvme_enqueue_req_completion(nvme_cq(req), req);
+
+return;
+}
+
+qemu_iovec_init(&req->iov, 1);
+qemu_iovec_add(&req->iov, ctx->bounce, nvme_l2b(ns, ctx->nlb));
+
+block_acct_start(blk_get_stats(ns->blkconf.blk), &req->acct,
+ nvme_l2b(ns, ctx->nlb), BLOCK_ACCT_WRITE);
+
+req->aiocb = blk_aio_pwritev(ns->blkconf.blk, nvme_l2b(ns, sdlba),
+ &req->iov, 0, nvme_copy_cb, req);
+}
+
+static void nvme_aio_copy_in_cb(void *opaque, int ret)
+{
+struct nvme_copy_in_ctx *in_ctx = opaque;
+NvmeRequest *req = in_ctx->req;
+NvmeNamespace *ns = req->ns;
+struct nvme_copy_ctx *ctx = req->opaque;
+
+qemu_iovec_destroy(&in_ctx->iov);
+g_free(in_ctx);
+
+trace_pci_nvme_aio_copy_in_cb(nvme_cid(req));
+
+if (ret) {
+nvme_aio_err(req, ret);
+}
+
+ctx->copies--;
+
+if (ctx->copies) {
+return;
+}
+
+if (

[Bug 1907210] [NEW] QEMU gdbstub command "?" issue

2020-12-08 Thread Matic Kres

Public bug reported:

I am using some third party GDB client, and I have noticed that every time "?" 
command is send from the client, QEMU gdbstub removes all break points. This 
behaviour is not expected since "?" command should only return stop reason.
Here is documentation from official gdb:
‘?’ Indicate the reason the target halted. The reply is the same as for step and
continue. This packet has a special interpretation when the target is in 
non-stop
mode; see Section E.10 [Remote Non-Stop], page 733.
Reply: See Section E.3 [Stop Reply Packets], page 693, for the reply 
specifications.

With some help on the irc, we have been able to pin point the failure point(in 
attachement file gdbstub.c).
Function that handles "?" command has this comment in it:
/*
 * Remove all the breakpoints when this query is issued,
 * because gdb is doing an initial connect and the state
 * should be cleaned up.
 */
>From which it is clear that developer that wrote that code assumed, that 
>because most popular gdb client only uses "?" command at initial connect, it 
>is safe to also remove all BPs.
In my opinion initial connect should be detected in some other way, and not 
with "?" command.
Also note that official gdbserver does not remove the BPs when this command is 
send, this issue is present in QEMU and apparently also on kgdb(I was told that 
on irc, have not tested it myself)

** Affects: qemu
 Importance: Undecided
 Status: New

** Attachment added: "qemuissue.jpg"
   
https://bugs.launchpad.net/bugs/1907210/+attachment/5441925/+files/qemuissue.jpg

** Description changed:

  I am using some third party GDB client, and I have noticed that every time 
"?" command is send from the client, QEMU gdbstub removes all break points. 
This behaviour is not expected since "?" command should only return stop reason.
  Here is documentation from official gdb:
  ‘?’ Indicate the reason the target halted. The reply is the same as for step 
and
  continue. This packet has a special interpretation when the target is in 
non-stop
  mode; see Section E.10 [Remote Non-Stop], page 733.
  Reply: See Section E.3 [Stop Reply Packets], page 693, for the reply 
specifications.
  
  With some help on the irc, we have been able to pin point the failure 
point(in attachement file gdbstub.c).
  Function that handles "?" command has this comment in it:
- /*
-  * Remove all the breakpoints when this query is issued,
-  * because gdb is doing an initial connect and the state
-  * should be cleaned up.
-  */
- From which it is clear that developer that wrote that code assumed, that 
because most popular gdb client only uses "?" command at initial connect, it is 
safe to also remove all BPs. 
+ /*
+  * Remove all the breakpoints when this query is issued,
+  * because gdb is doing an initial connect and the state
+  * should be cleaned up.
+  */
+ From which it is clear that developer that wrote that code assumed, that 
because most popular gdb client only uses "?" command at initial connect, it is 
safe to also remove all BPs.
  In my opinion initial connect should be detected in some other way, and not 
with "?" command.
+ Also note that official gdbserver does not remove the BPs when this command 
is send, this issue is present in QEMU and apparently also on kgdb(I was told 
that on irc, have not tested it myself)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1907210

Title:
  QEMU gdbstub command "?" issue

Status in QEMU:
  New

Bug description:
  I am using some third party GDB client, and I have noticed that every time 
"?" command is send from the client, QEMU gdbstub removes all break points. 
This behaviour is not expected since "?" command should only return stop reason.
  Here is documentation from official gdb:
  ‘?’ Indicate the reason the target halted. The reply is the same as for step 
and
  continue. This packet has a special interpretation when the target is in 
non-stop
  mode; see Section E.10 [Remote Non-Stop], page 733.
  Reply: See Section E.3 [Stop Reply Packets], page 693, for the reply 
specifications.

  With some help on the irc, we have been able to pin point the failure 
point(in attachement file gdbstub.c).
  Function that handles "?" command has this comment in it:
  /*
   * Remove all the breakpoints when this query is issued,
   * because gdb is doing an initial connect and the state
   * should be cleaned up.
   */
  From which it is clear that developer that wrote that code assumed, that 
because most popular gdb client only uses "?" command at initial connect, it is 
safe to also remove all BPs.
  In my opinion initial connect should be detected in some other way, and not 
with "?" command.
  Also note that official gdbserver does not remove the BPs when this command 
is send, this issue is present in QEMU and apparently

Re: [RFC PATCH 18/27] vhost: add vhost_vring_poll_rcu

2020-12-08 Thread Stefan Hajnoczi

On Fri, Nov 20, 2020 at 07:50:56PM +0100, Eugenio Pérez wrote:
> @@ -83,6 +89,18 @@ void vhost_vring_set_notification_rcu(VhostShadowVirtqueue 
> *vq, bool enable)
>  smp_mb();
>  }
>  
> +bool vhost_vring_poll_rcu(VhostShadowVirtqueue *vq)

A name like "more_used" is clearer than "poll".


signature.asc
Description: PGP signature

Re: [RFC PATCH 20/27] vhost: Return used buffers

2020-12-08 Thread Stefan Hajnoczi

On Fri, Nov 20, 2020 at 07:50:58PM +0100, Eugenio Pérez wrote:
> @@ -1028,6 +1061,7 @@ static int vhost_sw_live_migration_start(struct 
> vhost_dev *dev)
>  
>  for (idx = 0; idx < dev->nvqs; ++idx) {
>  struct vhost_virtqueue *vq = &dev->vqs[idx];
> +unsigned num = virtio_queue_get_num(dev->vdev, idx);
>  struct vhost_vring_addr addr = {
>  .index = idx,
>  };
> @@ -1044,6 +1078,12 @@ static int vhost_sw_live_migration_start(struct 
> vhost_dev *dev)
>  r = dev->vhost_ops->vhost_set_vring_addr(dev, &addr);
>  assert(r == 0);
>  
> +r = vhost_backend_update_device_iotlb(dev, addr.used_user_addr,
> +  addr.used_user_addr,
> +  sizeof(vring_used_elem_t) * 
> num,
> +  IOMMU_RW);

I don't remember seeing iotlb setup for the rest of the vring or guest
memory. Maybe this should go into a single patch so it's easy to review
the iova space layout.


signature.asc
Description: PGP signature

Re: [RFC PATCH 24/27] vhost: iommu changes

2020-12-08 Thread Stefan Hajnoczi

On Fri, Nov 20, 2020 at 07:51:02PM +0100, Eugenio Pérez wrote:
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index eebfac4455..cb44b9997f 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -1109,6 +1109,10 @@ static int vhost_sw_live_migration_start(struct 
> vhost_dev *dev)
>  
>  assert(dev->vhost_ops->vhost_set_vring_enable);
>  dev->vhost_ops->vhost_set_vring_enable(dev, false);
> +if (vhost_dev_has_iommu(dev)) {
> +r = vhost_backend_invalidate_device_iotlb(dev, 0, -1ULL);
> +assert(r == 0);
> +}
>  
>  for (idx = 0; idx < dev->nvqs; ++idx) {
>  struct vhost_virtqueue *vq = &dev->vqs[idx];
> @@ -1269,6 +1273,19 @@ int vhost_device_iotlb_miss(struct vhost_dev *dev, 
> uint64_t iova, int write)
>  
>  trace_vhost_iotlb_miss(dev, 1);
>  
> +if (dev->sw_lm_enabled) {
> +uaddr = iova;
> +len = 4096;
> +ret = vhost_backend_update_device_iotlb(dev, iova, uaddr, len,
> +IOMMU_RW);

It would be nice to look up the available memory so
vhost_backend_update_device_iotlb() can be called with a much bigger
[uaddr, uaddr+len) range. This will reduce the number of iotlb misses.

Will vIOMMU be required for this feature? If not, then the vring needs
to be added to the vhost memory regions because vhost will not send QEMU
iotlb misses.


signature.asc
Description: PGP signature

[PATCH] icount: improve exec nocache usage

2020-12-08 Thread Pavel Dovgalyuk

cpu-exec tries to execute TB without caching when current
icount budget is over. But sometimes refilled budget is big
enough to try executing cached blocks.
This patch checks that instruction budget is big enough
for next block execution instead of just running cpu_exec_nocache.
It halves the number of calls of cpu_exec_nocache function
during tested OS boot scenario.

Signed-off-by: Pavel Dovgalyuk 
---
 accel/tcg/cpu-exec.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 58aea605d8..251b340fb9 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -685,7 +685,7 @@ static inline void cpu_loop_exec_tb(CPUState *cpu, 
TranslationBlock *tb,
 insns_left = MIN(0x, cpu->icount_budget);
 cpu_neg(cpu)->icount_decr.u16.low = insns_left;
 cpu->icount_extra = cpu->icount_budget - insns_left;
-if (!cpu->icount_extra) {
+if (!cpu->icount_extra && insns_left < tb->icount) {
 /* Execute any remaining instructions, then let the main loop
  * handle the next event.
  */

Re: [PATCH for-6.0] spapr: Allow memory unplug to always succeed

2020-12-08 Thread Greg Kurz

On Tue, 8 Dec 2020 15:30:04 +1100
David Gibson  wrote:

> On Mon, Dec 07, 2020 at 02:37:04PM +0100, Greg Kurz wrote:
> > It is currently impossible to hot-unplug a memory device between
> > machine reset and CAS.
> > 
> > (qemu) device_del dimm1
> > Error: Memory hot unplug not supported for this guest
> > 
> > This limitation was introduced in order to provide an explicit
> > error path for older guests that didn't support hot-plug event
> > sources (and thus memory hot-unplug).
> > 
> > The linux kernel has been supporting these since 4.11. All recent
> > enough guests are thus capable of handling the removal of a memory
> > device at all time, including during early boot.
> > 
> > Lift the limitation for the latest machine type. This means that
> > trying to unplug memory from a guest that doesn't support it will
> > likely just do nothing and the memory will only get removed at
> > next reboot. Such older guests can still get the existing behavior
> > by using an older machine type.
> > 
> > Signed-off-by: Greg Kurz 
> 
> Looks like this conflicts with something I've added to for-6.0
> recently.  Can you rebase and resend, please.
> 

I'm not quite sure what for-6.0 you're talking about. Despite
you're recent announcement about moving to gitlab, it seems
that the branch at github is the most up to date.

gitlab:
- HEAD is "xive: Add trace events"
- Date: 26 Nov, 2020

github:
- HEAD is "MAINTAINERS: Add Greg Kurz as co-maintainer for ppc"
- Date: Dec 4, 2020

I've thus based this patch on github. Also, this is based on Connie's
"hw: add compat machines for 6.0" patch...

> > ---
> > Based-on: 20201109173928.1001764-1-coh...@redhat.com

... maybe I should have made it more clear than just
mentioning the message id ?

I think I'll just wait for Connie's patch to get merged and I'll repost after
you've rebased ppc-for-6.0.

> > ---
> >  include/hw/ppc/spapr.h | 1 +
> >  hw/ppc/spapr.c | 6 +-
> >  hw/ppc/spapr_events.c  | 3 ++-
> >  3 files changed, 8 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index b7ced9faebf5..7aa630350326 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -139,6 +139,7 @@ struct SpaprMachineClass {
> >  hwaddr rma_limit;  /* clamp the RMA to this size */
> >  bool pre_5_1_assoc_refpoints;
> >  bool pre_5_2_numa_associativity;
> > +bool pre_6_0_memory_unplug;
> >  
> >  bool (*phb_placement)(SpaprMachineState *spapr, uint32_t index,
> >uint64_t *buid, hwaddr *pio, 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 7e954bc84bed..f0b26b2af30d 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -4044,7 +4044,8 @@ static void 
> > spapr_machine_device_unplug_request(HotplugHandler *hotplug_dev,
> >  SpaprMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
> >  
> >  if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> > -if (spapr_ovec_test(sms->ov5_cas, OV5_HP_EVT)) {
> > +if (!smc->pre_6_0_memory_unplug ||
> > +spapr_ovec_test(sms->ov5_cas, OV5_HP_EVT)) {
> >  spapr_memory_unplug_request(hotplug_dev, dev, errp);
> >  } else {
> >  /* NOTE: this means there is a window after guest reset, prior 
> > to
> > @@ -4530,8 +4531,11 @@ DEFINE_SPAPR_MACHINE(6_0, "6.0", true);
> >   */
> >  static void spapr_machine_5_2_class_options(MachineClass *mc)
> >  {
> > +SpaprMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
> > +
> >  spapr_machine_6_0_class_options(mc);
> >  compat_props_add(mc->compat_props, hw_compat_5_2, hw_compat_5_2_len);
> > +smc->pre_6_0_memory_unplug = true;
> >  }
> >  
> >  DEFINE_SPAPR_MACHINE(5_2, "5.2", false);
> > diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> > index 1add53547ec3..c30123177b16 100644
> > --- a/hw/ppc/spapr_events.c
> > +++ b/hw/ppc/spapr_events.c
> > @@ -659,7 +659,8 @@ static void spapr_hotplug_req_event(uint8_t hp_id, 
> > uint8_t hp_action,
> >  /* we should not be using count_indexed value unless the guest
> >   * supports dedicated hotplug event source
> >   */
> > -g_assert(spapr_ovec_test(spapr->ov5_cas, OV5_HP_EVT));
> > +g_assert(!SPAPR_MACHINE_GET_CLASS(spapr)->pre_6_0_memory_unplug ||
> > + spapr_ovec_test(spapr->ov5_cas, OV5_HP_EVT));
> >  hp->drc_id.count_indexed.count =
> >  cpu_to_be32(drc_id->count_indexed.count);
> >  hp->drc_id.count_indexed.index =
> 



pgpDzn8NhhSPK.pgp
Description: OpenPGP digital signature

[PATCH] hw/block/nvme: fix bad clearing of CAP

2020-12-08 Thread Klaus Jensen

From: Klaus Jensen 

Commit 37712e00b1f0 ("hw/block/nvme: factor out pmr setup") changed the
control flow such that the CAP register is erronously cleared after
nvme_init_pmr() has configured it. Since the entire NvmeCtrl structure
is zero-filled initially, there is no need for the explicit clearing, so
just remove it.

Fixes: 37712e00b1f0 ("hw/block/nvme: factor out pmr setup")
Signed-off-by: Klaus Jensen 
---
 hw/block/nvme.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 8814201364c1..28416b18a5c0 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -3040,7 +3040,6 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice 
*pci_dev)
 id->psd[0].enlat = cpu_to_le32(0x10);
 id->psd[0].exlat = cpu_to_le32(0x4);
 
-n->bar.cap = 0;
 NVME_CAP_SET_MQES(n->bar.cap, 0x7ff);
 NVME_CAP_SET_CQR(n->bar.cap, 1);
 NVME_CAP_SET_TO(n->bar.cap, 0xf);
-- 
2.29.2

Re: [RFC PATCH 00/27] vDPA software assisted live migration

2020-12-08 Thread Stefan Hajnoczi

On Fri, Nov 20, 2020 at 07:50:38PM +0100, Eugenio Pérez wrote:
> This series enable vDPA software assisted live migration for vhost-net
> devices. This is a new method of vhost devices migration: Instead of
> relay on vDPA device's dirty logging capability, SW assisted LM
> intercepts dataplane, forwarding the descriptors between VM and device.

Pros:
+ vhost/vDPA devices don't need to implement dirty memory logging
+ Obsoletes ioctl(VHOST_SET_LOG_BASE) and friends

Cons:
- Not generic, relies on vhost-net-specific ioctls
- Doesn't support VIRTIO Shared Memory Regions
  https://github.com/oasis-tcs/virtio-spec/blob/master/shared-mem.tex
- Performance (see below)

I think performance will be significantly lower when the shadow vq is
enabled. Imagine a vDPA device with hardware vq doorbell registers
mapped into the guest so the guest driver can directly kick the device.
When the shadow vq is enabled a vmexit is needed to write to the shadow
vq ioeventfd, then the host kernel scheduler switches to a QEMU thread
to read the ioeventfd, the descriptors are translated, QEMU writes to
the vhost hdev kick fd, the host kernel scheduler switches to the vhost
worker thread, vhost/vDPA notifies the virtqueue, and finally the
vDPA driver writes to the hardware vq doorbell register. That is a lot
of overhead compared to writing to an exitless MMIO register!

If the shadow vq was implemented in drivers/vhost/ and QEMU used the
existing ioctl(VHOST_SET_LOG_BASE) approach, then the overhead would be
reduced to just one set of ioeventfd/irqfd. In other words, the QEMU
dirty memory logging happens asynchronously and isn't in the dataplane.

In addition, hardware that supports dirty memory logging as well as
software vDPA devices could completely eliminate the shadow vq for even
better performance.

But performance is a question of "is it good enough?". Maybe this
approach is okay and users don't expect good performance while dirty
memory logging is enabled. I just wanted to share the idea of moving the
shadow vq into the kernel in case you like that approach better.

Stefan

signature.asc
Description: PGP signature

[Bug 1905979] Re: Check if F_OFD_SETLK is supported may give wrong result

2020-12-08 Thread Daniel Berrange

** Changed in: qemu
   Status: New => Incomplete

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1905979

Title:
  Check if F_OFD_SETLK is supported may give wrong result

Status in QEMU:
  Incomplete

Bug description:
  In util/osdep.c there is a function qemu_probe_lock_ops() to check if
  file locks F_OFD_SETLK and F_OFD_GETLK (of the style "Open file
  description locks (non-POSIX)") are supported.

  This test is done by trying a lock operation on the file /dev/null.

  This test can get a wrong result.

  The result is (probably) if the operating system *in general* supports
  these locks. However, it does not guarantee that the file system where
  the lock is really wanted (for instance, in caller
  raw_check_lock_bytes() in block/file-posix.c) does support these
  locks.

  (In theory it could even be that /dev/null, being a device special
  file, does not support the lock type while a plain file would.)

  This is in particular relevant for disk images which are stored on a
  shared file system (my particular use case is the Quobyte file system,
  which appears not to support these locks).

  The code as mentioned above is present in the master branch (I checked
  commit ea8208249d1082eae0444934efb3b59cd3183f05) but also for example
  on stable-2.11 commit 0982a56a551556c704dc15752dabf57b4be1c640)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1905979/+subscriptions

Re: [PATCH v5 0/2] MTE support for KVM guest

2020-12-08 Thread Haibo Xu

On Mon, 7 Dec 2020 at 22:48, Steven Price  wrote:
>
> On 04/12/2020 08:25, Haibo Xu wrote:
> > On Fri, 20 Nov 2020 at 17:51, Steven Price  wrote:
> >>
> >> On 19/11/2020 19:11, Marc Zyngier wrote:
> >>> On 2020-11-19 18:42, Andrew Jones wrote:
>  On Thu, Nov 19, 2020 at 03:45:40PM +, Peter Maydell wrote:
> > On Thu, 19 Nov 2020 at 15:39, Steven Price  wrote:
> >> This series adds support for Arm's Memory Tagging Extension (MTE) to
> >> KVM, allowing KVM guests to make use of it. This builds on the
> > existing
> >> user space support already in v5.10-rc1, see [1] for an overview.
> >
> >> The change to require the VMM to map all guest memory PROT_MTE is
> >> significant as it means that the VMM has to deal with the MTE tags
> > even
> >> if it doesn't care about them (e.g. for virtual devices or if the VMM
> >> doesn't support migration). Also unfortunately because the VMM can
> >> change the memory layout at any time the check for PROT_MTE/VM_MTE has
> >> to be done very late (at the point of faulting pages into stage 2).
> >
> > I'm a bit dubious about requring the VMM to map the guest memory
> > PROT_MTE unless somebody's done at least a sketch of the design
> > for how this would work on the QEMU side. Currently QEMU just
> > assumes the guest memory is guest memory and it can access it
> > without special precautions...
> >
> 
>  There are two statements being made here:
> 
>  1) Requiring the use of PROT_MTE when mapping guest memory may not fit
>  QEMU well.
> 
>  2) New KVM features should be accompanied with supporting QEMU code in
>  order to prove that the APIs make sense.
> 
>  I strongly agree with (2). While kvmtool supports some quick testing, it
>  doesn't support migration. We must test all new features with a migration
>  supporting VMM.
> 
>  I'm not sure about (1). I don't feel like it should be a major problem,
>  but (2).
> >>
> >> (1) seems to be contentious whichever way we go. Either PROT_MTE isn't
> >> required in which case it's easy to accidentally screw up migration, or
> >> it is required in which case it's difficult to handle normal guest
> >> memory from the VMM. I get the impression that probably I should go back
> >> to the previous approach - sorry for the distraction with this change.
> >>
> >> (2) isn't something I'm trying to skip, but I'm limited in what I can do
> >> myself so would appreciate help here. Haibo is looking into this.
> >>
> >
> > Hi Steven,
> >
> > Sorry for the later reply!
> >
> > I have finished the POC for the MTE migration support with the assumption
> > that all the memory is mapped with PROT_MTE. But I got stuck in the test
> > with a FVP setup. Previously, I successfully compiled a test case to verify
> > the basic function of MTE in a guest. But these days, the re-compiled test
> > can't be executed by the guest(very weird). The short plan to verify
> > the migration
> > is to set the MTE tags on one page in the guest, and try to dump the 
> > migrated
> > memory contents.
>
> Hi Haibo,
>
> Sounds like you are making good progress - thanks for the update. Have
> you thought about how the PROT_MTE mappings might work if QEMU itself
> were to use MTE? My worry is that we end up with MTE in a guest
> preventing QEMU from using MTE itself (because of the PROT_MTE
> mappings). I'm hoping QEMU can wrap its use of guest memory in a
> sequence which disables tag checking (something similar will be needed
> for the "protected VM" use case anyway), but this isn't something I've
> looked into.

As far as I can see, to map all the guest memory with PROT_MTE in VMM
is a little weird, and lots of APIs have to be changed to include this flag.
IMHO, it would be better if the KVM can provide new APIs to load/store the
guest memory tag which may make it easier to enable the Qemu migration
support.

>
> > I will update the status later next week!
>
> Great, I look forward to hearing how it goes.
>
> Thanks,
>
> Steve

Re: [PATCH v5 0/2] MTE support for KVM guest

2020-12-08 Thread Marc Zyngier


On 2020-12-08 09:51, Haibo Xu wrote:

On Mon, 7 Dec 2020 at 22:48, Steven Price  wrote:




[...]


Sounds like you are making good progress - thanks for the update. Have
you thought about how the PROT_MTE mappings might work if QEMU itself
were to use MTE? My worry is that we end up with MTE in a guest
preventing QEMU from using MTE itself (because of the PROT_MTE
mappings). I'm hoping QEMU can wrap its use of guest memory in a
sequence which disables tag checking (something similar will be needed
for the "protected VM" use case anyway), but this isn't something I've
looked into.


As far as I can see, to map all the guest memory with PROT_MTE in VMM
is a little weird, and lots of APIs have to be changed to include this 
flag.
IMHO, it would be better if the KVM can provide new APIs to load/store 
the

guest memory tag which may make it easier to enable the Qemu migration
support.


On what granularity? To what storage? How do you plan to synchronise 
this

with the dirty-log interface?

Thanks,

M.
--
Jazz is not dead. It just smells funny...

Re: [PATCH v5 0/2] MTE support for KVM guest

2020-12-08 Thread Haibo Xu

On Tue, 8 Dec 2020 at 00:44, Dr. David Alan Gilbert  wrote:
>
> * Steven Price (steven.pr...@arm.com) wrote:
> > On 07/12/2020 15:27, Peter Maydell wrote:
> > > On Mon, 7 Dec 2020 at 14:48, Steven Price  wrote:
> > > > Sounds like you are making good progress - thanks for the update. Have
> > > > you thought about how the PROT_MTE mappings might work if QEMU itself
> > > > were to use MTE? My worry is that we end up with MTE in a guest
> > > > preventing QEMU from using MTE itself (because of the PROT_MTE
> > > > mappings). I'm hoping QEMU can wrap its use of guest memory in a
> > > > sequence which disables tag checking (something similar will be needed
> > > > for the "protected VM" use case anyway), but this isn't something I've
> > > > looked into.
> > >
> > > It's not entirely the same as the "protected VM" case. For that
> > > the patches currently on list basically special case "this is a
> > > debug access (eg from gdbstub/monitor)" which then either gets
> > > to go via "decrypt guest RAM for debug" or gets failed depending
> > > on whether the VM has a debug-is-ok flag enabled. For an MTE
> > > guest the common case will be guests doing standard DMA operations
> > > to or from guest memory. The ideal API for that from QEMU's
> > > point of view would be "accesses to guest RAM don't do tag
> > > checks, even if tag checks are enabled for accesses QEMU does to
> > > memory it has allocated itself as a normal userspace program".
> >
> > Sorry, I know I simplified it rather by saying it's similar to protected VM.
> > Basically as I see it there are three types of memory access:
> >
> > 1) Debug case - has to go via a special case for decryption or ignoring the
> > MTE tag value. Hopefully this can be abstracted in the same way.
> >
> > 2) Migration - for a protected VM there's likely to be a special method to
> > allow the VMM access to the encrypted memory (AFAIK memory is usually kept
> > inaccessible to the VMM). For MTE this again has to be special cased as we
> > actually want both the data and the tag values.
> >
> > 3) Device DMA - for a protected VM it's usual to unencrypt a small area of
> > memory (with the permission of the guest) and use that as a bounce buffer.
> > This is possible with MTE: have an area the VMM purposefully maps with
> > PROT_MTE. The issue is that this has a performance overhead and we can do
> > better with MTE because it's trivial for the VMM to disable the protection
> > for any memory.
>
> Those all sound very similar to the AMD SEV world;  there's the special
> case for Debug that Peter mentioned; migration is ...complicated and
> needs special case that's still being figured out, and as I understand
> Device DMA also uses a bounce buffer (and swiotlb in the guest to make
> that happen).
>
>
> I'm not sure about the stories for the IBM hardware equivalents.

Like s390-skeys(storage keys) support in Qemu?

I have read the migration support for the s390-skeys in Qemu and found
that the logic is very similar to that of MTE, except the difference that the
s390-skeys were migrated separately from that of the guest memory data
while for MTE, I think the guest memory tags should go with the  memory data.

>
> Dave
>
> > The part I'm unsure on is how easy it is for QEMU to deal with (3) without
> > the overhead of bounce buffers. Ideally there'd already be a wrapper for
> > guest memory accesses and that could just be wrapped with setting TCO during
> > the access. I suspect the actual situation is more complex though, and I'm
> > hoping Haibo's investigations will help us understand this.
> >
> > Thanks,
> >
> > Steve
> >
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
>

Re: [PATCH v5 0/2] MTE support for KVM guest

2020-12-08 Thread Haibo Xu

On Tue, 8 Dec 2020 at 18:01, Marc Zyngier  wrote:
>
> On 2020-12-08 09:51, Haibo Xu wrote:
> > On Mon, 7 Dec 2020 at 22:48, Steven Price  wrote:
> >>
>
> [...]
>
> >> Sounds like you are making good progress - thanks for the update. Have
> >> you thought about how the PROT_MTE mappings might work if QEMU itself
> >> were to use MTE? My worry is that we end up with MTE in a guest
> >> preventing QEMU from using MTE itself (because of the PROT_MTE
> >> mappings). I'm hoping QEMU can wrap its use of guest memory in a
> >> sequence which disables tag checking (something similar will be needed
> >> for the "protected VM" use case anyway), but this isn't something I've
> >> looked into.
> >
> > As far as I can see, to map all the guest memory with PROT_MTE in VMM
> > is a little weird, and lots of APIs have to be changed to include this
> > flag.
> > IMHO, it would be better if the KVM can provide new APIs to load/store
> > the
> > guest memory tag which may make it easier to enable the Qemu migration
> > support.
>
> On what granularity? To what storage? How do you plan to synchronise
> this
> with the dirty-log interface?

The Qemu would migrate page by page, and if one page has been migrated but
becomes dirty again, the migration process would re-send this dirty page.
The current MTE migration POC codes would try to send the page tags just after
the page data, if one page becomes dirty again, the page data and the tags would
be re-sent.

>
> Thanks,
>
>  M.
> --
> Jazz is not dead. It just smells funny...

[Bug 1905979] Re: Check if F_OFD_SETLK is supported may give wrong result

2020-12-08 Thread Olaf Seibert

Interesting. Thanks for the link.

The file system we are using is the Quobyte file system (2.24.1) 
(https://www.quobyte.com/), which works via FUSE. 
We've had problems with OFD locks with this file system in the past, so my 
first thought, seeing the error in comment #1, was that those would be to blame.

But if the OFD locks are not really handled by the file system, I'm not
sure how that explains the OFD lock issues we had in the past. I don't
suppose this changed in the last year or so. Just now I made a little
test program (basically copying qemu_lock_fd_test() and
qemu_probe_lock_ops() from qemu) to double-check, and indeed right now
it seems that the OFD locks *are* working on the Quobyte file system. Or
at least qemu_lock_fd_test() doesn't return an error.

So now I'm back to square one on diagnosing the observed error. It
occurred in an installation of Openstack Ussuri installed on Ubuntu
18.04 Bionic using the Ubuntu Cloud Archive for packaging. The Cloud
Archive has backports of the latest Qemu to earlier Ubuntu versions. The
exact qemu version was http://ubuntu-
cloud.archive.canonical.com/ubuntu/pool/main/q/qemu/qemu_4.2-3ubuntu6.7~cloud0_amd64.deb
.

Annoyingly I have not been able to locate the git repo from which the
Ubuntu Cloud Archive creates its packages (containing the patches and
build changes for backports); all I can find is version 4.2-3ubuntu6.7
(without ~cloud0) which is for Ubuntu 20.04 Focal.

For now we're working around it by downgrading Qemu to the normal Bionic
version (2.11+dfsg-1ubuntu7.33)

You wouldn't happen to know where the Ubuntu Cloud Archive stores exact
files it creates its packages from? (I have already asked on
stackoverflow without success so far:
https://stackoverflow.com/questions/65146846/from-which-git-repos-does-
the-ubuntu-cloud-archive-compile-its-packages)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1905979

Title:
  Check if F_OFD_SETLK is supported may give wrong result

Status in QEMU:
  Incomplete

Bug description:
  In util/osdep.c there is a function qemu_probe_lock_ops() to check if
  file locks F_OFD_SETLK and F_OFD_GETLK (of the style "Open file
  description locks (non-POSIX)") are supported.

  This test is done by trying a lock operation on the file /dev/null.

  This test can get a wrong result.

  The result is (probably) if the operating system *in general* supports
  these locks. However, it does not guarantee that the file system where
  the lock is really wanted (for instance, in caller
  raw_check_lock_bytes() in block/file-posix.c) does support these
  locks.

  (In theory it could even be that /dev/null, being a device special
  file, does not support the lock type while a plain file would.)

  This is in particular relevant for disk images which are stored on a
  shared file system (my particular use case is the Quobyte file system,
  which appears not to support these locks).

  The code as mentioned above is present in the master branch (I checked
  commit ea8208249d1082eae0444934efb3b59cd3183f05) but also for example
  on stable-2.11 commit 0982a56a551556c704dc15752dabf57b4be1c640)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1905979/+subscriptions

Re: [for-6.0 v5 12/13] securable guest memory: Alter virtio default properties for protected guests

2020-12-08 Thread Halil Pasic

On Tue, 8 Dec 2020 12:54:03 +1100
David Gibson  wrote:

> > > >>> + * Virtio devices can't count on directly accessing guest
> > > >>> + * memory, so they need iommu_platform=on to use normal DMA
> > > >>> + * mechanisms.  That requires also disabling legacy virtio
> > > >>> + * support for those virtio pci devices which allow it.
> > > >>> + */
> > > >>> +object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy",
> > > >>> +   "on", true);
> > > >>> +object_register_sugar_prop(TYPE_VIRTIO_DEVICE, 
> > > >>> "iommu_platform",
> > > >>> +   "on", false);
> > > >>
> > > >> I have not followed all the history (sorry). Should we also set 
> > > >> iommu_platform
> > > >> for virtio-ccw? Halil?
> > > >>  
> > > > 
> > > > That line should add iommu_platform for all virtio devices, shouldn't
> > > > it?  
> > > 
> > > Yes, sorry. Was misreading that with the line above. 
> > >   
> > 
> > I believe this is the best we can get. In a sense it is still a
> > pessimization,  
> 
> I'm not really clear on what you're getting at here.

By pessimiziation, I mean that we are going to indicate
_F_PLATFORM_ACCESS even if it isn't necessary, because the guest never
opted in for confidential/memory protection/memory encryption. We have
discussed this before, and I don't see a better solution that works for
everybody.

Regards,
Halil


pgpA7Ebw4OihH.pgp
Description: OpenPGP digital signature

[Bug 1905979] Re: Check if F_OFD_SETLK is supported may give wrong result

2020-12-08 Thread Daniel Berrange

Look in the same directory as that .deb link above - the the files
ending in orig.tar.gz (upstream source) and files ending in
debian.tar.xz (downstream modifications)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1905979

Title:
  Check if F_OFD_SETLK is supported may give wrong result

Status in QEMU:
  Incomplete

Bug description:
  In util/osdep.c there is a function qemu_probe_lock_ops() to check if
  file locks F_OFD_SETLK and F_OFD_GETLK (of the style "Open file
  description locks (non-POSIX)") are supported.

  This test is done by trying a lock operation on the file /dev/null.

  This test can get a wrong result.

  The result is (probably) if the operating system *in general* supports
  these locks. However, it does not guarantee that the file system where
  the lock is really wanted (for instance, in caller
  raw_check_lock_bytes() in block/file-posix.c) does support these
  locks.

  (In theory it could even be that /dev/null, being a device special
  file, does not support the lock type while a plain file would.)

  This is in particular relevant for disk images which are stored on a
  shared file system (my particular use case is the Quobyte file system,
  which appears not to support these locks).

  The code as mentioned above is present in the master branch (I checked
  commit ea8208249d1082eae0444934efb3b59cd3183f05) but also for example
  on stable-2.11 commit 0982a56a551556c704dc15752dabf57b4be1c640)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1905979/+subscriptions

[Bug 1905979] Re: Check if F_OFD_SETLK is supported may give wrong result

2020-12-08 Thread Olaf Seibert

The kernel version is Linux hostname 4.15.0-124-generic #127-Ubuntu SMP
Fri Nov 6 10:54:43 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1905979

Title:
  Check if F_OFD_SETLK is supported may give wrong result

Status in QEMU:
  Incomplete

Bug description:
  In util/osdep.c there is a function qemu_probe_lock_ops() to check if
  file locks F_OFD_SETLK and F_OFD_GETLK (of the style "Open file
  description locks (non-POSIX)") are supported.

  This test is done by trying a lock operation on the file /dev/null.

  This test can get a wrong result.

  The result is (probably) if the operating system *in general* supports
  these locks. However, it does not guarantee that the file system where
  the lock is really wanted (for instance, in caller
  raw_check_lock_bytes() in block/file-posix.c) does support these
  locks.

  (In theory it could even be that /dev/null, being a device special
  file, does not support the lock type while a plain file would.)

  This is in particular relevant for disk images which are stored on a
  shared file system (my particular use case is the Quobyte file system,
  which appears not to support these locks).

  The code as mentioned above is present in the master branch (I checked
  commit ea8208249d1082eae0444934efb3b59cd3183f05) but also for example
  on stable-2.11 commit 0982a56a551556c704dc15752dabf57b4be1c640)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1905979/+subscriptions

Re: [PATCH 10/15] vl: make qemu_get_machine_opts static

2020-12-08 Thread Igor Mammedov

On Mon, 7 Dec 2020 23:32:55 -0300
Daniel Henrique Barboza  wrote:

> On 12/7/20 1:07 PM, Igor Mammedov wrote:
> > On Wed,  2 Dec 2020 03:18:49 -0500
> > Paolo Bonzini  wrote:
> >   
> >> Machine options can be retrieved as properties of the machine object.
> >> Encourage that by removing the "easy" accessor to machine options.
> >>
> >> Signed-off-by: Paolo Bonzini 
> >> ---
> >>   accel/kvm/kvm-all.c | 11 ---
> >>   hw/arm/boot.c   |  2 +-
> >>   hw/microblaze/boot.c|  9 -
> >>   hw/nios2/boot.c |  9 -
> >>   hw/ppc/e500.c   |  5 ++---
> >>   hw/ppc/spapr_nvdimm.c   |  4 ++--
> >>   hw/ppc/virtex_ml507.c   |  2 +-
> >>   hw/riscv/sifive_u.c |  6 ++
> >>   hw/riscv/virt.c |  6 ++
> >>   hw/xtensa/xtfpga.c  |  9 -
> >>   include/sysemu/sysemu.h |  2 --
> >>   softmmu/device_tree.c   |  2 +-
> >>   softmmu/vl.c|  2 +-
> >>   13 files changed, 28 insertions(+), 41 deletions(-)
> >>  
> > [...]  
> >> diff --git a/hw/ppc/spapr_nvdimm.c b/hw/ppc/spapr_nvdimm.c
> >> index a833a63b5e..84715a4d78 100644
> >> --- a/hw/ppc/spapr_nvdimm.c
> >> +++ b/hw/ppc/spapr_nvdimm.c
> >> @@ -38,7 +38,6 @@ bool spapr_nvdimm_validate(HotplugHandler *hotplug_dev, 
> >> NVDIMMDevice *nvdimm,
> >>   {
> >>   const MachineClass *mc = MACHINE_GET_CLASS(hotplug_dev);
> >>   const MachineState *ms = MACHINE(hotplug_dev);
> >> -const char *nvdimm_opt = qemu_opt_get(qemu_get_machine_opts(), 
> >> "nvdimm");
> >>   g_autofree char *uuidstr = NULL;
> >>   QemuUUID uuid;
> >>   int ret;
> >> @@ -57,10 +56,11 @@ bool spapr_nvdimm_validate(HotplugHandler 
> >> *hotplug_dev, NVDIMMDevice *nvdimm,
> >>* ensure that, if the user sets nvdimm=off, we error out
> >>* regardless of being 5.1 or newer.
> >>*/
> >> -if (!ms->nvdimms_state->is_enabled && nvdimm_opt) {
> >> +if (!ms->nvdimms_state->is_enabled && 
> >> ms->nvdimms_state->has_is_enabled) {
> >>   error_setg(errp, "nvdimm device found but 'nvdimm=off' was set");
> >>   return false;
> >>   }
> >> +ms->nvdimms_state->is_enabled = true;  
> > 
> > it looks like nvdimm is always enabled since 5.0 and there is no way to 
> > disable it  
> 
> 
> I'm not sure I'm following this statement. Testing here with all patches
> up to this one applied, in a x86 machine:
> 
> 
> $ sudo ./x86_64-softmmu/qemu-system-x86_64 -M pc,nvdimm=off -object 
> memory-backend-file,id=mem0,size=1G,mem-path=/tmp/aa -device 
> nvdimm,id=dimm0,memdev=mem0
> qemu-system-x86_64: -device nvdimm,id=dimm0,memdev=mem0: nvdimm is not 
> enabled: missing 'nvdimm' in '-M'
> $
> $ sudo ./x86_64-softmmu/qemu-system-x86_64 -M pc -object 
> memory-backend-file,id=mem0,size=1G,mem-path=/tmp/aa -device 
> nvdimm,id=dimm0,memdev=mem0
> qemu-system-x86_64: -device nvdimm,id=dimm0,memdev=mem0: nvdimm is not 
> enabled: missing 'nvdimm' in '-M'
> $
> 
> This is the x86 NVDIMM behavior I considered when patching pseries NVDIMM 
> support.
> As Paolo mentioned in patch 09, pseries has different default values. We 
> screwed
> it up when pushing the first version of pseries NVDIMM support and we ended up
> enabling it by default, as opposed to disabling it by default like x86. One 
> release
> later people complained that we weren't honoring 'nvdimm=off' to disable 
> NVDIMM
> support. The path of less pain that I chose was to at the very least disable
> the support when "nvdimm=off" was specified by the user.
> 
> 
> 
> 
> 
> > how about dropping 9/15 and just error out is it's not enabled, something 
> > like:
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index b7e0894019..d63f095bff 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -2668,6 +2668,7 @@ static void spapr_machine_init(MachineState *machine)
> >   char *filename;
> >   Error *resize_hpt_err = NULL;
> > +if (mc->nvdimm_supported) { // by default it is always enabled
> > +ms->nvdimms_state->is_enabled = true;
> > +}
> >   msi_nonbroken = true;
> >   
> >   QLIST_INIT(&spapr->phbs);
> > diff --git a/hw/ppc/spapr_nvdimm.c b/hw/ppc/spapr_nvdimm.c
> > index a833a63b5e..b11c85dbaa 100644
> > --- a/hw/ppc/spapr_nvdimm.c
> > +++ b/hw/ppc/spapr_nvdimm.c
> > @@ -57,7 +57,7 @@ bool spapr_nvdimm_validate(HotplugHandler *hotplug_dev, 
> > NVDIMMDevice *nvdimm,
> >* ensure that, if the user sets nvdimm=off, we error out
> >* regardless of being 5.1 or newer.
> >*/
> > -if (!ms->nvdimms_state->is_enabled && nvdimm_opt) {
> > +if (!ms->nvdimms_state->is_enabled) {
> >   error_setg(errp, "nvdimm device found but 'nvdimm=off' was set");
> >   return false;
> >   }
> > 
> > if I didn't miss something, that way we do not abuse nvdimm_opt and still
> > have nvdimm enabled by default and error out if it turns to disabled for 
> > whatever reason.  
> 
> 
> Just tried this out. This doesn't disable the NVDIMM support w

[Bug 1905979] Re: Check if F_OFD_SETLK is supported may give wrong result

2020-12-08 Thread Olaf Seibert

That is indeed the source and patches, but I wanted to follow their git
repo for easier maintenance. Surely they must have one.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1905979

Title:
  Check if F_OFD_SETLK is supported may give wrong result

Status in QEMU:
  Incomplete

Bug description:
  In util/osdep.c there is a function qemu_probe_lock_ops() to check if
  file locks F_OFD_SETLK and F_OFD_GETLK (of the style "Open file
  description locks (non-POSIX)") are supported.

  This test is done by trying a lock operation on the file /dev/null.

  This test can get a wrong result.

  The result is (probably) if the operating system *in general* supports
  these locks. However, it does not guarantee that the file system where
  the lock is really wanted (for instance, in caller
  raw_check_lock_bytes() in block/file-posix.c) does support these
  locks.

  (In theory it could even be that /dev/null, being a device special
  file, does not support the lock type while a plain file would.)

  This is in particular relevant for disk images which are stored on a
  shared file system (my particular use case is the Quobyte file system,
  which appears not to support these locks).

  The code as mentioned above is present in the master branch (I checked
  commit ea8208249d1082eae0444934efb3b59cd3183f05) but also for example
  on stable-2.11 commit 0982a56a551556c704dc15752dabf57b4be1c640)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1905979/+subscriptions

[PATCH] ppc/spapr: cleanup -machine pseries,nvdimm=X handling

2020-12-08 Thread Igor Mammedov

Since NVDIMM support was introduced on pseries machine,
it ignored machine's nvdimm=on|off option and effectively
was always enabled on machines that support NVDIMM.
Later on commit
  (28f5a716212 ppc/spapr_nvdimm: do not enable support with 'nvdimm=off')
makes QEMU error out in case user explicitly set 'nvdimm=off'
on CLI by peeking at machine_opts.

However that's a bit hacking and going away (in world where
machine configured over QMP) and it also leaves
  nvdimms_state->is_enabled
in inconsistent state (false) even when it should be set true
by default.

Instead of using on machine_opts, implement per machine
"nvdimm enabled" default handling and set default enabled value
at machine class init time (which is set to true for pseries)
and properly handle it generic machine code.
That way pseries will have, nvdimm enabled by default and
will honor user provided 'nvdimm=on|off'.

Signed-off-by: Igor Mammedov 
---
PS:
Patch should be applied on top of:
  [PATCH 08/15] machine: introduce MachineInitPhase
---
 include/hw/boards.h   |  1 +
 hw/core/machine.c |  1 +
 hw/ppc/spapr.c|  8 
 hw/ppc/spapr_nvdimm.c | 14 +-
 4 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/include/hw/boards.h b/include/hw/boards.h
index b9233af54a..1730f151aa 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -206,6 +206,7 @@ struct MachineClass {
 bool ignore_boot_device_suffixes;
 bool smbus_no_migration_support;
 bool nvdimm_supported;
+bool nvdimm_enabled_default;
 bool numa_mem_supported;
 bool auto_enable_numa;
 const char *default_ram_id;
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 2c0bc15143..12f04bed58 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -893,6 +893,7 @@ static void machine_initfn(Object *obj)
 Object *obj = OBJECT(ms);
 
 ms->nvdimms_state = g_new0(NVDIMMState, 1);
+ms->nvdimms_state->is_enabled = mc->nvdimm_enabled_default;
 object_property_add_bool(obj, "nvdimm",
  machine_get_nvdimm, machine_set_nvdimm);
 object_property_set_description(obj, "nvdimm",
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index b7e0894019..5a0cf79bbe 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -4413,6 +4413,14 @@ static void spapr_machine_class_init(ObjectClass *oc, 
void *data)
 mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power9_v2.0");
 mc->has_hotpluggable_cpus = true;
 mc->nvdimm_supported = true;
+/*
+ * NVDIMM support went live in 5.1 without considering that, in
+ * other archs, the user needs to enable NVDIMM support with the
+ * 'nvdimm' machine option and the default behavior is NVDIMM
+ * support disabled. It is too late to roll back to the standard
+ * behavior without breaking 5.1 guests.
+ */
+mc->nvdimm_enabled_default = true;
 smc->resize_hpt_default = SPAPR_RESIZE_HPT_ENABLED;
 fwc->get_dev_path = spapr_get_fw_dev_path;
 nc->nmi_monitor_handler = spapr_nmi;
diff --git a/hw/ppc/spapr_nvdimm.c b/hw/ppc/spapr_nvdimm.c
index a833a63b5e..66cd3dc13f 100644
--- a/hw/ppc/spapr_nvdimm.c
+++ b/hw/ppc/spapr_nvdimm.c
@@ -27,10 +27,8 @@
 #include "hw/ppc/spapr_nvdimm.h"
 #include "hw/mem/nvdimm.h"
 #include "qemu/nvdimm-utils.h"
-#include "qemu/option.h"
 #include "hw/ppc/fdt.h"
 #include "qemu/range.h"
-#include "sysemu/sysemu.h"
 #include "hw/ppc/spapr_numa.h"
 
 bool spapr_nvdimm_validate(HotplugHandler *hotplug_dev, NVDIMMDevice *nvdimm,
@@ -38,7 +36,6 @@ bool spapr_nvdimm_validate(HotplugHandler *hotplug_dev, 
NVDIMMDevice *nvdimm,
 {
 const MachineClass *mc = MACHINE_GET_CLASS(hotplug_dev);
 const MachineState *ms = MACHINE(hotplug_dev);
-const char *nvdimm_opt = qemu_opt_get(qemu_get_machine_opts(), "nvdimm");
 g_autofree char *uuidstr = NULL;
 QemuUUID uuid;
 int ret;
@@ -48,16 +45,7 @@ bool spapr_nvdimm_validate(HotplugHandler *hotplug_dev, 
NVDIMMDevice *nvdimm,
 return false;
 }
 
-/*
- * NVDIMM support went live in 5.1 without considering that, in
- * other archs, the user needs to enable NVDIMM support with the
- * 'nvdimm' machine option and the default behavior is NVDIMM
- * support disabled. It is too late to roll back to the standard
- * behavior without breaking 5.1 guests. What we can do is to
- * ensure that, if the user sets nvdimm=off, we error out
- * regardless of being 5.1 or newer.
- */
-if (!ms->nvdimms_state->is_enabled && nvdimm_opt) {
+if (!ms->nvdimms_state->is_enabled) {
 error_setg(errp, "nvdimm device found but 'nvdimm=off' was set");
 return false;
 }
-- 
2.27.0

Re: [PATCH 11/15] qtest: add a QOM object for qtest

2020-12-08 Thread Igor Mammedov

On Mon, 7 Dec 2020 18:22:55 +0100
Paolo Bonzini  wrote:

> Il lun 7 dic 2020, 17:57 Igor Mammedov  ha scritto:
> 
> > On Mon, 7 Dec 2020 17:43:16 +0100
> > Paolo Bonzini  wrote:
> >  
> > > On 07/12/20 17:24, Igor Mammedov wrote:  
> > > >> +void qtest_server_init(const char *qtest_chrdev, const char  
> > *qtest_log, Error **errp)  
> > > >> +{
> > > >> +Chardev *chr;
> > > >> +
> > > >> +chr = qemu_chr_new("qtest", qtest_chrdev, NULL);
> > > >> +
> > > >> +if (chr == NULL) {
> > > >> +error_setg(errp, "Failed to initialize device for qtest:  
> > \"%s\"",  
> > > >> +   qtest_chrdev);
> > > >> +return;
> > > >> +}
> > > >> +
> > > >> +qtest_server_start(chr, qtest_log, errp);  
> > > > why not create qtest object here instead of trying to preserve old way,
> > > > or create it directly at the place that calls qtest_server_init()?  
> > >
> > > Because I wasn't sure of where to put it in the QOM object tree.  So I
> > > punted and left it for later.  
> >
> > but you implicitly decided where it should be (with -object qtest),
> > it goes to /objects.
> > So I'd wouldn't put anywhere else to be consistent.
> >  
> 
> No, /objects is for stuff created with -object exclusively. I suppose I
> could have the "well-known path" be /machine/qtest, and it would be either
> a child (for -qtest) or a link to /objects/some-id (for -object qtest).
> Should I implement that (as a separate patch on top of this one)?
yes

> 
> Paolo
> 
> 
> 
> > >
> > > Paolo
> > >  
> >
> >

Re: [PATCH qemu v11] spapr: Implement Open Firmware client interface

2020-12-08 Thread BALATON Zoltan via


On Tue, 8 Dec 2020, Alexey Kardashevskiy wrote:

On 07/12/2020 22:48, BALATON Zoltan wrote:

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 2e89e36cfbdc..048bf49592aa 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -175,6 +175,13 @@ struct SpaprMachineState {
    long kernel_size;
    bool kernel_le;
    uint64_t kernel_addr;
+    bool vof; /* Virtual Open Firmware */
+    uint32_t rtas_base;
+    GArray *claimed; /* array of SpaprOfClaimed */
+    uint64_t claimed_base;
+    GHashTable *of_instances; /* ihandle -> SpaprOfInstance */
+    uint32_t of_instance_last;
+    char *bootargs;


Are these really state for vof so is it better to place them in a separate 
of_state struct instead of adding to the machine state? I'm not interested 
in spapr but interested in using vof as a replacement firmware for other 
machines so clear separation of what is spapr specific and what is vof 
specific would help me (and maybe also other reviewers to tell how much 
impact this really has on spapr which seems to be a concern of Greg).


This is a very good point, I'll separate VOF from the rest, may be even at 
QOM level. I was also thinking of adding a pseries-vof machine type but this 
is probably an overkill.


Out of curiosity - how are you going to use this VOF anyway, for what machine 
type?


https://osdn.net/projects/qmiga/wiki/SubprojectPegasos2

It is basically working now, can boot MorphOS (and also AmigaOS but that 
has no display driver for VGA so can't be seen such as some Linux versions 
for the machine too which have the same problem) but to be able to 
upstream it I'll need to clean it up and have some firmware to avoid 
needing a non-distributable ROM image. VOF might be the simplest way for 
this to just get the Amiga like bootloaders and Linux start which only 
need some CI fuctions.


Regards,
BALATON Zoltan

[PATCH] kvm: Take into account the unaligned section size when preparing bitmap

2020-12-08 Thread Zenghui Yu

The kernel KVM_CLEAR_DIRTY_LOG interface has align requirement on both the
start and the size of the given range of pages. We have been careful to
handle the unaligned cases when performing CLEAR on one slot. But it seems
that we forget to take the unaligned *size* case into account when
preparing bitmap for the interface, and we may end up clearing dirty status
for pages outside of [start, start + size).

If the size is unaligned, let's go through the slow path to manipulate a
temp bitmap for the interface so that we won't bother with those unaligned
bits at the end of bitmap.

I don't think this can happen in practice since the upper layer would
provide us with the alignment guarantee. I'm not sure if kvm-all could rely
on it. And this patch is mainly intended to address correctness of the
specific algorithm used inside kvm_log_clear_one_slot().

Signed-off-by: Zenghui Yu 
---
 accel/kvm/kvm-all.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index bed2455ca5..05d323ba1f 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -747,7 +747,7 @@ static int kvm_log_clear_one_slot(KVMSlot *mem, int as_id, 
uint64_t start,
 assert(bmap_start % BITS_PER_LONG == 0);
 /* We should never do log_clear before log_sync */
 assert(mem->dirty_bmap);
-if (start_delta) {
+if (start_delta || bmap_npages - size / psize) {
 /* Slow path - we need to manipulate a temp bitmap */
 bmap_clear = bitmap_new(bmap_npages);
 bitmap_copy_with_src_offset(bmap_clear, mem->dirty_bmap,
@@ -760,7 +760,10 @@ static int kvm_log_clear_one_slot(KVMSlot *mem, int as_id, 
uint64_t start,
 bitmap_clear(bmap_clear, 0, start_delta);
 d.dirty_bitmap = bmap_clear;
 } else {
-/* Fast path - start address aligns well with BITS_PER_LONG */
+/*
+ * Fast path - both start and size align well with BITS_PER_LONG
+ * (or the end of memory slot)
+ */
 d.dirty_bitmap = mem->dirty_bmap + BIT_WORD(bmap_start);
 }
 
-- 
2.19.1

Re: [PATCH v2] tests/acceptance: test hot(un)plug of ccw devices

2020-12-08 Thread Cornelia Huck

On Mon, 7 Dec 2020 19:40:36 +0100
Thomas Huth  wrote:

> On 07/12/2020 17.34, Thomas Huth wrote:
> > On 07/12/2020 17.30, Cornelia Huck wrote:  
> >> On Mon, 7 Dec 2020 15:28:47 +0100
> >> Thomas Huth  wrote:
> >>  
> >>> On 04/12/2020 13.14, Cornelia Huck wrote:  
>  Hotplug a virtio-net-ccw device, and then hotunplug it again.
> 
>  Signed-off-by: Cornelia Huck 
>  ---  
> >  
> [...]
>  +exec_command_and_wait_for_pattern(self, 'ls 
>  /sys/bus/ccw/devices/',
>  +  '0.0.4711')
>  +# and detach it again
>  +exec_command_and_wait_for_pattern(self, 'dmesg -c', ' ')
> >>>
> >>> If adapt my above change, you could also get rid of this dmesg -c here
> >>> (since it's done in the while loop already)  
> >>
> >> I don't think so (there are two CRWs posted, and the loop might have
> >> caught the first one only.)  
> > 
> > Oh, you're right. So let's better be safe than sorry and keep this dmesg 
> > -c.  
> 
> Ok, as we had to discover during some testing, it's a bad idea to only wait
> for ' ' after the 'dmesg -c' since it matches too early, so that the device
> gets added while the dmesg command is still running.
> 
> The following code is working for me instead:
> 
>  # add another device
>  exec_command_and_wait_for_pattern(self,
>  'dmesg -c > /dev/null; echo dm-clear\ 1', 'dm-clear 1')
>  self.vm.command('device_add', driver='virtio-net-ccw',
>  devno='fe.0.4711', id='net_4711')
>  exec_command_and_wait_for_pattern(self,
>  'while ! (dmesg -c | grep CRW) ; do sleep 1 ; done',
>  'CRW reports')
>  exec_command_and_wait_for_pattern(self, 'ls /sys/bus/ccw/devices/',
>'0.0.4711')
>  # and detach it again
>  exec_command_and_wait_for_pattern(self,
>  'dmesg -c > /dev/null; echo dm-clear\ 2', 'dm-clear 2')
>  self.vm.command('device_del', id='net_4711')
>  self.vm.event_wait(name='DEVICE_DELETED',
> match={'data': {'device': 'net_4711'}})
>  exec_command_and_wait_for_pattern(self,
>  'while ! (dmesg -c | grep CRW) ; do sleep 1 ; done',
>  'CRW reports')
>  exec_command_and_wait_for_pattern(self,
>'ls /sys/bus/ccw/devices/0.0.4711',
>'No such file or directory')

Thanks for tracking this down, this works for me as well. I'll send a
v3.

Re: [PATCH v12 11/19] multi-process: setup memory manager for remote device

2020-12-08 Thread Marc-André Lureau

On Wed, Dec 2, 2020 at 12:23 AM Jagannathan Raman 
wrote:

> SyncSysMemMsg message format is defined. It is used to send
> file descriptors of the RAM regions to remote device.
> RAM on the remote device is configured with a set of file descriptors.
> Old RAM regions are deleted and new regions, each with an fd, is
> added to the RAM.
>
> Signed-off-by: Jagannathan Raman 
> Signed-off-by: John G Johnson 
> Signed-off-by: Elena Ufimtseva 
> Reviewed-by: Stefan Hajnoczi 
> ---
>  include/hw/remote/memory.h  | 19 ++
>  include/hw/remote/mpqemu-link.h | 13 +
>  hw/remote/memory.c  | 58
> +
>  hw/remote/mpqemu-link.c | 11 
>  MAINTAINERS |  2 ++
>  hw/remote/meson.build   |  2 ++
>  6 files changed, 105 insertions(+)
>  create mode 100644 include/hw/remote/memory.h
>  create mode 100644 hw/remote/memory.c
>
> diff --git a/include/hw/remote/memory.h b/include/hw/remote/memory.h
> new file mode 100644
> index 000..4fd548e
> --- /dev/null
> +++ b/include/hw/remote/memory.h
> @@ -0,0 +1,19 @@
> +/*
> + * Memory manager for remote device
> + *
> + * Copyright © 2018, 2020 Oracle and/or its affiliates.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#ifndef REMOTE_MEMORY_H
> +#define REMOTE_MEMORY_H
> +
> +#include "exec/hwaddr.h"
> +#include "hw/remote/mpqemu-link.h"
> +
> +void remote_sysmem_reconfig(MPQemuMsg *msg, Error **errp);
> +
> +#endif
> diff --git a/include/hw/remote/mpqemu-link.h
> b/include/hw/remote/mpqemu-link.h
> index 2d79ff8..070ac77 100644
> --- a/include/hw/remote/mpqemu-link.h
> +++ b/include/hw/remote/mpqemu-link.h
> @@ -14,6 +14,7 @@
>  #include "qom/object.h"
>  #include "qemu/thread.h"
>  #include "io/channel.h"
> +#include "exec/hwaddr.h"
>
>  #define REMOTE_MAX_FDS 8
>
> @@ -24,12 +25,22 @@
>   *
>   * MPQemuCmd enum type to specify the command to be executed on the remote
>   * device.
> + *
> + * SYNC_SYSMEM  Shares QEMU's RAM with remote device's RAM
>

My understanding is that it's deliberately a private protocol between qemu
and remote host processes, so that it can break anytime. And in the future
it will be vfio-user based. Correct? It would be worth to leave a note
about it somewhere.

  */
>  typedef enum {
>  MPQEMU_CMD_INIT,
> +SYNC_SYSMEM,
> +RET_MSG,
>  MPQEMU_CMD_MAX,
>  } MPQemuCmd;
>
> +typedef struct {
> +hwaddr gpas[REMOTE_MAX_FDS];
> +uint64_t sizes[REMOTE_MAX_FDS];
> +off_t offsets[REMOTE_MAX_FDS];
> +} SyncSysmemMsg;
> +
>  /**
>   * MPQemuMsg:
>   * @cmd: The remote command
> @@ -40,12 +51,14 @@ typedef enum {
>   * MPQemuMsg Format of the message sent to the remote device from QEMU.
>   *
>   */
> +
>  typedef struct {
>  int cmd;
>  size_t size;
>
>  union {
>  uint64_t u64;
> +SyncSysmemMsg sync_sysmem;
>  } data;
>
>  int fds[REMOTE_MAX_FDS];
> diff --git a/hw/remote/memory.c b/hw/remote/memory.c
> new file mode 100644
> index 000..6d1e830
> --- /dev/null
> +++ b/hw/remote/memory.c
> @@ -0,0 +1,58 @@
> +/*
> + * Memory manager for remote device
> + *
> + * Copyright © 2018, 2020 Oracle and/or its affiliates.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu-common.h"
> +
> +#include "hw/remote/memory.h"
> +#include "exec/address-spaces.h"
> +#include "exec/ram_addr.h"
> +#include "qapi/error.h"
> +
> +void remote_sysmem_reconfig(MPQemuMsg *msg, Error **errp)
> +{
> +SyncSysmemMsg *sysmem_info = &msg->data.sync_sysmem;
> +MemoryRegion *sysmem, *subregion, *next;
> +static unsigned int suffix;
> +Error *local_err = NULL;
> +char *name;
> +int region;
> +
> +sysmem = get_system_memory();
> +
> +memory_region_transaction_begin();
>

It looks like this transaction business isn't really helping here. Both
del_subregion and add_subregion already handle it.

+
> +QTAILQ_FOREACH_SAFE(subregion, &sysmem->subregions, subregions_link,
> next) {
> +if (subregion->ram) {
> +memory_region_del_subregion(sysmem, subregion);
> +object_unparent(OBJECT(subregion));
> +}
> +}
> +
> +for (region = 0; region < msg->num_fds; region++) {
> +subregion = g_new(MemoryRegion, 1);
> +name = g_strdup_printf("remote-mem-%u", suffix++);
> +memory_region_init_ram_from_fd(subregion, NULL,
> +   name, sysmem_info->sizes[region],
> +   true, msg->fds[region],
> +   sysmem_info->offsets[region],
> +   &local_err);
> +g_free(name);
>

We are quite happily using g_auto these days


> +

[PATCH v2 2/9] console: allow con==NULL in dpy_{get, set}_ui_info and dpy_ui_info_supported

2020-12-08 Thread Gerd Hoffmann

Use active_console in that case like we do in many other places.

Signed-off-by: Gerd Hoffmann 
---
 ui/console.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/ui/console.c b/ui/console.c
index f995639e45f6..30e70be555db 100644
--- a/ui/console.c
+++ b/ui/console.c
@@ -1544,19 +1544,27 @@ static void dpy_set_ui_info_timer(void *opaque)
 
 bool dpy_ui_info_supported(QemuConsole *con)
 {
+if (con == NULL) {
+con = active_console;
+}
+
 return con->hw_ops->ui_info != NULL;
 }
 
 const QemuUIInfo *dpy_get_ui_info(const QemuConsole *con)
 {
-assert(con != NULL);
+if (con == NULL) {
+con = active_console;
+}
 
 return &con->ui_info;
 }
 
 int dpy_set_ui_info(QemuConsole *con, QemuUIInfo *info)
 {
-assert(con != NULL);
+if (con == NULL) {
+con = active_console;
+}
 
 if (!dpy_ui_info_supported(con)) {
 return -1;
-- 
2.27.0

[PATCH v2 4/9] vnc: drop unused copyrect feature

2020-12-08 Thread Gerd Hoffmann

vnc stopped using the copyrect pseudo encoding in 2017, in commit
50628d3479e4 ("cirrus/vnc: zap bitblit support from console code.")
So we can drop the now unused copyrect feature bit.

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Marc-André Lureau 
---
 ui/vnc.h | 2 --
 ui/vnc.c | 3 ---
 2 files changed, 5 deletions(-)

diff --git a/ui/vnc.h b/ui/vnc.h
index 262fcf179b44..a7fd38a82075 100644
--- a/ui/vnc.h
+++ b/ui/vnc.h
@@ -445,7 +445,6 @@ enum VncFeatures {
 VNC_FEATURE_WMVI,
 VNC_FEATURE_TIGHT,
 VNC_FEATURE_ZLIB,
-VNC_FEATURE_COPYRECT,
 VNC_FEATURE_RICH_CURSOR,
 VNC_FEATURE_TIGHT_PNG,
 VNC_FEATURE_ZRLE,
@@ -459,7 +458,6 @@ enum VncFeatures {
 #define VNC_FEATURE_WMVI_MASK(1 << VNC_FEATURE_WMVI)
 #define VNC_FEATURE_TIGHT_MASK   (1 << VNC_FEATURE_TIGHT)
 #define VNC_FEATURE_ZLIB_MASK(1 << VNC_FEATURE_ZLIB)
-#define VNC_FEATURE_COPYRECT_MASK(1 << VNC_FEATURE_COPYRECT)
 #define VNC_FEATURE_RICH_CURSOR_MASK (1 << VNC_FEATURE_RICH_CURSOR)
 #define VNC_FEATURE_TIGHT_PNG_MASK   (1 << VNC_FEATURE_TIGHT_PNG)
 #define VNC_FEATURE_ZRLE_MASK(1 << VNC_FEATURE_ZRLE)
diff --git a/ui/vnc.c b/ui/vnc.c
index 49235056f7a8..8c2771c1ce3b 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -2061,9 +2061,6 @@ static void set_encodings(VncState *vs, int32_t 
*encodings, size_t n_encodings)
 case VNC_ENCODING_RAW:
 vs->vnc_encoding = enc;
 break;
-case VNC_ENCODING_COPYRECT:
-vs->features |= VNC_FEATURE_COPYRECT_MASK;
-break;
 case VNC_ENCODING_HEXTILE:
 vs->features |= VNC_FEATURE_HEXTILE_MASK;
 vs->vnc_encoding = enc;
-- 
2.27.0

[PATCH v2 7/9] vnc: force initial resize message

2020-12-08 Thread Gerd Hoffmann

The vnc server should send desktop resize notifications unconditionally
on a new client connect, for feature negotiation reasons.  Add a bool
flag to vnc_desktop_resize() to force sending the message even in case
there is no size change.

Signed-off-by: Gerd Hoffmann 
---
 ui/vnc.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/ui/vnc.c b/ui/vnc.c
index 247e80d8f5c8..bdaf384f71a4 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -664,13 +664,14 @@ void vnc_framebuffer_update(VncState *vs, int x, int y, 
int w, int h,
 }
 
 
-static void vnc_desktop_resize(VncState *vs)
+static void vnc_desktop_resize(VncState *vs, bool force)
 {
 if (vs->ioc == NULL || !vnc_has_feature(vs, VNC_FEATURE_RESIZE)) {
 return;
 }
 if (vs->client_width == pixman_image_get_width(vs->vd->server) &&
-vs->client_height == pixman_image_get_height(vs->vd->server)) {
+vs->client_height == pixman_image_get_height(vs->vd->server) &&
+!force) {
 return;
 }
 
@@ -800,7 +801,7 @@ static void vnc_dpy_switch(DisplayChangeListener *dcl,
 
 QTAILQ_FOREACH(vs, &vd->clients, next) {
 vnc_colordepth(vs);
-vnc_desktop_resize(vs);
+vnc_desktop_resize(vs, false);
 if (vs->vd->cursor) {
 vnc_cursor_define(vs);
 }
@@ -2143,7 +2144,7 @@ static void set_encodings(VncState *vs, int32_t 
*encodings, size_t n_encodings)
 break;
 }
 }
-vnc_desktop_resize(vs);
+vnc_desktop_resize(vs, true);
 check_pointer_type_change(&vs->mouse_mode_notifier, NULL);
 vnc_led_state_change(vs);
 if (vs->vd->cursor) {
-- 
2.27.0

[PATCH v2 0/9] vnc: support some new extensions.

2020-12-08 Thread Gerd Hoffmann

The rfb standard keeps envolving.  While the official spec is kind of
frozen since a decade or so the community maintains an updated version
of the spec at:
https://github.com/rfbproto/rfbproto/

This series adds support for two new extensions from that spec: alpha
cursor and extended desktop resize.

alpha cursor allows a full alpha channel for the cursor image (unlike
the rich cursor extension which has only a bitmask for transparency).

extended desktop resize makes the desktop-resize work both ways, i.e. we
can not only signal guest display resolution changes to the vnc client
but also vnc client window size changes to the guest.

Tested with tigervnc.

gtk-vnc (and anything building on top of it like virt-viewer and
virt-manager) has no support for these extensions.

v2:
 - dropped qxl bits (will be a separate patch series).
 - use error codes for desktop resize responses.
 - little tweaks here and there.
 - pick up some review tags.

Gerd Hoffmann (9):
  console: drop qemu_console_get_ui_info
  console: allow con==NULL in dpy_{get,set}_ui_info and
dpy_ui_info_supported
  vnc: use enum for features
  vnc: drop unused copyrect feature
  vnc: add pseudo encodings
  vnc: add alpha cursor support
  vnc: force initial resize message
  vnc: add support for extended desktop resize
  vnc: move check into vnc_cursor_define

 include/ui/console.h |   1 -
 ui/vnc.h |  32 --
 ui/console.c |  18 
 ui/vnc.c | 103 +--
 4 files changed, 118 insertions(+), 36 deletions(-)

-- 
2.27.0

[PATCH v2 6/9] vnc: add alpha cursor support

2020-12-08 Thread Gerd Hoffmann

There is a new vnc extension for cursors with an alpha channel.  Use
it if supported by the vnc client, prefer it over the "rich cursor"
extension which supports only a bitmask for transparency.

This is a visible improvement especially on modern desktops which
actually use the alpha channel when defining cursors.

https://github.com/rfbproto/rfbproto/blob/master/rfbproto.rst#cursor-with-alpha-pseudo-encoding

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Marc-André Lureau 
---
 ui/vnc.h |  2 ++
 ui/vnc.c | 21 ++---
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/ui/vnc.h b/ui/vnc.h
index 6f5006da3593..c8d3ad9ec496 100644
--- a/ui/vnc.h
+++ b/ui/vnc.h
@@ -448,6 +448,7 @@ enum VncFeatures {
 VNC_FEATURE_TIGHT,
 VNC_FEATURE_ZLIB,
 VNC_FEATURE_RICH_CURSOR,
+VNC_FEATURE_ALPHA_CURSOR,
 VNC_FEATURE_TIGHT_PNG,
 VNC_FEATURE_ZRLE,
 VNC_FEATURE_ZYWRLE,
@@ -461,6 +462,7 @@ enum VncFeatures {
 #define VNC_FEATURE_TIGHT_MASK   (1 << VNC_FEATURE_TIGHT)
 #define VNC_FEATURE_ZLIB_MASK(1 << VNC_FEATURE_ZLIB)
 #define VNC_FEATURE_RICH_CURSOR_MASK (1 << VNC_FEATURE_RICH_CURSOR)
+#define VNC_FEATURE_ALPHA_CURSOR_MASK(1 << VNC_FEATURE_ALPHA_CURSOR)
 #define VNC_FEATURE_TIGHT_PNG_MASK   (1 << VNC_FEATURE_TIGHT_PNG)
 #define VNC_FEATURE_ZRLE_MASK(1 << VNC_FEATURE_ZRLE)
 #define VNC_FEATURE_ZYWRLE_MASK  (1 << VNC_FEATURE_ZYWRLE)
diff --git a/ui/vnc.c b/ui/vnc.c
index 8c2771c1ce3b..247e80d8f5c8 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -937,6 +937,18 @@ static int vnc_cursor_define(VncState *vs)
 QEMUCursor *c = vs->vd->cursor;
 int isize;
 
+if (vnc_has_feature(vs, VNC_FEATURE_ALPHA_CURSOR)) {
+vnc_lock_output(vs);
+vnc_write_u8(vs,  VNC_MSG_SERVER_FRAMEBUFFER_UPDATE);
+vnc_write_u8(vs,  0);  /*  padding */
+vnc_write_u16(vs, 1);  /*  # of rects  */
+vnc_framebuffer_update(vs, c->hot_x, c->hot_y, c->width, c->height,
+   VNC_ENCODING_ALPHA_CURSOR);
+vnc_write_s32(vs, VNC_ENCODING_RAW);
+vnc_write(vs, c->data, c->width * c->height * 4);
+vnc_unlock_output(vs);
+return 0;
+}
 if (vnc_has_feature(vs, VNC_FEATURE_RICH_CURSOR)) {
 vnc_lock_output(vs);
 vnc_write_u8(vs,  VNC_MSG_SERVER_FRAMEBUFFER_UPDATE);
@@ -2102,9 +2114,9 @@ static void set_encodings(VncState *vs, int32_t 
*encodings, size_t n_encodings)
 break;
 case VNC_ENCODING_RICH_CURSOR:
 vs->features |= VNC_FEATURE_RICH_CURSOR_MASK;
-if (vs->vd->cursor) {
-vnc_cursor_define(vs);
-}
+break;
+case VNC_ENCODING_ALPHA_CURSOR:
+vs->features |= VNC_FEATURE_ALPHA_CURSOR_MASK;
 break;
 case VNC_ENCODING_EXT_KEY_EVENT:
 send_ext_key_event_ack(vs);
@@ -2134,6 +2146,9 @@ static void set_encodings(VncState *vs, int32_t 
*encodings, size_t n_encodings)
 vnc_desktop_resize(vs);
 check_pointer_type_change(&vs->mouse_mode_notifier, NULL);
 vnc_led_state_change(vs);
+if (vs->vd->cursor) {
+vnc_cursor_define(vs);
+}
 }
 
 static void set_pixel_conversion(VncState *vs)
-- 
2.27.0

[PATCH v2 1/9] console: drop qemu_console_get_ui_info

2020-12-08 Thread Gerd Hoffmann

Unused and duplicate (there is dpy_get_ui_info).

Signed-off-by: Gerd Hoffmann 
---
 include/ui/console.h | 1 -
 ui/console.c | 6 --
 2 files changed, 7 deletions(-)

diff --git a/include/ui/console.h b/include/ui/console.h
index e7303d8b98a8..5dd21976a376 100644
--- a/include/ui/console.h
+++ b/include/ui/console.h
@@ -390,7 +390,6 @@ bool qemu_console_is_gl_blocked(QemuConsole *con);
 char *qemu_console_get_label(QemuConsole *con);
 int qemu_console_get_index(QemuConsole *con);
 uint32_t qemu_console_get_head(QemuConsole *con);
-QemuUIInfo *qemu_console_get_ui_info(QemuConsole *con);
 int qemu_console_get_width(QemuConsole *con, int fallback);
 int qemu_console_get_height(QemuConsole *con, int fallback);
 /* Return the low-level window id for the console */
diff --git a/ui/console.c b/ui/console.c
index 53dee8e26b17..f995639e45f6 100644
--- a/ui/console.c
+++ b/ui/console.c
@@ -2122,12 +2122,6 @@ uint32_t qemu_console_get_head(QemuConsole *con)
 return con ? con->head : -1;
 }
 
-QemuUIInfo *qemu_console_get_ui_info(QemuConsole *con)
-{
-assert(con != NULL);
-return &con->ui_info;
-}
-
 int qemu_console_get_width(QemuConsole *con, int fallback)
 {
 if (con == NULL) {
-- 
2.27.0

[PATCH v2 8/9] vnc: add support for extended desktop resize

2020-12-08 Thread Gerd Hoffmann

The extended desktop resize encoding adds support for (a) clients
sending resize requests to the server, and (b) multihead support.

This patch implements (a).  All resize requests are rejected by qemu.
Qemu can't resize the framebuffer on its own, this is in the hands of
the guest, so all qemu can do is forward the request to the guest.
Should the guest actually resize the framebuffer we can notify the vnc
client later with a separate message.

This requires support in the display device.  Works with virtio-gpu.

https://github.com/rfbproto/rfbproto/blob/master/rfbproto.rst#extendeddesktopsize-pseudo-encoding

Signed-off-by: Gerd Hoffmann 
---
 ui/vnc.h |  2 ++
 ui/vnc.c | 64 +++-
 2 files changed, 65 insertions(+), 1 deletion(-)

diff --git a/ui/vnc.h b/ui/vnc.h
index c8d3ad9ec496..77a310947bd6 100644
--- a/ui/vnc.h
+++ b/ui/vnc.h
@@ -442,6 +442,7 @@ enum {
 
 enum VncFeatures {
 VNC_FEATURE_RESIZE,
+VNC_FEATURE_RESIZE_EXT,
 VNC_FEATURE_HEXTILE,
 VNC_FEATURE_POINTER_TYPE_CHANGE,
 VNC_FEATURE_WMVI,
@@ -456,6 +457,7 @@ enum VncFeatures {
 };
 
 #define VNC_FEATURE_RESIZE_MASK  (1 << VNC_FEATURE_RESIZE)
+#define VNC_FEATURE_RESIZE_EXT_MASK  (1 << VNC_FEATURE_RESIZE_EXT)
 #define VNC_FEATURE_HEXTILE_MASK (1 << VNC_FEATURE_HEXTILE)
 #define VNC_FEATURE_POINTER_TYPE_CHANGE_MASK (1 << 
VNC_FEATURE_POINTER_TYPE_CHANGE)
 #define VNC_FEATURE_WMVI_MASK(1 << VNC_FEATURE_WMVI)
diff --git a/ui/vnc.c b/ui/vnc.c
index bdaf384f71a4..094ba6cd14cb 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -663,10 +663,35 @@ void vnc_framebuffer_update(VncState *vs, int x, int y, 
int w, int h,
 vnc_write_s32(vs, encoding);
 }
 
+static void vnc_desktop_resize_ext(VncState *vs, int reject_reason)
+{
+vnc_lock_output(vs);
+vnc_write_u8(vs, VNC_MSG_SERVER_FRAMEBUFFER_UPDATE);
+vnc_write_u8(vs, 0);
+vnc_write_u16(vs, 1); /* number of rects */
+vnc_framebuffer_update(vs,
+   reject_reason ? 1 : 0,
+   reject_reason,
+   vs->client_width, vs->client_height,
+   VNC_ENCODING_DESKTOP_RESIZE_EXT);
+vnc_write_u8(vs, 1);  /* number of screens */
+vnc_write_u8(vs, 0);  /* padding */
+vnc_write_u8(vs, 0);  /* padding */
+vnc_write_u8(vs, 0);  /* padding */
+vnc_write_u32(vs, 0); /* screen id */
+vnc_write_u16(vs, 0); /* screen x-pos */
+vnc_write_u16(vs, 0); /* screen y-pos */
+vnc_write_u16(vs, vs->client_width);
+vnc_write_u16(vs, vs->client_height);
+vnc_write_u32(vs, 0); /* screen flags */
+vnc_unlock_output(vs);
+vnc_flush(vs);
+}
 
 static void vnc_desktop_resize(VncState *vs, bool force)
 {
-if (vs->ioc == NULL || !vnc_has_feature(vs, VNC_FEATURE_RESIZE)) {
+if (vs->ioc == NULL || (!vnc_has_feature(vs, VNC_FEATURE_RESIZE) &&
+!vnc_has_feature(vs, VNC_FEATURE_RESIZE_EXT))) {
 return;
 }
 if (vs->client_width == pixman_image_get_width(vs->vd->server) &&
@@ -681,6 +706,12 @@ static void vnc_desktop_resize(VncState *vs, bool force)
pixman_image_get_height(vs->vd->server) >= 0);
 vs->client_width = pixman_image_get_width(vs->vd->server);
 vs->client_height = pixman_image_get_height(vs->vd->server);
+
+if (vnc_has_feature(vs, VNC_FEATURE_RESIZE_EXT)) {
+vnc_desktop_resize_ext(vs, 0);
+return;
+}
+
 vnc_lock_output(vs);
 vnc_write_u8(vs, VNC_MSG_SERVER_FRAMEBUFFER_UPDATE);
 vnc_write_u8(vs, 0);
@@ -2110,6 +2141,9 @@ static void set_encodings(VncState *vs, int32_t 
*encodings, size_t n_encodings)
 case VNC_ENCODING_DESKTOPRESIZE:
 vs->features |= VNC_FEATURE_RESIZE_MASK;
 break;
+case VNC_ENCODING_DESKTOP_RESIZE_EXT:
+vs->features |= VNC_FEATURE_RESIZE_EXT_MASK;
+break;
 case VNC_ENCODING_POINTER_TYPE_CHANGE:
 vs->features |= VNC_FEATURE_POINTER_TYPE_CHANGE_MASK;
 break;
@@ -2431,6 +2465,34 @@ static int protocol_client_msg(VncState *vs, uint8_t 
*data, size_t len)
 break;
 }
 break;
+case VNC_MSG_CLIENT_SET_DESKTOP_SIZE:
+{
+size_t size;
+uint8_t screens;
+
+if (len < 8) {
+return 8;
+}
+
+screens = read_u8(data, 6);
+size= 8 + screens * 16;
+if (len < size) {
+return size;
+}
+
+if (dpy_ui_info_supported(vs->vd->dcl.con)) {
+QemuUIInfo info;
+memset(&info, 0, sizeof(info));
+info.width = read_u16(data, 2);
+info.height = read_u16(data, 4);
+dpy_set_ui_info(vs->vd->dcl.con, &info);
+vnc_desktop_resize_ext(vs, 4 /* Request forwarded */);
+} else {
+vnc_desktop_resize_ext(vs, 3 /* Invalid screen layout */);
+}
+
+break;
+}
 de

[PATCH v2 3/9] vnc: use enum for features

2020-12-08 Thread Gerd Hoffmann

Use an enum for the vnc feature bits.  That way they are enumerated
automatically and we don't have to do that manually when adding or
removing features.

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Marc-André Lureau 
---
 ui/vnc.h | 26 ++
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/ui/vnc.h b/ui/vnc.h
index 4e2637ce6c5c..262fcf179b44 100644
--- a/ui/vnc.h
+++ b/ui/vnc.h
@@ -438,18 +438,20 @@ enum {
  *
  */
 
-#define VNC_FEATURE_RESIZE   0
-#define VNC_FEATURE_HEXTILE  1
-#define VNC_FEATURE_POINTER_TYPE_CHANGE  2
-#define VNC_FEATURE_WMVI 3
-#define VNC_FEATURE_TIGHT4
-#define VNC_FEATURE_ZLIB 5
-#define VNC_FEATURE_COPYRECT 6
-#define VNC_FEATURE_RICH_CURSOR  7
-#define VNC_FEATURE_TIGHT_PNG8
-#define VNC_FEATURE_ZRLE 9
-#define VNC_FEATURE_ZYWRLE  10
-#define VNC_FEATURE_LED_STATE   11
+enum VncFeatures {
+VNC_FEATURE_RESIZE,
+VNC_FEATURE_HEXTILE,
+VNC_FEATURE_POINTER_TYPE_CHANGE,
+VNC_FEATURE_WMVI,
+VNC_FEATURE_TIGHT,
+VNC_FEATURE_ZLIB,
+VNC_FEATURE_COPYRECT,
+VNC_FEATURE_RICH_CURSOR,
+VNC_FEATURE_TIGHT_PNG,
+VNC_FEATURE_ZRLE,
+VNC_FEATURE_ZYWRLE,
+VNC_FEATURE_LED_STATE,
+};
 
 #define VNC_FEATURE_RESIZE_MASK  (1 << VNC_FEATURE_RESIZE)
 #define VNC_FEATURE_HEXTILE_MASK (1 << VNC_FEATURE_HEXTILE)
-- 
2.27.0

[PATCH v2 9/9] vnc: move check into vnc_cursor_define

2020-12-08 Thread Gerd Hoffmann

Move the check whenever a cursor exists into the vnc_cursor_define()
function so callers don't have to do it.

Suggested-by: Marc-André Lureau 
Signed-off-by: Gerd Hoffmann 
---
 ui/vnc.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/ui/vnc.c b/ui/vnc.c
index 094ba6cd14cb..1f7dc545db7c 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -833,9 +833,7 @@ static void vnc_dpy_switch(DisplayChangeListener *dcl,
 QTAILQ_FOREACH(vs, &vd->clients, next) {
 vnc_colordepth(vs);
 vnc_desktop_resize(vs, false);
-if (vs->vd->cursor) {
-vnc_cursor_define(vs);
-}
+vnc_cursor_define(vs);
 memset(vs->dirty, 0x00, sizeof(vs->dirty));
 vnc_set_area_dirty(vs->dirty, vd, 0, 0,
vnc_width(vd),
@@ -969,6 +967,10 @@ static int vnc_cursor_define(VncState *vs)
 QEMUCursor *c = vs->vd->cursor;
 int isize;
 
+if (!vs->vd->cursor) {
+return -1;
+}
+
 if (vnc_has_feature(vs, VNC_FEATURE_ALPHA_CURSOR)) {
 vnc_lock_output(vs);
 vnc_write_u8(vs,  VNC_MSG_SERVER_FRAMEBUFFER_UPDATE);
@@ -2181,9 +2183,7 @@ static void set_encodings(VncState *vs, int32_t 
*encodings, size_t n_encodings)
 vnc_desktop_resize(vs, true);
 check_pointer_type_change(&vs->mouse_mode_notifier, NULL);
 vnc_led_state_change(vs);
-if (vs->vd->cursor) {
-vnc_cursor_define(vs);
-}
+vnc_cursor_define(vs);
 }
 
 static void set_pixel_conversion(VncState *vs)
-- 
2.27.0

[PATCH v2 5/9] vnc: add pseudo encodings

2020-12-08 Thread Gerd Hoffmann

Add #defines for two new pseudo encodings:
 * cursor with alpha channel.
 * extended desktop resize.

https://github.com/rfbproto/rfbproto/blob/master/rfbproto.rst#pseudo-encodings

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Marc-André Lureau 
---
 ui/vnc.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/ui/vnc.h b/ui/vnc.h
index a7fd38a82075..6f5006da3593 100644
--- a/ui/vnc.h
+++ b/ui/vnc.h
@@ -411,6 +411,8 @@ enum {
 #define VNC_ENCODING_AUDIO0XFEFD /* -259 */
 #define VNC_ENCODING_TIGHT_PNG0xFEFC /* -260 */
 #define VNC_ENCODING_LED_STATE0XFEFB /* -261 */
+#define VNC_ENCODING_DESKTOP_RESIZE_EXT   0XFECC /* -308 */
+#define VNC_ENCODING_ALPHA_CURSOR 0XFEC6 /* -314 */
 #define VNC_ENCODING_WMVi 0x574D5669
 
 /*
-- 
2.27.0

Re: [PATCH v12 11/19] multi-process: setup memory manager for remote device

2020-12-08 Thread Marc-André Lureau

On Wed, Dec 2, 2020 at 12:23 AM Jagannathan Raman 
wrote:

> SyncSysMemMsg message format is defined. It is used to send
> file descriptors of the RAM regions to remote device.
> RAM on the remote device is configured with a set of file descriptors.
> Old RAM regions are deleted and new regions, each with an fd, is
> added to the RAM.
>
> Signed-off-by: Jagannathan Raman 
> Signed-off-by: John G Johnson 
> Signed-off-by: Elena Ufimtseva 
> Reviewed-by: Stefan Hajnoczi 
> ---
>  include/hw/remote/memory.h  | 19 ++
>  include/hw/remote/mpqemu-link.h | 13 +
>  hw/remote/memory.c  | 58
> +
>  hw/remote/mpqemu-link.c | 11 
>  MAINTAINERS |  2 ++
>  hw/remote/meson.build   |  2 ++
>  6 files changed, 105 insertions(+)
>  create mode 100644 include/hw/remote/memory.h
>  create mode 100644 hw/remote/memory.c
>
> diff --git a/include/hw/remote/memory.h b/include/hw/remote/memory.h
> new file mode 100644
> index 000..4fd548e
> --- /dev/null
> +++ b/include/hw/remote/memory.h
> @@ -0,0 +1,19 @@
> +/*
> + * Memory manager for remote device
> + *
> + * Copyright © 2018, 2020 Oracle and/or its affiliates.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#ifndef REMOTE_MEMORY_H
> +#define REMOTE_MEMORY_H
> +
> +#include "exec/hwaddr.h"
> +#include "hw/remote/mpqemu-link.h"
> +
> +void remote_sysmem_reconfig(MPQemuMsg *msg, Error **errp);
> +
> +#endif
> diff --git a/include/hw/remote/mpqemu-link.h
> b/include/hw/remote/mpqemu-link.h
> index 2d79ff8..070ac77 100644
> --- a/include/hw/remote/mpqemu-link.h
> +++ b/include/hw/remote/mpqemu-link.h
> @@ -14,6 +14,7 @@
>  #include "qom/object.h"
>  #include "qemu/thread.h"
>  #include "io/channel.h"
> +#include "exec/hwaddr.h"
>
>  #define REMOTE_MAX_FDS 8
>
> @@ -24,12 +25,22 @@
>   *
>   * MPQemuCmd enum type to specify the command to be executed on the remote
>   * device.
> + *
> + * SYNC_SYSMEM  Shares QEMU's RAM with remote device's RAM
>   */
>  typedef enum {
>  MPQEMU_CMD_INIT,
> +SYNC_SYSMEM,
> +RET_MSG,
>

RET_MSG is not used here, but later. ok.

 MPQEMU_CMD_MAX,
>  } MPQemuCmd;
>
> +typedef struct {
> +hwaddr gpas[REMOTE_MAX_FDS];
> +uint64_t sizes[REMOTE_MAX_FDS];
> +off_t offsets[REMOTE_MAX_FDS];
> +} SyncSysmemMsg;
> +
>  /**
>   * MPQemuMsg:
>   * @cmd: The remote command
> @@ -40,12 +51,14 @@ typedef enum {
>   * MPQemuMsg Format of the message sent to the remote device from QEMU.
>   *
>   */
> +
>  typedef struct {
>  int cmd;
>  size_t size;
>
>  union {
>  uint64_t u64;
> +SyncSysmemMsg sync_sysmem;
>  } data;
>
>  int fds[REMOTE_MAX_FDS];
> diff --git a/hw/remote/memory.c b/hw/remote/memory.c
> new file mode 100644
> index 000..6d1e830
> --- /dev/null
> +++ b/hw/remote/memory.c
> @@ -0,0 +1,58 @@
> +/*
> + * Memory manager for remote device
> + *
> + * Copyright © 2018, 2020 Oracle and/or its affiliates.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu-common.h"
> +
> +#include "hw/remote/memory.h"
> +#include "exec/address-spaces.h"
> +#include "exec/ram_addr.h"
> +#include "qapi/error.h"
> +
> +void remote_sysmem_reconfig(MPQemuMsg *msg, Error **errp)
> +{
> +SyncSysmemMsg *sysmem_info = &msg->data.sync_sysmem;
> +MemoryRegion *sysmem, *subregion, *next;
> +static unsigned int suffix;
> +Error *local_err = NULL;
> +char *name;
> +int region;
> +
> +sysmem = get_system_memory();
> +
> +memory_region_transaction_begin();
> +
> +QTAILQ_FOREACH_SAFE(subregion, &sysmem->subregions, subregions_link,
> next) {
> +if (subregion->ram) {
> +memory_region_del_subregion(sysmem, subregion);
> +object_unparent(OBJECT(subregion));
> +}
> +}
> +
> +for (region = 0; region < msg->num_fds; region++) {
> +subregion = g_new(MemoryRegion, 1);
> +name = g_strdup_printf("remote-mem-%u", suffix++);
> +memory_region_init_ram_from_fd(subregion, NULL,
> +   name, sysmem_info->sizes[region],
> +   true, msg->fds[region],
> +   sysmem_info->offsets[region],
> +   &local_err);
> +g_free(name);
> +if (local_err) {
> +error_propagate(errp, local_err);
> +break;
> +}
> +
> +memory_region_add_subregion(sysmem, sysmem_info->gpas[region],
> +subregion);
> +}
> +
> +memory_region_transaction_commit();
> +}
> diff --git a/hw/remote/mpqemu-link.c b/hw/remote/mpqemu-link.c
> index e535ed2..

[PATCH] hw/input: expand trace info reported for ps2 device

2020-12-08 Thread Daniel P . Berrangé

It is interesting to know if the PS2 keyboard is in translated mode, and
which of the three scancode sets are in use.

Signed-off-by: Daniel P. Berrangé 
---
 hw/input/ps2.c| 2 +-
 hw/input/trace-events | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/input/ps2.c b/hw/input/ps2.c
index 72cdb80ae1..237956aca2 100644
--- a/hw/input/ps2.c
+++ b/hw/input/ps2.c
@@ -293,7 +293,7 @@ static void ps2_keyboard_event(DeviceState *dev, 
QemuConsole *src,
 qcode = qemu_input_key_value_to_qcode(key->key);
 
 mod = ps2_modifier_bit(qcode);
-trace_ps2_keyboard_event(s, qcode, key->down, mod, s->modifiers);
+trace_ps2_keyboard_event(s, qcode, key->down, mod, s->modifiers, 
s->scancode_set, s->translate);
 if (key->down) {
 s->modifiers |= mod;
 } else {
diff --git a/hw/input/trace-events b/hw/input/trace-events
index 1dd8ad6018..49c70d544e 100644
--- a/hw/input/trace-events
+++ b/hw/input/trace-events
@@ -30,7 +30,7 @@ pckbd_kbd_write_data(uint64_t val) "0x%02"PRIx64
 
 # ps2.c
 ps2_put_keycode(void *opaque, int keycode) "%p keycode 0x%02x"
-ps2_keyboard_event(void *opaque, int qcode, int down, unsigned int modifier, 
unsigned int modifiers) "%p qcode %d down %d modifier 0x%x modifiers 0x%x"
+ps2_keyboard_event(void *opaque, int qcode, int down, unsigned int modifier, 
unsigned int modifiers, int set, int xlate) "%p qcode %d down %d modifier 0x%x 
modifiers 0x%x set %d xlate %d"
 ps2_read_data(void *opaque) "%p"
 ps2_set_ledstate(void *s, int ledstate) "%p ledstate %d"
 ps2_reset_keyboard(void *s) "%p"
-- 
2.28.0

Re: [PATCH v12 10/19] multi-process: Associate fd of a PCIDevice with its object

2020-12-08 Thread Marc-André Lureau

On Mon, Dec 7, 2020 at 6:03 PM Marc-André Lureau 
wrote:

> Hi
>
> On Wed, Dec 2, 2020 at 12:25 AM Jagannathan Raman 
> wrote:
>
>> Associate the file descriptor for a PCIDevice in remote process with
>> DeviceState object.
>>
>> Signed-off-by: Elena Ufimtseva 
>> Signed-off-by: John G Johnson 
>> Signed-off-by: Jagannathan Raman 
>> Reviewed-by: Stefan Hajnoczi 
>> ---
>>  include/hw/remote/remote-obj.h |  42 +++
>>  hw/remote/message.c|   1 +
>>  hw/remote/remote-obj.c | 154
>> +
>>  MAINTAINERS|   2 +
>>  hw/remote/meson.build  |   1 +
>>  5 files changed, 200 insertions(+)
>>  create mode 100644 include/hw/remote/remote-obj.h
>>  create mode 100644 hw/remote/remote-obj.c
>>
>> diff --git a/include/hw/remote/remote-obj.h
>> b/include/hw/remote/remote-obj.h
>> new file mode 100644
>> index 000..0e95813
>> --- /dev/null
>> +++ b/include/hw/remote/remote-obj.h
>> @@ -0,0 +1,42 @@
>> +/*
>> + * Copyright © 2020 Oracle and/or its affiliates.
>> + *
>> + * This work is licensed under the terms of the GNU GPL-v2, version 2 or
>> later.
>> + *
>> + * See the COPYING file in the top-level directory.
>> + *
>> + */
>> +
>> +#ifndef REMOTE_OBJECT_H
>> +#define REMOTE_OBJECT_H
>> +
>> +#include "io/channel.h"
>> +#include "qemu/notify.h"
>> +
>> +#define TYPE_REMOTE_OBJECT "x-remote-object"
>> +#define REMOTE_OBJECT(obj) \
>> +OBJECT_CHECK(RemoteObject, (obj), TYPE_REMOTE_OBJECT)
>> +#define REMOTE_OBJECT_GET_CLASS(obj) \
>> +OBJECT_GET_CLASS(RemoteObjectClass, (obj), TYPE_REMOTE_OBJECT)
>> +#define REMOTE_OBJECT_CLASS(klass) \
>> +OBJECT_CLASS_CHECK(RemoteObjectClass, (klass), TYPE_REMOTE_OBJECT)
>> +
>> +typedef struct {
>> +ObjectClass parent_class;
>> +
>> +unsigned int nr_devs;
>> +unsigned int max_devs;
>> +} RemoteObjectClass;
>> +
>> +typedef struct {
>> +/* private */
>> +Object parent;
>> +
>> +Notifier machine_done;
>> +
>> +int fd;
>> +char *devid;
>> +QIOChannel *ioc;
>> +} RemoteObject;
>> +
>> +#endif
>>
>
> There is no need for a header if the header isn't actually shared with
> various .c units. In this series, you can just declare those structs in
> remote-obj.c
>
> diff --git a/hw/remote/message.c b/hw/remote/message.c
>> index 5d87bf4..1f2edc7 100644
>> --- a/hw/remote/message.c
>> +++ b/hw/remote/message.c
>> @@ -56,6 +56,7 @@ void coroutine_fn mpqemu_remote_msg_loop_co(void *data)
>>  }
>>  }
>>  qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN);
>> +g_free(com);
>>
>
>
> Should be squashed with the previous patch
>
>
>>  return;
>>  }
>> diff --git a/hw/remote/remote-obj.c b/hw/remote/remote-obj.c
>> new file mode 100644
>> index 000..3b4c0d4
>> --- /dev/null
>> +++ b/hw/remote/remote-obj.c
>> @@ -0,0 +1,154 @@
>> +/*
>> + * Copyright © 2020 Oracle and/or its affiliates.
>> + *
>> + * This work is licensed under the terms of the GNU GPL-v2, version 2 or
>> later.
>> + *
>> + * See the COPYING file in the top-level directory.
>> + *
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "qemu-common.h"
>> +
>> +#include "hw/remote/remote-obj.h"
>> +#include "qemu/error-report.h"
>> +#include "qom/object_interfaces.h"
>> +#include "hw/qdev-core.h"
>> +#include "io/channel.h"
>> +#include "hw/qdev-core.h"
>> +#include "hw/remote/machine.h"
>> +#include "io/channel-util.h"
>> +#include "qapi/error.h"
>> +#include "sysemu/sysemu.h"
>> +#include "hw/pci/pci.h"
>> +
>> +static void remote_object_set_fd(Object *obj, const char *str, Error
>> **errp)
>> +{
>> +RemoteObject *o = REMOTE_OBJECT(obj);
>> +
>> +o->fd = atoi(str);
>>
>
>  qemu_strtoi() has better error handling semantics. You may also want to
> check it's a valid socket fd with fd_is_socket().
>
> Alternatively, you could use qemu_open() which allows to open from QMP
> fdset ("/dev/fdset/..."). This should be more flexible.
>


A better alternative is qemu_parse_fd().

In some later patch, you use monitor_fd_param(monitor_cur(), ...) for
x-pci-proxy-dev "fd" property.

That might be the right API, for consistency, use it here too?


-- 
Marc-André Lureau

[PATCH 3/5] target/arm: add descriptions of CLIDR_EL1, CCSIDR_EL1, CTR_EL0 to cpu.h

2020-12-08 Thread Leif Lindholm

Signed-off-by: Leif Lindholm 
---
 target/arm/cpu.h | 24 
 1 file changed, 24 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index b54d1dc092..5e9e8061f7 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1713,6 +1713,30 @@ FIELD(V7M_FPCCR, ASPEN, 31, 1)
 /*
  * System register ID fields.
  */
+FIELD(CLIDR_EL1, CTYPE1, 0, 3)
+FIELD(CLIDR_EL1, CTYPE2, 3, 3)
+FIELD(CLIDR_EL1, CTYPE3, 6, 3)
+FIELD(CLIDR_EL1, CTYPE4, 9, 3)
+FIELD(CLIDR_EL1, CTYPE5, 12, 3)
+FIELD(CLIDR_EL1, CTYPE6, 15, 3)
+FIELD(CLIDR_EL1, CTYPE7, 18, 3)
+FIELD(CLIDR_EL1, LOUIS, 21, 3)
+FIELD(CLIDR_EL1, LOC, 24, 3)
+FIELD(CLIDR_EL1, LOUU, 27, 3)
+FIELD(CLIDR_EL1, ICB, 30, 3)
+
+FIELD(CCSIDR_EL1, LINESIZE, 0, 3)
+FIELD(CCSIDR_EL1, ASSOCIATIVITY, 3, 20)
+FIELD(CCSIDR_EL1, NUMSETS, 32, 23)
+
+FIELD(CTR_EL0,  IMINLINE, 0, 4)
+FIELD(CTR_EL0,  L1IP, 14, 2)
+FIELD(CTR_EL0,  DMINLINE, 16, 4)
+FIELD(CTR_EL0,  ERG, 20, 4)
+FIELD(CTR_EL0,  CWG, 24, 4)
+FIELD(CTR_EL0,  IDC, 28, 1)
+FIELD(CTR_EL0,  DIC, 29, 1)
+
 FIELD(MIDR_EL1, REVISION, 0, 4)
 FIELD(MIDR_EL1, PARTNUM, 4, 12)
 FIELD(MIDR_EL1, ARCHITECTURE, 16, 4)
-- 
2.20.1

[PATCH 1/5] target/arm: fix typo in cpu.h ID_AA64PFR1 field name

2020-12-08 Thread Leif Lindholm

SBSS -> SSBS

Signed-off-by: Leif Lindholm 
---
 target/arm/cpu.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index e5514c8286..6962ef05d6 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1851,7 +1851,7 @@ FIELD(ID_AA64PFR0, RAS, 28, 4)
 FIELD(ID_AA64PFR0, SVE, 32, 4)
 
 FIELD(ID_AA64PFR1, BT, 0, 4)
-FIELD(ID_AA64PFR1, SBSS, 4, 4)
+FIELD(ID_AA64PFR1, SSBS, 4, 4)
 FIELD(ID_AA64PFR1, MTE, 8, 4)
 FIELD(ID_AA64PFR1, RAS_FRAC, 12, 4)
 
-- 
2.20.1

[PATCH 2/5] target/arm: make ARMCPU.clidr 64-bit

2020-12-08 Thread Leif Lindholm

The AArch64 view of CLIDR_EL1 extends the ICB field to include also bit
32, as well as adding a Ttype field when FEAT_MTE is implemented.
Extend the clidr field to be able to hold this context.

Signed-off-by: Leif Lindholm 
---
 target/arm/cpu.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 6962ef05d6..b54d1dc092 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -938,7 +938,7 @@ struct ARMCPU {
 uint32_t id_afr0;
 uint64_t id_aa64afr0;
 uint64_t id_aa64afr1;
-uint32_t clidr;
+uint64_t clidr;
 uint64_t mp_affinity; /* MP ID without feature bits */
 /* The elements of this array are the CCSIDR values for each cache,
  * in the order L1DCache, L1ICache, L2DCache, L2ICache, etc.
-- 
2.20.1

[PATCH 0/5] target/arm: various changes to cpu.h

2020-12-08 Thread Leif Lindholm

First, fix a typo in ID_AA64PFR1 (SBSS -> SSBS).

Second, turn clidr in the ARMCPU struct 64-bit, to support all fields defined
by the ARM ARM.

Third, add field definitions for CLIDR (excepting the Ttype fields, since
I was unsure of prefererred naming - Ttype7-Ttype1?).

Fourth add all ID_AA64 registers/fields present in ARM DDI 0487F.c,

Lastly, add all ID_ (aarch32) registers/fields.

Some of the ID_AA64 fields will be used by some patches Rebecca Cran will be
submitting shortly, and some of those features also exist for aarch32.

Leif Lindholm (5):
  target/arm: fix typo in cpu.h ID_AA64PFR1 field name
  target/arm: make ARMCPU.clidr 64-bit
  target/arm: add descriptions of CLIDR_EL1, CCSIDR_EL1, CTR_EL0 to
cpu.h
  target/arm: add aarch64 ID register fields to cpu.h
  target/arm: add aarch32 ID register fields to cpu.h

 target/arm/cpu.h | 80 ++--
 1 file changed, 78 insertions(+), 2 deletions(-)

-- 
2.20.1

Re: [PATCH] hw/input: expand trace info reported for ps2 device

2020-12-08 Thread no-reply

Patchew URL: 
https://patchew.org/QEMU/20201208115934.3163238-1-berra...@redhat.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20201208115934.3163238-1-berra...@redhat.com
Subject: [PATCH] hw/input: expand trace info reported for ps2 device

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] patchew/20201208115934.3163238-1-berra...@redhat.com -> 
patchew/20201208115934.3163238-1-berra...@redhat.com
Switched to a new branch 'test'
ef4eabf hw/input: expand trace info reported for ps2 device

=== OUTPUT BEGIN ===
ERROR: line over 90 characters
#22: FILE: hw/input/ps2.c:296:
+trace_ps2_keyboard_event(s, qcode, key->down, mod, s->modifiers, 
s->scancode_set, s->translate);

total: 1 errors, 0 warnings, 16 lines checked

Commit ef4eabf7b954 (hw/input: expand trace info reported for ps2 device) has 
style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20201208115934.3163238-1-berra...@redhat.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [PATCH v12 12/19] multi-process: introduce proxy object

2020-12-08 Thread Marc-André Lureau

Hi

On Wed, Dec 2, 2020 at 12:23 AM Jagannathan Raman 
wrote:

> From: Elena Ufimtseva 
>
> Defines a PCI Device proxy object as a child of TYPE_PCI_DEVICE.
>
> Signed-off-by: Elena Ufimtseva 
> Signed-off-by: Jagannathan Raman 
> Signed-off-by: John G Johnson 
> Reviewed-by: Stefan Hajnoczi 
> ---
>  include/hw/remote/proxy.h | 36 +
>  hw/remote/proxy.c | 98
> +++
>  MAINTAINERS   |  2 +
>  hw/remote/meson.build |  1 +
>  4 files changed, 137 insertions(+)
>  create mode 100644 include/hw/remote/proxy.h
>  create mode 100644 hw/remote/proxy.c
>
> diff --git a/include/hw/remote/proxy.h b/include/hw/remote/proxy.h
> new file mode 100644
> index 000..923432a
> --- /dev/null
> +++ b/include/hw/remote/proxy.h
> @@ -0,0 +1,36 @@
> +/*
> + * Copyright © 2018, 2020 Oracle and/or its affiliates.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#ifndef PROXY_H
> +#define PROXY_H
>
+
> +#include "hw/pci/pci.h"
> +#include "io/channel.h"
> +
> +#define TYPE_PCI_PROXY_DEV "x-pci-proxy-dev"
> +
> +#define PCI_PROXY_DEV(obj) \
> +OBJECT_CHECK(PCIProxyDev, (obj), TYPE_PCI_PROXY_DEV)
> +typedef struct PCIProxyDev PCIProxyDev;
> +
> +struct PCIProxyDev {
> +PCIDevice parent_dev;
> +char *fd;
> +
> +/*
> + * Mutex used to protect the QIOChannel fd from
> + * the concurrent access by the VCPUs since proxy
> + * blocks while awaiting for the replies from the
> + * process remote.
> + */
> +QemuMutex io_mutex;
> +QIOChannel *ioc;
> +Error *migration_blocker;
> +};
> +
> +#endif /* PROXY_H */
> diff --git a/hw/remote/proxy.c b/hw/remote/proxy.c
> new file mode 100644
> index 000..29100bc
> --- /dev/null
> +++ b/hw/remote/proxy.c
> @@ -0,0 +1,98 @@
> +/*
> + * Copyright © 2018, 2020 Oracle and/or its affiliates.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu-common.h"
> +
> +#include "hw/remote/proxy.h"
> +#include "hw/pci/pci.h"
> +#include "qapi/error.h"
> +#include "io/channel-util.h"
> +#include "hw/qdev-properties.h"
> +#include "monitor/monitor.h"
> +#include "migration/blocker.h"
> +
> +static void proxy_set_socket(PCIProxyDev *pdev, int fd, Error **errp)
> +{
> +pdev->ioc = qio_channel_new_fd(fd, errp);
> +}
>

That one line function (called once) should be inlined.

+
> +static Property proxy_properties[] = {
> +DEFINE_PROP_STRING("fd", PCIProxyDev, fd),
> +DEFINE_PROP_END_OF_LIST(),
> +};
> +
>

Generally we put properties just above class_init, where it is used.

+static void pci_proxy_dev_realize(PCIDevice *device, Error **errp)
> +{
>

(errp shouldn't be NULL, but ERRP_GUARD would be safer)

+PCIProxyDev *dev = PCI_PROXY_DEV(device);
> +int fd;
> +
> +if (dev->fd) {
> +fd = monitor_fd_param(monitor_cur(), dev->fd, errp);
> +if (fd == -1) {
> +error_prepend(errp, "proxy: unable to parse fd: ");
> +return;
> +}
> +proxy_set_socket(dev, fd, errp);
> +} else {
> +error_setg(errp, "fd parameter not specified for %s",
> +   DEVICE(device)->id);
> +return;
>

We prefer early return, to keep the code easy to read.

if (!dev->fd) {
  return error...
}

the_normal_thing();
...

+}
> +
> +error_setg(&dev->migration_blocker, "%s does not support migration",
> +   TYPE_PCI_PROXY_DEV);
> +if (migrate_add_blocker(dev->migration_blocker, errp)) {
> +error_free(dev->migration_blocker);
>

leave that free to dev_exit() ?


> +error_free(*errp);
> +dev->migration_blocker = NULL;
> +error_setg(errp, "Failed to set migration blocker");
>

Why not use the error from migrate_add_blocker?

+}
> +
> +qemu_mutex_init(&dev->io_mutex);
> +qio_channel_set_blocking(dev->ioc, true, NULL);
> +}
> +
> +static void pci_proxy_dev_exit(PCIDevice *pdev)
> +{
> +PCIProxyDev *dev = PCI_PROXY_DEV(pdev);
> +
> +qio_channel_close(dev->ioc, NULL);
>

on early return in realize, dev->ioc is NULL. This will crash.


+
> +migrate_del_blocker(dev->migration_blocker);
>

So is migration_blocker, but this should be safe with NULL


> +
> +error_free(dev->migration_blocker);
>
+}
> +
> +static void pci_proxy_dev_class_init(ObjectClass *klass, void *data)
> +{
> +DeviceClass *dc = DEVICE_CLASS(klass);
> +PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
> +
> +k->realize = pci_proxy_dev_realize;
> +k->exit = pci_proxy_dev_exit;
> +device_class_set_props(dc, proxy_properties);
> +}
> +
> +static const TypeInfo pci_proxy_dev_type_info = {
> +.name  = TYPE_PCI_PROXY_DEV,
> +.parent= TYPE_PCI_DEVICE,
> +.instance_size =

[PATCH 4/5] target/arm: add aarch64 ID register fields to cpu.h

2020-12-08 Thread Leif Lindholm

Add entries present in ARM DDI 0487F.c (August 2020).

Signed-off-by: Leif Lindholm 
---
 target/arm/cpu.h | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 5e9e8061f7..2a12a5ce92 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1863,6 +1863,9 @@ FIELD(ID_AA64ISAR1, GPI, 28, 4)
 FIELD(ID_AA64ISAR1, FRINTTS, 32, 4)
 FIELD(ID_AA64ISAR1, SB, 36, 4)
 FIELD(ID_AA64ISAR1, SPECRES, 40, 4)
+FIELD(ID_AA64ISAR1, BF16, 44, 4)
+FIELD(ID_AA64ISAR1, DGH, 48, 4)
+FIELD(ID_AA64ISAR1, I8MM, 52, 4)
 
 FIELD(ID_AA64PFR0, EL0, 0, 4)
 FIELD(ID_AA64PFR0, EL1, 4, 4)
@@ -1873,11 +1876,18 @@ FIELD(ID_AA64PFR0, ADVSIMD, 20, 4)
 FIELD(ID_AA64PFR0, GIC, 24, 4)
 FIELD(ID_AA64PFR0, RAS, 28, 4)
 FIELD(ID_AA64PFR0, SVE, 32, 4)
+FIELD(ID_AA64PFR0, SEL2, 36, 4)
+FIELD(ID_AA64PFR0, MPAM, 40, 4)
+FIELD(ID_AA64PFR0, AMU, 44, 4)
+FIELD(ID_AA64PFR0, DIT, 48, 4)
+FIELD(ID_AA64PFR0, CSV2, 56, 4)
+FIELD(ID_AA64PFR0, CSV3, 60, 4)
 
 FIELD(ID_AA64PFR1, BT, 0, 4)
 FIELD(ID_AA64PFR1, SSBS, 4, 4)
 FIELD(ID_AA64PFR1, MTE, 8, 4)
 FIELD(ID_AA64PFR1, RAS_FRAC, 12, 4)
+FIELD(ID_AA64PFR1, MPAM_FRAC, 16, 4)
 
 FIELD(ID_AA64MMFR0, PARANGE, 0, 4)
 FIELD(ID_AA64MMFR0, ASIDBITS, 4, 4)
@@ -1891,6 +1901,8 @@ FIELD(ID_AA64MMFR0, TGRAN16_2, 32, 4)
 FIELD(ID_AA64MMFR0, TGRAN64_2, 36, 4)
 FIELD(ID_AA64MMFR0, TGRAN4_2, 40, 4)
 FIELD(ID_AA64MMFR0, EXS, 44, 4)
+FIELD(ID_AA64MMFR0, FGT, 56, 4)
+FIELD(ID_AA64MMFR0, ECV, 60, 4)
 
 FIELD(ID_AA64MMFR1, HAFDBS, 0, 4)
 FIELD(ID_AA64MMFR1, VMIDBITS, 4, 4)
@@ -1900,6 +1912,8 @@ FIELD(ID_AA64MMFR1, LO, 16, 4)
 FIELD(ID_AA64MMFR1, PAN, 20, 4)
 FIELD(ID_AA64MMFR1, SPECSEI, 24, 4)
 FIELD(ID_AA64MMFR1, XNX, 28, 4)
+FIELD(ID_AA64MMFR1, TWED, 32, 4)
+FIELD(ID_AA64MMFR1, ETS, 36, 4)
 
 FIELD(ID_AA64MMFR2, CNP, 0, 4)
 FIELD(ID_AA64MMFR2, UAO, 4, 4)
@@ -1926,6 +1940,7 @@ FIELD(ID_AA64DFR0, CTX_CMPS, 28, 4)
 FIELD(ID_AA64DFR0, PMSVER, 32, 4)
 FIELD(ID_AA64DFR0, DOUBLELOCK, 36, 4)
 FIELD(ID_AA64DFR0, TRACEFILT, 40, 4)
+FIELD(ID_AA64DFR0, MTPMU, 48, 4)
 
 FIELD(ID_DFR0, COPDBG, 0, 4)
 FIELD(ID_DFR0, COPSDBG, 4, 4)
-- 
2.20.1

[PATCH v3] tests/acceptance: test hot(un)plug of ccw devices

2020-12-08 Thread Cornelia Huck

Hotplug a virtio-net-ccw device, and then hotunplug it again.

Signed-off-by: Cornelia Huck 
---
v2->v3:
- do the dmesg cleanout and waiting for messages properly [Thomas]

Wainer: I dropped your r-b, as there had been too many changes for
me to be comfortable with retaining it

---
 tests/acceptance/machine_s390_ccw_virtio.py | 24 +
 1 file changed, 24 insertions(+)

diff --git a/tests/acceptance/machine_s390_ccw_virtio.py 
b/tests/acceptance/machine_s390_ccw_virtio.py
index 81d14088818c..864ef4ee6e9b 100644
--- a/tests/acceptance/machine_s390_ccw_virtio.py
+++ b/tests/acceptance/machine_s390_ccw_virtio.py
@@ -99,3 +99,27 @@ class S390CCWVirtioMachine(Test):
 exec_command_and_wait_for_pattern(self,
 'cat /sys/bus/pci/devices/000a\:00\:00.0/function_id',
 '0x000c')
+# add another device
+exec_command_and_wait_for_pattern(self,
+'dmesg -c > /dev/null; echo dm_clear\ 1',
+'dm_clear 1')
+self.vm.command('device_add', driver='virtio-net-ccw',
+devno='fe.0.4711', id='net_4711')
+exec_command_and_wait_for_pattern(self,
+'while ! (dmesg -c | grep CRW) ; do sleep 1 ; done',
+'CRW reports')
+exec_command_and_wait_for_pattern(self, 'ls /sys/bus/ccw/devices/',
+  '0.0.4711')
+# and detach it again
+exec_command_and_wait_for_pattern(self,
+'dmesg -c > /dev/null; echo dm_clear\ 2',
+'dm_clear 2')
+self.vm.command('device_del', id='net_4711')
+self.vm.event_wait(name='DEVICE_DELETED',
+   match={'data': {'device': 'net_4711'}})
+exec_command_and_wait_for_pattern(self,
+'while ! (dmesg -c | grep CRW) ; do sleep 1 ; done',
+'CRW reports')
+exec_command_and_wait_for_pattern(self,
+  'ls /sys/bus/ccw/devices/0.0.4711',
+  'No such file or directory')
-- 
2.26.2

Re: [PATCH v2 1/9] console: drop qemu_console_get_ui_info

2020-12-08 Thread Marc-André Lureau

On Tue, Dec 8, 2020 at 4:02 PM Gerd Hoffmann  wrote:

> Unused and duplicate (there is dpy_get_ui_info).
>
> Signed-off-by: Gerd Hoffmann 
>

:)
Reviewed-by: Marc-André Lureau 

---
>  include/ui/console.h | 1 -
>  ui/console.c | 6 --
>  2 files changed, 7 deletions(-)
>
> diff --git a/include/ui/console.h b/include/ui/console.h
> index e7303d8b98a8..5dd21976a376 100644
> --- a/include/ui/console.h
> +++ b/include/ui/console.h
> @@ -390,7 +390,6 @@ bool qemu_console_is_gl_blocked(QemuConsole *con);
>  char *qemu_console_get_label(QemuConsole *con);
>  int qemu_console_get_index(QemuConsole *con);
>  uint32_t qemu_console_get_head(QemuConsole *con);
> -QemuUIInfo *qemu_console_get_ui_info(QemuConsole *con);
>  int qemu_console_get_width(QemuConsole *con, int fallback);
>  int qemu_console_get_height(QemuConsole *con, int fallback);
>  /* Return the low-level window id for the console */
> diff --git a/ui/console.c b/ui/console.c
> index 53dee8e26b17..f995639e45f6 100644
> --- a/ui/console.c
> +++ b/ui/console.c
> @@ -2122,12 +2122,6 @@ uint32_t qemu_console_get_head(QemuConsole *con)
>  return con ? con->head : -1;
>  }
>
> -QemuUIInfo *qemu_console_get_ui_info(QemuConsole *con)
> -{
> -assert(con != NULL);
> -return &con->ui_info;
> -}
> -
>  int qemu_console_get_width(QemuConsole *con, int fallback)
>  {
>  if (con == NULL) {
> --
> 2.27.0
>
>
>

-- 
Marc-André Lureau

[PATCH 5/5] target/arm: add aarch32 ID register fields to cpu.h

2020-12-08 Thread Leif Lindholm

Add entries present in ARM DDI 0487F.c (August 2020).

Signed-off-by: Leif Lindholm 
---
 target/arm/cpu.h | 37 +
 1 file changed, 37 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 2a12a5ce92..b37a74348d 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1800,6 +1800,8 @@ FIELD(ID_ISAR6, DP, 4, 4)
 FIELD(ID_ISAR6, FHM, 8, 4)
 FIELD(ID_ISAR6, SB, 12, 4)
 FIELD(ID_ISAR6, SPECRES, 16, 4)
+FIELD(ID_ISAR6, BF16, 20, 4)
+FIELD(ID_ISAR6, I8MM, 24, 4)
 
 FIELD(ID_MMFR0, VMSA, 0, 4)
 FIELD(ID_MMFR0, PMSA, 4, 4)
@@ -1810,6 +1812,24 @@ FIELD(ID_MMFR0, AUXREG, 20, 4)
 FIELD(ID_MMFR0, FCSE, 24, 4)
 FIELD(ID_MMFR0, INNERSHR, 28, 4)
 
+FIELD(ID_MMFR1, L1HVDVA, 0, 4)
+FIELD(ID_MMFR1, L1UNIVA, 4, 4)
+FIELD(ID_MMFR1, L1HVDSW, 8, 4)
+FIELD(ID_MMFR1, L1UNISW, 12, 4)
+FIELD(ID_MMFR1, L1HVD, 16, 4)
+FIELD(ID_MMFR1, L1UNI, 20, 4)
+FIELD(ID_MMFR1, L1TSTCLN, 24, 4)
+FIELD(ID_MMFR1, BPRED, 28, 4)
+
+FIELD(ID_MMFR2, L1HVDFG, 0, 4)
+FIELD(ID_MMFR2, L1HVDBG, 4, 4)
+FIELD(ID_MMFR2, L1HVDRNG, 8, 4)
+FIELD(ID_MMFR2, HVDTLB, 12, 4)
+FIELD(ID_MMFR2, UNITLB, 16, 4)
+FIELD(ID_MMFR2, MEMBARR, 20, 4)
+FIELD(ID_MMFR2, WFISTALL, 24, 4)
+FIELD(ID_MMFR2, HWACCFLG, 28, 4)
+
 FIELD(ID_MMFR3, CMAINTVA, 0, 4)
 FIELD(ID_MMFR3, CMAINTSW, 4, 4)
 FIELD(ID_MMFR3, BPMAINT, 8, 4)
@@ -1828,6 +1848,17 @@ FIELD(ID_MMFR4, LSM, 20, 4)
 FIELD(ID_MMFR4, CCIDX, 24, 4)
 FIELD(ID_MMFR4, EVT, 28, 4)
 
+FIELD(ID_MMFR5, ETS, 0, 4)
+
+FIELD(ID_PFR0, STATE0, 0, 4)
+FIELD(ID_PFR0, STATE1, 4, 4)
+FIELD(ID_PFR0, STATE2, 8, 4)
+FIELD(ID_PFR0, STATE3, 12, 4)
+FIELD(ID_PFR0, CSV2, 16, 4)
+FIELD(ID_PFR0, AMU, 20, 4)
+FIELD(ID_PFR0, DIT, 24, 4)
+FIELD(ID_PFR0, RAS, 28, 4)
+
 FIELD(ID_PFR1, PROGMOD, 0, 4)
 FIELD(ID_PFR1, SECURITY, 4, 4)
 FIELD(ID_PFR1, MPROGMOD, 8, 4)
@@ -1837,6 +1868,10 @@ FIELD(ID_PFR1, SEC_FRAC, 20, 4)
 FIELD(ID_PFR1, VIRT_FRAC, 24, 4)
 FIELD(ID_PFR1, GIC, 28, 4)
 
+FIELD(ID_PFR2, CSV3, 0, 4)
+FIELD(ID_PFR2, SSBS, 4, 4)
+FIELD(ID_PFR2, RAS_FRAC, 8, 4)
+
 FIELD(ID_AA64ISAR0, AES, 4, 4)
 FIELD(ID_AA64ISAR0, SHA1, 8, 4)
 FIELD(ID_AA64ISAR0, SHA2, 12, 4)
@@ -1951,6 +1986,8 @@ FIELD(ID_DFR0, MPROFDBG, 20, 4)
 FIELD(ID_DFR0, PERFMON, 24, 4)
 FIELD(ID_DFR0, TRACEFILT, 28, 4)
 
+FIELD(ID_DFR1, MTPMU, 0, 4)
+
 FIELD(DBGDIDR, SE_IMP, 12, 1)
 FIELD(DBGDIDR, NSUHD_IMP, 14, 1)
 FIELD(DBGDIDR, VERSION, 16, 4)
-- 
2.20.1

Re: [PATCH v2 2/9] console: allow con==NULL in dpy_{get, set}_ui_info and dpy_ui_info_supported

2020-12-08 Thread Marc-André Lureau

On Tue, Dec 8, 2020 at 3:59 PM Gerd Hoffmann  wrote:

> Use active_console in that case like we do in many other places.
>
> Signed-off-by: Gerd Hoffmann 
>

 Reviewed-by: Marc-André Lureau 

---
>  ui/console.c | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/ui/console.c b/ui/console.c
> index f995639e45f6..30e70be555db 100644
> --- a/ui/console.c
> +++ b/ui/console.c
> @@ -1544,19 +1544,27 @@ static void dpy_set_ui_info_timer(void *opaque)
>
>  bool dpy_ui_info_supported(QemuConsole *con)
>  {
> +if (con == NULL) {
> +con = active_console;
> +}
> +
>  return con->hw_ops->ui_info != NULL;
>  }
>
>  const QemuUIInfo *dpy_get_ui_info(const QemuConsole *con)
>  {
> -assert(con != NULL);
> +if (con == NULL) {
> +con = active_console;
> +}
>
>  return &con->ui_info;
>  }
>
>  int dpy_set_ui_info(QemuConsole *con, QemuUIInfo *info)
>  {
> -assert(con != NULL);
> +if (con == NULL) {
> +con = active_console;
> +}
>
>  if (!dpy_ui_info_supported(con)) {
>  return -1;
> --
> 2.27.0
>
>
>

-- 
Marc-André Lureau

Re: [PATCH v12 13/19] multi-process: add proxy communication functions

2020-12-08 Thread Marc-André Lureau

Hi

On Wed, Dec 2, 2020 at 12:23 AM Jagannathan Raman 
wrote:

> From: Elena Ufimtseva 
>
> Signed-off-by: Elena Ufimtseva 
> Signed-off-by: Jagannathan Raman 
> Signed-off-by: John G Johnson 
> Reviewed-by: Stefan Hajnoczi 
> ---
>  include/hw/remote/mpqemu-link.h |  4 
>  hw/remote/mpqemu-link.c | 38
> ++
>  2 files changed, 42 insertions(+)
>
> diff --git a/include/hw/remote/mpqemu-link.h
> b/include/hw/remote/mpqemu-link.h
> index 070ac77..cee9468 100644
> --- a/include/hw/remote/mpqemu-link.h
> +++ b/include/hw/remote/mpqemu-link.h
> @@ -15,6 +15,8 @@
>  #include "qemu/thread.h"
>  #include "io/channel.h"
>  #include "exec/hwaddr.h"
> +#include "io/channel-socket.h"
> +#include "hw/remote/proxy.h"
>
>  #define REMOTE_MAX_FDS 8
>
> @@ -65,6 +67,8 @@ typedef struct {
>  int num_fds;
>  } MPQemuMsg;
>
> +uint64_t mpqemu_msg_send_and_await_reply(MPQemuMsg *msg, PCIProxyDev
> *pdev,
> + Error **errp);
>  void mpqemu_msg_send(MPQemuMsg *msg, QIOChannel *ioc, Error **errp);
>  void mpqemu_msg_recv(MPQemuMsg *msg, QIOChannel *ioc, Error **errp);
>
> diff --git a/hw/remote/mpqemu-link.c b/hw/remote/mpqemu-link.c
> index bbd9df3..18c8a54 100644
> --- a/hw/remote/mpqemu-link.c
> +++ b/hw/remote/mpqemu-link.c
> @@ -17,6 +17,7 @@
>  #include "qemu/iov.h"
>  #include "qemu/error-report.h"
>  #include "qemu/main-loop.h"
> +#include "io/channel.h"
>
>  /*
>   * Send message over the ioc QIOChannel.
> @@ -219,6 +220,43 @@ fail:
>  }
>  }
>
> +/*
> + * Called from VCPU thread in non-coroutine context.
>

You could check that precondition in code early on, presumably.

The comment could use some further description, like "Send @msg and wait
for a RET_MSG reply. Returns the associated u64 message code, or UINT64_MAX
on error."

+ */
> +uint64_t mpqemu_msg_send_and_await_reply(MPQemuMsg *msg, PCIProxyDev
> *pdev,
> + Error **errp)
> +{
> +MPQemuMsg msg_reply = {0};
> +uint64_t ret = UINT64_MAX;
> +Error *local_err = NULL;
> +
>

Should work with ERRP_GUARD instead

+qemu_mutex_lock(&pdev->io_mutex);
>

Should work with QEMU_LOCK_GUARD

+mpqemu_msg_send(msg, pdev->ioc, &local_err);
> +if (local_err) {
> +error_propagate(errp, local_err);
> +goto exit_send;
> +}
> +
>

By making mpqemu_msg_send() return true on success, this should then become
simply

if (!mpqemu_msg_send(msg, pdev->ioc, errp))
  return ret;

+mpqemu_msg_recv(&msg_reply, pdev->ioc, &local_err);
> +if (local_err) {
> +error_propagate(errp, local_err);
> +goto exit_send;
> +}
> +
> +if (!mpqemu_msg_valid(&msg_reply) || msg_reply.cmd != RET_MSG) {
> +error_setg(errp, "ERROR: Invalid reply received for command %d",
> + msg->cmd);
> +goto exit_send;
> +} else {
>

that else is unneeded.

The function with automatic cleanups should be simpler.

+ret = msg_reply.data.u64;
> +}
> +
> + exit_send:
> +qemu_mutex_unlock(&pdev->io_mutex);
> +
> +return ret;
> +}
> +
>  bool mpqemu_msg_valid(MPQemuMsg *msg)
>  {
>  if (msg->cmd >= MPQEMU_CMD_MAX && msg->cmd < 0) {
> --
> 1.8.3.1
>
>

-- 
Marc-André Lureau

Re: [PATCH v2 0/9] vnc: support some new extensions.

2020-12-08 Thread Daniel P . Berrangé

On Tue, Dec 08, 2020 at 12:57:28PM +0100, Gerd Hoffmann wrote:
> The rfb standard keeps envolving.  While the official spec is kind of
> frozen since a decade or so the community maintains an updated version
> of the spec at:
>   https://github.com/rfbproto/rfbproto/
> 
> This series adds support for two new extensions from that spec: alpha
> cursor and extended desktop resize.
> 
> alpha cursor allows a full alpha channel for the cursor image (unlike
> the rich cursor extension which has only a bitmask for transparency).
> 
> extended desktop resize makes the desktop-resize work both ways, i.e. we
> can not only signal guest display resolution changes to the vnc client
> but also vnc client window size changes to the guest.
> 
> Tested with tigervnc.
> 
> gtk-vnc (and anything building on top of it like virt-viewer and
> virt-manager) has no support for these extensions.

I'm working on adding these two extensions to gtk-vnc as they're a notable
gap compared to spice.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [for-6.0 v5 00/13] Generalize memory encryption models

2020-12-08 Thread Cornelia Huck

On Tue, 8 Dec 2020 13:57:28 +1100
David Gibson  wrote:

> On Fri, Dec 04, 2020 at 02:12:29PM +0100, Cornelia Huck wrote:
> > On Fri, 4 Dec 2020 13:07:27 +
> > "Dr. David Alan Gilbert"  wrote:
> >   
> > > * Cornelia Huck (coh...@redhat.com) wrote:  
> > > > On Fri, 4 Dec 2020 09:06:50 +0100
> > > > Christian Borntraeger  wrote:
> > > > 
> > > > > On 04.12.20 06:44, David Gibson wrote:
> > > > > > A number of hardware platforms are implementing mechanisms whereby 
> > > > > > the
> > > > > > hypervisor does not have unfettered access to guest memory, in order
> > > > > > to mitigate the security impact of a compromised hypervisor.
> > > > > > 
> > > > > > AMD's SEV implements this with in-cpu memory encryption, and Intel 
> > > > > > has
> > > > > > its own memory encryption mechanism.  POWER has an upcoming 
> > > > > > mechanism
> > > > > > to accomplish this in a different way, using a new memory protection
> > > > > > level plus a small trusted ultravisor.  s390 also has a protected
> > > > > > execution environment.
> > > > > > 
> > > > > > The current code (committed or draft) for these features has each
> > > > > > platform's version configured entirely differently.  That doesn't 
> > > > > > seem
> > > > > > ideal for users, or particularly for management layers.
> > > > > > 
> > > > > > AMD SEV introduces a notionally generic machine option
> > > > > > "machine-encryption", but it doesn't actually cover any cases other
> > > > > > than SEV.
> > > > > > 
> > > > > > This series is a proposal to at least partially unify configuration
> > > > > > for these mechanisms, by renaming and generalizing AMD's
> > > > > > "memory-encryption" property.  It is replaced by a
> > > > > > "securable-guest-memory" property pointing to a platform specific   
> > > > > >
> > > > > 
> > > > > Can we do "securable-guest" ?
> > > > > s390x also protects registers and integrity. memory is only one piece
> > > > > of the puzzle and what we protect might differ from platform to 
> > > > > platform.
> > > > > 
> > > > 
> > > > I agree. Even technologies that currently only do memory encryption may
> > > > be enhanced with more protections later.
> > > 
> > > There's already SEV-ES patches onlist for this on the SEV side.
> > > 
> > > 
> > > 
> > > Perhaps 'confidential guest' is actually what we need, since the
> > > marketing folks seem to have started labelling this whole idea
> > > 'confidential computing'.  
> 
> That's not a bad idea, much as I usually hate marketing terms.  But it
> does seem to be becoming a general term for this style of thing, and
> it doesn't overlap too badly with other terms ("secure" and
> "protected" are also used for hypervisor-from-guest and
> guest-from-guest protection).
> 
> > It's more like a 'possibly confidential guest', though.  
> 
> Hmm.  What about "Confidential Guest Facility" or "Confidential Guest
> Mechanism"?  The implication being that the facility is there, whether
> or not the guest actually uses it.
> 

"Confidential Guest Enablement"? The others generally sound fine to me
as well, though; not sure if "Facility" might be a bit confusing, as
that term is already a bit overloaded.


pgpKpYjghVwoC.pgp
Description: OpenPGP digital signature

Re: [for-6.0 v5 12/13] securable guest memory: Alter virtio default properties for protected guests

2020-12-08 Thread Cornelia Huck

On Tue, 8 Dec 2020 11:28:29 +0100
Halil Pasic  wrote:

> On Tue, 8 Dec 2020 12:54:03 +1100
> David Gibson  wrote:
> 
> > > > >>> + * Virtio devices can't count on directly accessing guest
> > > > >>> + * memory, so they need iommu_platform=on to use normal DMA
> > > > >>> + * mechanisms.  That requires also disabling legacy virtio
> > > > >>> + * support for those virtio pci devices which allow it.
> > > > >>> + */
> > > > >>> +object_register_sugar_prop(TYPE_VIRTIO_PCI, 
> > > > >>> "disable-legacy",
> > > > >>> +   "on", true);
> > > > >>> +object_register_sugar_prop(TYPE_VIRTIO_DEVICE, 
> > > > >>> "iommu_platform",
> > > > >>> +   "on", false);  
> > > > >>
> > > > >> I have not followed all the history (sorry). Should we also set 
> > > > >> iommu_platform
> > > > >> for virtio-ccw? Halil?
> > > > >>
> > > > > 
> > > > > That line should add iommu_platform for all virtio devices, shouldn't
> > > > > it?
> > > > 
> > > > Yes, sorry. Was misreading that with the line above. 
> > > > 
> > > 
> > > I believe this is the best we can get. In a sense it is still a
> > > pessimization,
> > 
> > I'm not really clear on what you're getting at here.  
> 
> By pessimiziation, I mean that we are going to indicate
> _F_PLATFORM_ACCESS even if it isn't necessary, because the guest never
> opted in for confidential/memory protection/memory encryption. We have
> discussed this before, and I don't see a better solution that works for
> everybody.

If you consider specifying the secure guest option as a way to tell
QEMU to make everything ready for running a secure guest, I'd certainly
consider it necessary. If you do not want to force it, you should not
do the secure guest preparation setup.


pgpkIEfl5FnbD.pgp
Description: OpenPGP digital signature

Re: [PATCH v12 14/19] multi-process: Forward PCI config space acceses to the remote process

2020-12-08 Thread Marc-André Lureau

Hi

On Wed, Dec 2, 2020 at 12:25 AM Jagannathan Raman 
wrote:

> From: Elena Ufimtseva 
>
> The Proxy Object sends the PCI config space accesses as messages
> to the remote process over the communication channel
>
> Signed-off-by: Elena Ufimtseva 
> Signed-off-by: Jagannathan Raman 
> Signed-off-by: John G Johnson 
> Reviewed-by: Stefan Hajnoczi 
> ---
>  include/hw/remote/mpqemu-link.h |  9 ++
>  hw/remote/message.c | 62
> +
>  hw/remote/mpqemu-link.c |  6 
>  hw/remote/proxy.c   | 51 +
>  4 files changed, 128 insertions(+)
>
> diff --git a/include/hw/remote/mpqemu-link.h
> b/include/hw/remote/mpqemu-link.h
> index cee9468..057c98b 100644
> --- a/include/hw/remote/mpqemu-link.h
> +++ b/include/hw/remote/mpqemu-link.h
> @@ -34,6 +34,8 @@ typedef enum {
>  MPQEMU_CMD_INIT,
>  SYNC_SYSMEM,
>  RET_MSG,
> +PCI_CONFIG_WRITE,
> +PCI_CONFIG_READ,
>  MPQEMU_CMD_MAX,
>

It would be a good idea to prefix all enums with MQEMU_CMD.

 } MPQemuCmd;
>
> @@ -43,6 +45,12 @@ typedef struct {
>  off_t offsets[REMOTE_MAX_FDS];
>  } SyncSysmemMsg;
>
> +typedef struct {
> +uint32_t addr;
> +uint32_t val;
> +int l;
>

"len" please, it's already short enough :)

+} PciConfDataMsg;
> +
>  /**
>   * MPQemuMsg:
>   * @cmd: The remote command
> @@ -60,6 +68,7 @@ typedef struct {
>
>  union {
>  uint64_t u64;
> +PciConfDataMsg pci_conf_data;
>  SyncSysmemMsg sync_sysmem;
>  } data;
>
> diff --git a/hw/remote/message.c b/hw/remote/message.c
> index 1f2edc7..52a6f6f 100644
> --- a/hw/remote/message.c
> +++ b/hw/remote/message.c
> @@ -15,6 +15,12 @@
>  #include "hw/remote/mpqemu-link.h"
>  #include "qapi/error.h"
>  #include "sysemu/runstate.h"
> +#include "hw/pci/pci.h"
> +
> +static void process_config_write(QIOChannel *ioc, PCIDevice *dev,
> + MPQemuMsg *msg);
> +static void process_config_read(QIOChannel *ioc, PCIDevice *dev,
> +MPQemuMsg *msg);
>
>  void coroutine_fn mpqemu_remote_msg_loop_co(void *data)
>  {
> @@ -43,6 +49,12 @@ void coroutine_fn mpqemu_remote_msg_loop_co(void *data)
>  }
>
>  switch (msg.cmd) {
> +case PCI_CONFIG_WRITE:
> +process_config_write(com->ioc, pci_dev, &msg);
> +break;
>

Some other process_ functions take &local_err. I think this is a better
idea, so error reporting is done in a single place from
mpqemu_remote_msg_loop_co() for now

+case PCI_CONFIG_READ:
> +process_config_read(com->ioc, pci_dev, &msg);
> +break;
>  default:
>  error_setg(&local_err,
> "Unknown command (%d) received for device %s
> (pid=%d)",
> @@ -60,3 +72,53 @@ void coroutine_fn mpqemu_remote_msg_loop_co(void *data)
>
>  return;
>  }
> +
> +static void process_config_write(QIOChannel *ioc, PCIDevice *dev,
> + MPQemuMsg *msg)
> +{
> +PciConfDataMsg *conf = (PciConfDataMsg *)&msg->data.pci_conf_data;
> +MPQemuMsg ret = { 0 };
> +Error *local_err = NULL;
> +
> +if ((conf->addr + sizeof(conf->val)) > pci_config_size(dev)) {
> +error_report("Bad address received when writing PCI config, pid
> %d",
> + getpid());
>

Should use FMT_pid.

But then, I am not sure some error messages should have PID, while most of
them dont. That should be either the job of a log manager, or a custom
logger function/option.

+ret.data.u64 = UINT64_MAX;
> +} else {
> +pci_default_write_config(dev, conf->addr, conf->val, conf->l);
> +}
> +
> +ret.cmd = RET_MSG;
> +ret.size = sizeof(ret.data.u64);
> +
> +mpqemu_msg_send(&ret, ioc, &local_err);
> +if (local_err) {
> +error_report("Could not send message to proxy from pid %d",
> + getpid());
>

Here it leaks local_err, and ignores it. Use error_prepend instead?

+}
> +}
> +
> +static void process_config_read(QIOChannel *ioc, PCIDevice *dev,
> +MPQemuMsg *msg)
> +{
> +PciConfDataMsg *conf = (PciConfDataMsg *)&msg->data.pci_conf_data;
> +MPQemuMsg ret = { 0 };
> +Error *local_err = NULL;
> +
> +if ((conf->addr + sizeof(conf->val)) > pci_config_size(dev)) {
> +error_report("Bad address received when reading PCI config, pid
> %d",
> + getpid());
> +ret.data.u64 = UINT64_MAX;
> +} else {
> +ret.data.u64 = pci_default_read_config(dev, conf->addr, conf->l);
> +}
> +
> +ret.cmd = RET_MSG;
> +ret.size = sizeof(ret.data.u64);
> +
> +mpqemu_msg_send(&ret, ioc, &local_err);
> +if (local_err) {
> +error_report("Could not send message to proxy from pid %d",
> + getpid());
>

Same as earlier

+}
> +}
> diff --git a/hw/remote/mpqemu-link.c b/hw/remote/mpqemu-link.c
> index

Re: [PATCH 2/5] target/arm: make ARMCPU.clidr 64-bit

2020-12-08 Thread Philippe Mathieu-Daudé

On 12/8/20 1:23 PM, Leif Lindholm wrote:
> The AArch64 view of CLIDR_EL1 extends the ICB field to include also bit
> 32, as well as adding a Ttype field when FEAT_MTE is implemented.
> Extend the clidr field to be able to hold this context.
> 
> Signed-off-by: Leif Lindholm 
> ---
>  target/arm/cpu.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Philippe Mathieu-Daudé

[PATCH] file-posix: detect the lock using the real file

2020-12-08 Thread Li Feng

This patch addresses this issue:
When accessing a volume on an NFS filesystem without supporting the file lock,
tools, like qemu-img, will complain "Failed to lock byte 100".

In the original code, the qemu_has_ofd_lock will test the lock on the
"/dev/null" pseudo-file. Actually, the file.locking is per-drive property,
which depends on the underlay filesystem.

In this patch, make the 'qemu_has_ofd_lock' with a filename be more generic
and reasonable.

Signed-off-by: Li Feng 
---
 block/file-posix.c | 32 +++-
 include/qemu/osdep.h   |  2 +-
 tests/test-image-locking.c |  2 +-
 util/osdep.c   | 43 --
 4 files changed, 47 insertions(+), 32 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 806764f7e3..03be1b188c 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -595,7 +595,7 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
 switch (locking) {
 case ON_OFF_AUTO_ON:
 s->use_lock = true;
-if (!qemu_has_ofd_lock()) {
+if (!qemu_has_ofd_lock(filename)) {
 warn_report("File lock requested but OFD locking syscall is "
 "unavailable, falling back to POSIX file locks");
 error_printf("Due to the implementation, locks can be lost "
@@ -606,7 +606,7 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
 s->use_lock = false;
 break;
 case ON_OFF_AUTO_AUTO:
-s->use_lock = qemu_has_ofd_lock();
+s->use_lock = qemu_has_ofd_lock(filename);
 break;
 default:
 abort();
@@ -2388,6 +2388,7 @@ raw_co_create(BlockdevCreateOptions *options, Error 
**errp)
 int fd;
 uint64_t perm, shared;
 int result = 0;
+bool use_lock;
 
 /* Validate options and set default values */
 assert(options->driver == BLOCKDEV_DRIVER_FILE);
@@ -2428,19 +2429,22 @@ raw_co_create(BlockdevCreateOptions *options, Error 
**errp)
 perm = BLK_PERM_WRITE | BLK_PERM_RESIZE;
 shared = BLK_PERM_ALL & ~BLK_PERM_RESIZE;
 
-/* Step one: Take locks */
-result = raw_apply_lock_bytes(NULL, fd, perm, ~shared, false, errp);
-if (result < 0) {
-goto out_close;
-}
+use_lock = qemu_has_ofd_lock(file_opts->filename);
+if (use_lock) {
+/* Step one: Take locks */
+result = raw_apply_lock_bytes(NULL, fd, perm, ~shared, false, errp);
+if (result < 0) {
+goto out_close;
+}
 
-/* Step two: Check that nobody else has taken conflicting locks */
-result = raw_check_lock_bytes(fd, perm, shared, errp);
-if (result < 0) {
-error_append_hint(errp,
-  "Is another process using the image [%s]?\n",
-  file_opts->filename);
-goto out_unlock;
+/* Step two: Check that nobody else has taken conflicting locks */
+result = raw_check_lock_bytes(fd, perm, shared, errp);
+if (result < 0) {
+error_append_hint(errp,
+  "Is another process using the image [%s]?\n",
+  file_opts->filename);
+goto out_unlock;
+}
 }
 
 /* Clear the file by truncating it to 0 */
diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index f9ec8c84e9..349adad465 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -512,7 +512,7 @@ int qemu_dup(int fd);
 int qemu_lock_fd(int fd, int64_t start, int64_t len, bool exclusive);
 int qemu_unlock_fd(int fd, int64_t start, int64_t len);
 int qemu_lock_fd_test(int fd, int64_t start, int64_t len, bool exclusive);
-bool qemu_has_ofd_lock(void);
+bool qemu_has_ofd_lock(const char *filename);
 #endif
 
 #if defined(__HAIKU__) && defined(__i386__)
diff --git a/tests/test-image-locking.c b/tests/test-image-locking.c
index ba057bd66c..3e80246081 100644
--- a/tests/test-image-locking.c
+++ b/tests/test-image-locking.c
@@ -149,7 +149,7 @@ int main(int argc, char **argv)
 
 g_test_init(&argc, &argv, NULL);
 
-if (qemu_has_ofd_lock()) {
+if (qemu_has_ofd_lock(NULL)) {
 g_test_add_func("/image-locking/basic", test_image_locking_basic);
 g_test_add_func("/image-locking/set-perm-abort", test_set_perm_abort);
 }
diff --git a/util/osdep.c b/util/osdep.c
index 66d01b9160..e7e502edd1 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -42,6 +42,7 @@ extern int madvise(char *, size_t, int);
 static bool fips_enabled = false;
 
 static const char *hw_version = QEMU_HW_VERSION;
+static const char *null_device = "/dev/null";
 
 int socket_set_cork(int fd, int v)
 {
@@ -187,11 +188,10 @@ static int qemu_parse_fdset(const char *param)
 return qemu_parse_fd(param);
 }
 
-static void qemu_probe_lock_ops(void)
+static void qemu_probe_lock_ops_fd(int fd)
 {
 if (fcntl_op_setlk == -1) {
 #ifdef F_OFD_SETLK
-int fd;
 int ret;
 struct flock fl = {
 .l_wh

Re: [PATCH 1/5] target/arm: fix typo in cpu.h ID_AA64PFR1 field name

2020-12-08 Thread Philippe Mathieu-Daudé

On 12/8/20 1:23 PM, Leif Lindholm wrote:
> SBSS -> SSBS

For Speculative Store Bypassing State.

> 
> Signed-off-by: Leif Lindholm 
> ---
>  target/arm/cpu.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Philippe Mathieu-Daudé

Re: Plugin Register Accesses

2020-12-08 Thread Alex Bennée

Aaron Lindsay  writes:

> I'm trying to migrate to using the new plugin interface. I see the
> following in include/qemu/qemu-plugin.h:
>
>> enum qemu_plugin_cb_flags {
>> QEMU_PLUGIN_CB_NO_REGS, /* callback does not access the CPU's regs */
>> QEMU_PLUGIN_CB_R_REGS,  /* callback reads the CPU's regs */
>> QEMU_PLUGIN_CB_RW_REGS, /* callback reads and writes the CPU's regs */
>> };
>
> But I don't see a way to access registers in callbacks. Am I missing
> something?

No - while those symbols do inform the TCG to not try and optimise
the register file we don't yet have an API for the plugins for reading
(or writing) the CPU registers.

There has been discussion about this before, I'll quote what I said
off-list to someone else who asked:

  > Has there been any clarification or softening of the position that 
  > exposing register and memory contents to the QEMU plugin would provide a 
  > way to circumvent the GPL of QEMU?

  I don't think implementing read only access would be a problem and
  should probably be a first step anyway.

  For registers I think there needs to be some re-factoring of QEMU's
  internals to do it cleanly. Currently we have each front-end providing
  hooks to the gdbstub as well as building up their own regid and xml to
  be consumed by it. We probably want a architectural neutral central
  repository that the front ends can register their registers (sic) and
  helpers with. This would then provide hooks for gdbstub to cleanly
  generate XML as well as an interface point for the plugin infrastructure
  (and probably whatever the HMP uses as well).

  Memory is a little trickier because you can't know at any point if a
  given virtual address is actually mapped to real memory. The safest way
  would be to extend the existing memory tracking code to save the values
  saved/loaded from a given address. However if you had register access
  you could probably achieve the same thing after the fact by examining
  the opcode and pulling the values from the registers.

>
> -Aaron

-- 
Alex Bennée

Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions

2020-12-08 Thread Stefan Hajnoczi

On Thu, Oct 22, 2020 at 05:29:16PM +0100, Fam Zheng wrote:
> On Tue, 2020-10-20 at 09:34 +0800, Zhenyu Ye wrote:
> > On 2020/10/19 21:25, Paolo Bonzini wrote:
> > > On 19/10/20 14:40, Zhenyu Ye wrote:
> > > > The kernel backtrace for io_submit in GUEST is:
> > > > 
> > > > guest# ./offcputime -K -p `pgrep -nx fio`
> > > > b'finish_task_switch'
> > > > b'__schedule'
> > > > b'schedule'
> > > > b'io_schedule'
> > > > b'blk_mq_get_tag'
> > > > b'blk_mq_get_request'
> > > > b'blk_mq_make_request'
> > > > b'generic_make_request'
> > > > b'submit_bio'
> > > > b'blkdev_direct_IO'
> > > > b'generic_file_read_iter'
> > > > b'aio_read'
> > > > b'io_submit_one'
> > > > b'__x64_sys_io_submit'
> > > > b'do_syscall_64'
> > > > b'entry_SYSCALL_64_after_hwframe'
> > > > -fio (1464)
> > > > 40031912
> > > > 
> > > > And Linux io_uring can avoid the latency problem.
> 
> Thanks for the info. What this tells us is basically the inflight
> requests are high. It's sad that the linux-aio is in practice
> implemented as a blocking API.
> 
> Host side backtrace will be of more help. Can you get that too?

I guess Linux AIO didn't set the BLK_MQ_REQ_NOWAIT flag so the task went
to sleep when it ran out of blk-mq tags. The easiest solution is to move
to io_uring. Linux AIO is broken - it's not AIO :).

If we know that no other process is writing to the host block device
then maybe we can determine the blk-mq tags limit (the queue depth) and
avoid sending more requests. That way QEMU doesn't block, but I don't
think this approach works when other processes are submitting I/O to the
same host block device :(.

Fam's original suggestion of invoking io_submit(2) from a worker thread
is an option, but I'm afraid it will slow down the uncontended case.

I'm CCing Glauber in case he battled this in the past in ScyllaDB.

Stefan

signature.asc
Description: PGP signature

Re: [PATCH] hw/input: expand trace info reported for ps2 device

2020-12-08 Thread Philippe Mathieu-Daudé

On 12/8/20 12:59 PM, Daniel P. Berrangé wrote:
> It is interesting to know if the PS2 keyboard is in translated mode, and
> which of the three scancode sets are in use.
> 
> Signed-off-by: Daniel P. Berrangé 
> ---
>  hw/input/ps2.c| 2 +-
>  hw/input/trace-events | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/input/ps2.c b/hw/input/ps2.c
> index 72cdb80ae1..237956aca2 100644
> --- a/hw/input/ps2.c
> +++ b/hw/input/ps2.c
> @@ -293,7 +293,7 @@ static void ps2_keyboard_event(DeviceState *dev, 
> QemuConsole *src,
>  qcode = qemu_input_key_value_to_qcode(key->key);
>  
>  mod = ps2_modifier_bit(qcode);
> -trace_ps2_keyboard_event(s, qcode, key->down, mod, s->modifiers);
> +trace_ps2_keyboard_event(s, qcode, key->down, mod, s->modifiers, 
> s->scancode_set, s->translate);

Long line...

Anyway,
Reviewed-by: Philippe Mathieu-Daudé

Re: [PATCH v3] tests/acceptance: test hot(un)plug of ccw devices

2020-12-08 Thread Thomas Huth

On 08/12/2020 13.28, Cornelia Huck wrote:
> Hotplug a virtio-net-ccw device, and then hotunplug it again.
> 
> Signed-off-by: Cornelia Huck 
> ---
> v2->v3:
> - do the dmesg cleanout and waiting for messages properly [Thomas]
> 
> Wainer: I dropped your r-b, as there had been too many changes for
> me to be comfortable with retaining it
> 
> ---
>  tests/acceptance/machine_s390_ccw_virtio.py | 24 +
>  1 file changed, 24 insertions(+)

Reviewed-by: Thomas Huth

Re: [PATCH] target/arm: do not use cc->do_interrupt for KVM directly

2020-12-08 Thread Philippe Mathieu-Daudé

On 12/7/20 10:50 PM, Peter Maydell wrote:
> On Mon, 7 Dec 2020 at 21:26, Eduardo Habkost  wrote:
>> My understanding is that there's no reason for ARM KVM to use
>> another approach, and that CPUClass.do_interrupt is not really
>> TCG-specific.
>>
>> Do we have any case where the CPUClass.do_interrupt
>> implementation is really TCG-specific, or it is just a
>> coincidence that most other accelerators simply don't to call the
>> method?  It looks like the only cases where the
>> CPUClass.do_interrupt assignment is conditional on CONFIG_TCG are
>> i386 and s390x.
> 
> Looking at PPC, its kvm_handle_debug() function does a
> direct call to ppc_cpu_do_interrupt(). So the code of
> its do_interrupt method must be ok-for-KVM, it's just that
> it doesn't use the method pointer. (It's doing the same thing
> Arm is -- if a debug event turns out not to be for QEMU itself,
> inject a suitable exception into the guest.)
> 
> So of our 5 KVM-supporting architectures:
> 
>  * i386 and s390x have kernel APIs for "inject suitable
>exception", don't need to call do_interrupt, and make
>the cc->do_interrupt assignment only ifdef CONFIG_TCG,
>so that the code for do_interrupt need not be compiled
>into a KVM-only binary. (In both cases the code for the
>function is in a source file that the meson.build puts
>into the source list only if CONFIG_TCG)
>  * ppc and arm both need to use this code even in a KVM
>only binary. Neither of them #ifdef the cc->do_interrupt
>assignment, because there's not much point at the moment
>if you're not going to try to compile out the code.
>ppc happens to do a direct function call, and arm happens
>to go via the cc->do_interrupt pointer, but I don't
>think there's much significance in the choice either way.
>In both cases, the only places making the call are within
>architecture-specific KVM code.
>  * mips KVM does neither of these things, probably because it is
>not sufficiently featureful to have run into the cases
>where you might want to re-inject an exception and it's
>not being sufficiently used in production for anybody to
>have looked at minimising the amount of code in a
>KVM-only QEMU binary for it.
> 
> So in conclusion we have a basically 50:50 split between
> "use the same do_interrupt code as TCG" and "have a kernel
> API to make the kernel do the work", plus one arch that
> probably hasn't had to make the choice yet.   ¯\_(ツ)_/¯

Why not introduce KVMCpuOperations similar to TCGCpuOperations
Claudio is introducing, and declare the do_interrupt(CPUState*)
in both structures?

Then we can assign the same handler to both fields, TCG keeps
calling cc->tcg->do_interrupt(), KVM calls cc->kvm->do_interrupt().
This allow building with a particular accelerator, while staying
compliant with the current 50:50 split...

> 
>> Oh, I thought you were arguing that CPUClass.do_interrupt is
>> not TCG_specific.
> 
> Well, I don't think it really is TCG-specific. But as
> a pragmatic thing, if these two lines in the Arm code
> are getting in the way of stopping us from having a
> useful compile-time check that code that's not supposed
> to call this method isn't calling it, I think the balance
> maybe leans towards just making the direct function call.
> I guess it depends whether you think people are likely to
> accidentally make cc->do_interrupt calls in non-target-specific
> code that gets used by KVM (which currently would crash if that
> code path is exercised on x86 or s390x, but under the
> proposed change would become a compile error).
> 
> thanks
> -- PMM
>

Re: [PATCH] s390x: pv: Fence additional unavailable SCLP facilities for PV guests

2020-12-08 Thread Christian Borntraeger




On 04.12.20 09:36, Janosch Frank wrote:
> There's no VSIE support for a protected guest, so let's better not
> advertise it and its support facilities.
> 
> Signed-off-by: Janosch Frank 

Looks sane. Assuming that all features that depend on SIE are named 
S390_FEAT_SIE_*
this should take care of everything. (i compared to gen-facilities.c)

> ---
> CI:
> https://gitlab.com/frankja/qemu/-/pipelines/224881703
> ---
>  target/s390x/cpu_features.c | 38 -
>  target/s390x/cpu_models.c   | 24 +--
>  2 files changed, 59 insertions(+), 3 deletions(-)
> 
> diff --git a/target/s390x/cpu_features.c b/target/s390x/cpu_features.c
> index 42fe0bf4ca..7d7ea8e3b8 100644
> --- a/target/s390x/cpu_features.c
> +++ b/target/s390x/cpu_features.c
> @@ -107,8 +107,44 @@ void s390_fill_feat_block(const S390FeatBitmap features, 
> S390FeatType type,
>  feat = find_next_bit(features, S390_FEAT_MAX, feat + 1);
>  }
>  
> -if (type == S390_FEAT_TYPE_SCLP_FAC134 && s390_is_pv()) {
> +if (!s390_is_pv()) {
> +return;
> +}
> +
> +/*
> + * Some facilities are not available for CPUs in protected mode:
> + * - All SIE facilities because SIE is not available
> + * - DIAG318
> + *
> + * As VMs can move in and out of protected mode the CPU model
> + * doesn't protect us from that problem because it is only
> + * validated at the start of the VM.
> + */
> +switch (type) {
> +case S390_FEAT_TYPE_SCLP_CPU:
> +clear_be_bit(s390_feat_def(S390_FEAT_SIE_F2)->bit, data);
> +clear_be_bit(s390_feat_def(S390_FEAT_SIE_SKEY)->bit, data);
> +clear_be_bit(s390_feat_def(S390_FEAT_SIE_GPERE)->bit, data);
> +clear_be_bit(s390_feat_def(S390_FEAT_SIE_SIIF)->bit, data);
> +clear_be_bit(s390_feat_def(S390_FEAT_SIE_SIGPIF)->bit, data);
> +clear_be_bit(s390_feat_def(S390_FEAT_SIE_IB)->bit, data);
> +clear_be_bit(s390_feat_def(S390_FEAT_SIE_CEI)->bit, data);
> +break;
> +case S390_FEAT_TYPE_SCLP_CONF_CHAR:
> +clear_be_bit(s390_feat_def(S390_FEAT_SIE_KSS)->bit, data);
> +clear_be_bit(s390_feat_def(S390_FEAT_SIE_GSLS)->bit, data);
> +break;
> +case S390_FEAT_TYPE_SCLP_CONF_CHAR_EXT:
> +clear_be_bit(s390_feat_def(S390_FEAT_SIE_64BSCAO)->bit, data);
> +clear_be_bit(s390_feat_def(S390_FEAT_SIE_CMMA)->bit, data);
> +clear_be_bit(s390_feat_def(S390_FEAT_SIE_PFMFI)->bit, data);
> +clear_be_bit(s390_feat_def(S390_FEAT_SIE_IBS)->bit, data);
> +break;
> +case S390_FEAT_TYPE_SCLP_FAC134:
>  clear_be_bit(s390_feat_def(S390_FEAT_DIAG_318)->bit, data);
> +break;
> +default:
> +return;
>  }
>  }
>  
> diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c
> index b5abff8bef..51feb71546 100644
> --- a/target/s390x/cpu_models.c
> +++ b/target/s390x/cpu_models.c
> @@ -239,8 +239,28 @@ bool s390_has_feat(S390Feat feat)
>  }
>  return 0;
>  }
> -if (feat == S390_FEAT_DIAG_318 && s390_is_pv()) {
> -return false;
> +
> +if (s390_is_pv()) {
> +switch (feat) {
> +case S390_FEAT_DIAG_318:
> +case S390_FEAT_SIE_F2:
> +case S390_FEAT_SIE_SKEY:
> +case S390_FEAT_SIE_GPERE:
> +case S390_FEAT_SIE_SIIF:
> +case S390_FEAT_SIE_SIGPIF:
> +case S390_FEAT_SIE_IB:
> +case S390_FEAT_SIE_CEI:
> +case S390_FEAT_SIE_KSS:
> +case S390_FEAT_SIE_GSLS:
> +case S390_FEAT_SIE_64BSCAO:
> +case S390_FEAT_SIE_CMMA:
> +case S390_FEAT_SIE_PFMFI:
> +case S390_FEAT_SIE_IBS:
> +return false;
> +break;
> +default:
> +break;
> +}
>  }
>  return test_bit(feat, cpu->model->features);
>  }
>

Re: [PATCH v2] qemu-nbd: Fix a memleak in nbd_client_thread()

2020-12-08 Thread Vladimir Sementsov-Ogievskiy


03.12.2020 16:58, Alex Chen wrote:

When the qio_channel_socket_connect_sync() fails
we should goto 'out_socket' label to free the 'sioc' instead of
goto 'out' label.
In addition, there's a lot of redundant code in the successful branch
and the error branch, optimize it.

Reported-by: Euler Robot 
Signed-off-by: Alex Chen 
Signed-off-by: Eric Blake 
---
  qemu-nbd.c | 38 +++---
  1 file changed, 15 insertions(+), 23 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index a7075c5419..9583ee1af6 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -265,8 +265,8 @@ static void *nbd_client_thread(void *arg)
  char *device = arg;
  NBDExportInfo info = { .request_sizes = false, .name = g_strdup("") };
  QIOChannelSocket *sioc;
-int fd;
-int ret;
+int fd = -1;
+int ret = EXIT_FAILURE;
  pthread_t show_parts_thread;
  Error *local_error = NULL;
  
@@ -278,26 +278,24 @@ static void *nbd_client_thread(void *arg)

  goto out;
  }
  
-ret = nbd_receive_negotiate(NULL, QIO_CHANNEL(sioc),

-NULL, NULL, NULL, &info, &local_error);
-if (ret < 0) {
+if (nbd_receive_negotiate(NULL, QIO_CHANNEL(sioc),
+  NULL, NULL, NULL, &info, &local_error) < 0) {
  if (local_error) {
  error_report_err(local_error);
  }
-goto out_socket;
+goto out;
  }
  
  fd = open(device, O_RDWR);

  if (fd < 0) {
  /* Linux-only, we can use %m in printf.  */
  error_report("Failed to open %s: %m", device);
-goto out_socket;
+goto out;
  }
  
-ret = nbd_init(fd, sioc, &info, &local_error);

-if (ret < 0) {
+if (nbd_init(fd, sioc, &info, &local_error) < 0) {
  error_report_err(local_error);
-goto out_fd;
+goto out;
  }
  
  /* update partition table */

@@ -311,24 +309,18 @@ static void *nbd_client_thread(void *arg)
  dup2(STDOUT_FILENO, STDERR_FILENO);
  }
  
-ret = nbd_client(fd);

-if (ret) {
-goto out_fd;
+if (nbd_client(fd) == 0) {
+ret = EXIT_SUCCESS;


It's not obvious that nbd_client() returns 0 on success, it calls ioctl(), 
which may return something positive in theory..

So, with s/==/>=/, or with just

if (nbd_client(fd) < 0) {
  goto out;
}

ret = EXIT_SUCCESS;


(which is good common pattern I think)

:

Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir

Re: [PATCH] file-posix: detect the lock using the real file

2020-12-08 Thread Daniel P . Berrangé

On Tue, Dec 08, 2020 at 08:59:37PM +0800, Li Feng wrote:
> This patch addresses this issue:
> When accessing a volume on an NFS filesystem without supporting the file lock,
> tools, like qemu-img, will complain "Failed to lock byte 100".
> 
> In the original code, the qemu_has_ofd_lock will test the lock on the
> "/dev/null" pseudo-file. Actually, the file.locking is per-drive property,
> which depends on the underlay filesystem.

IIUC, the problem you're describing is one of whether the filesystem
supports fcntl locking at all, which is indeed a per-FS check.

The QEMU code being changed though is just about detecting whether
the host OS supports OFD to not, which is supposed to be a kernel
level feature applied  universally to all FS types.

> 
> In this patch, make the 'qemu_has_ofd_lock' with a filename be more generic
> and reasonable.
> 
> Signed-off-by: Li Feng 
> ---
>  block/file-posix.c | 32 +++-
>  include/qemu/osdep.h   |  2 +-
>  tests/test-image-locking.c |  2 +-
>  util/osdep.c   | 43 --
>  4 files changed, 47 insertions(+), 32 deletions(-)
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 806764f7e3..03be1b188c 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -595,7 +595,7 @@ static int raw_open_common(BlockDriverState *bs, QDict 
> *options,
>  switch (locking) {
>  case ON_OFF_AUTO_ON:
>  s->use_lock = true;
> -if (!qemu_has_ofd_lock()) {
> +if (!qemu_has_ofd_lock(filename)) {
>  warn_report("File lock requested but OFD locking syscall is "
>  "unavailable, falling back to POSIX file locks");
>  error_printf("Due to the implementation, locks can be lost "
> @@ -606,7 +606,7 @@ static int raw_open_common(BlockDriverState *bs, QDict 
> *options,
>  s->use_lock = false;
>  break;
>  case ON_OFF_AUTO_AUTO:
> -s->use_lock = qemu_has_ofd_lock();
> +s->use_lock = qemu_has_ofd_lock(filename);
>  break;
>  default:
>  abort();
> @@ -2388,6 +2388,7 @@ raw_co_create(BlockdevCreateOptions *options, Error 
> **errp)
>  int fd;
>  uint64_t perm, shared;
>  int result = 0;
> +bool use_lock;
>  
>  /* Validate options and set default values */
>  assert(options->driver == BLOCKDEV_DRIVER_FILE);
> @@ -2428,19 +2429,22 @@ raw_co_create(BlockdevCreateOptions *options, Error 
> **errp)
>  perm = BLK_PERM_WRITE | BLK_PERM_RESIZE;
>  shared = BLK_PERM_ALL & ~BLK_PERM_RESIZE;
>  
> -/* Step one: Take locks */
> -result = raw_apply_lock_bytes(NULL, fd, perm, ~shared, false, errp);
> -if (result < 0) {
> -goto out_close;
> -}
> +use_lock = qemu_has_ofd_lock(file_opts->filename);
> +if (use_lock) {
> +/* Step one: Take locks */
> +result = raw_apply_lock_bytes(NULL, fd, perm, ~shared, false, errp);
> +if (result < 0) {
> +goto out_close;
> +}
>  
> -/* Step two: Check that nobody else has taken conflicting locks */
> -result = raw_check_lock_bytes(fd, perm, shared, errp);
> -if (result < 0) {
> -error_append_hint(errp,
> -  "Is another process using the image [%s]?\n",
> -  file_opts->filename);
> -goto out_unlock;
> +/* Step two: Check that nobody else has taken conflicting locks */
> +result = raw_check_lock_bytes(fd, perm, shared, errp);
> +if (result < 0) {
> +error_append_hint(errp,
> +  "Is another process using the image [%s]?\n",
> +  file_opts->filename);
> +goto out_unlock;
> +}
>  }
>  
>  /* Clear the file by truncating it to 0 */
> diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
> index f9ec8c84e9..349adad465 100644
> --- a/include/qemu/osdep.h
> +++ b/include/qemu/osdep.h
> @@ -512,7 +512,7 @@ int qemu_dup(int fd);
>  int qemu_lock_fd(int fd, int64_t start, int64_t len, bool exclusive);
>  int qemu_unlock_fd(int fd, int64_t start, int64_t len);
>  int qemu_lock_fd_test(int fd, int64_t start, int64_t len, bool exclusive);
> -bool qemu_has_ofd_lock(void);
> +bool qemu_has_ofd_lock(const char *filename);
>  #endif
>  
>  #if defined(__HAIKU__) && defined(__i386__)
> diff --git a/tests/test-image-locking.c b/tests/test-image-locking.c
> index ba057bd66c..3e80246081 100644
> --- a/tests/test-image-locking.c
> +++ b/tests/test-image-locking.c
> @@ -149,7 +149,7 @@ int main(int argc, char **argv)
>  
>  g_test_init(&argc, &argv, NULL);
>  
> -if (qemu_has_ofd_lock()) {
> +if (qemu_has_ofd_lock(NULL)) {
>  g_test_add_func("/image-locking/basic", test_image_locking_basic);
>  g_test_add_func("/image-locking/set-perm-abort", 
> test_set_perm_abort);
>  }
> diff --git a/util/osdep.c b/util/osdep.c
> index 66d01b9160..e7e502e

[PATCH 1/1] spapr.c: set a 'kvm-type' default value instead of relying on NULL

2020-12-08 Thread Daniel Henrique Barboza

spapr_kvm_type() is considering 'vm_type=NULL' as a valid input, where
the function returns 0. This is relying on the current QEMU machine
options handling logic, where the absence of the 'kvm-type' option
will be reflected as 'vm_type=NULL' in this function.

This is not robust, and will break if QEMU options code decides to propagate
something else in the case mentioned above (e.g. an empty string instead
of NULL).

Let's avoid this entirely by setting a non-NULL default value in case of
no user input for 'kvm-type'. spapr_kvm_type() was changed to handle 3 fixed
values of kvm-type: "HV", "PR" and, if no kvm-type was set by the user,
DEFAULT_KVM_TYPE. This allows us always be predictable regardless of any
enhancements/changes made in QEMU options mechanics.

While we're at it, let's also document in 'kvm-type' description what
happens if the user does not set this option. This information is based
on how the pseries kernel handles the KVM_CREATE_VM ioctl(), where the
default value '0' makes the kernel choose an available KVM module to
use, giving precedence to kvm_hv. This logic is described in the kernel
source file arch/powerpc/kvm/powerpc.c, function kvm_arch_init_vm().

Signed-off-by: Daniel Henrique Barboza 
---

The case I mentioned in the second paragraph is happening when we try to
execute a pseries guest with '--enable-kvm' using Paolo's patch 
"vl: make qemu_get_machine_opts static" [1]:

$ sudo ./ppc64-softmmu/qemu-system-ppc64 -nographic -nodefaults -machine 
pseries --enable-kvm
qemu-system-ppc64: Unknown kvm-type specified '' 

This happens due to the differences between how qemu_opt_get() and
object_property_get_str() works when there is no 'kvm-type' specified.
See [2] for more info about the issue found.


[1] https://lists.gnu.org/archive/html/qemu-devel/2020-12/msg00471.html
[2] https://lists.gnu.org/archive/html/qemu-devel/2020-12/msg01578.html 


 hw/ppc/spapr.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index b7e0894019..32d938dc6a 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3021,9 +3021,10 @@ static void spapr_machine_init(MachineState *machine)
 qemu_cond_init(&spapr->fwnmi_machine_check_interlock_cond);
 }
 
+#define DEFAULT_KVM_TYPE "auto"
 static int spapr_kvm_type(MachineState *machine, const char *vm_type)
 {
-if (!vm_type) {
+if (!strcmp(vm_type, DEFAULT_KVM_TYPE)) {
 return 0;
 }
 
@@ -3131,6 +3132,16 @@ static char *spapr_get_kvm_type(Object *obj, Error 
**errp)
 {
 SpaprMachineState *spapr = SPAPR_MACHINE(obj);
 
+/*
+ * In case the user didn't set 'kvm-type', return DEFAULT_KVM_TYPE
+ * instead of NULL. This allows us to be more predictable with what
+ * is expected to happen in spapr_kvm_type(), since we can stop relying
+ * on receiving a 'NULL' parameter as a valid input there.
+ */
+if (!spapr->kvm_type) {
+return g_strdup(DEFAULT_KVM_TYPE);
+}
+
 return g_strdup(spapr->kvm_type);
 }
 
@@ -3273,7 +3284,11 @@ static void spapr_instance_init(Object *obj)
 object_property_add_str(obj, "kvm-type",
 spapr_get_kvm_type, spapr_set_kvm_type);
 object_property_set_description(obj, "kvm-type",
-"Specifies the KVM virtualization mode 
(HV, PR)");
+"Specifies the KVM virtualization mode 
(HV, PR)."
+" If not specified, defaults to any 
available KVM"
+" module loaded in the host. In case both 
kvm_hv"
+" and kvm_pr are loaded, kvm_hv takes 
precedence.");
+
 object_property_add_bool(obj, "modern-hotplug-events",
 spapr_get_modern_hotplug_events,
 spapr_set_modern_hotplug_events);
-- 
2.26.2

Re: [PATCH] target/arm: do not use cc->do_interrupt for KVM directly

2020-12-08 Thread Claudio Fontana

On 12/8/20 2:27 PM, Philippe Mathieu-Daudé wrote:
> On 12/7/20 10:50 PM, Peter Maydell wrote:
>> On Mon, 7 Dec 2020 at 21:26, Eduardo Habkost  wrote:
>>> My understanding is that there's no reason for ARM KVM to use
>>> another approach, and that CPUClass.do_interrupt is not really
>>> TCG-specific.
>>>
>>> Do we have any case where the CPUClass.do_interrupt
>>> implementation is really TCG-specific, or it is just a
>>> coincidence that most other accelerators simply don't to call the
>>> method?  It looks like the only cases where the
>>> CPUClass.do_interrupt assignment is conditional on CONFIG_TCG are
>>> i386 and s390x.
>>
>> Looking at PPC, its kvm_handle_debug() function does a
>> direct call to ppc_cpu_do_interrupt(). So the code of
>> its do_interrupt method must be ok-for-KVM, it's just that
>> it doesn't use the method pointer. (It's doing the same thing
>> Arm is -- if a debug event turns out not to be for QEMU itself,
>> inject a suitable exception into the guest.)
>>
>> So of our 5 KVM-supporting architectures:
>>
>>  * i386 and s390x have kernel APIs for "inject suitable
>>exception", don't need to call do_interrupt, and make
>>the cc->do_interrupt assignment only ifdef CONFIG_TCG,
>>so that the code for do_interrupt need not be compiled
>>into a KVM-only binary. (In both cases the code for the
>>function is in a source file that the meson.build puts
>>into the source list only if CONFIG_TCG)
>>  * ppc and arm both need to use this code even in a KVM
>>only binary. Neither of them #ifdef the cc->do_interrupt
>>assignment, because there's not much point at the moment
>>if you're not going to try to compile out the code.
>>ppc happens to do a direct function call, and arm happens
>>to go via the cc->do_interrupt pointer, but I don't
>>think there's much significance in the choice either way.
>>In both cases, the only places making the call are within
>>architecture-specific KVM code.
>>  * mips KVM does neither of these things, probably because it is
>>not sufficiently featureful to have run into the cases
>>where you might want to re-inject an exception and it's
>>not being sufficiently used in production for anybody to
>>have looked at minimising the amount of code in a
>>KVM-only QEMU binary for it.
>>
>> So in conclusion we have a basically 50:50 split between
>> "use the same do_interrupt code as TCG" and "have a kernel
>> API to make the kernel do the work", plus one arch that
>> probably hasn't had to make the choice yet.   ¯\_(ツ)_/¯
> 
> Why not introduce KVMCpuOperations similar to TCGCpuOperations
> Claudio is introducing, and declare the do_interrupt(CPUState*)
> in both structures?
> 
> Then we can assign the same handler to both fields, TCG keeps
> calling cc->tcg->do_interrupt(), KVM calls cc->kvm->do_interrupt().
> This allow building with a particular accelerator, while staying
> compliant with the current 50:50 split...


Hi Philippe,

in principle interesting, but KVMCpuOperations would end up currently 
containing do_interrupt only..
seems a bit overkill for just one method.
Or where you thinking of ways to refactor current kvm code to use methods in 
CPUClass similarly to what Tcg does?

Ciao,

Claudio


> 
>>
>>> Oh, I thought you were arguing that CPUClass.do_interrupt is
>>> not TCG_specific.
>>
>> Well, I don't think it really is TCG-specific. But as
>> a pragmatic thing, if these two lines in the Arm code
>> are getting in the way of stopping us from having a
>> useful compile-time check that code that's not supposed
>> to call this method isn't calling it, I think the balance
>> maybe leans towards just making the direct function call.
>> I guess it depends whether you think people are likely to
>> accidentally make cc->do_interrupt calls in non-target-specific
>> code that gets used by KVM (which currently would crash if that
>> code path is exercised on x86 or s390x, but under the
>> proposed change would become a compile error).
>>
>> thanks
>> -- PMM
>>
>

Re: [PATCH] target/arm: do not use cc->do_interrupt for KVM directly

2020-12-08 Thread Claudio Fontana

On 12/8/20 2:51 PM, Claudio Fontana wrote:
> On 12/8/20 2:27 PM, Philippe Mathieu-Daudé wrote:
>> On 12/7/20 10:50 PM, Peter Maydell wrote:
>>> On Mon, 7 Dec 2020 at 21:26, Eduardo Habkost  wrote:
 My understanding is that there's no reason for ARM KVM to use
 another approach, and that CPUClass.do_interrupt is not really
 TCG-specific.

 Do we have any case where the CPUClass.do_interrupt
 implementation is really TCG-specific, or it is just a
 coincidence that most other accelerators simply don't to call the
 method?  It looks like the only cases where the
 CPUClass.do_interrupt assignment is conditional on CONFIG_TCG are
 i386 and s390x.
>>>
>>> Looking at PPC, its kvm_handle_debug() function does a
>>> direct call to ppc_cpu_do_interrupt(). So the code of
>>> its do_interrupt method must be ok-for-KVM, it's just that
>>> it doesn't use the method pointer. (It's doing the same thing
>>> Arm is -- if a debug event turns out not to be for QEMU itself,
>>> inject a suitable exception into the guest.)
>>>
>>> So of our 5 KVM-supporting architectures:
>>>
>>>  * i386 and s390x have kernel APIs for "inject suitable
>>>exception", don't need to call do_interrupt, and make
>>>the cc->do_interrupt assignment only ifdef CONFIG_TCG,
>>>so that the code for do_interrupt need not be compiled
>>>into a KVM-only binary. (In both cases the code for the
>>>function is in a source file that the meson.build puts
>>>into the source list only if CONFIG_TCG)
>>>  * ppc and arm both need to use this code even in a KVM
>>>only binary. Neither of them #ifdef the cc->do_interrupt
>>>assignment, because there's not much point at the moment
>>>if you're not going to try to compile out the code.
>>>ppc happens to do a direct function call, and arm happens
>>>to go via the cc->do_interrupt pointer, but I don't
>>>think there's much significance in the choice either way.
>>>In both cases, the only places making the call are within
>>>architecture-specific KVM code.
>>>  * mips KVM does neither of these things, probably because it is
>>>not sufficiently featureful to have run into the cases
>>>where you might want to re-inject an exception and it's
>>>not being sufficiently used in production for anybody to
>>>have looked at minimising the amount of code in a
>>>KVM-only QEMU binary for it.
>>>
>>> So in conclusion we have a basically 50:50 split between
>>> "use the same do_interrupt code as TCG" and "have a kernel
>>> API to make the kernel do the work", plus one arch that
>>> probably hasn't had to make the choice yet.   ¯\_(ツ)_/¯
>>
>> Why not introduce KVMCpuOperations similar to TCGCpuOperations
>> Claudio is introducing, and declare the do_interrupt(CPUState*)
>> in both structures?
>>
>> Then we can assign the same handler to both fields, TCG keeps
>> calling cc->tcg->do_interrupt(), KVM calls cc->kvm->do_interrupt().
>> This allow building with a particular accelerator, while staying
>> compliant with the current 50:50 split...
> 
> 
> Hi Philippe,
> 
> in principle interesting, but KVMCpuOperations would end up currently 
> containing do_interrupt only..
> seems a bit overkill for just one method.

I mean, all the others in CPUClass are common between TCG and KVM, I don't see 
a lot that is KVM-only there that would warrant a KVMCPUOps structure

> Or where you thinking of ways to refactor current kvm code to use methods in 
> CPUClass similarly to what Tcg does?
> 

But maybe this is where you were going with this?

Ciao,

C

> Ciao,
> 
> Claudio
> 
> 
>>
>>>
 Oh, I thought you were arguing that CPUClass.do_interrupt is
 not TCG_specific.
>>>
>>> Well, I don't think it really is TCG-specific. But as
>>> a pragmatic thing, if these two lines in the Arm code
>>> are getting in the way of stopping us from having a
>>> useful compile-time check that code that's not supposed
>>> to call this method isn't calling it, I think the balance
>>> maybe leans towards just making the direct function call.
>>> I guess it depends whether you think people are likely to
>>> accidentally make cc->do_interrupt calls in non-target-specific
>>> code that gets used by KVM (which currently would crash if that
>>> code path is exercised on x86 or s390x, but under the
>>> proposed change would become a compile error).
>>>
>>> thanks
>>> -- PMM
>>>
>>
>

Re: [PATCH v2] qemu-nbd: Fix a memleak in nbd_client_thread()

2020-12-08 Thread Alex Chen

On 2020/12/8 21:41, Vladimir Sementsov-Ogievskiy wrote:
> 03.12.2020 16:58, Alex Chen wrote:
>> When the qio_channel_socket_connect_sync() fails
>> we should goto 'out_socket' label to free the 'sioc' instead of
>> goto 'out' label.
>> In addition, there's a lot of redundant code in the successful branch
>> and the error branch, optimize it.
>>
>> Reported-by: Euler Robot 
>> Signed-off-by: Alex Chen 
>> Signed-off-by: Eric Blake 
>> ---
>>   qemu-nbd.c | 38 +++---
>>   1 file changed, 15 insertions(+), 23 deletions(-)
>>
>> diff --git a/qemu-nbd.c b/qemu-nbd.c
>> index a7075c5419..9583ee1af6 100644
>> --- a/qemu-nbd.c
>> +++ b/qemu-nbd.c
>> @@ -265,8 +265,8 @@ static void *nbd_client_thread(void *arg)
>>   char *device = arg;
>>   NBDExportInfo info = { .request_sizes = false, .name = g_strdup("") };
>>   QIOChannelSocket *sioc;
>> -int fd;
>> -int ret;
>> +int fd = -1;
>> +int ret = EXIT_FAILURE;
>>   pthread_t show_parts_thread;
>>   Error *local_error = NULL;
>>   @@ -278,26 +278,24 @@ static void *nbd_client_thread(void *arg)
>>   goto out;
>>   }
>>   -ret = nbd_receive_negotiate(NULL, QIO_CHANNEL(sioc),
>> -NULL, NULL, NULL, &info, &local_error);
>> -if (ret < 0) {
>> +if (nbd_receive_negotiate(NULL, QIO_CHANNEL(sioc),
>> +  NULL, NULL, NULL, &info, &local_error) < 0) {
>>   if (local_error) {
>>   error_report_err(local_error);
>>   }
>> -goto out_socket;
>> +goto out;
>>   }
>> fd = open(device, O_RDWR);
>>   if (fd < 0) {
>>   /* Linux-only, we can use %m in printf.  */
>>   error_report("Failed to open %s: %m", device);
>> -goto out_socket;
>> +goto out;
>>   }
>>   -ret = nbd_init(fd, sioc, &info, &local_error);
>> -if (ret < 0) {
>> +if (nbd_init(fd, sioc, &info, &local_error) < 0) {
>>   error_report_err(local_error);
>> -goto out_fd;
>> +goto out;
>>   }
>> /* update partition table */
>> @@ -311,24 +309,18 @@ static void *nbd_client_thread(void *arg)
>>   dup2(STDOUT_FILENO, STDERR_FILENO);
>>   }
>>   -ret = nbd_client(fd);
>> -if (ret) {
>> -goto out_fd;
>> +if (nbd_client(fd) == 0) {
>> +ret = EXIT_SUCCESS;
> 
> It's not obvious that nbd_client() returns 0 on success, it calls ioctl(), 
> which may return something positive in theory..
> 
> So, with s/==/>=/, or with just
> 
> if (nbd_client(fd) < 0) {
>   goto out;
> }
> 
> ret = EXIT_SUCCESS;
> 
> 
> (which is good common pattern I think)
> 
> :
> 

Thanks for your review, I will fix it and send patch v3.

Thanks,
Alex

[PATCH v3] qemu-nbd: Fix a memleak in nbd_client_thread()

2020-12-08 Thread Alex Chen

When the qio_channel_socket_connect_sync() fails
we should goto 'out_socket' label to free the 'sioc' instead of
goto 'out' label.
In addition, there's a lot of redundant code in the successful branch
and the error branch, optimize it.

Reported-by: Euler Robot 
Signed-off-by: Alex Chen 
Signed-off-by: Eric Blake 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 qemu-nbd.c | 40 +---
 1 file changed, 17 insertions(+), 23 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index a7075c5419..ee2fbc4cdb 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -265,8 +265,8 @@ static void *nbd_client_thread(void *arg)
 char *device = arg;
 NBDExportInfo info = { .request_sizes = false, .name = g_strdup("") };
 QIOChannelSocket *sioc;
-int fd;
-int ret;
+int fd = -1;
+int ret = EXIT_FAILURE;
 pthread_t show_parts_thread;
 Error *local_error = NULL;
 
@@ -278,26 +278,24 @@ static void *nbd_client_thread(void *arg)
 goto out;
 }
 
-ret = nbd_receive_negotiate(NULL, QIO_CHANNEL(sioc),
-NULL, NULL, NULL, &info, &local_error);
-if (ret < 0) {
+if (nbd_receive_negotiate(NULL, QIO_CHANNEL(sioc),
+  NULL, NULL, NULL, &info, &local_error) < 0) {
 if (local_error) {
 error_report_err(local_error);
 }
-goto out_socket;
+goto out;
 }
 
 fd = open(device, O_RDWR);
 if (fd < 0) {
 /* Linux-only, we can use %m in printf.  */
 error_report("Failed to open %s: %m", device);
-goto out_socket;
+goto out;
 }
 
-ret = nbd_init(fd, sioc, &info, &local_error);
-if (ret < 0) {
+if (nbd_init(fd, sioc, &info, &local_error) < 0) {
 error_report_err(local_error);
-goto out_fd;
+goto out;
 }
 
 /* update partition table */
@@ -311,24 +309,20 @@ static void *nbd_client_thread(void *arg)
 dup2(STDOUT_FILENO, STDERR_FILENO);
 }
 
-ret = nbd_client(fd);
-if (ret) {
-goto out_fd;
+if (nbd_client(fd) < 0) {
+goto out;
 }
-close(fd);
-object_unref(OBJECT(sioc));
-g_free(info.name);
-kill(getpid(), SIGTERM);
-return (void *) EXIT_SUCCESS;
 
-out_fd:
-close(fd);
-out_socket:
+ret = EXIT_SUCCESS;
+
+ out:
+if (fd >= 0) {
+close(fd);
+}
 object_unref(OBJECT(sioc));
-out:
 g_free(info.name);
 kill(getpid(), SIGTERM);
-return (void *) EXIT_FAILURE;
+return (void *) (intptr_t) ret;
 }
 #endif /* HAVE_NBD_DEVICE */
 
-- 
2.19.1

Re: [PATCH v12 16/19] multi-process: Synchronize remote memory

2020-12-08 Thread Marc-André Lureau

Hi

On Wed, Dec 2, 2020 at 12:23 AM Jagannathan Raman 
wrote:

> Add memory-listener object which is used to keep the view of the RAM
> in sync between QEMU and remote process.
> A MemoryListener is registered for system-memory AddressSpace. The
> listener sends SYNC_SYSMEM message to the remote process when memory
> listener commits the changes to memory, the remote process receives
> the message and processes it in the handler for SYNC_SYSMEM message.
>
> Signed-off-by: Jagannathan Raman 
> Signed-off-by: John G Johnson 
> Signed-off-by: Elena Ufimtseva 
> Reviewed-by: Stefan Hajnoczi 
> ---
>  include/hw/remote/memory-sync.h |  27 ++
>  include/hw/remote/proxy.h   |   2 +
>  hw/remote/memory-sync.c | 210
> 
>  hw/remote/message.c |   5 +
>  hw/remote/proxy.c   |   6 ++
>  MAINTAINERS |   2 +
>  hw/remote/meson.build   |   1 +
>  7 files changed, 253 insertions(+)
>  create mode 100644 include/hw/remote/memory-sync.h
>  create mode 100644 hw/remote/memory-sync.c
>
> diff --git a/include/hw/remote/memory-sync.h
> b/include/hw/remote/memory-sync.h
> new file mode 100644
> index 000..785f76a
> --- /dev/null
> +++ b/include/hw/remote/memory-sync.h
> @@ -0,0 +1,27 @@
> +/*
> + * Copyright © 2018, 2020 Oracle and/or its affiliates.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#ifndef MEMORY_SYNC_H
> +#define MEMORY_SYNC_H
> +
> +#include "exec/memory.h"
> +#include "io/channel.h"
> +
> +typedef struct RemoteMemSync {
> +MemoryListener listener;
> +
> +int n_mr_sections;
> +MemoryRegionSection *mr_sections;
> +
> +QIOChannel *ioc;
> +} RemoteMemSync;
> +
> +void configure_memory_sync(RemoteMemSync *sync, QIOChannel *ioc);
> +void deconfigure_memory_sync(RemoteMemSync *sync);
>

RemoteMemSync vs MemorySync, and function with _memory_sync suffixes...
Naming things is hard, but trying to be consistent generally helps.

My understanding is that this is a proxy-dev helper to handle memory
listening and sending SYNC_SYSMEM.

I would thus suggest naming it ProxyMemoryListener. It could eventually be
folded in proxy.c

Please try to be consistent with header naming, structure naming, type,
functions and enum prefixes etc.

proxy_memory_listener isn't that long imho.

+
> +#endif
> diff --git a/include/hw/remote/proxy.h b/include/hw/remote/proxy.h
> index e29c61b..a687b7d 100644
> --- a/include/hw/remote/proxy.h
> +++ b/include/hw/remote/proxy.h
> @@ -11,6 +11,7 @@
>
>  #include "hw/pci/pci.h"
>  #include "io/channel.h"
> +#include "hw/remote/memory-sync.h"
>
>  #define TYPE_PCI_PROXY_DEV "x-pci-proxy-dev"
>
> @@ -40,6 +41,7 @@ struct PCIProxyDev {
>  QemuMutex io_mutex;
>  QIOChannel *ioc;
>  Error *migration_blocker;
> +RemoteMemSync sync;
>  ProxyMemoryRegion region[PCI_NUM_REGIONS];
>  };
>
> diff --git a/hw/remote/memory-sync.c b/hw/remote/memory-sync.c
> new file mode 100644
> index 000..2365e69
> --- /dev/null
> +++ b/hw/remote/memory-sync.c
> @@ -0,0 +1,210 @@
> +/*
> + * Copyright © 2018, 2020 Oracle and/or its affiliates.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu-common.h"
> +
> +#include "qemu/compiler.h"
> +#include "qemu/int128.h"
> +#include "qemu/range.h"
> +#include "exec/memory.h"
> +#include "exec/cpu-common.h"
> +#include "cpu.h"
> +#include "exec/ram_addr.h"
> +#include "exec/address-spaces.h"
> +#include "hw/remote/mpqemu-link.h"
> +#include "hw/remote/memory-sync.h"
> +
> +static void proxy_ml_begin(MemoryListener *listener)
>

I suggest to rename begin -> reset

+{
> +RemoteMemSync *sync = container_of(listener, RemoteMemSync, listener);
> +int mrs;
> +
> +for (mrs = 0; mrs < sync->n_mr_sections; mrs++) {
> +memory_region_unref(sync->mr_sections[mrs].mr);
> +}
> +
> +g_free(sync->mr_sections);
> +sync->mr_sections = NULL;
> +sync->n_mr_sections = 0;
> +}
> +
> +static int get_fd_from_hostaddr(uint64_t host, ram_addr_t *offset)
>

This function is very similar to vhost_user_get_mr_data(). That suggests we
could factor the code.

Perhaps a new memory_region_from_host_full(), or extend
memory_region_from_host() with an extra optional "int *fd" argument.


> +{
> +MemoryRegion *mr;
> +ram_addr_t off;
> +
> +/**
> + * Assumes that the host address is a valid address as it's
> + * coming from the MemoryListener system. In the case host
> + * address is not valid, the following call would return
> + * the default subregion of "system_memory" region, and
> + * not NULL. So it's not possible to check for NULL here.
> + */
> +mr = memory_region_from_host((void *)(uintptr_t)host, &off);
> +
> +if (offset)

Re: [PATCHv3 00/17] ARMv8.4 Secure EL2

2020-12-08 Thread Alex Bennée



Rémi Denis-Courmont  writes:

>   Hi,
>
> Le tiistaina 1. joulukuuta 2020, 20.23.46 EET Peter Maydell a écrit :
>> > The base tests fail without the patchset seem to assume an US American
>> > locale, which is frankly infuriatingly culturally insensitive.
>> 
>> I run them with en_GB.UTF-8, which works fine for me, but it's a bug
>> in the test suite if something's locale dependent and not ensuring
>> it is set correctly.
>
> For a start, it seems to be barfing on strsignal() localisation by the shell, 
> printing French instead of "Killed" on SIGKILL.
>
> % locale
> LANG=fr_FR.UTF-8
> LANGUAGE=fr:en_GB:fi
> LC_CTYPE="fi_FI.UTF-8"
> ...
>
>> > As for the acceptance tests fail equally early without the patchset with a
>> > completely helpless diagnostic about unresolved references. Wiki does not
>> > help either.
>> 
>> I just run "make -C my-build-dir check-acceptance"; I don't know anything
>> about the internals. It would help if you quoted the error messages
>> you see.
>
>  AVOCADO tests/acceptance
> Unable to resolve reference(s) 'tests/acceptance' with plugins(s) 'file', 
> 'tap', 'external', try running 'avocado list -V tests/acceptance' to see the 
> details.
> make: *** [/home/remi/dev/qemu/tests/Makefile.include:125 : check-acceptance] 
> Erreur 2
>
> % avocado list -V tests/acceptance
> bash: avocado : commande introuvable
>
> % tests/venv/bin/avocado list -V tests/acceptance
> usage: avocado list [-h] [--loaders [LIST.LOADERS ...]]
>
> Wiki implies that dependencies are automatically installed, but I
> guess not?

They should be in the venv that is built when you run the test the first
time. Running the above command without the -V which it doesn't
recognise gives me a list:

  14:12:25 [alex@zen:~/l/q/b/all] 
xen/guest-loader-and-arm-build-cleanups-v2|✚3…(+5/-5) + 
./tests/venv/bin/avocado list tests/acceptance/ | head -n 10
  INSTRUMENTED tests/acceptance/boot_linux.py:BootLinuxX8664.test_pc_i440fx_tcg
  INSTRUMENTED tests/acceptance/boot_linux.py:BootLinuxX8664.test_pc_i440fx_kvm
  INSTRUMENTED tests/acceptance/boot_linux.py:BootLinuxX8664.test_pc_q35_tcg
  INSTRUMENTED tests/acceptance/boot_linux.py:BootLinuxX8664.test_pc_q35_kvm
  INSTRUMENTED tests/acceptance/boot_linux.py:BootLinuxAarch64.test_virt_tcg
  INSTRUMENTED 
tests/acceptance/boot_linux.py:BootLinuxAarch64.test_virt_kvm_gicv2
  INSTRUMENTED 
tests/acceptance/boot_linux.py:BootLinuxAarch64.test_virt_kvm_gicv3
  INSTRUMENTED tests/acceptance/boot_linux.py:BootLinuxPPC64.test_pseries_tcg
  INSTRUMENTED 
tests/acceptance/boot_linux.py:BootLinuxS390X.test_s390_ccw_virtio_tcg
  INSTRUMENTED 
tests/acceptance/boot_linux_console.py:BootLinuxConsole.test_x86_64_pc
  Exception ignored in: <_io.TextIOWrapper name='' mode='w' 
encoding='UTF-8'>
  BrokenPipeError: [Errno 32] Broken pipe

I wonder are you running an in-tree build?

>
> Br,


-- 
Alex Bennée

Re: [PATCH 1/3] block: Simplify qmp_block_resize() error paths

2020-12-08 Thread Vladimir Sementsov-Ogievskiy


03.12.2020 20:23, Kevin Wolf wrote:

The only thing that happens after the 'out:' label is blk_unref(blk).
However, blk = NULL in all of the error cases, so instead of jumping to
'out:', we can just return directly.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Kevin Wolf 
---
  blockdev.c | 7 +++
  1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index fe6fb5dc1d..229d2cce1b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2468,17 +2468,17 @@ void coroutine_fn qmp_block_resize(bool has_device, 
const char *device,
  
  if (size < 0) {

  error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "size", "a >0 size");
-goto out;
+return;
  }
  
  if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_RESIZE, NULL)) {

  error_setg(errp, QERR_DEVICE_IN_USE, device);
-goto out;
+return;
  }
  
  blk = blk_new_with_bs(bs, BLK_PERM_RESIZE, BLK_PERM_ALL, errp);

  if (!blk) {
-goto out;
+return;
  }
  
  bdrv_drained_begin(bs);

@@ -2487,7 +2487,6 @@ void coroutine_fn qmp_block_resize(bool has_device, const 
char *device,
  bdrv_co_leave(bs, old_ctx);
  bdrv_drained_end(bs);
  
-out:

  bdrv_co_lock(bs);
  blk_unref(blk);
  bdrv_co_unlock(bs);



Initialization of blk to NULL becomes redundant with this patch, so may be 
dropped too. Anyway:

Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir

Re: [PATCH 0/3] virtiofsd: Fix lo_flush() and inode->posix_lock init

2020-12-08 Thread Vivek Goyal

On Tue, Dec 08, 2020 at 05:51:34AM +0100, Laszlo Ersek wrote:
> Hi Vivek,
> 
> On 12/07/20 19:30, Vivek Goyal wrote:
> > Laszlo is writing a virtiofs client for OVMF and noticed that if he
> > sends fuse FLUSH command for directory object, virtiofsd crashes.
> > virtiofsd does not expect a FLUSH arriving for a directory object.
> > 
> > This patch series has one of the patches which fixes that. It also
> > has couple of posix lock fixes as a result of lo_flush() related debugging.
> > 
> > Vivek Goyal (3):
> >   virtiofsd: Set up posix_lock hash table for root inode
> >   virtiofsd: Disable posix_lock hash table if remote locks are not
> > enabled
> >   virtiofsd: Check file type in lo_flush()
> > 
> >  tools/virtiofsd/passthrough_ll.c | 54 +++-
> >  1 file changed, 39 insertions(+), 15 deletions(-)
> > 
> 
> I put back the (wrong) FLUSH for the root dir into my code temporarily, to 
> reproduce the crash (it does, with v5.2.0-rc4).
> 
> Then I applied your series [*], and retested.
> 
> [*] I'm unsure about the email you sent in response to 1/3, namely 
> ; I 
> ignored that when applying the patches.

Hi Laszlo,

Thank you for the testing.

I reposed patch 1 to take care of coding style issues. Functionally both
the versions are same.

> 
> Indeed now I get a graceful -EBADF:
> 
> [13316825985314] [ID: 0004] unique: 60, opcode: FLUSH (25), nodeid: 1, 
> insize: 64, pid: 1
> [13316825993517] [ID: 0004]unique: 60, error: -9 (Bad file 
> descriptor), outsize: 16
> 
> For whichever patch in the series my testing is relevant:
> 
> Tested-by: Laszlo Ersek 
> 
> (I'm having some difficulty figuring out which patch(es) should carry my T-b.
> 
> - I think I didn't really test patch#2 with the above, so that one should 
> likely not get the T-b
> 
> - I think patch#3 is what I really tested.
> 
> - But, if that's the case, doesn't patch#3 make the fix in patch#1 
> untestable, in my scenario? I believe the code is no longer reached in 
> lo_flush(), due to patch#3, where the change from patch#1 would matter. 
> Patch#1 seems correct, and the last paragraph of its commit message relevant, 
> but I think my testing currently only covered patch#3.
> 
> I'll let you decide where to apply my T-b.)

David Gilbert can add your Tested-by: while applying this patch series.
I think adding it to patch 3 makes most sense.

Thanks
Vivek

Re: [PATCH 0/4] tracetool: show trace-events filename/lineno in fmt string errors

2020-12-08 Thread Stefan Hajnoczi

On Thu, Aug 27, 2020 at 03:29:11PM +0100, Stefan Hajnoczi wrote:
> This patch series improves format string compiler error messages. Instead of
> showing the generated file, show the trace-events file where the format string
> is defined.
> 
> Stefan Hajnoczi (4):
>   tracetool: add output filename command-line argument
>   tracetool: add out_lineno and out_next_lineno to out()
>   tracetool: add input filename and line number to Event
>   tracetool: show trace-events filename/lineno in fmt string errors
> 
>  docs/devel/tracing.txt  |  3 +-
>  meson.build |  3 +-
>  scripts/tracetool.py| 12 ---
>  scripts/tracetool/__init__.py   | 53 +
>  scripts/tracetool/backend/ftrace.py |  4 +++
>  scripts/tracetool/backend/log.py|  4 +++
>  scripts/tracetool/backend/syslog.py |  4 +++
>  trace/meson.build   | 23 +
>  8 files changed, 76 insertions(+), 30 deletions(-)
> 
> -- 
> 2.26.2
> 

Thanks, applied to my tracing-next tree:
https://gitlab.com/stefanha/qemu/commits/tracing-next

Stefan


signature.asc
Description: PGP signature

[PATCH v3 1/2] crypto: luks: Fix tiny memory leak

2020-12-08 Thread Maxim Levitsky

When the underlying block device doesn't support the
bdrv_co_delete_file interface, an 'Error' object was leaked.

Signed-off-by: Maxim Levitsky 
---
 block/crypto.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/crypto.c b/block/crypto.c
index aef5a5721a..b3a5275132 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -735,6 +735,8 @@ fail:
  */
 if ((r_del < 0) && (r_del != -ENOTSUP)) {
 error_report_err(local_delete_err);
+} else {
+error_free(local_delete_err);
 }
 }
 
-- 
2.26.2

[PATCH v3 0/2] qcow2: don't leave partially initialized file on image creation

2020-12-08 Thread Maxim Levitsky

Use the bdrv_co_delete_file interface to delete the underlying
file if qcow2 initialization fails (e.g due to bad encryption secret)

This makes the qcow2 driver behave the same way as the luks driver behaves.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1845353

V3: addressed review feedback and reworked commit messages

Best regards,
Maxim Levitsky

Maxim Levitsky (2):
  crypto: luks: Fix tiny memory leak
  block: qcow2: remove the created file on initialization error

 block/crypto.c |  2 ++
 block/qcow2.c  | 13 +
 2 files changed, 15 insertions(+)

-- 
2.26.2

[PATCH v3 2/2] block: qcow2: remove the created file on initialization error

2020-12-08 Thread Maxim Levitsky

If the qcow initialization fails, we should remove the file if it was
already created, to avoid leaving stale files around.

We already do this for luks raw images.

Signed-off-by: Maxim Levitsky 
---
 block/qcow2.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 3a90ef2786..3bc2096b72 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3848,6 +3848,19 @@ static int coroutine_fn qcow2_co_create_opts(BlockDriver 
*drv,
 /* Create the qcow2 image (format layer) */
 ret = qcow2_co_create(create_options, errp);
 if (ret < 0) {
+
+Error *local_delete_err = NULL;
+int r_del = bdrv_co_delete_file(bs, &local_delete_err);
+/*
+ * ENOTSUP will happen if the block driver doesn't support
+ * the 'bdrv_co_delete_file' interface. This is a predictable
+ * scenario and shouldn't be reported back to the user.
+ */
+if ((r_del < 0) && (r_del != -ENOTSUP)) {
+error_report_err(local_delete_err);
+} else {
+error_free(local_delete_err);
+}
 goto finish;
 }
 
-- 
2.26.2

Re: [PATCH 1/1] spapr.c: set a 'kvm-type' default value instead of relying on NULL

2020-12-08 Thread Greg Kurz

Hi Daniel,

On Tue,  8 Dec 2020 10:45:36 -0300
Daniel Henrique Barboza  wrote:

> spapr_kvm_type() is considering 'vm_type=NULL' as a valid input, where
> the function returns 0. This is relying on the current QEMU machine
> options handling logic, where the absence of the 'kvm-type' option
> will be reflected as 'vm_type=NULL' in this function.
> 
> This is not robust, and will break if QEMU options code decides to propagate
> something else in the case mentioned above (e.g. an empty string instead
> of NULL).
> 
> Let's avoid this entirely by setting a non-NULL default value in case of
> no user input for 'kvm-type'. spapr_kvm_type() was changed to handle 3 fixed
> values of kvm-type: "HV", "PR" and, if no kvm-type was set by the user,
> DEFAULT_KVM_TYPE. This allows us always be predictable regardless of any
> enhancements/changes made in QEMU options mechanics.
> 
> While we're at it, let's also document in 'kvm-type' description what
> happens if the user does not set this option. This information is based
> on how the pseries kernel handles the KVM_CREATE_VM ioctl(), where the
> default value '0' makes the kernel choose an available KVM module to
> use, giving precedence to kvm_hv. This logic is described in the kernel
> source file arch/powerpc/kvm/powerpc.c, function kvm_arch_init_vm().
> 
> Signed-off-by: Daniel Henrique Barboza 
> ---
> 
> The case I mentioned in the second paragraph is happening when we try to
> execute a pseries guest with '--enable-kvm' using Paolo's patch 
> "vl: make qemu_get_machine_opts static" [1]:
> 
> $ sudo ./ppc64-softmmu/qemu-system-ppc64 -nographic -nodefaults -machine 
> pseries --enable-kvm
> qemu-system-ppc64: Unknown kvm-type specified '' 
> 
> This happens due to the differences between how qemu_opt_get() and
> object_property_get_str() works when there is no 'kvm-type' specified.
> See [2] for more info about the issue found.
> 
> 
> [1] https://lists.gnu.org/archive/html/qemu-devel/2020-12/msg00471.html
> [2] https://lists.gnu.org/archive/html/qemu-devel/2020-12/msg01578.html 
> 
> 
>  hw/ppc/spapr.c | 19 +--
>  1 file changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index b7e0894019..32d938dc6a 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3021,9 +3021,10 @@ static void spapr_machine_init(MachineState *machine)
>  qemu_cond_init(&spapr->fwnmi_machine_check_interlock_cond);
>  }
>  
> +#define DEFAULT_KVM_TYPE "auto"
>  static int spapr_kvm_type(MachineState *machine, const char *vm_type)
>  {
> -if (!vm_type) {
> +if (!strcmp(vm_type, DEFAULT_KVM_TYPE)) {
>  return 0;
>  }
>  
> @@ -3131,6 +3132,16 @@ static char *spapr_get_kvm_type(Object *obj, Error 
> **errp)
>  {
>  SpaprMachineState *spapr = SPAPR_MACHINE(obj);
>  
> +/*
> + * In case the user didn't set 'kvm-type', return DEFAULT_KVM_TYPE
> + * instead of NULL. This allows us to be more predictable with what
> + * is expected to happen in spapr_kvm_type(), since we can stop relying
> + * on receiving a 'NULL' parameter as a valid input there.
> + */

Returning what is obviously a default value is straightforward enough
that is doesn't need to a comment IMHO.

> +if (!spapr->kvm_type) {
> +return g_strdup(DEFAULT_KVM_TYPE);
> +}
> +
>  return g_strdup(spapr->kvm_type);

Alternatively you could simply do:

return g_strdup(spapr->kvm_type ? spapr->kvm_type : DEFAULT_KVM_TYPE);

>  }
>  
> @@ -3273,7 +3284,11 @@ static void spapr_instance_init(Object *obj)
>  object_property_add_str(obj, "kvm-type",
>  spapr_get_kvm_type, spapr_set_kvm_type);
>  object_property_set_description(obj, "kvm-type",
> -"Specifies the KVM virtualization mode 
> (HV, PR)");
> +"Specifies the KVM virtualization mode 
> (HV, PR)."

Why not documenting "auto" as a valid option as well ?

While here you could maybe convert HV and PR to lowercase in order to
have a prettier "hv, pr, auto" set of valid values in the help string.
You'd need to convert the related checks in spapr_kvm_type() to still
check for the uppercase versions for compatibility, eg. by using
g_ascii_strcasecmp().

> +" If not specified, defaults to any 
> available KVM"
> +" module loaded in the host. In case 
> both kvm_hv"
> +" and kvm_pr are loaded, kvm_hv takes 
> precedence.");
> +

s/If not specified/If 'auto'/ and mention 'auto' as being the default value.

>  object_property_add_bool(obj, "modern-hotplug-events",
>  spapr_get_modern_hotplug_events,
>  spapr_set_modern_hotplug_events);

Re: [PATCH] target/arm: do not use cc->do_interrupt for KVM directly

2020-12-08 Thread Philippe Mathieu-Daudé

On 12/8/20 2:55 PM, Claudio Fontana wrote:
> On 12/8/20 2:51 PM, Claudio Fontana wrote:
>> On 12/8/20 2:27 PM, Philippe Mathieu-Daudé wrote:
>>> On 12/7/20 10:50 PM, Peter Maydell wrote:
 On Mon, 7 Dec 2020 at 21:26, Eduardo Habkost  wrote:
> My understanding is that there's no reason for ARM KVM to use
> another approach, and that CPUClass.do_interrupt is not really
> TCG-specific.
>
> Do we have any case where the CPUClass.do_interrupt
> implementation is really TCG-specific, or it is just a
> coincidence that most other accelerators simply don't to call the
> method?  It looks like the only cases where the
> CPUClass.do_interrupt assignment is conditional on CONFIG_TCG are
> i386 and s390x.

 Looking at PPC, its kvm_handle_debug() function does a
 direct call to ppc_cpu_do_interrupt(). So the code of
 its do_interrupt method must be ok-for-KVM, it's just that
 it doesn't use the method pointer. (It's doing the same thing
 Arm is -- if a debug event turns out not to be for QEMU itself,
 inject a suitable exception into the guest.)

 So of our 5 KVM-supporting architectures:

  * i386 and s390x have kernel APIs for "inject suitable
exception", don't need to call do_interrupt, and make
the cc->do_interrupt assignment only ifdef CONFIG_TCG,
so that the code for do_interrupt need not be compiled
into a KVM-only binary. (In both cases the code for the
function is in a source file that the meson.build puts
into the source list only if CONFIG_TCG)
  * ppc and arm both need to use this code even in a KVM
only binary. Neither of them #ifdef the cc->do_interrupt
assignment, because there's not much point at the moment
if you're not going to try to compile out the code.
ppc happens to do a direct function call, and arm happens
to go via the cc->do_interrupt pointer, but I don't
think there's much significance in the choice either way.
In both cases, the only places making the call are within
architecture-specific KVM code.
  * mips KVM does neither of these things, probably because it is
not sufficiently featureful to have run into the cases
where you might want to re-inject an exception and it's
not being sufficiently used in production for anybody to
have looked at minimising the amount of code in a
KVM-only QEMU binary for it.

 So in conclusion we have a basically 50:50 split between
 "use the same do_interrupt code as TCG" and "have a kernel
 API to make the kernel do the work", plus one arch that
 probably hasn't had to make the choice yet.   ¯\_(ツ)_/¯
>>>
>>> Why not introduce KVMCpuOperations similar to TCGCpuOperations
>>> Claudio is introducing, and declare the do_interrupt(CPUState*)
>>> in both structures?
>>>
>>> Then we can assign the same handler to both fields, TCG keeps
>>> calling cc->tcg->do_interrupt(), KVM calls cc->kvm->do_interrupt().
>>> This allow building with a particular accelerator, while staying
>>> compliant with the current 50:50 split...
>>
>>
>> Hi Philippe,
>>
>> in principle interesting, but KVMCpuOperations would end up currently 
>> containing do_interrupt only..
>> seems a bit overkill for just one method.

I don't see this being a problem, if this makes code clearer
(think about maintainability).

> I mean, all the others in CPUClass are common between TCG and KVM, I don't 
> see a lot that is KVM-only there that would warrant a KVMCPUOps structure
> 
>> Or where you thinking of ways to refactor current kvm code to use methods in 
>> CPUClass similarly to what Tcg does?
>>
> 
> But maybe this is where you were going with this?

No, not really. I'm looking for a design to enforce correctness,
while keeping the 2 choices Peter mentioned available.

- "use the same do_interrupt code as TCG":

cc->tcg.do_interrupt = x86_cpu_do_interrupt;
cc->kvm.do_interrupt = NULL;

cc->tcg.do_interrupt = s390_cpu_do_interrupt;
cc->kvm.do_interrupt = NULL;

- "have a kernel API to make the kernel do the work"

cc->tcg.do_interrupt = arm_cpu_do_interrupt;
cc->kvm.do_interrupt = arm_cpu_do_interrupt;

cc->tcg.do_interrupt = ppc_cpu_do_interrupt;
cc->kvm.do_interrupt = ppc_cpu_do_interrupt;

Looks easy to review, hard to misplace #ifdef'ry.

> 
> Ciao,
> 
> C

Re: [PATCH] file-posix: detect the lock using the real file

2020-12-08 Thread Kevin Wolf

Am 08.12.2020 um 13:59 hat Li Feng geschrieben:
> This patch addresses this issue:
> When accessing a volume on an NFS filesystem without supporting the file lock,
> tools, like qemu-img, will complain "Failed to lock byte 100".
> 
> In the original code, the qemu_has_ofd_lock will test the lock on the
> "/dev/null" pseudo-file. Actually, the file.locking is per-drive property,
> which depends on the underlay filesystem.
> 
> In this patch, make the 'qemu_has_ofd_lock' with a filename be more generic
> and reasonable.
> 
> Signed-off-by: Li Feng 

Do you know any way how I could configure either the NFS server or the
NFS client such that locking would fail? For any patch related to this,
it would be good if I could even test the scenario.

For this specific patch, I think Daniel has already provided a good
explanation of the fundamental problems it has.

Kevin

Re: [PATCH v2 6/9] vnc: add alpha cursor support

2020-12-08 Thread Daniel P . Berrangé

On Tue, Dec 08, 2020 at 12:57:34PM +0100, Gerd Hoffmann wrote:
> There is a new vnc extension for cursors with an alpha channel.  Use
> it if supported by the vnc client, prefer it over the "rich cursor"
> extension which supports only a bitmask for transparency.
> 
> This is a visible improvement especially on modern desktops which
> actually use the alpha channel when defining cursors.
> 
> https://github.com/rfbproto/rfbproto/blob/master/rfbproto.rst#cursor-with-alpha-pseudo-encoding
> 
> Signed-off-by: Gerd Hoffmann 
> Reviewed-by: Marc-André Lureau 
> ---
>  ui/vnc.h |  2 ++
>  ui/vnc.c | 21 ++---
>  2 files changed, 20 insertions(+), 3 deletions(-)

Add implemented the client side in GTK-VNC and tested against
this QEMU patch:

  https://gitlab.gnome.org/GNOME/gtk-vnc/-/merge_requests/8

So

Tested-by: Daniel P. Berrangé 
Reviewed-by: Daniel P. Berrangé 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH v2 1/9] console: drop qemu_console_get_ui_info

2020-12-08 Thread Daniel P . Berrangé

On Tue, Dec 08, 2020 at 12:57:29PM +0100, Gerd Hoffmann wrote:
> Unused and duplicate (there is dpy_get_ui_info).
> 
> Signed-off-by: Gerd Hoffmann 
> ---
>  include/ui/console.h | 1 -
>  ui/console.c | 6 --
>  2 files changed, 7 deletions(-)

Reviewed-by: Daniel P. Berrangé 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH v2 2/9] console: allow con==NULL in dpy_{get, set}_ui_info and dpy_ui_info_supported

2020-12-08 Thread Daniel P . Berrangé

On Tue, Dec 08, 2020 at 12:57:30PM +0100, Gerd Hoffmann wrote:
> Use active_console in that case like we do in many other places.
> 
> Signed-off-by: Gerd Hoffmann 
> ---
>  ui/console.c | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)

Reviewed-by: Daniel P. Berrangé 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH 2/3] block: Fix locking in qmp_block_resize()

2020-12-08 Thread Vladimir Sementsov-Ogievskiy


03.12.2020 20:23, Kevin Wolf wrote:

The drain functions assume that we hold the AioContext lock of the
drained block node. Make sure to actually take the lock.

Cc: qemu-sta...@nongnu.org
Fixes: eb94b81a94bce112e6b206df846c1551aaf6cab6
Signed-off-by: Kevin Wolf 
---
  blockdev.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 229d2cce1b..0535a8dc9e 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2481,13 +2481,16 @@ void coroutine_fn qmp_block_resize(bool has_device, 
const char *device,
  return;
  }
  
+bdrv_co_lock(bs);

  bdrv_drained_begin(bs);
+bdrv_co_unlock(bs);
+
  old_ctx = bdrv_co_enter(bs);
  blk_truncate(blk, size, false, PREALLOC_MODE_OFF, 0, errp);
  bdrv_co_leave(bs, old_ctx);
-bdrv_drained_end(bs);
  
  bdrv_co_lock(bs);

+bdrv_drained_end(bs);
  blk_unref(blk);
  bdrv_co_unlock(bs);
  }



Can't we just do

old_ctx = bdrv_co_enter(bs);

bdrv_drained_begin(bs);

blk_truncate(blk, size, false, PREALLOC_MODE_OFF, 0, errp);
  
bdrv_drained_end(bs);

blk_unref(blk);

bdrv_co_leave(bs, old_ctx);


? This way we have one acquire/release section instead of three in a row.. But 
then we probably need addition bdrv_ref/bdrv_unref, to not crash with final 
bdrv_co_leave after blk_unref.

Also, preexisting, but it seems not good that coroutine_fn qmp_block_resize is 
called from non-coroutine hmp_block_resize()

anyway:
Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir

1 2 3 4 5 >

1 - 100 of 453 matches

Mail list logo