date:20161124

Re: [Qemu-devel] [PATCH v1 09/10] target-ppc: add vextu[bhw]lx instructions

2016-11-24 Thread Richard Henderson


On 11/24/2016 06:53 AM, Nikunj A Dadhania wrote:

David Gibson  writes:


[ Unknown signature status ]
On Wed, Nov 23, 2016 at 05:07:18PM +0530, Nikunj A Dadhania wrote:

From: Avinesh Kumar 

vextublx:  Vector Extract Unsigned Byte Left
vextuhlx:  Vector Extract Unsigned Halfword Left
vextuwlx:  Vector Extract Unsigned Word Left

Signed-off-by: Avinesh Kumar 
Signed-off-by: Nikunj A Dadhania 


So, when I suggested doing these without helpers before, I had
forgotten that the non-byte versions can straddle the word boundary.
Given that the offset is in a register, not the instruction that does
make it complicated.

But, this version also relies on working 128-bit arithmetic, AFAICT
this will just fail to build if CONFIG_INT128 isn't defined.


It has both the implementation, just that the defines might have
confused you:

#if defined(HOST_WORDS_BIGENDIAN)

#  if defined(CONFIG_INT128)
#  else
#  endif

#else /* !defined (HOST_WORDS_BIGENDIAN) */

#  if defined(CONFIG_INT128)
#  else
#  endif

#endif


In include/qemu/int128.h, we do have int128_rshift.  So you don't *really* have 
to do this by hand, exactly.



r~

Re: [Qemu-devel] [PATCH v1 09/10] target-ppc: add vextu[bhw]lx instructions

2016-11-24 Thread Nikunj A Dadhania

Richard Henderson  writes:

> On 11/24/2016 06:53 AM, Nikunj A Dadhania wrote:
>> David Gibson  writes:
>>
>>> [ Unknown signature status ]
>>> On Wed, Nov 23, 2016 at 05:07:18PM +0530, Nikunj A Dadhania wrote:
 From: Avinesh Kumar 

 vextublx:  Vector Extract Unsigned Byte Left
 vextuhlx:  Vector Extract Unsigned Halfword Left
 vextuwlx:  Vector Extract Unsigned Word Left

 Signed-off-by: Avinesh Kumar 
 Signed-off-by: Nikunj A Dadhania 
>>>
>>> So, when I suggested doing these without helpers before, I had
>>> forgotten that the non-byte versions can straddle the word boundary.
>>> Given that the offset is in a register, not the instruction that does
>>> make it complicated.
>>>
>>> But, this version also relies on working 128-bit arithmetic, AFAICT
>>> this will just fail to build if CONFIG_INT128 isn't defined.
>>
>> It has both the implementation, just that the defines might have
>> confused you:
>>
>> #if defined(HOST_WORDS_BIGENDIAN)
>>
>> #  if defined(CONFIG_INT128)
>> #  else
>> #  endif
>>
>> #else /* !defined (HOST_WORDS_BIGENDIAN) */
>>
>> #  if defined(CONFIG_INT128)
>> #  else
>> #  endif
>>
>> #endif
>
> In include/qemu/int128.h, we do have int128_rshift.  So you don't *really* 
> have 
> to do this by hand, exactly.

Sure, let me add int128_extract as well. Will be helpful.

Regards
Nikunj

Re: [Qemu-devel] [PATCH 1/4] 9pfs: move pdus to V9fsState

2016-11-24 Thread Greg Kurz

On Mon, 21 Nov 2016 13:39:29 -0800
Stefano Stabellini  wrote:

> pdus are initialized and used in 9pfs common code. Move the array from
> V9fsVirtioState to V9fsState.
> 
> Signed-off-by: Stefano Stabellini 
> ---

Reviewed-by: Greg Kurz 

>  hw/9pfs/9p.c| 7 +++
>  hw/9pfs/9p.h| 1 +
>  hw/9pfs/virtio-9p.h | 1 -
>  3 files changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> index aea7e9d..05e950f 100644
> --- a/hw/9pfs/9p.c
> +++ b/hw/9pfs/9p.c
> @@ -3440,7 +3440,6 @@ void pdu_submit(V9fsPDU *pdu)
>  /* Returns 0 on success, 1 on failure. */
>  int v9fs_device_realize_common(V9fsState *s, Error **errp)
>  {
> -V9fsVirtioState *v = container_of(s, V9fsVirtioState, state);
>  int i, len;
>  struct stat stat;
>  FsDriverEntry *fse;
> @@ -3451,9 +3450,9 @@ int v9fs_device_realize_common(V9fsState *s, Error 
> **errp)
>  QLIST_INIT(&s->free_list);
>  QLIST_INIT(&s->active_list);
>  for (i = 0; i < (MAX_REQ - 1); i++) {
> -QLIST_INSERT_HEAD(&s->free_list, &v->pdus[i], next);
> -v->pdus[i].s = s;
> -v->pdus[i].idx = i;
> +QLIST_INSERT_HEAD(&s->free_list, &s->pdus[i], next);
> +s->pdus[i].s = s;
> +s->pdus[i].idx = i;
>  }
>  
>  v9fs_path_init(&path);
> diff --git a/hw/9pfs/9p.h b/hw/9pfs/9p.h
> index 3976b7f..07cee01 100644
> --- a/hw/9pfs/9p.h
> +++ b/hw/9pfs/9p.h
> @@ -229,6 +229,7 @@ typedef struct V9fsState
>  char *tag;
>  enum p9_proto_version proto_version;
>  int32_t msize;
> +V9fsPDU pdus[MAX_REQ];
>  /*
>   * lock ensuring atomic path update
>   * on rename.
> diff --git a/hw/9pfs/virtio-9p.h b/hw/9pfs/virtio-9p.h
> index 25c47c7..52c4b9d 100644
> --- a/hw/9pfs/virtio-9p.h
> +++ b/hw/9pfs/virtio-9p.h
> @@ -10,7 +10,6 @@ typedef struct V9fsVirtioState
>  VirtIODevice parent_obj;
>  VirtQueue *vq;
>  size_t config_size;
> -V9fsPDU pdus[MAX_REQ];
>  VirtQueueElement *elems[MAX_REQ];
>  V9fsState state;
>  } V9fsVirtioState;

Re: [Qemu-devel] [PATCH 2/4] 9pfs: introduce transport specific callbacks

2016-11-24 Thread Greg Kurz

On Mon, 21 Nov 2016 13:39:30 -0800
Stefano Stabellini  wrote:

> Don't call virtio functions from 9pfs generic code, use generic function
> callbacks instead.
> 
> Signed-off-by: Stefano Stabellini 
> ---

Just a couple of indentation and line over 80 characters nits. I'll fix them
before pushing to 9p-next.

Reviewed-by: Greg Kurz 

>  hw/9pfs/9p.c   |  8 
>  hw/9pfs/9p.h   | 18 ++
>  hw/9pfs/virtio-9p-device.c | 18 ++
>  hw/9pfs/virtio-9p.h|  9 -
>  4 files changed, 36 insertions(+), 17 deletions(-)
> 
> diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> index 05e950f..5a20a13 100644
> --- a/hw/9pfs/9p.c
> +++ b/hw/9pfs/9p.c
> @@ -47,7 +47,7 @@ ssize_t pdu_marshal(V9fsPDU *pdu, size_t offset, const char 
> *fmt, ...)
>  va_list ap;
>  
>  va_start(ap, fmt);
> -ret = virtio_pdu_vmarshal(pdu, offset, fmt, ap);
> +ret = pdu->s->transport->pdu_vmarshal(pdu, offset, fmt, ap);
>  va_end(ap);
>  
>  return ret;
> @@ -59,7 +59,7 @@ ssize_t pdu_unmarshal(V9fsPDU *pdu, size_t offset, const 
> char *fmt, ...)
>  va_list ap;
>  
>  va_start(ap, fmt);
> -ret = virtio_pdu_vunmarshal(pdu, offset, fmt, ap);
> +ret = pdu->s->transport->pdu_vunmarshal(pdu, offset, fmt, ap);
>  va_end(ap);
>  
>  return ret;
> @@ -67,7 +67,7 @@ ssize_t pdu_unmarshal(V9fsPDU *pdu, size_t offset, const 
> char *fmt, ...)
>  
>  static void pdu_push_and_notify(V9fsPDU *pdu)
>  {
> -virtio_9p_push_and_notify(pdu);
> +pdu->s->transport->push_and_notify(pdu);
>  }
>  
>  static int omode_to_uflags(int8_t mode)
> @@ -1751,7 +1751,7 @@ static void v9fs_init_qiov_from_pdu(QEMUIOVector *qiov, 
> V9fsPDU *pdu,
>  struct iovec *iov;
>  unsigned int niov;
>  
> -virtio_init_iov_from_pdu(pdu, &iov, &niov, is_write);
> +pdu->s->transport->init_iov_from_pdu(pdu, &iov, &niov, is_write);
>  
>  qemu_iovec_init_external(&elem, iov, niov);
>  qemu_iovec_init(qiov, niov);
> diff --git a/hw/9pfs/9p.h b/hw/9pfs/9p.h
> index 07cee01..ab398d0 100644
> --- a/hw/9pfs/9p.h
> +++ b/hw/9pfs/9p.h
> @@ -230,6 +230,7 @@ typedef struct V9fsState
>  enum p9_proto_version proto_version;
>  int32_t msize;
>  V9fsPDU pdus[MAX_REQ];
> +struct V9fsTransport *transport;
>  /*
>   * lock ensuring atomic path update
>   * on rename.
> @@ -343,4 +344,21 @@ void pdu_free(V9fsPDU *pdu);
>  void pdu_submit(V9fsPDU *pdu);
>  void v9fs_reset(V9fsState *s);
>  
> +struct V9fsTransport {
> +ssize_t (*pdu_vmarshal)(V9fsPDU *pdu, size_t offset, const char 
> *fmt, va_list ap);

over 80 characters

> +ssize_t (*pdu_vunmarshal)(V9fsPDU *pdu, size_t offset, const char 
> *fmt, va_list ap);

ditto

> +void(*init_iov_from_pdu)(V9fsPDU *pdu, struct iovec **piov,
> +  unsigned int *pniov, bool is_write);

indent

> +void(*push_and_notify)(V9fsPDU *pdu);
> +};
> +
> +static inline int v9fs_register_transport(V9fsState *s, struct V9fsTransport 
> *t)
> +{
> +if (s->transport) {
> +return -EINVAL;
> +}
> +s->transport = t;
> +return 0;
> +}
> +
>  #endif
> diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
> index 1782e4a..e1a37a4 100644
> --- a/hw/9pfs/virtio-9p-device.c
> +++ b/hw/9pfs/virtio-9p-device.c
> @@ -20,7 +20,9 @@
>  #include "hw/virtio/virtio-access.h"
>  #include "qemu/iov.h"
>  
> -void virtio_9p_push_and_notify(V9fsPDU *pdu)
> +static struct V9fsTransport virtio_9p_transport;
> +
> +static void virtio_9p_push_and_notify(V9fsPDU *pdu)
>  {
>  V9fsState *s = pdu->s;
>  V9fsVirtioState *v = container_of(s, V9fsVirtioState, state);
> @@ -126,6 +128,7 @@ static void virtio_9p_device_realize(DeviceState *dev, 
> Error **errp)
>  v->config_size = sizeof(struct virtio_9p_config) + strlen(s->fsconf.tag);
>  virtio_init(vdev, "virtio-9p", VIRTIO_ID_9P, v->config_size);
>  v->vq = virtio_add_queue(vdev, MAX_REQ, handle_9p_output);
> +v9fs_register_transport(s, &virtio_9p_transport);
>  
>  out:
>  return;
> @@ -148,7 +151,7 @@ static void virtio_9p_reset(VirtIODevice *vdev)
>  v9fs_reset(&v->state);
>  }
>  
> -ssize_t virtio_pdu_vmarshal(V9fsPDU *pdu, size_t offset,
> +static ssize_t virtio_pdu_vmarshal(V9fsPDU *pdu, size_t offset,
>  const char *fmt, va_list ap)

indent

>  {
>  V9fsState *s = pdu->s;
> @@ -158,7 +161,7 @@ ssize_t virtio_pdu_vmarshal(V9fsPDU *pdu, size_t offset,
>  return v9fs_iov_vmarshal(elem->in_sg, elem->in_num, offset, 1, fmt, ap);
>  }
>  
> -ssize_t virtio_pdu_vunmarshal(V9fsPDU *pdu, size_t offset,
> +static ssize_t virtio_pdu_vunmarshal(V9fsPDU *pdu, size_t offset,
>const char *fmt, va_list ap)

indent

>  {
>  V9fsState *s = pdu->s;
> @@ -168,7 +171,7 @@ ssize_t virtio_pdu_vunmarshal(V9fsPDU *pdu, size_t offset,
>  return v9fs_iov_vunmarshal(elem->out_sg, elem->out_num, offset,

Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI

2016-11-24 Thread Laszlo Ersek

On 11/24/16 05:29, Michael S. Tsirkin wrote:
> On Wed, Nov 23, 2016 at 07:38:35PM -0500, Kevin O'Connor wrote:
>> As a general comment - it does seem unfortunate that we keep building
>> adhoc interfaces to communicate information from firmware to QEMU.  We
>> have a generic mechanism (fw_cfg) for passing adhoc information from
>> QEMU to the firmware, but the inverse seems to always involve magic
>> pci registers, magic io space registers, specific init ordering, etc.
> 
> FWIW I posted a proposal
>   fw-cfg: support writeable blobs
> a while ago to try to address that
> 

Yes, here's the discussion (Feb 2016):

https://www.mail-archive.com/qemu-devel@nongnu.org/msg354852.html

and it was even part of a pull req (Mar 2016):

https://www.mail-archive.com/qemu-devel@nongnu.org/msg359348.html

but it wasn't merged, apparently.

If QEMU (re)gains this feature, I can try basing the broadcast SMI
negotiation on it.

Thanks
Laszlo

[Qemu-devel] [PATCH for-2.8] target-m68k: fix muluw/mulsw

2016-11-24 Thread Laurent Vivier

"The multiplier and multiplicand are both word operands, and the result
is a long-word operand."

So compute flags on a long-word result, not on a word result.

Signed-off-by: Laurent Vivier 
---
 target-m68k/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index d2d6816..d6ed883 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -1186,7 +1186,7 @@ DISAS_INSN(mulw)
 SRC_EA(env, src, OS_WORD, sign, NULL);
 tcg_gen_mul_i32(tmp, tmp, src);
 tcg_gen_mov_i32(reg, tmp);
-gen_logic_cc(s, tmp, OS_WORD);
+gen_logic_cc(s, tmp, OS_LONG);
 }
 
 DISAS_INSN(divw)
-- 
2.7.4

[Qemu-devel] [Bug 1626972] Re: QEMU memfd_create fallback mechanism change for security drivers

2016-11-24 Thread Thomas Huth

Commit 0d34fbabc13 is upstream, so setting this to "Fix committed", too.

** Changed in: qemu
   Status: In Progress => Fix Committed

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1626972

Title:
  QEMU memfd_create fallback mechanism change for security drivers

Status in Ubuntu Cloud Archive:
  In Progress
Status in QEMU:
  Fix Committed
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Xenial:
  Fix Committed
Status in qemu source package in Yakkety:
  In Progress
Status in qemu source package in Zesty:
  Fix Released

Bug description:
  [Impact]

   * Updated QEMU (from UCA) live migration doesn't work with 3.13 kernels.
   * QEMU code checks if it can create /tmp/memfd-XXX files wrongly.
   * Apparmor will block access to /tmp/ and QEMU will fail migrating.

  [Test Case]

   * Install 2 Ubuntu Trusty (3.13) + UCA Mitaka + apparmor rules.
   * Try to live-migration from one to another. 
   * Apparmor will block creation of /tmp/memfd-XXX files.

  [Regression Potential]

   Pros:
   * Exhaustively tested this.
   * Worked with upstream on this fix. 
   * I'm implementing new vhost log mechanism for upstream.
   * One line change to a blocker that is already broken.

   Cons:
   * To break live migration in other circumstances. 

  [Other Info]

   * Christian Ehrhardt has been following this.

  ORIGINAL DESCRIPTION:

  When libvirt starts using apparmor, and creating apparmor profiles for
  every virtual machine created in the compute nodes, mitaka qemu (2.5 -
  and upstream also) uses a fallback mechanism for creating shared
  memory for live-migrations. This fall back mechanism, on kernels 3.13
  - that don't have memfd_create() system-call, try to create files on
  /tmp/ directory and fails.. causing live-migration not to work.

  Trusty with kernel 3.13 + Mitaka with qemu 2.5 + apparmor capability =
  can't live migrate.

  From qemu 2.5, logic is on :

  void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals, int 
*fd)
  {
  if (memfd_create)... ### only works with HWE kernels

  else ### 3.13 kernels, gets blocked by apparmor
     tmpdir = g_get_tmp_dir
     ...
     mfd = mkstemp(fname)
  }

  And you can see the errors:

  From the host trying to send the virtual machine:

  2016-08-15 16:36:26.160 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Migration operation has aborted
  2016-08-15 16:36:26.248 1974 ERROR nova.virt.libvirt.driver 
[req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 
133ebc3585c041aebaead8c062cd6511 - - -] [instance: 
2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Live Migration failure: internal error: 
unable to execute QEMU command 'migrate': Migration disabled: failed to 
allocate shared memory

  From the host trying to receive the virtual machine:

  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.356794] type=1400 
audit(1471289779.791:72): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12565 comm="apparmor_parser"
  Aug 15 16:36:19 tkcompute01 kernel: [ 1194.357048] type=1400 
audit(1471289779.791:73): apparmor="STATUS" operation="profile_load" 
profile="unconfined" name="qemu_bridge_helper" pid=12565 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.877027] type=1400 
audit(1471289780.311:74): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" 
pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.904407] type=1400 
audit(1471289780.343:75): apparmor="STATUS" operation="profile_replace" 
profile="unconfined" name="qemu_bridge_helper" pid=12613 comm="apparmor_parser"
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.973064] type=1400 
audit(1471289780.407:76): apparmor="DENIED" operation="mknod" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/memfd-tNpKSj" 
pid=12625 comm="qemu-system-x86" requested_mask="c" denied_mask="c" fsuid=107 
ouid=107
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979871] type=1400 
audit(1471289780.411:77): apparmor="DENIED" operation="open" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/" pid=12625 
comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 ouid=0
  Aug 15 16:36:20 tkcompute01 kernel: [ 1194.979881] type=1400 
audit(1471289780.411:78): apparmor="DENIED" operation="open" 
profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/var/tmp/" 
pid=12625 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=107 
ouid=0

  When leaving libvirt without apparmor capabilities (thus not confining
  virtual machines on compute nodes, the live migration works as

[Qemu-devel] Asynchronous / synchronous IO emulation

2016-11-24 Thread Dong, Eddie

Hi:
 Under certain situation, we have a requirement to apply the virtual 
DMA event in a deterministic way. Current Qemu uses asynchronous approach to 
emulate the virtual IO events since quite long time ago. I am wondering if we 
still have an option to choose the synchronous emulation solution.

 Thanks, Eddie

Re: [Qemu-devel] [dpdk-dev] dpdk/vpp and cross-version migration for vhost

2016-11-24 Thread Kevin Traynor

On 11/24/2016 06:31 AM, Yuanhan Liu wrote:
> On Tue, Nov 22, 2016 at 04:53:05PM +0200, Michael S. Tsirkin wrote:
 You keep assuming that you have the VM started first and
 figure out things afterwards, but this does not work.

 Think about a cluster of machines. You want to start a VM in
 a way that will ensure compatibility with all hosts
 in a cluster.
>>>
>>> I see. I was more considering about the case when the dst
>>> host (including the qemu and dpdk combo) is given, and
>>> then determine whether it will be a successfull migration
>>> or not.
>>>
>>> And you are asking that we need to know which host could
>>> be a good candidate before starting the migration. In such
>>> case, we indeed need some inputs from both the qemu and
>>> vhost-user backend.
>>>
>>> For DPDK, I think it could be simple, just as you said, it
>>> could be either a tiny script, or even a macro defined in
>>> the source code file (we extend it every time we add a
>>> new feature) to let the libvirt to read it. Or something
>>> else.
>>
>> There's the issue of APIs that tweak features as Maxime
>> suggested.
> 
> Yes, it's a good point.
> 
>> Maybe the only thing to do is to deprecate it,
> 
> Looks like so.
> 
>> but I feel some way for application to pass info into
>> guest might be benefitial.
> 
> The two APIs are just for tweaking feature bits DPDK supports before
> any device got connected. It's another way to disable some features
> (the another obvious way is to through QEMU command lines).
> 
> IMO, it's bit handy only in a case like: we have bunch of VMs. Instead
> of disabling something though qemu one by one, we could disable it
> once in DPDK.
> 
> But I doubt the useful of it. It's only used in DPDK's vhost example
> after all. Nor is it used in vhost pmd, neither is it used in OVS.

rte_vhost_feature_disable() is currently used in OVS, lib/netdev-dpdk.c

netdev_dpdk_vhost_class_init(void)
{
static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;

/* This function can be called for different classes.  The
initialization
 * needs to be done only once */
if (ovsthread_once_start(&once)) {
rte_vhost_driver_callback_register(&virtio_net_device_ops);
rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_HOST_TSO4
  | 1ULL << VIRTIO_NET_F_HOST_TSO6
  | 1ULL << VIRTIO_NET_F_CSUM);

> 
 If you don't, guest visible interface will change
 and you won't be able to migrate.

 It does not make sense to discuss feature bits specifically
 since that is not the only part of interface.
 For example, max ring size supported might change.
>>>
>>> I don't quite understand why we have to consider the max ring
>>> size here? Isn't it a virtio device attribute, that QEMU could
>>> provide such compatibility information?
>>>
>>> I mean, DPDK is supposed to support vary vring size, it's QEMU
>>> to give a specifc value.
>>
>> If backend supports s/g of any size up to 2^16, there's no issue.
> 
> I don't know others, but I see no issues in DPDK.
> 
>> ATM some backends might be assuming up to 1K s/g since
>> QEMU never supported bigger ones. We might classify this
>> as a bug, or not and add a feature flag.
>>
>> But it's just an example. There might be more values at issue
>> in the future.
> 
> Yeah, maybe. But we could analysis it one by one.
> 
 Let me describe how it works in qemu/libvirt.
 When you install a VM, you can specify compatibility
 level (aka "machine type"), and you can query the supported compatibility
 levels. Management uses that to find the supported compatibility
 and stores the compatibility in XML that is migrated with the VM.
 There's also a way to find the latest level which is the
 default unless overridden by user, again this level
 is recorded and then
 - management can make sure migration destination is compatible
 - management can avoid migration to hosts without that support
>>>
>>> Thanks for the info, it helps.
>>>
>>> ...
 As version here is an opaque string for libvirt and qemu,
 anything can be used - but I suggest either a list
 of values defining the interface, e.g.
 any_layout=on,max_ring=256
 or a version including the name and vendor of the backend,
 e.g. "org.dpdk.v4.5.6".
>>>
>>> The version scheme may not be ideal here. Assume a QEMU is supposed
>>> to work with a specific DPDK version, however, user may disable some
>>> newer features through qemu command line, that it also could work with
>>> an elder DPDK version. Using the version scheme will not allow us doing
>>> such migration to an elder DPDK version. The MTU is a lively example
>>> here? (when MTU feature is provided by QEMU but is actually disabled
>>> by user, that it could also work with an elder DPDK without MTU support).
>>>
>>> --yliu
>>
>> OK, so does a list of values look better to you then?
> 
>

Re: [Qemu-devel] [RFC 00/15] qmp: Report supported device types on 'query-machines'

2016-11-24 Thread Marcel Apfelbaum


On 11/23/2016 07:35 PM, Eduardo Habkost wrote:

On Wed, Nov 23, 2016 at 06:43:16PM +0200, Marcel Apfelbaum wrote:

On 11/22/2016 03:11 AM, Eduardo Habkost wrote:

The Problem
===

Currently management software has no way to find out which device
types can be plugged in a machine, unless the machine is already
initialized.



Hi Eduardo,
Thank you for this interesting series. I think this is a problem
worth addressing.



[...]



PCI vs PCIe
---

Machines with PCIe buses will report INTERFACE_PCIE_DEVICE on
supported-device-types.

Machines with legacy PCI buses will report TYPE_PCI_DEVICE on
supported-device-types.

The problem with the current approach is that PCIe devices are
TYPE_PCI_DEVICE subclasses. The allows PCI device classes to
indicate they are PCIe-capable, but there's no obvious way to
indicate that a device is PCIe-only. This needs to be addressed
in a future version of this series.

Suggestions are welcome.



As we talked offline, I personally like an interface IPCIType
with a field having 3 possible values {pci/pcie/hybrid}

To understand how hybrid works we need some rules, like
"pci on pci bus/pcie on pcie bus"
"pcie on a non-root pcie bus/pcie otherwise

I don't think we'll have a lot of rules, simple boolean fields
for the interface should be enough.


What you propose makes sense, the only difference is that the
boolean fields would be just interface names that can be used as
the "implements" argument on qom-list-types.

e.g.:

* Hybrid PCI device-types would implement both "legacy-pci-device" and
  "pcie-device" interfaces.
* PCIe-only device-types would implement only the "pcie-device"
  interface.
* Legacy-PCI-only device-types would implement only the
  "legacy-pci-device" interface.

Then, the bus instances would have a 'accepted-device-types'
field:

* A legacy PCI bus would accept only "legacy-pci-device" devices.
* A PCIe-only bus would accept only "pcie-device" devices.
* A PCIe bus that accepts legacy PCI devices (the root bus?)
  would accept both "legacy-pci-device" and "pcie-device"
  devices.

Then, query-machines would return the list of bus instances that
machine init creates, containing the bus ID, bus type, and
accepted-device-types.

Does that make sense?



Sure, the described approach solves the problem.



This still does not solve the problem that some devices makes
sense only on a specific arch.


Right now, we can simply compile out arch-specific devices that
can never be plugged in a QEMU binary. But if we still want them
compiled in for some reason, we can categorize them on a
different type/interface name, and the corresponding machine-type
would tell management that their buses accept only the
arch-specific type/interface name.



Adding an Interface for each arch could work, yes.






Incomplete bus lists on some machines
-

With this series, not all machines classes are changed to add the
full list of device types on the 'supported-device-types'. To
allow the code to be updated gradually, qmp-machine-info.py has a
STRICT_ARCHES variable, that will make the test code require a
complete device type list only on some architectures.

Out of scope: Configurable buses


There's no way to map machine options like "usb=on|off" to
device-types or buses. I plan to propose a new interface that
allows machine options to be mapped to buses/device-types later.

Out of scope: Deciding where to plug devices


Once management software discovers which devices can be plugged
to a machine, it still has to discover or define where devices
can/should/will be plugged. This is out of the scope of this
series.



That's a pitty :(
I was hoping this series will solve this issue. But if it prepares
the grounds for it is also a good step .


The bus ID will be in the scope of v2.



Thanks,
Marcel






Thanks,
Marcel



[...]

Re: [Qemu-devel] [PATCH] hw/pci: disable pci-bridge's shpc by default

2016-11-24 Thread Marcel Apfelbaum


On 11/24/2016 06:06 AM, David Gibson wrote:

On Wed, Nov 23, 2016 at 01:08:46PM +0200, Marcel Apfelbaum wrote:

On 11/22/2016 10:25 PM, Laurent Vivier wrote:



On 22/11/2016 18:26, Marcel Apfelbaum wrote:

On 11/18/2016 05:52 PM, Andrew Jones wrote:

On Wed, Nov 16, 2016 at 07:05:25PM +0200, Marcel Apfelbaum wrote:

On 11/16/2016 06:44 PM, Andrew Jones wrote:

On Sat, Nov 05, 2016 at 06:46:34PM +0200, Marcel Apfelbaum wrote:

On 11/03/2016 09:40 PM, Michael S. Tsirkin wrote:

On Thu, Nov 03, 2016 at 01:05:44PM +0200, Marcel Apfelbaum wrote:

On 11/03/2016 06:18 AM, Michael S. Tsirkin wrote:

On Wed, Nov 02, 2016 at 05:16:42PM +0200, Marcel Apfelbaum wrote:

The shpc component is optional while  ACPI hotplug is used
for hot-plugging PCI devices into a PCI-PCI bridge.
Disabling the shpc by default will make slot 0 usable at boot time


Hi Michael



at the cost of breaking all hotplug for all non-acpi users.



Do we have a non-acpi user that is able to use the shpc component
as-is today?


power and some arm systems I guess?



Adding Andrew , maybe he can give us an answer.


Not really :-) My lack of PCI knowledge makes that difficult. I'd be
happy
to help with an experiment though. Can you give me command line
arguments,
qmp commands, etc. that I should use to try it out? I imagine I should
just boot an ARM guest using DT (instead of ACPI) and then attempt to
hotplug a PCI device. I'm not sure, however, what, if any, special
configuration I need in order to ensure I'm testing what you're
interested in.



Hi Drew,


Just run QEMU with '-device pci-bridge,chassis_nr=1,id=bridge1
-monitor stdio'
with an ARM guest using DT and wait until the guest finish booting.

Then run at hmp:
device_add virtio-net-pci,bus=bridge1,id=net2

Next run lspci in the guest to see the new device.


Thanks for the instructions Marcel. Here's the results

 $QEMU -machine virt,accel=$ACCEL -cpu $CPU -nographic -m 4096 -smp 8 \
   -bios /usr/share/AAVMF/AAVMF_CODE.fd \
   -device pci-bridge,chassis_nr=1,id=bridge1 \
   -drive file=$FEDORA_IMG,if=none,id=dr0,format=qcow2 \
   -device virtio-blk-pci,bus=bridge1,addr=01,drive=dr0,id=disk0 \
   -netdev user,id=hostnet0 \
   -device virtio-net-pci,bus=bridge1,addr=02,netdev=hostnet0,id=net0

 # lspci
 00:00.0 Host bridge: Red Hat, Inc. Device 0008
 00:01.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge
 01:01.0 SCSI storage controller: Red Hat, Inc Virtio block device
 01:02.0 Ethernet controller: Red Hat, Inc Virtio network device

 (qemu) device_add virtio-net-pci,bus=bridge1,id=net2
 Unsupported PCI slot 0 for standard hotplug controller. Valid slots are
 between 1 and 31.

(Tried again giving addr=03)

 (qemu) device_add virtio-net-pci,bus=bridge1,id=net2,addr=03

(Seemed to work, but...)

 # lspci
 00:00.0 Host bridge: Red Hat, Inc. Device 0008
 00:01.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge
 01:01.0 SCSI storage controller: Red Hat, Inc Virtio block device
 01:02.0 Ethernet controller: Red Hat, Inc Virtio network device

(Doesn't show up in lscpi. So I guess it doesn't work)



Hi Drew,
Thanks for confirming that it doesn't work.

Michael asked if we can check the same for powerpc before
disabling the shpc by default.

Adding David, Thomas and Laurrent, maybe they have time
to check it for powerpc.


With this patch:

# lspci
00:00.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge
00:03.0 Unclassified device [00ff]: Red Hat, Inc Virtio memory balloon
[root@dhcp6-56 ~]# QEMU 2.7.90 monitor - type 'help' for more information
(qemu) device_add virtio-net-pci,bus=bridge1,id=net2
Bus 'bridge1' does not support hotplugging
(qemu) device_add virtio-net-pci,bus=bridge1,id=net2,addr=3
Bus 'bridge1' does not support hotplugging

Without this patch:

# lspci
00:00.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge
00:03.0 Unclassified device [00ff]: Red Hat, Inc Virtio memory balloon
[root@dhcp6-56 ~]# QEMU 2.7.90 monitor - type 'help' for more information
(qemu) device_add virtio-net-pci,bus=bridge1,id=net2
Unsupported PCI slot 0 for standard hotplug controller. Valid slots are
between 1 and 31.
(qemu) device_add virtio-net-pci,bus=bridge1,id=net2,addr=3
(qemu) [   37.342908] shpchp :00:00.0: Latch close on Slot(3)
[   37.342963] shpchp :00:00.0: Button pressed on Slot(3)
[   37.343003] shpchp :00:00.0: Card present on Slot(3)
[   37.344186] shpchp :00:00.0: PCI slot #3 - powering on due to
button press
[   43.361827] shpchp :00:00.0: No new device found



Hi Laurrent,
Thank you for the fast reply!

What I see from the log is that ppc does use the shpc for hotplug,
but there is some implementation bug preventing it to work.


That doesn't sound right.  ppc, or more correctly the pseries machine
type, has it's own hotplug notification mechanism.  If we're trying to
invoke shpc, I think something is wrong.



Hi David,

As you can see the shpc driver claims the pci-bridge slots and it responds
to the hot-plug events.

Anyway, what is the hotplu

Re: [Qemu-devel] [PATCH for-2.8] target-m68k: fix muluw/mulsw

2016-11-24 Thread Richard Henderson


On 11/24/2016 09:40 AM, Laurent Vivier wrote:

"The multiplier and multiplicand are both word operands, and the result
is a long-word operand."

So compute flags on a long-word result, not on a word result.

Signed-off-by: Laurent Vivier 
---
 target-m68k/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


Reviewed-by: Richard Henderson 


r~

Re: [Qemu-devel] [PULL 00/11] ppc-for-2.8 queue 20161123

2016-11-24 Thread Stefan Hajnoczi

On Tue, Nov 22, 2016 at 07:06:11PM -0800, no-re...@patchew.org wrote:
> === OUTPUT BEGIN ===
> Checking PATCH 1/11: tests/postcopy: Use KVM on ppc64 only if it is KVM-HV...
> Checking PATCH 2/11: spapr: migration support for CAS-negotiated option 
> vectors...
> ERROR: spaces required around that '*' (ctx:VxV)
> #99: FILE: hw/ppc/spapr.c:1347:
> +.subsections = (const VMStateDescription*[]) {

Is there a follow-up to fix this?

Thanks,
Stefan


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PULL for-2.8] target-alpha: fix smp interrupts

2016-11-24 Thread Stefan Hajnoczi

On Wed, Nov 23, 2016 at 01:46:34PM +0100, Richard Henderson wrote:
> This is such a minor typo, let's fix it for 2.8 please.
> 
> 
> r~
> 
> The following changes since commit c36ed06e9159fa484b711dfdd27ec64d7ac3d17a:
> 
>   Merge remote-tracking branch 'mst/tags/for_upstream' into staging 
> (2016-11-21 11:09:58 +)
> 
> are available in the git repository at:
> 
>   git://github.com/rth7680/qemu.git tags/pull-axp-20161123
> 
> for you to fetch changes up to 424ad8388f89f4202a7836d003273f23ebe04b09:
> 
>   target-alpha: Fix interrupt mask for cpu1 (2016-11-22 16:53:53 +0100)
> 
> 
> Fix alpha smp interrupt masking
> 
> 
> Richard Henderson (1):
>   target-alpha: Fix interrupt mask for cpu1
> 
>  hw/alpha/typhoon.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Thanks, applied to my staging tree:
https://github.com/stefanha/qemu/commits/staging

Stefan


signature.asc
Description: PGP signature

Re: [Qemu-devel] [kvm-unit-tests PATCH v7 09/11] arm/arm64: add initial gicv3 support

2016-11-24 Thread Auger Eric

Hi Drew,

On 23/11/2016 17:54, Andrew Jones wrote:
> Reviewed-by: Alex Bennée 
> Reviewed-by: Eric Auger 
> Signed-off-by: Andrew Jones 
> 
> ---
> v7: split lib/arm/gic.c into gic-v2/3.c [Eric]
> v6:
>  - added comments [Alex]
>  - added stride parameter to gicv3_set_redist_base [Andre]
>  - redist-wait s/rwp/uwp/ and comment [Andre]
>  - removed unnecessary wait-for-rwps [Andre]
> v5: use modern register names [Andre]
> v4:
>  - only take defines from kernel we need now [Andre]
>  - simplify enable by not caring if we reinit the distributor [drew]
> v2:
>  - configure irqs as NS GRP1
> ---
>  arm/Makefile.common|   2 +-
>  lib/arm/asm/arch_gicv3.h   |  47 
>  lib/arm/asm/gic-v3.h   | 104 
> +
>  lib/arm/asm/gic.h  |   5 ++-
>  lib/arm/gic-v2.c   |  27 
>  lib/arm/gic-v3.c   |  61 ++
>  lib/arm/gic.c  |  30 +
>  lib/arm64/asm/arch_gicv3.h |  44 +++
>  lib/arm64/asm/gic-v3.h |   1 +
>  lib/arm64/asm/sysreg.h |  44 +++
>  10 files changed, 343 insertions(+), 22 deletions(-)
>  create mode 100644 lib/arm/asm/arch_gicv3.h
>  create mode 100644 lib/arm/asm/gic-v3.h
>  create mode 100644 lib/arm/gic-v2.c
>  create mode 100644 lib/arm/gic-v3.c
>  create mode 100644 lib/arm64/asm/arch_gicv3.h
>  create mode 100644 lib/arm64/asm/gic-v3.h
>  create mode 100644 lib/arm64/asm/sysreg.h
> 
> diff --git a/arm/Makefile.common b/arm/Makefile.common
> index 2fe7aeeca6d4..6c0898f28be1 100644
> --- a/arm/Makefile.common
> +++ b/arm/Makefile.common
> @@ -46,7 +46,7 @@ cflatobjs += lib/arm/mmu.o
>  cflatobjs += lib/arm/bitops.o
>  cflatobjs += lib/arm/psci.o
>  cflatobjs += lib/arm/smp.o
> -cflatobjs += lib/arm/gic.o
> +cflatobjs += lib/arm/gic.o lib/arm/gic-v2.o lib/arm/gic-v3.o
>  
>  libeabi = lib/arm/libeabi.a
>  eabiobjs = lib/arm/eabi_compat.o
> diff --git a/lib/arm/asm/arch_gicv3.h b/lib/arm/asm/arch_gicv3.h
> new file mode 100644
> index ..276577452a14
> --- /dev/null
> +++ b/lib/arm/asm/arch_gicv3.h
> @@ -0,0 +1,47 @@
> +/*
> + * All ripped off from arch/arm/include/asm/arch_gicv3.h
> + *
> + * Copyright (C) 2016, Red Hat Inc, Andrew Jones 
> + *
> + * This work is licensed under the terms of the GNU LGPL, version 2.
> + */
> +#ifndef _ASMARM_ARCH_GICV3_H_
> +#define _ASMARM_ARCH_GICV3_H_
> +
> +#ifndef __ASSEMBLY__
> +#include 
> +#include 
> +#include 
> +
> +#define __stringify xstr
> +
> +#define __ACCESS_CP15(CRn, Op1, CRm, Op2)p15, Op1, %0, CRn, CRm, Op2
> +
> +#define ICC_PMR  __ACCESS_CP15(c4, 0, c6, 0)
> +#define ICC_IGRPEN1  __ACCESS_CP15(c12, 0, c12, 7)
> +
> +static inline void gicv3_write_pmr(u32 val)
> +{
> + asm volatile("mcr " __stringify(ICC_PMR) : : "r" (val));
> +}
> +
> +static inline void gicv3_write_grpen1(u32 val)
> +{
> + asm volatile("mcr " __stringify(ICC_IGRPEN1) : : "r" (val));
> + isb();
> +}
> +
> +/*
> + * We may access GICR_TYPER and GITS_TYPER by reading both the TYPER
> + * offset and the following offset (+ 4) and then combining them to
> + * form a 64-bit address.
> + */
> +static inline u64 gicv3_read_typer(const volatile void __iomem *addr)
> +{
> + u64 val = readl(addr);
> + val |= (u64)readl(addr + 4) << 32;
> + return val;
> +}
> +
> +#endif /* !__ASSEMBLY__ */
> +#endif /* _ASMARM_ARCH_GICV3_H_ */
> diff --git a/lib/arm/asm/gic-v3.h b/lib/arm/asm/gic-v3.h
> new file mode 100644
> index ..73ade4681d21
> --- /dev/null
> +++ b/lib/arm/asm/gic-v3.h
> @@ -0,0 +1,104 @@
> +/*
> + * All GIC* defines are lifted from include/linux/irqchip/arm-gic-v3.h
> + *
> + * Copyright (C) 2016, Red Hat Inc, Andrew Jones 
> + *
> + * This work is licensed under the terms of the GNU LGPL, version 2.
> + */
> +#ifndef _ASMARM_GIC_V3_H_
> +#define _ASMARM_GIC_V3_H_
> +
> +#ifndef _ASMARM_GIC_H_
> +#error Do not directly include . Include 
> +#endif
> +
> +/*
> + * Distributor registers
> + *
> + * We expect to be run in Non-secure mode, thus we define the
> + * group1 enable bits with respect to that view.
> + */
> +#define GICD_CTLR_RWP(1U << 31)
> +#define GICD_CTLR_ARE_NS (1U << 4)
> +#define GICD_CTLR_ENABLE_G1A (1U << 1)
> +#define GICD_CTLR_ENABLE_G1  (1U << 0)
> +
> +/* Re-Distributor registers, offsets from RD_base */
> +#define GICR_TYPER   0x0008
> +
> +#define GICR_TYPER_LAST  (1U << 4)
> +
> +/* Re-Distributor registers, offsets from SGI_base */
> +#define GICR_IGROUPR0GICD_IGROUPR
> +#define GICR_ISENABLER0  GICD_ISENABLER
> +#define GICR_IPRIORITYR0 GICD_IPRIORITYR
> +
> +#include 
> +
> +#ifndef __ASSEMBLY__
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +struct gicv3_data {
> + void *dist_base;
> + void *redist_base[NR_CPUS];
> + u

Re: [Qemu-devel] [kvm-unit-tests PATCH v7 10/11] arm/arm64: gicv3: add an IPI test

2016-11-24 Thread Auger Eric

Hi Drew,

On 23/11/2016 17:54, Andrew Jones wrote:
> Signed-off-by: Andrew Jones 
> 
> ---
> v7:
>  - add common ipi_send_single/mask (replacing ipi_send).
>Note, the arg order irq,cpu got swapped. [Eric]
>  - comment rewording [Eric]
>  - make enable_defaults a common op [Eric]
>  - gic_enable_defaults() will now invoke gic_init if
>necessary [drew]
>  - split lib/arm/gic.c into gic-v2/3.c [Eric]
> v6: move most gicv2/gicv3 wrappers to common code [Alex]
> v5:
>  - fix copy+paste error in gicv3_write_eoir [drew]
>  - use modern register names [Andre]
> v4:
>  - heavily comment gicv3_ipi_send_tlist() [Eric]
>  - changes needed for gicv2 iar/irqstat fix to other patch
> v2:
>  - use IRM for gicv3 broadcast
> ---
>  arm/gic.c  | 83 
>  arm/unittests.cfg  |  6 +++
>  lib/arm/asm/arch_gicv3.h   | 23 
>  lib/arm/asm/gic-v2.h   |  2 +
>  lib/arm/asm/gic-v3.h   | 12 +-
>  lib/arm/asm/gic.h  | 63 +++
>  lib/arm/gic-v2.c   | 40 
>  lib/arm/gic-v3.c   | 94 
> ++
>  lib/arm/gic.c  |  9 -
>  lib/arm64/asm/arch_gicv3.h | 22 +++
>  10 files changed, 336 insertions(+), 18 deletions(-)
> 
> diff --git a/arm/gic.c b/arm/gic.c
> index b42c2b1ca1e1..23c1860a49d9 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -3,6 +3,8 @@
>   *
>   * GICv2
>   *   + test sending/receiving IPIs
> + * GICv3
> + *   + test sending/receiving IPIs
>   *
>   * Copyright (C) 2016, Red Hat Inc, Andrew Jones 
>   *
> @@ -16,7 +18,14 @@
>  #include 
>  #include 
>  
> -static int gic_version;
> +struct gic {
> + struct {
> + void (*send_self)(void);
> + void (*send_broadcast)(void);
> + } ipi;
> +};
> +
> +static struct gic *gic;
>  static int acked[NR_CPUS], spurious[NR_CPUS];
>  static cpumask_t ready;
>  
> @@ -83,11 +92,11 @@ static void check_spurious(void)
>  
>  static void ipi_handler(struct pt_regs *regs __unused)
>  {
> - u32 irqstat = readl(gicv2_cpu_base() + GICC_IAR);
> - u32 irqnr = irqstat & GICC_IAR_INT_ID_MASK;
> + u32 irqstat = gic_read_iar();
> + u32 irqnr = gic_iar_irqnr(irqstat);
>  
>   if (irqnr != GICC_INT_SPURIOUS) {
> - writel(irqstat, gicv2_cpu_base() + GICC_EOIR);
> + gic_write_eoir(irqstat);
>   smp_rmb(); /* pairs with wmb in ipi_test functions */
>   ++acked[smp_processor_id()];
>   smp_wmb(); /* pairs with rmb in check_acked */
> @@ -97,6 +106,27 @@ static void ipi_handler(struct pt_regs *regs __unused)
>   }
>  }
>  
> +static void gicv2_ipi_send_self(void)
> +{
> + writel(2 << 24, gicv2_dist_base() + GICD_SGIR);
> +}
> +
> +static void gicv2_ipi_send_broadcast(void)
> +{
> + writel(1 << 24, gicv2_dist_base() + GICD_SGIR);
> +}
> +
> +static void gicv3_ipi_send_self(void)
> +{
> + gic_ipi_send_single(0, smp_processor_id());
> +}
> +
> +static void gicv3_ipi_send_broadcast(void)
> +{
> + gicv3_write_sgi1r(1ULL << 40);
> + isb();
> +}
We could have moved above functions in lib/arm/gic-v2/V3.c +gicv2/v3
> +
>  static void ipi_test_self(void)
>  {
>   cpumask_t mask;
> @@ -106,7 +136,7 @@ static void ipi_test_self(void)
>   smp_wmb();
>   cpumask_clear(&mask);
>   cpumask_set_cpu(0, &mask);
> - writel(2 << 24, gicv2_dist_base() + GICD_SGIR);
> + gic->ipi.send_self();
>   check_acked(&mask);
>   report_prefix_pop();
>  }
> @@ -114,14 +144,15 @@ static void ipi_test_self(void)
>  static void ipi_test_smp(void)
>  {
>   cpumask_t mask;
> - unsigned long tlist;
> + int i;
>  
>   report_prefix_push("target-list");
>   memset(acked, 0, sizeof(acked));
>   smp_wmb();
> - tlist = cpumask_bits(&cpu_present_mask)[0] & 0xaa;
> - cpumask_bits(&mask)[0] = tlist;
> - writel((u8)tlist << 16, gicv2_dist_base() + GICD_SGIR);
> + cpumask_copy(&mask, &cpu_present_mask);
> + for (i = 0; i < nr_cpus; i += 2)
> + cpumask_clear_cpu(i, &mask);
> + gic_ipi_send_mask(0, &mask);
>   check_acked(&mask);
>   report_prefix_pop();
>  
> @@ -130,14 +161,14 @@ static void ipi_test_smp(void)
>   smp_wmb();
>   cpumask_copy(&mask, &cpu_present_mask);
>   cpumask_clear_cpu(0, &mask);
> - writel(1 << 24, gicv2_dist_base() + GICD_SGIR);
> + gic->ipi.send_broadcast();
>   check_acked(&mask);
>   report_prefix_pop();
>  }
>  
>  static void ipi_enable(void)
>  {
> - gicv2_enable_defaults();
> + gic_enable_defaults();
>  #ifdef __arm__
>   install_exception_handler(EXCPTN_IRQ, ipi_handler);
>  #else
> @@ -154,18 +185,40 @@ static void ipi_recv(void)
>   wfi();
>  }
>  
> +static struct gic gicv2 = {
> + .ipi = {
> + .send_self = gicv2_ipi_send_self,
> + .send_broadcast = gicv2_ipi_send_broadcast,
> + },
> +};
> +
> +

Re: [Qemu-devel] [kvm-unit-tests PATCH v7 11/11] arm/arm64: gic: don't just use zero

2016-11-24 Thread Auger Eric

Hi,

On 23/11/2016 17:54, Andrew Jones wrote:
> Allow user to select who sends ipis and with which irq,
> rather than just always sending irq=0 from cpu0.
> 
> Signed-off-by: Andrew Jones 

Reviewed-by: Eric Auger 
Tested-by: Eric Auger 

Eric


> 
> ---
> v7: cleanup cmdline parsing and add complain on bad args [Eric]
> v6:
>  - make sender/irq names more future-proof [drew]
>  - sanity check inputs [drew]
>  - introduce check_sender/irq and bad_sender/irq to more
>cleanly do checks [drew]
>  - default sender and irq to 1, instead of still zero [drew]
> v4: improve structure and make sure spurious checking is
> done even when the sender isn't cpu0
> v2: actually check that the irq received was the irq sent,
> and (for gicv2) that the sender is the expected one.
> ---
>  arm/gic.c | 143 
> --
>  1 file changed, 120 insertions(+), 23 deletions(-)
> 
> diff --git a/arm/gic.c b/arm/gic.c
> index 23c1860a49d9..88c5f49d807d 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -11,6 +11,7 @@
>   * This work is licensed under the terms of the GNU LGPL, version 2.
>   */
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -27,6 +28,8 @@ struct gic {
>  
>  static struct gic *gic;
>  static int acked[NR_CPUS], spurious[NR_CPUS];
> +static int bad_sender[NR_CPUS], bad_irq[NR_CPUS];
> +static int cmdl_sender = 1, cmdl_irq = 1;
>  static cpumask_t ready;
>  
>  static void nr_cpu_check(int nr)
> @@ -42,10 +45,23 @@ static void wait_on_ready(void)
>   cpu_relax();
>  }
>  
> +static void stats_reset(void)
> +{
> + int i;
> +
> + for (i = 0; i < nr_cpus; ++i) {
> + acked[i] = 0;
> + bad_sender[i] = -1;
> + bad_irq[i] = -1;
> + }
> + smp_wmb();
> +}
> +
>  static void check_acked(cpumask_t *mask)
>  {
>   int missing = 0, extra = 0, unexpected = 0;
>   int nr_pass, cpu, i;
> + bool bad = false;
>  
>   /* Wait up to 5s for all interrupts to be delivered */
>   for (i = 0; i < 50; ++i) {
> @@ -55,9 +71,21 @@ static void check_acked(cpumask_t *mask)
>   smp_rmb();
>   nr_pass += cpumask_test_cpu(cpu, mask) ?
>   acked[cpu] == 1 : acked[cpu] == 0;
> +
> + if (bad_sender[cpu] != -1) {
> + printf("cpu%d received IPI from wrong sender 
> %d\n",
> + cpu, bad_sender[cpu]);
> + bad = true;
> + }
> +
> + if (bad_irq[cpu] != -1) {
> + printf("cpu%d received wrong irq %d\n",
> + cpu, bad_irq[cpu]);
> + bad = true;
> + }
>   }
>   if (nr_pass == nr_cpus) {
> - report("Completed in %d ms", true, ++i * 100);
> + report("Completed in %d ms", !bad, ++i * 100);
>   return;
>   }
>   }
> @@ -90,6 +118,22 @@ static void check_spurious(void)
>   }
>  }
>  
> +static void check_ipi_sender(u32 irqstat)
> +{
> + if (gic_version() == 2) {
> + int src = (irqstat >> 10) & 7;
> +
> + if (src != cmdl_sender)
> + bad_sender[smp_processor_id()] = src;
> + }
> +}
> +
> +static void check_irqnr(u32 irqnr)
> +{
> + if (irqnr != (u32)cmdl_irq)
> + bad_irq[smp_processor_id()] = irqnr;
> +}
> +
>  static void ipi_handler(struct pt_regs *regs __unused)
>  {
>   u32 irqstat = gic_read_iar();
> @@ -97,8 +141,10 @@ static void ipi_handler(struct pt_regs *regs __unused)
>  
>   if (irqnr != GICC_INT_SPURIOUS) {
>   gic_write_eoir(irqstat);
> - smp_rmb(); /* pairs with wmb in ipi_test functions */
> + smp_rmb(); /* pairs with wmb in stats_reset */
>   ++acked[smp_processor_id()];
> + check_ipi_sender(irqstat);
> + check_irqnr(irqnr);
>   smp_wmb(); /* pairs with rmb in check_acked */
>   } else {
>   ++spurious[smp_processor_id()];
> @@ -108,22 +154,22 @@ static void ipi_handler(struct pt_regs *regs __unused)
>  
>  static void gicv2_ipi_send_self(void)
>  {
> - writel(2 << 24, gicv2_dist_base() + GICD_SGIR);
> + writel(2 << 24 | cmdl_irq, gicv2_dist_base() + GICD_SGIR);
>  }
>  
>  static void gicv2_ipi_send_broadcast(void)
>  {
> - writel(1 << 24, gicv2_dist_base() + GICD_SGIR);
> + writel(1 << 24 | cmdl_irq, gicv2_dist_base() + GICD_SGIR);
>  }
>  
>  static void gicv3_ipi_send_self(void)
>  {
> - gic_ipi_send_single(0, smp_processor_id());
> + gic_ipi_send_single(cmdl_irq, smp_processor_id());
>  }
>  
>  static void gicv3_ipi_send_broadcast(void)
>  {
> - gicv3_write_sgi1r(1ULL << 40);
> + gicv3_write_sgi1r(1ULL << 40 | cmdl_irq << 24);
>   isb();
>  }
>

Re: [Qemu-devel] [PATCH v7 RFC] block/vxhs: Initial commit to add Veritas HyperScale VxHS block device support

2016-11-24 Thread Daniel P. Berrange

On Wed, Nov 23, 2016 at 02:09:50PM -0800, ashish mittal wrote:
> On the topic of protocol security -
> 
> Would it be enough for the first patch to implement only
> authentication and not encryption?

Yes, authentication is the only critical thing from my POV. While encryption
is a nice to have, there are plenty of storage systems which do *not* do
encryption. Guest data can still be protected simply by running LUKS on the
guest disks, so lack of encryption is not a serious security risk, provided
the authentication scheme itself does not require encryption in order to
be secure.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|

Re: [Qemu-devel] [for-2.8 0/4] 9p patches for 2.8 20161123

2016-11-24 Thread Stefan Hajnoczi

On Wed, Nov 23, 2016 at 05:58:14PM +0100, Greg Kurz wrote:
> The following changes since commit 00227fefd2059464cd2f59aed29944874c630e2f:
> 
>   Update version for v2.8.0-rc1 release (2016-11-22 22:29:08 +)
> 
> are available in the git repository at:
> 
>   https://github.com/gkurz/qemu.git tags/for-upstream
> 
> for you to fetch changes up to 898ae90a44551d25b8e956fd87372d303c82fe68:
> 
>   9pfs: add cleanup operation for proxy backend driver (2016-11-23 13:53:34 
> +0100)
> 
> 
> This pull request fixes some leaks (memory, fd) in the handle and proxy
> backends.

Hi Greg,
Please make sure to aways use the "[PULL]" tag in the Subject so tools
can track your pull requests.

I have applied this pull request manually to my staging tree:
https://github.com/stefanha/qemu/commits/staging

Your PGP key has only one signature.  Can you key sign with other
IBMers?  That may allow us to get the Web of Trust going.

Thanks,
Stefan


signature.asc
Description: PGP signature

[Qemu-devel] [PATCH for-2.8] pci-assign: sync MSI/MSI-X cap and table with PCIDevice

2016-11-24 Thread Peter Xu

Since commit e1d4fb2d ("kvm-irqchip: x86: add msi route notify fn"),
kvm_irqchip_add_msi_route() starts to use pci_get_msi_message() to fetch
MSI info. This requires that we setup MSI related fields in PCIDevice.
For most devices, that won't be a problem, as long as we are using
general interfaces like msi_init()/msix_init().

However, for pci-assign devices, MSI/MSI-X is treated differently - PCI
assign devices are maintaining its own MSI table and cap information in
AssignedDevice struct. however that's not synced up with PCIDevice's
fields. That will leads to pci_get_msi_message() failed to find correct
MSI capability, even with an NULL msix_table.

A quick fix is to sync up the two places: both the capability bits and
table address for MSI/MSI-X.

Reported-by: Changlimin 
Cc: qemu-sta...@nongnu.org
Fixes: e1d4fb2d ("kvm-irqchip: x86: add msi route notify fn")
Signed-off-by: Peter Xu 

---
Do we still support pci-assign?
---
 hw/i386/kvm/pci-assign.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
index 8238fbc..df9748d 100644
--- a/hw/i386/kvm/pci-assign.c
+++ b/hw/i386/kvm/pci-assign.c
@@ -1251,6 +1251,7 @@ static int assigned_device_pci_cap_init(PCIDevice 
*pci_dev, Error **errp)
 error_propagate(errp, local_err);
 return -ENOTSUP;
 }
+dev->dev.cap_present |= QEMU_PCI_CAP_MSI;
 dev->cap.available |= ASSIGNED_DEVICE_CAP_MSI;
 /* Only 32-bit/no-mask currently supported */
 ret = pci_add_capability2(pci_dev, PCI_CAP_ID_MSI, pos, 10,
@@ -1285,6 +1286,7 @@ static int assigned_device_pci_cap_init(PCIDevice 
*pci_dev, Error **errp)
 error_propagate(errp, local_err);
 return -ENOTSUP;
 }
+dev->dev.cap_present |= QEMU_PCI_CAP_MSIX;
 dev->cap.available |= ASSIGNED_DEVICE_CAP_MSIX;
 ret = pci_add_capability2(pci_dev, PCI_CAP_ID_MSIX, pos, 12,
   &local_err);
@@ -1648,6 +1650,7 @@ static void 
assigned_dev_register_msix_mmio(AssignedDevice *dev, Error **errp)
 dev->msix_table = NULL;
 return;
 }
+dev->dev.msix_table = dev->msix_table;
 
 assigned_dev_msix_reset(dev);
 
@@ -1665,6 +1668,7 @@ static void 
assigned_dev_unregister_msix_mmio(AssignedDevice *dev)
 error_report("error unmapping msix_table! %s", strerror(errno));
 }
 dev->msix_table = NULL;
+dev->dev.msix_table = NULL;
 }
 
 static const VMStateDescription vmstate_assigned_device = {
-- 
2.7.4

Re: [Qemu-devel] [PATCH 2/3] xen: slightly simplify bufioreq handling

2016-11-24 Thread Jan Beulich

>>> On 23.11.16 at 19:13,  wrote:
> On Wed, 23 Nov 2016, Jan Beulich wrote:
>> There's no point setting fields always receiving the same value on each
>> iteration, as handle_ioreq() doesn't alter them anyway. Set state and
>> count once ahead of the loop, drop the redundant clearing of
>> data_is_ptr, and avoid the meaningless setting of df altogether.
> 
> Why setting df is meaningless?

With count being fixed to one there's no need to update addresses,
and hence no use for knowing which direction the updates should go.

Jan

Re: [Qemu-devel] [for-2.8 0/4] 9p patches for 2.8 20161123

2016-11-24 Thread Greg Kurz

On Thu, 24 Nov 2016 10:19:58 +
Stefan Hajnoczi  wrote:

> On Wed, Nov 23, 2016 at 05:58:14PM +0100, Greg Kurz wrote:
> > The following changes since commit 00227fefd2059464cd2f59aed29944874c630e2f:
> > 
> >   Update version for v2.8.0-rc1 release (2016-11-22 22:29:08 +)
> > 
> > are available in the git repository at:
> > 
> >   https://github.com/gkurz/qemu.git tags/for-upstream
> > 
> > for you to fetch changes up to 898ae90a44551d25b8e956fd87372d303c82fe68:
> > 
> >   9pfs: add cleanup operation for proxy backend driver (2016-11-23 13:53:34 
> > +0100)
> > 
> > 
> > This pull request fixes some leaks (memory, fd) in the handle and proxy
> > backends.  
> 
> Hi Greg,
> Please make sure to aways use the "[PULL]" tag in the Subject so tools
> can track your pull requests.
> 

Hi Stefan,

Damn... I wanted to add for-2.8 and forgot PULL :(

I'll fix my script to add this or at least to yell at me.

> I have applied this pull request manually to my staging tree:
> https://github.com/stefanha/qemu/commits/staging
> 

Thanks.

> Your PGP key has only one signature.  Can you key sign with other
> IBMers?  That may allow us to get the Web of Trust going.
> 

We're only a few IBMers in my area (Toulouse, France), but I guess this
is better than nothing. :)

Cheers.

--
Greg

> Thanks,
> Stefan



pgp5Z8zWgREcT.pgp
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH] blkverify: Implement bdrv_co_preadv/pwritev/flush

2016-11-24 Thread Stefan Hajnoczi

On Tue, Nov 22, 2016 at 01:42:44PM +0100, Kevin Wolf wrote:
> -static BlockAIOCB *blkverify_aio_flush(BlockDriverState *bs,
> -   BlockCompletionFunc *cb,
> -   void *opaque)
> +static int blkverify_co_flush(BlockDriverState *bs)

Please use the coroutine_fn annotation.

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature

[Qemu-devel] [Bug 992067] Re: Windows 2008R2 very slow cold boot when >4GB memory

2016-11-24 Thread Thomas Huth

Triaging old bug tickets... QEMU 0.12/0.14/0.15 is pretty outdated
nowadays. Can you still reproduce this behavior with the latest version
of QEMU? If not, I think we should close this bug...

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/992067

Title:
  Windows 2008R2 very slow cold boot when >4GB memory

Status in QEMU:
  Fix Committed

Bug description:
  I've been having a consistent problem booting 2008R2 guests with
  4096MB of RAM or greater. On the initial boot the KVM process starts
  out with a ~200MB memory allocation and will use 100% of all CPU
  allocated to it. The RES memory of the KVM process slowly rises by
  around 200mb every few minutes until it reaches it's memory allocation
  (several hours in some cases). Whilst this is happening the guest will
  usually blue screen with the message of -

  A clock interrupt was not received on a secondary processor within the
  allocated time interval

  If I let the KVM process continue to run it will eventually allocate
  the required memory the guest will run at full speed, usually
  restarting after the blue screen and booting into startup repair. From
  here you can restart it and it will boot perfectly. Once booted the
  guest has no performance issues at all.

  I've tried everything I could think of. Removing PAE, playing with
  huge pages, different kernels, different userspaces, different
  systems, different backing file systems, different processor feature
  set, with or without Virtio etc. My best theory is that the problem is
  caused by Windows 2008 zeroing out all the memory on boot and
  something is causing this to be held up or slowed to a crawl. The
  hosts always have memory free to boot the guest and are not using swap
  at all.

  Nothing so far has solved the issue. A few observations I've made about the 
issue are - 
  Large memory 2008R2 guests seem to boot fine (or with a small delay) when 
they are the first to boot on the host after a reboot
  Sometimes dropping the disk cache (echo 1 > /proc/sys/vm/drop_caches) will 
cause them to boot faster

  
  The hosts I've tried are -
  All Nehalem based (5540, 5620 and 5660)
  Host ram of 48GB, 96GB and 192GB
  Storage on NFS, Gluster and local (ext4, xfs and zfs)
  QED, QCOW and RAW formats
  Scientific Linux 6.1 with the standard kernel 2.6.32, 2.6.38 and 3.3.1
  KVM userspaces 0.12, 0.14 and (currently) 0.15.1

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/992067/+subscriptions

[Qemu-devel] [Bug 1308381] Re: illegal instructions for AArch64 ARMv8

2016-11-24 Thread Thomas Huth

Patch had been included here:
http://git.qemu.org/?p=qemu.git;a=commitdiff;h=e44a90c59697cf98
==> Fix released

** Changed in: qemu
   Status: New => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1308381

Title:
  illegal instructions for AArch64 ARMv8

Status in QEMU:
  Fix Released

Bug description:
  The test case is in the attachment. To reproduce as following (I tried both 
GCC and Clang):
  $aarch64-linux-gnu-gcc qemu.c -o test
  $./test
  qemu: uncaught target signal 4 (Illegal instruction) - core dumped
  Illegal instruction (core dumped)

  In the test case, 3 intrinsics are tested: vqmovunh_s16,  vqmovuns_s32, 
vqmovund_s64. They will be compiled into instructions:
  SQXTUN Bd, Hn
  SQXTUN Hd, Sn
  SQXTUN Sd, Dn.

  QEMU is get from trunk git://git.qemu.org/qemu.git. Other instructions
  work well. It seems that these instructions are not supported in QEMU.
  Is this a bug?

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1308381/+subscriptions

Re: [Qemu-devel] [PATCH v2 1/4] qmp-event: Avoid qobject_from_jsonf("%"PRId64)

2016-11-24 Thread Markus Armbruster

Eric Blake  writes:

> The qobject_from_jsonf() function implements a pseudo-printf
> language for creating a QObject; however, it is hard-coded to
> only parse a subset of formats understood by -Wformat, and is
> not a straight synonym to bare printf().  In particular, any
> use of an int64_t integer works only if the system's
> definition of PRId64 matches what the parser expects; which
> works on glibc (%lld or %ld depending on 32- vs. 64-bit) and
> mingw (%I64d), but not on Mac OS (%qd).  Rather than enhance
> the parser, it is just as easy to use 'long long', which we
> know always works.  There are few enough callers of
> qobject_from_json[fv]() that it is easy to audit that this is
> the only non-testsuite caller that was actually relying on
> this particular conversion.
>
> Reported by: G 3 
> Signed-off-by: Eric Blake 
>
> ---
> v2: keep qobject_from_jsonf for now, but switch to %lld
> ---
>  qapi/qmp-event.c | 17 -
>  1 file changed, 4 insertions(+), 13 deletions(-)
>
> diff --git a/qapi/qmp-event.c b/qapi/qmp-event.c
> index 8bba165..e7c8755 100644
> --- a/qapi/qmp-event.c
> +++ b/qapi/qmp-event.c
> @@ -35,21 +35,12 @@ static void timestamp_put(QDict *qdict)
>  int err;
>  QObject *obj;
>  qemu_timeval tv;
> -int64_t sec, usec;
>
>  err = qemu_gettimeofday(&tv);
> -if (err < 0) {
> -/* Put -1 to indicate failure of getting host time */
> -sec = -1;
> -usec = -1;
> -} else {
> -sec = tv.tv_sec;
> -usec = tv.tv_usec;
> -}
> -
> -obj = qobject_from_jsonf("{ 'seconds': %" PRId64 ", "
> - "'microseconds': %" PRId64 " }",
> - sec, usec);
> +/* Put -1 to indicate failure of getting host time */
> +obj = qobject_from_jsonf("{ 'seconds': %lld, 'microseconds': %lld }",
> + err < 0 ? -1LL : tv.tv_sec,
> + err < 0 ? -1LL : tv.tv_usec);
>  qdict_put_obj(qdict, "timestamp", obj);
>  }

The ternaries must both yield long long for this to work.  They will,
unless their last operand's rank is greater than long long's, or their
last operand is unsigned long long.  The latter case should be harmless
in practice.  The former case would be weird.  Hmm.

I think it's best to be a bit more explicit here.  Let's cast the last
operands to long long, or use long long variables.  Your choice.

Re: [Qemu-devel] [PATCH v2 3/4] tests: Avoid qobject_from_jsonf("%"PRId64)

2016-11-24 Thread Markus Armbruster

Eric Blake  writes:

> The qobject_from_jsonf() function implements a pseudo-printf
> language for creating a QObject; however, it is hard-coded to
> only parse a subset of formats understood by -Wformat, and is
> not a straight synonym to bare printf().  In particular, any
> use of an int64_t integer works only if the system's
> definition of PRId64 matches what the parser expects; which
> works on glibc (%lld or %ld depending on 32- vs. 64-bit) and
> mingw (%I64d), but not on Mac OS (%qd).  Rather than enhance
> the parser, it is just as easy to force the use of int (where
> the value is small enough) or long long instead of int64_t,
> which we know always works.
>
> This should cover all remaining testsuite uses of
> qobject_from_json[fv]() that were trying to rely on PRId64,
> although my proof for that was done by adding in asserts and
> checking that 'make check' still passed, where such asserts
> are inappropriate during hard freeze.  A later series in 2.9
> may remove all dynamic JSON parsing, but that's a bigger task.
>
> Reported by: G 3 
> Signed-off-by: Eric Blake 
> ---
>  tests/check-qjson.c| 4 ++--
>  tests/test-qobject-input-visitor.c | 5 +++--
>  2 files changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/tests/check-qjson.c b/tests/check-qjson.c
> index 8595574..b03a2e1 100644
> --- a/tests/check-qjson.c
> +++ b/tests/check-qjson.c
> @@ -964,7 +964,7 @@ static void vararg_number(void)
>  QInt *qint;
>  QFloat *qfloat;
>  int value = 0x2342;
> -int64_t value64 = 0x2342342343LL;
> +long long value64 = 0x2342342343LL;

The name value64 isn't quite right anymore.  value_ll?

>  double valuef = 2.323423423;
>
>  obj = qobject_from_jsonf("%d", value);
> @@ -976,7 +976,7 @@ static void vararg_number(void)
>
>  QDECREF(qint);
>
> -obj = qobject_from_jsonf("%" PRId64, value64);
> +obj = qobject_from_jsonf("%lld", value64);
>  g_assert(obj != NULL);
>  g_assert(qobject_type(obj) == QTYPE_QINT);
>
> diff --git a/tests/test-qobject-input-visitor.c 
> b/tests/test-qobject-input-visitor.c
> index 26c5012..945404a 100644
> --- a/tests/test-qobject-input-visitor.c
> +++ b/tests/test-qobject-input-visitor.c
> @@ -83,10 +83,11 @@ static Visitor 
> *visitor_input_test_init_raw(TestInputVisitorData *data,
>  static void test_visitor_in_int(TestInputVisitorData *data,
>  const void *unused)
>  {
> -int64_t res = 0, value = -42;
> +int64_t res = 0;
> +int value = -42;
>  Visitor *v;
>
> -v = visitor_input_test_init(data, "%" PRId64, value);
> +v = visitor_input_test_init(data, "%d", value);
>
>  visit_type_int(v, NULL, &res, &error_abort);
>  g_assert_cmpint(res, ==, value);

Re: [Qemu-devel] [PATCH v2 for-2.8 0/4] Fix MacOS runtime failure of qobject_from_jsonf()

2016-11-24 Thread Markus Armbruster

Eric Blake  writes:

> programmingk...@gmail.com[*] reported a runtime failure on a
> 32-bit Mac OS compilation, where "%"PRId64 expands to "%qd".
> Fortunately, we had very few spots that were relying on our
> pseudo-printf JSON parsing of int64_t numbers, so it was
> easier to just convert callers to stick to safer %lld.
>
> The remaining uses of pseudo-printf handling are more complex;
> there are only 3 users in the released codebased, but LOTS of
> users in the testsuite (via wrapper functions like qmp()); I
> will be posting a followup series that rips out the remaining
> uses of dynamic JSON, but it will be 2.9 material, while
> these (first three) patches qualify for 2.8.  The fourth patch
> is RFC; not intended to be applied now, but shows how I tested
> patch 3/4; it will probably reappear in the later 2.9 series.

I can take the first three through my tree.

> [*] git log shows the name John, but the particular email that
> sparked this only stated the non-descript name 'G 3', which
> makes it a bit hard for me to know which form is preferred
> when lending credit.

If someone's capriciousness makes it hard to give credit, not giving it
can be excused.

Re: [Qemu-devel] [PATCH v7 RFC] block/vxhs: Initial commit to add Veritas HyperScale VxHS block device support

2016-11-24 Thread Stefan Hajnoczi

On Thu, Nov 24, 2016 at 05:44:37AM +, Ketan Nilangekar wrote:
> On 11/24/16, 4:07 AM, "Paolo Bonzini"  wrote:
> >On 23/11/2016 23:09, ashish mittal wrote:
> >> On the topic of protocol security -
> >> 
> >> Would it be enough for the first patch to implement only
> >> authentication and not encryption?
> >
> >Yes, of course.  However, as we introduce more and more QEMU-specific
> >characteristics to a protocol that is already QEMU-specific (it doesn't
> >do failover, etc.), I am still not sure of the actual benefit of using
> >libqnio versus having an NBD server or FUSE driver.
> >
> >You have already mentioned performance, but the design has changed so
> >much that I think one of the two things has to change: either failover
> >moves back to QEMU and there is no (closed source) translator running on
> >the node, or the translator needs to speak a well-known and
> >already-supported protocol.
> 
> IMO design has not changed. Implementation has changed significantly. I would 
> propose that we keep resiliency/failover code out of QEMU driver and 
> implement it entirely in libqnio as planned in a subsequent revision. The 
> VxHS server does not need to understand/handle failover at all. 
> 
> Today libqnio gives us significantly better performance than any NBD/FUSE 
> implementation. We know because we have prototyped with both. Significant 
> improvements to libqnio are also in the pipeline which will use cross memory 
> attach calls to further boost performance. Ofcourse a big reason for the 
> performance is also the HyperScale storage backend but we believe this method 
> of IO tapping/redirecting can be leveraged by other solutions as well.

By "cross memory attach" do you mean
process_vm_readv(2)/process_vm_writev(2)?

That puts us back to square one in terms of security.  You have
(untrusted) QEMU + (untrusted) libqnio directly accessing the memory of
another process on the same machine.  That process is therefore also
untrusted and may only process data for one guest so that guests stay
isolated from each other.

There's an easier way to get even better performance: get rid of libqnio
and the external process.  Move the code from the external process into
QEMU to eliminate the process_vm_readv(2)/process_vm_writev(2) and
context switching.

Can you remind me why there needs to be an external process?

Stefan


signature.asc
Description: PGP signature

Re: [Qemu-devel] Asynchronous / synchronous IO emulation

2016-11-24 Thread Stefan Hajnoczi

On Thu, Nov 24, 2016 at 08:44:06AM +, Dong, Eddie wrote:
>  Under certain situation, we have a requirement to apply the virtual 
> DMA event in a deterministic way. Current Qemu uses asynchronous approach to 
> emulate the virtual IO events since quite long time ago. I am wondering if we 
> still have an option to choose the synchronous emulation solution.

Please explain exactly what you mean.

Stefan


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH v7 RFC] block/vxhs: Initial commit to add Veritas HyperScale VxHS block device support

2016-11-24 Thread Ketan Nilangekar

On 11/24/16, 4:41 PM, "Stefan Hajnoczi"  wrote:

On Thu, Nov 24, 2016 at 05:44:37AM +, Ketan Nilangekar wrote:
> On 11/24/16, 4:07 AM, "Paolo Bonzini"  wrote:
> >On 23/11/2016 23:09, ashish mittal wrote:
> >> On the topic of protocol security -
> >> 
> >> Would it be enough for the first patch to implement only
> >> authentication and not encryption?
> >
> >Yes, of course.  However, as we introduce more and more QEMU-specific
> >characteristics to a protocol that is already QEMU-specific (it doesn't
> >do failover, etc.), I am still not sure of the actual benefit of using
> >libqnio versus having an NBD server or FUSE driver.
> >
> >You have already mentioned performance, but the design has changed so
> >much that I think one of the two things has to change: either failover
> >moves back to QEMU and there is no (closed source) translator running on
> >the node, or the translator needs to speak a well-known and
> >already-supported protocol.
> 
> IMO design has not changed. Implementation has changed significantly. I 
would propose that we keep resiliency/failover code out of QEMU driver and 
implement it entirely in libqnio as planned in a subsequent revision. The VxHS 
server does not need to understand/handle failover at all. 
> 
> Today libqnio gives us significantly better performance than any NBD/FUSE 
implementation. We know because we have prototyped with both. Significant 
improvements to libqnio are also in the pipeline which will use cross memory 
attach calls to further boost performance. Ofcourse a big reason for the 
performance is also the HyperScale storage backend but we believe this method 
of IO tapping/redirecting can be leveraged by other solutions as well.

By "cross memory attach" do you mean
process_vm_readv(2)/process_vm_writev(2)?

Ketan> Yes.

That puts us back to square one in terms of security.  You have
(untrusted) QEMU + (untrusted) libqnio directly accessing the memory of
another process on the same machine.  That process is therefore also
untrusted and may only process data for one guest so that guests stay
isolated from each other.

Ketan> Understood but this will be no worse than the current network based 
communication between qnio and vxhs server. And although we have questions 
around QEMU trust/vulnerability issues, we are looking to implement basic 
authentication scheme between libqnio and vxhs server.

There's an easier way to get even better performance: get rid of libqnio
and the external process.  Move the code from the external process into
QEMU to eliminate the process_vm_readv(2)/process_vm_writev(2) and
context switching.

Can you remind me why there needs to be an external process?

Ketan>  Apart from virtualizing the available direct attached storage on the 
compute, vxhs storage backend (the external process) provides features such as 
storage QoS, resiliency, efficient use of direct attached storage, automatic 
storage recovery points (snapshots) etc. Implementing this in QEMU is not 
practical and not the purpose of proposing this driver.

Ketan.

Stefan

[Qemu-devel] [PATCH ppc-for-2.9 v2 0/2] POWER9 TCG enablements - part8

2016-11-24 Thread Nikunj A Dadhania

This series contains 6 new instructions for POWER9 ISA3.0
Vector Extract Left/Right Indexed

Changelog:
v1:
* Rebase
* Implement vextub[r,l]x using int128_rshift

v0:
* Change dq/ds-form decoding for primary opcode 0x3D
* Rename CR Field defines, as at every place it was
  using bit shifts.
* Use symbolic constants in xscmp*
* Fix bug in exception handling for QNaN
* Define EXTRACT128 within CONFIG_INT128

Patches
===
   01: 
  vextublx:  Vector Extract Unsigned Byte Left
  vextuhlx:  Vector Extract Unsigned Halfword Left
  vextuwlx:  Vector Extract Unsigned Word Left
   02: 
  vextubrx: Vector Extract Unsigned Byte Right-Indexed
  vextuhrx: Vector Extract Unsigned  Halfword Right-Indexed
  vextuwrx: Vector Extract Unsigned Word Right-Indexed


Avinesh Kumar (1):
  target-ppc: add vextu[bhw]lx instructions

Hariharan T.S (1):
  target-ppc: add vextu[bhw]rx instructions

 target-ppc/helper.h |  6 +++
 target-ppc/int_helper.c | 95 +
 target-ppc/translate/vmx-impl.inc.c | 23 +
 target-ppc/translate/vmx-ops.inc.c  |  8 +++-
 4 files changed, 130 insertions(+), 2 deletions(-)

-- 
2.7.4

[Qemu-devel] [PATCH v2 2/2] target-ppc: add vextu[bhw]rx instructions

2016-11-24 Thread Nikunj A Dadhania

From: "Hariharan T.S" 

vextubrx: Vector Extract Unsigned Byte Right-Indexed VX-form
vextuhrx: Vector Extract Unsigned  Halfword Right-Indexed VX-form
vextuwrx: Vector Extract Unsigned Word Right-Indexed VX-form

Signed-off-by: Hariharan T.S. 
Signed-off-by: Avinesh Kumar 
[ Implement using int128_rshift ]
Signed-off-by: Nikunj A Dadhania 
---
 target-ppc/helper.h |  3 +++
 target-ppc/int_helper.c | 47 +
 target-ppc/translate/vmx-impl.inc.c |  5 
 target-ppc/translate/vmx-ops.inc.c  |  4 +++-
 4 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index d0a8fb2..a6e04cb 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -369,6 +369,9 @@ DEF_HELPER_3(vpmsumd, void, avr, avr, avr)
 DEF_HELPER_2(vextublx, tl, tl, avr)
 DEF_HELPER_2(vextuhlx, tl, tl, avr)
 DEF_HELPER_2(vextuwlx, tl, tl, avr)
+DEF_HELPER_2(vextubrx, tl, tl, avr)
+DEF_HELPER_2(vextuhrx, tl, tl, avr)
+DEF_HELPER_2(vextuwrx, tl, tl, avr)
 
 DEF_HELPER_2(vsbox, void, avr, avr)
 DEF_HELPER_3(vcipher, void, avr, avr, avr)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index 2aa4474..406e23a 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -1853,6 +1853,53 @@ VEXTULX_DO(vextuhlx, 16)
 VEXTULX_DO(vextuwlx, 32)
 #undef VEXTULX_DO
 
+#if defined(HOST_WORDS_BIGENDIAN)
+# if defined(CONFIG_INT128)
+# define VEXTURX_DO(name, size) \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
+{   \
+int index = (15 - (a & 0xf) + 1) * 8;   \
+return int128_rshift(b->u128, index - size) &   \
+MAKE_64BIT_MASK(0, size);   \
+}
+# else
+# define VEXTURX_DO(name, size) \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
+{   \
+int index = (15 - (a & 0xf) + 1) * 8;   \
+Int128 value = int128_make128(b->u64[LO_IDX],   \
+  b->u64[HI_IDX]);  \
+return int128_rshift(value, index - size) & \
+MAKE_64BIT_MASK(0, size);   \
+}
+# endif
+#else
+# if defined(CONFIG_INT128)
+# define VEXTURX_DO(name, size) \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
+{   \
+int index = (a & 0xf) * 8;  \
+return int128_rshift(b->u128, index) &  \
+MAKE_64BIT_MASK(0, size);   \
+}
+# else
+# define VEXTURX_DO(name, size) \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
+{   \
+int index = (a & 0xf) * 8;  \
+Int128 value = int128_make128(b->u64[LO_IDX],   \
+  b->u64[HI_IDX]);  \
+return int128_rshift(value, index) &\
+MAKE_64BIT_MASK(0, size);   \
+}
+# endif
+#endif
+
+VEXTURX_DO(vextubrx, 8)
+VEXTURX_DO(vextuhrx, 16)
+VEXTURX_DO(vextuwrx, 32)
+#undef VEXTURX_DO
+
 /* The specification says that the results are undefined if all of the
  * shift counts are not identical.  We check to make sure that they are
  * to conform to what real hardware appears to do.  */
diff --git a/target-ppc/translate/vmx-impl.inc.c 
b/target-ppc/translate/vmx-impl.inc.c
index e91d10b..3dea465 100644
--- a/target-ppc/translate/vmx-impl.inc.c
+++ b/target-ppc/translate/vmx-impl.inc.c
@@ -543,6 +543,11 @@ GEN_VXFORM_HETRO(vextuhlx, 6, 25)
 GEN_VXFORM_HETRO(vextuwlx, 6, 26)
 GEN_VXFORM_DUAL(vmrgow, PPC_NONE, PPC2_ALTIVEC_207,
 vextuwlx, PPC_NONE, PPC2_ISA300)
+GEN_VXFORM_HETRO(vextubrx, 6, 28)
+GEN_VXFORM_HETRO(vextuhrx, 6, 29)
+GEN_VXFORM_HETRO(vextuwrx, 6, 30)
+GEN_VXFORM_DUAL(vmrgew, PPC_NONE, PPC2_ALTIVEC_207, \
+vextuwrx, PPC_NONE, PPC2_ISA300)
 
 #define GEN_VXRFORM1(opname, name, str, opc2, opc3) \
 static void glue(gen_, name)(DisasContext *ctx) \
diff --git a/target-ppc/translate/vmx-ops.inc.c 
b/target-ppc/translate/vmx-ops.inc.c
index e62e564..a3c9d05 100644
--- a/target-ppc/translate/vmx-ops.inc.c
+++ b/target-ppc/translate/vmx-ops.inc.c
@@ -94,7 +94,9 @@ GEN_VXFORM(vmrglw, 6, 6),
 GEN_VXFORM_300(vextublx, 6, 24),
 GEN_VXFORM_300(vextuhlx, 6, 25),
 GEN_VXFORM_DUAL(vmrgow, vextuwlx, 6, 26, PPC_NONE, PPC2_ALTIVEC_207),
-GEN_VXFORM_207(vmrgew, 6, 30),

[Qemu-devel] [PATCH v2 1/2] target-ppc: add vextu[bhw]lx instructions

2016-11-24 Thread Nikunj A Dadhania

From: Avinesh Kumar 

vextublx:  Vector Extract Unsigned Byte Left
vextuhlx:  Vector Extract Unsigned Halfword Left
vextuwlx:  Vector Extract Unsigned Word Left

Signed-off-by: Avinesh Kumar 
[ implement using int128_rshift ]
Signed-off-by: Nikunj A Dadhania 
---
 target-ppc/helper.h |  3 +++
 target-ppc/int_helper.c | 48 +
 target-ppc/translate/vmx-impl.inc.c | 18 ++
 target-ppc/translate/vmx-ops.inc.c  |  4 +++-
 4 files changed, 72 insertions(+), 1 deletion(-)

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 3b26678..d0a8fb2 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -366,6 +366,9 @@ DEF_HELPER_3(vpmsumb, void, avr, avr, avr)
 DEF_HELPER_3(vpmsumh, void, avr, avr, avr)
 DEF_HELPER_3(vpmsumw, void, avr, avr, avr)
 DEF_HELPER_3(vpmsumd, void, avr, avr, avr)
+DEF_HELPER_2(vextublx, tl, tl, avr)
+DEF_HELPER_2(vextuhlx, tl, tl, avr)
+DEF_HELPER_2(vextuwlx, tl, tl, avr)
 
 DEF_HELPER_2(vsbox, void, avr, avr)
 DEF_HELPER_3(vcipher, void, avr, avr, avr)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index fbf477f..2aa4474 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -17,6 +17,7 @@
  * License along with this library; if not, see .
  */
 #include "qemu/osdep.h"
+#include "qemu/int128.h"
 #include "cpu.h"
 #include "internal.h"
 #include "exec/exec-all.h"
@@ -1805,6 +1806,53 @@ void helper_vlogefp(CPUPPCState *env, ppc_avr_t *r, 
ppc_avr_t *b)
 }
 }
 
+#if defined(HOST_WORDS_BIGENDIAN)
+# if defined(CONFIG_INT128)
+# define VEXTULX_DO(name, size) \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
+{   \
+int index = (a & 0xf) * 8;  \
+return int128_rshift(b->u128, index) &  \
+MAKE_64BIT_MASK(0, size);   \
+}
+# else
+# define VEXTULX_DO(name, size) \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
+{   \
+int index = (a & 0xf) * 8;  \
+Int128 value = int128_make128(b->u64[LO_IDX],   \
+  b->u64[HI_IDX]);  \
+return int128_rshift(value, index) &\
+MAKE_64BIT_MASK(0, size);   \
+}
+# endif
+#else
+# if defined(CONFIG_INT128)
+# define VEXTULX_DO(name, size) \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
+{   \
+int index = (15 - (a & 0xf) + 1) * 8;   \
+return int128_rshift(b->u128, index - size) &   \
+MAKE_64BIT_MASK(0, size);   \
+}
+# else
+# define VEXTULX_DO(name, size) \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
+{   \
+int index = (15 - (a & 0xf) + 1) * 8;   \
+Int128 value = int128_make128(b->u64[LO_IDX],   \
+  b->u64[HI_IDX]);  \
+return int128_rshift(value, index - size) & \
+MAKE_64BIT_MASK(0, size);   \
+}
+# endif
+#endif
+
+VEXTULX_DO(vextublx, 8)
+VEXTULX_DO(vextuhlx, 16)
+VEXTULX_DO(vextuwlx, 32)
+#undef VEXTULX_DO
+
 /* The specification says that the results are undefined if all of the
  * shift counts are not identical.  We check to make sure that they are
  * to conform to what real hardware appears to do.  */
diff --git a/target-ppc/translate/vmx-impl.inc.c 
b/target-ppc/translate/vmx-impl.inc.c
index 7143eb3..e91d10b 100644
--- a/target-ppc/translate/vmx-impl.inc.c
+++ b/target-ppc/translate/vmx-impl.inc.c
@@ -340,6 +340,19 @@ static void glue(gen_, name0##_##name1)(DisasContext *ctx) 
 \
 }   \
 }
 
+#define GEN_VXFORM_HETRO(name, opc2, opc3)  \
+static void glue(gen_, name)(DisasContext *ctx) \
+{   \
+TCGv_ptr rb;\
+if (unlikely(!ctx->altivec_enabled)) {  \
+gen_exception(ctx, POWERPC_EXCP_VPU);   \
+return; \
+}   \
+rb = ge

Re: [Qemu-devel] [PATCH v1 05/18] tests/test-rbcache: add test cases

2016-11-24 Thread Kevin Wolf

Am 15.11.2016 um 07:37 hat Pavel Butsykin geschrieben:
> Signed-off-by: Pavel Butsykin 
> ---
>  tests/Makefile.include |   3 +
>  tests/test-rbcache.c   | 308 
> +
>  2 files changed, 311 insertions(+)
>  create mode 100644 tests/test-rbcache.c
> 
> diff --git a/tests/Makefile.include b/tests/Makefile.include
> index 8162f6f..36bb472 100644
> --- a/tests/Makefile.include
> +++ b/tests/Makefile.include
> @@ -54,6 +54,8 @@ check-unit-y += tests/test-hbitmap$(EXESUF)
>  gcov-files-test-hbitmap-y = blockjob.c
>  check-unit-y += tests/test-blockjob$(EXESUF)
>  check-unit-y += tests/test-blockjob-txn$(EXESUF)
> +gcov-files-test-rbcache-y = util/rbcache.c
> +check-unit-y += tests/test-rbcache$(EXESUF)
>  check-unit-y += tests/test-x86-cpuid$(EXESUF)
>  # all code tested by test-x86-cpuid is inside topology.h
>  gcov-files-test-x86-cpuid-y =
> @@ -481,6 +483,7 @@ tests/test-blockjob-txn$(EXESUF): 
> tests/test-blockjob-txn.o $(test-block-obj-y)
>  tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(test-block-obj-y)
>  tests/test-iov$(EXESUF): tests/test-iov.o $(test-util-obj-y)
>  tests/test-hbitmap$(EXESUF): tests/test-hbitmap.o $(test-util-obj-y)
> +tests/test-rbcache$(EXESUF): tests/test-rbcache.o $(test-util-obj-y)
>  tests/test-x86-cpuid$(EXESUF): tests/test-x86-cpuid.o
>  tests/test-xbzrle$(EXESUF): tests/test-xbzrle.o migration/xbzrle.o 
> page_cache.o $(test-util-obj-y)
>  tests/test-cutils$(EXESUF): tests/test-cutils.o util/cutils.o
> diff --git a/tests/test-rbcache.c b/tests/test-rbcache.c
> new file mode 100644
> index 000..1c72bfa
> --- /dev/null
> +++ b/tests/test-rbcache.c
> @@ -0,0 +1,308 @@
> +/*
> + * QEMU Range-Based Cache core
> + *
> + * Copyright (C) 2015-2016 Parallels IP Holdings GmbH. All rights reserved.
> + *
> + * Author: Pavel Butsykin 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * later.  See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/rbcache.h"
> +
> +typedef struct TestRBCacheData {
> +RBCache *cache;
> +} TestRBCacheData;
> +
> +typedef struct TestRBCacheConfig {
> +uint64_t limit_size;
> +int eviction_type;
> +RBNodeAlloc *alloc;
> +RBNodeFree  *free;
> +void *opaque;
> +} TestRBCacheConfig;
> +
> +#define KB(_n) ((_n) << 10)
> +#define MB(_n) ((_n) << 20)
> +
> +#define OFFSET1 0
> +#define SIZE1   KB(1)
> +
> +#define OFFSET2 KB(1)
> +#define SIZE2   KB(2)
> +
> +#define OFFSET3 KB(15)
> +#define SIZE3   KB(1)
> +
> +#define OFFSET4 KB(7)
> +#define SIZE4   KB(7)
> +
> +#define OFFSET5 KB(2)
> +#define SIZE5   KB(8)

Visualised, we test these requests:

1: *
2:  **
3:*
4:***
5:   

You test inserting the only element, inserting after the last element,
inserting in the middle and inserting something that overlaps two other
requests at its start and end.

That's a good start, but it might be worth testing more scenarios:

- Inserting a new first element to a non-empty cache
- Overlapping only at the start
- Overlapping only at the end
- Overlapping in the middle (i.e. including existing ranges as a
  subset)
* With only one node
* With multiple nodes (like adding offset=2, size=16kb here)


> +
> +static void test_rbcache_init(TestRBCacheData *data, const void *ctx)
> +{
> +g_assert_nonnull(data->cache);
> +}
> +
> +static void test_rbcache_insert(TestRBCacheData *data, const void *ctx)
> +{
> +RBCacheNode *node1 = rbcache_node_alloc(data->cache, OFFSET1, SIZE1);
> +RBCacheNode *node2 = rbcache_node_alloc(data->cache, OFFSET2, SIZE2);
> +RBCacheNode *node3 = rbcache_node_alloc(data->cache, OFFSET3, SIZE3);
> +RBCacheNode *node4 = rbcache_node_alloc(data->cache, OFFSET4, SIZE4);
> +RBCacheNode *node5 = rbcache_node_alloc(data->cache, OFFSET5, SIZE5);
> +RBCacheNode *node;
> +
> +node = rbcache_insert(data->cache, node1);
> +g_assert_true(node == node1);
> +
> +node = rbcache_insert(data->cache, node2);
> +g_assert_true(node == node2);
> +
> +node = rbcache_insert(data->cache, node3);
> +g_assert_true(node == node3);
> +
> +node = rbcache_insert(data->cache, node4);
> +g_assert_true(node == node4);
> +
> +node = rbcache_insert(data->cache, node5);
> +g_assert_true(node != node5);

You can test for the right result instead:

g_assert_true(node == node2);


> +rbcache_node_free(data->cache, node5);
> +}
> +
> +static void test_rbcache_search(TestRBCacheData *data, const void *ctx)
> +{
> +RBCacheNode *node;
> +
> +test_rbcache_insert(data, ctx);
> +
> +node = rbcache_search(data->cache, OFFSET1, SIZE1);
> +g_assert_nonnull(node);
> +g_assert_cmpuint(node->offset, ==, OFFSET1);
> +g_assert_cmpuint(node->bytes, ==, SIZE1);
> +
> +node = rbcache_search(data->cache, OFFSET2 + KB(1), SIZE1);

Intentional that you use SIZE1 here even though we're looking at node 2?

[Qemu-devel] [PATCH 0/2] fix memory leak in virtio_gpu_resource_destroy

2016-11-24 Thread Li Qiang

If the guest destroy the resource before detach banking, the 'iov'
and 'addrs' field in resource is not freed thus leading memory
leak issue. This patchset address this.

Li Qiang (2):
  virtio-gpu: add declaration for virtio_gpu_cleanup_mapping function
  virtio-gpu: call cleanup mapping function in resource destroy

 hw/display/virtio-gpu.c | 3 +++
 1 file changed, 3 insertions(+)

-- 
1.8.3.1

[Qemu-devel] [PATCH 2/2] virtio-gpu: call cleanup mapping function in resource destroy

2016-11-24 Thread Li Qiang

If the guest destroy the resource before detach banking, the 'iov'
and 'addrs' field in resource is not freed thus leading memory
leak issue. This patch avoid this.

Signed-off-by: Li Qiang 
---
 hw/display/virtio-gpu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index cc725a6..98dadf2 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -360,6 +360,7 @@ static void virtio_gpu_resource_destroy(VirtIOGPU *g,
 struct virtio_gpu_simple_resource *res)
 {
 pixman_image_unref(res->image);
+virtio_gpu_cleanup_mapping(res);
 QTAILQ_REMOVE(&g->reslist, res, next);
 g_free(res);
 }
-- 
1.8.3.1

[Qemu-devel] [PATCH 1/2] virtio-gpu: add declaration for virtio_gpu_cleanup_mapping function

2016-11-24 Thread Li Qiang

The virtio_gpu_resource_destroy function is before the definition
of the cleanup mapping function and it should call this cleanup
mapping function, so we should add a declaration for it.

Signed-off-by: Li Qiang 
---
 hw/display/virtio-gpu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index 60bce94..cc725a6 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -28,6 +28,8 @@
 static struct virtio_gpu_simple_resource*
 virtio_gpu_find_resource(VirtIOGPU *g, uint32_t resource_id);
 
+static void virtio_gpu_cleanup_mapping(struct virtio_gpu_simple_resource *res);
+
 #ifdef CONFIG_VIRGL
 #include 
 #define VIRGL(_g, _virgl, _simple, ...) \
-- 
1.8.3.1

Re: [Qemu-devel] [PATCH for-2.8] pci-assign: sync MSI/MSI-X cap and table with PCIDevice

2016-11-24 Thread Peter Xu

CC Limin.

Limin, could you please help test this patch as well to see whether it
can solve your issue? This one is slightly cleaner IMHO.

Thanks,

On Thu, Nov 24, 2016 at 06:25:13PM +0800, Peter Xu wrote:
> Since commit e1d4fb2d ("kvm-irqchip: x86: add msi route notify fn"),
> kvm_irqchip_add_msi_route() starts to use pci_get_msi_message() to fetch
> MSI info. This requires that we setup MSI related fields in PCIDevice.
> For most devices, that won't be a problem, as long as we are using
> general interfaces like msi_init()/msix_init().
> 
> However, for pci-assign devices, MSI/MSI-X is treated differently - PCI
> assign devices are maintaining its own MSI table and cap information in
> AssignedDevice struct. however that's not synced up with PCIDevice's
> fields. That will leads to pci_get_msi_message() failed to find correct
> MSI capability, even with an NULL msix_table.
> 
> A quick fix is to sync up the two places: both the capability bits and
> table address for MSI/MSI-X.
> 
> Reported-by: Changlimin 
> Cc: qemu-sta...@nongnu.org
> Fixes: e1d4fb2d ("kvm-irqchip: x86: add msi route notify fn")
> Signed-off-by: Peter Xu 
> 
> ---
> Do we still support pci-assign?
> ---
>  hw/i386/kvm/pci-assign.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
> index 8238fbc..df9748d 100644
> --- a/hw/i386/kvm/pci-assign.c
> +++ b/hw/i386/kvm/pci-assign.c
> @@ -1251,6 +1251,7 @@ static int assigned_device_pci_cap_init(PCIDevice 
> *pci_dev, Error **errp)
>  error_propagate(errp, local_err);
>  return -ENOTSUP;
>  }
> +dev->dev.cap_present |= QEMU_PCI_CAP_MSI;
>  dev->cap.available |= ASSIGNED_DEVICE_CAP_MSI;
>  /* Only 32-bit/no-mask currently supported */
>  ret = pci_add_capability2(pci_dev, PCI_CAP_ID_MSI, pos, 10,
> @@ -1285,6 +1286,7 @@ static int assigned_device_pci_cap_init(PCIDevice 
> *pci_dev, Error **errp)
>  error_propagate(errp, local_err);
>  return -ENOTSUP;
>  }
> +dev->dev.cap_present |= QEMU_PCI_CAP_MSIX;
>  dev->cap.available |= ASSIGNED_DEVICE_CAP_MSIX;
>  ret = pci_add_capability2(pci_dev, PCI_CAP_ID_MSIX, pos, 12,
>&local_err);
> @@ -1648,6 +1650,7 @@ static void 
> assigned_dev_register_msix_mmio(AssignedDevice *dev, Error **errp)
>  dev->msix_table = NULL;
>  return;
>  }
> +dev->dev.msix_table = dev->msix_table;
>  
>  assigned_dev_msix_reset(dev);
>  
> @@ -1665,6 +1668,7 @@ static void 
> assigned_dev_unregister_msix_mmio(AssignedDevice *dev)
>  error_report("error unmapping msix_table! %s", strerror(errno));
>  }
>  dev->msix_table = NULL;
> +dev->dev.msix_table = NULL;
>  }
>  
>  static const VMStateDescription vmstate_assigned_device = {
> -- 
> 2.7.4
> 

-- peterx

Re: [Qemu-devel] [dpdk-dev] dpdk/vpp and cross-version migration for vhost

2016-11-24 Thread Yuanhan Liu

On Thu, Nov 24, 2016 at 09:30:49AM +, Kevin Traynor wrote:
> On 11/24/2016 06:31 AM, Yuanhan Liu wrote:
> > On Tue, Nov 22, 2016 at 04:53:05PM +0200, Michael S. Tsirkin wrote:
>  You keep assuming that you have the VM started first and
>  figure out things afterwards, but this does not work.
> 
>  Think about a cluster of machines. You want to start a VM in
>  a way that will ensure compatibility with all hosts
>  in a cluster.
> >>>
> >>> I see. I was more considering about the case when the dst
> >>> host (including the qemu and dpdk combo) is given, and
> >>> then determine whether it will be a successfull migration
> >>> or not.
> >>>
> >>> And you are asking that we need to know which host could
> >>> be a good candidate before starting the migration. In such
> >>> case, we indeed need some inputs from both the qemu and
> >>> vhost-user backend.
> >>>
> >>> For DPDK, I think it could be simple, just as you said, it
> >>> could be either a tiny script, or even a macro defined in
> >>> the source code file (we extend it every time we add a
> >>> new feature) to let the libvirt to read it. Or something
> >>> else.
> >>
> >> There's the issue of APIs that tweak features as Maxime
> >> suggested.
> > 
> > Yes, it's a good point.
> > 
> >> Maybe the only thing to do is to deprecate it,
> > 
> > Looks like so.
> > 
> >> but I feel some way for application to pass info into
> >> guest might be benefitial.
> > 
> > The two APIs are just for tweaking feature bits DPDK supports before
> > any device got connected. It's another way to disable some features
> > (the another obvious way is to through QEMU command lines).
> > 
> > IMO, it's bit handy only in a case like: we have bunch of VMs. Instead
> > of disabling something though qemu one by one, we could disable it
> > once in DPDK.
> > 
> > But I doubt the useful of it. It's only used in DPDK's vhost example
> > after all. Nor is it used in vhost pmd, neither is it used in OVS.
> 
> rte_vhost_feature_disable() is currently used in OVS, lib/netdev-dpdk.c

Hmmm. I must have checked very old code ...
> 
> netdev_dpdk_vhost_class_init(void)
> {
> static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
> 
> /* This function can be called for different classes.  The
> initialization
>  * needs to be done only once */
> if (ovsthread_once_start(&once)) {
> rte_vhost_driver_callback_register(&virtio_net_device_ops);
> rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_HOST_TSO4
>   | 1ULL << VIRTIO_NET_F_HOST_TSO6
>   | 1ULL << VIRTIO_NET_F_CSUM);

I saw the commit introduced such change, but it tells no reason why
it was added.

commit 362ca39639ae871806be5ae97d55e1cbb14afd92
Author: mweglicx 
Date:   Thu Apr 14 17:40:06 2016 +0100

Update relevant artifacts to add support for DPDK 16.04.

Following changes are applied:
 - INSTALL.DPDK.md: CONFIG_RTE_BUILD_COMBINE_LIBS step has been
   removed because it is no longer present in DPDK configuration
   (combined library is created by default),
 - INSTALL.DPDK.md: VHost Cuse configuration is updated,
 - netdev-dpdk.c: Link speed definition is changed in DPDK and
   netdev_dpdk_get_features is updated accordingly,
 - netdev-dpdk.c: TSO and checksum offload has been disabled for
   vhostuser device.
 - .travis/linux-build.sh: DPDK version is updated and legacy
   flags have been removed in configuration.

Signed-off-by: Michal Weglicki 
Signed-off-by: Panu Matilainen 
Acked-by: Daniele Di Proietto 

--yliu

Re: [Qemu-devel] [PATCH v1 01/18] block/io: add bdrv_aio_{preadv, pwritev}

2016-11-24 Thread Kevin Wolf

Am 24.11.2016 um 11:58 hat Pavel Butsykin geschrieben:
> On 23.11.2016 17:28, Kevin Wolf wrote:
> >Am 15.11.2016 um 07:36 hat Pavel Butsykin geschrieben:
> >>It's just byte-based wrappers over bdrv_co_aio_prw_vector(), which provide
> >>  a byte-based interface for AIO read/write.
> >>
> >>Signed-off-by: Pavel Butsykin 
> >
> >I'm in the process to phase out the last users of bdrv_aio_*() so that
> >this set of interfaces can be removed. I'm doing this because it's an
> >unnecessary redundancy, we have too many wrapper functions that expose
> >the same functionality with different syntax. So let's not add new
> >users.
> >
> >At first sight, you don't even seem to use bdrv_aio_preadv() for actual
> >parallelism, but you often have a pattern like this:
> >
> > void foo_cb(void *opaque)
> > {
> > ...
> > qemu_coroutine_enter(acb->co);
> > }
> >
> > void caller()
> > {
> > ...
> > acb = bdrv_aio_preadv(...);
> > qemu_coroutine_yield();
> > }
> >
> >The code will actually become a lot simpler if you use bdrv_co_preadv()
> >instead because you don't have to have a callback, but you get pure
> >sequential code.
> >
> >The part that actually has some parallelism, pcache_readahead_request(),
> >already creates its own coroutine, so it runs in the background without
> >using callback-style interfaces.
> 
> I used bdrv_co_preadv(), because it conveniently solves the partial
> cache hit. To solve the partial cache hit, we need to split a request
> into smaller parts, make asynchronous requests and wait for all
> requests in one place.
> 
> Do you propose to create a coroutine for each part of request? It
> seemed to me that bdrv_co_preadv() is a wrapper that allows us to get
> rid of the same code.

It's actually the other way round, bdrv_co_preadv() is the "native"
block layer API, and bdrv_aio_*() are wrappers providing an alternative
interface.

I'm looking at pcache_co_readahead(), for example. It looks like this:

bdrv_aio_preadv(bs->file, node->common.offset, &readahead_acb.qiov,
node->common.bytes, pcache_aio_readahead_cb,
&readahead_acb);
qemu_coroutine_yield();

And then we have pcache_aio_readahead_cb(), which ends in:

qemu_coroutine_enter(acb->co);

So here the callback style doesn't buy you anything, it just rips the
code apart in two function. There is no parallelism here anyway,
pcache_co_readahead() doesn't do anything until the callback reenters
it. This is a very obvious example where bdrv_co_preadv() will simplify
the code.

It's similar with the other bdrv_aio_preadv() calls, which are in
pcache_co_preadv():

if (bytes > s->max_aio_size) {
bdrv_aio_preadv(bs->file, offset, qiov, bytes,
pcache_aio_read_cb, &acb);
goto out;
}

update_req_stats(s->req_stats, offset, bytes);

status = pcache_lookup_data(&acb);
if (status == CACHE_MISS) {
bdrv_aio_preadv(bs->file, offset, qiov, bytes,
pcache_aio_read_cb, &acb);
} else if (status == PARTIAL_CACHE_HIT) {
assert(acb.part.qiov.niov != 0);
bdrv_aio_preadv(bs->file, acb.part.offset, &acb.part.qiov,
acb.part.bytes, pcache_aio_read_cb, &acb);
}

pcache_readahead_request(&acb);

if (status == CACHE_HIT && --acb.ref == 0) {
return 0;
}

out:
qemu_coroutine_yield();

Here you have mainly the pcache_readahead_request() call between
bdrv_aio_preadv() and the yield. It only spawns a new coroutine, which
works in the background, so I think you can move it to before the reads
and then the reads can trivially become bdrv_co_preadv() and the
callback can again be inlined instead of ripping the function in two
parts.

The bdrv_aio_pwritev() call in pcache_co_pwritev() is just the same
thing and using the coroutine version results in obvious code
improvements.

And I think this are all uses of bdrv_aio_*() in the pcache driver, so
converting it to use bdrv_co_*() instead isn't only possible, but will
improve the legibility of your code, too. It's a clear win in all three
places.

Kevin

Re: [Qemu-devel] [PATCH] memory: add section range info for IOMMU notifier

2016-11-24 Thread Peter Xu

On Thu, Nov 24, 2016 at 03:04:35PM +1100, David Gibson wrote:

[...]

> > diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> > index 801578b..c3db115 100644
> > --- a/hw/vfio/common.c
> > +++ b/hw/vfio/common.c
> > @@ -455,6 +455,10 @@ static void vfio_listener_region_add(MemoryListener 
> > *listener,
> >  giommu->container = container;
> >  giommu->n.notify = vfio_iommu_map_notify;
> >  giommu->n.notifier_flags = IOMMU_NOTIFIER_ALL;
> > +giommu->n.start = section->offset_within_address_space;
> 
> I think this needs to be offset_within_region rather than
> offset_within_address_space.  The IOVAs used in the IOMMUTLBEntry are
> relative to the MR, not the enclosing AS (in fact there could be
> several enclosing ASes with the right aliasing).  See for example
> put_tce_emu() - the (ioba - tcet->bus_offset) expression is
> effectively converting the AS relative ioba into an MR relative
> address.

Thanks for the pointer. Will fix (and the other place).

-- peterx

Re: [Qemu-devel] [dpdk-dev] dpdk/vpp and cross-version migration for vhost

2016-11-24 Thread Maxime Coquelin




On 11/24/2016 01:33 PM, Yuanhan Liu wrote:

On Thu, Nov 24, 2016 at 09:30:49AM +, Kevin Traynor wrote:

> On 11/24/2016 06:31 AM, Yuanhan Liu wrote:

> > On Tue, Nov 22, 2016 at 04:53:05PM +0200, Michael S. Tsirkin wrote:

>  You keep assuming that you have the VM started first and
>  figure out things afterwards, but this does not work.
> 
>  Think about a cluster of machines. You want to start a VM in
>  a way that will ensure compatibility with all hosts
>  in a cluster.

> >>>
> >>> I see. I was more considering about the case when the dst
> >>> host (including the qemu and dpdk combo) is given, and
> >>> then determine whether it will be a successfull migration
> >>> or not.
> >>>
> >>> And you are asking that we need to know which host could
> >>> be a good candidate before starting the migration. In such
> >>> case, we indeed need some inputs from both the qemu and
> >>> vhost-user backend.
> >>>
> >>> For DPDK, I think it could be simple, just as you said, it
> >>> could be either a tiny script, or even a macro defined in
> >>> the source code file (we extend it every time we add a
> >>> new feature) to let the libvirt to read it. Or something
> >>> else.

> >>
> >> There's the issue of APIs that tweak features as Maxime
> >> suggested.

> >
> > Yes, it's a good point.
> >

> >> Maybe the only thing to do is to deprecate it,

> >
> > Looks like so.
> >

> >> but I feel some way for application to pass info into
> >> guest might be benefitial.

> >
> > The two APIs are just for tweaking feature bits DPDK supports before
> > any device got connected. It's another way to disable some features
> > (the another obvious way is to through QEMU command lines).
> >
> > IMO, it's bit handy only in a case like: we have bunch of VMs. Instead
> > of disabling something though qemu one by one, we could disable it
> > once in DPDK.
> >
> > But I doubt the useful of it. It's only used in DPDK's vhost example
> > after all. Nor is it used in vhost pmd, neither is it used in OVS.

>
> rte_vhost_feature_disable() is currently used in OVS, lib/netdev-dpdk.c

Hmmm. I must have checked very old code ...

>
> netdev_dpdk_vhost_class_init(void)
> {
> static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
>
> /* This function can be called for different classes.  The
> initialization
>  * needs to be done only once */
> if (ovsthread_once_start(&once)) {
> rte_vhost_driver_callback_register(&virtio_net_device_ops);
> rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_HOST_TSO4
>   | 1ULL << VIRTIO_NET_F_HOST_TSO6
>   | 1ULL << VIRTIO_NET_F_CSUM);

I saw the commit introduced such change, but it tells no reason why
it was added.


I'm also interested to know the reason.
In any case, I think this is something that can/should be managed by
the management tool, which  should disable it in cmd parameters.

Kevin, do you agree?

Cheers,
Maxime

[Qemu-devel] [PATCH v2] memory: add section range info for IOMMU notifier

2016-11-24 Thread Peter Xu

In this patch, IOMMUNotifier.{start|end} are introduced to store section
information for a specific notifier. When notification occurs, we not
only check the notification type (MAP|UNMAP), but also check whether the
notified iova is in the range of specific IOMMU notifier, and skip those
notifiers if not in the listened range.

When removing an region, we need to make sure we removed the correct
VFIOGuestIOMMU by checking the IOMMUNotifier.start address as well.

Suggested-by: David Gibson 
Signed-off-by: Peter Xu 
---
v2:
- replace offset_within_address_space with offset_within_region since
  IOTLB iova is relative to region [David]
---
 hw/vfio/common.c  | 7 ++-
 include/exec/memory.h | 3 +++
 memory.c  | 4 +++-
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 801578b..6f648da 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -455,6 +455,10 @@ static void vfio_listener_region_add(MemoryListener 
*listener,
 giommu->container = container;
 giommu->n.notify = vfio_iommu_map_notify;
 giommu->n.notifier_flags = IOMMU_NOTIFIER_ALL;
+giommu->n.start = section->offset_within_region;
+llend = int128_add(int128_make64(giommu->n.start), section->size);
+llend = int128_sub(llend, int128_one());
+giommu->n.end = int128_get64(llend);
 QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
 
 memory_region_register_iommu_notifier(giommu->iommu, &giommu->n);
@@ -525,7 +529,8 @@ static void vfio_listener_region_del(MemoryListener 
*listener,
 VFIOGuestIOMMU *giommu;
 
 QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
-if (giommu->iommu == section->mr) {
+if (giommu->iommu == section->mr &&
+giommu->n.start == section->offset_within_region) {
 memory_region_unregister_iommu_notifier(giommu->iommu,
 &giommu->n);
 QLIST_REMOVE(giommu, giommu_next);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 9728a2f..87357ea 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -84,6 +84,9 @@ typedef enum {
 struct IOMMUNotifier {
 void (*notify)(struct IOMMUNotifier *notifier, IOMMUTLBEntry *data);
 IOMMUNotifierFlag notifier_flags;
+/* Notify for address space range start <= addr <= end */
+hwaddr start;
+hwaddr end;
 QLIST_ENTRY(IOMMUNotifier) node;
 };
 typedef struct IOMMUNotifier IOMMUNotifier;
diff --git a/memory.c b/memory.c
index 33110e9..f89d047 100644
--- a/memory.c
+++ b/memory.c
@@ -1662,7 +1662,9 @@ void memory_region_notify_iommu(MemoryRegion *mr,
 }
 
 QLIST_FOREACH(iommu_notifier, &mr->iommu_notify, node) {
-if (iommu_notifier->notifier_flags & request_flags) {
+if (iommu_notifier->notifier_flags & request_flags &&
+iommu_notifier->start <= entry.iova &&
+iommu_notifier->end >= entry.iova) {
 iommu_notifier->notify(iommu_notifier, &entry);
 }
 }
-- 
2.7.4

[Qemu-devel] [PATCH] ps2: add support for mice with extra/side buttons

2016-11-24 Thread Fabian Lesniak


This patch introduces the SIDE and EXTRA mouse buttons and implements
appropriate event generation for gtk and input-linux input methods.

The naming was borrowed from evdev since it is more descriptive than
BUTTON4/5.

Note that the guest has to switch the ps2 mouse into IMEX mode,
otherwise events of the extra buttons are ignored. For example on a Windows
guest one needs to manually select the "Microsoft PS/2 Mouse" driver.

Signed-off-by: Fabian Lesniak 
---
 hw/input/ps2.c   | 6 ++
 qapi-schema.json | 2 +-
 ui/gtk.c | 4 
 ui/input-linux.c | 6 ++
 4 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/hw/input/ps2.c b/hw/input/ps2.c
index 0d14de0..7347da5 100644
--- a/hw/input/ps2.c
+++ b/hw/input/ps2.c
@@ -870,10 +870,16 @@ static void ps2_mouse_send_packet(PS2MouseState *s)
 static void ps2_mouse_event(DeviceState *dev, QemuConsole *src,
 InputEvent *evt)
 {
+/* the definition of MOUSE_EVENT_WHEELUP/MOUSE_EVENT_WHEELDN
+ * is ambiguous since ps2_mouse_send_packet expects the bits
+ * for buttons 4 and 5 at s->mouse_buttons & 0x18 which
+ * matches MOUSE_EVENT_WHEELUP/MOUSE_EVENT_WHEELDN */
 static const int bmap[INPUT_BUTTON__MAX] = {
 [INPUT_BUTTON_LEFT]   = MOUSE_EVENT_LBUTTON,
 [INPUT_BUTTON_MIDDLE] = MOUSE_EVENT_MBUTTON,
 [INPUT_BUTTON_RIGHT]  = MOUSE_EVENT_RBUTTON,
+[INPUT_BUTTON_SIDE]   = MOUSE_EVENT_WHEELUP,
+[INPUT_BUTTON_EXTRA]  = MOUSE_EVENT_WHEELDN,
 };
 PS2MouseState *s = (PS2MouseState *)dev;
 InputMoveEvent *move;
diff --git a/qapi-schema.json b/qapi-schema.json
index f3e9bfc..dc7b28c 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -4280,7 +4280,7 @@
 # Since: 2.0
 ##
 { 'enum'  : 'InputButton',
-  'data'  : [ 'left', 'middle', 'right', 'wheel-up', 'wheel-down' ] }
+  'data'  : [ 'left', 'middle', 'right', 'wheel-up', 'wheel-down', 
'side', 'extra' ] }


 ##
 # @InputAxis
diff --git a/ui/gtk.c b/ui/gtk.c
index e816428..9cdce83 100644
--- a/ui/gtk.c
+++ b/ui/gtk.c
@@ -995,6 +995,10 @@ static gboolean gd_button_event(GtkWidget *widget, 
GdkEventButton *button,

 btn = INPUT_BUTTON_MIDDLE;
 } else if (button->button == 3) {
 btn = INPUT_BUTTON_RIGHT;
+} else if (button->button == 8) {
+btn = INPUT_BUTTON_SIDE;
+} else if (button->button == 9) {
+btn = INPUT_BUTTON_EXTRA;
 } else {
 return TRUE;
 }
diff --git a/ui/input-linux.c b/ui/input-linux.c
index f345317..ac31f47 100644
--- a/ui/input-linux.c
+++ b/ui/input-linux.c
@@ -291,6 +291,12 @@ static void input_linux_handle_mouse(InputLinux 
*il, struct input_event *event)

 qemu_input_queue_btn(NULL, INPUT_BUTTON_WHEEL_DOWN,
  event->value);
 break;
+case BTN_SIDE:
+qemu_input_queue_btn(NULL, INPUT_BUTTON_SIDE, event->value);
+break;
+case BTN_EXTRA:
+qemu_input_queue_btn(NULL, INPUT_BUTTON_EXTRA, event->value);
+break;
 };
 break;
 case EV_REL:
--
2.10.2

[Qemu-devel] [PATCH] 9pfs: add missing coroutine_fn annotations

2016-11-24 Thread Greg Kurz

Signed-off-by: Greg Kurz 
---
 hw/9pfs/9p.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index aea7e9d39206..e4815a97922d 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -1571,7 +1571,7 @@ out_nofid:
 v9fs_string_free(&name);
 }
 
-static void v9fs_fsync(void *opaque)
+static void coroutine_fn v9fs_fsync(void *opaque)
 {
 int err;
 int32_t fid;
@@ -2332,7 +2332,7 @@ out_nofid:
 v9fs_string_free(&symname);
 }
 
-static void v9fs_flush(void *opaque)
+static void coroutine_fn v9fs_flush(void *opaque)
 {
 ssize_t err;
 int16_t tag;

Re: [Qemu-devel] [RFC 00/15] qmp: Report supported device types on 'query-machines'

2016-11-24 Thread Markus Armbruster

Eduardo Habkost  writes:

> On Wed, Nov 23, 2016 at 06:43:16PM +0200, Marcel Apfelbaum wrote:
>> On 11/22/2016 03:11 AM, Eduardo Habkost wrote:
>> > The Problem
>> > ===
>> > 
>> > Currently management software has no way to find out which device
>> > types can be plugged in a machine, unless the machine is already
>> > initialized.

I doubt it can find the precise list of device types even for an
initialized machine.

>> Hi Eduardo,
>> Thank you for this interesting series. I think this is a problem
>> worth addressing.
>> 
>> > Even after the machine is initialized, there's no way to map
>> > existing bus types to supported device types unless management
>> > software hardcodes the mapping between bus types and device
>> > types.
>> > 
>> 
>> Here I am a little lost.
>> 
>> We are going for machine => supported devices or
>> bus-type => supported devices?
>
> On this series, we go for machine-type => supported-devices.
>
> A bus-type => supported-devices map wouldn't work because
> different PCIe bus instances might accept different types of
> devices (so supported-devices depend on the specific bus
> instance, not only on the bus-type).
>
> v2 will probably be more detailed. I plan to change it to:
>
> query-machine(machine-type) => list of BusInfo
>
> BusInfo would contain:
>  * bus-type
>  * bus-path
>  * accepted-device-types (list of type/interface names)

Let me take a step back and explore what questions a management
application may want to ask.  I think the basic questions are:

* Which devices models could be plugged now?

  "Now" means in the current state of the machine.  Depending on "now",
  this could be hot- or cold-plug.

* Into which sockets could device model T be plugged now?

  Mind, I wrote "socket", not bus.  Buses provide sockets, but they are
  not the only provider of sockets.

  We need a way to name sockets.  (QOM path to device or bus, socket
  address on device or bus) could work.

Are we on the same page so far?

Related: same questions with "newly created machine of machine type M"
instead of "now".  This isn't basic, because you could always create the
machine, then ask the basic question.  But not having to create the
machine is convenient, in particular as long as a single QEMU process
can only ever create one machine.

Related: same, with creation options.

Your series provides a solution for a special non-basic case: newly
created machine of machine type M with no optional parts, i.e. with
-nodefaults and nothing else.  Or maybe with the default set of optional
parts, i.e. without -nodefaults, not sure.

I'm not demanding you provide a solution for the general case first.  It
would be nice, but it could be too complex for a first step.  I do
recommend you design your solution for the special case with the general
case in mind.  Say, by trying to ensure the special case query returns
something that will do or can be evolved to do for the general case,
too.

>> > Example: floppy support on q35 vs i440fx
>> > 
>> > 
>> > There's no way for libvirt to find out that there's no floppy
>> > controller on pc-q35-* machine-types by default.
>> > 
>> 
>> Again "by default". So do we want to query the init state of a machine?
>> What devices are there? Or what devices *can be* there?
>
> "by default" means what's present when using "-machine "
> with no extra -device arguments.

Not just -device.  Some legacy options can also create buses.  For
instance, -device if=scsi,... creates a SCSI bus.

> We want to know what _buses_ are always there. Which in turn lets
> management know which _device_ types _can_ be plugged.
>
>> 
>> > With this series, pc-i440fx-* will report "floppy" as a supported
>> > device type, but pc-q35-* will not.
>> > 
>> > Example: Legacy PCI vs vs PCIe devices
>> > --
>> > 
>> > Some devices require a PCIe bus to be available, others work on
>> > both legacy PCI and PCIe, while others work only on a legacy PCI
>> > bus.
>> > 
>> > Currently management software has no way to know which devices
>> > can be added to a given machine, unless it hardcodes machine-type
>> > names and device-types names.
>> > 
>> 
>> Again it seems a double problem, machine => devices vs pci/pcie bus => 
>> devices.
>> The bus => devices match is not related to a machine type.
>
> A bus-type => device-type match would not depend on the
> machine-type, but it would not be useful: different bus instances
> can accept different device-types (and the way the bus topology
> is configured depend on the machine-type).
>
>
>> 
>> > The Proposed Interface
>> > ==
>> > 
>> > This series adds a new field to the output of 'query-machines':
>> > 'supported-device-types'. It will contain a list of QOM type
>> > names, that can be used to find the list of device types that can
>> > be plugged in the machine by default.
>> 
>> What do you mean "by default"? Without bridges or part of the machine

Re: [Qemu-devel] -nodefaults and available buses (was Re: [RFC 00/15] qmp: Report supported device types on 'query-machines')

2016-11-24 Thread Markus Armbruster

Eduardo Habkost  writes:

> (CCing the maintainers of the machines that crash when using
> -nodefaults)
>
> On Tue, Nov 22, 2016 at 08:34:50PM -0200, Eduardo Habkost wrote:
> [...]
>> "default defaults" vs "-nodefault defaults"
>> ---
>> 
>> Two bad news:
>> 
>> 1) We need to differentiate buses created by the machine with
>>"-nodefaults" and buses that are created only without
>>"-nodefaults".
>> 
>> libvirt use -nodefaults when starting QEMU, so knowing which
>> buses are available when using -nodefaults is more interesting
>> for them.

Yes.

>> Other software, on the other hand, might be interested in the
>> results without -nodefaults.

Maybe.

Related: other machine options, such as usb=on.

>> We need to be able model both cases in the new interface.
>> Suggestions are welcome.
>
> The good news is that the list is short. The only[1] machines
> where the list of buses seem to change when using -nodefaults
> are:
>
> * mpc8544ds
> * ppce500
> * mpc8544ds
> * ppce500
> * s390-ccw-virtio-*
>
> On all cases above, the only difference is that a virtio bus is
> available if not using -nodefaults.
>
> Considering that the list is short, I plan to rename
> 'supported-device-types' to 'always-available-buses', and
> document that it will include only the buses that are not
> disabled by -nodefaults.
>
> [1] I mean, the only ones from the set that don't crash with
> -nodefaults. The ones below could not be tested:
>
>> 2) A lot of machine-types won't start if using
>>"-nodefaults -machine " without any extra devices or
>>drives.
>> 
>> Lots of machines require some drives or devices to be created
>> (especially ARM machines that require a SD drive to be
>> available).
>> 
>> Some machines will make QEMU exit, some of them simply segfault.
>> I am looking for ways to work around it so we can still validate
>> -nodefaults-based info on the test code.
>
> The following machines won't work with -nodefaults:
>
> These make QEMU segfault:
> * cubieboard
> * petalogix-ml605
> * or32-sim
> * virtex-ml507
> * Niagara

Bugs.

> These exit with a "missing SecureDigital device" error:
> * akita
> * borzoi
> * cheetah
> * connex
> * mainstone
> * n800
> * n810
> * spitz
> * sx1
> * sx1-v1
> * terrier
> * tosa
> * verdex
> * z2

Bugs only if there is no other way to provide the SD device.  I believe
some variation if -drive if=sd,... should do fine.

Re: [Qemu-devel] [kvm-unit-tests PATCH v7 09/11] arm/arm64: add initial gicv3 support

2016-11-24 Thread Andrew Jones

On Thu, Nov 24, 2016 at 10:54:35AM +0100, Auger Eric wrote:
> Hi Drew,
> 
> On 23/11/2016 17:54, Andrew Jones wrote:
> > Reviewed-by: Alex Bennée 
> > Reviewed-by: Eric Auger 
> > Signed-off-by: Andrew Jones 
> > 
> > ---
> > v7: split lib/arm/gic.c into gic-v2/3.c [Eric]
> > v6:
> >  - added comments [Alex]
> >  - added stride parameter to gicv3_set_redist_base [Andre]
> >  - redist-wait s/rwp/uwp/ and comment [Andre]
> >  - removed unnecessary wait-for-rwps [Andre]
> > v5: use modern register names [Andre]
> > v4:
> >  - only take defines from kernel we need now [Andre]
> >  - simplify enable by not caring if we reinit the distributor [drew]
> > v2:
> >  - configure irqs as NS GRP1
> > ---
> >  arm/Makefile.common|   2 +-
> >  lib/arm/asm/arch_gicv3.h   |  47 
> >  lib/arm/asm/gic-v3.h   | 104 
> > +
> >  lib/arm/asm/gic.h  |   5 ++-
> >  lib/arm/gic-v2.c   |  27 
> >  lib/arm/gic-v3.c   |  61 ++
> >  lib/arm/gic.c  |  30 +
> >  lib/arm64/asm/arch_gicv3.h |  44 +++
> >  lib/arm64/asm/gic-v3.h |   1 +
> >  lib/arm64/asm/sysreg.h |  44 +++
> >  10 files changed, 343 insertions(+), 22 deletions(-)
> >  create mode 100644 lib/arm/asm/arch_gicv3.h
> >  create mode 100644 lib/arm/asm/gic-v3.h
> >  create mode 100644 lib/arm/gic-v2.c
> >  create mode 100644 lib/arm/gic-v3.c
> >  create mode 100644 lib/arm64/asm/arch_gicv3.h
> >  create mode 100644 lib/arm64/asm/gic-v3.h
> >  create mode 100644 lib/arm64/asm/sysreg.h
> > 
> > diff --git a/arm/Makefile.common b/arm/Makefile.common
> > index 2fe7aeeca6d4..6c0898f28be1 100644
> > --- a/arm/Makefile.common
> > +++ b/arm/Makefile.common
> > @@ -46,7 +46,7 @@ cflatobjs += lib/arm/mmu.o
> >  cflatobjs += lib/arm/bitops.o
> >  cflatobjs += lib/arm/psci.o
> >  cflatobjs += lib/arm/smp.o
> > -cflatobjs += lib/arm/gic.o
> > +cflatobjs += lib/arm/gic.o lib/arm/gic-v2.o lib/arm/gic-v3.o
> >  
> >  libeabi = lib/arm/libeabi.a
> >  eabiobjs = lib/arm/eabi_compat.o
> > diff --git a/lib/arm/asm/arch_gicv3.h b/lib/arm/asm/arch_gicv3.h
> > new file mode 100644
> > index ..276577452a14
> > --- /dev/null
> > +++ b/lib/arm/asm/arch_gicv3.h
> > @@ -0,0 +1,47 @@
> > +/*
> > + * All ripped off from arch/arm/include/asm/arch_gicv3.h
> > + *
> > + * Copyright (C) 2016, Red Hat Inc, Andrew Jones 
> > + *
> > + * This work is licensed under the terms of the GNU LGPL, version 2.
> > + */
> > +#ifndef _ASMARM_ARCH_GICV3_H_
> > +#define _ASMARM_ARCH_GICV3_H_
> > +
> > +#ifndef __ASSEMBLY__
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#define __stringify xstr
> > +
> > +#define __ACCESS_CP15(CRn, Op1, CRm, Op2)  p15, Op1, %0, CRn, CRm, Op2
> > +
> > +#define ICC_PMR__ACCESS_CP15(c4, 0, c6, 0)
> > +#define ICC_IGRPEN1__ACCESS_CP15(c12, 0, c12, 7)
> > +
> > +static inline void gicv3_write_pmr(u32 val)
> > +{
> > +   asm volatile("mcr " __stringify(ICC_PMR) : : "r" (val));
> > +}
> > +
> > +static inline void gicv3_write_grpen1(u32 val)
> > +{
> > +   asm volatile("mcr " __stringify(ICC_IGRPEN1) : : "r" (val));
> > +   isb();
> > +}
> > +
> > +/*
> > + * We may access GICR_TYPER and GITS_TYPER by reading both the TYPER
> > + * offset and the following offset (+ 4) and then combining them to
> > + * form a 64-bit address.
> > + */
> > +static inline u64 gicv3_read_typer(const volatile void __iomem *addr)
> > +{
> > +   u64 val = readl(addr);
> > +   val |= (u64)readl(addr + 4) << 32;
> > +   return val;
> > +}
> > +
> > +#endif /* !__ASSEMBLY__ */
> > +#endif /* _ASMARM_ARCH_GICV3_H_ */
> > diff --git a/lib/arm/asm/gic-v3.h b/lib/arm/asm/gic-v3.h
> > new file mode 100644
> > index ..73ade4681d21
> > --- /dev/null
> > +++ b/lib/arm/asm/gic-v3.h
> > @@ -0,0 +1,104 @@
> > +/*
> > + * All GIC* defines are lifted from include/linux/irqchip/arm-gic-v3.h
> > + *
> > + * Copyright (C) 2016, Red Hat Inc, Andrew Jones 
> > + *
> > + * This work is licensed under the terms of the GNU LGPL, version 2.
> > + */
> > +#ifndef _ASMARM_GIC_V3_H_
> > +#define _ASMARM_GIC_V3_H_
> > +
> > +#ifndef _ASMARM_GIC_H_
> > +#error Do not directly include . Include 
> > +#endif
> > +
> > +/*
> > + * Distributor registers
> > + *
> > + * We expect to be run in Non-secure mode, thus we define the
> > + * group1 enable bits with respect to that view.
> > + */
> > +#define GICD_CTLR_RWP  (1U << 31)
> > +#define GICD_CTLR_ARE_NS   (1U << 4)
> > +#define GICD_CTLR_ENABLE_G1A   (1U << 1)
> > +#define GICD_CTLR_ENABLE_G1(1U << 0)
> > +
> > +/* Re-Distributor registers, offsets from RD_base */
> > +#define GICR_TYPER 0x0008
> > +
> > +#define GICR_TYPER_LAST(1U << 4)
> > +
> > +/* Re-Distributor registers, offsets from SGI_base */
> > +#define GICR_IGROUPR0

Re: [Qemu-devel] [kvm-unit-tests PATCH v7 10/11] arm/arm64: gicv3: add an IPI test

2016-11-24 Thread Andrew Jones

On Thu, Nov 24, 2016 at 10:54:55AM +0100, Auger Eric wrote:
> Hi Drew,
> 
> On 23/11/2016 17:54, Andrew Jones wrote:
> > Signed-off-by: Andrew Jones 
> > 
> > ---
> > v7:
> >  - add common ipi_send_single/mask (replacing ipi_send).
> >Note, the arg order irq,cpu got swapped. [Eric]
> >  - comment rewording [Eric]
> >  - make enable_defaults a common op [Eric]
> >  - gic_enable_defaults() will now invoke gic_init if
> >necessary [drew]
> >  - split lib/arm/gic.c into gic-v2/3.c [Eric]
> > v6: move most gicv2/gicv3 wrappers to common code [Alex]
> > v5:
> >  - fix copy+paste error in gicv3_write_eoir [drew]
> >  - use modern register names [Andre]
> > v4:
> >  - heavily comment gicv3_ipi_send_tlist() [Eric]
> >  - changes needed for gicv2 iar/irqstat fix to other patch
> > v2:
> >  - use IRM for gicv3 broadcast
> > ---
> >  arm/gic.c  | 83 
> >  arm/unittests.cfg  |  6 +++
> >  lib/arm/asm/arch_gicv3.h   | 23 
> >  lib/arm/asm/gic-v2.h   |  2 +
> >  lib/arm/asm/gic-v3.h   | 12 +-
> >  lib/arm/asm/gic.h  | 63 +++
> >  lib/arm/gic-v2.c   | 40 
> >  lib/arm/gic-v3.c   | 94 
> > ++
> >  lib/arm/gic.c  |  9 -
> >  lib/arm64/asm/arch_gicv3.h | 22 +++
> >  10 files changed, 336 insertions(+), 18 deletions(-)
> > 
> > diff --git a/arm/gic.c b/arm/gic.c
> > index b42c2b1ca1e1..23c1860a49d9 100644
> > --- a/arm/gic.c
> > +++ b/arm/gic.c
> > @@ -3,6 +3,8 @@
> >   *
> >   * GICv2
> >   *   + test sending/receiving IPIs
> > + * GICv3
> > + *   + test sending/receiving IPIs
> >   *
> >   * Copyright (C) 2016, Red Hat Inc, Andrew Jones 
> >   *
> > @@ -16,7 +18,14 @@
> >  #include 
> >  #include 
> >  
> > -static int gic_version;
> > +struct gic {
> > +   struct {
> > +   void (*send_self)(void);
> > +   void (*send_broadcast)(void);
> > +   } ipi;
> > +};
> > +
> > +static struct gic *gic;
> >  static int acked[NR_CPUS], spurious[NR_CPUS];
> >  static cpumask_t ready;
> >  
> > @@ -83,11 +92,11 @@ static void check_spurious(void)
> >  
> >  static void ipi_handler(struct pt_regs *regs __unused)
> >  {
> > -   u32 irqstat = readl(gicv2_cpu_base() + GICC_IAR);
> > -   u32 irqnr = irqstat & GICC_IAR_INT_ID_MASK;
> > +   u32 irqstat = gic_read_iar();
> > +   u32 irqnr = gic_iar_irqnr(irqstat);
> >  
> > if (irqnr != GICC_INT_SPURIOUS) {
> > -   writel(irqstat, gicv2_cpu_base() + GICC_EOIR);
> > +   gic_write_eoir(irqstat);
> > smp_rmb(); /* pairs with wmb in ipi_test functions */
> > ++acked[smp_processor_id()];
> > smp_wmb(); /* pairs with rmb in check_acked */
> > @@ -97,6 +106,27 @@ static void ipi_handler(struct pt_regs *regs __unused)
> > }
> >  }
> >  
> > +static void gicv2_ipi_send_self(void)
> > +{
> > +   writel(2 << 24, gicv2_dist_base() + GICD_SGIR);
> > +}
> > +
> > +static void gicv2_ipi_send_broadcast(void)
> > +{
> > +   writel(1 << 24, gicv2_dist_base() + GICD_SGIR);
> > +}
> > +
> > +static void gicv3_ipi_send_self(void)
> > +{
> > +   gic_ipi_send_single(0, smp_processor_id());
> > +}
> > +
> > +static void gicv3_ipi_send_broadcast(void)
> > +{
> > +   gicv3_write_sgi1r(1ULL << 40);
> > +   isb();
> > +}
> We could have moved above functions in lib/arm/gic-v2/V3.c +gicv2/v3

The common code has gic_ipi_send_single/mask functions. These
functions are specific IPI test functions using special SGIR
bits, so I think they belong here.

> > +
> >  static void ipi_test_self(void)
> >  {
> > cpumask_t mask;
> > @@ -106,7 +136,7 @@ static void ipi_test_self(void)
> > smp_wmb();
> > cpumask_clear(&mask);
> > cpumask_set_cpu(0, &mask);
> > -   writel(2 << 24, gicv2_dist_base() + GICD_SGIR);
> > +   gic->ipi.send_self();
> > check_acked(&mask);
> > report_prefix_pop();
> >  }
> > @@ -114,14 +144,15 @@ static void ipi_test_self(void)
> >  static void ipi_test_smp(void)
> >  {
> > cpumask_t mask;
> > -   unsigned long tlist;
> > +   int i;
> >  
> > report_prefix_push("target-list");
> > memset(acked, 0, sizeof(acked));
> > smp_wmb();
> > -   tlist = cpumask_bits(&cpu_present_mask)[0] & 0xaa;
> > -   cpumask_bits(&mask)[0] = tlist;
> > -   writel((u8)tlist << 16, gicv2_dist_base() + GICD_SGIR);
> > +   cpumask_copy(&mask, &cpu_present_mask);
> > +   for (i = 0; i < nr_cpus; i += 2)
> > +   cpumask_clear_cpu(i, &mask);
> > +   gic_ipi_send_mask(0, &mask);
> > check_acked(&mask);
> > report_prefix_pop();
> >  
> > @@ -130,14 +161,14 @@ static void ipi_test_smp(void)
> > smp_wmb();
> > cpumask_copy(&mask, &cpu_present_mask);
> > cpumask_clear_cpu(0, &mask);
> > -   writel(1 << 24, gicv2_dist_base() + GICD_SGIR);
> > +   gic->ipi.send_broadcast();
> > check_acked(&mask);
> > report_prefix_pop();
> >  }
> >  
> >  static void ipi_enable(void)
>

Re: [Qemu-devel] [kvm-unit-tests PATCH v7 11/11] arm/arm64: gic: don't just use zero

2016-11-24 Thread Andrew Jones

On Thu, Nov 24, 2016 at 10:57:01AM +0100, Auger Eric wrote:
> Hi,
> 
> On 23/11/2016 17:54, Andrew Jones wrote:
> > Allow user to select who sends ipis and with which irq,
> > rather than just always sending irq=0 from cpu0.
> > 
> > Signed-off-by: Andrew Jones 
> 
> Reviewed-by: Eric Auger 
> Tested-by: Eric Auger 

Thanks!

But if I spin a v8 then I may just dump all the command line parsing
stuff, dropping the optional input arguments sender and irq, or at
least irq. I'm feeling it's a bit crufty... Any thoughts on that?

drew

Re: [Qemu-devel] [RFC 00/15] qmp: Report supported device types on 'query-machines'

2016-11-24 Thread Eduardo Habkost

On Thu, Nov 24, 2016 at 02:34:05PM +0100, Markus Armbruster wrote:
> Eduardo Habkost  writes:
> 
> > On Wed, Nov 23, 2016 at 06:43:16PM +0200, Marcel Apfelbaum wrote:
> >> On 11/22/2016 03:11 AM, Eduardo Habkost wrote:
> >> > The Problem
> >> > ===
> >> > 
> >> > Currently management software has no way to find out which device
> >> > types can be plugged in a machine, unless the machine is already
> >> > initialized.
> 
> I doubt it can find the precise list of device types even for an
> initialized machine.

My plan is to make it possible. See:
  [RFC 06/15] qdev: Add device_type field to BusClass

I will change how it looks like on v2 (mostly due to the PCI/PCIe
issues), but basically it makes all bus objects able to tell
which device types they accept.

> 
> >> Hi Eduardo,
> >> Thank you for this interesting series. I think this is a problem
> >> worth addressing.
> >> 
> >> > Even after the machine is initialized, there's no way to map
> >> > existing bus types to supported device types unless management
> >> > software hardcodes the mapping between bus types and device
> >> > types.
> >> > 
> >> 
> >> Here I am a little lost.
> >> 
> >> We are going for machine => supported devices or
> >> bus-type => supported devices?
> >
> > On this series, we go for machine-type => supported-devices.
> >
> > A bus-type => supported-devices map wouldn't work because
> > different PCIe bus instances might accept different types of
> > devices (so supported-devices depend on the specific bus
> > instance, not only on the bus-type).
> >
> > v2 will probably be more detailed. I plan to change it to:
> >
> > query-machine(machine-type) => list of BusInfo
> >
> > BusInfo would contain:
> >  * bus-type
> >  * bus-path
> >  * accepted-device-types (list of type/interface names)
> 
> Let me take a step back and explore what questions a management
> application may want to ask.  I think the basic questions are:
> 
> * Which devices models could be plugged now?
> 
>   "Now" means in the current state of the machine.  Depending on "now",
>   this could be hot- or cold-plug.

Yes.

> 
> * Into which sockets could device model T be plugged now?
> 
>   Mind, I wrote "socket", not bus.  Buses provide sockets, but they are
>   not the only provider of sockets.
> 
>   We need a way to name sockets.  (QOM path to device or bus, socket
>   address on device or bus) could work.

Naming "sockets" was not on my model, but I think we can do that.

I will probably send RFC v2 without any socket abstraction, so
other conceptual changes I am introducing can be reviewed
(especially the PCI/PCIe mess I am diving into, right now). But I
will try to document how the design can evolve to handle sockets.

> 
> Are we on the same page so far?

Yes.

> 
> Related: same questions with "newly created machine of machine type M"
> instead of "now".  This isn't basic, because you could always create the
> machine, then ask the basic question.  But not having to create the
> machine is convenient, in particular as long as a single QEMU process
> can only ever create one machine.
> 
> Related: same, with creation options.
> 
> Your series provides a solution for a special non-basic case: newly
> created machine of machine type M with no optional parts, i.e. with
> -nodefaults and nothing else.  Or maybe with the default set of optional
> parts, i.e. without -nodefaults, not sure.
> 
> I'm not demanding you provide a solution for the general case first.  It
> would be nice, but it could be too complex for a first step.  I do
> recommend you design your solution for the special case with the general
> case in mind.  Say, by trying to ensure the special case query returns
> something that will do or can be evolved to do for the general case,
> too.

Yep. I will keep that in mind.

> 
> >> > Example: floppy support on q35 vs i440fx
> >> > 
> >> > 
> >> > There's no way for libvirt to find out that there's no floppy
> >> > controller on pc-q35-* machine-types by default.
> >> > 
> >> 
> >> Again "by default". So do we want to query the init state of a machine?
> >> What devices are there? Or what devices *can be* there?
> >
> > "by default" means what's present when using "-machine "
> > with no extra -device arguments.
> 
> Not just -device.  Some legacy options can also create buses.  For
> instance, -device if=scsi,... creates a SCSI bus.

Right. I found that out when debugging the -nodefaults issues.

> 
> > We want to know what _buses_ are always there. Which in turn lets
> > management know which _device_ types _can_ be plugged.
> >
> >> 
> >> > With this series, pc-i440fx-* will report "floppy" as a supported
> >> > device type, but pc-q35-* will not.
> >> > 
> >> > Example: Legacy PCI vs vs PCIe devices
> >> > --
> >> > 
> >> > Some devices require a PCIe bus to be available, others work on
> >> > both legacy PCI and PCIe, while others work only on a legacy PCI
> >> > b

Re: [Qemu-devel] [RFC 00/15] qmp: Report supported device types on 'query-machines'

2016-11-24 Thread Marcel Apfelbaum


On 11/24/2016 03:34 PM, Markus Armbruster wrote:

Eduardo Habkost  writes:


On Wed, Nov 23, 2016 at 06:43:16PM +0200, Marcel Apfelbaum wrote:

On 11/22/2016 03:11 AM, Eduardo Habkost wrote:

The Problem




[...]


Our decision to have hybrid PCI/PCIe devices and buses breeds
considerable complexity.  I wish we had avoided them, but I believe it's
too late to change now.


This still does not solve the problem that some devices makes
sense only on a specific arch.




Hi Markus,


Examples?



One quick example would be that we don't want to see
Intel's IOH 3420 PCIe Root Port in an ARM machine,
or a pxb on a Q35 machine (in this case we want pxb-pcie)

I do believe there are other examples, I'll try to think of more.

Thanks,
Marcel

[...]

Re: [Qemu-devel] [PATCH 2/4] 9pfs: introduce transport specific callbacks

2016-11-24 Thread Greg Kurz

On Thu, 24 Nov 2016 09:31:52 +0100
Greg Kurz  wrote:

> On Mon, 21 Nov 2016 13:39:30 -0800
> Stefano Stabellini  wrote:
> 
> > Don't call virtio functions from 9pfs generic code, use generic function
> > callbacks instead.
> > 
> > Signed-off-by: Stefano Stabellini 
> > ---  
> 
> Just a couple of indentation and line over 80 characters nits. I'll fix them
> before pushing to 9p-next.
> 
> Reviewed-by: Greg Kurz 
> 

Hmm... second thought...

> >  hw/9pfs/9p.c   |  8 
> >  hw/9pfs/9p.h   | 18 ++
> >  hw/9pfs/virtio-9p-device.c | 18 ++
> >  hw/9pfs/virtio-9p.h|  9 -
> >  4 files changed, 36 insertions(+), 17 deletions(-)
> > 
> > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> > index 05e950f..5a20a13 100644
> > --- a/hw/9pfs/9p.c
> > +++ b/hw/9pfs/9p.c
> > @@ -47,7 +47,7 @@ ssize_t pdu_marshal(V9fsPDU *pdu, size_t offset, const 
> > char *fmt, ...)
> >  va_list ap;
> >  
> >  va_start(ap, fmt);
> > -ret = virtio_pdu_vmarshal(pdu, offset, fmt, ap);
> > +ret = pdu->s->transport->pdu_vmarshal(pdu, offset, fmt, ap);
> >  va_end(ap);
> >  
> >  return ret;
> > @@ -59,7 +59,7 @@ ssize_t pdu_unmarshal(V9fsPDU *pdu, size_t offset, const 
> > char *fmt, ...)
> >  va_list ap;
> >  
> >  va_start(ap, fmt);
> > -ret = virtio_pdu_vunmarshal(pdu, offset, fmt, ap);
> > +ret = pdu->s->transport->pdu_vunmarshal(pdu, offset, fmt, ap);
> >  va_end(ap);
> >  
> >  return ret;
> > @@ -67,7 +67,7 @@ ssize_t pdu_unmarshal(V9fsPDU *pdu, size_t offset, const 
> > char *fmt, ...)
> >  
> >  static void pdu_push_and_notify(V9fsPDU *pdu)
> >  {
> > -virtio_9p_push_and_notify(pdu);
> > +pdu->s->transport->push_and_notify(pdu);
> >  }
> >  
> >  static int omode_to_uflags(int8_t mode)
> > @@ -1751,7 +1751,7 @@ static void v9fs_init_qiov_from_pdu(QEMUIOVector 
> > *qiov, V9fsPDU *pdu,
> >  struct iovec *iov;
> >  unsigned int niov;
> >  
> > -virtio_init_iov_from_pdu(pdu, &iov, &niov, is_write);
> > +pdu->s->transport->init_iov_from_pdu(pdu, &iov, &niov, is_write);
> >  
> >  qemu_iovec_init_external(&elem, iov, niov);
> >  qemu_iovec_init(qiov, niov);
> > diff --git a/hw/9pfs/9p.h b/hw/9pfs/9p.h
> > index 07cee01..ab398d0 100644
> > --- a/hw/9pfs/9p.h
> > +++ b/hw/9pfs/9p.h
> > @@ -230,6 +230,7 @@ typedef struct V9fsState
> >  enum p9_proto_version proto_version;
> >  int32_t msize;
> >  V9fsPDU pdus[MAX_REQ];
> > +struct V9fsTransport *transport;
> >  /*
> >   * lock ensuring atomic path update
> >   * on rename.
> > @@ -343,4 +344,21 @@ void pdu_free(V9fsPDU *pdu);
> >  void pdu_submit(V9fsPDU *pdu);
> >  void v9fs_reset(V9fsState *s);
> >  
> > +struct V9fsTransport {
> > +ssize_t (*pdu_vmarshal)(V9fsPDU *pdu, size_t offset, const char 
> > *fmt, va_list ap);  
> 
> over 80 characters
> 
> > +ssize_t (*pdu_vunmarshal)(V9fsPDU *pdu, size_t offset, const char 
> > *fmt, va_list ap);  
> 
> ditto
> 
> > +void(*init_iov_from_pdu)(V9fsPDU *pdu, struct iovec **piov,
> > +  unsigned int *pniov, bool is_write);  
> 
> indent
> 
> > +void(*push_and_notify)(V9fsPDU *pdu);
> > +};
> > +
> > +static inline int v9fs_register_transport(V9fsState *s, struct 
> > V9fsTransport *t)
> > +{
> > +if (s->transport) {
> > +return -EINVAL;
> > +}
> > +s->transport = t;
> > +return 0;
> > +}
> > +
> >  #endif
> > diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
> > index 1782e4a..e1a37a4 100644
> > --- a/hw/9pfs/virtio-9p-device.c
> > +++ b/hw/9pfs/virtio-9p-device.c
> > @@ -20,7 +20,9 @@
> >  #include "hw/virtio/virtio-access.h"
> >  #include "qemu/iov.h"
> >  
> > -void virtio_9p_push_and_notify(V9fsPDU *pdu)
> > +static struct V9fsTransport virtio_9p_transport;

... shouldn't this be const ?

> > +
> > +static void virtio_9p_push_and_notify(V9fsPDU *pdu)
> >  {
> >  V9fsState *s = pdu->s;
> >  V9fsVirtioState *v = container_of(s, V9fsVirtioState, state);
> > @@ -126,6 +128,7 @@ static void virtio_9p_device_realize(DeviceState *dev, 
> > Error **errp)
> >  v->config_size = sizeof(struct virtio_9p_config) + 
> > strlen(s->fsconf.tag);
> >  virtio_init(vdev, "virtio-9p", VIRTIO_ID_9P, v->config_size);
> >  v->vq = virtio_add_queue(vdev, MAX_REQ, handle_9p_output);
> > +v9fs_register_transport(s, &virtio_9p_transport);
> >  
> >  out:
> >  return;
> > @@ -148,7 +151,7 @@ static void virtio_9p_reset(VirtIODevice *vdev)
> >  v9fs_reset(&v->state);
> >  }
> >  
> > -ssize_t virtio_pdu_vmarshal(V9fsPDU *pdu, size_t offset,
> > +static ssize_t virtio_pdu_vmarshal(V9fsPDU *pdu, size_t offset,
> >  const char *fmt, va_list ap)  
> 
> indent
> 
> >  {
> >  V9fsState *s = pdu->s;
> > @@ -158,7 +161,7 @@ ssize_t virtio_pdu_vmarshal(V9fsPDU *pdu, size_t offset,
> >  return v9fs_iov_vmarshal(elem->in_sg, elem->in

Re: [Qemu-devel] [PATCH] target-arm: Add VBAR support to ARM1176 CPUs

2016-11-24 Thread Cédric Le Goater

On 09/05/2016 04:39 PM, Peter Maydell wrote:
> On 23 August 2016 at 10:59, Cédric Le Goater  wrote:
>> ARM1176 CPUs support the Vector Base Address Register but currently,
>> qemu only supports VBAR on ARMv7 CPUs. Fix this by adding a new
>> feature ARM_FEATURE_VBAR which is used for ARMv7 and ARM1176 CPUs.
>>
>> Signed-off-by: Cédric Le Goater 
> 
> This looks mostly good; one question below.

Sorry for the late answer. I was drilling down through PPC with 
the initial support of the pnv machine. 

For 2.9, I would like to send a couple of fixes for the Aspeed  
machines, this is one of them and I feel I didn't choose the 
easiest path to enter the ARM world :)

>> Index: qemu-aspeed.git/target-arm/cpu.h
>> ===
>> --- qemu-aspeed.git.orig/target-arm/cpu.h
>> +++ qemu-aspeed.git/target-arm/cpu.h
>> @@ -1129,6 +1129,7 @@ enum arm_features {
>>  ARM_FEATURE_V8_SHA256, /* implements SHA256 part of v8 Crypto 
>> Extensions */
>>  ARM_FEATURE_V8_PMULL, /* implements PMULL part of v8 Crypto Extensions 
>> */
>>  ARM_FEATURE_THUMB_DSP, /* DSP insns supported in the Thumb encodings */
>> +ARM_FEATURE_VBAR, /* has cp15 VBAR (ARM1176) */
> 
> you don't need the bit in brackets, I think (it's not complete
> anyway, since v7+ also has this).

ok.

>>  };
>>
>>  static inline int arm_feature(CPUARMState *env, int feature)
>> Index: qemu-aspeed.git/target-arm/cpu.c
>> ===
>> --- qemu-aspeed.git.orig/target-arm/cpu.c
>> +++ qemu-aspeed.git/target-arm/cpu.c
>> @@ -584,6 +584,7 @@ static void arm_cpu_realizefn(DeviceStat
>>  set_feature(env, ARM_FEATURE_LPAE);
>>  }
>>  if (arm_feature(env, ARM_FEATURE_V7)) {
>> +set_feature(env, ARM_FEATURE_VBAR);
>>  set_feature(env, ARM_FEATURE_VAPA);
>>  set_feature(env, ARM_FEATURE_THUMB2);
>>  set_feature(env, ARM_FEATURE_MPIDR);
>> @@ -867,6 +868,7 @@ static void arm1176_initfn(Object *obj)
>>
>>  cpu->dtb_compatible = "arm,arm1176";
>>  set_feature(&cpu->env, ARM_FEATURE_V6K);
>> +set_feature(&cpu->env, ARM_FEATURE_VBAR);
>>  set_feature(&cpu->env, ARM_FEATURE_VFP);
>>  set_feature(&cpu->env, ARM_FEATURE_VAPA);
>>  set_feature(&cpu->env, ARM_FEATURE_DUMMY_C15_REGS);
> 
> Is it sufficient to set ARM_FEATURE_VBAR in the realizefn
> if FEATURE_V7 or FEATURE_EL3? (watch out that realizefn
> may change the value of the FEATURE_EL3 bit partway down
> so you'd need to do the check there) ?

It looks like it, yes. 

The addition of the VBAR only depends on ARM_FEATURE_VBAR 
bit which is set for 1176 and V7. I followed the pattern 
used by ARM_FEATURE_V6K. FEATURE_EL3 does not come in play
but may be it should indeed, see below.

> We implement VBAR in v7-without-EL3 even though architecturally
> it should only exist in v7-with-EL3 because we have some
> legacy board models which we implement as without-EL3 but
> where the guest (Linux) assumes it's running on a with-EL3
> CPU in NS mode. I'd rather we stick to the architectural
> definition for 1176 if we can rather than expanding the
> "things we do which aren't architectural" set if there's
> no legacy config that requires it. (If it is necessary
> to define it for 1176 always then that's OK, I'd just
> prefer not to unless we know we have to.)

According to the ARM spec, 1176 is a v6 with a couple of 
extensions, among which v6k and TrustZone. So it has Secure 
and Non-Secure VBAR and MVBAR registers.

Looking at :

  https://en.wikipedia.org/wiki/List_of_ARM_microarchitectures

May be, we should instead introduce a new ARMv6Z architecture 
for the 1176 cpu which would represent a ARMv6 + TrustZone ?


But I am a bit confused with this EL3 feature. 1176 does not 
have EL3 to my understanding (I am beyond my ARM skills there).
So why is it part of the feature list in arm1176_initfn ? 
Is that a way to define the MVBAR and SCR regs ? 

Should we define VBAR under el3_cp_reginfo and not under 
v7_cp_reginfo then ? 

> We should probably also have a brief comment noting that
> we define VBAR always in v7, even though architecturally
> it doesn't exist in non-EL3 configs, for the benefit of
> legacy board models.

ok. I can add that.

Thanks,

C. 

> (In v8 VBAR is required whether EL3 is implemented or not.)
> 
> thanks
> -- PMM
>

Re: [Qemu-devel] [PATCH v2 1/2] target-ppc: add vextu[bhw]lx instructions

2016-11-24 Thread Richard Henderson


On 11/24/2016 12:32 PM, Nikunj A Dadhania wrote:

+#if defined(HOST_WORDS_BIGENDIAN)
+# if defined(CONFIG_INT128)
+# define VEXTULX_DO(name, size) \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
+{   \
+int index = (a & 0xf) * 8;  \
+return int128_rshift(b->u128, index) &  \
+MAKE_64BIT_MASK(0, size);   \
+}
+# else
+# define VEXTULX_DO(name, size) \
+target_ulong glue(helper_, name)(target_ulong a, ppc_avr_t *b)  \
+{   \
+int index = (a & 0xf) * 8;  \
+Int128 value = int128_make128(b->u64[LO_IDX],   \
+  b->u64[HI_IDX]);  \
+return int128_rshift(value, index) &\
+MAKE_64BIT_MASK(0, size);   \
+}
+# endif


Why are these duplicated?

Clearly the missed trick is that you should *never* check CONFIG_INT128 and 
always rely on Int128.



r~

Re: [Qemu-devel] [PATCH 2/4] 9pfs: introduce transport specific callbacks

2016-11-24 Thread Greg Kurz

On Thu, 24 Nov 2016 15:23:10 +0100
Greg Kurz  wrote:

> On Thu, 24 Nov 2016 09:31:52 +0100
> Greg Kurz  wrote:
> 
> > On Mon, 21 Nov 2016 13:39:30 -0800
> > Stefano Stabellini  wrote:
> >   
> > > Don't call virtio functions from 9pfs generic code, use generic function
> > > callbacks instead.
> > > 
> > > Signed-off-by: Stefano Stabellini 
> > > ---
> > 
> > Just a couple of indentation and line over 80 characters nits. I'll fix them
> > before pushing to 9p-next.
> > 
> > Reviewed-by: Greg Kurz 
> >   
> 
> Hmm... second thought...
> 

[...]

> > > +
> > > +static inline int v9fs_register_transport(V9fsState *s, struct 
> > > V9fsTransport *t)
> > > +{
> > > +if (s->transport) {
> > > +return -EINVAL;
> > > +}

Calling v9fs_register_transport() several times for the same V9fsState looks
more like a bug than an error... also, is it possible to have several 9pfs
shares with different transports ?

> > > +s->transport = t;
> > > +return 0;
> > > +}
> > > +
> > >  #endif
> > > diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
> > > index 1782e4a..e1a37a4 100644
> > > --- a/hw/9pfs/virtio-9p-device.c
> > > +++ b/hw/9pfs/virtio-9p-device.c
> > > @@ -20,7 +20,9 @@
> > >  #include "hw/virtio/virtio-access.h"
> > >  #include "qemu/iov.h"
> > >  
> > > -void virtio_9p_push_and_notify(V9fsPDU *pdu)
> > > +static struct V9fsTransport virtio_9p_transport;  
> 
> ... shouldn't this be const ?
> 
> > > +
> > > +static void virtio_9p_push_and_notify(V9fsPDU *pdu)
> > >  {
> > >  V9fsState *s = pdu->s;
> > >  V9fsVirtioState *v = container_of(s, V9fsVirtioState, state);
> > > @@ -126,6 +128,7 @@ static void virtio_9p_device_realize(DeviceState 
> > > *dev, Error **errp)
> > >  v->config_size = sizeof(struct virtio_9p_config) + 
> > > strlen(s->fsconf.tag);
> > >  virtio_init(vdev, "virtio-9p", VIRTIO_ID_9P, v->config_size);
> > >  v->vq = virtio_add_queue(vdev, MAX_REQ, handle_9p_output);
> > > +v9fs_register_transport(s, &virtio_9p_transport);
> > >  
> > >  out:
> > >  return;
> > > @@ -148,7 +151,7 @@ static void virtio_9p_reset(VirtIODevice *vdev)
> > >  v9fs_reset(&v->state);
> > >  }
> > >  
> > > -ssize_t virtio_pdu_vmarshal(V9fsPDU *pdu, size_t offset,
> > > +static ssize_t virtio_pdu_vmarshal(V9fsPDU *pdu, size_t offset,
> > >  const char *fmt, va_list ap)
> > 
> > indent
> >   
> > >  {
> > >  V9fsState *s = pdu->s;
> > > @@ -158,7 +161,7 @@ ssize_t virtio_pdu_vmarshal(V9fsPDU *pdu, size_t 
> > > offset,
> > >  return v9fs_iov_vmarshal(elem->in_sg, elem->in_num, offset, 1, fmt, 
> > > ap);
> > >  }
> > >  
> > > -ssize_t virtio_pdu_vunmarshal(V9fsPDU *pdu, size_t offset,
> > > +static ssize_t virtio_pdu_vunmarshal(V9fsPDU *pdu, size_t offset,
> > >const char *fmt, va_list ap)
> > 
> > indent
> >   
> > >  {
> > >  V9fsState *s = pdu->s;
> > > @@ -168,7 +171,7 @@ ssize_t virtio_pdu_vunmarshal(V9fsPDU *pdu, size_t 
> > > offset,
> > >  return v9fs_iov_vunmarshal(elem->out_sg, elem->out_num, offset, 1, 
> > > fmt, ap);
> > >  }
> > >  
> > > -void virtio_init_iov_from_pdu(V9fsPDU *pdu, struct iovec **piov,
> > > +static void virtio_init_iov_from_pdu(V9fsPDU *pdu, struct iovec **piov,
> > >unsigned int *pniov, bool is_write)
> > 
> > indent
> >   
> > >  {
> > >  V9fsState *s = pdu->s;
> > > @@ -184,6 +187,13 @@ void virtio_init_iov_from_pdu(V9fsPDU *pdu, struct 
> > > iovec **piov,
> > >  }
> > >  }
> > >  
> > > +static struct V9fsTransport virtio_9p_transport = {
> > > +.pdu_vmarshal = virtio_pdu_vmarshal,
> > > +.pdu_vunmarshal = virtio_pdu_vunmarshal,
> > > +.init_iov_from_pdu = virtio_init_iov_from_pdu,
> > > +.push_and_notify = virtio_9p_push_and_notify,
> > > +};
> > > +
> > >  /* virtio-9p device */
> > >  
> > >  static const VMStateDescription vmstate_virtio_9p = {
> > > diff --git a/hw/9pfs/virtio-9p.h b/hw/9pfs/virtio-9p.h
> > > index 52c4b9d..e763da2c 100644
> > > --- a/hw/9pfs/virtio-9p.h
> > > +++ b/hw/9pfs/virtio-9p.h
> > > @@ -14,15 +14,6 @@ typedef struct V9fsVirtioState
> > >  V9fsState state;
> > >  } V9fsVirtioState;
> > >  
> > > -void virtio_9p_push_and_notify(V9fsPDU *pdu);
> > > -
> > > -ssize_t virtio_pdu_vmarshal(V9fsPDU *pdu, size_t offset,
> > > -const char *fmt, va_list ap);
> > > -ssize_t virtio_pdu_vunmarshal(V9fsPDU *pdu, size_t offset,
> > > -  const char *fmt, va_list ap);
> > > -void virtio_init_iov_from_pdu(V9fsPDU *pdu, struct iovec **piov,
> > > -  unsigned int *pniov, bool is_write);
> > > -
> > >  #define TYPE_VIRTIO_9P "virtio-9p-device"
> > >  #define VIRTIO_9P(obj) \
> > >  OBJECT_CHECK(V9fsVirtioState, (obj), TYPE_VIRTIO_9P)
> > 
> >   
> 
>

Re: [Qemu-devel] [PATCH 3/4] 9pfs: use v9fs_init_qiov_from_pdu instead of v9fs_pack

2016-11-24 Thread Greg Kurz

On Mon, 21 Nov 2016 13:39:31 -0800
Stefano Stabellini  wrote:

> v9fs_xattr_read should not access VirtQueueElement elems directly.
> Move v9fs_init_qiov_from_pdu up in the file and call
> v9fs_init_qiov_from_pdu instead of v9fs_pack.
> 

instead of ? I see v9fs_init_qiov_from_pdu() gets called before calling 
v9fs_pack().
Also, I don't see the corresponding call to qemu_iovec_destroy() to free the
allocated iovec.

> Signed-off-by: Stefano Stabellini 
> ---
>  hw/9pfs/9p.c | 58 +-
>  1 file changed, 29 insertions(+), 29 deletions(-)
> 
> diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> index 5a20a13..b6ec042 100644
> --- a/hw/9pfs/9p.c
> +++ b/hw/9pfs/9p.c
> @@ -1633,14 +1633,39 @@ out_nofid:
>  pdu_complete(pdu, err);
>  }
>  
> +/*
> + * Create a QEMUIOVector for a sub-region of PDU iovecs
> + *
> + * @qiov:   uninitialized QEMUIOVector
> + * @skip:   number of bytes to skip from beginning of PDU
> + * @size:   number of bytes to include
> + * @is_write:   true - write, false - read
> + *
> + * The resulting QEMUIOVector has heap-allocated iovecs and must be cleaned 
> up
> + * with qemu_iovec_destroy().
> + */
> +static void v9fs_init_qiov_from_pdu(QEMUIOVector *qiov, V9fsPDU *pdu,
> +size_t skip, size_t size,
> +bool is_write)
> +{
> +QEMUIOVector elem;
> +struct iovec *iov;
> +unsigned int niov;
> +
> +pdu->s->transport->init_iov_from_pdu(pdu, &iov, &niov, is_write);
> +
> +qemu_iovec_init_external(&elem, iov, niov);
> +qemu_iovec_init(qiov, niov);
> +qemu_iovec_concat(qiov, &elem, skip, size);
> +}
> +
>  static int v9fs_xattr_read(V9fsState *s, V9fsPDU *pdu, V9fsFidState *fidp,
> uint64_t off, uint32_t max_count)
>  {
>  ssize_t err;
>  size_t offset = 7;
>  uint64_t read_count;
> -V9fsVirtioState *v = container_of(s, V9fsVirtioState, state);
> -VirtQueueElement *elem = v->elems[pdu->idx];
> +QEMUIOVector qiov_full;
>  
>  if (fidp->fs.xattr.len < off) {
>  read_count = 0;
> @@ -1656,7 +1681,8 @@ static int v9fs_xattr_read(V9fsState *s, V9fsPDU *pdu, 
> V9fsFidState *fidp,
>  }
>  offset += err;
>  
> -err = v9fs_pack(elem->in_sg, elem->in_num, offset,
> +v9fs_init_qiov_from_pdu(&qiov_full, pdu, 0, read_count, false);
> +err = v9fs_pack(qiov_full.iov, qiov_full.niov, offset,
>  ((char *)fidp->fs.xattr.value) + off,
>  read_count);
>  if (err < 0) {
> @@ -1732,32 +1758,6 @@ static int coroutine_fn 
> v9fs_do_readdir_with_stat(V9fsPDU *pdu,
>  return count;
>  }
>  
> -/*
> - * Create a QEMUIOVector for a sub-region of PDU iovecs
> - *
> - * @qiov:   uninitialized QEMUIOVector
> - * @skip:   number of bytes to skip from beginning of PDU
> - * @size:   number of bytes to include
> - * @is_write:   true - write, false - read
> - *
> - * The resulting QEMUIOVector has heap-allocated iovecs and must be cleaned 
> up
> - * with qemu_iovec_destroy().
> - */
> -static void v9fs_init_qiov_from_pdu(QEMUIOVector *qiov, V9fsPDU *pdu,
> -size_t skip, size_t size,
> -bool is_write)
> -{
> -QEMUIOVector elem;
> -struct iovec *iov;
> -unsigned int niov;
> -
> -pdu->s->transport->init_iov_from_pdu(pdu, &iov, &niov, is_write);
> -
> -qemu_iovec_init_external(&elem, iov, niov);
> -qemu_iovec_init(qiov, niov);
> -qemu_iovec_concat(qiov, &elem, skip, size);
> -}
> -
>  static void coroutine_fn v9fs_read(void *opaque)
>  {
>  int32_t fid;

Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI

2016-11-24 Thread Igor Mammedov

On Thu, 24 Nov 2016 01:01:58 +0100
Laszlo Ersek  wrote:

> CC Jordan & Mike
> 
> On 11/23/16 23:35, Paolo Bonzini wrote:
> > 
> > 
> > On 18/11/2016 11:36, Laszlo Ersek wrote:
> >> This is v3 of the series, with updates based on the v2 discussion:
> >> .
> >>
> >> I've added feature negotiation via the APM_STS ("scratchpad") register.
> >> A new spec file called "docs/specs/q35-apm-sts.txt" is included.
> >>
> >> Tested with new OVMF patches (about to send out those as well).
> >> Regression tested with SeaBIOS (beyond simple functional tests with
> >> maximum SeaBIOS logging enabled, I used gdb to step through the new
> >> ich9_apm_status_changed() callback to see if it was behaving compatibly
> >> with SeaBIOS).
> >>
> >> The series was developed and tested on top of v2.7.0, because v2.8.0-rc0
> >> crashes very quickly for me when running OVMF:
> >>
> >>   kvm_io_ioeventfd_add: error adding ioeventfd: File exists
> >>
> >> It is my understanding that there are patches on the list for this:
> >>
> >>   [Qemu-devel] [PATCH v2 for-2.8 0/3] virtio fixes
> >>
> >> Anyway, the series rebases to v2.8.0-rc0 without as much as context
> >> differences.
> > 
> > Hi Laszlo,
> > 
> > sorry for the slightly delayed reply.
> > 
> > First of all, I'm wondering if we would be better off adding a new port
> > 0xB1 that is QEMU-specific, instead of reusing 0xB3.
> 
> Sure, I can look into that, if we agree that's the best way to proceed,
> for now. (Although I'm not really happy about the new memory region
> stuff it would require. :()
> 
> I CC'd Kevin to learn if he foresaw other uses for the APM_STS register
> in SeaBIOS.
> 
> BTW, what happens in QEMU if the guest reads an unimplemented port?
> Hm... unassigned_io_write() seems to be a no-op, and
> unassigned_io_read() returns all-bits-one. This means that for a new
> port, the negotiation protocol / values have to be reworked.
> 
> Port 0xB1 is occupied by ICH9 according to the spec:
> 
>   Table 9-2. Fixed I/O Ranges Decoded by Intel ® ICH9 (Sheet 2 of 2)
> 
>   I/O
>   Address  Read Target   Write Target  Internal Unit
>   ---      -
>   B0h–B1h  Interrupt Controller  Interrupt Controller  Interrupt
> 
> I wonder if we care -- after all, APM_STS (0xB3) is documented not to
> have any hardware effects ("scratchpad register").
+1 in favor of using scratch register as Laszlo suggests.


> > Second, I now remembered the reason why I was against broadcast SMI.
> > The reason is that it breaks hot-plug.
> 
> How does it break hot-plug? After reading your explanation below: is it
> that the broadcast SMI (possibly raised by the OS directly) would get to
> the new VCPU before the firmware relocated its SMBASE?
> 
> > 
> > On hot-plug, the firmware (if it wants to use SMI for anything secure)
> > must relocate the SMBASE of the newly-hotplugged CPU before the OS has
> > any chance to put its fangs on it.  This is because the OS can send
> > direct SMIs and is in control of the area at 0x3.  So OVMF is
> > already broken with respect to hotplug,
> 
> Yes. We theorized that there could be further edk2 core modules that
> hadn't been open sourced yet, necessary for handling VCPU hotplug.
> 
> > but I am not yet sure if these
> > patches would break it further.
> 
> Hard to say without seeing those modules.
> 
> I will speculate: assuming that the non-public modules fit together with
> the public modules in some way, I expect the broadcast SMI shouldn't
> break them. The reason is that the broadcast SMI / traditional delivery
> are the default method in UefiCpuPkg, and in practice they work better
> (more reliably) with the rest of the edk2 infrastructure, in my
> experience, than the relaxed sync method.
> 
> In his review today, Jordan wrote
> ,
> 
> > I'm glad we'll be using a mechanism that broadcasts to all the
> > processors like the real hardware. It is a bit unfortunate that it
> > doesn't go through the b2 port for it.
> 
> If broadcast is how real hardware does it (even by default!,
> apparently), I expect those as-yet unreleased, hotplug-handling modules
> in edk2 should be able to cope with broadcast.
> 
> > The full solution is to follow a protocol similar to what real hardware
> > does.
> 
> Real hardware seems to use broadcast, according to the above...
> 
> On 11/23/16 23:35, Paolo Bonzini wrote:
> 
> > On hot-plug, before the new CPU starts running, the DSDT should
> > generate an SMI (with a well-known value written to 0xB2).
> 
> I'm not sure I understand right. If it is the DSDT that writes to 0xB2,
> that's just another way to say, "the firmware vendor asks the operating
> system to write to 0xB2". If the malicious OS is smart enough, it can
> realize (from the hardware signal to run the ACPI GPE handler, IIRC)
> that a new VCPU is available, an

Re: [Qemu-devel] [RFC 00/15] qmp: Report supported device types on 'query-machines'

2016-11-24 Thread Markus Armbruster

Eduardo Habkost  writes:

> On Thu, Nov 24, 2016 at 02:34:05PM +0100, Markus Armbruster wrote:
>> Eduardo Habkost  writes:
>> 
>> > On Wed, Nov 23, 2016 at 06:43:16PM +0200, Marcel Apfelbaum wrote:
>> >> On 11/22/2016 03:11 AM, Eduardo Habkost wrote:
>> >> > The Problem
>> >> > ===
>> >> > 
>> >> > Currently management software has no way to find out which device
>> >> > types can be plugged in a machine, unless the machine is already
>> >> > initialized.
>> 
>> I doubt it can find the precise list of device types even for an
>> initialized machine.
>
> My plan is to make it possible. See:
>   [RFC 06/15] qdev: Add device_type field to BusClass
>
> I will change how it looks like on v2 (mostly due to the PCI/PCIe
> issues), but basically it makes all bus objects able to tell
> which device types they accept.

Sounds like a step forward.

>> >> Hi Eduardo,
>> >> Thank you for this interesting series. I think this is a problem
>> >> worth addressing.
>> >> 
>> >> > Even after the machine is initialized, there's no way to map
>> >> > existing bus types to supported device types unless management
>> >> > software hardcodes the mapping between bus types and device
>> >> > types.
>> >> > 
>> >> 
>> >> Here I am a little lost.
>> >> 
>> >> We are going for machine => supported devices or
>> >> bus-type => supported devices?
>> >
>> > On this series, we go for machine-type => supported-devices.
>> >
>> > A bus-type => supported-devices map wouldn't work because
>> > different PCIe bus instances might accept different types of
>> > devices (so supported-devices depend on the specific bus
>> > instance, not only on the bus-type).
>> >
>> > v2 will probably be more detailed. I plan to change it to:
>> >
>> > query-machine(machine-type) => list of BusInfo
>> >
>> > BusInfo would contain:
>> >  * bus-type
>> >  * bus-path
>> >  * accepted-device-types (list of type/interface names)
>> 
>> Let me take a step back and explore what questions a management
>> application may want to ask.  I think the basic questions are:
>> 
>> * Which devices models could be plugged now?
>> 
>>   "Now" means in the current state of the machine.  Depending on "now",
>>   this could be hot- or cold-plug.
>
> Yes.
>
>> 
>> * Into which sockets could device model T be plugged now?
>> 
>>   Mind, I wrote "socket", not bus.  Buses provide sockets, but they are
>>   not the only provider of sockets.
>> 
>>   We need a way to name sockets.  (QOM path to device or bus, socket
>>   address on device or bus) could work.
>
> Naming "sockets" was not on my model, but I think we can do that.
>
> I will probably send RFC v2 without any socket abstraction, so
> other conceptual changes I am introducing can be reviewed
> (especially the PCI/PCIe mess I am diving into, right now). But I
> will try to document how the design can evolve to handle sockets.

Makes sense.

>> Are we on the same page so far?
>
> Yes.
>
>> 
>> Related: same questions with "newly created machine of machine type M"
>> instead of "now".  This isn't basic, because you could always create the
>> machine, then ask the basic question.  But not having to create the
>> machine is convenient, in particular as long as a single QEMU process
>> can only ever create one machine.
>> 
>> Related: same, with creation options.
>> 
>> Your series provides a solution for a special non-basic case: newly
>> created machine of machine type M with no optional parts, i.e. with
>> -nodefaults and nothing else.  Or maybe with the default set of optional
>> parts, i.e. without -nodefaults, not sure.
>> 
>> I'm not demanding you provide a solution for the general case first.  It
>> would be nice, but it could be too complex for a first step.  I do
>> recommend you design your solution for the special case with the general
>> case in mind.  Say, by trying to ensure the special case query returns
>> something that will do or can be evolved to do for the general case,
>> too.
>
> Yep. I will keep that in mind.
>
>> 
>> >> > Example: floppy support on q35 vs i440fx
>> >> > 
>> >> > 
>> >> > There's no way for libvirt to find out that there's no floppy
>> >> > controller on pc-q35-* machine-types by default.
>> >> > 
>> >> 
>> >> Again "by default". So do we want to query the init state of a machine?
>> >> What devices are there? Or what devices *can be* there?
>> >
>> > "by default" means what's present when using "-machine "
>> > with no extra -device arguments.
>> 
>> Not just -device.  Some legacy options can also create buses.  For
>> instance, -device if=scsi,... creates a SCSI bus.
>
> Right. I found that out when debugging the -nodefaults issues.
>
>> 
>> > We want to know what _buses_ are always there. Which in turn lets
>> > management know which _device_ types _can_ be plugged.
>> >
>> >> 
>> >> > With this series, pc-i440fx-* will report "floppy" as a supported
>> >> > device type, but pc-q35-* will not.
>> >> > 
>> >> > Example: Legacy PCI vs vs PCIe dev

Re: [Qemu-devel] [dpdk-dev] dpdk/vpp and cross-version migration for vhost

2016-11-24 Thread Kevin Traynor

On 11/24/2016 12:47 PM, Maxime Coquelin wrote:
> 
> 
> On 11/24/2016 01:33 PM, Yuanhan Liu wrote:
>> On Thu, Nov 24, 2016 at 09:30:49AM +, Kevin Traynor wrote:
>>> > On 11/24/2016 06:31 AM, Yuanhan Liu wrote:
 > > On Tue, Nov 22, 2016 at 04:53:05PM +0200, Michael S. Tsirkin wrote:
>>> >  You keep assuming that you have the VM started first and
>>> >  figure out things afterwards, but this does not work.
>>> > 
>>> >  Think about a cluster of machines. You want to start a VM in
>>> >  a way that will ensure compatibility with all hosts
>>> >  in a cluster.
>> > >>>
>> > >>> I see. I was more considering about the case when the dst
>> > >>> host (including the qemu and dpdk combo) is given, and
>> > >>> then determine whether it will be a successfull migration
>> > >>> or not.
>> > >>>
>> > >>> And you are asking that we need to know which host could
>> > >>> be a good candidate before starting the migration. In such
>> > >>> case, we indeed need some inputs from both the qemu and
>> > >>> vhost-user backend.
>> > >>>
>> > >>> For DPDK, I think it could be simple, just as you said, it
>> > >>> could be either a tiny script, or even a macro defined in
>> > >>> the source code file (we extend it every time we add a
>> > >>> new feature) to let the libvirt to read it. Or something
>> > >>> else.
> > >>
> > >> There's the issue of APIs that tweak features as Maxime
> > >> suggested.
 > >
 > > Yes, it's a good point.
 > >
> > >> Maybe the only thing to do is to deprecate it,
 > >
 > > Looks like so.
 > >
> > >> but I feel some way for application to pass info into
> > >> guest might be benefitial.
 > >
 > > The two APIs are just for tweaking feature bits DPDK supports
 before
 > > any device got connected. It's another way to disable some features
 > > (the another obvious way is to through QEMU command lines).
 > >
 > > IMO, it's bit handy only in a case like: we have bunch of VMs.
 Instead
 > > of disabling something though qemu one by one, we could disable it
 > > once in DPDK.
 > >
 > > But I doubt the useful of it. It's only used in DPDK's vhost
 example
 > > after all. Nor is it used in vhost pmd, neither is it used in OVS.
>>> >
>>> > rte_vhost_feature_disable() is currently used in OVS,
>>> lib/netdev-dpdk.c
>> Hmmm. I must have checked very old code ...
>>> >
>>> > netdev_dpdk_vhost_class_init(void)
>>> > {
>>> > static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
>>> >
>>> > /* This function can be called for different classes.  The
>>> > initialization
>>> >  * needs to be done only once */
>>> > if (ovsthread_once_start(&once)) {
>>> > rte_vhost_driver_callback_register(&virtio_net_device_ops);
>>> > rte_vhost_feature_disable(1ULL << VIRTIO_NET_F_HOST_TSO4
>>> >   | 1ULL << VIRTIO_NET_F_HOST_TSO6
>>> >   | 1ULL << VIRTIO_NET_F_CSUM);
>> I saw the commit introduced such change, but it tells no reason why
>> it was added.
> 
> I'm also interested to know the reason.

I can't remember off hand, added Mark K or Michal W who should be able
to shed some light on it.

> In any case, I think this is something that can/should be managed by
> the management tool, which  should disable it in cmd parameters.
> 
> Kevin, do you agree?

I think best to find out the reason first. Because if no reason to
disable in the code, then no need to debate!

> 
> Cheers,
> Maxime

[Qemu-devel] Linux kernel polling for QEMU

2016-11-24 Thread Stefan Hajnoczi

I looked through the socket SO_BUSY_POLL and blk_mq poll support in
recent Linux kernels with an eye towards integrating the ongoing QEMU
polling work.  The main missing feature is eventfd polling support which
I describe below.

Background
--
We're experimenting with polling in QEMU so I wondered if there are
advantages to having the kernel do polling instead of userspace.

One such advantage has been pointed out by Christian Borntraeger and
Paolo Bonzini: a userspace thread spins blindly without knowing when it
is hogging a CPU that other tasks need.  The kernel knows when other
tasks need to run and can skip polling in that case.

Power management might also benefit if the kernel was aware of polling
activity on the system.  That way polling can be controlled by the
system administrator in a single place.  Perhaps smarter power saving
choices can also be made by the kernel.

Another advantage is that the kernel can poll hardware rings (e.g. NIC
rx rings) whereas QEMU can only poll its own virtual memory (including
guest RAM).  That means the kernel can bypass interrupts for devices
that are using kernel drivers.

State of polling in Linux
-
SO_BUSY_POLL causes recvmsg(2), select(2), and poll(2) family system
calls to spin awaiting new receive packets.  From what I can tell epoll
is not supported so that system call will sleep without polling.

blk_mq poll is mainly supported by NVMe.  It is only available with
synchronous direct I/O.  select(2), poll(2), epoll, and Linux AIO are
therefore not integrated.  It would be nice to extend the code so a
process waiting on Linux AIO using io_getevents(2), select(2), poll(2),
or epoll will poll.

QEMU and KVM-specific polling
-
There are a few QEMU/KVM-specific items that require polling support:

QEMU's event loop aio_notify() mechanism wakes up the event loop from a
blocking poll(2) or epoll call.  It is used when another thread adds or
changes an event loop resource (such as scheduling a BH).  There is a
userspace memory location (ctx->notified) that is written by
aio_notify() as well as an eventfd that can be signalled.

kvm.ko's ioeventfd is signalled upon guest MMIO/PIO accesses.  Virtio
devices use ioeventfd as a doorbell after new requests have been placed
in a virtqueue, which is a descriptor ring in userspace memory.

Eventfd polling support could look like this:

  struct eventfd_poll_info poll_info = {
  .addr = ...memory location...,
  .size = sizeof(uint32_t),
  .op   = EVENTFD_POLL_OP_NOT_EQUAL, /* check *addr != val */
  .val  = ...last value...,
  };
  ioctl(eventfd, EVENTFD_SET_POLL, &poll_info);

In the kernel, eventfd stashes this information and eventfd_poll()
evaluates the operation (e.g. not equal, bitwise and, etc) to detect
progress.

Note that this eventfd polling mechanism doesn't actually poll the
eventfd counter value.  It's useful for situations where the eventfd is
a doorbell/notification that some object in userspace memory has been
updated.  So it polls that userspace memory location directly.

This new eventfd feature also provides a poor man's Linux AIO polling
support: set the Linux AIO shared ring index as the eventfd polling
memory location.  This is not as good as true Linux AIO polling support
where the kernel polls the NVMe, virtio_blk, etc ring since we'd still
rely on an interrupt to complete I/O requests.

Thoughts?

Stefan


signature.asc
Description: PGP signature

Re: [Qemu-devel] [for-2.8 0/4] 9p patches for 2.8 20161123

2016-11-24 Thread Stefan Hajnoczi

On Thu, Nov 24, 2016 at 11:36:29AM +0100, Greg Kurz wrote:
> On Thu, 24 Nov 2016 10:19:58 +
> Stefan Hajnoczi  wrote:
> > On Wed, Nov 23, 2016 at 05:58:14PM +0100, Greg Kurz wrote:
> > Your PGP key has only one signature.  Can you key sign with other
> > IBMers?  That may allow us to get the Web of Trust going.
> > 
> 
> We're only a few IBMers in my area (Toulouse, France), but I guess this
> is better than nothing. :)

Will you attend FOSDEM (https://fosdem.org/2017/) this year?  There will
be good opportunities for key signing.

Stefan


signature.asc
Description: PGP signature

[Qemu-devel] [PULL 0/3] M68k for 2.8 patches

2016-11-24 Thread Laurent Vivier

The following changes since commit 00227fefd2059464cd2f59aed29944874c630e2f:

  Update version for v2.8.0-rc1 release (2016-11-22 22:29:08 +)

are available in the git repository at:

  git://github.com/vivier/qemu-m68k.git tags/m68k-for-2.8-pull-request

for you to fetch changes up to 4a18cd44f3c905d443c26e26bb9b09932606d1a3:

  target-m68k: fix muluw/mulsw (2016-11-24 16:24:27 +0100)


This series fixes some errors found unsing risu.


Laurent Vivier (3):
  target-m68k: fix EXG instruction
  target-m68k: Fix cmpa operand size
  target-m68k: fix muluw/mulsw

 target-m68k/translate.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

-- 
2.7.4

[Qemu-devel] [PULL 1/3] target-m68k: fix EXG instruction

2016-11-24 Thread Laurent Vivier

opcodes of "EXG Ax,Ay" and "EXG Dx,Dy" have been swapped

Signed-off-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
---
 target-m68k/translate.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index 9ad974f..8e522db 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -2198,13 +2198,13 @@ static void do_exg(TCGv reg1, TCGv reg2)
 tcg_temp_free(temp);
 }
 
-DISAS_INSN(exg_aa)
+DISAS_INSN(exg_dd)
 {
 /* exchange Dx and Dy */
 do_exg(DREG(insn, 9), DREG(insn, 0));
 }
 
-DISAS_INSN(exg_dd)
+DISAS_INSN(exg_aa)
 {
 /* exchange Ax and Ay */
 do_exg(AREG(insn, 9), AREG(insn, 0));
-- 
2.7.4

[Qemu-devel] [PULL 2/3] target-m68k: Fix cmpa operand size

2016-11-24 Thread Laurent Vivier

"The size of the operation can be specified as word or long.
Word length source operands are sign-extended to 32 bits for
comparison."

So comparison is always done using OS_LONG.

Signed-off-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
---
 target-m68k/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index 8e522db..d2d6816 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -2170,7 +2170,7 @@ DISAS_INSN(cmpa)
 }
 SRC_EA(env, src, opsize, 1, NULL);
 reg = AREG(insn, 9);
-gen_update_cc_cmp(s, reg, src, opsize);
+gen_update_cc_cmp(s, reg, src, OS_LONG);
 }
 
 DISAS_INSN(eor)
-- 
2.7.4

[Qemu-devel] [PULL 3/3] target-m68k: fix muluw/mulsw

2016-11-24 Thread Laurent Vivier

"The multiplier and multiplicand are both word operands, and the result
is a long-word operand."

So compute flags on a long-word result, not on a word result.

Signed-off-by: Laurent Vivier 
Reviewed-by: Richard Henderson 
---
 target-m68k/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index d2d6816..d6ed883 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -1186,7 +1186,7 @@ DISAS_INSN(mulw)
 SRC_EA(env, src, OS_WORD, sign, NULL);
 tcg_gen_mul_i32(tmp, tmp, src);
 tcg_gen_mov_i32(reg, tmp);
-gen_logic_cc(s, tmp, OS_WORD);
+gen_logic_cc(s, tmp, OS_LONG);
 }
 
 DISAS_INSN(divw)
-- 
2.7.4

Re: [Qemu-devel] [RFC 00/15] qmp: Report supported device types on 'query-machines'

2016-11-24 Thread Markus Armbruster

Marcel Apfelbaum  writes:

> On 11/24/2016 03:34 PM, Markus Armbruster wrote:
>> Eduardo Habkost  writes:
>>
>>> On Wed, Nov 23, 2016 at 06:43:16PM +0200, Marcel Apfelbaum wrote:
 On 11/22/2016 03:11 AM, Eduardo Habkost wrote:
> The Problem
>>>
>
> [...]
>
>> Our decision to have hybrid PCI/PCIe devices and buses breeds
>> considerable complexity.  I wish we had avoided them, but I believe it's
>> too late to change now.
>>
 This still does not solve the problem that some devices makes
 sense only on a specific arch.
>>
>
> Hi Markus,
>
>> Examples?
>>
>
> One quick example would be that we don't want to see
> Intel's IOH 3420 PCIe Root Port in an ARM machine,
> or a pxb on a Q35 machine (in this case we want pxb-pcie)

Such a device would be weird.  But would it be wrong?  Wrong enough for
QEMU to reject it?  Unless QEMU rejects it, there's no reason not to
list it as pluggable.

> I do believe there are other examples, I'll try to think of more.
>
> Thanks,
> Marcel
>
> [...]

Re: [Qemu-devel] [PATCH v1 02/18] block/pcache: empty pcache driver filter

2016-11-24 Thread Pavel Butsykin


On 23.11.2016 18:15, Kevin Wolf wrote:

Am 15.11.2016 um 07:36 hat Pavel Butsykin geschrieben:

The basic version of pcache driver for easy preparation of a patch set.

Signed-off-by: Pavel Butsykin 
---
  block/Makefile.objs |   1 +
  block/pcache.c  | 144 
  2 files changed, 145 insertions(+)
  create mode 100644 block/pcache.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index 7d4031d..c60f680 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -4,6 +4,7 @@ block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o 
qed-cluster.o
  block-obj-y += qed-check.o
  block-obj-y += vhdx.o vhdx-endian.o vhdx-log.o
  block-obj-y += quorum.o
+block-obj-y += pcache.o
  block-obj-y += parallels.o blkdebug.o blkverify.o blkreplay.o
  block-obj-y += block-backend.o snapshot.o qapi.o
  block-obj-$(CONFIG_WIN32) += raw-win32.o win32-aio.o
diff --git a/block/pcache.c b/block/pcache.c
new file mode 100644
index 000..59461df
--- /dev/null
+++ b/block/pcache.c
@@ -0,0 +1,144 @@
+/*
+ * Prefetch cache driver filter
+ *
+ * Copyright (C) 2015-2016 Parallels IP Holdings GmbH. All rights reserved.


"All rights reserved" doesn't really agree with licensing under the GPL.
It would be preferable to remove this note to avoid any ambiguity.


Yes, you're right. It seems that 'All rights reserved' is directly
inconsistent with GPL.


+ *
+ * Author: Pavel Butsykin 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "block/block_int.h"
+#include "qapi/error.h"
+#include "qapi/qmp/qstring.h"
+
+
+static QemuOptsList runtime_opts = {
+.name = "pcache",
+.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
+.desc = {
+{
+.name = "x-image",
+.type = QEMU_OPT_STRING,
+.help = "[internal use only, will be removed]",
+},


blkdebug/blkverify have this because they have to support legacy syntax
like -drive file=blkdebug:blkdebug.conf:test.img, i.e. it has to deal
with filenames.q

Here we don't have to support a legacy syntax, so I would completely
avoid this from the beginning. You already support the "image" option,
which should be good enough.


Then the command line would look like this:

-drive 
file=/img/harddisk.hdd,if=none,id=drive-scsi0-0-0-0,cache=none,aio=native

-drive driver=pcache,image=scsi0-0-0-0,if=virtio

Ok.


+{ /* end of list */ }
+},
+};
+
+typedef struct PCacheAIOCB {
+Coroutine *co;
+int ret;
+} PCacheAIOCB;
+
+static void pcache_aio_cb(void *opaque, int ret)
+{
+PCacheAIOCB *acb = opaque;
+
+acb->ret = ret;
+qemu_coroutine_enter(acb->co);
+}
+
+static coroutine_fn int pcache_co_preadv(BlockDriverState *bs, uint64_t offset,
+ uint64_t bytes, QEMUIOVector *qiov,
+ int flags)
+{
+PCacheAIOCB acb = {
+.co = qemu_coroutine_self(),
+};
+
+bdrv_aio_preadv(bs->file, offset, qiov, bytes, pcache_aio_cb, &acb);
+
+qemu_coroutine_yield();
+
+return acb.ret;
+}


As I commented on patch 1, all of this can be replaced by a single line:

 return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags);


+static coroutine_fn int pcache_co_pwritev(BlockDriverState *bs, uint64_t 
offset,
+  uint64_t bytes, QEMUIOVector *qiov,
+  int flags)
+{
+PCacheAIOCB acb = {
+.co = qemu_coroutine_self(),
+};
+
+bdrv_aio_pwritev(bs->file, offset, qiov, bytes, pcache_aio_cb, &acb);
+
+qemu_coroutine_yield();
+
+return acb.ret;
+}


Same here.


+static int pcache_file_open(BlockDriverState *bs, QDict *options, int flags,
+Error **errp)
+{
+QemuOpts *opts;
+Error *local_err = NULL;
+int ret = 0;
+
+opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
+qemu_opts_absorb_qdict(opts, options, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+ret = -EINVAL;
+goto fail;
+}
+
+assert(bs->file == NULL);
+bs->file = bdrv_open_child(qemu_opt_get(opts, "x-image"), options,
+   "image", bs, &child_format, false, &local_err);


When removing x-image, the first parameter becomes NULL here.


+if (local_err) {
+ret = -EINVAL;
+error_propagate(errp, local_err);
+}
+fail:
+qemu_opts_del(opts);
+return ret;
+}
+
+static void pcache_close(BlockDriverState *bs)
+{
+}
+
+static void pcache_parse_filename(const char *filename, QDict *options,
+  Error **errp)
+{
+qdict_put(options, "x-image", qstring_from_str(filename));
+}


This one goes away.


+static int64_t pcache_getlength(BlockDriverState *bs)
+{
+return bdrv_getlength(bs->file->bs);
+}
+

Re: [Qemu-devel] [PATCH v7 RFC] block/vxhs: Initial commit to add Veritas HyperScale VxHS block device support

2016-11-24 Thread Stefan Hajnoczi

On Thu, Nov 24, 2016 at 11:31:14AM +, Ketan Nilangekar wrote:
> 
> 
> On 11/24/16, 4:41 PM, "Stefan Hajnoczi"  wrote:
> 
> On Thu, Nov 24, 2016 at 05:44:37AM +, Ketan Nilangekar wrote:
> > On 11/24/16, 4:07 AM, "Paolo Bonzini"  wrote:
> > >On 23/11/2016 23:09, ashish mittal wrote:
> > >> On the topic of protocol security -
> > >> 
> > >> Would it be enough for the first patch to implement only
> > >> authentication and not encryption?
> > >
> > >Yes, of course.  However, as we introduce more and more QEMU-specific
> > >characteristics to a protocol that is already QEMU-specific (it doesn't
> > >do failover, etc.), I am still not sure of the actual benefit of using
> > >libqnio versus having an NBD server or FUSE driver.
> > >
> > >You have already mentioned performance, but the design has changed so
> > >much that I think one of the two things has to change: either failover
> > >moves back to QEMU and there is no (closed source) translator running 
> on
> > >the node, or the translator needs to speak a well-known and
> > >already-supported protocol.
> > 
> > IMO design has not changed. Implementation has changed significantly. I 
> would propose that we keep resiliency/failover code out of QEMU driver and 
> implement it entirely in libqnio as planned in a subsequent revision. The 
> VxHS server does not need to understand/handle failover at all. 
> > 
> > Today libqnio gives us significantly better performance than any 
> NBD/FUSE implementation. We know because we have prototyped with both. 
> Significant improvements to libqnio are also in the pipeline which will use 
> cross memory attach calls to further boost performance. Ofcourse a big reason 
> for the performance is also the HyperScale storage backend but we believe 
> this method of IO tapping/redirecting can be leveraged by other solutions as 
> well.
> 
> By "cross memory attach" do you mean
> process_vm_readv(2)/process_vm_writev(2)?
>   
> Ketan> Yes.
>   
> That puts us back to square one in terms of security.  You have
> (untrusted) QEMU + (untrusted) libqnio directly accessing the memory of
> another process on the same machine.  That process is therefore also
> untrusted and may only process data for one guest so that guests stay
> isolated from each other.
> 
> Ketan> Understood but this will be no worse than the current network based 
> communication between qnio and vxhs server. And although we have questions 
> around QEMU trust/vulnerability issues, we are looking to implement basic 
> authentication scheme between libqnio and vxhs server.

This is incorrect.

Cross memory attach is equivalent to ptrace(2) (i.e. debugger) access.
It means process A reads/writes directly from/to process B memory.  Both
processes must have the same uid/gid.  There is no trust boundary
between them.

Network communication does not require both processes to have the same
uid/gid.  If you want multiple QEMU processes talking to a single server
there must be a trust boundary between client and server.  The server
can validate the input from the client and reject undesired operations.

Hope this makes sense now.

Two architectures that implement the QEMU trust model correctly are:

1. Cross memory attach: each QEMU process has a dedicated vxhs server
   process to prevent guests from attacking each other.  This is where I
   said you might as well put the code inside QEMU since there is no
   isolation anyway.  From what you've said it sounds like the vxhs
   server needs a host-wide view and is responsible for all guests
   running on the host, so I guess we have to rule out this
   architecture.

2. Network communication: one vxhs server process and multiple guests.
   Here you might as well use NBD or iSCSI because it already exists and
   the vxhs driver doesn't add any unique functionality over existing
   protocols.

> There's an easier way to get even better performance: get rid of libqnio
> and the external process.  Move the code from the external process into
> QEMU to eliminate the process_vm_readv(2)/process_vm_writev(2) and
> context switching.
> 
> Can you remind me why there needs to be an external process?
>  
> Ketan>  Apart from virtualizing the available direct attached storage on the 
> compute, vxhs storage backend (the external process) provides features such 
> as storage QoS, resiliency, efficient use of direct attached storage, 
> automatic storage recovery points (snapshots) etc. Implementing this in QEMU 
> is not practical and not the purpose of proposing this driver.

This sounds similar to what QEMU and Linux (file systems, LVM, RAID,
etc) already do.  It brings to mind a third architecture:

3. A Linux driver or file system.  Then QEMU opens a raw block device.
   This is what the Ceph rbd block driver in Linux does.  This
   architecture has a kernel-userspace boundary so

[Qemu-devel] [PATCH v2 7/7] linux-user: Fix mq_open

2016-11-24 Thread Lena Djokic

If fourth argument is NULL it should be passed without
using lock_user function which would, in that case, return
EFAULT, and system call supports passing NULL as fourth argument.

Signed-off-by: Lena Djokic 
---
 linux-user/syscall.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 3faf4f0..dad03e9 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -11694,17 +11694,22 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 case TARGET_NR_mq_open:
 {
 struct mq_attr posix_mq_attr;
+struct mq_attr *pposix_mq_attr;
 int host_flags;
 
 host_flags = target_to_host_bitmask(arg2, fcntl_flags_tbl);
-if (copy_from_user_mq_attr(&posix_mq_attr, arg4) != 0) {
-goto efault;
+pposix_mq_attr = NULL;
+if (arg4) {
+if (copy_from_user_mq_attr(&posix_mq_attr, arg4) != 0) {
+goto efault;
+}
+pposix_mq_attr = &posix_mq_attr;
 }
 p = lock_user_string(arg1 - 1);
 if (!p) {
 goto efault;
 }
-ret = get_errno(mq_open(p, host_flags, arg3, &posix_mq_attr));
+ret = get_errno(mq_open(p, host_flags, arg3, pposix_mq_attr));
 unlock_user (p, arg1, 0);
 }
 break;
-- 
2.7.4

[Qemu-devel] [PATCH v2 2/7] linux-user: Fix inotify_init1 support

2016-11-24 Thread Lena Djokic

This commit adds necessary conversion of argument passed to inotify_init1.
inotify_init1 flags can be IN_NONBLOCK and IN_CLOEXEC which rely on O_NONBLOCK
and O_CLOEXEC and those can have different values on different platforms.

Signed-off-by: Lena Djokic 
---
 linux-user/syscall.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index f5d9a26..41873ca 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -11625,7 +11625,8 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 #ifdef CONFIG_INOTIFY1
 #if defined(TARGET_NR_inotify_init1) && defined(__NR_inotify_init1)
 case TARGET_NR_inotify_init1:
-ret = get_errno(sys_inotify_init1(arg1));
+ret = get_errno(sys_inotify_init1(target_to_host_bitmask(arg1,
+  fcntl_flags_tbl)));
 break;
 #endif
 #endif
-- 
2.7.4

[Qemu-devel] [kvm-unit-tests PATCH v7 03/11] run_tests: allow passing of options to QEMU

2016-11-24 Thread Alex Bennée

This introduces a the option -o for passing of options directly to QEMU
which is useful. In my case I'm using it to toggle MTTCG on an off:

  ./run_tests.sh -t -o "-tcg mttcg=on"

Signed-off-by: Alex Bennée 
---
 run_tests.sh   | 10 +++---
 scripts/functions.bash | 13 +++--
 2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/run_tests.sh b/run_tests.sh
index 4f2e5cb..05cc7fb 100755
--- a/run_tests.sh
+++ b/run_tests.sh
@@ -13,10 +13,11 @@ function usage()
 {
 cat < test.log
-for_each_unittest $config run
+for_each_unittest $config run "$extra_opts"
diff --git a/scripts/functions.bash b/scripts/functions.bash
index ee9143c..d38a69e 100644
--- a/scripts/functions.bash
+++ b/scripts/functions.bash
@@ -2,11 +2,12 @@
 function for_each_unittest()
 {
local unittests="$1"
-   local cmd="$2"
-   local testname
+local cmd="$2"
+local extra_opts=$3
+local testname
local smp
local kernel
-   local opts
+local opts=$extra_opts
local groups
local arch
local check
@@ -21,7 +22,7 @@ function for_each_unittest()
testname=${BASH_REMATCH[1]}
smp=1
kernel=""
-   opts=""
+opts=$extra_opts
groups=""
arch=""
check=""
@@ -32,7 +33,7 @@ function for_each_unittest()
elif [[ $line =~ ^smp\ *=\ *(.*)$ ]]; then
smp=${BASH_REMATCH[1]}
elif [[ $line =~ ^extra_params\ *=\ *(.*)$ ]]; then
-   opts=${BASH_REMATCH[1]}
+opts="$opts ${BASH_REMATCH[1]}"
elif [[ $line =~ ^groups\ *=\ *(.*)$ ]]; then
groups=${BASH_REMATCH[1]}
elif [[ $line =~ ^arch\ *=\ *(.*)$ ]]; then
@@ -45,6 +46,6 @@ function for_each_unittest()
timeout=${BASH_REMATCH[1]}
fi
done
-   "$cmd" "$testname" "$groups" "$smp" "$kernel" "$opts" "$arch" "$check" 
"$accel" "$timeout"
+"$cmd" "$testname" "$groups" "$smp" "$kernel" "$opts" "$arch" "$check" 
"$accel" "$timeout"
exec {fd}<&-
 }
-- 
2.10.1

[Qemu-devel] [PATCH v2 5/7] linux-user: Fix readahead

2016-11-24 Thread Lena Djokic

Calculation of 64-bit offset was not correct for all cases.

Signed-off-by: Lena Djokic 
---
 linux-user/syscall.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 1b59a71..61c4126 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -11296,7 +11296,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 arg3 = arg4;
 arg4 = arg5;
 }
-ret = get_errno(readahead(arg1, ((off64_t)arg3 << 32) | arg2, arg4));
+ret = get_errno(readahead(arg1, target_offset64(arg2, arg3) , arg4));
 #else
 ret = get_errno(readahead(arg1, arg2, arg3));
 #endif
-- 
2.7.4

[Qemu-devel] [PATCH v2 0/7] Improvements of qemu linux-user

2016-11-24 Thread Lena Djokic

v2: added 6 patches
This patch series contains implementation of support for 
two new system calls, and fixes for 5 existing system calls,
and fix for a structure definition as well.

Lena Djokic (7):
  linux-user: Add fanotify implementation
  linux-user: Fix inotify_init1 support
  linux-user: Fix flock definition for mips64
  linux-user: Fix fcnt
  linux-user: Fix readahead
  linux-user: Fix syslog
  linux-user: Fix mq_open

 configure |  20 ++
 linux-user/syscall.c  | 166 --
 linux-user/syscall_defs.h |   2 +-
 3 files changed, 168 insertions(+), 20 deletions(-)

-- 
2.7.4

[Qemu-devel] [PATCH v2 4/7] linux-user: Fix fcnt

2016-11-24 Thread Lena Djokic

F_GETSIG and F_SETSIG were implemented with default behaviour which
simply passes given arguments to fcntl syscall, but since those
arguments are signals used for communication between taget and
host we need conversion which is done by using host_to_target_signal
and taget_to_host_signal functions.

Signed-off-by: Lena Djokic 
---
 linux-user/syscall.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 41873ca..1b59a71 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -6544,14 +6544,18 @@ static abi_long do_fcntl(int fd, int cmd, abi_ulong arg)
 
 case TARGET_F_SETOWN:
 case TARGET_F_GETOWN:
-case TARGET_F_SETSIG:
-case TARGET_F_GETSIG:
 case TARGET_F_SETLEASE:
 case TARGET_F_GETLEASE:
 case TARGET_F_SETPIPE_SZ:
 case TARGET_F_GETPIPE_SZ:
 ret = get_errno(safe_fcntl(fd, host_cmd, arg));
 break;
+case TARGET_F_GETSIG:
+ret = host_to_target_signal(get_errno(fcntl(fd, host_cmd, arg)));
+break;
+case TARGET_F_SETSIG:
+ret = get_errno(fcntl(fd, host_cmd, target_to_host_signal(arg)));
+break;
 
 default:
 ret = get_errno(safe_fcntl(fd, cmd, arg));
-- 
2.7.4

[Qemu-devel] [kvm-unit-tests PATCH v7 10/11] arm/barrier-litmus-tests: add simple mp and sal litmus tests

2016-11-24 Thread Alex Bennée

This adds a framework for adding simple barrier litmus tests against
ARM. The litmus tests aren't as comprehensive as the academic exercises
which will attempt to do all sorts of things to keep racing CPUs synced
up. These tests do honour the "sync" parameter to do a poor-mans
equivalent.

The two litmus tests are:
  - message passing
  - store-after-load

They both have case that should fail (although won't on single-threaded
TCG setups). If barriers aren't working properly the store-after-load
test will fail even on an x86 backend as x86 allows re-ording of non
aliased stores.

I've imported a few more of the barrier primatives from the Linux source
tree so we consistently use macros.

The arm64 barrier primitives trip up on -Wstrict-aliasing so this is
disabled in the Makefile.

Signed-off-by: Alex Bennée 
CC: Will Deacon 

---
v7
  - merge in store-after-load
  - clean-up sync-up code
  - use new counter api
  - fix xfail for sal test
v6
  - add a unittest.cfg
  - -fno-strict-aliasing
---
 Makefile  |   2 +
 arm/Makefile.common   |   2 +
 arm/barrier-litmus-test.c | 437 ++
 arm/unittests.cfg |  36 
 lib/arm/asm/barrier.h |  63 ++-
 lib/arm64/asm/barrier.h   |  50 ++
 6 files changed, 589 insertions(+), 1 deletion(-)
 create mode 100644 arm/barrier-litmus-test.c

diff --git a/Makefile b/Makefile
index 5201472..53594a1 100644
--- a/Makefile
+++ b/Makefile
@@ -51,10 +51,12 @@ fomit_frame_pointer := $(call cc-option, 
$(frame-pointer-flag), "")
 fnostack_protector := $(call cc-option, -fno-stack-protector, "")
 fnostack_protector_all := $(call cc-option, -fno-stack-protector-all, "")
 wno_frame_address := $(call cc-option, -Wno-frame-address, "")
+fno_strict_aliasing := $(call cc-option, -fno-strict-aliasing, "")
 CFLAGS += $(fomit_frame_pointer)
 CFLAGS += $(fno_stack_protector)
 CFLAGS += $(fno_stack_protector_all)
 CFLAGS += $(wno_frame_address)
+CFLAGS += $(fno_strict_aliasing)
 
 CXXFLAGS += $(CFLAGS)
 
diff --git a/arm/Makefile.common b/arm/Makefile.common
index eb4cfdf..a508128 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -16,6 +16,7 @@ tests-common += $(TEST_DIR)/gic.flat
 tests-common += $(TEST_DIR)/tlbflush-code.flat
 tests-common += $(TEST_DIR)/tlbflush-data.flat
 tests-common += $(TEST_DIR)/locking-test.flat
+tests-common += $(TEST_DIR)/barrier-litmus-test.flat
 
 all: test_cases
 
@@ -87,3 +88,4 @@ $(TEST_DIR)/selftest.o $(cstart.o): $(asm-offsets)
 $(TEST_DIR)/tlbflush-code.elf: $(cstart.o) $(TEST_DIR)/tlbflush-code.o
 $(TEST_DIR)/tlbflush-data.elf: $(cstart.o) $(TEST_DIR)/tlbflush-data.o
 $(TEST_DIR)/locking-test.elf: $(cstart.o) $(TEST_DIR)/locking-test.o
+$(TEST_DIR)/barrier-litmus-test.elf: $(cstart.o) 
$(TEST_DIR)/barrier-litmus-test.o
diff --git a/arm/barrier-litmus-test.c b/arm/barrier-litmus-test.c
new file mode 100644
index 000..2557a88
--- /dev/null
+++ b/arm/barrier-litmus-test.c
@@ -0,0 +1,437 @@
+/*
+ * ARM Barrier Litmus Tests
+ *
+ * This test provides a framework for testing barrier conditions on
+ * the processor. It's simpler than the more involved barrier testing
+ * frameworks as we are looking for simple failures of QEMU's TCG not
+ * weird edge cases the silicon gets wrong.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define MAX_CPUS 8
+
+/* Array size and access controls */
+static int array_size = 10;
+static int wait_if_ahead = 0;
+
+static cpumask_t cpu_mask;
+
+/*
+ * These test_array_* structures are a contiguous array modified by two or more
+ * competing CPUs. The padding is to ensure the variables do not share
+ * cache lines.
+ *
+ * All structures start zeroed.
+ */
+
+typedef struct test_array
+{
+   volatile unsigned int x;
+   uint8_t dummy[64];
+   volatile unsigned int y;
+   uint8_t dummy2[64];
+   volatile unsigned int r[MAX_CPUS];
+} test_array;
+
+volatile test_array *array;
+
+/* Test definition structure
+ *
+ * The first function will always run on the primary CPU, it is
+ * usually the one that will detect any weirdness and trigger the
+ * failure of the test.
+ */
+
+typedef void (*test_fn)(void);
+
+typedef struct {
+   const char *test_name;
+   bool  should_pass;
+   test_fn main_fn;
+   test_fn secondary_fns[MAX_CPUS-1];
+} test_descr_t;
+
+/* Litmus tests */
+
+static unsigned long sync_start(void)
+{
+   const unsigned long gate_mask = ~0x3;
+   unsigned long gate, now;
+   gate = get_cntvct() & gate_mask;
+   do {
+   now =get_cntvct();
+   } while ((now & gate_mask)==gate);
+
+   return now;
+}
+
+/* Simple Message Passing
+ *
+ * x is the message data
+ * y is the flag to indicate the data is ready
+ *
+ * Reading x == 0 when y == 1 is a failure.
+ */
+
+void message_passing_write(void)
+{
+   int i;
+
+   sync_start();
+   for (i=0; i< array_size; i++) {
+   volatile test_array *entry = &array[i];
+

Re: [Qemu-devel] [PATCH 4/4] 9pfs: add a size parameter to init_iov_from_pdu

2016-11-24 Thread Greg Kurz

On Mon, 21 Nov 2016 13:39:32 -0800
Stefano Stabellini  wrote:

> Not all 9pfs transports share memory between request and response. For
> those who don't, it is necessary to know how much memory is required in
> the response.
> 
> Signed-off-by: Stefano Stabellini 
> ---

IIUC the transport used in Xen requires you pass the size when sending a
P9_RREAD message back to the guest, i.e. only in the case when is_write is
false, correct ?

If so, for better clarity, what about having two distinct init_in_iov_from_pdu
and init_out_iov_from_pdu ops, with an explicit comment in the virtio-9p
implementation of init_in_iov_from_pdu so that someone isn't tempted to drop
the apparently unused size argument ? Would this be ok for you on the Xen
side ?

Cheers.

--
Greg

>  hw/9pfs/9p.c   | 2 +-
>  hw/9pfs/9p.h   | 2 +-
>  hw/9pfs/virtio-9p-device.c | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> index b6ec042..b82212b 100644
> --- a/hw/9pfs/9p.c
> +++ b/hw/9pfs/9p.c
> @@ -1652,7 +1652,7 @@ static void v9fs_init_qiov_from_pdu(QEMUIOVector *qiov, 
> V9fsPDU *pdu,
>  struct iovec *iov;
>  unsigned int niov;
>  
> -pdu->s->transport->init_iov_from_pdu(pdu, &iov, &niov, is_write);
> +pdu->s->transport->init_iov_from_pdu(pdu, &iov, &niov, is_write, skip + 
> size);
>  
>  qemu_iovec_init_external(&elem, iov, niov);
>  qemu_iovec_init(qiov, niov);
> diff --git a/hw/9pfs/9p.h b/hw/9pfs/9p.h
> index ab398d0..c830188 100644
> --- a/hw/9pfs/9p.h
> +++ b/hw/9pfs/9p.h
> @@ -348,7 +348,7 @@ struct V9fsTransport {
>  ssize_t (*pdu_vmarshal)(V9fsPDU *pdu, size_t offset, const char 
> *fmt, va_list ap);
>  ssize_t (*pdu_vunmarshal)(V9fsPDU *pdu, size_t offset, const char 
> *fmt, va_list ap);
>  void(*init_iov_from_pdu)(V9fsPDU *pdu, struct iovec **piov,
> -  unsigned int *pniov, bool is_write);
> +  unsigned int *pniov, bool is_write, size_t 
> size);
>  void(*push_and_notify)(V9fsPDU *pdu);
>  };
>  
> diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
> index e1a37a4..e2b27e8 100644
> --- a/hw/9pfs/virtio-9p-device.c
> +++ b/hw/9pfs/virtio-9p-device.c
> @@ -172,7 +172,7 @@ static ssize_t virtio_pdu_vunmarshal(V9fsPDU *pdu, size_t 
> offset,
>  }
>  
>  static void virtio_init_iov_from_pdu(V9fsPDU *pdu, struct iovec **piov,
> -  unsigned int *pniov, bool is_write)
> +  unsigned int *pniov, bool is_write, size_t 
> size)
>  {
>  V9fsState *s = pdu->s;
>  V9fsVirtioState *v = container_of(s, V9fsVirtioState, state);

[Qemu-devel] [PATCH v2 3/7] linux-user: Fix flock definition for mips64

2016-11-24 Thread Lena Djokic

Mips64 uses generic flock structure.
See /arch/mips/include/uapi/asm/fcntl.h#L63 for reference.

Signed-off-by: Lena Djokic 
---
 linux-user/syscall_defs.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index 0b15466..099fd0e 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -2363,7 +2363,7 @@ struct target_flock {
 short l_whence;
 abi_long l_start;
 abi_long l_len;
-#if defined(TARGET_MIPS)
+#if defined(TARGET_MIPS) && (TARGET_ABI_BITS == 32)
 abi_long l_sysid;
 #endif
 int l_pid;
-- 
2.7.4

[Qemu-devel] [PATCH v2 6/7] linux-user: Fix syslog

2016-11-24 Thread Lena Djokic

Third argument represents lenght not second.
If second argument is NULL it should be passed without
using lock_user function which would, in that case, return
EFAULT, and system call supports passing NULL as second argument.

Signed-off-by: Lena Djokic 
---
 linux-user/syscall.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 61c4126..3faf4f0 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -9426,7 +9426,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 #if defined(TARGET_NR_syslog)
 case TARGET_NR_syslog:
 {
-int len = arg2;
+int len = arg3;
 
 switch (arg1) {
 case TARGET_SYSLOG_ACTION_CLOSE: /* Close log */
@@ -9450,13 +9450,13 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 goto fail;
 }
 ret = 0;
-if (len == 0) {
-break;
-}
-p = lock_user(VERIFY_WRITE, arg2, arg3, 0);
-if (!p) {
-ret = -TARGET_EFAULT;
-goto fail;
+p = NULL;
+if (arg2) {
+p = lock_user(VERIFY_WRITE, arg2, arg3, 0);
+if (!p) {
+ret = -TARGET_EFAULT;
+goto fail;
+}
 }
 ret = get_errno(sys_syslog((int)arg1, p, (int)arg3));
 unlock_user(p, arg2, arg3);
-- 
2.7.4

[Qemu-devel] [kvm-unit-tests PATCH v7 08/11] arm/tlbflush-data: Add TLB flush during data writes test

2016-11-24 Thread Alex Bennée

This test is the cousin of the tlbflush-code test. Instead of flushing
running code it re-maps virtual addresses while a buffer is being filled
up. It then audits the results checking for writes that have ended up in
the wrong place.

While tlbflush-code exercises QEMU's translation invalidation logic this
test stresses the SoftMMU cputlb code and ensures it is semantically
correct.

The test optionally takes two parameters for debugging:

   cycles   - change the default number of test iterations
   page - flush pages individually instead of all

Signed-off-by: Alex Bennée 
CC: Mark Rutland 
---
 arm/Makefile.common |   2 +
 arm/tlbflush-data.c | 401 
 arm/unittests.cfg   |  12 ++
 3 files changed, 415 insertions(+)
 create mode 100644 arm/tlbflush-data.c

diff --git a/arm/Makefile.common b/arm/Makefile.common
index de99a6e..528166d 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -14,6 +14,7 @@ tests-common += $(TEST_DIR)/spinlock-test.flat
 tests-common += $(TEST_DIR)/pci-test.flat
 tests-common += $(TEST_DIR)/gic.flat
 tests-common += $(TEST_DIR)/tlbflush-code.flat
+tests-common += $(TEST_DIR)/tlbflush-data.flat
 
 all: test_cases
 
@@ -83,3 +84,4 @@ test_cases: $(generated_files) $(tests-common) $(tests)
 
 $(TEST_DIR)/selftest.o $(cstart.o): $(asm-offsets)
 $(TEST_DIR)/tlbflush-code.elf: $(cstart.o) $(TEST_DIR)/tlbflush-code.o
+$(TEST_DIR)/tlbflush-data.elf: $(cstart.o) $(TEST_DIR)/tlbflush-data.o
diff --git a/arm/tlbflush-data.c b/arm/tlbflush-data.c
new file mode 100644
index 000..7920179
--- /dev/null
+++ b/arm/tlbflush-data.c
@@ -0,0 +1,401 @@
+/*
+ * TLB Flush Race Tests
+ *
+ * These tests are designed to test for incorrect TLB flush semantics
+ * under emulation. The initial CPU will set all the others working on
+ * a writing to a set of pages. It will then re-map one of the pages
+ * back and forth while recording the timestamps of when each page was
+ * active. The test fails if a write was detected on a page after the
+ * tlbflush switching to a new page should have completed.
+ *
+ * Copyright (C) 2016, Linaro, Alex Bennée 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define NR_TIMESTAMPS  ((PAGE_SIZE/sizeof(u64)) << 2)
+#define NR_AUDIT_RECORDS   16384
+#define NR_DYNAMIC_PAGES   3
+#define MAX_CPUS   8
+
+#define MIN(a, b)  ((a) < (b) ? (a) : (b))
+
+typedef struct {
+   u64 timestamps[NR_TIMESTAMPS];
+} write_buffer;
+
+typedef struct {
+   write_buffer*newbuf;
+   u64 time_before_flush;
+   u64 time_after_flush;
+} audit_rec_t;
+
+typedef struct {
+   audit_rec_t records[NR_AUDIT_RECORDS];
+} audit_buffer;
+
+typedef struct {
+   write_buffer*stable_pages;
+   write_buffer*dynamic_pages[NR_DYNAMIC_PAGES];
+   audit_buffer*audit;
+   unsigned intflush_count;
+} test_data_t;
+
+static test_data_t test_data[MAX_CPUS];
+
+static cpumask_t ready;
+static cpumask_t complete;
+
+static bool test_complete;
+static bool flush_verbose;
+static bool flush_by_page;
+static int test_cycles=3;
+static int secondary_cpus;
+
+static write_buffer * alloc_test_pages(void)
+{
+   write_buffer *pg;
+   pg = calloc(NR_TIMESTAMPS, sizeof(u64));
+   return pg;
+}
+
+static void setup_pages_for_cpu(int cpu)
+{
+   unsigned int i;
+
+   test_data[cpu].stable_pages = alloc_test_pages();
+
+   for (i=0; irecords[record];
+}
+
+/* Sync on a given cpumask */
+static void wait_on(int cpu, cpumask_t *mask)
+{
+   cpumask_set_cpu(cpu, mask);
+   while (!cpumask_full(mask))
+   cpu_relax();
+}
+
+static uint64_t sync_start(void)
+{
+   const uint64_t gate_mask = ~0x7ff;
+   uint64_t gate, now;
+   gate = get_cntvct() & gate_mask;
+   do {
+   now = get_cntvct();
+   } while ((now & gate_mask) == gate);
+
+   return now;
+}
+
+static void do_page_writes(void)
+{
+   unsigned int i, runs = 0;
+   int cpu = smp_processor_id();
+   write_buffer *stable_pages = test_data[cpu].stable_pages;
+   write_buffer *moving_page = test_data[cpu].dynamic_pages[0];
+
+   printf("CPU%d: ready %p/%p @ 0x%08" PRIx64"\n",
+   cpu, stable_pages, moving_page, get_cntvct());
+
+   while (!test_complete) {
+   u64 run_start, run_end;
+
+   smp_mb();
+   wait_on(cpu, &ready);
+   run_start = sync_start();
+
+   for (i = 0; i < NR_TIMESTAMPS; i++) {
+   u64 ts = get_cntvct();
+   moving_page->timestamps[i] = ts;
+   stable_pages->timestamps[i] = ts;
+   }
+
+   run_end = get_cntvct();
+   printf("CPU%d: run %d 0x%" PRIx64 "->0x%" PRIx64 " (%" PRId64 " 
cycles)\n"

[Qemu-devel] [kvm-unit-tests PATCH v7 04/11] libcflat: add PRI(dux)32 format types

2016-11-24 Thread Alex Bennée

So we can have portable formatting of uint32_t types.

Signed-off-by: Alex Bennée 
---
 lib/libcflat.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/lib/libcflat.h b/lib/libcflat.h
index bdcc561..6dab5be 100644
--- a/lib/libcflat.h
+++ b/lib/libcflat.h
@@ -55,12 +55,17 @@ typedef _Bool   bool;
 #define true  1
 
 #if __SIZEOF_LONG__ == 8
+#  define __PRI32_PREFIX
 #  define __PRI64_PREFIX   "l"
 #  define __PRIPTR_PREFIX  "l"
 #else
+#  define __PRI32_PREFIX"l"
 #  define __PRI64_PREFIX   "ll"
 #  define __PRIPTR_PREFIX
 #endif
+#define PRId32  __PRI32_PREFIX "d"
+#define PRIu32  __PRI32_PREFIX "u"
+#define PRIx32  __PRI32_PREFIX "x"
 #define PRId64  __PRI64_PREFIX "d"
 #define PRIu64  __PRI64_PREFIX "u"
 #define PRIx64  __PRI64_PREFIX "x"
-- 
2.10.1

[Qemu-devel] [PATCH v2 1/7] linux-user: Add fanotify implementation

2016-11-24 Thread Lena Djokic

This commit adds implementation of fanotify_init and fanotify_mark.
Second argument for fanotify_init needs conversion because of flags
which can be FAN_NONBLOCK and FAN_CLOEXEC which rely on O_NONBLOCK
and O_CLOEXEC and those can have different values on different platforms.
For fanotify_mark argument layout is different for 32-bit and 64-bit
platforms and this implementation have support for that situation.
Also, support for writing and reading of file descriptor opened by
fanotify_init is added.
Configure file contains checks for excistence of fanotify support on
given build system.

Signed-off-by: Lena Djokic 
---
 configure|  20 
 linux-user/syscall.c | 126 +--
 2 files changed, 142 insertions(+), 4 deletions(-)

diff --git a/configure b/configure
index fd6f898..56e6c98 100755
--- a/configure
+++ b/configure
@@ -3537,6 +3537,23 @@ if compile_prog "" "" ; then
   inotify1=yes
 fi
 
+# check if fanotify group of system calls is supported
+fanotify=no
+cat > $TMPC << EOF
+#include 
+
+int
+main(void)
+{
+fanotify_init(0,0);
+fanotify_mark(0,0,0,0,0);
+return 0;
+}
+EOF
+if compile_prog "" "" ; then
+  fanotify=yes
+fi
+
 # check if utimensat and futimens are supported
 utimens=no
 cat > $TMPC << EOF
@@ -5335,6 +5352,9 @@ fi
 if test "$inotify1" = "yes" ; then
   echo "CONFIG_INOTIFY1=y" >> $config_host_mak
 fi
+if test "$fanotify" = "yes" ; then
+  echo "CONFIG_FANOTIFY=y" >> $config_host_mak
+fi
 if test "$byteswap_h" = "yes" ; then
   echo "CONFIG_BYTESWAP_H=y" >> $config_host_mak
 fi
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 7b77503..f5d9a26 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -76,6 +76,9 @@ int __clone2(int (*fn)(void *), void *child_stack_base,
 #ifdef CONFIG_SENDFILE
 #include 
 #endif
+#ifdef CONFIG_FANOTIFY
+#include 
+#endif
 
 #define termios host_termios
 #define winsize host_winsize
@@ -499,9 +502,13 @@ enum {
 QEMU___IFLA_INET6_MAX
 };
 
+typedef abi_long (*TargetFdReadFunc)(void *, size_t);
+typedef abi_long (*TargetFdWriteFunc)(void *, size_t);
 typedef abi_long (*TargetFdDataFunc)(void *, size_t);
 typedef abi_long (*TargetFdAddrFunc)(void *, abi_ulong, socklen_t);
 typedef struct TargetFdTrans {
+TargetFdReadFunc read_op;
+TargetFdWriteFunc write_op;
 TargetFdDataFunc host_to_target_data;
 TargetFdDataFunc target_to_host_data;
 TargetFdAddrFunc target_to_host_addr;
@@ -511,6 +518,22 @@ static TargetFdTrans **target_fd_trans;
 
 static unsigned int target_fd_max;
 
+static TargetFdReadFunc fd_trans_read_op(int fd)
+{
+if (fd >= 0 && fd < target_fd_max && target_fd_trans[fd]) {
+return target_fd_trans[fd]->read_op;
+}
+return NULL;
+}
+
+static TargetFdWriteFunc fd_trans_write_op(int fd)
+{
+if (fd >= 0 && fd < target_fd_max && target_fd_trans[fd]) {
+return target_fd_trans[fd]->write_op;
+}
+return NULL;
+}
+
 static TargetFdDataFunc fd_trans_target_to_host_data(int fd)
 {
 if (fd >= 0 && fd < target_fd_max && target_fd_trans[fd]) {
@@ -7527,6 +7550,47 @@ static target_timer_t get_timer_id(abi_long arg)
 return timerid;
 }
 
+#if defined(CONFIG_FANOTIFY)
+static inline abi_long fanotify_fd_read_op(void *buf, size_t len)
+{
+struct fanotify_event_metadata *fem;
+int num;
+
+/* Read buffer for fanotify file descriptor contains one or more
+ * of fanotify_event_metadata structures.
+ */
+fem = (struct fanotify_event_metadata *)buf;
+num = len / sizeof(struct fanotify_event_metadata);
+for (int i = 0; i < num; i++) {
+(fem + i)->event_len = tswap32((fem + i)->event_len);
+/* Fields (fem+i)->vers and (fem+i)->reserved are single byte,
+ * so swapping is not needed for them.
+ */
+(fem + i)->metadata_len = tswap16((fem + i)->metadata_len);
+(fem + i)->mask = tswap64((fem + i)->mask);
+(fem + i)->fd = tswap32((fem + i)->fd);
+(fem + i)->pid = tswap32((fem + i)->pid);
+}
+
+return len;
+}
+
+static inline abi_long fanotify_fd_write_op(void *buf, size_t len)
+{
+struct fanotify_response *fr = (struct fanotify_response *)buf;
+
+fr->fd = tswap32(fr->fd);
+fr->response = tswap32(fr->response);
+
+return len;
+}
+
+static TargetFdTrans fanotify_trans = {
+.read_op = fanotify_fd_read_op,
+.write_op = fanotify_fd_write_op,
+};
+#endif
+
 /* do_syscall() should always have a single exit point at the end so
that actions, such as logging of syscall results, can be performed.
All errnos that do_syscall() returns must be -TARGET_. */
@@ -7613,16 +7677,27 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 if (!(p = lock_user(VERIFY_WRITE, arg2, arg3, 0)))
 goto efault;
 ret = get_errno(safe_read(arg1, p, arg3));
-if (ret >= 0 &&
-fd_trans_host_to_target_data(arg1)) {
-ret = fd_trans

[Qemu-devel] [kvm-unit-tests PATCH v7 02/11] run_tests: allow disabling of timeouts

2016-11-24 Thread Alex Bennée

Certainly during development of the tests and MTTCG there are times when
the timeout just gets in the way.

Signed-off-by: Alex Bennée 
---
 run_tests.sh | 8 ++--
 scripts/runtime.bash | 4 
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/run_tests.sh b/run_tests.sh
index b88c36f..4f2e5cb 100755
--- a/run_tests.sh
+++ b/run_tests.sh
@@ -13,10 +13,11 @@ function usage()
 {
 cat <

[Qemu-devel] [kvm-unit-tests PATCH v7 00/11] QEMU MTTCG Test cases

2016-11-24 Thread Alex Bennée

Hi,

Looking at my records it seems as though it has been a while since I
last posted these tests. As I'm hoping to get the final bits of MTTCG
merged upstream on the next QEMU development cycle I've been re-basing
these and getting them cleaned up for merging.

Some of the patches might be worth taking now if the maintainers are
happy to do so (run_test tweaks, libcflat updates?). The others could
do with more serious review. I've CC'd some of the ARM guys to look
over the tlbflush/barrier tests so they can cast their expert eyes
over them ;-)

There are two additions to the series.

The tcg-test is a general torture test aimed at QEMU's TCG execution
model. It stresses the cpu execution loop through the use of
cross-page and computed jumps. It can also add IRQ's and self-modifying
code to the mix.

The tlbflush-data test is a new one, the old tlbflush test is renamed
tlbflush-code to better indicate the code path it exercise. The the
code test tests the translation invalidation pathways in QEMU the data
test exercises the SoftMMU's TLBs and explicitly that tlbflush
completion semantics are correct.

The tlbflush-data passes most of the times on real hardware but
definitely showed the problem with deferred TLB flushes running under
MTTCG QEMU. I've looked at some of the failure cases on real hardware
and it did look like a timestamp appeared on a page that shouldn't
have been accessible at the time - I don't know if this is a real
silicon bug or my misreading of the semantics so I'd appreciate
a comment from the experts.

The code needs to be applied on top of Drew's latest ARM GIC patches
or you can grab my tree from:

  https://github.com/stsquad/kvm-unit-tests/tree/mttcg/current-tests-v7

Cheers,

Alex.

Alex Bennée (11):
  run_tests: allow forcing of acceleration mode
  run_tests: allow disabling of timeouts
  run_tests: allow passing of options to QEMU
  libcflat: add PRI(dux)32 format types
  lib: add isaac prng library from CCAN
  arm/Makefile.common: force -fno-pic
  arm/tlbflush-code: Add TLB flush during code execution test
  arm/tlbflush-data: Add TLB flush during data writes test
  arm/locking-tests: add comprehensive locking test
  arm/barrier-litmus-tests: add simple mp and sal litmus tests
  arm/tcg-test: some basic TCG exercising tests

 Makefile  |   2 +
 arm/Makefile.arm  |   2 +
 arm/Makefile.arm64|   2 +
 arm/Makefile.common   |  11 ++
 arm/barrier-litmus-test.c | 437 ++
 arm/locking-test.c| 302 
 arm/tcg-test-asm.S| 170 ++
 arm/tcg-test-asm64.S  | 169 ++
 arm/tcg-test.c| 337 +++
 arm/tlbflush-code.c   | 212 ++
 arm/tlbflush-data.c   | 401 ++
 arm/unittests.cfg | 190 
 lib/arm/asm/barrier.h |  63 ++-
 lib/arm64/asm/barrier.h   |  50 ++
 lib/libcflat.h|   5 +
 lib/prng.c| 162 +
 lib/prng.h|  82 +
 run_tests.sh  |  18 +-
 scripts/functions.bash|  13 +-
 scripts/runtime.bash  |   8 +
 20 files changed, 2626 insertions(+), 10 deletions(-)
 create mode 100644 arm/barrier-litmus-test.c
 create mode 100644 arm/locking-test.c
 create mode 100644 arm/tcg-test-asm.S
 create mode 100644 arm/tcg-test-asm64.S
 create mode 100644 arm/tcg-test.c
 create mode 100644 arm/tlbflush-code.c
 create mode 100644 arm/tlbflush-data.c
 create mode 100644 lib/prng.c
 create mode 100644 lib/prng.h

-- 
2.10.1

[Qemu-devel] [kvm-unit-tests PATCH v7 01/11] run_tests: allow forcing of acceleration mode

2016-11-24 Thread Alex Bennée

While tests can be pegged to tcg it is useful to override this from time
to time, especially when testing correctness on real systems.
---
 run_tests.sh | 8 ++--
 scripts/runtime.bash | 4 
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/run_tests.sh b/run_tests.sh
index 254129d..b88c36f 100755
--- a/run_tests.sh
+++ b/run_tests.sh
@@ -13,9 +13,10 @@ function usage()
 {
 cat <

[Qemu-devel] [kvm-unit-tests PATCH v7 09/11] arm/locking-tests: add comprehensive locking test

2016-11-24 Thread Alex Bennée

This test has been written mainly to stress multi-threaded TCG behaviour
but will demonstrate failure by default on real hardware. The test takes
the following parameters:

  - "lock" use GCC's locking semantics
  - "atomic" use GCC's __atomic primitives
  - "wfelock" use WaitForEvent sleep
  - "excl" use load/store exclusive semantics

Also two more options allow the test to be tweaked

  - "noshuffle" disables the memory shuffling
  - "count=%ld" set your own per-CPU increment count

Signed-off-by: Alex Bennée 

---
v2
  - Don't use thumb style strexeq stuff
  - Add atomic and wfelock tests
  - Add count/noshuffle test controls
  - Move barrier tests to separate test file
v4
  - fix up unitests.cfg to use correct test name
  - move into "locking" group, remove barrier tests
  - use a table to add tests, mark which are expected to work
  - correctly report XFAIL
v5
  - max out at -smp 4 in unittest.cfg
v7
  - make test control flags bools
  - default the count to 10 (so it doesn't timeout)
---
 arm/Makefile.common |   2 +
 arm/locking-test.c  | 302 
 arm/unittests.cfg   |  34 ++
 3 files changed, 338 insertions(+)
 create mode 100644 arm/locking-test.c

diff --git a/arm/Makefile.common b/arm/Makefile.common
index 528166d..eb4cfdf 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -15,6 +15,7 @@ tests-common += $(TEST_DIR)/pci-test.flat
 tests-common += $(TEST_DIR)/gic.flat
 tests-common += $(TEST_DIR)/tlbflush-code.flat
 tests-common += $(TEST_DIR)/tlbflush-data.flat
+tests-common += $(TEST_DIR)/locking-test.flat
 
 all: test_cases
 
@@ -85,3 +86,4 @@ test_cases: $(generated_files) $(tests-common) $(tests)
 $(TEST_DIR)/selftest.o $(cstart.o): $(asm-offsets)
 $(TEST_DIR)/tlbflush-code.elf: $(cstart.o) $(TEST_DIR)/tlbflush-code.o
 $(TEST_DIR)/tlbflush-data.elf: $(cstart.o) $(TEST_DIR)/tlbflush-data.o
+$(TEST_DIR)/locking-test.elf: $(cstart.o) $(TEST_DIR)/locking-test.o
diff --git a/arm/locking-test.c b/arm/locking-test.c
new file mode 100644
index 000..f10c61b
--- /dev/null
+++ b/arm/locking-test.c
@@ -0,0 +1,302 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#define MAX_CPUS 8
+
+/* Test definition structure
+ *
+ * A simple structure that describes the test name, expected pass and
+ * increment function.
+ */
+
+/* Function pointers for test */
+typedef void (*inc_fn)(int cpu);
+
+typedef struct {
+   const char *test_name;
+   bool  should_pass;
+   inc_fn main_fn;
+} test_descr_t;
+
+/* How many increments to do */
+static int increment_count = 100;
+static bool do_shuffle = true;
+
+/* Shared value all the tests attempt to safely increment using
+ * various forms of atomic locking and exclusive behaviour.
+ */
+static unsigned int shared_value;
+
+/* PAGE_SIZE * uint32_t means we span several pages */
+__attribute__((aligned(PAGE_SIZE))) static uint32_t memory_array[PAGE_SIZE];
+
+/* We use the alignment of the following to ensure accesses to locking
+ * and synchronisation primatives don't interfere with the page of the
+ * shared value
+ */
+__attribute__((aligned(PAGE_SIZE))) static unsigned int 
per_cpu_value[MAX_CPUS];
+__attribute__((aligned(PAGE_SIZE))) static cpumask_t smp_test_complete;
+__attribute__((aligned(PAGE_SIZE))) struct isaac_ctx prng_context[MAX_CPUS];
+
+/* Some of the approaches use a global lock to prevent contention. */
+static int global_lock;
+
+/* In any SMP setting this *should* fail due to cores stepping on
+ * each other updating the shared variable
+ */
+static void increment_shared(int cpu)
+{
+   (void)cpu;
+
+   shared_value++;
+}
+
+/* GCC __sync primitives are deprecated in favour of __atomic */
+static void increment_shared_with_lock(int cpu)
+{
+   (void)cpu;
+
+   while (__sync_lock_test_and_set(&global_lock, 1));
+   shared_value++;
+   __sync_lock_release(&global_lock);
+}
+
+/* In practice even __ATOMIC_RELAXED uses ARM's ldxr/stex exclusive
+ * semantics */
+static void increment_shared_with_atomic(int cpu)
+{
+   (void)cpu;
+
+   __atomic_add_fetch(&shared_value, 1, __ATOMIC_SEQ_CST);
+}
+
+
+/*
+ * Load/store exclusive with WFE (wait-for-event)
+ *
+ * See ARMv8 ARM examples:
+ *   Use of Wait For Event (WFE) and Send Event (SEV) with locks
+ */
+
+static void increment_shared_with_wfelock(int cpu)
+{
+   (void)cpu;
+
+#if defined(__aarch64__)
+   asm volatile(
+   "   mov w1, #1\n"
+   "   sevl\n"
+   "   prfm PSTL1KEEP, [%[lock]]\n"
+   "1: wfe\n"
+   "   ldaxr   w0, [%[lock]]\n"
+   "   cbnzw0, 1b\n"
+   "   stxrw0, w1, [%[lock]]\n"
+   "   cbnzw0, 1b\n"
+   /* lock held */
+   "   ldr w0, [%[sptr]]\n"
+   "   add w0, w0, #0x1\n"
+   "   str w0, [%[sptr]]\n"
+   /* now release */
+   "   stlrwzr, [%[lock]]\n"
+   : /* out */
+   : [lock] "r" (&global_lock), [

Re: [Qemu-devel] -nodefaults and available buses (was Re: [RFC 00/15] qmp: Report supported device types on 'query-machines')

2016-11-24 Thread Cornelia Huck

On Thu, 24 Nov 2016 12:51:19 +1100
David Gibson  wrote:

> On Wed, Nov 23, 2016 at 03:10:47PM -0200, Eduardo Habkost wrote:
> > (CCing the maintainers of the machines that crash when using
> > -nodefaults)
> > 
> > On Tue, Nov 22, 2016 at 08:34:50PM -0200, Eduardo Habkost wrote:
> > [...]
> > > "default defaults" vs "-nodefault defaults"
> > > ---
> > > 
> > > Two bad news:
> > > 
> > > 1) We need to differentiate buses created by the machine with
> > >"-nodefaults" and buses that are created only without
> > >"-nodefaults".
> > > 
> > > libvirt use -nodefaults when starting QEMU, so knowing which
> > > buses are available when using -nodefaults is more interesting
> > > for them.
> > > 
> > > Other software, on the other hand, might be interested in the
> > > results without -nodefaults.
> > > 
> > > We need to be able model both cases in the new interface.
> > > Suggestions are welcome.
> > 
> > The good news is that the list is short. The only[1] machines
> > where the list of buses seem to change when using -nodefaults
> > are:
> > 
> > * mpc8544ds
> > * ppce500
> > * mpc8544ds
> > * ppce500
> > * s390-ccw-virtio-*
> > 
> > On all cases above, the only difference is that a virtio bus is
> > available if not using -nodefaults.
> 
> Hrm.. that's odd.  Well, it makes sense for the s390 which has special
> virtio arrangements.  

I don't think it makes much sense for s390 either... is this a 'virtio'
bus or a 'virtio-{pci,ccw}' bus? The transport bus should be present
with -nodefaults; the virtio bus is basically a glue bus for virtio
devices...

> However, the others are all embedded ppc
> machines, whose virtio should be bog-standard virtio-pci.  I'm
> wondering if the addition of the virtio "bus" is a side-effect of the
> NIC or storage device created without -nodefaults being virtio.

I'd suspect the default NICs (which are virtio at least in the s390
case).

Re: [Qemu-devel] [PATCH v2 for-2.8 0/4] Fix MacOS runtime failure of qobject_from_jsonf()

2016-11-24 Thread Eric Blake

On 11/24/2016 05:07 AM, Markus Armbruster wrote:
> Eric Blake  writes:
> 
>> programmingk...@gmail.com[*] reported a runtime failure on a
>> 32-bit Mac OS compilation, where "%"PRId64 expands to "%qd".
>> Fortunately, we had very few spots that were relying on our
>> pseudo-printf JSON parsing of int64_t numbers, so it was
>> easier to just convert callers to stick to safer %lld.
>>
>> The remaining uses of pseudo-printf handling are more complex;
>> there are only 3 users in the released codebased, but LOTS of
>> users in the testsuite (via wrapper functions like qmp()); I
>> will be posting a followup series that rips out the remaining
>> uses of dynamic JSON, but it will be 2.9 material, while
>> these (first three) patches qualify for 2.8.  The fourth patch
>> is RFC; not intended to be applied now, but shows how I tested
>> patch 3/4; it will probably reappear in the later 2.9 series.
> 
> I can take the first three through my tree.

You mentioned some possible touchups - do you want to make those, or
have me send a v3?

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [kvm-unit-tests PATCH v7 06/11] arm/Makefile.common: force -fno-pic

2016-11-24 Thread Alex Bennée

As distro compilers move towards defaults for build hardening for things
like ASLR we need to force -fno-pic. Failure to do can lead to weird
relocation problems when we build our "lat" binaries.

Signed-off-by: Alex Bennée 
---
 arm/Makefile.common | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arm/Makefile.common b/arm/Makefile.common
index 52f7440..cca0d9c 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -21,6 +21,7 @@ phys_base = $(LOADADDR)
 
 CFLAGS += -std=gnu99
 CFLAGS += -ffreestanding
+CFLAGS += -fno-pic
 CFLAGS += -Wextra
 CFLAGS += -O2
 CFLAGS += -I lib -I lib/libfdt
-- 
2.10.1

[Qemu-devel] [kvm-unit-tests PATCH v7 07/11] arm/tlbflush-code: Add TLB flush during code execution test

2016-11-24 Thread Alex Bennée

This adds a fairly brain dead torture test for TLB flushes intended for
stressing the MTTCG QEMU build. It takes the usual -smp option for
multiple CPUs.

By default it CPU0 will do a TLBIALL flush after each cycle. You can
pass options via -append to control additional aspects of the test:

  - "page" flush each page in turn (one per function)
  - "self" do the flush after each computation cycle
  - "verbose" report progress on each computation cycle

Signed-off-by: Alex Bennée 
CC: Mark Rutland 

---
v2
  - rename to tlbflush-test
  - made makefile changes cleaner
  - added self/other flush mode
  - create specific prefix
  - whitespace fixes
v3
  - using new SMP framework for test runing
v4
  - merge in the unitests.cfg
v5
  - max out at -smp 4
  - printf fmtfix
v7
  - rename to tlbflush-code
  - int -> bool flags
---
 arm/Makefile.common |   2 +
 arm/tlbflush-code.c | 212 
 arm/unittests.cfg   |  24 ++
 3 files changed, 238 insertions(+)
 create mode 100644 arm/tlbflush-code.c

diff --git a/arm/Makefile.common b/arm/Makefile.common
index cca0d9c..de99a6e 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -13,6 +13,7 @@ tests-common  = $(TEST_DIR)/selftest.flat
 tests-common += $(TEST_DIR)/spinlock-test.flat
 tests-common += $(TEST_DIR)/pci-test.flat
 tests-common += $(TEST_DIR)/gic.flat
+tests-common += $(TEST_DIR)/tlbflush-code.flat
 
 all: test_cases
 
@@ -81,3 +82,4 @@ generated_files = $(asm-offsets)
 test_cases: $(generated_files) $(tests-common) $(tests)
 
 $(TEST_DIR)/selftest.o $(cstart.o): $(asm-offsets)
+$(TEST_DIR)/tlbflush-code.elf: $(cstart.o) $(TEST_DIR)/tlbflush-code.o
diff --git a/arm/tlbflush-code.c b/arm/tlbflush-code.c
new file mode 100644
index 000..cb5cdc2
--- /dev/null
+++ b/arm/tlbflush-code.c
@@ -0,0 +1,212 @@
+/*
+ * TLB Flush Race Tests
+ *
+ * These tests are designed to test for incorrect TLB flush semantics
+ * under emulation. The initial CPU will set all the others working a
+ * compuation task and will then trigger TLB flushes across the
+ * system. It doesn't actually need to re-map anything but the flushes
+ * themselves will trigger QEMU's TCG self-modifying code detection
+ * which will invalidate any generated  code causing re-translation.
+ * Eventually the code buffer will fill and a general tb_lush() will
+ * be triggered.
+ *
+ * Copyright (C) 2016, Linaro, Alex Bennée 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define SEQ_LENGTH 10
+#define SEQ_HASH 0x7cd707fe
+
+static cpumask_t smp_test_complete;
+static int flush_count = 100;
+static bool flush_self;
+static bool flush_page;
+static bool flush_verbose;
+
+/*
+ * Work functions
+ *
+ * These work functions need to be:
+ *
+ *  - page aligned, so we can flush one function at a time
+ *  - have branches, so QEMU TCG generates multiple basic blocks
+ *  - call across pages, so we exercise the TCG basic block slow path
+ */
+
+/* Adler32 */
+__attribute__((aligned(PAGE_SIZE))) uint32_t hash_array(const void *buf,
+   size_t buflen)
+{
+   const uint8_t *data = (uint8_t *) buf;
+   uint32_t s1 = 1;
+   uint32_t s2 = 0;
+
+   for (size_t n = 0; n < buflen; n++) {
+   s1 = (s1 + data[n]) % 65521;
+   s2 = (s2 + s1) % 65521;
+   }
+   return (s2 << 16) | s1;
+}
+
+__attribute__((aligned(PAGE_SIZE))) void create_fib_sequence(int length,
+   unsigned int *array)
+{
+   int i;
+
+   /* first two values */
+   array[0] = 0;
+   array[1] = 1;
+   for (i=2; i4?4:$MAX_SMP))
+groups = tlbflush
+
+[tlbflush-code::page_other]
+file = tlbflush-code.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'page'
+groups = tlbflush
+
+[tlbflush-code::all_self]
+file = tlbflush-code.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'self'
+groups = tlbflush
+
+[tlbflush-code::page_self]
+file = tlbflush-code.flat
+smp = $(($MAX_SMP>4?4:$MAX_SMP))
+extra_params = -append 'page self'
+groups = tlbflush
-- 
2.10.1

Re: [Qemu-devel] [PATCH v2 1/4] qmp-event: Avoid qobject_from_jsonf("%"PRId64)

2016-11-24 Thread Eric Blake

On 11/24/2016 05:00 AM, Markus Armbruster wrote:
> Eric Blake  writes:
> 

>> +/* Put -1 to indicate failure of getting host time */
>> +obj = qobject_from_jsonf("{ 'seconds': %lld, 'microseconds': %lld }",
>> + err < 0 ? -1LL : tv.tv_sec,
>> + err < 0 ? -1LL : tv.tv_usec);
>>  qdict_put_obj(qdict, "timestamp", obj);
>>  }
> 
> The ternaries must both yield long long for this to work.  They will,
> unless their last operand's rank is greater than long long's, or their
> last operand is unsigned long long.  The latter case should be harmless
> in practice.  The former case would be weird.  Hmm.
> 
> I think it's best to be a bit more explicit here.  Let's cast the last
> operands to long long, or use long long variables.  Your choice.

I think I favor the cast, as in:

err < 0 ? -1LL : (long long) tv.tv_sec

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [Qemu-ppc] [PATCH v2 1/4] target-ppc: Implement bcdcfsq. instruction

2016-11-24 Thread joserz

Hello Richard,

Thank you for your review, please read my answer below.


On Thu, Nov 24, 2016 at 01:43:18AM +0100, Richard Henderson wrote:
> On 11/23/2016 05:21 PM, Jose Ricardo Ziviani wrote:
> >bcdcfsq.: Decimal convert from signed quadword. It is not possible
> >to convert values less than 10^31-1 or greater than -10^31-1 to be
> >represented in packed decimal format.
> >
> >Signed-off-by: Jose Ricardo Ziviani 
> >---
> > target-ppc/helper.h |  1 +
> > target-ppc/int_helper.c | 45 
> > +
> > target-ppc/translate/vmx-impl.inc.c |  7 ++
> > 3 files changed, 53 insertions(+)
> >
> >diff --git a/target-ppc/helper.h b/target-ppc/helper.h
> >index da00f0a..87f533c 100644
> >--- a/target-ppc/helper.h
> >+++ b/target-ppc/helper.h
> >@@ -382,6 +382,7 @@ DEF_HELPER_3(bcdcfn, i32, avr, avr, i32)
> > DEF_HELPER_3(bcdctn, i32, avr, avr, i32)
> > DEF_HELPER_3(bcdcfz, i32, avr, avr, i32)
> > DEF_HELPER_3(bcdctz, i32, avr, avr, i32)
> >+DEF_HELPER_3(bcdcfsq, i32, avr, avr, i32)
> >
> > DEF_HELPER_2(xsadddp, void, env, i32)
> > DEF_HELPER_2(xssubdp, void, env, i32)
> >diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
> >index 8886a72..751909c 100644
> >--- a/target-ppc/int_helper.c
> >+++ b/target-ppc/int_helper.c
> >@@ -2874,6 +2874,51 @@ uint32_t helper_bcdctz(ppc_avr_t *r, ppc_avr_t *b, 
> >uint32_t ps)
> > return cr;
> > }
> >
> >+uint32_t helper_bcdcfsq(ppc_avr_t *r, ppc_avr_t *b, uint32_t ps)
> >+{
> >+int cr;
> >+int i;
> >+int ox_flag = 0;
> >+uint64_t lo_value;
> >+uint64_t hi_value;
> >+uint64_t max = 0x38d7ea4c68000;
> 
> This is at heart a decimal number, and should be written as such.
> Also, you need ULL for a 32-bit host compile.
>

OK

> >+if (divu128(&lo_value, &hi_value, max)) {
> >+ox_flag = 1;
> >+} else if (lo_value >= max && hi_value == 0) {
> >+ox_flag = 1;
> >+}
> 
> Dispense with ox_flag and set cr = CRF_SO now.
> 

OK

> >+for (i = 1; hi_value; hi_value /= 10, i++) {
> >+bcd_put_digit(&ret, hi_value % 10, i);
> >+}
> >+
> >+for (; lo_value; lo_value /= 10, i++) {
> >+bcd_put_digit(&ret, lo_value % 10, i);
> >+}
> 
> How can this possibly work?  You know there are 15 digits between high and
> low, but you continue with i++?
> 
> If hi_value == 1 && lo_value == 1, this should not produce 11, but
> 10001.
> 

Suppose we have hi_value = lo_value = 1

after divu128 above we will have
hi_value = 744073709551617
lo_value = 18446

then, after the first for loop:
hi_value = 0
lo_value = 18446
i = 16
ret = u128 = 0x8376822234287792511

finally, after the second loop:
hi_value = 0
lo_value = 0
i = 21
ret = u128 = 0x00018446744073709551617f

Which is correct, this function converted a signed quadword to bcd correctly, 
using two doubleword variables:
1 << 64 | 1 = 18446744073709551617

Am I missing anything?

Thank you!

> 
> r~
>

Re: [Qemu-devel] [Qemu-ppc] [PATCH v2 0/4] POWER9 TCG enablements - BCD functions part II

2016-11-24 Thread joserz

David,

Thank you again for reviewing my code, I'm making the changes. I only have a 
question about patch 1/4 which I didn't find the issue so I'm waiting for 
Richard's answer.

Thanks!


On Thu, Nov 24, 2016 at 12:28:32PM +1100, David Gibson wrote:
> On Wed, Nov 23, 2016 at 02:21:41PM -0200, Jose Ricardo Ziviani wrote:
> > v2:
> >  - use div128 and mul64 functions to make code easier to understand
> >  - fixed int128 neg
> >  - improved functions bcdcpsgn and bcdsetsgn to do less work
> >than necessary
> >  - rebased on ppc-for-2.9
> > 
> > This serie contains 4 new instructions for POWER9 ISA3.0
> > 
> > bcdcfsq.: Convert signed quadword to packed BCD
> > bcdctsq.: Convert packed BCD to signed quadword
> > bcdcpsgn.: Copy the sign of a register to another
> > bcdsetsgn.: Set the BCD sign according to a preferred sign
> 
> Patch 1/4 has some problems, see comments.
> 
> Patches 2..4/4 look ok - except that they'll need to be updated for
> the recent change I merged from Nikunj (in ppc-for-2.9) which changes
> the meaning of CRF_*.
> 
> > 
> > Jose Ricardo Ziviani (4):
> >   target-ppc: Implement bcdcfsq. instruction
> >   target-ppc: Implement bcdctsq. instruction
> >   target-ppc: Implement bcdcpsgn. instruction
> >   target-ppc: Implement bcdsetsgn. instruction
> > 
> >  target-ppc/helper.h |   4 ++
> >  target-ppc/int_helper.c | 127 
> > 
> >  target-ppc/translate/vmx-impl.inc.c |  25 +++
> >  target-ppc/translate/vmx-ops.inc.c  |   2 +-
> >  4 files changed, 157 insertions(+), 1 deletion(-)
> > 
> 
> -- 
> David Gibson  | I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au| minimalist, thank you.  NOT _the_ 
> _other_
>   | _way_ _around_!
> http://www.ozlabs.org/~dgibson

[Qemu-devel] [kvm-unit-tests PATCH v7 11/11] arm/tcg-test: some basic TCG exercising tests

2016-11-24 Thread Alex Bennée

These tests are not really aimed at KVM at all but exist to stretch
QEMU's TCG code generator. In particular these exercise the ability of
the TCG to:

  * Chain TranslationBlocks together (tight)
  * Handle heavy usage of the tb_jump_cache (paged)
  * Pathological case of computed local jumps (computed)

In addition the tests can be varied by adding IPI IRQs or SMC sequences
into the mix to stress the tcg_exit and invalidation mechanisms.

To explicitly stress the tb_flush() mechanism you can use the mod/rounds
parameters to force more frequent tb invalidation. Combined with setting
-tb-size 1 in QEMU to limit the code generation buffer size.

Signed-off-by: Alex Bennée 

---
v5
  - added armv8 version of the tcg tests
  - max out at -smp 4 in unittests.cfg
  - add up IRQs sent and delivered for PASS/FAIL
  - take into account error count
  - add "rounds=" parameter
  - tweak smc to tb-size=1
  - printf fmt fix
v7
  - merged in IRQ numerology
  - updated to latest IRQ API
---
 arm/Makefile.arm |   2 +
 arm/Makefile.arm64   |   2 +
 arm/Makefile.common  |   1 +
 arm/tcg-test-asm.S   | 170 ++
 arm/tcg-test-asm64.S | 169 ++
 arm/tcg-test.c   | 337 +++
 arm/unittests.cfg|  84 +
 7 files changed, 765 insertions(+)
 create mode 100644 arm/tcg-test-asm.S
 create mode 100644 arm/tcg-test-asm64.S
 create mode 100644 arm/tcg-test.c

diff --git a/arm/Makefile.arm b/arm/Makefile.arm
index 92f3757..7058bd2 100644
--- a/arm/Makefile.arm
+++ b/arm/Makefile.arm
@@ -24,4 +24,6 @@ tests =
 
 include $(TEST_DIR)/Makefile.common
 
+$(TEST_DIR)/tcg-test.elf: $(cstart.o) $(TEST_DIR)/tcg-test.o 
$(TEST_DIR)/tcg-test-asm.o
+
 arch_clean: arm_clean
diff --git a/arm/Makefile.arm64 b/arm/Makefile.arm64
index 0b0761c..678fca4 100644
--- a/arm/Makefile.arm64
+++ b/arm/Makefile.arm64
@@ -16,5 +16,7 @@ tests =
 
 include $(TEST_DIR)/Makefile.common
 
+$(TEST_DIR)/tcg-test.elf: $(cstart.o) $(TEST_DIR)/tcg-test.o 
$(TEST_DIR)/tcg-test-asm64.o
+
 arch_clean: arm_clean
$(RM) lib/arm64/.*.d
diff --git a/arm/Makefile.common b/arm/Makefile.common
index a508128..9af758f 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -17,6 +17,7 @@ tests-common += $(TEST_DIR)/tlbflush-code.flat
 tests-common += $(TEST_DIR)/tlbflush-data.flat
 tests-common += $(TEST_DIR)/locking-test.flat
 tests-common += $(TEST_DIR)/barrier-litmus-test.flat
+tests-common += $(TEST_DIR)/tcg-test.flat
 
 all: test_cases
 
diff --git a/arm/tcg-test-asm.S b/arm/tcg-test-asm.S
new file mode 100644
index 000..6e823b7
--- /dev/null
+++ b/arm/tcg-test-asm.S
@@ -0,0 +1,170 @@
+/*
+ * TCG Test assembler functions for armv7 tests.
+ *
+ * Copyright (C) 2016, Linaro Ltd, Alex Bennée 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ *
+ * These helper functions are written in pure asm to control the size
+ * of the basic blocks and ensure they fit neatly into page
+ * aligned chunks. The pattern of branches they follow is determined by
+ * the 32 bit seed they are passed. It should be the same for each set.
+ *
+ * Calling convention
+ *  - r0, iterations
+ *  - r1, jump pattern
+ *  - r2-r3, scratch
+ *
+ * Returns r0
+ */
+
+.arm
+
+.section .text
+
+/* Tight - all blocks should quickly be patched and should run
+ * very fast unless irqs or smc gets in the way
+ */
+
+.global tight_start
+tight_start:
+subsr0, r0, #1
+beq tight_end
+
+ror r1, r1, #1
+tst r1, #1
+beq tightA
+b   tight_start
+
+tightA:
+subsr0, r0, #1
+beq tight_end
+
+ror r1, r1, #1
+tst r1, #1
+beq tightB
+b   tight_start
+
+tightB:
+subsr0, r0, #1
+beq tight_end
+
+ror r1, r1, #1
+tst r1, #1
+beq tight_start
+b   tightA
+
+.global tight_end
+tight_end:
+mov pc, lr
+
+/*
+ * Computed jumps cannot be hardwired into the basic blocks so each one
+ * will cause an exit for the main execution loop to look up the next block.
+ *
+ * There is some caching which should ameliorate the cost a little.
+ */
+
+/* Align << 13 == 4096 byte alignment */
+.align 13
+.global computed_start
+computed_start:
+subsr0, r0, #1
+beq computed_end
+
+/* Jump table */
+ror r1, r1, #1
+and r2, r1, #1
+adr r3, computed_jump_table
+ldr r2, [r3, r2, lsl #2]
+mov pc, r2
+
+b   computed_err
+
+computed_jump_table:
+.word   computed_start
+.word   computedA
+
+computedA:
+subsr0, r0, #1
+beq computed_end
+
+/* Jump into code */
+ror r1, r1, #1
+and r2, r1, #1
+adr r3, 1f
+addr3, r2, lsl #2
+mov pc, r3
+1:  b   computed_s

Re: [Qemu-devel] [PATCH v1 02/18] block/pcache: empty pcache driver filter

2016-11-24 Thread Kevin Wolf

Am 24.11.2016 um 16:48 hat Pavel Butsykin geschrieben:
> On 23.11.2016 18:15, Kevin Wolf wrote:
> >Am 15.11.2016 um 07:36 hat Pavel Butsykin geschrieben:
> >>+static QemuOptsList runtime_opts = {
> >>+.name = "pcache",
> >>+.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
> >>+.desc = {
> >>+{
> >>+.name = "x-image",
> >>+.type = QEMU_OPT_STRING,
> >>+.help = "[internal use only, will be removed]",
> >>+},
> >
> >blkdebug/blkverify have this because they have to support legacy syntax
> >like -drive file=blkdebug:blkdebug.conf:test.img, i.e. it has to deal
> >with filenames.q
> >
> >Here we don't have to support a legacy syntax, so I would completely
> >avoid this from the beginning. You already support the "image" option,
> >which should be good enough.
> 
> Then the command line would look like this:
> 
> -drive 
> file=/img/harddisk.hdd,if=none,id=drive-scsi0-0-0-0,cache=none,aio=native
> -drive driver=pcache,image=scsi0-0-0-0,if=virtio

Yes, either that or with an inline block node definition, the block
layer supports both:

-drive 
driver=pcache,image.drive=file,image.filename=/img/harddisk.hdd,if=virtio,cache=none,image.aio=native

Kevin

Re: [Qemu-devel] [RFC 00/15] qmp: Report supported device types on 'query-machines'

2016-11-24 Thread Marcel Apfelbaum


On 11/24/2016 05:41 PM, Markus Armbruster wrote:

Marcel Apfelbaum  writes:


On 11/24/2016 03:34 PM, Markus Armbruster wrote:

Eduardo Habkost  writes:


On Wed, Nov 23, 2016 at 06:43:16PM +0200, Marcel Apfelbaum wrote:

On 11/22/2016 03:11 AM, Eduardo Habkost wrote:

The Problem




[...]


Our decision to have hybrid PCI/PCIe devices and buses breeds
considerable complexity.  I wish we had avoided them, but I believe it's
too late to change now.


This still does not solve the problem that some devices makes
sense only on a specific arch.




Hi Markus,


Examples?



One quick example would be that we don't want to see
Intel's IOH 3420 PCIe Root Port in an ARM machine,
or a pxb on a Q35 machine (in this case we want pxb-pcie)


Such a device would be weird.  But would it be wrong?


Define wrong :)

 Wrong enough for

QEMU to reject it?


QEMU accepts them and they even function correctly as far as I know.

  Unless QEMU rejects it, there's no reason not to

list it as pluggable.



This is the gray area I can't argue. I do think that Eduardo's
work may present an opportunity to change QEMU's mantra:
"everything goes as long as it works" to "here is what this configuration 
supports".



Thanks,
Marcel


I do believe there are other examples, I'll try to think of more.

Thanks,
Marcel

[...]

Re: [Qemu-devel] [RFC 06/15] qdev: Add device_type field to BusClass

2016-11-24 Thread Cornelia Huck

On Mon, 21 Nov 2016 23:12:04 -0200
Eduardo Habkost  wrote:



> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 24fae16..74b8fef 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -152,6 +152,7 @@ static void pci_bus_class_init(ObjectClass *klass, void 
> *data)
>  k->realize = pci_bus_realize;
>  k->unrealize = pci_bus_unrealize;
>  k->reset = pcibus_reset;
> +k->device_type = TYPE_PCI_DEVICE;

This covers pci-per-se...

> 
>  pbc->is_root = pcibus_is_root;
>  pbc->bus_num = pcibus_num;

> diff --git a/hw/s390x/css-bridge.c b/hw/s390x/css-bridge.c
> index 9a7f7ee..8a6c1ae 100644
> --- a/hw/s390x/css-bridge.c
> +++ b/hw/s390x/css-bridge.c
> @@ -17,6 +17,7 @@
>  #include "hw/s390x/css.h"
>  #include "ccw-device.h"
>  #include "hw/s390x/css-bridge.h"
> +#include "hw/s390x/virtio-ccw.h"
> 
>  /*
>   * Invoke device-specific unplug handler, disable the subchannel
> @@ -81,6 +82,7 @@ static void virtual_css_bus_class_init(ObjectClass *klass, 
> void *data)
> 
>  k->reset = virtual_css_bus_reset;
>  k->get_dev_path = virtual_css_bus_get_dev_path;
> +k->device_type = TYPE_VIRTIO_CCW_DEVICE;

...this covers virtio-ccw... (notably _not_ generic css)

>  }
> 
>  static const TypeInfo virtual_css_bus_info = {

> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
> index 63f6248..7470fdd 100644
> --- a/hw/s390x/s390-pci-bus.c
> +++ b/hw/s390x/s390-pci-bus.c
> @@ -783,10 +783,17 @@ static const TypeInfo s390_pcihost_info = {
>  }
>  };
> 
> +static void s390_pcibus_class_init(ObjectClass *oc, void *opaque)
> +{
> +BusClass *bc = BUS_CLASS(oc);
> +bc->device_type = TYPE_S390_PCI_DEVICE;

...this covers zpci, which is needed _in addition to_ normal pci to make pci 
work on s390...

> +}
> +
>  static const TypeInfo s390_pcibus_info = {
>  .name = TYPE_S390_PCI_BUS,
>  .parent = TYPE_BUS,
>  .instance_size = sizeof(S390PCIBus),
> +.class_init = s390_pcibus_class_init,
>  };
> 
>  static uint16_t s390_pci_generate_uid(void)

> diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
> index d6c0c72..815f3dd 100644
> --- a/hw/virtio/virtio-bus.c
> +++ b/hw/virtio/virtio-bus.c
> @@ -293,6 +293,7 @@ static void virtio_bus_class_init(ObjectClass *klass, 
> void *data)
>  BusClass *bus_class = BUS_CLASS(klass);
>  bus_class->get_dev_path = virtio_bus_get_dev_path;
>  bus_class->get_fw_dev_path = virtio_bus_get_fw_dev_path;
> +bus_class->device_type = TYPE_VIRTIO_DEVICE;
>  }
> 
>  static const TypeInfo virtio_bus_info = {

...and this covers virtio, which is kind of a glue bus.

So on s390, we have the following:

- to get virtio-ccw, we need virtio-ccw _and_ virtio
- to get virtio-pci, we need pci _and_ zpci _and_ virtio
- to get virtio-per-se, we need one of the combinations above

Please stop me if I'm babbling, but there seem to be some
interdependencies which are different on different architectures and
I'm not sure how these should be modelled.

Re: [Qemu-devel] [PATCH v3 for-2.9 0/3] q35: add negotiable broadcast SMI

2016-11-24 Thread Paolo Bonzini


> > Okay, this does plug the hole I sketched out above. This logic (the
> > QEMU-specific unparking) can be done in another platform API in OVMF I
> > guess (like those in SmmCpuFeaturesLib), but I wonder if we have to
> > provide the infrastructure in platform code up to the separate SMI
> > command handler. I think it again depends on those unreleased modules.
>
> I don't really like parked CPU idea and related modifications to
> CPU hotplug MMIO interface.
> 
> How about an alternative approach:
> 
> 1) on CPU hotplug QEMU generates SMI (if SMIs are enabled)
>it doesn't matter if it's a directed (to being hotplugged CPU) or
>broadcast SMI)
>as hotplugged CPU is in reset state and won't handle SMI until it receives
>SIPI (see SDM: 34.2 SYSTEM MANAGEMENT INTERRUPT: NOTES)

Parked CPUs are exactly how it works on real hardware (the arbitrator is the
BMC, while we have QEMU in its place).  The problem is that, if you just
place the hotplugged CPU in reset state, there is a race between the OSPM
and the firmware.  The OSPM can place its own code at 0x3 and send
INIT/SIPI/SMI before the firmware gets round to doing it.

> BTW:
> I don't see how broadcast SMI could be used reliably at initial boot
> as all present APs would race toward the same SMBASE when broadcast
> SIPI is sent so it's either directed SMIs or directed SIPIs, I'm not
> sure what edk2 actually does.

It uses directed SMIs via local APIC, to relocate SMBASE one vCPU at a
time.

Paolo

Re: [Qemu-devel] [kvm-unit-tests PATCH v7 11/11] arm/arm64: gic: don't just use zero

2016-11-24 Thread Auger Eric

Hi Drew,

On 24/11/2016 15:11, Andrew Jones wrote:
> On Thu, Nov 24, 2016 at 10:57:01AM +0100, Auger Eric wrote:
>> Hi,
>>
>> On 23/11/2016 17:54, Andrew Jones wrote:
>>> Allow user to select who sends ipis and with which irq,
>>> rather than just always sending irq=0 from cpu0.
>>>
>>> Signed-off-by: Andrew Jones 
>>
>> Reviewed-by: Eric Auger 
>> Tested-by: Eric Auger 
> 
> Thanks!
> 
> But if I spin a v8 then I may just dump all the command line parsing
> stuff, dropping the optional input arguments sender and irq, or at
> least irq. I'm feeling it's a bit crufty... Any thoughts on that?

Hum personally I don' have sufficient experience on this test suite to
advise. Maybe let's get inspired of the kind of tests written on x86 and
their kind of command lines. Anyway I don't want to further slow down
the pull. We will be able to refine later on when adding new tests.

Thanks

Eric
> 
> drew
>

1 2 >

1 - 100 of 144 matches

Mail list logo