Re: [Qemu-devel] a suggestion to extend the query block stats commands for multi-disk case

2016-12-06 Thread Dou Liyang

Hi Paolo, Stefan,

At 12/05/2016 07:57 PM, Paolo Bonzini wrote:



On 05/12/2016 12:53, Stefan Hajnoczi wrote:

Adding the BlockBackend *query_blk parameter is a reasonable short-term
workaround.  I wonder if the polling stats APIs need to be rethought in
the longer term.

Regarding the low-level details of block statistics inside QEMU, we can
probably collect statistics without requiring the AioContext lock.  This
means guest I/O processing does not need to be interrupted.  There
should be some RCU-style scheme that can be used to extract the stats.


The question is also what level of consistency you want.  Perhaps as
long as you can access it atomically (so no tearing of values) you don't
need a lock at all, blk_ref/unref is enough.


Yes, that is what I want to express. When I use the command frequently,
I may want to sacrifice consistency, in exchange for performance.
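
The trade-off discussed above (values that are read atomically, with no lock and no guarantee that different counters come from the same instant) can be sketched with C11 atomics. This is only an illustration under assumptions: the DemoBlockStats type and the demo_* functions are hypothetical names, not QEMU's block accounting code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device counters; not QEMU's real accounting structure. */
typedef struct {
    _Atomic uint64_t rd_bytes;
    _Atomic uint64_t rd_ops;
} DemoBlockStats;

/* A plain copy of the counters handed back to the caller of a query. */
typedef struct {
    uint64_t rd_bytes;
    uint64_t rd_ops;
} DemoSnapshot;

static void demo_stats_init(DemoBlockStats *s)
{
    assert(s != NULL);
    atomic_init(&s->rd_bytes, 0);
    atomic_init(&s->rd_ops, 0);
}

/* I/O path: bump the counters with relaxed atomics, no lock taken. */
static void demo_account_read(DemoBlockStats *s, uint64_t bytes)
{
    assert(s != NULL);
    atomic_fetch_add_explicit(&s->rd_bytes, bytes, memory_order_relaxed);
    atomic_fetch_add_explicit(&s->rd_ops, 1, memory_order_relaxed);
}

/*
 * Query path: each field is loaded atomically, so no single value is torn.
 * The two fields may still come from slightly different moments, which is
 * the consistency being given up in exchange for not blocking guest I/O.
 */
static DemoSnapshot demo_query(DemoBlockStats *s)
{
    DemoSnapshot snap;

    assert(s != NULL);
    snap.rd_bytes = atomic_load_explicit(&s->rd_bytes, memory_order_relaxed);
    snap.rd_ops = atomic_load_explicit(&s->rd_ops, memory_order_relaxed);
    return snap;
}
```

The sketch does not cover object lifetime; keeping the device alive while it is queried would still need something like the blk_ref/unref mentioned above.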


Thanks,

  Liyang.





Re: [Qemu-devel] [PATCH for-2.8] qdev: apply global properties in reverse order

2016-12-06 Thread Greg Kurz
On Mon, 5 Dec 2016 17:01:28 +0100
Cornelia Huck  wrote:

> On Mon, 5 Dec 2016 16:42:00 +0100
> Cornelia Huck  wrote:
> 
> > On Mon, 05 Dec 2016 16:21:22 +0100
> > Greg Kurz  wrote:  
> 
> > > AFAIK, libvirt's XML doesn't know about modern/legacy modes for virtio
> > > devices. Early adopters of virtio 1.0 had to rely on the 
> > > 
> > > tag to pass global properties to QEMU. This patch ensures that XML files
> > > used with older machine types remain valid with newer versions of QEMU.  
> 
> I recall some libvirt patches floating around for this legacy/modern
> stuff, but I don't know their status.
> 

libvirt does some probing of disable-legacy but I could find nothing that
allows a user to explicitly choose between legacy or modern.

> > > 
> > > FWIW I guess it could help to have this fix in 2.8, and also probably in
> > > 2.7.1.  
> > 
> > ...but I'm a bit worried about doing that change this late in the
> > cycle, as we may introduce subtle changes for other configurations. At
> > the very least, we should look over the existing backwards compat
> > properties (I'll look at those I'm familiar with).  
> 
> The s390x properties seem safe.
> 
> For virtio-pci, the ability to override extra state might become
> problematic for modern devices. Although manually setting this property
> is probably a pathological case...
> 

I agree that migrating a 2.6 machine type with disable-modern=off being
set through -global for a given subtype, from QEMU 2.6 to QEMU 2.8 (or 2.7)
is probably a corner case :)



Re: [Qemu-devel] [PATCH v7 0/5] IOMMU: intel_iommu support map and unmap notifications

2016-12-06 Thread Lan Tianyu
On 2016年12月06日 15:22, Peter Xu wrote:
> On Tue, Dec 06, 2016 at 03:06:20PM +0800, Lan Tianyu wrote:
>> On 2016年12月06日 14:51, Peter Xu wrote:
>>> On Tue, Dec 06, 2016 at 02:30:24PM +0800, Lan Tianyu wrote:
>>>
>>> [...]
>>>
>> 2. How to restore GPA->HPA mapping when IOVA is disabled by guest.
>> When guest enables IOVA for device, vIOMMU will invalidate all previous
>> GPA->HPA mapping and update IOVA->HPA mapping to pIOMMU via iommu
>> notifier. But if IOVA is disabled, I think we should restore GPA->HPA
>> mapping for the device otherwise the device won't work again in the VM.
>
> If we can have a workable replay mechanism, this problem will be
> solved IMHO.

 Basic idea is to replay related memory regions to restore GPA->HPA
 mapping when guest disables IOVA.
>>>
>>> Btw, could you help elaborate in what case we will trigger such a
>>> condition? Or, can we dynamically enable/disable an IOMMU?
>>
>> The first case I can think of is nested virtualization, where we assign a
>> physical device to an L2 guest.
> 
> If we assign a physical device to an L2 guest, then it will belong to an
> IOMMU domain in both the L1 guest and the host, right? (assuming we are
> using vfio-pci to do device assignment)

Yes.

> 
>> A user-space driver (e.g. DPDK) can also enable/disable
>> IOVA for a device dynamically.
> 
> Could you provide more details (or any pointer) on how to do that? I
> did try to find it myself; I see a VFIO_IOMMU_ENABLE ioctl, but it looks
> like it is for ppc only.

No, I only gave that as an example of what user space might do; I have not
researched it further. But since QEMU can already enable IOVA for a device,
other user-space applications should be able to do the same through the same
VFIO interface, right?

-- 
Best regards
Tianyu Lan



Re: [Qemu-devel] [PATCH kernel v5 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration

2016-12-06 Thread David Hildenbrand

Am 30.11.2016 um 09:43 schrieb Liang Li:

This patch set contains two parts of changes to the virtio-balloon.

One is the change for speeding up the inflating & deflating process,
the main idea of this optimization is to use bitmap to send the page
information to host instead of the PFNs, to reduce the overhead of
virtio data transmission, address translation and madvise(). This can
help to improve the performance by about 85%.


Do you have some statistics/some rough feeling how many consecutive bits 
are usually set in the bitmaps? Is it really just purely random or is 
there some granularity that is usually consecutive?


IOW in real examples, do we have really large consecutive areas or are 
all pages just completely distributed over our memory?


Thanks!

--

David



Re: [Qemu-devel] [PATCH 11/13] target-ppc: implement xsnegqp instruction

2016-12-06 Thread Nikunj A Dadhania
Richard Henderson  writes:

> On 12/05/2016 03:25 AM, Nikunj A Dadhania wrote:
>> +case OP_NEG:  \
>> +tcg_gen_xor_i64(xbh, xbh, sgm);   \
>> +tcg_gen_xori_i64(xbl, xbl, 0);\
>> +break;\
>
> No point in the xori.

Yeah, right.
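
For reference, the point of the review comment can be shown in plain C: negating a 128-bit floating-point value only needs the sign bit in the high 64-bit half flipped, and XOR-ing the low half with 0 changes nothing. This is a sketch under assumptions; DemoQuad, DEMO_SGN_MASK and demo_negate_quad are hypothetical names and not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container: a 128-bit value split into two 64-bit halves. */
typedef struct {
    uint64_t hi;   /* holds the sign bit and exponent */
    uint64_t lo;   /* low part of the significand */
} DemoQuad;

#define DEMO_SGN_MASK 0x8000000000000000ULL

/* Negate in place by flipping the sign bit of the high half only. */
static void demo_negate_quad(DemoQuad *x)
{
    assert(x != NULL);
    x->hi ^= DEMO_SGN_MASK;
    /* x->lo ^ 0 == x->lo, so the low half needs no operation at all. */
}
```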

Regards
Nikunj




Re: [Qemu-devel] [Nbd] [PATCH] doc: Propose NBD_FLAG_INIT_ZEROES extension

2016-12-06 Thread Alex Bligh

> On 5 Dec 2016, at 23:42, Eric Blake  wrote:
> 
> While not directly related to NBD_CMD_WRITE_ZEROES, the qemu
> team discovered that it is useful if a server can advertise
> whether an export is in a known-all-zeroes state at the time
> the client connects.

I think this is good to go, and ...

> Signed-off-by: Eric Blake 
> ---
> doc/proto.md | 5 +
> 1 file changed, 5 insertions(+)
> 
> This replaces the following qemu patch attempt:
> https://lists.gnu.org/archive/html/qemu-devel/2016-12/msg00357.html
> which tried to add NBD_CMD_HAS_ZERO_INIT with poor semantics. The
> semantics in this proposal should be much better.
> 
> Patch is to the merge of the master branch and the
> extension-write-zeroes branch.  By the way, qemu 2.8 is due
> to be released "real soon now", and implements NBD_CMD_WRITE_ZEROES,
> so maybe it is time to consider promoting the extension-write-zeroes
> branch into master.

I would support this.

In fact the patch is sufficiently simple I think I'd merge this
into extension-write-zeroes then merge that into master.

Wouter?

Alex

> diff --git a/doc/proto.md b/doc/proto.md
> index afe71fc..7e4ec7f 100644
> --- a/doc/proto.md
> +++ b/doc/proto.md
> @@ -697,6 +697,11 @@ The field has the following format:
>   the export.
> - bit 9, `NBD_FLAG_SEND_BLOCK_STATUS`: defined by the experimental
>   `BLOCK_STATUS` 
> [extension](https://github.com/NetworkBlockDevice/nbd/blob/extension-blockstatus/doc/proto.md).
> +- bit 10, `NBD_FLAG_INIT_ZEROES`: Indicates that the server guarantees
> +  that at the time transmission phase begins, all offsets within the
> +  export read as zero bytes.  Clients MAY use this information to
> +  avoid writing to sections of the export that should still read as
> +  zero after the client is done writing.
> 
> Clients SHOULD ignore unknown flags.
> 
> -- 
> 2.9.3
> 

-- 
Alex Bligh
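
A rough sketch of how a client might act on the proposed flag, assuming a one-pass copy in which each region of the export is written at most once. Only NBD_FLAG_INIT_ZEROES and its bit position (bit 10) come from the patch text; the demo_* helpers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 10 of the transmission flags, as proposed in the doc/proto.md patch. */
#define NBD_FLAG_INIT_ZEROES (1u << 10)

/* Hypothetical helper: true if the buffer contains only zero bytes. */
static bool demo_is_all_zero(const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0) {
            return false;
        }
    }
    return true;
}

/*
 * Hypothetical client-side decision: if the server advertised that the
 * export reads as zeroes when transmission begins, and this region has not
 * been written before, an all-zero block does not need to be sent.
 */
static bool demo_must_write(uint16_t transmission_flags,
                            const uint8_t *buf, size_t len)
{
    bool starts_zeroed = (transmission_flags & NBD_FLAG_INIT_ZEROES) != 0;

    return !(starts_zeroed && demo_is_all_zero(buf, len));
}
```

A client that rewrites regions cannot use this shortcut blindly, since the guarantee only covers the state at the start of the transmission phase.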







[Qemu-devel] An issue for migration determining the cpu throttle value according to workload

2016-12-06 Thread Chao Fan
Hi all,

There is an issue with the auto-converge feature of migration.

When migrating a guest that consumes a lot of CPU and memory, the number
of dirty pages increases significantly, and so does the migration
time; at worst, the migration cannot complete at all.

I did some simple tests on this feature. I set the two parameters
to the same value (10, 20, 30, 40, 50, 60, 70, 80, 99) and ran the same
task in the same guest. Roughly, as the two parameters increase, the
total_time and the dirty_sync_count decrease. In other words, the larger
the two parameters are, the faster the migration is, but the more slowly
the guest runs.

So I think there should be an appropriate throttle value for a given
guest workload, but users do not know how to determine it.

So I would like QEMU to set the throttle value according to the workload
of the guest. I think QEMU could calculate the instant dirty page rate
and then determine an appropriate throttle value from it.
The instant dirty page rate means how many pages are dirtied within a
short, fixed time interval. But I have two questions:
1. Where to add this feature. I see two options:
   a. Currently QEMU estimates the remaining migration time and decides
      whether to apply the CPU throttle. This could be changed so that QEMU
      applies the CPU throttle when the instant dirty page rate rises
      above a certain threshold, and sets the throttle value according to
      the instant dirty page rate.
   b. Keep the current approach as it is, but when the remaining migration
      time is too long and QEMU starts to apply the CPU throttle, assign
      an appropriate throttle value according to the workload. This
      method requires fewer code changes.
2. How to determine the CPU throttle value from the dirty pages.
   My preliminary idea is that the CPU throttle should be related to
   the instant dirty page rate and the total memory.
   But I am not sure how best to map the instant dirty page rate
   and total memory to a CPU throttle value.
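
To make question 2 concrete, here is one naive mapping, given purely as an illustration and not as a proposal that has been measured: scale the throttle linearly with the fraction of guest memory dirtied during one sampling interval. The function name, the 1..99 bounds and the linear shape are all assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bounds for the throttle percentage. */
#define DEMO_THROTTLE_MIN 1
#define DEMO_THROTTLE_MAX 99

/*
 * Hypothetical mapping: dirtied_pages is the number of pages dirtied during
 * one short, fixed sampling interval (the instant dirty page rate), and
 * total_pages is the size of guest memory in pages. The result grows
 * linearly from DEMO_THROTTLE_MIN to DEMO_THROTTLE_MAX.
 */
static int demo_throttle_pct(uint64_t dirtied_pages, uint64_t total_pages)
{
    uint64_t span = DEMO_THROTTLE_MAX - DEMO_THROTTLE_MIN;

    assert(total_pages > 0);
    if (dirtied_pages > total_pages) {
        dirtied_pages = total_pages;   /* clamp to 100% of memory */
    }
    return (int)(DEMO_THROTTLE_MIN + dirtied_pages * span / total_pages);
}
```

A real mapping would probably also need to take the available migration bandwidth into account, which this sketch ignores.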

Any comments are welcome, and I would like to know whether other people
think this feature is needed.
If anyone has good ideas, please tell me.

Thanks,
Chao Fan





Re: [Qemu-devel] [PATCH 13/13] target-ppc: Add xxperm and xxpermr instructions

2016-12-06 Thread Bharata B Rao
On Tue, Dec 06, 2016 at 03:11:22PM +1100, David Gibson wrote:
> On Mon, Dec 05, 2016 at 04:55:30PM +0530, Nikunj A Dadhania wrote:
> > From: Bharata B Rao 
> > 
> > xxperm:  VSX Vector Permute
> > xxpermr: VSX Vector Permute Right-indexed
> > 
> > Signed-off-by: Bharata B Rao 
> > Signed-off-by: Nikunj A Dadhania 
> > ---
> >  target-ppc/fpu_helper.c | 50 
> > +
> >  target-ppc/helper.h |  2 ++
> >  target-ppc/translate/vsx-impl.inc.c |  2 ++
> >  target-ppc/translate/vsx-ops.inc.c  |  2 ++
> >  4 files changed, 56 insertions(+)
> > 
> > diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
> > index 3b867cf..be552c7 100644
> > --- a/target-ppc/fpu_helper.c
> > +++ b/target-ppc/fpu_helper.c
> > @@ -2869,3 +2869,53 @@ uint64_t helper_xsrsp(CPUPPCState *env, uint64_t xb)
> >  float_check_status(env);
> >  return xt;
> >  }
> > +
> > +static void vsr_copy_256(ppc_vsr_t *xa, ppc_vsr_t *xt, int8_t *src)
> > +{
> > +#if defined(HOST_WORDS_BIGENDIAN)
> > +memcpy(src, xa, sizeof(*xa));
> > +memcpy(src + 16, xt, sizeof(*xt));
> > +#else
> > +memcpy(src, xt, sizeof(*xt));
> > +memcpy(src + 16, xa, sizeof(*xa));
> 
> Is this right?  I thought the order of the bytes within each word
> varied with the host endianness as well.

Since we are already working with two 16-byte vectors, xa and xt, here, I
thought we don't have to worry about the order of bytes within each vector,
but can instead construct the 32-byte vector as above based on host
endianness.

> 
> > +#endif
> > +}
> > +
> > +static int8_t vsr_get_byte(int8_t *src, int bound, int idx)
> > +{
> > +if (idx >= bound) {
> > +return 0xFF;
> > +}
> 
> AFAICT you don't need this check.  For both xxperm and xxpermr you're
> already masking the index to 5 bits, so it can't exceed 31.

Was thinking of making it a generic API and hence had that boundary
check but yes, no point for the check in the context of this instruction.

> 
> > +#if defined(HOST_WORDS_BIGENDIAN)
> > +return src[idx];
> > +#else
> > +return src[bound - 1 - idx];
> > +#endif
> > +}
> > +
> > +#define VSX_XXPERM(op, indexed)\
> > +void helper_##op(CPUPPCState *env, uint32_t opcode)\
> > +{  \
> > +ppc_vsr_t xt, xa, pcv; \
> > +int i, idx;\
> > +int8_t src[32];\
> > +   \
> > +getVSR(xA(opcode), &xa, env);  \
> > +getVSR(xT(opcode), &xt, env);  \
> > +getVSR(xB(opcode), &pcv, env); \
> > +   \
> > +vsr_copy_256(&xa, &xt, src);   \
> 
> You have a double copy here AFAICT - first from the actual env
> structure to xt and xa, then to the src array.  That seems like it
> would be good to avoid.
> 
> It seems like it would nice in any case to avoid even the one copy.
> You'd need a temporary for the output of course and to copy that, but
> you should be able to combine indexed with host endianness to
> translate each index to retrieve directly from the VSR values in env.

I am not sure it would be good to retrieve byte values directly from
env, as getVSR nicely abstracts out which fields
(env->[fpr, vsr, avr]) the data is fetched from, based on the register
specified in the opcode.

I can reduce one copy though by not constructing a 32 byte vector (src)
but instead retrieving the bytes directly from xa and xt based on
the index.
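
A sketch of that single-copy approach, under stated assumptions: bytes are stored in big-endian element order (b[0] is the most significant byte, so host endianness is ignored here), xa supplies bytes 0..15 and xt bytes 16..31 of the conceptual 32-byte source, and the right-indexed form uses 31 minus the masked index. DemoVec and the demo_* functions are hypothetical, and the semantics should be checked against the ISA text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 16-byte vector; b[0] is the most significant byte. */
typedef struct {
    uint8_t b[16];
} DemoVec;

/* Fetch byte idx of the conceptual 32-byte value xa || xt without copying. */
static uint8_t demo_pick(const DemoVec *xa, const DemoVec *xt, unsigned idx)
{
    assert(idx < 32);
    return idx < 16 ? xa->b[idx] : xt->b[idx - 16];
}

/*
 * Build the permuted result in a temporary, because xt is both a source
 * and the destination register in the real instruction.
 */
static DemoVec demo_xxperm(const DemoVec *xa, const DemoVec *xt,
                           const DemoVec *pcv, bool right_indexed)
{
    DemoVec out;

    for (int i = 0; i < 16; i++) {
        unsigned idx = pcv->b[i] & 0x1F;   /* 5-bit index, never above 31 */

        if (right_indexed) {
            idx = 31 - idx;
        }
        out.b[i] = demo_pick(xa, xt, idx);
    }
    return out;
}
```

Because the index is masked to 5 bits before use, no separate bounds check is needed, which matches the review comment above.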

Regards,
Bharata.




Re: [Qemu-devel] [PATCH v5 1/1] crypto: add virtio-crypto driver

2016-12-06 Thread Gonglei (Arei)
Hi Herbert,

Would you please review and/or ack virtio_crypto_algs.c?
It implements the specified algorithms on top of the Linux crypto framework.

Thanks!


Regards,
-Gonglei


> -Original Message-
> From: Gonglei (Arei)
> Sent: Thursday, December 01, 2016 8:39 PM
> To: linux-ker...@vger.kernel.org; qemu-devel@nongnu.org;
> virtio-...@lists.oasis-open.org; virtualizat...@lists.linux-foundation.org;
> linux-cry...@vger.kernel.org
> Cc: Luonengjun; m...@redhat.com; stefa...@redhat.com; Huangweidong (C);
> Wubin (H); xin.z...@intel.com; Claudio Fontana;
> herb...@gondor.apana.org.au; pa...@linux.vnet.ibm.com;
> da...@davemloft.net; Zhoujian (jay, Euler); Hanweidong (Randy);
> arei.gong...@hotmail.com; cornelia.h...@de.ibm.com; Xuquan (Quan Xu);
> longpeng; Wanzongshun (Vincent); Gonglei (Arei)
> Subject: [PATCH v5 1/1] crypto: add virtio-crypto driver
> 
> This patch introduces virtio-crypto driver for Linux Kernel.
> 
> The virtio crypto device is a virtual cryptography device
> as well as a kind of virtual hardware accelerator for
> virtual machines. The encryption and decryption requests
> are placed in the data queue and are ultimately handled by
> the backend crypto accelerators. The second queue is the
> control queue used to create or destroy sessions for
> symmetric algorithms and will control some advanced features
> in the future. The virtio crypto device provides the following
> crypto services: CIPHER, MAC, HASH, and AEAD.
> 
> For more information about virtio-crypto device, please see:
>   http://qemu-project.org/Features/VirtioCrypto
> 
> CC: Michael S. Tsirkin 
> CC: Cornelia Huck 
> CC: Stefan Hajnoczi 
> CC: Herbert Xu 
> CC: Halil Pasic 
> CC: David S. Miller 
> CC: Zeng Xin 
> Signed-off-by: Gonglei 
> ---
>  MAINTAINERS  |   9 +
>  drivers/crypto/Kconfig   |   2 +
>  drivers/crypto/Makefile  |   1 +
>  drivers/crypto/virtio/Kconfig|  10 +
>  drivers/crypto/virtio/Makefile   |   5 +
>  drivers/crypto/virtio/virtio_crypto_algs.c   | 537
> +++
>  drivers/crypto/virtio/virtio_crypto_common.h | 122 ++
>  drivers/crypto/virtio/virtio_crypto_core.c   | 464
> +++
>  drivers/crypto/virtio/virtio_crypto_mgr.c| 264 +
>  include/uapi/linux/Kbuild|   1 +
>  include/uapi/linux/virtio_crypto.h   | 450
> ++
>  include/uapi/linux/virtio_ids.h  |   1 +
>  12 files changed, 1866 insertions(+)
>  create mode 100644 drivers/crypto/virtio/Kconfig
>  create mode 100644 drivers/crypto/virtio/Makefile
>  create mode 100644 drivers/crypto/virtio/virtio_crypto_algs.c
>  create mode 100644 drivers/crypto/virtio/virtio_crypto_common.h
>  create mode 100644 drivers/crypto/virtio/virtio_crypto_core.c
>  create mode 100644 drivers/crypto/virtio/virtio_crypto_mgr.c
>  create mode 100644 include/uapi/linux/virtio_crypto.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index ad9b965..cccaaf0 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -12810,6 +12810,7 @@ F:drivers/net/virtio_net.c
>  F:   drivers/block/virtio_blk.c
>  F:   include/linux/virtio_*.h
>  F:   include/uapi/linux/virtio_*.h
> +F:   drivers/crypto/virtio/
> 
>  VIRTIO DRIVERS FOR S390
>  M:   Christian Borntraeger 
> @@ -12846,6 +12847,14 @@ S:   Maintained
>  F:   drivers/virtio/virtio_input.c
>  F:   include/uapi/linux/virtio_input.h
> 
> +VIRTIO CRYPTO DRIVER
> +M:  Gonglei 
> +L:  virtualizat...@lists.linux-foundation.org
> +L:  linux-cry...@vger.kernel.org
> +S:  Maintained
> +F:  drivers/crypto/virtio/
> +F:  include/uapi/linux/virtio_crypto.h
> +
>  VIA RHINE NETWORK DRIVER
>  S:   Orphan
>  F:   drivers/net/ethernet/via/via-rhine.c
> diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
> index 4d2b81f..7956478 100644
> --- a/drivers/crypto/Kconfig
> +++ b/drivers/crypto/Kconfig
> @@ -555,4 +555,6 @@ config CRYPTO_DEV_ROCKCHIP
> 
>  source "drivers/crypto/chelsio/Kconfig"
> 
> +source "drivers/crypto/virtio/Kconfig"
> +
>  endif # CRYPTO_HW
> diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
> index ad7250f..bc53cb8 100644
> --- a/drivers/crypto/Makefile
> +++ b/drivers/crypto/Makefile
> @@ -32,3 +32,4 @@ obj-$(CONFIG_CRYPTO_DEV_VMX) += vmx/
>  obj-$(CONFIG_CRYPTO_DEV_SUN4I_SS) += sunxi-ss/
>  obj-$(CONFIG_CRYPTO_DEV_ROCKCHIP) += rockchip/
>  obj-$(CONFIG_CRYPTO_DEV_CHELSIO) += chelsio/
> +obj-$(CONFIG_CRYPTO_DEV_VIRTIO) += virtio/
> diff --git a/drivers/crypto/virtio/Kconfig b/drivers/crypto/virtio/Kconfig
> new file mode 100644
> index 000..d80f733
> --- /dev/null
> +++ b/drivers/crypto/virtio/Kconfig
> @@ -0,0 +1,10 @@
> +config CRYPTO_DEV_VIRTIO
> + tristate "VirtIO crypto driver"
> + depends on VIRTIO
> + select CRYPTO_AEAD
> + select CRYPTO_AUTHENC
> + select CRYPTO_BLKCIPHER
> + default m
> + help
> +   This driver provides support for virtio crypto device

Re: [Qemu-devel] [PATCH for-2.8] qdev: apply global properties in reverse order

2016-12-06 Thread Greg Kurz
On Mon, 5 Dec 2016 19:14:40 +0100
Halil Pasic  wrote:

> On 12/05/2016 06:41 PM, Eduardo Habkost wrote:
> > On Mon, Dec 05, 2016 at 06:25:55PM +0100, Cornelia Huck wrote:  
> >> > On Mon, 5 Dec 2016 14:48:29 -0200
> >> > Eduardo Habkost  wrote:
> >> >   
> >>> > > On Mon, Dec 05, 2016 at 04:42:00PM +0100, Cornelia Huck wrote:  
>  > > > On Mon, 05 Dec 2016 16:21:22 +0100
>  > > > Greg Kurz  wrote:
>  > > >   
> > > > > > The current code recursively applies global properties from 
> > > > > > child up to
> > > > > > parent. So, if you have:
> > > > > > 
> > > > > > -global virtio-pci.disable-modern=on
> > > > > > -global virtio-blk-pci.disable-modern=off
> > > > > > 
> > > > > > Then the default value of disable-modern for a virtio-blk-pci 
> > > > > > device is on,
> > > > > > which looks wrong from an OOP perspective.
> > > > > > 
> > > > > > > This patch reverses the logic, so that a child property always 
> > > > > > > prevails.  
>  > > > 
>  > > > This sounds reasonable...
>  > > >   
> > > > > > 
> > > > > > This fixes a subtle bug that got introduced in 2.7 with commit 
> > > > > > "9a4c0e220d8a
> > > > > > hw/virtio-pci: fix virtio behaviour" for older (< 2.7) machine 
> > > > > > types: the
> > > > > > HW_COMPAT_2_6 macro contains global virtio-pci.disable-* 
> > > > > > properties which
> > > > > > would silently override global properties passed on the command 
> > > > > > line for
> > > > > > virtio subtypes.
> > > > > > 
> > > > > > Signed-off-by: Greg Kurz 
> > > > > > ---
> > > > > > 
> > > > > > AFAIK, libvirt's XML doesn't know about modern/legacy modes for 
> > > > > > virtio
> > > > > > devices. Early adopters of virtio 1.0 had to rely on the 
> > > > > > 
> > > > > > tag to pass global properties to QEMU. This patch ensures that 
> > > > > > XML files
> > > > > > used with older machine types remain valid with newer versions 
> > > > > > of QEMU.
> > > > > > 
> > > > > > FWIW I guess it could help to have this fix in 2.8, and also 
> > > > > > probably in
> > > > > > 2.7.1.  
>  > > > 
>  > > > ...but I'm a bit worried about doing that change this late in the
>  > > > cycle, as we may introduce subtle changes for other 
>  > > > configurations. At
>  > > > the very least, we should look over the existing backwards compat
>  > > > properties (I'll look at those I'm familiar with).  
> >>> > > 
> >>> > > This patch would change the behavior for:
> >>> > >  -global virtio-blk-pci.disable-modern=on
> >>> > >  -global virtio-pci.disable-modern=off
> >>> > > 
> >>> > > And I am not sure the new behavior would be correct. Shouldn't we
> >>> > > apply the properties in the order specified in the command-line?  
> >> > 
> >> > Probably; but how should this interact with compat props?  
> > compat props should be always applied in the order they appear.
> > -global should always be applied after compat props.
> > 
> > So, it looks like we have two additional reasons to just follow
> > the order the global properties were registered.
> >   
> 
> How about a docs update? 
> 
> Given the current doc:
> """
> -global driver.prop=value
> -global driver=driver,property=property,value=value
> Set default value of driver's property prop to value, e.g.:
> 
>  qemu-system-i386 -global ide-drive.physical_block_size=4096 -drive
> file=file,if=ide,index=0,media=disk
> 
> In particular, you can use this to set driver properties for devices which
> are created automatically by the machine model. To create a device which 
> is
> not created automatically and set properties on it, use -device.
> 
> -global driver.prop=value is shorthand for -global 
> driver=driver,property=prop,
> value=value. The longhand syntax works even when
> driver contains a dot. 
> """
> I think this OOP argument, which I do not completely understand,

With the current code, properties from the parent classes implicitly
prevail and this has nothing to do with command line order, or order
of appearance in HW_COMPAT_*.

From an OOP perspective, we usually expect child classes to override
parent classes behavior, not the contrary.

> is not the right direction.
> 
> Yet I do not think the current state of documentation gives a
> definitive answer on what behavior should take place in the
> scenarios described above.
> 

True, the documentation doesn't mention anything about the QEMU
object model. And the user may not even be aware that virtio-pci
exists and is the parent of virtio-blk-pci...

> Maybe it's my mistake, but I did not find a statement about
> the order in which global properties are to be applied
> (please point me to it if I've missed it).
> 

True, so this should be the implicit command line order.
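
The "registration order" rule discussed in this thread (compat properties first, then -global options in command-line order, with the last matching entry winning) can be sketched as follows. This is a toy model with hypothetical DemoType/DemoGlobal structures, not QEMU's qdev code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type node with a single parent, e.g. virtio-blk-pci. */
typedef struct DemoType {
    const char *name;
    const struct DemoType *parent;
} DemoType;

/* Hypothetical global property entry: driver.prop=value. */
typedef struct {
    const char *driver;
    const char *prop;
    const char *value;
} DemoGlobal;

/* True if type t is named `name` or inherits from a type named `name`. */
static int demo_type_is_a(const DemoType *t, const char *name)
{
    for (; t != NULL; t = t->parent) {
        if (strcmp(t->name, name) == 0) {
            return 1;
        }
    }
    return 0;
}

/*
 * Walk the list in registration order; every matching entry overwrites the
 * previous one, so the last registered match wins regardless of whether it
 * names the child type or one of its parents.
 */
static const char *demo_resolve(const DemoType *dev, const char *prop,
                                const DemoGlobal *globals, size_t n,
                                const char *dflt)
{
    const char *value = dflt;

    assert(dev != NULL && prop != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(globals[i].prop, prop) == 0 &&
            demo_type_is_a(dev, globals[i].driver)) {
            value = globals[i].value;
        }
    }
    return value;
}
```

With a compat entry for virtio-pci registered first and a command-line entry for virtio-blk-pci registered after it, the command-line value is the one that sticks, which is the behaviour the patch is after for older machine types.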

> Another problem with the doc (IMHO) is that it's not
> really defined what a driver is. So I can't eve

Re: [Qemu-devel] [PATCH v4 12/13] aio: self-tune polling time

2016-12-06 Thread Stefan Hajnoczi
On Mon, Dec 05, 2016 at 09:06:17PM +0100, Christian Borntraeger wrote:
> On 12/01/2016 08:26 PM, Stefan Hajnoczi wrote:
> > This patch is based on the algorithm for the kvm.ko halt_poll_ns
> > parameter in Linux.  The initial polling time is zero.
> > 
> > If the event loop is woken up within the maximum polling time it means
> > polling could be effective, so grow polling time.
> > 
> > If the event loop is woken up beyond the maximum polling time it means
> > polling is not effective, so shrink polling time.
> > 
> > If the event loop makes progress within the current polling time then
> > the sweet spot has been reached.
> > 
> > This algorithm adjusts the polling time so it can adapt to variations in
> > workloads.  The goal is to reach the sweet spot while also recognizing
> > when polling would hurt more than help.
> > 
> > Two new trace events, poll_grow and poll_shrink, are added for observing
> > polling time adjustment.
> > 
> > Signed-off-by: Stefan Hajnoczi 
> 
> Not sure why, but I have 4 host ramdisks with the same iothread as guest
> virtio-blk. Running fio in the guest on one of these disks will poll; as
> soon as I have 2 disks in fio I almost always see shrinks (so polling
> stays at 0) and almost no grows.

Shrinking occurs when polling + ppoll(2) time exceeds poll-max-ns.

What is the value of poll-max-ns and how long is run_poll_handlers_end -
run_poll_handlers_begin?

I wonder if polling both disks takes longer than poll-max-ns once you
have two disks.  The "polling" activity includes processing the I/O
requests, so I imagine the time extends significantly as more disks have
I/O requests ready for processing.

Maybe the block_ns timing calculation should exclude processing time to
avoid false shrinking?

It also strikes me that there's a blind spot to the self-tuning
algorithm: imagine virtqueue kick via ppoll(2) + ioeventfd takes N
nanoseconds.  Detecting new virtqueue buffers via polling takes M
nanoseconds.  When M <= poll-max-ns < N the algorithm decides there is
no point in polling but it would actually be faster to poll.  The reason
is that the algorithm only looks at block_ns, which is N, not M.

This seems difficult to tackle because the algorithm has no way of
predicting M unless it randomly tries to poll longer.
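
For readers following along, the grow/shrink rule described in the patch can be sketched like this. It is a simplified model under assumptions: the starting value and the factors of 2 are hypothetical, and block_ns stands for the time the event loop was blocked (polling plus ppoll(2)) in the last iteration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tuning constants; the real values may differ. */
#define DEMO_POLL_START_NS 4000
#define DEMO_GROW_FACTOR   2
#define DEMO_SHRINK_FACTOR 2

/*
 * Return the polling time to use for the next event loop iteration.
 * poll_ns:  current polling time
 * block_ns: how long the last iteration was blocked before waking up
 * max_ns:   upper bound (poll-max-ns)
 */
static int64_t demo_adjust_poll_ns(int64_t poll_ns, int64_t block_ns,
                                   int64_t max_ns)
{
    int64_t grown;

    assert(poll_ns >= 0 && poll_ns <= max_ns);

    if (block_ns <= poll_ns) {
        return poll_ns;                        /* sweet spot: keep as is */
    }
    if (block_ns > max_ns) {
        return poll_ns / DEMO_SHRINK_FACTOR;   /* polling would not help */
    }
    /* Woken up within max_ns: polling longer could have caught the event. */
    grown = poll_ns ? poll_ns * DEMO_GROW_FACTOR : DEMO_POLL_START_NS;
    return grown > max_ns ? max_ns : grown;
}
```

In this model, a workload whose block_ns always exceeds max_ns never leaves a polling time of 0, which is consistent with the "polling stays at 0" observation and with the blind spot described above.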

Stefan




Re: [Qemu-devel] [PATCH for-2.9 1/3] crypto: add standard des support

2016-12-06 Thread Daniel P. Berrange
On Tue, Dec 06, 2016 at 01:23:31AM +, Gonglei (Arei) wrote:
> >
> > From: Daniel P. Berrange [mailto:berra...@redhat.com]
> > Sent: Tuesday, December 06, 2016 12:59 AM
> > To: Gonglei (Arei)
> > Cc: longpeng; ebl...@redhat.com; arm...@redhat.com;
> > qemu-devel@nongnu.org; Wubin (H); Zhoujian (jay, Euler)
> > Subject: Re: [PATCH for-2.9 1/3] crypto: add standard des support
> > 
> > On Mon, Dec 05, 2016 at 09:29:59AM +, Gonglei (Arei) wrote:
> > > >
> > > > >  switch (alg) {
> > > > > +case QCRYPTO_CIPHER_ALG_DES:
> > > > >  case QCRYPTO_CIPHER_ALG_DES_RFB:
> > > > >  case QCRYPTO_CIPHER_ALG_AES_128:
> > > > >  case QCRYPTO_CIPHER_ALG_AES_192:
> > > > > @@ -256,11 +257,17 @@ QCryptoCipher
> > > > *qcrypto_cipher_new(QCryptoCipherAlgorithm alg,
> > > > >  ctx = g_new0(QCryptoCipherNettle, 1);
> > > > >
> > > > >  switch (alg) {
> > > > > +case QCRYPTO_CIPHER_ALG_DES:
> > > > >  case QCRYPTO_CIPHER_ALG_DES_RFB:
> > > > >  ctx->ctx = g_new0(struct des_ctx, 1);
> > > > > -rfbkey = qcrypto_cipher_munge_des_rfb_key(key, nkey);
> > > > > -des_set_key(ctx->ctx, rfbkey);
> > > > > -g_free(rfbkey);
> > > > > +
> > > > > +if (alg == QCRYPTO_CIPHER_ALG_DES_RFB) {
> > > > > +rfbkey = qcrypto_cipher_munge_des_rfb_key(key, nkey);
> > > > > +des_set_key(ctx->ctx, rfbkey);
> > > > > +g_free(rfbkey);
> > > > > +} else {
> > > > > +des_set_key(ctx->ctx, key);
> > > > > +}
> > > > >
> > > > >  ctx->alg_encrypt_native = des_encrypt_native;
> > > > >  ctx->alg_decrypt_native = des_decrypt_native;
> > > > > diff --git a/crypto/cipher.c b/crypto/cipher.c
> > > > > index a9bca41..00d9682 100644
> > > > > --- a/crypto/cipher.c
> > > > > +++ b/crypto/cipher.c
> > > > > @@ -27,6 +27,7 @@ static size_t
> > > > alg_key_len[QCRYPTO_CIPHER_ALG__MAX] = {
> > > > >  [QCRYPTO_CIPHER_ALG_AES_128] = 16,
> > > > >  [QCRYPTO_CIPHER_ALG_AES_192] = 24,
> > > > >  [QCRYPTO_CIPHER_ALG_AES_256] = 32,
> > > > > +[QCRYPTO_CIPHER_ALG_DES] = 8,
> > > > >  [QCRYPTO_CIPHER_ALG_DES_RFB] = 8,
> > > > >  [QCRYPTO_CIPHER_ALG_CAST5_128] = 16,
> > > > >  [QCRYPTO_CIPHER_ALG_SERPENT_128] = 16,
> > > > > @@ -41,6 +42,7 @@ static size_t
> > > > alg_block_len[QCRYPTO_CIPHER_ALG__MAX] = {
> > > > >  [QCRYPTO_CIPHER_ALG_AES_128] = 16,
> > > > >  [QCRYPTO_CIPHER_ALG_AES_192] = 16,
> > > > >  [QCRYPTO_CIPHER_ALG_AES_256] = 16,
> > > > > +[QCRYPTO_CIPHER_ALG_DES] = 8,
> > > > >  [QCRYPTO_CIPHER_ALG_DES_RFB] = 8,
> > > > >  [QCRYPTO_CIPHER_ALG_CAST5_128] = 8,
> > > > >  [QCRYPTO_CIPHER_ALG_SERPENT_128] = 16,
> > > > > @@ -107,7 +109,8 @@
> > > > qcrypto_cipher_validate_key_length(QCryptoCipherAlgorithm alg,
> > > > >  }
> > > > >
> > > > >  if (mode == QCRYPTO_CIPHER_MODE_XTS) {
> > > > > -if (alg == QCRYPTO_CIPHER_ALG_DES_RFB) {
> > > > > +if (alg == QCRYPTO_CIPHER_ALG_DES_RFB
> > > > > +|| alg == QCRYPTO_CIPHER_ALG_DES) {
> > > > >  error_setg(errp, "XTS mode not compatible with
> > DES-RFB");
> > > > >  return false;
> > > > >  }
> > > > > diff --git a/qapi/crypto.json b/qapi/crypto.json
> > > > > index 5c9d7d4..d403ab9 100644
> > > > > --- a/qapi/crypto.json
> > > > > +++ b/qapi/crypto.json
> > > > > @@ -75,7 +75,7 @@
> > > > >  { 'enum': 'QCryptoCipherAlgorithm',
> > > > >'prefix': 'QCRYPTO_CIPHER_ALG',
> > > > >'data': ['aes-128', 'aes-192', 'aes-256',
> > > > > -   'des-rfb',
> > > > > +   'des-rfb', 'des',
> > > >
> > > > Can we call this '3des' to make it clear that this is Triple-DES and not
> > > > the single-DES (which des-rfb is)
> > > >
> > > Actually the current des is not triple-DES, just the single-DES, and 
> > > des-rfb in
> > QEMU is just a variant of
> > > single DES, which change the standard key by calling
> > qcrypto_cipher_munge_des_rfb_key().
> > >
> > > I think we can add the 3des support as well in the next step.
> > >
> > > The current single-DES in the patch set is ok to me. :)
> > 
> > > Per my other reply in this thread,
> 
> I saw that, thanks for your information, Daniel.
> 
> > I don't think we should be supporting
> > single-DES at all in QEMU / cryptodev. So IMHO, the correct fix is to
> > remove the single-DES support from cryptodev entirely
> > 
> The cryptodev-builtin is one kind of cryptodev backends. It provides the
> real crypto capability for virtio crypto device.
>  
> I don't think we should artificially remove one algorithm support if
> the frontend driver (users) wants to use it, though the algorithm is
> unsafe.

IIUC the cryptodev hardware is ultimately about allowing the guest
to offload crypto operations to the host, potentially using hardware
acceleration. If the cryptodev backend doesn't support a particular
algorithm, the guest is still capable of using its own built-in
support for that algorithm. I see no compell

Re: [Qemu-devel] [kvm-unit-tests RFC 03/15] arm/arm64: ITS skeleton

2016-12-06 Thread Andrew Jones
On Mon, Dec 05, 2016 at 10:46:34PM +0100, Eric Auger wrote:
> At the moment we just detect the presence of ITS as part of the
> GICv3 init routine and initialize its base address.
> 
> Signed-off-by: Eric Auger 
> ---
>  arm/Makefile.common|  1 +
>  lib/arm/asm/gic-v3-its.h   | 22 ++
>  lib/arm/asm/gic.h  |  1 +
>  lib/arm/gic-v3-its.c   |  9 +
>  lib/arm/gic.c  | 30 +-
>  lib/arm64/asm/gic-v3-its.h |  1 +
>  6 files changed, 59 insertions(+), 5 deletions(-)
>  create mode 100644 lib/arm/asm/gic-v3-its.h
>  create mode 100644 lib/arm/gic-v3-its.c
>  create mode 100644 lib/arm64/asm/gic-v3-its.h
> 
> diff --git a/arm/Makefile.common b/arm/Makefile.common
> index 6c0898f..070f349 100644
> --- a/arm/Makefile.common
> +++ b/arm/Makefile.common
> @@ -47,6 +47,7 @@ cflatobjs += lib/arm/bitops.o
>  cflatobjs += lib/arm/psci.o
>  cflatobjs += lib/arm/smp.o
>  cflatobjs += lib/arm/gic.o lib/arm/gic-v2.o lib/arm/gic-v3.o
> +cflatobjs += lib/arm/gic-v3-its.o
>  
>  libeabi = lib/arm/libeabi.a
>  eabiobjs = lib/arm/eabi_compat.o
> diff --git a/lib/arm/asm/gic-v3-its.h b/lib/arm/asm/gic-v3-its.h
> new file mode 100644
> index 000..2044565
> --- /dev/null
> +++ b/lib/arm/asm/gic-v3-its.h
> @@ -0,0 +1,22 @@
> +/*
> + * All ITS* defines are lifted from include/linux/irqchip/arm-gic-v3.h
> + *
> + * Copyright (C) 2016, Red Hat Inc, Andrew Jones 

s/Andrew/Eric/

> + *
> + * This work is licensed under the terms of the GNU LGPL, version 2.
> + */
> +#ifndef _ASMARM_GIC_V3_ITS_H_
> +#define _ASMARM_GIC_V3_ITS_H_
> +
> +#ifndef __ASSEMBLY__
> +
> +struct its_data {
> + void *base;
> +};
> +
> +extern struct its_data its_data;
> +
> +#define gicv3_its_base() (its_data.base)

Can't we just add the ITS base address to the current gicv3_data struct?

> +
> +#endif /* !__ASSEMBLY__ */
> +#endif /* _ASMARM_GIC_V3_ITS_H_ */
> diff --git a/lib/arm/asm/gic.h b/lib/arm/asm/gic.h
> index ea5fde9..73d4502 100644
> --- a/lib/arm/asm/gic.h
> +++ b/lib/arm/asm/gic.h
> @@ -30,6 +30,7 @@
>  
>  #include 
>  #include 
> +#include 
>  
>  #ifndef __ASSEMBLY__
>  #include 
> diff --git a/lib/arm/gic-v3-its.c b/lib/arm/gic-v3-its.c
> new file mode 100644
> index 000..e382b80
> --- /dev/null
> +++ b/lib/arm/gic-v3-its.c
> @@ -0,0 +1,9 @@
> +/*
> + * Copyright (C) 2016, Red Hat Inc, Eric Auger 
> + *
> + * This work is licensed under the terms of the GNU LGPL, version 2.
> + */
> +#include 
> +
> +struct its_data its_data;
> +
> diff --git a/lib/arm/gic.c b/lib/arm/gic.c
> index 957a146..e551abd 100644
> --- a/lib/arm/gic.c
> +++ b/lib/arm/gic.c
> @@ -6,6 +6,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  struct gic_common_ops *gic_common_ops;
>  
> @@ -17,12 +18,14 @@ struct gicv3_data gicv3_data;
>   * Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.txt
>   */
>  static bool
> -gic_get_dt_bases(const char *compatible, void **base1, void **base2)
> +gic_get_dt_bases(const char *compatible, void **base1, void **base2,
> +  void **base3)
>  {
>   struct dt_pbus_reg reg;
> - struct dt_device gic;
> + struct dt_device gic, its;
>   struct dt_bus bus;
> - int node, ret;
> + int node, subnode, ret, len;
> + const void *fdt = dt_fdt();
>  
>   dt_bus_init_defaults(&bus);
>   dt_device_init(&gic, &bus, NULL);
> @@ -43,19 +46,36 @@ gic_get_dt_bases(const char *compatible, void **base1, void **base2)
>   assert(ret == 0);
>   *base2 = ioremap(reg.addr, reg.size);
>  
> + if (base3 && !strcmp(compatible, "arm,gic-v3")) {
> + dt_for_each_subnode(node, subnode) {
> + const struct fdt_property *prop;
> +
> + prop = fdt_get_property(fdt, subnode,
> + "compatible", &len);
> + if (!strcmp((char *)prop->data, "arm,gic-v3-its")) {
> + dt_device_bind_node(&its, subnode);
> + ret = dt_pbus_translate(&its, 0, &reg);
> + assert(ret == 0);
> + *base3 = ioremap(reg.addr, reg.size);
> + break;
> + }
> + }
> +
> + }
> +
>   return true;
>  }
>  
>  int gicv2_init(void)
>  {
>   return gic_get_dt_bases("arm,cortex-a15-gic",
> - &gicv2_data.dist_base, &gicv2_data.cpu_base);
> + &gicv2_data.dist_base, &gicv2_data.cpu_base, NULL);
>  }
>  
>  int gicv3_init(void)
>  {
>   return gic_get_dt_bases("arm,gic-v3", &gicv3_data.dist_base,
> - &gicv3_data.redist_base[0]);
> + &gicv3_data.redist_base[0], &its_data.base);
>  }
>  
>  int gic_init(void)
> diff --git a/lib/arm64/asm/gic-v3-its.h b/lib/arm64/asm/gic-v3-its.h
> new file mode 100644
> index 000..083cba4
> --- /dev/null
> +++ b/lib/arm64/asm/gi

Re: [Qemu-devel] [Qemu-block] [PATCH] doc: Propose NBD_FLAG_INIT_ZEROES extension

2016-12-06 Thread Kevin Wolf
On 06.12.2016 at 00:42, Eric Blake wrote:
> While not directly related to NBD_CMD_WRITE_ZEROES, the qemu
> team discovered that it is useful if a server can advertise
> whether an export is in a known-all-zeroes state at the time
> the client connects.

Does a server usually have the information to set this flag, other than
querying the block status of all blocks at startup? If so, the client
could just query this by itself.

The patch that was originally sent to qemu-devel just forwarded qemu's
.bdrv_has_zero_init() call to the server. However, what this function
returns is not a known-all-zeroes state on open, but just a
known-all-zeroes state immediately after bdrv_create(), i.e. creating a
new image. Then it becomes information that is easy to get and doesn't
involve querying all blocks (e.g. true for COW image formats, true for
raw on regular files, false for raw on block devices).

This is useful for 'qemu-img convert', which creates an image and then
writes the whole contents, but I'm not sure if this property is
applicable for NBD, which I think doesn't even have a create operation.

Kevin



Re: [Qemu-devel] [PATCH v4 00/13] aio: experimental virtio-blk polling mode

2016-12-06 Thread Stefan Hajnoczi
On Mon, Dec 05, 2016 at 10:40:49AM -0600, Karl Rister wrote:
> On 12/05/2016 08:56 AM, Stefan Hajnoczi wrote:
> 
> 
> > Karl: do you have time to run a bigger suite of benchmarks to identify a
> > reasonable default poll-max-ns value?  Both aio=native and aio=threads
> > are important.
> > 
> > If there is a sweet spot that improves performance without pathological
> > cases then we could even enable polling by default in QEMU.
> > 
> > Otherwise we'd just document the recommended best polling duration as a
> > starting point for users.
> > 
> 
> I have collected a baseline on the latest patches and am currently
> collecting poll-max-ns=16384.  I can certainly throw in a few more
> scenarios.  Do we want to stick with powers of 2 or some other strategy?

Excellent, thanks!  The algorithm self-tunes by doubling poll time so
there's not much advantage to looking at non pow-2 values.
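
To illustrate why a power-of-two grid is a natural fit for the benchmark runs, here is a minimal sketch of a doubling self-tuner. It is not the code from this series: poll_ns_adjust(), POLL_START_NS and the drop-to-zero policy are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical starting window; not taken from the series. */
#define POLL_START_NS 4000

/*
 * Return the next polling window, given the current one and how long the
 * last blocking wait actually took.
 */
static int64_t poll_ns_adjust(int64_t poll_ns, int64_t block_ns,
                              int64_t poll_max_ns)
{
    assert(poll_ns >= 0 && poll_max_ns >= 0);

    if (block_ns <= poll_ns) {
        /* The event arrived within the window: keep it. */
        return poll_ns;
    }
    if (block_ns > poll_max_ns) {
        /* Even the longest allowed poll would have missed it: stop polling. */
        return 0;
    }
    /* The event arrived shortly after we gave up: double the window. */
    poll_ns = poll_ns ? poll_ns * 2 : POLL_START_NS;
    return poll_ns > poll_max_ns ? poll_max_ns : poll_ns;
}
```

Under this policy the window only ever takes the starting value times a power of two, the cap itself, or zero.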

Thanks,
Stefan




Re: [Qemu-devel] [kvm-unit-tests RFC 05/15] arm/arm64: GICv3: add cpu count

2016-12-06 Thread Andrew Jones
On Mon, Dec 05, 2016 at 10:46:36PM +0100, Eric Auger wrote:
> Add a new cpu_count field in gicv3_data indicating the
> number of redistributors. This will be useful for enumeration
> of their resources such as LPI pending tables.

I'm fine with the additional state, but just curious, will it
ever be possible for gicv3.cpu_count != nr_cpus?

> 
> Signed-off-by: Eric Auger 
> ---
>  lib/arm/asm/gic-v3.h | 1 +
>  lib/arm/gic-v3.c | 2 ++
>  2 files changed, 3 insertions(+)
> 
> diff --git a/lib/arm/asm/gic-v3.h b/lib/arm/asm/gic-v3.h
> index ed330af..039b7c2 100644
> --- a/lib/arm/asm/gic-v3.h
> +++ b/lib/arm/asm/gic-v3.h
> @@ -58,6 +58,7 @@ struct gicv3_data {
>   void *dist_base;
>   void *redist_base[NR_CPUS];
>   unsigned int irq_nr;
> + unsigned int cpu_count;
>  };
>  extern struct gicv3_data gicv3_data;
>  
> diff --git a/lib/arm/gic-v3.c b/lib/arm/gic-v3.c
> index 6246221..9921f4d 100644
> --- a/lib/arm/gic-v3.c
> +++ b/lib/arm/gic-v3.c
> @@ -12,12 +12,14 @@ void gicv3_set_redist_base(size_t stride)
>   void *ptr = gicv3_data.redist_base[0];
>   u64 typer;
>  
> + gicv3_data.cpu_count = 0;
>   do {
>   typer = gicv3_read_typer(ptr + GICR_TYPER);
>   if ((typer >> 32) == aff) {
>   gicv3_redist_base() = ptr;
>   return;
>   }
> + gicv3_data.cpu_count++;
>   ptr += stride; /* skip RD_base, SGI_base, etc. */
>   } while (!(typer & GICR_TYPER_LAST));
>  
> -- 
> 2.5.5
> 
> 



Re: [Qemu-devel] [PATCH for-2.8] qdev: apply global properties in reverse order

2016-12-06 Thread Greg Kurz
On Mon, 5 Dec 2016 15:41:30 -0200
Eduardo Habkost  wrote:

> On Mon, Dec 05, 2016 at 06:25:55PM +0100, Cornelia Huck wrote:
> > On Mon, 5 Dec 2016 14:48:29 -0200
> > Eduardo Habkost  wrote:
> >   
> > > On Mon, Dec 05, 2016 at 04:42:00PM +0100, Cornelia Huck wrote:  
> > > > On Mon, 05 Dec 2016 16:21:22 +0100
> > > > Greg Kurz  wrote:
> > > >   
> > > > > The current code recursively applies global properties from child up
> > > > > to parent. So, if you have:
> > > > > 
> > > > > -global virtio-pci.disable-modern=on
> > > > > -global virtio-blk-pci.disable-modern=off
> > > > > 
> > > > > Then the default value of disable-modern for a virtio-blk-pci device
> > > > > is on, which looks wrong from an OOP perspective.
> > > > > 
> > > > > This patch reverses the logic, so that a child property always
> > > > > prevails.
> > > > 
> > > > This sounds reasonable...
> > > >   
> > > > > 
> > > > > This fixes a subtle bug that got introduced in 2.7 with commit
> > > > > "9a4c0e220d8a hw/virtio-pci: fix virtio behaviour" for older (< 2.7)
> > > > > machine types: the HW_COMPAT_2_6 macro contains global
> > > > > virtio-pci.disable-* properties which would silently override global
> > > > > properties passed on the command line for virtio subtypes.
> > > > > 
> > > > > Signed-off-by: Greg Kurz 
> > > > > ---
> > > > > 
> > > > > AFAIK, libvirt's XML doesn't know about modern/legacy modes for virtio
> > > > > devices. Early adopters of virtio 1.0 had to rely on the 
> > > > > 
> > > > > tag to pass global properties to QEMU. This patch ensures that XML
> > > > > files used with older machine types remain valid with newer versions
> > > > > of QEMU.
> > > > > 
> > > > > FWIW I guess it could help to have this fix in 2.8, and also probably
> > > > > in 2.7.1.
> > > > 
> > > > ...but I'm a bit worried about doing that change this late in the
> > > > cycle, as we may introduce subtle changes for other configurations. At
> > > > the very least, we should look over the existing backwards compat
> > > > properties (I'll look at those I'm familiar with).  
> > > 
> > > This patch would change the behavior for:
> > >  -global virtio-blk-pci.disable-modern=on
> > >  -global virtio-pci.disable-modern=off
> > > 
> > > And I am not sure the new behavior would be correct. Shouldn't we
> > > apply the properties in the order specified in the command-line?  
> > 
> > Probably; but how should this interact with compat props?  
> 
> compat props should be always applied in the order they appear.
> -global should always be applied after compat props.
> 

This is actually the way they're being registered to the global_props
static list: compat props as they appear in HW_COMPAT_* and then -global
as they appear on the command line.

> So, it looks like we have two additional reasons to just follow
> the order the global properties were registered.
> 

Thinking again, maybe we just need to reverse the logic in another
way: go through global_props and apply the property if the device
can be cast to the corresponding class (i.e. object_class_dynamic_cast()
!= NULL). I'll try that.
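
A rough sketch of that idea, using a toy class model rather than QOM: struct toy_class, struct toy_global, toy_is_a() and toy_apply_globals() are all hypothetical names, and toy_is_a() only stands in for an object_class_dynamic_cast() != NULL check.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy single-inheritance class description (hypothetical, not QOM). */
struct toy_class {
    const char *name;
    const struct toy_class *parent;
};

struct toy_global {
    const char *driver;   /* class name the property was registered for */
    int value;            /* stands in for the property value */
};

/* Rough stand-in for object_class_dynamic_cast() != NULL. */
static int toy_is_a(const struct toy_class *c, const char *name)
{
    for (; c; c = c->parent) {
        if (strcmp(c->name, name) == 0) {
            return 1;
        }
    }
    return 0;
}

/*
 * Walk the list in registration order; every matching entry overwrites
 * the previous one, so the last matching entry wins.
 */
static int toy_apply_globals(const struct toy_class *dev_class,
                             const struct toy_global *props, size_t n,
                             int initial)
{
    int value = initial;
    size_t i;

    assert(dev_class != NULL);
    for (i = 0; i < n; i++) {
        if (toy_is_a(dev_class, props[i].driver)) {
            value = props[i].value;
        }
    }
    return value;
}
```

With the list ordered as compat props first and -global options after them, a later -global for either the parent or the subtype would override an earlier compat entry, and two -global options would take effect in command-line order.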

> >   
> > > 
> > > On either case, changing the semantics of the command-line can
> > > break existing configurations. Let's do it more carefully in the
> > > 2.9 cycle, and fix the existing bug by changing the HW_COMPAT_*
> > > macros?  
> > 
> > Changing the compat props is probably the best option at this point in
> > time. Let's take this slowly so we can come up with a reasonable
> > solution for 2.9.  
> 
> Agreed.
> 




Re: [Qemu-devel] [PATCH for-2.9 1/3] crypto: add standard des support

2016-12-06 Thread Gonglei (Arei)
>
> > > > > >  }
> > > > > > diff --git a/qapi/crypto.json b/qapi/crypto.json
> > > > > > index 5c9d7d4..d403ab9 100644
> > > > > > --- a/qapi/crypto.json
> > > > > > +++ b/qapi/crypto.json
> > > > > > @@ -75,7 +75,7 @@
> > > > > >  { 'enum': 'QCryptoCipherAlgorithm',
> > > > > >'prefix': 'QCRYPTO_CIPHER_ALG',
> > > > > >'data': ['aes-128', 'aes-192', 'aes-256',
> > > > > > -   'des-rfb',
> > > > > > +   'des-rfb', 'des',
> > > > >
> > > > > Can we call this '3des' to make it clear that this is Triple-DES and 
> > > > > not
> > > > > the single-DES (which des-rfb is)
> > > > >
> > > > > Actually the current des is not triple-DES, just the single-DES, and
> > > > > des-rfb in QEMU is just a variant of single DES, which changes the
> > > > > standard key by calling qcrypto_cipher_munge_des_rfb_key().
> > > >
> > > > I think we can add the 3des support as well in the next step.
> > > >
> > > > The current single-DES in the patch set is ok to me. :)
> > >
> > > Per my other reply in this thread,
> >
> > I saw that, thanks for your information, Daniel.
> >
> > > I don't think we should be supporting
> > > single-DES at all in QEMU / cryptodev. So IMHO, the correct fix is to
> > > remove the single-DES support from cryptodev entirely
> > >
> > The cryptodev-builtin is one kind of cryptodev backends. It provides the
> > real crypto capability for virtio crypto device.
> >
> > I don't think we should artificially remove one algorithm support if
> > the frontend driver (users) wants to use it, though the algorithm is
> > unsafe.
> 
> IIUC the cryptodev hardware is ultimately about allowing the guest
> to offload crypto operations to the host, potentially using hardware
> acceleration. If the cryptodev backend doesn't support a particular
> algorithm, the guest is still capable of using its own built-in
> support for that algorithm. I see no compelling reason to provide
> host offload / acceleration for single-DES. Just kill this obsolete
> algorithm from cryptodev and in the unlikely event that a guest
> really does want single-DES it can use its built-in impl instead.
> 
Makes sense. I don't want to support single-DES in the virtio-crypto
frontend driver either. The guest will use its software implementation.

Thanks,
-Gonglei


Re: [Qemu-devel] [PATCH for-2.9 15/17] target-i386: Define static "base" CPU model

2016-12-06 Thread David Hildenbrand

On 06.12.2016 at 00:57, Eduardo Habkost wrote:

On Mon, Dec 05, 2016 at 07:18:47PM +0100, David Hildenbrand wrote:

On 02.12.2016 at 22:18, Eduardo Habkost wrote:

The query-cpu-model-expand QMP command needs at least one static
model, to allow the "static" expansion mode to be implemented.
Instead of defining static versions of every CPU model, define a
"base" CPU model that has absolutely no feature flag enabled.



Introducing separate ones makes feature lists presented to the user much
shorter (and therefore easier to maintain). But I don't know how libvirt
wants to deal with models on x86 in the future.


I understand that having a larger set of static models would make
expansions shorter. But I worry that by defining a complete set
of static models on x86 would require extra maintenance work on
the QEMU side with no visible benefit for libvirt.


As static models will never change (in theory), the maintenance work should
be pretty much down to zero. But the initial implementation and coming
up with the models requires work (my experience ;) ).


I am not against the "base" model (actually it is really pretty nice to 
have). Using only that somehow smells like the "user" cpu model 
discussion. Which might be ok for x86.




I would like to hear from libvirt developers what they think. I
still don't know what they plan to use the type=static expansion
results for.



How long is the static expansion on a recent intel CPU?


CPU model "Broadwell" returns 165 entries on return.model.props:

(QEMU) query-cpu-model-expansion type=static model={"name":"Broadwell"}



{"return": {"migration-safe": true, "model": {"name": "base", "props": {"pfthreshold": false, "pku": false, "rtm": true, "tsc-deadline": true, "xstore-en": false, "tsc-scale": false, "abm": true, "ia64": false, "kvm-mmu": false, "xsaveopt": true, "tce": false, "smep": true, "fpu": true, "xcrypt": false, "clflush": true, "flushbyasid": false, "kvm-steal-time": false, "lm": true, "tsc": true, "adx": true, "fxsr": true, "tm": false, "xgetbv1": false, "xstore": false, "vme": false, "vendor": "GenuineIntel", "arat": true, "de": true, "aes": true, "pse": true, "ds-cpl": false, "tbm": false, "sse": true, "phe-en": false, "f16c": true, "ds": false, "mpx": false, "tsc-adjust": false, "avx512f": false, "avx2": true, "pbe": false, "cx16": true, "avx512pf": false, "movbe": true, "perfctr-nb": false, "ospke": false, "avx512ifma": false, "stepping": 2, "sep": true, "sse4a": false, "avx512dq": false, "avx512-4vnniw": false, "xsave": true, "pmm": false, "hle": true, "est": false, "xop": false, "smx": false, "monitor": false, "avx512er": false, "apic": true, "sse4.1": true, "sse4.2": true, "pause-filter": 
false, "lahf-lm": true, "kvm-nopiodelay": false, "acpi": false, "mmx": true, "osxsave": false, "pcommit": false, "mtrr": true, "clwb": false, "dca": false, "pdcm": false, "xcrypt-en": false, "3dnow": false, "invtsc": false, "tm2": false, "hypervisor": true, "kvmclock-stable-bit": false, "fxsr-opt": false, "pcid": true, "lbrv": false, "avx512-4fmaps": false, "svm-lock": false, "popcnt": true, "nrip-save": false, "avx512vl": false, "x2apic": true, "kvmclock": false, "smap": true, "family": 6, "min-level": 13, "dtes64": false, "ace2": false, "fma4": false, "xtpr": false, "avx512bw": false, "nx": true, "lwp": false, "msr": true, "ace2-en": false, "decodeassists": false, "perfctr-core": false, "pge": true, "pn": false, "fma": true, "nodeid-msr": false, "cx8": true, "mce": true, "avx512cd": false, "cr8legacy": false, "mca": true, "pni": true, "rdseed": true, "osvw": false, "fsgsbase": true, "model-id": "Intel Core Processor (Broadwell)", "cmp-legacy": false, "kvm-pv-unhalt": false, "rdtscp": true, "mmxext": false, "cid": false, "vmx": false, "ssse3": true, "extapic": false, "pse36": true, "min-xlevel": 2147483656, "ibs": false, "avx": 
true, "syscall": true, "umip": false, "invpcid": true, "bmi1": true, "bmi2": true, "vmcb-clean": false, "erms": true, "cmov": true, "misalignsse": false, "clflushopt": false, "pat": true, "3dnowprefetch": true, "rdpid": false, "pae": true, "wdt": false, "skinit": false, "pmm-en": false, "phe": false, "3dnowext": false, "lmce": false, "ht": false, "pdpe1gb": false, "kvm-pv-eoi": false, "npt": false, "xsavec": false, "pclmulqdq": true, "svm": false, "sse2": true, "ss": false, "topoext": false, "rdrand": true, "avx512vbmi": false, "kvm-asyncpf": false, "xsaves": false, "model": 61}}, "static": true}}




Wow, yes, that was the reason for me to introduce abstractions on s390x.
But here the plan was to use the expansion directly when indicating the
"host" model to the user. Having something like "Broadwell-base"+/- a 
handful of features is just easier to handle than "base" with 165 
feature flags. But as we don't know what libvirt plans are (they could 
use that interface on x86 to do feature detection only and convert to 
models themselves), I also have no idea what would be best in the 
context of x86 cpu models.


--

David


Re: [Qemu-devel] [kvm-unit-tests RFC 05/15] arm/arm64: GICv3: add cpu count

2016-12-06 Thread Andre Przywara
Hi,

On 06/12/16 09:29, Andrew Jones wrote:
> On Mon, Dec 05, 2016 at 10:46:36PM +0100, Eric Auger wrote:
>> Add a new cpu_count field in gicv3_data indicating the
>> number of redistributors. This will be useful for enumeration
>> of their resources such as LPI pending tables.
> 
> I'm fine with the additional state, but just curious, will it
> ever be possible for gicv3.cpu_count != nr_cpus?

If not you are in trouble, so that should in fact be one test.

Which brings me to my comment ...

>>
>> Signed-off-by: Eric Auger 
>> ---
>>  lib/arm/asm/gic-v3.h | 1 +
>>  lib/arm/gic-v3.c | 2 ++
>>  2 files changed, 3 insertions(+)
>>
>> diff --git a/lib/arm/asm/gic-v3.h b/lib/arm/asm/gic-v3.h
>> index ed330af..039b7c2 100644
>> --- a/lib/arm/asm/gic-v3.h
>> +++ b/lib/arm/asm/gic-v3.h
>> @@ -58,6 +58,7 @@ struct gicv3_data {
>>  void *dist_base;
>>  void *redist_base[NR_CPUS];
>>  unsigned int irq_nr;
>> +unsigned int cpu_count;

Should that be called "nr_redists" or the like?
Since this is what it counts in the code below.
Later we can then compare this with nr_cpus to check for a match.
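
A minimal sketch of such a count, kept separate from the affinity lookup (the loop as quoted returns as soon as the calling CPU's frame matches, so it stops counting there). count_redists() and the typer[] array are hypothetical; typer[] stands in for the GICR_TYPER values read at successive strides, and the bit position used for GICR_TYPER_LAST is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GICR_TYPER_LAST (1U << 4)   /* assumed: Last frame in the region */

/*
 * Count redistributor frames by following the Last bit. max bounds the
 * walk in case no frame has the bit set.
 */
static unsigned int count_redists(const uint64_t *typer, size_t max)
{
    unsigned int nr_redists = 0;
    size_t i;

    assert(typer != NULL);
    for (i = 0; i < max; i++) {
        nr_redists++;
        if (typer[i] & GICR_TYPER_LAST) {
            break;
        }
    }
    return nr_redists;
}
```

The result could then be compared against nr_cpus in a test.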

Cheers,
Andre.

>>  };
>>  extern struct gicv3_data gicv3_data;
>>  
>> diff --git a/lib/arm/gic-v3.c b/lib/arm/gic-v3.c
>> index 6246221..9921f4d 100644
>> --- a/lib/arm/gic-v3.c
>> +++ b/lib/arm/gic-v3.c
>> @@ -12,12 +12,14 @@ void gicv3_set_redist_base(size_t stride)
>>  void *ptr = gicv3_data.redist_base[0];
>>  u64 typer;
>>  
>> +gicv3_data.cpu_count = 0;
>>  do {
>>  typer = gicv3_read_typer(ptr + GICR_TYPER);
>>  if ((typer >> 32) == aff) {
>>  gicv3_redist_base() = ptr;
>>  return;
>>  }
>> +gicv3_data.cpu_count++;
>>  ptr += stride; /* skip RD_base, SGI_base, etc. */
>>  } while (!(typer & GICR_TYPER_LAST));
>>  
>> -- 
>> 2.5.5
>>
>>



Re: [Qemu-devel] [PATCH v4 04/13] virtio: poll virtqueues for new buffers

2016-12-06 Thread Stefan Hajnoczi
On Mon, Dec 05, 2016 at 10:22:12AM -0500, Paolo Bonzini wrote:
> > Anyway, I'm glad the polling series on its own is already showing good
> > performance results.  I'd like to merge it at the beginning of QEMU 2.9
> > so we can work on further improvements that build on top of it.
> 
> Yes, it would be great if you can stage the infrastructure bits, so that
> I can rebase my lockcnt patches on top.

I'll apply it to block-next as soon as Christian and Karl are happy with
the performance results.

Stefan




Re: [Qemu-devel] [PATCH for 2.8 v3 1/1] cadence_uart: Check baud rate generator and divider values on migration

2016-12-06 Thread Peter Maydell
On 5 December 2016 at 18:35, Alistair Francis
 wrote:
> The Cadence UART device emulator calculates speed by dividing the
> baud rate by a 'baud rate generator' & 'baud rate divider' value.
> The device specification defines these register values to be
> non-zero and within certain limits. Checks were recently added when
> writing to these registers but not when restoring from migration.
>
> This patch adds checks when restoring from migration to avoid divide by
> zero errors.
>
> Reported-by: Huawei PSIRT 
> Signed-off-by: Alistair Francis 
> ---
> It would be nice to squeeze this into 2.8 if possible.
>
> V3:
>  - Fix broken migration logic
>  - Manually double checked and it passes migration.
> V2:
>  - Abort the migration if the data is invalid
>
>  hw/char/cadence_uart.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/hw/char/cadence_uart.c b/hw/char/cadence_uart.c
> index 0215d65..ce9063b 100644
> --- a/hw/char/cadence_uart.c
> +++ b/hw/char/cadence_uart.c
> @@ -502,6 +502,13 @@ static int cadence_uart_post_load(void *opaque, int version_id)
>  {
>  CadenceUARTState *s = opaque;
>
> +/* Ensure these two aren't invalid numbers */
> +if (s->r[R_BRGR] <= 1 || s->r[R_BRGR] & ~0x ||
> +s->r[R_BDIV] <= 3 || s->r[R_BDIV] & ~0xFF) {

The uart_write() code says BRGR == 1 is valid, but
this code says it isn't. Which is correct?

thanks
-- PMM



Re: [Qemu-devel] [PULL for-2.8 0/4] vga fixes

2016-12-06 Thread Stefan Hajnoczi
On Mon, Dec 05, 2016 at 12:03:56PM +0100, Gerd Hoffmann wrote:
>   Hi,
> 
> Here is a last-minute pull for 2.8-rc3, bringing some vga fixes.
> 
> Most important one is the qxl fix which is quite user-visible.
> Sorry for submitting that late, it has been lingering in my queue for a
> while and I thought I had it in the last vga pull already, but
> obviously that isn't the case.  If you feel it is too late now it'll
> be -stable instead.
> 
> cheers,
>   Gerd
> 
> The following changes since commit bd8ef5060dd2124a54578241da9a572faf7658dd:
> 
>   Merge remote-tracking branch 'dgibson/tags/ppc-for-2.8-20161201' into 
> staging (2016-12-01 13:39:29 +)
> 
> are available in the git repository at:
> 
> 
>   git://git.kraxel.org/qemu tags/pull-vga-20161205-1
> 
> for you to fetch changes up to 4299b90e9ba9ce5ca9024572804ba751aa1a7e70:
> 
>   display: cirrus: check vga bits per pixel(bpp) value (2016-12-05 11:01:55 
> +0100)
> 
> 
> qxl: fix flickering.
> cirrus: avoid division by zero.
> virtio-gpu: fix two leaks.
> 
> 
> Christophe Fergeau (1):
>   qxl: Only emit QXL_INTERRUPT_CLIENT_MONITORS_CONFIG on config changes
> 
> Li Qiang (2):
>   virtio-gpu: fix information leak in getting capset info dispatch
>   virtio-gpu: fix memory leak in update_cursor_data_virgl
> 
> Prasad J Pandit (1):
>   display: cirrus: check vga bits per pixel(bpp) value
> 
>  hw/display/cirrus_vga.c| 14 ++
>  hw/display/qxl.c   | 37 -
>  hw/display/virtio-gpu-3d.c |  1 +
>  hw/display/virtio-gpu.c|  1 +
>  4 files changed, 48 insertions(+), 5 deletions(-)
> 

Thanks, applied to my staging tree:
https://github.com/stefanha/qemu/commits/staging

Stefan




Re: [Qemu-devel] [PATCH v5 7/9] block: don't make snapshots for filters

2016-12-06 Thread Pavel Dovgalyuk
> From: Kevin Wolf [mailto:kw...@redhat.com]
> On 05.12.2016 at 12:49, Pavel Dovgalyuk wrote:
> > > From: Kevin Wolf [mailto:kw...@redhat.com]
> > > On 05.12.2016 at 08:43, Pavel Dovgalyuk wrote:
> 
> > Record/replay without this option uses '-snapshot' to preserve
> > the state of the disk images.
> >
> > > Anyway, it seems that doing things manually is the safe way as long as
> > > we don't know the final solution, so I think I agree.
> > >
> > > For a slightly more convenient way, one of the problems to solve seems
> > > to be that snapshot=on always affects the top level node and you can't
> > > create a temporary snapshot in the middle of the chain. Perhaps we
> > > should introduce a 'temporary-overlay' driver or something like that, so
> > > that you could specify things like this:
> > >
> > > -drive if=none,driver=file,filename=test.img,id=orig
> > > -drive if=none,driver=temporary-overlay,file=orig,id=snap
> > > -drive if=none,driver=blkreplay,image=snap
> >
> > This seems reasonable for manual way.
> 
> Maybe another, easier to implement option could be something like this:
> 
> -drive if=none,driver=file,filename=test.img,snapshot=on,overlay.node-name=snap
> -drive if=none,driver=blkreplay,image=snap
> 
> It would require that we implement support for overlay.* options like we
> already support backing.* options. Allowing to specify options for the
> overlay node is probably nice to have anyway.
> 
> However, there could be a few tricky parts there. For example, what
> happens if someone uses overlay.backing=something-else? Perhaps
> completely disallowing backing and backing.* for overlays would already
> solve this.
> 
> > > Which makes me wonder... Is blkreplay usable without the temporary
> > > snapshot or is this pretty much a requirement?
> >
> > It's not a requirement. But to make replay deterministic we have to
> > start with the same image every time. As far as I know, this may be achieved by:
> > 1. Restoring original disk image manually
> > 2. Using vm snapshot to start execution from
> > 3. Using -snapshot option
> > 4. Not using disks at all
> >
> > > Because if it has to be
> > > there, the next step could be that blkreplay creates temporary-overlay
> > > internally in its .bdrv_open().
> >
> > Here is your answer about such an approach :)
> > https://lists.gnu.org/archive/html/qemu-devel/2016-09/msg04687.html
> 
> Right, and unfortunately these are still good points.
> 
> Especially the part where you allowed to give the overlay filename
> really needs to work the way it does now with the 'image' option. We
> might not need to be that strict with temporary overlays, restricting to
> qcow2 with default options could be acceptable there - but whatever I
> think of to support both cases results in something that isn't really
> easier than the manual way that we figured out above.

Can we settle on the following?
1. Don't create any overlays automatically when the user wants to
   save/restore VM state.
2. Otherwise, create snapshots, but do not use the -snapshot option.
   Snapshots will be created by blkreplay as in the link specified.

Pavel Dovgalyuk




Re: [Qemu-devel] [PATCH for-2.8] qdev: apply global properties in reverse order

2016-12-06 Thread Stefan Hajnoczi
On Mon, Dec 05, 2016 at 04:21:22PM +0100, Greg Kurz wrote:
> The current code recursively applies global properties from child up to
> parent. So, if you have:
> 
> -global virtio-pci.disable-modern=on
> -global virtio-blk-pci.disable-modern=off
> 
> Then the default value of disable-modern for a virtio-blk-pci device is on,
> which looks wrong from an OOP perspective.
> 
> This patch reverses the logic, so that a child property always prevails.
> 
> This fixes a subtle bug that got introduced in 2.7 with commit "9a4c0e220d8a
> hw/virtio-pci: fix virtio behaviour" for older (< 2.7) machine types: the
> HW_COMPAT_2_6 macro contains global virtio-pci.disable-* properties which
> would silently override global properties passed on the command line for
> virtio subtypes.
> 
> Signed-off-by: Greg Kurz 
> ---
> 
> AFAIK, libvirt's XML doesn't know about modern/legacy modes for virtio
> devices. Early adopters of virtio 1.0 had to rely on the 
> tag to pass global properties to QEMU. This patch ensures that XML files
> used with older machine types remain valid with newer versions of QEMU.
> 
> FWIW I guess it could help to have this fix in 2.8, and also probably in
> 2.7.1.

Hi Greg,
I won't merge this for QEMU 2.8 because this 2.7 issue is not a 2.8
release blocker and it's too risky (good points have been raised in this
thread).

Please target -stable when consensus has been reached.

Thanks,
Stefan




Re: [Qemu-devel] [Qemu-block] [PATCH for-2.8 v2] qcow2: Don't strand clusters near 2G intervals during commit

2016-12-06 Thread Stefan Hajnoczi
On Mon, Dec 05, 2016 at 09:49:34AM -0600, Eric Blake wrote:
> The qcow2_make_empty() function is reached during 'qemu-img commit',
> in order to clear out ALL clusters of an image.  However, if the
> image cannot use the fast code path (true if the image is format
> 0.10, or if the image contains a snapshot), the cluster size is
> larger than 512, and the image is larger than 2G in size, then our
> choice of sector_step causes problems.  Since it is not cluster
> aligned, but qcow2_discard_clusters() silently ignores an unaligned
> head or tail, we are leaving clusters allocated.
> 
> Enhance the testsuite to expose the flaw, and patch the problem by
> ensuring our step size is aligned.
> 
> Signed-off-by: Eric Blake 
> 
> ---
> v2: perform rounding correctly
> ---
>  block/qcow2.c  |   3 +-
>  tests/qemu-iotests/097 |  41 +---
>  tests/qemu-iotests/097.out | 249 
> +
>  3 files changed, 210 insertions(+), 83 deletions(-)

Reviewed-by: Stefan Hajnoczi 
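
For readers following along, the alignment problem described in the commit message can be reproduced with plain arithmetic. The sketch below is not the diff itself (which is not quoted in this mail); step_unaligned(), step_aligned() and SECTOR_SIZE are hypothetical names, and the starting point of INT_MAX bytes per iteration is an assumption based on the "2G intervals" in the subject.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define SECTOR_SIZE 512   /* bytes per sector */

/* Largest per-iteration step, in sectors, without cluster alignment. */
static int64_t step_unaligned(void)
{
    return INT_MAX / SECTOR_SIZE;
}

/* Same limit, rounded down to a whole number of clusters first. */
static int64_t step_aligned(int64_t cluster_size)
{
    assert(cluster_size >= SECTOR_SIZE && cluster_size % SECTOR_SIZE == 0);
    return (INT_MAX / cluster_size * cluster_size) / SECTOR_SIZE;
}
```

With 64 KiB clusters (128 sectors each), the unaligned step is 4194303 sectors, which leaves 127 sectors dangling at every step boundary; the aligned step is 4194176 sectors, an exact multiple of 128. With 512-byte clusters both values coincide, which matches the condition that the cluster size must be larger than 512 for the problem to show up.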




Re: [Qemu-devel] [kvm-unit-tests RFC 00/15] arm/arm64: add ITS framework

2016-12-06 Thread Andrew Jones
On Mon, Dec 05, 2016 at 10:46:31PM +0100, Eric Auger wrote:
> This series proposes a framework to test the virtual ITS.
> This is based on Drew's v7 series [1]. The last patch tests
> several ITS commands (collection/device mapping, interrupt
> translation service entry creation and LPI trigger through INT
> command). At this point we don't use any external PCIe device
> to write into the GITS_TRANSLATER register.
> 
> The bulk of the code derives from the ITS driver code so all
> the credit is due to Marc.
> 
> Many other ITS commands could be tested. Also existing MMIO
> accesses could be enhanced into standalone tests. Current focus
> was to make it functional.
> 
> The code deserves more cleanup with respect to cacheability
> attributes in general.
> 
> Tested on Cavium ThunderX [2].
> 
> Best Regards
> 
> Eric
> 
> [1] [kvm-unit-tests PATCH v7 00/11] arm/arm64: add gic framework
> 
> [2] sample command line:
> 
> $QEMU -machine virt,accel=kvm -cpu host \
>  -device virtio-serial-device \
>  -device virtconsole,chardev=ctd -chardev testdev,id=ctd \
>  -display none -serial stdio \
>  -kernel arm/gic.flat \
>  -smp 8 -machine gic-version=3 -append 'its'
> 
> Eric Auger (15):
>   libcflat: Add other size defines
>   arm/arm64: gicv3: Add some re-distributor defines
>   arm/arm64: ITS skeleton
>   arm/arm64: ITS: BASER parsing and setup
>   arm/arm64: GICv3: add cpu count
>   arm/arm64: ITS: Set the LPI config and pending tables
>   arm/arm64: ITS: Init the command queue
>   arm/arm64: ITS: enable LPIs at re-distributor level
>   arm/arm64: ITS: Parse the typer register
>   arm/arm64: ITS: its_enable_defaults
>   arm/arm64: ITS: create device
>   arm/arm64: ITS: create collection
>   arm/arm64: ITS: commands
>   arm/arm64: gic: Generalize ipi_enable()
>   arm/arm64: ITS test
> 
>  arm/Makefile.common|   1 +
>  arm/gic.c  | 101 +++-
>  lib/arm/asm/gic-v3-its.h   | 238 +++
>  lib/arm/asm/gic-v3.h   |  84 ++
>  lib/arm/asm/gic.h  |   1 +
>  lib/arm/gic-v3-its-cmd.c   | 399 +
>  lib/arm/gic-v3-its.c   | 305 ++
>  lib/arm/gic-v3.c   |   2 +
>  lib/arm/gic.c  |  30 +++-
>  lib/arm64/asm/gic-v3-its.h |   1 +
>  lib/libcflat.h |   3 +
>  11 files changed, 1154 insertions(+), 11 deletions(-)
>  create mode 100644 lib/arm/asm/gic-v3-its.h
>  create mode 100644 lib/arm/gic-v3-its-cmd.c
>  create mode 100644 lib/arm/gic-v3-its.c
>  create mode 100644 lib/arm64/asm/gic-v3-its.h
> 
> -- 
> 2.5.5
> 
>

Thanks for this Eric! I'm glad to see we're getting more GIC test
coverage written, even before v8 of the gic series is posted :-)
v8 will be rebased on some sysreg stuff Wei is doing for the PMU
series, that's why it's held up. I'll need to set plenty of time
aside to learn enough in order to review all the 'ITS:' patches
in this series. Apologies if I can't get to it right away.

Thanks again,
drew



[Qemu-devel] [PATCH for-2.8] virtio-crypto: zeroize the key material before free

2016-12-06 Thread Gonglei
Zeroize the memory of the CryptoDevBackendSymOpInfo structure before
freeing it, to protect the key material it holds.

Signed-off-by: Gonglei 
---
 hw/virtio/virtio-crypto.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-crypto.c b/hw/virtio/virtio-crypto.c
index 2f2467e..ecb19b6 100644
--- a/hw/virtio/virtio-crypto.c
+++ b/hw/virtio/virtio-crypto.c
@@ -337,7 +337,18 @@ static void virtio_crypto_free_request(VirtIOCryptoReq *req)
 {
 if (req) {
 if (req->flags == CRYPTODEV_BACKEND_ALG_SYM) {
-g_free(req->u.sym_op_info);
+size_t max_len;
+CryptoDevBackendSymOpInfo *op_info = req->u.sym_op_info;
+
+max_len = op_info->iv_len +
+  op_info->aad_len +
+  op_info->src_len +
+  op_info->dst_len +
+  op_info->digest_result_len;
+
+/* Zeroize and free request data structure */
+memset(op_info, 0, sizeof(*op_info) + max_len);
+g_free(op_info);
 }
 g_free(req);
 }
-- 
1.8.3.1





Re: [Qemu-devel] [PULL 0/9] QAPI patches for 2016-12-05

2016-12-06 Thread Stefan Hajnoczi
On Mon, Dec 05, 2016 at 05:45:04PM +0100, Markus Armbruster wrote:
> A set of fixes for MacOS from Eric, and a little documentation
> polishing from Marc-André.  The diffstat for the .json may look scary,
> but the generated code is identical.
> 
> The following changes since commit bd8ef5060dd2124a54578241da9a572faf7658dd:
> 
>   Merge remote-tracking branch 'dgibson/tags/ppc-for-2.8-20161201' into staging (2016-12-01 13:39:29 +)
> 
> are available in the git repository at:
> 
>   git://repo.or.cz/qemu/armbru.git tags/pull-qapi-2016-12-05
> 
> for you to fetch changes up to 5072f7b38b1b9b26b8fbe1a89086386a420aded8:
> 
>   qapi: add missing colon-ending for section name (2016-12-05 17:41:38 +0100)
> 
> 
> QAPI patches for 2016-12-05
> 
> 
> Eric Blake (3):
>   qmp-event: Avoid qobject_from_jsonf("%"PRId64)
>   test-qga: Avoid qobject_from_jsonv("%"PRId64)
>   tests: Avoid qobject_from_jsonf("%"PRId64)
> 
> Marc-André Lureau (6):
>   qga/schema: fix double-return in doc
>   qapi: fix schema symbol sections
>   qapi: fix missing symbol @prefix
>   qapi: fix various symbols mismatch in documentation
>   qapi: use one symbol per line
>   qapi: add missing colon-ending for section name
> 
>  qapi-schema.json   | 346 +++--
>  qapi/block-core.json   | 209 +++---
>  qapi/block.json|  16 +-
>  qapi/common.json   |  14 +-
>  qapi/crypto.json   |  36 ++--
>  qapi/event.json|  58 +++
>  qapi/introspect.json   |  28 +--
>  qapi/qmp-event.c   |  17 +-
>  qapi/rocker.json   |   2 +-
>  qapi/trace.json|   8 +-
>  qga/qapi-schema.json   |  52 +++---
>  tests/check-qjson.c|   6 +-
>  tests/test-qga.c   |   7 +-
>  tests/test-qobject-input-visitor.c |   5 +-
>  14 files changed, 406 insertions(+), 398 deletions(-)
> 
> -- 
> 2.5.5
> 
> 

Thanks, applied to my staging tree:
https://github.com/stefanha/qemu/commits/staging

Stefan




Re: [Qemu-devel] [PULL 0/1] target-arm queue

2016-12-06 Thread Stefan Hajnoczi
On Mon, Dec 05, 2016 at 05:59:34PM +, Peter Maydell wrote:
> One patch for 2.8-rc3, which is Alex's partial revert
> of 1dd089d0 to fix A64 ldaxp.
> 
> thanks
> -- PMM
> 
> The following changes since commit bc66cedb4141fb7588f2462c74310d8fb5dd4cf1:
> 
>   Merge remote-tracking branch 'yongbok/tags/mips-20161204' into staging (2016-12-05 10:56:45 +)
> 
> are available in the git repository at:
> 
>   git://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20161205
> 
> for you to fetch changes up to 5460da501a57cd72eda6fec736d76539122e2f99:
> 
>   target-arm/translate-a64: fix gen_load_exclusive (2016-12-05 17:52:01 +)
> 
> 
> target-arm queue:
>  * fix gen_load_exclusive handling of ldaxp
> 
> 
> Alex Bennée (1):
>   target-arm/translate-a64: fix gen_load_exclusive
> 
>  target-arm/translate-a64.c | 42 +++---
>  1 file changed, 19 insertions(+), 23 deletions(-)
> 

Thanks, applied to my staging tree:
https://github.com/stefanha/qemu/commits/staging

Stefan




Re: [Qemu-devel] [PATCH for-2.8] qdev: apply global properties in reverse order

2016-12-06 Thread Greg Kurz
On Tue, 6 Dec 2016 09:47:08 +
Stefan Hajnoczi  wrote:

> On Mon, Dec 05, 2016 at 04:21:22PM +0100, Greg Kurz wrote:
> > The current code recursively applies global properties from child up to
> > parent. So, if you have:
> > 
> > -global virtio-pci.disable-modern=on
> > -global virtio-blk-pci.disable-modern=off
> > 
> > Then the default value of disable-modern for a virtio-blk-pci device is on,
> > which looks wrong from an OOP perspective.
> > 
> > This patch reverses the logic, so that a child property always prevail.
> > 
> > This fixes a subtle bug that got introduced in 2.7 with commit "9a4c0e220d8a
> > hw/virtio-pci: fix virtio behaviour" for older (< 2.7) machine types: the
> > HW_COMPAT_2_6 macro contains global virtio-pci.disable-* properties which
> > would silently override global properties passed on the command line for
> > virtio subtypes.
> > 
> > Signed-off-by: Greg Kurz 
> > ---
> > 
> > AFAIK, libvirt's XML doesn't know about modern/legacy modes for virtio
> > devices. Early adopters of virtio 1.0 had to rely on the 
> > tag to pass global properties to QEMU. This patch ensures that XML files
> > used with older machine types remain valid with newer versions of QEMU.
> > 
> > FWIW I guess it could help to have this fix in 2.8, and also probably in
> > 2.7.1.  
> 
> Hi Greg,
> I won't merge this for QEMU 2.8 because this 2.7 issue is not a 2.8
> release blocker and it's too risky (good points have been raised in this
> thread).
> 
> Please target -stable when consensus has been reached.
> 

Sure I'll just do that.

Cheers.

--
Greg

> Thanks,
> Stefan





Re: [Qemu-devel] [RFC PATCH] glusterfs: allow partial reads

2016-12-06 Thread Kevin Wolf
Am 05.12.2016 um 09:26 hat Wolfgang Bumiller geschrieben:
> On Fri, Dec 02, 2016 at 01:13:28PM -0600, Eric Blake wrote:
> > On 12/01/2016 04:59 AM, Wolfgang Bumiller wrote:
> > > Fixes #1644754.
> > > 
> > > Signed-off-by: Wolfgang Bumiller 
> > > ---
> > > I'm not sure what the original rationale was to treat both partial
> > > > reads as well as writes as I/O error. (Seems to have happened
> > > from original glusterfs v1 to v2 series with a note but no reasoning
> > > for the read side as far as I could see.)
> > > The general direction lately seems to be to move away from sector
> > > based block APIs. Also eg. the NFS code allows partial reads. (It
> > > does, however, have an old patch (c2eb918e3) dedicated to aligning
> > > sizes to 512 byte boundaries for file creation for compatibility to
> > > other parts of qemu like qcow2. This already happens in glusterfs,
> > > though, but if you move a file from a different storage over to
> > > glusterfs you may end up with a qcow2 file with eg. the L1 table in
> > > the last 80 bytes of the file aligned to _begin_ at a 512 boundary,
> > > but not _end_ at one.)

Hm, does this really happen? I always thought that the file size of
qcow2 images is aligned to the cluster size. If it isn't, maybe we
should fix that.

> > >  block/gluster.c | 10 +-
> > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/block/gluster.c b/block/gluster.c
> > > index 891c13b..3db0bf8 100644
> > > --- a/block/gluster.c
> > > +++ b/block/gluster.c
> > > @@ -41,6 +41,7 @@ typedef struct GlusterAIOCB {
> > >  int ret;
> > >  Coroutine *coroutine;
> > >  AioContext *aio_context;
> > > +bool is_write;
> > >  } GlusterAIOCB;
> > >  
> > >  typedef struct BDRVGlusterState {
> > > @@ -716,8 +717,10 @@ static void gluster_finish_aiocb(struct glfs_fd *fd, 
> > > ssize_t ret, void *arg)
> > >  acb->ret = 0; /* Success */
> > >  } else if (ret < 0) {
> > >  acb->ret = -errno; /* Read/Write failed */
> > > +} else if (acb->is_write) {
> > > +acb->ret = -EIO; /* Partial write - fail it */
> > >  } else {
> > > -acb->ret = -EIO; /* Partial read/write - fail it */
> > > +acb->ret = 0; /* Success */
> > 
> > Does this properly guarantee that the portion beyond EOF reads as zero?
> 
> I'd argue this wasn't necessarily the case before either, considering
> the first check starts with `!ret`:
> 
> if (!ret || ret == acb->size) {
> acb->ret = 0; /* Success */
> 
> A read right at EOF would return 0 and be treated as success there, no?

Yes, this is a bug.

I guess this was the lazy way that "usually" works both for
reads/writes, which return a positive number of bytes, and for things
like flush which return 0 on success. But the callback really needs to
distinguish these cases and apply different checks.
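
As a minimal sketch of what "different checks" could look like: the
ReqKind enum and finish_request() below are hypothetical names, not the
gluster driver's code, and zero-filling the tail of a short read is only
one possible policy (it is what Eric's question about the portion beyond
EOF would require).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical request kinds; the proposed patch only has an is_write flag. */
typedef enum { REQ_READ, REQ_WRITE, REQ_FLUSH } ReqKind;

/*
 * Map the raw completion value 'ret' to 0 or a negative errno.
 * 'saved_errno' is the errno the caller captured when ret < 0.
 * For short reads, the unread tail of 'buf' is zero-filled.
 */
static int finish_request(ReqKind kind, ssize_t ret, int saved_errno,
                          void *buf, size_t size)
{
    if (ret < 0) {
        return -saved_errno;                    /* the operation itself failed */
    }
    switch (kind) {
    case REQ_FLUSH:
        return ret == 0 ? 0 : -EIO;             /* flush reports 0 on success */
    case REQ_WRITE:
        return (size_t)ret == size ? 0 : -EIO;  /* a short write is an error */
    case REQ_READ:
        assert((size_t)ret <= size);
        /* Short read (including 0 bytes at EOF): pad the rest with zeroes. */
        memset((char *)buf + ret, 0, size - (size_t)ret);
        return 0;
    }
    return -EINVAL;                             /* unknown request kind */
}
```

With this split, a return value of 0 no longer means success for a
write of non-zero size, which is the case the `!ret` shortcut lets through.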

> Iow. it wouldn't zero out the destination buffer as far as I can see.
> Come to think of it, I'm not too fond of this part of the check for the
> write case either.

raw-posix treats short reads as success, too, but it zeroes out the
missing part. Note that it also loops after a short read and only if it
reads 0 bytes then, it returns success. If an error is returned after
the short read, the whole function returns an error. Is this necessary
for gluster, too?
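
The loop described above can be sketched roughly as follows. This is
not the raw-posix code: struct mem_file and mem_pread() are a
hypothetical in-memory backend that returns short reads on purpose, so
the retry and zero-padding behaviour can be seen without a real file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical in-memory "file" whose reads return at most max_chunk
 * bytes, to simulate a backend that produces short reads. */
struct mem_file {
    const char *data;
    size_t len;
    size_t max_chunk;
};

/* pread()-like helper: returns the number of bytes read, 0 at EOF. */
static ssize_t mem_pread(const struct mem_file *f, void *buf,
                         size_t count, size_t offset)
{
    size_t avail;

    if (offset >= f->len) {
        return 0;
    }
    avail = f->len - offset;
    if (count > avail) {
        count = avail;
    }
    if (count > f->max_chunk) {
        count = f->max_chunk;
    }
    memcpy(buf, f->data + offset, count);
    return (ssize_t)count;
}

/*
 * Read 'size' bytes at 'offset'. Short reads are retried; once a retry
 * reads 0 bytes (EOF) the remainder is zero-filled and the request
 * succeeds. An error at any point fails the whole request.
 */
static int read_full_zero_pad(const struct mem_file *f, void *buf,
                              size_t size, size_t offset)
{
    size_t done = 0;

    assert(f->max_chunk > 0);
    while (done < size) {
        ssize_t n = mem_pread(f, (char *)buf + done, size - done,
                              offset + done);
        if (n < 0) {
            return (int)n;      /* error, even after a short read */
        }
        if (n == 0) {
            memset((char *)buf + done, 0, size - done);
            break;
        }
        done += (size_t)n;
    }
    return 0;
}
```

Whether the gluster completion callback should retry like this, or only
zero-pad once, is exactly the open question; the sketch just shows the
semantics being compared.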

> > Would it be better to switch to byte-based interfaces rather than
> > continue to force gluster interaction in 512-byte sector chunks,
> > since gluster can obviously store files that are not 512-aligned?

The gluster I/O functions are byte-based anyway, and the driver already
implements .bdrv_co_readv, so going to .bdrv_co_preadv should be
trivial. Probably the best solution here indeed.

Kevin



Re: [Qemu-devel] [kvm-unit-tests RFC 05/15] arm/arm64: GICv3: add cpu count

2016-12-06 Thread Auger Eric
Hi Andre, Drew,

On 06/12/2016 10:32, Andre Przywara wrote:
> Hi,
> 
> On 06/12/16 09:29, Andrew Jones wrote:
>> On Mon, Dec 05, 2016 at 10:46:36PM +0100, Eric Auger wrote:
>>> Add a new cpu_count field in gicv3_data indicating the
>>> number of redistributors. This will be useful for enumeration
>>> of their resources such as LPI pending tables.
>>
>> I'm fine with the additional state, but just curious, will it
>> ever be possible for gicv3.cpu_count != nr_cpus?
> 
> If not you are in trouble, so that should in fact be one test.
> 
> Which brings me to my comment ...
> 
>>>
>>> Signed-off-by: Eric Auger 
>>> ---
>>>  lib/arm/asm/gic-v3.h | 1 +
>>>  lib/arm/gic-v3.c | 2 ++
>>>  2 files changed, 3 insertions(+)
>>>
>>> diff --git a/lib/arm/asm/gic-v3.h b/lib/arm/asm/gic-v3.h
>>> index ed330af..039b7c2 100644
>>> --- a/lib/arm/asm/gic-v3.h
>>> +++ b/lib/arm/asm/gic-v3.h
>>> @@ -58,6 +58,7 @@ struct gicv3_data {
>>> void *dist_base;
>>> void *redist_base[NR_CPUS];
>>> unsigned int irq_nr;
>>> +   unsigned int cpu_count;
> 
> Should that be called "nr_redists" or the like?
> Since this is what it counts in the code below.
> Later we can then compare this with nr_cpus to check for a match.

I fully agree with your suggestion.

Thanks

Eric
> 
> Cheers,
> Andre.
> 
>>>  };
>>>  extern struct gicv3_data gicv3_data;
>>>  
>>> diff --git a/lib/arm/gic-v3.c b/lib/arm/gic-v3.c
>>> index 6246221..9921f4d 100644
>>> --- a/lib/arm/gic-v3.c
>>> +++ b/lib/arm/gic-v3.c
>>> @@ -12,12 +12,14 @@ void gicv3_set_redist_base(size_t stride)
>>> void *ptr = gicv3_data.redist_base[0];
>>> u64 typer;
>>>  
>>> +   gicv3_data.cpu_count = 0;
>>> do {
>>> typer = gicv3_read_typer(ptr + GICR_TYPER);
>>> if ((typer >> 32) == aff) {
>>> gicv3_redist_base() = ptr;
>>> return;
>>> }
>>> +   gicv3_data.cpu_count++;
>>> ptr += stride; /* skip RD_base, SGI_base, etc. */
>>> } while (!(typer & GICR_TYPER_LAST));
>>>  
>>> -- 
>>> 2.5.5
>>>
>>>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 



Re: [Qemu-devel] [Qemu-block] [PULL for-2.8 0/3] Block patches for -rc3

2016-12-06 Thread Stefan Hajnoczi
On Mon, Dec 05, 2016 at 04:31:58PM -0500, Jeff Cody wrote:
> The following changes since commit bc66cedb4141fb7588f2462c74310d8fb5dd4cf1:
> 
>   Merge remote-tracking branch 'yongbok/tags/mips-20161204' into staging (2016-12-05 10:56:45 +)
> 
> are available in the git repository at:
> 
>   https://github.com/codyprime/qemu-kvm-jtc.git tags/block-pull-request
> 
> for you to fetch changes up to 76b5550f709b975a7b04fb4c887f300b7bb731c2:
> 
>   qemu-doc: update gluster protocol usage guide (2016-12-05 16:30:29 -0500)
> 
> 
> Gluster block patches for -rc3
> 
> 
> Prasanna Kumar Kalever (3):
>   block/gluster: fix QMP to match debug option
>   block/nfs: fix QMP to match debug option
>   qemu-doc: update gluster protocol usage guide
> 
>  block/gluster.c  | 38 -
>  block/nfs.c  |  4 ++--
>  qapi/block-core.json |  8 +++
>  qemu-doc.texi| 59 +++-
>  qemu-options.hx  | 25 --
>  5 files changed, 93 insertions(+), 41 deletions(-)

BlockdevOptionsGluster.debug(-level) does not have "Added in 2.8" so I
had to dig through git-blame(1) to verify that it was indeed added in
the current release cycle.

In the future please make sure all QAPI changes are marked by version.
If there are tricky changes you can include a statement showing you are
aware of QAPI backwards compatibility ("These new options were added in
the 2.8 release cycle and can therefore still be changed without
breaking backward compatibility").  This will make me confident that
you've checked the QAPI changes.

Thanks, applied to my staging tree:
https://github.com/stefanha/qemu/commits/staging

Stefan




Re: [Qemu-devel] [PATCH 03/13] target-ppc: implement lxvl instruction

2016-12-06 Thread Nikunj A Dadhania
Richard Henderson  writes:
>> +void helper_lxvl(CPUPPCState *env, target_ulong addr,
>> + target_ulong xt_num, target_ulong rb)
>> +{
>> +ppc_vsr_t xt;
>> +
>> +getVSR(xt_num, &xt, env);
>> +if (unlikely((rb & 0xFF) == 0)) {
>> +xt.s128 = int128_make128(0, 0);
>> +} else {
>> +target_ulong end = ((rb & 0xFF) * 8) - 1;

I found that the above is wrong in the code: the ISA extracts bits 0:7
from GPR[RB].
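
To spell out the difference, here is a small sketch, assuming a 64-bit
register and the Power ISA convention that bit 0 is the most significant
bit. The helper names are made up for illustration and are not part of
the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Power ISA numbers bits from the most significant end, so bits 0:7 of
 * a 64-bit GPR are its top byte.
 */
static inline unsigned ppc_bits_0_7(uint64_t reg)
{
    return (unsigned)(reg >> 56) & 0xFF;
}

/* What "rb & 0xFF" computes instead: the low byte (ISA bits 56:63). */
static inline unsigned low_byte(uint64_t reg)
{
    return (unsigned)(reg & 0xFF);
}
```

For a register value of 0x1000000000000005 the first helper yields 0x10
and the second yields 0x05, so the two only agree by accident.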

Regards
Nikunj




Re: [Qemu-devel] [PATCH v4 12/13] aio: self-tune polling time

2016-12-06 Thread Christian Borntraeger
On 12/06/2016 10:20 AM, Stefan Hajnoczi wrote:
> On Mon, Dec 05, 2016 at 09:06:17PM +0100, Christian Borntraeger wrote:
>> On 12/01/2016 08:26 PM, Stefan Hajnoczi wrote:
>>> This patch is based on the algorithm for the kvm.ko halt_poll_ns
>>> parameter in Linux.  The initial polling time is zero.
>>>
>>> If the event loop is woken up within the maximum polling time it means
>>> polling could be effective, so grow polling time.
>>>
>>> If the event loop is woken up beyond the maximum polling time it means
>>> polling is not effective, so shrink polling time.
>>>
>>> If the event loop makes progress within the current polling time then
>>> the sweet spot has been reached.
>>>
>>> This algorithm adjusts the polling time so it can adapt to variations in
>>> workloads.  The goal is to reach the sweet spot while also recognizing
>>> when polling would hurt more than help.
>>>
>>> Two new trace events, poll_grow and poll_shrink, are added for observing
>>> polling time adjustment.
>>>
>>> Signed-off-by: Stefan Hajnoczi 
>>
>> Not sure why, but I have 4 host ramdisks with the same iothread as guest
>> virtio-blk. Running fio in the guest on one of these disks will poll; as
>> soon as I have 2 disks in fio I almost always see shrinks (so polling
>> stays at 0) and almost no grows.
> 
> Shrinking occurs when polling + ppoll(2) time exceeds poll-max-ns.
> 
> What is the value of poll-max-ns

I used 5ns as poll value. When using 50ns it is polling again.

> and how long is run_poll_handlers_end - run_poll_handlers_begin?

Too long. I looked again and I realized that I used cache=none without
io=native. After adding io=native things are better. Even with 4 disks
polling still happens. So it seems that the mileage will vary depending
on the settings.
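
For reference, the grow/shrink rules from the commit message can be
sketched as below. This is only an illustration under assumptions:
struct poll_tuner, POLL_START_NS and the doubling step are hypothetical
and not taken from the patch; only the three cases (sweet spot, shrink,
grow) come from the description.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tuning state; field names are illustrative only. */
struct poll_tuner {
    int64_t poll_ns;      /* current polling time, initially zero */
    int64_t poll_max_ns;  /* upper bound (the poll-max-ns setting) */
};

enum { POLL_START_NS = 4000 };  /* assumed first non-zero step */

/*
 * block_ns is how long the event loop was blocked (polling plus
 * ppoll(2)) before it was woken up.
 */
static void poll_tuner_update(struct poll_tuner *t, int64_t block_ns)
{
    int64_t grown;

    assert(t->poll_ns >= 0 && t->poll_max_ns >= 0);

    if (block_ns <= t->poll_ns) {
        return;                 /* progress within the window: sweet spot */
    }
    if (block_ns > t->poll_max_ns) {
        t->poll_ns = 0;         /* polling cannot help: shrink */
        return;
    }
    /* Woken up within the maximum polling time: grow, capped at the max. */
    grown = t->poll_ns ? t->poll_ns * 2 : POLL_START_NS;
    t->poll_ns = grown > t->poll_max_ns ? t->poll_max_ns : grown;
}
```

With a very small poll_max_ns almost every wakeup lands in the shrink
branch, so poll_ns never leaves zero, which is consistent with seeing
only shrinks in the trace.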

Christian







Re: [Qemu-devel] [PATCH] doc: Propose NBD_FLAG_INIT_ZEROES extension

2016-12-06 Thread Stefan Hajnoczi
On Mon, Dec 05, 2016 at 05:42:35PM -0600, Eric Blake wrote:
> While not directly related to NBD_CMD_WRITE_ZEROES, the qemu
> team discovered that it is useful if a server can advertise
> whether an export is in a known-all-zeroes state at the time
> the client connects.
> 
> Signed-off-by: Eric Blake 
> ---
>  doc/proto.md | 5 +
>  1 file changed, 5 insertions(+)
> 
> This replaces the following qemu patch attempt:
> https://lists.gnu.org/archive/html/qemu-devel/2016-12/msg00357.html
> which tried to add NBD_CMD_HAS_ZERO_INIT with poor semantics. The
> semantics in this proposal should be much better.
> 
> Patch is to the merge of the master branch and the
> extension-write-zeroes branch.  By the way, qemu 2.8 is due
> to be released "real soon now", and implements NBD_CMD_WRITE_ZEROES,
> so maybe it is time to consider promoting the extension-write-zeroes
> branch into master.

Useful, thanks!

Reviewed-by: Stefan Hajnoczi 




Re: [Qemu-devel] [PULL V2 0/3] Net patches

2016-12-06 Thread Stefan Hajnoczi
On Tue, Dec 06, 2016 at 10:32:29AM +0800, Jason Wang wrote:
> The following changes since commit bd8ef5060dd2124a54578241da9a572faf7658dd:
> 
>   Merge remote-tracking branch 'dgibson/tags/ppc-for-2.8-20161201' into staging (2016-12-01 13:39:29 +)
> 
> are available in the git repository at:
> 
>   https://github.com/jasowang/qemu.git tags/net-pull-request
> 
> for you to fetch changes up to 9f5832d34b0c155e9538a745c80e441aed257670:
> 
>   fsl_etsec: Fix various small problems in hexdump code (2016-12-06 10:23:50 +0800)
> 
> 
> 
> Changes from V1:
> - fix coding style
> 
> 
> Andrey Smirnov (2):
>   fsl_etsec: Pad short payloads with zeros
>   fsl_etsec: Fix various small problems in hexdump code
> 
> Prasad J Pandit (1):
>   net: mcf: check receive buffer size register value
> 
>  hw/net/fsl_etsec/etsec.c | 4 ++--
>  hw/net/fsl_etsec/rings.c | 8 
>  hw/net/mcf_fec.c | 2 +-
>  3 files changed, 11 insertions(+), 3 deletions(-)
> 
> 

Thanks, applied to my staging tree:
https://github.com/stefanha/qemu/commits/staging

Stefan




Re: [Qemu-devel] [Nbd] [Qemu-block] [PATCH] doc: Propose NBD_FLAG_INIT_ZEROES extension

2016-12-06 Thread Alex Bligh

> On 6 Dec 2016, at 09:25, Kevin Wolf  wrote:
> 
> Am 06.12.2016 um 00:42 hat Eric Blake geschrieben:
>> While not directly related to NBD_CMD_WRITE_ZEROES, the qemu
>> team discovered that it is useful if a server can advertise
>> whether an export is in a known-all-zeroes state at the time
>> the client connects.
> 
> Does a server usually have the information to set this flag, other than
> querying the block status of all blocks at startup? 

The server may have other ways of knowing this, for instance
that it has just created the file (*), or that it stat'd the file
before opening it (not unlikely) and noticed it had 0 allocated
size. The latter I suspect would be trivial to implement in nbd-server

(*) = e.g. I had one application where nbd use the export path
to signify it wanted to open a temporary file, the path consisting
of a UUID and an encoded length. If the file was not present already
it created it with ftruncate(). That could trivially have used this.

> If so, the client could just query this by itself.

Well there's no currently mainlined extension to do that, but yes
it could. On the other hand I see no issue passing complete
zero status back to the client if it's so obvious from a stat().

-- 
Alex Bligh







Re: [Qemu-devel] [Nbd] [PATCH] doc: Propose NBD_FLAG_INIT_ZEROES extension

2016-12-06 Thread Alex Bligh

> On 6 Dec 2016, at 08:46, Alex Bligh  wrote:
> 
> I would support this.
> 
> In fact the patch is sufficiently simple I think I'd merge this
> into extension-write-zeroes then merge that into master.

Hence:

Reviewed-By: Alex Bligh 

-- 
Alex Bligh







[Qemu-devel] [RFC PATCH 01/13] intel_iommu: allocate new key when creating new address space

2016-12-06 Thread Peter Xu
From: Jason Wang 

We use a pointer to a stack variable as the key for a new address space.
This breaks hash table lookups once that variable is gone; fix it by
allocating a new key with g_malloc() instead.
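
To illustrate why this matters, here is a small sketch with a
hypothetical stand-in table (struct ptr_map is not GLib). Like
GHashTable, it stores the key pointer it is given rather than a copy of
the key. The sketch reuses a variable instead of letting it go out of
scope, to show the effect without relying on undefined behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ENTRIES 8

/* Minimal map that keeps key pointers, not key copies. */
struct ptr_map {
    const uintptr_t *keys[MAX_ENTRIES];
    void *values[MAX_ENTRIES];
    size_t n;
};

static void map_insert(struct ptr_map *m, const uintptr_t *key, void *value)
{
    assert(m->n < MAX_ENTRIES);
    m->keys[m->n] = key;
    m->values[m->n] = value;
    m->n++;
}

/* Lookup dereferences the stored key pointers to compare values. */
static void *map_lookup(const struct ptr_map *m, uintptr_t key)
{
    size_t i;

    for (i = 0; i < m->n; i++) {
        if (*m->keys[i] == key) {
            return m->values[i];
        }
    }
    return NULL;
}

/* Heap-allocate a private copy of the key, as the patch does with
 * g_malloc(); the copy stays valid for the lifetime of the entry. */
static uintptr_t *dup_key(uintptr_t key)
{
    uintptr_t *p = malloc(sizeof(*p));

    assert(p != NULL);
    *p = key;
    return p;
}
```

If an entry is inserted with the address of a scratch variable and that
variable is later overwritten, a lookup for the original key fails,
while an entry inserted via dup_key() keeps working.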

Cc: Michael S. Tsirkin 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Acked-by: Peter Xu 
Signed-off-by: Jason Wang 
Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 708770e..92e4064 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2426,12 +2426,13 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
 VTDAddressSpace *vtd_dev_as;
 
 if (!vtd_bus) {
+uintptr_t *new_key = g_malloc(sizeof(*new_key));
+*new_key = (uintptr_t)bus;
 /* No corresponding free() */
 vtd_bus = g_malloc0(sizeof(VTDBus) + sizeof(VTDAddressSpace *) * \
 X86_IOMMU_PCI_DEVFN_MAX);
 vtd_bus->bus = bus;
-key = (uintptr_t)bus;
-g_hash_table_insert(s->vtd_as_by_busptr, &key, vtd_bus);
+g_hash_table_insert(s->vtd_as_by_busptr, new_key, vtd_bus);
 }
 
 vtd_dev_as = vtd_bus->dev_as[devfn];
-- 
2.7.4




[Qemu-devel] [RFC PATCH 00/13] VT-d replay and misc cleanup

2016-12-06 Thread Peter Xu
This RFC series continues the work of Aviv B.D.'s VFIO enablement
series for VT-d. Aviv has done a great job there; what is still
missing is mostly the following:

(1) VFIO gets duplicated IOTLB notifications because the VT-d IOMMU
memory region is split.

(2) VT-d still doesn't provide a correct replay() mechanism (e.g.,
when the IOMMU domain switches, things will break).

Here I'm trying to solve the above two issues.

(1) is solved by patch 7, (2) is solved by patch 11-12.

Basically it contains the following:

patch 1:    Picked up from Jason's vhost DMAR series, which is a bugfix

patch 2-6:  Cleanups/enhancements for existing VT-d code (please see
            the specific commit messages for details; some of these
            patches may have been suitable for 2.8 as well, but it
            looks like it's too late)

patch 7:    Solve the issue that vfio is notified more than once for
            IOTLB notifications with Aviv's patches

patch 8-10: Some trivial memory APIs added for later patches, and
            customized replay() support for MemoryRegion (I see
            Aviv's latest v7 contains a similar replay; I can rebase
            onto that, it is mostly the same thing)

patch 11:   Provide a valid VT-d replay() callback, using page walk

patch 12:   Enable domain switch support - we replay() when a
            context entry gets invalidated

patch 13:   Enhancement for the existing invalidation notification:
            instead of using translate() for each page, we leverage
            the new vtd_page_walk() interface, which should be faster.

I would be glad to hear any review comments on the above patches
(especially patches 8-13, which are the main part of this series),
and in particular about any issue I missed in the series.

=========
Test Done
=========

Build test passed for x86_64/arm/ppc64.

Lightly tested on x86_64: assign two PCI devices to a single VM and
boot the VM using:

bin=x86_64-softmmu/qemu-system-x86_64
$bin -M q35,accel=kvm,kernel-irqchip=split -m 1G \
 -device intel-iommu,intremap=on,eim=off,cache-mode=on \
 -netdev user,id=net0,hostfwd=tcp::-:22 \
 -device virtio-net-pci,netdev=net0 \
 -device vfio-pci,host=03:00.0 \
 -device vfio-pci,host=02:00.0 \
 -trace events=".trace.vfio" \
 /var/lib/libvirt/images/vm1.qcow2

pxdev:bin [vtd-vfio-enablement]# cat .trace.vfio
vtd_page_walk*
vtd_replay*
vtd_inv_desc*

Then, in the guest, run the following tool:

  
https://github.com/xzpeter/clibs/blob/master/gpl/userspace/vfio-bind-group/vfio-bind-group.c

With parameter:

  ./vfio-bind-group 00:03.0 00:04.0

Checking the host-side trace log, I can see pages are replayed and
mapped in the 00:04.0 device address space, like:

...
vtd_replay_ce_valid replay valid context device 00:04.00 hi 0x301 lo 0x3be77001
vtd_page_walk Page walk for ce (0x301, 0x3be77001) iova range 0x0 - 0x80
vtd_page_walk_level Page walk (base=0x3be77000, level=3) iova range 0x0 - 0x80
vtd_page_walk_level Page walk (base=0x3c88a000, level=2) iova range 0x0 - 0x4000
vtd_page_walk_level Page walk (base=0x366cb000, level=1) iova range 0x0 - 0x20
vtd_page_walk_one Page walk detected map level 0x1 iova 0x0 -> gpa 0x366cb000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x1000 -> gpa 0x366cb000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x2000 -> gpa 0x366cb000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x3000 -> gpa 0x366cb000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x4000 -> gpa 0x366cb000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x5000 -> gpa 0x366cb000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x6000 -> gpa 0x366cb000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x7000 -> gpa 0x366cb000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x8000 -> gpa 0x366cb000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0x9000 -> gpa 0x366cb000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0xa000 -> gpa 0x366cb000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0xb000 -> gpa 0x366cb000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0xc000 -> gpa 0x366cb000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0xd000 -> gpa 0x366cb000 mask 0xfff perm 3
vtd_page_walk_one Page walk detected map level 0x1 iova 0xe000 -> gpa 0x366cb000 mask 0xfff perm 3
...

=========
Todo List
=========

- error reporting for the assigned devices (as Tianyu has mentioned)

- per-domain address-space: A better solution in the future may be -
  we maintain one address space per IOMMU domain in the guest (so
  multiple devices can share a same address space if they are sharing
  the same IOMMU domains in the guest), rather than one

[Qemu-devel] [RFC PATCH 03/13] intel_iommu: renaming gpa to iova where proper

2016-12-06 Thread Peter Xu
There are lots of places in the current intel_iommu.c code that name an
"iova" as "gpa". It is really confusing to use the name "gpa" in these
places (it is easily read as "Guest Physical Address", which it is
not). To make the code (much) easier to read, I decided to fix this
once and for all.

No functional change is made. Only literal ones.

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c | 46 +++---
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index f19a8b3..3d98797 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -279,7 +279,7 @@ static void vtd_update_iotlb(IntelIOMMUState *s, uint16_t source_id,
 uint64_t *key = g_malloc(sizeof(*key));
 uint64_t gfn = vtd_get_iotlb_gfn(addr, level);
 
-VTD_DPRINTF(CACHE, "update iotlb sid 0x%"PRIx16 " gpa 0x%"PRIx64
+VTD_DPRINTF(CACHE, "update iotlb sid 0x%"PRIx16 " iova 0x%"PRIx64
 " slpte 0x%"PRIx64 " did 0x%"PRIx16, source_id, addr, slpte,
 domain_id);
 if (g_hash_table_size(s->iotlb) >= VTD_IOTLB_MAX_SIZE) {
@@ -595,12 +595,12 @@ static uint64_t vtd_get_slpte(dma_addr_t base_addr, uint32_t index)
 return slpte;
 }
 
-/* Given a gpa and the level of paging structure, return the offset of current
- * level.
+/* Given a iova and the level of paging structure, return the offset
+ * of current level.
  */
-static inline uint32_t vtd_gpa_level_offset(uint64_t gpa, uint32_t level)
+static inline uint32_t vtd_iova_level_offset(uint64_t iova, uint32_t level)
 {
-return (gpa >> vtd_slpt_level_shift(level)) &
+return (iova >> vtd_slpt_level_shift(level)) &
 ((1ULL << VTD_SL_LEVEL_BITS) - 1);
 }
 
@@ -648,13 +648,13 @@ static bool vtd_slpte_nonzero_rsvd(uint64_t slpte, uint32_t level)
 }
 }
 
-/* Given the @gpa, get relevant @slptep. @slpte_level will be the last level
+/* Given the @iova, get relevant @slptep. @slpte_level will be the last level
  * of the translation, can be used for deciding the size of large page.
  */
-static int vtd_gpa_to_slpte(VTDContextEntry *ce, uint64_t gpa,
-IOMMUAccessFlags flags,
-uint64_t *slptep, uint32_t *slpte_level,
-bool *reads, bool *writes)
+static int vtd_iova_to_slpte(VTDContextEntry *ce, uint64_t iova,
+ IOMMUAccessFlags flags,
+ uint64_t *slptep, uint32_t *slpte_level,
+ bool *reads, bool *writes)
 {
 dma_addr_t addr = vtd_get_slpt_base_from_context(ce);
 uint32_t level = vtd_get_level_from_context_entry(ce);
@@ -663,11 +663,11 @@ static int vtd_gpa_to_slpte(VTDContextEntry *ce, uint64_t gpa,
 uint32_t ce_agaw = vtd_get_agaw_from_context_entry(ce);
 uint64_t access_right_check = 0;
 
-/* Check if @gpa is above 2^X-1, where X is the minimum of MGAW in CAP_REG
- * and AW in context-entry.
+/* Check if @iova is above 2^X-1, where X is the minimum of MGAW
+ * in CAP_REG and AW in context-entry.
  */
-if (gpa & ~((1ULL << MIN(ce_agaw, VTD_MGAW)) - 1)) {
-VTD_DPRINTF(GENERAL, "error: gpa 0x%"PRIx64 " exceeds limits", gpa);
+if (iova & ~((1ULL << MIN(ce_agaw, VTD_MGAW)) - 1)) {
+VTD_DPRINTF(GENERAL, "error: iova 0x%"PRIx64 " exceeds limits", iova);
 return -VTD_FR_ADDR_BEYOND_MGAW;
 }
 
@@ -683,13 +683,13 @@ static int vtd_gpa_to_slpte(VTDContextEntry *ce, uint64_t gpa,
 }
 
 while (true) {
-offset = vtd_gpa_level_offset(gpa, level);
+offset = vtd_iova_level_offset(iova, level);
 slpte = vtd_get_slpte(addr, offset);
 
 if (slpte == (uint64_t)-1) {
 VTD_DPRINTF(GENERAL, "error: fail to access second-level paging "
-"entry at level %"PRIu32 " for gpa 0x%"PRIx64,
-level, gpa);
+"entry at level %"PRIu32 " for iova 0x%"PRIx64,
+level, iova);
 if (level == vtd_get_level_from_context_entry(ce)) {
 /* Invalid programming of context-entry */
 return -VTD_FR_CONTEXT_ENTRY_INV;
@@ -701,8 +701,8 @@ static int vtd_gpa_to_slpte(VTDContextEntry *ce, uint64_t gpa,
 *writes = (*writes) && (slpte & VTD_SL_W);
 if (!(slpte & access_right_check) && !(flags & IOMMU_NO_FAIL)) {
 VTD_DPRINTF(GENERAL, "error: lack of %s permission for "
-"gpa 0x%"PRIx64 " slpte 0x%"PRIx64,
-(flags & IOMMU_WO ? "write" : "read"), gpa, slpte);
+"iova 0x%"PRIx64 " slpte 0x%"PRIx64,
+(flags & IOMMU_WO ? "write" : "read"), iova, slpte);
 return (flags & IOMMU_WO) ? -VTD_FR_WRITE : -VTD_FR_READ;
 }
 if (vtd_slpte_nonzero_rsvd(slpte, level)) {
@@ -851,7 +851,7 @@ static void vtd_do_iommu_t

[Qemu-devel] [RFC PATCH 05/13] intel_iommu: fix trace for addr translation

2016-12-06 Thread Peter Xu
Another patch to convert the DPRINTF() calls to trace events and
error_report(). This patch focuses on the address translation path and
caching.

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c | 87 ---
 hw/i386/trace-events  |  7 +
 2 files changed, 48 insertions(+), 46 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 35fbfbe..0f8387e 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -280,11 +280,9 @@ static void vtd_update_iotlb(IntelIOMMUState *s, uint16_t source_id,
 uint64_t *key = g_malloc(sizeof(*key));
 uint64_t gfn = vtd_get_iotlb_gfn(addr, level);
 
-VTD_DPRINTF(CACHE, "update iotlb sid 0x%"PRIx16 " iova 0x%"PRIx64
-" slpte 0x%"PRIx64 " did 0x%"PRIx16, source_id, addr, slpte,
-domain_id);
+trace_vtd_iotlb_page_update(source_id, addr, slpte, domain_id);
 if (g_hash_table_size(s->iotlb) >= VTD_IOTLB_MAX_SIZE) {
-VTD_DPRINTF(CACHE, "iotlb exceeds size limit, forced to reset");
+trace_vtd_iotlb_reset("iotlb exceeds size limit");
 vtd_reset_iotlb(s);
 }
 
@@ -525,8 +523,8 @@ static int vtd_get_root_entry(IntelIOMMUState *s, uint8_t 
index,
 
 addr = s->root + index * sizeof(*re);
 if (dma_memory_read(&address_space_memory, addr, re, sizeof(*re))) {
-VTD_DPRINTF(GENERAL, "error: fail to access root-entry at 0x%"PRIx64
-" + %"PRIu8, s->root, index);
+error_report("Fail to access root-entry at 0x%"PRIx64
+ " index %"PRIu8, s->root, index);
 re->val = 0;
 return -VTD_FR_ROOT_TABLE_INV;
 }
@@ -545,13 +543,12 @@ static int vtd_get_context_entry_from_root(VTDRootEntry 
*root, uint8_t index,
 dma_addr_t addr;
 
 if (!vtd_root_entry_present(root)) {
-VTD_DPRINTF(GENERAL, "error: root-entry is not present");
+error_report("Root-entry is not present");
 return -VTD_FR_ROOT_ENTRY_P;
 }
 addr = (root->val & VTD_ROOT_ENTRY_CTP) + index * sizeof(*ce);
 if (dma_memory_read(&address_space_memory, addr, ce, sizeof(*ce))) {
-VTD_DPRINTF(GENERAL, "error: fail to access context-entry at 0x%"PRIx64
-" + %"PRIu8,
+error_report("Fail to access context-entry at 0x%"PRIx64" ind %"PRIu8,
 (uint64_t)(root->val & VTD_ROOT_ENTRY_CTP), index);
 return -VTD_FR_CONTEXT_TABLE_INV;
 }
@@ -665,7 +662,7 @@ static int vtd_iova_to_slpte(VTDContextEntry *ce, uint64_t 
iova,
  * in CAP_REG and AW in context-entry.
  */
 if (iova & ~((1ULL << MIN(ce_agaw, VTD_MGAW)) - 1)) {
-VTD_DPRINTF(GENERAL, "error: iova 0x%"PRIx64 " exceeds limits", iova);
+error_report("IOVA 0x%"PRIx64 " exceeds limits", iova);
 return -VTD_FR_ADDR_BEYOND_MGAW;
 }
 
@@ -685,7 +682,7 @@ static int vtd_iova_to_slpte(VTDContextEntry *ce, uint64_t 
iova,
 slpte = vtd_get_slpte(addr, offset);
 
 if (slpte == (uint64_t)-1) {
-VTD_DPRINTF(GENERAL, "error: fail to access second-level paging "
+error_report("Fail to access second-level paging "
 "entry at level %"PRIu32 " for iova 0x%"PRIx64,
 level, iova);
 if (level == vtd_get_level_from_context_entry(ce)) {
@@ -698,13 +695,13 @@ static int vtd_iova_to_slpte(VTDContextEntry *ce, 
uint64_t iova,
 *reads = (*reads) && (slpte & VTD_SL_R);
 *writes = (*writes) && (slpte & VTD_SL_W);
 if (!(slpte & access_right_check) && !(flags & IOMMU_NO_FAIL)) {
-VTD_DPRINTF(GENERAL, "error: lack of %s permission for "
-"iova 0x%"PRIx64 " slpte 0x%"PRIx64,
-(flags & IOMMU_WO ? "write" : "read"), iova, slpte);
+error_report("Lack of %s permission for iova 0x%"PRIx64
+ " slpte 0x%"PRIx64,
+ (flags & IOMMU_WO ? "write" : "read"), iova, slpte);
 return (flags & IOMMU_WO) ? -VTD_FR_WRITE : -VTD_FR_READ;
 }
 if (vtd_slpte_nonzero_rsvd(slpte, level)) {
-VTD_DPRINTF(GENERAL, "error: non-zero reserved field in second "
+error_report("Non-zero reserved field in second "
 "level paging entry level %"PRIu32 " slpte 0x%"PRIx64,
 level, slpte);
 return -VTD_FR_PAGING_ENTRY_RSVD;
@@ -733,12 +730,13 @@ static int vtd_dev_to_context_entry(IntelIOMMUState *s, 
uint8_t bus_num,
 }
 
 if (!vtd_root_entry_present(&re)) {
-VTD_DPRINTF(GENERAL, "error: root-entry #%"PRIu8 " is not present",
-bus_num);
+/* Not error - it's okay we don't have root entry. */
+trace_vtd_re_not_present(bus_num);
 return -VTD_FR_ROOT_ENTRY_P;
 } else if (re.rsvd || (re.val & VTD_ROOT_ENTRY_RSVD)) {
-VTD_DPRINTF(GENERAL, "error: non-zero reserved field in root-entry "
-"hi 0

[Qemu-devel] [RFC PATCH 02/13] intel_iommu: simplify irq region translation

2016-12-06 Thread Peter Xu
Before we had int-remap, we needed to bypass interrupt write requests.
That is no longer necessary: int-remap is now supported, and all irq
region requests are redirected there. Replace the block with an
assertion.
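
As a rough illustration (not part of the patch), the idea can be sketched
in a standalone form. The sketch_* names are made up, and the window bounds
0xfee00000-0xfeefffff are an assumption about the x86 interrupt address
range rather than something taken from this diff:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bounds of the x86 interrupt address window (illustrative). */
#define SKETCH_INTR_ADDR_FIRST 0xfee00000ULL
#define SKETCH_INTR_ADDR_LAST  0xfeefffffULL

static bool sketch_is_interrupt_addr(uint64_t addr)
{
    return addr >= SKETCH_INTR_ADDR_FIRST && addr <= SKETCH_INTR_ADDR_LAST;
}

/*
 * Stand-in for the DMA translate path: by the time we get here, the
 * interrupt window is expected to have been claimed by its own region,
 * so seeing such an address is treated as a programming error.
 */
static uint64_t sketch_translate_dma(uint64_t addr)
{
    assert(!sketch_is_interrupt_addr(addr));
    return addr & ~0xfffULL;   /* pretend 1:1 mapping, 4K aligned */
}
```

The assertion documents the invariant instead of silently handling a case
that should be unreachable.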

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c | 29 ++---
 1 file changed, 6 insertions(+), 23 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 92e4064..f19a8b3 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -842,29 +842,12 @@ static void vtd_do_iommu_translate(VTDAddressSpace 
*vtd_as, PCIBus *bus,
 bool writes = true;
 VTDIOTLBEntry *iotlb_entry;
 
-/* Check if the request is in interrupt address range */
-if (vtd_is_interrupt_addr(addr)) {
-if (flags & IOMMU_WO) {
-/* FIXME: since we don't know the length of the access here, we
- * treat Non-DWORD length write requests without PASID as
- * interrupt requests, too. Withoud interrupt remapping support,
- * we just use 1:1 mapping.
- */
-VTD_DPRINTF(MMU, "write request to interrupt address "
-"gpa 0x%"PRIx64, addr);
-entry->iova = addr & VTD_PAGE_MASK_4K;
-entry->translated_addr = addr & VTD_PAGE_MASK_4K;
-entry->addr_mask = ~VTD_PAGE_MASK_4K;
-entry->perm = IOMMU_WO;
-return;
-} else {
-VTD_DPRINTF(GENERAL, "error: read request from interrupt address "
-"gpa 0x%"PRIx64, addr);
-vtd_report_dmar_fault(s, source_id, addr, VTD_FR_READ,
-flags & IOMMU_WO);
-return;
-}
-}
+/*
+ * We have standalone memory region for interrupt addresses, we
+ * should never receive translation requests in this region.
+ */
+assert(!vtd_is_interrupt_addr(addr));
+
 /* Try to fetch slpte form IOTLB */
 iotlb_entry = vtd_lookup_iotlb(s, source_id, addr);
 if (iotlb_entry) {
-- 
2.7.4




[Qemu-devel] [RFC PATCH 04/13] intel_iommu: fix trace for inv desc handling

2016-12-06 Thread Peter Xu
VT-d code still uses the static DEBUG_INTEL_IOMMU macro. That's not
good: we should not need to recompile the code to get useful debugging
information for VT-d. Time to switch to the trace system.

This is the first patch to do it.

Generally, my rules are:

- for the old GENERAL typed messages, I use error_report() directly
  where applicable. Those are things that shouldn't happen, and we
  should print those errors in all cases, even without enabling debug
  and tracing.

- for the non-GENERAL typed messages, remove those VTD_DPRINTF()s that
  look hardly used, and convert the remaining lines into trace_*().

- for useless DPRINTFs, I removed them.
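
To make the difference concrete, here is a minimal standalone sketch. It
does not use the real QEMU trace infrastructure; every sketch_* name and
the SKETCH_DEBUG macro are hypothetical. It contrasts a compile-time debug
macro with a runtime-switchable trace point and an always-on error report:

```c
#include <stdbool.h>
#include <stdio.h>

/* Old style: the message vanishes unless the file is rebuilt with this. */
/* #define SKETCH_DEBUG */
#ifdef SKETCH_DEBUG
#define SKETCH_DPRINTF(msg) fprintf(stderr, "(debug) %s\n", (msg))
#else
#define SKETCH_DPRINTF(msg) do { } while (0)
#endif

static void sketch_old_style(void)
{
    SKETCH_DPRINTF("iotlb exceeds size limit");   /* compiled out by default */
}

/* Runtime switch standing in for an enabled trace event. */
static bool sketch_trace_enabled;
static unsigned sketch_trace_hits;

/* Trace point: cheap when disabled, no rebuild needed to turn it on. */
static void sketch_trace_iotlb_reset(const char *reason)
{
    if (!sketch_trace_enabled) {
        return;
    }
    sketch_trace_hits++;
    fprintf(stderr, "iotlb_reset: %s\n", reason);
}

/* Error path: always reported, regardless of trace settings. */
static int sketch_report_error(const char *msg)
{
    return fprintf(stderr, "error: %s\n", msg);
}
```

Flipping sketch_trace_enabled at runtime plays the role of enabling a
trace event, while the error path never depends on any switch.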

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c | 98 ---
 hw/i386/trace-events  | 12 +++
 2 files changed, 58 insertions(+), 52 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 3d98797..35fbfbe 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -35,6 +35,7 @@
 #include "sysemu/kvm.h"
 #include "hw/i386/apic_internal.h"
 #include "kvm_i386.h"
+#include "trace.h"
 
 /*#define DEBUG_INTEL_IOMMU*/
 #ifdef DEBUG_INTEL_IOMMU
@@ -494,22 +495,19 @@ static void vtd_handle_inv_queue_error(IntelIOMMUState *s)
 /* Set the IWC field and try to generate an invalidation completion interrupt 
*/
 static void vtd_generate_completion_event(IntelIOMMUState *s)
 {
-VTD_DPRINTF(INV, "completes an invalidation wait command with "
-"Interrupt Flag");
 if (vtd_get_long_raw(s, DMAR_ICS_REG) & VTD_ICS_IWC) {
-VTD_DPRINTF(INV, "there is a previous interrupt condition to be "
-"serviced by software, "
-"new invalidation event is not generated");
+trace_vtd_inv_desc_wait_irq("One pending, skip current");
 return;
 }
 vtd_set_clear_mask_long(s, DMAR_ICS_REG, 0, VTD_ICS_IWC);
 vtd_set_clear_mask_long(s, DMAR_IECTL_REG, 0, VTD_IECTL_IP);
 if (vtd_get_long_raw(s, DMAR_IECTL_REG) & VTD_IECTL_IM) {
-VTD_DPRINTF(INV, "IM filed in IECTL_REG is set, new invalidation "
-"event is not generated");
+trace_vtd_inv_desc_wait_irq("IM in IECTL_REG is set, "
+"new event not generated");
 return;
 } else {
 /* Generate the interrupt event */
+trace_vtd_inv_desc_wait_irq("Generating complete event");
 vtd_generate_interrupt(s, DMAR_IEADDR_REG, DMAR_IEDATA_REG);
 vtd_set_clear_mask_long(s, DMAR_IECTL_REG, VTD_IECTL_IP, 0);
 }
@@ -952,6 +950,7 @@ static void vtd_interrupt_remap_table_setup(IntelIOMMUState 
*s)
 
 static void vtd_context_global_invalidate(IntelIOMMUState *s)
 {
+trace_vtd_inv_desc_cc_global();
 s->context_cache_gen++;
 if (s->context_cache_gen == VTD_CONTEXT_CACHE_GEN_MAX) {
 vtd_reset_context_cache(s);
@@ -991,9 +990,11 @@ static void vtd_context_device_invalidate(IntelIOMMUState 
*s,
 uint16_t mask;
 VTDBus *vtd_bus;
 VTDAddressSpace *vtd_as;
-uint16_t devfn;
+uint8_t bus_n, devfn;
 uint16_t devfn_it;
 
+trace_vtd_inv_desc_cc_devices(source_id, func_mask);
+
 switch (func_mask & 3) {
 case 0:
 mask = 0;   /* No bits in the SID field masked */
@@ -1009,16 +1010,16 @@ static void 
vtd_context_device_invalidate(IntelIOMMUState *s,
 break;
 }
 mask = ~mask;
-VTD_DPRINTF(INV, "device-selective invalidation source 0x%"PRIx16
-" mask %"PRIu16, source_id, mask);
-vtd_bus = vtd_find_as_from_bus_num(s, VTD_SID_TO_BUS(source_id));
+
+bus_n = VTD_SID_TO_BUS(source_id);
+vtd_bus = vtd_find_as_from_bus_num(s, bus_n);
 if (vtd_bus) {
 devfn = VTD_SID_TO_DEVFN(source_id);
 for (devfn_it = 0; devfn_it < X86_IOMMU_PCI_DEVFN_MAX; ++devfn_it) {
 vtd_as = vtd_bus->dev_as[devfn_it];
 if (vtd_as && ((devfn_it & mask) == (devfn & mask))) {
-VTD_DPRINTF(INV, "invalidate context-cahce of devfn 0x%"PRIx16,
-devfn_it);
+trace_vtd_inv_desc_cc_device(bus_n, (devfn_it >> 3) & 0x1f,
+ devfn_it & 3);
 vtd_as->context_cache_entry.context_cache_gen = 0;
 }
 }
@@ -1371,7 +1372,7 @@ static bool vtd_process_wait_desc(IntelIOMMUState *s, 
VTDInvDesc *inv_desc)
 {
 if ((inv_desc->hi & VTD_INV_DESC_WAIT_RSVD_HI) ||
 (inv_desc->lo & VTD_INV_DESC_WAIT_RSVD_LO)) {
-VTD_DPRINTF(GENERAL, "error: non-zero reserved field in Invalidation "
+error_report("Non-zero reserved field in Invalidation "
 "Wait Descriptor hi 0x%"PRIx64 " lo 0x%"PRIx64,
 inv_desc->hi, inv_desc->lo);
 return false;
@@ -1385,21 +1386,20 @@ static bool vtd_process_wait_desc(IntelIOMMUState *s, 
VTDInvDesc *inv_desc)
 
 /* FIXME: need to be masked with H

[Qemu-devel] [RFC PATCH 07/13] memory: add section range info for IOMMU notifier

2016-12-06 Thread Peter Xu
In this patch, IOMMUNotifier.{start|end} are introduced to store section
information for a specific notifier. When a notification occurs, we not
only check the notification type (MAP|UNMAP), but also check whether the
notified iova is in the range of the specific IOMMU notifier, and skip
notifiers whose listened range does not contain it.

When removing a region, we need to make sure we remove the correct
VFIOGuestIOMMU by checking the IOMMUNotifier.start address as well.
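
A simplified sketch of the range bookkeeping, for illustration only. The
SketchNotifier type and helpers are hypothetical. The patch itself uses
Int128 arithmetic when computing the end; this sketch assumes a non-zero
size and that offset + size does not overflow 64 bits:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down notifier: only the range fields matter here. */
typedef struct SketchNotifier {
    uint64_t start;   /* first IOVA covered (inclusive) */
    uint64_t end;     /* last IOVA covered (inclusive) */
} SketchNotifier;

/* Record an inclusive [start, end] range from an offset and a size. */
static void sketch_notifier_set_range(SketchNotifier *n,
                                      uint64_t offset, uint64_t size)
{
    n->start = offset;
    n->end = offset + size - 1;   /* inclusive end */
}

/* True if the notified IOVA falls inside the listened range. */
static bool sketch_notifier_covers(const SketchNotifier *n, uint64_t iova)
{
    return n->start <= iova && iova <= n->end;
}
```

Keeping the end inclusive means a range reaching the top of the address
space can still be represented in a 64-bit field.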

Suggested-by: David Gibson 
Reviewed-by: David Gibson 
Signed-off-by: Peter Xu 
---
v2:
- replace offset_within_address_space with offset_within_region since
  IOTLB iova is relative to region [David]
---
 hw/vfio/common.c  | 7 ++-
 include/exec/memory.h | 3 +++
 memory.c  | 4 +++-
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 801578b..6f648da 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -455,6 +455,10 @@ static void vfio_listener_region_add(MemoryListener 
*listener,
 giommu->container = container;
 giommu->n.notify = vfio_iommu_map_notify;
 giommu->n.notifier_flags = IOMMU_NOTIFIER_ALL;
+giommu->n.start = section->offset_within_region;
+llend = int128_add(int128_make64(giommu->n.start), section->size);
+llend = int128_sub(llend, int128_one());
+giommu->n.end = int128_get64(llend);
 QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
 
 memory_region_register_iommu_notifier(giommu->iommu, &giommu->n);
@@ -525,7 +529,8 @@ static void vfio_listener_region_del(MemoryListener 
*listener,
 VFIOGuestIOMMU *giommu;
 
 QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
-if (giommu->iommu == section->mr) {
+if (giommu->iommu == section->mr &&
+giommu->n.start == section->offset_within_region) {
 memory_region_unregister_iommu_notifier(giommu->iommu,
 &giommu->n);
 QLIST_REMOVE(giommu, giommu_next);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 2d7ee54..cb2d432 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -85,6 +85,9 @@ typedef enum {
 struct IOMMUNotifier {
 void (*notify)(struct IOMMUNotifier *notifier, IOMMUTLBEntry *data);
 IOMMUNotifierFlag notifier_flags;
+/* Notify for address space range start <= addr <= end */
+hwaddr start;
+hwaddr end;
 QLIST_ENTRY(IOMMUNotifier) node;
 };
 typedef struct IOMMUNotifier IOMMUNotifier;
diff --git a/memory.c b/memory.c
index 9b88638..f73c897 100644
--- a/memory.c
+++ b/memory.c
@@ -1663,7 +1663,9 @@ void memory_region_notify_iommu(MemoryRegion *mr,
 }
 
 QLIST_FOREACH(iommu_notifier, &mr->iommu_notify, node) {
-if (iommu_notifier->notifier_flags & request_flags) {
+if (iommu_notifier->notifier_flags & request_flags &&
+iommu_notifier->start <= entry.iova &&
+iommu_notifier->end >= entry.iova) {
 iommu_notifier->notify(iommu_notifier, &entry);
 }
 }
-- 
2.7.4




[Qemu-devel] [RFC PATCH 06/13] intel_iommu: vtd_slpt_level_shift check level

2016-12-06 Thread Peter Xu
This helps in debugging an incorrect level being passed in.
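
For illustration, a standalone sketch of why level 0 is worth catching.
The constants (12-bit page shift, 9 bits per level) are assumed values and
the sketch_* names are made up. With 32-bit unsigned arithmetic, level 0
makes (level - 1) wrap around, and the function would quietly return a
small bogus shift (3 with these constants) instead of failing:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT_4K 12   /* assumed: 4 KiB base pages */
#define SKETCH_LEVEL_BITS    9    /* assumed: 512 entries per table */

static inline uint32_t sketch_level_shift(uint32_t level)
{
    /* level 0 would make (level - 1) wrap around to UINT32_MAX */
    assert(level != 0);
    return SKETCH_PAGE_SHIFT_4K + (level - 1) * SKETCH_LEVEL_BITS;
}
```

Levels 1 through 4 then give shifts of 12, 21, 30 and 39.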

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 0f8387e..46b8a2f 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -188,6 +188,7 @@ static gboolean vtd_hash_remove_by_domain(gpointer key, 
gpointer value,
 /* The shift of an addr for a certain level of paging structure */
 static inline uint32_t vtd_slpt_level_shift(uint32_t level)
 {
+assert(level != 0);
 return VTD_PAGE_SHIFT_4K + (level - 1) * VTD_SL_LEVEL_BITS;
 }
 
-- 
2.7.4




[Qemu-devel] [RFC PATCH 09/13] memory: introduce memory_region_notify_one()

2016-12-06 Thread Peter Xu
Generalize the notify logic in memory_region_notify_iommu() into a
single function. This can be further used in customized replay()
functions for IOMMUs.
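
The shape of the refactoring can be sketched outside of QEMU as follows.
All types, flags and names here are hypothetical stand-ins, and a plain
array replaces the QLIST:

```c
#include <stdint.h>

enum { SKETCH_PERM_R = 1, SKETCH_PERM_W = 2 };
enum { SKETCH_NOTIFY_UNMAP = 1, SKETCH_NOTIFY_MAP = 2 };

typedef struct SketchEntry {
    uint64_t iova;
    int perm;            /* 0 means the mapping was removed */
} SketchEntry;

typedef struct SketchNotifier SketchNotifier;
struct SketchNotifier {
    void (*notify)(SketchNotifier *n, const SketchEntry *e);
    int flags;           /* which event kinds this notifier wants */
    uint64_t start, end; /* inclusive IOVA range */
    unsigned calls;      /* bookkeeping for the example only */
};

static void sketch_count(SketchNotifier *n, const SketchEntry *e)
{
    (void)e;
    n->calls++;
}

/* Deliver one entry to one notifier if type and range both match. */
static void sketch_notify_one(SketchNotifier *n, const SketchEntry *e)
{
    int request = (e->perm & (SKETCH_PERM_R | SKETCH_PERM_W))
                  ? SKETCH_NOTIFY_MAP : SKETCH_NOTIFY_UNMAP;

    if ((n->flags & request) && n->start <= e->iova && e->iova <= n->end) {
        n->notify(n, e);
    }
}

/* The broadcast version becomes a loop over the per-notifier helper. */
static void sketch_notify_all(SketchNotifier *ns, unsigned count,
                              const SketchEntry *e)
{
    for (unsigned i = 0; i < count; i++) {
        sketch_notify_one(&ns[i], e);
    }
}
```

A custom replay can then call the per-notifier helper directly for the one
notifier it is replaying to.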

Signed-off-by: Peter Xu 
---
 include/exec/memory.h | 15 +++
 memory.c  | 29 ++---
 2 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 1669c7b..9902e9e 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -668,6 +668,21 @@ void memory_region_notify_iommu(MemoryRegion *mr,
 IOMMUTLBEntry entry);
 
 /**
+ * memory_region_notify_one: notify a change in an IOMMU translation
+ *   entry to a single notifier
+ *
+ * This works just like memory_region_notify_iommu(), but it only
+ * notifies a specific notifier, not all of them.
+ *
+ * @notifier: the notifier to be notified
+ * @entry: the new entry in the IOMMU translation table.  The entry
+ * replaces all old entries for the same virtual I/O address range.
+ * Deleted entries have .@perm == 0.
+ */
+void memory_region_notify_one(IOMMUNotifier *notifier,
+  IOMMUTLBEntry *entry);
+
+/**
  * memory_region_register_iommu_notifier: register a notifier for changes to
  * IOMMU translation entries.
  *
diff --git a/memory.c b/memory.c
index 62ca6e0..84c91fa 100644
--- a/memory.c
+++ b/memory.c
@@ -1657,26 +1657,33 @@ void 
memory_region_unregister_iommu_notifier(MemoryRegion *mr,
 memory_region_update_iommu_notify_flags(mr);
 }
 
-void memory_region_notify_iommu(MemoryRegion *mr,
-IOMMUTLBEntry entry)
+void memory_region_notify_one(IOMMUNotifier *notifier,
+  IOMMUTLBEntry *entry)
 {
-IOMMUNotifier *iommu_notifier;
 IOMMUNotifierFlag request_flags;
 
-assert(memory_region_is_iommu(mr));
-
-if (entry.perm & IOMMU_RW) {
+if (entry->perm & IOMMU_RW) {
 request_flags = IOMMU_NOTIFIER_MAP;
 } else {
 request_flags = IOMMU_NOTIFIER_UNMAP;
 }
 
+if (notifier->notifier_flags & request_flags &&
+notifier->start <= entry->iova &&
+notifier->end >= entry->iova) {
+notifier->notify(notifier, entry);
+}
+}
+
+void memory_region_notify_iommu(MemoryRegion *mr,
+IOMMUTLBEntry entry)
+{
+IOMMUNotifier *iommu_notifier;
+
+assert(memory_region_is_iommu(mr));
+
 QLIST_FOREACH(iommu_notifier, &mr->iommu_notify, node) {
-if (iommu_notifier->notifier_flags & request_flags &&
-iommu_notifier->start <= entry.iova &&
-iommu_notifier->end >= entry.iova) {
-iommu_notifier->notify(iommu_notifier, &entry);
-}
+memory_region_notify_one(iommu_notifier, &entry);
 }
 }
 
-- 
2.7.4




[Qemu-devel] [RFC PATCH 08/13] memory: provide iommu_replay_all()

2016-12-06 Thread Peter Xu
This is a "global" version of the existing memory_region_iommu_replay() -
we announce the translations to all the registered notifiers, instead of
a specific one.

Signed-off-by: Peter Xu 
---
 include/exec/memory.h | 8 
 memory.c  | 9 +
 2 files changed, 17 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index cb2d432..1669c7b 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -693,6 +693,14 @@ void memory_region_iommu_replay(MemoryRegion *mr, 
IOMMUNotifier *n,
 bool is_write);
 
 /**
+ * memory_region_iommu_replay_all: replay existing IOMMU translations
+ * to all the notifiers registered.
+ *
+ * @mr: the memory region to observe
+ */
+void memory_region_iommu_replay_all(MemoryRegion *mr);
+
+/**
  * memory_region_unregister_iommu_notifier: unregister a notifier for
  * changes to IOMMU translation entries.
  *
diff --git a/memory.c b/memory.c
index f73c897..62ca6e0 100644
--- a/memory.c
+++ b/memory.c
@@ -1641,6 +1641,15 @@ void memory_region_iommu_replay(MemoryRegion *mr, 
IOMMUNotifier *n,
 }
 }
 
+void memory_region_iommu_replay_all(MemoryRegion *mr)
+{
+IOMMUNotifier *notifier;
+
+QLIST_FOREACH(notifier, &mr->iommu_notify, node) {
+memory_region_iommu_replay(mr, notifier, false);
+}
+}
+
 void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
  IOMMUNotifier *n)
 {
-- 
2.7.4




[Qemu-devel] [RFC PATCH 11/13] intel_iommu: provide its own replay() callback

2016-12-06 Thread Peter Xu
The default replay() doesn't work for VT-d since VT-d has a huge
default memory region which covers the address range 0-(2^64-1). This
will normally cause an endless loop when the guest starts.

The solution is simple - we don't walk over all the regions. Instead, we
jump over the regions when we find that the page directories are empty.
This greatly reduces the time to walk the whole region.

To achieve this, we provide a page walk helper that invokes the
corresponding hook function when we find a page we are interested in.
vtd_page_walk_level() is the core logic for the page walking. Its
interface is designed to suit further use cases, e.g., to invalidate a
range of addresses.
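
The skipping idea can be shown with a toy two-level table. This is only a
sketch under simplifying assumptions: tiny tables held in host memory, no
permissions or reserved bits, and made-up sketch_* names. It is not the
layout or the code used by the patch:

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LEAF_ENTRIES 4
#define SKETCH_DIR_ENTRIES  4
#define SKETCH_PAGE_SIZE    0x1000ULL
#define SKETCH_DIR_SPAN     (SKETCH_LEAF_ENTRIES * SKETCH_PAGE_SIZE)

typedef struct SketchLeaf {
    uint64_t pte[SKETCH_LEAF_ENTRIES];     /* 0 means "not mapped" */
} SketchLeaf;

typedef struct SketchDir {
    SketchLeaf *leaf[SKETCH_DIR_ENTRIES];  /* NULL means "empty directory" */
} SketchDir;

typedef void (*sketch_walk_hook)(uint64_t iova, uint64_t pte, void *opaque);

/*
 * Walk [start, end) and call the hook for each mapped page. Returns how
 * many directory slots were skipped in a single step because they were
 * empty.
 */
static unsigned sketch_page_walk(const SketchDir *dir, uint64_t start,
                                 uint64_t end, sketch_walk_hook hook,
                                 void *opaque)
{
    unsigned skipped = 0;
    uint64_t iova = start;

    while (iova < end) {
        uint64_t d = iova / SKETCH_DIR_SPAN;
        const SketchLeaf *leaf;
        size_t p;

        if (d >= SKETCH_DIR_ENTRIES) {
            break;                      /* beyond what the table covers */
        }

        leaf = dir->leaf[d];
        if (!leaf) {
            /* Empty directory: jump to the start of the next slot. */
            iova = (d + 1) * SKETCH_DIR_SPAN;
            skipped++;
            continue;
        }

        p = (size_t)((iova % SKETCH_DIR_SPAN) / SKETCH_PAGE_SIZE);
        if (leaf->pte[p]) {
            hook(iova & ~(SKETCH_PAGE_SIZE - 1), leaf->pte[p], opaque);
        }
        iova = (iova & ~(SKETCH_PAGE_SIZE - 1)) + SKETCH_PAGE_SIZE;
    }
    return skipped;
}

/* Example hook: count the pages that were reported. */
static void sketch_count_hook(uint64_t iova, uint64_t pte, void *opaque)
{
    (void)iova;
    (void)pte;
    ++*(unsigned *)opaque;
}
```

The cost of the walk then scales with the number of populated tables
rather than with the size of the address range.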

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c | 212 --
 hw/i386/trace-events  |   8 ++
 include/exec/memory.h |   2 +
 3 files changed, 217 insertions(+), 5 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 46b8a2f..2fcd7af 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -620,6 +620,22 @@ static inline uint32_t 
vtd_get_agaw_from_context_entry(VTDContextEntry *ce)
 return 30 + (ce->hi & VTD_CONTEXT_ENTRY_AW) * 9;
 }
 
+static inline uint64_t vtd_iova_limit(VTDContextEntry *ce)
+{
+uint32_t ce_agaw = vtd_get_agaw_from_context_entry(ce);
+return 1ULL << MIN(ce_agaw, VTD_MGAW);
+}
+
+/* Return true if IOVA passes range check, otherwise false. */
+static inline bool vtd_iova_range_check(uint64_t iova, VTDContextEntry *ce)
+{
+/*
+ * Check if @iova is above 2^X-1, where X is the minimum of MGAW
+ * in CAP_REG and AW in context-entry.
+ */
+return !(iova & ~(vtd_iova_limit(ce) - 1));
+}
+
 static const uint64_t vtd_paging_entry_rsvd_field[] = {
 [0] = ~0ULL,
 /* For not large page */
@@ -656,13 +672,9 @@ static int vtd_iova_to_slpte(VTDContextEntry *ce, uint64_t 
iova,
 uint32_t level = vtd_get_level_from_context_entry(ce);
 uint32_t offset;
 uint64_t slpte;
-uint32_t ce_agaw = vtd_get_agaw_from_context_entry(ce);
 uint64_t access_right_check = 0;
 
-/* Check if @iova is above 2^X-1, where X is the minimum of MGAW
- * in CAP_REG and AW in context-entry.
- */
-if (iova & ~((1ULL << MIN(ce_agaw, VTD_MGAW)) - 1)) {
+if (!vtd_iova_range_check(iova, ce)) {
 error_report("IOVA 0x%"PRIx64 " exceeds limits", iova);
 return -VTD_FR_ADDR_BEYOND_MGAW;
 }
@@ -718,6 +730,166 @@ static int vtd_iova_to_slpte(VTDContextEntry *ce, 
uint64_t iova,
 }
 }
 
+typedef int (*vtd_page_walk_hook)(IOMMUTLBEntry *entry, void *private);
+
+/**
+ * vtd_page_walk_level - walk over specific level for IOVA range
+ *
+ * @addr: base GPA addr to start the walk
+ * @start: IOVA range start address
+ * @end: IOVA range end address (start <= addr < end)
+ * @hook_fn: hook func to be called when detected page
+ * @private: private data to be passed into hook func
+ * @read: whether parent level has read permission
+ * @write: whether parent level has write permission
+ * @skipped: accumulated skipped ranges
+ * @notify_unmap: whether we should notify invalid entries
+ */
+static int vtd_page_walk_level(dma_addr_t addr, uint64_t start,
+   uint64_t end, vtd_page_walk_hook hook_fn,
+   void *private, uint32_t level,
+   bool read, bool write, uint64_t *skipped,
+   bool notify_unmap)
+{
+bool read_cur, write_cur, entry_valid;
+uint32_t offset;
+uint64_t slpte;
+uint64_t subpage_size, subpage_mask;
+IOMMUTLBEntry entry;
+uint64_t iova = start;
+uint64_t iova_next;
+uint64_t skipped_local = 0;
+int ret = 0;
+
+trace_vtd_page_walk_level(addr, level, start, end);
+
+subpage_size = 1ULL << vtd_slpt_level_shift(level);
+subpage_mask = vtd_slpt_level_page_mask(level);
+
+while (iova < end) {
+iova_next = (iova & subpage_mask) + subpage_size;
+
+offset = vtd_iova_level_offset(iova, level);
+slpte = vtd_get_slpte(addr, offset);
+
+/*
+ * When one of the following case happens, we assume the whole
+ * range is invalid:
+ *
+ * 1. read block failed
+ * 2. reserved area non-zero
+ * 3. both read & write flag are not set
+ */
+
+if (slpte == (uint64_t)-1) {
+trace_vtd_page_walk_skip_read(iova, iova_next);
+skipped_local++;
+goto next;
+}
+
+if (vtd_slpte_nonzero_rsvd(slpte, level)) {
+trace_vtd_page_walk_skip_reserve(iova, iova_next);
+skipped_local++;
+goto next;
+}
+
+/* Permissions are stacked with parents' */
+read_cur = read && (slpte & VTD_SL_R);
+write_cur = write && (slpte & VTD_SL_W);
+
+/*
+ * As long as we have either read/write permission, this is
+ * a valid entry. The rule wor

[Qemu-devel] [RFC PATCH 10/13] memory: add MemoryRegionIOMMUOps.replay() callback

2016-12-06 Thread Peter Xu
Originally we had one memory_region_iommu_replay() function, which is
the default behavior to replay the translations of the whole IOMMU
region. However, on some platforms like x86, we may want our own replay
logic for IOMMU regions. This patch adds one more hook to IOMMUOps for
the callback, and it overrides the default if set.
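
The override pattern, reduced to a standalone sketch. The types and names
below are hypothetical, and the counters exist only so the example can be
checked:

```c
#include <stddef.h>

typedef struct SketchRegion SketchRegion;

typedef struct SketchIOMMUOps {
    /* Optional: leave NULL to get the generic replay. */
    void (*replay)(SketchRegion *r);
} SketchIOMMUOps;

struct SketchRegion {
    const SketchIOMMUOps *ops;
    unsigned generic_replays;   /* bookkeeping for the example only */
    unsigned custom_replays;
};

static void sketch_custom_replay(SketchRegion *r)
{
    r->custom_replays++;
}

static void sketch_replay(SketchRegion *r)
{
    /* A platform-provided callback takes precedence over the default. */
    if (r->ops->replay) {
        r->ops->replay(r);
        return;
    }
    r->generic_replays++;       /* stands in for the generic page loop */
}
```

Existing IOMMUs that leave the hook unset keep the old behavior without
any change on their side.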

Signed-off-by: Peter Xu 
---
 include/exec/memory.h | 2 ++
 memory.c  | 6 ++
 2 files changed, 8 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 9902e9e..6bdd12c 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -183,6 +183,8 @@ struct MemoryRegionIOMMUOps {
 void (*notify_flag_changed)(MemoryRegion *iommu,
 IOMMUNotifierFlag old_flags,
 IOMMUNotifierFlag new_flags);
+/* Set this up to provide customized IOMMU replay function */
+void (*replay)(MemoryRegion *iommu, IOMMUNotifier *notifier);
 };
 
 typedef struct CoalescedMemoryRange CoalescedMemoryRange;
diff --git a/memory.c b/memory.c
index 84c91fa..9ad6319 100644
--- a/memory.c
+++ b/memory.c
@@ -1624,6 +1624,12 @@ void memory_region_iommu_replay(MemoryRegion *mr, 
IOMMUNotifier *n,
 hwaddr addr, granularity;
 IOMMUTLBEntry iotlb;
 
+/* If the IOMMU has its own replay callback, override */
+if (mr->iommu_ops->replay) {
+mr->iommu_ops->replay(mr, n);
+return;
+}
+
 granularity = memory_region_iommu_get_min_page_size(mr);
 
 for (addr = 0; addr < memory_region_size(mr); addr += granularity) {
-- 
2.7.4




[Qemu-devel] [RFC PATCH 13/13] intel_iommu: use page_walk for iotlb inv notify

2016-12-06 Thread Peter Xu
Instead of calling translate() on every page for iotlb invalidations
(which is slower), we walk the pages when needed and notify in a hook
function. This also simplifies the code a bit.
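
For reference, the range handed to the walker is derived from the address
mask: a mask value of am covers 2^am contiguous pages. A small sketch of
that arithmetic, assuming 4 KiB pages, with made-up names, and using a
64-bit shift for safety in this standalone form:

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 0x1000ULL   /* assumed 4 KiB pages */

typedef struct SketchRange {
    uint64_t start;   /* inclusive */
    uint64_t end;     /* exclusive */
} SketchRange;

/* An address mask of `am` covers 2^am contiguous pages starting at addr. */
static SketchRange sketch_inv_range(uint64_t addr, uint8_t am)
{
    SketchRange r;

    r.start = addr;
    r.end = addr + (1ULL << am) * SKETCH_PAGE_SIZE;
    return r;
}
```

With am = 0 the range is a single page, and with am = 9 it spans 512 pages
(2 MiB with these assumptions).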

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c | 64 +++
 1 file changed, 19 insertions(+), 45 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 0220e63..226dbcd 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -149,23 +149,6 @@ static uint64_t vtd_set_clear_mask_quad(IntelIOMMUState 
*s, hwaddr addr,
 return new_val;
 }
 
-static int vtd_get_did_dev(IntelIOMMUState *s, uint8_t bus_num, uint8_t devfn,
-   uint16_t *domain_id)
-{
-VTDContextEntry ce;
-int ret_fr;
-
-assert(domain_id);
-
-ret_fr = vtd_dev_to_context_entry(s, bus_num, devfn, &ce);
-if (ret_fr) {
-return -1;
-}
-
-*domain_id =  VTD_CONTEXT_ENTRY_DID(ce.hi);
-return 0;
-}
-
 /* GHashTable functions */
 static gboolean vtd_uint64_equal(gconstpointer v1, gconstpointer v2)
 {
@@ -868,7 +851,8 @@ next:
  * @private: private data for the hook function
  */
 static int vtd_page_walk(VTDContextEntry *ce, uint64_t start, uint64_t end,
- vtd_page_walk_hook hook_fn, void *private)
+ vtd_page_walk_hook hook_fn, void *private,
+ bool notify_unmap)
 {
 dma_addr_t addr = vtd_get_slpt_base_from_context(ce);
 uint32_t level = vtd_get_level_from_context_entry(ce);
@@ -887,7 +871,7 @@ static int vtd_page_walk(VTDContextEntry *ce, uint64_t 
start, uint64_t end,
 trace_vtd_page_walk(ce->hi, ce->lo, start, end);
 
 return vtd_page_walk_level(addr, start, end, hook_fn, private,
-   level, true, true, NULL, false);
+   level, true, true, NULL, notify_unmap);
 }
 
 /* Map a device to its corresponding domain (context-entry) */
@@ -1238,39 +1222,29 @@ static void vtd_iotlb_domain_invalidate(IntelIOMMUState 
*s, uint16_t domain_id)
 &domain_id);
 }
 
+static int vtd_page_invalidate_notify_hook(IOMMUTLBEntry *entry,
+   void *private)
+{
+memory_region_notify_iommu((MemoryRegion *)private, *entry);
+return 0;
+}
+
 static void vtd_iotlb_page_invalidate_notify(IntelIOMMUState *s,
uint16_t domain_id, hwaddr addr,
uint8_t am)
 {
 IntelIOMMUNotifierNode *node;
+VTDContextEntry ce;
+int ret;
 
 QLIST_FOREACH(node, &(s->notifiers_list), next) {
 VTDAddressSpace *vtd_as = node->vtd_as;
-uint16_t vfio_domain_id;
-int ret = vtd_get_did_dev(s, pci_bus_num(vtd_as->bus), vtd_as->devfn,
-  &vfio_domain_id);
-if (!ret && domain_id == vfio_domain_id) {
-hwaddr original_addr = addr;
-
-while (addr < original_addr + (1 << am) * VTD_PAGE_SIZE) {
-IOMMUTLBEntry entry = s->iommu_ops.translate(
- &node->vtd_as->iommu,
- addr,
- IOMMU_NO_FAIL);
-
-if (entry.perm == IOMMU_NONE &&
-node->notifier_flag & IOMMU_NOTIFIER_UNMAP) {
-entry.target_as = &address_space_memory;
-entry.iova = addr & VTD_PAGE_MASK_4K;
-entry.translated_addr = 0;
-entry.addr_mask = ~VTD_PAGE_MASK(VTD_PAGE_SHIFT);
-memory_region_notify_iommu(&node->vtd_as->iommu, entry);
-addr += VTD_PAGE_SIZE;
-} else if (node->notifier_flag & IOMMU_NOTIFIER_MAP) {
-memory_region_notify_iommu(&node->vtd_as->iommu, 
entry);
-addr += entry.addr_mask + 1;
-}
-}
+ret = vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus),
+   vtd_as->devfn, &ce);
+if (!ret && domain_id == VTD_CONTEXT_ENTRY_DID(ce.hi)) {
+vtd_page_walk(&ce, addr, addr + (1 << am) * VTD_PAGE_SIZE,
+  vtd_page_invalidate_notify_hook,
+  (void *)&vtd_as->iommu, true);
 }
 }
 }
@@ -2623,7 +2597,7 @@ static void vtd_iommu_replay(MemoryRegion *mr, 
IOMMUNotifier *n)
  */
 trace_vtd_replay_ce_valid(bus_n, PCI_SLOT(vtd_as->devfn),
   PCI_FUNC(vtd_as->devfn), ce.hi, ce.lo);
-vtd_page_walk(&ce, 0, ~0, vtd_replay_hook, (void *)n);
+vtd_page_walk(&ce, 0, ~0, vtd_replay_hook, (void *)n, false);
 } else {
 trace_vtd_replay_ce_invalid(bus_n, PCI_SLOT(vtd_as->devfn),
 PCI_FUNC(vtd_as->devf

[Qemu-devel] [RFC PATCH 12/13] intel_iommu: do replay when context invalidate

2016-12-06 Thread Peter Xu
Before this patch we only invalidated the context cache when we received
context entry invalidations. However, it's possible that the invalidation
also contains a domain switch (only if cache-mode is enabled for vIOMMU).
In that case we need to notify all the registered components about the
new mapping.

Signed-off-by: Peter Xu 
---
 hw/i386/intel_iommu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 2fcd7af..0220e63 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1188,6 +1188,7 @@ static void vtd_context_device_invalidate(IntelIOMMUState 
*s,
 trace_vtd_inv_desc_cc_device(bus_n, (devfn_it >> 3) & 0x1f,
  devfn_it & 3);
 vtd_as->context_cache_entry.context_cache_gen = 0;
+memory_region_iommu_replay_all(&vtd_as->iommu);
 }
 }
 }
-- 
2.7.4




Re: [Qemu-devel] [Spice-devel] Postcopy+spice crash

2016-12-06 Thread Dr. David Alan Gilbert
* Gerd Hoffmann (kra...@redhat.com) wrote:
>   Hi,
> 
> > >> On a quick glance I'd blame the guest for sending corrupted commands.
> > >> Strange though that it happens on migration only, so there could be
> > >> a host issue too.  Or a timing issue triggered by migration.
> > >>
> > >> Which migration phase?
> > >
> > > This is the point at which it switches over in postcopy.
> > 
> > It looks like it's the vmstate (post) load phase of the qxl device on
> > destination host.
> 
> Dave, can you try "thread apply all bt" so we see the other threads too?
> That should show whenever it happens in post_load

Yes, I already have the full set of threads; you can see the qxl_post_load in
thread 1.

red_dispatcher_loadvm_commands: 
id 0, group 0, virt start 0, virt end , generation 0, delta 0
id 1, group 1, virt start 7fbe83c0, virt end 7fbe87bfe000, generation 0, 
delta 7fbe83c0
id 2, group 1, virt start 7fbe7fa0, virt end 7fbe83a0, generation 0, 
delta 7fbe7fa0
(./x86_64-softmmu/qemu-system-x86_64:22376): Spice-CRITICAL **: 
red_memslots.c:123:get_virt: slot_id 128 too big, addr=8000
Thread 12 (Thread 0x7fc0a0df2700 (LWP 22377)):
#0  0x7fc0aa42f1bd in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x7fc0aa42ad02 in _L_lock_791 () from /lib64/libpthread.so.0
#2  0x7fc0aa42ac08 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x556465736839 in qemu_mutex_lock (mutex=mutex@entry=0x556465d76120 ) at /root/git/qemu/util/qemu-thread-posix.c:64
#4  0x5564653e69d6 in qemu_mutex_lock_iothread () at /root/git/qemu/cpus.c:1296
#5  0x55646574596e in call_rcu_thread (opaque=) at /root/git/qemu/util/rcu.c:257
#6  0x7fc0aa428dc5 in start_thread () from /lib64/libpthread.so.0
#7  0x7fc0a61786ed in clone () from /lib64/libc.so.6
Thread 11 (Thread 0x7fc09f304700 (LWP 22379)):
#0  0x7fc0aa42c6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x556465736999 in qemu_cond_wait (cond=, mutex=mutex@entry=0x556465d76120 ) at /root/git/qemu/util/qemu-thread-posix.c:137
#2  0x5564653e6fe3 in qemu_kvm_wait_io_event (cpu=) at /root/git/qemu/cpus.c:964
#3  qemu_kvm_cpu_thread_fn (arg=0x556466688740) at /root/git/qemu/cpus.c:1003
#4  0x7fc0aa428dc5 in start_thread () from /lib64/libpthread.so.0
#5  0x7fc0a61786ed in clone () from /lib64/libc.so.6
Thread 10 (Thread 0x7fc09eb03700 (LWP 22380)):
#0  0x7fc0aa42c6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x556465736999 in qemu_cond_wait (cond=, mutex=mutex@entry=0x556465d76120 ) at /root/git/qemu/util/qemu-thread-posix.c:137
#2  0x5564653e6fe3 in qemu_kvm_wait_io_event (cpu=) at /root/git/qemu/cpus.c:964
#3  qemu_kvm_cpu_thread_fn (arg=0x5564666ea960) at /root/git/qemu/cpus.c:1003
#4  0x7fc0aa428dc5 in start_thread () from /lib64/libpthread.so.0
#5  0x7fc0a61786ed in clone () from /lib64/libc.so.6
Thread 9 (Thread 0x7fc09e302700 (LWP 22381)):
#0  0x7fc0aa42c6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x556465736999 in qemu_cond_wait (cond=, mutex=mutex@entry=0x556465d76120 ) at /root/git/qemu/util/qemu-thread-posix.c:137
#2  0x5564653e6fe3 in qemu_kvm_wait_io_event (cpu=) at /root/git/qemu/cpus.c:964
#3  qemu_kvm_cpu_thread_fn (arg=0x55646670a120) at /root/git/qemu/cpus.c:1003
#4  0x7fc0aa428dc5 in start_thread () from /lib64/libpthread.so.0
#5  0x7fc0a61786ed in clone () from /lib64/libc.so.6
Thread 8 (Thread 0x7fc09db01700 (LWP 22382)):
#0  0x7fc0aa42c6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x556465736999 in qemu_cond_wait (cond=, mutex=mutex@entry=0x556465d76120 ) at /root/git/qemu/util/qemu-thread-posix.c:137
#2  0x5564653e6fe3 in qemu_kvm_wait_io_event (cpu=) at /root/git/qemu/cpus.c:964
#3  qemu_kvm_cpu_thread_fn (arg=0x5564667298d0) at /root/git/qemu/cpus.c:1003
#4  0x7fc0aa428dc5 in start_thread () from /lib64/libpthread.so.0
#5  0x7fc0a61786ed in clone () from /lib64/libc.so.6
Thread 7 (Thread 0x7fbe7f9ff700 (LWP 22383)):
#0  0x7fc0aa42f49d in read () from /lib64/libpthread.so.0
#1  0x7fc0a8c36c01 in spice_backtrace_gstack () from /lib64/libspice-server.so.1
#2  0x7fc0a8c3e4f7 in spice_logv () from /lib64/libspice-server.so.1
#3  0x7fc0a8c3e655 in spice_log () from /lib64/libspice-server.so.1
#4  0x7fc0a8bfc6de in get_virt () from /lib64/libspice-server.so.1
#5  0x7fc0a8bfcb73 in red_get_data_chunks_ptr () from /lib64/libspice-server.so.1
#6  0x7fc0a8bff3fa in red_get_cursor_cmd () from /lib64/libspice-server.so.1
#7  0x7fc0a8c0fd79 in handle_dev_loadvm_commands () from /lib64/libspice-server.so.1
#8  0x7fc0a8bf9523 in dispatcher_handle_recv_read () from /lib64/libspice-server.so.1
#9  0x7fc0a8c1d5a5 in red_worker_main () from /lib64/libspice-server.so.1
#10 0x7fc0aa428dc5 in start_thread () from /lib64/libpthread.so.0
#11 0x7fc0a61786ed 

Re: [Qemu-devel] [PATCH v4 2/7] fw-cfg: turn FW_CFG_FILE_SLOTS into a device property

2016-12-06 Thread Igor Mammedov
On Thu,  1 Dec 2016 18:06:19 +0100
Laszlo Ersek  wrote:

> We'd like to raise the value of FW_CFG_FILE_SLOTS. Doing it naively could
> lead to problems with backward migration: a more recent QEMU (running an
> older machine type) would allow the guest, in fw_cfg_select(), to select a
> high key value that is unavailable in the same machine type implemented by
> the older (target) QEMU. On the target host, fw_cfg_data_read() for
> example could dereference nonexistent entries.
> 
> As first step, size the FWCfgState.entries[*] and FWCfgState.entry_order
> arrays dynamically. All three array sizes will be influenced by the new
> field (and device property) FWCfgState.file_slots.
> 
> Make the following changes:
> 
> - Replace the FW_CFG_FILE_SLOTS macro with FW_CFG_FILE_SLOTS_TRAD
>   (traditional count of fw_cfg file slots) in the header file. The value
>   remains 0x10.
> 
> - Replace all uses of FW_CFG_FILE_SLOTS with a helper function called
>   fw_cfg_file_slots(), returning the new property.
> 
> - Eliminate the macro FW_CFG_MAX_ENTRY, and replace all its uses with a
>   helper function called fw_cfg_max_entry().
> 
> - In the MMIO- and IO-mapped realize functions both, allocate all three
>   arrays dynamically, based on the new property.
> 
> - The new property defaults to 0x20; however at the moment we forcibly set
>   it to FW_CFG_FILE_SLOTS_TRAD on all code paths available to board code
>   (namely in the fw_cfg_init_io_dma() and fw_cfg_init_mem_wide() helper
>   functions). This is going to be customized in the following patches.
> 
> Cc: "Gabriel L. Somlo" 
> Cc: "Michael S. Tsirkin" 
> Cc: Gerd Hoffmann 
> Cc: Igor Mammedov 
> Cc: Paolo Bonzini 
> Signed-off-by: Laszlo Ersek 
> ---
> 
> Notes:
> I know that upstream doesn't care about backward migration, but some
> downstreams might.
> 
>  docs/specs/fw_cfg.txt          |  2 +-
>  include/hw/nvram/fw_cfg_keys.h |  3 +-
>  hw/nvram/fw_cfg.c              | 85 ++
>  3 files changed, 79 insertions(+), 11 deletions(-)
> 
> diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt
> index a19e2adbe1c6..84e2978706f5 100644
> --- a/docs/specs/fw_cfg.txt
> +++ b/docs/specs/fw_cfg.txt
> @@ -154,11 +154,11 @@ Selector Reg.Range Usage
>  0x8000 - 0xbfff  Arch. Specific (0x - 0x3fff, generally RO, possibly RW
>   through the DMA interface in QEMU v2.9+)
>  0xc000 - 0x  Arch. Specific (0x - 0x3fff, RW, ignored in v2.4+)
>  
>  In practice, the number of allowed firmware configuration items is given
> -by the value of FW_CFG_MAX_ENTRY (see fw_cfg.h).
> +by the value (FW_CFG_FILE_FIRST + FW_CFG_FILE_SLOTS_TRAD) (see fw_cfg.h).
>  
>  = Guest-side DMA Interface =
>  
>  If bit 1 of the feature bitmap is set, the DMA interface is present. This does
>  not replace the existing fw_cfg interface, it is an add-on. This interface
> diff --git a/include/hw/nvram/fw_cfg_keys.h b/include/hw/nvram/fw_cfg_keys.h
> index 0f3e871884c0..627589793671 100644
> --- a/include/hw/nvram/fw_cfg_keys.h
> +++ b/include/hw/nvram/fw_cfg_keys.h
> @@ -27,12 +27,11 @@
>  #define FW_CFG_SETUP_SIZE   0x17
>  #define FW_CFG_SETUP_DATA   0x18
>  #define FW_CFG_FILE_DIR 0x19
>  
>  #define FW_CFG_FILE_FIRST   0x20
> -#define FW_CFG_FILE_SLOTS   0x10
> -#define FW_CFG_MAX_ENTRY(FW_CFG_FILE_FIRST + FW_CFG_FILE_SLOTS)
> +#define FW_CFG_FILE_SLOTS_TRAD  0x10
>  
>  #define FW_CFG_WRITE_CHANNEL0x4000
>  #define FW_CFG_ARCH_LOCAL   0x8000
>  #define FW_CFG_ENTRY_MASK   (~(FW_CFG_WRITE_CHANNEL | FW_CFG_ARCH_LOCAL))
>  
> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> index e0145c11a19b..2e1441c09750 100644
> --- a/hw/nvram/fw_cfg.c
> +++ b/hw/nvram/fw_cfg.c
> @@ -31,10 +31,13 @@
>  #include "hw/sysbus.h"
>  #include "trace.h"
>  #include "qemu/error-report.h"
>  #include "qemu/config-file.h"
>  #include "qemu/cutils.h"
> +#include "qapi/error.h"
> +
> +#define FW_CFG_FILE_SLOTS_DFLT 0x20
>  
>  #define FW_CFG_NAME "fw_cfg"
>  #define FW_CFG_PATH "/machine/" FW_CFG_NAME
>  
>  #define TYPE_FW_CFG "fw_cfg"
> @@ -69,12 +72,13 @@ typedef struct FWCfgEntry {
>  struct FWCfgState {
>  /*< private >*/
>  SysBusDevice parent_obj;
>  /*< public >*/
>  
> -FWCfgEntry entries[2][FW_CFG_MAX_ENTRY];
> -int entry_order[FW_CFG_MAX_ENTRY];
> +uint32_t file_slots;
should it be uint16_t?
As below you use "uint16_t file_slots_max;" and do some UINT16
to calculate max limit.

> +FWCfgEntry *entries[2];
> +int *entry_order;
>  FWCfgFiles *files;
>  uint16_t cur_entry;
>  uint32_t cur_offset;
>  Notifier machine_ready;
>  
> @@ -255,17 +259,27 @@ static void fw_cfg_reboot(FWCfgState *s)
>  static void fw_cfg_write(FWCfgState *s, uint8_t value)
>  {
>  /* nothing, write support removed in QEMU v2.4+ */
>  }
>  
> +static inline uint32_t fw_cfg_file_slots(const FWCfgState *s)
> +{
> +return s->file_slots;
> +}
so fa

Re: [Qemu-devel] [RFC PATCH 00/13] VT-d replay and misc cleanup

2016-12-06 Thread Peter Xu
On Tue, Dec 06, 2016 at 06:36:15PM +0800, Peter Xu wrote:
> This RFC series continues the work of Aviv B.D.'s vfio enablement
> series for VT-d. Aviv has done a great job there; what is still
> missing is mostly the following:
> 
> (1) VFIO gets duplicated IOTLB notifications because the VT-d IOMMU
> memory region is split.
> 
> (2) VT-d still doesn't provide a correct replay() mechanism (e.g.,
> when the IOMMU domain switches, things will break).
> 
> Here I'm trying to solve the above two issues.
> 
> (1) is solved by patch 7, (2) is solved by patch 11-12.

One thing to mention: this series is based on Aviv's series v6:

  https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg01452.html

I pushed an online branch for better reference:

  https://github.com/xzpeter/qemu/tree/vtd-vfio-enablement

Forgot to CC Paolo & DavidG. Adding in.

Thanks,

-- peterx



Re: [Qemu-devel] [PATCH v7 0/5] IOMMU: intel_iommu support map and unmap notifications

2016-12-06 Thread Peter Xu
On Tue, Dec 06, 2016 at 04:27:39PM +0800, Lan Tianyu wrote:

[...]

> > 
> >> A user space driver (e.g. DPDK) can also enable/disable
> >> IOVA for a device dynamically.
> > 
> > Could you provide more detail (or any pointer) on how to do that? I
> > did try to find it myself; I see a VFIO_IOMMU_ENABLE ioctl, but it
> > looks like it is for ppc only.
> 
> No, I was just giving an example of what user space might do; I have not
> researched it further. But since QEMU can already enable a device's IOVA,
> other user applications should be able to do that with the same VFIO
> interface, right?

AFAIU we can't do that, at least on x86. We can use the vfio interface to
bind a group into a container, but we should not be able to dynamically
disable IOMMU protection. IIUC that needs to taint the kernel.

The only way I know is to probe vfio-pci in no-iommu mode; in that
case the IOMMU is disabled, but we can never dynamically enable it
either.

Please correct me if I am wrong.

Thanks,

-- peterx



Re: [Qemu-devel] [Qemu-block] Meeting notes on -blockdev, dynamic backend reconfiguration

2016-12-06 Thread Stefan Hajnoczi
On Mon, Dec 05, 2016 at 01:03:50PM +0100, Markus Armbruster wrote:
> I recently met Kevin, and we discussed two block layer topics in some
> depth.

Thanks for sharing this.  Both topics were a good read and the direction
you are heading in looks good.

Stefan


signature.asc
Description: PGP signature


[Qemu-devel] RISU TCG failures (AArch64 on AArch64)

2016-12-06 Thread Alex Bennée

Hi Claudio,

I've been fixing up the RISU tests for AArch64 while I was reviewing
Richard's latest TCG series. In the process I discovered a bunch of them
fail when run on an ARMv8 host using TCG although they pass on my x86_64
desktop machine. I'm assuming (but I could be wrong) this means the bug
is in the backend so I was wondering if you could have a look?

The failing binaries can be found at:

  http://people.linaro.org/~alex.bennee/testcases/arm64.risu/

The risu binary is statically compiled for easy running, and you can run
it with the record/playback traces:

  $QEMU ./risu testcase.risu.bin -t testcase.risu.bin.trace

And the failing tests are:

  testcases.aarch64/insn_ADDPv_ADD_RES1_ADD_RES2_ADDS__INC.risu.bin
  testcases.aarch64/insn_BSL_CCMN_CCMNi_CCMP__INC.risu.bin
  testcases.aarch64/insn_MOVI_MOVK_MOVN_MOVZ__INC.risu.bin
  testcases.aarch64/insn_UCVTFv_UCVTFvf_UCVTFvis_UCVTFv_RES1__INC.risu.bin

I ran up the failures with in_asm,op,op_opt,out_asm:

loading test image testcases.aarch64/insn_ADDPv_ADD_RES1_ADD_RES2_ADDS__INC.risu.bin...
starting apprentice image at 0x4000801000
Executed 100 test instructions (pc=0x4000801574).
Executed 200 test instructions (pc=0x4000801aec).
Executed 300 test instructions (pc=0x4000801d08).
Executed 400 test instructions (pc=0x4000802264).
Executed 500 test instructions (pc=0x40008027b8).
Executed 600 test instructions (pc=0x40008029e8).
Executed 700 test instructions (pc=0x4000802f3c).
Executed 800 test instructions (pc=0x40008034a8).
Executed 900 test instructions (pc=0x40008036cc).
Executed 1000 test instructions (pc=0x4000803c2c).
IN:
0x004000803cac:  ab1feff6  adds x22, xzr, xzr, lsl #59
0x004000803cb0:  5af0  unallocated (Unallocated)

OP:
 ld_i32 tmp0,env,$0xfff8
 movi_i32 tmp1,$0x0
 brcond_i32 tmp0,tmp1,ne,$L0

  004000803cac  
 movi_i64 tmp2,$0x0
 movi_i64 tmp3,$0x0
 movi_i64 tmp4,$0x3b
 shl_i64 tmp3,tmp3,tmp4
 movi_i64 tmp7,$0x0
 add2_i64 tmp5,tmp6,tmp2,tmp7,tmp3,tmp7
 mov_i32 CF,tmp6
 mov_i32 ZF,tmp5
 movi_i64 tmp9,$0x20
 shr_i64 tmp8,tmp5,tmp9
 mov_i32 NF,tmp8
 or_i32 ZF,ZF,NF
 xor_i64 tmp6,tmp5,tmp2
 xor_i64 tmp7,tmp2,tmp3
 andc_i64 tmp6,tmp6,tmp7
 movi_i64 tmp8,$0x20
 shr_i64 tmp7,tmp6,tmp8
 mov_i32 VF,tmp7
 mov_i64 tmp4,tmp5
 mov_i64 x22,tmp4

  004000803cb0  
 movi_i64 pc,$0x4000803cb0
 movi_i32 tmp0,$0x1
 movi_i32 tmp1,$0x200
 movi_i32 tmp10,$0x1
 call exception_with_syndrome,$0x0,$0,env,tmp0,tmp1,tmp10
 set_label $L0
 exit_tb $0x7f7f530323

OP after optimization and liveness analysis:
 ld_i32 tmp0,env,$0xfff8  dead: 1
 movi_i32 tmp1,$0x0
 brcond_i32 tmp0,tmp1,ne,$L0  dead: 0 1

  004000803cac  
 movi_i64 tmp2,$0x0
 movi_i64 tmp3,$0x0
 movi_i64 tmp7,$0x0
 add2_i64 tmp5,tmp6,tmp2,tmp7,tmp3,tmp7   dead: 2 3 4 5
 mov_i32 CF,tmp6  sync: 0  dead: 0 1
 mov_i32 ZF,tmp5
 movi_i64 tmp9,$0x20
 shr_i64 tmp8,tmp5,tmp9   dead: 2
 mov_i32 NF,tmp8  sync: 0  dead: 1
 or_i32 ZF,ZF,NF  sync: 0  dead: 0 1 2
 mov_i64 tmp6,tmp5
 movi_i64 tmp8,$0x20
 shr_i64 tmp7,tmp6,tmp8   dead: 1 2
 mov_i32 VF,tmp7  sync: 0  dead: 0 1
 mov_i64 tmp4,tmp5   dead: 1
 mov_i64 x22,tmp4 sync: 0  dead: 0 1

  004000803cb0  
 movi_i64 pc,$0x4000803cb0   sync: 0  dead: 0
 movi_i32 tmp0,$0x1
 movi_i32 tmp1,$0x200
 movi_i32 tmp10,$0x1
 call exception_with_syndrome,$0x0,$0,env,tmp0,tmp1,tmp10  dead: 0 1 2 3
 set_label $L0
 exit_tb $0x7f7f530323

OUT: [size=108]
0x556a9c11a0:  b85f8274  ldur w20, [x19, #-8]
0x556a9c11a4:  350002d4  cbnz w20, #+0x58 (addr 0x556a9c11fc)
0x556a9c11a8:  b10003f4  adds x20, sp, #0x0 (0)
0x556a9c11ac:  9a1f03f5  adc x21, xzr, xzr
0x556a9c11b0:  b9020275  str w21, [x19, #512]
0x556a9c11b4:  2a1403f5  mov w21, w20
0x556a9c11b8:  d360fe96  lsr x22, x20, #32
0x556a9c11bc:  b9020a76  str w22, [x19, #520]
0x556a9c11c0:  2a1602b5  orr w21, w21, w22
0x556a9c11c4:  b9020e75  str w21, [x19, #524]
0x556a9c11c8:  aa1403f5  mov x21, x20
0x556a9c11cc:  d360feb5  lsr x21, x21, #32
0x556a9c11d0:  b9020675  str w21, [x19, #516]
0x556a9c11d4:  f9007a74  str x20, [x19, #240]
0x556a9c11d8:  d2879614  mov x20, #0x3cb0
0x556a9c11dc:  f2a01014  movk x20, #0x80, lsl #16
0x556a9c11e0:  f2c00814  movk x20, #0x40, lsl #32
0x556a9c11e4:  f900a274  str x20, [x19, #320]
0x556a9c11e8:  aa1303e0  mov x0, x19
0x556a9c11ec:  52800021  mov w1, #0x1
0x556a9c11f0:  320703e2  orr w2, wzr, #0x200
0x556a9c11f4:  52800023  mov w3, #0x1
0x556a9c11f8:  97f78fa5  bl #-0x21c16c (addr 0x556a7a508c)
0x556a9c

[Qemu-devel] [PATCH] qemu-img: fix in-flight count for qemu-img bench

2016-12-06 Thread Paolo Bonzini
With aio=native (qemu-img bench -n) one or more requests can be completed
when a new request is submitted.  This in turn can cause bench_cb to
recurse before b->in_flight is updated.  The blk_aio_pwritev coroutines
are never freed, and qemu-img aborts.

Signed-off-by: Paolo Bonzini 
---
 qemu-img.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 6949b73..607dbe5 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -3559,6 +3559,9 @@ static void bench_cb(void *opaque, int ret)
 }
 
 while (b->n > b->in_flight && b->in_flight < b->nrreq) {
+b->in_flight++;
+b->offset += b->step;
+b->offset %= b->image_size;
 if (b->write) {
 acb = blk_aio_pwritev(b->blk, b->offset, b->qiov, 0,
   bench_cb, b);
@@ -3570,9 +3573,6 @@ static void bench_cb(void *opaque, int ret)
 error_report("Failed to issue request");
 exit(EXIT_FAILURE);
 }
-b->in_flight++;
-b->offset += b->step;
-b->offset %= b->image_size;
 }
 }
 
-- 
2.9.3




Re: [Qemu-devel] [kvm-unit-tests RFC 00/15] arm/arm64: add ITS framework

2016-12-06 Thread Andre Przywara
Hi,

On 06/12/16 09:48, Andrew Jones wrote:
> On Mon, Dec 05, 2016 at 10:46:31PM +0100, Eric Auger wrote:
>> This series proposes a framework to test the virtual ITS.
>> This is based on Drew's v7 series [1]. The last patch tests
>> several ITS commands (collection/device mapping, interrupt
>> translation service entry creation and LPI trigger through INT
>> command). At this point we don't use any external PCIe device
>> to write into the GITS_TRANSLATER register.
>>
>> The bulk of the code derives from the ITS driver code so all
>> the credit is due to Marc.
>>
>> Many other ITS commands could be tested. Also existing MMIO
>> accesses could be enhanced into standalone tests. Current focus
>> was to make it functional.
>>
>> The code deserves more cleanup with respect to cacheability
>> attributes in general.
>>
>> Tested on Cavium ThunderX [2].
>>
>> Best Regards
>>
>> Eric
>>
>> [1] [kvm-unit-tests PATCH v7 00/11] arm/arm64: add gic framework
>>
>> [2] sample command line:
>>
>> $QEMU -machine virt,accel=kvm -cpu host \
>>  -device virtio-serial-device \
>>  -device virtconsole,chardev=ctd -chardev testdev,id=ctd \
>>  -display none -serial stdio \
>>  -kernel arm/gic.flat \
>>  -smp 8 -machine gic-version=3 -append 'its'
>>
>> Eric Auger (15):
>>   libcflat: Add other size defines
>>   arm/arm64: gicv3: Add some re-distributor defines
>>   arm/arm64: ITS skeleton
>>   arm/arm64: ITS: BASER parsing and setup
>>   arm/arm64: GICv3: add cpu count
>>   arm/arm64: ITS: Set the LPI config and pending tables
>>   arm/arm64: ITS: Init the command queue
>>   arm/arm64: ITS: enable LPIs at re-distributor level
>>   arm/arm64: ITS: Parse the typer register
>>   arm/arm64: ITS: its_enable_defaults
>>   arm/arm64: ITS: create device
>>   arm/arm64: ITS: create collection
>>   arm/arm64: ITS: commands
>>   arm/arm64: gic: Generalize ipi_enable()
>>   arm/arm64: ITS test
>>
>>  arm/Makefile.common        |   1 +
>>  arm/gic.c                  | 101 +++-
>>  lib/arm/asm/gic-v3-its.h   | 238 +++
>>  lib/arm/asm/gic-v3.h       |  84 ++
>>  lib/arm/asm/gic.h          |   1 +
>>  lib/arm/gic-v3-its-cmd.c   | 399 +
>>  lib/arm/gic-v3-its.c       | 305 ++
>>  lib/arm/gic-v3.c           |   2 +
>>  lib/arm/gic.c              |  30 +++-
>>  lib/arm64/asm/gic-v3-its.h |   1 +
>>  lib/libcflat.h             |   3 +
>>  11 files changed, 1154 insertions(+), 11 deletions(-)
>>  create mode 100644 lib/arm/asm/gic-v3-its.h
>>  create mode 100644 lib/arm/gic-v3-its-cmd.c
>>  create mode 100644 lib/arm/gic-v3-its.c
>>  create mode 100644 lib/arm64/asm/gic-v3-its.h
>>
>> -- 
>> 2.5.5
>>
>>
> 
> Thanks for this Eric! I'm glad to see we're getting more GIC test
> coverage written, even before v8 of the gic series is posted :-)
> v8 will be rebased on some sysreg stuff Wei is doing for the PMU
> series,

Are you planning on a v8 post any time soon?

> that's why it's held up. I'll need to set plenty of time
> aside to learn enough in order to review all the 'ITS:' patches
> in this series.

Are you sure you want to really taint yourself with this stuff? You
wouldn't be the first who risks his mental health by understanding the
ITS ;-)

That being said, I will take a look, I am in ITS land anyway for Xen ...

Cheers,
Andre.


> Apologies if I can't get to it right away.
> 
> Thanks again,
> drew
> 



Re: [Qemu-devel] [PATCH] qemu-img: fix in-flight count for qemu-img bench

2016-12-06 Thread Kevin Wolf
Am 06.12.2016 um 12:08 hat Paolo Bonzini geschrieben:
> With aio=native (qemu-img bench -n) one or more requests can be completed
> when a new request is submitted.  This in turn can cause bench_cb to
> recurse before b->in_flight is updated.  The blk_aio_pwritev coroutines
> are never freed, and qemu-img aborts.
> 
> Signed-off-by: Paolo Bonzini 
> ---
>  qemu-img.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/qemu-img.c b/qemu-img.c
> index 6949b73..607dbe5 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -3559,6 +3559,9 @@ static void bench_cb(void *opaque, int ret)
>  }
>  
>  while (b->n > b->in_flight && b->in_flight < b->nrreq) {
> +b->in_flight++;
> +b->offset += b->step;
> +b->offset %= b->image_size;
>  if (b->write) {
>  acb = blk_aio_pwritev(b->blk, b->offset, b->qiov, 0,
>bench_cb, b);

This implicitly adds b->step to the initial offset because the write
request now uses the already updated offset. We should probably save the
old value and use that for the request.

Also, maybe add a short comment to the code (rather than just to the
commit message) that explains why the update has to be first?
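
A minimal sketch of what saving the old value could look like, modelled
outside of QEMU. Everything here is a hypothetical stand-in rather than
qemu-img code: the Bench struct, submit() and request_done() are made up
for illustration, and submit() completes every second request
synchronously to mimic the recursion described in the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ISSUED 64

/* Hypothetical benchmark state; field names loosely follow the patch. */
typedef struct Bench {
    int n;              /* requests not yet completed */
    int in_flight;      /* requests submitted but not yet completed */
    int nrreq;          /* maximum queue depth */
    uint64_t offset, step, image_size;
    uint64_t issued[MAX_ISSUED];    /* offsets actually submitted */
    int nissued;
    int max_in_flight;  /* highest in_flight seen at submit time */
} Bench;

static void bench_fill(Bench *b);

/* Completion callback: account for the finished request, then refill. */
static void request_done(Bench *b)
{
    b->n--;
    b->in_flight--;
    bench_fill(b);
}

/* Hypothetical submit: every second request completes synchronously,
 * so request_done() can run before submit() returns. */
static void submit(Bench *b, uint64_t offset)
{
    assert(b->nissued < MAX_ISSUED);
    b->issued[b->nissued++] = offset;
    if (b->in_flight > b->max_in_flight) {
        b->max_in_flight = b->in_flight;
    }
    if (b->nissued % 2 == 0) {
        request_done(b);
    }
}

static void bench_fill(Bench *b)
{
    while (b->n > b->in_flight && b->in_flight < b->nrreq) {
        /* Save the offset for this request first. */
        uint64_t offset = b->offset;

        /* Update the accounting before submitting: submit() may
         * re-enter bench_fill() through the completion callback. */
        b->in_flight++;
        b->offset = (b->offset + b->step) % b->image_size;

        submit(b, offset);
    }
}
```

With n = 4, nrreq = 2 and step = 512, calling bench_fill() once and then
request_done() until in_flight drops to zero submits offsets 0, 512, 1024
and 1536, so the first request still starts at the initial offset.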

Kevin



Re: [Qemu-devel] [kvm-unit-tests RFC 00/15] arm/arm64: add ITS framework

2016-12-06 Thread Andrew Jones
On Tue, Dec 06, 2016 at 11:14:41AM +, Andre Przywara wrote:
> Hi,
> 
> On 06/12/16 09:48, Andrew Jones wrote:
> > On Mon, Dec 05, 2016 at 10:46:31PM +0100, Eric Auger wrote:
> >> This series proposes a framework to test the virtual ITS.
> >> This is based on Drew's v7 series [1]. The last patch tests
> >> several ITS commands (collection/device mapping, interrupt
> >> translation service entry creation and LPI trigger through INT
> >> command). At this point we don't use any external PCIe device
> >> to write into the GITS_TRANSLATER register.
> >>
> >> The bulk of the code derives from the ITS driver code so all
> >> the credit is due to Marc.
> >>
> >> Many other ITS commands could be tested. Also existing MMIO
> >> accesses could be enhanced into standalone tests. Current focus
> >> was to make it functional.
> >>
> >> The code deserves more cleanup with respect to cacheability
> >> attributes in general.
> >>
> >> Tested on Cavium ThunderX [2].
> >>
> >> Best Regards
> >>
> >> Eric
> >>
> >> [1] [kvm-unit-tests PATCH v7 00/11] arm/arm64: add gic framework
> >>
> >> [2] sample command line:
> >>
> >> $QEMU -machine virt,accel=kvm -cpu host \
> >>  -device virtio-serial-device \
> >>  -device virtconsole,chardev=ctd -chardev testdev,id=ctd \
> >>  -display none -serial stdio \
> >>  -kernel arm/gic.flat \
> >>  -smp 8 -machine gic-version=3 -append 'its'
> >>
> >> Eric Auger (15):
> >>   libcflat: Add other size defines
> >>   arm/arm64: gicv3: Add some re-distributor defines
> >>   arm/arm64: ITS skeleton
> >>   arm/arm64: ITS: BASER parsing and setup
> >>   arm/arm64: GICv3: add cpu count
> >>   arm/arm64: ITS: Set the LPI config and pending tables
> >>   arm/arm64: ITS: Init the command queue
> >>   arm/arm64: ITS: enable LPIs at re-distributor level
> >>   arm/arm64: ITS: Parse the typer register
> >>   arm/arm64: ITS: its_enable_defaults
> >>   arm/arm64: ITS: create device
> >>   arm/arm64: ITS: create collection
> >>   arm/arm64: ITS: commands
> >>   arm/arm64: gic: Generalize ipi_enable()
> >>   arm/arm64: ITS test
> >>
> >>  arm/Makefile.common        |   1 +
> >>  arm/gic.c                  | 101 +++-
> >>  lib/arm/asm/gic-v3-its.h   | 238 +++
> >>  lib/arm/asm/gic-v3.h       |  84 ++
> >>  lib/arm/asm/gic.h          |   1 +
> >>  lib/arm/gic-v3-its-cmd.c   | 399 +
> >>  lib/arm/gic-v3-its.c       | 305 ++
> >>  lib/arm/gic-v3.c           |   2 +
> >>  lib/arm/gic.c              |  30 +++-
> >>  lib/arm64/asm/gic-v3-its.h |   1 +
> >>  lib/libcflat.h             |   3 +
> >>  11 files changed, 1154 insertions(+), 11 deletions(-)
> >>  create mode 100644 lib/arm/asm/gic-v3-its.h
> >>  create mode 100644 lib/arm/gic-v3-its-cmd.c
> >>  create mode 100644 lib/arm/gic-v3-its.c
> >>  create mode 100644 lib/arm64/asm/gic-v3-its.h
> >>
> >> -- 
> >> 2.5.5
> >>
> >>
> > 
> > Thanks for this Eric! I'm glad to see we're getting more GIC test
> > coverage written, even before v8 of the gic series is posted :-)
> > v8 will be rebased on some sysreg stuff Wei is doing for the PMU
> > series,
> 
> Are you planning on a v8 post any time soon?

I think Wei is going to post PMU today. Hopefully everyone will be happy
with the sysreg bits. If so, then I'll post v8 no later than tomorrow
sometime.

> 
> > that's why it's held up. I'll need to set plenty of time
> > aside to learn enough in order to review all the 'ITS:' patches
> > in this series.
> 
> Are you sure you want to really taint yourself with this stuff? You
> wouldn't be the first who risks his mental health by understanding the
> ITS ;-)
> 
> That being said, I will take a look, I am in ITS land anyway for Xen ...

Excellent! Thanks!

drew

> 
> Cheers,
> Andre.
> 
> 
> > Apologies if I can't get to it right away.
> > 
> > Thanks again,
> > drew
> > 



Re: [Qemu-devel] RISU TCG failures (AArch64 on AArch64)

2016-12-06 Thread Peter Maydell
On 6 December 2016 at 11:06, Alex Bennée  wrote:

A quick eyeball of the logs:

> loading test image testcases.aarch64/insn_ADDPv_ADD_RES1_ADD_RES2_ADDS__INC.risu.bin...
> starting apprentice image at 0x4000801000
> Executed 100 test instructions (pc=0x4000801574).
> Executed 200 test instructions (pc=0x4000801aec).
> Executed 300 test instructions (pc=0x4000801d08).
> Executed 400 test instructions (pc=0x4000802264).
> Executed 500 test instructions (pc=0x40008027b8).
> Executed 600 test instructions (pc=0x40008029e8).
> Executed 700 test instructions (pc=0x4000802f3c).
> Executed 800 test instructions (pc=0x40008034a8).
> Executed 900 test instructions (pc=0x40008036cc).
> Executed 1000 test instructions (pc=0x4000803c2c).
> IN:
> 0x004000803cac:  ab1feff6  adds x22, xzr, xzr, lsl #59

>   004000803cac  
>  movi_i64 tmp2,$0x0
>  movi_i64 tmp3,$0x0
>  movi_i64 tmp7,$0x0
>  add2_i64 tmp5,tmp6,tmp2,tmp7,tmp3,tmp7   dead: 2 3 4 5
>  mov_i32 CF,tmp6  sync: 0  dead: 0 1
>  mov_i32 ZF,tmp5
>  movi_i64 tmp9,$0x20
>  shr_i64 tmp8,tmp5,tmp9   dead: 2
[...]

> OUT: [size=108]
> 0x556a9c11a0:  b85f8274  ldur w20, [x19, #-8]
> 0x556a9c11a4:  350002d4  cbnz w20, #+0x58 (addr 0x556a9c11fc)
> 0x556a9c11a8:  b10003f4  adds x20, sp, #0x0 (0)
> 0x556a9c11ac:  9a1f03f5  adc x21, xzr, xzr

Looks like the add2_i64 has emitted a bogus adds, because
it thinks adds (immediate) allows use of xzr when it doesn't
(for this insn Rn==31 means SP, not XZR).
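
To illustrate the register-31 distinction, here is a small decoder
sketch. The helper names are made up for illustration and this is not
TCG backend code; the bit positions are taken from the ARMv8-A
add/subtract (immediate) and add/subtract (shifted register) encodings.
It only covers those two classes (the extended-register form is not
handled).

```c
#include <stdbool.h>
#include <stdint.h>

/* A64 add/subtract (immediate): bits [28:24] == 0b10001. */
static bool is_addsub_imm(uint32_t insn)
{
    return ((insn >> 24) & 0x1f) == 0x11;
}

/* A64 add/subtract (shifted register): bits [28:24] == 0b01011
 * and bit 21 == 0. */
static bool is_addsub_shifted_reg(uint32_t insn)
{
    return ((insn >> 24) & 0x1f) == 0x0b && ((insn >> 21) & 1) == 0;
}

/* Rn lives in bits [9:5], Rd in bits [4:0]. */
static unsigned rn_field(uint32_t insn)
{
    return (insn >> 5) & 0x1f;
}

static unsigned rd_field(uint32_t insn)
{
    return insn & 0x1f;
}

/* True when the encoding reads the stack pointer through Rn:
 * in the immediate form register 31 means SP/WSP, whereas in the
 * shifted-register form it means XZR/WZR. */
static bool rn_reads_sp(uint32_t insn)
{
    return is_addsub_imm(insn) && rn_field(insn) == 31;
}
```

Fed with the words from the logs above, rn_reads_sp() is true for the
generated 0xb10003f4 and 0x310073f5 (immediate form, Rn == 31) and false
for the guest's 0xab1feff6 (shifted-register form, where 31 is XZR).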


> loading test image testcases.aarch64/insn_BSL_CCMN_CCMNi_CCMP__INC.risu.bin...
> starting apprentice image at 0x4000801000
> 
> IN:
> 0x00400080157c:  3a5cabed  ccmn wzr, #28, #NZcV, ge
> 0x004000801580:  5af0  unallocated (Unallocated)

> OP after optimization and liveness analysis:
>  ld_i32 tmp0,env,$0xfff8  dead: 1
>  movi_i32 tmp1,$0x0
>  brcond_i32 tmp0,tmp1,ne,$L0  dead: 0 1
>
>   00400080157c  
>  xor_i32 tmp1,VF,NF   dead: 1 2
>  movi_i32 tmp2,$0x0
>  setcond_i32 tmp0,tmp1,tmp2,lt   dead: 1 2
>  movi_i64 tmp3,$0x1c
>  movi_i64 tmp4,$0x0
>  movi_i32 tmp6,$0x0
>  mov_i32 tmp1,tmp4   dead: 1
>  mov_i32 tmp2,tmp3   dead: 1
>  add2_i32 NF,CF,tmp1,tmp6,tmp2,tmp6   dead: 3 5

> 0x55956bfc4c:  310073f5  adds w21, wsp, #0x1c (28)
> 0x55956bfc50:  1a1f03f6  adc w22, wzr, wzr

Same again but with add2_i32.

> loading test image testcases.aarch64/insn_MOVI_MOVK_MOVN_MOVZ__INC.risu.bin...


> 0x00400080be2c:  6f01e7a0  movi v0.2d, #0x00ff
> 0x00400080be30:  5af0  unallocated (Unallocated)

>   00400080be2c  
>  movi_i64 tmp2,$0x00ff
>  mov_i64 tmp3,tmp2   dead: 1
>  st_i64 tmp3,env,$0x838
>  st_i64 tmp3,env,$0x840   dead: 0

> OUT: [size=72]
> 0x5561a8beb0:  b85f8274  ldur w20, [x19, #-8]
> 0x5561a8beb4:  350001b4  cbnz w20, #+0x34 (addr 0x5561a8bee8)
> 0x5561a8beb8:  929fe014  mov x20, #0x00ff

Generated move with wrong constant value...

> loading test image testcases.aarch64/insn_UCVTFv_UCVTFvf_UCVTFvis_UCVTFv_RES1__INC.risu.bin...
> starting apprentice image at 0x4000801000
> Executed 100 test instructions (pc=0x400080158c).
> Executed 200 test instructions (pc=0x4000801b10).
> Executed 300 test instructions (pc=0x400080208c).
> Executed 400 test instructions (pc=0x40008022c8).
> Executed 500 test instructions (pc=0x4000802880).
> Executed 600 test instructions (pc=0x4000802e10).
> Executed 700 test instructions (pc=0x4000803034).
> Executed 800 test instructions (pc=0x40008035a8).
> IN:
> 0x004000803a74:  6f01e6fc  movi v28.2d, #0x00ff
> 0x004000803a78:  5af0  unallocated (Unallocated)

>  movi_i64 tmp2,$0x00ff

> OUT: [size=72]
> 0x555c897290:  b85f8274  ldur w20, [x19, #-8]
> 0x555c897294:  350001b4  cbnz w20, #+0x34 (addr 0x555c8972c8)
> 0x555c897298:  92bfe014  mov x20, #0x00ff

Same thing again.

So there are two bugs here: one in the handling of add2 in the
backend, and one in constant loading.

thanks
-- PMM



Re: [Qemu-devel] [PATCH v4 2/7] fw-cfg: turn FW_CFG_FILE_SLOTS into a device property

2016-12-06 Thread Laszlo Ersek
On 12/06/16 11:50, Igor Mammedov wrote:
> On Thu,  1 Dec 2016 18:06:19 +0100
> Laszlo Ersek  wrote:
> 
>> We'd like to raise the value of FW_CFG_FILE_SLOTS. Doing it naively could
>> lead to problems with backward migration: a more recent QEMU (running an
>> older machine type) would allow the guest, in fw_cfg_select(), to select a
>> high key value that is unavailable in the same machine type implemented by
>> the older (target) QEMU. On the target host, fw_cfg_data_read() for
>> example could dereference nonexistent entries.
>>
>> As first step, size the FWCfgState.entries[*] and FWCfgState.entry_order
>> arrays dynamically. All three array sizes will be influenced by the new
>> field (and device property) FWCfgState.file_slots.
>>
>> Make the following changes:
>>
>> - Replace the FW_CFG_FILE_SLOTS macro with FW_CFG_FILE_SLOTS_TRAD
>>   (traditional count of fw_cfg file slots) in the header file. The value
>>   remains 0x10.
>>
>> - Replace all uses of FW_CFG_FILE_SLOTS with a helper function called
>>   fw_cfg_file_slots(), returning the new property.
>>
>> - Eliminate the macro FW_CFG_MAX_ENTRY, and replace all its uses with a
>>   helper function called fw_cfg_max_entry().
>>
>> - In the MMIO- and IO-mapped realize functions both, allocate all three
>>   arrays dynamically, based on the new property.
>>
>> - The new property defaults to 0x20; however at the moment we forcibly set
>>   it to FW_CFG_FILE_SLOTS_TRAD on all code paths available to board code
>>   (namely in the fw_cfg_init_io_dma() and fw_cfg_init_mem_wide() helper
>>   functions). This is going to be customized in the following patches.
>>
>> Cc: "Gabriel L. Somlo" 
>> Cc: "Michael S. Tsirkin" 
>> Cc: Gerd Hoffmann 
>> Cc: Igor Mammedov 
>> Cc: Paolo Bonzini 
>> Signed-off-by: Laszlo Ersek 
>> ---
>>
>> Notes:
>> I know that upstream doesn't care about backward migration, but some
>> downstreams might.
>>
>>  docs/specs/fw_cfg.txt          |  2 +-
>>  include/hw/nvram/fw_cfg_keys.h |  3 +-
>>  hw/nvram/fw_cfg.c              | 85 ++
>>  3 files changed, 79 insertions(+), 11 deletions(-)
>>
>> diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt
>> index a19e2adbe1c6..84e2978706f5 100644
>> --- a/docs/specs/fw_cfg.txt
>> +++ b/docs/specs/fw_cfg.txt
>> @@ -154,11 +154,11 @@ Selector Reg.Range Usage
>>  0x8000 - 0xbfff  Arch. Specific (0x - 0x3fff, generally RO, possibly RW
>>   through the DMA interface in QEMU v2.9+)
>>  0xc000 - 0x  Arch. Specific (0x - 0x3fff, RW, ignored in v2.4+)
>>  
>>  In practice, the number of allowed firmware configuration items is given
>> -by the value of FW_CFG_MAX_ENTRY (see fw_cfg.h).
>> +by the value (FW_CFG_FILE_FIRST + FW_CFG_FILE_SLOTS_TRAD) (see fw_cfg.h).
>>  
>>  = Guest-side DMA Interface =
>>  
>>  If bit 1 of the feature bitmap is set, the DMA interface is present. This does
>>  not replace the existing fw_cfg interface, it is an add-on. This interface
>> diff --git a/include/hw/nvram/fw_cfg_keys.h b/include/hw/nvram/fw_cfg_keys.h
>> index 0f3e871884c0..627589793671 100644
>> --- a/include/hw/nvram/fw_cfg_keys.h
>> +++ b/include/hw/nvram/fw_cfg_keys.h
>> @@ -27,12 +27,11 @@
>>  #define FW_CFG_SETUP_SIZE   0x17
>>  #define FW_CFG_SETUP_DATA   0x18
>>  #define FW_CFG_FILE_DIR 0x19
>>  
>>  #define FW_CFG_FILE_FIRST   0x20
>> -#define FW_CFG_FILE_SLOTS   0x10
>> -#define FW_CFG_MAX_ENTRY(FW_CFG_FILE_FIRST + FW_CFG_FILE_SLOTS)
>> +#define FW_CFG_FILE_SLOTS_TRAD  0x10
>>  
>>  #define FW_CFG_WRITE_CHANNEL0x4000
>>  #define FW_CFG_ARCH_LOCAL   0x8000
>>  #define FW_CFG_ENTRY_MASK   (~(FW_CFG_WRITE_CHANNEL | FW_CFG_ARCH_LOCAL))
>>  
>> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
>> index e0145c11a19b..2e1441c09750 100644
>> --- a/hw/nvram/fw_cfg.c
>> +++ b/hw/nvram/fw_cfg.c
>> @@ -31,10 +31,13 @@
>>  #include "hw/sysbus.h"
>>  #include "trace.h"
>>  #include "qemu/error-report.h"
>>  #include "qemu/config-file.h"
>>  #include "qemu/cutils.h"
>> +#include "qapi/error.h"
>> +
>> +#define FW_CFG_FILE_SLOTS_DFLT 0x20
>>  
>>  #define FW_CFG_NAME "fw_cfg"
>>  #define FW_CFG_PATH "/machine/" FW_CFG_NAME
>>  
>>  #define TYPE_FW_CFG "fw_cfg"
>> @@ -69,12 +72,13 @@ typedef struct FWCfgEntry {
>>  struct FWCfgState {
>>  /*< private >*/
>>  SysBusDevice parent_obj;
>>  /*< public >*/
>>  
>> -FWCfgEntry entries[2][FW_CFG_MAX_ENTRY];
>> -int entry_order[FW_CFG_MAX_ENTRY];
>> +uint32_t file_slots;
> should it be uint16_t?
> As below you use "uint16_t file_slots_max;" and do some UINT16
> to calculate max limit.

I think I had a reason for making this uint32_t. I think the argument
was that fw_cfg_max_entry() could theoretically return a value that
doesn't fit in a uint16_t. Looking again at the patch however, I think I
can try to make this a uint16_t for the next version.
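
A rough sketch of the width concern, under stated assumptions: the
sketch_* helpers are hypothetical and not the code from this patch, and
they assume the maximum entry is computed as FW_CFG_FILE_FIRST plus the
slot count, as the old FW_CFG_MAX_ENTRY macro did. The macro values are
the ones visible in the quoted header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as they appear in the quoted fw_cfg_keys.h hunk. */
#define FW_CFG_FILE_FIRST       0x20
#define FW_CFG_WRITE_CHANNEL    0x4000
#define FW_CFG_ARCH_LOCAL       0x8000
#define FW_CFG_ENTRY_MASK       (~(FW_CFG_WRITE_CHANNEL | FW_CFG_ARCH_LOCAL))

/* Hypothetical: the sum computed in a 32-bit type, so it cannot wrap
 * even if the slot count itself is stored as uint16_t. */
static uint32_t sketch_max_entry_wide(uint16_t file_slots)
{
    return (uint32_t)FW_CFG_FILE_FIRST + file_slots;
}

/* Hypothetical: the same sum truncated to 16 bits, which wraps for
 * large slot counts. */
static uint16_t sketch_max_entry_narrow(uint16_t file_slots)
{
    return (uint16_t)(FW_CFG_FILE_FIRST + file_slots);
}

/* Hypothetical range check: selector keys have FW_CFG_ENTRY_MASK + 1
 * usable values (0x4000), and file slots start at FW_CFG_FILE_FIRST. */
static bool sketch_file_slots_in_range(uint32_t file_slots)
{
    uint32_t key_space = (uint32_t)(uint16_t)FW_CFG_ENTRY_MASK + 1u;

    return file_slots <= key_space - FW_CFG_FILE_FIRST;
}
```

With a slot count of 0xffff the wide helper returns 0x1001f while the
narrow one wraps to 0x1f; once the count is validated against the key
space (at most 0x3fe0 under these assumptions), either width holds the
sum.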

> 
>> +FWCfgEntry *entries[2];
>> +int *entry

Re: [Qemu-devel] [PATCH v5 13/17] qapi: add qapi2texi script

2016-12-06 Thread Markus Armbruster
I had to resort to diff to find your replies, and massage the text
manually to produce a readable reply myself.  Please quote the usual
way.

> Markus Armbruster  writes:
> 
> > Marc-André Lureau  writes:
> >
> >> As the name suggests, the qapi2texi script converts JSON QAPI
> >> description into a texi file suitable for different target
> >> formats (info/man/txt/pdf/html...).
> >>
> >> It parses the following kind of blocks:
> >>
> >> Free-form:
> >>
> >>   ##
> >>   # = Section
> >>   # == Subsection
> >>   #
> >>   # Some text foo with *emphasis*
> >>   # 1. with a list
> >>   # 2. like that
> >>   #
> >>   # And some code:
> >>   # | $ echo foo
> >>   # | -> do this
> >>   # | <- get that
> >>   #
> >>   ##
> >>
> >> Symbol:
> >>
> >>   ##
> >>   # @symbol:
> >>   #
> >>   # Symbol body ditto ergo sum. Foo bar
> >>   # baz ding.
> >>   #
> >>   # @arg: foo
> >>   # @arg: #optional foo
> >
> > Let's not use @arg twice.
> >
> > Terminology: I prefer to use "parameter" for formal parameters, and
> > "argument" for actual arguments.  This matches how The Python Language
> > Reference uses the terms.
> >
> > What about
> >
> > # @param1: the frob to frobnicate
> > # @param2: #optional how hard to frobnicate
> 
> ok
> 
> >>   #
> >>   # Returns: returns bla bla
> >>   #  Or bla blah
> >
> > Repeating "returns" is awkward, and we don't do that in our schemas.
> >
> > We need a period before "Or", or spell it "or".
> >
> > What about
> >
> > # Returns: the frobnicated frob.
> > #  If frob isn't frobnicatable, GenericError.
> 
> ok
> 
> >>   #
> >>   # Since: version
> >>   # Notes: notes, comments can have
> >>   #- itemized list
> >>   #- like this
> >>   #
> >>   # Example:
> >>   #
> >>   # -> { "execute": "quit" }
> >>   # <- { "return": {} }
> >>   #
> >>   ##
> >>
> >> That's roughly following the following EBNF grammar:
> >>
> >> api_comment = "##\n" comment "##\n"
> >> comment = freeform_comment | symbol_comment
> >> freeform_comment = { "# " text "\n" | "#\n" }
> >> symbol_comment = "# @" name ":\n" { member | meta | freeform_comment }
> >
> > Rejects non-empty comments where "#" is not followed by space.  Such
> > usage is accepted outside doc comments.  Hmm.
> >
> >> member = "# @" name ':' [ text ] freeform_comment
> >
> > Are you missing a "\n" before freeform_comment?
> 
> yes
> 
> >> meta = "# " ( "Returns:", "Since:", "Note:", "Notes:", "Example:", "Examples:" ) [ text ] freeform_comment
> >
> > Likewise.
> 
> ok
> 
> >> text = free-text markdown-like, "#optional" for members
> >
> > The grammar is ambiguous: a line "# @foo:\n" can be parsed both as
> > freeform_comment and as symbol_comment.  Since your intent is obvious
> > enough, it can still serve as documentation.  It's not a suitable
> > foundation for parsing, though.  Okay for a commit message.
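
One common way to resolve this kind of ambiguity is to try the more specific production first. The sketch below is in C purely for illustration (the script under review is Python), and the enum and function names are hypothetical. It only distinguishes a symbol header from a free-form line; member and meta lines are left out.

```c
#include <assert.h>
#include <string.h>

enum line_kind { LINE_SYMBOL, LINE_FREEFORM, LINE_INVALID };

/* Classify one doc comment line, passed without its trailing newline. */
static enum line_kind classify_line(const char *line)
{
    size_t len;

    assert(line != NULL);
    len = strlen(line);

    /* "# @name:" opens a symbol comment; test it before free-form. */
    if (len > 4 && strncmp(line, "# @", 3) == 0 && line[len - 1] == ':') {
        return LINE_SYMBOL;
    }
    /* Free-form lines are a bare "#" or "# " followed by text. */
    if (strcmp(line, "#") == 0 || strncmp(line, "# ", 2) == 0) {
        return LINE_FREEFORM;
    }
    /* "#" directly followed by text is rejected, as the grammar does. */
    return LINE_INVALID;
}
```

Because the symbol test runs first, "# @foo:" is never treated as free-form text, which is the intent the grammar leaves implicit.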
> >
> >> Thanks to the following json expressions, the documentation is enhanced
> >> with extra information about the type of arguments and return value
> >> expected.
> >
> > I guess you want to say that we enrich the documentation we extract from
> > comments with information from the actual schema.  Correct?
> 
> yes
> 
> > Missing: a brief discussion of deficiencies.  These include:
> >
> > * The generated QMP documentation includes internal types
> >
> >   We use qapi-schema.json both for defining the external QMP interface
> >   and for defining internal types.  qmp-introspect.py carefully
> >   separates the two, to not expose internal types.  qapi2texi.py happily
> >   exposes everything.
> >
> >   Currently, about a fifth of the types in the generated docs are
> >   internal:
> >
> >   AcpiTableOptions
> >   BiosAtaTranslation
> >   BlockDeviceMapEntry
> >   COLOMessage
> >   COLOMode
> >   DummyForceArrays
> >   FailoverStatus
> >   FloppyDriveType
> >   ImageCheck
> >   LostTickPolicy
> >   MapEntry
> >   MigrationParameter
> >   NetClientDriver
> >   NetFilterDirection
> >   NetLegacy
> >   NetLegacyNicOptions
> >   NetLegacyOptions
> >   NetLegacyOptionsKind
> >   Netdev
> >   NetdevBridgeOptions
> >   NetdevDumpOptions
> >   NetdevHubPortOptions
> >   NetdevL2TPv3Options
> >   NetdevNetmapOptions
> >   NetdevNoneOptions
> >   NetdevSocketOptions
> >   NetdevTapOptions
> >   NetdevUserOptions
> >   NetdevVdeOptions
> >   NetdevVhostUserOptions
> >   NumaNodeOptions
> >   NumaOptions
> >   NumaOptionsKind
> >   OnOffAuto
> >   OnOffSplit
> >   PreallocMode
> >   QCryptoBlockCreateOptions
> >   QCryptoBlockCreateOptionsLUKS
> >   QCryptoBlockFormat
> >   QCryptoBlockInfo
> >   QCryptoBlockInfoBase
> >   QCryptoBlockInfoQCow
> >   QCryptoBlockOpenOptions
> >   QCryptoBlockOptionsBase
> >   QCryptoBlockOptionsLUKS
> >   QCryptoBlockOptionsQCow
> >   QCryptoSecretFormat
> >   QCryptoTLSCredsEnd

Re: [Qemu-devel] [PATCH v4 3/7] fw-cfg: expose "file_slots" parameter in fw_cfg_init_io_dma()

2016-12-06 Thread Igor Mammedov
On Thu,  1 Dec 2016 18:06:20 +0100
Laszlo Ersek  wrote:

> Accordingly, generalize the "file_slots" minimum calculation in
> fw_cfg_init_io_dma(), and move the constant FW_CFG_FILE_SLOTS_TRAD
> argument to the callers of fw_cfg_init_io_dma().
If I get the idea correctly, you're extending fw_cfg_init_io_dma() and
setting
 qdev_prop_set_uint32(dev, "file_slots", file_slots);
manually to keep the old fw_cfg_init_io() the same without touching
xen/sun4u machines.
That way we would have 2 different ways to set defaults
per machine type/version rather than the single COMPAT property way.

How about dropping this patch and adding
 SET_MACHINE_COMPAT
to xen/sun4u machines instead, and dropping fw_cfg_init_io() in
favor of fw_cfg_init_io_dma() along the way?

> 
> Cc: "Gabriel L. Somlo" 
> Cc: "Michael S. Tsirkin" 
> Cc: Gerd Hoffmann 
> Cc: Igor Mammedov 
> Cc: Paolo Bonzini 
> Signed-off-by: Laszlo Ersek 
> ---
>  docs/specs/fw_cfg.txt |  4 ++--
>  include/hw/nvram/fw_cfg.h |  2 +-
>  hw/i386/pc.c  |  3 ++-
>  hw/nvram/fw_cfg.c | 13 ++---
>  4 files changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt
> index 84e2978706f5..4a6888b511f4 100644
> --- a/docs/specs/fw_cfg.txt
> +++ b/docs/specs/fw_cfg.txt
> @@ -153,12 +153,12 @@ Selector Reg.    Range Usage
>  0x4000 - 0x7fff  Generic (0x0000 - 0x3fff, RW, ignored in QEMU v2.4+)
>  0x8000 - 0xbfff  Arch. Specific (0x0000 - 0x3fff, generally RO, possibly RW
>   through the DMA interface in QEMU v2.9+)
>  0xc000 - 0xffff  Arch. Specific (0x0000 - 0x3fff, RW, ignored in v2.4+)
>  
> -In practice, the number of allowed firmware configuration items is given
> -by the value (FW_CFG_FILE_FIRST + FW_CFG_FILE_SLOTS_TRAD) (see fw_cfg.h).
> +In practice, the number of allowed firmware configuration items depends on the
> +machine type.
machine type/version

>  
>  = Guest-side DMA Interface =
>  
>  If bit 1 of the feature bitmap is set, the DMA interface is present. This does
>  not replace the existing fw_cfg interface, it is an add-on. This interface
> diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
> index b980cbaebf43..e9a6b6aa968c 100644
> --- a/include/hw/nvram/fw_cfg.h
> +++ b/include/hw/nvram/fw_cfg.h
> @@ -173,11 +173,11 @@ void fw_cfg_add_file_callback(FWCfgState *s, const char *filename,
>   */
>  void *fw_cfg_modify_file(FWCfgState *s, const char *filename, void *data,
>   size_t len);
>  
>  FWCfgState *fw_cfg_init_io_dma(uint32_t iobase, uint32_t dma_iobase,
> -AddressSpace *dma_as);
> +AddressSpace *dma_as, uint32_t file_slots);
>  FWCfgState *fw_cfg_init_io(uint32_t iobase);
>  FWCfgState *fw_cfg_init_mem(hwaddr ctl_addr, hwaddr data_addr);
>  FWCfgState *fw_cfg_init_mem_wide(hwaddr ctl_addr,
>   hwaddr data_addr, uint32_t data_width,
>   hwaddr dma_addr, AddressSpace *dma_as);
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index a9e64a88e5e7..5d929d8fc887 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -741,11 +741,12 @@ static FWCfgState *bochs_bios_init(AddressSpace *as, PCMachineState *pcms)
>  {
>  FWCfgState *fw_cfg;
>  uint64_t *numa_fw_cfg;
>  int i, j;
>  
> -fw_cfg = fw_cfg_init_io_dma(FW_CFG_IO_BASE, FW_CFG_IO_BASE + 4, as);
> +fw_cfg = fw_cfg_init_io_dma(FW_CFG_IO_BASE, FW_CFG_IO_BASE + 4, as,
> +FW_CFG_FILE_SLOTS_TRAD);
>  fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, pcms->boot_cpus);
>  
>  /* FW_CFG_MAX_CPUS is a bit confusing/problematic on x86:
>   *
>   * For machine types prior to 1.8, SeaBIOS needs FW_CFG_MAX_CPUS for
> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> index 2e1441c09750..c33c76ab93b1 100644
> --- a/hw/nvram/fw_cfg.c
> +++ b/hw/nvram/fw_cfg.c
> @@ -926,11 +926,11 @@ static void fw_cfg_init1(DeviceState *dev)
>  s->machine_ready.notify = fw_cfg_machine_ready;
>  qemu_add_machine_init_done_notifier(&s->machine_ready);
>  }
>  
>  FWCfgState *fw_cfg_init_io_dma(uint32_t iobase, uint32_t dma_iobase,
> -AddressSpace *dma_as)
> +AddressSpace *dma_as, uint32_t file_slots)
>  {
>  DeviceState *dev;
>  FWCfgState *s;
>  uint32_t version = FW_CFG_VERSION;
>  bool dma_requested = dma_iobase && dma_as;
> @@ -940,15 +940,14 @@ FWCfgState *fw_cfg_init_io_dma(uint32_t iobase, uint32_t dma_iobase,
>  qdev_prop_set_uint32(dev, "dma_iobase", dma_iobase);
>  if (!dma_requested) {
>  qdev_prop_set_bit(dev, "dma_enabled", false);
>  }
>  
> -/* Once we expose the "file_slots" property to callers of
> - * fw_cfg_init_io_dma(), the following setting should become conditional on
> - * the input parameter being lower than the current value of the property.
> - */
> 

Re: [Qemu-devel] [PATCH v4 2/7] fw-cfg: turn FW_CFG_FILE_SLOTS into a device property

2016-12-06 Thread Igor Mammedov
On Tue, 6 Dec 2016 12:43:06 +0100
Laszlo Ersek  wrote:

> On 12/06/16 11:50, Igor Mammedov wrote:
> > On Thu,  1 Dec 2016 18:06:19 +0100
> > Laszlo Ersek  wrote:
> >   
> >> We'd like to raise the value of FW_CFG_FILE_SLOTS. Doing it naively could
> >> lead to problems with backward migration: a more recent QEMU (running an
> >> older machine type) would allow the guest, in fw_cfg_select(), to select a
> >> high key value that is unavailable in the same machine type implemented by
> >> the older (target) QEMU. On the target host, fw_cfg_data_read() for
> >> example could dereference nonexistent entries.
> >>
> >> As first step, size the FWCfgState.entries[*] and FWCfgState.entry_order
> >> arrays dynamically. All three array sizes will be influenced by the new
> >> field (and device property) FWCfgState.file_slots.
> >>
> >> Make the following changes:
> >>
> >> - Replace the FW_CFG_FILE_SLOTS macro with FW_CFG_FILE_SLOTS_TRAD
> >>   (traditional count of fw_cfg file slots) in the header file. The value
> >>   remains 0x10.
> >>
> >> - Replace all uses of FW_CFG_FILE_SLOTS with a helper function called
> >>   fw_cfg_file_slots(), returning the new property.
> >>
> >> - Eliminate the macro FW_CFG_MAX_ENTRY, and replace all its uses with a
> >>   helper function called fw_cfg_max_entry().
> >>
> >> - In the MMIO- and IO-mapped realize functions both, allocate all three
> >>   arrays dynamically, based on the new property.
> >>
> >> - The new property defaults to 0x20; however at the moment we forcibly set
> >>   it to FW_CFG_FILE_SLOTS_TRAD on all code paths available to board code
> >>   (namely in the fw_cfg_init_io_dma() and fw_cfg_init_mem_wide() helper
> >>   functions). This is going to be customized in the following patches.
> >>
> >> Cc: "Gabriel L. Somlo" 
> >> Cc: "Michael S. Tsirkin" 
> >> Cc: Gerd Hoffmann 
> >> Cc: Igor Mammedov 
> >> Cc: Paolo Bonzini 
> >> Signed-off-by: Laszlo Ersek 
> >> ---
> >>
> >> Notes:
> >> I know that upstream doesn't care about backward migration, but some
> >> downstreams might.
> >>
> >>  docs/specs/fw_cfg.txt  |  2 +-
> >>  include/hw/nvram/fw_cfg_keys.h |  3 +-
> >>  hw/nvram/fw_cfg.c  | 85 
> >> ++
> >>  3 files changed, 79 insertions(+), 11 deletions(-)
> >>
> >> diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt
> >> index a19e2adbe1c6..84e2978706f5 100644
> >> --- a/docs/specs/fw_cfg.txt
> >> +++ b/docs/specs/fw_cfg.txt
> >> @@ -154,11 +154,11 @@ Selector Reg.    Range Usage
> >>  0x8000 - 0xbfff  Arch. Specific (0x0000 - 0x3fff, generally RO, possibly RW
> >>   through the DMA interface in QEMU v2.9+)
> >>  0xc000 - 0xffff  Arch. Specific (0x0000 - 0x3fff, RW, ignored in v2.4+)
> >>  
> >>  In practice, the number of allowed firmware configuration items is given
> >> -by the value of FW_CFG_MAX_ENTRY (see fw_cfg.h).
> >> +by the value (FW_CFG_FILE_FIRST + FW_CFG_FILE_SLOTS_TRAD) (see fw_cfg.h).
> >>  
> >>  = Guest-side DMA Interface =
> >>  
> >>  If bit 1 of the feature bitmap is set, the DMA interface is present. This does
> >>  not replace the existing fw_cfg interface, it is an add-on. This interface
> >> diff --git a/include/hw/nvram/fw_cfg_keys.h b/include/hw/nvram/fw_cfg_keys.h
> >> index 0f3e871884c0..627589793671 100644
> >> --- a/include/hw/nvram/fw_cfg_keys.h
> >> +++ b/include/hw/nvram/fw_cfg_keys.h
> >> @@ -27,12 +27,11 @@
> >>  #define FW_CFG_SETUP_SIZE   0x17
> >>  #define FW_CFG_SETUP_DATA   0x18
> >>  #define FW_CFG_FILE_DIR 0x19
> >>  
> >>  #define FW_CFG_FILE_FIRST   0x20
> >> -#define FW_CFG_FILE_SLOTS   0x10
> >> -#define FW_CFG_MAX_ENTRY        (FW_CFG_FILE_FIRST + FW_CFG_FILE_SLOTS)
> >> +#define FW_CFG_FILE_SLOTS_TRAD  0x10
> >>  
> >>  #define FW_CFG_WRITE_CHANNEL    0x4000
> >>  #define FW_CFG_ARCH_LOCAL   0x8000
> >>  #define FW_CFG_ENTRY_MASK   (~(FW_CFG_WRITE_CHANNEL | FW_CFG_ARCH_LOCAL))
> >>  
> >> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
> >> index e0145c11a19b..2e1441c09750 100644
> >> --- a/hw/nvram/fw_cfg.c
> >> +++ b/hw/nvram/fw_cfg.c
> >> @@ -31,10 +31,13 @@
> >>  #include "hw/sysbus.h"
> >>  #include "trace.h"
> >>  #include "qemu/error-report.h"
> >>  #include "qemu/config-file.h"
> >>  #include "qemu/cutils.h"
> >> +#include "qapi/error.h"
> >> +
> >> +#define FW_CFG_FILE_SLOTS_DFLT 0x20
> >>  
> >>  #define FW_CFG_NAME "fw_cfg"
> >>  #define FW_CFG_PATH "/machine/" FW_CFG_NAME
> >>  
> >>  #define TYPE_FW_CFG "fw_cfg"
> >> @@ -69,12 +72,13 @@ typedef struct FWCfgEntry {
> >>  struct FWCfgState {
> >>  /*< private >*/
> >>  SysBusDevice parent_obj;
> >>  /*< public >*/
> >>  
> >> -FWCfgEntry entries[2][FW_CFG_MAX_ENTRY];
> >> -int entry_order[FW_CFG_MAX_ENTRY];
> >> +uint32_t file_slots;  
> > > Should it be uint16_t?
> > > Below you use "uint16_t file_slots_max;" and do some UINT16
> > > arithmetic to calculate the max limit.
> 
> I

[Qemu-devel] [Bug 1646610] Re: "Assertion `!r->req.sg' failed." during live migration with VirtIO

2016-12-06 Thread Thomas Huth
Which version of QEMU are you using?

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1646610

Title:
  "Assertion `!r->req.sg' failed." during live migration with VirtIO

Status in QEMU:
  New

Bug description:
  We've hit this issue twice so far, but don't have an obvious repro
  yet. It's pretty rare for us to hit it but I'm still trying so I can
  get a core and backtrace. The guest was Windows running a constant
  workload. We were using VirtIO SCSI drivers in both cases.

  In both cases we hit the assert here:

  hw/scsi/scsi-generic.c:

  static void scsi_generic_save_request(QEMUFile *f, SCSIRequest *req)
  {
  SCSIGenericReq *r = DO_UPCAST(SCSIGenericReq, req, req);

  qemu_put_sbe32s(f, &r->buflen);
  if (r->buflen && r->req.cmd.mode == SCSI_XFER_TO_DEV) {
  *** assert(!r->req.sg);
  qemu_put_buffer(f, r->buf, r->req.cmd.xfer);
  }
  }

  From code inspection, it seems that this will always happen if
  scsi_req_enqueue_internal in hw/scsi/scsi-bus.c is ever invoked.

  static void scsi_req_enqueue_internal(SCSIRequest *req)
  {
  assert(!req->enqueued);
  scsi_req_ref(req);
  if (req->bus->info->get_sg_list) {
  req->sg = req->bus->info->get_sg_list(req);
  } else {
  req->sg = NULL;
  }
  req->enqueued = true;
  QTAILQ_INSERT_TAIL(&req->dev->requests, req, next);
  }

  req->bus->info->get_sg_list will return &req->qsgl for a virtio-scsi
  bus since it's a method stored on the SCSIBusInfo struct. I didn't see
  anything that would clear the req->sg if scsi_req_enqueue_internal is
  called at least once.

  I think this can only happen if scsi_dma_restart_bh in
  hw/scsi/scsi-bus.c is called. The only other location where I see
  scsi_req_enqueue_internal called is on the load side, for the
  destination of a migration.

  static void scsi_dma_restart_bh(void *opaque)
  {
  SCSIDevice *s = opaque;
  SCSIRequest *req, *next;

  qemu_bh_delete(s->bh);
  s->bh = NULL;

  QTAILQ_FOREACH_SAFE(req, &s->requests, next, next) {
  scsi_req_ref(req);
  if (req->retry) {
  req->retry = false;
  switch (req->cmd.mode) {
  case SCSI_XFER_FROM_DEV:
  case SCSI_XFER_TO_DEV:
  scsi_req_continue(req);
  break;
  case SCSI_XFER_NONE:
  scsi_req_dequeue(req);
  scsi_req_enqueue(req); // *** this calls scsi_req_enqueue_internal
  break;
  }
  }
  scsi_req_unref(req);
  }
  }

  Finally when put_scsi_requests is called for migration, it seems like
  it will call both virtio_scsi_save_request (from
  bus->info->save_request) and scsi_generic_save_request (from
  req->ops->save_request) and trigger the assert.

  I searched for a bit, but didn't find anyone else reporting this. Has
  anyone else seen this? It seems to me like that assert should check
  that the sg list is empty instead of checking that it exists. Is this
  an appropriate assessment? Assuming I find a repro, I'll try to test
  this solution.
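
  The difference between the two checks can be modelled in a few lines
  of standalone C. This is only a sketch: the structs are reduced,
  hypothetical stand-ins for QEMUSGList and SCSIRequest, and the
  function names are made up for the example. It assumes a bus that
  provides get_sg_list, as virtio-scsi does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for QEMUSGList and SCSIRequest. */
struct sg_list {
    int nsg;                /* number of scatter/gather entries */
};

struct request {
    struct sg_list qsgl;
    struct sg_list *sg;
    bool enqueued;
};

/* Mirrors the enqueue logic quoted above for a bus with get_sg_list. */
static void enqueue(struct request *req)
{
    assert(!req->enqueued);
    req->sg = &req->qsgl;   /* never reset to NULL afterwards */
    req->enqueued = true;
}

/* Current check: only a NULL list pointer is accepted. */
static bool save_ok_current(const struct request *req)
{
    return req->sg == NULL;
}

/* Proposed check: a list that exists but holds no entries is fine. */
static bool save_ok_proposed(const struct request *req)
{
    return req->sg == NULL || req->sg->nsg == 0;
}
```

  In this model any request that went through enqueue() fails the
  current check, while the proposed check only rejects requests whose
  list actually has entries.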

  Thanks!

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1646610/+subscriptions



Re: [Qemu-devel] [PATCH for-2.8] qdev: apply global properties in reverse order

2016-12-06 Thread Halil Pasic


On 12/06/2016 10:30 AM, Greg Kurz wrote:
> On Mon, 5 Dec 2016 15:41:30 -0200
> Eduardo Habkost  wrote:
> 
>> On Mon, Dec 05, 2016 at 06:25:55PM +0100, Cornelia Huck wrote:
>>> On Mon, 5 Dec 2016 14:48:29 -0200
>>> Eduardo Habkost  wrote:
>>>
>>>> On Mon, Dec 05, 2016 at 04:42:00PM +0100, Cornelia Huck wrote:
>>>>> On Mon, 05 Dec 2016 16:21:22 +0100
>>>>> Greg Kurz  wrote:
>>>>>
>>>>>> The current code recursively applies global properties from child up to
>>>>>> parent. So, if you have:
>>>>>>
>>>>>> -global virtio-pci.disable-modern=on
>>>>>> -global virtio-blk-pci.disable-modern=off
>>>>>>
>>>>>> Then the default value of disable-modern for a virtio-blk-pci device is on,
>>>>>> which looks wrong from an OOP perspective.
>>>>>>
>>>>>> This patch reverses the logic, so that a child property always prevails.
[...]
>> compat props should be always applied in the order they appear.
>> -global should always be applied after compat props.
>>
> This is actually the way they're being registered to the global_props
> static list: compat props as they appear in HW_COMPAT_* and then -global
> as they appear on the command line.
>
>> So, it looks like we have two additional reasons to just follow
>> the order the global properties were registered.
>>
> Thinking again, maybe we just need to reverse the logic in another
> way: go through global_props and apply the property if the device
> can be cast to the corresponding class (i.e. object_class_dynamic_cast()
> != NULL). I'll try that.
> 

IMHO this is the right thing to do (and would result in exactly the
behavior outlined by Eduardo: compat props applied as they appear,
then command-line props as they appear, which means inverse
ordering in terms of overriding in the result).
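
A minimal standalone sketch of that approach, to make the ordering concrete. Nothing here is QEMU code: the type table, the struct layout and the helper names are hypothetical, and type_is_a() merely stands in for a class cast check such as object_class_dynamic_cast().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type table: each type names its parent (NULL at the root). */
struct type_info {
    const char *name;
    const char *parent;
};

static const struct type_info types[] = {
    { "device", NULL },
    { "virtio-pci", "device" },
    { "virtio-blk-pci", "virtio-pci" },
};

struct global_prop {
    const char *driver;
    const char *property;
    const char *value;
};

/* True if 'type' is 'driver' or inherits from it. */
static int type_is_a(const char *type, const char *driver)
{
    while (type != NULL) {
        const char *parent = NULL;
        size_t i;

        if (strcmp(type, driver) == 0) {
            return 1;
        }
        for (i = 0; i < sizeof(types) / sizeof(types[0]); i++) {
            if (strcmp(types[i].name, type) == 0) {
                parent = types[i].parent;
                break;
            }
        }
        type = parent;
    }
    return 0;
}

/* Walk the list in registration order; a later matching entry overrides. */
static const char *resolve_prop(const struct global_prop *props, size_t n,
                                const char *type, const char *property)
{
    const char *value = NULL;
    size_t i;

    assert(props != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (strcmp(props[i].property, property) == 0 &&
            type_is_a(type, props[i].driver)) {
            value = props[i].value;
        }
    }
    return value;
}
```

With the two -global options from the commit message registered in command-line order, a virtio-blk-pci device ends up with disable-modern=off, while a plain virtio-pci device keeps on.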

Halil




[Qemu-devel] [Bug 1647683] [NEW] Bad interaction between tb flushing & gdb stub

2016-12-06 Thread Julian Brown
Public bug reported:

I have been working on a series of patches for ARM big-endian system
mode support, using QEMU as a bare-metal simulator for the GDB test
suite. At some point I realised that these tests were not running
reliably on the QEMU master branch, even without my patches applied.
(I.e., in little-endian mode.)

Running QEMU under GDB in the test harness via Valgrind, using something
akin to:

(gdb) target remote | valgrind --tool=memcheck qemu-arm-system [...]

leads to intermittent (and quite hard-to-reproduce) segfaults in QEMU of
the form:

==52333== Process terminating with default action of signal 11 (SIGSEGV)
==52333==  Access not within mapped region at address 0x24
==52333==at 0x1D55F2: tb_page_remove (translate-all.c:1026)
==52333==by 0x1D58B4: tb_phys_invalidate (translate-all.c:1119)
==52333==by 0x1D63AA: tb_invalidate_phys_page_range (translate-all.c:1519)
==52333==by 0x1D66D7: tb_invalidate_phys_addr (translate-all.c:1714)
==52333==by 0x1CBA7F: breakpoint_invalidate (exec.c:704)
==52333==by 0x1CC01F: cpu_breakpoint_remove_by_ref (exec.c:869)
==52333==by 0x1CBF97: cpu_breakpoint_remove (exec.c:857)
==52333==by 0x218FAA: gdb_breakpoint_remove (gdbstub.c:717)
==52333==by 0x219E35: gdb_handle_packet (gdbstub.c:1035)
==52333==by 0x21AF62: gdb_read_byte (gdbstub.c:1459)
==52333==by 0x21B096: gdb_chr_receive (gdbstub.c:1672)
==52333==by 0x3AF2BC: qemu_chr_be_write_impl (qemu-char.c:419)

These crashes didn't happen on a 2.6-era QEMU, so I bisected and
discovered that commit 3359baad36889b83df40b637ed993a4b816c4906 ("tcg:
Make tb_flush() thread safe") appears to be the thing that triggers this
intermittent failure. Reverting the patch on the branch tip makes the
crashes go away.

Unfortunately I don't currently have a way to trigger the segfaults
outside of Mentor Graphics's test infrastructure, which I can't share.

Does anyone know a reason that this might be happening, or suggestions
of how I might further debug this? Maybe a missed tb flush in the gdb
stub code, somewhere?

Thanks!

Julian

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1647683

Title:
  Bad interaction between tb flushing & gdb stub

Status in QEMU:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1647683/+subscriptions



[Qemu-devel] [Bug 1221966] Re: SIGSEGV in static_code_gen_buffer

2016-12-06 Thread Thomas Huth
Triaging old bug tickets ... is this still an issue with the latest
version of QEMU or could we close this ticket nowadays?

** Changed in: qemu
   Status: New => Incomplete

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1221966

Title:
  SIGSEGV in static_code_gen_buffer

Status in QEMU:
  Incomplete

Bug description:
  Trying to run 'ls' (or, anything else as far as I can tell) from a
  SunOS 5.8 box under RHEL 6.4 linux, I get a segfault.  I've tried
  qemu-1.5.3, qemu-1.6.0, and I fetched
  git://git.qemu-project.org/qemu.git.  I've also tried a statically
  linked sh from /sbin/ and it also segfaulted.
  wcolburn@anotheruvula$ gdb bin/qemu-sparc
  GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
  Copyright (C) 2010 Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later 
  This is free software: you are free to change and redistribute it.
  There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
  and "show warranty" for details.
  This GDB was configured as "x86_64-redhat-linux-gnu".
  For bug reporting instructions, please see:
  ...
  Reading symbols from /home/anotheruvula/qemu/bin/qemu-sparc...done.
  (gdb) run ../sparc/ls
  Starting program: /home/anotheruvula/qemu/bin/qemu-sparc ../sparc/ls
  [Thread debugging using libthread_db enabled]

  Program received signal SIGSEGV, Segmentation fault.
  0x78259116 in static_code_gen_buffer ()
  Missing separate debuginfos, use: debuginfo-install glib2-2.22.5-7.el6.x86_64 glibc-2.12-1.107.el6_4.4.x86_64
  (gdb) where
  #0  0x78259116 in static_code_gen_buffer ()
  #1  0x77f570cd in cpu_tb_exec (cpu=0x7a2b1210, tb_ptr=
  0x782590d0 "A\213n \205í\017\205Í")
  at /home/anotheruvula/qemu-devel/cpu-exec.c:56
  #2  0x77f57b2d in cpu_sparc_exec (env=0x7a2b1348)
  at /home/anotheruvula/qemu-devel/cpu-exec.c:631
  #3  0x77f77f36 in cpu_loop (env=0x7a2b1348)
  at /home/anotheruvula/qemu-devel/linux-user/main.c:1089
  #4  0x77f798ff in main (argc=2, argv=0x7fffdfc8, envp=
  0x7fffdfe0) at /home/anotheruvula/qemu-devel/linux-user/main.c:4083
  (gdb)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1221966/+subscriptions



[Qemu-devel] [Bug 1233225] Re: mips/mipsel linux user float division problem

2016-12-06 Thread Thomas Huth
Looks like Petar's patch from comment #6 has been included here:
http://git.qemu.org/?p=qemu.git;a=commitdiff;h=4d66261f71f2efa31e1052e
==> Fix released

** Changed in: qemu
   Status: Confirmed => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1233225

Title:
  mips/mipsel linux user float division problem

Status in QEMU:
  Fix Released

Bug description:
  Hi,

  I tested the following with the qemu git HEAD as of 2013-09-30 on
  Debian stable and testing. My host runs amd64 but I also tried this
  out inside a i386 chroot with the same result. The problem occurs for
  mips and mipsel. Given the following program:

  #include <stdio.h>
  int main(int argc, char **argv)
  {
  int a = 1;
  double d = a/2.0;
  printf("%f\n", d);
  return 0;
  }

  Instead of printing 0.5, it will print 2.0 if executed in qemu user
  mode.

  $ mipsel-linux-gnu-gcc mipstest.c
  $ ~/qemu/mipsel-linux-user/qemu-mipsel ./a.out
  2.0

  Expecting this to be a problem with my cross compiler (gcc-4.4 from
  emdebian) I ran a fully emulated debian squeeze environment inside
  qemu. There, I compiled the same program natively with gcc and as
  expected got 0.5 as the output. I also copied the cross compiled
  binary inside the emulated environment and also got 0.5 when I ran it.
  So the same mips/mipsel binary produces different output depending on
  whether it is run in a fully emulated environment or qemu user mode.

  Can anybody else reproduce this problem?

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1233225/+subscriptions



[Qemu-devel] [Bug 1647683] Re: Bad interaction between tb flushing & gdb stub

2016-12-06 Thread Julian Brown
(FAOD, the crashes happen without Valgrind too, and the above backtrace
may be a red herring.)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1647683

Title:
  Bad interaction between tb flushing & gdb stub

Status in QEMU:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1647683/+subscriptions



Re: [Qemu-devel] [PATCH v4 04/64] tcg/aarch64: Implement field extraction opcodes

2016-12-06 Thread Alex Bennée

Richard Henderson  writes:

> Signed-off-by: Richard Henderson 
> ---
>  tcg/aarch64/tcg-target.h |  8 
>  tcg/aarch64/tcg-target.inc.c | 14 ++
>  2 files changed, 18 insertions(+), 4 deletions(-)
>
> diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
> index 410c31b..4a74bd8 100644
> --- a/tcg/aarch64/tcg-target.h
> +++ b/tcg/aarch64/tcg-target.h
> @@ -63,8 +63,8 @@ typedef enum {
>  #define TCG_TARGET_HAS_nand_i32 0
>  #define TCG_TARGET_HAS_nor_i32  0
>  #define TCG_TARGET_HAS_deposit_i32  1
> -#define TCG_TARGET_HAS_extract_i32  0
> -#define TCG_TARGET_HAS_sextract_i32 0
> +#define TCG_TARGET_HAS_extract_i32  1
> +#define TCG_TARGET_HAS_sextract_i32 1
>  #define TCG_TARGET_HAS_movcond_i32  1
>  #define TCG_TARGET_HAS_add2_i32 1
>  #define TCG_TARGET_HAS_sub2_i32 1
> @@ -95,8 +95,8 @@ typedef enum {
>  #define TCG_TARGET_HAS_nand_i64 0
>  #define TCG_TARGET_HAS_nor_i64  0
>  #define TCG_TARGET_HAS_deposit_i64  1
> -#define TCG_TARGET_HAS_extract_i64  0
> -#define TCG_TARGET_HAS_sextract_i64 0
> +#define TCG_TARGET_HAS_extract_i64  1
> +#define TCG_TARGET_HAS_sextract_i64 1
>  #define TCG_TARGET_HAS_movcond_i64  1
>  #define TCG_TARGET_HAS_add2_i64 1
>  #define TCG_TARGET_HAS_sub2_i64 1
> diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c
> index 1939d35..c0e9890 100644
> --- a/tcg/aarch64/tcg-target.inc.c
> +++ b/tcg/aarch64/tcg-target.inc.c
> @@ -1640,6 +1640,16 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
>  tcg_out_dep(s, ext, a0, REG0(2), args[3], args[4]);
>  break;
>
> +case INDEX_op_extract_i64:
> +case INDEX_op_extract_i32:
> +tcg_out_ubfm(s, ext, a0, a1, a2, a2 + args[3] - 1);
> +break;
> +
> +case INDEX_op_sextract_i64:
> +case INDEX_op_sextract_i32:
> +tcg_out_sbfm(s, ext, a0, a1, a2, a2 + args[3] - 1);
> +break;
> +

This isn't right, is it? As I'm reading it, extract takes a field at
offset+len in the source register and moves it to the low bits of the
destination register. The Bitfield Move instructions are the other way
around, moving from the low-order bits in the source register to an
offset+len in the destination.
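
To make the extract side of this concrete, here is a minimal sketch in
plain C of the semantics as described above. The helper names
extract_sketch and sextract_sketch are made up for illustration; this is
not the TCG backend code, and it makes no claim about how UBFM/SBFM
encode their operands.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not QEMU code): return the len-bit field of src
 * that starts at bit ofs, zero-extended into the low bits of the result. */
static uint64_t extract_sketch(uint64_t src, unsigned ofs, unsigned len)
{
    assert(len >= 1 && ofs < 64 && len <= 64 - ofs);
    return (src >> ofs) & (UINT64_MAX >> (64 - len));
}

/* Same field, but sign-extended from the field's top bit.  The final
 * conversion to int64_t assumes a two's complement host. */
static int64_t sextract_sketch(uint64_t src, unsigned ofs, unsigned len)
{
    uint64_t field = extract_sketch(src, ofs, len);
    uint64_t sign = (uint64_t)1 << (len - 1);

    return (int64_t)((field ^ sign) - sign);
}
```

With these definitions, extracting 8 bits at offset 4 from 0xABCD gives
0xBC, and sign-extracting 4 bits at offset 4 from 0xF0 gives -1.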

>  case INDEX_op_add2_i32:
>  tcg_out_addsub2(s, TCG_TYPE_I32, a0, a1, REG0(2), REG0(3),
>  (int32_t)args[4], args[5], const_args[4],
> @@ -1785,6 +1795,10 @@ static const TCGTargetOpDef aarch64_op_defs[] = {
>
>  { INDEX_op_deposit_i32, { "r", "0", "rZ" } },
>  { INDEX_op_deposit_i64, { "r", "0", "rZ" } },
> +{ INDEX_op_extract_i32, { "r", "r" } },
> +{ INDEX_op_extract_i64, { "r", "r" } },
> +{ INDEX_op_sextract_i32, { "r", "r" } },
> +{ INDEX_op_sextract_i64, { "r", "r" } },
>
>  { INDEX_op_add2_i32, { "r", "r", "rZ", "rZ", "rA", "rMZ" } },
>  { INDEX_op_add2_i64, { "r", "r", "rZ", "rZ", "rA", "rMZ" } },


--
Alex Bennée



[Qemu-devel] [Bug 1243639] Re: qemu-1.5.3 segment fault with -vga qxl

2016-12-06 Thread Thomas Huth
Triaging old bug tickets ... QEMU 1.5 is quite old already - can you
still reproduce the crash with the latest version of QEMU?

** Changed in: qemu
   Status: New => Incomplete

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1243639

Title:
  qemu-1.5.3   segment fault  with  -vga qxl

Status in QEMU:
  Incomplete

Bug description:
  Executing "qemu-system-x86_64 -enable-kvm -machine accel=kvm:tcg -m
  1G -drive file=/dev/sda --full-screen -spice
  addr=127.0.0.1,port=5900,disable-ticketing -vga qxl" in a shell
  results in a segmentation fault after a few seconds if I don't
  connect to it with the spicec client immediately.

  If I execute "spicec -h 127.0.0.1 -p 5900" immediately after
  starting qemu-system-x86_64, no segmentation fault happens and it
  runs well.

  =

  GDB output:

  root@kali-john:~# gdb /usr/local/bin/qemu-system-x86_64
  GNU gdb (GDB) 7.4.1-debian
  (gdb) run -enable-kvm -machine accel=kvm:tcg -m 1G  -drive file=/dev/sda  --full-screen -spice addr=127.0.0.1,port=5900,disable-ticketing -vga qxl

  [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
  [New Thread 0x73737700 (LWP 14797)]
  [New Thread 0x72d54700 (LWP 14798)]
  [New Thread 0x70fff700 (LWP 14799)]

  Program received signal SIGSEGV, Segmentation fault.
  0x7683ad70 in pixman_image_get_data () from /usr/lib/x86_64-linux-gnu/libpixman-1.so.0
  (gdb) bt
  #0  0x7683ad70 in pixman_image_get_data () from /usr/lib/x86_64-linux-gnu/libpixman-1.so.0
  #1  0x5581060a in surface_data (s=0x566183a0) at /zh-download/QEMU/qemu-1.5.3/include/ui/console.h:235
  #2  0x55818616 in vga_draw_graphic (s=0x5662c778, full_update=1) at /zh-download/QEMU/qemu-1.5.3/hw/display/vga.c:1788
  #3  0x55818c6a in vga_update_display (opaque=0x5662c778) at /zh-download/QEMU/qemu-1.5.3/hw/display/vga.c:1917
  #4  0x5580eb15 in qxl_hw_update (opaque=0x5662bd70) at /zh-download/QEMU/qemu-1.5.3/hw/display/qxl.c:1766
  #5  0x557bd6bc in graphic_hw_update (con=0x56618d00) at ui/console.c:254
  #6  0x557c8426 in qemu_spice_display_refresh (ssd=0x5662c418) at ui/spice-display.c:417
  #7  0x5580eff0 in display_refresh (dcl=0x5662c420) at /zh-download/QEMU/qemu-1.5.3/hw/display/qxl.c:1886
  #8  0x557c0cb1 in dpy_refresh (s=0x56618370) at ui/console.c:1436
  #9  0x557bd3af in gui_update (opaque=0x56618370) at ui/console.c:192
  #10 0x55797f20 in qemu_run_timers (clock=0x565b5a30) at qemu-timer.c:394
  #11 0x55798183 in qemu_run_all_timers () at qemu-timer.c:453
  #12 0x55760bb7 in main_loop_wait (nonblocking=0) at main-loop.c:470
  #13 0x557cd19c in main_loop () at vl.c:2029
  #14 0x557d43f2 in main (argc=13, argv=0x7fffe2b8, envp=0x7fffe328) at vl.c:4419
  (gdb) 

  
  ==

  http://www.spice-space.org/download/releases/spice-0.12.4.tar.bz2
  http://www.spice-space.org/download/releases/spice-protocol-0.12.6.tar.bz2
  spice  compiling
  ./configure --enable-smartcard=no && make

  qemu-1.5.3
  compiling
  ./configure \
    --disable-strip --enable-debug \
    --target-list=x86_64-softmmu,x86_64-linux-user \
    --disable-sdl --audio-drv-list=alsa --disable-vnc --disable-xen \
    --disable-libiscsi \
    --disable-seccomp --disable-glusterfs --disable-libssh2 \
    --disable-smartcard-nss \
    --disable-usb-redir --disable-brlapi --disable-curl --disable-bsd-user \
    --enable-kvm --enable-spice --enable-system --enable-guest-agent \
    --enable-vhost-net

  
  root@kali-john:~# qemu-system-x86_64 -version
  QEMU emulator version 1.5.3, Copyright (c) 2003-2008 Fabrice Bellard

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1243639/+subscriptions



[Qemu-devel] [Bug 1248469] Re: qemu 1.6.1 q35 ioh3420 not work in windows 7 32bit

2016-12-06 Thread Thomas Huth
Can you still reproduce this problem with the latest version of QEMU?

** Changed in: qemu
   Status: New => Incomplete

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1248469

Title:
  qemu 1.6.1 q35 ioh3420 not work in windows 7 32bit

Status in QEMU:
  Incomplete

Bug description:
  Boot a Windows 7 32-bit guest with the -readconfig q35-chipset.cfg
  parameter. In the guest's device manager, the 3420 device does not
  work; it shows the error "This device cannot find enough free
  resources that it can use (code 12)".

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1248469/+subscriptions



Re: [Qemu-devel] [PATCH v4 05/64] tcg/arm: Move isa detection to tcg-target.h

2016-12-06 Thread Alex Bennée

Richard Henderson  writes:

> Signed-off-by: Richard Henderson 

A slightly expanded commit message to mention why you are moving it
wouldn't go amiss. Otherwise:

Reviewed-by: Alex Bennée 

> ---
>  tcg/arm/tcg-target.h | 36 
>  tcg/arm/tcg-target.inc.c | 41 +
>  2 files changed, 33 insertions(+), 44 deletions(-)
>
> diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
> index 8e724be..d1fe12b 100644
> --- a/tcg/arm/tcg-target.h
> +++ b/tcg/arm/tcg-target.h
> @@ -26,6 +26,37 @@
>  #ifndef ARM_TCG_TARGET_H
>  #define ARM_TCG_TARGET_H
>
> +/* The __ARM_ARCH define is provided by gcc 4.8.  Construct it otherwise.  */
> +#ifndef __ARM_ARCH
> +# if defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) \
> + || defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__) \
> + || defined(__ARM_ARCH_7EM__)
> +#  define __ARM_ARCH 7
> +# elif defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_6J__) \
> +   || defined(__ARM_ARCH_6Z__) || defined(__ARM_ARCH_6ZK__) \
> +   || defined(__ARM_ARCH_6K__) || defined(__ARM_ARCH_6T2__)
> +#  define __ARM_ARCH 6
> +# elif defined(__ARM_ARCH_5__) || defined(__ARM_ARCH_5E__) \
> +   || defined(__ARM_ARCH_5T__) || defined(__ARM_ARCH_5TE__) \
> +   || defined(__ARM_ARCH_5TEJ__)
> +#  define __ARM_ARCH 5
> +# else
> +#  define __ARM_ARCH 4
> +# endif
> +#endif
> +
> +extern int arm_arch;
> +
> +#if defined(__ARM_ARCH_5T__) \
> +|| defined(__ARM_ARCH_5TE__) || defined(__ARM_ARCH_5TEJ__)
> +# define use_armv5t_instructions 1
> +#else
> +# define use_armv5t_instructions use_armv6_instructions
> +#endif
> +
> +#define use_armv6_instructions  (__ARM_ARCH >= 6 || arm_arch >= 6)
> +#define use_armv7_instructions  (__ARM_ARCH >= 7 || arm_arch >= 7)
> +
>  #undef TCG_TARGET_STACK_GROWSUP
>  #define TCG_TARGET_INSN_UNIT_SIZE 4
>  #define TCG_TARGET_TLB_DISPLACEMENT_BITS 16
> @@ -79,7 +110,7 @@ extern bool use_idiv_instructions;
>  #define TCG_TARGET_HAS_eqv_i32  0
>  #define TCG_TARGET_HAS_nand_i32 0
>  #define TCG_TARGET_HAS_nor_i32  0
> -#define TCG_TARGET_HAS_deposit_i32  1
> +#define TCG_TARGET_HAS_deposit_i32  use_armv7_instructions
>  #define TCG_TARGET_HAS_extract_i32  0
>  #define TCG_TARGET_HAS_sextract_i32 0
>  #define TCG_TARGET_HAS_movcond_i32  1
> @@ -90,9 +121,6 @@ extern bool use_idiv_instructions;
>  #define TCG_TARGET_HAS_div_i32  use_idiv_instructions
>  #define TCG_TARGET_HAS_rem_i32  0
>
> -extern bool tcg_target_deposit_valid(int ofs, int len);
> -#define TCG_TARGET_deposit_i32_valid  tcg_target_deposit_valid
> -
>  enum {
>  TCG_AREG0 = TCG_REG_R6,
>  };
> diff --git a/tcg/arm/tcg-target.inc.c b/tcg/arm/tcg-target.inc.c
> index ffa0d40..1415c27 100644
> --- a/tcg/arm/tcg-target.inc.c
> +++ b/tcg/arm/tcg-target.inc.c
> @@ -25,36 +25,7 @@
>  #include "elf.h"
>  #include "tcg-be-ldst.h"
>
> -/* The __ARM_ARCH define is provided by gcc 4.8.  Construct it otherwise.  */
> -#ifndef __ARM_ARCH
> -# if defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) \
> - || defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__) \
> - || defined(__ARM_ARCH_7EM__)
> -#  define __ARM_ARCH 7
> -# elif defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_6J__) \
> -   || defined(__ARM_ARCH_6Z__) || defined(__ARM_ARCH_6ZK__) \
> -   || defined(__ARM_ARCH_6K__) || defined(__ARM_ARCH_6T2__)
> -#  define __ARM_ARCH 6
> -# elif defined(__ARM_ARCH_5__) || defined(__ARM_ARCH_5E__) \
> -   || defined(__ARM_ARCH_5T__) || defined(__ARM_ARCH_5TE__) \
> -   || defined(__ARM_ARCH_5TEJ__)
> -#  define __ARM_ARCH 5
> -# else
> -#  define __ARM_ARCH 4
> -# endif
> -#endif
> -
> -static int arm_arch = __ARM_ARCH;
> -
> -#if defined(__ARM_ARCH_5T__) \
> -|| defined(__ARM_ARCH_5TE__) || defined(__ARM_ARCH_5TEJ__)
> -# define use_armv5t_instructions 1
> -#else
> -# define use_armv5t_instructions use_armv6_instructions
> -#endif
> -
> -#define use_armv6_instructions  (__ARM_ARCH >= 6 || arm_arch >= 6)
> -#define use_armv7_instructions  (__ARM_ARCH >= 7 || arm_arch >= 7)
> +int arm_arch = __ARM_ARCH;
>
>  #ifndef use_idiv_instructions
>  bool use_idiv_instructions;
> @@ -730,16 +701,6 @@ static inline void tcg_out_bswap32(TCGContext *s, int cond, int rd, int rn)
>  }
>  }
>
> -bool tcg_target_deposit_valid(int ofs, int len)
> -{
> -/* ??? Without bfi, we could improve over generic code by combining
> -   the right-shift from a non-zero ofs with the orr.  We do run into
> -   problems when rd == rs, and the mask generated from ofs+len doesn't
> -   fit into an immediate.  We would have to be careful not to pessimize
> -   wrt the optimizations performed on the expanded code.  */
> -return use_armv7_instructions;
> -}
> -
>  static inline void tcg_out_deposit(TCGContext *s, int cond, TCGReg rd,
> TCGArg a1, int ofs, int len, bool const_a1)
>  {


--
Alex Bennée



Re: [Qemu-devel] [Bug 1647683] [NEW] Bad interaction between tb flushing & gdb stub

2016-12-06 Thread Peter Maydell
On 6 December 2016 at 11:39, Julian Brown <1647...@bugs.launchpad.net> wrote:
> Running QEMU under GDB in the test harness via Valgrind, using something
> akin to:
>
> (gdb) target remote | valgrind --tool=memcheck qemu-arm-system [...]
>
> leads to intermittent (and quite hard-to-reproduce) segfaults in QEMU of
> the form:
>
> ==52333== Process terminating with default action of signal 11 (SIGSEGV)
> ==52333==  Access not within mapped region at address 0x24
> ==52333==at 0x1D55F2: tb_page_remove (translate-all.c:1026)
> ==52333==by 0x1D58B4: tb_phys_invalidate (translate-all.c:1119)
> ==52333==by 0x1D63AA: tb_invalidate_phys_page_range (translate-all.c:1519)
> ==52333==by 0x1D66D7: tb_invalidate_phys_addr (translate-all.c:1714)
> ==52333==by 0x1CBA7F: breakpoint_invalidate (exec.c:704)
> ==52333==by 0x1CC01F: cpu_breakpoint_remove_by_ref (exec.c:869)
> ==52333==by 0x1CBF97: cpu_breakpoint_remove (exec.c:857)
> ==52333==by 0x218FAA: gdb_breakpoint_remove (gdbstub.c:717)
> ==52333==by 0x219E35: gdb_handle_packet (gdbstub.c:1035)
> ==52333==by 0x21AF62: gdb_read_byte (gdbstub.c:1459)
> ==52333==by 0x21B096: gdb_chr_receive (gdbstub.c:1672)
> ==52333==by 0x3AF2BC: qemu_chr_be_write_impl (qemu-char.c:419)
>
> These crashes didn't happen on a 2.6-era QEMU, so I bisected and
> discovered the commit 3359baad36889b83df40b637ed993a4b816c4906 ("tcg:
> Make tb_flush() thread safe") appears to be the thing that triggers this
> intermittent failure. Reverting the patch on the branch tip makes the
> crashes go away.

I saw something similar the other day as well, not involving valgrind,
just a simple gdb connected to the gdbstub.

thanks
-- PMM



Re: [Qemu-devel] [Spice-devel] Postcopy+spice crash

2016-12-06 Thread Gerd Hoffmann
  Hi,

Yep, spice worker thread ...

> Thread 7 (Thread 0x7fbe7f9ff700 (LWP 22383)):
> #0  0x7fc0aa42f49d in read () from /lib64/libpthread.so.0
> #1  0x7fc0a8c36c01 in spice_backtrace_gstack () from 
> /lib64/libspice-server.so.1
> #2  0x7fc0a8c3e4f7 in spice_logv () from /lib64/libspice-server.so.1
> #3  0x7fc0a8c3e655 in spice_log () from /lib64/libspice-server.so.1
> #4  0x7fc0a8bfc6de in get_virt () from /lib64/libspice-server.so.1
> #5  0x7fc0a8bfcb73 in red_get_data_chunks_ptr () from 
> /lib64/libspice-server.so.1
> #6  0x7fc0a8bff3fa in red_get_cursor_cmd () from 
> /lib64/libspice-server.so.1
> #7  0x7fc0a8c0fd79 in handle_dev_loadvm_commands () from 
> /lib64/libspice-server.so.1
> #8  0x7fc0a8bf9523 in dispatcher_handle_recv_read () from 
> /lib64/libspice-server.so.1
> #9  0x7fc0a8c1d5a5 in red_worker_main () from /lib64/libspice-server.so.1
> #10 0x7fc0aa428dc5 in start_thread () from /lib64/libpthread.so.0
> #11 0x7fc0a61786ed in clone () from /lib64/libc.so.6

... busy processing post_load request from main thread ...

> Thread 1 (Thread 0x7fc0aead5c40 (LWP 22376)):
> #0  0x7fc0aa42f49d in read () from /lib64/libpthread.so.0
> #1  0x7fc0a8bf9264 in read_safe () from /lib64/libspice-server.so.1
> #2  0x7fc0a8bf9717 in dispatcher_send_message () from 
> /lib64/libspice-server.so.1
> #3  0x7fc0a8bfa0c2 in red_dispatcher_loadvm_commands () from 
> /lib64/libspice-server.so.1
> #4  0x55646556c03d in qxl_spice_loadvm_commands 
> (qxl=qxl@entry=0x55646755b8c0, ext=ext@entry=0x556467a895a0, count=2) at 
> /root/git/qemu/hw/display/qxl.c:219
> #5  0x55646556d15f in qxl_post_load (opaque=0x55646755b8c0, 
> version=) at /root/git/qemu/hw/display/qxl.c:2212
> #6  0x55646562f1b8 in vmstate_load_state (f=f@entry=0x5564666347d0, 
> vmsd=, opaque=0x55646755b8c0, version_id=version_id@entry=21) 
> at /root/git/qemu/migration/vmstate.c:151
> #7  0x55646540f4a1 in vmstate_load (f=0x5564666347d0, se=0x5564676f90a0, 
> version_id=21) at /root/git/qemu/migration/savevm.c:690
> #8  0x55646540f6db in qemu_loadvm_section_start_full 
> (f=f@entry=0x5564666347d0, mis=mis@entry=0x556466c93f10) at 
> /root/git/qemu/migration/savevm.c:1843
> #9  0x55646540f9ac in qemu_loadvm_state_main (f=f@entry=0x5564666347d0, 
> mis=mis@entry=0x556466c93f10) at /root/git/qemu/migration/savevm.c:1900
> #10 0x55646540fd8f in loadvm_handle_cmd_packaged (mis=0x556466c93f10) at 
> /root/git/qemu/migration/savevm.c:1660
> #11 loadvm_process_command (f=0x556467e45740) at 
> /root/git/qemu/migration/savevm.c:1723
> #12 qemu_loadvm_state_main (f=f@entry=0x556467e45740, 
> mis=mis@entry=0x556466c93f10) at /root/git/qemu/migration/savevm.c:1913
> #13 0x556465412546 in qemu_loadvm_state (f=f@entry=0x556467e45740) at 
> /root/git/qemu/migration/savevm.c:1973
> #14 0x55646562b4e8 in process_incoming_migration_co 
> (opaque=0x556467e45740) at /root/git/qemu/migration/migration.c:394
> #15 0x556465746ada in coroutine_trampoline (i0=, 
> i1=) at /root/git/qemu/util/coroutine-ucontext.c:79
> #16 0x7fc0a60c7cf0 in ?? () from /lib64/libc.so.6
> #17 0x7ffe14885180 in ?? ()
> #18 0x in ?? ()

> It should; the device memory is just a RAMBlock that's migrated, so if it's
> not arrived yet from the source the qxl code will block until postcopy
> drags it across; assuming, that is, that the qxl code on the source isn't
> still trying to write to its copy at the same time, which at this
> point it shouldn't.

It seems to happen while restoring the cursor.
Does this patch make a difference?

--- a/hw/display/qxl.c
+++ b/hw/display/qxl.c
@@ -2238,12 +2238,14 @@ static int qxl_post_load(void *opaque, int version)
 cmds[out].group_id = MEMSLOT_GROUP_GUEST;
 out++;
 }
+#if 0
 if (d->guest_cursor) {
 cmds[out].cmd.data = d->guest_cursor;
 cmds[out].cmd.type = QXL_CMD_CURSOR;
 cmds[out].group_id = MEMSLOT_GROUP_GUEST;
 out++;
 }
+#endif
 qxl_spice_loadvm_commands(d, cmds, out);
 g_free(cmds);
 if (d->guest_monitors_config) {

cheers,
  Gerd




Re: [Qemu-devel] [PATCH for-2.8] qdev: apply global properties in reverse order

2016-12-06 Thread Halil Pasic


On 12/06/2016 10:11 AM, Greg Kurz wrote:
>> Given the current doc:
>> > """
>> > -global driver.prop=value
>> > -global driver=driver,property=property,value=value
>> > Set default value of driver's property prop to value, e.g.:
>> > 
>> >  qemu-system-i386 -global ide-drive.physical_block_size=4096 -drive
>> > file=file,if=ide,index=0,media=disk
>> > 
>> > In particular, you can use this to set driver properties for devices 
>> > which
>> > are created automatically by the machine model. To create a device 
>> > which is
>> > not created automatically and set properties on it, use -device.
>> > 
>> > -global driver.prop=value is shorthand for -global 
>> > driver=driver,property=prop,
>> > value=value. The longhand syntax works even when
>> > driver contains a dot. 
>> > """
>> > I think this OOP argument, which I do not completely understand,
> With the current code, properties from the parent classes implicitly
> prevail and this has nothing to do with command line order, or order
> of appearance in HW_COMPAT_*.
>

Yeah and IMHO this is exactly the problem :).
 
> From an OOP perspective, we usually expect child classes to override
> parent classes behavior, not the contrary.
> 


I know, but the question is whether it is the right analogy. The point is
(as you yourself already stated in a follow-up email) that the semantics
of global properties can be very well defined imperatively: for each
instance of class X, set property P to value V. Now if it is possible
that the same data exposed by properties is set more than once (e.g. via
parent and via child) and we want to remain deterministic, we need to say
which write is going to win by being the last one.
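
The imperative reading above can be sketched in a few lines of C. This
is a toy model under stated assumptions, not QEMU's implementation:
struct global_prop, is_a() and apply_globals() are hypothetical names,
the hierarchy is hard-coded to one "parent" and one "child" class, and
only a single property is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model; none of these names are QEMU's. */
struct global_prop {
    const char *driver;   /* class the property is set on */
    const char *value;    /* value to write into the one modelled property */
};

/* Nonzero if 'type' is 'driver' itself or, in this toy hierarchy,
 * the "child" class of the "parent" driver. */
static int is_a(const char *type, const char *driver)
{
    if (strcmp(type, driver) == 0) {
        return 1;
    }
    return strcmp(type, "child") == 0 && strcmp(driver, "parent") == 0;
}

/* Apply every matching entry in list order.  Each match overwrites the
 * previous value, so the last matching write wins.  Returns NULL if no
 * entry applies to 'type'. */
static const char *apply_globals(const char *type,
                                 const struct global_prop *g, size_t n)
{
    const char *cur = NULL;

    assert(g != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (is_a(type, g[i].driver)) {
            cur = g[i].value;
        }
    }
    return cur;
}
```

With the list ordered parent first and child second, an instance of the
child class ends up with the child's value; with the order reversed it
ends up with the parent's value. The order of application is therefore
the thing that has to be defined for the result to be deterministic.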

Another option would be to define this in a functional manner and say
the value of a property is going to be V unless ... (here we need to
state that child takes precedence over parent and bring in the command
line in some way too -- where I do not know if parent command line
should take precedence over compat property on the child or the other
way around).

Because IMHO the first option is not consistent with the doc I'm in
favor of that, but then it has not much to do with overriding behavior.

Thanks for the explanation. Given your other email I think we are in
agreement now. I still think a bit more documentation (not necessarily
user documentation) would not hurt, as IMHO the OOP intuition you stated
is a valid one too (not my favored one though) and thus it would not
hurt to have the design decision captured in natural language too.

Cheers,
Halil




Re: [Qemu-devel] [PATCH for-2.9 15/17] target-i386: Define static "base" CPU model

2016-12-06 Thread Eduardo Habkost
On Tue, Dec 06, 2016 at 10:32:48AM +0100, David Hildenbrand wrote:
[...]
> > 
> > I would like to hear from libvirt developers what they think. I
> > still don't know what they plan to use the type=static expansion
> > results for.
> > 
> > > 
> > > How long is the static expansion on a recent intel CPU?
> > 
> > CPU model "Broadwell" returns 165 entries on return.model.props:
> > 
> > (QEMU) query-cpu-model-expansion type=static model={"name":"Broadwell"}
> 
> > {"return": {"migration-safe": true, "model": {"name": "base", "props": 
> > {"pfthreshold": false, "pku": false, "rtm": true, "tsc-deadline": true, 
> > "xstore-en": false, "tsc-scale": false, "abm": true, "ia64": false, 
> > "kvm-mmu": false, "xsaveopt": true, "tce": false, "smep": true, "fpu": 
> > true, "xcrypt": false, "clflush": true, "flushbyasid": false, 
> > "kvm-steal-time": false, "lm": true, "tsc": true, "adx": true, "fxsr": 
> > true, "tm": false, "xgetbv1": false, "xstore": false, "vme": false, 
> > "vendor": "GenuineIntel", "arat": true, "de": true, "aes": true, "pse": 
> > true, "ds-cpl": false, "tbm": false, "sse": true, "phe-en": false, "f16c": 
> > true, "ds": false, "mpx": false, "tsc-adjust": false, "avx512f": false, 
> > "avx2": true, "pbe": false, "cx16": true, "avx512pf": false, "movbe": true, 
> > "perfctr-nb": false, "ospke": false, "avx512ifma": false, "stepping": 2, 
> > "sep": true, "sse4a": false, "avx512dq": false, "avx512-4vnniw": false, 
> > "xsave": true, "pmm": false, "hle": true, "est": false, "xop": false, 
> > "smx": false, "monitor": false, "avx512er": false, "apic": true, "sse4.1": 
> > true, "sse4.2": true, "pause-filter": false, "lahf-lm": true, 
> > "kvm-nopiodelay": false, "acpi": false, "mmx": true, "osxsave": false, 
> > "pcommit": false, "mtrr": true, "clwb": false, "dca": false, "pdcm": false, 
> > "xcrypt-en": false, "3dnow": false, "invtsc": false, "tm2": false, 
> > "hypervisor": true, "kvmclock-stable-bit": false, "fxsr-opt": false, 
> > "pcid": true, "lbrv": false, "avx512-4fmaps": false, "svm-lock": false, 
> > "popcnt": true, "nrip-save": false, "avx512vl": false, "x2apic": true, 
> > "kvmclock": false, "smap": true, "family": 6, "min-level": 13, "dtes64": 
> > false, "ace2": false, "fma4": false, "xtpr": false, "avx512bw": false, 
> > "nx": true, "lwp": false, "msr": true, "ace2-en": false, "decodeassists": 
> > false, "perfctr-core": false, "pge": true, "pn": false, "fma": true, 
> > "nodeid-msr": false, "cx8": true, "mce": true, "avx512cd": false, 
> > "cr8legacy": false, "mca": true, "pni": true, "rdseed": true, "osvw": 
> > false, "fsgsbase": true, "model-id": "Intel Core Processor (Broadwell)", 
> > "cmp-legacy": false, "kvm-pv-unhalt": false, "rdtscp": true, "mmxext": 
> > false, "cid": false, "vmx": false, "ssse3": true, "extapic": false, 
> > "pse36": true, "min-xlevel": 2147483656, "ibs": false, "avx": true, 
> > "syscall": true, "umip": false, "invpcid": true, "bmi1": true, "bmi2": 
> > true, "vmcb-clean": false, "erms": true, "cmov": true, "misalignsse": 
> > false, "clflushopt": false, "pat": true, "3dnowprefetch": true, "rdpid": 
> > false, "pae": true, "wdt": false, "skinit": false, "pmm-en": false, "phe": 
> > false, "3dnowext": false, "lmce": false, "ht": false, "pdpe1gb": false, 
> > "kvm-pv-eoi": false, "npt": false, "xsavec": false, "pclmulqdq": true, 
> > "svm": false, "sse2": true, "ss": false, "topoext": false, "rdrand": true, 
> > "avx512vbmi": false, "kvm-asyncpf": false, "xsaves": false, "model": 61}}, 
> > "static": true}}
> > 
> > 
> 
> Wow, yes that was the reason for me to introduce abstractions on s390x. But
> here the plan was to use the expansion directly when indicating the
> "host" model to the user. Having something like "Broadwell-base"+/- a
> handful of features is just easier to handle than "base" with 165 feature
> flags. But as we don't know what libvirt plans are (they could use that
> interface on x86 to do feature detection only and convert to models
> themselves), I also have no idea what would be best in the context of x86
> cpu models.

In the case of x86, libvirt already has their own database of
"static" CPU models in /usr/share/libvirt/cpu_map.xml. Maybe
providing our own set of static CPU models would be helpful if
they want to get rid of their database. But I'm not sure:
1) whether they want to do that in the near future; 2) how easily
they could keep compatibility with their existing model names while
doing that.

-- 
Eduardo



[Qemu-devel] [Bug 1221966] Re: SIGSEGV in static_code_gen_buffer

2016-12-06 Thread Peter Maydell
Also I just noticed that the original report says the crash is while
trying to run a SunOS binary. This isn't supported at all -- we can run
Linux Sparc binaries with qemu-sparc, not random-other-OS binaries, and
"guest binary crashes" is not an implausible result...

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1221966

Title:
  SIGSEGV in static_code_gen_buffer

Status in QEMU:
  Incomplete

Bug description:
  Trying to run 'ls' (or, anything else as far as I can tell) from a
  SunOS 5.8 box under RHEL 6.4 linux, I get a segfault.  I've tried
  qemu-1.5.3, qemu-1.6.0, and I fetched
  git://git.qemu-project.org/qemu.git.  I've also tried a statically
  linked sh from /sbin/ and it also segfaulted.

  wcolburn@anotheruvula$ gdb bin/qemu-sparc
  GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
  Copyright (C) 2010 Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later 
  This is free software: you are free to change and redistribute it.
  There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
  and "show warranty" for details.
  This GDB was configured as "x86_64-redhat-linux-gnu".
  For bug reporting instructions, please see:
  ...
  Reading symbols from /home/anotheruvula/qemu/bin/qemu-sparc...done.
  (gdb) run ../sparc/ls
  Starting program: /home/anotheruvula/qemu/bin/qemu-sparc ../sparc/ls
  [Thread debugging using libthread_db enabled]

  Program received signal SIGSEGV, Segmentation fault.
  0x78259116 in static_code_gen_buffer ()
  Missing separate debuginfos, use: debuginfo-install glib2-2.22.5-7.el6.x86_64 
glibc-2.12-1.107.el6_4.4.x86_64
  (gdb) where
  #0  0x78259116 in static_code_gen_buffer ()
  #1  0x77f570cd in cpu_tb_exec (cpu=0x7a2b1210, tb_ptr=
  0x782590d0 "A\213n \205í\017\205Í")
  at /home/anotheruvula/qemu-devel/cpu-exec.c:56
  #2  0x77f57b2d in cpu_sparc_exec (env=0x7a2b1348)
  at /home/anotheruvula/qemu-devel/cpu-exec.c:631
  #3  0x77f77f36 in cpu_loop (env=0x7a2b1348)
  at /home/anotheruvula/qemu-devel/linux-user/main.c:1089
  #4  0x77f798ff in main (argc=2, argv=0x7fffdfc8, envp=
  0x7fffdfe0) at /home/anotheruvula/qemu-devel/linux-user/main.c:4083
  (gdb)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1221966/+subscriptions



Re: [Qemu-devel] [PATCH v5 13/17] qapi: add qapi2texi script

2016-12-06 Thread Marc-André Lureau
Hi

On Tue, Dec 6, 2016 at 2:50 PM Markus Armbruster  wrote:

> I had to resort to diff to find your replies, and massage the text
> manually to produce a readable reply myself.  Please quote the usual
> way.
>
>
I'd have to switch to something other than gmail (which bothers me for
various reasons; let's not discuss the merits of various mail clients,
please ;). In general I don't have problems, but this mail is rather big.
Sorry for the inconvenience.

> Markus Armbruster  writes:
> >
> > > Marc-André Lureau  writes:
> > >
> > >> As the name suggests, the qapi2texi script converts JSON QAPI
> > >> description into a texi file suitable for different target
> > >> formats (info/man/txt/pdf/html...).
> > >>
> > >> It parses the following kind of blocks:
> > >>
> > >> Free-form:
> > >>
> > >>   ##
> > >>   # = Section
> > >>   # == Subsection
> > >>   #
> > >>   # Some text foo with *emphasis*
> > >>   # 1. with a list
> > >>   # 2. like that
> > >>   #
> > >>   # And some code:
> > >>   # | $ echo foo
> > >>   # | -> do this
> > >>   # | <- get that
> > >>   #
> > >>   ##
> > >>
> > >> Symbol:
> > >>
> > >>   ##
> > >>   # @symbol:
> > >>   #
> > >>   # Symbol body ditto ergo sum. Foo bar
> > >>   # baz ding.
> > >>   #
> > >>   # @arg: foo
> > >>   # @arg: #optional foo
> > >
> > > Let's not use @arg twice.
> > >
> > > Terminology: I prefer to use "parameter" for formal parameters, and
> > > "argument" for actual arguments.  This matches how The Python Language
> > > Reference uses the terms.
> > >
> > > What about
> > >
> > > # @param1: the frob to frobnicate
> > > # @param2: #optional how hard to frobnicate
> >
> > ok
> >
> > >>   #
> > >>   # Returns: returns bla bla
> > >>   #  Or bla blah
> > >
> > > Repeating "returns" is awkward, and we don't do that in our schemas.
> > >
> > > We need a period before "Or", or spell it "or".
> > >
> > > What about
> > >
> > > # Returns: the frobnicated frob.
> > > #  If frob isn't frobnicatable, GenericError.
> >
> > ok
> >
> > >>   #
> > >>   # Since: version
> > >>   # Notes: notes, comments can have
> > >>   #- itemized list
> > >>   #- like this
> > >>   #
> > >>   # Example:
> > >>   #
> > >>   # -> { "execute": "quit" }
> > >>   # <- { "return": {} }
> > >>   #
> > >>   ##
> > >>
> > >> That's roughly following the following EBNF grammar:
> > >>
> > >> api_comment = "##\n" comment "##\n"
> > >> comment = freeform_comment | symbol_comment
> > >> freeform_comment = { "# " text "\n" | "#\n" }
> > >> symbol_comment = "# @" name ":\n" { member | meta | freeform_comment }
> > >
> > > Rejects non-empty comments where "#" is not followed by space.  Such
> > > usage is accepted outside doc comments.  Hmm.
> > >
> > >> member = "# @" name ':' [ text ] freeform_comment
> > >
> > > Are you missing a "\n" before freeform_comment?
> >
> > yes
> >
> > >> meta = "# " ( "Returns:", "Since:", "Note:", "Notes:", "Example:", "Examples:" ) [ text ] freeform_comment
> > >
> > > Likewise.
> >
> > ok
> >
> > >> text = free-text markdown-like, "#optional" for members
> > >
> > > The grammar is ambiguous: a line "# @foo:\n" can be parsed both as
> > > freeform_comment and as symbol_comment.  Since your intent is obvious
> > > enough, it can still serve as documentation.  It's not a suitable
> > > foundation for parsing, though.  Okay for a commit message.
> > >
> > >> Thanks to the following json expressions, the documentation is enhanced
> > >> with extra information about the type of arguments and return value
> > >> expected.
> > >
> > > I guess you want to say that we enrich the documentation we extract from
> > > comments with information from the actual schema.  Correct?
> >
> > yes
> >
> > > Missing: a brief discussion of deficiencies.  These include:
> > >
> > > * The generated QMP documentation includes internal types
> > >
> > >   We use qapi-schema.json both for defining the external QMP interface
> > >   and for defining internal types.  qmp-introspect.py carefully
> > >   separates the two, to not expose internal types.  qapi2texi.py happily
> > >   exposes everything.
> > >
> > >   Currently, about a fifth of the types in the generated docs are
> > >   internal:
> > >
> > >   AcpiTableOptions
> > >   BiosAtaTranslation
> > >   BlockDeviceMapEntry
> > >   COLOMessage
> > >   COLOMode
> > >   DummyForceArrays
> > >   FailoverStatus
> > >   FloppyDriveType
> > >   ImageCheck
> > >   LostTickPolicy
> > >   MapEntry
> > >   MigrationParameter
> > >   NetClientDriver
> > >   NetFilterDirection
> > >   NetLegacy
> > >   NetLegacyNicOptions
> > >   NetLegacyOptions
> > >   NetLegacyOptionsKind
> > >   Netdev
> > >   NetdevBridgeOptions
> > >   NetdevDumpOptions
> > >   NetdevHubPortOptions
> > >   NetdevL2TPv3Options
> > >   NetdevNetmapOptions
> > >   NetdevNoneOptions
> > >   NetdevSocketOptions
> > >   NetdevTapOptio

Re: [Qemu-devel] [Nbd] [PATCH v3] doc: Add NBD_CMD_BLOCK_STATUS extension

2016-12-06 Thread Wouter Verhelst
Hi John

Sorry for the late reply; the weekend was busy, and so was Monday.

On Fri, Dec 02, 2016 at 03:39:08PM -0500, John Snow wrote:
> On 12/02/2016 01:45 PM, Alex Bligh wrote:
> > John,
> > 
> >>> +Some storage formats and operations over such formats express a
> >>> +concept of data dirtiness. Whether the operation is block device
> >>> +mirroring, incremental block device backup or any other operation with
> >>> +a concept of data dirtiness, they all share a need to provide a list
> >>> +of ranges that this particular operation treats as dirty.
> >>>
> >>> How can data be 'dirty' if it is static and unchangeable? (I thought)
> >>>
> >>
> >> In a simple case, live IO goes to e.g. hda.qcow2. These writes come from
> >> the VM and cause the bitmap that QEMU manages to become dirty.
> >>
> >> We intend to expose the ability to fleece dirty blocks via NBD. What
> >> happens in this scenario would be that a snapshot of the data at the
> >> time of the request is exported over NBD in a read-only manner.
> >>
> >> In this way, the drive itself is R/W, but the "view" of it from NBD is
> >> RO. While a hypothetical backup client is busy copying data out of this
> >> temporary view, new writes are coming in to the drive, but are not being
> >> exposed through the NBD export.
> >>
> >> (This goes into QEMU-specifics, but those new writes are dirtying a
> >> version of the bitmap not intended to be exposed via the NBD channel.
> >> NBD gets effectively a snapshot of both the bitmap AND the data.)
> > 
> > Thanks. That makes sense - or enough sense for me to carry on commenting!
> > 
> 
> Whew! I'm glad.
> 
> >>> I now think what you are talking about is backing up a *snapshot* of a disk
> >>> that's running, where the disk itself was not connected using NBD? IE it's
> >>> not being 'made dirty' by NBD_CMD_WRITE etc. Rather 'dirtiness' is effectively
> >>> an opaque state represented in a bitmap, which is binary metadata
> >>> at some particular level of granularity. It might as well be 'happiness'
> >>> or 'is coloured blue'. The NBD server would (normally) have no way of
> >>> manipulating this bitmap.
> >>>
> >>> In previous comments, I said 'how come we can set the dirty bit through
> >>> writes but can't clear it?'. This (my statement) is now I think wrong,
> >>> as NBD_CMD_WRITE etc. is not defined to set the dirty bit. The
> >>> state of the bitmap comes from whatever sets the bitmap which is outside
> >>> the scope of this protocol to transmit it.
> >>>
> >>
> >> You know, this is a fair point. We have not (to my knowledge) yet
> >> carefully considered the exact bitmap management scenario when NBD is
> >> involved in retrieving dirty blocks.
> >>
> >> Humor me for a moment while I talk about a (completely hypothetical, not
> >> yet fully discussed) workflow for how I envision this feature.
> >>
> >> (1) User sets up a drive in QEMU, a bitmap is initialized, an initial
> >> backup is made, etc.
> >>
> >> (2) As writes come in, QEMU's bitmap is dirtied.
> >>
> >> (3) The user decides they want to root around to see what data has
> >> changed and would like to use NBD to do so, in contrast to QEMU's own
> >> facilities for dumping dirty blocks.
> >>
> >> (4) A command is issued that creates a temporary, lightweight snapshot
> >> ('fleecing') and exports this snapshot over NBD. The bitmap is
> >> associated with the NBD export at this point at NBD server startup. (For
> >> the sake of QEMU discussion, maybe this command is "blockdev-fleece")
> >>
> >> (5) At this moment, the snapshot is static and represents the data at
> >> the time the NBD server was started. The bitmap is also forked and
> >> represents only this snapshot. The live data and bitmap continue to change.
> >>
> >> (6) Dirty blocks are queried and copied out via NBD.
> >>
> >> (7) The user closes the NBD instance upon completion of their task,
> >> whatever it was. (Making a new incremental backup? Just taking a peek at
> >> some changed data? who knows.)
> >>
> >> The point that's interesting here is what do we do with the two bitmaps
> >> at this point? The data delta can be discarded (this was after all just
> >> a lightweight read-only point-in-time snapshot) but the bitmap data
> >> needs to be dealt with.
> >>
> >> (A) In the case of "User made a new incremental backup," the bitmap that
> >> got forked off to serve the NBD read should be discarded.
> >>
> >> (B) In the case of "User just wanted to look around," the bitmap should
> >> be merged back into the bitmap it was forked from.
> >>
> >> I don't advise a hybrid where "User copied some data, but not all" where
> >> we need to partially clear *and* merge, but conceivably this could
> >> happen, because the things we don't want to happen always will.
> >>
> >> At this point maybe it's becoming obvious that actually it would be very
> >> prudent to allow the NBD client itself to inform QEMU via the NBD
> >> protocol which extents/blocks/(etc) that it is "done" with.
> >>
> >> Maybe it *
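
The fork, discard and merge handling of the dirty bitmap in steps (5), (A)
and (B) of the workflow quoted above can be sketched as follows. This is a
toy model only, not QEMU's bitmap API: the ToyBitmap type and the toy_*
helpers are hypothetical names, the bitmap is fixed at 256 bits, and it is an
assumption of this sketch that the live bitmap starts out clean after the
fork.

```c
#include <stdint.h>
#include <string.h>

#define TOY_BITMAP_WORDS 4              /* 4 * 64 = 256 tracked blocks */

/* Hypothetical fixed-size dirty bitmap, one bit per block. */
typedef struct ToyBitmap {
    uint64_t bits[TOY_BITMAP_WORDS];
} ToyBitmap;

/* Mark one block dirty. The caller must pass bit < 256. */
static void toy_set(ToyBitmap *bm, unsigned bit)
{
    bm->bits[bit / 64] |= UINT64_C(1) << (bit % 64);
}

/* Return nonzero if the block is dirty. The caller must pass bit < 256. */
static int toy_test(const ToyBitmap *bm, unsigned bit)
{
    return (bm->bits[bit / 64] >> (bit % 64)) & 1;
}

/*
 * Step (5): fork. The export gets a frozen copy describing the snapshot;
 * the live bitmap starts clean and keeps recording new guest writes.
 */
static void toy_fork(ToyBitmap *live, ToyBitmap *frozen)
{
    *frozen = *live;
    memset(live, 0, sizeof(*live));
}

/* Case (A): the backup completed, so the frozen bits are no longer needed. */
static void toy_discard(ToyBitmap *frozen)
{
    memset(frozen, 0, sizeof(*frozen));
}

/* Case (B): nothing was backed up, so fold the frozen bits back in. */
static void toy_merge(ToyBitmap *live, ToyBitmap *frozen)
{
    for (int i = 0; i < TOY_BITMAP_WORDS; i++) {
        live->bits[i] |= frozen->bits[i];
    }
    memset(frozen, 0, sizeof(*frozen));
}
```

In this model the hybrid case ("copied some data, but not all") would need a
per-block clear on the frozen copy before the merge, which is the sort of
information a client-driven "done with this extent" request would supply.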

Re: [Qemu-devel] [PATCH for-2.8] virtio-crypto: zeroize the key material before free

2016-12-06 Thread Michael S. Tsirkin
On Tue, Dec 06, 2016 at 05:29:13PM +0800, Gonglei wrote:
> Zeroize the memory of the CryptoDevBackendSymOpInfo structure that the
> request points to, to protect the key material.
> 
> Signed-off-by: Gonglei 
> ---
>  hw/virtio/virtio-crypto.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/virtio/virtio-crypto.c b/hw/virtio/virtio-crypto.c
> index 2f2467e..ecb19b6 100644
> --- a/hw/virtio/virtio-crypto.c
> +++ b/hw/virtio/virtio-crypto.c
> @@ -337,7 +337,18 @@ static void virtio_crypto_free_request(VirtIOCryptoReq *req)
>  {
>  if (req) {
>  if (req->flags == CRYPTODEV_BACKEND_ALG_SYM) {
> -g_free(req->u.sym_op_info);
> +size_t max_len;
> +CryptoDevBackendSymOpInfo *op_info = req->u.sym_op_info;
> +
> +max_len = op_info->iv_len +
> +  op_info->aad_len +
> +  op_info->src_len +
> +  op_info->dst_len +
> +  op_info->digest_result_len;
> +
> +/* Zeroize and free request data structure */
> +memset(op_info, 0, sizeof(*op_info) + max_len);
> +g_free(op_info);

Write into memory, then free it?  This looks rather strange. Why are we
doing this?

>  }
>  g_free(req);
>  }
> -- 
> 1.8.3.1
> 
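
For context on the question above: free() does not erase what it releases, so
key bytes can stay readable in heap memory until the allocator reuses the
block. A minimal sketch of the zeroize-then-free idiom follows. The names
secure_zero and secure_free are hypothetical and this is not the code from
the patch. The volatile pointer is there because a plain memset() right
before free() may be dropped by an optimizing compiler as a dead store.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Overwrite len bytes at buf with zeros. Writing through a volatile
 * pointer keeps the compiler from discarding the stores even though
 * the buffer is never read again.
 */
static void secure_zero(void *buf, size_t len)
{
    volatile unsigned char *p = buf;

    while (len--) {
        *p++ = 0;
    }
}

/* Zeroize a heap buffer of known length, then release it. */
static void secure_free(void *buf, size_t len)
{
    if (!buf) {
        return;                 /* mirror free(NULL): nothing to do */
    }
    secure_zero(buf, len);
    free(buf);
}
```

The caller has to pass the true allocation length, which is why the patch
sums the individual field lengths first. Where available, a library routine
such as explicit_bzero() serves the same purpose as secure_zero().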



Re: [Qemu-devel] [PATCH] migration: re-active images when migration fails to complete

2016-12-06 Thread Kevin Wolf
Am 19.11.2016 um 12:43 hat zhanghailiang geschrieben:
> commit fe904ea8242cbae2d7e69c052c754b8f5f1ba1d6 fixed a case
> in which migration aborted QEMU because it didn't regain control
> of the images after an error happened.
> 
> Actually, there is another case in that error path that aborts QEMU
> for the same reason:
> migration_thread()
> migration_completion()
>    bdrv_inactivate_all() --> inactivate images
>    qemu_savevm_state_complete_precopy()
>    socket_writev_buffer() --> error because destination fails
>      qemu_fflush() --> set error on migration stream
>    qemu_mutex_unlock_iothread() --> unlock
> qmp_migrate_cancel() --> user cancelled migration
> migrate_set_state() --> set migrate CANCELLING

Important to note here: qmp_migrate_cancel() is executed by a concurrent
thread, it doesn't depend on any code paths in migration_completion().

> migration_completion() --> go on to fail_invalidate
> if (s->state == MIGRATION_STATUS_ACTIVE) --> Jump this branch
> migration_thread() --> break migration loop
>   vm_start() --> restart guest with inactive images
> We failed to regain control of the images because we only regain it
> while the migration state is "active", but here the user cancelled the
> migration after noticing that an error had happened (for example, the
> libvirtd daemon was shut down unexpectedly on the destination).
> 
> Signed-off-by: zhanghailiang 
> ---
>  migration/migration.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index f498ab8..0c1ee6d 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1752,7 +1752,8 @@ fail_invalidate:
>  /* If not doing postcopy, vm_start() will be called: let's regain
>   * control on images.
>   */
> -if (s->state == MIGRATION_STATUS_ACTIVE) {

This if condition tries to check whether we ran the code path that
called bdrv_inactivate_all(), so that we only try to reactivate images
if we really inactivated them first.

The problem with it is that it ignores a possible concurrent
modification of s->state.

> +if (s->state == MIGRATION_STATUS_ACTIVE ||
> +s->state == MIGRATION_STATUS_CANCELLING) {

This adds another state that we could end up with after a concurrent
modification, so that even in this case we undo the inactivation.

However, it is no longer limited to the cases where we inactivated the
image. It also applies to other code paths (like the postcopy one) where
we didn't inactivate images.

What saves the patch is that bdrv_invalidate_cache() is a no-op for
block devices that aren't inactivated, so calling it more often than
necessary is okay.

But then, if we're going to rely on this, it would be much better to
just remove the if altogether. I can't say whether there are any other
possible values of s->state that we should consider, and by removing the
if we would be guaranteed to catch all of them.

If we don't want to rely on it, just keep a local bool that remembers
whether we inactivated images and check that here.
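
The local-bool alternative could look roughly like the sketch below. It is a
toy model, not QEMU code: ToyMigration, toy_inactivate_all() and
toy_invalidate_cache_all() are hypothetical stand-ins for the migration state
and the bdrv_* calls, and the stream error is passed in as a parameter so the
failure path can be exercised.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the migration status values. */
enum ToyState { TOY_ACTIVE, TOY_CANCELLING };

typedef struct ToyMigration {
    enum ToyState state;     /* may be changed by another thread */
    bool images_inactive;    /* models the block layer's inactive flag */
} ToyMigration;

/* Stand-in for bdrv_inactivate_all(); returns 0 on success. */
static int toy_inactivate_all(ToyMigration *s)
{
    s->images_inactive = true;
    return 0;
}

/* Stand-in for bdrv_invalidate_cache_all(): take the images back. */
static void toy_invalidate_cache_all(ToyMigration *s)
{
    s->images_inactive = false;
}

/* Returns 0 if completion succeeded, -1 if it went through the fail path. */
static int toy_completion(ToyMigration *s, bool stream_error)
{
    bool inactivated = false;    /* records what *this* function did */

    if (toy_inactivate_all(s) == 0) {
        inactivated = true;
    }
    if (!stream_error) {
        return 0;                /* success: images stay inactive here */
    }

    /*
     * Fail path: the decision depends only on the local flag, so a
     * concurrent change of s->state (for example to CANCELLING) cannot
     * make us skip the reactivation.
     */
    if (inactivated) {
        toy_invalidate_cache_all(s);
    }
    return -1;
}
```

Because the flag is local to the function, it needs no locking and cannot be
altered by the thread that runs the cancel command.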

>  Error *local_err = NULL;
>  
>  bdrv_invalidate_cache_all(&local_err);

So in summary, this is a horrible patch because it checks the wrong
thing, and I can't really say if it covers everything it needs to
cover, but arguably it happens to correctly fix the outcome of a
previously failing case.

Normally I would reject such a patch and require a clean solution, but
then we're on the day of -rc3, so if you can't send v2 right away, we
might not have the time for it.

Tough call...

Kevin



Re: [Qemu-devel] [Bug 1647683] [NEW] Bad interaction between tb flushing & gdb stub

2016-12-06 Thread Peter Maydell
On 6 December 2016 at 12:34, Peter Maydell  wrote:
> I saw something similar the other day as well, not involving valgrind,
> just a simple gdb connected to the gdbstub.

http://people.linaro.org/~peter.maydell/gdbstub-bug.tgz
is a repro case for this (with an aarch64 kernel guest).
Segfaults every time when the guest hits the breakpoint.

thanks
-- PMM



[Qemu-devel] [PATCH v6 01/17] qapi: improve device_add schema

2016-12-06 Thread Marc-André Lureau
'device_add' is still incomplete for now, but we can fix a few
arguments:
- 'bus' is a common argument, regardless of the device
- 'id' is an optional argument

Signed-off-by: Marc-André Lureau 
Reviewed-by: Markus Armbruster 
Reviewed-by: Eric Blake 
---
 qapi-schema.json | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/qapi-schema.json b/qapi-schema.json
index a0d3b5d7c5..d4c42e5a6f 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2292,7 +2292,7 @@
 #
 # @bus: #optional the device's parent bus (device tree path)
 #
-# @id: the device's ID, must be unique
+# @id: #optional the device's ID, must be unique
 #
 # Additional arguments depend on the type.
 #
@@ -2322,7 +2322,7 @@
 # Since: 0.13
 ##
 { 'command': 'device_add',
-  'data': {'driver': 'str', 'id': 'str'},
+  'data': {'driver': 'str', '*bus': 'str', '*id': 'str'},
   'gen': false } # so we can get the additional arguments
 
 ##
-- 
2.11.0




[Qemu-devel] [PATCH v6 02/17] qapi: improve TransactionAction doc

2016-12-06 Thread Marc-André Lureau
TransactionAction is a flat union, document 'type' versions
exhaustively, and sort the members.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Markus Armbruster 
---
 qapi-schema.json | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/qapi-schema.json b/qapi-schema.json
index d4c42e5a6f..e29d47ded3 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1795,28 +1795,29 @@
 # @TransactionAction:
 #
 # A discriminated record of operations that can be performed with
-# @transaction.
+# @transaction. Action @type can be:
 #
-# Since: 1.1
+# - @abort: since 1.6
+# - @block-dirty-bitmap-add: since 2.5
+# - @block-dirty-bitmap-clear: since 2.5
+# - @blockdev-backup: since 2.3
+# - @blockdev-snapshot: since 2.5
+# - @blockdev-snapshot-internal-sync: since 1.7
+# - @blockdev-snapshot-sync: since 1.1
+# - @drive-backup: since 1.6
 #
-# drive-backup since 1.6
-# abort since 1.6
-# blockdev-snapshot-internal-sync since 1.7
-# blockdev-backup since 2.3
-# blockdev-snapshot since 2.5
-# block-dirty-bitmap-add since 2.5
-# block-dirty-bitmap-clear since 2.5
+# Since: 1.1
 ##
 { 'union': 'TransactionAction',
   'data': {
-   'blockdev-snapshot': 'BlockdevSnapshot',
-   'blockdev-snapshot-sync': 'BlockdevSnapshotSync',
-   'drive-backup': 'DriveBackup',
-   'blockdev-backup': 'BlockdevBackup',
'abort': 'Abort',
-   'blockdev-snapshot-internal-sync': 'BlockdevSnapshotInternal',
'block-dirty-bitmap-add': 'BlockDirtyBitmapAdd',
-   'block-dirty-bitmap-clear': 'BlockDirtyBitmap'
+   'block-dirty-bitmap-clear': 'BlockDirtyBitmap',
+   'blockdev-backup': 'BlockdevBackup',
+   'blockdev-snapshot': 'BlockdevSnapshot',
+   'blockdev-snapshot-internal-sync': 'BlockdevSnapshotInternal',
+   'blockdev-snapshot-sync': 'BlockdevSnapshotSync',
+   'drive-backup': 'DriveBackup'
} }
 
 ##
-- 
2.11.0




[Qemu-devel] [PATCH v6 04/17] qapi: add some sections in docs

2016-12-06 Thread Marc-André Lureau
Add some more section titles to organize the documentation we're going
to generate.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Markus Armbruster 
---
 qapi-schema.json |  4 
 qapi/block-core.json |  6 --
 qapi/block.json  | 10 --
 qapi/common.json |  6 --
 qapi/crypto.json |  5 -
 qapi/event.json  |  6 ++
 qapi/rocker.json |  4 
 qapi/trace.json  |  3 +++
 8 files changed, 37 insertions(+), 7 deletions(-)

diff --git a/qapi-schema.json b/qapi-schema.json
index e29d47ded3..fbea3b18d9 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -20,6 +20,10 @@
 # QAPI introspection
 { 'include': 'qapi/introspect.json' }
 
+##
+# = QMP commands
+##
+
 ##
 # @qmp_capabilities:
 #
diff --git a/qapi/block-core.json b/qapi/block-core.json
index ec1da2a29a..05cedc3f23 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1,6 +1,8 @@
 # -*- Mode: Python -*-
-#
-# QAPI block core definitions (vm unrelated)
+
+##
+# == QAPI block core definitions (vm unrelated)
+##
 
 # QAPI common definitions
 { 'include': 'common.json' }
diff --git a/qapi/block.json b/qapi/block.json
index 937df05830..e4ad74bcb2 100644
--- a/qapi/block.json
+++ b/qapi/block.json
@@ -1,10 +1,16 @@
 # -*- Mode: Python -*-
-#
-# QAPI block definitions (vm related)
+
+##
+# = QAPI block definitions
+##
 
 # QAPI block core definitions
 { 'include': 'block-core.json' }
 
+##
+# == QAPI block definitions (vm unrelated)
+##
+
 ##
 # @BiosAtaTranslation:
 #
diff --git a/qapi/common.json b/qapi/common.json
index 624a8619c8..d93f159946 100644
--- a/qapi/common.json
+++ b/qapi/common.json
@@ -1,6 +1,8 @@
 # -*- Mode: Python -*-
-#
-# QAPI common definitions
+
+##
+# = QAPI common definitions
+##
 
 ##
 # @QapiErrorClass:
diff --git a/qapi/crypto.json b/qapi/crypto.json
index 15d296e3c1..1e517b0841 100644
--- a/qapi/crypto.json
+++ b/qapi/crypto.json
@@ -1,6 +1,9 @@
 # -*- Mode: Python -*-
 #
-# QAPI crypto definitions
+
+##
+# = QAPI crypto definitions
+##
 
 ##
 # @QCryptoTLSCredsEndpoint:
diff --git a/qapi/event.json b/qapi/event.json
index 37bf34ed6d..683666848a 100644
--- a/qapi/event.json
+++ b/qapi/event.json
@@ -1,3 +1,9 @@
+# -*- Mode: Python -*-
+
+##
+# = Other events
+##
+
 ##
 # @SHUTDOWN:
 #
diff --git a/qapi/rocker.json b/qapi/rocker.json
index ace27760f1..1e511cd37a 100644
--- a/qapi/rocker.json
+++ b/qapi/rocker.json
@@ -1,3 +1,7 @@
+##
+# = Rocker switch device
+##
+
 ##
 # @RockerSwitch:
 #
diff --git a/qapi/trace.json b/qapi/trace.json
index 4fd39b7792..3ad7df7fdb 100644
--- a/qapi/trace.json
+++ b/qapi/trace.json
@@ -5,6 +5,9 @@
 # This work is licensed under the terms of the GNU GPL, version 2 or later.
 # See the COPYING file in the top-level directory.
 
+##
+# = Tracing commands
+##
 
 ##
 # @TraceEventState:
-- 
2.11.0




[Qemu-devel] [PATCH v6 00/17] qapi doc generation (whole version, squashed)

2016-12-06 Thread Marc-André Lureau
Add a qapi2texi script to generate the documentation from the qapi
schemas.

The SQUASHED patch in this series is a squashed version of the
documentation move from qmp-commands.txt to the schemas. The whole
version (not sent on the ML to avoid spamming) is in the following git
branch: https://github.com/elmarco/qemu/commits/qapi-doc

PDF preview:
https://fedorapeople.org/~elmarco/qemu-qmp-ref.pdf

v6:
- rebased on top of armbru/qapi-next branch
- add a few patches to improve Exception subclasses and usage in
  qapi.py as suggested during review
- parser and generator fixes and improvements after v5 review:
  - various union improvements, hopefully with a better syntax
  - improve error messages
  - improve docs/qapi-code-gen.txt documentation section
  - do not allow interleaved body documentation between sections
  - more tests for new cases
  - make expression documentation mandatory, fix the tests
  - replace bad usage of @var{} with @t{} in texi, fix texi2pod to
handle it
  - renaming, reordering etc..
- add docs/qapi-syntax.texi to describe the API syntax used in the
  texi documentation
- fix interleaved body and section documentation
- improve documentation sections name
- many build-sys improvements after review
- fix and improve commit messages, update R-b tags

v5:
- many parser and generator fixes and improvements after v4 review:
 - simplified current section handling by using a Section object
 - adding a line is more stateful: either freeform or symbol comment
 - always check_docs() when parsing with QAPISchema
 - simplified some code and comments
 - do not break current section on empty line, but break after a non
   indented paragraph in an argument section. This seems to reflect
   the way documentation is written:

   ##
   # @foo:
   # @arg: fluctuat nec mergitur
   #   - continues here
   #
   #   Since: 1853
   #
   # Body
   #
   ##

   Other sections (Note/Examples etc) are not indented (it seems), but
   could use a similar rule. I prefer to keep this only for args, for
   styling reasons (bikeshedding?).

- better handling of flat-union in generator
- list all enum values (even when not documented)
- added qapi-doc parsing tests and more error checking
- pep8/pylint fixes
- some more schema doc fixes
- do not move logo to docs/

v4:
- more device_add schema fixes
- do not merge docs/qmp-intro.txt in qemu-qmp-ref.texi
- remove needless @ifinfo, add GPL copying text
- added qemu logo to pdf
- added some r-b tags

v3:
- many improvements to the doc parser:
  - throws an error in various malformed conditions
  - allows multiple meta-sections, except for "Since:" and "Return:"
  - build a list of docs, instead of attaching docs to expressions
  - accept() breaks on new doc block, and get_doc() returns a QAPIDoc
- fix more documentation to fit the new parser
- use a master texi file that includes the generated file, instead of
  templated texi file
- texi fixes after Markus review
- only build and install html and man pages by default
- fix .gitignore

v2:
- change licence to be lgpl2+
- fix some comments & commit message
- add more code comments
- improve the doc parsing to treat only "Since" as a special case not
  requiring ":" (common notation in the doc)
- include some early schema doc fixes (to fix generated doc)
- include the squashed version of the doc move
- include the man page and installation build changes

Marc-André Lureau (17):
  qapi: improve device_add schema
  qapi: improve TransactionAction doc
  qga/schema: improve guest-set-vcpus Returns: section
  qapi: add some sections in docs
  docs: add master qapi texi files
  qapi: rework qapi Exception
  qapi: use a QAPIParseError in parser
  qapi: add qapi2texi script
  texi2pod: learn quotation, deftp and deftypefn
  json: reorder documentation body
  (SQUASHED) move doc to schema
  docs: add qemu logo to pdf
  build-sys: use --no-split for info
  build-sys: remove dvi doc generation
  build-sys: use a generic TEXI2MAN rule
  build-sys: add txt documentation rules
  build-sys: add qapi doc generation targets

 Makefile   |   94 +-
 tests/Makefile.include |   19 +
 scripts/qapi.py|  558 ++-
 scripts/qapi2texi.py   |  339 ++
 scripts/texi2pod.pl|   54 +-
 .gitignore |   11 +-
 configure  |2 +-
 docs/qapi-code-gen.txt |  138 +-
 docs/qapi-syntax.texi  |  175 +
 docs/qemu-ga-ref.texi  |   89 +
 docs/qemu-qmp-ref.texi |   89 +
 docs/qemu_logo.pdf |  Bin 0 -> 9117 bytes
 docs/qmp-commands.txt  | 3824 
 docs/qmp-events.txt|  731 
 docs/qmp-intro.txt 

[Qemu-devel] [PATCH v6 13/17] build-sys: use --no-split for info

2016-12-06 Thread Marc-André Lureau
Splitting the info files doesn't bring much benefit these days.
This also fixes the generated info files being left untracked by .gitignore.

Let's use MAKEINFOFLAGS for common flags, --number-sections is already
the default anyway, so adding it doesn't change the info output.

Signed-off-by: Marc-André Lureau 
---
 Makefile | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Makefile b/Makefile
index 361773634d..0bc470d974 100644
--- a/Makefile
+++ b/Makefile
@@ -529,17 +529,17 @@ ui/console-gl.o: $(SRC_PATH)/ui/console-gl.c \
 
 # documentation
 MAKEINFO=makeinfo
-MAKEINFOFLAGS=--no-headers --no-split --number-sections
+MAKEINFOFLAGS=--no-split --number-sections
 TEXIFLAG=$(if $(V),,--quiet)
 %.dvi: %.texi
$(call quiet-command,texi2dvi $(TEXIFLAG) -I . $<,"GEN","$@")
 
 %.html: %.texi
-   $(call quiet-command,LC_ALL=C $(MAKEINFO) $(MAKEINFOFLAGS) --html $< -o $@, \
-   "GEN","$@")
+   $(call quiet-command,LC_ALL=C $(MAKEINFO) $(MAKEINFOFLAGS) --no-headers \
+   --html $< -o $@,"GEN","$@")
 
 %.info: %.texi
-   $(call quiet-command,$(MAKEINFO) $< -o $@,"GEN","$@")
+   $(call quiet-command,$(MAKEINFO) $(MAKEINFOFLAGS) $< -o $@,"GEN","$@")
 
 %.pdf: %.texi
$(call quiet-command,texi2pdf $(TEXIFLAG) -I . $<,"GEN","$@")
-- 
2.11.0




[Qemu-devel] [PATCH v6 03/17] qga/schema: improve guest-set-vcpus Returns: section

2016-12-06 Thread Marc-André Lureau
Itemize the possible return values of guest-set-vcpus.

Signed-off-by: Marc-André Lureau 
---
 qga/qapi-schema.json | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/qga/qapi-schema.json b/qga/qapi-schema.json
index 94c03128fd..ad63737fce 100644
--- a/qga/qapi-schema.json
+++ b/qga/qapi-schema.json
@@ -697,21 +697,21 @@
 # Returns: The length of the initial sublist that has been successfully
 #  processed. The guest agent maximizes this value. Possible cases:
 #
-#  0:if the @vcpus list was empty on input. Guest state
+#  - 0:  if the @vcpus list was empty on input. Guest state
 #has not been changed. Otherwise,
 #
-#  Error:processing the first node of @vcpus failed for the
+#  - Error:  processing the first node of @vcpus failed for the
 #reason returned. Guest state has not been changed.
 #Otherwise,
 #
-#  < length(@vcpus): more than zero initial nodes have been processed,
+#  - < length(@vcpus): more than zero initial nodes have been processed,
 #but not the entire @vcpus list. Guest state has
 #changed accordingly. To retrieve the error
 #(assuming it persists), repeat the call with the
 #successfully processed initial sublist removed.
 #Otherwise,
 #
-#  length(@vcpus):   call successful.
+#  - length(@vcpus): call successful.
 #
 # Since: 1.5
 ##
-- 
2.11.0
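
A client can act on these return values with a retry loop that drops the
already processed prefix, as the Returns: section describes. The sketch below
is an illustration under stated assumptions, not guest agent code: the
callback type set_vcpus_fn, the helper apply_vcpus() and the toy agent
toy_set_vcpus() are hypothetical, and an error reply is modelled as a
return value of -1.

```c
#include <stddef.h>

/*
 * Hypothetical transport callback modelled on guest-set-vcpus: returns
 * the length of the initial sublist that was processed, or -1 when the
 * first node failed (the "Error" case).
 */
typedef long (*set_vcpus_fn)(const int *vcpus, size_t n, void *opaque);

/*
 * Apply the whole list, repeating the call with the processed prefix
 * removed. Returns the number of entries applied; *failed is set to 1
 * if the agent reported an error for the entry at that index.
 */
static size_t apply_vcpus(set_vcpus_fn fn, void *opaque,
                          const int *vcpus, size_t n, int *failed)
{
    size_t done = 0;

    *failed = 0;
    while (done < n) {
        long r = fn(vcpus + done, n - done, opaque);

        if (r < 0) {            /* error on vcpus[done], state unchanged */
            *failed = 1;
            break;
        }
        if (r == 0) {           /* defensive: not expected for n > 0 */
            break;
        }
        done += (size_t)r;      /* partial or full success */
    }
    return done;
}

/* Toy agent for illustration: rejects the vcpu id stored in *opaque. */
static long toy_set_vcpus(const int *vcpus, size_t n, void *opaque)
{
    int bad = *(const int *)opaque;

    for (size_t i = 0; i < n; i++) {
        if (vcpus[i] == bad) {
            return i == 0 ? -1 : (long)i;
        }
    }
    return (long)n;
}
```

With the toy agent, a list whose third entry is rejected takes two calls: the
first returns 2, and the second, issued with the remaining entries, reports
the error.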




[Qemu-devel] [PATCH v6 10/17] json: reorder documentation body

2016-12-06 Thread Marc-André Lureau
Place the body of expression documentation at the top (after the
@symbol:). This prevents ambiguity with other argument or
tagged-section documentation.

Signed-off-by: Marc-André Lureau 
---
 qapi-schema.json | 83 ++--
 qapi/block-core.json | 14 -
 qapi/introspect.json | 28 --
 qapi/trace.json  | 16 +-
 qga/qapi-schema.json | 27 +
 5 files changed, 83 insertions(+), 85 deletions(-)

diff --git a/qapi-schema.json b/qapi-schema.json
index fbea3b18d9..f11b3bd178 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1946,11 +1946,11 @@
 #
 # Set XBZRLE cache size
 #
-# @value: cache size in bytes
-#
-# The size will be rounded down to the nearest power of 2.
 # The cache size can be modified before and during ongoing migration
 #
+# @value: cache size in bytes. The size will be rounded down to the
+# nearest power of 2.
+#
 # Returns: nothing on success
 #
 # Since: 1.2
@@ -2293,16 +2293,16 @@
 ##
 # @device_add:
 #
+# Add a device.
+#
+# Additional arguments depend on the type.
+#
 # @driver: the name of the new device's driver
 #
 # @bus: #optional the device's parent bus (device tree path)
 #
 # @id: #optional the device's ID, must be unique
 #
-# Additional arguments depend on the type.
-#
-# Add a device.
-#
 # Notes:
 # 1. For detailed information about this command, please refer to the
 #'docs/qdev-device-use.txt' file.
@@ -2319,13 +2319,13 @@
 # "mac": "52:54:00:12:34:56" } }
 # <- { "return": {} }
 #
+# Since: 0.13
+##
 # TODO This command effectively bypasses QAPI completely due to its
 # "additional arguments" business.  It shouldn't have been added to
 # the schema in this form.  It should be qapified properly, or
 # replaced by a properly qapified command.
 #
-# Since: 0.13
-##
 { 'command': 'device_add',
   'data': {'driver': 'str', '*bus': 'str', '*id': 'str'},
   'gen': false } # so we can get the additional arguments
@@ -2499,10 +2499,10 @@
 #
 # Dump guest's storage keys
 #
-# @filename: the path to the file to dump to
-#
 # This command is only supported on s390 architecture.
 #
+# @filename: the path to the file to dump to
+#
 # Since: 2.5
 ##
 { 'command': 'dump-skeys',
@@ -2513,23 +2513,24 @@
 #
 # Add a network backend.
 #
+# Additional arguments depend on the type.
+#
 # @type: the type of network backend.  Current valid values are 'user', 'tap',
 #'vde', 'socket', 'dump' and 'bridge'
 #
 # @id: the name of the new network backend
 #
-# Additional arguments depend on the type.
+# Since: 0.14.0
+#
+# Returns: Nothing on success
+#  If @type is not a valid network backend, DeviceNotFound
+##
 #
 # TODO This command effectively bypasses QAPI completely due to its
 # "additional arguments" business.  It shouldn't have been added to
 # the schema in this form.  It should be qapified properly, or
 # replaced by a properly qapified command.
 #
-# Since: 0.14.0
-#
-# Returns: Nothing on success
-#  If @type is not a valid network backend, DeviceNotFound
-##
 { 'command': 'netdev_add',
   'data': {'type': 'str', 'id': 'str'},
   'gen': false }# so we can get the additional arguments
@@ -3209,6 +3210,22 @@
 #
 # Virtual CPU definition.
 #
+# @unavailable-features is a list of QOM property names that
+# represent CPU model attributes that prevent the CPU from running.
+# If the QOM property is read-only, that means there's no known
+# way to make the CPU model run in the current host. Implementations
+# that choose not to provide specific information return the
+# property name "type".
+# If the property is read-write, it means that it MAY be possible
+# to run the CPU model in the current host if that property is
+# changed. Management software can use it as hints to suggest or
+# choose an alternative for the user, or just to generate meaningful
+# error messages explaining why the CPU model can't be used.
+# If @unavailable-features is an empty list, the CPU model is
+# runnable using the current host and machine-type.
+# If @unavailable-features is not present, runnability
+# information for the CPU is not available.
+#
 # @name: the name of the CPU definition
 #
 # @migration-safe: #optional whether a CPU definition can be safely used for
@@ -3227,22 +3244,6 @@
 #the CPU model from running in the current
 #host. (since 2.8)
 #
-# @unavailable-features is a list of QOM property names that
-# represent CPU model attributes that prevent the CPU from running.
-# If the QOM property is read-only, that means there's no known
-# way to make the CPU model run in the current host. Implementations
-# that choose not to provide specific information return the
-# property name "type".
-# If the property is read-write, it means that it MAY be possible
-# to run the CPU model in the current host if that property is
-# changed. Management software can use it as hints to suggest or
-#

[Qemu-devel] [PATCH v6 14/17] build-sys: remove dvi doc generation

2016-12-06 Thread Marc-André Lureau
There is no clear reason to have rules to generate dvi format
documentation; pdf is generally better supported.

Signed-off-by: Marc-André Lureau 
---
 Makefile   | 12 
 .gitignore |  1 -
 2 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/Makefile b/Makefile
index 0bc470d974..f2b9ef0784 100644
--- a/Makefile
+++ b/Makefile
@@ -80,7 +80,7 @@ GENERATED_HEADERS += module_block.h
 Makefile: ;
 configure: ;
 
-.PHONY: all clean cscope distclean dvi html info install install-doc \
+.PHONY: all clean cscope distclean html info install install-doc \
pdf recurse-all speed test dist msi FORCE
 
 $(call set-vpath, $(SRC_PATH))
@@ -389,7 +389,7 @@ distclean: clean
rm -f config-all-devices.mak config-all-disas.mak config.status
rm -f po/*.mo tests/qemu-iotests/common.env
rm -f roms/seabios/config.mak roms/vgabios/config.mak
-   rm -f qemu-doc.info qemu-doc.aux qemu-doc.cp qemu-doc.cps qemu-doc.dvi
+   rm -f qemu-doc.info qemu-doc.aux qemu-doc.cp qemu-doc.cps
rm -f qemu-doc.fn qemu-doc.fns qemu-doc.info qemu-doc.ky qemu-doc.kys
rm -f qemu-doc.log qemu-doc.pdf qemu-doc.pg qemu-doc.toc qemu-doc.tp
rm -f qemu-doc.vr
@@ -531,9 +531,6 @@ ui/console-gl.o: $(SRC_PATH)/ui/console-gl.c \
 MAKEINFO=makeinfo
 MAKEINFOFLAGS=--no-split --number-sections
 TEXIFLAG=$(if $(V),,--quiet)
-%.dvi: %.texi
-   $(call quiet-command,texi2dvi $(TEXIFLAG) -I . $<,"GEN","$@")
-
 %.html: %.texi
$(call quiet-command,LC_ALL=C $(MAKEINFO) $(MAKEINFOFLAGS) --no-headers \
--html $< -o $@,"GEN","$@")
@@ -587,12 +584,11 @@ qemu-ga.8: qemu-ga.texi
  $(POD2MAN) --section=8 --center=" " --release=" " qemu-ga.pod > $@, \
  "GEN","$@")
 
-dvi: qemu-doc.dvi
 html: qemu-doc.html
 info: qemu-doc.info
 pdf: qemu-doc.pdf
 
-qemu-doc.dvi qemu-doc.html qemu-doc.info qemu-doc.pdf: \
+qemu-doc.html qemu-doc.info qemu-doc.pdf: \
qemu-img.texi qemu-nbd.texi qemu-options.texi qemu-option-trace.texi \
qemu-monitor.texi qemu-img-cmds.texi qemu-ga.texi \
qemu-monitor-info.texi
@@ -689,7 +685,7 @@ help:
@echo  '  docker  - Help about targets running tests inside Docker containers'
@echo  ''
@echo  'Documentation targets:'
-   @echo  '  dvi html info pdf'
+   @echo  '  html info pdf'
@echo  '  - Build documentation in specified format'
@echo  ''
 ifdef CONFIG_WIN32
diff --git a/.gitignore b/.gitignore
index 3d7848cb7e..6f175b391e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -60,7 +60,6 @@
 *.a
 *.aux
 *.cp
-*.dvi
 *.exe
 *.msi
 *.dll
-- 
2.11.0




[Qemu-devel] [PATCH v6 16/17] build-sys: add txt documentation rules

2016-12-06 Thread Marc-André Lureau
Build txt documentation, and install it.

Signed-off-by: Marc-André Lureau 
---
 Makefile   | 12 +---
 .gitignore |  1 +
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/Makefile b/Makefile
index d18bac1c31..37d45ee21b 100644
--- a/Makefile
+++ b/Makefile
@@ -81,7 +81,7 @@ Makefile: ;
 configure: ;
 
 .PHONY: all clean cscope distclean html info install install-doc \
-   pdf recurse-all speed test dist msi FORCE
+   pdf txt recurse-all speed test dist msi FORCE
 
 $(call set-vpath, $(SRC_PATH))
 
@@ -90,7 +90,7 @@ LIBS+=-lz $(LIBS_TOOLS)
 HELPERS-$(CONFIG_LINUX) = qemu-bridge-helper$(EXESUF)
 
 ifdef BUILD_DOCS
-DOCS=qemu-doc.html qemu.1 qemu-img.1 qemu-nbd.8 qemu-ga.8
+DOCS=qemu-doc.html qemu-doc.txt qemu.1 qemu-img.1 qemu-nbd.8 qemu-ga.8
 ifdef CONFIG_VIRTFS
 DOCS+=fsdev/virtfs-proxy-helper.1
 endif
@@ -431,6 +431,7 @@ endif
 install-doc: $(DOCS)
$(INSTALL_DIR) "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) qemu-doc.html "$(DESTDIR)$(qemu_docdir)"
+   $(INSTALL_DATA) qemu-doc.txt "$(DESTDIR)$(qemu_docdir)"
 ifdef CONFIG_POSIX
$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man1"
$(INSTALL_DATA) qemu.1 "$(DESTDIR)$(mandir)/man1"
@@ -538,6 +539,10 @@ TEXIFLAG=$(if $(V),,--quiet)
 %.info: %.texi
$(call quiet-command,$(MAKEINFO) $(MAKEINFOFLAGS) $< -o $@,"GEN","$@")
 
+%.txt: %.texi
+   $(call quiet-command,LC_ALL=C $(MAKEINFO) $(MAKEINFOFLAGS) --no-headers \
+   --plaintext $< -o $@,"GEN   $@")
+
 %.pdf: %.texi
$(call quiet-command,texi2pdf $(TEXIFLAG) -I . $<,"GEN","$@")
 
@@ -563,6 +568,7 @@ qemu-ga.8: qemu-ga.texi
 html: qemu-doc.html
 info: qemu-doc.info
 pdf: qemu-doc.pdf
+txt: qemu-doc.txt
 
 qemu-doc.html qemu-doc.info qemu-doc.pdf: \
qemu-img.texi qemu-nbd.texi qemu-options.texi qemu-option-trace.texi \
@@ -661,7 +667,7 @@ help:
@echo  '  docker  - Help about targets running tests inside Docker containers'
@echo  ''
@echo  'Documentation targets:'
-   @echo  '  html info pdf'
+   @echo  '  html info pdf txt'
@echo  '  - Build documentation in specified format'
@echo  ''
 ifdef CONFIG_WIN32
diff --git a/.gitignore b/.gitignore
index 6f175b391e..e16bddc070 100644
--- a/.gitignore
+++ b/.gitignore
@@ -40,6 +40,7 @@
 /qmp-marshal.c
 /qemu-doc.html
 /qemu-doc.info
+/qemu-doc.txt
 /qemu-img
 /qemu-nbd
 /qemu-options.def
-- 
2.11.0




[Qemu-devel] [PATCH v6 05/17] docs: add master qapi texi files

2016-12-06 Thread Marc-André Lureau
The qapi2texi script generates a file to be included in a texi file. Add
"QEMU QMP Reference Manual" and "QEMU Guest Agent Protocol Reference"
master texi files.

Signed-off-by: Marc-André Lureau 
---
 docs/qapi-syntax.texi  | 175 +
 docs/qemu-ga-ref.texi  |  85 
 docs/qemu-qmp-ref.texi |  85 
 3 files changed, 345 insertions(+)
 create mode 100644 docs/qapi-syntax.texi
 create mode 100644 docs/qemu-ga-ref.texi
 create mode 100644 docs/qemu-qmp-ref.texi

diff --git a/docs/qapi-syntax.texi b/docs/qapi-syntax.texi
new file mode 100644
index 00..117d6272d6
--- /dev/null
+++ b/docs/qapi-syntax.texi
@@ -0,0 +1,175 @@
+See QEMU @file{docs/qapi-code-gen.txt} for details about the ``Client
+JSON Protocol'' wire format. Many @b{Example} illustrate the usage of
+the various types.
+
+This reference document uses a simplified syntax for the different
+JSON expressions, of the following general form:
+
+@deftp {Type} TypeName @
+{@{ 'member': @t{type}, ['optional-member: @t{some-type}], ... @}}
+
+@table @asis
+@item @code{'member'}
+Member description
+@item @code{'optional-member'} *
+Optional member description
+@end table
+@quotation Since
+A tagged section
+@end quotation
+@quotation Example
+@example
+<- @{ "return": @{ "member": "foo", ... @} @}
+@end example
+@end quotation
+@end deftp
+
+The [] in the declaration and the * name prefix in the member
+description means the member is optional.
+
+A type name inside [] refers to a single-dimension array of that type.
+
+@section Enum documentation
+
+Enumerations are strings over the Client JSON Protocol.
+
+Example of an API documentation:
+
+@deftp Enum Enumeration
+
+@table @asis
+@item @code{'auto'}
+Description auto
+@item @code{'on'}
+Description on
+@item @code{'off'}
+Description off
+@end table
+An enumeration of three options: on, off, and auto
+@end deftp
+
+@section Struct documentation
+
+A struct is an Object in the Client JSON protocol, whose members are
+listed in the declaration. It may have a base structure: the members
+of the base structure are merged in the same top-level Object over the
+client protocol.
+
+The API documentation uses the following syntax for a struct:
+
+@deftp {Struct} Type @
+{@{ BaseStruct, 'foo': @t{type}, ... @}}
+
+@table @asis
+@item @code{'foo'}
+Member foo description
+@end table
+The type description.
+@end deftp
+
+@section Union documentation
+
+Union types are used to let the user choose between several different
+variants for an object.  There are two flavors: simple (no
+discriminator or base), and flat (both discriminator and base).
+
+In the Client JSON Protocol, a simple union is represented by a
+dictionary that contains the @t{'type'} member as a discriminator, and
+a @t{'data'} member that is of the specified data type corresponding
+to the discriminator value.
+
+The API documentation uses the following syntax for simple union:
+
+@deftp {Simple Union} SimpleUnionType @
+{@{ 'type': @t{str}, 'data': [ 'type' = 't1': @t{Type1}, 't2: @t{Type2}, ... ] @}}
+
+Simple union description
+@end deftp
+
+A flat union definition avoids nesting on the wire, and specifies a
+set of common members that occur in all variants of the union. The
+top-level members of the union dictionary on the wire will be
+combination of members from both the base type and the appropriate
+discriminated branch type.  The @t{'discriminator'} member is the name
+of a non-optional enum-typed member of the base struct.
+
+The documentation uses the following syntax for a flat union:
+
+@deftp {Flat Union} FlatUnionType @
+{@{ UnionBase, [ 'discriminator' = 'd1': @t{Type1}, 'd2': @t{Type2} ] @}}
+
+Flat union description
+@end deftp
+
+@section Alternate documentation
+
+An alternate type is one that allows a choice between two or more JSON
+data types (string, integer, number, or object, but currently not
+array) on the wire.
+
+@deftp {Alternate} AlternateType @
+{[ 't1': @t{Type1}, 't2': @t{Type2}, ... ]}
+
+@table @asis
+@item @code{'t1'}
+Either this type
+@item @code{'t2'}
+Or this type
+@end table
+AlternateType description
+@end deftp
+
+@section Command documentation
+
+In the Client JSON Protocol, a command is a dictionary with an
+@t{'execute'} member (the name of the command as value), and an
+@t{'arguments'} member for the arguments. The API documentation uses
+the following syntax for a command:
+
+@deftypefn Command {ReturnType} query-foo @
+{('arg': @t{type}, ...)}
+
+@table @asis
+@item @code{'arg'}
+If true, the command will query...
+@end table
+Query for all bar...
+@quotation Returns
+The @code{ReturnType} for...
+@end quotation
+@quotation Example
+@example
+-> @{ "execute": "query-foo", "arguments": @{ "arg": ... @} @}
+<- @{
+  "return": @{ "foo": ... @}
+   @}
+@end example
+@end quotation
+@end deftypefn
+
+@section Event documentation
+
+An event is a JSON object defined by its name, used as the @t{'event'}

[Qemu-devel] [PATCH v6 06/17] qapi: rework qapi Exception

2016-12-06 Thread Marc-André Lureau
Use a base class QAPIError, and QAPIParseError for parser errors and
QAPISemError for semantic errors, suggested by Markus Armbruster.

Signed-off-by: Marc-André Lureau 
---
 scripts/qapi.py | 338 ++--
 1 file changed, 158 insertions(+), 180 deletions(-)
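
For readers skimming the diff below, here is a minimal standalone sketch of
the reworked hierarchy and the messages it produces. It is not part of the
patch: error_path() is a simplified stand-in for the helper in
scripts/qapi.py, and FakeParser is a hypothetical object that only carries
the attributes QAPIParseError reads. Tab handling in the column computation
is left out.

```python
def error_path(parent):
    # Simplified stand-in (assumption): build the include chain prefix.
    res = ""
    while parent:
        res = ("In file included from %s:%d:\n"
               % (parent['file'], parent['line'])) + res
        parent = parent['parent']
    return res


class QAPIError(Exception):
    def __init__(self, fname, line, col, incl_info, msg):
        Exception.__init__(self)
        self.fname = fname
        self.line = line
        self.col = col
        self.info = incl_info
        self.msg = msg

    def __str__(self):
        loc = "%s:%d" % (self.fname, self.line)
        if self.col is not None:
            loc += ":%s" % self.col
        return error_path(self.info) + "%s: %s" % (loc, self.msg)


class QAPIParseError(QAPIError):
    def __init__(self, parser, msg):
        # Column = 1 + characters consumed on the current line (no tabs here).
        col = 1 + len(parser.src[parser.line_pos:parser.pos])
        QAPIError.__init__(self, parser.fname, parser.line,
                           col, parser.incl_info, msg)


class QAPISemError(QAPIError):
    def __init__(self, info, msg):
        # Semantic errors carry no column, only file, line and include chain.
        QAPIError.__init__(self, info['file'], info['line'], None,
                           info['parent'], msg)


class FakeParser(object):
    # Hypothetical parser state, for illustration only.
    fname = "schema.json"
    src = "abc def"
    line = 1
    line_pos = 0
    pos = 4
    incl_info = None


parse_err = QAPIParseError(FakeParser(), "unexpected token")
sem_err = QAPISemError({'file': 'sub.json', 'line': 3,
                        'parent': {'file': 'top.json', 'line': 1,
                                   'parent': None}},
                       "Invalid 'include' directive")
print(parse_err)
print(sem_err)
```

Both error kinds can then be caught with a single "except QAPIError" clause,
and only parse errors report a column.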

diff --git a/scripts/qapi.py b/scripts/qapi.py
index 21bc32fda3..5885c9e4ad 100644
--- a/scripts/qapi.py
+++ b/scripts/qapi.py
@@ -91,35 +91,38 @@ def error_path(parent):
 return res
 
 
-class QAPISchemaError(Exception):
-def __init__(self, schema, msg):
+class QAPIError(Exception):
+def __init__(self, fname, line, col, incl_info, msg):
 Exception.__init__(self)
-self.fname = schema.fname
+self.fname = fname
+self.line = line
+self.col = col
+self.info = incl_info
 self.msg = msg
-self.col = 1
-self.line = schema.line
-for ch in schema.src[schema.line_pos:schema.pos]:
-if ch == '\t':
-self.col = (self.col + 7) % 8 + 1
-else:
-self.col += 1
-self.info = schema.incl_info
 
 def __str__(self):
-return error_path(self.info) + \
-"%s:%d:%d: %s" % (self.fname, self.line, self.col, self.msg)
+loc = "%s:%d" % (self.fname, self.line)
+if self.col is not None:
+loc += ":%s" % self.col
+return error_path(self.info) + "%s: %s" % (loc, self.msg)
 
 
-class QAPIExprError(Exception):
-def __init__(self, expr_info, msg):
-Exception.__init__(self)
-assert expr_info
-self.info = expr_info
-self.msg = msg
+class QAPIParseError(QAPIError):
+def __init__(self, parser, msg):
+col = 1
+for ch in parser.src[parser.line_pos:parser.pos]:
+if ch == '\t':
+col = (col + 7) % 8 + 1
+else:
+col += 1
+QAPIError.__init__(self, parser.fname, parser.line,
+   col, parser.incl_info, msg)
 
-def __str__(self):
-return error_path(self.info['parent']) + \
-"%s:%d: %s" % (self.info['file'], self.info['line'], self.msg)
+
+class QAPISemError(QAPIError):
+def __init__(self, info, msg):
+QAPIError.__init__(self, info['file'], info['line'], None,
+   info['parent'], msg)
 
 
 class QAPISchemaParser(object):
@@ -140,25 +143,24 @@ class QAPISchemaParser(object):
 self.accept()
 
 while self.tok is not None:
-expr_info = {'file': fname, 'line': self.line,
- 'parent': self.incl_info}
+info = {'file': fname, 'line': self.line,
+'parent': self.incl_info}
 expr = self.get_expr(False)
 if isinstance(expr, dict) and "include" in expr:
 if len(expr) != 1:
-raise QAPIExprError(expr_info,
-"Invalid 'include' directive")
+raise QAPISemError(info, "Invalid 'include' directive")
 include = expr["include"]
 if not isinstance(include, str):
-raise QAPIExprError(expr_info,
-"Value of 'include' must be a string")
+raise QAPISemError(info,
+   "Value of 'include' must be a string")
 incl_abs_fname = os.path.join(os.path.dirname(abs_fname),
   include)
 # catch inclusion cycle
-inf = expr_info
+inf = info
 while inf:
 if incl_abs_fname == os.path.abspath(inf['file']):
-raise QAPIExprError(expr_info, "Inclusion loop for %s"
-% include)
+raise QAPISemError(info, "Inclusion loop for %s"
+   % include)
 inf = inf['parent']
 # skip multiple include of the same file
 if incl_abs_fname in previously_included:
@@ -166,14 +168,13 @@ class QAPISchemaParser(object):
 try:
 fobj = open(incl_abs_fname, 'r')
 except IOError as e:
-raise QAPIExprError(expr_info,
-'%s: %s' % (e.strerror, include))
+raise QAPISemError(info, '%s: %s' % (e.strerror, include))
 exprs_include = QAPISchemaParser(fobj, previously_included,
- expr_info)
+ info)
 self.exprs.extend(exprs_include.exprs)
 else:
 expr_elem = {'expr': expr,
- 'info': expr_info}
+ 'info': info}
 sel
