Re: [Qemu-devel] [PATCH v3 5/6] qemu-iotests: Add test case for mirror with unmap

2015-05-14 Thread Paolo Bonzini


On 12/05/2015 13:48, Fam Zheng wrote:
> This checks that the discard on mirror source that effectively zeroes
> data is also reflected by the data of target.
> 
> Signed-off-by: Fam Zheng 
> Reviewed-by: John Snow 
> ---
>  tests/qemu-iotests/131 | 59 
> ++
>  tests/qemu-iotests/131.out |  5 
>  tests/qemu-iotests/group   |  1 +
>  3 files changed, 65 insertions(+)
>  create mode 100644 tests/qemu-iotests/131
>  create mode 100644 tests/qemu-iotests/131.out
> 
> diff --git a/tests/qemu-iotests/131 b/tests/qemu-iotests/131
> new file mode 100644
> index 000..f53ef6e
> --- /dev/null
> +++ b/tests/qemu-iotests/131
> @@ -0,0 +1,59 @@
> +#!/usr/bin/env python
> +#
> +# Test mirror with unmap
> +#
> +# Copyright (C) 2015 Red Hat, Inc.
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see .
> +#
> +
> +import time
> +import os
> +import iotests
> +from iotests import qemu_img, qemu_io
> +
> +test_img = os.path.join(iotests.test_dir, 'test.img')
> +target_img = os.path.join(iotests.test_dir, 'target.img')
> +
> +class TestSingleDrive(iotests.QMPTestCase):
> +image_len = 2 * 1024 * 1024 # MB
> +
> +def setUp(self):
> +# Write data to the image so we can compare later
> +qemu_img('create', '-f', iotests.imgfmt, test_img, 
> str(TestSingleDrive.image_len))
> +qemu_io('-f', iotests.imgfmt, '-c', 'write -P0x5d 0 2M', test_img)
> +
> +self.vm = iotests.VM().add_drive(test_img, 'discard=unmap')
> +self.vm.launch()
> +
> +def tearDown(self):
> +self.vm.shutdown()
> +os.remove(test_img)
> +try:
> +os.remove(target_img)
> +except OSError:
> +pass
> +
> +def test_mirror_discard(self):
> +result = self.vm.qmp('drive-mirror', device='drive0', sync='full',
> + target=target_img)
> +self.assert_qmp(result, 'return', {})
> +self.vm.hmp_qemu_io('drive0', 'discard 0 64k')
> +self.complete_and_wait('drive0')
> +self.vm.shutdown()
> +self.assertTrue(iotests.compare_images(test_img, target_img),
> +'target image does not match source after mirroring')
> +
> +if __name__ == '__main__':
> +iotests.main(supported_fmts=['raw', 'qcow2'])
> diff --git a/tests/qemu-iotests/131.out b/tests/qemu-iotests/131.out
> new file mode 100644
> index 000..ae1213e
> --- /dev/null
> +++ b/tests/qemu-iotests/131.out
> @@ -0,0 +1,5 @@
> +.
> +--
> +Ran 1 tests
> +
> +OK
> diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
> index 6ca3466..34b16cb 100644
> --- a/tests/qemu-iotests/group
> +++ b/tests/qemu-iotests/group
> @@ -128,3 +128,4 @@
>  128 rw auto quick
>  129 rw auto quick
>  130 rw auto quick
> +131 rw auto quick
> 

Do other "quick" tests need a QEMU binary?

Paolo



Re: [Qemu-devel] [PATCH v2] mirror: correct buf_size

2015-05-14 Thread Paolo Bonzini


On 14/05/2015 12:29, Wen Congyang wrote:
> 
> If buf_size % granularity is not 0, mirror_free_init() will
> do dangerous things.
> 
> Signed-off-by: Wen Congyang 
> ---
>  block/mirror.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/mirror.c b/block/mirror.c
> index 58f391a..9521212 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -684,7 +684,7 @@ static void mirror_start_job(BlockDriverState *bs, 
> BlockDriverState *target,
>  s->is_none_mode = is_none_mode;
>  s->base = base;
>  s->granularity = granularity;
> -s->buf_size = MAX(buf_size, granularity);
> +s->buf_size = ROUND_UP(buf_size, granularity);
>  
>  s->dirty_bitmap = bdrv_create_dirty_bitmap(bs, granularity, NULL, errp);
>  if (!s->dirty_bitmap) {
> 

Reviewed-by: Paolo Bonzini 



Re: [Qemu-devel] [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED

2015-05-14 Thread Peter Maydell
On 14 May 2015 at 11:31, Andrew Jones  wrote:
> Forgot to (4): switch from setting userspace's mapping to
> device memory to normal, non-cacheable. Using device memory
> caused a problem that Alex Graf found, and Peter Maydell suggested
> using normal, non-cacheable instead.

Did you check that non-cacheable is definitely the correct
kind of Normal memory attribute we want? (ie not write-through).

-- PMM



Re: [Qemu-devel] [PATCH RFC 1/1] virtio: migrate config_vector

2015-05-14 Thread Christian Borntraeger
Am 14.05.2015 um 12:09 schrieb Paolo Bonzini:
> 
> 
> On 14/05/2015 12:07, Michael S. Tsirkin wrote:
>>> If you really want that, the hook should be in the virtio device class.
>>> Keying on the machine type is almost never the answer, at least upstream.
>>> Distros can always override it if they know what they're getting into.
>>
>> I beg to disagree.  Check out PC_COMPAT_ macros.
> 
> I'm talking specifically about migration subsections.  None of them are
> keyed on the machine type downstream, as far as I know.
> 
>> But if you are saying
>> migration with virtio-ccw is just too broken ATM, so we shouldn't care
>> about its migration compatibility upstream, then I can buy that.
> 
> Yes, I'm also saying that (and Christian must agree or they would not
> have posted this patch).

So some background: Migration works most of the time but it will fail 
if config changes happen (like virtio-balloon). When trying to fix this Jason
came up with a patch that added qemu_put_be16(f, vdev->config_vector) in the 
virtio_ccw save_config. As this does not use vmstates and versions we were not
sure about how to ensure compatibility. So Conny came up with this proposal to
detect and reject migration from an old (broken) version.

Christian






Re: [Qemu-devel] [RFC/RFT PATCH v2 3/3] arm/arm64: KVM: implement 'uncached' mem coherency

2015-05-14 Thread Christoffer Dall
On Wed, May 13, 2015 at 01:31:54PM +0200, Andrew Jones wrote:
> When S1 and S2 memory attributes combine wrt to caching policy,
> non-cacheable types take precedence. If a guest maps a region as
> device memory, which KVM userspace is using to emulate the device
> using normal, cacheable memory, then we lose coherency. With
> KVM_MEM_UNCACHED, KVM userspace can now hint to KVM which memory
> regions are likely to be problematic. With this patch, as pages
> of these types of regions are faulted into the guest, not only do
> we flush the page's dcache, but we also change userspace's
> mapping to NC in order to maintain coherency.
> 
> What if the guest doesn't do what we expect? While we can't
> force a guest to use cacheable memory, we can take advantage of
> the non-cacheable precedence, and force it to use non-cacheable.
> So, this patch also introduces PAGE_S2_NORMAL_NC, and uses it on
> KVM_MEM_UNCACHED regions to force them to NC.
> 
> We now have both guest and userspace on the same page (pun intended)

I'd like to revisit the overall approach here.  Is doing non-cached
accesses in both the guest and host really the right thing to do here?

The semantics of the device becomes that it is cache coherent (because
QEMU is), and I think Marc argued that Linux/UEFI should simply be
adapted to handle whatever emulated devices we have as coherent.  I also
remember someone arguing that would be wrong (Peter?).

Finally, does this address all cache coherency issues with emulated
devices?  Some VOS guys had seen things still not working with this
approach, unsure why...  I'd like to avoid us merging this only to merge
a more complete solution in a few weeks which reverts this solution...

More comments/questions below:

> 
> Signed-off-by: Andrew Jones 
> ---
>  arch/arm/include/asm/kvm_mmu.h|  5 -
>  arch/arm/include/asm/pgtable-3level.h |  1 +
>  arch/arm/include/asm/pgtable.h|  1 +
>  arch/arm/kvm/mmu.c| 37 
> +++
>  arch/arm64/include/asm/kvm_mmu.h  |  5 -
>  arch/arm64/include/asm/memory.h   |  1 +
>  arch/arm64/include/asm/pgtable.h  |  1 +
>  7 files changed, 36 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index 405aa18833073..e8034a80b12e5 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -214,8 +214,11 @@ static inline void __coherent_cache_guest_page(struct 
> kvm_vcpu *vcpu, pfn_t pfn,
>   while (size) {
>   void *va = kmap_atomic_pfn(pfn);
>  
> - if (need_flush)
> + if (need_flush) {
>   kvm_flush_dcache_to_poc(va, PAGE_SIZE);
> + if (ipa_uncached)
> + set_memory_nc((unsigned long)va, 1);

nit: consider moving this outside the need_flush

> + }
>  
>   if (icache_is_pipt())
>   __cpuc_coherent_user_range((unsigned long)va,
> diff --git a/arch/arm/include/asm/pgtable-3level.h 
> b/arch/arm/include/asm/pgtable-3level.h
> index a745a2a53853c..39b3f7a40e663 100644
> --- a/arch/arm/include/asm/pgtable-3level.h
> +++ b/arch/arm/include/asm/pgtable-3level.h
> @@ -121,6 +121,7 @@
>   * 2nd stage PTE definitions for LPAE.
>   */
>  #define L_PTE_S2_MT_UNCACHED (_AT(pteval_t, 0x0) << 2) /* strongly 
> ordered */
> +#define L_PTE_S2_MT_NORMAL_NC(_AT(pteval_t, 0x5) << 2) /* 
> normal non-cacheable */
>  #define L_PTE_S2_MT_WRITETHROUGH (_AT(pteval_t, 0xa) << 2) /* normal 
> inner write-through */
>  #define L_PTE_S2_MT_WRITEBACK(_AT(pteval_t, 0xf) << 2) /* 
> normal inner write-back */
>  #define L_PTE_S2_MT_DEV_SHARED   (_AT(pteval_t, 0x1) << 2) /* 
> device */
> diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
> index f40354198bad4..ae13ca8b0a23d 100644
> --- a/arch/arm/include/asm/pgtable.h
> +++ b/arch/arm/include/asm/pgtable.h
> @@ -100,6 +100,7 @@ extern pgprot_t   pgprot_s2_device;
>  #define PAGE_HYP _MOD_PROT(pgprot_kernel, L_PTE_HYP)
>  #define PAGE_HYP_DEVICE  _MOD_PROT(pgprot_hyp_device, L_PTE_HYP)
>  #define PAGE_S2  _MOD_PROT(pgprot_s2, L_PTE_S2_RDONLY)
> +#define PAGE_S2_NORMAL_NC__pgprot((pgprot_val(PAGE_S2) & 
> ~L_PTE_S2_MT_MASK) | L_PTE_S2_MT_NORMAL_NC)
>  #define PAGE_S2_DEVICE   _MOD_PROT(pgprot_s2_device, 
> L_PTE_S2_RDONLY)
>  
>  #define __PAGE_NONE  __pgprot(_L_PTE_DEFAULT | L_PTE_RDONLY | 
> L_PTE_XN | L_PTE_NONE)
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index bc1665acd73e7..6b3bd8061bd2a 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -1220,7 +1220,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
> phys_addr_t fault_ipa,
>   struct vm_area_struct *vma;
>   pfn_t pfn;
>   pgprot_t mem_type = PAGE_S2;
> - bool fault_ipa_uncached;
>

Re: [Qemu-devel] [PATCH] block: Let bdrv_drain_all() to call aio_poll() for each AioContext

2015-05-14 Thread Alexander Yarygin
Fam Zheng  writes:

> On Wed, 05/13 19:34, Alexander Yarygin wrote:
>> Paolo Bonzini  writes:
>> 
>> > On 13/05/2015 17:18, Alexander Yarygin wrote:
>> >> After the commit 9b536adc ("block: acquire AioContext in
>> >> bdrv_drain_all()") the aio_poll() function got called for every
>> >> BlockDriverState, in assumption that every device may have its own
>> >> AioContext. The bdrv_drain_all() function is called in each
>> >> virtio_reset() call,
>> >
>> > ... which should actually call bdrv_drain().  Can you fix that?
>> >
>> 
>> I thought about it, but couldn't come to conclusion that it's safe. The
>> comment above bdrv_drain_all() states "... it is not possible to have a
>> function to drain a single device's I/O queue.",
>
> I think that comment is stale - it predates the introduction of per BDS req
> tracking and bdrv_drain.
>

It says "completion of an asynchronous I/O operation can trigger any
number of other I/O operations on other devices". If this is no longer
the case, then I agree :). But I think it doesn't exclude this
patch anyway: bdrv_drain_all() is called in other places as well,
e.g. in do_vm_stop().

>> besides that what if we
>> have several virtual disks that share host file?
>
> I'm not sure what you mean, bdrv_drain works on a BDS, each virtual disk has
> one of which.
>
>> Or I'm wrong and it's ok to do?
>> 
>> >> which in turn is called for every virtio-blk
>> >> device on initialization, so we got aio_poll() called
>> >> 'length(device_list)^2' times.
>> >> 
>> >> If we have thousands of disks attached, there are a lot of
>> >> BlockDriverStates but only a few AioContexts, leading to tons of
>> >> unnecessary aio_poll() calls. For example, startup times with 1000 disks
>> >> takes over 13 minutes.
>> >> 
>> >> This patch changes the bdrv_drain_all() function allowing it find shared
>> >> AioContexts and to call aio_poll() only for unique ones. This results in
>> >> much better startup times, e.g. 1000 disks do come up within 5 seconds.
>> >
>> > I'm not sure this patch is correct.  You may have to call aio_poll
>> > multiple times before a BlockDriverState is drained.
>> >
>> > Paolo
>> >
>> 
>> 
>> Ah, right. We need second loop, something like this:
>> 
>> @@ -2030,20 +2033,33 @@ void bdrv_drain(BlockDriverState *bs)
>>  void bdrv_drain_all(void)
>>  {
>>  /* Always run first iteration so any pending completion BHs run */
>> -bool busy = true;
>> +bool busy = true, pending = false;
>>  BlockDriverState *bs;
>> +GList *aio_ctxs = NULL, *ctx;
>> +AioContext *aio_context;
>> 
>>  while (busy) {
>>  busy = false;
>> 
>>  QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
>> -AioContext *aio_context = bdrv_get_aio_context(bs);
>> +aio_context = bdrv_get_aio_context(bs);
>> 
>>  aio_context_acquire(aio_context);
>>  busy |= bdrv_drain_one(bs);
>>  aio_context_release(aio_context);
>> +if (!aio_ctxs || !g_list_find(aio_ctxs, aio_context))
>> +aio_ctxs = g_list_append(aio_ctxs, aio_context);
>
> Braces are required even for single line if. Moreover, I don't understand this
> - aio_ctxs is a duplicate of bdrv_states.
>
> Fam
>
>

length(bdrv_states) == amount of virtual disks
length(aio_ctxs) == amount of threads

We can get as many disks as we want, while amount of threads is
limited. In my case there were 1024 disks sharing one AioContext that
gives overhead at least in 1023 calls of aio_poll(). 

[.. skipped ..]




[Qemu-devel] [PATCH v3 3/5] raw-posix: DPRINTF instead of DEBUG_BLOCK_PRINT

2015-05-14 Thread Dimitris Aragiorgis
Building the QEMU tools fails if we #define DEBUG_BLOCK inside
block/raw-posix.c. Here instead of adding qemu-log.o in block-obj-y
so that DEBUG_BLOCK_PRINT can be used, we substitute the latter with
a simple DPRINTF() (that does not cause bit-rot).

Signed-off-by: Dimitris Aragiorgis 
---
 block/raw-posix.c |   17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index fb58440..ff7dc4d 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -97,12 +97,17 @@
 //#define DEBUG_FLOPPY
 
 //#define DEBUG_BLOCK
-#if defined(DEBUG_BLOCK)
-#define DEBUG_BLOCK_PRINT(formatCstr, ...) do { if (qemu_log_enabled()) \
-{ qemu_log(formatCstr, ## __VA_ARGS__); qemu_log_flush(); } } while (0)
+
+#ifdef DEBUG_BLOCK
+# define DEBUG_BLOCK_PRINT 1
 #else
-#define DEBUG_BLOCK_PRINT(formatCstr, ...)
+# define DEBUG_BLOCK_PRINT 0
 #endif
+#define DPRINTF(fmt, ...) \
+do { \
+if (DEBUG_BLOCK_PRINT) \
+printf(fmt, ## __VA_ARGS__); \
+} while (0)
 
 /* OS X does not have O_DSYNC */
 #ifndef O_DSYNC
@@ -1023,7 +1028,7 @@ static int xfs_write_zeroes(BDRVRawState *s, int64_t 
offset, uint64_t bytes)
 fl.l_len = bytes;
 
 if (xfsctl(NULL, s->fd, XFS_IOC_ZERO_RANGE, &fl) < 0) {
-DEBUG_BLOCK_PRINT("cannot write zero range (%s)\n", strerror(errno));
+DPRINTF("cannot write zero range (%s)\n", strerror(errno));
 return -errno;
 }
 
@@ -1040,7 +1045,7 @@ static int xfs_discard(BDRVRawState *s, int64_t offset, 
uint64_t bytes)
 fl.l_len = bytes;
 
 if (xfsctl(NULL, s->fd, XFS_IOC_UNRESVSP64, &fl) < 0) {
-DEBUG_BLOCK_PRINT("cannot punch hole (%s)\n", strerror(errno));
+DPRINTF("cannot punch hole (%s)\n", strerror(errno));
 return -errno;
 }
 
-- 
1.7.10.4




[Qemu-devel] [PATCH v3 4/5] raw-posix: Use DPRINTF for DEBUG_FLOPPY

2015-05-14 Thread Dimitris Aragiorgis
Get rid of several #ifdef DEBUG_FLOPPY and substitute them with
DPRINTF.

Signed-off-by: Dimitris Aragiorgis 
---
 block/raw-posix.c |   22 +-
 1 file changed, 5 insertions(+), 17 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index ff7dc4d..82e522f 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -94,8 +94,6 @@
 #include 
 #endif
 
-//#define DEBUG_FLOPPY
-
 //#define DEBUG_BLOCK
 
 #ifdef DEBUG_BLOCK
@@ -2162,16 +2160,12 @@ static int fd_open(BlockDriverState *bs)
 (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - s->fd_open_time) >= 
FD_OPEN_TIMEOUT) {
 qemu_close(s->fd);
 s->fd = -1;
-#ifdef DEBUG_FLOPPY
-printf("Floppy closed\n");
-#endif
+DPRINTF("Floppy closed\n");
 }
 if (s->fd < 0) {
 if (s->fd_got_error &&
 (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - s->fd_error_time) < 
FD_OPEN_TIMEOUT) {
-#ifdef DEBUG_FLOPPY
-printf("No floppy (open delayed)\n");
-#endif
+DPRINTF("No floppy (open delayed)\n");
 return -EIO;
 }
 s->fd = qemu_open(bs->filename, s->open_flags & ~O_NONBLOCK);
@@ -2180,14 +2174,10 @@ static int fd_open(BlockDriverState *bs)
 s->fd_got_error = 1;
 if (last_media_present)
 s->fd_media_changed = 1;
-#ifdef DEBUG_FLOPPY
-printf("No floppy\n");
-#endif
+DPRINTF("No floppy\n");
 return -EIO;
 }
-#ifdef DEBUG_FLOPPY
-printf("Floppy opened\n");
-#endif
+DPRINTF("Floppy opened\n");
 }
 if (!last_media_present)
 s->fd_media_changed = 1;
@@ -2455,9 +2445,7 @@ static int floppy_media_changed(BlockDriverState *bs)
 fd_open(bs);
 ret = s->fd_media_changed;
 s->fd_media_changed = 0;
-#ifdef DEBUG_FLOPPY
-printf("Floppy changed=%d\n", ret);
-#endif
+DPRINTF("Floppy changed=%d\n", ret);
 return ret;
 }
 
-- 
1.7.10.4




[Qemu-devel] [PATCH v3 0/5] Some fixes related to scsi-generic

2015-05-14 Thread Dimitris Aragiorgis
Hi all,

These four patches make slight changes to the way QEMU handles SCSI
generic devices to fix a number of small problems.

I am sending them against the master branch, since I don't know if they
can be considered bugfixes.

Thanks,
dimara

v3 (rebased to current master):
* Avoid bit-rot in DPRINTF (adopt Eric's suggestion)
* Address Kevin's comments (DEBUF_FLOPPY, line > 80 chars, SG device)
* Mention Kevin's comment wrt disk flush in the corresponding commit

v2:
* remove duplicate check for sg inside iscsi_co_flush()
* remove DEBUG_BLOCK_PRINT in block/raw-posix.c
* use DPRINTF for debugging in block/raw-posix.c

PS: Paolo suggested to use a tracepoint inside hdev_is_sg() but I chose DPRINTF
instead. It would make sense to add a tracepoint for bdrv_is_sg() (just like
most bdrv_* commands) but this is too much for now since it just returns the
bs->sg flag (and is not an actual driver function). If you insist I'll change
it in v3.

Dimitris Aragiorgis (5):
  block: Use bdrv_is_sg() everywhere
  Fix migration in case of scsi-generic
  raw-posix: DPRINTF instead of DEBUG_BLOCK_PRINT
  raw-posix: Use DPRINTF for DEBUG_FLOPPY
  raw-posix: Introduce hdev_is_sg()

 block.c   |6 ++--
 block/io.c|3 +-
 block/iscsi.c |4 ---
 block/raw-posix.c |   82 ++---
 4 files changed, 51 insertions(+), 44 deletions(-)

-- 
1.7.10.4




[Qemu-devel] [PATCH v3 2/5] Fix migration in case of scsi-generic

2015-05-14 Thread Dimitris Aragiorgis
During migration, QEMU uses fsync()/fdatasync() on the open file
descriptor for read-write block devices to flush data just before
stopping the VM.

However, fsync() on a scsi-generic device returns -EINVAL which
causes the migration to fail. This patch skips flushing data in case
of an SG device, since submitting SCSI commands directly via an SG
character device (e.g. /dev/sg0) bypasses the page cache completely,
anyway.

Note that fsync() not only flushes the page cache but also the disk
cache. The scsi-generic device never sends flushes, and for
migration it assumes that the same SCSI device is used by the
destination host, so it does not issue any SCSI SYNCHRONIZE CACHE
(10) command.

Finally, remove the bdrv_is_sg() test from iscsi_co_flush() since
this is now redundant (we flush the underlying protocol at the end
of bdrv_co_flush() which, with this patch, we never reach).

Signed-off-by: Dimitris Aragiorgis 
---
 block/io.c|3 ++-
 block/iscsi.c |4 
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/block/io.c b/block/io.c
index 1ce62c4..922dc07 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2231,7 +2231,8 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
 {
 int ret;
 
-if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs)) {
+if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs) ||
+bdrv_is_sg(bs)) {
 return 0;
 }
 
diff --git a/block/iscsi.c b/block/iscsi.c
index 502a81f..965978b 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -627,10 +627,6 @@ static int coroutine_fn iscsi_co_flush(BlockDriverState 
*bs)
 IscsiLun *iscsilun = bs->opaque;
 struct IscsiTask iTask;
 
-if (bdrv_is_sg(bs)) {
-return 0;
-}
-
 if (!iscsilun->force_next_flush) {
 return 0;
 }
-- 
1.7.10.4




[Qemu-devel] [PATCH v3 1/5] block: Use bdrv_is_sg() everywhere

2015-05-14 Thread Dimitris Aragiorgis
Instead of checking bs->sg use bdrv_is_sg() consistently throughout
the code.

Signed-off-by: Dimitris Aragiorgis 
Reviewed-by: Paolo Bonzini 
---
 block.c   |6 +++---
 block/iscsi.c |2 +-
 block/raw-posix.c |4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/block.c b/block.c
index 7904098..d651b57 100644
--- a/block.c
+++ b/block.c
@@ -566,7 +566,7 @@ static int find_image_format(BlockDriverState *bs, const 
char *filename,
 int ret = 0;
 
 /* Return the raw BlockDriver * to scsi-generic devices or empty drives */
-if (bs->sg || !bdrv_is_inserted(bs) || bdrv_getlength(bs) == 0) {
+if (bdrv_is_sg(bs) || !bdrv_is_inserted(bs) || bdrv_getlength(bs) == 0) {
 *pdrv = &bdrv_raw;
 return ret;
 }
@@ -598,7 +598,7 @@ static int refresh_total_sectors(BlockDriverState *bs, 
int64_t hint)
 BlockDriver *drv = bs->drv;
 
 /* Do not attempt drv->bdrv_getlength() on scsi-generic devices */
-if (bs->sg)
+if (bdrv_is_sg(bs))
 return 0;
 
 /* query actual device if possible, otherwise just trust the hint */
@@ -890,7 +890,7 @@ static int bdrv_open_common(BlockDriverState *bs, 
BlockDriverState *file,
 }
 
 assert(bdrv_opt_mem_align(bs) != 0);
-assert((bs->request_alignment != 0) || bs->sg);
+assert((bs->request_alignment != 0) || bdrv_is_sg(bs));
 return 0;
 
 free_and_fail:
diff --git a/block/iscsi.c b/block/iscsi.c
index 8fca1d3..502a81f 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -627,7 +627,7 @@ static int coroutine_fn iscsi_co_flush(BlockDriverState *bs)
 IscsiLun *iscsilun = bs->opaque;
 struct IscsiTask iTask;
 
-if (bs->sg) {
+if (bdrv_is_sg(bs)) {
 return 0;
 }
 
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 24d8582..fb58440 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -302,9 +302,9 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
fd, Error **errp)
 BDRVRawState *s = bs->opaque;
 char *buf;
 
-/* For /dev/sg devices the alignment is not really used.
+/* For SCSI generic devices the alignment is not really used.
With buffered I/O, we don't have any restrictions. */
-if (bs->sg || !s->needs_alignment) {
+if (bdrv_is_sg(bs) || !s->needs_alignment) {
 bs->request_alignment = 1;
 s->buf_align = 1;
 return;
-- 
1.7.10.4




Re: [Qemu-devel] [PATCH v2 2/5] Fix migration in case of scsi-generic

2015-05-14 Thread Dimitris Aragiorgis
Hello Kevin,

* Kevin Wolf  [2015-05-11 12:12:23 +0200]:

> Am 08.05.2015 um 19:47 hat Dimitris Aragiorgis geschrieben:
> > During migration, QEMU uses fsync()/fdatasync() on the open file
> > descriptor for read-write block devices to flush data just before
> > stopping the VM.
> > 
> > However, fsync() on a scsi-generic device returns -EINVAL which
> > causes the migration to fail. This patch skips flushing data in case
> > of an SG device, since submitting SCSI commands directly via an SG
> > character device (e.g. /dev/sg0) bypasses the page cache completely,
> > anyway.
> 
> fsync() doesn't only flush the kernel page cache, but also the disk
> cache. The full justification includes at least that the scsi-generic
> device never sends flushes, and for migration it's assumed that the same
> SCSI device is used by the destination host (rather than another way to
> access the same backing storage). If this were not enough, we would have
> to send a SCSI flush command.
> 

If I understand correctly you mean that QEMU will not issue any
SCSI SYNCHRONIZE CACHE (10) command and thus the disk cache does not get
flushed during migration. This requires, as you say, that the same SCSI
device is used by the destination host.

Note taken. I have added your comment in the commit message of the
corresponding patch in v3 just sent to the list. Thanks for clarifying
this.

Also note that QEMU seems to flush the disk cache explicitly in case of
iSCSI, using iscsi_synchronizecache10_task() from libiscsi inside
iscsi_co_flush().

Perhaps we can also extend this approach to scsi-generic e.g. by calling
sg_ll_sync_cache_10() from libsgutils2. This is a distinct, orthogonal
patch that could be added in the future.

> On another note, I expect we still neglect to check bs_is_sg() in too
> many places. Any monitor command that does normal I/O on such a BDS
> should actually fail.
> 
> > Remove the bdrv_is_sg() test from iscsi_co_flush() since this is
> > now redundant (we flush the underlying protocol at the end
> > of bdrv_co_flush() which, with this patch, we never reach).
> > 
> > Signed-off-by: Dimitris Aragiorgis 
> > ---
> >  block/io.c|2 +-
> >  block/iscsi.c |4 
> >  2 files changed, 1 insertion(+), 5 deletions(-)
> > 
> > diff --git a/block/io.c b/block/io.c
> > index 1ce62c4..d248a4d 100644
> > --- a/block/io.c
> > +++ b/block/io.c
> > @@ -2231,7 +2231,7 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
> >  {
> >  int ret;
> >  
> > -if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs)) {
> > +if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs) || 
> > bdrv_is_sg(bs)) {
> 
> This line exceeds 80 characters.
> 
> Kevin


signature.asc
Description: Digital signature


[Qemu-devel] [PATCH v3 5/5] raw-posix: Introduce hdev_is_sg()

2015-05-14 Thread Dimitris Aragiorgis
Until now, an SG device was identified only by checking if its path
started with "/dev/sg". Then, hdev_open() set bs->sg accordingly.
This is very fragile, e.g. it fails with symlinks or relative paths.
We should rely on the actual properties of the device instead of the
specified file path.

Test for an SG device (e.g. /dev/sg0) by ensuring that all of the
following holds:

 - The device supports the SG_GET_VERSION_NUM ioctl
 - The device supports the SG_GET_SCSI_ID ioctl
 - The specified file name corresponds to a character device

Signed-off-by: Dimitris Aragiorgis 
---
 block/raw-posix.c |   39 ---
 1 file changed, 28 insertions(+), 11 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 82e522f..19b90d1 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -57,6 +57,7 @@
 #include 
 #include 
 #include 
+#include 
 #ifdef __s390__
 #include 
 #endif
@@ -2076,15 +2077,38 @@ static void hdev_parse_filename(const char *filename, 
QDict *options,
 qdict_put_obj(options, "filename", QOBJECT(qstring_from_str(filename)));
 }
 
+static int hdev_is_sg(BlockDriverState *bs)
+{
+
+#if defined(__linux__)
+
+struct stat st;
+struct sg_scsi_id scsiid;
+int sg_version;
+
+if (!bdrv_ioctl(bs, SG_GET_VERSION_NUM, &sg_version) &&
+!bdrv_ioctl(bs, SG_GET_SCSI_ID, &scsiid) &&
+stat(bs->filename, &st) >= 0 && S_ISCHR(st.st_mode)) {
+DPRINTF("SG device found: type=%d, version=%d\n",
+scsiid.scsi_type, sg_version);
+return 1;
+}
+
+#endif
+
+return 0;
+}
+
 static int hdev_open(BlockDriverState *bs, QDict *options, int flags,
  Error **errp)
 {
 BDRVRawState *s = bs->opaque;
 Error *local_err = NULL;
 int ret;
-const char *filename = qdict_get_str(options, "filename");
 
 #if defined(__APPLE__) && defined(__MACH__)
+const char *filename = qdict_get_str(options, "filename");
+
 if (strstart(filename, "/dev/cdrom", NULL)) {
 kern_return_t kernResult;
 io_iterator_t mediaIterator;
@@ -2113,16 +2137,6 @@ static int hdev_open(BlockDriverState *bs, QDict 
*options, int flags,
 #endif
 
 s->type = FTYPE_FILE;
-#if defined(__linux__)
-{
-char resolved_path[ MAXPATHLEN ], *temp;
-
-temp = realpath(filename, resolved_path);
-if (temp && strstart(temp, "/dev/sg", NULL)) {
-bs->sg = 1;
-}
-}
-#endif
 
 ret = raw_open_common(bs, options, flags, 0, &local_err);
 if (ret < 0) {
@@ -2132,6 +2146,9 @@ static int hdev_open(BlockDriverState *bs, QDict 
*options, int flags,
 return ret;
 }
 
+/* Since this does ioctl the device must be already opened */
+bs->sg = hdev_is_sg(bs);
+
 if (flags & BDRV_O_RDWR) {
 ret = check_hdev_writable(s);
 if (ret < 0) {
-- 
1.7.10.4




Re: [Qemu-devel] [RFC/RFT PATCH v2 1/3] arm/arm64: pageattr: add set_memory_nc

2015-05-14 Thread Christoffer Dall
On Wed, May 13, 2015 at 01:31:52PM +0200, Andrew Jones wrote:
> Provide a method to change normal, cacheable memory to non-cacheable.
> KVM will make use of this to keep emulated device memory regions
> coherent with the guest.
> 
> Signed-off-by: Andrew Jones 

Reviewed-by: Christoffer Dall 

But you obviously need Russell and Will/Catalin to ack/merge this.

-Christoffer



Re: [Qemu-devel] [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED

2015-05-14 Thread Laszlo Ersek
On 05/14/15 12:30, Christoffer Dall wrote:
> On Wed, May 13, 2015 at 01:31:51PM +0200, Andrew Jones wrote:
>> Introduce a new memory region flag, KVM_MEM_UNCACHED, which is
>> needed by ARM. This flag informs KVM that the given memory region
>> is typically mapped by the guest as non-cacheable. KVM for ARM
>> then ensures that that memory is indeed mapped non-cacheable by
>> the guest, and also remaps that region as non-cacheable for
>> userspace, allowing them both to maintain a coherent view.
>>
>> Changes since v1:
>>  1) don't pin pages [Paolo]
>>  2) ensure the guest maps the memory non-cacheable [me]
>>  3) clean up memslot flag documentation [Christoffer]
>> changes 1 and 2 effectively redesigned/rewrote v1. Find v1 here
>> http://www.spinics.net/lists/kvm-arm/msg14022.html
>>
>> The QEMU series for v1 hasn't really changed. Only the linux
>> header hack needed to bump KVM_CAP_UNCACHED_MEM from 107 to
>> 116.  Find the series here
>> http://www.spinics.net/lists/kvm-arm/msg14026.html
>>
>> Testing:
>> This series still needs lots of testing, but I thought I'd
>> kick it to the list early, as there's been recent interest
>> in solving this problem, and I'd like to get test results
>> and opinions on this approach from others sooner than later.
>> I've tested with AAVMF (UEFI for AArch64 mach-virt guests).
>> AAVMF has a kludge in it to avoid the coherency problem.
> 
> How does the 'kludge' work?

https://github.com/tianocore/edk2/commit/f9a8be42

(It's probably worth looking at the documentation in the first hunk too,
under the commit message.)

Thanks
Laszlo


> 
>> I've tested both with and without that kludge active. Both
>> worked for me (almost). Sometimes with the non-kludged
>> version I was still able to see a bit of corruption in
>> grub's output after edk2 loaded it - not much, and not always,
>> but something.
> 
> Remind me, this is a VGA framebuffer corruption with a PCI-plugged VGA
> card?
> 
> Thanks,
> -Christoffer
> 
>> Anyway, it's quite frustrating, as I'm not sure
>> what I'm missing...
>>
>> This series applies to Linus' 110bc76729d4, but I tested with
>> a version backported to the current RHELSA kernel.
>>
>> Thanks for reviews and testing!
>>
>> drew
>>
>>
>> Andrew Jones (3):
>>   arm/arm64: pageattr: add set_memory_nc
>>   KVM: promote KVM_MEMSLOT_INCOHERENT to uapi
>>   arm/arm64: KVM: implement 'uncached' mem coherency
>>
>>  Documentation/virtual/kvm/api.txt | 20 --
>>  arch/arm/include/asm/cacheflush.h |  1 +
>>  arch/arm/include/asm/kvm_mmu.h|  5 -
>>  arch/arm/include/asm/pgtable-3level.h |  1 +
>>  arch/arm/include/asm/pgtable.h|  1 +
>>  arch/arm/include/uapi/asm/kvm.h   |  1 +
>>  arch/arm/kvm/arm.c|  1 +
>>  arch/arm/kvm/mmu.c| 39 
>> ++-
>>  arch/arm/mm/pageattr.c|  7 +++
>>  arch/arm64/include/asm/cacheflush.h   |  1 +
>>  arch/arm64/include/asm/kvm_mmu.h  |  5 -
>>  arch/arm64/include/asm/memory.h   |  1 +
>>  arch/arm64/include/asm/pgtable.h  |  1 +
>>  arch/arm64/include/uapi/asm/kvm.h |  1 +
>>  arch/arm64/mm/pageattr.c  |  8 +++
>>  include/linux/kvm_host.h  |  1 -
>>  include/uapi/linux/kvm.h  |  2 ++
>>  virt/kvm/kvm_main.c   |  7 ++-
>>  18 files changed, 79 insertions(+), 24 deletions(-)
>>
>> -- 
>> 2.1.0
>>




Re: [Qemu-devel] [PATCH] Do not emulate a floppy drive when -nodefaults

2015-05-14 Thread Stefano Stabellini
On Wed, 13 May 2015, John Snow wrote:
> On 05/13/2015 02:15 PM, Stefano Stabellini wrote:
> > On Wed, 13 May 2015, Daniel P. Berrange wrote:
> >> On Wed, May 13, 2015 at 06:29:46PM +0100, Stefano Stabellini wrote:
> >>> Do not emulate a floppy drive if no drives are supposed to be present.
> >>>
> >>> This fixes the behavior of -nodefaults, that should remove the floppy
> >>> drive (see docs/qdev-device-use.txt:Default Devices), but actually
> >>> doesn't.
> >>
> >> Technically that doc is just refering to the disablement of the
> >> primary floppy drive, as opposed to the floppy controller. The
> >> floppy controller itself is really a built-in device that is
> >> defined as part of the machine type, along with the various other
> >> platform devices hanging off the PIIX and ISA brige.
> > 
> > I think you are right, this patch is a bit too harsh in fixing the
> > problem: I just wanted to properly disable drive emulation, because from
> > my tests the guest thinks that one drive is present even when is not.
> > 
> 
> We should just be a little careful in explaining the difference between
> the controller, the drives, and what circumstances things show up and to
> whom.
> 
> Currently the FDC is always present and always has two drive objects
> that are directly inlined to that device.
> 
> Now, based on whether or not this line fires in vl.c:
> default_drive(default_floppy, snapshot, IF_FLOPPY, 0, FD_OPTS);
> 
> controls whether or not we find that drive in pc_basic_device_init as
> you've found, which controls (ultimately) whether or not
> fdctrl->drives[0].blk or fdctrl->drives[1].blk gets set.
> 
> If the blk pointer is set, even if you have no media inserted,
> fd_revalidate (and pick_geometry) will *FORCIBLY PICK* a drive type for
> this floppy, currently a 1.44MB type, (20 sect, 80 track, 1 head) with a
> data rate of 500 Kbps. This is written to the rtc memory where seabios
> later reads it to discover if you have a guest-visible floppy drive there.
> 
> Both QEMU and SeaBIOS confuse the concept of "A drive is present" with
> "A disk is present" which leads to these kinds of confusing scenarios.
> 
> There's some work to do here, for sure.

That was my impression too.



On Thu, 14 May 2015, Stefan Weil wrote:
> Am 13.05.2015 um 20:15 schrieb Stefano Stabellini:
> > On Wed, 13 May 2015, Daniel P. Berrange wrote:
> > > On Wed, May 13, 2015 at 06:29:46PM +0100, Stefano Stabellini wrote:
> > > > Do not emulate a floppy drive if no drives are supposed to be present.
> > > > 
> > > > This fixes the behavior of -nodefaults, that should remove the floppy
> > > > drive (see docs/qdev-device-use.txt:Default Devices), but actually
> > > > doesn't.
> > > Technically that doc is just refering to the disablement of the
> > > primary floppy drive, as opposed to the floppy controller. The
> > > floppy controller itself is really a built-in device that is
> > > defined as part of the machine type, along with the various other
> > > platform devices hanging off the PIIX and ISA brige.
> > I think you are right, this patch is a bit too harsh in fixing the
> > problem: I just wanted to properly disable drive emulation, because from
> > my tests the guest thinks that one drive is present even when is not.
> 
> A short test on some of my physical machines shows that most
> of them don't have a floppy disk controller at all (dmesg | grep FDC).
> 
> Only some older machines still have one.
> 
> Therefore I think that QEMU must also be able to offer a virtual
> machine without an FDC, maybe as the default for the next
> version of QEMU.

I think it would make sense to make it the default for -nodefaults
machines. I would be OK with a new property too, as we could set it from
libxl or libvirt. Anybody would be happy to pick this one up or should I
do it?



Re: [Qemu-devel] [PATCH] Do not emulate a floppy drive when -nodefaults

2015-05-14 Thread Daniel P. Berrange
On Thu, May 14, 2015 at 12:12:52PM +0100, Stefano Stabellini wrote:
> On Wed, 13 May 2015, John Snow wrote:
> > On 05/13/2015 02:15 PM, Stefano Stabellini wrote:
> > > On Wed, 13 May 2015, Daniel P. Berrange wrote:
> > >> On Wed, May 13, 2015 at 06:29:46PM +0100, Stefano Stabellini wrote:
> > >>> Do not emulate a floppy drive if no drives are supposed to be present.
> > >>>
> > >>> This fixes the behavior of -nodefaults, that should remove the floppy
> > >>> drive (see docs/qdev-device-use.txt:Default Devices), but actually
> > >>> doesn't.
> > >>
> > >> Technically that doc is just refering to the disablement of the
> > >> primary floppy drive, as opposed to the floppy controller. The
> > >> floppy controller itself is really a built-in device that is
> > >> defined as part of the machine type, along with the various other
> > >> platform devices hanging off the PIIX and ISA brige.
> > > 
> > > I think you are right, this patch is a bit too harsh in fixing the
> > > problem: I just wanted to properly disable drive emulation, because from
> > > my tests the guest thinks that one drive is present even when is not.
> > > 
> > 
> > We should just be a little careful in explaining the difference between
> > the controller, the drives, and what circumstances things show up and to
> > whom.
> > 
> > Currently the FDC is always present and always has two drive objects
> > that are directly inlined to that device.
> > 
> > Now, based on whether or not this line fires in vl.c:
> > default_drive(default_floppy, snapshot, IF_FLOPPY, 0, FD_OPTS);
> > 
> > controls whether or not we find that drive in pc_basic_device_init as
> > you've found, which controls (ultimately) whether or not
> > fdctrl->drives[0].blk or fdctrl->drives[1].blk gets set.
> > 
> > If the blk pointer is set, even if you have no media inserted,
> > fd_revalidate (and pick_geometry) will *FORCIBLY PICK* a drive type for
> > this floppy, currently a 1.44MB type, (20 sect, 80 track, 1 head) with a
> > data rate of 500 Kbps. This is written to the rtc memory where seabios
> > later reads it to discover if you have a guest-visible floppy drive there.
> > 
> > Both QEMU and SeaBIOS confuse the concept of "A drive is present" with
> > "A disk is present" which leads to these kinds of confusing scenarios.
> > 
> > There's some work to do here, for sure.
> 
> That was my impression too.
> 
> 
> 
> On Thu, 14 May 2015, Stefan Weil wrote:
> > Am 13.05.2015 um 20:15 schrieb Stefano Stabellini:
> > > On Wed, 13 May 2015, Daniel P. Berrange wrote:
> > > > On Wed, May 13, 2015 at 06:29:46PM +0100, Stefano Stabellini wrote:
> > > > > Do not emulate a floppy drive if no drives are supposed to be present.
> > > > > 
> > > > > This fixes the behavior of -nodefaults, that should remove the floppy
> > > > > drive (see docs/qdev-device-use.txt:Default Devices), but actually
> > > > > doesn't.
> > > > Technically that doc is just refering to the disablement of the
> > > > primary floppy drive, as opposed to the floppy controller. The
> > > > floppy controller itself is really a built-in device that is
> > > > defined as part of the machine type, along with the various other
> > > > platform devices hanging off the PIIX and ISA brige.
> > > I think you are right, this patch is a bit too harsh in fixing the
> > > problem: I just wanted to properly disable drive emulation, because from
> > > my tests the guest thinks that one drive is present even when is not.
> > 
> > A short test on some of my physical machines shows that most
> > of them don't have a floppy disk controller at all (dmesg | grep FDC).
> > 
> > Only some older machines still have one.
> > 
> > Therefore I think that QEMU must also be able to offer a virtual
> > machine without an FDC, maybe as the default for the next
> > version of QEMU.
> 
> I think it would make sense to make it the default for -nodefaults
> machines. I would be OK with a new property too, as we could set it from
> libxl or libvirt. Anybody would be happy to pick this one up or should I
> do it?

It isn't permissible to change the hardware exposed to guests for existing
machine types, even when -nodefaults is used. Any change in defaults has
to be for new machine types only. eg we can't change pc-2.3.0 machine
type, but we could change defaults for the pc-2.4.0 machine type that
will be in next release. That ensures migration upgrade compatibility
for existing guests, while letting new guests get the new defaults.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [PATCH COLO v4 01/15] docs: block replication's description

2015-05-14 Thread Dr. David Alan Gilbert
* Wen Congyang (we...@cn.fujitsu.com) wrote:
> Signed-off-by: Wen Congyang 
> Signed-off-by: Yang Hongyang 
> Signed-off-by: zhanghailiang 
> Signed-off-by: Gonglei 
> ---
>  docs/block-replication.txt | 179 
> +
>  1 file changed, 179 insertions(+)
>  create mode 100644 docs/block-replication.txt
> 
> diff --git a/docs/block-replication.txt b/docs/block-replication.txt
> new file mode 100644
> index 000..a29f51a
> --- /dev/null
> +++ b/docs/block-replication.txt
> @@ -0,0 +1,179 @@
> +Block replication
> +
> +Copyright Fujitsu, Corp. 2015
> +Copyright (c) 2015 Intel Corporation
> +Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
> +
> +This work is licensed under the terms of the GNU GPL, version 2 or later.
> +See the COPYING file in the top-level directory.
> +
> +Block replication is used for continuous checkpoints. It is designed
> +for COLO (COurse-grain LOck-stepping) where the Secondary VM is running.
> +It can also be applied for FT/HA (Fault-tolerance/High Assurance) scenario,
> +where the Secondary VM is not running.
> +
> +This document gives an overview of block replication's design.
> +
> +== Background ==
> +High availability solutions such as micro checkpoint and COLO will do
> +consecutive checkpoints. The VM state of Primary VM and Secondary VM is
> +identical right after a VM checkpoint, but becomes different as the VM
> +executes till the next checkpoint. To support disk contents checkpoint,
> +the modified disk contents in the Secondary VM must be buffered, and are
> +only dropped at next checkpoint time. To reduce the network transportation
> +effort at the time of checkpoint, the disk modification operations of
> +Primary disk are asynchronously forwarded to the Secondary node.
> +
> +== Workflow ==
> +The following is the image of block replication workflow:
> +
> ++--+++
> +|Primary Write Requests||Secondary Write Requests|
> ++--+++
> +  |   |
> +  |  (4)
> +  |   V
> +  |  /-\
> +  |  Copy and Forward| |
> +  |-(1)--+   | Disk Buffer |
> +  |  |   | |
> +  | (3)  \-/
> +  | speculative  ^
> +  |write through(2)
> +  |  |   |
> +  V  V   |
> +   +--+   ++
> +   | Primary Disk |   | Secondary Disk |
> +   +--+   ++
> +
> +1) Primary write requests will be copied and forwarded to Secondary
> +   QEMU.
> +2) Before Primary write requests are written to Secondary disk, the
> +   original sector content will be read from Secondary disk and
> +   buffered in the Disk buffer, but it will not overwrite the existing
> +   sector content(it could be from either "Secondary Write Requests" or
> +   previous COW of "Primary Write Requests") in the Disk buffer.
> +3) Primary write requests will be written to Secondary disk.
> +4) Secondary write requests will be buffered in the Disk buffer and it
> +   will overwrite the existing sector content in the buffer.
> +
> +== Architecture ==
> +We are going to implement block replication from many basic
> +blocks that are already in QEMU.
> +
> + virtio-blk   ||
> + ^||.--
> + |||| Secondary
> +1 Quorum  ||'--
> + /  \ ||
> +/\||
> +   Primary2 filter
> + disk ^  
>virtio-blk
> +  |  
> ^
> +3 NBD  --->  3 NBD   
> |
> +client|| server  
> 2 filter
> +  ||^
> ^
> +. |||
> |
> +Primary | ||  Secondary disk <- hidden-disk 5 
> <- active-disk 4
> +' |||  backing^   backing
> +  ||   

Re: [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA

2015-05-14 Thread Maciej W. Rozycki
On Thu, 14 May 2015, Leon Alrae wrote:

> >  I don't think we have.  The specification is a bit unclear I must admit 
> > and it also defines the details of vector load and store operations as 
> > implementation dependent, so there's no further clarification.
> 
> This is specified in "MIPS Architecture For Programmers Volume I-A:
> Introduction to the MIPS64 Architecture", Revision: 6.01, Imagination
> Technologies, Document Number: MD00083, August 20, 2014, p.142:
> 
> "For example, a misaligned vector load instruction will never leave its
> vector destination register half written, if part of a page split
> succeeds and the other part takes an exception. It is either all done,
> or not at all."

 Thanks.  It's good to see r6 has clarified the matters around here and 
I think we can (and for simplicity ought to) apply them to r5 too.

  Maciej



Re: [Qemu-devel] [PATCH] monitor: suggest running "help" for command errors

2015-05-14 Thread Markus Armbruster
Bandan Das  writes:

> When a command fails due to incorrect syntax or input,
> suggest using the "help" command to get more information
> about the command. This is only applicable for HMP.
>
> Before:
> (qemu) drive_add usb_flash_drive
> drive_add: string expected
> After:
> (qemu) drive_add usb_flash_drive
> drive_add: string expected
> Try "help drive_add" for more information
>
> Signed-off-by: Bandan Das 
> ---
>  monitor.c | 28 +++-
>  1 file changed, 19 insertions(+), 9 deletions(-)
>
> diff --git a/monitor.c b/monitor.c
> index b2561e1..46e8880 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -939,7 +939,7 @@ static int qmp_async_cmd_handler(Monitor *mon, const 
> mon_cmd_t *cmd,
>  return cmd->mhandler.cmd_async(mon, params, qmp_monitor_complete, mon);
>  }
>  
> -static void user_async_cmd_handler(Monitor *mon, const mon_cmd_t *cmd,
> +static int user_async_cmd_handler(Monitor *mon, const mon_cmd_t *cmd,
> const QDict *params)
>  {
>  int ret;
> @@ -954,6 +954,8 @@ static void user_async_cmd_handler(Monitor *mon, const 
> mon_cmd_t *cmd,
>  monitor_resume(mon);
>  g_free(cb_data);
>  }
> +
> +return ret;
>  }
>  

user_async_cmd_handler() is unreachable since commit 3b5704b.  I got
code cleaning that up in my tree.  Don't worry about it.

>  static void hmp_info_help(Monitor *mon, const QDict *qdict)
> @@ -3698,7 +3700,8 @@ static const mon_cmd_t *monitor_parse_command(Monitor 
> *mon,
>const char *cmdline,
>int start,
>mon_cmd_t *table,
> -  QDict *qdict)
> +  QDict *qdict,
> +  int *failed)
>  {
>  const char *p, *typestr;
>  int c;
> @@ -3734,7 +3737,7 @@ static const mon_cmd_t *monitor_parse_command(Monitor 
> *mon,
>  return cmd;
>  }
>  return monitor_parse_command(mon, cmdline, p - cmdline,
> - cmd->sub_table, qdict);
> + cmd->sub_table, qdict, failed);
>  }
>  
>  /* parse the parameters */
> @@ -4084,8 +4087,9 @@ static const mon_cmd_t *monitor_parse_command(Monitor 
> *mon,
>  return cmd;
>  
>  fail:
> +*failed = 1;
>  g_free(key);
> -return NULL;
> +return cmd;
>  }
>  

>From the function's contract:

 * If @cmdline is blank, return NULL.
 * If it can't be parsed, report to @mon, and return NULL.
 * Else, insert command arguments into @qdict, and return the command.

Your patch splits the "it can't be parsed" case into "command not found"
and "arguments can't be parsed", but neglects to update the contract.
Needs fixing.  Perhaps like this:

 * If @cmdline is blank, return NULL.
 * If @cmdline doesn't start with a valid command name, report to @mon,
 * and return NULL.
 * Else, if the command's arguments can't be parsed, report to @mon,
 * store 1 through @failed, and return the command.
 * Else, insert command arguments into @qdict, and return the command.

The contract wasn't exactly pretty before, and I'm afraid it's
positively ugly now.

The common cause for such ugliness is doing too much in one function.
I'd try splitting into a command part and an argument part, so that

cmd = monitor_parse_command(mon, cmdline, start, table, qdict);
if (!cmd) {
// bail out
}

becomes something like

cmd = monitor_parse_command(mon, cmdline, start, table, &args);
if (!cmd) {
// bail out
}
qdict = monitor_parse_arguments(mon, cmd, args)
if (!qdict) {
// bail out
}

While I encourage you to do this, I won't reject your patch just because
you don't.

>  void monitor_set_error(Monitor *mon, QError *qerror)
> @@ -4114,20 +4118,22 @@ static void handle_user_command(Monitor *mon, const 
> char *cmdline)
>  {
>  QDict *qdict;
>  const mon_cmd_t *cmd;
> +int failed = 0;
>  
>  qdict = qdict_new();
>  
> -cmd = monitor_parse_command(mon, cmdline, 0, mon->cmd_table, qdict);
> -if (!cmd)
> +cmd = monitor_parse_command(mon, cmdline, 0, mon->cmd_table,
> +qdict, &failed);
> +if (!cmd || failed) {
>  goto out;
> +}

cmd implies !failed, therefore !cmd || failed == !cmd here.  Why not
simply stick to if (!cmd)?

>  
>  if (handler_is_async(cmd)) {
> -user_async_cmd_handler(mon, cmd, qdict);
> +failed = user_async_cmd_handler(mon, cmd, qdict);

As discussed above: unreachable, but don't worry about it.

>  } else if (handler_is_qobject(cmd)) {

This is the "new" HMP command interface.  It's used only with drive_del,
device_add, client_migrate_info, pcie_aer_inject_error.  The cleanup
work I mentioned above will get rid of it.  Again, don't worry.

>  

Re: [Qemu-devel] [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED

2015-05-14 Thread Christoffer Dall
On Thu, May 14, 2015 at 01:09:34PM +0200, Laszlo Ersek wrote:
> On 05/14/15 12:30, Christoffer Dall wrote:
> > On Wed, May 13, 2015 at 01:31:51PM +0200, Andrew Jones wrote:
> >> Introduce a new memory region flag, KVM_MEM_UNCACHED, which is
> >> needed by ARM. This flag informs KVM that the given memory region
> >> is typically mapped by the guest as non-cacheable. KVM for ARM
> >> then ensures that that memory is indeed mapped non-cacheable by
> >> the guest, and also remaps that region as non-cacheable for
> >> userspace, allowing them both to maintain a coherent view.
> >>
> >> Changes since v1:
> >>  1) don't pin pages [Paolo]
> >>  2) ensure the guest maps the memory non-cacheable [me]
> >>  3) clean up memslot flag documentation [Christoffer]
> >> changes 1 and 2 effectively redesigned/rewrote v1. Find v1 here
> >> http://www.spinics.net/lists/kvm-arm/msg14022.html
> >>
> >> The QEMU series for v1 hasn't really changed. Only the linux
> >> header hack needed to bump KVM_CAP_UNCACHED_MEM from 107 to
> >> 116.  Find the series here
> >> http://www.spinics.net/lists/kvm-arm/msg14026.html
> >>
> >> Testing:
> >> This series still needs lots of testing, but I thought I'd
> >> kick it to the list early, as there's been recent interest
> >> in solving this problem, and I'd like to get test results
> >> and opinions on this approach from others sooner than later.
> >> I've tested with AAVMF (UEFI for AArch64 mach-virt guests).
> >> AAVMF has a kludge in it to avoid the coherency problem.
> > 
> > How does the 'kludge' work?
> 
> https://github.com/tianocore/edk2/commit/f9a8be42
> 
> (It's probably worth looking at the documentation in the first hunk too,
> under the commit message.)
> 
Why is this a hack/unintuitive?  Is the semantics of the QEMU PCI bus
not simply that MMIO regions are coherent?

-Christoffer



[Qemu-devel] [Bug 1453436] Re: Building on OS X: Undefined symbols ___emutls_v.prng_state and ___emutls_v.prng_state_data

2015-05-14 Thread Molt
Ah well, I only have the "normal" PATH set, not library or include path.
But I managed to build qemu now by just building pixman separately.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1453436

Title:
  Building on OS X: Undefined symbols ___emutls_v.prng_state and
  ___emutls_v.prng_state_data

Status in QEMU:
  New

Bug description:
  Trying to build qemu on my system fails during linking with the error:

  Undefined symbols for architecture x86_64:
"___emutls_v.prng_state", referenced from:
_main in region-test.o
__GLOBAL__sub_I_65535_0_region_test.c in region-test.o
"___emutls_v.prng_state_data", referenced from:
_main in region-test.o
__GLOBAL__sub_I_65535_0_region_test.c in region-test.o

  My setup:

  OS: OS X 10.10.3, 64bit
  gcc: 5.1.0
  clang: 6.1.0

  configure command:

  configure --prefix="$HOME/local" --cc=clang --host-cc=clang
  --cxx=clang++

  It makes no difference whether I try to build in the source directory or 
somewhere else.
  It is the same for qemu release 2.3.0 and qemu 
git@f8340b360b9bc29d48716ba8aca79df2b9544979.

  Now this is clearly happening in the pixman submodule, but it does not
  seem to be a pixman issue, as I can clone
  git://anongit.freedesktop.org/pixman
  @cf086d4949092861dc3729465a3881d229cc1060 and build it without any
  errors with just :

  configure --prefix="$HOME/local"
  make

  It also works with

  configure --prefix="$HOME/local" CC=clang CXX=clang++
  make

  although then OpenMP is disabled.
  Also, running

  nm qemu/pixman/test/utils.o

  gives me (amongst other stuff):

  0020 C ___emutls_v.prng_state
  0020 C ___emutls_v.prng_state_data

  So the symbols are actually there, it's really just linking that
  fails.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1453436/+subscriptions



Re: [Qemu-devel] [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED

2015-05-14 Thread Paolo Bonzini


On 14/05/2015 13:29, Christoffer Dall wrote:
> > (It's probably worth looking at the documentation in the first hunk too,
> > under the commit message.)
> 
> Why is this a hack/unintuitive?  Is the semantics of the QEMU PCI bus
> not simply that MMIO regions are coherent?

Only until device assignment gets into the picture.

Paolo



Re: [Qemu-devel] [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED

2015-05-14 Thread Christoffer Dall
On Thu, May 14, 2015 at 01:31:03PM +0200, Paolo Bonzini wrote:
> 
> 
> On 14/05/2015 13:29, Christoffer Dall wrote:
> > > (It's probably worth looking at the documentation in the first hunk too,
> > > under the commit message.)
> > 
> > Why is this a hack/unintuitive?  Is the semantics of the QEMU PCI bus
> > not simply that MMIO regions are coherent?
> 
> Only until device assignment gets into the picture.
> 
Will UEFI have to deal with device assignment in any respect?

-Christoffer



Re: [Qemu-devel] [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED

2015-05-14 Thread Paolo Bonzini


On 14/05/2015 13:36, Christoffer Dall wrote:
> > > > (It's probably worth looking at the documentation in the first hunk too,
> > > > under the commit message.)
> > > 
> > > Why is this a hack/unintuitive?  Is the semantics of the QEMU PCI bus
> > > not simply that MMIO regions are coherent?
> > 
> > Only until device assignment gets into the picture.
> 
> Will UEFI have to deal with device assignment in any respect?

Why not?  For example you could do network boot from an assigned network
card.

In fact, anything that UEFI has to deal with, the OS has to deal with
too.  If you need a UEFI hack, chances are you need or will need a Linux
hack too.

Paolo



Re: [Qemu-devel] [PATCH 2/2] scripts: qom-tree: add support of path as argument

2015-05-14 Thread Martin Cerveny

Hello.

Ternary/if/else >= python 2.5 (I use the same coding as in 
scripts/qmp/qemu-ga-client).
"import json" >= python 2.6 (scripts/qmp/qmp.py)
"import argparse" >= python 2.7 (scripts/analyze-migration.py, 
scripts/vmstate-static-checker.py)
"import optparse" < python 2.7 (deprecated, scripts/qmp/qemu-ga-client)


Of course there is no problem to use traditional syntax.
(My platform (centos5.10) has python2.4, I must also replace json imports:
try:
import json
except ImportError:
import simplejson as json
)

Which version of python is officialy minimum supported ?

Thanks for explanation of python status.

M.C>

On Thu, 14 May 2015, Paolo Bonzini wrote:




On 13/05/2015 14:14, Martin Cerveny wrote:

 for item in items:
 if item['type'].startswith('child<'):
-list_node(path + '/' + item['name'])
+list_node((path if (path != '/') else '')  + '/' + item['name'])


I'm not sure which Python version introduced if...else.  The more
traditional idiom would be

path != '/' and path or ''

Can you use it, and move the expression out of the 'for item in items'
loop into a variable?

Paolo





Re: [Qemu-devel] [PATCH] Do not emulate a floppy drive when -nodefaults

2015-05-14 Thread Michael S. Tsirkin
On Thu, May 14, 2015 at 12:18:26PM +0100, Daniel P. Berrange wrote:
> On Thu, May 14, 2015 at 12:12:52PM +0100, Stefano Stabellini wrote:
> > On Wed, 13 May 2015, John Snow wrote:
> > > On 05/13/2015 02:15 PM, Stefano Stabellini wrote:
> > > > On Wed, 13 May 2015, Daniel P. Berrange wrote:
> > > >> On Wed, May 13, 2015 at 06:29:46PM +0100, Stefano Stabellini wrote:
> > > >>> Do not emulate a floppy drive if no drives are supposed to be present.
> > > >>>
> > > >>> This fixes the behavior of -nodefaults, that should remove the floppy
> > > >>> drive (see docs/qdev-device-use.txt:Default Devices), but actually
> > > >>> doesn't.
> > > >>
> > > >> Technically that doc is just refering to the disablement of the
> > > >> primary floppy drive, as opposed to the floppy controller. The
> > > >> floppy controller itself is really a built-in device that is
> > > >> defined as part of the machine type, along with the various other
> > > >> platform devices hanging off the PIIX and ISA brige.
> > > > 
> > > > I think you are right, this patch is a bit too harsh in fixing the
> > > > problem: I just wanted to properly disable drive emulation, because from
> > > > my tests the guest thinks that one drive is present even when is not.
> > > > 
> > > 
> > > We should just be a little careful in explaining the difference between
> > > the controller, the drives, and what circumstances things show up and to
> > > whom.
> > > 
> > > Currently the FDC is always present and always has two drive objects
> > > that are directly inlined to that device.
> > > 
> > > Now, based on whether or not this line fires in vl.c:
> > > default_drive(default_floppy, snapshot, IF_FLOPPY, 0, FD_OPTS);
> > > 
> > > controls whether or not we find that drive in pc_basic_device_init as
> > > you've found, which controls (ultimately) whether or not
> > > fdctrl->drives[0].blk or fdctrl->drives[1].blk gets set.
> > > 
> > > If the blk pointer is set, even if you have no media inserted,
> > > fd_revalidate (and pick_geometry) will *FORCIBLY PICK* a drive type for
> > > this floppy, currently a 1.44MB type, (20 sect, 80 track, 1 head) with a
> > > data rate of 500 Kbps. This is written to the rtc memory where seabios
> > > later reads it to discover if you have a guest-visible floppy drive there.
> > > 
> > > Both QEMU and SeaBIOS confuse the concept of "A drive is present" with
> > > "A disk is present" which leads to these kinds of confusing scenarios.
> > > 
> > > There's some work to do here, for sure.
> > 
> > That was my impression too.
> > 
> > 
> > 
> > On Thu, 14 May 2015, Stefan Weil wrote:
> > > Am 13.05.2015 um 20:15 schrieb Stefano Stabellini:
> > > > On Wed, 13 May 2015, Daniel P. Berrange wrote:
> > > > > On Wed, May 13, 2015 at 06:29:46PM +0100, Stefano Stabellini wrote:
> > > > > > Do not emulate a floppy drive if no drives are supposed to be 
> > > > > > present.
> > > > > > 
> > > > > > This fixes the behavior of -nodefaults, that should remove the 
> > > > > > floppy
> > > > > > drive (see docs/qdev-device-use.txt:Default Devices), but actually
> > > > > > doesn't.
> > > > > Technically that doc is just refering to the disablement of the
> > > > > primary floppy drive, as opposed to the floppy controller. The
> > > > > floppy controller itself is really a built-in device that is
> > > > > defined as part of the machine type, along with the various other
> > > > > platform devices hanging off the PIIX and ISA brige.
> > > > I think you are right, this patch is a bit too harsh in fixing the
> > > > problem: I just wanted to properly disable drive emulation, because from
> > > > my tests the guest thinks that one drive is present even when is not.
> > > 
> > > A short test on some of my physical machines shows that most
> > > of them don't have a floppy disk controller at all (dmesg | grep FDC).
> > > 
> > > Only some older machines still have one.
> > > 
> > > Therefore I think that QEMU must also be able to offer a virtual
> > > machine without an FDC, maybe as the default for the next
> > > version of QEMU.
> > 
> > I think it would make sense to make it the default for -nodefaults
> > machines. I would be OK with a new property too, as we could set it from
> > libxl or libvirt. Anybody would be happy to pick this one up or should I
> > do it?
> 
> It isn't permissible to change the hardware exposed to guests for existing
> machine types, even when -nodefaults is used. Any change in defaults has
> to be for new machine types only. eg we can't change pc-2.3.0 machine
> type, but we could change defaults for the pc-2.4.0 machine type that
> will be in next release. That ensures migration upgrade compatibility
> for existing guests, while letting new guests get the new defaults.
> 
> Regards,
> Daniel

For libvirt, yes. But command-line users might rely on the old behaviour
with e.g. -M pc so we shouldn't change defaults lightly even for new
machine types.



> -- 
> |: http://berrange.com  -o-http://www.flickr.com/photos/dberr

Re: [Qemu-devel] [PATCH] Do not emulate a floppy drive when -nodefaults

2015-05-14 Thread Michael S. Tsirkin
On Thu, May 14, 2015 at 12:12:52PM +0100, Stefano Stabellini wrote:
> I would be OK with a new property too, as we could set it from
> libxl or libvirt. Anybody would be happy to pick this one up or should I
> do it?

Pls go ahead, I can merge it in the pc tree.

-- 
MST



Re: [Qemu-devel] [PATCH 2/2] scripts: qom-tree: add support of path as argument

2015-05-14 Thread Paolo Bonzini


On 14/05/2015 13:41, Martin Cerveny wrote:
> Hello.
> 
> Ternary/if/else >= python 2.5 (I use the same coding as in
> scripts/qmp/qemu-ga-client).

Oh, that's old.  Shows my knowledge of Python.

Then your patch is okay---thanks for enduring with me. :)

Paolo

> "import json" >= python 2.6 (scripts/qmp/qmp.py)
> "import argparse" >= python 2.7 (scripts/analyze-migration.py,
> scripts/vmstate-static-checker.py)
> "import optparse" < python 2.7 (deprecated, scripts/qmp/qemu-ga-client)
> 
> 
> Of course there is no problem to use traditional syntax.
> (My platform (centos5.10) has python2.4, I must also replace json imports:
> try:
> import json
> except ImportError:
> import simplejson as json
> )
> 
> Which version of python is officialy minimum supported ?
> 
> Thanks for explanation of python status.
> 
> M.C>



Re: [Qemu-devel] [PATCH] Do not emulate a floppy drive when -nodefaults

2015-05-14 Thread Paolo Bonzini


On 14/05/2015 13:47, Michael S. Tsirkin wrote:
> > I would be OK with a new property too, as we could set it from
> > libxl or libvirt. Anybody would be happy to pick this one up or should I
> > do it?
>
> Pls go ahead, I can merge it in the pc tree.

Note that because of the "-drive if=none,id=fdd1 -global
isa-fdc.driveA=fdd1" trick to create a floppy drive without if=floppy,
the default should remain on---at least for a few releases---even if
-nodefaults is passed.

Paolo



Re: [Qemu-devel] [PATCH] Do not emulate a floppy drive when -nodefaults

2015-05-14 Thread Michael S. Tsirkin
On Thu, May 14, 2015 at 01:54:22PM +0200, Paolo Bonzini wrote:
> 
> 
> On 14/05/2015 13:47, Michael S. Tsirkin wrote:
> > > I would be OK with a new property too, as we could set it from
> > > libxl or libvirt. Anybody would be happy to pick this one up or should I
> > > do it?
> >
> > Pls go ahead, I can merge it in the pc tree.
> 
> Note that because of the "-drive if=none,id=fdd1 -global
> isa-fdc.driveA=fdd1" trick to create a floppy drive without if=floppy,
> the default should remain on---at least for a few releases---even if
> -nodefaults is passed.
> 
> Paolo

Exactly.

-- 
MST



Re: [Qemu-devel] [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED

2015-05-14 Thread Christoffer Dall
On Thu, May 14, 2015 at 01:38:38PM +0200, Paolo Bonzini wrote:
> 
> 
> On 14/05/2015 13:36, Christoffer Dall wrote:
> > > > > (It's probably worth looking at the documentation in the first hunk 
> > > > > too,
> > > > > under the commit message.)
> > > > 
> > > > Why is this a hack/unintuitive?  Is the semantics of the QEMU PCI bus
> > > > not simply that MMIO regions are coherent?
> > > 
> > > Only until device assignment gets into the picture.
> > 
> > Will UEFI have to deal with device assignment in any respect?
> 
> Why not?  For example you could do network boot from an assigned network
> card.
> 
> In fact, anything that UEFI has to deal with, the OS has to deal with
> too.  If you need a UEFI hack, chances are you need or will need a Linux
> hack too.
> 
Fair enough.  I was thinking that UEFI needs to be built with knowledge
of all the hardware present including any passthrough devices, but I
guess this is plainly not true with PCI (and might not even be true with
the level of DT parsing we do for the virtual platform).

So, getting back to my original question.  Is the point then that UEFI
must assume (from ACPI/DT) the cache-coherency properties of the PCI
controller which exists in hardware on the system you're running on,
even for the virtual PCI bus because that will be the semantics for
assigned devices?

And in that case, we have no way to distinguish between passthrough
devices and virtual devices plugged into the virtual PCI bus?

What about the idea of having two virtual PCI buses on your system where
one is always cache-coherent and uses for virtual devices, and the other
is whatever the hardware is and used for passthrough devices?

-Christoffer



Re: [Qemu-devel] [PATCH v2] Add virt-v3 machine that uses GIC-500

2015-05-14 Thread Shlomo Pongratz
Hi Pavel,

Please see in-line.
Best regards,

S.P.


From: Pavel Fedin [p.fe...@samsung.com]
Sent: Wednesday, May 13, 2015 4:57 PM
To: Shlomo Pongratz; qemu-devel@nongnu.org
Subject: RE: [Qemu-devel] [PATCH v2] Add virt-v3 machine that uses GIC-500

 Hello!

> 1. In fdt_add_timer why you didn't used the 24 bit limit I posed on the 
> irqflags? Please
note that
> the argument is 32 bits wide and 8 bits are for flags.

  Simply missed it when checking for differences. Please fix. :) Perhaps it is 
the reason
why >=24 CPUs fail for you.

I wonder how it works for you. Do you aware of an alternative way to configure 
the clock irqflags for more then 24 cores, or is it just ignored. According to 
Linux documentation this fdt field sets the clock IRQ affinity.

My current status is as follows:
With 64 cores there is no printouts what so ever.  
With 32 cores the boot usually get stuck after the message "[   45.719102] SCSI 
subsystem initialized"
With 24 cores the system noontimes complete the boot and sometimes get stuck 
like the 32 cores system.

> 2. In machvirt_init, I used TYPE_AARCH64_CPU while you reverted it to 
> TYPE_ARM_CPU, I
> assume this is because you want to support cortex-a15. Don't you think It 
> should be
according
> to the cortex type?

 Yes, i just left it as it was because it already works fine with ARM64. 
Actually,
TYPE_AARCH64_CPU is a subclass of TYPE_ARM_CPU.

I see but I guess that I want aarch64_cpu_initfn to be called and not 
arm_cpu_initfn.

> (BTW you removed cortex-a53).

 Yes, because i didn't see how it is different from a57 (or a15). I tried to 
follow
minimal intervention principle.
 But perhaps i was wrong because there was real support for a53 added recently:
http://lists.nongnu.org/archive/html/qemu-devel/2015-05/msg01304.html, so feel 
free to
re-add it back.

I agree with you I think this should wait for the patch you mentioned above to 
be integrated.

 BTW, just for interest, have you tried to do anything with KVM support of 
vGICv3? I have
some code but it's inherently unstable and lock up for unknown (yet) reason.

No, this is because I don't have an ARM64 based server needed for running KVM 
for ARM64.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia





Re: [Qemu-devel] [PATCH] Do not emulate a floppy drive when -nodefaults

2015-05-14 Thread Markus Armbruster
"Daniel P. Berrange"  writes:

> On Thu, May 14, 2015 at 12:12:52PM +0100, Stefano Stabellini wrote:
>> On Wed, 13 May 2015, John Snow wrote:
>> > On 05/13/2015 02:15 PM, Stefano Stabellini wrote:
>> > > On Wed, 13 May 2015, Daniel P. Berrange wrote:
>> > >> On Wed, May 13, 2015 at 06:29:46PM +0100, Stefano Stabellini wrote:
>> > >>> Do not emulate a floppy drive if no drives are supposed to be present.
>> > >>>
>> > >>> This fixes the behavior of -nodefaults, that should remove the floppy
>> > >>> drive (see docs/qdev-device-use.txt:Default Devices), but actually
>> > >>> doesn't.
>> > >>
>> > >> Technically that doc is just refering to the disablement of the
>> > >> primary floppy drive, as opposed to the floppy controller. The
>> > >> floppy controller itself is really a built-in device that is
>> > >> defined as part of the machine type, along with the various other
>> > >> platform devices hanging off the PIIX and ISA brige.
>> > > 
>> > > I think you are right, this patch is a bit too harsh in fixing the
>> > > problem: I just wanted to properly disable drive emulation, because from
>> > > my tests the guest thinks that one drive is present even when is not.
>> > > 
>> > 
>> > We should just be a little careful in explaining the difference between
>> > the controller, the drives, and what circumstances things show up and to
>> > whom.
>> > 
>> > Currently the FDC is always present and always has two drive objects
>> > that are directly inlined to that device.
>> > 
>> > Now, based on whether or not this line fires in vl.c:
>> > default_drive(default_floppy, snapshot, IF_FLOPPY, 0, FD_OPTS);
>> > 
>> > controls whether or not we find that drive in pc_basic_device_init as
>> > you've found, which controls (ultimately) whether or not
>> > fdctrl->drives[0].blk or fdctrl->drives[1].blk gets set.
>> > 
>> > If the blk pointer is set, even if you have no media inserted,
>> > fd_revalidate (and pick_geometry) will *FORCIBLY PICK* a drive type for
>> > this floppy, currently a 1.44MB type, (20 sect, 80 track, 1 head) with a
>> > data rate of 500 Kbps. This is written to the rtc memory where seabios
>> > later reads it to discover if you have a guest-visible floppy drive there.
>> > 
>> > Both QEMU and SeaBIOS confuse the concept of "A drive is present" with
>> > "A disk is present" which leads to these kinds of confusing scenarios.
>> > 
>> > There's some work to do here, for sure.
>> 
>> That was my impression too.
>> 
>> 
>> 
>> On Thu, 14 May 2015, Stefan Weil wrote:
>> > Am 13.05.2015 um 20:15 schrieb Stefano Stabellini:
>> > > On Wed, 13 May 2015, Daniel P. Berrange wrote:
>> > > > On Wed, May 13, 2015 at 06:29:46PM +0100, Stefano Stabellini wrote:
>> > > > > Do not emulate a floppy drive if no drives are supposed to be 
>> > > > > present.
>> > > > > 
>> > > > > This fixes the behavior of -nodefaults, that should remove the floppy
>> > > > > drive (see docs/qdev-device-use.txt:Default Devices), but actually
>> > > > > doesn't.
>> > > > Technically that doc is just refering to the disablement of the
>> > > > primary floppy drive, as opposed to the floppy controller. The
>> > > > floppy controller itself is really a built-in device that is
>> > > > defined as part of the machine type, along with the various other
>> > > > platform devices hanging off the PIIX and ISA brige.
>> > > I think you are right, this patch is a bit too harsh in fixing the
>> > > problem: I just wanted to properly disable drive emulation, because from
>> > > my tests the guest thinks that one drive is present even when is not.
>> > 
>> > A short test on some of my physical machines shows that most
>> > of them don't have a floppy disk controller at all (dmesg | grep FDC).
>> > 
>> > Only some older machines still have one.
>> > 
>> > Therefore I think that QEMU must also be able to offer a virtual
>> > machine without an FDC, maybe as the default for the next
>> > version of QEMU.
>> 
>> I think it would make sense to make it the default for -nodefaults
>> machines. I would be OK with a new property too, as we could set it from
>> libxl or libvirt. Anybody would be happy to pick this one up or should I
>> do it?
>
> It isn't permissible to change the hardware exposed to guests for existing
> machine types, even when -nodefaults is used. Any change in defaults has
> to be for new machine types only. eg we can't change pc-2.3.0 machine
> type, but we could change defaults for the pc-2.4.0 machine type that
> will be in next release. That ensures migration upgrade compatibility
> for existing guests, while letting new guests get the new defaults.

Correct.

Here's how I think it should be done:

* Create a machine option to control the FDC

  This is a machine-specific option.  It should only exist for machine
  types that have an optional FDC.

  Default must be "on" for old machine types.  Default may be "off" for
  new machine types.

  It should certainly be off for pc-q35-2.4 and newer.  Real Q35 boards
  commonly don't have an 

Re: [Qemu-devel] [PATCH] block: Let bdrv_drain_all() to call aio_poll() for each AioContext

2015-05-14 Thread Paolo Bonzini


On 13/05/2015 18:34, Alexander Yarygin wrote:
> Ah, right. We need second loop, something like this:
> 
> @@ -2030,20 +2033,33 @@ void bdrv_drain(BlockDriverState *bs)
>  void bdrv_drain_all(void)
>  {
>  /* Always run first iteration so any pending completion BHs run */
> -bool busy = true;
> +bool busy = true, pending = false;
>  BlockDriverState *bs;
> +GList *aio_ctxs = NULL, *ctx;
> +AioContext *aio_context;
> 
>  while (busy) {
>  busy = false;
> 
>  QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
> -AioContext *aio_context = bdrv_get_aio_context(bs);
> +aio_context = bdrv_get_aio_context(bs);
> 
>  aio_context_acquire(aio_context);
>  busy |= bdrv_drain_one(bs);
>  aio_context_release(aio_context);
> +if (!aio_ctxs || !g_list_find(aio_ctxs, aio_context))
> +aio_ctxs = g_list_append(aio_ctxs, aio_context);
> +}
> +pending = busy;
> +
> +for (ctx = aio_ctxs; ctx != NULL; ctx = ctx->next) {
> +aio_context = ctx->data;
> +aio_context_acquire(aio_context);
> +busy |= aio_poll(aio_context, pending);
> +aio_context_release(aio_context);
>  }
>  }
> +g_list_free(aio_ctxs);
>  }
> 
> That looks quite ugly for me and breaks consistence of bdrv_drain_one()
> since it doesn't call aio_poll() anymore...

It's not ugly.  After your patch bdrv_drain_one doesn't call aio_poll,
while bdrv_drain and bdrv_drain_all call bdrv_drain_one + aio_poll.  All
callers of bdrv_drain_one are consistent.

Perhaps you can rename bdrv_drain_one to bdrv_flush_io_queue (inlining
the existing bdrv_flush_io_queue into it)?  That would work very well
for me.

Paolo



Re: [Qemu-devel] [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED

2015-05-14 Thread Paolo Bonzini


On 14/05/2015 14:00, Christoffer Dall wrote:
> So, getting back to my original question.  Is the point then that UEFI
> must assume (from ACPI/DT) the cache-coherency properties of the PCI
> controller which exists in hardware on the system you're running on,
> even for the virtual PCI bus because that will be the semantics for
> assigned devices?
> 
> And in that case, we have no way to distinguish between passthrough
> devices and virtual devices plugged into the virtual PCI bus?

Well, we could use the subsystem id.  But it's a hack, and may cause
incompatibilities with some drivers.  Michael, any ideas?

> What about the idea of having two virtual PCI buses on your system where
> one is always cache-coherent and uses for virtual devices, and the other
> is whatever the hardware is and used for passthrough devices?

I think that was rejected before.

Paolo



Re: [Qemu-devel] [PATCH] Do not emulate a floppy drive when -nodefaults

2015-05-14 Thread Paolo Bonzini


On 14/05/2015 14:02, Markus Armbruster wrote:
>   It should certainly be off for pc-q35-2.4 and newer.  Real Q35 boards
>   commonly don't have an FDC (depends on the Super I/O chip used).
> 
>   We may want to keep it off for pc-i440fx-2.4 and newer.  I doubt
>   there's a real i440FX without an FDC, but our virtual i440FX is quite
>   unlike a real one in other ways already.

That would break libvirt for people upgrading from 2.3 to 2.4.  So it's
more like pc-i440fx-3.0 and pc-q35-3.0.  Unless for q35 we decide to
break everything and retroactively nuke the controller.

(I'm still not sure why we have backwards-compatible machine types for q35).

Paolo

> * Create the FDC only if the option is "on".
> 
> * Optional: make -drive if=floppy,... auto-enable it
> 
>   I wouldn't bother doing the same for -global isa-fdc.driveA=... and
>   such.



Re: [Qemu-devel] [RFC PATCH v4 00/28] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service

2015-05-14 Thread Dr. David Alan Gilbert
* zhanghailiang (zhang.zhanghaili...@huawei.com) wrote:
> This is the 4th version of COLO, here is only COLO frame part, include: VM 
> checkpoint,
> failover, proxy API, block replication API, not include block replication.
> The block part has been sent by wencongyang:
> [RFC PATCH COLO v2 00/13] Block replication for continuous checkpoints
> 
> Compared with last version, there aren't too much optimize and new functions.
> The main reason is that there is an known issue that still unsolved, we found
> some dirty pages which have been missed setting bit in corresponding bitmap.
> And it will trigger strange problem in VM.
> We hope to resolve it before add more codes.
> 
> You can get the newest integrated qemu colo patches from github:
> https://github.com/coloft/qemu/commits/colo-v1.1

I thought I'd just say I've got the remotes/origin/colo_huawei_v4.7 off Wen's
git running here on one of my pair of machines.

This set of hosts is mostly OK running colo; the proxy module still gets
upset sometimes; but mostly on errors;  if the colo-proxy-script fails
during startup, I get RCU stalls; but if the script works, colo normally
works on this setup.

Dave

> About how to test COLO, Please reference to the follow link.
> http://wiki.qemu.org/Features/COLO.
> 
> Please review and test.
> 
> Known issue still unsolved:
> (1) Some pages dirtied without setting its corresponding dirty-bitmap.
> 
> Previous posted RFC patch series:
> http://lists.nongnu.org/archive/html/qemu-devel/2014-06/msg05567.html
> http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg04459.html
> https://lists.gnu.org/archive/html/qemu-devel/2015-02/msg04771.html
> 
> TODO list:
> 1 Optimize the process of checkpoint, shorten the time-consuming:
>   (Partly done, patch is not include into this series)
>1) separate ram and device save/load process to reduce size of extra memory
>   used during checkpoint
>2) live migrate part of dirty pages to slave during sleep time.
> 2 Add more debug/stat info
>   (Partly done, patch is not include into this series)
>   include checkpoint count, proxy discompare count, downtime,
>number of live migrated pages, total sent pages, etc.
> 3 Strengthen failover
> 4 optimize proxy part, include proxy script.
> 5 The capability of continuous FT
> 
> v4:
> - New block replication scheme (use image-fleecing for sencondary side)
> - Adress some comments from Eric Blake and Dave
> - Add commmand colo-set-checkpoint-period to set the time of periodic 
> checkpoint
> - Add a delay (100ms) between continuous checkpoint requests to ensure VM
>   run 100ms at least since last pause.
> 
> v3:
> - use proxy instead of colo agent to compare network packets
> - add block replication
> - Optimize failover disposal
> - handle shutdown
> 
> v2:
> - use QEMUSizedBuffer/QEMUFile as COLO buffer
> - colo support is enabled by default
> - add nic replication support
> - addressed comments from Eric Blake and Dr. David Alan Gilbert
> 
> v1:
> - implement the frame of colo
> 
> Wen Congyang (1):
>   COLO: Add block replication into colo process
> 
> zhanghailiang (27):
>   configure: Add parameter for configure to enable/disable COLO support
>   migration: Introduce capability 'colo' to migration
>   COLO: migrate colo related info to slave
>   migration: Integrate COLO checkpoint process into migration
>   migration: Integrate COLO checkpoint process into loadvm
>   COLO: Implement colo checkpoint protocol
>   COLO: Add a new RunState RUN_STATE_COLO
>   QEMUSizedBuffer: Introduce two help functions for qsb
>   COLO: Save VM state to slave when do checkpoint
>   COLO RAM: Load PVM's dirty page into SVM's RAM cache temporarily
>   COLO VMstate: Load VM state into qsb before restore it
>   arch_init: Start to trace dirty pages of SVM
>   COLO RAM: Flush cached RAM into SVM's memory
>   COLO failover: Introduce a new command to trigger a failover
>   COLO failover: Implement COLO master/slave failover work
>   COLO failover: Don't do failover during loading VM's state
>   COLO: Add new command parameter 'colo_nicname' 'colo_script' for net
>   COLO NIC: Init/remove colo nic devices when add/cleanup tap devices
>   COLO NIC: Implement colo nic device interface configure()
>   COLO NIC : Implement colo nic init/destroy function
>   COLO NIC: Some init work related with proxy module
>   COLO: Do checkpoint according to the result of net packets comparing
>   COLO: Improve checkpoint efficiency by do additional periodic
> checkpoint
>   COLO: Add colo-set-checkpoint-period command
>   COLO NIC: Implement NIC checkpoint and failover
>   COLO: Disable qdev hotplug when VM is in COLO mode
>   COLO: Implement shutdown checkpoint
> 
>  arch_init.c| 199 +++-
>  configure  |  14 +
>  hmp-commands.hx|  30 ++
>  hmp.c  |  14 +
>  hmp.h  |   2 +
>  include/exec/cpu-all.h   

Re: [Qemu-devel] [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED

2015-05-14 Thread Christoffer Dall
On Thu, May 14, 2015 at 02:08:49PM +0200, Paolo Bonzini wrote:
> 
> 
> On 14/05/2015 14:00, Christoffer Dall wrote:
> > So, getting back to my original question.  Is the point then that UEFI
> > must assume (from ACPI/DT) the cache-coherency properties of the PCI
> > controller which exists in hardware on the system you're running on,
> > even for the virtual PCI bus because that will be the semantics for
> > assigned devices?
> > 
> > And in that case, we have no way to distinguish between passthrough
> > devices and virtual devices plugged into the virtual PCI bus?
> 
> Well, we could use the subsystem id.  But it's a hack, and may cause
> incompatibilities with some drivers.  Michael, any ideas?
> 
> > What about the idea of having two virtual PCI buses on your system where
> > one is always cache-coherent and uses for virtual devices, and the other
> > is whatever the hardware is and used for passthrough devices?
> 
> I think that was rejected before.
> 

Do you remember where?  I just remember Catalin mentioning the idea to
me verbally.

Besides the slightly heavy added use of resources etc. it seems like it
would address some of our issues in a good way.

But I'm still not sure why UEFI/Linux currently sees our PCI bus as
being non-coherent when in fact it is and we have no passthrough issues
currently.  Are all PCI controllers always non-coherent for some reason
and therefore we model it as such too?

-Christoffer



Re: [Qemu-devel] [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED

2015-05-14 Thread Paolo Bonzini


On 14/05/2015 14:24, Christoffer Dall wrote:
> On Thu, May 14, 2015 at 02:08:49PM +0200, Paolo Bonzini wrote:
>>
>>
>> On 14/05/2015 14:00, Christoffer Dall wrote:
>>> So, getting back to my original question.  Is the point then that UEFI
>>> must assume (from ACPI/DT) the cache-coherency properties of the PCI
>>> controller which exists in hardware on the system you're running on,
>>> even for the virtual PCI bus because that will be the semantics for
>>> assigned devices?
>>>
>>> And in that case, we have no way to distinguish between passthrough
>>> devices and virtual devices plugged into the virtual PCI bus?
>>
>> Well, we could use the subsystem id.  But it's a hack, and may cause
>> incompatibilities with some drivers.  Michael, any ideas?
>>
>>> What about the idea of having two virtual PCI buses on your system where
>>> one is always cache-coherent and uses for virtual devices, and the other
>>> is whatever the hardware is and used for passthrough devices?
>>
>> I think that was rejected before.
> 
> Do you remember where?  I just remember Catalin mentioning the idea to
> me verbally.

In the last centithread on the subject. :)

At least I and Peter disagreed.  It's not about the heavy added use of
resources, it's more about it being really easy to misconfigure.

> But I'm still not sure why UEFI/Linux currently sees our PCI bus as
> being non-coherent when in fact it is and we have no passthrough issues
> currently.  Are all PCI controllers always non-coherent for some reason
> and therefore we model it as such too?

Well, PCI BARs are generally MMIO resources, and hence should not be cached.

As an optimization, OS drivers can mark them as cacheable or
write-combining or something like that, but in general it's a safe
default to leave them uncached---one would think.

Paolo



Re: [Qemu-devel] [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED

2015-05-14 Thread Christoffer Dall
On Thu, May 14, 2015 at 02:28:49PM +0200, Paolo Bonzini wrote:
> 
> 
> On 14/05/2015 14:24, Christoffer Dall wrote:
> > On Thu, May 14, 2015 at 02:08:49PM +0200, Paolo Bonzini wrote:
> >>
> >>
> >> On 14/05/2015 14:00, Christoffer Dall wrote:
> >>> So, getting back to my original question.  Is the point then that UEFI
> >>> must assume (from ACPI/DT) the cache-coherency properties of the PCI
> >>> controller which exists in hardware on the system you're running on,
> >>> even for the virtual PCI bus because that will be the semantics for
> >>> assigned devices?
> >>>
> >>> And in that case, we have no way to distinguish between passthrough
> >>> devices and virtual devices plugged into the virtual PCI bus?
> >>
> >> Well, we could use the subsystem id.  But it's a hack, and may cause
> >> incompatibilities with some drivers.  Michael, any ideas?
> >>
> >>> What about the idea of having two virtual PCI buses on your system where
> >>> one is always cache-coherent and uses for virtual devices, and the other
> >>> is whatever the hardware is and used for passthrough devices?
> >>
> >> I think that was rejected before.
> > 
> > Do you remember where?  I just remember Catalin mentioning the idea to
> > me verbally.
> 
> In the last centithread on the subject. :)
> 
> At least I and Peter disagreed.  It's not about the heavy added use of
> resources, it's more about it being really easy to misconfigure.
> 
> > But I'm still not sure why UEFI/Linux currently sees our PCI bus as
> > being non-coherent when in fact it is and we have no passthrough issues
> > currently.  Are all PCI controllers always non-coherent for some reason
> > and therefore we model it as such too?
> 
> Well, PCI BARs are generally MMIO resources, and hence should not be cached.
> 
> As an optimization, OS drivers can mark them as cacheable or
> write-combining or something like that, but in general it's a safe
> default to leave them uncached---one would think.
> 
ok, I guess this series makes sense then, assuming it works, and
assuming we don't kill performance by going to RAM all the time when we
don't have to...

Thanks,
-Christoffer



Re: [Qemu-devel] [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED

2015-05-14 Thread Peter Maydell
On 14 May 2015 at 13:28, Paolo Bonzini  wrote:
> Well, PCI BARs are generally MMIO resources, and hence should not be cached.
>
> As an optimization, OS drivers can mark them as cacheable or
> write-combining or something like that, but in general it's a safe
> default to leave them uncached---one would think.

Isn't this handled by the OS mapping them in the 'prefetchable'
MMIO window rather than the 'non-prefetchable' one? (QEMU's
generic-PCIe device doesn't yet support the prefetchable window.)

-- PMM



Re: [Qemu-devel] [PATCH] Do not emulate a floppy drive when -nodefaults

2015-05-14 Thread Markus Armbruster
Paolo Bonzini  writes:

> On 14/05/2015 14:02, Markus Armbruster wrote:
>>   It should certainly be off for pc-q35-2.4 and newer.  Real Q35 boards
>>   commonly don't have an FDC (depends on the Super I/O chip used).
>> 
>>   We may want to keep it off for pc-i440fx-2.4 and newer.  I doubt
>>   there's a real i440FX without an FDC, but our virtual i440FX is quite
>>   unlike a real one in other ways already.
>
> That would break libvirt for people upgrading from 2.3 to 2.4.  So it's
> more like pc-i440fx-3.0 and pc-q35-3.0.

What exactly breaks when?

>  Unless for q35 we decide to
> break everything and retroactively nuke the controller.
>
> (I'm still not sure why we have backwards-compatible machine types for q35).

Beats me :)

[...]



Re: [Qemu-devel] [PATCH] Do not emulate a floppy drive when -nodefaults

2015-05-14 Thread Daniel P. Berrange
On Thu, May 14, 2015 at 02:45:30PM +0200, Markus Armbruster wrote:
> Paolo Bonzini  writes:
> 
> > On 14/05/2015 14:02, Markus Armbruster wrote:
> >>   It should certainly be off for pc-q35-2.4 and newer.  Real Q35 boards
> >>   commonly don't have an FDC (depends on the Super I/O chip used).
> >> 
> >>   We may want to keep it off for pc-i440fx-2.4 and newer.  I doubt
> >>   there's a real i440FX without an FDC, but our virtual i440FX is quite
> >>   unlike a real one in other ways already.
> >
> > That would break libvirt for people upgrading from 2.3 to 2.4.  So it's
> > more like pc-i440fx-3.0 and pc-q35-3.0.
> 
> What exactly breaks when?

[quote]
  * Create the FDC only if the option is "on".

  * Optional: make -drive if=floppy,... auto-enable it

I wouldn't bother doing the same for -global isa-fdc.driveA=... and
such.
[/quote]

Libvirt uses -global when enabling floppy devices. So since current libvirt
does not know about the new (to be created) machine type property to turn
on FDC, it will get an error using -global isa-fdc.driveA=

I'm not too bothered about this, as long as libvirt has enough advance
notice to add support for the new machine type property to enable FDC
before we change its default value to be "off". Perhaps one QEMU major
release cycle before toggling the default, to give time for new libvirt
to penetrate to distros

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



[Qemu-devel] [PATCH v4 02/16] qapi: Rename identical c_fun()/c_var() into c_name()

2015-05-14 Thread Eric Blake
Now that the two functions are identical, we only need one of them,
and we might as well give it a more descriptive name.  Basically,
the function serves as the translation from a QAPI name into a
(portion of a) C identifier, without regards to whether it is a
variable or function name.

Signed-off-by: Eric Blake 
Reviewed-by: Markus Armbruster 
---
 scripts/qapi-commands.py | 35 ++-
 scripts/qapi-event.py| 10 +-
 scripts/qapi-types.py|  8 
 scripts/qapi-visit.py| 10 +-
 scripts/qapi.py  | 11 ---
 5 files changed, 36 insertions(+), 38 deletions(-)

diff --git a/scripts/qapi-commands.py b/scripts/qapi-commands.py
index 93e43f0..8c125ca 100644
--- a/scripts/qapi-commands.py
+++ b/scripts/qapi-commands.py
@@ -31,12 +31,13 @@ def generate_command_decl(name, args, ret_type):
 for argname, argtype, optional in parse_args(args):
 argtype = c_type(argtype, is_param=True)
 if optional:
-arglist += "bool has_%s, " % c_var(argname)
-arglist += "%s %s, " % (argtype, c_var(argname))
+arglist += "bool has_%s, " % c_name(argname)
+arglist += "%s %s, " % (argtype, c_name(argname))
 return mcgen('''
 %(ret_type)s qmp_%(name)s(%(args)sError **errp);
 ''',
- ret_type=c_type(ret_type), name=c_fun(name), 
args=arglist).strip()
+ ret_type=c_type(ret_type), name=c_name(name),
+ args=arglist).strip()

 def gen_err_check(errvar):
 if errvar:
@@ -55,14 +56,14 @@ def gen_sync_call(name, args, ret_type, indent=0):
 retval = "retval = "
 for argname, argtype, optional in parse_args(args):
 if optional:
-arglist += "has_%s, " % c_var(argname)
-arglist += "%s, " % (c_var(argname))
+arglist += "has_%s, " % c_name(argname)
+arglist += "%s, " % (c_name(argname))
 push_indent(indent)
 ret = mcgen('''
 %(retval)sqmp_%(name)s(%(args)s&local_err);

 ''',
-name=c_fun(name), args=arglist, retval=retval).rstrip()
+name=c_name(name), args=arglist, retval=retval).rstrip()
 if ret_type:
 ret += "\n" + gen_err_check('local_err')
 ret += "\n" + mcgen(
@@ -76,7 +77,7 @@ def gen_sync_call(name, args, ret_type, indent=0):
 def gen_marshal_output_call(name, ret_type):
 if not ret_type:
 return ""
-return "qmp_marshal_output_%s(retval, ret, &local_err);" % c_fun(name)
+return "qmp_marshal_output_%s(retval, ret, &local_err);" % c_name(name)

 def gen_visitor_input_containers_decl(args, obj):
 ret = ""
@@ -101,17 +102,17 @@ def gen_visitor_input_vars_decl(args):
 ret += mcgen('''
 bool has_%(argname)s = false;
 ''',
- argname=c_var(argname))
+ argname=c_name(argname))
 if is_c_ptr(argtype):
 ret += mcgen('''
 %(argtype)s %(argname)s = NULL;
 ''',
- argname=c_var(argname), argtype=c_type(argtype))
+ argname=c_name(argname), argtype=c_type(argtype))
 else:
 ret += mcgen('''
 %(argtype)s %(argname)s = {0};
 ''',
- argname=c_var(argname), argtype=c_type(argtype))
+ argname=c_name(argname), argtype=c_type(argtype))

 pop_indent()
 return ret.rstrip()
@@ -144,17 +145,17 @@ v = qmp_input_get_visitor(mi);
 ret += mcgen('''
 visit_optional(v, &has_%(c_name)s, "%(name)s", %(errp)s);
 ''',
- c_name=c_var(argname), name=argname, errp=errparg)
+ c_name=c_name(argname), name=argname, errp=errparg)
 ret += gen_err_check(errarg)
 ret += mcgen('''
 if (has_%(c_name)s) {
 ''',
- c_name=c_var(argname))
+ c_name=c_name(argname))
 push_indent()
 ret += mcgen('''
 %(visitor)s(v, &%(c_name)s, "%(name)s", %(errp)s);
 ''',
- c_name=c_var(argname), name=argname, argtype=argtype,
+ c_name=c_name(argname), name=argname, argtype=argtype,
  visitor=type_visitor(argtype), errp=errparg)
 ret += gen_err_check(errarg)
 if optional:
@@ -198,16 +199,16 @@ out:
 qapi_dealloc_visitor_cleanup(md);
 }
 ''',
-c_ret_type=c_type(ret_type), c_name=c_fun(name),
+c_ret_type=c_type(ret_type), c_name=c_name(name),
 visitor=type_visitor(ret_type))

 return ret

 def gen_marshal_input_decl(name, args, ret_type, middle_mode):
 if middle_mode:
-return 'int qmp_marshal_input_%s(Monitor *mon, const QDict *qdict, 
QObject **ret)' % c_fun(name)
+return 'int qmp_marshal_input_%s(Monitor *mon, const QDict *qdict, 
QObject **ret)' % c_name(name)
 else:
-return 'static void qmp_marshal_input_%s(QDict *args, QObject **ret, 
Error **errp)' % c_fun(name)
+retur

[Qemu-devel] [PATCH v4 00/16] Fix qapi mangling of downstream names

2015-05-14 Thread Eric Blake
This series makes it possible to use downstream extensions
(such as __com.redhat_xyz) and temporary names (such as x-foo)
in every position possible in QAPI schemes, with added tests
that the generated code still compiles.

There's still some things we could do to the qapi generator,
such as normalizing struct member names and C manglings and
creating named implicit types up front on the initial parse
rather than multiple times in each backend.  But that should
wait until existing pending patches have landed, to minimize
rebase churn.

v3 was here:
https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg00519.html
[already queued in Markus tree, but not yet in a pull request, so
that queue can be ruthlessly rebased]

v4 includes 2 new patches (although 14/14 is somewhat unrelated,
and could easily be dropped from the series if it needs another
spin without holding up the rest of the series), incorporates
Markus' suggestions for more comments, and simplifies special-casing
of lists of builtin types.

Here's the backport-diff stats (and note that it exposes a bug
in the script, as a raw % in the subject line triggered mayhem):

001/16:[] [--] 'qapi: Fix C identifiers generated for names containing '.''
002/16:[] [--] 'qapi: Rename identical c_fun()/c_var() into c_name()'
003/16:[] [--] 'qapi: Rename _generate_enum_string() to camel_to_upper()'
004/16:[] [--] 'qapi: Rename generate_enum_full_value() to c_enum_const()'
005/16:[] [--] 'qapi: Simplify c_enum_const()'
006/16:[] [--] 'qapi: Use c_enum_const() in generate_alternate_qtypes()'
007/16:[] [--] 'qapi: Move camel_to_upper(), c_enum_const() to closely 
related code'
008/16:[down] 'qapi: Tidy c_type logic'
009/16:[0046] [FC] 'qapi: Make c_type() consistently convert qapi names'
010/16:[0007] [FC] 'qapi: Support downstream enums'
011/16:[] [--] 'qapi: Support downstream structs'
012/16:[] [--] 'qapi: Support downstream simple unions'
013/16:[] [--] 'qapi: Support downstream flat unions'
014/16:[] [--] 'qapi: Support downstream alternates'
015/16:[] [--] 'qapi: Support downstream events and commands'
/home/eblake/bin/git-backport-diff: line 267: printf: `v': invalid format 
character
016/16:[down] 'qapi: Prefer '"str" + var' over '"str"

Eric Blake (10):
  qapi: Rename identical c_fun()/c_var() into c_name()
  qapi: Tidy c_type logic
  qapi: Make c_type() consistently convert qapi names
  qapi: Support downstream enums
  qapi: Support downstream structs
  qapi: Support downstream simple unions
  qapi: Support downstream flat unions
  qapi: Support downstream alternates
  qapi: Support downstream events and commands
  qapi: Prefer '"str" + var' over '"str%s" % var'

Markus Armbruster (6):
  qapi: Fix C identifiers generated for names containing '.'
  qapi: Rename _generate_enum_string() to camel_to_upper()
  qapi: Rename generate_enum_full_value() to c_enum_const()
  qapi: Simplify c_enum_const()
  qapi: Use c_enum_const() in generate_alternate_qtypes()
  qapi: Move camel_to_upper(), c_enum_const() to closely related code

 scripts/qapi-commands.py|  56 
 scripts/qapi-event.py   |  15 +--
 scripts/qapi-types.py   |  54 
 scripts/qapi-visit.py   |  55 
 scripts/qapi.py | 226 +---
 tests/qapi-schema/include-non-file.err  |   2 +-
 tests/qapi-schema/qapi-schema-test.json |  20 +++
 tests/qapi-schema/qapi-schema-test.out  |  21 ++-
 tests/test-qmp-commands.c   |  15 +++
 9 files changed, 263 insertions(+), 201 deletions(-)

-- 
2.1.0




[Qemu-devel] [PATCH v4 05/16] qapi: Simplify c_enum_const()

2015-05-14 Thread Eric Blake
From: Markus Armbruster 

Signed-off-by: Markus Armbruster 
Signed-off-by: Eric Blake 
---
 scripts/qapi.py | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/scripts/qapi.py b/scripts/qapi.py
index 1258f76..b917cad 100644
--- a/scripts/qapi.py
+++ b/scripts/qapi.py
@@ -961,6 +961,4 @@ def camel_to_upper(value):
 return new_name.lstrip('_').upper()

 def c_enum_const(type_name, const_name):
-abbrev_string = camel_to_upper(type_name)
-value_string = camel_to_upper(const_name)
-return "%s_%s" % (abbrev_string, value_string)
+return camel_to_upper(type_name + '_' + const_name)
-- 
2.1.0




[Qemu-devel] [PATCH v4 10/16] qapi: Support downstream enums

2015-05-14 Thread Eric Blake
Enhance the testsuite to cover a downstream enum type and enum
string.  Update the generator to mangle the enum name in the
appropriate places.

Signed-off-by: Eric Blake 

---

v4: use type_name() instead of special-casing builtins in
generate_visit_list()
---
 scripts/qapi-types.py   | 15 ---
 scripts/qapi-visit.py   |  8 
 tests/qapi-schema/qapi-schema-test.json |  3 +++
 tests/qapi-schema/qapi-schema-test.out  |  4 +++-
 4 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/scripts/qapi-types.py b/scripts/qapi-types.py
index 9eb08a6..1593fc6 100644
--- a/scripts/qapi-types.py
+++ b/scripts/qapi-types.py
@@ -58,7 +58,7 @@ typedef struct %(name)sList
 struct %(name)sList *next;
 } %(name)sList;
 ''',
- name=name)
+ name=c_name(name))

 def generate_struct_fields(members):
 ret = ''
@@ -115,7 +115,7 @@ def generate_enum_lookup(name, values):
 ret = mcgen('''
 const char *%(name)s_lookup[] = {
 ''',
- name=name)
+name=c_name(name))
 i = 0
 for value in values:
 index = c_enum_const(name, value)
@@ -134,6 +134,7 @@ const char *%(name)s_lookup[] = {
 return ret

 def generate_enum(name, values):
+name = c_name(name)
 lookup_decl = mcgen('''
 extern const char *%(name)s_lookup[];
 ''',
@@ -247,15 +248,15 @@ extern const int %(name)s_qtypes[];

 def generate_type_cleanup_decl(name):
 ret = mcgen('''
-void qapi_free_%(type)s(%(c_type)s obj);
+void qapi_free_%(name)s(%(c_type)s obj);
 ''',
-c_type=c_type(name),type=name)
+c_type=c_type(name), name=c_name(name))
 return ret

 def generate_type_cleanup(name):
 ret = mcgen('''

-void qapi_free_%(type)s(%(c_type)s obj)
+void qapi_free_%(name)s(%(c_type)s obj)
 {
 QapiDeallocVisitor *md;
 Visitor *v;
@@ -266,11 +267,11 @@ void qapi_free_%(type)s(%(c_type)s obj)

 md = qapi_dealloc_visitor_new();
 v = qapi_dealloc_get_visitor(md);
-visit_type_%(type)s(v, &obj, NULL, NULL);
+visit_type_%(name)s(v, &obj, NULL, NULL);
 qapi_dealloc_visitor_cleanup(md);
 }
 ''',
-c_type=c_type(name),type=name)
+c_type=c_type(name), name=c_name(name))
 return ret


diff --git a/scripts/qapi-visit.py b/scripts/qapi-visit.py
index 0368e62..7697ec6 100644
--- a/scripts/qapi-visit.py
+++ b/scripts/qapi-visit.py
@@ -173,7 +173,7 @@ out:
 error_propagate(errp, err);
 }
 ''',
-name=name)
+name=type_name(name))

 def generate_visit_enum(name, members):
 return mcgen('''
@@ -183,7 +183,7 @@ void visit_type_%(name)s(Visitor *m, %(name)s *obj, const 
char *name, Error **er
 visit_type_enum(m, (int *)obj, %(name)s_lookup, "%(name)s", name, errp);
 }
 ''',
- name=name)
+ name=c_name(name))

 def generate_visit_alternate(name, members):
 ret = mcgen('''
@@ -364,7 +364,7 @@ def generate_enum_declaration(name, members):
 ret = mcgen('''
 void visit_type_%(name)sList(Visitor *m, %(name)sList **obj, const char *name, 
Error **errp);
 ''',
-name=name)
+name=c_name(name))

 return ret

@@ -373,7 +373,7 @@ def generate_decl_enum(name, members):

 void visit_type_%(name)s(Visitor *m, %(name)s *obj, const char *name, Error 
**errp);
 ''',
-name=name)
+ name=c_name(name))

 try:
 opts, args = getopt.gnu_getopt(sys.argv[1:], "chbp:i:o:",
diff --git a/tests/qapi-schema/qapi-schema-test.json 
b/tests/qapi-schema/qapi-schema-test.json
index 8193dc1..5f9af66 100644
--- a/tests/qapi-schema/qapi-schema-test.json
+++ b/tests/qapi-schema/qapi-schema-test.json
@@ -107,3 +107,6 @@
   'data': { '*a': 'int', '*b': 'UserDefOne', 'c': 'str' } }
 { 'event': 'EVENT_D',
   'data': { 'a' : 'EventStructOne', 'b' : 'str', '*c': 'str', '*enum3': 
'EnumOne' } }
+
+# test that we correctly compile downstream extensions
+{ 'enum': '__org.qemu_x-Enum', 'data': [ '__org.qemu_x-value' ] }
diff --git a/tests/qapi-schema/qapi-schema-test.out 
b/tests/qapi-schema/qapi-schema-test.out
index 93c4963..40f0f20 100644
--- a/tests/qapi-schema/qapi-schema-test.out
+++ b/tests/qapi-schema/qapi-schema-test.out
@@ -22,8 +22,10 @@
  OrderedDict([('event', 'EVENT_A')]),
  OrderedDict([('event', 'EVENT_B'), ('data', OrderedDict())]),
  OrderedDict([('event', 'EVENT_C'), ('data', OrderedDict([('*a', 'int'), 
('*b', 'UserDefOne'), ('c', 'str')]))]),
- OrderedDict([('event', 'EVENT_D'), ('data', OrderedDict([('a', 
'EventStructOne'), ('b', 'str'), ('*c', 'str'), ('*enum3', 'EnumOne')]))])]
+ OrderedDict([('event', 'EVENT_D'), ('data', OrderedDict([('a', 
'EventStructOne'), ('b', 'str'), ('*c', 'str'), ('*enum3', 'EnumOne')]))]),
+ OrderedDict([('enum', '__org.qemu_x-Enum'), ('data', 
['__org.qemu_x-value'])])]
 [{'enum_name': 'EnumOne', 'enum_values': ['value1', 'value2', 'value3']},
+ {'enum_name': '__org.qemu_x-Enum', 'enum_va

[Qemu-devel] [PATCH v4 15/16] qapi: Support downstream events and commands

2015-05-14 Thread Eric Blake
Enhance the testsuite to cover downstream events and commands.
Events worked without more tweaks, but commands needed a few final
updates in the generator to mangle names in the appropriate places.
In making those tweaks, it was easier to drop type_visitor() and
inline its actions instead.

Signed-off-by: Eric Blake 
Reviewed-by: Markus Armbruster 
---
 scripts/qapi-commands.py| 16 +---
 tests/qapi-schema/qapi-schema-test.json |  5 +
 tests/qapi-schema/qapi-schema-test.out  |  4 +++-
 tests/test-qmp-commands.c   | 15 +++
 4 files changed, 28 insertions(+), 12 deletions(-)

diff --git a/scripts/qapi-commands.py b/scripts/qapi-commands.py
index 8c125ca..0a1d636 100644
--- a/scripts/qapi-commands.py
+++ b/scripts/qapi-commands.py
@@ -20,12 +20,6 @@ import os
 import getopt
 import errno

-def type_visitor(name):
-if type(name) == list:
-return 'visit_type_%sList' % name[0]
-else:
-return 'visit_type_%s' % name
-
 def generate_command_decl(name, args, ret_type):
 arglist=""
 for argname, argtype, optional in parse_args(args):
@@ -153,10 +147,10 @@ if (has_%(c_name)s) {
  c_name=c_name(argname))
 push_indent()
 ret += mcgen('''
-%(visitor)s(v, &%(c_name)s, "%(name)s", %(errp)s);
+visit_type_%(visitor)s(v, &%(c_name)s, "%(name)s", %(errp)s);
 ''',
  c_name=c_name(argname), name=argname, argtype=argtype,
- visitor=type_visitor(argtype), errp=errparg)
+ visitor=type_name(argtype), errp=errparg)
 ret += gen_err_check(errarg)
 if optional:
 pop_indent()
@@ -184,7 +178,7 @@ static void qmp_marshal_output_%(c_name)s(%(c_ret_type)s 
ret_in, QObject **ret_o
 Visitor *v;

 v = qmp_output_get_visitor(mo);
-%(visitor)s(v, &ret_in, "unused", &local_err);
+visit_type_%(visitor)s(v, &ret_in, "unused", &local_err);
 if (local_err) {
 goto out;
 }
@@ -195,12 +189,12 @@ out:
 qmp_output_visitor_cleanup(mo);
 md = qapi_dealloc_visitor_new();
 v = qapi_dealloc_get_visitor(md);
-%(visitor)s(v, &ret_in, "unused", NULL);
+visit_type_%(visitor)s(v, &ret_in, "unused", NULL);
 qapi_dealloc_visitor_cleanup(md);
 }
 ''',
 c_ret_type=c_type(ret_type), c_name=c_name(name),
-visitor=type_visitor(ret_type))
+visitor=type_name(ret_type))

 return ret

diff --git a/tests/qapi-schema/qapi-schema-test.json 
b/tests/qapi-schema/qapi-schema-test.json
index d586b56..c7eaa86 100644
--- a/tests/qapi-schema/qapi-schema-test.json
+++ b/tests/qapi-schema/qapi-schema-test.json
@@ -122,3 +122,8 @@
   'data': { '__org.qemu_x-value': '__org.qemu_x-Struct2' } }
 { 'alternate': '__org.qemu_x-Alt',
   'data': { '__org.qemu_x-branch': 'str', 'b': '__org.qemu_x-Base' } }
+{ 'event': '__ORG.QEMU_X-EVENT', 'data': '__org.qemu_x-Struct' }
+{ 'command': '__org.qemu_x-command',
+  'data': { 'a': ['__org.qemu_x-Enum'], 'b': ['__org.qemu_x-Struct'],
+'c': '__org.qemu_x-Union2', 'd': '__org.qemu_x-Alt' },
+  'returns': '__org.qemu_x-Union1' }
diff --git a/tests/qapi-schema/qapi-schema-test.out 
b/tests/qapi-schema/qapi-schema-test.out
index 2161a90..cf0ccc4 100644
--- a/tests/qapi-schema/qapi-schema-test.out
+++ b/tests/qapi-schema/qapi-schema-test.out
@@ -29,7 +29,9 @@
  OrderedDict([('union', '__org.qemu_x-Union1'), ('data', 
OrderedDict([('__org.qemu_x-branch', 'str')]))]),
  OrderedDict([('struct', '__org.qemu_x-Struct2'), ('data', 
OrderedDict([('array', ['__org.qemu_x-Union1'])]))]),
  OrderedDict([('union', '__org.qemu_x-Union2'), ('base', '__org.qemu_x-Base'), 
('discriminator', '__org.qemu_x-member1'), ('data', 
OrderedDict([('__org.qemu_x-value', '__org.qemu_x-Struct2')]))]),
- OrderedDict([('alternate', '__org.qemu_x-Alt'), ('data', 
OrderedDict([('__org.qemu_x-branch', 'str'), ('b', '__org.qemu_x-Base')]))])]
+ OrderedDict([('alternate', '__org.qemu_x-Alt'), ('data', 
OrderedDict([('__org.qemu_x-branch', 'str'), ('b', '__org.qemu_x-Base')]))]),
+ OrderedDict([('event', '__ORG.QEMU_X-EVENT'), ('data', 
'__org.qemu_x-Struct')]),
+ OrderedDict([('command', '__org.qemu_x-command'), ('data', OrderedDict([('a', 
['__org.qemu_x-Enum']), ('b', ['__org.qemu_x-Struct']), ('c', 
'__org.qemu_x-Union2'), ('d', '__org.qemu_x-Alt')])), ('returns', 
'__org.qemu_x-Union1')])]
 [{'enum_name': 'EnumOne', 'enum_values': ['value1', 'value2', 'value3']},
  {'enum_name': '__org.qemu_x-Enum', 'enum_values': ['__org.qemu_x-value']},
  {'enum_name': 'UserDefAlternateKind', 'enum_values': None},
diff --git a/tests/test-qmp-commands.c b/tests/test-qmp-commands.c
index ad2e403..9918f23 100644
--- a/tests/test-qmp-commands.c
+++ b/tests/test-qmp-commands.c
@@ -51,6 +51,21 @@ int64_t qmp_user_def_cmd3(int64_t a, bool has_b, int64_t b, 
Error **errp)
 return a + (has_b ? b : 0);
 }

+__org_qemu_x_Union1 *qmp___org_qemu_x_command(__org_qemu_x_EnumList *a

[Qemu-devel] [PATCH v4 14/16] qapi: Support downstream alternates

2015-05-14 Thread Eric Blake
Enhance the testsuite to cover downstream alternates, including
whether the branch name or type is downstream.  Update the
generator to mangle alternate names in the appropriate places.

Signed-off-by: Eric Blake 
Reviewed-by: Markus Armbruster 
---
 scripts/qapi-types.py   | 7 ---
 scripts/qapi-visit.py   | 6 +++---
 tests/qapi-schema/qapi-schema-test.json | 2 ++
 tests/qapi-schema/qapi-schema-test.out  | 6 --
 4 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/scripts/qapi-types.py b/scripts/qapi-types.py
index 13e4b53..5665145 100644
--- a/scripts/qapi-types.py
+++ b/scripts/qapi-types.py
@@ -174,16 +174,17 @@ def generate_alternate_qtypes(expr):
 ret = mcgen('''
 const int %(name)s_qtypes[QTYPE_MAX] = {
 ''',
-name=name)
+name=c_name(name))

 for key in members:
 qtype = find_alternate_member_qtype(members[key])
 assert qtype, "Invalid alternate member"

 ret += mcgen('''
-[ %(qtype)s ] = %(enum_const)s,
+[%(qtype)s] = %(enum_const)s,
 ''',
-qtype = qtype, enum_const = c_enum_const(name + 'Kind', key))
+ qtype = qtype,
+ enum_const = c_enum_const(name + 'Kind', key))

 ret += mcgen('''
 };
diff --git a/scripts/qapi-visit.py b/scripts/qapi-visit.py
index c15305f..e511be3 100644
--- a/scripts/qapi-visit.py
+++ b/scripts/qapi-visit.py
@@ -202,11 +202,11 @@ void visit_type_%(name)s(Visitor *m, %(name)s **obj, 
const char *name, Error **e
 }
 switch ((*obj)->kind) {
 ''',
-name=name)
+name=c_name(name))

 # For alternate, always use the default enum type automatically generated
-# as "'%sKind' % (name)"
-disc_type = '%sKind' % (name)
+# as name + 'Kind'
+disc_type = c_name(name) + 'Kind'

 for key in members:
 assert (members[key] in builtin_types.keys()
diff --git a/tests/qapi-schema/qapi-schema-test.json 
b/tests/qapi-schema/qapi-schema-test.json
index ac236e3..d586b56 100644
--- a/tests/qapi-schema/qapi-schema-test.json
+++ b/tests/qapi-schema/qapi-schema-test.json
@@ -120,3 +120,5 @@
 { 'union': '__org.qemu_x-Union2', 'base': '__org.qemu_x-Base',
   'discriminator': '__org.qemu_x-member1',
   'data': { '__org.qemu_x-value': '__org.qemu_x-Struct2' } }
+{ 'alternate': '__org.qemu_x-Alt',
+  'data': { '__org.qemu_x-branch': 'str', 'b': '__org.qemu_x-Base' } }
diff --git a/tests/qapi-schema/qapi-schema-test.out 
b/tests/qapi-schema/qapi-schema-test.out
index 3fc24e8..2161a90 100644
--- a/tests/qapi-schema/qapi-schema-test.out
+++ b/tests/qapi-schema/qapi-schema-test.out
@@ -28,12 +28,14 @@
  OrderedDict([('struct', '__org.qemu_x-Struct'), ('base', 
'__org.qemu_x-Base'), ('data', OrderedDict([('__org.qemu_x-member2', 
'str')]))]),
  OrderedDict([('union', '__org.qemu_x-Union1'), ('data', 
OrderedDict([('__org.qemu_x-branch', 'str')]))]),
  OrderedDict([('struct', '__org.qemu_x-Struct2'), ('data', 
OrderedDict([('array', ['__org.qemu_x-Union1'])]))]),
- OrderedDict([('union', '__org.qemu_x-Union2'), ('base', '__org.qemu_x-Base'), 
('discriminator', '__org.qemu_x-member1'), ('data', 
OrderedDict([('__org.qemu_x-value', '__org.qemu_x-Struct2')]))])]
+ OrderedDict([('union', '__org.qemu_x-Union2'), ('base', '__org.qemu_x-Base'), 
('discriminator', '__org.qemu_x-member1'), ('data', 
OrderedDict([('__org.qemu_x-value', '__org.qemu_x-Struct2')]))]),
+ OrderedDict([('alternate', '__org.qemu_x-Alt'), ('data', 
OrderedDict([('__org.qemu_x-branch', 'str'), ('b', '__org.qemu_x-Base')]))])]
 [{'enum_name': 'EnumOne', 'enum_values': ['value1', 'value2', 'value3']},
  {'enum_name': '__org.qemu_x-Enum', 'enum_values': ['__org.qemu_x-value']},
  {'enum_name': 'UserDefAlternateKind', 'enum_values': None},
  {'enum_name': 'UserDefNativeListUnionKind', 'enum_values': None},
- {'enum_name': '__org.qemu_x-Union1Kind', 'enum_values': None}]
+ {'enum_name': '__org.qemu_x-Union1Kind', 'enum_values': None},
+ {'enum_name': '__org.qemu_x-AltKind', 'enum_values': None}]
 [OrderedDict([('struct', 'NestedEnumsOne'), ('data', OrderedDict([('enum1', 
'EnumOne'), ('*enum2', 'EnumOne'), ('enum3', 'EnumOne'), ('*enum4', 
'EnumOne')]))]),
  OrderedDict([('struct', 'UserDefZero'), ('data', OrderedDict([('integer', 
'int')]))]),
  OrderedDict([('struct', 'UserDefOne'), ('base', 'UserDefZero'), ('data', 
OrderedDict([('string', 'str'), ('*enum1', 'EnumOne')]))]),
-- 
2.1.0




[Qemu-devel] [PATCH v4 12/16] qapi: Support downstream simple unions

2015-05-14 Thread Eric Blake
Enhance the testsuite to cover downstream simple unions, including
when a union branch is a downstream name.  Update the generator to
mangle the union names in the appropriate places.

Signed-off-by: Eric Blake 
Reviewed-by: Markus Armbruster 
---
 scripts/qapi-types.py   | 2 +-
 scripts/qapi-visit.py   | 8 
 tests/qapi-schema/qapi-schema-test.json | 1 +
 tests/qapi-schema/qapi-schema-test.out  | 6 --
 4 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/scripts/qapi-types.py b/scripts/qapi-types.py
index 5118151..5b0bc5d 100644
--- a/scripts/qapi-types.py
+++ b/scripts/qapi-types.py
@@ -193,7 +193,7 @@ const int %(name)s_qtypes[QTYPE_MAX] = {

 def generate_union(expr, meta):

-name = expr[meta]
+name = c_name(expr[meta])
 typeinfo = expr['data']

 base = expr.get('base')
diff --git a/scripts/qapi-visit.py b/scripts/qapi-visit.py
index b500724..d1ec70b 100644
--- a/scripts/qapi-visit.py
+++ b/scripts/qapi-visit.py
@@ -255,9 +255,9 @@ def generate_visit_union(expr):
 disc_type = enum_define['enum_name']
 else:
 # There will always be a discriminator in the C switch code, by default
-# it is an enum type generated silently as "'%sKind' % (name)"
-ret = generate_visit_enum('%sKind' % name, members.keys())
-disc_type = '%sKind' % (name)
+# it is an enum type generated silently
+ret = generate_visit_enum(name + 'Kind', members.keys())
+disc_type = c_name(name) + 'Kind'

 if base:
 assert discriminator
@@ -281,7 +281,7 @@ void visit_type_%(name)s(Visitor *m, %(name)s **obj, const 
char *name, Error **e
 }
 if (*obj) {
 ''',
- name=name)
+ name=c_name(name))

 if base:
 ret += mcgen('''
diff --git a/tests/qapi-schema/qapi-schema-test.json 
b/tests/qapi-schema/qapi-schema-test.json
index ca9b34c..6416d85 100644
--- a/tests/qapi-schema/qapi-schema-test.json
+++ b/tests/qapi-schema/qapi-schema-test.json
@@ -114,3 +114,4 @@
   'data': { '__org.qemu_x-member1': '__org.qemu_x-Enum' } }
 { 'struct': '__org.qemu_x-Struct', 'base': '__org.qemu_x-Base',
   'data': { '__org.qemu_x-member2': 'str' } }
+{ 'union': '__org.qemu_x-Union1', 'data': { '__org.qemu_x-branch': 'str' } }
diff --git a/tests/qapi-schema/qapi-schema-test.out 
b/tests/qapi-schema/qapi-schema-test.out
index 6ab4f00..f9ebe08 100644
--- a/tests/qapi-schema/qapi-schema-test.out
+++ b/tests/qapi-schema/qapi-schema-test.out
@@ -25,11 +25,13 @@
  OrderedDict([('event', 'EVENT_D'), ('data', OrderedDict([('a', 
'EventStructOne'), ('b', 'str'), ('*c', 'str'), ('*enum3', 'EnumOne')]))]),
  OrderedDict([('enum', '__org.qemu_x-Enum'), ('data', 
['__org.qemu_x-value'])]),
  OrderedDict([('struct', '__org.qemu_x-Base'), ('data', 
OrderedDict([('__org.qemu_x-member1', '__org.qemu_x-Enum')]))]),
- OrderedDict([('struct', '__org.qemu_x-Struct'), ('base', 
'__org.qemu_x-Base'), ('data', OrderedDict([('__org.qemu_x-member2', 
'str')]))])]
+ OrderedDict([('struct', '__org.qemu_x-Struct'), ('base', 
'__org.qemu_x-Base'), ('data', OrderedDict([('__org.qemu_x-member2', 
'str')]))]),
+ OrderedDict([('union', '__org.qemu_x-Union1'), ('data', 
OrderedDict([('__org.qemu_x-branch', 'str')]))])]
 [{'enum_name': 'EnumOne', 'enum_values': ['value1', 'value2', 'value3']},
  {'enum_name': '__org.qemu_x-Enum', 'enum_values': ['__org.qemu_x-value']},
  {'enum_name': 'UserDefAlternateKind', 'enum_values': None},
- {'enum_name': 'UserDefNativeListUnionKind', 'enum_values': None}]
+ {'enum_name': 'UserDefNativeListUnionKind', 'enum_values': None},
+ {'enum_name': '__org.qemu_x-Union1Kind', 'enum_values': None}]
 [OrderedDict([('struct', 'NestedEnumsOne'), ('data', OrderedDict([('enum1', 
'EnumOne'), ('*enum2', 'EnumOne'), ('enum3', 'EnumOne'), ('*enum4', 
'EnumOne')]))]),
  OrderedDict([('struct', 'UserDefZero'), ('data', OrderedDict([('integer', 
'int')]))]),
  OrderedDict([('struct', 'UserDefOne'), ('base', 'UserDefZero'), ('data', 
OrderedDict([('string', 'str'), ('*enum1', 'EnumOne')]))]),
-- 
2.1.0




[Qemu-devel] [PATCH v4 16/16] qapi: Prefer '"str" + var' over '"str%s" % var'

2015-05-14 Thread Eric Blake
Use of python's % operator to format strings is fine if there are
multiple strings or if there is integer formatting going on, but
when it is just abused for string concatenation, it looks nicer
to just use the + operator.  This is particularly true when the
value being substituted is at the front of the format string,
rather than the tail.

Update an error message (and testsuite coverage) while at it, since
concatenating a non-string object does not always produce legible
results.

Signed-off-by: Eric Blake 

---

v4: new patch
---
 scripts/qapi-commands.py   | 17 +
 scripts/qapi-event.py  |  2 +-
 scripts/qapi-types.py  | 10 +++---
 scripts/qapi-visit.py  |  4 +--
 scripts/qapi.py| 64 +-
 tests/qapi-schema/include-non-file.err |  2 +-
 6 files changed, 51 insertions(+), 48 deletions(-)

diff --git a/scripts/qapi-commands.py b/scripts/qapi-commands.py
index 0a1d636..1464a3d 100644
--- a/scripts/qapi-commands.py
+++ b/scripts/qapi-commands.py
@@ -25,7 +25,7 @@ def generate_command_decl(name, args, ret_type):
 for argname, argtype, optional in parse_args(args):
 argtype = c_type(argtype, is_param=True)
 if optional:
-arglist += "bool has_%s, " % c_name(argname)
+arglist += "bool has_" + c_name(argname) + ", "
 arglist += "%s %s, " % (argtype, c_name(argname))
 return mcgen('''
 %(ret_type)s qmp_%(name)s(%(args)sError **errp);
@@ -50,8 +50,8 @@ def gen_sync_call(name, args, ret_type, indent=0):
 retval = "retval = "
 for argname, argtype, optional in parse_args(args):
 if optional:
-arglist += "has_%s, " % c_name(argname)
-arglist += "%s, " % (c_name(argname))
+arglist += "has_" + c_name(argname) + ", "
+arglist += c_name(argname) + ", "
 push_indent(indent)
 ret = mcgen('''
 %(retval)sqmp_%(name)s(%(args)s&local_err);
@@ -71,7 +71,7 @@ def gen_sync_call(name, args, ret_type, indent=0):
 def gen_marshal_output_call(name, ret_type):
 if not ret_type:
 return ""
-return "qmp_marshal_output_%s(retval, ret, &local_err);" % c_name(name)
+return "qmp_marshal_output_" + c_name(name) + "(retval, ret, &local_err);"

 def gen_visitor_input_containers_decl(args, obj):
 ret = ""
@@ -200,9 +200,11 @@ out:

 def gen_marshal_input_decl(name, args, ret_type, middle_mode):
 if middle_mode:
-return 'int qmp_marshal_input_%s(Monitor *mon, const QDict *qdict, 
QObject **ret)' % c_name(name)
+return ('int qmp_marshal_input_' + c_name(name)
++ '(Monitor *mon, const QDict *qdict, QObject **ret)')
 else:
-return 'static void qmp_marshal_input_%s(QDict *args, QObject **ret, 
Error **errp)' % c_name(name)
+return ('static void qmp_marshal_input_' + c_name(name)
++ '(QDict *args, QObject **ret, Error **errp)')



@@ -458,7 +460,8 @@ if dispatch_type == "sync":
 fdef.write(ret)

 if middle_mode:
-fdecl.write('%s;\n' % gen_marshal_input_decl(cmd['command'], 
arglist, ret_type, middle_mode))
+fdecl.write(gen_marshal_input_decl(cmd['command'], arglist,
+   ret_type, middle_mode) + ';\n')

 ret = gen_marshal_input(cmd['command'], arglist, ret_type, 
middle_mode) + "\n"
 fdef.write(ret)
diff --git a/scripts/qapi-event.py b/scripts/qapi-event.py
index a7e0033..957dba5 100644
--- a/scripts/qapi-event.py
+++ b/scripts/qapi-event.py
@@ -17,7 +17,7 @@ import getopt
 import errno

 def _generate_event_api_name(event_name, params):
-api_name = "void qapi_event_send_%s(" % c_name(event_name).lower();
+api_name = "void qapi_event_send_" + c_name(event_name).lower() + "(";
 l = len(api_name)

 if params:
diff --git a/scripts/qapi-types.py b/scripts/qapi-types.py
index 5665145..1c41743 100644
--- a/scripts/qapi-types.py
+++ b/scripts/qapi-types.py
@@ -204,7 +204,7 @@ def generate_union(expr, meta):
 if enum_define:
 discriminator_type_name = enum_define['enum_name']
 else:
-discriminator_type_name = '%sKind' % (name)
+discriminator_type_name = name + 'Kind'

 ret = mcgen('''
 struct %(name)s
@@ -399,13 +399,13 @@ for expr in exprs:
 ret += generate_fwd_struct(expr['union'], expr['data']) + "\n"
 enum_define = discriminator_find_enum_define(expr)
 if not enum_define:
-ret += generate_enum('%sKind' % expr['union'], expr['data'].keys())
-fdef.write(generate_enum_lookup('%sKind' % expr['union'],
+ret += generate_enum(expr['union'] + 'Kind', expr['data'].keys())
+fdef.write(generate_enum_lookup(expr['union'] + 'Kind',
 expr['data'].keys()))
 elif expr.has_key('alternate'):
 ret += generate_fwd_struct(expr['alternate'], expr['data']) + "\n"

[Qemu-devel] [PATCH v4 08/16] qapi: Tidy c_type logic

2015-05-14 Thread Eric Blake
c_type() is designed to be called on both string names and on
array designations, so 'name' is a bit misleading because it
operates on more than strings.  Also, no caller ever passes
an empty string.  Finally, + notation is a bit nicer to read
than '%s' % value for string concatenation.

Signed-off-by: Eric Blake 

---

v4: new patch, suggested by Markus
---
 scripts/qapi.py | 58 +
 1 file changed, 30 insertions(+), 28 deletions(-)

diff --git a/scripts/qapi.py b/scripts/qapi.py
index cc33355..85e5d00 100644
--- a/scripts/qapi.py
+++ b/scripts/qapi.py
@@ -801,12 +801,12 @@ def c_name(name, protect=True):
 return name.translate(c_name_trans)

 def c_list_type(name):
-return '%sList' % name
+return name + 'List'

-def type_name(name):
-if type(name) == list:
-return c_list_type(name[0])
-return name
+def type_name(value):
+if type(value) == list:
+return c_list_type(value[0])
+return value

 def add_name(name, info, meta, implicit = False):
 global all_names
@@ -863,42 +863,44 @@ def is_enum(name):
 return find_enum(name) != None

 eatspace = '\033EATSPACE.'
+pointer_suffix = ' *' + eatspace

 # A special suffix is added in c_type() for pointer types, and it's
 # stripped in mcgen(). So please notice this when you check the return
 # value of c_type() outside mcgen().
-def c_type(name, is_param=False):
-if name == 'str':
+def c_type(value, is_param=False):
+if value == 'str':
 if is_param:
-return 'const char *' + eatspace
-return 'char *' + eatspace
+return 'const char' + pointer_suffix
+return 'char' + pointer_suffix

-elif name == 'int':
+elif value == 'int':
 return 'int64_t'
-elif (name == 'int8' or name == 'int16' or name == 'int32' or
-  name == 'int64' or name == 'uint8' or name == 'uint16' or
-  name == 'uint32' or name == 'uint64'):
-return name + '_t'
-elif name == 'size':
+elif (value == 'int8' or value == 'int16' or value == 'int32' or
+  value == 'int64' or value == 'uint8' or value == 'uint16' or
+  value == 'uint32' or value == 'uint64'):
+return value + '_t'
+elif value == 'size':
 return 'uint64_t'
-elif name == 'bool':
+elif value == 'bool':
 return 'bool'
-elif name == 'number':
+elif value == 'number':
 return 'double'
-elif type(name) == list:
-return '%s *%s' % (c_list_type(name[0]), eatspace)
-elif is_enum(name):
-return name
-elif name == None or len(name) == 0:
+elif type(value) == list:
+return c_list_type(value[0]) + pointer_suffix
+elif is_enum(value):
+return value
+elif value == None:
 return 'void'
-elif name in events:
-return '%sEvent *%s' % (camel_case(name), eatspace)
+elif value in events:
+return camel_case(value) + 'Event' + pointer_suffix
 else:
-return '%s *%s' % (name, eatspace)
+# complex type name
+assert isinstance(value, str) and str != ""
+return value + pointer_suffix

-def is_c_ptr(name):
-suffix = "*" + eatspace
-return c_type(name).endswith(suffix)
+def is_c_ptr(value):
+return c_type(value).endswith(pointer_suffix)

 def genindent(count):
 ret = ""
-- 
2.1.0




[Qemu-devel] [PATCH v4 11/16] qapi: Support downstream structs

2015-05-14 Thread Eric Blake
Enhance the testsuite to cover downstream structs, including struct
members and base structs.  Update the generator to mangle the
struct names in the appropriate places.

Signed-off-by: Eric Blake 
Reviewed-by: Markus Armbruster 
---
 scripts/qapi-types.py   |  4 ++--
 scripts/qapi-visit.py   | 11 ++-
 tests/qapi-schema/qapi-schema-test.json |  4 
 tests/qapi-schema/qapi-schema-test.out  |  8 ++--
 4 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/scripts/qapi-types.py b/scripts/qapi-types.py
index 1593fc6..5118151 100644
--- a/scripts/qapi-types.py
+++ b/scripts/qapi-types.py
@@ -45,7 +45,7 @@ typedef struct %(name)sList
 struct %(name)sList *next;
 } %(name)sList;
 ''',
- name=name)
+ name=c_name(name))

 def generate_fwd_enum_struct(name, members):
 return mcgen('''
@@ -87,7 +87,7 @@ def generate_struct(expr):
 struct %(name)s
 {
 ''',
-  name=structname)
+  name=c_name(structname))

 if base:
 ret += generate_struct_fields({'base': base})
diff --git a/scripts/qapi-visit.py b/scripts/qapi-visit.py
index 7697ec6..b500724 100644
--- a/scripts/qapi-visit.py
+++ b/scripts/qapi-visit.py
@@ -56,7 +56,7 @@ static void visit_type_%(name)s_fields(Visitor *m, %(name)s 
**obj, Error **errp)
 {
 Error *err = NULL;
 ''',
- name=name)
+ name=c_name(name))
 push_indent()

 if base:
@@ -111,16 +111,16 @@ def generate_visit_struct_body(name, members):
 ret = mcgen('''
 Error *err = NULL;

-visit_start_struct(m, (void **)obj, "%(name)s", name, sizeof(%(name)s), 
&err);
+visit_start_struct(m, (void **)obj, "%(name)s", name, sizeof(%(c_name)s), 
&err);
 if (!err) {
 if (*obj) {
-visit_type_%(name)s_fields(m, obj, errp);
+visit_type_%(c_name)s_fields(m, obj, errp);
 }
 visit_end_struct(m, &err);
 }
 error_propagate(errp, err);
 ''',
-name=name)
+name=name, c_name=c_name(name))

 return ret

@@ -137,7 +137,7 @@ def generate_visit_struct(expr):
 void visit_type_%(name)s(Visitor *m, %(name)s **obj, const char *name, Error 
**errp)
 {
 ''',
-name=name)
+ name=c_name(name))

 ret += generate_visit_struct_body(name, members)

@@ -347,6 +347,7 @@ out:
 def generate_declaration(name, members, builtin_type=False):
 ret = ""
 if not builtin_type:
+name = c_name(name)
 ret += mcgen('''

 void visit_type_%(name)s(Visitor *m, %(name)s **obj, const char *name, Error 
**errp);
diff --git a/tests/qapi-schema/qapi-schema-test.json 
b/tests/qapi-schema/qapi-schema-test.json
index 5f9af66..ca9b34c 100644
--- a/tests/qapi-schema/qapi-schema-test.json
+++ b/tests/qapi-schema/qapi-schema-test.json
@@ -110,3 +110,7 @@

 # test that we correctly compile downstream extensions
 { 'enum': '__org.qemu_x-Enum', 'data': [ '__org.qemu_x-value' ] }
+{ 'struct': '__org.qemu_x-Base',
+  'data': { '__org.qemu_x-member1': '__org.qemu_x-Enum' } }
+{ 'struct': '__org.qemu_x-Struct', 'base': '__org.qemu_x-Base',
+  'data': { '__org.qemu_x-member2': 'str' } }
diff --git a/tests/qapi-schema/qapi-schema-test.out 
b/tests/qapi-schema/qapi-schema-test.out
index 40f0f20..6ab4f00 100644
--- a/tests/qapi-schema/qapi-schema-test.out
+++ b/tests/qapi-schema/qapi-schema-test.out
@@ -23,7 +23,9 @@
  OrderedDict([('event', 'EVENT_B'), ('data', OrderedDict())]),
  OrderedDict([('event', 'EVENT_C'), ('data', OrderedDict([('*a', 'int'), 
('*b', 'UserDefOne'), ('c', 'str')]))]),
  OrderedDict([('event', 'EVENT_D'), ('data', OrderedDict([('a', 
'EventStructOne'), ('b', 'str'), ('*c', 'str'), ('*enum3', 'EnumOne')]))]),
- OrderedDict([('enum', '__org.qemu_x-Enum'), ('data', 
['__org.qemu_x-value'])])]
+ OrderedDict([('enum', '__org.qemu_x-Enum'), ('data', 
['__org.qemu_x-value'])]),
+ OrderedDict([('struct', '__org.qemu_x-Base'), ('data', 
OrderedDict([('__org.qemu_x-member1', '__org.qemu_x-Enum')]))]),
+ OrderedDict([('struct', '__org.qemu_x-Struct'), ('base', 
'__org.qemu_x-Base'), ('data', OrderedDict([('__org.qemu_x-member2', 
'str')]))])]
 [{'enum_name': 'EnumOne', 'enum_values': ['value1', 'value2', 'value3']},
  {'enum_name': '__org.qemu_x-Enum', 'enum_values': ['__org.qemu_x-value']},
  {'enum_name': 'UserDefAlternateKind', 'enum_values': None},
@@ -39,4 +41,6 @@
  OrderedDict([('struct', 'UserDefC'), ('data', OrderedDict([('string1', 
'str'), ('string2', 'str')]))]),
  OrderedDict([('struct', 'UserDefUnionBase'), ('data', OrderedDict([('string', 
'str'), ('enum1', 'EnumOne')]))]),
  OrderedDict([('struct', 'UserDefOptions'), ('data', OrderedDict([('*i64', 
['int']), ('*u64', ['uint64']), ('*u16', ['uint16']), ('*i64x', 'int'), 
('*u64x', 'uint64')]))]),
- OrderedDict([('struct', 'EventStructOne'), ('data', OrderedDict([('struct1', 
'UserDefOne'), ('string', 'str'), ('*enum2', 'EnumOne')]))])]
+ OrderedDict([('struct', 'EventStructOn

[Qemu-devel] [PATCH v4 13/16] qapi: Support downstream flat unions

2015-05-14 Thread Eric Blake
Enhance the testsuite to cover downstream flat unions, including
the base type, discriminator name and type, and branch name and
type.  Update the generator to mangle the union names in the
appropriate places.

Signed-off-by: Eric Blake 
Reviewed-by: Markus Armbruster 
---
 scripts/qapi-types.py   | 2 +-
 scripts/qapi-visit.py   | 4 ++--
 tests/qapi-schema/qapi-schema-test.json | 5 +
 tests/qapi-schema/qapi-schema-test.out  | 7 +--
 4 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/scripts/qapi-types.py b/scripts/qapi-types.py
index 5b0bc5d..13e4b53 100644
--- a/scripts/qapi-types.py
+++ b/scripts/qapi-types.py
@@ -213,7 +213,7 @@ struct %(name)s
 void *data;
 ''',
 name=name,
-discriminator_type_name=discriminator_type_name)
+discriminator_type_name=c_name(discriminator_type_name))

 for key in typeinfo:
 ret += mcgen('''
diff --git a/scripts/qapi-visit.py b/scripts/qapi-visit.py
index d1ec70b..c15305f 100644
--- a/scripts/qapi-visit.py
+++ b/scripts/qapi-visit.py
@@ -252,7 +252,7 @@ def generate_visit_union(expr):
 if enum_define:
 # Use the enum type as discriminator
 ret = ""
-disc_type = enum_define['enum_name']
+disc_type = c_name(enum_define['enum_name'])
 else:
 # There will always be a discriminator in the C switch code, by default
 # it is an enum type generated silently
@@ -290,7 +290,7 @@ void visit_type_%(name)s(Visitor *m, %(name)s **obj, const 
char *name, Error **e
 goto out_obj;
 }
 ''',
-name=name)
+ name=c_name(name))

 if not discriminator:
 disc_key = "type"
diff --git a/tests/qapi-schema/qapi-schema-test.json 
b/tests/qapi-schema/qapi-schema-test.json
index 6416d85..ac236e3 100644
--- a/tests/qapi-schema/qapi-schema-test.json
+++ b/tests/qapi-schema/qapi-schema-test.json
@@ -115,3 +115,8 @@
 { 'struct': '__org.qemu_x-Struct', 'base': '__org.qemu_x-Base',
   'data': { '__org.qemu_x-member2': 'str' } }
 { 'union': '__org.qemu_x-Union1', 'data': { '__org.qemu_x-branch': 'str' } }
+{ 'struct': '__org.qemu_x-Struct2',
+  'data': { 'array': ['__org.qemu_x-Union1'] } }
+{ 'union': '__org.qemu_x-Union2', 'base': '__org.qemu_x-Base',
+  'discriminator': '__org.qemu_x-member1',
+  'data': { '__org.qemu_x-value': '__org.qemu_x-Struct2' } }
diff --git a/tests/qapi-schema/qapi-schema-test.out 
b/tests/qapi-schema/qapi-schema-test.out
index f9ebe08..3fc24e8 100644
--- a/tests/qapi-schema/qapi-schema-test.out
+++ b/tests/qapi-schema/qapi-schema-test.out
@@ -26,7 +26,9 @@
  OrderedDict([('enum', '__org.qemu_x-Enum'), ('data', 
['__org.qemu_x-value'])]),
  OrderedDict([('struct', '__org.qemu_x-Base'), ('data', 
OrderedDict([('__org.qemu_x-member1', '__org.qemu_x-Enum')]))]),
  OrderedDict([('struct', '__org.qemu_x-Struct'), ('base', 
'__org.qemu_x-Base'), ('data', OrderedDict([('__org.qemu_x-member2', 
'str')]))]),
- OrderedDict([('union', '__org.qemu_x-Union1'), ('data', 
OrderedDict([('__org.qemu_x-branch', 'str')]))])]
+ OrderedDict([('union', '__org.qemu_x-Union1'), ('data', 
OrderedDict([('__org.qemu_x-branch', 'str')]))]),
+ OrderedDict([('struct', '__org.qemu_x-Struct2'), ('data', 
OrderedDict([('array', ['__org.qemu_x-Union1'])]))]),
+ OrderedDict([('union', '__org.qemu_x-Union2'), ('base', '__org.qemu_x-Base'), 
('discriminator', '__org.qemu_x-member1'), ('data', 
OrderedDict([('__org.qemu_x-value', '__org.qemu_x-Struct2')]))])]
 [{'enum_name': 'EnumOne', 'enum_values': ['value1', 'value2', 'value3']},
  {'enum_name': '__org.qemu_x-Enum', 'enum_values': ['__org.qemu_x-value']},
  {'enum_name': 'UserDefAlternateKind', 'enum_values': None},
@@ -45,4 +47,5 @@
  OrderedDict([('struct', 'UserDefOptions'), ('data', OrderedDict([('*i64', 
['int']), ('*u64', ['uint64']), ('*u16', ['uint16']), ('*i64x', 'int'), 
('*u64x', 'uint64')]))]),
  OrderedDict([('struct', 'EventStructOne'), ('data', OrderedDict([('struct1', 
'UserDefOne'), ('string', 'str'), ('*enum2', 'EnumOne')]))]),
  OrderedDict([('struct', '__org.qemu_x-Base'), ('data', 
OrderedDict([('__org.qemu_x-member1', '__org.qemu_x-Enum')]))]),
- OrderedDict([('struct', '__org.qemu_x-Struct'), ('base', 
'__org.qemu_x-Base'), ('data', OrderedDict([('__org.qemu_x-member2', 
'str')]))])]
+ OrderedDict([('struct', '__org.qemu_x-Struct'), ('base', 
'__org.qemu_x-Base'), ('data', OrderedDict([('__org.qemu_x-member2', 
'str')]))]),
+ OrderedDict([('struct', '__org.qemu_x-Struct2'), ('data', 
OrderedDict([('array', ['__org.qemu_x-Union1'])]))])]
-- 
2.1.0




[Qemu-devel] [PATCH v4 07/16] qapi: Move camel_to_upper(), c_enum_const() to closely related code

2015-05-14 Thread Eric Blake
From: Markus Armbruster 

Signed-off-by: Markus Armbruster 
Signed-off-by: Eric Blake 
---
 scripts/qapi.py | 50 +-
 1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/scripts/qapi.py b/scripts/qapi.py
index 3757a91..cc33355 100644
--- a/scripts/qapi.py
+++ b/scripts/qapi.py
@@ -742,6 +742,31 @@ def camel_case(name):
 new_name += ch.lower()
 return new_name

+# ENUMName -> ENUM_NAME, EnumName1 -> ENUM_NAME1
+# ENUM_NAME -> ENUM_NAME, ENUM_NAME1 -> ENUM_NAME1, ENUM_Name2 -> ENUM_NAME2
+# ENUM24_Name -> ENUM24_NAME
+def camel_to_upper(value):
+c_fun_str = c_name(value, False)
+if value.isupper():
+return c_fun_str
+
+new_name = ''
+l = len(c_fun_str)
+for i in range(l):
+c = c_fun_str[i]
+# When c is upper and no "_" appears before, do more checks
+if c.isupper() and (i > 0) and c_fun_str[i - 1] != "_":
+# Case 1: next string is lower
+# Case 2: previous string is digit
+if (i < (l - 1) and c_fun_str[i + 1].islower()) or \
+c_fun_str[i - 1].isdigit():
+new_name += '_'
+new_name += c
+return new_name.lstrip('_').upper()
+
+def c_enum_const(type_name, const_name):
+return camel_to_upper(type_name + '_' + const_name)
+
 c_name_trans = string.maketrans('.-', '__')

 def c_name(name, protect=True):
@@ -926,28 +951,3 @@ def guardend(name):

 ''',
  name=guardname(name))
-
-# ENUMName -> ENUM_NAME, EnumName1 -> ENUM_NAME1
-# ENUM_NAME -> ENUM_NAME, ENUM_NAME1 -> ENUM_NAME1, ENUM_Name2 -> ENUM_NAME2
-# ENUM24_Name -> ENUM24_NAME
-def camel_to_upper(value):
-c_fun_str = c_name(value, False)
-if value.isupper():
-return c_fun_str
-
-new_name = ''
-l = len(c_fun_str)
-for i in range(l):
-c = c_fun_str[i]
-# When c is upper and no "_" appears before, do more checks
-if c.isupper() and (i > 0) and c_fun_str[i - 1] != "_":
-# Case 1: next string is lower
-# Case 2: previous string is digit
-if (i < (l - 1) and c_fun_str[i + 1].islower()) or \
-c_fun_str[i - 1].isdigit():
-new_name += '_'
-new_name += c
-return new_name.lstrip('_').upper()
-
-def c_enum_const(type_name, const_name):
-return camel_to_upper(type_name + '_' + const_name)
-- 
2.1.0




[Qemu-devel] [PATCH v4 03/16] qapi: Rename _generate_enum_string() to camel_to_upper()

2015-05-14 Thread Eric Blake
From: Markus Armbruster 

Signed-off-by: Markus Armbruster 
Signed-off-by: Eric Blake 
---
 scripts/qapi.py | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/scripts/qapi.py b/scripts/qapi.py
index a4701ca..7330f7c 100644
--- a/scripts/qapi.py
+++ b/scripts/qapi.py
@@ -538,7 +538,7 @@ def check_union(expr, expr_info):

 # Otherwise, check for conflicts in the generated enum
 else:
-c_key = _generate_enum_string(key)
+c_key = camel_to_upper(key)
 if c_key in values:
 raise QAPIExprError(expr_info,
 "Union '%s' member '%s' clashes with '%s'"
@@ -556,7 +556,7 @@ def check_alternate(expr, expr_info):
 check_name(expr_info, "Member of alternate '%s'" % name, key)

 # Check for conflicts in the generated enum
-c_key = _generate_enum_string(key)
+c_key = camel_to_upper(key)
 if c_key in values:
 raise QAPIExprError(expr_info,
 "Alternate '%s' member '%s' clashes with '%s'"
@@ -587,7 +587,7 @@ def check_enum(expr, expr_info):
 for member in members:
 check_name(expr_info, "Member of enum '%s'" %name, member,
enum_member=True)
-key = _generate_enum_string(member)
+key = camel_to_upper(member)
 if key in values:
 raise QAPIExprError(expr_info,
 "Enum '%s' member '%s' clashes with '%s'"
@@ -941,7 +941,7 @@ def guardend(name):
 # ENUMName -> ENUM_NAME, EnumName1 -> ENUM_NAME1
 # ENUM_NAME -> ENUM_NAME, ENUM_NAME1 -> ENUM_NAME1, ENUM_Name2 -> ENUM_NAME2
 # ENUM24_Name -> ENUM24_NAME
-def _generate_enum_string(value):
+def camel_to_upper(value):
 c_fun_str = c_name(value, False)
 if value.isupper():
 return c_fun_str
@@ -961,6 +961,6 @@ def _generate_enum_string(value):
 return new_name.lstrip('_').upper()

 def generate_enum_full_value(enum_name, enum_value):
-abbrev_string = _generate_enum_string(enum_name)
-value_string = _generate_enum_string(enum_value)
+abbrev_string = camel_to_upper(enum_name)
+value_string = camel_to_upper(enum_value)
 return "%s_%s" % (abbrev_string, value_string)
-- 
2.1.0




Re: [Qemu-devel] [PATCH] Do not emulate a floppy drive when -nodefaults

2015-05-14 Thread Paolo Bonzini


On 14/05/2015 14:45, Markus Armbruster wrote:
> Paolo Bonzini  writes:
> 
>> On 14/05/2015 14:02, Markus Armbruster wrote:
>>>   It should certainly be off for pc-q35-2.4 and newer.  Real Q35 boards
>>>   commonly don't have an FDC (depends on the Super I/O chip used).
>>>
>>>   We may want to keep it off for pc-i440fx-2.4 and newer.  I doubt
>>>   there's a real i440FX without an FDC, but our virtual i440FX is quite
>>>   unlike a real one in other ways already.
>>
>> That would break libvirt for people upgrading from 2.3 to 2.4.  So it's
>> more like pc-i440fx-3.0 and pc-q35-3.0.
> 
> What exactly breaks when?

libvirt expects "-nodefaults -drive if=none,id=fdd0,... -global
isa-fdc.driveA=fdd0" to result in a machine with a working FDD.  It
doesn't know that it has to add "-machine fdc=on".

Besides, adding a new machine option is not the best we can do.  If the
default is "no FDC", all that is needed to add one back is -device.  An
FDC is yet another ISA device, it is possible to create one with -device.

> add the magic to make -global isa-fdc... auto-set the option to on.

That would be ugly magic.

The more I think about this, the more I think this is just a kneejerk
reaction to a sensationalist announcement.  The effect of this
vulnerability on properly configured data centers (running
non-prehistoric versions of Xen or KVM and using
stubdom/SELinux/AppArmor properly) should be really close to zero.

It's a storm in a tea cup.

Paolo

>>  Unless for q35 we decide to
>> break everything and retroactively nuke the controller.
>>
>> (I'm still not sure why we have backwards-compatible machine types for q35).
> 
> Beats me :)
> 
> [...]
> 



[Qemu-devel] [PATCH v4 04/16] qapi: Rename generate_enum_full_value() to c_enum_const()

2015-05-14 Thread Eric Blake
From: Markus Armbruster 

Signed-off-by: Markus Armbruster 
Signed-off-by: Eric Blake 
---
 scripts/qapi-event.py | 5 ++---
 scripts/qapi-types.py | 6 +++---
 scripts/qapi-visit.py | 4 ++--
 scripts/qapi.py   | 6 +++---
 4 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/scripts/qapi-event.py b/scripts/qapi-event.py
index c4612e3..a7e0033 100644
--- a/scripts/qapi-event.py
+++ b/scripts/qapi-event.py
@@ -177,7 +177,7 @@ typedef enum %(event_enum_name)s
   event_enum_name = event_enum_name)

 # append automatically generated _MAX value
-enum_max_value = generate_enum_full_value(event_enum_name, "MAX")
+enum_max_value = c_enum_const(event_enum_name, "MAX")
 enum_values = event_enum_values + [ enum_max_value ]

 i = 0
@@ -343,8 +343,7 @@ for expr in exprs:
 fdecl.write(ret)

 # We need an enum value per event
-event_enum_value = generate_enum_full_value(event_enum_name,
-event_name)
+event_enum_value = c_enum_const(event_enum_name, event_name)
 ret = generate_event_implement(api_name, event_name, params)
 fdef.write(ret)

diff --git a/scripts/qapi-types.py b/scripts/qapi-types.py
index e74cabe..6ca48c1 100644
--- a/scripts/qapi-types.py
+++ b/scripts/qapi-types.py
@@ -118,13 +118,13 @@ const char *%(name)s_lookup[] = {
  name=name)
 i = 0
 for value in values:
-index = generate_enum_full_value(name, value)
+index = c_enum_const(name, value)
 ret += mcgen('''
 [%(index)s] = "%(value)s",
 ''',
  index = index, value = value)

-max_index = generate_enum_full_value(name, 'MAX')
+max_index = c_enum_const(name, 'MAX')
 ret += mcgen('''
 [%(max_index)s] = NULL,
 };
@@ -150,7 +150,7 @@ typedef enum %(name)s

 i = 0
 for value in enum_values:
-enum_full_value = generate_enum_full_value(name, value)
+enum_full_value = c_enum_const(name, value)
 enum_decl += mcgen('''
 %(enum_full_value)s = %(i)d,
 ''',
diff --git a/scripts/qapi-visit.py b/scripts/qapi-visit.py
index ba623b1..0368e62 100644
--- a/scripts/qapi-visit.py
+++ b/scripts/qapi-visit.py
@@ -214,7 +214,7 @@ void visit_type_%(name)s(Visitor *m, %(name)s **obj, const 
char *name, Error **e
 or find_union(members[key])
 or find_enum(members[key])), "Invalid alternate member"

-enum_full_value = generate_enum_full_value(disc_type, key)
+enum_full_value = c_enum_const(disc_type, key)
 ret += mcgen('''
 case %(enum_full_value)s:
 visit_type_%(c_type)s(m, &(*obj)->%(c_name)s, name, &err);
@@ -315,7 +315,7 @@ void visit_type_%(name)s(Visitor *m, %(name)s **obj, const 
char *name, Error **e
 else:
 fmt = 'visit_type_implicit_%(c_type)s(m, &(*obj)->%(c_name)s, 
&err);'

-enum_full_value = generate_enum_full_value(disc_type, key)
+enum_full_value = c_enum_const(disc_type, key)
 ret += mcgen('''
 case %(enum_full_value)s:
 ''' + fmt + '''
diff --git a/scripts/qapi.py b/scripts/qapi.py
index 7330f7c..1258f76 100644
--- a/scripts/qapi.py
+++ b/scripts/qapi.py
@@ -960,7 +960,7 @@ def camel_to_upper(value):
 new_name += c
 return new_name.lstrip('_').upper()

-def generate_enum_full_value(enum_name, enum_value):
-abbrev_string = camel_to_upper(enum_name)
-value_string = camel_to_upper(enum_value)
+def c_enum_const(type_name, const_name):
+abbrev_string = camel_to_upper(type_name)
+value_string = camel_to_upper(const_name)
 return "%s_%s" % (abbrev_string, value_string)
-- 
2.1.0




[Qemu-devel] [PATCH v4 01/16] qapi: Fix C identifiers generated for names containing '.'

2015-05-14 Thread Eric Blake
From: Markus Armbruster 

c_fun() maps '.' to '_', c_var() doesn't.  Nothing prevents '.' in
QAPI names that get passed to c_var().

Which QAPI names get passed to c_fun(), to c_var(), or to both is not
obvious.  Names of command parameters and struct type members get
passed to c_var().

c_var() strips a leading '*', but this cannot happen.  c_fun()
doesn't.

Fix c_var() to work exactly like c_fun().

Perhaps they should be replaced by a single mapping function.

Signed-off-by: Markus Armbruster 
[add 'import string']
Signed-off-by: Eric Blake 
Reviewed-by: Alberto Garcia 
---
 scripts/qapi.py | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/scripts/qapi.py b/scripts/qapi.py
index 166b74f..4b07805 100644
--- a/scripts/qapi.py
+++ b/scripts/qapi.py
@@ -15,6 +15,7 @@ import re
 from ordereddict import OrderedDict
 import os
 import sys
+import string

 builtin_types = {
 'str':  'QTYPE_QSTRING',
@@ -752,6 +753,8 @@ def camel_case(name):
 new_name += ch.lower()
 return new_name

+c_var_trans = string.maketrans('.-', '__')
+
 def c_var(name, protect=True):
 # ANSI X3J11/88-090, 3.1.1
 c89_words = set(['auto', 'break', 'case', 'char', 'const', 'continue',
@@ -781,10 +784,10 @@ def c_var(name, protect=True):
 polluted_words = set(['unix', 'errno'])
 if protect and (name in c89_words | c99_words | c11_words | gcc_words | 
cpp_words | polluted_words):
 return "q_" + name
-return name.replace('-', '_').lstrip("*")
+return name.translate(c_var_trans)

 def c_fun(name, protect=True):
-return c_var(name, protect).replace('.', '_')
+return c_var(name, protect)

 def c_list_type(name):
 return '%sList' % name
-- 
2.1.0




Re: [Qemu-devel] [RFC PATCH v4 00/28] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service

2015-05-14 Thread zhanghailiang

On 2015/5/14 20:14, Dr. David Alan Gilbert wrote:

* zhanghailiang (zhang.zhanghaili...@huawei.com) wrote:

This is the 4th version of COLO, here is only COLO frame part, include: VM 
checkpoint,
failover, proxy API, block replication API, not include block replication.
The block part has been sent by wencongyang:
[RFC PATCH COLO v2 00/13] Block replication for continuous checkpoints

Compared with last version, there aren't too much optimize and new functions.
The main reason is that there is an known issue that still unsolved, we found
some dirty pages which have been missed setting bit in corresponding bitmap.
And it will trigger strange problem in VM.
We hope to resolve it before add more codes.

You can get the newest integrated qemu colo patches from github:
https://github.com/coloft/qemu/commits/colo-v1.1


I thought I'd just say I've got the remotes/origin/colo_huawei_v4.7 off Wen's
git running here on one of my pair of machines.

This set of hosts is mostly OK running colo; the proxy module still gets
upset sometimes; but mostly on errors;  if the colo-proxy-script fails
during startup, I get RCU stalls; but if the script works, colo normally
works on this setup.


Hi Dave,

I'm trying to optimize the proxy module codes, and yes, the 'lock' we used in 
proxy module
was a little messy before, and i have finished rewriting the part of 
communication between qemu
and proxy, also removing the parameter of xt_PMYCOLO module ...

There are still some further tests need to be done, i will finish it as soon as 
possible.
I hope to send the next version in next week. And you can test the branch of 
Wen's temporarily,
Except block part, there is no difference in COLO framework between his branch 
and mine. ;)

Thanks,
zhanghailiang





About how to test COLO, Please reference to the follow link.
http://wiki.qemu.org/Features/COLO.

Please review and test.

Known issue still unsolved:
(1) Some pages dirtied without setting its corresponding dirty-bitmap.

Previous posted RFC patch series:
http://lists.nongnu.org/archive/html/qemu-devel/2014-06/msg05567.html
http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg04459.html
https://lists.gnu.org/archive/html/qemu-devel/2015-02/msg04771.html

TODO list:
1 Optimize the process of checkpoint, shorten the time-consuming:
   (Partly done, patch is not include into this series)
1) separate ram and device save/load process to reduce size of extra memory
   used during checkpoint
2) live migrate part of dirty pages to slave during sleep time.
2 Add more debug/stat info
   (Partly done, patch is not include into this series)
   include checkpoint count, proxy discompare count, downtime,
number of live migrated pages, total sent pages, etc.
3 Strengthen failover
4 optimize proxy part, include proxy script.
5 The capability of continuous FT

v4:
- New block replication scheme (use image-fleecing for sencondary side)
- Adress some comments from Eric Blake and Dave
- Add commmand colo-set-checkpoint-period to set the time of periodic checkpoint
- Add a delay (100ms) between continuous checkpoint requests to ensure VM
   run 100ms at least since last pause.

v3:
- use proxy instead of colo agent to compare network packets
- add block replication
- Optimize failover disposal
- handle shutdown

v2:
- use QEMUSizedBuffer/QEMUFile as COLO buffer
- colo support is enabled by default
- add nic replication support
- addressed comments from Eric Blake and Dr. David Alan Gilbert

v1:
- implement the frame of colo

Wen Congyang (1):
   COLO: Add block replication into colo process

zhanghailiang (27):
   configure: Add parameter for configure to enable/disable COLO support
   migration: Introduce capability 'colo' to migration
   COLO: migrate colo related info to slave
   migration: Integrate COLO checkpoint process into migration
   migration: Integrate COLO checkpoint process into loadvm
   COLO: Implement colo checkpoint protocol
   COLO: Add a new RunState RUN_STATE_COLO
   QEMUSizedBuffer: Introduce two help functions for qsb
   COLO: Save VM state to slave when do checkpoint
   COLO RAM: Load PVM's dirty page into SVM's RAM cache temporarily
   COLO VMstate: Load VM state into qsb before restore it
   arch_init: Start to trace dirty pages of SVM
   COLO RAM: Flush cached RAM into SVM's memory
   COLO failover: Introduce a new command to trigger a failover
   COLO failover: Implement COLO master/slave failover work
   COLO failover: Don't do failover during loading VM's state
   COLO: Add new command parameter 'colo_nicname' 'colo_script' for net
   COLO NIC: Init/remove colo nic devices when add/cleanup tap devices
   COLO NIC: Implement colo nic device interface configure()
   COLO NIC : Implement colo nic init/destroy function
   COLO NIC: Some init work related with proxy module
   COLO: Do checkpoint according to the result of net packets comparing
   COLO: Improve checkpoint efficiency by do additional periodic
 checkpoint
   COLO: Add colo-set-c

[Qemu-devel] [PATCH v4 09/16] qapi: Make c_type() consistently convert qapi names

2015-05-14 Thread Eric Blake
Continuing the string of cleanups for supporting downstream names
containing '.', this patch focuses on ensuring c_type() can
handle a downstream name.  This patch alone does not fix the
places where generator output should be calling this function
but was open-coding things instead, but it gets us a step closer.

In particular, the changes to c_list_type() and type_name() mean
that type_name(FOO) now handles the case when FOO contains '.',
'-', or is a ticklish identifier other than a builtin (builtins
are exempted because ['int'] must remain mapped to 'intList' and
not 'q_intList').  Meanwhile, ['unix'] now maps to 'q_unixList'
rather than 'unixList', to match the fact that 'unix' is ticklish;
however, our naming conventions state that complex types should
start with a capital, so no type name following conventions will
ever have the 'q_' prepended.

Likewise, changes to c_type() mean that c_type(FOO) properly
handles an enum or complex type FOO with '.' or '-' in the
name, or is a ticklish identifier (again, a ticklish identifier
as a type name violates conventions).

Signed-off-by: Eric Blake 

---

v4: more comments based on feedback from Markus; rebase atop
inserted patch with parameter rename
---
 scripts/qapi.py | 33 +
 1 file changed, 29 insertions(+), 4 deletions(-)

diff --git a/scripts/qapi.py b/scripts/qapi.py
index 85e5d00..309dfec 100644
--- a/scripts/qapi.py
+++ b/scripts/qapi.py
@@ -769,6 +769,15 @@ def c_enum_const(type_name, const_name):

 c_name_trans = string.maketrans('.-', '__')

+# Map @name to a valid C identifier.
+# If @protect, avoid returning certain ticklish identifiers (like
+# C keywords) by prepending "q_".
+#
+# Used for converting 'name' from a 'name':'type' qapi definition
+# into a generated struct member, as well as converting type names
+# into substrings of a generated C function name.
+# '__a.b_c' -> '__a_b_c', 'x-foo' -> 'x_foo'
+# protect=True: 'int' -> 'q_int'; protect=False: 'int' -> 'int'
 def c_name(name, protect=True):
 # ANSI X3J11/88-090, 3.1.1
 c89_words = set(['auto', 'break', 'case', 'char', 'const', 'continue',
@@ -800,13 +809,25 @@ def c_name(name, protect=True):
 return "q_" + name
 return name.translate(c_name_trans)

+# Map type @name to the C typedef name for the list form.
+#
+# ['Name'] -> 'NameList', ['x-Foo'] -> 'x_FooList', ['int'] -> 'intList'
 def c_list_type(name):
-return name + 'List'
+return type_name(name) + 'List'

+# Map type @value to the C typedef form.
+#
+# Used for converting 'type' from a 'member':'type' qapi definition
+# into the alphanumeric portion of the type for a generated C parameter,
+# as well as generated C function names.  See c_type() for the rest of
+# the conversion such as adding '*' on pointer types.
+# 'int' -> 'int', '[x-Foo]' -> 'x_FooList', '__a.b_c' -> '__a_b_c'
 def type_name(value):
 if type(value) == list:
 return c_list_type(value[0])
-return value
+if value in builtin_types.keys():
+return value
+return c_name(value)

 def add_name(name, info, meta, implicit = False):
 global all_names
@@ -865,6 +886,10 @@ def is_enum(name):
 eatspace = '\033EATSPACE.'
 pointer_suffix = ' *' + eatspace

+# Map type @name to its C type expression.
+# If @is_param, const-qualify the string type.
+#
+# This function is used for computing the full C type of 'member':'name'.
 # A special suffix is added in c_type() for pointer types, and it's
 # stripped in mcgen(). So please notice this when you check the return
 # value of c_type() outside mcgen().
@@ -889,7 +914,7 @@ def c_type(value, is_param=False):
 elif type(value) == list:
 return c_list_type(value[0]) + pointer_suffix
 elif is_enum(value):
-return value
+return c_name(value)
 elif value == None:
 return 'void'
 elif value in events:
@@ -897,7 +922,7 @@ def c_type(value, is_param=False):
 else:
 # complex type name
 assert isinstance(value, str) and str != ""
-return value + pointer_suffix
+return c_name(value) + pointer_suffix

 def is_c_ptr(value):
 return c_type(value).endswith(pointer_suffix)
-- 
2.1.0




[Qemu-devel] [PATCH v4 06/16] qapi: Use c_enum_const() in generate_alternate_qtypes()

2015-05-14 Thread Eric Blake
From: Markus Armbruster 

Missed in commit b0b5819.

Signed-off-by: Markus Armbruster 
Signed-off-by: Eric Blake 
---
 scripts/qapi-types.py |  6 ++
 scripts/qapi.py   | 11 ---
 2 files changed, 2 insertions(+), 15 deletions(-)

diff --git a/scripts/qapi-types.py b/scripts/qapi-types.py
index 6ca48c1..9eb08a6 100644
--- a/scripts/qapi-types.py
+++ b/scripts/qapi-types.py
@@ -180,11 +180,9 @@ const int %(name)s_qtypes[QTYPE_MAX] = {
 assert qtype, "Invalid alternate member"

 ret += mcgen('''
-[ %(qtype)s ] = %(abbrev)s_KIND_%(enum)s,
+[ %(qtype)s ] = %(enum_const)s,
 ''',
-qtype = qtype,
-abbrev = de_camel_case(name).upper(),
-enum = c_name(de_camel_case(key),False).upper())
+qtype = qtype, enum_const = c_enum_const(name + 'Kind', key))

 ret += mcgen('''
 };
diff --git a/scripts/qapi.py b/scripts/qapi.py
index b917cad..3757a91 100644
--- a/scripts/qapi.py
+++ b/scripts/qapi.py
@@ -729,17 +729,6 @@ def parse_args(typeinfo):
 # value of an optional argument.
 yield (argname, argentry, optional)

-def de_camel_case(name):
-new_name = ''
-for ch in name:
-if ch.isupper() and new_name:
-new_name += '_'
-if ch == '-':
-new_name += '_'
-else:
-new_name += ch.lower()
-return new_name
-
 def camel_case(name):
 new_name = ''
 first = True
-- 
2.1.0




Re: [Qemu-devel] [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED

2015-05-14 Thread Andrew Jones
On Thu, May 14, 2015 at 11:37:46AM +0100, Peter Maydell wrote:
> On 14 May 2015 at 11:31, Andrew Jones  wrote:
> > Forgot to (4): switch from setting userspace's mapping to
> > device memory to normal, non-cacheable. Using device memory
> > caused a problem that Alex Graf found, and Peter Maydell suggested
> > using normal, non-cacheable instead.
> 
> Did you check that non-cacheable is definitely the correct
> kind of Normal memory attribute we want? (ie not write-through).

I was concerned that write-through wouldn't be sufficient. If the
guest writes to its non-cached memory, and QEMU needs to see what
it wrote, then won't write-through fail to work? Unless we some
how invalidate the cache first?

drew



Re: [Qemu-devel] [PATCH v4 00/16] Fix qapi mangling of downstream names

2015-05-14 Thread Eric Blake
On 05/14/2015 06:50 AM, Eric Blake wrote:
> This series makes it possible to use downstream extensions
> (such as __com.redhat_xyz) and temporary names (such as x-foo)
> in every position possible in QAPI schemes, with added tests
> that the generated code still compiles.
> 
> There's still some things we could do to the qapi generator,
> such as normalizing struct member names and C manglings and
> creating named implicit types up front on the initial parse
> rather than multiple times in each backend.  But that should
> wait until existing pending patches have landed, to minimize
> rebase churn.
> 
> v3 was here:
> https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg00519.html
> [already queued in Markus tree, but not yet in a pull request, so
> that queue can be ruthlessly rebased]
> 
> v4 includes 2 new patches (although 14/14 is somewhat unrelated,

Make that 16/16 as the new patch that could be deferred to later (I was
looking at the overall size of v3, while writing about v4)

> and could easily be dropped from the series if it needs another
> spin without holding up the rest of the series), incorporates
> Markus' suggestions for more comments, and simplifies special-casing
> of lists of builtin types.
> 


-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED

2015-05-14 Thread Peter Maydell
On 14 May 2015 at 14:03, Andrew Jones  wrote:
> On Thu, May 14, 2015 at 11:37:46AM +0100, Peter Maydell wrote:
>> On 14 May 2015 at 11:31, Andrew Jones  wrote:
>> > Forgot to (4): switch from setting userspace's mapping to
>> > device memory to normal, non-cacheable. Using device memory
>> > caused a problem that Alex Graf found, and Peter Maydell suggested
>> > using normal, non-cacheable instead.
>>
>> Did you check that non-cacheable is definitely the correct
>> kind of Normal memory attribute we want? (ie not write-through).
>
> I was concerned that write-through wouldn't be sufficient. If the
> guest writes to its non-cached memory, and QEMU needs to see what
> it wrote, then won't write-through fail to work? Unless we some
> how invalidate the cache first?

Well, I meant more that the correct mapping for userspace is
the same as the guest, whatever that is, and so somebody needs
to look at what the guest actually does rather than merely
hoping NormalNC is OK. (For instance, do we need to provide
support for QEMU to map both NC and writethrough?)

-- PMM



Re: [Qemu-devel] [PATCH v2 2/5] Fix migration in case of scsi-generic

2015-05-14 Thread Paolo Bonzini


On 14/05/2015 13:02, Dimitris Aragiorgis wrote:
> Also note that QEMU seems to flush the disk cache explicitly in
> case of iSCSI, using iscsi_synchronizecache10_task() from libiscsi
> inside iscsi_co_flush().

Right, QEMU does that if type is DISK only.

> Perhaps we can also extend this approach to scsi-generic e.g. by
> calling sg_ll_sync_cache_10() from libsgutils2. This is a distinct,
> orthogonal patch that could be added in the future.

Yes.  But do not use libsgutils2, just do it manually in QEMU.  See
for example get_stream_blocksize in hw/scsi/scsi-generic.c.

Paolo



Re: [Qemu-devel] [Xen-devel] [PATCH] Do not emulate a floppy drive when -nodefaults

2015-05-14 Thread Sander Eikelenboom

Thursday, May 14, 2015, 2:53:17 PM, you wrote:



> On 14/05/2015 14:45, Markus Armbruster wrote:
>> Paolo Bonzini  writes:
>> 
>>> On 14/05/2015 14:02, Markus Armbruster wrote:
   It should certainly be off for pc-q35-2.4 and newer.  Real Q35 boards
   commonly don't have an FDC (depends on the Super I/O chip used).

   We may want to keep it off for pc-i440fx-2.4 and newer.  I doubt
   there's a real i440FX without an FDC, but our virtual i440FX is quite
   unlike a real one in other ways already.
>>>
>>> That would break libvirt for people upgrading from 2.3 to 2.4.  So it's
>>> more like pc-i440fx-3.0 and pc-q35-3.0.
>> 
>> What exactly breaks when?

> libvirt expects "-nodefaults -drive if=none,id=fdd0,... -global
> isa-fdc.driveA=fdd0" to result in a machine with a working FDD.  It
> doesn't know that it has to add "-machine fdc=on".

> Besides, adding a new machine option is not the best we can do.  If the
> default is "no FDC", all that is needed to add one back is -device.  An
> FDC is yet another ISA device, it is possible to create one with -device.

>> add the magic to make -global isa-fdc... auto-set the option to on.

> That would be ugly magic.

> The more I think about this, the more I think this is just a kneejerk
> reaction to a sensationalist announcement.  The effect of this
> vulnerability on properly configured data centers (running
> non-prehistoric versions of Xen or KVM and using
> stubdom/SELinux/AppArmor properly) should be really close to zero.

> It's a storm in a tea cup.

> Paolo

I tend to kindly disagree if you look at the broader perspective, yes it's 
could 
be a storm in a tea cup, but there seems to be a pattern.

From a "cmdline user" / "platform emulation" point of view i can imagine that 
some sort of 
auto configuration / bundling in platforms is necessary (to limit the length of 
the cmdline to be supplied).

But from a library / toolstack point of view it's quite nasty if *all* more or 
less obscure options have to actively disabled. It's quite undoable, not to 
mention what
happens when new ones get added. 

From this PoV it would be better to have to actively enable all the stuff you 
want.

At least Xen seemed to have taken the "no-defaults" as the best option to get 
there so far, but that doesn't seem to 

It's not the first CVE that has come from this "you have to actively disable 
all 
you don't want to happen" and probably won't be the last.

So a "-no-defaults-now-for-real" option/mode for libraries / toolstacks, that 
requires them to be very verbose, explicit and specific in what they actually 
want, could be valuable.

--
Sander

>>>  Unless for q35 we decide to
>>> break everything and retroactively nuke the controller.
>>>
>>> (I'm still not sure why we have backwards-compatible machine types for q35).
>> 
>> Beats me :)
>> 
>> [...]
>> 






Re: [Qemu-devel] [RFC/RFT PATCH v2 3/3] arm/arm64: KVM: implement 'uncached' mem coherency

2015-05-14 Thread Andrew Jones
On Thu, May 14, 2015 at 12:55:49PM +0200, Christoffer Dall wrote:
> On Wed, May 13, 2015 at 01:31:54PM +0200, Andrew Jones wrote:
> > When S1 and S2 memory attributes combine wrt to caching policy,
> > non-cacheable types take precedence. If a guest maps a region as
> > device memory, which KVM userspace is using to emulate the device
> > using normal, cacheable memory, then we lose coherency. With
> > KVM_MEM_UNCACHED, KVM userspace can now hint to KVM which memory
> > regions are likely to be problematic. With this patch, as pages
> > of these types of regions are faulted into the guest, not only do
> > we flush the page's dcache, but we also change userspace's
> > mapping to NC in order to maintain coherency.
> > 
> > What if the guest doesn't do what we expect? While we can't
> > force a guest to use cacheable memory, we can take advantage of
> > the non-cacheable precedence, and force it to use non-cacheable.
> > So, this patch also introduces PAGE_S2_NORMAL_NC, and uses it on
> > KVM_MEM_UNCACHED regions to force them to NC.
> > 
> > We now have both guest and userspace on the same page (pun intended)
> 
> I'd like to revisit the overall approach here.  Is doing non-cached
> accesses in both the guest and host really the right thing to do here?

I think so, but all ideas/approaches are still on the table. This is
still an RFC.

> 
> The semantics of the device becomes that it is cache coherent (because
> QEMU is), and I think Marc argued that Linux/UEFI should simply be
> adapted to handle whatever emulated devices we have as coherent.  I also
> remember someone arguing that would be wrong (Peter?).

I'm not really for quirking all devices in all guest types (AAVMF, Linux,
other bootloaders, other OSes). Windows is unlikely to apply any quirks.

> 
> Finally, does this address all cache coherency issues with emulated
> devices?  Some VOS guys had seen things still not working with this
> approach, unsure why...  I'd like to avoid us merging this only to merge
> a more complete solution in a few weeks which reverts this solution...

I'm not sure (this is still an RFT too :-) We definitely would need to
scatter some more memory_region_set_uncached() calls around QEMU first.

> 
> More comments/questions below:
> 
> > 
> > Signed-off-by: Andrew Jones 
> > ---
> >  arch/arm/include/asm/kvm_mmu.h|  5 -
> >  arch/arm/include/asm/pgtable-3level.h |  1 +
> >  arch/arm/include/asm/pgtable.h|  1 +
> >  arch/arm/kvm/mmu.c| 37 
> > +++
> >  arch/arm64/include/asm/kvm_mmu.h  |  5 -
> >  arch/arm64/include/asm/memory.h   |  1 +
> >  arch/arm64/include/asm/pgtable.h  |  1 +
> >  7 files changed, 36 insertions(+), 15 deletions(-)
> > 
> > diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> > index 405aa18833073..e8034a80b12e5 100644
> > --- a/arch/arm/include/asm/kvm_mmu.h
> > +++ b/arch/arm/include/asm/kvm_mmu.h
> > @@ -214,8 +214,11 @@ static inline void __coherent_cache_guest_page(struct 
> > kvm_vcpu *vcpu, pfn_t pfn,
> > while (size) {
> > void *va = kmap_atomic_pfn(pfn);
> >  
> > -   if (need_flush)
> > +   if (need_flush) {
> > kvm_flush_dcache_to_poc(va, PAGE_SIZE);
> > +   if (ipa_uncached)
> > +   set_memory_nc((unsigned long)va, 1);
> 
> nit: consider moving this outside the need_flush
> 
> > +   }
> >  
> > if (icache_is_pipt())
> > __cpuc_coherent_user_range((unsigned long)va,
> > diff --git a/arch/arm/include/asm/pgtable-3level.h 
> > b/arch/arm/include/asm/pgtable-3level.h
> > index a745a2a53853c..39b3f7a40e663 100644
> > --- a/arch/arm/include/asm/pgtable-3level.h
> > +++ b/arch/arm/include/asm/pgtable-3level.h
> > @@ -121,6 +121,7 @@
> >   * 2nd stage PTE definitions for LPAE.
> >   */
> >  #define L_PTE_S2_MT_UNCACHED   (_AT(pteval_t, 0x0) << 2) /* 
> > strongly ordered */
> > +#define L_PTE_S2_MT_NORMAL_NC  (_AT(pteval_t, 0x5) << 2) /* 
> > normal non-cacheable */
> >  #define L_PTE_S2_MT_WRITETHROUGH   (_AT(pteval_t, 0xa) << 2) /* normal 
> > inner write-through */
> >  #define L_PTE_S2_MT_WRITEBACK  (_AT(pteval_t, 0xf) << 2) /* 
> > normal inner write-back */
> >  #define L_PTE_S2_MT_DEV_SHARED (_AT(pteval_t, 0x1) << 2) /* 
> > device */
> > diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
> > index f40354198bad4..ae13ca8b0a23d 100644
> > --- a/arch/arm/include/asm/pgtable.h
> > +++ b/arch/arm/include/asm/pgtable.h
> > @@ -100,6 +100,7 @@ extern pgprot_t pgprot_s2_device;
> >  #define PAGE_HYP   _MOD_PROT(pgprot_kernel, L_PTE_HYP)
> >  #define PAGE_HYP_DEVICE_MOD_PROT(pgprot_hyp_device, L_PTE_HYP)
> >  #define PAGE_S2_MOD_PROT(pgprot_s2, L_PTE_S2_RDONLY)
> > +#define PAGE_S2_NORMAL_NC  __pgprot((pgprot_val(PAGE_S2) & 
> > ~L_PTE_S

Re: [Qemu-devel] [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED

2015-05-14 Thread Laszlo Ersek
On 05/14/15 15:00, Andrew Jones wrote:
> On Thu, May 14, 2015 at 01:38:11PM +0100, Peter Maydell wrote:
>> On 14 May 2015 at 13:28, Paolo Bonzini  wrote:
>>> Well, PCI BARs are generally MMIO resources, and hence should not be cached.
>>>
>>> As an optimization, OS drivers can mark them as cacheable or
>>> write-combining or something like that, but in general it's a safe
>>> default to leave them uncached---one would think.
>>
>> Isn't this handled by the OS mapping them in the 'prefetchable'
>> MMIO window rather than the 'non-prefetchable' one? (QEMU's
>> generic-PCIe device doesn't yet support the prefetchable window.)
> 
> I was thinking (with my limited PCI knowledge) the same thing, and
> was planning on experimenting with that.

This could be supported in UEFI as well, with the following steps:
- the DTB that QEMU provides UEFI with should advertise such a
  prefetchable window.
- The driver in UEFI that parses the DTB should understand that DTB
  node (well, record type), and store the appropriate base & size into
  some new dynamic PCDs (= basically, firmware wide global variables;
  PCD = platform configuration database)
- The entry point of the host bridge driver would call
  gDS->AddMemorySpace() twice, separately for the two different windows,
  with their appropriate caching attributes.
- The host bridge driver needs to be extended so that TypePMem32
  requests are not rejected (like now); they should be handled
  similarly to TypeMem32. Except, the gDS->AllocateMemorySpace() call
  should allocate from the prefetchable range (determined by the new
  PCDs above).
- QEMU's emulated devices should then expose their BARs as prefetchable
  (so that the above branch would be taken in the host bridge driver).

(Of course, if QEMU intends to emulate PCI devices somewhat
realistically, then QEMU should claim "non-prefetchable" for BARs that
would not be prefetchable on physical hardware either, and then the
hypervisor should accommodate the firmware's UC mapping and say "hey I
know better, we're virtual in fact", and override the attribute (-> use
WB instead of UC). With which we'd be back to square one...)

Thanks
Laszlo



Re: [Qemu-devel] [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED

2015-05-14 Thread Laszlo Ersek
On 05/14/15 15:11, Peter Maydell wrote:
> On 14 May 2015 at 14:03, Andrew Jones  wrote:
>> On Thu, May 14, 2015 at 11:37:46AM +0100, Peter Maydell wrote:
>>> On 14 May 2015 at 11:31, Andrew Jones  wrote:
 Forgot to (4): switch from setting userspace's mapping to
 device memory to normal, non-cacheable. Using device memory
 caused a problem that Alex Graf found, and Peter Maydell suggested
 using normal, non-cacheable instead.
>>>
>>> Did you check that non-cacheable is definitely the correct
>>> kind of Normal memory attribute we want? (ie not write-through).
>>
>> I was concerned that write-through wouldn't be sufficient. If the
>> guest writes to its non-cached memory, and QEMU needs to see what
>> it wrote, then won't write-through fail to work? Unless we some
>> how invalidate the cache first?
> 
> Well, I meant more that the correct mapping for userspace is
> the same as the guest, whatever that is, and so somebody needs
> to look at what the guest actually does

I think Ard explored this earlier, by tracking the MAIRs and stuff.
Can't recall what the findings were though. (Or I could be simply
confused, sorry.)

Laszlo

> rather than merely
> hoping NormalNC is OK. (For instance, do we need to provide
> support for QEMU to map both NC and writethrough?)
> 
> -- PMM
> 




Re: [Qemu-devel] [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED

2015-05-14 Thread Andrew Jones
On Thu, May 14, 2015 at 02:11:59PM +0100, Peter Maydell wrote:
> On 14 May 2015 at 14:03, Andrew Jones  wrote:
> > On Thu, May 14, 2015 at 11:37:46AM +0100, Peter Maydell wrote:
> >> On 14 May 2015 at 11:31, Andrew Jones  wrote:
> >> > Forgot to (4): switch from setting userspace's mapping to
> >> > device memory to normal, non-cacheable. Using device memory
> >> > caused a problem that Alex Graf found, and Peter Maydell suggested
> >> > using normal, non-cacheable instead.
> >>
> >> Did you check that non-cacheable is definitely the correct
> >> kind of Normal memory attribute we want? (ie not write-through).
> >
> > I was concerned that write-through wouldn't be sufficient. If the
> > guest writes to its non-cached memory, and QEMU needs to see what
> > it wrote, then won't write-through fail to work? Unless we some
> > how invalidate the cache first?
> 
> Well, I meant more that the correct mapping for userspace is
> the same as the guest, whatever that is, and so somebody needs
> to look at what the guest actually does rather than merely
> hoping NormalNC is OK. (For instance, do we need to provide
> support for QEMU to map both NC and writethrough?)
>

Ah, we assume the guest is mapping it as device memory, and in
this version of the series, I ensure that it is at least NC with
the S2 attributes. I don't think we can look at what some guests
do with some devices to come up with anything beyond (poor?)
heuristics. I prefer that we force both the guest and QEMU to NC
(or guest chooses Device and QEMU is forced to NC) to make sure
we get it right.

drew



Re: [Qemu-devel] [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED

2015-05-14 Thread Andrew Jones
On Thu, May 14, 2015 at 01:38:11PM +0100, Peter Maydell wrote:
> On 14 May 2015 at 13:28, Paolo Bonzini  wrote:
> > Well, PCI BARs are generally MMIO resources, and hence should not be cached.
> >
> > As an optimization, OS drivers can mark them as cacheable or
> > write-combining or something like that, but in general it's a safe
> > default to leave them uncached---one would think.
> 
> Isn't this handled by the OS mapping them in the 'prefetchable'
> MMIO window rather than the 'non-prefetchable' one? (QEMU's
> generic-PCIe device doesn't yet support the prefetchable window.)

I was thinking (with my limited PCI knowledge) the same thing, and
was planning on experimenting with that.

drew



Re: [Qemu-devel] is there a limit on the number of in-flight I/O operations?

2015-05-14 Thread Andrey Korolyov
On Wed, Aug 27, 2014 at 9:43 AM, Chris Friesen
 wrote:
> On 08/25/2014 03:50 PM, Chris Friesen wrote:
>
>> I think I might have a glimmering of what's going on.  Someone please
>> correct me if I get something wrong.
>>
>> I think that VIRTIO_PCI_QUEUE_MAX doesn't really mean anything with
>> respect to max inflight operations, and neither does virtio-blk calling
>> virtio_add_queue() with a queue size of 128.
>>
>> I think what's happening is that virtio_blk_handle_output() spins,
>> pulling data off the 128-entry queue and calling
>> virtio_blk_handle_request().  At this point that queue entry can be
>> reused, so the queue size isn't really relevant.
>>
>> In virtio_blk_handle_write() we add the request to a MultiReqBuffer and
>> every 32 writes we'll call virtio_submit_multiwrite() which calls down
>> into bdrv_aio_multiwrite().  That tries to merge requests and then for
>> each resulting request calls bdrv_aio_writev() which ends up calling
>> qemu_rbd_aio_writev(), which calls rbd_start_aio().
>>
>> rbd_start_aio() allocates a buffer and converts from iovec to a single
>> buffer.  This buffer stays allocated until the request is acked, which
>> is where the bulk of the memory overhead with rbd is coming from (has
>> anyone considered adding iovec support to rbd to avoid this extra copy?).
>>
>> The only limit I see in the whole call chain from
>> virtio_blk_handle_request() on down is the call to
>> bdrv_io_limits_intercept() in bdrv_co_do_writev().  However, that
>> doesn't provide any limit on the absolute number of inflight operations,
>> only on operations/sec.  If the ceph server cluster can't keep up with
>> the aggregate load, then the number of inflight operations can still
>> grow indefinitely.
>>
>> Chris
>
>
> I was a bit concerned that I'd need to extend the IO throttling code to
> support a limit on total inflight bytes, but it doesn't look like that will
> be necessary.
>
> It seems that using mallopt() to set the trim/mmap thresholds to 128K is
> enough to minimize the increase in RSS and also drop it back down after an
> I/O burst.  For now this looks like it should be sufficient for our
> purposes.
>
> I'm actually a bit surprised I didn't have to go lower, but it seems to work
> for both "dd" and dbench testcases so we'll give it a try.
>
> Chris
>

Bumping this...

For now, we are rarely suffering with an unlimited cache growth issue
which can be observed on all post-1.4 versions of qemu with rbd
backend in a writeback mode and certain pattern of a guest operations.
The issue is confirmed for virtio and can be re-triggered by issuing
excessive amount of write requests without completing returned acks
from a emulator` cache timely. Since most applications behave in a
right way, the oom issue is very rare (and we developed an ugly
workaround for such situations long ago). If anybody is interested in
fixing this, I can send a prepared image for a reproduction or
instructions to make one, whichever is preferable.

Thanks!



Re: [Qemu-devel] [Xen-devel] [PATCH] Do not emulate a floppy drive when -nodefaults

2015-05-14 Thread Daniel P. Berrange
On Thu, May 14, 2015 at 03:25:39PM +0200, Sander Eikelenboom wrote:
> 
> Thursday, May 14, 2015, 2:53:17 PM, you wrote:
> 
> 
> 
> > On 14/05/2015 14:45, Markus Armbruster wrote:
> >> Paolo Bonzini  writes:
> >> 
> >>> On 14/05/2015 14:02, Markus Armbruster wrote:
>    It should certainly be off for pc-q35-2.4 and newer.  Real Q35 boards
>    commonly don't have an FDC (depends on the Super I/O chip used).
> 
>    We may want to keep it off for pc-i440fx-2.4 and newer.  I doubt
>    there's a real i440FX without an FDC, but our virtual i440FX is quite
>    unlike a real one in other ways already.
> >>>
> >>> That would break libvirt for people upgrading from 2.3 to 2.4.  So it's
> >>> more like pc-i440fx-3.0 and pc-q35-3.0.
> >> 
> >> What exactly breaks when?
> 
> > libvirt expects "-nodefaults -drive if=none,id=fdd0,... -global
> > isa-fdc.driveA=fdd0" to result in a machine with a working FDD.  It
> > doesn't know that it has to add "-machine fdc=on".
> 
> > Besides, adding a new machine option is not the best we can do.  If the
> > default is "no FDC", all that is needed to add one back is -device.  An
> > FDC is yet another ISA device, it is possible to create one with -device.
> 
> >> add the magic to make -global isa-fdc... auto-set the option to on.
> 
> > That would be ugly magic.
> 
> > The more I think about this, the more I think this is just a kneejerk
> > reaction to a sensationalist announcement.  The effect of this
> > vulnerability on properly configured data centers (running
> > non-prehistoric versions of Xen or KVM and using
> > stubdom/SELinux/AppArmor properly) should be really close to zero.
> 
> > It's a storm in a tea cup.
> 
> > Paolo
> 
> I tend to kindly disagree if you look at the broader perspective, yes it's 
> could 
> be a storm in a tea cup, but there seems to be a pattern.
> 
> From a "cmdline user" / "platform emulation" point of view i can imagine that 
> some sort of 
> auto configuration / bundling in platforms is necessary (to limit the length 
> of 
> the cmdline to be supplied).
> 
> But from a library / toolstack point of view it's quite nasty if *all* more 
> or 
> less obscure options have to actively disabled. It's quite undoable, not to 
> mention what
> happens when new ones get added. 
> 
> From this PoV it would be better to have to actively enable all the stuff you 
> want.
> 
> At least Xen seemed to have taken the "no-defaults" as the best option to get 
> there so far, but that doesn't seem to 
> 
> It's not the first CVE that has come from this "you have to actively disable 
> all 
> you don't want to happen" and probably won't be the last.
> 
> So a "-no-defaults-now-for-real" option/mode for libraries / toolstacks, that 
> requires them to be very verbose, explicit and specific in what they actually 
> want, could be valuable.

That doesn't really scale to be honest - a -no-defaults-now-for-real
basically means deleting the entire concept of machine types, and
requiring every single aspect of the base chipset to be explicitly
configured. That's certainly possible but its a massive code job for
QEMU and the management apps using QEMU. In fact the amount of code
we'll have to write to achieve that in both QEMU and libvirt would
probably introduce more bugs, so its questionable whether it would
even be a net win in terms of reducing CVEs.

Even if you disable some more default devices, the amount of code
that'll be taken out of the attack surface is going to be tiny
compared to the amount of code that will always remain due to the
feature needs of the OS.

So even with more default devices disabled, we have to assume that
we will continue to see CVEs in stuff that remains and so must ensure
we can mitigate the risks such that they have negligble impact on
people who have deployed QEMU. ie the goal should be that a guest
attack on QEMU ends up being nothing worse than a denial of service
on attacker themselves.

This is why using technologies such as SELinux, AppArmour, StubDomains
and seccomp are so important when running QEMU. There is scope for
applying more such as kernel namespaces to further isolate QEMU
processes from each other.

With these defences in place these kind of bugs really do become a
storm in a teacup as Paolo suggests. That's not to say they are not
important to fix, but we need to have some perspective about where
efforts should be best focused to reduce our security risk.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [RFC PATCH v2 00/13] main-loop: Get rid of fd_read_poll and qemu_set_fd_handler2

2015-05-14 Thread Fam Zheng
On Thu, 05/14 11:36, Paolo Bonzini wrote:
> On 14/05/2015 06:39, Fam Zheng wrote:
> > On Thu, 05/14 11:34, Fam Zheng wrote:
> >> Patch 1 adds a stub for qemu_set_fd_handler which will be referenced in 
> >> coming
> >> patches.
> >>
> >> Patch 2 converts qemu-nbd which compares two global numbers in the 
> >> fd_read_poll
> >> callback.
> >>
> >> Patches 2~5 converts the four net devices, all of which checks
> >> qemu_can_send_packet() in the callback.
> > 
> > s/2~5/3~6/
> > 
> >>
> >> Patch 6 and 7 finally removes the function.
> > 
> > s/6 and 7/7 and 8/
> 
> Should I queue patches 9-12 now?

Yes if you want to, but I lean to leaving them for now: they are not complete
(making the function return value unchecked) without the return type change in
13.

So let's focus more on 1-8 :)

Fam



Re: [Qemu-devel] [PATCH] Do not emulate a floppy drive when -nodefaults

2015-05-14 Thread Markus Armbruster
Paolo Bonzini  writes:

> On 14/05/2015 13:47, Michael S. Tsirkin wrote:
>> > I would be OK with a new property too, as we could set it from
>> > libxl or libvirt. Anybody would be happy to pick this one up or should I
>> > do it?
>>
>> Pls go ahead, I can merge it in the pc tree.
>
> Note that because of the "-drive if=none,id=fdd1 -global
> isa-fdc.driveA=fdd1" trick to create a floppy drive without if=floppy,
> the default should remain on---at least for a few releases---even if
> -nodefaults is passed.

Unless we bite the bullet and add the magic to make -global
isa-fdc... auto-set the option to on.



Re: [Qemu-devel] [RFC/RFT PATCH v2 1/3] arm/arm64: pageattr: add set_memory_nc

2015-05-14 Thread Andrew Jones
On Thu, May 14, 2015 at 01:05:09PM +0200, Christoffer Dall wrote:
> On Wed, May 13, 2015 at 01:31:52PM +0200, Andrew Jones wrote:
> > Provide a method to change normal, cacheable memory to non-cacheable.
> > KVM will make use of this to keep emulated device memory regions
> > coherent with the guest.
> > 
> > Signed-off-by: Andrew Jones 
> 
> Reviewed-by: Christoffer Dall 
> 
> But you obviously need Russell and Will/Catalin to ack/merge this.

I guess this patch is going to go away in the next round. You've
pointed out that I screwed stuff up royally with my over eagerness
to reuse code. I need to reimplement change_memory_common, but a
version that takes an mm, which is more or less what I did in the
last version of this series, back when I was pinning pages.

drew



Re: [Qemu-devel] [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED

2015-05-14 Thread Michael S. Tsirkin
On Thu, May 14, 2015 at 03:32:10PM +0200, Laszlo Ersek wrote:
> On 05/14/15 15:00, Andrew Jones wrote:
> > On Thu, May 14, 2015 at 01:38:11PM +0100, Peter Maydell wrote:
> >> On 14 May 2015 at 13:28, Paolo Bonzini  wrote:
> >>> Well, PCI BARs are generally MMIO resources, and hence should not be 
> >>> cached.
> >>>
> >>> As an optimization, OS drivers can mark them as cacheable or
> >>> write-combining or something like that, but in general it's a safe
> >>> default to leave them uncached---one would think.
> >>
> >> Isn't this handled by the OS mapping them in the 'prefetchable'
> >> MMIO window rather than the 'non-prefetchable' one? (QEMU's
> >> generic-PCIe device doesn't yet support the prefetchable window.)
> > 
> > I was thinking (with my limited PCI knowledge) the same thing, and
> > was planning on experimenting with that.
> 
> This could be supported in UEFI as well, with the following steps:
> - the DTB that QEMU provides UEFI with should advertise such a
>   prefetchable window.
> - The driver in UEFI that parses the DTB should understand that DTB
>   node (well, record type), and store the appropriate base & size into
>   some new dynamic PCDs (= basically, firmware wide global variables;
>   PCD = platform configuration database)
> - The entry point of the host bridge driver would call
>   gDS->AddMemorySpace() twice, separately for the two different windows,
>   with their appropriate caching attributes.
> - The host bridge driver needs to be extended so that TypePMem32
>   requests are not rejected (like now); they should be handled
>   similarly to TypeMem32. Except, the gDS->AllocateMemorySpace() call
>   should allocate from the prefetchable range (determined by the new
>   PCDs above).
> - QEMU's emulated devices should then expose their BARs as prefetchable
>   (so that the above branch would be taken in the host bridge driver).
> 
> (Of course, if QEMU intends to emulate PCI devices somewhat
> realistically, then QEMU should claim "non-prefetchable" for BARs that
> would not be prefetchable on physical hardware either, and then the
> hypervisor should accommodate the firmware's UC mapping and say "hey I
> know better, we're virtual in fact", and override the attribute (-> use
> WB instead of UC). With which we'd be back to square one...)
> 
> Thanks
> Laszlo

Prefetcheable is unrelated to BAR caching or drivers, it's a way to tell
host bridges they can do limited tweaks to downstream transactions in a
specific range.

Really non-prefetcheable BARs are mostly those where read has
side-effects, which is best avoided. this does not mean it's ok to
reorder transactions or cache them.

-- 
MST



Re: [Qemu-devel] [PATCH v3 5/6] qemu-iotests: Add test case for mirror with unmap

2015-05-14 Thread Fam Zheng
On Thu, 05/14 12:37, Paolo Bonzini wrote:
> Do other "quick" tests need a QEMU binary?

102



Re: [Qemu-devel] [PATCH v3 1/6] mirror: Do zero write on target if sectors not allocated

2015-05-14 Thread Fam Zheng
On Thu, 05/14 12:30, Paolo Bonzini wrote:
> 
> 
> On 12/05/2015 13:48, Fam Zheng wrote:
> > +if (!bdrv_is_allocated_above(source, NULL, sector_num,
> > + nb_sectors, &pnum)) {
> > +op->nb_sectors = pnum;
> > +if (s->source_may_unmap) {
> 
> Can you avoid this check by introducing bdrv_get_block_status_above?  Then:
> 
> - if BDRV_ZERO, you use bdrv_aio_write_zeroes
> 
> - if BDRV_ALLOCATED, you use bdrv_aio_readv
> 
> - else you use bdrv_aio_discard

OK, that's better.

> 
> > +/*
> > + * Source unallocated sectors have zero data. We can't discard
> > + * target even if s->target_may_unmap, because the discard
> > + * granularity may be different.
> > + */
> > +bdrv_aio_write_zeroes(s->target, sector_num, op->nb_sectors,
> > +  s->target_may_unmap ? BDRV_REQ_MAY_UNMAP 
> > : 0,
> 
> You can set the flag unconditionally.  But it's probably better to make
> this a drive-mirror argument instead of checking the target's
> BlockDriverInfo.

OK, I'll add a flag.

Fam



Re: [Qemu-devel] [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED

2015-05-14 Thread Laszlo Ersek
On 05/14/15 14:34, Christoffer Dall wrote:
> On Thu, May 14, 2015 at 02:28:49PM +0200, Paolo Bonzini wrote:
>>
>>
>> On 14/05/2015 14:24, Christoffer Dall wrote:
>>> On Thu, May 14, 2015 at 02:08:49PM +0200, Paolo Bonzini wrote:


 On 14/05/2015 14:00, Christoffer Dall wrote:
> So, getting back to my original question.  Is the point then that UEFI
> must assume (from ACPI/DT) the cache-coherency properties of the PCI
> controller which exists in hardware on the system you're running on,
> even for the virtual PCI bus because that will be the semantics for
> assigned devices?
>
> And in that case, we have no way to distinguish between passthrough
> devices and virtual devices plugged into the virtual PCI bus?

 Well, we could use the subsystem id.  But it's a hack, and may cause
 incompatibilities with some drivers.  Michael, any ideas?

> What about the idea of having two virtual PCI buses on your system where
> one is always cache-coherent and uses for virtual devices, and the other
> is whatever the hardware is and used for passthrough devices?

 I think that was rejected before.
>>>
>>> Do you remember where?  I just remember Catalin mentioning the idea to
>>> me verbally.
>>
>> In the last centithread on the subject. :)
>>
>> At least I and Peter disagreed.  It's not about the heavy added use of
>> resources, it's more about it being really easy to misconfigure.
>>
>>> But I'm still not sure why UEFI/Linux currently sees our PCI bus as
>>> being non-coherent when in fact it is and we have no passthrough issues
>>> currently.  Are all PCI controllers always non-coherent for some reason
>>> and therefore we model it as such too?
>>
>> Well, PCI BARs are generally MMIO resources, and hence should not be cached.
>>
>> As an optimization, OS drivers can mark them as cacheable or
>> write-combining or something like that, but in general it's a safe
>> default to leave them uncached---one would think.
>>
> ok, I guess this series makes sense then, assuming it works, and
> assuming we don't kill performance by going to RAM all the time when we
> don't have to...

The idea Paolo and I had discussed in the past is:
- Remove the kludge from UEFI, and map all MMIO BARs as uncached by
  default. This should be a theoretically correct approach, and for
  assigned devices, correct in practice too.

- At an appropriate place in the firmware (specifically, right before
  this line: [1]), when PCI devices have been enumerated, but their
  particular drivers (especially VGA) have not been connected yet,
  check the subsystem id / vendor id / etc for each, and if we can tell
  it's virtual, then set the attributes for all of its MMIO bars to
  writeback.

It doesn't seem hard to implement, I've just been shying away from
actually coding it up because I'd like to see it make a difference for
an actual assigned device. That is, reproducing the current (statically
kludged) behavior wouldn't be hard, but I prefer not to write a new
patch until I can test it both ways. UC is broken and WB works for
virtual devices, fine; now let's see if the exact reverse holds for
assigned devices in reality.

... Testing of which will require someone to send a PCI card (NIC or
GPU) -- with an AARCH64 UEFI driver oprom on it -- to my place, so that
I can plug into my Mustang. ;)

Thanks
Laszlo

[1]
https://github.com/tianocore/edk2/blob/99d9ade85aad554a0fa08fff8586b0fd40570ac3/ArmPlatformPkg/ArmVirtualizationPkg/Library/PlatformIntelBdsLib/IntelBdsPlatform.c#L366




Re: [Qemu-devel] [Xen-devel] [PATCH] Do not emulate a floppy drive when -nodefaults

2015-05-14 Thread Paolo Bonzini


On 14/05/2015 15:25, Sander Eikelenboom wrote:
> I tend to kindly disagree if you look at the broader perspective, yes it's 
> could 
> be a storm in a tea cup, but there seems to be a pattern.
> 
> From a "cmdline user" / "platform emulation" point of view i can imagine that 
> some sort of 
> auto configuration / bundling in platforms is necessary (to limit the length 
> of 
> the cmdline to be supplied).
> 
> But from a library / toolstack point of view it's quite nasty if *all* more 
> or 
> less obscure options have to actively disabled. It's quite undoable, not to 
> mention what
> happens when new ones get added. 

Where do you stop?

Do you want to disable ACPI by default?  It's a relatively large amount
of code, but for most modern guests it is necessary.

Besides, removing just the floppy drive makes little sense.  The
following devices remain with -nodefaults:

- an HPET

- the power management device, which includes an I2C controller

- an IDE controller

- a keyboard controller

- the magic VMware I/O port

- the PC speaker (!)

- the legacy PIT

- the real-time clock

- the two PICs and IOAPIC

Of all these, most guests will only use the PM device (maybe) and a
small subset of the RTC.  At the very least, the IDE controller, PC
speaker and the HPET should be removed by -nodefaults-i-mean-it.  If
you're using UEFI firmware, probably you could remove everything except
the PM device---maybe the RTC and the IOAPIC.  At this point this is not
-nodefaults-i-mean-it, it's a different machine type altogether and it's
remarkably different from a PC.

Reducing the attack surface is always nice, but hypervisor and device
model bugs are going to exist forever.  The amount of code for legacy
devices is all in all not that big, and I can hardly recall other
vulnerabilities in there.  This is why I say it's a knee-jerk reaction.

It is the intrinsic problem of virtualization mentioned in the famous
quote of Theo de Raadt(*).  Even if you do PV like Xen or Hyper-V, you
have less or no QEMU code but you throw more hypervisor code in
untrusted hands (hypervisor bugs exist, e.g. CVE-2012-4539,
CVE-2012-5510 or CVE-2013-1964).

(*) Which of course misses the point of virtualization
altogether.  Once you have established that you need
more density, choosing containers vs. virtualization
is a gamble on whether you prefer more moving parts and
more layers that have to be broken (more isolation), or
fewer moving parts and less isolation.

Paolo

> From this PoV it would be better to have to actively enable all the stuff you 
> want.
> 
> At least Xen seemed to have taken the "no-defaults" as the best option to get 
> there so far, but that doesn't seem to 
> 
> It's not the first CVE that has come from this "you have to actively disable 
> all 
> you don't want to happen" and probably won't be the last.
> 
> So a "-no-defaults-now-for-real" option/mode for libraries / toolstacks, that 
> requires them to be very verbose, explicit and specific in what they actually 
> want, could be valuable.



Re: [Qemu-devel] [Xen-devel] [PATCH] Do not emulate a floppy drive when -nodefaults

2015-05-14 Thread Michael S. Tsirkin
On Thu, May 14, 2015 at 03:25:39PM +0200, Sander Eikelenboom wrote:
> 
> Thursday, May 14, 2015, 2:53:17 PM, you wrote:
> 
> 
> 
> > On 14/05/2015 14:45, Markus Armbruster wrote:
> >> Paolo Bonzini  writes:
> >> 
> >>> On 14/05/2015 14:02, Markus Armbruster wrote:
>    It should certainly be off for pc-q35-2.4 and newer.  Real Q35 boards
>    commonly don't have an FDC (depends on the Super I/O chip used).
> 
>    We may want to keep it off for pc-i440fx-2.4 and newer.  I doubt
>    there's a real i440FX without an FDC, but our virtual i440FX is quite
>    unlike a real one in other ways already.
> >>>
> >>> That would break libvirt for people upgrading from 2.3 to 2.4.  So it's
> >>> more like pc-i440fx-3.0 and pc-q35-3.0.
> >> 
> >> What exactly breaks when?
> 
> > libvirt expects "-nodefaults -drive if=none,id=fdd0,... -global
> > isa-fdc.driveA=fdd0" to result in a machine with a working FDD.  It
> > doesn't know that it has to add "-machine fdc=on".
> 
> > Besides, adding a new machine option is not the best we can do.  If the
> > default is "no FDC", all that is needed to add one back is -device.  An
> > FDC is yet another ISA device, it is possible to create one with -device.
> 
> >> add the magic to make -global isa-fdc... auto-set the option to on.
> 
> > That would be ugly magic.
> 
> > The more I think about this, the more I think this is just a kneejerk
> > reaction to a sensationalist announcement.  The effect of this
> > vulnerability on properly configured data centers (running
> > non-prehistoric versions of Xen or KVM and using
> > stubdom/SELinux/AppArmor properly) should be really close to zero.
> 
> > It's a storm in a tea cup.
> 
> > Paolo

I agree.  An option to disable fdc sounds like a reasonable thing to do
reduce attack surface while keeping full compatibility. After all
downstreams disable devices they don't want to support, too.  Changing
defaults just because a device had a bug does sound extreme.

> I tend to kindly disagree if you look at the broader perspective, yes it's 
> could 
> be a storm in a tea cup, but there seems to be a pattern.
> 
> From a "cmdline user" / "platform emulation" point of view i can imagine that 
> some sort of 
> auto configuration / bundling in platforms is necessary (to limit the length 
> of 
> the cmdline to be supplied).
> 
> But from a library / toolstack point of view it's quite nasty if *all* more 
> or 
> less obscure options have to actively disabled. It's quite undoable, not to 
> mention what
> happens when new ones get added. 

I support this statement for new features. Users should opt-in to them.
E.g. we learned this the hard way with pvpanic.

> From this PoV it would be better to have to actively enable all the stuff you 
> want.
> 
> At least Xen seemed to have taken the "no-defaults" as the best option to get 
> there so far, but that doesn't seem to 
> 
> It's not the first CVE that has come from this "you have to actively disable 
> all 
> you don't want to happen" and probably won't be the last.
> 
> So a "-no-defaults-now-for-real" option/mode for libraries / toolstacks, that 
> requires them to be very verbose, explicit and specific in what they actually 
> want, could be valuable.
> 
> --
> Sander

If you would change what this flag means in each release, that would
achieve no useful purpose IMO.

> >>>  Unless for q35 we decide to
> >>> break everything and retroactively nuke the controller.
> >>>
> >>> (I'm still not sure why we have backwards-compatible machine types for 
> >>> q35).
> >> 
> >> Beats me :)
> >> 
> >> [...]
> >> 
> 

I'm fine with making them all identical.

-- 
MST



Re: [Qemu-devel] [PATCH COLO v4 01/15] docs: block replication's description

2015-05-14 Thread Wen Congyang

At 2015/5/14 19:19, Dr. David Alan Gilbert Wrote:

One thing I wanted to check I understand;  how much RAM do the active and hidden
disks use;  lets say during the 1st checkpoint   10MB of disk is written,
and during hte 2nd checkpoint a different 10MB of the disk is written
(to different locations on the disk); how big are the active and hidden
disks in RAM - are they 10MB or 20MB?


10MB, we will make active and hidden disks empty when doing checkpoint.

Thanks
Wen Congyang



Re: [Qemu-devel] [PATCH v3 3/5] raw-posix: DPRINTF instead of DEBUG_BLOCK_PRINT

2015-05-14 Thread Eric Blake
On 05/14/2015 05:01 AM, Dimitris Aragiorgis wrote:
> Building the QEMU tools fails if we #define DEBUG_BLOCK inside
> block/raw-posix.c. Here instead of adding qemu-log.o in block-obj-y
> so that DEBUG_BLOCK_PRINT can be used, we substitute the latter with
> a simple DPRINTF() (that does not cause bit-rot).
> 
> Signed-off-by: Dimitris Aragiorgis 
> ---
>  block/raw-posix.c |   17 +++--
>  1 file changed, 11 insertions(+), 6 deletions(-)
> 

> +#define DPRINTF(fmt, ...) \
> +do { \
> +if (DEBUG_BLOCK_PRINT) \
> +printf(fmt, ## __VA_ARGS__); \
> +} while (0)

Still missing {} for the if statement.

[It might be nice to update scripts/checkpatch.pl to catch correct {}
inside macros, but that's a project for a different day.]

> @@ -1023,7 +1028,7 @@ static int xfs_write_zeroes(BDRVRawState *s, int64_t 
> offset, uint64_t bytes)
>  fl.l_len = bytes;
>  
>  if (xfsctl(NULL, s->fd, XFS_IOC_ZERO_RANGE, &fl) < 0) {
> -DEBUG_BLOCK_PRINT("cannot write zero range (%s)\n", strerror(errno));
> +DPRINTF("cannot write zero range (%s)\n", strerror(errno));
>  return -errno;
>  }

Pre-existing bug - strerror() and/or DPRINTF() can clobber errno, so
your return value may be destroyed.  You'll have to save errno into a
temporary variable prior to DPRINTF, and return that value.

>  
> @@ -1040,7 +1045,7 @@ static int xfs_discard(BDRVRawState *s, int64_t offset, 
> uint64_t bytes)
>  fl.l_len = bytes;
>  
>  if (xfsctl(NULL, s->fd, XFS_IOC_UNRESVSP64, &fl) < 0) {
> -DEBUG_BLOCK_PRINT("cannot punch hole (%s)\n", strerror(errno));
> +DPRINTF("cannot punch hole (%s)\n", strerror(errno));
>  return -errno;

Here too.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v2] mirror: correct buf_size

2015-05-14 Thread Fam Zheng
On Thu, 05/14 12:38, Paolo Bonzini wrote:
> 
> 
> On 14/05/2015 12:29, Wen Congyang wrote:
> > 
> > If buf_size % granularity is not 0, mirror_free_init() will
> > do dangerous things.
> > 
> > Signed-off-by: Wen Congyang 
> > ---
> >  block/mirror.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/block/mirror.c b/block/mirror.c
> > index 58f391a..9521212 100644
> > --- a/block/mirror.c
> > +++ b/block/mirror.c
> > @@ -684,7 +684,7 @@ static void mirror_start_job(BlockDriverState *bs, 
> > BlockDriverState *target,
> >  s->is_none_mode = is_none_mode;
> >  s->base = base;
> >  s->granularity = granularity;
> > -s->buf_size = MAX(buf_size, granularity);
> > +s->buf_size = ROUND_UP(buf_size, granularity);
> >  
> >  s->dirty_bitmap = bdrv_create_dirty_bitmap(bs, granularity, NULL, 
> > errp);
> >  if (!s->dirty_bitmap) {
> > 
> 
> Reviewed-by: Paolo Bonzini 

What if buf_size is negative?

Fam



Re: [Qemu-devel] [PATCH] Do not emulate a floppy drive when -nodefaults

2015-05-14 Thread Michael S. Tsirkin
On Thu, May 14, 2015 at 02:02:04PM +0200, Markus Armbruster wrote:
> Correct.
> 
> Here's how I think it should be done:
> 
> * Create a machine option to control the FDC
> 
>   This is a machine-specific option.  It should only exist for machine
>   types that have an optional FDC.
> 
>   Default must be "on" for old machine types.  Default may be "off" for
>   new machine types.
> 
>   It should certainly be off for pc-q35-2.4 and newer.  Real Q35 boards
>   commonly don't have an FDC (depends on the Super I/O chip used).
> 
>   We may want to keep it off for pc-i440fx-2.4 and newer.  I doubt
>   there's a real i440FX without an FDC, but our virtual i440FX is quite
>   unlike a real one in other ways already.

I think making it off by default is a bad idea, it will break
command-line users.


> * Create the FDC only if the option is "on".
> 
> * Optional: make -drive if=floppy,... auto-enable it

Every time we do such auto hacks, we regret this later.
Just do what we are told, fail if=floppy if disabled.

>   I wouldn't bother doing the same for -global isa-fdc.driveA=... and
>   such.
> 
> Stefano, if you're willing to tackle this, go right ahead!



Re: [Qemu-devel] [PATCH] block: get_block_status: use "else" when testing the opposite condition

2015-05-14 Thread Eric Blake
On 05/14/2015 04:35 AM, Paolo Bonzini wrote:
> A bit of Boolean algebra (and common sense) tells us that the
> second "if" here is looking for blocks that are not allocated.
> This is the opposite of the "if" that sets BDRV_BLOCK_ALLOCATED,
> and thus it can use an "else".
> 
> Signed-off-by: Paolo Bonzini 
> ---
>  block/io.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH] block: get_block_status: use "else" when testing the opposite condition

2015-05-14 Thread Fam Zheng
On Thu, 05/14 12:35, Paolo Bonzini wrote:
> A bit of Boolean algebra (and common sense) tells us that the
> second "if" here is looking for blocks that are not allocated.
> This is the opposite of the "if" that sets BDRV_BLOCK_ALLOCATED,
> and thus it can use an "else".
> 
> Signed-off-by: Paolo Bonzini 
> ---
>  block/io.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/block/io.c b/block/io.c
> index 1ce62c4..494e3b3 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -1456,9 +1456,7 @@ static int64_t coroutine_fn 
> bdrv_co_get_block_status(BlockDriverState *bs,
>  
>  if (ret & (BDRV_BLOCK_DATA | BDRV_BLOCK_ZERO)) {
>  ret |= BDRV_BLOCK_ALLOCATED;
> -}
> -
> -if (!(ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO)) {
> +} else {
>  if (bdrv_unallocated_blocks_are_zero(bs)) {
>  ret |= BDRV_BLOCK_ZERO;
>  } else if (bs->backing_hd) {
> -- 
> 2.4.0
> 

Reviewed-by: Fam Zheng 



Re: [Qemu-devel] [PATCH v2] mirror: correct buf_size

2015-05-14 Thread Paolo Bonzini


On 14/05/2015 16:06, Fam Zheng wrote:
> On Thu, 05/14 12:38, Paolo Bonzini wrote:
>>
>>
>> On 14/05/2015 12:29, Wen Congyang wrote:
>>>
>>> If buf_size % granularity is not 0, mirror_free_init() will
>>> do dangerous things.
>>>
>>> Signed-off-by: Wen Congyang 
>>> ---
>>>  block/mirror.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/block/mirror.c b/block/mirror.c
>>> index 58f391a..9521212 100644
>>> --- a/block/mirror.c
>>> +++ b/block/mirror.c
>>> @@ -684,7 +684,7 @@ static void mirror_start_job(BlockDriverState *bs, 
>>> BlockDriverState *target,
>>>  s->is_none_mode = is_none_mode;
>>>  s->base = base;
>>>  s->granularity = granularity;
>>> -s->buf_size = MAX(buf_size, granularity);
>>> +s->buf_size = ROUND_UP(buf_size, granularity);
>>>  
>>>  s->dirty_bitmap = bdrv_create_dirty_bitmap(bs, granularity, NULL, 
>>> errp);
>>>  if (!s->dirty_bitmap) {
>>>
>>
>> Reviewed-by: Paolo Bonzini 
> 
> What if buf_size is negative?

This will fail:

s->buf = qemu_try_blockalign(bs, s->buf_size);

O:-)

but really that should be checked in mirror_start_job, so that the
command fails.

Paolo



Re: [Qemu-devel] [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED

2015-05-14 Thread Laszlo Ersek
On 05/14/15 15:48, Michael S. Tsirkin wrote:
> On Thu, May 14, 2015 at 03:32:10PM +0200, Laszlo Ersek wrote:
>> On 05/14/15 15:00, Andrew Jones wrote:
>>> On Thu, May 14, 2015 at 01:38:11PM +0100, Peter Maydell wrote:
 On 14 May 2015 at 13:28, Paolo Bonzini  wrote:
> Well, PCI BARs are generally MMIO resources, and hence should not be 
> cached.
>
> As an optimization, OS drivers can mark them as cacheable or
> write-combining or something like that, but in general it's a safe
> default to leave them uncached---one would think.

 Isn't this handled by the OS mapping them in the 'prefetchable'
 MMIO window rather than the 'non-prefetchable' one? (QEMU's
 generic-PCIe device doesn't yet support the prefetchable window.)
>>>
>>> I was thinking (with my limited PCI knowledge) the same thing, and
>>> was planning on experimenting with that.
>>
>> This could be supported in UEFI as well, with the following steps:
>> - the DTB that QEMU provides UEFI with should advertise such a
>>   prefetchable window.
>> - The driver in UEFI that parses the DTB should understand that DTB
>>   node (well, record type), and store the appropriate base & size into
>>   some new dynamic PCDs (= basically, firmware wide global variables;
>>   PCD = platform configuration database)
>> - The entry point of the host bridge driver would call
>>   gDS->AddMemorySpace() twice, separately for the two different windows,
>>   with their appropriate caching attributes.
>> - The host bridge driver needs to be extended so that TypePMem32
>>   requests are not rejected (like now); they should be handled
>>   similarly to TypeMem32. Except, the gDS->AllocateMemorySpace() call
>>   should allocate from the prefetchable range (determined by the new
>>   PCDs above).
>> - QEMU's emulated devices should then expose their BARs as prefetchable
>>   (so that the above branch would be taken in the host bridge driver).
>>
>> (Of course, if QEMU intends to emulate PCI devices somewhat
>> realistically, then QEMU should claim "non-prefetchable" for BARs that
>> would not be prefetchable on physical hardware either, and then the
>> hypervisor should accommodate the firmware's UC mapping and say "hey I
>> know better, we're virtual in fact", and override the attribute (-> use
>> WB instead of UC). With which we'd be back to square one...)
>>
>> Thanks
>> Laszlo
> 
> Prefetcheable is unrelated to BAR caching or drivers, it's a way to tell
> host bridges they can do limited tweaks to downstream transactions in a
> specific range.
> 
> Really non-prefetcheable BARs are mostly those where read has
> side-effects, which is best avoided. this does not mean it's ok to
> reorder transactions or cache them.

I believe I understood that (although certainly not in the depth that
you do), because when the idea had come up first (ie. equating cacheable
with prefetchable, or at least "repurposing" the latter for the former)
I had tried to read up on prefetchable (just on the web; no time for
reading the PCI spec. ... I peeked now, it also mentions "write merging"
for bridges.) The way I perceived it, the idea was to give the guest a
hint about caching with the prefetchable bit / DTB entry. Sorry if I was
mistaken.

Thanks
Laszlo



Re: [Qemu-devel] [PATCH] block: Let bdrv_drain_all() to call aio_poll() for each AioContext

2015-05-14 Thread Alexander Yarygin
Paolo Bonzini  writes:

> On 13/05/2015 18:34, Alexander Yarygin wrote:
>> Ah, right. We need second loop, something like this:
>> 
>> @@ -2030,20 +2033,33 @@ void bdrv_drain(BlockDriverState *bs)
>>  void bdrv_drain_all(void)
>>  {
>>  /* Always run first iteration so any pending completion BHs run */
>> -bool busy = true;
>> +bool busy = true, pending = false;
>>  BlockDriverState *bs;
>> +GList *aio_ctxs = NULL, *ctx;
>> +AioContext *aio_context;
>> 
>>  while (busy) {
>>  busy = false;
>> 
>>  QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
>> -AioContext *aio_context = bdrv_get_aio_context(bs);
>> +aio_context = bdrv_get_aio_context(bs);
>> 
>>  aio_context_acquire(aio_context);
>>  busy |= bdrv_drain_one(bs);
>>  aio_context_release(aio_context);
>> +if (!aio_ctxs || !g_list_find(aio_ctxs, aio_context))
>> +aio_ctxs = g_list_append(aio_ctxs, aio_context);
>> +}
>> +pending = busy;
>> +
>> +for (ctx = aio_ctxs; ctx != NULL; ctx = ctx->next) {
>> +aio_context = ctx->data;
>> +aio_context_acquire(aio_context);
>> +busy |= aio_poll(aio_context, pending);
>> +aio_context_release(aio_context);
>>  }
>>  }
>> +g_list_free(aio_ctxs);
>>  }
>> 
>> That looks quite ugly for me and breaks consistence of bdrv_drain_one()
>> since it doesn't call aio_poll() anymore...
>
> It's not ugly.  After your patch bdrv_drain_one doesn't call aio_poll,
> while bdrv_drain and bdrv_drain_all call bdrv_drain_one + aio_poll.  All
> callers of bdrv_drain_one are consistent.
>
> Perhaps you can rename bdrv_drain_one to bdrv_flush_io_queue (inlining
> the existing bdrv_flush_io_queue into it)?  That would work very well
> for me.
>
> Paolo

Hmm, bdrv_flush_io_queue() is public, but has no users. How about
different name, maybe something like "bdrv_drain_requests_one" or so?

Otherwise here is a patch related to renaming to bdrv_flush_io_queue()
below. If it's ok, I will respin the whole patch.

Thanks.

--- a/block.c
+++ b/block.c
@@ -1987,11 +1987,17 @@ static bool bdrv_requests_pending(BlockDriverState *bs)
 return false;
 }
 
-static bool bdrv_drain_one(BlockDriverState *bs)
+static bool bdrv_flush_io_queue(BlockDriverState *bs)
 {
 bool bs_busy;
+BlockDriver *drv = bs->drv;
+
+if (drv && drv->bdrv_flush_io_queue) {
+drv->bdrv_flush_io_queue(bs);
+} else if (bs->file) {
+bdrv_flush_io_queue(bs->file);
+}
 
-bdrv_flush_io_queue(bs);
 bdrv_start_throttled_reqs(bs);
 bs_busy = bdrv_requests_pending(bs);
 return bs_busy;
@@ -2013,7 +2019,7 @@ void bdrv_drain(BlockDriverState *bs)
 
 while (busy) {
 /* Keep iterating */
-busy = bdrv_drain_one(bs);
+busy = bdrv_flush_io_queue(bs);
 busy |= aio_poll(bdrv_get_aio_context(bs), busy);
 }
 }
@@ -2044,7 +2050,7 @@ void bdrv_drain_all(void)
 AioContext *aio_context = bdrv_get_aio_context(bs);
 
 aio_context_acquire(aio_context);
-busy |= bdrv_drain_one(bs);
+busy |= bdrv_flush_io_queue(bs);
 if (!aio_ctxs || !g_slist_find(aio_ctxs, aio_context)) {
 busy |= aio_poll(aio_context, busy);
 aio_ctxs = g_slist_prepend(aio_ctxs, aio_context);
@@ -6088,16 +6094,6 @@ void bdrv_io_unplug(BlockDriverState *bs)
 }
 }
 
-void bdrv_flush_io_queue(BlockDriverState *bs)
-{
-BlockDriver *drv = bs->drv;
-if (drv && drv->bdrv_flush_io_queue) {
-drv->bdrv_flush_io_queue(bs);
-} else if (bs->file) {
-bdrv_flush_io_queue(bs->file);
-}
-}
-
 static bool append_open_options(QDict *d, BlockDriverState *bs)
 {
 const QDictEntry *entry;
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -565,7 +565,6 @@ int bdrv_probe_geometry(BlockDriverState *bs, HDGeometry 
*geo);
 
 void bdrv_io_plug(BlockDriverState *bs);
 void bdrv_io_unplug(BlockDriverState *bs);
-void bdrv_flush_io_queue(BlockDriverState *bs);
 
 BlockAcctStats *bdrv_get_stats(BlockDriverState *bs);
 




Re: [Qemu-devel] [PATCH] block: Let bdrv_drain_all() to call aio_poll() for each AioContext

2015-05-14 Thread Paolo Bonzini


On 14/05/2015 16:29, Alexander Yarygin wrote:
> > Perhaps you can rename bdrv_drain_one to bdrv_flush_io_queue (inlining
> > the existing bdrv_flush_io_queue into it)?  That would work very well
> > for me.
>
> Hmm, bdrv_flush_io_queue() is public, but has no users. How about
> different name, maybe something like "bdrv_drain_requests_one" or so?

It's common for functions to call a driver hook, and then follow up with
generic code.  See bdrv_truncate for an example.  I would just keep
bdrv_flush_io_queue(); bdrv_start_throttled_reqs is really the generic
code to flush the I/O queue.

Perhaps, if you prefer, move bdrv_requests_pending(bs) to the callers so
that it keeps returning void?

Paolo



Re: [Qemu-devel] [Xen-devel] [PATCH] Do not emulate a floppy drive when -nodefaults

2015-05-14 Thread Stefano Stabellini
On Thu, 14 May 2015, Paolo Bonzini wrote:
> On 14/05/2015 15:25, Sander Eikelenboom wrote:
> > I tend to kindly disagree if you look at the broader perspective, yes it's 
> > could 
> > be a storm in a tea cup, but there seems to be a pattern.
> > 
> > From a "cmdline user" / "platform emulation" point of view i can imagine 
> > that some sort of 
> > auto configuration / bundling in platforms is necessary (to limit the 
> > length of 
> > the cmdline to be supplied).
> > 
> > But from a library / toolstack point of view it's quite nasty if *all* more 
> > or 
> > less obscure options have to actively disabled. It's quite undoable, not to 
> > mention what
> > happens when new ones get added. 
> 
> Where do you stop?

At this stage I would be happy enough if no floppy drives were actually
emulated when the user asks for none.


> Do you want to disable ACPI by default?  It's a relatively large amount
> of code, but for most modern guests it is necessary.
> 
> Besides, removing just the floppy drive makes little sense.  The
> following devices remain with -nodefaults:
> 
> - an HPET
> 
> - the power management device, which includes an I2C controller
> 
> - an IDE controller
> 
> - a keyboard controller
> 
> - the magic VMware I/O port
> 
> - the PC speaker (!)
> 
> - the legacy PIT
> 
> - the real-time clock
> 
> - the two PICs and IOAPIC
> 
> Of all these, most guests will only use the PM device (maybe) and a
> small subset of the RTC.  At the very least, the IDE controller, PC
> speaker and the HPET should be removed by -nodefaults-i-mean-it.  If
> you're using UEFI firmware, probably you could remove everything except
> the PM device---maybe the RTC and the IOAPIC.  At this point this is not
> -nodefaults-i-mean-it, it's a different machine type altogether and it's
> remarkably different from a PC.
> 
> Reducing the attack surface is always nice, but hypervisor and device
> model bugs are going to exist forever.  The amount of code for legacy
> devices is all in all not that big, and I can hardly recall other
> vulnerabilities in there.  This is why I say it's a knee-jerk reaction.
> 
> It is the intrinsic problem of virtualization mentioned in the famous
> quote of Theo de Raadt(*).  Even if you do PV like Xen or Hyper-V, you
> have less or no QEMU code but you throw more hypervisor code in
> untrusted hands (hypervisor bugs exist, e.g. CVE-2012-4539,
> CVE-2012-5510 or CVE-2013-1964).
> 
>   (*) Which of course misses the point of virtualization
>   altogether.  Once you have established that you need
>   more density, choosing containers vs. virtualization
>   is a gamble on whether you prefer more moving parts and
>   more layers that have to be broken (more isolation), or
>   fewer moving parts and less isolation.

Sure, but it is harder to write a device emulator in a secure fashion
than a brand new PV interface, that can be designed with security in
mind from scratch. The number of VM escaping CVEs affecting Xen PV
interfaces is extremely low, I think it was just two since the start of
the project.



> > From this PoV it would be better to have to actively enable all the stuff 
> > you want.
> > 
> > At least Xen seemed to have taken the "no-defaults" as the best option to 
> > get 
> > there so far, but that doesn't seem to 
> > 
> > It's not the first CVE that has come from this "you have to actively 
> > disable all 
> > you don't want to happen" and probably won't be the last.
> > 
> > So a "-no-defaults-now-for-real" option/mode for libraries / toolstacks, 
> > that 
> > requires them to be very verbose, explicit and specific in what they 
> > actually 
> > want, could be valuable.
> 



Re: [Qemu-devel] [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED

2015-05-14 Thread Michael S. Tsirkin
On Thu, May 14, 2015 at 04:19:23PM +0200, Laszlo Ersek wrote:
> On 05/14/15 15:48, Michael S. Tsirkin wrote:
> > On Thu, May 14, 2015 at 03:32:10PM +0200, Laszlo Ersek wrote:
> >> On 05/14/15 15:00, Andrew Jones wrote:
> >>> On Thu, May 14, 2015 at 01:38:11PM +0100, Peter Maydell wrote:
>  On 14 May 2015 at 13:28, Paolo Bonzini  wrote:
> > Well, PCI BARs are generally MMIO resources, and hence should not be 
> > cached.
> >
> > As an optimization, OS drivers can mark them as cacheable or
> > write-combining or something like that, but in general it's a safe
> > default to leave them uncached---one would think.
> 
>  Isn't this handled by the OS mapping them in the 'prefetchable'
>  MMIO window rather than the 'non-prefetchable' one? (QEMU's
>  generic-PCIe device doesn't yet support the prefetchable window.)
> >>>
> >>> I was thinking (with my limited PCI knowledge) the same thing, and
> >>> was planning on experimenting with that.
> >>
> >> This could be supported in UEFI as well, with the following steps:
> >> - the DTB that QEMU provides UEFI with should advertise such a
> >>   prefetchable window.
> >> - The driver in UEFI that parses the DTB should understand that DTB
> >>   node (well, record type), and store the appropriate base & size into
> >>   some new dynamic PCDs (= basically, firmware wide global variables;
> >>   PCD = platform configuration database)
> >> - The entry point of the host bridge driver would call
> >>   gDS->AddMemorySpace() twice, separately for the two different windows,
> >>   with their appropriate caching attributes.
> >> - The host bridge driver needs to be extended so that TypePMem32
> >>   requests are not rejected (like now); they should be handled
> >>   similarly to TypeMem32. Except, the gDS->AllocateMemorySpace() call
> >>   should allocate from the prefetchable range (determined by the new
> >>   PCDs above).
> >> - QEMU's emulated devices should then expose their BARs as prefetchable
> >>   (so that the above branch would be taken in the host bridge driver).
> >>
> >> (Of course, if QEMU intends to emulate PCI devices somewhat
> >> realistically, then QEMU should claim "non-prefetchable" for BARs that
> >> would not be prefetchable on physical hardware either, and then the
> >> hypervisor should accommodate the firmware's UC mapping and say "hey I
> >> know better, we're virtual in fact", and override the attribute (-> use
> >> WB instead of UC). With which we'd be back to square one...)
> >>
> >> Thanks
> >> Laszlo
> > 
> > Prefetcheable is unrelated to BAR caching or drivers, it's a way to tell
> > host bridges they can do limited tweaks to downstream transactions in a
> > specific range.
> > 
> > Really non-prefetcheable BARs are mostly those where read has
> > side-effects, which is best avoided. this does not mean it's ok to
> > reorder transactions or cache them.
> 
> I believe I understood that (although certainly not in the depth that
> you do), because when the idea had come up first (ie. equating cacheable
> with prefetchable, or at least "repurposing" the latter for the former)
> I had tried to read up on prefetchable (just on the web; no time for
> reading the PCI spec. ... I peeked now, it also mentions "write merging"
> for bridges.)

Read up on what it is if you like, it is much weaker than WC not to
mention cacheable.

> The way I perceived it, the idea was to give the guest a
> hint about caching with the prefetchable bit / DTB entry. Sorry if I was
> mistaken.
> 
> Thanks
> Laszlo

And what I am saying is that prefetchable bit would be a PV solution -
on real devices it is not a hint about caching and can't be used as
such.

-- 
MST



[Qemu-devel] [PATCH] Add a property to disable the floppy controller

2015-05-14 Thread Stefano Stabellini
Do not change default settings for any machines at the moment: this
patch does not introduce any changes in behavior unless the user asks
for no-fdc=on.

Signed-off-by: Stefano Stabellini 
---
 hw/i386/pc.c |   27 ---
 include/hw/i386/pc.h |2 ++
 vl.c |4 
 3 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index f31d55e..501609b 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1450,10 +1450,14 @@ void pc_basic_device_init(ISABus *isa_bus, qemu_irq 
*gsi,
 cpu_exit_irq = qemu_allocate_irqs(cpu_request_exit, NULL, 1);
 DMA_init(0, cpu_exit_irq);
 
-for(i = 0; i < MAX_FD; i++) {
-fd[i] = drive_get(IF_FLOPPY, 0, i);
+*floppy = NULL;
+if (!object_property_get_bool(qdev_get_machine(),
+PC_MACHINE_NO_FDC, &error_abort)) {
+for (i = 0; i < MAX_FD; i++) {
+fd[i] = drive_get(IF_FLOPPY, 0, i);
+}
+*floppy = fdctrl_init_isa(isa_bus, fd);
 }
-*floppy = fdctrl_init_isa(isa_bus, fd);
 }
 
 void pc_nic_init(ISABus *isa_bus, PCIBus *pci_bus)
@@ -1797,6 +1801,20 @@ static bool pc_machine_get_aligned_dimm(Object *obj, 
Error **errp)
 return pcms->enforce_aligned_dimm;
 }
 
+static bool pc_machine_get_no_fdc(Object *obj, Error **errp)
+{
+PCMachineState *pcms = PC_MACHINE(obj);
+
+return pcms->no_fdc;
+}
+
+static void pc_machine_set_no_fdc(Object *obj, bool value, Error **errp)
+{
+PCMachineState *pcms = PC_MACHINE(obj);
+
+pcms->no_fdc = value;
+}
+
 static void pc_machine_initfn(Object *obj)
 {
 PCMachineState *pcms = PC_MACHINE(obj);
@@ -1820,6 +1838,9 @@ static void pc_machine_initfn(Object *obj)
 object_property_add_bool(obj, PC_MACHINE_ENFORCE_ALIGNED_DIMM,
  pc_machine_get_aligned_dimm,
  NULL, NULL);
+object_property_add_bool(obj, PC_MACHINE_NO_FDC,
+ pc_machine_get_no_fdc, pc_machine_set_no_fdc,
+ NULL);
 }
 
 static void pc_machine_class_init(ObjectClass *oc, void *data)
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 69d9cf8..93b2442 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -41,6 +41,7 @@ struct PCMachineState {
 uint64_t max_ram_below_4g;
 OnOffAuto vmport;
 bool enforce_aligned_dimm;
+bool no_fdc;
 };
 
 #define PC_MACHINE_ACPI_DEVICE_PROP "acpi-device"
@@ -48,6 +49,7 @@ struct PCMachineState {
 #define PC_MACHINE_MAX_RAM_BELOW_4G "max-ram-below-4g"
 #define PC_MACHINE_VMPORT   "vmport"
 #define PC_MACHINE_ENFORCE_ALIGNED_DIMM "enforce-aligned-dimm"
+#define PC_MACHINE_NO_FDC "no-fdc"
 
 /**
  * PCMachineClass:
diff --git a/vl.c b/vl.c
index 91411c1..81d80ae 100644
--- a/vl.c
+++ b/vl.c
@@ -387,6 +387,10 @@ static QemuOptsList qemu_machine_opts = {
 .name = "iommu",
 .type = QEMU_OPT_BOOL,
 .help = "Set on/off to enable/disable Intel IOMMU (VT-d)",
+},{
+.name = PC_MACHINE_NO_FDC,
+.type = QEMU_OPT_BOOL,
+.help = "do not emulate a floppy controller",
 },
 { /* End of list */ }
 },
-- 
1.7.10.4




Re: [Qemu-devel] [Xen-devel] [PATCH] Do not emulate a floppy drive when -nodefaults

2015-05-14 Thread Paolo Bonzini


On 14/05/2015 16:39, Stefano Stabellini wrote:
> On Thu, 14 May 2015, Paolo Bonzini wrote:
>> On 14/05/2015 15:25, Sander Eikelenboom wrote:
>>> I tend to kindly disagree if you look at the broader perspective, yes it's 
>>> could 
>>> be a storm in a tea cup, but there seems to be a pattern.
>>>
>>> From a "cmdline user" / "platform emulation" point of view i can imagine 
>>> that some sort of 
>>> auto configuration / bundling in platforms is necessary (to limit the 
>>> length of 
>>> the cmdline to be supplied).
>>>
>>> But from a library / toolstack point of view it's quite nasty if *all* more 
>>> or 
>>> less obscure options have to actively disabled. It's quite undoable, not to 
>>> mention what
>>> happens when new ones get added. 
>>
>> Where do you stop?
> 
> At this stage I would be happy enough if no floppy drives were actually
> emulated when the user asks for none.

Floppy drives aren't being emulated; the controller is.  Same for IDE,
so why single out the FDD controller?

> Sure, but it is harder to write a device emulator in a secure fashion
> than a brand new PV interface, that can be designed with security in
> mind from scratch. The number of VM escaping CVEs affecting Xen PV
> interfaces is extremely low, I think it was just two since the start of
> the project.

Sure; OTOH, treating hypervisor and QEMU escapes would be very silly, if
QEMU has been properly protected (through any combination of stubdoms,
LSM, seccomp, ...).

Paolo

> 
> 
>>> From this PoV it would be better to have to actively enable all the stuff 
>>> you want.
>>>
>>> At least Xen seemed to have taken the "no-defaults" as the best option to 
>>> get 
>>> there so far, but that doesn't seem to 
>>>
>>> It's not the first CVE that has come from this "you have to actively 
>>> disable all 
>>> you don't want to happen" and probably won't be the last.
>>>
>>> So a "-no-defaults-now-for-real" option/mode for libraries / toolstacks, 
>>> that 
>>> requires them to be very verbose, explicit and specific in what they 
>>> actually 
>>> want, could be valuable.
>>



Re: [Qemu-devel] [PATCH] Add a property to disable the floppy controller

2015-05-14 Thread Paolo Bonzini


On 14/05/2015 16:41, Stefano Stabellini wrote:
> Do not change default settings for any machines at the moment: this
> patch does not introduce any changes in behavior unless the user asks
> for no-fdc=on.

Can we avoid the double negative?

Paolo

> Signed-off-by: Stefano Stabellini 
> ---
>  hw/i386/pc.c |   27 ---
>  include/hw/i386/pc.h |2 ++
>  vl.c |4 
>  3 files changed, 30 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index f31d55e..501609b 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1450,10 +1450,14 @@ void pc_basic_device_init(ISABus *isa_bus, qemu_irq 
> *gsi,
>  cpu_exit_irq = qemu_allocate_irqs(cpu_request_exit, NULL, 1);
>  DMA_init(0, cpu_exit_irq);
>  
> -for(i = 0; i < MAX_FD; i++) {
> -fd[i] = drive_get(IF_FLOPPY, 0, i);
> +*floppy = NULL;
> +if (!object_property_get_bool(qdev_get_machine(),
> +PC_MACHINE_NO_FDC, &error_abort)) {
> +for (i = 0; i < MAX_FD; i++) {
> +fd[i] = drive_get(IF_FLOPPY, 0, i);
> +}
> +*floppy = fdctrl_init_isa(isa_bus, fd);
>  }
> -*floppy = fdctrl_init_isa(isa_bus, fd);
>  }
>  
>  void pc_nic_init(ISABus *isa_bus, PCIBus *pci_bus)
> @@ -1797,6 +1801,20 @@ static bool pc_machine_get_aligned_dimm(Object *obj, 
> Error **errp)
>  return pcms->enforce_aligned_dimm;
>  }
>  
> +static bool pc_machine_get_no_fdc(Object *obj, Error **errp)
> +{
> +PCMachineState *pcms = PC_MACHINE(obj);
> +
> +return pcms->no_fdc;
> +}
> +
> +static void pc_machine_set_no_fdc(Object *obj, bool value, Error **errp)
> +{
> +PCMachineState *pcms = PC_MACHINE(obj);
> +
> +pcms->no_fdc = value;
> +}
> +
>  static void pc_machine_initfn(Object *obj)
>  {
>  PCMachineState *pcms = PC_MACHINE(obj);
> @@ -1820,6 +1838,9 @@ static void pc_machine_initfn(Object *obj)
>  object_property_add_bool(obj, PC_MACHINE_ENFORCE_ALIGNED_DIMM,
>   pc_machine_get_aligned_dimm,
>   NULL, NULL);
> +object_property_add_bool(obj, PC_MACHINE_NO_FDC,
> + pc_machine_get_no_fdc, pc_machine_set_no_fdc,
> + NULL);
>  }
>  
>  static void pc_machine_class_init(ObjectClass *oc, void *data)
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index 69d9cf8..93b2442 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -41,6 +41,7 @@ struct PCMachineState {
>  uint64_t max_ram_below_4g;
>  OnOffAuto vmport;
>  bool enforce_aligned_dimm;
> +bool no_fdc;
>  };
>  
>  #define PC_MACHINE_ACPI_DEVICE_PROP "acpi-device"
> @@ -48,6 +49,7 @@ struct PCMachineState {
>  #define PC_MACHINE_MAX_RAM_BELOW_4G "max-ram-below-4g"
>  #define PC_MACHINE_VMPORT   "vmport"
>  #define PC_MACHINE_ENFORCE_ALIGNED_DIMM "enforce-aligned-dimm"
> +#define PC_MACHINE_NO_FDC "no-fdc"
>  
>  /**
>   * PCMachineClass:
> diff --git a/vl.c b/vl.c
> index 91411c1..81d80ae 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -387,6 +387,10 @@ static QemuOptsList qemu_machine_opts = {
>  .name = "iommu",
>  .type = QEMU_OPT_BOOL,
>  .help = "Set on/off to enable/disable Intel IOMMU (VT-d)",
> +},{
> +.name = PC_MACHINE_NO_FDC,
> +.type = QEMU_OPT_BOOL,
> +.help = "do not emulate a floppy controller",
>  },
>  { /* End of list */ }
>  },
> 



Re: [Qemu-devel] [Xen-devel] [PATCH] Do not emulate a floppy drive when -nodefaults

2015-05-14 Thread Stefano Stabellini
On Thu, 14 May 2015, Paolo Bonzini wrote:
> On 14/05/2015 16:39, Stefano Stabellini wrote:
> > On Thu, 14 May 2015, Paolo Bonzini wrote:
> >> On 14/05/2015 15:25, Sander Eikelenboom wrote:
> >>> I tend to kindly disagree if you look at the broader perspective, yes 
> >>> it's could 
> >>> be a storm in a tea cup, but there seems to be a pattern.
> >>>
> >>> From a "cmdline user" / "platform emulation" point of view i can imagine 
> >>> that some sort of 
> >>> auto configuration / bundling in platforms is necessary (to limit the 
> >>> length of 
> >>> the cmdline to be supplied).
> >>>
> >>> But from a library / toolstack point of view it's quite nasty if *all* 
> >>> more or 
> >>> less obscure options have to actively disabled. It's quite undoable, not 
> >>> to mention what
> >>> happens when new ones get added. 
> >>
> >> Where do you stop?
> > 
> > At this stage I would be happy enough if no floppy drives were actually
> > emulated when the user asks for none.
> 
> Floppy drives aren't being emulated; the controller is.  Same for IDE,
> so why single out the FDD controller?

The problem is that floppy drive emulation code in the controller is not
completely turned off even when no drives are available, see: 

http://marc.info/?l=xen-devel&m=143155839722714&w=2

Instead of disentangling the controller code, I am taking the easy route
of removing the controller.


> > Sure, but it is harder to write a device emulator in a secure fashion
> > than a brand new PV interface, that can be designed with security in
> > mind from scratch. The number of VM escaping CVEs affecting Xen PV
> > interfaces is extremely low, I think it was just two since the start of
> > the project.
> 
> Sure; OTOH, treating hypervisor and QEMU escapes would be very silly, if
> QEMU has been properly protected (through any combination of stubdoms,
> LSM, seccomp, ...).

That is true, but I don't know how many distros setup up selinux,
stubdoms, etc properly by default, but I am guessing not that many.



  1   2   3   4   >