Re: [Qemu-devel] [Xen-devel] [qemu-mainline bisection] complete test-amd64-amd64-xl-win7-amd64

2014-12-30 Thread Fabio Fantoni

On 30/12/2014 08:52, xen.org wrote:

branch xen-unstable
xen branch xen-unstable
job test-amd64-amd64-xl-win7-amd64
test windows-install

Tree: linux git://xenbits.xen.org/linux-pvops.git
Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
Tree: qemu git://xenbits.xen.org/staging/qemu-xen-unstable.git
Tree: qemuu git://git.qemu.org/qemu.git
Tree: xen git://xenbits.xen.org/xen.git

*** Found and reproduced problem changeset ***

   Bug is in tree:  qemuu git://git.qemu.org/qemu.git
   Bug introduced:  49d2e648e8087d154d8bf8b91f27c8e05e79d5a6
   Bug not present: 60fb1a87b47b14e4ea67043aa56f353e77fbd70a


   commit 49d2e648e8087d154d8bf8b91f27c8e05e79d5a6
   Author: Marcel Apfelbaum 
   Date:   Tue Dec 16 16:58:05 2014 +
   
   machine: remove qemu_machine_opts global list
   
   QEMU has support for options per machine, keeping
   a global list of options is no longer necessary.
   
   Signed-off-by: Marcel Apfelbaum 
   Reviewed-by: Alexander Graf 
   Reviewed-by: Greg Bellows 
   Message-id: 1418217570-15517-2-git-send-email-marce...@redhat.com
   Signed-off-by: Peter Maydell 


In the automatic test, the qemu log contains:

qemu-system-i386: util/qemu-option.c:387: qemu_opt_get_bool_helper: Assertion `opt->desc && opt->desc->type == QEMU_OPT_BOOL' failed.

Is this a case that the qemu patch found by the bisection did not
anticipate, or do the qemu parameters generated by libxl need fixing
(probably the machinearg cases in libxl_dm.c)?


Thanks for any reply, and sorry for my bad English.




For bisection revision-tuple graph see:

http://www.chiark.greenend.org.uk/~xensrcts/results/bisect.qemu-mainline.test-amd64-amd64-xl-win7-amd64.windows-install.html
Revision IDs in each graph node refer, respectively, to the Trees above.


Searching for failure / basis pass:
  32689 fail [host=rice-weevil] / 32598 ok.
Failure / basis pass flights: 32689 / 32598
(tree with no url: seabios)
Tree: linux git://xenbits.xen.org/linux-pvops.git
Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
Tree: qemu git://xenbits.xen.org/staging/qemu-xen-unstable.git
Tree: qemuu git://git.qemu.org/qemu.git
Tree: xen git://xenbits.xen.org/xen.git
Latest 83a926f7a4e39fb6be0576024e67fe161593defa 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
b0d42741f8e9a00854c3b3faca1da84bfc69bf22 
ab0302ee764fd702465aef6d88612cdff4302809 
36174af3fbeb1b662c0eadbfa193e77f68cc955b
Basis pass 83a926f7a4e39fb6be0576024e67fe161593defa 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
b0d42741f8e9a00854c3b3faca1da84bfc69bf22 
7e58e2ac7778cca3234c33387e49577bb7732714 
36174af3fbeb1b662c0eadbfa193e77f68cc955b
Generating revisions with ./adhoc-revtuple-generator  
git://xenbits.xen.org/linux-pvops.git#83a926f7a4e39fb6be0576024e67fe161593defa-83a926f7a4e39fb6be0576024e67fe161593defa
 
git://xenbits.xen.org/osstest/linux-firmware.git#c530a75c1e6a472b0eb9558310b518f0dfcd8860-c530a75c1e6a472b0eb9558310b518f0dfcd8860
 
git://xenbits.xen.org/staging/qemu-xen-unstable.git#b0d42741f8e9a00854c3b3faca1da84bfc69bf22-b0d42741f8e9a00854c3b3faca1da84bfc69bf22
 
git://git.qemu.org/qemu.git#7e58e2ac7778cca3234c33387e49577bb7732714-ab0302ee764fd702465aef6d88612cdff4302809
 
git://xenbits.xen.org/xen.git#36174af3fbeb1b662c0eadbfa193e77f68cc955b-36174af3fbeb1b662c0eadbfa193e77f68cc955b
+ exec
+ sh -xe
+ cd /export/home/osstest/repos/qemu
+ git remote set-url origin 
git://drall.uk.xensource.com:9419/git://git.qemu.org/qemu.git
+ git fetch -p origin +refs/heads/*:refs/remotes/origin/*
+ exec
+ sh -xe
+ cd /export/home/osstest/repos/qemu
+ git remote set-url origin 
git://drall.uk.xensource.com:9419/git://git.qemu.org/qemu.git
+ git fetch -p origin +refs/heads/*:refs/remotes/origin/*
Loaded 1005 nodes in revision graph
Searching for test results:
  32585 pass irrelevant
  32598 pass 83a926f7a4e39fb6be0576024e67fe161593defa 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
b0d42741f8e9a00854c3b3faca1da84bfc69bf22 
7e58e2ac7778cca3234c33387e49577bb7732714 
36174af3fbeb1b662c0eadbfa193e77f68cc955b
  32611 fail 83a926f7a4e39fb6be0576024e67fe161593defa 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
b0d42741f8e9a00854c3b3faca1da84bfc69bf22 
ab0302ee764fd702465aef6d88612cdff4302809 
36174af3fbeb1b662c0eadbfa193e77f68cc955b
  32626 fail 83a926f7a4e39fb6be0576024e67fe161593defa 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
b0d42741f8e9a00854c3b3faca1da84bfc69bf22 
ab0302ee764fd702465aef6d88612cdff4302809 
36174af3fbeb1b662c0eadbfa193e77f68cc955b
  32689 fail 83a926f7a4e39fb6be0576024e67fe161593defa 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
b0d42741f8e9a00854c3b3faca1da84bfc69bf22 
ab0302ee764fd702465aef6d88612cdff4302809 
36174af3fbeb1b662c0eadbfa193e77f68cc955b
  32659 fail 83a926f7a4e39fb6be0576024e67fe161593defa 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
b0d42741f8e9a00854c3b3faca1da84bfc69bf22 
ab0302ee764fd702465aef6d88612cdff4302809 
36174af3fbeb1b662c0eadbfa193e77f68cc955b
  32855 fail 83a926f7a4e39fb6be0576

[Qemu-devel] [PATCH 8/8] block/raw-posix: set max_write_zeroes to INT_MAX for regular files

2014-12-30 Thread Denis V. Lunev
fallocate() works fine with requests of arbitrary size, so there is no
reason to limit the amount of space passed to it. The bigger the request,
the better the performance, as the number of journal updates is reduced.

Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
CC: Peter Lieven 
---
 block/raw-posix.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 57b94ad..3e156c4 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -292,6 +292,20 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
fd, Error **errp)
 }
 }
 
+static void raw_probe_max_write_zeroes(BlockDriverState *bs)
+{
+BDRVRawState *s = bs->opaque;
+struct stat st;
+
+if (fstat(s->fd, &st) < 0) {
+return; /* no problem, keep default value */
+}
+if (!S_ISREG(st.st_mode) || !s->discard_zeroes) {
+return;
+}
+bs->bl.max_write_zeroes = INT_MAX;
+}
+
 static void raw_parse_flags(int bdrv_flags, int *open_flags)
 {
 assert(open_flags != NULL);
@@ -598,6 +612,7 @@ static int raw_reopen_prepare(BDRVReopenState *state,
 /* Fail already reopen_prepare() if we can't get a working O_DIRECT
  * alignment with the new fd. */
 if (raw_s->fd != -1) {
+raw_probe_max_write_zeroes(state->bs);
 raw_probe_alignment(state->bs, raw_s->fd, &local_err);
 if (local_err) {
 qemu_close(raw_s->fd);
@@ -651,6 +666,8 @@ static void raw_refresh_limits(BlockDriverState *bs, Error 
**errp)
 
 raw_probe_alignment(bs, s->fd, errp);
 bs->bl.opt_mem_alignment = s->buf_align;
+
+raw_probe_max_write_zeroes(bs);
 }
 
 static ssize_t handle_aiocb_ioctl(RawPosixAIOData *aiocb)
-- 
1.9.1




[Qemu-devel] [PATCH 5/8] block/raw-posix: refactor handle_aiocb_write_zeroes a bit

2014-12-30 Thread Denis V. Lunev
Move the code dealing with a block device to a separate function. This
will allow additional processing to be implemented for ordinary files.

Please note that the XFS code has been moved before the check for
s->has_write_zeroes, as xfs_write_zeroes() does not touch this flag.
This makes the code a bit more consistent.

Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
CC: Peter Lieven 
---
 block/raw-posix.c | 60 +++
 1 file changed, 38 insertions(+), 22 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 25a6947..7866d31 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -915,46 +915,62 @@ static int do_fallocate(int fd, int mode, off_t offset, 
off_t len)
 }
 #endif
 
-static ssize_t handle_aiocb_write_zeroes(RawPosixAIOData *aiocb)
+static ssize_t handle_aiocb_write_zeroes_block(RawPosixAIOData *aiocb)
 {
 int ret = -EOPNOTSUPP;
 BDRVRawState *s = aiocb->bs->opaque;
 
-if (s->has_write_zeroes == 0) {
+if (!s->has_write_zeroes) {
 return -ENOTSUP;
 }
 
-if (aiocb->aio_type & QEMU_AIO_BLKDEV) {
 #ifdef BLKZEROOUT
-do {
-uint64_t range[2] = { aiocb->aio_offset, aiocb->aio_nbytes };
-if (ioctl(aiocb->aio_fildes, BLKZEROOUT, range) == 0) {
-return 0;
-}
-} while (errno == EINTR);
-
-ret = -errno;
-#endif
-} else {
-#ifdef CONFIG_XFS
-if (s->is_xfs) {
-return xfs_write_zeroes(s, aiocb->aio_offset, aiocb->aio_nbytes);
+do {
+uint64_t range[2] = { aiocb->aio_offset, aiocb->aio_nbytes };
+if (ioctl(aiocb->aio_fildes, BLKZEROOUT, range) == 0) {
+return 0;
 }
-#endif
+} while (errno == EINTR);
 
-#ifdef CONFIG_FALLOCATE_ZERO_RANGE
-ret = do_fallocate(s->fd, FALLOC_FL_ZERO_RANGE,
-   aiocb->aio_offset, aiocb->aio_nbytes);
+ret = translate_err(-errno);
 #endif
-}
 
-ret = translate_err(ret);
 if (ret == -ENOTSUP) {
 s->has_write_zeroes = false;
 }
 return ret;
 }
 
+static ssize_t handle_aiocb_write_zeroes(RawPosixAIOData *aiocb)
+{
+BDRVRawState *s;
+
+if (aiocb->aio_type & QEMU_AIO_BLKDEV) {
+return handle_aiocb_write_zeroes_block(aiocb);
+}
+
+#ifdef CONFIG_XFS
+if (s->is_xfs) {
+return xfs_write_zeroes(s, aiocb->aio_offset, aiocb->aio_nbytes);
+}
+#endif
+
+s = aiocb->bs->opaque;
+
+#ifdef CONFIG_FALLOCATE_ZERO_RANGE
+if (s->has_write_zeroes) {
+int ret = do_fallocate(s->fd, FALLOC_FL_ZERO_RANGE,
+   aiocb->aio_offset, aiocb->aio_nbytes);
+if (ret == 0 || ret != -ENOTSUP) {
+return ret;
+}
+}
+#endif
+
+s->has_write_zeroes = false;
+return -ENOTSUP;
+}
+
 static ssize_t handle_aiocb_discard(RawPosixAIOData *aiocb)
 {
 int ret = -EOPNOTSUPP;
-- 
1.9.1




[Qemu-devel] [PATCH 1/8] block: prepare bdrv_co_do_write_zeroes to deal with large bl.max_write_zeroes

2014-12-30 Thread Denis V. Lunev
bdrv_co_do_write_zeroes splits writes using bl.max_write_zeroes or
16 MiB as the chunk size. It is implemented this way to tolerate
buggy block backends which do not accept requests that are too big.

However, if the bdrv_co_write_zeroes callback is not good enough, we
fall back to writing data explicitly using bdrv_co_writev, and we
create a buffer to accommodate the zeroes. The size of this buffer
is the size of the chunk. Thus, if the underlying layer has
bl.max_write_zeroes set high enough, e.g. 4 GiB, the allocation can fail.

Actually, there is no need to allocate such a big amount of memory.
We can simply allocate a 1 MiB buffer and create an iovec whose entries
all point to the same memory.
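
The idea in the last paragraph can be sketched outside QEMU with plain
POSIX struct iovec. This is only an illustration under stated assumptions,
not the code from the patch: build_zero_iov and ZERO_CHUNK are hypothetical
names, while the patch itself adds entries to a QEMUIOVector with
qemu_iovec_add().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/uio.h>

#define ZERO_CHUNK (1024 * 1024)   /* hypothetical: size of the shared zero buffer */

/*
 * Fill iov[] so that it describes `total` bytes of zeroes, with every
 * entry pointing at the same `chunk` buffer. Returns the number of
 * entries used, or -1 if max_iov entries are not enough.
 */
static int build_zero_iov(void *chunk, size_t chunk_size, size_t total,
                          struct iovec *iov, int max_iov)
{
    int n = 0;

    assert(chunk != NULL && chunk_size > 0);
    while (total > 0) {
        size_t len = total < chunk_size ? total : chunk_size;

        if (n == max_iov) {
            return -1;
        }
        iov[n].iov_base = chunk;   /* the same memory every time */
        iov[n].iov_len = len;
        total -= len;
        n++;
    }
    return n;
}
```

A caller would allocate one zeroed chunk (for example with calloc) and
pass the resulting array to a vectored write, so the memory cost stays at
one chunk regardless of the request size.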

Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
CC: Peter Lieven 
---
 block.c | 35 ---
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/block.c b/block.c
index 4165d42..d69c121 100644
--- a/block.c
+++ b/block.c
@@ -3173,14 +3173,18 @@ int coroutine_fn bdrv_co_copy_on_readv(BlockDriverState 
*bs,
  * of 32768 512-byte sectors (16 MiB) per request.
  */
 #define MAX_WRITE_ZEROES_DEFAULT 32768
+/* allocate iovec with zeroes using 1 MiB chunks to avoid too big allocations */
+#define MAX_ZEROES_CHUNK (1024 * 1024)
 
 static int coroutine_fn bdrv_co_do_write_zeroes(BlockDriverState *bs,
 int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
 {
 BlockDriver *drv = bs->drv;
 QEMUIOVector qiov;
-struct iovec iov = {0};
 int ret = 0;
+void *chunk = NULL;
+
+qemu_iovec_init(&qiov, 0);
 
 int max_write_zeroes = bs->bl.max_write_zeroes ?
bs->bl.max_write_zeroes : MAX_WRITE_ZEROES_DEFAULT;
@@ -3217,27 +3221,35 @@ static int coroutine_fn 
bdrv_co_do_write_zeroes(BlockDriverState *bs,
 }
 
 if (ret == -ENOTSUP) {
+int64_t num_bytes = (int64_t)num << BDRV_SECTOR_BITS;
+int chunk_size = MIN(MAX_ZEROES_CHUNK, num_bytes);
+
 /* Fall back to bounce buffer if write zeroes is unsupported */
-iov.iov_len = num * BDRV_SECTOR_SIZE;
-if (iov.iov_base == NULL) {
-iov.iov_base = qemu_try_blockalign(bs, num * BDRV_SECTOR_SIZE);
-if (iov.iov_base == NULL) {
+if (chunk == NULL) {
+chunk = qemu_try_blockalign(bs, chunk_size);
+if (chunk == NULL) {
 ret = -ENOMEM;
 goto fail;
 }
-memset(iov.iov_base, 0, num * BDRV_SECTOR_SIZE);
+memset(chunk, 0, chunk_size);
+}
+
+while (num_bytes > 0) {
+int to_add = MIN(chunk_size, num_bytes);
+qemu_iovec_add(&qiov, chunk, to_add);
+num_bytes -= to_add;
 }
-qemu_iovec_init_external(&qiov, &iov, 1);
 
 ret = drv->bdrv_co_writev(bs, sector_num, num, &qiov);
 
 /* Keep bounce buffer around if it is big enough for all
  * all future requests.
  */
-if (num < max_write_zeroes) {
-qemu_vfree(iov.iov_base);
-iov.iov_base = NULL;
+if (chunk_size != MAX_ZEROES_CHUNK) {
+qemu_vfree(chunk);
+chunk = NULL;
 }
+qemu_iovec_reset(&qiov);
 }
 
 sector_num += num;
@@ -3245,7 +3257,8 @@ static int coroutine_fn 
bdrv_co_do_write_zeroes(BlockDriverState *bs,
 }
 
 fail:
-qemu_vfree(iov.iov_base);
+qemu_iovec_destroy(&qiov);
+qemu_vfree(chunk);
 return ret;
 }
 
-- 
1.9.1




[Qemu-devel] [PATCH v3 0/8] eliminate data write in bdrv_write_zeroes on Linux in raw-posix.c

2014-12-30 Thread Denis V. Lunev
These patches eliminate data writes completely on Linux if fallocate
FALLOC_FL_ZERO_RANGE or FALLOC_FL_PUNCH_HOLE is supported on the
underlying filesystem.

I have performed several tests with non-aligned fallocate calls, and
in all cases Linux performs fine, i.e. the areas are zeroed correctly.
Checks were made on
   Linux 3.16.0-28-generic #38-Ubuntu SMP

This should significantly increase performance in some special cases.

Changes from v2:
- added Peter Lieven to CC
- added CONFIG_FALLOCATE check to call do_fallocate in patch 7
- dropped patch 1 as NACK-ed
- added processing of very large data areas in bdrv_co_write_zeroes (new
  patch 1)
- set bl.max_write_zeroes to INT_MAX in raw-posix.c for regular files
  (new patch 8)

Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
CC: Peter Lieven 




[Qemu-devel] [PATCH 4/8] block/raw-posix: create translate_err helper to merge errno values

2014-12-30 Thread Denis V. Lunev
Actually, the code
    if (ret == -ENODEV || ret == -ENOSYS || ret == -EOPNOTSUPP ||
        ret == -ENOTTY) {
        ret = -ENOTSUP;
    }
is present twice and will be added a couple more times. Create a helper
for this. Call it from do_fallocate() for further convenience.

Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
CC: Peter Lieven 
---
 block/raw-posix.c | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index a7c8816..25a6947 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -894,6 +894,15 @@ static int xfs_discard(BDRVRawState *s, int64_t offset, 
uint64_t bytes)
 #endif
 
 
+static int translate_err(int err)
+{
+if (err == -ENODEV || err == -ENOSYS || err == -EOPNOTSUPP ||
+err == -ENOTTY) {
+err = -ENOTSUP;
+}
+return err;
+}
+
 #if defined(CONFIG_FALLOCATE_PUNCH_HOLE) || defined(CONFIG_FALLOCATE_ZERO_RANGE)
 static int do_fallocate(int fd, int mode, off_t offset, off_t len)
 {
@@ -902,7 +911,7 @@ static int do_fallocate(int fd, int mode, off_t offset, 
off_t len)
 return 0;
 }
 } while (errno == EINTR);
-return -errno;
+return translate_err(-errno);
 }
 #endif
 
@@ -939,10 +948,9 @@ static ssize_t handle_aiocb_write_zeroes(RawPosixAIOData 
*aiocb)
 #endif
 }
 
-if (ret == -ENODEV || ret == -ENOSYS || ret == -EOPNOTSUPP ||
-ret == -ENOTTY) {
+ret = translate_err(ret);
+if (ret == -ENOTSUP) {
 s->has_write_zeroes = false;
-ret = -ENOTSUP;
 }
 return ret;
 }
@@ -980,10 +988,9 @@ static ssize_t handle_aiocb_discard(RawPosixAIOData *aiocb)
 #endif
 }
 
-if (ret == -ENODEV || ret == -ENOSYS || ret == -EOPNOTSUPP ||
-ret == -ENOTTY) {
+ret = translate_err(ret);
+if (ret == -ENOTSUP) {
 s->has_discard = false;
-ret = -ENOTSUP;
 }
 return ret;
 }
-- 
1.9.1




[Qemu-devel] [PATCH 2/8] block: use fallocate(FALLOC_FL_ZERO_RANGE) in handle_aiocb_write_zeroes

2014-12-30 Thread Denis V. Lunev
This efficiently writes zeroes on Linux if the kernel is capable enough.
FALLOC_FL_ZERO_RANGE correctly handles all cases, whether or not the
file is extended.

Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
CC: Peter Lieven 
---
 block/raw-posix.c | 13 -
 configure | 19 +++
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index e51293a..66ebaab 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -60,7 +60,7 @@
 #define FS_NOCOW_FL 0x0080 /* Do not cow file */
 #endif
 #endif
-#ifdef CONFIG_FALLOCATE_PUNCH_HOLE
+#if defined(CONFIG_FALLOCATE_PUNCH_HOLE) || defined(CONFIG_FALLOCATE_ZERO_RANGE)
 #include <linux/falloc.h>
 #endif
 #if defined (__FreeBSD__) || defined(__FreeBSD_kernel__)
@@ -919,6 +919,17 @@ static ssize_t handle_aiocb_write_zeroes(RawPosixAIOData 
*aiocb)
 return xfs_write_zeroes(s, aiocb->aio_offset, aiocb->aio_nbytes);
 }
 #endif
+
+#ifdef CONFIG_FALLOCATE_ZERO_RANGE
+do {
+if (fallocate(s->fd, FALLOC_FL_ZERO_RANGE,
+  aiocb->aio_offset, aiocb->aio_nbytes) == 0) {
+return 0;
+}
+} while (errno == EINTR);
+
+ret = -errno;
+#endif
 }
 
 if (ret == -ENODEV || ret == -ENOSYS || ret == -EOPNOTSUPP ||
diff --git a/configure b/configure
index cae588c..dfcf7b3 100755
--- a/configure
+++ b/configure
@@ -3309,6 +3309,22 @@ if compile_prog "" "" ; then
   fallocate_punch_hole=yes
 fi
 
+# check that fallocate supports range zeroing inside the file
+fallocate_zero_range=no
+cat > $TMPC << EOF
+#include <fcntl.h>
+#include <linux/falloc.h>
+
+int main(void)
+{
+fallocate(0, FALLOC_FL_ZERO_RANGE, 0, 0);
+return 0;
+}
+EOF
+if compile_prog "" "" ; then
+  fallocate_zero_range=yes
+fi
+
 # check for posix_fallocate
 posix_fallocate=no
 cat > $TMPC << EOF
@@ -4538,6 +4554,9 @@ fi
 if test "$fallocate_punch_hole" = "yes" ; then
   echo "CONFIG_FALLOCATE_PUNCH_HOLE=y" >> $config_host_mak
 fi
+if test "$fallocate_zero_range" = "yes" ; then
+  echo "CONFIG_FALLOCATE_ZERO_RANGE=y" >> $config_host_mak
+fi
 if test "$posix_fallocate" = "yes" ; then
   echo "CONFIG_POSIX_FALLOCATE=y" >> $config_host_mak
 fi
-- 
1.9.1




[Qemu-devel] [PATCH 7/8] block/raw-posix: call plain fallocate in handle_aiocb_write_zeroes

2014-12-30 Thread Denis V. Lunev
There is a possibility that we are extending our image and thus writing
zeroes beyond the end of the file. In this case we do not need to care
about punching a hole to make sure that there is no data in the file at
this offset (the precondition for fallocate(0) to work). We can simply
call fallocate(0).

This improves the performance of writing zeroes even on really old
platforms which do not even have FALLOC_FL_PUNCH_HOLE.

Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
CC: Peter Lieven 
---
 block/raw-posix.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 96a8678..57b94ad 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -60,7 +60,7 @@
 #define FS_NOCOW_FL 0x0080 /* Do not cow file */
 #endif
 #endif
-#if defined(CONFIG_FALLOCATE_PUNCH_HOLE) || defined(CONFIG_FALLOCATE_ZERO_RANGE)
+#ifdef CONFIG_FALLOCATE
 #include <linux/falloc.h>
 #endif
 #if defined (__FreeBSD__) || defined(__FreeBSD_kernel__)
@@ -903,7 +903,7 @@ static int translate_err(int err)
 return err;
 }
 
-#if defined(CONFIG_FALLOCATE_PUNCH_HOLE) || defined(CONFIG_FALLOCATE_ZERO_RANGE)
+#ifdef CONFIG_FALLOCATE
 static int do_fallocate(int fd, int mode, off_t offset, off_t len)
 {
 do {
@@ -957,6 +957,12 @@ static ssize_t handle_aiocb_write_zeroes(RawPosixAIOData 
*aiocb)
 
 s = aiocb->bs->opaque;
 
+#ifdef CONFIG_FALLOCATE
+if (aiocb->aio_offset >= aiocb->bs->total_sectors << BDRV_SECTOR_BITS) {
+return do_fallocate(s->fd, 0, aiocb->aio_offset, aiocb->aio_nbytes);
+}
+#endif
+
 #ifdef CONFIG_FALLOCATE_ZERO_RANGE
 if (s->has_write_zeroes) {
 int ret = do_fallocate(s->fd, FALLOC_FL_ZERO_RANGE,
-- 
1.9.1




[Qemu-devel] [PATCH 3/8] block/raw-posix: create do_fallocate helper

2014-12-30 Thread Denis V. Lunev
The pattern
    do {
        if (fallocate(s->fd, mode, offset, len) == 0) {
            return 0;
        }
    } while (errno == EINTR);
is used twice at the moment, and I am going to add more usages. Move it
to a helper function.

Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
CC: Peter Lieven 
---
 block/raw-posix.c | 33 +
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 66ebaab..a7c8816 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -893,6 +893,19 @@ static int xfs_discard(BDRVRawState *s, int64_t offset, 
uint64_t bytes)
 }
 #endif
 
+
+#if defined(CONFIG_FALLOCATE_PUNCH_HOLE) || defined(CONFIG_FALLOCATE_ZERO_RANGE)
+static int do_fallocate(int fd, int mode, off_t offset, off_t len)
+{
+do {
+if (fallocate(fd, mode, offset, len) == 0) {
+return 0;
+}
+} while (errno == EINTR);
+return -errno;
+}
+#endif
+
 static ssize_t handle_aiocb_write_zeroes(RawPosixAIOData *aiocb)
 {
 int ret = -EOPNOTSUPP;
@@ -921,14 +934,8 @@ static ssize_t handle_aiocb_write_zeroes(RawPosixAIOData 
*aiocb)
 #endif
 
 #ifdef CONFIG_FALLOCATE_ZERO_RANGE
-do {
-if (fallocate(s->fd, FALLOC_FL_ZERO_RANGE,
-  aiocb->aio_offset, aiocb->aio_nbytes) == 0) {
-return 0;
-}
-} while (errno == EINTR);
-
-ret = -errno;
+ret = do_fallocate(s->fd, FALLOC_FL_ZERO_RANGE,
+   aiocb->aio_offset, aiocb->aio_nbytes);
 #endif
 }
 
@@ -968,14 +975,8 @@ static ssize_t handle_aiocb_discard(RawPosixAIOData *aiocb)
 #endif
 
 #ifdef CONFIG_FALLOCATE_PUNCH_HOLE
-do {
-if (fallocate(s->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
-  aiocb->aio_offset, aiocb->aio_nbytes) == 0) {
-return 0;
-}
-} while (errno == EINTR);
-
-ret = -errno;
+ret = do_fallocate(s->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+   aiocb->aio_offset, aiocb->aio_nbytes);
 #endif
 }
 
-- 
1.9.1




[Qemu-devel] [PATCH 6/8] block: use fallocate(FALLOC_FL_PUNCH_HOLE) & fallocate(0) to write zeroes

2014-12-30 Thread Denis V. Lunev
This sequence works efficiently if FALLOC_FL_ZERO_RANGE is not supported.

A simple fallocate(0) will extend the file with zeroes where appropriate:
in the middle of the file if there is a hole there, and at the end of the
file. Unfortunately, fallocate(0) does not drop the content of the file if
there is data at this offset. Therefore, to make the situation consistent,
we should drop the data beforehand. This is done using FALLOC_FL_PUNCH_HOLE.

This should increase the performance a bit for not-so-modern kernels or for
filesystems which do not support FALLOC_FL_ZERO_RANGE.
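
A minimal standalone sketch of the sequence described above, assuming
Linux with glibc. The names try_fallocate, zero_by_punch_and_alloc and
demo_punch_and_alloc are hypothetical and used only for illustration; the
patch itself goes through the do_fallocate() helper introduced earlier in
the series.

```c
#define _GNU_SOURCE            /* for fallocate() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Call fallocate(), retrying on EINTR; returns 0 or a negative errno. */
static int try_fallocate(int fd, int mode, off_t offset, off_t len)
{
    do {
        if (fallocate(fd, mode, offset, len) == 0) {
            return 0;
        }
    } while (errno == EINTR);
    return -errno;
}

/*
 * Make [offset, offset + len) read back as zeroes without writing data:
 * drop any existing data with PUNCH_HOLE, then allocate the range again
 * with a plain fallocate(0).
 */
static int zero_by_punch_and_alloc(int fd, off_t offset, off_t len)
{
    int ret;

    assert(fd >= 0 && len > 0);
    ret = try_fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                        offset, len);
    if (ret < 0) {
        return ret;   /* e.g. a filesystem without hole punching */
    }
    return try_fallocate(fd, 0, offset, len);
}

/*
 * Demo on a temporary file. Returns 1 if the range was zeroed, 0 if the
 * filesystem refused the fallocate calls, -1 on any other failure.
 */
static int demo_punch_and_alloc(void)
{
    char path[] = "/tmp/zero-demo-XXXXXX";
    char buf[8192];
    int fd = mkstemp(path);
    int result = -1;

    if (fd < 0) {
        return -1;
    }
    unlink(path);
    memset(buf, 0xff, sizeof(buf));
    if (write(fd, buf, sizeof(buf)) == (ssize_t)sizeof(buf)) {
        if (zero_by_punch_and_alloc(fd, 4096, 4096) < 0) {
            result = 0;
        } else if (pread(fd, buf, sizeof(buf), 0) == (ssize_t)sizeof(buf)) {
            /* first half must be untouched, second half must be zero */
            int ok = (unsigned char)buf[0] == 0xff &&
                     buf[4096] == 0 && buf[8191] == 0;
            result = ok ? 1 : -1;
        }
    }
    close(fd);
    return result;
}
```

The punch step is what makes the plain fallocate(0) safe here: without it,
the 0xff bytes already stored in the second half of the file would survive.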

Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
CC: Peter Lieven 
---
 block/raw-posix.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 7866d31..96a8678 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -968,6 +968,23 @@ static ssize_t handle_aiocb_write_zeroes(RawPosixAIOData 
*aiocb)
 #endif
 
 s->has_write_zeroes = false;
+
+#ifdef CONFIG_FALLOCATE_PUNCH_HOLE
+if (s->has_discard) {
+int ret;
+ret = do_fallocate(s->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+   aiocb->aio_offset, aiocb->aio_nbytes);
+if (ret < 0) {
+if (ret == -ENOTSUP) {
+s->has_discard = false;
+}
+return ret;
+}
+return do_fallocate(s->fd, 0, aiocb->aio_offset, aiocb->aio_nbytes);
+}
+#endif
+
+s->has_discard = false;
 return -ENOTSUP;
 }
 
-- 
1.9.1




[Qemu-devel] [PATCH 02/19] block/parallels: rename parallels_header to ParallelsHeader

2014-12-30 Thread Denis V. Lunev
This follows the QEMU coding convention.

Signed-off-by: Denis V. Lunev 
Acked-by: Roman Kagan 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 block/parallels.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/block/parallels.c b/block/parallels.c
index 4f9cd8d..dca0df6 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -35,7 +35,7 @@
 #define HEADER_SIZE 64
 
 // always little-endian
-struct parallels_header {
+typedef struct ParallelsHeader {
 char magic[16]; // "WithoutFreeSpace"
 uint32_t version;
 uint32_t heads;
@@ -46,7 +46,7 @@ struct parallels_header {
 uint32_t inuse;
 uint32_t data_off;
 char padding[12];
-} QEMU_PACKED;
+} QEMU_PACKED ParallelsHeader;
 
 typedef struct BDRVParallelsState {
 CoMutex lock;
@@ -61,7 +61,7 @@ typedef struct BDRVParallelsState {
 
 static int parallels_probe(const uint8_t *buf, int buf_size, const char 
*filename)
 {
-const struct parallels_header *ph = (const void *)buf;
+const ParallelsHeader *ph = (const void *)buf;
 
 if (buf_size < HEADER_SIZE)
 return 0;
@@ -79,7 +79,7 @@ static int parallels_open(BlockDriverState *bs, QDict 
*options, int flags,
 {
 BDRVParallelsState *s = bs->opaque;
 int i;
-struct parallels_header ph;
+ParallelsHeader ph;
 int ret;
 
 bs->read_only = 1; // no write support yet
-- 
1.9.1




[Qemu-devel] [PATCH v2 0/19] write/create for Parallels images with reasonable performance

2014-12-30 Thread Denis V. Lunev
This patchset provides the ability to create and write to Parallels
images, along with some testing of the new code. Writes are not optimized
at all for now; we just modify catalog_bitmap and write those changes
to the image itself at once. This will be improved in the next steps.

This patchset consists of the non-problematic part of the previous
patchset, aka
   [PATCH v4 0/16] parallels format support improvements
plus the new write/create code, and it does not conflict with the
questionable XML handling code there.

Kevin, I would appreciate it if you could take a quick look at the
proof-of-concept patch, aka
   [RFC PATCH 1/1] block/parallels: new concept for DiskDescriptor.xml
to validate that approach to the DiskDescriptor.xml problem.

Changes from v1:
- patches 13-19 added, which boost performance from 800 KiB/sec to
  near native performance

Signed-off-by: Denis V. Lunev 
CC: Roman Kagan 
CC: Jeff Cody 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 

P.S. Should I add myself to MAINTAINERS of this driver?




[Qemu-devel] [PATCH 07/19] block/parallels: replace magic constants 4, 64 with proper sizeofs

2014-12-30 Thread Denis V. Lunev
Simple cleanup.

Signed-off-by: Denis V. Lunev 
Acked-by: Roman Kagan 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 block/parallels.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/block/parallels.c b/block/parallels.c
index 64b169b..306f2e3 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -32,7 +32,6 @@
 #define HEADER_MAGIC "WithoutFreeSpace"
 #define HEADER_MAGIC2 "WithouFreSpacExt"
 #define HEADER_VERSION 2
-#define HEADER_SIZE 64
 
 // always little-endian
 typedef struct ParallelsHeader {
@@ -63,7 +62,7 @@ static int parallels_probe(const uint8_t *buf, int buf_size, 
const char *filenam
 {
 const ParallelsHeader *ph = (const void *)buf;
 
-if (buf_size < HEADER_SIZE)
+if (buf_size < sizeof(ParallelsHeader))
 return 0;
 
 if ((!memcmp(ph->magic, HEADER_MAGIC, 16) ||
@@ -116,7 +115,7 @@ static int parallels_open(BlockDriverState *bs, QDict 
*options, int flags,
 }
 
 s->catalog_size = le32_to_cpu(ph.catalog_entries);
-if (s->catalog_size > INT_MAX / 4) {
+if (s->catalog_size > INT_MAX / sizeof(uint32_t)) {
 error_setg(errp, "Catalog too large");
 ret = -EFBIG;
 goto fail;
@@ -127,7 +126,8 @@ static int parallels_open(BlockDriverState *bs, QDict 
*options, int flags,
 goto fail;
 }
 
-ret = bdrv_pread(bs->file, 64, s->catalog_bitmap, s->catalog_size * 4);
+ret = bdrv_pread(bs->file, sizeof(ParallelsHeader),
+ s->catalog_bitmap, s->catalog_size * sizeof(uint32_t));
 if (ret < 0) {
 goto fail;
 }
-- 
1.9.1




[Qemu-devel] [PATCH 05/19] block/parallels: add get_block_status

2014-12-30 Thread Denis V. Lunev
From: Roman Kagan 

Implement the get_block_status driver method for the Parallels format driver.
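
The return value built by the patch below packs the host offset and the
status flags into a single integer. A rough standalone sketch of that
encoding follows. The constant names and values are illustrative
stand-ins rather than QEMU's definitions, and the helper names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins, not copied from QEMU headers. */
#define SECTOR_BITS          9            /* 512-byte sectors */
#define BLOCK_DATA           0x01
#define BLOCK_OFFSET_VALID   0x04

/*
 * Pack a host sector offset and status flags into one int64_t.
 * The byte offset is sector-aligned, so its low SECTOR_BITS bits are
 * zero and free to carry the flags. A negative offset means
 * "not allocated" and yields 0.
 */
static int64_t pack_block_status(int64_t sector_offset)
{
    if (sector_offset < 0) {
        return 0;
    }
    /* guard against overflow in the shift below */
    assert(sector_offset <= (INT64_MAX >> SECTOR_BITS));
    return (sector_offset << SECTOR_BITS) | BLOCK_DATA | BLOCK_OFFSET_VALID;
}

/* Recover the byte offset by masking off the flag bits. */
static int64_t status_byte_offset(int64_t status)
{
    return status & ~(((int64_t)1 << SECTOR_BITS) - 1);
}
```

Because the flags live only in the low bits, a caller can test a flag with
a bitwise AND and still recover the exact byte offset.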

Signed-off-by: Roman Kagan 
Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 block/parallels.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/block/parallels.c b/block/parallels.c
index 8770c82..b469984 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -166,6 +166,26 @@ static int cluster_remainder(BDRVParallelsState *s, 
int64_t sector_num,
 return MIN(nb_sectors, ret);
 }
 
+static int64_t coroutine_fn parallels_co_get_block_status(BlockDriverState *bs,
+int64_t sector_num, int nb_sectors, int *pnum)
+{
+BDRVParallelsState *s = bs->opaque;
+int64_t offset;
+
+qemu_co_mutex_lock(&s->lock);
+offset = seek_to_sector(s, sector_num);
+qemu_co_mutex_unlock(&s->lock);
+
+*pnum = cluster_remainder(s, sector_num, nb_sectors);
+
+if (offset < 0) {
+return 0;
+}
+
+return (offset << BDRV_SECTOR_BITS) |
+BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID;
+}
+
 static int parallels_read(BlockDriverState *bs, int64_t sector_num,
 uint8_t *buf, int nb_sectors)
 {
@@ -213,6 +233,7 @@ static BlockDriver bdrv_parallels = {
 .bdrv_open = parallels_open,
 .bdrv_read  = parallels_co_read,
 .bdrv_close= parallels_close,
+.bdrv_co_get_block_status = parallels_co_get_block_status,
 };
 
 static void bdrv_parallels_init(void)
-- 
1.9.1




[Qemu-devel] [PATCH 03/19] block/parallels: switch to bdrv_read

2014-12-30 Thread Denis V. Lunev
From: Roman Kagan 

Switch the .bdrv_read method implementation from using bdrv_pread() to
bdrv_read() on the underlying file, since the latter is subject to i/o
throttling while the former is not.

Besides, since bdrv_read() operates in sectors rather than bytes, adjust
the helper functions to do so too.

Signed-off-by: Roman Kagan 
Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 block/parallels.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/block/parallels.c b/block/parallels.c
index dca0df6..baefd3e 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -146,9 +146,8 @@ fail:
 return ret;
 }
 
-static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num)
+static int64_t seek_to_sector(BDRVParallelsState *s, int64_t sector_num)
 {
-BDRVParallelsState *s = bs->opaque;
 uint32_t index, offset;
 
 index = sector_num / s->tracks;
@@ -157,24 +156,27 @@ static int64_t seek_to_sector(BlockDriverState *bs, 
int64_t sector_num)
 /* not allocated */
 if ((index >= s->catalog_size) || (s->catalog_bitmap[index] == 0))
 return -1;
-return
-((uint64_t)s->catalog_bitmap[index] * s->off_multiplier + offset) * 512;
+return (uint64_t)s->catalog_bitmap[index] * s->off_multiplier + offset;
 }
 
 static int parallels_read(BlockDriverState *bs, int64_t sector_num,
 uint8_t *buf, int nb_sectors)
 {
+BDRVParallelsState *s = bs->opaque;
+
 while (nb_sectors > 0) {
-int64_t position = seek_to_sector(bs, sector_num);
+int64_t position = seek_to_sector(s, sector_num);
 if (position >= 0) {
-if (bdrv_pread(bs->file, position, buf, 512) != 512)
-return -1;
+int ret = bdrv_read(bs->file, position, buf, 1);
+if (ret < 0) {
+return ret;
+}
 } else {
-memset(buf, 0, 512);
+memset(buf, 0, BDRV_SECTOR_SIZE);
 }
 nb_sectors--;
 sector_num++;
-buf += 512;
+buf += BDRV_SECTOR_SIZE;
 }
 return 0;
 }
-- 
1.9.1




[Qemu-devel] [PATCH 01/19] iotests, parallels: quote TEST_IMG in 076 test to be path-safe

2014-12-30 Thread Denis V. Lunev
Suggested by Jeff Cody.

Signed-off-by: Denis V. Lunev 
Acked-by: Roman Kagan 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 tests/qemu-iotests/076 | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tests/qemu-iotests/076 b/tests/qemu-iotests/076
index ed2be35..0139976 100755
--- a/tests/qemu-iotests/076
+++ b/tests/qemu-iotests/076
@@ -49,31 +49,31 @@ nb_sectors_offset=$((0x24))
 echo
 echo "== Read from a valid v1 image =="
 _use_sample_img parallels-v1.bz2
-{ $QEMU_IO -c "read -P 0x11 0 64k" $TEST_IMG; } 2>&1 | _filter_qemu_io | 
_filter_testdir
+{ $QEMU_IO -c "read -P 0x11 0 64k" "$TEST_IMG"; } 2>&1 | _filter_qemu_io | 
_filter_testdir
 
 echo
 echo "== Negative catalog size =="
 _use_sample_img parallels-v1.bz2
 poke_file "$TEST_IMG" "$catalog_entries_offset" "\xff\xff\xff\xff"
-{ $QEMU_IO -c "read 0 512" $TEST_IMG; } 2>&1 | _filter_qemu_io | 
_filter_testdir
+{ $QEMU_IO -c "read 0 512" "$TEST_IMG"; } 2>&1 | _filter_qemu_io | 
_filter_testdir
 
 echo
 echo "== Overflow in catalog allocation =="
 _use_sample_img parallels-v1.bz2
 poke_file "$TEST_IMG" "$nb_sectors_offset" "\xff\xff\xff\xff"
 poke_file "$TEST_IMG" "$catalog_entries_offset" "\x01\x00\x00\x40"
-{ $QEMU_IO -c "read 64M 64M" $TEST_IMG; } 2>&1 | _filter_qemu_io | 
_filter_testdir
+{ $QEMU_IO -c "read 64M 64M" "$TEST_IMG"; } 2>&1 | _filter_qemu_io | 
_filter_testdir
 
 echo
 echo "== Zero sectors per track =="
 _use_sample_img parallels-v1.bz2
 poke_file "$TEST_IMG" "$tracks_offset" "\x00\x00\x00\x00"
-{ $QEMU_IO -c "read 0 512" $TEST_IMG; } 2>&1 | _filter_qemu_io | _filter_testdir
+{ $QEMU_IO -c "read 0 512" "$TEST_IMG"; } 2>&1 | _filter_qemu_io | _filter_testdir
 
 echo
 echo "== Read from a valid v2 image =="
 _use_sample_img parallels-v2.bz2
-{ $QEMU_IO -c "read -P 0x11 0 64k" $TEST_IMG; } 2>&1 | _filter_qemu_io | _filter_testdir
+{ $QEMU_IO -c "read -P 0x11 0 64k" "$TEST_IMG"; } 2>&1 | _filter_qemu_io | _filter_testdir
 
 # success, all done
 echo "*** done"
-- 
1.9.1




[Qemu-devel] [PATCH 04/19] block/parallels: read up to cluster end in one go

2014-12-30 Thread Denis V. Lunev
From: Roman Kagan 

Teach parallels_read() to do reads in coarser granularity than just a
single sector: if requested, read up to the cluster end in one go.

Signed-off-by: Roman Kagan 
Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 block/parallels.c | 18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/block/parallels.c b/block/parallels.c
index baefd3e..8770c82 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -159,6 +159,13 @@ static int64_t seek_to_sector(BDRVParallelsState *s, int64_t sector_num)
 return (uint64_t)s->catalog_bitmap[index] * s->off_multiplier + offset;
 }
 
+static int cluster_remainder(BDRVParallelsState *s, int64_t sector_num,
+int nb_sectors)
+{
+int ret = s->tracks - sector_num % s->tracks;
+return MIN(nb_sectors, ret);
+}
+
 static int parallels_read(BlockDriverState *bs, int64_t sector_num,
 uint8_t *buf, int nb_sectors)
 {
@@ -166,17 +173,18 @@ static int parallels_read(BlockDriverState *bs, int64_t sector_num,
 
 while (nb_sectors > 0) {
 int64_t position = seek_to_sector(s, sector_num);
+int n = cluster_remainder(s, sector_num, nb_sectors);
 if (position >= 0) {
-int ret = bdrv_read(bs->file, position, buf, 1);
+int ret = bdrv_read(bs->file, position, buf, n);
 if (ret < 0) {
 return ret;
 }
 } else {
-memset(buf, 0, BDRV_SECTOR_SIZE);
+memset(buf, 0, n << BDRV_SECTOR_BITS);
 }
-nb_sectors--;
-sector_num++;
-buf += BDRV_SECTOR_SIZE;
+nb_sectors -= n;
+sector_num += n;
+buf += n << BDRV_SECTOR_BITS;
 }
 return 0;
 }
-- 
1.9.1
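
The chunking arithmetic in the patch above can be tried outside QEMU.
The following is only a minimal standalone sketch: it passes the cluster
size in sectors (s->tracks in the driver) as a plain parameter, and
count_chunks() is a hypothetical helper, not part of the patch, that
counts how many loop iterations a request would need.

```c
#include <assert.h>
#include <stdint.h>

/* Sectors from sector_num to the end of its cluster, capped at nb_sectors.
 * `tracks` stands in for s->tracks (cluster size in sectors). */
static int cluster_remainder(uint32_t tracks, int64_t sector_num, int nb_sectors)
{
    assert(tracks > 0 && sector_num >= 0 && nb_sectors >= 0);
    int64_t to_end = (int64_t)tracks - sector_num % tracks;
    return nb_sectors < to_end ? nb_sectors : (int)to_end;
}

/* Hypothetical helper: number of iterations the read loop would take. */
static int count_chunks(uint32_t tracks, int64_t sector_num, int nb_sectors)
{
    int chunks = 0;
    while (nb_sectors > 0) {
        int n = cluster_remainder(tracks, sector_num, nb_sectors);
        nb_sectors -= n;
        sector_num += n;
        chunks++;
    }
    return chunks;
}
```

With 128-sector clusters, a 64-sector request starting at sector 100 is
split into two chunks (28 sectors up to the cluster end, then 36) rather
than 64 single-sector reads.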




[Qemu-devel] [PATCH 06/19] block/parallels: provide _co_readv routine for parallels format driver

2014-12-30 Thread Denis V. Lunev
Main approach is taken from qcow2_co_readv.

The patch drops the coroutine lock for the duration of the IO operation
and performs normal scatter-gather IO using the standard QEMU backend.

Signed-off-by: Denis V. Lunev 
Acked-by: Roman Kagan 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 block/parallels.c | 46 +++---
 1 file changed, 27 insertions(+), 19 deletions(-)

diff --git a/block/parallels.c b/block/parallels.c
index b469984..64b169b 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -186,37 +186,45 @@ static int64_t coroutine_fn parallels_co_get_block_status(BlockDriverState *bs,
 BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID;
 }
 
-static int parallels_read(BlockDriverState *bs, int64_t sector_num,
-uint8_t *buf, int nb_sectors)
+static coroutine_fn int parallels_co_readv(BlockDriverState *bs,
+int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
 {
 BDRVParallelsState *s = bs->opaque;
+uint64_t bytes_done = 0;
+QEMUIOVector hd_qiov;
+int ret = 0;
 
+qemu_iovec_init(&hd_qiov, qiov->niov);
+
+qemu_co_mutex_lock(&s->lock);
 while (nb_sectors > 0) {
 int64_t position = seek_to_sector(s, sector_num);
 int n = cluster_remainder(s, sector_num, nb_sectors);
-if (position >= 0) {
-int ret = bdrv_read(bs->file, position, buf, n);
+int nbytes = n << BDRV_SECTOR_BITS;
+
+if (position < 0) {
+qemu_iovec_memset(qiov, bytes_done, 0, nbytes);
+} else {
+qemu_iovec_reset(&hd_qiov);
+qemu_iovec_concat(&hd_qiov, qiov, bytes_done, nbytes);
+
+qemu_co_mutex_unlock(&s->lock);
+ret = bdrv_co_readv(bs->file, position, n, &hd_qiov);
+qemu_co_mutex_lock(&s->lock);
+
 if (ret < 0) {
-return ret;
+goto fail;
 }
-} else {
-memset(buf, 0, n << BDRV_SECTOR_BITS);
 }
+
 nb_sectors -= n;
 sector_num += n;
-buf += n << BDRV_SECTOR_BITS;
+bytes_done += nbytes;
 }
-return 0;
-}
-
-static coroutine_fn int parallels_co_read(BlockDriverState *bs, int64_t sector_num,
-  uint8_t *buf, int nb_sectors)
-{
-int ret;
-BDRVParallelsState *s = bs->opaque;
-qemu_co_mutex_lock(&s->lock);
-ret = parallels_read(bs, sector_num, buf, nb_sectors);
 qemu_co_mutex_unlock(&s->lock);
+
+fail:
+qemu_iovec_destroy(&hd_qiov);
 return ret;
 }
 
@@ -231,9 +239,9 @@ static BlockDriver bdrv_parallels = {
 .instance_size = sizeof(BDRVParallelsState),
 .bdrv_probe= parallels_probe,
 .bdrv_open = parallels_open,
-.bdrv_read  = parallels_co_read,
 .bdrv_close= parallels_close,
 .bdrv_co_get_block_status = parallels_co_get_block_status,
+.bdrv_co_readv  = parallels_co_readv,
 };
 
 static void bdrv_parallels_init(void)
-- 
1.9.1




[Qemu-devel] [PATCH 08/19] block/parallels: _co_writev callback for Parallels format

2014-12-30 Thread Denis V. Lunev
Support writes to Parallels images. The code is almost the same as the
code in the previous patch, which implemented scatter-gather IO for reads.

Signed-off-by: Denis V. Lunev 
Acked-by: Roman Kagan 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 block/parallels.c | 77 +--
 1 file changed, 75 insertions(+), 2 deletions(-)

diff --git a/block/parallels.c b/block/parallels.c
index 306f2e3..dca3d0b 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -81,8 +81,6 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
 ParallelsHeader ph;
 int ret;
 
-bs->read_only = 1; // no write support yet
-
 ret = bdrv_pread(bs->file, 0, &ph, sizeof(ph));
 if (ret < 0) {
 goto fail;
@@ -159,6 +157,37 @@ static int64_t seek_to_sector(BDRVParallelsState *s, int64_t sector_num)
 return (uint64_t)s->catalog_bitmap[index] * s->off_multiplier + offset;
 }
 
+static int64_t allocate_sector(BlockDriverState *bs, int64_t sector_num)
+{
+BDRVParallelsState *s = bs->opaque;
+uint32_t idx, offset, tmp;
+int64_t pos;
+int ret;
+
+idx = sector_num / s->tracks;
+offset = sector_num % s->tracks;
+
+if (idx >= s->catalog_size) {
+return -EINVAL;
+}
+if (s->catalog_bitmap[idx] != 0) {
+return (uint64_t)s->catalog_bitmap[idx] * s->off_multiplier + offset;
+}
+
+pos = bdrv_getlength(bs->file) >> BDRV_SECTOR_BITS;
+bdrv_truncate(bs->file, (pos + s->tracks) << BDRV_SECTOR_BITS);
+s->catalog_bitmap[idx] = pos / s->off_multiplier;
+
+tmp = cpu_to_le32(s->catalog_bitmap[idx]);
+
+ret = bdrv_pwrite_sync(bs->file,
+sizeof(ParallelsHeader) + idx * sizeof(tmp), &tmp, sizeof(tmp));
+if (ret < 0) {
+return ret;
+}
+return (uint64_t)s->catalog_bitmap[idx] * s->off_multiplier + offset;
+}
+
 static int cluster_remainder(BDRVParallelsState *s, int64_t sector_num,
 int nb_sectors)
 {
@@ -186,6 +215,49 @@ static int64_t coroutine_fn parallels_co_get_block_status(BlockDriverState *bs,
 BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID;
 }
 
+static coroutine_fn int parallels_co_writev(BlockDriverState *bs,
+int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
+{
+BDRVParallelsState *s = bs->opaque;
+uint64_t bytes_done = 0;
+QEMUIOVector hd_qiov;
+int ret = 0;
+
+qemu_iovec_init(&hd_qiov, qiov->niov);
+
+qemu_co_mutex_lock(&s->lock);
+while (nb_sectors > 0) {
+int64_t position = allocate_sector(bs, sector_num);
+int n = cluster_remainder(s, sector_num, nb_sectors);
+int nbytes = n << BDRV_SECTOR_BITS;
+
+if (position < 0) {
+ret = (int)position;
+break;
+}
+
+qemu_iovec_reset(&hd_qiov);
+qemu_iovec_concat(&hd_qiov, qiov, bytes_done, nbytes);
+
+qemu_co_mutex_unlock(&s->lock);
+ret = bdrv_co_writev(bs->file, position, n, &hd_qiov);
+qemu_co_mutex_lock(&s->lock);
+
+if (ret < 0) {
+goto fail;
+}
+
+nb_sectors -= n;
+sector_num += n;
+bytes_done += nbytes;
+}
+qemu_co_mutex_unlock(&s->lock);
+
+fail:
+qemu_iovec_destroy(&hd_qiov);
+return ret;
+}
+
 static coroutine_fn int parallels_co_readv(BlockDriverState *bs,
 int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
 {
@@ -242,6 +314,7 @@ static BlockDriver bdrv_parallels = {
 .bdrv_close= parallels_close,
 .bdrv_co_get_block_status = parallels_co_get_block_status,
 .bdrv_co_readv  = parallels_co_readv,
+.bdrv_co_writev = parallels_co_writev,
 };
 
 static void bdrv_parallels_init(void)
-- 
1.9.1
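
The sector-to-position mapping done by allocate_sector() in the patch
above can be illustrated without any QEMU types. This is a sketch under
assumptions: SketchImage and sketch_allocate_sector() are hypothetical
names, the file length is tracked as a plain counter instead of calling
bdrv_getlength()/bdrv_truncate(), and the on-disk catalog update is left
out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical in-memory stand-in for the driver state. */
typedef struct {
    uint32_t *catalog;        /* one entry per cluster, 0 = unallocated */
    uint32_t catalog_size;    /* number of catalog entries */
    uint32_t tracks;          /* cluster size in sectors */
    uint32_t off_multiplier;  /* 1 for v1 images, tracks for v2 */
    int64_t file_sectors;     /* current image file length in sectors */
} SketchImage;

/* Return the file position (in sectors) backing sector_num,
 * appending a new cluster at the end of the file if needed. */
static int64_t sketch_allocate_sector(SketchImage *img, int64_t sector_num)
{
    assert(img->tracks > 0 && img->off_multiplier > 0 && sector_num >= 0);
    int64_t idx = sector_num / img->tracks;
    uint32_t offset = (uint32_t)(sector_num % img->tracks);

    if (idx >= img->catalog_size) {
        return -EINVAL;
    }
    if (img->catalog[idx] == 0) {
        int64_t pos = img->file_sectors;      /* new cluster goes at EOF */
        img->file_sectors += img->tracks;     /* models bdrv_truncate() */
        img->catalog[idx] = (uint32_t)(pos / img->off_multiplier);
    }
    return (int64_t)img->catalog[idx] * img->off_multiplier + offset;
}
```

Note that clusters are placed in the order they are first written, not
in guest order, so writing guest cluster 1 before cluster 0 puts
cluster 1 first in the file.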




[Qemu-devel] [PATCH 09/19] iotests, parallels: test for write into Parallels image

2014-12-30 Thread Denis V. Lunev
Signed-off-by: Denis V. Lunev 
Acked-by: Roman Kagan 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 tests/qemu-iotests/076 |  5 +
 tests/qemu-iotests/076.out | 10 ++
 2 files changed, 15 insertions(+)

diff --git a/tests/qemu-iotests/076 b/tests/qemu-iotests/076
index 0139976..c9b55a9 100755
--- a/tests/qemu-iotests/076
+++ b/tests/qemu-iotests/076
@@ -74,6 +74,11 @@ echo
 echo "== Read from a valid v2 image =="
 _use_sample_img parallels-v2.bz2
 { $QEMU_IO -c "read -P 0x11 0 64k" "$TEST_IMG"; } 2>&1 | _filter_qemu_io | _filter_testdir
+{ $QEMU_IO -c "write -P 0x21 1024k 1k" "$TEST_IMG"; } 2>&1 | _filter_qemu_io | _filter_testdir
+{ $QEMU_IO -c "write -P 0x22 1025k 1k" "$TEST_IMG"; } 2>&1 | _filter_qemu_io | _filter_testdir
+{ $QEMU_IO -c "read -P 0x21 1024k 1k" "$TEST_IMG"; } 2>&1 | _filter_qemu_io | _filter_testdir
+{ $QEMU_IO -c "read -P 0x22 1025k 1k" "$TEST_IMG"; } 2>&1 | _filter_qemu_io | _filter_testdir
+{ $QEMU_IO -c "read -P 0 1026k 62k" "$TEST_IMG"; } 2>&1 | _filter_qemu_io | _filter_testdir
 
 # success, all done
 echo "*** done"
diff --git a/tests/qemu-iotests/076.out b/tests/qemu-iotests/076.out
index 32ade08..bae 100644
--- a/tests/qemu-iotests/076.out
+++ b/tests/qemu-iotests/076.out
@@ -19,4 +19,14 @@ no file open, try 'help open'
 == Read from a valid v2 image ==
 read 65536/65536 bytes at offset 0
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 1024/1024 bytes at offset 1048576
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 1024/1024 bytes at offset 1049600
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 1024/1024 bytes at offset 1048576
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 1024/1024 bytes at offset 1049600
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 63488/63488 bytes at offset 1050624
+62 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 *** done
-- 
1.9.1




Re: [Qemu-devel] [Xen-devel] [qemu-mainline bisection] complete test-amd64-amd64-xl-win7-amd64

2014-12-30 Thread Peter Maydell
On 30 December 2014 at 09:06, Fabio Fantoni  wrote:
> In the automatic test the qemu log contain:
>>
>> qemu-system-i386: util/qemu-option.c:387: qemu_opt_get_bool_helper:
>> Assertion `opt->desc && opt->desc->type == QEMU_OPT_BOOL' failed.
>
> Is there unexpected case in the qemu patch spotted by bisection or qemu
> parameters in libxl need improvements (probably machinearg cases in
> libxl_dm.c)?

Known issue (though we don't have a fix yet). See the qemu-devel thread
"'-usb' regressed by 49d2e648 ("machine: remove qemu_machine_opts
global list")".

thanks
-- PMM



[Qemu-devel] [PATCH v2 1/1] migration/block: fix pending() return value

2014-12-30 Thread Vladimir Sementsov-Ogievskiy
Because .save_live_pending() in migration/block.c returns the wrong
value, migration finishes before the whole disk is transferred. This
happens when the migration process is fast enough, for example when
source and dest are on the same host.

If in the bulk phase we return something < max_size, we will skip
transferring the tail of the device. Currently we have "set pending to
BLOCK_SIZE if it is zero" for the bulk phase, but there is no guarantee
that BLOCK_SIZE will be >= max_size.

The correct approach is to return a value above max_size, for example
max_size+1, while we are in the bulk phase.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 migration/block.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/migration/block.c b/migration/block.c
index 74d9eb1..2e92605 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -765,8 +765,8 @@ static uint64_t block_save_pending(QEMUFile *f, void *opaque, uint64_t max_size)
block_mig_state.read_done * BLOCK_SIZE;
 
 /* Report at least one block pending during bulk phase */
-if (pending == 0 && !block_mig_state.bulk_completed) {
-pending = BLOCK_SIZE;
+if (pending <= max_size && !block_mig_state.bulk_completed) {
+pending = max_size + BLOCK_SIZE;
 }
 blk_mig_unlock();
 qemu_mutex_unlock_iothread();
-- 
1.9.1
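
To make the effect of the change above easier to see, here is a small
standalone model of the old and new clamping. It is a sketch, not QEMU
code: SKETCH_BLOCK_SIZE is an assumed value, and would_finish() is a
hypothetical stand-in for the caller, which per the description above
ends the iteration phase once the reported value is below max_size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BLOCK_SIZE (1u << 20)  /* assumed block size, 1 MiB */

/* Old behaviour: only a zero result is bumped during the bulk phase. */
static uint64_t pending_old(uint64_t pending, bool bulk_completed)
{
    if (pending == 0 && !bulk_completed) {
        pending = SKETCH_BLOCK_SIZE;
    }
    return pending;
}

/* New behaviour: during the bulk phase always report more than max_size. */
static uint64_t pending_new(uint64_t pending, uint64_t max_size,
                            bool bulk_completed)
{
    assert(max_size <= UINT64_MAX - SKETCH_BLOCK_SIZE);
    if (pending <= max_size && !bulk_completed) {
        pending = max_size + SKETCH_BLOCK_SIZE;
    }
    return pending;
}

/* Hypothetical caller check: migration completes when pending < max_size. */
static bool would_finish(uint64_t pending, uint64_t max_size)
{
    return pending < max_size;
}
```

With max_size at 32 MiB and the bulk phase still running, the old code
reports 1 MiB and the caller would finish early; the new code reports
33 MiB and the caller keeps iterating.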




[Qemu-devel] [PATCH v2 0/1] Fix block migration bug

2014-12-30 Thread Vladimir Sementsov-Ogievskiy
v2:
  - rebase to master
  - fix typos in description

Because .save_live_pending() in block migration returns the wrong
value, migration finishes before the whole disk is transferred.
This happens when the migration process is fast enough, for
example when source and dest are on the same host.

It's easy to test this with the following:

bug.sh
=
#!/bin/sh

size=$1
addr=$2

rm /tmp/fifo-mig /tmp/a /tmp/b /tmp/sock-mig

./qemu-img create -f qcow2 /tmp/a $size
./qemu-img create -f qcow2 /tmp/b $size

./qemu-io -c "write -P 0x22 $addr 512" /tmp/a

mkfifo /tmp/fifo-mig

./x86_64-softmmu/qemu-system-x86_64 -drive file=/tmp/b,id=disk\
-qmp unix:/tmp/sock-mig,server,nowait\
-incoming "exec: cat /tmp/fifo-mig" &

echo 'migrate -b exec:cat>/tmp/fifo-mig\nquit\n' |\
./x86_64-softmmu/qemu-system-x86_64 -drive file=/tmp/a,id=disk\
-monitor stdio

./scripts/qmp/qmp --path=/tmp/sock-mig quit
sleep 3

echo checking
./qemu-io -c "read -P 0x22 $addr 512" /tmp/b
=

For './bug.sh 1G 1M' qemu-io check finishes successfully,
but for './bug.sh 1G 1022M' it finishes with 'Pattern verification
failed' status.

The following patch fixes this bug.

Vladimir Sementsov-Ogievskiy (1):
  migration/block: fix pending() return value

 migration/block.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

-- 
1.9.1




[Qemu-devel] [PATCH 13/19] block/parallels: store ParallelsHeader to the BDRVParallelsState

2014-12-30 Thread Denis V. Lunev
This will be useful in the future for speed optimizations of new block
creation in the image. At the moment each write to the catalog bitmap
results in a read-modify-write transaction. It would be beneficial to
write by pages or sectors. However, to do that for the beginning of the
image we need to keep the header somewhere so that the first sector of
the image can be written properly. BDRVParallelsState is a good place
for that.

Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 block/parallels.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/block/parallels.c b/block/parallels.c
index e3abf4e..f79ddff 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -57,6 +57,8 @@ typedef struct ParallelsHeader {
 typedef struct BDRVParallelsState {
 CoMutex lock;
 
+ParallelsHeader ph;
+
 uint32_t *catalog_bitmap;
 unsigned int catalog_size;
 
@@ -85,29 +87,28 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
 {
 BDRVParallelsState *s = bs->opaque;
 int i;
-ParallelsHeader ph;
 int ret;
 
-ret = bdrv_pread(bs->file, 0, &ph, sizeof(ph));
+ret = bdrv_pread(bs->file, 0, &s->ph, sizeof(s->ph));
 if (ret < 0) {
 goto fail;
 }
 
-bs->total_sectors = le64_to_cpu(ph.nb_sectors);
+bs->total_sectors = le64_to_cpu(s->ph.nb_sectors);
 
-if (le32_to_cpu(ph.version) != HEADER_VERSION) {
+if (le32_to_cpu(s->ph.version) != HEADER_VERSION) {
 goto fail_format;
 }
-if (!memcmp(ph.magic, HEADER_MAGIC, 16)) {
+if (!memcmp(s->ph.magic, HEADER_MAGIC, 16)) {
 s->off_multiplier = 1;
 bs->total_sectors = 0xffffffff & bs->total_sectors;
-} else if (!memcmp(ph.magic, HEADER_MAGIC2, 16)) {
-s->off_multiplier = le32_to_cpu(ph.tracks);
+} else if (!memcmp(s->ph.magic, HEADER_MAGIC2, 16)) {
+s->off_multiplier = le32_to_cpu(s->ph.tracks);
 } else {
 goto fail_format;
 }
 
-s->tracks = le32_to_cpu(ph.tracks);
+s->tracks = le32_to_cpu(s->ph.tracks);
 if (s->tracks == 0) {
 error_setg(errp, "Invalid image: Zero sectors per track");
 ret = -EINVAL;
@@ -119,7 +120,7 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
 goto fail;
 }
 
-s->catalog_size = le32_to_cpu(ph.catalog_entries);
+s->catalog_size = le32_to_cpu(s->ph.catalog_entries);
 if (s->catalog_size > INT_MAX / sizeof(uint32_t)) {
 error_setg(errp, "Catalog too large");
 ret = -EFBIG;
-- 
1.9.1




[Qemu-devel] [PATCH 15/19] block/parallels: rename catalog_ names to bat_

2014-12-30 Thread Denis V. Lunev
BAT means 'block allocation table'. The new name is clearer and shorter
to write.

Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 block/parallels.c | 48 
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/block/parallels.c b/block/parallels.c
index d072276..ddc3aee 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -47,7 +47,7 @@ typedef struct ParallelsHeader {
 uint32_t heads;
 uint32_t cylinders;
 uint32_t tracks;
-uint32_t catalog_entries;
+uint32_t bat_entries;
 uint64_t nb_sectors;
 uint32_t inuse;
 uint32_t data_off;
@@ -59,8 +59,8 @@ typedef struct BDRVParallelsState {
 
 ParallelsHeader ph;
 
-uint32_t *catalog_bitmap;
-unsigned int catalog_size;
+uint32_t *bat;
+unsigned int bat_size;
 
 unsigned int tracks;
 
@@ -68,7 +68,7 @@ typedef struct BDRVParallelsState {
 } BDRVParallelsState;
 
 
-static uint32_t catalog_offset(uint32_t index)
+static uint32_t bat_offset(uint32_t index)
 {
 return sizeof(ParallelsHeader) + sizeof(uint32_t) * index;
 }
@@ -126,26 +126,26 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
 goto fail;
 }
 
-s->catalog_size = le32_to_cpu(s->ph.catalog_entries);
-if (s->catalog_size > INT_MAX / sizeof(uint32_t)) {
+s->bat_size = le32_to_cpu(s->ph.bat_entries);
+if (s->bat_size > INT_MAX / sizeof(uint32_t)) {
 error_setg(errp, "Catalog too large");
 ret = -EFBIG;
 goto fail;
 }
-s->catalog_bitmap = g_try_new(uint32_t, s->catalog_size);
-if (s->catalog_size && s->catalog_bitmap == NULL) {
+s->bat = g_try_new(uint32_t, s->bat_size);
+if (s->bat_size && s->bat == NULL) {
 ret = -ENOMEM;
 goto fail;
 }
 
 ret = bdrv_pread(bs->file, sizeof(ParallelsHeader),
- s->catalog_bitmap, s->catalog_size * sizeof(uint32_t));
+ s->bat, s->bat_size * sizeof(uint32_t));
 if (ret < 0) {
 goto fail;
 }
 
-for (i = 0; i < s->catalog_size; i++)
-le32_to_cpus(&s->catalog_bitmap[i]);
+for (i = 0; i < s->bat_size; i++)
+le32_to_cpus(&s->bat[i]);
 
 qemu_co_mutex_init(&s->lock);
 return 0;
@@ -154,7 +154,7 @@ fail_format:
 error_setg(errp, "Image not in Parallels format");
 ret = -EINVAL;
 fail:
-g_free(s->catalog_bitmap);
+g_free(s->bat);
 return ret;
 }
 
@@ -166,9 +166,9 @@ static int64_t seek_to_sector(BDRVParallelsState *s, int64_t sector_num)
 offset = sector_num % s->tracks;
 
 /* not allocated */
-if ((index >= s->catalog_size) || (s->catalog_bitmap[index] == 0))
+if ((index >= s->bat_size) || (s->bat[index] == 0))
 return -1;
-return (uint64_t)s->catalog_bitmap[index] * s->off_multiplier + offset;
+return (uint64_t)s->bat[index] * s->off_multiplier + offset;
 }
 
 static int64_t allocate_sector(BlockDriverState *bs, int64_t sector_num)
@@ -181,24 +181,24 @@ static int64_t allocate_sector(BlockDriverState *bs, int64_t sector_num)
 idx = sector_num / s->tracks;
 offset = sector_num % s->tracks;
 
-if (idx >= s->catalog_size) {
+if (idx >= s->bat_size) {
 return -EINVAL;
 }
-if (s->catalog_bitmap[idx] != 0) {
-return (uint64_t)s->catalog_bitmap[idx] * s->off_multiplier + offset;
+if (s->bat[idx] != 0) {
+return (uint64_t)s->bat[idx] * s->off_multiplier + offset;
 }
 
 pos = bdrv_getlength(bs->file) >> BDRV_SECTOR_BITS;
 bdrv_truncate(bs->file, (pos + s->tracks) << BDRV_SECTOR_BITS);
-s->catalog_bitmap[idx] = pos / s->off_multiplier;
+s->bat[idx] = pos / s->off_multiplier;
 
-tmp = cpu_to_le32(s->catalog_bitmap[idx]);
+tmp = cpu_to_le32(s->bat[idx]);
 
-ret = bdrv_pwrite_sync(bs->file, catalog_offset(idx), &tmp, sizeof(tmp));
+ret = bdrv_pwrite_sync(bs->file, bat_offset(idx), &tmp, sizeof(tmp));
 if (ret < 0) {
 return ret;
 }
-return (uint64_t)s->catalog_bitmap[idx] * s->off_multiplier + offset;
+return (uint64_t)s->bat[idx] * s->off_multiplier + offset;
 }
 
 static int cluster_remainder(BDRVParallelsState *s, int64_t sector_num,
@@ -347,7 +347,7 @@ static int parallels_create(const char *filename, QemuOpts *opts, Error **errp)
 }
 
 cat_entries = DIV_ROUND_UP(total_size, cl_size);
-cat_sectors = DIV_ROUND_UP(catalog_offset(cat_entries), cl_size);
+cat_sectors = DIV_ROUND_UP(bat_offset(cat_entries), cl_size);
 cat_sectors = (cat_sectors *  cl_size) >> BDRV_SECTOR_BITS;
 
 memset(&header, 0, sizeof(header));
@@ -357,7 +357,7 @@ static int parallels_create(const char *filename, QemuOpts *opts, Error **errp)
 header.heads = cpu_to_le32(16);
 header.cylinders = cpu_to_le32(total_size / BDRV_SECTOR_SIZE / 16 / 32);
 header.tracks = cpu_to_le32(cl_size >> BDRV_SECTOR_BITS);
-header.catalog_entries = cpu_to_le32(cat_entries);
+heade

[Qemu-devel] [PATCH 10/19] block/parallels: support parallels image creation

2014-12-30 Thread Denis V. Lunev
Do not bother creating a WithoutFreeSpace image, since that format is
obsolete. Always create a WithouFreSpacExt one.

The code also does not put much effort into filling the cylinders and
heads fields: they are not actually used in practice, either in QEMU or
in Parallels products.

Signed-off-by: Denis V. Lunev 
Acked-by: Roman Kagan 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 block/parallels.c | 97 +++
 1 file changed, 97 insertions(+)

diff --git a/block/parallels.c b/block/parallels.c
index dca3d0b..bea1217 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -33,6 +33,9 @@
 #define HEADER_MAGIC2 "WithouFreSpacExt"
 #define HEADER_VERSION 2
 
+#define DEFAULT_CLUSTER_SIZE 1048576/* 1 MiB */
+
+
 // always little-endian
 typedef struct ParallelsHeader {
 char magic[16]; // "WithoutFreeSpace"
@@ -300,12 +303,103 @@ fail:
 return ret;
 }
 
+static int parallels_create(const char *filename, QemuOpts *opts, Error **errp)
+{
+int64_t total_size, cl_size;
+uint8_t tmp[BDRV_SECTOR_SIZE];
+Error *local_err = NULL;
+BlockDriverState *file;
+uint32_t cat_entries, cat_sectors;
+ParallelsHeader header;
+int ret;
+
+total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
+  BDRV_SECTOR_SIZE);
+cl_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_CLUSTER_SIZE,
+  DEFAULT_CLUSTER_SIZE), BDRV_SECTOR_SIZE);
+
+ret = bdrv_create_file(filename, opts, &local_err);
+if (ret < 0) {
+error_propagate(errp, local_err);
+return ret;
+}
+
+file = NULL;
+ret = bdrv_open(&file, filename, NULL, NULL,
+BDRV_O_RDWR | BDRV_O_PROTOCOL, NULL, &local_err);
+if (ret < 0) {
+error_propagate(errp, local_err);
+return ret;
+}
+ret = bdrv_truncate(file, 0);
+if (ret < 0) {
+goto exit;
+}
+
+cat_entries = DIV_ROUND_UP(total_size, cl_size);
+cat_sectors = DIV_ROUND_UP(cat_entries * sizeof(uint32_t) +
+   sizeof(ParallelsHeader), cl_size);
+cat_sectors = (cat_sectors *  cl_size) >> BDRV_SECTOR_BITS;
+
+memset(&header, 0, sizeof(header));
+memcpy(header.magic, HEADER_MAGIC2, sizeof(header.magic));
+header.version = cpu_to_le32(HEADER_VERSION);
+/* don't care much about geometry, it is not used on image level */
+header.heads = cpu_to_le32(16);
+header.cylinders = cpu_to_le32(total_size / BDRV_SECTOR_SIZE / 16 / 32);
+header.tracks = cpu_to_le32(cl_size >> BDRV_SECTOR_BITS);
+header.catalog_entries = cpu_to_le32(cat_entries);
+header.nb_sectors = cpu_to_le64(DIV_ROUND_UP(total_size, BDRV_SECTOR_SIZE));
+header.data_off = cpu_to_le32(cat_sectors);
+
+/* write all the data */
+memset(tmp, 0, sizeof(tmp));
+memcpy(tmp, &header, sizeof(header));
+
+ret = bdrv_pwrite(file, 0, tmp, BDRV_SECTOR_SIZE);
+if (ret < 0) {
+goto exit;
+}
+ret = bdrv_write_zeroes(file, 1, cat_sectors - 1, 0);
+if (ret < 0) {
+goto exit;
+}
+ret = 0;
+
+done:
+bdrv_unref(file);
+return ret;
+
+exit:
+error_setg_errno(errp, -ret, "Failed to create Parallels image");
+goto done;
+}
+
 static void parallels_close(BlockDriverState *bs)
 {
 BDRVParallelsState *s = bs->opaque;
 g_free(s->catalog_bitmap);
 }
 
+static QemuOptsList parallels_create_opts = {
+.name = "parallels-create-opts",
+.head = QTAILQ_HEAD_INITIALIZER(parallels_create_opts.head),
+.desc = {
+{
+.name = BLOCK_OPT_SIZE,
+.type = QEMU_OPT_SIZE,
+.help = "Virtual disk size",
+},
+{
+.name = BLOCK_OPT_CLUSTER_SIZE,
+.type = QEMU_OPT_SIZE,
+.help = "Parallels image cluster size",
+.def_value_str = stringify(DEFAULT_CLUSTER_SIZE),
+},
+{ /* end of list */ }
+}
+};
+
 static BlockDriver bdrv_parallels = {
 .format_name   = "parallels",
 .instance_size = sizeof(BDRVParallelsState),
@@ -315,6 +409,9 @@ static BlockDriver bdrv_parallels = {
 .bdrv_co_get_block_status = parallels_co_get_block_status,
 .bdrv_co_readv  = parallels_co_readv,
 .bdrv_co_writev = parallels_co_writev,
+
+.bdrv_create= parallels_create,
+.create_opts= &parallels_create_opts,
 };
 
 static void bdrv_parallels_init(void)
-- 
1.9.1
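
The catalog sizing done in parallels_create() above is easy to check by
hand with a standalone sketch. The names below are hypothetical, and
SKETCH_HEADER_SIZE is an assumption standing in for
sizeof(ParallelsHeader); only the rounding logic is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SECTOR_BITS 9    /* 512-byte sectors, as BDRV_SECTOR_BITS */
#define SKETCH_HEADER_SIZE 64   /* assumed sizeof(ParallelsHeader) */

static uint64_t div_round_up(uint64_t n, uint64_t d)
{
    assert(d > 0);
    return (n + d - 1) / d;
}

/* Number of catalog entries: one per cluster of virtual disk. */
static uint32_t sketch_cat_entries(uint64_t total_size, uint64_t cl_size)
{
    return (uint32_t)div_round_up(total_size, cl_size);
}

/* Start of the data area in sectors: header plus catalog,
 * rounded up to a whole number of clusters. */
static uint32_t sketch_data_off(uint64_t total_size, uint64_t cl_size)
{
    uint64_t entries = sketch_cat_entries(total_size, cl_size);
    uint64_t clusters = div_round_up(entries * sizeof(uint32_t) +
                                     SKETCH_HEADER_SIZE, cl_size);
    return (uint32_t)((clusters * cl_size) >> SKETCH_SECTOR_BITS);
}
```

For a 64 MiB image with 64 KiB clusters this gives 1024 entries and a
data offset of 128 sectors (one cluster). For 64 GiB the catalog needs
4 MiB plus the header, which spills into a 65th cluster, so the data
area starts at sector 8320.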




[Qemu-devel] [PATCH 19/19] block/parallels: optimize linear image expansion

2014-12-30 Thread Denis V. Lunev
Plain image expansion spends a lot of time updating the image file size.
This seriously affects performance. The following simple test
  qemu-img create -f parallels -o cluster_size=64k ./1.hds 64G
  qemu-io -n -c "write -P 0x11 0 1024M" ./1.hds
could be improved if the format driver pre-allocated space in the image
file in reasonably sized chunks.

This patch preallocates 128 Mb using bdrv_write_zeroes, which should
normally use the fallocate() call inside. The older truncate() approach
can still be selected as a fallback through image open options, thanks
to the previous patch.

The benefit is around 15%.

This is the final patch in this series. The block driver now has near
native performance.

Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 block/parallels.c | 35 +--
 1 file changed, 33 insertions(+), 2 deletions(-)

diff --git a/block/parallels.c b/block/parallels.c
index 12a9cea..5ec4a0d 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -82,6 +82,7 @@ typedef struct BDRVParallelsState {
 int bat_cache_off;
 int data_off;
 
+int64_t  prealloc_off;
 uint64_t prealloc_size;
 ParallelsPreallocMode prealloc_mode;
 
@@ -216,9 +217,19 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
 goto fail_options;
 }
 
-for (i = 0; i < s->bat_size; i++)
+for (i = 0; i < s->bat_size; i++) {
+int64_t off;
 le32_to_cpus(&s->bat[i]);
 
+if (s->bat[i] == 0) {
+continue;
+}
+off = s->bat[i] * s->off_multiplier;
+if (off >= s->prealloc_off) {
+s->prealloc_off = off + s->tracks;
+}
+}
+
 qemu_co_mutex_init(&s->lock);
 
 s->bat_cache_off = -1;
@@ -230,6 +241,9 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
 if (s->data_off == 0) {
 s->data_off = ROUND_UP(bat_offset(s->bat_size), BDRV_SECTOR_SIZE);
 }
+if (s->prealloc_off == 0) {
+s->prealloc_off = s->data_off >> BDRV_SECTOR_BITS;
+}
 
 return 0;
 
@@ -338,7 +352,19 @@ static int64_t allocate_sector(BlockDriverState *bs, int64_t sector_num)
 }
 
 pos = bdrv_getlength(bs->file) >> BDRV_SECTOR_BITS;
-bdrv_truncate(bs->file, (pos + s->tracks) << BDRV_SECTOR_BITS);
+if (s->prealloc_off + s->tracks > pos) {
+if (s->prealloc_mode == PRL_PREALLOC_MODE_FALLOCATE)
+ret = bdrv_write_zeroes(bs->file, s->prealloc_off,
+s->prealloc_size, 0);
+else
+ret = bdrv_truncate(bs->file,
+(s->prealloc_off + s->prealloc_size) << BDRV_SECTOR_BITS);
+if (ret < 0) {
+return ret;
+}
+}
+pos = s->prealloc_off;
+s->prealloc_off += s->tracks;
 
 ret = cache_bat(bs, idx, pos / s->off_multiplier);
 if (ret < 0) {
@@ -546,6 +572,11 @@ exit:
 static void parallels_close(BlockDriverState *bs)
 {
 BDRVParallelsState *s = bs->opaque;
+
+if (bs->open_flags & BDRV_O_RDWR) {
+bdrv_truncate(bs->file, s->prealloc_off << BDRV_SECTOR_BITS);
+}
+
 qemu_vfree(s->bat_cache);
 g_free(s->bat);
 }
-- 
1.9.1
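
The preallocation bookkeeping in the patch above can be modelled with a
few counters. This is only a sketch under assumptions: SketchPrealloc
and its functions are hypothetical, all values are in sectors, and
growing the file is reduced to updating file_sectors, whether the real
driver would use bdrv_write_zeroes() or bdrv_truncate().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the preallocation state, all in sectors. */
typedef struct {
    int64_t file_sectors;   /* current image file length */
    int64_t prealloc_off;   /* where the next new cluster will go */
    int64_t prealloc_size;  /* how much to grow the file at once */
    uint32_t tracks;        /* cluster size */
    int extensions;         /* how many times the file was grown */
} SketchPrealloc;

/* Hand out the position of the next cluster, growing the file in
 * prealloc_size steps only when the preallocated space runs out. */
static int64_t sketch_alloc_cluster(SketchPrealloc *p)
{
    assert(p->tracks > 0 && p->prealloc_size >= p->tracks);
    if (p->prealloc_off + p->tracks > p->file_sectors) {
        p->file_sectors = p->prealloc_off + p->prealloc_size;
        p->extensions++;
    }
    int64_t pos = p->prealloc_off;
    p->prealloc_off += p->tracks;
    return pos;
}

/* On close, cut the file back to the space actually used. */
static void sketch_close(SketchPrealloc *p)
{
    p->file_sectors = p->prealloc_off;
}
```

With 128-sector clusters and 1024-sector preallocation, ten cluster
allocations grow the file twice instead of ten times, and the close
step trims the unused tail.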




[Qemu-devel] [PATCH 11/19] iotests, parallels: test for newly created parallels image via qemu-img

2014-12-30 Thread Denis V. Lunev
Signed-off-by: Denis V. Lunev 
Acked-by: Roman Kagan 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 tests/qemu-iotests/115 | 68 ++
 tests/qemu-iotests/115.out | 24 
 tests/qemu-iotests/group   |  1 +
 3 files changed, 93 insertions(+)
 create mode 100755 tests/qemu-iotests/115
 create mode 100644 tests/qemu-iotests/115.out

diff --git a/tests/qemu-iotests/115 b/tests/qemu-iotests/115
new file mode 100755
index 000..f45afa7
--- /dev/null
+++ b/tests/qemu-iotests/115
@@ -0,0 +1,68 @@
+#!/bin/bash
+#
+# parallels format validation tests (created by QEMU)
+#
+# Copyright (C) 2014 Denis V. Lunev 
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=d...@openvz.org
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1   # failure is the default!
+
+_cleanup()
+{
+_cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt parallels
+_supported_proto file
+_supported_os Linux
+
+size=64M
+CLUSTER_SIZE=64k
+IMGFMT=parallels
+_make_test_img $size
+
+echo == read empty image ==
+{ $QEMU_IO -c "read -P 0 32k 64k" "$TEST_IMG"; } 2>&1 | _filter_qemu_io | _filter_testdir
+echo == write more than 1 block in a row ==
+{ $QEMU_IO -c "write -P 0x11 32k 128k" "$TEST_IMG"; } 2>&1 | _filter_qemu_io | _filter_testdir
+echo == read less than block ==
+{ $QEMU_IO -c "read -P 0x11 32k 32k" "$TEST_IMG"; } 2>&1 | _filter_qemu_io | _filter_testdir
+echo == read exactly 1 block ==
+{ $QEMU_IO -c "read -P 0x11 64k 64k" "$TEST_IMG"; } 2>&1 | _filter_qemu_io | _filter_testdir
+echo == read more than 1 block ==
+{ $QEMU_IO -c "read -P 0x11 32k 128k" "$TEST_IMG"; } 2>&1 | _filter_qemu_io | _filter_testdir
+echo == check that there is no trash after written ==
+{ $QEMU_IO -c "read -P 0 160k 32k" "$TEST_IMG"; } 2>&1 | _filter_qemu_io | _filter_testdir
+echo == check that there is no trash before written ==
+{ $QEMU_IO -c "read -P 0 0 32k" "$TEST_IMG"; } 2>&1 | _filter_qemu_io | _filter_testdir
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/115.out b/tests/qemu-iotests/115.out
new file mode 100644
index 000..6a73104
--- /dev/null
+++ b/tests/qemu-iotests/115.out
@@ -0,0 +1,24 @@
+QA output created by 115
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
+== read empty image ==
+read 65536/65536 bytes at offset 32768
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+== write more than 1 block in a row ==
+wrote 131072/131072 bytes at offset 32768
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+== read less than block ==
+read 32768/32768 bytes at offset 32768
+32 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+== read exactly 1 block ==
+read 65536/65536 bytes at offset 65536
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+== read more than 1 block ==
+read 131072/131072 bytes at offset 32768
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+== check that there is no trash after written ==
+read 32768/32768 bytes at offset 163840
+32 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+== check that there is no trash before written ==
+read 32768/32768 bytes at offset 0
+32 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index a4742c6..77377b3 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -115,3 +115,4 @@
 111 rw auto quick
 113 rw auto quick
 114 rw auto quick
+115 rw auto quick
-- 
1.9.1




[Qemu-devel] [PATCH 16/19] block/parallels: no need to flush on each block allocation table update

2014-12-30 Thread Denis V. Lunev
From the guest's point of view, each write to the real disk prior to a disk
barrier operation could be lost. Therefore it is not a problem if a "not
synced" new block is lost because the allocation table was not updated when
QEMU crashes.

This patch improves writing performance of
  qemu-img create -f parallels -o cluster_size=64k ./1.hds 64G
  qemu-io -f parallels -c "write -P 0x11 0 1024k" 1.hds
from 45 Mb/sec to 160 Mb/sec on my SSD disk. The gain on rotational media
is much more significant, from 800 Kb/sec to 45 Mb/sec.

Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 block/parallels.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/parallels.c b/block/parallels.c
index ddc3aee..46cf031 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -194,7 +194,7 @@ static int64_t allocate_sector(BlockDriverState *bs, int64_t sector_num)
 
 tmp = cpu_to_le32(s->bat[idx]);
 
-ret = bdrv_pwrite_sync(bs->file, bat_offset(idx), &tmp, sizeof(tmp));
+ret = bdrv_pwrite(bs->file, bat_offset(idx), &tmp, sizeof(tmp));
 if (ret < 0) {
 return ret;
 }
-- 
1.9.1




[Qemu-devel] [PATCH 14/19] block/parallels: create catalog_offset helper

2014-12-30 Thread Denis V. Lunev
Add a helper to calculate the offset of an entry inside the catalog bitmap
of a Parallels image. This is a matter of convenience.
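
As a rough illustration of what the helper computes, here is a standalone
sketch. It assumes a 64-byte header (the size stated elsewhere in this
series) and 32-bit catalog entries. EXAMPLE_HEADER_SIZE and
example_catalog_offset are hypothetical names standing in for
sizeof(ParallelsHeader) and catalog_offset; they are not part of the driver.

```c
#include <stdint.h>

/* Assumption: the on-disk header is 64 bytes. The driver itself uses
 * sizeof(ParallelsHeader) rather than a constant. */
#define EXAMPLE_HEADER_SIZE 64u

/* Byte offset of catalog entry `index`, counted from the start of the image. */
static uint32_t example_catalog_offset(uint32_t index)
{
    return EXAMPLE_HEADER_SIZE + (uint32_t)sizeof(uint32_t) * index;
}
```

Under these assumptions entry 0 sits at byte 64 and entry 16 at byte 128.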

Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 block/parallels.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/block/parallels.c b/block/parallels.c
index f79ddff..d072276 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -67,6 +67,12 @@ typedef struct BDRVParallelsState {
 unsigned int off_multiplier;
 } BDRVParallelsState;
 
+
+static uint32_t catalog_offset(uint32_t index)
+{
+return sizeof(ParallelsHeader) + sizeof(uint32_t) * index;
+}
+
 static int parallels_probe(const uint8_t *buf, int buf_size, const char *filename)
 {
 const ParallelsHeader *ph = (const void *)buf;
@@ -188,8 +194,7 @@ static int64_t allocate_sector(BlockDriverState *bs, int64_t sector_num)
 
 tmp = cpu_to_le32(s->catalog_bitmap[idx]);
 
-ret = bdrv_pwrite_sync(bs->file,
-sizeof(ParallelsHeader) + idx * sizeof(tmp), &tmp, sizeof(tmp));
+ret = bdrv_pwrite_sync(bs->file, catalog_offset(idx), &tmp, sizeof(tmp));
 if (ret < 0) {
 return ret;
 }
@@ -342,8 +347,7 @@ static int parallels_create(const char *filename, QemuOpts *opts, Error **errp)
 }
 
 cat_entries = DIV_ROUND_UP(total_size, cl_size);
-cat_sectors = DIV_ROUND_UP(cat_entries * sizeof(uint32_t) +
-   sizeof(ParallelsHeader), cl_size);
+cat_sectors = DIV_ROUND_UP(catalog_offset(cat_entries), cl_size);
 cat_sectors = (cat_sectors *  cl_size) >> BDRV_SECTOR_BITS;
 
 memset(&header, 0, sizeof(header));
-- 
1.9.1




[Qemu-devel] [PATCH 12/19] parallels: change copyright information in the image header

2014-12-30 Thread Denis V. Lunev
Signed-off-by: Denis V. Lunev 
Acked-by: Roman Kagan 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 block/parallels.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/block/parallels.c b/block/parallels.c
index bea1217..e3abf4e 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -2,8 +2,12 @@
  * Block driver for Parallels disk image format
  *
  * Copyright (c) 2007 Alex Beregszaszi
+ * Copyright (c) 2014 Denis V. Lunev 
  *
- * This code is based on comparing different disk images created by Parallels.
+ * This code was originally based on comparing different disk images created
+ * by Parallels. Currently it is based on opened OpenVZ sources
+ * available at
+ * http://git.openvz.org/?p=ploop;a=summary
  *
  * Permission is hereby granted, free of charge, to any person obtaining a copy
  * of this software and associated documentation files (the "Software"), to deal
-- 
1.9.1




[Qemu-devel] [PATCH 17/19] block/parallels: delay writing to BAT till bdrv_co_flush_to_os

2014-12-30 Thread Denis V. Lunev
The idea is that we do not need to sync the BAT to the image immediately:
from the guest's point of view, IO may be lost even in the physical
controller until the flush command has finished.
bdrv_co_flush_to_os is exactly the right place for this purpose.

Technically the patch aligns writes into the BAT to MAX(bdrv_align, 4096),
which eliminates read-modify-write transactions on BAT update, and
caches ready-to-write content in a special buffer in BDRVParallelsState.

This buffer possibly contains ParallelsHeader if the first page of the
image should be modified. The header occupies first 64 bytes of the image
and the BAT starts immediately after it.

It is also possible that BAT end is not aligned to the cluster size.
ParallelsHeader->data_off is not specified for this case. We should write
only part of the cache in that case.

This patch speeds up
  qemu-img create -f parallels -o cluster_size=64k ./1.hds 64G
  qemu-io -f parallels -c "write -P 0x11 0 1024k" 1.hds
writing from 50-60 Mb/sec to 80-90 Mb/sec on rotational media and
from 160 Mb/sec to 190 Mb/sec on SSD disk.
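
The alignment arithmetic described above can be sketched in isolation. This
is only an illustration under stated assumptions: the header is taken to be
64 bytes, BAT entries are 32-bit, and ExampleWindow / example_bat_window are
hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_HEADER_SIZE 64   /* assumed size of ParallelsHeader */

typedef struct ExampleWindow {
    int64_t cache_off;  /* byte offset of the aligned window in the image */
    int64_t first_idx;  /* BAT index at the start of the window; negative
                         * when the window begins inside the header */
} ExampleWindow;

/* Find the aligned cache window that holds BAT entry `idx`. */
static ExampleWindow example_bat_window(uint32_t idx, int64_t cache_size)
{
    ExampleWindow w;
    int64_t off;

    assert(cache_size > 0 && cache_size % (int64_t)sizeof(uint32_t) == 0);

    /* Byte offset of the entry, then round down to the window boundary. */
    off = EXAMPLE_HEADER_SIZE + (int64_t)sizeof(uint32_t) * idx;
    w.cache_off = (off / cache_size) * cache_size;

    /* Step back from idx by the number of entries between the window
     * start and the entry itself. */
    w.first_idx = (int64_t)idx - (off - w.cache_off) / (int64_t)sizeof(uint32_t);
    return w;
}
```

With a 4096-byte window, entries 0 to 1007 share the window at offset 0
together with the header (first_idx is -16 there, which is the case where
the header must be copied into the buffer), and entry 1008 starts the
window at offset 4096.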

Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 block/parallels.c | 99 ---
 1 file changed, 94 insertions(+), 5 deletions(-)

diff --git a/block/parallels.c b/block/parallels.c
index 46cf031..18b9267 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -62,6 +62,11 @@ typedef struct BDRVParallelsState {
 uint32_t *bat;
 unsigned int bat_size;
 
+uint32_t *bat_cache;
+unsigned bat_cache_size;
+int bat_cache_off;
+int data_off;
+
 unsigned int tracks;
 
 unsigned int off_multiplier;
@@ -148,6 +153,17 @@ static int parallels_open(BlockDriverState *bs, QDict 
*options, int flags,
 le32_to_cpus(&s->bat[i]);
 
 qemu_co_mutex_init(&s->lock);
+
+s->bat_cache_off = -1;
+if (bs->open_flags & BDRV_O_RDWR) {
+s->bat_cache_size = MAX(bdrv_opt_mem_align(bs->file), 4096);
+s->bat_cache = qemu_blockalign(bs->file, s->bat_cache_size);
+}
+s->data_off = le32_to_cpu(s->ph.data_off) * BDRV_SECTOR_SIZE;
+if (s->data_off == 0) {
+s->data_off = ROUND_UP(bat_offset(s->bat_size), BDRV_SECTOR_SIZE);
+}
+
 return 0;
 
 fail_format:
@@ -171,10 +187,71 @@ static int64_t seek_to_sector(BDRVParallelsState *s, 
int64_t sector_num)
 return (uint64_t)s->bat[index] * s->off_multiplier + offset;
 }
 
+static int write_bat_cache(BlockDriverState *bs)
+{
+BDRVParallelsState *s = bs->opaque;
+int size, off;
+
+if (s->bat_cache_off == -1) {
+/* no cached data */
+return 0;
+}
+
+size = s->bat_cache_size;
+if (size + s->bat_cache_off > s->data_off) {
+/* avoid writing to the first data block */
+size = s->data_off - s->bat_cache_off;
+}
+
+off = s->bat_cache_off;
+s->bat_cache_off = -1;
+return bdrv_pwrite(bs->file, off, s->bat_cache, size);
+}
+
+static int cache_bat(BlockDriverState *bs, uint32_t idx, uint32_t new_data_off)
+{
+int ret, i, off, cache_off;
+int64_t first_idx, last_idx;
+BDRVParallelsState *s = bs->opaque;
+uint32_t *cache = s->bat_cache;
+
+off = bat_offset(idx);
+cache_off = (off / s->bat_cache_size) * s->bat_cache_size;
+
+if (s->bat_cache_off != -1 && s->bat_cache_off != cache_off) {
+ret = write_bat_cache(bs);
+if (ret < 0) {
+return ret;
+}
+}
+
+first_idx = idx - (off - cache_off) / sizeof(uint32_t);
+last_idx = first_idx + s->bat_cache_size / sizeof(uint32_t);
+if (first_idx < 0) {
+memcpy(s->bat_cache, &s->ph, sizeof(s->ph));
+first_idx = 0;
+cache = s->bat_cache + sizeof(s->ph) / sizeof(uint32_t);
+}
+
+if (last_idx > s->bat_size) {
+memset(cache + s->bat_size - first_idx, 0,
+   sizeof(uint32_t) * (last_idx - s->bat_size));
+}
+
+for (i = 0; i < last_idx - first_idx; i++) {
+cache[i] = cpu_to_le32(s->bat[first_idx + i]);
+}
+cache[idx - first_idx] = cpu_to_le32(new_data_off);
+s->bat[idx] = new_data_off;
+
+s->bat_cache_off = cache_off;
+return 0;
+}
+
 static int64_t allocate_sector(BlockDriverState *bs, int64_t sector_num)
 {
 BDRVParallelsState *s = bs->opaque;
-uint32_t idx, offset, tmp;
+uint32_t idx, offset;
 int64_t pos;
 int ret;
 
@@ -190,17 +267,27 @@ static int64_t allocate_sector(BlockDriverState *bs, int64_t sector_num)
 
 pos = bdrv_getlength(bs->file) >> BDRV_SECTOR_BITS;
 bdrv_truncate(bs->file, (pos + s->tracks) << BDRV_SECTOR_BITS);
-s->bat[idx] = pos / s->off_multiplier;
-
-tmp = cpu_to_le32(s->bat[idx]);
 
-ret = bdrv_pwrite(bs->file, bat_offset(idx), &tmp, sizeof(tmp));
+ret = cache_bat(bs, idx, pos / s->off_multiplier);
 if (ret < 0) {
 return ret;
 }
 return (uint64_t)s->bat[idx] * s->off_multiplier + offset;
 }
 
+static coroutine_fn int 

[Qemu-devel] [PATCH 18/19] block/parallels: add prealloc-mode and prealloc-size open parameters

2014-12-30 Thread Denis V. Lunev
This is a preparatory commit for tweaks to Parallels image expansion.
The idea is that enlarging the image via truncate by one data block at a
time is slow. It would be much better to use fallocate via
bdrv_write_zeroes and expand by some significant amount at once.

This patch just adds proper parameters into BDRVParallelsState and
performs options parsing in parallels_open.
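
The string-to-enum step of the option parsing can be pictured with a
plain C stand-in. This is a sketch only: it does not use QEMU's
qapi_enum_parse, and every Example/example_ name below is hypothetical.
The accepted strings ("falloc", "truncate") are the ones listed in the
patch.

```c
#include <stddef.h>
#include <string.h>

typedef enum ExamplePreallocMode {
    EXAMPLE_PREALLOC_FALLOCATE = 0,
    EXAMPLE_PREALLOC_TRUNCATE = 1,
    EXAMPLE_PREALLOC_MAX = 2,
} ExamplePreallocMode;

/* NULL-terminated lookup table; index i names enum value i. */
static const char *const example_mode_names[] = {
    "falloc",
    "truncate",
    NULL,
};

/* Returns the matching mode, `def` when no value was given,
 * or -1 when the string is not a known mode. */
static int example_parse_prealloc_mode(const char *value,
                                       ExamplePreallocMode def)
{
    int i;

    if (value == NULL) {
        return def;
    }
    for (i = 0; example_mode_names[i] != NULL; i++) {
        if (strcmp(value, example_mode_names[i]) == 0) {
            return i;
        }
    }
    return -1;
}
```

Keeping the table NULL-terminated and index-aligned with the enum means a
new mode only needs one new enum value and one new string.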

Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
 block/parallels.c | 72 +++
 1 file changed, 72 insertions(+)

diff --git a/block/parallels.c b/block/parallels.c
index 18b9267..12a9cea 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -30,6 +30,7 @@
 #include "qemu-common.h"
 #include "block/block_int.h"
 #include "qemu/module.h"
+#include "qapi/util.h"
 
 /**/
 
@@ -54,6 +55,20 @@ typedef struct ParallelsHeader {
 char padding[12];
 } QEMU_PACKED ParallelsHeader;
 
+
+typedef enum ParallelsPreallocMode {
+PRL_PREALLOC_MODE_FALLOCATE = 0,
+PRL_PREALLOC_MODE_TRUNCATE = 1,
+PRL_PREALLOC_MODE_MAX = 2,
+} ParallelsPreallocMode;
+
+static const char *prealloc_mode_lookup[] = {
+"falloc",
+"truncate",
+NULL,
+};
+
+
 typedef struct BDRVParallelsState {
 CoMutex lock;
 
@@ -67,12 +82,40 @@ typedef struct BDRVParallelsState {
 int bat_cache_off;
 int data_off;
 
+uint64_t prealloc_size;
+ParallelsPreallocMode prealloc_mode;
+
 unsigned int tracks;
 
 unsigned int off_multiplier;
 } BDRVParallelsState;
 
 
+#define PARALLELS_OPT_PREALLOC_MODE "prealloc-mode"
+#define PARALLELS_OPT_PREALLOC_SIZE "prealloc-size"
+
+static QemuOptsList parallels_runtime_opts = {
+.name = "parallels",
+.head = QTAILQ_HEAD_INITIALIZER(parallels_runtime_opts.head),
+.desc = {
+{
+.name = PARALLELS_OPT_PREALLOC_SIZE,
+.type = QEMU_OPT_SIZE,
+.help = "Preallocation size on image expansion",
+.def_value_str = "128MiB",
+},
+{
+.name = PARALLELS_OPT_PREALLOC_MODE,
+.type = QEMU_OPT_STRING,
+.help = "Preallocation mode on image expansion "
+"(allowed values: falloc, truncate)",
+.def_value_str = "falloc",
+},
+{ /* end of list */ },
+},
+};
+
+
 static uint32_t bat_offset(uint32_t index)
 {
 return sizeof(ParallelsHeader) + sizeof(uint32_t) * index;
@@ -99,6 +142,9 @@ static int parallels_open(BlockDriverState *bs, QDict 
*options, int flags,
 BDRVParallelsState *s = bs->opaque;
 int i;
 int ret;
+QemuOpts *opts = NULL;
+Error *local_err = NULL;
+char *buf;
 
 ret = bdrv_pread(bs->file, 0, &s->ph, sizeof(s->ph));
 if (ret < 0) {
@@ -149,6 +195,27 @@ static int parallels_open(BlockDriverState *bs, QDict 
*options, int flags,
 goto fail;
 }
 
+opts = qemu_opts_create(&parallels_runtime_opts, NULL, 0, &local_err);
+if (local_err != NULL) {
+goto fail_options;
+}
+
+qemu_opts_absorb_qdict(opts, options, &local_err);
+if (local_err != NULL) {
+goto fail_options;
+}
+
+s->prealloc_size =
+qemu_opt_get_size_del(opts, PARALLELS_OPT_PREALLOC_SIZE, 0);
+s->prealloc_size = MAX(s->tracks, s->prealloc_size >> BDRV_SECTOR_BITS);
+buf = qemu_opt_get_del(opts, PARALLELS_OPT_PREALLOC_MODE);
+s->prealloc_mode = qapi_enum_parse(prealloc_mode_lookup, buf,
+PRL_PREALLOC_MODE_MAX, PRL_PREALLOC_MODE_FALLOCATE, &local_err);
+g_free(buf);
+if (local_err != NULL) {
+goto fail_options;
+}
+
 for (i = 0; i < s->bat_size; i++)
 le32_to_cpus(&s->bat[i]);
 
@@ -172,6 +239,11 @@ fail_format:
 fail:
 g_free(s->bat);
 return ret;
+
+fail_options:
+error_propagate(errp, local_err);
+ret = -EINVAL;
+goto fail;
 }
 
 static int64_t seek_to_sector(BDRVParallelsState *s, int64_t sector_num)
-- 
1.9.1




Re: [Qemu-devel] [PULL 0/2] lm32: milkymist fixes and MAINTAINERS update

2014-12-30 Thread Michael Walle
Oh sorry. Something went wrong with the pull request. I'll make a new one asap.

-michael 

Re: [Qemu-devel] [PATCH 01/10] pci: move REDHAT_SDHCI device ID to make room for Rocker

2014-12-30 Thread Peter Maydell
On 30 December 2014 at 05:14,   wrote:
> From: Scott Feldman 
>
> The rocker device uses same PCI device ID as sdhci.  Since rocker device 
> driver
> has already been accepted into Linux 3.18, and REDHAT_SDHCI device ID isn't
> used by any drivers, it's safe to move REDHAT_SDHCI device ID, avoiding
> conflict with rocker.
>
> Signed-off-by: Scott Feldman 
> Signed-off-by: Jiri Pirko 
> ---
>  docs/specs/pci-ids.txt |2 +-
>  include/hw/pci/pci.h   |2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/docs/specs/pci-ids.txt b/docs/specs/pci-ids.txt
> index 9b57d5e..c6732fe 100644
> --- a/docs/specs/pci-ids.txt
> +++ b/docs/specs/pci-ids.txt
> @@ -45,7 +45,7 @@ PCI devices (other than virtio):
>  1b36:0003  PCI Dual-port 16550A adapter (docs/specs/pci-serial.txt)
>  1b36:0004  PCI Quad-port 16550A adapter (docs/specs/pci-serial.txt)
>  1b36:0005  PCI test device (docs/specs/pci-testdev.txt)
> -1b36:0006  PCI SD Card Host Controller Interface (SDHCI)
> +1b36:0007  PCI SD Card Host Controller Interface (SDHCI)

Paolo, do you know how we ended up with this double-allocation
of PCI IDs? Who is the master authority for handing them out
from this range?

thanks
-- PMM



Re: [Qemu-devel] [PATCH v3 0/8] eliminate data write in bdrv_write_zeroes on Linux in raw-posix.c

2014-12-30 Thread Peter Lieven
On 30.12.2014 at 10:20, Denis V. Lunev wrote:
> These patches eliminate data writes completely on Linux if fallocate
> FALLOC_FL_ZERO_RANGE or FALLOC_FL_PUNCH_HOLE are  supported on
> underlying filesystem.
>
> I have performed several tests with non-aligned fallocate calls and
> in all cases (with non-aligned fallocates) Linux performs fine, i.e.
> areas are zeroed correctly. Checks were made on
>Linux 3.16.0-28-generic #38-Ubuntu SMP
>
> This should seriously increase performance in some special cases.

Could you give a hint what those special cases are? It would help
to evaluate and test the performance difference.

Thanks,
Peter

>
> Changes from v2:
> - added Peter Lieven to CC
> - added CONFIG_FALLOCATE check to call do_fallocate in patch 7
> - dropped patch 1 as NACK-ed
> - added processing of very large data areas in bdrv_co_write_zeroes (new
>   patch 1)
> - set bl.max_write_zeroes to INT_MAX in raw-posix.c for regular files
>   (new patch 8)
>
> Signed-off-by: Denis V. Lunev 
> CC: Kevin Wolf 
> CC: Stefan Hajnoczi 
> CC: Peter Lieven 
>




Re: [Qemu-devel] [PATCH v3 0/8] eliminate data write in bdrv_write_zeroes on Linux in raw-posix.c

2014-12-30 Thread Denis V. Lunev

On 30/12/14 13:55, Peter Lieven wrote:

> On 30.12.2014 at 10:20, Denis V. Lunev wrote:
>> These patches eliminate data writes completely on Linux if fallocate
>> FALLOC_FL_ZERO_RANGE or FALLOC_FL_PUNCH_HOLE are supported on
>> underlying filesystem.
>>
>> I have performed several tests with non-aligned fallocate calls and
>> in all cases (with non-aligned fallocates) Linux performs fine, i.e.
>> areas are zeroed correctly. Checks were made on
>> Linux 3.16.0-28-generic #38-Ubuntu SMP
>>
>> This should seriously increase performance in some special cases.
>
> Could you give a hint what those special cases are? It would help
> to evaluate and test the performance difference.
>
> Thanks,
> Peter


- 15% in Parallels Image expansion, see my side patchset
- writing zeroes to raw image with BLOCKDEV_DETECT_ZEROES_OPTIONS_ON set
   (actually I have kludged raw-posix.c to have this flag always set
   to perform independent testing)



Re: [Qemu-devel] [PATCH RFC v6 13/20] virtio: allow to fail setting status

2014-12-30 Thread Michael S. Tsirkin
On Thu, Dec 11, 2014 at 02:25:15PM +0100, Cornelia Huck wrote:
> virtio-1 allows setting of the FEATURES_OK status bit to fail if
> the negotiated feature bits are inconsistent: let's fail
> virtio_set_status() in that case and update virtio-ccw to post an
> error to the guest.
> 
> Signed-off-by: Cornelia Huck 

Right but a separate validate_features call is awkward.
How about we defer virtio_set_features until FEATURES_OK,
and teach virtio_set_features that it can fail?


> ---
>  hw/s390x/virtio-ccw.c  |   20 
>  hw/virtio/virtio.c |   24 +++-
>  include/hw/virtio/virtio.h |3 ++-
>  3 files changed, 37 insertions(+), 10 deletions(-)
> 
> diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
> index e09e0da..a55e851 100644
> --- a/hw/s390x/virtio-ccw.c
> +++ b/hw/s390x/virtio-ccw.c
> @@ -555,15 +555,19 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
>  if (!(status & VIRTIO_CONFIG_S_DRIVER_OK)) {
>  virtio_ccw_stop_ioeventfd(dev);
>  }
> -virtio_set_status(vdev, status);
> -if (vdev->status == 0) {
> -virtio_reset(vdev);
> -}
> -if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
> -virtio_ccw_start_ioeventfd(dev);
> +if (virtio_set_status(vdev, status) == 0) {
> +if (vdev->status == 0) {
> +virtio_reset(vdev);
> +}
> +if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
> +virtio_ccw_start_ioeventfd(dev);
> +}
> +sch->curr_status.scsw.count = ccw.count - sizeof(status);
> +ret = 0;
> +} else {
> +/* Trigger a command reject. */
> +ret = -ENOSYS;
>  }
> -sch->curr_status.scsw.count = ccw.count - sizeof(status);
> -ret = 0;
>  }
>  break;
>  case CCW_CMD_SET_IND:
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index a3dd67b..90eedd3 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -543,15 +543,37 @@ void virtio_update_irq(VirtIODevice *vdev)
>  virtio_notify_vector(vdev, VIRTIO_NO_VECTOR);
>  }
>  
> -void virtio_set_status(VirtIODevice *vdev, uint8_t val)
> +static int virtio_validate_features(VirtIODevice *vdev)
> +{
> +VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> +
> +if (k->validate_features) {
> +return k->validate_features(vdev);
> +} else {
> +return 0;
> +}
> +}
> +
> +int virtio_set_status(VirtIODevice *vdev, uint8_t val)
>  {
>  VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
>  trace_virtio_set_status(vdev, val);
>  
> +if (virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) {
> +if (!(vdev->status & VIRTIO_CONFIG_S_FEATURES_OK) &&
> +val & VIRTIO_CONFIG_S_FEATURES_OK) {
> +int ret = virtio_validate_features(vdev);
> +
> +if (ret) {
> +return ret;
> +}
> +}
> +}
>  if (k->set_status) {
>  k->set_status(vdev, val);
>  }
>  vdev->status = val;
> +return 0;
>  }
>  
>  bool target_words_bigendian(void);
> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> index a24e403..068211e 100644
> --- a/include/hw/virtio/virtio.h
> +++ b/include/hw/virtio/virtio.h
> @@ -149,6 +149,7 @@ typedef struct VirtioDeviceClass {
>  uint64_t (*get_features)(VirtIODevice *vdev, uint64_t 
> requested_features);
>  uint64_t (*bad_features)(VirtIODevice *vdev);
>  void (*set_features)(VirtIODevice *vdev, uint64_t val);
> +int (*validate_features)(VirtIODevice *vdev);
>  void (*get_config)(VirtIODevice *vdev, uint8_t *config);
>  void (*set_config)(VirtIODevice *vdev, const uint8_t *config);
>  void (*reset)(VirtIODevice *vdev);
> @@ -233,7 +234,7 @@ void virtio_queue_set_align(VirtIODevice *vdev, int n, 
> int align);
>  void virtio_queue_notify(VirtIODevice *vdev, int n);
>  uint16_t virtio_queue_vector(VirtIODevice *vdev, int n);
>  void virtio_queue_set_vector(VirtIODevice *vdev, int n, uint16_t vector);
> -void virtio_set_status(VirtIODevice *vdev, uint8_t val);
> +int virtio_set_status(VirtIODevice *vdev, uint8_t val);
>  void virtio_reset(void *opaque);
>  void virtio_update_irq(VirtIODevice *vdev);
>  int virtio_set_features(VirtIODevice *vdev, uint64_t val);
> -- 
> 1.7.9.5



[Qemu-devel] [PATCH 0/2] GPIO model for Zynq SoC

2014-12-30 Thread Colin Leitner
Hello everyone,

I wrote the Zynq GPIO model a while ago and it proved useful in an internal
project, so maybe others will find it useful too.

Cheers,
Colin

Colin Leitner (2):
  zynq_gpio: GPIO model for Zynq SoC
  xilinx_zynq: Add zynq_gpio to the machine

 hw/arm/xilinx_zynq.c  |2 +
 hw/gpio/Makefile.objs |1 +
 hw/gpio/zynq_gpio.c   |  441 +
 3 files changed, 444 insertions(+)
 create mode 100644 hw/gpio/zynq_gpio.c

-- 
1.7.10.4




[Qemu-devel] [PATCH 2/2] xilinx_zynq: Add zynq_gpio to the machine

2014-12-30 Thread Colin Leitner
Signed-off-by: Colin Leitner 
---
 hw/arm/xilinx_zynq.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c
index 06e6e24..6d8c0d9 100644
--- a/hw/arm/xilinx_zynq.c
+++ b/hw/arm/xilinx_zynq.c
@@ -202,6 +202,8 @@ static void zynq_init(MachineState *machine)
 zynq_init_spi_flashes(0xE0007000, pic[81-IRQ_OFFSET], false);
 zynq_init_spi_flashes(0xE000D000, pic[51-IRQ_OFFSET], true);
 
+sysbus_create_simple("zynq-gpio", 0xE000A000, pic[52-IRQ_OFFSET]);
+
 sysbus_create_simple("xlnx,ps7-usb", 0xE0002000, pic[53-IRQ_OFFSET]);
 sysbus_create_simple("xlnx,ps7-usb", 0xE0003000, pic[76-IRQ_OFFSET]);
 
-- 
1.7.10.4




[Qemu-devel] [PATCH 1/2] zynq_gpio: GPIO model for Zynq SoC

2014-12-30 Thread Colin Leitner
Based on the pl061 model. This model implements all four banks with 32 I/Os
each.

The I/Os are placed in four named groups:

 * mio_in/out[0..63], where mio_in/out[0..31] map to bank 0 and the rest to
   bank 1
 * emio_in/out[0..63], where emio_in/out[0..31] map to bank 2 and the rest to
   bank 3

Basic I/O tested with the Zynq GPIO driver in Linux 3.12.
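
The grouping above amounts to a simple index calculation. The following
sketch shows one way to express it; ExamplePin and example_gpio_locate are
hypothetical names for illustration and are not taken from the model.

```c
#include <assert.h>

typedef struct ExamplePin {
    int bank;  /* 0..3 */
    int bit;   /* 0..31 within the bank */
} ExamplePin;

/* Map a named GPIO line to its bank and bit.
 * is_emio: 0 for the mio group, non-zero for the emio group.
 * line:    0..63 within the group. */
static ExamplePin example_gpio_locate(int is_emio, int line)
{
    ExamplePin p;

    assert(line >= 0 && line < 64);

    /* mio covers banks 0 and 1, emio covers banks 2 and 3. */
    p.bank = (is_emio ? 2 : 0) + line / 32;
    p.bit = line % 32;
    return p;
}
```

For example, mio line 32 lands on bank 1 bit 0, and emio line 63 lands on
bank 3 bit 31.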

Signed-off-by: Colin Leitner 
---
 hw/gpio/Makefile.objs |1 +
 hw/gpio/zynq_gpio.c   |  441 +
 2 files changed, 442 insertions(+)
 create mode 100644 hw/gpio/zynq_gpio.c

diff --git a/hw/gpio/Makefile.objs b/hw/gpio/Makefile.objs
index 1abcf17..32b99e0 100644
--- a/hw/gpio/Makefile.objs
+++ b/hw/gpio/Makefile.objs
@@ -5,3 +5,4 @@ common-obj-$(CONFIG_ZAURUS) += zaurus.o
 common-obj-$(CONFIG_E500) += mpc8xxx.o
 
 obj-$(CONFIG_OMAP) += omap_gpio.o
+obj-$(CONFIG_ZYNQ) += zynq_gpio.o
diff --git a/hw/gpio/zynq_gpio.c b/hw/gpio/zynq_gpio.c
new file mode 100644
index 000..2119561
--- /dev/null
+++ b/hw/gpio/zynq_gpio.c
@@ -0,0 +1,441 @@
+/*
+ * Zynq General Purpose IO
+ *
+ * Copyright (C) 2014 Colin Leitner 
+ *
+ * Based on the PL061 model:
+ *   Copyright (c) 2007 CodeSourcery.
+ *   Written by Paul Brook
+ *
+ * This code is licensed under the GPL.
+ */
+
+/*
+ * We model all banks as if they were fully populated. MIO pins are usually
+ * limited to 54 pins, but this is probably device dependent and shouldn't
+ * cause too much trouble. One noticeable difference is the reset value of
+ * INT_TYPE_1, which is 0x003f according to the TRM and 0x here.
+ *
+ * The output enable pins are not modeled.
+ */
+
+#include "hw/sysbus.h"
+
+//#define DEBUG_ZYNQ_GPIO 1
+
+#ifdef DEBUG_ZYNQ_GPIO
+#define DPRINTF(fmt, ...) \
+do { printf("zynq-gpio: " fmt , ## __VA_ARGS__); } while (0)
+#define BADF(fmt, ...) \
+do { fprintf(stderr, "zynq-gpio: error: " fmt , ## __VA_ARGS__); exit(1);} while (0)
+#else
+#define DPRINTF(fmt, ...) do {} while(0)
+#define BADF(fmt, ...) \
+do { fprintf(stderr, "zynq-gpio: error: " fmt , ## __VA_ARGS__);} while (0)
+#endif
+
+#define TYPE_ZYNQ_GPIO "zynq-gpio"
+#define ZYNQ_GPIO(obj) OBJECT_CHECK(ZynqGPIOState, (obj), TYPE_ZYNQ_GPIO)
+
+typedef struct {
+uint32_t mask_data;
+uint32_t out_data;
+uint32_t old_out_data;
+uint32_t in_data;
+uint32_t old_in_data;
+uint32_t dir;
+uint32_t oen;
+uint32_t imask;
+uint32_t istat;
+uint32_t itype;
+uint32_t ipolarity;
+uint32_t iany;
+
+qemu_irq *out;
+} GPIOBank;
+
+typedef struct ZynqGPIOState {
+SysBusDevice parent_obj;
+
+MemoryRegion iomem;
+GPIOBank banks[4];
+qemu_irq mio_out[64];
+qemu_irq emio_out[64];
+qemu_irq irq;
+} ZynqGPIOState;
+
+static void zynq_gpio_update_out(GPIOBank *b)
+{
+uint32_t changed;
+uint32_t mask;
+uint32_t out;
+int i;
+
+DPRINTF("dir = %d, data = %d\n", b->dir, b->out_data);
+
+/* Outputs float high.  */
+/* FIXME: This is board dependent.  */
+out = (b->out_data & b->dir) | ~b->dir;
+changed = b->old_out_data ^ out;
+if (changed) {
+b->old_out_data = out;
+for (i = 0; i < 32; i++) {
+mask = 1 << i;
+if (changed & mask) {
+DPRINTF("Set output %d = %d\n", i, (out & mask) != 0);
+qemu_set_irq(b->out[i], (out & mask) != 0);
+}
+}
+}
+}
+
+static void zynq_gpio_update_in(GPIOBank *b)
+{
+uint32_t changed;
+uint32_t mask;
+int i;
+
+changed = b->old_in_data ^ b->in_data;
+if (changed) {
+b->old_in_data = b->in_data;
+for (i = 0; i < 32; i++) {
+mask = 1 << i;
+if (changed & mask) {
+DPRINTF("Changed input %d = %d\n", i, (b->in_data & mask) != 0);
+
+if (b->itype & mask) {
+/* Edge interrupt */
+if (b->iany & mask) {
+/* Any edge triggers the interrupt */
+b->istat |= mask;
+} else {
+/* Edge is selected by INT_POLARITY */
+b->istat |= ~(b->in_data ^ b->ipolarity) & mask;
+}
+}
+}
+}
+}
+
+/* Level interrupt */
+b->istat |= ~(b->in_data ^ b->ipolarity) & ~b->itype;
+
+DPRINTF("istat = %08X\n", b->istat);
+}
+
+static void zynq_gpio_set_in_irq(ZynqGPIOState *s)
+{
+int b;
+uint32_t istat = 0;
+
+for (b = 0; b < 4; b++) {
+istat |= s->banks[b].istat & ~s->banks[b].imask;
+}
+
+DPRINTF("IRQ = %d\n", istat != 0);
+
+qemu_set_irq(s->irq, istat != 0);
+}
+
+static void zynq_gpio_update(ZynqGPIOState *s)
+{
+int b;
+
+for (b = 0; b < 4; b++) {
+zynq_gpio_update_out(&s->banks[b]);
+zynq_gpio_update_in(&s->banks[b]);
+}
+
+zynq_gpio_set_in_irq(s);
+}
+
+static uint64_t zynq_gpio_read(void *opaque, hwaddr offset,
+  

Re: [Qemu-devel] [PULL 8/8] acpi-build: make ROMs RAM blocks resizeable

2014-12-30 Thread Marcel Apfelbaum

On 12/24/2014 01:51 PM, Michael S. Tsirkin wrote:

Use resizeable ram API so we can painlessly extend ROMs in the
future.  Note: migration is not affected, as we are
not actually changing the used length for RAM, which
is the part that's migrated.

Use this in acpi: reserve x16 more RAM space.

Signed-off-by: Michael S. Tsirkin 
---
  hw/lm32/lm32_hwsetup.h |  3 ++-
  include/hw/loader.h|  4 ++--
  hw/core/loader.c   | 18 ++
  hw/i386/acpi-build.c   | 19 ++-
  4 files changed, 32 insertions(+), 12 deletions(-)

diff --git a/hw/lm32/lm32_hwsetup.h b/hw/lm32/lm32_hwsetup.h
index 9fd5e69..838754d 100644
--- a/hw/lm32/lm32_hwsetup.h
+++ b/hw/lm32/lm32_hwsetup.h
@@ -73,7 +73,8 @@ static inline void hwsetup_free(HWSetup *hw)
  static inline void hwsetup_create_rom(HWSetup *hw,
  hwaddr base)
  {
-rom_add_blob("hwsetup", hw->data, TARGET_PAGE_SIZE, base, NULL, NULL, 
NULL);
+rom_add_blob("hwsetup", hw->data, TARGET_PAGE_SIZE,
+ TARGET_PAGE_SIZE, base, NULL, NULL, NULL);
  }

  static inline void hwsetup_add_u8(HWSetup *hw, uint8_t u)
diff --git a/include/hw/loader.h b/include/hw/loader.h
index 6481639..1d76108 100644
--- a/include/hw/loader.h
+++ b/include/hw/loader.h
@@ -60,7 +60,7 @@ int rom_add_file(const char *file, const char *fw_dir,
   hwaddr addr, int32_t bootindex,
   bool option_rom);
  ram_addr_t rom_add_blob(const char *name, const void *blob, size_t len,
-   hwaddr addr, const char *fw_file_name,
+   size_t max_len, hwaddr addr, const char *fw_file_name,
 FWCfgReadCallback fw_callback, void *callback_opaque);
  int rom_add_elf_program(const char *name, void *data, size_t datasize,
  size_t romsize, hwaddr addr);
@@ -74,7 +74,7 @@ void do_info_roms(Monitor *mon, const QDict *qdict);
  #define rom_add_file_fixed(_f, _a, _i)  \
  rom_add_file(_f, NULL, _a, _i, false)
  #define rom_add_blob_fixed(_f, _b, _l, _a)  \
-rom_add_blob(_f, _b, _l, _a, NULL, NULL, NULL)
+rom_add_blob(_f, _b, _l, _l, _a, NULL, NULL, NULL)

  #define PC_ROM_MIN_VGA 0xc
  #define PC_ROM_MIN_OPTION  0xc8000
diff --git a/hw/core/loader.c b/hw/core/loader.c
index 7527fd3..d3f8501 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -712,12 +712,22 @@ static void rom_insert(Rom *rom)
  QTAILQ_INSERT_TAIL(&roms, rom, next);
  }

+static void fw_cfg_resized(const char *id, uint64_t length, void *host)
+{
+if (fw_cfg) {
+fw_cfg_modify_file(fw_cfg, id + strlen("/rom@"), host, length);
+}
+}
+
  static void *rom_set_mr(Rom *rom, Object *owner, const char *name)
  {
  void *data;

  rom->mr = g_malloc(sizeof(*rom->mr));
-memory_region_init_ram(rom->mr, owner, name, rom->datasize, &error_abort);
+memory_region_init_resizeable_ram(rom->mr, owner, name,
+  rom->datasize, rom->romsize,
+  fw_cfg_resized,
+  &error_abort);
  memory_region_set_readonly(rom->mr, true);
  vmstate_register_ram_global(rom->mr);

@@ -812,7 +822,7 @@ err:
  }

  ram_addr_t rom_add_blob(const char *name, const void *blob, size_t len,
-   hwaddr addr, const char *fw_file_name,
+   size_t max_len, hwaddr addr, const char *fw_file_name,
 FWCfgReadCallback fw_callback, void *callback_opaque)
  {
  Rom *rom;
@@ -821,7 +831,7 @@ ram_addr_t rom_add_blob(const char *name, const void *blob, 
size_t len,
  rom   = g_malloc0(sizeof(*rom));
  rom->name = g_strdup(name);
  rom->addr = addr;
-rom->romsize  = len;
+rom->romsize  = max_len ? max_len : len;
  rom->datasize = len;
  rom->data = g_malloc0(rom->datasize);
  memcpy(rom->data, blob, len);
@@ -841,7 +851,7 @@ ram_addr_t rom_add_blob(const char *name, const void *blob, 
size_t len,

  fw_cfg_add_file_callback(fw_cfg, fw_file_name,
   fw_callback, callback_opaque,
- data, rom->romsize);
+ data, rom->datasize);
  }
  return ret;
  }
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index a4d0c0c..6a2e9c5 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -68,6 +68,9 @@

  #define ACPI_BUILD_TABLE_SIZE 0x2

+/* Reserve RAM space for tables: add another order of magnitude. */
+#define ACPI_BUILD_TABLE_MAX_SIZE 0x20
+
  /* #define DEBUG_ACPI_BUILD */
  #ifdef DEBUG_ACPI_BUILD
  #define ACPI_BUILD_DPRINTF(fmt, ...)\
@@ -1718,6 +1721,11 @@ static void acpi_build_update(void *build_opaque, 
uint32_t offset)
  acpi_build(build_state->guest_info, &tables);

  assert(acpi_data_len(tables.table_data) == build_state->table_size);
+
+/* Make sure RAM size is correct - in case it got changed by migr

Re: [Qemu-devel] [PATCH v10 06/13] block: Add bdrv_copy_dirty_bitmap and bdrv_clear_dirty_bitmap

2014-12-30 Thread Vladimir Sementsov-Ogievskiy
I'm sorry if it was already discussed, but I think it is inconsistent to 
have "size" in sectors and "granularity" in bytes in one structure. I've 
misused these fields because of this in my current work.


At least, I think there should be comments about this.
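
To make the unit mismatch concrete, here is a small sketch of the
conversions involved. It assumes 512-byte sectors (a shift of 9, as with
BDRV_SECTOR_BITS); the EXAMPLE_/example_ names are hypothetical helpers,
not functions from block.c.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_SECTOR_BITS 9   /* 512-byte sectors */

/* Granularity is kept in bytes; convert it to sectors before combining
 * it with a size that is kept in sectors. */
static int64_t example_granularity_to_sectors(int64_t granularity_bytes)
{
    /* Must be a power of two and at least one sector. */
    assert(granularity_bytes >= (1 << EXAMPLE_SECTOR_BITS));
    assert((granularity_bytes & (granularity_bytes - 1)) == 0);
    return granularity_bytes >> EXAMPLE_SECTOR_BITS;
}

/* Size is kept in sectors; convert it to bytes when bytes are needed. */
static int64_t example_sectors_to_bytes(int64_t nb_sectors)
{
    return nb_sectors << EXAMPLE_SECTOR_BITS;
}

/* Number of granularity-sized chunks needed to cover the whole bitmap. */
static int64_t example_chunk_count(int64_t size_sectors,
                                   int64_t granularity_bytes)
{
    int64_t g = example_granularity_to_sectors(granularity_bytes);

    return (size_sectors + g - 1) / g;
}
```

Dividing the size by the granularity without converting one of them first
would be off by a factor of 512, which is the kind of misuse described
above.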

Best regards,
Vladimir

On 23.12.2014 04:12, John Snow wrote:

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
---
  block.c   | 39 +++
  include/block/block.h |  4 
  2 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index a1d9e88..f9e0767 100644
--- a/block.c
+++ b/block.c
@@ -53,6 +53,8 @@
  
  struct BdrvDirtyBitmap {

  HBitmap *bitmap;
+int64_t size;
+int64_t granularity;
  char *name;
  QLIST_ENTRY(BdrvDirtyBitmap) list;
  };
@@ -5343,6 +5345,21 @@ void bdrv_dirty_bitmap_make_anon(BlockDriverState *bs, 
BdrvDirtyBitmap *bitmap)
  bitmap->name = NULL;
  }
  
+BdrvDirtyBitmap *bdrv_copy_dirty_bitmap(BlockDriverState *bs,

+BdrvDirtyBitmap *bitmap,
+const char *name)
+{
+BdrvDirtyBitmap *new_bitmap;
+
+new_bitmap = g_malloc0(sizeof(BdrvDirtyBitmap));
+new_bitmap->bitmap = hbitmap_copy(bitmap->bitmap);
+new_bitmap->size = bitmap->size;
+new_bitmap->granularity = bitmap->granularity;
+new_bitmap->name = g_strdup(name);
+QLIST_INSERT_HEAD(&bs->dirty_bitmaps, new_bitmap, list);
+return new_bitmap;
+}
+
  BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
int granularity,
const char *name,
@@ -5350,6 +5367,7 @@ BdrvDirtyBitmap 
*bdrv_create_dirty_bitmap(BlockDriverState *bs,
  {
  int64_t bitmap_size;
  BdrvDirtyBitmap *bitmap;
+int sector_granularity;
  
  assert((granularity & (granularity - 1)) == 0);
  
@@ -5357,8 +5375,8 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,

  error_setg(errp, "Bitmap already exists: %s", name);
  return NULL;
  }
-granularity >>= BDRV_SECTOR_BITS;
-assert(granularity);
+sector_granularity = granularity >> BDRV_SECTOR_BITS;
+assert(sector_granularity);
  bitmap_size = bdrv_nb_sectors(bs);
  if (bitmap_size < 0) {
  error_setg_errno(errp, -bitmap_size, "could not get length of 
device");
@@ -5366,7 +5384,9 @@ BdrvDirtyBitmap 
*bdrv_create_dirty_bitmap(BlockDriverState *bs,
  return NULL;
  }
  bitmap = g_new0(BdrvDirtyBitmap, 1);
-bitmap->bitmap = hbitmap_alloc(bitmap_size, ffs(granularity) - 1);
+bitmap->size = bitmap_size;
+bitmap->granularity = granularity;
+bitmap->bitmap = hbitmap_alloc(bitmap->size, ffs(sector_granularity) - 1);
  bitmap->name = g_strdup(name);
  QLIST_INSERT_HEAD(&bs->dirty_bitmaps, bitmap, list);
  return bitmap;
@@ -5439,7 +5459,9 @@ uint64_t 
bdrv_get_default_bitmap_granularity(BlockDriverState *bs)
  uint64_t bdrv_dirty_bitmap_granularity(BlockDriverState *bs,
 BdrvDirtyBitmap *bitmap)
  {
-return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
+g_assert(BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap) ==
+ bitmap->granularity);
+return bitmap->granularity;
  }
  
  void bdrv_dirty_iter_init(BlockDriverState *bs,

@@ -5460,6 +5482,15 @@ void bdrv_reset_dirty_bitmap(BlockDriverState *bs, 
BdrvDirtyBitmap *bitmap,
  hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
  }
  
+/**

+ * Effectively, reset the hbitmap from bits [0, size)
+ * Synonymous with bdrv_reset_dirty_bitmap(bs, bitmap, 0, bitmap->size)
+ */
+void bdrv_clear_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
+{
+hbitmap_reset(bitmap->bitmap, 0, bitmap->size);
+}
+
  static void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
 int nr_sectors)
  {
diff --git a/include/block/block.h b/include/block/block.h
index c7402e7..e964abd 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -436,6 +436,9 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState 
*bs,
  BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs,
  const char *name);
  void bdrv_dirty_bitmap_make_anon(BlockDriverState *bs, BdrvDirtyBitmap 
*bitmap);
+BdrvDirtyBitmap *bdrv_copy_dirty_bitmap(BlockDriverState *bs,
+BdrvDirtyBitmap *bitmap,
+const char *name);
  void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap);
  BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs);
  uint64_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs);
@@ -446,6 +449,7 @@ void bdrv_set_dirty_bitmap(BlockDriverState *bs, 
BdrvDirtyBitmap *bitmap,
 int64_t cur_sector, int nr_sectors);
  void

[Qemu-devel] [PATCH RFC 0/3] virtio-pci: towards virtio 1.0 host support

2014-12-30 Thread Michael S. Tsirkin
Partial implementation for virtio 1.0.
Some bits are still missing, but this is already
somewhat useful for driver development.

Michael S. Tsirkin (3):
  linux-headers: add virtio_pci
  virtio: misc fixes, include linux header
  virtio-pci: initial virtio 1.0 support

 hw/virtio/virtio-pci.h   |  16 ++
 linux-headers/linux/virtio_pci.h | 194 
 hw/net/virtio-net.c  |   2 +-
 hw/virtio/virtio-pci.c   | 378 +++
 hw/virtio/virtio.c   |  13 +-
 5 files changed, 598 insertions(+), 5 deletions(-)
 create mode 100644 linux-headers/linux/virtio_pci.h

-- 
MST




[Qemu-devel] [PATCH RFC 2/3] virtio: misc fixes, include linux header

2014-12-30 Thread Michael S. Tsirkin
Tweak virtio core so we can use linux virtio pci header
directly without duplicating code.

Signed-off-by: Michael S. Tsirkin 
---
 hw/net/virtio-net.c|  2 +-
 hw/virtio/virtio-pci.c |  3 +++
 hw/virtio/virtio.c | 13 +
 3 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index b5dd356..5fff769 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1046,7 +1046,7 @@ static ssize_t virtio_net_receive(NetClientState *nc, 
const uint8_t *buf, size_t
 return -1;
 error_report("virtio-net unexpected empty queue: "
 "i %zd mergeable %d offset %zd, size %zd, "
-"guest hdr len %zd, host hdr len %zd guest features 0x%lx",
+"guest hdr len %zd, host hdr len %zd guest features 
0x%"PRIx64,
 i, n->mergeable_rx_bufs, offset, size,
 n->guest_hdr_len, n->host_hdr_len, vdev->guest_features);
 exit(1);
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 7382705..bc11d3d 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -17,6 +17,7 @@
 
 #include 
 
+#include "linux-headers/linux/virtio_pci.h"
 #include "hw/virtio/virtio.h"
 #include "hw/virtio/virtio-blk.h"
 #include "hw/virtio/virtio-net.h"
@@ -76,6 +77,8 @@
  VIRTIO_PCI_CONFIG_MSI : \
  VIRTIO_PCI_CONFIG_NOMSI)
 
+#undef VIRTIO_PCI_CONFIG
+
 /* The remaining space is defined by each driver as the per-driver
  * configuration space */
 #define VIRTIO_PCI_CONFIG(dev)  (msix_enabled(dev) ? \
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 90eedd3..301b83f 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -1033,7 +1033,8 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f, int 
version_id)
 int i, ret;
 int32_t config_len;
 uint32_t num;
-uint32_t features;
+uint32_t features_lo, features_hi;
+uint64_t features;
 uint64_t supported_features;
 BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
 VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
@@ -1057,12 +1058,16 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f, int 
version_id)
 if (vdev->queue_sel >= VIRTIO_PCI_QUEUE_MAX) {
 return -1;
 }
-qemu_get_be32s(f, &features);
+qemu_get_be32s(f, &features_lo);
+
+//if (features_lo & (1UL << VIRTIO_F_VERSION_1)) {
+qemu_get_be32s(f, &features_hi);
+//}
+features = (((uint64_t)features_hi) << 32) | features_lo;
 
-/* XXX features >= 32 */
 if (__virtio_set_features(vdev, features) < 0) {
 supported_features = k->get_features(qbus->parent);
-error_report("Features 0x%x unsupported. Allowed features: 0x%lx",
+error_report("Features 0x%"PRIx64" unsupported. Allowed features: 
0x%"PRIx64,
  features, supported_features);
 return -1;
 }
-- 
MST
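
For readers following the virtio_load() hunk above, the 64-bit feature handling can be sketched in isolation roughly as follows. The function names are hypothetical; only the low-word-first ordering and the switch to PRIx64 come from the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Combine two 32-bit halves, low word first, as the hunk does. */
static uint64_t sketch_combine_features(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Format with PRIx64 so the conversion matches uint64_t on every host;
 * "%lx" is only correct where long happens to be 64 bits wide. */
static int sketch_format_features(char *buf, size_t len, uint64_t features)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "0x%" PRIx64, features);
}
```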




[Qemu-devel] [PATCH RFC 1/3] linux-headers: add virtio_pci

2014-12-30 Thread Michael S. Tsirkin
Easier than duplicating code.

Signed-off-by: Michael S. Tsirkin 
---
 linux-headers/linux/virtio_pci.h | 194 +++
 1 file changed, 194 insertions(+)
 create mode 100644 linux-headers/linux/virtio_pci.h

diff --git a/linux-headers/linux/virtio_pci.h b/linux-headers/linux/virtio_pci.h
new file mode 100644
index 000..e841edd
--- /dev/null
+++ b/linux-headers/linux/virtio_pci.h
@@ -0,0 +1,194 @@
+/*
+ * Virtio PCI driver
+ *
+ * This module allows virtio devices to be used over a virtual PCI device.
+ * This can be used with QEMU based VMMs like KVM or Xen.
+ *
+ * Copyright IBM Corp. 2007
+ *
+ * Authors:
+ *  Anthony Liguori  
+ *
+ * This header is BSD licensed so anyone can use the definitions to implement
+ * compatible drivers/servers.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of IBM nor the names of its contributors
+ *may be used to endorse or promote products derived from this software
+ *without specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS 
IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#ifndef _LINUX_VIRTIO_PCI_H
+#define _LINUX_VIRTIO_PCI_H
+
+#include 
+
+#ifndef VIRTIO_PCI_NO_LEGACY
+
+/* A 32-bit r/o bitmask of the features supported by the host */
+#define VIRTIO_PCI_HOST_FEATURES   0
+
+/* A 32-bit r/w bitmask of features activated by the guest */
+#define VIRTIO_PCI_GUEST_FEATURES  4
+
+/* A 32-bit r/w PFN for the currently selected queue */
+#define VIRTIO_PCI_QUEUE_PFN   8
+
+/* A 16-bit r/o queue size for the currently selected queue */
+#define VIRTIO_PCI_QUEUE_NUM   12
+
+/* A 16-bit r/w queue selector */
+#define VIRTIO_PCI_QUEUE_SEL   14
+
+/* A 16-bit r/w queue notifier */
+#define VIRTIO_PCI_QUEUE_NOTIFY16
+
+/* An 8-bit device status register.  */
+#define VIRTIO_PCI_STATUS  18
+
+/* An 8-bit r/o interrupt status register.  Reading the value will return the
+ * current contents of the ISR and will also clear it.  This is effectively
+ * a read-and-acknowledge. */
+#define VIRTIO_PCI_ISR 19
+
+/* MSI-X registers: only enabled if MSI-X is enabled. */
+/* A 16-bit vector for configuration changes. */
+#define VIRTIO_MSI_CONFIG_VECTOR20
+/* A 16-bit vector for selected queue notifications. */
+#define VIRTIO_MSI_QUEUE_VECTOR 22
+
+/* The remaining space is defined by each driver as the per-driver
+ * configuration space */
+#define VIRTIO_PCI_CONFIG_OFF(msix_enabled)((msix_enabled) ? 24 : 20)
+/* Deprecated: please use VIRTIO_PCI_CONFIG_OFF instead */
+#define VIRTIO_PCI_CONFIG(dev) VIRTIO_PCI_CONFIG_OFF((dev)->msix_enabled)
+
+/* Virtio ABI version, this must match exactly */
+#define VIRTIO_PCI_ABI_VERSION 0
+
+/* How many bits to shift physical queue address written to QUEUE_PFN.
+ * 12 is historical, and due to x86 page size. */
+#define VIRTIO_PCI_QUEUE_ADDR_SHIFT12
+
+/* The alignment to use between consumer and producer parts of vring.
+ * x86 pagesize again. */
+#define VIRTIO_PCI_VRING_ALIGN 4096
+
+#endif /* VIRTIO_PCI_NO_LEGACY */
+
+/* The bit of the ISR which indicates a device configuration change. */
+#define VIRTIO_PCI_ISR_CONFIG  0x2
+/* Vector value used to disable MSI for queue */
+#define VIRTIO_MSI_NO_VECTOR0x
+
+#ifndef VIRTIO_PCI_NO_MODERN
+
+/* IDs for different capabilities.  Must all exist. */
+
+/* Common configuration */
+#define VIRTIO_PCI_CAP_COMMON_CFG  1
+/* Notifications */
+#define VIRTIO_PCI_CAP_NOTIFY_CFG  2
+/* ISR access */
+#define VIRTIO_PCI_CAP_ISR_CFG 3
+/* Device specific confiuration */
+#define VIRTIO_PCI_CAP_DEVICE_CFG  4
+
+/* This is the PCI capability header: */
+struct virtio_pci_cap {
+   __u8 cap_vndr; 

[Qemu-devel] [PATCH RFC 3/3] virtio-pci: initial virtio 1.0 support

2014-12-30 Thread Michael S. Tsirkin
This is somewhat functional.  With this, and linux driver from my tree,
I was able to use virtio net as virtio 1.0 device for light browsing.

At the moment, dataplane and vhost code is
still missing.

Based on Cornelia's virtio 1.0 patchset:
Date: Thu, 11 Dec 2014 14:25:02 +0100
From: Cornelia Huck 
To: virtualizat...@lists.linux-foundation.org, qemu-devel@nongnu.org
Cc: ru...@rustcorp.com.au, th...@linux.vnet.ibm.com, m...@redhat.com,
Cornelia Huck 
Subject: [PATCH RFC v6 00/20] qemu: towards virtio-1 host support
Message-Id: <1418304322-7546-1-git-send-email-cornelia.h...@de.ibm.com>

which is itself still missing some core bits.

Signed-off-by: Michael S. Tsirkin 
---
 hw/virtio/virtio-pci.h |  16 +++
 hw/virtio/virtio-pci.c | 375 +
 2 files changed, 391 insertions(+)

diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
index 85f102d..2cddd6a 100644
--- a/hw/virtio/virtio-pci.h
+++ b/hw/virtio/virtio-pci.h
@@ -88,10 +88,26 @@ typedef struct VirtioPCIClass {
 struct VirtIOPCIProxy {
 PCIDevice pci_dev;
 MemoryRegion bar;
+MemoryRegion common;
+MemoryRegion isr;
+MemoryRegion device;
+MemoryRegion notify;
+MemoryRegion modern_bar;
 uint32_t flags;
 uint32_t class_code;
 uint32_t nvectors;
 uint64_t host_features;
+uint32_t dfselect;
+uint32_t gfselect;
+uint32_t guest_features[2];
+struct {
+uint16_t num;
+bool enabled;
+uint32_t desc[2];
+uint32_t avail[2];
+uint32_t used[2];
+} vqs[VIRTIO_PCI_QUEUE_MAX];
+
 bool ioeventfd_disabled;
 bool ioeventfd_started;
 VirtIOIRQFD *vector_irqfd;
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index bc11d3d..51c33c5 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -959,6 +959,275 @@ static const TypeInfo virtio_9p_pci_info = {
  * virtio-pci: This is the PCIDevice which has a virtio-pci-bus.
  */
 
+static void virtio_pci_add_mem_cap(VirtIOPCIProxy *proxy,
+   struct virtio_pci_cap *cap)
+{
+PCIDevice *dev = &proxy->pci_dev;
+int offset;
+
+cap->type_and_bar |= 2 << VIRTIO_PCI_CAP_BAR_SHIFT;
+
+offset = pci_add_capability(dev, PCI_CAP_ID_VNDR, 0, cap->cap_len);
+assert(offset > 0);
+
+assert(cap->cap_len >= sizeof *cap);
+memcpy(dev->config + offset + PCI_CAP_FLAGS, &cap->cap_len,
+   cap->cap_len - PCI_CAP_FLAGS);
+}
+
+#define QEMU_VIRTIO_PCI_QUEUE_MEM_MULT 0x1
+
+static uint64_t virtio_pci_common_read(void *opaque, hwaddr addr,
+   unsigned size)
+{
+VirtIOPCIProxy *proxy = opaque;
+VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
+uint32_t val = 0;
+int i;
+
+switch (addr) {
+case VIRTIO_PCI_COMMON_DFSELECT:
+val = proxy->dfselect;
+break;
+case VIRTIO_PCI_COMMON_DF:
+if (proxy->dfselect <= 1) {
+val = proxy->host_features >> (32 * proxy->dfselect);
+}
+break;
+case VIRTIO_PCI_COMMON_GFSELECT:
+val = proxy->gfselect;
+break;
+case VIRTIO_PCI_COMMON_GF:
+if (proxy->gfselect <= ARRAY_SIZE(proxy->guest_features)) {
+val = proxy->guest_features[proxy->gfselect];
+}
+break;
+case VIRTIO_PCI_COMMON_MSIX:
+val = vdev->config_vector;
+break;
+case VIRTIO_PCI_COMMON_NUMQ:
+for (i = 0; i < VIRTIO_PCI_QUEUE_MAX; ++i) {
+if (virtio_queue_get_num(vdev, i)) {
+val = i + 1;
+}
+}
+break;
+case VIRTIO_PCI_COMMON_STATUS:
+val = vdev->status;
+break;
+case VIRTIO_PCI_COMMON_CFGGENERATION:
+val = 0; /* TODO */
+break;
+case VIRTIO_PCI_COMMON_Q_SELECT:
+val = vdev->queue_sel;
+break;
+case VIRTIO_PCI_COMMON_Q_SIZE:
+val = virtio_queue_get_num(vdev, vdev->queue_sel);
+break;
+case VIRTIO_PCI_COMMON_Q_MSIX:
+val = virtio_queue_vector(vdev, vdev->queue_sel);
+break;
+case VIRTIO_PCI_COMMON_Q_ENABLE:
+val = proxy->vqs[vdev->queue_sel].enabled;
+break;
+case VIRTIO_PCI_COMMON_Q_NOFF:
+/* Simply map queues in order */
+val = vdev->queue_sel;
+break;
+case VIRTIO_PCI_COMMON_Q_DESCLO:
+val = proxy->vqs[vdev->queue_sel].desc[0];
+break;
+case VIRTIO_PCI_COMMON_Q_DESCHI:
+val = proxy->vqs[vdev->queue_sel].desc[1];
+break;
+case VIRTIO_PCI_COMMON_Q_AVAILLO:
+val = proxy->vqs[vdev->queue_sel].avail[0];
+break;
+case VIRTIO_PCI_COMMON_Q_AVAILHI:
+val = proxy->vqs[vdev->queue_sel].avail[1];
+break;
+case VIRTIO_PCI_COMMON_Q_USEDLO:
+val = proxy->vqs[vdev->queue_sel].used[0];
+break;
+case VIRTIO_PCI_COMMON_Q_USEDHI:
+val = proxy->vqs[vdev->queue_sel].used[1];
+break

[Qemu-devel] [PULL RESEND 0/2] lm32: milkymist fixes and MAINTAINERS update

2014-12-30 Thread Michael Walle
Hi Peter,

please pull these two commits which were posted to the ML some time ago.

---

The following changes since commit ab0302ee764fd702465aef6d88612cdff4302809:

  Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20141223' 
into staging (2014-12-23 15:05:22 +)

are available in the git repository at:

  git://github.com/mwalle/qemu.git tags/lm32-fixes/20141229

for you to fetch changes up to 4eab7a0a2394275a611a19aef6619402ad524b63:

  MAINTAINERS: add myself to lm32 and milkymist (2014-12-29 17:25:17 +0100)


Michael Walle (2):
  milkymist: softmmu: fix event handling
  MAINTAINERS: add myself to lm32 and milkymist

 MAINTAINERS  |  6 +-
 hw/input/milkymist-softusb.c | 19 ---
 2 files changed, 17 insertions(+), 8 deletions(-)
-- 
2.1.4




Re: [Qemu-devel] [PATCH RFC 2/3] virtio: misc fixes, include linux header

2014-12-30 Thread Peter Maydell
On 30 December 2014 at 16:28, Michael S. Tsirkin  wrote:
> Tweak virtio core so we can use linux virtio pci header
> directly without duplicating code.
>
> Signed-off-by: Michael S. Tsirkin 

> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index 7382705..bc11d3d 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -17,6 +17,7 @@
>
>  #include 
>
> +#include "linux-headers/linux/virtio_pci.h"

I'm afraid this won't work. You can only pull in the Linux headers
inside CONFIG_KVM ifdefs. Otherwise you're liable to break the
build on non-Linux platforms, because the kernel headers make
no guarantees about being usable on any hosts other than Linux.

Indeed in this specific case MacOSX won't build:

In file included from /Users/pm215/src/qemu/hw/virtio/virtio-pci.c:20:
/Users/pm215/src/qemu/linux-headers/linux/virtio_pci.h:42:10: fatal
error: 'linux/types.h' file not found
#include 
 ^

If you need the virtio_pci.h header you'll need to make a
portable copy of it in include/hw/virtio/, the same way we
have already for the other headers.
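
As a rough sketch of what "portable" means here: the copy would use fixed-width <stdint.h> types in place of the kernel's __u8/__le32, so nothing depends on linux/types.h. The struct name is hypothetical, and the field layout below is assumed from the published virtio 1.0 spec, so it may not match the draft header in this RFC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical portable capability header; layout assumed from the
 * published virtio 1.0 spec, not from the RFC patch. */
typedef struct SketchVirtioPciCap {
    uint8_t cap_vndr;   /* generic PCI field: vendor-specific capability ID */
    uint8_t cap_next;   /* generic PCI field: next capability pointer */
    uint8_t cap_len;    /* length of this capability structure */
    uint8_t cfg_type;   /* which configuration structure this describes */
    uint8_t bar;        /* BAR that holds the structure */
    uint8_t padding[3]; /* pad to a 4-byte boundary */
    uint32_t offset;    /* little-endian offset within the BAR */
    uint32_t length;    /* little-endian length of the structure */
} SketchVirtioPciCap;

/* Fill in the fields that do not depend on the BAR contents. */
static void sketch_cap_init(SketchVirtioPciCap *cap, uint8_t cfg_type,
                            uint8_t bar)
{
    assert(cap != NULL);
    *cap = (SketchVirtioPciCap){0};
    cap->cap_vndr = 0x09; /* PCI_CAP_ID_VNDR */
    cap->cap_len = (uint8_t)sizeof(*cap);
    cap->cfg_type = cfg_type;
    cap->bar = bar;
}
```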

thanks
-- PMM



Re: [Qemu-devel] [PATCH 1/2] zynq_gpio: GPIO model for Zynq SoC

2014-12-30 Thread Peter Crosthwaite
On Tue, Dec 30, 2014 at 5:13 AM, Colin Leitner
 wrote:
> Based on the pl061 model. This model implements all four banks with 32 I/Os
> each.
>
> The I/Os are placed in four named groups:
>
>  * mio_in/out[0..63], where mio_in/out[0..31] map to bank 0 and the rest to
>bank 1
>  * emio_in/out[0..63], where emio_in/out[0..31] map to bank 2 and the rest to
>bank 3
>
> Basic I/O tested with the Zynq GPIO driver in Linux 3.12.
>
> Signed-off-by: Colin Leitner 
> ---
>  hw/gpio/Makefile.objs |1 +
>  hw/gpio/zynq_gpio.c   |  441 
> +
>  2 files changed, 442 insertions(+)
>  create mode 100644 hw/gpio/zynq_gpio.c
>
> diff --git a/hw/gpio/Makefile.objs b/hw/gpio/Makefile.objs
> index 1abcf17..32b99e0 100644
> --- a/hw/gpio/Makefile.objs
> +++ b/hw/gpio/Makefile.objs
> @@ -5,3 +5,4 @@ common-obj-$(CONFIG_ZAURUS) += zaurus.o
>  common-obj-$(CONFIG_E500) += mpc8xxx.o
>
>  obj-$(CONFIG_OMAP) += omap_gpio.o
> +obj-$(CONFIG_ZYNQ) += zynq_gpio.o

I think we are slowly trying to convert filenames to use - separators.
This should be followed for new files.

> diff --git a/hw/gpio/zynq_gpio.c b/hw/gpio/zynq_gpio.c
> new file mode 100644
> index 000..2119561
> --- /dev/null
> +++ b/hw/gpio/zynq_gpio.c
> @@ -0,0 +1,441 @@
> +/*
> + * Zynq General Purpose IO
> + *
> + * Copyright (C) 2014 Colin Leitner 
> + *
> + * Based on the PL061 model:
> + *   Copyright (c) 2007 CodeSourcery.
> + *   Written by Paul Brook
> + *
> + * This code is licensed under the GPL.
> + */
> +
> +/*
> + * We model all banks as if they were fully populated. MIO pins are usually
> + * limited to 54 pins, but this is probably device dependent and shouldn't
> + * cause too much trouble. One noticable difference is the reset value of

"noticeable"

> + * INT_TYPE_1, which is 0x003f according to the TRM and 0x here.
> + *
> + * The output enable pins are not modeled.
> + */
> +
> +#include "hw/sysbus.h"
> +
> +//#define DEBUG_ZYNQ_GPIO 1

Don't worry about commented out debug switches.

> +
> +#ifdef DEBUG_ZYNQ_GPIO
> +#define DPRINTF(fmt, ...) \
> +do { printf("zynq-gpio: " fmt , ## __VA_ARGS__); } while (0)

use qemu_log.

> +#define BADF(fmt, ...) \

BADF is unused.

> +do { fprintf(stderr, "zynq-gpio: error: " fmt , ## __VA_ARGS__); exit(1);} 
> while (0)

and exit(1) is probably not a good semantic for an error condition.
Such conditions should just be asserts. You should just drop BADF
completely.

> +#else
> +#define DPRINTF(fmt, ...) do {} while(0)
> +#define BADF(fmt, ...) \
> +do { fprintf(stderr, "zynq-gpio: error: " fmt , ## __VA_ARGS__);} while (0)
> +#endif
> +

It's better to use a regular if for debug instrumentation. Check
hw/dma/pl330.c for an example of this. The reason is that the contents
of the printfs are then always compile tested.
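
A minimal sketch of that pattern, assuming a hypothetical SKETCH_GPIO_DEBUG switch and using fprintf as a stand-in for qemu_log so that it builds on its own:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical switch; override with -DSKETCH_GPIO_DEBUG=1 at build time. */
#ifndef SKETCH_GPIO_DEBUG
#define SKETCH_GPIO_DEBUG 0
#endif

/* The format string and arguments are always parsed and type-checked;
 * the compiler can drop the dead branch when the switch is 0. */
#define SKETCH_DPRINTF(...) do { \
    if (SKETCH_GPIO_DEBUG) { \
        fprintf(stderr, "zynq-gpio: " __VA_ARGS__); \
    } \
} while (0)

/* Example caller: mark one pin as an output and trace the new mask. */
static unsigned sketch_set_dir(unsigned dir, unsigned bit)
{
    assert(bit < 32);
    dir |= 1u << bit;
    SKETCH_DPRINTF("dir = 0x%x\n", dir);
    return dir;
}
```

With the #ifdef variant, a stale format string inside DPRINTF goes unnoticed until someone enables the debug define; here it breaks the normal build.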

> +#define TYPE_ZYNQ_GPIO "zynq-gpio"
> +#define ZYNQ_GPIO(obj) OBJECT_CHECK(ZynqGPIOState, (obj), TYPE_ZYNQ_GPIO)
> +
> +typedef struct {

Modern device-model conventions require the type struct, typename and
cast macros and the register offset #defines to be in a header file
specific to the device. These bits should be in zynq-gpio.h

> +uint32_t mask_data;
> +uint32_t out_data;
> +uint32_t old_out_data;
> +uint32_t in_data;
> +uint32_t old_in_data;
> +uint32_t dir;
> +uint32_t oen;
> +uint32_t imask;
> +uint32_t istat;
> +uint32_t itype;
> +uint32_t ipolarity;
> +uint32_t iany;
> +
> +qemu_irq *out;
> +} GPIOBank;

Preface the struct name with "Zynq" for consistency with other
identifiers defined.

> +
> +typedef struct ZynqGPIOState {
> +SysBusDevice parent_obj;
> +
> +MemoryRegion iomem;
> +GPIOBank banks[4];
> +qemu_irq mio_out[64];
> +qemu_irq emio_out[64];

Is it better to just model the GPIO controller as a standalone GPIO,
and leave the mio vs emio distinction to the SoC/Board level?

This would mean the bank GPIOs are on the top level entity, and the
core would then have no EMIO/MIO awareness. This also makes QEMU a
little less awkward considering there is no sense of MIO and EMIO in
QEMU to date.

> +qemu_irq irq;
> +} ZynqGPIOState;
> +
> +static void zynq_gpio_update_out(GPIOBank *b)
> +{
> +uint32_t changed;
> +uint32_t mask;
> +uint32_t out;
> +int i;
> +
> +DPRINTF("dir = %d, data = %d\n", b->dir, b->out_data);
> +
> +/* Outputs float high.  */
> +/* FIXME: This is board dependent.  */

How so? Looks pretty generic to me (not sure what needs fixing here).
Are you saying that the IO width should truncate based on Zynq
specifics?

> +out = (b->out_data & b->dir) | ~b->dir;
> +changed = b->old_out_data ^ out;
> +if (changed) {

This if doesn't save much in optimization, as the expensive part (the
qemu_set_irq) is already change-guarded per-bit below anyway. Just
drop the if.

> +b->old_out_data = out;
> +for (i = 0; i < 32; i++) {

Macroify hardcoded constant 32.

> +mask = 1 << i;
> +if 

Re: [Qemu-devel] [PATCH v8 1/7] stm32f2xx_timer: Add the stm32f2xx Timer

2014-12-30 Thread Peter Crosthwaite
On Thu, Dec 25, 2014 at 8:22 PM, Alistair Francis  wrote:
> This patch adds the stm32f2xx timers: TIM2, TIM3, TIM4 and TIM5
> to QEMU.
>
> Signed-off-by: Alistair Francis 
> ---
> V8:
>  - Fix tick_offset to allow now to wrap around
>  - Remove the calls to get_ticks_per_sec()
>  - Pre-scale the guest visable time
> V6:
>  - Rename to STM32F2XX
>  - Change the timer calculations to use ns
>  - Update the value to timer_mod to ensure it is in ns
>  - Account for reloadable/resetable timer
> - Thanks to Peter C for pointing this out
> V4:
>  - Update timer units again
> - Thanks to Peter C
> V3:
>  - Update debug statements
>  - Correct the units for timer_mod
>  - Correctly set timer_offset from resets
> V2:
>  - Reorder the Makefile config
>  - Fix up the debug printing
>  - Correct the timer event trigger
>
>  default-configs/arm-softmmu.mak|   1 +
>  hw/timer/Makefile.objs |   2 +
>  hw/timer/stm32f2xx_timer.c | 337 
> +
>  include/hw/timer/stm32f2xx_timer.h | 103 
>  4 files changed, 443 insertions(+)
>  create mode 100644 hw/timer/stm32f2xx_timer.c
>  create mode 100644 include/hw/timer/stm32f2xx_timer.h
>
> diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
> index f3513fa..faea100 100644
> --- a/default-configs/arm-softmmu.mak
> +++ b/default-configs/arm-softmmu.mak
> @@ -78,6 +78,7 @@ CONFIG_NSERIES=y
>  CONFIG_REALVIEW=y
>  CONFIG_ZAURUS=y
>  CONFIG_ZYNQ=y
> +CONFIG_STM32F2XX_TIMER=y
>
>  CONFIG_VERSATILE_PCI=y
>  CONFIG_VERSATILE_I2C=y
> diff --git a/hw/timer/Makefile.objs b/hw/timer/Makefile.objs
> index 2c86c3d..133bd0d 100644
> --- a/hw/timer/Makefile.objs
> +++ b/hw/timer/Makefile.objs
> @@ -31,3 +31,5 @@ obj-$(CONFIG_DIGIC) += digic-timer.o
>  obj-$(CONFIG_MC146818RTC) += mc146818rtc.o
>
>  obj-$(CONFIG_ALLWINNER_A10_PIT) += allwinner-a10-pit.o
> +
> +common-obj-$(CONFIG_STM32F2XX_TIMER) += stm32f2xx_timer.o
> diff --git a/hw/timer/stm32f2xx_timer.c b/hw/timer/stm32f2xx_timer.c
> new file mode 100644
> index 000..b16789e
> --- /dev/null
> +++ b/hw/timer/stm32f2xx_timer.c
> @@ -0,0 +1,337 @@
> +/*
> + * STM32F2XX Timer
> + *
> + * Copyright (c) 2014 Alistair Francis 
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a 
> copy
> + * of this software and associated documentation files (the "Software"), to 
> deal
> + * in the Software without restriction, including without limitation the 
> rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
> FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +#include "hw/timer/stm32f2xx_timer.h"
> +
> +#ifndef STM_TIMER_ERR_DEBUG
> +#define STM_TIMER_ERR_DEBUG 0
> +#endif
> +
> +#define DB_PRINT_L(lvl, fmt, args...) do { \
> +if (STM_TIMER_ERR_DEBUG >= lvl) { \
> +qemu_log("%s: " fmt, __func__, ## args); \
> +} \
> +} while (0);
> +
> +#define DB_PRINT(fmt, args...) DB_PRINT_L(1, fmt, ## args)
> +
> +static void stm32f2xx_timer_set_alarm(STM32F2XXTimerState *s);
> +
> +static void stm32f2xx_timer_interrupt(void *opaque)
> +{
> +STM32F2XXTimerState *s = opaque;
> +
> +DB_PRINT("Interrupt\n");
> +
> +if (s->tim_dier & TIM_DIER_UIE && s->tim_cr1 & TIM_CR1_CEN) {
> +s->tim_sr |= 1;
> +qemu_irq_pulse(s->irq);
> +stm32f2xx_timer_set_alarm(s);
> +}
> +}
> +
> +static void stm32f2xx_timer_set_alarm(STM32F2XXTimerState *s)
> +{
> +uint32_t ticks;
> +int64_t now, wait_time;
> +
> +DB_PRINT("Alarm set at: 0x%x\n", s->tim_cr1);
> +
> +now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
> +
> +/* When now wraps around, update the tick_offset to represent
> + * the last time the clock was reset
> + */

Can this ever practically happen? now is a 64b ns timer so it should
take a very long time for it to wrap around.
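
To put a number on "a very long time", a quick sketch of the arithmetic, assuming a signed 64-bit nanosecond counter and 365-day years (the helper name is made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NS_PER_SEC   1000000000LL
#define SKETCH_SEC_PER_YEAR (365LL * 24 * 60 * 60)

/* Whole years left before a signed 64-bit nanosecond counter overflows,
 * starting from now_ns. */
static int64_t sketch_years_until_wrap(int64_t now_ns)
{
    assert(now_ns >= 0);
    return (INT64_MAX - now_ns) / (SKETCH_NS_PER_SEC * SKETCH_SEC_PER_YEAR);
}
```

Starting from zero this comes out at 292 years, so the wrap-around branch should never be reached in practice.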

> +if (now < s->tick_offset) {
> +s->tick_offset = s->tick_offset - TIMER_MAX_TICKS;
> +}
> +ticks = s->tim_arr -
> +((muldiv64(s->freq_hz, now, 10ULL) - s->tick_offset) /
> +(s->tim_psc + 1));
> +
> +DB_PRINT("Alarm set in %d ticks\n", ticks);
> +
> +if (ticks == 0) {
> +timer_del(s->timer);
> +   

Re: [Qemu-devel] [PATCH 1/2] zynq_gpio: GPIO model for Zynq SoC

2014-12-30 Thread Colin Leitner
Hi Peter,

thanks for the review! I'll rework the patch ASAP.

> Is it better to just model the GPIO controller as a standalone GPIO,
> and leave the mio vs emio distinction to the SoC/Board level?
> 
> This would mean the bank GPIOs are on the top level entity, and the
> core would then have no EMIO/MIO awareness. This also makes QEMU a
> little less awkward considering there is no sense of MIO and EMIO in
> QEMU to date.

The reason for choosing the MIO/EMIO names was simply for easier mapping
to real hardware where I've usually been confronted with MIO/EMIO.
Changing this to the banks makes sense of course if Xilinx chooses to
reuse that IP again.

>> +/* Outputs float high.  */
>> +/* FIXME: This is board dependent.  */
> 
> How so? Looks pretty generic to me (not sure what needs fixing here).
> Are you saying that the IO width should truncate based on Zynq
> specifics?

This is a fragment from pl061. If we don't explicitly drive an output
line through the direction register, we assume it floats high. We
still have to drive the qemu IRQ line to some state.

I'll write a better comment.
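
For reference, the float-high rule boils down to something like the following sketch. The helper names are hypothetical; the expression itself is the one from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Level seen on each line: the driven value where the direction bit is 1
 * (output), and high where it is 0 (not driven, assumed to float high). */
static uint32_t sketch_line_levels(uint32_t out_data, uint32_t dir)
{
    return (out_data & dir) | ~dir;
}

/* Bits whose level differs from the previous state; only those lines
 * need their qemu IRQ updated. */
static uint32_t sketch_changed_lines(uint32_t old_levels, uint32_t new_levels)
{
    return old_levels ^ new_levels;
}

/* Returns 1 if the given line is high in levels, 0 otherwise. */
static int sketch_line_is_high(uint32_t levels, unsigned line)
{
    assert(line < 32);
    return (levels >> line) & 1u;
}
```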

>> +zynq_gpio_reset(s);
> 
> Don't reset in init fns. You should use a device-reset function ...

Another 1:1 copy from pl061. I'll take the time to read up how the
device model is meant to be implemented correctly.

Thanks again for the review and you'll hear from me shortly.

Regards,
Colin



Re: [Qemu-devel] [Qemu-trivial] [PATCH 1/1] Do not hang on full PTY

2014-12-30 Thread Don Slutz

On 12/29/14 18:41, Peter Maydell wrote:

On 29 December 2014 at 20:27, Don Slutz  wrote:

I was not sure on this being trivial also, but it looked like it could
be to me.  The uses of this FD all looked that they handle non-blocking.

Does g_io_channel_read_chars() definitely return G_IO_STATUS_NORMAL
(and not, say, G_IO_STATUS_AGAIN) for an attempted read on a non-blocking
fd with no data?


The only time I know of to get here in that state is when the other end
disconnects. Normally pty_chr_read will only be called when there is at
least 1 character to read or a state change.


  Otherwise pty_chr_read() is going to call
pty_chr_state(chr, 0) which I think means "the other end has hung up"
and will take the fd out of the main loop's poll set.


Yes, that is correct.  But it only happens when the other end disconnects.
pty_chr_timer() also is involved here, so on a reconnect, the polling is 
re-enabled.
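
As a sketch of the underlying distinction, using plain POSIX read() rather than the GLib call discussed above: an empty non-blocking descriptor fails with EAGAIN, which is different from a zero-length read signalling a hang-up. The helper names are hypothetical, and how GLib maps this onto G_IO_STATUS_* values is not checked here. A PTY master may also report a hang-up differently from the pipe-like case shown.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Put fd into non-blocking mode; returns 0 on success, -1 on failure. */
static int sketch_set_nonblocking(int fd)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFL);
    if (flags < 0) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Returns 1 if a read on fd would block (no data yet), 0 if the read
 * returned a byte or end-of-file, and -1 on any other error.
 * Consumes one byte if one is available. */
static int sketch_read_would_block(int fd)
{
    char c;
    ssize_t n = read(fd, &c, 1);

    if (n >= 0) {
        return 0; /* one byte read, or 0 meaning EOF/hang-up */
    }
    if (errno == EAGAIN || errno == EWOULDBLOCK) {
        return 1; /* non-blocking fd with nothing to read */
    }
    return -1;
}
```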


   -Don Slutz



thanks
-- PMM





[Qemu-devel] [Bug 1385934] Re: USB with passthrougth guest cannot enumerate USB host

2014-12-30 Thread Mike Frysinger
looks like 79ae25af1569a50a0ec799901a1bb280c088f121 (which is in
qemu-2.2.0) makes it work again for my test case.  not sure if the OP
wants to verify as well or just close this out now.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1385934

Title:
  USB with passthrougth guest cannot enumerate USB host

Status in QEMU:
  New

Bug description:
  Following the guide at 
http://www.linux-kvm.org/page/USB_Host_Device_Assigned_to_Guest
  Qemu is launched with qemu-system-x86_64 /dev/vgstripe/kvm_wifi -enable-kvm 
-m 512 -k fr -net nic -net tap,ifname=tap1,script=/bin/ifup.script -kernel 
/usr/src/linux_git/arch/x86_64/boot/bzImage -append root=/dev/sda -usb -device 
usb-host,hostbus=1,hostaddr=6
  The USB device does not show and USB stack seems not working
  On the guest:
  dmesg |grep -i usb
  [1.416966] hub 1-0:1.0: USB hub found
  [1.420431] usbcore: registered new interface driver usb-storage
  [1.445374] usbcore: registered new interface driver usbhid
  [1.446839] usbhid: USB HID core driver
  [1.863226] usb 1-1: new low-speed USB device number 2 using uhci_hcd
  [2.126173] usb 1-1: Invalid ep0 maxpacket: 64
  [2.373161] usb 1-1: new low-speed USB device number 3 using uhci_hcd
  [2.648112] usb 1-1: Invalid ep0 maxpacket: 64
  [2.892404] usb 1-1: new low-speed USB device number 4 using uhci_hcd
  [2.913001] usb 1-1: Invalid ep0 maxpacket: 64
  [3.161367] usb 1-1: new low-speed USB device number 5 using uhci_hcd
  [3.180070] usb 1-1: Invalid ep0 maxpacket: 64
  [3.181633] usb usb1-port1: unable to enumerate USB device
  lsusb
  Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub

  On the host:
  lsusb
  Bus 001 Device 006: ID 0457:0163 Silicon Integrated Systems Corp. SiS163U 
802.11 Wireless LAN Adapter
  qemu-system-x86_64 --version
  QEMU emulator version 2.1.2, Copyright (c) 2003-2008 Fabrice Bellard

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1385934/+subscriptions



[Qemu-devel] [PATCH v3 0/5] QEMU:Xen stubdom vTPM for HVM virtual machine

2014-12-30 Thread Quan Xu
*INTRODUCTION*
The goal of the virtual Trusted Platform Module (vTPM) is to provide TPM
functionality to virtual machines (Fedora, Ubuntu, Red Hat, Windows, etc.). This
allows programs to interact with a TPM in a virtual machine the same way they
interact with a TPM on the physical system. Each virtual machine gets its own
unique, emulated, software TPM. Each major component of vTPM is implemented as
a stubdom, providing secure separation guaranteed by the hypervisor.

The vTPM stubdom is a Xen mini-OS domain that emulates a TPM for the virtual
machine to use. It is a small wrapper around the Berlios TPM emulator. TPM
commands are passed from the mini-os TPM backend driver.

*ARCHITECTURE*
The architecture of stubdom vTPM for HVM virtual machine:

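+--------------------+
| Windows/Linux DomU | ...
|        |  ^        |
|        v  |        |
|  Qemu tpm1.2 Tis   |
|        |  ^        |
|        v  |        |
| XenStubdoms backend|
+--------------------+
         |  ^
         v  |
+--------------------+
|      XenDevOps     |
+--------------------+
         |  ^
         v  |
+--------------------+
|  mini-os/tpmback   |
|        |  ^        |
|        v  |        |
|    vtpm-stubdom    | ...
|        |  ^        |
|        v  |        |
|  mini-os/tpmfront  |
+--------------------+
         |  ^
         v  |
+--------------------+
|  mini-os/tpmback   |
|        |  ^        |
|        v  |        |
|  vtpmmgr-stubdom   |
|        |  ^        |
|        v  |        |
|  mini-os/tpm_tis   |
+--------------------+
         |  ^
         v  |
+--------------------+
|    Hardware TPM    |
+--------------------+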



 * Windows/Linux DomU:
The HVM based guest that wants to use a vTPM. There may be
more than one of these.

 * Qemu tpm1.2 Tis:
Implementation of the tpm1.2 Tis interface for HVM virtual
machines. It is a QEMU emulated device.

 * vTPM xenstubdoms driver:
QEMU vTPM driver. This driver provides vTPM initialization
and sends data and commands to a para-virtualized vTPM
stubdom.

 * XenDevOps:
Registers the Xen stubdom vTPM frontend driver and transfers
requests/responses between the TPM xenstubdoms driver and the Xen vTPM
stubdom, facilitating communication between the Xen vTPM stubdom
and the vTPM xenstubdoms driver.

 * mini-os/tpmback:
Mini-os TPM backend driver. The Linux frontend driver connects
to this backend driver to facilitate communications between the
Linux DomU and its vTPM. This driver is also used by vtpmmgr
stubdom to communicate with vtpm-stubdom.

 * vtpm-stubdom:
A mini-os stub domain that implements a vTPM. There is a
one to one mapping between running vtpm-stubdom instances and
logical vtpms on the system. The vTPM Platform Configuration
Registers (PCRs) are all initialized to zero.

 * mini-os/tpmfront:
Mini-os TPM frontend driver. The vTPM mini-os domain (vtpm-stubdom)
uses this driver to communicate with vtpmmgr-stubdom.
This driver could also be used separately to implement a mini-os
domain that wishes to use a vTPM of its own.

 * vtpmmgr-stubdom:
A mini-os domain that implements the vTPM manager. There is only
one vTPM manager and it should be running during the entire lifetime
of the machine. The vtpmmgr domain securely stores encryption keys for
each of the vtpms and accesses the hardware TPM to get the root of
trust for the entire system.

 * mini-os/tpm_tis:
Mini-os TPM version 1.2 TPM Interface Specification (TIS) driver.
This driver is used by vtpmmgr-stubdom to talk directly to the hardware
TPM. Communication is facilitated by mapping hardware memory pages
into vtpmmgr stubdom.

 * Hardware TPM: The physical TPM 1.2 that is soldered onto the motherboard.

--Changes in v3:
-New xen_frontend.c file
-Adjust the format of command line options
-Move xenbus_switch_state() to xen_frontend.c
-Move xen_stubdom_be() to xenstore_fe_read_be_str()
-Move *_stubdom_*() to *_fe_*()
-Move xen_stubdom_vtpm.c to xen_vtpm_frontend.c
-Read Xen vTPM status via XenStore
-Call vtpm_send() and vtpm_recv() directly.

--Changes in v2:
-add xen_fe_register() that handles any Xen PV frontend registration
-remove a private structure 'QEMUBH'
-change version number to 2.3 in qapi-schema.json
-move hw/xen/xen_stubdom_vtpm.c to hw/tpm/xen_stubdom_vtpm.c

Quan Xu (5):
  Qemu-Xen-vTPM: Support for Xen stubdom vTPM command line options
  Qemu-Xen-vTPM: Xen frontend driver i

[Qemu-devel] [v3 5/5] Qemu-Xen-vTPM: QEMU machine class is initialized before tpm_init()

2014-12-30 Thread Quan Xu
Make sure the QEMU machine class is initialized and QEMU has registered
the Xen stubdom vTPM driver before tpm_init() is called.

Signed-off-by: Quan Xu 
---
 vl.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/vl.c b/vl.c
index f6b3546..dd437e1 100644
--- a/vl.c
+++ b/vl.c
@@ -4114,12 +4114,6 @@ int main(int argc, char **argv, char **envp)
 exit(1);
 }
 
-#ifdef CONFIG_TPM
-if (tpm_init() < 0) {
-exit(1);
-}
-#endif
-
 /* init the bluetooth world */
 if (foreach_device_config(DEV_BT, bt_parse))
 exit(1);
@@ -4225,6 +4219,16 @@ int main(int argc, char **argv, char **envp)
 exit(1);
 }
 
+/* For compatibility with the Xen stubdom vTPM driver, make
+ * sure the QEMU machine class is initialized and QEMU has
+ * registered the Xen stubdom vTPM driver before tpm_init().
+ */
+#ifdef CONFIG_TPM
+if (tpm_init() < 0) {
+exit(1);
+}
+#endif
+
 /* init generic devices */
 if (qemu_opts_foreach(qemu_find_opts("device"), device_init_func, NULL, 1) != 0)
 exit(1);
-- 
1.8.3.2




[Qemu-devel] [v3 1/5] Qemu-Xen-vTPM: Support for Xen stubdom vTPM command line options

2014-12-30 Thread Quan Xu
Signed-off-by: Quan Xu 
---
 configure| 14 ++
 hmp.c|  7 +++
 qapi-schema.json | 19 ---
 qemu-options.hx  | 13 +++--
 tpm.c|  7 ++-
 5 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/configure b/configure
index a9e4d49..d63b8a1 100755
--- a/configure
+++ b/configure
@@ -2942,6 +2942,16 @@ else
 fi
 
 ##
+# TPM xenstubdoms is only on x86 Linux
+
+if test "$targetos" = Linux && test "$cpu" = i386 -o "$cpu" = x86_64 && \
+   test "$xen" = "yes"; then
+  tpm_xenstubdoms=$tpm
+else
+  tpm_xenstubdoms=no
+fi
+
+##
 # attr probe
 
 if test "$attr" != "no" ; then
@@ -4333,6 +4343,7 @@ echo "gcov  $gcov_tool"
 echo "gcov enabled  $gcov"
 echo "TPM support   $tpm"
 echo "libssh2 support   $libssh2"
+echo "TPM xenstubdoms   $tpm_xenstubdoms"
 echo "TPM passthrough   $tpm_passthrough"
 echo "QOM debugging $qom_cast_debug"
 echo "vhdx  $vhdx"
@@ -4810,6 +4821,9 @@ if test "$tpm" = "yes"; then
   if test "$tpm_passthrough" = "yes"; then
 echo "CONFIG_TPM_PASSTHROUGH=y" >> $config_host_mak
   fi
+  if test "$tpm_xenstubdoms" = "yes"; then
+echo "CONFIG_TPM_XENSTUBDOMS=y" >> $config_host_mak
+  fi
 fi
 
 echo "TRACE_BACKENDS=$trace_backends" >> $config_host_mak
diff --git a/hmp.c b/hmp.c
index 63d7686..1df3ec7 100644
--- a/hmp.c
+++ b/hmp.c
@@ -689,6 +689,7 @@ void hmp_info_tpm(Monitor *mon, const QDict *qdict)
 Error *err = NULL;
 unsigned int c = 0;
 TPMPassthroughOptions *tpo;
+TPMXenstubdomsOptions *txo;
 
 info_list = qmp_query_tpm(&err);
 if (err) {
@@ -718,6 +719,12 @@ void hmp_info_tpm(Monitor *mon, const QDict *qdict)
tpo->has_cancel_path ? ",cancel-path=" : "",
tpo->has_cancel_path ? tpo->cancel_path : "");
 break;
+case TPM_TYPE_OPTIONS_KIND_XENSTUBDOMS:
+txo = ti->options->xenstubdoms;
+if (!txo) {
+monitor_printf(mon, "null TPMXenstubdomsOptions error!\n");
+}
+break;
 case TPM_TYPE_OPTIONS_KIND_MAX:
 break;
 }
diff --git a/qapi-schema.json b/qapi-schema.json
index 24379ab..9745c2b 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2854,9 +2854,10 @@
 #
 # @passthrough: TPM passthrough type
 #
-# Since: 1.5
+# @xenstubdoms: TPM xenstubdoms type (since 2.3)## Since 1.5
+#
 ##
-{ 'enum': 'TpmType', 'data': [ 'passthrough' ] }
+{ 'enum': 'TpmType', 'data': [ 'passthrough', 'xenstubdoms' ] }
 
 ##
 # @query-tpm-types:
@@ -2884,6 +2885,16 @@
 { 'type': 'TPMPassthroughOptions', 'data': { '*path' : 'str',
  '*cancel-path' : 'str'} }
 
+# @TPMXenstubdomsOptions:
+#
+# Information about the TPM xenstubdoms type
+#
+# Since: 2.3
+##
+{ 'type': 'TPMXenstubdomsOptions', 'data': {  } }
+#
+##
+
 ##
 # @TpmTypeOptions:
 #
@@ -2894,7 +2905,9 @@
 # Since: 1.5
 ##
 { 'union': 'TpmTypeOptions',
-   'data': { 'passthrough' : 'TPMPassthroughOptions' } }
+  'data': { 'passthrough' : 'TPMPassthroughOptions',
+'xenstubdoms' : 'TPMXenstubdomsOptions' } }
+##
 
 ##
 # @TpmInfo:
diff --git a/qemu-options.hx b/qemu-options.hx
index 1e7d5b8..fd73f57 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -2485,7 +2485,8 @@ DEF("tpmdev", HAS_ARG, QEMU_OPTION_tpmdev, \
 "-tpmdev passthrough,id=id[,path=path][,cancel-path=path]\n"
 "use path to provide path to a character device; default is /dev/tpm0\n"
 "use cancel-path to provide path to TPM's cancel sysfs entry; if\n"
-"not provided it will be searched for in /sys/class/misc/tpm?/device\n",
+"not provided it will be searched for in /sys/class/misc/tpm?/device\n"
+"-tpmdev xenstubdoms,id=id\n",
 QEMU_ARCH_ALL)
 STEXI
 
@@ -2495,7 +2496,8 @@ The general form of a TPM device option is:
 @item -tpmdev @var{backend} ,id=@var{id} [,@var{options}]
 @findex -tpmdev
 Backend type must be:
-@option{passthrough}.
+@option{passthrough}, or
+@option{xenstubdoms}.
 
 The specific backend type will determine the applicable options.
 The @code{-tpmdev} option creates the TPM backend and requires a
@@ -2545,6 +2547,13 @@ To create a passthrough TPM use the following two options:
 Note that the @code{-tpmdev} id is @code{tpm0} and is referenced by
 @code{tpmdev=tpm0} in the device option.
 
+To create a xenstubdoms TPM use the following two options:
+@example
+-tpmdev xenstubdoms,id=tpm0 -device tpm-tis,tpmdev=tpm0
+@end example
+Note that the @code{-tpmdev} id is @code{tpm0} and is referenced by
+@code{tpmdev=tpm0} in the device option.
+
 @end table
 
 ETEXI
diff --git a/tpm.c b/tpm.c
index c371023..ee9acb8 100644
--- a/tpm.c
+++ b/tpm.c
@@ -25,7 +25,7 @@ static QLIST_HEAD(, TPMBackend) tpm_backends =
 
 
 #define TPM_MAX_MODELS  1
-#define TPM_M

[Qemu-devel] [v3 3/5] Qemu-Xen-vTPM: Register Xen stubdom vTPM frontend driver

2014-12-30 Thread Quan Xu
This driver transfers requests and responses between the TPM xenstubdoms
driver and the Xen vTPM stubdom, and facilitates communication between
the Xen vTPM stubdom domain and the vTPM xenstubdoms driver. It is the glue
between the TPM xenstubdoms driver and the Xen stubdom vTPM domain that
provides the actual TPM functionality.

(Xen) The Xen backend driver should be running before this frontend starts,
and XenStore should be initialized as follows for communication.

[XenStore]
 ..
  FE.DOMAIN.ID
   device = ""
    vtpm = ""
     0 = ""
      backend = "/local/domain/{BE.DOMAIN.ID}/backend/vtpm/{FE.DOMAIN.ID}/0"
      backend-id = "BE.DOMAIN.ID"
      state = "1"
      handle = "0"
 ..
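
As a rough illustration of the layout above, the following Python sketch
builds the XenStore keys that would need to exist for the frontend before it
starts. The helper names are hypothetical, and the /local/domain/<domid>
prefix for the frontend side is an assumption based on the usual Xen domain
path convention. It only formats strings and does not call any XenStore API.

```python
def frontend_dir(fe_domid, devtype="vtpm", dev=0):
    """XenStore directory of one frontend device (assumed domain path)."""
    return "/local/domain/%d/device/%s/%d" % (fe_domid, devtype, dev)


def backend_dir(be_domid, fe_domid, devtype="vtpm", dev=0):
    """XenStore directory of the matching backend, as in the 'backend' key."""
    return "/local/domain/%d/backend/%s/%d/%d" % (be_domid, devtype, fe_domid, dev)


def initial_frontend_keys(fe_domid, be_domid):
    """Keys and values expected before the frontend starts (device 0 only)."""
    base = frontend_dir(fe_domid)
    return {
        base + "/backend": backend_dir(be_domid, fe_domid),
        base + "/backend-id": str(be_domid),
        base + "/state": "1",
        base + "/handle": "0",
    }


if __name__ == "__main__":
    # Example domain IDs; any non-negative integers work.
    keys = initial_frontend_keys(fe_domid=5, be_domid=2)
    for key in sorted(keys):
        print('%s = "%s"' % (key, keys[key]))
```

The frontend domain ID appears twice in the backend path's neighbourhood
(once as the owner of the device directory, once inside the backend path),
which is why both IDs are passed explicitly.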

(QEMU) xen_vtpmdev_ops is initialized with the following process:
  xen_hvm_init()
    [...]
    -->xen_fe_register("vtpm", ...)
      -->xenstore_fe_scan()
        -->xen_fe_get_xendev()
          --> XenDevOps.alloc()
        -->xen_fe_check()
          --> XenDevOps.init()
          --> XenDevOps.initialise()
          --> XenDevOps.connected()
        -->xs_watch()
    [...]

--Changes in v3:
-Move xen_stubdom_vtpm.c to xen_vtpm_frontend.c
-Read Xen vTPM status via XenStore

Signed-off-by: Quan Xu 
---
 hw/tpm/Makefile.objs |   1 +
 hw/tpm/xen_vtpm_frontend.c   | 264 +++
 hw/xen/xen_backend.c |  34 ++
 include/hw/xen/xen_backend.h |   9 +-
 include/hw/xen/xen_common.h  |   6 +
 xen-hvm.c|  16 +++
 6 files changed, 328 insertions(+), 2 deletions(-)
 create mode 100644 hw/tpm/xen_vtpm_frontend.c

diff --git a/hw/tpm/Makefile.objs b/hw/tpm/Makefile.objs
index 99f5983..57919fa 100644
--- a/hw/tpm/Makefile.objs
+++ b/hw/tpm/Makefile.objs
@@ -1,2 +1,3 @@
 common-obj-$(CONFIG_TPM_TIS) += tpm_tis.o
 common-obj-$(CONFIG_TPM_PASSTHROUGH) += tpm_passthrough.o
+common-obj-$(CONFIG_TPM_XENSTUBDOMS) += xen_vtpm_frontend.o
diff --git a/hw/tpm/xen_vtpm_frontend.c b/hw/tpm/xen_vtpm_frontend.c
new file mode 100644
index 000..00cc888
--- /dev/null
+++ b/hw/tpm/xen_vtpm_frontend.c
@@ -0,0 +1,264 @@
+/*
+ * Connect to Xen vTPM stubdom domain
+ *
+ *  Copyright (c) 2014 Intel Corporation
+ *  Authors:
+ *Quan Xu 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "hw/hw.h"
+#include "block/aio.h"
+#include "hw/xen/xen_backend.h"
+
+enum tpmif_state {
+TPMIF_STATE_IDLE,/* no contents / vTPM idle / cancel complete */
+TPMIF_STATE_SUBMIT,  /* request ready / vTPM working */
+TPMIF_STATE_FINISH,  /* response ready / vTPM idle */
+TPMIF_STATE_CANCEL,  /* cancel requested / vTPM working */
+};
+
+static AioContext *vtpm_aio_ctx;
+
+enum status_bits {
+VTPM_STATUS_RUNNING  = 0x1,
+VTPM_STATUS_IDLE = 0x2,
+VTPM_STATUS_RESULT   = 0x4,
+VTPM_STATUS_CANCELED = 0x8,
+};
+
+struct tpmif_shared_page {
+uint32_t length; /* request/response length in bytes */
+
+uint8_t  state;   /* enum tpmif_state */
+uint8_t  locality;/* for the current request */
+uint8_t  pad; /* should be zero */
+
+uint8_t  nr_extra_pages;  /* extra pages for long packets; may be zero */
+uint32_t extra_pages[0]; /* grant IDs; length is actually nr_extra_pages */
+};
+
+struct xen_vtpm_dev {
+struct XenDevice xendev;  /* must be first */
+struct   tpmif_shared_page *shr;
+xc_gntshr*xen_xcs;
+int  ring_ref;
+int  bedomid;
+QEMUBH   *sr_bh;
+};
+
+static uint8_t vtpm_status(struct xen_vtpm_dev *vtpmdev)
+{
+switch (vtpmdev->shr->state) {
+case TPMIF_STATE_IDLE:
+case TPMIF_STATE_FINISH:
+return VTPM_STATUS_IDLE;
+case TPMIF_STATE_SUBMIT:
+case TPMIF_STATE_CANCEL:
+return VTPM_STATUS_RUNNING;
+default:
+return 0;
+}
+}
+
+static bool vtpm_aio_wait(AioContext *ctx)
+{
+return aio_poll(ctx, true);
+}
+
+static void sr_bh_handler(void *opaque)
+{
+}
+
+int vtpm_recv(struct XenDevice *xendev, uint8_t* buf, size_t *count)
+{
+struct xen_vtpm_dev *vtpmdev = container_of(xendev, struct xen_vtpm_dev,
+xendev);
+struct tpmif_shared_page *shr = vtpmdev->shr;

[Qemu-devel] [v3 2/5] Qemu-Xen-vTPM: Xen frontend driver infrastructure

2014-12-30 Thread Quan Xu
This patch adds infrastructure for Xen frontend drivers living in QEMU,
so drivers don't need to implement common functionality on their own. It is
mostly xenbus management: functions to access XenStore, setting up
XenStore watches, callbacks on device discovery and state changes, and
handling of the event channel between the virtual machines.

Call xen_fe_register() to register a XenDevOps, and make sure its
flags include DEVOPS_FLAG_FE, the flag bit that marks the XenDevOps
as a Xen frontend.

--Changes in v3:
-New xen_frontend.c file
-Move xenbus_switch_state() to xen_frontend.c
-Move xen_stubdom_be() to xenstore_fe_read_be_str()
-Move *_stubdom_*() to *_fe_*()

Signed-off-by: Quan Xu 
---
 hw/xen/Makefile.objs |   2 +-
 hw/xen/xen_backend.c |  11 +-
 hw/xen/xen_frontend.c| 323 +++
 include/hw/xen/xen_backend.h |  14 ++
 4 files changed, 348 insertions(+), 2 deletions(-)
 create mode 100644 hw/xen/xen_frontend.c

diff --git a/hw/xen/Makefile.objs b/hw/xen/Makefile.objs
index a0ca0aa..b0bb065 100644
--- a/hw/xen/Makefile.objs
+++ b/hw/xen/Makefile.objs
@@ -1,5 +1,5 @@
 # xen backend driver support
-common-obj-$(CONFIG_XEN_BACKEND) += xen_backend.o xen_devconfig.o
+common-obj-$(CONFIG_XEN_BACKEND) += xen_backend.o xen_devconfig.o xen_frontend.o
 
 obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o
 obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pt.o xen_pt_config_init.o xen_pt_msi.o
diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c
index b2cb22b..ad6e324 100644
--- a/hw/xen/xen_backend.c
+++ b/hw/xen/xen_backend.c
@@ -275,7 +275,7 @@ static struct XenDevice *xen_be_get_xendev(const char *type, int dom, int dev,
 /*
  * release xen backend device.
  */
-static struct XenDevice *xen_be_del_xendev(int dom, int dev)
+struct XenDevice *xen_be_del_xendev(int dom, int dev)
 {
 struct XenDevice *xendev, *xnext;
 
@@ -681,6 +681,10 @@ static void xenstore_update(void *unused)
 if (sscanf(vec[XS_WATCH_TOKEN], "fe:%" PRIxPTR, &ptr) == 1) {
 xenstore_update_fe(vec[XS_WATCH_PATH], (void*)ptr);
 }
+if (sscanf(vec[XS_WATCH_TOKEN], "stub:%" PRIxPTR ":%d:%" PRIxPTR,
+   &type, &dom, &ops) == 3) {
+xenstore_fe_update(vec[XS_WATCH_PATH], (void *)type, dom, (void *)ops);
+}
 
 cleanup:
 free(vec);
@@ -808,3 +812,8 @@ void xen_be_printf(struct XenDevice *xendev, int msg_level, const char *fmt, ...
 }
 qemu_log_flush();
 }
+
+void xen_qtail_insert_xendev(struct XenDevice *xendev)
+{
+QTAILQ_INSERT_TAIL(&xendevs, xendev, next);
+}
diff --git a/hw/xen/xen_frontend.c b/hw/xen/xen_frontend.c
new file mode 100644
index 000..07ffc5c
--- /dev/null
+++ b/hw/xen/xen_frontend.c
@@ -0,0 +1,323 @@
+/*
+ * Xen frontend driver infrastructure
+ *
+ *  Copyright (c) 2014 Intel Corporation
+ *  Authors:
+ *Quan Xu 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "hw/hw.h"
+#include "sysemu/char.h"
+#include "qemu/log.h"
+#include "hw/xen/xen_backend.h"
+#include 
+
+/* private */
+static int debug;
+
+/* - */
+/*Get backend with fe type|domid, try to write the backend-id when
+ *create virtual machine.
+ *
+ *[XenStore]
+ *
+ *Dom.ID = ""
+ *  device
+ *vtpm = ""
+ *  0 = ""
+ *backend-id = "ID"
+ *[..]
+ */
+
+char *xenstore_fe_read_be_str(const char *type, int dom, int dev)
+{
+char *val, *domu;
+char path[XEN_BUFSIZE];
+unsigned int len, ival;
+
+/*fe path*/
+domu = xs_get_domain_path(xenstore, dom);
+snprintf(path, sizeof(path), "%s/device/%s/%d/backend-id",
+ domu, type, dev);
+g_free(domu);
+
+val = xs_read(xenstore, 0, path, &len);
+if (!val || 1 != sscanf(val, "%d", &ival)) {
+g_free(val);
+return NULL;
+}
+g_free(val);
+
+/*be path*/
+domu = xs_get_domain_path(xenstore, ival);
+
+return domu;
+}
+
+/*make sure, initialize the 'xendev->fe' in xendev->ops->init() or
+ * xendev->ops->initialize()
+ */
+int xenbus_switch_state(struct XenDevice *xendev, enum xenbus_state xbus)
+{
+xs_transaction_t xbt = XBT_NULL;
+
+if (xendev->fe_state == xbus) {
+return 0

[Qemu-devel] [v3 4/5] Qemu-Xen-vTPM: Qemu vTPM xenstubdoms backend.

2014-12-30 Thread Quan Xu
This patch provides the glue between TPM_TIS (the QEMU frontend) and the Xen
stubdom vTPM domain that provides the actual TPM functionality. It
sends data and TPM commands through xen_vtpm_frontend. It is similar to
two other vTPM backends:
  *vTPM passthrough backend, since QEMU 1.5.
  *vTPM libtpms-based backend.

Some details:
This part of the patch provides support for spawning a thread
that interacts with the stubdom vTPM domain through xen_vtpm_frontend.
It expects a signal from the frontend to wake up and pick up the TPM
command that is supposed to be processed, and delivers the response
packet using a callback function provided by the frontend.

The backend connects itself to the frontend by filling out an interface
structure with pointers to the functions implementing support for various
operations.

(QEMU) The vTPM XenStubdoms backend is initialized by QEMU command line options,
  "-tpmdev xenstubdoms,id=xenvtpm0 -device tpm-tis,tpmdev=xenvtpm0"

--Changes in v3:
-Call vtpm_send() and vtpm_recv() directly.

Signed-off-by: Quan Xu 
---
 hw/tpm/Makefile.objs |   2 +-
 hw/tpm/tpm_xenstubdoms.c | 245 +++
 2 files changed, 246 insertions(+), 1 deletion(-)
 create mode 100644 hw/tpm/tpm_xenstubdoms.c

diff --git a/hw/tpm/Makefile.objs b/hw/tpm/Makefile.objs
index 57919fa..190e776 100644
--- a/hw/tpm/Makefile.objs
+++ b/hw/tpm/Makefile.objs
@@ -1,3 +1,3 @@
 common-obj-$(CONFIG_TPM_TIS) += tpm_tis.o
 common-obj-$(CONFIG_TPM_PASSTHROUGH) += tpm_passthrough.o
-common-obj-$(CONFIG_TPM_XENSTUBDOMS) += xen_vtpm_frontend.o
+common-obj-$(CONFIG_TPM_XENSTUBDOMS) += tpm_xenstubdoms.o xen_vtpm_frontend.o
diff --git a/hw/tpm/tpm_xenstubdoms.c b/hw/tpm/tpm_xenstubdoms.c
new file mode 100644
index 000..98ea496
--- /dev/null
+++ b/hw/tpm/tpm_xenstubdoms.c
@@ -0,0 +1,245 @@
+/*
+ * Xen Stubdom vTPM driver
+ *
+ *  Copyright (c) 2014 Intel Corporation
+ *  Authors:
+ *Quan Xu 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see 
+ */
+
+#include 
+#include "qemu-common.h"
+#include "qapi/error.h"
+#include "qemu/sockets.h"
+#include "qemu/log.h"
+#include "sysemu/tpm_backend.h"
+#include "tpm_int.h"
+#include "hw/hw.h"
+#include "hw/i386/pc.h"
+#include "hw/xen/xen_backend.h"
+#include "sysemu/tpm_backend_int.h"
+#include "tpm_tis.h"
+
+#ifdef DEBUG_TPM
+#define DPRINTF(fmt, ...) \
+do { fprintf(stderr, fmt, ## __VA_ARGS__); } while (0)
+#else
+#define DPRINTF(fmt, ...) \
+do { } while (0)
+#endif
+
+#define TYPE_TPM_XENSTUBDOMS "tpm-xenstubdoms"
+#define TPM_XENSTUBDOMS(obj) \
+OBJECT_CHECK(TPMXenstubdomsState, (obj), TYPE_TPM_XENSTUBDOMS)
+
+static const TPMDriverOps tpm_xenstubdoms_driver;
+
+/* data structures */
+typedef struct TPMXenstubdomsThreadParams {
+TPMState *tpm_state;
+TPMRecvDataCB *recv_data_callback;
+TPMBackend *tb;
+} TPMXenstubdomsThreadParams;
+
+struct TPMXenstubdomsState {
+TPMBackend parent;
+TPMBackendThread tbt;
+TPMXenstubdomsThreadParams tpm_thread_params;
+bool had_startup_error;
+};
+
+typedef struct TPMXenstubdomsState TPMXenstubdomsState;
+
+/* functions */
+
+static void tpm_xenstubdoms_cancel_cmd(TPMBackend *tb);
+
+static int tpm_xenstubdoms_unix_transfer(const TPMLocality *locty_data)
+{
+size_t rlen;
+struct XenDevice *xendev;
+
+xendev = xen_be_find_xendev("vtpm", xen_domid, 0);
+if (xendev == NULL) {
+xen_be_printf(xendev, 0, "Cannot find vtpm device\n");
+return -1;
+}
+vtpm_send(xendev, locty_data->w_buffer.buffer, locty_data->w_offset);
+vtpm_recv(xendev, locty_data->r_buffer.buffer, &rlen);
+return 0;
+}
+
+static void tpm_xenstubdoms_worker_thread(gpointer data,
+  gpointer user_data)
+{
+TPMXenstubdomsThreadParams *thr_parms = user_data;
+TPMBackendCmd cmd = (TPMBackendCmd)data;
+
+switch (cmd) {
+case TPM_BACKEND_CMD_PROCESS_CMD:
+/* here need a the cmd process function */
+tpm_xenstubdoms_unix_transfer(thr_parms->tpm_state->locty_data);
+thr_parms->recv_data_callback(thr_parms->tpm_state,
+  thr_parms->tpm_state->locty_number);
+break;
+case TPM_BACKEND_CMD_INIT:
+case TPM_BACKEND_CMD_END:
+case TPM_BACKEND_CMD_TPM_RESET:
+/* nothing to do */
+break;
+}
+}
+
+/*
+ *  * Start t

[Qemu-devel] about the re-attach more than one pci devices failed

2014-12-30 Thread Li, Liang Z
Hi Paolo,

We have found a bug in all xen-4.4 and xen-4.5-rcx versions. The bug
can be reproduced with the following steps:

Use the 'xl pci-attach $DomU $BDF' command to attach more than
one PCI device to the guest, then detach the devices with
'xl pci-detach $DomU $BDF'. After that, re-attach these PCI
devices; an error message like the following is reported:

libxl: error: libxl_qmp.c:287:qmp_handle_error_response: receive
an error message from QMP server: Duplicate ID 'pci-pt-03_10.1'
for device.

By debugging, I found that xen_pt_region_add and xen_pt_region_del
are not called the same number of times, and this may cause the
XenPCIPassthroughState and its related QemuOpts object not to be
released properly.
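
To make the reasoning concrete, here is a toy Python model (not QEMU's
actual object model) of a reference-counted device: every region_add takes a
reference and every region_del drops one, so a missing region_del keeps the
device alive and its ID registered. All class and function names are
hypothetical.

```python
class ToyDevice(object):
    """Toy stand-in for a reference-counted passthrough device."""

    def __init__(self, dev_id, registry):
        self.dev_id = dev_id
        self.registry = registry
        self.refs = 1  # reference held by the device itself
        registry.add(dev_id)

    def ref(self):
        self.refs += 1

    def unref(self):
        self.refs -= 1
        if self.refs == 0:
            # Last reference gone: the ID becomes free for reuse.
            self.registry.discard(self.dev_id)


def attach(dev_id, registry):
    """Create a device, refusing an ID that is still registered."""
    if dev_id in registry:
        raise ValueError("Duplicate ID '%s' for device" % dev_id)
    return ToyDevice(dev_id, registry)


def run_and_detach(dev, adds, dels):
    """Simulate listener add/del callbacks, then drop the device's own ref."""
    for _ in range(adds):
        dev.ref()
    for _ in range(dels):
        dev.unref()
    dev.unref()


if __name__ == "__main__":
    ids = set()
    dev = attach("pci-pt-03_10.1", ids)
    run_and_detach(dev, adds=2, dels=1)  # one del callback is missing
    try:
        attach("pci-pt-03_10.1", ids)
    except ValueError as err:
        print(err)
```

With balanced add/del counts the second attach succeeds; with one del missing
the ID stays registered and the re-attach fails, which matches the symptom
reported by libxl.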

I don't know how this happened, but the following patch can fix this bug.

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index be4220b..a418c53 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -607,7 +607,6 @@ static void xen_pt_region_add(MemoryListener *l, MemoryRegionSection *sec)
 XenPCIPassthroughState *s = container_of(l, XenPCIPassthroughState,
  memory_listener);
 
-memory_region_ref(sec->mr);
 xen_pt_region_update(s, sec, true);
 }
 
@@ -617,7 +616,6 @@ static void xen_pt_region_del(MemoryListener *l, MemoryRegionSection *sec)
  memory_listener);
 
 xen_pt_region_update(s, sec, false);
-memory_region_unref(sec->mr);
 }
 
 static void xen_pt_io_region_add(MemoryListener *l, MemoryRegionSection *sec)
@@ -625,7 +623,6 @@ static void xen_pt_io_region_add(MemoryListener *l, MemoryRegionSection *sec)
 XenPCIPassthroughState *s = container_of(l, XenPCIPassthroughState,
  io_listener);
 
-memory_region_ref(sec->mr);
 xen_pt_region_update(s, sec, true);
 }
 
@@ -635,7 +632,6 @@ static void xen_pt_io_region_del(MemoryListener *l, MemoryRegionSection *sec)
  io_listener);
 
 xen_pt_region_update(s, sec, false);
-memory_region_unref(sec->mr);
 }
 
 static const MemoryListener xen_pt_memory_listener = {


After reading other parts of the source code, I don't think the above patch
is a good fix.
I have verified that the following patch works too:

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index c1bf357..f2893b2 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -736,7 +736,7 @@ static int xen_pt_initfn(PCIDevice *d)
 }
 
 out:
-memory_listener_register(&s->memory_listener, &address_space_memory);
+memory_listener_register(&s->memory_listener, &s->dev.bus_master_as);
 memory_listener_register(&s->io_listener, &address_space_io);
 XEN_PT_LOG(d,
"Real physical device %02x:%02x.%d registered successfully!\n",

By debugging, I found that when using 'address_space_memory',
xen_pt_region_del is not called when the memory region is not
'xen-pci-pt-*'; when using 's->dev.bus_master_as', there is no such issue.

I am not sure whether using 's->dev.bus_master_as' instead of
'address_space_memory' is right. Could you give me some suggestions?

Liang






[Qemu-devel] [PATCH v3 3/7] qemu-iotests: Replace "/bin/true" with "true"

2014-12-30 Thread Fam Zheng
The former is not portable because on Mac OSX it is /usr/bin/true.

Signed-off-by: Fam Zheng 
---
 tests/qemu-iotests/common.config | 2 +-
 tests/qemu-iotests/common.filter | 2 +-
 tests/qemu-iotests/common.rc | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/common.config b/tests/qemu-iotests/common.config
index 91a5ef6..a1973ad 100644
--- a/tests/qemu-iotests/common.config
+++ b/tests/qemu-iotests/common.config
@@ -155,4 +155,4 @@ _readlink()
 }
 
 # make sure this script returns success
-/bin/true
+true
diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
index 6c14590..a2cb9fb 100644
--- a/tests/qemu-iotests/common.filter
+++ b/tests/qemu-iotests/common.filter
@@ -223,4 +223,4 @@ _filter_qemu_img_map()
 }
 
 # make sure this script returns success
-/bin/true
+true
diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index 3b14053..aa093d9 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -490,4 +490,4 @@ _die()
 }
 
 # make sure this script returns success
-/bin/true
+true
-- 
1.9.3




[Qemu-devel] [PATCH v3 0/7] tests: Add check-block to "make check"

2014-12-30 Thread Fam Zheng
qemu-iotests contains useful tests that have a nice coverage of block layer
code. Adding check-block (which calls tests/qemu-iotests-quick.sh) to "make
check" is good for developers' self-testing.

v2: Take care of other platforms, basically by keeping them unchanged, and only
add "make check-block" to "make check" on Linux. (Peter)

Fam Zheng (7):
  .gitignore: Ignore generated "common.env"
  qemu-iotests: Remove 091 from quick group
  qemu-iotests: Replace "/bin/true" with "true"
  qemu-iotests: Add "_supported_os Linux" to 058
  qemu-iotests: Speed up make check-block
  tests/Makefile: Add check-block to make check on Linux
  qemu-iotests: Add supported os parameter for python tests

 .gitignore   | 1 +
 tests/Makefile   | 3 +++
 tests/qemu-iotests-quick.sh  | 2 +-
 tests/qemu-iotests/058   | 1 +
 tests/qemu-iotests/check | 1 +
 tests/qemu-iotests/common.config | 2 +-
 tests/qemu-iotests/common.filter | 2 +-
 tests/qemu-iotests/common.rc | 2 +-
 tests/qemu-iotests/group | 2 +-
 tests/qemu-iotests/iotests.py| 5 -
 10 files changed, 15 insertions(+), 6 deletions(-)

-- 
1.9.3




[Qemu-devel] [PATCH v3 1/7] .gitignore: Ignore generated "common.env"

2014-12-30 Thread Fam Zheng
Signed-off-by: Fam Zheng 
---
 .gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.gitignore b/.gitignore
index e32a584..090f974 100644
--- a/.gitignore
+++ b/.gitignore
@@ -109,3 +109,4 @@ cscope.*
 tags
 TAGS
 *~
+/tests/qemu-iotests/common.env
-- 
1.9.3




[Qemu-devel] [PATCH v3 6/7] tests/Makefile: Add check-block to make check on Linux

2014-12-30 Thread Fam Zheng
"make check-block" does nothing on other platforms, but still takes some
time to enumerate all the tests. So let's only add it for Linux for now.

Signed-off-by: Fam Zheng 
---
 tests/Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tests/Makefile b/tests/Makefile
index e4ddb6a..0968121 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -467,6 +467,9 @@ check-qtest: $(patsubst %,check-qtest-%, $(QTEST_TARGETS))
 check-unit: $(patsubst %,check-%, $(check-unit-y))
 check-block: $(patsubst %,check-%, $(check-block-y))
 check: check-qapi-schema check-unit check-qtest
+ifeq ($(shell uname -s),Linux)
+   check: check-block
+endif
 check-clean:
$(MAKE) -C tests/tcg clean
rm -rf $(check-unit-y) tests/*.o $(QEMU_IOTESTS_HELPERS-y)
-- 
1.9.3




[Qemu-devel] [PATCH v3 2/7] qemu-iotests: Remove 091 from quick group

2014-12-30 Thread Fam Zheng
Remove it so that the quick group can be run on tmpfs.

Signed-off-by: Fam Zheng 
---
 tests/qemu-iotests/group | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index a4742c6..08099b9 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -97,7 +97,7 @@
 088 rw auto quick
 089 rw auto quick
 090 rw auto quick
-091 rw auto quick
+091 rw auto
 092 rw auto quick
 095 rw auto quick
 097 rw auto backing
-- 
1.9.3




[Qemu-devel] [PATCH v3 4/7] qemu-iotests: Add "_supported_os Linux" to 058

2014-12-30 Thread Fam Zheng
Other cases have this, and this test is not portable either. Since we want
to add "make check-block" to "make check", it shouldn't fail on Mac OS
X.

Reported-by: Peter Maydell 
Signed-off-by: Fam Zheng 
---
 tests/qemu-iotests/058 | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/qemu-iotests/058 b/tests/qemu-iotests/058
index 2d5ca85..a60b34b 100755
--- a/tests/qemu-iotests/058
+++ b/tests/qemu-iotests/058
@@ -87,6 +87,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 
 _supported_fmt qcow2
 _supported_proto file
+_supported_os Linux
 _require_command QEMU_NBD
 
 # Use -f raw instead of -f $IMGFMT for the NBD connection
-- 
1.9.3




[Qemu-devel] [PATCH v3 5/7] qemu-iotests: Speed up make check-block

2014-12-30 Thread Fam Zheng
Using /tmp, which is usually mounted as tmpfs, the quick group can be
quicker.

On my laptop (Lenovo T430s with Fedora 20), this reduces the time from
50s to 30s.

Signed-off-by: Fam Zheng 
Reviewed-by: Max Reitz 
---
 tests/qemu-iotests-quick.sh | 2 +-
 tests/qemu-iotests/check| 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests-quick.sh b/tests/qemu-iotests-quick.sh
index 12af731..0e554bb 100755
--- a/tests/qemu-iotests-quick.sh
+++ b/tests/qemu-iotests-quick.sh
@@ -3,6 +3,6 @@
 cd tests/qemu-iotests
 
 ret=0
-./check -T -qcow2 -g quick || ret=1
+TEST_DIR=${TEST_DIR:-/tmp/qemu-iotests-quick-$$} ./check -T -qcow2 -g quick || ret=1
 
 exit $ret
diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index 8ca4011..baeae80 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -238,6 +238,7 @@ QEMU_NBD  -- $QEMU_NBD
 IMGFMT        -- $FULL_IMGFMT_DETAILS
 IMGPROTO      -- $FULL_IMGPROTO_DETAILS
 PLATFORM      -- $FULL_HOST_DETAILS
+TEST_DIR      -- $TEST_DIR
 SOCKET_SCM_HELPER -- $SOCKET_SCM_HELPER
 
 EOF
-- 
1.9.3
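
For readers less familiar with the shell expansion in the patch above:
${TEST_DIR:-default} keeps a TEST_DIR supplied by the caller and falls back
to the per-process directory under /tmp only when the variable is unset or
empty. A minimal Python sketch of the same rule follows. The helper name
pick_test_dir is made up for illustration and is not part of qemu-iotests.

```python
import os


def pick_test_dir(environ, pid):
    """Mimic ${TEST_DIR:-/tmp/qemu-iotests-quick-$$} from the patch.

    pick_test_dir is a hypothetical helper, not part of qemu-iotests.
    """
    # The ":-" form falls back when the variable is unset OR empty.
    value = environ.get("TEST_DIR", "")
    if value:
        return value
    # "$$" in the shell is the PID of the running script.
    return "/tmp/qemu-iotests-quick-%d" % pid


if __name__ == "__main__":
    print(pick_test_dir(os.environ, os.getpid()))
```

Running with TEST_DIR already exported therefore leaves the caller's choice
untouched, which is presumably why the patch also prints TEST_DIR in the
"check" banner.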




[Qemu-devel] [PATCH v3 7/7] qemu-iotests: Add supported os parameter for python tests

2014-12-30 Thread Fam Zheng
If I understand correctly, qemu-iotests was never meant to be portable.
We only support Linux for all the shell cases, but didn't specify it for
the python tests. Now add this and default all the python tests to Linux
only. If we care enough later, we can override the parameter in
individual cases.

Signed-off-by: Fam Zheng 
---
 tests/qemu-iotests/iotests.py | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index f57f154..87002e0 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -282,12 +282,15 @@ def notrun(reason):
     print '%s not run: %s' % (seq, reason)
     sys.exit(0)
 
-def main(supported_fmts=[]):
+def main(supported_fmts=[], supported_oses=['linux']):
     '''Run tests'''
 
     if supported_fmts and (imgfmt not in supported_fmts):
         notrun('not suitable for this image format: %s' % imgfmt)
 
+    if sys.platform not in supported_oses:
+        notrun('not suitable for this OS: %s' % sys.platform)
+
     # We need to filter out the time taken from the output so that qemu-iotest
     # can reliably diff the results against master output.
     import StringIO
-- 
1.9.3
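
One thing to watch for with an exact match on sys.platform: Python 2 on
Linux reports 'linux2' (Python 3.3 and later report 'linux'), so a
membership test against ['linux'] may skip every test on a Python 2 host.
A minimal sketch of a prefix-based check is shown below. The helper name
os_is_supported is hypothetical, and this is only one possible way to
express the check, not necessarily what the series ends up using.

```python
import sys


def os_is_supported(platform, supported_oses=('linux',)):
    """Return True if platform starts with any supported OS prefix.

    os_is_supported is a hypothetical helper used only for illustration.
    """
    # A prefix match accepts both 'linux' and 'linux2'.
    return any(platform.startswith(prefix) for prefix in supported_oses)


if __name__ == "__main__":
    print(os_is_supported(sys.platform))
```

An individual test could then pass a longer tuple, for example
('linux', 'darwin'), to opt in to more hosts.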




[Qemu-devel] [Bug 1406706] [NEW] guest will be destroyed when create guest with parameter "-usbdevice tablet".

2014-12-30 Thread chao zhou
Public bug reported:

Environment:

Host OS (ia32/ia32e/IA64):ia32e
Guest OS (ia32/ia32e/IA64):ia32e
Guest OS Type (Linux/Windows):windows
kvm.git Commit:2c4aa55a6af070262cca425745e8e54310e96b8d
qemu.git Commit:ab0302ee764fd702465aef6d88612cdff4302809
Host Kernel Version:3.18.0-rc3
Hardware: Ivytown_EP,Haswell_EP


Bug detailed description:
--
When a guest is created with the parameter "-usbdevice tablet", the guest is
destroyed.

note:
This should be a qemu bug:
kvm       +  qemu     = result
2c4aa55a  +  ab0302ee = bad
2c4aa55a  +  54600752 = good


Reproduce steps:

1. Create the guest:
qemu-system-x86_64 --enable-kvm -m 4G -smp 2 -net none win8.1.qcow -usbdevice tablet


Current result:

The guest is destroyed when it is created with "-usbdevice tablet".

Expected result:

The guest works fine when it is created with "-usbdevice tablet".


Basic root-causing log:
--
[root@vt-hsw2 ~]# qemu-system-x86_64 -enable-kvm -m 4G -smp 2 -net none /root/cathy/win8.1.qcow  -usbdevice tablet
qemu-system-x86_64: util/qemu-option.c:387: qemu_opt_get_bool_helper: Assertion `opt->desc && opt->desc->type == QEMU_OPT_BOOL' failed.
Aborted (core dumped)

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
https://bugs.launchpad.net/bugs/1406706




[Qemu-devel] bind interdomain ioctl error xen-kvm.c

2014-12-30 Thread Rishi Ranjan
I am trying to use Xen as the accelerator for my QEMU machine. I have created
a guest domain with the following xl config:

builder = "hvm"
name = "qemu-hvm"
memory = "512"
vcpus = 1
vif = ['']
vnc = 1
boot="c"


When I try to run QEMU with the following parameters:

-machine q35,accel=xen -cpu qemu64 -bios ./pc-bios/bios-256k.bin -xen-domid "Domain id of guest"

I get the following error from xen-hvm.c:

"bind interdomain ioctl error" in xen_hvm_init while calling
state->shared_vmport_page =
xc_map_foreign_range(xen_xc, xen_domid, XC_PAGE_SIZE,
 PROT_READ|PROT_WRITE, ioreq_pfn);

Can someone help me get this working?

Thanks,
Rishi


[Qemu-devel] [PATCH] Fix irq route entries exceed KVM_MAX_IRQ_ROUTES

2014-12-30 Thread 马文霜
Last month, several of our guests (6 to 8 cores) crashed, and the QEMU logs
showed the following message:

qemu-system-x86_64: /build/qemu-2.1.2/kvm-all.c:976:
kvm_irqchip_commit_routes: Assertion `ret == 0' failed.

After analysis and verification, we can confirm that the irq-balance
daemon in the guest leads to the assertion failure. Starting an 8-core guest
with two disks and running the following script in it reproduces the bug
quickly:

vda_irq_num=25
vdb_irq_num=27
while [ 1 ]
do
    for irq in {1,2,4,8,10,20,40,80}
    do
        echo $irq > /proc/irq/$vda_irq_num/smp_affinity
        echo $irq > /proc/irq/$vdb_irq_num/smp_affinity
        dd if=/dev/vda of=/dev/zero bs=4K count=100 iflag=direct
        dd if=/dev/vdb of=/dev/zero bs=4K count=100 iflag=direct
    done
done
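
The values written to smp_affinity in the loop are hexadecimal CPU
bitmasks with a single bit set, so each pass moves the two disk interrupts
to the next of the eight vCPUs. As a rough illustration, the list can be
generated rather than typed by hand. The function name affinity_masks is
made up for this sketch.

```python
def affinity_masks(num_cpus):
    """Return one single-CPU smp_affinity mask per CPU, as hex strings.

    affinity_masks is a hypothetical helper used only for illustration.
    """
    # Bit N of the mask selects CPU N; the proc file expects hexadecimal.
    return [format(1 << cpu, "x") for cpu in range(num_cpus)]


if __name__ == "__main__":
    print(",".join(affinity_masks(8)))
```

For a guest with a different number of cores, only the argument changes.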

QEMU sets up static IRQ route entries in kvm_pc_setup_irq_routing(). The
PIC and IOAPIC share the first 15 GSI numbers, so together they take up 23
GSI numbers but 38 IRQ route entries. When the IRQ smp_affinity is changed
in the guest, a dynamic route entry may be set up. The current logic is: if
allocating a GSI number succeeds, a new route entry can be added. There are
1001 (KVM_MAX_IRQ_ROUTES-23) dynamic GSI numbers available, but only
986 (KVM_MAX_IRQ_ROUTES-38) IRQ route entries, so there are more GSI numbers
than route entries. The behavior of irq-balance eventually makes the total
number of IRQ route entries exceed KVM_MAX_IRQ_ROUTES;
ioctl(KVM_SET_GSI_ROUTING) then fails and kvm_irqchip_commit_routes()
triggers the assertion failure.
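
The mismatch can be checked with a few lines of arithmetic. The sketch
below assumes KVM_MAX_IRQ_ROUTES is 1024 (an assumption, chosen because it
matches the 986 figure above) and models the allocation as "one new route
entry per newly allocated GSI, never flushed". It is a toy model of the
counting argument, not QEMU code, and the function names are made up.

```python
KVM_MAX_IRQ_ROUTES = 1024  # assumption: consistent with 1024 - 38 = 986
STATIC_GSIS = 23           # GSI numbers taken by PIC + IOAPIC (from the text)
STATIC_ROUTES = 38         # route entries taken by PIC + IOAPIC (from the text)


def dynamic_headroom(max_routes=KVM_MAX_IRQ_ROUTES):
    """Return (free GSI numbers, free route entries) after static setup."""
    return max_routes - STATIC_GSIS, max_routes - STATIC_ROUTES


def first_overflow(max_routes=KVM_MAX_IRQ_ROUTES):
    """Return which dynamic allocation first pushes the route table past
    max_routes, or None if the GSI numbers run out first."""
    gsis, routes, added = STATIC_GSIS, STATIC_ROUTES, 0
    while gsis < max_routes:  # GSI allocation still succeeds
        gsis += 1
        routes += 1
        added += 1
        if routes > max_routes:
            return added
    return None


if __name__ == "__main__":
    print(dynamic_headroom())
    print(first_overflow())
```

The gap of 15 between the two counters (38 - 23) is what lets a GSI
allocation succeed after the route table is already full.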

This patch fixes the bug.

Signed-off-by: Wenshuang Ma 
---
 kvm-all.c |   11 +++++++++++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 18cc6b4..f47e1b1 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1123,6 +1123,17 @@ static int kvm_irqchip_get_virq(KVMState *s)
     int i, bit;
     bool retry = true;
 
+    /*
+     * PIC and IOAPIC share the first 15 GSI numbers, so there are more
+     * available GSI numbers than IRQ route entries. A new route entry is
+     * added whenever allocating a GSI number succeeds, so the total number
+     * of IRQ route entries can exceed gsi_count. Flush the dynamic MSI
+     * entries when the number of IRQ route entries reaches gsi_count.
+     */
+    if (!s->direct_msi && s->irq_routes->nr == s->gsi_count) {
+        kvm_flush_dynamic_msi_routes(s);
+    }
+
 again:
     /* Return the lowest unused GSI in the bitmap */
     for (i = 0; i < max_words; i++) {
-- 
1.7.1