date:20160105

Re: [Qemu-devel] [vfio-users] [PATCH v2 1/3] input: add qemu_input_qcode_to_linux + qemu_input_linux_to_qcode

2016-01-05 Thread Jonathan Scruggs

I found the qemu user yesterday and added it to the input group. All is
good now. The patches work great! Are they being added to the main code
base soon? A small faq on the site detailing libvirt usage and adding qemu
to the input group would be needed though.

I notice no bugs as of yet. The mouse fully disengages from each system
unlike an earlier report I read in this mailing list. This is perfect and
precisely what I requested many months ago when vfio-users mailing list
first started. :)

Thanks for all your work on this,
Jon

On 5 January 2016 at 07:05, Gerd Hoffmann  wrote:

> On Mo, 2016-01-04 at 13:19 +, Jonathan Scruggs wrote:
> > Oh. I just changed /dev/input/eventx (replace x with correct number
> > for my devices) to permissions of 666 and it worked. I guess I had to
> > change the conf file and change the permissions. Is there a way to
> > make the devices work with qemu? The permission user is root and group
> > of input for all the eventx devices. Do I need a udev script or is
> > there a qemu user that can be added to the group of input?
>
> I'm using chmod 666, adding the qemu user to the input group should work
> too.
>
> cheers,
>   Gerd
>
>
>

Re: [Qemu-devel] [PATCH v4 00/14] qemu-img map: Allow driver to return file of the allocated block

2016-01-05 Thread Fam Zheng

On Mon, 01/04 22:02, Max Reitz wrote:
> On 24.12.2015 06:50, Fam Zheng wrote:
> > v4: Rebase and resend, adding Eric's and Stefan's reviewed-by.
> > 
> > Fix one typo in patch 13.
> > 
> > Drop previous patch 14 for a later rework because it is not a hard
> > requirement, but it is pending on Eric's QAPI-to-JSON visitor series:
> > 
> > https://lists.gnu.org/archive/html/qemu-devel/2015-12/msg03929.html
> > 
> > v3: Add Eric's rev-by in patches 6, 7, 13, 14.
> > 12: New, split out from the previous 13.
> > 12->13: Refactor "entry_mergable" from imp_map().
> > Don't mess the merge conditions. [Paolo]
> > Address Eric's comments:
> > - Check has_foo before using foo.
> > - Remove blank line between comments and definition in schema.
> > - Use PRId64 instead of %ld.
> > - Merge short lines.
> > 
> > v2: Add Eric's rev-by in patches 2, 4, 5.
> > 01: Refering -> referring in commit message. [Eric]
> > Recurse to "file" for sensible "zero" flag. [Paolo]
> > 12: New. Make MapEntry a QAPI struct. [Paolo, Markus]
> > 
> > Original cover letter
> > -
> > 
> > I stumbled upon this when looking at external bitmap formats.
> > 
> > Current "qemu-img map" command only displays filename if the data is 
> > allocated
> > in bs (bs->file) itself, or in the backing chain. Otherwise, it displays an
> > unfriendly error message:
> > 
> > $ qemu-img create -f vmdk -o subformat=monolithicFlat /tmp/test.vmdk 1G
> > 
> > $ qemu-img map /tmp/test.vmdk
> > Offset  Length  Mapped to   File
> > qemu-img: File contains external, encrypted or compressed clusters.
> > 
> > This can be improved. This series extends the .bdrv_co_get_block_status
> > callback, to let block driver return the BDS of file; then updates all 
> > driver
> > to implement it; and lastly, it changes qemu-img to use this information in
> > "map" command:
> > 
> > 
> > $ qemu-img map /tmp/test.vmdk
> > Offset  Length  Mapped to   File
> > 0   0x4000  0   /tmp/test-flat.vmdk
> > 
> > $ qemu-img map --output json /tmp/test.vmdk
> > [{"length": 1073741824, "start": 0, "zero": false, "offset": 0, 
> > "depth": 0,
> >   "file": "/tmp/test-flat.vmdk", "data": true}
> > ]
> > 
> > Fam Zheng (14):
> >   block: Add "file" output parameter to block status query functions
> >   qcow: Assign bs->file->bs to file in qcow_co_get_block_status
> >   qcow2: Assign bs->file->bs to file in qcow2_co_get_block_status
> 
> Minor comment: I'd swap these two patches (2 and 3). Patch 1 breaks test
> 102, patch 3 fixes it again. It would be better to break it for as short
> a time as possible.
> 
> Alternatively, in order not to break 102 at all, patch 1 would need to
> leave the "if (bs->file &&" part of bdrv_co_get_block_status()
> (@@ -1544,13 +1550,14 @@) as-is and change it only after the format
> block drivers set *file.

Yes, this is better.

Fam

[Qemu-devel] [PATCH v5 00/15] qemu-img map: Allow driver to return file of the allocated block

2016-01-05 Thread Fam Zheng

v5: Address Max's review comments:
- "It's" -> "Its" in commit message in patch 1;
- Retain the "bs->file" condition to unbreak iotest 102 in patch 1;
- Fix comment of BDRV_BLOCK_OFFSET_VALID in patch 1;
- Fix VMDK's compressed case in patch 11;
- Add patch 12 in complement of patch 1.

v4: Rebase and resend, adding Eric's and Stefan's reviewed-by.

Fix one typo in patch 13.

Drop previous patch 14 for a later rework because it is not a hard
requirement, but it is pending on Eric's QAPI-to-JSON visitor series:

https://lists.gnu.org/archive/html/qemu-devel/2015-12/msg03929.html

Original cover letter
-

I stumbled upon this when looking at external bitmap formats.

Current "qemu-img map" command only displays filename if the data is allocated
in bs (bs->file) itself, or in the backing chain. Otherwise, it displays an
unfriendly error message:

$ qemu-img create -f vmdk -o subformat=monolithicFlat /tmp/test.vmdk 1G

$ qemu-img map /tmp/test.vmdk
Offset  Length  Mapped to   File
qemu-img: File contains external, encrypted or compressed clusters.

This can be improved. This series extends the .bdrv_co_get_block_status
callback, to let block driver return the BDS of file; then updates all driver
to implement it; and lastly, it changes qemu-img to use this information in
"map" command:


$ qemu-img map /tmp/test.vmdk
Offset  Length  Mapped to   File
0   0x4000  0   /tmp/test-flat.vmdk

$ qemu-img map --output json /tmp/test.vmdk
[{"length": 1073741824, "start": 0, "zero": false, "offset": 0, "depth": 0,
  "file": "/tmp/test-flat.vmdk", "data": true}
]


Fam Zheng (15):
  block: Add "file" output parameter to block status query functions
  qcow: Assign bs->file->bs to file in qcow_co_get_block_status
  qcow2: Assign bs->file->bs to file in qcow2_co_get_block_status
  raw: Assign bs to file in raw_co_get_block_status
  iscsi: Assign bs to file in iscsi_co_get_block_status
  parallels: Assign bs->file->bs to file in
parallels_co_get_block_status
  qed: Assign bs->file->bs to file in bdrv_qed_co_get_block_status
  sheepdog: Assign bs to file in sd_co_get_block_status
  vdi: Assign bs->file->bs to file in vdi_co_get_block_status
  vpc: Assign bs->file->bs to file in vpc_co_get_block_status
  vmdk: Return extent's file in bdrv_get_block_status
  block: Remove the "bs->file" test in bdrv_co_get_block_status
  qemu-img: In "map", use the returned "file" from bdrv_get_block_status
  qemu-img: Make MapEntry a QAPI struct
  iotests: Add "qemu-img map" test for VMDK extents

 block/io.c | 42 -
 block/iscsi.c  |  9 --
 block/mirror.c |  3 +-
 block/parallels.c  |  3 +-
 block/qcow.c   |  3 +-
 block/qcow2.c  |  3 +-
 block/qed.c|  6 +++-
 block/raw-posix.c  |  4 ++-
 block/raw_bsd.c|  4 ++-
 block/sheepdog.c   |  5 ++-
 block/vdi.c|  3 +-
 block/vmdk.c   | 12 ---
 block/vpc.c|  4 ++-
 block/vvfat.c  |  2 +-
 include/block/block.h  |  9 --
 include/block/block_int.h  |  3 +-
 qapi/block-core.json   | 27 
 qemu-img.c | 78 --
 tests/qemu-iotests/059 | 10 ++
 tests/qemu-iotests/059.out | 38 ++
 20 files changed, 201 insertions(+), 67 deletions(-)

-- 
2.4.3

[Qemu-devel] [PATCH v5 03/15] qcow2: Assign bs->file->bs to file in qcow2_co_get_block_status

2016-01-05 Thread Fam Zheng

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 block/qcow2.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 7096a29..da74eb7 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1348,6 +1348,7 @@ static int64_t coroutine_fn 
qcow2_co_get_block_status(BlockDriverState *bs,
 !s->cipher) {
 index_in_cluster = sector_num & (s->cluster_sectors - 1);
 cluster_offset |= (index_in_cluster << BDRV_SECTOR_BITS);
+*file = bs->file->bs;
 status |= BDRV_BLOCK_OFFSET_VALID | cluster_offset;
 }
 if (ret == QCOW2_CLUSTER_ZERO) {
-- 
2.4.3

[Qemu-devel] [PATCH v5 01/15] block: Add "file" output parameter to block status query functions

2016-01-05 Thread Fam Zheng

The added parameter can be used to return the BDS pointer which the
valid offset is referring to. Its value should be ignored unless
BDRV_BLOCK_OFFSET_VALID in ret is set.

Until block drivers fill in the right value, let's clear it explicitly
right before calling .bdrv_get_block_status.

The "bs->file" condition in bdrv_co_get_block_status is kept now to keep iotest
case 102 passing, and will be fixed once all drivers return the right file
pointer.

Signed-off-by: Fam Zheng 
---
 block/io.c| 42 --
 block/iscsi.c |  6 --
 block/mirror.c|  3 ++-
 block/parallels.c |  2 +-
 block/qcow.c  |  2 +-
 block/qcow2.c |  2 +-
 block/qed.c   |  3 ++-
 block/raw-posix.c |  3 ++-
 block/raw_bsd.c   |  3 ++-
 block/sheepdog.c  |  2 +-
 block/vdi.c   |  2 +-
 block/vmdk.c  |  2 +-
 block/vpc.c   |  2 +-
 block/vvfat.c |  2 +-
 include/block/block.h |  9 ++---
 include/block/block_int.h |  3 ++-
 qemu-img.c|  7 +--
 17 files changed, 61 insertions(+), 34 deletions(-)

diff --git a/block/io.c b/block/io.c
index 63e3678..492c291 100644
--- a/block/io.c
+++ b/block/io.c
@@ -663,6 +663,7 @@ int bdrv_write_zeroes(BlockDriverState *bs, int64_t 
sector_num,
 int bdrv_make_zero(BlockDriverState *bs, BdrvRequestFlags flags)
 {
 int64_t target_sectors, ret, nb_sectors, sector_num = 0;
+BlockDriverState *file;
 int n;
 
 target_sectors = bdrv_nb_sectors(bs);
@@ -675,7 +676,7 @@ int bdrv_make_zero(BlockDriverState *bs, BdrvRequestFlags 
flags)
 if (nb_sectors <= 0) {
 return 0;
 }
-ret = bdrv_get_block_status(bs, sector_num, nb_sectors, &n);
+ret = bdrv_get_block_status(bs, sector_num, nb_sectors, &n, &file);
 if (ret < 0) {
 error_report("error getting block status at sector %" PRId64 ": 
%s",
  sector_num, strerror(-ret));
@@ -1464,6 +1465,7 @@ int bdrv_flush_all(void)
 typedef struct BdrvCoGetBlockStatusData {
 BlockDriverState *bs;
 BlockDriverState *base;
+BlockDriverState **file;
 int64_t sector_num;
 int nb_sectors;
 int *pnum;
@@ -1488,7 +1490,8 @@ typedef struct BdrvCoGetBlockStatusData {
  */
 static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
  int64_t sector_num,
- int nb_sectors, int *pnum)
+ int nb_sectors, int *pnum,
+ BlockDriverState **file)
 {
 int64_t total_sectors;
 int64_t n;
@@ -1518,16 +1521,19 @@ static int64_t coroutine_fn 
bdrv_co_get_block_status(BlockDriverState *bs,
 return ret;
 }
 
-ret = bs->drv->bdrv_co_get_block_status(bs, sector_num, nb_sectors, pnum);
+*file = NULL;
+ret = bs->drv->bdrv_co_get_block_status(bs, sector_num, nb_sectors, pnum,
+file);
 if (ret < 0) {
 *pnum = 0;
+*file = NULL;
 return ret;
 }
 
 if (ret & BDRV_BLOCK_RAW) {
 assert(ret & BDRV_BLOCK_OFFSET_VALID);
 return bdrv_get_block_status(bs->file->bs, ret >> BDRV_SECTOR_BITS,
- *pnum, pnum);
+ *pnum, pnum, file);
 }
 
 if (ret & (BDRV_BLOCK_DATA | BDRV_BLOCK_ZERO)) {
@@ -1544,13 +1550,14 @@ static int64_t coroutine_fn 
bdrv_co_get_block_status(BlockDriverState *bs,
 }
 }
 
-if (bs->file &&
+if (bs->file && *file && *file != bs &&
 (ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO) &&
 (ret & BDRV_BLOCK_OFFSET_VALID)) {
+BlockDriverState *file2;
 int file_pnum;
 
-ret2 = bdrv_co_get_block_status(bs->file->bs, ret >> BDRV_SECTOR_BITS,
-*pnum, &file_pnum);
+ret2 = bdrv_co_get_block_status(*file, ret >> BDRV_SECTOR_BITS,
+*pnum, &file_pnum, &file2);
 if (ret2 >= 0) {
 /* Ignore errors.  This is just providing extra information, it
  * is useful but not necessary.
@@ -1575,14 +1582,15 @@ static int64_t coroutine_fn 
bdrv_co_get_block_status_above(BlockDriverState *bs,
 BlockDriverState *base,
 int64_t sector_num,
 int nb_sectors,
-int *pnum)
+int *pnum,
+BlockDriverState **file)
 {
 BlockDriverState *p;
 int64_t ret = 0;
 
 assert(bs != base);
 for (p = bs; p != base; p = backing_bs(p)) {
-ret = bdrv_co_get_block_status(p, sector_num, nb_sectors, pnum);
+ret = bdrv_co_get_block_status(p, sector_num, nb_sectors, pnum, file);
 if (ret < 0 || ret & BDRV_BLOCK_ALLOCATED) {
 break;

Re: [Qemu-devel] [PATCH] macio: fix overflow in lba to offset conversion for ATAPI devices

2016-01-05 Thread Mark Cave-Ayland

On 04/01/16 21:03, John Snow wrote:

> On 01/04/2016 03:54 PM, Mark Cave-Ayland wrote:
>> On 04/01/16 20:36, John Snow wrote:
>>
>>> On 01/04/2016 02:15 PM, Mark Cave-Ayland wrote:
 On 04/01/16 19:04, P J P wrote:

> +-- On Mon, 4 Jan 2016, Mark Cave-Ayland wrote --+
> |  /* Calculate current offset */
> | -offset = (int64_t)(s->lba << 11) + s->io_buffer_index;
> | +offset = ((int64_t)(s->lba) << 11) + s->io_buffer_index;
>
> Maybe ((int64_t)s->lba << 11) ? No parenthesis around s->lba.

 Yes that works here too (perhaps I was just being over-cautious).
 Alex/John, please let me know if you want me to resubmit.

>>>
>>> PJP's version should work just fine. I won't ask you to resubmit, though...
>>
>> Great, thanks :)
>>
>>> ...But, well, while we're here, I have a question for you:
>>>
>>> So s->lba is an int that we left shift by 11 for a max of (2^43 - 2^11)
>>> then we add it against s->io_buffer_index, a uint64_t, so this statement
>>> could still in theory overflow.
>>>
>>> Except not really, since io_buffer_index is bounded (in general) by
>>> io_buffer_total_len, which is usually (IDE_DMA_BUF_SECTORS*512 + 4) ->
>>> ~132K.
>>>
>>> I don't think there's any rigorous bounds-checking of io_buffer_index,
>>> just ad-hoc checking when we're good enough to remember to do it. And we
>>> don't seem to do it anywhere in macio. Is it worth peppering in an
>>> assert somewhere that io_buffer_index is reasonably small?
>>
>> The DBDMA engine is limited to 16-bit transfers so the maximum transfer
>> size is 64K, and s->io_buffer_index is used to hold the current position
>> within this transfer so unless we get some very large disks I think we
>> should be okay here?
>>
> 
> For all non-malicious uses of the code, yes.
> 
> If I want to apply some rigorous checking to this bound I should just
> add a function to manipulate it centrally in core.c, I think.

That sounds good - any solution that avoids having to maintain changes
across IDE core and macio separately is always good!

> I'll pull this and edit it to PJP's suggestion.

Brilliant - thanks a lot!


ATB,

Mark.

[Qemu-devel] [PATCH v5 02/15] qcow: Assign bs->file->bs to file in qcow_co_get_block_status

2016-01-05 Thread Fam Zheng

Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 block/qcow.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/qcow.c b/block/qcow.c
index 558f443..b59383f 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -509,6 +509,7 @@ static int64_t coroutine_fn 
qcow_co_get_block_status(BlockDriverState *bs,
 return BDRV_BLOCK_DATA;
 }
 cluster_offset |= (index_in_cluster << BDRV_SECTOR_BITS);
+*file = bs->file->bs;
 return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | cluster_offset;
 }
 
-- 
2.4.3

[Qemu-devel] [PATCH v5 07/15] qed: Assign bs->file->bs to file in bdrv_qed_co_get_block_status

2016-01-05 Thread Fam Zheng

Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 block/qed.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/qed.c b/block/qed.c
index a6bbd8b..03af9c1 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -692,6 +692,7 @@ typedef struct {
 uint64_t pos;
 int64_t status;
 int *pnum;
+BlockDriverState **file;
 } QEDIsAllocatedCB;
 
 static void qed_is_allocated_cb(void *opaque, int ret, uint64_t offset, size_t 
len)
@@ -703,6 +704,7 @@ static void qed_is_allocated_cb(void *opaque, int ret, 
uint64_t offset, size_t l
 case QED_CLUSTER_FOUND:
 offset |= qed_offset_into_cluster(s, cb->pos);
 cb->status = BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | offset;
+*cb->file = cb->bs->file->bs;
 break;
 case QED_CLUSTER_ZERO:
 cb->status = BDRV_BLOCK_ZERO;
@@ -734,6 +736,7 @@ static int64_t coroutine_fn 
bdrv_qed_co_get_block_status(BlockDriverState *bs,
 .pos = (uint64_t)sector_num * BDRV_SECTOR_SIZE,
 .status = BDRV_BLOCK_OFFSET_MASK,
 .pnum = pnum,
+.file = file,
 };
 QEDRequest request = { .l2_table = NULL };
 
-- 
2.4.3

[Qemu-devel] [PATCH v5 06/15] parallels: Assign bs->file->bs to file in parallels_co_get_block_status

2016-01-05 Thread Fam Zheng

Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 block/parallels.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/parallels.c b/block/parallels.c
index d83246b..129668b 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -273,6 +273,7 @@ static int64_t coroutine_fn 
parallels_co_get_block_status(BlockDriverState *bs,
 return 0;
 }
 
+*file = bs->file->bs;
 return (offset << BDRV_SECTOR_BITS) |
 BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID;
 }
-- 
2.4.3

[Qemu-devel] [PATCH v5 08/15] sheepdog: Assign bs to file in sd_co_get_block_status

2016-01-05 Thread Fam Zheng

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 block/sheepdog.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index 0f6789e..d5e7ff8 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -2740,6 +2740,9 @@ sd_co_get_block_status(BlockDriverState *bs, int64_t 
sector_num, int nb_sectors,
 if (*pnum > nb_sectors) {
 *pnum = nb_sectors;
 }
+if (ret > 0 && ret & BDRV_BLOCK_OFFSET_VALID) {
+*file = bs;
+}
 return ret;
 }
 
-- 
2.4.3

[Qemu-devel] [PATCH v5 04/15] raw: Assign bs to file in raw_co_get_block_status

2016-01-05 Thread Fam Zheng

Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 block/raw-posix.c | 1 +
 block/raw_bsd.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 6fc0b71..344272f 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1869,6 +1869,7 @@ static int64_t coroutine_fn 
raw_co_get_block_status(BlockDriverState *bs,
 *pnum = MIN(nb_sectors, (data - start) / BDRV_SECTOR_SIZE);
 ret = BDRV_BLOCK_ZERO;
 }
+*file = bs;
 return ret | BDRV_BLOCK_OFFSET_VALID | start;
 }
 
diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index 7d23c08..2d4c896 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -118,6 +118,7 @@ static int64_t coroutine_fn 
raw_co_get_block_status(BlockDriverState *bs,
 BlockDriverState **file)
 {
 *pnum = nb_sectors;
+*file = bs;
 return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_DATA |
(sector_num << BDRV_SECTOR_BITS);
 }
-- 
2.4.3

[Qemu-devel] [PATCH v5 05/15] iscsi: Assign bs to file in iscsi_co_get_block_status

2016-01-05 Thread Fam Zheng

Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 block/iscsi.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/iscsi.c b/block/iscsi.c
index 2d1e230..8c7f1b3 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -625,6 +625,9 @@ out:
 if (iTask.task != NULL) {
 scsi_free_scsi_task(iTask.task);
 }
+if (ret > 0 && ret & BDRV_BLOCK_OFFSET_VALID) {
+*file = bs;
+}
 return ret;
 }
 
-- 
2.4.3

[Qemu-devel] [PATCH v5 11/15] vmdk: Return extent's file in bdrv_get_block_status

2016-01-05 Thread Fam Zheng

Signed-off-by: Fam Zheng 
---
 block/vmdk.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index f5a56fd..59e1ffe 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1265,6 +1265,7 @@ static int64_t coroutine_fn 
vmdk_co_get_block_status(BlockDriverState *bs,
  0, 0);
 qemu_co_mutex_unlock(&s->lock);
 
+index_in_cluster = vmdk_find_index_in_cluster(extent, sector_num);
 switch (ret) {
 case VMDK_ERROR:
 ret = -EIO;
@@ -1277,14 +1278,15 @@ static int64_t coroutine_fn 
vmdk_co_get_block_status(BlockDriverState *bs,
 break;
 case VMDK_OK:
 ret = BDRV_BLOCK_DATA;
-if (extent->file == bs->file && !extent->compressed) {
-ret |= BDRV_BLOCK_OFFSET_VALID | offset;
+if (!extent->compressed) {
+ret |= BDRV_BLOCK_OFFSET_VALID;
+ret |= (offset + (index_in_cluster << BDRV_SECTOR_BITS))
+& BDRV_BLOCK_OFFSET_MASK;
 }
-
+*file = extent->file->bs;
 break;
 }
 
-index_in_cluster = vmdk_find_index_in_cluster(extent, sector_num);
 n = extent->cluster_sectors - index_in_cluster;
 if (n > nb_sectors) {
 n = nb_sectors;
-- 
2.4.3

[Qemu-devel] [PATCH v5 09/15] vdi: Assign bs->file->bs to file in vdi_co_get_block_status

2016-01-05 Thread Fam Zheng

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 block/vdi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/vdi.c b/block/vdi.c
index 2199fd3..6b1a57b 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -550,6 +550,7 @@ static int64_t coroutine_fn 
vdi_co_get_block_status(BlockDriverState *bs,
 offset = s->header.offset_data +
   (uint64_t)bmap_entry * s->block_size +
   sector_in_block * SECTOR_SIZE;
+*file = bs->file->bs;
 return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | offset;
 }
 
-- 
2.4.3

[Qemu-devel] [PATCH v5 13/15] qemu-img: In "map", use the returned "file" from bdrv_get_block_status

2016-01-05 Thread Fam Zheng

Now all drivers should return a correct "file", we can make use of it,
even with the recursion into backing chain above.

Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 qemu-img.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qemu-img.c b/qemu-img.c
index ac4eee5..a35edee 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -2225,7 +2225,7 @@ static int get_block_status(BlockDriverState *bs, int64_t 
sector_num,
 e->flags = ret & ~BDRV_BLOCK_OFFSET_MASK;
 e->offset = ret & BDRV_BLOCK_OFFSET_MASK;
 e->depth = depth;
-e->bs = bs;
+e->bs = file;
 return 0;
 }
 
-- 
2.4.3

[Qemu-devel] [PATCH v5 12/15] block: Remove the "bs->file" test in bdrv_co_get_block_status

2016-01-05 Thread Fam Zheng

Now that all drivers return the right "file" pointer, we can remove this
check.

Signed-off-by: Fam Zheng 
---
 block/io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/io.c b/block/io.c
index 492c291..1ca4e61 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1550,7 +1550,7 @@ static int64_t coroutine_fn 
bdrv_co_get_block_status(BlockDriverState *bs,
 }
 }
 
-if (bs->file && *file && *file != bs &&
+if (*file && *file != bs &&
 (ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO) &&
 (ret & BDRV_BLOCK_OFFSET_VALID)) {
 BlockDriverState *file2;
-- 
2.4.3

[Qemu-devel] [PATCH v5 10/15] vpc: Assign bs->file->bs to file in vpc_co_get_block_status

2016-01-05 Thread Fam Zheng

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 block/vpc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/vpc.c b/block/vpc.c
index 912f5d0..412ff41 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -588,6 +588,7 @@ static int64_t coroutine_fn 
vpc_co_get_block_status(BlockDriverState *bs,
 
 if (be32_to_cpu(footer->type) == VHD_FIXED) {
 *pnum = nb_sectors;
+*file = bs->file->bs;
 return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_DATA |
(sector_num << BDRV_SECTOR_BITS);
 }
@@ -609,6 +610,7 @@ static int64_t coroutine_fn 
vpc_co_get_block_status(BlockDriverState *bs,
 /* *pnum can't be greater than one block for allocated
  * sectors since there is always a bitmap in between. */
 if (allocated) {
+*file = bs->file->bs;
 return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | start;
 }
 if (nb_sectors == 0) {
-- 
2.4.3

Re: [Qemu-devel] [PATCH v8 1/2] mirror: Rewrite mirror_iteration

2016-01-05 Thread Fam Zheng

On Mon, 01/04 20:27, Max Reitz wrote:
> On 24.12.2015 04:15, Fam Zheng wrote:
> > The "pnum < nb_sectors" condition in deciding whether to actually copy
> > data is unnecessarily strict, and the qiov initialization is
> > unnecessarily for bdrv_aio_write_zeroes and bdrv_aio_discard.
> > 
> > Rewrite mirror_iteration to fix both flaws.
> > 
> > Signed-off-by: Fam Zheng 
> > ---
> >  block/mirror.c | 344 
> > +++--
> >  trace-events   |   1 -
> >  2 files changed, 213 insertions(+), 132 deletions(-)
> > 
> > diff --git a/block/mirror.c b/block/mirror.c
> > index f201f2b..0081c2e 100644
> > --- a/block/mirror.c
> > +++ b/block/mirror.c
> > @@ -46,7 +46,6 @@ typedef struct MirrorBlockJob {
> >  BlockdevOnError on_source_error, on_target_error;
> >  bool synced;
> >  bool should_complete;
> > -int64_t sector_num;
> >  int64_t granularity;
> >  size_t buf_size;
> >  int64_t bdev_length;
> > @@ -63,6 +62,8 @@ typedef struct MirrorBlockJob {
> >  int ret;
> >  bool unmap;
> >  bool waiting_for_io;
> > +int target_cluster_sectors;
> > +int max_iov;
> >  } MirrorBlockJob;
> >  
> >  typedef struct MirrorOp {
> > @@ -158,115 +159,90 @@ static void mirror_read_complete(void *opaque, int 
> > ret)
> >  mirror_write_complete, op);
> >  }
> >  
> > -static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
> > +/* Round sector_num and/or nb_sectors to target cluster if COW is needed, 
> > and
> > + * return the offset of the adjusted tail sector against original. */
> > +static int mirror_cow_align(MirrorBlockJob *s,
> > +int64_t *sector_num,
> > +int *nb_sectors)
> > +{
> > +bool head_need_cow, tail_need_cow;
> > +int diff = 0;
> > +int chunk_sectors = s->granularity >> BDRV_SECTOR_BITS;
> > +
> > +head_need_cow = !test_bit(*sector_num / chunk_sectors, s->cow_bitmap);
> > +tail_need_cow = !test_bit((*sector_num + *nb_sectors - 1) / 
> > chunk_sectors,
> > +  s->cow_bitmap);
> > +if (head_need_cow || tail_need_cow) {
> > +int64_t align_sector_num;
> > +int align_nb_sectors;
> > +bdrv_round_to_clusters(s->target, *sector_num, *nb_sectors,
> > +   &align_sector_num, &align_nb_sectors);
> > +if (tail_need_cow) {
> > +diff = align_sector_num + align_nb_sectors
> > +   - (*sector_num + *nb_sectors);
> > +assert(diff >= 0);
> > +*nb_sectors += diff;
> > +}
> > +if (head_need_cow) {
> > +int d = *sector_num - align_sector_num;
> > +assert(d >= 0);
> > +*sector_num = align_sector_num;
> > +*nb_sectors += d;
> > +}
> > +}
> > +
> > +/* If the resulting chunks are more than max_iov, we have to shrink it
> > + * under the alignment restriction. */
> > +if (*nb_sectors / chunk_sectors > s->max_iov) {
> > +int shrink = *nb_sectors / chunk_sectors - s->max_iov;
> 
> Isn't this missing a "shrink *= chunk_sectors"? Because after this line,
> shrink's unit seems to be chunks, but the following code seems to presume it

You are right, and the if condition above should be careful of partial chunks
too. Will fix.

Fam

[Qemu-devel] [PATCH v5 14/15] qemu-img: Make MapEntry a QAPI struct

2016-01-05 Thread Fam Zheng

The "flags" bit mask is expanded to two booleans, "data" and "zero";
"bs" is replaced with "filename" string.

Refactor the merge conditions in img_map() into entry_mergeable().

Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 qapi/block-core.json | 27 
 qemu-img.c   | 71 +++-
 2 files changed, 69 insertions(+), 29 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 1a5d9ce..90f7467 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -186,6 +186,33 @@
'*fragmented-clusters': 'int', '*compressed-clusters': 'int' } }
 
 ##
+# @MapEntry:
+#
+# Mapping information from a virtual block range to a host file range
+#
+# @start: the start byte of the mapped virtual range
+#
+# @length: the number of bytes of the mapped virtual range
+#
+# @data: whether the mapped range has data
+#
+# @zero: whether the virtual blocks are zeroed
+#
+# @depth: the depth of the mapping
+#
+# @offset: #optional the offset in file that the virtual sectors are mapped to
+#
+# @filename: #optional filename that is referred to by @offset
+#
+# Since: 2.6
+#
+##
+{ 'struct': 'MapEntry',
+  'data': {'start': 'int', 'length': 'int', 'data': 'bool',
+   'zero': 'bool', 'depth': 'int', '*offset': 'int',
+   '*filename': 'str' } }
+
+##
 # @BlockdevCacheInfo
 #
 # Cache mode information for a block device
diff --git a/qemu-img.c b/qemu-img.c
index a35edee..78b755c 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -2136,47 +2136,37 @@ static int img_info(int argc, char **argv)
 return 0;
 }
 
-
-typedef struct MapEntry {
-int flags;
-int depth;
-int64_t start;
-int64_t length;
-int64_t offset;
-BlockDriverState *bs;
-} MapEntry;
-
 static void dump_map_entry(OutputFormat output_format, MapEntry *e,
MapEntry *next)
 {
 switch (output_format) {
 case OFORMAT_HUMAN:
-if ((e->flags & BDRV_BLOCK_DATA) &&
-!(e->flags & BDRV_BLOCK_OFFSET_VALID)) {
+if (e->data && !e->has_offset) {
 error_report("File contains external, encrypted or compressed 
clusters.");
 exit(1);
 }
-if ((e->flags & (BDRV_BLOCK_DATA|BDRV_BLOCK_ZERO)) == BDRV_BLOCK_DATA) 
{
+if (e->data && !e->zero) {
 printf("%#-16"PRIx64"%#-16"PRIx64"%#-16"PRIx64"%s\n",
-   e->start, e->length, e->offset, e->bs->filename);
+   e->start, e->length,
+   e->has_offset ? e->offset : 0,
+   e->has_filename ? e->filename : "");
 }
 /* This format ignores the distinction between 0, ZERO and ZERO|DATA.
  * Modify the flags here to allow more coalescing.
  */
-if (next &&
-(next->flags & (BDRV_BLOCK_DATA|BDRV_BLOCK_ZERO)) != 
BDRV_BLOCK_DATA) {
-next->flags &= ~BDRV_BLOCK_DATA;
-next->flags |= BDRV_BLOCK_ZERO;
+if (next && (!next->data || next->zero)) {
+next->data = false;
+next->zero = true;
 }
 break;
 case OFORMAT_JSON:
-printf("%s{ \"start\": %"PRId64", \"length\": %"PRId64", \"depth\": 
%d,"
-   " \"zero\": %s, \"data\": %s",
+printf("%s{ \"start\": %"PRId64", \"length\": %"PRId64","
+   " \"depth\": %"PRId64", \"zero\": %s, \"data\": %s",
(e->start == 0 ? "[" : ",\n"),
e->start, e->length, e->depth,
-   (e->flags & BDRV_BLOCK_ZERO) ? "true" : "false",
-   (e->flags & BDRV_BLOCK_DATA) ? "true" : "false");
-if (e->flags & BDRV_BLOCK_OFFSET_VALID) {
+   e->zero ? "true" : "false",
+   e->data ? "true" : "false");
+if (e->has_offset) {
 printf(", \"offset\": %"PRId64"", e->offset);
 }
 putchar('}');
@@ -,13 +2212,39 @@ static int get_block_status(BlockDriverState *bs, 
int64_t sector_num,
 
 e->start = sector_num * BDRV_SECTOR_SIZE;
 e->length = nb_sectors * BDRV_SECTOR_SIZE;
-e->flags = ret & ~BDRV_BLOCK_OFFSET_MASK;
+e->data = !!(ret & BDRV_BLOCK_DATA);
+e->zero = !!(ret & BDRV_BLOCK_ZERO);
 e->offset = ret & BDRV_BLOCK_OFFSET_MASK;
+e->has_offset = !!(ret & BDRV_BLOCK_OFFSET_VALID);
 e->depth = depth;
-e->bs = file;
+if (file && e->has_offset) {
+e->has_filename = true;
+e->filename = file->filename;
+}
 return 0;
 }
 
+static inline bool entry_mergeable(const MapEntry *curr, const MapEntry *next)
+{
+if (curr->length == 0) {
+return false;
+}
+if (curr->zero != next->zero ||
+curr->data != next->data ||
+curr->depth != next->depth ||
+curr->has_filename != next->has_filename ||
+curr->has_offset != next->has_offset) {
+return false;
+}
+if (curr->has_filename && strcmp(curr->filename, next->filenam

[Qemu-devel] [PATCH v5 15/15] iotests: Add "qemu-img map" test for VMDK extents

2016-01-05 Thread Fam Zheng

Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Fam Zheng 
---
 tests/qemu-iotests/059 | 10 ++
 tests/qemu-iotests/059.out | 38 ++
 2 files changed, 48 insertions(+)

diff --git a/tests/qemu-iotests/059 b/tests/qemu-iotests/059
index 0ded0c3..261d8b0 100755
--- a/tests/qemu-iotests/059
+++ b/tests/qemu-iotests/059
@@ -133,6 +133,16 @@ $QEMU_IO -c "write -P 0xa 900G 512" "$TEST_IMG" | 
_filter_qemu_io
 $QEMU_IO -c "read -v 900G 1024" "$TEST_IMG" | _filter_qemu_io
 
 echo
+echo "=== Testing qemu-img map on extents ==="
+for fmt in twoGbMaxExtentSparse twoGbMaxExtentFlat; do
+IMGOPTS="subformat=$fmt" _make_test_img 31G
+$QEMU_IO -c "write 65024 1k" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IO -c "write 2147483136 1k" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IO -c "write 5G 1k" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IMG map "$TEST_IMG" | _filter_testdir
+done
+
+echo
 echo "=== Testing afl image with a very large capacity ==="
 _use_sample_img afl9.vmdk.bz2
 _img_info
diff --git a/tests/qemu-iotests/059.out b/tests/qemu-iotests/059.out
index 00057fe..54eb530 100644
--- a/tests/qemu-iotests/059.out
+++ b/tests/qemu-iotests/059.out
@@ -2337,6 +2337,44 @@ e103f0:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00  
 read 1024/1024 bytes at offset 966367641600
 1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 
+=== Testing qemu-img map on extents ===
+Formatting 'TEST_DIR/iotest-version3.IMGFMT', fmt=IMGFMT size=33285996544 
subformat=twoGbMaxExtentSparse
+wrote 1024/1024 bytes at offset 65024
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 1024/1024 bytes at offset 2147483136
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 1024/1024 bytes at offset 5368709120
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Offset  Length  Mapped to   File
+0   0x2 0x5 
TEST_DIR/iotest-version3-s001.vmdk
+0x7fff  0x1 0x7 
TEST_DIR/iotest-version3-s001.vmdk
+0x8000  0x1 0x5 
TEST_DIR/iotest-version3-s002.vmdk
+0x14000 0x1 0x5 
TEST_DIR/iotest-version3-s003.vmdk
+Formatting 'TEST_DIR/iotest-version3.IMGFMT', fmt=IMGFMT size=33285996544 
subformat=twoGbMaxExtentFlat
+wrote 1024/1024 bytes at offset 65024
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 1024/1024 bytes at offset 2147483136
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 1024/1024 bytes at offset 5368709120
+1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Offset  Length  Mapped to   File
+0   0x8000  0   
TEST_DIR/iotest-version3-f001.vmdk
+0x8000  0x8000  0   
TEST_DIR/iotest-version3-f002.vmdk
+0x1 0x8000  0   
TEST_DIR/iotest-version3-f003.vmdk
+0x18000 0x8000  0   
TEST_DIR/iotest-version3-f004.vmdk
+0x2 0x8000  0   
TEST_DIR/iotest-version3-f005.vmdk
+0x28000 0x8000  0   
TEST_DIR/iotest-version3-f006.vmdk
+0x3 0x8000  0   
TEST_DIR/iotest-version3-f007.vmdk
+0x38000 0x8000  0   
TEST_DIR/iotest-version3-f008.vmdk
+0x4 0x8000  0   
TEST_DIR/iotest-version3-f009.vmdk
+0x48000 0x8000  0   
TEST_DIR/iotest-version3-f010.vmdk
+0x5 0x8000  0   
TEST_DIR/iotest-version3-f011.vmdk
+0x58000 0x8000  0   
TEST_DIR/iotest-version3-f012.vmdk
+0x6 0x8000  0   
TEST_DIR/iotest-version3-f013.vmdk
+0x68000 0x8000  0   
TEST_DIR/iotest-version3-f014.vmdk
+0x7 0x8000  0   
TEST_DIR/iotest-version3-f015.vmdk
+0x78000 0x4000  0   
TEST_DIR/iotest-version3-f016.vmdk
+
 === Testing afl image with a very large capacity ===
 qemu-img: Can't get size of device 'image': File too large
 *** done
-- 
2.4.3

[Qemu-devel] [PATCH v9 0/2] mirror: Improve zero write and discard

2016-01-05 Thread Fam Zheng

v9: Fix the one bug Max found:
"shrink *= chunk_sectors".

v8: Rebase onto master (didn't pick up Max's rev-by due to non-trivial code
change).

The conflict is around removed lines about "max_iov" and "IOV_MAX" due to
commit 3515727f3, but this also reveals this series forgot that check. So
this revision adds it back:
- Two new fields in MirrorBlockJob are added to cache size info:
  target_cluster_sectors and max_iov.
- mirror_align_cow compares nb_chunks and max_iov, and cut the tail
  appropriately.
- The orphan trace point is removed from trace-events.

Patch 1 rewrites mirror_iteration. Patch 2 is a small DRY cleaning up.

The main benefit is copying unallocated sectors (both zeroed and discarded)
doesn't go through the iov setup loop, as they don't need it.


Fam Zheng (2):
  mirror: Rewrite mirror_iteration
  mirror: Add mirror_wait_for_io

 block/mirror.c | 365 +++--
 trace-events   |   1 -
 2 files changed, 225 insertions(+), 141 deletions(-)

-- 
2.4.3

[Qemu-devel] [PATCH v9 2/2] mirror: Add mirror_wait_for_io

2016-01-05 Thread Fam Zheng

The three lines are duplicated a number of times now, refactor a
function.

Signed-off-by: Fam Zheng 
Reviewed-by: Max Reitz 
---
 block/mirror.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index e3e9fad..0fcc37b 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -209,6 +209,14 @@ static int mirror_cow_align(MirrorBlockJob *s,
 return diff;
 }
 
+static inline void mirror_wait_for_io(MirrorBlockJob *s)
+{
+assert(!s->waiting_for_io);
+s->waiting_for_io = true;
+qemu_coroutine_yield();
+s->waiting_for_io = false;
+}
+
 /* Submit async read while handling COW.
  * Returns: nb_sectors if no alignment is necessary, or
  *  (new_end - sector_num) if tail is rounded up or down due to
@@ -241,9 +249,7 @@ static int mirror_do_read(MirrorBlockJob *s, int64_t 
sector_num,
 
 while (s->buf_free_count < nb_chunks) {
 trace_mirror_yield_in_flight(s, sector_num, s->in_flight);
-s->waiting_for_io = true;
-qemu_coroutine_yield();
-s->waiting_for_io = false;
+mirror_wait_for_io(s);
 }
 
 /* Allocate a MirrorOp that is used as an AIO callback.  */
@@ -334,9 +340,7 @@ static uint64_t coroutine_fn 
mirror_iteration(MirrorBlockJob *s)
 break;
 }
 trace_mirror_yield_in_flight(s, next_sector, s->in_flight);
-s->waiting_for_io = true;
-qemu_coroutine_yield();
-s->waiting_for_io = false;
+mirror_wait_for_io(s);
 /* Now retry.  */
 } else {
 hbitmap_next = hbitmap_iter_next(&s->hbi);
@@ -426,9 +430,7 @@ static void mirror_free_init(MirrorBlockJob *s)
 static void mirror_drain(MirrorBlockJob *s)
 {
 while (s->in_flight > 0) {
-s->waiting_for_io = true;
-qemu_coroutine_yield();
-s->waiting_for_io = false;
+mirror_wait_for_io(s);
 }
 }
 
@@ -616,9 +618,7 @@ static void coroutine_fn mirror_run(void *opaque)
 if (s->in_flight == MAX_IN_FLIGHT || s->buf_free_count == 0 ||
 (cnt == 0 && s->in_flight > 0)) {
 trace_mirror_yield(s, s->in_flight, s->buf_free_count, cnt);
-s->waiting_for_io = true;
-qemu_coroutine_yield();
-s->waiting_for_io = false;
+mirror_wait_for_io(s);
 continue;
 } else if (cnt != 0) {
 delay_ns = mirror_iteration(s);
-- 
2.4.3

[Qemu-devel] [PATCH v9 1/2] mirror: Rewrite mirror_iteration

2016-01-05 Thread Fam Zheng

The "pnum < nb_sectors" condition in deciding whether to actually copy
data is unnecessarily strict, and the qiov initialization is
unnecessarily for bdrv_aio_write_zeroes and bdrv_aio_discard.

Rewrite mirror_iteration to fix both flaws.

Signed-off-by: Fam Zheng 
---
 block/mirror.c | 347 +++--
 trace-events   |   1 -
 2 files changed, 216 insertions(+), 132 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index f201f2b..e3e9fad 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -46,7 +46,6 @@ typedef struct MirrorBlockJob {
 BlockdevOnError on_source_error, on_target_error;
 bool synced;
 bool should_complete;
-int64_t sector_num;
 int64_t granularity;
 size_t buf_size;
 int64_t bdev_length;
@@ -63,6 +62,8 @@ typedef struct MirrorBlockJob {
 int ret;
 bool unmap;
 bool waiting_for_io;
+int target_cluster_sectors;
+int max_iov;
 } MirrorBlockJob;
 
 typedef struct MirrorOp {
@@ -158,115 +159,93 @@ static void mirror_read_complete(void *opaque, int ret)
 mirror_write_complete, op);
 }
 
-static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
+/* Round sector_num and/or nb_sectors to target cluster if COW is needed, and
+ * return the offset of the adjusted tail sector against original. */
+static int mirror_cow_align(MirrorBlockJob *s,
+int64_t *sector_num,
+int *nb_sectors)
+{
+bool head_need_cow, tail_need_cow;
+int diff = 0;
+int chunk_sectors = s->granularity >> BDRV_SECTOR_BITS;
+
+head_need_cow = !test_bit(*sector_num / chunk_sectors, s->cow_bitmap);
+tail_need_cow = !test_bit((*sector_num + *nb_sectors - 1) / chunk_sectors,
+  s->cow_bitmap);
+if (head_need_cow || tail_need_cow) {
+int64_t align_sector_num;
+int align_nb_sectors;
+bdrv_round_to_clusters(s->target, *sector_num, *nb_sectors,
+   &align_sector_num, &align_nb_sectors);
+if (tail_need_cow) {
+diff = align_sector_num + align_nb_sectors
+   - (*sector_num + *nb_sectors);
+assert(diff >= 0);
+*nb_sectors += diff;
+}
+if (head_need_cow) {
+int d = *sector_num - align_sector_num;
+assert(d >= 0);
+*sector_num = align_sector_num;
+*nb_sectors += d;
+}
+}
+
+/* If the resulting chunks are more than max_iov, we have to shrink it
+ * under the alignment restriction. */
+if (*nb_sectors > chunk_sectors * s->max_iov) {
+int shrink = *nb_sectors - chunk_sectors * s->max_iov;
+if (tail_need_cow) {
+/* In this case, tail must be aligned already, so we just make sure
+ * the shrink is also aligned. */
+shrink -= shrink % s->target_cluster_sectors;
+}
+assert(shrink);
+diff -= shrink;
+*nb_sectors -= shrink;
+}
+
+assert(*nb_sectors > 0);
+return diff;
+}
+
+/* Submit async read while handling COW.
+ * Returns: nb_sectors if no alignment is necessary, or
+ *  (new_end - sector_num) if tail is rounded up or down due to
+ *  alignment or buffer limit.
+ */
+static int mirror_do_read(MirrorBlockJob *s, int64_t sector_num,
+  int nb_sectors)
 {
 BlockDriverState *source = s->common.bs;
-int nb_sectors, sectors_per_chunk, nb_chunks, max_iov;
-int64_t end, sector_num, next_chunk, next_sector, hbitmap_next_sector;
-uint64_t delay_ns = 0;
+int sectors_per_chunk, nb_chunks;
+int ret = nb_sectors;
 MirrorOp *op;
-int pnum;
-int64_t ret;
 
-max_iov = MIN(source->bl.max_iov, s->target->bl.max_iov);
-
-s->sector_num = hbitmap_iter_next(&s->hbi);
-if (s->sector_num < 0) {
-bdrv_dirty_iter_init(s->dirty_bitmap, &s->hbi);
-s->sector_num = hbitmap_iter_next(&s->hbi);
-trace_mirror_restart_iter(s, bdrv_get_dirty_count(s->dirty_bitmap));
-assert(s->sector_num >= 0);
-}
-
-hbitmap_next_sector = s->sector_num;
-sector_num = s->sector_num;
 sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
-end = s->bdev_length / BDRV_SECTOR_SIZE;
 
-/* Extend the QEMUIOVector to include all adjacent blocks that will
- * be copied in this operation.
- *
- * We have to do this if we have no backing file yet in the destination,
- * and the cluster size is very large.  Then we need to do COW ourselves.
- * The first time a cluster is copied, copy it entirely.  Note that,
- * because both the granularity and the cluster size are powers of two,
- * the number of sectors to copy cannot exceed one cluster.
- *
- * We also want to extend the QEMUIOVector to include more adjacent
- * dirty blocks if possible, to limit the number of I/O operations and
- * run efficiently even

Re: [Qemu-devel] [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking

2016-01-05 Thread Michael S. Tsirkin

On Mon, Jan 04, 2016 at 07:11:25PM -0800, Alexander Duyck wrote:
> >> The two mechanisms referenced above would likely require coordination with
> >> QEMU and as such are open to discussion.  I haven't attempted to address
> >> them as I am not sure there is a consensus as of yet.  My personal
> >> preference would be to add a vendor-specific configuration block to the
> >> emulated pci-bridge interfaces created by QEMU that would allow us to
> >> essentially extend shpc to support guest live migration with pass-through
> >> devices.
> >
> > shpc?
> 
> That is kind of what I was thinking.  We basically need some mechanism
> to allow for the host to ask the device to quiesce.  It has been
> proposed to possibly even look at something like an ACPI interface
> since I know ACPI is used by QEMU to manage hot-plug in the standard
> case.
> 
> - Alex

Start by using hot-unplug for this!

Really use your patch guest side, and write host side
to allow starting migration with the device, but
defer completing it.

So

1.- host tells guest to start tracking memory writes
2.- guest acks
3.- migration starts
4.- most memory is migrated
5.- host tells guest to eject device
6.- guest acks
7.- stop vm and migrate rest of state

It will already be a win since hot unplug after migration starts and
most memory has been migrated is better than hot unplug before migration
starts.

Then measure downtime and profile. Then we can look at ways
to quiesce device faster which really means step 5 is replaced
with "host tells guest to quiesce device and dirty (or just unmap!)
all memory mapped for write by device".

-- 
MST

Re: [Qemu-devel] [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking

2016-01-05 Thread Dr. David Alan Gilbert

* Michael S. Tsirkin (m...@redhat.com) wrote:
> On Mon, Jan 04, 2016 at 07:11:25PM -0800, Alexander Duyck wrote:
> > >> The two mechanisms referenced above would likely require coordination 
> > >> with
> > >> QEMU and as such are open to discussion.  I haven't attempted to address
> > >> them as I am not sure there is a consensus as of yet.  My personal
> > >> preference would be to add a vendor-specific configuration block to the
> > >> emulated pci-bridge interfaces created by QEMU that would allow us to
> > >> essentially extend shpc to support guest live migration with pass-through
> > >> devices.
> > >
> > > shpc?
> > 
> > That is kind of what I was thinking.  We basically need some mechanism
> > to allow for the host to ask the device to quiesce.  It has been
> > proposed to possibly even look at something like an ACPI interface
> > since I know ACPI is used by QEMU to manage hot-plug in the standard
> > case.
> > 
> > - Alex
> 
> 
> Start by using hot-unplug for this!
> 
> Really use your patch guest side, and write host side
> to allow starting migration with the device, but
> defer completing it.
> 
> So
> 
> 1.- host tells guest to start tracking memory writes
> 2.- guest acks
> 3.- migration starts
> 4.- most memory is migrated
> 5.- host tells guest to eject device
> 6.- guest acks
> 7.- stop vm and migrate rest of state
> 
> 
> It will already be a win since hot unplug after migration starts and
> most memory has been migrated is better than hot unplug before migration
> starts.
> 
> Then measure downtime and profile. Then we can look at ways
> to quiesce device faster which really means step 5 is replaced
> with "host tells guest to quiesce device and dirty (or just unmap!)
> all memory mapped for write by device".


Doing a hot-unplug is going to upset the guests network stacks view
of the world; that's something we don't want to change.

Dave

> 
> -- 
> MST
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

[Qemu-devel] [Bug 1529859] Re: qemu 2.5.0 ivshmem segfault with msi=off option

2016-01-05 Thread maquefel

>See previously posted patch:
> http://lists.gnu.org/archive/html/qemu-stable/2015-12/msg00034.html

This patch resolves issue. I can confirm it works.

** Changed in: qemu
   Status: New => Fix Committed

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1529859

Title:
  qemu 2.5.0 ivshmem segfault with msi=off option

Status in QEMU:
  Fix Committed

Bug description:
  Launching qemu with "-device ivshmem,chardev=ivshmemid,msi=off
  -chardev socket,path=/tmp/ivshmem_socket,id=ivshmemid"

  Causes segfault because, s->msi_vectors is not initialized and
  s->msi_vectors == 0.

  Does ivshmem exactly need this line ? :

  s->msi_vectors[vector].pdev = pdev;

  It makes no sence for me.

  Subject: [PATCH] fixed ivshmem empty msi vector on msi=off segfault

  ---
   hw/misc/ivshmem.c | 9 -
   1 file changed, 4 insertions(+), 5 deletions(-)

  diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
  index f73f0c2..2087d5e 100644
  --- a/hw/misc/ivshmem.c
  +++ b/hw/misc/ivshmem.c
  @@ -359,8 +359,6 @@ static CharDriverState* create_eventfd_chr_device(void * 
opaque, EventNotifier *
   int eventfd = event_notifier_get_fd(n);
   CharDriverState *chr;
   
  -s->msi_vectors[vector].pdev = pdev;
  -
   chr = qemu_chr_open_eventfd(eventfd);
   
   if (chr == NULL) {
  @@ -1038,10 +1036,11 @@ static void pci_ivshmem_exit(PCIDevice *dev)
   }
   
   if (ivshmem_has_feature(s, IVSHMEM_MSI)) {
  -msix_uninit_exclusive_bar(dev);
  +msix_uninit_exclusive_bar(dev);
   }
  -
  -g_free(s->msi_vectors);
  +
  +if(s->msi_vectors)
  +   g_free(s->msi_vectors);
   }
   
   static bool test_msix(void *opaque, int version_id)
  -- 
  2.3.6

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1529859/+subscriptions

Re: [Qemu-devel] [PATCH 0/3] sun4u: timer fixups

2016-01-05 Thread Artyom Tarasenko

Looks good.

Reviewed-By: Artyom Tarasenko 


On Fri, Nov 13, 2015 at 6:54 PM, Mark Cave-Ayland
 wrote:
> Whilst trying to boot FreeBSD SPARC64, it became apparent that the existing
> timer code doesn't behave correctly with respect to the NPT and INT_DIS
> bits. This patchset corrects this behaviour and allows FreeBSD SPARC64 boot
> to progress further.
>
> Many thanks to Marius Strobl  for pointing me in
> the right direction.
>
> Signed-off-by: Mark Cave-Ayland 
>
> Mark Cave-Ayland (3):
>   sun4u: split out NPT and INT_DIS into separate CPUTimer fields
>   sun4u: split NPT and INT_DIS accesses between timer and compare
> registers
>   target-sparc: implement NPT timer bit
>
>  hw/sparc64/sun4u.c   |   36 +++-
>  target-sparc/cpu.h   |2 ++
>  target-sparc/helper.c|   10 --
>  target-sparc/helper.h|2 +-
>  target-sparc/translate.c |   18 +++---
>  5 files changed, 49 insertions(+), 19 deletions(-)
>
> --
> 1.7.10.4
>



-- 
Regards,
Artyom Tarasenko

SPARC and PPC PReP under qemu blog: http://tyom.blogspot.com/search/label/qemu

Re: [Qemu-devel] [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking

2016-01-05 Thread Michael S. Tsirkin

On Tue, Jan 05, 2016 at 10:01:04AM +, Dr. David Alan Gilbert wrote:
> * Michael S. Tsirkin (m...@redhat.com) wrote:
> > On Mon, Jan 04, 2016 at 07:11:25PM -0800, Alexander Duyck wrote:
> > > >> The two mechanisms referenced above would likely require coordination 
> > > >> with
> > > >> QEMU and as such are open to discussion.  I haven't attempted to 
> > > >> address
> > > >> them as I am not sure there is a consensus as of yet.  My personal
> > > >> preference would be to add a vendor-specific configuration block to the
> > > >> emulated pci-bridge interfaces created by QEMU that would allow us to
> > > >> essentially extend shpc to support guest live migration with 
> > > >> pass-through
> > > >> devices.
> > > >
> > > > shpc?
> > > 
> > > That is kind of what I was thinking.  We basically need some mechanism
> > > to allow for the host to ask the device to quiesce.  It has been
> > > proposed to possibly even look at something like an ACPI interface
> > > since I know ACPI is used by QEMU to manage hot-plug in the standard
> > > case.
> > > 
> > > - Alex
> > 
> > 
> > Start by using hot-unplug for this!
> > 
> > Really use your patch guest side, and write host side
> > to allow starting migration with the device, but
> > defer completing it.
> > 
> > So
> > 
> > 1.- host tells guest to start tracking memory writes
> > 2.- guest acks
> > 3.- migration starts
> > 4.- most memory is migrated
> > 5.- host tells guest to eject device
> > 6.- guest acks
> > 7.- stop vm and migrate rest of state
> > 
> > 
> > It will already be a win since hot unplug after migration starts and
> > most memory has been migrated is better than hot unplug before migration
> > starts.
> > 
> > Then measure downtime and profile. Then we can look at ways
> > to quiesce device faster which really means step 5 is replaced
> > with "host tells guest to quiesce device and dirty (or just unmap!)
> > all memory mapped for write by device".
> 
> 
> Doing a hot-unplug is going to upset the guests network stacks view
> of the world; that's something we don't want to change.
> 
> Dave

It might but if you store the IP and restore it quickly
after migration e.g. using guest agent, as opposed to DHCP,
then it won't.

It allows calming the device down in a generic way,
specific drivers can then implement the fast quiesce.

> > 
> > -- 
> > MST
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH v2 1/4] Add Error **errp for xen_host_pci_device_get()

2016-01-05 Thread Stefano Stabellini

On Tue, 5 Jan 2016, Cao jin wrote:
> On 01/04/2016 11:15 PM, Stefano Stabellini wrote:
> > On Sun, 27 Dec 2015, Cao jin wrote:
> > > To catch the error msg. Also modify the caller
> > > 
> > > Signed-off-by: Cao jin 
> > 
> > This looks much better, thanks.
> > 
> > 
> [...]
> > > 
> > > -int xen_host_pci_device_get(XenHostPCIDevice *d, uint16_t domain,
> > > -uint8_t bus, uint8_t dev, uint8_t func)
> > > +void xen_host_pci_device_get(XenHostPCIDevice *d, uint16_t domain,
> > > +uint8_t bus, uint8_t dev, uint8_t func,
> > > +Error **errp)
> > >   {
> > >   unsigned int v;
> > > -int rc = 0;
> > > 
> > >   d->config_fd = -1;
> > >   d->domain = domain;
> > > @@ -353,43 +360,48 @@ int xen_host_pci_device_get(XenHostPCIDevice *d,
> > > uint16_t domain,
> > >   d->dev = dev;
> > >   d->func = func;
> > > 
> > > -rc = xen_host_pci_config_open(d);
> > > -if (rc) {
> > > +xen_host_pci_config_open(d, errp);
> > > +if (*errp) {
> > 
> > I think that errp could be NULL, therefore the right way to do this is:
> > 
> >  Error *err = NULL;
> >  foo(arg, &err);
> >  if (err) {
> >  handle the error...
> >  error_propagate(errp, err);
> >  }
> > 
> > see the comment at the beginning of include/qapi/error.h.
> > 
> 
> Hi stefano,
> 
> I read that comment, and find something maybe new:
> 
> "errp could be NULL", I think it is saying, if we are in a .realize()
> function, yes, *errp* maybe NULL, but reality is, here is the callee of
> .realize(), and we defined a local variable: Error *local_err = NULL in
> .realize() and passed it to all the callee, so, theoretically *errp* won`t be
> NULL.

This is true, however I think that relying on it is error prone: in a
couple of years from now somebody might change the call sequence without
updating the error handling (easy to forget), causing QEMU to crash on
error. I think it is safer not to rely on errp != NULL.


> so the way you said above is suitable in .realize() IMHO, and I also did
> it in that way.
> 
> comment also says:
> 
>  * Receive an error and pass it on to the caller:
>  * Error *err = NULL;
>  * foo(arg, &err);
>  * if (err) {
>  * handle the error...
>  * error_propagate(errp, err);
>  * }
>  * where Error **errp is a parameter, by convention the last one.
> 
> If I understand the last sentence well, the Error **errp in .realize()
> prototype is *the last one*, so we could call error_propagate(errp, err) only
> in .realize()
> 
> The comment also says:
> 
>  * But when all you do with the error is pass it on, please use
>  * foo(arg, errp);
>  * for readability."
> 
> We just pass error on in all the callees, so I guess I also did as comment
> suggest?
> 
> How do you think?
 
I think we only need to use a local Error variable when we want to check
for the returned error, in cases such as:

if (*errp) {

In other cases, when we are not interested in *errp, we can simply
propagate the error, like you have done in your patches.

Re: [Qemu-devel] [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking

2016-01-05 Thread Dr. David Alan Gilbert

* Michael S. Tsirkin (m...@redhat.com) wrote:
> On Tue, Jan 05, 2016 at 10:01:04AM +, Dr. David Alan Gilbert wrote:
> > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > On Mon, Jan 04, 2016 at 07:11:25PM -0800, Alexander Duyck wrote:
> > > > >> The two mechanisms referenced above would likely require 
> > > > >> coordination with
> > > > >> QEMU and as such are open to discussion.  I haven't attempted to 
> > > > >> address
> > > > >> them as I am not sure there is a consensus as of yet.  My personal
> > > > >> preference would be to add a vendor-specific configuration block to 
> > > > >> the
> > > > >> emulated pci-bridge interfaces created by QEMU that would allow us to
> > > > >> essentially extend shpc to support guest live migration with 
> > > > >> pass-through
> > > > >> devices.
> > > > >
> > > > > shpc?
> > > > 
> > > > That is kind of what I was thinking.  We basically need some mechanism
> > > > to allow for the host to ask the device to quiesce.  It has been
> > > > proposed to possibly even look at something like an ACPI interface
> > > > since I know ACPI is used by QEMU to manage hot-plug in the standard
> > > > case.
> > > > 
> > > > - Alex
> > > 
> > > 
> > > Start by using hot-unplug for this!
> > > 
> > > Really use your patch guest side, and write host side
> > > to allow starting migration with the device, but
> > > defer completing it.
> > > 
> > > So
> > > 
> > > 1.- host tells guest to start tracking memory writes
> > > 2.- guest acks
> > > 3.- migration starts
> > > 4.- most memory is migrated
> > > 5.- host tells guest to eject device
> > > 6.- guest acks
> > > 7.- stop vm and migrate rest of state
> > > 
> > > 
> > > It will already be a win since hot unplug after migration starts and
> > > most memory has been migrated is better than hot unplug before migration
> > > starts.
> > > 
> > > Then measure downtime and profile. Then we can look at ways
> > > to quiesce device faster which really means step 5 is replaced
> > > with "host tells guest to quiesce device and dirty (or just unmap!)
> > > all memory mapped for write by device".
> > 
> > 
> > Doing a hot-unplug is going to upset the guests network stacks view
> > of the world; that's something we don't want to change.
> > 
> > Dave
> 
> It might but if you store the IP and restore it quickly
> after migration e.g. using guest agent, as opposed to DHCP,
> then it won't.

I thought if you hot-unplug then it will lose any outstanding connections
on that device.

> It allows calming the device down in a generic way,
> specific drivers can then implement the fast quiesce.

Except that if it breaks the guest networking it's useless.

Dave

> 
> > > 
> > > -- 
> > > MST
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

[Qemu-devel] [PATCH] PCI Bonito: QOMify

2016-01-05 Thread Cao jin

Also clear the code

Signed-off-by: Cao jin 
---
Since this file don`t have maintainer & it should be simple enough for
qemu-trival, so cc: qemu-trival.

 hw/pci-host/bonito.c | 23 ---
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/hw/pci-host/bonito.c b/hw/pci-host/bonito.c
index 4139a2c..b477679 100644
--- a/hw/pci-host/bonito.c
+++ b/hw/pci-host/bonito.c
@@ -180,8 +180,6 @@
 #define PCI_ADDR(busno,devno,funno,regno)  \
 busno)<<16)&0xff) + (((devno)<<11)&0xf800) + (((funno)<<8)&0x700) 
+ (regno))
 
-#define TYPE_BONITO_PCI_HOST_BRIDGE "Bonito-pcihost"
-
 typedef struct BonitoState BonitoState;
 
 typedef struct PCIBonitoState
@@ -215,17 +213,20 @@ typedef struct PCIBonitoState
 
 } PCIBonitoState;
 
-#define BONITO_PCI_HOST_BRIDGE(obj) \
-OBJECT_CHECK(BonitoState, (obj), TYPE_BONITO_PCI_HOST_BRIDGE)
-
 struct BonitoState {
 PCIHostState parent_obj;
-
 qemu_irq *pic;
-
 PCIBonitoState *pci_dev;
 };
 
+#define TYPE_BONITO_PCI_HOST_BRIDGE "Bonito-pcihost"
+#define BONITO_PCI_HOST_BRIDGE(obj) \
+OBJECT_CHECK(BonitoState, (obj), TYPE_BONITO_PCI_HOST_BRIDGE)
+
+#define TYPE_PCI_BONITO "Bonito"
+#define PCI_BONITO(obj) \
+OBJECT_CHECK(PCIBonitoState, (obj), TYPE_PCI_BONITO)
+
 static void bonito_writel(void *opaque, hwaddr addr,
   uint64_t val, unsigned size)
 {
@@ -723,7 +724,7 @@ static int bonito_pcihost_initfn(SysBusDevice *dev)
 
 static void bonito_realize(PCIDevice *dev, Error **errp)
 {
-PCIBonitoState *s = DO_UPCAST(PCIBonitoState, dev, dev);
+PCIBonitoState *s = PCI_BONITO(dev);
 SysBusDevice *sysbus = SYS_BUS_DEVICE(s->pcihost);
 PCIHostState *phb = PCI_HOST_BRIDGE(s->pcihost);
 
@@ -799,8 +800,8 @@ PCIBus *bonito_init(qemu_irq *pic)
 qdev_init_nofail(dev);
 
 /* set the pcihost pointer before bonito_initfn is called */
-d = pci_create(phb->bus, PCI_DEVFN(0, 0), "Bonito");
-s = DO_UPCAST(PCIBonitoState, dev, d);
+d = pci_create(phb->bus, PCI_DEVFN(0, 0), TYPE_PCI_BONITO);
+s = PCI_BONITO(d);
 s->pcihost = pcihost;
 pcihost->pci_dev = s;
 qdev_init_nofail(DEVICE(d));
@@ -828,7 +829,7 @@ static void bonito_class_init(ObjectClass *klass, void 
*data)
 }
 
 static const TypeInfo bonito_info = {
-.name  = "Bonito",
+.name  = TYPE_PCI_BONITO,
 .parent= TYPE_PCI_DEVICE,
 .instance_size = sizeof(PCIBonitoState),
 .class_init= bonito_class_init,
-- 
2.1.0

Re: [Qemu-devel] [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking

2016-01-05 Thread Michael S. Tsirkin

On Tue, Jan 05, 2016 at 10:45:25AM +, Dr. David Alan Gilbert wrote:
> * Michael S. Tsirkin (m...@redhat.com) wrote:
> > On Tue, Jan 05, 2016 at 10:01:04AM +, Dr. David Alan Gilbert wrote:
> > > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > > On Mon, Jan 04, 2016 at 07:11:25PM -0800, Alexander Duyck wrote:
> > > > > >> The two mechanisms referenced above would likely require 
> > > > > >> coordination with
> > > > > >> QEMU and as such are open to discussion.  I haven't attempted to 
> > > > > >> address
> > > > > >> them as I am not sure there is a consensus as of yet.  My personal
> > > > > >> preference would be to add a vendor-specific configuration block 
> > > > > >> to the
> > > > > >> emulated pci-bridge interfaces created by QEMU that would allow us 
> > > > > >> to
> > > > > >> essentially extend shpc to support guest live migration with 
> > > > > >> pass-through
> > > > > >> devices.
> > > > > >
> > > > > > shpc?
> > > > > 
> > > > > That is kind of what I was thinking.  We basically need some mechanism
> > > > > to allow for the host to ask the device to quiesce.  It has been
> > > > > proposed to possibly even look at something like an ACPI interface
> > > > > since I know ACPI is used by QEMU to manage hot-plug in the standard
> > > > > case.
> > > > > 
> > > > > - Alex
> > > > 
> > > > 
> > > > Start by using hot-unplug for this!
> > > > 
> > > > Really use your patch guest side, and write host side
> > > > to allow starting migration with the device, but
> > > > defer completing it.
> > > > 
> > > > So
> > > > 
> > > > 1.- host tells guest to start tracking memory writes
> > > > 2.- guest acks
> > > > 3.- migration starts
> > > > 4.- most memory is migrated
> > > > 5.- host tells guest to eject device
> > > > 6.- guest acks
> > > > 7.- stop vm and migrate rest of state
> > > > 
> > > > 
> > > > It will already be a win since hot unplug after migration starts and
> > > > most memory has been migrated is better than hot unplug before migration
> > > > starts.
> > > > 
> > > > Then measure downtime and profile. Then we can look at ways
> > > > to quiesce device faster which really means step 5 is replaced
> > > > with "host tells guest to quiesce device and dirty (or just unmap!)
> > > > all memory mapped for write by device".
> > > 
> > > 
> > > Doing a hot-unplug is going to upset the guests network stacks view
> > > of the world; that's something we don't want to change.
> > > 
> > > Dave
> > 
> > It might but if you store the IP and restore it quickly
> > after migration e.g. using guest agent, as opposed to DHCP,
> > then it won't.
> 
> I thought if you hot-unplug then it will lose any outstanding connections
> on that device.
> 
> > It allows calming the device down in a generic way,
> > specific drivers can then implement the fast quiesce.
> 
> Except that if it breaks the guest networking it's useless.
> 
> Dave

Is hot unplug useless then?

> > 
> > > > 
> > > > -- 
> > > > MST
> > > --
> > > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking

2016-01-05 Thread Dr. David Alan Gilbert

* Michael S. Tsirkin (m...@redhat.com) wrote:
> On Tue, Jan 05, 2016 at 10:45:25AM +, Dr. David Alan Gilbert wrote:
> > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > On Tue, Jan 05, 2016 at 10:01:04AM +, Dr. David Alan Gilbert wrote:
> > > > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > > > On Mon, Jan 04, 2016 at 07:11:25PM -0800, Alexander Duyck wrote:
> > > > > > >> The two mechanisms referenced above would likely require 
> > > > > > >> coordination with
> > > > > > >> QEMU and as such are open to discussion.  I haven't attempted to 
> > > > > > >> address
> > > > > > >> them as I am not sure there is a consensus as of yet.  My 
> > > > > > >> personal
> > > > > > >> preference would be to add a vendor-specific configuration block 
> > > > > > >> to the
> > > > > > >> emulated pci-bridge interfaces created by QEMU that would allow 
> > > > > > >> us to
> > > > > > >> essentially extend shpc to support guest live migration with 
> > > > > > >> pass-through
> > > > > > >> devices.
> > > > > > >
> > > > > > > shpc?
> > > > > > 
> > > > > > That is kind of what I was thinking.  We basically need some 
> > > > > > mechanism
> > > > > > to allow for the host to ask the device to quiesce.  It has been
> > > > > > proposed to possibly even look at something like an ACPI interface
> > > > > > since I know ACPI is used by QEMU to manage hot-plug in the standard
> > > > > > case.
> > > > > > 
> > > > > > - Alex
> > > > > 
> > > > > 
> > > > > Start by using hot-unplug for this!
> > > > > 
> > > > > Really use your patch guest side, and write host side
> > > > > to allow starting migration with the device, but
> > > > > defer completing it.
> > > > > 
> > > > > So
> > > > > 
> > > > > 1.- host tells guest to start tracking memory writes
> > > > > 2.- guest acks
> > > > > 3.- migration starts
> > > > > 4.- most memory is migrated
> > > > > 5.- host tells guest to eject device
> > > > > 6.- guest acks
> > > > > 7.- stop vm and migrate rest of state
> > > > > 
> > > > > 
> > > > > It will already be a win since hot unplug after migration starts and
> > > > > most memory has been migrated is better than hot unplug before 
> > > > > migration
> > > > > starts.
> > > > > 
> > > > > Then measure downtime and profile. Then we can look at ways
> > > > > to quiesce device faster which really means step 5 is replaced
> > > > > with "host tells guest to quiesce device and dirty (or just unmap!)
> > > > > all memory mapped for write by device".
> > > > 
> > > > 
> > > > Doing a hot-unplug is going to upset the guests network stacks view
> > > > of the world; that's something we don't want to change.
> > > > 
> > > > Dave
> > > 
> > > It might but if you store the IP and restore it quickly
> > > after migration e.g. using guest agent, as opposed to DHCP,
> > > then it won't.
> > 
> > I thought if you hot-unplug then it will lose any outstanding connections
> > on that device.
> > 
> > > It allows calming the device down in a generic way,
> > > specific drivers can then implement the fast quiesce.
> > 
> > Except that if it breaks the guest networking it's useless.
> > 
> > Dave
> 
> Is hot unplug useless then?

As a migration hack, yes, unless it's paired with a second network device
as a redundent route.
To do what's being suggested here it's got to be done at the device level
and not visible to the networking stack.

Dave

> 
> > > 
> > > > > 
> > > > > -- 
> > > > > MST
> > > > --
> > > > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking

2016-01-05 Thread Michael S. Tsirkin

On Tue, Jan 05, 2016 at 10:45:25AM +, Dr. David Alan Gilbert wrote:
> * Michael S. Tsirkin (m...@redhat.com) wrote:
> > On Tue, Jan 05, 2016 at 10:01:04AM +, Dr. David Alan Gilbert wrote:
> > > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > > On Mon, Jan 04, 2016 at 07:11:25PM -0800, Alexander Duyck wrote:
> > > > > >> The two mechanisms referenced above would likely require 
> > > > > >> coordination with
> > > > > >> QEMU and as such are open to discussion.  I haven't attempted to 
> > > > > >> address
> > > > > >> them as I am not sure there is a consensus as of yet.  My personal
> > > > > >> preference would be to add a vendor-specific configuration block 
> > > > > >> to the
> > > > > >> emulated pci-bridge interfaces created by QEMU that would allow us 
> > > > > >> to
> > > > > >> essentially extend shpc to support guest live migration with 
> > > > > >> pass-through
> > > > > >> devices.
> > > > > >
> > > > > > shpc?
> > > > > 
> > > > > That is kind of what I was thinking.  We basically need some mechanism
> > > > > to allow for the host to ask the device to quiesce.  It has been
> > > > > proposed to possibly even look at something like an ACPI interface
> > > > > since I know ACPI is used by QEMU to manage hot-plug in the standard
> > > > > case.
> > > > > 
> > > > > - Alex
> > > > 
> > > > 
> > > > Start by using hot-unplug for this!
> > > > 
> > > > Really use your patch guest side, and write host side
> > > > to allow starting migration with the device, but
> > > > defer completing it.
> > > > 
> > > > So
> > > > 
> > > > 1.- host tells guest to start tracking memory writes
> > > > 2.- guest acks
> > > > 3.- migration starts
> > > > 4.- most memory is migrated
> > > > 5.- host tells guest to eject device
> > > > 6.- guest acks
> > > > 7.- stop vm and migrate rest of state
> > > > 
> > > > 
> > > > It will already be a win since hot unplug after migration starts and
> > > > most memory has been migrated is better than hot unplug before migration
> > > > starts.
> > > > 
> > > > Then measure downtime and profile. Then we can look at ways
> > > > to quiesce device faster which really means step 5 is replaced
> > > > with "host tells guest to quiesce device and dirty (or just unmap!)
> > > > all memory mapped for write by device".
> > > 
> > > 
> > > Doing a hot-unplug is going to upset the guests network stacks view
> > > of the world; that's something we don't want to change.
> > > 
> > > Dave
> > 
> > It might but if you store the IP and restore it quickly
> > after migration e.g. using guest agent, as opposed to DHCP,
> > then it won't.
> 
> I thought if you hot-unplug then it will lose any outstanding connections
> on that device.

Which connections and which device?  TCP connections and an ethernet
device?  These are on different layers so of course you don't lose them.
Just do not change the IP address.

Some guests send a signal to applications to close connections
when all links go down. One can work around this
in a variety of ways.

> > It allows calming the device down in a generic way,
> > specific drivers can then implement the fast quiesce.
> 
> Except that if it breaks the guest networking it's useless.
> 
> Dave
> 
> > 
> > > > 
> > > > -- 
> > > > MST
> > > --
> > > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking

2016-01-05 Thread Michael S. Tsirkin

On Tue, Jan 05, 2016 at 12:59:54PM +0200, Michael S. Tsirkin wrote:
> On Tue, Jan 05, 2016 at 10:45:25AM +, Dr. David Alan Gilbert wrote:
> > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > On Tue, Jan 05, 2016 at 10:01:04AM +, Dr. David Alan Gilbert wrote:
> > > > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > > > On Mon, Jan 04, 2016 at 07:11:25PM -0800, Alexander Duyck wrote:
> > > > > > >> The two mechanisms referenced above would likely require 
> > > > > > >> coordination with
> > > > > > >> QEMU and as such are open to discussion.  I haven't attempted to 
> > > > > > >> address
> > > > > > >> them as I am not sure there is a consensus as of yet.  My 
> > > > > > >> personal
> > > > > > >> preference would be to add a vendor-specific configuration block 
> > > > > > >> to the
> > > > > > >> emulated pci-bridge interfaces created by QEMU that would allow 
> > > > > > >> us to
> > > > > > >> essentially extend shpc to support guest live migration with 
> > > > > > >> pass-through
> > > > > > >> devices.
> > > > > > >
> > > > > > > shpc?
> > > > > > 
> > > > > > That is kind of what I was thinking.  We basically need some 
> > > > > > mechanism
> > > > > > to allow for the host to ask the device to quiesce.  It has been
> > > > > > proposed to possibly even look at something like an ACPI interface
> > > > > > since I know ACPI is used by QEMU to manage hot-plug in the standard
> > > > > > case.
> > > > > > 
> > > > > > - Alex
> > > > > 
> > > > > 
> > > > > Start by using hot-unplug for this!
> > > > > 
> > > > > Really use your patch guest side, and write host side
> > > > > to allow starting migration with the device, but
> > > > > defer completing it.
> > > > > 
> > > > > So
> > > > > 
> > > > > 1.- host tells guest to start tracking memory writes
> > > > > 2.- guest acks
> > > > > 3.- migration starts
> > > > > 4.- most memory is migrated
> > > > > 5.- host tells guest to eject device
> > > > > 6.- guest acks
> > > > > 7.- stop vm and migrate rest of state
> > > > > 
> > > > > 
> > > > > It will already be a win since hot unplug after migration starts and
> > > > > most memory has been migrated is better than hot unplug before 
> > > > > migration
> > > > > starts.
> > > > > 
> > > > > Then measure downtime and profile. Then we can look at ways
> > > > > to quiesce device faster which really means step 5 is replaced
> > > > > with "host tells guest to quiesce device and dirty (or just unmap!)
> > > > > all memory mapped for write by device".
> > > > 
> > > > 
> > > > Doing a hot-unplug is going to upset the guests network stacks view
> > > > of the world; that's something we don't want to change.
> > > > 
> > > > Dave
> > > 
> > > It might but if you store the IP and restore it quickly
> > > after migration e.g. using guest agent, as opposed to DHCP,
> > > then it won't.
> > 
> > I thought if you hot-unplug then it will lose any outstanding connections
> > on that device.
> > 
> > > It allows calming the device down in a generic way,
> > > specific drivers can then implement the fast quiesce.
> > 
> > Except that if it breaks the guest networking it's useless.
> > 
> > Dave
> 
> Is hot unplug useless then?

Actually I misunderstood the question, unplug does not
have to break guest networking.

> > > 
> > > > > 
> > > > > -- 
> > > > > MST
> > > > --
> > > > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking

2016-01-05 Thread Michael S. Tsirkin

On Tue, Jan 05, 2016 at 11:03:38AM +, Dr. David Alan Gilbert wrote:
> * Michael S. Tsirkin (m...@redhat.com) wrote:
> > On Tue, Jan 05, 2016 at 10:45:25AM +, Dr. David Alan Gilbert wrote:
> > > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > > On Tue, Jan 05, 2016 at 10:01:04AM +, Dr. David Alan Gilbert wrote:
> > > > > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > > > > On Mon, Jan 04, 2016 at 07:11:25PM -0800, Alexander Duyck wrote:
> > > > > > > >> The two mechanisms referenced above would likely require 
> > > > > > > >> coordination with
> > > > > > > >> QEMU and as such are open to discussion.  I haven't attempted 
> > > > > > > >> to address
> > > > > > > >> them as I am not sure there is a consensus as of yet.  My 
> > > > > > > >> personal
> > > > > > > >> preference would be to add a vendor-specific configuration 
> > > > > > > >> block to the
> > > > > > > >> emulated pci-bridge interfaces created by QEMU that would 
> > > > > > > >> allow us to
> > > > > > > >> essentially extend shpc to support guest live migration with 
> > > > > > > >> pass-through
> > > > > > > >> devices.
> > > > > > > >
> > > > > > > > shpc?
> > > > > > > 
> > > > > > > That is kind of what I was thinking.  We basically need some 
> > > > > > > mechanism
> > > > > > > to allow for the host to ask the device to quiesce.  It has been
> > > > > > > proposed to possibly even look at something like an ACPI interface
> > > > > > > since I know ACPI is used by QEMU to manage hot-plug in the 
> > > > > > > standard
> > > > > > > case.
> > > > > > > 
> > > > > > > - Alex
> > > > > > 
> > > > > > 
> > > > > > Start by using hot-unplug for this!
> > > > > > 
> > > > > > Really use your patch guest side, and write host side
> > > > > > to allow starting migration with the device, but
> > > > > > defer completing it.
> > > > > > 
> > > > > > So
> > > > > > 
> > > > > > 1.- host tells guest to start tracking memory writes
> > > > > > 2.- guest acks
> > > > > > 3.- migration starts
> > > > > > 4.- most memory is migrated
> > > > > > 5.- host tells guest to eject device
> > > > > > 6.- guest acks
> > > > > > 7.- stop vm and migrate rest of state
> > > > > > 
> > > > > > 
> > > > > > It will already be a win since hot unplug after migration starts and
> > > > > > most memory has been migrated is better than hot unplug before 
> > > > > > migration
> > > > > > starts.
> > > > > > 
> > > > > > Then measure downtime and profile. Then we can look at ways
> > > > > > to quiesce device faster which really means step 5 is replaced
> > > > > > with "host tells guest to quiesce device and dirty (or just unmap!)
> > > > > > all memory mapped for write by device".
> > > > > 
> > > > > 
> > > > > Doing a hot-unplug is going to upset the guests network stacks view
> > > > > of the world; that's something we don't want to change.
> > > > > 
> > > > > Dave
> > > > 
> > > > It might but if you store the IP and restore it quickly
> > > > after migration e.g. using guest agent, as opposed to DHCP,
> > > > then it won't.
> > > 
> > > I thought if you hot-unplug then it will lose any outstanding connections
> > > on that device.
> > > 
> > > > It allows calming the device down in a generic way,
> > > > specific drivers can then implement the fast quiesce.
> > > 
> > > Except that if it breaks the guest networking it's useless.
> > > 
> > > Dave
> > 
> > Is hot unplug useless then?
> 
> As a migration hack, yes,

Based on a premise that it breaks connections but it does not
have to.

> unless it's paired with a second network device
> as a redundent route.

You can do this too.

But this is not a must at all.

> To do what's being suggested here it's got to be done at the device level
> and not visible to the networking stack.
> 
> Dave

Need for this was never demonstrated.

> > 
> > > > 
> > > > > > 
> > > > > > -- 
> > > > > > MST
> > > > > --
> > > > > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> > > --
> > > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

[Qemu-devel] [PATCH v3 04/11] igd: switch TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE to realize

2016-01-05 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 hw/pci-host/igd.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c
index ef0273b..d1eeafb 100644
--- a/hw/pci-host/igd.c
+++ b/hw/pci-host/igd.c
@@ -53,7 +53,7 @@ out:
 return ret;
 }
 
-static int igd_pt_i440fx_initfn(struct PCIDevice *pci_dev)
+static void igd_pt_i440fx_realize(PCIDevice *pci_dev, Error **errp)
 {
 uint32_t val = 0;
 int rc, i, num;
@@ -65,12 +65,11 @@ static int igd_pt_i440fx_initfn(struct PCIDevice *pci_dev)
 len = igd_host_bridge_infos[i].len;
 rc = host_pci_config_read(pos, len, val);
 if (rc) {
-return -ENODEV;
+error_setg(errp, "failed to read host config");
+return;
 }
 pci_default_write_config(pci_dev, pos, val, len);
 }
-
-return 0;
 }
 
 static void igd_passthrough_i440fx_class_init(ObjectClass *klass, void *data)
@@ -78,7 +77,7 @@ static void igd_passthrough_i440fx_class_init(ObjectClass 
*klass, void *data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
 
-k->init = igd_pt_i440fx_initfn;
+k->realize = igd_pt_i440fx_realize;
 dc->desc = "IGD Passthrough Host bridge";
 }
 
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 06/11] igd: use defines for standard pci config space offsets

2016-01-05 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 hw/pci-host/igd.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c
index 6f52ab1..0784128 100644
--- a/hw/pci-host/igd.c
+++ b/hw/pci-host/igd.c
@@ -10,9 +10,9 @@ typedef struct {
 
 /* Here we just expose minimal host bridge offset subset. */
 static const IGDHostInfo igd_host_bridge_infos[] = {
-{0x08, 2},  /* revision id */
-{0x2c, 2},  /* sybsystem vendor id */
-{0x2e, 2},  /* sybsystem id */
+{PCI_REVISION_ID, 2},
+{PCI_SUBSYSTEM_VENDOR_ID, 2},
+{PCI_SUBSYSTEM_ID,2},
 {0x50, 2},  /* SNB: processor graphics control register */
 {0x52, 2},  /* processor graphics control register */
 {0xa4, 4},  /* SNB: graphics base of stolen memory */
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 02/11] pc: remove has_igd_gfx_passthru global

2016-01-05 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 hw/xen/xen_pt.h |  5 +++--
 vl.c| 10 --
 2 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index 3749711..cdd73ff 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -4,6 +4,7 @@
 #include "qemu-common.h"
 #include "hw/xen/xen_common.h"
 #include "hw/pci/pci.h"
+#include "hw/boards.h"
 #include "xen-host-pci-device.h"
 
 void xen_pt_log(const PCIDevice *d, const char *f, ...) GCC_FMT_ATTR(2, 3);
@@ -322,10 +323,10 @@ extern void *pci_assign_dev_load_option_rom(PCIDevice 
*dev,
 unsigned int domain,
 unsigned int bus, unsigned int 
slot,
 unsigned int function);
-extern bool has_igd_gfx_passthru;
 static inline bool is_igd_vga_passthrough(XenHostPCIDevice *dev)
 {
-return (has_igd_gfx_passthru
+MachineState *machine = MACHINE(qdev_get_machine());
+return (machine->igd_gfx_passthru
 && ((dev->class_code >> 0x8) == PCI_CLASS_DISPLAY_VGA));
 }
 int xen_pt_register_vga_regions(XenHostPCIDevice *dev);
diff --git a/vl.c b/vl.c
index 5aaea77..d4e51ec 100644
--- a/vl.c
+++ b/vl.c
@@ -1365,13 +1365,6 @@ static inline void semihosting_arg_fallback(const char 
*file, const char *cmd)
 }
 }
 
-/* Now we still need this for compatibility with XEN. */
-bool has_igd_gfx_passthru;
-static void igd_gfx_passthru(void)
-{
-has_igd_gfx_passthru = current_machine->igd_gfx_passthru;
-}
-
 /***/
 /* USB devices */
 
@@ -4550,9 +4543,6 @@ int main(int argc, char **argv, char **envp)
 exit(1);
 }
 
-/* Check if IGD GFX passthrough. */
-igd_gfx_passthru();
-
 /* init generic devices */
 if (qemu_opts_foreach(qemu_find_opts("device"),
   device_init_func, NULL, NULL)) {
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 03/11] pc: move igd support code to igd.c

2016-01-05 Thread Gerd Hoffmann

Pure code motion, except for dropping instance_size for
TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE (no need to set,
we can inherit it from TYPE_I440FX_PCI_DEVICE).

Signed-off-by: Gerd Hoffmann 
Acked-by: Stefano Stabellini 
---
 hw/pci-host/Makefile.objs |  3 ++
 hw/pci-host/igd.c | 96 +++
 hw/pci-host/piix.c| 88 ---
 3 files changed, 99 insertions(+), 88 deletions(-)
 create mode 100644 hw/pci-host/igd.c

diff --git a/hw/pci-host/Makefile.objs b/hw/pci-host/Makefile.objs
index 45f1f0e..e341a49 100644
--- a/hw/pci-host/Makefile.objs
+++ b/hw/pci-host/Makefile.objs
@@ -11,6 +11,9 @@ common-obj-$(CONFIG_PPCE500_PCI) += ppce500.o
 # ARM devices
 common-obj-$(CONFIG_VERSATILE_PCI) += versatile.o
 
+# igd passthrough support
+common-obj-$(CONFIG_LINUX) += igd.o
+
 common-obj-$(CONFIG_PCI_APB) += apb.o
 common-obj-$(CONFIG_FULONG) += bonito.o
 common-obj-$(CONFIG_PCI_PIIX) += piix.o
diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c
new file mode 100644
index 000..ef0273b
--- /dev/null
+++ b/hw/pci-host/igd.c
@@ -0,0 +1,96 @@
+#include "qemu-common.h"
+#include "hw/pci/pci.h"
+#include "hw/i386/pc.h"
+
+/* IGD Passthrough Host Bridge. */
+typedef struct {
+uint8_t offset;
+uint8_t len;
+} IGDHostInfo;
+
+/* Here we just expose minimal host bridge offset subset. */
+static const IGDHostInfo igd_host_bridge_infos[] = {
+{0x08, 2},  /* revision id */
+{0x2c, 2},  /* sybsystem vendor id */
+{0x2e, 2},  /* sybsystem id */
+{0x50, 2},  /* SNB: processor graphics control register */
+{0x52, 2},  /* processor graphics control register */
+{0xa4, 4},  /* SNB: graphics base of stolen memory */
+{0xa8, 4},  /* SNB: base of GTT stolen memory */
+};
+
+static int host_pci_config_read(int pos, int len, uint32_t val)
+{
+char path[PATH_MAX];
+int config_fd;
+ssize_t size = sizeof(path);
+/* Access real host bridge. */
+int rc = snprintf(path, size, "/sys/bus/pci/devices/%04x:%02x:%02x.%d/%s",
+  0, 0, 0, 0, "config");
+int ret = 0;
+
+if (rc >= size || rc < 0) {
+return -ENODEV;
+}
+
+config_fd = open(path, O_RDWR);
+if (config_fd < 0) {
+return -ENODEV;
+}
+
+if (lseek(config_fd, pos, SEEK_SET) != pos) {
+ret = -errno;
+goto out;
+}
+do {
+rc = read(config_fd, (uint8_t *)&val, len);
+} while (rc < 0 && (errno == EINTR || errno == EAGAIN));
+if (rc != len) {
+ret = -errno;
+}
+out:
+close(config_fd);
+return ret;
+}
+
+static int igd_pt_i440fx_initfn(struct PCIDevice *pci_dev)
+{
+uint32_t val = 0;
+int rc, i, num;
+int pos, len;
+
+num = ARRAY_SIZE(igd_host_bridge_infos);
+for (i = 0; i < num; i++) {
+pos = igd_host_bridge_infos[i].offset;
+len = igd_host_bridge_infos[i].len;
+rc = host_pci_config_read(pos, len, val);
+if (rc) {
+return -ENODEV;
+}
+pci_default_write_config(pci_dev, pos, val, len);
+}
+
+return 0;
+}
+
+static void igd_passthrough_i440fx_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+
+k->init = igd_pt_i440fx_initfn;
+dc->desc = "IGD Passthrough Host bridge";
+}
+
+static const TypeInfo igd_passthrough_i440fx_info = {
+.name  = TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE,
+.parent= TYPE_I440FX_PCI_DEVICE,
+.class_init= igd_passthrough_i440fx_class_init,
+};
+
+static void igd_register_types(void)
+{
+type_register_static(&igd_passthrough_i440fx_info);
+}
+
+type_init(igd_register_types)
diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index 715208b..ccacb57 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -744,93 +744,6 @@ static const TypeInfo i440fx_info = {
 .class_init= i440fx_class_init,
 };
 
-/* IGD Passthrough Host Bridge. */
-typedef struct {
-uint8_t offset;
-uint8_t len;
-} IGDHostInfo;
-
-/* Here we just expose minimal host bridge offset subset. */
-static const IGDHostInfo igd_host_bridge_infos[] = {
-{0x08, 2},  /* revision id */
-{0x2c, 2},  /* sybsystem vendor id */
-{0x2e, 2},  /* sybsystem id */
-{0x50, 2},  /* SNB: processor graphics control register */
-{0x52, 2},  /* processor graphics control register */
-{0xa4, 4},  /* SNB: graphics base of stolen memory */
-{0xa8, 4},  /* SNB: base of GTT stolen memory */
-};
-
-static int host_pci_config_read(int pos, int len, uint32_t val)
-{
-char path[PATH_MAX];
-int config_fd;
-ssize_t size = sizeof(path);
-/* Access real host bridge. */
-int rc = snprintf(path, size, "/sys/bus/pci/devices/%04x:%02x:%02x.%d/%s",
-  0, 0, 0, 0, "config");
-int ret = 0;
-
-if (rc >= size || rc < 0) {
-return -ENODEV;
-}
-
-config_fd = open(path, O_RDWR)

[Qemu-devel] [PATCH v3 05/11] igd: TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE: call parent realize

2016-01-05 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 hw/pci-host/igd.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c
index d1eeafb..6f52ab1 100644
--- a/hw/pci-host/igd.c
+++ b/hw/pci-host/igd.c
@@ -53,12 +53,20 @@ out:
 return ret;
 }
 
+static void (*i440fx_realize)(PCIDevice *pci_dev, Error **errp);
 static void igd_pt_i440fx_realize(PCIDevice *pci_dev, Error **errp)
 {
+Error *err = NULL;
 uint32_t val = 0;
 int rc, i, num;
 int pos, len;
 
+i440fx_realize(pci_dev, &err);
+if (err != NULL) {
+error_propagate(errp, err);
+return;
+}
+
 num = ARRAY_SIZE(igd_host_bridge_infos);
 for (i = 0; i < num; i++) {
 pos = igd_host_bridge_infos[i].offset;
@@ -77,6 +85,7 @@ static void igd_passthrough_i440fx_class_init(ObjectClass 
*klass, void *data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
 
+i440fx_realize = k->realize;
 k->realize = igd_pt_i440fx_realize;
 dc->desc = "IGD Passthrough Host bridge";
 }
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 11/11] igd: move igd-passthrough-isa-bridge creation to machine init

2016-01-05 Thread Gerd Hoffmann

This patch moves igd-passthrough-isa-bridge creation out of the xen
passthrough code into machine init.  It is triggered by the
igd-passthru=on machine option.  Advantages:

 * This works for on both xen and kvm.
 * It is activated for the pc machine type only, q35 has a real
   isa bridge on 1f.0 and must be handled differently.  The q35
   plan is https://lkml.org/lkml/2015/11/26/183 (should land in
   the next merge window, i.e. linux 4.5).
 * If we don't need it any more some day (intel is busy removing
   chipset dependencies from the guest driver) we have a single
   machine switch to just turn off all igd passthru chipset
   tweaks.

Signed-off-by: Gerd Hoffmann 
---
 hw/i386/pc_piix.c |  6 ++
 hw/xen/xen_pt.c   | 14 --
 2 files changed, 6 insertions(+), 14 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index f36222e..2afbbd3 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -281,6 +281,12 @@ static void pc_init1(MachineState *machine,
 if (pcmc->pci_enabled) {
 pc_pci_device_init(pci_bus);
 }
+
+#ifdef CONFIG_LINUX
+if (machine->igd_gfx_passthru) {
+igd_passthrough_isa_bridge_create(pci_bus);
+}
+#endif
 }
 
 /* Looking for a pc_compat_2_4() function? It doesn't exist.
diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index 18a7f72..5f626c9 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -685,17 +685,6 @@ static const MemoryListener xen_pt_io_listener = {
 .priority = 10,
 };
 
-static void
-xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s,
-  XenHostPCIDevice *dev)
-{
-uint16_t gpu_dev_id;
-PCIDevice *d = &s->dev;
-
-gpu_dev_id = dev->device_id;
-igd_passthrough_isa_bridge_create(d->bus);
-}
-
 /* destroy. */
 static void xen_pt_destroy(PCIDevice *d) {
 
@@ -810,9 +799,6 @@ static int xen_pt_initfn(PCIDevice *d)
 xen_host_pci_device_put(&s->real_device);
 return -1;
 }
-
-/* Register ISA bridge for passthrough GFX. */
-xen_igd_passthrough_isa_bridge_create(s, &s->real_device);
 }
 
 /* Handle real device's MMIO/PIO BARs */
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 07/11] igd: revamp host config read

2016-01-05 Thread Gerd Hoffmann

Move all work to the host_pci_config_copy helper function,
which we can easily reuse when adding q35 support.
Open sysfs file only once for all values.  Use pread.
Proper error handling.  Fix bugs:

 * Don't throw away results (like old host_pci_config_read
   did because val was passed by value not reference).
 * Update config space directly (writing via
   pci_default_write_config only works for registers
   whitelisted in wmask).

Hmm, this code can hardly ever worked before,
/me wonders what test coverage it had.

With this patch in place igd-passthru=on actually
works, although it still requires root priviledges
because linux refuses to allow non-root users access
pci config space above offset 0x50.

Signed-off-by: Gerd Hoffmann 
---
 hw/pci-host/igd.c | 65 +++
 1 file changed, 27 insertions(+), 38 deletions(-)

diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c
index 0784128..ec48875 100644
--- a/hw/pci-host/igd.c
+++ b/hw/pci-host/igd.c
@@ -19,47 +19,39 @@ static const IGDHostInfo igd_host_bridge_infos[] = {
 {0xa8, 4},  /* SNB: base of GTT stolen memory */
 };
 
-static int host_pci_config_read(int pos, int len, uint32_t val)
+static void host_pci_config_copy(PCIDevice *guest, const char *host,
+ const IGDHostInfo *list, int len, Error 
**errp)
 {
-char path[PATH_MAX];
-int config_fd;
-ssize_t size = sizeof(path);
-/* Access real host bridge. */
-int rc = snprintf(path, size, "/sys/bus/pci/devices/%04x:%02x:%02x.%d/%s",
-  0, 0, 0, 0, "config");
-int ret = 0;
+char *path;
+int config_fd, rc, i;
 
-if (rc >= size || rc < 0) {
-return -ENODEV;
-}
-
-config_fd = open(path, O_RDWR);
+path = g_strdup_printf("/sys/bus/pci/devices/%s/config", host);
+config_fd = open(path, O_RDONLY);
 if (config_fd < 0) {
-return -ENODEV;
+error_setg_file_open(errp, errno, path);
+goto out_free;
 }
 
-if (lseek(config_fd, pos, SEEK_SET) != pos) {
-ret = -errno;
-goto out;
+for (i = 0; i < len; i++) {
+rc = pread(config_fd, guest->config + list[i].offset,
+   list[i].len, list[i].offset);
+if (rc != list[i].len) {
+error_setg_errno(errp, errno, "read %s, offset 0x%x",
+ path, list[i].offset);
+goto out_close;
+}
 }
-do {
-rc = read(config_fd, (uint8_t *)&val, len);
-} while (rc < 0 && (errno == EINTR || errno == EAGAIN));
-if (rc != len) {
-ret = -errno;
-}
-out:
+
+out_close:
 close(config_fd);
-return ret;
+out_free:
+g_free(path);
 }
 
 static void (*i440fx_realize)(PCIDevice *pci_dev, Error **errp);
 static void igd_pt_i440fx_realize(PCIDevice *pci_dev, Error **errp)
 {
 Error *err = NULL;
-uint32_t val = 0;
-int rc, i, num;
-int pos, len;
 
 i440fx_realize(pci_dev, &err);
 if (err != NULL) {
@@ -67,16 +59,13 @@ static void igd_pt_i440fx_realize(PCIDevice *pci_dev, Error 
**errp)
 return;
 }
 
-num = ARRAY_SIZE(igd_host_bridge_infos);
-for (i = 0; i < num; i++) {
-pos = igd_host_bridge_infos[i].offset;
-len = igd_host_bridge_infos[i].len;
-rc = host_pci_config_read(pos, len, val);
-if (rc) {
-error_setg(errp, "failed to read host config");
-return;
-}
-pci_default_write_config(pci_dev, pos, val, len);
+host_pci_config_copy(pci_dev, ":00:00.0",
+ igd_host_bridge_infos,
+ ARRAY_SIZE(igd_host_bridge_infos),
+ &err);
+if (err != NULL) {
+error_propagate(errp, err);
+return;
 }
 }
 
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 10/11] igd: handle igd-passthrough-isa-bridge setup in realize()

2016-01-05 Thread Gerd Hoffmann

That way a simple '-device igd-passthrough-isa-bridge,addr=1f' will
do the setup.

Also instead of looking up reasonable PCI IDs based on the graphic
device id simply copy over the ids from the host, thereby reusing the
infrastructure we have in place for the igd host bridges.  Less code,
and should be more robust as we don't have to maintain the id table
to keep things going.

Signed-off-by: Gerd Hoffmann 
---
 hw/pci-host/igd.c| 115 +--
 hw/xen/xen_pt.c  |   2 +-
 include/hw/i386/pc.h |   2 +-
 3 files changed, 30 insertions(+), 89 deletions(-)

diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c
index 96b679d..8f32c39 100644
--- a/hw/pci-host/igd.c
+++ b/hw/pci-host/igd.c
@@ -123,111 +123,52 @@ static const TypeInfo igd_passthrough_q35_info = {
 .class_init= igd_passthrough_q35_class_init,
 };
 
-typedef struct {
-uint16_t gpu_device_id;
-uint16_t pch_device_id;
-uint8_t pch_revision_id;
-} IGDDeviceIDInfo;
-
-/* In real world different GPU should have different PCH. But actually
- * the different PCH DIDs likely map to different PCH SKUs. We do the
- * same thing for the GPU. For PCH, the different SKUs are going to be
- * all the same silicon design and implementation, just different
- * features turn on and off with fuses. The SW interfaces should be
- * consistent across all SKUs in a given family (eg LPT). But just same
- * features may not be supported.
- *
- * Most of these different PCH features probably don't matter to the
- * Gfx driver, but obviously any difference in display port connections
- * will so it should be fine with any PCH in case of passthrough.
- *
- * So currently use one PCH version, 0x8c4e, to cover all HSW(Haswell)
- * scenarios, 0x9cc3 for BDW(Broadwell).
- */
-static const IGDDeviceIDInfo igd_combo_id_infos[] = {
-/* HSW Classic */
-{0x0402, 0x8c4e, 0x04}, /* HSWGT1D, HSWD_w7 */
-{0x0406, 0x8c4e, 0x04}, /* HSWGT1M, HSWM_w7 */
-{0x0412, 0x8c4e, 0x04}, /* HSWGT2D, HSWD_w7 */
-{0x0416, 0x8c4e, 0x04}, /* HSWGT2M, HSWM_w7 */
-{0x041E, 0x8c4e, 0x04}, /* HSWGT15D, HSWD_w7 */
-/* HSW ULT */
-{0x0A06, 0x8c4e, 0x04}, /* HSWGT1UT, HSWM_w7 */
-{0x0A16, 0x8c4e, 0x04}, /* HSWGT2UT, HSWM_w7 */
-{0x0A26, 0x8c4e, 0x06}, /* HSWGT3UT, HSWM_w7 */
-{0x0A2E, 0x8c4e, 0x04}, /* HSWGT3UT28W, HSWM_w7 */
-{0x0A1E, 0x8c4e, 0x04}, /* HSWGT2UX, HSWM_w7 */
-{0x0A0E, 0x8c4e, 0x04}, /* HSWGT1ULX, HSWM_w7 */
-/* HSW CRW */
-{0x0D26, 0x8c4e, 0x04}, /* HSWGT3CW, HSWM_w7 */
-{0x0D22, 0x8c4e, 0x04}, /* HSWGT3CWDT, HSWD_w7 */
-/* HSW Server */
-{0x041A, 0x8c4e, 0x04}, /* HSWSVGT2, HSWD_w7 */
-/* HSW SRVR */
-{0x040A, 0x8c4e, 0x04}, /* HSWSVGT1, HSWD_w7 */
-/* BSW */
-{0x1606, 0x9cc3, 0x03}, /* BDWULTGT1, BDWM_w7 */
-{0x1616, 0x9cc3, 0x03}, /* BDWULTGT2, BDWM_w7 */
-{0x1626, 0x9cc3, 0x03}, /* BDWULTGT3, BDWM_w7 */
-{0x160E, 0x9cc3, 0x03}, /* BDWULXGT1, BDWM_w7 */
-{0x161E, 0x9cc3, 0x03}, /* BDWULXGT2, BDWM_w7 */
-{0x1602, 0x9cc3, 0x03}, /* BDWHALOGT1, BDWM_w7 */
-{0x1612, 0x9cc3, 0x03}, /* BDWHALOGT2, BDWM_w7 */
-{0x1622, 0x9cc3, 0x03}, /* BDWHALOGT3, BDWM_w7 */
-{0x162B, 0x9cc3, 0x03}, /* BDWHALO28W, BDWM_w7 */
-{0x162A, 0x9cc3, 0x03}, /* BDWGT3WRKS, BDWM_w7 */
-{0x162D, 0x9cc3, 0x03}, /* BDWGT3SRVR, BDWM_w7 */
+static const IGDHostInfo igd_isa_bridge_infos[] = {
+{PCI_VENDOR_ID,   2},
+{PCI_DEVICE_ID,   2},
+{PCI_REVISION_ID, 2},
+{PCI_SUBSYSTEM_VENDOR_ID, 2},
+{PCI_SUBSYSTEM_ID,2},
 };
 
+static void igd_pt_isa_bridge_realize(PCIDevice *pci_dev, Error **errp)
+{
+Error *err = NULL;
+
+if (pci_dev->devfn != PCI_DEVFN(0x1f, 0)) {
+error_setg(errp, "igd isa bridge must have address 1f.0");
+return;
+}
+
+host_pci_config_copy(pci_dev, ":00:1f.0",
+ igd_isa_bridge_infos,
+ ARRAY_SIZE(igd_isa_bridge_infos),
+ &err);
+if (err != NULL) {
+error_propagate(errp, err);
+return;
+}
+}
+
 static void isa_bridge_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
 PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
 
 dc->desc= "ISA bridge faked to support IGD PT";
-k->vendor_id= PCI_VENDOR_ID_INTEL;
+k->realize  = igd_pt_isa_bridge_realize;
 k->class_id = PCI_CLASS_BRIDGE_ISA;
 };
 
 static TypeInfo igd_passthrough_isa_bridge_info = {
 .name  = "igd-passthrough-isa-bridge",
 .parent= TYPE_PCI_DEVICE,
-.instance_size = sizeof(PCIDevice),
 .class_init = isa_bridge_class_init,
 };
 
-void igd_passthrough_isa_bridge_create(PCIBus *bus, uint16_t gpu_dev_id)
+void igd_passthrough_isa_bridge_create(PCIBus *bus)
 {
-struct PCIDevice *bridge_dev;
-int i, num;
-uint16_t pch_dev_id = 0x;
-uint8_t pch_rev_id;
-
-num = ARRAY_SIZE(

[Qemu-devel] [PATCH v3 01/11] pc: wire up TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE for !xen

2016-01-05 Thread Gerd Hoffmann

rename pc_xen_hvm_init_pci to pc_i440fx_init_pci,
use it for both xen and non-xen init.

That changes behavior of all pc-i440fx-$version machine types where
specifying -machine igd-passthru=on used to have no effect and now it
has.  It is unlikely to cause any trouble though as there used to be
no reason to add that option to kvm guests in the first place.

Signed-off-by: Gerd Hoffmann 
Reviewed-by: Eduardo Habkost 
Reviewed-by: Stefano Stabellini 
---
 hw/i386/pc_piix.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 438cdae..6532e32 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -368,10 +368,9 @@ static void pc_init_isa(MachineState *machine)
 pc_init1(machine, TYPE_I440FX_PCI_HOST_BRIDGE, TYPE_I440FX_PCI_DEVICE);
 }
 
-#ifdef CONFIG_XEN
-static void pc_xen_hvm_init_pci(MachineState *machine)
+static void pc_i440fx_init_pci(MachineState *machine)
 {
-const char *pci_type = has_igd_gfx_passthru ?
+const char *pci_type = machine->igd_gfx_passthru ?
 TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE : 
TYPE_I440FX_PCI_DEVICE;
 
 pc_init1(machine,
@@ -379,6 +378,7 @@ static void pc_xen_hvm_init_pci(MachineState *machine)
  pci_type);
 }
 
+#ifdef CONFIG_XEN
 static void pc_xen_hvm_init(MachineState *machine)
 {
 PCIBus *bus;
@@ -388,7 +388,7 @@ static void pc_xen_hvm_init(MachineState *machine)
 exit(1);
 }
 
-pc_xen_hvm_init_pci(machine);
+pc_i440fx_init_pci(machine);
 
 bus = pci_find_primary_bus();
 if (bus != NULL) {
@@ -404,8 +404,7 @@ static void pc_xen_hvm_init(MachineState *machine)
 if (compat) { \
 compat(machine); \
 } \
-pc_init1(machine, TYPE_I440FX_PCI_HOST_BRIDGE, \
- TYPE_I440FX_PCI_DEVICE); \
+pc_i440fx_init_pci(machine); \
 } \
 DEFINE_PC_MACHINE(suffix, name, pc_init_##suffix, optionfn)
 
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 00/11] igd passthrough chipset tweaks

2016-01-05 Thread Gerd Hoffmann

  Hi,

We have some code in our tree to support pci passthrough of intel
graphics devices (igd) on xen, which requires some chipset tweaks
for (a) the host bridge and (b) the lpc/isa-bridge to meat the
expectations of the guest driver.

For kvm we need pretty much the same, also the requirements for vgpu
(xengt/kvmgt) are very simliar.  This patch wires up the existing
support for kvm.  It also brings a bunch of bugfixes and cleanups.

Unfortunaly the oldish laptop I had planned to use for testing turned
out to have no working iommu support for igd, so this patch series
still has seen very light testing only.  Any testing feedback is very
welcome.

Testing with kvm/i440fx:
  Add '-M pc,igd-passthru=on' to turn on the chipset tweaks.
  Passthrough the igd using vfio.

Testing with kvm/q35:
  Add '-M q35,igd-passthru=on' to turn on the the chipset tweaks.
  Pick up the linux kernel patch referenced in patch #11, build a
  custom kernel with it.  Passthrough the igd using vfio.

Testing with xen:
  Existing setups should continue working ;)

Changes in v3:
  * Handle igd-passthrough-isa-bridge creation in machine init.
  * Fix xen build failure.

Changes in v2:
  * Added igd-passthrough-isa-bridge support form kvm.
  * Added patch to drop has_igd_gfx_passthru.

cheers,
  Gerd

Gerd Hoffmann (11):
  pc: wire up TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE for !xen
  pc: remove has_igd_gfx_passthru global
  pc: move igd support code to igd.c
  igd: switch TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE to realize
  igd: TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE: call parent realize
  igd: use defines for standard pci config space offsets
  igd: revamp host config read
  igd: add q35 support
  igd: move igd-passthrough-isa-bridge to igd.c too
  igd: handle igd-passthrough-isa-bridge setup in realize()
  igd: move igd-passthrough-isa-bridge creation to machine init

 hw/i386/pc_piix.c | 130 +++--
 hw/pci-host/Makefile.objs |   3 +
 hw/pci-host/igd.c | 181 ++
 hw/pci-host/piix.c|  88 --
 hw/pci-host/q35.c |   6 +-
 hw/xen/xen_pt.c   |  14 
 hw/xen/xen_pt.h   |   5 +-
 include/hw/i386/pc.h  |   2 +-
 vl.c  |  10 ---
 9 files changed, 204 insertions(+), 235 deletions(-)
 create mode 100644 hw/pci-host/igd.c

-- 
1.8.3.1

[Qemu-devel] [PATCH v3 08/11] igd: add q35 support

2016-01-05 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 hw/pci-host/igd.c | 41 -
 hw/pci-host/q35.c |  6 +-
 2 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c
index ec48875..f6e3f7a 100644
--- a/hw/pci-host/igd.c
+++ b/hw/pci-host/igd.c
@@ -1,5 +1,6 @@
 #include "qemu-common.h"
 #include "hw/pci/pci.h"
+#include "hw/pci-host/q35.h"
 #include "hw/i386/pc.h"
 
 /* IGD Passthrough Host Bridge. */
@@ -76,7 +77,7 @@ static void igd_passthrough_i440fx_class_init(ObjectClass 
*klass, void *data)
 
 i440fx_realize = k->realize;
 k->realize = igd_pt_i440fx_realize;
-dc->desc = "IGD Passthrough Host bridge";
+dc->desc = "IGD Passthrough Host bridge (i440fx)";
 }
 
 static const TypeInfo igd_passthrough_i440fx_info = {
@@ -85,9 +86,47 @@ static const TypeInfo igd_passthrough_i440fx_info = {
 .class_init= igd_passthrough_i440fx_class_init,
 };
 
+static void (*q35_realize)(PCIDevice *pci_dev, Error **errp);
+static void igd_pt_q35_realize(PCIDevice *pci_dev, Error **errp)
+{
+Error *err = NULL;
+
+q35_realize(pci_dev, &err);
+if (err != NULL) {
+error_propagate(errp, err);
+return;
+}
+
+host_pci_config_copy(pci_dev, ":00:00.0",
+ igd_host_bridge_infos,
+ ARRAY_SIZE(igd_host_bridge_infos),
+ &err);
+if (err != NULL) {
+error_propagate(errp, err);
+return;
+}
+}
+
+static void igd_passthrough_q35_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+
+q35_realize = k->realize;
+k->realize = igd_pt_q35_realize;
+dc->desc = "IGD Passthrough Host bridge (q35)";
+}
+
+static const TypeInfo igd_passthrough_q35_info = {
+.name  = "igd-passthrough-q35-mch",
+.parent= TYPE_MCH_PCI_DEVICE,
+.class_init= igd_passthrough_q35_class_init,
+};
+
 static void igd_register_types(void)
 {
 type_register_static(&igd_passthrough_i440fx_info);
+type_register_static(&igd_passthrough_q35_info);
 }
 
 type_init(igd_register_types)
diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
index 1fb4707..07dc595 100644
--- a/hw/pci-host/q35.c
+++ b/hw/pci-host/q35.c
@@ -151,7 +151,11 @@ static void q35_host_initfn(Object *obj)
 memory_region_init_io(&phb->data_mem, obj, &pci_host_data_le_ops, phb,
   "pci-conf-data", 4);
 
-object_initialize(&s->mch, sizeof(s->mch), TYPE_MCH_PCI_DEVICE);
+if (object_property_get_bool(qdev_get_machine(), "igd-passthru", NULL)) {
+object_initialize(&s->mch, sizeof(s->mch), "igd-passthrough-q35-mch");
+} else {
+object_initialize(&s->mch, sizeof(s->mch), TYPE_MCH_PCI_DEVICE);
+}
 object_property_add_child(OBJECT(s), "mch", OBJECT(&s->mch), NULL);
 qdev_prop_set_uint32(DEVICE(&s->mch), "addr", PCI_DEVFN(0, 0));
 qdev_prop_set_bit(DEVICE(&s->mch), "multifunction", false);
-- 
1.8.3.1

[Qemu-devel] [PATCH v3 09/11] igd: move igd-passthrough-isa-bridge to igd.c too

2016-01-05 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 hw/i386/pc_piix.c | 113 --
 hw/pci-host/igd.c | 108 +++
 2 files changed, 108 insertions(+), 113 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 6532e32..f36222e 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -914,119 +914,6 @@ static void pc_i440fx_0_10_machine_options(MachineClass 
*m)
 DEFINE_I440FX_MACHINE(v0_10, "pc-0.10", pc_compat_0_13,
   pc_i440fx_0_10_machine_options);
 
-typedef struct {
-uint16_t gpu_device_id;
-uint16_t pch_device_id;
-uint8_t pch_revision_id;
-} IGDDeviceIDInfo;
-
-/* In real world different GPU should have different PCH. But actually
- * the different PCH DIDs likely map to different PCH SKUs. We do the
- * same thing for the GPU. For PCH, the different SKUs are going to be
- * all the same silicon design and implementation, just different
- * features turn on and off with fuses. The SW interfaces should be
- * consistent across all SKUs in a given family (eg LPT). But just same
- * features may not be supported.
- *
- * Most of these different PCH features probably don't matter to the
- * Gfx driver, but obviously any difference in display port connections
- * will so it should be fine with any PCH in case of passthrough.
- *
- * So currently use one PCH version, 0x8c4e, to cover all HSW(Haswell)
- * scenarios, 0x9cc3 for BDW(Broadwell).
- */
-static const IGDDeviceIDInfo igd_combo_id_infos[] = {
-/* HSW Classic */
-{0x0402, 0x8c4e, 0x04}, /* HSWGT1D, HSWD_w7 */
-{0x0406, 0x8c4e, 0x04}, /* HSWGT1M, HSWM_w7 */
-{0x0412, 0x8c4e, 0x04}, /* HSWGT2D, HSWD_w7 */
-{0x0416, 0x8c4e, 0x04}, /* HSWGT2M, HSWM_w7 */
-{0x041E, 0x8c4e, 0x04}, /* HSWGT15D, HSWD_w7 */
-/* HSW ULT */
-{0x0A06, 0x8c4e, 0x04}, /* HSWGT1UT, HSWM_w7 */
-{0x0A16, 0x8c4e, 0x04}, /* HSWGT2UT, HSWM_w7 */
-{0x0A26, 0x8c4e, 0x06}, /* HSWGT3UT, HSWM_w7 */
-{0x0A2E, 0x8c4e, 0x04}, /* HSWGT3UT28W, HSWM_w7 */
-{0x0A1E, 0x8c4e, 0x04}, /* HSWGT2UX, HSWM_w7 */
-{0x0A0E, 0x8c4e, 0x04}, /* HSWGT1ULX, HSWM_w7 */
-/* HSW CRW */
-{0x0D26, 0x8c4e, 0x04}, /* HSWGT3CW, HSWM_w7 */
-{0x0D22, 0x8c4e, 0x04}, /* HSWGT3CWDT, HSWD_w7 */
-/* HSW Server */
-{0x041A, 0x8c4e, 0x04}, /* HSWSVGT2, HSWD_w7 */
-/* HSW SRVR */
-{0x040A, 0x8c4e, 0x04}, /* HSWSVGT1, HSWD_w7 */
-/* BSW */
-{0x1606, 0x9cc3, 0x03}, /* BDWULTGT1, BDWM_w7 */
-{0x1616, 0x9cc3, 0x03}, /* BDWULTGT2, BDWM_w7 */
-{0x1626, 0x9cc3, 0x03}, /* BDWULTGT3, BDWM_w7 */
-{0x160E, 0x9cc3, 0x03}, /* BDWULXGT1, BDWM_w7 */
-{0x161E, 0x9cc3, 0x03}, /* BDWULXGT2, BDWM_w7 */
-{0x1602, 0x9cc3, 0x03}, /* BDWHALOGT1, BDWM_w7 */
-{0x1612, 0x9cc3, 0x03}, /* BDWHALOGT2, BDWM_w7 */
-{0x1622, 0x9cc3, 0x03}, /* BDWHALOGT3, BDWM_w7 */
-{0x162B, 0x9cc3, 0x03}, /* BDWHALO28W, BDWM_w7 */
-{0x162A, 0x9cc3, 0x03}, /* BDWGT3WRKS, BDWM_w7 */
-{0x162D, 0x9cc3, 0x03}, /* BDWGT3SRVR, BDWM_w7 */
-};
-
-static void isa_bridge_class_init(ObjectClass *klass, void *data)
-{
-DeviceClass *dc = DEVICE_CLASS(klass);
-PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
-
-dc->desc= "ISA bridge faked to support IGD PT";
-k->vendor_id= PCI_VENDOR_ID_INTEL;
-k->class_id = PCI_CLASS_BRIDGE_ISA;
-};
-
-static TypeInfo isa_bridge_info = {
-.name  = "igd-passthrough-isa-bridge",
-.parent= TYPE_PCI_DEVICE,
-.instance_size = sizeof(PCIDevice),
-.class_init = isa_bridge_class_init,
-};
-
-static void pt_graphics_register_types(void)
-{
-type_register_static(&isa_bridge_info);
-}
-type_init(pt_graphics_register_types)
-
-void igd_passthrough_isa_bridge_create(PCIBus *bus, uint16_t gpu_dev_id)
-{
-struct PCIDevice *bridge_dev;
-int i, num;
-uint16_t pch_dev_id = 0x;
-uint8_t pch_rev_id;
-
-num = ARRAY_SIZE(igd_combo_id_infos);
-for (i = 0; i < num; i++) {
-if (gpu_dev_id == igd_combo_id_infos[i].gpu_device_id) {
-pch_dev_id = igd_combo_id_infos[i].pch_device_id;
-pch_rev_id = igd_combo_id_infos[i].pch_revision_id;
-}
-}
-
-if (pch_dev_id == 0x) {
-return;
-}
-
-/* Currently IGD drivers always need to access PCH by 1f.0. */
-bridge_dev = pci_create_simple(bus, PCI_DEVFN(0x1f, 0),
-   "igd-passthrough-isa-bridge");
-
-/*
- * Note that vendor id is always PCI_VENDOR_ID_INTEL.
- */
-if (!bridge_dev) {
-fprintf(stderr, "set igd-passthrough-isa-bridge failed!\n");
-return;
-}
-pci_config_set_device_id(bridge_dev->config, pch_dev_id);
-pci_config_set_revision(bridge_dev->config, pch_rev_id);
-}
-
 static void isapc_machine_options(MachineClass *m)
 {
 PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c
index f6e3f7a..96b679d 100644
--

[Qemu-devel] [PATCH] pc: allow raising low memory via max-ram-below-4g option

2016-01-05 Thread Gerd Hoffmann

This patch extends the functionality of the max-ram-below-4g option
to also allow increasing lowmem.  While being at it also rework the
lowmem calculation logic and add a longish comment describing how it
works and what the compatibility constrains are.

Signed-off-by: Gerd Hoffmann 
---
 hw/i386/pc.c  |  2 +-
 hw/i386/pc_piix.c | 61 +++
 2 files changed, 40 insertions(+), 23 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 459260b..1332269 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1887,7 +1887,7 @@ static void pc_machine_initfn(Object *obj)
 pc_machine_get_hotplug_memory_region_size,
 NULL, NULL, NULL, &error_abort);
 
-pcms->max_ram_below_4g = 1ULL << 32; /* 4G */
+pcms->max_ram_below_4g = 0xe000; /* 3.5G */
 object_property_add(obj, PC_MACHINE_MAX_RAM_BELOW_4G, "size",
 pc_machine_get_max_ram_below_4g,
 pc_machine_set_max_ram_below_4g,
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 438cdae..3743736 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -87,29 +87,46 @@ static void pc_init1(MachineState *machine,
 PcGuestInfo *guest_info;
 ram_addr_t lowmem;
 
-/* Check whether RAM fits below 4G (leaving 1/2 GByte for IO memory).
- * If it doesn't, we need to split it in chunks below and above 4G.
- * In any case, try to make sure that guest addresses aligned at
- * 1G boundaries get mapped to host addresses aligned at 1G boundaries.
- * For old machine types, use whatever split we used historically to avoid
- * breaking migration.
+/*
+ * Calculate ram split, for memory below and above 4G.  It's a bit
+ * complicated for backward compatibility reasons ...
+ *
+ *  - Traditional split is 3.5G (lowmem = 0xe000).  This is the
+ *default value for max_ram_below_4g now.
+ *
+ *  - Then, to gigabyte align the memory, we move the split to 3G
+ *(lowmem = 0xc000).  But only in case we have to split in
+ *the first place, i.e. ram_size is larger than (traditional)
+ *lowmem.  And for new machine types (gigabyte_align = true)
+ *only, for live migration compatibility reasons.
+ *
+ *  - Next the max-ram-below-4g option was added, which allowed to
+ *reduce lowmem to a smaller value, to allow a larger PCI I/O
+ *window below 4G.  qemu doesn't enforce gigabyte alignment here,
+ *but prints a warning.
+ *
+ *  - Finally max-ram-below-4g got updated to also allow raising lowmem,
+ *so legacy non-PAE guests can get as much memory as possible in
+ *the 32bit address space below 4G.
+ *
+ * Examples:
+ *qemu -M pc-1.7 -m 4G(old default)-> 3584M low,  512M high
+ *qemu -M pc -m 4G(new default)-> 3072M low, 1024M high
+ *qemu -M pc,max-ram-below-4g=2G -m 4G -> 2048M low, 2048M high
+ *qemu -M pc,max-ram-below-4g=4G -m 3968M  -> 3968M low (=4G-128M)
  */
-if (machine->ram_size >= 0xe000) {
-lowmem = pcmc->gigabyte_align ? 0xc000 : 0xe000;
-} else {
-lowmem = 0xe000;
-}
-
-/* Handle the machine opt max-ram-below-4g.  It is basically doing
- * min(qemu limit, user limit).
- */
-if (lowmem > pcms->max_ram_below_4g) {
-lowmem = pcms->max_ram_below_4g;
-if (machine->ram_size - lowmem > lowmem &&
-lowmem & ((1ULL << 30) - 1)) {
-error_report("Warning: Large machine and max_ram_below_4g(%"PRIu64
- ") not a multiple of 1G; possible bad performance.",
- pcms->max_ram_below_4g);
+lowmem = pcms->max_ram_below_4g;
+if (machine->ram_size >= pcms->max_ram_below_4g) {
+if (pcmc->gigabyte_align) {
+if (lowmem > 0xc000) {
+lowmem = 0xc000;
+}
+if (lowmem & ((1ULL << 30) - 1)) {
+error_report("Warning: Large machine and max_ram_below_4g "
+ "(%" PRIu64 ") not a multiple of 1G; "
+ "possible bad performance.",
+ pcms->max_ram_below_4g);
+}
 }
 }
 
-- 
1.8.3.1

Re: [Qemu-devel] [PATCH 0/2] checkpatch: Fixing two cases of false positives in checkpatch.pl

2016-01-05 Thread Leonid Bloch

ping

On Sun, Dec 20, 2015 at 11:31 AM, Leonid Bloch  wrote:

> ping
>
> http://patchwork.ozlabs.org/patch/537763
> http://patchwork.ozlabs.org/patch/537762
>
> On Tue, Nov 24, 2015 at 4:43 PM, Leonid Bloch  wrote:
>
>> ping
>>
>> http://patchwork.ozlabs.org/patch/537763
>> http://patchwork.ozlabs.org/patch/537762
>>
>> On Tue, Nov 10, 2015 at 11:48 AM, Leonid Bloch  wrote:
>> > ping
>> >
>> > http://patchwork.ozlabs.org/patch/537763
>> > http://patchwork.ozlabs.org/patch/537762
>> >
>> > On Thu, Oct 29, 2015 at 11:48 AM, Leonid Bloch 
>> wrote:
>> >> This series addresses two cases where errors were printed if
>> whitespaces
>> >> appeared in front of a square bracket in places where there should be
>> no
>> >> problem with such placements (please see messages of individual
>> commits).
>> >>
>> >> Leonid Bloch (2):
>> >>   checkpatch: Eliminate false positive in case of comma-space-square
>> >> bracket
>> >>   checkpatch: Eliminate false positive in case of space before square
>> >> bracket in a definition
>> >>
>> >>  scripts/checkpatch.pl | 6 +-
>> >>  1 file changed, 5 insertions(+), 1 deletion(-)
>> >>
>> >> --
>> >> 2.4.3
>> >>
>>
>
>

[Qemu-devel] [PULL 5/5] seabios: update binaries to release 1.9.0

2016-01-05 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 pc-bios/bios-256k.bin  | Bin 262144 -> 262144 bytes
 pc-bios/bios.bin   | Bin 131072 -> 131072 bytes
 pc-bios/vgabios-cirrus.bin | Bin 38400 -> 38400 bytes
 pc-bios/vgabios-qxl.bin| Bin 38400 -> 38912 bytes
 pc-bios/vgabios-stdvga.bin | Bin 38400 -> 38912 bytes
 pc-bios/vgabios-virtio.bin | Bin 38400 -> 38912 bytes
 pc-bios/vgabios-vmware.bin | Bin 38400 -> 38912 bytes
 pc-bios/vgabios.bin| Bin 38400 -> 38400 bytes
 8 files changed, 0 insertions(+), 0 deletions(-)

diff --git a/pc-bios/bios-256k.bin b/pc-bios/bios-256k.bin
index f86adff..1c50904 100644
Binary files a/pc-bios/bios-256k.bin and b/pc-bios/bios-256k.bin differ
diff --git a/pc-bios/bios.bin b/pc-bios/bios.bin
index db835fb..742a773 100644
Binary files a/pc-bios/bios.bin and b/pc-bios/bios.bin differ
diff --git a/pc-bios/vgabios-cirrus.bin b/pc-bios/vgabios-cirrus.bin
index dde8502..4d35b2e 100644
Binary files a/pc-bios/vgabios-cirrus.bin and b/pc-bios/vgabios-cirrus.bin 
differ
diff --git a/pc-bios/vgabios-qxl.bin b/pc-bios/vgabios-qxl.bin
index 5c43bd2..aa1c725 100644
Binary files a/pc-bios/vgabios-qxl.bin and b/pc-bios/vgabios-qxl.bin differ
diff --git a/pc-bios/vgabios-stdvga.bin b/pc-bios/vgabios-stdvga.bin
index b2dd8f9..d6eeab1 100644
Binary files a/pc-bios/vgabios-stdvga.bin and b/pc-bios/vgabios-stdvga.bin 
differ
diff --git a/pc-bios/vgabios-virtio.bin b/pc-bios/vgabios-virtio.bin
index 03ac8a7..f9db6a6 100644
Binary files a/pc-bios/vgabios-virtio.bin and b/pc-bios/vgabios-virtio.bin 
differ
diff --git a/pc-bios/vgabios-vmware.bin b/pc-bios/vgabios-vmware.bin
index 15e21c2..33e0377 100644
Binary files a/pc-bios/vgabios-vmware.bin and b/pc-bios/vgabios-vmware.bin 
differ
diff --git a/pc-bios/vgabios.bin b/pc-bios/vgabios.bin
index 84f1561..db0c5eb 100644
Binary files a/pc-bios/vgabios.bin and b/pc-bios/vgabios.bin differ
-- 
1.8.3.1

[Qemu-devel] [PULL 2/5] seabios: use new EXTRAVERSION to tag qemu builds

2016-01-05 Thread Gerd Hoffmann

Signed-off-by: Gerd Hoffmann 
---
 roms/Makefile | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/roms/Makefile b/roms/Makefile
index 09e33b5..01f2701 100644
--- a/roms/Makefile
+++ b/roms/Makefile
@@ -35,7 +35,7 @@ powerpc_cross_prefix := $(call find-cross-prefix,powerpc)
 x86_64_cross_prefix := $(call find-cross-prefix,x86_64)
 
 # tag our seabios builds
-SEABIOS_VERSION="$(shell cd seabios; git describe --tags --long) by 
qemu-project.org"
+SEABIOS_EXTRAVERSION="-prebuilt.qemu-project.org"
 
 #
 # EfiRom utility is shipped with edk2 / tianocore, in BaseTools/
@@ -78,12 +78,12 @@ build-seabios-config-%: config.%
mkdir -p seabios/builds/$*
cp $< seabios/builds/$*/.config
$(MAKE) -C seabios \
-   VERSION=$(SEABIOS_VERSION) \
+   EXTRAVERSION=$(SEABIOS_EXTRAVERSION) \
CROSS_COMPILE=$(x86_64_cross_prefix) \
KCONFIG_CONFIG=$(CURDIR)/seabios/builds/$*/.config \
OUT=$(CURDIR)/seabios/builds/$*/ oldnoconfig
$(MAKE) -C seabios \
-   VERSION=$(SEABIOS_VERSION) \
+   EXTRAVERSION=$(SEABIOS_EXTRAVERSION) \
CROSS_COMPILE=$(x86_64_cross_prefix) \
KCONFIG_CONFIG=$(CURDIR)/seabios/builds/$*/.config \
OUT=$(CURDIR)/seabios/builds/$*/ all
-- 
1.8.3.1

[Qemu-devel] [PULL 4/5] seabios: stop updating aml files

2016-01-05 Thread Gerd Hoffmann

ACPI aml files traditionally have been managed in the seabios repo.
In qemu version 2.0 we've switched over to have qemu generate the
acpi tables and provide them to the firmware via fw_cfg.

The old aml files are still there and used for old machine types.
Well, actually the q35 file only, the piix4 version is compiled into
seabios (unless built with CONFIG_ACPI_DSDT=n) and is there for
reference only.

The aml files havn't been touched for a long time, and given that
new features requiring acpi changes are typically only added to new
machine types this is unlikely to change in the future.  So stop
updating them.

That allows to cleanup things a bit on the seabios side in the future.

Signed-off-by: Gerd Hoffmann 
---
 roms/Makefile | 1 -
 1 file changed, 1 deletion(-)

diff --git a/roms/Makefile b/roms/Makefile
index 01f2701..7bd1252 100644
--- a/roms/Makefile
+++ b/roms/Makefile
@@ -64,7 +64,6 @@ default:
 bios: build-seabios-config-seabios-128k build-seabios-config-seabios-256k
cp seabios/builds/seabios-128k/bios.bin ../pc-bios/bios.bin
cp seabios/builds/seabios-256k/bios.bin ../pc-bios/bios-256k.bin
-   cp seabios/builds/seabios-256k/src/fw/*dsdt.aml ../pc-bios/
 
 seavgabios: $(patsubst %,seavgabios-%,$(vgabios_variants))
 
-- 
1.8.3.1

[Qemu-devel] [PULL 3/5] seabios: update 128k bios config

2016-01-05 Thread Gerd Hoffmann

Turn off OHCI + TPM support to keep the size below 128k.

Signed-off-by: Gerd Hoffmann 
---
 roms/config.seabios-128k | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/roms/config.seabios-128k b/roms/config.seabios-128k
index c719ba6..0a9da77 100644
--- a/roms/config.seabios-128k
+++ b/roms/config.seabios-128k
@@ -3,6 +3,8 @@
 CONFIG_QEMU=y
 CONFIG_ROM_SIZE=128
 CONFIG_XEN=n
+CONFIG_USB_OHCI=n
 CONFIG_USB_XHCI=n
 CONFIG_USB_UAS=n
 CONFIG_SDCARD=n
+CONFIG_TCGBIOS=n
-- 
1.8.3.1

[Qemu-devel] [PULL 1/5] seabios: update submodule to release 1.9.0

2016-01-05 Thread Gerd Hoffmann

Highlights / user visible changes in seabios:
 * boot menu key is ESC now.
 * virtio 1.0 support.
 * sdcard support.
 * fw_cfg dma suport.
 * usual share of bugfixes ;)

In vgabios:
 * Emulates leal instruction.  Works around a bug in old x86emu versions,
   which makes old xorg vesa drivers work (RHEL-5 for example).

full shortlog rel-1.8.2..rel-1.9.0
--

Ameya Palande (1):
  x86: add barrier to read{b,w,l} and write{b,w,l} functions

Andreas Färber (1):
  checkrom: Fix typo in error message

Chen Fan (1):
  pci: enable SERR# for error forwarding in bridge control register

Gerd Hoffmann (28):
  vga: simplify vga builds
  vga: rework virtio-vga support
  vga: add virtio-vga to kconfig
  pci: allow to loop over capabilities
  virtio: run drivers in 32bit mode
  virtio: add struct vp_device
  virtio: pass struct pci_device to vp_init_simple
  virtio: add version 1.0 structs and #defines
  virtio: add version 0.9.5 struct
  virtio: find version 1.0 virtio capabilities
  virtio: create vp_cap struct for legacy bar
  virtio: add read/write functions and macros
  virtio: make features 64bit, support version 1.0 features
  virtio: add version 1.0 support to vp_{get,set}_status
  virtio: add version 1.0 support to vp_get_isr
  virtio: add version 1.0 support to vp_reset
  virtio: add version 1.0 support to vp_notify
  virtio: remove unused vp_del_vq
  virtio: add version 1.0 support to vp_find_vq
  virtio-scsi: fix initialization for version 1.0
  virtio-blk: fix initialization for version 1.0
  virtio: use version 1.0 if available (flip the big switch)
  virtio: also probe version 1.0 pci ids
  virtio: legacy cleanup
  virtio-blk: 32bit cleanup
  virtio-scsi: 32bit cleanup
  virtio-ring: 32bit cleanup
  virtio-pci: use high memory for rings

Julius Werner (1):
  xhci: Count new Max Scratchpad Bufs bits from XHCI 1.1

Kevin O'Connor (126):
  docs: add page for SeaVGABIOS
  docs: Add page describing the patch contribution process
  docs: Add page on available CBFS/fw_cfg runtime config files
  docs: Prefer triple backticks to multiple lines with single backticks
  smp: Fix smp race introduced in 0673b787
  docs: Note release date of 1.8.1
  vgabios: On bda_save_restore() the saved vbe_mode also has flags in it
  vgabios: Don't use extra stack if it appears a modern OS is in use
  docs: Clarify that pci-optionrom-exec doesn't apply to roms in cbfs
  checkstack: Replace function information tuple with class
  checkstack: Simplify yield calculations
  checkstack: Prefer passing "function" class instead of function address
  smbios: Use integer signature instead of string signature
  vgabios: Don't use "smsww" instruction - it confuses x86emu
  vgabios: Add config option for assembler fixups
  vgabios: Emulate "leal" instruction
  checkstack: Minor - continue if not a regular asm line
  Don't forward declare functions with "inline" in headers
  build: Support "make VERSION=xyz" to override the default build version
  tcg: Use seabios setup()/prepboot() calling convention for tcg
  build: CONFIG_VGA_FIXUP_ASM should depend on CONFIG_BUILD_VGABIOS
  bootorder: Update "extra pci root" buses bootorder format to match qemu
  Make sure all code checks for malloc failures
  docs: Note release date of 1.8.2
  block: Split process_op() command dispatch up into multiple functions
  block: Introduce default_process_op() with common command handling codes
  block: Route scsi style commands through 'struct disk_op_s'
  blockcmd: Introduce scsi_fill_cmd()
  ata: Handle ATA ATAPI drives directly via 'struct disk_op_s' requests
  ahci: Handle AHCI ATAPI drives directly via 'struct disk_op_s' requests
  usb-msc: Handle USB drives directly via 'struct disk_op_s' requests
  usb-uas: Handle USB drives directly via 'struct disk_op_s' requests
  lsi-scsi: Handle LSI drives directly via 'struct disk_op_s' requests
  esp-scsi: Handle ESP drives directly via 'struct disk_op_s' requests
  megasas: Handle Megasas drives directly via 'struct disk_op_s' requests
  virtio-scsi: Handle virtio drives directly via 'struct disk_op_s' requests
  pvscsi: Move pvscsi_fill_req() code into pvscsi_cmd()
  pvscsi: Handle pvscsi drives directly via 'struct disk_op_s' requests
  blockcmd: Remove unused scsi_process_op() and cdb_cmd_data()
  blockcmd: Convert cdb_is_read() to scsi_is_read()
  block: Rename process_XXX_op() functions to XXX_process_op()
  coreboot: Try to auto-detect if the CBFS anchor pointer is a relative 
pointer
  ps2: Support mode for polling the PS2 port instead of using irqs
  ata: Make sure "chanid" is relative to PCI device for bootorder file
  Don't enable interrupts prior to IVT and PIC setup
  p

[Qemu-devel] [PULL 0/5] seabios: update to release 1.9.0

2016-01-05 Thread Gerd Hoffmann

  Hi,

This patch series updates seabios to version 1.9.0.
Pull request is identical to the patch series sent
to the list before xmas, except that I've dropped
one patch which got merged through mst already.

please pull,
  Gerd

The following changes since commit 38a762fec63fd5c035aae29ba9a77d357e21e4a7:

  Merge remote-tracking branch 
'remotes/berrange/tags/pull-crypto-fixes-2015-12-23-1' into staging (2015-12-23 
13:53:32 +)

are available in the git repository at:


  git://git.kraxel.org/qemu tags/pull-seabios-20160105-1

for you to fetch changes up to 4b9294c00e08011793f8869afd893e2fb49dacb3:

  seabios: update binaries to release 1.9.0 (2016-01-05 13:04:15 +0100)


seabios: update to release 1.9.0


Gerd Hoffmann (5):
  seabios: update submodule to release 1.9.0
  seabios: use new EXTRAVERSION to tag qemu builds
  seabios: update 128k bios config
  seabios: stop updating aml files
  seabios: update binaries to release 1.9.0

 pc-bios/bios-256k.bin  | Bin 262144 -> 262144 bytes
 pc-bios/bios.bin   | Bin 131072 -> 131072 bytes
 pc-bios/vgabios-cirrus.bin | Bin 38400 -> 38400 bytes
 pc-bios/vgabios-qxl.bin| Bin 38400 -> 38912 bytes
 pc-bios/vgabios-stdvga.bin | Bin 38400 -> 38912 bytes
 pc-bios/vgabios-virtio.bin | Bin 38400 -> 38912 bytes
 pc-bios/vgabios-vmware.bin | Bin 38400 -> 38912 bytes
 pc-bios/vgabios.bin| Bin 38400 -> 38400 bytes
 roms/Makefile  |   7 +++
 roms/config.seabios-128k   |   2 ++
 roms/seabios   |   2 +-
 11 files changed, 6 insertions(+), 5 deletions(-)

Re: [Qemu-devel] [PATCH] i2c-tiny-usb: add new usb to i2c bridge

2016-01-05 Thread Tim Sander

Hi Gerd

Thanks for your review.

Am Dienstag, 5. Januar 2016, 08:44:30 schrieb Gerd Hoffmann:
> > +case 0x4107:
> > +/* this seems to be a byte type access */
> > +if (i2c_start_transfer(s->i2cbus, /*address*/index, 0)) {
> > +trace_usb_i2c_tiny_i2c_start_transfer_failed();
> > +p->actual_length = 0; /* write failure */
> > +break;
> > +}
> > +for (i = 0; i < length; i++) {
> > +trace_usb_i2c_tiny_write(request, index, i, data[i]);
> > +i2c_send(s->i2cbus, data[i]);
> > +}
> > +p->actual_length = length;
> > +i2c_end_transfer(s->i2cbus);
> > +break;
> 
> I think most of the tracepoints should be moved into i2c code (or just
> dropped in case we already have tracepoints there).
I don't think that there are any tracepoints in there.

> One (high-level) tracepoint per transfer request makes sense in the usb
> code, i.e. trace_usb_i2c_transfer_{read,write}, so one can see in the
> trace log which usb request triggered which i2c transaction.
> 
> > +case 0xc101:
> > +{
> > +/* thats what the real thing reports, FIXME: can we do better
> > here? */
> 
> Hmm, didn't we agree on adding a note about what the "real thing" we
> mimic here is, to the comment at the start of the file?
Ok, that was a missunderstanding. I thought you wanted a description on top of 
the patch i was sending but  having a description in the file makes sense and i 
will add it.

> > +uint32_t func = htole32(I2C_FUNC_I2C | I2C_FUNC_SMBUS_EMUL);
> 
> Can we move 'func' to the start of the function too, like we did with
> 'i'?
will do.
 
> > +case 0xc106:
> > +trace_usb_i2c_tiny_unknown_request(index, request, value,
> > length);
> > +trace_usb_i2c_tiny_unknown_request(data[0], data[1], data[2],
> > data[3]);
> > +if (i2c_start_transfer(s->i2cbus, /*address*/ index, 1)) {
> > +trace_usb_i2c_tiny_i2c_start_transfer_failed();
> > +p->actual_length = 0;
> > +break;
> > +}
> 
> Doesn't look like this request is unknown ...
> 
> > +for (i = 0; i < length; i++) {
> > +data[i] = i2c_recv(s->i2cbus);
> 
> Can this fail?
I think failure is just returning 255 as a value? AFAIK thats what real i2c 
hardware returns.

Best regards
Tim

Re: [Qemu-devel] qcow2 snapshot + resize

2016-01-05 Thread lihuiba

Max, 

I'll see what I can do, and give you my plan.


Thanks!




>On 29.12.2015 10:38, lihuiba wrote:
>> Hi,
>>
>> In our production environment, we need to extend a qcow2 image with
>> snapshots in it. This feature, however, is not implemented yet. 
>>
>> So I want to ask, if this feature is under active development?
>
>No, it is not.
>
>> How can I
>> help with this feature?
>
>Well, you can implement it. ;-)
>
>> It seems that, this feature is not too difficult as long as cluster_size
>> is kept unchanged. Is this correct?
>
>The thing is that one would need to update all the inactive L1 tables. I
>don't think it should be too difficult, it's just that apparently so far
>nobody ever had the need for this feature.

>I can take a look, but I can't say anything about whether or when
>anything will come out of it.

>Max

Re: [Qemu-devel] [PATCH v2 1/4] Add Error **errp for xen_host_pci_device_get()

2016-01-05 Thread Cao jin




On 01/05/2016 06:40 PM, Stefano Stabellini wrote:

On Tue, 5 Jan 2016, Cao jin wrote:

[...]


This is true, however I think that relying on it is error prone: in a
couple of years from now somebody might change the call sequence without
updating the error handling (easy to forget), causing QEMU to crash on
error. I think it is safer not to rely on errp != NULL.



I see, sounds reasonable




[...]


I think we only need to use a local Error variable when we want to check
for the returned error, in cases such as:

if (*errp) {

In other cases, when we are not interested in *errp, we can simply
propagate the error, like you have done in your patches.



Ok, will fix it


.



--
Yours Sincerely,

Cao Jin

Re: [Qemu-devel] [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking

2016-01-05 Thread Dr. David Alan Gilbert

* Michael S. Tsirkin (m...@redhat.com) wrote:
> On Tue, Jan 05, 2016 at 10:45:25AM +, Dr. David Alan Gilbert wrote:
> > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > On Tue, Jan 05, 2016 at 10:01:04AM +, Dr. David Alan Gilbert wrote:
> > > > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > > > On Mon, Jan 04, 2016 at 07:11:25PM -0800, Alexander Duyck wrote:
> > > > > > >> The two mechanisms referenced above would likely require 
> > > > > > >> coordination with
> > > > > > >> QEMU and as such are open to discussion.  I haven't attempted to 
> > > > > > >> address
> > > > > > >> them as I am not sure there is a consensus as of yet.  My 
> > > > > > >> personal
> > > > > > >> preference would be to add a vendor-specific configuration block 
> > > > > > >> to the
> > > > > > >> emulated pci-bridge interfaces created by QEMU that would allow 
> > > > > > >> us to
> > > > > > >> essentially extend shpc to support guest live migration with 
> > > > > > >> pass-through
> > > > > > >> devices.
> > > > > > >
> > > > > > > shpc?
> > > > > > 
> > > > > > That is kind of what I was thinking.  We basically need some 
> > > > > > mechanism
> > > > > > to allow for the host to ask the device to quiesce.  It has been
> > > > > > proposed to possibly even look at something like an ACPI interface
> > > > > > since I know ACPI is used by QEMU to manage hot-plug in the standard
> > > > > > case.
> > > > > > 
> > > > > > - Alex
> > > > > 
> > > > > 
> > > > > Start by using hot-unplug for this!
> > > > > 
> > > > > Really use your patch guest side, and write host side
> > > > > to allow starting migration with the device, but
> > > > > defer completing it.
> > > > > 
> > > > > So
> > > > > 
> > > > > 1.- host tells guest to start tracking memory writes
> > > > > 2.- guest acks
> > > > > 3.- migration starts
> > > > > 4.- most memory is migrated
> > > > > 5.- host tells guest to eject device
> > > > > 6.- guest acks
> > > > > 7.- stop vm and migrate rest of state
> > > > > 
> > > > > 
> > > > > It will already be a win since hot unplug after migration starts and
> > > > > most memory has been migrated is better than hot unplug before 
> > > > > migration
> > > > > starts.
> > > > > 
> > > > > Then measure downtime and profile. Then we can look at ways
> > > > > to quiesce device faster which really means step 5 is replaced
> > > > > with "host tells guest to quiesce device and dirty (or just unmap!)
> > > > > all memory mapped for write by device".
> > > > 
> > > > 
> > > > Doing a hot-unplug is going to upset the guests network stacks view
> > > > of the world; that's something we don't want to change.
> > > > 
> > > > Dave
> > > 
> > > It might but if you store the IP and restore it quickly
> > > after migration e.g. using guest agent, as opposed to DHCP,
> > > then it won't.
> > 
> > I thought if you hot-unplug then it will lose any outstanding connections
> > on that device.
> 
> Which connections and which device?  TCP connections and an ethernet
> device?  These are on different layers so of course you don't lose them.
> Just do not change the IP address.
> 
> Some guests send a signal to applications to close connections
> when all links go down. One can work around this
> in a variety of ways.

So, OK, I was surprised that a simple connection didn't go down when
I tested and just removed the network card; I'd thought stuff was more
aggressive when there was no route.
But as you say, some stuff does close connections when the links go down/away
so we do need to work around that; and any new outgoing connections get
a 'no route to host'.  So I'm still nervous what will break.

Dave

> 
> > > It allows calming the device down in a generic way,
> > > specific drivers can then implement the fast quiesce.
> > 
> > Except that if it breaks the guest networking it's useless.
> > 
> > Dave
> > 
> > > 
> > > > > 
> > > > > -- 
> > > > > MST
> > > > --
> > > > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH v3 00/11] igd passthrough chipset tweaks

2016-01-05 Thread Michael S. Tsirkin

On Tue, Jan 05, 2016 at 12:41:27PM +0100, Gerd Hoffmann wrote:
>   Hi,
> 
> We have some code in our tree to support pci passthrough of intel
> graphics devices (igd) on xen, which requires some chipset tweaks
> for (a) the host bridge and (b) the lpc/isa-bridge to meat the
> expectations of the guest driver.
> 
> For kvm we need pretty much the same, also the requirements for vgpu
> (xengt/kvmgt) are very simliar.  This patch wires up the existing
> support for kvm.  It also brings a bunch of bugfixes and cleanups.
> 
> Unfortunaly the oldish laptop I had planned to use for testing turned
> out to have no working iommu support for igd, so this patch series
> still has seen very light testing only.  Any testing feedback is very
> welcome.

I'm very interested to hear about it too, especially in light of the
fact that config accesses to host seem completely
broken ATM.


> Testing with kvm/i440fx:
>   Add '-M pc,igd-passthru=on' to turn on the chipset tweaks.
>   Passthrough the igd using vfio.
> 
> Testing with kvm/q35:
>   Add '-M q35,igd-passthru=on' to turn on the the chipset tweaks.
>   Pick up the linux kernel patch referenced in patch #11, build a
>   custom kernel with it.  Passthrough the igd using vfio.
> 
> Testing with xen:
>   Existing setups should continue working ;)
> 
> Changes in v3:
>   * Handle igd-passthrough-isa-bridge creation in machine init.
>   * Fix xen build failure.
> 
> Changes in v2:
>   * Added igd-passthrough-isa-bridge support form kvm.
>   * Added patch to drop has_igd_gfx_passthru.
> 
> cheers,
>   Gerd
> 
> Gerd Hoffmann (11):
>   pc: wire up TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE for !xen
>   pc: remove has_igd_gfx_passthru global
>   pc: move igd support code to igd.c
>   igd: switch TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE to realize
>   igd: TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE: call parent realize
>   igd: use defines for standard pci config space offsets
>   igd: revamp host config read
>   igd: add q35 support
>   igd: move igd-passthrough-isa-bridge to igd.c too
>   igd: handle igd-passthrough-isa-bridge setup in realize()
>   igd: move igd-passthrough-isa-bridge creation to machine init
> 
>  hw/i386/pc_piix.c | 130 +++--
>  hw/pci-host/Makefile.objs |   3 +
>  hw/pci-host/igd.c | 181 
> ++
>  hw/pci-host/piix.c|  88 --
>  hw/pci-host/q35.c |   6 +-
>  hw/xen/xen_pt.c   |  14 
>  hw/xen/xen_pt.h   |   5 +-
>  include/hw/i386/pc.h  |   2 +-
>  vl.c  |  10 ---
>  9 files changed, 204 insertions(+), 235 deletions(-)
>  create mode 100644 hw/pci-host/igd.c
> 
> -- 
> 1.8.3.1
>

Re: [Qemu-devel] [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking

2016-01-05 Thread Michael S. Tsirkin

On Tue, Jan 05, 2016 at 12:43:03PM +, Dr. David Alan Gilbert wrote:
> * Michael S. Tsirkin (m...@redhat.com) wrote:
> > On Tue, Jan 05, 2016 at 10:45:25AM +, Dr. David Alan Gilbert wrote:
> > > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > > On Tue, Jan 05, 2016 at 10:01:04AM +, Dr. David Alan Gilbert wrote:
> > > > > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > > > > On Mon, Jan 04, 2016 at 07:11:25PM -0800, Alexander Duyck wrote:
> > > > > > > >> The two mechanisms referenced above would likely require 
> > > > > > > >> coordination with
> > > > > > > >> QEMU and as such are open to discussion.  I haven't attempted 
> > > > > > > >> to address
> > > > > > > >> them as I am not sure there is a consensus as of yet.  My 
> > > > > > > >> personal
> > > > > > > >> preference would be to add a vendor-specific configuration 
> > > > > > > >> block to the
> > > > > > > >> emulated pci-bridge interfaces created by QEMU that would 
> > > > > > > >> allow us to
> > > > > > > >> essentially extend shpc to support guest live migration with 
> > > > > > > >> pass-through
> > > > > > > >> devices.
> > > > > > > >
> > > > > > > > shpc?
> > > > > > > 
> > > > > > > That is kind of what I was thinking.  We basically need some 
> > > > > > > mechanism
> > > > > > > to allow for the host to ask the device to quiesce.  It has been
> > > > > > > proposed to possibly even look at something like an ACPI interface
> > > > > > > since I know ACPI is used by QEMU to manage hot-plug in the 
> > > > > > > standard
> > > > > > > case.
> > > > > > > 
> > > > > > > - Alex
> > > > > > 
> > > > > > 
> > > > > > Start by using hot-unplug for this!
> > > > > > 
> > > > > > Really use your patch guest side, and write host side
> > > > > > to allow starting migration with the device, but
> > > > > > defer completing it.
> > > > > > 
> > > > > > So
> > > > > > 
> > > > > > 1.- host tells guest to start tracking memory writes
> > > > > > 2.- guest acks
> > > > > > 3.- migration starts
> > > > > > 4.- most memory is migrated
> > > > > > 5.- host tells guest to eject device
> > > > > > 6.- guest acks
> > > > > > 7.- stop vm and migrate rest of state
> > > > > > 
> > > > > > 
> > > > > > It will already be a win since hot unplug after migration starts and
> > > > > > most memory has been migrated is better than hot unplug before 
> > > > > > migration
> > > > > > starts.
> > > > > > 
> > > > > > Then measure downtime and profile. Then we can look at ways
> > > > > > to quiesce device faster which really means step 5 is replaced
> > > > > > with "host tells guest to quiesce device and dirty (or just unmap!)
> > > > > > all memory mapped for write by device".
> > > > > 
> > > > > 
> > > > > Doing a hot-unplug is going to upset the guests network stacks view
> > > > > of the world; that's something we don't want to change.
> > > > > 
> > > > > Dave
> > > > 
> > > > It might but if you store the IP and restore it quickly
> > > > after migration e.g. using guest agent, as opposed to DHCP,
> > > > then it won't.
> > > 
> > > I thought if you hot-unplug then it will lose any outstanding connections
> > > on that device.
> > 
> > Which connections and which device?  TCP connections and an ethernet
> > device?  These are on different layers so of course you don't lose them.
> > Just do not change the IP address.
> > 
> > Some guests send a signal to applications to close connections
> > when all links go down. One can work around this
> > in a variety of ways.
> 
> So, OK, I was surprised that a simple connection didn't go down when
> I tested and just removed the network card; I'd thought stuff was more
> aggressive when there was no route.
> But as you say, some stuff does close connections when the links go down/away
> so we do need to work around that; and any new outgoing connections get
> a 'no route to host'.


You can create a dummy device in guest for the duration of migration.
Use guest agent to move IP address there and that should be enough to trick 
most guests.


>  So I'm still nervous what will break.
> 
> Dave

I'm not saying nothing breaks.  Far being from it.  For example, some NAT
or firewall implementations keep state per interface and these might
lose state (if using NAT/stateful firewall within guest).


So yes it *would* be useful to teach guests, for example, that a device
is "not dead, just resting" and that another device will shortly come
and take its place.


But the simple setup is already useful and worth supporting, and merging
things gradually will help this project finally get off the ground.


> > 
> > > > It allows calming the device down in a generic way,
> > > > specific drivers can then implement the fast quiesce.
> > > 
> > > Except that if it breaks the guest networking it's useless.
> > > 
> > > Dave
> > > 
> > > > 
> > > > > > 
> > > > > > -- 
> > > > > > MST
> > > > > --
> > > > > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> > > --
> > > Dr. David Alan Gilbert / dgilb

Re: [Qemu-devel] [PATCH] pc: allow raising low memory via max-ram-below-4g option

2016-01-05 Thread Michael S. Tsirkin

On Tue, Jan 05, 2016 at 12:59:51PM +0100, Gerd Hoffmann wrote:
> This patch extends the functionality of the max-ram-below-4g option
> to also allow increasing lowmem.  While being at it also rework the
> lowmem calculation logic and add a longish comment describing how it
> works and what the compatibility constrains are.
> 
> Signed-off-by: Gerd Hoffmann 

Could you please add info about the motivation for this
in the commit log?

> ---
>  hw/i386/pc.c  |  2 +-
>  hw/i386/pc_piix.c | 61 
> +++
>  2 files changed, 40 insertions(+), 23 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 459260b..1332269 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1887,7 +1887,7 @@ static void pc_machine_initfn(Object *obj)
>  pc_machine_get_hotplug_memory_region_size,
>  NULL, NULL, NULL, &error_abort);
>  
> -pcms->max_ram_below_4g = 1ULL << 32; /* 4G */
> +pcms->max_ram_below_4g = 0xe000; /* 3.5G */
>  object_property_add(obj, PC_MACHINE_MAX_RAM_BELOW_4G, "size",
>  pc_machine_get_max_ram_below_4g,
>  pc_machine_set_max_ram_below_4g,
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index 438cdae..3743736 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -87,29 +87,46 @@ static void pc_init1(MachineState *machine,
>  PcGuestInfo *guest_info;
>  ram_addr_t lowmem;
>  
> -/* Check whether RAM fits below 4G (leaving 1/2 GByte for IO memory).
> - * If it doesn't, we need to split it in chunks below and above 4G.
> - * In any case, try to make sure that guest addresses aligned at
> - * 1G boundaries get mapped to host addresses aligned at 1G boundaries.
> - * For old machine types, use whatever split we used historically to 
> avoid
> - * breaking migration.
> +/*
> + * Calculate ram split, for memory below and above 4G.  It's a bit
> + * complicated for backward compatibility reasons ...
> + *
> + *  - Traditional split is 3.5G (lowmem = 0xe000).  This is the
> + *default value for max_ram_below_4g now.
> + *
> + *  - Then, to gigabyte align the memory, we move the split to 3G
> + *(lowmem = 0xc000).  But only in case we have to split in
> + *the first place, i.e. ram_size is larger than (traditional)
> + *lowmem.  And for new machine types (gigabyte_align = true)
> + *only, for live migration compatibility reasons.
> + *
> + *  - Next the max-ram-below-4g option was added, which allowed to
> + *reduce lowmem to a smaller value, to allow a larger PCI I/O
> + *window below 4G.  qemu doesn't enforce gigabyte alignment here,
> + *but prints a warning.
> + *
> + *  - Finally max-ram-below-4g got updated to also allow raising lowmem,
> + *so legacy non-PAE guests can get as much memory as possible in
> + *the 32bit address space below 4G.
> + *
> + * Examples:
> + *qemu -M pc-1.7 -m 4G(old default)-> 3584M low,  512M high
> + *qemu -M pc -m 4G(new default)-> 3072M low, 1024M high
> + *qemu -M pc,max-ram-below-4g=2G -m 4G -> 2048M low, 2048M high
> + *qemu -M pc,max-ram-below-4g=4G -m 3968M  -> 3968M low (=4G-128M)
>   */
> -if (machine->ram_size >= 0xe000) {
> -lowmem = pcmc->gigabyte_align ? 0xc000 : 0xe000;
> -} else {
> -lowmem = 0xe000;
> -}
> -
> -/* Handle the machine opt max-ram-below-4g.  It is basically doing
> - * min(qemu limit, user limit).
> - */
> -if (lowmem > pcms->max_ram_below_4g) {
> -lowmem = pcms->max_ram_below_4g;
> -if (machine->ram_size - lowmem > lowmem &&
> -lowmem & ((1ULL << 30) - 1)) {
> -error_report("Warning: Large machine and 
> max_ram_below_4g(%"PRIu64
> - ") not a multiple of 1G; possible bad performance.",
> - pcms->max_ram_below_4g);
> +lowmem = pcms->max_ram_below_4g;
> +if (machine->ram_size >= pcms->max_ram_below_4g) {
> +if (pcmc->gigabyte_align) {
> +if (lowmem > 0xc000) {
> +lowmem = 0xc000;
> +}
> +if (lowmem & ((1ULL << 30) - 1)) {
> +error_report("Warning: Large machine and max_ram_below_4g "
> + "(%" PRIu64 ") not a multiple of 1G; "
> + "possible bad performance.",
> + pcms->max_ram_below_4g);
> +}
>  }
>  }
>  
> -- 
> 1.8.3.1

[Qemu-devel] [PATCH] trace-events: fix broken format strings

2016-01-05 Thread Andrew Jones

Fixes compiling with --enable-trace-backends

Signed-off-by: Andrew Jones 
---
 trace-events | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/trace-events b/trace-events
index 6f036384a84f8..e42898d5659d5 100644
--- a/trace-events
+++ b/trace-events
@@ -1799,15 +1799,15 @@ qcrypto_tls_session_new(void *session, void *creds, 
const char *hostname, const
 vhost_user_event(const char *chr, int event) "chr: %s got event: %d"
 
 # linux-user/signal.c
-user_setup_frame(void *env, uint64_t frame_addr) "env=%p frame_addr="PRIx64""
-user_setup_rt_frame(void *env, uint64_t frame_addr) "env=%p 
frame_addr="PRIx64""
-user_do_rt_sigreturn(void *env, uint64_t frame_addr) "env=%p 
frame_addr="PRIx64""
-user_do_sigreturn(void *env, uint64_t frame_addr) "env=%p frame_addr="PRIx64""
+user_setup_frame(void *env, uint64_t frame_addr) "env=%p 
frame_addr=0x%"PRIx64""
+user_setup_rt_frame(void *env, uint64_t frame_addr) "env=%p 
frame_addr=0x%"PRIx64""
+user_do_rt_sigreturn(void *env, uint64_t frame_addr) "env=%p 
frame_addr=0x%"PRIx64""
+user_do_sigreturn(void *env, uint64_t frame_addr) "env=%p 
frame_addr=0x%"PRIx64""
 user_force_sig(void *env, int target_sig, int host_sig) "env=%p signal %d 
(host %d)"
 user_handle_signal(void *env, int target_sig) "env=%p signal %d"
 user_host_signal(void *env, int host_sig, int target_sig) "env=%p signal %d 
(target %d("
 user_queue_signal(void *env, int target_sig) "env=%p signal %d"
-user_s390x_restore_sigregs(void *env, uint64_t sc_psw_addr, uint64_t 
env_psw_addr) "env=%p frame psw.addr "PRIx64 " current psw.addr "PRIx64""
+user_s390x_restore_sigregs(void *env, uint64_t sc_psw_addr, uint64_t 
env_psw_addr) "env=%p frame psw.addr 0x%"PRIx64 " current psw.addr 0x%"PRIx64""
 
 # io/task.c
 qio_task_new(void *task, void *source, void *func, void *opaque) "Task new 
task=%p source=%p func=%p opaque=%p"
-- 
2.4.3

[Qemu-devel] [PATCH] hw/dma/xilinx_axidma: debug printf fixups

2016-01-05 Thread Andrew Jones

(Found by grepping for broken PRI users.)

Signed-off-by: Andrew Jones 
---
 hw/dma/xilinx_axidma.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/dma/xilinx_axidma.c b/hw/dma/xilinx_axidma.c
index b1cfa11356a26..2ab0772cd19ae 100644
--- a/hw/dma/xilinx_axidma.c
+++ b/hw/dma/xilinx_axidma.c
@@ -180,10 +180,10 @@ static inline int streamid_from_addr(hwaddr addr)
 #ifdef DEBUG_ENET
 static void stream_desc_show(struct SDesc *d)
 {
-qemu_log("buffer_addr  = " PRIx64 "\n", d->buffer_address);
-qemu_log("nxtdesc  = " PRIx64 "\n", d->nxtdesc);
-qemu_log("control  = %x\n", d->control);
-qemu_log("status   = %x\n", d->status);
+qemu_log("buffer_addr  = 0x%" PRIx64 "\n", d->buffer_address);
+qemu_log("nxtdesc  = 0x%" PRIx64 "\n", d->nxtdesc);
+qemu_log("control  = 0x%x\n", d->control);
+qemu_log("status   = 0x%x\n", d->status);
 }
 #endif
 
-- 
2.4.3

Re: [Qemu-devel] [PATCH] pc: allow raising low memory via max-ram-below-4g option

2016-01-05 Thread Gerd Hoffmann

On Di, 2016-01-05 at 15:17 +0200, Michael S. Tsirkin wrote:
> On Tue, Jan 05, 2016 at 12:59:51PM +0100, Gerd Hoffmann wrote:
> > This patch extends the functionality of the max-ram-below-4g option
> > to also allow increasing lowmem.  While being at it also rework the
> > lowmem calculation logic and add a longish comment describing how it
> > works and what the compatibility constrains are.
> > 
> > Signed-off-by: Gerd Hoffmann 
> 
> Could you please add info about the motivation for this
> in the commit log?

> > + *  - Finally max-ram-below-4g got updated to also allow raising 
> > lowmem,
> > + *so legacy non-PAE guests can get as much memory as possible in
> > + *the 32bit address space below 4G.

It's right here (where it imho is more useful for people looking at the
code).

Want me copy this into the commit message?

cheers,
  Gerd

Re: [Qemu-devel] [PATCH] pc: allow raising low memory via max-ram-below-4g option

2016-01-05 Thread Michael S. Tsirkin

On Tue, Jan 05, 2016 at 02:32:05PM +0100, Gerd Hoffmann wrote:
> On Di, 2016-01-05 at 15:17 +0200, Michael S. Tsirkin wrote:
> > On Tue, Jan 05, 2016 at 12:59:51PM +0100, Gerd Hoffmann wrote:
> > > This patch extends the functionality of the max-ram-below-4g option
> > > to also allow increasing lowmem.  While being at it also rework the
> > > lowmem calculation logic and add a longish comment describing how it
> > > works and what the compatibility constrains are.
> > > 
> > > Signed-off-by: Gerd Hoffmann 
> > 
> > Could you please add info about the motivation for this
> > in the commit log?
> 
> > > + *  - Finally max-ram-below-4g got updated to also allow raising 
> > > lowmem,
> > > + *so legacy non-PAE guests can get as much memory as possible in
> > > + *the 32bit address space below 4G.
> 
> It's right here (where it imho is more useful for people looking at the
> code).

That's good.

> Want me copy this into the commit message?
> 
> cheers,
>   Gerd

Yes please do in the next revision. I do review in several passes, info
in commit log is helpful so I can decide how urgent is any given patch
and thus into which mailbox to sort it.

-- 
MST

[Qemu-devel] [PATCH v9 2/4] hw/ptimer: Perform tick and counter wrap around if timer already expired

2016-01-05 Thread Dmitry Osipenko

ptimer_get_count() might be called while QEMU timer already been expired.
In that case ptimer would return counter = 0, which might be undesirable
in case of polled timer. Do counter wrap around for periodic timer to keep
it distributed.

In addition, there is no reason to keep expired timer tick deferred, so
just perform the tick from ptimer_get_count().

Signed-off-by: Dmitry Osipenko 
---
 hw/core/ptimer.c | 35 +--
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index 035af97..ff0586b 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -85,15 +85,21 @@ static void ptimer_tick(void *opaque)
 
 uint64_t ptimer_get_count(ptimer_state *s)
 {
+int enabled = s->enabled;
 int64_t now;
+int64_t next;
 uint64_t counter;
+int expired;
+int oneshot;
 
-if (s->enabled) {
+if (enabled) {
 now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+next = s->next_event;
+expired = (now - next >= 0);
+oneshot = (enabled == 2);
 /* Figure out the current counter value.  */
-if (now - s->next_event > 0
-|| s->period == 0) {
-/* Prevent timer underflowing if it should already have
+if ((s->period == 0) || (expired && oneshot)) {
+/* Prevent one-shot timer underflowing if it should already have
triggered.  */
 counter = 0;
 } else {
@@ -114,12 +120,12 @@ uint64_t ptimer_get_count(ptimer_state *s)
backwards.
 */
 
-if ((s->enabled == 1) && (s->limit * period < 1)) {
+if (!oneshot && (s->limit * period < 1)) {
 period = 1 / s->limit;
 period_frac = 0;
 }
 
-rem = s->next_event - now;
+rem = expired ? now - next : next - now;
 div = period;
 
 clz1 = clz64(rem);
@@ -139,6 +145,23 @@ uint64_t ptimer_get_count(ptimer_state *s)
 div += 1;
 }
 counter = rem / div;
+
+if (expired && (counter != 0)) {
+/* Wrap around periodic counter.  */
+counter = s->delta = s->limit - counter % s->limit;
+}
+}
+
+if (expired) {
+if (counter == 0) {
+ptimer_tick(s);
+} else {
+/* Don't use ptimer_tick() for the periodic timer since it
+ * would reset the delta value.
+ */
+ptimer_trigger(s);
+ptimer_reload(s);
+}
 }
 } else {
 counter = s->delta;
-- 
2.6.4

[Qemu-devel] [PATCH v9 1/4] hw/ptimer: Fix issues caused by the adjusted timer limit value

2016-01-05 Thread Dmitry Osipenko

Multiple issues here related to the timer with a adjusted .limit value:

1) ptimer_get_count() returns incorrect counter value for the disabled
timer after loading the counter with a small value, because adjusted limit
value is used instead of the original.

For instance:
1) ptimer_stop(t)
2) ptimer_set_period(t, 1)
3) ptimer_set_limit(t, 0, 1)
4) ptimer_get_count(t) <-- would return 1 instead of 0

2) ptimer_get_count() might return incorrect value for the timer running
with a adjusted limit value.

For instance:
1) ptimer_stop(t)
2) ptimer_set_period(t, 1)
3) ptimer_set_limit(t, 10, 1)
4) ptimer_run(t)
5) ptimer_get_count(t) <-- might return value > 10

3) Neither ptimer_set_period() nor ptimer_set_freq() are adjusting the
limit value, so it is still possible to make timer timeout value
arbitrary small.

For instance:
1) ptimer_set_period(t, 1)
2) ptimer_set_limit(t, 1, 0)
3) ptimer_set_period(t, 1) <-- bypass limit correction

Fix all of the above issues by adjusting timer period instead of the limit.
Do the adjust for periodic timer only.

Signed-off-by: Dmitry Osipenko 
---
 hw/core/ptimer.c | 59 ++--
 1 file changed, 36 insertions(+), 23 deletions(-)

diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index edf077c..035af97 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -34,20 +34,39 @@ static void ptimer_trigger(ptimer_state *s)
 
 static void ptimer_reload(ptimer_state *s)
 {
-if (s->delta == 0) {
+uint32_t period_frac = s->period_frac;
+uint64_t period = s->period;
+uint64_t delta = s->delta;
+uint64_t limit = s->limit;
+
+if (delta == 0) {
 ptimer_trigger(s);
-s->delta = s->limit;
+delta = limit;
 }
-if (s->delta == 0 || s->period == 0) {
+if (delta == 0 || period == 0) {
 fprintf(stderr, "Timer with period zero, disabling\n");
 s->enabled = 0;
 return;
 }
 
+/*
+ * Artificially limit timeout rate to something
+ * achievable under QEMU.  Otherwise, QEMU spends all
+ * its time generating timer interrupts, and there
+ * is no forward progress.
+ * About ten microseconds is the fastest that really works
+ * on the current generation of host machines.
+ */
+
+if ((s->enabled == 1) && (limit * period < 1)) {
+period = 1 / limit;
+period_frac = 0;
+}
+
 s->last_event = s->next_event;
-s->next_event = s->last_event + s->delta * s->period;
-if (s->period_frac) {
-s->next_event += ((int64_t)s->period_frac * s->delta) >> 32;
+s->next_event = s->last_event + delta * period;
+if (period_frac) {
+s->next_event += ((int64_t)period_frac * delta) >> 32;
 }
 timer_mod(s->timer, s->next_event);
 }
@@ -82,6 +101,8 @@ uint64_t ptimer_get_count(ptimer_state *s)
 uint64_t div;
 int clz1, clz2;
 int shift;
+uint32_t period_frac = s->period_frac;
+uint64_t period = s->period;
 
 /* We need to divide time by period, where time is stored in
rem (64-bit integer) and period is stored in period/period_frac
@@ -93,8 +114,13 @@ uint64_t ptimer_get_count(ptimer_state *s)
backwards.
 */
 
+if ((s->enabled == 1) && (s->limit * period < 1)) {
+period = 1 / s->limit;
+period_frac = 0;
+}
+
 rem = s->next_event - now;
-div = s->period;
+div = period;
 
 clz1 = clz64(rem);
 clz2 = clz64(div);
@@ -103,13 +129,13 @@ uint64_t ptimer_get_count(ptimer_state *s)
 rem <<= shift;
 div <<= shift;
 if (shift >= 32) {
-div |= ((uint64_t)s->period_frac << (shift - 32));
+div |= ((uint64_t)period_frac << (shift - 32));
 } else {
 if (shift != 0)
-div |= (s->period_frac >> (32 - shift));
+div |= (period_frac >> (32 - shift));
 /* Look at remaining bits of period_frac and round div up if 
necessary.  */
-if ((uint32_t)(s->period_frac << shift))
+if ((uint32_t)(period_frac << shift))
 div += 1;
 }
 counter = rem / div;
@@ -181,19 +207,6 @@ void ptimer_set_freq(ptimer_state *s, uint32_t freq)
count = limit.  */
 void ptimer_set_limit(ptimer_state *s, uint64_t limit, int reload)
 {
-/*
- * Artificially limit timeout rate to something
- * achievable under QEMU.  Otherwise, QEMU spends all
- * its time generating timer interrupts, and there
- * is no forward progress.
- * About ten microseconds is the fastest that really works
- * on the current generation of host machines.
- */
-
-if (!use_icount && limit * s->period < 1 &&

[Qemu-devel] [PATCH v9 3/4] hw/ptimer: Update .delta on period/freq change

2016-01-05 Thread Dmitry Osipenko

Delta value must be updated on period/freq change, otherwise running timer
would be restarted (counter reloaded with old delta). Only m68k/mcf520x
and arm/arm_timer devices are currently doing freq change correctly, i.e.
stopping the timer. Perform delta update to fix affected devices and
eliminate potential further mistakes.

Signed-off-by: Dmitry Osipenko 
---
 hw/core/ptimer.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index ff0586b..eb4eb4b 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -207,6 +207,7 @@ void ptimer_stop(ptimer_state *s)
 /* Set counter increment interval in nanoseconds.  */
 void ptimer_set_period(ptimer_state *s, int64_t period)
 {
+s->delta = ptimer_get_count(s);
 s->period = period;
 s->period_frac = 0;
 if (s->enabled) {
@@ -218,6 +219,7 @@ void ptimer_set_period(ptimer_state *s, int64_t period)
 /* Set counter frequency in Hz.  */
 void ptimer_set_freq(ptimer_state *s, uint32_t freq)
 {
+s->delta = ptimer_get_count(s);
 s->period = 10ll / freq;
 s->period_frac = (10ll << 32) / freq;
 if (s->enabled) {
-- 
2.6.4

[Qemu-devel] [PATCH v9 0/4] PTimer fixes and ARM MPTimer conversion

2016-01-05 Thread Dmitry Osipenko

Changelog for ARM MPTimer QEMUTimer to ptimer conversion:

V2: Fixed changing periodic timer counter value "on the fly". I added a
test to the gist to cover that issue.

V3: Fixed starting the timer with load = 0 and counter != 0, added tests
to the gist for this issue. Changed vmstate version for all VMSD's,
since loadvm doesn't check version of nested VMSD.

V4: Fixed spurious IT bit set for the timer starting in the periodic mode
with counter = 0. Test added.

V5: Code cleanup, now depends on ptimer_set_limit() fix.

V6: No code change, added test to check ptimer_get_count() with corrected
.limit value.

V7: No change.

V8: No change.

V9: No change.

ARM MPTimer tests: https://gist.github.com/digetx/dbd46109503b1a91941a


Patch for ptimer is introduced since V5 of "ARM MPTimer conversion".

Changelog for the "ptimer fixes" patch:

V5: Only fixed ptimer_set_limit() for the disabled timer.

V6: As was pointed by Peter Maydell, there are other issues beyond
ptimer_set_limit(), so V6 supposed to cover all those issues.

V7: Added accidentally removed !use_icount check.
Added missed "else" statement.

V8: Adjust period instead of the limit and do it for periodic timer only
(.limit adjusting bug). Added patch/fix for freq/period change and
ptimer_get_count() improvement.

V9: Don't do wrap around if counter == 0, otherwise polled periodic
timer won't ever return counter = 0.

Dmitry Osipenko (4):
  hw/ptimer: Fix issues caused by the adjusted timer limit value
  hw/ptimer: Perform tick and counter wrap around if timer already
expired
  hw/ptimer: Update .delta on period/freq change
  arm_mptimer: Convert to use ptimer

 hw/core/ptimer.c   |  94 ---
 hw/timer/arm_mptimer.c | 110 ++---
 include/hw/timer/arm_mptimer.h |   4 +-
 3 files changed, 115 insertions(+), 93 deletions(-)

-- 
2.6.4

[Qemu-devel] [PATCH v9 4/4] arm_mptimer: Convert to use ptimer

2016-01-05 Thread Dmitry Osipenko

Current ARM MPTimer implementation uses QEMUTimer for the actual timer,
this implementation isn't complete and mostly tries to duplicate of what
generic ptimer is already doing fine.

Conversion to ptimer brings the following benefits and fixes:
- Simple timer pausing implementation
- Fixes counter value preservation after stopping the timer
- Code simplification and reduction

Bump VMSD to version 3, since VMState is changed and is not compatible
with the previous implementation.

Signed-off-by: Dmitry Osipenko 
---
 hw/timer/arm_mptimer.c | 110 ++---
 include/hw/timer/arm_mptimer.h |   4 +-
 2 files changed, 49 insertions(+), 65 deletions(-)

diff --git a/hw/timer/arm_mptimer.c b/hw/timer/arm_mptimer.c
index 3e59c2a..c06da5e 100644
--- a/hw/timer/arm_mptimer.c
+++ b/hw/timer/arm_mptimer.c
@@ -19,8 +19,9 @@
  * with this program; if not, see .
  */
 
+#include "hw/ptimer.h"
 #include "hw/timer/arm_mptimer.h"
-#include "qemu/timer.h"
+#include "qemu/main-loop.h"
 #include "qom/cpu.h"
 
 /* This device implements the per-cpu private timer and watchdog block
@@ -47,28 +48,10 @@ static inline uint32_t timerblock_scale(TimerBlock *tb)
 return (((tb->control >> 8) & 0xff) + 1) * 10;
 }
 
-static void timerblock_reload(TimerBlock *tb, int restart)
-{
-if (tb->count == 0) {
-return;
-}
-if (restart) {
-tb->tick = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
-}
-tb->tick += (int64_t)tb->count * timerblock_scale(tb);
-timer_mod(tb->timer, tb->tick);
-}
-
 static void timerblock_tick(void *opaque)
 {
 TimerBlock *tb = (TimerBlock *)opaque;
 tb->status = 1;
-if (tb->control & 2) {
-tb->count = tb->load;
-timerblock_reload(tb, 0);
-} else {
-tb->count = 0;
-}
 timerblock_update_irq(tb);
 }
 
@@ -76,21 +59,11 @@ static uint64_t timerblock_read(void *opaque, hwaddr addr,
 unsigned size)
 {
 TimerBlock *tb = (TimerBlock *)opaque;
-int64_t val;
 switch (addr) {
 case 0: /* Load */
 return tb->load;
 case 4: /* Counter.  */
-if (((tb->control & 1) == 0) || (tb->count == 0)) {
-return 0;
-}
-/* Slow and ugly, but hopefully won't happen too often.  */
-val = tb->tick - qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
-val /= timerblock_scale(tb);
-if (val < 0) {
-val = 0;
-}
-return val;
+return ptimer_get_count(tb->timer);
 case 8: /* Control.  */
 return tb->control;
 case 12: /* Interrupt status.  */
@@ -100,6 +73,19 @@ static uint64_t timerblock_read(void *opaque, hwaddr addr,
 }
 }
 
+static void timerblock_run(TimerBlock *tb, uint64_t count, int set_count)
+{
+if (set_count) {
+if (((tb->control & 3) == 3) && (count == 0)) {
+count = tb->load;
+}
+ptimer_set_count(tb->timer, count);
+}
+if ((tb->control & 1) && (count != 0)) {
+ptimer_run(tb->timer, !(tb->control & 2));
+}
+}
+
 static void timerblock_write(void *opaque, hwaddr addr,
  uint64_t value, unsigned size)
 {
@@ -108,32 +94,34 @@ static void timerblock_write(void *opaque, hwaddr addr,
 switch (addr) {
 case 0: /* Load */
 tb->load = value;
-/* Fall through.  */
-case 4: /* Counter.  */
-if ((tb->control & 1) && tb->count) {
-/* Cancel the previous timer.  */
-timer_del(tb->timer);
+/* Setting load to 0 stops the timer.  */
+if (tb->load == 0) {
+ptimer_stop(tb->timer);
 }
-tb->count = value;
-if (tb->control & 1) {
-timerblock_reload(tb, 1);
+ptimer_set_limit(tb->timer, tb->load, 1);
+timerblock_run(tb, tb->load, 0);
+break;
+case 4: /* Counter.  */
+/* Setting counter to 0 stops the one-shot timer.  */
+if (!(tb->control & 2) && (value == 0)) {
+ptimer_stop(tb->timer);
 }
+timerblock_run(tb, value, 1);
 break;
 case 8: /* Control.  */
 old = tb->control;
 tb->control = value;
-if (value & 1) {
-if ((old & 1) && (tb->count != 0)) {
-/* Do nothing if timer is ticking right now.  */
-break;
-}
-if (tb->control & 2) {
-tb->count = tb->load;
-}
-timerblock_reload(tb, 1);
-} else if (old & 1) {
-/* Shutdown the timer.  */
-timer_del(tb->timer);
+/* Timer mode switch requires ptimer to be stopped.  */
+if ((old & 3) != (tb->control & 3)) {
+ptimer_stop(tb->timer);
+}
+if (!(tb->control & 1)) {
+break;
+}
+ptimer_set_period(tb->timer, timerblock_scale(tb));
+if ((old & 3) != (tb->control & 3))

Re: [Qemu-devel] qcow2 snapshot + resize

2016-01-05 Thread Eric Blake

On 01/05/2016 05:10 AM, lihuiba wrote:

>>> In our production environment, we need to extend a qcow2 image with
>>> snapshots in it.

>> The thing is that one would need to update all the inactive L1 tables. I
>> don't think it should be too difficult, it's just that apparently so far
>> nobody ever had the need for this feature.

Is resizing a snapshot really what you want?  Ideally, a snapshot tracks
the data from a point in time, including the metadata of the size being
tracked at that time.  Extending the snapshots then reverting to that
snapshot means your guest would see a larger disk on revert than it did
at the time the snapshot was created, which guests might not handle very
well.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v8 31/35] qapi: Rework deallocation of partial struct

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> Commit cee2dedb noticed that if you have a partial flat union
> (such as if an input parse failed due to a missing
> discriminator), calling the dealloc visitor could result in
> trying to dereference the NULL pointer. But the fix it proposed
> requires the use of a 'data' member in the union, which may or
> may not be the same size as other branches of the union
> (consider a 32-bit platform where one of the branches is an
> int64), so it feels fairly dirty.  A better fix is to tweak all
> of the generated visit_type_implicit_FOO() functions to avoid
> dereferencing NULL in the first place, by not visiting the
> fields if the struct pointer itself is not present, at which
> point we no longer even need visit_start_union().  And no one
> was implementing visit_end_union() callbacks.
>
> While rewriting the code, use patterns that are closer to what
> is used elsewhere in the generated visitors, by using 'goto'
> to cleanup labels rather than putting followup code under 'if'
> conditions.  The change keeps the contract that any successful
> use of visit_start_implicit_struct() will be paired with a
> matching visit_end_implicit_struct(), even if intermediate
> processing is skipped.  We are safe in checking *obj alone, as
> as the contract of visit_start_implicit_struct() requires a
> non-NULL obj.
>
> As an example of the changes to generated code:
> |@@ -1331,10 +1331,16 @@ static void visit_type_implicit_Blockdev
> | Error *err = NULL;
> |
> | visit_start_implicit_struct(v, (void **)obj, 
> sizeof(BlockdevOptionsArchipelago), &err);
> |-if (!err) {
> |-visit_type_BlockdevOptionsArchipelago_fields(v, obj, errp);
> |-visit_end_implicit_struct(v);
> |+if (err) {
> |+goto out;
> |+}
> |+if (!*obj) {
> |+goto out_obj;
> | }
> |+visit_type_BlockdevOptionsArchipelago_fields(v, obj, &err);
> |+out_obj:
> |+visit_end_implicit_struct(v);
> |+out:
> | error_propagate(errp, err);
> | }
> ...
> |@@ -1479,9 +1539,6 @@ void visit_type_BlockdevOptions(Visitor
> | if (err) {
> | goto out_obj;
> | }
> |-if (!visit_start_union(v, !!(*obj)->u.data, &err) || err) {
> |-goto out_obj;
> |-}
> | switch ((*obj)->driver) {
> | case BLOCKDEV_DRIVER_ARCHIPELAGO:
> | visit_type_implicit_BlockdevOptionsArchipelago(v, 
> &(*obj)->u.archipelago, &err);
> |@@ -1570,11 +1627,6 @@ void visit_type_BlockdevOptions(Visitor
> | out_obj:
> | error_propagate(errp, err);
> | err = NULL;
> |-if (*obj) {
> |-visit_end_union(v, !!(*obj)->u.data, &err);
> |-}
> |-error_propagate(errp, err);
> |-err = NULL;
> | visit_end_struct(v, &err);
>
> Signed-off-by: Eric Blake 
>

Reviewed-by: Marc-André Lureau 

-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v8 29/35] qapi: Canonicalize missing object to :empty

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> Now that we elide unnecessary visits of empty types, we can
> start using the special ':empty' type in more places.  By using
> the empty type as the base class of every explicit struct or
> union, and as the default data for any command or event, we can
> simplify later logic in qapi-{visit,commands,event} by merely
> checking whether the type is empty, without also having to worry
> whether a type was even supplied.
>
> Note that gen_object() in qapi-types still has to check for a
> base, because it is also called for alternates (which have no
> base).
>
> No change to generated code.
>
> Signed-off-by: Eric Blake 
>

Looks all good to me, but will let Markus check if that makes sense.


-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v8 28/35] qapi: Eliminate empty visit_type_FOO_fields

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> For empty structs, such as the 'Abort' helper type used as part
> of the 'transaction' command, we were emitting a no-op
> visit_type_FOO_fields().  Optimize things to instead omit calls
> for empty structs.  Generated code changes resemble:
>
> |-static void visit_type_Abort_fields(Visitor *v, Abort **obj, Error **errp)
> |-{
> |-Error *err = NULL;
> |-
> |-error_propagate(errp, err);
> |-}
> |-
> | void visit_type_Abort(Visitor *v, const char *name, Abort **obj, Error 
> **errp)
> | {
> | Error *err = NULL;
> |@@ -112,7 +105,6 @@ void visit_type_Abort(Visitor *v, Abort
> | if (!*obj) {
> | goto out_obj;
> | }
> |-visit_type_Abort_fields(v, obj, &err);
> | out_obj:
> | error_propagate(errp, err);
>
> Another reason for doing this optimization is that it gets us
> closer to merging the code for visiting structs and unions:
> since flat unions have no local members, they do not need to
> have a visit_type_UNION_fields() emitted, even when they start
> sharing the code used to visit structs.
>
> Signed-off-by: Eric Blake 
>

Reviewed-by: Marc-André Lureau 




-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v8 30/35] qapi-visit: Unify struct and union visit

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> We are finally at the point where gen_visit_struct() and
> gen_visit_union() can be unified to a generic gen_visit_object().
>
> The generated code for structs and for flat unions is unchanged.
> For simple unions, a new visit_type_FOO_fields() is created,
> wrapping the visit of the non-variant tag field:
>
> |+static void visit_type_ChardevBackend_fields(Visitor *v, ChardevBackend 
> **obj, Error **errp)
> |+{
> |+Error *err = NULL;
> |+
> |+visit_type_ChardevBackendKind(v, "type", &(*obj)->type, &err);
> |+if (err) {
> |+goto out;
> |+}
> |+
> |+out:
> |+error_propagate(errp, err);
> |+}
> |+
> | void visit_type_ChardevBackend(Visitor *v, const char *name, ChardevBackend 
> **obj, Error **errp)
> | {
> | Error *err = NULL;
> |@@ -2319,7 +2332,7 @@ void visit_type_ChardevBackend(Visitor *
> | if (!*obj) {
> | goto out_obj;
> | }
> |-visit_type_ChardevBackendKind(v, "type", &(*obj)->type, &err);
> |+visit_type_ChardevBackend_fields(v, obj, &err);
> | if (err) {
>
> Signed-off-by: Eric Blake 
>

Reviewed-by: Marc-André Lureau 

-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v8 27/35] qapi: Fix command with named empty argument type

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> The generator special-cased
>  { 'command':'foo', 'data': {} }
> to avoid emitting a visitor variable, but failed to see that
>  { 'struct':'NamedEmptyType, 'data': {} }
>  { 'command':'foo', 'data':'NamedEmptyType' }
> needs the same treatment.  Without a fix to the generator, the
> change to qapi-schema-test.json fails to compile with:
>
> tests/test-qmp-marshal.c: In function ‘qmp_marshal_user_def_cmd0’:
> tests/test-qmp-marshal.c:264:14: error: variable ‘v’ set but not used 
> [-Werror=unused-but-set-variable]
>  Visitor *v;
>   ^
>
> Signed-off-by: Eric Blake 

Reviewed-by: Marc-André Lureau 


-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v8 18/35] qapi: Drop unused error argument for list and implicit struct

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> No backend was setting an error when ending the visit of a list
> or implicit struct.  Make the callers a bit easier to follow by
> making this a part of the contract, and removing the errp
> argument - callers can then unconditionally end an object as
> part of cleanup without having to think about whether a second
> error is dominated by a first, because there is no second error.
>
> A later patch will then tackle the larger task of splitting
> visit_end_struct(), which can indeed set an error.
>
> Signed-off-by: Eric Blake 
>

This patch makes visit_end_list() possibly abort, while before it
would pass the error to upper. I assume that's what you are going to
change next.

Reviewed-by: Marc-André Lureau 

> ---
> v8: no change
> v7: place earlier in series, rebase to earlier changes
> v6: new patch, split from RFC on v5 7/46
> ---
>  hw/ppc/spapr_drc.c   |  6 +-
>  include/qapi/visitor-impl.h  |  6 --
>  include/qapi/visitor.h   |  5 +++--
>  qapi/opts-visitor.c  |  2 +-
>  qapi/qapi-dealloc-visitor.c  |  4 ++--
>  qapi/qapi-visit-core.c   |  8 
>  qapi/qmp-input-visitor.c |  9 ++---
>  qapi/qmp-output-visitor.c|  2 +-
>  qapi/string-input-visitor.c  |  2 +-
>  qapi/string-output-visitor.c |  2 +-
>  scripts/qapi-visit.py| 10 +++---
>  11 files changed, 23 insertions(+), 33 deletions(-)
>
> diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> index c00e9da..18a4225 100644
> --- a/hw/ppc/spapr_drc.c
> +++ b/hw/ppc/spapr_drc.c
> @@ -314,11 +314,7 @@ static void prop_get_fdt(Object *obj, Visitor *v, const 
> char *name,
>  return;
>  }
>  }
> -visit_end_list(v, &err);
> -if (err) {
> -error_propagate(errp, err);
> -return;
> -}
> +visit_end_list(v);
>  break;
>  }
>  default:
> diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
> index ac22964..32d6725 100644
> --- a/include/qapi/visitor-impl.h
> +++ b/include/qapi/visitor-impl.h
> @@ -24,11 +24,13 @@ struct Visitor
>
>  void (*start_implicit_struct)(Visitor *v, void **obj, size_t size,
>Error **errp);
> -void (*end_implicit_struct)(Visitor *v, Error **errp);
> +/* May be NULL */
> +void (*end_implicit_struct)(Visitor *v);
>
>  void (*start_list)(Visitor *v, const char *name, Error **errp);
>  GenericList *(*next_list)(Visitor *v, GenericList **list, Error **errp);
> -void (*end_list)(Visitor *v, Error **errp);
> +/* Must be set */
> +void (*end_list)(Visitor *v);
>
>  void (*type_enum)(Visitor *v, const char *name, int *obj,
>const char *const strings[], Error **errp);
> diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h
> index 4abc180..10390d2 100644
> --- a/include/qapi/visitor.h
> +++ b/include/qapi/visitor.h
> @@ -32,10 +32,11 @@ void visit_start_struct(Visitor *v, const char *name, 
> void **obj,
>  void visit_end_struct(Visitor *v, Error **errp);
>  void visit_start_implicit_struct(Visitor *v, void **obj, size_t size,
>   Error **errp);
> -void visit_end_implicit_struct(Visitor *v, Error **errp);
> +void visit_end_implicit_struct(Visitor *v);
> +
>  void visit_start_list(Visitor *v, const char *name, Error **errp);
>  GenericList *visit_next_list(Visitor *v, GenericList **list, Error **errp);
> -void visit_end_list(Visitor *v, Error **errp);
> +void visit_end_list(Visitor *v);
>
>  /**
>   * Check if an optional member @name of an object needs visiting.
> diff --git a/qapi/opts-visitor.c b/qapi/opts-visitor.c
> index 6d4a91e..62ffdd4 100644
> --- a/qapi/opts-visitor.c
> +++ b/qapi/opts-visitor.c
> @@ -269,7 +269,7 @@ opts_next_list(Visitor *v, GenericList **list, Error 
> **errp)
>
>
>  static void
> -opts_end_list(Visitor *v, Error **errp)
> +opts_end_list(Visitor *v)
>  {
>  OptsVisitor *ov = to_ov(v);
>
> diff --git a/qapi/qapi-dealloc-visitor.c b/qapi/qapi-dealloc-visitor.c
> index 49e7cf0..560feb3 100644
> --- a/qapi/qapi-dealloc-visitor.c
> +++ b/qapi/qapi-dealloc-visitor.c
> @@ -83,7 +83,7 @@ static void qapi_dealloc_start_implicit_struct(Visitor *v,
>  qapi_dealloc_push(qov, obj);
>  }
>
> -static void qapi_dealloc_end_implicit_struct(Visitor *v, Error **errp)
> +static void qapi_dealloc_end_implicit_struct(Visitor *v)
>  {
>  QapiDeallocVisitor *qov = to_qov(v);
>  void **obj = qapi_dealloc_pop(qov);
> @@ -119,7 +119,7 @@ static GenericList *qapi_dealloc_next_list(Visitor *v, 
> GenericList **listp,
>  return NULL;
>  }
>
> -static void qapi_dealloc_end_list(Visitor *v, Error **errp)
> +static void qapi_dealloc_end_list(Visitor *v)
>  {
>  QapiDeallocVisitor *qov = to_qov(v);
>  void *obj = qapi_dealloc_pop(qov);
> diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
> i

Re: [Qemu-devel] [PATCH v8 23/35] qmp: Tighten output visitor rules

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> Add a new qmp_output_visitor_reset(), which must be called before
> reusing an exising QmpOutputVisitor on a new root object.  Tighten
> assertions to require that qmp_output_get_qobject() can only be
> called after pairing a visit_end_* for every visit_start_* (rather
> than allowing it to return a partially built object), that it must
> not be called unless at least one visit_type_* or visit_start/
> visit_end pair has occurred since creation/reset (the accidental
> return of NULL fixed by commit ab8bf1d7 would have been much
> easier to diagnose), and that it may only be called once per visit.
>
> To keep the semantics of test_visitor_out_empty, we now have to
> explicitly request a top-level visit of a NULL object, by
> implementing the just-added visitor type_null() callback.
>
> Signed-off-by: Eric Blake 
>
> ---
> v8: rename qmp_output_reset to qmp_output_visitor_reset
> v7: new patch, based on discussion about spapr_drc.c
> ---
>  include/qapi/qmp-output-visitor.h |  1 +
>  include/qapi/visitor-impl.h   |  2 +-
>  qapi/qmp-output-visitor.c | 37 -
>  tests/test-qmp-output-visitor.c   |  2 ++
>  4 files changed, 28 insertions(+), 14 deletions(-)
>
> diff --git a/include/qapi/qmp-output-visitor.h 
> b/include/qapi/qmp-output-visitor.h
> index 2266770..5093f0d 100644
> --- a/include/qapi/qmp-output-visitor.h
> +++ b/include/qapi/qmp-output-visitor.h
> @@ -21,6 +21,7 @@ typedef struct QmpOutputVisitor QmpOutputVisitor;
>
>  QmpOutputVisitor *qmp_output_visitor_new(void);
>  void qmp_output_visitor_cleanup(QmpOutputVisitor *v);
> +void qmp_output_visitor_reset(QmpOutputVisitor *v);
>
>  QObject *qmp_output_get_qobject(QmpOutputVisitor *v);
>  Visitor *qmp_output_get_visitor(QmpOutputVisitor *v);
> diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
> index d301901..ed2934b 100644
> --- a/include/qapi/visitor-impl.h
> +++ b/include/qapi/visitor-impl.h
> @@ -76,7 +76,7 @@ struct Visitor
>  void (*type_any)(Visitor *v, const char *name, QObject **obj,
>   Error **errp);
>  /* Must be provided to visit explicit null values (right now, only the
> - * dealloc visitor supports this).  */
> + * dealloc and qmp-output visitors support this).  */
>  void (*type_null)(Visitor *v, const char *name, Error **errp);
>
>  /* May be NULL; most useful for input visitors. */
> diff --git a/qapi/qmp-output-visitor.c b/qapi/qmp-output-visitor.c
> index df22999..f2c39c5 100644
> --- a/qapi/qmp-output-visitor.c
> +++ b/qapi/qmp-output-visitor.c
> @@ -1,6 +1,7 @@
>  /*
>   * Core Definitions for QAPI/QMP Command Registry
>   *
> + * Copyright (C) 2015 Red Hat, Inc.
>   * Copyright IBM, Corp. 2011
>   *
>   * Authors:
> @@ -88,9 +89,8 @@ static void qmp_output_add_obj(QmpOutputVisitor *qov, const 
> char *name,
>  cur = qmp_output_last(qov);
>
>  if (!cur) {
> -/* FIXME we should require the user to reset the visitor, rather
> - * than throwing away the previous root */
> -qobject_decref(qov->root);
> +/* Don't allow reuse of visitor on more than one root */
> +assert(!qov->root);
>  qov->root = value;
>  } else {
>  switch (qobject_type(cur)) {
> @@ -202,18 +202,22 @@ static void qmp_output_type_any(Visitor *v, const char 
> *name, QObject **obj,
>  qmp_output_add_obj(qov, name, *obj);
>  }
>
> +static void qmp_output_type_null(Visitor *v, const char *name, Error **errp)
> +{
> +QmpOutputVisitor *qov = to_qov(v);
> +qmp_output_add_obj(qov, name, qnull());
> +}
> +
>  /* Finish building, and return the root object. Will not be NULL. */
>  QObject *qmp_output_get_qobject(QmpOutputVisitor *qov)
>  {
> -/* FIXME: we should require that a visit occurred, and that it is
> - * complete (no starts without a matching end) */
> -QObject *obj = qov->root;
> -if (obj) {
> -qobject_incref(obj);
> -} else {
> -obj = qnull();
> -}
> -return obj;
> +QObject *root;
> +
> +assert(qov->root);  /* A visit must have occurred...  */
> +assert(!qmp_output_last(qov));  /* ...with each start paired with end.  
> */
> +root = qov->root;
> +qov->root = NULL;
> +return root;
>  }
>
>  Visitor *qmp_output_get_visitor(QmpOutputVisitor *v)
> @@ -221,7 +225,7 @@ Visitor *qmp_output_get_visitor(QmpOutputVisitor *v)
>  return &v->visitor;
>  }
>
> -void qmp_output_visitor_cleanup(QmpOutputVisitor *v)
> +void qmp_output_visitor_reset(QmpOutputVisitor *v)
>  {
>  QStackEntry *e, *tmp;
>
> @@ -231,6 +235,12 @@ void qmp_output_visitor_cleanup(QmpOutputVisitor *v)
>  }
>
>  qobject_decref(v->root);
> +v->root = NULL;
> +}
> +
> +void qmp_output_visitor_cleanup(QmpOutputVisitor *v)

It would make sense to call it _free() imho..

> +{
> +qmp_output_visitor_reset(v);
>  g_free(v);
>  }
>
> @@ -252,6 +262,7 @@ QmpOutput

Re: [Qemu-devel] [PATCH v8 17/35] qapi: Drop unused 'kind' for struct/enum visit

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> visit_start_struct() and visit_type_enum() had a 'kind' argument
> that was usually set to either the stringized version of the
> corresponding qapi type name, or to NULL (although some clients
> didn't even get that right).  But nothing ever used the argument.
> It's even hard to argue that it would be useful in a debugger,
> as a stack backtrace also tells which type is being visited.
>
> Therefore, drop the 'kind' argument as dead.
>
> Signed-off-by: Eric Blake 
>

Sounds reasonable to me:
Reviewed-by: Marc-André Lureau 


-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v8 20/35] qmp: Don't abuse stack to track qmp-output root

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> The previous commit documented an inconsistency in how we are
> using the stack of qmp-output-visitor.  Normally, pushing a
> single top-level object puts the object on the stack twice:
> once as the root, and once as the current container being
> appended to; but popping that struct only pops once.  However,
> qmp_ouput_add() was trying to either set up the added object
> as the new root (works if you parse two top-level scalars in a
> row: the second replaces the first as the root) or as a member
> of the current container (works as long as you have an open
> container on the stack; but if you have popped the first
> top-level container, it then resolves to the root and still
> tries to add into that existing container).
>
> Fix the stupidity by not tracking two separate things in the
> stack.  Drop the now-useless qmp_output_first() while at it.
>
> Saved for a later patch: we still are rather sloppy in that
> qmp_output_get_object() can be called in the middle of a parse,
> rather than requiring that a visit is complete.
>
> Signed-off-by: Eric Blake 
>

Reviewed-by: Marc-André Lureau 


-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v8 26/35] qapi: Add type.is_empty() helper

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> And use it in qapi-types and qapi-event.  Down the road, we may
> want to lift our artificial restriction of no variants at the
> top level of an event, at which point, inlining our check for
> whether members is empty will no longer be sufficient.  More
> immediately, the new .is_empty() helper will help fix a bug in
> qapi-visit.
>

which bug? (I guess it's related to the next patch)

-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v8 16/35] qapi: Swap 'name' in visit_* callbacks to match public API

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> As explained in the previous patches, matching argument order of
> 'name, &value' to JSON's "name":value makes sense.  However,
> while the last two patches were easy with Coccinelle, I ended up
> doing this one all by hand.  Now all the visitor callbacks match
> the main interface.
>
> Signed-off-by: Eric Blake 
>

Reviewed-by: Marc-André Lureau 

-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v8 13/35] qom: Use typedef for Visitor

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> No need to repeat 'struct Visitor' when we already have it in
> typedefs.h.  Omitting the redundant 'struct' also makes a later
> patch easier to search for all object property callbacks that
> are associated with a Visitor.
>
> Signed-off-by: Eric Blake 

Reviewed-by: Marc-André Lureau 



-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v8 22/35] qapi: Add visit_type_null() visitor

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> Right now, qmp-output-visitor happens to produce a QNull result
> if nothing is actually visited between the creation of the visitor
> and the request for the resulting QObject.  A stronger protocol
> would require that a QMP output visit MUST visit something.  But
> to still be able to produce a JSON 'null' output, we need a new
> visitor function that states our intentions.
>
> This patch introduces the new visit_type_null() interface, and
> a later patch will then wire it up into the qmp output visitor.
>
> Signed-off-by: Eric Blake 
>

overall looks good to me:
Reviewed-by: Marc-André Lureau 

Just a small remark below,

> ---
> v8: rebase to 'name' motion
> v7: new patch, based on discussion about spapr_drc.c
> ---
>  include/qapi/visitor-impl.h | 3 +++
>  include/qapi/visitor.h  | 8 
>  qapi/qapi-dealloc-visitor.c | 5 +
>  qapi/qapi-visit-core.c  | 5 +
>  4 files changed, 21 insertions(+)
>
> diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
> index 8110ebd..d301901 100644
> --- a/include/qapi/visitor-impl.h
> +++ b/include/qapi/visitor-impl.h
> @@ -75,6 +75,9 @@ struct Visitor
>   * visitors do not currently visit arbitrary types).  */
>  void (*type_any)(Visitor *v, const char *name, QObject **obj,
>   Error **errp);
> +/* Must be provided to visit explicit null values (right now, only the
> + * dealloc visitor supports this).  */
> +void (*type_null)(Visitor *v, const char *name, Error **errp);
>

It's not clear to me what you mean by "Must be provided" if only one
visitor implements it.

>  /* May be NULL; most useful for input visitors. */
>  void (*optional)(Visitor *v, const char *name, bool *present);
> diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h
> index 5349a64..6e49b51 100644
> --- a/include/qapi/visitor.h
> +++ b/include/qapi/visitor.h
> @@ -267,6 +267,14 @@ void visit_type_number(Visitor *v, const char *name, 
> double *obj,
>  void visit_type_any(Visitor *v, const char *name, QObject **obj, Error 
> **errp);
>
>  /**
> + * Visit a JSON null value tied to @name in the current object visit.
> + * @name will be NULL if this is visited as part of a list.
> + * No obj parameter is needed; rather, this is a witness that an
> + * explicit null value is expected rather than any other type.
> + */
> +void visit_type_null(Visitor *v, const char *name, Error **errp);
> +
> +/**
>   * Mark the start of visiting the branches of a union. Return true if
>   * @data_present.
>   * FIXME: Should not be needed
> diff --git a/qapi/qapi-dealloc-visitor.c b/qapi/qapi-dealloc-visitor.c
> index 560feb3..ede1703 100644
> --- a/qapi/qapi-dealloc-visitor.c
> +++ b/qapi/qapi-dealloc-visitor.c
> @@ -162,6 +162,10 @@ static void qapi_dealloc_type_anything(Visitor *v, const 
> char *name,
>  }
>  }
>
> +static void qapi_dealloc_type_null(Visitor *v, const char *name, Error 
> **errp)
> +{
> +}
> +
>  static void qapi_dealloc_type_enum(Visitor *v, const char *name, int *obj,
> const char * const strings[], Error 
> **errp)
>  {
> @@ -222,6 +226,7 @@ QapiDeallocVisitor *qapi_dealloc_visitor_new(void)
>  v->visitor.type_str = qapi_dealloc_type_str;
>  v->visitor.type_number = qapi_dealloc_type_number;
>  v->visitor.type_any = qapi_dealloc_type_anything;
> +v->visitor.type_null = qapi_dealloc_type_null;
>  v->visitor.start_union = qapi_dealloc_start_union;
>
>  QTAILQ_INIT(&v->stack);
> diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
> index 1612d0d..399256b 100644
> --- a/qapi/qapi-visit-core.c
> +++ b/qapi/qapi-visit-core.c
> @@ -262,6 +262,11 @@ void visit_type_any(Visitor *v, const char *name, 
> QObject **obj, Error **errp)
>  v->type_any(v, name, obj, errp);
>  }
>
> +void visit_type_null(Visitor *v, const char *name, Error **errp)
> +{
> +v->type_null(v, name, errp);
> +}

Shouldn't it be optional then and a if (v->type_null) be added?

> +
>  void output_type_enum(Visitor *v, const char *name, int *obj,
>const char * const strings[], Error **errp)
>  {
> --
> 2.4.3
>
>



-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v8 25/35] qapi: Simplify excess input reporting in input visitors

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> When reporting that an unvisited member remains at the end of an
> input visit for a struct, we were using g_hash_table_find()
> coupled with a callback function that always returns true, to
> locate an arbitrary member of the hash table.  But if all we
> need is an arbitrary entry, we can get that from a single-use
> iterator, without needing a tautological callback function.
>
> Suggested-by: Markus Armbruster 
> Signed-off-by: Eric Blake 
>

Reviewed-by: Marc-André Lureau 



-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v8 07/35] qapi: Improve generated event use of qapi visitor

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> All other successful clients of visit_start_struct() were paired
> with an unconditional visit_end_struct(); but the generated
> code for events was relying on qmp_output_visitor_cleanup() to
> work on an incomplete visit.  Alter the code to guarantee that
> the struct is completed, which will make a future patch to
> split visit_end_struct() easier to reason about.  While at it,
> drop some assertions and comments that are not present in other
> uses of the qmp output visitor, rearrange the declaration to
> make it easier for a future patch to introduce the notion of
> a boxed event visit, and pass NULL rather than "" as the 'kind'
> parameter (matching most other uses where obj is NULL).
>
> Signed-off-by: Eric Blake 
>
> ---
> v8: no change
> v7: place earlier in series, adjust handling of 'kind'
> v6: new patch
>
> If desired, I can defer the hunk re-ordering the declaration of
> obj to later in the series where it actually comes in handy.
> ---
>  scripts/qapi-event.py | 14 ++
>  scripts/qapi.py   |  5 +++--
>  2 files changed, 9 insertions(+), 10 deletions(-)
>
> diff --git a/scripts/qapi-event.py b/scripts/qapi-event.py
> index 720486f..e37c07a 100644
> --- a/scripts/qapi-event.py
> +++ b/scripts/qapi-event.py
> @@ -41,9 +41,9 @@ def gen_event_send(name, arg_type):
>
>  if arg_type and arg_type.members:
>  ret += mcgen('''
> +QObject *obj;
>  QmpOutputVisitor *qov;
>  Visitor *v;
> -QObject *obj;

This looks like churning code.

>
>  ''')
>
> @@ -61,25 +61,23 @@ def gen_event_send(name, arg_type):
>  if arg_type and arg_type.members:
>  ret += mcgen('''
>  qov = qmp_output_visitor_new();
> -g_assert(qov);
> -
>  v = qmp_output_get_visitor(qov);
> -g_assert(v);
>
> -/* Fake visit, as if all members are under a structure */
> -visit_start_struct(v, NULL, "", "%(name)s", 0, &err);
> +visit_start_struct(v, NULL, NULL, "%(name)s", 0, &err);

The comment seemed somewhat useful to me.

>  ''',
>   name=name)
>  ret += gen_err_check()
> -ret += gen_visit_fields(arg_type.members, need_cast=True)
> +ret += gen_visit_fields(arg_type.members, need_cast=True,
> +label='out_obj')
>  ret += mcgen('''
> +out_obj:
>  visit_end_struct(v, &err);
>  if (err) {
>  goto out;
>  }
>
>  obj = qmp_output_get_qobject(qov);
> -g_assert(obj != NULL);
> +g_assert(obj);
>
>  qdict_put_obj(qmp, "data", obj);
>  ''')
> diff --git a/scripts/qapi.py b/scripts/qapi.py
> index 7dec611..497eaba 100644
> --- a/scripts/qapi.py
> +++ b/scripts/qapi.py
> @@ -1636,7 +1636,8 @@ def gen_err_check(label='out', skiperr=False):
>   label=label)
>
>
> -def gen_visit_fields(members, prefix='', need_cast=False, skiperr=False):
> +def gen_visit_fields(members, prefix='', need_cast=False, skiperr=False,
> + label='out'):
>  ret = ''
>  if skiperr:
>  errparg = 'NULL'
> @@ -1664,7 +1665,7 @@ def gen_visit_fields(members, prefix='', 
> need_cast=False, skiperr=False):
>   c_type=memb.type.c_name(), prefix=prefix, cast=cast,
>   c_name=c_name(memb.name), name=memb.name,
>   errp=errparg)
> -ret += gen_err_check(skiperr=skiperr)
> +ret += gen_err_check(skiperr=skiperr, label=label)
>
>  if memb.optional:
>  pop_indent()
> --
> 2.4.3
>
>

Reviewed-by: Marc-André Lureau 

-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v8 03/35] qapi: Drop dead dealloc visitor variable

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> Commit 0b9d8542 added StackEntry.is_list_head, but forgot to
> delete the now-unused QapiDeallocVisitor.is_list_head.
>
> Signed-off-by: Eric Blake 

Reviewed-by: Marc-André Lureau 



-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v8 19/35] qmp: Fix reference-counting of qnull on empty output visit

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> Commit 6c2f9a15 ensured that we would not return NULL when the
> caller used an output visitor but had nothing to visit. But
> in doing so, it added a FIXME about a reference count leak
> that could abort qemu in the (unlikely) case of SIZE_MAX such
> visits (more plausible on 32-bit).  (Although that commit
> suggested we might fix it in time for 2.5, we ran out of time;
> fortunately, it is unlikely enough to bite that it was not
> worth worrying about during the 2.5 release.)
>
> This fixes things by documenting the internal contracts, and
> explaining why the internal function can return NULL and only
> the public facing interface needs to worry about qnull(),
> thus avoiding over-referencing the qnull_ global object.
>
> It does not, however, fix the stupidity of the stack mixing
> up two separate pieces of information; add a FIXME to explain
> that issue.
>
> Signed-off-by: Eric Blake 
> Cc: qemu-sta...@nongnu.org
>
> ---
> v8: rebase to earlier changes
> v7: cc qemu-stable, tweak some asserts, drop stale comment, add more
> comments
> v6: no change
> ---
>  qapi/qmp-output-visitor.c   | 39 ---
>  tests/test-qmp-output-visitor.c |  2 ++
>  2 files changed, 34 insertions(+), 7 deletions(-)
>
> diff --git a/qapi/qmp-output-visitor.c b/qapi/qmp-output-visitor.c
> index d367148..316f4e4 100644
> --- a/qapi/qmp-output-visitor.c
> +++ b/qapi/qmp-output-visitor.c
> @@ -29,6 +29,15 @@ typedef QTAILQ_HEAD(QStack, QStackEntry) QStack;
>  struct QmpOutputVisitor
>  {
>  Visitor visitor;
> +/* FIXME: we are abusing stack to hold two separate pieces of
> + * information: the current root object in slot 0, and the stack
> + * of N objects still being built in slots 1 through N (for N+1
> + * slots in use).  Worse, our behavior is inconsistent:
> + * qmp_output_add_obj() visiting two top-level scalars in a row
> + * discards the first in favor of the second, but visiting two
> + * top-level objects in a row tries to append the second object
> + * into the first (since the first object was placed in the stack
> + * in both slot 0 and 1, but only popped from slot 1).  */

I skipped checking thoroughly this comment, since it's a bit
off-topic, although it looks ok.

Later, oh well, it's fixed in next commit. Imho it's not strictly
necessary in this commit.

>  QStack stack;
>  };
>
> @@ -41,10 +50,12 @@ static QmpOutputVisitor *to_qov(Visitor *v)
>  return container_of(v, QmpOutputVisitor, visitor);
>  }
>
> +/* Push @value onto the stack of current QObjects being built */
>  static void qmp_output_push_obj(QmpOutputVisitor *qov, QObject *value)
>  {
>  QStackEntry *e = g_malloc0(sizeof(*e));
>
> +assert(value);
>  e->value = value;
>  if (qobject_type(e->value) == QTYPE_QLIST) {
>  e->is_list_head = true;
> @@ -52,44 +63,51 @@ static void qmp_output_push_obj(QmpOutputVisitor *qov, 
> QObject *value)
>  QTAILQ_INSERT_HEAD(&qov->stack, e, node);
>  }
>
> +/* Grab and remove the most recent QObject from the stack */
>  static QObject *qmp_output_pop(QmpOutputVisitor *qov)
>  {
>  QStackEntry *e = QTAILQ_FIRST(&qov->stack);
>  QObject *value;
> +
> +assert(e);
>  QTAILQ_REMOVE(&qov->stack, e, node);
>  value = e->value;
>  g_free(e);
>  return value;
>  }
>
> +/* Grab the root QObject, if any, in preparation to empty the stack */
>  static QObject *qmp_output_first(QmpOutputVisitor *qov)
>  {
>  QStackEntry *e = QTAILQ_LAST(&qov->stack, QStack);
>
> -/*
> - * FIXME Wrong, because qmp_output_get_qobject() will increment
> - * the refcnt *again*.  We need to think through how visitors
> - * handle null.
> - */
>  if (!e) {
> -return qnull();
> +/* No root */
> +return NULL;
>  }
> -
> +assert(e->value);
>  return e->value;
>  }
>
> +/* Grab the most recent QObject from the stack, which must exist */
>  static QObject *qmp_output_last(QmpOutputVisitor *qov)
>  {
>  QStackEntry *e = QTAILQ_FIRST(&qov->stack);
> +
> +assert(e && e->value);
>  return e->value;
>  }
>
> +/* Add @value to the current QObject being built.
> + * If the stack is visiting a dictionary or list, @value is now owned
> + * by that container. Otherwise, @value is now the root.  */
>  static void qmp_output_add_obj(QmpOutputVisitor *qov, const char *name,
> QObject *value)
>  {
>  QObject *cur;
>
>  if (QTAILQ_EMPTY(&qov->stack)) {
> +/* Stack was empty, track this object as root */
>  qmp_output_push_obj(qov, value);
>  return;
>  }
> @@ -98,13 +116,17 @@ static void qmp_output_add_obj(QmpOutputVisitor *qov, 
> const char *name,
>
>  switch (qobject_type(cur)) {
>  case QTYPE_QDICT:
> +assert(name);
>  qdict_put_obj(qobject_to_qdict(cur), name, value);
>  break;
>  case QTYPE_QLI

Re: [Qemu-devel] [PATCH v8 21/35] qapi: Document visitor interfaces, add assertions

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> The visitor interface for mapping between QObject/QemuOpts/string
> and qapi has formerly been documented only by reading source code,
> making it difficult to propose changes to either scripts/qapi*.py
> or to clients without knowing whether those changes would be safe.
> This adds documentation, including mentioning when parameters can
> be NULL, and where there are still some interface warts that would
> be nice to remove.  In particular, I have plans to remove
> visit_start_union() in a future patch.
>
> Add some asserts to strengthen the claims of the assertions; some
> of these were only made possible by recent cleanup commits.  These
> were made easier with the addition of a new visit_is_output()
> helper (since all 2 output visitors of our 6 overall visitors use
> the same .type_enum() callback).
>
> Signed-off-by: Eric Blake 

Nice doc,
Reviewed-by: Marc-André Lureau 



-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v8 01/35] qobject: Document more shortcomings in our number handling

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> We've already documented that our JSON parsing is locale dependent;
> but we should also document that our JSON output has the same
> problem.  Additionally, JSON requires finite values (you have to
> upgrade to JSON5 to get support for Inf or NaN), and our output
> risks truncating floating point numbers to the point of losing
> significant precision.
>
> Sadly, this series is not going to be the one that addresses these
> problems.
>
> Fix some trailing whitespace I noticed in the vicinity.
>
> Signed-off-by: Eric Blake 
>

Reviewed-by: Marc-André Lureau 




-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v8 09/35] qapi: Prefer type_int64 over type_int in visitors

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> The qapi builtin type 'int' is basically shorthand for the type
> 'int64'.  In fact, since no visitor was providing the optional
> type_int64() callback, visit_type_int64() was just always falling
> back to type_int(), cementing the equivalence between the types.
>
> However, some visitors are providing a type_uint64() callback.
> For purposes of code consistency, it is nicer if all visitors
> use the paired type_int64/type_uint64 names rather than the
> mismatched type_int/type_uint64.  So this patch just renames
> the signed int callbacks in place, dropping the type_int()
> callback as redundant, and a later patch will focus on the
> unsigned int callbacks.
>
> Add some FIXMEs to questionable reuse of errp in code touched
> by the rename, while at it (the reuse works as long as the
> callbacks don't modify value when setting an error, but it's not
> a good example to set).

Hmm, that potentially could cause assert() in error_setv() (that could
possibly trigger in so many untested ways I would rather see a
warning, but ok). Rather than a FIXME, I would  see a patch to check
for errp before doing the further check and assert.

I later realized that a following patch fixes it, I guess you could
update commit message to mention it.

>
> No change in functionality here, although further cleanups are
> in the pipeline.
>
> Signed-off-by: Eric Blake 
>
> ---
> v8: no change
> v7: split off of 1/23 and 2/23 for easier-to-read diffs
> ---
>  include/qapi/visitor-impl.h  |  1 -
>  qapi/opts-visitor.c  |  4 ++--
>  qapi/qapi-dealloc-visitor.c  |  6 +++---
>  qapi/qapi-visit-core.c   | 36 ++--
>  qapi/qmp-input-visitor.c |  6 +++---
>  qapi/qmp-output-visitor.c|  6 +++---
>  qapi/string-input-visitor.c  |  6 +++---
>  qapi/string-output-visitor.c |  6 +++---
>  8 files changed, 39 insertions(+), 32 deletions(-)
>
> diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
> index 44a21b7..70326e0 100644
> --- a/include/qapi/visitor-impl.h
> +++ b/include/qapi/visitor-impl.h
> @@ -36,7 +36,6 @@ struct Visitor
>  void (*get_next_type)(Visitor *v, QType *type, bool promote_int,
>const char *name, Error **errp);
>
> -void (*type_int)(Visitor *v, int64_t *obj, const char *name, Error 
> **errp);
>  void (*type_bool)(Visitor *v, bool *obj, const char *name, Error **errp);
>  void (*type_str)(Visitor *v, char **obj, const char *name, Error **errp);
>  void (*type_number)(Visitor *v, double *obj, const char *name,
> diff --git a/qapi/opts-visitor.c b/qapi/opts-visitor.c
> index dd4094c..56c798f 100644
> --- a/qapi/opts-visitor.c
> +++ b/qapi/opts-visitor.c
> @@ -360,7 +360,7 @@ opts_type_bool(Visitor *v, bool *obj, const char *name, 
> Error **errp)
>
>
>  static void
> -opts_type_int(Visitor *v, int64_t *obj, const char *name, Error **errp)
> +opts_type_int64(Visitor *v, int64_t *obj, const char *name, Error **errp)
>  {
>  OptsVisitor *ov = to_ov(v);
>  const QemuOpt *opt;
> @@ -528,7 +528,7 @@ opts_visitor_new(const QemuOpts *opts)
>   */
>  ov->visitor.type_enum = &input_type_enum;
>
> -ov->visitor.type_int= &opts_type_int;
> +ov->visitor.type_int64  = &opts_type_int64;
>  ov->visitor.type_uint64 = &opts_type_uint64;
>  ov->visitor.type_size   = &opts_type_size;
>  ov->visitor.type_bool   = &opts_type_bool;
> diff --git a/qapi/qapi-dealloc-visitor.c b/qapi/qapi-dealloc-visitor.c
> index 204de8f..e9b9f3f 100644
> --- a/qapi/qapi-dealloc-visitor.c
> +++ b/qapi/qapi-dealloc-visitor.c
> @@ -135,8 +135,8 @@ static void qapi_dealloc_type_str(Visitor *v, char **obj, 
> const char *name,
>  }
>  }
>
> -static void qapi_dealloc_type_int(Visitor *v, int64_t *obj, const char *name,
> -  Error **errp)
> +static void qapi_dealloc_type_int64(Visitor *v, int64_t *obj, const char 
> *name,
> +Error **errp)
>  {
>  }
>
> @@ -219,7 +219,7 @@ QapiDeallocVisitor *qapi_dealloc_visitor_new(void)
>  v->visitor.next_list = qapi_dealloc_next_list;
>  v->visitor.end_list = qapi_dealloc_end_list;
>  v->visitor.type_enum = qapi_dealloc_type_enum;
> -v->visitor.type_int = qapi_dealloc_type_int;
> +v->visitor.type_int64 = qapi_dealloc_type_int64;
>  v->visitor.type_bool = qapi_dealloc_type_bool;
>  v->visitor.type_str = qapi_dealloc_type_str;
>  v->visitor.type_number = qapi_dealloc_type_number;
> diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
> index 6d63e40..6295fa8 100644
> --- a/qapi/qapi-visit-core.c
> +++ b/qapi/qapi-visit-core.c
> @@ -97,7 +97,7 @@ void visit_type_enum(Visitor *v, int *obj, const char * 
> const strings[],
>
>  void visit_type_int(Visitor *v, int64_t *obj, const char *name, Error **errp)
>  {
> -v->type_int(v, obj, name, errp);
> +v->type_int64(v, obj, name, errp);
>  }
>
>  void visit_type_uint

Re: [Qemu-devel] [PATCH v8 08/35] qapi: Track all failures between visit_start/stop

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> Inside the generated code between visit_start_struct() and
> visit_end_struct(), we were blindly setting the error into
> the caller's errp parameter.  But a future patch to split
> visit_end_struct() will require that we take action based
> on whether an error has occurred, which requires us to track
> all actions through a local err.  Rewrite the visits to be
> more in line with the other generated calls.
>
> Signed-off-by: Eric Blake 
>

Reviewed-by: Marc-André Lureau 




-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v8 11/35] qapi: Consolidate visitor small integer callbacks

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> Commit 4e27e819 introduced optional visitor callbacks for all
> sorts of int types, but no visitor has supplied any of the
> callbacks for sizes less than 64 bits.  In other words, the
> generic implementation based on using type_[u]int64() followed
> by bounds-checking works just fine. In the interest of
> simplicity, it's easier to make the visitor callback interface
> not have to worry about the other sizes.
>
> Adding some helper functions minimizes the boilerplate required
> to correct FIXMEs added earlier with regards to questionable
> reuse of errp, particularly now that we can guarantee from a
> single file audit that value is unchanged if an error is set.
>
> Signed-off-by: Eric Blake 
>
> ---
> v8: no change
> v7: further factor out helper functions that eliminate the
> questionable errp reuse
> v6: split off from v5 23/46
> original version also appeared in v6-v9 of subset D
> ---
>  include/qapi/visitor-impl.h |  22 +++---
>  qapi/qapi-visit-core.c  | 158 
> +---
>  2 files changed, 70 insertions(+), 110 deletions(-)
>
> diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
> index 70326e0..5ee2974 100644
> --- a/include/qapi/visitor-impl.h
> +++ b/include/qapi/visitor-impl.h
> @@ -1,7 +1,7 @@
>  /*
>   * Core Definitions for QAPI Visitor implementations
>   *
> - * Copyright (C) 2012 Red Hat, Inc.
> + * Copyright (C) 2012, 2015 Red Hat, Inc.
>   *
>   * Author: Paolo Bonizni 
>   *
> @@ -36,6 +36,16 @@ struct Visitor
>  void (*get_next_type)(Visitor *v, QType *type, bool promote_int,
>const char *name, Error **errp);
>
> +/* Must be set. */
> +void (*type_int64)(Visitor *v, int64_t *obj, const char *name,
> +   Error **errp);
> +/* Must be set. */
> +void (*type_uint64)(Visitor *v, uint64_t *obj, const char *name,
> +Error **errp);
> +/* Optional; fallback is type_uint64().  */
> +void (*type_size)(Visitor *v, uint64_t *obj, const char *name,
> +  Error **errp);
> +/* Must be set. */
>  void (*type_bool)(Visitor *v, bool *obj, const char *name, Error **errp);
>  void (*type_str)(Visitor *v, char **obj, const char *name, Error **errp);
>  void (*type_number)(Visitor *v, double *obj, const char *name,
> @@ -46,16 +56,6 @@ struct Visitor
>  /* May be NULL; most useful for input visitors. */
>  void (*optional)(Visitor *v, bool *present, const char *name);
>
> -void (*type_uint8)(Visitor *v, uint8_t *obj, const char *name, Error 
> **errp);
> -void (*type_uint16)(Visitor *v, uint16_t *obj, const char *name, Error 
> **errp);
> -void (*type_uint32)(Visitor *v, uint32_t *obj, const char *name, Error 
> **errp);
> -void (*type_uint64)(Visitor *v, uint64_t *obj, const char *name, Error 
> **errp);
> -void (*type_int8)(Visitor *v, int8_t *obj, const char *name, Error 
> **errp);
> -void (*type_int16)(Visitor *v, int16_t *obj, const char *name, Error 
> **errp);
> -void (*type_int32)(Visitor *v, int32_t *obj, const char *name, Error 
> **errp);
> -void (*type_int64)(Visitor *v, int64_t *obj, const char *name, Error 
> **errp);
> -/* visit_type_size() falls back to (*type_uint64)() if type_size is 
> unset */
> -void (*type_size)(Visitor *v, uint64_t *obj, const char *name, Error 
> **errp);
>  bool (*start_union)(Visitor *v, bool data_present, Error **errp);
>  void (*end_union)(Visitor *v, bool data_present, Error **errp);
>  };

I think it would be a good idea to move this chunk in the previous patch.

> diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
> index 4a8ad43..a48fd4e 100644
> --- a/qapi/qapi-visit-core.c
> +++ b/qapi/qapi-visit-core.c
> @@ -100,129 +100,89 @@ void visit_type_int(Visitor *v, int64_t *obj, const 
> char *name, Error **errp)
>  v->type_int64(v, obj, name, errp);
>  }
>
> +static void visit_type_uintN(Visitor *v, uint64_t *obj, const char *name,
> + uint64_t max, const char *type, Error **errp)
> +{
> +Error *err = NULL;
> +uint64_t value = *obj;
> +
> +v->type_uint64(v, &value, name, &err);
> +if (err) {
> +error_propagate(errp, err);
> +} else if (value > max) {
> +error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
> +   name ? name : "null", type);
> +} else {
> +*obj = value;
> +}
> +}
> +
>  void visit_type_uint8(Visitor *v, uint8_t *obj, const char *name, Error 
> **errp)
>  {
> -uint64_t value;
> -
> -if (v->type_uint8) {
> -v->type_uint8(v, obj, name, errp);
> -} else {
> -value = *obj;
> -v->type_uint64(v, &value, name, errp);
> -if (value > UINT8_MAX) {
> -/* FIXME questionable reuse of errp if type_uint64() changes
> -   value on error */
> -error_setg(errp, QERR_INVALID_PARAMET

Re: [Qemu-devel] [PATCH v8 15/35] qom: Swap 'name' next to visitor in ObjectPropertyAccessor

2016-01-05 Thread Marc-André Lureau

Hi

On Wed, Dec 23, 2015 at 5:30 PM, Eric Blake  wrote:
> On 12/21/2015 10:08 AM, Eric Blake wrote:
>> Similar to the previous patch, it's nice to have all functions
>> n the tree that involve a visitor and a name for conversion to
>
> s/^n/in/

I was about to say that,

with that
Reviewed-by: Marc-André Lureau 

>
>> or from QAPI to consistently stick the 'name' parameter next
>> to the Visitor parameter.
>>
>
> --
> Eric Blake   eblake redhat com+1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>



-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v8 14/35] qapi: Swap visit_* arguments for consistent 'name' placement

2016-01-05 Thread Marc-André Lureau

Reviewed-by: Marc-André Lureau Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> JSON uses "name":value, but many of our visitor interfaces were
> called with visit_type_FOO(v, &value, name, errp).  This can be
> a bit confusing to have to mentally swap the parameter order to
> match JSON order.  It's particularly bad for visit_start_struct(),
> where the 'name' parameter is smack in the middle of the
> otherwise-related group of 'obj, kind, size' parameters! It's
> time to do a global swap of the parameter ordering, so that the
> 'name' parameter is always immediately after the Visitor argument.
>

fwiw, I do agree.

> Additional reasons in favor of the swap: name is always an input
> parameter, while &value is sometimes an output parameter (depending
> on whether the caller is using an input visitor); and it is nicer
> to list input parameters first.  Also, the existing include/qjson.h
> prefers listing 'name' first in json_prop_*(), and I have plans to
> unify that file with the qapi visitors; listing 'name' first in
> qapi will minimize churn to the (admittedly few) qjson.h clients.
>
> The next patches will then fix object.h, visitor-impl.h, and those
> clients to match.
>
> Done by first patching scripts/qapi*.py by hand to make generated
> files do what I want, then by running the following Coccinelle
> script to affect the rest of the code base:
>  $ spatch --sp-file script `git grep -l '\bvisit_' -- '**/*.[ch]'`
> I then had to apply some touchups (Coccinelle insisted on TAB
> indentation in visitor.h, and botched the signature of
> visit_type_enum() by rewriting 'const char *const strings[]' to
> the syntactically invalid 'const char*const[] strings').
>
> // Part 1: Swap declaration order
> @@
> type TV, TErr, TObj, T1, T2;
> identifier OBJ, ARG1, ARG2;
> @@
>  void visit_start_struct
> -(TV v, TObj OBJ, T1 ARG1, const char *name, T2 ARG2, TErr errp)
> +(TV v, const char *name, TObj OBJ, T1 ARG1, T2 ARG2, TErr errp)
>  { ... }
>
> @@
> type bool, TV, T1;
> identifier ARG1;
> @@
>  bool visit_optional
> -(TV v, T1 ARG1, const char *name)
> +(TV v, const char *name, T1 ARG1)
>  { ... }
>
> @@
> type TV, TErr, TObj, T1;
> identifier OBJ, ARG1;
> @@
>  void visit_get_next_type
> -(TV v, TObj OBJ, T1 ARG1, const char *name, TErr errp)
> +(TV v, const char *name, TObj OBJ, T1 ARG1, TErr errp)
>  { ... }
>
> @@
> type TV, TErr, TObj, T1, T2;
> identifier OBJ, ARG1, ARG2;
> @@
>  void visit_type_enum
> -(TV v, TObj OBJ, T1 ARG1, T2 ARG2, const char *name, TErr errp)
> +(TV v, const char *name, TObj OBJ, T1 ARG1, T2 ARG2, TErr errp)
>  { ... }
>
> @@
> type TV, TErr, TObj;
> identifier OBJ;
> identifier VISIT_TYPE =~ "^visit_type_";
> @@
>  void VISIT_TYPE
> -(TV v, TObj OBJ, const char *name, TErr errp)
> +(TV v, const char *name, TObj OBJ, TErr errp)
>  { ... }
>
> // Part 2: swap caller order
> @@
> expression V, NAME, OBJ, ARG1, ARG2, ERR;
> identifier VISIT_TYPE =~ "^visit_type_";
> @@
> (
> -visit_start_struct(V, OBJ, ARG1, NAME, ARG2, ERR)
> +visit_start_struct(V, NAME, OBJ, ARG1, ARG2, ERR)
> |
> -visit_optional(V, ARG1, NAME)
> +visit_optional(V, NAME, ARG1)
> |
> -visit_get_next_type(V, OBJ, ARG1, NAME, ERR)
> +visit_get_next_type(V, NAME, OBJ, ARG1, ERR)
> |
> -visit_type_enum(V, OBJ, ARG1, ARG2, NAME, ERR)
> +visit_type_enum(V, NAME, OBJ, ARG1, ARG2, ERR)
> |
> -VISIT_TYPE(V, OBJ, NAME, ERR)
> +VISIT_TYPE(V, NAME, OBJ, ERR)
> )
>
> Signed-off-by: Eric Blake 

The result looks good and passes the tests
Reviewed-by: Marc-André Lureau 

However, docs/qapi-code-gen.txt should be updated in a follow-up patch.

-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH] hw/dma/xilinx_axidma: debug printf fixups

2016-01-05 Thread Eric Blake

On 01/05/2016 06:22 AM, Andrew Jones wrote:
> (Found by grepping for broken PRI users.)
> 
> Signed-off-by: Andrew Jones 
> ---
>  hw/dma/xilinx_axidma.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/dma/xilinx_axidma.c b/hw/dma/xilinx_axidma.c
> index b1cfa11356a26..2ab0772cd19ae 100644
> --- a/hw/dma/xilinx_axidma.c
> +++ b/hw/dma/xilinx_axidma.c
> @@ -180,10 +180,10 @@ static inline int streamid_from_addr(hwaddr addr)
>  #ifdef DEBUG_ENET
>  static void stream_desc_show(struct SDesc *d)
>  {
> -qemu_log("buffer_addr  = " PRIx64 "\n", d->buffer_address);
> -qemu_log("nxtdesc  = " PRIx64 "\n", d->nxtdesc);
> -qemu_log("control  = %x\n", d->control);
> -qemu_log("status   = %x\n", d->status);
> +qemu_log("buffer_addr  = 0x%" PRIx64 "\n", d->buffer_address);
> +qemu_log("nxtdesc  = 0x%" PRIx64 "\n", d->nxtdesc);
> +qemu_log("control  = 0x%x\n", d->control);
> +qemu_log("status   = 0x%x\n", d->status);

This is dead code.  Nothing uses stream_desc_show() even when DEBUG_ENET
is defined.  I'd just delete the function and #ifdef altogether, instead.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v8 02/35] qapi: Avoid use of misnamed DO_UPCAST()

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> The macro DO_UPCAST() is incorrectly named: it converts from a
> parent class to a derived class (which is a downcast).  Better,
> and more consistent with some of the other qapi visitors, is
> to use the container_of() macro through a to_FOO() helper.

I think DO_UPCAST() is confusing too. I assume you'd rather see it
renamed or replaced by to_FOO() helpers appropriately (I wonder if
coccinelle would be able to do that ;)

> Our current definition of container_of() is weaker than
> DO_UPCAST(), in that it does not require the derived class to
> have Visitor * as its first member, but this does not hurt our
> usage patterns in qapi visitors.

Reviewed-by: Marc-André Lureau 


-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v8 04/35] hmp: Improve use of qapi visitor

2016-01-05 Thread Marc-André Lureau

Hi

On Mon, Dec 21, 2015 at 6:08 PM, Eric Blake  wrote:
> Cache the visitor in a local variable instead of repeatedly
> calling the accessor.  Pass NULL for the visit_start_struct()
> object (which matches the fact that we were already passing 0
> for the size argument, because we aren't using the visit to
> allocate a qapi struct).
>
> Signed-off-by: Eric Blake 

Reviewed-by: Marc-André Lureau 



-- 
Marc-André Lureau

1 2 3 >

1 - 100 of 226 matches

Mail list logo