date:20170807

Re: [Qemu-devel] Do we need a "virt-2.10" machine type before the 2.10 release?

2017-08-07 Thread Auger Eric

Hi Drew,
On 07/08/2017 09:06, Andrew Jones wrote:
> On Fri, Aug 04, 2017 at 06:42:54PM +0100, Peter Maydell wrote:
>> Hi; I noticed today that the virt board doesn't have a virt-2.10
>> machine type defined. Do we need to add it before release?
>>
>> (I don't know if there have in fact been any changes between
>> 2.9 and 2.10 that would be compatibility issues.)
>>
>> thanks
>> -- PMM
>>
> 
> Yes, we should have separately posted that to make sure it
> got in. Eric posted it, but as part of an RFC series:
> 
>  https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg00045.html
> 
> Eric, would you like to post a separate patch for the 2.10
> machine type that doesn't include no_iommu?

Yes I can send it. However is it really mandated as I don't think we
have changes between 2.9 and 2.10?

Thanks

Eric
> 
> Thanks,
> drew
>

Re: [Qemu-devel] Do we need a "virt-2.10" machine type before the 2.10 release?

2017-08-07 Thread Andrew Jones

On Fri, Aug 04, 2017 at 06:42:54PM +0100, Peter Maydell wrote:
> Hi; I noticed today that the virt board doesn't have a virt-2.10
> machine type defined. Do we need to add it before release?
> 
> (I don't know if there have in fact been any changes between
> 2.9 and 2.10 that would be compatibility issues.)
> 
> thanks
> -- PMM
>

Yes, we should have separately posted that to make sure it
got in. Eric posted it, but as part of an RFC series:

 https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg00045.html

Eric, would you like to post a separate patch for the 2.10
machine type that doesn't include no_iommu?

Thanks,
drew

Re: [Qemu-devel] [PATCH f0r 2.11] runstate/migrate: Two more transitions

2017-08-07 Thread Peter Xu

On Fri, Aug 04, 2017 at 06:50:11PM +0100, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> There's a race if someone does a 'stop' near the end of migrate;
> the migration process goes through two runstates:
> 'finish migrate'
> 'postmigrate'
> 
> If the user issues a 'stop' between the two we end up with invalid
> state transitions.
> Add the transitions as valid.
> 
> Signed-off-by: Dr. David Alan Gilbert 
> ---
>  vl.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/vl.c b/vl.c
> index 99fcfa0442..bacb03f49d 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -621,6 +621,7 @@ static const RunStateTransition 
> runstate_transitions_def[] = {
>  
>  { RUN_STATE_PAUSED, RUN_STATE_RUNNING },
>  { RUN_STATE_PAUSED, RUN_STATE_FINISH_MIGRATE },
> +{ RUN_STATE_PAUSED, RUN_STATE_POSTMIGRATE },
>  { RUN_STATE_PAUSED, RUN_STATE_PRELAUNCH },
>  { RUN_STATE_PAUSED, RUN_STATE_COLO},
>  
> @@ -633,6 +634,7 @@ static const RunStateTransition 
> runstate_transitions_def[] = {
>  { RUN_STATE_PRELAUNCH, RUN_STATE_INMIGRATE },
>  
>  { RUN_STATE_FINISH_MIGRATE, RUN_STATE_RUNNING },
> +{ RUN_STATE_FINISH_MIGRATE, RUN_STATE_PAUSED },
>  { RUN_STATE_FINISH_MIGRATE, RUN_STATE_POSTMIGRATE },
>  { RUN_STATE_FINISH_MIGRATE, RUN_STATE_PRELAUNCH },
>  { RUN_STATE_FINISH_MIGRATE, RUN_STATE_COLO},

Do we need this as well?

{ RUN_STATE_POSTMIGRATE, RUN_STATE_PAUSED },

-- 
Peter Xu

Re: [Qemu-devel] [PATCH qemu v4 1/3] spapr_iommu: Realloc guest visible TCE table when hot(un)plugging vfio-pci

2017-08-07 Thread Alexey Kardashevskiy

On 24/07/17 20:48, Alexey Kardashevskiy wrote:
> On 24/07/17 15:53, David Gibson wrote:
>> On Thu, Jul 20, 2017 at 05:22:29PM +1000, Alexey Kardashevskiy wrote:
>>> This replaces g_malloc() with spapr_tce_alloc_table() as this is
>>> the standard way of allocating tables and this allows moving the table
>>> back to KVM when unplugging a VFIO PCI device and VFIO TCE acceleration
>>> support is not present in the KVM.
>>
>> So, I like the idea here, and I think it will work for now, but one
>> thing concerns me.
>>
>> AIUI, your future plans include allowing in-kernel accelerated TCE
>> tables even with VFIO devices.  When that happens, we might have an
>> in-kernel table both before and after a change in the "need_vfio"
>> state.
> 
> The in-kernel table just stays there, mappings will be replayed but in the
> end of the replay only the hardware table will change.
> 
>>
>> But you won't be able to have two in-kernel tables allocated at once,
>> whereas this code relies on having both the old and new tables at once
>> so it can copy the contents over.
>>
>> How do you intend to handle that?
> 
> I intend to make this function no-op when cap_spapr_vfio is present.
> 
> 
>>
>>> Although spapr_tce_alloc_table() is expected to fail with EBUSY
>>> if called when previous fd is not closed yet, in practice we will not
>>> see it because cap_spapr_vfio is false at the moment.
>>
>> I don't follow this.  How would cap_spapr_vfio be changing at any
>> point?
> 
> It depends on the version of the host kernel.


Ping?


> 
>>
>>>
>>> Signed-off-by: Alexey Kardashevskiy 
>>> ---
>>>  hw/ppc/spapr_iommu.c | 35 ++-
>>>  1 file changed, 14 insertions(+), 21 deletions(-)
>>>
>>> diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
>>> index e614621a83..307dc3021e 100644
>>> --- a/hw/ppc/spapr_iommu.c
>>> +++ b/hw/ppc/spapr_iommu.c
>>> @@ -275,33 +275,26 @@ static int spapr_tce_table_realize(DeviceState *dev)
>>>  void spapr_tce_set_need_vfio(sPAPRTCETable *tcet, bool need_vfio)
>>>  {
>>>  size_t table_size = tcet->nb_table * sizeof(uint64_t);
>>> -void *newtable;
>>> +uint64_t *oldtable;
>>> +int newfd = -1;
>>>  
>>> -if (need_vfio == tcet->need_vfio) {
>>> -/* Nothing to do */
>>> -return;
>>> -}
>>> +g_assert(need_vfio != tcet->need_vfio);
>>>  
>>> -if (!need_vfio) {
>>> -/* FIXME: We don't support transition back to KVM accelerated
>>> - * TCEs yet */
>>> -return;
>>> -}
>>> +tcet->need_vfio = need_vfio;
>>>  
>>> -tcet->need_vfio = true;
>>> +oldtable = tcet->table;
>>>  
>>> -if (tcet->fd < 0) {
>>> -/* Table is already in userspace, nothing to be do */
>>> -return;
>>> -}
>>> +tcet->table = spapr_tce_alloc_table(tcet->liobn,
>>> +tcet->page_shift,
>>> +tcet->bus_offset,
>>> +tcet->nb_table,
>>> +&newfd,
>>> +need_vfio);
>>> +memcpy(tcet->table, oldtable, table_size);
>>>  
>>> -newtable = g_malloc(table_size);
>>> -memcpy(newtable, tcet->table, table_size);
>>> +spapr_tce_free_table(oldtable, tcet->fd, tcet->nb_table);
>>>  
>>> -kvmppc_remove_spapr_tce(tcet->table, tcet->fd, tcet->nb_table);
>>> -
>>> -tcet->fd = -1;
>>> -tcet->table = newtable;
>>> +tcet->fd = newfd;
>>>  }
>>>  
>>>  sPAPRTCETable *spapr_tce_new_table(DeviceState *owner, uint32_t liobn)
>>
> 
> 


-- 
Alexey



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v4 00/22] Clean up around qmp() and hmp()

2017-08-07 Thread Fam Zheng

On Mon, 08/07 14:43, Fam Zheng wrote:
> On Fri, 08/04 20:10, Fam Zheng wrote:
> > On Fri, 08/04 06:50, Eric Blake wrote:
> > > On 08/03/2017 08:54 PM, no-re...@patchew.org wrote:
> > > > Hi,
> > > > 
> > > > This series failed automatic build test. Please find the testing 
> > > > commands and
> > > > their output below. If you have docker installed, you can probably 
> > > > reproduce it
> > > > locally.
> > > 
> > > When will patchew have a syntax for stating dependencies? (Of course, I
> > > should actually mention those dependencies in my cover letter, not after
> > > the fact).
> > 
> > I can give it a try this weekend, if it works, it will be announced on this
> > list.
> 
> Hijack! As an experiment of the newly pushed "base" tag handling on the 
> server:
> 
> Based-on: 20170802201900.11890-1-ebl...@redhat.com

It worked. Now, restarted the tests with the "rebased" branch:

https://github.com/patchew-project/qemu/commits/patchew/20170804012551.2714-1-eblake%40redhat.com

Fam

Re: [Qemu-devel] [PATCH v7 0/6] fsdev: qmp interface for io throttling

2017-08-07 Thread Pradeep Jagadeesh


On 7/7/2017 8:14 AM, Markus Armbruster wrote:

Pradeep Jagadeesh  writes:


These patches provide the qmp interface, to query the io throttle
status of the all fsdev devices that are present in a vm.
also, it provides an interface to set the io throttle parameters of a
fsdev to a required value. some of the patches also remove the duplicate
code that was present in block and fsdev files.

Pradeep Jagadeesh (6):
  throttle: factor out duplicate code
  qmp: Create IOThrottle structure
  throttle: move out function to reuse the code
  hmp: create a throttle initialization function for code reusability
  fsdev: hmp interface for throttling
  fsdev: QMP interface for throttling

 Makefile|   4 ++
 blockdev.c  |  97 ++---
 fsdev/qemu-fsdev-dummy.c|  10 
 fsdev/qemu-fsdev-throttle.c | 118 ++--
 fsdev/qemu-fsdev-throttle.h |  13 +
 fsdev/qemu-fsdev.c  |  37 +
 hmp-commands-info.hx|  18 ++
 hmp-commands.hx |  19 +++
 hmp.c   |  81 +--
 hmp.h   |   4 ++
 include/qemu/throttle-options.h |   7 +++
 include/qemu/throttle.h |   4 +-
 include/qemu/typedefs.h |   1 +
 monitor.c   |   5 ++
 qapi-schema.json|   3 +
 qapi/block-core.json|  76 +-
 qapi/fsdev.json |  84 
 qapi/iothrottle.json|  88 ++
 qmp.c   |  14 +
 util/throttle.c | 110 +
 20 files changed, 577 insertions(+), 216 deletions(-)
 create mode 100644 qapi/fsdev.json
 create mode 100644 qapi/iothrottle.json


No test coverage?

I wanted to upstream these first then I am planning to write the tests.

-Pradeep

Re: [Qemu-devel] [PATCH v4 00/22] Clean up around qmp() and hmp()

2017-08-07 Thread Fam Zheng

On Thu, 08/03 18:54, no-re...@patchew.org wrote:
> Hi,
> 
> This series failed automatic build test. Please find the testing commands and
> their output below. If you have docker installed, you can probably reproduce 
> it
> locally.

As said in the sub-thread of this message, this is a false positive because
patchew didn't know the dependency. With that fixed, this test and the other
falsely failed one both pass. Thanks.

Fam

[Qemu-devel] [Bug 1709025] [NEW] Disk corrupted after snapshot deletion

2017-08-07 Thread junchi

Public bug reported:

  I found the vm disk corruption after snapshot deletion sometimes(the 
probability is very low, I'm afraid i can't reproduce it). And I found there is 
a patch for it as follow, but I'm not sure whether the patch repaired the bug. 
  Drain disk before snapshot deletion can't guarantee anything, there is still 
pending IO in snapshot-deletion process. Anyone can help?

author  Zhang Haoyu2014-10-21 16:38:01 +0800
committer   Stefan Hajnoczi2014-11-03 09:48:42 
+
commit  3432a1929ee18e08787ce35476abd74f2c93a17c (patch)
tree13a81c0a46707d91622f1593ccf7b926935371fd /block/snapshot.c
parent  573742a5431a99ceaba6968ae269cee247727cce (diff)
snapshot: add bdrv_drain_all() to bdrv_snapshot_delete() to avoid concurrency 
problem
If there are still pending i/o while deleting snapshot,
because deleting snapshot is done in non-coroutine context, and
the pending i/o read/write (bdrv_co_do_rw) is done in coroutine context,
so it's possible to cause concurrency problem between above two operations.
Add bdrv_drain_all() to bdrv_snapshot_delete() to avoid this problem.

Signed-off-by: Zhang Haoyu 
Reviewed-by: Paolo Bonzini 
Message-id: 201410211637596311...@sangfor.com
Signed-off-by: Stefan Hajnoczi 
Diffstat (limited to 'block/snapshot.c')
-rw-r--r--  block/snapshot.c4   
1 files changed, 4 insertions, 0 deletions
diff --git a/block/snapshot.c b/block/snapshot.c
index 85c52ff..698e1a1 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -236,6 +236,10 @@ int bdrv_snapshot_delete(BlockDriverState *bs,
 error_setg(errp, "snapshot_id and name are both NULL");
 return -EINVAL;
 }
+
+/* drain all pending i/o before deleting snapshot */
+bdrv_drain_all();
+
 if (drv->bdrv_snapshot_delete) {
 return drv->bdrv_snapshot_delete(bs, snapshot_id, name, errp);
 }

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1709025

Title:
  Disk corrupted after snapshot deletion

Status in QEMU:
  New

Bug description:
I found the vm disk corruption after snapshot deletion sometimes(the 
probability is very low, I'm afraid i can't reproduce it). And I found there is 
a patch for it as follow, but I'm not sure whether the patch repaired the bug. 
Drain disk before snapshot deletion can't guarantee anything, there is 
still pending IO in snapshot-deletion process. Anyone can help?

  authorZhang Haoyu2014-10-21 16:38:01 
+0800
  committer Stefan Hajnoczi2014-11-03 09:48:42 
+
  commit3432a1929ee18e08787ce35476abd74f2c93a17c (patch)
  tree  13a81c0a46707d91622f1593ccf7b926935371fd /block/snapshot.c
  parent573742a5431a99ceaba6968ae269cee247727cce (diff)
  snapshot: add bdrv_drain_all() to bdrv_snapshot_delete() to avoid concurrency 
problem
  If there are still pending i/o while deleting snapshot,
  because deleting snapshot is done in non-coroutine context, and
  the pending i/o read/write (bdrv_co_do_rw) is done in coroutine context,
  so it's possible to cause concurrency problem between above two operations.
  Add bdrv_drain_all() to bdrv_snapshot_delete() to avoid this problem.

  Signed-off-by: Zhang Haoyu 
  Reviewed-by: Paolo Bonzini 
  Message-id: 201410211637596311...@sangfor.com
  Signed-off-by: Stefan Hajnoczi 
  Diffstat (limited to 'block/snapshot.c')
  -rw-r--r--block/snapshot.c4   
  1 files changed, 4 insertions, 0 deletions
  diff --git a/block/snapshot.c b/block/snapshot.c
  index 85c52ff..698e1a1 100644
  --- a/block/snapshot.c
  +++ b/block/snapshot.c
  @@ -236,6 +236,10 @@ int bdrv_snapshot_delete(BlockDriverState *bs,
   error_setg(errp, "snapshot_id and name are both NULL");
   return -EINVAL;
   }
  +
  +/* drain all pending i/o before deleting snapshot */
  +bdrv_drain_all();
  +
   if (drv->bdrv_snapshot_delete) {
   return drv->bdrv_snapshot_delete(bs, snapshot_id, name, errp);
   }

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1709025/+subscriptions

Re: [Qemu-devel] [PATCH 04/17] nbd/client: fix nbd_send_request to return int

2017-08-07 Thread Daniel P. Berrange

On Fri, Aug 04, 2017 at 06:14:27PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Fix nbd_send_request to return int, as it returns a return value
> of nbd_write (which is int), and the only user of nbd_send_request's
> return value (nbd_co_send_request) consider it as int too.

The real problem here is the return value of nbd_write() - it should be
a ssize_t, not an int, since it is propagating the return value of
nbd_rwv() which is ssize_t. Basically all functions that do I/O and
return the number of bytes read/written should be ssize_t - any use of
int is a bug.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH 05/17] block/nbd-client: get rid of ssize_t

2017-08-07 Thread Daniel P. Berrange

On Mon, Aug 07, 2017 at 09:57:30AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> 04.08.2017 19:11, Daniel P. Berrange wrote:
> > On Fri, Aug 04, 2017 at 06:14:28PM +0300, Vladimir Sementsov-Ogievskiy 
> > wrote:
> > > Use int variable for nbd_co_send_request return value (as
> > > nbd_co_send_request returns int).
> > Hmm, nbd_co_send_request() propagates return value of nbd_send_request,
> > which returns ssize_t.
> 
> See patch 04, nbd_send_request fixed in it to return int. It propagates
> return
> value of nbd_write which is int.

That's wrong too - nbd_write() should have been fixed to have ssize_t
as its return type, since its returning the value of a ssize_t variable.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] Editing qemu source code

2017-08-07 Thread Stefan Hajnoczi

On Sun, Aug 06, 2017 at 01:44:52PM +0200, Stefan Weil wrote:
> Am 06.08.2017 um 13:25 schrieb Shantanu Agarwal:
> > Hello all,
> > I was reading about qemu, trying to understand the source code. I found
> > that vl.c is considered the main file of qemu source code from where all
> > the execution starts. So I flipped some values, i even put a exit(0) just
> > after the int main starts, but nothing is changed. Qemu is fully
> > functional. Also i put many printf's in between but no output at all. Am I
> > missing something or vl.c is not the main file.
> > 
> 
> Hi,
> 
> the main function in vl.c is the correct entry point.
> If you don't see an effect of your changes, then you
> are probably calling an executable which is not the
> changed one.

Yes, this is a common cause for this problem.  Are you launching the
correct QEMU binary?

For example, the x86 full-system emulator built in the source tree can
be run like this:

  $ cd qemu
  $ x86_64-softmmu/qemu-system-x86_64

Remember that operating systems resolve shell commands using the PATH
environment variable.  If you ran 'qemu-system-x86_64' without a
relative or absolute path then you are probably running the QEMU binary
that was installed from a package somewhere else on your system (e.g.
/usr/bin/qemu-system-x86_64).

If you ran 'make install' then make sure to give the full path to the
QEMU binary (e.g. /usr/local/bin/qemu-system-x86_64) or set PATH so
/usr/local/bin comes before other directories in the search path.

Stefan

signature.asc
Description: PGP signature

Re: [Qemu-devel] [Qemu-block] [PATCH] block: drop bdrv_set_key from BlockDriver

2017-08-07 Thread Stefan Hajnoczi

On Fri, Aug 04, 2017 at 05:26:55PM +0200, Paolo Bonzini wrote:
> This is not used anymore since c01c214b69 ("block: remove all encryption
> handling APIs", 2017-07-11).
> 
> Signed-off-by: Paolo Bonzini 
> ---
>  include/block/block_int.h | 1 -
>  1 file changed, 1 deletion(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH v6 3/4] sockets: factor out create_fast_reuse_socket

2017-08-07 Thread Daniel P. Berrange

On Sat, Jul 29, 2017 at 11:18:17PM +0200, Knut Omang wrote:
> Another refactoring step to prepare for fixing the problem
> exposed with the test-listen test in the previous commit
> 
> Signed-off-by: Knut Omang 
> ---
>  util/qemu-sockets.c | 15 ---
>  1 file changed, 12 insertions(+), 3 deletions(-)

Reviewed-by: Daniel P. Berrange 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH v6 4/4] sockets: Handle race condition between binds to the same port

2017-08-07 Thread Daniel P. Berrange

On Sat, Jul 29, 2017 at 11:18:18PM +0200, Knut Omang wrote:
> If an offset of ports is specified to the inet_listen_saddr function(),
> and two or more processes tries to bind from these ports at the same time,
> occasionally more than one process may be able to bind to the same
> port. The condition is detected by listen() but too late to avoid a failure.
> 
> This function is called by socket_listen() and used
> by all socket listening code in QEMU, so all cases where any form of dynamic
> port selection is used should be subject to this issue.
> 
> Add code to close and re-establish the socket when this
> condition is observed, hiding the race condition from the user.
> 
> Also clean up some issues with error handling to allow more
> accurate reporting of the cause of an error.
> 
> This has been developed and tested by means of the
> test-listen unit test in the previous commit.
> Enable the test for make check now that it passes.
> 
> Reviewed-by: Bhavesh Davda 
> Reviewed-by: Yuval Shaia 
> Reviewed-by: Girish Moodalbail 
> Signed-off-by: Knut Omang 
> ---
>  tests/Makefile.include |  2 +-
>  util/qemu-sockets.c| 58 ++-
>  2 files changed, 42 insertions(+), 18 deletions(-)

Reviewed-by: Daniel P. Berrange 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH for-2.10] quorum: Handle bdrv_getlength() failures in quorum_co_flush()

2017-08-07 Thread Alberto Garcia

On Fri 04 Aug 2017 05:44:00 PM CEST, Eric Blake wrote:
>> --- a/block/quorum.c
>> +++ b/block/quorum.c
>> @@ -785,8 +785,9 @@ static coroutine_fn int quorum_co_flush(BlockDriverState 
>> *bs)
>>  for (i = 0; i < s->num_children; i++) {
>>  result = bdrv_co_flush(s->children[i]->bs);
>>  if (result) {
>> +int64_t length = bdrv_getlength(s->children[i]->bs);
>>  quorum_report_bad(QUORUM_OP_TYPE_FLUSH, 0,
>> -  bdrv_getlength(s->children[i]->bs),
>> +  length > 0 ? length : 0,
>
> In the fallback case, is always picking 0 good enough?  Then again,
> this is in the error path, so it is unlikely in practice, and I don't
> see any better way to handle it.

I don't see any better way to handle it either, and I'm not sure that it
matters much: this is a flush error event, the 'sectors-count' field
doesn't even mean anything, but we have to put something there.

Berto

Re: [Qemu-devel] [PATCH v6 1/4] tests: Add test-listen - a stress test for QEMU socket listen

2017-08-07 Thread Daniel P. Berrange

On Sat, Jul 29, 2017 at 11:18:15PM +0200, Knut Omang wrote:
> There's a potential race condition between multiple bind()'s
> attempting to bind to the same port, which occasionally
> allows more than one bind to succeed against the same port.
> 
> When a subsequent listen() call is made with the same socket
> only one will succeed.
> 
> The current QEMU code does however not take this situation into account
> and the listen will cause the code to break out and fail even
> when there are actually available ports to use.
> 
> This test exposes two subtests:
> 
> /socket/listen-serial
> /socket/listen-compete
> 
> The "compete" subtest creates a number of threads and have them all trying to 
> bind
> to the same port with a large enough offset input to
> allow all threads to get it's own port.
> The "serial" subtest just does the same, except in series in a
> single thread.
> 
> The serial version passes, probably in most versions of QEMU.
> 
> The parallel version exposes the problem in a relatively reliable way,
> eg. it fails a majority of times, but not with a 100% rate, occasional
> passes can be seen. Nevertheless this is quite good given that
> the bug was tricky to reproduce and has been left undetected for
> a while.
> 
> The problem seems to be present in all versions of QEMU.
> 
> The original failure scenario occurred with VNC port allocation
> in a traditional Xen based build, in different code
> but with similar functionality.
> 
> Reported-by: Bhavesh Davda 
> Signed-off-by: Knut Omang 
> Reviewed-by: Yuval Shaia 
> Reviewed-by: Bhavesh Davda 
> Reviewed-by: Girish Moodalbail 
> ---
>  tests/Makefile.include |   2 +-
>  tests/test-listen.c| 253 ++-
>  2 files changed, 255 insertions(+)
>  create mode 100644 tests/test-listen.c
> 
> diff --git a/tests/Makefile.include b/tests/Makefile.include
> index 7af278d..b37c0c8 100644
> --- a/tests/Makefile.include
> +++ b/tests/Makefile.include
> @@ -128,6 +128,7 @@ check-unit-y += tests/test-bufferiszero$(EXESUF)
>  gcov-files-check-bufferiszero-y = util/bufferiszero.c
>  check-unit-y += tests/test-uuid$(EXESUF)
>  check-unit-y += tests/ptimer-test$(EXESUF)
> +#check-unit-y += tests/test-listen$(EXESUF)
>  gcov-files-ptimer-test-y = hw/core/ptimer.c
>  check-unit-y += tests/test-qapi-util$(EXESUF)
>  gcov-files-test-qapi-util-y = qapi/qapi-util.c
> @@ -769,6 +770,7 @@ tests/test-arm-mptimer$(EXESUF): tests/test-arm-mptimer.o
>  tests/test-qapi-util$(EXESUF): tests/test-qapi-util.o $(test-util-obj-y)
>  tests/numa-test$(EXESUF): tests/numa-test.o
>  tests/vmgenid-test$(EXESUF): tests/vmgenid-test.o tests/boot-sector.o 
> tests/acpi-utils.o
> +tests/test-listen$(EXESUF): tests/test-listen.o $(test-util-obj-y)
>  
>  tests/migration/stress$(EXESUF): tests/migration/stress.o
>   $(call quiet-command, $(LINKPROG) -static -O3 $(PTHREAD_LIB) -o $@ $< 
> ,"LINK","$(TARGET_DIR)$@")
> diff --git a/tests/test-listen.c b/tests/test-listen.c
> new file mode 100644
> index 000..5c07537
> --- /dev/null
> +++ b/tests/test-listen.c
> @@ -0,0 +1,253 @@
> +/*
> + * Copyright (c) 2017, Oracle and/or its affiliates. All rights reserved.
> + *Author: Knut Omang 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2
> + * as published by the Free Software Foundation.

Can you change that to "version 2 or later" - per the LICENSE file, we don't
accept contributions under "version 2 only" except for 4 specific subdirs:


  "As of July 2013, contributions under version 2 of the GNU General Public
   License (and no later version) are only accepted for the following files
   or directories: bsd-user/, linux-user/, hw/vfio/, hw/xen/xen_pt*."


> + *
> + * Test parallel port listen configuration with
> + * dynamic port allocation
> + */


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

[Qemu-devel] [PATCH v2 3/4] i386/kvm: introduce tsc_is_stable_and_known()

2017-08-07 Thread Ladi Prosek

Move the "is TSC stable and known" condition to a reusable helper.

Signed-off-by: Ladi Prosek 
---
 target/i386/kvm.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 15d56ae..2dc01c9 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -611,6 +611,15 @@ static int kvm_arch_set_tsc_khz(CPUState *cs)
 return 0;
 }
 
+static bool tsc_is_stable_and_known(CPUX86State *env)
+{
+if (!env->tsc_khz) {
+return false;
+}
+return (env->features[FEAT_8000_0007_EDX] & CPUID_APM_INVTSC)
+|| env->user_tsc_khz;
+}
+
 static int hyperv_handle_properties(CPUState *cs)
 {
 X86CPU *cpu = X86_CPU(cs);
@@ -986,9 +995,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
 && cpu->expose_kvm
 && kvm_base == KVM_CPUID_SIGNATURE
 /* TSC clock must be stable and known for this feature. */
-&& ((env->features[FEAT_8000_0007_EDX] & CPUID_APM_INVTSC)
-|| env->user_tsc_khz != 0)
-&& env->tsc_khz != 0) {
+&& tsc_is_stable_and_known(env)) {
 
 c = &cpuid_data.entries[cpuid_i++];
 c->function = KVM_CPUID_SIGNATURE | 0x10;
-- 
2.9.3

[Qemu-devel] [PATCH v2 1/4] i386/kvm: use a switch statement for MSR detection

2017-08-07 Thread Ladi Prosek

Switch is easier on the eye and might lead to better codegen.

Signed-off-by: Ladi Prosek 
---
 target/i386/kvm.c | 75 +++
 1 file changed, 31 insertions(+), 44 deletions(-)

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 6db7783..b14a0db 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -1081,65 +1081,52 @@ static int kvm_get_supported_msrs(KVMState *s)
 int i;
 
 for (i = 0; i < kvm_msr_list->nmsrs; i++) {
-if (kvm_msr_list->indices[i] == MSR_STAR) {
+switch (kvm_msr_list->indices[i]) {
+case MSR_STAR:
 has_msr_star = true;
-continue;
-}
-if (kvm_msr_list->indices[i] == MSR_VM_HSAVE_PA) {
+break;
+case MSR_VM_HSAVE_PA:
 has_msr_hsave_pa = true;
-continue;
-}
-if (kvm_msr_list->indices[i] == MSR_TSC_AUX) {
+break;
+case MSR_TSC_AUX:
 has_msr_tsc_aux = true;
-continue;
-}
-if (kvm_msr_list->indices[i] == MSR_TSC_ADJUST) {
+break;
+case MSR_TSC_ADJUST:
 has_msr_tsc_adjust = true;
-continue;
-}
-if (kvm_msr_list->indices[i] == MSR_IA32_TSCDEADLINE) {
+break;
+case MSR_IA32_TSCDEADLINE:
 has_msr_tsc_deadline = true;
-continue;
-}
-if (kvm_msr_list->indices[i] == MSR_IA32_SMBASE) {
+break;
+case MSR_IA32_SMBASE:
 has_msr_smbase = true;
-continue;
-}
-if (kvm_msr_list->indices[i] == MSR_IA32_MISC_ENABLE) {
+break;
+case MSR_IA32_MISC_ENABLE:
 has_msr_misc_enable = true;
-continue;
-}
-if (kvm_msr_list->indices[i] == MSR_IA32_BNDCFGS) {
+break;
+case MSR_IA32_BNDCFGS:
 has_msr_bndcfgs = true;
-continue;
-}
-if (kvm_msr_list->indices[i] == MSR_IA32_XSS) {
+break;
+case MSR_IA32_XSS:
 has_msr_xss = true;
-continue;
-}
-if (kvm_msr_list->indices[i] == HV_X64_MSR_CRASH_CTL) {
+break;;
+case HV_X64_MSR_CRASH_CTL:
 has_msr_hv_crash = true;
-continue;
-}
-if (kvm_msr_list->indices[i] == HV_X64_MSR_RESET) {
+break;
+case HV_X64_MSR_RESET:
 has_msr_hv_reset = true;
-continue;
-}
-if (kvm_msr_list->indices[i] == HV_X64_MSR_VP_INDEX) {
+break;
+case HV_X64_MSR_VP_INDEX:
 has_msr_hv_vpindex = true;
-continue;
-}
-if (kvm_msr_list->indices[i] == HV_X64_MSR_VP_RUNTIME) {
+break;
+case HV_X64_MSR_VP_RUNTIME:
 has_msr_hv_runtime = true;
-continue;
-}
-if (kvm_msr_list->indices[i] == HV_X64_MSR_SCONTROL) {
+break;
+case HV_X64_MSR_SCONTROL:
 has_msr_hv_synic = true;
-continue;
-}
-if (kvm_msr_list->indices[i] == HV_X64_MSR_STIMER0_CONFIG) {
+break;
+case HV_X64_MSR_STIMER0_CONFIG:
 has_msr_hv_stimer = true;
-continue;
+break;
 }
 }
 }
-- 
2.9.3

[Qemu-devel] [PATCH v2 0/4] i386/kvm: advertise Hyper-V frequency MSRs

2017-08-07 Thread Ladi Prosek

This is the QEMU part of the changes required for nested Hyper-V to read
timestamps with RDTSC + TSC page. Without exposing the frequency MSRs,
Windows with the Hyper-V role enabled use the much slower
HV_X64_MSR_TIME_REF_COUNT (0x4020) RDMSR to read timestamps.

The new registers are exposed only if the TSC frequency is stable across
migration and known, as suggested by Paolo.

v1->v2:
* deleted an extra empty line in patch 1
* added patch 3 introducing a helper function for the "TSC is stable and
  known" check (David)

Ladi Prosek (4):
  i386/kvm: use a switch statement for MSR detection
  i386/kvm: set tsc_khz before configuring Hyper-V CPUID
  i386/kvm: introduce tsc_is_stable_and_known()
  i386/kvm: advertise Hyper-V frequency MSRs

 target/i386/kvm.c | 138 --
 1 file changed, 71 insertions(+), 67 deletions(-)

-- 
2.9.3

[Qemu-devel] [PATCH v2 2/4] i386/kvm: set tsc_khz before configuring Hyper-V CPUID

2017-08-07 Thread Ladi Prosek

Timing-related Hyper-V enlightenments will benefit from knowing the final
tsc_khz value. This commit just moves the code in preparation for further
changes.

Signed-off-by: Ladi Prosek 
---
 target/i386/kvm.c | 38 +++---
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index b14a0db..15d56ae 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -695,6 +695,25 @@ int kvm_arch_init_vcpu(CPUState *cs)
 
 cpuid_i = 0;
 
+r = kvm_arch_set_tsc_khz(cs);
+if (r < 0) {
+goto fail;
+}
+
+/* vcpu's TSC frequency is either specified by user, or following
+ * the value used by KVM if the former is not present. In the
+ * latter case, we query it from KVM and record in env->tsc_khz,
+ * so that vcpu's TSC frequency can be migrated later via this field.
+ */
+if (!env->tsc_khz) {
+r = kvm_check_extension(cs->kvm_state, KVM_CAP_GET_TSC_KHZ) ?
+kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ) :
+-ENOTSUP;
+if (r > 0) {
+env->tsc_khz = r;
+}
+}
+
 /* Paravirtualization CPUIDs */
 if (hyperv_enabled(cpu)) {
 c = &cpuid_data.entries[cpuid_i++];
@@ -961,25 +980,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
 }
 }
 
-r = kvm_arch_set_tsc_khz(cs);
-if (r < 0) {
-goto fail;
-}
-
-/* vcpu's TSC frequency is either specified by user, or following
- * the value used by KVM if the former is not present. In the
- * latter case, we query it from KVM and record in env->tsc_khz,
- * so that vcpu's TSC frequency can be migrated later via this field.
- */
-if (!env->tsc_khz) {
-r = kvm_check_extension(cs->kvm_state, KVM_CAP_GET_TSC_KHZ) ?
-kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ) :
--ENOTSUP;
-if (r > 0) {
-env->tsc_khz = r;
-}
-}
-
 if (cpu->vmware_cpuid_freq
 /* Guests depend on 0x4000 to detect this feature, so only expose
  * it if KVM exposes leaf 0x4000. (Conflicts with Hyper-V) */
-- 
2.9.3

[Qemu-devel] [PATCH v2 4/4] i386/kvm: advertise Hyper-V frequency MSRs

2017-08-07 Thread Ladi Prosek

As of kernel commit eb82feea59d6 ("KVM: hyperv: support HV_X64_MSR_TSC_FREQUENCY
and HV_X64_MSR_APIC_FREQUENCY"), KVM supports two new MSRs which are required
for nested Hyper-V to read timestamps with RDTSC + TSC page.

This commit makes QEMU advertise the MSRs with CPUID.4003H:EAX[11] and
CPUID.4003H:EDX[8] as specified in the Hyper-V TLFS and experimentally
verified on a Hyper-V host. The feature is enabled with the existing hv-time CPU
flag, and only if the TSC frequency is stable across migrations and known.

Signed-off-by: Ladi Prosek 
---
 target/i386/kvm.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 2dc01c9..739334a 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -89,6 +89,7 @@ static bool has_msr_hv_vpindex;
 static bool has_msr_hv_runtime;
 static bool has_msr_hv_synic;
 static bool has_msr_hv_stimer;
+static bool has_msr_hv_frequencies;
 static bool has_msr_xss;
 
 static bool has_msr_architectural_pmu;
@@ -640,7 +641,13 @@ static int hyperv_handle_properties(CPUState *cs)
 if (cpu->hyperv_time) {
 env->features[FEAT_HYPERV_EAX] |= HV_X64_MSR_HYPERCALL_AVAILABLE;
 env->features[FEAT_HYPERV_EAX] |= HV_X64_MSR_TIME_REF_COUNT_AVAILABLE;
-env->features[FEAT_HYPERV_EAX] |= 0x200;
+env->features[FEAT_HYPERV_EAX] |= HV_X64_MSR_REFERENCE_TSC_AVAILABLE;
+
+if (has_msr_hv_frequencies && tsc_is_stable_and_known(env)) {
+env->features[FEAT_HYPERV_EAX] |= HV_X64_ACCESS_FREQUENCY_MSRS;
+env->features[FEAT_HYPERV_EDX] |=
+HV_FEATURE_FREQUENCY_MSRS_AVAILABLE;
+}
 }
 if (cpu->hyperv_crash && has_msr_hv_crash) {
 env->features[FEAT_HYPERV_EDX] |= HV_X64_GUEST_CRASH_MSR_AVAILABLE;
@@ -1134,6 +1141,9 @@ static int kvm_get_supported_msrs(KVMState *s)
 case HV_X64_MSR_STIMER0_CONFIG:
 has_msr_hv_stimer = true;
 break;
+case HV_X64_MSR_TSC_FREQUENCY:
+has_msr_hv_frequencies = true;
+break;
 }
 }
 }
-- 
2.9.3

Re: [Qemu-devel] [Qemu-ppc] [PATCH] target/ppc: Only set PCR in kvm if actually in a compat mode

2017-08-07 Thread Greg Kurz

On Fri, 4 Aug 2017 12:35:30 +1000
David Gibson  wrote:

> On Thu, Aug 03, 2017 at 07:28:06PM +0200, Greg Kurz wrote:
> > On Thu, 13 Jul 2017 11:21:18 +1000
> > David Gibson  wrote:
> >   
> > > On Wed, Jul 12, 2017 at 04:45:17PM +1000, Suraj Jitindar Singh wrote:  
> > > > On Mon, 2017-07-03 at 19:20 +1000, David Gibson wrote:
> > > > > On Mon, Jul 03, 2017 at 01:18:38PM +1000, Suraj Jitindar Singh wrote: 
> > > > >
> > > > > > On Fri, 2017-06-30 at 14:03 +1000, David Gibson wrote:
> > > > > > > On Thu, Jun 29, 2017 at 02:59:39PM +1000, Suraj Jitindar Singh
> > > > > > > wrote:
> > > > > > > > The Processor Compatibility Register (PCR) I used to set the
> > > > > > > > compatibility mode of the processor using the SET_ONE_REG ioctl
> > > > > > > > on
> > > > > > > > KVM_REG_PPC_ARCH_COMPAT. Previously this was only called when a
> > > > > > > > compat
> > > > > > > > mode was actually in use, however a recent patch made it
> > > > > > > > unconditional.
> > > > > > > > Calling this in KVM_PR fails as there is no handler for that
> > > > > > > > call
> > > > > > > > and it
> > > > > > > > is thus impossible to start a machine with KVM_PR.
> > > > > > > > 
> > > > > > > > Change ppc_set_compat() so that the ioctl is only actually
> > > > > > > > called
> > > > > > > > if a
> > > > > > > > compat mode is in use. This means that a KVM_PR guest can boot.
> > > > > > > > Additionally the current behaviour for KVM_HV is preserved
> > > > > > > > where a
> > > > > > > > compat
> > > > > > > > mode of 0 set pcr and arch_compat in the vcore struct to zero,
> > > > > > > > both
> > > > > > > > of
> > > > > > > > which are initialised to zero anyway.
> > > > > > > > 
> > > > > > > > Fixes: 37f516defa2e ("pseries: Reset CPU compatibility mode")
> > > > > > > > 
> > > > > > > > Signed-off-by: Suraj Jitindar Singh   
> > > > > > > >   
> > > > > > > 
> > > > > > > This doesn't seem quite right.  With this change, how would we
> > > > > > > ever
> > > > > > > turn compatibility mode _off_ (which could happen on reset if
> > > > > > > nothing
> > > > > > 
> > > > > > Oh yeah, didn't really think about that.
> > > > > > 
> > > > > > > else).  Really we should add this pseudo-register to KVM PR,
> > > > > > > although
> > > > > > > I'm fine with also having a qemu workaround to let it work with
> > > > > > > older
> > > > > > > PR versions.
> > > > > > 
> > > > > > How do you feel about having a check and only calling the ioctl if
> > > > > > the
> > > > > > KVM in use is HV?
> > > > > 
> > > > > Don't really like it.  For one thing, we want to avoid explicitly
> > > > > checking for KVM PR - we should check specific capabilities instead.
> > > > > For another, it means on PR we're silently ignoring the compatibility
> > > > > mode which isn't really right.
> > > > > 
> > > > > I think the right approach here is to only call the ioctl() if the
> > > > > compatibility mode has actually changed.  That should make it work in
> > > > > the cases the original patch did, which is.. actually very few, given
> > > > > the new CAS logic.
> > > > 
> > > > I think this is the right approach. There is no point calling the ioctl
> > > > if nothing changed. Additionally this fixes KVM_PR in the interim
> > > > assuming no max-cpu-compat is specified on the command line.
> > > 
> > > Right, that's the idea.
> > >   
> > > > > Really the right fix is to implement the set compat mode ioctl() in
> > > > > KVM PR.
> > > > 
> > > > This would be the ideal fix however I suggest the implementation of
> > > > that would be to simply ignore it- given the main use case of KVM_PR is
> > > > nested and that means we can't actually set the PCR since it's
> > > > hypervisor privileged.
> > > 
> > > Yeah, as we discussed on IRC, I tend to agree.  I don't love the idea
> > > of silently presenting something other than requested.  However, we
> > > don't really have much choice and we do already have precedent.  PR
> > > tries to match the CPU requested in the PVR, but won't always be able
> > > to do so exactly (if the host CPU supports userspace instructions the
> > > requested PVR doesn't).  This doesn't really change the situation,
> > > except that we have a (PVR+PCR) combination instead of just a PVR that
> > > we're trying, and not completely suceeding, in matching.
> > >   
> > 
> > Does it make sense at all to use compat mode with KVM_PR since it
> > requires hypervisor privilege, that we're supposed not to have ?  
> 
> Uh.. what?  Availability of the PCR is a question of the guest
> environment, and PR (at least potentially) supports various different
> guest environments, both with and without (apparent) hypervisor
> capability.
> 

I mean mtpcr/mfpcr only work when the CPU is in hypervisor state, and PR
is supposed to be *mostly* used nested, ie, without hypervisor privilege.
Thus, I don't see the point in implementing PPC_ARCH_COMPAT in PR... but
I'm probably missing something :)

> > Shouldn't

Re: [Qemu-devel] [PATCH 04/17] nbd/client: fix nbd_send_request to return int

2017-08-07 Thread Vladimir Sementsov-Ogievskiy


07.08.2017 11:23, Daniel P. Berrange wrote:

On Fri, Aug 04, 2017 at 06:14:27PM +0300, Vladimir Sementsov-Ogievskiy wrote:

Fix nbd_send_request to return int, as it returns a return value
of nbd_write (which is int), and the only user of nbd_send_request's
return value (nbd_co_send_request) consider it as int too.

The real problem here is the return value of nbd_write() - it should be
a ssize_t, not an int, since it is propagating the return value of
nbd_rwv() which is ssize_t.


It was changed in f5d406fe86bb (sent in May)
The discussion actually was started half a year ago:
https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg00825.html



Basically all functions that do I/O and
return the number of bytes read/written should be ssize_t - any use of
int is a bug.

Regards,
Daniel



--
Best regards,
Vladimir

Re: [Qemu-devel] [RFC 22/29] migration: new message MIG_RP_MSG_RECV_BITMAP

2017-08-07 Thread Dr. David Alan Gilbert

* Peter Xu (pet...@redhat.com) wrote:
> On Fri, Aug 04, 2017 at 10:49:42AM +0100, Dr. David Alan Gilbert wrote:
> > * Peter Xu (pet...@redhat.com) wrote:
> > > On Thu, Aug 03, 2017 at 11:50:22AM +0100, Dr. David Alan Gilbert wrote:
> > > > > +/* Size of the bitmap, in bytes */
> > > > > +size = (block->max_length >> TARGET_PAGE_BITS) / 8;
> > > > > +qemu_put_be64(file, size);
> > > > > +qemu_put_buffer(file, (const uint8_t *)block->receivedmap, size);
> > > > 
> > > > Do we need to be careful about endianness and length of long here?
> > > > The migration stream can (theoretically) migrate between hosts of
> > > > different endianness, e.g. a Power LE and Power BE host it can also
> > > > migrate between a 32bit and 64bit host where the 'long' used in our
> > > > bitmap is a different length.
> > > 
> > > Ah, good catch...
> > > 
> > > I feel like we'd better provide a new bitmap helper for this when the
> > > bitmap will be sent to somewhere else, like:
> > > 
> > >   void bitmap_to_le(unsigned long *dst, const unsigned long *src,
> > > long nbits);
> > >   void bitmap_from_le(unsigned long *dst, const unsigned long *src,
> > >   long nbits);
> > > 
> > > I used little endian since I *think* that should work even cross 32/64
> > > bits machines (and I think big endian should not work).
> > 
> > Lets think about some combinations:
> > 
> > 64 bit LE  G0,G1...G7
> > 64 bit BE  G7,G6...G0
> > 32 bit LE  A0,A1,A2,A3, B0,B1,B2,B3
> > 32 bit BE  A3,A2,A1,A0  B3,B2,B1,B0
> > 
> > considering a 64bit BE src to a 32bit LE dest:
> >   64 bit BE  G7,G6...G0
> >   bitmap_to_le swaps that to
> >  G0,G1,..G7
> > 
> > destination reads two 32bit chunks:
> >   G0,G1,G2,G3G4,G5,G6,G7
> > 
> > dest is LE so no byteswap is needed.
> > 
> > Yes, I _think_ that's OK.
> 
> Hmm, I thought it over again and see another problem, which makes it
> more interesting...
> 
> The size of the bitmap can actually be different on hosts that are
> using different word sizes (or say, long size, which should be the
> same size of the word size). E.g., when we allocate a bitmap that
> covers nbits=6*32+1, we'll get 28 bytes (6*4+4) sized bitmap on 32bit
> machines, and 32 bytes (3*8+8) sized bitmap on 64bit machines.
> 
> I really hoped that we have some typedef for current bitmap, like:
> 
>   typedef long *Bitmap;
> 
> Then I can simply consider to switch the "long" to uint64_t, but sadly
> we don't have such. Then, type switching would be slightly overkill
> (maybe still doable, at least I need to touch lots of files).

Yes, it's messy.

> One of the solution is: I always send the bitmap in 8-bytes chunk,
> even on 32bit machines. If there is not aligned bitmap size (of course
> it'll only happen on 32bit machines), I align it up to 8-bytes, then
> fill the rest with zeros. I'll try to do this hack only in migration
> code for now.

Yes, although you may find it easier to send them in 4byte chunks and
pad on reception.

Dave

> -- 
> Peter Xu
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PULL for-2.10 0/3] TCG misc patches

2017-08-07 Thread Peter Maydell

On 3 August 2017 at 19:12, Richard Henderson  wrote:
> Two of these have been posted before.  The third I discovered in
> testing on a Sparc host yesterday.
>
>
> r~
>
>
> The following changes since commit aaaec6acad7cf97372d48c1b09126a09697519c8:
>
>   Update version for v2.10.0-rc1 release (2017-08-02 16:36:32 +0100)
>
> are available in the git repository at:
>
>   git://github.com/rth7680/qemu.git tags/pull-tcg-20170803
>
> for you to fetch changes up to 13aaef678ed377b12b76dc7fb9e615b2f2f9047b:
>
>   tcg: Increase minimum alignment from tcg_malloc to 8 (2017-08-03 11:00:30 
> -0700)
>
> 
> Queued misc tcg patches
>
> 
> Richard Henderson (3):
>   tcg/arm: Fix runtime overalignment test
>   target/s390x: Fix CSST for 16-byte store
>   tcg: Increase minimum alignment from tcg_malloc to 8
>

Applied, thanks.

-- PMM

Re: [Qemu-devel] [PATCH v6 1/4] tests: Add test-listen - a stress test for QEMU socket listen

2017-08-07 Thread Knut Omang

On Mon, 2017-08-07 at 09:45 +0100, Daniel P. Berrange wrote:
> On Sat, Jul 29, 2017 at 11:18:15PM +0200, Knut Omang wrote:
> > There's a potential race condition between multiple bind()'s
> > attempting to bind to the same port, which occasionally
> > allows more than one bind to succeed against the same port.
> > 
> > When a subsequent listen() call is made with the same socket
> > only one will succeed.
> > 
> > The current QEMU code does however not take this situation into account
> > and the listen will cause the code to break out and fail even
> > when there are actually available ports to use.
> > 
> > This test exposes two subtests:
> > 
> > /socket/listen-serial
> > /socket/listen-compete
> > 
> > The "compete" subtest creates a number of threads and have them all trying 
> > to bind
> > to the same port with a large enough offset input to
> > allow all threads to get it's own port.
> > The "serial" subtest just does the same, except in series in a
> > single thread.
> > 
> > The serial version passes, probably in most versions of QEMU.
> > 
> > The parallel version exposes the problem in a relatively reliable way,
> > eg. it fails a majority of times, but not with a 100% rate, occasional
> > passes can be seen. Nevertheless this is quite good given that
> > the bug was tricky to reproduce and has been left undetected for
> > a while.
> > 
> > The problem seems to be present in all versions of QEMU.
> > 
> > The original failure scenario occurred with VNC port allocation
> > in a traditional Xen based build, in different code
> > but with similar functionality.
> > 
> > Reported-by: Bhavesh Davda 
> > Signed-off-by: Knut Omang 
> > Reviewed-by: Yuval Shaia 
> > Reviewed-by: Bhavesh Davda 
> > Reviewed-by: Girish Moodalbail 
> > ---
> >  tests/Makefile.include |   2 +-
> >  tests/test-listen.c| 253 ++-
> >  2 files changed, 255 insertions(+)
> >  create mode 100644 tests/test-listen.c
> > 
> > diff --git a/tests/Makefile.include b/tests/Makefile.include
> > index 7af278d..b37c0c8 100644
> > --- a/tests/Makefile.include
> > +++ b/tests/Makefile.include
> > @@ -128,6 +128,7 @@ check-unit-y += tests/test-bufferiszero$(EXESUF)
> >  gcov-files-check-bufferiszero-y = util/bufferiszero.c
> >  check-unit-y += tests/test-uuid$(EXESUF)
> >  check-unit-y += tests/ptimer-test$(EXESUF)
> > +#check-unit-y += tests/test-listen$(EXESUF)
> >  gcov-files-ptimer-test-y = hw/core/ptimer.c
> >  check-unit-y += tests/test-qapi-util$(EXESUF)
> >  gcov-files-test-qapi-util-y = qapi/qapi-util.c
> > @@ -769,6 +770,7 @@ tests/test-arm-mptimer$(EXESUF): 
> > tests/test-arm-mptimer.o
> >  tests/test-qapi-util$(EXESUF): tests/test-qapi-util.o $(test-util-obj-y)
> >  tests/numa-test$(EXESUF): tests/numa-test.o
> >  tests/vmgenid-test$(EXESUF): tests/vmgenid-test.o tests/boot-sector.o 
> > tests/acpi-utils.o
> > +tests/test-listen$(EXESUF): tests/test-listen.o $(test-util-obj-y)
> >  
> >  tests/migration/stress$(EXESUF): tests/migration/stress.o
> >     $(call quiet-command, $(LINKPROG) -static -O3 $(PTHREAD_LIB) -o $@ $< 
> > ,"LINK","$(TARGET_DIR)$@")
> > diff --git a/tests/test-listen.c b/tests/test-listen.c
> > new file mode 100644
> > index 000..5c07537
> > --- /dev/null
> > +++ b/tests/test-listen.c
> > @@ -0,0 +1,253 @@
> > +/*
> > + * Copyright (c) 2017, Oracle and/or its affiliates. All rights reserved.
> > + *Author: Knut Omang 
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2
> > + * as published by the Free Software Foundation.
> 
> Can you change that to "version 2 or later" - per the LICENSE file, we don't
> accept contributions under "version 2 only" except for 4 specific subdirs:
> 
> 
>   "As of July 2013, contributions under version 2 of the GNU General Public
>    License (and no later version) are only accepted for the following files
>    or directories: bsd-user/, linux-user/, hw/vfio/, hw/xen/xen_pt*."

Oh, sorry - I wasn't aware of this, +"...or later" is fine with me.
Would you like me to send a v7 of the set with only that change, or can you 
amend 
it as part of the merge?

Thanks,
Knut

> 
> > + *
> > + * Test parallel port listen configuration with
> > + * dynamic port allocation
> > + */
> 
> 
> Regards,
> Daniel

Re: [Qemu-devel] [PATCH v7 6/6] fsdev: QMP interface for throttling

2017-08-07 Thread Pradeep Jagadeesh


On 7/6/2017 8:47 PM, Markus Armbruster wrote:

Pradeep Jagadeesh  writes:


This patch introduces qmp interfaces for the fsdev
devices. This provides two interfaces one
for querying info of all the fsdev devices. The second one
to set the IO limits for the required fsdev device.

Signed-off-by: Pradeep Jagadeesh 
---
 Makefile|  4 +++
 fsdev/qemu-fsdev-dummy.c| 10 ++
 fsdev/qemu-fsdev-throttle.c | 76 +++-
 fsdev/qemu-fsdev-throttle.h | 13 +++
 fsdev/qemu-fsdev.c  | 37 
 monitor.c   |  5 +++
 qapi-schema.json|  3 ++
 qapi/fsdev.json | 84 +
 qmp.c   | 14 
 9 files changed, 245 insertions(+), 1 deletion(-)
 create mode 100644 qapi/fsdev.json

diff --git a/Makefile b/Makefile
index 16a0430..4fd7625 100644
--- a/Makefile
+++ b/Makefile
@@ -416,6 +416,10 @@ qapi-modules = $(SRC_PATH)/qapi-schema.json 
$(SRC_PATH)/qapi/common.json \
$(SRC_PATH)/qapi/crypto.json $(SRC_PATH)/qapi/rocker.json \
$(SRC_PATH)/qapi/trace.json

+ifdef CONFIG_VIRTFS


Uh, qapi-schema.json includes fsdev.json *unconditionally*, doesn't it?

Yes, I did not find any ways to include with some condition.



+qapi-modules += $(SRC_PATH)/qapi/fsdev.json
+endif
+
 qapi-types.c qapi-types.h :\
 $(qapi-modules) $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py \
diff --git a/fsdev/qemu-fsdev-dummy.c b/fsdev/qemu-fsdev-dummy.c
index 6dc0fbc..f33305d 100644
--- a/fsdev/qemu-fsdev-dummy.c
+++ b/fsdev/qemu-fsdev-dummy.c
@@ -19,3 +19,13 @@ int qemu_fsdev_add(QemuOpts *opts)
 {
 return 0;
 }
+
+void qmp_fsdev_set_io_throttle(IOThrottle *arg, Error **errp)
+{
+  return;


Indentation is off.

Will fix



+}
+
+IOThrottleList *qmp_query_fsdev_io_throttle(Error **errp)
+{
+abort();
+}


Any particular reason why one of the stubs abort()s, but not the other?

No idea, I think can even return NULL. I will change it.



diff --git a/fsdev/qemu-fsdev-throttle.c b/fsdev/qemu-fsdev-throttle.c
index c5e2499..4483533 100644
--- a/fsdev/qemu-fsdev-throttle.c
+++ b/fsdev/qemu-fsdev-throttle.c
@@ -16,7 +16,6 @@
 #include "qemu/error-report.h"
 #include "qemu-fsdev-throttle.h"
 #include "qemu/iov.h"
-#include "qemu/throttle-options.h"

 static void fsdev_throttle_read_timer_cb(void *opaque)
 {
@@ -30,6 +29,81 @@ static void fsdev_throttle_write_timer_cb(void *opaque)
 qemu_co_enter_next(&fst->throttled_reqs[true]);
 }

+void fsdev_set_io_throttle(IOThrottle *arg, FsThrottle *fst, Error **errp)
+{
+ThrottleConfig cfg;
+
+throttle_set_io_limits(&cfg, arg);
+
+if (throttle_is_valid(&cfg, errp)) {
+fst->cfg = cfg;
+fsdev_throttle_init(fst);
+}
+}
+
+void fsdev_get_io_throttle(FsThrottle *fst, IOThrottle **fs9pcfg,
+   char *fsdevice, Error **errp)
+{
+
+ThrottleConfig cfg = fst->cfg;
+IOThrottle *fscfg = g_malloc0(sizeof(*fscfg));
+
+fscfg->has_id = true;
+fscfg->id = g_strdup(fsdevice);
+fscfg->bps = cfg.buckets[THROTTLE_BPS_TOTAL].avg;
+fscfg->bps_rd = cfg.buckets[THROTTLE_BPS_READ].avg;
+fscfg->bps_wr = cfg.buckets[THROTTLE_BPS_WRITE].avg;
+
+fscfg->iops = cfg.buckets[THROTTLE_OPS_TOTAL].avg;
+fscfg->iops_rd = cfg.buckets[THROTTLE_OPS_READ].avg;
+fscfg->iops_wr = cfg.buckets[THROTTLE_OPS_WRITE].avg;
+
+fscfg->has_bps_max = cfg.buckets[THROTTLE_BPS_TOTAL].max;
+fscfg->bps_max = cfg.buckets[THROTTLE_BPS_TOTAL].max;
+fscfg->has_bps_rd_max  = cfg.buckets[THROTTLE_BPS_READ].max;
+fscfg->bps_rd_max  = cfg.buckets[THROTTLE_BPS_READ].max;
+fscfg->has_bps_wr_max  = cfg.buckets[THROTTLE_BPS_WRITE].max;
+fscfg->bps_wr_max  = cfg.buckets[THROTTLE_BPS_WRITE].max;
+
+fscfg->has_iops_max= cfg.buckets[THROTTLE_OPS_TOTAL].max;
+fscfg->iops_max= cfg.buckets[THROTTLE_OPS_TOTAL].max;
+fscfg->has_iops_rd_max = cfg.buckets[THROTTLE_OPS_READ].max;
+fscfg->iops_rd_max = cfg.buckets[THROTTLE_OPS_READ].max;
+fscfg->has_iops_wr_max = cfg.buckets[THROTTLE_OPS_WRITE].max;
+fscfg->iops_wr_max = cfg.buckets[THROTTLE_OPS_WRITE].max;
+
+fscfg->has_bps_max_length = fscfg->has_bps_max;
+fscfg->bps_max_length =
+ cfg.buckets[THROTTLE_BPS_TOTAL].burst_length;
+fscfg->has_bps_rd_max_length  = fscfg->has_bps_rd_max;
+fscfg->bps_rd_max_length  =
+ cfg.buckets[THROTTLE_BPS_READ].burst_length;
+fscfg->has_bps_wr_max_length  = fscfg->has_bps_wr_max;
+fscfg->bps_wr_max_length  =
+ cfg.buckets[THROTTLE_BPS_WRITE].burst_length;
+
+fscfg->has_iops_max_length= fscfg->has_iops_max;
+fscfg->iops_max_length=
+ cfg.buckets[THROTTLE_OPS_TOTAL].burst_length;
+fscfg->has_iops_rd_max_length = fscfg->has_iops_rd_max;
+fscfg->iop

Re: [Qemu-devel] [PATCH v7 1/6] throttle: factor out duplicate code

2017-08-07 Thread Pradeep Jagadeesh


On 7/10/2017 4:41 PM, Eric Blake wrote:

On 07/04/2017 10:30 AM, Pradeep Jagadeesh wrote:

This patch factor out the duplicate throttle code that was present in


s/This patch factor/Factor/

It's okay to write commit messages in the imperative tense; the easiest
way I know to start a good message is to use an implied "Apply this
patch to ..." in front of the sentence.  But "Apply this patch to This
patch ..." obviously doesn't flow, compared to "Apply this patch to
factor ..."

OK



block and fsdev devices.

Signed-off-by: Pradeep Jagadeesh 
Reviewed-by: Alberto Garcia 
---

Reviewed-by: Eric Blake

Re: [Qemu-devel] [PATCH v7 5/6] fsdev: hmp interface for throttling

2017-08-07 Thread Pradeep Jagadeesh


On 7/6/2017 8:15 PM, Markus Armbruster wrote:

Pradeep Jagadeesh  writes:


This patch introduces hmp interfaces for the fsdev
devices.

Signed-off-by: Pradeep Jagadeesh 


Fails to compile, first error:

/work/armbru/qemu/hmp.c: In function ‘hmp_fsdev_set_io_throttle’:
/work/armbru/qemu/hmp.c:1791:5: warning: implicit declaration of function 
‘qmp_fsdev_set_io_throttle’ [-Wimplicit-function-declaration]
 qmp_fsdev_set_io_throttle(&throttle, &err);
 ^

Do you need to swap PATCH 5 and 6?

Hmm, yes better to swap, because of dependency.
I will do in V8.

-Pradeep

Re: [Qemu-devel] [PATCH v4 5/9] s390x/ccw: create s390 phb conditionally

2017-08-07 Thread Cornelia Huck

On Fri, 4 Aug 2017 15:20:26 +0200
David Hildenbrand  wrote:

> On 04.08.2017 13:29, Cornelia Huck wrote:
> > Don't create the s390 pci host bridge if we do not provide the zpci
> > facility.
> > 
> > Reviewed-by: Thomas Huth 
> > Acked-by: Christian Borntraeger 
> > Signed-off-by: Cornelia Huck   
> 
> This works, because s390_init_cpus(machine) (which sets up features) is
> called before adding the bus. Mind to add a comment so nobody tries to
> reshuffle these functions?

This is already needed by the flic setup (which also relies on the
features being set up). We really want the cpu setup as early as
possible.

Do you have a good comment handy? :)

Re: [Qemu-devel] [PATCH v4 8/9] s390x/kvm: msi route fixup for non-pci

2017-08-07 Thread Cornelia Huck

On Fri, 4 Aug 2017 15:18:14 +0200
David Hildenbrand  wrote:

> On 04.08.2017 13:29, Cornelia Huck wrote:
> > If we don't provide pci, we cannot have a pci device for which we
> > have to translate to adapter routes: just return -ENODEV.
> > 
> > Signed-off-by: Cornelia Huck 
> > ---
> >  target/s390x/kvm.c | 6 ++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
> > index 9de165d8b1..d8db1cbf6e 100644
> > --- a/target/s390x/kvm.c
> > +++ b/target/s390x/kvm.c
> > @@ -2424,6 +2424,12 @@ int kvm_arch_fixup_msi_route(struct 
> > kvm_irq_routing_entry *route,
> >  uint32_t idx = data >> ZPCI_MSI_VEC_BITS;
> >  uint32_t vec = data & ZPCI_MSI_VEC_MASK;
> >  
> > +if (!s390_has_feat(S390_FEAT_ZPCI)) {
> > +/* How can we get here without pci enabled? */
> > +g_assert(false);
> > +return -ENODEV;
> > +}
> > +  
> 
> /* How can we get here without pci enabled? */
> g_assert(s390_has_feat(S390_FEAT_ZPCI));
> 
> ?

What I don't like about this is that it implicitly relies on
s390_pci_find_dev_by_idx() doing the correct thing if asserts are off -
and it is non-obvious that we'll get pbdev == NULL in that case.

> 
> >  pbdev = s390_pci_find_dev_by_idx(s390_get_phb(), idx);
> >  if (!pbdev) {
> >  DPRINTF("add_msi_route no dev\n");
> >   
> 
>

Re: [Qemu-devel] [PATCH v4 7/9] s390x/pci: fence off instructions for non-pci

2017-08-07 Thread Cornelia Huck

On Fri, 4 Aug 2017 15:17:25 +0200
David Hildenbrand  wrote:

> On 04.08.2017 13:29, Cornelia Huck wrote:
> > If a guest running on a machine without zpci issues a pci instruction,
> > throw them an exception.
> > 
> > Reviewed-by: Thomas Huth 
> > Signed-off-by: Cornelia Huck 
> > ---
> >  target/s390x/kvm.c | 54 
> > +-
> >  1 file changed, 41 insertions(+), 13 deletions(-)
> > 
> > diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
> > index bc62bba5b7..9de165d8b1 100644
> > --- a/target/s390x/kvm.c
> > +++ b/target/s390x/kvm.c
> > @@ -1191,7 +1191,11 @@ static int kvm_clp_service_call(S390CPU *cpu, struct 
> > kvm_run *run)
> >  {
> >  uint8_t r2 = (run->s390_sieic.ipb & 0x000f) >> 16;
> >  
> > -return clp_service_call(cpu, r2);
> > +if (s390_has_feat(S390_FEAT_ZPCI)) {
> > +return clp_service_call(cpu, r2);
> > +} else {
> > +return -1;
> > +}  
> 
> I am a fan of dropping these else case and returning directly. But that
> is just my opinion.
> 
> (applies to all changes in this patch)
> 
> 

You are not the first to say that :)

I do prefer this way around, though, and if there aren't strong
objections, I'll keep it like this.

Re: [Qemu-devel] [PATCH v7 2/6] qmp: Create IOThrottle structure

2017-08-07 Thread Pradeep Jagadeesh


On 7/6/2017 7:55 PM, Markus Armbruster wrote:

Pradeep Jagadeesh  writes:


This patch enables qmp interfaces for the fsdev
devices. This provides two interfaces one
for querying info of all the fsdev devices. The second one
to set the IO limits for the required fsdev device.

Signed-off-by: Pradeep Jagadeesh 
Reviewed-by: Greg Kurz 
Reviewed-by: Eric Blake 
Reviewed-by: Alberto Garcia 
---
 qapi/block-core.json | 76 ++---
 qapi/iothrottle.json | 88 
 2 files changed, 91 insertions(+), 73 deletions(-)
 create mode 100644 qapi/iothrottle.json

diff --git a/qapi/block-core.json b/qapi/block-core.json
index f85c223..9320974 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -6,6 +6,7 @@

 # QAPI common definitions
 { 'include': 'common.json' }
+{ 'include': 'iothrottle.json' }

 ##
 # @SnapshotInfo:
@@ -1761,84 +1762,13 @@
 #
 # @device: Block device name (deprecated, use @id instead)
 #
-# @id: The name or QOM path of the guest device (since: 2.8)
-#
-# @bps: total throughput limit in bytes per second
-#
-# @bps_rd: read throughput limit in bytes per second
-#
-# @bps_wr: write throughput limit in bytes per second
-#
-# @iops: total I/O operations per second
-#
-# @iops_rd: read I/O operations per second
-#
-# @iops_wr: write I/O operations per second
-#
-# @bps_max: total throughput limit during bursts,
-# in bytes (Since 1.7)
-#
-# @bps_rd_max: read throughput limit during bursts,
-#in bytes (Since 1.7)
-#
-# @bps_wr_max: write throughput limit during bursts,
-#in bytes (Since 1.7)
-#
-# @iops_max: total I/O operations per second during bursts,
-#  in bytes (Since 1.7)
-#
-# @iops_rd_max: read I/O operations per second during bursts,
-# in bytes (Since 1.7)
-#
-# @iops_wr_max: write I/O operations per second during bursts,
-# in bytes (Since 1.7)
-#
-# @bps_max_length: maximum length of the @bps_max burst
-#period, in seconds. It must only
-#be set if @bps_max is set as well.
-#Defaults to 1. (Since 2.6)
-#
-# @bps_rd_max_length: maximum length of the @bps_rd_max
-#   burst period, in seconds. It must only
-#   be set if @bps_rd_max is set as well.
-#   Defaults to 1. (Since 2.6)
-#
-# @bps_wr_max_length: maximum length of the @bps_wr_max
-#   burst period, in seconds. It must only
-#   be set if @bps_wr_max is set as well.
-#   Defaults to 1. (Since 2.6)
-#
-# @iops_max_length: maximum length of the @iops burst
-# period, in seconds. It must only
-# be set if @iops_max is set as well.
-# Defaults to 1. (Since 2.6)
-#
-# @iops_rd_max_length: maximum length of the @iops_rd_max
-#burst period, in seconds. It must only
-#be set if @iops_rd_max is set as well.
-#Defaults to 1. (Since 2.6)
-#
-# @iops_wr_max_length: maximum length of the @iops_wr_max
-#burst period, in seconds. It must only
-#be set if @iops_wr_max is set as well.
-#Defaults to 1. (Since 2.6)
-#
-# @iops_size: an I/O size in bytes (Since 1.7)
-#
 # @group: throttle group name (Since 2.4)
 #
 # Since: 1.1
 ##
 { 'struct': 'BlockIOThrottle',
-  'data': { '*device': 'str', '*id': 'str', 'bps': 'int', 'bps_rd': 'int',
-'bps_wr': 'int', 'iops': 'int', 'iops_rd': 'int', 'iops_wr': 'int',
-'*bps_max': 'int', '*bps_rd_max': 'int',
-'*bps_wr_max': 'int', '*iops_max': 'int',
-'*iops_rd_max': 'int', '*iops_wr_max': 'int',
-'*bps_max_length': 'int', '*bps_rd_max_length': 'int',
-'*bps_wr_max_length': 'int', '*iops_max_length': 'int',
-'*iops_rd_max_length': 'int', '*iops_wr_max_length': 'int',
-'*iops_size': 'int', '*group': 'str' } }
+  'base': 'IOThrottle',
+  'data': { '*device': 'str', '*group': 'str' } }

 ##
 # @block-stream:
diff --git a/qapi/iothrottle.json b/qapi/iothrottle.json
new file mode 100644
index 000..0f067c3
--- /dev/null
+++ b/qapi/iothrottle.json
@@ -0,0 +1,88 @@
+# -*- Mode: Python -*-
+
+##
+# == QAPI IOThrottle definitions
+##
+
+##
+# @IOThrottle:
+#
+# A set of parameters describing IO throttling
+#
+# @id: The name or QOM path of the guest device (since: 2.8)
+#
+# @bps: total throughput limit in bytes per second
+#
+# @bps_rd: read throughput limit in bytes per second
+#
+# @bps_wr: write throughput limit in bytes per second
+#
+# @iops: total I/O operations per second
+#
+#

Re: [Qemu-devel] [PULL 02/17] cpu_physical_memory_sync_dirty_bitmap: Fix alignment check

2017-08-07 Thread Alex Bennée


Dr. David Alan Gilbert  writes:

> * Peter Maydell (peter.mayd...@linaro.org) wrote:
>> On 1 August 2017 at 17:17, Paolo Bonzini  wrote:
>> > From: "Dr. David Alan Gilbert" 
>> >
>> > This code has an optimised, word aligned version, and a boring
>> > unaligned version.  Recently 084140bd498909 fixed a missing offset
>> > addition from the core of both versions.  However, the offset isn't
>> > necessarily aligned and thus the choice between the two versions
>> > needs fixing up to also include the offset.
>> >
>> > Symptom:
>> >   A few stuck unsent pages during migration; not normally noticed
>> > unless under very low bandwidth in which case the migration may get
>> > stuck never ending and never performing a 2nd sync; noticed by
>> > a hanging postcopy-test on a very heavily loaded system.
>> >
>> > Fixes: 084140bd498909
>> >
>> > Signed-off-by: Dr. David Alan Gilbert 
>> > Reported-by: Alex BenneÃ© 
>> > Tested-by: Alex BenneÃ© 
>> >
>> > --
>> > v2
>> >   Move 'page' inside the if (Comment from Paolo)
>> > Message-Id: <20170724165125.29887-1-dgilb...@redhat.com>
>> > Signed-off-by: Paolo Bonzini 
>>
>> Something somewhere along the line seems to have mangled the
>> unicode characters in Alex's name :-(
>> Also, Alex's email address is typoed, and what should be the
>> '---' marker has been written as '--' so the below-the-fold waffle
>> is still hanging around in the commit.
>>
>> This doesn't seem worth rejecting the pull request on the
>> eve of rc1 for, but it's still a bit sad :-(
>
> Hmm yes, I think some of that's mine - the missing 'n' is certainly
> my fault; and I may have got his two 'e's in the wrong order,
> but I don't know what changed the encoding, that seems right on:
>   https://lists.nongnu.org/archive/html/qemu-devel/2017-07/msg07417.html

Hmm also the acute accent should be on the second (not third) e ;-)

--
Alex Bennée
(breaking UTF-8 workflow since before it was cool)

Re: [Qemu-devel] [PATCH v6 1/4] tests: Add test-listen - a stress test for QEMU socket listen

2017-08-07 Thread Daniel P. Berrange

On Mon, Aug 07, 2017 at 11:21:25AM +0200, Knut Omang wrote:
> On Mon, 2017-08-07 at 09:45 +0100, Daniel P. Berrange wrote:
> > On Sat, Jul 29, 2017 at 11:18:15PM +0200, Knut Omang wrote:
> > > There's a potential race condition between multiple bind()'s
> > > attempting to bind to the same port, which occasionally
> > > allows more than one bind to succeed against the same port.
> > > 
> > > When a subsequent listen() call is made with the same socket
> > > only one will succeed.
> > > 
> > > The current QEMU code does however not take this situation into account
> > > and the listen will cause the code to break out and fail even
> > > when there are actually available ports to use.
> > > 
> > > This test exposes two subtests:
> > > 
> > > /socket/listen-serial
> > > /socket/listen-compete
> > > 
> > > The "compete" subtest creates a number of threads and have them all 
> > > trying to bind
> > > to the same port with a large enough offset input to
> > > allow all threads to get it's own port.
> > > The "serial" subtest just does the same, except in series in a
> > > single thread.
> > > 
> > > The serial version passes, probably in most versions of QEMU.
> > > 
> > > The parallel version exposes the problem in a relatively reliable way,
> > > eg. it fails a majority of times, but not with a 100% rate, occasional
> > > passes can be seen. Nevertheless this is quite good given that
> > > the bug was tricky to reproduce and has been left undetected for
> > > a while.
> > > 
> > > The problem seems to be present in all versions of QEMU.
> > > 
> > > The original failure scenario occurred with VNC port allocation
> > > in a traditional Xen based build, in different code
> > > but with similar functionality.
> > > 
> > > Reported-by: Bhavesh Davda 
> > > Signed-off-by: Knut Omang 
> > > Reviewed-by: Yuval Shaia 
> > > Reviewed-by: Bhavesh Davda 
> > > Reviewed-by: Girish Moodalbail 
> > > ---
> > >  tests/Makefile.include |   2 +-
> > >  tests/test-listen.c| 253 ++-
> > >  2 files changed, 255 insertions(+)
> > >  create mode 100644 tests/test-listen.c
> > > 
> > > diff --git a/tests/Makefile.include b/tests/Makefile.include
> > > index 7af278d..b37c0c8 100644
> > > --- a/tests/Makefile.include
> > > +++ b/tests/Makefile.include
> > > @@ -128,6 +128,7 @@ check-unit-y += tests/test-bufferiszero$(EXESUF)
> > >  gcov-files-check-bufferiszero-y = util/bufferiszero.c
> > >  check-unit-y += tests/test-uuid$(EXESUF)
> > >  check-unit-y += tests/ptimer-test$(EXESUF)
> > > +#check-unit-y += tests/test-listen$(EXESUF)
> > >  gcov-files-ptimer-test-y = hw/core/ptimer.c
> > >  check-unit-y += tests/test-qapi-util$(EXESUF)
> > >  gcov-files-test-qapi-util-y = qapi/qapi-util.c
> > > @@ -769,6 +770,7 @@ tests/test-arm-mptimer$(EXESUF): 
> > > tests/test-arm-mptimer.o
> > >  tests/test-qapi-util$(EXESUF): tests/test-qapi-util.o $(test-util-obj-y)
> > >  tests/numa-test$(EXESUF): tests/numa-test.o
> > >  tests/vmgenid-test$(EXESUF): tests/vmgenid-test.o tests/boot-sector.o 
> > > tests/acpi-utils.o
> > > +tests/test-listen$(EXESUF): tests/test-listen.o $(test-util-obj-y)
> > >  
> > >  tests/migration/stress$(EXESUF): tests/migration/stress.o
> > >   $(call quiet-command, $(LINKPROG) -static -O3 $(PTHREAD_LIB) -o $@ $< 
> > > ,"LINK","$(TARGET_DIR)$@")
> > > diff --git a/tests/test-listen.c b/tests/test-listen.c
> > > new file mode 100644
> > > index 000..5c07537
> > > --- /dev/null
> > > +++ b/tests/test-listen.c
> > > @@ -0,0 +1,253 @@
> > > +/*
> > > + * Copyright (c) 2017, Oracle and/or its affiliates. All rights reserved.
> > > + *Author: Knut Omang 
> > > + *
> > > + * This program is free software; you can redistribute it and/or modify
> > > + * it under the terms of the GNU General Public License version 2
> > > + * as published by the Free Software Foundation.
> > 
> > Can you change that to "version 2 or later" - per the LICENSE file, we don't
> > accept contributions under "version 2 only" except for 4 specific subdirs:
> > 
> > 
> >   "As of July 2013, contributions under version 2 of the GNU General Public
> >    License (and no later version) are only accepted for the following files
> >    or directories: bsd-user/, linux-user/, hw/vfio/, hw/xen/xen_pt*."
> 
> Oh, sorry - I wasn't aware of this, +"...or later" is fine with me.
> Would you like me to send a v7 of the set with only that change, or can you 
> amend 
> it as part of the merge?

Since its a copyright statement, I'd prefer if you just sent a v7. I've no
other comments, so will queue it for merge once you resend, and send a pull
request once 2.11 opens up.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH v4 5/9] s390x/ccw: create s390 phb conditionally

2017-08-07 Thread David Hildenbrand

On 07.08.2017 12:00, Cornelia Huck wrote:
> On Fri, 4 Aug 2017 15:20:26 +0200
> David Hildenbrand  wrote:
> 
>> On 04.08.2017 13:29, Cornelia Huck wrote:
>>> Don't create the s390 pci host bridge if we do not provide the zpci
>>> facility.
>>>
>>> Reviewed-by: Thomas Huth 
>>> Acked-by: Christian Borntraeger 
>>> Signed-off-by: Cornelia Huck   
>>
>> This works, because s390_init_cpus(machine) (which sets up features) is
>> called before adding the bus. Mind to add a comment so nobody tries to
>> reshuffle these functions?
> 
> This is already needed by the flic setup (which also relies on the
> features being set up). We really want the cpu setup as early as
> possible.
> 
> Do you have a good comment handy? :)
> 

/* init CPUs (incl. CPU model) early so s390_has_feature() works */

-- 

Thanks,

David

Re: [Qemu-devel] [PATCH v6 1/4] tests: Add test-listen - a stress test for QEMU socket listen

2017-08-07 Thread Knut Omang

On Mon, 2017-08-07 at 11:27 +0100, Daniel P. Berrange wrote:
> On Mon, Aug 07, 2017 at 11:21:25AM +0200, Knut Omang wrote:
> > On Mon, 2017-08-07 at 09:45 +0100, Daniel P. Berrange wrote:
> > > On Sat, Jul 29, 2017 at 11:18:15PM +0200, Knut Omang wrote:
> > > > There's a potential race condition between multiple bind()'s
> > > > attempting to bind to the same port, which occasionally
> > > > allows more than one bind to succeed against the same port.
> > > > 
> > > > When a subsequent listen() call is made with the same socket
> > > > only one will succeed.
> > > > 
> > > > The current QEMU code does however not take this situation into account
> > > > and the listen will cause the code to break out and fail even
> > > > when there are actually available ports to use.
> > > > 
> > > > This test exposes two subtests:
> > > > 
> > > > /socket/listen-serial
> > > > /socket/listen-compete
> > > > 
> > > > The "compete" subtest creates a number of threads and have them all 
> > > > trying to bind
> > > > to the same port with a large enough offset input to
> > > > allow all threads to get it's own port.
> > > > The "serial" subtest just does the same, except in series in a
> > > > single thread.
> > > > 
> > > > The serial version passes, probably in most versions of QEMU.
> > > > 
> > > > The parallel version exposes the problem in a relatively reliable way,
> > > > eg. it fails a majority of times, but not with a 100% rate, occasional
> > > > passes can be seen. Nevertheless this is quite good given that
> > > > the bug was tricky to reproduce and has been left undetected for
> > > > a while.
> > > > 
> > > > The problem seems to be present in all versions of QEMU.
> > > > 
> > > > The original failure scenario occurred with VNC port allocation
> > > > in a traditional Xen based build, in different code
> > > > but with similar functionality.
> > > > 
> > > > Reported-by: Bhavesh Davda 
> > > > Signed-off-by: Knut Omang 
> > > > Reviewed-by: Yuval Shaia 
> > > > Reviewed-by: Bhavesh Davda 
> > > > Reviewed-by: Girish Moodalbail 
> > > > ---
> > > >  tests/Makefile.include |   2 +-
> > > >  tests/test-listen.c| 253 
> > > > ++-
> > > >  2 files changed, 255 insertions(+)
> > > >  create mode 100644 tests/test-listen.c
> > > > 
> > > > diff --git a/tests/Makefile.include b/tests/Makefile.include
> > > > index 7af278d..b37c0c8 100644
> > > > --- a/tests/Makefile.include
> > > > +++ b/tests/Makefile.include
> > > > @@ -128,6 +128,7 @@ check-unit-y += tests/test-bufferiszero$(EXESUF)
> > > >  gcov-files-check-bufferiszero-y = util/bufferiszero.c
> > > >  check-unit-y += tests/test-uuid$(EXESUF)
> > > >  check-unit-y += tests/ptimer-test$(EXESUF)
> > > > +#check-unit-y += tests/test-listen$(EXESUF)
> > > >  gcov-files-ptimer-test-y = hw/core/ptimer.c
> > > >  check-unit-y += tests/test-qapi-util$(EXESUF)
> > > >  gcov-files-test-qapi-util-y = qapi/qapi-util.c
> > > > @@ -769,6 +770,7 @@ tests/test-arm-mptimer$(EXESUF): 
> > > > tests/test-arm-mptimer.o
> > > >  tests/test-qapi-util$(EXESUF): tests/test-qapi-util.o 
> > > > $(test-util-obj-y)
> > > >  tests/numa-test$(EXESUF): tests/numa-test.o
> > > >  tests/vmgenid-test$(EXESUF): tests/vmgenid-test.o tests/boot-sector.o 
> > > > tests/acpi-utils.o
> > > > +tests/test-listen$(EXESUF): tests/test-listen.o $(test-util-obj-y)
> > > >  
> > > >  tests/migration/stress$(EXESUF): tests/migration/stress.o
> > > >     $(call quiet-command, $(LINKPROG) -static -O3 $(PTHREAD_LIB) -o 
> > > > $@ $< ,"LINK","$(TARGET_DIR)$@")
> > > > diff --git a/tests/test-listen.c b/tests/test-listen.c
> > > > new file mode 100644
> > > > index 000..5c07537
> > > > --- /dev/null
> > > > +++ b/tests/test-listen.c
> > > > @@ -0,0 +1,253 @@
> > > > +/*
> > > > + * Copyright (c) 2017, Oracle and/or its affiliates. All rights 
> > > > reserved.
> > > > + *Author: Knut Omang 
> > > > + *
> > > > + * This program is free software; you can redistribute it and/or modify
> > > > + * it under the terms of the GNU General Public License version 2
> > > > + * as published by the Free Software Foundation.
> > > 
> > > Can you change that to "version 2 or later" - per the LICENSE file, we 
> > > don't
> > > accept contributions under "version 2 only" except for 4 specific subdirs:
> > > 
> > > 
> > >   "As of July 2013, contributions under version 2 of the GNU General 
> > > Public
> > >    License (and no later version) are only accepted for the following 
> > > files
> > >    or directories: bsd-user/, linux-user/, hw/vfio/, hw/xen/xen_pt*."
> > 
> > Oh, sorry - I wasn't aware of this, +"...or later" is fine with me.
> > Would you like me to send a v7 of the set with only that change, or can you 
> > amend 
> > it as part of the merge?
> 
> Since its a copyright statement, I'd prefer if you just sent a v7. I've no
> other comments, so will queue it for merge once you resend, and send a pull
> request once 2.11 opens up.

Ok, will do,

Thanks,
Knut

Re: [Qemu-devel] [Qemu-block] [PATCH 1/2] block/vhdx: check error return of bdrv_getlength()

2017-08-07 Thread Kevin Wolf

Am 07.08.2017 um 05:08 hat Jeff Cody geschrieben:
> Calls to bdrv_getlength() were not checking for error.  In vhdx.c, this
> can lead to truncating an image file, so it is a definite bug.  In
> vhdx-log.c, the path for improper behavior is less clear, but it is best
> to check in any case.
> 
> Reported-by: Markus Armbruster 
> Signed-off-by: Jeff Cody 
> ---
>  block/vhdx-log.c | 20 
>  block/vhdx.c |  9 -
>  2 files changed, 24 insertions(+), 5 deletions(-)
> 
> diff --git a/block/vhdx-log.c b/block/vhdx-log.c
> index 01278f3..fd4e7af 100644
> --- a/block/vhdx-log.c
> +++ b/block/vhdx-log.c
> @@ -491,6 +491,7 @@ static int vhdx_log_flush(BlockDriverState *bs, 
> BDRVVHDXState *s,
>  uint32_t cnt, sectors_read;
>  uint64_t new_file_size;
>  void *data = NULL;
> +int64_t file_length;
>  VHDXLogDescEntries *desc_entries = NULL;
>  VHDXLogEntryHeader hdr_tmp = { 0 };
>  
> @@ -510,10 +511,15 @@ static int vhdx_log_flush(BlockDriverState *bs, 
> BDRVVHDXState *s,
>  if (ret < 0) {
>  goto exit;
>  }
> +file_length = bdrv_getlength(bs->file->bs);
> +if (file_length < 0) {
> +ret = file_length;
> +goto exit;
> +}
>  /* if the log shows a FlushedFileOffset larger than our current file
>   * size, then that means the file has been truncated / corrupted, and
>   * we must refused to open it / use it */
> -if (hdr_tmp.flushed_file_offset > bdrv_getlength(bs->file->bs)) {
> +if (hdr_tmp.flushed_file_offset > file_length) {
>  ret = -EINVAL;
>  goto exit;
>  }
> @@ -543,7 +549,7 @@ static int vhdx_log_flush(BlockDriverState *bs, 
> BDRVVHDXState *s,
>  goto exit;
>  }
>  }
> -if (bdrv_getlength(bs->file->bs) < 
> desc_entries->hdr.last_file_offset) {
> +if (file_length < desc_entries->hdr.last_file_offset) {
>  new_file_size = desc_entries->hdr.last_file_offset;
>  if (new_file_size % (1024*1024)) {
>  /* round up to nearest 1MB boundary */

The vhdx_log_flush() part looks good, but it made me notice a
bdrv_flush() in the same function where the return value isn't checked.
I'm almost sure that this is a bug.

> @@ -851,6 +857,7 @@ static int vhdx_log_write(BlockDriverState *bs, 
> BDRVVHDXState *s,
>  uint32_t partial_sectors = 0;
>  uint32_t bytes_written = 0;
>  uint64_t file_offset;
> +int64_t file_length;
>  VHDXHeader *header;
>  VHDXLogEntryHeader new_hdr;
>  VHDXLogDescriptor *new_desc = NULL;
> @@ -913,10 +920,15 @@ static int vhdx_log_write(BlockDriverState *bs, 
> BDRVVHDXState *s,
>  .sequence_number = s->log.sequence,
>  .descriptor_count= sectors,
>  .reserved= 0,
> -.flushed_file_offset = bdrv_getlength(bs->file->bs),
> -.last_file_offset= bdrv_getlength(bs->file->bs),
>};
>  
> +file_length = bdrv_getlength(bs->file->bs);
> +if (file_length < 0) {
> +ret = file_length;
> +goto exit;
> +}
> +new_hdr.flushed_file_offset = file_length;
> +new_hdr.last_file_offset= file_length;
>  new_hdr.log_guid = header->log_guid;

If you move the bdrv_getlength() above the initialisation of new_hdr,
you could keep these fields in the designated initialiser, which should
be better for readability.

I also don't know why .log_guid isn't part if it, could be moved, too.

>  desc_sectors = vhdx_compute_desc_sectors(new_hdr.descriptor_count);
> diff --git a/block/vhdx.c b/block/vhdx.c
> index a9cecd2..6a14999 100644
> --- a/block/vhdx.c
> +++ b/block/vhdx.c
> @@ -1166,7 +1166,14 @@ exit:
>  static int vhdx_allocate_block(BlockDriverState *bs, BDRVVHDXState *s,
>  uint64_t *new_offset)
>  {
> -*new_offset = bdrv_getlength(bs->file->bs);
> +int64_t current_len;
> +current_len = bdrv_getlength(bs->file->bs);
> +
> +if (current_len < 0) {
> +return current_len;
> +}

Don't you want the empty line to be between declaration and code rather
than assignment and check?

> +*new_offset = current_len;
>  
>  /* per the spec, the address for a block is in units of 1MB */
>  *new_offset = ROUND_UP(*new_offset, 1024 * 1024);

So the code looks correct, but we could make it a little nicer in a v2.

Kevin

Re: [Qemu-devel] [Qemu-block] [PATCH 2/2] block/vhdx: check for offset overflow to bdrv_truncate()

2017-08-07 Thread Kevin Wolf

Am 07.08.2017 um 05:08 hat Jeff Cody geschrieben:
> VHDX uses uint64_t types for most offsets, following the VHDX spec.
> However, bdrv_truncate() takes an int64_t value for the truncating
> offset.  Check for overflow before calling bdrv_truncate().
> 
> N.B.: For a compliant image this is not an issue, as the maximum VHDX
> image size is defined per the spec to be 64TB.
> 
> Signed-off-by: Jeff Cody 

Reviewed-by: Kevin Wolf

Re: [Qemu-devel] [Qemu-block] [PATCH 0/2] VHDX cleanup

2017-08-07 Thread Kevin Wolf

Am 07.08.2017 um 05:08 hat Jeff Cody geschrieben:
> Two VHDX items cleaned up:
> 
> 1. Check for error when calling bdrv_getlength() [Markus]
> 
> 2. Check for overflow in offset prior to calling bdrv_truncate().

This doesn't look like cleanups to me, but like proper fixes. I think it
would still qualify for 2.10.

Kevin

[Qemu-devel] [PATCH v7 0/4] Unit test+fix for problem with QEMU handling of multiple bind()s to the same port

2017-08-07 Thread Knut Omang

This series contains:
* a unit test that exposes a race condition which causes QEMU to fail
  to find a port even when there is plenty of available ports.
* a refactor of the qemu-sockets inet_listen_saddr() function
  to better handle this situation.

Changes from v6:
* Changed license of the (new) test-listen.c source file from
  GNU v2 to GNU v2 and later, according to QEMU standards.

Changes from v5:
* Also move setting of error from socket creation out
  into the main double for loop.
* Simplify and enhance reporting if socket creation fails:
  Only report failure to create sockets if none of the
  provided addrinfo list elements allows creation of a
  socket, otherwise report failure to find an available port.
* Further simplify if's within the for loops and make sure
  failure to recreate a socket gets distinctively reported.
* Rebased to current master.

Changes from v4:
* Move the complexity of recreating a socket and setting the error pointer
  into the main for loop, eliminating the try_bind_listen() function
  again. Cleaning up and improving error handling in the process.

Changes from v3:
* Test changes: Add missing license
  Add subtests for ipv4, ipv6 and both
  Various g_* usage improvements
* Split patch into 3 patches with two refactoring patches ahead
  of the actual fix.

Changes from v2:
* Non-trivial rebase + further abstraction
  on top of 7ad9af343c7f1c70c8015c7c519c312d8c5f9fa1
  'tests: add functional test validating ipv4/ipv6 address flag handling'

Changes from v1:
* Fix potential uninitialized variable only detected by optimize.
* Improve unexpected error detection in test-listen to give more
  details about why the test fails unexpectedly.
* Fix some line length style issues.

Thanks,
Knut

Knut Omang (4):
  tests: Add test-listen - a stress test for QEMU socket listen
  sockets: factor out a new try_bind() function
  sockets: factor out create_fast_reuse_socket
  sockets: Handle race condition between binds to the same port

 tests/Makefile.include |   2 +-
 tests/test-listen.c| 253 ++-
 util/qemu-sockets.c| 138 +++
 3 files changed, 345 insertions(+), 48 deletions(-)
 create mode 100644 tests/test-listen.c

base-commit: a588c4985eff363154d65aee8607d0a4601655f7
-- 
git-series 0.9.1

[Qemu-devel] [PATCH v7 3/4] sockets: factor out create_fast_reuse_socket

2017-08-07 Thread Knut Omang

Another refactoring step to prepare for fixing the problem
exposed with the test-listen test in the previous commit

Signed-off-by: Knut Omang 
Reviewed-by: Daniel P. Berrange 
---
 util/qemu-sockets.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c
index b4a2f24..d0d2047 100644
--- a/util/qemu-sockets.c
+++ b/util/qemu-sockets.c
@@ -149,6 +149,16 @@ int inet_ai_family_from_address(InetSocketAddress *addr,
 return PF_UNSPEC;
 }
 
+static int create_fast_reuse_socket(struct addrinfo *e)
+{
+int slisten = qemu_socket(e->ai_family, e->ai_socktype, e->ai_protocol);
+if (slisten < 0) {
+return -1;
+}
+socket_set_fast_reuse(slisten);
+return slisten;
+}
+
 static int try_bind(int socket, InetSocketAddress *saddr, struct addrinfo *e)
 {
 #ifndef IPV6_V6ONLY
@@ -253,7 +263,8 @@ static int inet_listen_saddr(InetSocketAddress *saddr,
 getnameinfo((struct sockaddr*)e->ai_addr,e->ai_addrlen,
uaddr,INET6_ADDRSTRLEN,uport,32,
NI_NUMERICHOST | NI_NUMERICSERV);
-slisten = qemu_socket(e->ai_family, e->ai_socktype, e->ai_protocol);
+
+slisten = create_fast_reuse_socket(e);
 if (slisten < 0) {
 if (!e->ai_next) {
 error_setg_errno(errp, errno, "Failed to create socket");
@@ -261,8 +272,6 @@ static int inet_listen_saddr(InetSocketAddress *saddr,
 continue;
 }
 
-socket_set_fast_reuse(slisten);
-
 port_min = inet_getport(e);
 port_max = saddr->has_to ? saddr->to + port_offset : port_min;
 for (p = port_min; p <= port_max; p++) {
-- 
git-series 0.9.1

[Qemu-devel] [PATCH v7 1/4] tests: Add test-listen - a stress test for QEMU socket listen

2017-08-07 Thread Knut Omang

There's a potential race condition between multiple bind()'s
attempting to bind to the same port, which occasionally
allows more than one bind to succeed against the same port.

When a subsequent listen() call is made with the same socket
only one will succeed.

The current QEMU code does however not take this situation into account
and the listen will cause the code to break out and fail even
when there are actually available ports to use.

This test exposes two subtests:

/socket/listen-serial
/socket/listen-compete

The "compete" subtest creates a number of threads and have them all trying to 
bind
to the same port with a large enough offset input to
allow all threads to get it's own port.
The "serial" subtest just does the same, except in series in a
single thread.

The serial version passes, probably in most versions of QEMU.

The parallel version exposes the problem in a relatively reliable way,
eg. it fails a majority of times, but not with a 100% rate, occasional
passes can be seen. Nevertheless this is quite good given that
the bug was tricky to reproduce and has been left undetected for
a while.

The problem seems to be present in all versions of QEMU.

The original failure scenario occurred with VNC port allocation
in a traditional Xen based build, in different code
but with similar functionality.

Reported-by: Bhavesh Davda 
Signed-off-by: Knut Omang 
Reviewed-by: Yuval Shaia 
Reviewed-by: Bhavesh Davda 
Reviewed-by: Girish Moodalbail 
---
 tests/Makefile.include |   2 +-
 tests/test-listen.c| 253 ++-
 2 files changed, 255 insertions(+)
 create mode 100644 tests/test-listen.c

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 7af278d..b37c0c8 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -128,6 +128,7 @@ check-unit-y += tests/test-bufferiszero$(EXESUF)
 gcov-files-check-bufferiszero-y = util/bufferiszero.c
 check-unit-y += tests/test-uuid$(EXESUF)
 check-unit-y += tests/ptimer-test$(EXESUF)
+#check-unit-y += tests/test-listen$(EXESUF)
 gcov-files-ptimer-test-y = hw/core/ptimer.c
 check-unit-y += tests/test-qapi-util$(EXESUF)
 gcov-files-test-qapi-util-y = qapi/qapi-util.c
@@ -769,6 +770,7 @@ tests/test-arm-mptimer$(EXESUF): tests/test-arm-mptimer.o
 tests/test-qapi-util$(EXESUF): tests/test-qapi-util.o $(test-util-obj-y)
 tests/numa-test$(EXESUF): tests/numa-test.o
 tests/vmgenid-test$(EXESUF): tests/vmgenid-test.o tests/boot-sector.o 
tests/acpi-utils.o
+tests/test-listen$(EXESUF): tests/test-listen.o $(test-util-obj-y)
 
 tests/migration/stress$(EXESUF): tests/migration/stress.o
$(call quiet-command, $(LINKPROG) -static -O3 $(PTHREAD_LIB) -o $@ $< 
,"LINK","$(TARGET_DIR)$@")
diff --git a/tests/test-listen.c b/tests/test-listen.c
new file mode 100644
index 000..03c4c8f
--- /dev/null
+++ b/tests/test-listen.c
@@ -0,0 +1,253 @@
+/*
+ * Copyright (c) 2017, Oracle and/or its affiliates. All rights reserved.
+ *Author: Knut Omang 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 or later
+ * as published by the Free Software Foundation.
+ *
+ * Test parallel port listen configuration with
+ * dynamic port allocation
+ */
+
+#include "qemu/osdep.h"
+#include "libqtest.h"
+#include "qemu-common.h"
+#include "qemu/thread.h"
+#include "qemu/sockets.h"
+#include "qapi/error.h"
+
+#define NAME_LEN 1024
+#define PORT_LEN 16
+
+struct thr_info {
+QemuThread thread;
+int to_port;
+bool ipv4;
+bool ipv6;
+int got_port;
+int eno;
+int fd;
+const char *errstr;
+char hostname[NAME_LEN + 1];
+char port[PORT_LEN + 1];
+};
+
+
+/* These two functions taken from test-io-channel-socket.c */
+static int check_bind(const char *hostname, bool *has_proto)
+{
+int fd = -1;
+struct addrinfo ai, *res = NULL;
+int rc;
+int ret = -1;
+
+memset(&ai, 0, sizeof(ai));
+ai.ai_flags = AI_CANONNAME | AI_ADDRCONFIG;
+ai.ai_family = AF_UNSPEC;
+ai.ai_socktype = SOCK_STREAM;
+
+/* lookup */
+rc = getaddrinfo(hostname, NULL, &ai, &res);
+if (rc != 0) {
+if (rc == EAI_ADDRFAMILY ||
+rc == EAI_FAMILY) {
+*has_proto = false;
+goto done;
+}
+goto cleanup;
+}
+
+fd = qemu_socket(res->ai_family, res->ai_socktype, res->ai_protocol);
+if (fd < 0) {
+goto cleanup;
+}
+
+if (bind(fd, res->ai_addr, res->ai_addrlen) < 0) {
+if (errno == EADDRNOTAVAIL) {
+*has_proto = false;
+goto done;
+}
+goto cleanup;
+}
+
+*has_proto = true;
+ done:
+ret = 0;
+
+ cleanup:
+if (fd != -1) {
+close(fd);
+}
+if (res) {
+freeaddrinfo(res);
+}
+return ret;
+}
+
+static int check_protocol_support(bool *has_ipv4, bool *has_ipv6)
+{
+if (check_bind("127.0.0.1", has_ipv4) < 0) {
+return -1;
+}
+if

[Qemu-devel] [PATCH v7 4/4] sockets: Handle race condition between binds to the same port

2017-08-07 Thread Knut Omang

If an offset of ports is specified to the inet_listen_saddr function(),
and two or more processes tries to bind from these ports at the same time,
occasionally more than one process may be able to bind to the same
port. The condition is detected by listen() but too late to avoid a failure.

This function is called by socket_listen() and used
by all socket listening code in QEMU, so all cases where any form of dynamic
port selection is used should be subject to this issue.

Add code to close and re-establish the socket when this
condition is observed, hiding the race condition from the user.

Also clean up some issues with error handling to allow more
accurate reporting of the cause of an error.

This has been developed and tested by means of the
test-listen unit test in the previous commit.
Enable the test for make check now that it passes.

Reviewed-by: Bhavesh Davda 
Reviewed-by: Yuval Shaia 
Reviewed-by: Girish Moodalbail 
Signed-off-by: Knut Omang 
Reviewed-by: Daniel P. Berrange 
---
 tests/Makefile.include |  2 +-
 util/qemu-sockets.c| 58 ++-
 2 files changed, 42 insertions(+), 18 deletions(-)

diff --git a/tests/Makefile.include b/tests/Makefile.include
index b37c0c8..9d2131d 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -128,7 +128,7 @@ check-unit-y += tests/test-bufferiszero$(EXESUF)
 gcov-files-check-bufferiszero-y = util/bufferiszero.c
 check-unit-y += tests/test-uuid$(EXESUF)
 check-unit-y += tests/ptimer-test$(EXESUF)
-#check-unit-y += tests/test-listen$(EXESUF)
+check-unit-y += tests/test-listen$(EXESUF)
 gcov-files-ptimer-test-y = hw/core/ptimer.c
 check-unit-y += tests/test-qapi-util$(EXESUF)
 gcov-files-test-qapi-util-y = qapi/qapi-util.c
diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c
index d0d2047..6a511fb 100644
--- a/util/qemu-sockets.c
+++ b/util/qemu-sockets.c
@@ -206,7 +206,10 @@ static int inet_listen_saddr(InetSocketAddress *saddr,
 char port[33];
 char uaddr[INET6_ADDRSTRLEN+1];
 char uport[33];
-int slisten, rc, port_min, port_max, p;
+int rc, port_min, port_max, p;
+int slisten = 0;
+int saved_errno = 0;
+bool socket_created = false;
 Error *err = NULL;
 
 memset(&ai,0, sizeof(ai));
@@ -258,7 +261,7 @@ static int inet_listen_saddr(InetSocketAddress *saddr,
 return -1;
 }
 
-/* create socket + bind */
+/* create socket + bind/listen */
 for (e = res; e != NULL; e = e->ai_next) {
 getnameinfo((struct sockaddr*)e->ai_addr,e->ai_addrlen,
uaddr,INET6_ADDRSTRLEN,uport,32,
@@ -266,37 +269,58 @@ static int inet_listen_saddr(InetSocketAddress *saddr,
 
 slisten = create_fast_reuse_socket(e);
 if (slisten < 0) {
-if (!e->ai_next) {
-error_setg_errno(errp, errno, "Failed to create socket");
-}
 continue;
 }
 
+socket_created = true;
 port_min = inet_getport(e);
 port_max = saddr->has_to ? saddr->to + port_offset : port_min;
 for (p = port_min; p <= port_max; p++) {
 inet_setport(e, p);
-if (try_bind(slisten, saddr, e) >= 0) {
-goto listen;
-}
-if (p == port_max) {
-if (!e->ai_next) {
+rc = try_bind(slisten, saddr, e);
+if (rc) {
+if (errno == EADDRINUSE) {
+continue;
+} else {
 error_setg_errno(errp, errno, "Failed to bind socket");
+goto listen_failed;
 }
 }
+if (!listen(slisten, 1)) {
+goto listen_ok;
+}
+if (errno != EADDRINUSE) {
+error_setg_errno(errp, errno, "Failed to listen on socket");
+goto listen_failed;
+}
+/* Someone else managed to bind to the same port and beat us
+ * to listen on it! Socket semantics does not allow us to
+ * recover from this situation, so we need to recreate the
+ * socket to allow bind attempts for subsequent ports:
+ */
+closesocket(slisten);
+slisten = create_fast_reuse_socket(e);
+if (slisten < 0) {
+error_setg_errno(errp, errno,
+ "Failed to recreate failed listening socket");
+goto listen_failed;
+}
 }
+}
+error_setg_errno(errp, errno,
+ socket_created ?
+ "Failed to find an available port" :
+ "Failed to create a socket");
+listen_failed:
+saved_errno = errno;
+if (slisten >= 0) {
 closesocket(slisten);
 }
 freeaddrinfo(res);
+errno = saved_errno;
 return -1;
 
-listen:
-if (listen(slisten,1) != 0) {
-error_setg_errno(errp, errno, "Failed to listen on socket");
-closes

[Qemu-devel] [PATCH v7 2/4] sockets: factor out a new try_bind() function

2017-08-07 Thread Knut Omang

A refactoring step to prepare for the problem
exposed by the test-listen test in the previous commit.

Simplify and reorganize the IPv6 specific extra
measures and move it out of the for loop to increase
code readability. No semantic changes.

Signed-off-by: Knut Omang 
Reviewed-by: Daniel P. Berrange 
---
 util/qemu-sockets.c | 69 ++
 1 file changed, 39 insertions(+), 30 deletions(-)

diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c
index 1358c81..b4a2f24 100644
--- a/util/qemu-sockets.c
+++ b/util/qemu-sockets.c
@@ -149,6 +149,44 @@ int inet_ai_family_from_address(InetSocketAddress *addr,
 return PF_UNSPEC;
 }
 
+static int try_bind(int socket, InetSocketAddress *saddr, struct addrinfo *e)
+{
+#ifndef IPV6_V6ONLY
+return bind(socket, e->ai_addr, e->ai_addrlen);
+#else
+/*
+ * Deals with first & last cases in matrix in comment
+ * for inet_ai_family_from_address().
+ */
+int v6only =
+((!saddr->has_ipv4 && !saddr->has_ipv6) ||
+ (saddr->has_ipv4 && saddr->ipv4 &&
+  saddr->has_ipv6 && saddr->ipv6)) ? 0 : 1;
+int stat;
+
+ rebind:
+if (e->ai_family == PF_INET6) {
+qemu_setsockopt(socket, IPPROTO_IPV6, IPV6_V6ONLY, &v6only,
+sizeof(v6only));
+}
+
+stat = bind(socket, e->ai_addr, e->ai_addrlen);
+if (!stat) {
+return 0;
+}
+
+/* If we got EADDRINUSE from an IPv6 bind & v6only is unset,
+ * it could be that the IPv4 port is already claimed, so retry
+ * with v6only set
+ */
+if (e->ai_family == PF_INET6 && errno == EADDRINUSE && !v6only) {
+v6only = 1;
+goto rebind;
+}
+return stat;
+#endif
+}
+
 static int inet_listen_saddr(InetSocketAddress *saddr,
  int port_offset,
  bool update_addr,
@@ -228,39 +266,10 @@ static int inet_listen_saddr(InetSocketAddress *saddr,
 port_min = inet_getport(e);
 port_max = saddr->has_to ? saddr->to + port_offset : port_min;
 for (p = port_min; p <= port_max; p++) {
-#ifdef IPV6_V6ONLY
-/*
- * Deals with first & last cases in matrix in comment
- * for inet_ai_family_from_address().
- */
-int v6only =
-((!saddr->has_ipv4 && !saddr->has_ipv6) ||
- (saddr->has_ipv4 && saddr->ipv4 &&
-  saddr->has_ipv6 && saddr->ipv6)) ? 0 : 1;
-#endif
 inet_setport(e, p);
-#ifdef IPV6_V6ONLY
-rebind:
-if (e->ai_family == PF_INET6) {
-qemu_setsockopt(slisten, IPPROTO_IPV6, IPV6_V6ONLY, &v6only,
-sizeof(v6only));
-}
-#endif
-if (bind(slisten, e->ai_addr, e->ai_addrlen) == 0) {
+if (try_bind(slisten, saddr, e) >= 0) {
 goto listen;
 }
-
-#ifdef IPV6_V6ONLY
-/* If we got EADDRINUSE from an IPv6 bind & V6ONLY is unset,
- * it could be that the IPv4 port is already claimed, so retry
- * with V6ONLY set
- */
-if (e->ai_family == PF_INET6 && errno == EADDRINUSE && !v6only) {
-v6only = 1;
-goto rebind;
-}
-#endif
-
 if (p == port_max) {
 if (!e->ai_next) {
 error_setg_errno(errp, errno, "Failed to bind socket");
-- 
git-series 0.9.1

Re: [Qemu-devel] [PATCH v2 10/13] vvfat: correctly generate numeric-tail of short file names

2017-08-07 Thread Eric Blake

On 08/05/2017 01:52 PM, Pranith Kumar wrote:
> FYI,
> 
> This commit breaks the build with gcc-7:
> 
>   CC  block/vvfat.o
> qemu/block/vvfat.c: In function ‘read_directory’:
> qemu/block/vvfat.c:605:37: error: ‘__builtin___sprintf_chk’ may write
> a terminating nul past the end of the destination
> [-Werror=format-overflow=]
>  int len = sprintf(tail, "~%d", i);
>  ^

Doesn't commit 7c8730d45f6 fix that?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v4 5/9] s390x/ccw: create s390 phb conditionally

2017-08-07 Thread Cornelia Huck

On Mon, 7 Aug 2017 12:21:30 +0200
David Hildenbrand  wrote:

> On 07.08.2017 12:00, Cornelia Huck wrote:
> > On Fri, 4 Aug 2017 15:20:26 +0200
> > David Hildenbrand  wrote:
> >   
> >> On 04.08.2017 13:29, Cornelia Huck wrote:  
> >>> Don't create the s390 pci host bridge if we do not provide the zpci
> >>> facility.
> >>>
> >>> Reviewed-by: Thomas Huth 
> >>> Acked-by: Christian Borntraeger 
> >>> Signed-off-by: Cornelia Huck 
> >>
> >> This works, because s390_init_cpus(machine) (which sets up features) is
> >> called before adding the bus. Mind to add a comment so nobody tries to
> >> reshuffle these functions?  
> > 
> > This is already needed by the flic setup (which also relies on the
> > features being set up). We really want the cpu setup as early as
> > possible.
> > 
> > Do you have a good comment handy? :)
> >   
> 
> /* init CPUs (incl. CPU model) early so s390_has_feature() works */
> 

Thanks, I'll go with that.

[Qemu-devel] [PULL 0/1] Tracing patches

2017-08-07 Thread Stefan Hajnoczi

The following changes since commit ac44ed2afb7c60255e989b163301479f5b4ecd04:

  Merge remote-tracking branch 'remotes/ehabkost/tags/machine-pull-request' 
into staging (2017-08-04 13:46:22 +0100)

are available in the git repository at:

  git://github.com/stefanha/qemu.git tags/tracing-pull-request

for you to fetch changes up to f42cf447e2310e84e119b99f7f13c8dc7a6cf3d6:

  block: move trace probes into bdrv_co_preadv|pwritev (2017-08-07 09:39:35 
+0100)





Daniel P. Berrange (1):
  block: move trace probes into bdrv_co_preadv|pwritev

 block/io.c | 8 
 block/trace-events | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

-- 
2.13.3

Re: [Qemu-devel] [PATCH] kvm: workaround build break on gcc-7.1.1 / fedora26

2017-08-07 Thread Greg Kurz

On Thu, 3 Aug 2017 13:56:45 -0500
Eric Blake  wrote:

> On 08/03/2017 01:38 PM, Greg Kurz wrote:
> > The following code snippet spits a warning with gcc-7.1.1-3.fc26 and -Wall 
> > on
> > ppc64le AND x86_64:
> > 
> > int foo(void)
> > {
> > char empty_array[] = { };
> > int i, ret = 0;
> > 
> > for (i = 0; i < (int) (sizeof(empty_array) / sizeof(empty_array[0])); 
> > i++) {
> > ret = empty_array[i];
> > }
> > 
> > return ret;
> > }  
> 
> Confirmed.  No idea why the cast makes gcc think i is uninitialized.
> 

FYI, I've created a BZ to track this issue in gcc:

https://bugzilla.redhat.com/show_bug.cgi?id=1478864

> > 
> > If I drop the (int), the warning goes away... and so does the build break
> > of qemu-system-ppc64 on my ppc64le host.
> > 
> > I don't have 4.7.1 or 4.6 compilers around but I could check 4.8.5
> > is okay with that change FWIW.  
> 
> I tested with gcc 4.4.7 on RHEL 6, but -Wtype-limits there didn't warn
> whether or not the (int) was present; it's possible that we still need
> to test with 4.6 or 4.7.1 to see whether it makes a difference.
> 
> >   
> >> Ok so let's stop losing time about compiler incoherent warnings, using 
> >> -Wno-type-limits for GCC < 5...
> >> So we can keep a sane/understandable codebase, using size_t and no (int) 
> >> cast.  
> 
> That's also a possibility, if we still hit old compilers that require
> the (int).
> 
> >>  
> > 
> > If I also convert 'int i' to 'size_t i' then I get the same error as
> > in commit 61c7bbd2:
> > 
> > error: comparison of unsigned expression < 0 is always
> >  false [-Werror=type-limits]  
> 
> Confirmed with newer gcc, whether or not the (int) cast is present.
> 



pgp8R95ET0TQP.pgp
Description: OpenPGP digital signature

[Qemu-devel] [PULL 1/1] block: move trace probes into bdrv_co_preadv|pwritev

2017-08-07 Thread Stefan Hajnoczi

From: "Daniel P. Berrange" 

There are trace probes in bdrv_co_readv|writev, however, the
block drivers are being gradually moved over to using the
bdrv_co_preadv|pwritev functions instead. As a result some
block drivers miss the current probes. Move the probes
into bdrv_co_preadv|pwritev instead, so that they are triggered
by more (all?) I/O code paths.

Signed-off-by: Daniel P. Berrange 
Message-id: 20170804105036.11879-1-berra...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 block/io.c | 8 
 block/trace-events | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/block/io.c b/block/io.c
index d9dc822173..26003814eb 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1135,6 +1135,8 @@ int coroutine_fn bdrv_co_preadv(BdrvChild *child,
 bool use_local_qiov = false;
 int ret;
 
+trace_bdrv_co_preadv(child->bs, offset, bytes, flags);
+
 if (!drv) {
 return -ENOMEDIUM;
 }
@@ -1207,8 +1209,6 @@ static int coroutine_fn bdrv_co_do_readv(BdrvChild *child,
 int coroutine_fn bdrv_co_readv(BdrvChild *child, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov)
 {
-trace_bdrv_co_readv(child->bs, sector_num, nb_sectors);
-
 return bdrv_co_do_readv(child, sector_num, nb_sectors, qiov, 0);
 }
 
@@ -1526,6 +1526,8 @@ int coroutine_fn bdrv_co_pwritev(BdrvChild *child,
 bool use_local_qiov = false;
 int ret;
 
+trace_bdrv_co_pwritev(child->bs, offset, bytes, flags);
+
 if (!bs->drv) {
 return -ENOMEDIUM;
 }
@@ -1660,8 +1662,6 @@ static int coroutine_fn bdrv_co_do_writev(BdrvChild 
*child,
 int coroutine_fn bdrv_co_writev(BdrvChild *child, int64_t sector_num,
 int nb_sectors, QEMUIOVector *qiov)
 {
-trace_bdrv_co_writev(child->bs, sector_num, nb_sectors);
-
 return bdrv_co_do_writev(child, sector_num, nb_sectors, qiov, 0);
 }
 
diff --git a/block/trace-events b/block/trace-events
index 071a8d77ba..25dd5a3026 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -9,8 +9,8 @@ blk_co_preadv(void *blk, void *bs, int64_t offset, unsigned int 
bytes, int flags
 blk_co_pwritev(void *blk, void *bs, int64_t offset, unsigned int bytes, int 
flags) "blk %p bs %p offset %"PRId64" bytes %u flags 0x%x"
 
 # block/io.c
-bdrv_co_readv(void *bs, int64_t sector_num, int nb_sector) "bs %p sector_num 
%"PRId64" nb_sectors %d"
-bdrv_co_writev(void *bs, int64_t sector_num, int nb_sector) "bs %p sector_num 
%"PRId64" nb_sectors %d"
+bdrv_co_preadv(void *bs, int64_t offset, int64_t nbytes, unsigned int flags) 
"bs %p offset %"PRId64" nbytes %"PRId64" flags 0x%x"
+bdrv_co_pwritev(void *bs, int64_t offset, int64_t nbytes, unsigned int flags) 
"bs %p offset %"PRId64" nbytes %"PRId64" flags 0x%x"
 bdrv_co_pwrite_zeroes(void *bs, int64_t offset, int count, int flags) "bs %p 
offset %"PRId64" count %d flags 0x%x"
 bdrv_co_do_copy_on_readv(void *bs, int64_t offset, unsigned int bytes, int64_t 
cluster_offset, unsigned int cluster_bytes) "bs %p offset %"PRId64" bytes %u 
cluster_offset %"PRId64" cluster_bytes %u"
 
-- 
2.13.3

Re: [Qemu-devel] [PATCH 2/2] block/vhdx: check for offset overflow to bdrv_truncate()

2017-08-07 Thread Eric Blake

On 08/06/2017 10:08 PM, Jeff Cody wrote:
> VHDX uses uint64_t types for most offsets, following the VHDX spec.
> However, bdrv_truncate() takes an int64_t value for the truncating
> offset.  Check for overflow before calling bdrv_truncate().
> 
> N.B.: For a compliant image this is not an issue, as the maximum VHDX
> image size is defined per the spec to be 64TB.
> 
> Signed-off-by: Jeff Cody 
> ---
>  block/vhdx-log.c | 4 
>  block/vhdx.c | 3 +++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/block/vhdx-log.c b/block/vhdx-log.c
> index fd4e7af..3b74e5d 100644
> --- a/block/vhdx-log.c
> +++ b/block/vhdx-log.c
> @@ -554,6 +554,10 @@ static int vhdx_log_flush(BlockDriverState *bs, 
> BDRVVHDXState *s,
>  if (new_file_size % (1024*1024)) {
>  /* round up to nearest 1MB boundary */
>  new_file_size = ((new_file_size >> 20) + 1) << 20;

Since you're touching here, can you fix this to use QEMU_ALIGN_UP instead?

> +if (new_file_size > INT64_MAX) {
> +ret = -EINVAL;
> +goto exit;
> +}
>  bdrv_truncate(bs->file, new_file_size, PREALLOC_MODE_OFF, 
> NULL);

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 1/2] block/vhdx: check error return of bdrv_getlength()

2017-08-07 Thread Eric Blake

On 08/06/2017 10:08 PM, Jeff Cody wrote:
> Calls to bdrv_getlength() were not checking for error.  In vhdx.c, this
> can lead to truncating an image file, so it is a definite bug.  In
> vhdx-log.c, the path for improper behavior is less clear, but it is best
> to check in any case.
> 
> Reported-by: Markus Armbruster 
> Signed-off-by: Jeff Cody 
> ---
>  block/vhdx-log.c | 20 
>  block/vhdx.c |  9 -
>  2 files changed, 24 insertions(+), 5 deletions(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH for-2.10] quorum: Handle bdrv_getlength() failures in quorum_co_flush()

2017-08-07 Thread Eric Blake

On 08/07/2017 03:43 AM, Alberto Garcia wrote:
> On Fri 04 Aug 2017 05:44:00 PM CEST, Eric Blake wrote:
>>> --- a/block/quorum.c
>>> +++ b/block/quorum.c
>>> @@ -785,8 +785,9 @@ static coroutine_fn int 
>>> quorum_co_flush(BlockDriverState *bs)
>>>  for (i = 0; i < s->num_children; i++) {
>>>  result = bdrv_co_flush(s->children[i]->bs);
>>>  if (result) {
>>> +int64_t length = bdrv_getlength(s->children[i]->bs);
>>>  quorum_report_bad(QUORUM_OP_TYPE_FLUSH, 0,
>>> -  bdrv_getlength(s->children[i]->bs),
>>> +  length > 0 ? length : 0,
>>
>> In the fallback case, is always picking 0 good enough?  Then again,
>> this is in the error path, so it is unlikely in practice, and I don't
>> see any better way to handle it.
> 
> I don't see any better way to handle it either, and I'm not sure that it
> matters much: this is a flush error event, the 'sectors-count' field
> doesn't even mean anything, but we have to put something there.

If that's the case, can we ALWAYS report 0, instead of usually the
length and sometimes 0?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 01/17] nbd/client: fix nbd_opt_go

2017-08-07 Thread Eric Blake

On 08/04/2017 10:14 AM, Vladimir Sementsov-Ogievskiy wrote:
> Do not send NBD_OPT_ABORT to the broken server. After sending
> NBD_REP_ACK on NBD_OPT_GO server is most probably in transmission
> phase, when option sending is finished.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  nbd/client.c | 2 --
>  1 file changed, 2 deletions(-)

Reviewed-by: Eric Blake 

But buggy servers aren't a severe enough problem to warrant this being
in 2.10.  I'll save this for 2.11.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH v2] kvm: workaround build break on gcc-7.1.1 / fedora26

2017-08-07 Thread Greg Kurz

Building QEMU on fedora26 with the latest gcc package fails:

  CC  ppc64-softmmu/target/ppc/kvm.o
In file included from include/sysemu/hw_accel.h:16:0,
 from target/ppc/kvm.c:31:
target/ppc/kvm.c: In function ‘kvmppc_booke_watchdog_enable’:
include/sysemu/kvm.h:449:35: error: ‘args_tmp[i]’ may be used uninitialized
 in this function [-Werror=maybe-uninitialized]
 cap.args[i] = args_tmp[i];   \
   ^
target/ppc/kvm.c: In function ‘kvmppc_set_papr’:
include/sysemu/kvm.h:449:35: error: ‘args_tmp[i]’ may be used uninitialized
 in this function [-Werror=maybe-uninitialized]
cc1: all warnings being treated as errors

$ rpm -q gcc
gcc-7.1.1-3.fc26.ppc64le

The compiler should obviously optimize this code away when no extra
agument is passed to kvm_vm_enable_cap() and kvm_vcpu_enable_cap(),
but it doesn't. This bug should be fixed one day in gcc, but we can
also change our code pattern so that we don't hit the issue anymore.
We workaround this, by using memcpy() instead of open-coding the copy.

Signed-off-by: Greg Kurz 
---
v2: - use memcpy()
---
 include/sysemu/kvm.h |   14 --
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 91fc07ee9afe..3a458f50e9f4 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -428,11 +428,8 @@ int kvm_vm_check_extension(KVMState *s, unsigned int 
extension);
 .flags = cap_flags,  \
 };   \
 uint64_t args_tmp[] = { __VA_ARGS__ };   \
-int i;   \
-for (i = 0; i < (int)ARRAY_SIZE(args_tmp) && \
- i < ARRAY_SIZE(cap.args); i++) {\
-cap.args[i] = args_tmp[i];   \
-}\
+size_t n = MIN(ARRAY_SIZE(args_tmp), ARRAY_SIZE(cap.args));  \
+memcpy(cap.args, args_tmp, n * sizeof(cap.args[0])); \
 kvm_vm_ioctl(s, KVM_ENABLE_CAP, &cap);   \
 })
 
@@ -443,11 +440,8 @@ int kvm_vm_check_extension(KVMState *s, unsigned int 
extension);
 .flags = cap_flags,  \
 };   \
 uint64_t args_tmp[] = { __VA_ARGS__ };   \
-int i;   \
-for (i = 0; i < (int)ARRAY_SIZE(args_tmp) && \
- i < ARRAY_SIZE(cap.args); i++) {\
-cap.args[i] = args_tmp[i];   \
-}\
+size_t n = MIN(ARRAY_SIZE(args_tmp), ARRAY_SIZE(cap.args));  \
+memcpy(cap.args, args_tmp, n * sizeof(cap.args[0])); \
 kvm_vcpu_ioctl(cpu, KVM_ENABLE_CAP, &cap);   \
 })

Re: [Qemu-devel] [PATCH 02/17] nbd/client: refactor nbd_read_eof

2017-08-07 Thread Eric Blake

On 08/04/2017 10:14 AM, Vladimir Sementsov-Ogievskiy wrote:
> Refactor nbd_read_eof to return 1 on success, 0 on eof, when no
> data was read and <0 for other cases, because returned size of
> read data is not actually used.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  nbd/nbd-internal.h | 34 +-
>  nbd/client.c   |  5 -
>  tests/qemu-iotests/083.out |  4 ++--
>  3 files changed, 27 insertions(+), 16 deletions(-)
> 
> diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
> index 396ddb4d3e..3fb0b6098a 100644
> --- a/nbd/nbd-internal.h
> +++ b/nbd/nbd-internal.h
> @@ -77,21 +77,37 @@
>  #define NBD_ESHUTDOWN  108
>  
>  /* nbd_read_eof
> - * Tries to read @size bytes from @ioc. Returns number of bytes actually 
> read.
> - * May return a value >= 0 and < size only on EOF, i.e. when iteratively 
> called
> - * qio_channel_readv() returns 0. So, there is no need to call nbd_read_eof
> - * iteratively.
> + * Tries to read @size bytes from @ioc.
> + * Returns 1 on success
> + * 0 on eof, when no data was read (errp is not set)
> + * -EINVAL on eof, when some data < @size was read until eof
> + * < 0 on read fail

In general, mixing negative errno value and generic < 0 in the same
function is most likely ambiguous.

>   */
> -static inline ssize_t nbd_read_eof(QIOChannel *ioc, void *buffer, size_t 
> size,
> -   Error **errp)
> +static inline int nbd_read_eof(QIOChannel *ioc, void *buffer, size_t size,
> +   Error **errp)
>  {
>  struct iovec iov = { .iov_base = buffer, .iov_len = size };
> +ssize_t ret;
> +
>  /* Sockets are kept in blocking mode in the negotiation phase.  After
>   * that, a non-readable socket simply means that another thread stole
>   * our request/reply.  Synchronization is done with recv_coroutine, so
>   * that this is coroutine-safe.
>   */
> -return nbd_rwv(ioc, &iov, 1, size, true, errp);
> +
> +assert(size > 0);

Effectively the same as assert(size != 0).

> +
> +ret = nbd_rwv(ioc, &iov, 1, size, true, errp);
> +if (ret <= 0) {
> +return ret;
> +}

So this is a negative errno (or 0 on EOF),

> +
> +if (ret != size) {
> +error_setg(errp, "End of file");
> +return -EINVAL;

and so is this. Which makes the function documentation not quite
accurate; you aren't mixing a generic < 0.

> +}
> +
> +return 1;
>  }
>  
>  /* nbd_read
> @@ -100,9 +116,9 @@ static inline ssize_t nbd_read_eof(QIOChannel *ioc, void 
> *buffer, size_t size,
>  static inline int nbd_read(QIOChannel *ioc, void *buffer, size_t size,
> Error **errp)
>  {
> -ssize_t ret = nbd_read_eof(ioc, buffer, size, errp);
> +int ret = nbd_read_eof(ioc, buffer, size, errp);
>  
> -if (ret >= 0 && ret != size) {
> +if (ret == 0) {
>  ret = -EINVAL;
>  error_setg(errp, "End of file");

Why do we have to set errp here instead of in nbd_read_eof()? Is there
ever any case where hitting early EOF is not something that should be
treated as an error?

>  }
> diff --git a/nbd/client.c b/nbd/client.c
> index f1c16b588f..4556056daa 100644
> --- a/nbd/client.c
> +++ b/nbd/client.c
> @@ -925,11 +925,6 @@ ssize_t nbd_receive_reply(QIOChannel *ioc, NBDReply 
> *reply, Error **errp)
>  return ret;
>  }
>  
> -if (ret != sizeof(buf)) {
> -error_setg(errp, "read failed");
> -return -EINVAL;
> -}
> -
>  /* Reply
> [ 0 ..  3]magic   (NBD_REPLY_MAGIC)
> [ 4 ..  7]error   (0 == no error)
> diff --git a/tests/qemu-iotests/083.out b/tests/qemu-iotests/083.out
> index a24c6bfece..d3bea1b2f5 100644
> --- a/tests/qemu-iotests/083.out
> +++ b/tests/qemu-iotests/083.out
> @@ -69,12 +69,12 @@ read failed: Input/output error
>  
>  === Check disconnect 4 reply ===
>  
> -read failed
> +End of file
>  read failed: Input/output error

At least you tracked that your changes tweak the error message.  But I'm
not yet convinced whether this change simplifies anything.  Is there a
later patch that is easier to write with the new semantics which was not
possible with the pre-patch semantics?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH for-2.10] hw/arm/virt: Add 2.10 machine type

2017-08-07 Thread Eric Auger

Add virt-2.10 machine type.

Signed-off-by: Eric Auger 

---
---
 hw/arm/virt.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 31739d7..6b7a0fe 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1639,7 +1639,7 @@ static void machvirt_machine_init(void)
 }
 type_init(machvirt_machine_init);
 
-static void virt_2_9_instance_init(Object *obj)
+static void virt_2_10_instance_init(Object *obj)
 {
 VirtMachineState *vms = VIRT_MACHINE(obj);
 VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
@@ -1699,10 +1699,25 @@ static void virt_2_9_instance_init(Object *obj)
 vms->irqmap = a15irqmap;
 }
 
+static void virt_machine_2_10_options(MachineClass *mc)
+{
+}
+DEFINE_VIRT_MACHINE_AS_LATEST(2, 10)
+
+#define VIRT_COMPAT_2_9 \
+HW_COMPAT_2_9
+
+static void virt_2_9_instance_init(Object *obj)
+{
+virt_2_10_instance_init(obj);
+}
+
 static void virt_machine_2_9_options(MachineClass *mc)
 {
+virt_machine_2_10_options(mc);
+SET_MACHINE_COMPAT(mc, VIRT_COMPAT_2_9);
 }
-DEFINE_VIRT_MACHINE_AS_LATEST(2, 9)
+DEFINE_VIRT_MACHINE(2, 9)
 
 #define VIRT_COMPAT_2_8 \
 HW_COMPAT_2_8
-- 
1.9.1

Re: [Qemu-devel] [PATCH 04/17] nbd/client: fix nbd_send_request to return int

2017-08-07 Thread Eric Blake

On 08/07/2017 03:57 AM, Vladimir Sementsov-Ogievskiy wrote:
> 07.08.2017 11:23, Daniel P. Berrange wrote:
>> On Fri, Aug 04, 2017 at 06:14:27PM +0300, Vladimir Sementsov-Ogievskiy
>> wrote:
>>> Fix nbd_send_request to return int, as it returns a return value
>>> of nbd_write (which is int), and the only user of nbd_send_request's
>>> return value (nbd_co_send_request) consider it as int too.
>> The real problem here is the return value of nbd_write() - it should be
>> a ssize_t, not an int, since it is propagating the return value of
>> nbd_rwv() which is ssize_t.
> 
> It was changed in f5d406fe86bb (sent in May)
> The discussion actually was started half a year ago:
> https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg00825.html
> 
> 
>> Basically all functions that do I/O and
>> return the number of bytes read/written should be ssize_t - any use of
>> int is a bug.

If returning the size matters, then yes, not using ssize_t is a bug.
But if all we need is to know whether an entire expected length was read
(and treat ANY partial read the same as failure), then a mere int
becomes enough to encode the results.

Where I'm less certain is whether this change in semantics simplifies
later patches, or loses information that might have been useful.  But
intuitively, ANY encounter of EOF means that NBD needs to quit trying to
talk to the other end of the socket, whether or not the EOF occurred on
a nice message boundary; and the rest of the protocol is strongly tied
to messages, where we don't act until we have read the entire expected
message; so if a later patch is indeed simpler by not returning bytes
read, this patch might be okay.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 06/17] block/nbd-client: fix nbd_read_reply_entry

2017-08-07 Thread Eric Blake

On 08/04/2017 10:14 AM, Vladimir Sementsov-Ogievskiy wrote:
> Set reply.handle to 0 on error path to prevent normal path of
> nbd_co_receive_reply.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/nbd-client.c | 1 +
>  1 file changed, 1 insertion(+)

Can you document a case where not fixing this would be an observable bug
(even if it requires using gdb and single-stepping between client and
server to make what is otherwise a racy situation easy to see)?  I'm
trying to figure out if this is 2.10 material.

> 
> diff --git a/block/nbd-client.c b/block/nbd-client.c
> index dc19894a7c..0c88d84de6 100644
> --- a/block/nbd-client.c
> +++ b/block/nbd-client.c
> @@ -107,6 +107,7 @@ static coroutine_fn void nbd_read_reply_entry(void 
> *opaque)
>  qemu_coroutine_yield();
>  }
>  
> +s->reply.handle = 0;
>  nbd_recv_coroutines_enter_all(s);
>  s->read_reply_co = NULL;
>  }
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v2] kvm: workaround build break on gcc-7.1.1 / fedora26

2017-08-07 Thread Paolo Bonzini

On 07/08/2017 13:36, Greg Kurz wrote:
> Building QEMU on fedora26 with the latest gcc package fails:
> 
>   CC  ppc64-softmmu/target/ppc/kvm.o
> In file included from include/sysemu/hw_accel.h:16:0,
>  from target/ppc/kvm.c:31:
> target/ppc/kvm.c: In function ‘kvmppc_booke_watchdog_enable’:
> include/sysemu/kvm.h:449:35: error: ‘args_tmp[i]’ may be used uninitialized
>  in this function [-Werror=maybe-uninitialized]
>  cap.args[i] = args_tmp[i];   \
>^
> target/ppc/kvm.c: In function ‘kvmppc_set_papr’:
> include/sysemu/kvm.h:449:35: error: ‘args_tmp[i]’ may be used uninitialized
>  in this function [-Werror=maybe-uninitialized]
> cc1: all warnings being treated as errors
> 
> $ rpm -q gcc
> gcc-7.1.1-3.fc26.ppc64le
> 
> The compiler should obviously optimize this code away when no extra
> agument is passed to kvm_vm_enable_cap() and kvm_vcpu_enable_cap(),
> but it doesn't. This bug should be fixed one day in gcc, but we can
> also change our code pattern so that we don't hit the issue anymore.
> We workaround this, by using memcpy() instead of open-coding the copy.

Nice way to do it, thanks.  I'll queue it for 2.10.

Paolo

> Signed-off-by: Greg Kurz 
> ---
> v2: - use memcpy()
> ---
>  include/sysemu/kvm.h |   14 --
>  1 file changed, 4 insertions(+), 10 deletions(-)
> 
> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> index 91fc07ee9afe..3a458f50e9f4 100644
> --- a/include/sysemu/kvm.h
> +++ b/include/sysemu/kvm.h
> @@ -428,11 +428,8 @@ int kvm_vm_check_extension(KVMState *s, unsigned int 
> extension);
>  .flags = cap_flags,  \
>  };   \
>  uint64_t args_tmp[] = { __VA_ARGS__ };   \
> -int i;   \
> -for (i = 0; i < (int)ARRAY_SIZE(args_tmp) && \
> - i < ARRAY_SIZE(cap.args); i++) {\
> -cap.args[i] = args_tmp[i];   \
> -}\
> +size_t n = MIN(ARRAY_SIZE(args_tmp), ARRAY_SIZE(cap.args));  \
> +memcpy(cap.args, args_tmp, n * sizeof(cap.args[0])); \
>  kvm_vm_ioctl(s, KVM_ENABLE_CAP, &cap);   \
>  })
>  
> @@ -443,11 +440,8 @@ int kvm_vm_check_extension(KVMState *s, unsigned int 
> extension);
>  .flags = cap_flags,  \
>  };   \
>  uint64_t args_tmp[] = { __VA_ARGS__ };   \
> -int i;   \
> -for (i = 0; i < (int)ARRAY_SIZE(args_tmp) && \
> - i < ARRAY_SIZE(cap.args); i++) {\
> -cap.args[i] = args_tmp[i];   \
> -}\
> +size_t n = MIN(ARRAY_SIZE(args_tmp), ARRAY_SIZE(cap.args));  \
> +memcpy(cap.args, args_tmp, n * sizeof(cap.args[0])); \
>  kvm_vcpu_ioctl(cpu, KVM_ENABLE_CAP, &cap);   \
>  })
>  
>

Re: [Qemu-devel] [PATCH 11/17] block/nbd-client: fix nbd_co_request: set s->reply.handle to 0 on error

2017-08-07 Thread Eric Blake

On 08/04/2017 10:14 AM, Vladimir Sementsov-Ogievskiy wrote:
> We set s->reply.handle to 0 on one error path and don't set on another.
> For consistancy and to avoid assert in nbd_read_reply_entry let's
> set s->reply.handle to 0 in case of wrong handle too.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/nbd-client.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

Can this assertion be triggered now (presumably, with a broken server)?
I'm trying to figure out if this is 2.10 material.

[Urgh. If a broken server is able to cause an assertion failure that
causes a client to abort on an assertion failure, that probably deserves
a CVE]

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 04/17] nbd/client: fix nbd_send_request to return int

2017-08-07 Thread Daniel P. Berrange

On Mon, Aug 07, 2017 at 11:57:12AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> 07.08.2017 11:23, Daniel P. Berrange wrote:
> > On Fri, Aug 04, 2017 at 06:14:27PM +0300, Vladimir Sementsov-Ogievskiy 
> > wrote:
> > > Fix nbd_send_request to return int, as it returns a return value
> > > of nbd_write (which is int), and the only user of nbd_send_request's
> > > return value (nbd_co_send_request) consider it as int too.
> > The real problem here is the return value of nbd_write() - it should be
> > a ssize_t, not an int, since it is propagating the return value of
> > nbd_rwv() which is ssize_t.
> 
> It was changed in f5d406fe86bb (sent in May)
> The discussion actually was started half a year ago:
> https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg00825.html

Actually I misread the code.  nbd_rwv() is returning 0 on success, not
the number of bytes, so int is in fact ok


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH 02/17] nbd/client: refactor nbd_read_eof

2017-08-07 Thread Vladimir Sementsov-Ogievskiy


07.08.2017 14:42, Eric Blake wrote:

On 08/04/2017 10:14 AM, Vladimir Sementsov-Ogievskiy wrote:

Refactor nbd_read_eof to return 1 on success, 0 on eof, when no
data was read and <0 for other cases, because returned size of
read data is not actually used.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  nbd/nbd-internal.h | 34 +-
  nbd/client.c   |  5 -
  tests/qemu-iotests/083.out |  4 ++--
  3 files changed, 27 insertions(+), 16 deletions(-)

diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
index 396ddb4d3e..3fb0b6098a 100644
--- a/nbd/nbd-internal.h
+++ b/nbd/nbd-internal.h
@@ -77,21 +77,37 @@
  #define NBD_ESHUTDOWN  108
  
  /* nbd_read_eof

- * Tries to read @size bytes from @ioc. Returns number of bytes actually read.
- * May return a value >= 0 and < size only on EOF, i.e. when iteratively called
- * qio_channel_readv() returns 0. So, there is no need to call nbd_read_eof
- * iteratively.
+ * Tries to read @size bytes from @ioc.
+ * Returns 1 on success
+ * 0 on eof, when no data was read (errp is not set)
+ * -EINVAL on eof, when some data < @size was read until eof
+ * < 0 on read fail

In general, mixing negative errno value and generic < 0 in the same
function is most likely ambiguous.


Hmm, but this is entirely what we do so often:

if (,,) return -EINVAL;

return some_other_func().

last two lines may be rewritten like this:
+ * < 0 on fail




   */
-static inline ssize_t nbd_read_eof(QIOChannel *ioc, void *buffer, size_t size,
-   Error **errp)
+static inline int nbd_read_eof(QIOChannel *ioc, void *buffer, size_t size,
+   Error **errp)
  {
  struct iovec iov = { .iov_base = buffer, .iov_len = size };
+ssize_t ret;
+
  /* Sockets are kept in blocking mode in the negotiation phase.  After
   * that, a non-readable socket simply means that another thread stole
   * our request/reply.  Synchronization is done with recv_coroutine, so
   * that this is coroutine-safe.
   */
-return nbd_rwv(ioc, &iov, 1, size, true, errp);
+
+assert(size > 0);

Effectively the same as assert(size != 0).


+
+ret = nbd_rwv(ioc, &iov, 1, size, true, errp);
+if (ret <= 0) {
+return ret;
+}

So this is a negative errno (or 0 on EOF),


if it is < 0, it can be only -EIO, specified in nbd_rwv "by hand". it is 
unrelated to read read/write errno's





+
+if (ret != size) {
+error_setg(errp, "End of file");
+return -EINVAL;

and so is this. Which makes the function documentation not quite
accurate; you aren't mixing a generic < 0.


hmm.. my wordings are weird sometimes, sorry for that :(. and thank you 
for your patience.





+}
+
+return 1;
  }
  
  /* nbd_read

@@ -100,9 +116,9 @@ static inline ssize_t nbd_read_eof(QIOChannel *ioc, void 
*buffer, size_t size,
  static inline int nbd_read(QIOChannel *ioc, void *buffer, size_t size,
 Error **errp)
  {
-ssize_t ret = nbd_read_eof(ioc, buffer, size, errp);
+int ret = nbd_read_eof(ioc, buffer, size, errp);
  
-if (ret >= 0 && ret != size) {

+if (ret == 0) {
  ret = -EINVAL;
  error_setg(errp, "End of file");

Why do we have to set errp here instead of in nbd_read_eof()? Is there
ever any case where hitting early EOF is not something that should be
treated as an error?


yes. it is the only usage of nbd_read_eof - in nbd_receive_reply. This 
used to understand that there no more replies to read.





  }
diff --git a/nbd/client.c b/nbd/client.c
index f1c16b588f..4556056daa 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -925,11 +925,6 @@ ssize_t nbd_receive_reply(QIOChannel *ioc, NBDReply 
*reply, Error **errp)
  return ret;
  }
  
-if (ret != sizeof(buf)) {

-error_setg(errp, "read failed");
-return -EINVAL;
-}
-
  /* Reply
 [ 0 ..  3]magic   (NBD_REPLY_MAGIC)
 [ 4 ..  7]error   (0 == no error)
diff --git a/tests/qemu-iotests/083.out b/tests/qemu-iotests/083.out
index a24c6bfece..d3bea1b2f5 100644
--- a/tests/qemu-iotests/083.out
+++ b/tests/qemu-iotests/083.out
@@ -69,12 +69,12 @@ read failed: Input/output error
  
  === Check disconnect 4 reply ===
  
-read failed

+End of file
  read failed: Input/output error

At least you tracked that your changes tweak the error message.  But I'm
not yet convinced whether this change simplifies anything.  Is there a
later patch that is easier to write with the new semantics which was not
possible with the pre-patch semantics?


This patch just moves error handling one level down (do not propagate 
unused information). And removes (with the following patch) last remains 
of ssize_t and returning number of bytes in nbd/client.c - for consistency.

Let nbd_rwv  to be the only function returning number of bytes.





--
Best regards,
Vladimir

Re: [Qemu-devel] [PATCH v4 08/15] qcow2: set inactive flag

2017-08-07 Thread Anton Nefedov



On 08/04/2017 11:00 PM, Eric Blake wrote:

On 08/01/2017 09:19 AM, Anton Nefedov wrote:

Qcow2State and BlockDriverState flags have to be in sync

Signed-off-by: Anton Nefedov 
---
  block/qcow2.c | 1 +
  1 file changed, 1 insertion(+)


Is this a bug fix needed for 2.10?



I don't think this bites us now, but yes looks like a bug



diff --git a/block/qcow2.c b/block/qcow2.c
index 66aa8c2..2a1d2f2 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2138,6 +2138,7 @@ static int qcow2_inactivate(BlockDriverState *bs)
  
  if (result == 0) {

  qcow2_mark_clean(bs);
+s->flags |= BDRV_O_INACTIVE;
  }
  
  return result;





/Anton

Re: [Qemu-devel] [PATCH for-2.10] quorum: Handle bdrv_getlength() failures in quorum_co_flush()

2017-08-07 Thread Alberto Garcia

On Mon 07 Aug 2017 01:29:09 PM CEST, Eric Blake wrote:
> On 08/07/2017 03:43 AM, Alberto Garcia wrote:
>> On Fri 04 Aug 2017 05:44:00 PM CEST, Eric Blake wrote:
 --- a/block/quorum.c
 +++ b/block/quorum.c
 @@ -785,8 +785,9 @@ static coroutine_fn int 
 quorum_co_flush(BlockDriverState *bs)
  for (i = 0; i < s->num_children; i++) {
  result = bdrv_co_flush(s->children[i]->bs);
  if (result) {
 +int64_t length = bdrv_getlength(s->children[i]->bs);
  quorum_report_bad(QUORUM_OP_TYPE_FLUSH, 0,
 -  bdrv_getlength(s->children[i]->bs),
 +  length > 0 ? length : 0,
>>>
>>> In the fallback case, is always picking 0 good enough?  Then again,
>>> this is in the error path, so it is unlikely in practice, and I don't
>>> see any better way to handle it.
>> 
>> I don't see any better way to handle it either, and I'm not sure that it
>> matters much: this is a flush error event, the 'sectors-count' field
>> doesn't even mean anything, but we have to put something there.
>
> If that's the case, can we ALWAYS report 0, instead of usually the
> length and sometimes 0?

That's actually not a bad idea, I'll send the patch now.

Berto

Re: [Qemu-devel] [PATCH 2/2] block/vhdx: check for offset overflow to bdrv_truncate()

2017-08-07 Thread Jeff Cody

On Mon, Aug 07, 2017 at 06:24:30AM -0500, Eric Blake wrote:
> On 08/06/2017 10:08 PM, Jeff Cody wrote:
> > VHDX uses uint64_t types for most offsets, following the VHDX spec.
> > However, bdrv_truncate() takes an int64_t value for the truncating
> > offset.  Check for overflow before calling bdrv_truncate().
> > 
> > N.B.: For a compliant image this is not an issue, as the maximum VHDX
> > image size is defined per the spec to be 64TB.
> > 
> > Signed-off-by: Jeff Cody 
> > ---
> >  block/vhdx-log.c | 4 
> >  block/vhdx.c | 3 +++
> >  2 files changed, 7 insertions(+)
> > 
> > diff --git a/block/vhdx-log.c b/block/vhdx-log.c
> > index fd4e7af..3b74e5d 100644
> > --- a/block/vhdx-log.c
> > +++ b/block/vhdx-log.c
> > @@ -554,6 +554,10 @@ static int vhdx_log_flush(BlockDriverState *bs, 
> > BDRVVHDXState *s,
> >  if (new_file_size % (1024*1024)) {
> >  /* round up to nearest 1MB boundary */
> >  new_file_size = ((new_file_size >> 20) + 1) << 20;
> 
> Since you're touching here, can you fix this to use QEMU_ALIGN_UP instead?
> 

Good idea, yes.

> > +if (new_file_size > INT64_MAX) {
> > +ret = -EINVAL;
> > +goto exit;
> > +}
> >  bdrv_truncate(bs->file, new_file_size, PREALLOC_MODE_OFF, 
> > NULL);
> 
> Reviewed-by: Eric Blake 
> 
> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.   +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org
>

Re: [Qemu-devel] [PATCH for-2.10] hw/arm/virt: Add 2.10 machine type

2017-08-07 Thread Andrew Jones

On Mon, Aug 07, 2017 at 11:49:41AM +, Eric Auger wrote:
> Add virt-2.10 machine type.
> 
> Signed-off-by: Eric Auger 
> 
> ---
> ---
>  hw/arm/virt.c | 19 +--
>  1 file changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 31739d7..6b7a0fe 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -1639,7 +1639,7 @@ static void machvirt_machine_init(void)
>  }
>  type_init(machvirt_machine_init);
>  
> -static void virt_2_9_instance_init(Object *obj)
> +static void virt_2_10_instance_init(Object *obj)
>  {
>  VirtMachineState *vms = VIRT_MACHINE(obj);
>  VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
> @@ -1699,10 +1699,25 @@ static void virt_2_9_instance_init(Object *obj)
>  vms->irqmap = a15irqmap;
>  }
>  
> +static void virt_machine_2_10_options(MachineClass *mc)
> +{
> +}
> +DEFINE_VIRT_MACHINE_AS_LATEST(2, 10)
> +
> +#define VIRT_COMPAT_2_9 \
> +HW_COMPAT_2_9
> +
> +static void virt_2_9_instance_init(Object *obj)
> +{
> +virt_2_10_instance_init(obj);
> +}
> +
>  static void virt_machine_2_9_options(MachineClass *mc)
>  {
> +virt_machine_2_10_options(mc);
> +SET_MACHINE_COMPAT(mc, VIRT_COMPAT_2_9);
>  }
> -DEFINE_VIRT_MACHINE_AS_LATEST(2, 9)
> +DEFINE_VIRT_MACHINE(2, 9)
>  
>  #define VIRT_COMPAT_2_8 \
>  HW_COMPAT_2_8
> -- 
> 1.9.1
> 
>

Reviewed-by: Andrew Jones

Re: [Qemu-devel] [PATCH v2] kvm: workaround build break on gcc-7.1.1 / fedora26

2017-08-07 Thread Cornelia Huck

On Mon, 07 Aug 2017 13:36:44 +0200
Greg Kurz  wrote:

> Building QEMU on fedora26 with the latest gcc package fails:
> 
>   CC  ppc64-softmmu/target/ppc/kvm.o
> In file included from include/sysemu/hw_accel.h:16:0,
>  from target/ppc/kvm.c:31:
> target/ppc/kvm.c: In function ‘kvmppc_booke_watchdog_enable’:
> include/sysemu/kvm.h:449:35: error: ‘args_tmp[i]’ may be used uninitialized
>  in this function [-Werror=maybe-uninitialized]
>  cap.args[i] = args_tmp[i];   \
>^
> target/ppc/kvm.c: In function ‘kvmppc_set_papr’:
> include/sysemu/kvm.h:449:35: error: ‘args_tmp[i]’ may be used uninitialized
>  in this function [-Werror=maybe-uninitialized]
> cc1: all warnings being treated as errors
> 
> $ rpm -q gcc
> gcc-7.1.1-3.fc26.ppc64le
> 
> The compiler should obviously optimize this code away when no extra
> agument is passed to kvm_vm_enable_cap() and kvm_vcpu_enable_cap(),
> but it doesn't. This bug should be fixed one day in gcc, but we can
> also change our code pattern so that we don't hit the issue anymore.
> We workaround this, by using memcpy() instead of open-coding the copy.
> 
> Signed-off-by: Greg Kurz 
> ---
> v2: - use memcpy()
> ---
>  include/sysemu/kvm.h |   14 --
>  1 file changed, 4 insertions(+), 10 deletions(-)

Acked-by: Cornelia Huck

Re: [Qemu-devel] [Qemu-block] [PATCH 1/2] block/vhdx: check error return of bdrv_getlength()

2017-08-07 Thread Jeff Cody

On Mon, Aug 07, 2017 at 12:46:30PM +0200, Kevin Wolf wrote:
> Am 07.08.2017 um 05:08 hat Jeff Cody geschrieben:
> > Calls to bdrv_getlength() were not checking for error.  In vhdx.c, this
> > can lead to truncating an image file, so it is a definite bug.  In
> > vhdx-log.c, the path for improper behavior is less clear, but it is best
> > to check in any case.
> > 
> > Reported-by: Markus Armbruster 
> > Signed-off-by: Jeff Cody 
> > ---
> >  block/vhdx-log.c | 20 
> >  block/vhdx.c |  9 -
> >  2 files changed, 24 insertions(+), 5 deletions(-)
> > 
> > diff --git a/block/vhdx-log.c b/block/vhdx-log.c
> > index 01278f3..fd4e7af 100644
> > --- a/block/vhdx-log.c
> > +++ b/block/vhdx-log.c
> > @@ -491,6 +491,7 @@ static int vhdx_log_flush(BlockDriverState *bs, 
> > BDRVVHDXState *s,
> >  uint32_t cnt, sectors_read;
> >  uint64_t new_file_size;
> >  void *data = NULL;
> > +int64_t file_length;
> >  VHDXLogDescEntries *desc_entries = NULL;
> >  VHDXLogEntryHeader hdr_tmp = { 0 };
> >  
> > @@ -510,10 +511,15 @@ static int vhdx_log_flush(BlockDriverState *bs, 
> > BDRVVHDXState *s,
> >  if (ret < 0) {
> >  goto exit;
> >  }
> > +file_length = bdrv_getlength(bs->file->bs);
> > +if (file_length < 0) {
> > +ret = file_length;
> > +goto exit;
> > +}
> >  /* if the log shows a FlushedFileOffset larger than our current 
> > file
> >   * size, then that means the file has been truncated / corrupted, 
> > and
> >   * we must refused to open it / use it */
> > -if (hdr_tmp.flushed_file_offset > bdrv_getlength(bs->file->bs)) {
> > +if (hdr_tmp.flushed_file_offset > file_length) {
> >  ret = -EINVAL;
> >  goto exit;
> >  }
> > @@ -543,7 +549,7 @@ static int vhdx_log_flush(BlockDriverState *bs, 
> > BDRVVHDXState *s,
> >  goto exit;
> >  }
> >  }
> > -if (bdrv_getlength(bs->file->bs) < 
> > desc_entries->hdr.last_file_offset) {
> > +if (file_length < desc_entries->hdr.last_file_offset) {
> >  new_file_size = desc_entries->hdr.last_file_offset;
> >  if (new_file_size % (1024*1024)) {
> >  /* round up to nearest 1MB boundary */
> 
> The vhdx_log_flush() part looks good, but it made me notice a
> bdrv_flush() in the same function where the return value isn't checked.
> I'm almost sure that this is a bug.
> 

I'll add 2 more simple patches to the series: one for bdrv_flush(), and one
for the unchecked bdrv_truncate() as well.


> > @@ -851,6 +857,7 @@ static int vhdx_log_write(BlockDriverState *bs, 
> > BDRVVHDXState *s,
> >  uint32_t partial_sectors = 0;
> >  uint32_t bytes_written = 0;
> >  uint64_t file_offset;
> > +int64_t file_length;
> >  VHDXHeader *header;
> >  VHDXLogEntryHeader new_hdr;
> >  VHDXLogDescriptor *new_desc = NULL;
> > @@ -913,10 +920,15 @@ static int vhdx_log_write(BlockDriverState *bs, 
> > BDRVVHDXState *s,
> >  .sequence_number = s->log.sequence,
> >  .descriptor_count= sectors,
> >  .reserved= 0,
> > -.flushed_file_offset = bdrv_getlength(bs->file->bs),
> > -.last_file_offset= bdrv_getlength(bs->file->bs),
> >};
> >  
> > +file_length = bdrv_getlength(bs->file->bs);
> > +if (file_length < 0) {
> > +ret = file_length;
> > +goto exit;
> > +}
> > +new_hdr.flushed_file_offset = file_length;
> > +new_hdr.last_file_offset= file_length;
> >  new_hdr.log_guid = header->log_guid;
> 
> If you move the bdrv_getlength() above the initialisation of new_hdr,
> you could keep these fields in the designated initialiser, which should
> be better for readability.
> 
> I also don't know why .log_guid isn't part if it, could be moved, too.
> 
> >  desc_sectors = vhdx_compute_desc_sectors(new_hdr.descriptor_count);
> > diff --git a/block/vhdx.c b/block/vhdx.c
> > index a9cecd2..6a14999 100644
> > --- a/block/vhdx.c
> > +++ b/block/vhdx.c
> > @@ -1166,7 +1166,14 @@ exit:
> >  static int vhdx_allocate_block(BlockDriverState *bs, BDRVVHDXState *s,
> >  uint64_t *new_offset)
> >  {
> > -*new_offset = bdrv_getlength(bs->file->bs);
> > +int64_t current_len;
> > +current_len = bdrv_getlength(bs->file->bs);
> > +
> > +if (current_len < 0) {
> > +return current_len;
> > +}
> 
> Don't you want the empty line to be between declaration and code rather
> than assignment and check?
> 
> > +*new_offset = current_len;
> >  
> >  /* per the spec, the address for a block is in units of 1MB */
> >  *new_offset = ROUND_UP(*new_offset, 1024 * 1024);
> 
> So the code looks correct, but we could make it a little nicer in a v2.
>

Yep, thanks.  I'll address all those cleanups you mentioned

Re: [Qemu-devel] [PATCH v4 02/15] blkverify: set supported write/zero flags

2017-08-07 Thread Anton Nefedov


On 08/04/2017 06:23 PM, Eric Blake wrote:

On 08/01/2017 09:18 AM, Anton Nefedov wrote:

Signed-off-by: Anton Nefedov 
---
  block/blkverify.c | 9 +
  1 file changed, 9 insertions(+)


Basically, blkverify supports a flag if BOTH of its underlying files
also support the flag; if either side can't handle the flag, then we
fall back to emulation for both sides.

With more overhead, we COULD state that we support both bits if at least
one of the two underlying BDS supports the bit, and then emulate support
for the bit on the second BDS where it was lacking, so that at least the
first BDS doesn't suffer from the penalties of the fallbacks.  But that
means duplicating the block layer fallback code in blkverify, which is
already something that we don't necessarily expect high performance from.

For FUA, failure to implement the bit merely means that we have more
device-wide flush calls (instead of per-transaction mini-flushes), but
the end data should be the same.  But for MAY_UNMAP, I'm worried that we
may have situations where a plain BDS will create holes, while running
the same device paired through blkverify will fall back to slower
explicit zeroes.  I'm wondering whether this will bite us, if we have
scenarios where the mere fact of trying to verify block device behavior
changes what behavior we are even verifying.

Thus, while I think the code change _looks_ okay, I'm not sure if it is
correct design-wise, nor whether it is 2.10 material.



But this change doesn't make things worse, does it?
Blkverify will support at least the matching bytes, while now it drops
them all.

Will the 'reference' driver (I guess it's raw/file in most of the cases)
usually support a superset of 'tested' driver supported bits?
Maybe we should report an error otherwise?



+bs->supported_write_flags = BDRV_REQ_FUA &
+bs->file->bs->supported_write_flags &
+s->test_file->bs->supported_write_flags;
+
+bs->supported_zero_flags =
+(BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
+bs->file->bs->supported_zero_flags &
+s->test_file->bs->supported_zero_flags;
+
  ret = 0;
  fail:
  qemu_opts_del(opts);




/Anton

Re: [Qemu-devel] [PATCH f0r 2.11] runstate/migrate: Two more transitions

2017-08-07 Thread Dr. David Alan Gilbert

* Peter Xu (pet...@redhat.com) wrote:
> On Fri, Aug 04, 2017 at 06:50:11PM +0100, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" 
> > 
> > There's a race if someone does a 'stop' near the end of migrate;
> > the migration process goes through two runstates:
> > 'finish migrate'
> > 'postmigrate'
> > 
> > If the user issues a 'stop' between the two we end up with invalid
> > state transitions.
> > Add the transitions as valid.
> > 
> > Signed-off-by: Dr. David Alan Gilbert 
> > ---
> >  vl.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/vl.c b/vl.c
> > index 99fcfa0442..bacb03f49d 100644
> > --- a/vl.c
> > +++ b/vl.c
> > @@ -621,6 +621,7 @@ static const RunStateTransition 
> > runstate_transitions_def[] = {
> >  
> >  { RUN_STATE_PAUSED, RUN_STATE_RUNNING },
> >  { RUN_STATE_PAUSED, RUN_STATE_FINISH_MIGRATE },
> > +{ RUN_STATE_PAUSED, RUN_STATE_POSTMIGRATE },
> >  { RUN_STATE_PAUSED, RUN_STATE_PRELAUNCH },
> >  { RUN_STATE_PAUSED, RUN_STATE_COLO},
> >  
> > @@ -633,6 +634,7 @@ static const RunStateTransition 
> > runstate_transitions_def[] = {
> >  { RUN_STATE_PRELAUNCH, RUN_STATE_INMIGRATE },
> >  
> >  { RUN_STATE_FINISH_MIGRATE, RUN_STATE_RUNNING },
> > +{ RUN_STATE_FINISH_MIGRATE, RUN_STATE_PAUSED },
> >  { RUN_STATE_FINISH_MIGRATE, RUN_STATE_POSTMIGRATE },
> >  { RUN_STATE_FINISH_MIGRATE, RUN_STATE_PRELAUNCH },
> >  { RUN_STATE_FINISH_MIGRATE, RUN_STATE_COLO},
> 
> Do we need this as well?
> 
> { RUN_STATE_POSTMIGRATE, RUN_STATE_PAUSED },

Apparently not:

(qemu) migrate "exec:cat > /dev/null"
(qemu) infomigrate
unknown command: 'infomigrate'
(qemu) info status
VM status: paused (postmigrate)
(qemu) stop
(qemu) info status
VM status: paused (postmigrate)
(qemu)


so doing a stop at that point doesn't seem to cause any problem.

Dave

> -- 
> Peter Xu
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

[Qemu-devel] [PATCH for-2.10] quorum: Set sectors-count to 0 when reporting a flush error

2017-08-07 Thread Alberto Garcia

The QUORUM_REPORT_BAD event has fields to report the sector in which
the error was detected and the number of affected sectors starting
from that one. This is important for read and write errors, but not
for flush errors.

For flush errors the current code reports the total size of the disk
image. That is however not useful information in this case. Moreover,
the bdrv_getlength() call can fail, and there's no good way of
handling that failure.

Since we're reporting useless information and we cannot even guarantee
to do it in a consistent way, this patch changes the code to report 0
instead in all cases.

Reported-by: Markus Armbruster 
Signed-off-by: Alberto Garcia 
---
 block/quorum.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/block/quorum.c b/block/quorum.c
index 55ba916655..d04da4f430 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -785,8 +785,7 @@ static coroutine_fn int quorum_co_flush(BlockDriverState 
*bs)
 for (i = 0; i < s->num_children; i++) {
 result = bdrv_co_flush(s->children[i]->bs);
 if (result) {
-quorum_report_bad(QUORUM_OP_TYPE_FLUSH, 0,
-  bdrv_getlength(s->children[i]->bs),
+quorum_report_bad(QUORUM_OP_TYPE_FLUSH, 0, 0,
   s->children[i]->bs->node_name, result);
 result_value.l = result;
 quorum_count_vote(&error_votes, &result_value, i);
-- 
2.11.0

[Qemu-devel] [PATCH V8 2/6] qmp: Create IOThrottle structure

2017-08-07 Thread Pradeep Jagadeesh

This patch enables qmp interfaces for the fsdev
devices. This provides two interfaces one
for querying info of all the fsdev devices. The second one
to set the IO limits for the required fsdev device.

Signed-off-by: Pradeep Jagadeesh 
Reviewed-by: Greg Kurz 
Reviewed-by: Eric Blake 
Reviewed-by: Alberto Garcia 
---
 qapi/block-core.json | 76 ++---
 qapi/iothrottle.json | 88 
 2 files changed, 91 insertions(+), 73 deletions(-)
 create mode 100644 qapi/iothrottle.json

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 833c602..98147ef 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -6,6 +6,7 @@
 
 # QAPI common definitions
 { 'include': 'common.json' }
+{ 'include': 'iothrottle.json' }
 
 ##
 # @SnapshotInfo:
@@ -1825,84 +1826,13 @@
 #
 # @device: Block device name (deprecated, use @id instead)
 #
-# @id: The name or QOM path of the guest device (since: 2.8)
-#
-# @bps: total throughput limit in bytes per second
-#
-# @bps_rd: read throughput limit in bytes per second
-#
-# @bps_wr: write throughput limit in bytes per second
-#
-# @iops: total I/O operations per second
-#
-# @iops_rd: read I/O operations per second
-#
-# @iops_wr: write I/O operations per second
-#
-# @bps_max: total throughput limit during bursts,
-# in bytes (Since 1.7)
-#
-# @bps_rd_max: read throughput limit during bursts,
-#in bytes (Since 1.7)
-#
-# @bps_wr_max: write throughput limit during bursts,
-#in bytes (Since 1.7)
-#
-# @iops_max: total I/O operations per second during bursts,
-#  in bytes (Since 1.7)
-#
-# @iops_rd_max: read I/O operations per second during bursts,
-# in bytes (Since 1.7)
-#
-# @iops_wr_max: write I/O operations per second during bursts,
-# in bytes (Since 1.7)
-#
-# @bps_max_length: maximum length of the @bps_max burst
-#period, in seconds. It must only
-#be set if @bps_max is set as well.
-#Defaults to 1. (Since 2.6)
-#
-# @bps_rd_max_length: maximum length of the @bps_rd_max
-#   burst period, in seconds. It must only
-#   be set if @bps_rd_max is set as well.
-#   Defaults to 1. (Since 2.6)
-#
-# @bps_wr_max_length: maximum length of the @bps_wr_max
-#   burst period, in seconds. It must only
-#   be set if @bps_wr_max is set as well.
-#   Defaults to 1. (Since 2.6)
-#
-# @iops_max_length: maximum length of the @iops burst
-# period, in seconds. It must only
-# be set if @iops_max is set as well.
-# Defaults to 1. (Since 2.6)
-#
-# @iops_rd_max_length: maximum length of the @iops_rd_max
-#burst period, in seconds. It must only
-#be set if @iops_rd_max is set as well.
-#Defaults to 1. (Since 2.6)
-#
-# @iops_wr_max_length: maximum length of the @iops_wr_max
-#burst period, in seconds. It must only
-#be set if @iops_wr_max is set as well.
-#Defaults to 1. (Since 2.6)
-#
-# @iops_size: an I/O size in bytes (Since 1.7)
-#
 # @group: throttle group name (Since 2.4)
 #
 # Since: 1.1
 ##
 { 'struct': 'BlockIOThrottle',
-  'data': { '*device': 'str', '*id': 'str', 'bps': 'int', 'bps_rd': 'int',
-'bps_wr': 'int', 'iops': 'int', 'iops_rd': 'int', 'iops_wr': 'int',
-'*bps_max': 'int', '*bps_rd_max': 'int',
-'*bps_wr_max': 'int', '*iops_max': 'int',
-'*iops_rd_max': 'int', '*iops_wr_max': 'int',
-'*bps_max_length': 'int', '*bps_rd_max_length': 'int',
-'*bps_wr_max_length': 'int', '*iops_max_length': 'int',
-'*iops_rd_max_length': 'int', '*iops_wr_max_length': 'int',
-'*iops_size': 'int', '*group': 'str' } }
+  'base': 'IOThrottle',
+  'data': { '*device': 'str', '*group': 'str' } }
 
 ##
 # @block-stream:
diff --git a/qapi/iothrottle.json b/qapi/iothrottle.json
new file mode 100644
index 000..0f067c3
--- /dev/null
+++ b/qapi/iothrottle.json
@@ -0,0 +1,88 @@
+# -*- Mode: Python -*-
+
+##
+# == QAPI IOThrottle definitions
+##
+
+##
+# @IOThrottle:
+#
+# A set of parameters describing IO throttling
+#
+# @id: The name or QOM path of the guest device (since: 2.8)
+#
+# @bps: total throughput limit in bytes per second
+#
+# @bps_rd: read throughput limit in bytes per second
+#
+# @bps_wr: write throughput limit in bytes per second
+#
+# @iops: total I/O operations per second
+#
+# @iops_rd: read I/O operations per second
+#
+# @iops_wr: write I/O operati

[Qemu-devel] [PATCH V8 3/6] throttle: move out function to reuse the code

2017-08-07 Thread Pradeep Jagadeesh

This patch move out the throttle code to util/throttle.c to maximize
the reusability of the code.The same code is also used by fsdev.

Signed-off-by: Pradeep Jagadeesh 
---
 blockdev.c  | 53 +++-
 include/qemu/throttle-options.h |  3 +++
 util/throttle.c | 59 +
 3 files changed, 65 insertions(+), 50 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 780ae58..1caf2e0 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2571,6 +2571,7 @@ void qmp_block_set_io_throttle(BlockIOThrottle *arg, 
Error **errp)
 BlockDriverState *bs;
 BlockBackend *blk;
 AioContext *aio_context;
+IOThrottle *iothrottle;
 
 blk = qmp_get_blk(arg->has_device ? arg->device : NULL,
   arg->has_id ? arg->id : NULL,
@@ -2588,56 +2589,8 @@ void qmp_block_set_io_throttle(BlockIOThrottle *arg, 
Error **errp)
 goto out;
 }
 
-throttle_config_init(&cfg);
-cfg.buckets[THROTTLE_BPS_TOTAL].avg = arg->bps;
-cfg.buckets[THROTTLE_BPS_READ].avg  = arg->bps_rd;
-cfg.buckets[THROTTLE_BPS_WRITE].avg = arg->bps_wr;
-
-cfg.buckets[THROTTLE_OPS_TOTAL].avg = arg->iops;
-cfg.buckets[THROTTLE_OPS_READ].avg  = arg->iops_rd;
-cfg.buckets[THROTTLE_OPS_WRITE].avg = arg->iops_wr;
-
-if (arg->has_bps_max) {
-cfg.buckets[THROTTLE_BPS_TOTAL].max = arg->bps_max;
-}
-if (arg->has_bps_rd_max) {
-cfg.buckets[THROTTLE_BPS_READ].max = arg->bps_rd_max;
-}
-if (arg->has_bps_wr_max) {
-cfg.buckets[THROTTLE_BPS_WRITE].max = arg->bps_wr_max;
-}
-if (arg->has_iops_max) {
-cfg.buckets[THROTTLE_OPS_TOTAL].max = arg->iops_max;
-}
-if (arg->has_iops_rd_max) {
-cfg.buckets[THROTTLE_OPS_READ].max = arg->iops_rd_max;
-}
-if (arg->has_iops_wr_max) {
-cfg.buckets[THROTTLE_OPS_WRITE].max = arg->iops_wr_max;
-}
-
-if (arg->has_bps_max_length) {
-cfg.buckets[THROTTLE_BPS_TOTAL].burst_length = arg->bps_max_length;
-}
-if (arg->has_bps_rd_max_length) {
-cfg.buckets[THROTTLE_BPS_READ].burst_length = arg->bps_rd_max_length;
-}
-if (arg->has_bps_wr_max_length) {
-cfg.buckets[THROTTLE_BPS_WRITE].burst_length = arg->bps_wr_max_length;
-}
-if (arg->has_iops_max_length) {
-cfg.buckets[THROTTLE_OPS_TOTAL].burst_length = arg->iops_max_length;
-}
-if (arg->has_iops_rd_max_length) {
-cfg.buckets[THROTTLE_OPS_READ].burst_length = arg->iops_rd_max_length;
-}
-if (arg->has_iops_wr_max_length) {
-cfg.buckets[THROTTLE_OPS_WRITE].burst_length = arg->iops_wr_max_length;
-}
-
-if (arg->has_iops_size) {
-cfg.op_size = arg->iops_size;
-}
+iothrottle = qapi_BlockIOThrottle_base(arg);
+throttle_set_io_limits(&cfg, iothrottle);
 
 if (!throttle_is_valid(&cfg, errp)) {
 goto out;
diff --git a/include/qemu/throttle-options.h b/include/qemu/throttle-options.h
index f63d38c..a9deb8e 100644
--- a/include/qemu/throttle-options.h
+++ b/include/qemu/throttle-options.h
@@ -11,6 +11,7 @@
 #define THROTTLE_OPTIONS_H
 
 #include "typedefs.h"
+#include "qapi-types.h"
 
 #define THROTTLE_OPTS \
   { \
@@ -93,4 +94,6 @@
 
 void throttle_parse_options(ThrottleConfig *, QemuOpts *);
 
+void throttle_set_io_limits(ThrottleConfig *, IOThrottle *);
+
 #endif
diff --git a/util/throttle.c b/util/throttle.c
index 95c2ecf..2d00532 100644
--- a/util/throttle.c
+++ b/util/throttle.c
@@ -553,3 +553,62 @@ void throttle_parse_options(ThrottleConfig *throttle_cfg, 
QemuOpts *opts)
 throttle_cfg->op_size =
 qemu_opt_get_number(opts, "throttling.iops-size", 0);
 }
+
+/* set the throttle limits
+ *
+ * @arg: iothrottle limits
+ * @cfg: throttle configuration
+ */
+void throttle_set_io_limits(ThrottleConfig *cfg, IOThrottle *arg)
+{
+throttle_config_init(cfg);
+cfg->buckets[THROTTLE_BPS_TOTAL].avg = arg->bps;
+cfg->buckets[THROTTLE_BPS_READ].avg  = arg->bps_rd;
+cfg->buckets[THROTTLE_BPS_WRITE].avg = arg->bps_wr;
+
+cfg->buckets[THROTTLE_OPS_TOTAL].avg = arg->iops;
+cfg->buckets[THROTTLE_OPS_READ].avg  = arg->iops_rd;
+cfg->buckets[THROTTLE_OPS_WRITE].avg = arg->iops_wr;
+
+if (arg->has_bps_max) {
+cfg->buckets[THROTTLE_BPS_TOTAL].max = arg->bps_max;
+}
+if (arg->has_bps_rd_max) {
+cfg->buckets[THROTTLE_BPS_READ].max = arg->bps_rd_max;
+}
+if (arg->has_bps_wr_max) {
+cfg->buckets[THROTTLE_BPS_WRITE].max = arg->bps_wr_max;
+}
+if (arg->has_iops_max) {
+cfg->buckets[THROTTLE_OPS_TOTAL].max = arg->iops_max;
+}
+if (arg->has_iops_rd_max) {
+cfg->buckets[THROTTLE_OPS_READ].max = arg->iops_rd_max;
+}
+if (arg->has_iops_wr_max) {
+cfg->buckets[THROTTLE_OPS_WRITE].max = arg->iops_wr_max;
+}
+
+if (arg->has_bps_max_length) {
+cfg->buckets[THROTTLE_BPS_TOTAL].burst_length = arg->bps_max_length;
+

[Qemu-devel] [PATCH v8 0/6] fsdev: qmp interface for io throttling

2017-08-07 Thread Pradeep Jagadeesh

These patches provide the qmp interface, to query the io throttle 
status of the all fsdev devices that are present in a vm.
also, it provides an interface to set the io throttle parameters of a
fsdev to a required value. some of the patches also remove the duplicate
code that was present in block and fsdev files. 

Pradeep Jagadeesh (6):
  throttle: factor out duplicate code
  qmp: Create IOThrottle structure
  throttle: move out function to reuse the code
  hmp: create a throttle initialization function for code reusability
  fsdev: QMP interface for throttling
  fsdev: hmp interface for throttling

 Makefile|   4 ++
 blockdev.c  |  97 ++--
 fsdev/qemu-fsdev-dummy.c|  11 
 fsdev/qemu-fsdev-throttle.c | 120 ++--
 fsdev/qemu-fsdev-throttle.h |   8 ++-
 fsdev/qemu-fsdev.c  |  38 +
 hmp-commands-info.hx|  18 ++
 hmp-commands.hx |  19 +++
 hmp.c   |  81 +--
 hmp.h   |   4 ++
 include/qemu/throttle-options.h |   7 +++
 include/qemu/throttle.h |   4 +-
 include/qemu/typedefs.h |   1 +
 monitor.c   |   5 ++
 qapi-schema.json|   3 +
 qapi/block-core.json|  76 +
 qapi/fsdev.json |  84 
 qapi/iothrottle.json|  88 +
 qmp.c   |  14 +
 util/throttle.c | 110 
 20 files changed, 574 insertions(+), 218 deletions(-)
 create mode 100644 qapi/fsdev.json
 create mode 100644 qapi/iothrottle.json

v0 -> v1:
 Addressed comments from Eric Blake, Greg Kurz and Daniel P.Berrange
 Mainly renaming the functions and removing the redundant code.

v1 -> v2:
 Addressed comments from Eric Blake and Greg Kurz.
 As per the suggestion I split the patches into smaller patches.
 Removed some more duplicate code.

v2 -> v3:
 Addresssed comments from Alberto Garcia.
 Changed the comment from block to iothrottle in the iothrottle.json 
 Added the dummy functions in qemu-fsdev-dummy.c to address the compilation
 issues that were observed.

v3 -> v4:
 Addressed comments from Eric Blake and Greg Kurz
 Re-ordered the patches
 Added the dummy functions in qmp.c to address the cross compilation issues

v4 -> v5:
  Addressed comments from Eric Blake and Greg Kurz
  Split the fsdev qmp patch into hmp and qmp related patches
  Moved the common functionalities to throttle.c instead of creating
  a new file

v5 -> v6:
  Addressed comments from Greg Kurz and Markus Armbruster
  Split the commits to specific to hmp and throttle as suggested by Greg
  Moved ThrottleConfig typedef to qemu/typedefs.h
  Addressed compilation issue on FreeBSD by adding flags in qmp.c

v6 -> v7:
  Addressed comments from Albert Garcia and Dr. David Alan Gilbert
  Fixed the hmp-commands-info.hx and hmp-commands.hx as per Dr. David's
  comments.
  Fixed the bug with the hmp fsdev_set_io_throttle and info fsdev_iothrottle  

v7 -> v8:
  Addressed comments from Markus Armbruster and Eric Blake
  Removed unwanted headers from qmp-fsdev-throttle.h
-- 
1.8.3.1

[Qemu-devel] [PATCH V8 4/6] hmp: create a throttle initialization function for code reusability

2017-08-07 Thread Pradeep Jagadeesh

This patch creates a throttle initialization function to maximize the
code reusability. The same code is also used by fsdev.

Acked-by: Dr. David Alan Gilbert 
Signed-off-by: Pradeep Jagadeesh 
---
 hmp.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/hmp.c b/hmp.c
index fd80dce..2dbfb80 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1758,20 +1758,27 @@ void hmp_change(Monitor *mon, const QDict *qdict)
 hmp_handle_error(mon, &err);
 }
 
+static void hmp_initialize_io_throttle(IOThrottle *iot, const QDict *qdict)
+{
+iot->bps = qdict_get_int(qdict, "bps");
+iot->bps_rd = qdict_get_int(qdict, "bps_rd");
+iot->bps_wr = qdict_get_int(qdict, "bps_wr");
+iot->iops = qdict_get_int(qdict, "iops");
+iot->iops_rd = qdict_get_int(qdict, "iops_rd");
+iot->iops_wr = qdict_get_int(qdict, "iops_wr");
+}
+
 void hmp_block_set_io_throttle(Monitor *mon, const QDict *qdict)
 {
 Error *err = NULL;
+IOThrottle *iothrottle;
 BlockIOThrottle throttle = {
 .has_device = true,
 .device = (char *) qdict_get_str(qdict, "device"),
-.bps = qdict_get_int(qdict, "bps"),
-.bps_rd = qdict_get_int(qdict, "bps_rd"),
-.bps_wr = qdict_get_int(qdict, "bps_wr"),
-.iops = qdict_get_int(qdict, "iops"),
-.iops_rd = qdict_get_int(qdict, "iops_rd"),
-.iops_wr = qdict_get_int(qdict, "iops_wr"),
 };
 
+iothrottle = qapi_BlockIOThrottle_base(&throttle);
+hmp_initialize_io_throttle(iothrottle, qdict);
 qmp_block_set_io_throttle(&throttle, &err);
 hmp_handle_error(mon, &err);
 }
-- 
1.8.3.1

[Qemu-devel] [PATCH V8 5/6] fsdev: QMP interface for throttling

2017-08-07 Thread Pradeep Jagadeesh

This patch introduces qmp interfaces for the fsdev
devices. This provides two interfaces one
for querying info of all the fsdev devices. The second one
to set the IO limits for the required fsdev device.

Signed-off-by: Pradeep Jagadeesh 
---
 Makefile|  4 +++
 fsdev/qemu-fsdev-dummy.c| 11 ++
 fsdev/qemu-fsdev-throttle.c | 76 
 fsdev/qemu-fsdev-throttle.h |  8 +++--
 fsdev/qemu-fsdev.c  | 38 
 monitor.c   |  5 +++
 qapi-schema.json|  3 ++
 qapi/fsdev.json | 84 +
 qmp.c   | 14 
 9 files changed, 241 insertions(+), 2 deletions(-)
 create mode 100644 qapi/fsdev.json

diff --git a/Makefile b/Makefile
index 97a58a0..5af92cb 100644
--- a/Makefile
+++ b/Makefile
@@ -416,6 +416,10 @@ qapi-modules = $(SRC_PATH)/qapi-schema.json 
$(SRC_PATH)/qapi/common.json \
$(SRC_PATH)/qapi/crypto.json $(SRC_PATH)/qapi/rocker.json \
$(SRC_PATH)/qapi/trace.json
 
+ifdef CONFIG_VIRTFS
+qapi-modules += $(SRC_PATH)/qapi/fsdev.json
+endif
+
 qapi-types.c qapi-types.h :\
 $(qapi-modules) $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py \
diff --git a/fsdev/qemu-fsdev-dummy.c b/fsdev/qemu-fsdev-dummy.c
index 6dc0fbc..28c82d2 100644
--- a/fsdev/qemu-fsdev-dummy.c
+++ b/fsdev/qemu-fsdev-dummy.c
@@ -14,8 +14,19 @@
 #include "qemu-fsdev.h"
 #include "qemu/config-file.h"
 #include "qemu/module.h"
+#include "qmp-commands.h"
 
 int qemu_fsdev_add(QemuOpts *opts)
 {
 return 0;
 }
+
+void qmp_fsdev_set_io_throttle(IOThrottle *arg, Error **errp)
+{
+return;
+}
+
+IOThrottleList *qmp_query_fsdev_io_throttle(Error **errp)
+{
+return NULL;
+}
diff --git a/fsdev/qemu-fsdev-throttle.c b/fsdev/qemu-fsdev-throttle.c
index 0e6fb86..184ed4c 100644
--- a/fsdev/qemu-fsdev-throttle.c
+++ b/fsdev/qemu-fsdev-throttle.c
@@ -16,6 +16,7 @@
 #include "qemu/error-report.h"
 #include "qemu-fsdev-throttle.h"
 #include "qemu/iov.h"
+#include "qemu/main-loop.h"
 #include "qemu/throttle-options.h"
 
 static void fsdev_throttle_read_timer_cb(void *opaque)
@@ -30,6 +31,81 @@ static void fsdev_throttle_write_timer_cb(void *opaque)
 qemu_co_enter_next(&fst->throttled_reqs[true]);
 }
 
+void fsdev_set_io_throttle(IOThrottle *arg, FsThrottle *fst, Error **errp)
+{
+ThrottleConfig cfg;
+
+throttle_set_io_limits(&cfg, arg);
+
+if (throttle_is_valid(&cfg, errp)) {
+fst->cfg = cfg;
+fsdev_throttle_init(fst);
+}
+}
+
+void fsdev_get_io_throttle(FsThrottle *fst, IOThrottle **fs9pcfg,
+   char *fsdevice, Error **errp)
+{
+
+ThrottleConfig cfg = fst->cfg;
+IOThrottle *fscfg = g_malloc0(sizeof(*fscfg));
+
+fscfg->has_id = true;
+fscfg->id = g_strdup(fsdevice);
+fscfg->bps = cfg.buckets[THROTTLE_BPS_TOTAL].avg;
+fscfg->bps_rd = cfg.buckets[THROTTLE_BPS_READ].avg;
+fscfg->bps_wr = cfg.buckets[THROTTLE_BPS_WRITE].avg;
+
+fscfg->iops = cfg.buckets[THROTTLE_OPS_TOTAL].avg;
+fscfg->iops_rd = cfg.buckets[THROTTLE_OPS_READ].avg;
+fscfg->iops_wr = cfg.buckets[THROTTLE_OPS_WRITE].avg;
+
+fscfg->has_bps_max = cfg.buckets[THROTTLE_BPS_TOTAL].max;
+fscfg->bps_max = cfg.buckets[THROTTLE_BPS_TOTAL].max;
+fscfg->has_bps_rd_max  = cfg.buckets[THROTTLE_BPS_READ].max;
+fscfg->bps_rd_max  = cfg.buckets[THROTTLE_BPS_READ].max;
+fscfg->has_bps_wr_max  = cfg.buckets[THROTTLE_BPS_WRITE].max;
+fscfg->bps_wr_max  = cfg.buckets[THROTTLE_BPS_WRITE].max;
+
+fscfg->has_iops_max= cfg.buckets[THROTTLE_OPS_TOTAL].max;
+fscfg->iops_max= cfg.buckets[THROTTLE_OPS_TOTAL].max;
+fscfg->has_iops_rd_max = cfg.buckets[THROTTLE_OPS_READ].max;
+fscfg->iops_rd_max = cfg.buckets[THROTTLE_OPS_READ].max;
+fscfg->has_iops_wr_max = cfg.buckets[THROTTLE_OPS_WRITE].max;
+fscfg->iops_wr_max = cfg.buckets[THROTTLE_OPS_WRITE].max;
+
+fscfg->has_bps_max_length = fscfg->has_bps_max;
+fscfg->bps_max_length =
+ cfg.buckets[THROTTLE_BPS_TOTAL].burst_length;
+fscfg->has_bps_rd_max_length  = fscfg->has_bps_rd_max;
+fscfg->bps_rd_max_length  =
+ cfg.buckets[THROTTLE_BPS_READ].burst_length;
+fscfg->has_bps_wr_max_length  = fscfg->has_bps_wr_max;
+fscfg->bps_wr_max_length  =
+ cfg.buckets[THROTTLE_BPS_WRITE].burst_length;
+
+fscfg->has_iops_max_length= fscfg->has_iops_max;
+fscfg->iops_max_length=
+ cfg.buckets[THROTTLE_OPS_TOTAL].burst_length;
+fscfg->has_iops_rd_max_length = fscfg->has_iops_rd_max;
+fscfg->iops_rd_max_length =
+ cfg.buckets[THROTTLE_OPS_READ].burst_length;
+fscfg->has_iops_wr_max_length = fscfg->has_iops_wr_max;
+fscfg->iops_wr_max_length =
+ cfg.buckets[THROTTLE_OPS_WRITE].burst_length;
+
+fs

[Qemu-devel] [PATCH V8 1/6] throttle: factor out duplicate code

2017-08-07 Thread Pradeep Jagadeesh

This patch factor out the duplicate throttle code that was present in
block and fsdev devices.

Signed-off-by: Pradeep Jagadeesh 
Reviewed-by: Alberto Garcia 
---
 blockdev.c  | 44 +--
 fsdev/qemu-fsdev-throttle.c | 44 ++-
 include/qemu/throttle-options.h |  4 
 include/qemu/throttle.h |  4 ++--
 include/qemu/typedefs.h |  1 +
 util/throttle.c | 51 +
 6 files changed, 61 insertions(+), 87 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 02cd69b..780ae58 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -388,49 +388,7 @@ static void extract_common_blockdev_options(QemuOpts 
*opts, int *bdrv_flags,
 }
 
 if (throttle_cfg) {
-throttle_config_init(throttle_cfg);
-throttle_cfg->buckets[THROTTLE_BPS_TOTAL].avg =
-qemu_opt_get_number(opts, "throttling.bps-total", 0);
-throttle_cfg->buckets[THROTTLE_BPS_READ].avg  =
-qemu_opt_get_number(opts, "throttling.bps-read", 0);
-throttle_cfg->buckets[THROTTLE_BPS_WRITE].avg =
-qemu_opt_get_number(opts, "throttling.bps-write", 0);
-throttle_cfg->buckets[THROTTLE_OPS_TOTAL].avg =
-qemu_opt_get_number(opts, "throttling.iops-total", 0);
-throttle_cfg->buckets[THROTTLE_OPS_READ].avg =
-qemu_opt_get_number(opts, "throttling.iops-read", 0);
-throttle_cfg->buckets[THROTTLE_OPS_WRITE].avg =
-qemu_opt_get_number(opts, "throttling.iops-write", 0);
-
-throttle_cfg->buckets[THROTTLE_BPS_TOTAL].max =
-qemu_opt_get_number(opts, "throttling.bps-total-max", 0);
-throttle_cfg->buckets[THROTTLE_BPS_READ].max  =
-qemu_opt_get_number(opts, "throttling.bps-read-max", 0);
-throttle_cfg->buckets[THROTTLE_BPS_WRITE].max =
-qemu_opt_get_number(opts, "throttling.bps-write-max", 0);
-throttle_cfg->buckets[THROTTLE_OPS_TOTAL].max =
-qemu_opt_get_number(opts, "throttling.iops-total-max", 0);
-throttle_cfg->buckets[THROTTLE_OPS_READ].max =
-qemu_opt_get_number(opts, "throttling.iops-read-max", 0);
-throttle_cfg->buckets[THROTTLE_OPS_WRITE].max =
-qemu_opt_get_number(opts, "throttling.iops-write-max", 0);
-
-throttle_cfg->buckets[THROTTLE_BPS_TOTAL].burst_length =
-qemu_opt_get_number(opts, "throttling.bps-total-max-length", 1);
-throttle_cfg->buckets[THROTTLE_BPS_READ].burst_length  =
-qemu_opt_get_number(opts, "throttling.bps-read-max-length", 1);
-throttle_cfg->buckets[THROTTLE_BPS_WRITE].burst_length =
-qemu_opt_get_number(opts, "throttling.bps-write-max-length", 1);
-throttle_cfg->buckets[THROTTLE_OPS_TOTAL].burst_length =
-qemu_opt_get_number(opts, "throttling.iops-total-max-length", 1);
-throttle_cfg->buckets[THROTTLE_OPS_READ].burst_length =
-qemu_opt_get_number(opts, "throttling.iops-read-max-length", 1);
-throttle_cfg->buckets[THROTTLE_OPS_WRITE].burst_length =
-qemu_opt_get_number(opts, "throttling.iops-write-max-length", 1);
-
-throttle_cfg->op_size =
-qemu_opt_get_number(opts, "throttling.iops-size", 0);
-
+throttle_parse_options(throttle_cfg, opts);
 if (!throttle_is_valid(throttle_cfg, errp)) {
 return;
 }
diff --git a/fsdev/qemu-fsdev-throttle.c b/fsdev/qemu-fsdev-throttle.c
index 49eebb5..0e6fb86 100644
--- a/fsdev/qemu-fsdev-throttle.c
+++ b/fsdev/qemu-fsdev-throttle.c
@@ -16,6 +16,7 @@
 #include "qemu/error-report.h"
 #include "qemu-fsdev-throttle.h"
 #include "qemu/iov.h"
+#include "qemu/throttle-options.h"
 
 static void fsdev_throttle_read_timer_cb(void *opaque)
 {
@@ -31,48 +32,7 @@ static void fsdev_throttle_write_timer_cb(void *opaque)
 
 void fsdev_throttle_parse_opts(QemuOpts *opts, FsThrottle *fst, Error **errp)
 {
-throttle_config_init(&fst->cfg);
-fst->cfg.buckets[THROTTLE_BPS_TOTAL].avg =
-qemu_opt_get_number(opts, "throttling.bps-total", 0);
-fst->cfg.buckets[THROTTLE_BPS_READ].avg  =
-qemu_opt_get_number(opts, "throttling.bps-read", 0);
-fst->cfg.buckets[THROTTLE_BPS_WRITE].avg =
-qemu_opt_get_number(opts, "throttling.bps-write", 0);
-fst->cfg.buckets[THROTTLE_OPS_TOTAL].avg =
-qemu_opt_get_number(opts, "throttling.iops-total", 0);
-fst->cfg.buckets[THROTTLE_OPS_READ].avg =
-qemu_opt_get_number(opts, "throttling.iops-read", 0);
-fst->cfg.buckets[THROTTLE_OPS_WRITE].avg =
-qemu_opt_get_number(opts, "throttling.iops-write", 0);
-
-fst->cfg.buckets[THROTTLE_BPS_TOTAL].max =
-qemu_opt_get_number(opts, "throttling.bps-total-max", 0);
-fst->cfg.buckets[THROTTLE_BPS_READ].max  =
-qemu_opt_get_number(opts, "throttling.bps-read-max", 0);
-fst->cfg.buckets[THROTTLE_BPS_WRITE].max =
-qe

[Qemu-devel] [PATCH v2 for-2.10 0/4] VHDX bugfixes

2017-08-07 Thread Jeff Cody


Some VHDX bug fixes, including:

1. Checking bdrv_getlength(), bdrv_flush(), and bdrv_truncate() return values

Changes from v1->v2:

git-backport-diff -r qemu/master.. -u vhdx-fixes-v1
Key:
[] : patches are identical
[] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/4:[0013] [FC] 'block/vhdx: check error return of bdrv_getlength()'
   
Some code cleanup / movement (Thanks Kevin)

002/4:[0002] [FC] 'block/vhdx: check for offset overflow to bdrv_truncate()'
   
Use QEMU_ALIGN_UP (Thanks Eric)

003/4:[down] 'block/vhdx: check error return of bdrv_flush()'

New, suggested by Kevin.

004/4:[down] 'block/vhdx: check error return of bdrv_truncate()'

New.

1. Check for error when calling bdrv_getlength() [Markus]

2. Check for overflow in offset prior to calling bdrv_truncate().

Jeff Cody (4):
  block/vhdx: check error return of bdrv_getlength()
  block/vhdx: check for offset overflow to bdrv_truncate()
  block/vhdx: check error return of bdrv_flush()
  block/vhdx: check error return of bdrv_truncate()

 block/vhdx-log.c | 52 ++--
 block/vhdx.c | 12 +++-
 2 files changed, 53 insertions(+), 11 deletions(-)

-- 
2.9.4

[Qemu-devel] [PATCH V8 6/6] fsdev: hmp interface for throttling

2017-08-07 Thread Pradeep Jagadeesh

This patch introduces hmp interfaces for the fsdev
devices.

Signed-off-by: Pradeep Jagadeesh 
---
 hmp-commands-info.hx | 18 +++
 hmp-commands.hx  | 19 
 hmp.c| 62 
 hmp.h|  4 
 4 files changed, 103 insertions(+)

diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index d9df238..07ed338 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -84,6 +84,24 @@ STEXI
 Show block device statistics.
 ETEXI
 
+#if defined(CONFIG_VIRTFS)
+
+{
+.name   = "fsdev_iothrottle",
+.args_type  = "",
+.params = "",
+.help   = "show fsdev iothrottle information",
+.cmd= hmp_info_fsdev_iothrottle,
+},
+
+#endif
+
+STEXI
+@item info fsdev_iothrottle
+@findex fsdev_iothrottle
+Show fsdev device throttle info.
+ETEXI
+
 {
 .name   = "block-jobs",
 .args_type  = "",
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 1941e19..aef9f79 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1680,6 +1680,25 @@ STEXI
 Change I/O throttle limits for a block drive to @var{bps} @var{bps_rd} 
@var{bps_wr} @var{iops} @var{iops_rd} @var{iops_wr}
 ETEXI
 
+#if defined(CONFIG_VIRTFS)
+
+{
+.name   = "fsdev_set_io_throttle",
+.args_type  = 
"device:B,bps:l,bps_rd:l,bps_wr:l,iops:l,iops_rd:l,iops_wr:l",
+.params = "device bps bps_rd bps_wr iops iops_rd iops_wr",
+.help   = "change I/O throttle limits for a fs devices",
+.cmd= hmp_fsdev_set_io_throttle,
+},
+
+#endif
+
+STEXI
+@item fsdev_set_io_throttle @var{device} @var{bps} @var{bps_rd} @var{bps_wr} 
@var{iops} @var{iops_rd} @var{iops_wr}
+@findex fsdev_set_io_throttle
+Change I/O throttle limits for a fs devices to @var{bps} @var{bps_rd} 
@var{bps_wr} @var{iops} @var{iops_rd} @var{iops_wr}
+ETEXI
+
+
 {
 .name   = "set_password",
 .args_type  = "protocol:s,password:s,connected:s?",
diff --git a/hmp.c b/hmp.c
index 2dbfb80..712c6a3 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1783,6 +1783,68 @@ void hmp_block_set_io_throttle(Monitor *mon, const QDict 
*qdict)
 hmp_handle_error(mon, &err);
 }
 
+#ifdef CONFIG_VIRTFS
+
+void hmp_fsdev_set_io_throttle(Monitor *mon, const QDict *qdict)
+{
+Error *err = NULL;
+IOThrottle throttle = {
+.has_id = true,
+.id = (char *) qdict_get_str(qdict, "device"),
+};
+
+hmp_initialize_io_throttle(&throttle, qdict);
+qmp_fsdev_set_io_throttle(&throttle, &err);
+hmp_handle_error(mon, &err);
+}
+
+static void print_fsdev_throttle_config(Monitor *mon, IOThrottle *fscfg,
+   Error *err)
+{
+monitor_printf(mon, "%s", fscfg->id);
+monitor_printf(mon, "I/O throttling:"
+   " bps=%" PRId64
+   " bps_rd=%" PRId64  " bps_wr=%" PRId64
+   " bps_max=%" PRId64
+   " bps_rd_max=%" PRId64
+   " bps_wr_max=%" PRId64
+   " iops=%" PRId64 " iops_rd=%" PRId64
+   " iops_wr=%" PRId64
+   " iops_max=%" PRId64
+   " iops_rd_max=%" PRId64
+   " iops_wr_max=%" PRId64
+   " iops_size=%" PRId64
+   "\n",
+   fscfg->bps,
+   fscfg->bps_rd,
+   fscfg->bps_wr,
+   fscfg->bps_max,
+   fscfg->bps_rd_max,
+   fscfg->bps_wr_max,
+   fscfg->iops,
+   fscfg->iops_rd,
+   fscfg->iops_wr,
+   fscfg->iops_max,
+   fscfg->iops_rd_max,
+   fscfg->iops_wr_max,
+   fscfg->iops_size);
+   hmp_handle_error(mon, &err);
+}
+
+void hmp_info_fsdev_iothrottle(Monitor *mon, const QDict *qdict)
+{
+Error *err = NULL;
+IOThrottleList *fsdev_list, *info;
+fsdev_list = qmp_query_fsdev_io_throttle(&err);
+
+for (info = fsdev_list; info; info = info->next) {
+print_fsdev_throttle_config(mon, info->value, err);
+}
+qapi_free_IOThrottleList(fsdev_list);
+}
+
+#endif
+
 void hmp_block_stream(Monitor *mon, const QDict *qdict)
 {
 Error *error = NULL;
diff --git a/hmp.h b/hmp.h
index 1ff4552..d700d7d 100644
--- a/hmp.h
+++ b/hmp.h
@@ -81,6 +81,10 @@ void hmp_set_password(Monitor *mon, const QDict *qdict);
 void hmp_expire_password(Monitor *mon, const QDict *qdict);
 void hmp_eject(Monitor *mon, const QDict *qdict);
 void hmp_change(Monitor *mon, const QDict *qdict);
+#ifdef CONFIG_VIRTFS
+void hmp_fsdev_set_io_throttle(Monitor *mon, const QDict *qdict);
+void hmp_info_fsdev_iothrottle(Monitor *mon, const QDict *qdict);
+#endif
 void hmp_block_set_io_throttle(Monitor *mon, const QDict *qdict);
 void hmp_block_stream(Monitor *mon, const QDict *qdict);
 void hmp_block_job_set_speed(Monitor *mon, const QD

[Qemu-devel] [PATCH v2 for-2.10 2/4] block/vhdx: check for offset overflow to bdrv_truncate()

2017-08-07 Thread Jeff Cody

VHDX uses uint64_t types for most offsets, following the VHDX spec.
However, bdrv_truncate() takes an int64_t value for the truncating
offset.  Check for overflow before calling bdrv_truncate().

While we are here, replace the bit shifting with QEMU_ALIGN_UP as well.

N.B.: For a compliant image this is not an issue, as the maximum VHDX
image size is defined per the spec to be 64TB.

Reviewed-by: Eric Blake 
Reviewed-by: Kevin Wolf 
Signed-off-by: Jeff Cody 
---
 block/vhdx-log.c | 6 +-
 block/vhdx.c | 3 +++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/block/vhdx-log.c b/block/vhdx-log.c
index 2e26fd4..9597223 100644
--- a/block/vhdx-log.c
+++ b/block/vhdx-log.c
@@ -553,7 +553,11 @@ static int vhdx_log_flush(BlockDriverState *bs, 
BDRVVHDXState *s,
 new_file_size = desc_entries->hdr.last_file_offset;
 if (new_file_size % (1024*1024)) {
 /* round up to nearest 1MB boundary */
-new_file_size = ((new_file_size >> 20) + 1) << 20;
+new_file_size = QEMU_ALIGN_UP(new_file_size, MiB);
+if (new_file_size > INT64_MAX) {
+ret = -EINVAL;
+goto exit;
+}
 bdrv_truncate(bs->file, new_file_size, PREALLOC_MODE_OFF, 
NULL);
 }
 }
diff --git a/block/vhdx.c b/block/vhdx.c
index 37224b8..7ae4589 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -1177,6 +1177,9 @@ static int vhdx_allocate_block(BlockDriverState *bs, 
BDRVVHDXState *s,
 
 /* per the spec, the address for a block is in units of 1MB */
 *new_offset = ROUND_UP(*new_offset, 1024 * 1024);
+if (*new_offset > INT64_MAX) {
+return -EINVAL;
+}
 
 return bdrv_truncate(bs->file, *new_offset + s->block_size,
  PREALLOC_MODE_OFF, NULL);
-- 
2.9.4

Re: [Qemu-devel] [PATCH V8 2/6] qmp: Create IOThrottle structure

2017-08-07 Thread Eric Blake

On 08/07/2017 07:37 AM, Pradeep Jagadeesh wrote:
> This patch enables qmp interfaces for the fsdev
> devices. This provides two interfaces one
> for querying info of all the fsdev devices. The second one
> to set the IO limits for the required fsdev device.
> 
> Signed-off-by: Pradeep Jagadeesh 
> Reviewed-by: Greg Kurz 
> Reviewed-by: Eric Blake 
> Reviewed-by: Alberto Garcia 
> ---
>  qapi/block-core.json | 76 ++---
>  qapi/iothrottle.json | 88 
> 
>  2 files changed, 91 insertions(+), 73 deletions(-)
>  create mode 100644 qapi/iothrottle.json

I'm wondering how much of this overlaps or even duplicates Manos'
efforts on making throttling a separate filter BDS.


> +++ b/qapi/iothrottle.json
> @@ -0,0 +1,88 @@
> +# -*- Mode: Python -*-
> +
> +##
> +# == QAPI IOThrottle definitions
> +##
> +
> +##
> +# @IOThrottle:
> +#
> +# A set of parameters describing IO throttling
> +#
> +# @id: The name or QOM path of the guest device (since: 2.8)

> +# Since: 2.10

Also, we've missed freeze for 2.10; this refactoring now belongs in 2.11.

> +##
> +{ 'struct': 'IOThrottle',
> +  'data': { '*id': 'str', 'bps': 'int', 'bps_rd': 'int',
> +'bps_wr': 'int', 'iops': 'int', 'iops_rd': 'int', 'iops_wr': 
> 'int',
> +'*bps_max': 'int', '*bps_rd_max': 'int',
> +'*bps_wr_max': 'int', '*iops_max': 'int',
> +'*iops_rd_max': 'int', '*iops_wr_max': 'int',
> +'*bps_max_length': 'int', '*bps_rd_max_length': 'int',
> +'*bps_wr_max_length': 'int', '*iops_max_length': 'int',
> +'*iops_rd_max_length': 'int', '*iops_wr_max_length': 'int',
> +'*iops_size': 'int' } }
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH v2 for-2.10 1/4] block/vhdx: check error return of bdrv_getlength()

2017-08-07 Thread Jeff Cody

Calls to bdrv_getlength() were not checking for error.  In vhdx.c, this
can lead to truncating an image file, so it is a definite bug.  In
vhdx-log.c, the path for improper behavior is less clear, but it is best
to check in any case.

Some minor code movement of the log_guid intialization, as well.

Reported-by: Markus Armbruster 
Reviewed-by: Eric Blake 
Signed-off-by: Jeff Cody 
---
 block/vhdx-log.c | 23 ++-
 block/vhdx.c |  9 -
 2 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/block/vhdx-log.c b/block/vhdx-log.c
index 01278f3..2e26fd4 100644
--- a/block/vhdx-log.c
+++ b/block/vhdx-log.c
@@ -491,6 +491,7 @@ static int vhdx_log_flush(BlockDriverState *bs, 
BDRVVHDXState *s,
 uint32_t cnt, sectors_read;
 uint64_t new_file_size;
 void *data = NULL;
+int64_t file_length;
 VHDXLogDescEntries *desc_entries = NULL;
 VHDXLogEntryHeader hdr_tmp = { 0 };
 
@@ -510,10 +511,15 @@ static int vhdx_log_flush(BlockDriverState *bs, 
BDRVVHDXState *s,
 if (ret < 0) {
 goto exit;
 }
+file_length = bdrv_getlength(bs->file->bs);
+if (file_length < 0) {
+ret = file_length;
+goto exit;
+}
 /* if the log shows a FlushedFileOffset larger than our current file
  * size, then that means the file has been truncated / corrupted, and
  * we must refused to open it / use it */
-if (hdr_tmp.flushed_file_offset > bdrv_getlength(bs->file->bs)) {
+if (hdr_tmp.flushed_file_offset > file_length) {
 ret = -EINVAL;
 goto exit;
 }
@@ -543,7 +549,7 @@ static int vhdx_log_flush(BlockDriverState *bs, 
BDRVVHDXState *s,
 goto exit;
 }
 }
-if (bdrv_getlength(bs->file->bs) < desc_entries->hdr.last_file_offset) 
{
+if (file_length < desc_entries->hdr.last_file_offset) {
 new_file_size = desc_entries->hdr.last_file_offset;
 if (new_file_size % (1024*1024)) {
 /* round up to nearest 1MB boundary */
@@ -851,6 +857,7 @@ static int vhdx_log_write(BlockDriverState *bs, 
BDRVVHDXState *s,
 uint32_t partial_sectors = 0;
 uint32_t bytes_written = 0;
 uint64_t file_offset;
+int64_t file_length;
 VHDXHeader *header;
 VHDXLogEntryHeader new_hdr;
 VHDXLogDescriptor *new_desc = NULL;
@@ -904,6 +911,12 @@ static int vhdx_log_write(BlockDriverState *bs, 
BDRVVHDXState *s,
 
 sectors += partial_sectors;
 
+file_length = bdrv_getlength(bs->file->bs);
+if (file_length < 0) {
+ret = file_length;
+goto exit;
+}
+
 /* sectors is now how many sectors the data itself takes, not
  * including the header and descriptor metadata */
 
@@ -913,11 +926,11 @@ static int vhdx_log_write(BlockDriverState *bs, 
BDRVVHDXState *s,
 .sequence_number = s->log.sequence,
 .descriptor_count= sectors,
 .reserved= 0,
-.flushed_file_offset = bdrv_getlength(bs->file->bs),
-.last_file_offset= bdrv_getlength(bs->file->bs),
+.flushed_file_offset = file_length,
+.last_file_offset= file_length,
+.log_guid= header->log_guid,
   };
 
-new_hdr.log_guid = header->log_guid;
 
 desc_sectors = vhdx_compute_desc_sectors(new_hdr.descriptor_count);
 
diff --git a/block/vhdx.c b/block/vhdx.c
index a9cecd2..37224b8 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -1166,7 +1166,14 @@ exit:
 static int vhdx_allocate_block(BlockDriverState *bs, BDRVVHDXState *s,
 uint64_t *new_offset)
 {
-*new_offset = bdrv_getlength(bs->file->bs);
+int64_t current_len;
+
+current_len = bdrv_getlength(bs->file->bs);
+if (current_len < 0) {
+return current_len;
+}
+
+*new_offset = current_len;
 
 /* per the spec, the address for a block is in units of 1MB */
 *new_offset = ROUND_UP(*new_offset, 1024 * 1024);
-- 
2.9.4

[Qemu-devel] [PATCH v2 for-2.10 4/4] block/vhdx: check error return of bdrv_truncate()

2017-08-07 Thread Jeff Cody

Signed-off-by: Jeff Cody 
---
 block/vhdx-log.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/block/vhdx-log.c b/block/vhdx-log.c
index a27dc05..14b724e 100644
--- a/block/vhdx-log.c
+++ b/block/vhdx-log.c
@@ -558,7 +558,11 @@ static int vhdx_log_flush(BlockDriverState *bs, 
BDRVVHDXState *s,
 ret = -EINVAL;
 goto exit;
 }
-bdrv_truncate(bs->file, new_file_size, PREALLOC_MODE_OFF, 
NULL);
+ret = bdrv_truncate(bs->file, new_file_size, PREALLOC_MODE_OFF,
+NULL);
+if (ret < 0) {
+goto exit;
+}
 }
 }
 qemu_vfree(desc_entries);
-- 
2.9.4

[Qemu-devel] [PATCH v2 for-2.10 3/4] block/vhdx: check error return of bdrv_flush()

2017-08-07 Thread Jeff Cody

Reported-by: Kevin Wolf 
Signed-off-by: Jeff Cody 
---
 block/vhdx-log.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/block/vhdx-log.c b/block/vhdx-log.c
index 9597223..a27dc05 100644
--- a/block/vhdx-log.c
+++ b/block/vhdx-log.c
@@ -565,7 +565,10 @@ static int vhdx_log_flush(BlockDriverState *bs, 
BDRVVHDXState *s,
 desc_entries = NULL;
 }
 
-bdrv_flush(bs);
+ret = bdrv_flush(bs);
+if (ret < 0) {
+goto exit;
+}
 /* once the log is fully flushed, indicate that we have an empty log
  * now.  This also sets the log guid to 0, to indicate an empty log */
 vhdx_log_reset(bs, s);
@@ -1039,7 +1042,11 @@ int vhdx_log_write_and_flush(BlockDriverState *bs, 
BDRVVHDXState *s,
 
 /* Make sure data written (new and/or changed blocks) is stable
  * on disk, before creating log entry */
-bdrv_flush(bs);
+ret = bdrv_flush(bs);
+if (ret < 0) {
+goto exit;
+}
+
 ret = vhdx_log_write(bs, s, data, length, offset);
 if (ret < 0) {
 goto exit;
@@ -1047,7 +1054,11 @@ int vhdx_log_write_and_flush(BlockDriverState *bs, 
BDRVVHDXState *s,
 logs.log = s->log;
 
 /* Make sure log is stable on disk */
-bdrv_flush(bs);
+ret = bdrv_flush(bs);
+if (ret < 0) {
+goto exit;
+}
+
 ret = vhdx_log_flush(bs, s, &logs);
 if (ret < 0) {
 goto exit;
-- 
2.9.4

Re: [Qemu-devel] [PATCH V8 5/6] fsdev: QMP interface for throttling

2017-08-07 Thread Eric Blake

On 08/07/2017 07:37 AM, Pradeep Jagadeesh wrote:
> This patch introduces qmp interfaces for the fsdev
> devices. This provides two interfaces one
> for querying info of all the fsdev devices. The second one
> to set the IO limits for the required fsdev device.
> 
> Signed-off-by: Pradeep Jagadeesh 
> ---
>  Makefile|  4 +++
>  fsdev/qemu-fsdev-dummy.c| 11 ++
>  fsdev/qemu-fsdev-throttle.c | 76 
>  fsdev/qemu-fsdev-throttle.h |  8 +++--
>  fsdev/qemu-fsdev.c  | 38 
>  monitor.c   |  5 +++
>  qapi-schema.json|  3 ++
>  qapi/fsdev.json | 84 
> +
>  qmp.c   | 14 
>  9 files changed, 241 insertions(+), 2 deletions(-)
>  create mode 100644 qapi/fsdev.json
> 

> +##
> +# @fsdev-set-io-throttle:
> +#
> +# Change I/O limits for a 9p/fsdev device.
> +#
> +# I/O limits can be enabled by setting throttle value to non-zero number.
> +#
> +# I/O limits can be disabled by setting all throttle values to 0.
> +#
> +# Returns: Nothing on success
> +#  If @device is not a valid fsdev device, GenericError
> +#
> +# Since: 2.10
> +#

2.11, now.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH for-2.10] quorum: Set sectors-count to 0 when reporting a flush error

2017-08-07 Thread Eric Blake

On 08/07/2017 07:36 AM, Alberto Garcia wrote:
> The QUORUM_REPORT_BAD event has fields to report the sector in which
> the error was detected and the number of affected sectors starting
> from that one. This is important for read and write errors, but not
> for flush errors.
> 
> For flush errors the current code reports the total size of the disk
> image. That is however not useful information in this case. Moreover,
> the bdrv_getlength() call can fail, and there's no good way of
> handling that failure.
> 
> Since we're reporting useless information and we cannot even guarantee
> to do it in a consistent way, this patch changes the code to report 0
> instead in all cases.
> 
> Reported-by: Markus Armbruster 
> Signed-off-by: Alberto Garcia 
> ---
>  block/quorum.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH V8 5/6] fsdev: QMP interface for throttling

2017-08-07 Thread Pradeep Jagadeesh


On 8/7/2017 2:44 PM, Eric Blake wrote:

On 08/07/2017 07:37 AM, Pradeep Jagadeesh wrote:

This patch introduces qmp interfaces for the fsdev
devices. This provides two interfaces one
for querying info of all the fsdev devices. The second one
to set the IO limits for the required fsdev device.

Signed-off-by: Pradeep Jagadeesh 
---
 Makefile|  4 +++
 fsdev/qemu-fsdev-dummy.c| 11 ++
 fsdev/qemu-fsdev-throttle.c | 76 
 fsdev/qemu-fsdev-throttle.h |  8 +++--
 fsdev/qemu-fsdev.c  | 38 
 monitor.c   |  5 +++
 qapi-schema.json|  3 ++
 qapi/fsdev.json | 84 +
 qmp.c   | 14 
 9 files changed, 241 insertions(+), 2 deletions(-)
 create mode 100644 qapi/fsdev.json




+##
+# @fsdev-set-io-throttle:
+#
+# Change I/O limits for a 9p/fsdev device.
+#
+# I/O limits can be enabled by setting throttle value to non-zero number.
+#
+# I/O limits can be disabled by setting all throttle values to 0.
+#
+# Returns: Nothing on success
+#  If @device is not a valid fsdev device, GenericError
+#
+# Since: 2.10
+#





2.11, now.


Hmm!.

-Pradeep

Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH] booke206: fix MAS update on tlb miss

2017-08-07 Thread KONRAD Frederic




On 08/03/2017 03:13 PM, Thomas Huth wrote:

On 03.08.2017 14:08, KONRAD Frederic wrote:



On 08/03/2017 01:37 PM, Thomas Huth wrote:

On 01.08.2017 10:44, KONRAD Frederic wrote:

When a tlb instruction miss happen, rw is set to 0 at the bottom
of cpu_ppc_handle_mmu_fault which cause the MAS update function to miss
the SAS and TS bit in MAS6, MAS1 in booke206_update_mas_tlb_miss.

Just calling booke206_update_mas_tlb_miss with rw = 2 solve the issue.

Signed-off-by: KONRAD Frederic 
---
   target/ppc/mmu_helper.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/ppc/mmu_helper.c b/target/ppc/mmu_helper.c
index b7b9088..f06b938 100644
--- a/target/ppc/mmu_helper.c
+++ b/target/ppc/mmu_helper.c
@@ -1551,7 +1551,7 @@ static int cpu_ppc_handle_mmu_fault(CPUPPCState
*env, target_ulong address,
   env->spr[SPR_40x_ESR] = 0x;
   break;
   case POWERPC_MMU_BOOKE206:
-booke206_update_mas_tlb_miss(env, address, rw);
+booke206_update_mas_tlb_miss(env, address, 2);



Hi Thomas,


Couldn't that code path be called for normal data read miss (instead of
instruction miss), too?



I don't think so because we have access_type == ACCESS_CODE and
the code in cpu_ppc_handle_mmu_fault explicitely split the CODE
and DATA cases.


Ah, right, I missed that if-statement. So never mind about my comment!


Anyway, could we please use MMU_INST_FETCH instead of magic values like
2 here?


I agree it's not nice to have a magic value like this.. But it's
used all over the code there and david took the patch.

So I suggest I send a second patch to fix all the instances of
that magic value.


Sounds like a good idea!


Hi Thomas,

Looking more in details at this magic value, it seems rw is
not considered as an access type but more as an is_write boolean.

The whole mmu code use "int access_type" which is kind of
redundant / confusing if we change rw to be MMUAccessType.

Fred



  Thanks,
   Thomas

Re: [Qemu-devel] [PATCH v2 for-2.10 3/4] block/vhdx: check error return of bdrv_flush()

2017-08-07 Thread Eric Blake

On 08/07/2017 07:38 AM, Jeff Cody wrote:
> Reported-by: Kevin Wolf 
> Signed-off-by: Jeff Cody 
> ---
>  block/vhdx-log.c | 17 ++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v2 for-2.10 4/4] block/vhdx: check error return of bdrv_truncate()

2017-08-07 Thread Eric Blake

On 08/07/2017 07:38 AM, Jeff Cody wrote:
> Signed-off-by: Jeff Cody 
> ---
>  block/vhdx-log.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 

Reviewed-by: Eric Blake 

> diff --git a/block/vhdx-log.c b/block/vhdx-log.c
> index a27dc05..14b724e 100644
> --- a/block/vhdx-log.c
> +++ b/block/vhdx-log.c
> @@ -558,7 +558,11 @@ static int vhdx_log_flush(BlockDriverState *bs, 
> BDRVVHDXState *s,
>  ret = -EINVAL;
>  goto exit;
>  }
> -bdrv_truncate(bs->file, new_file_size, PREALLOC_MODE_OFF, 
> NULL);
> +ret = bdrv_truncate(bs->file, new_file_size, 
> PREALLOC_MODE_OFF,
> +NULL);
> +if (ret < 0) {
> +goto exit;
> +}
>  }
>  }
>  qemu_vfree(desc_entries);
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 06/17] block/nbd-client: fix nbd_read_reply_entry

2017-08-07 Thread Vladimir Sementsov-Ogievskiy


07.08.2017 14:52, Eric Blake wrote:

On 08/04/2017 10:14 AM, Vladimir Sementsov-Ogievskiy wrote:

Set reply.handle to 0 on error path to prevent normal path of
nbd_co_receive_reply.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd-client.c | 1 +
  1 file changed, 1 insertion(+)

Can you document a case where not fixing this would be an observable bug
(even if it requires using gdb and single-stepping between client and
server to make what is otherwise a racy situation easy to see)?  I'm
trying to figure out if this is 2.10 material.


it is simple enough:

run qemu-nbd in gdb, set break on nbd_send_reply, and when it shoot s,
next up to "stl_be_p(buf, NBD_REPLY_MAGIC);" and after it do "call 
stl_be_p(buf, 1000)"


run qemu-io with some read in gdb, set break on
br block/nbd-client.c:83

( it is break; after failed nbd_receive_reply call)

and on
br block/nbd-client.c:170

(it is in nbd_co_receive_reply after yield)

on first break we will be sure that  nbd_receive_reply failed,
on second we will be sure by
(gdb) p s->reply
$1 = {handle = 93825000680144, error = 0}
(gdb) p request->handle
$2 = 93825000680144

that we are on normal receiving path.




diff --git a/block/nbd-client.c b/block/nbd-client.c
index dc19894a7c..0c88d84de6 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -107,6 +107,7 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
  qemu_coroutine_yield();
  }
  
+s->reply.handle = 0;

  nbd_recv_coroutines_enter_all(s);
  s->read_reply_co = NULL;
  }



--
Best regards,
Vladimir

Re: [Qemu-devel] About virtio device hotplug in Q35! 【外域邮件.谨慎查阅】

2017-08-07 Thread Bob Chen

Bad news... The performance had dropped dramatically when using emulated
switches.

I was referring to the PCIe doc at
https://github.com/qemu/qemu/blob/master/docs/pcie.txt

# qemu-system-x86_64_2.6.2 -enable-kvm -cpu host,kvm=off -machine
q35,accel=kvm -nodefaults -nodefconfig \
-device ioh3420,id=root_port1,chassis=1,slot=1,bus=pcie.0 \
-device x3130-upstream,id=upstream_port1,bus=root_port1 \
-device
xio3130-downstream,id=downstream_port1,bus=upstream_port1,chassis=11,slot=11
\
-device
xio3130-downstream,id=downstream_port2,bus=upstream_port1,chassis=12,slot=12
\
-device vfio-pci,host=08:00.0,multifunction=on,bus=downstream_port1 \
-device vfio-pci,host=09:00.0,multifunction=on,bus=downstream_port2 \
-device ioh3420,id=root_port2,chassis=2,slot=2,bus=pcie.0 \
-device x3130-upstream,id=upstream_port2,bus=root_port2 \
-device
xio3130-downstream,id=downstream_port3,bus=upstream_port2,chassis=21,slot=21
\
-device
xio3130-downstream,id=downstream_port4,bus=upstream_port2,chassis=22,slot=22
\
-device vfio-pci,host=89:00.0,multifunction=on,bus=downstream_port3 \
-device vfio-pci,host=8a:00.0,multifunction=on,bus=downstream_port4 \
...


Not 8 GPUs this time, only 4.

*1. Attached to pcie bus directly (former situation):*

Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D 0  1  2  3
 0 420.93  10.03  11.07  11.09
 1  10.04 425.05  11.08  10.97
 2  11.17  11.17 425.07  10.07
 3  11.25  11.25  10.07 423.64
Unidirectional P2P=Enabled Bandwidth Matrix (GB/s)
   D\D 0  1  2  3
 0 425.98  10.03  11.07  11.09
 1   9.99 426.43  11.07  11.07
 2  11.04  11.20 425.98   9.89
 3  11.21  11.21  10.06 425.97
Bidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D 0  1  2  3
 0 430.67  10.45  19.59  19.58
 1  10.44 428.81  19.49  19.53
 2  19.62  19.62 429.52  10.57
 3  19.60  19.66  10.43 427.38
Bidirectional P2P=Enabled Bandwidth Matrix (GB/s)
   D\D 0  1  2  3
 0 429.47  10.47  19.52  19.39
 1  10.48 427.15  19.64  19.52
 2  19.64  19.59 429.02  10.42
 3  19.60  19.64  10.47 427.81
P2P=Disabled Latency Matrix (us)
   D\D 0  1  2  3
 0   4.50  13.72  14.49  14.44
 1  13.65   4.53  14.52  14.33
 2  14.22  13.82   4.52  14.50
 3  13.87  13.75  14.53   4.55
P2P=Enabled Latency Matrix (us)
   D\D 0  1  2  3
 0   4.44  13.56  14.58  14.45
 1  13.56   4.48  14.39  14.45
 2  13.85  13.93   4.86  14.80
 3  14.51  14.23  14.70   4.72


*2. Attached to emulated Root Port and Switches:*

Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D 0  1  2  3
 0 420.48   3.15   3.12   3.12
 1   3.13 422.31   3.12   3.12
 2   3.08   3.09 421.40   3.13
 3   3.10   3.10   3.13 418.68
Unidirectional P2P=Enabled Bandwidth Matrix (GB/s)
   D\D 0  1  2  3
 0 418.68   3.14   3.12   3.12
 1   3.15 420.03   3.12   3.12
 2   3.11   3.10 421.39   3.14
 3   3.11   3.08   3.13 419.13
Bidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D 0  1  2  3
 0 424.36   5.36   5.35   5.34
 1   5.36 424.36   5.34   5.34
 2   5.35   5.36 425.52   5.35
 3   5.36   5.36   5.34 425.29
Bidirectional P2P=Enabled Bandwidth Matrix (GB/s)
   D\D 0  1  2  3
 0 422.98   5.35   5.35   5.35
 1   5.35 423.44   5.34   5.33
 2   5.35   5.35 425.29   5.35
 3   5.35   5.34   5.34 423.21
P2P=Disabled Latency Matrix (us)
   D\D 0  1  2  3
 0   4.79  16.59  16.38  16.22
 1  16.62   4.77  16.35  16.69
 2  16.77  16.66   4.03  16.68
 3  16.54  16.56  16.78   4.08
P2P=Enabled Latency Matrix (us)
   D\D 0  1  2  3
 0   4.51  16.56  16.58  16.66
 1  15.65   3.87  16.74  16.61
 2  16.59  16.81   3.96  16.70
 3  16.47  16.28  16.68   4.03


Is it because the heavy load of CPU emulation had caused a bottleneck?



2017-08-01 23:01 GMT+08:00 Alex Williamson :

> On Tue, 1 Aug 2017 17:35:40 +0800
> Bob Chen  wrote:
>
> > 2017-08-01 13:46 GMT+08:00 Alex Williamson :
> >
> > > On Tue, 1 Aug 2017 13:04:46 +0800
> > > Bob Chen  wrote:
> > >
> > > > Hi,
> > > >
> > > > This is a sketch of my hardware topology.
> > > >
> > > >   CPU0 <- QPI ->CPU1
> > > >| |
> > > > Root Port(at PCIe.0)Root Port(at PCIe.1)
> > > >/\   /   \
> > >
> > > Are each of these lines above separate root ports?  ie. each root
> > > complex hosts two root ports, each with a two-port switch downstream of
> > > it?
> > >
> >
> > Not quite sure if root complex is a concept or a real physical device ...
> >
> > But according to my observation by `lspci -vt`, there are indeed 4 Root
> > Ports in the system. So the sketch might need a tiny update.
> >
> >
> >   CPU0 <- QPI ->CPU1
> >
> >|

Re: [Qemu-devel] About virtio device hotplug in Q35! 【外域邮件.谨慎查阅】

2017-08-07 Thread Bob Chen

Besides, I checked the lspci -vvv output, no capabilities of Access Control
are seen.

2017-08-01 23:01 GMT+08:00 Alex Williamson :

> On Tue, 1 Aug 2017 17:35:40 +0800
> Bob Chen  wrote:
>
> > 2017-08-01 13:46 GMT+08:00 Alex Williamson :
> >
> > > On Tue, 1 Aug 2017 13:04:46 +0800
> > > Bob Chen  wrote:
> > >
> > > > Hi,
> > > >
> > > > This is a sketch of my hardware topology.
> > > >
> > > >   CPU0 <- QPI ->CPU1
> > > >| |
> > > > Root Port(at PCIe.0)Root Port(at PCIe.1)
> > > >/\   /   \
> > >
> > > Are each of these lines above separate root ports?  ie. each root
> > > complex hosts two root ports, each with a two-port switch downstream of
> > > it?
> > >
> >
> > Not quite sure if root complex is a concept or a real physical device ...
> >
> > But according to my observation by `lspci -vt`, there are indeed 4 Root
> > Ports in the system. So the sketch might need a tiny update.
> >
> >
> >   CPU0 <- QPI ->CPU1
> >
> >| |
> >
> >   Root Complex(device?)  Root Complex(device?)
> >
> >  /\   /\
> >
> > Root Port  Root Port Root Port  Root Port
> >
> >/\   /\
> >
> > SwitchSwitch SwitchSwitch
> >
> >  /   \  /  \  /   \ /   \
> >
> >GPU   GPU  GPU  GPU  GPU   GPU  GPU   GPU
>
>
> Yes, that's what I expected.  So the numbers make sense, the immediate
> sibling GPU would share bandwidth between the root port and upstream
> switch port, any other GPU should not double-up on any single link.
>
> > > > SwitchSwitch SwitchSwitch
> > > >  /   \  /  \  /   \/\
> > > >GPU   GPU  GPU  GPU  GPU   GPU GPU   GPU
> > > >
> > > >
> > > > And below are the p2p bandwidth test results.
> > > >
> > > > Host：
> > > >D\D 0  1  2  3  4  5  6  7
> > > >  0 426.91  25.32  19.72  19.72  19.69  19.68  19.75  19.66
> > > >  1  25.31 427.61  19.74  19.72  19.66  19.68  19.74  19.73
> > > >  2  19.73  19.73 429.49  25.33  19.66  19.74  19.73  19.74
> > > >  3  19.72  19.71  25.36 426.68  19.70  19.71  19.77  19.74
> > > >  4  19.72  19.72  19.73  19.75 425.75  25.33  19.72  19.71
> > > >  5  19.71  19.75  19.76  19.75  25.35 428.11  19.69  19.70
> > > >  6  19.76  19.72  19.79  19.78  19.73  19.74 425.75  25.35
> > > >  7  19.69  19.75  19.79  19.75  19.72  19.72  25.39 427.15
> > > >
> > > > VM：
> > > >D\D 0  1  2  3  4  5  6  7
> > > >  0 427.38  10.52  18.99  19.11  19.75  19.62  19.75  19.71
> > > >  1  10.53 426.68  19.28  19.19  19.73  19.71  19.72  19.73
> > > >  2  18.88  19.30 426.92  10.48  19.66  19.71  19.67  19.68
> > > >  3  18.93  19.18  10.45 426.94  19.69  19.72  19.67  19.72
> > > >  4  19.60  19.66  19.69  19.70 428.13  10.49  19.40  19.57
> > > >  5  19.52  19.74  19.72  19.69  10.44 426.45  19.68  19.61
> > > >  6  19.63  19.50  19.72  19.64  19.59  19.66 426.91  10.47
> > > >  7  19.69  19.75  19.70  19.69  19.66  19.74  10.45 426.23
> > >
> > > Interesting test, how do you get these numbers?  What are the units,
> > > GB/s?
> > >
> >
> >
> >
> > A p2pBandwidthLatencyTest from Nvidia CUDA sample code. Units are
> > GB/s. Asynchronous read and write. Bidirectional.
> >
> > However, the Unidirectional test had shown a different result. Didn't
> fall
> > down to a half.
> >
> > VM:
> > Unidirectional P2P=Enabled Bandwidth Matrix (GB/s)
> >D\D 0  1  2  3  4  5  6  7
> >  0 424.07  10.02  11.33  11.30  11.09  11.05  11.06  11.10
> >  1  10.05 425.98  11.40  11.33  11.08  11.10  11.13  11.09
> >  2  11.31  11.28 423.67  10.10  11.14  11.13  11.13  11.11
> >  3  11.30  11.31  10.08 425.05  11.10  11.07  11.09  11.06
> >  4  11.16  11.17  11.21  11.17 423.67  10.08  11.25  11.28
> >  5  10.97  11.01  11.07  11.02  10.09 425.52  11.23  11.27
> >  6  11.09  11.13  11.16  11.10  11.28  11.33 422.71  10.10
> >  7  11.13  11.09  11.15  11.11  11.36  11.33  10.02 422.75
> >
> > Host:
> > Unidirectional P2P=Enabled Bandwidth Matrix (GB/s)
> >D\D 0  1  2  3  4  5  6  7
> >  0 424.13  13.38  10.17  10.17  11.23  11.21  10.94  11.22
> >  1  13.38 424.06  10.18  10.19  11.20  11.19  11.19  11.14
> >  2  10.18  10.18 422.75  13.38  11.19  11.19  11.17  11.17
> >  3  10.18  10.18  13.38 425.05  11.05  11.08  11.08  11.06
> >  4  11.01  11.06  11.06  11.03 423.21  13.38  10.17  10.17
> >  5  10.91  10.91  10.89  10.92  13.38 425.52  10.18  10.18
> >  6  11.28  11.30  11.32  11.31  10.19  10.18 424.59  13.37
> >  7  11.18  11.20  11.16  11.21  10.17  10.19  13.38 424.13
>
> Looks

Re: [Qemu-devel] [Qemu-ppc] [PATCH v2 1/1] ppc: spapr: Make VCPU ID handling private to SPAPR

2017-08-07 Thread Greg Kurz

On Mon, 7 Aug 2017 16:33:29 +1000
Sam Bobroff  wrote:

> The concept of a VCPU ID that differs from the CPU's index
> (cpu->cpu_index) exists only within SPAPR machines so, move the
> functions ppc_get_vcpu_id() and ppc_get_cpu_by_vcpu_id() into spapr.c
> and rename them appropriately.
> 
> Signed-off-by: Sam Bobroff 
> ---
> Changes in v2:
> 
> * Re-arranged so that spapr_vcpu_id() calls kvm_arch_vcpu_id() rather than the
> other way around.
> 

Better indeed! :)

Reviewed-by: Greg Kurz 

>  hw/ppc/ppc.c   | 21 -
>  hw/ppc/spapr.c | 40 +---
>  hw/ppc/spapr_hcall.c   |  4 ++--
>  hw/ppc/spapr_rtas.c|  4 ++--
>  include/hw/ppc/spapr.h |  3 +++
>  target/ppc/cpu.h   | 18 --
>  target/ppc/kvm.c   |  2 +-
>  7 files changed, 41 insertions(+), 51 deletions(-)
> 
> diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
> index 4477d4ad89..f76886f4d3 100644
> --- a/hw/ppc/ppc.c
> +++ b/hw/ppc/ppc.c
> @@ -1358,27 +1358,6 @@ void PPC_debug_write (void *opaque, uint32_t addr, 
> uint32_t val)
>  }
>  }
>  
> -/* CPU device-tree ID helpers */
> -int ppc_get_vcpu_id(PowerPCCPU *cpu)
> -{
> -return cpu->vcpu_id;
> -}
> -
> -PowerPCCPU *ppc_get_cpu_by_vcpu_id(int vcpu_id)
> -{
> -CPUState *cs;
> -
> -CPU_FOREACH(cs) {
> -PowerPCCPU *cpu = POWERPC_CPU(cs);
> -
> -if (cpu->vcpu_id == vcpu_id) {
> -return cpu;
> -}
> -}
> -
> -return NULL;
> -}
> -
>  void ppc_cpu_parse_features(const char *cpu_model)
>  {
>  CPUClass *cc;
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index d6c9b3e334..d31e6d2c0d 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -208,7 +208,7 @@ static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, 
> PowerPCCPU *cpu,
>  int i, ret = 0;
>  uint32_t servers_prop[smt_threads];
>  uint32_t gservers_prop[smt_threads * 2];
> -int index = ppc_get_vcpu_id(cpu);
> +int index = spapr_vcpu_id(cpu);
>  
>  if (cpu->compat_pvr) {
>  ret = fdt_setprop_cell(fdt, offset, "cpu-version", cpu->compat_pvr);
> @@ -237,7 +237,7 @@ static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, 
> PowerPCCPU *cpu,
>  
>  static int spapr_fixup_cpu_numa_dt(void *fdt, int offset, PowerPCCPU *cpu)
>  {
> -int index = ppc_get_vcpu_id(cpu);
> +int index = spapr_vcpu_id(cpu);
>  uint32_t associativity[] = {cpu_to_be32(0x5),
>  cpu_to_be32(0x0),
>  cpu_to_be32(0x0),
> @@ -341,7 +341,7 @@ static int spapr_fixup_cpu_dt(void *fdt, 
> sPAPRMachineState *spapr)
>  PowerPCCPU *cpu = POWERPC_CPU(cs);
>  CPUPPCState *env = &cpu->env;
>  DeviceClass *dc = DEVICE_GET_CLASS(cs);
> -int index = ppc_get_vcpu_id(cpu);
> +int index = spapr_vcpu_id(cpu);
>  int compat_smt = MIN(smp_threads, ppc_compat_max_threads(cpu));
>  
>  if ((index % smt) != 0) {
> @@ -493,7 +493,7 @@ static void spapr_populate_cpu_dt(CPUState *cs, void 
> *fdt, int offset,
>  PowerPCCPU *cpu = POWERPC_CPU(cs);
>  CPUPPCState *env = &cpu->env;
>  PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
> -int index = ppc_get_vcpu_id(cpu);
> +int index = spapr_vcpu_id(cpu);
>  uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
> 0x, 0x};
>  uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq()
> @@ -626,7 +626,7 @@ static void spapr_populate_cpus_dt_node(void *fdt, 
> sPAPRMachineState *spapr)
>   */
>  CPU_FOREACH_REVERSE(cs) {
>  PowerPCCPU *cpu = POWERPC_CPU(cs);
> -int index = ppc_get_vcpu_id(cpu);
> +int index = spapr_vcpu_id(cpu);
>  DeviceClass *dc = DEVICE_GET_CLASS(cs);
>  int offset;
>  
> @@ -2982,7 +2982,7 @@ static void *spapr_populate_hotplug_cpu_dt(CPUState 
> *cs, int *fdt_offset,
>  {
>  PowerPCCPU *cpu = POWERPC_CPU(cs);
>  DeviceClass *dc = DEVICE_GET_CLASS(cs);
> -int id = ppc_get_vcpu_id(cpu);
> +int id = spapr_vcpu_id(cpu);
>  void *fdt;
>  int offset, fdt_size;
>  char *nodename;
> @@ -3392,7 +3392,7 @@ static void spapr_ics_resend(XICSFabric *dev)
>  
>  static ICPState *spapr_icp_get(XICSFabric *xi, int vcpu_id)
>  {
> -PowerPCCPU *cpu = ppc_get_cpu_by_vcpu_id(vcpu_id);
> +PowerPCCPU *cpu = spapr_find_cpu(vcpu_id);
>  
>  return cpu ? ICP(cpu->intc) : NULL;
>  }
> @@ -3412,6 +3412,32 @@ static void 
> spapr_pic_print_info(InterruptStatsProvider *obj,
>  ics_pic_print_info(spapr->ics, mon);
>  }
>  
> +int spapr_vcpu_id(PowerPCCPU *cpu)
> +{
> +CPUState *cs = CPU(cpu);
> +
> +if (kvm_enabled()) {
> +return kvm_arch_vcpu_id(cs);
> +} else {
> +return cs->cpu_index;
> +}
> +}
> +
> +PowerPCCPU *spapr_find_cpu(int vcpu_id)
> +{
> +CPUState *cs;
> +
> +CPU_FOREACH(cs) {
> +PowerPCCPU *cpu = POWERPC_CPU(cs);
> +
> +if (cpu->vcpu_id == vcp

Re: [Qemu-devel] [PULL 0/1] Tracing patches

2017-08-07 Thread Peter Maydell

On 7 August 2017 at 12:04, Stefan Hajnoczi  wrote:
> The following changes since commit ac44ed2afb7c60255e989b163301479f5b4ecd04:
>
>   Merge remote-tracking branch 'remotes/ehabkost/tags/machine-pull-request' 
> into staging (2017-08-04 13:46:22 +0100)
>
> are available in the git repository at:
>
>   git://github.com/stefanha/qemu.git tags/tracing-pull-request
>
> for you to fetch changes up to f42cf447e2310e84e119b99f7f13c8dc7a6cf3d6:
>
>   block: move trace probes into bdrv_co_preadv|pwritev (2017-08-07 09:39:35 
> +0100)
>
> 
>
> 
>
> Daniel P. Berrange (1):
>   block: move trace probes into bdrv_co_preadv|pwritev
>
>  block/io.c | 8 
>  block/trace-events | 4 ++--
>  2 files changed, 6 insertions(+), 6 deletions(-)

Applied, thanks.

-- PMM

Re: [Qemu-devel] [PATCH 11/17] block/nbd-client: fix nbd_co_request: set s->reply.handle to 0 on error

2017-08-07 Thread Vladimir Sementsov-Ogievskiy


07.08.2017 14:55, Eric Blake wrote:

On 08/04/2017 10:14 AM, Vladimir Sementsov-Ogievskiy wrote:

We set s->reply.handle to 0 on one error path and don't set on another.
For consistancy and to avoid assert in nbd_read_reply_entry let's
set s->reply.handle to 0 in case of wrong handle too.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd-client.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

Can this assertion be triggered now (presumably, with a broken server)?
I'm trying to figure out if this is 2.10 material.

[Urgh. If a broken server is able to cause an assertion failure that
causes a client to abort on an assertion failure, that probably deserves
a CVE]

Hmm looks like I've mistaken, if handle is wrong than read_reply_co 
should be already finished, so it's impossible



--
Best regards,
Vladimir

Re: [Qemu-devel] [Qemu-block] [PATCH for-2.10] quorum: Set sectors-count to 0 when reporting a flush error

2017-08-07 Thread Kevin Wolf

Am 07.08.2017 um 14:36 hat Alberto Garcia geschrieben:
> The QUORUM_REPORT_BAD event has fields to report the sector in which
> the error was detected and the number of affected sectors starting
> from that one. This is important for read and write errors, but not
> for flush errors.
> 
> For flush errors the current code reports the total size of the disk
> image. That is however not useful information in this case. Moreover,
> the bdrv_getlength() call can fail, and there's no good way of
> handling that failure.
> 
> Since we're reporting useless information and we cannot even guarantee
> to do it in a consistent way, this patch changes the code to report 0
> instead in all cases.
> 
> Reported-by: Markus Armbruster 
> Signed-off-by: Alberto Garcia 

Thanks, applied to the block branch.

Kevin

Re: [Qemu-devel] [PATCH for-2.10] hw/arm/virt: Add 2.10 machine type

2017-08-07 Thread Peter Maydell

On 7 August 2017 at 13:14, Andrew Jones  wrote:
> On Mon, Aug 07, 2017 at 11:49:41AM +, Eric Auger wrote:
>> Add virt-2.10 machine type.
>>
>> Signed-off-by: Eric Auger 
>>
>> ---

Applied to master, thanks.

-- PMM

Re: [Qemu-devel] [PATCH v2 for-2.10 0/4] VHDX bugfixes

2017-08-07 Thread Kevin Wolf

Am 07.08.2017 um 14:38 hat Jeff Cody geschrieben:
> 
> Some VHDX bug fixes, including:
> 
> 1. Checking bdrv_getlength(), bdrv_flush(), and bdrv_truncate() return values

Thanks, applied to the block branch.

Kevin

1 2 3 4 >

1 - 100 of 336 matches

Mail list logo