date:20170602

Re: [Qemu-devel] [PATCH v2 09/45] qapi: merge QInt and QFloat in QNum

2017-06-02 Thread Markus Armbruster

Marc-André Lureau  writes:

> We would like to use the same QObject type to represent numbers, whether
> they are int, uint, or floats. Getters will allow some compatibility
> between the various types if the number fits other representations.
>
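
To make the "getters allow some compatibility" idea concrete, here is a
minimal standalone sketch. It is not the code from this patch: the names
(Num, num_get_try_int() and friends) are made up for illustration and are
not the QNum API. It models one number object that stores an int64_t, a
uint64_t or a double, with getters that succeed only when the stored value
fits the requested representation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a unified number object; not QEMU's QNum. */
typedef enum { NUM_I64, NUM_U64, NUM_DOUBLE } NumKind;

typedef struct {
    NumKind kind;
    union {
        int64_t i64;
        uint64_t u64;
        double dbl;
    } u;
} Num;

static Num num_from_int(int64_t v)
{
    Num n = { .kind = NUM_I64, .u.i64 = v };
    return n;
}

static Num num_from_uint(uint64_t v)
{
    Num n = { .kind = NUM_U64, .u.u64 = v };
    return n;
}

static Num num_from_double(double v)
{
    Num n = { .kind = NUM_DOUBLE, .u.dbl = v };
    return n;
}

/* Fetch as int64_t; fails if the stored value is not representable. */
static bool num_get_try_int(const Num *n, int64_t *val)
{
    switch (n->kind) {
    case NUM_I64:
        *val = n->u.i64;
        return true;
    case NUM_U64:
        if (n->u.u64 > INT64_MAX) {
            return false;
        }
        *val = (int64_t)n->u.u64;
        return true;
    case NUM_DOUBLE:
        return false; /* this sketch never converts a double to an integer */
    }
    return false;
}

/* Fetch as uint64_t; fails for negative integers and for doubles. */
static bool num_get_try_uint(const Num *n, uint64_t *val)
{
    switch (n->kind) {
    case NUM_I64:
        if (n->u.i64 < 0) {
            return false;
        }
        *val = (uint64_t)n->u.i64;
        return true;
    case NUM_U64:
        *val = n->u.u64;
        return true;
    case NUM_DOUBLE:
        return false;
    }
    return false;
}

/* Fetch as double; always possible, though large integers may lose precision. */
static double num_get_double(const Num *n)
{
    switch (n->kind) {
    case NUM_I64:
        return (double)n->u.i64;
    case NUM_U64:
        return (double)n->u.u64;
    case NUM_DOUBLE:
        return n->u.dbl;
    }
    assert(0); /* all kinds are handled above */
    return 0.0;
}
```

Whether a double holding an integral value should also be readable through
the integer getters is a design choice; this sketch simply refuses.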
> Signed-off-by: Marc-André Lureau 
> ---
>  scripts/qapi.py  |  40 -
>  scripts/qapi-visit.py|   2 +-
>  include/qapi/qmp/qdict.h |   3 +-
>  include/qapi/qmp/qfloat.h|  29 ---
>  include/qapi/qmp/qint.h  |  28 --
>  include/qapi/qmp/qlist.h |   3 +-
>  include/qapi/qmp/qnum.h  |  44 ++
>  include/qapi/qmp/types.h |   3 +-
>  include/qapi/qobject-input-visitor.h |   6 +-
>  include/qapi/qobject-output-visitor.h|   8 +-
>  block/blkdebug.c |   1 -
>  block/nbd.c  |   1 -
>  block/nfs.c  |   1 -
>  block/qapi.c |  13 ++-
>  block/quorum.c   |   1 -
>  block/sheepdog.c |   1 -
>  block/ssh.c  |   1 -
>  block/vvfat.c|   1 -
>  blockdev.c   |   8 +-
>  hw/acpi/pcihp.c  |   1 -
>  hw/i386/acpi-build.c |  41 +++--
>  hw/usb/xen-usb.c |   1 -
>  monitor.c|   2 +-
>  qapi/qobject-input-visitor.c |  41 +++--
>  qapi/qobject-output-visitor.c|   6 +-
>  qga/commands.c   |   2 +-
>  qga/main.c   |   1 -
>  qobject/json-parser.c|  29 +++
>  qobject/qdict.c  |  43 +-
>  qobject/qfloat.c |  62 --
>  qobject/qint.c   |  61 -
>  qobject/qjson.c  |  37 +---
>  qobject/qnum.c   | 141 +++
>  qobject/qobject.c|   3 +-
>  qom/object.c |  16 ++--
>  target/i386/cpu.c|   6 +-
>  tests/check-qdict.c  |  30 ---
>  tests/check-qfloat.c |  53 
>  tests/check-qint.c   |  87 ---
>  tests/check-qjson.c  |  91 +++-
>  tests/check-qlist.c  |  18 ++--
>  tests/check-qnum.c   | 133 +
>  tests/test-qmp-commands.c|   8 +-
>  tests/test-qmp-event.c   |   9 +-
>  tests/test-qobject-input-visitor.c   |  33 
>  tests/test-qobject-output-visitor.c  |  66 +--
>  tests/test-x86-cpuid-compat.c|  20 +++--
>  ui/spice-core.c  |   1 -
>  ui/vnc-enc-tight.c   |   1 -
>  util/qemu-option.c   |  20 ++---
>  MAINTAINERS  |   3 +-
>  qobject/Makefile.objs|   2 +-
>  scripts/coccinelle/qobject.cocci |   4 +-
>  tests/.gitignore |   3 +-
>  tests/Makefile.include   |  13 ++-
>  tests/qapi-schema/comments.out   |   2 +-
>  tests/qapi-schema/doc-good.out   |   2 +-
>  tests/qapi-schema/empty.out  |   2 +-
>  tests/qapi-schema/event-case.out |   2 +-
>  tests/qapi-schema/ident-with-escape.out  |   2 +-
>  tests/qapi-schema/include-relpath.out|   2 +-
>  tests/qapi-schema/include-repetition.out |   2 +-
>  tests/qapi-schema/include-simple.out |   2 +-
>  tests/qapi-schema/indented-expr.out  |   2 +-
>  tests/qapi-schema/qapi-schema-test.out   |   2 +-
>  65 files changed, 637 insertions(+), 665 deletions(-)
>  delete mode 100644 include/qapi/qmp/qfloat.h
>  delete mode 100644 include/qapi/qmp/qint.h
>  create mode 100644 include/qapi/qmp/qnum.h
>  delete mode 100644 qobject/qfloat.c
>  delete mode 100644 qobject/qint.c
>  create mode 100644 qobject/qnum.c
>  delete mode 100644 tests/check-qfloat.c
>  delete mode 100644 tests/check-qint.c
>  create mode 100644 tests/check-qnum.c
>
> diff --git a/scripts/qapi.py b/scripts/qapi.py
> index 06e583d8c3..0de809f56b 100644
> --- a/scripts/qapi.py
> +++ b/scripts/qapi.py
> @@ -21,18 +21,18 @@ from ordereddict import OrderedDict
>  
>  builtin_types = {
>  'str':  'QTYPE_QSTRING',
> -'int':  'QTYPE_QINT',
> -'number':   'QTYPE_QFLOAT',
> +'int':  'QTYPE_QNUM',
> +'number':   'QTYPE_QNUM',
>  'bool': 'QTYPE_QBOOL',
> -'int8': 'QTYPE_QINT',
> -'int16':'QTYPE_QINT',
> -'int32':'QTYPE_QINT',
> -'int64':'QTYPE_QINT',
> -'uint8':'QTYPE_QINT',
> -'uint16':   'QTYPE_QINT',
> -

Re: [Qemu-devel] [PATCH 3/4] spapr: Abolish DRC set_configured method

2017-06-02 Thread David Gibson

On Thu, Jun 01, 2017 at 10:37:39AM -0500, Michael Roth wrote:
> Quoting David Gibson (2017-05-31 20:52:17)
> > DRConnectorClass has a set_configured method, however:
> >   * There is only one implementation, and only ever likely to be one
> >   * There's exactly one caller, and that's (now) local
> >   * The implementation is very straightforward
> > 
> > So abolish the method entirely, and just open-code what we need.  We also
> > remove the tracepoints associated with it, since they don't look to be
> > terribly useful.
> 
> Dropping the method makes sense, but the 'configured' state affects a
> lot of the state-transitions throughout the code so I think it may
> be useful to keep the traces.

Fair point, I'll put the traces back in.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 1/4] spapr: Move DRC RTAS calls into spapr_drc.c

2017-06-02 Thread David Gibson

On Thu, Jun 01, 2017 at 10:56:50AM -0500, Michael Roth wrote:
> Quoting David Gibson (2017-05-31 20:52:15)
> > Currently implementations of the RTAS calls related to DRCs are in
> > spapr_rtas.c.  They belong better in spapr_drc.c - that way they're closer
> > to related code, and we'll be able to make some more things local.
> > 
> > spapr_rtas.c was intended to contain the RTAS infrastructure and core calls
> > that don't belong anywhere else, not every RTAS implementation.
> > 
> > Code motion only.
> 
> Technically rtas-get-sensor-state and rtas-set-indicator aren't specific
> to DRCs, but looking through the documented indicators/sensors (tables
> 40 and 42 in LoPAPR v11) it doesn't seem too likely we'll ever implement
> any others so the move seems reasonable.

True, I realised that a little after I posted.

Even if we did implement other sensors, though, the natural way to
split this would be to have the generic set-indicator function check
simply that the indicator is a DR related one, and pass the options
straight to a function in the DRC code, rather than doing preliminary
looking into DRC internals as it does now.
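
As a rough illustration of that split (a standalone sketch with made-up
names and placeholder values, not the actual QEMU code): the generic
handler only classifies the indicator and forwards the arguments, while a
hypothetical drc_set_indicator() in the DRC code owns everything else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values for illustration only; not the real RTAS encodings. */
enum {
    SENSOR_TYPE_ISOLATION_STATE = 1,
    SENSOR_TYPE_DR = 2,
    SENSOR_TYPE_ALLOCATION_STATE = 3,
};

enum { OUT_SUCCESS = 0, OUT_NOT_SUPPORTED = -1 };

static bool sensor_type_is_dr(uint32_t type)
{
    switch (type) {
    case SENSOR_TYPE_ISOLATION_STATE:
    case SENSOR_TYPE_DR:
    case SENSOR_TYPE_ALLOCATION_STATE:
        return true;
    }
    return false;
}

/* Hypothetical entry point in the DRC code: all DRC internals stay here. */
static int drc_set_indicator(uint32_t type, uint32_t index, uint32_t state)
{
    (void)index;
    (void)state;
    assert(sensor_type_is_dr(type));
    /* A real implementation would look up the DRC and update its state. */
    return OUT_SUCCESS;
}

/* Generic RTAS-side handler: classify the indicator, then just forward. */
static int rtas_set_indicator_generic(uint32_t type, uint32_t index,
                                      uint32_t state)
{
    if (sensor_type_is_dr(type)) {
        return drc_set_indicator(type, index, state);
    }
    return OUT_NOT_SUPPORTED;
}
```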

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/4] spapr:DRC cleanups (part I)

2017-06-02 Thread David Gibson

On Thu, Jun 01, 2017 at 12:41:40PM -0300, Daniel Henrique Barboza wrote:
> 
> 
> On 06/01/2017 02:30 AM, David Gibson wrote:
> > On Wed, May 31, 2017 at 11:25:41PM -0500, Michael Roth wrote:
> > > Quoting Bharata B Rao (2017-05-31 23:06:46)
> > > > On Thu, Jun 01, 2017 at 11:52:14AM +1000, David Gibson wrote:
> > > > > The code managing DRCs[0] has quite a few things that are more
> > > > > complicated than they need to be.  In particular the object
> > > > > representing a DRC has a bunch of method pointers, despite the fact
> > > > > that there are currently no subclasses, and even if there were the
> > > > > method implementations would be unlikely to differ.
> > > > So you are getting rid of a few methods. How about other methods ?
> > > > Specially attach and detach which have incorporated all the logic needed
> > > > to handle logical and physical DRs into their implementations ?
> > > I would avoid any methods that incorporate special-casing for
> > > physical vs. logical DRCs, since that seems like a good logical
> > > starting point for moving to 'physical'/'logical' DRC
> > > sub-classes to help simplify the increasingly complicated
> > > state-tracking.
> > Right, I'm looking at making subclasses for each of the DRC types.
> > Possibly with intermediate subclasses for physical vs. logical, we'll
> > see how it works out.
> 
> Back in the DRC migration patch series I talked with Mike about refactoring
> the DRC code in such fashion (physical DRC and logical DRC). But first I
> would implement some kind of unit testing in this code to avoid breaking
> too much stuff during this refactoring.

So, I'd love to have good unit tests, but everything takes time.

> I am not sure about the effort of implementing unit tests in the
> current DRC code.  This series is simplifying the DRC code, making
> it more minimalist and possibly easier to be tested. In the end it
> would be a first step towards unit testing.

...and as you say, extra complexity in the code makes testing and
reliability harder.

> 
> However, there is the issue of backward compatibility. I fear this DRC
> refactoring of Logical/Physical DRC would be too drastic to maintain such
> compatibility (assuming that it is not already broken). If this refactor
> goes live only in 2.11 then we will have a hard time to migrate from 2.11
> to 2.10.

Right, such a rework could break migration.

> All that said, I believe we can live without unit testing for a little
> longer and if we're going for this Physical/DRC refactoring, we need to
> push it for 2.10. We can think about unit test later with the refactored
> code. Feel free to send to me any unfinished/beta DRC refactoring code you
> might be working on and want tested. I can help in the refactoring too,
> just let me know.

So like you I think getting it into 2.10 would be a good idea, before
we have any released version with DRC migration to break.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH 1/4] spapr: Move DRC RTAS calls into spapr_drc.c

2017-06-02 Thread David Gibson

On Thu, Jun 01, 2017 at 11:05:37AM -0500, Michael Roth wrote:
> Quoting Laurent Vivier (2017-06-01 08:36:36)
> > On 01/06/2017 03:52, David Gibson wrote:
> > > Currently implementations of the RTAS calls related to DRCs are in
> > > spapr_rtas.c.  They belong better in spapr_drc.c - that way they're closer
> > > to related code, and we'll be able to make some more things local.
> > > 
> > > > spapr_rtas.c was intended to contain the RTAS infrastructure and core calls
> > > that don't belong anywhere else, not every RTAS implementation.
> > > 
> > > Code motion only.
> > > 
> > > Signed-off-by: David Gibson 
> > > ---
> > > >  hw/ppc/spapr_drc.c  | 322 ++--
> > > >  hw/ppc/spapr_rtas.c | 304 -
> > >  2 files changed, 315 insertions(+), 311 deletions(-)
> > > 
> > > diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> > > index cc2400b..ae8800d 100644
> > > --- a/hw/ppc/spapr_drc.c
> > > +++ b/hw/ppc/spapr_drc.c
> > > @@ -27,6 +27,34 @@
> > >  #define DRC_INDEX_TYPE_SHIFT 28
> > >  #define DRC_INDEX_ID_MASK ((1ULL << DRC_INDEX_TYPE_SHIFT) - 1)
> > >  
> > > > +static sPAPRConfigureConnectorState *spapr_ccs_find(sPAPRMachineState *spapr,
> > > +uint32_t drc_index)
> > > +{
> > > +sPAPRConfigureConnectorState *ccs = NULL;
> > > +
> > > +QTAILQ_FOREACH(ccs, &spapr->ccs_list, next) {
> > > +if (ccs->drc_index == drc_index) {
> > > +break;
> > > +}
> > > +}
> > > +
> > > +return ccs;
> > > +}
> > > +
> > > +static void spapr_ccs_add(sPAPRMachineState *spapr,
> > > +  sPAPRConfigureConnectorState *ccs)
> > > +{
> > > +g_assert(!spapr_ccs_find(spapr, ccs->drc_index));
> > > +QTAILQ_INSERT_HEAD(&spapr->ccs_list, ccs, next);
> > > +}
> > > +
> > > +static void spapr_ccs_remove(sPAPRMachineState *spapr,
> > > + sPAPRConfigureConnectorState *ccs)
> > > +{
> > > +QTAILQ_REMOVE(&spapr->ccs_list, ccs, next);
> > > +g_free(ccs);
> > > +}
> > > +
> > > >  static sPAPRDRConnectorTypeShift get_type_shift(sPAPRDRConnectorType type)
> > >  {
> > >  uint32_t shift = 0;
> > > @@ -747,13 +775,6 @@ static const TypeInfo spapr_dr_connector_info = {
> > >  .class_init= spapr_dr_connector_class_init,
> > >  };
> > >  
> > > -static void spapr_drc_register_types(void)
> > > -{
> > > -type_register_static(&spapr_dr_connector_info);
> > > -}
> > > -
> > > -type_init(spapr_drc_register_types)
> > > -
> > >  /* helper functions for external users */
> > >  
> > >  sPAPRDRConnector *spapr_dr_connector_by_index(uint32_t index)
> > > @@ -932,3 +953,290 @@ out:
> > >  
> > >  return ret;
> > >  }
> > > +
> > > +/*
> > > + * RTAS calls
> > > + */
> > > +
> > > +static bool sensor_type_is_dr(uint32_t sensor_type)
> > > +{
> > > +switch (sensor_type) {
> > > +case RTAS_SENSOR_TYPE_ISOLATION_STATE:
> > > +case RTAS_SENSOR_TYPE_DR:
> > > +case RTAS_SENSOR_TYPE_ALLOCATION_STATE:
> > > +return true;
> > > +}
> > > +
> > > +return false;
> > > +}
> > > +
> > > +static void rtas_set_indicator(PowerPCCPU *cpu, sPAPRMachineState *spapr,
> > > +   uint32_t token, uint32_t nargs,
> > > +   target_ulong args, uint32_t nret,
> > > +   target_ulong rets)
> > > +{
> > > +uint32_t sensor_type;
> > > +uint32_t sensor_index;
> > > +uint32_t sensor_state;
> > > +uint32_t ret = RTAS_OUT_SUCCESS;
> > > +sPAPRDRConnector *drc;
> > > +sPAPRDRConnectorClass *drck;
> > > +
> > > +if (nargs != 3 || nret != 1) {
> > > +ret = RTAS_OUT_PARAM_ERROR;
> > > +goto out;
> > > +}
> > > +
> > > +sensor_type = rtas_ld(args, 0);
> > > +sensor_index = rtas_ld(args, 1);
> > > +sensor_state = rtas_ld(args, 2);
> > > +
> > > +if (!sensor_type_is_dr(sensor_type)) {
> > > +goto out_unimplemented;
> > > +}
> > > +
> > > +/* if this is a DR sensor we can assume sensor_index == drc_index */
> > > +drc = spapr_dr_connector_by_index(sensor_index);
> > > +if (!drc) {
> > > +trace_spapr_rtas_set_indicator_invalid(sensor_index);
> > > +ret = RTAS_OUT_PARAM_ERROR;
> > > +goto out;
> > > +}
> > > +drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> > > +
> > > +switch (sensor_type) {
> > > +case RTAS_SENSOR_TYPE_ISOLATION_STATE:
> > > +/* if the guest is configuring a device attached to this
> > > + * DRC, we should reset the configuration state at this
> > > + * point since it may no longer be reliable (guest released
> > > + * device and needs to start over, or unplug occurred so
> > > + * the FDT is no longer valid)
> > > + */
> > > +if (sensor_state == SPAPR_DR_ISOLATION_STATE_ISOLATED) {
> > > +

Re: [Qemu-devel] [PATCH 0/4] spapr:DRC cleanups (part I)

2017-06-02 Thread David Gibson

On Thu, Jun 01, 2017 at 11:52:14AM +1000, David Gibson wrote:
> The code managing DRCs[0] has quite a few things that are more
> complicated than they need to be.  In particular the object
> representing a DRC has a bunch of method pointers, despite the fact
> that there are currently no subclasses, and even if there were the
> method implementations would be unlikely to differ.
> 
> This appears to be a misguided attempt to "abstract" or hide things in
> a way which is bureaucratic, rather than meaningful.  We may have an
> object model, but we don't have to adopt Java's kingdom-of-nouns
> nonsense[1].
> 
> This series makes a start on simplifying things.  There's still plenty
> more, but you have to start somewhere.
> 
> [0] "Dynamic Reconfiguration Connectors" a firmware abstraction used
> in hotplug operations
> [1] https://steve-yegge.blogspot.com.au/2006/03/execution-in-kingdom-of-nouns.html

I've had enough acks that I've merged this series (with minor
corrections) into ppc-for-2.10.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCHv5 0/4] Clean up compatibility mode handling

2017-06-02 Thread David Gibson

On Thu, Jun 01, 2017 at 08:55:19PM -0700, no-re...@patchew.org wrote:
> Hi,
> 
> This series seems to have some coding style problems. See output below for
> more information:
> 
> Type: series
> Message-id: 20170602031507.29881-1-da...@gibson.dropbear.id.au
> Subject: [Qemu-devel] [PATCHv5 0/4] Clean up compatibility mode handling
> 
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
> 
> BASE=base
> n=1
> total=$(git log --oneline $BASE.. | wc -l)
> failed=0
> 
> git config --local diff.renamelimit 0
> git config --local diff.renames True
> 
> commits="$(git log --format=%H --reverse $BASE..)"
> for c in $commits; do
> echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
> if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
> failed=1
> echo
> fi
> n=$((n+1))
> done
> 
> exit $failed
> === TEST SCRIPT END ===
> 
> Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
> Switched to a new branch 'test'
> 6ccf47f ppc: Rework CPU compatibility testing across migration
> 2b75df3 pseries: Reset CPU compatibility mode
> 282a89c pseries: Move CPU compatibility property to machine
> 8741a73 qapi: add explicit null to string input and output visitors
> 
> === OUTPUT BEGIN ===
> Checking PATCH 1/4: qapi: add explicit null to string input and output visitors...
> Checking PATCH 2/4: pseries: Move CPU compatibility property to machine...
> Checking PATCH 3/4: pseries: Reset CPU compatibility mode...
> Checking PATCH 4/4: ppc: Rework CPU compatibility testing across migration...
> ERROR: braces {} are necessary for all arms of this statement
> #94: FILE: target/ppc/machine.c:236:
> +if (cpu->compat_pvr) {
> [...]
> +} else
> [...]
> 
> total: 1 errors, 0 warnings, 100 lines checked

This is a false positive triggered by an #ifdef intersecting the
if/else.
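
For illustration, the general shape that trips a line-based brace check
looks like the sketch below. This is a made-up standalone example, not the
code from target/ppc/machine.c; HAVE_COMPAT and pick() are hypothetical.
After preprocessing both arms are braced, but the checker sees a bare
"} else" because the opening brace of the else arm sits after the #endif.

```c
#include <assert.h>

#define HAVE_COMPAT 1 /* hypothetical build-time switch */

static int pick(int compat)
{
    int ret = 0;

#if HAVE_COMPAT
    if (compat) {
        ret = 1;
    } else
#endif
    {
        /* With HAVE_COMPAT off, this block runs unconditionally. */
        ret = 2;
    }

    assert(ret != 0); /* every path above assigns ret */
    return ret;
}
```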

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



[Qemu-devel] [PATCH 3/5] spapr: Move configure-connector state into DRC

2017-06-02 Thread David Gibson

Currently the sPAPRMachineState contains a list of sPAPRConfigureConnector
structures which store intermediate state for the ibm,configure-connector
RTAS call.

This was an attempt to separate this state from the core of the DRC state.
However the configure connector process is intimately tied to the DRC
model, so there's really no point trying to have two levels of interface
here.

Moving the configure-connector state into its corresponding DRC allows
removal of a number of helpers for maintaining the ancillary list.

Signed-off-by: David Gibson 
---
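Not part of the patch: a standalone sketch of the idea, using simplified
made-up types (Drc, Ccs) and helper names (drc_start_ccs(),
drc_clear_ccs()) rather than the real QEMU structures. Once each DRC owns
its configure-connector state through a pointer, dropping that state is a
free-and-NULL on the DRC itself, with no list search by drc_index.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins; the fields are illustrative only. */
typedef struct Ccs {
    int fdt_offset;
    int fdt_depth;
} Ccs;

typedef struct Drc {
    unsigned id;
    Ccs *ccs; /* NULL when no configure-connector sequence is in progress */
} Drc;

/* Begin a configure-connector sequence on this DRC. */
static Ccs *drc_start_ccs(Drc *drc)
{
    assert(!drc->ccs);
    drc->ccs = calloc(1, sizeof(*drc->ccs));
    return drc->ccs;
}

/* Discard any in-progress state, e.g. on isolation or reset. */
static void drc_clear_ccs(Drc *drc)
{
    free(drc->ccs); /* free(NULL) is a no-op, so no check is needed */
    drc->ccs = NULL;
}
```

The same two-line clear can then be used from both the isolation path and
the reset path, which is what makes the separate find/add/remove helpers
unnecessary in this model.
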
 hw/ppc/spapr.c |  4 ---
 hw/ppc/spapr_drc.c | 73 --
 include/hw/ppc/spapr.h | 14 -
 include/hw/ppc/spapr_drc.h |  7 +
 4 files changed, 25 insertions(+), 73 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 7aac3b9..6234dbd 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2339,10 +2339,6 @@ static void ppc_spapr_init(MachineState *machine)
 register_savevm_live(NULL, "spapr/htab", -1, 1,
  &savevm_htab_handlers, spapr);
 
-/* used by RTAS */
-QTAILQ_INIT(&spapr->ccs_list);
-qemu_register_reset(spapr_ccs_reset_hook, spapr);
-
 qemu_register_boot_set(spapr_boot_set, spapr);
 
 if (kvm_enabled()) {
diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 2514f87..27d4bd3 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -27,34 +27,6 @@
 #define DRC_INDEX_TYPE_SHIFT 28
 #define DRC_INDEX_ID_MASK ((1ULL << DRC_INDEX_TYPE_SHIFT) - 1)
 
-static sPAPRConfigureConnectorState *spapr_ccs_find(sPAPRMachineState *spapr,
-uint32_t drc_index)
-{
-sPAPRConfigureConnectorState *ccs = NULL;
-
-QTAILQ_FOREACH(ccs, &spapr->ccs_list, next) {
-if (ccs->drc_index == drc_index) {
-break;
-}
-}
-
-return ccs;
-}
-
-static void spapr_ccs_add(sPAPRMachineState *spapr,
-  sPAPRConfigureConnectorState *ccs)
-{
-g_assert(!spapr_ccs_find(spapr, ccs->drc_index));
-QTAILQ_INSERT_HEAD(&spapr->ccs_list, ccs, next);
-}
-
-static void spapr_ccs_remove(sPAPRMachineState *spapr,
- sPAPRConfigureConnectorState *ccs)
-{
-QTAILQ_REMOVE(&spapr->ccs_list, ccs, next);
-g_free(ccs);
-}
-
 sPAPRDRConnectorType spapr_drc_type(sPAPRDRConnector *drc)
 {
 sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
@@ -81,6 +53,16 @@ static uint32_t set_isolation_state(sPAPRDRConnector *drc,
 
 trace_spapr_drc_set_isolation_state(spapr_drc_index(drc), state);
 
+/* if the guest is configuring a device attached to this DRC, we
+ * should reset the configuration state at this point since it may
+ * no longer be reliable (guest released device and needs to start
+ * over, or unplug occurred so the FDT is no longer valid)
+ */
+if (state == SPAPR_DR_ISOLATION_STATE_ISOLATED) {
+g_free(drc->ccs);
+drc->ccs = NULL;
+}
+
 if (state == SPAPR_DR_ISOLATION_STATE_UNISOLATED) {
 /* cannot unisolate a non-existent resource, and, or resources
  * which are in an 'UNUSABLE' allocation state. (PAPR 2.7, 13.5.3.5)
@@ -485,6 +467,10 @@ static void reset(DeviceState *d)
 sPAPRDREntitySense state;
 
 trace_spapr_drc_reset(spapr_drc_index(drc));
+
+g_free(drc->ccs);
+drc->ccs = NULL;
+
 /* immediately upon reset we can safely assume DRCs whose devices
  * are pending removal can be safely removed, and that they will
  * subsequently be left in an ISOLATED state. move the DRC to this
@@ -1020,19 +1006,6 @@ static void rtas_set_indicator(PowerPCCPU *cpu, sPAPRMachineState *spapr,
 
 switch (sensor_type) {
 case RTAS_SENSOR_TYPE_ISOLATION_STATE:
-/* if the guest is configuring a device attached to this
- * DRC, we should reset the configuration state at this
- * point since it may no longer be reliable (guest released
- * device and needs to start over, or unplug occurred so
- * the FDT is no longer valid)
- */
-if (sensor_state == SPAPR_DR_ISOLATION_STATE_ISOLATED) {
-sPAPRConfigureConnectorState *ccs = spapr_ccs_find(spapr,
-   sensor_index);
-if (ccs) {
-spapr_ccs_remove(spapr, ccs);
-}
-}
 ret = drck->set_isolation_state(drc, sensor_state);
 break;
 case RTAS_SENSOR_TYPE_DR:
@@ -1116,16 +1089,6 @@ static void configure_connector_st(target_ulong addr, target_ulong offset,
   buf, MIN(len, CC_WA_LEN - offset));
 }
 
-void spapr_ccs_reset_hook(void *opaque)
-{
-sPAPRMachineState *spapr = opaque;
-sPAPRConfigureConnectorState *ccs, *ccs_tmp;
-
-QTAILQ_FOREACH_SAFE(ccs, &spapr->ccs_list, next, ccs_tmp) {
-spapr_ccs_remove(spapr, ccs);
-}
-}
-
 static void rtas_ibm_c

[Qemu-devel] [PATCH 1/5] spapr: Introduce DRC subclasses

2017-06-02 Thread David Gibson

Currently we only have a single QOM type for all DRCs, but lots of
places where we switch behaviour based on the DRC's PAPR defined type.
This is a poor use of our existing type system.

So, instead create QOM subclasses for each PAPR defined DRC type.  We
also introduce intermediate subclasses for physical and logical DRCs,
a division which will be useful later on.

Instead of being stored in the DRC object itself, the PAPR type is now
stored in the class structure.  There are still many places where we
switch directly on the PAPR type value, but this at least provides the
basis to start to remove those.

Signed-off-by: David Gibson 
---
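Not part of the patch: a standalone model of what "the PAPR type is stored
in the class structure" means for the index encoding. DrcClass and Drc are
simplified stand-ins for the QOM class and instance, and the typeshift
values are illustrative assumptions; only the two DRC_INDEX_* macros and
the shape of the expressions are taken from the diff below.

```c
#include <assert.h>
#include <stdint.h>

#define DRC_INDEX_TYPE_SHIFT 28
#define DRC_INDEX_ID_MASK ((1ULL << DRC_INDEX_TYPE_SHIFT) - 1)

/* Simplified stand-in for the per-type class structure. */
typedef struct DrcClass {
    const char *name;
    uint32_t typeshift; /* one value per DRC type, shared by all instances */
} DrcClass;

/* Simplified stand-in for a DRC instance: no per-object type field. */
typedef struct Drc {
    const DrcClass *klass;
    uint32_t id;
} Drc;

/* Illustrative typeshift values, assumed for this sketch. */
static const DrcClass drc_cpu_class = { .name = "cpu", .typeshift = 1 };
static const DrcClass drc_lmb_class = { .name = "lmb", .typeshift = 8 };

/* The PAPR type bit is derived from the class, not stored in the object. */
static uint32_t drc_type(const Drc *drc)
{
    return 1u << drc->klass->typeshift;
}

/* Index = type shift in the top bits, per-type id in the low 28 bits. */
static uint32_t drc_index(const Drc *drc)
{
    assert(drc->klass);
    return (drc->klass->typeshift << DRC_INDEX_TYPE_SHIFT)
        | (uint32_t)(drc->id & DRC_INDEX_ID_MASK);
}
```
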
 hw/ppc/spapr.c |   5 +-
 hw/ppc/spapr_drc.c | 120 +
 hw/ppc/spapr_pci.c |   3 +-
 include/hw/ppc/spapr_drc.h |  47 --
 4 files changed, 136 insertions(+), 39 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 5d10366..456f9e7 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1911,7 +1911,7 @@ static void spapr_create_lmb_dr_connectors(sPAPRMachineState *spapr)
 uint64_t addr;
 
 addr = i * lmb_size + spapr->hotplug_memory.base;
-drc = spapr_dr_connector_new(OBJECT(spapr), SPAPR_DR_CONNECTOR_TYPE_LMB,
+drc = spapr_dr_connector_new(OBJECT(spapr), TYPE_SPAPR_DRC_LMB,
  addr/lmb_size);
 qemu_register_reset(spapr_drc_reset, drc);
 }
@@ -2008,8 +2008,7 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
 
 if (mc->has_hotpluggable_cpus) {
 sPAPRDRConnector *drc =
-spapr_dr_connector_new(OBJECT(spapr),
-   SPAPR_DR_CONNECTOR_TYPE_CPU,
+spapr_dr_connector_new(OBJECT(spapr), TYPE_SPAPR_DRC_CPU,
(core_id / smp_threads) * smt);
 
 qemu_register_reset(spapr_drc_reset, drc);
diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index a35314e..690b41f 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -70,14 +70,23 @@ static sPAPRDRConnectorTypeShift get_type_shift(sPAPRDRConnectorType type)
 return shift;
 }
 
+sPAPRDRConnectorType spapr_drc_type(sPAPRDRConnector *drc)
+{
+sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
+
+return 1 << drck->typeshift;
+}
+
 uint32_t spapr_drc_index(sPAPRDRConnector *drc)
 {
+sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
+
 /* no set format for a drc index: it only needs to be globally
  * unique. this is how we encode the DRC type on bare-metal
  * however, so might as well do that here
  */
-return (get_type_shift(drc->type) << DRC_INDEX_TYPE_SHIFT) |
-(drc->id & DRC_INDEX_ID_MASK);
+return (drck->typeshift << DRC_INDEX_TYPE_SHIFT)
+| (drc->id & DRC_INDEX_ID_MASK);
 }
 
 static uint32_t set_isolation_state(sPAPRDRConnector *drc,
@@ -107,7 +116,7 @@ static uint32_t set_isolation_state(sPAPRDRConnector *drc,
  * If the LMB being removed doesn't belong to a DIMM device that is
  * actually being unplugged, fail the isolation request here.
  */
-if (drc->type == SPAPR_DR_CONNECTOR_TYPE_LMB) {
+if (spapr_drc_type(drc) == SPAPR_DR_CONNECTOR_TYPE_LMB) {
 if ((state == SPAPR_DR_ISOLATION_STATE_ISOLATED) &&
  !drc->awaiting_release) {
 return RTAS_OUT_HW_ERROR;
@@ -177,7 +186,7 @@ static uint32_t set_allocation_state(sPAPRDRConnector *drc,
 }
 }
 
-if (drc->type != SPAPR_DR_CONNECTOR_TYPE_PCI) {
+if (spapr_drc_type(drc) != SPAPR_DR_CONNECTOR_TYPE_PCI) {
 drc->allocation_state = state;
 if (drc->awaiting_release &&
 drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) {
@@ -191,11 +200,6 @@ static uint32_t set_allocation_state(sPAPRDRConnector *drc,
 return RTAS_OUT_SUCCESS;
 }
 
-sPAPRDRConnectorType spapr_drc_type(sPAPRDRConnector *drc)
-{
-return drc->type;
-}
-
 static const char *get_name(sPAPRDRConnector *drc)
 {
 return drc->name;
@@ -217,7 +221,7 @@ static void set_signalled(sPAPRDRConnector *drc)
 static uint32_t entity_sense(sPAPRDRConnector *drc, sPAPRDREntitySense *state)
 {
 if (drc->dev) {
-if (drc->type != SPAPR_DR_CONNECTOR_TYPE_PCI &&
+if (spapr_drc_type(drc) != SPAPR_DR_CONNECTOR_TYPE_PCI &&
 drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) {
 /* for logical DR, we return a state of UNUSABLE
  * iff the allocation state UNUSABLE.
@@ -235,7 +239,7 @@ static uint32_t entity_sense(sPAPRDRConnector *drc, sPAPRDREntitySense *state)
 *state = SPAPR_DR_ENTITY_SENSE_PRESENT;
 }
 } else {
-if (drc->type == SPAPR_DR_CONNECTOR_TYPE_PCI) {
+if (spapr_drc_type(drc) == SPAPR_DR_CONNECTOR_TYPE_PCI) {
 /* PCI devices, and only PCI devices, use EMPTY
  * in cases where we'd otherwi

[Qemu-devel] [PATCH 5/5] spapr: Remove some non-useful properties on DRC objects

2017-06-02 Thread David Gibson

 * 'connector_type' is easily derived from the 'index' property, so there's
   no point to it (it's also implicit in the QOM type of the DRC)
 * 'isolation-state', 'indicator-state' and 'allocation-state' are
   part of the transaction between qemu and guest during PAPR hotplug
   operations, and outside tools really have no business looking at it
   (especially not changing, and these were RW properties)
 * 'entity-sense' is basically just a weird PAPR encoding of whether there
   is a device connected to this DRC

Strictly speaking removing these properties is breaking the qemu interface.
However, I'm pretty sure no management tools have ever used these.  For
debugging there are better alternatives.  Therefore, I think removing these
broken interfaces is the better option.

Signed-off-by: David Gibson 
---
 hw/ppc/spapr_drc.c | 29 -
 1 file changed, 29 deletions(-)

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index d43c9cd..4dd26a8 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -228,14 +228,6 @@ static void prop_get_index(Object *obj, Visitor *v, const char *name,
 visit_type_uint32(v, name, &value, errp);
 }
 
-static void prop_get_type(Object *obj, Visitor *v, const char *name,
-  void *opaque, Error **errp)
-{
-sPAPRDRConnector *drc = SPAPR_DR_CONNECTOR(obj);
-uint32_t value = (uint32_t)spapr_drc_type(drc);
-visit_type_uint32(v, name, &value, errp);
-}
-
 static char *prop_get_name(Object *obj, Error **errp)
 {
 sPAPRDRConnector *drc = SPAPR_DR_CONNECTOR(obj);
@@ -243,17 +235,6 @@ static char *prop_get_name(Object *obj, Error **errp)
 return g_strdup(drck->get_name(drc));
 }
 
-static void prop_get_entity_sense(Object *obj, Visitor *v, const char *name,
-  void *opaque, Error **errp)
-{
-sPAPRDRConnector *drc = SPAPR_DR_CONNECTOR(obj);
-sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
-uint32_t value;
-
-drck->entity_sense(drc, &value);
-visit_type_uint32(v, name, &value, errp);
-}
-
 static void prop_get_fdt(Object *obj, Visitor *v, const char *name,
  void *opaque, Error **errp)
 {
@@ -670,20 +651,10 @@ static void spapr_dr_connector_instance_init(Object *obj)
 {
 sPAPRDRConnector *drc = SPAPR_DR_CONNECTOR(obj);
 
-object_property_add_uint32_ptr(obj, "isolation-state",
-   &drc->isolation_state, NULL);
-object_property_add_uint32_ptr(obj, "indicator-state",
-   &drc->indicator_state, NULL);
-object_property_add_uint32_ptr(obj, "allocation-state",
-   &drc->allocation_state, NULL);
 object_property_add_uint32_ptr(obj, "id", &drc->id, NULL);
 object_property_add(obj, "index", "uint32", prop_get_index,
 NULL, NULL, NULL, NULL);
-object_property_add(obj, "connector_type", "uint32", prop_get_type,
-NULL, NULL, NULL, NULL);
 object_property_add_str(obj, "name", prop_get_name, NULL, NULL);
-object_property_add(obj, "entity-sense", "uint32", prop_get_entity_sense,
-NULL, NULL, NULL, NULL);
 object_property_add(obj, "fdt", "struct", prop_get_fdt,
 NULL, NULL, NULL, NULL);
 }
-- 
2.9.4

[Qemu-devel] [PATCH 0/5] spapr: DRC cleanups (part II)

2017-06-02 Thread David Gibson

Having merged my first batch of cleanups for the DRC code, this series
contains a second batch.

This adds the long-discussed QOM subtypes for different PAPR DRC
types.  It still only makes partial use of the QOM type structure, but
it's a start.  It also removes the artificial separation between
configure-connector and DRC state, which simplifies a number of things.

David Gibson (5):
  spapr: Introduce DRC subclasses
  spapr: Clean up spapr_dr_connector_by_*()
  spapr: Move configure-connector state into DRC
  spapr: Eliminate spapr_drc_get_type_str()
  spapr: Remove some non-useful properties on DRC objects

 hw/ppc/spapr.c |  37 +++---
 hw/ppc/spapr_drc.c | 275 +++--
 hw/ppc/spapr_events.c  |   2 +-
 hw/ppc/spapr_pci.c |   9 +-
 include/hw/ppc/spapr.h |  14 ---
 include/hw/ppc/spapr_drc.h |  60 +-
 6 files changed, 189 insertions(+), 208 deletions(-)

-- 
2.9.4

[Qemu-devel] [PATCH 2/5] spapr: Clean up spapr_dr_connector_by_*()

2017-06-02 Thread David Gibson

 * Change names to something less ludicrously verbose
 * Now that we have QOM subclasses for the different DRC types, use a QOM
   typename instead of a PAPR type value parameter

The latter allows removal of the get_type_shift() helper.

Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c | 28 ++--
 hw/ppc/spapr_drc.c | 34 ++
 hw/ppc/spapr_events.c  |  2 +-
 hw/ppc/spapr_pci.c |  6 ++
 include/hw/ppc/spapr_drc.h |  5 ++---
 5 files changed, 29 insertions(+), 46 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 456f9e7..7aac3b9 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -460,7 +460,7 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
 uint32_t radix_AP_encodings[PPC_PAGE_SIZES_MAX_SZ];
 int i;
 
-drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, index);
+drc = spapr_drc_by_id(TYPE_SPAPR_DRC_CPU, index);
 if (drc) {
 drc_index = spapr_drc_index(drc);
 _FDT((fdt_setprop_cell(fdt, offset, "ibm,my-drc-index", drc_index)));
@@ -653,7 +653,7 @@ static int spapr_populate_drconf_memory(sPAPRMachineState *spapr, void *fdt)
 if (i >= hotplug_lmb_start) {
 sPAPRDRConnector *drc;
 
-drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB, i);
+drc = spapr_drc_by_id(TYPE_SPAPR_DRC_LMB, i);
 g_assert(drc);
 
 dynamic_memory[0] = cpu_to_be32(addr >> 32);
@@ -2528,8 +2528,8 @@ static void spapr_add_lmbs(DeviceState *dev, uint64_t addr_start, uint64_t size,
 uint64_t addr = addr_start;
 
 for (i = 0; i < nr_lmbs; i++) {
-drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
-addr/SPAPR_MEMORY_BLOCK_SIZE);
+drc = spapr_drc_by_id(TYPE_SPAPR_DRC_LMB,
+  addr / SPAPR_MEMORY_BLOCK_SIZE);
 g_assert(drc);
 
 fdt = create_device_tree(&fdt_size);
@@ -2550,8 +2550,8 @@ static void spapr_add_lmbs(DeviceState *dev, uint64_t addr_start, uint64_t size,
  */
 if (dev->hotplugged) {
 if (dedicated_hp_event_source) {
-drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
-addr_start / SPAPR_MEMORY_BLOCK_SIZE);
+drc = spapr_drc_by_id(TYPE_SPAPR_DRC_LMB,
+  addr_start / SPAPR_MEMORY_BLOCK_SIZE);
 drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
 spapr_hotplug_req_add_by_count_indexed(SPAPR_DR_CONNECTOR_TYPE_LMB,
nr_lmbs,
@@ -2668,8 +2668,8 @@ static sPAPRDIMMState *spapr_recover_pending_dimm_state(sPAPRMachineState *ms,
 
 addr = addr_start;
 for (i = 0; i < nr_lmbs; i++) {
-drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
-   addr / SPAPR_MEMORY_BLOCK_SIZE);
+drc = spapr_drc_by_id(TYPE_SPAPR_DRC_LMB,
+  addr / SPAPR_MEMORY_BLOCK_SIZE);
 g_assert(drc);
 if (drc->indicator_state != SPAPR_DR_INDICATOR_STATE_INACTIVE) {
 avail_lmbs++;
@@ -2752,8 +2752,8 @@ static void spapr_memory_unplug_request(HotplugHandler *hotplug_dev,
 
 addr = addr_start;
 for (i = 0; i < nr_lmbs; i++) {
-drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
-addr / SPAPR_MEMORY_BLOCK_SIZE);
+drc = spapr_drc_by_id(TYPE_SPAPR_DRC_LMB,
+  addr / SPAPR_MEMORY_BLOCK_SIZE);
 g_assert(drc);
 
 drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
@@ -2761,8 +2761,8 @@ static void spapr_memory_unplug_request(HotplugHandler *hotplug_dev,
 addr += SPAPR_MEMORY_BLOCK_SIZE;
 }
 
-drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
-   addr_start / SPAPR_MEMORY_BLOCK_SIZE);
+drc = spapr_drc_by_id(TYPE_SPAPR_DRC_LMB,
+  addr_start / SPAPR_MEMORY_BLOCK_SIZE);
 drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
 spapr_hotplug_req_remove_by_count_indexed(SPAPR_DR_CONNECTOR_TYPE_LMB,
   nr_lmbs, spapr_drc_index(drc));
@@ -2833,7 +2833,7 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
 return;
 }
 
-drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, index * smt);
+drc = spapr_drc_by_id(TYPE_SPAPR_DRC_CPU, index * smt);
 g_assert(drc);
 
 drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
@@ -2868,7 +2868,7 @@ static void spapr_core_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
cc->core_id);
 return;
 }
-drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, index * smt);
+drc = spapr_drc_by_id(TYPE_SPAPR_DRC_CPU, index * smt);
 
 g_assert(drc || !mc->has_hotpluggable_cpus);
 
diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 690b41

[Qemu-devel] [PATCH 4/5] spapr: Eliminate spapr_drc_get_type_str()

2017-06-02 Thread David Gibson

This function was used in generating the device tree.  However, now that
we have different QOM types for different DRC types we can easily store
the information we need in the class structure and avoid this specialized
lookup function.

Signed-off-by: David Gibson 
---
 hw/ppc/spapr_drc.c | 31 ---
 include/hw/ppc/spapr_drc.h |  1 +
 2 files changed, 5 insertions(+), 27 deletions(-)

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 27d4bd3..d43c9cd 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -716,6 +716,7 @@ static void spapr_drc_cpu_class_init(ObjectClass *k, void *data)
 sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_CLASS(k);
 
 drck->typeshift = SPAPR_DR_CONNECTOR_TYPE_SHIFT_CPU;
+drck->typename = "CPU";
 }
 
 static void spapr_drc_pci_class_init(ObjectClass *k, void *data)
@@ -723,6 +724,7 @@ static void spapr_drc_pci_class_init(ObjectClass *k, void *data)
 sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_CLASS(k);
 
 drck->typeshift = SPAPR_DR_CONNECTOR_TYPE_SHIFT_PCI;
+drck->typename = "28";
 }
 
 static void spapr_drc_lmb_class_init(ObjectClass *k, void *data)
@@ -730,6 +732,7 @@ static void spapr_drc_lmb_class_init(ObjectClass *k, void *data)
 sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_CLASS(k);
 
 drck->typeshift = SPAPR_DR_CONNECTOR_TYPE_SHIFT_LMB;
+drck->typename = "MEM";
 }
 
 static const TypeInfo spapr_dr_connector_info = {
@@ -796,31 +799,6 @@ sPAPRDRConnector *spapr_drc_by_id(const char *type, uint32_t id)
   | (id & DRC_INDEX_ID_MASK));
 }
 
-/* generate a string the describes the DRC to encode into the
- * device tree.
- *
- * as documented by PAPR+ v2.7, 13.5.2.6 and C.6.1
- */
-static const char *spapr_drc_get_type_str(sPAPRDRConnectorType type)
-{
-switch (type) {
-case SPAPR_DR_CONNECTOR_TYPE_CPU:
-return "CPU";
-case SPAPR_DR_CONNECTOR_TYPE_PHB:
-return "PHB";
-case SPAPR_DR_CONNECTOR_TYPE_VIO:
-return "SLOT";
-case SPAPR_DR_CONNECTOR_TYPE_PCI:
-return "28";
-case SPAPR_DR_CONNECTOR_TYPE_LMB:
-return "MEM";
-default:
-g_assert(false);
-}
-
-return NULL;
-}
-
 /**
  * spapr_drc_populate_dt
  *
@@ -902,8 +880,7 @@ int spapr_drc_populate_dt(void *fdt, int fdt_offset, Object *owner,
 drc_names = g_string_insert_len(drc_names, -1, "\0", 1);
 
 /* ibm,drc-types */
-drc_types = g_string_append(drc_types,
-spapr_drc_get_type_str(spapr_drc_type(drc)));
+drc_types = g_string_append(drc_types, drck->typename);
 drc_types = g_string_insert_len(drc_types, -1, "\0", 1);
 }
 
diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
index 53b0f8b..8a4889a 100644
--- a/include/hw/ppc/spapr_drc.h
+++ b/include/hw/ppc/spapr_drc.h
@@ -212,6 +212,7 @@ typedef struct sPAPRDRConnectorClass {
 
 /*< public >*/
 sPAPRDRConnectorTypeShift typeshift;
+const char *typename; /* used in device tree, PAPR 13.5.2.6 & C.6.1 */
 
 /* accessors for guest-visible (generally via RTAS) DR state */
 uint32_t (*set_isolation_state)(sPAPRDRConnector *drc,
-- 
2.9.4
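
For readers following along outside the QEMU tree, the idea in this patch can be sketched in plain C. This is only an illustration under stated assumptions: DemoDRCClass, DemoDRC and demo_append_type() are hypothetical names, QOM is replaced by a plain pointer to a constant class struct, and a fixed buffer stands in for the GString used by spapr_drc_populate_dt(). The three strings are the ones set in the class_init functions in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for sPAPRDRConnectorClass: per-type constants. */
typedef struct DemoDRCClass {
    const char *type_name;   /* string emitted into "ibm,drc-types" */
} DemoDRCClass;

/* Hypothetical stand-in for sPAPRDRConnector: an instance points at its class. */
typedef struct DemoDRC {
    const DemoDRCClass *klass;
} DemoDRC;

/* One class per DRC type, each carrying its own device tree string. */
static const DemoDRCClass demo_cpu_class = { "CPU" };
static const DemoDRCClass demo_pci_class = { "28" };
static const DemoDRCClass demo_lmb_class = { "MEM" };

/*
 * Append the connector's type string plus its NUL separator to buf and
 * return the new length.  No switch on a type enum is needed: the string
 * comes straight from the class.
 */
static size_t demo_append_type(char *buf, size_t len, size_t cap,
                               const DemoDRC *drc)
{
    size_t n = strlen(drc->klass->type_name) + 1; /* keep the trailing NUL */

    assert(len + n <= cap);
    memcpy(buf + len, drc->klass->type_name, n);
    return len + n;
}
```

The design point is that adding a new DRC type only requires filling in one more class, instead of extending a central switch that can fall through to g_assert(false).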

Re: [Qemu-devel] [PATCH v2] spapr: manage hotplugged devices while the VM is not started

2017-06-02 Thread David Gibson

On Wed, May 31, 2017 at 11:25:16AM +0200, Laurent Vivier wrote:
> For QEMU, a hotplugged device is a device added using the HMP/QMP
> interface.
> For SPAPR, a hotplugged device is a device added while the
> machine is running. In this case QEMU doesn't update internal
> state but relies on the OS for this part.
> 
> In the case of migration, when we (libvirt) hotplug a device
> on the source guest, we (libvirt) generally hotplug the same
> device on the destination guest. But in this case, the machine
> is stopped (RUN_STATE_INMIGRATE) and QEMU must not expect
> the OS will manage it as a hotplugged device as it will
> be "imported" by the migration.
> 
> This patch changes the meaning of "hotplugged" in spapr.c
> to manage a QEMU hotplugged device like a "coldplugged" one
> when the machine is awaiting an incoming migration.
> 
> Signed-off-by: Laurent Vivier 
> ---
> v2:
> - don't replace dev->hotplugged to test if CPU hotplug
>   is supported.

So, this addresses the specific points mentioned on the last version,
but not the wider query: what exactly in the previous behaviour was
causing problems.

On a related note, having better understood the DRC code while writing
my cleanups, I'm reconsidering your earlier suggestion of simply
disabling hotplugs until after CAS.

I rejected that approach before because I was assuming the problem was
due to hotplug events being lost, and I was confident that CAS marked
a transition from losing to not-losing events (other than by
accident).  However, later discussions have shown that the problem is
more to do with device tree updates being essentially duplicated
between the DT given to the guest during CAS and further updates from
hotplug events.

On that new understanding CAS does seem like a logical watershed,
since it's the point at which the guest gets the final version of the
"cold plugged" device tree.  I still have some concerns about the
details, but the basic approach seems sound.

Sorry for the misdirection.

> 
>  hw/ppc/spapr.c | 18 +-
>  1 file changed, 13 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index ab3aab1..f0c543c 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2521,6 +2521,12 @@ static void spapr_nmi(NMIState *n, int cpu_index, Error **errp)
>  }
>  }
>  
> +static bool spapr_coldplugged(DeviceState *dev)
> +{
> +return runstate_check(RUN_STATE_INMIGRATE) ||
> +   !dev->hotplugged;
> +}
> +
>  static void spapr_add_lmbs(DeviceState *dev, uint64_t addr_start, uint64_t size,
> uint32_t node, bool dedicated_hp_event_source,
> Error **errp)
> @@ -2531,6 +2537,7 @@ static void spapr_add_lmbs(DeviceState *dev, uint64_t addr_start, uint64_t size,
>  int i, fdt_offset, fdt_size;
>  void *fdt;
>  uint64_t addr = addr_start;
> +bool coldplugged = spapr_coldplugged(dev);
>  
>  for (i = 0; i < nr_lmbs; i++) {
>  drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
> @@ -2542,9 +2549,9 @@ static void spapr_add_lmbs(DeviceState *dev, uint64_t addr_start, uint64_t size,
>  SPAPR_MEMORY_BLOCK_SIZE);
>  
>  drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> -drck->attach(drc, dev, fdt, fdt_offset, !dev->hotplugged, errp);
> +drck->attach(drc, dev, fdt, fdt_offset, coldplugged, errp);
>  addr += SPAPR_MEMORY_BLOCK_SIZE;
> -if (!dev->hotplugged) {
> +if (coldplugged) {
>  /* guests expect coldplugged LMBs to be pre-allocated */
>  drck->set_allocation_state(drc, SPAPR_DR_ALLOCATION_STATE_USABLE);
>  drck->set_isolation_state(drc, SPAPR_DR_ISOLATION_STATE_UNISOLATED);
> @@ -2553,7 +2560,7 @@ static void spapr_add_lmbs(DeviceState *dev, uint64_t addr_start, uint64_t size,
>  /* send hotplug notification to the
>   * guest only in case of hotplugged memory
>   */
> -if (dev->hotplugged) {
> +if (!coldplugged) {
>  if (dedicated_hp_event_source) {
>  drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_LMB,
>  addr_start / SPAPR_MEMORY_BLOCK_SIZE);
> @@ -2867,6 +2874,7 @@ static void spapr_core_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>  int smt = kvmppc_smt_threads();
>  CPUArchId *core_slot;
>  int index;
> +bool coldplugged = spapr_coldplugged(dev);
>  
>  core_slot = spapr_find_cpu_slot(MACHINE(hotplug_dev), cc->core_id, &index);
>  if (!core_slot) {
> @@ -2888,7 +2896,7 @@ static void spapr_core_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>  
>  if (drc) {
>  sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> -drck->attach(drc, dev, fdt, fdt_offset, !dev->hotplugged, &local_err);
> +drck->attach(drc, dev, fdt, fdt_offset, coldplugged, &local_err);
>  if (local_err) {
>  g_

[Qemu-devel] [PULL 01/15] char: cast ARRAY_SIZE() as signed to silent warning on empty array

2017-06-02 Thread Marc-André Lureau

From: Philippe Mathieu-Daudé 

chardev/char.c: In function 'chardev_name_foreach':
chardev/char.c:546:19: error: comparison of unsigned expression < 0 is always false [-Werror=type-limits]
 for (i = 0; i < ARRAY_SIZE(chardev_alias_table); i++) {
   ^
Signed-off-by: Philippe Mathieu-Daudé 
Message-Id: <20170530120919.8874-1-f4...@amsat.org>
Reviewed-by: Marc-André Lureau 
---
 chardev/char.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/chardev/char.c b/chardev/char.c
index 4e24dc39af..26607c1c6b 100644
--- a/chardev/char.c
+++ b/chardev/char.c
@@ -841,7 +841,7 @@ chardev_name_foreach(void (*fn)(const char *name, void *opaque), void *opaque)
 
 object_class_foreach(chardev_class_foreach, TYPE_CHARDEV, false, &fe);
 
-for (i = 0; i < ARRAY_SIZE(chardev_alias_table); i++) {
+for (i = 0; i < (int)ARRAY_SIZE(chardev_alias_table); i++) {
 fn(chardev_alias_table[i].alias, opaque);
 }
 }
@@ -887,7 +887,7 @@ Chardev *qemu_chr_new_from_opts(QemuOpts *opts,
 return NULL;
 }
 
-for (i = 0; i < ARRAY_SIZE(chardev_alias_table); i++) {
+for (i = 0; i < (int)ARRAY_SIZE(chardev_alias_table); i++) {
 if (g_strcmp0(chardev_alias_table[i].alias, name) == 0) {
 name = chardev_alias_table[i].typename;
 break;
-- 
2.13.0.91.g00982b8dd
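
A short sketch of why the cast helps, for anyone who has not hit this warning before. When every entry of the table is compiled out, ARRAY_SIZE() evaluates to an unsigned zero, the int index is converted to unsigned for the comparison, and GCC's -Wtype-limits reports that "unsigned < 0" can never be true. Casting the size to int keeps the comparison signed. Everything named Demo/demo below is hypothetical, the two table entries are illustrative only, and the table is kept non-empty so the sketch builds without relying on the empty-initializer extension.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical alias table in the spirit of chardev_alias_table. */
typedef struct DemoAlias {
    const char *type_name;
    const char *alias;
} DemoAlias;

static const DemoAlias demo_alias_table[] = {
    { "parallel", "parport" },   /* illustrative entry */
    { "serial", "tty" },         /* illustrative entry */
};

/* Map an alias to its type name, or return the name unchanged. */
static const char *demo_resolve_alias(const char *name)
{
    int i;

    /*
     * The (int) cast makes this a signed comparison.  With an empty
     * table the bound would be 0 and the loop simply never runs,
     * without the compiler flagging an always-false unsigned test.
     */
    for (i = 0; i < (int)DEMO_ARRAY_SIZE(demo_alias_table); i++) {
        if (strcmp(demo_alias_table[i].alias, name) == 0) {
            return demo_alias_table[i].type_name;
        }
    }
    return name;
}
```

The cast is only safe because the table is tiny; for an array whose size could exceed INT_MAX a different fix would be needed.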

[Qemu-devel] [PULL 00/15] chardev patches

2017-06-02 Thread Marc-André Lureau

The following changes since commit 43771d5d92312504305c19abe29ec5bfabd55f01:

  Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2017-05-31' into staging (2017-06-01 16:39:16 +0100)

are available in the git repository at:

  github/chrfe chrfe-pull-request

for you to fetch changes up to 6b10e573d15ef82dbc5c5b3726028e6642e134f6:

  char: move char devices to chardev/ (2017-06-02 11:33:53 +0400)

Marc-André Lureau (14):
  char-win: simplify win_chr_read()
  char-win: remove WinChardev.len
  char-win: rename win_chr_init/poll win_chr_serial_init/poll
  char-win: rename hcom->file
  char-win: close file handle except with console
  Remove/replace sysemu/char.h inclusion
  chardev: move headers to include/chardev
  chardev: serial & parallel declaration to own headers
  be-hci: use backend functions
  char: generalize qemu_chr_write_all()
  char: move CharBackend handling in char-fe unit
  char: rename functions that are not part of fe
  char: make chr_fe_deinit() optionaly delete backend
  char: move char devices to chardev/

Philippe Mathieu-Daudé (1):
  char: cast ARRAY_SIZE() as signed to silent warning on empty array

 {chardev => include/chardev}/char-fd.h|   2 +-
 include/chardev/char-fe.h | 251 +
 {chardev => include/chardev}/char-io.h|   2 +-
 {chardev => include/chardev}/char-mux.h   |   3 +-
 {chardev => include/chardev}/char-parallel.h  |  20 +-
 {chardev => include/chardev}/char-serial.h|  22 ++
 {chardev => include/chardev}/char-win-stdio.h |   0
 {chardev => include/chardev}/char-win.h   |  14 +-
 include/chardev/char.h| 229 
 include/hw/char/bcm2835_aux.h |   2 +-
 include/hw/char/cadence_uart.h|   2 +-
 include/hw/char/digic-uart.h  |   2 +-
 include/hw/char/imx_serial.h  |   2 +-
 include/hw/char/serial.h  |   4 +-
 include/hw/char/stm32f2xx_usart.h |   2 +-
 include/sysemu/char.h | 499 --
 backends/rng-egd.c|   4 +-
 {backends => chardev}/baum.c  |   2 +-
 chardev/char-console.c|   4 +-
 chardev/char-fd.c |   6 +-
 chardev/char-fe.c | 361 +++
 chardev/char-file.c   |   8 +-
 chardev/char-io.c |   2 +-
 chardev/char-mux.c|   6 +-
 chardev/char-null.c   |   2 +-
 chardev/char-parallel.c   |   6 +-
 chardev/char-pipe.c   |  16 +-
 chardev/char-pty.c|   4 +-
 chardev/char-ringbuf.c|   2 +-
 chardev/char-serial.c |   8 +-
 chardev/char-socket.c |   4 +-
 chardev/char-stdio.c  |   8 +-
 chardev/char-udp.c|   4 +-
 chardev/char-win-stdio.c  |   4 +-
 chardev/char-win.c|  95 ++---
 chardev/char.c| 390 +---
 {backends => chardev}/msmouse.c   |   2 +-
 spice-qemu-char.c => chardev/spice.c  |   4 +-
 {backends => chardev}/testdev.c   |   2 +-
 {backends => chardev}/wctablet.c  |   2 +-
 gdbstub.c |  18 +-
 hmp.c |   2 +-
 hw/arm/bcm2835_peripherals.c  |   1 -
 hw/arm/fsl-imx25.c|   2 +-
 hw/arm/fsl-imx31.c|   2 +-
 hw/arm/fsl-imx6.c |   2 +-
 hw/arm/omap2.c|   2 +-
 hw/arm/pxa2xx.c   |   2 +-
 hw/arm/strongarm.c|   3 +-
 hw/bt/hci-csr.c   |  11 +-
 hw/char/cadence_uart.c|   3 +-
 hw/char/debugcon.c|   2 +-
 hw/char/digic-uart.c  |   2 +-
 hw/char/escc.c|   3 +-
 hw/char/etraxfs_ser.c |   2 +-
 hw/char/exynos4210_uart.c |   3 +-
 hw/char/grlib_apbuart.c   |   2 +-
 hw/char/imx_serial.c  |   1 -
 hw/char/ipoctal232.c  |   2 +-
 hw/char/lm32_juart.c  |   2 +-
 hw/char/lm32_uart.c   |   2 +-
 hw/char/mcf_uart.c|   2 +-
 hw/char/milkymist-uart.c  |   2 +-
 hw/char/omap_uart.c   |   2 +-
 hw/char/parallel.c|   3 +-

[Qemu-devel] [PULL 03/15] char-win: remove WinChardev.len

2017-06-02 Thread Marc-André Lureau

The "len" argument can be passed directly to win_chr_read().

Signed-off-by: Marc-André Lureau 
Reviewed-by: Philippe Mathieu-Daudé 
---
 chardev/char-win.h |  1 -
 chardev/char-win.c | 16 +++-
 2 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/chardev/char-win.h b/chardev/char-win.h
index 73a0e3caef..70215e04c2 100644
--- a/chardev/char-win.h
+++ b/chardev/char-win.h
@@ -32,7 +32,6 @@ typedef struct {
 HANDLE hcom, hrecv, hsend;
 OVERLAPPED orecv;
 BOOL fpipe;
-DWORD len;
 
 /* Protected by the Chardev chr_write_lock.  */
 OVERLAPPED osend;
diff --git a/chardev/char-win.c b/chardev/char-win.c
index a46d878ef8..5e7daeeae1 100644
--- a/chardev/char-win.c
+++ b/chardev/char-win.c
@@ -26,7 +26,7 @@
 #include "qapi/error.h"
 #include "char-win.h"
 
-static void win_chr_read(Chardev *chr)
+static void win_chr_read(Chardev *chr, DWORD len)
 {
 WinChardev *s = WIN_CHARDEV(chr);
 int max_size = qemu_chr_be_can_write(chr);
@@ -34,16 +34,16 @@ static void win_chr_read(Chardev *chr)
 uint8_t buf[CHR_READ_BUF_LEN];
 DWORD size;
 
-if (s->len > max_size) {
-s->len = max_size;
+if (len > max_size) {
+len = max_size;
 }
-if (s->len == 0) {
+if (len == 0) {
 return;
 }
 
 ZeroMemory(&s->orecv, sizeof(s->orecv));
 s->orecv.hEvent = s->hrecv;
-ret = ReadFile(s->hcom, buf, s->len, &size, &s->orecv);
+ret = ReadFile(s->hcom, buf, len, &size, &s->orecv);
 if (!ret) {
 err = GetLastError();
 if (err == ERROR_IO_PENDING) {
@@ -65,8 +65,7 @@ static int win_chr_poll(void *opaque)
 
 ClearCommError(s->hcom, &comerr, &status);
 if (status.cbInQue > 0) {
-s->len = status.cbInQue;
-win_chr_read(chr);
+win_chr_read(chr, status.cbInQue);
 return 1;
 }
 return 0;
@@ -146,8 +145,7 @@ int win_chr_pipe_poll(void *opaque)
 
 PeekNamedPipe(s->hcom, NULL, 0, NULL, &size, NULL);
 if (size > 0) {
-s->len = size;
-win_chr_read(chr);
+win_chr_read(chr, size);
 return 1;
 }
 return 0;
-- 
2.13.0.91.g00982b8dd
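
The refactoring in this patch is small, but the pattern is worth spelling out: a value that only lives for one call is passed as a parameter instead of being parked in the device struct. Below is a portable sketch of the clamping step only, not the Windows I/O. DemoChardev, demo_can_write() and demo_read_len() are hypothetical names, uint32_t stands in for DWORD, and the guard against a negative max_size is an extra assumption that is not in the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the chardev: only what the sketch needs. */
typedef struct DemoChardev {
    int can_write;   /* what qemu_chr_be_can_write() would report */
} DemoChardev;

static int demo_can_write(const DemoChardev *chr)
{
    return chr->can_write;
}

/*
 * Return how many bytes the read path would request.  The queued
 * length arrives as an argument, so no "len" field has to be kept
 * in sync between the poll function and the read function.
 */
static uint32_t demo_read_len(const DemoChardev *chr, uint32_t len)
{
    int max_size = demo_can_write(chr);

    if (max_size < 0) {
        max_size = 0;            /* defensive; an assumption of this sketch */
    }
    if (len > (uint32_t)max_size) {
        len = (uint32_t)max_size;
    }
    return len;                  /* 0 means "nothing to read right now" */
}
```

A caller in the style of the poll functions would then pass the queued byte count directly, e.g. demo_read_len(chr, queued), rather than storing it first.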

[Qemu-devel] [PULL 02/15] char-win: simplify win_chr_read()

2017-06-02 Thread Marc-André Lureau

win_chr_read_poll() is always used before win_chr_read().
We can easily fold win_chr_readfile() too.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Philippe Mathieu-Daudé 
---
 chardev/char-win.h |  2 +-
 chardev/char-win.c | 35 +--
 2 files changed, 10 insertions(+), 27 deletions(-)

diff --git a/chardev/char-win.h b/chardev/char-win.h
index d78a7d7972..73a0e3caef 100644
--- a/chardev/char-win.h
+++ b/chardev/char-win.h
@@ -28,7 +28,7 @@
 
 typedef struct {
 Chardev parent;
-int max_size;
+
 HANDLE hcom, hrecv, hsend;
 OVERLAPPED orecv;
 BOOL fpipe;
diff --git a/chardev/char-win.c b/chardev/char-win.c
index e4b6957ded..a46d878ef8 100644
--- a/chardev/char-win.c
+++ b/chardev/char-win.c
@@ -26,14 +26,21 @@
 #include "qapi/error.h"
 #include "char-win.h"
 
-static void win_chr_readfile(Chardev *chr)
+static void win_chr_read(Chardev *chr)
 {
 WinChardev *s = WIN_CHARDEV(chr);
-
+int max_size = qemu_chr_be_can_write(chr);
 int ret, err;
 uint8_t buf[CHR_READ_BUF_LEN];
 DWORD size;
 
+if (s->len > max_size) {
+s->len = max_size;
+}
+if (s->len == 0) {
+return;
+}
+
 ZeroMemory(&s->orecv, sizeof(s->orecv));
 s->orecv.hEvent = s->hrecv;
 ret = ReadFile(s->hcom, buf, s->len, &size, &s->orecv);
@@ -49,28 +56,6 @@ static void win_chr_readfile(Chardev *chr)
 }
 }
 
-static void win_chr_read(Chardev *chr)
-{
-WinChardev *s = WIN_CHARDEV(chr);
-
-if (s->len > s->max_size) {
-s->len = s->max_size;
-}
-if (s->len == 0) {
-return;
-}
-
-win_chr_readfile(chr);
-}
-
-static int win_chr_read_poll(Chardev *chr)
-{
-WinChardev *s = WIN_CHARDEV(chr);
-
-s->max_size = qemu_chr_be_can_write(chr);
-return s->max_size;
-}
-
 static int win_chr_poll(void *opaque)
 {
 Chardev *chr = CHARDEV(opaque);
@@ -81,7 +66,6 @@ static int win_chr_poll(void *opaque)
 ClearCommError(s->hcom, &comerr, &status);
 if (status.cbInQue > 0) {
 s->len = status.cbInQue;
-win_chr_read_poll(chr);
 win_chr_read(chr);
 return 1;
 }
@@ -163,7 +147,6 @@ int win_chr_pipe_poll(void *opaque)
 PeekNamedPipe(s->hcom, NULL, 0, NULL, &size, NULL);
 if (size > 0) {
 s->len = size;
-win_chr_read_poll(chr);
 win_chr_read(chr);
 return 1;
 }
-- 
2.13.0.91.g00982b8dd

[Qemu-devel] [PULL 12/15] char: move CharBackend handling in char-fe unit

2017-06-02 Thread Marc-André Lureau

Move all the frontend struct and methods to a separate unit. This avoids
accidentally mixing backend and frontend calls, and helps with readability.

Make qemu_chr_replay() a macro shared by both char and char-fe.

Export qemu_chr_write(), and use a macro for qemu_chr_write_all()

(nb: yes, CharBackend is for char frontend :)

Signed-off-by: Marc-André Lureau 
Reviewed-by: Philippe Mathieu-Daudé 
---
 include/chardev/char-fe.h | 249 ++
 include/chardev/char-mux.h|   1 +
 include/chardev/char.h| 242 +-
 include/hw/char/bcm2835_aux.h |   2 +-
 include/hw/char/cadence_uart.h|   2 +-
 include/hw/char/digic-uart.h  |   2 +-
 include/hw/char/imx_serial.h  |   2 +-
 include/hw/char/serial.h  |   2 +-
 include/hw/char/stm32f2xx_usart.h |   2 +-
 backends/rng-egd.c|   2 +-
 chardev/char-fe.c | 358 ++
 chardev/char.c| 343 +---
 gdbstub.c |   1 +
 hw/arm/omap2.c|   2 +-
 hw/arm/pxa2xx.c   |   2 +-
 hw/arm/strongarm.c|   1 +
 hw/char/cadence_uart.c|   1 +
 hw/char/debugcon.c|   2 +-
 hw/char/digic-uart.c  |   2 +-
 hw/char/escc.c|   1 +
 hw/char/etraxfs_ser.c |   2 +-
 hw/char/exynos4210_uart.c |   1 +
 hw/char/grlib_apbuart.c   |   2 +-
 hw/char/ipoctal232.c  |   2 +-
 hw/char/lm32_juart.c  |   2 +-
 hw/char/lm32_uart.c   |   2 +-
 hw/char/mcf_uart.c|   2 +-
 hw/char/milkymist-uart.c  |   2 +-
 hw/char/parallel.c|   1 +
 hw/char/pl011.c   |   2 +-
 hw/char/sclpconsole-lm.c  |   2 +-
 hw/char/sclpconsole.c |   2 +-
 hw/char/sh_serial.c   |   2 +-
 hw/char/spapr_vty.c   |   2 +-
 hw/char/terminal3270.c|   2 +-
 hw/char/virtio-console.c  |   2 +-
 hw/char/xen_console.c |   2 +-
 hw/char/xilinx_uartlite.c |   2 +-
 hw/core/qdev-properties-system.c  |   2 +-
 hw/ipmi/ipmi_bmc_extern.c |   2 +-
 hw/misc/ivshmem.c |   2 +-
 hw/usb/ccid-card-passthru.c   |   2 +-
 hw/usb/dev-serial.c   |   1 +
 hw/usb/redirect.c |   2 +-
 hw/virtio/vhost-user.c|   2 +-
 monitor.c |   2 +-
 net/colo-compare.c|   2 +-
 net/filter-mirror.c   |   2 +-
 net/slirp.c   |   2 +-
 net/vhost-user.c  |   2 +-
 qtest.c   |   2 +-
 slirp/slirp.c |   2 +-
 tests/test-char.c |   2 +-
 tests/vhost-user-test.c   |   2 +-
 ui/console.c  |   2 +-
 chardev/Makefile.objs |   1 +
 56 files changed, 664 insertions(+), 623 deletions(-)
 create mode 100644 include/chardev/char-fe.h
 create mode 100644 chardev/char-fe.c

diff --git a/include/chardev/char-fe.h b/include/chardev/char-fe.h
new file mode 100644
index 00..bd82093218
--- /dev/null
+++ b/include/chardev/char-fe.h
@@ -0,0 +1,249 @@
+#ifndef QEMU_CHAR_FE_H
+#define QEMU_CHAR_FE_H
+
+#include "chardev/char.h"
+
+typedef void IOEventHandler(void *opaque, int event);
+
+/* This is the backend as seen by frontend, the actual backend is
+ * Chardev */
+struct CharBackend {
+Chardev *chr;
+IOEventHandler *chr_event;
+IOCanReadHandler *chr_can_read;
+IOReadHandler *chr_read;
+void *opaque;
+int tag;
+int fe_open;
+};
+
+/**
+ * @qemu_chr_fe_init:
+ *
+ * Initializes a front end for the given CharBackend and
+ * Chardev. Call qemu_chr_fe_deinit() to remove the association and
+ * release the driver.
+ *
+ * Returns: false on error.
+ */
+bool qemu_chr_fe_init(CharBackend *b, Chardev *s, Error **errp);
+
+/**
+ * @qemu_chr_fe_deinit:
+ *
+ * Dissociate the CharBackend from the Chardev.
+ *
+ * Safe to call without associated Chardev.
+ */
+void qemu_chr_fe_deinit(CharBackend *b);
+
+/**
+ * @qemu_chr_fe_get_driver:
+ *
+ * Returns the driver associated with a CharBackend or NULL if no
+ * associated Chardev.
+ */
+Chardev *qemu_chr_fe_get_driver(CharBackend *be);
+
+/**
+ * @qemu_chr_fe_set_handlers:
+ * @b: a CharBackend
+ * @fd_can_read: callback to get the amount of data the frontend may
+ *   receive
+ * @fd_read: callback to receive data from char
+ * @fd_event: event callback
+ * @opaque: an opaque pointer for the callbacks
+ * @context: a main loop context or NULL for the default
+ * @set_open: whether to call qemu_chr_fe_set_open() implicitely when
+ * any of the handler is non-NULL
+ *
+ * Set the front end char handlers. The front end takes the focus if
+ * any of the handler is non-NULL.
+ *
+ * Without associated Chardev, nothing is changed.
+ */
+void qemu_ch

[Qemu-devel] [PULL 05/15] char-win: rename hcom->file

2017-06-02 Thread Marc-André Lureau

hcom is the name of the file handle, regardless of the actual chardev
driver (serial, file, console, etc.). Rename it to be more explicit.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Philippe Mathieu-Daudé 
---
 chardev/char-win.h  |  2 +-
 chardev/char-pipe.c | 10 +-
 chardev/char-win.c  | 36 ++--
 3 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/chardev/char-win.h b/chardev/char-win.h
index e0b3839a77..888be2b3ca 100644
--- a/chardev/char-win.h
+++ b/chardev/char-win.h
@@ -29,7 +29,7 @@
 typedef struct {
 Chardev parent;
 
-HANDLE hcom, hrecv, hsend;
+HANDLE file, hrecv, hsend;
 OVERLAPPED orecv;
 BOOL fpipe;
 
diff --git a/chardev/char-pipe.c b/chardev/char-pipe.c
index 54240c863d..aae950a22b 100644
--- a/chardev/char-pipe.c
+++ b/chardev/char-pipe.c
@@ -58,27 +58,27 @@ static int win_chr_pipe_init(Chardev *chr, const char *filename,
 }
 
 openname = g_strdup_printf(".\\pipe\\%s", filename);
-s->hcom = CreateNamedPipe(openname,
+s->file = CreateNamedPipe(openname,
   PIPE_ACCESS_DUPLEX | FILE_FLAG_OVERLAPPED,
   PIPE_TYPE_BYTE | PIPE_READMODE_BYTE |
   PIPE_WAIT,
   MAXCONNECT, NSENDBUF, NRECVBUF, NTIMEOUT, NULL);
 g_free(openname);
-if (s->hcom == INVALID_HANDLE_VALUE) {
+if (s->file == INVALID_HANDLE_VALUE) {
 error_setg(errp, "Failed CreateNamedPipe (%lu)", GetLastError());
-s->hcom = NULL;
+s->file = NULL;
 goto fail;
 }
 
 ZeroMemory(&ov, sizeof(ov));
 ov.hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);
-ret = ConnectNamedPipe(s->hcom, &ov);
+ret = ConnectNamedPipe(s->file, &ov);
 if (ret) {
 error_setg(errp, "Failed ConnectNamedPipe");
 goto fail;
 }
 
-ret = GetOverlappedResult(s->hcom, &ov, &size, TRUE);
+ret = GetOverlappedResult(s->file, &ov, &size, TRUE);
 if (!ret) {
 error_setg(errp, "Failed GetOverlappedResult");
 if (ov.hEvent) {
diff --git a/chardev/char-win.c b/chardev/char-win.c
index 11abad1521..a7e3296909 100644
--- a/chardev/char-win.c
+++ b/chardev/char-win.c
@@ -43,11 +43,11 @@ static void win_chr_read(Chardev *chr, DWORD len)
 
 ZeroMemory(&s->orecv, sizeof(s->orecv));
 s->orecv.hEvent = s->hrecv;
-ret = ReadFile(s->hcom, buf, len, &size, &s->orecv);
+ret = ReadFile(s->file, buf, len, &size, &s->orecv);
 if (!ret) {
 err = GetLastError();
 if (err == ERROR_IO_PENDING) {
-ret = GetOverlappedResult(s->hcom, &s->orecv, &size, TRUE);
+ret = GetOverlappedResult(s->file, &s->orecv, &size, TRUE);
 }
 }
 
@@ -63,7 +63,7 @@ static int win_chr_serial_poll(void *opaque)
 COMSTAT status;
 DWORD comerr;
 
-ClearCommError(s->hcom, &comerr, &status);
+ClearCommError(s->file, &comerr, &status);
 if (status.cbInQue > 0) {
 win_chr_read(chr, status.cbInQue);
 return 1;
@@ -91,15 +91,15 @@ int win_chr_serial_init(Chardev *chr, const char *filename, Error **errp)
 goto fail;
 }
 
-s->hcom = CreateFile(filename, GENERIC_READ | GENERIC_WRITE, 0, NULL,
+s->file = CreateFile(filename, GENERIC_READ | GENERIC_WRITE, 0, NULL,
   OPEN_EXISTING, FILE_FLAG_OVERLAPPED, 0);
-if (s->hcom == INVALID_HANDLE_VALUE) {
+if (s->file == INVALID_HANDLE_VALUE) {
 error_setg(errp, "Failed CreateFile (%lu)", GetLastError());
-s->hcom = NULL;
+s->file = NULL;
 goto fail;
 }
 
-if (!SetupComm(s->hcom, NRECVBUF, NSENDBUF)) {
+if (!SetupComm(s->file, NRECVBUF, NSENDBUF)) {
 error_setg(errp, "Failed SetupComm");
 goto fail;
 }
@@ -110,23 +110,23 @@ int win_chr_serial_init(Chardev *chr, const char *filename, Error **errp)
 comcfg.dcb.DCBlength = sizeof(DCB);
 CommConfigDialog(filename, NULL, &comcfg);
 
-if (!SetCommState(s->hcom, &comcfg.dcb)) {
+if (!SetCommState(s->file, &comcfg.dcb)) {
 error_setg(errp, "Failed SetCommState");
 goto fail;
 }
 
-if (!SetCommMask(s->hcom, EV_ERR)) {
+if (!SetCommMask(s->file, EV_ERR)) {
 error_setg(errp, "Failed SetCommMask");
 goto fail;
 }
 
 cto.ReadIntervalTimeout = MAXDWORD;
-if (!SetCommTimeouts(s->hcom, &cto)) {
+if (!SetCommTimeouts(s->file, &cto)) {
 error_setg(errp, "Failed SetCommTimeouts");
 goto fail;
 }
 
-if (!ClearCommError(s->hcom, &err, &comstat)) {
+if (!ClearCommError(s->file, &err, &comstat)) {
 error_setg(errp, "Failed ClearCommError");
 goto fail;
 }
@@ -143,7 +143,7 @@ int win_chr_pipe_poll(void *opaque)
 WinChardev *s = WIN_CHARDEV(opaque);
 DWORD size;
 
-PeekNamedPipe(s->hcom, NULL, 0, NULL, &size, NULL);
+PeekNamedPipe(s->file, NULL, 0, NULL, &size, NULL);
 if (size > 0) {
 win_

[Qemu-devel] [PULL 04/15] char-win: rename win_chr_init/poll win_chr_serial_init/poll

2017-06-02 Thread Marc-André Lureau

These two functions are specific to the serial chardev; rename them to make that clear.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Philippe Mathieu-Daudé 
---
 chardev/char-win.h| 2 +-
 chardev/char-serial.c | 2 +-
 chardev/char-win.c| 8 
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/chardev/char-win.h b/chardev/char-win.h
index 70215e04c2..e0b3839a77 100644
--- a/chardev/char-win.h
+++ b/chardev/char-win.h
@@ -46,7 +46,7 @@ typedef struct {
 #define WIN_CHARDEV(obj) OBJECT_CHECK(WinChardev, (obj), TYPE_CHARDEV_WIN)
 
 void qemu_chr_open_win_file(Chardev *chr, HANDLE fd_out);
-int win_chr_init(Chardev *chr, const char *filename, Error **errp);
+int win_chr_serial_init(Chardev *chr, const char *filename, Error **errp);
 int win_chr_pipe_poll(void *opaque);
 
 #endif /* CHAR_WIN_H */
diff --git a/chardev/char-serial.c b/chardev/char-serial.c
index 094e08dca5..fef3a91c77 100644
--- a/chardev/char-serial.c
+++ b/chardev/char-serial.c
@@ -45,7 +45,7 @@ static void qmp_chardev_open_serial(Chardev *chr,
 {
 ChardevHostdev *serial = backend->u.serial.data;
 
-win_chr_init(chr, serial->device, errp);
+win_chr_serial_init(chr, serial->device, errp);
 }
 
 #elif defined(__linux__) || defined(__sun__) || defined(__FreeBSD__)  \
diff --git a/chardev/char-win.c b/chardev/char-win.c
index 5e7daeeae1..11abad1521 100644
--- a/chardev/char-win.c
+++ b/chardev/char-win.c
@@ -56,7 +56,7 @@ static void win_chr_read(Chardev *chr, DWORD len)
 }
 }
 
-static int win_chr_poll(void *opaque)
+static int win_chr_serial_poll(void *opaque)
 {
 Chardev *chr = CHARDEV(opaque);
 WinChardev *s = WIN_CHARDEV(opaque);
@@ -71,7 +71,7 @@ static int win_chr_poll(void *opaque)
 return 0;
 }
 
-int win_chr_init(Chardev *chr, const char *filename, Error **errp)
+int win_chr_serial_init(Chardev *chr, const char *filename, Error **errp)
 {
 WinChardev *s = WIN_CHARDEV(chr);
 COMMCONFIG comcfg;
@@ -130,7 +130,7 @@ int win_chr_init(Chardev *chr, const char *filename, Error **errp)
 error_setg(errp, "Failed ClearCommError");
 goto fail;
 }
-qemu_add_polling_cb(win_chr_poll, chr);
+qemu_add_polling_cb(win_chr_serial_poll, chr);
 return 0;
 
  fail:
@@ -208,7 +208,7 @@ static void char_win_finalize(Object *obj)
 if (s->fpipe) {
 qemu_del_polling_cb(win_chr_pipe_poll, chr);
 } else {
-qemu_del_polling_cb(win_chr_poll, chr);
+qemu_del_polling_cb(win_chr_serial_poll, chr);
 }
 
 qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
-- 
2.13.0.91.g00982b8dd

[Qemu-devel] [PULL 06/15] char-win: close file handle except with console

2017-06-02 Thread Marc-André Lureau

Only the console handle must be left open; the "file" handle should be
closed on finalize.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Philippe Mathieu-Daudé 
---
 chardev/char-win.h |  5 ++---
 chardev/char-console.c |  2 +-
 chardev/char-file.c|  2 +-
 chardev/char-win.c | 12 
 4 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/chardev/char-win.h b/chardev/char-win.h
index 888be2b3ca..4994425e9e 100644
--- a/chardev/char-win.h
+++ b/chardev/char-win.h
@@ -29,14 +29,13 @@
 typedef struct {
 Chardev parent;
 
+bool keep_open; /* console do not close file */
 HANDLE file, hrecv, hsend;
 OVERLAPPED orecv;
 BOOL fpipe;
 
 /* Protected by the Chardev chr_write_lock.  */
 OVERLAPPED osend;
-/* FIXME: file/console do not finalize */
-bool skip_free;
 } WinChardev;
 
 #define NSENDBUF 2048
@@ -45,7 +44,7 @@ typedef struct {
 #define TYPE_CHARDEV_WIN "chardev-win"
 #define WIN_CHARDEV(obj) OBJECT_CHECK(WinChardev, (obj), TYPE_CHARDEV_WIN)
 
-void qemu_chr_open_win_file(Chardev *chr, HANDLE fd_out);
+void win_chr_set_file(Chardev *chr, HANDLE file, bool keep_open);
 int win_chr_serial_init(Chardev *chr, const char *filename, Error **errp);
 int win_chr_pipe_poll(void *opaque);
 
diff --git a/chardev/char-console.c b/chardev/char-console.c
index c824937fe6..8d972c1506 100644
--- a/chardev/char-console.c
+++ b/chardev/char-console.c
@@ -29,7 +29,7 @@ static void qemu_chr_open_win_con(Chardev *chr,
   bool *be_opened,
   Error **errp)
 {
-qemu_chr_open_win_file(chr, GetStdHandle(STD_OUTPUT_HANDLE));
+win_chr_set_file(chr, GetStdHandle(STD_OUTPUT_HANDLE), true);
 }
 
 static void char_console_class_init(ObjectClass *oc, void *data)
diff --git a/chardev/char-file.c b/chardev/char-file.c
index 8bae25350d..aed4ae1569 100644
--- a/chardev/char-file.c
+++ b/chardev/char-file.c
@@ -65,7 +65,7 @@ static void qmp_chardev_open_file(Chardev *chr,
 return;
 }
 
-qemu_chr_open_win_file(chr, out);
+win_chr_set_file(chr, out, false);
 #else
 int flags, in = -1, out;
 
diff --git a/chardev/char-win.c b/chardev/char-win.c
index a7e3296909..ec9a731be9 100644
--- a/chardev/char-win.c
+++ b/chardev/char-win.c
@@ -192,17 +192,13 @@ static void char_win_finalize(Object *obj)
 Chardev *chr = CHARDEV(obj);
 WinChardev *s = WIN_CHARDEV(chr);
 
-if (s->skip_free) {
-return;
-}
-
 if (s->hsend) {
 CloseHandle(s->hsend);
 }
 if (s->hrecv) {
 CloseHandle(s->hrecv);
 }
-if (s->file) {
+if (!s->keep_open && s->file) {
 CloseHandle(s->file);
 }
 if (s->fpipe) {
@@ -214,12 +210,12 @@ static void char_win_finalize(Object *obj)
 qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
 }
 
-void qemu_chr_open_win_file(Chardev *chr, HANDLE fd_out)
+void win_chr_set_file(Chardev *chr, HANDLE file, bool keep_open)
 {
 WinChardev *s = WIN_CHARDEV(chr);
 
-s->skip_free = true;
-s->file = fd_out;
+s->keep_open = keep_open;
+s->file = file;
 }
 
 static void char_win_class_init(ObjectClass *oc, void *data)
-- 
2.13.0.91.g00982b8dd
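
The ownership rule this patch introduces can be sketched outside QEMU.
The following is a minimal model, not the real code: FakeHandle,
FakeWinChardev, fake_set_file() and fake_finalize() are hypothetical
stand-ins for HANDLE, WinChardev, win_chr_set_file() and the relevant
part of char_win_finalize(), and setting a flag stands in for
CloseHandle().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a Windows HANDLE; not the real type. */
typedef struct {
    bool closed;
} FakeHandle;

/* Simplified model of WinChardev: only the fields this patch touches. */
typedef struct {
    FakeHandle *file;
    bool keep_open;   /* true for handles the chardev does not own */
} FakeWinChardev;

/* Models win_chr_set_file(): record the handle and whether to keep it. */
static void fake_set_file(FakeWinChardev *s, FakeHandle *file, bool keep_open)
{
    s->keep_open = keep_open;
    s->file = file;
}

/* Models the finalize check: close only handles the chardev owns. */
static void fake_finalize(FakeWinChardev *s)
{
    if (!s->keep_open && s->file) {
        s->file->closed = true;   /* stands in for CloseHandle() */
    }
}
```

In this model a console-style caller passes keep_open = true and its
handle survives finalize, while a file-style caller passes false and
its handle is closed.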

[Qemu-devel] [PULL 13/15] char: rename functions that are not part of fe

2017-06-02 Thread Marc-André Lureau

There is no clear reason to have those functions associated with
frontend.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Philippe Mathieu-Daudé 
---
 chardev/char.c | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/chardev/char.c b/chardev/char.c
index 8ea7b5777a..7aa0210765 100644
--- a/chardev/char.c
+++ b/chardev/char.c
@@ -66,8 +66,7 @@ void qemu_chr_be_event(Chardev *s, int event)
 
 /* Not reporting errors from writing to logfile, as logs are
  * defined to be "best effort" only */
-static void qemu_chr_fe_write_log(Chardev *s,
-  const uint8_t *buf, size_t len)
+static void qemu_chr_write_log(Chardev *s, const uint8_t *buf, size_t len)
 {
 size_t done = 0;
 ssize_t ret;
@@ -91,9 +90,9 @@ static void qemu_chr_fe_write_log(Chardev *s,
 }
 }
 
-static int qemu_chr_fe_write_buffer(Chardev *s,
-const uint8_t *buf, int len,
-int *offset, bool write_all)
+static int qemu_chr_write_buffer(Chardev *s,
+ const uint8_t *buf, int len,
+ int *offset, bool write_all)
 {
 ChardevClass *cc = CHARDEV_GET_CLASS(s);
 int res = 0;
@@ -118,7 +117,7 @@ static int qemu_chr_fe_write_buffer(Chardev *s,
 }
 }
 if (*offset > 0) {
-qemu_chr_fe_write_log(s, buf, *offset);
+qemu_chr_write_log(s, buf, *offset);
 }
 qemu_mutex_unlock(&s->chr_write_lock);
 
@@ -133,11 +132,11 @@ int qemu_chr_write(Chardev *s, const uint8_t *buf, int len, bool write_all)
 if (qemu_chr_replay(s) && replay_mode == REPLAY_MODE_PLAY) {
 replay_char_write_event_load(&res, &offset);
 assert(offset <= len);
-qemu_chr_fe_write_buffer(s, buf, offset, &offset, true);
+qemu_chr_write_buffer(s, buf, offset, &offset, true);
 return res;
 }
 
-res = qemu_chr_fe_write_buffer(s, buf, len, &offset, write_all);
+res = qemu_chr_write_buffer(s, buf, len, &offset, write_all);
 
 if (qemu_chr_replay(s) && replay_mode == REPLAY_MODE_RECORD) {
 replay_char_write_event_save(res, offset);
-- 
2.13.0.91.g00982b8dd

[Qemu-devel] [PULL 07/15] Remove/replace sysemu/char.h inclusion

2017-06-02 Thread Marc-André Lureau

Those are apparently unnecessary includes.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/arm/bcm2835_peripherals.c | 1 -
 hw/char/imx_serial.c | 1 -
 hw/display/xenfb.c   | 1 -
 hw/i386/xen/xen-hvm.c| 1 -
 hw/mips/mips_fulong2e.c  | 1 -
 hw/mips/mips_malta.c | 1 -
 hw/net/xgmac.c   | 1 -
 hw/ppc/spapr_events.c| 1 -
 hw/ppc/spapr_rtas.c  | 1 -
 hw/sparc/leon3.c | 1 -
 hw/usb/ccid-card-emulated.c  | 2 +-
 hw/xen/xen_backend.c | 1 -
 util/event_notifier-posix.c  | 1 -
 13 files changed, 1 insertion(+), 13 deletions(-)

diff --git a/hw/arm/bcm2835_peripherals.c b/hw/arm/bcm2835_peripherals.c
index 369ef1e3bd..502f04c02a 100644
--- a/hw/arm/bcm2835_peripherals.c
+++ b/hw/arm/bcm2835_peripherals.c
@@ -13,7 +13,6 @@
 #include "hw/arm/bcm2835_peripherals.h"
 #include "hw/misc/bcm2835_mbox_defs.h"
 #include "hw/arm/raspi_platform.h"
-#include "sysemu/char.h"
 #include "sysemu/sysemu.h"
 
 /* Peripheral base address on the VC (GPU) system bus */
diff --git a/hw/char/imx_serial.c b/hw/char/imx_serial.c
index 52e67f8dc9..af250305be 100644
--- a/hw/char/imx_serial.c
+++ b/hw/char/imx_serial.c
@@ -21,7 +21,6 @@
 #include "qemu/osdep.h"
 #include "hw/char/imx_serial.h"
 #include "sysemu/sysemu.h"
-#include "sysemu/char.h"
 #include "qemu/log.h"
 
 #ifndef DEBUG_IMX_UART
diff --git a/hw/display/xenfb.c b/hw/display/xenfb.c
index 7a8727aa21..e76c0d805c 100644
--- a/hw/display/xenfb.c
+++ b/hw/display/xenfb.c
@@ -28,7 +28,6 @@
 
 #include "hw/hw.h"
 #include "ui/console.h"
-#include "sysemu/char.h"
 #include "hw/xen/xen_backend.h"
 
 #include 
diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
index 919f09b694..1acd4de405 100644
--- a/hw/i386/xen/xen-hvm.c
+++ b/hw/i386/xen/xen-hvm.c
@@ -18,7 +18,6 @@
 #include "hw/xen/xen_backend.h"
 #include "qmp-commands.h"
 
-#include "sysemu/char.h"
 #include "qemu/error-report.h"
 #include "qemu/range.h"
 #include "sysemu/xen-mapcache.h"
diff --git a/hw/mips/mips_fulong2e.c b/hw/mips/mips_fulong2e.c
index e636c3abaa..dbe2805acb 100644
--- a/hw/mips/mips_fulong2e.c
+++ b/hw/mips/mips_fulong2e.c
@@ -32,7 +32,6 @@
 #include "hw/mips/mips.h"
 #include "hw/mips/cpudevs.h"
 #include "hw/pci/pci.h"
-#include "sysemu/char.h"
 #include "sysemu/sysemu.h"
 #include "audio/audio.h"
 #include "qemu/log.h"
diff --git a/hw/mips/mips_malta.c b/hw/mips/mips_malta.c
index 7814c39654..95cdabb2dd 100644
--- a/hw/mips/mips_malta.c
+++ b/hw/mips/mips_malta.c
@@ -37,7 +37,6 @@
 #include "hw/mips/mips.h"
 #include "hw/mips/cpudevs.h"
 #include "hw/pci/pci.h"
-#include "sysemu/char.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/arch_init.h"
 #include "qemu/log.h"
diff --git a/hw/net/xgmac.c b/hw/net/xgmac.c
index 46b1aa17fa..0843bf185c 100644
--- a/hw/net/xgmac.c
+++ b/hw/net/xgmac.c
@@ -26,7 +26,6 @@
 
 #include "qemu/osdep.h"
 #include "hw/sysbus.h"
-#include "sysemu/char.h"
 #include "qemu/log.h"
 #include "net/net.h"
 #include "net/checksum.h"
diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index 73e2a1884f..57acd85a87 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -28,7 +28,6 @@
 #include "qapi/error.h"
 #include "cpu.h"
 #include "sysemu/sysemu.h"
-#include "sysemu/char.h"
 #include "hw/qdev.h"
 #include "sysemu/device_tree.h"
 
diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 128d993d04..b666a4c15c 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -29,7 +29,6 @@
 #include "qemu/log.h"
 #include "qemu/error-report.h"
 #include "sysemu/sysemu.h"
-#include "sysemu/char.h"
 #include "hw/qdev.h"
 #include "sysemu/device_tree.h"
 #include "sysemu/cpus.h"
diff --git a/hw/sparc/leon3.c b/hw/sparc/leon3.c
index 6e16478413..f415997649 100644
--- a/hw/sparc/leon3.c
+++ b/hw/sparc/leon3.c
@@ -28,7 +28,6 @@
 #include "hw/hw.h"
 #include "qemu/timer.h"
 #include "hw/ptimer.h"
-#include "sysemu/char.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/qtest.h"
 #include "hw/boards.h"
diff --git a/hw/usb/ccid-card-emulated.c b/hw/usb/ccid-card-emulated.c
index 99627860a3..e646eb243b 100644
--- a/hw/usb/ccid-card-emulated.c
+++ b/hw/usb/ccid-card-emulated.c
@@ -33,7 +33,7 @@
 #include 
 
 #include "qemu/thread.h"
-#include "sysemu/char.h"
+#include "qemu/main-loop.h"
 #include "ccid.h"
 
 #define DPRINTF(card, lvl, fmt, ...) \
diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c
index 3570f37e56..c46cbb0759 100644
--- a/hw/xen/xen_backend.c
+++ b/hw/xen/xen_backend.c
@@ -28,7 +28,6 @@
 #include "hw/hw.h"
 #include "hw/sysbus.h"
 #include "hw/boards.h"
-#include "sysemu/char.h"
 #include "qemu/log.h"
 #include "qapi/error.h"
 #include "hw/xen/xen_backend.h"
diff --git a/util/event_notifier-posix.c b/util/event_notifier-posix.c
index acdbe3b483..73c4046b58 100644
--- a/util/event_notifier-posix.c
+++ b/util/event_notifier-posix.c
@@ -14,7 +14,6 @@
 #include "qemu-common.h"
 #include "qemu/cutils.h"
 #include "qemu/event_notifier.h"
-#include

[Qemu-devel] [PULL 10/15] be-hci: use backend functions

2017-06-02 Thread Marc-André Lureau

Avoid accessing CharBackend directly, use qemu_chr_be_* methods instead.

be->chr_read should exist if qemu_chr_be_can_write() is true.

(use qemu_chr_be_write(), _impl() bypasses replay)

Signed-off-by: Marc-André Lureau 
Reviewed-by: Andrzej Zaborowski 
---
 hw/bt/hci-csr.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/hw/bt/hci-csr.c b/hw/bt/hci-csr.c
index 0f2021086d..d13192b9b5 100644
--- a/hw/bt/hci-csr.c
+++ b/hw/bt/hci-csr.c
@@ -82,17 +82,14 @@ enum {
 
 static inline void csrhci_fifo_wake(struct csrhci_s *s)
 {
-Chardev *chr = (Chardev *)s;
-CharBackend *be = chr->be;
+Chardev *chr = CHARDEV(s);
 
 if (!s->enable || !s->out_len)
 return;
 
 /* XXX: Should wait for s->modem_state & CHR_TIOCM_RTS? */
-if (be && be->chr_can_read && be->chr_can_read(be->opaque) &&
-be->chr_read) {
-be->chr_read(be->opaque,
- s->outfifo + s->out_start++, 1);
+if (qemu_chr_be_can_write(chr)) {
+qemu_chr_be_write(chr, s->outfifo + s->out_start++, 1);
 s->out_len--;
 if (s->out_start >= s->out_size) {
 s->out_start = 0;
-- 
2.13.0.91.g00982b8dd
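
The "ask how much can be written, then write" pattern used in
csrhci_fifo_wake() can be sketched in isolation. This is a simplified
model under stated assumptions: FakeFrontend, FakeHci,
fake_be_can_write(), fake_be_write() and fake_fifo_wake() are
hypothetical names, and the capacity counter is an invented stand-in
for whatever the real frontend reports through
qemu_chr_be_can_write().

```c
#include <assert.h>
#include <stdint.h>

#define FIFO_SIZE 8

/* Hypothetical frontend: accepts bytes until `capacity` is reached. */
typedef struct {
    uint8_t received[FIFO_SIZE];
    int count;
    int capacity;   /* assumed to be <= FIFO_SIZE */
} FakeFrontend;

/* Stand-in for qemu_chr_be_can_write(): bytes the frontend can take. */
static int fake_be_can_write(const FakeFrontend *fe)
{
    return fe->capacity - fe->count;
}

/* Stand-in for qemu_chr_be_write(): deliver bytes to the frontend. */
static void fake_be_write(FakeFrontend *fe, const uint8_t *buf, int len)
{
    for (int i = 0; i < len; i++) {
        fe->received[fe->count++] = buf[i];
    }
}

/* Only the output-FIFO fields of the device state are modelled. */
typedef struct {
    uint8_t outfifo[FIFO_SIZE];
    int out_start, out_len, out_size;
} FakeHci;

/* One wake step: push at most one byte and wrap the read index. */
static void fake_fifo_wake(FakeHci *s, FakeFrontend *fe)
{
    if (!s->out_len) {
        return;
    }
    if (fake_be_can_write(fe)) {
        fake_be_write(fe, s->outfifo + s->out_start++, 1);
        s->out_len--;
        if (s->out_start >= s->out_size) {
            s->out_start = 0;
        }
    }
}
```

When the frontend reports no room, the wake step leaves the FIFO
untouched, so the pending byte is retried on the next wake.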

[Qemu-devel] [PULL 08/15] chardev: move headers to include/chardev

2017-06-02 Thread Marc-André Lureau

So they are all in one place. The following patch will move serial &
parallel declarations to the respective headers.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Philippe Mathieu-Daudé 
---
 {chardev => include/chardev}/char-fd.h|  2 +-
 {chardev => include/chardev}/char-io.h|  2 +-
 {chardev => include/chardev}/char-mux.h   |  2 +-
 {chardev => include/chardev}/char-parallel.h  |  0
 {chardev => include/chardev}/char-serial.h|  0
 {chardev => include/chardev}/char-win-stdio.h |  0
 {chardev => include/chardev}/char-win.h   |  2 +-
 include/{sysemu => chardev}/char.h|  0
 include/hw/char/bcm2835_aux.h |  2 +-
 include/hw/char/cadence_uart.h|  2 +-
 include/hw/char/digic-uart.h  |  2 +-
 include/hw/char/imx_serial.h  |  2 +-
 include/hw/char/serial.h  |  4 ++--
 include/hw/char/stm32f2xx_usart.h |  2 +-
 backends/baum.c   |  2 +-
 backends/msmouse.c|  2 +-
 backends/rng-egd.c|  2 +-
 backends/testdev.c|  2 +-
 backends/wctablet.c   |  2 +-
 chardev/char-console.c|  2 +-
 chardev/char-fd.c |  6 +++---
 chardev/char-file.c   |  6 +++---
 chardev/char-io.c |  2 +-
 chardev/char-mux.c|  4 ++--
 chardev/char-null.c   |  2 +-
 chardev/char-parallel.c   |  6 +++---
 chardev/char-pipe.c   |  6 +++---
 chardev/char-pty.c|  4 ++--
 chardev/char-ringbuf.c|  2 +-
 chardev/char-serial.c |  6 +++---
 chardev/char-socket.c |  4 ++--
 chardev/char-stdio.c  |  8 
 chardev/char-udp.c|  4 ++--
 chardev/char-win-stdio.c  |  4 ++--
 chardev/char-win.c|  2 +-
 chardev/char.c| 10 +-
 gdbstub.c |  2 +-
 hmp.c |  2 +-
 hw/arm/fsl-imx25.c|  2 +-
 hw/arm/fsl-imx31.c|  2 +-
 hw/arm/fsl-imx6.c |  2 +-
 hw/arm/omap2.c|  2 +-
 hw/arm/pxa2xx.c   |  2 +-
 hw/arm/strongarm.c|  2 +-
 hw/bt/hci-csr.c   |  2 +-
 hw/char/cadence_uart.c|  2 +-
 hw/char/debugcon.c|  2 +-
 hw/char/digic-uart.c  |  2 +-
 hw/char/escc.c|  2 +-
 hw/char/etraxfs_ser.c |  2 +-
 hw/char/exynos4210_uart.c |  2 +-
 hw/char/grlib_apbuart.c   |  2 +-
 hw/char/ipoctal232.c  |  2 +-
 hw/char/lm32_juart.c  |  2 +-
 hw/char/lm32_uart.c   |  2 +-
 hw/char/mcf_uart.c|  2 +-
 hw/char/milkymist-uart.c  |  2 +-
 hw/char/omap_uart.c   |  2 +-
 hw/char/parallel.c|  2 +-
 hw/char/pl011.c   |  2 +-
 hw/char/sclpconsole-lm.c  |  2 +-
 hw/char/sclpconsole.c |  2 +-
 hw/char/serial.c  |  2 +-
 hw/char/sh_serial.c   |  2 +-
 hw/char/spapr_vty.c   |  2 +-
 hw/char/terminal3270.c|  2 +-
 hw/char/virtio-console.c  |  2 +-
 hw/char/xen_console.c |  2 +-
 hw/char/xilinx_uartlite.c |  2 +-
 hw/core/qdev-properties-system.c  |  2 +-
 hw/core/qdev-properties.c |  2 +-
 hw/ipmi/ipmi_bmc_extern.c |  2 +-
 hw/isa/pc87312.c  |  2 +-
 hw/mips/boston.c  |  2 +-
 hw/misc/ivshmem.c |  2 +-
 hw/usb/ccid-card-passthru.c   |  2 +-
 hw/usb/dev-serial.c   |  2 +-
 hw/usb/redirect.c |  2 +-
 hw/virtio/vhost-user.c|  2 +-
 hw/xen/xen-common.c   |  2 +-
 hw/xtensa/xtfpga.c|  2 +-
 monitor.c |  2 +-
 net/colo-compare.c|  2 +-
 net/filter-mirror.c   |  2 +-
 net/slirp.c   |  2 +-
 net/vhost-user.c  |  2 +-
 qmp.c

[Qemu-devel] [PULL 11/15] char: generalize qemu_chr_write_all()

2017-06-02 Thread Marc-André Lureau

qemu_chr_fe_write() is similar to qemu_chr_write_all(): the latter
writes the whole buffer and takes a Chardev rather than a CharBackend.

Make qemu_chr_write() and qemu_chr_fe_write_buffer() take an 'all'
argument. If false, handle 'partial' writes the way qemu_chr_fe_write()
used to, and call qemu_chr_write() from qemu_chr_fe_write().

Signed-off-by: Marc-André Lureau 
Reviewed-by: Philippe Mathieu-Daudé 
---
 chardev/char.c | 70 +++---
 1 file changed, 28 insertions(+), 42 deletions(-)

diff --git a/chardev/char.c b/chardev/char.c
index 02142b480e..c9e46f00f0 100644
--- a/chardev/char.c
+++ b/chardev/char.c
@@ -96,7 +96,8 @@ static void qemu_chr_fe_write_log(Chardev *s,
 }
 
 static int qemu_chr_fe_write_buffer(Chardev *s,
-const uint8_t *buf, int len, int *offset)
+const uint8_t *buf, int len,
+int *offset, bool write_all)
 {
 ChardevClass *cc = CHARDEV_GET_CLASS(s);
 int res = 0;
@@ -106,7 +107,7 @@ static int qemu_chr_fe_write_buffer(Chardev *s,
 while (*offset < len) {
 retry:
 res = cc->chr_write(s, buf + *offset, len - *offset);
-if (res < 0 && errno == EAGAIN) {
+if (res < 0 && errno == EAGAIN && write_all) {
 g_usleep(100);
 goto retry;
 }
@@ -116,6 +117,9 @@ static int qemu_chr_fe_write_buffer(Chardev *s,
 }
 
 *offset += res;
+if (!write_all) {
+break;
+}
 }
 if (*offset > 0) {
 qemu_chr_fe_write_log(s, buf, *offset);
@@ -130,54 +134,20 @@ static bool qemu_chr_replay(Chardev *chr)
 return qemu_chr_has_feature(chr, QEMU_CHAR_FEATURE_REPLAY);
 }
 
-int qemu_chr_fe_write(CharBackend *be, const uint8_t *buf, int len)
+static int qemu_chr_write(Chardev *s, const uint8_t *buf, int len,
+  bool write_all)
 {
-Chardev *s = be->chr;
-ChardevClass *cc;
-int ret;
-
-if (!s) {
-return 0;
-}
-
-if (qemu_chr_replay(s) && replay_mode == REPLAY_MODE_PLAY) {
-int offset;
-replay_char_write_event_load(&ret, &offset);
-assert(offset <= len);
-qemu_chr_fe_write_buffer(s, buf, offset, &offset);
-return ret;
-}
-
-cc = CHARDEV_GET_CLASS(s);
-qemu_mutex_lock(&s->chr_write_lock);
-ret = cc->chr_write(s, buf, len);
-
-if (ret > 0) {
-qemu_chr_fe_write_log(s, buf, ret);
-}
-
-qemu_mutex_unlock(&s->chr_write_lock);
-
-if (qemu_chr_replay(s) && replay_mode == REPLAY_MODE_RECORD) {
-replay_char_write_event_save(ret, ret < 0 ? 0 : ret);
-}
-
-return ret;
-}
-
-int qemu_chr_write_all(Chardev *s, const uint8_t *buf, int len)
-{
-int offset;
+int offset = 0;
 int res;
 
 if (qemu_chr_replay(s) && replay_mode == REPLAY_MODE_PLAY) {
 replay_char_write_event_load(&res, &offset);
 assert(offset <= len);
-qemu_chr_fe_write_buffer(s, buf, offset, &offset);
+qemu_chr_fe_write_buffer(s, buf, offset, &offset, true);
 return res;
 }
 
-res = qemu_chr_fe_write_buffer(s, buf, len, &offset);
+res = qemu_chr_fe_write_buffer(s, buf, len, &offset, write_all);
 
 if (qemu_chr_replay(s) && replay_mode == REPLAY_MODE_RECORD) {
 replay_char_write_event_save(res, offset);
@@ -189,6 +159,22 @@ int qemu_chr_write_all(Chardev *s, const uint8_t *buf, int len)
 return offset;
 }
 
+int qemu_chr_write_all(Chardev *s, const uint8_t *buf, int len)
+{
+return qemu_chr_write(s, buf, len, true);
+}
+
+int qemu_chr_fe_write(CharBackend *be, const uint8_t *buf, int len)
+{
+Chardev *s = be->chr;
+
+if (!s) {
+return 0;
+}
+
+return qemu_chr_write(s, buf, len, false);
+}
+
 int qemu_chr_fe_write_all(CharBackend *be, const uint8_t *buf, int len)
 {
 Chardev *s = be->chr;
@@ -197,7 +183,7 @@ int qemu_chr_fe_write_all(CharBackend *be, const uint8_t *buf, int len)
 return 0;
 }
 
-return qemu_chr_write_all(s, buf, len);
+return qemu_chr_write(s, buf, len, true);
 }
 
 int qemu_chr_fe_read_all(CharBackend *be, uint8_t *buf, int len)
-- 
2.13.0.91.g00982b8dd
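
The difference between a partial write and a write-all loop can be
sketched as follows. This is a simplified model, not the QEMU code: it
omits the write lock, the EAGAIN retry, logging and replay, and
FakeBackend, fake_chr_write() and fake_write_buffer() are hypothetical
names. The backend is assumed to accept at most `chunk` bytes per
call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical backend that accepts at most `chunk` bytes per call. */
typedef struct {
    int chunk;
    int total;   /* bytes accepted so far */
} FakeBackend;

/* Stand-in for cc->chr_write(): returns the number of bytes taken. */
static int fake_chr_write(FakeBackend *b, const unsigned char *buf, int len)
{
    int n = len < b->chunk ? len : b->chunk;

    (void)buf;   /* the payload itself is irrelevant to this model */
    b->total += n;
    return n;
}

/*
 * With write_all == false, stop after the first successful write and
 * report how far we got in *offset. With write_all == true, keep
 * looping until the whole buffer has been accepted or an error occurs.
 */
static int fake_write_buffer(FakeBackend *b, const unsigned char *buf,
                             int len, int *offset, bool write_all)
{
    int res = 0;

    *offset = 0;
    while (*offset < len) {
        res = fake_chr_write(b, buf + *offset, len - *offset);
        if (res <= 0) {
            break;
        }
        *offset += res;
        if (!write_all) {
            break;
        }
    }
    return res;
}
```

With a 10-byte buffer and a 4-byte chunk, the partial variant leaves
*offset at 4 while the write-all variant advances it to 10.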

[Qemu-devel] [PULL 09/15] chardev: serial & parallel declaration to own headers

2017-06-02 Thread Marc-André Lureau

Signed-off-by: Marc-André Lureau 
Reviewed-by: Philippe Mathieu-Daudé 
---
 include/chardev/char-parallel.h | 20 +++-
 include/chardev/char-serial.h   | 22 ++
 include/chardev/char.h  | 36 
 backends/wctablet.c |  2 +-
 hw/arm/strongarm.c  |  2 +-
 hw/bt/hci-csr.c |  2 +-
 hw/char/cadence_uart.c  |  2 +-
 hw/char/escc.c  |  2 +-
 hw/char/exynos4210_uart.c   |  2 +-
 hw/char/parallel.c  |  2 +-
 hw/char/serial.c|  2 +-
 hw/usb/dev-serial.c |  2 +-
 12 files changed, 50 insertions(+), 46 deletions(-)

diff --git a/include/chardev/char-parallel.h b/include/chardev/char-parallel.h
index 26742f9d5c..3284a1b96b 100644
--- a/include/chardev/char-parallel.h
+++ b/include/chardev/char-parallel.h
@@ -24,9 +24,27 @@
 #ifndef CHAR_PARALLEL_H
 #define CHAR_PARALLEL_H
 
-#if defined(__linux__) || defined(__FreeBSD__) || \
+#include "chardev/char.h"
+
+#if defined(__linux__) || defined(__FreeBSD__) ||   \
 defined(__FreeBSD_kernel__) || defined(__DragonFly__)
 #define HAVE_CHARDEV_PARPORT 1
 #endif
 
+#define CHR_IOCTL_PP_READ_DATA3
+#define CHR_IOCTL_PP_WRITE_DATA   4
+#define CHR_IOCTL_PP_READ_CONTROL 5
+#define CHR_IOCTL_PP_WRITE_CONTROL6
+#define CHR_IOCTL_PP_READ_STATUS  7
+#define CHR_IOCTL_PP_EPP_READ_ADDR8
+#define CHR_IOCTL_PP_EPP_READ 9
+#define CHR_IOCTL_PP_EPP_WRITE_ADDR  10
+#define CHR_IOCTL_PP_EPP_WRITE   11
+#define CHR_IOCTL_PP_DATA_DIR12
+
+struct ParallelIOArg {
+void *buffer;
+int count;
+};
+
 #endif /* CHAR_PARALLEL_H */
diff --git a/include/chardev/char-serial.h b/include/chardev/char-serial.h
index 64a27f63b1..cb2e59e82a 100644
--- a/include/chardev/char-serial.h
+++ b/include/chardev/char-serial.h
@@ -24,6 +24,8 @@
 #ifndef CHAR_SERIAL_H
 #define CHAR_SERIAL_H
 
+#include "chardev/char.h"
+
 #ifdef _WIN32
 #define HAVE_CHARDEV_SERIAL 1
 #elif defined(__linux__) || defined(__sun__) || defined(__FreeBSD__)\
@@ -32,4 +34,24 @@
 #define HAVE_CHARDEV_SERIAL 1
 #endif
 
+#define CHR_IOCTL_SERIAL_SET_PARAMS   1
+typedef struct {
+int speed;
+int parity;
+int data_bits;
+int stop_bits;
+} QEMUSerialSetParams;
+
+#define CHR_IOCTL_SERIAL_SET_BREAK2
+
+#define CHR_IOCTL_SERIAL_SET_TIOCM   13
+#define CHR_IOCTL_SERIAL_GET_TIOCM   14
+
+#define CHR_TIOCM_CTS   0x020
+#define CHR_TIOCM_CAR   0x040
+#define CHR_TIOCM_DSR   0x100
+#define CHR_TIOCM_RI0x080
+#define CHR_TIOCM_DTR   0x002
+#define CHR_TIOCM_RTS   0x004
+
 #endif
diff --git a/include/chardev/char.h b/include/chardev/char.h
index fffc0f40d4..95273e10ae 100644
--- a/include/chardev/char.h
+++ b/include/chardev/char.h
@@ -27,42 +27,6 @@ typedef enum {
 
 #define CHR_READ_BUF_LEN 4096
 
-#define CHR_IOCTL_SERIAL_SET_PARAMS   1
-typedef struct {
-int speed;
-int parity;
-int data_bits;
-int stop_bits;
-} QEMUSerialSetParams;
-
-#define CHR_IOCTL_SERIAL_SET_BREAK2
-
-#define CHR_IOCTL_PP_READ_DATA3
-#define CHR_IOCTL_PP_WRITE_DATA   4
-#define CHR_IOCTL_PP_READ_CONTROL 5
-#define CHR_IOCTL_PP_WRITE_CONTROL6
-#define CHR_IOCTL_PP_READ_STATUS  7
-#define CHR_IOCTL_PP_EPP_READ_ADDR8
-#define CHR_IOCTL_PP_EPP_READ 9
-#define CHR_IOCTL_PP_EPP_WRITE_ADDR  10
-#define CHR_IOCTL_PP_EPP_WRITE   11
-#define CHR_IOCTL_PP_DATA_DIR12
-
-struct ParallelIOArg {
-void *buffer;
-int count;
-};
-
-#define CHR_IOCTL_SERIAL_SET_TIOCM   13
-#define CHR_IOCTL_SERIAL_GET_TIOCM   14
-
-#define CHR_TIOCM_CTS  0x020
-#define CHR_TIOCM_CAR  0x040
-#define CHR_TIOCM_DSR  0x100
-#define CHR_TIOCM_RI   0x080
-#define CHR_TIOCM_DTR  0x002
-#define CHR_TIOCM_RTS  0x004
-
 typedef void IOEventHandler(void *opaque, int event);
 
 typedef enum {
diff --git a/backends/wctablet.c b/backends/wctablet.c
index 07a4cde956..6c13c2c58a 100644
--- a/backends/wctablet.c
+++ b/backends/wctablet.c
@@ -32,7 +32,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu-common.h"
-#include "chardev/char.h"
+#include "chardev/char-serial.h"
 #include "ui/console.h"
 #include "ui/input.h"
 #include "trace.h"
diff --git a/hw/arm/strongarm.c b/hw/arm/strongarm.c
index 66cad198d4..967caea749 100644
--- a/hw/arm/strongarm.c
+++ b/hw/arm/strongarm.c
@@ -34,7 +34,7 @@
 #include "strongarm.h"
 #include "qemu/error-report.h"
 #include "hw/arm/arm.h"
-#include "chardev/char.h"
+#include "chardev/char-serial.h"
 #include "sysemu/sysemu.h"
 #include "hw/ssi/ssi.h"
 #include "qemu/cutils.h"
diff --git a/hw/bt/hci-csr.c b/hw/bt/hci-csr.c
index cc2087392e..0f2021086d 100644
--- a/hw/bt/hci-csr.c
+++ b/hw/bt/hci-csr.c
@@ -20,7 +20,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu-common.h"
-#include "chardev/char.h"
+#include "chardev/char-serial.h"
 #include "qemu/timer.h"
 #include "qemu/bswap.h"
 #include "hw/irq.h"
diff --git a/hw/char/cadence_uart.c b/hw/char/cadence_u

[Qemu-devel] [PULL 14/15] char: make chr_fe_deinit() optionally delete backend

2017-06-02 Thread Marc-André Lureau

This simplifies removing a backend for a frontend user: there is no
need to retrieve the associated driver and make a separate delete call.

NB: many frontends have questionable handling of ending a chardev. They
should probably delete the backend to prevent broken reuse.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Philippe Mathieu-Daudé 
---
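A minimal sketch of the new calling convention, assuming a much
simplified object model: FakeChardev, FakeCharBackend and
fake_fe_deinit() are hypothetical stand-ins, and setting a flag
replaces the real object_unparent() call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct FakeCharBackend;

/* Hypothetical stand-in for Chardev. */
typedef struct FakeChardev {
    bool deleted;                  /* set where object_unparent() would run */
    struct FakeCharBackend *be;    /* frontend currently attached, if any */
} FakeChardev;

/* Hypothetical stand-in for CharBackend. */
typedef struct FakeCharBackend {
    FakeChardev *chr;
} FakeCharBackend;

/* Dissociate the frontend; optionally delete the chardev as well. */
static void fake_fe_deinit(FakeCharBackend *b, bool del)
{
    assert(b);

    if (b->chr) {
        if (b->chr->be == b) {
            b->chr->be = NULL;
        }
        if (del) {
            b->chr->deleted = true;
        }
        b->chr = NULL;
    }
}
```

A caller that used to fetch the driver, deinit, and then unparent it
can now pass del = true in a single call; passing false keeps the
previous behaviour. Calling it without an associated chardev remains a
no-op.
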
 include/chardev/char-fe.h|  6 --
 backends/rng-egd.c   |  2 +-
 chardev/char-fe.c|  5 -
 chardev/char-mux.c   |  2 +-
 gdbstub.c| 15 ++-
 hw/char/serial.c |  2 +-
 hw/char/xen_console.c|  2 +-
 hw/core/qdev-properties-system.c |  2 +-
 hw/usb/ccid-card-passthru.c  |  5 +
 hw/usb/redirect.c|  4 +---
 monitor.c|  2 +-
 net/colo-compare.c   |  8 +++-
 net/filter-mirror.c  |  6 +++---
 net/vhost-user.c |  5 +
 tests/test-char.c| 22 --
 tests/vhost-user-test.c  |  4 +---
 16 files changed, 34 insertions(+), 58 deletions(-)

diff --git a/include/chardev/char-fe.h b/include/chardev/char-fe.h
index bd82093218..2cbb262f66 100644
--- a/include/chardev/char-fe.h
+++ b/include/chardev/char-fe.h
@@ -30,12 +30,14 @@ bool qemu_chr_fe_init(CharBackend *b, Chardev *s, Error **errp);
 
 /**
  * @qemu_chr_fe_deinit:
- *
+ * @b: a CharBackend
+ * @del: if true, delete the chardev backend
+*
  * Dissociate the CharBackend from the Chardev.
  *
  * Safe to call without associated Chardev.
  */
-void qemu_chr_fe_deinit(CharBackend *b);
+void qemu_chr_fe_deinit(CharBackend *b, bool del);
 
 /**
  * @qemu_chr_fe_get_driver:
diff --git a/backends/rng-egd.c b/backends/rng-egd.c
index ad3e1e5edf..e7ce2cac80 100644
--- a/backends/rng-egd.c
+++ b/backends/rng-egd.c
@@ -145,7 +145,7 @@ static void rng_egd_finalize(Object *obj)
 {
 RngEgd *s = RNG_EGD(obj);
 
-qemu_chr_fe_deinit(&s->chr);
+qemu_chr_fe_deinit(&s->chr, false);
 g_free(s->chr_name);
 }
 
diff --git a/chardev/char-fe.c b/chardev/char-fe.c
index 341221d029..3f90f0567c 100644
--- a/chardev/char-fe.c
+++ b/chardev/char-fe.c
@@ -211,7 +211,7 @@ unavailable:
 return false;
 }
 
-void qemu_chr_fe_deinit(CharBackend *b)
+void qemu_chr_fe_deinit(CharBackend *b, bool del)
 {
 assert(b);
 
@@ -224,6 +224,9 @@ void qemu_chr_fe_deinit(CharBackend *b)
 MuxChardev *d = MUX_CHARDEV(b->chr);
 d->backends[b->tag] = NULL;
 }
+if (del) {
+object_unparent(OBJECT(b->chr));
+}
 b->chr = NULL;
 }
 }
diff --git a/chardev/char-mux.c b/chardev/char-mux.c
index 106c682e7f..08570b915e 100644
--- a/chardev/char-mux.c
+++ b/chardev/char-mux.c
@@ -266,7 +266,7 @@ static void char_mux_finalize(Object *obj)
 be->chr = NULL;
 }
 }
-qemu_chr_fe_deinit(&d->chr);
+qemu_chr_fe_deinit(&d->chr, false);
 }
 
 void mux_chr_set_handlers(Chardev *chr, GMainContext *context)
diff --git a/gdbstub.c b/gdbstub.c
index 4251d23898..ec4e4b25be 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -1678,9 +1678,6 @@ void gdb_exit(CPUArchState *env, int code)
 {
   GDBState *s;
   char buf[4];
-#ifndef CONFIG_USER_ONLY
-  Chardev *chr;
-#endif
 
   s = gdbserver_state;
   if (!s) {
@@ -1690,19 +1687,13 @@ void gdb_exit(CPUArchState *env, int code)
   if (gdbserver_fd < 0 || s->fd < 0) {
   return;
   }
-#else
-  chr = qemu_chr_fe_get_driver(&s->chr);
-  if (!chr) {
-  return;
-  }
 #endif
 
   snprintf(buf, sizeof(buf), "W%02x", (uint8_t)code);
   put_packet(s, buf);
 
 #ifndef CONFIG_USER_ONLY
-  qemu_chr_fe_deinit(&s->chr);
-  object_unparent(OBJECT(chr));
+  qemu_chr_fe_deinit(&s->chr, true);
 #endif
 }
 
@@ -2002,9 +1993,7 @@ int gdbserver_start(const char *device)
NULL, &error_abort);
 monitor_init(mon_chr, 0);
 } else {
-if (qemu_chr_fe_get_driver(&s->chr)) {
-object_unparent(OBJECT(qemu_chr_fe_get_driver(&s->chr)));
-}
+qemu_chr_fe_deinit(&s->chr, true);
 mon_chr = s->mon_chr;
 memset(s, 0, sizeof(GDBState));
 s->mon_chr = mon_chr;
diff --git a/hw/char/serial.c b/hw/char/serial.c
index 23e5fe9d18..e1f12507bf 100644
--- a/hw/char/serial.c
+++ b/hw/char/serial.c
@@ -905,7 +905,7 @@ void serial_realize_core(SerialState *s, Error **errp)
 
 void serial_exit_core(SerialState *s)
 {
-qemu_chr_fe_deinit(&s->chr);
+qemu_chr_fe_deinit(&s->chr, false);
 
 timer_del(s->modem_status_poll);
 timer_free(s->modem_status_poll);
diff --git a/hw/char/xen_console.c b/hw/char/xen_console.c
index cb849c2e3e..f9af8cadf4 100644
--- a/hw/char/xen_console.c
+++ b/hw/char/xen_console.c
@@ -261,7 +261,7 @@ static void con_disconnect(struct XenDevice *xendev)
 {
 struct XenConsole *con = container_of(xendev, struct XenConsole, xendev);
 
-qemu_chr_fe_deinit(&con->chr);
+qemu_chr_fe_deinit(&con->chr, fa

Re: [Qemu-devel] [PATCH] msi: remove return code for msi_init()

2017-06-02 Thread Markus Armbruster

Peter Xu  writes:

> On Thu, Jun 01, 2017 at 03:06:29PM -0700, Paul Burton wrote:
>> Hi Aurelien/Paolo/Marcel,
>> 
>> On Thursday, 1 June 2017 12:22:06 PDT Aurelien Jarno wrote:
>> > On 2017-06-01 16:23, Paolo Bonzini wrote:
>> > > On 01/06/2017 10:27, Marcel Apfelbaum wrote:
>> > > > On 31/05/2017 11:28, Paolo Bonzini wrote:
>> > > >> No, for now I'd rather just go and remove msi_nonbroken.  When someone
>> > > >> reports a bug, we can add back "msi_broken".
>> > > > 
>> > > > Hi,
>> > > > I agree with the direction, but I am concerned msi_nonbroken is there
>> > > > for a reason.
>> > > > We might break some (obscure/not in use) machine.
>> > > > Maybe we should CC all arch machine maintainers/contributors to give
>> > > > them a chance to object...
>> > > 
>> > > Yeah, Alpha, MIPS and SH are those that support PCI.  Adding Richard and
>> > > Aurelien, do your platforms support MSI on real hardware but not in QEMU?
>> > 
>> > SH clearly doesn't support MSI.
>> > 
>> > The oldest MIPS board also do not support MSI, but I guess the Boston
>> > board might support it. I am adding Paul Burton in Cc: who probably
>> > knows about that.
>> > 
>> > Aurelien
>> 
>> Indeed, real Boston hardware does support MSI (or rather, the Xilinx AXI 
>> Bridge for PCI Express IP used on Boston does) & we make use of it in Linux.
>> 
>> Thanks,
>> Paul
>
> Does this mean that we'd better still keep the msi_nonbroken bit?

If we still need the "monkey-patch MSI-capable devices to hide board
bugs" logic, it should become opt-in rather than opt-out, i.e. broken
boards set msi_broken (with a suitable comment), non-broken boards don't
touch it.
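
To illustrate what opt-in would look like, here is a rough sketch. It
is not existing QEMU code: fake_msi_broken, fake_broken_board_init()
and fake_msi_init() are hypothetical names, and the -1 return value is
only a placeholder for whatever error the real msi_init() would
report.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag; false by default, so sane boards never touch it. */
static bool fake_msi_broken;

/* Only a board with a known MSI bug opts in, and says why. */
static void fake_broken_board_init(void)
{
    /* Assumed example: this board's interrupt path cannot deliver MSIs. */
    fake_msi_broken = true;
}

/* Returns 0 on success, -1 when MSI must stay hidden from the guest. */
static int fake_msi_init(void)
{
    return fake_msi_broken ? -1 : 0;
}
```

With this shape, a new board gets working MSI without any extra code,
and each remaining workaround is visible at the board that needs it.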

> Anyway, maybe we can first merge Paolo's fix on edu device:
>
>   [PATCH] edu: fix memory leak on msi_broken platforms
>
> Then we can see whether we still need the rest of the changes.
>
> Thanks,

[Qemu-devel] [PULL 15/15] char: move char devices to chardev/

2017-06-02 Thread Marc-André Lureau

Suggested by Paolo Bonzini during series review.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Philippe Mathieu-Daudé 
---
 {backends => chardev}/baum.c |  0
 {backends => chardev}/msmouse.c  |  0
 spice-qemu-char.c => chardev/spice.c |  2 +-
 {backends => chardev}/testdev.c  |  0
 {backends => chardev}/wctablet.c |  0
 MAINTAINERS  |  4 +---
 Makefile.objs|  4 ++--
 backends/Makefile.objs   |  4 
 backends/trace-events| 10 --
 chardev/Makefile.objs|  6 ++
 chardev/trace-events | 18 ++
 trace-events |  7 ---
 12 files changed, 28 insertions(+), 27 deletions(-)
 rename {backends => chardev}/baum.c (100%)
 rename {backends => chardev}/msmouse.c (100%)
 rename spice-qemu-char.c => chardev/spice.c (99%)
 rename {backends => chardev}/testdev.c (100%)
 rename {backends => chardev}/wctablet.c (100%)
 create mode 100644 chardev/trace-events

diff --git a/backends/baum.c b/chardev/baum.c
similarity index 100%
rename from backends/baum.c
rename to chardev/baum.c
diff --git a/backends/msmouse.c b/chardev/msmouse.c
similarity index 100%
rename from backends/msmouse.c
rename to chardev/msmouse.c
diff --git a/spice-qemu-char.c b/chardev/spice.c
similarity index 99%
rename from spice-qemu-char.c
rename to chardev/spice.c
index 1c6c2e3969..a312078812 100644
--- a/spice-qemu-char.c
+++ b/chardev/spice.c
@@ -1,5 +1,5 @@
 #include "qemu/osdep.h"
-#include "trace-root.h"
+#include "trace.h"
 #include "ui/qemu-spice.h"
 #include "chardev/char.h"
 #include "qemu/error-report.h"
diff --git a/backends/testdev.c b/chardev/testdev.c
similarity index 100%
rename from backends/testdev.c
rename to chardev/testdev.c
diff --git a/backends/wctablet.c b/chardev/wctablet.c
similarity index 100%
rename from backends/wctablet.c
rename to chardev/wctablet.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 48e2964ed8..120788d8fb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1239,13 +1239,11 @@ M: Marc-André Lureau 
 S: Maintained
 F: chardev/
 F: include/chardev/
-F: backends/msmouse.c
-F: backends/testdev.c
 
 Character Devices (Braille)
 M: Samuel Thibault 
 S: Maintained
-F: backends/baum.c
+F: chardev/baum.c
 
 Command line option argument parsing
 M: Markus Armbruster 
diff --git a/Makefile.objs b/Makefile.objs
index 2100845ce2..0575802440 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -50,8 +50,6 @@ common-obj-$(CONFIG_LINUX) += fsdev/
 
 common-obj-y += migration/
 
-common-obj-$(CONFIG_SPICE) += spice-qemu-char.o
-
 common-obj-y += audio/
 common-obj-y += hw/
 common-obj-y += accel.o
@@ -70,6 +68,7 @@ common-obj-y += tpm.o
 common-obj-$(CONFIG_SLIRP) += slirp/
 
 common-obj-y += backends/
+common-obj-y += chardev/
 
 common-obj-$(CONFIG_SECCOMP) += qemu-seccomp.o
 
@@ -121,6 +120,7 @@ trace-events-subdirs += io
 trace-events-subdirs += migration
 trace-events-subdirs += block
 trace-events-subdirs += backends
+trace-events-subdirs += chardev
 trace-events-subdirs += hw/block
 trace-events-subdirs += hw/block/dataplane
 trace-events-subdirs += hw/char
diff --git a/backends/Makefile.objs b/backends/Makefile.objs
index 0e0f1567b2..0400799efd 100644
--- a/backends/Makefile.objs
+++ b/backends/Makefile.objs
@@ -1,10 +1,6 @@
 common-obj-y += rng.o rng-egd.o
 common-obj-$(CONFIG_POSIX) += rng-random.o
 
-common-obj-y += msmouse.o wctablet.o testdev.o
-common-obj-$(CONFIG_BRLAPI) += baum.o
-baum.o-cflags := $(SDL_CFLAGS)
-
 common-obj-$(CONFIG_TPM) += tpm.o
 
 common-obj-y += hostmem.o hostmem-ram.o
diff --git a/backends/trace-events b/backends/trace-events
index 8c3289a3f9..e69de29bb2 100644
--- a/backends/trace-events
+++ b/backends/trace-events
@@ -1,10 +0,0 @@
-# See docs/tracing.txt for syntax documentation.
-
-# backends/wctablet.c
-wct_init(void) ""
-wct_cmd_re(void) ""
-wct_cmd_st(void) ""
-wct_cmd_sp(void) ""
-wct_cmd_ts(int input) "0x%02x"
-wct_cmd_other(const char *cmd) "%s"
-wct_speed(int speed) "%d"
diff --git a/chardev/Makefile.objs b/chardev/Makefile.objs
index e0b37dbfd8..52a8127606 100644
--- a/chardev/Makefile.objs
+++ b/chardev/Makefile.objs
@@ -16,3 +16,9 @@ chardev-obj-y += char-stdio.o
 chardev-obj-y += char-udp.o
 chardev-obj-$(CONFIG_WIN32) += char-win.o
 chardev-obj-$(CONFIG_WIN32) += char-win-stdio.o
+
+common-obj-y += msmouse.o wctablet.o testdev.o
+common-obj-$(CONFIG_BRLAPI) += baum.o
+baum.o-cflags := $(SDL_CFLAGS)
+
+common-obj-$(CONFIG_SPICE) += spice.o
diff --git a/chardev/trace-events b/chardev/trace-events
new file mode 100644
index 00..822dde668b
--- /dev/null
+++ b/chardev/trace-events
@@ -0,0 +1,18 @@
+# See docs/tracing.txt for syntax documentation.
+
+# chardev/wctablet.c
+wct_init(void) ""
+wct_cmd_re(void) ""
+wct_cmd_st(void) ""
+wct_cmd_sp(void) ""
+wct_cmd_ts(int input) "0x%02x"
+wct_cmd_other(const char *cmd) "%s"
+wct_speed(int speed) "%d"
+
+# chardev/spice.c
+spice_vmc_write(ssize_t out, int len) "spi

[Qemu-devel] [PATCH v2 0/6] Convert to realize and cleanup

2017-06-02 Thread Mao Zhongyi

v2:
* patch1: subject and commit message were rewritten by Markus.
* patch2: comment was added to pci_add_capability2().
* patch3: a new patch that fixes the wrong return value judgment condition.
* patch4: a new patch that fixes code style problems.
* patch5: add an errp argument for pci_add_capability to pass
  error for its callers.
* patch6: convert part of pci-bridge device to realize.

v1:
* patch1: fix unreasonable return value check

Mao Zhongyi (6):
  pci: Clean up error checking in pci_add_capability()
  pci: Add comment for pci_add_capability2()
  pci: Fix the wrong return value judgment condition
  net/eepro100: Fixed code style
  pci: Make errp the last parameter of pci_add_capability()
  pci: Convert to realize

 hw/i386/amd_iommu.c| 24 
 hw/net/e1000e.c|  9 -
 hw/net/eepro100.c  | 77 +-
 hw/pci-bridge/i82801b11.c  | 12 +++---
 hw/pci-bridge/pcie_root_port.c | 15 +++-
 hw/pci-bridge/xio3130_downstream.c | 20 +-
 hw/pci-bridge/xio3130_upstream.c   | 20 +-
 hw/pci/pci.c   | 18 -
 hw/pci/pci_bridge.c|  8 +++-
 hw/pci/pcie.c  | 15 ++--
 hw/pci/shpc.c  |  5 ++-
 hw/pci/slotid_cap.c|  7 +++-
 hw/vfio/pci.c  |  5 ++-
 hw/virtio/virtio-pci.c | 19 +++---
 include/hw/pci/pci.h   |  3 +-
 include/hw/pci/pci_bridge.h|  3 +-
 include/hw/pci/pcie.h  |  3 +-
 17 files changed, 154 insertions(+), 109 deletions(-)

-- 
2.9.3

[Qemu-devel] [PATCH v2 1/6] pci: Clean up error checking in pci_add_capability()

2017-06-02 Thread Mao Zhongyi

On success, pci_add_capability2() returns a positive value. On
failure, it sets an error and returns a negative value.

pci_add_capability() laboriously checks this behavior. No other
caller does. Drop the checks from pci_add_capability().

Cc: m...@redhat.com
Cc: mar...@redhat.com
Cc: arm...@redhat.com
Signed-off-by: Mao Zhongyi 
Reviewed-by: Marcel Apfelbaum 
---
 hw/pci/pci.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 98ccc27..53566b8 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2270,12 +2270,8 @@ int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
 Error *local_err = NULL;
 
 ret = pci_add_capability2(pdev, cap_id, offset, size, &local_err);
-if (local_err) {
-assert(ret < 0);
+if (ret < 0) {
 error_report_err(local_err);
-} else {
-/* success implies a positive offset in config space */
-assert(ret > 0);
 }
 return ret;
 }
-- 
2.9.3
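
The contract this patch relies on can be sketched in isolation. The following is a minimal, self-contained illustration and not QEMU code: SketchError, sketch_add_cap() and sketch_add_cap_checked() are hypothetical stand-ins for Error, pci_add_capability2() and pci_add_capability(). The only property modeled is "positive on success, negative with an error set on failure", which is why the wrapper needs nothing more than a single `ret < 0` test.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for QEMU's Error object. */
typedef struct SketchError {
    const char *msg;
} SketchError;

/*
 * Hypothetical stand-in for pci_add_capability2(): returns a positive
 * offset on success; on failure it stores an error (if the caller asked
 * for one) and returns a negative errno value. It never returns 0.
 */
static int sketch_add_cap(int offset, int size, SketchError **errp)
{
    static SketchError no_room = { "capability does not fit" };

    assert(size > 0);                 /* a zero-sized capability is a caller bug */
    if (offset < 0x40 || offset + size > 0x100) {
        if (errp) {
            *errp = &no_room;
        }
        return -EINVAL;
    }
    return offset;
}

/* Wrapper in the style of the patched pci_add_capability(). */
static int sketch_add_cap_checked(int offset, int size)
{
    SketchError *local_err = NULL;
    int ret = sketch_add_cap(offset, size, &local_err);

    if (ret < 0) {
        /* Under the contract, a negative return implies local_err is set. */
        fprintf(stderr, "error: %s\n", local_err->msg);
    }
    return ret;
}
```

Because the callee guarantees that the return value and the error object agree, the wrapper does not need to cross-check them with assertions.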

[Qemu-devel] [PATCH v2 2/6] pci: Add comment for pci_add_capability2()

2017-06-02 Thread Mao Zhongyi

Add a comment for pci_add_capability2() to explain the return
value. This may help its callers check the return value
correctly.

Cc: m...@redhat.com
Cc: mar...@redhat.com
Cc: arm...@redhat.com
Suggested-by: Markus Armbruster 
Signed-off-by: Mao Zhongyi 
---
 hw/pci/pci.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 53566b8..9810d5f 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2276,6 +2276,10 @@ int pci_add_capability(PCIDevice *pdev, uint8_t cap_id,
 return ret;
 }
 
+/*
+ * On success, pci_add_capability2() returns a positive value.
+ * On failure, it sets an error and returns a negative value.
+ */
 int pci_add_capability2(PCIDevice *pdev, uint8_t cap_id,
uint8_t offset, uint8_t size,
Error **errp)
-- 
2.9.3

[Qemu-devel] [PATCH v2 6/6] pci: Convert to realize

2017-06-02 Thread Mao Zhongyi

The pci-bridge devices i82801b11 and xio3130_upstream/downstream
still implement the old PCIDeviceClass .init() through *_initfn()
instead of the new .realize(). All devices need to be converted
to .realize(), so convert them and rename the functions to *_realize().

Cc: m...@redhat.com
Cc: mar...@redhat.com
Cc: arm...@redhat.com
Signed-off-by: Mao Zhongyi 
---
 hw/pci-bridge/i82801b11.c  | 11 +--
 hw/pci-bridge/pcie_root_port.c | 15 ++-
 hw/pci-bridge/xio3130_downstream.c | 20 +---
 hw/pci-bridge/xio3130_upstream.c   | 20 +---
 hw/pci/pci_bridge.c|  7 +++
 hw/pci/pcie.c  | 11 ++-
 include/hw/pci/pci_bridge.h|  3 ++-
 include/hw/pci/pcie.h  |  3 ++-
 8 files changed, 42 insertions(+), 48 deletions(-)

diff --git a/hw/pci-bridge/i82801b11.c b/hw/pci-bridge/i82801b11.c
index 2c065c3..2c1b747 100644
--- a/hw/pci-bridge/i82801b11.c
+++ b/hw/pci-bridge/i82801b11.c
@@ -59,24 +59,23 @@ typedef struct I82801b11Bridge {
 /*< public >*/
 } I82801b11Bridge;
 
-static int i82801b11_bridge_initfn(PCIDevice *d)
+static void i82801b11_bridge_realize(PCIDevice *d, Error **errp)
 {
 int rc;
 
 pci_bridge_initfn(d, TYPE_PCI_BUS);
 
 rc = pci_bridge_ssvid_init(d, I82801ba_SSVID_OFFSET,
-   I82801ba_SSVID_SVID, I82801ba_SSVID_SSID);
+   I82801ba_SSVID_SVID, I82801ba_SSVID_SSID,
+   errp);
 if (rc < 0) {
 goto err_bridge;
 }
 pci_config_set_prog_interface(d->config, PCI_CLASS_BRIDGE_PCI_INF_SUB);
-return 0;
+return;
 
 err_bridge:
 pci_bridge_exitfn(d);
-
-return rc;
 }
 
 static const VMStateDescription i82801b11_bridge_dev_vmstate = {
@@ -96,7 +95,7 @@ static void i82801b11_bridge_class_init(ObjectClass *klass, 
void *data)
 k->vendor_id = PCI_VENDOR_ID_INTEL;
 k->device_id = PCI_DEVICE_ID_INTEL_82801BA_11;
 k->revision = ICH9_D2P_A2_REVISION;
-k->init = i82801b11_bridge_initfn;
+k->realize = i82801b11_bridge_realize;
 k->config_write = pci_bridge_write_config;
 dc->vmsd = &i82801b11_bridge_dev_vmstate;
 set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
diff --git a/hw/pci-bridge/pcie_root_port.c b/hw/pci-bridge/pcie_root_port.c
index cf36318..00f0b1f 100644
--- a/hw/pci-bridge/pcie_root_port.c
+++ b/hw/pci-bridge/pcie_root_port.c
@@ -59,29 +59,27 @@ static void rp_realize(PCIDevice *d, Error **errp)
 PCIDeviceClass *dc = PCI_DEVICE_GET_CLASS(d);
 PCIERootPortClass *rpc = PCIE_ROOT_PORT_GET_CLASS(d);
 int rc;
-Error *local_err = NULL;
 
 pci_config_set_interrupt_pin(d->config, 1);
 pci_bridge_initfn(d, TYPE_PCIE_BUS);
 pcie_port_init_reg(d);
 
-rc = pci_bridge_ssvid_init(d, rpc->ssvid_offset, dc->vendor_id, rpc->ssid);
+rc = pci_bridge_ssvid_init(d, rpc->ssvid_offset, dc->vendor_id,
+   rpc->ssid, errp);
 if (rc < 0) {
-error_setg(errp, "Can't init SSV ID, error %d", rc);
 goto err_bridge;
 }
 
 if (rpc->interrupts_init) {
-rc = rpc->interrupts_init(d, &local_err);
+rc = rpc->interrupts_init(d, errp);
 if (rc < 0) {
-error_propagate(errp, local_err);
 goto err_bridge;
 }
 }
 
-rc = pcie_cap_init(d, rpc->exp_offset, PCI_EXP_TYPE_ROOT_PORT, p->port);
+rc = pcie_cap_init(d, rpc->exp_offset, PCI_EXP_TYPE_ROOT_PORT,
+   p->port, errp);
 if (rc < 0) {
-error_setg(errp, "Can't add Root Port capability, error %d", rc);
 goto err_int;
 }
 
@@ -98,9 +96,8 @@ static void rp_realize(PCIDevice *d, Error **errp)
 }
 
 rc = pcie_aer_init(d, PCI_ERR_VER, rpc->aer_offset,
-   PCI_ERR_SIZEOF, &local_err);
+   PCI_ERR_SIZEOF, errp);
 if (rc < 0) {
-error_propagate(errp, local_err);
 goto err;
 }
 pcie_aer_root_init(d);
diff --git a/hw/pci-bridge/xio3130_downstream.c 
b/hw/pci-bridge/xio3130_downstream.c
index cfe8a36..e706f36 100644
--- a/hw/pci-bridge/xio3130_downstream.c
+++ b/hw/pci-bridge/xio3130_downstream.c
@@ -56,33 +56,33 @@ static void xio3130_downstream_reset(DeviceState *qdev)
 pci_bridge_reset(qdev);
 }
 
-static int xio3130_downstream_initfn(PCIDevice *d)
+static void xio3130_downstream_realize(PCIDevice *d, Error **errp)
 {
 PCIEPort *p = PCIE_PORT(d);
 PCIESlot *s = PCIE_SLOT(d);
 int rc;
-Error *err = NULL;
 
 pci_bridge_initfn(d, TYPE_PCIE_BUS);
 pcie_port_init_reg(d);
 
 rc = msi_init(d, XIO3130_MSI_OFFSET, XIO3130_MSI_NR_VECTOR,
   XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT,
-  XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT, &err);
+  XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT,
+  errp);
 if (rc < 0) {
 assert(rc == -ENOTSUP);
-error_repo
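
The shape of this conversion can be sketched outside of QEMU. The code below is a minimal illustration under stated assumptions: SketchError, SketchDevice, sketch_error_set(), sketch_ssvid_init() and both sketch_bridge_*() functions are hypothetical names, not QEMU APIs. It contrasts an init-style function, which can only return a status code, with a realize-style function, which forwards errp to the failing helper so that no local error or propagate step is needed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's Error object. */
typedef struct SketchError {
    const char *msg;
} SketchError;

/* Hypothetical helper: store an error if the caller wants one. */
static void sketch_error_set(SketchError **errp, SketchError *err)
{
    if (errp) {
        assert(*errp == NULL);    /* never overwrite a pending error */
        *errp = err;
    }
}

typedef struct SketchDevice {
    int ssvid_offset;
    int initialized;
} SketchDevice;

/* Positive offset on success; negative errno with *errp set on failure. */
static int sketch_ssvid_init(SketchDevice *d, SketchError **errp)
{
    static SketchError bad_offset = { "invalid SSVID offset" };

    if (d->ssvid_offset < 0x40) {
        sketch_error_set(errp, &bad_offset);
        return -EINVAL;
    }
    return d->ssvid_offset;
}

/* Old style: the caller only learns a status code. */
static int sketch_bridge_init(SketchDevice *d)
{
    int rc = sketch_ssvid_init(d, NULL);

    if (rc < 0) {
        return rc;
    }
    d->initialized = 1;
    return 0;
}

/* New style: void return, the reason for failure travels through errp. */
static void sketch_bridge_realize(SketchDevice *d, SketchError **errp)
{
    if (sketch_ssvid_init(d, errp) < 0) {
        return;                   /* errp already carries the reason */
    }
    d->initialized = 1;
}
```

Passing errp straight through is only safe here because the helper also reports failure through its return value, so the realize-style function never has to dereference errp.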

[Qemu-devel] [PATCH v2 3/6] pci: Fix the wrong return value judgment condition

2017-06-02 Thread Mao Zhongyi

On success, pci_add_capability2() returns a positive value. On
failure, it sets an error and returns a negative value. It never
returns 0, so a check that treats a return value of 0 as success
is wrong. Fix the error checks in its callers.

Cc: dmi...@daynix.com
Cc: jasow...@redhat.com
Cc: alex.william...@redhat.com
Cc: mar...@redhat.com
Cc: m...@redhat.com
Cc: arm...@redhat.com
Signed-off-by: Mao Zhongyi 
---
 hw/net/e1000e.c   | 2 +-
 hw/net/eepro100.c | 2 +-
 hw/vfio/pci.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
index 6e23493..8259d67 100644
--- a/hw/net/e1000e.c
+++ b/hw/net/e1000e.c
@@ -374,7 +374,7 @@ e1000e_add_pm_capability(PCIDevice *pdev, uint8_t offset, 
uint16_t pmc)
 {
 int ret = pci_add_capability(pdev, PCI_CAP_ID_PM, offset, PCI_PM_SIZEOF);
 
-if (ret >= 0) {
+if (ret > 0) {
 pci_set_word(pdev->config + offset + PCI_PM_PMC,
  PCI_PM_CAP_VER_1_1 |
  pmc);
diff --git a/hw/net/eepro100.c b/hw/net/eepro100.c
index 4bf71f2..da36816 100644
--- a/hw/net/eepro100.c
+++ b/hw/net/eepro100.c
@@ -571,7 +571,7 @@ static void e100_pci_reset(EEPRO100State * s)
 int cfg_offset = 0xdc;
 int r = pci_add_capability(&s->dev, PCI_CAP_ID_PM,
cfg_offset, PCI_PM_SIZEOF);
-assert(r >= 0);
+assert(r > 0);
 pci_set_word(pci_conf + cfg_offset + PCI_PM_PMC, 0x7e21);
 #if 0 /* TODO: replace dummy code for power management emulation. */
 /* TODO: Power Management Control / Status. */
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 32aca77..5881968 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1744,7 +1744,7 @@ static int vfio_setup_pcie_cap(VFIOPCIDevice *vdev, int 
pos, uint8_t size,
 }
 
 pos = pci_add_capability(&vdev->pdev, PCI_CAP_ID_EXP, pos, size);
-if (pos >= 0) {
+if (pos > 0) {
 vdev->pdev.exp.exp_cap = pos;
 }
 
-- 
2.9.3

Re: [Qemu-devel] [PATCH v2 10/45] qapi: Remove visit_start_alternate() parameter promote_int

2017-06-02 Thread Markus Armbruster

Marc-André Lureau  writes:

> Before the previous commit, parameter promote_int = true made
> visit_start_alternate() with an input visitor avoid QTYPE_QINT
> variants and create QTYPE_QFLOAT variants instead.  This was used
> where QTYPE_QINT variants were invalid.
>
> The previous commit fused QTYPE_QINT with QTYPE_QFLOAT, rendering
> rendering promote_int useless and unused.

Scratch one of two "rendering".

> Signed-off-by: Marc-André Lureau 

With that tidied up:
Reviewed-by: Markus Armbruster

Re: [Qemu-devel] [PULL 00/22] Docker and block patches

2017-06-02 Thread Fam Zheng

On Thu, 06/01 18:18, Peter Maydell wrote:
> On 26 May 2017 at 08:52, Fam Zheng  wrote:
> > The following changes since commit 9964e96dccf7f7c936ee854a795415d19b60:
> >
> >   Merge remote-tracking branch 'jasowang/tags/net-pull-request' into 
> > staging (2017-05-23 15:01:31 +0100)
> >
> > are available in the git repository at:
> >
> >   git://github.com/famz/qemu.git tags/docker-and-block-pull-request
> >
> > for you to fetch changes up to 77269bba94ef97de99ae61fdc98629a8704ae2ed:
> >
> >   block: make accounting thread-safe (2017-05-26 09:25:30 +0800)
> >
> > 
> >
> > For Paolo's block layer thread safety part I and my docker testing
> > enhancements.
> >
> > 
> 
> Hi. I'm afraid this doesn't build on BSD or OSX:
> 
> libqemuutil.a(stats64.o): In function `stat64_rdlock':
> /root/qemu/util/stats64.c:24: undefined reference to `cpu_relax'
> libqemuutil.a(stats64.o): In function `stat64_add32_carry':
> /root/qemu/util/stats64.c:64: undefined reference to `cpu_relax'
> libqemuutil.a(stats64.o): In function `stat64_min_slow':
> /root/qemu/util/stats64.c:85: undefined reference to `cpu_relax'
> libqemuutil.a(stats64.o): In function `stat64_max_slow':
> /root/qemu/util/stats64.c:114: undefined reference to `cpu_relax'
> 
> 
> and
> /Users/pm215/src/qemu-for-merges/util/stats64.c:24:9: error: implicit
> declaration of function 'cpu_relax' is invalid in C99
> [-Werror,-Wimplicit-function-declaration]
> cpu_relax();
> ^
> 
> 
> Looks like an omitted include of qemu/processor.h ?

Yes, including this one works for me, I'll add it and send v2. Thanks,

Fam

[Qemu-devel] [PATCH v2 4/6] net/eepro100: Fixed code style

2017-06-02 Thread Mao Zhongyi

checkpatch.pl reports a code style problem (ERROR: "foo * bar" should
be "foo *bar"). Fix it to conform to the coding standards.

Cc: jasow...@redhat.com
Cc: arm...@redhat.com
Signed-off-by: Mao Zhongyi 
---
 hw/net/eepro100.c | 62 +++
 1 file changed, 31 insertions(+), 31 deletions(-)

diff --git a/hw/net/eepro100.c b/hw/net/eepro100.c
index da36816..62e989c 100644
--- a/hw/net/eepro100.c
+++ b/hw/net/eepro100.c
@@ -405,7 +405,7 @@ enum scb_stat_ack {
 stat_ack_tx = (stat_ack_cu_idle | stat_ack_cu_cmd_done),
 };
 
-static void disable_interrupt(EEPRO100State * s)
+static void disable_interrupt(EEPRO100State *s)
 {
 if (s->int_stat) {
 TRACE(INT, logout("interrupt disabled\n"));
@@ -414,7 +414,7 @@ static void disable_interrupt(EEPRO100State * s)
 }
 }
 
-static void enable_interrupt(EEPRO100State * s)
+static void enable_interrupt(EEPRO100State *s)
 {
 if (!s->int_stat) {
 TRACE(INT, logout("interrupt enabled\n"));
@@ -423,7 +423,7 @@ static void enable_interrupt(EEPRO100State * s)
 }
 }
 
-static void eepro100_acknowledge(EEPRO100State * s)
+static void eepro100_acknowledge(EEPRO100State *s)
 {
 s->scb_stat &= ~s->mem[SCBAck];
 s->mem[SCBAck] = s->scb_stat;
@@ -432,7 +432,7 @@ static void eepro100_acknowledge(EEPRO100State * s)
 }
 }
 
-static void eepro100_interrupt(EEPRO100State * s, uint8_t status)
+static void eepro100_interrupt(EEPRO100State *s, uint8_t status)
 {
 uint8_t mask = ~s->mem[SCBIntmask];
 s->mem[SCBAck] |= status;
@@ -449,52 +449,52 @@ static void eepro100_interrupt(EEPRO100State * s, uint8_t 
status)
 }
 }
 
-static void eepro100_cx_interrupt(EEPRO100State * s)
+static void eepro100_cx_interrupt(EEPRO100State *s)
 {
 /* CU completed action command. */
 /* Transmit not ok (82557 only, not in emulation). */
 eepro100_interrupt(s, 0x80);
 }
 
-static void eepro100_cna_interrupt(EEPRO100State * s)
+static void eepro100_cna_interrupt(EEPRO100State *s)
 {
 /* CU left the active state. */
 eepro100_interrupt(s, 0x20);
 }
 
-static void eepro100_fr_interrupt(EEPRO100State * s)
+static void eepro100_fr_interrupt(EEPRO100State *s)
 {
 /* RU received a complete frame. */
 eepro100_interrupt(s, 0x40);
 }
 
-static void eepro100_rnr_interrupt(EEPRO100State * s)
+static void eepro100_rnr_interrupt(EEPRO100State *s)
 {
 /* RU is not ready. */
 eepro100_interrupt(s, 0x10);
 }
 
-static void eepro100_mdi_interrupt(EEPRO100State * s)
+static void eepro100_mdi_interrupt(EEPRO100State *s)
 {
 /* MDI completed read or write cycle. */
 eepro100_interrupt(s, 0x08);
 }
 
-static void eepro100_swi_interrupt(EEPRO100State * s)
+static void eepro100_swi_interrupt(EEPRO100State *s)
 {
 /* Software has requested an interrupt. */
 eepro100_interrupt(s, 0x04);
 }
 
 #if 0
-static void eepro100_fcp_interrupt(EEPRO100State * s)
+static void eepro100_fcp_interrupt(EEPRO100State *s)
 {
 /* Flow control pause interrupt (82558 and later). */
 eepro100_interrupt(s, 0x01);
 }
 #endif
 
-static void e100_pci_reset(EEPRO100State * s)
+static void e100_pci_reset(EEPRO100State *s)
 {
 E100PCIDeviceInfo *info = eepro100_get_class(s);
 uint32_t device = s->device;
@@ -598,7 +598,7 @@ static void e100_pci_reset(EEPRO100State * s)
 #endif /* EEPROM_SIZE > 0 */
 }
 
-static void nic_selective_reset(EEPRO100State * s)
+static void nic_selective_reset(EEPRO100State *s)
 {
 size_t i;
 uint16_t *eeprom_contents = eeprom93xx_data(s->eeprom);
@@ -669,7 +669,7 @@ static char *regname(uint32_t addr)
  /
 
 #if 0
-static uint16_t eepro100_read_command(EEPRO100State * s)
+static uint16_t eepro100_read_command(EEPRO100State *s)
 {
 uint16_t val = 0x;
 TRACE(OTHER, logout("val=0x%04x\n", val));
@@ -694,27 +694,27 @@ enum commands {
 CmdTxFlex = 0x0008, /* Use "Flexible mode" for CmdTx command. */
 };
 
-static cu_state_t get_cu_state(EEPRO100State * s)
+static cu_state_t get_cu_state(EEPRO100State *s)
 {
 return ((s->mem[SCBStatus] & BITS(7, 6)) >> 6);
 }
 
-static void set_cu_state(EEPRO100State * s, cu_state_t state)
+static void set_cu_state(EEPRO100State *s, cu_state_t state)
 {
 s->mem[SCBStatus] = (s->mem[SCBStatus] & ~BITS(7, 6)) + (state << 6);
 }
 
-static ru_state_t get_ru_state(EEPRO100State * s)
+static ru_state_t get_ru_state(EEPRO100State *s)
 {
 return ((s->mem[SCBStatus] & BITS(5, 2)) >> 2);
 }
 
-static void set_ru_state(EEPRO100State * s, ru_state_t state)
+static void set_ru_state(EEPRO100State *s, ru_state_t state)
 {
 s->mem[SCBStatus] = (s->mem[SCBStatus] & ~BITS(5, 2)) + (state << 2);
 }
 
-static void dump_statistics(EEPRO100State * s)
+static void dump_statistics(EEPRO100State *s)
 {
 /* Dump statistical data. Most data is never changed by the emulation
  * and always 0, so we first just copy the

[Qemu-devel] [PATCH v2 5/6] pci: Make errp the last parameter of pci_add_capability()

2017-06-02 Thread Mao Zhongyi

Add an Error ** argument to pci_add_capability() so that errors are
passed to its callers through errp. This helps callers handle
errors properly when they are converted to 'realize'.

Cc: m...@redhat.com
Cc: pbonz...@redhat.com
Cc: r...@twiddle.net
Cc: ehabk...@redhat.com
Cc: dmi...@daynix.com
Cc: jasow...@redhat.com
Cc: mar...@redhat.com
Cc: alex.william...@redhat.com
Cc: arm...@redhat.com
Signed-off-by: Mao Zhongyi 
---
 hw/i386/amd_iommu.c   | 24 +---
 hw/net/e1000e.c   |  7 ++-
 hw/net/eepro100.c | 17 -
 hw/pci-bridge/i82801b11.c |  1 +
 hw/pci/pci.c  | 10 --
 hw/pci/pci_bridge.c   |  7 ++-
 hw/pci/pcie.c | 10 --
 hw/pci/shpc.c |  5 -
 hw/pci/slotid_cap.c   |  7 ++-
 hw/vfio/pci.c |  3 ++-
 hw/virtio/virtio-pci.c| 19 ++-
 include/hw/pci/pci.h  |  3 ++-
 12 files changed, 82 insertions(+), 31 deletions(-)

diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 7b6d4ea..d93ffc2 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -1158,13 +1158,23 @@ static void amdvi_realize(DeviceState *dev, Error **err)
 x86_iommu->type = TYPE_AMD;
 qdev_set_parent_bus(DEVICE(&s->pci), &bus->qbus);
 object_property_set_bool(OBJECT(&s->pci), true, "realized", err);
-s->capab_offset = pci_add_capability(&s->pci.dev, AMDVI_CAPAB_ID_SEC, 0,
- AMDVI_CAPAB_SIZE);
-assert(s->capab_offset > 0);
-ret = pci_add_capability(&s->pci.dev, PCI_CAP_ID_MSI, 0, 
AMDVI_CAPAB_REG_SIZE);
-assert(ret > 0);
-ret = pci_add_capability(&s->pci.dev, PCI_CAP_ID_HT, 0, 
AMDVI_CAPAB_REG_SIZE);
-assert(ret > 0);
+ret = pci_add_capability(&s->pci.dev, AMDVI_CAPAB_ID_SEC, 0,
+ AMDVI_CAPAB_SIZE, err);
+if (ret < 0) {
+return;
+}
+s->capab_offset = ret;
+
+ret = pci_add_capability(&s->pci.dev, PCI_CAP_ID_MSI, 0,
+ AMDVI_CAPAB_REG_SIZE, err);
+if (ret < 0) {
+return;
+}
+ret = pci_add_capability(&s->pci.dev, PCI_CAP_ID_HT, 0,
+ AMDVI_CAPAB_REG_SIZE, err);
+if (ret < 0) {
+return;
+}
 
 /* set up MMIO */
 memory_region_init_io(&s->mmio, OBJECT(s), &mmio_mem_ops, s, "amdvi-mmio",
diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
index 8259d67..41430766 100644
--- a/hw/net/e1000e.c
+++ b/hw/net/e1000e.c
@@ -47,6 +47,7 @@
 #include "e1000e_core.h"
 
 #include "trace.h"
+#include "qapi/error.h"
 
 #define TYPE_E1000E "e1000e"
 #define E1000E(obj) OBJECT_CHECK(E1000EState, (obj), TYPE_E1000E)
@@ -372,7 +373,9 @@ e1000e_gen_dsn(uint8_t *mac)
 static int
 e1000e_add_pm_capability(PCIDevice *pdev, uint8_t offset, uint16_t pmc)
 {
-int ret = pci_add_capability(pdev, PCI_CAP_ID_PM, offset, PCI_PM_SIZEOF);
+Error *local_err = NULL;
+int ret = pci_add_capability(pdev, PCI_CAP_ID_PM, offset,
+ PCI_PM_SIZEOF, &local_err);
 
 if (ret > 0) {
 pci_set_word(pdev->config + offset + PCI_PM_PMC,
@@ -386,6 +389,8 @@ e1000e_add_pm_capability(PCIDevice *pdev, uint8_t offset, 
uint16_t pmc)
 
 pci_set_word(pdev->w1cmask + offset + PCI_PM_CTRL,
  PCI_PM_CTRL_PME_STATUS);
+} else {
+error_report_err(local_err);
 }
 
 return ret;
diff --git a/hw/net/eepro100.c b/hw/net/eepro100.c
index 62e989c..f24046a 100644
--- a/hw/net/eepro100.c
+++ b/hw/net/eepro100.c
@@ -494,7 +494,7 @@ static void eepro100_fcp_interrupt(EEPRO100State *s)
 }
 #endif
 
-static void e100_pci_reset(EEPRO100State *s)
+static void e100_pci_reset(EEPRO100State *s, Error **errp)
 {
 E100PCIDeviceInfo *info = eepro100_get_class(s);
 uint32_t device = s->device;
@@ -570,9 +570,13 @@ static void e100_pci_reset(EEPRO100State *s)
 /* Power Management Capabilities */
 int cfg_offset = 0xdc;
 int r = pci_add_capability(&s->dev, PCI_CAP_ID_PM,
-   cfg_offset, PCI_PM_SIZEOF);
-assert(r > 0);
-pci_set_word(pci_conf + cfg_offset + PCI_PM_PMC, 0x7e21);
+   cfg_offset, PCI_PM_SIZEOF,
+   errp);
+if (r > 0) {
+pci_set_word(pci_conf + cfg_offset + PCI_PM_PMC, 0x7e21);
+} else {
+return;
+}
 #if 0 /* TODO: replace dummy code for power management emulation. */
 /* TODO: Power Management Control / Status. */
 pci_set_word(pci_conf + cfg_offset + PCI_PM_CTRL, 0x);
@@ -1863,7 +1867,10 @@ static void e100_nic_realize(PCIDevice *pci_dev, Error 
**errp)
 
 s->device = info->device;
 
-e100_pci_reset(s);
+e100_pci_reset(s, errp);
+if (errp && *errp) {
+return;
+}
 
 /* Add 64 * 2 EEPROM. i82557 and i82558 support a 64 word EEPROM,
  * i82559 and later support 64 or 256 word EEPR
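
In the hunk above, e100_pci_reset() returns void, so the caller can only detect failure by looking at the error object. A commonly used alternative for that situation is to collect the error in a local variable and hand it on afterwards, which also works when the caller passes NULL for errp. The sketch below illustrates that pattern under stated assumptions: SketchError, sketch_error_set(), sketch_error_propagate(), sketch_reset() and sketch_realize() are hypothetical names and do not reproduce QEMU's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's Error object. */
typedef struct SketchError {
    const char *msg;
} SketchError;

/* Hypothetical helper: store an error if the caller wants one. */
static void sketch_error_set(SketchError **errp, SketchError *err)
{
    if (errp) {
        assert(*errp == NULL);    /* never overwrite a pending error */
        *errp = err;
    }
}

/* Hypothetical helper: hand a local error to the caller, or drop it. */
static void sketch_error_propagate(SketchError **dst, SketchError *local)
{
    if (local && dst) {
        assert(*dst == NULL);
        *dst = local;
    }
}

/* A void helper that can fail; failure is visible only through errp. */
static void sketch_reset(int cfg_offset, SketchError **errp)
{
    static SketchError bad = { "cannot add PM capability" };

    if (cfg_offset < 0x40) {
        sketch_error_set(errp, &bad);
    }
}

/* Returns 1 if the device was realized, 0 otherwise. */
static int sketch_realize(int cfg_offset, SketchError **errp)
{
    SketchError *local_err = NULL;

    sketch_reset(cfg_offset, &local_err);
    if (local_err) {
        /* Detected via the local pointer, so errp may safely be NULL. */
        sketch_error_propagate(errp, local_err);
        return 0;
    }
    return 1;
}
```

The design point is that the decision to stop is taken on local_err, never on *errp, so the function behaves the same whether or not the caller is interested in the error details.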

Re: [Qemu-devel] [PATCH v2 11/45] tests: remove /qnum/destroy test

2017-06-02 Thread Markus Armbruster

Marc-André Lureau  writes:

> The test isn't really useful.
>
> Signed-off-by: Marc-André Lureau 

Same in check-qdict.c check-qlist.c check-qstring.c.  Please drop them, too.

Re: [Qemu-devel] [PATCH v2 12/45] qnum: add uint type

2017-06-02 Thread Markus Armbruster

Marc-André Lureau  writes:

> In order to store integer values superior to INT64_MAX, add a u64

"superior" sounds odd.  What about "above"?  Or perhaps "between
INT64_MAX and UINT64_MAX".

s/a u64/a uint64_t/

> internal representation.
>
> Signed-off-by: Marc-André Lureau 

With the commit message tidied up:
Reviewed-by: Markus Armbruster

Re: [Qemu-devel] [Qemu-block] [PATCH 00/29] qed: Convert to coroutines

2017-06-02 Thread Paolo Bonzini

On 01/06/2017 19:08, Kevin Wolf wrote:
> Am 01.06.2017 um 18:40 hat Paolo Bonzini geschrieben:
>> On 01/06/2017 18:28, Kevin Wolf wrote:
>>>> - qed_acquire/qed_release can be removed as you inline stuff, but this
>>>> should not cause bugs so you can either do it as a final patch or let
>>>> me remove it later.
>>> To be honest, I don't completely understand what they are protecting in
>>> the first place. The places where they are look quite strange to me. So
>>> I tried to simply leave them alone.
>>>
>>> What is the reason that we can remove them when we inline stuff?
>>> Shouldn't the inlining be semantically equivalent?
>>
>> You're right, they can be removed when going from callback to direct
>> call.  Callbacks are invoked without the AioContext acquire (aio_co_wake
>> does it for the callbacks).
> 
> So if we take qed_read_table_cb() for example:
> 
> qed_acquire(s);
> for (i = 0; i < noffsets; i++) {
> table->offsets[i] = le64_to_cpu(table->offsets[i]);
> }
> qed_release(s);
> 
> First of all, I don't see what it protects. If we wanted to avoid that
> someone else sees table->offsets with wrong endianness, we would be
> taking the lock much too late. And if nobody else knows about the table
> yet, what is there to be locked?

That is the product of a mechanical conversion where all callbacks grew
a qed_acquire/qed_release pair (commit b9e413dd37, "block: explicitly
acquire aiocontext in aio callbacks that need it", 2017-02-21).

In this case:

qed_acquire(s);
bdrv_aio_flush(write_table_cb->s->bs, qed_write_table_cb,
   write_table_cb);
qed_release(s);

the AioContext protects write_table_cb->s->bs.

> But anyway, given your explanation that acquiring the AioContext lock is
> getting replaced by coroutine magic, the qed_acquire/release() pair
> actually can't be removed in patch 2 when the callback is converted to a
> direct call, but only when the whole call path between .bdrv_aio_readv/
> writev and this specific callback is converted, so that we never drop
> out of coroutine context before reaching this code. Correct?

Yes.

> This happens only very late in the series, so it probably also means
> that patch 5 is indeed wrong because it removes the lock too early?

bdrv_qed_co_get_block_status is entirely in coroutine context, so I
think that one is fine.

Paolo

Re: [Qemu-devel] [PULL v2 00/34] Misc patches for 2016-06-01

2017-06-02 Thread Paolo Bonzini

On 01/06/2017 19:56, Peter Maydell wrote:
> On 1 June 2017 at 18:53, Peter Maydell  wrote:
>> Test failure on OSX:
>>
>> TEST: tests/device-introspect-test... (pid=66373)
>>   /aarch64/device/introspect/list: OK
>>   /aarch64/device/introspect/none: OK
>>   /aarch64/device/introspect/abstract: OK
>>   /aarch64/device/introspect/concrete: **
>> ERROR:/root/qemu/qom/object.c:364:object_initialize_with_type:
>> assertion failed: (type != NULL)
>> Broken pipe
> 
> Got those the wrong way round -- this is the FreeBSD failure
> and the other lot are OSX. Pretty sure it's the same error,
> though -- it's just that for some reason my OSX setup doesn't
> actually cause make to exit with an error when a test fails,
> so it goes on to hit what's probably the same bug in all the
> other check-qtest-$ARCH targets rather than bailing out.

Thanks, any chance you can bisect these?  I'll install a FreeBSD VM next
week.

Paolo

Re: [Qemu-devel] [PATCH v3 30/30] target/s390x: update maximum TCG model to z800

2017-06-02 Thread Thomas Huth

On 01.06.2017 21:17, Aurelien Jarno wrote:
> On 2017-06-01 11:04, David Hildenbrand wrote:
>> On 01.06.2017 10:38, David Hildenbrand wrote:
>>> On 01.06.2017 00:01, Aurelien Jarno wrote:
>>>> At the same time fix the TCG version of get_max_cpu_model to return the
>>>> maximum model like on KVM. Remove the ETF2 and long-displacement
>>>
>>> I don't understand the part
>>> "fix the TCG version of get_max_cpu_model to return the maximum model
>>> like on KVM".
>>>
>>> Can you elaborate?
>>>
>>>> facilities from the additional features as it is included in the z800.
>>>>
>>>> Signed-off-by: Aurelien Jarno
>>>> ---
>>>>  target/s390x/cpu_models.c | 13 ++---
>>>>  1 file changed, 6 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c
>>>> index fc3cb25cc3..c13bbd852c 100644
>>>> --- a/target/s390x/cpu_models.c
>>>> +++ b/target/s390x/cpu_models.c
>>>> @@ -668,8 +668,6 @@ static void add_qemu_cpu_model_features(S390FeatBitmap
>>>> fbm)
>>>>  static const int feats[] = {
>>>>  S390_FEAT_STFLE,
>>>>  S390_FEAT_EXTENDED_IMMEDIATE,
>>>> -S390_FEAT_EXTENDED_TRANSLATION_2,
>>>> -S390_FEAT_LONG_DISPLACEMENT,
>>>>  S390_FEAT_LONG_DISPLACEMENT_FAST,
>>>>  S390_FEAT_ETF2_ENH,
>>>>  S390_FEAT_STORE_CLOCK_FAST,
>>>> @@ -696,9 +694,9 @@ static S390CPUModel *get_max_cpu_model(Error **errp)
>>>>  if (kvm_enabled()) {
>>>>  kvm_s390_get_host_cpu_model(&max_model, errp);
>>>>  } else {
>>>> -/* TCG emulates a z900 (with some optional additional features) */
>>>> -max_model.def = &s390_cpu_defs[0];
>>>> -bitmap_copy(max_model.features, max_model.def->default_feat,
>>>> +/* TCG emulates a z800 (with some optional additional features) */
>>>> +max_model.def = s390_find_cpu_def(0x2066, 7, 3, NULL);
>>>> +bitmap_copy(max_model.features, max_model.def->full_feat,
>>>>  S390_FEAT_MAX);
>>
>> This is most likely wrong: you're indicating features here that are not
>> available on tcg. esp. S390_FEAT_SIE_F2 and friends.
>>
>> I think should only copy the base features and add whatever else is
>> available via add_qemu_cpu_model_features() as already done.
> 
> The patch series added all the z800 features exposed via STFL/STFLE.
> Indeed the SIE features are missing, but anyway QEMU doesn't emulate SIE
> at all so the lack of these features are not exposed to the guest. In that
> regard QEMU already wrongly claim to emulate a z900.
> 
> 
>>>>  add_qemu_cpu_model_features(max_model.features);
>>>>  }
>>>> @@ -956,8 +954,9 @@ static void s390_qemu_cpu_model_initfn(Object *obj)
>>>>  S390CPU *cpu = S390_CPU(obj);
>>>>
>>>>  cpu->model = g_malloc0(sizeof(*cpu->model));
>>>> -/* TCG emulates a z900 (with some optional additional features) */
>>>> -memcpy(&s390_qemu_cpu_defs, &s390_cpu_defs[0],
>>>> sizeof(s390_qemu_cpu_defs));
>>>> +/* TCG emulates a z800 (with some optional additional features) */
>>>> +memcpy(&s390_qemu_cpu_defs, s390_find_cpu_def(0x2066, 7, 3, NULL),
>>>> +   sizeof(s390_qemu_cpu_defs));
>>>
>>> No changing the qemu model without compatibility handling.
>>>
>> Please have a look at the following mail for a possible solution:
>>
>> https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg06030.html
>>
>> This could be moved to a separate patch. So this patch really should
>> just care about the maximum model, not the qemu model.
> 
> From what I understand from this thread, the patch from Thomas Huth was
> finally considered acceptable. I am adding him in Cc: so that he can
> comment.

In the 2nd version of my patch, I only changed the full_feat of the
model definitions, but not the base_feat and default_feat fields, so
that the CPU stays the same when you run QEMU without "-cpu" parameter
(or simply with "-cpu qemu").

Consider that the following scenario should work: Start a QEMU v2.10
with "-M s390-ccw-virtio-2.9" and a QEMU v2.9 with "-M
s390-ccw-virtio-2.9 -incoming ...". Then it should be possible to
migrate from the v2.10 to the v2.9 instance without problems. This won't
work anymore if you changed the default feature bits unconditionally.

 Thomas
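
The compatibility argument above can be made concrete with a deliberately simplified model. In the sketch below the three feature sets of a CPU definition are plain bit masks, and "migration works" is reduced to "the destination can provide every feature the source uses". All names (SketchCpuDef, the SKETCH_FEAT_* bits, sketch_model_valid(), sketch_can_migrate()) are hypothetical and the check is an assumption made for illustration; it is not how QEMU validates CPU models.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature bits, loosely named after the ones discussed. */
enum {
    SKETCH_FEAT_STFLE     = 1 << 0,
    SKETCH_FEAT_LONG_DISP = 1 << 1,
    SKETCH_FEAT_ETF2      = 1 << 2,
};

typedef struct SketchCpuDef {
    uint32_t base_feat;     /* always present in this model */
    uint32_t default_feat;  /* enabled when the user asks for nothing */
    uint32_t full_feat;     /* everything that may be enabled */
} SketchCpuDef;

/* A feature set is acceptable if it includes base and stays within full. */
static int sketch_model_valid(const SketchCpuDef *def, uint32_t requested)
{
    return (requested & def->base_feat) == def->base_feat &&
           (requested & ~def->full_feat) == 0;
}

/* Simplified rule: the destination must accept the source's features. */
static int sketch_can_migrate(uint32_t src_feat, const SketchCpuDef *dst)
{
    return sketch_model_valid(dst, src_feat);
}
```

With an "old" definition whose three sets all equal SKETCH_FEAT_STFLE and a "new" one that only widens full_feat, a guest started with the new default still fits the old definition. If the new default also gained SKETCH_FEAT_LONG_DISP, the same check would reject it, which corresponds to the broken migration case described above.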

[Qemu-devel] [PULL v2 01/22] docker: Run tests with current user

2017-06-02 Thread Fam Zheng

We've used --add-current-user to create a user in the image. Use it to
run the tests, because root has too much privilege and can surprise
test cases.

Signed-off-by: Fam Zheng 
Message-Id: <20170505032340.26467-2-f...@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Alex Bennée 
Reviewed-by: Paolo Bonzini 
Signed-off-by: Fam Zheng 
---
 tests/docker/Makefile.include | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include
index 03eda37..0ed8c3d 100644
--- a/tests/docker/Makefile.include
+++ b/tests/docker/Makefile.include
@@ -126,7 +126,7 @@ docker-run: docker-qemu-src
"  COPYING $(EXECUTABLE) to $(IMAGE)"))
$(call quiet-command,   \
$(SRC_PATH)/tests/docker/docker.py run  \
-   -t  \
+   $(if $(NOUSER),,-u $(shell id -u)) -t   \
$(if $V,,--rm)  \
$(if $(DEBUG),-i,--net=none)\
-e TARGET_LIST=$(TARGET_LIST)   \
-- 
2.9.4

[Qemu-devel] [PULL v2 00/22] Docker and block patches

2017-06-02 Thread Fam Zheng

The following changes since commit 43771d5d92312504305c19abe29ec5bfabd55f01:

  Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2017-05-31' into staging (2017-06-01 16:39:16 +0100)

are available in the git repository at:

  git://github.com/famz/qemu.git tags/docker-and-block-pull-request

for you to fetch changes up to 4ab2cbc128f1938355a12970c9b6e419a63fcab6:

  block: make accounting thread-safe (2017-06-02 15:59:32 +0800)



v2: Fix building on OSX and BSD.



Fam Zheng (4):
  docker: Run tests with current user
  docker: Add bzip2 and hostname to fedora image
  docker: Add libaio to fedora image
  docker: Add flex and bison to centos6 image

Paolo Bonzini (18):
  block: access copy_on_read with atomic ops
  block: access quiesce_counter with atomic ops
  block: access io_limits_disabled with atomic ops
  block: access serialising_in_flight with atomic ops
  block: access wakeup with atomic ops
  block: access io_plugged with atomic ops
  throttle-groups: only start one coroutine from drained_begin
  throttle-groups: do not use qemu_co_enter_next
  throttle-groups: protect throttled requests with a CoMutex
  util: add stats64 module
  block: use Stat64 for wr_highest_offset
  block: access write_gen with atomics
  block: protect tracked_requests and flush_queue with reqs_lock
  block: introduce dirty_bitmap_mutex
  migration/block: reset dirty bitmap before reading
  block: protect modification of dirty bitmaps with a mutex
  block: introduce block_account_one_io
  block: make accounting thread-safe

 block.c |   9 +-
 block/accounting.c  |  65 ++-
 block/block-backend.c   |   5 +-
 block/dirty-bitmap.c| 114 +--
 block/io.c  |  51 +
 block/mirror.c  |  14 ++-
 block/nfs.c |   4 +-
 block/qapi.c|   2 +-
 block/sheepdog.c|   3 +-
 block/throttle-groups.c |  91 +++
 blockdev.c  |  46 ++--
 include/block/accounting.h  |   8 +-
 include/block/block.h   |   5 +-
 include/block/block_int.h   |  65 +++
 include/block/dirty-bitmap.h|  25 +++--
 include/qemu/stats64.h  | 193 
 include/sysemu/block-backend.h  |  10 +-
 migration/block.c   |  17 ++-
 tests/docker/Makefile.include   |   2 +-
 tests/docker/dockerfiles/centos6.docker |   2 +-
 tests/docker/dockerfiles/fedora.docker  |   4 +-
 util/Makefile.objs  |   1 +
 util/stats64.c  | 137 +++
 23 files changed, 688 insertions(+), 185 deletions(-)
 create mode 100644 include/qemu/stats64.h
 create mode 100644 util/stats64.c

-- 
2.9.4

[Qemu-devel] [PULL v2 07/22] block: access io_limits_disabled with atomic ops

2017-06-02 Thread Fam Zheng

From: Paolo Bonzini 

Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Alberto Garcia 
Signed-off-by: Paolo Bonzini 
Message-Id: <20170525163225.29954-4-pbonz...@redhat.com>
Signed-off-by: Fam Zheng 
---
 block/block-backend.c  | 4 ++--
 block/throttle-groups.c| 2 +-
 include/sysemu/block-backend.h | 3 ++-
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index f3a6008..e50ec03 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1953,7 +1953,7 @@ static void blk_root_drained_begin(BdrvChild *child)
 /* Note that blk->root may not be accessible here yet if we are just
  * attaching to a BlockDriverState that is drained. Use child instead. */
 
-if (blk->public.io_limits_disabled++ == 0) {
+if (atomic_fetch_inc(&blk->public.io_limits_disabled) == 0) {
 throttle_group_restart_blk(blk);
 }
 }
@@ -1964,7 +1964,7 @@ static void blk_root_drained_end(BdrvChild *child)
 assert(blk->quiesce_counter);
 
 assert(blk->public.io_limits_disabled);
---blk->public.io_limits_disabled;
+atomic_dec(&blk->public.io_limits_disabled);
 
 if (--blk->quiesce_counter == 0) {
 if (blk->dev_ops && blk->dev_ops->drained_end) {
diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index b73e7a8..69bfbd4 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -240,7 +240,7 @@ static bool throttle_group_schedule_timer(BlockBackend *blk, bool is_write)
 ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
 bool must_wait;
 
-if (blkp->io_limits_disabled) {
+if (atomic_read(&blkp->io_limits_disabled)) {
 return false;
 }
 
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 840ad61..24b63d6 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -80,7 +80,8 @@ typedef struct BlockBackendPublic {
 CoQueue  throttled_reqs[2];
 
 /* Nonzero if the I/O limits are currently being ignored; generally
- * it is zero.  */
+ * it is zero.  Accessed with atomic operations.
+ */
 unsigned int io_limits_disabled;
 
 /* The following fields are protected by the ThrottleGroup lock.
-- 
2.9.4
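
The conversions in these "access ... with atomic ops" patches follow one
pattern: a post-increment test "x++ == 0" becomes a fetch-and-increment
compared against 0, and a pre-decrement test "--x == 0" becomes a
fetch-and-decrement compared against 1, because the fetch variants return
the value before the update. The sketch below shows that equivalence with
C11 <stdatomic.h> as a stand-in for QEMU's atomic_fetch_inc() and
atomic_fetch_dec() macros. NestCounter and the nest_* helpers are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical nesting counter, similar in spirit to io_limits_disabled. */
typedef struct {
    atomic_uint disabled;
} NestCounter;

/* True only for the outermost "begin" (counter goes 0 -> 1).
 * atomic_fetch_add() returns the old value, which is exactly what the
 * non-atomic expression "counter++ == 0" tested. */
static bool nest_begin(NestCounter *c)
{
    return atomic_fetch_add(&c->disabled, 1) == 0;
}

/* True only for the outermost "end" (counter goes 1 -> 0).
 * The non-atomic test "--counter == 0" becomes "old value == 1". */
static bool nest_end(NestCounter *c)
{
    unsigned old = atomic_fetch_sub(&c->disabled, 1);
    assert(old >= 1);   /* an unbalanced end would wrap the counter */
    return old == 1;
}

/* Plain atomic read, the counterpart of atomic_read() in the patch. */
static bool nest_active(NestCounter *c)
{
    return atomic_load(&c->disabled) != 0;
}
```

Checking the value returned by the fetch operation, instead of reading the
counter again afterwards, is what keeps the "first" and "last" decisions
race-free when several threads nest begin/end calls.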

[Qemu-devel] [PULL v2 08/22] block: access serialising_in_flight with atomic ops

2017-06-02 Thread Fam Zheng

From: Paolo Bonzini 

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Paolo Bonzini 
Message-Id: <20170525163225.29954-5-pbonz...@redhat.com>
Signed-off-by: Fam Zheng 
---
 block/io.c|  6 +++---
 include/block/block_int.h | 10 ++
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/block/io.c b/block/io.c
index 70643df..d76202b 100644
--- a/block/io.c
+++ b/block/io.c
@@ -375,7 +375,7 @@ void bdrv_drain_all(void)
 static void tracked_request_end(BdrvTrackedRequest *req)
 {
 if (req->serialising) {
-req->bs->serialising_in_flight--;
+atomic_dec(&req->bs->serialising_in_flight);
 }
 
 QLIST_REMOVE(req, list);
@@ -414,7 +414,7 @@ static void mark_request_serialising(BdrvTrackedRequest *req, uint64_t align)
- overlap_offset;
 
 if (!req->serialising) {
-req->bs->serialising_in_flight++;
+atomic_inc(&req->bs->serialising_in_flight);
 req->serialising = true;
 }
 
@@ -519,7 +519,7 @@ static bool coroutine_fn wait_serialising_requests(BdrvTrackedRequest *self)
 bool retry;
 bool waited = false;
 
-if (!bs->serialising_in_flight) {
+if (!atomic_read(&bs->serialising_in_flight)) {
 return false;
 }
 
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 9691db3..e5f19eb 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -604,10 +604,6 @@ struct BlockDriverState {
 /* Callback before write request is processed */
 NotifierWithReturnList before_write_notifiers;
 
-/* number of in-flight requests; overall and serialising */
-unsigned int in_flight;
-unsigned int serialising_in_flight;
-
 bool wakeup;
 
 /* Offset after the highest byte written to */
@@ -634,6 +630,12 @@ struct BlockDriverState {
  */
 int copy_on_read;
 
+/* number of in-flight requests; overall and serialising.
+ * Accessed with atomic ops.
+ */
+unsigned int in_flight;
+unsigned int serialising_in_flight;
+
 /* do we need to tell the quest if we have a volatile write cache? */
 int enable_write_cache;
 
-- 
2.9.4

[Qemu-devel] [PULL v2 03/22] docker: Add libaio to fedora image

2017-06-02 Thread Fam Zheng

Signed-off-by: Fam Zheng 
Message-Id: <20170505032340.26467-5-f...@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Alex Bennée 
Reviewed-by: Paolo Bonzini 
Signed-off-by: Fam Zheng 
---
 tests/docker/dockerfiles/fedora.docker | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/docker/dockerfiles/fedora.docker b/tests/docker/dockerfiles/fedora.docker
index 39f6b58..4eaa8ed 100644
--- a/tests/docker/dockerfiles/fedora.docker
+++ b/tests/docker/dockerfiles/fedora.docker
@@ -2,7 +2,7 @@ FROM fedora:latest
 ENV PACKAGES \
 ccache git tar PyYAML sparse flex bison python2 bzip2 hostname \
 glib2-devel pixman-devel zlib-devel SDL-devel libfdt-devel \
-gcc gcc-c++ clang make perl which bc findutils \
+gcc gcc-c++ clang make perl which bc findutils libaio-devel \
 mingw32-pixman mingw32-glib2 mingw32-gmp mingw32-SDL mingw32-pkg-config \
 mingw32-gtk2 mingw32-gtk3 mingw32-gnutls mingw32-nettle mingw32-libtasn1 \
 mingw32-libjpeg-turbo mingw32-libpng mingw32-curl mingw32-libssh2 \
-- 
2.9.4

[Qemu-devel] [PULL v2 02/22] docker: Add bzip2 and hostname to fedora image

2017-06-02 Thread Fam Zheng

It is used by qemu-iotests.

Signed-off-by: Fam Zheng 
Message-Id: <20170505032340.26467-3-f...@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Alex Bennée 
Reviewed-by: Paolo Bonzini 
Signed-off-by: Fam Zheng 
---
 tests/docker/dockerfiles/fedora.docker | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/docker/dockerfiles/fedora.docker b/tests/docker/dockerfiles/fedora.docker
index c4f80ad..39f6b58 100644
--- a/tests/docker/dockerfiles/fedora.docker
+++ b/tests/docker/dockerfiles/fedora.docker
@@ -1,6 +1,6 @@
 FROM fedora:latest
 ENV PACKAGES \
-ccache git tar PyYAML sparse flex bison python2 \
+ccache git tar PyYAML sparse flex bison python2 bzip2 hostname \
 glib2-devel pixman-devel zlib-devel SDL-devel libfdt-devel \
 gcc gcc-c++ clang make perl which bc findutils \
 mingw32-pixman mingw32-glib2 mingw32-gmp mingw32-SDL mingw32-pkg-config \
-- 
2.9.4

[Qemu-devel] [PULL v2 10/22] block: access io_plugged with atomic ops

2017-06-02 Thread Fam Zheng

From: Paolo Bonzini 

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Paolo Bonzini 
Message-Id: <20170525163225.29954-7-pbonz...@redhat.com>
Signed-off-by: Fam Zheng 
---
 block/io.c| 4 ++--
 include/block/block_int.h | 8 +---
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/block/io.c b/block/io.c
index 4a59829..bb1c9c5 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2645,7 +2645,7 @@ void bdrv_io_plug(BlockDriverState *bs)
 bdrv_io_plug(child->bs);
 }
 
-if (bs->io_plugged++ == 0) {
+if (atomic_fetch_inc(&bs->io_plugged) == 0) {
 BlockDriver *drv = bs->drv;
 if (drv && drv->bdrv_io_plug) {
 drv->bdrv_io_plug(bs);
@@ -2658,7 +2658,7 @@ void bdrv_io_unplug(BlockDriverState *bs)
 BdrvChild *child;
 
 assert(bs->io_plugged);
-if (--bs->io_plugged == 0) {
+if (atomic_fetch_dec(&bs->io_plugged) == 1) {
 BlockDriver *drv = bs->drv;
 if (drv && drv->bdrv_io_unplug) {
 drv->bdrv_io_unplug(bs);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index dd09e00..d11417e 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -611,9 +611,6 @@ struct BlockDriverState {
 uint64_t write_threshold_offset;
 NotifierWithReturn write_threshold_notifier;
 
-/* counter for nested bdrv_io_plug */
-unsigned io_plugged;
-
 QLIST_HEAD(, BdrvTrackedRequest) tracked_requests;
 CoQueue flush_queue;  /* Serializing flush queue */
 bool active_flush_req;/* Flush request in flight? */
@@ -639,6 +636,11 @@ struct BlockDriverState {
  */
 bool wakeup;
 
+/* counter for nested bdrv_io_plug.
+ * Accessed with atomic ops.
+*/
+unsigned io_plugged;
+
 /* do we need to tell the quest if we have a volatile write cache? */
 int enable_write_cache;
 
-- 
2.9.4

[Qemu-devel] [PULL v2 11/22] throttle-groups: only start one coroutine from drained_begin

2017-06-02 Thread Fam Zheng

From: Paolo Bonzini 

Starting all waiting coroutines from bdrv_drain_all is unnecessary;
throttle_group_co_io_limits_intercept calls schedule_next_request as
soon as the coroutine restarts, which in turn will restart the next
request if possible.

If we only start the first request and let the coroutines dance from
there the code is simpler and there is more reuse between
throttle_group_config, throttle_group_restart_blk and timer_cb.  The
next patch will benefit from this.

We also stop accessing the blkp->throttled_reqs CoQueues from
throttle_group_restart_blk when there is no attached throttling
group.  This worked but is not pretty.

The only thing that can interrupt the dance is the QEMU_CLOCK_VIRTUAL
timer when switching from one block device to the next, because the
timer is set to "now + 1" but QEMU_CLOCK_VIRTUAL might not be running.
Set that timer to point in the present ("now") rather than the future
and things work.

Reviewed-by: Alberto Garcia 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Paolo Bonzini 
Message-Id: <20170525163225.29954-8-pbonz...@redhat.com>
Signed-off-by: Fam Zheng 
---
 block/throttle-groups.c | 45 +
 1 file changed, 25 insertions(+), 20 deletions(-)

diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index 69bfbd4..85169ec 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -292,7 +292,7 @@ static void schedule_next_request(BlockBackend *blk, bool is_write)
 } else {
 ThrottleTimers *tt = &blk_get_public(token)->throttle_timers;
 int64_t now = qemu_clock_get_ns(tt->clock_type);
-timer_mod(tt->timers[is_write], now + 1);
+timer_mod(tt->timers[is_write], now);
 tg->any_timer_armed[is_write] = true;
 }
 tg->tokens[is_write] = token;
@@ -340,15 +340,32 @@ void coroutine_fn throttle_group_co_io_limits_intercept(BlockBackend *blk,
 qemu_mutex_unlock(&tg->lock);
 }
 
+static void throttle_group_restart_queue(BlockBackend *blk, bool is_write)
+{
+BlockBackendPublic *blkp = blk_get_public(blk);
+ThrottleGroup *tg = container_of(blkp->throttle_state, ThrottleGroup, ts);
+bool empty_queue;
+
+aio_context_acquire(blk_get_aio_context(blk));
+empty_queue = !qemu_co_enter_next(&blkp->throttled_reqs[is_write]);
+aio_context_release(blk_get_aio_context(blk));
+
+/* If the request queue was empty then we have to take care of
+ * scheduling the next one */
+if (empty_queue) {
+qemu_mutex_lock(&tg->lock);
+schedule_next_request(blk, is_write);
+qemu_mutex_unlock(&tg->lock);
+}
+}
+
 void throttle_group_restart_blk(BlockBackend *blk)
 {
 BlockBackendPublic *blkp = blk_get_public(blk);
-int i;
 
-for (i = 0; i < 2; i++) {
-while (qemu_co_enter_next(&blkp->throttled_reqs[i])) {
-;
-}
+if (blkp->throttle_state) {
+throttle_group_restart_queue(blk, 0);
+throttle_group_restart_queue(blk, 1);
 }
 }
 
@@ -376,8 +393,7 @@ void throttle_group_config(BlockBackend *blk, ThrottleConfig *cfg)
 throttle_config(ts, tt, cfg);
 qemu_mutex_unlock(&tg->lock);
 
-qemu_co_enter_next(&blkp->throttled_reqs[0]);
-qemu_co_enter_next(&blkp->throttled_reqs[1]);
+throttle_group_restart_blk(blk);
 }
 
 /* Get the throttle configuration from a particular group. Similar to
@@ -408,7 +424,6 @@ static void timer_cb(BlockBackend *blk, bool is_write)
 BlockBackendPublic *blkp = blk_get_public(blk);
 ThrottleState *ts = blkp->throttle_state;
 ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
-bool empty_queue;
 
 /* The timer has just been fired, so we can update the flag */
 qemu_mutex_lock(&tg->lock);
@@ -416,17 +431,7 @@ static void timer_cb(BlockBackend *blk, bool is_write)
 qemu_mutex_unlock(&tg->lock);
 
 /* Run the request that was waiting for this timer */
-aio_context_acquire(blk_get_aio_context(blk));
-empty_queue = !qemu_co_enter_next(&blkp->throttled_reqs[is_write]);
-aio_context_release(blk_get_aio_context(blk));
-
-/* If the request queue was empty then we have to take care of
- * scheduling the next one */
-if (empty_queue) {
-qemu_mutex_lock(&tg->lock);
-schedule_next_request(blk, is_write);
-qemu_mutex_unlock(&tg->lock);
-}
+throttle_group_restart_queue(blk, is_write);
 }
 
 static void read_timer_cb(void *opaque)
-- 
2.9.4
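
The idea described in the commit message above, "start the first request
and let the coroutines dance from there", can be pictured with a toy model.
The sketch below is not QEMU code. It uses a plain FIFO instead of a
CoQueue, ordinary function calls instead of coroutines, and it ignores
throttling and timers entirely. ToyQueue and all toy_* names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REQS 8

/* Toy FIFO of pending request ids plus a log of what ran. */
typedef struct {
    int pending[MAX_REQS];
    size_t head, tail;
    int completed[MAX_REQS];
    size_t ncompleted;
    int outside_wakeups;    /* wakeups done by the restart entry point */
} ToyQueue;

static bool toy_enqueue(ToyQueue *q, int id)
{
    if (q->tail == MAX_REQS) {
        return false;
    }
    q->pending[q->tail++] = id;
    return true;
}

static bool toy_wake_next(ToyQueue *q);

/* A restarted request does its work, then hands over to the next one. */
static void toy_request_run(ToyQueue *q, int id)
{
    q->completed[q->ncompleted++] = id;
    toy_wake_next(q);
}

/* Wake the first pending request; return false if the queue was empty. */
static bool toy_wake_next(ToyQueue *q)
{
    if (q->head == q->tail) {
        return false;
    }
    toy_request_run(q, q->pending[q->head++]);
    return true;
}

/* Entry point: wakes exactly one request, the rest follow by chaining. */
static bool toy_restart(ToyQueue *q)
{
    bool woke = toy_wake_next(q);
    if (woke) {
        q->outside_wakeups++;
    }
    return woke;
}
```

With three queued requests a single toy_restart() call drains the queue,
and the entry point itself performs only one wakeup. A false return marks
the empty-queue case, where the real code falls back to scheduling the
next request itself.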

[Qemu-devel] [PULL v2 04/22] docker: Add flex and bison to centos6 image

2017-06-02 Thread Fam Zheng

Currently there are warnings about flex and bison being missing when
building in the centos6 image:

make[1]: flex: Command not found
 BISON dtc-parser.tab.c
make[1]: bison: Command not found

Add them.

Reported-by: Thomas Huth 
Signed-off-by: Fam Zheng 
Message-Id: <20170524005206.31916-1-f...@redhat.com>
Reviewed-by: Alex Bennée 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Fam Zheng 
---
 tests/docker/dockerfiles/centos6.docker | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/docker/dockerfiles/centos6.docker b/tests/docker/dockerfiles/centos6.docker
index 34e0d3b..17a4d24 100644
--- a/tests/docker/dockerfiles/centos6.docker
+++ b/tests/docker/dockerfiles/centos6.docker
@@ -1,7 +1,7 @@
 FROM centos:6
 RUN yum install -y epel-release
 ENV PACKAGES libfdt-devel ccache \
-tar git make gcc g++ \
+tar git make gcc g++ flex bison \
 zlib-devel glib2-devel SDL-devel pixman-devel \
 epel-release
 RUN yum install -y $PACKAGES
-- 
2.9.4

[Qemu-devel] [PULL v2 06/22] block: access quiesce_counter with atomic ops

2017-06-02 Thread Fam Zheng

From: Paolo Bonzini 

Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Alberto Garcia 
Signed-off-by: Paolo Bonzini 
Message-Id: <20170525163225.29954-3-pbonz...@redhat.com>
Signed-off-by: Fam Zheng 
---
 block/io.c| 4 ++--
 include/block/block_int.h | 1 +
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/block/io.c b/block/io.c
index 98c690f..70643df 100644
--- a/block/io.c
+++ b/block/io.c
@@ -241,7 +241,7 @@ void bdrv_drained_begin(BlockDriverState *bs)
 return;
 }
 
-if (!bs->quiesce_counter++) {
+if (atomic_fetch_inc(&bs->quiesce_counter) == 0) {
 aio_disable_external(bdrv_get_aio_context(bs));
 bdrv_parent_drained_begin(bs);
 }
@@ -252,7 +252,7 @@ void bdrv_drained_begin(BlockDriverState *bs)
 void bdrv_drained_end(BlockDriverState *bs)
 {
 assert(bs->quiesce_counter > 0);
-if (--bs->quiesce_counter > 0) {
+if (atomic_fetch_dec(&bs->quiesce_counter) > 1) {
 return;
 }
 
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 5f99cdb..9691db3 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -637,6 +637,7 @@ struct BlockDriverState {
 /* do we need to tell the quest if we have a volatile write cache? */
 int enable_write_cache;
 
+/* Accessed with atomic ops.  */
 int quiesce_counter;
 };
 
-- 
2.9.4

[Qemu-devel] [PULL v2 12/22] throttle-groups: do not use qemu_co_enter_next

2017-06-02 Thread Fam Zheng

From: Paolo Bonzini 

Prepare for removing this function; always restart throttled requests
from coroutine context.  This will matter when restarting throttled
requests will have to acquire a CoMutex.

Reviewed-by: Alberto Garcia 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Paolo Bonzini 
Message-Id: <20170525163225.29954-9-pbonz...@redhat.com>
Signed-off-by: Fam Zheng 
---
 block/throttle-groups.c | 42 +-
 1 file changed, 37 insertions(+), 5 deletions(-)

diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index 85169ec..8bf1031 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -260,6 +260,20 @@ static bool throttle_group_schedule_timer(BlockBackend *blk, bool is_write)
 return must_wait;
 }
 
+/* Start the next pending I/O request for a BlockBackend.  Return whether
+ * any request was actually pending.
+ *
+ * @blk:   the current BlockBackend
+ * @is_write:  the type of operation (read/write)
+ */
+static bool coroutine_fn throttle_group_co_restart_queue(BlockBackend *blk,
+ bool is_write)
+{
+BlockBackendPublic *blkp = blk_get_public(blk);
+
+return qemu_co_queue_next(&blkp->throttled_reqs[is_write]);
+}
+
 /* Look for the next pending I/O request and schedule it.
  *
  * This assumes that tg->lock is held.
@@ -287,7 +301,7 @@ static void schedule_next_request(BlockBackend *blk, bool is_write)
 if (!must_wait) {
 /* Give preference to requests from the current blk */
 if (qemu_in_coroutine() &&
-qemu_co_queue_next(&blkp->throttled_reqs[is_write])) {
+throttle_group_co_restart_queue(blk, is_write)) {
 token = blk;
 } else {
 ThrottleTimers *tt = &blk_get_public(token)->throttle_timers;
@@ -340,15 +354,21 @@ void coroutine_fn throttle_group_co_io_limits_intercept(BlockBackend *blk,
 qemu_mutex_unlock(&tg->lock);
 }
 
-static void throttle_group_restart_queue(BlockBackend *blk, bool is_write)
+typedef struct {
+BlockBackend *blk;
+bool is_write;
+} RestartData;
+
+static void coroutine_fn throttle_group_restart_queue_entry(void *opaque)
 {
+RestartData *data = opaque;
+BlockBackend *blk = data->blk;
+bool is_write = data->is_write;
 BlockBackendPublic *blkp = blk_get_public(blk);
 ThrottleGroup *tg = container_of(blkp->throttle_state, ThrottleGroup, ts);
 bool empty_queue;
 
-aio_context_acquire(blk_get_aio_context(blk));
-empty_queue = !qemu_co_enter_next(&blkp->throttled_reqs[is_write]);
-aio_context_release(blk_get_aio_context(blk));
+empty_queue = !throttle_group_co_restart_queue(blk, is_write);
 
 /* If the request queue was empty then we have to take care of
  * scheduling the next one */
@@ -359,6 +379,18 @@ static void throttle_group_restart_queue(BlockBackend *blk, bool is_write)
 }
 }
 
+static void throttle_group_restart_queue(BlockBackend *blk, bool is_write)
+{
+Coroutine *co;
+RestartData rd = {
+.blk = blk,
+.is_write = is_write
+};
+
+co = qemu_coroutine_create(throttle_group_restart_queue_entry, &rd);
+aio_co_enter(blk_get_aio_context(blk), co);
+}
+
 void throttle_group_restart_blk(BlockBackend *blk)
 {
 BlockBackendPublic *blkp = blk_get_public(blk);
-- 
2.9.4

[Qemu-devel] [PULL v2 14/22] util: add stats64 module

2017-06-02 Thread Fam Zheng

From: Paolo Bonzini 

This module provides fast paths for 64-bit atomic operations on machines
that only have 32-bit atomic access.

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Paolo Bonzini 
Message-Id: <20170525163225.29954-11-pbonz...@redhat.com>
[Include qemu/processor.h for cpu_relax(). - Fam]
Signed-off-by: Fam Zheng 
---
 include/qemu/stats64.h | 193 +
 util/Makefile.objs |   1 +
 util/stats64.c | 137 +++
 3 files changed, 331 insertions(+)
 create mode 100644 include/qemu/stats64.h
 create mode 100644 util/stats64.c

diff --git a/include/qemu/stats64.h b/include/qemu/stats64.h
new file mode 100644
index 0000000..4a357b3
--- /dev/null
+++ b/include/qemu/stats64.h
@@ -0,0 +1,193 @@
+/*
+ * Atomic operations on 64-bit quantities.
+ *
+ * Copyright (C) 2017 Red Hat, Inc.
+ *
+ * Author: Paolo Bonzini 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_STATS64_H
+#define QEMU_STATS64_H 1
+
+#include "qemu/atomic.h"
+
+/* This provides atomic operations on 64-bit type, using a reader-writer
+ * spinlock on architectures that do not have 64-bit accesses.  Even on
+ * those architectures, it tries hard not to take the lock.
+ */
+
+typedef struct Stat64 {
+#ifdef CONFIG_ATOMIC64
+uint64_t value;
+#else
+uint32_t low, high;
+uint32_t lock;
+#endif
+} Stat64;
+
+#ifdef CONFIG_ATOMIC64
+static inline void stat64_init(Stat64 *s, uint64_t value)
+{
+/* This is not guaranteed to be atomic! */
+*s = (Stat64) { value };
+}
+
+static inline uint64_t stat64_get(const Stat64 *s)
+{
+return atomic_read__nocheck(&s->value);
+}
+
+static inline void stat64_add(Stat64 *s, uint64_t value)
+{
+atomic_add(&s->value, value);
+}
+
+static inline void stat64_min(Stat64 *s, uint64_t value)
+{
+uint64_t orig = atomic_read__nocheck(&s->value);
+while (orig > value) {
+orig = atomic_cmpxchg__nocheck(&s->value, orig, value);
+}
+}
+
+static inline void stat64_max(Stat64 *s, uint64_t value)
+{
+uint64_t orig = atomic_read__nocheck(&s->value);
+while (orig < value) {
+orig = atomic_cmpxchg__nocheck(&s->value, orig, value);
+}
+}
+#else
+uint64_t stat64_get(const Stat64 *s);
+bool stat64_min_slow(Stat64 *s, uint64_t value);
+bool stat64_max_slow(Stat64 *s, uint64_t value);
+bool stat64_add32_carry(Stat64 *s, uint32_t low, uint32_t high);
+
+static inline void stat64_init(Stat64 *s, uint64_t value)
+{
+/* This is not guaranteed to be atomic! */
+*s = (Stat64) { .low = value, .high = value >> 32, .lock = 0 };
+}
+
+static inline void stat64_add(Stat64 *s, uint64_t value)
+{
+uint32_t low, high;
+high = value >> 32;
+low = (uint32_t) value;
+if (!low) {
+if (high) {
+atomic_add(&s->high, high);
+}
+return;
+}
+
+for (;;) {
+uint32_t orig = s->low;
+uint32_t result = orig + low;
+uint32_t old;
+
+if (result < low || high) {
+/* If the high part is affected, take the lock.  */
+if (stat64_add32_carry(s, low, high)) {
+return;
+}
+continue;
+}
+
+/* No carry, try with a 32-bit cmpxchg.  The result is independent of
+ * the high 32 bits, so it can race just fine with stat64_add32_carry
+ * and even stat64_get!
+ */
+old = atomic_cmpxchg(&s->low, orig, result);
+if (orig == old) {
+return;
+}
+}
+}
+
+static inline void stat64_min(Stat64 *s, uint64_t value)
+{
+uint32_t low, high;
+uint32_t orig_low, orig_high;
+
+high = value >> 32;
+low = (uint32_t) value;
+do {
+orig_high = atomic_read(&s->high);
+if (orig_high < high) {
+return;
+}
+
+if (orig_high == high) {
+/* High 32 bits are equal.  Read low after high, otherwise we
+ * can get a false positive (e.g. 0x1235,0x00000000 changes to
+ * 0x1234,0x80000000 and we read it as 0x1234,0x00000000). Pairs with
+ * the write barrier in stat64_min_slow.
+ */
+smp_rmb();
+orig_low = atomic_read(&s->low);
+if (orig_low <= low) {
+return;
+}
+
+/* See if we were lucky and a writer raced against us.  The
+ * barrier is theoretically unnecessary, but if we remove it
+ * we may miss being lucky.
+ */
+smp_rmb();
+orig_high = atomic_read(&s->high);
+if (orig_high < high) {
+return;
+}
+}
+
+/* If the value changes in any way, we have to take the lock.  */
+} while (!stat64_min_slow(s, value));
+}
+
+static inline void stat64_max(Stat64 *s, uint64_t value)
+{
+uint32_t low, high;
+uint32_t orig_

[Qemu-devel] [PULL v2 05/22] block: access copy_on_read with atomic ops

2017-06-02 Thread Fam Zheng

From: Paolo Bonzini 

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Paolo Bonzini 
Message-Id: <20170525163225.29954-2-pbonz...@redhat.com>
Signed-off-by: Fam Zheng 
---
 block.c   |  6 --
 block/io.c|  8 
 blockdev.c|  2 +-
 include/block/block_int.h | 11 ++-
 4 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/block.c b/block.c
index fa1d06d..af6366b 100644
--- a/block.c
+++ b/block.c
@@ -1300,7 +1300,9 @@ static int bdrv_open_common(BlockDriverState *bs, BlockBackend *file,
 goto fail_opts;
 }
 
-assert(bs->copy_on_read == 0); /* bdrv_new() and bdrv_close() make it so */
+/* bdrv_new() and bdrv_close() make it so */
+assert(atomic_read(&bs->copy_on_read) == 0);
+
 if (bs->open_flags & BDRV_O_COPY_ON_READ) {
 if (!bs->read_only) {
 bdrv_enable_copy_on_read(bs);
@@ -3063,7 +3065,7 @@ static void bdrv_close(BlockDriverState *bs)
 
 g_free(bs->opaque);
 bs->opaque = NULL;
-bs->copy_on_read = 0;
+atomic_set(&bs->copy_on_read, 0);
 bs->backing_file[0] = '\0';
 bs->backing_format[0] = '\0';
 bs->total_sectors = 0;
diff --git a/block/io.c b/block/io.c
index ed31810..98c690f 100644
--- a/block/io.c
+++ b/block/io.c
@@ -130,13 +130,13 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
  */
 void bdrv_enable_copy_on_read(BlockDriverState *bs)
 {
-bs->copy_on_read++;
+atomic_inc(&bs->copy_on_read);
 }
 
 void bdrv_disable_copy_on_read(BlockDriverState *bs)
 {
-assert(bs->copy_on_read > 0);
-bs->copy_on_read--;
+int old = atomic_fetch_dec(&bs->copy_on_read);
+assert(old >= 1);
 }
 
 /* Check if any requests are in-flight (including throttled requests) */
@@ -1144,7 +1144,7 @@ int coroutine_fn bdrv_co_preadv(BdrvChild *child,
 bdrv_inc_in_flight(bs);
 
 /* Don't do copy-on-read if we read data before write operation */
-if (bs->copy_on_read && !(flags & BDRV_REQ_NO_SERIALISING)) {
+if (atomic_read(&bs->copy_on_read) && !(flags & BDRV_REQ_NO_SERIALISING)) {
 flags |= BDRV_REQ_COPY_ON_READ;
 }
 
diff --git a/blockdev.c b/blockdev.c
index 892d768..335fbcc 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1791,7 +1791,7 @@ static void external_snapshot_commit(BlkActionState *common)
 /* We don't need (or want) to use the transactional
  * bdrv_reopen_multiple() across all the entries at once, because we
  * don't want to abort all of them if one of them fails the reopen */
-if (!state->old_bs->copy_on_read) {
+if (!atomic_read(&state->old_bs->copy_on_read)) {
 bdrv_reopen(state->old_bs, state->old_bs->open_flags & ~BDRV_O_RDWR,
 NULL);
 }
diff --git a/include/block/block_int.h b/include/block/block_int.h
index e5eb473..5f99cdb 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -595,11 +595,6 @@ struct BlockDriverState {
 
 /* Protected by AioContext lock */
 
-/* If true, copy read backing sectors into image.  Can be >1 if more
- * than one client has requested copy-on-read.
- */
-int copy_on_read;
-
 /* If we are reading a disk image, give its size in sectors.
  * Generally read-only; it is written to by load_vmstate and save_vmstate,
  * but the block layer is quiescent during those.
@@ -633,6 +628,12 @@ struct BlockDriverState {
 
 QLIST_HEAD(, BdrvDirtyBitmap) dirty_bitmaps;
 
+/* If true, copy read backing sectors into image.  Can be >1 if more
+ * than one client has requested copy-on-read.  Accessed with atomic
+ * ops.
+ */
+int copy_on_read;
+
 /* do we need to tell the quest if we have a volatile write cache? */
 int enable_write_cache;
 
-- 
2.9.4

[Qemu-devel] [PULL v2 15/22] block: use Stat64 for wr_highest_offset

2017-06-02 Thread Fam Zheng

From: Paolo Bonzini 

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Paolo Bonzini 
Message-Id: <20170525163225.29954-12-pbonz...@redhat.com>
Signed-off-by: Fam Zheng 
---
 block/io.c| 4 +---
 block/qapi.c  | 2 +-
 include/block/block_int.h | 7 ---
 3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/block/io.c b/block/io.c
index bb1c9c5..bc69b4c 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1405,9 +1405,7 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child,
 ++bs->write_gen;
 bdrv_set_dirty(bs, start_sector, end_sector - start_sector);
 
-if (bs->wr_highest_offset < offset + bytes) {
-bs->wr_highest_offset = offset + bytes;
-}
+stat64_max(&bs->wr_highest_offset, offset + bytes);
 
 if (ret >= 0) {
 bs->total_sectors = MAX(bs->total_sectors, end_sector);
diff --git a/block/qapi.c b/block/qapi.c
index a40922e..14b60ae 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -441,7 +441,7 @@ static BlockStats *bdrv_query_bds_stats(const BlockDriverState *bs,
 s->node_name = g_strdup(bdrv_get_node_name(bs));
 }
 
-s->stats->wr_highest_offset = bs->wr_highest_offset;
+s->stats->wr_highest_offset = stat64_get(&bs->wr_highest_offset);
 
 if (bs->file) {
 s->has_parent = true;
diff --git a/include/block/block_int.h b/include/block/block_int.h
index d11417e..8f36d1f 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -29,6 +29,7 @@
 #include "qemu/option.h"
 #include "qemu/queue.h"
 #include "qemu/coroutine.h"
+#include "qemu/stats64.h"
 #include "qemu/timer.h"
 #include "qapi-types.h"
 #include "qemu/hbitmap.h"
@@ -604,9 +605,6 @@ struct BlockDriverState {
 /* Callback before write request is processed */
 NotifierWithReturnList before_write_notifiers;
 
-/* Offset after the highest byte written to */
-uint64_t wr_highest_offset;
-
 /* threshold limit for writes, in bytes. "High water mark". */
 uint64_t write_threshold_offset;
 NotifierWithReturn write_threshold_notifier;
@@ -619,6 +617,9 @@ struct BlockDriverState {
 
 QLIST_HEAD(, BdrvDirtyBitmap) dirty_bitmaps;
 
+/* Offset after the highest byte written to */
+Stat64 wr_highest_offset;
+
 /* If true, copy read backing sectors into image.  Can be >1 if more
  * than one client has requested copy-on-read.  Accessed with atomic
  * ops.
-- 
2.9.4
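
The open-coded "if (cur < new) cur = new" that this patch removes is a
read-modify-write, so two concurrent writers could lose an update. On
hosts with 64-bit atomics, stat64_max() replaces it with a
compare-and-swap loop. Below is a minimal sketch of that loop written with
C11 <stdatomic.h> rather than QEMU's atomic_cmpxchg__nocheck().
toy_stat_max is a hypothetical name, and the sketch covers only the
64-bit-capable case, not the 32-bit fallback with its lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Raise *stat to value if value is larger; never lower it. */
static void toy_stat_max(_Atomic uint64_t *stat, uint64_t value)
{
    uint64_t orig = atomic_load(stat);

    /* On failure, atomic_compare_exchange_weak() stores the current
     * contents of *stat into orig, so each retry compares value against
     * fresh data and stops as soon as someone else wrote a larger one. */
    while (orig < value &&
           !atomic_compare_exchange_weak(stat, &orig, value)) {
        /* retry */
    }
}
```

The loop exits without writing when the stored value is already at least
as large, which keeps the common case of a non-growing maximum to a single
atomic load.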

[Qemu-devel] [PULL v2 13/22] throttle-groups: protect throttled requests with a CoMutex

2017-06-02 Thread Fam Zheng

From: Paolo Bonzini 

Another possibility is to use tg->lock, which we're holding anyway in
both schedule_next_request and throttle_group_co_io_limits_intercept.
This would require open-coding the CoQueue however, so I've chosen this
alternative.

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Paolo Bonzini 
Message-Id: <20170525163225.29954-10-pbonz...@redhat.com>
Signed-off-by: Fam Zheng 
---
 block/block-backend.c  |  1 +
 block/throttle-groups.c| 12 ++--
 include/sysemu/block-backend.h |  7 ++-
 3 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index e50ec03..be2ddf1 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -216,6 +216,7 @@ BlockBackend *blk_new(uint64_t perm, uint64_t shared_perm)
 blk->shared_perm = shared_perm;
 blk_set_enable_write_cache(blk, true);
 
+qemu_co_mutex_init(&blk->public.throttled_reqs_lock);
 qemu_co_queue_init(&blk->public.throttled_reqs[0]);
 qemu_co_queue_init(&blk->public.throttled_reqs[1]);
 
diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index 8bf1031..a181cb1 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -270,8 +270,13 @@ static bool coroutine_fn throttle_group_co_restart_queue(BlockBackend *blk,
  bool is_write)
 {
 BlockBackendPublic *blkp = blk_get_public(blk);
+bool ret;
 
-return qemu_co_queue_next(&blkp->throttled_reqs[is_write]);
+qemu_co_mutex_lock(&blkp->throttled_reqs_lock);
+ret = qemu_co_queue_next(&blkp->throttled_reqs[is_write]);
+qemu_co_mutex_unlock(&blkp->throttled_reqs_lock);
+
+return ret;
 }
 
 /* Look for the next pending I/O request and schedule it.
@@ -340,7 +345,10 @@ void coroutine_fn throttle_group_co_io_limits_intercept(BlockBackend *blk,
 if (must_wait || blkp->pending_reqs[is_write]) {
 blkp->pending_reqs[is_write]++;
 qemu_mutex_unlock(&tg->lock);
-qemu_co_queue_wait(&blkp->throttled_reqs[is_write], NULL);
+qemu_co_mutex_lock(&blkp->throttled_reqs_lock);
+qemu_co_queue_wait(&blkp->throttled_reqs[is_write],
+   &blkp->throttled_reqs_lock);
+qemu_co_mutex_unlock(&blkp->throttled_reqs_lock);
 qemu_mutex_lock(&tg->lock);
 blkp->pending_reqs[is_write]--;
 }
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 24b63d6..999eb23 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -72,11 +72,8 @@ typedef struct BlockDevOps {
  * fields that must be public. This is in particular for QLIST_ENTRY() and
  * friends so that BlockBackends can be kept in lists outside block-backend.c */
 typedef struct BlockBackendPublic {
-/* I/O throttling has its own locking, but also some fields are
- * protected by the AioContext lock.
- */
-
-/* Protected by AioContext lock.  */
+/* throttled_reqs_lock protects the CoQueues for throttled requests.  */
+CoMutex  throttled_reqs_lock;
 CoQueue  throttled_reqs[2];
 
 /* Nonzero if the I/O limits are currently being ignored; generally
-- 
2.9.4

[Qemu-devel] [PULL v2 09/22] block: access wakeup with atomic ops

2017-06-02 Thread Fam Zheng

From: Paolo Bonzini 

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Paolo Bonzini 
Message-Id: <20170525163225.29954-6-pbonz...@redhat.com>
Signed-off-by: Fam Zheng 
---
 block/io.c| 3 ++-
 block/nfs.c   | 4 +++-
 block/sheepdog.c  | 3 ++-
 include/block/block.h | 5 +++--
 include/block/block_int.h | 7 +--
 5 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/block/io.c b/block/io.c
index d76202b..4a59829 100644
--- a/block/io.c
+++ b/block/io.c
@@ -501,7 +501,8 @@ static void dummy_bh_cb(void *opaque)
 
 void bdrv_wakeup(BlockDriverState *bs)
 {
-if (bs->wakeup) {
+/* The barrier (or an atomic op) is in the caller.  */
+if (atomic_read(&bs->wakeup)) {
 aio_bh_schedule_oneshot(qemu_get_aio_context(), dummy_bh_cb, NULL);
 }
 }
diff --git a/block/nfs.c b/block/nfs.c
index 848b2c0..18c87d2 100644
--- a/block/nfs.c
+++ b/block/nfs.c
@@ -730,7 +730,9 @@ nfs_get_allocated_file_size_cb(int ret, struct nfs_context *nfs, void *data,
 if (task->ret < 0) {
 error_report("NFS Error: %s", nfs_get_error(nfs));
 }
-task->complete = 1;
+
+/* Set task->complete before reading bs->wakeup.  */
+atomic_mb_set(&task->complete, 1);
 bdrv_wakeup(task->bs);
 }
 
diff --git a/block/sheepdog.c b/block/sheepdog.c
index a18315a..5ebf5d9 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -698,7 +698,8 @@ out:
 
 srco->co = NULL;
 srco->ret = ret;
-srco->finished = true;
+/* Set srco->finished before reading bs->wakeup.  */
+atomic_mb_set(&srco->finished, true);
 if (srco->bs) {
 bdrv_wakeup(srco->bs);
 }
diff --git a/include/block/block.h b/include/block/block.h
index 9b355e9..a4f09df 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -402,7 +402,8 @@ void bdrv_drain_all(void);
  * block_job_defer_to_main_loop for how to do it). \
  */\
 assert(!bs_->wakeup);  \
-bs_->wakeup = true;\
+/* Set bs->wakeup before evaluating cond.  */  \
+atomic_mb_set(&bs_->wakeup, true); \
 while (busy_) {\
 if ((cond)) {  \
 waited_ = busy_ = true;\
@@ -414,7 +415,7 @@ void bdrv_drain_all(void);
 waited_ |= busy_;  \
 }  \
 }  \
-bs_->wakeup = false;   \
+atomic_set(&bs_->wakeup, false);   \
 }  \
 waited_; })
 
diff --git a/include/block/block_int.h b/include/block/block_int.h
index e5f19eb..dd09e00 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -604,8 +604,6 @@ struct BlockDriverState {
 /* Callback before write request is processed */
 NotifierWithReturnList before_write_notifiers;
 
-bool wakeup;
-
 /* Offset after the highest byte written to */
 uint64_t wr_highest_offset;
 
@@ -636,6 +634,11 @@ struct BlockDriverState {
 unsigned int in_flight;
 unsigned int serialising_in_flight;
 
+/* Internal to BDRV_POLL_WHILE and bdrv_wakeup.  Accessed with atomic
+ * ops.
+ */
+bool wakeup;
+
 /* do we need to tell the quest if we have a volatile write cache? */
 int enable_write_cache;
 
-- 
2.9.4
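
The ordering comments in this patch ("set X before reading Y") describe a two-sided handshake. The sketch below shows it with C11 sequentially consistent atomics as a rough analogue of atomic_mb_set() and atomic_read(). The SketchState type and sketch_* functions are hypothetical, and the 'kicks' counter merely stands in for scheduling a bottom half.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_bool wakeup;    /* set by the poller while it is waiting */
    atomic_bool complete;  /* set by the completion callback */
    int kicks;             /* how many wakeups were scheduled */
} SketchState;

/* Completion side: publish the result first, then look at 'wakeup'. */
void sketch_complete(SketchState *s)
{
    atomic_store(&s->complete, true);       /* seq_cst store */
    if (atomic_load(&s->wakeup)) {
        s->kicks++;                         /* stands in for a BH kick */
    }
}

/* Poller side: announce interest first, then evaluate the condition.
 * Returns true if the poller still has to wait. */
bool sketch_poll_begin(SketchState *s)
{
    assert(!atomic_load(&s->wakeup));
    atomic_store(&s->wakeup, true);         /* seq_cst store */
    return !atomic_load(&s->complete);
}

void sketch_poll_end(SketchState *s)
{
    atomic_store(&s->wakeup, false);
}

/* Usage: replay both orderings single-threaded; returns 1 if neither
 * ordering can leave the poller waiting without a kick. */
int sketch_wakeup_demo(void)
{
    SketchState a = { false, false, 0 };
    SketchState b = { false, false, 0 };

    /* Poller first: it must wait, and the completion sees wakeup. */
    bool a_waits = sketch_poll_begin(&a);
    sketch_complete(&a);
    sketch_poll_end(&a);

    /* Completion first: no kick, but the poller sees complete. */
    sketch_complete(&b);
    bool b_waits = sketch_poll_begin(&b);
    sketch_poll_end(&b);

    return a_waits && a.kicks == 1 && b.kicks == 0 && !b_waits;
}
```

Because each side stores its own flag before loading the other's, at least one of the two always observes the other's store, so a completion can never slip by unnoticed.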

[Qemu-devel] [PULL v2 19/22] migration/block: reset dirty bitmap before reading

2017-06-02 Thread Fam Zheng

From: Paolo Bonzini 

Any data returned by a read may already be stale, so the bitmap
has to be cleared before issuing the read.

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Paolo Bonzini 
Message-Id: <20170525163225.29954-16-pbonz...@redhat.com>
Signed-off-by: Fam Zheng 
---
 migration/block.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/migration/block.c b/migration/block.c
index 9e9f031..8fe484e 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -536,6 +536,8 @@ static int mig_save_device_dirty(QEMUFile *f, BlkMigDevState *bmds,
 } else {
 nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
 }
+bdrv_reset_dirty_bitmap(bmds->dirty_bitmap, sector, nr_sectors);
+
 blk = g_new(BlkMigBlock, 1);
 blk->buf = g_malloc(BLOCK_SIZE);
 blk->bmds = bmds;
@@ -568,7 +570,6 @@ static int mig_save_device_dirty(QEMUFile *f, BlkMigDevState *bmds,
 g_free(blk);
 }
 
-bdrv_reset_dirty_bitmap(bmds->dirty_bitmap, sector, nr_sectors);
 sector += nr_sectors;
 bmds->cur_dirty = sector;
 
-- 
2.9.4
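
To see why the order matters, the following toy model replays a guest write that lands just after the read has copied the data. Everything here is hypothetical: a one-chunk device, a one-bit dirty bitmap, and sketch_* names that do not exist in QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

typedef struct {
    char disk[8];   /* toy one-chunk device */
    bool dirty;     /* toy one-bit dirty bitmap */
} SketchDev;

/* A guest write changes the data and marks the chunk dirty. */
void sketch_guest_write(SketchDev *d, const char *data)
{
    strncpy(d->disk, data, sizeof(d->disk) - 1);
    d->disk[sizeof(d->disk) - 1] = '\0';
    d->dirty = true;
}

/* Copy the chunk out; 'racing' (may be NULL) models a guest write that
 * lands right after the data was copied. */
void sketch_read(SketchDev *d, char *buf, const char *racing)
{
    assert(d && buf);
    memcpy(buf, d->disk, sizeof(d->disk));
    if (racing) {
        sketch_guest_write(d, racing);
    }
}

/* Migrate one chunk; returns whether it is still marked dirty. */
bool sketch_migrate_chunk(SketchDev *d, bool clear_first,
                          const char *racing, char *buf)
{
    if (clear_first) {
        d->dirty = false;          /* order used after this patch */
    }
    sketch_read(d, buf, racing);
    if (!clear_first) {
        d->dirty = false;          /* old order: forgets the racing write */
    }
    return d->dirty;
}

/* Usage: bit 1 set = new order kept the chunk dirty,
 *        bit 0 set = old order kept the chunk dirty. */
int sketch_dirty_demo(void)
{
    SketchDev d1 = { "old", true };
    SketchDev d2 = { "old", true };
    char buf[8];
    bool new_order = sketch_migrate_chunk(&d1, true, "new", buf);
    bool old_order = sketch_migrate_chunk(&d2, false, "new", buf);

    return (new_order ? 2 : 0) | (old_order ? 1 : 0);
}
```

With the clear done first, the racing write re-dirties the chunk and it gets sent again on a later pass; with the clear done last, the stale copy in 'buf' would be the final word.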

[Qemu-devel] [PULL v2 17/22] block: protect tracked_requests and flush_queue with reqs_lock

2017-06-02 Thread Fam Zheng

From: Paolo Bonzini 

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Paolo Bonzini 
Message-Id: <20170525163225.29954-14-pbonz...@redhat.com>
Signed-off-by: Fam Zheng 
---
 block.c   |  1 +
 block/io.c| 16 ++--
 include/block/block_int.h | 14 +-
 3 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/block.c b/block.c
index 361005c..a5cbd45 100644
--- a/block.c
+++ b/block.c
@@ -320,6 +320,7 @@ BlockDriverState *bdrv_new(void)
 QLIST_INIT(&bs->op_blockers[i]);
 }
 notifier_with_return_list_init(&bs->before_write_notifiers);
+qemu_co_mutex_init(&bs->reqs_lock);
 bs->refcnt = 1;
 bs->aio_context = qemu_get_aio_context();
 
diff --git a/block/io.c b/block/io.c
index 036f5a4..91611ff 100644
--- a/block/io.c
+++ b/block/io.c
@@ -378,8 +378,10 @@ static void tracked_request_end(BdrvTrackedRequest *req)
 atomic_dec(&req->bs->serialising_in_flight);
 }
 
+qemu_co_mutex_lock(&req->bs->reqs_lock);
 QLIST_REMOVE(req, list);
 qemu_co_queue_restart_all(&req->wait_queue);
+qemu_co_mutex_unlock(&req->bs->reqs_lock);
 }
 
 /**
@@ -404,7 +406,9 @@ static void tracked_request_begin(BdrvTrackedRequest *req,
 
 qemu_co_queue_init(&req->wait_queue);
 
+qemu_co_mutex_lock(&bs->reqs_lock);
 QLIST_INSERT_HEAD(&bs->tracked_requests, req, list);
+qemu_co_mutex_unlock(&bs->reqs_lock);
 }
 
 static void mark_request_serialising(BdrvTrackedRequest *req, uint64_t align)
@@ -526,6 +530,7 @@ static bool coroutine_fn wait_serialising_requests(BdrvTrackedRequest *self)
 
 do {
 retry = false;
+qemu_co_mutex_lock(&bs->reqs_lock);
 QLIST_FOREACH(req, &bs->tracked_requests, list) {
 if (req == self || (!req->serialising && !self->serialising)) {
 continue;
@@ -544,7 +549,7 @@ static bool coroutine_fn wait_serialising_requests(BdrvTrackedRequest *self)
  * (instead of producing a deadlock in the former case). */
 if (!req->waiting_for) {
 self->waiting_for = req;
-qemu_co_queue_wait(&req->wait_queue, NULL);
+qemu_co_queue_wait(&req->wait_queue, &bs->reqs_lock);
 self->waiting_for = NULL;
 retry = true;
 waited = true;
@@ -552,6 +557,7 @@ static bool coroutine_fn wait_serialising_requests(BdrvTrackedRequest *self)
 }
 }
 }
+qemu_co_mutex_unlock(&bs->reqs_lock);
 } while (retry);
 
 return waited;
@@ -2291,14 +2297,17 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
 goto early_exit;
 }
 
+qemu_co_mutex_lock(&bs->reqs_lock);
 current_gen = atomic_read(&bs->write_gen);
 
 /* Wait until any previous flushes are completed */
 while (bs->active_flush_req) {
-qemu_co_queue_wait(&bs->flush_queue, NULL);
+qemu_co_queue_wait(&bs->flush_queue, &bs->reqs_lock);
 }
 
+/* Flushes reach this point in nondecreasing current_gen order.  */
 bs->active_flush_req = true;
+qemu_co_mutex_unlock(&bs->reqs_lock);
 
 /* Write back all layers by calling one driver function */
 if (bs->drv->bdrv_co_flush) {
@@ -2370,9 +2379,12 @@ out:
 if (ret == 0) {
 bs->flushed_gen = current_gen;
 }
+
+qemu_co_mutex_lock(&bs->reqs_lock);
 bs->active_flush_req = false;
 /* Return value is ignored - it's ok if wait queue is empty */
 qemu_co_queue_next(&bs->flush_queue);
+qemu_co_mutex_unlock(&bs->reqs_lock);
 
 early_exit:
 bdrv_dec_in_flight(bs);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 8a9bc0b..31fb364 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -609,11 +609,6 @@ struct BlockDriverState {
 uint64_t write_threshold_offset;
 NotifierWithReturn write_threshold_notifier;
 
-QLIST_HEAD(, BdrvTrackedRequest) tracked_requests;
-CoQueue flush_queue;  /* Serializing flush queue */
-bool active_flush_req;/* Flush request in flight? */
-unsigned int flushed_gen; /* Flushed write generation */
-
 QLIST_HEAD(, BdrvDirtyBitmap) dirty_bitmaps;
 
 /* Offset after the highest byte written to */
@@ -647,6 +642,15 @@ struct BlockDriverState {
 /* Accessed with atomic ops.  */
 int quiesce_counter;
 unsigned int write_gen;   /* Current data generation */
+
+/* Protected by reqs_lock.  */
+CoMutex reqs_lock;
+QLIST_HEAD(, BdrvTrackedRequest) tracked_requests;
+CoQueue flush_queue;  /* Serializing flush queue */
+bool active_flush_req;/* Flush request in flight? */
+
+/* Only read/written by whoever has set active_flush_req to true.  */
+unsigned int flushed_gen; /* Flushed write generation */
 };
 
 struct BlockBackendRootState {
-- 
2.9.4

[Qemu-devel] [PULL v2 18/22] block: introduce dirty_bitmap_mutex

2017-06-02 Thread Fam Zheng

From: Paolo Bonzini 

It protects only the list of dirty bitmaps; in the next patch we will
also protect their content.

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Paolo Bonzini 
Message-Id: <20170525163225.29954-15-pbonz...@redhat.com>
Signed-off-by: Fam Zheng 
---
 block/dirty-bitmap.c  | 44 +++-
 block/mirror.c|  3 ++-
 blockdev.c| 44 +++-
 include/block/block_int.h |  5 +
 migration/block.c |  6 --
 5 files changed, 57 insertions(+), 45 deletions(-)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 519737c..fa78109 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -52,6 +52,17 @@ struct BdrvDirtyBitmapIter {
 BdrvDirtyBitmap *bitmap;
 };
 
+static inline void bdrv_dirty_bitmaps_lock(BlockDriverState *bs)
+{
+qemu_mutex_lock(&bs->dirty_bitmap_mutex);
+}
+
+static inline void bdrv_dirty_bitmaps_unlock(BlockDriverState *bs)
+{
+qemu_mutex_unlock(&bs->dirty_bitmap_mutex);
+}
+
+/* Called with BQL or dirty_bitmap lock taken.  */
 BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs, const char *name)
 {
 BdrvDirtyBitmap *bm;
@@ -65,6 +76,7 @@ BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs, const char *name)
 return NULL;
 }
 
+/* Called with BQL taken.  */
 void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap)
 {
 assert(!bdrv_dirty_bitmap_frozen(bitmap));
@@ -72,6 +84,7 @@ void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap)
 bitmap->name = NULL;
 }
 
+/* Called with BQL taken.  */
 BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
   uint32_t granularity,
   const char *name,
@@ -100,7 +113,9 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
 bitmap->size = bitmap_size;
 bitmap->name = g_strdup(name);
 bitmap->disabled = false;
+bdrv_dirty_bitmaps_lock(bs);
 QLIST_INSERT_HEAD(&bs->dirty_bitmaps, bitmap, list);
+bdrv_dirty_bitmaps_unlock(bs);
 return bitmap;
 }
 
@@ -164,16 +179,19 @@ const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap)
 return bitmap->name;
 }
 
+/* Called with BQL taken.  */
 bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap)
 {
 return bitmap->successor;
 }
 
+/* Called with BQL taken.  */
 bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap)
 {
 return !(bitmap->disabled || bitmap->successor);
 }
 
+/* Called with BQL taken.  */
 DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap)
 {
 if (bdrv_dirty_bitmap_frozen(bitmap)) {
@@ -188,6 +206,7 @@ DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap)
 /**
  * Create a successor bitmap destined to replace this bitmap after an operation.
  * Requires that the bitmap is not frozen and has no successor.
+ * Called with BQL taken.
  */
 int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap, Error **errp)
@@ -220,6 +239,7 @@ int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
 /**
  * For a bitmap with a successor, yield our name to the successor,
  * delete the old bitmap, and return a handle to the new bitmap.
+ * Called with BQL taken.
  */
 BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
 BdrvDirtyBitmap *bitmap,
@@ -247,6 +267,7 @@ BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
  * In cases of failure where we can no longer safely delete the parent,
  * we may wish to re-join the parent and child/successor.
  * The merged parent will be un-frozen, but not explicitly re-enabled.
+ * Called with BQL taken.
  */
 BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
BdrvDirtyBitmap *parent,
@@ -271,25 +292,30 @@ BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
 
 /**
  * Truncates _all_ bitmaps attached to a BDS.
+ * Called with BQL taken.
  */
 void bdrv_dirty_bitmap_truncate(BlockDriverState *bs)
 {
 BdrvDirtyBitmap *bitmap;
 uint64_t size = bdrv_nb_sectors(bs);
 
+bdrv_dirty_bitmaps_lock(bs);
 QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
 assert(!bdrv_dirty_bitmap_frozen(bitmap));
 assert(!bitmap->active_iterators);
 hbitmap_truncate(bitmap->bitmap, size);
 bitmap->size = size;
 }
+bdrv_dirty_bitmaps_unlock(bs);
 }
 
+/* Called with BQL taken.  */
 static void bdrv_do_release_matching_dirty_bitmap(BlockDriverState *bs,
   BdrvDirtyBitmap *bitmap,
   bool only_named)
 {
 BdrvDirtyBitmap *bm, *next;
+bdrv_dirty_bitmaps_lock(bs);
 QLIST_FOREACH_SAFE(bm, &bs->dirty_bitmaps, list, next) {
 if ((!bitmap || bm == bitmap) && (

[Qemu-devel] [PULL v2 21/22] block: introduce block_account_one_io

2017-06-02 Thread Fam Zheng

From: Paolo Bonzini 

This is the common code to account operations that produced actual I/O.

Reviewed-by: Alberto Garcia 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Paolo Bonzini 
Message-Id: <20170525163225.29954-18-pbonz...@redhat.com>
Signed-off-by: Fam Zheng 
---
 block/accounting.c | 51 ++-
 1 file changed, 22 insertions(+), 29 deletions(-)

diff --git a/block/accounting.c b/block/accounting.c
index 3f457c4..a279e0b 100644
--- a/block/accounting.c
+++ b/block/accounting.c
@@ -86,7 +86,8 @@ void block_acct_start(BlockAcctStats *stats, BlockAcctCookie *cookie,
 cookie->type = type;
 }
 
-void block_acct_done(BlockAcctStats *stats, BlockAcctCookie *cookie)
+static void block_account_one_io(BlockAcctStats *stats, BlockAcctCookie *cookie,
+ bool failed)
 {
 BlockAcctTimedStats *s;
 int64_t time_ns = qemu_clock_get_ns(clock_type);
@@ -98,31 +99,14 @@ void block_acct_done(BlockAcctStats *stats, BlockAcctCookie *cookie)
 
 assert(cookie->type < BLOCK_MAX_IOTYPE);
 
-stats->nr_bytes[cookie->type] += cookie->bytes;
-stats->nr_ops[cookie->type]++;
-stats->total_time_ns[cookie->type] += latency_ns;
-stats->last_access_time_ns = time_ns;
-
-QSLIST_FOREACH(s, &stats->intervals, entries) {
-timed_average_account(&s->latency[cookie->type], latency_ns);
+if (failed) {
+stats->failed_ops[cookie->type]++;
+} else {
+stats->nr_bytes[cookie->type] += cookie->bytes;
+stats->nr_ops[cookie->type]++;
 }
-}
-
-void block_acct_failed(BlockAcctStats *stats, BlockAcctCookie *cookie)
-{
-assert(cookie->type < BLOCK_MAX_IOTYPE);
-
-stats->failed_ops[cookie->type]++;
-
-if (stats->account_failed) {
-BlockAcctTimedStats *s;
-int64_t time_ns = qemu_clock_get_ns(clock_type);
-int64_t latency_ns = time_ns - cookie->start_time_ns;
-
-if (qtest_enabled()) {
-latency_ns = qtest_latency_ns;
-}
 
+if (!failed || stats->account_failed) {
 stats->total_time_ns[cookie->type] += latency_ns;
 stats->last_access_time_ns = time_ns;
 
@@ -132,15 +116,24 @@ void block_acct_failed(BlockAcctStats *stats, BlockAcctCookie *cookie)
 }
 }
 
+void block_acct_done(BlockAcctStats *stats, BlockAcctCookie *cookie)
+{
+block_account_one_io(stats, cookie, false);
+}
+
+void block_acct_failed(BlockAcctStats *stats, BlockAcctCookie *cookie)
+{
+block_account_one_io(stats, cookie, true);
+}
+
 void block_acct_invalid(BlockAcctStats *stats, enum BlockAcctType type)
 {
 assert(type < BLOCK_MAX_IOTYPE);
 
-/* block_acct_done() and block_acct_failed() update
- * total_time_ns[], but this one does not. The reason is that
- * invalid requests are accounted during their submission,
- * therefore there's no actual I/O involved. */
-
+/* block_account_one_io() updates total_time_ns[], but this one does
+ * not.  The reason is that invalid requests are accounted during their
+ * submission, therefore there's no actual I/O involved.
+ */
 stats->invalid_ops[type]++;
 
 if (stats->account_invalid) {
-- 
2.9.4
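
The resulting control flow is easier to follow outside diff form. Below is a simplified, self-contained sketch of the merged helper. The Sk* types are hypothetical and cut down, the latency is carried in the cookie instead of being read from a clock, and the timed-average intervals and last_access_time_ns update are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { SK_READ, SK_WRITE, SK_MAX_IOTYPE };

typedef struct {
    uint64_t nr_bytes[SK_MAX_IOTYPE];
    uint64_t nr_ops[SK_MAX_IOTYPE];
    uint64_t failed_ops[SK_MAX_IOTYPE];
    uint64_t total_time_ns[SK_MAX_IOTYPE];
    bool account_failed;
} SkStats;

typedef struct {
    int type;
    int64_t bytes;
    int64_t latency_ns;   /* simplification: precomputed latency */
} SkCookie;

/* Common path for every operation that produced actual I/O. */
void sk_account_one_io(SkStats *st, const SkCookie *c, bool failed)
{
    assert(c->type >= 0 && c->type < SK_MAX_IOTYPE);

    if (failed) {
        st->failed_ops[c->type]++;
    } else {
        st->nr_bytes[c->type] += c->bytes;
        st->nr_ops[c->type]++;
    }

    /* Failed requests add latency only when account_failed is set. */
    if (!failed || st->account_failed) {
        st->total_time_ns[c->type] += c->latency_ns;
    }
}

void sk_acct_done(SkStats *st, const SkCookie *c)
{
    sk_account_one_io(st, c, false);
}

void sk_acct_failed(SkStats *st, const SkCookie *c)
{
    sk_account_one_io(st, c, true);
}

/* Usage: one good read (100 ns) and one failed read (40 ns);
 * returns the accumulated read time. */
uint64_t sk_accounting_demo(bool account_failed)
{
    SkStats st = { .account_failed = account_failed };
    SkCookie ok = { SK_READ, 512, 100 };
    SkCookie bad = { SK_READ, 512, 40 };

    sk_acct_done(&st, &ok);
    sk_acct_failed(&st, &bad);
    return st.total_time_ns[SK_READ];
}
```

Folding both entry points into one helper leaves a single place where a lock can later be taken around the counters.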

[Qemu-devel] [PULL v2 16/22] block: access write_gen with atomics

2017-06-02 Thread Fam Zheng

From: Paolo Bonzini 

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Paolo Bonzini 
Message-Id: <20170525163225.29954-13-pbonz...@redhat.com>
Signed-off-by: Fam Zheng 
---
 block.c   | 2 +-
 block/io.c| 6 +++---
 include/block/block_int.h | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/block.c b/block.c
index af6366b..361005c 100644
--- a/block.c
+++ b/block.c
@@ -3424,7 +3424,7 @@ int bdrv_truncate(BdrvChild *child, int64_t offset, Error **errp)
 ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
 bdrv_dirty_bitmap_truncate(bs);
 bdrv_parent_cb_resize(bs);
-++bs->write_gen;
+atomic_inc(&bs->write_gen);
 }
 return ret;
 }
diff --git a/block/io.c b/block/io.c
index bc69b4c..036f5a4 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1402,7 +1402,7 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child,
 }
 bdrv_debug_event(bs, BLKDBG_PWRITEV_DONE);
 
-++bs->write_gen;
+atomic_inc(&bs->write_gen);
 bdrv_set_dirty(bs, start_sector, end_sector - start_sector);
 
 stat64_max(&bs->wr_highest_offset, offset + bytes);
@@ -2291,7 +2291,7 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
 goto early_exit;
 }
 
-current_gen = bs->write_gen;
+current_gen = atomic_read(&bs->write_gen);
 
 /* Wait until any previous flushes are completed */
 while (bs->active_flush_req) {
@@ -2516,7 +2516,7 @@ int coroutine_fn bdrv_co_pdiscard(BlockDriverState *bs, int64_t offset,
 }
 ret = 0;
 out:
-++bs->write_gen;
+atomic_inc(&bs->write_gen);
 bdrv_set_dirty(bs, req.offset >> BDRV_SECTOR_BITS,
req.bytes >> BDRV_SECTOR_BITS);
 tracked_request_end(&req);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 8f36d1f..8a9bc0b 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -612,7 +612,6 @@ struct BlockDriverState {
 QLIST_HEAD(, BdrvTrackedRequest) tracked_requests;
 CoQueue flush_queue;  /* Serializing flush queue */
 bool active_flush_req;/* Flush request in flight? */
-unsigned int write_gen;   /* Current data generation */
 unsigned int flushed_gen; /* Flushed write generation */
 
 QLIST_HEAD(, BdrvDirtyBitmap) dirty_bitmaps;
@@ -647,6 +646,7 @@ struct BlockDriverState {
 
 /* Accessed with atomic ops.  */
 int quiesce_counter;
+unsigned int write_gen;   /* Current data generation */
 };
 
 struct BlockBackendRootState {
-- 
2.9.4
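
As a rough illustration of how a write generation counter interacts with flushes, here is a sketch using C11 atomics. The SkGen type and sk_* functions are hypothetical. In particular, sk_needs_flush() is an assumption about how such a counter might be consulted and is not taken from this patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_uint write_gen;      /* bumped by every write-like operation */
    unsigned int flushed_gen;   /* generation covered by the last flush */
} SkGen;

/* Writers, discards and truncates all bump the generation atomically. */
void sk_note_write(SkGen *g)
{
    atomic_fetch_add(&g->write_gen, 1);
}

/* Snapshot first, flush second: writes that land while the flush is in
 * progress are not claimed as flushed. */
void sk_flush(SkGen *g, int writes_during_flush)
{
    unsigned int current_gen = atomic_load(&g->write_gen);

    assert(writes_during_flush >= 0);
    while (writes_during_flush-- > 0) {
        sk_note_write(g);       /* models concurrent writers */
    }
    g->flushed_gen = current_gen;
}

/* Assumed consumer: anything written since the last flush? */
bool sk_needs_flush(SkGen *g)
{
    return g->flushed_gen != atomic_load(&g->write_gen);
}

/* Usage: returns 1 if a quiet flush leaves the device clean and a flush
 * raced by a write leaves it needing another flush. */
int sk_gen_demo(void)
{
    SkGen g;
    bool clean, dirty;

    atomic_init(&g.write_gen, 0);
    g.flushed_gen = 0;

    sk_note_write(&g);
    sk_flush(&g, 0);
    clean = !sk_needs_flush(&g);

    sk_note_write(&g);
    sk_flush(&g, 1);
    dirty = sk_needs_flush(&g);

    return clean && dirty;
}
```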

[Qemu-devel] [PULL v2 20/22] block: protect modification of dirty bitmaps with a mutex

2017-06-02 Thread Fam Zheng

From: Paolo Bonzini 

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Paolo Bonzini 
Message-Id: <20170525163225.29954-17-pbonz...@redhat.com>
Signed-off-by: Fam Zheng 
---
 block/dirty-bitmap.c | 70 ++--
 block/mirror.c   | 11 +--
 include/block/block_int.h|  4 +--
 include/block/dirty-bitmap.h | 25 +++-
 migration/block.c| 10 ---
 5 files changed, 95 insertions(+), 25 deletions(-)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index fa78109..a04c6e4 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -37,6 +37,7 @@
  * or enabled. A frozen bitmap can only abdicate() or reclaim().
  */
 struct BdrvDirtyBitmap {
+QemuMutex *mutex;
 HBitmap *bitmap;/* Dirty sector bitmap implementation */
 HBitmap *meta;  /* Meta dirty bitmap */
 BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status */
@@ -62,6 +63,16 @@ static inline void bdrv_dirty_bitmaps_unlock(BlockDriverState *bs)
 qemu_mutex_unlock(&bs->dirty_bitmap_mutex);
 }
 
+void bdrv_dirty_bitmap_lock(BdrvDirtyBitmap *bitmap)
+{
+qemu_mutex_lock(bitmap->mutex);
+}
+
+void bdrv_dirty_bitmap_unlock(BdrvDirtyBitmap *bitmap)
+{
+qemu_mutex_unlock(bitmap->mutex);
+}
+
 /* Called with BQL or dirty_bitmap lock taken.  */
 BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs, const char *name)
 {
@@ -109,6 +120,7 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
 return NULL;
 }
 bitmap = g_new0(BdrvDirtyBitmap, 1);
+bitmap->mutex = &bs->dirty_bitmap_mutex;
 bitmap->bitmap = hbitmap_alloc(bitmap_size, ctz32(sector_granularity));
 bitmap->size = bitmap_size;
 bitmap->name = g_strdup(name);
@@ -134,20 +146,24 @@ void bdrv_create_meta_dirty_bitmap(BdrvDirtyBitmap *bitmap,
int chunk_size)
 {
 assert(!bitmap->meta);
+qemu_mutex_lock(bitmap->mutex);
 bitmap->meta = hbitmap_create_meta(bitmap->bitmap,
chunk_size * BITS_PER_BYTE);
+qemu_mutex_unlock(bitmap->mutex);
 }
 
 void bdrv_release_meta_dirty_bitmap(BdrvDirtyBitmap *bitmap)
 {
 assert(bitmap->meta);
+qemu_mutex_lock(bitmap->mutex);
 hbitmap_free_meta(bitmap->bitmap);
 bitmap->meta = NULL;
+qemu_mutex_unlock(bitmap->mutex);
 }
 
-int bdrv_dirty_bitmap_get_meta(BlockDriverState *bs,
-   BdrvDirtyBitmap *bitmap, int64_t sector,
-   int nb_sectors)
+int bdrv_dirty_bitmap_get_meta_locked(BlockDriverState *bs,
+  BdrvDirtyBitmap *bitmap, int64_t sector,
+  int nb_sectors)
 {
 uint64_t i;
 int sectors_per_bit = 1 << hbitmap_granularity(bitmap->meta);
@@ -162,11 +178,26 @@ int bdrv_dirty_bitmap_get_meta(BlockDriverState *bs,
 return false;
 }
 
+int bdrv_dirty_bitmap_get_meta(BlockDriverState *bs,
+   BdrvDirtyBitmap *bitmap, int64_t sector,
+   int nb_sectors)
+{
+bool dirty;
+
+qemu_mutex_lock(bitmap->mutex);
+dirty = bdrv_dirty_bitmap_get_meta_locked(bs, bitmap, sector, nb_sectors);
+qemu_mutex_unlock(bitmap->mutex);
+
+return dirty;
+}
+
 void bdrv_dirty_bitmap_reset_meta(BlockDriverState *bs,
   BdrvDirtyBitmap *bitmap, int64_t sector,
   int nb_sectors)
 {
+qemu_mutex_lock(bitmap->mutex);
 hbitmap_reset(bitmap->meta, sector, nb_sectors);
+qemu_mutex_unlock(bitmap->mutex);
 }
 
 int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap)
@@ -393,8 +424,9 @@ BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
 return list;
 }
 
-int bdrv_get_dirty(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
-   int64_t sector)
+/* Called within bdrv_dirty_bitmap_lock..unlock */
+int bdrv_get_dirty_locked(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
+  int64_t sector)
 {
 if (bitmap) {
 return hbitmap_get(bitmap->bitmap, sector);
@@ -467,23 +499,42 @@ int64_t bdrv_dirty_iter_next(BdrvDirtyBitmapIter *iter)
 return hbitmap_iter_next(&iter->hbi);
 }
 
-void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
-   int64_t cur_sector, int64_t nr_sectors)
+/* Called within bdrv_dirty_bitmap_lock..unlock */
+void bdrv_set_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
+  int64_t cur_sector, int64_t nr_sectors)
 {
 assert(bdrv_dirty_bitmap_enabled(bitmap));
 hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
 }
 
-void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
- int64_t cur_sector, int64_t nr_sectors)
+void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
+   int64_t cur_sector, int64_t nr_sectors)
+{
+bdrv_dir

[Qemu-devel] [PATCH] vhost-user-bridge: fix iov_restore_front() warning

2017-06-02 Thread Marc-André Lureau

  CC  tests/vhost-user-bridge.o
/home/dgilbert/git/qemu-world3/tests/vhost-user-bridge.c:228:23: warning: variables 'front' and 'iov' used in loop condition not modified in loop body [-Wfor-loop-analysis]
for (cur = front; front != iov; cur++) {
  ^~~~
1 warning generated.

Fix the loop, document the function, and fix some related assert().

In practice, the loop bug was harmless because the front sg buffer is
large enough to hold the whole header, so discarding/restoring it never
advances past the first element.

Reported-by: Dr. David Alan Gilbert 
Signed-off-by: Marc-André Lureau 
---
 tests/vhost-user-bridge.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/tests/vhost-user-bridge.c b/tests/vhost-user-bridge.c
index 8618c20d53..1e5b5ca3da 100644
--- a/tests/vhost-user-bridge.c
+++ b/tests/vhost-user-bridge.c
@@ -220,12 +220,18 @@ vubr_handle_tx(VuDev *dev, int qidx)
 free(elem);
 }
 
+
+/* This function reverses the effect of iov_discard_front(). It must be
+ * called with 'front' being the original struct iovec and 'bytes'
+ * being the number of bytes you shaved off.
+ */
 static void
 iov_restore_front(struct iovec *front, struct iovec *iov, size_t bytes)
 {
 struct iovec *cur;
 
-for (cur = front; front != iov; cur++) {
+for (cur = front; cur != iov; cur++) {
+assert(bytes >= cur->iov_len);
 bytes -= cur->iov_len;
 }
 
@@ -302,7 +308,8 @@ vubr_backend_recv_cb(int sock, void *ctx)
 }
 iov_from_buf(sg, elem->in_num, 0, &hdr, sizeof hdr);
 total += hdrlen;
-assert(iov_discard_front(&sg, &num, hdrlen) == hdrlen);
+ret = iov_discard_front(&sg, &num, hdrlen);
+assert(ret == hdrlen);
 }
 
 struct msghdr msg = {
-- 
2.13.0.91.g00982b8dd
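
The discard/restore pairing can be tried in isolation with the sketch below. Both helpers are hypothetical sk_* stand-ins. The discard side is a toy written for this example, and the last two statements of the restore side are an assumption about the part of the function the hunk above does not show. The demo also keeps the discard call outside assert(), since a call placed inside assert() disappears when NDEBUG is defined.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Toy discard: skip whole elements, then shave the rest off the first
 * element that is longer than what remains. Returns bytes discarded. */
size_t sk_iov_discard_front(struct iovec **iov, unsigned int *iov_cnt,
                            size_t bytes)
{
    size_t total = 0;
    struct iovec *cur;

    for (cur = *iov; *iov_cnt > 0; cur++) {
        if (cur->iov_len > bytes) {
            cur->iov_base = (char *)cur->iov_base + bytes;
            cur->iov_len -= bytes;
            total += bytes;
            break;
        }
        bytes -= cur->iov_len;
        total += cur->iov_len;
        *iov_cnt -= 1;
    }
    *iov = cur;
    return total;
}

/* Undo the discard: 'front' is the original array, 'iov' the advanced
 * pointer, 'bytes' the amount that was discarded. */
void sk_iov_restore_front(struct iovec *front, struct iovec *iov,
                          size_t bytes)
{
    struct iovec *cur;

    /* Walk 'cur' (not 'front') over the fully skipped elements. */
    for (cur = front; cur != iov; cur++) {
        assert(bytes >= cur->iov_len);
        bytes -= cur->iov_len;
    }

    /* Assumed tail: give the shaved bytes back to the partial element. */
    cur->iov_base = (char *)cur->iov_base - bytes;
    cur->iov_len += bytes;
}

/* Usage: discard 6 bytes across a 4-byte and an 8-byte buffer, then
 * restore; returns 1 if the second element is back to its original. */
int sk_iov_demo(void)
{
    char a[4], b[8];
    struct iovec vec[2] = {
        { .iov_base = a, .iov_len = sizeof(a) },
        { .iov_base = b, .iov_len = sizeof(b) },
    };
    struct iovec *sg = vec;
    unsigned int num = 2;
    size_t ret = sk_iov_discard_front(&sg, &num, 6);
    int shaved = ret == 6 && sg == &vec[1] && num == 1 &&
                 vec[1].iov_len == 6;

    sk_iov_restore_front(vec, sg, 6);
    return shaved && vec[1].iov_base == (void *)b && vec[1].iov_len == 8;
}
```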

Re: [Qemu-devel] [PATCHv4 0/5] Clean up compatibility mode handling

2017-06-02 Thread Greg Kurz

On Fri, 2 Jun 2017 12:00:07 +1000
David Gibson  wrote:

> On Thu, Jun 01, 2017 at 03:09:15PM +0200, Greg Kurz wrote:
> > On Thu, 1 Jun 2017 13:59:14 +0200
> > Cédric Le Goater  wrote:
> >   
> > > On 06/01/2017 08:52 AM, David Gibson wrote:  
> > > > On Wed, May 31, 2017 at 10:58:57AM +0200, Greg Kurz wrote:
> > > >> On Wed, 31 May 2017 12:57:48 +1000
> > > >> David Gibson  wrote:
> > > >>> [...]
> > >  All old non-pseries machine types already complain when started with
> > >  a POWER7 or newer CPU. Providing the extra error message looks weird:
> > > 
> > >  qemu-system-ppc64 -machine ppce500 \
> > >    -cpu POWER7,compat=power6
> > >  qemu-system-ppc64: CPU 'compat' property is deprecated and has no 
> > >  effect;
> > >   use max-cpu-compat machine property instead
> > >  MMU model 983043 not supported by this machine.
> > > 
> > >  but I guess it's better than crashing. :)  
> > > >>>
> > > >>> Well, sure POWER7 doesn't make sense for an e500 machine for other
> > > >>> reasons.  But POWER7 or POWER8 _would_ make sense for powernv, where
> > > >>> compat= doesn't.
> > > >>>
> > > >>
> > > >> The powernv machine type doesn't even support CPU features at all:
> > > >>
> > > >> chip_typename = g_strdup_printf(TYPE_PNV_CHIP "-%s",
> > > >>                                 machine->cpu_model);
> > > >> if (!object_class_by_name(chip_typename)) {
> > > >>     error_report("invalid CPU model '%s' for %s machine",
> > > >>                  machine->cpu_model,
> > > >>                  MACHINE_GET_CLASS(machine)->name);
> > > >>     exit(1);
> > > >> }
> > > >> }
> > > > 
> > > > Ah, well, that's another bug, but not one that's in scope for this
> > > > series.
> > > 
> > > PowerNV is still work in progress. I would not worry about it too much.
> > >   
> > 
> > Of course and this isn't the purpose of the discussion actually. We were
> > talking about CPU features being relevant or not depending on the machine
> > type.
> > 
> > But I'm not even sure that CPU features are useful at all for ppc, not to
> > say very confusing (otherwise this series wouldn't be needed for example).
> > 
> > Speaking of PowerNV, just as an example, I guess the fix would be to
> > forbid machine->cpu_model if it contains features. And probably the same
> > for all other machine types, except pseries for backward compatibility
> > reasons.  
> 
> I don't think that's correct in principle.  I can imagine CPU
> properties it might make sense to really set on the cpu, regardless of
> machine type.  A quick look says we don't have any such at the moment,
> but I don't think it's something we should prevent as a matter of policy.
> 

Fair enough. Then maybe all machines should parse CPU features and check which
ones are valid before instantiating the CPUs?



[Qemu-devel] [PULL v2 22/22] block: make accounting thread-safe

2017-06-02 Thread Fam Zheng

From: Paolo Bonzini 

I'm not trying too hard yet.  Later, with multiqueue support,
this may cause mutex contention or cacheline bouncing.

Reviewed-by: Alberto Garcia 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Paolo Bonzini 
Message-Id: <20170525163225.29954-19-pbonz...@redhat.com>
Signed-off-by: Fam Zheng 
---
 block/accounting.c | 16 
 include/block/accounting.h |  8 ++--
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/block/accounting.c b/block/accounting.c
index a279e0b..37ed66f 100644
--- a/block/accounting.c
+++ b/block/accounting.c
@@ -35,6 +35,7 @@ static const int qtest_latency_ns = NANOSECONDS_PER_SECOND / 1000;
 void block_acct_init(BlockAcctStats *stats, bool account_invalid,
  bool account_failed)
 {
+qemu_mutex_init(&stats->lock);
 stats->account_invalid = account_invalid;
 stats->account_failed = account_failed;
 
@@ -49,6 +50,7 @@ void block_acct_cleanup(BlockAcctStats *stats)
 QSLIST_FOREACH_SAFE(s, &stats->intervals, entries, next) {
 g_free(s);
 }
+qemu_mutex_destroy(&stats->lock);
 }
 
 void block_acct_add_interval(BlockAcctStats *stats, unsigned interval_length)
@@ -58,12 +60,15 @@ void block_acct_add_interval(BlockAcctStats *stats, unsigned interval_length)
 
 s = g_new0(BlockAcctTimedStats, 1);
 s->interval_length = interval_length;
+s->stats = stats;
+qemu_mutex_lock(&stats->lock);
 QSLIST_INSERT_HEAD(&stats->intervals, s, entries);
 
 for (i = 0; i < BLOCK_MAX_IOTYPE; i++) {
 timed_average_init(&s->latency[i], clock_type,
 (uint64_t) interval_length * NANOSECONDS_PER_SECOND);
 }
+qemu_mutex_unlock(&stats->lock);
 }
 
 BlockAcctTimedStats *block_acct_interval_next(BlockAcctStats *stats,
@@ -99,6 +104,8 @@ static void block_account_one_io(BlockAcctStats *stats, BlockAcctCookie *cookie,
 
 assert(cookie->type < BLOCK_MAX_IOTYPE);
 
+qemu_mutex_lock(&stats->lock);
+
 if (failed) {
 stats->failed_ops[cookie->type]++;
 } else {
@@ -114,6 +121,8 @@ static void block_account_one_io(BlockAcctStats *stats, BlockAcctCookie *cookie,
 timed_average_account(&s->latency[cookie->type], latency_ns);
 }
 }
+
+qemu_mutex_unlock(&stats->lock);
 }
 
 void block_acct_done(BlockAcctStats *stats, BlockAcctCookie *cookie)
@@ -134,18 +143,23 @@ void block_acct_invalid(BlockAcctStats *stats, enum BlockAcctType type)
  * not.  The reason is that invalid requests are accounted during their
  * submission, therefore there's no actual I/O involved.
  */
+qemu_mutex_lock(&stats->lock);
 stats->invalid_ops[type]++;
 
 if (stats->account_invalid) {
 stats->last_access_time_ns = qemu_clock_get_ns(clock_type);
 }
+qemu_mutex_unlock(&stats->lock);
 }
 
 void block_acct_merge_done(BlockAcctStats *stats, enum BlockAcctType type,
   int num_requests)
 {
 assert(type < BLOCK_MAX_IOTYPE);
+
+qemu_mutex_lock(&stats->lock);
 stats->merged[type] += num_requests;
+qemu_mutex_unlock(&stats->lock);
 }
 
 int64_t block_acct_idle_time_ns(BlockAcctStats *stats)
@@ -160,7 +174,9 @@ double block_acct_queue_depth(BlockAcctTimedStats *stats,
 
 assert(type < BLOCK_MAX_IOTYPE);
 
+qemu_mutex_lock(&stats->stats->lock);
 sum = timed_average_sum(&stats->latency[type], &elapsed);
+qemu_mutex_unlock(&stats->stats->lock);
 
 return (double) sum / elapsed;
 }
diff --git a/include/block/accounting.h b/include/block/accounting.h
index 2089163..2687be7 100644
--- a/include/block/accounting.h
+++ b/include/block/accounting.h
@@ -26,8 +26,10 @@
 #define BLOCK_ACCOUNTING_H
 
 #include "qemu/timed-average.h"
+#include "qemu/thread.h"
 
 typedef struct BlockAcctTimedStats BlockAcctTimedStats;
+typedef struct BlockAcctStats BlockAcctStats;
 
 enum BlockAcctType {
 BLOCK_ACCT_READ,
@@ -37,12 +39,14 @@ enum BlockAcctType {
 };
 
 struct BlockAcctTimedStats {
+BlockAcctStats *stats;
 TimedAverage latency[BLOCK_MAX_IOTYPE];
 unsigned interval_length; /* in seconds */
 QSLIST_ENTRY(BlockAcctTimedStats) entries;
 };
 
-typedef struct BlockAcctStats {
+struct BlockAcctStats {
+QemuMutex lock;
 uint64_t nr_bytes[BLOCK_MAX_IOTYPE];
 uint64_t nr_ops[BLOCK_MAX_IOTYPE];
 uint64_t invalid_ops[BLOCK_MAX_IOTYPE];
@@ -53,7 +57,7 @@ typedef struct BlockAcctStats {
 QSLIST_HEAD(, BlockAcctTimedStats) intervals;
 bool account_invalid;
 bool account_failed;
-} BlockAcctStats;
+};
 
 typedef struct BlockAcctCookie {
 int64_t bytes;
-- 
2.9.4
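
The locking pattern this patch applies can be sketched outside QEMU with
plain pthreads. This is only an illustration under stated assumptions:
Stats, TimedStats, MAX_IOTYPE and the helper names are hypothetical
stand-ins, and pthread_mutex_t replaces QemuMutex. The point it shows is
the back-pointer from the per-interval structure to its owner, so that
readers holding only the interval can still take the owner's lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define MAX_IOTYPE 3    /* hypothetical stand-in for BLOCK_MAX_IOTYPE */

typedef struct Stats Stats;

/* Per-interval data keeps a back-pointer so it can find the owner's lock. */
typedef struct TimedStats {
    Stats *stats;
    uint64_t latency_sum[MAX_IOTYPE];
} TimedStats;

struct Stats {
    pthread_mutex_t lock;           /* protects every field below */
    uint64_t merged[MAX_IOTYPE];
    TimedStats interval;
};

static void stats_init(Stats *s)
{
    pthread_mutex_init(&s->lock, NULL);
    for (int i = 0; i < MAX_IOTYPE; i++) {
        s->merged[i] = 0;
        s->interval.latency_sum[i] = 0;
    }
    s->interval.stats = s;          /* same idea as "s->stats = stats" */
}

static void stats_merge_done(Stats *s, int type, int num_requests)
{
    assert(type < MAX_IOTYPE);
    pthread_mutex_lock(&s->lock);
    s->merged[type] += num_requests;
    pthread_mutex_unlock(&s->lock);
}

static void stats_account_latency(Stats *s, int type, uint64_t ns)
{
    assert(type < MAX_IOTYPE);
    pthread_mutex_lock(&s->lock);
    s->interval.latency_sum[type] += ns;
    pthread_mutex_unlock(&s->lock);
}

/* Reader that only has the interval: reach the lock via the back-pointer. */
static uint64_t timed_latency_sum(TimedStats *t, int type)
{
    uint64_t sum;

    assert(type < MAX_IOTYPE);
    pthread_mutex_lock(&t->stats->lock);
    sum = t->latency_sum[type];
    pthread_mutex_unlock(&t->stats->lock);
    return sum;
}

static void stats_cleanup(Stats *s)
{
    pthread_mutex_destroy(&s->lock);
}
```

A single lock per stats object keeps the sketch simple; whether finer
granularity would ever be worthwhile is not something this thread
discusses.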

Re: [Qemu-devel] [PATCH v2 13/45] json: learn to parse uint64 numbers

2017-06-02 Thread Markus Armbruster

Marc-André Lureau  writes:

> Switch strtoll() usage to qemu_strtoi64() helper while at it.
>
> Add a few tests for large numbers.
>
> Signed-off-by: Marc-André Lureau 
> ---
>  qobject/json-lexer.c  |  4 
>  qobject/json-parser.c | 30 --
>  tests/check-qjson.c   | 37 +
>  3 files changed, 65 insertions(+), 6 deletions(-)
>
> diff --git a/qobject/json-lexer.c b/qobject/json-lexer.c
> index af4a75e05b..980ba159d6 100644
> --- a/qobject/json-lexer.c
> +++ b/qobject/json-lexer.c
> @@ -227,15 +227,18 @@ static const uint8_t json_lexer[][256] =  {
>  /* escape */
>  [IN_ESCAPE_LL] = {
>  ['d'] = JSON_ESCAPE,
> +['u'] = JSON_ESCAPE,
>  },
>  
>  [IN_ESCAPE_L] = {
>  ['d'] = JSON_ESCAPE,
>  ['l'] = IN_ESCAPE_LL,
> +['u'] = JSON_ESCAPE,
>  },
>  
>  [IN_ESCAPE_I64] = {
>  ['d'] = JSON_ESCAPE,
> +['u'] = JSON_ESCAPE,
>  },
>  
>  [IN_ESCAPE_I6] = {
> @@ -251,6 +254,7 @@ static const uint8_t json_lexer[][256] =  {
>  ['i'] = JSON_ESCAPE,
>  ['p'] = JSON_ESCAPE,
>  ['s'] = JSON_ESCAPE,
> +['u'] = JSON_ESCAPE,
>  ['f'] = JSON_ESCAPE,
>  ['l'] = IN_ESCAPE_L,
>  ['I'] = IN_ESCAPE_I,
> diff --git a/qobject/json-parser.c b/qobject/json-parser.c
> index b90b2fb45a..62dcac8128 100644
> --- a/qobject/json-parser.c
> +++ b/qobject/json-parser.c
> @@ -12,6 +12,7 @@
>   */
>  
>  #include "qemu/osdep.h"
> +#include "qemu/cutils.h"
>  #include "qapi/error.h"
>  #include "qemu-common.h"
>  #include "qapi/qmp/types.h"
> @@ -472,6 +473,13 @@ static QObject *parse_escape(JSONParserContext *ctxt, 
> va_list *ap)
>  } else if (!strcmp(token->str, "%lld") ||
> !strcmp(token->str, "%I64d")) {
>  return QOBJECT(qnum_from_int(va_arg(*ap, long long)));
> +} else if (!strcmp(token->str, "%u")) {
> +return QOBJECT(qnum_from_uint(va_arg(*ap, unsigned int)));
> +} else if (!strcmp(token->str, "%lu")) {
> +return QOBJECT(qnum_from_uint(va_arg(*ap, unsigned long)));
> +} else if (!strcmp(token->str, "%llu") ||
> +   !strcmp(token->str, "%I64u")) {
> +return QOBJECT(qnum_from_uint(va_arg(*ap, unsigned long long)));
>  } else if (!strcmp(token->str, "%s")) {
>  return QOBJECT(qstring_from_str(va_arg(*ap, const char *)));
>  } else if (!strcmp(token->str, "%f")) {
> @@ -493,20 +501,30 @@ static QObject *parse_literal(JSONParserContext *ctxt)
>  case JSON_INTEGER: {
>  /*
>   * Represent JSON_INTEGER as QNUM_I64 if possible, else as
> - * QNUM_DOUBLE. Note that strtoll() fails with ERANGE when
> - * it's not possible.
> + * QNUM_U64, else as QNUM_DOUBLE.  Note that qemu_strtoi64()
> + * and qemu_strtou64 fail with ERANGE when it's not possible.

qemu_strtou64(), please.

>   *
>   * qnum_get_int() will then work for any signed 64-bit
> - * JSON_INTEGER, and qnum_get_double both for any JSON_INTEGER
> + * JSON_INTEGER, qnum_get_uint() for any unsigned 64-bit
> + * integer, and qnum_get_double() both for any JSON_INTEGER
>   * and any JSON_FLOAT.
>   */
> +int ret;
>  int64_t value;
> +uint64_t uvalue;
>  
> -errno = 0; /* strtoll doesn't set errno on success */
> -value = strtoll(token->str, NULL, 10);
> -if (errno != ERANGE) {
> +ret = qemu_strtoi64(token->str, NULL, 10, &value);
> +if (!ret) {
>  return QOBJECT(qnum_from_int(value));
>  }
> +assert(ret == -ERANGE);
> +
> +if (token->str[0] != '-') {
> +ret = qemu_strtou64(token->str, NULL, 10, &uvalue);
> +if (!ret) {
> +return QOBJECT(qnum_from_uint(uvalue));
> +}

assert(ret == -ERANGE), please.
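
For readers following along, the fallback order described in the quoted
comment (signed 64-bit first, then unsigned 64-bit, then double) can be
sketched standalone. This is an assumption-laden illustration, not the
QEMU code: it uses plain strtoll()/strtoull()/strtod() instead of the
qemu_strto*() helpers, and Num, NumKind and parse_integer_literal() are
hypothetical names. The input is assumed to be a well-formed decimal
integer, as the lexer would guarantee.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

typedef enum { NUM_I64, NUM_U64, NUM_DOUBLE } NumKind;

typedef struct {
    NumKind kind;
    union {
        int64_t i64;
        uint64_t u64;
        double dbl;
    } u;
} Num;

static Num parse_integer_literal(const char *str)
{
    Num n;

    /* First choice: signed 64-bit. */
    errno = 0;  /* strtoll() does not clear errno on success */
    long long sv = strtoll(str, NULL, 10);
    if (errno == 0) {
        n.kind = NUM_I64;
        n.u.i64 = sv;
        return n;
    }
    assert(errno == ERANGE);    /* only overflow is expected here */

    /* Second choice: unsigned 64-bit, for non-negative input only. */
    if (str[0] != '-') {
        errno = 0;
        unsigned long long uv = strtoull(str, NULL, 10);
        if (errno == 0) {
            n.kind = NUM_U64;
            n.u.u64 = uv;
            return n;
        }
        assert(errno == ERANGE);
    }

    /* Last resort: fall through to a double, losing precision. */
    n.kind = NUM_DOUBLE;
    n.u.dbl = strtod(str, NULL);
    return n;
}
```

The explicit check for a leading '-' matters because strtoull() accepts
negative input and negates it instead of reporting an error.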

> +}
>  /* fall through to JSON_FLOAT */
>  }
>  case JSON_FLOAT:
> diff --git a/tests/check-qjson.c b/tests/check-qjson.c
> index 8ec728a702..6fb14445a3 100644
> --- a/tests/check-qjson.c
> +++ b/tests/check-qjson.c
> @@ -906,6 +906,42 @@ static void simple_number(void)
>  }
>  }
>  
> +static void large_number(void)
> +{
> +const char *maxu64 = "18446744073709551615"; /* 2^64-1 */
> +const char *gtu64 = "18446744073709551616"; /* 2^64 */
> +const char *range = "-9223372036854775809";

Why is this called @range?

Let's add /* -2^63-1 */.

> +QNum *qnum;
> +QString *str;
> +uint64_t val;
> +
> +qnum = qobject_to_qnum(qobject_from_json(maxu64, &error_abort));
> +g_assert(qnum);
> +g_assert(qnum_get_uint(qnum, &val));
> +g_assert_cmpuint(val, ==, 18446744073709551615U);
> +
> +str = qobject_to_json(QOBJECT(qnum));
> +g_assert_cmpstr(qstring_get_str(str), ==, maxu64);
> +QDECREF(str);
> +QDECREF(qnum);
> +
> +qnum = qobject_to_qnum(qobject_from_json(gtu64, &error_abor

Re: [Qemu-devel] [PATCH v8 19/20] qcow2: report encryption specific image information

2017-06-02 Thread Alberto Garcia

On Thu 01 Jun 2017 07:27:33 PM CEST, Daniel P. Berrange wrote:
> Currently 'qemu-img info' reports a simple "encrypted: yes"
> field. This is not very useful now that qcow2 can support
> multiple encryption formats. Users want to know which format
> is in use and some data related to it.

Reviewed-by: Alberto Garcia 

Berto

Re: [Qemu-devel] [PATCH] vhost-user-bridge: fix iov_restore_front() warning

2017-06-02 Thread Dr. David Alan Gilbert

* Marc-André Lureau (marcandre.lur...@redhat.com) wrote:
>   CC  tests/vhost-user-bridge.o
> /home/dgilbert/git/qemu-world3/tests/vhost-user-bridge.c:228:23: warning: 
> variables 'front' and 'iov' used in loop condition not modified in loop body 
> [-Wfor-loop-analysis]
> for (cur = front; front != iov; cur++) {
>   ^~~~
> 1 warning generated.
> 
> Fix the loop, document the function, and fix some related assert().
> 
> In practice, the loop bug was harmless because the front sg buffer is
> enough to discard/restore the header size.
> 
> Reported-by: Dr. David Alan Gilbert 
> Signed-off-by: Marc-André Lureau 

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  tests/vhost-user-bridge.c | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/tests/vhost-user-bridge.c b/tests/vhost-user-bridge.c
> index 8618c20d53..1e5b5ca3da 100644
> --- a/tests/vhost-user-bridge.c
> +++ b/tests/vhost-user-bridge.c
> @@ -220,12 +220,18 @@ vubr_handle_tx(VuDev *dev, int qidx)
>  free(elem);
>  }
>  
> +
> +/* this function reverse the effect of iov_discard_front() it must be
> + * called with 'front' being the original struct iovec and 'bytes'
> + * being the number of bytes you shaved off
> + */
>  static void
>  iov_restore_front(struct iovec *front, struct iovec *iov, size_t bytes)
>  {
>  struct iovec *cur;
>  
> -for (cur = front; front != iov; cur++) {
> +for (cur = front; cur != iov; cur++) {
> +assert(bytes >= cur->iov_len);
>  bytes -= cur->iov_len;
>  }
>  
> @@ -302,7 +308,8 @@ vubr_backend_recv_cb(int sock, void *ctx)
>  }
>  iov_from_buf(sg, elem->in_num, 0, &hdr, sizeof hdr);
>  total += hdrlen;
> -assert(iov_discard_front(&sg, &num, hdrlen) == hdrlen);
> +ret = iov_discard_front(&sg, &num, hdrlen);
> +assert(ret == hdrlen);
>  }
>  
>  struct msghdr msg = {
> -- 
> 2.13.0.91.g00982b8dd
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
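
A standalone sketch of the discard/restore round trip may help show why
the loop condition has to test 'cur' rather than 'front'. Assumptions:
discard_front() is a hypothetical stand-in for iov_discard_front(), and
the two statements after the loop in restore_front() are a guess at the
part of the function the hunk does not show.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/*
 * Hypothetical stand-in for iov_discard_front(): drop 'bytes' from the
 * front of the vector, skipping entries that are consumed entirely.
 * Returns the number of bytes actually dropped.
 */
static size_t discard_front(struct iovec **iov, unsigned *cnt, size_t bytes)
{
    size_t total = 0;
    struct iovec *cur;

    for (cur = *iov; *cnt > 0; cur++) {
        if (cur->iov_len > bytes) {
            cur->iov_base = (char *)cur->iov_base + bytes;
            cur->iov_len -= bytes;
            total += bytes;
            break;
        }
        bytes -= cur->iov_len;
        total += cur->iov_len;
        (*cnt)--;
    }
    *iov = cur;
    return total;
}

/*
 * Undo discard_front(): 'front' is the original start of the vector,
 * 'iov' the entry the discard stopped at, 'bytes' the amount dropped.
 */
static void restore_front(struct iovec *front, struct iovec *iov, size_t bytes)
{
    struct iovec *cur;

    /* Testing 'front != iov' here would never terminate once front != iov. */
    for (cur = front; cur != iov; cur++) {
        assert(bytes >= cur->iov_len);
        bytes -= cur->iov_len;
    }

    /* Assumed tail: give the remaining bytes back to the partial entry. */
    cur->iov_base = (char *)cur->iov_base - bytes;
    cur->iov_len += bytes;
}
```

When the header fits in the first buffer, 'front' equals 'iov' and the
loop body never runs, which matches the commit message's remark that
the bug was harmless in practice.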

Re: [Qemu-devel] [PATCH v3 1/1] coroutine-lock: do not touch coroutine after another one has been entered

2017-06-02 Thread Stefan Hajnoczi

On Thu, Jun 01, 2017 at 06:08:47PM +0200, Roman Pen wrote:
> Submission of requests on linux aio is a bit tricky and can lead to
> requests completions on submission path:
> 
> 44713c9e8547 ("linux-aio: Handle io_submit() failure gracefully")
> 0ed93d84edab ("linux-aio: process completions from ioq_submit()")
> 
> That means that any coroutine which has been yielded in order to wait
> for completion can be resumed from submission path and be eventually
> terminated (freed).
> 
> The following use-after-free crash was observed when IO throttling
> was enabled:
> 
>  Program received signal SIGSEGV, Segmentation fault.
>  [Switching to Thread 0x7f5813dff700 (LWP 56417)]
>  virtqueue_unmap_sg (elem=0x7f5804009a30, len=1, vq=) at 
> virtio.c:252
>  (gdb) bt
>  #0  virtqueue_unmap_sg (elem=0x7f5804009a30, len=1, vq=) at 
> virtio.c:252
>   ^^
>   remember the address
> 
>  #1  virtqueue_fill (vq=0x5598b20d21b0, elem=0x7f5804009a30, len=1, idx=0) at 
> virtio.c:282
>  #2  virtqueue_push (vq=0x5598b20d21b0, elem=elem@entry=0x7f5804009a30, 
> len=) at virtio.c:308
>  #3  virtio_blk_req_complete (req=req@entry=0x7f5804009a30, 
> status=status@entry=0 '\000') at virtio-blk.c:61
>  #4  virtio_blk_rw_complete (opaque=, ret=0) at 
> virtio-blk.c:126
>  #5  blk_aio_complete (acb=0x7f58040068d0) at block-backend.c:923
>  #6  coroutine_trampoline (i0=, i1=) at 
> coroutine-ucontext.c:78
> 
>  (gdb) p * elem
>  $8 = {index = 77, out_num = 2, in_num = 1,
>in_addr = 0x7f5804009ad8, out_addr = 0x7f5804009ae0,
>in_sg = 0x0, out_sg = 0x7f5804009a50}
>
>'in_sg' and 'out_sg' are invalid.
>e.g. it is impossible that 'in_sg' is zero,
>instead its value must be equal to:
> 
>(gdb) p/x 0x7f5804009ad8 + sizeof(elem->in_addr[0]) + 2 * 
> sizeof(elem->out_addr[0])
>$26 = 0x7f5804009af0
> 
> Seems 'elem' was corrupted.  Meanwhile another thread raised an abort:
> 
>  Thread 12 (Thread 0x7f57f2ffd700 (LWP 56426)):
>  #0  raise () from /lib/x86_64-linux-gnu/libc.so.6
>  #1  abort () from /lib/x86_64-linux-gnu/libc.so.6
>  #2  qemu_coroutine_enter (co=0x7f5804009af0) at qemu-coroutine.c:113
>  #3  qemu_co_queue_run_restart (co=0x7f5804009a30) at qemu-coroutine-lock.c:60
>  #4  qemu_coroutine_enter (co=0x7f5804009a30) at qemu-coroutine.c:119
>^^
>WTF?? this is equal to elem from crashed thread
> 
>  #5  qemu_co_queue_run_restart (co=0x7f57e7f16ae0) at qemu-coroutine-lock.c:60
>  #6  qemu_coroutine_enter (co=0x7f57e7f16ae0) at qemu-coroutine.c:119
>  #7  qemu_co_queue_run_restart (co=0x7f5807e112a0) at qemu-coroutine-lock.c:60
>  #8  qemu_coroutine_enter (co=0x7f5807e112a0) at qemu-coroutine.c:119
>  #9  qemu_co_queue_run_restart (co=0x7f5807f17820) at qemu-coroutine-lock.c:60
>  #10 qemu_coroutine_enter (co=0x7f5807f17820) at qemu-coroutine.c:119
>  #11 qemu_co_queue_run_restart (co=0x7f57e7f18e10) at qemu-coroutine-lock.c:60
>  #12 qemu_coroutine_enter (co=0x7f57e7f18e10) at qemu-coroutine.c:119
>  #13 qemu_co_enter_next (queue=queue@entry=0x5598b1e742d0) at 
> qemu-coroutine-lock.c:106
>  #14 timer_cb (blk=0x5598b1e74280, is_write=) at 
> throttle-groups.c:419
> 
> Crash can be explained by access of 'co' object from the loop inside
> qemu_co_queue_run_restart():
> 
>   while ((next = QSIMPLEQ_FIRST(&co->co_queue_wakeup))) {
>   QSIMPLEQ_REMOVE_HEAD(&co->co_queue_wakeup, co_queue_next);
>
>on each iteration 'co' is accessed,
>but 'co' can be already freed
> 
>   qemu_coroutine_enter(next);
>   }
> 
> When 'next' coroutine is resumed (entered) it can in its turn resume
> 'co', and eventually free it.  That's why we see 'co' (which was freed)
> has the same address as 'elem' from the first backtrace.
> 
> The fix is obvious: use temporary queue and do not touch coroutine after
> first qemu_coroutine_enter() is invoked.
> 
> The issue is quite rare and happens every ~12 hours on very high IO
> and CPU load (building linux kernel with -j512 inside guest) when IO
> throttling is enabled.  With the fix applied guest is running ~35 hours
> and is still alive so far.
> 
> Signed-off-by: Roman Pen 
> Cc: Paolo Bonzini 
> Cc: Fam Zheng 
> Cc: Stefan Hajnoczi 
> Cc: Kevin Wolf 
> Cc: qemu-devel@nongnu.org
> ---
>  v3:
>  Comments tweaks suggested by Stefan.
>  v2:
>  Comments tweaks suggested by Paolo.
> 
>  util/qemu-coroutine-lock.c | 19 +--
>  util/qemu-coroutine.c  |  5 +
>  2 files changed, 22 insertions(+), 2 deletions(-)

Reviewed-by: Stefan Hajnoczi 
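
The "temporary queue" idea from the commit message can be sketched
without any QEMU types. Everything here is hypothetical (Task,
task_enter(), run_restart() and the demo bodies are not QEMU names); the
sketch only shows the ordering: detach the whole wakeup list from 'co'
first, then enter the waiters without touching 'co' again, since any of
them may end up freeing it.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Task Task;
struct Task {
    Task *next;              /* link while queued on a wakeup list */
    Task *wakeup;            /* tasks to restart once this task yields */
    void (*fn)(Task *self);  /* body executed when the task is entered */
    Task *victim;            /* demo only: task freed by fn, or NULL */
};

static int entered;          /* demo only: number of tasks entered */

static void task_enter(Task *t)
{
    entered++;
    t->fn(t);
}

/* Demo body: entering this task terminates (frees) another task. */
static void body_free_victim(Task *self)
{
    free(self->victim);
    self->victim = NULL;
}

static void body_nop(Task *self)
{
    (void)self;
}

static void run_restart(Task *co)
{
    /* Move the pending list to a local variable before entering anyone. */
    Task *pending = co->wakeup;
    co->wakeup = NULL;

    /* From here on 'co' must not be dereferenced: it may be freed. */
    while (pending) {
        Task *next = pending;

        pending = next->next;   /* read the link before entering 'next' */
        next->next = NULL;
        task_enter(next);
    }
}
```

The buggy shape would re-read co->wakeup on every iteration, which is
exactly the access that hits freed memory in the backtraces above.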



Re: [Qemu-devel] [PATCH 09/25] block/dirty-bitmap: add readonly field to BdrvDirtyBitmap

2017-06-02 Thread Vladimir Sementsov-Ogievskiy


02.06.2017 02:25, John Snow wrote:


On 06/01/2017 03:30 AM, Sementsov-Ogievskiy Vladimir wrote:

Hi John!

Look at our discussion about this in v18 thread.

In short: readonly is not the same as disabled. disabled = the bitmap just
ignores all writes. readonly = writes are not allowed at all.
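
The distinction can be sketched in a few lines. This is only an
illustration of the two semantics as described here, not the patch
itself: Bitmap and bitmap_guest_write() are hypothetical names, and
returning -EPERM for the readonly case is an assumption about how a
refused write might be reported.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool disabled;   /* writes proceed, but the bitmap does not record them */
    bool readonly;   /* writes must not happen at all */
    uint64_t bits;   /* one dirty bit per (hypothetical) chunk */
} Bitmap;

/* Returns 0 if the guest write may proceed, -EPERM if it is refused. */
static int bitmap_guest_write(Bitmap *bm, unsigned bit)
{
    assert(bit < 64);

    if (bm->readonly) {
        return -EPERM;                  /* the write itself is rejected */
    }
    if (!bm->disabled) {
        bm->bits |= UINT64_C(1) << bit; /* record the dirtied chunk */
    }
    return 0;                           /* disabled: silently not tracked */
}
```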

And I think I'll try to go with way 2: a "dirty" field instead of
"readonly" (see the v18 discussion), as it is a bit more flexible.


Not sure which I prefer...

Method 1 is attractive in that it is fairly simple, and enforces fairly
loudly the inability to write to devices with RO bitmaps. It's a natural
extension of your current approach.


For now I decided to implement this one; I think I'll publish it today.
Also, I'm going to rename s/readonly/in_use to avoid confusion with
disabled. So let this field just be a mirror of IN_USE in the image and
just say "persistent storage knows that the bitmap is in use and may be dirty".


Also, the optimization with a 'dirty' flag may be added later.



Method 2 is attractive in that it seems a little more efficient, and is
a little more clever. A dirty flag lets us avoid flushing bitmaps we
never even changed (though we still need to clean up the in_use flags.)

What I wonder about #2 is what happens when a write sneaks in (due to a
bug or a use case we didn't see) on a bitmap attached to a read-only
node. We fail later on invalidate? It shouldn't happen in normal
circumstances, but I worry that the failure mode is messier.


If the bitmap is dirty, all is OK: the problems will appear when we try to
save it, but these problems are not fatal. The bitmap should be marked
'in_use' in the image, so it will be lost (the worst case is when in_use
is not set and the bitmap is incorrect; that may lead to data loss for the user).


If it is not dirty, we will fail to write 'in_use' before the actual write,
and the whole write will fail.





Well, either way I will be happy for now I think -- pick whichever
option feels easiest or best for you to implement.

Thanks!


On 01.06.2017 02:48, John Snow wrote:

On 05/30/2017 04:17 AM, Vladimir Sementsov-Ogievskiy wrote:

It will be needed in following commits for persistent bitmaps.
If bitmap is loaded from read-only storage (and we can't mark it
"in use" in this storage) corresponding BdrvDirtyBitmap should be
read-only.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
   block/dirty-bitmap.c | 28 
   block/io.c   |  8 
   blockdev.c   |  6 ++
   include/block/dirty-bitmap.h |  4 
   4 files changed, 46 insertions(+)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 90af37287f..733f19ca5e 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -44,6 +44,8 @@ struct BdrvDirtyBitmap {
   int64_t size;   /* Size of the bitmap (Number of
sectors) */
   bool disabled;  /* Bitmap is read-only */
   int active_iterators;   /* How many iterators are active */
+bool readonly;  /* Bitmap is read-only and may be
changed only
+   by deserialize* functions */
   QLIST_ENTRY(BdrvDirtyBitmap) list;
   };
   @@ -436,6 +438,7 @@ void bdrv_set_dirty_bitmap(BdrvDirtyBitmap
*bitmap,
  int64_t cur_sector, int64_t nr_sectors)
   {
   assert(bdrv_dirty_bitmap_enabled(bitmap));
+assert(!bdrv_dirty_bitmap_readonly(bitmap));

Not reasonable to add the condition for !readonly into
bdrv_dirty_bitmap_enabled?

As is:

If readonly is set to true on a bitmap, bdrv_dirty_bitmap_status is
going to return ACTIVE for such bitmaps, but DISABLED might be more
appropriate to indicate the read-only nature.

If you add this condition into _enabled(), you can skip the extra
assertions you've added here.


   hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
   }
   @@ -443,12 +446,14 @@ void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap
*bitmap,
int64_t cur_sector, int64_t nr_sectors)
   {
   assert(bdrv_dirty_bitmap_enabled(bitmap));
+assert(!bdrv_dirty_bitmap_readonly(bitmap));
   hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
   }
 void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out)
   {
   assert(bdrv_dirty_bitmap_enabled(bitmap));
+assert(!bdrv_dirty_bitmap_readonly(bitmap));
   if (!out) {
   hbitmap_reset_all(bitmap->bitmap);
   } else {
@@ -519,6 +524,7 @@ void bdrv_set_dirty(BlockDriverState *bs, int64_t
cur_sector,
   if (!bdrv_dirty_bitmap_enabled(bitmap)) {
   continue;
   }
+assert(!bdrv_dirty_bitmap_readonly(bitmap));
   hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
   }
   }
@@ -540,3 +546,25 @@ int64_t
bdrv_get_meta_dirty_count(BdrvDirtyBitmap *bitmap)
   {
   return hbitmap_count(bitmap->meta);
   }
+
+bool bdrv_dirty_bitmap_readonly(const BdrvDirtyBitmap *bitmap)
+{
+return bitmap->readonly;
+}
+
+v

Re: [Qemu-devel] [Qemu-arm] [PATCH 09/13] armv7m: Implement M profile default memory map

2017-06-02 Thread Peter Maydell

On 2 June 2017 at 06:10, Philippe Mathieu-Daudé  wrote:
> On 05/30/2017 12:11 PM, Peter Maydell wrote:
>> This is the arm of the if() that deals with R profile, and R profile's
>
>
> Oh I completely misunderstood that if() indeed. R and also A I suppose.

A profile is never PMSA -- arguably VMSA (MMU) vs PMSA (MPU) is the defining
distinction between A profile and R profile cores.

thanks
-- PMM

Re: [Qemu-devel] [PATCH 09/25] block/dirty-bitmap: add readonly field to BdrvDirtyBitmap

2017-06-02 Thread Vladimir Sementsov-Ogievskiy


02.06.2017 11:56, Vladimir Sementsov-Ogievskiy wrote:

02.06.2017 02:25, John Snow wrote:


On 06/01/2017 03:30 AM, Sementsov-Ogievskiy Vladimir wrote:

Hi John!

Look at our discussion about this in v18 thread.

In short: readonly is not the same as disabled. disabled = the bitmap just
ignores all writes. readonly = writes are not allowed at all.

And I think I'll try to go with way 2: a "dirty" field instead of
"readonly" (see the v18 discussion), as it is a bit more flexible.


Not sure which I prefer...

Method 1 is attractive in that it is fairly simple, and enforces fairly
loudly the inability to write to devices with RO bitmaps. It's a natural
extension of your current approach.


For now I decided to implement this one; I think I'll publish it today.
Also, I'm going to rename s/readonly/in_use to avoid confusion
with disabled. So let this field just be a mirror of IN_USE in the image
and just say "persistent storage knows that the bitmap is in use and may
be dirty".


Also, the optimization with a 'dirty' flag may be added later.


And also, I don't want to affect this "first write", on which we
would set "IN_USE" in all bitmaps (for way 2). Reopening rw is a less
performance-critical place than a write.







Method 2 is attractive in that it seems a little more efficient, and is
a little more clever. A dirty flag lets us avoid flushing bitmaps we
never even changed (though we still need to clean up the in_use flags.)

What I wonder about #2 is what happens when a write sneaks in (due to a
bug or a use case we didn't see) on a bitmap attached to a read-only
node. We fail later on invalidate? It shouldn't happen in normal
circumstances, but I worry that the failure mode is messier.


If the bitmap is dirty, all is OK: the problems will appear when we try
to save it, but these problems are not fatal. The bitmap should be marked
'in_use' in the image, so it will be lost (the worst case is when
in_use is not set and the bitmap is incorrect; that may lead to data loss
for the user).


If it is not dirty, we will fail to write 'in_use' before the actual
write, and the whole write will fail.





Well, either way I will be happy for now I think -- pick whichever
option feels easiest or best for you to implement.

Thanks!


On 01.06.2017 02:48, John Snow wrote:

On 05/30/2017 04:17 AM, Vladimir Sementsov-Ogievskiy wrote:

It will be needed in following commits for persistent bitmaps.
If bitmap is loaded from read-only storage (and we can't mark it
"in use" in this storage) corresponding BdrvDirtyBitmap should be
read-only.

Signed-off-by: Vladimir Sementsov-Ogievskiy 


---
   block/dirty-bitmap.c | 28 
   block/io.c   |  8 
   blockdev.c   |  6 ++
   include/block/dirty-bitmap.h |  4 
   4 files changed, 46 insertions(+)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 90af37287f..733f19ca5e 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -44,6 +44,8 @@ struct BdrvDirtyBitmap {
   int64_t size;   /* Size of the bitmap (Number of
sectors) */
   bool disabled;  /* Bitmap is read-only */
   int active_iterators;   /* How many iterators are 
active */

+bool readonly;  /* Bitmap is read-only and may be
changed only
+   by deserialize* functions */
   QLIST_ENTRY(BdrvDirtyBitmap) list;
   };
   @@ -436,6 +438,7 @@ void bdrv_set_dirty_bitmap(BdrvDirtyBitmap
*bitmap,
  int64_t cur_sector, int64_t nr_sectors)
   {
   assert(bdrv_dirty_bitmap_enabled(bitmap));
+assert(!bdrv_dirty_bitmap_readonly(bitmap));

Not reasonable to add the condition for !readonly into
bdrv_dirty_bitmap_enabled?

As is:

If readonly is set to true on a bitmap, bdrv_dirty_bitmap_status is
going to return ACTIVE for such bitmaps, but DISABLED might be more
appropriate to indicate the read-only nature.

If you add this condition into _enabled(), you can skip the extra
assertions you've added here.


hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
   }
   @@ -443,12 +446,14 @@ void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap
*bitmap,
int64_t cur_sector, int64_t 
nr_sectors)

   {
   assert(bdrv_dirty_bitmap_enabled(bitmap));
+assert(!bdrv_dirty_bitmap_readonly(bitmap));
   hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
   }
 void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap 
**out)

   {
   assert(bdrv_dirty_bitmap_enabled(bitmap));
+assert(!bdrv_dirty_bitmap_readonly(bitmap));
   if (!out) {
   hbitmap_reset_all(bitmap->bitmap);
   } else {
@@ -519,6 +524,7 @@ void bdrv_set_dirty(BlockDriverState *bs, int64_t
cur_sector,
   if (!bdrv_dirty_bitmap_enabled(bitmap)) {
   continue;
   }
+assert(!bdrv_dirty_bitmap_readonly(bitmap));
   hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
   }

Re: [Qemu-devel] [PATCH 09/25] block/dirty-bitmap: add readonly field to BdrvDirtyBitmap

2017-06-02 Thread Vladimir Sementsov-Ogievskiy


02.06.2017 12:01, Vladimir Sementsov-Ogievskiy wrote:

02.06.2017 11:56, Vladimir Sementsov-Ogievskiy wrote:

02.06.2017 02:25, John Snow wrote:


On 06/01/2017 03:30 AM, Sementsov-Ogievskiy Vladimir wrote:

Hi John!

Look at our discussion about this in v18 thread.

In short: readonly is not the same as disabled. disabled = the bitmap just
ignores all writes. readonly = writes are not allowed at all.

And I think I'll try to go with way 2: a "dirty" field instead of
"readonly" (see the v18 discussion), as it is a bit more flexible.


Not sure which I prefer...

Method 1 is attractive in that it is fairly simple, and enforces fairly
loudly the inability to write to devices with RO bitmaps. It's a natural
extension of your current approach.


For now I decided to implement this one; I think I'll publish it today.
Also, I'm going to rename s/readonly/in_use to avoid confusion
with disabled. So let this field just be a mirror of IN_USE in the
image and just say "persistent storage knows that the bitmap is in use
and may be dirty".


In the end it will be readonly. in_use is bad for created (not loaded)
bitmaps. I'll add more descriptive comments for disabled and readonly.




Also, the optimization with a 'dirty' flag may be added later.


And also, I don't want to affect this "first write", on which we
would set "IN_USE" in all bitmaps (for way 2). Reopening rw is a less
performance-critical place than a write.







Method 2 is attractive in that it seems a little more efficient, and is
a little more clever. A dirty flag lets us avoid flushing bitmaps we
never even changed (though we still need to clean up the in_use flags.)

What I wonder about #2 is what happens when a write sneaks in (due to a
bug or a use case we didn't see) on a bitmap attached to a read-only
node. We fail later on invalidate? It shouldn't happen in normal
circumstances, but I worry that the failure mode is messier.


If the bitmap is dirty, all is OK: the problems will appear when we try
to save it, but these problems are not fatal. The bitmap should be
marked 'in_use' in the image, so it will be lost (the worst case is
when in_use is not set and the bitmap is incorrect; that may lead to data
loss for the user).


If it is not dirty, we will fail to write 'in_use' before the actual
write, and the whole write will fail.





Well, either way I will be happy for now I think -- pick whichever
option feels easiest or best for you to implement.

Thanks!


On 01.06.2017 02:48, John Snow wrote:

On 05/30/2017 04:17 AM, Vladimir Sementsov-Ogievskiy wrote:

It will be needed in following commits for persistent bitmaps.
If bitmap is loaded from read-only storage (and we can't mark it
"in use" in this storage) corresponding BdrvDirtyBitmap should be
read-only.

Signed-off-by: Vladimir Sementsov-Ogievskiy 


---
   block/dirty-bitmap.c | 28 
   block/io.c   |  8 
   blockdev.c   |  6 ++
   include/block/dirty-bitmap.h |  4 
   4 files changed, 46 insertions(+)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 90af37287f..733f19ca5e 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -44,6 +44,8 @@ struct BdrvDirtyBitmap {
   int64_t size;   /* Size of the bitmap (Number of
sectors) */
   bool disabled;  /* Bitmap is read-only */
   int active_iterators;   /* How many iterators are 
active */

+bool readonly;  /* Bitmap is read-only and may be
changed only
+   by deserialize* functions */
   QLIST_ENTRY(BdrvDirtyBitmap) list;
   };
   @@ -436,6 +438,7 @@ void bdrv_set_dirty_bitmap(BdrvDirtyBitmap
*bitmap,
  int64_t cur_sector, int64_t 
nr_sectors)

   {
   assert(bdrv_dirty_bitmap_enabled(bitmap));
+assert(!bdrv_dirty_bitmap_readonly(bitmap));

Not reasonable to add the condition for !readonly into
bdrv_dirty_bitmap_enabled?

As is:

If readonly is set to true on a bitmap, bdrv_dirty_bitmap_status is
going to return ACTIVE for such bitmaps, but DISABLED might be more
appropriate to indicate the read-only nature.

If you add this condition into _enabled(), you can skip the extra
assertions you've added here.


hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
   }
   @@ -443,12 +446,14 @@ void 
bdrv_reset_dirty_bitmap(BdrvDirtyBitmap

*bitmap,
int64_t cur_sector, int64_t 
nr_sectors)

   {
   assert(bdrv_dirty_bitmap_enabled(bitmap));
+assert(!bdrv_dirty_bitmap_readonly(bitmap));
   hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
   }
 void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, 
HBitmap **out)

   {
   assert(bdrv_dirty_bitmap_enabled(bitmap));
+assert(!bdrv_dirty_bitmap_readonly(bitmap));
   if (!out) {
   hbitmap_reset_all(bitmap->bitmap);
   } else {
@@ -519,6 +524,7 @@ void bdrv_set_dirty(BlockDriverState *bs, 
int64_t

cur_sector,

Re: [Qemu-devel] [PATCH v2 1/4] dump: add DumpInfo structure

2017-06-02 Thread Marc-André Lureau

Hi

On Thu, Jun 1, 2017 at 10:19 PM Eric Blake  wrote:

> On 06/01/2017 01:06 PM, Laszlo Ersek wrote:
> > On 06/01/17 15:03, Marc-André Lureau wrote:
> >> One way or another, the guest could communicate various dump info (via
> >> guest agent or vmcoreinfo device) and populate that structure. It can
> >> then be used to augment the dump with various details, as done in the
> >> following patch.
> >>
> >> Signed-off-by: Marc-André Lureau 
> >> ---
> >>  include/sysemu/dump-info.h | 18 ++
> >>  dump.c |  3 +++
> >>  2 files changed, 21 insertions(+)
> >>  create mode 100644 include/sysemu/dump-info.h
> >>
> >> diff --git a/include/sysemu/dump-info.h b/include/sysemu/dump-info.h
> >> new file mode 100644
> >> index 00..d2378e15e2
> >> --- /dev/null
> >> +++ b/include/sysemu/dump-info.h
> >> @@ -0,0 +1,18 @@
> >> +#ifndef DUMP_INFO_H
> >> +#define DUMP_INFO_H
>
> >>
> >
> > Can you please spell out, in the commit message, the reason for
> > introducing a new header file? (I suspect your reason, but it should be
> > documented explicitly.)
>
> Also, should you have a copyright header in the new file?  And does
> MAINTAINERS cover it?
>

None of the dump support is covered. Based on commit history, I can suggest
Wen Congyang, as original author.
Sadly, Qiao Nuohan cannot be reached at his email address today (does anyone
know if he is still contributing?). Laszlo has made significant changes and
reviews too. I can also propose myself to help with reviews.

Wen or Laszlo, do you want to be the main maintainer?
-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH 1/2] qcow2: add reduce image support

2017-06-02 Thread Pavel Butsykin




On 01.06.2017 17:41, Kevin Wolf wrote:

Am 31.05.2017 um 16:43 hat Pavel Butsykin geschrieben:

This patch adds the reduction of the image file for qcow2. As a result, this
allows us to reduce the virtual image size and free up space on the disk without
copying the image. Image can be fragmented and reduction is done by punching
holes in the image file.

Signed-off-by: Pavel Butsykin 
---
  block/qcow2-cache.c|  8 +
  block/qcow2-cluster.c  | 83 ++
  block/qcow2-refcount.c | 65 +++
  block/qcow2.c  | 40 ++--
  block/qcow2.h  |  4 +++
  qapi/block-core.json   |  4 ++-
  6 files changed, 193 insertions(+), 11 deletions(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index 1d25147392..da55118ca7 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -411,3 +411,11 @@ void qcow2_cache_entry_mark_dirty(BlockDriverState *bs, 
Qcow2Cache *c,
  assert(c->entries[i].offset != 0);
  c->entries[i].dirty = true;
  }
+
+void qcow2_cache_entry_mark_clean(BlockDriverState *bs, Qcow2Cache *c,
+ void *table)
+{
+int i = qcow2_cache_get_table_idx(bs, c, table);
+assert(c->entries[i].offset != 0);
+c->entries[i].dirty = false;
+}


This is an interesting function. We can use it whenever we're not
interested in the content of the table any more. However, we still keep
that data in the cache and may even evict other tables before this one.
The data in the cache also becomes inconsistent with the data in the
file, which should not be a problem in theory (because nobody should be
using it), but it surely could be confusing when debugging something in
the cache.



Good idea!


We can easily improve this a little: Make it qcow2_cache_discard(), a
function that gets a cluster offset, asserts that a table at this
offset isn't in use (not cached or ref == 0), and then just directly
drops it from the cache. This can be called from update_refcount()
whenever a refcount goes to 0, immediately before or after calling
update_refcount_discard() - those two are closely related. Then this
would automatically also be used for L2 tables.
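
A minimal sketch of the discard idea described above, using a toy cache
rather than the real Qcow2Cache. All names here (ToyCache, toy_cache_lookup,
toy_cache_discard) are hypothetical, and a real implementation would also
have to deal with the cached table memory itself. The point is only the
shape: look up by offset, assert the entry is unused, drop it without
write-back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a table cache; not the real Qcow2Cache layout. */
#define TOY_CACHE_SIZE 4

typedef struct ToyCacheEntry {
    uint64_t offset;   /* cluster offset in the image file, 0 = unused slot */
    int ref;           /* number of active users of this table */
    bool dirty;        /* would need write-back before eviction */
} ToyCacheEntry;

typedef struct ToyCache {
    ToyCacheEntry entries[TOY_CACHE_SIZE];
} ToyCache;

/* Return the slot caching 'offset', or -1 if it is not cached. */
static int toy_cache_lookup(const ToyCache *c, uint64_t offset)
{
    for (int i = 0; i < TOY_CACHE_SIZE; i++) {
        if (offset != 0 && c->entries[i].offset == offset) {
            return i;
        }
    }
    return -1;
}

/*
 * Drop the table at 'offset' from the cache without writing it back.
 * Intended to be called once the cluster's refcount has dropped to 0.
 */
static void toy_cache_discard(ToyCache *c, uint64_t offset)
{
    int i = toy_cache_lookup(c, offset);

    if (i < 0) {
        return; /* not cached (e.g. a guest data cluster): nothing to do */
    }
    assert(c->entries[i].ref == 0); /* nobody may still be using it */
    c->entries[i].offset = 0;
    c->entries[i].dirty = false;
}
```

In this sketch the caller does not need to know what kind of cluster the
offset refers to: an offset that is not in the cache is simply ignored.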



Did I understand correctly? Do we need to check the incoming offset every
time, to make sure it is the offset of an L2/refcount table (and not of guest
data)?



Adding this mechanism could be a patch of its own

...


Kevin

[Qemu-devel] [PATCH] virtio-serial: fix segfault on disconnect

2017-06-02 Thread Stefan Hajnoczi

Since commit d4c19cdeeb2f1e474bc426a6da261f1d7346eb5b ("virtio-serial:
add missing virtio_detach_element() call") the following commands may
cause QEMU to segfault:

  $ qemu -M accel=kvm -cpu host -m 1G \
 -drive if=virtio,file=test.img,format=raw \
 -device virtio-serial-pci,id=virtio-serial0 \
 -chardev socket,id=channel1,path=/tmp/chardev.sock,server,nowait \
 -device virtserialport,chardev=channel1,bus=virtio-serial0.0,id=port1
  $ nc -U /tmp/chardev.sock
  ^C

  (guest)$ cat /dev/zero >/dev/vport0p1

The segfault is non-deterministic: if the event loop notices the socket
has been closed then there is no crash.  The disconnect has to happen
right before QEMU attempts to write data to the socket.

The backtrace is as follows:

  Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
  0x557e0698 in do_flush_queued_data (port=0x582cedf0, 
vq=0x7fffcc854290, vdev=0x5807b1d0) at hw/char/virtio-serial-bus.c:180
  180   for (i = port->iov_idx; i < port->elem->out_num; i++) {
  #1  0x5580d363 in virtio_queue_notify_vq (vq=0x7fffcc854290) at 
hw/virtio/virtio.c:1524
  #2  0x5580d363 in virtio_queue_host_notifier_read (n=0x7fffcc8542f8) 
at hw/virtio/virtio.c:2430
  #3  0x55b3482c in aio_dispatch_handlers 
(ctx=ctx@entry=0x566b8c80) at util/aio-posix.c:399
  #4  0x55b350d8 in aio_dispatch (ctx=0x566b8c80) at 
util/aio-posix.c:430
  #5  0x55b3212e in aio_ctx_dispatch (source=, 
callback=, user_data=) at util/async.c:261
  #6  0x7fffde71de52 in g_main_context_dispatch () at 
/lib64/libglib-2.0.so.0
  #7  0x55b34353 in glib_pollfds_poll () at util/main-loop.c:213
  #8  0x55b34353 in os_host_main_loop_wait (timeout=) at 
util/main-loop.c:261
  #9  0x55b34353 in main_loop_wait (nonblocking=) at 
util/main-loop.c:517
  #10 0x55773207 in main_loop () at vl.c:1917
  #11 0x55773207 in main (argc=, argv=, 
envp=) at vl.c:4751

The do_flush_queued_data() function does not anticipate chardev close
events during vsc->have_data().  It expects port->elem to remain
non-NULL for the duration of its for loop.

The fix is simply to return from do_flush_queued_data() if the port
closes because the close event already frees port->elem and drains the
virtqueue - there is nothing left for do_flush_queued_data() to do.
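
The pattern behind the fix can be sketched in isolation: a callback invoked
from inside a loop may tear down the state the loop iterates over, so the
loop re-checks that state after every call. The types and functions below
(ToyPort, ToyElem, toy_flush, ...) are hypothetical stand-ins, not the real
virtio-serial structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins; not the real VirtIOSerialPort or VirtQueueElement. */
typedef struct ToyElem {
    size_t out_num;                    /* number of buffers to flush */
} ToyElem;

typedef struct ToyPort ToyPort;
struct ToyPort {
    ToyElem *elem;                     /* in-flight element, NULL when none */
    void (*have_data)(ToyPort *port);  /* may close the port as a side effect */
    size_t flushed;                    /* buffers successfully handed over */
};

/* What a close event does in this sketch: free the in-flight element. */
static void toy_port_close(ToyPort *port)
{
    free(port->elem);
    port->elem = NULL;
}

/* Flush all buffers of the in-flight element, tolerating a close mid-loop. */
static void toy_flush(ToyPort *port)
{
    if (!port->elem) {
        return;
    }
    for (size_t i = 0; i < port->elem->out_num; i++) {
        port->have_data(port);
        if (!port->elem) {
            return; /* closed during the callback: nothing left to do */
        }
        port->flushed++;
    }
}

/* Hypothetical callback that simulates a disconnect on its second call. */
static void toy_have_data_disconnect(ToyPort *port)
{
    if (port->flushed == 1) {
        toy_port_close(port);
    }
}
```

Without the check after have_data(), the loop condition would dereference
port->elem after it has been freed and set to NULL.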

Reported-by: Sitong Liu 
Reported-by: Min Deng 
Signed-off-by: Stefan Hajnoczi 
---
 hw/char/virtio-serial-bus.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index d797a67..c5aa26c 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -186,6 +186,9 @@ static void do_flush_queued_data(VirtIOSerialPort *port, VirtQueue *vq,
   port->elem->out_sg[i].iov_base
   + port->iov_offset,
   buf_size);
+if (!port->elem) { /* bail if we got disconnected */
+return;
+}
 if (port->throttled) {
 port->iov_idx = i;
 if (ret > 0) {
-- 
2.9.4

Re: [Qemu-devel] [PATCH v2 1/4] dump: add DumpInfo structure

2017-06-02 Thread Marc-André Lureau

Hi

On Fri, Jun 2, 2017 at 1:46 PM Marc-André Lureau 
wrote:

> Hi
>
> On Thu, Jun 1, 2017 at 10:19 PM Eric Blake  wrote:
>
>> On 06/01/2017 01:06 PM, Laszlo Ersek wrote:
>> > On 06/01/17 15:03, Marc-André Lureau wrote:
>> >> One way or another, the guest could communicate various dump info (via
>> >> guest agent or vmcoreinfo device) and populate that structure. It can
>> >> then be used to augment the dump with various details, as done in the
>> >> following patch.
>> >>
>> >> Signed-off-by: Marc-André Lureau 
>> >> ---
>> >>  include/sysemu/dump-info.h | 18 ++
>> >>  dump.c |  3 +++
>> >>  2 files changed, 21 insertions(+)
>> >>  create mode 100644 include/sysemu/dump-info.h
>> >>
>> >> diff --git a/include/sysemu/dump-info.h b/include/sysemu/dump-info.h
>> >> new file mode 100644
>> >> index 00..d2378e15e2
>> >> --- /dev/null
>> >> +++ b/include/sysemu/dump-info.h
>> >> @@ -0,0 +1,18 @@
>> >> +#ifndef DUMP_INFO_H
>> >> +#define DUMP_INFO_H
>>
>> >>
>> >
>> > Can you please spell out, in the commit message, the reason for
>> > introducing a new header file? (I suspect your reason, but it should be
>> > documented explicitly.)
>>
>> Also, should you have a copyright header in the new file?  And does
>> MAINTAINERS cover it?
>>
>
> None of the dump support is covered. Based on commit history, I can
> suggest Wen Congyang, as original author.
> Sadly, Qiao Nuohan cannot be reached at his mail address today (does anyone
> know if he is still contributing?). Laszlo has done significant changes and reviews
> too. I can also propose myself to help with reviews.
>
> Wen or Laszla, do you want to be the main maintainer?
>

(sorry for the typo)

or rather "Supported" ("Someone is actually paid to look after this"
according to MAINTAINERS)
-- 
Marc-André Lureau

Re: [Qemu-devel] [PULL 23/33] exec: fix address_space_get_iotlb_entry page mask

2017-06-02 Thread Peter Xu

On Thu, Jun 01, 2017 at 02:41:41PM +0200, Paolo Bonzini wrote:
> From: Peter Xu 
> 
> The IOTLB that it returned didn't guarantee that page_mask is indeed a
> so-called page mask. That won't affect current usage since now only
> vhost is using it (vhost API allows arbitrary IOTLB range). However we
> have IOTLB semantics and we should best follow them. This patch fixes this
> issue to make sure the page_mask is always a valid page mask.
> 
> Fixes: a764040 ("exec: abstract address_space_do_translate()")
> Signed-off-by: Peter Xu 
> Message-Id: <1496212378-22605-1-git-send-email-pet...@redhat.com>
> Signed-off-by: Paolo Bonzini 
> ---
>  exec.c | 14 ++
>  1 file changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index 29633cd..22b8f0c 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -528,16 +528,14 @@ IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpace *as, hwaddr addr,
>  section.offset_within_region;
>  
>  if (plen == (hwaddr)-1) {
> -/*
> - * We use default page size here. Logically it only happens
> - * for identity mappings.
> - */
> -plen = TARGET_PAGE_SIZE;
> +/* If not specified during translation, use default mask */
> +plen = TARGET_PAGE_MASK;
> +} else {
> +/* Make it a valid page mask */
> +assert(plen);
> +plen = pow2floor(plen) - 1;
>  }
>  
> -/* Convert to address mask */
> -plen -= 1;
> -
>  return (IOMMUTLBEntry) {
>  .target_as = section.address_space,
>  .iova = addr & ~plen,
> -- 
> 1.8.3.1
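
For reference, the "make it a valid page mask" step in the quoted patch
(pow2floor(plen) - 1) can be illustrated on its own. This is only a sketch:
toy_pow2floor() is a local stand-in written for this example, assumed to
behave like the pow2floor() helper the patch calls (largest power of two not
exceeding its argument), and toy_len_to_addr_mask() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Largest power of two that is <= value; value must be non-zero.
 * Local stand-in for the pow2floor() helper used by the patch.
 */
static uint64_t toy_pow2floor(uint64_t value)
{
    uint64_t p = 1;

    assert(value != 0);
    while (p <= value / 2) {
        p *= 2;
    }
    return p;
}

/* Address mask with only low bits set, covering at most 'len' bytes. */
static uint64_t toy_len_to_addr_mask(uint64_t len)
{
    return toy_pow2floor(len) - 1;
}
```

A length of 0x1800 is rounded down to 0x1000, giving the mask 0xfff, so that
addr & ~mask yields the aligned start of the range.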

Paolo,

I have a better idea for refactoring address_space_get_iotlb_entry(). If
you haven't started preparing another pull request, please feel free
to drop this one (it fixes the problem, but not completely).
Otherwise I'll just build on top of it, which is fine as well.

Sorry for the trouble.

-- 
Peter Xu

Re: [Qemu-devel] [PATCH 13/19] nbd/server: return original error codes

2017-06-02 Thread Vladimir Sementsov-Ogievskiy


02.06.2017 01:29, Eric Blake wrote:

On 05/30/2017 09:30 AM, Vladimir Sementsov-Ogievskiy wrote:

The code in many cases returns -EINVAL or -EIO instead of the original error
code from, for example, write_sync(). A following patch will need EPIPE
handling, so let's refactor this where possible (the only exception
is nbd_co_receive_request, which has its own return-code convention)

Do we still want/need EPIPE handling, given the discussion on the
previous two patches?


Looks like this patch should be dropped. If EPIPE is not accepted
(see the previous discussion), the error code is always EIO, so there is
no need to save it into a variable.





Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  nbd/server.c | 124 +--
  1 file changed, 77 insertions(+), 47 deletions(-)


Feels weird to have a net gain in code, but maybe worthwhile.


diff --git a/nbd/server.c b/nbd/server.c
index a47f13e4fb..30dfb81a5c 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -136,30 +136,38 @@ static void nbd_client_receive_next_request(NBDClient *client);
  static int nbd_negotiate_send_rep_len(QIOChannel *ioc, uint32_t type,
uint32_t opt, uint32_t len)
  {
+int ret;
  uint64_t magic;
  
  TRACE("Reply opt=%" PRIx32 " type=%" PRIx32 " len=%" PRIu32,

type, opt, len);
  
  magic = cpu_to_be64(NBD_REP_MAGIC);

-if (write_sync(ioc, &magic, sizeof(magic), NULL) < 0) {
+ret = write_sync(ioc, &magic, sizeof(magic), NULL);
+if (ret < 0) {
  LOG("write failed (rep magic)");
-return -EINVAL;
+return ret;
  }

Constructs like this should get shorter once we plumb errp all the way
through.  Okay, I can live with the temporary verbosity.

You may still have to make changes due to rebasing (in which case I'll
definitely want to review again); but if this patch doesn't need further
rework, you can add:
Reviewed-by: Eric Blake 



--
Best regards,
Vladimir

[Qemu-devel] [PATCH] spapr/drc: don't migrate DRC of cold-plugged CPUs and LMBs

2017-06-02 Thread Greg Kurz

As explained in commit 5c0139a8c2f0 ("spapr: fix default DRC state for
coldplugged LMBs"), guests expect cold-plugged LMBs to be pre-allocated
and unisolated. The same goes for cold-plugged CPUs.

While here, let's convert g_assert(false) to the more self-documenting
g_assert_not_reached().

Signed-off-by: Greg Kurz 
---

FWIW

$ git grep -i -E '(g_)?assert\((0|false)\)' | wc -l
100
$ git grep g_assert_not_reached | wc -l
244
---
 hw/ppc/spapr_drc.c |   10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index cc2400bcd57f..ab5f7cdf569c 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -538,20 +538,16 @@ static bool spapr_drc_needed(void *opaque)
  */
 switch (drc->type) {
 case SPAPR_DR_CONNECTOR_TYPE_PCI:
-rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_UNISOLATED) &&
-   (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_USABLE) &&
-   drc->configured && drc->signalled && !drc->awaiting_release);
-break;
 case SPAPR_DR_CONNECTOR_TYPE_CPU:
 case SPAPR_DR_CONNECTOR_TYPE_LMB:
-rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_ISOLATED) &&
-   (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) &&
+rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_UNISOLATED) &&
+   (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_USABLE) &&
drc->configured && drc->signalled && !drc->awaiting_release);
 break;
 case SPAPR_DR_CONNECTOR_TYPE_PHB:
 case SPAPR_DR_CONNECTOR_TYPE_VIO:
 default:
-g_assert(false);
+g_assert_not_reached();
 }
 return rc;
 }

Re: [Qemu-devel] [PATCH] virtio-serial: fix segfault on disconnect

2017-06-02 Thread Pankaj Gupta


Hello Stefan,

> 
> Since commit d4c19cdeeb2f1e474bc426a6da261f1d7346eb5b ("virtio-serial:
> add missing virtio_detach_element() call") the following commands may
> cause QEMU to segfault:
> 
>   $ qemu -M accel=kvm -cpu host -m 1G \
>  -drive if=virtio,file=test.img,format=raw \
>  -device virtio-serial-pci,id=virtio-serial0 \
>  -chardev socket,id=channel1,path=/tmp/chardev.sock,server,nowait \
>  -device
>  virtserialport,chardev=channel1,bus=virtio-serial0.0,id=port1
>   $ nc -U /tmp/chardev.sock
>   ^C
> 
>   (guest)$ cat /dev/zero >/dev/vport0p1
> 
> The segfault is non-deterministic: if the event loop notices the socket
> has been closed then there is no crash.  The disconnect has to happen
> right before QEMU attempts to write data to the socket.
> 
> The backtrace is as follows:
> 
>   Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
>   0x557e0698 in do_flush_queued_data (port=0x582cedf0,
>   vq=0x7fffcc854290, vdev=0x5807b1d0) at hw/char/virtio-serial-bus.c:180
>   180   for (i = port->iov_idx; i < port->elem->out_num; i++) {
>   #1  0x5580d363 in virtio_queue_notify_vq (vq=0x7fffcc854290) at
>   hw/virtio/virtio.c:1524
>   #2  0x5580d363 in virtio_queue_host_notifier_read
>   (n=0x7fffcc8542f8) at hw/virtio/virtio.c:2430
>   #3  0x55b3482c in aio_dispatch_handlers
>   (ctx=ctx@entry=0x566b8c80) at util/aio-posix.c:399
>   #4  0x55b350d8 in aio_dispatch (ctx=0x566b8c80) at
>   util/aio-posix.c:430
>   #5  0x55b3212e in aio_ctx_dispatch (source=,
>   callback=, user_data=) at util/async.c:261
>   #6  0x7fffde71de52 in g_main_context_dispatch () at
>   /lib64/libglib-2.0.so.0
>   #7  0x55b34353 in glib_pollfds_poll () at util/main-loop.c:213
>   #8  0x55b34353 in os_host_main_loop_wait (timeout=)
>   at util/main-loop.c:261
>   #9  0x55b34353 in main_loop_wait (nonblocking=) at
>   util/main-loop.c:517
>   #10 0x55773207 in main_loop () at vl.c:1917
>   #11 0x55773207 in main (argc=, argv=,
>   envp=) at vl.c:4751
> 
> The do_flush_queued_data() function does not anticipate chardev close
> events during vsc->have_data().  It expects port->elem to remain
> non-NULL for the duration of its for loop.

Just wondering: if there is still data to flush, should we close/free the port?
Or can it get closed automatically?

Or am I missing anything here?
> 
> The fix is simply to return from do_flush_queued_data() if the port
> closes because the close event already frees port->elem and drains the
> virtqueue - there is nothing left for do_flush_queued_data() to do.
> 
> Reported-by: Sitong Liu 
> Reported-by: Min Deng 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  hw/char/virtio-serial-bus.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
> index d797a67..c5aa26c 100644
> --- a/hw/char/virtio-serial-bus.c
> +++ b/hw/char/virtio-serial-bus.c
> @@ -186,6 +186,9 @@ static void do_flush_queued_data(VirtIOSerialPort *port, VirtQueue *vq,
>port->elem->out_sg[i].iov_base
>+ port->iov_offset,
>buf_size);
> +if (!port->elem) { /* bail if we got disconnected */
> +return;
> +}
>  if (port->throttled) {
>  port->iov_idx = i;
>  if (ret > 0) {
> --
> 2.9.4
> 
> 
>

[Qemu-devel] [PATCH v3 3/5] vhost-user: add vhost_user to hold the chr

2017-06-02 Thread Maxime Coquelin

From: Marc-André Lureau 

Next patches will add more fields to the structure

Signed-off-by: Marc-André Lureau 
Signed-off-by: Maxime Coquelin 
---
 hw/virtio/vhost-user.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index dde094a..bd13b23 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -110,6 +110,10 @@ static VhostUserMsg m __attribute__ ((unused));
 /* The version of the protocol we support */
 #define VHOST_USER_VERSION(0x1)
 
+struct vhost_user {
+CharBackend *chr;
+};
+
 static bool ioeventfd_enabled(void)
 {
 return kvm_enabled() && kvm_eventfds_enabled();
@@ -117,7 +121,8 @@ static bool ioeventfd_enabled(void)
 
 static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg)
 {
-CharBackend *chr = dev->opaque;
+struct vhost_user *u = dev->opaque;
+CharBackend *chr = u->chr;
 uint8_t *p = (uint8_t *) msg;
 int r, size = VHOST_USER_HDR_SIZE;
 
@@ -202,7 +207,8 @@ static bool vhost_user_one_time_request(VhostUserRequest request)
 static int vhost_user_write(struct vhost_dev *dev, VhostUserMsg *msg,
 int *fds, int fd_num)
 {
-CharBackend *chr = dev->opaque;
+struct vhost_user *u = dev->opaque;
+CharBackend *chr = u->chr;
 int ret, size = VHOST_USER_HDR_SIZE + msg->size;
 
 /*
@@ -575,11 +581,14 @@ static int vhost_user_reset_device(struct vhost_dev *dev)
 static int vhost_user_init(struct vhost_dev *dev, void *opaque)
 {
 uint64_t features;
+struct vhost_user *u;
 int err;
 
 assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER);
 
-dev->opaque = opaque;
+u = g_new0(struct vhost_user, 1);
+u->chr = opaque;
+dev->opaque = u;
 
 err = vhost_user_get_features(dev, &features);
 if (err < 0) {
@@ -624,8 +633,12 @@ static int vhost_user_init(struct vhost_dev *dev, void *opaque)
 
 static int vhost_user_cleanup(struct vhost_dev *dev)
 {
+struct vhost_user *u;
+
 assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER);
 
+u = dev->opaque;
+g_free(u);
 dev->opaque = 0;
 
 return 0;
-- 
2.9.4

[Qemu-devel] [PATCH v3 0/5] vhost-user: Specify and implement device IOTLB support

2017-06-02 Thread Maxime Coquelin

This series aims at specifying and implementing the protocol update
required to support device IOTLB with user backends.

In this third non-RFC version, the main change is mandating that the slave be
able to send IOTLB miss requests for any addresses it needs to access. It implies
the removal of patch 3, which is no longer necessary, and the related spec
update in patch 6. Also, a check is added in vhost-user init to ensure that
the slave does not advertise VIRTIO_F_IOMMU_PLATFORM if it does not support
both VHOST_USER_PROTOCOL_F_SLAVE_REQ and VHOST_USER_PROTOCOL_F_REPLY_ACK
protocol features.

The slave requests channel part is re-used from Marc-André's series submitted
last year[1], with main changes from original version being request/feature
names renaming and addition of the REPLY_ACK feature support.

Regarding the IOTLB protocol, one noticeable change is that the reply to an
IOTLB miss request is now optional (i.e. sent only if the slave requests it by
setting the VHOST_USER_NEED_REPLY flag in the message header). This change provides
more flexibility in the backend implementation of the feature.
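
The optional-reply behaviour can be sketched as follows. This is an
illustration only: the ToyMsg layout, the toy_prepare_ack() helper and the
flag bit values are assumptions made for this example, not taken from the
series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits; the values are assumptions for this sketch. */
#define TOY_REPLY_MASK       (1u << 2)  /* message is a reply */
#define TOY_NEED_REPLY_MASK  (1u << 3)  /* sender asks for an acknowledgement */

typedef struct ToyMsg {
    uint32_t request;
    uint32_t flags;
    uint64_t payload;  /* in a reply: 0 for success, non-zero for failure */
} ToyMsg;

/*
 * Turn 'msg' into an acknowledgement if the sender asked for one.
 * 'ret' is the result of handling the request (0 on success).
 * Returns true when a reply must be sent back, false otherwise.
 */
static bool toy_prepare_ack(ToyMsg *msg, int ret)
{
    assert(msg);
    if (!(msg->flags & TOY_NEED_REPLY_MASK)) {
        return false; /* sender did not ask for a reply */
    }
    msg->flags &= ~TOY_NEED_REPLY_MASK;
    msg->flags |= TOY_REPLY_MASK;
    msg->payload = ret ? 1 : 0;
    return true;
}
```

A backend that needs synchronization sets the need-reply bit and waits for
the acknowledgement; one that does not simply leaves the bit clear.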

The protocol is very close to kernel backends, except that a new
communication channel is introduced to enable the slave to send
requests to the master.

[1]: https://lists.gnu.org/archive/html/qemu-devel/2016-04/msg00095.html

Marc-André Lureau (2):
  vhost-user: add vhost_user to hold the chr
  vhost-user: add slave-req-fd support

Maxime Coquelin (3):
  vhost: propagate errors in vhost_device_iotlb_miss()
  vhost: rework IOTLB messaging
  spec/vhost-user spec: Add IOMMU support

 docs/specs/vhost-user.txt | 116 ++-
 hw/net/vhost_net.c|   1 +
 hw/virtio/vhost-backend.c | 130 ++---
 hw/virtio/vhost-user.c| 194 --
 hw/virtio/vhost.c |  19 ++--
 include/hw/virtio/vhost-backend.h |  23 +++--
 include/hw/virtio/vhost.h |   2 +-
 7 files changed, 404 insertions(+), 81 deletions(-)

-- 
2.9.4

[Qemu-devel] [PATCH v3 2/5] vhost: rework IOTLB messaging

2017-06-02 Thread Maxime Coquelin

This patch reworks IOTLB messaging to prepare for vhost-user
device IOTLB support.

IOTLB message handling is extracted from the vhost-kernel backend,
so that only the message transport remains backend-specific.
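
The resulting split can be sketched with toy types: generic code builds the
IOTLB message, and the backend contributes only a single "send" hook. All
names below (ToyDev, ToyIotlbMsg, toy_update_iotlb, ...) are hypothetical
and simplified; the numeric message types and permissions follow the values
listed in the spec text of patch 5/5 of this series.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Message types and permissions, numbered as in the series' spec text. */
enum { TOY_IOTLB_UPDATE = 2, TOY_IOTLB_INVALIDATE = 3 };
enum { TOY_ACCESS_RO = 1, TOY_ACCESS_WO = 2, TOY_ACCESS_RW = 3 };

typedef struct ToyIotlbMsg {
    uint64_t iova;
    uint64_t size;
    uint64_t uaddr;
    uint8_t perm;
    uint8_t type;
} ToyIotlbMsg;

typedef struct ToyDev ToyDev;
struct ToyDev {
    /* The only backend-specific piece: how a message is transported. */
    int (*send_iotlb_msg)(ToyDev *dev, const ToyIotlbMsg *imsg);
    ToyIotlbMsg last_sent;  /* recorded by the toy transport below */
};

/* Generic code: build an update message, then hand it to the transport. */
static int toy_update_iotlb(ToyDev *dev, uint64_t iova, uint64_t uaddr,
                            uint64_t len, uint8_t perm)
{
    ToyIotlbMsg imsg = {
        .iova = iova, .size = len, .uaddr = uaddr,
        .perm = perm, .type = TOY_IOTLB_UPDATE,
    };

    assert(dev->send_iotlb_msg);
    if (perm < TOY_ACCESS_RO || perm > TOY_ACCESS_RW) {
        return -EINVAL;
    }
    return dev->send_iotlb_msg(dev, &imsg);
}

/* Generic code: build an invalidate message for [iova, iova + len). */
static int toy_invalidate_iotlb(ToyDev *dev, uint64_t iova, uint64_t len)
{
    ToyIotlbMsg imsg = {
        .iova = iova, .size = len, .type = TOY_IOTLB_INVALIDATE,
    };

    assert(dev->send_iotlb_msg);
    return dev->send_iotlb_msg(dev, &imsg);
}

/* Toy transport: a real backend would write to a file descriptor or socket. */
static int toy_record_send(ToyDev *dev, const ToyIotlbMsg *imsg)
{
    dev->last_sent = *imsg;
    return 0;
}
```

Adding a second backend then means supplying another send hook, while the
message construction stays shared.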

Signed-off-by: Maxime Coquelin 
---
 hw/virtio/vhost-backend.c | 130 +-
 hw/virtio/vhost.c |   8 +--
 include/hw/virtio/vhost-backend.h |  23 ---
 3 files changed, 92 insertions(+), 69 deletions(-)

diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
index be927b8..4e31de1 100644
--- a/hw/virtio/vhost-backend.c
+++ b/hw/virtio/vhost-backend.c
@@ -192,7 +192,6 @@ static void vhost_kernel_iotlb_read(void *opaque)
 ssize_t len;
 
 while ((len = read((uintptr_t)dev->opaque, &msg, sizeof msg)) > 0) {
-struct vhost_iotlb_msg *imsg = &msg.iotlb;
 if (len < sizeof msg) {
 error_report("Wrong vhost message len: %d", (int)len);
 break;
@@ -201,70 +200,21 @@ static void vhost_kernel_iotlb_read(void *opaque)
 error_report("Unknown vhost iotlb message type");
 break;
 }
-switch (imsg->type) {
-case VHOST_IOTLB_MISS:
-vhost_device_iotlb_miss(dev, imsg->iova,
-imsg->perm != VHOST_ACCESS_RO);
-break;
-case VHOST_IOTLB_UPDATE:
-case VHOST_IOTLB_INVALIDATE:
-error_report("Unexpected IOTLB message type");
-break;
-case VHOST_IOTLB_ACCESS_FAIL:
-/* FIXME: report device iotlb error */
-break;
-default:
-break;
-}
-}
-}
 
-static int vhost_kernel_update_device_iotlb(struct vhost_dev *dev,
-uint64_t iova, uint64_t uaddr,
-uint64_t len,
-IOMMUAccessFlags perm)
-{
-struct vhost_msg msg;
-msg.type = VHOST_IOTLB_MSG;
-msg.iotlb.iova =  iova;
-msg.iotlb.uaddr = uaddr;
-msg.iotlb.size = len;
-msg.iotlb.type = VHOST_IOTLB_UPDATE;
-
-switch (perm) {
-case IOMMU_RO:
-msg.iotlb.perm = VHOST_ACCESS_RO;
-break;
-case IOMMU_WO:
-msg.iotlb.perm = VHOST_ACCESS_WO;
-break;
-case IOMMU_RW:
-msg.iotlb.perm = VHOST_ACCESS_RW;
-break;
-default:
-g_assert_not_reached();
-}
-
-if (write((uintptr_t)dev->opaque, &msg, sizeof msg) != sizeof msg) {
-error_report("Fail to update device iotlb");
-return -EFAULT;
+vhost_backend_handle_iotlb_msg(dev, &msg.iotlb);
 }
-
-return 0;
 }
 
-static int vhost_kernel_invalidate_device_iotlb(struct vhost_dev *dev,
-uint64_t iova, uint64_t len)
+static int vhost_kernel_send_device_iotlb_msg(struct vhost_dev *dev,
+  struct vhost_iotlb_msg *imsg)
 {
 struct vhost_msg msg;
 
 msg.type = VHOST_IOTLB_MSG;
-msg.iotlb.iova = iova;
-msg.iotlb.size = len;
-msg.iotlb.type = VHOST_IOTLB_INVALIDATE;
+msg.iotlb = *imsg;
 
 if (write((uintptr_t)dev->opaque, &msg, sizeof msg) != sizeof msg) {
-error_report("Fail to invalidate device iotlb");
+error_report("Fail to update device iotlb");
 return -EFAULT;
 }
 
@@ -311,8 +261,7 @@ static const VhostOps kernel_ops = {
 .vhost_vsock_set_running = vhost_kernel_vsock_set_running,
 #endif /* CONFIG_VHOST_VSOCK */
 .vhost_set_iotlb_callback = vhost_kernel_set_iotlb_callback,
-.vhost_update_device_iotlb = vhost_kernel_update_device_iotlb,
-.vhost_invalidate_device_iotlb = vhost_kernel_invalidate_device_iotlb,
+.vhost_send_device_iotlb_msg = vhost_kernel_send_device_iotlb_msg,
 };
 
 int vhost_set_backend_type(struct vhost_dev *dev, VhostBackendType 
backend_type)
@@ -333,3 +282,70 @@ int vhost_set_backend_type(struct vhost_dev *dev, VhostBackendType backend_type)
 
 return r;
 }
+
+int vhost_backend_update_device_iotlb(struct vhost_dev *dev,
+ uint64_t iova, uint64_t uaddr,
+ uint64_t len,
+ IOMMUAccessFlags perm)
+{
+struct vhost_iotlb_msg imsg;
+
+imsg.iova =  iova;
+imsg.uaddr = uaddr;
+imsg.size = len;
+imsg.type = VHOST_IOTLB_UPDATE;
+
+switch (perm) {
+case IOMMU_RO:
+imsg.perm = VHOST_ACCESS_RO;
+break;
+case IOMMU_WO:
+imsg.perm = VHOST_ACCESS_WO;
+break;
+case IOMMU_RW:
+imsg.perm = VHOST_ACCESS_RW;
+break;
+default:
+return -EINVAL;
+}
+
+return dev->vhost_ops->vhost_send_device_iotlb_msg(dev, &imsg);
+}
+
+int vhost_backend_invalidate_device_iotlb(struct vhost_dev *dev,
+ uint64_t iova, uint64_t len)
+{
+

[Qemu-devel] [PATCH v3 4/5] vhost-user: add slave-req-fd support

2017-06-02 Thread Maxime Coquelin

From: Marc-André Lureau 

Learn to give a socket to the slave to let it make requests to the
master.

Signed-off-by: Marc-André Lureau 
Signed-off-by: Maxime Coquelin 
---
 docs/specs/vhost-user.txt |  32 +++-
 hw/virtio/vhost-user.c| 127 ++
 2 files changed, 157 insertions(+), 2 deletions(-)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 036890f..5fa7016 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -139,6 +139,7 @@ in the ancillary data:
  * VHOST_USER_SET_VRING_KICK
  * VHOST_USER_SET_VRING_CALL
  * VHOST_USER_SET_VRING_ERR
+ * VHOST_USER_SET_SLAVE_REQ_FD
 
 If Master is unable to send the full message or receives a wrong reply it will
 close the connection. An optional reconnection mechanism can be implemented.
@@ -252,6 +253,18 @@ Once the source has finished migration, rings will be stopped by
 the source. No further update must be done before rings are
 restarted.
 
+Slave communication
+---
+
+An optional communication channel is provided if the slave declares
+VHOST_USER_PROTOCOL_F_SLAVE_REQ protocol feature, to allow the slave to make
+requests to the master.
+
+The fd is provided via VHOST_USER_SET_SLAVE_REQ_FD ancillary data.
+
+A slave may then send VHOST_USER_SLAVE_* messages to the master
+using this fd communication channel.
+
 Protocol features
 -
 
@@ -260,9 +273,10 @@ Protocol features
 #define VHOST_USER_PROTOCOL_F_RARP   2
 #define VHOST_USER_PROTOCOL_F_REPLY_ACK  3
 #define VHOST_USER_PROTOCOL_F_MTU4
+#define VHOST_USER_PROTOCOL_F_SLAVE_REQ  5
 
-Message types
--
+Master message types
+
 
  * VHOST_USER_GET_FEATURES
 
@@ -486,6 +500,20 @@ Message types
   If VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated, slave must respond
   with zero in case the specified MTU is valid, or non-zero otherwise.
 
+ * VHOST_USER_SET_SLAVE_REQ_FD
+
+  Id: 21
+  Equivalent ioctl: N/A
+  Master payload: N/A
+
+  Set the socket file descriptor for slave initiated requests. It is passed
+  in the ancillary data.
+  This request should be sent only when VHOST_USER_F_PROTOCOL_FEATURES
+  has been negotiated, and protocol feature bit 
VHOST_USER_PROTOCOL_F_SLAVE_REQ
+  bit is present in VHOST_USER_GET_PROTOCOL_FEATURES.
+  If VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated, slave must respond
+  with zero for success, non-zero otherwise.
+
 VHOST_USER_PROTOCOL_F_REPLY_ACK:
 ---
 The original vhost-user specification only demands replies for certain
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index bd13b23..6a35600 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -32,6 +32,7 @@ enum VhostUserProtocolFeature {
 VHOST_USER_PROTOCOL_F_RARP = 2,
 VHOST_USER_PROTOCOL_F_REPLY_ACK = 3,
 VHOST_USER_PROTOCOL_F_NET_MTU = 4,
+VHOST_USER_PROTOCOL_F_SLAVE_REQ = 5,
 
 VHOST_USER_PROTOCOL_F_MAX
 };
@@ -60,9 +61,15 @@ typedef enum VhostUserRequest {
 VHOST_USER_SET_VRING_ENABLE = 18,
 VHOST_USER_SEND_RARP = 19,
 VHOST_USER_NET_SET_MTU = 20,
+VHOST_USER_SET_SLAVE_REQ_FD = 21,
 VHOST_USER_MAX
 } VhostUserRequest;
 
+typedef enum VhostUserSlaveRequest {
+VHOST_USER_SLAVE_NONE = 0,
+VHOST_USER_SLAVE_MAX
+}  VhostUserSlaveRequest;
+
 typedef struct VhostUserMemoryRegion {
 uint64_t guest_phys_addr;
 uint64_t memory_size;
@@ -112,6 +119,7 @@ static VhostUserMsg m __attribute__ ((unused));
 
 struct vhost_user {
 CharBackend *chr;
+int slave_fd;
 };
 
 static bool ioeventfd_enabled(void)
@@ -578,6 +586,115 @@ static int vhost_user_reset_device(struct vhost_dev *dev)
 return 0;
 }
 
+static void slave_read(void *opaque)
+{
+struct vhost_dev *dev = opaque;
+struct vhost_user *u = dev->opaque;
+VhostUserMsg msg = { 0, };
+int size, ret = 0;
+
+/* Read header */
+size = read(u->slave_fd, &msg, VHOST_USER_HDR_SIZE);
+if (size != VHOST_USER_HDR_SIZE) {
+error_report("Failed to read from slave.");
+goto err;
+}
+
+if (msg.size > VHOST_USER_PAYLOAD_SIZE) {
+error_report("Failed to read msg header."
+" Size %d exceeds the maximum %zu.", msg.size,
+VHOST_USER_PAYLOAD_SIZE);
+goto err;
+}
+
+/* Read payload */
+size = read(u->slave_fd, &msg.payload, msg.size);
+if (size != msg.size) {
+error_report("Failed to read payload from slave.");
+goto err;
+}
+
+switch (msg.request) {
+default:
+error_report("Received unexpected msg type.");
+ret = -EINVAL;
+}
+
+/*
+ * REPLY_ACK feature handling. Other reply types has to be managed
+ * directly in their request handlers.
+ */
+if (msg.flags & VHOST_USER_NEED_REPLY_MASK) {
+msg.flags &= ~VHOST_USER_NEED_REPLY_MASK;
+msg.flags |= VHO

[Qemu-devel] [PATCH v3 1/5] vhost: propagate errors in vhost_device_iotlb_miss()

2017-06-02 Thread Maxime Coquelin

Some backends might want to know when things went wrong.

Signed-off-by: Maxime Coquelin 
---
 hw/virtio/vhost.c | 15 ++-
 include/hw/virtio/vhost.h |  2 +-
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 03a46a7..8fab12d 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -971,18 +971,20 @@ static int vhost_memory_region_lookup(struct vhost_dev *hdev,
 return -EFAULT;
 }
 
-void vhost_device_iotlb_miss(struct vhost_dev *dev, uint64_t iova, int write)
+int vhost_device_iotlb_miss(struct vhost_dev *dev, uint64_t iova, int write)
 {
 IOMMUTLBEntry iotlb;
 uint64_t uaddr, len;
+int ret = -EFAULT;
 
 rcu_read_lock();
 
 iotlb = address_space_get_iotlb_entry(dev->vdev->dma_as,
   iova, write);
 if (iotlb.target_as != NULL) {
-if (vhost_memory_region_lookup(dev, iotlb.translated_addr,
-   &uaddr, &len)) {
+ret = vhost_memory_region_lookup(dev, iotlb.translated_addr,
+ &uaddr, &len);
+if (ret) {
 error_report("Fail to lookup the translated address "
  "%"PRIx64, iotlb.translated_addr);
 goto out;
@@ -991,14 +993,17 @@ void vhost_device_iotlb_miss(struct vhost_dev *dev, uint64_t iova, int write)
 len = MIN(iotlb.addr_mask + 1, len);
 iova = iova & ~iotlb.addr_mask;
 
-if (dev->vhost_ops->vhost_update_device_iotlb(dev, iova, uaddr,
-  len, iotlb.perm)) {
+ret = dev->vhost_ops->vhost_update_device_iotlb(dev, iova, uaddr,
+  len, iotlb.perm);
+if (ret) {
 error_report("Fail to update device iotlb");
 goto out;
 }
 }
 out:
 rcu_read_unlock();
+
+return ret;
 }
 
 static int vhost_virtqueue_start(struct vhost_dev *dev,
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index a450321..467dc77 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -105,5 +105,5 @@ bool vhost_has_free_slot(void);
 int vhost_net_set_backend(struct vhost_dev *hdev,
   struct vhost_vring_file *file);
 
-void vhost_device_iotlb_miss(struct vhost_dev *dev, uint64_t iova, int write);
+int vhost_device_iotlb_miss(struct vhost_dev *dev, uint64_t iova, int write);
 #endif
-- 
2.9.4

[Qemu-devel] [PATCH v3 5/5] spec/vhost-user spec: Add IOMMU support

2017-06-02 Thread Maxime Coquelin

This patch specifies and implements the master/slave communication
to support device IOTLB in slave.

The vhost_iotlb_msg structure introduced for kernel backends is
re-used, making the design close between the two backends.

An exception is the use of the secondary channel to enable the
slave to send IOTLB miss requests to the master.

Signed-off-by: Maxime Coquelin 
---

v3:
 - spec: Remove the part requiring the master to send IOTLB updates
   for ring addresses. Slave must always be able to request translations
   for areas it needs to access.
 - vhost-user.c: Make init fail if backend advertises IOMMU feature but does not
   support REPLY_ACK and SLAVE_REQ protocol features.

 docs/specs/vhost-user.txt | 84 +++
 hw/net/vhost_net.c|  1 +
 hw/virtio/vhost-user.c| 48 +--
 3 files changed, 130 insertions(+), 3 deletions(-)

diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 5fa7016..481ab56 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -97,6 +97,25 @@ Depending on the request type, payload can be:
log offset: offset from start of supplied file descriptor
where logging starts (i.e. where guest address 0 would be logged)
 
+ * An IOTLB message
+   -
+   | iova | size | user address | permissions flags | type |
+   -
+
+   IOVA: a 64-bit I/O virtual address programmed by the guest
+   Size: a 64-bit size
+   User address: a 64-bit user address
+   Permissions: a 8-bit value:
+- 0: No access
+- 1: Read access
+- 2: Write access
+- 3: Read/Write access
+   Type: a 8-bit IOTLB message type:
+- 1: IOTLB miss
+- 2: IOTLB update
+- 3: IOTLB invalidate
+- 4: IOTLB access fail
+
 In QEMU the vhost-user message is implemented with the following struct:
 
 typedef struct VhostUserMsg {
@@ -109,6 +128,7 @@ typedef struct VhostUserMsg {
 struct vhost_vring_addr addr;
 VhostUserMemory memory;
 VhostUserLog log;
+struct vhost_iotlb_msg iotlb;
 };
 } QEMU_PACKED VhostUserMsg;
 
@@ -253,6 +273,38 @@ Once the source has finished migration, rings will be stopped by
 the source. No further update must be done before rings are
 restarted.
 
+IOMMU support
+-
+
+When the VIRTIO_F_IOMMU_PLATFORM feature has been negotiated, the master
+sends IOTLB entries update & invalidation by sending VHOST_USER_IOTLB_MSG
+requests to the slave with a struct vhost_iotlb_msg as payload. For update
+events, the iotlb payload has to be filled with the update message type (2),
+the I/O virtual address, the size, the user virtual address, and the
+permissions flags. Addresses and size must be within vhost memory regions set
+via the VHOST_USER_SET_MEM_TABLE request. For invalidation events, the iotlb
+payload has to be filled with the invalidation message type (3), the I/O virtual
+address and the size. On success, the slave is expected to reply with a zero
+payload, non-zero otherwise.
+
+The slave relies on the slave communication channel (see "Slave communication"
+section below) to send IOTLB miss and access failure events, by sending
+VHOST_USER_SLAVE_IOTLB_MSG requests to the master with a struct vhost_iotlb_msg
+as payload. For miss events, the iotlb payload has to be filled with the miss
+message type (1), the I/O virtual address and the permissions flags. For access
+failure event, the iotlb payload has to be filled with the access failure
+message type (4), the I/O virtual address and the permissions flags.
+For synchronization purpose, the slave may rely on the reply-ack feature,
+so the master may send a reply when operation is completed if the reply-ack
+feature is negotiated and slaves requests a reply. For miss events, completed
+operation means either master sent an update message containing the IOTLB entry
+containing requested address and permission, or master sent nothing if the IOTLB
+miss message is invalid (invalid IOVA or permission).
+
+The master isn't expected to take the initiative to send IOTLB update messages,
+as the slave sends IOTLB miss messages for the guest virtual memory areas it
+needs to access.
+
 Slave communication
 ---
 
@@ -514,6 +566,38 @@ Master message types
   If VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated, slave must respond
   with zero for success, non-zero otherwise.
 
+ * VHOST_USER_IOTLB_MSG
+
+  Id: 22
+  Equivalent ioctl: N/A (equivalent to VHOST_IOTLB_MSG message type)
+  Master payload: struct vhost_iotlb_msg
+  Slave payload: u64
+
+  Send IOTLB messages with struct vhost_iotlb_msg as payload.
+  Master sends such requests to update and invalidate entries in the device
+  IOTLB. The slave has to acknowledge the request with sending zero as u64
+  payload for success, non-zero otherwise.
+  This requ
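
For readers following the payload description in the patch above, the IOTLB message can be pictured as a plain C struct. This is only a sketch built from the field list in the spec text: the names `iotlb_msg_sketch`, `iotlb_make_update` and `iotlb_make_invalidate` are hypothetical, and the struct makes no claim about the padding or wire layout of the real `struct vhost_iotlb_msg`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Permission values as listed in the spec text. */
enum { IOTLB_PERM_NONE = 0, IOTLB_PERM_RO = 1, IOTLB_PERM_WO = 2, IOTLB_PERM_RW = 3 };

/* Message types as listed in the spec text. */
enum {
    IOTLB_TYPE_MISS = 1,
    IOTLB_TYPE_UPDATE = 2,
    IOTLB_TYPE_INVALIDATE = 3,
    IOTLB_TYPE_ACCESS_FAIL = 4
};

/* Hypothetical mirror of the documented fields; not the kernel definition. */
struct iotlb_msg_sketch {
    uint64_t iova;   /* I/O virtual address programmed by the guest */
    uint64_t size;   /* size of the range */
    uint64_t uaddr;  /* user virtual address (update messages only) */
    uint8_t  perm;   /* one of IOTLB_PERM_* */
    uint8_t  type;   /* one of IOTLB_TYPE_* */
};

/* Update: type 2, with IOVA, size, user address and permissions filled in. */
struct iotlb_msg_sketch iotlb_make_update(uint64_t iova, uint64_t size,
                                          uint64_t uaddr, uint8_t perm)
{
    struct iotlb_msg_sketch m;
    memset(&m, 0, sizeof(m));
    m.iova = iova;
    m.size = size;
    m.uaddr = uaddr;
    m.perm = perm;
    m.type = IOTLB_TYPE_UPDATE;
    return m;
}

/* Invalidate: type 3, only IOVA and size are meaningful. */
struct iotlb_msg_sketch iotlb_make_invalidate(uint64_t iova, uint64_t size)
{
    struct iotlb_msg_sketch m;
    memset(&m, 0, sizeof(m));
    m.iova = iova;
    m.size = size;
    m.type = IOTLB_TYPE_INVALIDATE;
    return m;
}
```

The two constructors reflect the text's distinction between update events (all fields set) and invalidation events (only IOVA and size set).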

Re: [Qemu-devel] [PATCH 5/6] hw/misc: add a TMP42{1, 2, 3} device model

2017-06-02 Thread Peter Maydell

On 15 May 2017 at 06:51, Cédric Le Goater  wrote:
> Largely inspired by the TMP105 temperature sensor, here is a model for
> the TMP42{1,2,3} temperature sensors.
>
> Specs can be found here :
>
> http://www.ti.com/lit/gpn/tmp421
>
> Signed-off-by: Cédric Le Goater 

This turns out to segfault on OSX and BSD...

> +static void tmp421_class_init(ObjectClass *klass, void *data)
> +{
> +DeviceClass *dc = DEVICE_CLASS(klass);
> +I2CSlaveClass *k = I2C_SLAVE_CLASS(klass);
> +TMP421Class *sc = TMP421_CLASS(klass);
> +
> +k->init = tmp421_init;
> +k->event = tmp421_event;
> +k->recv = tmp421_rx;
> +k->send = tmp421_tx;
> +dc->vmsd = &vmstate_tmp421;
> +sc->dev = (DeviceInfo *) data;

...because this write to sc->dev is off the end of a
malloced block. You can see this with valgrind on Linux:

$ valgrind ./build/x86/aarch64-softmmu/qemu-system-aarch64
==14009== Memcheck, a memory error detector
[...]
==14009== Invalid write of size 8
==14009==    at 0x67D3D9: tmp421_class_init (tmp421.c:374)
==14009==    by 0x80CD51: type_initialize (object.c:334)
==14009==    by 0x80CABC: type_initialize (object.c:286)
==14009==    by 0x80DF49: object_class_foreach_tramp (object.c:805)
==14009==    by 0x1B7AE33F: g_hash_table_foreach (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.4800.2)
==14009==    by 0x80E028: object_class_foreach (object.c:827)
==14009==    by 0x80E1F7: object_class_get_list (object.c:881)
==14009==    by 0x5500A8: find_default_machine (vl.c:1488)
==14009==    by 0x554039: select_machine (vl.c:2745)
==14009==    by 0x55730E: main (vl.c:4091)
==14009==  Address 0x2b4e7cd0 is 0 bytes after a block of size 224 alloc'd
==14009==    at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==14009==    by 0x1B7C4770: g_malloc0 (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.4800.2)
==14009==    by 0x80CA8A: type_initialize (object.c:282)
==14009==    by 0x80CABC: type_initialize (object.c:286)
==14009==    by 0x80DF49: object_class_foreach_tramp (object.c:805)
==14009==    by 0x1B7AE33F: g_hash_table_foreach (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.4800.2)
==14009==    by 0x80E028: object_class_foreach (object.c:827)
==14009==    by 0x80E1F7: object_class_get_list (object.c:881)
==14009==    by 0x5500A8: find_default_machine (vl.c:1488)
==14009==    by 0x554039: select_machine (vl.c:2745)
==14009==    by 0x55730E: main (vl.c:4091)
==14009==

> +}
> +
> +static const TypeInfo tmp421_info = {
> +.name  = TYPE_TMP421,
> +.parent= TYPE_I2C_SLAVE,
> +.instance_size = sizeof(TMP421State),
> +.instance_init = tmp421_initfn,
> +.class_init= tmp421_class_init,

...which is because this TypeInfo doesn't set the
.class_size field, so only sizeof(I2CSlaveClass) is
allocated, not sizeof(TMP421Class).

(http://wiki.qemu.org/Documentation/QOMConventions
lists this as one of the things you need to do for a class
that has a class struct).
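
To make the failure mode concrete, here is a small stand-alone model of the size rule described above. It is not QEMU code: `TypeInfoSketch`, `ParentClass`, `ChildClass`, `effective_class_size` and `class_write_fits` are hypothetical names, and the model only assumes what the review states, namely that a type with no `.class_size` gets a class block the size of its parent's.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for I2CSlaveClass and TMP421Class; the layouts are illustrative. */
typedef struct ParentClass {
    void (*send)(void);
    void (*recv)(void);
} ParentClass;

typedef struct ChildClass {
    ParentClass parent;
    const void *dev;    /* the extra field written in class_init */
} ChildClass;

typedef struct TypeInfoSketch {
    const char *name;
    const struct TypeInfoSketch *parent;
    size_t class_size;  /* 0 means "not set, fall back to the parent" */
} TypeInfoSketch;

/* Size actually allocated for the class struct under the fallback rule. */
size_t effective_class_size(const TypeInfoSketch *ti)
{
    while (ti && ti->class_size == 0) {
        ti = ti->parent;
    }
    return ti ? ti->class_size : 0;
}

/* Does a write of len bytes at offset stay inside the allocated class? */
int class_write_fits(const TypeInfoSketch *ti, size_t offset, size_t len)
{
    return offset + len <= effective_class_size(ti);
}
```

With `.class_size` left at zero, the write to `dev` begins exactly at the end of the parent-sized allocation, which matches the "0 bytes after a block" line in the valgrind report; setting `.class_size = sizeof(ChildClass)` makes the same write fit.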

> +};
> +
> +static void tmp421_register_types(void)
> +{
> +int i;
> +
> +type_register_static(&tmp421_info);
> +for (i = 0; i < ARRAY_SIZE(devices); ++i) {
> +TypeInfo ti = {
> +.name   = devices[i].name,
> +.parent = TYPE_TMP421,
> +.class_init = tmp421_class_init,
> +.class_data = (void *) &devices[i],

This TypeInfo is a bit odd too, looking more closely at it.
It defines a type that's a subclass of TYPE_TMP421, but which
has a class_init method that's the same function as TYPE_TMP421's
class_init method. That means we call it twice for the same
class, which doesn't seem right.

> +};
> +type_register(&ti);
> +}
> +}
> +
> +type_init(tmp421_register_types)
> --
> 2.7.4

I'm going to drop this patch and the next one from my arm
queue, but leave 1-4 in.

thanks
-- PMM

Re: [Qemu-devel] [PATCH v1] s390x/cpumodel: wire up cpu type + id for TCG

2017-06-02 Thread David Hildenbrand


>> +
>> +#ifndef CONFIG_USER_ONLY
>> +void HELPER(stidp)(CPUS390XState *env, uint64_t addr)
>> +{
>> +S390CPU *cpu = s390_env_get_cpu(env);
>> +uint64_t cpuid = s390_cpuid_from_cpu_model(cpu->model);
>> +
>> +if (addr & 0x7) {
>> +program_interrupt(env, PGM_SPECIFICATION, ILEN_LATER_INC);
>> +return;
>> +}
>> +
>> +/* basic mode, write the cpu address into the first 4 bit of the ID */
>> +cpuid |= ((uint64_t)env->cpu_num & 0xf) << 54;
>> +cpu_stq_data(env, addr, cpuid);
>> +}
>> +#endif
> 
> I don't really see the point of using an helper instead of just updating
> the existing code. From what I understand the cpuid does not change at
> runtime, so the s390_cpuid_from_cpu_model function can also be called 
> from translate.c.
> 
> Aurelien
> 

From what I can see, conditional exceptions are more complicated to
implement without helpers (involves generating compares, jumps and so
on). As this function is not expected to be executed on hot paths, I
think moving it into a helper is the right thing to do.
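
As a rough illustration of why the helper form is convenient, the conditional exception becomes an ordinary C `if`. The sketch below is not the QEMU helper: `stidp_sketch` and the `STIDP_*` codes are hypothetical, the return code stands in for raising a specification exception, and the mask and shift are copied from the quoted patch as-is.

```c
#include <assert.h>
#include <stdint.h>

enum { STIDP_OK = 0, STIDP_SPEC_EXCEPTION = 1 };

/*
 * Model of the quoted helper: reject a misaligned operand address,
 * otherwise merge the CPU address into the base CPU id and store it.
 */
int stidp_sketch(uint64_t base_cpuid, uint32_t cpu_num, uint64_t addr,
                 uint64_t *out)
{
    if (addr & 0x7) {
        /* Operand is not doubleword aligned: report and store nothing. */
        return STIDP_SPEC_EXCEPTION;
    }
    /* Same expression as in the quoted patch. */
    *out = base_cpuid | (((uint64_t)cpu_num & 0xf) << 54);
    return STIDP_OK;
}
```

In generated code the same check would need an explicit compare and branch around the exception path, which is the extra complexity the reply refers to.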

-- 

Thanks,

David

Re: [Qemu-devel] [PATCH v2 2/4] dump: add vmcoreinfo ELF note

2017-06-02 Thread Marc-André Lureau

Hi

On Thu, Jun 1, 2017 at 10:38 PM Laszlo Ersek  wrote:

> On 06/01/17 15:03, Marc-André Lureau wrote:
> > Read vmcoreinfo note from guest memory when dump_info provides the
> > address, and write it as an ELF note in the dump.
> >
> > NUMBER(phys_base) in vmcoreinfo has only been recently introduced in
> > Linux 4.10 ("kexec: export the value of phys_base instead of symbol
> > address"). To accommodate for older kernels, modify the vmcoreinfo to add
> > the new fields and help newer crash that will use it.
>
> I think here you mean
>
>   modify the DumpState structure
>
> rather than
>
>   modify the vmcoreinfo
>

No, it's actually the content of the vmcoreinfo that is modified.


>
> >
> > Signed-off-by: Marc-André Lureau 
> > ---
> >  include/sysemu/dump.h |   2 +
> >  dump.c| 133
> ++
> >  2 files changed, 135 insertions(+)
> >
> > diff --git a/include/sysemu/dump.h b/include/sysemu/dump.h
> > index 2672a15f8b..b8a7a1e41d 100644
> > --- a/include/sysemu/dump.h
> > +++ b/include/sysemu/dump.h
> > @@ -192,6 +192,8 @@ typedef struct DumpState {
> >* this could be used to calculate
> >* how much work we have
> >* finished. */
> > +uint8_t *vmcoreinfo;
>
> Can you document that this is an ELF note?
>

ok


>
> > +size_t vmcoreinfo_size;
> >  } DumpState;
> >
> >  uint16_t cpu_to_dump16(DumpState *s, uint16_t val);
> > diff --git a/dump.c b/dump.c
> > index bdf3270f02..6911ffad8b 100644
> > --- a/dump.c
> > +++ b/dump.c
> > @@ -27,6 +27,7 @@
> >  #include "qapi/qmp/qerror.h"
> >  #include "qmp-commands.h"
> >  #include "qapi-event.h"
> > +#include "qemu/error-report.h"
> >
> >  #include 
> >  #ifdef CONFIG_LZO
> > @@ -88,6 +89,8 @@ static int dump_cleanup(DumpState *s)
> >  qemu_mutex_unlock_iothread();
> >  }
> >  }
> > +g_free(s->vmcoreinfo);
> > +s->vmcoreinfo = NULL;
> >
> >  return 0;
> >  }
>
> I vaguely feel that this should be moved in front of resuming VM
> execution. I don't have a strong reason, just consistency with the rest
> of the cleanup.
>

You mean before vm_start(), ok that makes sense (although I doubt dump can
be reentered as long as the status isn't changed).

> @@ -238,6 +241,19 @@ static inline int cpu_index(CPUState *cpu)
> >  return cpu->cpu_index + 1;
> >  }
> >
> > +static void write_vmcoreinfo_note(WriteCoreDumpFunction f, DumpState *s,
> > +  Error **errp)
> > +{
> > +int ret;
> > +
> > +if (s->vmcoreinfo) {
> > +ret = f(s->vmcoreinfo, s->vmcoreinfo_size, s);
> > +if (ret < 0) {
> > +error_setg(errp, "dump: failed to write vmcoreinfo");
> > +}
> > +}
> > +}
> > +
> >  static void write_elf64_notes(WriteCoreDumpFunction f, DumpState *s,
> >Error **errp)
> >  {
> > @@ -261,6 +277,8 @@ static void write_elf64_notes(WriteCoreDumpFunction
> f, DumpState *s,
> >  return;
> >  }
> >  }
> > +
> > +write_vmcoreinfo_note(f, s, errp);
> >  }
> >
> >  static void write_elf32_note(DumpState *s, Error **errp)
> > @@ -306,6 +324,8 @@ static void write_elf32_notes(WriteCoreDumpFunction
> f, DumpState *s,
> >  return;
> >  }
> >  }
> > +
> > +write_vmcoreinfo_note(f, s, errp);
> >  }
> >
> >  static void write_elf_section(DumpState *s, int type, Error **errp)
> > @@ -717,6 +737,50 @@ static int buf_write_note(const void *buf, size_t
> size, void *opaque)
> >  return 0;
> >  }
> >
> > +static void get_note_sizes(DumpState *s, const void *note,
> > +   uint64_t *note_head_size,
> > +   uint64_t *name_size,
> > +   uint64_t *desc_size)
> > +{
>
> I'm not happy that I have to reverse engineer what this function does.
> Please document it in the commit message and/or in a function-level
> comment, especially regarding the actual permitted types of *note.
>
>
ok, would that help?:
 /*
 * This function retrieves various sizes from an elf header.
 *
 * @note has to be a valid ELF note. The return sizes are unmodified
 * (not padded or rounded up to be multiple of 4).
 */
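
To show what "unmodified" versus "padded" sizes means here, the following is a stand-alone sketch rather than the patch's `get_note_sizes()`. It assumes the common note header of three 32-bit words in host byte order; `NoteHeaderSketch`, `note_sizes_sketch` and `note_total_size_sketch` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed note header layout: three 32-bit words, host byte order. */
typedef struct NoteHeaderSketch {
    uint32_t n_namesz;  /* length of the name, including the NUL */
    uint32_t n_descsz;  /* length of the descriptor */
    uint32_t n_type;
} NoteHeaderSketch;

static uint64_t round_up4(uint64_t x)
{
    return (x + 3) & ~(uint64_t)3;
}

/* Report the raw sizes, with no padding applied. */
void note_sizes_sketch(const void *note, uint64_t *head_size,
                       uint64_t *name_size, uint64_t *desc_size)
{
    NoteHeaderSketch hdr;
    memcpy(&hdr, note, sizeof(hdr));  /* avoids unaligned access */
    *head_size = sizeof(hdr);
    *name_size = hdr.n_namesz;
    *desc_size = hdr.n_descsz;
}

/* Bytes the whole note occupies once each part is padded to 4 bytes. */
uint64_t note_total_size_sketch(const void *note)
{
    uint64_t head, name, desc;
    note_sizes_sketch(note, &head, &name, &desc);
    return round_up4(head) + round_up4(name) + round_up4(desc);
}
```

A caller that copies the note out of guest memory needs the padded total, while a caller that inspects the name or descriptor needs the raw sizes, which is why the proposed comment spells out that nothing is rounded.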

> Very similar functionality exists in "target/i386/arch_dump.c" already.
> Is there a (remote) possibility to extract / refactor / share code?
>
>
Although the 2 functions share a few similarities, since they compute
various sizes based on elf class, they are actually quite different. I
don't see an easy way to refactor in a common function that would make
sense.

> +uint64_t note_head_sz;
> > +uint64_t name_sz;
> > +uint64_t desc_sz;
> > +
> > +if (s->dump_info.d_class == ELFCLASS64) {
>
> Ugh, this is extremely confusing. This refers to DumpState.dump_info,
> which has type ArchDumpInfo. But in the previous patch we also introduce
> a global "dump_info" variable,

Re: [Qemu-devel] [PATCH 04/17] qapi: merge QInt and QFloat in QNum

2017-06-02 Thread Marc-André Lureau

Hi

On Fri, Jun 2, 2017 at 10:30 AM Markus Armbruster  wrote:

> Marc-André Lureau  writes:
>
> > Hi
> >
> > On Tue, May 30, 2017 at 6:23 PM Markus Armbruster 
> wrote:
> >
> >> Marc-André Lureau  writes:
> >>
> >> > Hi
> >> >
> >> > On Thu, May 11, 2017 at 6:30 PM Markus Armbruster 
> wrote:
> [...]
> >> >> g_assert_not_reached() is problematic, see "[PATCH] checkpatch:
> Disallow
> >> >> glib asserts in main code".
> >> >>
> >> >> Message-Id: <20170427165526.19836-1-dgilb...@redhat.com>
> >> >> https://lists.gnu.org/archive/html/qemu-devel/2017-04/msg05499.html
> >> >>
> >> >>
> >> > Actually g_assert() and g_assert_not_reached() are accepted.
> >>
> >> What exactly does g_assert() buy us over plain assert(), and
> >> g_assert_not_reached() over assert(0)?
> >>
> >
> > g_assert() brings a bit more context, afaik, can be trapped for error
> > testing, and error reporting can be handled by a handler. Not that useful
> > to qemu, but could be for the graphical UI though.
> >
> > g_assert_not_reached() is quite more readable than assert(0)
>
> I'm all for making intent explicit, but what else could assert(0)
> possibly mean?
>
> >> qapi/ overwhelmingly uses assert().
> >>
> >
> > ok, it's already a mix of assert & g_assert in qemu though
>
> True.
>
> In my opinion, we should use only one outside tests.  g_assert() if it
> adds value, else plain assert().  "Outside tests", because g_assert()
> might add sufficient value in tests even when it doesn't elsewhere.
>
>
g_assert*() is useful in the unit under test, so you can check the assert
behaviour. This could be useful in particular for the assertions under
qapi/ since we have unit tests for those


> Until then, I prefer to use only one *locally*.  In qapi/, that's plain
> assert() now.
>

ok, fair enough
-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v2 14/45] qapi: update the qobject visitor to use QNUM_U64

2017-06-02 Thread Markus Armbruster

Marc-André Lureau  writes:

> Switch to use QNum/uint where appropriate to remove i64 limitation.
>
> The input visitor will cast i64 input to u64 for compatibility
> reasons (existing json QMP client already use negative i64 for large
> u64, and expect an implicit cast in qemu).
>
> Signed-off-by: Marc-André Lureau 
> ---
>  hw/i386/acpi-build.c|  3 +--
>  qapi/qobject-input-visitor.c| 21 -
>  qapi/qobject-output-visitor.c   |  3 +--
>  tests/test-qobject-input-visitor.c  |  7 ++-
>  tests/test-qobject-output-visitor.c | 28 +---
>  5 files changed, 41 insertions(+), 21 deletions(-)
>
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index 1709efdf1c..ba2be1e9da 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -2634,10 +2634,9 @@ static bool acpi_get_mcfg(AcpiMcfgInfo *mcfg)
>  if (!o) {
>  return false;
>  }
> -if (!qnum_get_int(qobject_to_qnum(o), &val)) {
> +if (!qnum_get_uint(qobject_to_qnum(o), &mcfg->mcfg_base)) {
>  g_assert_not_reached();
>  }
> -mcfg->mcfg_base = val;
>  qobject_decref(o);
>  
>  o = object_property_get_qobject(pci_host, PCIE_HOST_MCFG_SIZE, NULL);
> diff --git a/qapi/qobject-input-visitor.c b/qapi/qobject-input-visitor.c
> index 74835ba339..7f9d6f57a1 100644
> --- a/qapi/qobject-input-visitor.c
> +++ b/qapi/qobject-input-visitor.c
> @@ -417,7 +417,6 @@ static void qobject_input_type_int64_keyval(Visitor *v, 
> const char *name,
>  static void qobject_input_type_uint64(Visitor *v, const char *name,
>uint64_t *obj, Error **errp)
>  {
> -/* FIXME: qobject_to_qnum mishandles values over INT64_MAX */
>  QObjectInputVisitor *qiv = to_qiv(v);
>  QObject *qobj = qobject_input_get_object(qiv, name, true, errp);
>  QNum *qnum;
> @@ -427,11 +426,23 @@ static void qobject_input_type_uint64(Visitor *v, const 
> char *name,
>  return;
>  }
>  qnum = qobject_to_qnum(qobj);
> -if (!qnum || !qnum_get_int(qnum, &val)) {
> -error_setg(errp, QERR_INVALID_PARAMETER_TYPE,
> -   full_name(qiv, name), "integer");
> +if (!qnum) {
> +goto err;
> +}
> +
> +if (qnum_get_uint(qnum, obj)) {
> +return;
>  }
> -*obj = val;
> +
> +/* Need to accept negative values for backward compatibility */
> +if (qnum_get_int(qnum, &val)) {
> +*obj = val;
> +return;
> +}
> +
> +err:
> +error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
> +   full_name(qiv, name), "uint64");
>  }
>  
>  static void qobject_input_type_uint64_keyval(Visitor *v, const char *name,
> diff --git a/qapi/qobject-output-visitor.c b/qapi/qobject-output-visitor.c
> index 2ca5093b22..70be84ccb5 100644
> --- a/qapi/qobject-output-visitor.c
> +++ b/qapi/qobject-output-visitor.c
> @@ -150,9 +150,8 @@ static void qobject_output_type_int64(Visitor *v, const 
> char *name,
>  static void qobject_output_type_uint64(Visitor *v, const char *name,
> uint64_t *obj, Error **errp)
>  {
> -/* FIXME values larger than INT64_MAX become negative */
>  QObjectOutputVisitor *qov = to_qov(v);
> -qobject_output_add(qov, name, qnum_from_int(*obj));
> +qobject_output_add(qov, name, qnum_from_uint(*obj));
>  }
>  

Before the patch, uint64_t values above INT64_MAX are sent as negative
values, e.g. UINT64_MAX is sent as -1.

After the patch, they are sent unmodified.  Clearly a bug fix, but we
have to consider compatibility issues anyway.  You wrote that libvirt
should cope fine, because its parsing of unsigned integers accepts
negative values modulo 2^64.  There's hope that other clients will, too.

There's one thing left to do: please document the incompatible bug fix
in the commit message.
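
The compatibility point can be reproduced outside QEMU with a few lines of C. This is an illustrative sketch only: `format_old`, `format_new` and `parse_lenient` are hypothetical names, not QEMU or libvirt functions, and the signed cast relies on the two's complement wrap-around that mainstream compilers implement.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Old behaviour: the value passes through int64_t, so large values print negative. */
int format_old(uint64_t v, char *buf, size_t len)
{
    return snprintf(buf, len, "%" PRId64, (int64_t)v);
}

/* New behaviour: the value is printed as an unsigned number. */
int format_new(uint64_t v, char *buf, size_t len)
{
    return snprintf(buf, len, "%" PRIu64, v);
}

/*
 * A lenient client-side parser. C's strtoull() accepts a leading '-' and
 * negates the result in the unsigned type, i.e. modulo 2^64 where
 * unsigned long long is 64 bits wide.
 */
uint64_t parse_lenient(const char *s)
{
    return (uint64_t)strtoull(s, NULL, 10);
}
```

A client that parses this way sees the same value before and after the fix; a client that parses strictly as a signed 64-bit integer would start rejecting the large values, which is why the change deserves a note in the commit message.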

>  static void qobject_output_type_bool(Visitor *v, const char *name, bool *obj,
> diff --git a/tests/test-qobject-input-visitor.c 
> b/tests/test-qobject-input-visitor.c
> index 983c59c474..6f94fc677c 100644
> --- a/tests/test-qobject-input-visitor.c
> +++ b/tests/test-qobject-input-visitor.c
> @@ -122,7 +122,6 @@ static void test_visitor_in_int(TestInputVisitorData 
> *data,
>  static void test_visitor_in_uint(TestInputVisitorData *data,
>  const void *unused)
>  {
> -Error *err = NULL;
>  uint64_t res = 0;
>  int64_t i64;
>  double dbl;
> @@ -146,12 +145,10 @@ static void test_visitor_in_uint(TestInputVisitorData 
> *data,
>  visit_type_uint64(v, NULL, &res, &error_abort);
>  g_assert_cmpuint(res, ==, (uint64_t)-value);
>  
> -/* BUG: value between INT64_MAX+1 and UINT64_MAX rejected */
> -
>  v = visitor_input_test_init(data, "18446744073709551574");
>  
> -visit_type_uint64(v, NULL, &res, &err);
> -error_free_or_abort(&err);
> +visit_type_uint64(v, NULL, &res, &error_abort);
> +g_assert_cmpuint(res, ==, 18446744073709551574U)

Re: [Qemu-devel] [PATCH v2 14/45] qapi: update the qobject visitor to use QNUM_U64

2017-06-02 Thread Markus Armbruster

One more nitpick:

Marc-André Lureau  writes:

> Switch to use QNum/uint where appropriate to remove i64 limitation.
>
> The input visitor will cast i64 input to u64 for compatibility
> reasons (existing json QMP client already use negative i64 for large
> u64, and expect an implicit cast in qemu).
>
> Signed-off-by: Marc-André Lureau 
[...]
> diff --git a/tests/test-qobject-output-visitor.c 
> b/tests/test-qobject-output-visitor.c
> index 3180d8cbde..d9f106d52e 100644
> --- a/tests/test-qobject-output-visitor.c
> +++ b/tests/test-qobject-output-visitor.c
> @@ -602,17 +602,31 @@ static void check_native_list(QObject *qobj,
>  qlist = qlist_copy(qobject_to_qlist(qdict_get(qdict, "data")));
>  
>  switch (kind) {
> -case USER_DEF_NATIVE_LIST_UNION_KIND_S8:
> -case USER_DEF_NATIVE_LIST_UNION_KIND_S16:
> -case USER_DEF_NATIVE_LIST_UNION_KIND_S32:
> -case USER_DEF_NATIVE_LIST_UNION_KIND_S64:
>  case USER_DEF_NATIVE_LIST_UNION_KIND_U8:
>  case USER_DEF_NATIVE_LIST_UNION_KIND_U16:
>  case USER_DEF_NATIVE_LIST_UNION_KIND_U32:
>  case USER_DEF_NATIVE_LIST_UNION_KIND_U64:
> -/* all integer elements in JSON arrays get stored into QNums when
> - * we convert to QObjects, so we can check them all in the same
> - * fashion, so simply fall through here
> +for (i = 0; i < 32; i++) {
> +QObject *tmp;
> +QNum *qvalue;
> +uint64_t val;
> +
> +tmp = qlist_peek(qlist);
> +g_assert(tmp);
> +qvalue = qobject_to_qnum(tmp);
> +g_assert(qnum_get_uint(qvalue, &val));
> +g_assert_cmpuint(val, ==, i);
> +qobject_decref(qlist_pop(qlist));
> +}
> +break;
> +
> +case USER_DEF_NATIVE_LIST_UNION_KIND_S8:
> +case USER_DEF_NATIVE_LIST_UNION_KIND_S16:
> +case USER_DEF_NATIVE_LIST_UNION_KIND_S32:
> +case USER_DEF_NATIVE_LIST_UNION_KIND_S64:
> +/* All signed integer elements in JSON arrays get stored into
> + * QInts when we convert to QObjects, so we can check them all
> + * in the same fashion, so simply fall through here.
>   */
>  case USER_DEF_NATIVE_LIST_UNION_KIND_INTEGER:
>  for (i = 0; i < 32; i++) {

Wing both ends of the comment, please.

[Qemu-devel] [PATCH v20 30/30] block: release persistent bitmaps on inactivate

2017-06-02 Thread Vladimir Sementsov-Ogievskiy

We should release them here to reload on invalidate cache.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block.c  |  4 
 block/dirty-bitmap.c | 29 +++--
 include/block/dirty-bitmap.h |  1 +
 3 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/block.c b/block.c
index a1e67759b9..1043e64a19 100644
--- a/block.c
+++ b/block.c
@@ -4110,6 +4110,10 @@ static int bdrv_inactivate_recurse(BlockDriverState *bs,
 }
 }
 
+/* At this point persistent bitmaps should be already stored by the format
+ * driver */
+bdrv_release_persistent_dirty_bitmaps(bs);
+
 return 0;
 }
 
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 6b43ad04fc..45b18dd3f3 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -301,13 +301,18 @@ void bdrv_dirty_bitmap_truncate(BlockDriverState *bs)
 }
 }
 
-static void bdrv_do_release_matching_dirty_bitmap(BlockDriverState *bs,
-  BdrvDirtyBitmap *bitmap,
-  bool only_named)
+static bool bdrv_dirty_bitmap_has_name(BdrvDirtyBitmap *bitmap)
+{
+return !!bdrv_dirty_bitmap_name(bitmap);
+}
+
+static void bdrv_do_release_matching_dirty_bitmap(
+BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
+bool (*cond)(BdrvDirtyBitmap *bitmap))
 {
 BdrvDirtyBitmap *bm, *next;
 QLIST_FOREACH_SAFE(bm, &bs->dirty_bitmaps, list, next) {
-if ((!bitmap || bm == bitmap) && (!only_named || bm->name)) {
+if ((!bitmap || bm == bitmap) && (!cond || cond(bm))) {
 assert(!bm->active_iterators);
 assert(!bdrv_dirty_bitmap_frozen(bm));
 assert(!bm->meta);
@@ -328,7 +333,7 @@ static void bdrv_do_release_matching_dirty_bitmap(BlockDriverState *bs,
 
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
 {
-bdrv_do_release_matching_dirty_bitmap(bs, bitmap, false);
+bdrv_do_release_matching_dirty_bitmap(bs, bitmap, NULL);
 }
 
 /**
@@ -338,7 +343,19 @@ void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
  */
 void bdrv_release_named_dirty_bitmaps(BlockDriverState *bs)
 {
-bdrv_do_release_matching_dirty_bitmap(bs, NULL, true);
+bdrv_do_release_matching_dirty_bitmap(bs, NULL, bdrv_dirty_bitmap_has_name);
+}
+
+/**
+ * Release all persistent dirty bitmaps attached to a BDS (for use in
+ * bdrv_inactivate_recurse()).
+ * There must not be any frozen bitmaps attached.
+ * This function does not remove persistent bitmaps from the storage.
+ */
+void bdrv_release_persistent_dirty_bitmaps(BlockDriverState *bs)
+{
+bdrv_do_release_matching_dirty_bitmap(bs, NULL,
+  bdrv_dirty_bitmap_get_persistance);
 }
 
 /**
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 5fac2d8411..6fec93bdeb 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -25,6 +25,7 @@ BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs,
 void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap);
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap);
 void bdrv_release_named_dirty_bitmaps(BlockDriverState *bs);
+void bdrv_release_persistent_dirty_bitmaps(BlockDriverState *bs);
 void bdrv_remove_persistent_dirty_bitmap(BlockDriverState *bs,
  const char *name,
  Error **errp);
-- 
2.11.1
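
The refactoring in this patch swaps a boolean flag for a predicate callback. A minimal stand-alone model of that pattern is sketched below; `BitmapSketch`, `release_matching` and the two predicates are hypothetical and use a plain array instead of QEMU's list macros, with "release" reduced to setting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct BitmapSketch {
    const char *name;   /* NULL for an anonymous bitmap */
    bool persistent;
    bool released;
} BitmapSketch;

bool bitmap_has_name(const BitmapSketch *bm)
{
    return bm->name != NULL;
}

bool bitmap_is_persistent(const BitmapSketch *bm)
{
    return bm->persistent;
}

/*
 * Release every entry that matches both filters:
 *   only == NULL means "any entry", otherwise just that entry;
 *   cond == NULL means "no extra condition".
 * Returns the number of entries released by this call.
 */
size_t release_matching(BitmapSketch *list, size_t n,
                        const BitmapSketch *only,
                        bool (*cond)(const BitmapSketch *))
{
    size_t released = 0;
    for (size_t i = 0; i < n; i++) {
        BitmapSketch *bm = &list[i];
        if (bm->released) {
            continue;
        }
        if ((!only || bm == only) && (!cond || cond(bm))) {
            bm->released = true;
            released++;
        }
    }
    return released;
}
```

Adding a new selection rule, such as "persistent only", then needs just one more predicate rather than another boolean parameter on the shared worker.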

[Qemu-devel] [PATCH v20 05/30] block: fix bdrv_dirty_bitmap_granularity signature

2017-06-02 Thread Vladimir Sementsov-Ogievskiy

Make the getter signature const-correct. This allows other functions with a
const dirty bitmap parameter to use bdrv_dirty_bitmap_granularity().

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
Reviewed-by: John Snow 
Reviewed-by: Kevin Wolf 
---
 block/dirty-bitmap.c | 2 +-
 include/block/dirty-bitmap.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 519737c8d3..186941cfc3 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -388,7 +388,7 @@ uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs)
 return granularity;
 }
 
-uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap)
+uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap *bitmap)
 {
 return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
 }
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 9dea14ba03..7cbe623ba7 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -29,7 +29,7 @@ void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs);
 uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs);
-uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap);
+uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap *bitmap);
 uint32_t bdrv_dirty_bitmap_meta_granularity(BdrvDirtyBitmap *bitmap);
 bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap);
 bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap);
-- 
2.11.1
