date:20130319

Re: [Qemu-devel] [PATCH 7/8] spice-qemu-char: Register interface on post load

2013-03-19 Thread Gerd Hoffmann

On 03/14/13 17:36, Hans de Goede wrote:
> From: Alon Levy 
> 
> The target has not seen the guest_connected event via
> spice_chr_guest_open or spice_chr_write, and so spice server wrongly
> assumes there is no agent active, while the client continues to send
> motion events only by the agent channel, which the server ignores. The
> net effect is that the mouse is static in the guest.
> 
> By registering the interface on post load spice server will pass on the
> agent messages fixing the mouse behavior after migration.

> +static VMStateDescription spice_chr_vmstate = {
> +.name   = "spice-chr",
> +.version_id = 1,
> +.minimum_version_id = 1,
> +.post_load  = spice_chr_post_load,
> +.fields = (VMStateField[]) {
> +VMSTATE_UINT32(guest_open, SpiceCharDriver),
> +VMSTATE_END_OF_LIST()
> +},
> +};

> +vmstate_register(NULL, -1, &spice_chr_vmstate, s);
> +

That is a showstopper.  If there are two of these there is no reliable
way to figure which is which.

Do we actually need that in the first place?  I assume virtio-serial
will migrate the guest-open state anyway.  So if we can use that somehow
we don't have to live migrate the chardev state ...

cheers,
  Gerd

Re: [Qemu-devel] Spice / usb-redir chardev flowcontrol patches v2

2013-03-19 Thread Gerd Hoffmann

On 03/18/13 13:21, Amit Shah wrote:
> On (Thu) 14 Mar 2013 [17:36:49], Hans de Goede wrote:
>> Here is v2 of the series adding watch support to the spicevmc chardev backend
>> and flowcontrol support to the usb-redir device. It also includes a few
>> virtio-consoled bugfixes which were found during the development of this
>> series.
>>
>> Note that this series is based *on top of* Gerd Hoffmann's chardev.5 series
>> to avoid conflicts with that series.
> 
> Looks good.
> 
> Acked-by: Amit Shah 
> 
> Gerd, will you pick these up via your tree?

I'll go pick the usbredir patch into the usb queue.  The spice bits need
some discussion, but with your ack I'll can take them through the spice
queue once the remaining issue is settled.

cheers,
  Gerd

Re: [Qemu-devel] [PATCH] ui/cocoa.m: Fix compile failures introduced by recent console changes

2013-03-19 Thread Gerd Hoffmann

On 03/18/13 21:28, Peter Maydell wrote:
> Fix various compilation failures introduced by the recent console
> changes.
> 
> Signed-off-by: Peter Maydell 
> ---
>  ui/cocoa.m | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/ui/cocoa.m b/ui/cocoa.m
> index 8e0eaa2..048cc97 100644
> --- a/ui/cocoa.m
> +++ b/ui/cocoa.m
> @@ -403,6 +403,9 @@ QemuCocoaView *cocoaView;
>  {
>  COCOA_DEBUG("QemuCocoaView: switchSurface\n");
>  
> +int w = surface_width(surface);
> +int h = surface_height(surface);
> +
>  // update screenBuffer
>  if (dataProviderRef)
>  CGDataProviderRelease(dataProviderRef);
> @@ -1014,9 +1017,9 @@ static void cocoa_cleanup(void)
>  
>  static const DisplayChangeListenerOps dcl_ops = {
>  .dpy_name  = "cocoa",
> -.dpy_gfx_update = cocoa_update;
> -.dpy_gfx_switch = cocoa_switch;
> -.dpy_refresh = cocoa_refresh;
> +.dpy_gfx_update = cocoa_update,
> +.dpy_gfx_switch = cocoa_switch,
> +.dpy_refresh = cocoa_refresh,
>  };
>  
>  void cocoa_display_init(DisplayState *ds, int full_screen)

Acked-by: Gerd Hoffmann 

[ I'll go pick it into the next console patch series, but wouldn't
  mind if someone applies this directly to fix the macos build ].

cheers,
  Gerd

Re: [Qemu-devel] [PATCH v2 2/2] hw/arm_gic_common: Use vmstate struct rather than save/load functions

2013-03-19 Thread Gerd Hoffmann

  Hi,

>> (vmstate.h is getting hugely repetitive to the point that I'm
>> really tempted to say we should just autogenerate it. That way
>> you could define a fairly small set of things (arrays, base types,
>> safe vs unsafe, etc) and have a script generate the cross product,
>> rather than the current setup where there is a lot of hand written
>> repetition and a tendency to gaps in the coverage where nobody's
>> using them yet.)

> I can recall a qemu-devel discussion over a long-term QOM goals a while
> ago.Somebody suggested that in the future we will define devices state
> structures using some special macro which will be parsed during
> compilation, serializing each member for both QOM introspection and vmstate
> migration.

That is where I see the future too.  Michael Roth [ cc'ed ] has this on
his agenda.  We have code generation infrastructure for qapi and it
surely makes sense to reuse that for vmstate.

cheers,
  Gerd

Re: [Qemu-devel] [PATCH 01/15] pixman: add qemu_pixman_color()

2013-03-19 Thread Gerd Hoffmann

On 03/18/13 22:13, Søren Sandmann wrote:
> Gerd Hoffmann  writes:
> 
>> Helper function to map qemu colors (32bit integer + matching PixelFormat)
>> into pixman_color_t.
> 
> Are qemu colors premultiplied? If not, this function should probably
> premultiply before returning the color.

The alpha channel is usually not used.
Maybe I should just set alpha to 0 to make that clear ...

cheers,
  Gerd

Re: [Qemu-devel] [PATCH] QMP: TPM QMP and man page documentation updates

2013-03-19 Thread Markus Armbruster

[Note cc: Anthony for QAPI schema expertise]

Stefan Berger  writes:

> On 03/18/2013 12:16 PM, Markus Armbruster wrote:
>> Corey Bryant  writes:
>>
>>> Signed-off-by: Corey Bryant 
>>> ---
>>>   qemu-options.hx |  3 ++-
>>>   qmp-commands.hx | 59 
>>> +
>>>   2 files changed, 61 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/qemu-options.hx b/qemu-options.hx
>>> index 30fb85d..3b3cd0f 100644
>>> --- a/qemu-options.hx
>>> +++ b/qemu-options.hx
>>> @@ -2237,7 +2237,8 @@ Backend type must be:
>>>   @option{passthrough}.
>>> The specific backend type will determine the applicable
>>> options.
>>> -The @code{-tpmdev} option requires a @code{-device} option.
>>> +The @code{-tpmdev} option creates the TPM backend and requires a
>>> +@code{-device} option that specifies the TPM frontend interface model.
>>> Options to each backend are described below.
>>>   diff --git a/qmp-commands.hx b/qmp-commands.hx
>>> index b370060..4eda5ea 100644
>>> --- a/qmp-commands.hx
>>> +++ b/qmp-commands.hx
>>> @@ -2721,18 +2721,77 @@ EQMP
>>>   .mhandler.cmd_new = qmp_marshal_input_query_tpm,
>>>   },
>>>   +SQMP
>>> +query-tpm
>>> +-
>>> +
>>> +Return information about the TPM device.
>>> +
>>> +Arguments: None
>>> +
>>> +Example:
>>> +
>>> +-> { "execute": "query-tpm" }
>>> +<- { "return":
>>> + [
>>> +   { "model": "tpm-tis",
>>> + "tpm-options":
>>> +   { "type": "tpm-passthrough-options",
>>> + "data":
>>> +   { "cancel-path": "/sys/class/misc/tpm0/device/cancel",
>>> + "path": "/dev/tpm0"
>>> +   }
>>> +   },
>>> + "type": "passthrough",
>>> + "id": "tpm0"
>>> +   }
>>> + ]
>>> +   }
>>> +
>>> +EQMP
>>> +
>> "tpm-options" is a discriminated union.  How is its discriminator "type"
>> (here: "tpm-passthrough-options") related to the outer "type" (here:
>> "passthrough")?
>
> It gives you similar information twice. So there is a direct
> relationship between the two types.

Awkward and undocumented.

Relevant parts of qapi-schema.json:

{ 'enum': 'TpmType', 'data': [ 'passthrough' ] }

{ 'union': 'TpmTypeOptions',
   'data': { 'tpm-passthrough-options' : 'TPMPassthroughOptions' } }

{ 'type': 'TPMInfo',
  'data': {'id': 'str',
   'model': 'TpmModel',
   'type': 'TpmType',
   'tpm-options': 'TpmTypeOptions' } }

Type Netdev solves the same problem more elegantly:

{ 'union': 'NetClientOptions',
  'data': {
'none': 'NetdevNoneOptions',
'nic':  'NetLegacyNicOptions',
'user': 'NetdevUserOptions',
'tap':  'NetdevTapOptions',
'socket':   'NetdevSocketOptions',
'vde':  'NetdevVdeOptions',
'dump': 'NetdevDumpOptions',
'bridge':   'NetdevBridgeOptions',
'hubport':  'NetdevHubPortOptions' } }

{ 'type': 'Netdev',
  'data': {
'id':   'str',
'opts': 'NetClientOptions' } }

Uses the union's discriminator.  Straightforward.

Following Netdev precedence, we get:

{ 'union': 'TpmTypeOptions',
  'data': { 'passthrough' : 'TPMPassthroughOptions' } }

{ 'type': 'TPMInfo',
  'data': {'id': 'str',
   'model': 'TpmModel',
   'opts': 'TpmTypeOptions' } }

Duplication of type is gone.  No need for documentation.

Since enum TpmType is used elsewhere, it still gets duplicated in the
union's discriminator.  Anthony, is there a way to name the implicit
discriminator enum type for reference elsewhere in the schema?

Re: [Qemu-devel] [PATCH 3/3] iscsi_truncate: ensure there are no requests in flight

2013-03-19 Thread Peter Lieven


On 11.03.2013 11:16, Paolo Bonzini wrote:

Il 11/03/2013 11:05, Peter Lieven ha scritto:

ensure that there are no pending I/Os before calling
the sync readcapacity commands. the block_resize monitor
command will also flush all I/O, but double check in
case iscsi_truncate() is called from elsewhere in the
future.

Signed-off-by: Peter Lieven 
---
  block/iscsi.c |4 
  1 file changed, 4 insertions(+)

diff --git a/block/iscsi.c b/block/iscsi.c
index 3d52921..de20d53 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1167,6 +1167,10 @@ static int iscsi_truncate(BlockDriverState *bs,
int64_t offset)
  return -ENOTSUP;
  }

+/* ensure all async requests are completed before executing
+ * a sync readcapacity */
+bdrv_drain_all();
+
  if ((ret = iscsi_readcapacity_sync(iscsilun)) != 0) {
  return ret;
  }


NACK to this patch.  It would be a bug, let's fix it properly.

The other two are fine, however.


Paolo, can you ensure that Patch 1+2 get merged before qemu 1.5.0
to fix the regression.

Thanks,
Peter




Paolo

Re: [Qemu-devel] [RFC PATCH v4 00/30] ACPI memory hotplug

2013-03-19 Thread li guang

在 2013-02-28四的 11:18 +0100，Vasilis Liaskovitis写道：
> Hi,
> 
> sorry for the delay.
> On Tue, Feb 19, 2013 at 07:39:40PM -0300, Erlon Cruz wrote:
> > On Tue, Dec 18, 2012 at 10:41 AM, Vasilis Liaskovitis <
> > vasilis.liaskovi...@profitbricks.com> wrote:
> > 
> > > This is v4 of the ACPI memory hotplug functionality. Only x86_64 target is
> > > supported (both i440fx and q35). There are still several issues, but it's
> > > been a while since v3 and I wanted to get some more feedback on the 
> > > current
> > > state of the patchseries.
> > >
> > >
> > We are working in memory hotplug functionality on pSeries machine. I'm
> > wondering whether and how we can better integrate things. Do you think the
> > DIMM abstraction is generic enough to be used in other machine types?
> 
> I think the DimmDevice is generic enough but I am open to other suggestions. 
> 
> A related issue is that the patchseries uses a DimmBus to hot-add and 
> hot-remove
> DimmDevice. Another approach that has been suggested is to use links<> between
> DimmDevices and the dram controller device (piix4 or mch for pc and q35-pc
> machines respectively). This would be more similar to the CPUState/qom
> patches - see Andreas Färber's earlier reply to this thread.
> 
> I think we should get some consensus from the community/maintainers before we
> continue to integrate. 
> 
> I haven't updated the series for a while, but I can rework if there is a more
> clear direction for the community.
> 
> Another open issue is reference counting of memoryregions in qemu memory
> model. In order to make memory hot-remove operations safe, we need to remove
> a memoryregion after all users (e.g. both guest and block layer) have stopped
> using it,

it seems it mostly up to the user who want to hot-(un)plug,
if user want to un-plug a memory which is kernel's main memory, kernel
will always run on it(never stop) unless power off.
and if guest stops, all DIMMs should be safe to hot-remove,
or else we should do something to let user can unlock all reference.

>  see discussion at
> http://lists.gnu.org/archive/html/qemu-devel/2012-10/msg03986.html. There was 
> a
> relevant ibm patchset
> https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg02697.html
> but it was not merged.
> 
> > 
> > 
> > > Overview:
> > >
> > > Dimm device layout is modeled with a normal qemu device:
> > >
> > > "-device dimm,id=name,size=sz,node=pxm,populated=on|off,bus=membus.0"
> > >
> > >
> >  How does this will handle the no-hotplugable memory for example the memory
> > passed in '-m' parameter?
> 
> The non-hotpluggable initial memory (-m) is currently not modelled at all as a
> DimmDevice. We may want to model it though.
> 
> thanks,
> - Vasilis
>

Re: [Qemu-devel] [PATCH V27 1/7] Support for TPM command line options

2013-03-19 Thread Markus Armbruster

Stefan Berger  writes:

> On 03/18/2013 09:10 AM, Markus Armbruster wrote:
>>>
>>> In case the TPM is currently not operating on a command there will be
>>> no impact. In case the TPM is operating on a command, it will hold the
>>> thread inside /dev/tpm0 until the command has finished. The solution
>>> here is to write a byte into sysfs file to terminate the TPM from
>>> further executing the command and return; this code exists and is
>>> being invoked in all other cases than abnormal termination obviously.
>> Unlike other kernel resources such as file descriptors, this one isn't
>> tied to a process and cleaned up automatically on process termination.
>> Correct?
>
> No, it's also a file descriptor and will be cleaned up upon process
> termination.

Closing the file descriptor does *not* free kernel resources associated
with it?!?

>> Therefore, we try to clean it up manually.  Actual cleanup code is
>> tpm_passthrough_cancel_cmd().  Correct?
>
> No, tpm_passthrough_cancel_cmd() sends a byte to the TPM's sysfs entry
> for canceling an ongoing command. QEMU has a thread that sends the
> commands to the TPM via write() and gets the responses in read(). The
> issue being the thread is held inside the write() (!) until the
> command has completed inside the host TPM.

You mean write() blocks?

>Inside the passthrough
> driver we set a flag while the command is executing and if a command
> is executing when for example QEMU is reset, then the sysfs entry is
> written to to get the thread out of the write() due to termination of
> the ongoing TPM command.

I can understand a need to abort the write() on guest reset, but why on
QEMU exit()?  Shouldn't process termination abort the write() just fine?

>  Depending on which version of Fedora is run,
> the termination of the command may or may not get invoked.

May or may not get invoked on what condition?

>F14 for
> example does not seem to care about waiting until all  user-level
> processes have properly shut down and would reset  while F16 seems to
> wait for all processes to properly terminate.

F14 is more than nine months past it's shelf date.  Why should we
support it when its makers don't?

> The cancellation of an ongoing command may be initiated inside the
> guest as well and will also cause tpm_passthrough_cancel_cmd() to be
> invoked if a command is executed by the host TPM on behalf of the
> guest.

Passing through the guest's cancellation makes sense to me.

> There were/are(?) some issues with cancellation of ongoing
> commands. Different vendors seem to set different bits to indicate to
> the driver as to when a command has been canceled. The specs are not
> clear enough in that area. We fixed this for the TPMs that we have
> access (to wake up the thread inside the write()).

I'm not doubting the need to cancel commands.  I'm just trying to
understand why we do it on exit().

>> The only existing cleanup function tpm_passthrough_destroy() does a
>> whole lot more.  Just to make sure I understand what's going on:
>> everything but tpm_passthrough_cancel_cmd() is effectively a no-op
>> there, correct?
>
> It depends on what you mean with no-op. In tpm_passthrough_destroy() a
> possibly ongoing command is canceled, then the thread is signaled to
> terminate and join()ed. The rest is closing of file descriptors and
> freeing of memory.

Closing file descriptors and freeing memory on exit() is a no-op at
best, and a source of bugs otherwise.  The kernel already cleans these
up just fine.

Unless the thread does something interesting in response to its
termination signal, making it terminate is also a no-op.  I can't see
anything interesting happening, please correct me if I'm wrong.

Re: [Qemu-devel] [PULL 0/2] virtio-ccw patches 2013-03-18

2013-03-19 Thread Cornelia Huck

On Mon, 18 Mar 2013 17:04:40 +0100
Cornelia Huck  wrote:

Hm, skip this. This clashes with the virtio-blk refactoring.

> The following changes since commit 225dc991b03f0f034aa348f5cf499de9d0979107:
> 
>   s390: Fix cpu refactoring fallout. (2013-03-17 20:01:31 +)
> 
> are available in the git repository at:
> 
>   git://github.com/cohuck/qemu virtio-ccw-upstr
> 
> for you to fetch changes up to 06964ad5dfff4ea3f8536be4730a8b501113e3c1:
> 
>   s390-virtio, virtio-ccw: Add config_wce for virtio-blk. (2013-03-18 
> 13:06:09 +0100)
> 
> 
> Cornelia Huck (2):
>   virtio-ccw: Add missing blk chs properties.
>   s390-virtio, virtio-ccw: Add config_wce for virtio-blk.
> 
>  hw/s390x/s390-virtio-bus.c |1 +
>  hw/s390x/virtio-ccw.c  |2 ++
>  2 files changed, 3 insertions(+)
>

Re: [Qemu-devel] [RFC PATCH RDMA support v4: 03/10] more verbose documentation of the RDMA transport

2013-03-19 Thread Michael S. Tsirkin

On Mon, Mar 18, 2013 at 07:23:53PM -0400, Michael R. Hines wrote:
> On 03/18/2013 05:26 PM, Michael S. Tsirkin wrote:
> >
> >Probably but I haven't mentioned ballooning at all.
> >
> >memory overcommit != ballooning
> 
> Sure, then setting ballooning aside for the moment,
> then let's just consider regular (unused) virtual memory.
> 
> In this case, what's wrong with the destination mapping
> and pinning all the memory if it is not being ballooned?
> 
> If the guest touches all the memory during normal operation
> before migration begins (which would be the common case),
> then overcommit is irrelevant, no?

We have ways (e.g. cgroups) to limit what a VM can do. If it tries to
use more RAM than we let it, it will swap, still making progress, just
slower.  OTOH it looks like pinning more memory than allowed by the
cgroups limit will just get stuck forever (probably a bug,
should fail instead? but does not help your protocol
which needs it all pinned at all times).

There are also per-task resource limits. If you exceed this
registration will fail, so not good either.

I just don't see why do registration by chunks
on source but not on destination.

-- 
MST

[Qemu-devel] [PATCH 0/2] fix segfault in (iscsi) option parsing

2013-03-19 Thread Peter Lieven

current git master segfaults if an iscsi option is specified
in command line.

Peter Lieven (2):
  qemu-option: avoid segfault if QemuOptsList == NULL
  vl.c: fix segfault in iscsi options parsing

 block/iscsi.c  |   27 ---
 util/qemu-option.c |1 +
 vl.c   |   40 +---
 3 files changed, 38 insertions(+), 30 deletions(-)

-- 
1.7.9.5

[Qemu-devel] [PATCH 1/2] qemu-option: avoid segfault if QemuOptsList == NULL

2013-03-19 Thread Peter Lieven

Signed-off-by: Peter Lieven 
---
 util/qemu-option.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/util/qemu-option.c b/util/qemu-option.c
index 8b74bf1..24479d2 100644
--- a/util/qemu-option.c
+++ b/util/qemu-option.c
@@ -938,6 +938,7 @@ static QemuOpts *opts_parse(QemuOptsList *list, const char 
*params,
 QemuOpts *opts;
 Error *local_err = NULL;
 
+assert(list != NULL);
 assert(!permit_abbrev || list->implied_opt_name);
 firstname = permit_abbrev ? list->implied_opt_name : NULL;
 
-- 
1.7.9.5

[Qemu-devel] [PATCH 2/2] vl.c: fix segfault in iscsi options parsing

2013-03-19 Thread Peter Lieven

this patch fixes parsing of iscsi options such as initiator-name
passed to command line via -iscsi option group.

because iscsi options where registered too late qemu_find_opts
returned NULL leading to a segfault in qemu_opts_parse.

Signed-off-by: Peter Lieven 
---
 block/iscsi.c |   27 ---
 vl.c  |   40 +---
 2 files changed, 37 insertions(+), 30 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index 3d52921..23d4210 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1263,36 +1263,9 @@ static BlockDriver bdrv_iscsi = {
 #endif
 };
 
-static QemuOptsList qemu_iscsi_opts = {
-.name = "iscsi",
-.head = QTAILQ_HEAD_INITIALIZER(qemu_iscsi_opts.head),
-.desc = {
-{
-.name = "user",
-.type = QEMU_OPT_STRING,
-.help = "username for CHAP authentication to target",
-},{
-.name = "password",
-.type = QEMU_OPT_STRING,
-.help = "password for CHAP authentication to target",
-},{
-.name = "header-digest",
-.type = QEMU_OPT_STRING,
-.help = "HeaderDigest setting. "
-"{CRC32C|CRC32C-NONE|NONE-CRC32C|NONE}",
-},{
-.name = "initiator-name",
-.type = QEMU_OPT_STRING,
-.help = "Initiator iqn name to use when connecting",
-},
-{ /* end of list */ }
-},
-};
-
 static void iscsi_block_init(void)
 {
 bdrv_register(&bdrv_iscsi);
-qemu_add_opts(&qemu_iscsi_opts);
 }
 
 block_init(iscsi_block_init);
diff --git a/vl.c b/vl.c
index ce51e65..9925675 100644
--- a/vl.c
+++ b/vl.c
@@ -517,6 +517,34 @@ static QemuOptsList qemu_tpmdev_opts = {
 },
 };
 
+#ifdef CONFIG_LIBISCSI
+static QemuOptsList qemu_iscsi_opts = {
+.name = "iscsi",
+.head = QTAILQ_HEAD_INITIALIZER(qemu_iscsi_opts.head),
+.desc = {
+{
+.name = "user",
+.type = QEMU_OPT_STRING,
+.help = "username for CHAP authentication to target",
+},{
+.name = "password",
+.type = QEMU_OPT_STRING,
+.help = "password for CHAP authentication to target",
+},{
+.name = "header-digest",
+.type = QEMU_OPT_STRING,
+.help = "HeaderDigest setting. "
+"{CRC32C|CRC32C-NONE|NONE-CRC32C|NONE}",
+},{
+.name = "initiator-name",
+.type = QEMU_OPT_STRING,
+.help = "Initiator iqn name to use when connecting",
+},
+{ /* end of list */ }
+},
+};
+#endif
+
 const char *qemu_get_vm_name(void)
 {
 return qemu_name;
@@ -2899,6 +2927,9 @@ int main(int argc, char **argv, char **envp)
 qemu_add_opts(&qemu_add_fd_opts);
 qemu_add_opts(&qemu_object_opts);
 qemu_add_opts(&qemu_tpmdev_opts);
+#ifdef CONFIG_LIBISCSI
+qemu_add_opts(&qemu_iscsi_opts);
+#endif
 
 runstate_init();
 
@@ -3199,14 +3230,17 @@ int main(int argc, char **argv, char **envp)
 exit(1);
 }
 break;
-#ifdef CONFIG_LIBISCSI
 case QEMU_OPTION_iscsi:
-opts = qemu_opts_parse(qemu_find_opts("iscsi"), optarg, 0);
+olist = qemu_find_opts("iscsi");
+if (!olist) {
+fprintf(stderr, "iscsi is not supported by this qemu 
build.\n");
+exit(1);
+}
+opts = qemu_opts_parse(olist, optarg, 0);
 if (!opts) {
 exit(1);
 }
 break;
-#endif
 #ifdef CONFIG_SLIRP
 case QEMU_OPTION_tftp:
 legacy_tftp_prefix = optarg;
-- 
1.7.9.5

Re: [Qemu-devel] qemu segfault parsing iscsi options

2013-03-19 Thread Markus Armbruster

Peter Lieven  writes:

> On 18.03.2013 13:52, Markus Armbruster wrote:
>> Peter Lieven  writes:
>>
>>> Hi,
>>>
>>> with recent qemu from git qemu segfaults with the following commandline:
>>>
>>> x86_64-softmmu/qemu-system-x86_64 -iscsi test
>>>
>>> qemu-system-x86_64: -iscsi test: There is no option group 'iscsi'
>>> Speicherzugriffsfehler (Speicherabzug geschrieben)
>>>
>>> It seems that there is something missing regarding the iscsi options
>>> in qemu-option.hx.
>>>
>>> This was working with qemu-kvm-1.2.0.
>>
>> Works for me with current master 225dc991.
>>
>
> for me it doesn't...
>
> ~/git/qemu$ git log -1 --oneline
> 225dc99 s390: Fix cpu refactoring fallout.
>
> there is definetly missing some parts in qemu-options.hx for -iscsi

Nope.  It's an initialization order issue.

Paolo's commit 4d454574 "qemu-option: move standard option definitions
out of qemu-config.c" moved option group registration from compile time
(initializer of vm_config_groups[]) to run time (various places).

Registration of group "iscsi" moved to iscsi_block_init(), a block
driver initialization function.  These are run by
bdrv_init_with_whitelist(), which gets called only after command line
parsing.  Oops.

I'm afraid we need to re-review all of that commit for similar
initialization order errors.

[Qemu-devel] [PATCH] virtio-blk: Do not segfault fault if failed to initialize dataplane

2013-03-19 Thread Dunrong Huang

$ ~/usr/bin/qemu-system-x86_64 -enable-kvm -m 1024 -drive 
if=none,id=drive0,cache=none,aio=native,format=raw,file=/root/Image/centos-6.4.raw
 -device virtio-blk-pci,drive=drive0,scsi=off,x-data-plane=on,config-wce=on # 
make dataplane fail to initialize
qemu-system-x86_64: -device 
virtio-blk-pci,drive=drive0,scsi=off,x-data-plane=on,config-wce=on: device is 
incompatible with x-data-plane, use config-wce=off
*** glibc detected *** /root/usr/bin/qemu-system-x86_64: free(): invalid 
pointer: 0x7f001fef12f8 ***
=== Backtrace: =
/lib64/libc.so.6(+0x7d776)[0x7f00153a5776]
/root/usr/bin/qemu-system-x86_64(+0x2c34ec)[0x7f001cf5b4ec]
/root/usr/bin/qemu-system-x86_64(+0x342f9a)[0x7f001cfdaf9a]
/root/usr/bin/qemu-system-x86_64(+0x33694e)[0x7f001cfce94e]


 (gdb) bt
 #0  0x7f3bf3a12015 in raise () from /lib64/libc.so.6
 #1  0x7f3bf3a1348b in abort () from /lib64/libc.so.6
 #2  0x7f3bf3a51a4e in __libc_message () from /lib64/libc.so.6
 #3  0x7f3bf3a57776 in malloc_printerr () from /lib64/libc.so.6
 #4  0x7f3bfb60d4ec in free_and_trace (mem=0x7f3bfe0129f8) at vl.c:2786
 #5  0x7f3bfb68cf9a in virtio_cleanup (vdev=0x7f3bfe0129f8) at 
/root/Develop/QEMU/qemu/hw/virtio.c:900
 #6  0x7f3bfb68094e in virtio_blk_device_init (vdev=0x7f3bfe0129f8) at 
/root/Develop/QEMU/qemu/hw/virtio-blk.c:666
 #7  0x7f3bfb68dadf in virtio_device_init (qdev=0x7f3bfe0129f8) at 
/root/Develop/QEMU/qemu/hw/virtio.c:1092
 #8  0x7f3bfb50da46 in device_realize (dev=0x7f3bfe0129f8, 
err=0x7fff479c9258) at hw/qdev.c:176
.

In virtio_blk_device_init(), the memory which vdev point to is a static
member of "struct VirtIOBlkPCI", not heap memory, and it does not
get freed. So we shoule use virtio_common_cleanup() to clean this VirtIODevice
rather than virtio_cleanup(), which attempts to free the vdev.

This error was introduced by commit 05ff686536f408ba6e8426b1b54d25bd3379fda2
recently.

Signed-off-by: Dunrong Huang 
---
 hw/virtio-blk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index e6f8875..f2143fd 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -663,7 +663,7 @@ static int virtio_blk_device_init(VirtIODevice *vdev)
 s->vq = virtio_add_queue(vdev, 128, virtio_blk_handle_output);
 #ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
 if (!virtio_blk_data_plane_create(vdev, blk, &s->dataplane)) {
-virtio_cleanup(vdev);
+virtio_common_cleanup(vdev);
 return -1;
 }
 #endif
-- 
1.8.1.5

Re: [Qemu-devel] qemu segfault parsing iscsi options

2013-03-19 Thread Peter Lieven


On 19.03.2013 09:25, Markus Armbruster wrote:

Peter Lieven  writes:


On 18.03.2013 13:52, Markus Armbruster wrote:

Peter Lieven  writes:


Hi,

with recent qemu from git qemu segfaults with the following commandline:

x86_64-softmmu/qemu-system-x86_64 -iscsi test

qemu-system-x86_64: -iscsi test: There is no option group 'iscsi'
Speicherzugriffsfehler (Speicherabzug geschrieben)

It seems that there is something missing regarding the iscsi options
in qemu-option.hx.

This was working with qemu-kvm-1.2.0.


Works for me with current master 225dc991.



for me it doesn't...

~/git/qemu$ git log -1 --oneline
225dc99 s390: Fix cpu refactoring fallout.

there is definetly missing some parts in qemu-options.hx for -iscsi


Nope.  It's an initialization order issue.


yes, you are right. so Paolo's patch needs to be fixed. The patch
series fixes at least the iSCSI part.



Paolo's commit 4d454574 "qemu-option: move standard option definitions
out of qemu-config.c" moved option group registration from compile time
(initializer of vm_config_groups[]) to run time (various places).

Registration of group "iscsi" moved to iscsi_block_init(), a block
driver initialization function.  These are run by
bdrv_init_with_whitelist(), which gets called only after command line
parsing.  Oops.

I'm afraid we need to re-review all of that commit for similar
initialization order errors.



From a quick look at the code, it could be that -spice is broken.
-fsdev and -virtfs work.
I have not find any other invocations of qemu_add_opts outside vl.c

Peter.

Re: [Qemu-devel] [PATCH] virtio-blk: Do not segfault fault if failed to initialize dataplane

2013-03-19 Thread KONRAD Frédéric


On 19/03/2013 09:27, Dunrong Huang wrote:

$ ~/usr/bin/qemu-system-x86_64 -enable-kvm -m 1024 -drive 
if=none,id=drive0,cache=none,aio=native,format=raw,file=/root/Image/centos-6.4.raw
 -device virtio-blk-pci,drive=drive0,scsi=off,x-data-plane=on,config-wce=on # 
make dataplane fail to initialize
qemu-system-x86_64: -device 
virtio-blk-pci,drive=drive0,scsi=off,x-data-plane=on,config-wce=on: device is 
incompatible with x-data-plane, use config-wce=off
*** glibc detected *** /root/usr/bin/qemu-system-x86_64: free(): invalid 
pointer: 0x7f001fef12f8 ***
=== Backtrace: =
/lib64/libc.so.6(+0x7d776)[0x7f00153a5776]
/root/usr/bin/qemu-system-x86_64(+0x2c34ec)[0x7f001cf5b4ec]
/root/usr/bin/qemu-system-x86_64(+0x342f9a)[0x7f001cfdaf9a]
/root/usr/bin/qemu-system-x86_64(+0x33694e)[0x7f001cfce94e]


  (gdb) bt
  #0  0x7f3bf3a12015 in raise () from /lib64/libc.so.6
  #1  0x7f3bf3a1348b in abort () from /lib64/libc.so.6
  #2  0x7f3bf3a51a4e in __libc_message () from /lib64/libc.so.6
  #3  0x7f3bf3a57776 in malloc_printerr () from /lib64/libc.so.6
  #4  0x7f3bfb60d4ec in free_and_trace (mem=0x7f3bfe0129f8) at vl.c:2786
  #5  0x7f3bfb68cf9a in virtio_cleanup (vdev=0x7f3bfe0129f8) at 
/root/Develop/QEMU/qemu/hw/virtio.c:900
  #6  0x7f3bfb68094e in virtio_blk_device_init (vdev=0x7f3bfe0129f8) at 
/root/Develop/QEMU/qemu/hw/virtio-blk.c:666
  #7  0x7f3bfb68dadf in virtio_device_init (qdev=0x7f3bfe0129f8) at 
/root/Develop/QEMU/qemu/hw/virtio.c:1092
  #8  0x7f3bfb50da46 in device_realize (dev=0x7f3bfe0129f8, 
err=0x7fff479c9258) at hw/qdev.c:176
.

In virtio_blk_device_init(), the memory which vdev point to is a static
member of "struct VirtIOBlkPCI", not heap memory, and it does not
get freed. So we shoule use virtio_common_cleanup() to clean this VirtIODevice
rather than virtio_cleanup(), which attempts to free the vdev.

This error was introduced by commit 05ff686536f408ba6e8426b1b54d25bd3379fda2
recently.

Signed-off-by: Dunrong Huang 

Reviewed-by: KONRAD Frederic 

Oops sorry for that :/

---
  hw/virtio-blk.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index e6f8875..f2143fd 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -663,7 +663,7 @@ static int virtio_blk_device_init(VirtIODevice *vdev)
  s->vq = virtio_add_queue(vdev, 128, virtio_blk_handle_output);
  #ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
  if (!virtio_blk_data_plane_create(vdev, blk, &s->dataplane)) {
-virtio_cleanup(vdev);
+virtio_common_cleanup(vdev);
  return -1;
  }
  #endif

Re: [Qemu-devel] [PATCH V3 WIP 3/3] disable vhost_verify_ring_mappings check

2013-03-19 Thread Stefan Hajnoczi

On Tue, Mar 19, 2013 at 08:34:45AM +0800, Asias He wrote:
> ---
>  hw/vhost.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/hw/vhost.c b/hw/vhost.c
> index 4d6aee3..0c52ec4 100644
> --- a/hw/vhost.c
> +++ b/hw/vhost.c
> @@ -421,10 +421,12 @@ static void vhost_set_memory(MemoryListener *listener,
>  return;
>  }
>  
> +#if 0
>  if (dev->started) {
>  r = vhost_verify_ring_mappings(dev, start_addr, size);
>  assert(r >= 0);
>  }
> +#endif

Please add a comment to explain why.

Stefan

Re: [Qemu-devel] [PATCH V3 WIP 2/3] vhost-scsi: new device supporting the tcm_vhost Linux kernel module

2013-03-19 Thread Stefan Hajnoczi

On Tue, Mar 19, 2013 at 08:34:44AM +0800, Asias He wrote:
> +static void vhost_scsi_stop(VHostSCSI *vs, VirtIODevice *vdev)
> +{
> +int ret = 0;
> +
> +if (!vdev->binding->set_guest_notifiers) {
> +ret = vdev->binding->set_guest_notifiers(vdev->binding_opaque,
> + vs->dev.nvqs, false);
> +if (ret < 0) {
> +error_report("vhost guest notifier cleanup failed: %d\n", 
> ret);

Indentation.  scripts/checkpatch.pl should catch this.

> +}
> +}
> +assert(ret >= 0);
> +
> +vhost_scsi_clear_endpoint(vdev);
> +vhost_dev_stop(&vs->dev, vdev);
> +vhost_dev_disable_notifiers(&vs->dev, vdev);
> +}
> +
> +static void vhost_scsi_set_config(VirtIODevice *vdev,
> +  const uint8_t *config)
> +{
> +VirtIOSCSIConfig *scsiconf = (VirtIOSCSIConfig *)config;
> +VHostSCSI *vs = (VHostSCSI *)vdev;
> +
> +if ((uint32_t) ldl_raw(&scsiconf->sense_size) != vs->vs.sense_size ||
> +(uint32_t) ldl_raw(&scsiconf->cdb_size) != vs->vs.cdb_size) {
> +error_report("vhost-scsi does not support changing the sense data 
> and CDB sizes");
> +exit(1);

Guest-triggerable exits can be used as a denial of service - especially
under nested virtualization where killing the L1 hypervisor would kill
all L2 guests!

I would just log a warning here.

> +}
> +}
> +
> +static void vhost_scsi_set_status(VirtIODevice *vdev, uint8_t val)
> +{
> +VHostSCSI *vs = (VHostSCSI *)vdev;
> +bool start = (val & VIRTIO_CONFIG_S_DRIVER_OK);
> +
> +if (vs->dev.started == start) {
> +return;
> +}
> +
> +if (start) {
> +int ret;
> +
> +ret = vhost_scsi_start(vs, vdev);
> +if (ret < 0) {
> +error_report("virtio-scsi: unable to start vhost: %s\n",
> + strerror(-ret));
> +
> +/* There is no userspace virtio-scsi fallback so exit */
> +exit(1);

It's questionable whether to kill the guest or simply disable this
virtio-scsi-pci adapter.  Fine for now but we may want to allow a policy
here in the future.

> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> index 39c1966..281a7e2 100644
> --- a/hw/virtio-pci.c
> +++ b/hw/virtio-pci.c
> @@ -22,6 +22,7 @@
>  #include "hw/virtio-net.h"
>  #include "hw/virtio-serial.h"
>  #include "hw/virtio-scsi.h"
> +#include "hw/vhost-scsi.h"

Can this header be included unconditionally?  It uses _IOW() which may
not be available on all host platforms.

Re: [Qemu-devel] [PATCH V3 WIP 3/3] disable vhost_verify_ring_mappings check

2013-03-19 Thread Asias He

On Tue, Mar 19, 2013 at 09:40:57AM +0100, Stefan Hajnoczi wrote:
> On Tue, Mar 19, 2013 at 08:34:45AM +0800, Asias He wrote:
> > ---
> >  hw/vhost.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/hw/vhost.c b/hw/vhost.c
> > index 4d6aee3..0c52ec4 100644
> > --- a/hw/vhost.c
> > +++ b/hw/vhost.c
> > @@ -421,10 +421,12 @@ static void vhost_set_memory(MemoryListener *listener,
> >  return;
> >  }
> >  
> > +#if 0
> >  if (dev->started) {
> >  r = vhost_verify_ring_mappings(dev, start_addr, size);
> >  assert(r >= 0);
> >  }
> > +#endif
> 
> Please add a comment to explain why.

Apparently, this is just a workaround for testing not for merge.

> --
> To unsubscribe from this list: send the line "unsubscribe target-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Asias

Re: [Qemu-devel] [PATCH v2 0/2] block: fix BDRV_O_SNAPSHOT with protocols

2013-03-19 Thread Stefan Hajnoczi

On Tue, Mar 19, 2013 at 10:16:41AM +0400, Michael Tokarev wrote:
> 18.03.2013 20:58, Stefan Hajnoczi wrote:
> > Richard Jones  reported that BDRV_O_SNAPSHOT does not 
> > work
> > with NBD:
> 
> Heh.  http://permalink.gmane.org/gmane.comp.emulators.qemu/148390
> 
> (nothing's wrong with the patchset, just saying ;)

Sorry, that wasn't intentional.  Richard included your link in his bug
report but I didn't follow it since the scope of the bug seemed clear
enough.

Stefan

Re: [Qemu-devel] [PATCH 0/2] fix segfault in (iscsi) option parsing

2013-03-19 Thread Markus Armbruster

Paolo, please have a look.

Peter Lieven  writes:

> current git master segfaults if an iscsi option is specified
> in command line.
>
> Peter Lieven (2):
>   qemu-option: avoid segfault if QemuOptsList == NULL
>   vl.c: fix segfault in iscsi options parsing
>
>  block/iscsi.c  |   27 ---
>  util/qemu-option.c |1 +
>  vl.c   |   40 +---
>  3 files changed, 38 insertions(+), 30 deletions(-)

Re: [Qemu-devel] [PATCH][RFC 0/14] implement power chip

2013-03-19 Thread Peter Maydell

On 19 March 2013 00:55, li guang  wrote:
> 在 2013-03-18一的 11:07 +，Peter Maydell写道：
>> I'm really dubious that this can or should be implemented by
>> adding methods at the device base class level. I don't think
>> real hardware works this way.
>
> and what it should be?
>
>>  You could probably do power on/off
>> like this. Reset definitely shouldn't be done this way, and
>> suspend probably not either.
>
> AFAIK, reset is mostly power off then power on,

The DeviceState method 'reset' simulates a power cycle.
On real hardware power-controller controlled reset is
more complicated and generally devices implement one
or more reset pins which can be asserted by the power
controller.

> suspend, of course is not supported by all devices,
> but, if system want to suspend, it have to let all devices
> aware this, and if the device support power suspend, it
> can do something specific(or just some with others, simple)

I suspect this should involve more modelling of actual
control signals between the power controller and
the devices, not methods on the base class.

> I'm eager to get more comments and discussion.
> This idea simply based on system board design convention,
> I'm not saying a power chip has signals directly connected
> to all devices, I mean system board and its devices should
> have protocol to deal with power state changes.

Hardware does it with signals, so should we.

-- PMM

[Qemu-devel] qmp monitor is broken

2013-03-19 Thread Gerd Hoffmann

  Hi,

$subject says it.  Bisecting points to:

commit 23673ca740e0eda66901ca801a5a901df378b063
Author: Anthony Liguori 
Date:   Tue Mar 5 23:21:23 2013 +0530

qemu-char: add watch support

This allows a front-end to request for a callback when the backend
is writable again.

Signed-off-by: Anthony Liguori 
Signed-off-by: Amit Shah 


Symtoms: "virsh start" hangs (not always, but most of the time).  When
killing qemu it prints an error message:

error: Failed to start domain fedora-org-virtio
error: internal error cannot parse json {"return": [{"name":
"chardev-remove"}, {"name": "chardev-add"}, {"name": "query-target"},
{"name": "query-cpu-definitions"}, {"name": "query-machines"}, {"name":
"device-list-properties"}, {"name": "qom-list-types"}, {"name":
"change-vnc-password"}, {"name": "nbd-server-stop"}, {"name":
"nbd-server-add"}, {"name": "nbd-server-start"}, {"name": "qom-get"},
{"name": "qom-set"}, {"name": "qom-list"}, {"name": "query-block-jobs"},
{"name": "query-balloon"}, {"name": "query-migrate-capabilities"},
{"name": "migrate-set-capabilities"}, {"name": "query-migrate"},
{"name": "query-uuid"}, {"name": "query-name"}, {"name": "query-spice"},
{"name": "query-vnc"}, {"name": "query-mice"}, {"name": "query-status"},
{"name": "query-kvm"}, {"name": "query-pci"}, {"name": "query-cpus"},
{"name": "query-blockstats"}, {"name": "query-block"}, {"name":
"query-chardev"}, {"name": "query-events"}, {"name": "query-commands"},
{"name": "query-version"}, {"name": "human-monitor-command"}, {"name":
"qmp_capabilities

So it looks like a bulky qmp reply is written incomplete (or corrupted)
and libvirt waits forever for the missing bits ...

cheers,
  Gerd

Re: [Qemu-devel] [PATCH 7/8] spice-qemu-char: Register interface on post load

2013-03-19 Thread Alon Levy

> On 03/14/13 17:36, Hans de Goede wrote:
> > From: Alon Levy 
> > 
> > The target has not seen the guest_connected event via
> > spice_chr_guest_open or spice_chr_write, and so spice server
> > wrongly
> > assumes there is no agent active, while the client continues to
> > send
> > motion events only by the agent channel, which the server ignores.
> > The
> > net effect is that the mouse is static in the guest.
> > 
> > By registering the interface on post load spice server will pass on
> > the
> > agent messages fixing the mouse behavior after migration.
> 
> > +static VMStateDescription spice_chr_vmstate = {
> > +.name   = "spice-chr",
> > +.version_id = 1,
> > +.minimum_version_id = 1,
> > +.post_load  = spice_chr_post_load,
> > +.fields = (VMStateField[]) {
> > +VMSTATE_UINT32(guest_open, SpiceCharDriver),
> > +VMSTATE_END_OF_LIST()
> > +},
> > +};
> 
> > +vmstate_register(NULL, -1, &spice_chr_vmstate, s);
> > +
> 
> That is a showstopper.  If there are two of these there is no
> reliable
> way to figure which is which.

But they will both get a different state pointer s.

> 
> Do we actually need that in the first place?  I assume virtio-serial
> will migrate the guest-open state anyway.  So if we can use that
> somehow
> we don't have to live migrate the chardev state ...

It is being migrated in virtio-serial-bus (port->guest_connected) but we need 
to pass that on to spice-server, which there is no hook for in the chardev 
layer, I could add one. I would need to query the backend, in this case 
virtio-serial-bus, something like this:

diff --git a/hw/virtio-console.c b/hw/virtio-console.c
index e2d1c58..f36a973 100644
--- a/hw/virtio-console.c
+++ b/hw/virtio-console.c
@@ -88,6 +88,16 @@ static void guest_close(VirtIOSerialPort *port)
 qemu_chr_fe_close(vcon->chr);
 }
 
+static void guest_migrated(VirtIOSerialPort *port)
+{
+VirtConsole *vcon = DO_UPCAST(VirtConsole, port, port);
+
+if (!vcon->chr) {
+return;
+}
+qemu_chr_fe_migrated(vcon->chr, port->guest_connected);
+}
+
 /* Readiness of the guest to accept data on a port */
 static int chr_can_read(void *opaque)
 {
@@ -177,6 +187,7 @@ static void virtserialport_class_init(ObjectClass *klass, 
void *data)
 k->have_data = flush_buf;
 k->guest_open = guest_open;
 k->guest_close = guest_close;
+k->guest_migrated = guest_migrated;
 dc->props = virtserialport_properties;
 }
 
diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index 7d0515f..b6b1c3b 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -664,6 +664,7 @@ static int fetch_active_ports_list(QEMUFile *f, int 
version_id,
 /* Items in struct VirtIOSerialPort */
 for (i = 0; i < nr_active_ports; i++) {
 VirtIOSerialPort *port;
+VirtIOSerialPortClass *vsc;
 uint32_t id;
 
 id = qemu_get_be32(f);
@@ -671,6 +672,7 @@ static int fetch_active_ports_list(QEMUFile *f, int 
version_id,
 if (!port) {
 return -EINVAL;
 }
+vsc = VIRTIO_SERIAL_PORT_GET_CLASS(port);
 
 port->guest_connected = qemu_get_byte(f);
 s->post_load->connected[i].port = port;
@@ -698,6 +700,7 @@ static int fetch_active_ports_list(QEMUFile *f, int 
version_id,
 virtio_serial_throttle_port(port, false);
 }
 }
+vsc->guest_migrated(port);
 }
 qemu_mod_timer(s->post_load->timer, 1);
 return 0;
diff --git a/hw/virtio-serial.h b/hw/virtio-serial.h
index d2d9fb7..992ef55 100644
--- a/hw/virtio-serial.h
+++ b/hw/virtio-serial.h
@@ -107,6 +107,8 @@ typedef struct VirtIOSerialPortClass {
  */
 ssize_t (*have_data)(VirtIOSerialPort *port, const uint8_t *buf,
  size_t len);
+
+void (*guest_migrated)(VirtIOSerialPort *port);
 } VirtIOSerialPortClass;
 
 /*
diff --git a/include/char/char.h b/include/char/char.h
index d6a0351..b7c8cb1 100644
--- a/include/char/char.h
+++ b/include/char/char.h
@@ -70,6 +70,7 @@ struct CharDriverState {
 void (*chr_set_echo)(struct CharDriverState *chr, bool echo);
 void (*chr_guest_open)(struct CharDriverState *chr);
 void (*chr_guest_close)(struct CharDriverState *chr);
+void (*chr_guest_migrated)(struct CharDriverState *chr, int connected);
 void *opaque;
 int idle_tag;
 char *label;
@@ -143,6 +144,8 @@ void qemu_chr_fe_open(struct CharDriverState *chr);
  */
 void qemu_chr_fe_close(struct CharDriverState *chr);
 
+void qemu_chr_fe_migrated(struct CharDriverState *chr, int connected);
+
 /**
  * @qemu_chr_fe_printf:
  *
diff --git a/qemu-char.c b/qemu-char.c
index e633797..cda785f 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -3390,6 +3390,13 @@ void qemu_chr_fe_open(struct CharDriverState *chr)
 }
 }
 
+void qemu_chr_fe_migrated(struct CharDriverState *chr, int connected)
+{
+if (chr->chr_guest_migrated) {
+chr->chr_guest_migrated(chr, connected

Re: [Qemu-devel] [RFC PATCH RDMA support v4: 08/10] introduce QEMUFileRDMA

2013-03-19 Thread Paolo Bonzini

Il 18/03/2013 21:33, Michael R. Hines ha scritto:
>>
>> +int qemu_drain(QEMUFile *f)
>> +{
>> +return f->ops->drain ? f->ops->drain(f->opaque) : 0;
>> +}
>> Hmm, this is very similar to qemu_fflush, but not quite. :/
>>
>> Why exactly is this needed?
> 
> Good idea - I'll replace drain with flush once I added
> the  "qemu_file_ops_are(const QEMUFile *, const QEMUFileOps *) "
> that you recommended..

If I understand correctly, the problem is that save_rdma_page is
asynchronous and you have to wait for pending operations to do the
put_buffer protocol correctly.

Would it work to just do the "drain" in the put_buffer operation, if and
only if it was preceded by a save_rdma_page operation?

> 
>>>   /** Flushes QEMUFile buffer
>>>*
>>>*/
>>> @@ -723,6 +867,8 @@ int qemu_get_byte(QEMUFile *f)
>>>   int64_t qemu_ftell(QEMUFile *f)
>>>   {
>>>   qemu_fflush(f);
>>> +if(migrate_use_rdma(f))
>>> +return delta_norm_mig_bytes_transferred();
>> Not needed, and another undesirable dependency (savevm.c ->
>> arch_init.c).  Just update f->pos in save_rdma_page.
> 
> f->pos isn't good enough because save_rdma_page does not
> go through QEMUFile directly - only non-live state goes
> through QEMUFile ... pc.ram uses direct RDMA writes.
> 
> As a result, the position pointer does not get updated
> and the accounting is missed

Yes, I am suggesting to modify f->pos in save_rdma_page instead.

Paolo

Re: [Qemu-devel] [RFC PATCH RDMA support v4: 09/10] check for QMP string and bypass nonblock() calls

2013-03-19 Thread Paolo Bonzini

Il 18/03/2013 21:37, Michael R. Hines ha scritto:
>>
>> +if(!migrate_use_rdma(f)) {
>> +int fd = qemu_get_fd(f);
>> +assert(fd != -1);
>> +socket_set_nonblock(fd);
>> Is this because qemu_get_fd(f) returns -1 for RDMA?  Then, you can
>> instead put socket_set_nonblock under an if(fd != -1).
> 
> Yes, I proposed doing that check (for -1) in a previous RFC,
> but you told me to remove it and make a separate patch =)
> 
> Is it OK to keep it in this patch?

Yes---this is a separate patch.  Apologies if you had the if(fd != -1)
before. :)  In fact, both the if(fd != -1) and the
if(!migrate_use_rdma(f)) are bad, but I prefer to eliminate as many uses
as possible of migrate_use_rdma.

The reason why they are bad, is that we try to operate on the socket in
a non-blocking manner, so that the monitor keeps working during incoming
migration.  We do it with non-blocking sockets because incoming
migration does not (yet?) have a separate thread, and is not a
bottleneck (the VM is not running, so it's not a problem to hold the big
QEMU lock for extended periods of time).

Does librdmacm support non-blocking operation, similar to select() or
poll()?  Perhaps we can add support for that later.

Paolo

Re: [Qemu-devel] [PATCH 00/35] hw/ reorganization, part 2

2013-03-19 Thread Paolo Bonzini

Il 18/03/2013 21:21, Peter Maydell ha scritto:
> On 18 March 2013 20:05, Paolo Bonzini  wrote:
>> Il 18/03/2013 19:17, Peter Maydell ha scritto:
> Shouldn't these containers also host the CPU device(s), rather than the
> boards?  And create them according to the num-cpu property?  If so, they
> would have to go in hw/arm.
>>> Yes, ideally they should have the CPU devices in them too.
>>> Remind me why devices which instantiate the CPU device go
>>> in hw/arm ?
>>
>> Because they refer to ARMCPU/CPUARMState.
> 
> Well, a container object that instantiated the CPUs wouldn't
> be referring to the internal CPU state struct, it would just
> be treating them as QOM objects like any other, so that doesn't
> apply.

Wouldn't it also bridge the CPU's internal interrupt pins to the GIC?
Like this code in highbank.c:


/* This will become a QOM property eventually */
irqp = arm_pic_init_cpu(cpu);
cpu_irq[n] = irqp[ARM_PIC_CPU_IRQ];
...
for (n = 0; n < smp_cpus; n++) {
sysbus_connect_irq(busdev, n, cpu_irq[n]);
}

Paolo

Re: [Qemu-devel] [PATCH 7/8] spice-qemu-char: Register interface on post load

2013-03-19 Thread Gerd Hoffmann

On 03/19/13 10:15, Alon Levy wrote:

>>> +vmstate_register(NULL, -1, &spice_chr_vmstate, s);
>>> +
>>
>> That is a showstopper.  If there are two of these there is no
>> reliable
>> way to figure which is which.
> 
> But they will both get a different state pointer s.

I mean in the migration data stream.

> It is being migrated in virtio-serial-bus (port->guest_connected)
> but we need to pass that on to spice-server, which there is no hook
> for in the chardev layer, I could add one. I would need to query the
> backend, in this case virtio-serial-bus, something like this:

That is better IMHO.

cheers,
  Gerd

Re: [Qemu-devel] [RFC PATCH 0/2] port network layer onto glib

2013-03-19 Thread Markus Armbruster

Peter Maydell  writes:

> On 13 March 2013 12:34, Anthony Liguori  wrote:
>> AioContext is necessary for the block layer because the block layer
>> still has synchronous I/O.  I think we should aim to replace all sync
>> I/O in the long term with coroutine based I/O.
>
> I think coroutines are dreadful and we should really not be moving
> towards greater use of them. They're just really really not portable
> and they don't fit with the C language, and they're a constant source
> of problems.(For instance I have a bug I need to look into where we
> seem to hang using the gthread coroutine backend but not sigaltstack.)
>
> Use threads, or a genuinely asynchronous API, or a select/poll loop
> with callbacks, but not more coroutines please.

I disagree.

Coroutines are a perfectly pedestrian control flow construct.  Stefan
already explained why we use them, and why callbacks are not a viable
alternative when things get hairy.

Coroutines fit about as well with C as threads, namely not really.
Unfortunate, but switching to threads won't improve that aspect one bit,
just add "interesting" concurrency and performance problems to the mix.

If portable coroutines in C really was an intractable problem, the
solution could not be "no more coroutines, please", only "no coroutines,
period".  As long as we have to pay the price for coroutines anyway, I
can't see why we should deny ourselves the benefits.

C programs have been doing coroutines since forever, using either hand
coded assembly language stack switching, sigaltstack() trickery,
ucontext(), w32 fibers, or some coroutine library built on top of these.
We chose "all of the above" (except for assembly).  No wonder we got
more problems.

Some of our portability problems come from our heroic (quixotic?) quest
to keep QEMU portable to the widest range of hosts, without sacrificing
performance on any of them.  Sometimes it takes a tough man to make a
tender chicken.

Re: [Qemu-devel] [PATCH][RFC 0/14] implement power chip

2013-03-19 Thread li guang

在 2013-03-19二的 09:05 +，Peter Maydell写道：
> On 19 March 2013 00:55, li guang  wrote:
> > 在 2013-03-18一的 11:07 +，Peter Maydell写道：
> >> I'm really dubious that this can or should be implemented by
> >> adding methods at the device base class level. I don't think
> >> real hardware works this way.
> >
> > and what it should be?
> >
> >>  You could probably do power on/off
> >> like this. Reset definitely shouldn't be done this way, and
> >> suspend probably not either.
> >
> > AFAIK, reset is mostly power off then power on,
> 
> The DeviceState method 'reset' simulates a power cycle.
> On real hardware power-controller controlled reset is
> more complicated and generally devices implement one
> or more reset pins which can be asserted by the power
> controller.

By now, it seems 'reset' callback only implemented as
device status initialization, and I want to keep this,
because, it's mostly the result power off/on.

> 
> > suspend, of course is not supported by all devices,
> > but, if system want to suspend, it have to let all devices
> > aware this, and if the device support power suspend, it
> > can do something specific(or just some with others, simple)
> 
> I suspect this should involve more modelling of actual
> control signals between the power controller and
> the devices, not methods on the base class.
> 

do we have to realize something like signals which are actually
only some copper wires?
I think we just emulate the real work, that is when some signals
asserted, we just call corresponding method to do something
by these embedded method, I want to let devices take care
of power event(on/off/suspend/wakeup) themselves.

> > I'm eager to get more comments and discussion.
> > This idea simply based on system board design convention,
> > I'm not saying a power chip has signals directly connected
> > to all devices, I mean system board and its devices should
> > have protocol to deal with power state changes.
> 
> Hardware does it with signals, so should we.

can these signals be viewed as the calling of corresponding methods?

[Qemu-devel] [PATCH] qcow2: fix null deref apparent during migration

2013-03-19 Thread Alon Levy

Signed-off-by: Alon Levy 
---
 block/qcow2.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 1f99866..260a1d3 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -512,7 +512,9 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, 
int flags)
 
 /* Enable lazy_refcounts according to image and command line options */
 opts = qemu_opts_create_nofail(&qcow2_runtime_opts);
-qemu_opts_absorb_qdict(opts, options, &local_err);
+if (options) {
+qemu_opts_absorb_qdict(opts, options, &local_err);
+}
 if (error_is_set(&local_err)) {
 qerror_report_err(local_err);
 error_free(local_err);
-- 
1.8.1.4

Re: [Qemu-devel] migrate cancel bug in latest qemu upstream

2013-03-19 Thread Paolo Bonzini

Il 19/03/2013 05:49, Amos Kong ha scritto:
> On Mon, Mar 18, 2013 at 09:20:03PM +0800, Amos Kong wrote:
>> On Tue, Mar 12, 2013 at 10:58:54AM +0800, Amos Kong wrote:
>>> Hi quintela, I found a migrate bug in Autotest result, not sure if
>>> it's already known.
>
> This bug was already fixed by this patch:
> 
> commit dba433c03a0f5dc22a459435dd89557886298921
> Author: Paolo Bonzini 
> Date:   Fri Feb 22 17:36:17 2013 +0100
> 
> migration: simplify error handling
> 
> Always use qemu_file_get_error to detect errors, since that is how
> QEMUFile itself drops I/O after an error occurs.  There is no need
> to propagate and check return values all the time.
> 
> Also remove the "complete" member, since we know that it is set (via
> migrate_fd_cleanup) only when the state changes.
> 
> Reviewed-by: Orit Wasserman 
> Reviewed-by: Juan Quintela 
> Signed-off-by: Paolo Bonzini 
> Signed-off-by: Juan Quintela 
> 

Should we add these patches to qemu-stable?

5da5aad (migration: simplify while loop, 2013-02-22)
891518a (migration: always use vm_stop_force_state, 2013-02-22)
7a2c172 (migration: move more error handling to migrate_fd_cleanup, 2013-02-22)
04943eb (migration: push qemu_savevm_state_cancel out of qemu_savevm_state_*, 
2013-02-22)
93bf210 (qemu-file: pass errno from qemu_fflush via f->last_error, 2013-02-22)
47c8c17 (migration: use qemu_file_set_error to pass error codes back to 
qemu_savevm_state, 2013-02-22)
4eb9381 (qemu-file: temporarily expose qemu_file_set_error and qemu_fflush, 
2013-02-22)
f582151 (migration: flush all data to fd when buffered_flush is called, 
2013-02-22)
63dfbd7 (migration: use qemu_file_set_error, 2013-02-22)
dba433c (migration: simplify error handling, 2013-02-22)

Paolo

Re: [Qemu-devel] [PATCH 3/3] iscsi_truncate: ensure there are no requests in flight

2013-03-19 Thread Paolo Bonzini

Il 19/03/2013 08:29, Peter Lieven ha scritto:
> On 11.03.2013 11:16, Paolo Bonzini wrote:
>> Il 11/03/2013 11:05, Peter Lieven ha scritto:
>>> ensure that there are no pending I/Os before calling
>>> the sync readcapacity commands. the block_resize monitor
>>> command will also flush all I/O, but double check in
>>> case iscsi_truncate() is called from elsewhere in the
>>> future.
>>>
>>> Signed-off-by: Peter Lieven 
>>> ---
>>>   block/iscsi.c |4 
>>>   1 file changed, 4 insertions(+)
>>>
>>> diff --git a/block/iscsi.c b/block/iscsi.c
>>> index 3d52921..de20d53 100644
>>> --- a/block/iscsi.c
>>> +++ b/block/iscsi.c
>>> @@ -1167,6 +1167,10 @@ static int iscsi_truncate(BlockDriverState *bs,
>>> int64_t offset)
>>>   return -ENOTSUP;
>>>   }
>>>
>>> +/* ensure all async requests are completed before executing
>>> + * a sync readcapacity */
>>> +bdrv_drain_all();
>>> +
>>>   if ((ret = iscsi_readcapacity_sync(iscsilun)) != 0) {
>>>   return ret;
>>>   }
>>
>> NACK to this patch.  It would be a bug, let's fix it properly.
>>
>> The other two are fine, however.
> 
> Paolo, can you ensure that Patch 1+2 get merged before qemu 1.5.0
> to fix the regression.

No, I cannot :) because I'm not the block maintainer.  But I'm sure that
they are on Kevin and Jeff's radar.

Paolo

Re: [Qemu-devel] qemu segfault parsing iscsi options

2013-03-19 Thread Paolo Bonzini

Il 18/03/2013 17:47, Markus Armbruster ha scritto:
> Peter Lieven  writes:
> 
>> On 18.03.2013 13:52, Markus Armbruster wrote:
>>> Peter Lieven  writes:
>>>
 Hi,

 with recent qemu from git qemu segfaults with the following commandline:

 x86_64-softmmu/qemu-system-x86_64 -iscsi test

 qemu-system-x86_64: -iscsi test: There is no option group 'iscsi'
 Speicherzugriffsfehler (Speicherabzug geschrieben)

 It seems that there is something missing regarding the iscsi options
 in qemu-option.hx.

 This was working with qemu-kvm-1.2.0.
>>>
>>> Works for me with current master 225dc991.
>>>
>>
>> for me it doesn't...
>>
>> ~/git/qemu$ git log -1 --oneline
>> 225dc99 s390: Fix cpu refactoring fallout.
>>
>> there is definetly missing some parts in qemu-options.hx for -iscsi
> 
> Hmm, I got CONFIG_LIBISCSI off.  What do I have to install to flip it to
> on (Fedora 17)?

It is only in Fedora 18, but I think it will install just fine if you
download the packages from Koji.

Paolo

[Qemu-devel] [PATCH] fix monitor

2013-03-19 Thread Gerd Hoffmann

chardev flow control broke monitor, fix it by adding watch support.
---
 monitor.c |   23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/monitor.c b/monitor.c
index 112e920..680d344 100644
--- a/monitor.c
+++ b/monitor.c
@@ -261,11 +261,30 @@ int monitor_read_password(Monitor *mon, ReadLineFunc 
*readline_func,
 }
 }
 
+static gboolean monitor_unblocked(GIOChannel *chan, GIOCondition cond,
+  void *opaque)
+{
+monitor_flush(opaque);
+return FALSE;
+}
+
 void monitor_flush(Monitor *mon)
 {
+int rc;
+
 if (mon && mon->outbuf_index != 0 && !mon->mux_out) {
-qemu_chr_fe_write(mon->chr, mon->outbuf, mon->outbuf_index);
-mon->outbuf_index = 0;
+rc = qemu_chr_fe_write(mon->chr, mon->outbuf, mon->outbuf_index);
+if (rc == mon->outbuf_index) {
+/* all flushed */
+mon->outbuf_index = 0;
+return;
+}
+if (rc > 0) {
+/* partinal write */
+memmove(mon->outbuf, mon->outbuf + rc, mon->outbuf_index - rc);
+mon->outbuf_index -= rc;
+}
+qemu_chr_fe_add_watch(mon->chr, G_IO_OUT, monitor_unblocked, mon);
 }
 }
 
-- 
1.7.9.7

Re: [Qemu-devel] [Xen-devel] [PATCHv3] QEMU(upstream): Disable xen's use of O_DIRECT by default as it results in crashes.

2013-03-19 Thread George Dunlap

On Mon, Mar 18, 2013 at 6:00 PM, Paolo Bonzini  wrote:
> Il 18/03/2013 18:38, George Dunlap ha scritto:

>>> This might be a difference between Xen and KVM. On Xen migration is
>>> made to a server in a paused state, and it's only unpaused when
>>> the migration to B is complete. There's a sort of extra handshake at
>>> the end.
>>
>> I think what you mean is that all the memory is handled by Xen and the
>> toolstack, not by qemu.  The qemu state is sent as the very last thing,
>> after all of the memory, and therefore (you are arguing) that qemu is
>> not started, and the files cannot be opened, until after the migration
>> is nearly complete, and certainly until after the file is closed on the
>> sending side.
>
> That would be quite dangerous.  Files aren't closed until after QEMU
> exits; at this point whatever problem you have launching QEMU on the
> destination would be unrecoverable.

But if I understand your concern correctly, you were concerned about
the following scenario:
R1. Receiver qemu opens file
R2. Something causes receiver kernel to cache parts of file (maybe
optimistic read-ahead)
S1. Sender qemu writes to file
S2. Sender qemu does final flush
S3. Sender qemu closes file
R3. Receiver reads stale blocks from cache

Even supposing that Xen doesn't actually shut down qemu until it is
started on the remote side, as long as the file isn't opened by qemu
until after S2, we should be safe, right?  It would look like this:

S1. Sender qemu writes to file
S2. Sender qemu does final flush
R1. Receiver qemu opens file
R2. Receiver kernel caches file
S3. Sender qemu closes file

This is all assuming that:
1. The barrier operations / write flush are effective at getting the
data back on to the NFS server
2. The receiver qemu doesn't open the file until after the last flush
by the sender.

Number 1 has been tested by Alex I believe, and is mentioned in the
changeset log; so if #2 is true, then we should be safe.  I'll try to
verify that today.

> Even for successful migration, it would also be bad for downtime (QEMU
> isn't exactly lightning-fast to start).  And even if failure weren't
> catastrophic, it would be a pity to transfer a few gigs of memory and
> then find out that QEMU isn't present in the destination. :)

Well, if qemu isn't present at the destination, that's definitely user
error. :-)  In any case, I know that he migrate can resume if it
fails, so I suspect that the qemu is just paused on the sending side
until the migration is known to complete.  As long as the last write
was flushed to the NFS server before the receiver opens the file, we
should be safe.

> Still, it's more than possible that I've forgotten something about Xen's
> management of QEMU.

And unfortunately I am not intimately familiar with that codepath; it
just happens that I'm the last person to have to dig into that code
and fix something. :-)

 -George

Re: [Qemu-devel] [PATCH 00/35] hw/ reorganization, part 2

2013-03-19 Thread Peter Maydell

On 19 March 2013 09:26, Paolo Bonzini  wrote:
> Il 18/03/2013 21:21, Peter Maydell ha scritto:
>> On 18 March 2013 20:05, Paolo Bonzini  wrote:
>>> Il 18/03/2013 19:17, Peter Maydell ha scritto:
>> Shouldn't these containers also host the CPU device(s), rather than the
>> boards?  And create them according to the num-cpu property?  If so, they
>> would have to go in hw/arm.
 Yes, ideally they should have the CPU devices in them too.
 Remind me why devices which instantiate the CPU device go
 in hw/arm ?
>>>
>>> Because they refer to ARMCPU/CPUARMState.
>>
>> Well, a container object that instantiated the CPUs wouldn't
>> be referring to the internal CPU state struct, it would just
>> be treating them as QOM objects like any other, so that doesn't
>> apply.
>
> Wouldn't it also bridge the CPU's internal interrupt pins to the GIC?
> Like this code in highbank.c:
>
>
> /* This will become a QOM property eventually */
> irqp = arm_pic_init_cpu(cpu);
> cpu_irq[n] = irqp[ARM_PIC_CPU_IRQ];
> ...
> for (n = 0; n < smp_cpus; n++) {
> sysbus_connect_irq(busdev, n, cpu_irq[n]);
> }

Well, for the CPU to be a proper QOM object it should be exposing
the IRQ/FIQ lines normally, not via the code in hw/arm/pic_cpu.c.
My point is that the QOM abstraction should encapsulate the CPU
cores just like any other piece of hardware. We're not there yet
but that's where we should be going. You can't really put the
CPUs into the a9mpcore &c containers until we've done that
abstraction properly anyway.

-- PMM

Re: [Qemu-devel] [Qemu-ppc] [PATCH 3/5] pseries: Fixes and enhancements to L1 cache properties

2013-03-19 Thread Alexander Graf


On 19.03.2013, at 02:00, David Gibson wrote:

> On Mon, Mar 18, 2013 at 11:54:05AM +0100, Andreas Färber wrote:
>> Am 15.03.2013 13:27, schrieb Alexander Graf:
>>> 
>>> On 14.03.2013, at 02:53, David Gibson wrote:
>>> 
 PAPR requires that the device tree's CPU nodes have several properties
 with information about the L1 cache.  We already create two of these
 properties, but with incorrect names - "[id]cache-block-size" instead
 of "[id]-cache-block-size" (note the extra hyphen).
 
 We were also missing some of the required cache properties.  This
 patch adds the [id]-cache-line-size properties (which have the same
 values as the block size properties in all current cases).  We also
 add the [id]-cache-size properties.
 
 Adding the cache sizes requires some extra infrastructure in the
 general target-ppc code to (optionally) set the cache sizes for
 various CPUs.  The CPU family descriptions in translate_init.c can set
 these sizes - this patch adds correct information for POWER7, I'm
 leaving other CPU types to people who have a physical example to
 verify against.  In addition, for -cpu host we take the values
 advertised by the host (if available) and use those to override the
 information based on PVR.
 
 Signed-off-by: David Gibson 
 ---
 hw/ppc/spapr.c  |   20 ++--
 target-ppc/cpu.h|1 +
 target-ppc/kvm.c|   39 +++
 target-ppc/translate_init.c |4 
 4 files changed, 62 insertions(+), 2 deletions(-)
 
 diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
 index 9a13697..7293082 100644
 --- a/hw/ppc/spapr.c
 +++ b/hw/ppc/spapr.c
 @@ -333,10 +333,26 @@ static void *spapr_create_fdt_skel(const char 
 *cpu_model,
_FDT((fdt_property_string(fdt, "device_type", "cpu")));
 
_FDT((fdt_property_cell(fdt, "cpu-version", env->spr[SPR_PVR])));
 -_FDT((fdt_property_cell(fdt, "dcache-block-size",
 +_FDT((fdt_property_cell(fdt, "d-cache-block-size",
env->dcache_line_size)));
 -_FDT((fdt_property_cell(fdt, "icache-block-size",
 +_FDT((fdt_property_cell(fdt, "d-cache-line-size",
 +env->dcache_line_size)));
 +_FDT((fdt_property_cell(fdt, "i-cache-block-size",
 +env->icache_line_size)));
 +_FDT((fdt_property_cell(fdt, "i-cache-line-size",
env->icache_line_size)));
 +
 +if (env->l1_dcache_size) {
 +_FDT((fdt_property_cell(fdt, "d-cache-size", 
 env->l1_dcache_size)));
 +} else {
 +fprintf(stderr, "Warning: Unknown L1 dcache size for cpu\n");
 +}
 +if (env->l1_icache_size) {
 +_FDT((fdt_property_cell(fdt, "i-cache-size", 
 env->l1_icache_size)));
 +} else {
 +fprintf(stderr, "Warning: Unknown L1 icache size for cpu\n");
 +}
>>> 
>>> The L1 sizes should come from the class, not env, right? Andreas, any ideas 
>>> on this?
>> 
>> Generally speaking,
>> 
>> CPUPPCState: Only if this is used for TCG with an offset from AREG0 (or
>> for legacy grouping reasons).
>> 
>> PowerPCCPU: If you ever intend to let the user override this value
>> (per-instance) from the command line.
>> 
>> PowerPCCPUClass: If the value is always constant at runtime.
>> 
>> I can't tell from a brief look at this patch which may be the case here.
> 
> Ok, on that rationale I'll rework to move it to the class.

Thanks :). I think we decided a while ago that subversions should also be 
individual (sub)classes, so that all of this makes sense again ;).


Alex

Re: [Qemu-devel] [RFC PATCH 0/2] port network layer onto glib

2013-03-19 Thread Peter Maydell

On 19 March 2013 09:30, Markus Armbruster  wrote:
> Coroutines are a perfectly pedestrian control flow construct.

In some languages, sure. Not in C, and we're writing C.

> Coroutines fit about as well with C as threads, namely not really.

Threads are supported by the language runtime provided on all
the systems we support, which is why they are reasonably usable.
When you've persuaded glibc, MacOSX libc and Windows to implement
coroutines please come back and let me know :-)

> If portable coroutines in C really was an intractable problem, the
> solution could not be "no more coroutines, please", only "no coroutines,
> period".  As long as we have to pay the price for coroutines anyway, I
> can't see why we should deny ourselves the benefits.

I'd like to see coroutines gone completely. The first step is not
to let them get used more than they are already.

> C programs have been doing coroutines since forever, using either hand
> coded assembly language stack switching, sigaltstack() trickery,
> ucontext(), w32 fibers, or some coroutine library built on top of these.

...and there's a wide range of really nasty options because none
of them are actually decent solutions to the problem.

-- PMM

Re: [Qemu-devel] [PATCH][RFC 0/14] implement power chip

2013-03-19 Thread Peter Maydell

On 19 March 2013 09:31, li guang  wrote:
> 在 2013-03-19二的 09:05 +，Peter Maydell写道：
>> I suspect this should involve more modelling of actual
>> control signals between the power controller and
>> the devices, not methods on the base class.
>>
>
> do we have to realize something like signals which are actually
> only some copper wires?
> I think we just emulate the real work, that is when some signals
> asserted, we just call corresponding method to do something
> by these embedded method, I want to let devices take care
> of power event(on/off/suspend/wakeup) themselves.

The point is that how exactly power controllers connect
to devices, and which devices respond to reset/suspend/etc
is a property of the individual machine being modelled.
An x86 PC will be different from an ARM devboard which is
different again from the Exynos4 ARM SoC. So to allow this
flexibility, you have to let the machine model do the configuration,
which you do by having the model wire up the power controller
to the devices in the same way it's done on hardware.

>> > I'm eager to get more comments and discussion.
>> > This idea simply based on system board design convention,
>> > I'm not saying a power chip has signals directly connected
>> > to all devices, I mean system board and its devices should
>> > have protocol to deal with power state changes.
>>
>> Hardware does it with signals, so should we.
>
> can these signals be viewed as the calling of corresponding methods?

In some ways, they are -- but the wiring up of the source of
the call to the implementation is done at runtime as devices
are connected together.

-- PMM

Re: [Qemu-devel] [PATCH] qcow2: fix null deref apparent during migration

2013-03-19 Thread Kevin Wolf

Am 19.03.2013 um 10:35 hat Alon Levy geschrieben:
> Signed-off-by: Alon Levy 
> ---
>  block/qcow2.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)

Thanks, Alon. I already sent a more comprehensive patch yesterday:
http://patchwork.ozlabs.org/patch/228462/

I guess I should send a pull request for it quickly.

Kevin

Re: [Qemu-devel] [PATCH V27 1/7] Support for TPM command line options

2013-03-19 Thread Stefan Berger


On 03/19/2013 03:45 AM, Markus Armbruster wrote:

Stefan Berger  writes:


On 03/18/2013 09:10 AM, Markus Armbruster wrote:

In case the TPM is currently not operating on a command there will be
no impact. In case the TPM is operating on a command, it will hold the
thread inside /dev/tpm0 until the command has finished. The solution
here is to write a byte into sysfs file to terminate the TPM from
further executing the command and return; this code exists and is
being invoked in all other cases than abnormal termination obviously.

Unlike other kernel resources such as file descriptors, this one isn't
tied to a process and cleaned up automatically on process termination.
Correct?

No, it's also a file descriptor and will be cleaned up upon process
termination.

Closing the file descriptor does *not* free kernel resources associated
with it?!?


That's not what I said. You suggested "this one isn't tied to a process 
and cleaned up automatically on process termination", which is not true 
but the problem lies somewhere else.





Therefore, we try to clean it up manually.  Actual cleanup code is
tpm_passthrough_cancel_cmd().  Correct?

No, tpm_passthrough_cancel_cmd() sends a byte to the TPM's sysfs entry
for canceling an ongoing command. QEMU has a thread that sends the
commands to the TPM via write() and gets the responses in read(). The
issue being the thread is held inside the write() (!) until the
command has completed inside the host TPM.

You mean write() blocks?


Yep.





Inside the passthrough
driver we set a flag while the command is executing and if a command
is executing when for example QEMU is reset, then the sysfs entry is
written to to get the thread out of the write() due to termination of
the ongoing TPM command.

I can understand a need to abort the write() on guest reset, but why on
QEMU exit()?  Shouldn't process termination abort the write() just fine?


Nope, process termination won't get you out of write(). Even if this was 
to be changed today in the kernel driver, the old kernels are out there 
for a while to be.





  Depending on which version of Fedora is run,
the termination of the command may or may not get invoked.

May or may not get invoked on what condition?


Behavior of the OS upon shutdown itself. Whether it waits for all 
processes to terminate or not.




F14 for
example does not seem to care about waiting until all  user-level
processes have properly shut down and would reset  while F16 seems to
wait for all processes to properly terminate.

F14 is more than nine months past it's shelf date.  Why should we
support it when its makers don't?


F14 is just an example of one OS that can be run just happens to be old 
by by now; there may be others that behave the same way.




The cancellation of an ongoing command may be initiated inside the
guest as well and will also cause tpm_passthrough_cancel_cmd() to be
invoked if a command is executed by the host TPM on behalf of the
guest.

Passing through the guest's cancellation makes sense to me.


There were/are(?) some issues with cancellation of ongoing
commands. Different vendors seem to set different bits to indicate to
the driver as to when a command has been canceled. The specs are not
clear enough in that area. We fixed this for the TPMs that we have
access (to wake up the thread inside the write()).

I'm not doubting the need to cancel commands.  I'm just trying to
understand why we do it on exit().


I can write an OS that sends a command to the TPM and then shuts down 
without waiting for the program to terminate or the even the response 
from the TPM to come back . In this case we would need to cancel the 
command on exit()l.






The only existing cleanup function tpm_passthrough_destroy() does a
whole lot more.  Just to make sure I understand what's going on:
everything but tpm_passthrough_cancel_cmd() is effectively a no-op
there, correct?

It depends on what you mean with no-op. In tpm_passthrough_destroy() a
possibly ongoing command is canceled, then the thread is signaled to
terminate and join()ed. The rest is closing of file descriptors and
freeing of memory.

Closing file descriptors and freeing memory on exit() is a no-op at
best, and a source of bugs otherwise.  The kernel already cleans these
up just fine.

Unless the thread does something interesting in response to its
termination signal, making it terminate is also a no-op.  I can't see
anything interesting happening, please correct me if I'm wrong.



Per what you are saying only no-ops are happening.

   Stefan

Re: [Qemu-devel] [PATCH 00/35] hw/ reorganization, part 2

2013-03-19 Thread Paolo Bonzini

Il 19/03/2013 11:10, Peter Maydell ha scritto:
> On 19 March 2013 09:26, Paolo Bonzini  wrote:
>> Il 18/03/2013 21:21, Peter Maydell ha scritto:
>>> On 18 March 2013 20:05, Paolo Bonzini  wrote:
 Il 18/03/2013 19:17, Peter Maydell ha scritto:
>>> Shouldn't these containers also host the CPU device(s), rather than the
>>> boards?  And create them according to the num-cpu property?  If so, they
>>> would have to go in hw/arm.
> Yes, ideally they should have the CPU devices in them too.
> Remind me why devices which instantiate the CPU device go
> in hw/arm ?

 Because they refer to ARMCPU/CPUARMState.
>>>
>>> Well, a container object that instantiated the CPUs wouldn't
>>> be referring to the internal CPU state struct, it would just
>>> be treating them as QOM objects like any other, so that doesn't
>>> apply.
>>
>> Wouldn't it also bridge the CPU's internal interrupt pins to the GIC?
>> Like this code in highbank.c:
>>
>>
>> /* This will become a QOM property eventually */
>> irqp = arm_pic_init_cpu(cpu);
>> cpu_irq[n] = irqp[ARM_PIC_CPU_IRQ];
>> ...
>> for (n = 0; n < smp_cpus; n++) {
>> sysbus_connect_irq(busdev, n, cpu_irq[n]);
>> }
> 
> Well, for the CPU to be a proper QOM object it should be exposing
> the IRQ/FIQ lines normally, not via the code in hw/arm/pic_cpu.c.

That applies to everything else that was put in hw/ARCH.  Everything
could become a QOM property.

> My point is that the QOM abstraction should encapsulate the CPU
> cores just like any other piece of hardware. We're not there yet
> but that's where we should be going. You can't really put the
> CPUs into the a9mpcore &c containers until we've done that
> abstraction properly anyway.

Why not?  It would remove a bunch of code that is currently duplicated
in the boards.

Paolo

Re: [Qemu-devel] [PATCH 00/35] hw/ reorganization, part 2

2013-03-19 Thread Peter Maydell

On 19 March 2013 10:27, Paolo Bonzini  wrote:
> Il 19/03/2013 11:10, Peter Maydell ha scritto:
>> Well, for the CPU to be a proper QOM object it should be exposing
>> the IRQ/FIQ lines normally, not via the code in hw/arm/pic_cpu.c.
>
> That applies to everything else that was put in hw/ARCH.  Everything
> could become a QOM property.

Yes. If we clean this stuff up I can kick it back out of hw/ARCH :-)

>> My point is that the QOM abstraction should encapsulate the CPU
>> cores just like any other piece of hardware. We're not there yet
>> but that's where we should be going. You can't really put the
>> CPUs into the a9mpcore &c containers until we've done that
>> abstraction properly anyway.
>
> Why not?  It would remove a bunch of code that is currently duplicated
> in the boards.

Hmm, maybe.

-- PMM

Re: [Qemu-devel] [PATCH v4] Add GDB qAttached support

2013-03-19 Thread Fabien Chouteau

On 03/15/2013 01:02 PM, Fabien Chouteau wrote:
> On 03/14/2013 10:07 PM, Jesse Larrew wrote:
>> On 03/14/2013 02:51 PM, Jan Kiszka wrote:
>>> With this patch QEMU handles qAttached request from gdb. When QEMU
>>> replies 1, GDB sends a "detach" command at the end of a debugging
>>> session otherwise GDB sends "kill".
>>>
>>> The default value for qAttached is 1 on system emulation and 0 on user
>>> emulation.
>>>
>>> Based on original version by Fabien Chouteau.
>>>
>>> Signed-off-by: Jan Kiszka 
>>> ---
>>>
>>> As Fabien dropped his attempt to make this configurable, let's
>>> preserve the value of exposing this feature to gdb statically.
>>>
>>>  gdbstub.c |   10 ++
>>>  1 files changed, 10 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/gdbstub.c b/gdbstub.c
>>> index e414ad9..9daee86 100644
>>> --- a/gdbstub.c
>>> +++ b/gdbstub.c
>>> @@ -42,6 +42,12 @@
>>>  #include "sysemu/kvm.h"
>>>  #include "qemu/bitops.h"
>>>
>>> +#ifdef CONFIG_USER_ONLY
>>> +#define GDB_ATTACHED "0"
>>> +#else
>>> +#define GDB_ATTACHED "1"
>>> +#endif
>>> +
>>
>> Yes, I like the #define better.
> 
> Right, since we drop reconfigurability, this is more appropriate.
> 
> 

Looks like we have a consensus. Anthony can you please apply this patch and the 
revert (patch 2/3)?

Thanks in advance,

-- 
Fabien Chouteau

Re: [Qemu-devel] [RFC PATCH 0/2] port network layer onto glib

2013-03-19 Thread Peter Maydell

On 19 March 2013 10:34, Paolo Bonzini  wrote:
> Il 19/03/2013 11:12, Peter Maydell ha scritto:
>> Threads are supported by the language runtime provided on all
>> the systems we support, which is why they are reasonably usable.
>> When you've persuaded glibc, MacOSX libc and Windows to implement
>> coroutines please come back and let me know :-)
>
> Windows supports them (it calls them fibers) and glibc does on many
> architectures (all but ARM, basically).

If you mean ucontext, I'm not sure I'd call that coroutine
support at the library level (and we did implement it on
ARM glibc).

-- PMM

Re: [Qemu-devel] [Xen-devel] [PATCHv3] QEMU(upstream): Disable xen's use of O_DIRECT by default as it results in crashes.

2013-03-19 Thread Paolo Bonzini

Il 19/03/2013 11:06, George Dunlap ha scritto:
> On Mon, Mar 18, 2013 at 6:00 PM, Paolo Bonzini  wrote:
>> Il 18/03/2013 18:38, George Dunlap ha scritto:
>
 This might be a difference between Xen and KVM. On Xen migration is
 made to a server in a paused state, and it's only unpaused when
 the migration to B is complete. There's a sort of extra handshake at
 the end.
>>>
>>> I think what you mean is that all the memory is handled by Xen and the
>>> toolstack, not by qemu.  The qemu state is sent as the very last thing,
>>> after all of the memory, and therefore (you are arguing) that qemu is
>>> not started, and the files cannot be opened, until after the migration
>>> is nearly complete, and certainly until after the file is closed on the
>>> sending side.
>>
>> That would be quite dangerous.  Files aren't closed until after QEMU
>> exits; at this point whatever problem you have launching QEMU on the
>> destination would be unrecoverable.
> 
> But if I understand your concern correctly, you were concerned about
> the following scenario:
> R1. Receiver qemu opens file
> R2. Something causes receiver kernel to cache parts of file (maybe
> optimistic read-ahead)

For some image formats, metadata is cached inside QEMU on startup.
There is a callback to invalidate QEMU's cache at the end of migration,
but that does not extend to the page cache.

> S1. Sender qemu writes to file
> S2. Sender qemu does final flush
> S3. Sender qemu closes file
> R3. Receiver reads stale blocks from cache
> 
> Even supposing that Xen doesn't actually shut down qemu until it is
> started on the remote side, as long as the file isn't opened by qemu
> until after S2, we should be safe, right?  It would look like this:
> 
> S1. Sender qemu writes to file
> S2. Sender qemu does final flush
> R1. Receiver qemu opens file
> R2. Receiver kernel caches file
> S3. Sender qemu closes file
> 
> This is all assuming that:
> 1. The barrier operations / write flush are effective at getting the
> data back on to the NFS server
> 2. The receiver qemu doesn't open the file until after the last flush
> by the sender.
> 
> Number 1 has been tested by Alex I believe, and is mentioned in the
> changeset log; so if #2 is true, then we should be safe.  I'll try to
> verify that today.

Thanks.

>> Even for successful migration, it would also be bad for downtime (QEMU
>> isn't exactly lightning-fast to start).  And even if failure weren't
>> catastrophic, it would be a pity to transfer a few gigs of memory and
>> then find out that QEMU isn't present in the destination. :)
> 
> Well, if qemu isn't present at the destination, that's definitely user
> error. :-)  In any case, I know that he migrate can resume if it
> fails, so I suspect that the qemu is just paused on the sending side
> until the migration is known to complete.  As long as the last write
> was flushed to the NFS server before the receiver opens the file, we
> should be safe.

Note that the close really must happen before the next open.  Otherwise
the file metadata might not be up-to-date on the destination, too.

Paolo

Re: [Qemu-devel] [RFC PATCH 0/2] port network layer onto glib

2013-03-19 Thread Paolo Bonzini

Il 19/03/2013 11:12, Peter Maydell ha scritto:
> On 19 March 2013 09:30, Markus Armbruster  wrote:
>> Coroutines are a perfectly pedestrian control flow construct.
> 
> In some languages, sure. Not in C, and we're writing C.
> 
>> Coroutines fit about as well with C as threads, namely not really.
> 
> Threads are supported by the language runtime provided on all
> the systems we support, which is why they are reasonably usable.
> When you've persuaded glibc, MacOSX libc and Windows to implement
> coroutines please come back and let me know :-)

Windows supports them (it calls them fibers) and glibc does on many
architectures (all but ARM, basically).

Paolo

>> If portable coroutines in C really was an intractable problem, the
>> solution could not be "no more coroutines, please", only "no coroutines,
>> period".  As long as we have to pay the price for coroutines anyway, I
>> can't see why we should deny ourselves the benefits.
> 
> I'd like to see coroutines gone completely. The first step is not
> to let them get used more than they are already.
> 
>> C programs have been doing coroutines since forever, using either hand
>> coded assembly language stack switching, sigaltstack() trickery,
>> ucontext(), w32 fibers, or some coroutine library built on top of these.
> 
> ...and there's a wide range of really nasty options because none
> of them are actually decent solutions to the problem.
> 
> -- PMM
>

Re: [Qemu-devel] [PATCH] virtio-blk: Do not segfault fault if failed to initialize dataplane

2013-03-19 Thread Kevin Wolf

Am 19.03.2013 um 09:37 hat KONRAD Frédéric geschrieben:
> On 19/03/2013 09:27, Dunrong Huang wrote:
> >$ ~/usr/bin/qemu-system-x86_64 -enable-kvm -m 1024 -drive 
> >if=none,id=drive0,cache=none,aio=native,format=raw,file=/root/Image/centos-6.4.raw
> > -device virtio-blk-pci,drive=drive0,scsi=off,x-data-plane=on,config-wce=on 
> ># make dataplane fail to initialize
> >qemu-system-x86_64: -device 
> >virtio-blk-pci,drive=drive0,scsi=off,x-data-plane=on,config-wce=on: device 
> >is incompatible with x-data-plane, use config-wce=off
> >*** glibc detected *** /root/usr/bin/qemu-system-x86_64: free(): invalid 
> >pointer: 0x7f001fef12f8 ***
> >=== Backtrace: =
> >/lib64/libc.so.6(+0x7d776)[0x7f00153a5776]
> >/root/usr/bin/qemu-system-x86_64(+0x2c34ec)[0x7f001cf5b4ec]
> >/root/usr/bin/qemu-system-x86_64(+0x342f9a)[0x7f001cfdaf9a]
> >/root/usr/bin/qemu-system-x86_64(+0x33694e)[0x7f001cfce94e]
> >
> >
> >  (gdb) bt
> >  #0  0x7f3bf3a12015 in raise () from /lib64/libc.so.6
> >  #1  0x7f3bf3a1348b in abort () from /lib64/libc.so.6
> >  #2  0x7f3bf3a51a4e in __libc_message () from /lib64/libc.so.6
> >  #3  0x7f3bf3a57776 in malloc_printerr () from /lib64/libc.so.6
> >  #4  0x7f3bfb60d4ec in free_and_trace (mem=0x7f3bfe0129f8) at vl.c:2786
> >  #5  0x7f3bfb68cf9a in virtio_cleanup (vdev=0x7f3bfe0129f8) at 
> > /root/Develop/QEMU/qemu/hw/virtio.c:900
> >  #6  0x7f3bfb68094e in virtio_blk_device_init (vdev=0x7f3bfe0129f8) at 
> > /root/Develop/QEMU/qemu/hw/virtio-blk.c:666
> >  #7  0x7f3bfb68dadf in virtio_device_init (qdev=0x7f3bfe0129f8) at 
> > /root/Develop/QEMU/qemu/hw/virtio.c:1092
> >  #8  0x7f3bfb50da46 in device_realize (dev=0x7f3bfe0129f8, 
> > err=0x7fff479c9258) at hw/qdev.c:176
> >.
> >
> >In virtio_blk_device_init(), the memory which vdev point to is a static
> >member of "struct VirtIOBlkPCI", not heap memory, and it does not
> >get freed. So we shoule use virtio_common_cleanup() to clean this 
> >VirtIODevice
> >rather than virtio_cleanup(), which attempts to free the vdev.
> >
> >This error was introduced by commit 05ff686536f408ba6e8426b1b54d25bd3379fda2
> >recently.
> >
> >Signed-off-by: Dunrong Huang 
> Reviewed-by: KONRAD Frederic 
> 
> Oops sorry for that :/

So virtio_init() has to be paired with virtio_common_cleanup(), and
virtio_common_init() with virtio_cleanup()? Confusing...

Anyway, the patch looks correct. Thanks, applied to the block branch.

Kevin

Re: [Qemu-devel] [Xen-devel] [PATCHv3] QEMU(upstream): Disable xen's use of O_DIRECT by default as it results in crashes.

2013-03-19 Thread George Dunlap


On 03/19/2013 10:43 AM, Paolo Bonzini wrote:

Even for successful migration, it would also be bad for downtime (QEMU
isn't exactly lightning-fast to start).  And even if failure weren't
catastrophic, it would be a pity to transfer a few gigs of memory and
then find out that QEMU isn't present in the destination. :)


Well, if qemu isn't present at the destination, that's definitely user
error. :-)  In any case, I know that he migrate can resume if it
fails, so I suspect that the qemu is just paused on the sending side
until the migration is known to complete.  As long as the last write
was flushed to the NFS server before the receiver opens the file, we
should be safe.


Note that the close really must happen before the next open.  Otherwise
the file metadata might not be up-to-date on the destination, too.


By "file metadata" I assume you mean "metadata about the virtual disk 
within the file", not "metadata about the file within the filesystem", 
right?  That's good to know, I'll keep that in mind.


Even if it's true that at the moment qemu doesn't write the file 
metadata until it closes the file, that just means we'd have to add a 
hook to the callback to save qemu state, to sync the metadata at that 
point, right?  (This may in fact already be done; I'm more or less 
completely unfamiliar with the qemu side of save/restore.)


 -George

Re: [Qemu-devel] [PATCH v2 1/2] arm_gic: Fix sizes of state fields in preparation for vmstate support

2013-03-19 Thread Andreas Färber

Am 18.03.2013 18:47, schrieb Peter Maydell:
> In preparation for switching to vmstate for migration support, fix
> the sizes of various GIC state fields. In particular, we replace all
> the bitfields (which VMState can't deal with) with straightforward
> uint8_t values which we do bit operations on. (The bitfields made
> more sense when NCPU was set differently in different situations,
> but we now always model at the architectural limit of 8.)
> 
> Signed-off-by: Peter Maydell 

Reviewed-by: Andreas Färber 

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [PATCH v2 2/2] hw/arm_gic_common: Use vmstate struct rather than save/load functions

2013-03-19 Thread Andreas Färber

Am 18.03.2013 21:43, schrieb Igor Mitsyanko:
> 
> On Mar 19, 2013 12:21 AM, "Peter Maydell"  > wrote:
>>
>> On 18 March 2013 19:48, Igor Mitsyanko  > wrote:
>> >> On 03/18/2013 09:47 PM, Peter Maydell wrote:
>> >>>
>> >>> +VMSTATE_BUFFER_UNSAFE(last_active, GICState, 0,
>> >>> +  GIC_MAXIRQ * NCPU * sizeof(uint16_t)),
>>
>> > I'm not sure about this one, do we have any guarantees that it will
> always
>> > be tightly packed? What will happen when we will try to migrate VM
> between
>> > BE and LE hosts?
>>
>> Ugh. I think the packing is ok but I hadn't thought about the
>> endianness issue.
>>
>> Gerd and I were talking on IRC about 2D arrays. I think we came to
>> the conclusion that you could provide a new set of vmstate macros
>> for 2D arrays which basically work just like the existing 1D array
>> ones except that the typecheck is different.
>>
>> (vmstate.h is getting hugely repetitive to the point that I'm
>> really tempted to say we should just autogenerate it. That way
>> you could define a fairly small set of things (arrays, base types,
>> safe vs unsafe, etc) and have a script generate the cross product,
>> rather than the current setup where there is a lot of hand written
>> repetition and a tendency to gaps in the coverage where nobody's
>> using them yet.)
>>
>> -- PMM
> 
> I can recall a qemu-devel discussion over a long-term QOM goals a while
> ago.Somebody suggested that in the future we will define devices state
> structures using some special macro which will be parsed during
> compilation, serializing each member for both QOM introspection and
> vmstate migration.

QIDL - CC'ing Michael.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [PATCH 3/5] pseries: Fixes and enhancements to L1 cache properties

2013-03-19 Thread Andreas Färber

Am 18.03.2013 12:05, schrieb Alexander Graf:
> 
> On 18.03.2013, at 11:54, Andreas Färber wrote:
> 
>> Am 15.03.2013 13:27, schrieb Alexander Graf:
>>>
>>> On 14.03.2013, at 02:53, David Gibson wrote:
>>>
 PAPR requires that the device tree's CPU nodes have several properties
 with information about the L1 cache.  We already create two of these
 properties, but with incorrect names - "[id]cache-block-size" instead
 of "[id]-cache-block-size" (note the extra hyphen).

 We were also missing some of the required cache properties.  This
 patch adds the [id]-cache-line-size properties (which have the same
 values as the block size properties in all current cases).  We also
 add the [id]-cache-size properties.

 Adding the cache sizes requires some extra infrastructure in the
 general target-ppc code to (optionally) set the cache sizes for
 various CPUs.  The CPU family descriptions in translate_init.c can set
 these sizes - this patch adds correct information for POWER7, I'm
 leaving other CPU types to people who have a physical example to
 verify against.  In addition, for -cpu host we take the values
 advertised by the host (if available) and use those to override the
 information based on PVR.

 Signed-off-by: David Gibson 
 ---
 hw/ppc/spapr.c  |   20 ++--
 target-ppc/cpu.h|1 +
 target-ppc/kvm.c|   39 +++
 target-ppc/translate_init.c |4 
 4 files changed, 62 insertions(+), 2 deletions(-)

 diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
 index 9a13697..7293082 100644
 --- a/hw/ppc/spapr.c
 +++ b/hw/ppc/spapr.c
 @@ -333,10 +333,26 @@ static void *spapr_create_fdt_skel(const char 
 *cpu_model,
_FDT((fdt_property_string(fdt, "device_type", "cpu")));

_FDT((fdt_property_cell(fdt, "cpu-version", env->spr[SPR_PVR])));
 -_FDT((fdt_property_cell(fdt, "dcache-block-size",
 +_FDT((fdt_property_cell(fdt, "d-cache-block-size",
env->dcache_line_size)));
 -_FDT((fdt_property_cell(fdt, "icache-block-size",
 +_FDT((fdt_property_cell(fdt, "d-cache-line-size",
 +env->dcache_line_size)));
 +_FDT((fdt_property_cell(fdt, "i-cache-block-size",
 +env->icache_line_size)));
 +_FDT((fdt_property_cell(fdt, "i-cache-line-size",
env->icache_line_size)));
 +
 +if (env->l1_dcache_size) {
 +_FDT((fdt_property_cell(fdt, "d-cache-size", 
 env->l1_dcache_size)));
 +} else {
 +fprintf(stderr, "Warning: Unknown L1 dcache size for cpu\n");
 +}
 +if (env->l1_icache_size) {
 +_FDT((fdt_property_cell(fdt, "i-cache-size", 
 env->l1_icache_size)));
 +} else {
 +fprintf(stderr, "Warning: Unknown L1 icache size for cpu\n");
 +}
>>>
>>> The L1 sizes should come from the class, not env, right? Andreas, any ideas 
>>> on this?
>>
>> Generally speaking,
>>
>> CPUPPCState: Only if this is used for TCG with an offset from AREG0 (or
>> for legacy grouping reasons).
>>
>> PowerPCCPU: If you ever intend to let the user override this value
>> (per-instance) from the command line.
>>
>> PowerPCCPUClass: If the value is always constant at runtime.
>>
>> I can't tell from a brief look at this patch which may be the case here.
> 
> Imagine the following:
> 
>   PPC
> `- POWER7
> |- POWER7_v10
> `- POWER7_v20
> 
> Version 1.0 has L1 cache size of x MB. Version 2.0 has L1 cache size of y MB. 
> The user should never override the value (except with -cpu host). It is 
> constant during the lifetime of a VM.

Alex, you asked for an answer here, but I don't spot a question. :-)

I'm guessing requirements like these mean that we need to introduce a
new POWERPC_DEF_SVR_CLS_INST(name, pvr, svr, type, snippet1, snippet2)
macro to put additional code into class_init and instance_init
respectively and let _DEF() and _DEF_SVR() pass nothing there...
Possibly add specific macros wrapping the above to keep the model list
readable.

Either way there's a trade-off between keeping easy macros hiding QOM
boilerplate code vs. having high flexibility of what code to put in
there - that's why I resisted your requests to hide too much in macros
at the family level.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [PATCH 3/5] pseries: Fixes and enhancements to L1 cache properties

2013-03-19 Thread Alexander Graf


On 19.03.2013, at 12:06, Andreas Färber wrote:

> Am 18.03.2013 12:05, schrieb Alexander Graf:
>> 
>> On 18.03.2013, at 11:54, Andreas Färber wrote:
>> 
>>> Am 15.03.2013 13:27, schrieb Alexander Graf:
 
 On 14.03.2013, at 02:53, David Gibson wrote:
 
> PAPR requires that the device tree's CPU nodes have several properties
> with information about the L1 cache.  We already create two of these
> properties, but with incorrect names - "[id]cache-block-size" instead
> of "[id]-cache-block-size" (note the extra hyphen).
> 
> We were also missing some of the required cache properties.  This
> patch adds the [id]-cache-line-size properties (which have the same
> values as the block size properties in all current cases).  We also
> add the [id]-cache-size properties.
> 
> Adding the cache sizes requires some extra infrastructure in the
> general target-ppc code to (optionally) set the cache sizes for
> various CPUs.  The CPU family descriptions in translate_init.c can set
> these sizes - this patch adds correct information for POWER7, I'm
> leaving other CPU types to people who have a physical example to
> verify against.  In addition, for -cpu host we take the values
> advertised by the host (if available) and use those to override the
> information based on PVR.
> 
> Signed-off-by: David Gibson 
> ---
> hw/ppc/spapr.c  |   20 ++--
> target-ppc/cpu.h|1 +
> target-ppc/kvm.c|   39 +++
> target-ppc/translate_init.c |4 
> 4 files changed, 62 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 9a13697..7293082 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -333,10 +333,26 @@ static void *spapr_create_fdt_skel(const char 
> *cpu_model,
>   _FDT((fdt_property_string(fdt, "device_type", "cpu")));
> 
>   _FDT((fdt_property_cell(fdt, "cpu-version", env->spr[SPR_PVR])));
> -_FDT((fdt_property_cell(fdt, "dcache-block-size",
> +_FDT((fdt_property_cell(fdt, "d-cache-block-size",
>   env->dcache_line_size)));
> -_FDT((fdt_property_cell(fdt, "icache-block-size",
> +_FDT((fdt_property_cell(fdt, "d-cache-line-size",
> +env->dcache_line_size)));
> +_FDT((fdt_property_cell(fdt, "i-cache-block-size",
> +env->icache_line_size)));
> +_FDT((fdt_property_cell(fdt, "i-cache-line-size",
>   env->icache_line_size)));
> +
> +if (env->l1_dcache_size) {
> +_FDT((fdt_property_cell(fdt, "d-cache-size", 
> env->l1_dcache_size)));
> +} else {
> +fprintf(stderr, "Warning: Unknown L1 dcache size for cpu\n");
> +}
> +if (env->l1_icache_size) {
> +_FDT((fdt_property_cell(fdt, "i-cache-size", 
> env->l1_icache_size)));
> +} else {
> +fprintf(stderr, "Warning: Unknown L1 icache size for cpu\n");
> +}
 
 The L1 sizes should come from the class, not env, right? Andreas, any 
 ideas on this?
>>> 
>>> Generally speaking,
>>> 
>>> CPUPPCState: Only if this is used for TCG with an offset from AREG0 (or
>>> for legacy grouping reasons).
>>> 
>>> PowerPCCPU: If you ever intend to let the user override this value
>>> (per-instance) from the command line.
>>> 
>>> PowerPCCPUClass: If the value is always constant at runtime.
>>> 
>>> I can't tell from a brief look at this patch which may be the case here.
>> 
>> Imagine the following:
>> 
>>  PPC
>>`- POWER7
>>|- POWER7_v10
>>`- POWER7_v20
>> 
>> Version 1.0 has L1 cache size of x MB. Version 2.0 has L1 cache size of y 
>> MB. The user should never override the value (except with -cpu host). It is 
>> constant during the lifetime of a VM.
> 
> Alex, you asked for an answer here, but I don't spot a question. :-)
> 
> I'm guessing requirements like these mean that we need to introduce a
> new POWERPC_DEF_SVR_CLS_INST(name, pvr, svr, type, snippet1, snippet2)
> macro to put additional code into class_init and instance_init
> respectively and let _DEF() and _DEF_SVR() pass nothing there...
> Possibly add specific macros wrapping the above to keep the model list
> readable.
> 
> Either way there's a trade-off between keeping easy macros hiding QOM
> boilerplate code vs. having high flexibility of what code to put in
> there - that's why I resisted your requests to hide too much in macros
> at the family level.

Can't we just do a POWERPC_DEF_FUNC(name, pvr, type, init_func) which would 
pass in a model specific initialization function in addition to the easy to 
read defaults? :)


Alex

Re: [Qemu-devel] [Xen-devel] [PATCHv3] QEMU(upstream): Disable xen's use of O_DIRECT by default as it results in crashes.

2013-03-19 Thread Paolo Bonzini

Il 19/03/2013 11:51, George Dunlap ha scritto:
> On 03/19/2013 10:43 AM, Paolo Bonzini wrote:
 Even for successful migration, it would also be bad for downtime (QEMU
 isn't exactly lightning-fast to start).  And even if failure weren't
 catastrophic, it would be a pity to transfer a few gigs of memory and
 then find out that QEMU isn't present in the destination. :)
>>>
>>> Well, if qemu isn't present at the destination, that's definitely user
>>> error. :-)  In any case, I know that he migrate can resume if it
>>> fails, so I suspect that the qemu is just paused on the sending side
>>> until the migration is known to complete.  As long as the last write
>>> was flushed to the NFS server before the receiver opens the file, we
>>> should be safe.
>>
>> Note that the close really must happen before the next open.  Otherwise
>> the file metadata might not be up-to-date on the destination, too.
> 
> By "file metadata" I assume you mean "metadata about the virtual disk
> within the file", not "metadata about the file within the filesystem",
> right?  That's good to know, I'll keep that in mind.

Actually especially the former (I'm calling them respectively "image
metadata" and "file metadata").  File metadata could also be a problem,
but I think it might just work except in cases like on-line resizing
during migration.

> Even if it's true that at the moment qemu doesn't write the file
> metadata until it closes the file, that just means we'd have to add a
> hook to the callback to save qemu state, to sync the metadata at that
> point, right?

Unfortunately no.  The problem is in the loading side's kernel, on which
you do not have any control.  If the loading side doesn't use O_DIRECT,
any attempt to invalidate the metadata in userspace or on the source is
futile, because there is no way to invalidate the page cache's copy of
that metadata.

Paolo

Re: [Qemu-devel] [PATCH 3/5] pseries: Fixes and enhancements to L1 cache properties

2013-03-19 Thread Andreas Färber

Am 19.03.2013 12:09, schrieb Alexander Graf:
> 
> On 19.03.2013, at 12:06, Andreas Färber wrote:
> 
>> Am 18.03.2013 12:05, schrieb Alexander Graf:
>>>
>>> On 18.03.2013, at 11:54, Andreas Färber wrote:
>>>
 Am 15.03.2013 13:27, schrieb Alexander Graf:
>
> On 14.03.2013, at 02:53, David Gibson wrote:
>
>> PAPR requires that the device tree's CPU nodes have several properties
>> with information about the L1 cache.  We already create two of these
>> properties, but with incorrect names - "[id]cache-block-size" instead
>> of "[id]-cache-block-size" (note the extra hyphen).
>>
>> We were also missing some of the required cache properties.  This
>> patch adds the [id]-cache-line-size properties (which have the same
>> values as the block size properties in all current cases).  We also
>> add the [id]-cache-size properties.
>>
>> Adding the cache sizes requires some extra infrastructure in the
>> general target-ppc code to (optionally) set the cache sizes for
>> various CPUs.  The CPU family descriptions in translate_init.c can set
>> these sizes - this patch adds correct information for POWER7, I'm
>> leaving other CPU types to people who have a physical example to
>> verify against.  In addition, for -cpu host we take the values
>> advertised by the host (if available) and use those to override the
>> information based on PVR.
>>
>> Signed-off-by: David Gibson 
>> ---
>> hw/ppc/spapr.c  |   20 ++--
>> target-ppc/cpu.h|1 +
>> target-ppc/kvm.c|   39 
>> +++
>> target-ppc/translate_init.c |4 
>> 4 files changed, 62 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 9a13697..7293082 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -333,10 +333,26 @@ static void *spapr_create_fdt_skel(const char 
>> *cpu_model,
>>   _FDT((fdt_property_string(fdt, "device_type", "cpu")));
>>
>>   _FDT((fdt_property_cell(fdt, "cpu-version", env->spr[SPR_PVR])));
>> -_FDT((fdt_property_cell(fdt, "dcache-block-size",
>> +_FDT((fdt_property_cell(fdt, "d-cache-block-size",
>>   env->dcache_line_size)));
>> -_FDT((fdt_property_cell(fdt, "icache-block-size",
>> +_FDT((fdt_property_cell(fdt, "d-cache-line-size",
>> +env->dcache_line_size)));
>> +_FDT((fdt_property_cell(fdt, "i-cache-block-size",
>> +env->icache_line_size)));
>> +_FDT((fdt_property_cell(fdt, "i-cache-line-size",
>>   env->icache_line_size)));
>> +
>> +if (env->l1_dcache_size) {
>> +_FDT((fdt_property_cell(fdt, "d-cache-size", 
>> env->l1_dcache_size)));
>> +} else {
>> +fprintf(stderr, "Warning: Unknown L1 dcache size for 
>> cpu\n");
>> +}
>> +if (env->l1_icache_size) {
>> +_FDT((fdt_property_cell(fdt, "i-cache-size", 
>> env->l1_icache_size)));
>> +} else {
>> +fprintf(stderr, "Warning: Unknown L1 icache size for 
>> cpu\n");
>> +}
>
> The L1 sizes should come from the class, not env, right? Andreas, any 
> ideas on this?

 Generally speaking,

 CPUPPCState: Only if this is used for TCG with an offset from AREG0 (or
 for legacy grouping reasons).

 PowerPCCPU: If you ever intend to let the user override this value
 (per-instance) from the command line.

 PowerPCCPUClass: If the value is always constant at runtime.

 I can't tell from a brief look at this patch which may be the case here.
>>>
>>> Imagine the following:
>>>
>>>  PPC
>>>`- POWER7
>>>|- POWER7_v10
>>>`- POWER7_v20
>>>
>>> Version 1.0 has L1 cache size of x MB. Version 2.0 has L1 cache size of y 
>>> MB. The user should never override the value (except with -cpu host). It is 
>>> constant during the lifetime of a VM.
>>
>> Alex, you asked for an answer here, but I don't spot a question. :-)
>>
>> I'm guessing requirements like these mean that we need to introduce a
>> new POWERPC_DEF_SVR_CLS_INST(name, pvr, svr, type, snippet1, snippet2)
>> macro to put additional code into class_init and instance_init
>> respectively and let _DEF() and _DEF_SVR() pass nothing there...
>> Possibly add specific macros wrapping the above to keep the model list
>> readable.
>>
>> Either way there's a trade-off between keeping easy macros hiding QOM
>> boilerplate code vs. having high flexibility of what code to put in
>> there - that's why I resisted your requests to hide too much in macros
>> at the family level.
> 
> Can't we just do a POWERPC_DEF_FUNC(name, pvr, type, init_f

Re: [Qemu-devel] [PATCH 0/2] fix segfault in (iscsi) option parsing

2013-03-19 Thread Paolo Bonzini

Il 19/03/2013 09:51, Markus Armbruster ha scritto:
> Paolo, please have a look.

Why isn't it enough to call bdrv_init_with_whitelist earlier?

There is no conditional logic in it, the whitelist is checked at open time.

Paolo

> Peter Lieven  writes:
> 
>> current git master segfaults if an iscsi option is specified
>> in command line.
>>
>> Peter Lieven (2):
>>   qemu-option: avoid segfault if QemuOptsList == NULL
>>   vl.c: fix segfault in iscsi options parsing
>>
>>  block/iscsi.c  |   27 ---
>>  util/qemu-option.c |1 +
>>  vl.c   |   40 +---
>>  3 files changed, 38 insertions(+), 30 deletions(-)
> 
>

[Qemu-devel] [PATCH 1/5] sheepdog: show error message for halt status

2013-03-19 Thread Kevin Wolf

From: Liu Yuan 

Sheepdog (neither quorum nor unsafe mode) will refuse to serve IO requests when
number of alive nodes is less than that of copies specified by users. This will
return 0x19 to QEMU client which currently doesn't recognize it.

This patch adds an error description when QEMU client receives it, other than
plainly printing 'Invalid error code'

Cc: MORITA Kazutaka 
Cc: Kevin Wolf 
Cc: Stefan Hajnoczi 
Signed-off-by: Liu Yuan 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: MORITA Kazutaka 
Signed-off-by: Kevin Wolf 
---
 block/sheepdog.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index 4245328..54d3e53 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -65,6 +65,7 @@
 #define SD_RES_WAIT_FOR_FORMAT  0x16 /* Waiting for a format operation */
 #define SD_RES_WAIT_FOR_JOIN0x17 /* Waiting for other nodes joining */
 #define SD_RES_JOIN_FAILED   0x18 /* Target node had failed to join sheepdog */
+#define SD_RES_HALT  0x19 /* Sheepdog is stopped serving IO request */
 
 /*
  * Object ID rules
@@ -344,6 +345,7 @@ static const char * sd_strerror(int err)
 {SD_RES_WAIT_FOR_FORMAT, "Sheepdog is waiting for a format operation"},
 {SD_RES_WAIT_FOR_JOIN, "Sheepdog is waiting for other nodes joining"},
 {SD_RES_JOIN_FAILED, "Target node had failed to join sheepdog"},
+{SD_RES_HALT, "Sheepdog is stopped serving IO request"},
 };
 
 for (i = 0; i < ARRAY_SIZE(errors); ++i) {
-- 
1.8.1.4

[Qemu-devel] [PATCH 3/5] block: fix BDRV_O_SNAPSHOT protocol detection

2013-03-19 Thread Kevin Wolf

From: Stefan Hajnoczi 

realpath(3) is used to get an absolute path to the image file when
creating a -drive snapshot=on temporary qcow2.  This does not work for
protocols since their filenames ("proto:foo:...") do not correspond to
file system paths.

Commit 7c96d46ec245d73fd76726588409f9abe4bd5dc1 ("Let snapshot work with
protocols") skipped realpath(3) for protocols.  Later on the "raw"
format was introduced and broke the check.

Use path_has_protocol(filename) to decide if this image uses a protocol
or a filename.

Reported-by: Richard Jones 
Signed-off-by: Stefan Hajnoczi 
Signed-off-by: Kevin Wolf 
---
 block.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/block.c b/block.c
index 037e15e..0a062c9 100644
--- a/block.c
+++ b/block.c
@@ -830,7 +830,6 @@ int bdrv_open(BlockDriverState *bs, const char *filename, 
QDict *options,
 if (flags & BDRV_O_SNAPSHOT) {
 BlockDriverState *bs1;
 int64_t total_size;
-int is_protocol = 0;
 BlockDriver *bdrv_qcow2;
 QEMUOptionParameter *options;
 char backing_filename[PATH_MAX];
@@ -847,9 +846,6 @@ int bdrv_open(BlockDriverState *bs, const char *filename, 
QDict *options,
 }
 total_size = bdrv_getlength(bs1) & BDRV_SECTOR_MASK;
 
-if (bs1->drv && bs1->drv->protocol_name)
-is_protocol = 1;
-
 bdrv_delete(bs1);
 
 ret = get_tmp_filename(tmp_filename, sizeof(tmp_filename));
@@ -858,7 +854,7 @@ int bdrv_open(BlockDriverState *bs, const char *filename, 
QDict *options,
 }
 
 /* Real path is meaningless for protocols */
-if (is_protocol) {
+if (path_has_protocol(filename)) {
 snprintf(backing_filename, sizeof(backing_filename),
  "%s", filename);
 } else if (!realpath(filename, backing_filename)) {
-- 
1.8.1.4

[Qemu-devel] [PATCH 2/5] qcow2: Fix segfault in qcow2_invalidate_cache

2013-03-19 Thread Kevin Wolf

Need to pass an options QDict to qcow2_open() now. This fixes a segfault
on the migration target with qcow2.

Signed-off-by: Kevin Wolf 
---
 block/qcow2.c | 12 ++--
 block/qcow2.h |  3 +++
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 1f99866..98bb7f3 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -29,6 +29,7 @@
 #include "block/qcow2.h"
 #include "qemu/error-report.h"
 #include "qapi/qmp/qerror.h"
+#include "qapi/qmp/qbool.h"
 #include "trace.h"
 
 /*
@@ -520,7 +521,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, 
int flags)
 goto fail;
 }
 
-s->use_lazy_refcounts = qemu_opt_get_bool(opts, "lazy_refcounts",
+s->use_lazy_refcounts = qemu_opt_get_bool(opts, QCOW2_OPT_LAZY_REFCOUNTS,
 (s->compatible_features & QCOW2_COMPAT_LAZY_REFCOUNTS));
 
 qemu_opts_del(opts);
@@ -930,6 +931,7 @@ static void qcow2_invalidate_cache(BlockDriverState *bs)
 AES_KEY aes_encrypt_key;
 AES_KEY aes_decrypt_key;
 uint32_t crypt_method = 0;
+QDict *options;
 
 /*
  * Backing files are read-only which makes all of their metadata immutable,
@@ -944,8 +946,14 @@ static void qcow2_invalidate_cache(BlockDriverState *bs)
 
 qcow2_close(bs);
 
+options = qdict_new();
+qdict_put(options, QCOW2_OPT_LAZY_REFCOUNTS,
+  qbool_from_int(s->use_lazy_refcounts));
+
 memset(s, 0, sizeof(BDRVQcowState));
-qcow2_open(bs, NULL, flags);
+qcow2_open(bs, options, flags);
+
+QDECREF(options);
 
 if (crypt_method) {
 s->crypt_method = crypt_method;
diff --git a/block/qcow2.h b/block/qcow2.h
index 103abdb..e4b5e11 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -58,6 +58,9 @@
 
 #define DEFAULT_CLUSTER_SIZE 65536
 
+
+#define QCOW2_OPT_LAZY_REFCOUNTS "lazy_refcounts"
+
 typedef struct QCowHeader {
 uint32_t magic;
 uint32_t version;
-- 
1.8.1.4

[Qemu-devel] [PATCH V2 3/3] target-arm: Fix VFP register byte order in GDB remote

2013-03-19 Thread Fabien Chouteau

>From GDB Remote Serial Protocol doc:

"The bytes with the register are transmitted in target byte order."

Signed-off-by: Fabien Chouteau 
---
 target-arm/helper.c |   23 ---
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index e97e1a5..1ba25e1 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -16,18 +16,22 @@ static int vfp_gdb_get_reg(CPUARMState *env, uint8_t *buf, 
int reg)
 {
 int nregs;
 
-/* VFP data registers are always little-endian.  */
 nregs = arm_feature(env, ARM_FEATURE_VFP3) ? 32 : 16;
 if (reg < nregs) {
-stfq_le_p(buf, env->vfp.regs[reg]);
+stfq_p(buf, env->vfp.regs[reg]);
 return 8;
 }
 if (arm_feature(env, ARM_FEATURE_NEON)) {
 /* Aliases for Q regs.  */
 nregs += 16;
 if (reg < nregs) {
-stfq_le_p(buf, env->vfp.regs[(reg - 32) * 2]);
-stfq_le_p(buf + 8, env->vfp.regs[(reg - 32) * 2 + 1]);
+#ifdef TARGET_WORDS_BIGENDIAN
+stfq_p(buf, env->vfp.regs[(reg - 32) * 2 + 1]);
+stfq_p(buf + 8, env->vfp.regs[(reg - 32) * 2]);
+#else
+stfq_p(buf, env->vfp.regs[(reg - 32) * 2]);
+stfq_p(buf + 8, env->vfp.regs[(reg - 32) * 2 + 1]);
+#endif  /* TARGET_WORDS_BIGENDIAN */
 return 16;
 }
 }
@@ -45,14 +49,19 @@ static int vfp_gdb_set_reg(CPUARMState *env, uint8_t *buf, 
int reg)
 
 nregs = arm_feature(env, ARM_FEATURE_VFP3) ? 32 : 16;
 if (reg < nregs) {
-env->vfp.regs[reg] = ldfq_le_p(buf);
+env->vfp.regs[reg] = ldfq_p(buf);
 return 8;
 }
 if (arm_feature(env, ARM_FEATURE_NEON)) {
 nregs += 16;
 if (reg < nregs) {
-env->vfp.regs[(reg - 32) * 2] = ldfq_le_p(buf);
-env->vfp.regs[(reg - 32) * 2 + 1] = ldfq_le_p(buf + 8);
+#ifdef TARGET_WORDS_BIGENDIAN
+env->vfp.regs[(reg - 32) * 2 + 1] = ldfq_p(buf);
+env->vfp.regs[(reg - 32) * 2] = ldfq_p(buf + 8);
+#else
+env->vfp.regs[(reg - 32) * 2] = ldfq_p(buf);
+env->vfp.regs[(reg - 32) * 2 + 1] = ldfq_p(buf + 8);
+#endif  /* TARGET_WORDS_BIGENDIAN */
 return 16;
 }
 }
-- 
1.7.9.5

[Qemu-devel] [PATCH V2 1/3] QAPI: Add ARMEB target-type

2013-03-19 Thread Fabien Chouteau


Signed-off-by: Fabien Chouteau 
---
 qapi-schema.json |9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/qapi-schema.json b/qapi-schema.json
index 28b070f..0615715 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2993,10 +2993,11 @@
 # Since: 1.2.0
 ##
 { 'enum': 'TargetType',
-  'data': [ 'alpha', 'arm', 'cris', 'i386', 'lm32', 'm68k', 'microblazeel',
-'microblaze', 'mips64el', 'mips64', 'mipsel', 'mips', 'or32',
-'ppc64', 'ppcemb', 'ppc', 's390x', 'sh4eb', 'sh4', 'sparc64',
-'sparc', 'unicore32', 'x86_64', 'xtensaeb', 'xtensa' ] }
+  'data': [ 'alpha', 'arm', 'armeb','cris', 'i386', 'lm32', 'm68k',
+'microblazeel', 'microblaze', 'mips64el', 'mips64', 'mipsel',
+'mips', 'or32', 'ppc64', 'ppcemb', 'ppc', 's390x', 'sh4eb',
+'sh4', 'sparc64', 'sparc', 'unicore32', 'x86_64', 'xtensaeb',
+'xtensa' ] }
 
 ##
 # @TargetInfo:
-- 
1.7.9.5

[Qemu-devel] [PATCH V2 2/3] Add default config for armeb-softmmu

2013-03-19 Thread Fabien Chouteau

Just create one that includes arm-softmmu.mak.

Signed-off-by: Fabien Chouteau 
---
 default-configs/armeb-softmmu.mak |3 +++
 1 file changed, 3 insertions(+)
 create mode 100644 default-configs/armeb-softmmu.mak

diff --git a/default-configs/armeb-softmmu.mak 
b/default-configs/armeb-softmmu.mak
new file mode 100644
index 000..17670c0
--- /dev/null
+++ b/default-configs/armeb-softmmu.mak
@@ -0,0 +1,3 @@
+# Default configuration for armeb-softmmu
+
+include arm-softmmu.mak
-- 
1.7.9.5

[Qemu-devel] [PATCH 4/5] qemu-iotests: add 052 BDRV_O_SNAPSHOT test

2013-03-19 Thread Kevin Wolf

From: Stefan Hajnoczi 

Check that writes to an image opened with BDRV_O_SNAPSHOT do not modify
the underlying image file.

Signed-off-by: Stefan Hajnoczi 
Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/052 | 61 ++
 tests/qemu-iotests/052.out | 13 ++
 tests/qemu-iotests/group   |  1 +
 3 files changed, 75 insertions(+)
 create mode 100755 tests/qemu-iotests/052
 create mode 100644 tests/qemu-iotests/052.out

diff --git a/tests/qemu-iotests/052 b/tests/qemu-iotests/052
new file mode 100755
index 000..14a5126
--- /dev/null
+++ b/tests/qemu-iotests/052
@@ -0,0 +1,61 @@
+#!/bin/bash
+#
+# Test bdrv_read/bdrv_write using BDRV_O_SNAPSHOT
+#
+# Copyright (C) 2013 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+# creator
+owner=stefa...@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1   # failure is the default!
+
+_cleanup()
+{
+   _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt generic
+_supported_proto generic
+_supported_os Linux
+
+
+size=128M
+_make_test_img $size
+
+echo
+echo "== reading whole image =="
+$QEMU_IO -s -c "read 0 $size" $TEST_IMG | _filter_qemu_io
+
+echo
+echo "== writing whole image does not modify image =="
+$QEMU_IO -s -c "write -P 0xa 0 $size" $TEST_IMG | _filter_qemu_io
+$QEMU_IO -c "read -P 0 0 $size" $TEST_IMG | _filter_qemu_io
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/052.out b/tests/qemu-iotests/052.out
new file mode 100644
index 000..8617aa2
--- /dev/null
+++ b/tests/qemu-iotests/052.out
@@ -0,0 +1,13 @@
+QA output created by 052
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 
+
+== reading whole image ==
+read 134217728/134217728 bytes at offset 0
+128 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+== writing whole image does not modify image ==
+wrote 134217728/134217728 bytes at offset 0
+128 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 134217728/134217728 bytes at offset 0
+128 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 1d7e4f3..73c5966 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -57,3 +57,4 @@
 048 img auto quick
 049 rw auto
 050 rw auto backing quick
+052 rw auto backing
-- 
1.8.1.4

[Qemu-devel] [PATCH 5/5] virtio-blk: Do not segfault fault if failed to initialize dataplane

2013-03-19 Thread Kevin Wolf

From: Dunrong Huang 

$ ~/usr/bin/qemu-system-x86_64 -enable-kvm -m 1024 -drive 
if=none,id=drive0,cache=none,aio=native,format=raw,file=/root/Image/centos-6.4.raw
 -device virtio-blk-pci,drive=drive0,scsi=off,x-data-plane=on,config-wce=on # 
make dataplane fail to initialize
qemu-system-x86_64: -device 
virtio-blk-pci,drive=drive0,scsi=off,x-data-plane=on,config-wce=on: device is 
incompatible with x-data-plane, use config-wce=off
*** glibc detected *** /root/usr/bin/qemu-system-x86_64: free(): invalid 
pointer: 0x7f001fef12f8 ***
=== Backtrace: =
/lib64/libc.so.6(+0x7d776)[0x7f00153a5776]
/root/usr/bin/qemu-system-x86_64(+0x2c34ec)[0x7f001cf5b4ec]
/root/usr/bin/qemu-system-x86_64(+0x342f9a)[0x7f001cfdaf9a]
/root/usr/bin/qemu-system-x86_64(+0x33694e)[0x7f001cfce94e]


 (gdb) bt
 #0  0x7f3bf3a12015 in raise () from /lib64/libc.so.6
 #1  0x7f3bf3a1348b in abort () from /lib64/libc.so.6
 #2  0x7f3bf3a51a4e in __libc_message () from /lib64/libc.so.6
 #3  0x7f3bf3a57776 in malloc_printerr () from /lib64/libc.so.6
 #4  0x7f3bfb60d4ec in free_and_trace (mem=0x7f3bfe0129f8) at vl.c:2786
 #5  0x7f3bfb68cf9a in virtio_cleanup (vdev=0x7f3bfe0129f8) at 
/root/Develop/QEMU/qemu/hw/virtio.c:900
 #6  0x7f3bfb68094e in virtio_blk_device_init (vdev=0x7f3bfe0129f8) at 
/root/Develop/QEMU/qemu/hw/virtio-blk.c:666
 #7  0x7f3bfb68dadf in virtio_device_init (qdev=0x7f3bfe0129f8) at 
/root/Develop/QEMU/qemu/hw/virtio.c:1092
 #8  0x7f3bfb50da46 in device_realize (dev=0x7f3bfe0129f8, 
err=0x7fff479c9258) at hw/qdev.c:176
.

In virtio_blk_device_init(), the memory which vdev point to is a static
member of "struct VirtIOBlkPCI", not heap memory, and it does not
get freed. So we shoule use virtio_common_cleanup() to clean this VirtIODevice
rather than virtio_cleanup(), which attempts to free the vdev.

This error was introduced by commit 05ff686536f408ba6e8426b1b54d25bd3379fda2
recently.

Signed-off-by: Dunrong Huang 
Signed-off-by: Kevin Wolf 
---
 hw/virtio-blk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index e6f8875..f2143fd 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -663,7 +663,7 @@ static int virtio_blk_device_init(VirtIODevice *vdev)
 s->vq = virtio_add_queue(vdev, 128, virtio_blk_handle_output);
 #ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
 if (!virtio_blk_data_plane_create(vdev, blk, &s->dataplane)) {
-virtio_cleanup(vdev);
+virtio_common_cleanup(vdev);
 return -1;
 }
 #endif
-- 
1.8.1.4

[Qemu-devel] [PULL 0/5] Block layer fixes

2013-03-19 Thread Kevin Wolf

Anthony, this includes the fixes for recently introduced regressions in the
block layer: With qcow2 and migration a NULL pointer access, and with
virtio-blk dataplane a double free in error cases.


The following changes since commit 2d62a95766025e6a0a333528278936e2cc8bf978:

  virtio-blk: cleanup: remove qdev field. (2013-03-18 13:08:41 -0500)

are available in the git repository at:

  git://repo.or.cz/qemu/kevin.git for-anthony

for you to fetch changes up to a8e5cc0c076a6e3a62f0e9aad88b007dccf3dd17:

  virtio-blk: Do not segfault fault if failed to initialize dataplane 
(2013-03-19 11:48:56 +0100)


Dunrong Huang (1):
  virtio-blk: Do not segfault fault if failed to initialize dataplane

Kevin Wolf (1):
  qcow2: Fix segfault in qcow2_invalidate_cache

Liu Yuan (1):
  sheepdog: show error message for halt status

Stefan Hajnoczi (2):
  block: fix BDRV_O_SNAPSHOT protocol detection
  qemu-iotests: add 052 BDRV_O_SNAPSHOT test

 block.c|  6 +
 block/qcow2.c  | 12 +++--
 block/qcow2.h  |  3 +++
 block/sheepdog.c   |  2 ++
 hw/virtio-blk.c|  2 +-
 tests/qemu-iotests/052 | 61 ++
 tests/qemu-iotests/052.out | 13 ++
 tests/qemu-iotests/group   |  1 +
 8 files changed, 92 insertions(+), 8 deletions(-)
 create mode 100755 tests/qemu-iotests/052
 create mode 100644 tests/qemu-iotests/052.out

Re: [Qemu-devel] [Xen-devel] [PATCHv3] QEMU(upstream): Disable xen's use of O_DIRECT by default as it results in crashes.

2013-03-19 Thread George Dunlap


On 03/19/2013 11:14 AM, Paolo Bonzini wrote:

Il 19/03/2013 11:51, George Dunlap ha scritto:

On 03/19/2013 10:43 AM, Paolo Bonzini wrote:

Even for successful migration, it would also be bad for downtime (QEMU
isn't exactly lightning-fast to start).  And even if failure weren't
catastrophic, it would be a pity to transfer a few gigs of memory and
then find out that QEMU isn't present in the destination. :)


Well, if qemu isn't present at the destination, that's definitely user
error. :-)  In any case, I know that he migrate can resume if it
fails, so I suspect that the qemu is just paused on the sending side
until the migration is known to complete.  As long as the last write
was flushed to the NFS server before the receiver opens the file, we
should be safe.


Note that the close really must happen before the next open.  Otherwise
the file metadata might not be up-to-date on the destination, too.


By "file metadata" I assume you mean "metadata about the virtual disk
within the file", not "metadata about the file within the filesystem",
right?  That's good to know, I'll keep that in mind.


Actually especially the former (I'm calling them respectively "image
metadata" and "file metadata").  File metadata could also be a problem,
but I think it might just work except in cases like on-line resizing
during migration.


Even if it's true that at the moment qemu doesn't write the file
metadata until it closes the file, that just means we'd have to add a
hook to the callback to save qemu state, to sync the metadata at that
point, right?


Unfortunately no.  The problem is in the loading side's kernel, on which
you do not have any control.  If the loading side doesn't use O_DIRECT,
any attempt to invalidate the metadata in userspace or on the source is
futile, because there is no way to invalidate the page cache's copy of
that metadata.


Yes, I meant "the only further thing we would have to do". The entire 
discussion relies on the assumption that the receiving side doesn't open 
the file until after the sending side has issued the qemu state save. 
So as long as both the virtual blocks and the image metadata have been 
synced to the NFS server at that point, we should be all right.  If at 
the moment the image metadata is *not* synced at that point, it seems 
like we should be able to make it so relatively easily.


 -George

[Qemu-devel] [PATCH V2 0/3] ARM: Misc ARM big-endian bug fixes

2013-03-19 Thread Fabien Chouteau

v2:
  - I drop the the CPSR.E/SCTLR.E/SCTLR.IE patch. The issue is still too
  complex for me to submit a good patch.

  - Fix the GDB byte order for 64bits registers

Fabien Chouteau (3):
  QAPI: Add ARMEB target-type
  Add default config for armeb-softmmu
  target-arm: Fix VFP register byte order in GDB remote

 default-configs/armeb-softmmu.mak |3 +++
 qapi-schema.json  |9 +
 target-arm/helper.c   |   23 ---
 3 files changed, 24 insertions(+), 11 deletions(-)
 create mode 100644 default-configs/armeb-softmmu.mak

-- 
1.7.9.5

Re: [Qemu-devel] [PATCH V2 0/3] ARM: Misc ARM big-endian bug fixes

2013-03-19 Thread Peter Maydell

On 19 March 2013 11:21, Fabien Chouteau  wrote:
> v2:
>   - I drop the the CPSR.E/SCTLR.E/SCTLR.IE patch. The issue is still too
>   complex for me to submit a good patch.
>
>   - Fix the GDB byte order for 64bits registers
>
> Fabien Chouteau (3):
>   QAPI: Add ARMEB target-type
>   Add default config for armeb-softmmu

Hi Fabien. I'm afraid I don't want to take these two unless they
come attached to an actual cpu and board model that use them.
Otherwise they're really just half-a-feature, I think.

(As an aside, perhaps we should generate the list in qapi-schema.json
automatically rather than hardcoding every target name? dunno)

>   target-arm: Fix VFP register byte order in GDB remote

This one could go in (assuming it passes code review, I
haven't looked too closely yet), because it's just fixing
generic bigendian support.

thanks
-- PMM

[Qemu-devel] [PATCH] serial: Fix debug format strings

2013-03-19 Thread Kevin Wolf

This fixes the build of hw/serial.c with DEBUG_SERIAL enabled.

Signed-off-by: Kevin Wolf 
---
 hw/serial.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/serial.c b/hw/serial.c
index 48a5eb6..0ccc499 100644
--- a/hw/serial.c
+++ b/hw/serial.c
@@ -306,7 +306,7 @@ static void serial_ioport_write(void *opaque, hwaddr addr, 
uint64_t val,
 SerialState *s = opaque;
 
 addr &= 7;
-DPRINTF("write addr=0x%02x val=0x%02x\n", addr, val);
+DPRINTF("write addr=0x%" HWADDR_PRIx " val=0x%" PRIx64 "\n", addr, val);
 switch(addr) {
 default:
 case 0:
@@ -527,7 +527,7 @@ static uint64_t serial_ioport_read(void *opaque, hwaddr 
addr, unsigned size)
 ret = s->scr;
 break;
 }
-DPRINTF("read addr=0x%02x val=0x%02x\n", addr, ret);
+DPRINTF("read addr=0x%" HWADDR_PRIx " val=0x%02x\n", addr, ret);
 return ret;
 }
 
-- 
1.8.1.4

Re: [Qemu-devel] [PATCH 3/5] pseries: Fixes and enhancements to L1 cache properties

2013-03-19 Thread Alexander Graf


On 19.03.2013, at 12:16, Andreas Färber wrote:

> Am 19.03.2013 12:09, schrieb Alexander Graf:
>> 
>> On 19.03.2013, at 12:06, Andreas Färber wrote:
>> 
>>> Am 18.03.2013 12:05, schrieb Alexander Graf:
 
 On 18.03.2013, at 11:54, Andreas Färber wrote:
 
> Am 15.03.2013 13:27, schrieb Alexander Graf:
>> 
>> On 14.03.2013, at 02:53, David Gibson wrote:
>> 
>>> PAPR requires that the device tree's CPU nodes have several properties
>>> with information about the L1 cache.  We already create two of these
>>> properties, but with incorrect names - "[id]cache-block-size" instead
>>> of "[id]-cache-block-size" (note the extra hyphen).
>>> 
>>> We were also missing some of the required cache properties.  This
>>> patch adds the [id]-cache-line-size properties (which have the same
>>> values as the block size properties in all current cases).  We also
>>> add the [id]-cache-size properties.
>>> 
>>> Adding the cache sizes requires some extra infrastructure in the
>>> general target-ppc code to (optionally) set the cache sizes for
>>> various CPUs.  The CPU family descriptions in translate_init.c can set
>>> these sizes - this patch adds correct information for POWER7, I'm
>>> leaving other CPU types to people who have a physical example to
>>> verify against.  In addition, for -cpu host we take the values
>>> advertised by the host (if available) and use those to override the
>>> information based on PVR.
>>> 
>>> Signed-off-by: David Gibson 
>>> ---
>>> hw/ppc/spapr.c  |   20 ++--
>>> target-ppc/cpu.h|1 +
>>> target-ppc/kvm.c|   39 
>>> +++
>>> target-ppc/translate_init.c |4 
>>> 4 files changed, 62 insertions(+), 2 deletions(-)
>>> 
>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>> index 9a13697..7293082 100644
>>> --- a/hw/ppc/spapr.c
>>> +++ b/hw/ppc/spapr.c
>>> @@ -333,10 +333,26 @@ static void *spapr_create_fdt_skel(const char 
>>> *cpu_model,
>>>  _FDT((fdt_property_string(fdt, "device_type", "cpu")));
>>> 
>>>  _FDT((fdt_property_cell(fdt, "cpu-version", env->spr[SPR_PVR])));
>>> -_FDT((fdt_property_cell(fdt, "dcache-block-size",
>>> +_FDT((fdt_property_cell(fdt, "d-cache-block-size",
>>>  env->dcache_line_size)));
>>> -_FDT((fdt_property_cell(fdt, "icache-block-size",
>>> +_FDT((fdt_property_cell(fdt, "d-cache-line-size",
>>> +env->dcache_line_size)));
>>> +_FDT((fdt_property_cell(fdt, "i-cache-block-size",
>>> +env->icache_line_size)));
>>> +_FDT((fdt_property_cell(fdt, "i-cache-line-size",
>>>  env->icache_line_size)));
>>> +
>>> +if (env->l1_dcache_size) {
>>> +_FDT((fdt_property_cell(fdt, "d-cache-size", 
>>> env->l1_dcache_size)));
>>> +} else {
>>> +fprintf(stderr, "Warning: Unknown L1 dcache size for 
>>> cpu\n");
>>> +}
>>> +if (env->l1_icache_size) {
>>> +_FDT((fdt_property_cell(fdt, "i-cache-size", 
>>> env->l1_icache_size)));
>>> +} else {
>>> +fprintf(stderr, "Warning: Unknown L1 icache size for 
>>> cpu\n");
>>> +}
>> 
>> The L1 sizes should come from the class, not env, right? Andreas, any 
>> ideas on this?
> 
> Generally speaking,
> 
> CPUPPCState: Only if this is used for TCG with an offset from AREG0 (or
> for legacy grouping reasons).
> 
> PowerPCCPU: If you ever intend to let the user override this value
> (per-instance) from the command line.
> 
> PowerPCCPUClass: If the value is always constant at runtime.
> 
> I can't tell from a brief look at this patch which may be the case here.
 
 Imagine the following:
 
 PPC
   `- POWER7
   |- POWER7_v10
   `- POWER7_v20
 
 Version 1.0 has L1 cache size of x MB. Version 2.0 has L1 cache size of y 
 MB. The user should never override the value (except with -cpu host). It 
 is constant during the lifetime of a VM.
>>> 
>>> Alex, you asked for an answer here, but I don't spot a question. :-)
>>> 
>>> I'm guessing requirements like these mean that we need to introduce a
>>> new POWERPC_DEF_SVR_CLS_INST(name, pvr, svr, type, snippet1, snippet2)
>>> macro to put additional code into class_init and instance_init
>>> respectively and let _DEF() and _DEF_SVR() pass nothing there...
>>> Possibly add specific macros wrapping the above to keep the model list
>>> readable.
>>> 
>>> Either way there's a trade-off between keeping easy macros hiding QOM
>>> boilerplate code vs. having high flexibility of what co

Re: [Qemu-devel] [RFC PATCH 0/2] port network layer onto glib

2013-03-19 Thread Paolo Bonzini

Il 19/03/2013 11:38, Peter Maydell ha scritto:
> On 19 March 2013 10:34, Paolo Bonzini  wrote:
>> Il 19/03/2013 11:12, Peter Maydell ha scritto:
>>> Threads are supported by the language runtime provided on all
>>> the systems we support, which is why they are reasonably usable.
>>> When you've persuaded glibc, MacOSX libc and Windows to implement
>>> coroutines please come back and let me know :-)
>>
>> Windows supports them (it calls them fibers) and glibc does on many
>> architectures (all but ARM, basically).
> 
> If you mean ucontext, I'm not sure I'd call that coroutine
> support at the library level (and we did implement it on
> ARM glibc).

Yes, I mean ucontext, more precisely makecontext/setcontext.  Portably
creating a new stack is really the crux of coroutine support.

Paolo

Re: [Qemu-devel] [Xen-devel] [PATCHv3] QEMU(upstream): Disable xen's use of O_DIRECT by default as it results in crashes.

2013-03-19 Thread Alex Bligh




--On 19 March 2013 12:14:36 +0100 Paolo Bonzini  wrote:


Unfortunately no.  The problem is in the loading side's kernel, on which
you do not have any control.  If the loading side doesn't use O_DIRECT,
any attempt to invalidate the metadata in userspace or on the source is
futile, because there is no way to invalidate the page cache's copy of
that metadata.


If you closed the file and reopened it, then on NFS this would have this
effect I believe, as it needs to check whether other clients have modified
the file to give open/close coherency.

However, we are rather assuming the file isn't even open at the point it
is closed by the sender, which is what George is investigating.

If this isn't true, we have a problem anyway with (e.g.) emulated
devices which don't use O_DIRECT anyway. And I had thought (I may be
wrong) using O_DIRECT does not guarantee no read caching with NFS;
O_DIRECT merely guarantees the page cache is not used under Linux
and isn't defined under POSIX:
 http://www.spinics.net/lists/linux-nfs/msg17472.html

If it were just a write caching issue, we could use O_DSYNC instead of
O_DIRECT, which would at least ensure the copy from userspace.

--
Alex Bligh

Re: [Qemu-devel] [Xen-devel] [PATCHv3] QEMU(upstream): Disable xen's use of O_DIRECT by default as it results in crashes.

2013-03-19 Thread Paolo Bonzini

Il 19/03/2013 12:44, Alex Bligh ha scritto:
> If this isn't true, we have a problem anyway with (e.g.) emulated
> devices which don't use O_DIRECT anyway.

Yes, though that would be a libxl bug, not a QEMU bug.

> And I had thought (I may be
> wrong) using O_DIRECT does not guarantee no read caching with NFS;
> O_DIRECT merely guarantees the page cache is not used under Linux
> and isn't defined under POSIX:
>  http://www.spinics.net/lists/linux-nfs/msg17472.html

Read caching on the server is fine, because it is the same server that
was used for the writes.

O_DIRECT bypasses the client's page cache, and that's enough for our
purposes.

> If it were just a write caching issue, we could use O_DSYNC instead of
> O_DIRECT, which would at least ensure the copy from userspace.

O_DSYNC is not necessary.  We do issue the appropriate fsync/fdatasync.
 What O_DSYNC does is add an implicit fdatasync after every write,
basically.

Paolo

[Qemu-devel] [PATCH] PPC/GDB: handle read and write of fpscr

2013-03-19 Thread Fabien Chouteau

Although the support of this register may be uncomplete, there are no
reason to prevent the debugger from reading or writing it.

Signed-off-by: Fabien Chouteau 
---
 gdbstub.c   |3 ++-
 target-ppc/translate_init.c |2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/gdbstub.c b/gdbstub.c
index e414ad9..d23d9c5 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -781,7 +781,8 @@ static int cpu_gdb_write_register(CPUPPCState *env, uint8_t 
*mem_buf, int n)
 /* fpscr */
 if (gdb_has_xml)
 return 0;
-return 4;
+env->fpscr = ldtul_p(mem_buf);
+return sizeof(target_ulong);
 }
 }
 return 0;
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 09ad4ba..a5d2cc3 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -7693,7 +7693,7 @@ static int gdb_set_float_reg(CPUPPCState *env, uint8_t 
*mem_buf, int n)
 return 8;
 }
 if (n == 32) {
-/* FPSCR not implemented  */
+env->fpscr = ldl_p(mem_buf);
 return 4;
 }
 return 0;
-- 
1.7.9.5

Re: [Qemu-devel] [PATCH] PPC/GDB: handle read and write of fpscr

2013-03-19 Thread Alexander Graf


On 19.03.2013, at 13:03, Fabien Chouteau wrote:

> Although the support of this register may be uncomplete, there are no
> reason to prevent the debugger from reading or writing it.
> 
> Signed-off-by: Fabien Chouteau 
> ---
> gdbstub.c   |3 ++-
> target-ppc/translate_init.c |2 +-
> 2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/gdbstub.c b/gdbstub.c
> index e414ad9..d23d9c5 100644
> --- a/gdbstub.c
> +++ b/gdbstub.c
> @@ -781,7 +781,8 @@ static int cpu_gdb_write_register(CPUPPCState *env, 
> uint8_t *mem_buf, int n)
> /* fpscr */
> if (gdb_has_xml)
> return 0;
> -return 4;
> +env->fpscr = ldtul_p(mem_buf);

Check out helper_store_fpscr() in target-ppc/fpu_helper.c. Storing fpscr has a 
bunch of side effects that won't happen when you just set the env variable. I'd 
prefer not to enable users to set fpscr when we can't guarantee that the 
updated values are actually used.

Can't you just call the helper function here?


Alex

> +return sizeof(target_ulong);
> }
> }
> return 0;
> diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
> index 09ad4ba..a5d2cc3 100644
> --- a/target-ppc/translate_init.c
> +++ b/target-ppc/translate_init.c
> @@ -7693,7 +7693,7 @@ static int gdb_set_float_reg(CPUPPCState *env, uint8_t 
> *mem_buf, int n)
> return 8;
> }
> if (n == 32) {
> -/* FPSCR not implemented  */
> +env->fpscr = ldl_p(mem_buf);
> return 4;
> }
> return 0;
> -- 
> 1.7.9.5
>

Re: [Qemu-devel] [PATCH] PPC/GDB: handle read and write of fpscr

2013-03-19 Thread Peter Maydell

On 19 March 2013 12:10, Alexander Graf  wrote:
> Check out helper_store_fpscr() in target-ppc/fpu_helper.c. Storing
> fpscr has a bunch of side effects that won't happen when you just
> set the env variable.

Speaking of which, am I missing something, or does the code
in machine.c do nothing to ensure that we call set_float_rounding_mode
to set the fpu/ emulation code up with the right rounding mode
when loading CPU state on incoming migration?

thanks
-- PMM

Re: [Qemu-devel] [PATCH] PPC/GDB: handle read and write of fpscr

2013-03-19 Thread Alexander Graf


On 19.03.2013, at 13:13, Peter Maydell wrote:

> On 19 March 2013 12:10, Alexander Graf  wrote:
>> Check out helper_store_fpscr() in target-ppc/fpu_helper.c. Storing
>> fpscr has a bunch of side effects that won't happen when you just
>> set the env variable.
> 
> Speaking of which, am I missing something, or does the code
> in machine.c do nothing to ensure that we call set_float_rounding_mode
> to set the fpu/ emulation code up with the right rounding mode
> when loading CPU state on incoming migration?

I very much doubt anyone tests migration with TCG :)


Alex

Re: [Qemu-devel] [PATCH] Introduce query-cpu-max QMP command and cpu_max HMP counterpart

2013-03-19 Thread Markus Armbruster

Please cc: Luiz and me on QMP work in the future.

Michal Novotny  writes:

> This is the patch to introduce the query-cpu-max QMP command to get
> the maximum number of CPUs supported by the currently running emulator
> instance. This may differ machine from machine as defined by -machine
> settings and max_cpus member of QEMUMachine structure.

Humor me: don't start commit message bodies with "This patch" or
variations thereof, and don't repeat the subject.  Suggest:

QMP command query-cpu-max returns the maximum number of CPUs supported
by the currently running emulator instance, as defined in its
QEMUMachine struct.

> It's been tested both using QMP/qmp utility and telnet session on
> the QEMU session.

What's a QMP/qmp utility?  Let's drop this sentence.  In the future,
feel free to put testing info below the "---" line.

> The HMP counterpart called cpu_max has been introduced by this patch
> too.

Grammar nit: s/has been/is/.  Even better, avoid passive voice.

Hmm, I just rewrote most of your commit message, so why not rewrite all
of it:

New QMP command query-cpu-max and HMP command cpu_max

These commands return the maximum number of CPUs supported by the
currently running emulator instance, as defined in its QEMUMachine
struct.

Perhaps Luiz can fix up the commit message commit, if you don't mind.

Patch looks good.

Should query commands for machine properties multiply, we should
consider creating a single command returning all of them.  I'm not
asking you to do that now.

Reviewed-by: Markus Armbruster

[Qemu-devel] [PATCH] char: Fix return type of qemu_chr_fe_add_watch()

2013-03-19 Thread Kevin Wolf

qemu_chr_fe_add_watch() can return negative errors, therefore it must
not have an unsigned return type. For consistency with other
qemu_chr_fe_* functions, this uses a standard C int instead of glib
types.

In situations where qemu_chr_fe_add_watch() is falsely assumed to have
succeeded, the serial ports would go into a state where it never becomes
ready for transmitting more data; this is fixed by this patch.

Signed-off-by: Kevin Wolf 
---
 include/char/char.h | 4 ++--
 qemu-char.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/char/char.h b/include/char/char.h
index d6a0351..0326b2a 100644
--- a/include/char/char.h
+++ b/include/char/char.h
@@ -153,8 +153,8 @@ void qemu_chr_fe_close(struct CharDriverState *chr);
 void qemu_chr_fe_printf(CharDriverState *s, const char *fmt, ...)
 GCC_FMT_ATTR(2, 3);
 
-guint qemu_chr_fe_add_watch(CharDriverState *s, GIOCondition cond,
-GIOFunc func, void *user_data);
+int qemu_chr_fe_add_watch(CharDriverState *s, GIOCondition cond,
+  GIOFunc func, void *user_data);
 
 /**
  * @qemu_chr_fe_write:
diff --git a/qemu-char.c b/qemu-char.c
index e633797..4e011df 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -3397,8 +3397,8 @@ void qemu_chr_fe_close(struct CharDriverState *chr)
 }
 }
 
-guint qemu_chr_fe_add_watch(CharDriverState *s, GIOCondition cond,
-GIOFunc func, void *user_data)
+int qemu_chr_fe_add_watch(CharDriverState *s, GIOCondition cond,
+  GIOFunc func, void *user_data)
 {
 GSource *src;
 guint tag;
-- 
1.8.1.4

Re: [Qemu-devel] [PATCH 8/8] usb-redir: Add flow control support

2013-03-19 Thread Gerd Hoffmann

On 03/14/13 17:36, Hans de Goede wrote:
> +if (!dev->watch) {
> +dev->watch = qemu_chr_fe_add_watch(dev->cs, G_IO_OUT,
> +   usbredir_write_unblocked, 
> dev);
> +}
> +if (r < 0)
> +r = 0;
> +}
> +return r;

=== checkpatch complains ===
WARNING: braces {} are necessary for all arms of this statement
#48: FILE: hw/usb/redirect.c:290:
+if (r < 0)
[...]

total: 0 errors, 1 warnings, 65 lines checked

cheers,
  Gerd

Re: [Qemu-devel] [PATCH] serial: Fix debug format strings

2013-03-19 Thread Markus Armbruster

Cc'ing qemu-trivial

Kevin Wolf  writes:

> This fixes the build of hw/serial.c with DEBUG_SERIAL enabled.
>
> Signed-off-by: Kevin Wolf 
> ---
>  hw/serial.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/hw/serial.c b/hw/serial.c
> index 48a5eb6..0ccc499 100644
> --- a/hw/serial.c
> +++ b/hw/serial.c
> @@ -306,7 +306,7 @@ static void serial_ioport_write(void *opaque, hwaddr 
> addr, uint64_t val,
>  SerialState *s = opaque;
>  
>  addr &= 7;
> -DPRINTF("write addr=0x%02x val=0x%02x\n", addr, val);
> +DPRINTF("write addr=0x%" HWADDR_PRIx " val=0x%" PRIx64 "\n", addr, val);
>  switch(addr) {
>  default:
>  case 0:
> @@ -527,7 +527,7 @@ static uint64_t serial_ioport_read(void *opaque, hwaddr 
> addr, unsigned size)
>  ret = s->scr;
>  break;
>  }
> -DPRINTF("read addr=0x%02x val=0x%02x\n", addr, ret);
> +DPRINTF("read addr=0x%" HWADDR_PRIx " val=0x%02x\n", addr, ret);
>  return ret;
>  }

Re: [Qemu-devel] [RFC PATCH RDMA support v4: 09/10] check for QMP string and bypass nonblock() calls

2013-03-19 Thread Michael R. Hines


On 03/19/2013 05:23 AM, Paolo Bonzini wrote:


Yes---this is a separate patch.  Apologies if you had the if(fd != -1)
before. :)  In fact, both the if(fd != -1) and the
if(!migrate_use_rdma(f)) are bad, but I prefer to eliminate as many uses
as possible of migrate_use_rdma.

I agree. In my current patch I've eliminated all of them.


Does librdmacm support non-blocking operation, similar to select() or
poll()?  Perhaps we can add support for that later.


Yes, it does, actually. The library provides what is called an "event 
channel".

(This term is overloaded by other technologies, but that's OK).

An event channel is a file descriptor provided by (I believe) the rdma_cm
kernel module driver.

When you poll on this file descriptor, it can tell you all sorts of things
just like other files or sockets like when data is ready or when
events of interest have completed (like the completion queue has elements).

In my current patch, I'm using this during 
"rdma_accept_incoming_connection()",
but I'm not currently using it for the rest of the coroutine on the 
receiver side.

Re: [Qemu-devel] [PATCH 1/3] Revert "block: complete all IOs before .bdrv_truncate"

2013-03-19 Thread Kevin Wolf

Am 11.03.2013 um 11:03 hat Peter Lieven geschrieben:
> brdv_truncate() is also called from readv/writev commands on self-
> growing file based storage. this will result in requests waiting
> for theirselves to complete.
> 
> This reverts commit 9a665b2b8640e464f0a778216fc2dca8d02acf33.

Thanks, applied to the block branch.

Kevin

Re: [Qemu-devel] [PATCH 2/3] block: complete all IOs before resizing a device

2013-03-19 Thread Kevin Wolf

Am 11.03.2013 um 11:04 hat Peter Lieven geschrieben:
> this patch ensures that all pending IOs are completed
> before a device is resized. this is especially important
> if a device is shrinked as it the bdrv_check_request()
> result is invalidated.
> 
> Signed-off-by: Peter Lieven 

Thanks, applied to the block branch.

Kevin

Re: [Qemu-devel] [RFC PATCH RDMA support v4: 08/10] introduce QEMUFileRDMA

2013-03-19 Thread Michael R. Hines


On 03/19/2013 05:18 AM, Paolo Bonzini wrote:

Il 18/03/2013 21:33, Michael R. Hines ha scritto:

+int qemu_drain(QEMUFile *f)
+{
+return f->ops->drain ? f->ops->drain(f->opaque) : 0;
+}
Hmm, this is very similar to qemu_fflush, but not quite. :/

Why exactly is this needed?

Good idea - I'll replace drain with flush once I added
the  "qemu_file_ops_are(const QEMUFile *, const QEMUFileOps *) "
that you recommended..

If I understand correctly, the problem is that save_rdma_page is
asynchronous and you have to wait for pending operations to do the
put_buffer protocol correctly.

Would it work to just do the "drain" in the put_buffer operation, if and
only if it was preceded by a save_rdma_page operation?


Yes, the drain needs to happen in a few places already:

1. During save_rdma_page (if the current "chunk" is full of pages)
2. During the end of each iteration (now using qemu_fflush in my current 
patch)

3. And also during qemu_savem_state_complete(), also using qemu_fflush.

   /** Flushes QEMUFile buffer
*
*/
@@ -723,6 +867,8 @@ int qemu_get_byte(QEMUFile *f)
   int64_t qemu_ftell(QEMUFile *f)
   {
   qemu_fflush(f);
+if(migrate_use_rdma(f))
+return delta_norm_mig_bytes_transferred();

Not needed, and another undesirable dependency (savevm.c ->
arch_init.c).  Just update f->pos in save_rdma_page.

f->pos isn't good enough because save_rdma_page does not
go through QEMUFile directly - only non-live state goes
through QEMUFile ... pc.ram uses direct RDMA writes.

As a result, the position pointer does not get updated
and the accounting is missed

Yes, I am suggesting to modify f->pos in save_rdma_page instead.

Paolo



Would that not confuse the other QEMUFile users?
If I change that pointer (without actually putting bytes
in into QEMUFile), won't the f->pos pointer be
incorrectly updated?

Re: [Qemu-devel] [RFC PATCH RDMA support v4: 03/10] more verbose documentation of the RDMA transport

2013-03-19 Thread Michael R. Hines


On 03/19/2013 04:19 AM, Michael S. Tsirkin wrote:
We have ways (e.g. cgroups) to limit what a VM can do. If it tries to 
use more RAM than we let it, it will swap, still making progress, just 
slower. OTOH it looks like pinning more memory than allowed by the 
cgroups limit will just get stuck forever (probably a bug, should fail 
instead? but does not help your protocol which needs it all pinned at 
all times). There are also per-task resource limits. If you exceed 
this registration will fail, so not good either. I just don't see why 
do registration by chunks on source but not on destination. 


Would this a hard requirement for an initial version?

I do understand how and why this makes things more flexible during
the long run, but it does have the potential to slow down the RDMA
protocol significantly.

The way its implemented now, the sender can dump bytes
onto the wire at full speed (up to 30gbps last time I measured it),
but if we insert a round-trip message + registration on the
destination side before we're allowed to push more bytes out,
we'll have to introduce more complex flow control only for
the benefit of making the destination side have the flexibility
that you described.

Re: [Qemu-devel] [RFC PATCH RDMA support v4: 09/10] check for QMP string and bypass nonblock() calls

2013-03-19 Thread Paolo Bonzini

Il 19/03/2013 14:08, Michael R. Hines ha scritto:
> On 03/19/2013 05:23 AM, Paolo Bonzini wrote:
>>
>> Yes---this is a separate patch.  Apologies if you had the if(fd != -1)
>> before. :)  In fact, both the if(fd != -1) and the
>> if(!migrate_use_rdma(f)) are bad, but I prefer to eliminate as many uses
>> as possible of migrate_use_rdma.
> I agree. In my current patch I've eliminated all of them.

Very nice.  It remains to be seen how many are replaced by checks on the
QEMUFileOps :) but it cannot be worse!

>> Does librdmacm support non-blocking operation, similar to select() or
>> poll()?  Perhaps we can add support for that later.
> 
> Yes, it does, actually. The library provides what is called an "event
> channel".
> (This term is overloaded by other technologies, but that's OK).
> 
> An event channel is a file descriptor provided by (I believe) the rdma_cm
> kernel module driver.
> 
> When you poll on this file descriptor, it can tell you all sorts of things
> just like other files or sockets like when data is ready or when
> events of interest have completed (like the completion queue has elements).
> 
> In my current patch, I'm using this during
> "rdma_accept_incoming_connection()",
> but I'm not currently using it for the rest of the coroutine on the
> receiver side.

Ok, this can be added later.

Paolo

Re: [Qemu-devel] [PATCH] Introduce query-cpu-max QMP command and cpu_max HMP counterpart

2013-03-19 Thread Michal Novotny


On 03/19/2013 01:28 PM, Markus Armbruster wrote:
> Please cc: Luiz and me on QMP work in the future.
>
> Michal Novotny  writes:
>
>> This is the patch to introduce the query-cpu-max QMP command to get
>> the maximum number of CPUs supported by the currently running emulator
>> instance. This may differ machine from machine as defined by -machine
>> settings and max_cpus member of QEMUMachine structure.
> Humor me: don't start commit message bodies with "This patch" or
> variations thereof, and don't repeat the subject.  Suggest:
>
> QMP command query-cpu-max returns the maximum number of CPUs supported
> by the currently running emulator instance, as defined in its
> QEMUMachine struct.
>
>> It's been tested both using QMP/qmp utility and telnet session on
>> the QEMU session.
> What's a QMP/qmp utility?  Let's drop this sentence.  In the future,
> feel free to put testing info below the "---" line.


I meant the utility located in /QMP/qmp which is
essentially the python script for QMP testing.


>
>> The HMP counterpart called cpu_max has been introduced by this patch
>> too.
> Grammar nit: s/has been/is/.  Even better, avoid passive voice.
>
> Hmm, I just rewrote most of your commit message, so why not rewrite all
> of it:
>
> New QMP command query-cpu-max and HMP command cpu_max
>
> These commands return the maximum number of CPUs supported by the
> currently running emulator instance, as defined in its QEMUMachine
> struct.
>
> Perhaps Luiz can fix up the commit message commit, if you don't mind.
>
> Patch looks good.

The commit log change is fine with me. Thanks for your feedback!

Michal

> Should query commands for machine properties multiply, we should
> consider creating a single command returning all of them.  I'm not
> asking you to do that now.
>
> Reviewed-by: Markus Armbruster 

-- 
Michal Novotny , RHCE, Red Hat
Virtualization | libvirt-php bindings | php-virt-control.org

Re: [Qemu-devel] [RFC PATCH RDMA support v4: 08/10] introduce QEMUFileRDMA

2013-03-19 Thread Paolo Bonzini

Il 19/03/2013 14:12, Michael R. Hines ha scritto:
> On 03/19/2013 05:18 AM, Paolo Bonzini wrote:
>> Il 18/03/2013 21:33, Michael R. Hines ha scritto:
 +int qemu_drain(QEMUFile *f)
 +{
 +return f->ops->drain ? f->ops->drain(f->opaque) : 0;
 +}
 Hmm, this is very similar to qemu_fflush, but not quite. :/

 Why exactly is this needed?
>>> Good idea - I'll replace drain with flush once I added
>>> the  "qemu_file_ops_are(const QEMUFile *, const QEMUFileOps *) "
>>> that you recommended..
>> If I understand correctly, the problem is that save_rdma_page is
>> asynchronous and you have to wait for pending operations to do the
>> put_buffer protocol correctly.
>>
>> Would it work to just do the "drain" in the put_buffer operation, if and
>> only if it was preceded by a save_rdma_page operation?
> 
> Yes, the drain needs to happen in a few places already:
> 
> 1. During save_rdma_page (if the current "chunk" is full of pages)

Ok, this is internal to RDMA so no problem.

> 2. During the end of each iteration (now using qemu_fflush in my current
> patch)

Why?

> 3. And also during qemu_savem_state_complete(), also using qemu_fflush.

This would be caught by put_buffer, but (2) would not.

>/** Flushes QEMUFile buffer
> *
> */
> @@ -723,6 +867,8 @@ int qemu_get_byte(QEMUFile *f)
>int64_t qemu_ftell(QEMUFile *f)
>{
>qemu_fflush(f);
> +if(migrate_use_rdma(f))
> +return delta_norm_mig_bytes_transferred();
 Not needed, and another undesirable dependency (savevm.c ->
 arch_init.c).  Just update f->pos in save_rdma_page.
>>> f->pos isn't good enough because save_rdma_page does not
>>> go through QEMUFile directly - only non-live state goes
>>> through QEMUFile ... pc.ram uses direct RDMA writes.
>>>
>>> As a result, the position pointer does not get updated
>>> and the accounting is missed
>> Yes, I am suggesting to modify f->pos in save_rdma_page instead.
>>
>> Paolo
>>
> 
> Would that not confuse the other QEMUFile users?
> If I change that pointer (without actually putting bytes
> in into QEMUFile), won't the f->pos pointer be
> incorrectly updated?

f->pos is never used directly by QEMUFile, it is almost an opaque value.
 It is accumulated on every qemu_fflush (so that it can be passed to the
->put_buffer function), and returned by qemu_ftell; nothing else.

If you make somehow save_rdma_page a new op, returning a value from that
op and adding it to f->pos would be a good way to achieve this.

Paolo

Re: [Qemu-devel] [PATCH] Introduce query-cpu-max QMP command and cpu_max HMP counterpart

2013-03-19 Thread Andreas Färber

Am 19.03.2013 13:28, schrieb Markus Armbruster:
> Please cc: Luiz and me on QMP work in the future.
> 
> Michal Novotny  writes:
> 
>> This is the patch to introduce the query-cpu-max QMP command to get
>> the maximum number of CPUs supported by the currently running emulator
>> instance. This may differ machine from machine as defined by -machine
>> settings and max_cpus member of QEMUMachine structure.
> 
> Humor me: don't start commit message bodies with "This patch" or
> variations thereof, and don't repeat the subject.  Suggest:
> 
> QMP command query-cpu-max returns the maximum number of CPUs supported
> by the currently running emulator instance, as defined in its
> QEMUMachine struct.
> 
>> It's been tested both using QMP/qmp utility and telnet session on
>> the QEMU session.
> 
> What's a QMP/qmp utility?

Our ./QMP/qmp script, I guess.

>  Let's drop this sentence.  In the future,
> feel free to put testing info below the "---" line.
> 
>> The HMP counterpart called cpu_max has been introduced by this patch
>> too.
> 
> Grammar nit: s/has been/is/.  Even better, avoid passive voice.
> 
> Hmm, I just rewrote most of your commit message, so why not rewrite all
> of it:
> 
> New QMP command query-cpu-max and HMP command cpu_max
> 
> These commands return the maximum number of CPUs supported by the
> currently running emulator instance, as defined in its QEMUMachine
> struct.
> 
> Perhaps Luiz can fix up the commit message commit, if you don't mind.
> 
> Patch looks good.
> 
> Should query commands for machine properties multiply, we should
> consider creating a single command returning all of them.  I'm not
> asking you to do that now.
> 
> Reviewed-by: Markus Armbruster 

>From my CPU perspective I am wondering if this info has future as-is?
IIUC this QEMUMachine value differs from the value exposed to
SeaBIOS/KVM already since recently, as it ignores CPU topology.

I'm guessing the libvirt use case is just to detect users specifying too
many -cpu options, so I won't veto this, but want to caution that we may
need to change this as we proceed with CPU remodelling.

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [PATCH] virtio-blk: Do not segfault fault if failed to initialize dataplane

2013-03-19 Thread KONRAD Frédéric


On 19/03/2013 11:33, Kevin Wolf wrote:

Am 19.03.2013 um 09:37 hat KONRAD Frédéric geschrieben:

On 19/03/2013 09:27, Dunrong Huang wrote:

$ ~/usr/bin/qemu-system-x86_64 -enable-kvm -m 1024 -drive 
if=none,id=drive0,cache=none,aio=native,format=raw,file=/root/Image/centos-6.4.raw
 -device virtio-blk-pci,drive=drive0,scsi=off,x-data-plane=on,config-wce=on # 
make dataplane fail to initialize
qemu-system-x86_64: -device 
virtio-blk-pci,drive=drive0,scsi=off,x-data-plane=on,config-wce=on: device is 
incompatible with x-data-plane, use config-wce=off
*** glibc detected *** /root/usr/bin/qemu-system-x86_64: free(): invalid 
pointer: 0x7f001fef12f8 ***
=== Backtrace: =
/lib64/libc.so.6(+0x7d776)[0x7f00153a5776]
/root/usr/bin/qemu-system-x86_64(+0x2c34ec)[0x7f001cf5b4ec]
/root/usr/bin/qemu-system-x86_64(+0x342f9a)[0x7f001cfdaf9a]
/root/usr/bin/qemu-system-x86_64(+0x33694e)[0x7f001cfce94e]


  (gdb) bt
  #0  0x7f3bf3a12015 in raise () from /lib64/libc.so.6
  #1  0x7f3bf3a1348b in abort () from /lib64/libc.so.6
  #2  0x7f3bf3a51a4e in __libc_message () from /lib64/libc.so.6
  #3  0x7f3bf3a57776 in malloc_printerr () from /lib64/libc.so.6
  #4  0x7f3bfb60d4ec in free_and_trace (mem=0x7f3bfe0129f8) at vl.c:2786
  #5  0x7f3bfb68cf9a in virtio_cleanup (vdev=0x7f3bfe0129f8) at 
/root/Develop/QEMU/qemu/hw/virtio.c:900
  #6  0x7f3bfb68094e in virtio_blk_device_init (vdev=0x7f3bfe0129f8) at 
/root/Develop/QEMU/qemu/hw/virtio-blk.c:666
  #7  0x7f3bfb68dadf in virtio_device_init (qdev=0x7f3bfe0129f8) at 
/root/Develop/QEMU/qemu/hw/virtio.c:1092
  #8  0x7f3bfb50da46 in device_realize (dev=0x7f3bfe0129f8, 
err=0x7fff479c9258) at hw/qdev.c:176
.

In virtio_blk_device_init(), the memory which vdev point to is a static
member of "struct VirtIOBlkPCI", not heap memory, and it does not
get freed. So we shoule use virtio_common_cleanup() to clean this VirtIODevice
rather than virtio_cleanup(), which attempts to free the vdev.

This error was introduced by commit 05ff686536f408ba6e8426b1b54d25bd3379fda2
recently.

Signed-off-by: Dunrong Huang 

Reviewed-by: KONRAD Frederic 

Oops sorry for that :/

So virtio_init() has to be paired with virtio_common_cleanup(), and
virtio_common_init() with virtio_cleanup()? Confusing...


True, I agree that's confusing...

Note that this will disappear when all device will be re-factored and 
was transparent with the big patch-set.


(I think that the original code had virtio_cleanup and virtio_common_init.)

Fred


Anyway, the patch looks correct. Thanks, applied to the block branch.

Kevin

Re: [Qemu-devel] [PATCH] serial: Fix debug format strings

2013-03-19 Thread Andreas Färber

Am 19.03.2013 12:25, schrieb Kevin Wolf:
> This fixes the build of hw/serial.c with DEBUG_SERIAL enabled.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  hw/serial.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/serial.c b/hw/serial.c
> index 48a5eb6..0ccc499 100644
> --- a/hw/serial.c
> +++ b/hw/serial.c
> @@ -306,7 +306,7 @@ static void serial_ioport_write(void *opaque, hwaddr 
> addr, uint64_t val,
>  SerialState *s = opaque;
>  
>  addr &= 7;
> -DPRINTF("write addr=0x%02x val=0x%02x\n", addr, val);
> +DPRINTF("write addr=0x%" HWADDR_PRIx " val=0x%" PRIx64 "\n", addr, val);

Are you intentionally dropping 02 in both places? That changes output FWIW.

Andreas

>  switch(addr) {
>  default:
>  case 0:
> @@ -527,7 +527,7 @@ static uint64_t serial_ioport_read(void *opaque, hwaddr 
> addr, unsigned size)
>  ret = s->scr;
>  break;
>  }
> -DPRINTF("read addr=0x%02x val=0x%02x\n", addr, ret);
> +DPRINTF("read addr=0x%" HWADDR_PRIx " val=0x%02x\n", addr, ret);
>  return ret;
>  }
>  
> 


-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Re: [Qemu-devel] [PATCH] Introduce query-cpu-max QMP command and cpu_max HMP counterpart

2013-03-19 Thread Michal Novotny


On 03/19/2013 02:27 PM, Andreas Färber wrote:
> Am 19.03.2013 13:28, schrieb Markus Armbruster:
>> Please cc: Luiz and me on QMP work in the future.
>>
>> Michal Novotny  writes:
>>
>>> This is the patch to introduce the query-cpu-max QMP command to get
>>> the maximum number of CPUs supported by the currently running emulator
>>> instance. This may differ machine from machine as defined by -machine
>>> settings and max_cpus member of QEMUMachine structure.
>> Humor me: don't start commit message bodies with "This patch" or
>> variations thereof, and don't repeat the subject.  Suggest:
>>
>> QMP command query-cpu-max returns the maximum number of CPUs supported
>> by the currently running emulator instance, as defined in its
>> QEMUMachine struct.
>>
>>> It's been tested both using QMP/qmp utility and telnet session on
>>> the QEMU session.
>> What's a QMP/qmp utility?
> Our ./QMP/qmp script, I guess.
>
>>  Let's drop this sentence.  In the future,
>> feel free to put testing info below the "---" line.
>>
>>> The HMP counterpart called cpu_max has been introduced by this patch
>>> too.
>> Grammar nit: s/has been/is/.  Even better, avoid passive voice.
>>
>> Hmm, I just rewrote most of your commit message, so why not rewrite all
>> of it:
>>
>> New QMP command query-cpu-max and HMP command cpu_max
>>
>> These commands return the maximum number of CPUs supported by the
>> currently running emulator instance, as defined in its QEMUMachine
>> struct.
>>
>> Perhaps Luiz can fix up the commit message commit, if you don't mind.
>>
>> Patch looks good.
>>
>> Should query commands for machine properties multiply, we should
>> consider creating a single command returning all of them.  I'm not
>> asking you to do that now.
>>
>> Reviewed-by: Markus Armbruster 
> From my CPU perspective I am wondering if this info has future as-is?
> IIUC this QEMUMachine value differs from the value exposed to
> SeaBIOS/KVM already since recently, as it ignores CPU topology.
>
> I'm guessing the libvirt use case is just to detect users specifying too
> many -cpu options, so I won't veto this, but want to caution that we may
> need to change this as we proceed with CPU remodelling.

Yeah, this patch was written for libvirt as one of the libvirt
developers was not happy qemu doesn't support that yet.

Michal
>
> Regards,
> Andreas
>

-- 
Michal Novotny , RHCE, Red Hat
Virtualization | libvirt-php bindings | php-virt-control.org

Re: [Qemu-devel] [PATCH v9 03/10] virtio-blk: don't use pointer for configuration.

2013-03-19 Thread KONRAD Frédéric


On 18/03/2013 09:59, Kevin Wolf wrote:

Am 15.03.2013 um 19:48 hat fred.kon...@greensocs.com geschrieben:

From: KONRAD Frederic 

The configuration field must not be a pointer as it will be used for virtio-blk
properties. So *blk is replaced by blk in VirtIOBlock structure.

Signed-off-by: KONRAD Frederic 
Reviewed-by: Peter Maydell 
---
  hw/virtio-blk.c | 8 
  hw/virtio-blk.h | 2 +-
  2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
index 6714b01..908c316 100644
--- a/hw/virtio-blk.c
+++ b/hw/virtio-blk.c
@@ -151,7 +151,7 @@ static void virtio_blk_handle_scsi(VirtIOBlockReq *req)
   */
  req->scsi = (void *)req->elem.in_sg[req->elem.in_num - 2].iov_base;
  
-if (!req->dev->blk->scsi) {

+if (!req->dev->blk.scsi) {
  status = VIRTIO_BLK_S_UNSUPP;
  goto fail;
  }
@@ -371,7 +371,7 @@ static void virtio_blk_handle_request(VirtIOBlockReq *req,
   * terminated by '\0' only when shorter than buffer.
   */
  strncpy(req->elem.in_sg[0].iov_base,
-s->blk->serial ? s->blk->serial : "",
+s->blk.serial ? s->blk.serial : "",
  MIN(req->elem.in_sg[0].iov_len, VIRTIO_BLK_ID_BYTES));
  virtio_blk_req_complete(req, VIRTIO_BLK_S_OK);
  g_free(req);
@@ -534,7 +534,7 @@ static uint32_t virtio_blk_get_features(VirtIODevice *vdev, 
uint32_t features)
  features |= (1 << VIRTIO_BLK_F_BLK_SIZE);
  features |= (1 << VIRTIO_BLK_F_SCSI);
  
-if (s->blk->config_wce) {

+if (s->blk.config_wce) {
  features |= (1 << VIRTIO_BLK_F_CONFIG_WCE);
  }
  if (bdrv_enable_write_cache(s->bs))
@@ -650,7 +650,7 @@ VirtIODevice *virtio_blk_init(DeviceState *dev, 
VirtIOBlkConf *blk)
  s->vdev.reset = virtio_blk_reset;
  s->bs = blk->conf.bs;
  s->conf = &blk->conf;
-s->blk = blk;
+memcpy(&(s->blk), blk, sizeof(struct VirtIOBlkConf));

Why not simply s->blk = *blk?

The reason why copying this works is that blk is read-only after
initialisation. We also get an additional reference to blk->serial, but
we know that it can only go away after this device has been destroyed
(the same assumption was necessary for the s->blk pointer in the old
code).


You mean this copying (s->blk = *blk) ?



Is my understanding of this correct?

Kevin

Re: [Qemu-devel] [RFC PATCH RDMA support v4: 08/10] introduce QEMUFileRDMA

2013-03-19 Thread Michael R. Hines


On 03/19/2013 09:25 AM, Paolo Bonzini wrote:

Yes, the drain needs to happen in a few places already:

1. During save_rdma_page (if the current "chunk" is full of pages)
Ok, this is internal to RDMA so no problem.


2. During the end of each iteration (now using qemu_fflush in my current
patch)

Why?


This is because of downtime: You have to drain the queue anyway at the
very end, and if you don't drain it in advance after each iteration, then
the queue will have lots of bytes in it waiting for transmission and the
Virtual Machine will be stopped for a much longer period of time during
the last iteration waiting for RDMA card to finish transmission of all those
bytes.

If you wait till the last iteration to do this, then all of that waiting 
time gets

counted as downtime, causing the VCPUs to be unnecessarily stopped.



3. And also during qemu_savem_state_complete(), also using qemu_fflush.

This would be caught by put_buffer, but (2) would not.



I'm not sure this is good enough either - we don't want to flush
the queue *frequently*. only when it's necessary for performance
 we do want the queue to have some meat to it so the hardware
can write bytes as fast as possible.

If we flush inside put_buffer (which is called very frequently): then
we have no way to distinquish *where* put buffer was called from
(either from qemu_savevm_state_complete() or from a device-level
function call that's using QEMUFile).


Yes, I am suggesting to modify f->pos in save_rdma_page instead.

Paolo


Would that not confuse the other QEMUFile users?
If I change that pointer (without actually putting bytes
in into QEMUFile), won't the f->pos pointer be
incorrectly updated?

f->pos is never used directly by QEMUFile, it is almost an opaque value.
  It is accumulated on every qemu_fflush (so that it can be passed to the
->put_buffer function), and returned by qemu_ftell; nothing else.

If you make somehow save_rdma_page a new op, returning a value from that
op and adding it to f->pos would be a good way to achieve this.


Ok, great - I'll take advantage of thatThanks.

Re: [Qemu-devel] [PATCH] serial: Fix debug format strings

2013-03-19 Thread Kevin Wolf

Am 19.03.2013 um 14:29 hat Andreas Färber geschrieben:
> Am 19.03.2013 12:25, schrieb Kevin Wolf:
> > This fixes the build of hw/serial.c with DEBUG_SERIAL enabled.
> > 
> > Signed-off-by: Kevin Wolf 
> > ---
> >  hw/serial.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/serial.c b/hw/serial.c
> > index 48a5eb6..0ccc499 100644
> > --- a/hw/serial.c
> > +++ b/hw/serial.c
> > @@ -306,7 +306,7 @@ static void serial_ioport_write(void *opaque, hwaddr 
> > addr, uint64_t val,
> >  SerialState *s = opaque;
> >  
> >  addr &= 7;
> > -DPRINTF("write addr=0x%02x val=0x%02x\n", addr, val);
> > +DPRINTF("write addr=0x%" HWADDR_PRIx " val=0x%" PRIx64 "\n", addr, 
> > val);
> 
> Are you intentionally dropping 02 in both places? That changes output FWIW.

%02 for 64 bit values felt odd, but I think they might be guaranteed to
be small enough in practice. If you really prefer, I can send a v2 which
adds it back. But it's only debug code, so... *shrug*

Kevin

Re: [Qemu-devel] [RFC PATCH RDMA support v4: 08/10] introduce QEMUFileRDMA

2013-03-19 Thread Paolo Bonzini

Il 19/03/2013 14:40, Michael R. Hines ha scritto:
> On 03/19/2013 09:25 AM, Paolo Bonzini wrote:
>> Yes, the drain needs to happen in a few places already:
>>
>> 1. During save_rdma_page (if the current "chunk" is full of pages)
>> Ok, this is internal to RDMA so no problem.
>>
>>> 2. During the end of each iteration (now using qemu_fflush in my current
>>> patch)
>> Why?
> 
> This is because of downtime: You have to drain the queue anyway at the
> very end, and if you don't drain it in advance after each iteration, then
> the queue will have lots of bytes in it waiting for transmission and the
> Virtual Machine will be stopped for a much longer period of time during
> the last iteration waiting for RDMA card to finish transmission of all
> those
> bytes.

Shouldn't the "current chunk full" case take care of it too?

Of course if you disable chunking you have to add a different condition,
perhaps directly into save_rdma_page.

> If you wait till the last iteration to do this, then all of that waiting time 
> gets
> counted as downtime, causing the VCPUs to be unnecessarily stopped.
> 
>>> 3. And also during qemu_savem_state_complete(), also using qemu_fflush.
>> This would be caught by put_buffer, but (2) would not.
>>
> 
> I'm not sure this is good enough either - we don't want to flush
> the queue *frequently*. only when it's necessary for performance
>  we do want the queue to have some meat to it so the hardware
> can write bytes as fast as possible.
> 
> If we flush inside put_buffer (which is called very frequently):

Is it called at any time during RAM migration?

> then we have no way to distinquish *where* put buffer was called from
> (either from qemu_savevm_state_complete() or from a device-level
> function call that's using QEMUFile).

Can you make drain a no-op if there is nothing in flight?  Then every
call to put_buffer after the first should not have any overhead.

Paolo

 Yes, I am suggesting to modify f->pos in save_rdma_page instead.

 Paolo

>>> Would that not confuse the other QEMUFile users?
>>> If I change that pointer (without actually putting bytes
>>> in into QEMUFile), won't the f->pos pointer be
>>> incorrectly updated?
>> f->pos is never used directly by QEMUFile, it is almost an opaque value.
>>   It is accumulated on every qemu_fflush (so that it can be passed to the
>> ->put_buffer function), and returned by qemu_ftell; nothing else.
>>
>> If you make somehow save_rdma_page a new op, returning a value from that
>> op and adding it to f->pos would be a good way to achieve this.
> 
> Ok, great - I'll take advantage of thatThanks.
>

1 2 3 >

1 - 100 of 233 matches

Mail list logo