date:20140731

Re: [Qemu-devel] [PATCH v2] po: Add Chinese translation

2014-07-31 Thread Amos Kong

On Thu, Jul 31, 2014 at 11:18:45AM +0800, Fam Zheng wrote:
> Signed-off-by: Fam Zheng 
> 
> ---
> v2: Fix typo for "_View".
> ---
>  po/zh_CN.po | 86 
> +
>  1 file changed, 86 insertions(+)
>  create mode 100644 po/zh_CN.po

Reviewed-by: Amos Kong 
 
> diff --git a/po/zh_CN.po b/po/zh_CN.po
> new file mode 100644
> index 000..7ffba30
> --- /dev/null
> +++ b/po/zh_CN.po
> @@ -0,0 +1,86 @@
> +# Chinese translation for QEMU.
> +# This file is put in the public domain.
> +#
> +# Fam Zheng , 2014
> +msgid ""
> +msgstr ""
> +"Project-Id-Version: QEMU 2.2\n"
> +"Report-Msgid-Bugs-To: qemu-devel@nongnu.org\n"
> +"POT-Creation-Date: 2014-07-31 10:03+0800\n"
> +"PO-Revision-Date: 2014-07-31 10:00+0800\n"
> +"Last-Translator: Fam Zheng \n"
> +"Language-Team: Chinese \n"
> +"Language: zh\n"
> +"MIME-Version: 1.0\n"
> +"Content-Type: text/plain; charset=UTF-8\n"
> +"Content-Transfer-Encoding: 8bit\n"
> +"Plural-Forms: nplurals=2; plural=n != 1;\n"
> +"X-Generator: Lokalize 1.4\n"
> +
> +#: ui/gtk.c:321
> +msgid " - Press Ctrl+Alt+G to release grab"
> +msgstr "- 按下 Ctrl+Alt+G 取消捕获"

  ^ lost one blank space
> +
> +#: ui/gtk.c:325
> +msgid " [Paused]"
> +msgstr " [已暂停]"
> +
> +#: ui/gtk.c:1601
> +msgid "_Pause"
> +msgstr "暂停(_P)"
> +
> +#: ui/gtk.c:1607
> +msgid "_Reset"
> +msgstr "重置(_R)"
> +
> +#: ui/gtk.c:1610
> +msgid "Power _Down"
> +msgstr "关闭电源(_D)"
> +
> +#: ui/gtk.c:1616
> +msgid "_Quit"
> +msgstr "退出(_Q)"
> +
> +#: ui/gtk.c:1692
> +msgid "_Fullscreen"
> +msgstr "全屏(_F)"
> +
> +#: ui/gtk.c:1702
> +msgid "Zoom _In"
> +msgstr "放大(_I)"
> +
> +#: ui/gtk.c:1709
> +msgid "Zoom _Out"
> +msgstr "缩小(_O)"
> +
> +#: ui/gtk.c:1716
> +msgid "Best _Fit"
> +msgstr "最合适大小(_F)"
> +
> +#: ui/gtk.c:1723
> +msgid "Zoom To _Fit"
> +msgstr "缩放以适应大小(_F)"
> +
> +#: ui/gtk.c:1729
> +msgid "Grab On _Hover"
> +msgstr "鼠标经过时捕获(_H)"
> +
> +#: ui/gtk.c:1732
> +msgid "_Grab Input"
> +msgstr "捕获输入(_G)"
> +
> +#: ui/gtk.c:1761
> +msgid "Show _Tabs"
> +msgstr "显示标签页(_T)"
> +
> +#: ui/gtk.c:1764
> +msgid "Detach Tab"
> +msgstr "分离标签页"
> +
> +#: ui/gtk.c:1778
> +msgid "_Machine"
> +msgstr "虚拟机(_M)"
> +
> +#: ui/gtk.c:1783
> +msgid "_View"
> +msgstr "视图(_V)"
> -- 
> 2.0.3

-- 
Amos.

[Qemu-devel] [Bug 1350435] Re: tcg.c:1693: tcg fatal error

2014-07-31 Thread LocutusOfBorg

No, I did not test the patch, I don't think that I can push that patch
in lp infrastructure, and I don't have my own one.

Anyway Peter is right, we don't need to hide bugs, but to fix them ;)

** Changed in: qemu (Ubuntu)
   Status: Incomplete => New

** Tags removed: patch

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1350435

Title:
  tcg.c:1693: tcg fatal error

Status in launchpad-buildd:
  New
Status in QEMU:
  New
Status in “qemu” package in Ubuntu:
  New

Bug description:
  this started happening after the launchpad buildd trusty deploy
  
https://code.launchpad.net/~costamagnagianfranco/+archive/ubuntu/firefox/+build/6224439

  
  debconf-updatepo
  qemu: uncaught target signal 11 (Segmentation fault) - core dumped
  Segmentation fault (core dumped)
  qemu: uncaught target signal 11 (Segmentation fault) - core dumped
  Segmentation fault (core dumped)
  /build/buildd/qemu-2.0.0+dfsg/tcg/tcg.c:1693: tcg fatal error
  /build/buildd/qemu-2.0.0+dfsg/tcg/tcg.c:1693: tcg fatal error

  this seems to be the patch needed
  https://patches.linaro.org/32473/

To manage notifications about this bug go to:
https://bugs.launchpad.net/launchpad-buildd/+bug/1350435/+subscriptions

Re: [Qemu-devel] [PATCH v4 0/5] This series adds iotest case for IO throttling.

2014-07-31 Thread Fam Zheng

On Tue, 06/17 10:45, Fam Zheng wrote:
> On Thu, 06/05 16:47, Fam Zheng wrote:
> > There is a change in qemu-io sub-commands "aio_read" and "aio_write", which
> > makes the aio requests accounted and the statistics reflected in blockstats.
> > 
> > Note that IO throttling implementation allows overcommiting of requests, so 
> > the
> > actual IO happened in a time unit may be a bit larger than given limits. In 
> > the
> > test case, the stats numbers are compared against 110%, to make room for 
> > such
> > flexibility in order to improve determinism.
> > 
> > v4: Rebase to master. Add Benoit's rev-by lines to all the patches.
> 
> Ping?

Ping.

Re: [Qemu-devel] [PATCH 01/15] qemu coroutine: support bypass mode

2014-07-31 Thread Benoît Canet

The Thursday 31 Jul 2014 à 11:55:28 (+0800), Ming Lei wrote :
> On Thu, Jul 31, 2014 at 7:37 AM, Paolo Bonzini  wrote:
> > Il 30/07/2014 19:15, Ming Lei ha scritto:
> >> On Wed, Jul 30, 2014 at 9:45 PM, Paolo Bonzini  wrote:
> >>> Il 30/07/2014 13:39, Ming Lei ha scritto:
>  This patch introduces several APIs for supporting bypass qemu coroutine
>  in case of being not necessary and for performance's sake.
> >>>
> >>> No, this is wrong.  Dataplane *must* use the same code as non-dataplane,
> >>> anything else is a step backwards.
> >>
> >> As we saw, coroutine has brought up performance regression
> >> on dataplane, and it isn't necessary to use co in some cases, is it?
> >
> > Yes, and it's not necessary on non-dataplane either.  It's not necessary
> > on virtio-scsi, and it will not be necessary on virtio-scsi dataplane
> > either.
> 
> It is good to know these cases, so they might benefit from this patch
> in future too, :-)
> 
> >>> If you want to bypass coroutines, bdrv_aio_readv/writev must detect the
> >>> conditions that allow doing that and call the bdrv_aio_readv/writev
> >>> directly.
> >>
> >> That is easy to detect, please see the 5th patch.
> >
> > No, that's not enough.  Dataplane right now prevents block jobs, but
> > that's going to change and it could require coroutines even for raw devices.
> 
> Could you explain it a bit why co is required for raw devices in future?

Block mirroring of a device for example is done using coroutines.
As block mirroring can be done on a raw device you need coroutines.

> 
> If some cases for not requiring co can be detected beforehand, these
> patches are useful, IMO.
> 
> >>> To begin with, have you benchmarked QEMU and can you provide a trace of
> >>> *where* the coroutine overhead lies?
> >>
> >> I guess it may be caused by the stack switch, at least in one of
> >> my box, bypassing co can improve throughput by ~7%, and by
> >> ~15% in another box.
> >
> > No guesses please.  Actually that's also my guess, but since you are
> > submitting the patch you must do better and show profiles where stack
> > switching disappears after the patches.
> 
> OK, no problem, I will provide some profile result.
> 
> Thanks,
>

Re: [Qemu-devel] [PATCH v2 1/2] block: fix overlapping multiwrite requests

2014-07-31 Thread Benoît Canet

The Wednesday 30 Jul 2014 à 09:53:30 (+0100), Stefan Hajnoczi wrote :
> When request A is a strict superset of request B:
> 
>   
> 
> 
> multiwrite_merge() merges them as follows:
> 
>   AA
> 
> The tail of request A should have been included:
> 
>   AAAA
> 
> This patch fixes data loss but this code path is probably rare.  Since
> guests cannot assume ordering between in-flight requests, few
> applications submit overlapping write requests.
> 
> Reported-by: Slava Pestov 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  block.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/block.c b/block.c
> index 8cf519b..0a3ac43 100644
> --- a/block.c
> +++ b/block.c
> @@ -4498,6 +4498,12 @@ static int multiwrite_merge(BlockDriverState *bs, 
> BlockRequest *reqs,
>  // Add the second request
>  qemu_iovec_concat(qiov, reqs[i].qiov, 0, reqs[i].qiov->size);
>  
> +// Add tail of first request, if necessary
> +if (qiov->size < reqs[outidx].qiov->size) {
> +qemu_iovec_concat(qiov, reqs[outidx].qiov, qiov->size,
> +  reqs[outidx].qiov->size - qiov->size);
> +}
> +
>  reqs[outidx].nb_sectors = qiov->size >> 9;
>  reqs[outidx].qiov = qiov;
>  
> -- 
> 1.9.3
> 
> 

Seems to make sense.

Reviewed-by: Benoit Canet

Re: [Qemu-devel] [PATCH/RFC 4/5] s390x/kvm: test whether a cpu is STOPPED when checking "has_work"

2014-07-31 Thread David Hildenbrand

> > We have
> > - wait (wait bit in PSW)
> > - disabled wait (wait bit and interrupt fencing in PSW)
> > - STOPPED (not related to PSW, state change usually handled via service 
> > processor or hypervisor)
> >
> > I think we have to differentiate between KVM/TCG. On KVM we always do in 
> > kernel halt and qemu sees a halted only for STOPPED or disabled wait. TCG 
> > has to take care of the normal wait as well.
> >
> >  From a first glimpse, a disabled wait and STOPPED look similar, but there 
> > are (important) differences, e.g. other CPUs get a different a different 
> > result from a SIGP SENSE. This makes a big difference, e.g. for Linux 
> > guests, that send a SIGP STOP, followed by a SIGP SENSE loop until the CPU 
> > is down on hotplug (and shutdown, kexec..) So I think we agree, that 
> > handling the cpu states natively makes sense.
> >
> > The question is now only how to model it correctly without breaking TCG/KVM 
> > and reuse as much common code as possible. Correct?
> >
> > Do I understand you correctly, that your collapsing of stopped and halted 
> > is only in the qemu coding sense, IOW maybe we could just modify 
> > kvm_arch_process_async_events to consider the STOPPED state, as TCGs sigp 
> > implementation does not support SMP anyway?
> 
> That works for me, yes.
> 
> 
> Alex
> 

I had a look at it yesterday and it seems like we can totally drop this patch:

1. TCG doesn't support multiple CPUs and the TCG SIGP implementation isn't
ready for proper STOP/START/SENSE. Testing for STOPPED cpus in cpu_has_work()
can be dropped. To be able to support TCG was the main reason for this patch -
as we don't want to do so for now, we can leave it as is. We can still decide
to support the cpu states later using a mechanism suggest by Alex
(interrupt_requests).

Even if cpu_has_work() would make cpu.c:cpu_thread_is_idle() return false,
kvm_arch_process_async_events() called by kvm-all.c:kvm_cpu_exec() would make
it go back to sleep. Therefore a stopped VCPU will never be able to run in the
KVM case (because it always has cs->halted = true).

2. The unhalt in kvm_arch_process_async_events is for a special case where a
VCPU is in disabled wait and receives e.g. a machine-check interrupt. These
might happen in the future, for now we will never see them (the only
way to get a vcpu out of disabled wait are SIGP RESTART/CPU RESET - so we
don't break anything at that point).

David

Re: [Qemu-devel] [PATCH v2 2/2] qemu-iotests: add multiwrite test cases

2014-07-31 Thread Benoît Canet

The Wednesday 30 Jul 2014 à 09:53:31 (+0100), Stefan Hajnoczi wrote :
> This test case covers the basic bdrv_aio_multiwrite() scenarios:
> 1. Single request
> 2. Sequential requests (AABB)
> 3. Superset overlapping requests (AABBAA)
> 4. Subset overlapping requests (BBAABB)
> 5. Head overlapping requests (AABB)
> 6. Tail overlapping requests (BBAA)
> 7. Disjoint requests (AA BB)
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  tests/qemu-iotests/100 | 134 
> +
>  tests/qemu-iotests/100.out |  89 ++
>  tests/qemu-iotests/group   |   1 +
>  3 files changed, 224 insertions(+)
>  create mode 100755 tests/qemu-iotests/100
>  create mode 100644 tests/qemu-iotests/100.out
> 
> diff --git a/tests/qemu-iotests/100 b/tests/qemu-iotests/100
> new file mode 100755
> index 000..9124aba
> --- /dev/null
> +++ b/tests/qemu-iotests/100
> @@ -0,0 +1,134 @@
> +#!/bin/bash
> +#
> +# Test simple read/write using plain bdrv_read/bdrv_write
> +#
> +# Copyright (C) 2014 Red Hat, Inc.
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see .
> +#
> +
> +# creator
> +owner=stefa...@redhat.com
> +
> +seq=`basename $0`
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1 # failure is the default!
> +
> +_cleanup()
> +{
> + _cleanup_test_img
> +}
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +# get standard environment, filters and checks
> +. ./common.rc
> +. ./common.filter
> +
> +_supported_fmt generic
> +_supported_proto generic
> +_supported_os Linux
> +
> +
> +size=128M
> +
> +echo
> +echo "== Single request =="
> +_make_test_img $size
> +$QEMU_IO -c "multiwrite 0 4k" "$TEST_IMG" | _filter_qemu_io
> +
> +echo
> +echo "== verify pattern =="
> +$QEMU_IO -c "read -P 0xcd 0 4k" "$TEST_IMG" | _filter_qemu_io
> +$QEMU_IO -c "read -P 0 4k 4k" "$TEST_IMG" | _filter_qemu_io
> +
> +echo
> +echo "== Sequential requests =="
> +_make_test_img $size
> +$QEMU_IO -c "multiwrite 0 4k ; 4k 4k" "$TEST_IMG" | _filter_qemu_io
> +
> +echo
> +echo "== verify pattern =="
> +$QEMU_IO -c "read -P 0xcd 0 4k" "$TEST_IMG" | _filter_qemu_io
> +$QEMU_IO -c "read -P 0xce 4k 4k" "$TEST_IMG" | _filter_qemu_io
> +$QEMU_IO -c "read -P 0 8k 4k" "$TEST_IMG" | _filter_qemu_io
> +
> +echo
> +echo "== Superset overlapping requests =="
> +_make_test_img $size
> +$QEMU_IO -c "multiwrite 0 4k ; 1k 2k" "$TEST_IMG" | _filter_qemu_io
> +
> +echo
> +echo "== verify pattern =="
> +# Order of overlapping in-flight requests is not guaranteed so we cannot 
> verify
> +# [1k, 3k) since it could have either pattern 0xcd or 0xce.
> +$QEMU_IO -c "read -P 0xcd 0 1k" "$TEST_IMG" | _filter_qemu_io
> +$QEMU_IO -c "read -P 0xcd 3k 1k" "$TEST_IMG" | _filter_qemu_io
> +$QEMU_IO -c "read -P 0 4k 4k" "$TEST_IMG" | _filter_qemu_io
> +
> +echo
> +echo "== Subset overlapping requests =="
> +_make_test_img $size
> +$QEMU_IO -c "multiwrite 1k 2k ; 0k 4k" "$TEST_IMG" | _filter_qemu_io
> +
> +echo
> +echo "== verify pattern =="
> +# Order of overlapping in-flight requests is not guaranteed so we cannot 
> verify
> +# [1k, 3k) since it could have either pattern 0xcd or 0xce.
> +$QEMU_IO -c "read -P 0xce 0 1k" "$TEST_IMG" | _filter_qemu_io
> +$QEMU_IO -c "read -P 0xce 3k 1k" "$TEST_IMG" | _filter_qemu_io
> +$QEMU_IO -c "read -P 0 4k 4k" "$TEST_IMG" | _filter_qemu_io
> +
> +echo
> +echo "== Head overlapping requests =="
> +_make_test_img $size
> +$QEMU_IO -c "multiwrite 0k 2k ; 0k 4k" "$TEST_IMG" | _filter_qemu_io
> +
> +echo
> +echo "== verify pattern =="
> +# Order of overlapping in-flight requests is not guaranteed so we cannot 
> verify
> +# [0k, 2k) since it could have either pattern 0xcd or 0xce.
> +$QEMU_IO -c "read -P 0xce 2k 2k" "$TEST_IMG" | _filter_qemu_io
> +$QEMU_IO -c "read -P 0 4k 4k" "$TEST_IMG" | _filter_qemu_io
> +
> +echo
> +echo "== Tail overlapping requests =="
> +_make_test_img $size
> +$QEMU_IO -c "multiwrite 2k 2k ; 0k 4k" "$TEST_IMG" | _filter_qemu_io
> +
> +echo
> +echo "== verify pattern =="
> +# Order of overlapping in-flight requests is not guaranteed so we cannot 
> verify
> +# [2k, 4k) since it could have either pattern 0xcd or 0xce.
> +$QEMU_IO -c "read -P 0xce 0k 2k" "$TEST_IMG" | _filter_qemu_io
> +$QEMU_IO -c "read -P 0 4k 4k" "$TEST_IMG" | _filter_qemu_io
> +
> +echo
> +echo "== Disjoint requests =="
> +_make_test_img $size
> +$QEMU_IO -c "multiw

Re: [Qemu-devel] [PATCH alt 1/7] block: Add status callback to bdrv_amend_options()

2014-07-31 Thread Benoît Canet

The Saturday 26 Jul 2014 à 21:22:05 (+0200), Max Reitz wrote :

> Depending on the changed options and the image format,
> bdrv_amend_options() may take a significant amount of time. In these
> cases, a way to be informed about the operation's status is desirable.
> 
> Since the operation is rather complex and may fundamentally change the
> image, implementing it as AIO or a couroutine does not seem feasible. On
> the other hand, implementing it as a block job would be significantly
> more difficult than a simple callback and would not add benefits other
> than progress report to the amending operation, because it should not
> actually be run as a block job at all.
> 
> A callback may not be very pretty, but it's very easy to implement and
> perfectly fits its purpose here.
> 
> Signed-off-by: Max Reitz 
> ---
>  block.c   | 5 +++--
>  block/qcow2.c | 5 -
>  include/block/block.h | 5 -
>  include/block/block_int.h | 3 ++-
>  qemu-img.c| 2 +-
>  5 files changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 3e252a2..4c8d4d8 100644
> --- a/block.c
> +++ b/block.c
> @@ -5708,12 +5708,13 @@ void bdrv_add_before_write_notifier(BlockDriverState 
> *bs,
>  notifier_with_return_list_add(&bs->before_write_notifiers, notifier);
>  }
>  
> -int bdrv_amend_options(BlockDriverState *bs, QemuOpts *opts)
> +int bdrv_amend_options(BlockDriverState *bs, QemuOpts *opts,
> +   BlockDriverAmendStatusCB *status_cb)
>  {
>  if (!bs->drv->bdrv_amend_options) {
>  return -ENOTSUP;
>  }
> -return bs->drv->bdrv_amend_options(bs, opts);
> +return bs->drv->bdrv_amend_options(bs, opts, status_cb);
>  }
>  
>  /* This function will be called by the bdrv_recurse_is_first_non_filter 
> method
> diff --git a/block/qcow2.c b/block/qcow2.c
> index b0faa69..757f890 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -2210,7 +2210,8 @@ static int qcow2_downgrade(BlockDriverState *bs, int 
> target_version)
>  return 0;
>  }
>  
> -static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts)
> +static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts,
> +   BlockDriverAmendStatusCB *status_cb)
>  {
>  BDRVQcowState *s = bs->opaque;
>  int old_version = s->qcow_version, new_version = old_version;
> @@ -2223,6 +2224,8 @@ static int qcow2_amend_options(BlockDriverState *bs, 
> QemuOpts *opts)
>  int ret;
>  QemuOptDesc *desc = opts->list->desc;
>  
> +(void)status_cb;

This look like a temporary hack to please the compiler.
Am I right ? Should be comment this ?

> +
>  while (desc && desc->name) {
>  if (!qemu_opt_find(opts, desc->name)) {
>  /* only change explicitly defined options */
> diff --git a/include/block/block.h b/include/block/block.h
> index 32d3676..f2b1799 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -309,7 +309,10 @@ typedef enum {
>  
>  int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res, BdrvCheckMode 
> fix);
>  
> -int bdrv_amend_options(BlockDriverState *bs_new, QemuOpts *opts);
> +typedef void BlockDriverAmendStatusCB(BlockDriverState *bs, int64_t offset,
> +  int64_t total_work_size);
> +int bdrv_amend_options(BlockDriverState *bs_new, QemuOpts *opts,
> +   BlockDriverAmendStatusCB *status_cb);
>  
>  /* external snapshots */
>  bool bdrv_recurse_is_first_non_filter(BlockDriverState *bs,
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index f6c3bef..5c5798d 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -228,7 +228,8 @@ struct BlockDriver {
>  int (*bdrv_check)(BlockDriverState* bs, BdrvCheckResult *result,
>  BdrvCheckMode fix);
>  
> -int (*bdrv_amend_options)(BlockDriverState *bs, QemuOpts *opts);
> +int (*bdrv_amend_options)(BlockDriverState *bs, QemuOpts *opts,
> +  BlockDriverAmendStatusCB *status_cb);
>  
>  void (*bdrv_debug_event)(BlockDriverState *bs, BlkDebugEvent event);
>  
> diff --git a/qemu-img.c b/qemu-img.c
> index ef74ae4..90d6b79 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -2827,7 +2827,7 @@ static int img_amend(int argc, char **argv)
>  goto out;
>  }
>  
> -ret = bdrv_amend_options(bs, opts);
> +ret = bdrv_amend_options(bs, opts, NULL);
>  if (ret < 0) {
>  error_report("Error while amending options: %s", strerror(-ret));
>  goto out;
> -- 
> 2.0.3
> 
>

Re: [Qemu-devel] [PATCH alt 2/7] qemu-img: Add progress output for amend

2014-07-31 Thread Benoît Canet

The Saturday 26 Jul 2014 à 21:22:06 (+0200), Max Reitz wrote :
> Now that bdrv_amend_options() supports a status callback, use it to
> display a progress report.
> 
> Signed-off-by: Max Reitz 
> ---
>  qemu-img.c | 26 +++---
>  1 file changed, 23 insertions(+), 3 deletions(-)
> 
> diff --git a/qemu-img.c b/qemu-img.c
> index 90d6b79..a06f425 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -2732,6 +2732,13 @@ out:
>  return 0;
>  }
>  
> +static void amend_status_cb(BlockDriverState *bs,
> +int64_t offset, int64_t total_work_size)
> +{
> +(void)bs;
This cast also look like a compiler pleasing thing.
Is it really required ?

> +qemu_progress_print(100.f * offset / total_work_size, 0);
> +}
> +
>  static int img_amend(int argc, char **argv)
>  {
>  int c, ret = 0;
> @@ -2740,12 +2747,12 @@ static int img_amend(int argc, char **argv)
>  QemuOpts *opts = NULL;
>  const char *fmt = NULL, *filename, *cache;
>  int flags;
> -bool quiet = false;
> +bool quiet = false, progress = false;
>  BlockDriverState *bs = NULL;
>  
>  cache = BDRV_DEFAULT_CACHE;
>  for (;;) {
> -c = getopt(argc, argv, "ho:f:t:q");
> +c = getopt(argc, argv, "ho:f:t:pq");
>  if (c == -1) {
>  break;
>  }
> @@ -2775,6 +2782,9 @@ static int img_amend(int argc, char **argv)
>  case 't':
>  cache = optarg;
>  break;
> +case 'p':
> +progress = true;
> +break;
>  case 'q':
>  quiet = true;
>  break;
> @@ -2785,6 +2795,11 @@ static int img_amend(int argc, char **argv)
>  error_exit("Must specify options (-o)");
>  }
>  
> +if (quiet) {
> +progress = false;
> +}
> +qemu_progress_init(progress, 1.0);
> +
>  filename = (optind == argc - 1) ? argv[argc - 1] : NULL;
>  if (fmt && has_help_option(options)) {
>  /* If a format is explicitly specified (and possibly no filename is
> @@ -2827,13 +2842,18 @@ static int img_amend(int argc, char **argv)
>  goto out;
>  }
>  
> -ret = bdrv_amend_options(bs, opts, NULL);
> +/* In case the driver does not call amend_status_cb() */
> +qemu_progress_print(0.f, 0);
> +ret = bdrv_amend_options(bs, opts, &amend_status_cb);
> +qemu_progress_print(100.f, 0);
>  if (ret < 0) {
>  error_report("Error while amending options: %s", strerror(-ret));
>  goto out;
>  }
>  
>  out:
> +qemu_progress_end();
> +
>  if (bs) {
>  bdrv_unref(bs);
>  }
> -- 
> 2.0.3
> 
>

Re: [Qemu-devel] [PATCH alt 3/7] qemu-img: Fix insignifcant memleak

2014-07-31 Thread Benoît Canet

The Saturday 26 Jul 2014 à 21:22:07 (+0200), Max Reitz wrote :

> As soon as options is set in img_amend(), it needs to be freed before
> the function returns. This leak is rather insignifcant, as qemu-img will
> exit subsequently anyway, but there's no point in not fixing it.
> 
> Signed-off-by: Max Reitz 
> ---
>  qemu-img.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/qemu-img.c b/qemu-img.c
> index a06f425..d73a68a 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -2809,7 +2809,9 @@ static int img_amend(int argc, char **argv)
>  }
>  
>  if (optind != argc - 1) {
> -error_exit("Expecting one image file name");
> +error_report("Expecting one image file name");
> +ret = -1;
> +goto out;
>  }
>  
>  flags = BDRV_O_FLAGS | BDRV_O_RDWR;
> -- 
> 2.0.3
> 
> 

Reviewed-by: Benoit Canet

Re: [Qemu-devel] [PATCH alt 4/7] block/qcow2: Implement status CB for amend

2014-07-31 Thread Benoît Canet

The Saturday 26 Jul 2014 à 21:22:08 (+0200), Max Reitz wrote :
> The only really time-consuming operation potentially performed by
> qcow2_amend_options() is zero cluster expansion when downgrading qcow2
> images from compat=1.1 to compat=0.10, so report status of that
> operation and that operation only through the status CB.
> 
> For this, approximate the progress as the number of L1 entries visited
> during the operation.
> 
> Signed-off-by: Max Reitz 
> ---
>  block/qcow2-cluster.c | 36 
>  block/qcow2.c |  9 -
>  block/qcow2.h |  3 ++-
>  3 files changed, 38 insertions(+), 10 deletions(-)
> 
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index 4208dc0..f8bec6f 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -1548,10 +1548,17 @@ fail:
>   * zero expansion (i.e., has been filled with zeroes and is referenced from 
> an
>   * L2 table). nb_clusters contains the total cluster count of the image file,
>   * i.e., the number of bits in expanded_clusters.
> + *
> + * l1_entries and *visited_l1_entries are ued to keep track of progress for
> + * status_cb(). l1_entries contains the total number of L1 entries and
> + * *visited_l1_entries counts all visited L1 entries.
>   */
>  static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t 
> *l1_table,
>int l1_size, uint8_t 
> **expanded_clusters,
> -  uint64_t *nb_clusters)
> +  uint64_t *nb_clusters,
> +  int64_t *visited_l1_entries,
> +  int64_t l1_entries,
> +  BlockDriverAmendStatusCB *status_cb)
>  {
>  BDRVQcowState *s = bs->opaque;
>  bool is_active_l1 = (l1_table == s->l1_table);
> @@ -1571,6 +1578,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState 
> *bs, uint64_t *l1_table,
>  
>  if (!l2_offset) {
>  /* unallocated */
> +(*visited_l1_entries)++;
>  continue;
>  }
>  
> @@ -1703,6 +1711,13 @@ static int expand_zero_clusters_in_l1(BlockDriverState 
> *bs, uint64_t *l1_table,
>  }
>  }
>  }
> +
> +(*visited_l1_entries)++;
> +
> +if (status_cb) {
> +status_cb(bs, *visited_l1_entries << (s->l2_bits + 
> s->cluster_bits),
> +  l1_entries << (s->l2_bits + s->cluster_bits));

Shifting is a multiplication so it keep proportionality intact.
So do we really need these shifts ?

> +}
>  }
>  
>  ret = 0;
> @@ -1729,21 +1744,32 @@ fail:
>   * allocation for pre-allocated ones). This is important for downgrading to a
>   * qcow2 version which doesn't yet support metadata zero clusters.
>   */
> -int qcow2_expand_zero_clusters(BlockDriverState *bs)
> +int qcow2_expand_zero_clusters(BlockDriverState *bs,
> +   BlockDriverAmendStatusCB *status_cb)
>  {
>  BDRVQcowState *s = bs->opaque;
>  uint64_t *l1_table = NULL;
>  uint64_t nb_clusters;
> +int64_t l1_entries = 0, visited_l1_entries = 0;
>  uint8_t *expanded_clusters;
>  int ret;
>  int i, j;
>  
> +if (status_cb) {
> +l1_entries = s->l1_size;
> +for (i = 0; i < s->nb_snapshots; i++) {
> +l1_entries += s->snapshots[i].l1_size;
> +}
> +}
> +
>  nb_clusters = size_to_clusters(s, bs->file->total_sectors *
> BDRV_SECTOR_SIZE);
>  expanded_clusters = g_malloc0((nb_clusters + 7) / 8);
>  
>  ret = expand_zero_clusters_in_l1(bs, s->l1_table, s->l1_size,
> - &expanded_clusters, &nb_clusters);
> + &expanded_clusters, &nb_clusters,
> + &visited_l1_entries, l1_entries,
> + status_cb);
>  if (ret < 0) {
>  goto fail;
>  }
> @@ -1777,7 +1803,9 @@ int qcow2_expand_zero_clusters(BlockDriverState *bs)
>  }
>  
>  ret = expand_zero_clusters_in_l1(bs, l1_table, 
> s->snapshots[i].l1_size,
> - &expanded_clusters, &nb_clusters);
> + &expanded_clusters, &nb_clusters,
> + &visited_l1_entries, l1_entries,
> + status_cb);
>  if (ret < 0) {
>  goto fail;
>  }
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 757f890..6e8c8ab 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -2147,7 +2147,8 @@ static int qcow2_load_vmstate(BlockDriverState *bs, 
> uint8_t *buf,
>   * Downgrades an image's version. To achieve this, any incompatible features
>   * have to be removed.
>   */
> -static int qcow2_downgrade(BlockDriverState *bs, int target_versio

Re: [Qemu-devel] [PATCH alt 1/7] block: Add status callback to bdrv_amend_options()

2014-07-31 Thread Benoît Canet

The Thursday 31 Jul 2014 à 09:51:05 (+0200), Benoît Canet wrote :
> The Saturday 26 Jul 2014 à 21:22:05 (+0200), Max Reitz wrote :
> 
> > Depending on the changed options and the image format,
> > bdrv_amend_options() may take a significant amount of time. In these
> > cases, a way to be informed about the operation's status is desirable.
> > 
> > Since the operation is rather complex and may fundamentally change the
> > image, implementing it as AIO or a couroutine does not seem feasible. On
> > the other hand, implementing it as a block job would be significantly
> > more difficult than a simple callback and would not add benefits other
> > than progress report to the amending operation, because it should not
> > actually be run as a block job at all.
> > 
> > A callback may not be very pretty, but it's very easy to implement and
> > perfectly fits its purpose here.
> > 
> > Signed-off-by: Max Reitz 
> > ---
> >  block.c   | 5 +++--
> >  block/qcow2.c | 5 -
> >  include/block/block.h | 5 -
> >  include/block/block_int.h | 3 ++-
> >  qemu-img.c| 2 +-
> >  5 files changed, 14 insertions(+), 6 deletions(-)
> > 
> > diff --git a/block.c b/block.c
> > index 3e252a2..4c8d4d8 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -5708,12 +5708,13 @@ void 
> > bdrv_add_before_write_notifier(BlockDriverState *bs,
> >  notifier_with_return_list_add(&bs->before_write_notifiers, notifier);
> >  }
> >  
> > -int bdrv_amend_options(BlockDriverState *bs, QemuOpts *opts)
> > +int bdrv_amend_options(BlockDriverState *bs, QemuOpts *opts,
> > +   BlockDriverAmendStatusCB *status_cb)
> >  {
> >  if (!bs->drv->bdrv_amend_options) {
> >  return -ENOTSUP;
> >  }
> > -return bs->drv->bdrv_amend_options(bs, opts);
> > +return bs->drv->bdrv_amend_options(bs, opts, status_cb);
> >  }
> >  
> >  /* This function will be called by the bdrv_recurse_is_first_non_filter 
> > method
> > diff --git a/block/qcow2.c b/block/qcow2.c
> > index b0faa69..757f890 100644
> > --- a/block/qcow2.c
> > +++ b/block/qcow2.c
> > @@ -2210,7 +2210,8 @@ static int qcow2_downgrade(BlockDriverState *bs, int 
> > target_version)
> >  return 0;
> >  }
> >  
> > -static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts)
> > +static int qcow2_amend_options(BlockDriverState *bs, QemuOpts *opts,
> > +   BlockDriverAmendStatusCB *status_cb)
> >  {
> >  BDRVQcowState *s = bs->opaque;
> >  int old_version = s->qcow_version, new_version = old_version;
> > @@ -2223,6 +2224,8 @@ static int qcow2_amend_options(BlockDriverState *bs, 
> > QemuOpts *opts)
> >  int ret;
> >  QemuOptDesc *desc = opts->list->desc;
> >  
> > +(void)status_cb;
> 
> This look like a temporary hack to please the compiler.
> Am I right ? Should be comment this ?

For the further patch I now understand this I am just suprised that you need to 
silence a unused function parameter.

Reviewed-by: Benoit Canet 

> 
> > +
> >  while (desc && desc->name) {
> >  if (!qemu_opt_find(opts, desc->name)) {
> >  /* only change explicitly defined options */
> > diff --git a/include/block/block.h b/include/block/block.h
> > index 32d3676..f2b1799 100644
> > --- a/include/block/block.h
> > +++ b/include/block/block.h
> > @@ -309,7 +309,10 @@ typedef enum {
> >  
> >  int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res, BdrvCheckMode 
> > fix);
> >  
> > -int bdrv_amend_options(BlockDriverState *bs_new, QemuOpts *opts);
> > +typedef void BlockDriverAmendStatusCB(BlockDriverState *bs, int64_t offset,
> > +  int64_t total_work_size);
> > +int bdrv_amend_options(BlockDriverState *bs_new, QemuOpts *opts,
> > +   BlockDriverAmendStatusCB *status_cb);
> >  
> >  /* external snapshots */
> >  bool bdrv_recurse_is_first_non_filter(BlockDriverState *bs,
> > diff --git a/include/block/block_int.h b/include/block/block_int.h
> > index f6c3bef..5c5798d 100644
> > --- a/include/block/block_int.h
> > +++ b/include/block/block_int.h
> > @@ -228,7 +228,8 @@ struct BlockDriver {
> >  int (*bdrv_check)(BlockDriverState* bs, BdrvCheckResult *result,
> >  BdrvCheckMode fix);
> >  
> > -int (*bdrv_amend_options)(BlockDriverState *bs, QemuOpts *opts);
> > +int (*bdrv_amend_options)(BlockDriverState *bs, QemuOpts *opts,
> > +  BlockDriverAmendStatusCB *status_cb);
> >  
> >  void (*bdrv_debug_event)(BlockDriverState *bs, BlkDebugEvent event);
> >  
> > diff --git a/qemu-img.c b/qemu-img.c
> > index ef74ae4..90d6b79 100644
> > --- a/qemu-img.c
> > +++ b/qemu-img.c
> > @@ -2827,7 +2827,7 @@ static int img_amend(int argc, char **argv)
> >  goto out;
> >  }
> >  
> > -ret = bdrv_amend_options(bs, opts);
> > +ret = bdrv_amend_options(bs, opts, NULL);
> >  if (ret < 0) {
> >  error_report("Error while amending option

Re: [Qemu-devel] [PATCH alt 5/7] block/qcow2: Make get_refcount() global

2014-07-31 Thread Benoît Canet

The Saturday 26 Jul 2014 à 21:22:09 (+0200), Max Reitz wrote :
> Reading the refcount of a cluster is an operation which can be useful in
> all of the qcow2 code, so make that function globally available.
> 
> Signed-off-by: Max Reitz 
> ---
>  block/qcow2-refcount.c | 23 ---
>  block/qcow2.h  |  2 ++
>  2 files changed, 14 insertions(+), 11 deletions(-)
> 
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index cc6cf74..0c9887b 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -87,7 +87,7 @@ static int load_refcount_block(BlockDriverState *bs,
>   * return value is the refcount of the cluster, negative values are -errno
>   * and indicate an error.
>   */
> -static int get_refcount(BlockDriverState *bs, int64_t cluster_index)
> +int qcow2_get_refcount(BlockDriverState *bs, int64_t cluster_index)
>  {
>  BDRVQcowState *s = bs->opaque;
>  uint64_t refcount_table_index, block_index;
> @@ -625,7 +625,7 @@ int qcow2_update_cluster_refcount(BlockDriverState *bs,
>  return ret;
>  }
>  
> -return get_refcount(bs, cluster_index);
> +return qcow2_get_refcount(bs, cluster_index);
>  }
>  
>  
> @@ -646,7 +646,7 @@ static int64_t alloc_clusters_noref(BlockDriverState *bs, 
> uint64_t size)
>  retry:
>  for(i = 0; i < nb_clusters; i++) {
>  uint64_t next_cluster_index = s->free_cluster_index++;
> -refcount = get_refcount(bs, next_cluster_index);
> +refcount = qcow2_get_refcount(bs, next_cluster_index);
>  
>  if (refcount < 0) {
>  return refcount;
> @@ -710,7 +710,7 @@ int qcow2_alloc_clusters_at(BlockDriverState *bs, 
> uint64_t offset,
>  /* Check how many clusters there are free */
>  cluster_index = offset >> s->cluster_bits;
>  for(i = 0; i < nb_clusters; i++) {
> -refcount = get_refcount(bs, cluster_index++);
> +refcount = qcow2_get_refcount(bs, cluster_index++);
>  
>  if (refcount < 0) {
>  return refcount;
> @@ -927,7 +927,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
>  cluster_index, addend,
>  QCOW2_DISCARD_SNAPSHOT);
>  } else {
> -refcount = get_refcount(bs, cluster_index);
> +refcount = qcow2_get_refcount(bs, cluster_index);
>  }
>  
>  if (refcount < 0) {
> @@ -967,7 +967,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
>  refcount = qcow2_update_cluster_refcount(bs, l2_offset >>
>  s->cluster_bits, addend, QCOW2_DISCARD_SNAPSHOT);
>  } else {
> -refcount = get_refcount(bs, l2_offset >> s->cluster_bits);
> +refcount = qcow2_get_refcount(bs, l2_offset >> 
> s->cluster_bits);
>  }
>  if (refcount < 0) {
>  ret = refcount;
> @@ -1243,8 +1243,8 @@ fail:
>   * Checks the OFLAG_COPIED flag for all L1 and L2 entries.
>   *
>   * This function does not print an error message nor does it increment
> - * check_errors if get_refcount fails (this is because such an error will 
> have
> - * been already detected and sufficiently signaled by the calling function
> + * check_errors if qcow2_get_refcount fails (this is because such an error 
> will
> + * have been already detected and sufficiently signaled by the calling 
> function
>   * (qcow2_check_refcounts) by the time this function is called).
>   */
>  static int check_oflag_copied(BlockDriverState *bs, BdrvCheckResult *res,
> @@ -1265,7 +1265,7 @@ static int check_oflag_copied(BlockDriverState *bs, 
> BdrvCheckResult *res,
>  continue;
>  }
>  
> -refcount = get_refcount(bs, l2_offset >> s->cluster_bits);
> +refcount = qcow2_get_refcount(bs, l2_offset >> s->cluster_bits);
>  if (refcount < 0) {
>  /* don't print message nor increment check_errors */
>  continue;
> @@ -1307,7 +1307,8 @@ static int check_oflag_copied(BlockDriverState *bs, 
> BdrvCheckResult *res,
>  
>  if ((cluster_type == QCOW2_CLUSTER_NORMAL) ||
>  ((cluster_type == QCOW2_CLUSTER_ZERO) && (data_offset != 
> 0))) {
> -refcount = get_refcount(bs, data_offset >> s->cluster_bits);
> +refcount = qcow2_get_refcount(bs,
> +  data_offset >> 
> s->cluster_bits);
>  if (refcount < 0) {
>  /* don't print message nor increment check_errors */
>  continue;
> @@ -1597,7 +1598,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, 
> BdrvCheckResult *res,
>  
>  /* compare ref counts */
>  for (i = 0, highest_cluster = 0; i < nb_clusters; i++) {
> -refcount1 = get_refcount(bs, i);
> +refcount1 = qco

Re: [Qemu-devel] [PATCH alt 6/7] block/qcow2: Simplify shared L2 handling in amend

2014-07-31 Thread Benoît Canet

The Saturday 26 Jul 2014 à 21:22:10 (+0200), Max Reitz wrote :

> Currently, we have a bitmap for keeping track of which clusters have
> been created during the zero cluster expansion process. This was
> necessary because we need to properly increase the refcount for shared
> L2 tables.
> 
> However, now we can simply take the L2 refcount and use it for the
> cluster allocated for expansion. This will be the correct refcount and
> therefore we don't have to remember that cluster having been allocated
> any more.
> 
> Signed-off-by: Max Reitz 
> ---
>  block/qcow2-cluster.c | 90 
> ---
>  1 file changed, 28 insertions(+), 62 deletions(-)
> 
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index f8bec6f..e6bff40 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -1543,20 +1543,12 @@ fail:
>   * Expands all zero clusters in a specific L1 table (or deallocates them, for
>   * non-backed non-pre-allocated zero clusters).
>   *
> - * expanded_clusters is a bitmap where every bit corresponds to one cluster 
> in
> - * the image file; a bit gets set if the corresponding cluster has been used 
> for
> - * zero expansion (i.e., has been filled with zeroes and is referenced from 
> an
> - * L2 table). nb_clusters contains the total cluster count of the image file,
> - * i.e., the number of bits in expanded_clusters.
> - *
>   * l1_entries and *visited_l1_entries are ued to keep track of progress for
>   * status_cb(). l1_entries contains the total number of L1 entries and
>   * *visited_l1_entries counts all visited L1 entries.
>   */
>  static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t 
> *l1_table,
> -  int l1_size, uint8_t 
> **expanded_clusters,
> -  uint64_t *nb_clusters,
> -  int64_t *visited_l1_entries,
> +  int l1_size, int64_t 
> *visited_l1_entries,
>int64_t l1_entries,
>BlockDriverAmendStatusCB *status_cb)
>  {
> @@ -1575,6 +1567,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState 
> *bs, uint64_t *l1_table,
>  for (i = 0; i < l1_size; i++) {
>  uint64_t l2_offset = l1_table[i] & L1E_OFFSET_MASK;
>  bool l2_dirty = false;
> +int l2_refcount;
>  
>  if (!l2_offset) {
>  /* unallocated */
> @@ -1595,33 +1588,19 @@ static int 
> expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
>  goto fail;
>  }
>  
> +l2_refcount = qcow2_get_refcount(bs, l2_offset >> s->cluster_bits);
> +if (l2_refcount < 0) {
> +ret = l2_refcount;
> +goto fail;
> +}
> +
>  for (j = 0; j < s->l2_size; j++) {
>  uint64_t l2_entry = be64_to_cpu(l2_table[j]);
> -int64_t offset = l2_entry & L2E_OFFSET_MASK, cluster_index;
> +int64_t offset = l2_entry & L2E_OFFSET_MASK;
>  int cluster_type = qcow2_get_cluster_type(l2_entry);
>  bool preallocated = offset != 0;
>  
> -if (cluster_type == QCOW2_CLUSTER_NORMAL) {
> -cluster_index = offset >> s->cluster_bits;
> -assert((cluster_index >= 0) && (cluster_index < 
> *nb_clusters));
> -if ((*expanded_clusters)[cluster_index / 8] &
> -(1 << (cluster_index % 8))) {
> -/* Probably a shared L2 table; this cluster was a zero
> - * cluster which has been expanded, its refcount
> - * therefore most likely requires an update. */
> -ret = qcow2_update_cluster_refcount(bs, cluster_index, 1,
> -QCOW2_DISCARD_NEVER);
> -if (ret < 0) {
> -goto fail;
> -}
> -/* Since we just increased the refcount, the COPIED flag 
> may
> - * no longer be set. */
> -l2_table[j] = cpu_to_be64(l2_entry & ~QCOW_OFLAG_COPIED);
> -l2_dirty = true;
> -}
> -continue;
> -}
> -else if (qcow2_get_cluster_type(l2_entry) != QCOW2_CLUSTER_ZERO) 
> {
> +if (cluster_type != QCOW2_CLUSTER_ZERO) {
>  continue;
>  }
>  
> @@ -1639,6 +1618,19 @@ static int expand_zero_clusters_in_l1(BlockDriverState 
> *bs, uint64_t *l1_table,
>  ret = offset;
>  goto fail;
>  }
> +
> +if (l2_refcount > 1) {
> +/* For shared L2 tables, set the refcount accordingly 
> (it is
> + * already 1 and needs to be l2_refcount) */
> +ret = qcow2_update_cluster_refcount(bs,
>

Re: [Qemu-devel] [PATCH alt 7/7] iotests: Expand test 061

2014-07-31 Thread Benoît Canet

The Saturday 26 Jul 2014 à 21:22:11 (+0200), Max Reitz wrote :
> Add some tests for progress output to 061.
> 
> Signed-off-by: Max Reitz 
> ---
>  tests/qemu-iotests/061 | 27 +++
>  tests/qemu-iotests/061.out | 32 
>  tests/qemu-iotests/group   |  2 +-
>  3 files changed, 60 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/qemu-iotests/061 b/tests/qemu-iotests/061
> index ab98def..fbb5897 100755
> --- a/tests/qemu-iotests/061
> +++ b/tests/qemu-iotests/061
> @@ -209,6 +209,33 @@ $QEMU_IMG amend -o "compat=0.10" "$TEST_IMG"
>  _check_test_img
>  $QEMU_IO -c "read -P 0 0 64M" "$TEST_IMG" | _filter_qemu_io
>  
> +echo
> +echo "=== Testing progress report without snapshot ==="
> +echo
> +IMGOPTS="compat=1.1" TEST_IMG="$TEST_IMG.base" _make_test_img 4G
> +IMGOPTS="compat=1.1" _make_test_img -b "$TEST_IMG.base" 4G
> +$QEMU_IO -c "write -z 0  64k" \
> + -c "write -z 1G 64k" \
> + -c "write -z 2G 64k" \
> + -c "write -z 3G 64k" "$TEST_IMG" | _filter_qemu_io
> +$QEMU_IMG amend -p -o "compat=0.10" "$TEST_IMG"
> +_check_test_img
> +
> +echo
> +echo "=== Testing progress report with snapshot ==="
> +echo
> +IMGOPTS="compat=1.1" TEST_IMG="$TEST_IMG.base" _make_test_img 4G
> +IMGOPTS="compat=1.1" _make_test_img -b "$TEST_IMG.base" 4G
> +$QEMU_IO -c "write -z 0  64k" \
> + -c "write -z 1G 64k" \
> + -c "write -z 2G 64k" \
> + -c "write -z 3G 64k" "$TEST_IMG" | _filter_qemu_io
> +$QEMU_IMG snapshot -c foo "$TEST_IMG"


> +# COW L2 table 0
It's not clear what you are trying to achieve here since
it does not appear on the tests output.
Maybe that what you are testing is that it will output nothing
but you should notice it.

> +$QEMU_IO -c "write -z 64k 64k" "$TEST_IMG" | _filter_qemu_io
> +$QEMU_IMG amend -p -o "compat=0.10" "$TEST_IMG"
> +_check_test_img
> +
>  # success, all done
>  echo "*** done"
>  rm -f $seq.full
> diff --git a/tests/qemu-iotests/061.out b/tests/qemu-iotests/061.out
> index e372470..352cca3 100644
> --- a/tests/qemu-iotests/061.out
> +++ b/tests/qemu-iotests/061.out
> @@ -384,4 +384,36 @@ wrote 67108864/67108864 bytes at offset 0
>  No errors were found on the image.
>  read 67108864/67108864 bytes at offset 0
>  64 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> +
> +=== Testing progress report without snapshot ===
> +
> +Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=4294967296 
> +Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=4294967296 
> backing_file='TEST_DIR/t.IMGFMT.base' 
> +wrote 65536/65536 bytes at offset 0
> +64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> +wrote 65536/65536 bytes at offset 1073741824
> +64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> +wrote 65536/65536 bytes at offset 2147483648
> +64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> +wrote 65536/65536 bytes at offset 3221225472
> +64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> +(0.00/100%)
> (12.50/100%)
> (37.50/100%)
> (62.50/100%)
> (87.50/100%)
> (100.00/100%)
> +No errors were found on the image.
> +
> +=== Testing progress report with snapshot ===
> +
> +Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=4294967296 
> +Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=4294967296 
> backing_file='TEST_DIR/t.IMGFMT.base' 
> +wrote 65536/65536 bytes at offset 0
> +64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> +wrote 65536/65536 bytes at offset 1073741824
> +64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> +wrote 65536/65536 bytes at offset 2147483648
> +64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> +wrote 65536/65536 bytes at offset 3221225472
> +64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> +wrote 65536/65536 bytes at offset 65536
> +64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> +(0.00/100%)
> (6.25/100%)
> (18.75/100%)
> (31.25/100%)
> (43.75/100%)
> (56.25/100%)
> (68.75/100%)
> (81.25/100%)
> (93.75/100%)
> (100.00/100%)
> +No errors were found on the image.
>  *** done
> diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
> index 6e67f61..b27e2f9 100644
> --- a/tests/qemu-iotests/group
> +++ b/tests/qemu-iotests/group
> @@ -67,7 +67,7 @@
>  058 rw auto quick
>  059 rw auto quick
>  060 rw auto quick
> -061 rw auto quick
> +061 rw auto
>  062 rw auto quick
>  063 rw auto quick
>  064 rw auto quick
> -- 
> 2.0.3
> 
>

Re: [Qemu-devel] [PATCH 14/15] hw/block/virtio-blk: create num_queues vqs if dataplane is enabled

2014-07-31 Thread Paolo Bonzini

Il 31/07/2014 05:47, Ming Lei ha scritto:
> On Wed, Jul 30, 2014 at 11:25 PM, Paolo Bonzini  wrote:
>> Il 30/07/2014 17:12, Michael S. Tsirkin ha scritto:
>
> Dataplane must not be a change to the guest ABI.  If you implement this
> feature you have to implement it for both dataplane and non-dataplne.
>
> 
> IMO, no matter if the feature is set or not, both old and new VM
> can support it well.
> 
> Per virtio spec, only the feature is set, the VM can be allowed to
> access the 'num_queues' field, and I didn't see any problem from
> VM's view point.
> 
> So could you explain why both dataplane and non-dataplane have
> to support the feature.

Because otherwise you change the guest ABI.

> I am doing so because there isn't performance advantage to take mq
> for non-dataplane.

It doesn't matter, changing features between dataplane and non-dataplane
is not something that you can do.

>>> Actually, I think different backends at runtime should be allowed to
>>> change guest visible ABI.  For example if you give qemu a read only
>>> backend this is guest visible right?
>>
>> Dataplane is not meant to be a different backend, it's just moving stuff
>> out to a separate thread.  It's only due to QEMU limitation that it
>> doesn't share the vring code (and these patches include a lot of steps
>> backwards where dataplane becomes != non-dataplane).
> 
> There won't lots of backwards steps, just the bypass co patch, which
> is just bypassing co in case of being unnecessary for raw image case,
> but it is still one code path.

Using the object pool for dataplane only is a backwards step,
implementing multique for dataplane only is a backwards step, bypassing
coroutines for dataplane only is a backwards step.

Paolo

Re: [Qemu-devel] [v2][PATCH 0/5] xen: introduce new machine for IGD passthrough

2014-07-31 Thread Michael S. Tsirkin

On Thu, Jul 31, 2014 at 02:31:34PM +0800, Tiejun Chen wrote:
> v2:
> 
> * Fix some coding style
> * New patch to separate i440fx_init
> * Just add prefix with XEN_IGD_PASSTHROUGH/xen_igd_passthrough
> * Based on patch #2 to regenerate
> * Unify prefix with XEN_IGD_PASSTHROUGH/xen_igd_passthrough like patch #3
> * Test: boot with a preinstalled ubuntu 14.04
>   ./i386-softmmu/qemu-system-i386 -hda test.img -m 2560 -boot c -machine pc

You need to test with windows guests as well, linux often can boot
on bad hardware.

> As we discussed we need to create a separate machine to support current
> IGD passthrough.
> 
> 
> Tiejun Chen (5):
>   hw:i386:pc_piix: split pc_init1()
>   hw:pci-host:piix: split i440fx_init
>   xen:hw:pci-host:piix: create host bridge to passthrough
>   xen:hw:pci-host:piix: introduce xen_igd_passthrough_i440fx_init
>   xen:hw:i386:pc_piix: introduce new machine for IGD passthrough
> 
>  hw/i386/pc_piix.c| 222 
> +---
>  hw/pci-host/piix.c   | 165 
> +++-
>  include/hw/i386/pc.h |  11 +
>  3 files changed, 358 insertions(+), 40 deletions(-)
> 
> Thanks
> Tiejun

Re: [Qemu-devel] [PATCH 01/15] qemu coroutine: support bypass mode

2014-07-31 Thread Ming Lei

On Thu, Jul 31, 2014 at 7:37 AM, Paolo Bonzini  wrote:
> Il 30/07/2014 19:15, Ming Lei ha scritto:
>> On Wed, Jul 30, 2014 at 9:45 PM, Paolo Bonzini  wrote:
>>> Il 30/07/2014 13:39, Ming Lei ha scritto:
 This patch introduces several APIs for supporting bypass qemu coroutine
 in case of being not necessary and for performance's sake.
>>>
>>> No, this is wrong.  Dataplane *must* use the same code as non-dataplane,
>>> anything else is a step backwards.
>>
>> As we saw, coroutine has brought up performance regression
>> on dataplane, and it isn't necessary to use co in some cases, is it?
>
> Yes, and it's not necessary on non-dataplane either.  It's not necessary
> on virtio-scsi, and it will not be necessary on virtio-scsi dataplane
> either.
>
>>> If you want to bypass coroutines, bdrv_aio_readv/writev must detect the
>>> conditions that allow doing that and call the bdrv_aio_readv/writev
>>> directly.
>>
>> That is easy to detect, please see the 5th patch.
>
> No, that's not enough.  Dataplane right now prevents block jobs, but
> that's going to change and it could require coroutines even for raw devices.
>
>>> To begin with, have you benchmarked QEMU and can you provide a trace of
>>> *where* the coroutine overhead lies?
>>
>> I guess it may be caused by the stack switch, at least in one of
>> my box, bypassing co can improve throughput by ~7%, and by
>> ~15% in another box.
>
> No guesses please.  Actually that's also my guess, but since you are
> submitting the patch you must do better and show profiles where stack
> switching disappears after the patches.

Follows the below hardware events reported by 'perf stat' when running
fio randread benchmark for 2min in VM(single vq, 2 jobs):

sudo ~/bin/perf stat -e
L1-dcache-loads,L1-dcache-load-misses,cpu-cycles,instructions,branch-instructions,branch-misses,branch-loads,branch-load-misses,dTLB-loads,dTLB-load-misses
./nqemu-start-mq 4 1

1), without bypassing coroutine via forcing to set 's->raw_format ' as
false, see patch 5/15

- throughout: 95K

  Performance counter stats for './nqemu-start-mq 4 1':

69,231,035,842  L1-dcache-loads
  [40.10%]
 1,909,978,930  L1-dcache-load-misses #2.76% of all
L1-dcache hits   [39.98%]
   263,731,501,086  cpu-cycles[40.03%]
   232,564,905,115  instructions  #0.88  insns per
cycle [50.23%]
46,157,868,745  branch-instructions
  [49.82%]
   785,618,591  branch-misses #1.70% of all
branches [49.99%]
46,280,342,654  branch-loads
  [49.95%]
34,934,790,140  branch-load-misses
  [50.02%]
69,447,857,237  dTLB-loads
  [40.13%]
   169,617,374  dTLB-load-misses  #0.24% of all
dTLB cache hits  [40.04%]

 161.991075781 seconds time elapsed


2), with bypassing coroutinue
- throughput: 115K
 Performance counter stats for './nqemu-start-mq 4 1':

76,784,224,509  L1-dcache-loads
  [39.93%]
 1,334,036,447  L1-dcache-load-misses #1.74% of all
L1-dcache hits   [39.91%]
   262,697,428,470  cpu-cycles[40.03%]
   255,526,629,881  instructions  #0.97  insns per
cycle [50.01%]
50,160,082,611  branch-instructions
  [49.97%]
   564,407,788  branch-misses #1.13% of all
branches [50.08%]
50,331,510,702  branch-loads
  [50.08%]
35,760,766,459  branch-load-misses
  [50.03%]
76,706,000,951  dTLB-loads
  [40.00%]
   123,291,001  dTLB-load-misses  #0.16% of all
dTLB cache hits  [40.02%]

 162.333465490 seconds time elapsed

Re: [Qemu-devel] [PATCH 01/15] qemu coroutine: support bypass mode

2014-07-31 Thread Paolo Bonzini

Il 31/07/2014 10:59, Ming Lei ha scritto:
>> > No guesses please.  Actually that's also my guess, but since you are
>> > submitting the patch you must do better and show profiles where stack
>> > switching disappears after the patches.
> Follows the below hardware events reported by 'perf stat' when running
> fio randread benchmark for 2min in VM(single vq, 2 jobs):
> 
> sudo ~/bin/perf stat -e
> L1-dcache-loads,L1-dcache-load-misses,cpu-cycles,instructions,branch-instructions,branch-misses,branch-loads,branch-load-misses,dTLB-loads,dTLB-load-misses
> ./nqemu-start-mq 4 1
> 
> 1), without bypassing coroutine via forcing to set 's->raw_format ' as
> false, see patch 5/15
> 
> - throughout: 95K
>232,564,905,115  instructions
>  161.991075781 seconds time elapsed
> 
> 
> 2), with bypassing coroutinue
> - throughput: 115K
>255,526,629,881  instructions
>  162.333465490 seconds time elapsed

Ok, so you are saving 10% instructions per iop: before 232G / 95K =
2.45M instructions/iop, 255G / 115K = 2.22M instructions/iop.

That's not small, and it's a good thing for CPU utilization even if you
were not increasing iops.  On top of this, can you provide the stack
traces to see the difference in the profiles?

Paolo

Re: [Qemu-devel] [PATCH 07/15] dataplane: use object pool to speed up allocation for virtio blk request

2014-07-31 Thread Paolo Bonzini

Il 31/07/2014 05:22, Ming Lei ha scritto:
>> >
>> > The problem is that g_slice here is not using the slab-style allocator
>> > because the object is larger than roughly 500 bytes.  One solution would
>> > be to make virtqueue_pop/vring_pop allocate a VirtQueueElement of the
>> > right size (and virtqueue_push/vring_push free it), as mentioned in the
>> > review of patch 8.
> Unluckily both iovec and addr array can't be fitted into 500 bytes, :-(
> Not mention all users of VirtQueueElement need to be changed too,
> I hate to make that work involved in this patchset, :-)

Well, the point of dataplane was not just to get maximum iops.  It was
also to provide guidance in the work necessary to improve the code and
get maximum iops without special-casing everything.  This can be a lot
of work indeed.

>> >
>> > However, I now remembered that VirtQueueElement is a mess because it's
>> > serialized directly into the migration state. :(  So you basically
>> > cannot change it without mucking with migration.  Please leave out patch
>> > 8 for now.
> save_device code serializes elem in this way:
> 
>  qemu_put_buffer(f, (unsigned char *)&req->elem,
> sizeof(VirtQueueElement));
> 
> so I am wondering why this patch may break migration.

Because you change the on-wire format and break migration from 2.1 to
2.2.  Sorry, I wasn't clear enough.

Paolo

> And in my test live migration can survive with the patch.
>

Re: [Qemu-devel] [v2][PATCH 2/5] hw:pci-host:piix: split i440fx_init

2014-07-31 Thread Michael S. Tsirkin

On Thu, Jul 31, 2014 at 02:31:36PM +0800, Tiejun Chen wrote:
> We'd like to split i440fx_init and then we can share something
> with other stuff.
> 
> Signed-off-by: Tiejun Chen 

I think this is too much work for very little benefit.
Just pass const char *type to i440fx_init.

-->

i440fx: make type configurable at run-time

Xen wants to supply a different pci host device,
inheriting i440fx. Make this configurable.

Signed-off-by: Michael S. Tsirkin 

--

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index be8fdfe..72e612b 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -230,7 +230,8 @@ extern int no_hpet;
 struct PCII440FXState;
 typedef struct PCII440FXState PCII440FXState;
 
-PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn,
+PCIBus *i440fx_init(const char *type,
+PCII440FXState **pi440fx_state, int *piix_devfn,
 ISABus **isa_bus, qemu_irq *pic,
 MemoryRegion *address_space_mem,
 MemoryRegion *address_space_io,
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 31125b7..5fefde6 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -194,7 +194,8 @@ static void pc_init1(MachineState *machine,
 }
 
 if (pci_enabled) {
-pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, &isa_bus, gsi,
+pci_bus = i440fx_init(TYPE_I440FX_PCI_HOST_BRIDGE,
+  &i440fx_state, &piix3_devfn, &isa_bus, gsi,
   system_memory, system_io, machine->ram_size,
   below_4g_mem_size,
   above_4g_mem_size,
diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index e0e0946..f0c5b5d 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -305,7 +305,8 @@ static int i440fx_initfn(PCIDevice *dev)
 return 0;
 }
 
-PCIBus *i440fx_init(PCII440FXState **pi440fx_state,
+PCIBus *i440fx_init(const char *type,
+PCII440FXState **pi440fx_state,
 int *piix3_devfn,
 ISABus **isa_bus, qemu_irq *pic,
 MemoryRegion *address_space_mem,
@@ -325,7 +326,7 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state,
 unsigned i;
 I440FXState *i440fx;
 
-dev = qdev_create(NULL, TYPE_I440FX_PCI_HOST_BRIDGE);
+dev = qdev_create(NULL, type);
 s = PCI_HOST_BRIDGE(dev);
 b = pci_bus_new(dev, NULL, pci_address_space,
 address_space_io, 0, TYPE_PCI_BUS);

Re: [Qemu-devel] [v2][PATCH 2/5] hw:pci-host:piix: split i440fx_init

2014-07-31 Thread Chen, Tiejun


On 2014/7/31 17:10, Michael S. Tsirkin wrote:

On Thu, Jul 31, 2014 at 02:31:36PM +0800, Tiejun Chen wrote:

We'd like to split i440fx_init and then we can share something
with other stuff.

Signed-off-by: Tiejun Chen 


I think this is too much work for very little benefit.
Just pass const char *type to i440fx_init.


You know we will introduce that faked PCIe device represented that PCH 
later, so how to distinguish them? Are you saying I should check the 
type like this?


if(Xen-Type)
{}
else
{}

If so, actually this is mostly same as my original implementation,

"
@@ -333,8 +348,15 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state,
object_property_add_child(qdev_get_machine(), "i440fx", 
OBJECT(dev), NULL);

qdev_init_nofail(dev);

-d = pci_create_simple(b, 0, TYPE_I440FX_PCI_DEVICE);
-*pi440fx_state = I440FX_PCI_DEVICE(d);
+if (xen_enabled() && xen_has_gfx_passthru) {
+d = pci_create_simple(b, 0, TYPE_I440FX_XEN_PCI_DEVICE);
+*pi440fx_state = I440FX_XEN_PCI_DEVICE(d);
+pci_create_pch(b);
+} else {
+d = pci_create_simple(b, 0, TYPE_I440FX_PCI_DEVICE);
+*pi440fx_state = I440FX_PCI_DEVICE(d);
+}
+
f = *pi440fx_state;
f->system_memory = address_space_mem;
f->pci_address_space = pci_address_space;
"

But as I said to you last time, Paolo doesn't like we walk the common 
codes :)


Tiejun



-->

i440fx: make type configurable at run-time

Xen wants to supply a different pci host device,
inheriting i440fx. Make this configurable.

Signed-off-by: Michael S. Tsirkin 

--

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index be8fdfe..72e612b 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -230,7 +230,8 @@ extern int no_hpet;
  struct PCII440FXState;
  typedef struct PCII440FXState PCII440FXState;

-PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn,
+PCIBus *i440fx_init(const char *type,
+PCII440FXState **pi440fx_state, int *piix_devfn,
  ISABus **isa_bus, qemu_irq *pic,
  MemoryRegion *address_space_mem,
  MemoryRegion *address_space_io,
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 31125b7..5fefde6 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -194,7 +194,8 @@ static void pc_init1(MachineState *machine,
  }

  if (pci_enabled) {
-pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, &isa_bus, gsi,
+pci_bus = i440fx_init(TYPE_I440FX_PCI_HOST_BRIDGE,
+  &i440fx_state, &piix3_devfn, &isa_bus, gsi,
system_memory, system_io, machine->ram_size,
below_4g_mem_size,
above_4g_mem_size,
diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index e0e0946..f0c5b5d 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -305,7 +305,8 @@ static int i440fx_initfn(PCIDevice *dev)
  return 0;
  }

-PCIBus *i440fx_init(PCII440FXState **pi440fx_state,
+PCIBus *i440fx_init(const char *type,
+PCII440FXState **pi440fx_state,
  int *piix3_devfn,
  ISABus **isa_bus, qemu_irq *pic,
  MemoryRegion *address_space_mem,
@@ -325,7 +326,7 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state,
  unsigned i;
  I440FXState *i440fx;

-dev = qdev_create(NULL, TYPE_I440FX_PCI_HOST_BRIDGE);
+dev = qdev_create(NULL, type);
  s = PCI_HOST_BRIDGE(dev);
  b = pci_bus_new(dev, NULL, pci_address_space,
  address_space_io, 0, TYPE_PCI_BUS);

Re: [Qemu-devel] questions about host side of virtio-serial

2014-07-31 Thread Richard W.M. Jones

On Wed, Jul 30, 2014 at 12:52:41PM -0600, Chris Friesen wrote:
> Hi,
> 
> I'm working on a native user of virtio-serial (ie, not going via the
> qemu guest agent).
> 
> The information at "http://www.linux-kvm.org/page/Virtio-serial_API";
> does a good job of describing the guest side of things, but has very
> little information about the host side of things.

You might also want to read the libguestfs source code, since
libguestfs is a major and long-time user of virtio-serial.

In particular these files:

  src/launch-direct.c
  src/proto.c
  src/conn-socket.c
  daemon/proto.c
  src/launch-libvirt.c  # if interested in using virtio-serial from libvirt

> In particular, assuming that the host side is using a chardev mapped
> to a unix socket:
> 
> 1) Is there any way for the host app to get information about
> whether or not the guest is reading the messages?  (i.e. logically
> equivalent to getting POLLHUP in the guest when the host app
> disconnects.)

No, I don't believe that is possible.  It acts like a real serial port
and throws away bytes when no one is listening (on both ends).

> 2) Suppose the host sends a large message.  The guest app reads a
> portion of the message, then crashes.  We respawn the guest app and
> start reading again, but now we're in the middle of a message of
> arbitrary size.  Is there a recommended technique to re-sync the
> host and guest?

AFAIK that's either very difficult or impossible.  Maybe with some
kind of self-synchronizing protocol, or if you ran SLIP/PPP on top of
the raw virtio-serial channel?

In the libguestfs case we wouldn't even try to go there -- if
something in the VM crashes we completely recreate the virtual machine
from scratch.

> 3) Same as 2, but the guest sending to the host and the host app
> crashing partway through.
> 
> 4) If nothing in the guest is reading the data, how much data can
> the host send before it will get an error?

It won't get an error - the sender will block.  Except on ARM where
there is a race condition in virtio-mmio causing writes to be thrown
away (https://bugs.launchpad.net/qemu/+bug/122).

> Is there a way to adjust this?

Not as far as I know.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top

[Qemu-devel] [RFC PATCH 01/10] qom: Make object_child_foreach safe for objects removal

2014-07-31 Thread Alexey Kardashevskiy

Current object_child_foreach() uses QTAILQ_FOREACH() to walk
through children and that makes children removal from the callback
impossible.

This makes object_child_foreach() use QTAILQ_FOREACH_SAFE().

Signed-off-by: Alexey Kardashevskiy 
---

This went to Andreas's qom-next tree, it is here for the reference only.
---
 qom/object.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/qom/object.c b/qom/object.c
index 0e8267b..4a814dc 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -678,10 +678,10 @@ void object_class_foreach(void (*fn)(ObjectClass *klass, 
void *opaque),
 int object_child_foreach(Object *obj, int (*fn)(Object *child, void *opaque),
  void *opaque)
 {
-ObjectProperty *prop;
+ObjectProperty *prop, *next;
 int ret = 0;
 
-QTAILQ_FOREACH(prop, &obj->properties, node) {
+QTAILQ_FOREACH_SAFE(prop, &obj->properties, node, next) {
 if (object_property_is_child(prop)) {
 ret = fn(prop->opaque, opaque);
 if (ret != 0) {
-- 
2.0.0

[Qemu-devel] [RFC PATCH 09/10] spapr_pci_vfio: Enable DDW

2014-07-31 Thread Alexey Kardashevskiy

This implements DDW for VFIO. Host kernel support is required for this.

Signed-off-by: Alexey Kardashevskiy 
---
 hw/ppc/spapr_pci_vfio.c | 75 +
 1 file changed, 75 insertions(+)

diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
index d3bddf2..dc443e2 100644
--- a/hw/ppc/spapr_pci_vfio.c
+++ b/hw/ppc/spapr_pci_vfio.c
@@ -69,6 +69,77 @@ static void spapr_phb_vfio_finish_realize(sPAPRPHBState 
*sphb, Error **errp)
 /* Register default 32bit DMA window */
 memory_region_add_subregion(&sphb->iommu_root, tcet->bus_offset,
 spapr_tce_get_iommu(tcet));
+
+sphb->ddw_supported = !!(info.flags & VFIO_IOMMU_SPAPR_TCE_FLAG_DDW);
+}
+
+static int spapr_pci_vfio_ddw_query(sPAPRPHBState *sphb,
+uint32_t *windows_available,
+uint32_t *page_size_mask)
+{
+sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
+struct vfio_iommu_spapr_tce_query query = { .argsz = sizeof(query) };
+int ret;
+
+ret = vfio_container_ioctl(&sphb->iommu_as, svphb->iommugroupid,
+   VFIO_IOMMU_SPAPR_TCE_QUERY, &query);
+if (ret) {
+return ret;
+}
+
+*windows_available = query.windows_available;
+*page_size_mask = query.page_size_mask;
+
+return ret;
+}
+
+static int spapr_pci_vfio_ddw_create(sPAPRPHBState *sphb, uint32_t page_shift,
+ uint32_t window_shift, uint32_t liobn,
+ sPAPRTCETable **ptcet)
+{
+sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
+struct vfio_iommu_spapr_tce_create create = {
+.argsz = sizeof(create),
+.page_shift = page_shift,
+.window_shift = window_shift,
+.start_addr = 0
+};
+int ret;
+
+ret = vfio_container_ioctl(&sphb->iommu_as, svphb->iommugroupid,
+   VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
+if (ret) {
+return ret;
+}
+
+*ptcet = spapr_tce_new_table(DEVICE(sphb), liobn, create.start_addr,
+ page_shift, 1 << (window_shift - page_shift),
+ true);
+memory_region_add_subregion(&sphb->iommu_root, (*ptcet)->bus_offset,
+spapr_tce_get_iommu(*ptcet));
+
+return ret;
+}
+
+static int spapr_pci_vfio_ddw_remove(sPAPRPHBState *sphb, sPAPRTCETable *tcet)
+{
+sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
+struct vfio_iommu_spapr_tce_remove remove = {
+.argsz = sizeof(remove),
+.start_addr = tcet->bus_offset
+};
+
+return vfio_container_ioctl(&sphb->iommu_as, svphb->iommugroupid,
+VFIO_IOMMU_SPAPR_TCE_REMOVE, &remove);
+}
+
+static int spapr_pci_vfio_ddw_reset(sPAPRPHBState *sphb)
+{
+sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
+struct vfio_iommu_spapr_tce_reset reset = { .argsz = sizeof(reset) };
+
+return vfio_container_ioctl(&sphb->iommu_as, svphb->iommugroupid,
+VFIO_IOMMU_SPAPR_TCE_RESET, &reset);
 }
 
 static void spapr_phb_vfio_reset(DeviceState *qdev)
@@ -84,6 +155,10 @@ static void spapr_phb_vfio_class_init(ObjectClass *klass, 
void *data)
 dc->props = spapr_phb_vfio_properties;
 dc->reset = spapr_phb_vfio_reset;
 spc->finish_realize = spapr_phb_vfio_finish_realize;
+spc->ddw_query = spapr_pci_vfio_ddw_query;
+spc->ddw_create = spapr_pci_vfio_ddw_create;
+spc->ddw_remove = spapr_pci_vfio_ddw_remove;
+spc->ddw_reset = spapr_pci_vfio_ddw_reset;
 }
 
 static const TypeInfo spapr_phb_vfio_info = {
-- 
2.0.0

[Qemu-devel] [RFC PATCH 00/10] spapr: vfio: Enable Dynamic DMA windows (DDW)

2014-07-31 Thread Alexey Kardashevskiy

At the moment sPAPR PHB supports only a single 32bit window
which is normally 1..2GB which is not enough for high performance devices.

PAPR spec enables creating an additional window(s) to support 64bit
DMA and bigger page sizes.

This patchset adds DDW support for pseries. The host kernel changes are
required.

This was tested on POWER8 system which allows one additional DMA window
which is mapped at 0x800... and supports 16MB pages.
Existing guests check for DDW capabilities in PHB's device tree and if it
is present, they request for an additional window and map entire guest RAM
using H_PUT_TCE/... hypercalls once at boot time and switch to direct DMA
operations.

TCE tables still may be big enough for guests backed with 64K pages but they
are reasonably small for guests backed by 16MB pages.

Please comment. Thanks!




Alexey Kardashevskiy (10):
  qom: Make object_child_foreach safe for objects removal
  spapr_iommu: Disable in-kernel IOMMU tables for >4GB windows
  spapr_pci: Make find_phb()/find_dev() public
  spapr_iommu: Make spapr_tce_find_by_liobn() public
  linux headers update for DDW
  spapr_rtas: Add Dynamic DMA windows (DDW) RTAS calls support
  spapr: Add "ddw" machine option
  spapr_pci: Enable DDW
  spapr_pci_vfio: Enable DDW
  vfio: Enable DDW ioctls to VFIO IOMMU driver

 hw/misc/vfio.c  |   4 +
 hw/ppc/Makefile.objs|   3 +
 hw/ppc/spapr.c  |  15 +++
 hw/ppc/spapr_iommu.c|   8 +-
 hw/ppc/spapr_pci.c  |  87 +++--
 hw/ppc/spapr_pci_vfio.c |  75 +++
 hw/ppc/spapr_rtas_ddw.c | 296 
 include/hw/pci-host/spapr.h |  27 
 include/hw/ppc/spapr.h  |   7 +-
 linux-headers/linux/vfio.h  |  37 +-
 qom/object.c|   4 +-
 trace-events|   4 +
 vl.c|   4 +
 13 files changed, 552 insertions(+), 19 deletions(-)
 create mode 100644 hw/ppc/spapr_rtas_ddw.c

-- 
2.0.0

[Qemu-devel] [RFC PATCH 06/10] spapr_rtas: Add Dynamic DMA windows (DDW) RTAS calls support

2014-07-31 Thread Alexey Kardashevskiy

This adds support for Dynamic DMA Windows (DDW) option defined by
the SPAPR specification which allows to have additional DMA window(s)
which can support page sizes other than 4K.

The existing implementation of DDW in the guest tries to create one huge
DMA window with 64K or 16MB pages and map the entire guest RAM to. If it
succeeds, the guest switches to dma_direct_ops and never calls
TCE hypercalls (H_PUT_TCE,...) again. This enables VFIO devices to use
the entire RAM and not waste time on map/unmap.

This adds 4 RTAS handlers:
* ibm,query-pe-dma-window
* ibm,create-pe-dma-window
* ibm,remove-pe-dma-window
* ibm,reset-pe-dma-window
These are registered from type_init() callback.

These RTAS handlers are implemented in a separate file to avoid polluting
spapr_iommu.c with PHB.

Signed-off-by: Alexey Kardashevskiy 
---
 hw/ppc/Makefile.objs|   3 +
 hw/ppc/spapr_rtas_ddw.c | 296 
 include/hw/pci-host/spapr.h |  18 +++
 include/hw/ppc/spapr.h  |   6 +-
 trace-events|   4 +
 5 files changed, 326 insertions(+), 1 deletion(-)
 create mode 100644 hw/ppc/spapr_rtas_ddw.c

diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
index edd44d0..9773294 100644
--- a/hw/ppc/Makefile.objs
+++ b/hw/ppc/Makefile.objs
@@ -7,6 +7,9 @@ obj-$(CONFIG_PSERIES) += spapr_pci.o
 ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy)
 obj-y += spapr_pci_vfio.o
 endif
+ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES), yy)
+obj-y += spapr_rtas_ddw.o
+endif
 # PowerPC 4xx boards
 obj-y += ppc405_boards.o ppc4xx_devs.o ppc405_uc.o ppc440_bamboo.o
 obj-y += ppc4xx_pci.o
diff --git a/hw/ppc/spapr_rtas_ddw.c b/hw/ppc/spapr_rtas_ddw.c
new file mode 100644
index 000..943af2c
--- /dev/null
+++ b/hw/ppc/spapr_rtas_ddw.c
@@ -0,0 +1,296 @@
+/*
+ * QEMU sPAPR Dynamic DMA windows support
+ *
+ * Copyright (c) 2014 Alexey Kardashevskiy, IBM Corporation.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License,
+ *  or (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see .
+ */
+
+#include "hw/ppc/spapr.h"
+#include "hw/pci-host/spapr.h"
+#include "trace.h"
+
+static inline uint32_t spapr_iommu_fixmask(uint32_t cur_mask,
+   struct ppc_one_seg_page_size *sps,
+   uint32_t query_mask,
+   int shift,
+   uint32_t add_mask)
+{
+if ((sps->page_shift == shift) && (query_mask & add_mask)) {
+cur_mask |= add_mask;
+}
+return cur_mask;
+}
+
+static void rtas_ibm_query_pe_dma_window(PowerPCCPU *cpu,
+ sPAPREnvironment *spapr,
+ uint32_t token, uint32_t nargs,
+ target_ulong args,
+ uint32_t nret, target_ulong rets)
+{
+CPUPPCState *env = &cpu->env;
+sPAPRPHBState *sphb;
+sPAPRPHBClass *spc;
+uint64_t buid;
+uint32_t addr, pgmask = 0;
+uint32_t windows_available = 0, page_size_mask = 0;
+long ret, i;
+
+if ((nargs != 3) || (nret != 5)) {
+goto param_error_exit;
+}
+
+buid = ((uint64_t)rtas_ld(args, 1) << 32) | rtas_ld(args, 2);
+addr = rtas_ld(args, 0);
+sphb = spapr_pci_find_phb(spapr, buid);
+if (!sphb) {
+goto param_error_exit;
+}
+
+spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
+if (!spc->ddw_query) {
+goto hw_error_exit;
+}
+
+ret = spc->ddw_query(sphb, &windows_available, &page_size_mask);
+trace_spapr_iommu_ddw_query(buid, addr, windows_available,
+page_size_mask, pgmask, ret);
+if (ret) {
+goto hw_error_exit;
+}
+
+/* DBG! */
+if (!(page_size_mask & DDW_PGSIZE_16M)) {
+goto hw_error_exit;
+}
+
+/* Work out biggest possible page size */
+for (i = 0; i < PPC_PAGE_SIZES_MAX_SZ; i++) {
+int j;
+struct ppc_one_seg_page_size *sps = &env->sps.sps[i];
+const struct { int shift; uint32_t mask; } masks[] = {
+{ 12, DDW_PGSIZE_4K },
+{ 16, DDW_PGSIZE_64K },
+{ 24, DDW_PGSIZE_16M },
+{ 25, DDW_PGSIZE_32M },
+{ 26, DDW_PGSIZE_64M },
+{ 27, DDW_PGSIZE_128M },
+{ 28, DDW_PGSIZE_256M },
+{ 34, DDW_PGSIZE_16G },
+};
+

[Qemu-devel] [RFC PATCH 08/10] spapr_pci: Enable DDW

2014-07-31 Thread Alexey Kardashevskiy

This implements DDW for emulated PHB.

This advertises DDW in device tree.

Signed-off-by: Alexey Kardashevskiy 
---

The DDW has not been tested as QEMU does not implement any 64bit DMA capable
device and existing linux guests do not use DDW for 32bit DMA.
---
 hw/ppc/spapr_pci.c  | 65 +
 include/hw/pci-host/spapr.h |  5 
 2 files changed, 70 insertions(+)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 230b59c..d1f4c86 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -22,6 +22,7 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
+#include "sysemu/sysemu.h"
 #include "hw/hw.h"
 #include "hw/pci/pci.h"
 #include "hw/pci/msi.h"
@@ -650,6 +651,8 @@ static void spapr_phb_finish_realize(sPAPRPHBState *sphb, 
Error **errp)
 /* Register default 32bit DMA window */
 memory_region_add_subregion(&sphb->iommu_root, 0,
 spapr_tce_get_iommu(tcet));
+
+sphb->ddw_supported = true;
 }
 
 static int spapr_phb_children_reset(Object *child, void *opaque)
@@ -781,6 +784,42 @@ static const char *spapr_phb_root_bus_path(PCIHostState 
*host_bridge,
 return sphb->dtbusname;
 }
 
+static int spapr_pci_ddw_query(sPAPRPHBState *sphb,
+   uint32_t *windows_available,
+   uint32_t *page_size_mask)
+{
+*windows_available = 1;
+*page_size_mask = DDW_PGSIZE_16M;
+
+return 0;
+}
+
+static int spapr_pci_ddw_create(sPAPRPHBState *sphb, uint32_t page_shift,
+uint32_t window_shift, uint32_t liobn,
+sPAPRTCETable **ptcet)
+{
+*ptcet = spapr_tce_new_table(DEVICE(sphb), liobn, SPAPR_PCI_TCE64_START,
+ page_shift, 1 << (window_shift - page_shift),
+ true);
+if (!*ptcet) {
+return -1;
+}
+memory_region_add_subregion(&sphb->iommu_root, (*ptcet)->bus_offset,
+spapr_tce_get_iommu(*ptcet));
+
+return 0;
+}
+
+static int spapr_pci_ddw_remove(sPAPRPHBState *sphb, sPAPRTCETable *tcet)
+{
+return 0;
+}
+
+static int spapr_pci_ddw_reset(sPAPRPHBState *sphb)
+{
+return 0;
+}
+
 static void spapr_phb_class_init(ObjectClass *klass, void *data)
 {
 PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(klass);
@@ -795,6 +834,10 @@ static void spapr_phb_class_init(ObjectClass *klass, void 
*data)
 set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
 dc->cannot_instantiate_with_device_add_yet = false;
 spc->finish_realize = spapr_phb_finish_realize;
+spc->ddw_query = spapr_pci_ddw_query;
+spc->ddw_create = spapr_pci_ddw_create;
+spc->ddw_remove = spapr_pci_ddw_remove;
+spc->ddw_reset = spapr_pci_ddw_reset;
 }
 
 static const TypeInfo spapr_phb_info = {
@@ -878,6 +921,14 @@ int spapr_populate_pci_dt(sPAPRPHBState *phb,
 uint32_t interrupt_map_mask[] = {
 cpu_to_be32(b_d(-1)|b_fff(0)), 0x0, 0x0, cpu_to_be32(-1)};
 uint32_t interrupt_map[PCI_SLOT_MAX * PCI_NUM_PINS][7];
+uint32_t ddw_applicable[] = {
+RTAS_IBM_QUERY_PE_DMA_WINDOW,
+RTAS_IBM_CREATE_PE_DMA_WINDOW,
+RTAS_IBM_REMOVE_PE_DMA_WINDOW
+};
+uint32_t ddw_extensions[] = { 1, RTAS_IBM_RESET_PE_DMA_WINDOW };
+sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(phb);
+QemuOpts *machine_opts = qemu_get_machine_opts();
 
 /* Start populating the FDT */
 sprintf(nodename, "pci@%" PRIx64, phb->buid);
@@ -907,6 +958,20 @@ int spapr_populate_pci_dt(sPAPRPHBState *phb,
 _FDT(fdt_setprop_cell(fdt, bus_off, "ibm,pci-config-space-type", 0x1));
 _FDT(fdt_setprop_cell(fdt, bus_off, "ibm,pe-total-#msi", XICS_IRQS));
 
+/* Dynamic DMA window */
+if (qemu_opt_get_bool(machine_opts, "ddw", true) &&
+phb->ddw_supported &&
+spc->ddw_query && spc->ddw_create && spc->ddw_remove) {
+_FDT(fdt_setprop(fdt, bus_off, "ibm,ddw-applicable", &ddw_applicable,
+ sizeof(ddw_applicable)));
+
+if (spc->ddw_reset) {
+/* When enabled, the guest will remove the default 32bit window */
+_FDT(fdt_setprop(fdt, bus_off, "ibm,ddw-extensions",
+ &ddw_extensions, sizeof(ddw_extensions)));
+}
+}
+
 /* Build the interrupt-map, this must matches what is done
  * in pci_spapr_map_irq
  */
diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
index 119d326..2046356 100644
--- a/include/hw/pci-host/spapr.h
+++ b/include/hw/pci-host/spapr.h
@@ -103,6 +103,8 @@ struct sPAPRPHBState {
 int32_t msi_devs_num;
 spapr_pci_msi_mig *msi_devs;
 
+bool ddw_supported;
+
 QLIST_ENTRY(sPAPRPHBState) list;
 };
 
@@ -125,6 +127,9 @@ struct sPAPRPHBVFIOState {
 
 #define SPAPR_PCI_MEM_WIN_BUS_OFFSET 0x8000ULL
 
+/* Default 64bit dynamic window offset */
+#define SPAPR

[Qemu-devel] [RFC PATCH 02/10] spapr_iommu: Disable in-kernel IOMMU tables for >4GB windows

2014-07-31 Thread Alexey Kardashevskiy

The existing KVM_CREATE_SPAPR_TCE ioctl only support 4G windows max.
We are going to add huge DMA windows support so this will create small
window and unexpectedly fail later.

This disables KVM_CREATE_SPAPR_TCE for windows bigger that 4GB. Since
those windows are normally mapped at the boot time, there will be no
performance impact.

Signed-off-by: Alexey Kardashevskiy 
---
 hw/ppc/spapr_iommu.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index f6e32a4..36f5d27 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -113,11 +113,11 @@ static MemoryRegionIOMMUOps spapr_iommu_ops = {
 static int spapr_tce_table_realize(DeviceState *dev)
 {
 sPAPRTCETable *tcet = SPAPR_TCE_TABLE(dev);
+uint64_t window_size = tcet->nb_table << tcet->page_shift;
 
-if (kvm_enabled()) {
+if (kvm_enabled() && !(window_size >> 32)) {
 tcet->table = kvmppc_create_spapr_tce(tcet->liobn,
-  tcet->nb_table <<
-  tcet->page_shift,
+  window_size,
   &tcet->fd,
   tcet->vfio_accel);
 }
-- 
2.0.0

[Qemu-devel] [RFC PATCH 07/10] spapr: Add "ddw" machine option

2014-07-31 Thread Alexey Kardashevskiy

This adds a new "ddw" option for a machine to control presense
of the Dynamic DMA window (DDW) feature.

This option will be used by pseries to decide whether to put
DDW RTAS tokens to PHB device tree nodes or not.

This is not a PHB property because there is no way to change
the emulated PHB properties at start time. Also there is no point
in enabling DDW only for some PHB's because for emulated PHBs
it would not add any noticible overhead and for VFIO the very first
DDW-capable PHB will pin the entire guest memory and others won't
change it anyhow.

Signed-off-by: Alexey Kardashevskiy 
---
 hw/ppc/spapr.c | 15 +++
 vl.c   |  4 
 2 files changed, 19 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 364a1e1..192e398 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -101,6 +101,7 @@ struct sPAPRMachineState {
 
 /*< public >*/
 char *kvm_type;
+bool ddw_supported;
 };
 
 sPAPREnvironment *spapr;
@@ -1633,10 +1634,24 @@ static void spapr_set_kvm_type(Object *obj, const char 
*value, Error **errp)
 sm->kvm_type = g_strdup(value);
 }
 
+static bool spapr_machine_get_ddw(Object *obj, Error **errp)
+{
+sPAPRMachineState *sms = SPAPR_MACHINE(obj);
+return sms->ddw_supported;
+}
+
+static void spapr_machine_set_ddw(Object *obj, bool value, Error **errp)
+{
+sPAPRMachineState *sms = SPAPR_MACHINE(obj);
+sms->ddw_supported = value;
+}
+
 static void spapr_machine_initfn(Object *obj)
 {
 object_property_add_str(obj, "kvm-type",
 spapr_get_kvm_type, spapr_set_kvm_type, NULL);
+object_property_add_bool(obj, "ddw", spapr_machine_get_ddw,
+ spapr_machine_set_ddw, NULL);
 }
 
 static void spapr_machine_class_init(ObjectClass *oc, void *data)
diff --git a/vl.c b/vl.c
index fe451aa..e53eaeb 100644
--- a/vl.c
+++ b/vl.c
@@ -383,6 +383,10 @@ static QemuOptsList qemu_machine_opts = {
 .name = "kvm-type",
 .type = QEMU_OPT_STRING,
 .help = "Specifies the KVM virtualization mode (HV, PR)",
+}, {
+.name = "ddw",
+.type = QEMU_OPT_BOOL,
+.help = "Enable Dynamic DMA windows support (pseries only)",
 },{
 .name = PC_MACHINE_MAX_RAM_BELOW_4G,
 .type = QEMU_OPT_SIZE,
-- 
2.0.0

[Qemu-devel] [RFC PATCH 03/10] spapr_pci: Make find_phb()/find_dev() public

2014-07-31 Thread Alexey Kardashevskiy

This makes find_phb()/find_dev() public and changed its names
to spapr_pci_find_phb()/spapr_pci_find_dev() as they are going to
be used from other parts of QEMU such as VFIO DDW (dynamic DMA window)
or VFIO PCI error injection or VFIO EEH handling - in all these
cases there are RTAS calls which are addressed to BUID+config_addr
in IEEE1275 format.

Signed-off-by: Alexey Kardashevskiy 
---
 hw/ppc/spapr_pci.c  | 22 +++---
 include/hw/pci-host/spapr.h |  4 
 2 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 9ed39a9..230b59c 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -47,7 +47,7 @@
 #define RTAS_TYPE_MSI   1
 #define RTAS_TYPE_MSIX  2
 
-static sPAPRPHBState *find_phb(sPAPREnvironment *spapr, uint64_t buid)
+sPAPRPHBState *spapr_pci_find_phb(sPAPREnvironment *spapr, uint64_t buid)
 {
 sPAPRPHBState *sphb;
 
@@ -61,10 +61,10 @@ static sPAPRPHBState *find_phb(sPAPREnvironment *spapr, 
uint64_t buid)
 return NULL;
 }
 
-static PCIDevice *find_dev(sPAPREnvironment *spapr, uint64_t buid,
-   uint32_t config_addr)
+PCIDevice *spapr_pci_find_dev(sPAPREnvironment *spapr, uint64_t buid,
+  uint32_t config_addr)
 {
-sPAPRPHBState *sphb = find_phb(spapr, buid);
+sPAPRPHBState *sphb = spapr_pci_find_phb(spapr, buid);
 PCIHostState *phb = PCI_HOST_BRIDGE(sphb);
 int bus_num = (config_addr >> 16) & 0xFF;
 int devfn = (config_addr >> 8) & 0xFF;
@@ -95,7 +95,7 @@ static void finish_read_pci_config(sPAPREnvironment *spapr, 
uint64_t buid,
 return;
 }
 
-pci_dev = find_dev(spapr, buid, addr);
+pci_dev = spapr_pci_find_dev(spapr, buid, addr);
 addr = rtas_pci_cfgaddr(addr);
 
 if (!pci_dev || (addr % size) || (addr >= pci_config_size(pci_dev))) {
@@ -162,7 +162,7 @@ static void finish_write_pci_config(sPAPREnvironment 
*spapr, uint64_t buid,
 return;
 }
 
-pci_dev = find_dev(spapr, buid, addr);
+pci_dev = spapr_pci_find_dev(spapr, buid, addr);
 addr = rtas_pci_cfgaddr(addr);
 
 if (!pci_dev || (addr % size) || (addr >= pci_config_size(pci_dev))) {
@@ -281,9 +281,9 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, 
sPAPREnvironment *spapr,
 }
 
 /* Fins sPAPRPHBState */
-phb = find_phb(spapr, buid);
+phb = spapr_pci_find_phb(spapr, buid);
 if (phb) {
-pdev = find_dev(spapr, buid, config_addr);
+pdev = spapr_pci_find_dev(spapr, buid, config_addr);
 }
 if (!phb || !pdev) {
 rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
@@ -377,9 +377,9 @@ static void 
rtas_ibm_query_interrupt_source_number(PowerPCCPU *cpu,
 spapr_pci_msi *msi;
 
 /* Find sPAPRPHBState */
-phb = find_phb(spapr, buid);
+phb = spapr_pci_find_phb(spapr, buid);
 if (phb) {
-pdev = find_dev(spapr, buid, config_addr);
+pdev = spapr_pci_find_dev(spapr, buid, config_addr);
 }
 if (!phb || !pdev) {
 rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
@@ -553,7 +553,7 @@ static void spapr_phb_realize(DeviceState *dev, Error 
**errp)
 return;
 }
 
-if (find_phb(spapr, sphb->buid)) {
+if (spapr_pci_find_phb(spapr, sphb->buid)) {
 error_setg(errp, "PCI host bridges must have unique BUIDs");
 return;
 }
diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
index 32f0aa7..14c2ab0 100644
--- a/include/hw/pci-host/spapr.h
+++ b/include/hw/pci-host/spapr.h
@@ -122,4 +122,8 @@ void spapr_pci_msi_init(sPAPREnvironment *spapr, hwaddr 
addr);
 
 void spapr_pci_rtas_init(void);
 
+sPAPRPHBState *spapr_pci_find_phb(sPAPREnvironment *spapr, uint64_t buid);
+PCIDevice *spapr_pci_find_dev(sPAPREnvironment *spapr, uint64_t buid,
+  uint32_t config_addr);
+
 #endif /* __HW_SPAPR_PCI_H__ */
-- 
2.0.0

[Qemu-devel] [RFC PATCH 05/10] linux headers update for DDW

2014-07-31 Thread Alexey Kardashevskiy

Signed-off-by: Alexey Kardashevskiy 
---
 linux-headers/linux/vfio.h | 37 -
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 26c218e..f0aa97d 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -448,13 +448,48 @@ struct vfio_iommu_type1_dma_unmap {
  */
 struct vfio_iommu_spapr_tce_info {
__u32 argsz;
-   __u32 flags;/* reserved for future use */
+   __u32 flags;
+#define VFIO_IOMMU_SPAPR_TCE_FLAG_DDW  1 /* Support dynamic windows */
__u32 dma32_window_start;   /* 32 bit window start (bytes) */
__u32 dma32_window_size;/* 32 bit window size (bytes) */
 };
 
 #define VFIO_IOMMU_SPAPR_TCE_GET_INFO  _IO(VFIO_TYPE, VFIO_BASE + 12)
 
+/*
+ * Dynamic DMA windows
+ */
+struct vfio_iommu_spapr_tce_query {
+   __u32 argsz;
+   /* out */
+   __u32 windows_available;
+   __u32 page_size_mask;
+};
+#define VFIO_IOMMU_SPAPR_TCE_QUERY _IO(VFIO_TYPE, VFIO_BASE + 17)
+
+struct vfio_iommu_spapr_tce_create {
+   __u32 argsz;
+   /* in */
+   __u32 page_shift;
+   __u32 window_shift;
+   /* out */
+   __u64 start_addr;
+
+};
+#define VFIO_IOMMU_SPAPR_TCE_CREATE_IO(VFIO_TYPE, VFIO_BASE + 18)
+
+struct vfio_iommu_spapr_tce_remove {
+   __u32 argsz;
+   /* in */
+   __u64 start_addr;
+};
+#define VFIO_IOMMU_SPAPR_TCE_REMOVE_IO(VFIO_TYPE, VFIO_BASE + 19)
+
+struct vfio_iommu_spapr_tce_reset {
+   __u32 argsz;
+};
+#define VFIO_IOMMU_SPAPR_TCE_RESET _IO(VFIO_TYPE, VFIO_BASE + 20)
+
 /* * */
 
 #endif /* VFIO_H */
-- 
2.0.0

[Qemu-devel] [RFC PATCH 10/10] vfio: Enable DDW ioctls to VFIO IOMMU driver

2014-07-31 Thread Alexey Kardashevskiy

This enables DDW RTAS-related ioctls in VFIO.

Signed-off-by: Alexey Kardashevskiy 
---
 hw/misc/vfio.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index 0b9eba0..e7b4d6e 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -4437,6 +4437,10 @@ int vfio_container_ioctl(AddressSpace *as, int32_t 
groupid,
 switch (req) {
 case VFIO_CHECK_EXTENSION:
 case VFIO_IOMMU_SPAPR_TCE_GET_INFO:
+case VFIO_IOMMU_SPAPR_TCE_QUERY:
+case VFIO_IOMMU_SPAPR_TCE_CREATE:
+case VFIO_IOMMU_SPAPR_TCE_REMOVE:
+case VFIO_IOMMU_SPAPR_TCE_RESET:
 break;
 default:
 /* Return an error on unknown requests */
-- 
2.0.0

Re: [Qemu-devel] [PATCH 08/15] virtio: decrease size of VirtQueueElement

2014-07-31 Thread Paolo Bonzini

Il 31/07/2014 04:07, Ming Lei ha scritto:
> On Wed, Jul 30, 2014 at 9:51 PM, Paolo Bonzini  wrote:
>> Il 30/07/2014 13:39, Ming Lei ha scritto:
>>> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
>>> index a60104c..943e72f 100644
>>> --- a/include/hw/virtio/virtio.h
>>> +++ b/include/hw/virtio/virtio.h
>>> @@ -84,12 +84,17 @@ typedef struct VirtQueue VirtQueue;
>>>  typedef struct VirtQueueElement
>>>  {
>>>  unsigned int index;
>>> +unsigned int num;
>>>  unsigned int out_num;
>>>  unsigned int in_num;
>>> -hwaddr in_addr[VIRTQUEUE_MAX_SIZE];
>>> -hwaddr out_addr[VIRTQUEUE_MAX_SIZE];
>>> -struct iovec in_sg[VIRTQUEUE_MAX_SIZE];
>>> -struct iovec out_sg[VIRTQUEUE_MAX_SIZE];
>>> +
>>> +hwaddr *in_addr;
>>> +hwaddr *out_addr;
>>> +struct iovec *in_sg;
>>> +struct iovec *out_sg;
>>> +
>>> +hwaddr addr[VIRTQUEUE_MAX_SIZE];
>>> +struct iovec sg[VIRTQUEUE_MAX_SIZE];
>>>  } VirtQueueElement;
>>>
>>>  #define VIRTIO_PCI_QUEUE_MAX 64
>>> --
>>
>> since addr and sg aren't used directly, allocate them flexibly like
>>
>> char *p;
>> VirtQueueElement *elem;
>> total_size = ROUND_UP(sizeof(struct VirtQueueElement),
>>   __alignof__(elem->addr[0]);
>> addr_offset = total_size;
>> total_size = ROUND_UP(total_size + num * sizeof(elem->addr[0]),
>>   __alignof__(elem->sg[0]));
>> sg_offset = total_size;
>> total_size += num * sizeof(elem->sg[0]);
>>
>> elem = p = g_slice_alloc(total_size);
>> elem->size = total_size;
>> elem->in_addr = p + addr_offset;
>> elem->out_addr = elem->in_addr + in_num;
>> elem->in_sg = p + sg_offset;
>> elem->out_sg = elem->in_sg + in_num;
>>
>> ...
>>
>> g_slice_free1(elem->size, elem);
>>
>> The small size will help glib do slab-style allocation and should remove
>> the need for an object pool.
> 
> Yes, that should be correct way to do, but can't avoid big chunk allocation
> completely because 'num' is a bit big.

For typical iops microbenchmarks, num will be 3 (4K I/O) to 18 (64K).

> Also this kind of change requires almost all users of elem to be changed,
> that need lots of work.
> 
> That is why I choose to take the simple approach to ease memory
> preallocation for obj pool.

Yeah, that's simpler.

Paolo

Re: [Qemu-devel] [PATCH qom v1 2/2] memory: remove object_property_add_child_array

2014-07-31 Thread Paolo Bonzini

Il 31/07/2014 07:35, Peter Crosthwaite ha scritto:
> Obsoleted by automatic object_property_add arrayification.
> 
> Signed-off-by: Peter Crosthwaite 
> ---
> 
>  memory.c | 30 +-
>  1 file changed, 5 insertions(+), 25 deletions(-)
> 
> diff --git a/memory.c b/memory.c
> index 64d7176..5272bf9 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -877,30 +877,6 @@ static char *memory_region_escape_name(const char *name)
>  return escaped;
>  }
>  
> -static void object_property_add_child_array(Object *owner,
> -const char *name,
> -Object *child)
> -{
> -int i;
> -char *base_name = memory_region_escape_name(name);
> -
> -for (i = 0; ; i++) {
> -char *full_name = g_strdup_printf("%s[%d]", base_name, i);
> -Error *local_err = NULL;
> -
> -object_property_add_child(owner, full_name, child, &local_err);
> -g_free(full_name);
> -if (!local_err) {
> -break;
> -}
> -
> -error_free(local_err);
> -}
> -
> -g_free(base_name);
> -}
> -
> -
>  void memory_region_init(MemoryRegion *mr,
>  Object *owner,
>  const char *name,
> @@ -918,8 +894,12 @@ void memory_region_init(MemoryRegion *mr,
>  mr->name = g_strdup(name);
>  
>  if (name) {
> -object_property_add_child_array(owner, name, OBJECT(mr));
> +char *escaped_name = memory_region_escape_name(name);
> +char *name_array = g_strdup_printf("%s[*]", escaped_name);
> +object_property_add_child(owner, name_array, OBJECT(mr), 
> &error_abort);
>  object_unref(OBJECT(mr));
> +g_free(name_array);
> +g_free(escaped_name);
>  }
>  }
>  
> 

Reviewed-by: Paolo Bonzini

Re: [Qemu-devel] [PATCH v8 4/5] QMP: Add support for Archipelago

2014-07-31 Thread Stefan Hajnoczi

On Wed, Jul 30, 2014 at 08:59:09PM +0300, Chrysostomos Nanakos wrote:
> Introduce new enum BlockdevOptionsArchipelago.
> 
> @volume:  #Name of the Archipelago volume image
> 
> @mport:   #'mport' is the port number on which mapperd is
>   listening. This is optional and if not specified,
>   QEMU will make Archipelago to use the default port.
> 
> @vport:   #'vport' is the port number on which vlmcd is
>   listening. This is optional and if not specified,
>   QEMU will make Archipelago to use the default port.
> 
> @segment: #optional The name of the shared memory segment
>   Archipelago stack is using. This is optional
>   and if not specified, QEMU will make Archipelago
>   use the default value, 'archipelago'.
> 
> Signed-off-by: Chrysostomos Nanakos 
> ---
>  qapi/block-core.json |   35 ++-
>  1 file changed, 34 insertions(+), 1 deletion(-)

Thanks, applied to my block-next tree and replaced v7 patch:
https://github.com/stefanha/qemu/commits/block-next

Stefan


pgpajz68n068R.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH qom v1 1/2] qom: object_property_add: Add automatic arrayification

2014-07-31 Thread Paolo Bonzini

Il 31/07/2014 07:34, Peter Crosthwaite ha scritto:
> If "[*]" is given as the last part of a QOM property name, treat that
> as an array property. The added property is given the first available
> name, replacing the * with a decimal number counting from 0.
> 
> First add with name "foo[*]" will be "foo[0]". Second "foo[1]" and so
> on.
> 
> Signed-off-by: Peter Crosthwaite 
> ---
> Suggest by Paolo and first pass discussion on list about the feature
> here:
> 
> https://lists.nongnu.org/archive/html/qemu-devel/2014-06/msg03794.html
> 
>  qom/object.c | 22 ++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/qom/object.c b/qom/object.c
> index 0e8267b..c869e8e 100644
> --- a/qom/object.c
> +++ b/qom/object.c
> @@ -739,6 +739,28 @@ object_property_add(Object *obj, const char *name, const 
> char *type,
>  {
>  ObjectProperty *prop;
>  
> +if (strlen(name) >= 3 && !strncmp(name + strlen(name) - 3, "[*]", 3)) {

Please cache strlen in a variable, and use memcmp(..., "[*]", 4).
strncmp is used often to compare just a prefix, and it's not obvious
that you're doing something else.

Otherwise looks good, thanks!

Paolo

> +int i;
> +ObjectProperty *ret;
> +char *name_no_array = g_strdup(name);
> +
> +name_no_array[strlen(name) - 3] = '\0';
> +for (i = 0; ; ++i) {
> +char *full_name = g_strdup_printf("%s[%d]", name_no_array, i);
> +Error *local_err = NULL;
> +ret = object_property_add(obj, full_name, type, get, set,
> +  release, opaque, &local_err);
> +
> +g_free(full_name);
> +if (!local_err) {
> +break;
> +}
> +error_free(local_err);
> +}
> +g_free(name_no_array);
> +return ret;
> +}
> +
>  QTAILQ_FOREACH(prop, &obj->properties, node) {
>  if (strcmp(prop->name, name) == 0) {
>  error_setg(errp, "attempt to add duplicate property '%s'"
>

[Qemu-devel] [RFC PATCH 04/10] spapr_iommu: Make spapr_tce_find_by_liobn() public

2014-07-31 Thread Alexey Kardashevskiy

At the moment spapr_tce_find_by_liobn() is used by H_PUT_TCE/...
handlers to find an IOMMU by LIOBN.

We are going to implement Dynamic DMA windows (DDW), new code
will go to a new file and we will use spapr_tce_find_by_liobn()
there too so let's make it public.

Signed-off-by: Alexey Kardashevskiy 
---
 hw/ppc/spapr_iommu.c   | 2 +-
 include/hw/ppc/spapr.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index 36f5d27..588d442 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -40,7 +40,7 @@ enum sPAPRTCEAccess {
 
 static QLIST_HEAD(spapr_tce_tables, sPAPRTCETable) spapr_tce_tables;
 
-static sPAPRTCETable *spapr_tce_find_by_liobn(uint32_t liobn)
+sPAPRTCETable *spapr_tce_find_by_liobn(uint32_t liobn)
 {
 sPAPRTCETable *tcet;
 
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 36e8e51..c9d6c6c 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -467,6 +467,7 @@ struct sPAPRTCETable {
 QLIST_ENTRY(sPAPRTCETable) list;
 };
 
+sPAPRTCETable *spapr_tce_find_by_liobn(uint32_t liobn);
 void spapr_events_init(sPAPREnvironment *spapr);
 void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
 int spapr_h_cas_compose_response(target_ulong addr, target_ulong size);
-- 
2.0.0

Re: [Qemu-devel] [PATCH 01/15] qemu coroutine: support bypass mode

2014-07-31 Thread Ming Lei

On Thu, Jul 31, 2014 at 3:37 PM, Benoît Canet  wrote:
> The Thursday 31 Jul 2014 à 11:55:28 (+0800), Ming Lei wrote :
>> On Thu, Jul 31, 2014 at 7:37 AM, Paolo Bonzini  wrote:
>> > Il 30/07/2014 19:15, Ming Lei ha scritto:
>> >> On Wed, Jul 30, 2014 at 9:45 PM, Paolo Bonzini  
>> >> wrote:
>> >>> Il 30/07/2014 13:39, Ming Lei ha scritto:
>>  This patch introduces several APIs for supporting bypass qemu coroutine
>>  in case of being not necessary and for performance's sake.
>> >>>
>> >>> No, this is wrong.  Dataplane *must* use the same code as non-dataplane,
>> >>> anything else is a step backwards.
>> >>
>> >> As we saw, coroutine has brought up performance regression
>> >> on dataplane, and it isn't necessary to use co in some cases, is it?
>> >
>> > Yes, and it's not necessary on non-dataplane either.  It's not necessary
>> > on virtio-scsi, and it will not be necessary on virtio-scsi dataplane
>> > either.
>>
>> It is good to know these cases, so they might benefit from this patch
>> in future too, :-)
>>
>> >>> If you want to bypass coroutines, bdrv_aio_readv/writev must detect the
>> >>> conditions that allow doing that and call the bdrv_aio_readv/writev
>> >>> directly.
>> >>
>> >> That is easy to detect, please see the 5th patch.
>> >
>> > No, that's not enough.  Dataplane right now prevents block jobs, but
>> > that's going to change and it could require coroutines even for raw 
>> > devices.
>>
>> Could you explain it a bit why co is required for raw devices in future?
>
> Block mirroring of a device for example is done using coroutines.
> As block mirroring can be done on a raw device you need coroutines.

If block layer knows the mirroring is in-progress, it still can enable
coroutine by ignoring bypass coroutine, or just let device disable
bypass coroutine in case of mirroring, and the current APIs are very
flexible.

Thanks,

Re: [Qemu-devel] [PATCH 5/9] nic: do not destroy memory regions in cleanup functions

2014-07-31 Thread Stefan Hajnoczi

On Wed, Jul 30, 2014 at 12:27:08PM +0200, Paolo Bonzini wrote:
> The memory regions should be destroyed in the unrealize function;
> since these NICs are not even qdev-ified, they cannot be unplugged
> and they do not have to do anything to destroy their memory regions.
> 
> Cc: stefa...@redhat.com
> Signed-off-by: Paolo Bonzini 
> ---
>  hw/net/dp8393x.c | 3 ---
>  hw/net/mcf_fec.c | 3 ---
>  2 files changed, 6 deletions(-)

Reviewed-by: Stefan Hajnoczi 


pgpfh34fbJiQb.pgp
Description: PGP signature

[Qemu-devel] [PATCH v4 2/8] bootindex: add del_boot_device_path function

2014-07-31 Thread arei.gonglei

From: Gonglei 

Introduce del_boot_device_path() to clean up fw_cfg content when
hot-unplugging a device that refers to a bootindex.

Signed-off-by: Gonglei 
Signed-off-by: Chenliang 
---
 include/sysemu/sysemu.h |  1 +
 vl.c| 20 
 2 files changed, 21 insertions(+)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index e1b0659..7a79ff4 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -209,6 +209,7 @@ void usb_info(Monitor *mon, const QDict *qdict);
 
 void add_boot_device_path(int32_t bootindex, DeviceState *dev,
   const char *suffix);
+void del_boot_device_path(DeviceState *dev);
 void modify_boot_device_path(int32_t bootindex, DeviceState *dev,
  const char *suffix);
 char *get_boot_devices_list(size_t *size, bool ignore_suffixes);
diff --git a/vl.c b/vl.c
index 825f9fd..49328df 100644
--- a/vl.c
+++ b/vl.c
@@ -1263,6 +1263,26 @@ static bool is_same_fw_dev_path(DeviceState *src, 
DeviceState *dst)
 return ret;
 }
 
+void del_boot_device_path(DeviceState *dev)
+{
+FWBootEntry *i;
+
+assert(dev != NULL);
+
+/* remove all entries of the assigned device */
+QTAILQ_FOREACH(i, &fw_boot_order, link) {
+if ((i->dev->id && dev->id &&
+!strcmp(i->dev->id, dev->id)) ||
+(!i->dev->id && is_same_fw_dev_path(i->dev, dev))) {
+QTAILQ_REMOVE(&fw_boot_order, i, link);
+g_free(i->suffix);
+g_free(i);
+
+break;
+}
+}
+}
+
 void modify_boot_device_path(int32_t bootindex, DeviceState *dev,
  const char *suffix)
 {
-- 
1.7.12.4

[Qemu-devel] [PATCH v4 3/8] fw_cfg: add fw_cfg_machine_reset function

2014-07-31 Thread arei.gonglei

From: Gonglei 

We must assure that the changed bootindex can take effect
when guest is rebooted. So we introduce fw_cfg_machine_reset(),
which change the fw_cfg file's bootindex data using the new
global fw_boot_order list.

Signed-off-by: Chenliang 
Signed-off-by: Gonglei 
---
 hw/nvram/fw_cfg.c | 54 +--
 include/hw/nvram/fw_cfg.h |  2 ++
 2 files changed, 49 insertions(+), 7 deletions(-)

diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index b71d251..a24a44d 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -56,7 +56,6 @@ struct FWCfgState {
 FWCfgFiles *files;
 uint16_t cur_entry;
 uint32_t cur_offset;
-Notifier machine_ready;
 };
 
 #define JPG_FILE 0
@@ -402,6 +401,26 @@ static void fw_cfg_add_bytes_read_callback(FWCfgState *s, 
uint16_t key,
 s->entries[arch][key].callback_opaque = callback_opaque;
 }
 
+static void *fw_cfg_modify_bytes_read(FWCfgState *s, uint16_t key,
+  void *data, size_t len)
+{
+void *ptr;
+int arch = !!(key & FW_CFG_ARCH_LOCAL);
+
+key &= FW_CFG_ENTRY_MASK;
+
+assert(key < FW_CFG_MAX_ENTRY && len < UINT32_MAX);
+
+/* return the old data to the function caller, avoid memory leak */
+ptr = s->entries[arch][key].data;
+s->entries[arch][key].data = data;
+s->entries[arch][key].len = len;
+s->entries[arch][key].callback_opaque = NULL;
+s->entries[arch][key].callback = NULL;
+
+return ptr;
+}
+
 void fw_cfg_add_bytes(FWCfgState *s, uint16_t key, void *data, size_t len)
 {
 fw_cfg_add_bytes_read_callback(s, key, NULL, NULL, data, len);
@@ -499,13 +518,36 @@ void fw_cfg_add_file(FWCfgState *s,  const char *filename,
 fw_cfg_add_file_callback(s, filename, NULL, NULL, data, len);
 }
 
-static void fw_cfg_machine_ready(struct Notifier *n, void *data)
+void *fw_cfg_modify_file(FWCfgState *s, const char *filename,
+void *data, size_t len)
+{
+int i, index;
+
+assert(s->files);
+
+index = be32_to_cpu(s->files->count);
+assert(index < FW_CFG_FILE_SLOTS);
+
+for (i = 0; i < index; i++) {
+if (strcmp(filename, s->files->f[i].name) == 0) {
+return fw_cfg_modify_bytes_read(s, FW_CFG_FILE_FIRST + i,
+ data, len);
+}
+}
+/* add new one */
+fw_cfg_add_file_callback(s, filename, NULL, NULL, data, len);
+return NULL;
+}
+
+static void fw_cfg_machine_reset(void *opaque)
 {
+void *ptr;
 size_t len;
-FWCfgState *s = container_of(n, FWCfgState, machine_ready);
+FWCfgState *s = opaque;
 char *bootindex = get_boot_devices_list(&len, false);
 
-fw_cfg_add_file(s, "bootorder", (uint8_t*)bootindex, len);
+ptr = fw_cfg_modify_file(s, "bootorder", (uint8_t *)bootindex, len);
+g_free(ptr);
 }
 
 FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port,
@@ -542,9 +584,7 @@ FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t 
data_port,
 fw_cfg_bootsplash(s);
 fw_cfg_reboot(s);
 
-s->machine_ready.notify = fw_cfg_machine_ready;
-qemu_add_machine_init_done_notifier(&s->machine_ready);
-
+qemu_register_reset(fw_cfg_machine_reset, s);
 return s;
 }
 
diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
index 72b1549..56e1ed7 100644
--- a/include/hw/nvram/fw_cfg.h
+++ b/include/hw/nvram/fw_cfg.h
@@ -76,6 +76,8 @@ void fw_cfg_add_file(FWCfgState *s, const char *filename, 
void *data,
 void fw_cfg_add_file_callback(FWCfgState *s, const char *filename,
   FWCfgReadCallback callback, void 
*callback_opaque,
   void *data, size_t len);
+void *fw_cfg_modify_file(FWCfgState *s, const char *filename, void *data,
+ size_t len);
 FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port,
 hwaddr crl_addr, hwaddr data_addr);
 
-- 
1.7.12.4

[Qemu-devel] [PATCH v4 1/8] bootindex: add modify_boot_device_path function

2014-07-31 Thread arei.gonglei

From: Gonglei 

When we want to change one device's bootindex, we
should lookup the device and change the bootindex.
it is simply that remove it from the global boot list,
then re-add it, sorted by new bootindex.

If the new bootindex has already used by another device
just throw an error.

Allow changing the existing bootindex entry only,
not support adding new boot entries.

Signed-off-by: Gonglei 
Signed-off-by: Chenliang 
---
 include/sysemu/sysemu.h |  2 ++
 vl.c| 68 +
 2 files changed, 70 insertions(+)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index d8539fd..e1b0659 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -209,6 +209,8 @@ void usb_info(Monitor *mon, const QDict *qdict);
 
 void add_boot_device_path(int32_t bootindex, DeviceState *dev,
   const char *suffix);
+void modify_boot_device_path(int32_t bootindex, DeviceState *dev,
+ const char *suffix);
 char *get_boot_devices_list(size_t *size, bool ignore_suffixes);
 
 DeviceState *get_boot_device(uint32_t position);
diff --git a/vl.c b/vl.c
index fe451aa..825f9fd 100644
--- a/vl.c
+++ b/vl.c
@@ -1248,6 +1248,74 @@ void add_boot_device_path(int32_t bootindex, DeviceState 
*dev,
 QTAILQ_INSERT_TAIL(&fw_boot_order, node, link);
 }
 
+static bool is_same_fw_dev_path(DeviceState *src, DeviceState *dst)
+{
+bool ret = false;
+char *devpath_src = qdev_get_fw_dev_path(src);
+char *devpath_dst = qdev_get_fw_dev_path(dst);
+
+if (!strcmp(devpath_src, devpath_dst)) {
+ret = true;
+}
+
+g_free(devpath_src);
+g_free(devpath_dst);
+return ret;
+}
+
+void modify_boot_device_path(int32_t bootindex, DeviceState *dev,
+ const char *suffix)
+{
+FWBootEntry *i, *old_entry = NULL;
+
+assert(dev != NULL || suffix != NULL);
+
+if (bootindex >= 0) {
+QTAILQ_FOREACH(i, &fw_boot_order, link) {
+if (i->bootindex == bootindex) {
+qerror_report(ERROR_CLASS_GENERIC_ERROR,
+  "The bootindex %d has already been used",
+  bootindex);
+return;
+}
+}
+}
+
+QTAILQ_FOREACH(i, &fw_boot_order, link) {
+/*
+ * Delete the same original dev, if devices havn't id property,
+ * we must check they fw_dev_path to identify them.
+ */
+if ((i->dev->id && !strcmp(i->dev->id, dev->id)) ||
+(!i->dev->id && is_same_fw_dev_path(i->dev, dev))) {
+if (!suffix) {
+QTAILQ_REMOVE(&fw_boot_order, i, link);
+old_entry = i;
+
+break;
+} else if (i->suffix && !strcmp(suffix, i->suffix)) {
+QTAILQ_REMOVE(&fw_boot_order, i, link);
+old_entry = i;
+
+break;
+}
+}
+}
+
+if (!old_entry) {
+qerror_report(ERROR_CLASS_GENERIC_ERROR,
+  "The device(%s) havn't configured bootindex property"
+  " or suffix don't match", dev->id);
+
+return;
+}
+
+add_boot_device_path(bootindex, dev, old_entry->suffix);
+
+g_free(old_entry->suffix);
+g_free(old_entry);
+}
+
 DeviceState *get_boot_device(uint32_t position)
 {
 uint32_t counter = 0;
-- 
1.7.12.4

[Qemu-devel] [PATCH v4 6/8] qemu-monitor: HMP set-bootindex wrapper

2014-07-31 Thread arei.gonglei

From: Gonglei 

Add HMP set-bootindex wrapper to allow setting
devcie's bootindex via monitor.

Signed-off-by: Gonglei 
Signed-off-by: Chenliang 
---
 hmp-commands.hx | 15 +++
 hmp.c   | 13 +
 hmp.h   |  1 +
 3 files changed, 29 insertions(+)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index d0943b1..31ef24e 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -688,6 +688,21 @@ Remove device @var{id}.
 ETEXI
 
 {
+.name   = "set-bootindex",
+.args_type  = "id:s,bootindex:l,suffix:s?",
+.params = "device bootindex [suffix]",
+.help   = "set bootindex of a device(e.g. set-bootindex disk0 1 
'/disk@0')",
+.mhandler.cmd = hmp_set_bootindex,
+},
+
+STEXI
+@item set-bootindex @var{id} @var{bootindex}
+@findex set-bootindex
+
+Set bootindex of a device.
+ETEXI
+
+{
 .name   = "cpu",
 .args_type  = "index:i",
 .params = "index",
diff --git a/hmp.c b/hmp.c
index 4d1838e..95f7eeb 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1714,3 +1714,16 @@ void hmp_info_memdev(Monitor *mon, const QDict *qdict)
 
 monitor_printf(mon, "\n");
 }
+
+void hmp_set_bootindex(Monitor *mon, const QDict *qdict)
+{
+Error *err = NULL;
+
+const char *id = qdict_get_str(qdict, "id");
+int64_t bootindex = qdict_get_int(qdict, "bootindex");
+bool has_suffix = qdict_haskey(qdict, "suffix");
+const char *suffix = qdict_get_try_str(qdict, "suffix");
+
+qmp_set_bootindex(id, bootindex, has_suffix, suffix, &err);
+hmp_handle_error(mon, &err);
+}
diff --git a/hmp.h b/hmp.h
index 4fd3c4a..eb2641a 100644
--- a/hmp.h
+++ b/hmp.h
@@ -94,6 +94,7 @@ void hmp_cpu_add(Monitor *mon, const QDict *qdict);
 void hmp_object_add(Monitor *mon, const QDict *qdict);
 void hmp_object_del(Monitor *mon, const QDict *qdict);
 void hmp_info_memdev(Monitor *mon, const QDict *qdict);
+void hmp_set_bootindex(Monitor *mon, const QDict *qdict);
 void object_add_completion(ReadLineState *rs, int nb_args, const char *str);
 void object_del_completion(ReadLineState *rs, int nb_args, const char *str);
 void device_add_completion(ReadLineState *rs, int nb_args, const char *str);
-- 
1.7.12.4

[Qemu-devel] [PATCH v4 7/8] qmp: add query-bootindex command

2014-07-31 Thread arei.gonglei

From: Gonglei 

Adds "query-bootindex" QMP command.

Example QMP command:

-> { "execute": "query-bootindex"}
<- {
  "return":[
 {
"id":"ide0-0-0",
"bootindex":1,
"suffix":"/disk@0"
 },
 {
"id":"nic1",
"bootindex":2,
"suffix":"/ethernet-phy@0"
 }
  ]
   }

Signed-off-by: Gonglei 
Signed-off-by: Chenliang 
---
 qapi-schema.json | 29 +
 qmp-commands.hx  | 41 +
 vl.c | 31 +++
 3 files changed, 101 insertions(+)

diff --git a/qapi-schema.json b/qapi-schema.json
index 30bd6ad..680cbc5 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1704,6 +1704,33 @@
 { 'command': 'device_del', 'data': {'id': 'str'} }
 
 ##
+# @BootindexInfo:
+#
+# Information about devcie's bootindex.
+#
+# @id: the name of a device owning the bootindex
+#
+# @bootindex: the bootindex number
+#
+# @suffix: the suffix a device's bootindex
+#
+# Since: 2.2
+##
+{ 'type': 'BootindexInfo',
+  'data': {'id': 'str', 'bootindex': 'int', 'suffix': 'str'} }
+
+##
+# @query-bootindex:
+#
+# Returns information about bootindex
+#
+# Returns: a list of @BootindexInfo for existing device
+#
+# Since: 2.2
+##
+{ 'command': 'query-bootindex', 'returns': ['BootindexInfo'] }
+
+##
 # @set-bootindex:
 #
 # set bootindex of a device
@@ -1715,6 +1742,8 @@
 # Returns: Nothing on success
 #  If @id is not a valid device, DeviceNotFound
 #
+# Note: suffix can be gotten by query-bootindex command
+#
 # Since: 2.2
 ##
 { 'command': 'set-bootindex',
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 19cc3b8..6ab9325 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -337,6 +337,47 @@ EQMP
 },
 
 SQMP
+query-bootindex
+---
+
+Show VM bootindex information.
+
+The returned value is a json-array of all device configured
+bootindex property. Each bootindex is represented by a json-object.
+
+The bootindex json-object contains the following:
+
+- "id": the name of a device owning the bootindex (json-string)
+- "bootindex": the bootindex number (json-int)
+- "suffix": the suffix a device's bootindex (json-string)
+
+Example:
+
+-> { "execute": "query-bootindex" }
+<- {
+  "return":[
+ {
+"id":"ide0-0-0",
+"bootindex":1,
+"suffix":"/disk@0"
+ },
+ {
+"id":"nic1",
+"bootindex":2,
+"suffix":"/ethernet-phy@0"
+ }
+  ]
+   }
+
+EQMP
+
+{
+.name   = "query-bootindex",
+.args_type  = "",
+.mhandler.cmd_new = qmp_marshal_input_query_bootindex,
+},
+
+SQMP
 set-bootindex
 -
 
diff --git a/vl.c b/vl.c
index 49328df..52e4d9a 100644
--- a/vl.c
+++ b/vl.c
@@ -1219,6 +1219,37 @@ static void restore_boot_order(void *opaque)
 g_free(normal_boot_order);
 }
 
+BootindexInfoList *qmp_query_bootindex(Error **errp)
+{
+BootindexInfoList *booindex_list = NULL;
+BootindexInfoList *info;
+FWBootEntry *i;
+
+QTAILQ_FOREACH(i, &fw_boot_order, link) {
+info = g_new0(BootindexInfoList, 1);
+info->value = g_new0(BootindexInfo, 1);
+
+if (i->dev->id) {
+info->value->id = g_strdup(i->dev->id);
+} else {
+/* For virtio devices, we should get id from they parent */
+if (i->dev->parent_bus) {
+DeviceState *dev = i->dev->parent_bus->parent;
+info->value->id = g_strdup(dev->id);
+} else {
+info->value->id = g_strdup("");
+}
+}
+info->value->bootindex = i->bootindex;
+info->value->suffix = g_strdup(i->suffix);
+
+info->next = booindex_list;
+booindex_list = info;
+}
+
+return booindex_list;
+}
+
 void add_boot_device_path(int32_t bootindex, DeviceState *dev,
   const char *suffix)
 {
-- 
1.7.12.4

[Qemu-devel] [PATCH v4 0/8] modify boot order of guest, and take effect after rebooting

2014-07-31 Thread arei.gonglei

From: Gonglei 

Sometimes, we want to modify boot order of a guest, but no need to
shutdown it. We can call dynamic changing bootindex of a guest, which
can be assured taking effect just after the guest rebooting.

For example, in P2V scene, we boot a guest and then attach a
new system disk, for copying some thing. We want to assign the
new disk as the booting disk, which means its bootindex=1.

Different nics can be assigen different bootindex dynamically
also make sense.

The patchsets add one qmp interface, and add an fw_cfg_machine_reset()
to achieve it.

Steps of testing:

./qemu-system-x86_64 -enable-kvm -m 4096 -smp 4 -name redhat6.2 -drive \
file=/home/redhat6.2.img,if=none,id=drive-ide0-0-0 \
-device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 \
-drive file=/home/RH-DVD1.iso,if=none,id=drive-ide0-0-1 \
-device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1,bootindex=4 \
-vnc 0.0.0.0:10 -netdev type=user,id=net0 \
-device virtio-net-pci,netdev=net0,bootindex=3,id=nic1 \
-netdev type=user,id=net1 -device e1000,netdev=net1,bootindex=2,id=nic2 \
-drive file=/home/virtio-disk.vfd,if=none,id=drive-fdc0-0-0,format=raw \
-device isa-fdc,driveA=drive-fdc0-0-0,id=floppy1,bootindexA=5 -monitor stdio
QEMU 2.0.93 monitor - type 'help' for more information
(qemu) info bootindex
id   bootindex   suffix
"floppy1"5  "/floppy@0"
"ide0-0-1"   4  "/disk@1"
"nic1"   3  "/ethernet-phy@0"
"nic2"   2  "/ethernet-phy@0"
"ide0-0-0"   1  "/disk@0"
(qemu) set-bootindex ide0-0-1 1
The bootindex 1 has already been used
(qemu) set-bootindex ide0-0-1 6 "/disk@1"
(qemu) set-bootindex ide0-0-1 0
(qemu) system_reset
(qemu) set-bootindex ide0-0-1 1
The bootindex 1 has already been used
(qemu) set-bootindex nic1 0
The bootindex 0 has already been used
(qemu) set-bootindex ide0-0-1 -1
(qemu) set-bootindex nic1 0
(qemu) info bootindex
id   bootindex   suffix
"floppy1"5  "/floppy@0"
"nic2"   2  "/ethernet-phy@0"
"ide0-0-0"   1  "/disk@0"
"nic1"   0  "/ethernet-phy@0"
(qemu) system_reset
(qemu) 

Changes since v3:
 - rework del_* and modify_* function, because of virtio devices' specialation.
   For example, virtio-net's id is NULL, and its parent virtio-net-pci's id was 
assigned.
   Though the global fw_boot_order stored the virtio-net device.
 - call dell_boot_device_path in each individual device avoiding waste resouce.
 - introduce qmp "query-bootindex" command
 - introcude hmp "info bootindex" command
 - Fixes by Eric's reviewing comments, thanks.

Changes since v2:
 *address Gerd's reviewing suggestion:
 - use the old entry's suffix, if the caller do not pass it in.
 - call del_boot_device_path() from device_finalize() instead
   of placing it into each individual device.

  Thanks Gerd.

Changes since v1:
 *rework by Gerd's suggestion:
 - split modify and del fw_boot_order for single function.
 - change modify bootindex's realization which simply lookup
   the device and modify the bootindex. if the new bootindex
   has already used by another device just throw an error.
 - change to del_boot_device_path(DeviceState *dev) and simply delete all
   entries belonging to the device.

Gonglei (8):
  bootindex: add modify_boot_device_path function
  bootindex: add del_boot_device_path function
  fw_cfg: add fw_cfg_machine_reset function
  bootindex: delete bootindex when device is removed
  qmp: add set-bootindex command
  qemu-monitor: HMP set-bootindex wrapper
  qmp: add query-bootindex command
  qemu-monitor: add HMP "info-bootindex" command

 hmp-commands.hx   |  17 +++
 hmp.c |  33 +
 hmp.h |   2 +
 hw/block/virtio-blk.c |   1 +
 hw/i386/kvm/pci-assign.c  |   1 +
 hw/misc/vfio.c|   1 +
 hw/net/e1000.c|   1 +
 hw/net/eepro100.c |   1 +
 hw/net/ne2000.c   |   1 +
 hw/net/rtl8139.c  |   1 +
 hw/net/virtio-net.c   |   1 +
 hw/net/vmxnet3.c  |   1 +
 hw/nvram/fw_cfg.c |  54 ++---
 hw/scsi/scsi-generic.c|   1 +
 hw/usb/dev-network.c  |   1 +
 hw/usb/host-libusb.c  |   1 +
 hw/usb/redirect.c |   1 +
 include/hw/nvram/fw_cfg.h |   2 +
 include/sysemu/sysemu.h   |   3 ++
 monitor.c |   7 +++
 qapi-schema.json  |  46 ++
 qmp-commands.hx   |  66 +
 qmp.c |  17 +++
 vl.c  | 119 ++
 24 files changed, 372 insertions(+), 7 deletions(-)

-- 
1.7.12.4

[Qemu-devel] [PATCH v4 4/8] bootindex: delete bootindex when device is removed

2014-07-31 Thread arei.gonglei

From: Gonglei 

Device should be removed from global boot list when
it is hot-unplugged.

Signed-off-by: Chenliang 
Signed-off-by: Gonglei 
---
 hw/block/virtio-blk.c| 1 +
 hw/i386/kvm/pci-assign.c | 1 +
 hw/misc/vfio.c   | 1 +
 hw/net/e1000.c   | 1 +
 hw/net/eepro100.c| 1 +
 hw/net/ne2000.c  | 1 +
 hw/net/rtl8139.c | 1 +
 hw/net/virtio-net.c  | 1 +
 hw/net/vmxnet3.c | 1 +
 hw/scsi/scsi-generic.c   | 1 +
 hw/usb/dev-network.c | 1 +
 hw/usb/host-libusb.c | 1 +
 hw/usb/redirect.c| 1 +
 13 files changed, 13 insertions(+)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index c241c50..49813d5 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -786,6 +786,7 @@ static void virtio_blk_device_unrealize(DeviceState *dev, 
Error **errp)
 VirtIODevice *vdev = VIRTIO_DEVICE(dev);
 VirtIOBlock *s = VIRTIO_BLK(dev);
 
+del_boot_device_path(dev);
 #ifdef CONFIG_VIRTIO_BLK_DATA_PLANE
 remove_migration_state_change_notifier(&s->migration_state_notifier);
 virtio_blk_data_plane_destroy(s->dataplane);
diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
index de33657..1322479 100644
--- a/hw/i386/kvm/pci-assign.c
+++ b/hw/i386/kvm/pci-assign.c
@@ -1853,6 +1853,7 @@ static void assigned_exitfn(struct PCIDevice *pci_dev)
 {
 AssignedDevice *dev = DO_UPCAST(AssignedDevice, dev, pci_dev);
 
+del_boot_device_path(&pci_dev->qdev);
 deassign_device(dev);
 free_assigned_device(dev);
 }
diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index 0b9eba0..f891312 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -4304,6 +4304,7 @@ static void vfio_exitfn(PCIDevice *pdev)
 VFIODevice *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
 VFIOGroup *group = vdev->group;
 
+del_boot_device_path(&pdev->qdev);
 vfio_unregister_err_notifier(vdev);
 pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
 vfio_disable_interrupts(vdev);
diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 0fc29a0..fa4e858 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -1492,6 +1492,7 @@ pci_e1000_uninit(PCIDevice *dev)
 {
 E1000State *d = E1000(dev);
 
+del_boot_device_path(DEVICE(dev));
 timer_del(d->autoneg_timer);
 timer_free(d->autoneg_timer);
 timer_del(d->mit_timer);
diff --git a/hw/net/eepro100.c b/hw/net/eepro100.c
index 3263e3f..62951e4 100644
--- a/hw/net/eepro100.c
+++ b/hw/net/eepro100.c
@@ -1843,6 +1843,7 @@ static void pci_nic_uninit(PCIDevice *pci_dev)
 {
 EEPRO100State *s = DO_UPCAST(EEPRO100State, dev, pci_dev);
 
+del_boot_device_path(&pci_dev->qdev);
 memory_region_destroy(&s->mmio_bar);
 memory_region_destroy(&s->io_bar);
 memory_region_destroy(&s->flash_bar);
diff --git a/hw/net/ne2000.c b/hw/net/ne2000.c
index d558b8c..62afa65 100644
--- a/hw/net/ne2000.c
+++ b/hw/net/ne2000.c
@@ -748,6 +748,7 @@ static void pci_ne2000_exit(PCIDevice *pci_dev)
 PCINE2000State *d = DO_UPCAST(PCINE2000State, dev, pci_dev);
 NE2000State *s = &d->ne2000;
 
+del_boot_device_path(&pci_dev->qdev);
 memory_region_destroy(&s->io);
 qemu_del_nic(s->nic);
 qemu_free_irq(s->irq);
diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
index 90bc5ec..b34c91a 100644
--- a/hw/net/rtl8139.c
+++ b/hw/net/rtl8139.c
@@ -3462,6 +3462,7 @@ static void pci_rtl8139_uninit(PCIDevice *dev)
 {
 RTL8139State *s = RTL8139(dev);
 
+del_boot_device_path(DEVICE(dev));
 memory_region_destroy(&s->bar_io);
 memory_region_destroy(&s->bar_mem);
 if (s->cplus_txbuffer) {
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 268eff9..0419575 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1654,6 +1654,7 @@ static void virtio_net_device_unrealize(DeviceState *dev, 
Error **errp)
 virtio_net_set_status(vdev, 0);
 
 unregister_savevm(dev, "virtio-net", n);
+del_boot_device_path(dev);
 
 g_free(n->netclient_name);
 n->netclient_name = NULL;
diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index 77bea6f..c10d84e 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -2176,6 +2176,7 @@ static void vmxnet3_pci_uninit(PCIDevice *pci_dev)
 VMW_CBPRN("Starting uninit...");
 
 unregister_savevm(dev, "vmxnet3-msix", s);
+del_boot_device_path(dev);
 
 vmxnet3_net_uninit(s);
 
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 3733d2c..b270bc7 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -388,6 +388,7 @@ static void scsi_generic_reset(DeviceState *dev)
 
 static void scsi_destroy(SCSIDevice *s)
 {
+del_boot_device_path(&s->qdev);
 scsi_device_purge_requests(s, SENSE_CODE(NO_SENSE));
 blockdev_mark_auto_del(s->conf.bs);
 }
diff --git a/hw/usb/dev-network.c b/hw/usb/dev-network.c
index 518d536..be39802 100644
--- a/hw/usb/dev-network.c
+++ b/hw/usb/dev-network.c
@@ -1331,6 +1331,7 @@ static void usb_net_handle_destroy(USBDevice *dev)
 /* TODO: remove the nd_table[] e

[Qemu-devel] [PATCH v4 8/8] qemu-monitor: add HMP "info-bootindex" command

2014-07-31 Thread arei.gonglei

From: Gonglei 

Add HMP info-bootindex command to getting
devcie's bootindex via monitor.

Signed-off-by: Gonglei 
Signed-off-by: Chenliang 
---
 hmp-commands.hx |  2 ++
 hmp.c   | 20 
 hmp.h   |  1 +
 monitor.c   |  7 +++
 4 files changed, 30 insertions(+)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 31ef24e..bc1b982 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1795,6 +1795,8 @@ show qdev device model list
 show roms
 @item info tpm
 show the TPM device
+@item info bootindex
+show the current VM bootindex information
 @end table
 ETEXI
 
diff --git a/hmp.c b/hmp.c
index 95f7eeb..1688e02 100644
--- a/hmp.c
+++ b/hmp.c
@@ -725,6 +725,26 @@ void hmp_info_tpm(Monitor *mon, const QDict *qdict)
 qapi_free_TPMInfoList(info_list);
 }
 
+void hmp_info_bootindex(Monitor *mon, const QDict *qdict)
+{
+BootindexInfoList *bootindex_list, *info;
+
+bootindex_list = qmp_query_bootindex(NULL);
+if (!bootindex_list) {
+monitor_printf(mon, "No bootindex was configured\n");
+return;
+}
+
+monitor_printf(mon, "id \t bootindex \t suffix\n");
+for (info = bootindex_list; info; info = info->next) {
+monitor_printf(mon, "\"%s\"\t %"PRId64"\t\"%s\"\n",
+   info->value->id, info->value->bootindex,
+   info->value->suffix);
+}
+
+qapi_free_BootindexInfoList(bootindex_list);
+}
+
 void hmp_quit(Monitor *mon, const QDict *qdict)
 {
 monitor_suspend(mon);
diff --git a/hmp.h b/hmp.h
index eb2641a..5899537 100644
--- a/hmp.h
+++ b/hmp.h
@@ -38,6 +38,7 @@ void hmp_info_balloon(Monitor *mon, const QDict *qdict);
 void hmp_info_pci(Monitor *mon, const QDict *qdict);
 void hmp_info_block_jobs(Monitor *mon, const QDict *qdict);
 void hmp_info_tpm(Monitor *mon, const QDict *qdict);
+void hmp_info_bootindex(Monitor *mon, const QDict *qdict);
 void hmp_quit(Monitor *mon, const QDict *qdict);
 void hmp_stop(Monitor *mon, const QDict *qdict);
 void hmp_system_reset(Monitor *mon, const QDict *qdict);
diff --git a/monitor.c b/monitor.c
index 5bc70a6..8158ddb 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2918,6 +2918,13 @@ static mon_cmd_t info_cmds[] = {
 .mhandler.cmd = hmp_info_memdev,
 },
 {
+.name   = "bootindex",
+.args_type  = "",
+.params = "",
+.help   = "show the current VM bootindex information",
+.mhandler.cmd = hmp_info_bootindex,
+},
+{
 .name   = NULL,
 },
 };
-- 
1.7.12.4

[Qemu-devel] [PATCH v4 5/8] qmp: add set-bootindex command

2014-07-31 Thread arei.gonglei

From: Gonglei 

Adds "set-bootindex id=xx,bootindex=xx,suffix=xx" QMP command.

Example QMP command:
-> { "execute": "set-bootindex",
 "arguments": { "id": "ide0-0-1", "bootindex": 1, "suffix": "/disk@0"}}
<- { "return": {} }

Signed-off-by: Gonglei 
Signed-off-by: Chenliang 
---
 qapi-schema.json | 17 +
 qmp-commands.hx  | 25 +
 qmp.c| 17 +
 3 files changed, 59 insertions(+)

diff --git a/qapi-schema.json b/qapi-schema.json
index b11aad2..30bd6ad 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1704,6 +1704,23 @@
 { 'command': 'device_del', 'data': {'id': 'str'} }
 
 ##
+# @set-bootindex:
+#
+# set bootindex of a device
+#
+# @id: the name of the device
+# @bootindex: the bootindex of the device
+# @suffix: #optional a suffix of the device
+#
+# Returns: Nothing on success
+#  If @id is not a valid device, DeviceNotFound
+#
+# Since: 2.2
+##
+{ 'command': 'set-bootindex',
+  'data': {'id': 'str', 'bootindex': 'int', '*suffix': 'str'} }
+
+##
 # @DumpGuestMemoryFormat:
 #
 # An enumeration of guest-memory-dump's format.
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 4be4765..19cc3b8 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -330,6 +330,31 @@ Example:
 <- { "return": {} }
 
 EQMP
+{
+.name   = "set-bootindex",
+.args_type  = "id:s,bootindex:l,suffix:s?",
+.mhandler.cmd_new = qmp_marshal_input_set_bootindex,
+},
+
+SQMP
+set-bootindex
+-
+
+Set bootindex of a device
+
+Arguments:
+
+- "id": the device's ID (json-string)
+- "bootindex": the device's bootindex (json-int)
+- "suffix": the device's suffix in global boot list (json-string, optional)
+
+Example:
+
+-> { "execute": "set-bootindex",
+ "arguments": { "id": "ide0-0-1", "bootindex": 1, "suffix": "/disk@0"}}
+<- { "return": {} }
+
+EQMP
 
 {
 .name   = "send-key",
diff --git a/qmp.c b/qmp.c
index 0d2553a..e046200 100644
--- a/qmp.c
+++ b/qmp.c
@@ -684,6 +684,23 @@ void qmp_object_del(const char *id, Error **errp)
 object_unparent(obj);
 }
 
+void qmp_set_bootindex(const char *id, int64_t bootindex,
+   bool has_suffix, const char *suffix, Error **errp)
+{
+DeviceState *dev;
+
+dev = qdev_find_recursive(sysbus_get_default(), id);
+if (!dev) {
+error_set(errp, QERR_DEVICE_NOT_FOUND, id);
+return;
+}
+if (has_suffix) {
+modify_boot_device_path(bootindex, dev, suffix);
+} else {
+modify_boot_device_path(bootindex, dev, NULL);
+}
+}
+
 MemoryDeviceInfoList *qmp_query_memory_devices(Error **errp)
 {
 MemoryDeviceInfoList *head = NULL;
-- 
1.7.12.4

Re: [Qemu-devel] [v2][PATCH 2/5] hw:pci-host:piix: split i440fx_init

2014-07-31 Thread Michael S. Tsirkin

On Thu, Jul 31, 2014 at 05:26:41PM +0800, Chen, Tiejun wrote:
> On 2014/7/31 17:10, Michael S. Tsirkin wrote:
> >On Thu, Jul 31, 2014 at 02:31:36PM +0800, Tiejun Chen wrote:
> >>We'd like to split i440fx_init and then we can share something
> >>with other stuff.
> >>
> >>Signed-off-by: Tiejun Chen 
> >
> >I think this is too much work for very little benefit.
> >Just pass const char *type to i440fx_init.
> 
> You know we will introduce that faked PCIe device represented that PCH
> later,

Later when? On top of this patch series? Would like to see it all
before applying this ...

> so how to distinguish them? Are you saying I should check the type
> like this?
> 
> if(Xen-Type)
> {}
> else
> {}
>

No! Put the code in init function for the respective class,
pass type as an argument:


i440fx: make types configurable at run-time

Xen wants to supply a different pci and host devices,
inheriting i440fx devices. Make types configurable.

Signed-off-by: Michael S. Tsirkin 

-->
 
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index be8fdfe..86f295a 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -230,7 +230,11 @@ extern int no_hpet;
 struct PCII440FXState;
 typedef struct PCII440FXState PCII440FXState;
 
-PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn,
+#define TYPE_I440FX_PCI_HOST_BRIDGE "i440FX-pcihost"
+#define TYPE_I440FX_PCI_DEVICE "i440FX"
+
+PCIBus *i440fx_init(const char *host_type, const char *pci_type,
+PCII440FXState **pi440fx_state, int *piix_devfn,
 ISABus **isa_bus, qemu_irq *pic,
 MemoryRegion *address_space_mem,
 MemoryRegion *address_space_io,
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 31125b7..e0979cd 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -194,7 +194,9 @@ static void pc_init1(MachineState *machine,
 }
 
 if (pci_enabled) {
-pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, &isa_bus, gsi,
+pci_bus = i440fx_init(TYPE_I440FX_PCI_HOST_BRIDGE,
+  TYPE_I440FX_PCI_DEVICE,
+  &i440fx_state, &piix3_devfn, &isa_bus, gsi,
   system_memory, system_io, machine->ram_size,
   below_4g_mem_size,
   above_4g_mem_size,
diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index e0e0946..0cd82b8 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -40,7 +40,6 @@
  * http://download.intel.com/design/chipsets/datashts/29054901.pdf
  */
 
-#define TYPE_I440FX_PCI_HOST_BRIDGE "i440FX-pcihost"
 #define I440FX_PCI_HOST_BRIDGE(obj) \
 OBJECT_CHECK(I440FXState, (obj), TYPE_I440FX_PCI_HOST_BRIDGE)
 
@@ -91,7 +90,6 @@ typedef struct PIIX3State {
 MemoryRegion rcr_mem;
 } PIIX3State;
 
-#define TYPE_I440FX_PCI_DEVICE "i440FX"
 #define I440FX_PCI_DEVICE(obj) \
 OBJECT_CHECK(PCII440FXState, (obj), TYPE_I440FX_PCI_DEVICE)
 
@@ -305,7 +303,8 @@ static int i440fx_initfn(PCIDevice *dev)
 return 0;
 }
 
-PCIBus *i440fx_init(PCII440FXState **pi440fx_state,
+PCIBus *i440fx_init(const char *host_type, const char *pci_type,
+PCII440FXState **pi440fx_state,
 int *piix3_devfn,
 ISABus **isa_bus, qemu_irq *pic,
 MemoryRegion *address_space_mem,
@@ -325,7 +324,7 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state,
 unsigned i;
 I440FXState *i440fx;
 
-dev = qdev_create(NULL, TYPE_I440FX_PCI_HOST_BRIDGE);
+dev = qdev_create(NULL, host_type);
 s = PCI_HOST_BRIDGE(dev);
 b = pci_bus_new(dev, NULL, pci_address_space,
 address_space_io, 0, TYPE_PCI_BUS);
@@ -333,7 +332,7 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state,
 object_property_add_child(qdev_get_machine(), "i440fx", OBJECT(dev), NULL);
 qdev_init_nofail(dev);
 
-d = pci_create_simple(b, 0, TYPE_I440FX_PCI_DEVICE);
+d = pci_create_simple(b, 0, pci_type);
 *pi440fx_state = I440FX_PCI_DEVICE(d);
 f = *pi440fx_state;
 f->system_memory = address_space_mem;

Re: [Qemu-devel] [Spice-devel] qemu run crash when divice_add another qxl display card

2014-07-31 Thread Marc-André Lureau

On Tue, Jul 29, 2014 at 9:50 AM, zhou link  wrote:

> here hit an assertion:
>
> qemu-system-x86_64 -monitor stdio -vga qxl -spice
> port=,disable-ticketing
> (/home/brook/local/bin/qemu-system-x86_64:27280): Spice-Warning **:
> reds.c:3295:spice_server_init: [07-29 23:41:47]ct: Jul 26 2014 00:28:12
>
>
QEMU 2.0.0 monitor - type 'help' for more information
> (qemu)
> (qemu)
> (qemu)
> (qemu) device_add qxl
> (/home/brook/local/bin/qemu-system-x86_64:27280): SpiceWorker-ERROR **:
> red_worker.c:12385:handle_dev_stop: [07-29 23:41:56]assertion
> `worker->running' failed
>
>
Breakpoint 3, qemu_spice_display_stop () at ui/spice-core.c:922
922spice_server_vm_stop(spice_
server);
(gdb) bt
#0  qemu_spice_display_stop () at ui/spice-core.c:922
#1  0x55806910 in qxl_hard_reset (d=0x56612660, loadvm=0) at
hw/display/qxl.c:1158
#2  0x558069b9 in qxl_reset_handler (dev=0x56612660) at
hw/display/qxl.c:1184
#3  0x557d25a9 in device_reset (dev=0x56612660) at
hw/core/qdev.c:996
#4  0x557d1e85 in device_set_realized (obj=0x56612660,
value=true, errp=0x7fffc298)
at hw/core/qdev.c:833
#5  0x558c76c7 in property_set_bool (obj=0x56612660,
v=0x5637d770, opaque=0x5639ddb0,
name=0x559ae629 "realized", errp=0x7fffc298) at
qom/object.c:1421
#6  0x558c6245 in object_property_set (obj=0x56612660,
v=0x5637d770,
name=0x559ae629 "realized", errp=0x7fffc298) at qom/object.c:819
#7  0x558c7d0f in object_property_set_qobject (obj=0x56612660,
value=0x56388f50,
name=0x559ae629 "realized", errp=0x7fffc298) at
qom/qom-qobject.c:24
#8  0x558c6490 in object_property_set_bool (obj=0x56612660,
value=true,
name=0x559ae629 "realized", errp=0x7fffc298) at qom/object.c:883
#9  0x5570fda2 in qdev_device_add (opts=0x565bbca0) at
qdev-monitor.c:560


In qemu, spice_server_vm_stop() is called from device "reset" when it is
added, because spice_display_is_running. The spice server exposes a single
state, regardless the number of devices/worker,

In Spice server, handle_dev_stop:
-spice_assert(worker->running);
+if (!worker->running)
+return;
+

It looks like replacing the assert with a simple check solves the issue,
but I am not sure qemu or the spice code handles the rest fine. What were
you trying to achieve?


cheers


-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH 01/15] qemu coroutine: support bypass mode

2014-07-31 Thread Ming Lei

On Thu, Jul 31, 2014 at 5:15 PM, Paolo Bonzini  wrote:
> Il 31/07/2014 10:59, Ming Lei ha scritto:
>>> > No guesses please.  Actually that's also my guess, but since you are
>>> > submitting the patch you must do better and show profiles where stack
>>> > switching disappears after the patches.
>> Follows the below hardware events reported by 'perf stat' when running
>> fio randread benchmark for 2min in VM(single vq, 2 jobs):
>>
>> sudo ~/bin/perf stat -e
>> L1-dcache-loads,L1-dcache-load-misses,cpu-cycles,instructions,branch-instructions,branch-misses,branch-loads,branch-load-misses,dTLB-loads,dTLB-load-misses
>> ./nqemu-start-mq 4 1
>>
>> 1), without bypassing coroutine via forcing to set 's->raw_format ' as
>> false, see patch 5/15
>>
>> - throughout: 95K
>>232,564,905,115  instructions
>>  161.991075781 seconds time elapsed
>>
>>
>> 2), with bypassing coroutinue
>> - throughput: 115K
>>255,526,629,881  instructions
>>  162.333465490 seconds time elapsed
>
> Ok, so you are saving 10% instructions per iop: before 232G / 95K =
> 2.45M instructions/iop, 255G / 115K = 2.22M instructions/iop.

I am wondering if it is useful since IOP is from view of VM, and instructions
is got from host(qemu). If qemu dataplane handles IO quickly enough,
it can save instructions by batch operations.

But I will collect the 'perf report' on 'instruction' event for you.

L1-dcache-load-misses ratio is increased by 1%, and instructions per
cycle is decreased by 10%(0.88->0.97), which is big too, and means
qemu becomes much slower when running block I/O in VM by
coroutine.

Thanks,

Re: [Qemu-devel] [v2][PATCH 2/5] hw:pci-host:piix: split i440fx_init

2014-07-31 Thread Chen, Tiejun


On 2014/7/31 17:53, Michael S. Tsirkin wrote:

On Thu, Jul 31, 2014 at 05:26:41PM +0800, Chen, Tiejun wrote:

On 2014/7/31 17:10, Michael S. Tsirkin wrote:

On Thu, Jul 31, 2014 at 02:31:36PM +0800, Tiejun Chen wrote:

We'd like to split i440fx_init and then we can share something
with other stuff.

Signed-off-by: Tiejun Chen 


I think this is too much work for very little benefit.
Just pass const char *type to i440fx_init.


You know we will introduce that faked PCIe device represented that PCH
later,


Later when? On top of this patch series? Would like to see it all
before applying this ...


I will send this with other IGD stuff after this series is fine to you 
since its just creating a simple PCIe device.


I think you should know this whole story since as you guys discussed we 
don't fix that PCH at 1f.0. So it may be like this,


static int create_pseudo_pch_isa_bridge(PCIBus *bus, XenHostPCIDevice *hdev)
{

struct PCIDevice *dev;

char rid;

/* We havt to use a simple PCI device to fake this ISA bridge
 * to avoid making some confusion to BIOS and ACPI.
 */
dev = pci_create(bus, -1, "pseudo-intel-pch-isa-bridge");

qdev_init_nofail(&dev->qdev);
pci_config_set_vendor_id(dev->config, XEN_SUBSYSTEM_ID));
^
I don't remember this exactly.
pci_config_set_device_id(dev->config, hdev->device_id);

return 0;
}




so how to distinguish them? Are you saying I should check the type
like this?

if(Xen-Type)
{}
else
{}



No! Put the code in init function for the respective class,
pass type as an argument:


If you mean we don't introduce any "if/else", I still don't understand 
how to insert such that function above, could you show this exactly?


Tiejun




i440fx: make types configurable at run-time

Xen wants to supply a different pci and host devices,
inheriting i440fx devices. Make types configurable.

Signed-off-by: Michael S. Tsirkin 

-->

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index be8fdfe..86f295a 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -230,7 +230,11 @@ extern int no_hpet;
  struct PCII440FXState;
  typedef struct PCII440FXState PCII440FXState;

-PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn,
+#define TYPE_I440FX_PCI_HOST_BRIDGE "i440FX-pcihost"
+#define TYPE_I440FX_PCI_DEVICE "i440FX"
+
+PCIBus *i440fx_init(const char *host_type, const char *pci_type,
+PCII440FXState **pi440fx_state, int *piix_devfn,
  ISABus **isa_bus, qemu_irq *pic,
  MemoryRegion *address_space_mem,
  MemoryRegion *address_space_io,
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 31125b7..e0979cd 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -194,7 +194,9 @@ static void pc_init1(MachineState *machine,
  }

  if (pci_enabled) {
-pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, &isa_bus, gsi,
+pci_bus = i440fx_init(TYPE_I440FX_PCI_HOST_BRIDGE,
+  TYPE_I440FX_PCI_DEVICE,
+  &i440fx_state, &piix3_devfn, &isa_bus, gsi,
system_memory, system_io, machine->ram_size,
below_4g_mem_size,
above_4g_mem_size,
diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index e0e0946..0cd82b8 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -40,7 +40,6 @@
   * http://download.intel.com/design/chipsets/datashts/29054901.pdf
   */

-#define TYPE_I440FX_PCI_HOST_BRIDGE "i440FX-pcihost"
  #define I440FX_PCI_HOST_BRIDGE(obj) \
  OBJECT_CHECK(I440FXState, (obj), TYPE_I440FX_PCI_HOST_BRIDGE)

@@ -91,7 +90,6 @@ typedef struct PIIX3State {
  MemoryRegion rcr_mem;
  } PIIX3State;

-#define TYPE_I440FX_PCI_DEVICE "i440FX"
  #define I440FX_PCI_DEVICE(obj) \
  OBJECT_CHECK(PCII440FXState, (obj), TYPE_I440FX_PCI_DEVICE)

@@ -305,7 +303,8 @@ static int i440fx_initfn(PCIDevice *dev)
  return 0;
  }

-PCIBus *i440fx_init(PCII440FXState **pi440fx_state,
+PCIBus *i440fx_init(const char *host_type, const char *pci_type,
+PCII440FXState **pi440fx_state,
  int *piix3_devfn,
  ISABus **isa_bus, qemu_irq *pic,
  MemoryRegion *address_space_mem,
@@ -325,7 +324,7 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state,
  unsigned i;
  I440FXState *i440fx;

-dev = qdev_create(NULL, TYPE_I440FX_PCI_HOST_BRIDGE);
+dev = qdev_create(NULL, host_type);
  s = PCI_HOST_BRIDGE(dev);
  b = pci_bus_new(dev, NULL, pci_address_space,
  address_space_io, 0, TYPE_PCI_BUS);
@@ -333,7 +332,7 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state,
  object_property_add_child(qdev_get_machine(), "i440fx", OBJECT(dev), 
NULL);
  qdev_init_nof

Re: [Qemu-devel] [PATCH 3/4] libqos: add a simple first-fit memory allocator

2014-07-31 Thread Stefan Hajnoczi

On Wed, Jul 30, 2014 at 06:28:28PM -0400, John Snow wrote:
> +#define bitany(X, MASK) ((X) & (MASK))
> +#define bitset(X, MASK) (bitany((X), (MASK)) == (MASK))

This is subjective but macros like this should be avoided.  This macro does not
encapsulate anything substantial.  It forces the reader to jump to the
definition of this macro to understand the code, making code harder to read.

IMO a cleaner solution is to drop the macros:

  PCAllocOpts mask = PC_ALLOC_PARANOID | PC_ALLOC_LEAK_ASSERT;
  if (s->opts & mask == mask) { (1)
  if ((node->addr != s->start) ||
  (node->size != s->end - s->start)) {
  fprintf(stderr, "Free list is corrupted.\n");
  if (s->opts & PC_ALLOC_LEAK_ASSERT) { (2)
  g_assert_not_reached();
  }
  }
  }

Now that I read the expanded code, a bug becomes exposed:

In (1) we check that PC_ALLOC_PARANOID and PC_ALLOC_LEAK_ASSERT are both set.
Then in (2) we check whether PC_ALLOC_LEAK_ASSERT is set.  But we already knew
that PC_ALLOC_LEAK_ASSERT must be set in (1), so I guess the logic should have
really been:

  if (s->opts & (PC_ALLOC_PARANOID | PC_ALLOC_LEAK_ASSERT)) {
  if ((node->addr != s->start) ||
  (node->size != s->end - s->start)) {
  fprintf(stderr, "Free list is corrupted.\n");
  if (s->opts & PC_ALLOC_LEAK_ASSERT) {
  g_assert_not_reached();
  }
  }
  }

> +#define MLIST_ENTNAME entries
> +#define MLIST_FOREACH(node, head) QTAILQ_FOREACH((node), (head), 
> MLIST_ENTNAME)
> +#define MLIST_PREV(node) QTAILQ_PREV((node), MemList, MLIST_ENTNAME);
> +#define MLIST_NEXT(node) QTAILQ_NEXT((node), MLIST_ENTNAME);

For the same reasons as my previous comment, please don't hide straightforward
expressions behind a macro.

> +typedef QTAILQ_HEAD(MemList, MemBlock) MemList;
> +typedef struct MemBlock {
> +QTAILQ_ENTRY(MemBlock) MLIST_ENTNAME;
> +uint64_t size;
> +uint64_t addr;
> +} MemBlock;
>  
>  typedef struct PCAlloc
>  {
>  QGuestAllocator alloc;
> -
> +PCAllocOpts opts;
>  uint64_t start;
>  uint64_t end;
> +
> +MemList used;
> +MemList free;
>  } PCAlloc;
>  
> -static uint64_t pc_alloc(QGuestAllocator *allocator, size_t size)
> +static inline void mlist_insert(MemList *head, MemBlock *insr)
>  {
> -PCAlloc *s = container_of(allocator, PCAlloc, alloc);
> -uint64_t addr;
> +QTAILQ_INSERT_HEAD(head, insr, MLIST_ENTNAME);
> +}
> +
> +static inline void mlist_append(MemList *head, MemBlock *node)
> +{
> +QTAILQ_INSERT_TAIL(head, node, MLIST_ENTNAME);
> +}
> +
> +static inline void mlist_unlink(MemList *head, MemBlock *rem)
> +{
> +QTAILQ_REMOVE(head, rem, MLIST_ENTNAME);
> +}

For the same reasons as my comments about the macros, these trivial functions
are boilerplate.  Why not use the QTAILQ macros directly?

(It would be good to hide the list implementation if this was an external API
and you want to avoid exposing the implementation details, but within this
source file there is no point in extra layers of indirection.)


pgpw1YMfVHJN1.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH v2 0/4] libqos: add a simple first-fit memory allocator

2014-07-31 Thread Stefan Hajnoczi

On Wed, Jul 30, 2014 at 06:28:25PM -0400, John Snow wrote:
> This set collects two patches by Marc Marí already on the mailing list,
> but goes further by adding a simple memory allocator that allows us to
> track and debug freed memory, and optionally keep track of any leaks.
> 
> 
> v2: use QTAILQ as a basis for the linked list implementation instead.
> Correct an error in the size of the initial node.
> 
> 
> John Snow (2):
>   libqos: add a simple first-fit memory allocator
>   qtest/ide: Uninitialize PC allocator
> 
> Marc Marí (2):
>   libqos: Correct mask to align size to PAGE_SIZE in malloc-pc
>   libqos: Change free function called in malloc
> 
>  tests/ide-test.c |   2 +
>  tests/libqos/malloc-pc.c | 296 
> +--
>  tests/libqos/malloc-pc.h |   9 ++
>  tests/libqos/malloc.h|   2 +-
>  4 files changed, 298 insertions(+), 11 deletions(-)

Looking pretty close, please CC Andreas Färber .

Stefan


pgpRaYlQx5xcQ.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH 01/15] qemu coroutine: support bypass mode

2014-07-31 Thread Paolo Bonzini

Il 31/07/2014 11:47, Ming Lei ha scritto:
>> Block mirroring of a device for example is done using coroutines.
>> As block mirroring can be done on a raw device you need coroutines.
> 
> If block layer knows the mirroring is in-progress, it still can enable
> coroutine by ignoring bypass coroutine, or just let device disable
> bypass coroutine in case of mirroring, and the current APIs are very
> flexible.

What matters is not whether you're mirroring.  What matters is whether
you're calling bdrv_aio_readv/writev or bdrv_co_readv/writev.  Under
some limitation, if you are calling bdrv_aio_readv/writev you can bypass
coroutines.  (In fact drive mirroring uses those APIs too so it doesn't
need coroutines and can benefit from the speedup too. :)  But
drive-backup does need them).

The limitations are essentially "would bdrv_co_do_preadv or
bdrv_co_do_pwritev do anything special?"  This means for example for
bdrv_co_do_pwritev:

- bs->drv must be non-NULL

- bs->read_only must be false

- bdrv_check_byte_request(bs, offset, bytes) must return false

- bs->io_limits_enabled must be false

- the request must be aligned

- (in bdrv_aligned_pwritev) the before_write_notifiers must be empty

- (in bdrv_aligned_pwritev) bs->detect_zeroes must be off

- (in bdrv_aligned_pwritev) the BDRV_REQ_ZERO_WRITE flag must be off

- (in bdrv_aligned_pwritev) bs->enable_write_cache must be false

and the hard part is organizing the code so that the code duplication is
as limited as possible.

Paolo

Re: [Qemu-devel] [PATCH 03/28] ide-test: add test for werror=stop

2014-07-31 Thread Stefan Hajnoczi

On Mon, Jul 07, 2014 at 02:17:44PM -0400, John Snow wrote:
> @@ -489,6 +490,72 @@ static void test_flush(void)
>  ide_test_quit();
>  }
>  
> +static void prepare_blkdebug_script(const char *debug_path, const char 
> *event)
> +{
> +FILE *debug_file = fopen(debug_path, "w");

Please avoid shadowing the global debug_path variable.  Perhaps simply
call the argument 'filename'.

> +static void test_retry_flush(void)
> +{
> +uint8_t data;
> +const char *s;
> +
> +prepare_blkdebug_script(debug_path, "flush_to_disk");
> +
> +ide_test_start(
> +"-vnc none "
> +"-drive 
> file=blkdebug:%s:%s,if=ide,cache=writeback,rerror=stop,werror=stop",
> +debug_path, tmp_path);
> +
> +/* FLUSH CACHE command on device 0*/
> +outb(IDE_BASE + reg_device, 0);
> +outb(IDE_BASE + reg_command, CMD_FLUSH_CACHE);
> +
> +/* Check status while request is in flight*/
> +data = inb(IDE_BASE + reg_status);
> +assert_bit_set(data, BSY | DRDY);
> +assert_bit_clear(data, DF | ERR | DRQ);
> +
> +sleep(1);/* HACK: wait for event */
> +
> +/* Complete the command */
> +s = "{'execute':'cont' }";
> +while (!qmp(s)) {
> +s = "";
> +sleep(1);
> +}

I guess we're supposed to wait for the block I/O error event when the
machine stops.  Please implement that and replace this polling loop.

See the STOP event in docs/qmp/qmp-events.txt.

> @@ -501,6 +568,11 @@ int main(int argc, char **argv)
>  return 0;
>  }
>  
> +/* Create temporary blkdebug instructions */
> +fd = mkstemp(debug_path);
> +g_assert(fd >= 0);
> +close(fd);
> +

debug_path must be unlinked at the end of main().  Tests should not
clutter up the /tmp directory, all resources must be freed.


pgpCaKeGcmhUm.pgp
Description: PGP signature

Re: [Qemu-devel] latest rc: virtio-blk hangs forever after migration

2014-07-31 Thread Marcin Gibuła


Can you dump *env before and after the call to kvm_arch_get_registers?


Yes, but it seems they are equal - I used memcmp() to compare them. Is
there any other side effect that cpu_synchronize_all_states() may have?


I think I found it.

The reason for hang is, because when second call to 
kvm_arch_get_registers() is skipped, it also skips kvm_get_apic() which 
updates cpu->apic_state.


--
mg

Re: [Qemu-devel] [PATCH 1/9] qom: object: delete properties before calling instance_finalize

2014-07-31 Thread Peter Crosthwaite

On Wed, Jul 30, 2014 at 8:27 PM, Paolo Bonzini  wrote:
> This ensures that the children's unparent callback will still
> have a usable parent.
>
> Signed-off-by: Paolo Bonzini 

Reviewed-by: Peter Crosthwaite 

> ---
>  qom/object.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/qom/object.c b/qom/object.c
> index 0e8267b..f301bc2 100644
> --- a/qom/object.c
> +++ b/qom/object.c
> @@ -418,8 +418,8 @@ static void object_finalize(void *data)
>  Object *obj = data;
>  TypeImpl *ti = obj->class->type;
>
> -object_deinit(obj, ti);
>  object_property_del_all(obj);
> +object_deinit(obj, ti);
>
>  g_assert(obj->ref == 0);
>  if (obj->free) {
> --
> 1.8.3.1
>
>
>

Re: [Qemu-devel] [PATCH 05/28] ide: simplify reset callbacks

2014-07-31 Thread Stefan Hajnoczi

On Mon, Jul 07, 2014 at 02:17:46PM -0400, John Snow wrote:
> From: Paolo Bonzini 
> 
> Drop the unused return value and make the callback optional.
> 
> Signed-off-by: Paolo Bonzini 
> Signed-off-by: John Snow 
> ---
>  hw/ide/ahci.c | 6 --
>  hw/ide/core.c | 5 +++--
>  hw/ide/internal.h | 3 ++-
>  hw/ide/macio.c| 1 -
>  hw/ide/pci.c  | 4 +---
>  5 files changed, 6 insertions(+), 13 deletions(-)

Reviewed-by: Stefan Hajnoczi 


pgpnkHh279qVL.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH 04/28] ide: stash aiocb for flushes

2014-07-31 Thread Stefan Hajnoczi

On Mon, Jul 07, 2014 at 02:17:45PM -0400, John Snow wrote:
> From: Paolo Bonzini 
> 
> This ensures that operations are completed after a reset
> 
> Signed-off-by: Paolo Bonzini 
> Signed-off-by: John Snow 
> ---
>  hw/ide/core.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)

Reviewed-by: Stefan Hajnoczi 


pgpn6ee3SMxMi.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH 07/28] ide: simplify async_cmd_done callbacks

2014-07-31 Thread Stefan Hajnoczi

On Mon, Jul 07, 2014 at 02:17:48PM -0400, John Snow wrote:
> From: Paolo Bonzini 
> 
> Drop the unused return value.
> 
> Signed-off-by: Paolo Bonzini 
> Signed-off-by: John Snow 
> ---
>  hw/ide/ahci.c | 4 +---
>  hw/ide/internal.h | 2 +-
>  2 files changed, 2 insertions(+), 4 deletions(-)

Reviewed-by: Stefan Hajnoczi 


pgpfQLgY6_WkC.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH 08/28] ide: simplify start_transfer callbacks

2014-07-31 Thread Stefan Hajnoczi

On Mon, Jul 07, 2014 at 02:17:49PM -0400, John Snow wrote:
> From: Paolo Bonzini 
> 
> Drop the unused return value and make the callback optional.
> 
> Signed-off-by: Paolo Bonzini 
> Signed-off-by: John Snow 
> ---
>  hw/ide/ahci.c |  4 +---
>  hw/ide/core.c | 10 +++---
>  hw/ide/internal.h |  3 +--
>  hw/ide/macio.c|  6 --
>  hw/ide/pci.c  |  6 --
>  5 files changed, 5 insertions(+), 24 deletions(-)

Reviewed-by: Stefan Hajnoczi 


pgpJe4jGREyGU.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH 06/28] ide: simplify set_inactive callbacks

2014-07-31 Thread Stefan Hajnoczi

On Mon, Jul 07, 2014 at 02:17:47PM -0400, John Snow wrote:
> From: Paolo Bonzini 
> 
> Drop the unused return value and make the callback optional.
> 
> Signed-off-by: Paolo Bonzini 
> Signed-off-by: John Snow 
> ---
>  hw/ide/ahci.c | 6 --
>  hw/ide/core.c | 5 +++--
>  hw/ide/internal.h | 2 +-
>  hw/ide/macio.c| 1 -
>  hw/ide/pci.c  | 4 +---
>  5 files changed, 5 insertions(+), 13 deletions(-)

Reviewed-by: Stefan Hajnoczi 


pgpoMuBUHtfP_.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH 09/28] ide: wrap start_dma callback

2014-07-31 Thread Stefan Hajnoczi

On Mon, Jul 07, 2014 at 02:17:50PM -0400, John Snow wrote:
> From: Paolo Bonzini 
> 
> Make it optional and prepare for the next patches.
> 
> Signed-off-by: Paolo Bonzini 
> Signed-off-by: John Snow 
> ---
>  hw/ide/atapi.c|  6 ++
>  hw/ide/core.c | 15 ---
>  hw/ide/internal.h |  1 +
>  3 files changed, 11 insertions(+), 11 deletions(-)

Reviewed-by: Stefan Hajnoczi 


pgp1KvXM4W0iu.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH 10/28] ide: remove wrong setting of BM_STATUS_INT

2014-07-31 Thread Stefan Hajnoczi

On Mon, Jul 07, 2014 at 02:17:51PM -0400, John Snow wrote:
> From: Paolo Bonzini 
> 
> Similar to the case removed in commit 69c38b8 (ide/core: Remove explicit
> setting of BM_STATUS_INT, 2011-05-19), the only remaining use of
> add_status(..., BM_STATUS_INT) is for short PRDs.  The flag should
> not be raised in this case.
> 
> Signed-off-by: Paolo Bonzini 
> Signed-off-by: John Snow 
> ---
>  hw/ide/ahci.c  | 4 
>  hw/ide/atapi.c | 1 -
>  2 files changed, 5 deletions(-)

Reviewed-by: Stefan Hajnoczi 


pgpZa8Fzc7FVS.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH 12/28] ide: move BM_STATUS bits to pci.[ch]

2014-07-31 Thread Stefan Hajnoczi

On Mon, Jul 07, 2014 at 02:17:53PM -0400, John Snow wrote:
> From: Paolo Bonzini 
> 
> They are not used by AHCI, and should not be even available there.
> 
> Signed-off-by: Paolo Bonzini 
> Signed-off-by: John Snow 
> ---
>  hw/ide/internal.h | 11 ---
>  hw/ide/pci.c  |  4 
>  hw/ide/pci.h  |  7 +++
>  3 files changed, 11 insertions(+), 11 deletions(-)

Reviewed-by: Stefan Hajnoczi 


pgpowURSkh7Oa.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH 11/28] ide: fold add_status callback into set_inactive

2014-07-31 Thread Stefan Hajnoczi

On Mon, Jul 07, 2014 at 02:17:52PM -0400, John Snow wrote:
> From: Paolo Bonzini 
> 
> It is now called only after the set_inactive callback.  Put the two together.
> 
> Signed-off-by: Paolo Bonzini 
> Signed-off-by: John Snow 
> ---
>  hw/ide/ahci.c |  9 -
>  hw/ide/atapi.c|  2 +-
>  hw/ide/core.c | 12 
>  hw/ide/internal.h |  6 +++---
>  hw/ide/macio.c|  1 -
>  hw/ide/pci.c  | 19 +++
>  6 files changed, 15 insertions(+), 34 deletions(-)

Reviewed-by: Stefan Hajnoczi 


pgpFn345pZrJo.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH 9/9] tpm_tis: remove instance_finalize callback

2014-07-31 Thread Peter Crosthwaite

On Wed, Jul 30, 2014 at 8:27 PM, Paolo Bonzini  wrote:
> It is never used, since ISA device are not hot-unpluggable.
>

Is it not good design practice though for the uninit to be correctly
implemented regardless of whether there is current-day usage? This
seems like the kind of patch that would get reverted if anyone finds a
reason to un-init a QOM object beyond existing hotplug use-cases.

Regards,
Peter

> Signed-off-by: Paolo Bonzini 
> ---
>  hw/tpm/tpm_tis.c | 8 
>  1 file changed, 8 deletions(-)
>
> diff --git a/hw/tpm/tpm_tis.c b/hw/tpm/tpm_tis.c
> index d398c16..82747ee 100644
> --- a/hw/tpm/tpm_tis.c
> +++ b/hw/tpm/tpm_tis.c
> @@ -896,13 +896,6 @@ static void tpm_tis_initfn(Object *obj)
>  &s->mmio);
>  }
>
> -static void tpm_tis_uninitfn(Object *obj)
> -{
> -TPMState *s = TPM(obj);
> -
> -memory_region_del_subregion(get_system_memory(), &s->mmio);
> -}
> -
>  static void tpm_tis_class_init(ObjectClass *klass, void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(klass);
> @@ -918,7 +911,6 @@ static const TypeInfo tpm_tis_info = {
>  .parent = TYPE_ISA_DEVICE,
>  .instance_size = sizeof(TPMState),
>  .instance_init = tpm_tis_initfn,
> -.instance_finalize = tpm_tis_uninitfn,
>  .class_init  = tpm_tis_class_init,
>  };
>
> --
> 1.8.3.1
>
>

Re: [Qemu-devel] [PATCH 4/9] vga: do not dynamically allocate chain4_alias

2014-07-31 Thread Peter Crosthwaite

On Wed, Jul 30, 2014 at 8:27 PM, Paolo Bonzini  wrote:
> Instead, add a boolean variable to indicate the presence of the region.
>

Patch is good. Can you add a sentence on why you are doing this
though? Is it just the save on the repeated malloc and free (which is
sane in its own right) or something bigger in the context of your
series?

> Signed-off-by: Paolo Bonzini 

Otherwise,

Reviewed-by: Peter Crosthwaite 

> ---
>  hw/display/vga.c | 24 ++--
>  hw/display/vga_int.h |  3 ++-
>  2 files changed, 12 insertions(+), 15 deletions(-)
>
> diff --git a/hw/display/vga.c b/hw/display/vga.c
> index 4b089a3..68cfee2 100644
> --- a/hw/display/vga.c
> +++ b/hw/display/vga.c
> @@ -168,15 +168,18 @@ static uint8_t expand4to8[16];
>
>  static void vga_update_memory_access(VGACommonState *s)
>  {
> -MemoryRegion *region, *old_region = s->chain4_alias;
>  hwaddr base, offset, size;
>
>  if (s->legacy_address_space == NULL) {
>  return;
>  }
>
> -s->chain4_alias = NULL;
> -
> +if (s->has_chain4_alias) {
> +memory_region_del_subregion(s->legacy_address_space, 
> &s->chain4_alias);
> +memory_region_destroy(&s->chain4_alias);
> +s->has_chain4_alias = false;
> +s->plane_updated = 0xf;
> +}
>  if ((s->sr[VGA_SEQ_PLANE_WRITE] & VGA_SR02_ALL_PLANES) ==
>  VGA_SR02_ALL_PLANES && s->sr[VGA_SEQ_MEMORY_MODE] & VGA_SR04_CHN_4M) 
> {
>  offset = 0;
> @@ -201,18 +204,11 @@ static void vga_update_memory_access(VGACommonState *s)
>  break;
>  }
>  base += isa_mem_base;
> -region = g_malloc(sizeof(*region));
> -memory_region_init_alias(region, memory_region_owner(&s->vram),
> +memory_region_init_alias(&s->chain4_alias, 
> memory_region_owner(&s->vram),
>   "vga.chain4", &s->vram, offset, size);
>  memory_region_add_subregion_overlap(s->legacy_address_space, base,
> -region, 2);
> -s->chain4_alias = region;
> -}
> -if (old_region) {
> -memory_region_del_subregion(s->legacy_address_space, old_region);
> -memory_region_destroy(old_region);
> -g_free(old_region);
> -s->plane_updated = 0xf;
> +&s->chain4_alias, 2);
> +s->has_chain4_alias = true;
>  }
>  }
>
> @@ -1321,7 +1317,7 @@ static void vga_draw_text(VGACommonState *s, int 
> full_update)
>  s->font_offsets[1] = offset;
>  full_update = 1;
>  }
> -if (s->plane_updated & (1 << 2) || s->chain4_alias) {
> +if (s->plane_updated & (1 << 2) || s->has_chain4_alias) {
>  /* if the plane 2 was modified since the last display, it
> indicates the font may have been modified */
>  s->plane_updated = 0;
> diff --git a/hw/display/vga_int.h b/hw/display/vga_int.h
> index 5320abd..641f8f4 100644
> --- a/hw/display/vga_int.h
> +++ b/hw/display/vga_int.h
> @@ -94,7 +94,8 @@ typedef struct VGACommonState {
>  uint32_t vram_size;
>  uint32_t vram_size_mb; /* property */
>  uint32_t latch;
> -MemoryRegion *chain4_alias;
> +bool has_chain4_alias;
> +MemoryRegion chain4_alias;
>  uint8_t sr_index;
>  uint8_t sr[256];
>  uint8_t gr_index;
> --
> 1.8.3.1
>
>
>

Re: [Qemu-devel] [PATCH 9/9] tpm_tis: remove instance_finalize callback

2014-07-31 Thread Paolo Bonzini

Il 31/07/2014 14:00, Peter Crosthwaite ha scritto:
> On Wed, Jul 30, 2014 at 8:27 PM, Paolo Bonzini  wrote:
>> It is never used, since ISA device are not hot-unpluggable.
>>
> 
> Is it not good design practice though for the uninit to be correctly
> implemented regardless of whether there is current-day usage? This
> seems like the kind of patch that would get reverted if anyone finds a
> reason to un-init a QOM object beyond existing hotplug use-cases.

But even then, it should be an unrealize method, not an
instance_finalize, shouldn't it?

Paolo

> Regards,
> Peter
> 
>> Signed-off-by: Paolo Bonzini 
>> ---
>>  hw/tpm/tpm_tis.c | 8 
>>  1 file changed, 8 deletions(-)
>>
>> diff --git a/hw/tpm/tpm_tis.c b/hw/tpm/tpm_tis.c
>> index d398c16..82747ee 100644
>> --- a/hw/tpm/tpm_tis.c
>> +++ b/hw/tpm/tpm_tis.c
>> @@ -896,13 +896,6 @@ static void tpm_tis_initfn(Object *obj)
>>  &s->mmio);
>>  }
>>
>> -static void tpm_tis_uninitfn(Object *obj)
>> -{
>> -TPMState *s = TPM(obj);
>> -
>> -memory_region_del_subregion(get_system_memory(), &s->mmio);
>> -}
>> -
>>  static void tpm_tis_class_init(ObjectClass *klass, void *data)
>>  {
>>  DeviceClass *dc = DEVICE_CLASS(klass);
>> @@ -918,7 +911,6 @@ static const TypeInfo tpm_tis_info = {
>>  .parent = TYPE_ISA_DEVICE,
>>  .instance_size = sizeof(TPMState),
>>  .instance_init = tpm_tis_initfn,
>> -.instance_finalize = tpm_tis_uninitfn,
>>  .class_init  = tpm_tis_class_init,
>>  };
>>
>> --
>> 1.8.3.1
>>
>>

Re: [Qemu-devel] [PATCH 5/9] nic: do not destroy memory regions in cleanup functions

2014-07-31 Thread Peter Crosthwaite

On Thu, Jul 31, 2014 at 7:46 PM, Stefan Hajnoczi  wrote:
> On Wed, Jul 30, 2014 at 12:27:08PM +0200, Paolo Bonzini wrote:
>> The memory regions should be destroyed in the unrealize function;
>> since these NICs are not even qdev-ified, they cannot be unplugged
>> and they do not have to do anything to destroy their memory regions.
>>
>> Cc: stefa...@redhat.com
>> Signed-off-by: Paolo Bonzini 
>> ---
>>  hw/net/dp8393x.c | 3 ---
>>  hw/net/mcf_fec.c | 3 ---
>>  2 files changed, 6 deletions(-)
>
> Reviewed-by: Stefan Hajnoczi 

Reviewed-by: Peter Crosthwaite

Re: [Qemu-devel] [PATCH 13/28] ide: move retry constants out of BM_STATUS_* namespace

2014-07-31 Thread Stefan Hajnoczi

On Mon, Jul 07, 2014 at 02:17:54PM -0400, John Snow wrote:
> From: Paolo Bonzini 
> 
> Signed-off-by: Paolo Bonzini 
> Signed-off-by: John Snow 
> ---
>  hw/ide/core.c | 20 ++--
>  hw/ide/internal.h | 12 ++--
>  hw/ide/pci.c  | 14 +++---
>  3 files changed, 23 insertions(+), 23 deletions(-)

Reviewed-by: Stefan Hajnoczi 


pgp8zW1rAcY8j.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH 4/9] vga: do not dynamically allocate chain4_alias

2014-07-31 Thread Paolo Bonzini

Il 31/07/2014 14:01, Peter Crosthwaite ha scritto:
> On Wed, Jul 30, 2014 at 8:27 PM, Paolo Bonzini  wrote:
>> Instead, add a boolean variable to indicate the presence of the region.
>>
> 
> Patch is good. Can you add a sentence on why you are doing this
> though? Is it just the save on the repeated malloc and free (which is
> sane in its own right) or something bigger in the context of your
> series?

Even more than the repeated malloc and free, it's the repeated
add_child/unparent.

Paolo

>> Signed-off-by: Paolo Bonzini 
> 
> Otherwise,
> 
> Reviewed-by: Peter Crosthwaite 
> 
>> ---
>>  hw/display/vga.c | 24 ++--
>>  hw/display/vga_int.h |  3 ++-
>>  2 files changed, 12 insertions(+), 15 deletions(-)
>>
>> diff --git a/hw/display/vga.c b/hw/display/vga.c
>> index 4b089a3..68cfee2 100644
>> --- a/hw/display/vga.c
>> +++ b/hw/display/vga.c
>> @@ -168,15 +168,18 @@ static uint8_t expand4to8[16];
>>
>>  static void vga_update_memory_access(VGACommonState *s)
>>  {
>> -MemoryRegion *region, *old_region = s->chain4_alias;
>>  hwaddr base, offset, size;
>>
>>  if (s->legacy_address_space == NULL) {
>>  return;
>>  }
>>
>> -s->chain4_alias = NULL;
>> -
>> +if (s->has_chain4_alias) {
>> +memory_region_del_subregion(s->legacy_address_space, 
>> &s->chain4_alias);
>> +memory_region_destroy(&s->chain4_alias);
>> +s->has_chain4_alias = false;
>> +s->plane_updated = 0xf;
>> +}
>>  if ((s->sr[VGA_SEQ_PLANE_WRITE] & VGA_SR02_ALL_PLANES) ==
>>  VGA_SR02_ALL_PLANES && s->sr[VGA_SEQ_MEMORY_MODE] & 
>> VGA_SR04_CHN_4M) {
>>  offset = 0;
>> @@ -201,18 +204,11 @@ static void vga_update_memory_access(VGACommonState *s)
>>  break;
>>  }
>>  base += isa_mem_base;
>> -region = g_malloc(sizeof(*region));
>> -memory_region_init_alias(region, memory_region_owner(&s->vram),
>> +memory_region_init_alias(&s->chain4_alias, 
>> memory_region_owner(&s->vram),
>>   "vga.chain4", &s->vram, offset, size);
>>  memory_region_add_subregion_overlap(s->legacy_address_space, base,
>> -region, 2);
>> -s->chain4_alias = region;
>> -}
>> -if (old_region) {
>> -memory_region_del_subregion(s->legacy_address_space, old_region);
>> -memory_region_destroy(old_region);
>> -g_free(old_region);
>> -s->plane_updated = 0xf;
>> +&s->chain4_alias, 2);
>> +s->has_chain4_alias = true;
>>  }
>>  }
>>
>> @@ -1321,7 +1317,7 @@ static void vga_draw_text(VGACommonState *s, int 
>> full_update)
>>  s->font_offsets[1] = offset;
>>  full_update = 1;
>>  }
>> -if (s->plane_updated & (1 << 2) || s->chain4_alias) {
>> +if (s->plane_updated & (1 << 2) || s->has_chain4_alias) {
>>  /* if the plane 2 was modified since the last display, it
>> indicates the font may have been modified */
>>  s->plane_updated = 0;
>> diff --git a/hw/display/vga_int.h b/hw/display/vga_int.h
>> index 5320abd..641f8f4 100644
>> --- a/hw/display/vga_int.h
>> +++ b/hw/display/vga_int.h
>> @@ -94,7 +94,8 @@ typedef struct VGACommonState {
>>  uint32_t vram_size;
>>  uint32_t vram_size_mb; /* property */
>>  uint32_t latch;
>> -MemoryRegion *chain4_alias;
>> +bool has_chain4_alias;
>> +MemoryRegion chain4_alias;
>>  uint8_t sr_index;
>>  uint8_t sr[256];
>>  uint8_t gr_index;
>> --
>> 1.8.3.1
>>
>>
>>

Re: [Qemu-devel] [v2][PATCH 2/5] hw:pci-host:piix: split i440fx_init

2014-07-31 Thread Chen, Tiejun


On 2014/7/31 18:12, Chen, Tiejun wrote:

On 2014/7/31 17:53, Michael S. Tsirkin wrote:

On Thu, Jul 31, 2014 at 05:26:41PM +0800, Chen, Tiejun wrote:

On 2014/7/31 17:10, Michael S. Tsirkin wrote:

On Thu, Jul 31, 2014 at 02:31:36PM +0800, Tiejun Chen wrote:

We'd like to split i440fx_init and then we can share something
with other stuff.

Signed-off-by: Tiejun Chen 


I think this is too much work for very little benefit.
Just pass const char *type to i440fx_init.


You know we will introduce that faked PCIe device represented that PCH
later,


Later when? On top of this patch series? Would like to see it all
before applying this ...


I will send this with other IGD stuff after this series is fine to you
since its just creating a simple PCIe device.

I think you should know this whole story since as you guys discussed we
don't fix that PCH at 1f.0. So it may be like this,

static int create_pseudo_pch_isa_bridge(PCIBus *bus, XenHostPCIDevice
*hdev)
{

 struct PCIDevice *dev;

 char rid;

 /* We havt to use a simple PCI device to fake this ISA bridge
  * to avoid making some confusion to BIOS and ACPI.
  */
 dev = pci_create(bus, -1, "pseudo-intel-pch-isa-bridge");

 qdev_init_nofail(&dev->qdev);
 pci_config_set_vendor_id(dev->config, XEN_SUBSYSTEM_ID));
 ^
 I don't remember this exactly.
 pci_config_set_device_id(dev->config, hdev->device_id);

 return 0;
}




so how to distinguish them? Are you saying I should check the type
like this?

if(Xen-Type)
{}
else
{}



No! Put the code in init function for the respective class,
pass type as an argument:


If you mean we don't introduce any "if/else", I still don't understand
how to insert such that function above, could you show this exactly?



I think I can do this inside that into 
xen_igd_passthrough_pc_machine_pci_bus_init(). There, we can just follow 
pci_bus = i440fx_init() since create_pseudo_pch_isa_bridge() just needs 
pci_bus as a parameter.


Thanks
Tiejun

[Qemu-devel] [v3][PATCH 3/4] xen:hw:pci-host:piix: create host bridge to passthrough

2014-07-31 Thread Tiejun Chen

Implement a pci host bridge specific to passthrough. Actually
this just inherits the standard one.

This is based on http://patchwork.ozlabs.org/patch/363810/.

Signed-off-by: Tiejun Chen 
---
 hw/pci-host/piix.c   | 41 +
 include/hw/i386/pc.h |  2 ++
 2 files changed, 43 insertions(+)

v3:

* Rebase

v2:

* Just add prefix with XEN_IGD_PASSTHROUGH/xen_igd_passthrough

diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index 0cd82b8..26ba1e5 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -93,6 +93,9 @@ typedef struct PIIX3State {
 #define I440FX_PCI_DEVICE(obj) \
 OBJECT_CHECK(PCII440FXState, (obj), TYPE_I440FX_PCI_DEVICE)
 
+#define XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE(obj) \
+OBJECT_CHECK(PCII440FXState, (obj), 
TYPE_XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE)
+
 struct PCII440FXState {
 /*< private >*/
 PCIDevice parent_obj;
@@ -303,6 +306,16 @@ static int i440fx_initfn(PCIDevice *dev)
 return 0;
 }
 
+static int xen_igd_passthrough_i440fx_initfn(PCIDevice *dev)
+{
+PCII440FXState *d = XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE(dev);
+
+dev->config[I440FX_SMRAM] = 0x02;
+
+cpu_smm_register(&i440fx_set_smm, d);
+return 0;
+}
+
 PCIBus *i440fx_init(const char *host_type, const char *pci_type,
 PCII440FXState **pi440fx_state,
 int *piix3_devfn,
@@ -703,6 +716,33 @@ static const TypeInfo i440fx_info = {
 .class_init= i440fx_class_init,
 };
 
+static void xen_igd_passthrough_i440fx_class_init(ObjectClass *klass, void 
*data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+
+k->init = xen_igd_passthrough_i440fx_initfn;
+k->vendor_id = PCI_VENDOR_ID_INTEL;
+k->device_id = PCI_DEVICE_ID_INTEL_82441;
+k->revision = 0x02;
+k->class_id = PCI_CLASS_BRIDGE_HOST;
+dc->desc = "IGD PT XEN Host bridge";
+dc->vmsd = &vmstate_i440fx;
+/*
+ * PCI-facing part of the host bridge, not usable without the
+ * host-facing part, which can't be device_add'ed, yet.
+ */
+dc->cannot_instantiate_with_device_add_yet = true;
+dc->hotpluggable   = false;
+}
+
+static const TypeInfo xen_igd_passthrough_i440fx_info = {
+.name  = TYPE_XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE,
+.parent= TYPE_PCI_DEVICE,
+.instance_size = sizeof(PCII440FXState),
+.class_init= xen_igd_passthrough_i440fx_class_init,
+};
+
 static const char *i440fx_pcihost_root_bus_path(PCIHostState *host_bridge,
 PCIBus *rootbus)
 {
@@ -744,6 +784,7 @@ static const TypeInfo i440fx_pcihost_info = {
 static void i440fx_register_types(void)
 {
 type_register_static(&i440fx_info);
+type_register_static(&xen_igd_passthrough_i440fx_info);
 type_register_static(&piix3_info);
 type_register_static(&piix3_xen_info);
 type_register_static(&i440fx_pcihost_info);
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 9778a71..98ae1a3 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -233,6 +233,8 @@ typedef struct PCII440FXState PCII440FXState;
 #define TYPE_I440FX_PCI_HOST_BRIDGE "i440FX-pcihost"
 #define TYPE_I440FX_PCI_DEVICE "i440FX"
 
+#define TYPE_XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE "xen-igd-passthrough-i440FX"
+
 PCIBus *i440fx_init(const char *host_type, const char *pci_type,
 PCII440FXState **pi440fx_state, int *piix_devfn,
 ISABus **isa_bus, qemu_irq *pic,
-- 
1.9.1

Re: [Qemu-devel] [PATCH 14/28] ahci: remove duplicate PORT_IRQ_* constants

2014-07-31 Thread Stefan Hajnoczi

On Mon, Jul 07, 2014 at 02:17:55PM -0400, John Snow wrote:
> From: Paolo Bonzini 
> 
> These are defined twice, just use one set consistently.
> 
> Signed-off-by: Paolo Bonzini 
> Signed-off-by: John Snow 
> ---
>  hw/ide/ahci.c |  6 +++---
>  hw/ide/ahci.h | 21 -
>  2 files changed, 3 insertions(+), 24 deletions(-)

Reviewed-by: Stefan Hajnoczi 


pgpqQ53Ww3sVL.pgp
Description: PGP signature

[Qemu-devel] [v3][PATCH 1/4] i440fx: make types configurable at run-time

2014-07-31 Thread Tiejun Chen

Xen wants to supply a different pci and host devices,
inheriting i440fx devices. Make types configurable.

Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Tiejun Chen 
---
 hw/i386/pc_piix.c| 4 +++-
 hw/pci-host/piix.c   | 9 -
 include/hw/i386/pc.h | 6 +-
 3 files changed, 12 insertions(+), 7 deletions(-)

v3:

* New patch from Michael

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 9694f88..928c12c 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -194,7 +194,9 @@ static void pc_init1(MachineState *machine,
 }
 
 if (pci_enabled) {
-pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, &isa_bus, gsi,
+pci_bus = i440fx_init(TYPE_I440FX_PCI_HOST_BRIDGE,
+  TYPE_I440FX_PCI_DEVICE,
+  &i440fx_state, &piix3_devfn, &isa_bus, gsi,
   system_memory, system_io, machine->ram_size,
   below_4g_mem_size,
   above_4g_mem_size,
diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index e0e0946..0cd82b8 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -40,7 +40,6 @@
  * http://download.intel.com/design/chipsets/datashts/29054901.pdf
  */
 
-#define TYPE_I440FX_PCI_HOST_BRIDGE "i440FX-pcihost"
 #define I440FX_PCI_HOST_BRIDGE(obj) \
 OBJECT_CHECK(I440FXState, (obj), TYPE_I440FX_PCI_HOST_BRIDGE)
 
@@ -91,7 +90,6 @@ typedef struct PIIX3State {
 MemoryRegion rcr_mem;
 } PIIX3State;
 
-#define TYPE_I440FX_PCI_DEVICE "i440FX"
 #define I440FX_PCI_DEVICE(obj) \
 OBJECT_CHECK(PCII440FXState, (obj), TYPE_I440FX_PCI_DEVICE)
 
@@ -305,7 +303,8 @@ static int i440fx_initfn(PCIDevice *dev)
 return 0;
 }
 
-PCIBus *i440fx_init(PCII440FXState **pi440fx_state,
+PCIBus *i440fx_init(const char *host_type, const char *pci_type,
+PCII440FXState **pi440fx_state,
 int *piix3_devfn,
 ISABus **isa_bus, qemu_irq *pic,
 MemoryRegion *address_space_mem,
@@ -325,7 +324,7 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state,
 unsigned i;
 I440FXState *i440fx;
 
-dev = qdev_create(NULL, TYPE_I440FX_PCI_HOST_BRIDGE);
+dev = qdev_create(NULL, host_type);
 s = PCI_HOST_BRIDGE(dev);
 b = pci_bus_new(dev, NULL, pci_address_space,
 address_space_io, 0, TYPE_PCI_BUS);
@@ -333,7 +332,7 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state,
 object_property_add_child(qdev_get_machine(), "i440fx", OBJECT(dev), NULL);
 qdev_init_nofail(dev);
 
-d = pci_create_simple(b, 0, TYPE_I440FX_PCI_DEVICE);
+d = pci_create_simple(b, 0, pci_type);
 *pi440fx_state = I440FX_PCI_DEVICE(d);
 f = *pi440fx_state;
 f->system_memory = address_space_mem;
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index f4b9b2b..9778a71 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -230,7 +230,11 @@ extern int no_hpet;
 struct PCII440FXState;
 typedef struct PCII440FXState PCII440FXState;
 
-PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn,
+#define TYPE_I440FX_PCI_HOST_BRIDGE "i440FX-pcihost"
+#define TYPE_I440FX_PCI_DEVICE "i440FX"
+
+PCIBus *i440fx_init(const char *host_type, const char *pci_type,
+PCII440FXState **pi440fx_state, int *piix_devfn,
 ISABus **isa_bus, qemu_irq *pic,
 MemoryRegion *address_space_mem,
 MemoryRegion *address_space_io,
-- 
1.9.1

[Qemu-devel] [v3][PATCH 0/5] xen: introduce new machine for IGD passthrough

2014-07-31 Thread Tiejun Chen

v3:

* Drop patch #4
* Add one patch #1 from Michael
* Rebase

v2:

* Fix some coding style
* New patch to separate i440fx_init
* Just add prefix with XEN_IGD_PASSTHROUGH/xen_igd_passthrough
* Based on patch #2 to regenerate
* Unify prefix with XEN_IGD_PASSTHROUGH/xen_igd_passthrough like patch #3
* Test: boot with a preinstalled ubuntu 14.04
  ./i386-softmmu/qemu-system-i386 -hda test.img -m 2560 -boot c -machine pc

As we discussed we need to create a separate machine to support current
IGD passthrough.


Tiejun Chen (4):
  i440fx: make types configurable at run-time
  hw:i386:pc_piix: split pc_init1()
  xen:hw:pci-host:piix: create host bridge to passthrough
  xen:hw:i386:pc_piix: introduce new machine for IGD passthrough

 hw/i386/pc_piix.c| 227 
+---
 hw/pci-host/piix.c   |  50 ++
 include/hw/i386/pc.h |   8 ++-
 3 files changed, 254 insertions(+), 31 deletions(-)

Thanks
Tiejun

[Qemu-devel] [v3][PATCH 2/4] hw:i386:pc_piix: split pc_init1()

2014-07-31 Thread Tiejun Chen

We'd like to split pc_init1 and then we can share something
with other stuff.

Signed-off-by: Tiejun Chen 
---
 hw/i386/pc_piix.c | 117 +++---
 1 file changed, 93 insertions(+), 24 deletions(-)

v3:

* Rebase

v2:

* Fix some coding style

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 928c12c..0c40b06 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -71,36 +71,29 @@ static bool smbios_legacy_mode;
 static bool gigabyte_align = true;
 static bool has_reserved_memory = true;
 
-/* PC hardware initialisation */
-static void pc_init1(MachineState *machine,
- int pci_enabled,
- int kvmclock_enabled)
+static ram_addr_t below_4g_mem_size;
+static ram_addr_t above_4g_mem_size;
+static void pc_machine_base_init(MachineState *machine,
+ int pci_enabled,
+ int kvmclock_enabled,
+ DeviceState **pc_icc_bridge,
+ MemoryRegion **pc_ram_memory,
+ MemoryRegion **pc_pci_memory,
+ qemu_irq **pc_gsi,
+ GSIState **pc_gsi_state,
+ FWCfgState **pc_fw_cfg)
 {
 PCMachineState *pc_machine = PC_MACHINE(machine);
 MemoryRegion *system_memory = get_system_memory();
-MemoryRegion *system_io = get_system_io();
-int i;
-ram_addr_t below_4g_mem_size, above_4g_mem_size;
-PCIBus *pci_bus;
-ISABus *isa_bus;
-PCII440FXState *i440fx_state;
-int piix3_devfn = -1;
-qemu_irq *cpu_irq;
-qemu_irq *gsi;
-qemu_irq *i8259;
-qemu_irq *smi_irq;
-GSIState *gsi_state;
-DriveInfo *hd[MAX_IDE_BUS * MAX_IDE_DEVS];
-BusState *idebus[MAX_IDE_BUS];
-ISADevice *rtc_state;
-ISADevice *floppy;
-MemoryRegion *ram_memory;
-MemoryRegion *pci_memory;
 MemoryRegion *rom_memory;
-DeviceState *icc_bridge;
-FWCfgState *fw_cfg = NULL;
 PcGuestInfo *guest_info;
 ram_addr_t lowmem;
+DeviceState *icc_bridge;
+MemoryRegion *ram_memory;
+MemoryRegion *pci_memory;
+qemu_irq *gsi;
+GSIState *gsi_state;
+FWCfgState *fw_cfg;
 
 /* Check whether RAM fits below 4G (leaving 1/2 GByte for IO memory).
  * If it doesn't, we need to split it in chunks below and above 4G.
@@ -193,6 +186,31 @@ static void pc_init1(MachineState *machine,
 gsi = qemu_allocate_irqs(gsi_handler, gsi_state, GSI_NUM_PINS);
 }
 
+*pc_icc_bridge = icc_bridge;
+*pc_ram_memory = ram_memory;
+*pc_pci_memory = pci_memory;
+*pc_gsi = gsi;
+*pc_gsi_state = gsi_state;
+*pc_fw_cfg = fw_cfg;
+}
+
+static void pc_machine_pci_bus_init(MachineState *machine,
+int pci_enabled,
+PCII440FXState **pc_i440fx_state,
+int *pc_piix3_devfn,
+PCIBus **pc_pci_bus,
+ISABus **pc_isa_bus,
+qemu_irq *gsi,
+MemoryRegion *pci_memory,
+MemoryRegion *ram_memory)
+{
+MemoryRegion *system_memory = get_system_memory();
+MemoryRegion *system_io = get_system_io();
+PCII440FXState *i440fx_state;
+int piix3_devfn;
+PCIBus *pci_bus;
+ISABus *isa_bus;
+
 if (pci_enabled) {
 pci_bus = i440fx_init(TYPE_I440FX_PCI_HOST_BRIDGE,
   TYPE_I440FX_PCI_DEVICE,
@@ -207,6 +225,33 @@ static void pc_init1(MachineState *machine,
 isa_bus = isa_bus_new(NULL, system_io);
 no_hpet = 1;
 }
+
+*pc_i440fx_state = i440fx_state;
+*pc_piix3_devfn = piix3_devfn;
+*pc_pci_bus = pci_bus;
+*pc_isa_bus = isa_bus;
+}
+
+static void pc_machine_device_init(MachineState *machine,
+   int pci_enabled,
+   GSIState *gsi_state,
+   DeviceState *icc_bridge,
+   int piix3_devfn,
+   FWCfgState *fw_cfg,
+   PCIBus *pci_bus,
+   ISABus *isa_bus,
+   qemu_irq *gsi)
+{
+int i;
+DriveInfo *hd[MAX_IDE_BUS * MAX_IDE_DEVS];
+BusState *idebus[MAX_IDE_BUS];
+qemu_irq *smi_irq;
+PCMachineState *pc_machine = PC_MACHINE(machine);
+qemu_irq *cpu_irq;
+qemu_irq *i8259;
+ISADevice *rtc_state;
+ISADevice *floppy;
+
 isa_bus_irqs(isa_bus, gsi);
 
 if (kvm_irqchip_in_kernel()) {
@@ -294,6 +339,30 @@ static void pc_init1(MachineState *machine,
 }
 }
 
+/* PC hardware initialisation */
+static void pc_init1(MachineState *machine,
+ int pci_enabled,
+ int kvmclock_enabled)

[Qemu-devel] [v3][PATCH 4/4] xen:hw:i386:pc_piix: introduce new machine for IGD passthrough

2014-07-31 Thread Tiejun Chen

Now we can introduce a new machine, xenigd, specific to IGD
passthrough. This can avoid involving other common codes.

Signed-off-by: Tiejun Chen 
---
 hw/i386/pc_piix.c | 106 ++
 1 file changed, 106 insertions(+)

v3:

* Rebase

v2:

* Unify prefix with XEN_IGD_PASSTHROUGH/xen_igd_passthrough like patch #3

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 0c40b06..cde39d6 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -232,6 +232,46 @@ static void pc_machine_pci_bus_init(MachineState *machine,
 *pc_isa_bus = isa_bus;
 }
 
+#ifdef CONFIG_XEN
+static void xen_igd_passthrough_pc_machine_pci_bus_init(MachineState *machine,
+int pci_enabled,
+PCII440FXState 
**pc_i440fx_state,
+int *pc_piix3_devfn,
+PCIBus **pc_pci_bus,
+ISABus **pc_isa_bus,
+qemu_irq *gsi,
+MemoryRegion 
*pci_memory,
+MemoryRegion 
*ram_memory)
+{
+MemoryRegion *system_memory = get_system_memory();
+MemoryRegion *system_io = get_system_io();
+PCII440FXState *i440fx_state;
+int piix3_devfn;
+PCIBus *pci_bus;
+ISABus *isa_bus;
+
+if (pci_enabled) {
+pci_bus = i440fx_init(TYPE_I440FX_PCI_HOST_BRIDGE,
+  TYPE_XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE,
+  &i440fx_state, &piix3_devfn, &isa_bus, gsi,
+  system_memory, system_io, machine->ram_size,
+  below_4g_mem_size,
+  above_4g_mem_size,
+  pci_memory, ram_memory);
+} else {
+pci_bus = NULL;
+i440fx_state = NULL;
+isa_bus = isa_bus_new(NULL, system_io);
+no_hpet = 1;
+}
+
+*pc_i440fx_state = i440fx_state;
+*pc_piix3_devfn = piix3_devfn;
+*pc_pci_bus = pci_bus;
+*pc_isa_bus = isa_bus;
+}
+#endif
+
 static void pc_machine_device_init(MachineState *machine,
int pci_enabled,
GSIState *gsi_state,
@@ -368,6 +408,38 @@ static void pc_init_pci(MachineState *machine)
 pc_init1(machine, 1, 1);
 }
 
+#ifdef CONFIG_XEN
+static void xen_igd_passthrough_pc_init1(MachineState *machine,
+ int pci_enabled,
+ int kvmclock_enabled)
+{
+PCIBus *pci_bus = NULL;
+ISABus *isa_bus = NULL;
+PCII440FXState *i440fx_state = NULL;
+int piix3_devfn = -1;
+qemu_irq *gsi = NULL;
+GSIState *gsi_state = NULL;
+MemoryRegion *ram_memory = NULL;
+MemoryRegion *pci_memory = NULL;
+DeviceState *icc_bridge = NULL;
+FWCfgState *fw_cfg = NULL;
+
+pc_machine_base_init(machine, pci_enabled, kvmclock_enabled, &icc_bridge,
+ &ram_memory, &pci_memory, &gsi, &gsi_state, &fw_cfg);
+xen_igd_passthrough_pc_machine_pci_bus_init(machine, pci_enabled,
+&i440fx_state, &piix3_devfn,
+&pci_bus, &isa_bus, gsi,
+pci_memory, ram_memory);
+pc_machine_device_init(machine, pci_enabled, gsi_state, icc_bridge,
+   piix3_devfn, fw_cfg, pci_bus, isa_bus, gsi);
+}
+
+static void xen_igd_passthrough_pc_init_pci(MachineState *machine)
+{
+xen_igd_passthrough_pc_init1(machine, 1, 1);
+}
+#endif
+
 static void pc_compat_2_0(MachineState *machine)
 {
 /* This value depends on the actual DSDT and SSDT compiled into
@@ -514,6 +586,18 @@ static void pc_xen_hvm_init(MachineState *machine)
 pci_create_simple(bus, -1, "xen-platform");
 }
 }
+
+static void xen_igd_passthrough_pc_hvm_init(MachineState *machine)
+{
+PCIBus *bus;
+
+xen_igd_passthrough_pc_init_pci(machine);
+
+bus = pci_find_primary_bus();
+if (bus != NULL) {
+pci_create_simple(bus, -1, "xen-platform");
+}
+}
 #endif
 
 #define PC_I440FX_MACHINE_OPTIONS \
@@ -963,6 +1047,27 @@ static QEMUMachine xenfv_machine = {
 { /* end of list */ }
 },
 };
+
+static QEMUMachine xenigd_machine = {
+PC_COMMON_MACHINE_OPTIONS,
+.name = "xenigd",
+.desc = "Xen Fully-virtualized PC specific to IGD",
+.init = xen_igd_passthrough_pc_hvm_init,
+.max_cpus = HVM_MAX_VCPUS,
+.default_machine_opts = "accel=xen",
+.hot_add_cpu = pc_hot_add_cpu,
+.compat_props = (GlobalProperty[]) {
+/* xenfv has no fwcfg and so does not load acpi from QEMU.
+ * as such new acpi features don'

Re: [Qemu-devel] [PATCH 15/28] ide: stop PIO transfer on errors

2014-07-31 Thread Stefan Hajnoczi

On Mon, Jul 07, 2014 at 02:17:56PM -0400, John Snow wrote:
> From: Paolo Bonzini 
> 
> This will provide a hook for sending the result of the command via the
> FIS receive area.
> 
> Signed-off-by: Paolo Bonzini 
> Signed-off-by: John Snow 
> ---
>  hw/ide/core.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/ide/core.c b/hw/ide/core.c
> index bd3bd34..d900ba0 100644
> --- a/hw/ide/core.c
> +++ b/hw/ide/core.c
> @@ -422,6 +422,9 @@ static inline void ide_abort_command(IDEState *s)
>  {
>  s->status = READY_STAT | ERR_STAT;
>  s->error = ABRT_ERR;
> +if (s->end_transfer_func != ide_transfer_stop) {
> +ide_transfer_stop(s);
> +}
>  }

I don't understand this.

ide_transfer_stop() sets s->send_transfer_func = ide_transfer_stop.
This has the side-effect of making ide_is_pio_out() return true (not
sure if that poses a problem).

Why can't we call ide_transfer_stop() when s->end_transfer_func ==
ide_transfer_stop?

Stefan


pgpxe38crSBWO.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH 16/28] ide: make all commands go through cmd_done

2014-07-31 Thread Stefan Hajnoczi

On Mon, Jul 07, 2014 at 02:17:57PM -0400, John Snow wrote:
> From: Paolo Bonzini 
> 
> AHCI has code to fill in the D2H FIS trigger the IRQ all over the place.
> Centralize this in a single cmd_done callback by generalizing the existing
> async_cmd_done callback.
> 
> Signed-off-by: Paolo Bonzini 
> Signed-off-by: John Snow 
> ---
>  hw/ide/ahci.c | 16 +++-
>  hw/ide/atapi.c|  2 +-
>  hw/ide/core.c | 20 +++-
>  hw/ide/internal.h |  2 +-
>  4 files changed, 16 insertions(+), 24 deletions(-)

Reviewed-by: Stefan Hajnoczi 


pgpqXtrbyONBI.pgp
Description: PGP signature

[Qemu-devel] [PATCH 1/7] usb: a trivial code change for more idiomatic writing style

2014-07-31 Thread arei.gonglei

From: Gonglei 

Signed-off-by: Gonglei 
---
 hw/usb/dev-audio.c | 2 +-
 hw/usb/dev-mtp.c   | 4 ++--
 hw/usb/hcd-ehci.c  | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/usb/dev-audio.c b/hw/usb/dev-audio.c
index bfebfe9..988f6cc 100644
--- a/hw/usb/dev-audio.c
+++ b/hw/usb/dev-audio.c
@@ -371,7 +371,7 @@ static void output_callback(void *opaque, int avail)
 return;
 }
 data = streambuf_get(&s->out.buf);
-if (NULL == data) {
+if (data == NULL) {
 return;
 }
 AUD_write(s->out.voice, data, USBAUDIO_PACKET_SIZE);
diff --git a/hw/usb/dev-mtp.c b/hw/usb/dev-mtp.c
index 384d4a5..0820046 100644
--- a/hw/usb/dev-mtp.c
+++ b/hw/usb/dev-mtp.c
@@ -832,7 +832,7 @@ static void usb_mtp_command(MTPState *s, MTPControl *c)
 return;
 }
 data_in = usb_mtp_get_object(s, c, o);
-if (NULL == data_in) {
+if (data_in == NULL) {
 usb_mtp_queue_result(s, RES_GENERAL_ERROR,
  c->trans, 0, 0, 0);
 return;
@@ -851,7 +851,7 @@ static void usb_mtp_command(MTPState *s, MTPControl *c)
 return;
 }
 data_in = usb_mtp_get_partial_object(s, c, o);
-if (NULL == data_in) {
+if (data_in == NULL) {
 usb_mtp_queue_result(s, RES_GENERAL_ERROR,
  c->trans, 0, 0, 0);
 return;
diff --git a/hw/usb/hcd-ehci.c b/hw/usb/hcd-ehci.c
index a00a93c..448e007 100644
--- a/hw/usb/hcd-ehci.c
+++ b/hw/usb/hcd-ehci.c
@@ -1596,7 +1596,7 @@ static EHCIQueue *ehci_state_fetchqh(EHCIState *ehci, int 
async)
 
 entry = ehci_get_fetch_addr(ehci, async);
 q = ehci_find_queue_by_qh(ehci, entry, async);
-if (NULL == q) {
+if (q == NULL) {
 q = ehci_alloc_queue(ehci, entry, async);
 }
 
-- 
1.7.12.4

[Qemu-devel] [PATCH for-2.2 0/7] a trivial code change for more idiomatic writing style

2014-07-31 Thread arei.gonglei

From: Gonglei 

Gonglei (7):
  usb: a trivial code change for more idiomatic writing style
  audio: a trivial code change for more idiomatic writing style
  isa-bus: a trivial code change for more idiomatic writing style
  a trivial code change for more idiomatic writing style
  spice: a trivial code change for more idiomatic writing style
  vl: a trivial code change for more idiomatic writing style
  vmxnet3: a trivial code change for more idiomatic writing style

 hw/audio/gus.c   |  2 +-
 hw/audio/hda-codec.c |  3 ++-
 hw/audio/sb16.c  |  6 +++---
 hw/isa/isa-bus.c |  2 +-
 hw/net/vmxnet3.c | 16 
 hw/usb/dev-audio.c   |  2 +-
 hw/usb/dev-mtp.c |  4 ++--
 hw/usb/hcd-ehci.c|  2 +-
 qdev-monitor.c   |  2 +-
 qemu-char.c  |  2 +-
 ui/spice-core.c  |  4 ++--
 util/qemu-sockets.c  |  2 +-
 vl.c |  5 +++--
 13 files changed, 27 insertions(+), 25 deletions(-)

-- 
1.7.12.4

[Qemu-devel] [PATCH 5/7] spice: a trivial code change for more idiomatic writing style

2014-07-31 Thread arei.gonglei

From: Gonglei 

Signed-off-by: Gonglei 
---
 ui/spice-core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ui/spice-core.c b/ui/spice-core.c
index 7bb91e6..ff54dac 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -677,7 +677,7 @@ void qemu_spice_init(void)
 
 if (tls_port) {
 x509_dir = qemu_opt_get(opts, "x509-dir");
-if (NULL == x509_dir) {
+if (x509_dir == NULL) {
 x509_dir = ".";
 }
 
@@ -803,7 +803,7 @@ void qemu_spice_init(void)
 
 seamless_migration = qemu_opt_get_bool(opts, "seamless-migration", 0);
 spice_server_set_seamless_migration(spice_server, seamless_migration);
-if (0 != spice_server_init(spice_server, &core_interface)) {
+if (spice_server_init(spice_server, &core_interface) != 0) {
 error_report("failed to initialize spice server");
 exit(1);
 };
-- 
1.7.12.4

[Qemu-devel] [PATCH 6/7] vl: a trivial code change for more idiomatic writing style

2014-07-31 Thread arei.gonglei

From: Gonglei 

Signed-off-by: Gonglei 
---
 vl.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/vl.c b/vl.c
index fe451aa..69f0f20 100644
--- a/vl.c
+++ b/vl.c
@@ -1136,7 +1136,7 @@ static int drive_init_func(QemuOpts *opts, void *opaque)
 
 static int drive_enable_snapshot(QemuOpts *opts, void *opaque)
 {
-if (NULL == qemu_opt_get(opts, "snapshot")) {
+if (qemu_opt_get(opts, "snapshot") == NULL) {
 qemu_opt_set(opts, "snapshot", "on");
 }
 return 0;
@@ -2488,8 +2488,9 @@ static int foreach_device_config(int type, int 
(*func)(const char *cmdline))
 loc_push_restore(&conf->loc);
 rc = func(conf->cmdline);
 loc_pop(&conf->loc);
-if (0 != rc)
+if (rc != 0) {
 return rc;
+}
 }
 return 0;
 }
-- 
1.7.12.4

[Qemu-devel] [PATCH 7/7] vmxnet3: a trivial code change for more idiomatic writing style

2014-07-31 Thread arei.gonglei

From: Gonglei 

Signed-off-by: Gonglei 
---
 hw/net/vmxnet3.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index 77bea6f..34c5aff 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -1009,7 +1009,7 @@ vmxnet3_indicate_packet(VMXNET3State *s)
 
 vmxnet3_dump_rx_descr(&rxd);
 
-if (0 != ready_rxcd_pa) {
+if (ready_rxcd_pa != 0) {
 cpu_physical_memory_write(ready_rxcd_pa, &rxcd, sizeof(rxcd));
 }
 
@@ -1020,7 +1020,7 @@ vmxnet3_indicate_packet(VMXNET3State *s)
 rxcd.gen = new_rxcd_gen;
 rxcd.rqID = RXQ_IDX + rx_ridx * s->rxq_num;
 
-if (0 == bytes_left) {
+if (bytes_left == 0) {
 vmxnet3_rx_update_descr(s->rx_pkt, &rxcd);
 }
 
@@ -1038,16 +1038,16 @@ vmxnet3_indicate_packet(VMXNET3State *s)
 num_frags++;
 }
 
-if (0 != ready_rxcd_pa) {
+if (ready_rxcd_pa != 0) {
 rxcd.eop = 1;
-rxcd.err = (0 != bytes_left);
+rxcd.err = (bytes_left != 0);
 cpu_physical_memory_write(ready_rxcd_pa, &rxcd, sizeof(rxcd));
 
 /* Flush RX descriptor changes */
 smp_wmb();
 }
 
-if (0 != new_rxcd_pa) {
+if (new_rxcd_pa != 0) {
 vmxnet3_revert_rxc_descr(s, RXQ_IDX);
 }
 
@@ -1190,8 +1190,8 @@ static void vmxnet3_update_mcast_filters(VMXNET3State *s)
 s->mcast_list_len = list_bytes / sizeof(s->mcast_list[0]);
 
 s->mcast_list = g_realloc(s->mcast_list, list_bytes);
-if (NULL == s->mcast_list) {
-if (0 == s->mcast_list_len) {
+if (s->mcast_list == NULL) {
+if (s->mcast_list_len == 0) {
 VMW_CFPRN("Current multicast list is empty");
 } else {
 VMW_ERPRN("Failed to allocate multicast list of %d elements",
@@ -1667,7 +1667,7 @@ vmxnet3_io_bar1_write(void *opaque,
  * memory address. We save it to temp variable and set the
  * shared address only after we get the high part
  */
-if (0 == val) {
+if (val == 0) {
 s->device_active = false;
 }
 s->temp_shared_guest_driver_memory = val;
-- 
1.7.12.4

[Qemu-devel] [PATCH 4/7] a trivial code change for more idiomatic writing style

2014-07-31 Thread arei.gonglei

From: Gonglei 

Signed-off-by: Gonglei 
---
 qdev-monitor.c  | 2 +-
 qemu-char.c | 2 +-
 util/qemu-sockets.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/qdev-monitor.c b/qdev-monitor.c
index f87f3d8..3e30d38 100644
--- a/qdev-monitor.c
+++ b/qdev-monitor.c
@@ -694,7 +694,7 @@ void qmp_device_del(const char *id, Error **errp)
 DeviceState *dev;
 
 dev = qdev_find_recursive(sysbus_get_default(), id);
-if (NULL == dev) {
+if (dev == NULL) {
 error_set(errp, QERR_DEVICE_NOT_FOUND, id);
 return;
 }
diff --git a/qemu-char.c b/qemu-char.c
index 956be49..70d5a64 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -4117,7 +4117,7 @@ void qmp_chardev_remove(const char *id, Error **errp)
 CharDriverState *chr;
 
 chr = qemu_chr_find(id);
-if (NULL == chr) {
+if (chr == NULL) {
 error_setg(errp, "Chardev '%s' not found", id);
 return;
 }
diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c
index 74cf078..5d38395 100644
--- a/util/qemu-sockets.c
+++ b/util/qemu-sockets.c
@@ -732,7 +732,7 @@ int unix_connect_opts(QemuOpts *opts, Error **errp,
 ConnectState *connect_state = NULL;
 int sock, rc;
 
-if (NULL == path) {
+if (path == NULL) {
 error_setg(errp, "unix connect: no path specified");
 return -1;
 }
-- 
1.7.12.4

Re: [Qemu-devel] [PATCH 17/28] ahci: construct PIO Setup FIS for PIO commands

2014-07-31 Thread Stefan Hajnoczi

On Mon, Jul 07, 2014 at 02:17:58PM -0400, John Snow wrote:
> +static void ahci_write_fis_pio(AHCIDevice *ad, uint16_t len)
> +{
> +AHCIPortRegs *pr = &ad->port_regs;
> +uint8_t *pio_fis, *cmd_fis;
> +uint64_t tbl_addr;
> +dma_addr_t cmd_len = 0x80;
> +
> +if (!ad->res_fis || !(pr->cmd & PORT_CMD_FIS_RX)) {
> +return;
> +}
> +
> +/* map cmd_fis */
> +tbl_addr = le64_to_cpu(ad->cur_cmd->tbl_addr);
> +cmd_fis = dma_memory_map(ad->hba->as, tbl_addr, &cmd_len,
> + DMA_DIRECTION_TO_DEVICE);

We should check cmd_len == 0x80 and cmd_fis != NULL to avoid undefined
behavior when accessing cmd_fis.


pgp4617Ej93mJ.pgp
Description: PGP signature

[Qemu-devel] [PATCH 3/7] isa-bus: a trivial code change for more idiomatic writing style

2014-07-31 Thread arei.gonglei

From: Gonglei 

Signed-off-by: Gonglei 
---
 hw/isa/isa-bus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/isa/isa-bus.c b/hw/isa/isa-bus.c
index b28981b..851f953 100644
--- a/hw/isa/isa-bus.c
+++ b/hw/isa/isa-bus.c
@@ -50,7 +50,7 @@ ISABus *isa_bus_new(DeviceState *dev, MemoryRegion 
*address_space_io)
 fprintf(stderr, "Can't create a second ISA bus\n");
 return NULL;
 }
-if (NULL == dev) {
+if (dev == NULL) {
 dev = qdev_create(NULL, "isabus-bridge");
 qdev_init_nofail(dev);
 }
-- 
1.7.12.4

Re: [Qemu-devel] [PATCH 6/9] ioport: split deletion and destruction

2014-07-31 Thread Peter Crosthwaite

On Wed, Jul 30, 2014 at 8:27 PM, Paolo Bonzini  wrote:
> Of the two functions portio_list_del and portio_list_destroy,
> the latter is just freeing a memory area.  However, portio_list_del
> is the logical equivalent of memory_region_del_subregion so
> destruction of memory regions does not belong there.
>

So I found the original implementation made sense to me, in that _del
is the converse of _add and _destroy _init WRT to the MR ops.

Currently
_init = malloc array
_add = mr_init + mr_add_subregion
_del = mr_del_subregion + mr_destroy
_destory = free array

This delays mr_destory to _destroy breaking the symmetry.

With this change is is still possible to:

_init()
_add()
_del()
_add()
_del()
_destrory()

And not leak a ref, with the reinited memory region on second add?

Then again I may be misunderstanding, as apparently neither of these
API have any users so I'm having trouble getting a handle on intended
usage:

$ git grep portio_list_del
include/exec/ioport.h:void portio_list_del(PortioList *piolist);
ioport.c:void portio_list_del(PortioList *piolist)
$ git grep portio_list_destroy
include/exec/ioport.h:void portio_list_destroy(PortioList *piolist);
ioport.c:void portio_list_destroy(PortioList *piolist)

Regards,
Peter

> Signed-off-by: Paolo Bonzini 
> ---
>  ioport.c | 11 ---
>  1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/ioport.c b/ioport.c
> index 3d91e79..dce94a3 100644
> --- a/ioport.c
> +++ b/ioport.c
> @@ -149,6 +149,14 @@ void portio_list_set_flush_coalesced(PortioList *piolist)
>
>  void portio_list_destroy(PortioList *piolist)
>  {
> +MemoryRegionPortioList *mrpio;
> +unsigned i;
> +
> +for (i = 0; i < piolist->nr; ++i) {
> +mrpio = container_of(piolist->regions[i], MemoryRegionPortioList, 
> mr);
> +memory_region_destroy(&mrpio->mr);
> +g_free(mrpio);
> +}
>  g_free(piolist->regions);
>  }
>
> @@ -291,8 +299,5 @@ void portio_list_del(PortioList *piolist)
>  for (i = 0; i < piolist->nr; ++i) {
>  mrpio = container_of(piolist->regions[i], MemoryRegionPortioList, 
> mr);
>  memory_region_del_subregion(piolist->address_space, &mrpio->mr);
> -memory_region_destroy(&mrpio->mr);
> -g_free(mrpio);
> -piolist->regions[i] = NULL;
>  }
>  }
> --
> 1.8.3.1
>
>
>

[Qemu-devel] [PATCH 2/7] audio: a trivial code change for more idiomatic writing style

2014-07-31 Thread arei.gonglei

From: Gonglei 

Signed-off-by: Gonglei 
---
 hw/audio/gus.c   | 2 +-
 hw/audio/hda-codec.c | 3 ++-
 hw/audio/sb16.c  | 6 +++---
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/hw/audio/gus.c b/hw/audio/gus.c
index bba6840..4a43ce7 100644
--- a/hw/audio/gus.c
+++ b/hw/audio/gus.c
@@ -212,7 +212,7 @@ static int GUS_read_DMA (void *opaque, int nchan, int 
dma_pos, int dma_len)
 pos += copied;
 }
 
-if (0 == ((mode >> 4) & 1)) {
+if (((mode >> 4) & 1) == 0) {
 DMA_release_DREQ (s->emu.gusdma);
 }
 return dma_len;
diff --git a/hw/audio/hda-codec.c b/hw/audio/hda-codec.c
index cbcf521..3c03ff5 100644
--- a/hw/audio/hda-codec.c
+++ b/hw/audio/hda-codec.c
@@ -489,8 +489,9 @@ static int hda_audio_init(HDACodecDevice *hda, const struct 
desc_codec *desc)
 for (i = 0; i < a->desc->nnodes; i++) {
 node = a->desc->nodes + i;
 param = hda_codec_find_param(node, AC_PAR_AUDIO_WIDGET_CAP);
-if (NULL == param)
+if (param == NULL) {
 continue;
+}
 type = (param->val & AC_WCAP_TYPE) >> AC_WCAP_TYPE_SHIFT;
 switch (type) {
 case AC_WID_AUD_OUT:
diff --git a/hw/audio/sb16.c b/hw/audio/sb16.c
index 60c4b3b..bda26d0 100644
--- a/hw/audio/sb16.c
+++ b/hw/audio/sb16.c
@@ -928,7 +928,7 @@ static IO_WRITE_PROTO (dsp_write)
 /* if (s->highspeed) */
 /* break; */
 
-if (0 == s->needed_bytes) {
+if (s->needed_bytes == 0) {
 command (s, val);
 #if 0
 if (0 == s->needed_bytes) {
@@ -1212,7 +1212,7 @@ static int SB_read_DMA (void *opaque, int nchan, int 
dma_pos, int dma_len)
 #endif
 
 if (till <= copy) {
-if (0 == s->dma_auto) {
+if (s->dma_auto == 0) {
 copy = till;
 }
 }
@@ -1224,7 +1224,7 @@ static int SB_read_DMA (void *opaque, int nchan, int 
dma_pos, int dma_len)
 if (s->left_till_irq <= 0) {
 s->mixer_regs[0x82] |= (nchan & 4) ? 2 : 1;
 qemu_irq_raise (s->pic);
-if (0 == s->dma_auto) {
+if (s->dma_auto == 0) {
 control (s, 0);
 speaker (s, 0);
 }
-- 
1.7.12.4

Re: [Qemu-devel] [PATCH 18/28] q35: Enable the ioapic device to be seen by qtest.

2014-07-31 Thread Stefan Hajnoczi

On Mon, Jul 07, 2014 at 02:17:59PM -0400, John Snow wrote:
> Currently, the ioapic device can not be found in a qtest environment
> when requesting "irq_interrupt_in ioapic" via the qtest socket.
> 
> By mirroring how the ioapic is added in i44ofx (hw/i440/pc_piix.c),
> as a child of "q35," the device is able to be seen by qtest.
> 
> Signed-off-by: John Snow 
> ---
>  hw/i386/pc_q35.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index 36b6ab0..143b978 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -231,7 +231,7 @@ static void pc_q35_init(MachineState *machine)
>  gsi_state->i8259_irq[i] = i8259[i];
>  }
>  if (pci_enabled) {
> -ioapic_init_gsi(gsi_state, NULL);
> +ioapic_init_gsi(gsi_state, "q35");
>  }
>  qdev_init_nofail(icc_bridge);
>  
> -- 
> 1.9.3

Reviewed-by: Stefan Hajnoczi 


pgp41BEPRbI58.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH v2] po: Add Chinese translation

2014-07-31 Thread Dongsheng Song

On Thu, Jul 31, 2014 at 3:02 PM, Amos Kong  wrote:
> On Thu, Jul 31, 2014 at 11:18:45AM +0800, Fam Zheng wrote:
>> Signed-off-by: Fam Zheng 
>>
>> ---
>> v2: Fix typo for "_View".
>> ---
>>  po/zh_CN.po | 86 
>> +
>>  1 file changed, 86 insertions(+)
>>  create mode 100644 po/zh_CN.po
>
> Reviewed-by: Amos Kong 

Reviewed-by: Dongsheng Song 

--
Dongsheng

Re: [Qemu-devel] [PATCH 19/28] qtest: Adding qtest_memset and qmemset.

2014-07-31 Thread Stefan Hajnoczi

On Mon, Jul 07, 2014 at 02:18:00PM -0400, John Snow wrote:
> Currently, libqtest allows for memread and memwrite, but
> does not offer a simple way to zero out regions of memory.
> This patch adds a simple function to do so.
> 
> Signed-off-by: John Snow 
> ---
>  tests/libqtest.c | 12 
>  tests/libqtest.h | 24 
>  2 files changed, 36 insertions(+)

Reviewed-by: Stefan Hajnoczi 


pgpE7WaqWSyk_.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH 21/28] libqtest: Correct small memory leak.

2014-07-31 Thread Stefan Hajnoczi

On Mon, Jul 07, 2014 at 02:18:02PM -0400, John Snow wrote:
> Fixes a small memory leak inside of libqtest.
> After we produce a test path and glib copies the string
> for itself, we should clean up our temporary copy.
> 
> Signed-off-by: John Snow 
> ---
>  tests/libqtest.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/tests/libqtest.c b/tests/libqtest.c
> index e525e6f..8205e0a 100644
> --- a/tests/libqtest.c
> +++ b/tests/libqtest.c
> @@ -644,6 +644,7 @@ void qtest_add_func(const char *str, void (*fn))
>  {
>  gchar *path = g_strdup_printf("/%s/%s", qtest_get_arch(), str);
>  g_test_add_func(path, fn);
> +free(path);

Glib functions use g_malloc()/g_free(), not malloc()/free().  Since this
string was allocated with g_strdup_printf() it must be freed with
g_free().


pgpWexY1SIG6y.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH 20/28] libqos: Correct memory leak

2014-07-31 Thread Stefan Hajnoczi

On Mon, Jul 07, 2014 at 02:18:01PM -0400, John Snow wrote:
> Fix a small memory leak inside of libqos, in the pc_alloc_init routine.
> 
> Signed-off-by: John Snow 
> ---
>  tests/libqos/malloc-pc.c | 3 +++
>  1 file changed, 3 insertions(+)

Reviewed-by: Stefan Hajnoczi 


pgpcpqzaGHLgQ.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH for-2.2 0/7] a trivial code change for more idiomatic writing style

2014-07-31 Thread Peter Maydell

On 31 July 2014 13:28,   wrote:
> From: Gonglei 
>
> Gonglei (7):
>   usb: a trivial code change for more idiomatic writing style
>   audio: a trivial code change for more idiomatic writing style
>   isa-bus: a trivial code change for more idiomatic writing style
>   a trivial code change for more idiomatic writing style
>   spice: a trivial code change for more idiomatic writing style
>   vl: a trivial code change for more idiomatic writing style
>   vmxnet3: a trivial code change for more idiomatic writing style

Hi. I think we could make the commit messages here
a bit more specific:
===
$WHATEVER: don't use 'Yoda conditions'

'Yoda conditions' are not part of idiomatic QEMU coding
style, so rewrite them in the more usual order.
===

thanks
-- PMM

Re: [Qemu-devel] [PATCH 22/28] libqos: Fixes a small memory leak.

2014-07-31 Thread Stefan Hajnoczi

On Mon, Jul 07, 2014 at 02:18:03PM -0400, John Snow wrote:
> Allow users the chance to clean up the QPCIBusPC structure
> by adding a small cleanup routine. Helps clear up small
> memory leaks during setup/teardown, to allow for cleaner
> debug output messages.
> 
> Signed-off-by: John Snow 
> ---
>  tests/libqos/pci-pc.c | 7 +++
>  tests/libqos/pci-pc.h | 1 +
>  2 files changed, 8 insertions(+)

Reviewed-by: Stefan Hajnoczi 


pgp3ROHP3w5S5.pgp
Description: PGP signature

1 2 3 >

1 - 100 of 288 matches

Mail list logo